English Pages 322 [323] Year 2023
Springer Texts in Business and Economics
Işık Biçer
Supply Chain Analytics
An Uncertainty Modeling Approach
Springer Texts in Business and Economics
Springer Texts in Business and Economics (STBE) delivers high-quality instructional content for undergraduates and graduates in all areas of Business/Management Science and Economics. The series is comprised of self-contained books with a broad and comprehensive coverage that are suitable for class as well as for individual self-study. All texts are authored by established experts in their fields and offer a solid methodological background, often accompanied by problems and exercises.
Işık Biçer, York University, Toronto, Canada
This work was supported by Schulich School of Business, York University
ISSN 2192-4333 ISSN 2192-4341 (electronic) Springer Texts in Business and Economics ISBN 978-3-031-30346-3 ISBN 978-3-031-30347-0 (eBook) https://doi.org/10.1007/978-3-031-30347-0 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The World Economic Forum considers supply chains the backbone of the global economy.1 There is no doubt that effective supply chain management has the potential to solve many economic, social, and environmental problems we face today, and it helps organizations achieve sustainable and environmentally friendly growth. Companies that excel in supply chain management are more likely to gain a competitive edge by optimizing product, information, and financial flows, thereby generating high profit margins. Supply chain management deals with both operational and non-operational tasks as products move from upstream suppliers to downstream customers. The ultimate objective of supply chain management is to perfectly match consumer demand with supply in an economic and efficient way.2 Most companies fail to meet this objective, mainly because of uncertainties that affect the availability of products in markets (e.g., supply uncertainty) and customer demand (e.g., demand uncertainty).

This book blends some traditional and novel theories of operations management and business analytics to address how to manage uncertainties in supply chains. It emphasizes ontological uncertainty as the most important uncertainty type. Ontological uncertainty occurs when decision makers can neither outline all potential outcomes of their decisions nor identify the agents that cause the uncertainty.

We have designed this book as a graduate-level textbook for supply chain analytics courses. Because we emphasize Python as the programming language to solve complex supply chain problems, participants of such courses are encouraged to learn Python to solve the examples in this book. There are three types of exercises in the book. First, we provide readers with some in-context examples while introducing some topics. These examples are supported by a complementary online web application that allows readers to enter input parameters and generate results for each example. The Python code is also shared with readers, so those who are interested can use it to run experiments on their own computers. Second, we present some practice examples at the end of each chapter and show their solutions.
1 https://www.weforum.org/reports/building-resilience-supply-chains/.
2 Cachon, G. and Terwiesch, C. (2012). Matching Supply with Demand: An Introduction to Operations Management. McGraw Hill, NY 10036, U.S.A., 2nd edition.
Some examples can be solved in Python, and their solutions are provided in the online web application. Finally, there are some self-study exercises at the end of each chapter, which are left to readers. The online web application is accessible via https://www.yorku.ca/research/areas/supplychainanalytics/textbook/ with the following information:
Username: testuser
Password: SchuSum2021
This book starts with an introductory chapter on risk analysis in supply chains. We introduce a unified supply chain framework that outlines the flows of information, products and capital in supply chains.3 Based on this framework, we identify five salient parameters that help characterize supply chains. We also present some indicators of supply chain risks derived from the five salient parameters. In Chap. 2, we review foundational theories and methods of predictive and prescriptive analytics to help readers familiarize themselves with the important concepts of business analytics. In Chap. 3, we elaborate on inventory management theories under demand uncertainty. We start with problems that have analytical solutions and discuss their settings in detail. One of the key messages that we would like to convey in this chapter is that decision makers can find optimal or near-optimal solutions intuitively by applying marginal analysis without going through long, quantitatively exhaustive derivations. We also discuss how the assumptions on which the analytical solutions are based are often violated in practice. In such cases, Monte Carlo simulation has proven to be useful. We conclude this chapter by showing how to apply Monte Carlo simulation to a complex supply chain problem, which is also supported by the online web application. In Chap. 4, we cover the uncertainty modelling approach along two dimensions. First, we discuss the additive and multiplicative demand models to capture the evolutionary dynamics of uncertainty. Second, we introduce advanced analytical methods (i.e., the fast Fourier transform (FFT) and demand regularization methods) to model complex demand dynamics when demand is formed as a combination of different uncertain elements. In Chap. 5, we present the methods used to enhance supply chain responsiveness, such as lead time reduction, sourcing flexibility, multiple sourcing and capacity buffers. Then, we continue with managing product variety in Chap. 6, where we explore the importance of the uncertainty modelling approach in product-portfolio management along the supply chain. In Chap. 7, we focus on supply risks, where we quantify the cost of supply disruptions and shortfalls. We then elaborate on the value of operational flexibility in mitigating supply risks.

Supply chain management has a direct impact on cash-flow management. To generate revenues in a short time period, it is necessary to reduce the time between the procurement of raw materials and the delivery of the final products to customers. Apart from that, companies can use different instruments, such as trade credit, dynamic discounting, and reverse factoring, to finance their operations. While an effective operational strategy would have positive implications for cash-flow management, it may fail to generate sustainable profits if it is not coupled with the right financial strategy. We explore the interaction between financial and operational dynamics in Chap. 8 by focusing on supply chain finance. In Chap. 9, we discuss some future trends of artificial intelligence (AI) and how it can be adapted to solve some important supply chain problems. In particular, we highlight the potential and limitations of AI as a decision tool in supply chain management.

We aim to make each chapter as self-sufficient as possible. Accordingly, each chapter ends with an appendix of the derivations of the analytical models used. There is one appendix at the end of the book that presents the preliminaries of Python programming. It will help readers who are unfamiliar with Python learn its basics, so that they can understand the Python code given in the online web application.

Toronto, ON, Canada
February 2023
Işık Biçer

3 Biçer, I. (2022). Securing the upside of digital transformation before implementation. California Management Review. https://cmr.berkeley.edu/2022/05/securing-the-upside-of-digitaltransformation-before-implementation/.
Acknowledgements
First, I would like to thank my institution (Schulich School of Business, York University), my doctoral and post-doctoral supervisors, colleagues and research partners who encouraged me to study supply chain analytics. This book would not have been possible without their support. My special thanks go to Suzanne de Treville. She introduced me to the world of financial modelling applied to supply chain management problems. She is also the one who endured my poor communication skills during my doctoral studies at the University of Lausanne and helped me transition myself from an operations research scientist to a confident academic. I am also extremely grateful to Ralf Seifert who helped me improve my knowledge on supply chain finance and general concepts of supply chain management during my post-doctoral studies at the Swiss Federal Institute of Technology (EPFL). I have greatly benefited from my research collaborations with Florian Lucker, Verena Hagspiel, Murat Kristal, David Johnston, Murat Tarakcı, Ayhan Kuzu, Taner Bilgiç, Benjamin Avanzi, Merve Kırcı, Yara Kayyali-Elalem, Philipp Schneider and Ata Şenkon on different projects. I hope to continue to work closely with them on some future projects. I have had a lot of constructive feedback from Özalp Özer, Lauri Saarinen and Ioannis Fragkos on different research papers such that their comments helped me later write this book. I appreciate very much their efforts, and acknowledge that all errors and omissions are naturally my responsibility. I am also grateful to my former colleagues at the Rotterdam School of Management, Erasmus University, where I had many productive research conversations during coffee and lunch breaks. While writing this book, my former graduate assistants Dhrun Lauwers and Vivian Duygu Parlak helped me very much. Duygu read earlier drafts several times and provided me with valuable feedback. Dhrun’s efforts were indispensable such that he transferred my initial codes in R to Python and created the online web application in Python. He also developed the webpage of the book in collaboration with Brenn Kha. Beverley Lennox copyedited the book and helped me correct several typos and grammar mistakes. Christian Peterson produced chapter videos (available on the book webpage) that summarize each chapter. Our administrative assistant Paula Gowdie Rose managed all administrative tasks. I deeply appreciate the time and effort of these wonderful people.
Finally, I would like to thank my family, my wife Gökçe, my sons Cem and Eren, for their unconditional love and support.

Toronto, ON, Canada
February 2023
Işık Biçer
Contents

1 Introduction and Risk Analysis in Supply Chains
  1.1 Internet Era and Shift in Supply Chains
  1.2 Three Challenges of Modern Supply Chains
  1.3 Supply Chain Management Framework
  1.4 Supply Chain Integration
  1.5 Supply Chain Analytics at the Interface of Supply Chain Management and Data Analytics
  1.6 Case Study: Kordsa Inc.
  1.7 Chapter Summary
  1.8 Practice Examples
  1.9 Exercises
  References

2 Analytical Foundations: Predictive and Prescriptive Analytics
  2.1 Predictive Analytics
    2.1.1 Bias-Variance Trade-Off
    2.1.2 Linear Models
    2.1.3 Matrix Formation
    2.1.4 Generalized Least Squares (GLS)
    2.1.5 Dealing with Endogeneity Problems
    2.1.6 Regularization Methods
    2.1.7 Classification Methods
    2.1.8 Time Series Analysis
  2.2 Prescriptive Analytics
    2.2.1 Taylor’s Expansion
    2.2.2 Convexity
    2.2.3 Newton’s Method
    2.2.4 Gradient Descent Method
    2.2.5 Lagrange Optimization
  2.3 Chapter Summary
  2.4 Practice Examples
  2.5 Exercises
  References

3 Inventory Management Under Demand Uncertainty
  3.1 Inventory Productivity and Financial Performance
  3.2 Single-Period Models
    3.2.1 Single-Period Example
  3.3 Multi-period Models
    3.3.1 Multi-period Example
  3.4 Reorder Model with Continuous Review
    3.4.1 Reorder Model Example
  3.5 Monte Carlo Simulation for Inventory Models
  3.6 Chapter Summary
  3.7 Practice Examples
  3.8 Exercises
  3.9 Appendix to Chap. 3
    3.9.1 Analytical Solution to the Newsvendor Problem
    3.9.2 Analytical Solution to the Base Stock Problem
  References

4 Uncertainty Modelling
  4.1 Uncertainty Modelling Versus Demand Forecasting
  4.2 Evolutionary Dynamics of Uncertainty
    4.2.1 Additive Demand Models
    4.2.2 Multiplicative Demand Models
  4.3 Integration of Uncertain Elements in a Unified Model
    4.3.1 Inventory Management with the Additive Integration of Uncertain Elements
    4.3.2 Inventory Management with the Multiplicative Integration of Uncertain Elements
    4.3.3 A Commonly Used Multiplicative Model: Jump Diffusion Process
    4.3.4 Example
  4.4 Demand Regularization
  4.5 Chapter Summary
  4.6 Practice Examples
  4.7 Exercises
  References

5 Supply Chain Responsiveness
  5.1 Lead Time Reduction
  5.2 Multiple Sourcing
  5.3 Quantity Flexibility Contracts
  5.4 Multiple and Sequential Ordering Problems
  5.5 Multiple Ordering in a Multi-echelon Model
  5.6 Chapter Summary
  5.7 Practice Examples
  5.8 Exercises
  5.9 Appendix to Chap. 5
    5.9.1 Derivation of $Q_1^*$ and $K^*$
  References

6 Managing Product Variety
  6.1 Mean-Variance Analysis for Product Selection
  6.2 Resource Allocation and Capacity Management
  6.3 Multiple Ordering Model with Multiple Products
  6.4 Product Proliferation Model
  6.5 Operational Excellence
  6.6 Chapter Summary
  6.7 Practice Examples
  6.8 Exercises
  6.9 Appendix to Chap. 6
    6.9.1 Mean-Variance Analysis Derivations
    6.9.2 Multi-product Newsvendor Model
  References

7 Managing the Supply Risk
  7.1 Type-1 Disruption Risk
  7.2 Type-2 Disruption Risk
  7.3 Implications of the Shipment Ownership for Global Trade
  7.4 Type-3 Risk: Delivery Shortfalls
  7.5 Chapter Summary
  7.6 Practice Examples
  7.7 Exercises
  References

8 Supply Chain Finance
  8.1 Early Payment Scheme
  8.2 Reverse Factoring
  8.3 Letter of Credit
  8.4 Dynamic Discounting
    8.4.1 Market Mechanism of Dynamic Discounting
  8.5 Chapter Summary
  8.6 Practice Examples
  8.7 Exercises
  References

9 Future Trends: AI and Beyond
  9.1 Artificial Neural Networks
  9.2 Activation Functions
  9.3 Model Training
  9.4 ANNs in Inventory Management
  9.5 Chapter Summary
  References

A Introduction to Python Programming for Supply Chain Analytics
  A.1 NumPy
  A.2 SciPy
  A.3 Pandas
  A.4 Matplotlib
  A.5 General Programming Concepts
  Reference

Index
1 Introduction and Risk Analysis in Supply Chains
Keywords
Risk analysis · Supply chain management · Supply chain framework · Supply chain integration
Supply chain management (SCM) has a long history that dates back to ancient times when traders used the Royal Road and the Silk Road to connect the Far East to the Middle East and Western Europe (Coates, 2012). Analytical methods, however, have only been applied to complex supply chain management problems since the World War II era, when operations research (OR) emerged as a field (Dantzig, 2002; Dreyfus, 2002). OR is an analytical discipline that uses a combination of optimization, statistical and computational methods to solve complex real-world problems. It is the theoretical foundation of supply chain analytics because supply chain problems can be formulated, with some assumptions, using OR models. There are several subcategories of OR that use different optimization approaches. The classical problems to which OR methods and techniques are applied include inventory management, lot sizing, routing and facility layout design. The main challenge in OR applications is identifying how to cast a real-world setting into a theoretical framework and solve it using the right approach.1

There are four elements that must be defined in the preliminary stage of developing an OR model (Dantzig, 1955). First, decision variables are variables that decision-makers can control. For example, production quantity is a decision variable in production planning problems because planners determine production quantities. Second, state variables are those that describe the system, but they cannot be directly controlled by decision-makers. Customer demand is a state variable in inventory management problems: decision-makers determine inventory levels to meet demand, which is treated as an exogenous parameter. Third, constraints describe the limitations of decision-makers. For example, capacity is a constraint in inventory management problems because production quantity (i.e. the decision variable) cannot be more than the capacity limit. Finally, an objective function is a mathematical representation of the objective of a decision-maker. For example, decision-makers often aim to make decisions to maximize profit. Thus, profit formulation can be an objective function that needs to be maximized. In different cases, cost function can be an objective function that needs to be minimized.

Naturally, OR aims to address the following questions. What are the constraints, decision variables and state variables? What is the objective function? Which OR method should be used to find the optimal values of decision variables? These were difficult questions to answer, and OR/SCM researchers had mainly focused on them before the 1990s (Dantzig, 2002; Dreyfus, 2002). Owing to these efforts, companies achieved substantial savings and passed these savings on to consumers in the form of competitive prices.

1 In the field of OR, such models are called mathematical models.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_1
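The four elements of an OR model described above can be illustrated with a minimal Python sketch. It is not taken from the book: the class name `ProductionModel`, the function `best_quantity` and all numbers (price, cost, capacity, demand) are hypothetical, and the optimum is found by simple enumeration rather than by a dedicated OR method.

```python
from dataclasses import dataclass


@dataclass
class ProductionModel:
    """Hypothetical single-product production planning model."""
    unit_price: float  # revenue per unit sold (assumed value)
    unit_cost: float   # production cost per unit (assumed value)
    capacity: int      # constraint: upper limit on the production quantity
    demand: int        # state variable: exogenous, not controlled by the planner

    def is_feasible(self, quantity: int) -> bool:
        # Constraint: the decision variable cannot exceed the capacity limit.
        return 0 <= quantity <= self.capacity

    def profit(self, quantity: int) -> float:
        # Objective function: revenue on units sold minus cost of units produced.
        units_sold = min(quantity, self.demand)
        return self.unit_price * units_sold - self.unit_cost * quantity


def best_quantity(model: ProductionModel) -> int:
    # Decision variable: the production quantity, chosen by enumeration
    # over all feasible integer values.
    feasible = [q for q in range(model.capacity + 1) if model.is_feasible(q)]
    return max(feasible, key=model.profit)


model = ProductionModel(unit_price=10.0, unit_cost=4.0, capacity=500, demand=400)
optimal_q = best_quantity(model)
optimal_profit = model.profit(optimal_q)
print(optimal_q, optimal_profit)
```

With these assumed numbers, demand is known and lies below capacity, so the best quantity simply equals demand. The later chapters deal with the harder case in which demand is uncertain.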
1.1 Internet Era and Shift in Supply Chains
After the adoption of Internet technologies in the 1990s, supply chain management experienced a shift in business models (Swaminathan & Tayur, 2003). The Internet made it possible to fulfil customer demand through different channels. Since its founding in 1994, Amazon has disrupted the retail industry by eliminating retail stores and selling products online. Nowadays, we cannot imagine any big retailer that does not offer online shopping to its customers. The Internet has also improved the connectivity between supply chain parties, making the information flow much faster and more reliable. When a customer places an online order for a product, she can track all the stages of the shipment. Using this information, the delivery time can be predicted accurately. Social media platforms and product webpages are valuable information sources for companies: they provide product reviews, likes and scores, which in turn help companies better estimate future demand (Gallaugher & Ransbotham, 2010).

Such advances have made classic demand forecasting models obsolete. Historically, supply chain models relied on statistical distributions fitted to empirical demand data. But now, companies collect information about different explanatory variables from different sources, and each variable is updated at its own frequency. A dynamic action space adds to this complexity: several decisions are made at different time epochs, and each is exposed to a different level of uncertainty because information about the explanatory variables is updated frequently.

Let’s consider a very simple supply chain example where organizers of a sports event request catering services from a local restaurant. The organizers offer free pasta to the sports event participants, and the restaurant prepares the food.
The organizers contacted the restaurant 1 month before the event and told the chef they expected around 400 participants. But the exact number is not known because
registration for the event will remain open until 2 days before the event. Suppose only two kinds of pasta are offered (e.g. tomato and cheese pasta) and participants will be asked to choose one when they register. The chef first orders the pasta and cheese from a wholesaler 1 week before the event. She also decides to buy tomatoes from a local grocery store 2 days before the event, so they will still be fresh when she starts to cook the pasta. The amount of pasta ordered is enough to fulfil the demand of 600 people maximum. The chef also assumes that half of the participants will prefer cheese pasta, so she orders enough cheese for 300 participants. After the registration closes (i.e. 2 days before the event), she finds out that 550 participants have already registered for the event and 200 of them want to eat cheese pasta while the rest prefer tomato pasta. So, the chef decides to buy enough tomatoes to serve 350 participants. In the end, the total demand for cheese pasta turns out to be 200 units, while the total demand for tomato pasta is 350 units. All demand is fulfilled. However, the restaurant ends up with an excess stock of cheese and pasta—enough to serve 100 and 50 customers, respectively. If the cheese is not consumed by its best-before date, it must be thrown away. On the other hand, the amount of tomatoes bought from the local grocery store was a perfect match with the demand. In this setting, there are two decision epochs. The decisions, timeline and information update are depicted in Fig. 1.1. First, the chef must determine the amount of pasta and cheese to order 1 week before the event. Second, she should decide on the amount of tomatoes 2 days before the event. The first set of decisions also influences the chef’s decision at the second decision epoch. For example, it does not make any sense to buy enough tomatoes to serve more than 600 participants given that the amount of pasta ordered is only enough for 600 participants. 
The organizers collect valuable information to estimate demand
Fig. 1.1 Decision epochs, timeline and information update structure for the pasta example
over time including the number of people who have already registered for the event and their meal preferences. This information evolves over time. Therefore, there are two main challenges associated with this decision-making process. The first one is the dynamic action space such that the chef must make sequential decisions that are related to each other and influenced by the state of information. The second challenge is demand uncertainty. The uncertainty decreases over time as more information is collected to better estimate demand. So, the evolutionary dynamics of demand uncertainty should be incorporated into the dynamic action space to better match the amount of pasta with the number of participants. This makes uncertainty modelling very important for the restaurant’s supply chain management.
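The two decision epochs of the pasta example can be traced with a short sketch. The quantities come from the example itself; the function name `plan_second_epoch` and the rule that the tomato order is capped by the remaining pasta are illustrative assumptions, not a method prescribed in the text.

```python
def plan_second_epoch(pasta_ordered, cheese_ordered, cheese_demand, tomato_demand):
    """Second-epoch decision: tomatoes are bought after registration closes.

    All quantities are expressed in servings. The first-epoch orders
    (pasta and cheese) are already fixed when this decision is made.
    """
    # Cheese pasta is limited by both the cheese and the pasta on hand.
    cheese_served = min(cheese_demand, cheese_ordered, pasta_ordered)
    # Tomatoes are ordered under full information, capped by leftover pasta.
    tomato_order = min(tomato_demand, pasta_ordered - cheese_served)
    return {
        "tomato_order": tomato_order,
        "excess_cheese": cheese_ordered - cheese_served,
        "excess_pasta": pasta_ordered - cheese_served - tomato_order,
    }


# First epoch (1 week before): pasta for 600, cheese for 300.
# Second epoch (2 days before): 200 cheese and 350 tomato registrations.
result = plan_second_epoch(600, 300, 200, 350)
print(result)
```

The sketch reproduces the outcome described in the example: the tomato order matches demand exactly, while the first-epoch orders leave excess cheese and pasta.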
1.2 Three Challenges of Modern Supply Chains
The structure of the chef’s problem exemplifies a supply chain management setting in which she purchases the ingredients, transforms them into two different types of pasta and then delivers meals to the event organizers. However, this is a much simpler setting than what companies face in practice. In the chef’s case, the information about the number of people who registered for the event is directly translated into a demand value. In practice, however, companies collect information about several explanatory variables and feed them into statistical models to estimate demand.
As the number of explanatory variables increases, statistical models could suffer from an overfitting problem. Overfitting occurs when the number of observations is limited compared to the number of explanatory variables. Suppose that a retailer uses ten different variables (e.g. customer age, gender, shopping frequency, loyalty card membership status, etc.) to estimate demand. A niche item was sold only ten times in the past. If the retailer tries to estimate the demand for the item using the ten variables, an overfitting problem occurs. In this case, each variable identifies one sale incident, and the data fits perfectly. But the overfitted model fails to predict future demand as new purchases of the item are observed. The overfitting problem is related to a high-dimensional state space.
In addition to the overfitting problem, supply chains also suffer from a dynamic action space and high demand uncertainty. Therefore, there are three important challenges of modelling and solving supply chain management problems:
1. Sequential decision-making process and its dynamic action space
2. Demand uncertainty and its evolutionary dynamics
3. High dimension of explanatory variables
The first challenge has long been studied by optimization researchers. The field of dynamic programming focuses on sequential decision-making problems (Puterman, 2014).
One of the principal outcomes of applying dynamic programming to supply chain management problems is that it is possible to obtain optimal or near-optimal solutions using marginal analysis (Biçer & Seifert, 2017; Biçer et al., 2022a). We will elaborate on this in the following chapters. The third challenge is distinct from
the other two. When there are many explanatory variables available to estimate demand, it would be possible to preprocess a vast amount of data to reduce the dimension. There are alternative dimension reduction methods that can be used for this purpose. In machine learning, L1 and L2 regularization techniques and principal component analysis address the third challenge (Hastie et al., 2001). Such techniques are readily available in libraries for programming languages such as Python and R. Relying on marginal analysis to address the first challenge and on dimension reduction methods to ease the problems related to the third challenge, we can pay more attention to demand uncertainty and its evolutionary dynamics. We argue that for the problems within the boundaries of supply chain management, the second challenge deserves more attention than the others. Explicit modelling of the uncertainty and its evolutionary dynamics makes it possible to compare alternative supply chain responsiveness strategies. For example, manufacturers can source raw materials from different suppliers (e.g. low-cost vs. responsive suppliers) that have different lead times. Although low-cost suppliers offer items at low prices, decision-makers need to place orders well in advance of the time of delivery. In contrast, responsive suppliers allow decision-makers to postpone the ordering decision, so decision-makers can collect valuable information about market demand before placing their procurement orders. However, the procurement cost is expected to be higher if the products are purchased from responsive suppliers. The decision of whether to source products from a low-cost or a responsive supplier directly depends on the evolutionary dynamics of the uncertainty model. Wider adoption of Internet technologies has facilitated access to different suppliers with different lead times and costs. Suppliers can also sell their products through different channels with different delivery options and prices. 
As a result, decision-makers are often exposed to such trade-offs, which makes uncertainty modelling highly critical for supply chain management. Another important aspect of uncertainty modelling is that sequential decisions can be optimized with the use of an uncertainty model. Let’s go back to the chef example in which there are two decision epochs. To determine the amount of pasta and cheese to order, the chef needs information on the level of uncertainty at the second decision epoch. The second decision (i.e. the amount of tomatoes to order) is made after the number of participants and their meal preferences are fully known. Therefore, the demand for tomato pasta can be fully fulfilled once there is enough pasta ordered at the first decision epoch without any risk of having an excess stock of tomatoes in the end. This would induce the chef to order pasta in large amounts, compared to what would have been ordered if the second decision had been made under uncertainty. Therefore, the flow of information would change the level of demand uncertainty that a decision-maker is exposed to, which in turn would have an impact on her ordering behaviour. For that reason, the flow of information and goods must be jointly taken into account while modelling the demand uncertainty.
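The overfitting problem described earlier in this section (ten explanatory variables, ten past sales) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the data are synthetic, only the first variable is assumed to drive demand, and an ordinary least-squares fit without an intercept stands in for the retailer's statistical model.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SALES, N_VARS = 10, 10  # ten past sales, ten explanatory variables


def simulate(n):
    """Synthetic data (assumption): demand depends on the first variable only."""
    X = rng.normal(size=(n, N_VARS))
    y = 2.0 * X[:, 0] + rng.normal(scale=1.0, size=n)
    return X, y


def rmse(X, y, coef):
    """Root mean squared prediction error of a linear model."""
    return float(np.sqrt(np.mean((X @ coef - y) ** 2)))


X_train, y_train = simulate(N_SALES)   # the ten historical sales
X_new, y_new = simulate(200)           # purchases observed later

# With as many variables as observations, least squares can
# reproduce the historical data (almost) exactly.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

train_rmse = rmse(X_train, y_train, coef)
new_rmse = rmse(X_new, y_new, coef)
print(f"error on past sales: {train_rmse:.2e}")
print(f"error on new sales:  {new_rmse:.2f}")
```

The fitted model matches the ten past sales almost perfectly because it has absorbed the noise in them, so its error on new purchases is far larger. Regularization or dimension reduction is meant to limit exactly this effect.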
1.3 Supply Chain Management Framework
Supply chain management is a discipline that deals with activities starting with the procurement of raw materials, transforming them into finished goods and delivering the finished goods to end customers (Simchi-Levi et al., 2008). Naturally, supply chain practices focus on the seamless flow of goods by successfully carrying out operational tasks. However, a complete supply chain framework goes beyond the operational aspects of supply chain management. There are several non-operational tasks that need to be performed to maintain an effective supply chain, and a unified framework should include some key non-operational tasks. Figure 1.2 depicts a supply chain framework that includes three layers and shows the interactions among them (Biçer, 2022). Initially, companies procure some items from their suppliers. For example, manufacturers order raw materials or retailers buy end products from their suppliers. When they place a procurement order, the requested delivery time is specified. The time that elapses from the instant the procurement order is placed to when it is delivered by the supplier is called supply lead time. The supply lead time differs among suppliers. Although domestic
Fig. 1.2 Supply chain framework. There are three layers: (1) information, (2) operational and (3) capital flow layers
suppliers are usually expected to offer shorter supply lead times, it is not uncommon for offshore suppliers to be able to deliver ordered items faster than domestic suppliers (e.g. by utilizing air cargo rather than ocean transhipment). The items procured from suppliers are kept in stock until they are delivered to customers. The time period the inventory is kept in stock is called days of inventory. Days of inventory is expected to be greater for manufacturers than retailers because manufacturers transform the raw materials into finished goods. On the other hand, retailers sell the items purchased from their suppliers directly to their customers without transforming them into other products. The absence of long manufacturing steps makes days of inventory shorter for retailers than manufacturers. The sum of supply lead time and days of inventory is referred to as operating lead time. Operating lead time is the total time that a company needs to create value for customers. It starts with the procurement order of raw materials from a supplier and ends when final products are delivered to customers. The time period that elapses from the instant a customer places a purchase order to when it is delivered to the customer is called demand lead time. This time duration is highly critical for companies, and it sometimes influences business models. When a customer issues a purchase order, the customer’s exact demand is revealed to the seller. Therefore, the seller has the full information about total demand during the demand lead time. If a seller has a demand lead time of 20 days, for example, she knows the exact quantity of demand for the next 20 days. Hence, there is no demand uncertainty within this time duration. Demand lead time can also be considered an indication of customers’ willingness to buy a product. The longer the demand lead time, the more willingness the customers show to buy the product. 
For example, customer orders can be fulfilled without bearing the risk of excess inventory when:
Demand lead time > Operating lead time.
In this case, the decision-makers first receive orders from customers. Then, they order items from suppliers and transform them into finished goods. Finally, they deliver the goods to the customers. Therefore, decision-makers do not need to pile up any inventory until demand is fully known, so inventory risk can be fully eliminated. In other words, customers wait for the products. Demand lead time is expected to be longer for manufacturers than retailers. Manufacturers usually sell their products to other companies, not to end consumers. Their customers tend to place bulk orders, in which a large volume of the same item is requested at once. Such orders are placed well in advance of the requested delivery times. As a result, demand lead times can be quite long, and they vary substantially between different orders. In retail stores, demand lead time is almost zero because customers buy the products when they visit the stores. Prior to their visits, they do not make any commitments to buy a product; instead, the products wait for the customers on the shelves of retailers. In this case, decision-makers first order items from suppliers and transform them into finished goods, if necessary. They then display them for customers; therefore, customers do not have to wait for the products to be produced. We remark that demand lead time may vary
depending on the sales channel in the retail industry. In physical stores, customers pick items from shelves and proceed to checkout. Therefore, the demand lead time is almost zero for physical stores. Thus, retailers can sell the items in physical stores if they have enough stock. For online channels, demand lead time varies depending on the availability of items. Retailers offer a more diverse range of products (i.e. both popular and niche products) online than they do in physical stores. Customers who order niche products online are often willing to wait for delivery. Therefore, demand lead time can be longer for online orders than in physical stores. In practice, demand lead time is often positive and less than operating lead time: Operating lead time > Demand lead time > 0.
In this situation, decision-makers first order items from suppliers and transform them into finished goods, if necessary. Customer orders are delivered within the demand lead time. When the demand lead time is relatively long, companies do not need to keep finished goods in stock. They may keep semi-finished components in stock if the remaining tasks, including delivery to the customers, can be completed within the demand lead time. The earliest stage in the supply chain that allows the remaining tasks to be done during the demand lead time is called the decoupling point. Companies have to perform some tasks after the decoupling point, but these tasks are carried out with exact information about the final demand. The tasks that are performed before the decoupling point are done under demand uncertainty. The resources used for the tasks before the decoupling point may be overutilized due to demand uncertainty. For example, a manufacturer processes raw materials through tasks A, B and C in sequence to obtain finished goods. Tasks A and B are carried out before the decoupling point, whereas task C is performed after the decoupling point. Suppose that the expected demand for the final goods is equal to 300 units at the time when tasks A and B are carried out. Then, the manufacturer observes the exact amount of demand at the decoupling point, which is equal to 200 units. Therefore, the resources used for tasks A and B are overutilized by 50% (300 units were processed where 200 would have sufficed), while the resources used for task C are not overutilized. For capital-intensive tasks, it is very important to avoid overutilization, so they should be postponed until after the decoupling point. To do so, some companies invest in process flexibility to redesign the processes so that the capital-intensive ones are deferred (Heskett & Signorelli, 1989; Lee & Tang, 1998). 
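The overutilization figure in the task A, B and C example can be reproduced with a one-line calculation. The sketch below assumes that overutilization is measured relative to the realized demand rather than the planned quantity; the text does not state the base explicitly, but this is the reading that yields the stated percentage. The function name is hypothetical.

```python
def overutilization(planned_units, actual_units):
    """Extra upstream work as a fraction of what realized demand required.

    Assumption: the base of the ratio is the realized demand.
    """
    return (planned_units - actual_units) / actual_units


# Tasks A and B were run for the expected 300 units; demand was 200 units.
rate = overutilization(planned_units=300, actual_units=200)
print(f"{rate:.0%}")
```

Task C is performed after the decoupling point, so its planned and actual quantities coincide and its overutilization is zero.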
In practice, designing supply chains and positioning the decoupling point are highly critical when sequential decisions are made under demand uncertainty. Consider a fashion apparel producer that sells its products to retailers. The producer purchases yarns from suppliers and makes clothes in two stages. First, yarns are woven into textiles. Second, the textiles are sewn into differently sized clothing. The supply lead time for the yarn is equal to 1 month. It takes another month to produce the textiles. Finally, the processing time for sewing is also equal to 1 month. The final products are delivered to retail stores in 2 weeks. Suppose that the demand
lead time is equal to 2 months. Therefore, the decoupling point becomes the sewing operation. The supply chain structure for this example is shown in Fig. 1.3. The procurement order for the yarns and the production quantity for the textiles are determined under demand uncertainty because these operations are scheduled to be completed before the decoupling point. However, the sewing operation is carried out when demand is fully known. In this setting, the sewing operation is considered the point of differentiation. The same yarns and textile are used for clothes of different sizes. For that reason, the first two decisions are made to fulfil consolidated demand. However, order quantities for different stock keeping units (SKUs) should be decided at the beginning of the sewing operation. Thus, decision-makers face high product variety at the sewing operation. Nevertheless, designing the supply chain such that the decoupling point becomes the sewing operation alleviates the difficulty of managing high product variety. The supply chain framework shown in Fig. 1.2 gives us an idea about the position of the decoupling point. Without further information about the sequence of the processes, we may consider the time of receiving the customer order (i.e. the starting point of the demand lead time) the decoupling point. The difference between operating lead time and the demand lead time is called the decision lead time: Decision lead time = Operating lead time − Demand lead time.
The decision lead time is the time that elapses from the instant when the procurement order is placed with the supplier until the actual demand is observed. The longer the decision lead time, the higher the demand uncertainty. Companies with long decision lead times must make their ordering decisions under high uncertainty, well in advance of the realization of actual demand. As time passes, they can collect valuable information about final demand and use such information to reduce demand uncertainty. If such companies reduce their decision lead times, ordering decisions can be postponed until partial or full resolution of demand uncertainty. This makes it possible to base ordering decisions on more accurate demand information and helps companies better match supply with demand. Decision lead time is expected to be longer for downstream entities of a supply chain (e.g. retailers) than for upstream entities (e.g. raw material suppliers) because demand lead time is often shorter for retailers than for manufacturers, as explained above. For that reason, companies selling their products directly to consumers face longer decision lead times than those selling to wholesalers. For example, wheat farmers may sell their crop in farmers’ markets, so they only learn the demand for their crop and the base price after harvesting. However, sugar beet farmers tend to contract with sugar refineries before planting their crop. Refineries set the base price and maximum purchase quantity depending on such factors as the area of farm land and the crop yield of farmers in previous years. Therefore, sugar beet farmers know market demand and base price even before planting their crop (which is not the case for wheat farmers) because their customers (i.e. refineries) are intermediate supply chain entities rather than end consumers, unlike the customers of wheat farmers.
Fig. 1.3 Timeline of ordering decisions for the fashion apparel example. The decoupling point is shown by a red spot, which is the sewing operation
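The position of the decoupling point in the fashion apparel example can be derived by walking backwards from delivery and accumulating stage durations until the demand lead time is exceeded. The sketch below is illustrative only: the 2-week delivery time is assumed to equal 0.5 month, the operating lead time is assumed to be the sum of all four stage durations, and the function name `find_decoupling_stage` is hypothetical.

```python
# Stage durations in months, in processing order.
# Assumption: the 2-week delivery time is treated as 0.5 month.
STAGES = [
    ("yarn procurement", 1.0),
    ("weaving", 1.0),
    ("sewing", 1.0),
    ("delivery", 0.5),
]
DEMAND_LEAD_TIME = 2.0  # months


def find_decoupling_stage(stages, demand_lead_time):
    """Earliest stage from which all remaining work fits in the demand lead time.

    Returns None if not even the last stage fits.
    """
    remaining = 0.0
    decoupling = None
    # Walk backwards from delivery, accumulating the remaining time.
    for name, duration in reversed(stages):
        remaining += duration
        if remaining > demand_lead_time:
            break
        decoupling = name
    return decoupling


decoupling = find_decoupling_stage(STAGES, DEMAND_LEAD_TIME)

# Decision lead time = operating lead time - demand lead time.
operating_lead_time = sum(duration for _, duration in STAGES)
decision_lead_time = operating_lead_time - DEMAND_LEAD_TIME

print(decoupling, operating_lead_time, decision_lead_time)
```

Sewing and delivery together take 1.5 months, which fits within the 2-month demand lead time, whereas adding weaving would require 2.5 months. Under these assumptions the yarn order is placed 1.5 months before demand is observed.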
The comparison between wheat and sugar beet farmers brings us to another interesting question: Why do some farmers prefer to sell their crop to other downstream supply chain entities (e.g. wholesalers and refineries) whereas others want to sell to end consumers at the expense of having long decision lead times? Although selling directly to end consumers leads to long decision lead times, which translates into high demand uncertainty and high supply-demand mismatches, it offers some financial advantages over selling to downstream supply chain entities. End consumers often pay immediately upon receiving products. However, corporations tend to extend payment terms and delay the payment to upstream suppliers. As payment terms increase, suppliers must wait longer to get paid by their buyers, and their accounts receivable increase accordingly. For instance, if a supplier sells its products worth $5000 to a retailer with a payment term of 60 days (i.e. the retailer pays the supplier for the products purchased 60 days after delivery), the value of this transaction (i.e. $5000) is recorded in accounts receivable on the supplier’s balance sheet. If the supplier sells directly to end consumers with immediate payment, the value of the transaction is recorded in the cash accounts of the supplier. The cash flow uncertainty of a company increases with the total value of accounts receivable. Companies are exposed to the default risk of their invoices when their buyers fail to remit payments for the products purchased. In such a case, they write down accounts receivable and increase bad debt charges. In 2020, some well-known brands (e.g. Nike Inc. and Samsonite International SA) wrote down accounts receivable and disclosed millions of dollars in bad debt charges because they did not expect to get paid by some financially distressed retailers in the early stages of the Covid pandemic (Broughton, 2020). 
Therefore, suppliers are exposed to a trade-off between demand uncertainty and cash flow uncertainty when they choose between selling their products to retailers and selling them directly to end consumers. If they sell directly to end consumers, this causes long decision lead times and high demand uncertainty but low cash flow uncertainty. If they sell to retailers, it causes high accounts receivable and high cash flow uncertainty but low demand uncertainty. The flow of capital has a significant impact on supply chains, and companies often look for sales channels that allow them to collect revenues in a short time period. For that reason, the flow of information and goods cannot be separated from the flow of capital as illustrated in Fig. 1.2. Companies have different payment terms with their suppliers and customers (Grosse-Ruyken et al., 2011; Schneider et al., 2023). The length of time between the instant when ordered items are received from a supplier and when the payment is made to the supplier is called days of payables. This duration increases as payments to suppliers are delayed, giving the company extra cash to manage its operations. However, suppliers typically request early payments to minimize their working capital needs. The field of supply chain finance deals with such problems and investigates the role of banks and other intermediaries in alleviating cash flow problems for both parties. The time that elapses from the instant when the items are delivered to customers until payments from them are collected is called days of receivables. The sum of days of inventory and days of receivables gives the operating cycle, which refers
to the total length of time it takes to transform the value proposed by the business into cash. In other words, operating cycle is the time elapsed to generate revenue from customers. It starts from the instant when the supplier delivers raw materials to the company because the company has the potential to generate revenue after the delivery from the supplier is completed. It ends when the payment from customers is received. Business models with long operating cycles are often inefficient in terms of the cash flow performance. However, some of the inefficiencies are diverted to suppliers by delaying payments to them. Thus, the length of time during which working capital is needed to maintain the business is: Cash conversion cycle = Operating cycle − Days of payables.
This measure is very important for cash flow and financial management. One of the elements leading to the widespread use of this metric is that its value can be extracted from the financial statements of publicly traded companies. Finance professionals often use the cash conversion cycle together with operating margin (i.e. operating income divided by revenues) to assess the performance of companies. Increasing the responsiveness of supply chain activities to sudden changes in market demand helps improve these two metrics in two ways. First, it reduces the risk of excess inventory so products are kept in stock for a short time period, thereby leading to a decrease in days of inventory. Second, mismatches between supply and demand are minimized, which in turn increases the operating margin. Therefore, supply chain management has a significant impact on the financials. In Chap. 8, we discuss the financial aspects of supply chain management and the four most important supply chain finance alternatives (i.e. early-payment scheme, reverse factoring, letter of credit and dynamic discounting) in detail.
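The relationship between the operating cycle, days of payables and the cash conversion cycle can be written as a small helper. The formulas follow the definitions given in this section; the function names and the input values (45, 60 and 30 days) are hypothetical and chosen only for illustration.

```python
def operating_cycle(days_inventory, days_receivables):
    """Operating cycle = days of inventory + days of receivables."""
    return days_inventory + days_receivables


def cash_conversion_cycle(days_inventory, days_receivables, days_payables):
    """Cash conversion cycle = operating cycle - days of payables."""
    return operating_cycle(days_inventory, days_receivables) - days_payables


# Hypothetical company: 45 days of inventory, customers pay after 60 days,
# and suppliers are paid 30 days after delivery.
ccc = cash_conversion_cycle(days_inventory=45, days_receivables=60, days_payables=30)
print(ccc)  # number of days for which working capital is needed
```

In this sketch, shortening days of inventory or days of receivables, or lengthening days of payables, each reduces the cash conversion cycle by the same number of days.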
1.4 Supply Chain Integration
In supply chain management, the importance of uncertainty modelling is closely linked to supply chain integration. Companies that integrate supply chains effectively can have high control over both downstream and upstream supply chain activities (Flynn et al., 2010). This allows such companies to react quickly to unexpected events, such as demand shocks or surges. Manufacturers and retailers often invest in supply chain integration to improve supply chain responsiveness, so they can adjust inventory-related decisions over time. Imagine a manufacturer that places a production order for a seasonal product at the beginning of the production horizon. The production horizon is assumed to be 3 months. The products are sold to retail stores, not directly to end consumers. During the production horizon, purchase orders from retailers are received. Approximately one-third of total orders are confirmed by retailers within the first month of the production horizon. The advance orders are delivered to the retailers at the beginning of the selling season. The manufacturer uses the advance demand information to improve the accuracy of its demand forecasts. Because the manufacturer knows that the advance
Fig. 1.4 Manufacturer with a partial ordering flexibility can adjust inventory level according to demand updates
demand received within the first month of the production horizon constitutes one-third of total demand (e.g. this ratio can be extracted after analysing historical demand patterns), total demand can be approximated by multiplying the amount of advance demand by 3. This approach would increase the accuracy of the forecasts substantially. It has been well documented that advance demand information can help increase forecast accuracy by as much as 95% (Fisher & Raman, 1996). If the manufacturer has the flexibility to place partial production orders for each month, supply-demand mismatches can be substantially reduced by using the advance demand information. Let’s assume that total demand is expected to be between 10,000 and 30,000 units. The evolution of monthly production orders and information updates is depicted in Fig. 1.4. The production capacity is 10,000 units per month. The manufacturer places a monthly production order of 10,000 units at the very beginning. This amount is low enough to avoid an excess inventory risk because demand is not expected to be lower than 10,000 units. It is also high enough to avoid a capacity shortage risk. If demand turns out to be high, which can be 30,000 units at most, 20,000 units can be produced during the remaining 2 months. The manufacturer collects advance orders and then updates the demand information at the end of the first month such that demand is expected to be between 12,000 and 14,000 units. The second production order of 2000 units is placed at the end of the first month. Thus, the total inventory level reaches 12,000 units. Demand information is updated to be between 13,000 and 13,200 units at the end of the second month. The manufacturer then places the final order with a quantity of 1100 units. The total inventory level at the end of the production horizon equals 13,100 units, and the actual demand is 13,000 units. Therefore, the manufacturer ends up with excess inventory of 100 units. 
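The sequence of partial production orders above can be traced step by step. In the following sketch the order quantities, capacity and actual demand are taken from the example; the function `estimate_total_demand` implements the multiply-by-3 rule, and the advance-order quantity of 4,300 units passed to it is a hypothetical value that does not appear in the text.

```python
MONTHLY_CAPACITY = 10_000
ADVANCE_MULTIPLIER = 3  # advance orders in month 1 are about one-third of total demand


def estimate_total_demand(advance_orders):
    """Extrapolate total demand from first-month advance orders."""
    return advance_orders * ADVANCE_MULTIPLIER


def run_partial_orders(orders, actual_demand):
    """Accumulate monthly production orders and return (inventory, excess)."""
    inventory = 0
    for month, quantity in enumerate(orders, start=1):
        if quantity > MONTHLY_CAPACITY:
            raise ValueError(f"order in month {month} exceeds capacity")
        inventory += quantity
    return inventory, max(inventory - actual_demand, 0)


# Orders placed at the start of months 1, 2 and 3 in the example.
inventory, excess = run_partial_orders([10_000, 2_000, 1_100], actual_demand=13_000)
print(inventory, excess)

# Hypothetical advance orders of 4,300 units after the first month.
print(estimate_total_demand(4_300))
```

Each order is placed only after the demand interval has narrowed, which is why the final inventory ends up so close to actual demand.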
We can compare this outcome with that of a manufacturer without any partial ordering flexibility. Such a manufacturer must determine total quantity to be produced during the production horizon at the very beginning, and there is no flexibility to adjust the order quantity after production starts. Therefore, total
quantity must be determined under high demand uncertainty without using advance demand information. Suppose that the order quantity equals 20,000 units because demand is expected to be between 10,000 and 30,000 units at the beginning of the production horizon. Then the manufacturer ends up with 7000 units of excess inventory when demand turns out to be 13,000 units. Therefore, partial ordering flexibility helps reduce excess inventory by 7000 − 100 = 6900 units in this example. Alternative demand management practices can be adopted by manufacturers depending on the availability of partial ordering flexibility. Without any partial ordering flexibility, manufacturers rely on the accuracy of demand forecasts made at the very beginning. They relentlessly work on finding more sophisticated forecasting models to improve the accuracy of forecasts, which in turn helps them reduce supply-demand mismatches. On the other hand, manufacturers with partial ordering flexibility care less about the forecast accuracy at the very beginning. Instead, they focus on reducing the uncertainty level until the next decision epoch. Therefore, such manufacturers should be forward-looking, trying to collect information from their customers over time, which can be through advance orders, contractual commitments, social media likes or comments, etc. The amount of information collected within a time period can be extrapolated and translated into accurate demand estimates. However, manufacturers without any partial ordering flexibility utilize static forecasting models that use inputs, such as historical data or exogenous factors. Such models often generate demand estimates with high forecasting errors because they lack the information about customers’ willingness to buy the products in the future. In practice, a forward-looking demand management approach with partial ordering flexibility pays off substantially. 
Boohoo, a UK-based online fashion retailer, achieved a very high growth rate over the last 5 years due to its supply chain integration strategy (Monroe, 2021). Unlike most fashion retailers, Boohoo keeps production close to its market, in Leicester, UK. The founders of Boohoo entered the business by opening a textile factory and supplying large retailers, such as Topshop and Primark. Later, it started to sell its products directly to end customers online through the company website. Leveraging an integrated supply chain and understanding the dynamics of merchandise planning, Boohoo orders in very small quantities just to observe initial sales. After observing customers’ reactions to new collections, additional orders are placed for high-demand items, which in turn helps the company reduce its inventory risk (Monroe, 2021). In the Chinese smartphone industry, Oppo/Vivo became the market leader in 2016, owing to its supply chain integration strategy (Seifert et al., 2017). Unlike its competitors, such as Xiaomi, Huawei and Apple, Oppo/Vivo keeps manufacturing, distribution and retail operations in-house. This helps the company adjust inventory according to demand. In 2015 and 2016, Oppo/Vivo ramped up production to meet increasing demand, while its competitors could not react quickly to the upward demand trend. This helped the company become the market leader in China. The examples of Boohoo and Oppo/Vivo can also be found in other industries. It is quite common in practice that supply chain integration coupled with the right supply
chain responsiveness and demand management practices helps improve profits significantly, thereby leading to more sustainable growth for organizations.
Supply chain integration forms the basis for uncertainty modelling (Chap. 4) and supply chain responsiveness (Chap. 5). There is almost no value in modelling how demand uncertainty unfolds over time if the supply chain is disintegrated, where decision-makers do not have any ordering flexibility. In this case, decision-makers aim to improve the forecast accuracy at the very beginning without paying attention to the evolutionary dynamics of demand uncertainty because they cannot update the initial order quantity later after observing partial or full resolution of demand uncertainty. Companies operating in disintegrated supply chains would also underinvest in supply chain responsiveness.
Supply chain responsiveness is the ability of a company to respond quickly to sudden fluctuations in customer demand without keeping too much inventory. It is often confused with lean manufacturing, although there are some fundamental differences between the two. Lean manufacturing emphasizes the importance of keeping inventory at different echelons of a supply chain to ensure a smooth flow of operations, where the flow of inventory is organized by using Kanban cards (Womack et al., 2007). If a customer demands 20 units of a product from a lean supplier, for example, the demand is fulfilled directly from finished goods inventory. Then, the supplier places a production order by sending 20 Kanban cards from finished goods inventory to the adjacent upstream echelon, which makes it possible to transform 20 units of the semi-finished product into finished goods. To have a fully functioning lean manufacturing system, suppliers must keep inventory at different echelons and incur inventory holding costs.
In contrast to lean manufacturing, supply chain responsiveness emphasizes systematically reducing the decision lead time. Suppliers can reduce their decision lead times in alternative ways such as increasing capacity, reducing batch sizes and setup time and increasing the speed of production processes. To employ these alternatives, supply chains must be integrated.
1.5 Supply Chain Analytics at the Interface of Supply Chain Management and Data Analytics
Data analytics is an interdisciplinary field that uses models and methods from computer science, optimization theory, statistics, econometrics and so on (Agarwal & Dhar, 2014; Kopcso & Pachamanova, 2018). There are three branches of data analytics: (1) descriptive analytics, (2) predictive analytics and (3) prescriptive analytics. Descriptive analytics is often considered the first step in data analytics. It aims to describe business practices and systems in a structural and data-centric way with descriptive tools, such as histograms, box plots, pie charts and so on (Nussbaumer-Knaflic, 2015). Decision-makers often use descriptive analytics to answer the questions: (1) What happened, and (2) why did it happen? For that reason, descriptive analytics sometimes reveals the missing links in business
systems and helps organizations identify some opportunities to improve the bottom line. Although descriptive analytics tools are considered to be simple visual representations of data, generating these tools may be highly difficult. Companies typically store transactional data that does not offer any valuable insights in its raw format. Data manipulation and cleansing techniques play an important role in transforming transactional data into useful formats. Descriptive analytics can only be applied to data that has been transformed into a useful format. This preprocessing step is sometimes laborious to complete. Predictive analytics is used to estimate variables of interest, which are not known by decision-makers (Biçer et al., 2022b). In particular, decision-makers are often interested in predicting future outcomes and trends under uncertainty. For that reason, predictive analytics aims to address the question of what will happen. Econometric models that relate explanatory variables to a variable of interest (i.e. a dependent variable) or time series models that use historical values of a variable to estimate its future value exemplify some important predictive analytics tools. Neural networks and AI are also used as predictive analytics tools to estimate variables of interest. In supply chain management, demand is often the dependent variable, and predictive analytics methods are used to estimate the demand. Although both descriptive and predictive analytics are useful in solving some supply chain management problems, the potential of supply chain analytics is unlocked with the application of prescriptive analytics (Bertsimas & Kallus, 2020). Prescriptive analytics aims to prescribe actions to improve the bottom line. It is especially relevant and important for supply chain management as supply chain problems exhibit a high level of complexity due to their dynamic nature, which involves sequential decisions and an evolutionary uncertainty structure. 
Selecting and applying the right prescriptive analytics method is very important for extracting the maximum value from analytics projects. The common practice in supply chain management is that predictive analytics is first used to predict future demand. Then, optimization models use this prediction and distributional properties of future demand to determine the optimal set of actions. However, this two-step approach does not work well when the goal is to optimize upstream decisions. Predictive analytics methods are calibrated based on empirical data such that the aim is to minimize forecast error. Recent research by Elmachtoub and Grigas (2021) has shown that the calibration of the model should be done with the purpose of minimizing the decision error, not the forecast error. This result can be interpreted as follows. Uncertainty modelling should be integrated in prescriptive analytics, not predictive analytics, although it would rely on some predictive analytics methods. Therefore, prescriptive analytics methods with the correct form of uncertainty modelling should be developed on their own. Supply chain analytics has emerged at the interface of supply chain management and data analytics with an objective of utilizing prescriptive analytics to improve supply chain management. There are three pillars of supply chain analytics as shown in Fig. 1.5. The use of prescriptive analytics as described above helps companies improve their competitive edge if decision-makers have control over supply chain activities.
Fig. 1.5 Pillars of supply chain analytics
Supply chains start with upstream activities such as procurement of raw materials and end with downstream activities such as delivery of finished goods to end customers. Having control over both upstream and downstream activities requires supply chains to be integrated. The easiest and most direct way to integrate supply chains is through vertical integration such that manufacturing, distribution and retail operations are performed by a single company. However, it is not compulsory to acquire supply chain partners to achieve supply chain integration. There are other methods that help integrate supply chains without mergers and acquisitions. For example, vendor-managed inventory (VMI) and collaborative planning, forecasting and replenishment (CPFR) help integrate supply chains without vertical integration (Achabal et al., 2000).
The second pillar is operational flexibility, which is described as the ability to adjust inventory and logistics decisions according to evolving demand patterns. Operational flexibility, together with supply chain integration, allows decision-makers to make sequential, interlocking decisions over time. Without operational flexibility, all decisions regarding the activities that take place within a given time period should be made at the beginning of the period. In production systems, for example, manufacturers sometimes set frozen time periods such that activities during a frozen time period are planned in advance and cannot be altered afterwards. Operational flexibility challenges such practices in order to provide decision-makers with the ability to postpone some actions until the accuracy of demand forecasts is improved. In other words, supply chain integration makes it possible for decision-makers to have control over both upstream and downstream activities. However, integrated supply chains can still be rigid such that order quantities cannot be updated over time as products flow from upstream to downstream echelons.
Operational flexibility makes it possible to adjust order quantities as products flow from upstream to downstream echelons and new information about market demand is collected over time.
The last pillar is demand management, which focuses on collecting useful information to characterize demand and its evolutionary dynamics. We emphasize the uncertainty modelling approach as an effective demand management tool in this book because demand is often formed as an additive or multiplicative combination of different uncertain elements (Biçer & Tarakcı, 2021). If a manufacturer sells its products in three different markets, for example, demand is formed as an additive combination of demand values in the three markets. Suppose that demand in the first market is modelled using a statistical distribution such as a lognormal distribution. Demand in the second market follows a stochastic model. Finally, demand in the third market cannot be fit by either a statistical distribution or a stochastic model. However, the manufacturer has an empirical dataset of periodic demand values in the third market. Then, the uncertainty modelling approach aims to incorporate this information in an additive demand model to better characterize demand and its evolutionary dynamics.
If a manufacturer sells its products to a small set of customers, the manufacturer may receive bulky orders from customers. In this case, demand is formed as a multiplicative combination of the number of distinct orders within a time period and the quantity demanded for each order. Suppose that the manufacturer expects the number of distinct orders received from a customer to be between 9 and 12 in a month. The quantity demanded for each distinct order follows a stochastic model. Then, the uncertainty modelling approach aims to incorporate this information in a multiplicative demand model to better characterize demand and its evolutionary dynamics.
Such examples of additive and multiplicative demand models can be found in different settings. For example, online retailers would form a multiplicative demand model with two uncertain parameters: (1) website traffic and (2) choice probabilities. Therefore, the uncertainty modelling approach can be applied to many settings to improve the quality of supply chain decisions.
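The additive and multiplicative demand models described above can be illustrated with a short Monte Carlo sketch in Python. This is only a minimal illustration under stated assumptions, not the method developed in later chapters: every distribution, parameter value and data point below is hypothetical, and the function names are introduced here purely for illustration. The "stochastic model" for the second market is assumed to be a forecast updated by multiplicative shocks, and the third market is handled by resampling its empirical observations.

```python
import numpy as np


def simulate_additive_demand(rng, empirical_market3, n_sims=10_000):
    """Total demand = market 1 + market 2 + market 3 (all parameters hypothetical)."""
    # Market 1: a statistical distribution (lognormal with a median of 1000 units).
    market1 = rng.lognormal(mean=np.log(1000.0), sigma=0.3, size=n_sims)
    # Market 2: a simple stochastic model, assumed here to be an initial forecast
    # of 800 units updated by four multiplicative lognormal shocks.
    shocks = rng.normal(loc=0.0, scale=0.1, size=(n_sims, 4))
    market2 = 800.0 * np.exp(shocks.sum(axis=1))
    # Market 3: no fitted model, so the empirical observations are resampled.
    market3 = rng.choice(empirical_market3, size=n_sims, replace=True)
    return market1 + market2 + market3


def simulate_multiplicative_demand(rng, n_sims=10_000):
    """Monthly demand = sum of the quantities of a random number of orders."""
    # Number of distinct orders: assumed uniform on {9, 10, 11, 12}.
    n_orders = rng.integers(low=9, high=13, size=n_sims)
    # Quantity per order: assumed lognormal with a median of 50 units.
    quantities = rng.lognormal(mean=np.log(50.0), sigma=0.5, size=(n_sims, 12))
    # Keep only the first n_orders quantities in each simulated month.
    mask = np.arange(12) < n_orders[:, None]
    return n_orders, (quantities * mask).sum(axis=1)


rng = np.random.default_rng(seed=0)
empirical_market3 = np.array([420.0, 510.0, 385.0, 600.0, 475.0, 530.0])

additive = simulate_additive_demand(rng, empirical_market3)
n_orders, multiplicative = simulate_multiplicative_demand(rng)

print("Additive model: mean =", round(additive.mean(), 1),
      ", 95th percentile =", round(np.percentile(additive, 95), 1))
print("Multiplicative model: mean =", round(multiplicative.mean(), 1),
      ", 95th percentile =", round(np.percentile(multiplicative, 95), 1))
```

The simulated samples can then be used to read off any quantile of total demand, which is the kind of input an ordering decision would need.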
1.6 Case Study: Kordsa Inc.
We now present a case study to illustrate how supply chain analytics helped Kordsa Inc., a global manufacturer of reinforcing composites (i.e. tire cords), improve profits and gain a competitive edge (Biçer & Tarakcı, 2021; Biçer et al., 2022a). Kordsa is the market leader in developing and manufacturing tire cords that are used to reinforce tires. It has 12 facilities in Brazil, Indonesia, Thailand, Turkey and the USA with 4500 employees, generating $815 million in revenues in 2019 (Biçer & Tarakcı, 2021).
Fig. 1.6 Production processes at Kordsa Inc.
The production of tire cords involves four steps in sequence as shown in Fig. 1.6. The first step produces the polymer yarns from polypropylene through a continuous process. The speed of the process can be adjusted to control the output rate of the polymer yarns. However, it is not possible to stop the yarn production process
because restarting it is too costly. The second step is the twisting operation, where two yarn threads are twisted together to increase the strength of the polymer yarns and make them ready for the weaving operation. The twisting operation consumes too much energy, and it determines the throughput rate of the entire production system because it is the bottleneck operation. Product differentiation occurs during the twisting operation. Switching the operation from one type of yarn twists to another requires a long setup time. To avoid long setup times for some product families, the company has dedicated machines to produce a selection of yarn twists with similar specifications. However, long setup processes are unavoidable for other product families (Biçer & Tarakcı, 2021). The third step is the weaving operation, where twisted yarns are woven to produce plain cords. Product differentiation occurs at this stage because the number of twisted yarns per meter and the width of cords vary across tire cords. Switching the operation between two different tire cords does not require a long setup because Kordsa has substantially reduced the setup time in the weaving operation through the use of automated systems. Finally, woven cords are dipped into a chemical blend to obtain the right level of strength and elasticity. Product differentiation occurs sequentially in the last three operations—namely, twisting, weaving and dipping. In the twisting operation, the number of twists per meter length determines the component specification, which varies across the cords used for different tires (e.g. auto, truck and airplane tires). Therefore, the twisting operation is the first differentiation point of the tire cord manufacturing process. In the weaving operation, the number of twisted yarns used per meter width differs between tire cords, which makes the weaving operation the second differentiation point. 
Finally, the same plain cords can be dipped into different chemical blends for different lengths of time to obtain slightly different specifications depending on customer requests. Therefore, the dipping operation is the last differentiation point. Applying the principles of supply chain analytics, Kordsa has maintained high product variety and reduced inventory-related costs substantially (Biçer et al., 2022b). The company has shown strong financial performance during the last decade and resilience during the pandemic despite the negative impact of the pandemic on the auto industry. Figure 1.7 depicts its stock price over the last decade.
Fig. 1.7 Stock market performance of Kordsa Inc.
Kordsa has a fully integrated supply chain such that all the operations in Fig. 1.6 are maintained in-house. There are two important advantages of supply chain integration. First, the company can offer a high variety of tire cords to its customers cost-efficiently because product differentiation occurs sequentially along the production stages, not at the very beginning. In the earlier production stages, for example, Kordsa is exposed to high demand uncertainty but low product variety. Therefore, demand for different SKUs is pooled to determine the order quantities in the early stages (e.g. yarn production), which in turn alleviates the negative impact of demand uncertainty. As production moves forward, Kordsa collects demand information from its customers to improve the accuracy of demand forecasts. Thus, Kordsa faces high product variety at the later stages, but the negative impact of high variety on operational costs is alleviated owing to improved forecast accuracy. Hence, supply chain integration coupled with demand management makes it possible to maintain high product variety in a cost-efficient way.
The second advantage of supply chain integration is the ability to sell some components to customers. For example, Kordsa sells yarns produced after the first step to some customers. This helps the company increase revenues. As a matter of fact, the production of each component can be considered a touchpoint, which has the potential to contribute to total sales. This potential can only be realized when the supply chain is integrated.
One of the operational challenges that Kordsa faces is related to demand modelling due to its complex dynamics. The company operates in the business-to-business (B2B) environment. Each customer order specifies the SKU demanded, its quantity and the demand lead time. Although the number of distinct orders is predictable, the quantity demanded and the demand lead time vary substantially
across orders. It takes 3–4 weeks to complete production after a production order is placed. If the demand lead time for a customer order is longer than this time length, it is produced to order. If the demand lead time for a customer order is shorter than 3 weeks, it must be fulfilled from the stock. Capitalizing on operational flexibility with the correct form of demand model, Kordsa has successfully integrated the sequential ordering decisions along its supply chain with demand dynamics, which has helped the company reduce its costs by more than 30% and increase its inventory turnover by around 90% (Biçer & Tarakcı, 2021). The company also maintains high product variety (Biçer et al., 2022a).
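The case study notes that pooling demand for different SKUs in the early stages (e.g. yarn production) alleviates the impact of demand uncertainty. A small numerical sketch can make this effect concrete. It assumes, purely for illustration, three SKUs with independent demands; the SKU names, means and standard deviations are hypothetical and are not Kordsa data. Under independence, the standard deviation of total demand is the square root of the sum of the variances, so the pooled demand has a lower coefficient of variation than each individual SKU.

```python
import math

# Hypothetical SKUs that share the same yarn: (mean demand, standard deviation).
skus = {
    "cord_A": (1000.0, 400.0),
    "cord_B": (800.0, 360.0),
    "cord_C": (1200.0, 420.0),
}


def coefficient_of_variation(mean, std):
    """Relative demand uncertainty: standard deviation divided by mean."""
    return std / mean


def pooled_demand(sku_params):
    """Mean and standard deviation of total demand, assuming independent SKUs."""
    total_mean = sum(mean for mean, _ in sku_params.values())
    total_std = math.sqrt(sum(std ** 2 for _, std in sku_params.values()))
    return total_mean, total_std


pooled_mean, pooled_std = pooled_demand(skus)
pooled_cv = coefficient_of_variation(pooled_mean, pooled_std)
individual_cvs = {
    name: coefficient_of_variation(mean, std) for name, (mean, std) in skus.items()
}

for name, cv in individual_cvs.items():
    print(f"{name}: coefficient of variation = {cv:.3f}")
print(f"Pooled (yarn level): coefficient of variation = {pooled_cv:.3f}")
```

If SKU demands were positively correlated, the reduction would be smaller, so the independence assumption should be checked before relying on numbers of this kind.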
1.7 Chapter Summary
Over the last 15 years, I have done academic research and carried out industry projects with different companies. I have observed two misunderstandings in practice. First, practitioners overstate the value of predictive analytics. After we deployed sophisticated predictive analytics tools in some of the projects, we consistently faced some criticism from sales and operations planning teams because the estimates coming from the demand planners were generally more accurate than those of the predictive analytics tools. When we first received this criticism, we retested different models and compared their performance repeatedly to be sure that we hadn’t made any mistakes while developing the models. We later thought that the demand planners had access to some private information, which would explain why their estimates were much better than those of the developed models. But it was not the case. The planners were indeed using information that was available in the sales transactions databases before casting their estimates. The way the predictive analytics methods are deployed, however, makes it impossible to use such information.
In the B2B setting, for example, customers typically place bulky orders. The number of distinct orders received from a customer within a period is uncertain. For each order, the quantity demanded for a product and the demand lead time are both uncertain. Thus, the demand is formed through a process along which there are three uncertain parameters. In predictive analytics, however, we use aggregate periodic demand values to fit a model such that only one uncertain parameter (i.e. periodic demand) is fed into the model and the other three uncertain elements are ignored. Therefore, the deployment of predictive analytics in supply chain management suffers from compressing valuable ordering information into a single aggregate figure, which explains why the estimates of the demand planners were more accurate than those of the predictive analytics tools.
In the online retail setting, for example, customers visit retailers’ websites and make their choices among different alternatives before placing their orders. The number of visitors and their choice probabilities are both uncertain. Predictive analytics ignores the data generation process and uses aggregate periodic demand information. An uncertainty modelling approach challenges the current practice of predictive analytics as shown in Fig. 1.8. Predictive analytics addresses the question: “What will happen in the future?” But uncertainty modelling addresses the question: “How is the data generated?” It aims to use all uncertain parameters that are observed along the data generation process. Owing to its flexibility to handle different uncertain parameters and its ability to be integrated into optimization tools, it has helped organizations in the pharmaceutical, consumer packaged goods, automotive and agriculture industries improve their profits.
Fig. 1.8 Uncertainty modelling as an alternative to predictive analytics
The second misunderstanding is that practitioners often understate the value of supply chain analytics. In the following chapters, we go through the theoretical foundations of supply chain analytics and elaborate on how the tools and methods of supply chain analytics can be used to address supply chain problems effectively. The examples given throughout the book are adapted from real-life problems of companies such as Kordsa, although we avoid mentioning specific company names in the examples.
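To make the first misunderstanding more concrete, the following sketch shows what is lost when order-level records are collapsed into aggregate periodic demand. It is only an illustration with hypothetical orders and hypothetical names (`Order`, `committed_demand`, `aggregate_demand`); it is not the uncertainty modelling method itself. Each order carries a placement week, a demand lead time and a quantity, so part of the demand for a future week is already known before that week arrives, whereas the aggregate series reveals the total only afterwards.

```python
from dataclasses import dataclass


@dataclass
class Order:
    placed_week: int      # week in which the customer placed the order
    lead_time_weeks: int  # demand lead time: weeks between placement and delivery
    quantity: int         # units demanded

    @property
    def due_week(self):
        return self.placed_week + self.lead_time_weeks


def aggregate_demand(orders, target_week):
    """Total demand due in target_week, as seen in an aggregate periodic series."""
    return sum(o.quantity for o in orders if o.due_week == target_week)


def committed_demand(orders, current_week, target_week):
    """Part of the target_week demand that is already known at current_week."""
    return sum(
        o.quantity
        for o in orders
        if o.placed_week <= current_week and o.due_week == target_week
    )


# Hypothetical order book of a B2B customer.
orders = [
    Order(placed_week=1, lead_time_weeks=5, quantity=300),
    Order(placed_week=2, lead_time_weeks=4, quantity=150),
    Order(placed_week=3, lead_time_weeks=2, quantity=250),
    Order(placed_week=4, lead_time_weeks=2, quantity=500),
    Order(placed_week=5, lead_time_weeks=1, quantity=200),
]

target_week = 6
for week in range(1, target_week):
    known = committed_demand(orders, week, target_week)
    print(f"At week {week}, known demand for week {target_week}: {known}")
print(f"Aggregate demand for week {target_week}: {aggregate_demand(orders, target_week)}")
```

A model fitted only to the aggregate weekly totals cannot use the partial information that accumulates in the order book, which is the kind of information the demand planners were drawing on.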
1.8 Practice Examples
Example 1.1 Suppose that a manufacturer processes raw materials through four serial operations (similar to Fig. 1.6) to produce a final product. It takes 1 week to complete each operation. The demand lead time is zero. Thus, the manufacturer keeps inventory of the final product. If the manufacturer asks customers to place their orders 1.5 weeks before the delivery date (i.e. making the demand lead time 1.5 weeks), what will be the new decoupling point?
Solution 1.1 The decoupling point will become the last operation because it is the earliest stage that allows the remaining tasks to be completed in 1.5 weeks. Therefore, the manufacturer can observe the actual market demand before placing a production order for the last operation. This allows the manufacturer to keep inventory of semi-finished products at the beginning of the last operation instead of keeping inventory of finished goods.
Example 1.2 Suppose the manufacturer has 12000 units of finished goods to be sold to customers before the selling season. The actual demand is only 10000 units. There is no salvage value of unsold inventory. The manufacturer incurs a processing cost of one dollar per unit for the last operation. What is the value of having a demand lead time of 1.5 weeks for the manufacturer?
Solution 1.2 If the demand lead time is zero, the manufacturer ends up with 2000 units of unsold finished goods. If the demand lead time is increased to 1.5 weeks, the manufacturer keeps 12000 units of semi-finished goods in inventory. After the manufacturer observes the actual demand of 10000 units, she transforms only 10000 units from semi-finished to finished goods. In this case, the manufacturer ends up with 2000 units of semi-finished goods and avoids processing 2000 units from semi-finished to finished goods. This helps the manufacturer reduce the cost by $2000.
Example 1.3 Suppose that the delivery lead time is negligible such that customer orders can be delivered immediately. It also takes 1.5 weeks to receive raw materials from the supplier. What would be the approximate values of the operating lead time and the decision lead time?
Solution 1.3 It takes 4 weeks to complete the production, so the production lead time is 4 weeks. The time duration required to deliver customer orders is negligible. Therefore, the delivery lead time is zero. Days of inventory is the time elapsed from the instant when the supplier delivers raw materials until the instant when customer orders are fulfilled. We approximate days of inventory as the sum of the production lead time and the delivery lead time, amounting to 4 weeks. We remark that this approximation is based on the assumption that inventory does not wait idle along the manufacturing operations. The operating lead time (i.e. sum of days of inventory and supply lead time) is then calculated to be 5.5 weeks. Then, the decision lead time is calculated to be 5.5 − 1.5 = 4 weeks.
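The arithmetic of Examples 1.1 to 1.3 can be retraced with a short Python sketch. The helper `decoupling_operation` and all variable names are hypothetical and introduced here only for illustration. The sketch also assumes that the 1.5 weeks subtracted in Solution 1.3 is the demand lead time from Example 1.1, i.e. that the decision lead time equals the operating lead time minus the demand lead time.

```python
def decoupling_operation(operation_times, demand_lead_time):
    """Return the 1-based index of the earliest operation that can start after
    the customer order is received, or None if finished goods must be stocked."""
    remaining = 0.0
    decoupling = None
    # Walk backwards from the last operation, accumulating the remaining time.
    for index in range(len(operation_times), 0, -1):
        remaining += operation_times[index - 1]
        if remaining <= demand_lead_time:
            decoupling = index
        else:
            break
    return decoupling


operation_times = [1.0, 1.0, 1.0, 1.0]  # four serial operations, 1 week each
demand_lead_time = 1.5                  # weeks (Example 1.1)
supply_lead_time = 1.5                  # weeks (Example 1.3)
delivery_lead_time = 0.0                # negligible (Example 1.3)

# Example 1.1: decoupling point with a demand lead time of 1.5 weeks.
decoupling = decoupling_operation(operation_times, demand_lead_time)
print("Decoupling point: start of operation", decoupling)

# Example 1.2: cost avoided by not processing the units that are never sold.
stock, demand, unit_processing_cost = 12_000, 10_000, 1.0
saving = max(stock - demand, 0) * unit_processing_cost
print("Processing cost avoided ($):", saving)

# Example 1.3: operating lead time and decision lead time.
production_lead_time = sum(operation_times)
days_of_inventory = production_lead_time + delivery_lead_time
operating_lead_time = days_of_inventory + supply_lead_time
# Assumption: decision lead time = operating lead time - demand lead time.
decision_lead_time = operating_lead_time - demand_lead_time
print("Operating lead time (weeks):", operating_lead_time)
print("Decision lead time (weeks):", decision_lead_time)
```

Changing `demand_lead_time` to 2.0 moves the decoupling point one operation upstream, which shows how a longer demand lead time lets the manufacturer hold inventory in a less differentiated form.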
1.9 Exercises
1. What is supply chain responsiveness? How does supply chain responsiveness differ from lean manufacturing?
2. Why do companies need to establish both supply chain integration and operational flexibility to achieve supply chain responsiveness?
3. What are the three analytical approaches used for solving business problems? What are their differences? How are they linked to each other?
4. How does the uncertainty modelling approach differ from predictive analytics?
5. What are the five most salient factors of the supply chain framework? What are the four derivatives that can be extracted from the five salient factors? Which derivatives can be used as demand and cash flow uncertainty?
6. Suppose that a manufacturer purchases raw materials from a supplier, where the supply lead time is 1 week. The manufacturer uses raw materials to produce finished goods in a single operation. It takes 1 week to complete the production of finished goods. The finished goods are delivered to customers in 1 week. The demand lead time is 2.5 weeks. The customers pay the manufacturer immediately upon delivery of their orders. However, the payment term between the manufacturer and the supplier is 4 weeks.
A. What are the operating lead time and the operating cycle for the manufacturer?
B. What are the decision lead time and the cash conversion cycle?
C. Where is the decoupling point in the supply chain?
References
Achabal, D. D., McIntyre, S. H., Smith, S. A., & Kalyanam, K. (2000). A decision support system for vendor managed inventory. Journal of Retailing, 76(4), 430–454.
Agarwal, R., & Dhar, V. (2014). Big data, data science, and analytics: The opportunity and challenge for IS research. Information Systems Research, 25(3), 443–448.
Bertsimas, D., & Kallus, N. (2020). From predictive to prescriptive analytics. Management Science, 66(3), 1025–1044.
Biçer, I. (2022). Securing the upside of digital transformation before implementation. California Management Review. https://cmr.berkeley.edu/2022/05/securing-the-upside-of-digital-transformation-before-implementation/
Biçer, I., Lücker, F., & Boyaci, T. (2022a). Beyond retail stores: Managing product proliferation along the supply chain. Production and Operations Management, 31(3), 1135–1156.
Biçer, I., Tarakçı, M., & Kuzu, A. (2022b). Using uncertainty modeling to better predict demand. Harvard Business Review. https://hbr.org/2022/01/using-uncertainty-modeling-to-better-predict-demand
Biçer, I., & Tarakcı, M. (2021). From transactional data to optimized decisions: An uncertainty modeling approach with fast Fourier transform. Working Paper, York University, Schulich School of Business.
Biçer, I., & Seifert, R. W. (2017). Optimal dynamic order scheduling under capacity constraints given demand-forecast evolution. Production and Operations Management, 26(12), 2266–2286.
Broughton, K. (2020). Cash crunch at retailers stings suppliers during pandemic. Wall Street Journal. https://www.wsj.com/articles/cash-crunch-at-retailers-stings-suppliers-during-pandemic-11594818000
Coates, R. (2012). The silk road: The first global supply chain. Supply Chain Management Review. https://www.scmr.com/article/the_silk_road_the_first_global_supply_chain
Dantzig, G. B. (1955). Linear programming under uncertainty. Management Science, 1(3–4), 197–206.
Dantzig, G. B. (2002). Linear programming. Operations Research, 50(1), 42–47.
Dreyfus, S. (2002). Richard Bellman on the birth of dynamic programming. Operations Research, 50(1), 48–51.
Elmachtoub, A. N., & Grigas, P. (2021). Smart “predict, then optimize”. Management Science, 68(1), 9–26.
Fisher, M., & Raman, A. (1996). Reducing the cost of demand uncertainty through accurate response to early sales. Operations Research, 44(1), 87–99.
Flynn, B. B., Huo, B., & Zhao, X. (2010). The impact of supply chain integration on performance: A contingency and configuration approach. Journal of Operations Management, 28(1), 58–71.
Gallaugher, J., & Ransbotham, S. (2010). Social media and customer dialog management at Starbucks. MIS Quarterly Executive, 9(4), Article 3.
Grosse-Ruyken, P. T., Wagner, S. M., & Jönke, R. (2011). What is the right cash conversion cycle for your supply chain? International Journal of Services and Operations Management, 10(1), 13–29.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. Springer series in statistics (2nd ed.).
Heskett, J. L., & Signorelli, S. (1989). Benetton (A). Harvard Business School Case: 9-685-014.
Kopcso, D., & Pachamanova, D. (2018). Case article: Business value in integrating predictive and prescriptive analytics models. INFORMS Transactions on Education, 19(1), 36–42.
Lee, H. L., & Tang, C. S. (1998). Variability reduction through operations reversal. Management Science, 44(2), 162–172.
Monroe, R. (2021). Ultra-fast fashion is eating the world. The Atlantic. https://www.theatlantic.com/magazine/archive/2021/03/ultra-fast-fashion-is-eating-the-world/617794/
Nussbaumer-Knaflic, C. (2015). Storytelling with data: A data visualization guide for business professionals. Wiley.
Puterman, M. L. (2014). Markov decision processes: Discrete stochastic dynamic programming. Wiley.
Schneider, P., Biçer, I., & Weber, T. A. (2023). Securing payment from a financially distressed buyer. Working Paper, York University, Schulich School of Business.
Seifert, R. W., Zheng, J., Chen, B., & Yuan, Z. (2017). Rethinking the smartphone supply chain: Oppo/Vivo’s big bet. IMD Case#: IMD-7-1920.
Simchi-Levi, D., Kaminsky, P., Simchi-Levi, E., & Shankar, R. (2008). Designing and managing the supply chain: Concepts, strategies and case studies. McGraw-Hill Education.
Swaminathan, J. M., & Tayur, S. R. (2003). Models for supply chains in e-business. Management Science, 49(10), 1387–1406.
Womack, J. P., Jones, D. T., & Roos, D. (2007). The machine that changed the world. Free Press.
2 Analytical Foundations: Predictive and Prescriptive Analytics
Keywords
Linear regression · Logistic regression · Endogeneity · Time series analysis · Numerical optimization
Supply chain management is regarded by many professionals as a combination of the art and science of making decisions under uncertainty (Escudero et al., 1999; Sodhi et al., 2008). Supply chain executives make important inventory and logistics decisions to better match supply with demand in an environment that is surrounded by different uncertainties. For example, customer demand in a future period is almost always uncertain in many industries (especially the retail and manufacturing industries). In the agriculture industry, the amount of fresh produce depends on some external factors such as weather conditions and the perishable nature of crops. Thus, supply is highly uncertain in the agriculture industry. In addition to these uncertainties driven by exogenous factors, supply chains are often fragmented such that upstream and downstream operations are controlled by different firms. This in turn causes information distortion and additional uncertainty due to agency conflicts.
To better understand the contribution of agency conflicts to uncertainty in supply chains, let’s consider a retailer that has five products. Two of them are in excellent condition, whereas three products are in average condition. The retailer wants to sell the customers as many products as possible. The customers are only interested in buying the products if the probability of a purchased item being in excellent condition is at least 50%. We assume that the customers have partial information about the quality of the products such that they only know that there are two products in excellent condition. However, they don’t know which products they are. The retailer has full information about the products. If the retailer shares the information with customers truthfully, the customers will buy the two products. If the retailer does not disclose any information, the customers do not buy anything because the probability of a purchased product being in excellent condition is less than 50% (i.e. 2/5 = 40%). The optimal strategy for the retailer is to share noisy information, where the retailer picks three products (two in excellent and one in average condition) and offers them to the customers. Then, the customers will buy the three products given that the probability of a purchased product being in excellent condition is more than 50% (i.e. 2/3 ≈ 67%). Hence, sharing noisy information with customers would help the retailer increase the sales. This type of behaviour hinders transparent information transfer among supply chain entities and contributes to the uncertainty of supply chain elements.
Given these factors and uncertainties, supply chain executives coordinate different activities from procurement to warehousing and outbound logistics with an objective of making products available in the market for consumers in the most economic way. Therefore, analytical methods that help decision-makers predict future values of uncertain elements and prescribe actionable solutions play a crucial role in improving supply chains. For that reason, we review foundational analytical theories and models in this chapter. We complement the topics reviewed here with practice examples (some of them are programming-based in Python) at the end of the chapter.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_2
2.1 Predictive Analytics
Predictive models are commonly used in supply chain management to predict the future values of uncertain variables, especially demand for products. We can group predictive models into three categories: (1) regression models (Pardoe, 2021), (2) time series models (Shumway & Stoffer, 2011) and (3) classification models (Train, 2009). If an uncertain variable is continuous and influenced by some explanatory variables, regression models can be used to relate the uncertain variable to the explanatory variables. If an uncertain variable is continuous and influenced by its historical values, time series models can be used to relate the uncertain variable to its historical values. Finally, if an uncertain variable is binary or categorical and influenced by some explanatory variables, classification models can be used to relate the uncertain variable to the explanatory variables. The ultimate objective of predictive analytics is to identify the true model that correctly relates a dependent variable to some explanatory variables or its historical values by using a dataset. However, finding the true model is very challenging and sometimes impossible. For that reason, researchers often develop an estimate model and then try to improve it gradually. During this process, they are exposed to the bias-variance trade-off.
2.1.1 Bias-Variance Trade-Off
Suppose that a dependent variable $y$, an uncertain variable of interest, is influenced by an explanatory variable $x$. The true model that correctly relates $y$ to $x$ is given by:

$$y_i = f(x_i) + \epsilon_i,$$

where we use the subscript $i$ to denote the $i$th observation in the training dataset that is used to develop a predictive model and $f(\cdot)$ is the true model function. The term $\epsilon_i$ is the error term for the $i$th observation, which is the difference between the value of the dependent variable and the model output. The expected value of $\epsilon$ is zero, and its standard deviation is $\sigma$. Suppose that we don't know the true model and we use an estimate model $g(\cdot)$ instead. If the estimate model is used, the mean squared error is given by (Hastie et al., 2001):

$$\begin{aligned}
E\big[(y - g(x))^2\big] &= E\big[(f(x) + \epsilon - g(x))^2\big] \\
&= E\big[(f(x) - g(x))^2\big] + 2E(\epsilon)E\big(f(x) - g(x)\big) + E(\epsilon^2) \\
&= E\big[(f(x) - g(x))^2\big] + \sigma^2.
\end{aligned}$$

We remark that $E(\epsilon) = 0$ and $E(\epsilon^2) = \sigma^2$ because $\epsilon$ has a mean of zero and a standard deviation of $\sigma$. Following up from the last expression (Hastie et al., 2001),

$$\begin{aligned}
E\big[(y - g(x))^2\big] &= E\big[(f(x) - g(x))^2\big] + \sigma^2 \\
&= E\big[(f(x) - g(x) + E(g(x)) - E(g(x)))^2\big] + \sigma^2 \\
&= E\big[(f(x) - E(g(x)))^2\big] - 2E\big(f(x) - E(g(x))\big)E\big(g(x) - E(g(x))\big) + E\big[(g(x) - E(g(x)))^2\big] + \sigma^2 \\
&= E\big[(f(x) - E(g(x)))^2\big] + E\big[(g(x) - E(g(x)))^2\big] + \sigma^2,
\end{aligned}$$

where the cross term vanishes because $E\big(g(x) - E(g(x))\big) = 0$.

The first term on the right-hand side of the last expression is referred to as squared bias, which is the squared difference between the true model prediction and the average prediction of the estimate model. The second term is the variance of the estimate model. The variance exists for the estimate model because the coefficients of $g(\cdot)$ would change when the training dataset is updated. The trade-off between the bias and variance occurs when the true model is not known and the modeller increases the complexity of the estimate model to reduce the bias. This, however, would increase the variance because predictions from a complex model change significantly when the training data is updated. When a simple model is used, the predictions from the model would be different from the true model predictions (high bias). But the variance of the simple model would be much smaller than that of a complex model (low variance).
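The decomposition above can be illustrated with a small simulation. The following is a minimal sketch rather than anything taken from the book: the true function `f` (a sine curve), the noise level `sigma`, the sample sizes, and the two polynomial degrees are all assumptions chosen for illustration. It refits a simple and a more flexible polynomial on many simulated training sets and estimates the squared bias and the variance of each.

```python
import numpy as np

rng = np.random.default_rng(42)

def f(x):
    """Hypothetical true model, assumed only for this illustration."""
    return np.sin(2 * np.pi * x)

sigma = 0.3            # assumed standard deviation of the error term
n_obs = 30             # observations per training set
n_datasets = 500       # number of simulated training sets
x_train = np.linspace(0.0, 1.0, n_obs)
x_eval = np.linspace(0.05, 0.95, 50)   # points where predictions are compared

def bias2_and_variance(degree):
    """Estimate squared bias and variance of a polynomial estimate model g."""
    preds = np.empty((n_datasets, x_eval.size))
    for d in range(n_datasets):
        # Each training set shares x_train but has fresh noise.
        y_train = f(x_train) + rng.normal(0.0, sigma, size=n_obs)
        coefs = np.polyfit(x_train, y_train, degree)
        preds[d] = np.polyval(coefs, x_eval)
    mean_pred = preds.mean(axis=0)                    # approximates E(g(x))
    bias2 = np.mean((f(x_eval) - mean_pred) ** 2)     # squared bias term
    variance = np.mean(preds.var(axis=0))             # variance term
    return bias2, variance

bias2_simple, var_simple = bias2_and_variance(degree=1)
bias2_complex, var_complex = bias2_and_variance(degree=5)

print(f"degree 1: bias^2 = {bias2_simple:.4f}, variance = {var_simple:.4f}")
print(f"degree 5: bias^2 = {bias2_complex:.4f}, variance = {var_complex:.4f}")
```

Under these assumptions, the straight line should show the larger squared bias and the degree-5 polynomial the larger variance, which is the trade-off described above.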
2.1.2 Linear Models
Linear models are commonly used in practice to relate a variable of interest to some explanatory variables (Hastie et al., 2001; Pardoe, 2021). They are often referred to as linear regression models. The variable of interest is a dependent variable because it depends on some explanatory variables through a linear model. Explanatory variables are independent variables because it is assumed that they are independent from the residuals of the model. We consider a hypothetical example in which we aim to predict the daily gross income of a person after graduating from university using her/his cumulative GPA. Suppose that we collect data from 100 university graduates and present these observations in red in Fig. 2.1. The y-axis represents the dependent variable (i.e. daily gross income in US dollars), whereas the x-axis represents the independent variable (i.e. the cumulative GPA on a scale between 0 and 80). The blue line is the best-fitting linear line to the given set of observations. The residual value for an observation is the vertical distance between the point of the observation and the linear line.

Fig. 2.1 Linear model and true observations

The linear model given in Fig. 2.1 is a simple linear model because it has a single independent variable to predict the dependent variable. Linear models can have multiple independent variables. We use $y$ to denote the dependent variable and $x_i$ to denote the $i$th independent variable. We also use the subscript $j$ to denote the values of the dependent and independent variables for the $j$th observation. Then, the linear model that uses $m$ independent variables to predict a dependent variable given $n$ observations is written as follows:

$$\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_{11} + \beta_2 x_{21} + \cdots + \beta_m x_{m1} + \epsilon_1, \\
y_2 &= \beta_0 + \beta_1 x_{12} + \beta_2 x_{22} + \cdots + \beta_m x_{m2} + \epsilon_2, \\
&\;\;\vdots \\
y_j &= \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \cdots + \beta_m x_{mj} + \epsilon_j, \\
&\;\;\vdots \\
y_n &= \beta_0 + \beta_1 x_{1n} + \beta_2 x_{2n} + \cdots + \beta_m x_{mn} + \epsilon_n,
\end{aligned}$$

where the $\beta$ terms are the coefficients of the independent variables and $\epsilon$ is the residual term. In this model, we only know the $y$ and $x$ values initially. Our objective is to find the best-fitting coefficients using this information. To this end, we need some assumptions to facilitate the process of estimating the best-fitting coefficients. The linear regression assumptions are given by (Hastie et al., 2001; Pardoe, 2021):

Assumption 1 There is a linear relationship between $y$ and $x$ values.

Assumption 2 The $\epsilon$ values follow a normal distribution with zero mean and a variance of $\sigma^2$.

Assumption 3 The independent variables ($x$s) are independent from the residuals (i.e. $\epsilon$).

Assumption 4 The residuals are homoscedastic such that they have a fixed variance, which does not change depending on the $x$ values.

We can use the second assumption to write the maximum likelihood estimation for the residuals. To simplify the notation, we define a function $f(\cdot)$ that involves the linear part of the model:

$$f(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m.$$
Then, the $\epsilon$ term is modelled as follows:

$$\epsilon = y - f(x) \sim N(0, \sigma^2),$$

where $N(\cdot, \cdot)$ denotes the normal distribution with the given mean and variance. The normal distribution has a probability density function of $\frac{1}{\sigma\sqrt{2\pi}} e^{-\epsilon^2/(2\sigma^2)}$. Then, the likelihood of getting the values of $n$ observations based on the normality assumption of the residuals is formulated by the likelihood function (Silvey, 1975):

$$L(y, x \mid \sigma) = \prod_{j=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(y_j - f(x_j))^2}{2\sigma^2}} = \sigma^{-n} (2\pi)^{-n/2} e^{-\sum_{j=1}^{n} \frac{(y_j - f(x_j))^2}{2\sigma^2}}.$$
One common practice in maximum likelihood estimation is to transform the likelihood model to a log-likelihood function for exponential family functions (Silvey, 1975). The log-likelihood function is obtained by taking the natural logarithm of the likelihood function:

$$l(y, x \mid \sigma) = \ln(L(y, x \mid \sigma)) = -n \ln(\sigma) - \frac{n}{2}\ln(2\pi) - \sum_{j=1}^{n} \frac{(y_j - f(x_j))^2}{2\sigma^2}.$$

While fitting the coefficients of the independent variables, we want to maximize this log-likelihood function. The first two terms on the right-hand side of this expression do not depend on the observed values of the $y$ and $x$ variables. Only the last term depends on the observed values because it involves the sum of squared residuals. Given that the last term has a minus sign, maximizing the log-likelihood is equivalent to minimizing the sum of squared residuals. Therefore, the objective function that we must use while estimating the coefficients is:

$$\text{Minimize} \sum_{j=1}^{n} (y_j - f(x_j))^2,$$

which is the least squares method of linear regression (Pardoe, 2021).
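The equivalence between maximizing the log-likelihood and minimizing the sum of squared residuals can be checked numerically. The sketch below is illustrative only: the simulated data, the coefficient values, and the fixed `sigma` are assumptions. It fits the coefficients by least squares with `numpy.linalg.lstsq` and then confirms that moving away from that solution lowers the log-likelihood defined above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
# Hypothetical data-generating model, assumed for illustration.
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

X = np.column_stack([np.ones(n), x])   # column of ones gives the intercept

def log_likelihood(beta, sigma):
    """Log-likelihood l(y, x | sigma) from the text for given coefficients."""
    resid = y - X @ beta
    return (-n * np.log(sigma)
            - 0.5 * n * np.log(2 * np.pi)
            - np.sum(resid ** 2) / (2 * sigma ** 2))

# Least squares coefficients (minimize the sum of squared residuals).
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

sigma = 1.0
ll_at_ls = log_likelihood(beta_ls, sigma)

# Any perturbation of the least squares solution increases the sum of
# squared residuals and therefore lowers the log-likelihood.
perturbations = [np.array([0.1, 0.0]),
                 np.array([0.0, -0.05]),
                 np.array([-0.2, 0.03])]
ll_perturbed = [log_likelihood(beta_ls + d, sigma) for d in perturbations]

print("log-likelihood at least squares solution:", ll_at_ls)
print("log-likelihood at perturbed coefficients:", ll_perturbed)
```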
2.1.3 Matrix Formation
Matrices are useful for applying analytical models to data (Strang, 2019), since datasets can be easily represented in a matrix form in analytical applications (e.g. pandas.DataFrame in Python). We now write the set of equations of $n$ observations in the matrix form such that:

$$Y = XB + E,$$

where:

$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & x_{11} & x_{21} & \cdots & x_{m1} \\ 1 & x_{12} & x_{22} & \cdots & x_{m2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & x_{2n} & \cdots & x_{mn} \end{bmatrix}, \quad
B = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_m \end{bmatrix}, \quad
E = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}.$$
Then, the least squares objective function can be rewritten in the matrix form as follows:

$$\text{Minimize } E^T E = (Y - XB)^T (Y - XB) = Y^T Y - Y^T XB - B^T X^T Y + B^T X^T XB.$$

We remark that the superscript $T$ is used to denote the transpose of a matrix. The best-fitting coefficients can be found by taking the first derivative of this expression with respect to $B$ and setting it equal to zero:

$$\begin{aligned}
\frac{\partial E^T E}{\partial B} &= -X^T Y - X^T Y + X^T XB + X^T XB = 0, \\
2X^T XB &= 2X^T Y, \\
(X^T X)^{-1} X^T XB &= (X^T X)^{-1} X^T Y.
\end{aligned}$$

Following up from the last expression, the coefficients that minimize the sum of squared errors are found by the least squares formula:

$$B = (X^T X)^{-1} X^T Y.$$
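The least squares formula translates almost directly into NumPy. The sketch below uses simulated data (the sample size, coefficients, and noise level are assumptions for illustration). It evaluates $(X^T X)^{-1} X^T Y$ literally, compares it with the numerically preferable `numpy.linalg.solve`, and checks that the first derivative of $E^T E$ is zero at the solution.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 100, 2
features = rng.normal(size=(n, m))
X = np.column_stack([np.ones(n), features])   # first column of ones for beta_0
B_true = np.array([1.0, 2.0, -0.5])           # hypothetical coefficients
Y = X @ B_true + rng.normal(0.0, 0.5, size=n)

# Literal translation of B = (X^T X)^{-1} X^T Y.
B_ols = np.linalg.inv(X.T @ X) @ X.T @ Y

# Solving the normal equations X^T X B = X^T Y avoids the explicit inverse.
B_solve = np.linalg.solve(X.T @ X, X.T @ Y)

# First derivative of E^T E with respect to B, evaluated at the solution.
gradient = -2 * X.T @ Y + 2 * X.T @ X @ B_ols

print("B_ols   :", B_ols)
print("gradient:", gradient)   # numerically zero
```

In practice, `numpy.linalg.solve` or `numpy.linalg.lstsq` is usually preferred over forming the inverse explicitly, because it is less sensitive to rounding error when $X^T X$ is poorly conditioned.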
We note that taking derivatives in the matrix form is slightly different from the standard way of taking derivatives. For example, the matrix multiplication of the first row of the $X$ matrix and $B$ is written as:

$$X_{[1,:]} B = \begin{bmatrix} 1 & x_{11} & x_{21} & \cdots & x_{m1} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_m \end{bmatrix} = \beta_0 + \beta_1 x_{11} + \beta_2 x_{21} + \cdots + \beta_m x_{m1}.$$

When we take the first derivative of the last expression with respect to $B$, which is a column vector, the result will be the transpose of $X_{[1,:]}$:

$$\frac{\partial (X_{[1,:]} B)}{\partial B} = \begin{bmatrix} 1 \\ x_{11} \\ x_{21} \\ \vdots \\ x_{m1} \end{bmatrix} = X_{[1,:]}^T.$$

Therefore, the first derivative of $X_{[1,:]} B$ with respect to $B$ becomes equal to the transpose of $X_{[1,:]}$ (i.e. $X_{[1,:]}^T$). In the same manner, the derivative of $Y^T XB$ with respect to $B$ becomes equal to the transpose of $(Y^T X)$, which is $(Y^T X)^T = X^T Y$.
Likewise, the matrix multiplication $B^T X_{[1,:]}^T$ is equal to $X_{[1,:]} B$. Then,

$$\frac{\partial B^T X_{[1,:]}^T}{\partial B} = \frac{\partial X_{[1,:]} B}{\partial B} = X_{[1,:]}^T.$$

Thus, the first derivative of $B^T X_{[1,:]}^T$ with respect to $B$ becomes equal to $X_{[1,:]}^T$. In the same manner, the derivative of $B^T X^T Y$ with respect to $B$ becomes equal to $X^T Y$. Following these principles of taking derivatives in the matrix form, we can calculate the first derivative of $E^T E$ with respect to $B$ as given above, which leads to the derivation of $B = (X^T X)^{-1} X^T Y$. The third assumption of linear models states that the independent variables are independent from the error terms. The independence assumption guarantees that $X$ is orthogonal to $E$, which can also be formalized as follows:

$$X \perp E \rightarrow X^T E = 0.$$
From the last expression, we again reach the same result of $B$ such that $B = (X^T X)^{-1} X^T Y$. The formal derivation of $B$ from the independence assumption is given in the solutions of "Practice Examples" at the end of this chapter. The last assumption of linear models states that error terms have a fixed variance such that $Var(E) = \sigma^2$. Thus, the variance of $Y$ is equal to $\sigma^2$ because:

$$Var(Y) = Var(XB + E) = Var(E) = \sigma^2.$$

Given that $XB$ is deterministic, it has zero variance. For that reason, the variance of $Y$ is equal to the variance of $E$. Then, the best-fitting coefficients have the variance of:

$$Var(B) = (X^T X)^{-1} X^T \sigma^2 X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1}.$$
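The variance expression can be checked with a Monte Carlo experiment. The sketch below is an illustration under assumed values (the design matrix, `sigma`, the coefficients, and the number of replications are all hypothetical). It keeps $X$ fixed, redraws the error terms many times, re-estimates the coefficients each time, and compares the empirical covariance of the estimates with $\sigma^2 (X^T X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
sigma = 1.5                                   # assumed error standard deviation
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=n)])  # fixed design
B_true = np.array([3.0, 0.8])                 # hypothetical coefficients

XtX_inv = np.linalg.inv(X.T @ X)
theoretical_cov = sigma ** 2 * XtX_inv        # sigma^2 (X^T X)^{-1}

# Each row of Y_all is one simulated sample of n observations.
n_rep = 20000
noise = rng.normal(0.0, sigma, size=(n_rep, n))
Y_all = X @ B_true + noise

# Row-wise version of B = (X^T X)^{-1} X^T Y; valid because (X^T X)^{-1}
# is symmetric, so B^T = Y^T X (X^T X)^{-1}.
estimates = Y_all @ X @ XtX_inv

empirical_cov = np.cov(estimates, rowvar=False)

print("theoretical covariance:\n", theoretical_cov)
print("empirical covariance:\n", empirical_cov)
```

The two matrices should agree up to simulation noise, and the diagonal entries are the variances of the intercept and slope estimates.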
Using the matrix form, the ordinary least squares (OLS) estimates and their variance are given by:

$$B_{OLS} = (X^T X)^{-1} X^T Y,$$

$$Var(B_{OLS}) = \sigma^2 (X^T X)^{-1}.$$

There are two indicators for assessing the accuracy of an OLS model that are popular in practice. The first approach is based on the $R^2$ coefficient of determination (Barrett, 1974), which indicates the percentage of total variance of $y$ that can be explained by the OLS model. Thus, its value lies between zero and one. The $R^2$ value can be calculated by:

$$R^2 = 1 - \frac{E^T E}{Y^T Y} = 1 - \frac{(Y - XB)^T (Y - XB)}{Y^T Y}.$$
When the OLS model fits data very well, the sum of squared errors would be very close to zero, which in turn makes $R^2$ close to 100%. If the OLS model fits data poorly, the $R^2$ value would be low (e.g. less than 20%) or even close to zero. If the dataset that has been used to estimate the coefficients of the model is also employed to assess the performance of the model, it is referred to as in-sample model assessment. Therefore, the $R^2$ value is a relevant indicator of the in-sample performance of the model. The second approach is based on the mean absolute percentage error (MAPE) (De Myttenaere et al., 2016). The MAPE approach is a relevant out-of-sample model assessment technique. In some cases, a dataset is split into two as training and test sets. The training set is used to estimate the coefficients of the model, whereas the test set is employed to assess the accuracy of the model. This approach is referred to as out-of-sample model assessment. Suppose that the test set involves $n_1$ observations. Then, the MAPE is calculated as follows:

$$\text{MAPE} = \frac{1}{n_1} \sum_{i=1}^{n_1} \left| \frac{y_i - \hat{y}_i}{y_i} \right|,$$

where $y_i$ is the actual value of the $i$th observation in the test set and $\hat{y}_i$ is the value predicted by the model. The MAPE value cannot be less than zero because the expression inside the summation term is restricted to being non-negative. However, it can exceed 100% when predicted values substantially deviate from actual values. We remark that there are other model assessment metrics such as mean squared error, mean absolute error, max error, etc. To the best of my knowledge, however, they are less popular indicators than $R^2$ and MAPE.
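Both indicators are straightforward to compute once the data are split. The sketch below is a hedged illustration on simulated data (the data-generating values and the 150/50 split are assumptions). It evaluates $R^2$ in-sample exactly as written above, with $Y^T Y$ in the denominator, and also reports the mean-centred variant that many software packages use by default, which is an addition for comparison and not part of the text. MAPE is computed out-of-sample on the test set.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(1, 10, size=n)
# Hypothetical model; y stays well above zero so that MAPE is well defined.
y = 20.0 + 3.0 * x + rng.normal(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])

# Split into training and test sets (first 150 rows versus last 50 rows).
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# OLS coefficients estimated on the training set only.
B = np.linalg.solve(X_train.T @ X_train, X_train.T @ y_train)

# In-sample R^2 as written in the text: 1 - E^T E / Y^T Y.
resid = y_train - X_train @ B
r2_text = 1 - (resid @ resid) / (y_train @ y_train)

# Mean-centred variant (assumption: shown for comparison only).
centred = y_train - y_train.mean()
r2_centred = 1 - (resid @ resid) / (centred @ centred)

# Out-of-sample MAPE on the test set.
y_pred = X_test @ B
mape = np.mean(np.abs((y_test - y_pred) / y_test))

print(f"R^2 (text formula): {r2_text:.4f}")
print(f"R^2 (centred)     : {r2_centred:.4f}")
print(f"MAPE (test set)   : {mape:.2%}")
```

MAPE is undefined when any actual value $y_i$ equals zero, so a different metric is needed for data that contain zeros.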
2.1.4 Generalized Least Squares (GLS)
If the last assumption of linear models is violated, the error terms do not have a fixed variance. Then, the OLS results become unreliable: the coefficient estimates $B_{OLS}$ are no longer efficient, and the usual variance expression $Var(B_{OLS})$ is no longer valid. The generalized least squares (GLS) method provides efficient estimates and a valid variance expression when the fourth assumption of linear models is violated (Greene, 2017). In this case, the covariance matrix of error terms, denoted by $\Sigma$, can be written as a function of the autocorrelation matrix of errors (denoted by $\Omega$):

$$Cov(E) = \Sigma = \sigma^2 \Omega = \sigma^2 \begin{bmatrix}
\rho_0 & \rho_1 & \rho_2 & \cdots & \rho_{n-1} \\
\rho_1 & \rho_0 & \rho_1 & \cdots & \rho_{n-2} \\
\rho_2 & \rho_1 & \rho_0 & \cdots & \rho_{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho_{n-1} & \rho_{n-2} & \rho_{n-3} & \cdots & \rho_0
\end{bmatrix},$$
where $\rho_i$ is the lag-$i$ autocorrelation of the error term, which is calculated as follows:

$$\rho_i = \frac{\sum_{j=i+1}^{n} \epsilon_j \epsilon_{j-i}}{\sum_{j=1}^{n} \epsilon_j^2}.$$
The autocorrelation matrix $\Omega$ is an $n \times n$ symmetric matrix with $n$ rows and $n$ columns. From the properties of symmetric matrices, $\Omega$ can be written as a multiplication of $\omega^T$ and $\omega$, where $\omega$ is an $n \times n$ unknown symmetric matrix:

$$\Omega = \omega^T \omega.$$

There is no need to know the value of each element of $\omega$ to estimate the coefficients. Without knowing these elements, the variance term of $\omega^{-1} E$ is given by:

$$Var(\omega^{-1} E) = \sigma^2 I,$$

where $I$ is the identity matrix. Then, we can rewrite the system of equations of the linear model in the matrix form as follows:

$$\omega^{-1} Y = \omega^{-1} XB + \omega^{-1} E.$$

Replacing $X$ and $Y$ with $\omega^{-1} X$ and $\omega^{-1} Y$, respectively, in the derivation of $B_{OLS}$, the estimates of the coefficients for the GLS method are given by Greene (2017):

$$B_{GLS} = (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} Y.$$

Then, the variance of the coefficients is calculated by:

$$\begin{aligned}
Var(B_{GLS}) &= (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} \sigma^2 \Omega \, \Omega^{-1} X (X^T \Omega^{-1} X)^{-1} \\
&= \sigma^2 (X^T \Omega^{-1} X)^{-1}.
\end{aligned}$$
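The GLS formula and the transformation argument behind it can be compared numerically. The sketch below makes several assumptions that are not in the text: the errors follow an AR(1)-type pattern, so that the autocorrelation matrix has entries $\rho^{|i-j|}$ with a known `rho`, and the factor used for the transformation is the lower-triangular Cholesky factor `L` (with $\Omega = L L^T$) rather than the symmetric $\omega$ described above. Either factor removes the correlation from the transformed errors, so OLS on the transformed data should reproduce $B_{GLS}$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 120
rho = 0.7                                    # assumed lag-1 autocorrelation

# Assumed autocorrelation matrix: entry (i, j) equals rho ** |i - j|.
idx = np.arange(n)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])

X = np.column_stack([np.ones(n), rng.normal(size=n)])
B_true = np.array([1.0, 2.0])                # hypothetical coefficients

# Draw errors whose covariance matrix is Omega (sigma = 1 for simplicity).
L = np.linalg.cholesky(Omega)                # Omega = L @ L.T
E = L @ rng.normal(0.0, 1.0, size=n)
Y = X @ B_true + E

# GLS formula: B = (X^T Omega^{-1} X)^{-1} X^T Omega^{-1} Y.
Omega_inv = np.linalg.inv(Omega)
B_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)

# Equivalent route: transform the data with L^{-1}, then apply OLS.
X_w = np.linalg.solve(L, X)
Y_w = np.linalg.solve(L, Y)
B_transformed = np.linalg.lstsq(X_w, Y_w, rcond=None)[0]

# Plain OLS for comparison (ignores the autocorrelation).
B_ols = np.linalg.solve(X.T @ X, X.T @ Y)

print("GLS            :", B_gls)
print("transformed OLS:", B_transformed)
print("plain OLS      :", B_ols)
```

In applied work $\Omega$ is not known and has to be estimated, for example from OLS residuals, which this sketch does not attempt.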
2.1.5 Dealing with Endogeneity Problems
The second and third assumptions of linear models complement each other because both assumptions yield the same result: the least squares method of linear regression. If the third assumption is violated such that the explanatory variables ($X$) are not independent from the error terms ($E$), the OLS estimation of coefficients becomes inconsistent. The violation of the third assumption is referred to as the endogeneity problem (Antonakis et al., 2010). When there is an endogeneity problem, the $X$ matrix is no longer orthogonal to the $E$ column vector. Thus,

$$X^T E = \delta,$$

where $\delta$ is a non-zero vector. Then,

$$\begin{aligned}
X^T E = X^T (Y - XB) &= \delta, \\
(X^T X) B &= X^T Y - \delta, \\
B_{CONS} &= (X^T X)^{-1} X^T Y - (X^T X)^{-1} \delta,
\end{aligned}$$

where $B_{CONS}$ is the consistent estimate of the coefficients in the presence of endogeneity problems. The first term on the right-hand side of the last expression is indeed the OLS estimation because $B_{OLS} = (X^T X)^{-1} X^T Y$. Then, $B_{CONS}$ is rewritten in the following form:

$$\begin{aligned}
B_{CONS} &= B_{OLS} - (X^T X)^{-1} \delta, \\
B_{OLS} &= B_{CONS} + (X^T X)^{-1} \delta.
\end{aligned}$$

Therefore, the OLS estimation of coefficients is biased by $(X^T X)^{-1} \delta$. If the endogeneity problems in a linear model are ignored and the OLS model is wrongly used to model the relationship between $Y$ and $X$, the estimates of the coefficients would deviate from the consistent estimates by $(X^T X)^{-1} \delta$. When $\delta$ approaches zero, the negative impact of endogeneity would be very limited, and the bias term decreases accordingly. When $\delta$ is sufficiently large in magnitude, the bias term would be significant, and the prediction accuracy of the OLS model deteriorates. There are two main causes of the endogeneity problem: (1) omitted variable bias and (2) measurement error (Antonakis et al., 2010). Suppose, for example, that $Y$ depends on the $X$ variables and also an omitted variable $Z$ such that:

$$Z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}.$$

The column vector $Z$ involves $n$ observations of an omitted variable, where the $i$th observation of $Z$ is denoted by $z_i$. The true linear model involves both $X$ and $Z$ such that (Antonakis et al., 2010):

$$Y = XB + Z\theta + E,$$
where $\theta$ is the coefficient of $Z$. In this true model, the error terms are orthogonal to the explanatory variables (i.e. $X$ and $Z$). Then,

$$\begin{aligned}
X^T E = X^T (Y - XB - Z\theta) &= 0, \\
(X^T X) B &= X^T Y - X^T Z\theta, \\
B_{CONS} &= (X^T X)^{-1} X^T Y - (X^T X)^{-1} X^T Z\theta.
\end{aligned}$$

The last expression yields consistent estimates of coefficients, which depend on both observed and omitted variables. The restricted model does not include the omitted variable. There may be several reasons why researchers would use a restricted model instead of the true model that includes all variables influencing the dependent variable. For example, data on the omitted variable might not be available. Another reason is that researchers might be unaware of the impact of the omitted variable on the dependent variable, so they might fail to include it in the linear model. In those cases, a restricted model would be used, which relates $Y$ to $X$ without incorporating $Z$ in the model. The OLS estimation of the coefficients in the restricted model is given by $B_{OLS} = (X^T X)^{-1} X^T Y$. Then,

$$\begin{aligned}
B_{CONS} &= B_{OLS} - (X^T X)^{-1} X^T Z\theta, \\
B_{OLS} &= B_{CONS} + (X^T X)^{-1} X^T Z\theta.
\end{aligned}$$

Therefore, the OLS estimation of coefficients is biased by $(X^T X)^{-1} X^T Z\theta$ when there is an omitted variable.

The endogeneity problem can also result from measurement error, which occurs when the values in $X$ are recorded incorrectly (Antonakis et al., 2010). Suppose, for example, that the true model that relates $Y$ to $X$ is given by $Y = XB + E$. However, the $X$ values are not measured accurately such that $X = \tilde{X} - \psi$, where $\tilde{X}$ denotes the measured values. In other words, the measured values deviate from the true values of $X$ by $\psi$. Using the orthogonality condition in the true model, the following results are obtained:

$$\begin{aligned}
X^T E &= 0, \\
(\tilde{X} - \psi)^T E &= 0, \\
\tilde{X}^T (Y - \tilde{X} B) &= \psi^T E, \\
\tilde{X}^T Y - \tilde{X}^T \tilde{X} B &= \psi^T E, \\
B_{CONS} &= (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T Y - (\tilde{X}^T \tilde{X})^{-1} \psi^T E.
\end{aligned}$$

If the OLS estimation is used by taking into consideration the measured values (i.e. $\tilde{X}$), the estimates of coefficients would be $B_{OLS} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T Y$. Then, the expression of consistent estimates is rewritten as follows:

$$B_{CONS} = B_{OLS} - (\tilde{X}^T \tilde{X})^{-1} \psi^T E.$$
Therefore, the OLS estimation of coefficients is biased by $(\tilde{X}^T \tilde{X})^{-1} \psi^T E$ when the measured values of explanatory variables are different from their true values. When we analyse the bias term for the omitted variable case (i.e. $(X^T X)^{-1} X^T Z\theta$), we observe that the bias can be eliminated completely if $X^T Z = 0$. Likewise, when we analyse the bias term for the measurement error case (i.e. $(\tilde{X}^T \tilde{X})^{-1} \psi^T E$), we also observe that the bias can be eliminated completely if $\psi^T E = 0$. These two observations indicate that if we can marginalize $X$ such that the part of $X$ that depends on the omitted variable and the measurement error is extracted from the model, the remaining part can be used in the OLS model to obtain unbiased estimates of coefficients. This can be done by using instrumental variables (Antonakis et al., 2010; Greene, 2017). Instrumental variables are additional variables that must have a high correlation with $X$ but no correlation with the error terms $E$, so that they influence $Y$ only through $X$. Therefore, they are not included in the linear model $Y = XB + E$ directly. However, we can use them to predict values of $X$. The predicted values of $X$ would later be used, instead of the observed values, in $Y = XB + E$ to estimate unbiased coefficients of $B$. We use $V$ to denote the matrix of observations of instrumental variables. The $V$ matrix has a similar structure to the $X$ matrix. Then, we construct the following linear model, which is referred to as the instrumental variable (IV) model, to relate $X$ to $V$:

$$X = V\Gamma + U,$$

where $\Gamma$ is the coefficients of the IV model and $U$ is the error terms. The $\Gamma$ term is a column vector that includes the coefficients of instrumental variables. It is estimated by employing the OLS method as follows:

$$\hat{\Gamma} = (V^T V)^{-1} V^T X.$$

The predicted values of $X$ are calculated by the IV model such that:

$$X_{PRED} = V (V^T V)^{-1} V^T X.$$

If we replace $X$ with $X_{PRED}$ in $Y = XB + E$, the OLS estimation of $B$ would be unbiased. This results from the fact that $X_{PRED}$ is marginalized such that the dependency of $X$ on omitted variables and measurement errors is eliminated when $X_{PRED}$ is used in the linear model $Y = XB + E$ instead of $X$. Therefore, the OLS estimation of coefficients $B$ is consistent for the linear model $Y = X_{PRED} B + E$. Hence, consistent estimates of $B$ are obtained by using instrumental variables in two steps. The first step is the IV model. The second step is the OLS model that uses predicted values of $X$ to find the consistent (unbiased) coefficients. For that reason, this approach is referred to as the two-stage least squares (2SLS) method (Antonakis et al., 2010). Using the expression $X_{PRED} = V (V^T V)^{-1} V^T X$, the unbiased estimates of coefficients $B$ are obtained as follows:

$$B_{2SLS} = (X^T V (V^T V)^{-1} V^T X)^{-1} X^T V (V^T V)^{-1} V^T Y.$$
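The omitted-variable bias and the two-stage procedure can both be illustrated on simulated data. Everything in the sketch below is an assumption made for illustration: the omitted variable `z`, the instrument `v`, and all coefficient values. The code first checks the bias expression $B_{OLS} = B_{CONS} + (X^T X)^{-1} X^T Z\theta$, taking the OLS fit of the full model (with `z` included) as the reference, and then runs the two stages and compares them with the closed-form 2SLS expression.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
z = rng.normal(size=n)                         # omitted variable (hypothetical)
v = rng.normal(size=n)                         # instrument (hypothetical)
x = 0.8 * v + 0.6 * z + rng.normal(size=n)     # x depends on both v and z
y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])           # restricted model: z left out
V = np.column_stack([np.ones(n), v])           # instrument matrix

# Restricted OLS, biased because z is omitted and correlated with x.
B_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Full model including z, used here as the reference for the bias formula.
XZ = np.column_stack([X, z])
coef_full = np.linalg.solve(XZ.T @ XZ, XZ.T @ y)
B_full, theta = coef_full[:2], coef_full[2]
bias_term = np.linalg.solve(X.T @ X, X.T @ z) * theta   # (X^T X)^{-1} X^T Z theta

# Stage 1: regress X on V and compute the predicted values of X.
Gamma_hat = np.linalg.solve(V.T @ V, V.T @ X)
X_pred = V @ Gamma_hat

# Stage 2: OLS of y on the predicted values.
B_2sls = np.linalg.solve(X_pred.T @ X_pred, X_pred.T @ y)

# Closed-form 2SLS expression from the text.
A = X.T @ V @ np.linalg.inv(V.T @ V) @ V.T
B_closed = np.linalg.solve(A @ X, A @ y)

print("restricted OLS slope:", B_ols[1])
print("2SLS slope          :", B_2sls[1])
```

With these assumed values the restricted OLS slope should land noticeably above the simulated slope of 2.0, while the 2SLS slope should be close to it.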
The consistency of 2SLS estimates of coefficients depends on the validity of instrumental variables. If an instrumental variable has a relatively high correlation with the error terms of the model for the dependent variable, it would not be a valid instrumental variable because it would fail to marginalize $X$ well enough to make $B_{2SLS}$ consistent. In such circumstances, the generalized method of moments (GMM) would give robust estimates (Hansen, 1982). In an OLS model, the moment conditions are obtained by the orthogonality condition such that:

$$X^T (Y - XB) = 0.$$

When there is an endogeneity problem, this moment condition cannot be used because some of the explanatory variables are expected to correlate with error terms. We use $X_{ENDO}$ to denote the matrix of observations for the endogenous explanatory variables. We note that there would be some other explanatory variables that do not correlate with error terms, which are referred to as exogenous variables. We use $X_{EXOG}$ to denote exogenous variables such that $X_{ENDO} \cup X_{EXOG} = X$. The exogenous variables satisfy the following relationship:

$$X_{EXOG}^T (Y - XB) = 0.$$

The expression is used in the GMM as one set of moment conditions. We recall that the first column of the $X$ matrix is the column of ones, which is needed to have the intercept $\beta_0$ in the linear model. If we regard the column of ones as observed values of an explanatory variable, it must be independent from error terms because it has a constant value (i.e. one). Therefore, the column of ones should be part of $X_{EXOG}$, not $X_{ENDO}$. The moment condition that corresponds to the column of ones in $X_{EXOG}^T (Y - XB) = 0$ is given by:

$$\begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix} = \sum_{j=1}^{n} \epsilon_j = 0,$$

which corresponds to the moment condition $E(Y - XB) = 0$, where $E(\cdot)$ here denotes the expectation operator. The second set of moment conditions is related to instrumental variables because they must be independent from error terms:

$$V^T (Y - XB) = 0.$$
Therefore, the GMM model is built on the following moment conditions:

$$\begin{aligned}
E(Y - XB) &= 0, \\
X_{EXOG}^T (Y - XB) &= 0, \\
V^T (Y - XB) &= 0.
\end{aligned}$$

It employs numerical optimization techniques and converges by iteration to the consistent estimates of $B$ that satisfy the moment conditions. The identification of the GMM model can be tested by the Hansen test of model specification (Hansen, 1982), where the null hypothesis is that the moment conditions $V^T (Y - XB) = 0$ are fulfilled. If the null hypothesis is not rejected, the GMM estimates of coefficients are considered consistent.
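For a linear model, the idea of choosing $B$ to bring the stacked moment conditions as close to zero as possible can be sketched compactly. The example below is a simplified one-step GMM with an identity weighting matrix, which is an assumption of this sketch: the text does not specify the weighting, and efficient GMM re-estimates the weighting matrix iteratively. With this choice the minimizer has a closed form, so no numerical optimizer is needed. The simulated variables (`x_endo`, `x_exog`, `v1`, `v2`) and their coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4000
z = rng.normal(size=n)                    # unobserved source of endogeneity
x_exog = rng.normal(size=n)               # exogenous explanatory variable
v1 = rng.normal(size=n)                   # instrument 1 (hypothetical)
v2 = rng.normal(size=n)                   # instrument 2 (hypothetical)
x_endo = 0.7 * v1 + 0.5 * v2 + 0.6 * z + rng.normal(size=n)
y = 1.0 + 2.0 * x_endo - 1.0 * x_exog + 1.5 * z + rng.normal(size=n)

X = np.column_stack([np.ones(n), x_endo, x_exog])

# Variables assumed orthogonal to the error terms: the column of ones,
# the exogenous variable, and the two instruments (4 moment conditions).
W = np.column_stack([np.ones(n), x_exog, v1, v2])

def moments(B):
    """Sample moment conditions W^T (Y - XB) / n."""
    return W.T @ (y - X @ B) / n

def objective(B):
    """GMM objective with identity weighting: squared norm of the moments."""
    g = moments(B)
    return g @ g

# The moments are linear in B: g(B) = h - G B, so the minimizer of
# ||h - G B||^2 solves the normal equations G^T G B = G^T h.
G = W.T @ X / n
h = W.T @ y / n
B_gmm = np.linalg.solve(G.T @ G, G.T @ h)

B_ols = np.linalg.solve(X.T @ X, X.T @ y)   # ignores endogeneity

print("GMM estimate:", B_gmm)
print("OLS estimate:", B_ols)
print("objective at GMM estimate:", objective(B_gmm))
```

Because there are four moment conditions and three coefficients, the conditions cannot all be met exactly in a finite sample, which is the over-identified situation that the Hansen test examines.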
2.1.6 Regularization Methods
The OLS and GLS estimates of coefficients and their variances can be used to develop t-test statistics of the significance of coefficients. The t-test checks whether the null hypothesis (stating that the coefficient of a variable equals zero) can be rejected at a chosen significance level (Pardoe, 2021). The test statistics help determine whether a coefficient is significant and whether the variable that corresponds to the coefficient should stay in the model. Such test statistics are included in summary tables of various computer programs (e.g. Python's statsmodels.regression.linear_model.OLS). This approach takes into account the in-sample dynamics. If there are 100 observations that are used to fit an OLS model, for example, the test statistics are derived from these 100 observations. The results then identify the variables that have significant coefficients and show which variables must stay in the model. In the machine learning field, researchers often split a dataset into two as training and test sets (Hastie et al., 2001). The training set is used to train a model and estimate coefficients. The test set is later used to assess the performance of the trained model based on a performance metric (e.g. mean absolute percentage error (MAPE)). The t-test statistics are not useful in this setting because there is no direct relationship between the t-test statistics (calculated by utilizing the training set) and the performance of the model (calculated by utilizing the test set). Nevertheless, regularization methods provide researchers with the flexibility to adjust a parameter, that is, the $\alpha$ penalty term, and they incorporate this parameter in the model to estimate coefficients by utilizing the training set. The performance of the model is later assessed by utilizing the test set. The most robust set of coefficients is determined by selecting the $\alpha$ term that performs the best on the test set. This approach does not necessarily minimize the sum of squared errors on the training set.
But it emphasizes the out-of-sample performance of the model because it aims to find the best model based on the performance on the test set. In some cases, it makes the values of some coefficients equal to zero, thereby automatically extracting some variables from the model. There are two popular regularization methods: (1) L1 regularization and (2) L2 regularization (Hastie et al., 2001). The L1 regularization is also referred to as the Lasso method, whereas the L2 regularization is referred to as Ridge regression. The objective function in the OLS method is to minimize the sum of squared errors. The Lasso model updates the objective function in the following way (Hastie et al., 2001):

$$\text{Minimize } E^T E + \alpha \sum_{j=1}^{m} |\beta_j| = (Y - XB)^T (Y - XB) + \alpha \sum_{j=1}^{m} |\beta_j|.$$
This objective function adds the penalty term to the sum of squared errors, so its value deteriorates as more coefficients are included in the model. To minimize the objective function, the sum of squared errors and the number of non-zero coefficients must be jointly reduced. Because the function has an absolute value term, the optimal values of the coefficients cannot be determined by an analytical expression. Instead, a numerical optimization technique based on least angle regression (Efron et al., 2004) is employed to find them.

The least angle regression method is an iterative algorithm that starts with the initial values $\beta_1 = \beta_2 = \cdots = \beta_m = 0$. Then, the error terms for each observation are found based on these initial values. We use $E_0$ to denote the error terms (i.e. a column vector) for the first iteration. The variable whose observed values have the highest correlation with $E_0$ is then selected. Suppose that it is $x_j$. The least angle regression method increases the value of $\beta_j$ gradually and recalculates $E_0$ accordingly. It keeps increasing $\beta_j$ until the $x_j$ values no longer have the highest correlation with $E_0$. At that point, the second iteration starts with $E_1 \leftarrow E_0$. The next variable that has the highest correlation with $E_1$ is selected. Suppose that it is $x_k$. Likewise, the value of $\beta_k$ is increased gradually until the $x_k$ values no longer have the highest correlation with $E_1$. While updating the $\beta$ values, the algorithm calculates the objective function value. At the very beginning, the objective function has a very high value, which decreases as the $\beta$ values are updated. At some point the objective function value starts to increase; the algorithm stops iterating there because the minimum value of the objective function has been reached.

The Ridge regression model uses an alternative objective function (Hastie et al., 2001):

$$\text{Minimize } E^T E + \alpha B^T B = (Y - XB)^T (Y - XB) + \alpha B^T B$$

$$= Y^T Y - Y^T X B - B^T X^T Y + B^T X^T X B + \alpha B^T B.$$

Unlike the Lasso model, there exists an analytical expression for the optimal set of parameters in the Ridge regression model. It is given by:

$$B_{ridge} = (X^T X + \alpha I)^{-1} X^T Y,$$
where $I$ is an identity matrix with $m$ diagonal elements. The formal derivation of $B_{ridge}$ is given in the solutions of “Practice Examples” at the end of this chapter.
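The closed-form Ridge expression can be sketched numerically as follows. This is only an illustration under stated assumptions: the data values, the function name `ridge_coefficients` and the choice of $\alpha = 10$ are hypothetical, the design matrix has no intercept column, and NumPy is assumed to be available.

```python
import numpy as np

def ridge_coefficients(X, Y, alpha):
    """Closed-form Ridge estimate B = (X^T X + alpha * I)^(-1) X^T Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    m = X.shape[1]  # number of coefficients
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + alpha * np.eye(m), X.T @ Y)

# Hypothetical data: six observations of two explanatory variables.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
Y = np.array([5.1, 3.9, 11.2, 9.8, 17.1, 16.0])

B_ols = ridge_coefficients(X, Y, alpha=0.0)     # alpha = 0 reduces to OLS
B_ridge = ridge_coefficients(X, Y, alpha=10.0)  # penalized coefficients
print("OLS:  ", B_ols)
print("Ridge:", B_ridge)
```

Setting `alpha=0.0` recovers the OLS estimate, and a positive `alpha` shrinks the coefficient vector towards zero.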
2.1.7 Classification Methods
The OLS model is used to estimate coefficients of explanatory variables when the dependent variable has a continuous value. For example, graduate students who take a predictive analytics course would have final scores between 0 and 100. We may argue that the final score is closely related to the time spent on studying the predictive analytics course during the semester. To understand the impact of time spent on studying the course on the final score, we would develop a linear model where $y$ indicates the final score and $x_1$ indicates the time spent on the course:

$$y = \beta_0 + \beta_1 x_1 + \epsilon.$$
The coefficients of this model can be estimated by utilizing the OLS method when assumptions of linear models hold true. When some of the assumptions are violated, alternative approaches such as GLS, 2SLS, GMM and regularization methods can be utilized. Instead of modelling the final score, researchers would be more interested in predicting whether students pass or fail in the course. In this case, the dependent variable can be defined as a binary variable, which takes zero if a student fails in the course and one otherwise. When the dependent variable is binary or categorical, linear models cannot be used directly. In Fig. 2.2, we present a hypothetical example in which there are 19 students who take the predictive analytics course. The figure shows the relationship between total study time and whether students pass or fail in the course. The fitted linear model has a large deviation from the observed values at the outliers. Therefore, a linear model fits poorly when the dependent variable has a binary or categorical value.
Fig. 2.2 Misspecification of linear models for classification problems
Force-fitting a linear model to a binary or categorical dependent variable causes the misspecification problem. Nevertheless, such a dependent variable can be transformed into a continuous variable, making it possible to employ a linear model. Rather than using $y$ in its explicit form, we can write it as a probability function such that:

$$\Pr(y = 1 \mid x_1) = f(x_1).$$
The value of the probability function lies between zero and one. Then, we can apply another transformation and obtain the following result (Pregibon, 1981; Train, 2009):

$$\ln \frac{f(x_1)}{1 - f(x_1)} = \beta_0 + \beta_1 x_1.$$
This model is a logistic regression model, where the left-hand side of this expression is referred to as the logit term. The logit term has continuous values without any lower and upper bounds. When $f(x_1)$ approaches zero, the logit term goes to minus infinity. When $f(x_1)$ approaches one, it goes to plus infinity. Therefore, the logit expression transforms the binary dependent variable to a continuous variable. Solving this expression for $f(x_1)$, the following result is obtained (Train, 2009):

$$f(x_1) = \frac{e^{\beta_0 + \beta_1 x_1}}{1 + e^{\beta_0 + \beta_1 x_1}}.$$
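The relationship between the logit term and the probability function can be illustrated with a short sketch. The coefficient values and the study hours below are hypothetical and chosen only to show that the two transformations are inverses of each other.

```python
import math

def logit(p):
    """Logit term ln(p / (1 - p)) for a probability 0 < p < 1."""
    return math.log(p / (1.0 - p))

def logistic(x1, beta0, beta1):
    """Probability f(x1) implied by the logistic regression model."""
    z = beta0 + beta1 * x1
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical coefficients: the passing probability rises with study hours.
beta0, beta1 = -4.0, 0.1
for hours in (20, 40, 60):
    p = logistic(hours, beta0, beta1)
    # The logit of p recovers the linear term beta0 + beta1 * hours.
    print(hours, round(p, 3), round(logit(p), 3))
```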
We now consider another case in which the dependent variable is categorical with more than two classes. Suppose, for example, that a retailer wants to predict customers’ purchase probability for a certain product family (e.g. cheese). There are four kinds of cheese sold by the retailer: (1) cheddar, (2) Gruyere, (3) Gouda and (4) feta. The retailer suspects that customer preference over cheese depends strongly on the average grocery expenditure per month. In practice, retailers can acquire such information easily by analysing point-of-sales (POS) data of customers who have loyalty cards. We now extend the logit model with a binary dependent variable to a multinomial logit model with four classes. To this end, we define the $y$ variable as follows:

$$y = \begin{cases} 1 & \text{if Cheddar}, \\ 2 & \text{if Gruyere}, \\ 3 & \text{if Gouda}, \\ 4 & \text{if Feta}. \end{cases}$$
In a multinomial logit model, we must have $(n - 1)$ logit models to predict a dependent variable with $n$ classes (Hastie et al., 2001). For the retailer example, the logit models are constructed as follows:

$$\ln \frac{\Pr(y = 1 \mid x)}{\Pr(y = 4 \mid x)} = \beta_{(0,1)} + \beta_{(1,1)} x,$$

$$\ln \frac{\Pr(y = 2 \mid x)}{\Pr(y = 4 \mid x)} = \beta_{(0,2)} + \beta_{(1,2)} x,$$

$$\ln \frac{\Pr(y = 3 \mid x)}{\Pr(y = 4 \mid x)} = \beta_{(0,3)} + \beta_{(1,3)} x,$$
where $x$ is the explanatory variable that indicates the average grocery expenditure of a customer per month. Using these expressions, we obtain the following:

$$\Pr(y = 1 \mid x) = \Pr(y = 4 \mid x)\, e^{\beta_{(0,1)} + \beta_{(1,1)} x},$$

$$\Pr(y = 2 \mid x) = \Pr(y = 4 \mid x)\, e^{\beta_{(0,2)} + \beta_{(1,2)} x},$$

$$\Pr(y = 3 \mid x) = \Pr(y = 4 \mid x)\, e^{\beta_{(0,3)} + \beta_{(1,3)} x}.$$

Total probability of customer preferences is equal to one such that:

$$\Pr(y = 1 \mid x) + \Pr(y = 2 \mid x) + \Pr(y = 3 \mid x) + \Pr(y = 4 \mid x) = 1.$$
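Combining this expression with the previous three ones, we obtain the following results:

$$\Pr(y = 4 \mid x) \left( 1 + \sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x} \right) = 1,$$

$$\Pr(y = 4 \mid x) = \frac{1}{1 + \sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x}},$$

$$\Pr(y = 1 \mid x) = \frac{e^{\beta_{(0,1)} + \beta_{(1,1)} x}}{1 + \sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x}},$$

$$\Pr(y = 2 \mid x) = \frac{e^{\beta_{(0,2)} + \beta_{(1,2)} x}}{1 + \sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x}},$$

$$\Pr(y = 3 \mid x) = \frac{e^{\beta_{(0,3)} + \beta_{(1,3)} x}}{1 + \sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x}}.$$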
In a multinomial logit model, one of the classes is used as a reference point, which appears as the denominator of the logit terms. In the retailer example, the reference point is the probability value for the last cheese type (i.e. $\Pr(y = 4 \mid x)$). It has been well established in Hastie et al. (2001) that the reference probability term can be avoided in a softmax normalization. The softmax normalization can be conceptualized as follows. We weakly assume $e^{\beta_{(0,4)} + \beta_{(1,4)} x} = 1$. Then, the probability values become equal to:

$$\Pr(y = 1 \mid x) = \frac{e^{\beta_{(0,1)} + \beta_{(1,1)} x}}{\sum_{j=1}^{4} e^{\beta_{(0,j)} + \beta_{(1,j)} x}},$$

$$\Pr(y = 2 \mid x) = \frac{e^{\beta_{(0,2)} + \beta_{(1,2)} x}}{\sum_{j=1}^{4} e^{\beta_{(0,j)} + \beta_{(1,j)} x}},$$

$$\Pr(y = 3 \mid x) = \frac{e^{\beta_{(0,3)} + \beta_{(1,3)} x}}{\sum_{j=1}^{4} e^{\beta_{(0,j)} + \beta_{(1,j)} x}},$$

$$\Pr(y = 4 \mid x) = \frac{e^{\beta_{(0,4)} + \beta_{(1,4)} x}}{\sum_{j=1}^{4} e^{\beta_{(0,j)} + \beta_{(1,j)} x}}.$$

These expressions can be written in a general form:

$$\Pr(y = i \mid x) = \frac{e^{\beta_{(0,i)} + \beta_{(1,i)} x}}{\sum_{j=1}^{4} e^{\beta_{(0,j)} + \beta_{(1,j)} x}}.$$
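The general form above can be sketched in a few lines. The coefficient pairs and the expenditure value are hypothetical; the fourth class (feta) is treated as the reference class by fixing its coefficients at zero, so that its exponential term equals one as assumed in the text. Subtracting the largest score before exponentiating is a common numerical safeguard added here for illustration; it leaves the ratios unchanged.

```python
import math

def multinomial_logit_probs(x, betas):
    """Class probabilities from (beta0, beta1) pairs, one pair per class."""
    scores = [b0 + b1 * x for b0, b1 in betas]
    top = max(scores)  # shifting all scores by a constant does not change the ratios
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical coefficients for cheddar, Gruyere and Gouda; feta is the
# reference class with (0, 0), so exp(0 + 0 * x) = 1.
betas = [(0.5, -0.002), (-1.0, 0.004), (0.2, 0.001), (0.0, 0.0)]
probs = multinomial_logit_probs(300.0, betas)
for name, p in zip(("cheddar", "Gruyere", "Gouda", "feta"), probs):
    print(name, round(p, 3))
```

With the reference class fixed in this way, the log of the ratio between any class probability and the feta probability returns the corresponding linear term of the logit models.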
This expression is referred to as the softmax function, which is also a commonly used activation function in neural networks that will be discussed in Chap. 9. We can now remove the assumption of $e^{\beta_{(0,4)} + \beta_{(1,4)} x} = 1$ from our model. In this case, the coefficients $\beta_{(0,4)}$ and $\beta_{(1,4)}$ are estimated from data without imposing any constraint, like the other coefficients. Once the assumption is removed from the model, all coefficients will be adjusted automatically without having any negative impact on the consistency of the estimates of coefficients.

The coefficients of the multinomial logit model are estimated by the maximum likelihood estimation (MLE) method (Böhning, 1992). To write the likelihood function, we first define four binary dummy variables:

$$y_1 = 1 \text{ if } y = 1, \text{ otherwise it is zero},$$

$$y_2 = 1 \text{ if } y = 2, \text{ otherwise it is zero},$$

$$y_3 = 1 \text{ if } y = 3, \text{ otherwise it is zero},$$

$$y_4 = 1 \text{ if } y = 4, \text{ otherwise it is zero}.$$

Then, the likelihood function of $n$ observations of customer choices is written such that:

$$L(y, x) = \prod_{j=1}^{n} \Pr(y_1 = 1 \mid x)^{y_1} \Pr(y_2 = 1 \mid x)^{y_2} \Pr(y_3 = 1 \mid x)^{y_3} \Pr(y_4 = 1 \mid x)^{y_4},$$

where the dummy variables and $x$ inside the product take the values of the $j$th observation. Then, the log-likelihood function is obtained by taking the natural logarithm transformation of this expression:

$$l(y, x) = \sum_{j=1}^{n} \Big[ y_1 \ln(\Pr(y_1 = 1 \mid x)) + y_2 \ln(\Pr(y_2 = 1 \mid x)) + y_3 \ln(\Pr(y_3 = 1 \mid x)) + y_4 \ln(\Pr(y_4 = 1 \mid x)) \Big].$$

The coefficients of the multinomial logit model that maximize this log-likelihood expression can be found by numerical optimization techniques (Böhning, 1992).

The accuracy of a logit model must be assessed from different angles. Suppose that Table 2.1 presents the outcome of the logit model for our hypothetical midterm example given by Fig. 2.2. This outcome matrix is often referred to as the confusion matrix (Deng et al., 2016). Columns of the matrix indicate what is observed in reality, whereas rows show what the logit model predicts. According to the confusion matrix, there are 11 students who succeed in the predictive analytics course. However, the logit model identifies only 9 successful students (out of 11) correctly. There are also eight students who fail in the course. The logit model identifies only five of them correctly.

Table 2.1 Confusion matrix for the midterm example

| What the model predicts | Observed in reality: Success | Observed in reality: Failure | Total |
|---|---|---|---|
| Success | 9 | 3 | 12 |
| Failure | 2 | 5 | 7 |
| Total | 11 | 8 | 19 |

There are two important metrics that can be extracted from the confusion matrix (Deng et al., 2016). First, the precision of the model is calculated by the ratio of successful students who are identified correctly by the model to the total number of success predictions. Therefore,

$$\text{Precision} = \frac{[\text{Value in upper-left quadrant}]}{[\text{Total of first row}]} = 9/12 = 75\%.$$
The second metric is recall, which is defined as the ratio of successful students who are identified correctly by the model to the total number of successful students. Thus,

$$\text{Recall} = \frac{[\text{Value in upper-left quadrant}]}{[\text{Total of first column}]} = 9/11 = 81.8\%.$$
One of these metrics may be more important than the other depending on the context of a problem. For example, investors would be interested in identifying a portfolio of stocks that return a positive payoff in the next 2 years. When a logit model predicts a set of stocks with positive returns, investors would not be tolerant if some of these stocks do not yield a positive return. The model does not need to predict all stocks with a positive return. It can identify only a few stocks; however, they must be high-profit ones. For instance, if a clairvoyant tells an investor that Apple will certainly generate 20% return in the stock market in 2 years, the investor does not need to predict another stock correctly. She would only invest in Apple. For that reason, precision is more important than recall for the investment problem because investors desire a logit model with a high precision, which brings the value of the upper-right quadrant in the confusion matrix close to zero.

At the onset of the Covid pandemic, governments around the world increased the number of Covid-19 rapid antigen tests applied to patients who have some Covid symptoms. These test kits are not fully reliable. They may wrongly identify a Covid-positive person as negative or vice versa. Identifying a Covid-negative person as positive would cause the person to stay at home during a quarantine period, which is not desired by public officials. A much worse case would be the test kits identifying a Covid-positive person as negative. In this case, the patient would be outside and spread the virus to other people. For such test kits, recall is more important than precision because public officials desire a high recall, which brings the value of the lower-left quadrant in the confusion matrix close to zero.

When precision and recall are equally important, a single metric can be used to assess the accuracy of a logit model (Deng et al., 2016).
This metric is the $F_1$ score, which is formulated by:

$$F_1 = \frac{2}{\dfrac{1}{\text{Precision}} + \dfrac{1}{\text{Recall}}}.$$
For the confusion matrix given by Table 2.1, the $F_1$ score is calculated as:

$$F_1 = \frac{2}{\dfrac{1}{0.75} + \dfrac{1}{0.818}} = 0.7825.$$

The $F_1$ score can take values between zero and one. When both precision and recall are close to one, the $F_1$ score is close to one. In this case, the logit model is considered highly accurate in terms of correctly identifying the categories. If either precision or recall is close to zero, the $F_1$ score is close to zero, regardless of the value of the other metric.
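The three metrics can be computed directly from the cells of Table 2.1, as in the following sketch. The function and argument names are illustrative; "positive" here stands for a predicted or observed success.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Precision, recall and F1 from three confusion-matrix cells."""
    precision = true_pos / (true_pos + false_pos)  # upper-left / total of first row
    recall = true_pos / (true_pos + false_neg)     # upper-left / total of first column
    f1 = 2.0 / (1.0 / precision + 1.0 / recall)    # harmonic mean of the two
    return precision, recall, f1

# Cells of Table 2.1: 9 correct success predictions, 3 wrong success
# predictions (upper-right) and 2 missed successes (lower-left).
precision, recall, f1 = precision_recall_f1(true_pos=9, false_pos=3, false_neg=2)
print(f"precision={precision:.3f}, recall={recall:.3f}, F1={f1:.4f}")
```

With the unrounded recall of 9/11, the score equals 18/23, or about 0.7826; the value 0.7825 in the text follows from rounding recall to 0.818 before taking the harmonic mean.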
2.1.8 Time Series Analysis
The predictive models that have been discussed above aim to model a dependent variable as a function of explanatory variables. In Fig. 2.1, for example, we present a linear model in which the dependent variable is daily gross income of an adult and the explanatory variable is high school GPA. Time series models are different from the other models discussed so far in that a variable of interest is modelled as a function of its historical values. In some situations, historical values of a dependent variable would help predict its future values better than different explanatory variables that are available to the modeller. Thus, time series models can be more appealing than regression models depending on the context. The objective of this section is to introduce the reader to some basic concepts of time series analysis and autoregressive integrated moving average (ARIMA) models (Box & Pierce, 1970; Hamilton, 1994; Chatfield, 2004; Shumway & Stoffer, 2011).

In Fig. 2.3, we present the daily stock prices of Amazon.com and Walmart from January 1, 2000, until December 31, 2021. Between 2000 and 2008, the stock price of Amazon was steady. Then, it started to increase exponentially. The stock price of Walmart, however, was steady between 2000 and 2011. It later started to increase at a lower pace than Amazon. Finally, Amazon had a substantially higher stock price than Walmart in 2021. Using the historical stock prices, it would be possible for these two retailers to predict future stock prices. For instance, we take the evolution of Amazon’s stock prices between 2008 and 2019. The stock price grows exponentially during this time window. The data between 2008 and 2012 can be used to capture the exponential dynamics in time series models. Then, time series models would help predict the stock prices of Amazon between 2012 and 2019 very accurately. This observation is indicative of high prediction accuracy of time series models in such situations. We observe similar dynamics for Walmart between 2016 and 2021. Other publicly listed companies may also exhibit such time series dynamics, in which historical stock price information is enough to predict future values without any need for other explanatory variables.

Fig. 2.3 Daily stock prices of Amazon.com and Walmart between 2000 and 2021

We use $x_t$ to denote the value of a variable of interest, where the subscript $t$ is the time index. The time index starts from zero and increases by one for each subsequent observation. For example, Fig. 2.3 shows stock prices of Amazon and Walmart for 5536 days. Let $x_t$ denote Amazon’s stock price. Thus, $x_0$ is the stock price on the first date of the data shown in the figure, which is January 1, 2000. Then, $x_1$, $x_2$, $x_3$, $\cdots$ are stock prices on the days January 2, 3, 4, respectively. Finally, $x_{5535}$ is the stock price on the last day of the data, which is December 31, 2021. In time series models, there is a white noise term for each time index. The white noise term, denoted by $\omega_t$, is defined as a random variable that is not correlated with its historical values and satisfies:

$$\omega_t \sim iid(0, \sigma^2), \quad \forall t.$$
Therefore, $\omega$ is an independent and identically distributed (i.e. iid) random variable with zero mean and a standard deviation of $\sigma$. Due to its iid property, white noise has an autocovariance of $\sigma^2$ for lag-0 and of zero for the lags different from zero. Autocovariance measures the dependence between two observations of the same variable at different times:

$$\zeta_x(i) = cov(x_t, x_{t-i}) = E\big[(x_t - \mu)(x_{t-i} - \mu)\big],$$

where $\mu = E(x)$. Thus, the term $\zeta_x(i)$ gives the autocovariance of $x$ values when the time difference is equal to $i$ periods. In other words, $\zeta_x(i)$ is the lag-$i$ autocovariance of $x$. Suppose that there are $n$ values of $x$. To calculate the lag-$i$ autocovariance of $x$, we can conceptualize the following two series as two different variables such that:

$$x_{t-i} = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-i-1} \end{bmatrix} \quad \text{and} \quad x_t = \begin{bmatrix} x_i \\ x_{i+1} \\ \vdots \\ x_{n-1} \end{bmatrix}.$$

Then, the covariance between $x_t$ and $x_{t-i}$ is equal to the lag-$i$ autocovariance of $x$.
Let’s consider a time series model with white noise terms:

$$x_t = \omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2}.$$

This model is referred to as a moving average model with two components, that is, MA(2). The lag-0 autocovariance of $x$ is found by:

$$\zeta_x(0) = cov(\omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2},\; \omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2}).$$

Given that white noise is independently and identically distributed, $\omega_t$ of the first term inside parentheses only correlates with $\omega_t$ of the second term. Because $\omega$ has a variance of $\sigma^2$, the contribution of $\omega_t$ (which appears in both terms inside parentheses) to $\zeta_x(0)$ is equal to $\sigma^2$. Likewise, $\omega_{t-1}$ of the first term only correlates with $\omega_{t-1}$ of the second term inside parentheses. The contribution of $\omega_{t-1}$ to $\zeta_x(0)$ is equal to $\theta_1^2 \sigma^2$ because it has the coefficient of $\theta_1$ in both terms. Finally, the contribution of $\omega_{t-2}$ to $\zeta_x(0)$ is equal to $\theta_2^2 \sigma^2$. Therefore, the expression can be written as follows:

$$\zeta_x(0) = \sigma^2 (1 + \theta_1^2 + \theta_2^2).$$

The lag-1 autocovariance is formulated as follows:

$$\zeta_x(1) = cov(\omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2},\; \omega_{t-1} + \theta_1 \omega_{t-2} + \theta_2 \omega_{t-3}).$$

In this expression, only $\omega_{t-1}$ and $\omega_{t-2}$ appear in both terms inside parentheses. Due to the iid property of white noise, only the variables that appear in both terms can have a contribution to the autocovariance. The contribution of $\omega_{t-1}$ to $\zeta_x(1)$ is equal to $\theta_1 \sigma^2$, and the contribution of $\omega_{t-2}$ to $\zeta_x(1)$ is equal to $\theta_1 \theta_2 \sigma^2$. Thus,

$$\zeta_x(1) = \sigma^2 (\theta_1 + \theta_1 \theta_2).$$

The lag-2 autocovariance is formulated as follows:

$$\zeta_x(2) = cov(\omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2},\; \omega_{t-2} + \theta_1 \omega_{t-3} + \theta_2 \omega_{t-4}).$$

In this expression, only $\omega_{t-2}$ appears in both terms inside parentheses. Therefore,

$$\zeta_x(2) = \sigma^2 \theta_2.$$

The lag-3 autocovariance is equal to zero. It is given by:

$$\zeta_x(3) = cov(\omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2},\; \omega_{t-3} + \theta_1 \omega_{t-4} + \theta_2 \omega_{t-5}).$$
The first term inside parentheses includes $\omega_t$, $\omega_{t-1}$ and $\omega_{t-2}$. None of these $\omega$ values appear in the second term because the second term includes different $\omega$ values, that is, $\omega_{t-3}$, $\omega_{t-4}$ and $\omega_{t-5}$. Thus, the lag-3 autocovariance is equal to zero due to the iid property of white noise. Likewise, autocovariances for lags greater than three are all equal to zero.

The derivations above show that the autocovariances for lags greater than or equal to three are equal to zero for an MA(2) model. This result can be generalized for moving average models such that the autocovariance drops to zero after $q$ lags in an MA($q$) model. This result provides useful insights regarding the construction of time series models (Chatfield, 2004; Shumway & Stoffer, 2011). Autocovariance values can be computed for different lags using time series data of a variable of interest. We would expect the values to be higher for lower lags. As the lag increases, autocovariance would decrease in absolute terms. When the autocovariance value in absolute terms is no longer statistically different from zero, we would take that lag as the cut-off point and use this information to determine the number of elements in the moving average model, that is, the $q$ value for the MA($q$) model.

The magnitude of autocovariance values varies widely with the scale of the time series data. Autocorrelation can be used to scale autocovariance such that lag-$i$ autocorrelation is calculated by dividing lag-$i$ autocovariance by lag-0 autocovariance. We use $\rho_x(i)$ to denote the lag-$i$ autocorrelation of $x$ such that:

$$\rho_x(i) = \frac{\zeta_x(i)}{\zeta_x(0)}.$$
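The counting argument used in the MA(2) derivations can be sketched for a general MA($q$) model: a white noise term contributes to the lag-$i$ autocovariance only if it appears in both expressions, so the result is $\sigma^2$ times the sum of products of coefficients that are $i$ positions apart. The function names and the coefficient values below are hypothetical.

```python
def ma_autocovariance(thetas, sigma2, lag):
    """Theoretical lag autocovariance of x_t = w_t + theta_1 w_{t-1} + ... + theta_q w_{t-q}."""
    coeffs = [1.0] + list(thetas)  # the coefficient of w_t is one
    lag = abs(lag)
    if lag >= len(coeffs):
        return 0.0                 # no shared white noise terms beyond lag q
    return sigma2 * sum(coeffs[j] * coeffs[j + lag]
                        for j in range(len(coeffs) - lag))

def ma_autocorrelation(thetas, lag):
    """Lag autocorrelation: autocovariance scaled by the lag-0 autocovariance."""
    return ma_autocovariance(thetas, 1.0, lag) / ma_autocovariance(thetas, 1.0, 0)

# Hypothetical MA(2) model with theta_1 = 0.5, theta_2 = 0.25 and sigma^2 = 4.
thetas, sigma2 = [0.5, 0.25], 4.0
for lag in range(5):
    print(lag, ma_autocovariance(thetas, sigma2, lag), ma_autocorrelation(thetas, lag))
```

For the MA(2) case the function reproduces $\sigma^2(1 + \theta_1^2 + \theta_2^2)$, $\sigma^2(\theta_1 + \theta_1\theta_2)$ and $\sigma^2\theta_2$ for lags 0, 1 and 2, and zero for higher lags.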
Autocorrelation values lie between $-1$ and 1: $-1 \le \rho_x(i) \le +1$. The autocorrelation values for Amazon are given in Fig. 2.4. The dark blue area shows the true autocorrelation values, while the light blue area is the confidence interval for the null hypothesis that autocorrelation is not statistically different from zero. The x-axis represents the lags, while the y-axis gives the autocorrelation values. When the lag is greater than around 400, the autocorrelation values lie within the confidence interval. Therefore, a moving average model with 400 elements (i.e. MA(400)) would fit Amazon’s stock price data from a theoretical point of view. However, it is not practically possible to estimate the coefficients of such a moving average model. This results from the fact that there are 400 coefficients of white noise terms in an MA(400) model. The white noise information is not available at the beginning because the only data used to fit a time series model is Amazon’s stock price data. The estimation of coefficients in such a time series model is based on a conditional likelihood model that iteratively updates the coefficients.¹

Fig. 2.4 Autocorrelation values for Amazon’s stock price for the first 1500 lags

Nevertheless, we observe that there has been an increasing trend in Amazon’s stock price as depicted by Fig. 2.3. When there is a trend in time series data, the data can be transformed to the first- or higher-order differences to reduce its dependence on historical information. The first-order difference of $x$ is formulated as follows:

$$\nabla x_t = x_t - x_{t-1}.$$
Suppose that $x_t$ increases with $t$ linearly according to the function:

$$x_t = \beta_0 + \beta_1 t.$$

The first-order differencing helps simplify the time series data such that:

$$\nabla x_t = \beta_0 + \beta_1 t - (\beta_0 + \beta_1 (t - 1)) = \beta_1.$$

If $x_t$ increases as a quadratic function of $t$, the second-order differencing would be a better alternative than the first-order differencing. The $n$th-order differencing for $n \ge 2$ is formulated as follows:

$$\nabla^n x_t = \nabla^{n-1} x_t - \nabla^{n-1} x_{t-1}.$$
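The effect of differencing on trending data can be sketched with two small series. Both series and the helper name `difference` are hypothetical: the first grows linearly in $t$ and the second grows quadratically.

```python
def difference(x, order=1):
    """Apply first-order differencing repeatedly, `order` times."""
    for _ in range(order):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x

# Hypothetical trends: x_t = 3 + 2t and x_t = 1 + 0.5 t^2.
linear = [3.0 + 2.0 * t for t in range(6)]
quadratic = [1.0 + 0.5 * t * t for t in range(6)]

first_diff = difference(linear, order=1)      # constant, equal to the slope
second_diff = difference(quadratic, order=2)  # constant after two rounds
print(first_diff)
print(second_diff)
```

Each round of differencing shortens the series by one observation, which is why the two outputs have five and four elements, respectively.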
In Fig. 2.5, we present the autocorrelation values for the first difference of Amazon’s stock price. As shown in the figure, only the first three lags have autocorrelation values that are statistically different from zero. Therefore, an MA(3) model applied after first-order differencing would potentially fit Amazon’s stock price data well. Such a model is referred to as an integrated moving average model: IMA(1,3). The first integer in parentheses refers to the order of differencing, while the second integer refers to the moving average order.

¹ We refer the reader to Shumway and Stoffer (2011, pp. 121–140), Chatfield (2004, Ch. 4) or Hamilton (1994, Ch. 5) for time series estimation methods.

Fig. 2.5 Autocorrelation values for the first difference of Amazon’s stock price for the first ten lags

In time series derivations, it is common to use a backshift operator for notational and exposition purposes. The backshift operator $B$ is used to represent a historical value by using more recent values of a time series variable such that:

$$x_{t-1} = B x_t,$$

$$x_{t-2} = B^2 x_t,$$

$$\vdots$$

$$x_{t-n} = B^n x_t.$$

Thus, $B^n$ is the $n$th-order backshift operator that helps represent $x_{t-n}$ as a function of $x_t$. Using the backshift operator, the first- and second-order differences of $x_t$ are written as follows:

$$\nabla x_t = x_t - x_{t-1} = x_t - B x_t = (1 - B) x_t,$$

$$\nabla^2 x_t = (x_t - x_{t-1}) - (x_{t-1} - x_{t-2}) = (1 - 2B + B^2) x_t = (1 - B)^2 x_t.$$

Likewise, the $n$th-order difference is also written using the backshift operator as follows:

$$\nabla^n x_t = (1 - B)^n x_t.$$

We note that the backshift operator is sometimes called the lag operator in some textbooks. For example, Hamilton (1994) uses the term “lag operator” denoted by $L$, whereas Shumway and Stoffer (2011) use the term “backshift operator” denoted by $B$. The term “lag operator” is somewhat vague as it does not specify the direction of the operator. For that reason, we follow the definition of Shumway and Stoffer (2011) here.

ARIMA Models

Time series models are constructed by combining different sub-models. For example, moving average components can be added to an autoregressive model, which in turn leads to the development of an autoregressive moving average (ARMA) model. If differencing is applied to an ARMA model, it becomes an autoregressive integrated moving average (ARIMA) model (Box & Pierce, 1970; Hamilton, 1994; Chatfield, 2004; Shumway & Stoffer, 2011). An autoregressive model with order $p$ (AR($p$)) is formalized as follows:

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \omega_t.$$
Hence, AR models make it possible to relate the current value of a time series variable to its historical values. Using the backshift operator, the last expression can be written as follows:

$$\omega_t = x_t - \phi_1 x_{t-1} - \phi_2 x_{t-2} - \cdots - \phi_p x_{t-p}$$

$$= (1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) x_t$$

$$= \phi(B) x_t,$$

where $\phi(B)$ is the autoregressive operator such that:

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p.$$
Consider an AR(1) model such that:

$$x_t = \phi_1 x_{t-1} + \omega_t.$$

Using the autoregressive expressions for historical observations:

$$x_{t-1} = \phi_1 x_{t-2} + \omega_{t-1},$$

$$x_{t-2} = \phi_1 x_{t-3} + \omega_{t-2},$$

the AR(1) model can be rewritten as follows:

$$x_t = \phi_1 (\phi_1 x_{t-2} + \omega_{t-1}) + \omega_t = \phi_1^2 x_{t-2} + \phi_1 \omega_{t-1} + \omega_t$$

$$= \phi_1^2 (\phi_1 x_{t-3} + \omega_{t-2}) + \phi_1 \omega_{t-1} + \omega_t$$

$$= \phi_1^3 x_{t-3} + \phi_1^2 \omega_{t-2} + \phi_1 \omega_{t-1} + \omega_t.$$

Replacing the $x_{t-3}$ term on the right-hand side of the last expression with its AR expression (i.e. $x_{t-3} = \phi_1 x_{t-4} + \omega_{t-3}$) and further extending this expression accordingly, the following result is derived (Shumway & Stoffer, 2011):

$$x_t = \sum_{j=0}^{\infty} \phi_1^j \omega_{t-j}.$$
The autocovariance for lag-$i$ of the AR(1) model is given by:

$$\zeta_x(i) = cov(x_t, x_{t-i}) = cov\left( \sum_{j=0}^{\infty} \phi_1^j \omega_{t-j},\; \sum_{j=0}^{\infty} \phi_1^j \omega_{t-i-j} \right)$$

$$= cov(\omega_t + \phi_1 \omega_{t-1} + \cdots + \phi_1^i \omega_{t-i} + \cdots,\; \omega_{t-i} + \phi_1 \omega_{t-i-1} + \cdots),$$

for $i \in \{0, 1, 2, \cdots\}$. In the last expression, there are two terms in parentheses on the right-hand side. The variables $\omega_{t-i}$, $\omega_{t-i-1}$, $\cdots$ appear in both terms. Therefore, the autocovariance is calculated as follows:

$$\zeta_x(i) = \phi_1^i \sigma^2 (1 + \phi_1^2 + \phi_1^4 + \phi_1^6 + \cdots) = \frac{\sigma^2 \phi_1^i}{1 - \phi_1^2}.$$
The last expression holds true when the absolute value of $\phi_1$ is assumed to be less than one, so that the geometric series converges. Therefore, the lag-0 autocovariance of the AR(1) model is equal to:

$$\zeta_x(0) = \frac{\sigma^2}{1 - \phi_1^2}.$$
Using the autocovariance formulations, the autocorrelation for lag-$i$ is given by:

$$\rho_x(i) = \frac{\zeta_x(i)}{\zeta_x(0)} = \phi_1^i.$$
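The closed-form autocovariance of the AR(1) model can be checked against its infinite moving average representation by truncating the sum after a finite number of terms. The parameter values, the truncation length of 500 terms and the function names are assumptions made only for this sketch.

```python
def ar1_autocov_closed_form(phi1, sigma2, lag):
    """sigma^2 * phi1^lag / (1 - phi1^2), valid when |phi1| < 1."""
    return sigma2 * phi1 ** lag / (1.0 - phi1 ** 2)

def ar1_autocov_truncated(phi1, sigma2, lag, terms=500):
    """Sum of phi1^j * phi1^(j + lag) over the white noise terms shared by x_t and x_{t-lag}."""
    return sigma2 * sum(phi1 ** j * phi1 ** (j + lag) for j in range(terms))

# Hypothetical parameters.
phi1, sigma2 = 0.6, 1.5
for lag in range(4):
    exact = ar1_autocov_closed_form(phi1, sigma2, lag)
    approx = ar1_autocov_truncated(phi1, sigma2, lag)
    autocorr = exact / ar1_autocov_closed_form(phi1, sigma2, 0)
    print(lag, exact, approx, autocorr)
```

The ratio in the last column decays geometrically with the lag, in line with $\rho_x(i) = \phi_1^i$.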
Thus, $\rho_x(0) = 1$ and $\rho_x(1) = \phi_1$ for the AR(1) model. The autocorrelation then decreases in absolute value as the lag increases, given that the absolute value of $\phi_1$ is assumed to be less than one. This result indicates the difference between the dynamics of AR and MA models. The autocorrelation values do not drop to zero after lag-$p$ in an AR($p$) model. Unlike the AR model, the autocorrelation values drop to zero after lag-$q$ in an MA($q$) model.

Analysing autocorrelation is not useful to determine the order of an AR model because autocorrelation values decrease gradually without having any cut-off point in AR models. Nevertheless, the partial autocorrelation can be used for finding the order of an AR model (Ramsey, 1974). The partial autocorrelation between $x_t$ and $x_{t-i}$ is the autocorrelation between $x_t$ and $x_{t-i}$ after removing its dependency on the intermediate set $\{x_{t-1}, x_{t-2}, \cdots, x_{t-i+1}\}$. We use $\hat{\rho}_x(i)$ to denote the partial autocorrelation of lag-$i$. The partial autocorrelation of lag-1 is the same as the autocorrelation of lag-1 such that:

$$\hat{\rho}_x(1) = \rho_x(1).$$
For higher lags, it generally differs from the autocorrelation because the effect of the intermediate observations is removed. For an AR($p$) model, the partial autocorrelation values drop to zero after lag-$p$ such that:

$$\hat{\rho}_x(p + 1) = \hat{\rho}_x(p + 2) = \cdots = 0.$$
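The cut-off property can be illustrated by computing partial autocorrelations from autocorrelations. The text does not prescribe a computation method; the sketch below uses the Durbin–Levinson recursion, a standard technique brought in here for illustration, and applies it to the theoretical autocorrelations $\rho_x(i) = \phi_1^i$ of a hypothetical AR(1) model with $\phi_1 = 0.6$.

```python
def partial_autocorrelations(rho, max_lag):
    """Durbin-Levinson recursion; rho[k] is the lag-k autocorrelation, rho[0] == 1."""
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den  # partial autocorrelation of lag k
        current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf

# Theoretical autocorrelations of a hypothetical AR(1) model.
phi1 = 0.6
rho = [phi1 ** k for k in range(6)]
pacf = partial_autocorrelations(rho, max_lag=5)
print([round(v, 6) for v in pacf])
```

For this AR(1) example the lag-1 partial autocorrelation equals $\phi_1$, and the values for higher lags vanish up to floating-point error, which is the cut-off after lag-$p$ described above.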
The moving average model with order $q$ (MA($q$)) is formalized as follows:

$$x_t = \omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2} + \cdots + \theta_q \omega_{t-q}.$$

Using the backshift operator, the MA($q$) expression is rewritten as follows:

$$x_t = (1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q) \omega_t = \theta(B) \omega_t,$$

where $\theta(B)$ is referred to as the moving average operator such that:

$$\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q.$$

Previously, we have shown the autocovariance derivations for the MA(2) model. The derivations can be generalized such that:

$$\zeta_x(0) = \sigma^2 (1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2),$$

$$\zeta_x(1) = \sigma^2 (\theta_1 + \theta_2 \theta_1 + \theta_3 \theta_2 + \cdots + \theta_q \theta_{q-1}),$$

$$\zeta_x(2) = \sigma^2 (\theta_2 + \theta_3 \theta_1 + \theta_4 \theta_2 + \cdots + \theta_q \theta_{q-2}),$$

$$\vdots$$

$$\zeta_x(q - 1) = \sigma^2 (\theta_{q-1} + \theta_q \theta_1),$$

$$\zeta_x(q) = \sigma^2 \theta_q.$$

The autocovariance values are zero after lag-$q$, that is, $\zeta_x(q+1) = 0$, $\zeta_x(q+2) = 0$, and so on.

The autoregressive and moving average elements can be combined to develop an autoregressive moving average (ARMA($p$,$q$)) model:

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \omega_t + \theta_1 \omega_{t-1} + \theta_2 \omega_{t-2} + \cdots + \theta_q \omega_{t-q}.$$
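Using the backshift operators, the ARMA($p$,$q$) model can be rewritten with the autoregressive operator on the left-hand side and the moving average operator on the right-hand side:

$$\phi(B) x_t = \theta(B) \omega_t.$$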
The ARMA($p$,$q$) model can be further extended to capture different dynamics that are observed in the time series data. If the time series variable $x_t$ is transformed to $\nabla^d x_t = (1 - B)^d x_t$, the ARMA model becomes the ARIMA($p$,$d$,$q$) model, that is, the autoregressive integrated moving average. When there is a trend in time series data such that it increases or decreases over time, differencing should be applied to the data. If there is a seasonality such that some patterns are observed every $s$ periods, it becomes the SARIMA($p$,$d$,$q$,$s$) model, that is, the seasonal autoregressive integrated moving average. The seasonality can be detected by looking at the time series data, autocorrelation and partial autocorrelation plots. We have not covered the SARIMA models and estimation of coefficients in this section. We refer the reader to Shumway and Stoffer (2011) for a detailed and rigorous analysis of SARIMA and other advanced topics of time series modelling as well as the estimation methods.

The orders $p$, $d$ and $q$ must be determined to develop an ARIMA($p$,$d$,$q$) model. The higher the $p$ value (all else being equal), the higher the likelihood of the time series model. Likewise, the likelihood of the model increases with the $q$ value. In other words, the likelihood of the model increases as more autoregressive or moving average elements are added to the model. However, the likelihood does not necessarily increase with $d$. The differencing order $d$ must be aligned with the trend dynamics of time series data. If the value of $x_t$ increases linearly over time, for example, the first-order differencing would lead to the highest likelihood (all else being equal). If the value increases quadratically, the second-order differencing would yield the highest likelihood. Therefore, the differencing order $d$ can be determined by visually inspecting the time series data (i.e. $x_t$ values over time). To determine the orders $p$ and $q$, the autocorrelation and partial autocorrelation plots must be inspected. If the autocorrelation values cut off after lag-3, for example, the minimum value of $q$ must be equal to three.
Although adding more moving average elements helps increase the likelihood of the model, estimation of the coefficients of such a model may fail to yield robust estimates for the reasons explained above (see the explanation of Fig. 2.4). Therefore, setting the q value to the cut-off lag (i.e. three) would be viable in this case. If the partial autocorrelation values cut off after lag-1, for instance, the minimum value of p must be equal to one.
2.2 Prescriptive Analytics
Prescriptive analytics is a field of data analytics that relies on optimization theory to help decision-makers improve their decisions. It involves a vast range of topics such as numerical optimization, linear programming, mixed-integer programming, constraint optimization, dynamic programming, etc. In this section, we introduce the reader to the basic principles of optimization theory, such as Taylor’s expansion and convexity. Then, we cover numerical optimization techniques (i.e. Newton’s method and gradient descent algorithm) and Lagrange optimization. We remark that this section must be complemented with Google’s OR-Tools for a comprehensive analysis and study of prescriptive methods. The complete package of Google’s OR-Tools is accessible online: https://developers.google.com/optimization/introduction/overview. We recommend interested readers to finish this section and then walk through the optimization methods (as well as Python practices) on that website.

Fig. 2.6 Taylor’s expansion illustration
2.2.1 Taylor’s Expansion
Taylor’s expansion is an approximation of a function using the first- and higher-order derivatives from a fixed point (Strang, 2019). It forms the basis of numerical optimization methods together with the convexity theory. When there is limited information about a function, Taylor’s expansion would be helpful for approximating the unknown parts of the function. To illustrate how Taylor’s expansion works, we present an example in Fig. 2.6. The red curve gives the values of the function $y = f(x)$. Suppose that we have partial information about this function such that we only know the dynamics within the frame. We would like to use this limited information to approximate the value of $f(x_2)$. To this end, we draw a tangent line that goes through the point $(x_1, f(x_1))$ as shown in the figure. The approximation of $f(x_2)$ based on this tangent line lies below the red curve. It is indeed the point at which the tangent line intersects with the vertical dashed line from $x_2$. This approximation can be found by:
$$f(x_2) \approx f(x_1) + s_1 (x_2 - x_1),$$
where $s_1$ is the slope of the tangent line. This slope is equal to the first derivative of $f(x)$ at $x = x_1$. Thus, the last expression can be rewritten as follows:
$$f(x_2) \approx f(x_1) + f'(x_1)(x_2 - x_1),$$
where $f'(\cdot)$ is the first derivative of function $f(\cdot)$. This functional form is known as the first-order approximation of $f(x_2)$. If the second derivative is known, it can be used to reduce the approximation error and get a better approximated value of $f(x_2)$. The second-order approximation of $f(x_2)$ is found by:
$$f(x_2) \approx f(x_1) + f'(x_1)(x_2 - x_1) + \frac{f''(x_1)}{2}(x_2 - x_1)^2.$$
Depending on the availability and knowledge of higher-order derivatives of $f(x)$, the approximation procedure can be improved. The $n$th-order approximation is written as follows:
$$f(x_2) \approx f(x_1) + f'(x_1)(x_2 - x_1) + \frac{f''(x_1)}{2}(x_2 - x_1)^2 + \cdots + \frac{f^{(n)}(x_1)}{n!}(x_2 - x_1)^n,$$
where $f^{(n)}(\cdot)$ denotes the $n$th derivative of $f(\cdot)$.
This is the generalized form of Taylor’s expansion, which provides the foundational basis of various numerical optimization methods. Consider the function $y = f(x) = (x - 3)^2$. Suppose that we would like to calculate the approximated value of $f(x_2)$ for $x_2 = 3$ by using the information of $f(x_1)$ and $f'(x_1)$ at $x_1 = 1$. Here, $f(1) = 4$ and $f'(1) = -4$. Then, the first-order approximation of $f(x_2)$ is:
$$f(3) \approx f(1) + f'(1)(3 - 1) = 4 - 4 \times 2 = -4.$$
We now use the second derivative to calculate the second-order approximation. Here, $f''(1) = 2$. Thus, the second-order approximation is given by:
$$f(3) \approx f(1) + f'(1)(3 - 1) + \frac{f''(1)}{2}(3 - 1)^2 = 4 - 4 \times 2 + \frac{2}{2}(3 - 1)^2 = 0.$$
The true value of $f(3)$ is zero, which is calculated by the exact function $f(x) = (x - 3)^2$. Thus, the approximation error, calculated by $|\text{approximate value} - \text{true value}|$, is equal to four for the first-order approximation. Adding the second derivative to the approximation function helps reduce the approximation error to zero. Another approach to reduce the approximation error is to use an $x_1$ value closer to $x_2$. Suppose that we now approximate $f(x_2)$ from $x_1 = 2$. Here, $f(2) = 1$, $f'(2) = -2$ and $f''(2) = 2$. Then, the first-order approximation for the case when $x_1 = 2$ is:
$$f(3) \approx f(2) + f'(2)(3 - 2) = 1 - 2 \times 1 = -1.$$
Thus, the first-order approximation error decreases from four to one if the $x_1$ value is changed from one to two. The second-order approximation for the case when $x_1 = 2$ is equal to:
$$f(3) \approx f(2) + f'(2)(3 - 2) + \frac{f''(2)}{2}(3 - 2)^2 = 1 - 2 \times 1 + \frac{2}{2}(3 - 2)^2 = 0.$$
Therefore, the second-order approximation error is still zero when the $x_1$ value is changed from one to two. As demonstrated by this example, the approximation error in Taylor’s expansion can be reduced by adding higher-order derivatives and/or making $x_1$ closer to $x_2$.
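The worked example for $f(x) = (x - 3)^2$ can be reproduced with a short sketch. The function name `taylor_approx` and its argument layout are illustrative assumptions; the code simply sums the Taylor terms for a supplied list of derivative values at $x_1$.

```python
def taylor_approx(derivs_at_x1, x1, x2):
    """Approximate f(x2) from [f(x1), f'(x1), f''(x1), ...] via Taylor's expansion."""
    total = 0.0
    factorial = 1.0
    for n, d in enumerate(derivs_at_x1):
        if n > 0:
            factorial *= n  # running value of n!
        total += d * (x2 - x1) ** n / factorial
    return total


def f(x):
    return (x - 3) ** 2


def f1(x):  # first derivative of f
    return 2 * (x - 3)


def f2(x):  # second derivative of f
    return 2.0


x2 = 3.0
for x1 in (1.0, 2.0):
    first = taylor_approx([f(x1), f1(x1)], x1, x2)
    second = taylor_approx([f(x1), f1(x1), f2(x1)], x1, x2)
    print(x1, first, abs(first - f(x2)), second, abs(second - f(x2)))
```

Because $f$ is a quadratic, the second-order approximation is exact for any $x_1$; for other functions the error only shrinks as more terms are added or as $x_1$ approaches $x_2$.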
2.2.2 Convexity
In optimization theory, convexity guarantees that any local minimum of a function is also a global minimum (Bazaraa et al., 2006). If a function is strictly convex and attains its minimum, there is a single point at which the minimum value is reached. For that reason, some optimization techniques (e.g. Newton’s method and gradient descent) rely on the properties of convexity (Judd, 1998). The function $f(x)$ is convex if it satisfies the following condition: for $x_i = t x_1 + (1 - t) x_2$ and $0 \le t \le 1$,
$$t f(x_1) + (1 - t) f(x_2) \ge f(t x_1 + (1 - t) x_2).$$
In other words, the value of point A in Fig. 2.7 is higher than the value of point B for any arbitrary point on the curve between $x_1$ and $x_2$.
Fig. 2.7 Convexity of a function
There are some properties of convex functions that are used in optimization methods. If $x_2 > x_1$ in the convex function, the slope at $x_2$ should always be higher than the slope at $x_1$. We recall that the first derivative of a function gives its slope. The second derivative gives the slope of the slope function. Therefore, the second derivative should always be positive in a convex function so that the slope at $x_2$ is always higher than the slope at $x_1$. In addition, $f(x)$ reaches its minimum value at the point where the slope is equal to zero. In other words, the minimum value is obtained when $f'(x) = 0$. Therefore, the minimum value for a function $f(x)$ is obtained at $x = x^*$ if the following conditions are satisfied:
$$f'(x^*) = 0 \quad \text{and} \quad f''(x^*) > 0.$$
If $f''(x)$ is positive for any x value, $x^*$ becomes a global minimum. If $f''(x^*)$ is positive but $f''(x)$ is not always positive for any x value, then there may be another point (e.g. $x_j$) with $f(x_j) < f(x^*)$. In that case, $x^*$ becomes a local (not global) minimum. Consider the function $f(x) = (x - 3)^2$. To understand whether it is convex, we need to check its second derivative. The second derivative is $f''(x) = 2$. Because it is positive regardless of the x value, $f(x)$ is a convex function. The first derivative of $f(x)$ is $f'(x) = 2(x - 3)$. It takes negative values when x is less than three; otherwise, it is positive. Given that $f'(x)$ is the slope of the function at point x, the value of $f(x)$ decreases as x increases for $x < 3$. The value of $f(x)$ increases with x when $x > 3$. Therefore, the minimum value of $f(x)$ is reached when $x = 3$. This is exactly the point at which the first derivative is equal to zero. Thus, the optimal value $x^*$ is found by:
$$x^* = \{x \mid f'(x) = 2(x - 3) = 0\} = 3.$$
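The chord condition that defines convexity can also be checked numerically. The sketch below is an illustration under stated assumptions: the function name `is_convex_on_grid`, the grid sizes and the tolerance are all hypothetical choices, and a grid check can only reveal violations on the sampled points, so it is a sanity check rather than a proof of convexity.

```python
def is_convex_on_grid(f, lo, hi, n=50, steps=11, tol=1e-9):
    """Check t*f(x1) + (1-t)*f(x2) >= f(t*x1 + (1-t)*x2) on a grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ts = [j / (steps - 1) for j in range(steps)]
    for x1 in xs:
        for x2 in xs:
            for t in ts:
                chord = t * f(x1) + (1 - t) * f(x2)      # point A on the chord
                curve = f(t * x1 + (1 - t) * x2)         # point B on the curve
                if chord < curve - tol:
                    return False
    return True


# f(x) = (x - 3)^2 has f''(x) = 2 > 0 everywhere, so no violation is found.
print(is_convex_on_grid(lambda x: (x - 3) ** 2, -5.0, 10.0))

# f(x) = 3x^3 - x has f''(x) = 18x, which is negative for x < 0,
# so the chord condition fails on an interval that includes negative x.
print(is_convex_on_grid(lambda x: 3 * x ** 3 - x, -2.0, 2.0))
```

The second function is still convex on the region $x > 0$, which is why its minimum at $x = 1/3$ is described in the text as a minimum for positive x values only.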
2.2.3 Newton’s Method
Newton’s optimization method is a well-known technique, and it is sometimes considered a powerful alternative to the gradient descent method (Judd, 1998). Suppose that we apply the first-order Taylor’s expansion to the derivative of a function (i.e. applied to $f'(x)$, not to $f(x)$). Then,
$$f'(x_{k+1}) \approx f'(x_k) + f''(x_k)(x_{k+1} - x_k).$$
The objective is to find the point for which $f(x)$ has the minimum value. The slope of the function becomes equal to zero at this point such that $f'(x) = 0$. Newton’s method starts with an initial value of $x_0$, which has to be determined by the modeller. It then aims to reach the optimal value of $x^*$ by iteration. We replace $f'(x_{k+1})$ with zero in the last expression to derive the iteration function because we want the iteration function to converge to the optimal value that satisfies $f'(x^*) = 0$. Thus,
the iteration function is given by (Kollerstrom, 1992; Judd, 1998):
$$0 = f'(x_k) + f''(x_k)(x_{k+1} - x_k),$$
$$-f'(x_k) = f''(x_k)(x_{k+1} - x_k).$$
Equivalently, $x_{k+1} = x_k - f'(x_k)/f''(x_k)$. For a given $x_0$ value (by setting $x_k = x_0$), $x_1$ can be found by using this expression. Then, $x_2$ can be found by setting $x_k = x_1$ and so on. The iteration continues until the stopping condition is satisfied:
$$|x_{k+1} - x_k| \le \epsilon,$$
where $\epsilon$ is a very small number, which should be specified by the modeller. Consider the function $f(x) = 3x^3 - x$ for which the minimum value is obtained at $x = 1/3$ for $x > 0$. Applying Newton’s method, the iterative function is written as follows:
$$x_{k+1} = (9x_k^2 + 1)/(18x_k).$$
Starting with an initial value $x_1 = 1$, the iteration converges to the optimal value in five steps with an $\epsilon$ value of 0.001: $x_2 = 0.556$, $x_3 = 0.378$, $x_4 = 0.336$, $x_5 = 0.33334$, $\cdots$, $x_{10} = 0.33333$.
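The iteration above can be sketched in a few lines of Python. The function name `newton_minimize` and the `max_iter` safeguard are illustrative assumptions and do not come from the book's online code; the update rule and the stopping condition follow the expressions given in this section.

```python
def newton_minimize(f1, f2, x_start, eps=1e-3, max_iter=100):
    """Iterate x_{k+1} = x_k - f'(x_k) / f''(x_k) until |x_{k+1} - x_k| <= eps.

    f1 and f2 are the first and second derivatives of the objective.
    Returns the list of all iterates, starting with x_start.
    """
    path = [x_start]
    x = x_start
    for _ in range(max_iter):
        x_next = x - f1(x) / f2(x)
        path.append(x_next)
        if abs(x_next - x) <= eps:
            break
        x = x_next
    return path


# f(x) = 3x^3 - x, so f'(x) = 9x^2 - 1 and f''(x) = 18x.
path = newton_minimize(lambda x: 9 * x ** 2 - 1, lambda x: 18 * x, x_start=1.0)
for k, value in enumerate(path, start=1):
    print(f"x{k} = {value:.5f}")
```

The second derivative must be non-zero at every iterate; starting this example from $x = 0$ would cause a division by zero.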
2.2.4 Gradient Descent Method
Newton’s method converges very quickly to the optimal value. However, its application is often limited because it requires the second derivative to be well defined. In some applications, finding the second derivative may not be possible due to the complexity of models. For instance, it is very difficult to compute the second derivative for the estimation of weights in neural networks although the calculation of the first derivative would be straightforward. In such settings, the gradient descent method can be considered a powerful alternative to Newton’s method to find the optimal value of a function (Judd, 1998).
We present the dynamics of the gradient descent method in Fig. 2.8. If the value of $x_k$ is less than $x^*$, the iteration function must return a higher value such that $x_{k+1} > x_k$. In the green region where $x_k < x^*$, the slope of $f(x)$ (hence the first derivative $f'(x)$) is negative. Therefore, the iteration function must use this information and return a higher value of $x_{k+1} > x_k$ when $x_k < x^*$. In the blue region where $x_k > x^*$, the slope of $f(x)$ is positive. The iteration function must return a lower value of $x_{k+1} < x_k$ by using this information. Therefore, the iteration function must:
• Increase the value of $x_k$ and return $x_{k+1} > x_k$ when $f'(x)$ is negative
• Decrease the value of $x_k$ and return $x_{k+1} < x_k$ when $f'(x)$ is positive
Fig. 2.8 Gradient descent method. When the slope is negative for $x_k$, the iterative function should return $x_{k+1} > x_k$. Otherwise, it should return $x_{k+1} < x_k$ until the stopping condition is met
Such an iteration function can be formalized as follows:
$$x_{k+1} = x_k - \theta f'(x_k),$$
where $\theta > 0$ is the learning rate. Its value must be specified prior to running the gradient descent algorithm. We again consider the function $f(x) = 3x^3 - x$ such that $f'(x) = 9x^2 - 1$. We set $\theta = 1/8$. The $\epsilon$ value that determines the stopping condition is set equal to 0.001. Then, the gradient descent algorithm converges to the optimal value in ten steps: $x_1 = 1$, $x_2 = 0$, $x_3 = 0.125$, $x_4 = 0.232$, $x_5 = 0.297$, $x_6 = 0.322$, $x_7 = 0.331$, $x_8 = 0.3326$, $x_9 = 0.3331$ and $x_{10} = 0.3332$. When we compare these results with those of Newton’s method, we see that the gradient descent method converges to the optimal value much more slowly. However, it does not require the second derivative, which makes it more popular in practice for training neural networks.
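A minimal sketch of this iteration is given below. The function name `gradient_descent` and the `max_iter` safeguard are illustrative assumptions; the update rule, the learning rate and the stopping condition are the ones stated above. Where exactly the loop stops depends on how the stopping condition is applied to the last pair of iterates, so the number of printed values may differ slightly from a hand count.

```python
def gradient_descent(f1, x_start, theta, eps=1e-3, max_iter=10_000):
    """Iterate x_{k+1} = x_k - theta * f'(x_k) until |x_{k+1} - x_k| <= eps.

    f1 is the first derivative of the objective.
    Returns the list of all iterates, starting with x_start.
    """
    path = [x_start]
    x = x_start
    for _ in range(max_iter):
        x_next = x - theta * f1(x)
        path.append(x_next)
        if abs(x_next - x) <= eps:
            break
        x = x_next
    return path


# f(x) = 3x^3 - x, so f'(x) = 9x^2 - 1; learning rate theta = 1/8.
path = gradient_descent(lambda x: 9 * x ** 2 - 1, x_start=1.0, theta=1 / 8)
for k, value in enumerate(path, start=1):
    print(f"x{k} = {value:.4f}")
```

The learning rate matters: a much larger $\theta$ can overshoot the minimum and diverge, while a much smaller one slows convergence further.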
2.2.5 Lagrange Optimization
The Lagrange optimization makes it possible to find the optimal value of a function when the optimal solution is restricted to satisfying a constraint (Bazaraa et al., 2006). We consider the following function:
$$f(x) = \frac{1}{3}x^2.$$
Suppose that we would like to find the optimal value of x that minimizes $f(x)$ subject to the constraint:
$$x \ge 4.$$
The optimal value that also satisfies the constraint is equal to $x^* = 4$. Thus, our constraint is binding in this example. If the constraint is replaced by $x \le 4$, the optimal solution becomes equal to $x^* = 0$. In that case, the constraint is not binding because the existence of the constraint (i.e. $x \le 4$) would not have any impact on the optimal solution. The Lagrange method combines the function $f(x)$ and the constraint $x \ge 4$ into an updated function:
$$J(x, \lambda) = \frac{1}{3}x^2 + \lambda(4 - x),$$
where $\lambda$ is the Lagrange multiplier that can take only non-negative values. The second term on the right-hand side of this expression is the penalty term. If the constraint $x \ge 4$ is violated, the objective function that aims to minimize $f(x)$ is penalized by a factor $\lambda(4 - x)$ for a non-negative value of $\lambda$. Thus, the x value that minimizes $J(x, \lambda)$ while satisfying $\lambda \ge 0$ is exactly the optimal value $x^*$ which minimizes $f(x)$ subject to the constraint $x \ge 4$. Then, the optimal value of x can be found by taking the first derivatives of $J(x, \lambda)$ as follows:
$$\frac{\partial J(x, \lambda)}{\partial x} = 0 = \frac{2}{3}x - \lambda,$$
$$\frac{\partial J(x, \lambda)}{\partial \lambda} = 0 = 4 - x.$$
Therefore,
$$x^* = 4 \quad \text{and} \quad \lambda^* = 8/3.$$
Once we obtain the $x^*$ and $\lambda^*$ values, we have to check the sign of $\lambda^*$. Having $\lambda^* > 0$ implies that the constraint $x \ge 4$ is binding at the optimality. Otherwise, the constraint is non-binding, and the optimal x value can be found by solving the unconstrained problem given by $f(x)$. Now, we consider another problem with a non-binding constraint. We would like to minimize the function:
$$f(x) = \frac{1}{3}(x - 2)^2,$$
subject to:
$$x \le 4.$$
The Lagrange form of this optimization problem is written as follows:
$$J(x, \lambda) = \frac{1}{3}(x - 2)^2 + \lambda(x - 4).$$
We first attempt to calculate the optimal values of x and $\lambda$ from the following expressions:
$$\frac{\partial J(x, \lambda)}{\partial x} = \frac{2}{3}(x - 2) + \lambda = 0,$$
$$\frac{\partial J(x, \lambda)}{\partial \lambda} = 0 = x - 4.$$
Hence,
$$x^* = 4 \quad \text{and} \quad \lambda^* = -4/3.$$
Here, the calculated $\lambda^*$ value is negative. However, $\lambda$ can only take non-negative values in the Lagrange model. For that reason, the $\lambda^*$ value is set equal to zero. In this case, the constraint $x \le 4$ is considered non-binding, and it should be removed from the problem. Therefore, the $x^*$ value that minimizes $f(x) = \frac{1}{3}(x - 2)^2$ is indeed the optimal solution. It is found by:
$$f'(x) = \frac{2}{3}(x - 2) = 0,$$
$$x^* = 2.$$
The Lagrange model can be considered a two-step optimization model. The first step involves the calculation of .x ∗ and .λ∗ . The second step is the adjustment stage such that the constraint must be removed from the problem and the unconstrained model must be solved when .λ∗ is negative. It is very powerful because it can be embedded into a computer algorithm to solve constrained optimization problems.
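The two-step procedure can be sketched for the family of problems used in this section, that is, minimizing $a(x - c)^2$ subject to a single bound on x. The function name `solve_quadratic_with_bound`, its parameters and the `sense` flag are hypothetical names introduced for illustration; the closed-form expressions inside follow directly from setting the two partial derivatives of $J(x, \lambda)$ to zero.

```python
def solve_quadratic_with_bound(a, c, b, sense):
    """Minimize a*(x - c)**2 subject to x >= b (sense='>=') or x <= b (sense='<=').

    Step 1: solve dJ/dx = 0 and dJ/dlam = 0 with the constraint treated as active.
    Step 2: if the multiplier is negative, drop the constraint and solve the
            unconstrained problem instead.
    Returns (x_star, lam_star).
    """
    if a <= 0:
        raise ValueError("a must be positive so that the objective is convex")
    if sense not in ("<=", ">="):
        raise ValueError("sense must be '<=' or '>='")

    # Write the constraint as s*(x - b) <= 0, so J = a*(x - c)**2 + lam*s*(x - b).
    s = 1.0 if sense == "<=" else -1.0

    # Step 1: dJ/dlam = 0 gives x = b; dJ/dx = 0 gives 2a(x - c) + lam*s = 0.
    x_star = float(b)
    lam_star = -2.0 * a * (x_star - c) / s

    # Step 2: a negative multiplier means the constraint is non-binding.
    if lam_star < 0:
        lam_star = 0.0
        x_star = float(c)  # unconstrained minimizer of a*(x - c)**2
    return x_star, lam_star


print(solve_quadratic_with_bound(1 / 3, 0.0, 4.0, ">="))  # binding constraint
print(solve_quadratic_with_bound(1 / 3, 2.0, 4.0, "<="))  # non-binding constraint
```

For general objectives and several constraints, the same logic is handled by solvers rather than by closed-form expressions; this sketch only mirrors the two examples above.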
2.3 Chapter Summary
In this chapter, we have reviewed some fundamental analytical theories that can be used for predictive and prescriptive modelling in supply chain management. We have started with regression models and discussed ordinary least squares (OLS), generalized least squares (GLS) and the two methods to deal with endogeneity problems (i.e. two-stage least squares (2SLS) and generalized method of moments (GMM)). The regression models can be used in supply chain management to predict some uncertain parameters, such as demand, by using explanatory variables when available. If it is not possible to identify any explanatory variable, uncertain
variables can still be predicted by using their historical values. Time series analysis makes this possible, which we have covered in this chapter. With advances in technology, companies can monitor the data of customers searching for their products in retail stores or online. This would make it possible to observe how customers form their product choices. Choice models fall into classification models, where the dependent variable is a categorical variable. We have discussed implementation and assessment of classification models here. We have ended this chapter with prescriptive analytics although a more comprehensive analysis of optimization models is beyond the scope of this book. Building on analytical concepts discussed in this chapter, we will analyse how to optimize supply chain decisions in the next chapters.
2.4 Practice Examples
Example 2.1 Describe how to handle categorical explanatory variables in linear and classification models.
Solution 2.1 Linear models are used when the dependent variable is continuous. If the dependent variable is categorical, then classification methods must be used. The type of model to be used for predictive analytics depends on the dependent variable, not explanatory variables. Categorical explanatory variables can be used in linear and classification models after transforming them to dummy variables. In Python, the procedure pandas.get_dummies can be used for this purpose. Suppose that a categorical variable has three classes: “A”, “B” and “C”. It takes values 1, 2 and 3 for these classes, respectively. Without getting dummy variables, force-fitting a model and estimating the coefficient for the categorical variable lead to wrong results. Suppose that the coefficient is 0.5. Thus, switching from class 1 to 2 has an impact of 0.5 on the dependent variable. Likewise, switching from class 2 to 3 has the same impact on the dependent variable. This is incorrect. Because they are different classes, switching from 1 to 2 would be very different from switching from 2 to 3. For that reason, a binary dummy variable must be created for each class to correctly measure the actual impact of each class of a categorical variable on the dependent variable.
Example 2.2 Show that the independence assumption of linear models leads to the ordinary least squares estimation of the coefficients: $B_{OLS} = (X^T X)^{-1} X^T Y$.
Solution 2.2 The independence assumption is that error terms are independent from explanatory variables. Mathematically speaking, it means that error terms must be orthogonal to explanatory variables such that:
$$X \perp E \rightarrow X^T E = 0.$$
Given that $Y = XB + E$, we can replace $E$ with $Y - XB$ in the last expression. Then,
$$X^T (Y - XB) = 0,$$
$$X^T Y - X^T X B = 0,$$
$$(X^T X)^{-1} X^T X B = (X^T X)^{-1} X^T Y,$$
$$B = (X^T X)^{-1} X^T Y.$$
The last expression derived from the independence assumption is exactly the same as $B_{OLS}$. Therefore, the independence assumption complements the other assumptions of linear models so the results would be consistent when alternative derivations are applied to estimate the coefficients.
Example 2.3 Suppose that a data science intern aims to develop a linear model to predict a variable of interest y. While recording the values of explanatory variables, the intern makes a mistake such that the measured values $\tilde{X}$ deviate from the true values $X$ by a factor of $\psi$: $\tilde{X} = X + \psi$. The intern also forgets to add an important explanatory variable Z. What is the combined impact of measurement error and omitted variable bias on the consistency of estimates?
Solution 2.3 The full model that involves the omitted variable and the true value of X is given by:
$$Y = XB + Z\theta + E.$$
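Using the orthogonality condition on the full model, we have the following result:
$$X^T E = (\tilde{X} - \psi)^T (Y - (\tilde{X} - \psi)B - Z\theta) = 0,$$
$$(\tilde{X} - \psi)^T (\tilde{X} - \psi) B = (\tilde{X} - \psi)^T Y - (\tilde{X} - \psi)^T Z\theta.$$
Thus, consistent estimates of coefficients are given by:
$$B_{CONS} = \left[(\tilde{X} - \psi)^T (\tilde{X} - \psi)\right]^{-1} (\tilde{X} - \psi)^T Y - \left[(\tilde{X} - \psi)^T (\tilde{X} - \psi)\right]^{-1} (\tilde{X} - \psi)^T Z\theta.$$
The OLS estimates of the coefficients that are based on the wrong model developed by the intern are given by:
$$B_{OLS} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T Y.$$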
Fig. 2.9 MAPE values for varying α
Therefore, the estimation bias due to the endogeneity problem is:
$$\text{Bias} = B_{OLS} - B_{CONS}.$$
Example 2.4 Use the dataset data_regularization.csv (available in the online web application) to develop a Lasso model in Python. The dataset includes 30 observations of a dependent variable and 12 independent (explanatory) variables. Split the dataset into train and test sets. Use 20 observations to train the model and the remaining 10 observations to test the performance of the model based on the mean absolute percentage error (MAPE). Report the best α value for the Lasso model that minimizes the MAPE on the test set.
Solution 2.4 The Python codes are given in the online web application. The MAPE values for varying α are also shown in Fig. 2.9. The α value that minimizes the MAPE is 7.14, and the corresponding MAPE is 16.65%.
Example 2.5 Use the dataset small_retailers_stock_performance.csv to develop the two-stage least squares (2SLS) and GMM models in Python that handle the endogeneity problem for the operational and financial analysis of small retailers. The stock return of small retailers would be impacted by operating profits and inventory turnover. However, it is not clear how the interaction between inventory turnover and operating profits would affect the stock return of small retailers. The hypothesis is that the retailers with a high operating profit and high inventory turnover would generate high returns in the stock market. When the inventory turnover is low, high operating profits would delude investors that the high profitability is sustainable. Therefore, the stock return must be low in that case. Use “Stock Change” as the dependent variable and “Inventory Turnover”, “Operating Profit” and “Interaction Effect” as independent variables. The interaction effect is equal to [Inventory turnover] × [Operating profit]. The inventory turnover is an endogenous variable. There are three instruments, “Current Ratio”, “Quick Ratio” and “Debt-to-Asset Ratio”, that can be used to solve the endogeneity problem for the 2SLS and GMM models.
Solution 2.5 The model for this problem is formalized as follows:
$$\text{Stock change} = \beta_0 + \beta_1 \times [\text{Inventory turnover}] + \beta_2 \times [\text{Operating profit}] + \beta_3 \times [\text{Interaction effect}] + \epsilon.$$
The Python codes for this example are given in the online web application. The coefficients estimated by the 2SLS and GMM models are given in Table 2.2.
Table 2.2 Results for the 2SLS and GMM models
                                        β0        β1       β2        β3
Two-stage least squares (2SLS)          −0.0176   0.0011   −0.1201   0.0014
Generalized method of moments (GMM)     −0.0200   0.0011   −0.1071   0.0011
The constant term β0 is not statistically significant in either model, as can be observed in the Python codes. The other coefficients are all significant at the 10% significance level. The results obtained by either model support the hypothesis that the retailers with a high operating profit and high inventory turnover would generate high returns in the stock market. In the GMM model, for example, the coefficient of the operating profits is negative: β2 = −0.1071. However, the interaction effect [Inventory turnover] × [Operating profit] has a positive coefficient: β3 = 0.0011. If the inventory turnover exceeds 0.1071/0.0011 = 97.36 for a small retailer (which is especially the case for food retailers and restaurant chains), an increase in operating profits has a positive effect on stock returns. Suppose that inventory turnover is equal to 100. Then, the operating profit terms of the model are given by:
$$-0.1071 \times [\text{Operating profit}] + 0.0011 \times 100 \times [\text{Operating profit}] = 0.0029 \times [\text{Operating profit}].$$
Thus, an increase in operating profits has a positive impact on stock returns. If the inventory turnover is lower than 97.36, an increase in operating profits has a negative effect on stock returns. Suppose that it is 50. Then,
$$-0.1071 \times [\text{Operating profit}] + 0.0011 \times 50 \times [\text{Operating profit}] = -0.0521 \times [\text{Operating profit}].$$
Thus, an increase in operating profits has a negative impact on stock returns. We observe similar dynamics when we use the 2SLS estimates of coefficients. Thus, our hypothesis is supported by both 2SLS and GMM models after addressing the endogeneity problems.
Example 2.6 Use the classification dataset from the sklearn library in Python: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris. Develop a logistic regression model by using the whole dataset and report the F1 score.
Solution 2.6 In the Iris dataset, there are one dependent variable and four explanatory variables. The dependent variable has three classes. Thus, the logistic regression model is formalized as follows:
$$\Pr(y = 1) = \frac{e^{\beta_{(0,1)} + \beta_{(1,1)} x_1 + \beta_{(2,1)} x_2 + \beta_{(3,1)} x_3 + \beta_{(4,1)} x_4}}{\sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x_1 + \beta_{(2,j)} x_2 + \beta_{(3,j)} x_3 + \beta_{(4,j)} x_4}},$$
$$\Pr(y = 2) = \frac{e^{\beta_{(0,2)} + \beta_{(1,2)} x_1 + \beta_{(2,2)} x_2 + \beta_{(3,2)} x_3 + \beta_{(4,2)} x_4}}{\sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x_1 + \beta_{(2,j)} x_2 + \beta_{(3,j)} x_3 + \beta_{(4,j)} x_4}},$$
$$\Pr(y = 3) = \frac{e^{\beta_{(0,3)} + \beta_{(1,3)} x_1 + \beta_{(2,3)} x_2 + \beta_{(3,3)} x_3 + \beta_{(4,3)} x_4}}{\sum_{j=1}^{3} e^{\beta_{(0,j)} + \beta_{(1,j)} x_1 + \beta_{(2,j)} x_2 + \beta_{(3,j)} x_3 + \beta_{(4,j)} x_4}}.$$
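To make the three expressions concrete, the sketch below evaluates them for one observation using the coefficient estimates reported in Table 2.3. The function name `softmax_probabilities` is an illustrative assumption, as is the ordering of $x_1, \ldots, x_4$ (assumed here to follow the sklearn column order: sepal length, sepal width, petal length, petal width). The measurement vector is only an illustrative input.

```python
import math

# Coefficients (beta_0, beta_1, ..., beta_4) for classes 1, 2 and 3, from Table 2.3.
COEFFS = [
    (9.854, -0.424, 0.966, -2.517, -1.079),
    (2.233, 0.535, -0.321, -0.206, -0.944),
    (-12.088, -0.110, -0.645, 2.723, 2.023),
]


def softmax_probabilities(x, coeffs=COEFFS):
    """Return [Pr(y=1), Pr(y=2), Pr(y=3)] for one observation x = (x1, x2, x3, x4)."""
    scores = [b[0] + sum(bi * xi for bi, xi in zip(b[1:], x)) for b in coeffs]
    # Subtracting the largest score leaves the ratios unchanged and avoids overflow.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative observation (assumed column order as described above).
probs = softmax_probabilities((5.1, 3.5, 1.4, 0.2))
print([round(p, 4) for p in probs])
```

The predicted class is the one with the highest probability, and the three probabilities always sum to one because they share the same denominator.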
We remark that this model corresponds to the softmax normalization as Python returns the results according to the softmax normalization. The estimates of the coefficients are given in Table 2.3. The F1 score is 0.9733. The Python codes are available in the online web application.
Table 2.3 Logistic regression estimates of coefficients
             β0        β1       β2       β3       β4
Pr(y = 1)    9.854     −0.424   0.966    −2.517   −1.079
Pr(y = 2)    2.233     0.535    −0.321   −0.206   −0.944
Pr(y = 3)    −12.088   −0.110   −0.645   2.723    2.023
Example 2.7 Load the data of Amazon’s share prices between 2000 and 2021 from Yahoo Finance by using pandas_datareader in Python. Plot share prices, autocorrelation (ACF) and partial autocorrelation (PACF). Then, report log-likelihood values for ARIMA(1,1,1), ARIMA(1,2,1), ARIMA(1,0,1) and ARIMA(1,1,10) models.
Solution 2.7 Python codes of this problem are given in the online web application. The log-likelihood values are given in Table 2.4.
Table 2.4 Log-likelihood values of time series models
Model            Log-likelihood
ARIMA(1,1,1)     −7964
ARIMA(1,2,1)     −7969
ARIMA(1,0,1)     −7975
ARIMA(1,1,10)    −7938
Fig. 2.10 Convergence performance of Newton’s method
The results indicate that the ARIMA(1,1,10) model yields the highest log-likelihood value. Thus, it must be preferred over the others.
Example 2.8 Consider the function $f(x) = 3x^3 - x$. Develop Newton’s algorithm in Python for this function, and plot the number of iterations depending on the error tolerance $\epsilon$ for the values between $10^{-10}$ and 0.1.
Solution 2.8 The iteration function is given above, which is:
$$x_{k+1} = (9x_k^2 + 1)/(18x_k).$$
As discussed earlier, the minimum value of $f(x)$ for positive x values is obtained at $x = 1/3$. Python codes of this example are given in the online web application. Newton’s method converges to the optimal value very fast. Even for a very low tolerance such that $\epsilon = 10^{-10}$, it takes six iterations to converge to the optimal value as shown in Fig. 2.10.
Example 2.9 Consider again the function $f(x) = 3x^3 - x$. Develop the gradient descent algorithm in Python for this function (with the learning rate $\theta = 1/8$), and plot the number of iterations depending on the error tolerance $\epsilon$ for the values between $10^{-10}$ and 0.1.
Solution 2.9 The iteration function is written as follows:
$$x_{k+1} = x_k - \theta (9x_k^2 - 1).$$
Fig. 2.11 Convergence performance of the gradient descent
Python codes for this example are given in the online web application. Unlike Newton’s method, the gradient descent algorithm converges slowly to the optimal value as shown in Fig. 2.11. For example, it takes 19 iterations to converge to the optimal value when $\epsilon = 10^{-10}$, while Newton’s method converges in 6 iterations for the same error tolerance.
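The comparison in Examples 2.8 and 2.9 can be sketched without the plotting step as follows. This is not the code from the online web application; the helper `count_iterations`, the starting value of 1 and the way iterations are counted (one count per update until the stopping condition holds) are assumptions, so the exact counts may differ by one or two from those in Figs. 2.10 and 2.11 depending on the counting convention.

```python
def count_iterations(step, x_start, eps, max_iter=10_000):
    """Apply x_{k+1} = step(x_k) until |x_{k+1} - x_k| <= eps.

    Returns (number of updates performed, final value).
    """
    x = x_start
    for k in range(1, max_iter + 1):
        x_next = step(x)
        if abs(x_next - x) <= eps:
            return k, x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter updates")


def newton_step(x):
    # Newton update for f(x) = 3x^3 - x
    return (9 * x * x + 1) / (18 * x)


def gd_step(x, theta=1 / 8):
    # Gradient descent update for f(x) = 3x^3 - x with learning rate theta
    return x - theta * (9 * x * x - 1)


tolerances = [10.0 ** (-p) for p in range(1, 11)]  # 0.1 down to 1e-10
newton_counts = [count_iterations(newton_step, 1.0, e)[0] for e in tolerances]
gd_counts = [count_iterations(gd_step, 1.0, e)[0] for e in tolerances]

for e, n_newton, n_gd in zip(tolerances, newton_counts, gd_counts):
    print(f"eps = {e:.0e}: Newton {n_newton} updates, gradient descent {n_gd} updates")
```

The two lists can be passed to any plotting library with the tolerance on a logarithmic axis to obtain curves comparable to the two figures.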
2.5 Exercises
1. Consider the function $f(x) = 4x^4 + (2x - 5)^2 - 19$. Is $f(x)$ a convex function?
2. Write the iteration function of $f(x)$ based on Newton’s method.
3. Write the iteration function of $f(x)$ based on the gradient descent algorithm.
4. Develop Python codes for $f(x)$ that use Newton’s method to converge to the optimal value for $\epsilon = 0.001$.
5. Develop Python codes for $f(x)$ that use the gradient descent algorithm to converge to the optimal value for $\epsilon = 0.001$ and $\theta = 1/8$.
References
Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On making causal claims: A review and recommendations. The Leadership Quarterly, 21(6), 1086–1120.
Barrett, J. P. (1974). The coefficient of determination: Some limitations. The American Statistician, 28(1), 19–20.
Bazaraa, M. S., Sherali, H. D., & Shetty, C. (2006). Nonlinear programming: Theory and applications (3rd ed.). Wiley.
Böhning, D. (1992). Multinomial logistic regression algorithm. Annals of the Institute of Statistical Mathematics, 44(1), 197–200.
Box, G. E., & Pierce, D. A. (1970). Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65(332), 1509–1526.
Chatfield, C. (2004). The analysis of time series: An introduction (6th ed.). Chapman & Hall/CRC.
De Myttenaere, A., Golden, B., Le Grand, B., & Rossi, F. (2016). Mean absolute percentage error for regression models. Neurocomputing, 192, 38–48.
Deng, X., Liu, Q., Deng, Y., & Mahadevan, S. (2016). An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Information Sciences, 340, 250–261.
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32(2), 407–499.
Escudero, L. F., Galindo, E., García, G., Gomez, E., & Sabau, V. (1999). Schumann, a modeling framework for supply chain management under uncertainty. European Journal of Operational Research, 119(1), 14–34.
Greene, W. H. (2017). Econometric analysis (8th ed.). Pearson.
Hamilton, J. D. (1994). Time series analysis. Princeton University Press.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50(4), 1029–1054.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. Springer Series in statistics (2nd ed.).
Judd, K. L. (1998). Numerical methods in economics. The MIT Press.
Kollerstrom, N. (1992). Thomas Simpson and ‘Newton’s method of approximation’: An enduring myth. The British Journal for the History of Science, 25(3), 347–354.
Pardoe, I. (2021). Applied regression modeling (3rd ed.). Wiley.
Pregibon, D. (1981). Logistic regression diagnostics. The Annals of Statistics, 9(4), 705–724.
Ramsey, F. L. (1974). Characterization of the partial autocorrelation function. The Annals of Statistics, 2(6), 1296–1301.
Shumway, R. H., & Stoffer, D. S. (2011). Time series analysis and its applications (3rd ed.). Springer.
Silvey, S. D. (1975). Statistical inference. Chapman & Hall/CRC.
Sodhi, M. S., Son, B.-G., & Tang, C. S. (2008). ASP, the art and science of practice: What employers demand from applicants for MBA-level supply chain jobs and the coverage of supply chain topics in MBA courses. Interfaces, 38(6), 469–484.
Strang, G. (2019). Linear algebra and learning from data. Wellesley-Cambridge Press.
Train, K. E. (2009). Discrete choice methods with simulation (2nd ed.). Cambridge University Press.
3 Inventory Management Under Demand Uncertainty
Keywords
Inventory productivity · Newsvendor problem · Base stock model · Reorder model · Monte Carlo simulation
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_3

Inventory is considered a collection of raw materials, semi-components and finished goods that are kept in stock with the intention of being sold to customers. Manufacturers keep finished goods inventory on hand in order to fulfil customer demand immediately. They also keep raw materials and semi-components inventory to produce finished goods. Retailers typically hold only finished goods inventory, so they directly meet customer demand in retail stores. In the retail industry, for example, the total inventory value represents approximately 36% of total assets (Gaur et al., 2005).

In management accounting, inventory is regarded as a short-term asset because it is stocked for a limited time duration before being sold to customers. The average time duration that inventory is kept in stock is referred to as days of inventory, which is one of the important elements of the supply chain framework discussed in Chap. 1. We recall from Chap. 1 that when the sum of supply lead time and days of inventory is longer than the demand lead time (in other words, when the decision lead time is positive), inventory decisions are made in the face of demand uncertainty. Demand uncertainty disappears when customers place their orders and decision-makers learn the exact value of demand. In practice, inventory decisions are usually made before this happens, which makes demand uncertainty a critical element of inventory management practices.

The main trade-off in inventory management under demand uncertainty is between excess inventory and stockouts. Companies incur excess inventory charges if their inventory level exceeds market demand. The cost of excess inventory is tremendous in both the manufacturing and retail industries. In 2013, BlackBerry announced an inventory write-off of $1 billion as the market demand for its newly launched BlackBerry 10 model turned out to be much lower than expected (Connors & Terlep, 2013). Following the announcement, the company cut its workforce by 40% (amounting to 4500 employees), and the stock price fell by 17%. In the retail industry, it was reported that Amazon was destroying around 120,000 unsold items every week in only one of its fulfilment centres (Hassan, 2021).

When the inventory level is less than the market demand, companies lose the opportunity to sell their products at full price. Therefore, stockouts cause lost sales in the short term. However, they may also have a long-term negative impact on profits. When customers are exposed to a stockout, they tend to switch to a competitor and never come back 60% of the time (Kapner, 2015). The annual cost of stockouts in the retail industry is estimated to be $634 billion (Gustafson, 2015).

Figure 3.1 summarizes the different cost components of excess inventory and stockouts. Excess inventory costs can be categorized into two different groups. First, inventory holding costs cover the costs of keeping and storing inventory. To store inventory, companies incur warehousing costs such as rent, labour and warehouse machinery costs. Second, companies also sacrifice interest earnings when they tie up capital in inventory, which is referred to as capital costs. Warehousing and capital costs constitute a substantial portion of inventory holding costs.

Fig. 3.1 Cost components of excess inventory and stockouts

Excess inventory also causes obsolescence costs at the end of a product’s life. The value of a product would be much lower than the cost at the end of its life. For example, perishable goods are thrown away after their sell-by dates. Fashion goods and electronics would lose much of their value at the end of their selling seasons. If a product becomes worthless at the end of its life (e.g. perishable goods), it must be written off in the accounting statements so the value of excess inventory is updated to be equal to zero. If a product has a positive salvage value at the end of its life (e.g.
fashion goods and electronics), it must be written down so the value of the excess inventory is reduced, though not to zero. The obsolescence costs due to the loss of a product’s value are often referred to as the write-off and write-down costs. When
a product becomes obsolete, it may have to be destroyed due to regulations. In the pharmaceutical industry, for example, obsolete items should be destroyed according to public health regulations, which causes disposal costs. Therefore, the write-off, write-down and disposal costs constitute obsolescence costs that companies incur at the end of a product’s life. Stockout costs can be examined in three different groups. First, customers who are exposed to a stockout are more likely to cancel their purchases (Anderson et al., 2006). Such customers tend to cancel their purchases of not only the out-of-stock item but also the other items in their shopping baskets. These costs are called the cost of lost sales. Stockouts also have a negative impact on long-term profits because customers who are exposed to a stockout tend to switch to a competitor. Even if they do not switch to a competitor completely, they are more likely to order fewer items and spend less (Anderson et al., 2006). Therefore, stockouts also cause long-term costs due to the loss of customer goodwill and loyalty. Customers who experience a stockout may be convinced to backorder the products, rather than cancelling their purchases. Companies often give coupons to induce their customers to backorder stockout items, which in turn reduces profits. Once a customer places an out-of-stock item on backorder, the order would be expedited through an express delivery and/or overtime work. These additional logistics and labour costs constitute the backorder penalty costs along with the costs of incentives (i.e. the cost of backorder coupons) that make customers accept backordering the items. In inventory management, single-period models are based on the assumption that excess demand is lost. These types of models are designed for products with a short selling season, and there is only one replenishment that occurs before the selling season ends. 
When a customer experiences a stockout, it is not possible to reorder the product to fulfil the customer demand. Therefore, backordering out-of-stock items is technically impossible for single-period models. On the other hand, excess demand is often assumed to be backordered in multi-period models because there is more than one replenishment in such models. If a customer is exposed to a stockout and willing to wait until the next replenishment to buy the product, the customer demand can be fulfilled later.
3.1 Inventory Productivity and Financial Performance
Excess inventory and stockout costs have a substantial negative impact on operating profits. According to a market survey conducted by IHL Group in 2015, the total annual cost of excess inventory and stockouts is estimated to be more than $1 trillion in the retail industry globally (Gustafson, 2015). Companies that mitigate the excess inventory and stockout risks are naturally expected to outperform their competitors. Inventory productivity has a direct positive impact on the financial performance of firms (Alan et al., 2014). One of the important metrics executives and investors monitor to assess inventory productivity is inventory turnover (Gaur et al., 2005). Inventory turnover is formulated as the ratio of cost of goods sold to
average inventory:

\[
\text{Inventory turnover} = \frac{\text{Cost of goods sold}}{\text{Average inventory level}}.
\]

Fig. 3.2 Inventory turnover for Dillard’s and Toys “R” Us between 2004 and 2016
The popularity of this metric comes from the fact that its value for a publicly traded firm can be calculated by looking at the financial statements. The cost of goods sold during a year is reported on income statements, while the inventory value is reported on balance sheets. This measure indicates how many times in a year inventory is fully replenished. Companies with relatively high inventory turnover have high inventory productivity such that they can generate high revenues with little inventory. Low inventory turnover indicates that inventory may be stocked excessively, causing a high inventory holding cost and the risk of inventory obsolescence. Because inventory is a short-term asset, turnover has a direct impact on key financial measures such as return on assets. Figure 3.2 depicts the inventory turnover for two US-based retailers—Dillard’s and Toys “R” Us—from 2004 to 2016. The turnover value varies over time, and it is expected to fluctuate within an interval in which the boundaries are determined by upper and lower thresholds. A downside movement that crosses the lower threshold would signal difficulty in generating positive operating profits. This was the case for Toys “R” Us in 2012 when the inventory turnover reduced below the lower threshold. The downward trend has continued since then. An upside movement that exceeds the upper threshold indicates a positive change regarding the operational aspects of the business, and it would be linked to an increase in operating profits. This happened to Dillard’s in 2008, and the company maintained high turnover rates until 2015. Figure 3.3 shows the return on assets (ROA) for the same two retailers between 2010 and 2016. The ROA for Toys “R” Us turned negative in 2012 and remained
negative until 2016. One year later, in 2017, the company announced its bankruptcy (Morgenson & Rizzo, 2018). On the other hand, Dillard’s retained a positive ROA during the same time period and maintained a healthy and sustainable financial position. Although the bankruptcy of Toys “R” Us and the financial stability of Dillard’s cannot be explained by their inventory productivity alone, inventory turnover provides valuable signals regarding the operational performance of these two firms, which has a direct impact on the financials.

Fig. 3.3 Return on assets for Dillard’s and Toys “R” Us between 2010 and 2016

Figures 3.2 and 3.3 show the positive association between inventory turnover and return on assets using the information provided in financial statements. A direct connection between those two measures can also be constructed via a DuPont analysis (Soliman, 2008). The DuPont model breaks down the ROA into three parts as follows:

\[
\text{Return on assets} = \frac{\text{Net profit}}{\text{Cost of goods sold}} \times \frac{\text{Cost of goods sold}}{\text{Average inventory}} \times \frac{\text{Average inventory}}{\text{Total assets}}.
\]
Multiplication of all three terms on the right-hand side yields net profit divided by total assets, which is equal to ROA. The first term is the markup ratio, the second one the inventory turnover, and the last one the percentage of assets tied up in inventory. The DuPont equation demonstrates that inventory turnover has a direct impact on ROA, which is moderated by profitability (i.e. measured by the markup ratio) and the percentage of total assets that inventory constitutes. Consider a wholesaler that does not own any fixed assets and uses a line of credit to maintain its operations. In this case, inventory is the only asset, and the last term on the right-hand side is equal to one. If the wholesaler can increase its inventory turnover, higher ROA can be achieved, even when the profit margin is relatively low. For asset-heavy businesses, however, the last term may have a very low value, and a high inventory turnover alone would not be enough to improve ROA substantially. Relating inventory turnover to ROA, the DuPont model shows us how to interpret the turnover values of companies. First, looking only at the inventory turnover does not give us enough information to assess the performance of a firm. For that reason, analysts and investors should look for some benchmark firms that operate in the same industry with similar markup ratios and asset structures. Then, the inventory turnover of the firm should be compared with those of the benchmark firms. Some
successful hedge funds (especially those focusing on the retail industry) apply such comparisons to assess the inventory risk of firms before making any investment (Raman et al., 2006).
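The DuPont decomposition can be illustrated with a short Python sketch. The function name `dupont_roa` and all financial figures below are hypothetical and chosen only for illustration; they do not come from any company discussed in this chapter.

```python
def dupont_roa(net_profit, cogs, avg_inventory, total_assets):
    """Split return on assets into the three DuPont terms described in the text."""
    markup_ratio = net_profit / cogs              # profitability term
    inventory_turnover = cogs / avg_inventory     # inventory productivity term
    inventory_share = avg_inventory / total_assets  # share of assets tied up in inventory
    roa = markup_ratio * inventory_turnover * inventory_share
    return markup_ratio, inventory_turnover, inventory_share, roa


# Hypothetical figures in $ millions (assumed for illustration only).
markup, turnover, share, roa = dupont_roa(
    net_profit=50.0, cogs=1000.0, avg_inventory=250.0, total_assets=800.0
)
print(f"markup ratio       = {markup:.4f}")
print(f"inventory turnover = {turnover:.2f}")
print(f"inventory share    = {share:.4f}")
print(f"return on assets   = {roa:.4f}")
```

Because the intermediate terms cancel, the product always equals net profit divided by total assets. The decomposition is useful for seeing which of the three terms drives a change in ROA.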
3.2 Single-Period Models
Inventory management under demand uncertainty has been investigated by scholars from different angles in different real-world settings. The most basic form of such settings is the single-period model, which is also known as the newsvendor model (Scarf, 1958). In a newsvendor problem, a decision-maker should determine the optimal inventory level Q for a product for which the market demand is uncertain at the time the decision is made. The newsvendor setting is relevant in practice for products with a short selling season or perishable items that cannot be carried over to the next selling period. The demand value is denoted by D. Due to demand uncertainty, demand can be higher or lower than the inventory level. If demand is lower than Q, there will be excess inventory of \(Q - D\) units at the end of the period. If demand turns out to be higher than Q, there will be lost sales of \(D - Q\) units.

We consider a single product that is sold in the market at a price of p per unit. The cost per unit is c. The residual value for the unsold items at the end of the selling period is equal to s per unit. In the fashion apparel industry, for example, excess inventory is often sold at a discount such that the discounted prices can be lower than the costs. For some perishable items, such as food, excess inventory is often thrown away, rendering the residual value equal to zero. In the pharmaceutical industry, however, the residual value is often negative because medicine has to be destroyed after the sell-by date. Therefore, pharmaceutical companies incur a salvage cost for throwing away excess inventory. In a newsvendor model, total profit made at the end of the period is formulated as follows:

\[
\Pi = p \min(Q, D) - cQ + s \max(Q - D, 0). \tag{3.1}
\]
The first term on the right-hand side of this expression gives the total revenue made by regular sales. The sales amount is the minimum of demand and inventory level. The second term is total costs. The last term is the total revenue made by selling excess inventory at its residual value. The term \(\max(Q - D, 0)\) gives excess inventory, which takes a positive value when \(Q > D\) and zero otherwise.

Suppose, for example, that a computer shop starts to sell a new laptop model. The store manager orders laptops once from an offshore manufacturer and sells the items for a 3-month period. The demand is estimated to be 50 laptops with a probability of 25%, 100 laptops with a probability of 50% and 150 laptops with a probability of 25%. The selling price is $125 per laptop; the cost is $50 per laptop; and the residual value is $25 per laptop. Suppose also that the store manager orders 125 laptops.
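As a quick numerical check of this example, the following Python sketch evaluates Eq. (3.1) for each demand scenario and weights the outcomes by their probabilities. The function name `newsvendor_profit` and the variable names are illustrative choices, not part of the original text.

```python
def newsvendor_profit(Q, D, p, c, s):
    """Profit from Eq. (3.1): full-price sales, purchase cost, residual revenue."""
    return p * min(Q, D) - c * Q + s * max(Q - D, 0)


p, c, s = 125, 50, 25                         # price, cost, residual value per laptop
Q = 125                                       # order quantity chosen by the manager
scenarios = {50: 0.25, 100: 0.50, 150: 0.25}  # demand level -> probability

expected_profit = 0.0
for demand, prob in scenarios.items():
    profit = newsvendor_profit(Q, demand, p, c, s)
    print(f"demand={demand:>3}  probability={prob:.2f}  profit=${profit:,.0f}")
    expected_profit += prob * profit

print(f"expected profit = ${expected_profit:,.0f}")
```

The same calculation is carried out by hand in the scenario-by-scenario reasoning that follows.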
• With a probability of 0.25, the demand turns out to be 50 laptops, and the store manager can only sell 50 units at the full price. The remaining amount (i.e. \(125 - 50 = 75\)) is sold at the residual value.
• With a probability of 0.5, the demand turns out to be 100 laptops, and the store manager can only sell 100 units at the full price. The remaining amount (i.e. \(125 - 100 = 25\)) is sold at the residual value.
• With a probability of 0.25, the demand turns out to be 150 laptops, and the store manager can sell all 125 units at the full price.

Therefore, the expected profit is:

\[
\begin{aligned}
E(\Pi) &= 125\,(0.25 \times 50 + 0.5 \times 100 + 0.25 \times 125) - 50 \times 125 + 25\,(0.25 \times 75 + 0.5 \times 25) \\
&= \$6250.
\end{aligned}
\]

We assume in this example that demand can only take three different values with certain probabilities. In practice, demand values vary within some intervals, and a probability function can be used to represent the demand density. We use a random variable x and a function f(x) to denote the random demand parameter and the density function, respectively. When demand uncertainty is low, the normal distribution may fit well with empirical data. If x follows a normal distribution with a mean of μ and a standard deviation of σ, the density function f(x) is given by:

\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}.
\]
Figure 3.4 shows demand density according to a normal distribution with a mean of 500 units and a standard deviation of 100 units. The normal distribution assigns higher densities to the demand values that are closer to the mean value, resulting in the bell-shaped curve. We use F(x) to denote the cumulative distribution function such that F(x) is equal to the probability that demand will be less than or equal to x. Therefore, F(x) ranges between zero and one as depicted in Fig. 3.5. For example, \(F(0) \approx 0\), \(F(500) = 0.5\) and \(F(1000) \approx 1\). Therefore, the probabilities that demand is less than 0, 500 and 1000 units are approximately 0%, 50% and 100%, respectively. The inverse function of F(x) is denoted \(F^{-1}(\cdot)\), which returns the demand level for a given cumulative probability value. For example, \(F^{-1}(0.5) = 500\), while cumulative probabilities very close to 0 and 1 correspond to demand levels of roughly 0 and 1000 units, respectively.

Fig. 3.4 Demand density according to a normal distribution with a mean of 500 units and a standard deviation of 100 units

Fig. 3.5 Cumulative distribution function of a normal distribution with a mean of 500 units and a standard deviation of 100 units

The normal distribution has a standardized form, which is called the standard normal distribution. The standard normal distribution has a mean of zero and a standard deviation of one. It is possible to transform any normal distribution into the standard one. The density and cumulative distribution functions of the standard normal distribution are denoted by φ(·) and Φ(·), respectively. Because μ = 0 and σ = 1, the density function of the standard normal distribution has a simplified form:

\[
\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^{2}}.
\]
Suppose that demand follows a normal distribution with a mean of μ and a standard deviation of σ. Then, the probability of demand being less than D is equal to F(D). Using the standard normal distribution, the same probability value can be obtained by the change of variables such that:

\[
F(D) = \Phi(z), \quad \text{where } z = \frac{D - \mu}{\sigma}.
\]
In inventory management, decision-makers sometimes determine order quantities to maintain an in-stock probability. In-stock probability for an item is the probability that the product is available in stock. An in-stock probability of 80% indicates that the product is in stock 80% of the time when it is demanded by customers. The order quantity to maintain an in-stock probability of α is equal to \(Q = F^{-1}(\alpha)\). Using the standard normal transformation given by the last expression, the order quantity is found by:

\[
Q = \mu + \Phi^{-1}(\alpha)\,\sigma.
\]
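The order quantity formula can be sketched in Python as follows. This example relies on `statistics.NormalDist` from the standard library so that it runs without extra packages. The function name `order_quantity_normal` and the target in-stock probability of 80% with a mean of 500 and a standard deviation of 100 are assumptions made for illustration.

```python
from statistics import NormalDist


def order_quantity_normal(mu, sigma, alpha):
    """Order quantity Q = mu + z * sigma, where z is the standard normal quantile of alpha."""
    z = NormalDist().inv_cdf(alpha)   # inverse of the standard normal CDF
    return mu + z * sigma


mu, sigma, alpha = 500, 100, 0.80     # illustrative demand parameters and target
Q = order_quantity_normal(mu, sigma, alpha)

demand = NormalDist(mu, sigma)
print(f"order quantity Q   = {Q:.1f}")
print(f"check: F(Q)        = {demand.cdf(Q):.3f}")          # should reproduce alpha
print(f"direct inverse CDF = {demand.inv_cdf(alpha):.1f}")  # same Q without standardizing
```

Both routes, standardizing first or inverting the demand distribution directly, give the same quantity, which is the point of the change of variables.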
In inventory management under demand uncertainty, this formulation has been widely used in alternative forms. For example, the optimal order quantity in the newsvendor model corresponds to a stock level for which in-stock probability is equal to a critical fractile value (its derivation will be shown later in this section). Once the critical fractile value is calculated for a given set of cost parameters, the optimal order quantity can be derived by replacing the α value in the last expression with the critical fractile.

When the uncertainty is high, however, the normal distribution may show a poor fit to empirical data for two reasons (Gallego et al., 2007). First, the normal distribution assigns a non-negligible probability to negative values when the uncertainty is high, even though demand cannot be negative for any product. The probability of a negative demand value should always be zero, and the statistical distribution has to guarantee this. Second, the normal distribution is symmetric around the mean value. In reality, when demand uncertainty is high, demand is expected to be positively skewed because it cannot be less than zero but can have large values. High uncertainty often results from demand variability for high values. To overcome the normal distribution’s problems, the lognormal distribution is often used when demand uncertainty is high. When a random variable x follows a lognormal distribution, its density function is given by:

\[
f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(x)-\mu}{\sigma}\right)^{2}}.
\]
Figure 3.6 presents the lognormal density with a location parameter of 5.97 and a scale parameter of 0.70. These values correspond to a mean of 500 units and a standard deviation of 400 units.¹ Unlike the normal distribution, the lognormal distribution does not directly use the mean value and standard deviation as its parameters. The normal distribution is characterized by two parameters: the mean and the standard deviation. The lognormal distribution is also characterized by two parameters: the location and scale parameters. Suppose that a random variable x follows a lognormal distribution. The location parameter is the mean of the natural logarithm transformation of x. If we transform the random variable such that \(y = \ln(x)\), the location parameter of the lognormal distribution of x is equal to the mean of y. The scale parameter of the lognormal distribution is the standard deviation of y.

Fig. 3.6 Demand density according to a lognormal distribution with a mean of 500 and a standard deviation of 400. Lognormal parameters are 5.97 (location parameter) and 0.70 (scale parameter)

We now use F(x) to denote the cumulative distribution function of the lognormal distribution that has a location parameter of μ and a scale parameter of σ. The probability that demand is less than D is given by F(D), which can also be written as follows:

\[
F(D) = \Phi(z), \quad \text{where } z = \frac{\ln(D) - \mu}{\sigma}.
\]

Then, the order quantity to maintain an in-stock probability of α for a lognormal distribution is found by the exponential function:

\[
Q = e^{\mu + \Phi^{-1}(\alpha)\,\sigma}.
\]

¹ The mean demand for the lognormal distribution is given by \(e^{\mu + \sigma^{2}/2}\), while the variance is \((e^{\sigma^{2}} - 1)\,e^{2\mu + \sigma^{2}}\).
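The link between the mean and standard deviation of demand and the lognormal parameters can be sketched in Python by inverting the mean and variance formulas given in the footnote. The function names `lognormal_params` and `order_quantity_lognormal` and the 80% in-stock target are illustrative assumptions. `statistics.NormalDist` is used for the standard normal quantile.

```python
import math
from statistics import NormalDist


def lognormal_params(mean, std):
    """Location and scale parameters implied by a given mean and standard deviation.

    Obtained by inverting mean = exp(mu + sigma^2 / 2) and
    variance = (exp(sigma^2) - 1) * exp(2 * mu + sigma^2).
    """
    sigma_sq = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


def order_quantity_lognormal(mu, sigma, alpha):
    """Order quantity Q = exp(mu + z * sigma) for a target in-stock probability alpha."""
    return math.exp(mu + NormalDist().inv_cdf(alpha) * sigma)


mu, sigma = lognormal_params(mean=500, std=400)
print(f"location = {mu:.2f}, scale = {sigma:.2f}")

alpha = 0.80                                   # illustrative in-stock target
Q = order_quantity_lognormal(mu, sigma, alpha)
in_stock = NormalDist().cdf((math.log(Q) - mu) / sigma)
print(f"Q = {Q:.1f}, implied in-stock probability = {in_stock:.2f}")
```

For a mean of 500 and a standard deviation of 400, the recovered parameters round to the location and scale values quoted for Fig. 3.6.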
Fig. 3.7 Demand density according to a normal distribution with a mean of 500 and a standard deviation of 400
The lognormal distribution guarantees that demand values can never be negative. The demand curve is also positively skewed such that positive probability values are assigned to large demand realizations. These characteristics of the lognormal distribution are consistent with empirical observations of demand. To compare it with the normal distribution, we present the demand density for a normal distribution with a mean of 500 units and a standard deviation of 400 units in Fig. 3.7. As shown in the figure, positive probabilities are assigned to demand values between −1000 and 0, although negative demand is impossible in practice.

In the newsvendor problem, decision-makers aim to optimize the inventory level Q under demand uncertainty to maximize expected profits given a functional form of demand distribution along with the price and cost parameters. Suppose that the initial inventory level is set to zero and an order of Q units is placed. The demand has a mean value of 100 units and a standard deviation of 10 units. If the decision-maker orders only one unit, the expected profit made on the ordered amount is equal to \(p - c\) because the probability that demand is lower than one unit is practically zero. The decision-maker is therefore virtually certain that the one unit ordered will be sold at the full price and a profit of \(p - c\) will be made. If the decision-maker orders 99 units, the expected profit made by ordering 1 additional unit is less than \(p - c\) because the probability that demand is lower than 100 units is positive and the 1 additional unit may not be sold at the full price. The expected profit increase from stocking one additional unit after \(Q - 1\) units have already been ordered is defined as the marginal profit. The marginal profit decreases with the order quantity. When it becomes equal to zero, the optimal order quantity is reached. We formulate the marginal profit as follows:

\[
\Delta = p\,(1 - F(Q)) + s\,F(Q) - c.
\]
The first term on the right-hand side is the expected revenue earned by ordering one additional unit once \(Q - 1\) units have already been ordered. If demand exceeds Q, marginal revenue of p is collected. Therefore, we multiply p by the probability of demand exceeding Q to find the expected revenue. If demand turns out to be less than Q, there is a residual value of s for unsold inventory. The second term is thus equal to the expected residual value, which is found by multiplying s by the probability of demand being less than Q. The last term is the cost of the product: the decision-maker incurs the cost of c regardless of whether the last ordered unit is sold at the full price. Setting \(\Delta = 0\), the optimal order quantity, denoted by \(Q^*\), is found as follows:

\[
F(Q^*) = \frac{p - c}{p - s},
\]
which is famously known as the critical fractile solution of the newsvendor problem (Scarf, 1958). The analytical solution to this problem, as illustrated by Scarf (1958), is given in the appendix at the end of this chapter. The marginal analysis approach yields the same result, which is also intuitive and can be easily extended to different forms of inventory problems. The total profit function given by Eq. (3.1) can be rewritten in the following form after rearranging its terms:

\[
\Pi = (p - c)\,Q - (p - s)\max(Q - D, 0). \tag{3.2}
\]
The first term in this expression can be interpreted as maximum profit that can be made if all units ordered are sold at the selling price. However, demand may turn out to be lower than Q, and the seller may end up with excess inventory. In this case, the excess amount is sold at the residual value of s. The second term in this expression can be interpreted as the value of an insurance policy that fully hedges the seller against the inventory risk. To derive the expected value of total profit, we need to take a partial integral on the second term of the expression. In the appendix, we present the partial integration derivations and the expected profit formulas for both normal and lognormal distributions.
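Two claims made above can be checked numerically with a small Python sketch: that Eqs. (3.1) and (3.2) give the same profit for any demand realization, and that the marginal profit is zero at the critical fractile quantity. The function names and the price parameters (p = 10, c = 4, s = 1) are hypothetical. Demand is assumed to be normal with a mean of 100 and a standard deviation of 10, as in the marginal analysis discussion.

```python
from statistics import NormalDist


def profit_eq_3_1(Q, D, p, c, s):
    """Profit written as sales revenue minus cost plus residual revenue."""
    return p * min(Q, D) - c * Q + s * max(Q - D, 0)


def profit_eq_3_2(Q, D, p, c, s):
    """Profit written as maximum profit minus the loss on excess inventory."""
    return (p - c) * Q - (p - s) * max(Q - D, 0)


def marginal_profit(Q, demand, p, c, s):
    """Expected profit from the last unit ordered: p(1 - F(Q)) + sF(Q) - c."""
    F = demand.cdf(Q)
    return p * (1 - F) + s * F - c


p, c, s = 10.0, 4.0, 1.0                 # hypothetical price, cost, residual value
demand = NormalDist(mu=100, sigma=10)

# The two profit expressions agree whether demand is below, at or above Q.
for D in (80, 100, 120):
    print(D, profit_eq_3_1(100, D, p, c, s), profit_eq_3_2(100, D, p, c, s))

# The marginal profit changes sign at the critical fractile quantity.
Q_star = demand.inv_cdf((p - c) / (p - s))
for Q in (Q_star - 5, Q_star, Q_star + 5):
    print(f"Q = {Q:7.2f}  marginal profit = {marginal_profit(Q, demand, p, c, s):+.4f}")
```

The marginal profit is positive below the critical fractile quantity and negative above it, which is why ordering beyond that point reduces expected profit.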
3.2.1 Single-Period Example
Suppose that a retailer sells a fashion product during a short selling season at a price of $100 per unit. The retailer purchases the items from a supplier at a cost of $50 per unit. Unsold inventory at the end of the selling season is sold at a discount store at a price of $25. The retailer orders the products before the selling season when the demand is uncertain. Due to the short selling season and long supply lead time, there is no opportunity to replenish inventory during the season.
We first consider the case in which demand follows a normal distribution with a mean of 500 units and a standard deviation of 100 units. To calculate the optimal order quantity, we need to find the critical fractile value such that:

\[
\alpha^* = \frac{100 - 50}{100 - 25} = 0.667.
\]

The inverse of the standard normal distribution for the critical fractile value is:

\[
z^* = \Phi^{-1}(0.667) = 0.432.
\]

The inverse function for the normal distribution is built into many software packages. In Python, norm.ppf(·) from the scipy.stats module returns the inverse of the standard normal distribution. In R and MS Excel, qnorm(·) and NORM.S.INV(·) serve the same purpose, respectively. Then, the optimal order quantity is calculated as follows:

\[
Q^* = \mu + z^*\sigma = 500 + 0.432 \times 100 = 543.2.
\]
This result should be rounded up to the next integer value because the order quantity cannot have decimals. Thus, the optimal order quantity should be set equal to 544 units. In Fig. 3.8, we present expected profits for varying quantity values when demand follows the normal distribution with a mean of 500 units and a standard deviation of 100 units. The maximum profit is obtained for the order quantity of 544 units, which is the optimal order quantity as calculated above.

We now assume that demand follows a lognormal distribution with a location parameter of 5.97 and a scale parameter of 0.70. We assume the same cost parameters so that \(\alpha^* = 0.667\) and \(z^* = 0.432\). The optimal order quantity for the lognormal distribution is:

\[
Q^* = e^{5.97 + 0.432 \times 0.70} = 529.75.
\]

This result is rounded up to the next integer, and the optimal order quantity becomes equal to 530 units. Figure 3.9 depicts the expected profit for varying order quantity when demand follows the lognormal distribution with a location parameter of 5.97 and a scale parameter of 0.7. The maximum profit is obtained at the optimal order quantity calculated above. The online web application that complements this book includes the examples given in Figs. 3.8 and 3.9. The reader can specify the cost and demand parameters and observe the resulting curve for both the normal and lognormal distribution. The Python codes are also given in the web application.
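The calculations in this example can be reproduced with a minimal Python sketch. It is not the code from the web application. It uses `statistics.NormalDist` from the standard library in place of `scipy.stats.norm.ppf`, and the variable names are illustrative. Because the critical fractile is not rounded to three decimals here, the intermediate values can differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

p, c, s = 100.0, 50.0, 25.0            # price, cost, residual value per unit

alpha_star = (p - c) / (p - s)         # critical fractile
z_star = NormalDist().inv_cdf(alpha_star)

# Normal demand: mean of 500 units, standard deviation of 100 units.
q_normal = 500 + z_star * 100

# Lognormal demand: location parameter 5.97, scale parameter 0.70.
q_lognormal = math.exp(5.97 + z_star * 0.70)

print(f"critical fractile = {alpha_star:.3f}, z* = {z_star:.3f}")
print(f"normal:    Q* = {q_normal:.2f} -> order {math.ceil(q_normal)} units")
print(f"lognormal: Q* = {q_lognormal:.2f} -> order {math.ceil(q_lognormal)} units")
```

Rounding up with `math.ceil` follows the convention used in the text.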
Fig. 3.8 Expected profit curve for varying order quantity when demand follows a normal distribution with a mean of 500 and a standard deviation of 100
Fig. 3.9 Expected profit curve for varying order quantity when demand follows a lognormal distribution with a location parameter of 5.97 and a scale parameter of 0.7
3.3 Multi-period Models
A single-period model is used for products with a short selling season such that it is not possible to sell unsold inventory at the full price at the end of the selling season. Fashion products with short life cycles and perishable products fall into this category. Durable goods, however, can be sold over multiple selling seasons at their regular selling prices. Thus, decision-makers replenish the inventory of durable goods periodically, and they have to determine the replenishment quantities at each period. Demand for each period is uncertain and is characterized by a statistical distribution. In such a setting, a periodic review base stock policy (also referred to as order-up-to model) can optimize inventory decisions (Hadley & Whitin, 1963).

Figure 3.10 shows the order and replenishment dynamics in a multi-period model such that replenishment orders are delivered after a supply lead time of 1 week. The blue curve represents inventory position, and the red curve shows on-hand inventory. In this setting, there are three elements of inventory management at play. First, on-hand inventory is the amount available in stock. Customer demand is directly fulfilled from on-hand inventory. Second, in-transit inventory is the amount that has already been ordered from the supplier but has not been received yet. Third, backorders represent the amount of customer demand that is backordered due to a shortage of on-hand inventory. Combining these three elements, we develop the formula for the inventory position:

\[
\text{Inventory position} = \text{On-hand inventory} + \text{In-transit inventory} - \text{Backorders}.
\]
The duration between two consecutive ordering opportunities is set equal to one period. In Fig. 3.10, there are three periods with the same duration: \(T_1 = T_2 - T_1 = T_3 - T_2\). At the beginning of each period, the decision-maker reviews the inventory position and places a replenishment order. The order quantity is determined in such a way that the inventory position becomes equal to a base stock level after the order is placed. The base stock level is denoted by S, as shown in the figure. Because the decision-maker brings the inventory position back to the base stock level at each decision epoch, the order quantity is equal to the demand realized since the previous order.

Fig. 3.10 Order and replenishment dynamics for a multi-period model with backorders and a supply lead time of 1 week. The blue curve represents the inventory position, and the red curve shows the on-hand inventory

The inventory position and on-hand inventory are both set to S at time 0 in the figure. Demand until the next replenishment is fulfilled from on-hand inventory. We use \(D_i\) to denote the demand realized between \(T_{i-1}\) and \(T_i\). The quantity ordered at time \(T_1\) is equal to the customer demand satisfied between time 0 and \(T_1\). In mathematical terms, \(Q_1 = D_1\). Likewise, the quantity ordered at time \(T_2\) is equal to the customer demand that is fulfilled between \(T_1\) and \(T_2\) (i.e. \(Q_2 = D_2\)) and so on. Because the supply lead time is equal to one period, the quantity ordered at time \(T_1\) is received at time \(T_2\). Thus, on-hand inventory at time \(T_2\) after receiving \(Q_1\) units from the supplier is \(S - D_2\). With this amount of on-hand inventory, the decision-maker aims to satisfy demand \(D_3\), which is the demand realized between \(T_2\) and \(T_3\). If \(D_3\) exceeds \(S - D_2\), the company incurs a penalty cost for each unit backordered.

As discussed previously, customers that are exposed to a stockout are sometimes offered discounts. Such discounts can be a very effective way to convince the customers to return to the store to buy the items when they are back in stock. The cost of discounts for companies can be interpreted as the backorder penalty cost. Additionally, companies may expedite the deliveries of backordered items or ask their employees to work overtime to fulfil backorders. These activities cause additional costs that should be included in the backorder penalty cost.
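The replenishment dynamics described above can be traced with a short Python sketch. It assumes a supply lead time of exactly one period and full backordering, as in the text. The function name `simulate_base_stock`, the base stock level of 100 units and the demand values are hypothetical. Net inventory denotes on-hand inventory minus backorders, so a negative value indicates backorders.

```python
def simulate_base_stock(S, demands):
    """Trace a periodic review base stock policy with a one-period lead time."""
    net_inventory = S   # on-hand inventory minus backorders at time 0
    in_transit = 0      # quantity ordered at the previous review, not yet received
    history = []
    for period, demand in enumerate(demands, start=1):
        net_inventory -= demand             # demand D_i is served or backordered
        end_of_period = net_inventory       # level just before the delivery arrives
        net_inventory += in_transit         # order placed one period ago is received
        inventory_position = net_inventory  # nothing is in transit right after receipt
        order = S - inventory_position      # raise the inventory position back to S
        in_transit = order
        history.append((period, demand, end_of_period, net_inventory, order))
    return history


demands = [30, 45, 20, 60]   # hypothetical demand realizations D_1, ..., D_4
print("period  demand  before receipt  after receipt  order")
for period, demand, before, after, order in simulate_base_stock(100, demands):
    print(f"{period:>6}  {demand:>6}  {before:>14}  {after:>13}  {order:>5}")
```

In each period the order equals the demand just realized, and the net inventory after receipt equals the base stock level minus that demand, matching the relations \(Q_i = D_i\) and \(S - D_i\) derived above.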
In the multi-period model, backorders are allowed for both practical and technical reasons. In practice, backorders are commonly observed when inventory is replenished periodically. Customers that experience a stockout can come later to purchase the product when it is replenished. From a technical perspective, the optimal base stock level, denoted by \(S^*\), can be derived analytically when backorders are allowed. If excess demand is lost and the supply lead time is positive, there is no closed-form solution to the base stock model. In such a case, Monte Carlo simulation can be applied to improve the replenishment decisions.

If \(D_3\) is less than \(S - D_2\), the company incurs an inventory holding cost. This cost element covers the warehousing, material handling and capital costs tied up in inventory. Unlike the single-period model, obsolescence costs are not included in the holding cost because the multi-period model is primarily used for durable goods that do not have any obsolescence risk. We use b and h to denote the backorder and holding costs per unit per period, respectively. Then, the sum of holding and backorder costs is formulated as follows:

\[
\Pi = h \max(S - D_{i-1} - D_i, 0) + b \max(D_{i-1} + D_i - S, 0).
\]
The cost function includes the demand term \(D_{i-1} + D_i\). We recall that the supply lead time is set equal to one period in this example. Thus, the demand term \(D_{i-1} + D_i\) is the demand over \(l + 1\) periods, where \(l\) is the supply lead time. We use \(f_{l+1}(\cdot)\) and \(F_{l+1}(\cdot)\) to denote the probability density and distribution functions of demand over \(l + 1\) periods, respectively. The marginal cost (of ordering one additional unit) is then written as follows:

\[\Delta\Pi(S) = h F_{l+1}(S) - b\,(1 - F_{l+1}(S)).\]
The first term on the right-hand side is the expected cost of increasing the order-up-to level by one unit once it was set equal to \(S - 1\) units. If the demand over \(l + 1\) periods turns out to be less than \(S\), the decision-maker cannot sell this additional unit, and she incurs a holding cost. Therefore, the expected marginal cost is calculated by multiplying the holding cost with the probability of the demand being less than the base stock level (i.e. \(h F_{l+1}(S)\)). The second term is the expected value of increasing the order-up-to level by one unit once it was set equal to \(S - 1\) units. Because \(\Pi\) is defined as a cost function, this term has a negative sign. If demand over \(l + 1\) periods turns out to be higher than \(S\), the additional unit avoids the backorder for one customer. Therefore, the expected marginal value is found by multiplying the backorder cost with the probability of the demand being more than the base stock level (i.e. \(b(1 - F_{l+1}(S))\)). The \(\Delta\Pi\) value is equal to \(-b\) for \(S = 0\) and \(h\) for \(S = +\infty\). It increases monotonically from \(-b\) to \(h\) as \(S\) is increased. The optimal base stock level \(S^*\) is found by setting \(\Delta\Pi = 0\):

\[F_{l+1}(S^*) = \frac{b}{b + h}.\]
This critical fractile solution is known as the multi-period extension to the newsvendor problem (Hadley & Whitin, 1963; Clark, 1958). We apply the marginal analysis to derive the expression. The same result can also be obtained by the analytical solution, which is given in the appendix along with the expected cost derivations for both the normal and lognormal distribution cases.
3.3.1 Multi-period Example
Suppose that a retailer sells a durable good in a market with uncertain demand. The retailer replenishes inventory weekly. The supply lead time is 3 weeks. Therefore, the retailer places an order with the supplier every week, and it takes 3 weeks to receive each order. Weekly demand is independent and identically distributed, and it follows a normal distribution with a mean of 100 units and a standard deviation of 50 units. Demand that cannot be met from stock is backordered. The backorder cost is $3 per backordered unit. The holding cost to carry over one unit of inventory for 1 week is $1. We first calculate the critical fractile for the given holding and backorder costs:

\[\alpha^* = \frac{b}{b + h} = \frac{3}{3 + 1} = 0.75.\]
The inverse of the standard normal distribution for the critical fractile is:

\[z^* = \Phi^{-1}(0.75) = 0.674.\]

As explained previously, we can use norm.ppf(·), qnorm(·) and NORM.S.INV(·) to get the inverse function value in Python, R and MS Excel, respectively. We now calculate the demand parameters for \(l + 1 = 4\) weeks. The mean demand for 4 weeks is \(\mu_{l+1} = 100 \times 4 = 400\). The standard deviation of demand for 4 weeks is \(\sigma_{l+1} = 50 \times \sqrt{4} = 100\). Then, the optimal base stock level is obtained as follows:

\[S^* = \mu_{l+1} + z^* \sigma_{l+1} = 400 + 0.674 \times 100 = 467.4.\]
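The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than the code of the web application; the function name `base_stock_normal` and its argument names are assumptions made here for readability.

```python
from math import ceil, sqrt

from scipy.stats import norm


def base_stock_normal(mean, std, lead_time, b, h):
    """Base stock level solving F_{l+1}(S*) = b / (b + h) for i.i.d. normal demand.

    mean, std : per-period demand parameters
    lead_time : supply lead time l in periods (demand is covered over l + 1 periods)
    b, h      : backorder and holding cost per unit per period
    """
    periods = lead_time + 1
    mu = mean * periods              # mean demand over l + 1 periods
    sigma = std * sqrt(periods)      # std deviation over l + 1 independent periods
    z = norm.ppf(b / (b + h))        # inverse standard normal at the critical fractile
    return mu + z * sigma


s_star = base_stock_normal(mean=100, std=50, lead_time=3, b=3, h=1)
print(round(s_star, 1), ceil(s_star))
```

Changing `lead_time`, `b` or `h` shows how the base stock level reacts to a longer lead time or a different cost ratio.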
We round this result up to the nearest integer so the optimal base stock level is set equal to 468 units. Figure 3.11 shows the expected cost curve for a varying base stock level, where the minimum expected cost is obtained for \(S = 468\) units.

Fig. 3.11 Expected cost curve for varying base stock level when demand for 1 period follows a normal distribution with a mean of 100 units and a standard deviation of 50 units

We now consider a lognormal distribution such that demand during a week follows a lognormal distribution with a location parameter of 4.494 and a scale parameter of 0.472. The mean and standard deviation of weekly demand are calculated based on the mean and standard deviation formulas of the lognormal distribution:

\[\text{Mean demand} = e^{4.494 + 0.5 \times 0.472^2} = 100,\]

\[\text{Std deviation} = \sqrt{\left(e^{0.472^2} - 1\right) e^{2 \times 4.494 + 0.472^2}} = 50.\]
The mean and standard deviation of demand over \(l + 1 = 4\) weeks are equal to \(100 \times 4 = 400\) and \(50 \times \sqrt{4} = 100\), respectively. The scale and location parameters for demand over 4 weeks are then obtained as follows:

\[\sigma_{l+1} = \sqrt{\ln\left((100/400)^2 + 1\right)} = 0.246,\]

\[\mu_{l+1} = \ln(400) - 0.246^2/2 = 5.96.\]

The optimal base stock level is then obtained as follows:

\[S^* = e^{\mu_{l+1} + z^* \sigma_{l+1}} = e^{5.96 + 0.674 \times 0.246} = 457.5.\]
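The lognormal steps can be checked with a short Python sketch. The helper names `lognormal_moments` and `lognormal_params` are assumptions introduced here; as in the text, demand over 4 weeks is treated as lognormal with a matched mean and standard deviation. Because the sketch carries full precision instead of the rounded values 5.96 and 0.246, it returns a base stock level slightly above the 457.5 obtained by hand.

```python
from math import exp, log, sqrt

from scipy.stats import norm


def lognormal_moments(mu, sigma):
    """Mean and standard deviation of a lognormal with location mu and scale sigma."""
    mean = exp(mu + sigma**2 / 2)
    std = sqrt((exp(sigma**2) - 1) * exp(2 * mu + sigma**2))
    return mean, std


def lognormal_params(mean, std):
    """Location and scale parameters that reproduce a given mean and std deviation."""
    sigma = sqrt(log((std / mean) ** 2 + 1))
    mu = log(mean) - sigma**2 / 2
    return mu, sigma


# Weekly demand: location 4.494, scale 0.472 (should be close to mean 100, std 50).
weekly_mean, weekly_std = lognormal_moments(4.494, 0.472)

# Demand over l + 1 = 4 weeks, using the rounded weekly mean and std as in the text.
periods = 4
mu_l, sigma_l = lognormal_params(100 * periods, 50 * sqrt(periods))

z_star = norm.ppf(3 / (3 + 1))              # critical fractile b / (b + h)
s_star = exp(mu_l + z_star * sigma_l)       # optimal base stock level

print(round(weekly_mean, 1), round(weekly_std, 1))
print(round(mu_l, 3), round(sigma_l, 3), round(s_star, 1))
```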
This result is rounded up to the nearest integer value, which makes the optimal base stock level equal to 458 units. Figure 3.12 shows the expected cost curve for a varying base stock level, where the minimum expected cost is obtained for \(S = 458\) units. The online web application of this book includes the examples given in Figs. 3.11 and 3.12. The reader can specify the cost and demand parameters and the supply lead time. The Python code is also given in the web application so that readers can run it on their own Python platform and modify it if needed.

Fig. 3.12 Expected cost curve for a varying base stock level when demand for one period follows a lognormal distribution with a location parameter of 4.494 and a scale parameter of 0.472

3.4 Reorder Model with Continuous Review

In the base stock model reviewed in the previous section, a decision-maker places an order at the beginning of each period to bring the inventory position up to the base stock level. One of the assumptions of the base stock model is that there is no fixed ordering cost. In practice, however, companies often incur a fixed cost for each order, independent of the order quantity. For example, the procurement and logistics teams plan and manage the delivery of each replenishment. Suppliers or third-party logistics (3PL) providers also charge companies the transshipment costs for each shipment separately. These labour and transshipment costs constitute the fixed ordering costs. When there is a fixed ordering cost in addition to the purchasing cost (calculated on a per-unit basis), the decision-maker may aim to reduce the frequency of orders in order to minimize the fixed ordering costs. The reorder model with continuous review optimizes the ordering policy of companies when there is a fixed cost and the decision-maker has the flexibility to place an order at any time (Gavish & Graves, 1980).

Figure 3.13 depicts the ordering and replenishment dynamics for the reorder model. The blue curve represents the inventory position, and the green curve shows the on-hand inventory level. In this setting, the decision-maker has to determine both the reorder level \(R\) and order quantity \(Q\). The inventory position is monitored continuously, and a replenishment order of \(Q\) units is placed whenever the inventory position drops to the reorder level \(R\). Each order is delivered after a lead time of \(l\) periods.

Unlike the newsvendor and base stock models, there are two decision variables in the reorder model that need to be optimized together. The optimization balances two different trade-offs that occur in the reorder model. The first trade-off is between the fixed and supply-demand mismatch costs. When the inventory is replenished more frequently, customer demand can be fulfilled without keeping excessive inventory. This would in turn help reduce the supply-demand mismatches. However, the number of replenishments and the total fixed costs increase with the replenishment frequency. Therefore, increasing the replenishment frequency helps reduce the supply-demand mismatches, but it causes an increase in the total fixed ordering costs.
Fig. 3.13 Order and replenishment dynamics for the reorder model. The reordering level is R units and order quantity is Q units
The second trade-off is between the backorder and inventory holding costs during the supply lead time after each replenishment. There is a backorder risk after each replenishment when the demand during the supply lead time exceeds the reorder level. If the demand turns out to be lower than the reorder level, the company incurs inventory holding costs. Therefore, increasing the reorder level helps reduce the backorders, but it causes an increase in the inventory level and the associated inventory holding costs.

One of the challenges associated with the reorder model is that the duration between two consecutive orders is not fixed. In Fig. 3.13, we refer to this duration as the ordering cycle. If demand is high, inventory will be depleted quickly, which in turn reduces the ordering cycle. If demand is low, inventory will be kept in stock for a long time, resulting in longer ordering cycles. The average length of the ordering cycle is found by dividing the order quantity by the average demand rate, that is, \(Q/\lambda\) with \(\lambda\) denoting the average demand rate. In the base stock model, we recall that the holding cost parameter is defined as the cost per unit per period. Thus, the holding cost is the cost of carrying over one unit of inventory for the time duration between two consecutive orders, which is an ordering cycle of fixed length in the base stock model. In the reorder model, however, the holding cost parameter \(h\) should be defined as the cost rate (i.e. per unit per time) because the ordering cycle varies due to demand uncertainty. The cost rate should be consistent with the demand rate \(\lambda\). For example, \(\lambda\) is often set equal to the expected annual demand. Then, \(h\) should be defined as the annual holding cost of one unit of inventory. Or, if \(\lambda\) is defined as the expected monthly demand, \(h\) should be the monthly holding cost of one unit of inventory.
The total expected holding cost is calculated by multiplying the cost rate \(h\) with the average inventory. Average inventory is found by:

\[\text{Average inventory} = R - \lambda l + Q/2.\]

The derivation of this formula can be explained as follows. Each replenishment order placed when the inventory position drops to \(R\) units arrives after the supply lead time of \(l\) periods. The average demand during the supply lead time is \(\lambda l\) units. Thus, the expected on-hand inventory level immediately before an order arrives is equal to \(R - \lambda l\) units. The expected on-hand inventory immediately after an order arrives is equal to \(R - \lambda l + Q\). Therefore, on-hand inventory fluctuates between \(R - \lambda l\) and \(R - \lambda l + Q\), on average, and it is depleted with a demand rate of \(\lambda\). For that reason, the average inventory is calculated by finding the average value of \(R - \lambda l\) and \(R - \lambda l + Q\), which amounts to \(R - \lambda l + Q/2\).

If demand during the supply lead time exceeds the reorder level, the company incurs a backorder cost. The backorder cost parameter is time independent. When a customer is exposed to a stockout, the demand will be backordered. It will later be fulfilled after the replenishment. A backorder cost of \(b\) is incurred for each unit of demand backordered regardless of whether it will be fulfilled in a short or long time period. We use \(K\) and \(D\) to denote the fixed ordering cost and the random variable of demand during the supply lead time, respectively. Then, the sum of fixed, holding and backorder costs is formulated as follows:

\[\Pi(Q, R) = K \frac{\lambda}{Q} + h\,(R - \lambda l + Q/2) + b\,\frac{\lambda}{Q}\,E(\max(D - R, 0)).\]
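To make the three cost terms concrete, the following Python sketch evaluates the total cost for normally distributed lead-time demand. The function names and the parameter values are illustrative assumptions, not taken from the web application. The mean lead-time demand `mu` plays the role of \(\lambda l\), and the expected backorders use the standard closed form for a normal distribution, \(\sigma(\phi(z) - z(1 - \Phi(z)))\) with \(z = (R - \mu)/\sigma\).

```python
from scipy.stats import norm


def expected_shortage_normal(R, mu, sigma):
    """E[max(D - R, 0)] when lead-time demand D is normal with mean mu and std sigma."""
    z = (R - mu) / sigma
    return sigma * (norm.pdf(z) - z * norm.sf(z))


def reorder_cost(Q, R, K, h, b, lam, mu, sigma):
    """Expected cost per unit time: fixed ordering + holding + backorder terms."""
    ordering = K * lam / Q                        # K * (number of orders per unit time)
    holding = h * (R - mu + Q / 2)                # h * average inventory, mu = lam * l
    backorder = b * lam / Q * expected_shortage_normal(R, mu, sigma)
    return ordering + holding + backorder


# Illustrative (assumed) parameters: annual demand 400, lead-time demand N(100, 50).
params = dict(K=100, h=1, b=3, lam=400, mu=100, sigma=50)
for Q in (200, 300, 400):
    print(Q, round(reorder_cost(Q, R=130, **params), 1))
```

Evaluating the function on a grid of \(Q\) and \(R\) values gives a quick visual check of where the cost surface is flat before any optimality condition is applied.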
The first term on the right-hand side of this expression gives the expected annual ordering cost (assuming that \(\lambda\) is defined as the expected demand for 1 year). The term \(\lambda/Q\) is the average number of replenishments over a year. So, multiplying it with the fixed ordering cost amounts to the expected annual ordering cost. The second term is the expected annual holding cost, which is found by multiplying the cost rate \(h\) (i.e. holding cost per unit per year) with the average inventory level. The last term gives the expected annual backorder cost. Backorders may occur after each replenishment order during the supply lead time. Therefore, the backorder cost \(b\) is multiplied with the average number of replenishments over a year (\(\lambda/Q\)) and the expected amount of backorders (\(E(\max(D - R, 0))\)) to calculate the expected annual backorder cost.

The derivation of the analytical expressions that lead to the optimal \(Q\) and \(R\) values is given in the appendix (Gavish & Graves, 1980). We now apply marginal analysis to reach the same results. Suppose that the decision-maker sets the reorder level to \(R - 1\) and considers increasing it by one unit. The marginal cost of increasing the reorder level by one unit is:

\[\Delta\Pi(R \mid Q) = h - b\,\frac{\lambda}{Q}\,(1 - F_l(R)).\]
The first term on the right-hand side of this expression is the expected cost of increasing the reorder level by one unit. Increasing the reorder level has a direct impact on the inventory level, which will in turn increase the total cost of holding the inventory by \(h\). On the other hand, increasing the reorder level helps reduce the backorders. The second term on the right-hand side of the expression can be interpreted as the expected value of increasing the reorder level by one unit, which results from the reduction of the backorder penalty cost. We multiply \(b\) with \(\lambda/Q\) to find the backorder cost per unit per year. A backorder occurs if demand during the supply lead time exceeds the reorder level. Let \(F_l(\cdot)\) and \(f_l(\cdot)\) denote the probability distribution and density functions of demand during the supply lead time, respectively. In statistical terms, the probability of having a backorder is \(1 - F_l(R)\). Thus, the expected value of increasing the reorder level by one unit is \(b(\lambda/Q)(1 - F_l(R))\). Since \(\Pi(Q, R)\) is defined as a cost function, the second term has a negative sign. Setting \(\Delta\Pi(R \mid Q) = 0\), the optimal \(R\) level is found by the following expression:

\[1 - F_l(R^*) = \frac{hQ}{b\lambda}.\]
Now, suppose that the decision-maker sets the order quantity to \(Q - 1\) and considers increasing it by one unit. The marginal cost of increasing the order quantity by one unit is:

\[\Delta\Pi(Q \mid R) = \frac{h}{2} - K\frac{\lambda}{Q^2} - b\,\frac{\lambda}{Q^2}\,E(\max(D - R, 0)).\]
The first term on the right-hand side of this expression is the expected cost of increasing the order quantity. We recall that the average inventory level is \(R - \lambda l + Q/2\) for the reorder model. Therefore, increasing \(Q\) by one unit results in the average inventory level increasing by 0.5 units. For that reason, the impact of increasing the order quantity by one unit on the holding cost is equal to \(h/2\). The second and third terms are considered the value of increasing the order quantity. It follows from the total cost function that the fixed ordering cost per year is \(K\lambda/Q\). The ordering cycle increases with the order quantity such that a one-unit increase in the order quantity increases the ordering cycle by a factor of \(1/Q\). Thus, increasing the quantity by one unit reduces the fixed ordering cost by \(K\lambda/Q^2\). Likewise, it follows from the total cost function that the expected backorder cost per year is \(bE(\max(D - R, 0))\lambda/Q\). Increasing the order quantity by one unit leads to a reduction in the backorder cost by \(bE(\max(D - R, 0))\lambda/Q^2\). Setting \(\Delta\Pi(Q \mid R) = 0\) yields the optimal order quantity expression:

\[Q^* = \sqrt{\frac{2\lambda\,(K + b\,E(\max(D - R, 0)))}{h}}.\]
Therefore, we have two optimality equations for the reorder model:

\[1 - F_l(R^*) = \frac{hQ^*}{b\lambda},\]

\[Q^* = \sqrt{\frac{2\lambda\,(K + b\,E(\max(D - R^*, 0)))}{h}}.\]

The optimal values of \(R\) and \(Q\) can be found by iteration, starting with an initial value of \(Q = \sqrt{2\lambda K/h}\). This initial value is then used in the first expression to find the corresponding \(R\) value, which is then used in the second expression to find a new \(Q\) value and so on. After some iterations, the values of \(Q\) and \(R\) converge to the optimal results.
3.4.1 Reorder Model Example
Suppose that a pharmaceutical company follows the reorder policy to replenish the inventory for a product. It takes 3 months to replenish the inventory after an order is placed so the supply lead time is 3 months. Demand during the supply lead time follows a normal distribution with a mean of 100 and a standard deviation of 50 units. Excess demand is backordered at a backorder penalty cost of $3 per unit. The cost of carrying over one unit of inventory during the supply lead time is $0.25. The fixed ordering cost per replenishment is $100. The annual holding cost is thus equal to \(h = 0.25 \times 4 = 1\) dollar per unit. The expected annual demand is \(\lambda = 100 \times 4 = 400\) units. The first optimality equation is written as follows:

\[F_l(R^*) = 1 - \frac{hQ^*}{b\lambda} = 1 - \frac{Q^*}{1200},\]

\[R^* = \mu + z_Q \sigma,\]

where \(\mu = 100\), \(\sigma = 50\) and \(z_Q = \Phi^{-1}(1 - Q^*/1200)\). The second optimality equation is written as follows:

\[Q^* = \sqrt{\frac{2\lambda\,(K + b\,E(\max(D - R^*, 0)))}{h}} = \sqrt{800\,(100 + 3 \times E(\max(D - R^*, 0)))},\]

where:

\[E(\max(D - R^*, 0)) = \int_{R^*}^{+\infty} (x - R^*) f_l(x)\,dx = \mu(1 - \Phi(z_R)) + \sigma\phi(z_R) - R^*(1 - \Phi(z_R)),\]

\[z_R = \frac{R^* - \mu}{\sigma}.\]

Using the expressions for \(R^*\) and \(Q^*\), the optimal results are found by iteration such that the initial value of the order quantity is \(Q = \sqrt{2\lambda K/h} = \sqrt{80000} \approx 283\). The optimal values are \(Q^* = 315\) and \(R^* = 132\). Figure 3.14 shows the \(Q\) and \(R\) values for each iteration. As shown in the figure, the results converge to the optimal values quickly.

Fig. 3.14 Convergence of Q and R to the optimal values when demand during the supply lead time follows a normal distribution with a mean of 100 and a standard deviation of 50 units

We now consider a lognormal distribution without changing the cost parameters. Demand during the supply lead time is assumed to follow a lognormal distribution with a location parameter of \(\mu = 4.494\) and a scale parameter of \(\sigma = 0.472\). The expected demand for these parameters is \(e^{4.494 + 0.472^2/2} = 100\) units. Therefore, the demand rate \(\lambda\) is still equal to 400 units annually.
The first optimality equation for the lognormal distribution case is written as follows:

\[F_l(R^*) = 1 - \frac{hQ^*}{b\lambda} = 1 - \frac{Q^*}{1200},\]

\[R^* = e^{\mu + z_Q \sigma},\]

where \(\mu = 4.494\), \(\sigma = 0.472\) and \(z_Q = \Phi^{-1}(1 - Q^*/1200)\). The second optimality equation is written as follows:

\[Q^* = \sqrt{\frac{2\lambda\,(K + b\,E(\max(D - R^*, 0)))}{h}} = \sqrt{800\,(100 + 3 \times E(\max(D - R^*, 0)))},\]

where:

\[E(\max(D - R^*, 0)) = \int_{R^*}^{+\infty} (x - R^*) f_l(x)\,dx = e^{\mu + \sigma^2/2}(1 - \Phi(z_R - \sigma)) - R^*(1 - \Phi(z_R)),\]

\[z_R = \frac{\ln(R^*) - \mu}{\sigma}.\]

Using the expressions for \(R^*\) and \(Q^*\), the optimal results are again found by iteration. The optimal values are \(R^* = 119\) and \(Q^* = 332\) as shown in Fig. 3.15. Similar to the normal distribution case, the results converge to the optimal values quickly. The online web application includes the examples given in Figs. 3.14 and 3.15, where readers can specify the cost and demand parameters. The readers can also review the Python code given in the web application.
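The iteration used in both examples can be sketched as follows. This is a minimal illustration and not the code of the web application: the function names (`solve_reorder`, `shortage_normal`, `shortage_lognormal`), the tolerance and the iteration cap are assumptions made here. The sketch alternates between the two optimality equations, starting from \(Q = \sqrt{2\lambda K/h}\).

```python
from math import exp, log, sqrt

from scipy.stats import norm


def shortage_normal(R, mu, sigma):
    """E[max(D - R, 0)] for normally distributed lead-time demand."""
    z = (R - mu) / sigma
    return (mu - R) * norm.sf(z) + sigma * norm.pdf(z)


def shortage_lognormal(R, mu, sigma):
    """E[max(D - R, 0)] for lognormal lead-time demand (location mu, scale sigma)."""
    z = (log(R) - mu) / sigma
    return exp(mu + sigma**2 / 2) * norm.sf(z - sigma) - R * norm.sf(z)


def solve_reorder(K, h, b, lam, quantile, shortage, tol=1e-6, max_iter=100):
    """Alternate between the two optimality equations until Q stops changing.

    Assumes h * Q / (b * lam) stays below 1 so that the quantile is defined.
    """
    Q = sqrt(2 * lam * K / h)                                # starting value
    for _ in range(max_iter):
        R = quantile(1 - h * Q / (b * lam))                  # first optimality equation
        Q_next = sqrt(2 * lam * (K + b * shortage(R)) / h)   # second optimality equation
        converged = abs(Q_next - Q) < tol
        Q = Q_next
        if converged:
            break
    return Q, R


costs = dict(K=100, h=1, b=3, lam=400)

# Normal lead-time demand with mean 100 and standard deviation 50.
Q_n, R_n = solve_reorder(
    **costs,
    quantile=lambda prob: norm.ppf(prob, loc=100, scale=50),
    shortage=lambda R: shortage_normal(R, 100, 50),
)

# Lognormal lead-time demand with location 4.494 and scale 0.472.
Q_ln, R_ln = solve_reorder(
    **costs,
    quantile=lambda prob: exp(4.494 + 0.472 * norm.ppf(prob)),
    shortage=lambda R: shortage_lognormal(R, 4.494, 0.472),
)

print(f"normal:    Q = {Q_n:.1f}, R = {R_n:.1f}")
print(f"lognormal: Q = {Q_ln:.1f}, R = {R_ln:.1f}")
```

The unrounded values should land within about one unit of the rounded results reported for Figs. 3.14 and 3.15.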
3.5 Monte Carlo Simulation for Inventory Models
The inventory models discussed thus far have an analytical solution, so optimal values can be found by using corresponding formulas. However, these models often rely on some restrictive assumptions. For example, one of the assumptions of the multi-period model is that the supply lead time is fixed for all consecutive orders. This assumption is often violated in practice because the supply lead time may increase due to some transshipment delays, weather conditions and supply disruptions. The other modelling assumptions may sometimes be violated in practice, thereby making those models ineffective for solving some real-world problems.
Fig. 3.15 Convergence of Q and R to the optimal values when demand during the supply lead time follows a lognormal distribution with a location parameter of 4.494 and a scale parameter of 0.472
For example, consider a pharmaceutical company with customers from three different countries. Demand for each customer is independent from one another, and it follows a statistical distribution. The company replenishes inventory periodically, similar to the base stock policy. Suppose that due to regulations, customers from the first country do not accept products if the remaining shelf life is less than 180 days. Those from the second country do not accept them if the remaining shelf life is less than 240 days. And there is no shelf life restriction for the third country. In this setting, the demand from all customers can be consolidated and met from a pooled inventory if all inventory has a remaining shelf life that is longer than 240 days. The products that have a shelf life between 180 and 240 days can be sold to the first and third countries. If the remaining shelf life is less than 180 days, the products can only be sold to the third country. There is no analytical solution to the inventory problem faced by the pharmaceutical company due to the complex relationship between inventory characteristics and demand dynamics. Consider also a manufacturer that follows a reorder model to control inventory. As discussed previously, both the reorder level R and order quantity Q are calculated by some analytical formulas based on the supply lead time, which has a fixed value. Now, suppose that the supply lead time is uncertain. Due to this uncertainty, the analytical formulas no longer yield optimal results.
Such examples can be extended to capture various real-world challenges for which there is no analytical solution offered by the extant inventory theories. In these cases, a Monte Carlo simulation can be applied to improve the inventory management practices. There are three steps in the Monte Carlo simulation (Judd, 1998):

1. Generating random values for the uncertain variables
2. Calculating the value of the function of interest for each generated random value
3. Reporting summary statistics based on the calculated values

The first step is to generate random values so that the profit or cost values per period can be calculated by assuming that the realized values of the uncertain variables are those generated randomly. In a technical sense, a Monte Carlo simulation is considered a deterministic expansion of stochastic models. If it is applied to a sample of 1000 days, for example, 1000 random values of demand should be generated from the statistical distribution of demand. These values are then assumed to be the realized values for the next 1000 days. Then, the profit and cost calculations are done using these values for a given inventory model. When a random variable follows a statistical distribution, standard functions embedded in software solutions (e.g. Python and R) can be used to generate random values. If only empirical values are available to characterize an uncertain variable without any functional form of a statistical distribution, a cumulative distribution table should be constructed to generate random values.

Consider a computer store that sells a new laptop in a busy district. The store is too small to keep much inventory. It replenishes from a central warehouse, and the supply lead time ranges from 1 day to 6 days. Figure 3.16 illustrates the replenishment dynamics for the reorder model with varying lead times. The first order is replenished in a very short time period. The second order is replenished only after a long time, by which point on-hand inventory has dropped to zero. Therefore, uncertainty in the supply lead time may cause stockouts if this uncertainty is not taken into account in replenishment planning.

Fig. 3.16 Order and replenishment dynamics for the reorder model with varying supply lead times

For the last 30 replenishments, the number of occurrences for each lead time value is given in Table 3.1. The first column shows the lead time values, and the second one the number of occurrences. For instance, the orders are supplied in 1 day for 6 replenishments out of 30. They are replenished in 2 days 4 times out of 30, and so on. Based on this information, we calculate the probability of occurrence for each lead time value. The probability of the supply lead time being equal to 1 day is equal to 6/30 = 20%. It is equal to 13% for 2 days, and so on. Using the probability information, the cumulative probability values are calculated as shown in the fourth column. The cumulative probabilities determine the intervals assigned to each lead time value. The last column shows these intervals. To generate lead time values using this information, a random value from a uniform distribution between zero and one is drawn. The randomly drawn value falls within one of those intervals. The lead time value corresponding to this interval is subsequently chosen as a random draw for the lead time.

Table 3.1 Probability table for the supply lead time

| Value (days) | # of occurrences | Probability | Cumulative | Interval |
|---|---|---|---|---|
| 1 | 6 | 6/30 = 20% | 20% | [0, 0.20] |
| 2 | 4 | 4/30 = 13% | 33% | (0.20, 0.33] |
| 3 | 3 | 3/30 = 10% | 43% | (0.33, 0.43] |
| 4 | 7 | 7/30 = 23% | 66% | (0.43, 0.66] |
| 5 | 5 | 5/30 = 17% | 83% | (0.66, 0.83] |
| 6 | 5 | 5/30 = 17% | 100% | (0.83, 1] |
| Total | 30 | 100% | | |

Daily demand for laptops is also random, varying between one and five with equal probabilities. The probability table including the intervals that correspond to each demand value is constructed in Table 3.2.

Table 3.2 Probability table for demand values

| Value | Probability | Cumulative | Interval |
|---|---|---|---|
| 1 | 20% | 20% | [0, 0.20] |
| 2 | 20% | 40% | (0.20, 0.40] |
| 3 | 20% | 60% | (0.40, 0.60] |
| 4 | 20% | 80% | (0.60, 0.80] |
| 5 | 20% | 100% | (0.80, 1.00] |

We now apply the Monte Carlo simulation to the inventory problem of the computer store for a period of 20 days. The beginning inventory is eight laptops, the reorder level is four laptops, and the order quantity is ten laptops. With these initial parameters, the results for each day are shown in Table 3.3. Using the results from Table 3.3, the average inventory level per day ([Inventory after replenishment + Ending inventory]/2) is equal to 5.1 units over 20 days. Average demand per day is 2.55 units. Average sales per day is 2 units. Average lost sales per day is 0.55 units. These average values can be multiplied by the holding cost and profit margin to calculate the expected profit.

Note that the Monte Carlo simulation does not directly return the optimal set of reorder level and order quantity. Instead, it allows us to find the expected profit or cost for a given set of decision parameters. Decision-makers can later change these decision parameters to determine effective policies to improve the current state of practices. Compared to the analytical models discussed in the previous sections, the main advantage of the Monte Carlo simulation is that it captures real-world dynamics directly without imposing any restrictive assumptions. However, its main disadvantage is that finding the optimal values of the decision parameters can be computationally intensive. The user needs to run the simulation algorithm several times. In each attempt, the input parameters that characterize the inventory model should be changed. A comparison should be made at the end to determine the optimal values. Another difficulty is the construction of the simulation model that replicates an actual business setting. This step can also be time-consuming. Once these challenges are addressed, the Monte Carlo simulation is considered a powerful approach for solving complex inventory management problems.
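The laptop example can be replayed with a short Python sketch. It is an illustration under stated assumptions rather than the code of the web application: the function names are hypothetical, unmet demand is treated as lost (as in Table 3.3), and an order is placed at the end of a day when on-hand plus on-order inventory is at or below the reorder level. The uniform draws are copied from Table 3.3 so that the run is deterministic.

```python
from bisect import bisect_left

LEAD_TIME_CUM = [0.20, 0.33, 0.43, 0.66, 0.83, 1.00]   # lead times 1..6 (Table 3.1)
DEMAND_CUM = [0.20, 0.40, 0.60, 0.80, 1.00]            # demand values 1..5 (Table 3.2)


def draw_value(u, cumulative):
    """Map a uniform draw u in [0, 1] to the value whose interval contains u."""
    return bisect_left(cumulative, u) + 1


def simulate(demand_draws, lead_draws, start_inventory=8, reorder_level=4, order_qty=10):
    """Replay the reorder policy day by day; unmet demand is lost."""
    on_hand = start_inventory
    pipeline = {}                       # arrival day -> quantity on order
    lead_iter = iter(lead_draws)        # a lead-time draw is used only when ordering
    rows = []
    for day, u in enumerate(demand_draws, start=1):
        after = on_hand + pipeline.pop(day, 0)          # inventory after replenishment
        demand = draw_value(u, DEMAND_CUM)
        sales = min(demand, after)
        on_hand = after - sales                         # ending inventory
        ordered = 0
        if on_hand + sum(pipeline.values()) <= reorder_level:
            lead_time = draw_value(next(lead_iter), LEAD_TIME_CUM)
            arrival = day + lead_time
            pipeline[arrival] = pipeline.get(arrival, 0) + order_qty
            ordered = order_qty
        rows.append((after, demand, sales, on_hand, demand - sales, ordered))
    return rows


demand_draws = [0.11, 0.18, 0.17, 0.59, 0.56, 0.17, 0.69, 0.21, 0.07, 0.92,
                0.93, 0.74, 0.21, 0.31, 0.81, 0.39, 0.31, 0.16, 0.61, 0.21]
lead_draws = [0.79, 0.06, 0.15, 0.59]

rows = simulate(demand_draws, lead_draws)
n = len(rows)
avg_inventory = sum((after + ending) / 2 for after, _, _, ending, _, _ in rows) / n
avg_demand = sum(r[1] for r in rows) / n
avg_sales = sum(r[2] for r in rows) / n
avg_lost = sum(r[4] for r in rows) / n
print(avg_inventory, avg_demand, avg_sales, avg_lost)
```

For a fresh simulation run, the two lists of draws can be replaced with values from `random.random()`, with enough lead-time draws to cover every order that may be placed.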
Table 3.3 Monte Carlo simulation results of the example

| Day | Replenishment quantity | Inventory after replenishment | Random draw for demand | Demand | Sales | Ending inventory | Lost sales | Order quantity | Lead time | Random draw for lead time |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 8 | 0.11 | 1 | 1 | 7 | 0 | 0 | | |
| 2 | 0 | 7 | 0.18 | 1 | 1 | 6 | 0 | 0 | | |
| 3 | 0 | 6 | 0.17 | 1 | 1 | 5 | 0 | 0 | | |
| 4 | 0 | 5 | 0.59 | 3 | 3 | 2 | 0 | 10 | 5 | 0.79 |
| 5 | 0 | 2 | 0.56 | 3 | 2 | 0 | 1 | 0 | | |
| 6 | 0 | 0 | 0.17 | 1 | 0 | 0 | 1 | 0 | | |
| 7 | 0 | 0 | 0.69 | 4 | 0 | 0 | 4 | 0 | | |
| 8 | 0 | 0 | 0.21 | 2 | 0 | 0 | 2 | 0 | | |
| 9 | 10 | 10 | 0.07 | 1 | 1 | 9 | 0 | 0 | | |
| 10 | 0 | 9 | 0.92 | 5 | 5 | 4 | 0 | 10 | 1 | 0.06 |
| 11 | 10 | 14 | 0.93 | 5 | 5 | 9 | 0 | 0 | | |
| 12 | 0 | 9 | 0.74 | 4 | 4 | 5 | 0 | 0 | | |
| 13 | 0 | 5 | 0.21 | 2 | 2 | 3 | 0 | 10 | 1 | 0.15 |
| 14 | 10 | 13 | 0.31 | 2 | 2 | 11 | 0 | 0 | | |
| 15 | 0 | 11 | 0.81 | 5 | 5 | 6 | 0 | 0 | | |
| 16 | 0 | 6 | 0.39 | 2 | 2 | 4 | 0 | 10 | 4 | 0.59 |
| 17 | 0 | 4 | 0.31 | 2 | 2 | 2 | 0 | 0 | | |
| 18 | 0 | 2 | 0.16 | 1 | 1 | 1 | 0 | 0 | | |
| 19 | 0 | 1 | 0.61 | 4 | 1 | 0 | 3 | 0 | | |
| 20 | 10 | 10 | 0.21 | 2 | 2 | 8 | 0 | 0 | | |

3.6 Chapter Summary

Inventory management is a major determinant of organizational performance. Manufacturers and retailers that manage their inventories effectively can improve their operating margins, gaining a competitive edge in their markets. Scholars have studied inventory models extensively since the 1950s (Clark, 1958; Hadley & Whitin, 1963). There are various theoretical models that address the challenges of inventory management from different angles. In Fig. 3.17, we present the logic tree that identifies the right inventory model depending on the product and market dynamics. When the selling season is short, the newsvendor model optimizes the trade-off between excess inventory and stockouts. If the product is sold over multiple periods, there are three issues we should look at before deciding on the right inventory model. The first one is related to the customers' reaction to stockouts. If the customers are willing to buy a product later in an out-of-stock situation, the inventory problem is called the backordering problem. Otherwise, it is a lost sales problem, for which there is no exact solution offered by inventory theories. Hence, the Monte Carlo simulation should be utilized to improve the inventory decisions. The second issue is related to the supply lead time. If the supply lead time varies for consecutive orders, there is again no exact solution so the Monte Carlo simulation should be used. When the supply lead time is fixed, we then need to investigate the third issue: Is there a fixed ordering cost per order? If the answer is "Yes", the reorder model optimizes the two sets of trade-offs as discussed before. Otherwise, the multi-period model should be used to find the optimal order-up-to level.

Fig. 3.17 Logic tree that identifies the right inventory model depending on the product and market dynamics
3.7 Practice Examples
Example 3.1 Suppose that a manufacturer reported $140 million cost of goods sold in the annual income statement of 2017. The value of inventory is reported in the balance sheet of the manufacturer for each quarter. For Q1, Q2, Q3 and Q4 of 2017, the value of inventory is $10 million, $20 million, $37 million and $13 million, respectively. Calculate the inventory turnover and days of inventory for the manufacturer in 2017.

Solution 3.1 The cost of goods sold annually is $140 million. The average inventory level for the same year is found by calculating the average of quarterly inventory levels:

\[\text{Average inventory level} = \frac{10 + 20 + 37 + 13}{4} = 20 \text{ million dollars}.\]

Then, the inventory turnover is found as follows:

\[\text{Inventory turnover} = \frac{140}{20} = 7.\]

Therefore, the manufacturer turned over its inventory seven times in 2017. The value of days of inventory is:

\[\text{Days of inventory} = 365/7 = 52 \text{ days}.\]
Thus, the manufacturer keeps enough inventory to meet demand for 52 days on average.

Example 3.2 What are the names of functions in Python that return the inverse, cumulative probability and probability density of normal and lognormal distributions?

Solution 3.2 In Python, the probability functions are included in the scipy library. The normal and lognormal distribution functions can be imported to the Jupyter platform by the following lines of code, respectively:

    from scipy.stats import norm
    from scipy.stats import lognorm

The inverse, cumulative distribution and probability density functions for the normal distribution are norm.ppf(probability, mean, standard deviation), norm.cdf(value, mean, standard deviation) and norm.pdf(value, mean, standard deviation), respectively. Likewise, the inverse, cumulative distribution and probability density functions for the lognormal distribution are lognorm.ppf(probability, s, loc, scale), lognorm.cdf(value, s, loc, scale) and lognorm.pdf(value, s, loc, scale), respectively. In scipy's parametrization, the shape argument s equals the scale parameter $\sigma$ of the lognormal distribution, loc is a shift that is left at its default value of zero, and scale equals $e^{\mu}$, where $\mu$ is the location parameter.

Example 3.3 A retailer sells winter jackets in a short selling season between November and February. The retailer purchases the jackets from an offshore contract manufacturer in September. Due to long lead times, it is not possible to have in-season replenishments so the inventory system can be conceptualized as a single-period newsvendor model. The demand for the jackets follows a normal distribution with a mean of 2000 units and a standard deviation of 400 units. The selling price of the jackets is $450 per unit. The purchase cost is $250 per unit. Unsold jackets at the end of February are moved to a discount store, where they are sold at $120 per unit. Calculate the optimal order quantity.

Solution 3.3 The critical fractile is found by:

$$\alpha^* = \frac{p - c}{p - s} = \frac{450 - 250}{450 - 120} = 0.606.$$

Then,

$$z^* = \Phi^{-1}(0.606) = 0.269.$$

The optimal order quantity is equal to:

$$Q^* = \mu + \sigma z^* = 2000 + 400 \times 0.269 = 2108 \text{ units}.$$
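The calculation in Solution 3.3 can be reproduced with `norm.ppf`. The snippet below is a minimal sketch; the helper name `newsvendor_normal` is introduced here for illustration and is not part of scipy or of the book's web application.

```python
from scipy.stats import norm


def newsvendor_normal(p, c, s, mu, sigma):
    """Return (critical fractile, z*, Q*) for normally distributed demand."""
    alpha = (p - c) / (p - s)      # critical fractile
    z = norm.ppf(alpha)            # inverse of the standard normal CDF
    return alpha, z, mu + sigma * z


# Inputs of Example 3.3
alpha, z_star, q_star = newsvendor_normal(p=450, c=250, s=120, mu=2000, sigma=400)
print(f"alpha* = {alpha:.3f}, z* = {z_star:.3f}, Q* = {q_star:.0f} units")
```

Calling `norm.ppf(alpha, mu, sigma)` directly would return the same order quantity in one step; the intermediate `z` is kept only to mirror the solution steps.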
Example 3.4 Use the information given in Example 3.3, and calculate the expected profit, expected shortage and expected excess inventory when the order quantity is 2108 units.

Solution 3.4 It was found in the previous example that $Q^* = 2108$. Using the expected profit formulation given in “Appendix to Chap. 3”, we obtain:

$$E(\Pi(Q^*)) = (p - s)\big[\mu\Phi(z^*) - \sigma\phi(z^*)\big] = (450 - 120)\big(2000 \times \Phi(0.269) - 400 \times \phi(0.269)\big) = \$349\text{K}.$$

The expected shortage for a given order quantity $Q$ is formulated as follows:

$$\int_Q^{+\infty} (x - Q) f(x)\,\partial x = \mu(1 - \Phi(z)) + \sigma\phi(z) - (\mu + \sigma z)(1 - \Phi(z)) = \sigma\phi(z) - \sigma z(1 - \Phi(z)) = \sigma\big[\phi(z) - z(1 - \Phi(z))\big],$$

where $x$ is a random variable that denotes the demand and $f(\cdot)$ is the probability density function of demand. Then, the expected shortage is found as follows:

$$400\big[\phi(0.269) - 0.269(1 - \Phi(0.269))\big] = 112 \text{ units}.$$

The expected excess inventory for a given order quantity $Q$ is also formulated as follows:

$$\int_0^Q (Q - x) f(x)\,\partial x = (\mu + \sigma z)\Phi(z) - \mu\Phi(z) + \sigma\phi(z) = \sigma z\Phi(z) + \sigma\phi(z) = \sigma\big[z\Phi(z) + \phi(z)\big].$$

Then, the expected excess inventory is calculated as follows:

$$400\big[0.269 \times \Phi(0.269) + \phi(0.269)\big] = 219 \text{ units}.$$
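The three quantities in Solution 3.4 can be checked numerically. The sketch below uses the closed-form shortage and excess expressions above, and evaluates the profit through the general form $(p - c)Q - (p - s) \times$ (expected excess) from the appendix, so it also works for order quantities other than $Q^*$. The function name is a hypothetical helper, and small differences from the rounded figures in the text are expected.

```python
from scipy.stats import norm


def newsvendor_normal_metrics(q, p, c, s, mu, sigma):
    """Expected profit, shortage and excess inventory for order quantity q."""
    z = (q - mu) / sigma
    shortage = sigma * (norm.pdf(z) - z * (1 - norm.cdf(z)))
    excess = sigma * (z * norm.cdf(z) + norm.pdf(z))
    profit = (p - c) * q - (p - s) * excess
    return profit, shortage, excess


profit, shortage, excess = newsvendor_normal_metrics(
    q=2108, p=450, c=250, s=120, mu=2000, sigma=400
)
print(f"profit = {profit:,.0f}, shortage = {shortage:.1f}, excess = {excess:.1f}")
```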
Example 3.5 A retailer in Canada sells winter tires for a very short time period that starts at the beginning of November and ends at the end of December. The retailer orders the tires directly from a manufacturer, and it takes around 3 months to deliver the orders. Due to the long lead time, the retailer does not have in-season replenishments. Hence, the inventory model of the retailer can be conceptualized as a single-period newsvendor model. The market demand follows a lognormal distribution with a mean of 1000 units and a coefficient of variation (CV) of 1. Calculate the location and scale parameters of the lognormal distribution for the market demand.

Solution 3.5 The lognormal distribution is characterized by two parameters, (1) location and (2) scale parameters, denoted by $\mu$ and $\sigma$, respectively. Let $x$ be a random variable that denotes the demand. If $x$ follows a lognormal distribution with the location and scale parameters of $\mu$ and $\sigma$, it means that the natural logarithm transformation of $x$ (i.e. $y = \ln(x)$) follows a normal distribution with a mean of $\mu$ and a standard deviation of $\sigma$. The mean demand for the lognormal distribution is given by $e^{\mu + \sigma^2/2}$. The variance of demand for the lognormal distribution is given by $(e^{\sigma^2} - 1)e^{2\mu + \sigma^2}$. The coefficient of variation (CV) is defined as the ratio of the standard deviation of demand to its mean value (i.e. [Std. deviation]/[Mean]). Thus,

$$CV^2 = \frac{\text{Variance}}{\text{Mean}^2} = \frac{(e^{\sigma^2} - 1)e^{2\mu + \sigma^2}}{e^{2\mu + \sigma^2}} = e^{\sigma^2} - 1 = 1, \qquad \sigma = \sqrt{\ln 2} = 0.83.$$

If the CV is known for a lognormal distribution, the scale parameter can be directly calculated. Using the mean value of demand, the location parameter can also be calculated as follows:

$$1000 = e^{\mu + \sigma^2/2} = e^{\mu + 0.83^2/2}, \qquad \mu = \ln(1000) - 0.83^2/2 = 6.56.$$

Therefore, the demand follows a lognormal distribution with the location and scale parameters of 6.56 and 0.83, respectively. It has been well established in the literature (Gallego et al., 2007) that the lognormal distribution fits highly uncertain demand better than the normal distribution. When the CV is less than 0.3, the demand is considered to have low uncertainty, and a normal distribution can be assumed to characterize it. Otherwise, it is more convenient to assume a lognormal distribution.

Example 3.6 The retailer given in Example 3.5 sells the tires at a price of $200 per unit. The purchase cost is $120 per unit. We assume that unsold tires at the end of December are sold at a discounted value of $100 per unit later. Calculate the optimal order quantity for the retailer.
Solution 3.6 The critical fractile is found by:

$$\alpha^* = \frac{p - c}{p - s} = \frac{200 - 120}{200 - 100} = 0.8.$$

Then,

$$z^* = \Phi^{-1}(0.8) = 0.842.$$

The optimal order quantity is equal to:

$$Q^* = e^{\mu + \sigma z^*} = e^{6.56 + 0.83 \times 0.842} = 1421 \text{ units}.$$
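Solutions 3.5 and 3.6 can be combined in a few lines. The sketch below converts a mean and a CV into the location and scale parameters and then calls `lognorm.ppf`, which in scipy expects the shape `s` $= \sigma$ and `scale` $= e^{\mu}$. The helper `lognormal_params` is a name introduced here for illustration. Because the parameters are not rounded to two decimals, the order quantity comes out a few units above the 1421 reported in the text.

```python
import numpy as np
from scipy.stats import lognorm


def lognormal_params(mean, cv):
    """Location (mu) and scale (sigma) parameters from the mean and CV of demand."""
    sigma = np.sqrt(np.log(1 + cv**2))
    mu = np.log(mean) - sigma**2 / 2
    return mu, sigma


mu, sigma = lognormal_params(mean=1000, cv=1.0)     # Example 3.5
alpha = (200 - 120) / (200 - 100)                   # critical fractile of Example 3.6
q_star = lognorm.ppf(alpha, s=sigma, scale=np.exp(mu))
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}, Q* = {q_star:.0f} units")
```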
Example 3.7 Using the information in Examples 3.5 and 3.6, calculate the expected profit, expected shortage and expected excess inventory when the order quantity is 1421 units.

Solution 3.7 It was found in the previous example that $Q^* = 1421$. Using the expected profit formulation for the lognormal distribution given in “Appendix to Chap. 3”, we obtain:

$$E(\Pi(Q^*)) = (p - s)e^{\mu + \sigma^2/2}\Phi(z^* - \sigma) = (200 - 100)e^{6.56 + 0.83^2/2}\Phi(0.842 - 0.83) = \$50.3\text{K}.$$

When the demand follows a lognormal distribution, the expected shortage for a given order quantity $Q$ is formulated as follows:

$$\int_Q^{+\infty} (x - Q) f(x)\,\partial x = e^{\mu + \sigma^2/2}(1 - \Phi(z - \sigma)) - e^{\mu + \sigma z}(1 - \Phi(z)).$$

Then, the expected shortage is found as follows:

$$e^{6.56 + 0.83^2/2}(1 - \Phi(0.842 - 0.83)) - e^{6.56 + 0.83 \times 0.842}(1 - \Phi(0.842)) = 209 \text{ units}.$$

The expected excess inventory for a given order quantity $Q$ is also formulated as follows:

$$\int_0^Q (Q - x) f(x)\,\partial x = e^{\mu + \sigma z}\Phi(z) - e^{\mu + \sigma^2/2}\Phi(z - \sigma).$$

Then, the expected excess inventory is calculated as follows:

$$e^{6.56 + 0.83 \times 0.842}\Phi(0.842) - e^{6.56 + 0.83^2/2}\Phi(0.842 - 0.83) = 633 \text{ units}.$$
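As a cross-check of Solution 3.7, the lognormal shortage and excess formulas can be coded directly. This is a sketch under two assumptions that are not in the text: the parameters are computed from the mean and CV without rounding, and the profit is evaluated through the general form $(p - c)Q - (p - s) \times$ (expected excess). The results therefore differ slightly from the rounded figures above. The function names are hypothetical helpers.

```python
import numpy as np
from scipy.stats import norm


def lognormal_params(mean, cv):
    """Location (mu) and scale (sigma) parameters from the mean and CV of demand."""
    sigma = np.sqrt(np.log(1 + cv**2))
    return np.log(mean) - sigma**2 / 2, sigma


def newsvendor_lognormal_metrics(q, p, c, s, mu, sigma):
    """Expected profit, shortage and excess inventory for order quantity q."""
    z = (np.log(q) - mu) / sigma
    mean = np.exp(mu + sigma**2 / 2)
    shortage = mean * (1 - norm.cdf(z - sigma)) - q * (1 - norm.cdf(z))
    excess = q * norm.cdf(z) - mean * norm.cdf(z - sigma)
    profit = (p - c) * q - (p - s) * excess
    return profit, shortage, excess


mu, sigma = lognormal_params(mean=1000, cv=1.0)
profit, shortage, excess = newsvendor_lognormal_metrics(
    q=1421, p=200, c=120, s=100, mu=mu, sigma=sigma
)
print(f"profit = {profit:,.0f}, shortage = {shortage:.0f}, excess = {excess:.0f}")
```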
Example 3.8 An electronics manufacturer follows a reorder policy to replenish the inventory of an electronic connector. It takes 2 months to replenish inventory after an order is placed. The fixed ordering cost per replenishment is $100. Demand during the supply lead time has a mean of 300 and a CV of 0.3. Excess demand is backordered at a backorder penalty cost of $10 per unit. The cost of carrying over one unit of inventory for 1 month is $1. Calculate the optimal order quantity and reorder point when the demand is assumed to follow a normal distribution.

Solution 3.8 The normal distribution parameters are $\mu = 300$ and $\sigma = 300 \times 0.3 = 90$. The expected annual demand is $\lambda = 300 \times 12/2 = 1800$ units. The annual holding cost is $h = \$1 \times 12 = \$12$ per unit of inventory. The first optimality condition is:

$$F_l(R^*) = 1 - \frac{hQ^*}{b\lambda} = 1 - \frac{Q^*}{1500}, \qquad R^* = 300 + 90 \times z_Q,$$

where $z_Q = \Phi^{-1}(1 - Q^*/1500)$. The second optimality equation is given by:

$$Q^* = \sqrt{\frac{2\lambda\big(K + bE(\max(D - R^*, 0))\big)}{h}} = \sqrt{300\big(100 + 10 \times E(\max(D - R^*, 0))\big)},$$

where:

$$E(\max(D - R^*, 0)) = \int_{R^*}^{+\infty} (x - R^*) f_l(x)\,\partial x = \mu(1 - \Phi(z_R)) + \sigma\phi(z_R) - R^*(1 - \Phi(z_R)), \qquad z_R = \frac{R^* - \mu}{\sigma}.$$

The optimal solution can be found by iterating these two optimality equations. To start the iteration, we need to set an initial value for $Q$. An ideal initial value for $Q$ can be:

$$Q_{initial} = \sqrt{\frac{2\lambda K}{h}} = \sqrt{30000} = 173 \text{ units}.$$

Then, the optimal values are found as follows:

$$Q^* = 226 \text{ units}, \qquad R^* = 393 \text{ units}.$$
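The iteration between the two optimality equations can be sketched as a simple fixed-point loop. The code below is an illustrative implementation under the assumptions of Solution 3.8 and is not the code from the book's web application; the function names, the tolerance and the iteration cap are choices made here.

```python
from math import sqrt

from scipy.stats import norm


def expected_shortage_normal(r, mu, sigma):
    """E[max(D - r, 0)] for normally distributed lead-time demand."""
    z = (r - mu) / sigma
    return sigma * (norm.pdf(z) - z * (1 - norm.cdf(z)))


def solve_qr_normal(mu, sigma, lam, K, h, b, tol=1e-6, max_iter=200):
    """Alternate between the reorder-point and order-quantity conditions."""
    q = sqrt(2 * lam * K / h)                              # Q_initial
    for _ in range(max_iter):
        r = norm.ppf(1 - h * q / (b * lam), mu, sigma)     # first condition
        n_r = expected_shortage_normal(r, mu, sigma)
        q_new = sqrt(2 * lam * (K + b * n_r) / h)          # second condition
        converged = abs(q_new - q) < tol
        q = q_new
        if converged:
            break
    r = norm.ppf(1 - h * q / (b * lam), mu, sigma)
    return q, r


q_star, r_star = solve_qr_normal(mu=300, sigma=90, lam=1800, K=100, h=12, b=10)
print(f"Q* = {q_star:.0f} units, R* = {r_star:.0f} units")
```

Starting from $Q_{initial} = 173$, the loop settles within a handful of iterations because each update changes $Q$ by a shrinking amount.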
The Python code for this example and the iteration algorithm are given in the online web application.

Example 3.9 Calculate the optimal order quantity and reorder point for Example 3.8 when the demand is assumed to follow a lognormal distribution.

Solution 3.9 The location and scale parameters of the lognormal distribution for the demand with a mean of 300 and a CV of 0.3 are:

$$CV^2 = e^{\sigma^2} - 1 = 0.09, \qquad \sigma = 0.29,$$

$$300 = e^{\mu + \sigma^2/2} = e^{\mu + 0.29^2/2}, \qquad \mu = \ln(300) - 0.29^2/2 = 5.66.$$

Therefore, the lognormal distribution with a mean of 300 and a CV of 0.3 is characterized by the location and scale parameters of 5.66 and 0.29, respectively. The first optimality condition is:

$$F_l(R^*) = 1 - \frac{hQ^*}{b\lambda} = 1 - \frac{Q^*}{1500}, \qquad R^* = e^{5.66 + 0.29 \times z_Q},$$

where $z_Q = \Phi^{-1}(1 - Q^*/1500)$. The second optimality equation is given by:

$$Q^* = \sqrt{\frac{2\lambda\big(K + bE(\max(D - R^*, 0))\big)}{h}} = \sqrt{300\big(100 + 10 \times E(\max(D - R^*, 0))\big)},$$

where:

$$E(\max(D - R^*, 0)) = \int_{R^*}^{+\infty} (x - R^*) f_l(x)\,\partial x = e^{5.66 + 0.29^2/2}(1 - \Phi(z_R - 0.29)) - R^*(1 - \Phi(z_R)), \qquad z_R = \frac{\ln(R^*) - 5.66}{0.29}.$$
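The two lognormal optimality conditions above can be iterated in the same way as in the normal case. The sketch below is an illustration, not the book's web application code: the function names are hypothetical, and the location and scale parameters are computed from the mean and CV without rounding, so the converged values may differ from the text's results by about one unit.

```python
from math import exp, log, sqrt

from scipy.stats import norm


def lognormal_params(mean, cv):
    """Location (mu) and scale (sigma) parameters from the mean and CV of demand."""
    sigma = sqrt(log(1 + cv**2))
    return log(mean) - sigma**2 / 2, sigma


def expected_shortage_lognormal(r, mu, sigma):
    """E[max(D - r, 0)] for lognormally distributed lead-time demand."""
    z = (log(r) - mu) / sigma
    return exp(mu + sigma**2 / 2) * (1 - norm.cdf(z - sigma)) - r * (1 - norm.cdf(z))


def solve_qr_lognormal(mu, sigma, lam, K, h, b, tol=1e-6, max_iter=200):
    """Alternate between the reorder-point and order-quantity conditions."""
    q = sqrt(2 * lam * K / h)                                  # Q_initial
    for _ in range(max_iter):
        r = exp(mu + sigma * norm.ppf(1 - h * q / (b * lam)))  # first condition
        n_r = expected_shortage_lognormal(r, mu, sigma)
        q_new = sqrt(2 * lam * (K + b * n_r) / h)              # second condition
        converged = abs(q_new - q) < tol
        q = q_new
        if converged:
            break
    r = exp(mu + sigma * norm.ppf(1 - h * q / (b * lam)))
    return q, r


mu, sigma = lognormal_params(mean=300, cv=0.3)
q_star, r_star = solve_qr_lognormal(mu, sigma, lam=1800, K=100, h=12, b=10)
print(f"Q* = {q_star:.0f} units, R* = {r_star:.0f} units")
```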
We can still use the $Q_{initial}$ found in Solution 3.8 as the initial value of $Q$, which is needed to start the iteration algorithm. Then, the optimal values are found as follows:

$$Q^* = 254 \text{ units}, \qquad R^* = 380 \text{ units}.$$

The Python code for this example and the iteration algorithm are given in the online web application.

Example 3.10 Suppose that a random variable is generated from a probability function in an online game such that its values lie between one and eight. The probability function is given by:

$$\text{Prob}(x = i) = \begin{cases} 0.077 & \text{if } i = 1, \\ 0.086 & \text{if } i = 2, \\ 0.277 & \text{if } i = 3, \\ 0.419 & \text{if } i = 4, \\ 0.064 & \text{if } i = 5, \\ 0.032 & \text{if } i = 6, \\ 0.025 & \text{if } i = 7, \\ 0.020 & \text{if } i = 8. \end{cases}$$
The players can earn $3 if the value of the random variable is less than five and zero otherwise. Therefore, the profit function is formalized as follows: g(x) = 3 × 1[x s. Therefore, setting $\partial E(\Pi)/\partial Q = 0$, the optimal order quantity is obtained as follows:

$$F(Q^*) = \frac{p - c}{p - s}.$$
With the critical fractile solution at hand, the expected profit function can be interpreted from a risk management perspective. First, the expected profit function can be rewritten as follows:

$$E(\Pi) = (p - c)Q - (p - s)\int_0^Q (Q - x) f(x)\,\partial x.$$

The first term on the right-hand side is the total profit when the mismatches between supply and demand are totally avoided such that the decision-maker knows that all of the quantity ordered will be sold at the price of $p$. The second term can be interpreted as the cost of fully hedging the mismatch risk. The maximum value of the expected profit is achieved when $Q^*$ units are ordered based on the critical fractile solution. Then, rearranging the terms yields the following:

$$E(\Pi(Q^*)) = (p - c)Q^* - (p - s)Q^* F(Q^*) + (p - s)\int_0^{Q^*} x f(x)\,\partial x = (p - s)\int_0^{Q^*} x f(x)\,\partial x,$$

where the first two terms cancel out because $F(Q^*) = (p - c)/(p - s)$.
The derivation of the partial integral is critical to finding the expected profit in inventory problems with demand uncertainty. When demand follows a normal distribution with a mean of $\mu$ and a standard deviation of $\sigma$, the partial integral is written as:

$$\int_0^Q x f(x)\,\partial x = \int_0^Q x \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,\partial x \approx \int_{-\infty}^{z_Q} (\mu + z\sigma) \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,\partial z = \mu\Phi(z_Q) - \sigma\phi(z_Q),$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the density and distribution functions of the standard normal distribution, which has a mean of zero and a standard deviation of one. Also, $z_Q = (Q - \mu)/\sigma$ and $z = (x - \mu)/\sigma$. In particular, we apply the change of variables (replacing $x$ with $z$) to derive the last expression. The lower limit of the transformed integral, $-\mu/\sigma$, is replaced with $-\infty$ because the probability of negative demand is negligible. Therefore, the expected profit function is formalized as below when demand follows a normal distribution with a mean of $\mu$ and a standard deviation of $\sigma$:

$$E(\Pi(Q) \mid N(\mu, \sigma)) = (p - c)Q - (p - s)Q\Phi(z_Q) + (p - s)\big[\mu\Phi(z_Q) - \sigma\phi(z_Q)\big].$$
When demand follows a lognormal distribution with a location parameter of $\mu$ and a scale parameter of $\sigma$, the partial integral is written as follows:

$$\int_0^Q x f(x)\,\partial x = \int_0^Q x \frac{1}{\sigma x\sqrt{2\pi}} e^{-\frac{(\ln(x) - \mu)^2}{2\sigma^2}}\,\partial x = \int_{-\infty}^{z_Q} \frac{1}{\sqrt{2\pi}} e^{\mu + \sigma z} e^{-z^2/2}\,\partial z$$

$$= e^{\mu + \sigma^2/2} \int_{-\infty}^{z_Q} \frac{1}{\sqrt{2\pi}} e^{-(z - \sigma)^2/2}\,\partial z = e^{\mu + \sigma^2/2}\Phi(z_Q - \sigma),$$

where $z = (\ln(x) - \mu)/\sigma$ and $z_Q = (\ln(Q) - \mu)/\sigma$. Then, the expected profit is formulated as below when demand follows a lognormal distribution with a location parameter of $\mu$ and a scale parameter of $\sigma$:

$$E(\Pi(Q) \mid \log N(\mu, \sigma)) = (p - c)Q - (p - s)Q\Phi(z_Q) + (p - s)\big[e^{\mu + \sigma^2/2}\Phi(z_Q - \sigma)\big].$$
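The two partial-integral results can be verified numerically. The sketch below compares the closed forms with `scipy.integrate.quad`; the parameter values are taken from Examples 3.3 and 3.6 purely for illustration, and the helper names are introduced here. For the normal case the numerical integral starts at zero, so the two values agree only up to the negligible mass of negative demand.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import lognorm, norm


def partial_mean_normal(q, mu, sigma):
    """Closed form of the partial integral of x f(x) up to q, normal demand."""
    z_q = (q - mu) / sigma
    return mu * norm.cdf(z_q) - sigma * norm.pdf(z_q)


def partial_mean_lognormal(q, mu, sigma):
    """Closed form of the partial integral of x f(x) up to q, lognormal demand."""
    z_q = (np.log(q) - mu) / sigma
    return np.exp(mu + sigma**2 / 2) * norm.cdf(z_q - sigma)


# Normal demand (values of Example 3.3)
mu_n, sigma_n, q_n = 2000.0, 400.0, 2108.0
numeric_n, _ = quad(lambda x: x * norm.pdf(x, mu_n, sigma_n), 0, q_n)
closed_n = partial_mean_normal(q_n, mu_n, sigma_n)

# Lognormal demand (rounded values of Example 3.6)
mu_l, sigma_l, q_l = 6.56, 0.83, 1421.0
numeric_l, _ = quad(
    lambda x: x * lognorm.pdf(x, s=sigma_l, scale=np.exp(mu_l)), 0, q_l
)
closed_l = partial_mean_lognormal(q_l, mu_l, sigma_l)

print(f"normal:    numeric = {numeric_n:.3f}, closed form = {closed_n:.3f}")
print(f"lognormal: numeric = {numeric_l:.3f}, closed form = {closed_l:.3f}")
```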
3.9.2 Analytical Solution to the Base Stock Problem

In the base stock model, the sum of holding and backorder costs at the end of each period is:

$$C = h\max\Big(S - \sum_{j=0}^{l} D_{i-j},\, 0\Big) + b\max\Big(\sum_{j=0}^{l} D_{i-j} - S,\, 0\Big).$$

Then, the expected cost is formulated as:

$$E(C) = h\int_0^S (S - x) f_{l+1}(x)\,\partial x + b\int_S^{+\infty} (x - S) f_{l+1}(x)\,\partial x.$$

The first derivative of this expression with respect to $S$ yields:

$$\frac{\partial E(C)}{\partial S} = hF_{l+1}(S) - b(1 - F_{l+1}(S)).$$

We then take the second derivative to check if the expected cost function is convex:

$$\frac{\partial^2 E(C)}{\partial S^2} = hf_{l+1}(S) + bf_{l+1}(S).$$

The second derivative is always non-negative. Therefore, the expected cost function is convex. Setting $\partial E(C)/\partial S = 0$, the optimal base stock level is obtained as follows:

$$F_{l+1}(S^*) = \frac{b}{b + h}.$$

When demand follows a normal distribution with a mean of $\mu$ and a standard deviation of $\sigma$, the expected cost function is written such that:

$$E(C \mid N(\mu, \sigma)) = hSF_{l+1}(S) - bS(1 - F_{l+1}(S)) - h\int_0^S x f_{l+1}(x)\,\partial x + b\int_S^{+\infty} x f_{l+1}(x)\,\partial x$$

$$= hSF_{l+1}(S) - bS(1 - F_{l+1}(S)) - h\big(\mu\Phi(z_S) - \sigma\phi(z_S)\big) + b\big(\mu(1 - \Phi(z_S)) + \sigma\phi(z_S)\big).$$

When demand follows a lognormal distribution with a location parameter of $\mu$ and a scale parameter of $\sigma$, the expected cost function is written as follows:

$$E(C \mid \log N(\mu, \sigma)) = hSF_{l+1}(S) - bS(1 - F_{l+1}(S)) - he^{\mu + \sigma^2/2}\Phi(z_S - \sigma) + be^{\mu + \sigma^2/2}(1 - \Phi(z_S - \sigma)).$$
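To illustrate the base stock result for normal demand, the sketch below computes the optimal base stock level from the critical ratio $b/(b + h)$ and evaluates the expected cost with the closed form above. All input values are hypothetical and are not taken from the text; `mu` and `sigma` are assumed to describe total demand over $l + 1$ periods, and the function name is a helper introduced here.

```python
from scipy.stats import norm


def expected_cost_normal(S, h, b, mu, sigma):
    """Expected holding plus backorder cost per period for base stock level S."""
    z = (S - mu) / sigma
    F = norm.cdf(z)
    return (
        h * S * F
        - b * S * (1 - F)
        - h * (mu * F - sigma * norm.pdf(z))
        + b * (mu * (1 - F) + sigma * norm.pdf(z))
    )


# Hypothetical inputs (assumptions for illustration only)
h, b = 1.0, 10.0            # holding and backorder cost per unit per period
mu, sigma = 300.0, 90.0     # demand over l + 1 periods

S_star = norm.ppf(b / (b + h), mu, sigma)
cost_star = expected_cost_normal(S_star, h, b, mu, sigma)
print(f"S* = {S_star:.1f} units, expected cost at S* = {cost_star:.2f}")
```

At $S^*$ the first two terms of the cost expression cancel, so the expected cost reduces to $(h + b)\sigma\phi(z_{S^*})$; evaluating the function at nearby values of $S$ is a quick way to confirm that the cost increases on both sides of $S^*$.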
References

Alan, Y., Gao, G. P., & Gaur, V. (2014). Does inventory productivity predict future stock returns? A retailing industry perspective. Management Science, 60(10), 2416–2434.
Anderson, E. T., Fitzsimons, G. J., & Simester, D. (2006). Measuring and mitigating the costs of stockouts. Management Science, 52(11), 1751–1763.
Clark, A. (1958). A dynamic, single-item, multi-echelon inventory model (RM-2297). The Rand Corporation, Santa Monica, California.
Connors, W., & Terlep, S. (2013). BlackBerry stuck with $1 billion in unsold phones. Wall Street Journal.
Gallego, G., Katircioglu, K., & Ramachandran, B. (2007). Inventory management under highly uncertain demand. Operations Research Letters, 35(3), 281–289.
Gaur, V., Fisher, M. L., & Raman, A. (2005). An econometric analysis of inventory turnover performance in retail services. Management Science, 51(2), 181–194.
Gavish, B., & Graves, S. C. (1980). A one-product production/inventory problem under continuous review policy. Operations Research, 28(5), 1228–1236.
Gustafson, K. (2015). Retailers are losing $1.75 trillion over this. CNBC. https://www.cnbc.com/2015/11/30/retailers-are-losing-nearly-2-trillion-over-this.html
Hadley, G., & Whitin, T. M. (1963). Analysis of inventory systems. Prentice-Hall.
Hassan, J. (2021). Footage of Amazon destroying thousands of unsold items in Britain prompts calls for official investigation. The Washington Post. https://www.washingtonpost.com/world/2021/06/23/amazon-uk-warehouses-destroy-items/
Judd, K. L. (1998). Numerical methods in economics. The MIT Press.
Kapner, S. (2015). Struggling Toys R Us tries fuller stores. Wall Street Journal.
Morgenson, G., & Rizzo, L. (2018). Who killed Toys R Us? Hint: It wasn’t only Amazon. Wall Street Journal.
Raman, A., Gaur, V., & Kesavan, S. (2006). David Berman. Harvard Business School Case#: 9605-081.
Scarf, H. E. (1958). A min-max solution of an inventory problem. In K. Arrow, S. Karlin, & H. Scarf (Eds.), Studies in the mathematical theory of inventory and production (pp. 201–209).
Soliman, M. T. (2008). The use of DuPont analysis by market participants. The Accounting Review, 83(3), 823–853.
4 Uncertainty Modelling

There is no fish in clear water.
A Chinese proverb (Qĩng shuì wú yú)
Keywords
Ontological uncertainty · Evolutionary demand models · Integration of multiple uncertain elements · Fast Fourier transform (FFT) · Demand regularization
The ultimate objective of organizations is to create value for their customers, which in turn helps them make profits (Coase, 1937). While organizations attempt to fulfil this objective, they have to manage different uncertainties. The value of their offerings may change in the future such that customers would undervalue or overvalue them over time. The processes carried out by organizations may be affected by some external factors, which would be very difficult to keep under control. Therefore, the value of product or service offerings would be uncertain in the future, as well as the production and processing costs. It has been well established in the economics literature (Guiso & Parigi, 1999) that uncertainty leads to more investments and higher growth in competitive markets, whereas monopolistic firms hold back from investing in new projects under high demand uncertainty. We often observe these dynamics in practice. When Jeff Bezos first founded Amazon.com as an online bookstore, for example, he struggled very much to raise capital. In a fireside chat (Kadakia, 2020), he said that all investors asked him what the Internet was. At that time, the Internet was unknown to most people; and hence, the future of the online business was highly uncertain. Even Mr. Bezos admitted that he could not foresee Amazon as one of the most valuable firms in the world when he first founded the company.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_4
What would have happened if there had been no uncertainty about the future of the Internet in the 1990s? In that case, the Amazon.com project would have collapsed at the very beginning. Cash-rich investors or companies (e.g. Walmart) could have come up with a better product and never let Amazon grow if they had foreseen the future of the Internet in the 1990s. If there were no uncertainty, wealthy investors would compete to acquire important resources and develop the key technologies that customers will value in the future. Thus, it would not be possible for a cash-constrained entrepreneur to start up a business and grow her company because all precious resources and technologies would be under the control of wealthy investors. Uncertainty fosters entrepreneurship, and it is beneficial for small firms and start-ups, whereas it can cause big firms to lose their market power.

Supply chains are exposed to different types of uncertainty, such as demand uncertainty, supply uncertainty and cash flow uncertainty. Both scholars and practitioners pay close attention to uncertainties in supply chains. Mathematical models, traditionally developed by supply chain scholars, often incorporate uncertainties in their solution approaches. Practitioners use such mathematical models (embedded in ERP systems) and also apply detailed scenario analysis to understand potential consequences of their decisions given uncertainties in supply chains. In this chapter, we extend traditional models by introducing the uncertainty modelling approach to the reader, which is built on the economics of uncertainties (Gatheral, 2006; Laffont, 1989; Lane & Maxfield, 2005). Economists have long classified uncertainty into three different groups: (1) truth uncertainty, (2) epistemological uncertainty and (3) ontological uncertainty (Lane & Maxfield, 2005).
Suppose that decision-makers can outline all potential outcomes of their decisions; however, there is still uncertainty about which outcome will be realized in the future. This type of uncertainty is referred to as truth uncertainty. Epistemological uncertainty is observed when there is a miscommunication between agents such that information is interpreted in different ways by them. Suppose that a company develops a new product to meet a certain demand in the market. The functionality and use of the product may not be truly known by all potential customers. Some customers may not be aware of important features of the product so the company cannot sell it to those customers. Some customers may find the product too complicated although this is not the case. Demand uncertainty can be attributed to the epistemological uncertainty in such cases. If the availability and clarity of the information are improved by a successful media campaign, the epistemological uncertainty can be reduced, which in turn helps decision-makers predict demand more accurately. Epistemological uncertainty is caused by poor communication among agents or agents’ tendency to distort information. The former can be resolved by improving information transfer between agents. The latter, however, can be resolved by analysing and engineering agents’ incentives to distort information (Candogan, 2020). To better understand motives of agents to distort information, imagine a retailer who has five units of a product in stock. Two of them are in excellent condition, while the rest are in average condition. Customers are willing to buy
only those that are in excellent condition, but they cannot identify them. Suppose that customers have the information that there are two items in excellent condition, but they don’t know which items they are. Customers buy an item only when its probability of being in excellent condition is more than 50%. If the retailer shares the information about products truthfully, customers will learn which products are in excellent condition. Then, the retailer can only sell two items. In that case, the retailer’s payoff is equal to two. If the retailer disguises the information about products, the probability of being in excellent condition is 2/5 = 40% for each product. In that case, customers do not buy anything so the retailer’s payoff becomes equal to zero. If the retailer creates a bundle of three items that includes the two excellent items and one item in average condition, the probability of being in excellent condition becomes equal to 2/3 = 66.7% for each product in the bundle. When the retailer offers such a bundle to customers, they tend to buy all three items because the probability is more than 50%. In that case, the retailer can increase the payoff to three. Therefore, the retailer is incentivized to distort information to maximize her own payoff, which in turn contributes to epistemological uncertainty.

The last type of uncertainty is ontological uncertainty, which occurs when there is uncertainty about the identity of agents and their interactions with each other (Lane & Maxfield, 2005). Suppose that the R&D department of a company develops a new product that has some primary and secondary functionalities. It turns out that the secondary functionality of the product becomes much more popular than the primary functionality. Thus, demand projections outlined at the very beginning turn out to be wrong because actual customer segments are different from those targeted at the beginning.
In this case, the demand uncertainty at the beginning of product life cycle can be attributed to ontological uncertainty. For instance, the pharmaceutical giant Pfizer had developed Viagra as a medicine for hypertension treatment (see https://en.wikipedia.org/wiki/Sildenafil for more detailed information about the history of the medicine). Later it was introduced as a medicine for treating sexual dysfunction after clinical trials. Thus, cash flow and demand projections at the very beginning had been wrong because they were based on the assumption that patients who could use the medicine are those with hypertension disorders, which is not the case in the end. In supply chain management, the ambiguity of target customer segments and their interactions with each other cause ontological uncertainty about the demand. Even if customer segments and their interactions are well defined in the present time, they may change in the future. Therefore, ontological uncertainty can be resolved by correctly identifying agents and their interactions and understanding the evolutionary dynamics of uncertainty. Traditional mathematical models in supply chain management (e.g. those discussed in the previous chapter) attempt to prescribe an optimal set of actions when the demand uncertainty belongs to the type of truth uncertainty. In this case, the demand can be represented by a probability distribution or a stochastic process. Agents’ motivations and incentives to distort information fall into the category of information transfer design problems (please see Candogan, 2020 and the references therein for more information about information transfer design). Thus, epistemological uncertainty must be investigated under this category in the
extant literature. Finally, the uncertainty modelling approach addresses ontological uncertainty in two ways: (1) modelling the evolutionary dynamics of uncertainty and (2) incorporating the elements of uncertainty in a unified model, which we cover in this chapter.

4.1 Uncertainty Modelling Versus Demand Forecasting
To clarify the distinction between uncertainty modelling and demand forecasting, we first question their objectives. Demand forecasting aims to predict future demand by using some predictive analytics methods, such as time series analysis, linear models, regression trees, logit model, etc. In the predictive analytics approach, the dependent variable (e.g. monthly demand for a product) and some explanatory variables are first identified. Then, one of the predictive analytics models is fit to empirical data, which relates the dependent variable to the explanatory variables. The demand distribution used as an input in inventory models is indeed an outcome of the developed predictive analytics model. The problem with this approach is that each model is built on some restrictive assumptions. For example, one of the assumptions in a linear regression model is that there is a linear relationship between the dependent and explanatory variables. Such assumptions are not often relevant in practice. Even if they are relevant, the predictive analytics can still be useless due to inaccuracy or the lack of explanatory variables. Suppose that a retailer aims to predict soda demand for the next summer, which is mainly affected by the weather. The problem here is that it is not possible to feed the model with accurate temperature values several months in advance. Thus, even when a good predictive analytics model is fit to historical data, the model cannot be used to predict future values due to the absence of values for the explanatory variables for the future states. Uncertainty modelling takes a different approach. Instead of aiming to predict future demand directly, it questions how the demand data is generated. To understand the importance of uncertainty modelling, let us consider a manufacturer that receives bulk orders for one of its products. One customer order is received every 3 weeks. 
The quantity demanded in the first order is 750 units, and the quantity increases by 750 units for each new order. Thus, the customer’s demand is 750 units in week 3, 1500 units in week 6, 2250 units in week 9, 3000 units in week 12, and so on. Demand lead time is assumed to be zero. What does the manufacturer see 5 months after the product is launched? The manufacturer sees the monthly demand values of 750, 1500, 5250, 3750 and 4500 units for the first 5 months. We remark that the orders received in weeks 9 and 12 constitute the third month’s demand, which amounts to 5250 units. Thus, demand values for the first 5 months have a mean of 3150 units and a standard deviation of 1941 units. Demand forecasting methods fail to return an accurate estimate due to high uncertainty. Even time series models that take into account the autocorrelation between consecutive demand values would fail because of the demand peak in the third month. But all uncertainty disappears when the decision-maker looks at how the demand data is generated from the customer order.
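The aggregation in this example is easy to reproduce. The sketch below generates the order stream (one order every 3 weeks, growing by 750 units each time) and sums it into monthly buckets. It assumes that each month is treated as exactly four weeks, which is how the monthly figures in the example arise; the constant and function names are introduced here for illustration.

```python
from statistics import mean, stdev

WEEKS_PER_MONTH = 4     # assumption: each month is treated as four weeks
ORDER_INTERVAL = 3      # one order every three weeks
STEP = 750              # each new order is 750 units larger than the last


def monthly_demand(n_months):
    """Aggregate the deterministic order stream into monthly demand values."""
    totals = [0] * n_months
    week, k = ORDER_INTERVAL, 1
    while week <= n_months * WEEKS_PER_MONTH:
        month = (week - 1) // WEEKS_PER_MONTH    # zero-based month index
        totals[month] += STEP * k
        week += ORDER_INTERVAL
        k += 1
    return totals


monthly = monthly_demand(5)
print(monthly)
print(f"mean = {mean(monthly):.0f}, sample std = {stdev(monthly):.0f}")
```

The order stream itself is fully deterministic; the apparent volatility comes only from the mismatch between the 3-week order cycle and the monthly aggregation window.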
The uncertainty modelling approach differs from predictive analytics in the following way: Predictive analytics addresses the question of what will happen in the future. Therefore, it is forward-looking in the sense that it aims to predict the future demand. However, uncertainty modelling addresses the question of how data is generated. For that reason, it is backward-looking such that it aims to understand the dynamics of demand generation to characterize the demand uncertainty. It originates from the finance literature, where other terms such as volatility modelling and volatility surface modelling are used instead of uncertainty modelling (Gatheral, 2006). Wald’s survivorship bias would probably be the most famous example showing the importance of the uncertainty modelling approach (Biçer et al., 2022). During World War II, Abraham Wald was part of a research group that investigated postmission aircraft to determine what areas of the aircraft to strengthen. Strengthening the whole aircraft would have been costly because it would lead to higher fuel consumption and limited manoeuvring capability. Therefore, the research group was tasked with determining the most important areas to strengthen. They first suggested armouring the frequently struck areas. Wald challenged this suggestion and recommended strengthening the areas that had not been struck because the planes that were hit there had likely been shot down. He convinced the military officers to change their decision, which was only possible by looking at the data generation process. The value of uncertainty modelling magnifies in supply chains with sequential decisions. When it is possible to place multiple orders during the production horizon, modelling the uncertainty and its evolutionary dynamics is necessary to improve the decisions. 
This, in turn, helps decision-makers estimate the rate at which demand uncertainty is resolved over time and determine how much to order now and how much to postpone to later production periods. Uncertainty modelling also makes it possible to use transactional data without aggregating and squeezing valuable demand information. Figure 4.1 shows how to integrate the uncertainty model with the supply chain management system. The primary source of information in demand forecasting is the aggregated demand values, which are available in the demand fulfilment datasets. However, it is transactional data for uncertainty modelling, which can be found in the order management datasets. For this reason, uncertainty modelling is better linked to market dynamics. Customer demand is formed after a series of events. Initially, customers search for products to fulfil some specific needs. They collect information about potential products from different sources and then place an order to buy one of them. The customer journey that starts with collecting information and ends with processing the purchase order is known as the “path to conversion” in the marketing literature. Marketing science addresses the problems associated with these steps. The next step is the order management that falls into supply chain analytics. The customer orders translate into the supply and logistics services that are performed to fulfil customer demand on time. For downstream activities, logistics planning is done based on the customer order information. For example, a customer specifies the product, desired quantity and time of delivery upon placing an order. Using this
information, outbound logistics activities are planned such that the items demanded are delivered to the customers in full and on time. For upstream activities, planning the operations is done based on the demand fulfilment dataset if demand forecasting models are utilized. For example, a manufacturer forms monthly production plans based on monthly demand forecasts. These forecasts are obtained from forecasting tools that use some explanatory variables to predict the demand values. To estimate the parameters of the forecasting model, historical monthly demand values are used, which are found by aggregating the customer demand fulfilled within each month. Uncertainty modelling aims to estimate the inflow of customer orders and translate this information into demand information. Suppose, for example, each customer order specifies the product, the demand lead time and the quantity demanded. For future periods, the manufacturer does not know how many customer orders will be received for one of the products, how many units are demanded for each order and when delivery is requested. Therefore, the number of customer orders received would be uncertain for a specific product. For each order, the demand lead time and quantity would also vary significantly. In such a case, there are three uncertain parameters (i.e. the number of orders, demand lead time and quantity demanded for each order). An uncertainty modelling approach combines these elements with the correct specifications to characterize the demand uncertainty. The main advantage of uncertainty modelling is that the uncertain parameters can be linked to market data (collected from customers) more easily than the aggregate demand value. Although the number of uncertain parameters is higher in uncertainty modelling than in demand forecasting, the level of cumulative uncertainty about the demand is reduced when information from the market is used effectively.

Fig. 4.1 The role of demand modelling and demand forecasting in supply chain analytics. Red arrows indicate the information flow, whereas the black arrow indicates a physical process
This approach also adds evolutionary dynamics to the model due to the variability in demand lead times. When the demand lead time is positive, there is a time lag
between the instant when an order is received and when it is delivered to the customer. In practice, demand lead times vary across customer orders, and some customers request their orders to be delivered after a long demand lead time. This information has a positive impact on improving the future demand estimates, which are updated over time as orders are received. The forecast updating mechanism that comes from an uncertainty model of demand determines the evolutionary dynamics. Another advantage of uncertainty modelling is its superior performance over demand forecasting when demand uncertainty is extremely high. For example, demand forecasting tools failed to estimate demand peaks after lockdowns during the first year of the Covid-19 pandemic. Demand at non-essential retailers dropped to zero during the lockdowns. Demand forecasting tools (e.g. time series models) used this information wrongly and predicted a decrease in customer demand for future periods. However, demand values being equal to zero during lockdowns did not indicate low demand in the future but more customers who postponed their purchases. After the stores were reopened, these customers made their delayed purchases, causing demand peaks. Uncertainty modelling is very powerful in such cases because it focuses on characterizing the inflow of customer orders. The main disadvantage of uncertainty modelling is its complexity because it has to deal with more than one uncertain parameter. For demand forecasting, there is only one uncertain parameter, which is the demand value for a product. In the example above, however, there are three uncertain parameters for the uncertainty model such as the number of distinct orders, the quantity and the demand lead time for each order. Therefore, the uncertainty modelling approach should correctly use these parameters and translate them into future demand estimates. 
Traditionally, uncertainty models have been based on an implicit assumption that there exists human intervention. Demand planners analyse the new information coming from the customers and update the demand forecasts accordingly. Based on this type of forecast updating mechanism, uncertain parameters related to the order management system, such as the number of orders, the demand lead time and the quantity for each order, are not directly incorporated into the demand model. Demand planners interpret the historical and recent values of these parameters to update demand forecasts. Therefore, uncertainty models would only deal with how to model the evolution of demand forecasts. There are two distinct forms of the traditional models. The first one is the additive model, where demand forecasts are updated additively. The second one is the multiplicative model, where demand forecasts are updated multiplicatively. Traditional models are very useful for quantifying uncertainty and assessing the value of supply chain responsiveness in an evolutionary setting, where decision-makers make sequential decisions. However, it is not possible to develop a standalone decision support tool with them. More recently, uncertainty models have been built on machine learning regularization techniques and complex stochastic diffusion processes. Due to these efforts, the intervention of demand planners is reduced because the models based on such machine learning and stochastic diffusion techniques directly incorporate the order management dynamics and translate this information into operational decisions.
4.2 Evolutionary Dynamics of Uncertainty
The demand parameter in the inventory models reviewed in the previous chapters is uncertain such that the uncertainty is captured by a probability distribution of demand. Analytical results for the optimal order quantities include a term of the probability distribution. Once the demand value that corresponds to the critical fractile is estimated, the optimal order quantity can be found. Suppose that the critical fractile is 0.5 in a newsvendor model. If the demand value that corresponds to the 50th percentile of demand distribution is 150 units, the optimal order quantity is equal to 150 units. In practice, however, ordering decisions can be postponed, cancelled or partially made. For example, because demand forecasts evolve, buyers sometimes make sequential ordering decisions (Wang et al., 2012) rather than committing to a single bulk order well in advance of the realization of actual demand. In this case, the parameters of the demand distribution would change over time, making the characterization of the demand uncertainty very important to derive the optimal order quantities. Consider a fashion apparel retailer that sells blue and red shirts in the spring season. The production starts in September and continues until the end of February. Total monthly production capacity is 1000 units. Both shirts have independent and identical demand distributions that vary between 0 and 6000 units with a mean of 3000 units. Based on this information, the decision-maker determines the production plan as scheduled in Table 4.1. Actual demand observed in March is 4000 units for blue and 2000 units for red shirts. Based on the production plan given in Table 4.1, the amount produced is 3000 units for each shirt. Therefore, the company incurs lost sales of 1000 units of blue shirts and excess inventory of 1000 units of red shirts.
Now, suppose that the company collects demand information from the market at the end of December, which reveals that the expected demand is 3500 units for blue and 2000 units for red shirts. Then, the production plan for the last 2 months can be revised to better match production quantities with demand as given in Table 4.2. After revising the production plan based on the updated demand information, the company can reduce lost sales by 500 units of blue shirts and fully avoid excess inventory of red shirts.

Table 4.1 Production plan for the shirts. Monthly capacity is 1000 units

| Product | September | October | November | December | January | February |
|---|---|---|---|---|---|---|
| Blue shirt | 500 | 500 | 500 | 500 | 500 | 500 |
| Red shirt | 500 | 500 | 500 | 500 | 500 | 500 |
Table 4.2 Revised production plan after collecting demand information at the end of December

| Product | September | October | November | December | January | February |
|---|---|---|---|---|---|---|
| Blue shirt | 500 | 500 | 500 | 500 | 750 | 750 |
| Red shirt | 500 | 500 | 500 | 500 | 0 | 0 |
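The effect of revising the plan can be checked with a few lines of arithmetic. The sketch below simply totals each production plan and compares it with the actual March demand stated in the example; the helper names `mismatch` and `evaluate` are illustrative and not part of the text.

```python
def mismatch(produced, demand):
    """Return (lost sales, excess inventory) for one product."""
    lost_sales = max(demand - produced, 0)
    excess = max(produced - demand, 0)
    return lost_sales, excess

# Actual demand observed in March.
actual = {"blue": 4000, "red": 2000}

# Monthly production quantities, September to February.
original_plan = {"blue": [500] * 6, "red": [500] * 6}
revised_plan = {"blue": [500] * 4 + [750] * 2, "red": [500] * 4 + [0] * 2}

def evaluate(plan):
    """Compare the total quantity produced under a plan with actual demand."""
    return {product: mismatch(sum(plan[product]), actual[product]) for product in plan}

original_result = evaluate(original_plan)
revised_result = evaluate(revised_plan)
print("original:", original_result)
print("revised: ", revised_result)
```

Under the original plan, each tuple reports the lost sales and excess inventory of 1000 units described in the text; under the revised plan, only 500 units of lost sales for blue shirts remain.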
The example illustrates the role of ordering flexibility in mitigating the mismatches between supply and demand. In practice, companies often have ordering flexibility to some extent. Instead of fully committing to production quantities in September for shirts that will be sold 6 months later, fashion apparel retailers, for example, often collect information from the market and update production quantities accordingly. In such a setting, decision-makers have to make sequential decisions, such as the quantity to order at the beginning of the production period and the adjusted quantity depending on the evolution of the demand forecasts. Therefore, the information updating mechanism has a direct impact on the structure of the sequential decision-making process. If the retailer fails to collect valuable demand information to improve demand forecasts, it would make sense to determine the production quantities at the very beginning and not revise them over time. However, if the retailer collects information, it would be better off ordering in small quantities at the beginning and postponing large order quantities until after the valuable demand information has been collected. Modelling uncertainty is thus critical to developing an ordering strategy so decision-makers have some flexibility to revise production plans. Unlike demand forecasting, the primary purpose of uncertainty modelling is not to find the best demand forecast. Instead, it aims to model the evolution of demand risk over time as decision-makers collect valuable information regarding market demand. For the fashion apparel retailer, forecasting models aim to find the most accurate estimate for the demand that will occur during the spring season. In contrast, uncertainty modelling aims to fully characterize the uncertainty in each of the months starting from September. Once the uncertainty is fully characterized, demand forecasts are naturally obtained as an outcome of the uncertainty model.
4.2.1 Additive Demand Models
Demand uncertainty can be characterized by an additive model when new information is included in the model additively. Imagine a manufacturer that uses advance orders and other information collected from its customers to update its demand forecasts over time. Demand planners interpret such information and determine the demand forecasts. The evolution of demand forecasts follows an additive demand model. Let $D_i$ denote the demand forecast at time $t_i$ for $i \in \{0, 1, \cdots, n\}$. The actual demand value is observed at time $t_n$. In the fashion apparel example given in Table 4.1, actual demand is observed in March, and demand forecasts evolve from September to the following March. Therefore, $t_0$ = September, $t_1$ = October, $t_2$ = November, $t_3$ = December, $t_4$ = January, $t_5$ = February, and $t_6$ = March. Thus, $n = 6$ because there are six decision epochs before observing the final demand. In the additive demand model, forecasts are updated as follows:

$$D_i = D_0 + \mu(t_i - t_0) + \epsilon_1 + \epsilon_2 + \cdots + \epsilon_i,$$

for $i \in \{1, \cdots, n\}$, where the $\epsilon$ terms follow a normal distribution:

$$\epsilon_i \sim N(0, \sigma\sqrt{t_i - t_{i-1}}).$$

The term $\mu$ is the drift rate for the demand and $\sigma$ is the volatility parameter. The actual demand $D_n$, conditional on the forecast $D_i$, is then formalized as follows:

$$D_n \mid D_i \sim N(D_i + \mu(t_n - t_i), \sigma\sqrt{t_n - t_i}).$$

The last expression states that demand uncertainty decreases over time as $t_i$ approaches $t_n$. The drift rate $\mu$ captures the upward and downward trends in the underlying demand evolution. For example, demand forecasts based on expert knowledge may be biased to be less than the actual demand values. A positive drift captures such a bias in demand forecasts. Likewise, a negative drift captures the bias that demand forecasts are more than the actual demand values. In Fig. 4.2, we present a sample path for the evolution of demand forecasts according to an additive demand model. We set the initial demand forecast to $D_0 = 1$, the drift rate to $\mu = 0$ and the volatility parameter to $\sigma = 1$. The time interval $t_n - t_0$ is normalized to one. The solid blue curve represents the demand forecasts, and its value for $t = 1$ is the actual demand. The shaded area shows the 95% confidence interval.

Fig. 4.2 Simulation of the evolution of demand forecasts with parameters $D_0 = 1$, $\mu = 0$ and $\sigma = 1$

As the time approaches one, the distance between the upper and lower
bounds of the confidence interval decreases. This results from the fact that demand uncertainty decreases as the realization of actual demand gets closer. The figure also shows that the lower bound of the confidence interval is negative for the majority of the time when $t < 0.75$. However, demand values cannot be negative in practice. This observation indicates the problem that additive models may provide a poor fit to empirical demand values when the forecast horizon is long and the volatility is high. In such cases, multiplicative models should be preferred over the additive models to overcome this problem. As an example, we now consider a retailer that can order products from a supplier at the beginning of the forecasting horizon. The selling price of the product is \$10 per unit. The unit cost is \$5. Unsold units can be sold at a discount store at \$2 per unit. The initial demand forecast at the very beginning is 100 units with a drift rate of zero and a volatility of 100 units. The critical fractile for the newsvendor order quantity is $(p - c)/(p - s) = 0.625$. The length of the forecasting horizon is set equal to one. Then, the optimal order quantity is:

$$Q^* = D_0 + \mu(t_n - t_0) + z^*\sigma\sqrt{t_n - t_0} = 100 + 100 \times \Phi^{-1}(0.625) \approx 132.$$

We remark that $\mu$ and $\sigma$ denote the drift rate and volatility parameters in this expression. In the previous chapter, these symbols denoted the parameters of the normal distribution. The normal distribution parameters in this example are $D_0 + \mu(t_n - t_0)$ and $\sigma\sqrt{t_n - t_0}$, with $\mu$ and $\sigma$ denoting the drift rate and volatility, respectively. Suppose the supplier offers the retailer the opportunity to postpone the ordering decision until $t_i = 0.5$ such that the order will be placed right in the middle of the forecasting horizon. Let $D_i$ be equal to 100 units. Then, the optimal order quantity is:

$$Q^* = D_i + \mu(t_n - t_i) + z^*\sigma\sqrt{t_n - t_i} = 100 + 100 \times \sqrt{0.5} \times \Phi^{-1}(0.625) \approx 123.$$

Therefore, the order quantity is reduced from 132 units to 123 units when the ordering decision is postponed, given that the demand forecasts are assumed to be the same at $t_0$ and $t_i$. We also remark that the retailer tends to order excessively under uncertainty because the critical fractile is relatively high. For that reason, postponing the ordering decision helps reduce the excess inventory risk in this example. When the critical fractile is relatively low, postponing the ordering decision reduces the shortage risk. In such a case, we would expect an increase in the order quantity when the ordering decision is postponed.
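The two order quantities of this example can be reproduced numerically. The following is a minimal sketch that assumes SciPy is available for the inverse normal distribution function; the function name `additive_order_quantity` is illustrative. A salvage value of \$2 per unit is used because it is the value for which $(p - c)/(p - s)$ equals the critical fractile of 0.625 used in the example.

```python
from math import sqrt
from scipy.stats import norm

def additive_order_quantity(forecast, mu, sigma, time_left, p, c, s):
    """Newsvendor quantity when end demand is N(forecast + mu*T, sigma*sqrt(T))."""
    fractile = (p - c) / (p - s)          # critical fractile
    z_star = norm.ppf(fractile)           # standard normal quantile
    return forecast + mu * time_left + z_star * sigma * sqrt(time_left)

# Order placed at the beginning of the horizon (time left = 1).
q_start = additive_order_quantity(100, 0.0, 100, 1.0, p=10, c=5, s=2)
# Order postponed to the middle of the horizon (time left = 0.5).
q_mid = additive_order_quantity(100, 0.0, 100, 0.5, p=10, c=5, s=2)

print(round(q_start, 2), round(q_mid, 2))
```

Rounding the two results up to whole units gives the quantities reported in the text. Changing `time_left` shows how the safety margin above the forecast shrinks with the square root of the remaining time.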
4.2.2 Multiplicative Demand Models
One of the drawbacks of the additive demand model is that it assigns positive probability densities to negative realizations. In Fig. 4.2, for example, the probability that demand turns out to be less than zero (practically, this is impossible) is high enough to misguide decision-makers for long lead times. To correct for this, the evolution of demand forecasts can be modelled as a multiplicative process. In the multiplicative demand model, forecasts are updated multiplicatively according to the process:

$$D_i = D_0\, e^{\mu(t_i - t_0) + \epsilon_1 + \epsilon_2 + \cdots + \epsilon_i},$$

for $i \in \{1, \cdots, n\}$, where the $\epsilon$ terms follow a normal distribution:

$$\epsilon_i \sim N(-\sigma^2 (t_i - t_{i-1})/2, \sigma\sqrt{t_i - t_{i-1}})$$

with $\sigma$ being the volatility parameter. Therefore, $e^{\epsilon_i}$ follows a lognormal distribution, and its expected value is equal to one. Then, the end demand, conditional on the demand forecast at time $t_i$, follows a lognormal distribution:

$$D_n \mid D_i \sim \log\text{-}N(\ln(D_i) + (\mu - \sigma^2/2)(t_n - t_i), \sigma\sqrt{t_n - t_i}),$$
for $i \in \{0, \cdots, n - 1\}$. The expression indicates that demand volatility decreases as $t_i$ approaches $t_n$. As in the additive demand model, this captures the real-world dynamics that the shorter the forecasting horizon, the more accurate the demand forecasts. In Fig. 4.3, we present a sample path for the evolution of demand forecasts according to a multiplicative model. We assume that $D_0 = 1$, $\mu = 0$ and $\sigma = 1$. The time interval $t_n - t_0$ is normalized to one. The solid curve represents the demand forecasts such that its value for $t = 1$ gives the actual demand value. The shaded area shows the 95% confidence interval. In contrast to the additive demand model, the lower bounds can never be negative in the multiplicative model. Thus, the multiplicative model fits the empirical data better than the additive one when the forecast horizon is long and volatility is high. As an example, we again consider the retailer that can order products from a supplier at the beginning of the forecasting horizon. The selling price is \$10 per unit, the cost \$5 per unit, and the salvage value \$2 per unit of unsold inventory. The initial forecast is 100 units. The drift rate and volatility parameter for the multiplicative model are zero and one, respectively. The critical fractile is equal to 0.625. The length of the forecasting horizon is set equal to one. The optimal order quantity at the beginning of the forecasting horizon is then:

$$Q^* = e^{\ln(D_0) + (\mu - \sigma^2/2)(t_n - t_0) + z^*\sigma\sqrt{t_n - t_0}} = e^{\ln(100) - 0.5 + \Phi^{-1}(0.625)} \approx 84.$$

Fig. 4.3 Simulation of the evolution of demand forecasts according to the multiplicative demand model with parameters $D_0 = 1$, $\mu = 0$ and $\sigma = 1$
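Sample paths such as those in Figs. 4.2 and 4.3 can be generated with a short simulation. The sketch below is an illustration, not the code behind the figures: the number of time steps, the seed and the normal approximation of 1.96 for the 95% interval are assumptions. The same normal shocks drive both models, and in the multiplicative case each shock is shifted by $-\sigma^2 \Delta t/2$ so that its exponential has an expected value of one.

```python
import numpy as np

D0, MU, SIGMA = 1.0, 0.0, 1.0    # parameters used for the figures
N_STEPS = 100                    # number of forecast updates (assumed)

rng = np.random.default_rng(0)
dt = 1.0 / N_STEPS
t = np.linspace(0.0, 1.0, N_STEPS + 1)

# Common normal shocks with standard deviation sigma * sqrt(dt).
shocks = rng.normal(0.0, SIGMA * np.sqrt(dt), N_STEPS)

# Additive model: forecasts move by the drift plus the sum of the shocks.
additive = D0 + MU * t + np.concatenate(([0.0], np.cumsum(shocks)))

# Multiplicative model: shocks enter the exponent with mean -sigma^2 * dt / 2.
log_steps = shocks - 0.5 * SIGMA**2 * dt
multiplicative = D0 * np.exp(MU * t + np.concatenate(([0.0], np.cumsum(log_steps))))

# Lower bounds of the 95% interval for end demand, given the current forecast.
time_left = 1.0 - t
z = 1.96
add_lower = additive + MU * time_left - z * SIGMA * np.sqrt(time_left)
mult_lower = multiplicative * np.exp(
    (MU - 0.5 * SIGMA**2) * time_left - z * SIGMA * np.sqrt(time_left)
)

print("additive lower bound at t=0:      ", round(add_lower[0], 3))
print("multiplicative lower bound at t=0:", round(mult_lower[0], 3))
```

The additive lower bound starts below zero, whereas the multiplicative lower bound stays positive along the whole path, which is the contrast the two figures are meant to show.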
Suppose again that the supplier offers the retailer an option to postpone the ordering decision until $t_i = 0.5$. Let $D_i$ be the same as the initial demand forecast: $D_i = 100$ units. The optimal order quantity then becomes:

$$Q^* = e^{\ln(D_i) + (\mu - \sigma^2/2)(t_n - t_i) + z^*\sigma\sqrt{t_n - t_i}} = e^{\ln(100) - 0.5 \times 0.5 + \Phi^{-1}(0.625) \times \sqrt{0.5}} \approx 98.$$

Therefore, the optimal order quantity increases from 84 units to 98 units when the ordering decision is postponed. In this example, we simply look at how the order quantity is affected by the order timing. If we analyse the change in expected profit after the ordering decision is postponed, we would price the value of reducing the lead time, which will be done in the next chapter. Comparing the results of this example between the additive and multiplicative models, the reader would notice two important findings. First, order quantities are higher than the initial demand estimate for the additive case, while they are lower than the initial demand estimate for the multiplicative demand model. Second, the optimal order quantity decreases from 132 units to 123 units in the additive model when the ordering decision is postponed from $t = 0$ to $t = 0.5$. But, it increases from 84 units to 98 units in the multiplicative model when the ordering decision is postponed. These observations are explained as follows: The marginal demand distribution is the normal distribution for the additive model. The normal distribution is symmetric around the mean, and its median value is the same as the mean. A critical fractile higher than 0.5 indicates that the optimal order quantity must be higher than the mean demand. In our example, the critical fractile is 0.625, and the mean demand is the same as the initial demand estimate given that the drift rate is equal to zero. For this reason, the optimal order quantity becomes higher than the initial demand estimate. When the ordering decision is postponed, the optimal order quantity gets closer to the mean value due to the reduction of demand uncertainty. This also explains why the order quantity decreases for the additive model after postponing the ordering decision. In the multiplicative model, the marginal demand distribution is the lognormal distribution, which is positively skewed. Thus, the mean value is higher than the median for the lognormal distribution. The median demand for the multiplicative demand parameters of our example is equal to:

$$e^{\ln(D_0) + (\mu - \sigma^2/2)(t_n - t_0)} = e^{\ln(100) - 0.5} \approx 60.6.$$

The percentile value that corresponds to the mean demand for the lognormal distribution is equal to 0.69:

$$100 = e^{\ln(D_0) + (\mu - \sigma^2/2)(t_n - t_0) + z\sigma\sqrt{t_n - t_0}} = 100\,e^{-0.5 + z},$$

$$z = 0.5, \quad \Phi(z) = \Phi(0.5) = 0.69.$$

This value is higher than the critical fractile of our example (i.e. 0.625). Thus, the optimal order quantity for the multiplicative model must be between the median (i.e. 60.6) and mean (i.e. 100) demand values. For this reason, the optimal order quantities are lower than the initial demand estimate. When the ordering decision is postponed, the optimal order quantity gets closer to the mean demand due to the reduction in demand uncertainty. Thus, the order quantity increases from 84 to 98 units when the ordering decision is postponed.
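The quantities discussed for the multiplicative model can be evaluated directly from the lognormal expression for end demand. The following sketch assumes SciPy for the normal distribution functions; the function name `multiplicative_order_quantity` is illustrative. It evaluates the order quantity formula term by term, with the drift term multiplied by the remaining time and the volatility term by its square root.

```python
from math import exp, log, sqrt
from scipy.stats import norm

def multiplicative_order_quantity(forecast, mu, sigma, time_left, fractile):
    """Quantile of the lognormal end demand at the critical fractile."""
    z_star = norm.ppf(fractile)
    location = log(forecast) + (mu - sigma**2 / 2) * time_left
    return exp(location + z_star * sigma * sqrt(time_left))

FRACTILE = 0.625

# Order placed at the beginning (time left = 1) and postponed (time left = 0.5).
q_start = multiplicative_order_quantity(100, 0.0, 1.0, 1.0, FRACTILE)
q_mid = multiplicative_order_quantity(100, 0.0, 1.0, 0.5, FRACTILE)

# Median demand at t = 0 and the percentile at which the mean demand sits.
median = exp(log(100) - 0.5)
mean_percentile = norm.cdf(0.5)

print(round(q_start, 2), round(q_mid, 2), round(median, 2), round(mean_percentile, 4))
```

Both order quantities lie between the median and the mean of the demand distribution, and the postponed quantity is the one closer to the mean.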
4.3 Integration of Uncertain Elements in a Unified Model
Organizations collect credible demand information to improve their demand forecasts. Demand is often formed by incorporating different parameters collected from different sources in a single model. Imagine a retailer that sells different products only online. Customers visit product pages before making their purchase decisions. After visiting the online product pages, they choose which products to buy. Therefore, total demand is an outcome of the number of customers that visit the product category webpages and their choice probabilities. In other words, total demand for a product on a specific day is the multiplication of the number of
customers who visited the product category webpages and the probability of them choosing the product on that day. We now assume that a product has been in the market for 5 days. Daily demand values for the item are 140, 40, 150, 60 and 280 units on those days. The mean value and standard deviation of demand are 134 and approximately 95 units, respectively. Thus, there is high demand uncertainty for the product. Over the 5 days, 1750, 500, 2000, 705 and 3500 customers viewed the product category webpages. Then, the choice probabilities are obtained as follows:

$$\frac{140}{1750} = 0.08, \quad \frac{40}{500} = 0.08, \quad \frac{150}{2000} = 0.075, \quad \frac{60}{705} \approx 0.085, \quad \text{and} \quad \frac{280}{3500} = 0.08.$$

The choice probabilities are quite stable, ranging from 7.5% to 8.5%. Hence, most demand uncertainty can be attributed to the uncertainty regarding the number of visits. Based on historical data, the retailer knows that the number of visits follows a lognormal distribution with the location and scale parameters of 7.4 and 0.42. Therefore, the demand is formed as a combination of two uncertain variables such that each has a different number of observations and distributional properties. To correctly combine these two variables, we must avoid using the probability distribution of demand. Instead, we must use the characteristic function of demand. Let $D$ be a random variable denoting the daily demand for the product. It can be formed as a multiplicative combination of two uncertain parameters:

$$D = A \times B,$$

where $A$ denotes the number of visits of the product category webpage and $B$ is the choice probability of the product. The characteristic function of demand is written as follows:

$$\psi_D(\omega) = E[e^{i_m \omega D}],$$

where $i_m = \sqrt{-1}$ is the imaginary number. For our example, the parameters are combined multiplicatively to form the demand. For this reason, we must take the natural logarithm transformation of the demand values to write the characteristic function of the log demand:

$$\psi_{\ln(D)}(\omega) = E[e^{i_m \omega \ln(D)}] = \psi_{\ln(A)}(\omega) \times \psi_{\ln(B)}(\omega).$$

The first term on the right-hand side of this expression is the characteristic function of the log-transformed values of the number of visits, while the second term is the characteristic function of the log-transformed values of the choice probabilities. Because the number of visits follows a lognormal distribution, its log-transformed values follow a normal distribution with a mean of 7.4 and a standard deviation of 0.42. Then,

$$\psi_{\ln(A)}(\omega) = e^{i_m \times 7.4 \times \omega - 0.42^2 \times \omega^2 / 2},$$
which is obtained from the characteristic function formulation of the normal distribution. There are only five observations for the choice probabilities. Thus, the $\psi_{\ln(B)}(\omega)$ term is formulated as an empirical characteristic function:

$$\psi_{\ln(B)}(\omega) = \frac{1}{5} \sum_{j=1}^{5} e^{i_m \omega \ln(p_j)},$$

where $p_j$ is the choice probability on the $j$th day. Combining all these results, the characteristic function representation of the demand process can be formulated easily. We now consider another motivating example in which a retailer sells its products in three different markets. Total demand is formulated additively as the summation of demand values of all markets:

$$D = A + B + C,$$

where $A$, $B$ and $C$ denote demand parameters for each market. Suppose that demand follows a stochastic process in the first market; it follows a statistical distribution in the second market; finally, there are only a few empirical values of demand in the third market. The characteristic function exists for each parameter. Hence, the characteristic function of total demand is formalized as follows:

$$\psi_D(\omega) = \psi_A(\omega) \times \psi_B(\omega) \times \psi_C(\omega).$$

The characteristic function representation makes it possible to incorporate uncertain parameters in a single model to correctly characterize the demand. Once the model is specified correctly, key decisions can be optimized by using the fast Fourier transform (FFT) approach.
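For the online retailer example, the characteristic function of log demand can be assembled from its two factors and compared with a simulation. The sketch below is illustrative: the function names, the evaluation point `w0`, the sample size and the seed are assumptions, and the Monte Carlo step assumes that the number of visits and the choice probability are independent, which is what the product form of the characteristic function requires.

```python
import numpy as np

LOC, SCALE = 7.4, 0.42                       # lognormal parameters of visits
DEMAND = np.array([140, 40, 150, 60, 280])   # daily demand observations
VISITS = np.array([1750, 500, 2000, 705, 3500])
CHOICE_PROB = DEMAND / VISITS                # empirical choice probabilities

def psi_ln_A(w):
    """Characteristic function of ln(A), which is normal with (LOC, SCALE)."""
    w = np.atleast_1d(w)
    return np.exp(1j * LOC * w - 0.5 * (SCALE * w) ** 2)

def psi_ln_B(w):
    """Empirical characteristic function of ln(B) from the five observations."""
    w = np.atleast_1d(w)
    return np.exp(1j * w[:, None] * np.log(CHOICE_PROB)[None, :]).mean(axis=1)

def psi_ln_D(w):
    """Characteristic function of ln(D) = ln(A) + ln(B)."""
    return psi_ln_A(w) * psi_ln_B(w)

# Monte Carlo check at one frequency (assumes A and B are independent).
rng = np.random.default_rng(0)
n_samples = 200_000
ln_a = rng.normal(LOC, SCALE, n_samples)
ln_b = np.log(rng.choice(CHOICE_PROB, n_samples))
w0 = 1.5
simulated = np.exp(1j * w0 * (ln_a + ln_b)).mean()
analytic = psi_ln_D(w0)[0]

print(np.round(analytic, 4), np.round(simulated, 4))
```

The additive three-market case follows the same pattern, with the characteristic functions of $A$, $B$ and $C$ multiplied without the log transformation.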
4.3.1 Inventory Management with the Additive Integration of Uncertain Elements
We have observed in Chap. 3 that the optimal order quantity in the newsvendor model and the optimal order-up-to level in the base stock model are found by the critical fractile solution. We now apply the FFT to calculate the optimal result for a given critical fractile when the demand is represented by the characteristic function $\psi_D(\omega)$. We recall from the previous subsection that the characteristic function of demand is formalized in the form of $\psi_D(\omega)$ when it is formed as an additive combination of different uncertain elements, whereas it is formalized in the form of $\psi_{\ln(D)}(\omega)$ when it is formed as a multiplicative combination of different uncertain elements.

The optimal order quantity for a critical fractile of $\beta$ is found by:

$$\beta = \int_{-\infty}^{Q} f(D)\,\partial D,$$

where $f(\cdot)$ is the density function of demand. Following Carr and Madan (1999) and Biçer and Tarakcı (2021), the Fourier transform of this expression is written as follows:

$$\psi_\beta(\omega) = \int_{-\infty}^{+\infty} e^{i_m \omega Q} e^{\alpha Q} \int_{-\infty}^{Q} f(D)\,\partial D\,\partial Q,$$

where $\alpha$ is a damping factor, which is used only to guarantee the square integrability condition so that the Fourier transformation can be applied. Then,

$$\psi_\beta(\omega) = \int_{-\infty}^{+\infty} \int_{D}^{+\infty} e^{(i_m \omega + \alpha)Q}\,\partial Q\, f(D)\,\partial D.$$

The inner integral has a finite value only if $\alpha < 0$, so we set $\alpha = -0.5$. And,

$$\psi_\beta(\omega) = -\frac{1}{i_m \omega + \alpha} \int_{-\infty}^{+\infty} e^{(i_m \omega + \alpha)D} f(D)\,\partial D = \frac{i_m\, \psi_D(\omega - \alpha i_m)}{\omega - \alpha i_m}.$$

The last expression includes the characteristic function of demand in the additive form: $\psi_D(\omega)$. Then, the critical fractile can be found by the inverse Fourier transform:

$$\beta = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-(i_m \omega + \alpha)Q} \psi_\beta(\omega)\,\partial\omega.$$

It follows from the properties of complex numbers that:

$$\beta = \frac{e^{-\alpha Q}}{\pi} \int_{0}^{+\infty} e^{-i_m \omega Q} \psi_\beta(\omega)\,\partial\omega.$$
The computation of this integral is demanding. Nevertheless, the FFT approach can be used to calculate the integral efficiently (Carr & Madan, 1999, pp. 67–70). The standard FFT approach uses Simpson's rule weightings to discretize the integral such that:

$$\int_{0}^{+\infty} e^{-i_m \omega Q} \psi_\beta(\omega)\,\partial\omega = \sum_{\eta=1}^{N} e^{-i_m \frac{2\pi}{N}(\eta-1)(u-1)} e^{i_m b \omega_\eta} \psi_\beta(\omega_\eta) \frac{\Delta_\eta}{3} \left[3 + (-1)^\eta - 1_{[\eta=1]}\right],$$

where $1_{[\eta=1]}$ is equal to one when $\eta = 1$ and zero otherwise. The $Q$ values are discretized such that:

$$Q = -b + \Delta(u - 1) \text{ for } u = 1, \cdots, N, \qquad b = \frac{1}{2} N \Delta, \qquad \Delta = \frac{2\pi}{N \Delta_\eta}.$$

The FFT approach hence reduces the computational complexity from $O(N^2)$ to $O(N \log_2 N)$ (Carr & Madan, 1999). The FFT functions available in Python and R take the vector of discretized points as an input and return the result of the Fourier transformation. The $N$ value should be a power of two (we set it equal to $2^{12}$ in the example given in our online web application). Once its value is entered, the input vector for the FFT function should be in the following format:

$$e^{i_m b \omega_\eta} \psi_\beta(\omega_\eta) \frac{\Delta_\eta}{3} \left[3 + (-1)^\eta - 1_{[\eta=1]}\right].$$

After scaling by $e^{-\alpha Q}/\pi$, the output of the FFT function gives the cumulative probability values for the $Q$ values. Using this vector, the optimal order quantity that makes the cumulative probability equal to the critical fractile is given by Biçer and Tarakcı (2021):

$$Q^* = -b + \Delta \left( \left\{ u \,\middle|\, \beta = \frac{e^{-\alpha(-b + \Delta(u-1))}}{\pi} \sum_{\eta=1}^{N} e^{-i_m \frac{2\pi}{N}(\eta-1)(u-1)} e^{i_m b \omega_\eta} \psi_\beta(\omega_\eta) \frac{\Delta_\eta}{3} \left[3 + (-1)^\eta - 1_{[\eta=1]}\right] \right\} - 1 \right).$$

When demand is formalized by a characteristic function in the additive form $\psi_D(\cdot)$, the cumulative distribution of demand for an order quantity can be calculated by applying the FFT. As we have demonstrated in this subsection, it is possible to calculate $Prob(D < Q)$ (i.e. equal to the critical fractile $\beta$ for the optimal order quantity) by employing the FFT with the characteristic function in the additive form. This in turn makes it possible to find the optimal order quantity for the newsvendor or the base stock model because the optimal solutions for these models are found by the critical fractile approach. However, the optimal results for the reorder model cannot be calculated by applying the FFT when the demand is formalized by the characteristic function in the additive form. This results from the fact that the partial expectation integral (i.e. $\int_{-\infty}^{Q} D f(D)\,\partial D$) cannot be calculated analytically when the demand is formalized by the characteristic function in the additive form.
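The procedure of this subsection can be sketched with NumPy's FFT routine. The example below is a hedged illustration rather than the code of the online web application. The three markets and all of their parameter values are hypothetical, and demand is measured in hundreds of units so that the damped transform stays well scaled with $\alpha = -0.5$. The frequency grid $\omega_\eta = (\eta - 1)\Delta_\eta$ with $\Delta_\eta = 0.125$ is an assumption in the spirit of Carr and Madan (1999). The real part of the FFT output is taken, and the quantile is found by linear interpolation between neighbouring grid points instead of selecting a single grid index.

```python
import numpy as np
from scipy.stats import norm

ALPHA = -0.5      # damping factor used in the text
N = 2**12         # number of grid points (a power of two)
D_OMEGA = 0.125   # spacing of the frequency grid (assumed value)

# Hypothetical markets, demand in hundreds of units.
MU_A, SD_A = 0.5, 0.15                        # market A: normal end value of a process
MU_B, SD_B = 0.3, 0.10                        # market B: normal distribution
OBS_C = np.array([0.15, 0.20, 0.22, 0.30])    # market C: a few observations

def psi_normal(w, mu, sd):
    return np.exp(1j * mu * w - 0.5 * (sd * w) ** 2)

def psi_empirical(w, obs):
    return np.exp(1j * w[:, None] * obs[None, :]).mean(axis=1)

def psi_D(w):
    """Characteristic function of D = A + B + C (independent markets)."""
    return psi_normal(w, MU_A, SD_A) * psi_normal(w, MU_B, SD_B) * psi_empirical(w, OBS_C)

def cdf_by_fft(psi, alpha=ALPHA, n=N, d_omega=D_OMEGA):
    """Return the Q grid and Prob(D < Q) on that grid."""
    eta = np.arange(1, n + 1)
    omega = d_omega * (eta - 1)
    delta = 2 * np.pi / (n * d_omega)          # spacing of the Q grid
    b = 0.5 * n * delta
    q_grid = -b + delta * (eta - 1)
    psi_beta = 1j * psi(omega - alpha * 1j) / (omega - alpha * 1j)
    # Simpson weights: 1, 4, 2, 4, ... multiplied by d_omega / 3.
    weights = np.where(eta == 1, 1.0, 3.0 + (-1.0) ** eta) * d_omega / 3
    x = np.exp(1j * b * omega) * psi_beta * weights
    cdf = np.exp(-alpha * q_grid) / np.pi * np.fft.fft(x).real
    return q_grid, cdf

def quantile(q_grid, cdf, beta):
    """First grid crossing of beta, refined by linear interpolation."""
    k = int(np.argmax(cdf >= beta))
    slope = (cdf[k] - cdf[k - 1]) / (q_grid[k] - q_grid[k - 1])
    return q_grid[k - 1] + (beta - cdf[k - 1]) / slope

def exact_cdf(q):
    """Closed-form check: a mixture of normals shifted by the observations of C."""
    sd = np.hypot(SD_A, SD_B)
    return norm.cdf((q - MU_A - MU_B - OBS_C) / sd).mean()

q_grid, cdf = cdf_by_fft(psi_D)
q_star = quantile(q_grid, cdf, beta=0.625)
print("order quantity (units):", round(100 * q_star, 1))
print("exact CDF at that quantity:", round(exact_cdf(q_star), 4))
```

The markets were chosen so that a closed-form distribution exists for checking the result. In practice, the value of the approach lies in cases where only the characteristic function of the combined demand is available.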
4.3.2 Inventory Management with the Multiplicative Integration of Uncertain Elements
When demand is formed as a multiplicative combination of different uncertain elements (e.g. the online retailer example given above in which the demand is formed as a multiplicative combination of the number of visits and choice probabilities), the characteristic function must be developed for the natural logarithm transformation of demand, $\psi_{\ln(D)}$. This representation of the characteristic function is more useful than the additive representation because both the cumulative distribution of demand and the partial expectation can be computed when the demand is formalized by the multiplicative characteristic function. For that reason, optimal results for the newsvendor, base stock and reorder models can be analytically derived for the multiplicative characteristic function. Additionally, the expected profit formulation can also be obtained for the multiplicative case, whereas it is not possible for the additive characteristic function. To calculate the cumulative distribution, we take the log transformation of the quantity and demand values such that $q = \ln(Q)$ and $d = \ln(D)$. The critical fractile formulation is then:

$$\beta = \int_{-\infty}^{q} f(d)\,\partial d.$$

Following the same steps as in the previous subsection (Carr & Madan, 1999; Biçer & Tarakcı, 2021), the Fourier transform of this expression is:

$$\psi_\beta(\omega) = \int_{-\infty}^{+\infty} e^{i_m \omega q} e^{\alpha q} \int_{-\infty}^{q} f(d)\,\partial d\,\partial q = \int_{-\infty}^{+\infty} \int_{d}^{+\infty} e^{(i_m \omega + \alpha)q}\,\partial q\, f(d)\,\partial d.$$
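The inner integral has a finite value only if $\alpha < 0$, so we set $\alpha = -0.5$. And,

$$\psi_\beta(\omega) = -\frac{1}{i_m \omega + \alpha} \int_{-\infty}^{+\infty} e^{(i_m \omega + \alpha)d} f(d)\,\partial d = \frac{i_m\, \psi_{\ln(D)}(\omega - \alpha i_m)}{\omega - \alpha i_m}.$$

The last expression includes the characteristic function of the natural logarithm transformation of demand. Then, the critical fractile can be found by the inverse Fourier transform:

$$\beta = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-(i_m \omega + \alpha)q} \psi_\beta(\omega)\,\partial\omega = \frac{e^{-\alpha q}}{\pi} \int_{0}^{+\infty} e^{-i_m \omega q} \psi_\beta(\omega)\,\partial\omega.$$

Following Carr and Madan (1999), as explained in the previous subsection, we discretize the $q$ values to compute the integral such that:

$$q = -b + \Delta(u - 1) \text{ for } u = 1, \cdots, N, \qquad b = \frac{1}{2} N \Delta, \qquad \Delta = \frac{2\pi}{N \Delta_\eta}.$$

Then, the optimal order quantity $Q^*$ for a critical fractile of $\beta$ is given by Biçer and Tarakcı (2021):

$$\ln(Q^*) = -b + \Delta \left( \left\{ u \,\middle|\, \beta = \frac{e^{-\alpha(-b + \Delta(u-1))}}{\pi} \sum_{\eta=1}^{N} e^{-i_m \frac{2\pi}{N}(\eta-1)(u-1)} e^{i_m b \omega_\eta} \psi_\beta(\omega_\eta) \frac{\Delta_\eta}{3} \left[3 + (-1)^\eta - 1_{[\eta=1]}\right] \right\} - 1 \right).$$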
The last expression makes it possible to calculate the optimal order quantity for the newsvendor or the base stock model. To calculate the optimal results for the reorder model, we need not only the derivation of cumulative distribution but also the derivation of partial expectation. In particular, we must calculate the value of .E(max(D − R, 0)) for a reorder level of R to find the optimal order quantity in the
4.3 Integration of Uncertain Elements in a Unified Model
141
reorder model. This expression is derived by Biçer and Tarakcı (2021): ∞ .E(max(D − R, 0)) = (ed − er )f (d)∂d, r
where .r = ln(R). Applying the Fourier transformation to the last expression, ∞ ψ(ω) =
e
.
im ωr αr
e
−∞
∞ (ed − er )f (d)∂d∂r r
∞ d =
(e(im ω+α)r ed − e(im ω+α+1)r )f (d)∂r∂d, −∞ −∞
ψln(D) (ω − im (α + 1)) . (im ω + α + 1)(im ω + α)
=
We now use the inverse Fourier transform to obtain: E(max(D − R, 0)) =
.
1 2π
∞
e−(im ω+α)r ψ(ω)∂ω =
−∞
e−αr π
∞
e−im ωr ψ(ω)∂ω.
0
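Thus, the analytical expression of \(E(\max(D - R, 0))\) is given by Biçer and Tarakcı (2021):

\[
E(\max(D - R, 0)) = \frac{e^{-\alpha(-b+\Delta(u-1))}}{\pi}\sum_{\eta=1}^{N} e^{-i_m\frac{2\pi}{N}(\eta-1)(u-1)}\, e^{i_m b\,\omega_\eta}\,\psi(\omega_\eta)\times\frac{\eta}{3}\left[3 + (-1)^{\eta} - \mathbb{1}_{[\eta=1]}\right],
\]

where:

\[
r = \ln(R) = -b + \Delta(u-1) \quad \text{for } u = 1, \cdots, N, \qquad b = \frac{1}{2}N\Delta, \qquad \Delta = \frac{2\pi}{N\eta}.
\]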
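The partial-expectation transform can be verified in the same spirit. The following sketch is an illustration under stated assumptions rather than the published implementation: it uses a trapezoidal sum instead of the FFT, assumes lognormal demand (location 7.4, scale 0.42) so that a closed-form benchmark exists, and assumes a positive damping factor \(\alpha = 0.5\), because the inner integral over \(r\) in the derivation only converges for \(\alpha > 0\). Function names are hypothetical.

```python
import cmath
import math
from statistics import NormalDist

M, S = 7.4, 0.42  # assumed lognormal parameters of demand (illustrative)
ALPHA = 0.5       # assumed positive damping factor for this transform


def psi_log_demand(z):
    """Characteristic function of ln(D) ~ N(M, S^2), valid for complex z."""
    return cmath.exp(1j * z * M - 0.5 * S**2 * z**2)


def partial_expectation(R, w_max=40.0, n=4000):
    """E[max(D - R, 0)] by a trapezoidal sum over the inversion integral."""
    r = math.log(R)
    h = w_max / n
    total = 0.0
    for k in range(n + 1):
        w = k * h
        psi = psi_log_demand(w - 1j * (ALPHA + 1)) / (
            (1j * w + ALPHA + 1) * (1j * w + ALPHA)
        )
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * (cmath.exp(-1j * w * r) * psi).real
    return math.exp(-ALPHA * r) * total * h / math.pi


def partial_expectation_lognormal(R):
    """Closed-form benchmark for lognormal demand."""
    r = math.log(R)
    phi = NormalDist().cdf
    return math.exp(M + S**2 / 2) * phi((M + S**2 - r) / S) - R * phi((M - r) / S)


print(partial_expectation(1800.0), partial_expectation_lognormal(1800.0))
```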
The FFT approach can also be applied to calculate the newsvendor profit. After applying the log transformation, the expected profit function is written as follows:

\[
E(\Pi(q)) = (p - c)\,e^{q} - (p - s)\,P(q),
\]

where:

\[
P(q) = \int_{0}^{Q} (e^{q} - e^{d})\, f(d)\,\partial d = \frac{e^{-\alpha q}}{\pi}\int_{0}^{+\infty} e^{-i_m\omega q}\,\psi_P(\omega)\,\partial\omega,
\]

\[
\psi_P(\omega) = \frac{\psi_{\ln(D_n)}(\omega - (\alpha+1)i_m)}{\alpha^2 + \alpha - \omega^2 + i_m\omega(2\alpha+1)}.
\]

The FFT approach makes it possible to characterize the demand when it is formed as an additive or multiplicative combination of different uncertain elements. The additive characteristic function is more restrictive than the multiplicative one in the sense that optimal solutions can only be obtained for a limited set of inventory models. Therefore, it would be appealing to develop the demand model in the multiplicative form, if possible, to apply the FFT to a more general set of inventory models. In what follows, we describe a commonly used multiplicative model and present an example complemented with the online web application.
4.3.3 A Commonly Used Multiplicative Model: Jump Diffusion Process
When companies use credible demand information to update their demand forecasts, there are certain time epochs at which the forecasts are changed sharply. For example, pharmaceutical companies are often asked to submit a tender to supply vaccines to health agencies such as GAVI, the Vaccine Alliance (Biçer et al., 2018). The one offering the lowest price is often selected as the vaccine supplier. Therefore, demand forecasts strongly depend on the results of such tenders. In B2B settings, the number of customers is limited, and customers tend to purchase products in large quantities. In these cases, demand forecasts may suddenly jump up or down depending on the result of a tender or a customer placing a bulk order. In B2C settings, demand forecasts are largely affected by social or mainstream media. For example, a quality problem that is important enough to appear on social media may cause a sharp decrease in demand, while a new product promoted by an influencer would be in high demand in the market. The jump diffusion model developed by Merton (1976) is effectively used to represent the demand evolution process in these circumstances. It is formalized such that:
\[
D_i = D_0\, e^{\mu(t_i - t_0) + \epsilon_1 + \epsilon_2 + \cdots + \epsilon_i + \sum_{j=1}^{N_{t_i}} \ln(Y_j)},
\]

for \(i \in \{1, \cdots, n\}\). In this expression, \(D_i\) is the demand forecast at time \(t_i\). Therefore, the \(D_i\) term is linked to the initial demand forecast at \(t_0\) through an exponential term. We use \(\mu\) to denote the drift rate. The \(\epsilon\) terms are the uncertain forecast adjustments at the decision epochs that follow a normal distribution:

\[
\epsilon_i \sim N\left(-\sigma^2 (t_i - t_{i-1}),\ \sigma\sqrt{t_i - t_{i-1}}\right)
\]

with \(\sigma\) being the volatility parameter. The last term \(\sum_{j=1}^{N_{t_i}} \ln(Y_j)\) captures the forecast updates after credible demand information is received. We use \(N_{t_i}\) to denote the number of critical information updates, received from \(t_0\) to \(t_i\), that change the demand forecasts sharply. This parameter follows a Poisson process with a rate of \(\lambda\), representing the arrival process of the information. After receiving the information, a positive or negative jump might occur, changing the demand forecast by \((Y_j - 1)D_t\). When \(Y_j = 0\), it indicates a quality problem or a supply disruption (observed at time \(t_j\)), which drops the demand forecast to zero. Likewise, when \(Y_j > 1\), it indicates that a bulk order was received at \(t_j\) or a social media influencer promoted the product at \(t_j\), causing a sharp increase in the demand forecast. Therefore, \(Y_j\) is defined as the size of the jump in the demand forecast when credible information about demand is received at time \(t_j\). Following the common approach in the literature (Merton, 1976), it is assumed that \(Y_j\) follows a lognormal distribution with a location parameter of \(\tau\) and a scale parameter of \(\varsigma\). The demand value realized at time \(t_n\), conditional on the forecast at time \(t_i\), satisfies the following:
\[
\ln(D_n) = \ln(D_i) + \mu(t_n - t_i) + \epsilon_{i+1} + \epsilon_{i+2} + \cdots + \epsilon_n + \sum_{j=N_{t_i}}^{N_{t_n}} \ln(Y_j).
\]
This analytical representation is complex enough to capture different types of uncertainties in a demand model. When we set \(\tau\) and \(\varsigma\) very low, for example, \(Y_j\) is expected to be close to zero. Combining this with a very low \(\lambda\) would make it possible to capture the risk of demand dropping to zero, which may happen due to a supply disruption or a serious quality problem. The \(\epsilon\) terms capture intrinsic demand uncertainty that would not be linked to a specific reason. Using the \(\epsilon\) terms and \(Y_j\), it would be possible to have a complex demand model that captures the intrinsic demand uncertainty and the risk of supply disruption simultaneously. If the demand model is purely based on the advance orders such that they are collected over time to tally total demand, the \(\epsilon\) terms should be removed from the analytical expression. The lognormal parameters of \(\tau\) and \(\varsigma\) are determined depending on how the advance orders are collected.

Despite its flexibility to capture different types of demand uncertainty, the jump diffusion model cannot be modelled with well-known statistical distributions such as the normal and lognormal distributions. Nevertheless, a characteristic function exists for this demand process (Biçer et al., 2018) such that:

\[
\psi_{\ln(D_n)}(\omega) = \psi_A(\omega) \times \psi_B(\omega) \times \psi_C(\omega) \times \psi_D(\omega)
\]

with:

\[
\psi_A(\omega) = D_i^{\,i_m\omega}, \quad
\psi_B(\omega) = e^{i_m\omega\mu(t_n - t_i)}, \quad
\psi_C(\omega) = e^{-\sigma^2 (t_n - t_i)\omega^2/2}, \quad
\psi_D(\omega) = e^{\lambda(t_n - t_i)\left(e^{(i_m\omega\tau - \varsigma^2\omega^2/2)} - 1\right)}.
\]
In this expression, \(\psi_A(\omega)\) is the characteristic function of \(\ln(D_i)\); \(\psi_B(\omega)\) is the characteristic function of \(\mu(t_n - t_i)\); \(\psi_C(\omega)\) is the characteristic function of \(\epsilon_{i+1} + \epsilon_{i+2} + \cdots + \epsilon_n\); and \(\psi_D(\omega)\) is the characteristic function of \(\sum_{j=N_{t_i}}^{N_{t_n}} \ln(Y_j)\). Multiplying these characteristic functions yields the characteristic function of demand \(\psi_{\ln(D_n)}(\omega)\).
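The product form translates directly into code. The sketch below evaluates the four components as they are written in the text; the function name, argument names and the parameter values in the usage line are illustrative assumptions, not taken from the online web application.

```python
import cmath
import math


def psi_log_demand(w, d_i, mu, sigma, lam, tau, varsigma, dt):
    """Characteristic function of ln(D_n) given the forecast d_i at time t_i.

    dt is the remaining horizon t_n - t_i; w may be real or complex.
    """
    psi_a = cmath.exp(1j * w * math.log(d_i))       # known ln(D_i)
    psi_b = cmath.exp(1j * w * mu * dt)             # deterministic drift
    psi_c = cmath.exp(-0.5 * sigma**2 * dt * w**2)  # normal forecast adjustments
    jump_cf = cmath.exp(1j * w * tau - 0.5 * varsigma**2 * w**2)  # CF of ln(Y_j)
    psi_d = cmath.exp(lam * dt * (jump_cf - 1.0))   # Poisson number of jumps
    return psi_a * psi_b * psi_c * psi_d


# Illustrative call with hypothetical parameter values.
print(psi_log_demand(1.3, d_i=2000.0, mu=0.0, sigma=0.3,
                     lam=2.0, tau=0.462, varsigma=0.358, dt=0.5))
```

Two quick sanity checks apply to any characteristic function: its value at \(\omega = 0\) is one, and its modulus never exceeds one for real \(\omega\).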
4.3.4 Example
We present an example of a manufacturer that receives advance orders for a seasonal product from some retailers. Once the retailers purchase the product, they sell it to end consumers during a short selling season. The selling season starts in September every year, so the retailers request delivery of the orders at the beginning of September. The advance orders are placed by the retailers from March through the end of August. We assume that the manufacturer starts production in June. The production volume is determined by the manufacturer at the beginning of June, after which it can no longer be updated. We normalize the time period from March to September to one. Thus, the time when the manufacturer starts production is equal to \(t_i = 0.5\). We assume that the advance demand received before \(t_i\) is equal to 1000 units. For the remaining time period, the manufacturer expects one more order. Hence, we set \(\lambda = 2\) given \((t_n - t_i) = 0.5\). The quantity demanded for each order has a mean of 1000 units and a median of 800 units. Then, the demand model can be written in the following form:

\[
\ln(D_n) = \ln(D_i) - \lambda(t_n - t_i)\left(e^{\tau + \varsigma^2/2} - 1\right) + \sum_{j=N_{t_i}}^{N_{t_n}} \ln(Y_j). \tag{4.1}
\]

The second term in this expression cancels out the expectation of the last term such that:

\[
E\left[\sum_{j=N_{t_i}}^{N_{t_n}} \ln(Y_j)\right] = \lambda(t_n - t_i)\left(e^{\tau + \varsigma^2/2} - 1\right).
\]
Having the second term in this model makes it possible to use \(D_i\) as the total demand forecast at time \(t_i\), which is equal to \(1000 + \lambda(t_n - t_i) \times 1000 = 2000\) units. Total demand is still uncertain at \(t_i\), which has a median value of \(1000 + \lambda(t_n - t_i) \times 800 = 1800\) units. When there is no order received between \(t_i\) and \(t_n\), the last term on the right-hand side of the demand model expression becomes equal to zero. In that case, the negative drift term given by the second expression makes the final demand equal to the advance demand received until \(t_i\). In mathematical terms,

\[
\ln(D_i) - \lambda(t_n - t_i)\left(e^{\tau + \varsigma^2/2} - 1\right) = \ln(1000),
\]

\[
\ln(2000) - 2 \times 0.5 \times \left(e^{\tau + \varsigma^2/2} - 1\right) = \ln(1000).
\]

Using the median values and the median demand formulation of the lognormal distribution, we also obtain the following expression:

\[
\ln(1800) - 2 \times 0.5 \times \left(e^{\tau} - 1\right) = \ln(1000).
\]

Combining the last two expressions, we have:

\[
\ln(\ln(2) + 1) = \tau + \varsigma^2/2, \qquad \ln(\ln(1.8) + 1) = \tau.
\]

And,

\[
\tau = 0.462, \qquad \varsigma = 0.358.
\]
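The two calibration equations can be solved in a few lines. This is a small arithmetic sketch of the steps above; the variable names are chosen here for illustration.

```python
import math

forecast_mean = 2000.0    # total demand forecast D_i at t_i
forecast_median = 1800.0  # median of total demand at t_i
advance_demand = 1000.0   # demand already received before t_i
expected_jumps = 2 * 0.5  # lambda * (t_n - t_i)

# ln(mean) - expected_jumps * (exp(tau + varsigma^2 / 2) - 1) = ln(advance)
mean_exponent = math.log(
    math.log(forecast_mean / advance_demand) / expected_jumps + 1
)
# ln(median) - expected_jumps * (exp(tau) - 1) = ln(advance)
tau = math.log(math.log(forecast_median / advance_demand) / expected_jumps + 1)
varsigma = math.sqrt(2 * (mean_exponent - tau))

print(round(tau, 3), round(varsigma, 3))
```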
We use the \(\lambda\), \(\tau\) and \(\varsigma\) values together with the log demand model (4.1) to obtain the characteristic function \(\psi_{\ln(D_n)}(\omega)\). Regarding the cost and price parameters, we assume that the manufacturer sells the products to retailers at a price of $150 per unit. The cost of the product is $50 per unit. Unsold products at time \(t_n\) are sold to other retailers in offshore markets at a salvage price of $30 per unit. Therefore, the critical fractile is equal to \((150 - 50)/(150 - 30) = 83.3\%\). Using the FFT formulas derived for the cumulative probability distribution, the optimal order quantity for the manufacturer is found to be 2725 units. The FFT implementation is presented in the online web application, which shows the results and the Python code.

We also use the FFT formulas derived for the expected profit calculation to present the expected profit as a function of the order quantity. The results are provided in Fig. 4.4. The maximum expected profit is achieved when the quantity is equal to 2725 units, which is the optimal order quantity calculated above. The online web application also includes the FFT application to calculate the expected profit. The reader can specify the input parameters and regenerate the figure. The Python code is also shared with the reader in the web application.

Fig. 4.4 Expected profit as a function of order quantity
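The reported order quantity can be cross-checked without a Fourier transform. The sketch below is an independent check and not the FFT implementation of the web application: it conditions on the number of remaining orders, which is assumed to be Poisson with mean \(\lambda(t_n - t_i) = 1\), and uses the fact that the sum of \(k\) independent normal log-jumps is again normal. Names and the truncation at 40 jumps are illustrative choices.

```python
import math
from statistics import NormalDist

ADVANCE = 1000.0              # demand already received before t_i
EXPECTED_JUMPS = 2 * 0.5      # lambda * (t_n - t_i)
TAU, VARSIGMA = 0.462, 0.358  # lognormal parameters of each jump
PRICE, COST, SALVAGE = 150.0, 50.0, 30.0


def demand_cdf(q, max_jumps=40):
    """P(D_n <= q), conditioning on the Poisson number of remaining orders."""
    x = math.log(q / ADVANCE)
    total = math.exp(-EXPECTED_JUMPS) if x >= 0 else 0.0  # no further order
    for k in range(1, max_jumps + 1):
        weight = math.exp(-EXPECTED_JUMPS) * EXPECTED_JUMPS**k / math.factorial(k)
        # The sum of k log-jumps is N(k * tau, k * varsigma^2).
        total += weight * NormalDist(k * TAU, VARSIGMA * math.sqrt(k)).cdf(x)
    return total


def optimal_quantity(fractile, lo=ADVANCE, hi=50_000.0, tol=1e-6):
    """Bisection for the quantity whose CDF reaches the critical fractile."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if demand_cdf(mid) < fractile:
            lo = mid
        else:
            hi = mid
    return hi


critical_fractile = (PRICE - COST) / (PRICE - SALVAGE)
q_star = optimal_quantity(critical_fractile)
print(round(critical_fractile, 4), round(q_star))
```

The result should land within a few units of the 2725 units reported above; small differences are expected because \(\tau\) and \(\varsigma\) are rounded and the FFT works on a discrete grid.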
4.4 Demand Regularization
Demand regularization is another uncertainty modelling approach that combines statistical modelling with the dynamics of data generation to characterize demand uncertainty correctly. It would be highly effective in practice when the demand lead time varies for different orders. In such cases, companies collect advance orders from some customers. However, some customers may still place urgent orders, for which the demand lead time would be very short. There are two benefits of advance demand information. First, it correlates with the actual demand, so it can be used as an explanatory variable to predict demand. Second, the actual demand can never be less than the advance demand because customers have already committed to purchasing the advance demand. Therefore, it should be used as a lower bound for the demand estimates.

In Fig. 4.5, we present an example of a manufacturer that collects advance orders from customers (i.e. some retailers). The figure shows the actual and advance demand values for the last 25 months. The solid curve represents the actual monthly demand values, while the dashed line shows the advance demand values. The manufacturer plans the production on a monthly basis. Customers are restricted to placing their purchase orders by the last day of the month before their requested delivery date. If a customer wants the ordered items delivered on July 20, for example, the purchase order should be placed on June 30 at the latest. The manufacturer also induces customers to place their orders well in advance by informing them that orders will be fulfilled on a first-come, first-served basis if there is a product shortage.

Fig. 4.5 Historical values of actual monthly demand and advance demand when the order quantity is determined

The manufacturer plans the production schedule at the beginning of June with the intention of meeting the demand for July. The sum of the orders placed in May and earlier months to be delivered in July constitutes the advance demand. The sum of the orders placed in June to be delivered in July constitutes the urgent demand. The sum of the advance and urgent demand gives the actual demand for July. When the production is planned (i.e. the beginning of June), the manufacturer knows the advance demand. She can use this information as an explanatory variable and a lower bound constraint to better predict demand and characterize its uncertainty. The figure shows that there is a high correlation between the actual and advance demand values. Demand values in months 11, 13, 18, 19 and 22 are also equal to the advance demand values.

If demand planners ignore the advance demand information and regress demand on other explanatory variables, they would use a linear regression model. Suppose that a dataset contains n demand values and there are k explanatory variables. We consider a regression model to forecast demand. For notational simplicity, we
present the model in matrix form:

\[
Y = XB + E,
\]

with:

\[
Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}, \quad
B = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
E = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix},
\]

where \(Y\) and \(X\) are the vector of demand values and the matrix of explanatory variables, respectively. For example, if data is collected for n months, \(y_1\) in the \(Y\) vector indicates the demand value for the first month. Likewise, \(x_{11}, x_{12}, \cdots, x_{1k}\) indicate the values of the explanatory variables for the first month. Those explanatory variables can be economic factors (e.g. economic indices) or intrinsic factors (e.g. online traffic of the company website). The terms \(B\) and \(E\) denote the vectors of coefficients and error terms, respectively. The first column of the \(X\) matrix includes only ones, so \(\beta_0\) gives the intercept of the linear model. It follows from Chap. 2 that the OLS estimation of \(B\) is given by:

\[
B_{OLS} = (X^T X)^{-1} X^T Y.
\]
Advance demand information determines the lower bound for the actual demand, so it should be added to the forecasting model as a constraint. Suppose that the amount of advance orders is equal to \(L_i\) for the \(i\)th observation. Then, the advance demand vector is defined such that each entry indicates the amount of advance orders for each observation:

\[
L = \begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_n \end{bmatrix}.
\]

Then, the least squares problem is written as follows:

\[
\begin{aligned}
\text{Minimize:} \quad & (Y - XB)^T (Y - XB), \\
\text{such that:} \quad & XB \ge L.
\end{aligned}
\]

The problem can be solved using Lagrangian optimization, which is reviewed in Chap. 2. The Lagrangian model is written as follows:

\[
J(B, \lambda) = (Y - XB)^T (Y - XB) + \lambda^T (L - XB),
\]
where \(\lambda\) is the Lagrange multiplier vector. This multiplier should be considered as a penalty cost when the advance demand constraint in the least squares problem is violated. A positive value of an entry in the \(\lambda\) vector indicates that the constraint \(XB \ge L\) for that entry is binding. Therefore, the predicted value of OLS for that entry is less than the advance demand so the demand prediction should be replaced by the advance demand. In this expression, we have two sets of variables, that is, \(B\) and \(\lambda\), which we would like to estimate. We take the first derivative of the Lagrangian model with respect to these two variables to find the values that minimize \(J(B, \lambda)\):

\[
\frac{\partial J(B, \lambda)}{\partial B} = -2X^T Y + 2X^T X B - X^T \lambda = 0,
\]

\[
B_{REG} = (X^T X)^{-1}\left(X^T Y + \frac{1}{2} X^T \lambda\right),
\]

\[
\frac{\partial J(B, \lambda)}{\partial \lambda} = L - XB = 0.
\]

Combining the last two expressions,

\[
\lambda = 2\left(X (X^T X)^{-1} X^T\right)^{-1} L - 2Y.
\]
The formulas for \(B_{REG}\) and \(\lambda\) are used together with the inputs \(X\), \(Y\) and \(L\) to calculate the coefficients. The \(\lambda\) vector gives us useful insights regarding the value of advance demand in predicting the actual demand. Demand predictions from the regression model are found by \(XB\) for each value of \(Y\). Those with a positive \(\lambda\) value should be replaced by the advance demand, which occurs when the advance demand is unexpectedly high. Given that the process of replacing some demand predictions with advance demand is part of the demand regularization approach, the estimates of \(B_{REG}\) are expected to be different from the OLS estimates \(B_{OLS}\).

We now turn back to our example given in Fig. 4.5. The values of the actual and advance demand are given in Table 4.3.

Table 4.3 Values of the actual and advance demand

Month  Demand  Advance demand   Month  Demand  Advance demand   Month  Demand  Advance demand
1      100     71               11     134     134              21     144     91
2      112     30               12     86      52               22     202     202
3      107     75               13     99      99               23     158     105
4      103     64               14     89      56               24     160     101
5      91      41               15     111     81               25     144     96
6      85      51               16     114     79
7      84      42               17     118     73
8      85      51               18     163     163
9      79      57               19     193     193
10     81      49               20     143     99

We regress the demand values on the previous months' demand and the current months' advance demand values. For this reason, there should be three columns in the \(X\) matrix. The first column is the column of ones. The second column is the column of the previous months' demand values, which includes the demand values from the 1st month until the 24th month. The last column shows the advance demand values including the values from the 2nd month to the 25th month. The \(Y\) vector includes the demand values from the 2nd month to the 25th month. Likewise, the \(L\) vector includes the advance demand values from the 2nd month until the 25th month. Therefore, based on the data given in Table 4.3, these parameters look like the following:

\[
Y = \begin{bmatrix} 112 \\ 107 \\ \vdots \\ 144 \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & 100 & 30 \\
1 & 112 & 75 \\
\vdots & \vdots & \vdots \\
1 & 160 & 96
\end{bmatrix}, \quad
L = \begin{bmatrix} 30 \\ 75 \\ \vdots \\ 96 \end{bmatrix}.
\]
The reason we start from the second month in the \(Y\) vector is that the first month's value is used as an explanatory variable. We then construct the \(X\) matrix and the \(L\) vector accordingly. We present the results of this example in the online web application, which also includes the Python code. The mean absolute percentage error (MAPE) is 5.96% for the demand regularization model, whereas it is 6.67% for the OLS model. Therefore, the demand regularization model performs better than the OLS model.
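The data preparation and the role of the lower bound can be sketched with the values of Table 4.3. The code below is a simplified illustration and not the Lagrangian estimator of the web application: it fits \(B_{OLS}\) through the normal equations with a small hand-written solver (an assumption made to stay within the standard library) and then raises every prediction that falls below the advance demand up to that bound. Its MAPE values therefore need not match the figures reported above.

```python
demand = [100, 112, 107, 103, 91, 85, 84, 85, 79, 81, 134, 86, 99, 89, 111,
          114, 118, 163, 193, 143, 144, 202, 158, 160, 144]
advance = [71, 30, 75, 64, 41, 51, 42, 51, 57, 49, 134, 52, 99, 56, 81,
           79, 73, 163, 193, 99, 91, 202, 105, 101, 96]

# Months 2..25: regress demand on [1, previous demand, current advance demand].
Y = demand[1:]
L = advance[1:]
X = [[1.0, float(demand[t - 1]), float(advance[t])] for t in range(1, len(demand))]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(k)]
b_ols = solve(xtx, xty)  # normal equations (X^T X) B = X^T Y

ols_pred = [sum(b * x for b, x in zip(b_ols, row)) for row in X]
bounded_pred = [max(p, low) for p, low in zip(ols_pred, L)]  # enforce pred >= L


def mape(pred):
    """Mean absolute percentage error against the actual demand."""
    return 100 * sum(abs(y - p) / y for y, p in zip(Y, pred)) / len(Y)


print(b_ols, mape(ols_pred), mape(bounded_pred))
```

Because the actual demand is never below the advance demand in this dataset, lifting a prediction to the bound can only move it closer to the realized value, so the bounded MAPE cannot exceed the OLS MAPE.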
4.5 Chapter Summary
In this chapter, we have presented different approaches to modelling demand with its evolutionary dynamics and characterizing demand uncertainty. The additive evolutionary model is useful for quantifying uncertainty and assessing the value of supply chain responsiveness when demand uncertainty is low and the forecast horizon is relatively short. When demand uncertainty is high, the multiplicative model can be considered as an alternative to the additive model. If demand is formed as a combination of different uncertain elements, we have demonstrated how to use the FFT to characterize demand and solve the inventory management problems. When the demand lead time varies substantially in supply chains, advance demand information can be used in two ways to better predict demand. First, the demand values are correlated with the advance demand values, so they can be used as explanatory variables to predict demand. Second, the demand values cannot be less than the advance demand values, so they can also be added as a lower bound to the demand forecasting models. The demand regularization model uses the advance demand information in both ways in a single model.
4.6 Practice Examples
Example 4.1 A manufacturer produces seasonal items (of a single SKU) and sells them to some retailers before the selling season. The retailers place their orders 1 month before the selling season. The manufacturer starts the production 5 months before the selling season, and the production run must be completed at the time when orders from the retailers are collected. Thus, the production lasts 4 months. The demand of the product is estimated to have a mean value of 1000 units and a standard deviation of 400 units. The manufacturer updates the demand forecasts according to the additive model at the beginning of each month. Calculate the parameters of the additive model. (Use this information to answer the questions of Examples 4.1–4.3.)

Solution 4.1 We normalize the forecasting horizon (i.e. 4 months) to one and use \(t_i\) to denote the time when forecasts are updated. Thus, \(t_0 = 0\), \(t_1 = 0.25\), \(t_2 = 0.5\), \(t_3 = 0.75\) and \(t_4 = 1\). We note that \(t_4\) is the time when the actual demand is realized. Because we normalize the forecast horizon to one, the volatility parameter of the additive model becomes equal to the standard deviation: \(\sigma = 400\). The drift rate must be equal to zero, since there is no information about the drift such that the demand forecasts are expected to increase or decrease over the forecast horizon. Thus, \(\mu = 0\). At time \(t_i\), the demand follows a normal distribution with the following parameters:

\[
D_4 \mid D_0 \sim N(1000, 400), \qquad D_4 \mid D_i \sim N\left(D_i,\ 400\sqrt{1 - t_i}\right),
\]

for \(i \in \{1, 2, 3\}\).
Example 4.2 The manufacturer sells the product at a price of $10 per unit, and it costs $4 to produce one item. The unsold units are salvaged at a residual value of $2 per unit. The demand forecasts are updated such that \(D_0 = 1000\), \(D_1 = 950\), \(D_2 = 900\), \(D_3 = 850\), and \(D_4 = 800\) units. Calculate the optimal order quantity and excess inventory values if the ordering decision is made at \(t_i\ \forall i \in \{0, 1, 2, 3\}\).

Solution 4.2 The critical fractile is:

\[
\beta = \frac{p - c}{p - s} = 6/8 = 75\%.
\]

The z value is found by the inverse normal distribution: \(z = \Phi^{-1}(0.75) = 0.674\). When the ordering decision is made at the very beginning (i.e. \(t_0\)), the order quantity must be:

\[
Q_0 = 1000 + 0.674 \times 400 = 1270 \text{ units}.
\]

The excess inventory is calculated by subtracting the actual demand, which is \(D_4 = 800\), from the order quantity. It is equal to 470 units. When the ordering decision is made at \(t_1\), the order quantity must be:

\[
Q_1 = 950 + 0.674 \times 400\sqrt{1 - 0.25} = 1183 \text{ units}.
\]

Then, the excess inventory is equal to 1183 − 800 = 383 units. When the ordering decision is made at \(t_2\), the order quantity must be:

\[
Q_2 = 900 + 0.674 \times 400\sqrt{1 - 0.5} = 1091 \text{ units}.
\]

The excess inventory is equal to 1091 − 800 = 291 units. When the ordering decision is made at \(t_3\), the order quantity must be:

\[
Q_3 = 850 + 0.674 \times 400\sqrt{1 - 0.75} = 985 \text{ units}.
\]

The excess inventory is equal to 985 − 800 = 185 units.

Example 4.3 Using the results of Example 4.2, calculate the benefit of postponing the ordering decision from \(t_0\) to \(t_3\).

Solution 4.3 With the order quantity \(Q_0 = 1270\), the manufacturer can only sell 800 (which is the final demand). Then, total profit of the manufacturer is equal to:

\[
\Pi(Q_0 = 1270) = 10 \times 800 - 4 \times 1270 + 2 \times 470 = \$3860.
\]

If the ordering decision is made at \(t_3\), total profit is:

\[
\Pi(Q_3 = 985) = 10 \times 800 - 4 \times 985 + 2 \times 185 = \$4430.
\]
Therefore, postponing the ordering decision from \(t_0\) to \(t_3\) helps the manufacturer increase total profit by $570.

Example 4.4 Suppose that the manufacturer in the previous example experiences some sharp changes in demand dynamics. The demand is now assumed to follow the multiplicative model with an initial estimate of 1000 units and a coefficient of variation (CV) of one. Calculate the parameters of the multiplicative model. (Use this information to answer the questions of Examples 4.4–4.6.)

Solution 4.4 We again normalize the forecasting horizon to one and use \(t_i\) to denote the time when forecasts are updated such that \(t_0 = 0\), \(t_1 = 0.25\), \(t_2 = 0.5\), \(t_3 = 0.75\) and \(t_4 = 1\). It follows from Example 3.5 that the volatility parameter \(\sigma = 0.83\). At time \(t_i\), the demand follows a lognormal distribution with the following parameters:

\[
D_4 \mid D_0 \sim \text{log-}N\left(\ln(1000) - 0.83^2/2,\ 0.83\right),
\]

\[
D_4 \mid D_i \sim \text{log-}N\left(\ln(D_i) - 0.83^2/2 \times (1 - t_i),\ 0.83\sqrt{1 - t_i}\right),
\]

for \(i \in \{1, 2, 3\}\).
Example 4.5 The manufacturer updates the demand forecasts such that \(D_0 = 1000\), \(D_1 = 900\), \(D_2 = 750\), \(D_3 = 725\) and \(D_4 = 700\) units. Use the price and cost parameters given in Example 4.2, and calculate the optimal order quantity and excess inventory values if the ordering decision is made at \(t_i\ \forall i \in \{0, 1, 2, 3\}\).

Solution 4.5 Demand follows a lognormal distribution with the location parameter of \(\ln(1000) - 0.83^2/2 = 6.56\) and a scale parameter of 0.83 at time \(t_0\). With the z value of 0.674 for the given critical fractile, the optimal order quantity is:

\[
Q_0 = e^{6.56 + 0.83 \times 0.674} = 1235 \text{ units}.
\]

If the ordering decision is made at \(t_0\), the excess inventory is equal to 1235 − 700 = 535 units. At time \(t_1\), demand follows a lognormal distribution with the location parameter of \(\ln(900) - 0.83^2/2 \times (1 - 0.25) = 6.54\) and a scale parameter of \(0.83\sqrt{1 - 0.25} = 0.72\). The optimal order quantity is:

\[
Q_1 = e^{6.54 + 0.72 \times 0.674} = 1124 \text{ units}.
\]

If the ordering decision is made at \(t_1\), the excess inventory is equal to 1124 − 700 = 424 units. At time \(t_2\), demand follows a lognormal distribution with the location parameter of \(\ln(750) - 0.83^2/2 \times (1 - 0.5) = 6.45\) and a scale parameter of \(0.83\sqrt{1 - 0.5} = 0.59\). The optimal order quantity is:

\[
Q_2 = e^{6.45 + 0.59 \times 0.674} = 942 \text{ units}.
\]

If the ordering decision is made at \(t_2\), the excess inventory is equal to 942 − 700 = 242 units. At time \(t_3\), demand follows a lognormal distribution with the location parameter of \(\ln(725) - 0.83^2/2 \times (1 - 0.75) = 6.50\) and a scale parameter of \(0.83\sqrt{1 - 0.75} = 0.42\). The optimal order quantity is:

\[
Q_3 = e^{6.50 + 0.42 \times 0.674} = 883 \text{ units}.
\]

If the ordering decision is made at \(t_3\), the excess inventory is equal to 883 − 700 = 183 units.
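The calculations of Solution 4.5 can be reproduced with a short script. This sketch assumes the standard lognormal relation \(\sigma = \sqrt{\ln(1 + CV^2)}\) for the volatility and keeps full precision throughout, so the quantities differ by a few units from the hand calculation above, which rounds the location and scale parameters to two decimals. Function names are illustrative.

```python
import math
from statistics import NormalDist

PRICE, COST, SALVAGE = 10.0, 4.0, 2.0
CV = 1.0
forecasts = {0.0: 1000.0, 0.25: 900.0, 0.5: 750.0, 0.75: 725.0}  # t_i -> D_i
final_demand = 700.0

sigma = math.sqrt(math.log(1 + CV**2))  # lognormal volatility implied by the CV
z = NormalDist().inv_cdf((PRICE - COST) / (PRICE - SALVAGE))


def order_quantity(t, forecast):
    """Critical-fractile quantile of the conditional lognormal demand."""
    remaining = 1.0 - t
    location = math.log(forecast) - sigma**2 / 2 * remaining
    scale = sigma * math.sqrt(remaining)
    return math.exp(location + z * scale)


def realized_profit(q):
    """Profit once the final demand is known: sales, purchase cost, salvage."""
    sales = min(q, final_demand)
    return PRICE * sales - COST * q + SALVAGE * (q - sales)


for t, forecast in forecasts.items():
    q = order_quantity(t, forecast)
    print(t, round(q), round(q - final_demand), round(realized_profit(q)))
```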
Example 4.6 Using the results of Example 4.5, calculate the benefit of postponing the ordering decision from \(t_0\) to \(t_3\).

Solution 4.6 With the order quantity \(Q_0 = 1235\), the manufacturer can only sell 700 (which is the final demand). Then, total profit of the manufacturer is equal to:

\[
\Pi(Q_0 = 1235) = 10 \times 700 - 4 \times 1235 + 2 \times 535 = \$3130.
\]

If the ordering decision is made at \(t_3\), total profit is:

\[
\Pi(Q_3 = 883) = 10 \times 700 - 4 \times 883 + 2 \times 183 = \$3834.
\]
Therefore, postponing the ordering decision from \(t_0\) to \(t_3\) helps the manufacturer increase total profit by $704.

Example 4.7 Consider the online retailer we have discussed above at the beginning of the section "Integration of Uncertain Elements in a Unified Model". Recall that the characteristic function of the log demand is given by:

\[
\psi_{\ln(D)}(\omega) = E[e^{i_m\omega\ln(D)}] = \psi_{\ln(A)}(\omega) \times \psi_{\ln(B)}(\omega),
\]

where:

\[
\psi_{\ln(A)}(\omega) = e^{i_m \times 7.4 \times \omega - 0.42^2 \times \omega^2/2}, \qquad
\psi_{\ln(B)}(\omega) = \frac{1}{5}\sum_{j=1}^{5} e^{i_m\omega\ln(p_j)},
\]

and \(p_j\) is the choice probability on the \(j\)th day. The probability values are available for only 5 days: 0.08, 0.08, 0.075, 0.085 and 0.08. Suppose the retailer replenishes the inventory according to the multi-period order-up-to model given in the previous chapter. Backorders are allowed with a backorder penalty cost of $3 per backorder. The holding cost is $2 per unit per period. The retailer replenishes the inventory daily from a local supplier. The supply lead time is negligible, so it is assumed to be zero. Develop Python code to calculate the optimal base stock (order-up-to) level.

Solution 4.7 The critical fractile value is:

\[
\beta = \frac{b}{b + h} = \frac{3}{3 + 2} = 60\%.
\]

Given that the lead time is zero, the optimal order quantity is calculated based on the demand distribution for 1 day (remember the \((l+1)\) rule of the order-up-to model from Chap. 3). The optimal order quantity, computed in Python, is equal to 146 units. The Python code is available in the online web application.

An alternative solution would be based on a heuristic approach that applies scaling according to purchase probabilities: The z value for the given critical fractile is \(z^* = \Phi^{-1}(0.6) = 0.253\). The number of visits that corresponds to this critical fractile is calculated by \(e^{7.4 + 0.253 \times 0.42} = 1820\). We then calculate the value for the heuristic approach by multiplying this value with the mean of probabilities. The mean of 0.08, 0.08, 0.075, 0.085 and 0.08 is equal to 0.08. Thus, the heuristic solution becomes equal to 1820 × 0.08 = 146 units. The heuristic solution is the same as the exact solution calculated by the FFT approach, since the standard deviation of the probabilities is very low. When we increase the standard deviation of the probabilities by changing their values to 0.08, 0.02, 0.01, 0.15 and 0.14, the heuristic solution is still equal to 1820 × 0.08 = 146 units. However, the exact solution is now equal to 159 units. Therefore, the optimality gap of the heuristic approach increases, rendering the heuristic method highly inaccurate.

Example 4.8 A retailer sells winter jackets in the US market. Due to the seasonality of the market demand, the retailer replenishes the inventory from an offshore contract manufacturer once a year in September. The jackets are sold in the market from October until next February at a price of $700 per unit. The purchasing cost is $300 per unit. Excess inventory at the end of February is sold at a discounted price of $200 per unit. The demand follows a normal distribution with a mean of 10000 and a standard deviation of 2000 units. Calculate the optimal order quantity. (Use this information to answer the questions of Examples 4.8–4.10.)
Solution 4.8 Because there is a single replenishment of inventory due to seasonal demand, the optimal order quantity can be found by the newsvendor model. The critical fractile is:

\[
\beta = \frac{p - c}{p - s} = \frac{700 - 300}{700 - 200} = 80\%.
\]

The z value for this critical fractile is obtained by taking its inverse: \(z = \Phi^{-1}(0.8) = 0.842\). Then, the optimal order quantity is equal to:

\[
Q^* = 10000 + 0.842 \times 2000 = 11684 \text{ units}.
\]
Example 4.9 Suppose that the retailer had already started to sell the winter jackets in Argentina, independently, 5 years ago. The demand values for the jackets are 6000, 2000, 9000, 3000 and 5000 units for each year. The retailer now decides to merge the US and Argentina operations such that the jackets are offered in the US market from October until next February and then in Argentina from April to August. Therefore, a single replenishment is made from the contract manufacturer to the US warehouse of the retailer once a year in September. The excess inventory at the end of the US selling season is then transferred to Argentina to meet demand there. The retailer sells the jackets at $700 per unit in both markets. The purchasing cost is $300 per unit. The excess inventory at the end of August is sold at a discounted price of $200 in Argentina. Discuss the advantages of merging operations of two different markets.

Solution 4.9 There are two main advantages of merging operations. First, it allows demand pooling such that total uncertainty can be reduced. To better understand the advantages of demand pooling, suppose that a company sells its products in two markets with identical demand dynamics. Assume that demand follows a normal distribution with a mean of \(\mu\) and a standard deviation of \(\sigma\) in each market. The coefficient of variation, which is considered a measure of uncertainty, is \(CV = \sigma/\mu\) for each market. If demand is pooled, the pooled demand follows a normal distribution with a mean of \(2\mu\) and a standard deviation of \(\sigma\sqrt{2}\). Thus, the uncertainty level is reduced to \(\sigma\sqrt{2}/(2\mu) = \sigma/(\mu\sqrt{2})\). The second advantage of merging operations is that the jackets are no longer offered at a discount in the US market. Price promotions and end-of-season discounts would induce customers to behave strategically such that the purchase of a product is postponed to buy the item at a lower price, rather than buying it at the full price (Cachon & Swinney, 2009). After merging the operations, the number of US customers who are willing to buy the jackets at the full price would increase because of unavailability of discount shopping.

Example 4.10 Develop Python code to calculate the optimal order quantity after merging the operations.

Solution 4.10 Total demand is the sum of two uncertain elements, that is, the demand values in the US and Argentina markets. Recall that the characteristic function of the demand is given by:

\[
\psi_D(\omega) = E[e^{i_m\omega D}] = \psi_A(\omega) \times \psi_B(\omega),
\]

where:

\[
\psi_A(\omega) = e^{i_m \times 10000 \times \omega - 2000^2 \times \omega^2/2}, \qquad
\psi_B(\omega) = \frac{1}{5}\sum_{j=1}^{5} e^{i_m\omega B_j},
\]

and \(B_j\) is the demand in the Argentina market for year \(j\). Applying the FFT procedure of the additive integration case to our problem, the optimal order quantity is found as 17798 units. The Python code of this example is provided in the online web application.
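For this particular example, the pooled distribution can also be evaluated without a Fourier transform, which gives a convenient cross-check. The sketch below is not the FFT code of the web application: it assumes the Argentina demand takes each of the five historical values with equal probability and averages the shifted normal CDF of the US demand over them. Names are illustrative.

```python
from statistics import NormalDist

US_DEMAND = NormalDist(mu=10_000, sigma=2_000)
ARGENTINA_HISTORY = [6000, 2000, 9000, 3000, 5000]  # empirical demand values
PRICE, COST, SALVAGE = 700.0, 300.0, 200.0


def pooled_cdf(q):
    """P(US + Argentina <= q), averaging over the empirical Argentina values."""
    shifted = (US_DEMAND.cdf(q - b) for b in ARGENTINA_HISTORY)
    return sum(shifted) / len(ARGENTINA_HISTORY)


def pooled_quantile(fractile, lo=0.0, hi=40_000.0, tol=1e-6):
    """Bisection for the quantity whose pooled CDF reaches the fractile."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pooled_cdf(mid) < fractile:
            lo = mid
        else:
            hi = mid
    return hi


fractile = (PRICE - COST) / (PRICE - SALVAGE)
q_pooled = pooled_quantile(fractile)
print(round(fractile, 2), round(q_pooled))
```

The result should be within a few units of the 17798 units obtained with the FFT; the small gap reflects the discretization of the FFT grid.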
4.7 Exercises

1. An online retailer offers luxury goods to customers at a discount through flash sales (see https://www.shopify.com/ca/enterprise/flash-sale for more information about the flash sale concept). Each product offering is made available on the retailer's website for a limited time period. The retailer purchases a luxury item from a wholesaler at a wholesale price of $100 per unit and sells it at a price of $120 per unit. Excess inventory cannot be sold on the website, so the retailer donates it to some non-profit organizations. The item will be sold on the website only for 6 hours. The retailer assumes that the number of visits during this time period follows a lognormal distribution with a location parameter of 6.56 and a scale parameter of 0.83. The purchase probability of each customer is estimated from past sales of similar items. The retailer has a set of empirical values of such purchase probabilities as 0.02, 0.23, 0.19, 0.47, 0.26, 0.29, 0.33, 0.24 and 0.53. Develop the Python code and calculate the optimal order quantity.

2. What would be the expected profit if the retailer orders 400 units? Develop the Python code to answer this question.
References
Biçer, I., Tarakçı, M., & Kuzu, A. (2022). Using uncertainty modeling to better predict demand. Harvard Business Review. https://hbr.org/2022/01/using-uncertainty-modeling-to-better-predict-demand
Biçer, I., & Tarakcı, M. (2021). From transactional data to optimized decisions: An uncertainty modeling approach with fast Fourier transform. Working Paper, York University, Schulich School of Business.
Biçer, I., Hagspiel, V., & De Treville, S. (2018). Valuing supply-chain responsiveness under demand jumps. Journal of Operations Management, 61(1), 46–67.
Cachon, G. P., & Swinney, R. (2009). Purchasing, pricing, and quick response in the presence of strategic consumers. Management Science, 55(3), 497–511.
Candogan, O. (2020). Information design in operations. In Tutorials in operations research (pp. 176–201).
Carr, P., & Madan, D. (1999). Option valuation using the fast Fourier transform. Journal of Computational Finance, 2(4), 61–73.
Coase, R. H. (1937). The nature of the firm. Economica, 4(16), 386–405.
Gatheral, J. (2006). The volatility surface: A practitioner’s guide. John Wiley & Sons.
Guiso, L., & Parigi, G. (1999). Investment and demand uncertainty. The Quarterly Journal of Economics, 114(1), 185–227.
Kadakia, P. M. (2020). Inside view: What happened when Jeff Bezos met Shah Rukh Khan, Zoya Akhtar. Forbes India. https://www.forbesindia.com/article/special/inside-view-what-happened-when-jeff-bezos-met-shah-rukh-khan-zoya-akhtar/57181/1
Laffont, J.-J. (1989). The economics of uncertainty and information. The MIT Press.
Lane, D. A., & Maxfield, R. R. (2005). Ontological uncertainty and innovation. Journal of Evolutionary Economics, 15(1), 3–50.
Merton, R. C. (1976). Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics, 3(1–2), 125–144.
Wang, T., Atasu, A., & Kurtuluş, M. (2012). A multiordering newsvendor model with dynamic forecast evolution. Manufacturing & Service Operations Management, 14(3), 472–484.
5
Supply Chain Responsiveness
Keywords
Lead time reduction · Dual sourcing · Quantity flexibility · Multiple ordering models
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_5

In Chap. 3, we reviewed the inventory models under demand uncertainty, where the order quantity is determined in such a way that optimizes the trade-off between excess inventory and out-of-stocks. In this chapter, we explore the strategies that help reduce the excess inventory and out-of-stock risks jointly, rather than optimizing the trade-off.

As stated earlier, one of the most important consequences of operational inefficiency is the imbalance between supply and demand. If supply exceeds demand, companies end up with excess inventory and incur inventory write-offs and the cost of holding excess inventory. According to market research (Gustafson, 2015), excess inventory costs were estimated at $472 billion in the retail industry alone. When supply falls short of demand, companies end up with empty shelves and incur lost revenues. According to the same market research, the cost of stock-outs was estimated to be $634 billion in the retail industry alone.

Other industries such as manufacturing and wholesaling also suffer from supply-demand mismatches. For example, Hewlett-Packard identified three types of excess inventory costs: price devaluation, salvage losses, and capital costs (Callioni et al., 2005). Electronics manufacturers incur price devaluation costs when the prices of electronic components decrease over time. Manufacturers with high inventory levels are affected by price devaluation more severely than those with low inventory levels. Salvage losses occur when there is excess inventory at the end of the product’s life cycle. Capital costs are a financial loss resulting from holding inventory, because companies cannot generate any financial income when their capital is tied up in inventory. For example, interest income can be earned if a manufacturer keeps her capital in a savings account at a bank. When the manufacturer uses the capital to hold inventory instead, she loses the opportunity of earning interest income, resulting in a capital cost for holding inventory. At Hewlett-Packard, these cost elements constitute a substantial part of the company’s operating expenses, which can be as high as the total operating profit (Callioni et al., 2005).

Equally important, product shortages may have severe consequences for manufacturers. For example, the shortage of electronic chips in 2021 caused substantial losses not only for chip makers but also for car, computer, and home appliance manufacturers (Jeong & Strumpf, 2021).

To minimize the mismatches between supply and demand, companies ought to improve supply chain responsiveness, which is defined as the ability to react quickly to the changes in customer demand. Companies with responsive supply chains can rapidly adjust inventory levels in accordance with customer demand, which translates into lower levels of out-of-stocks and excess inventory. Establishing supply chain responsiveness is costly; therefore, organizations should assess its value carefully before making irreversible investments. Offshore production offers the benefits of low production costs, but long lead times often induce companies to commit to production quantities well in advance of the realization of market demand. This in turn causes large mismatches between supply and demand. On the other hand, domestic production helps organizations postpone ordering decisions until after valuable market information is collected, making it possible to base such decisions on more accurate demand information compared to offshore production. However, the land and labour costs are generally much higher for production facilities located near market bases than they are for those located offshore.
Therefore, organizations that consider re-shoring their production near market bases to improve supply chain responsiveness face a trade-off between the cost of supply-demand mismatches and production costs (e.g. land and labour costs).

There are different mechanisms to improve supply chain responsiveness. In this chapter, we discuss these approaches and the analytical models that underlie them. The first approach is lead time reduction. In Chap. 1, we presented a supply chain framework that includes some lead time components (see Fig. 1.2). We now refer to the decision lead time as the lead time for short. It is the time elapsed from the instant a production/procurement order is placed to when the customer demand is known. We recall that its duration is equal to “supply lead time + days of inventory − demand lead time”. The lower its value, the more accurate the demand forecast is when the order is placed. Depending on their organizational structure, companies can reduce lead time in a variety of ways. Those that vertically integrate manufacturing and retail operations can cut lead time by building facilities close to their markets. There are also alternative approaches to reduce lead time without changing the manufacturing location. For example, increasing production capacity or changing the production system may help reduce lead time.

Manufacturers often produce different items by utilizing the same resources. To switch production from one product to another, resources should be set up accordingly. Because set-ups take some time and can be costly, companies produce items in large quantities to minimize the number of set-ups. As a result, the replenishment frequency is reduced and production orders have to be determined well in advance of the delivery to customers. To postpone production orders and hence reduce lead time, companies should increase the frequency of replenishments and reduce production quantities. This can be achieved by investing in reducing set-up times (Suri, 1998). If reducing the set-up time is not possible, the production capacity can be expanded to increase the replenishment frequency and reduce lead times.

The second approach for increasing supply chain responsiveness is the multiple sourcing strategy. Companies that source products from an offshore manufacturing site can utilize an additional local supplier to meet demand peaks. In this setting, it is important to order from the offshore supplier in small quantities so that the risk of demand being less than the offshore order quantity is kept at the minimum level. But this strategy could lead to a high risk of stock-outs, especially when demand turns out to be greater than expected. To meet demand peaks, domestic suppliers can be utilized at a higher cost. This method is known as dual sourcing in the operations management literature (Biçer, 2015). In a dual-sourcing model, the procurement order from the offshore supplier is placed under demand uncertainty. Decision-makers also reserve a responsive capacity with a local supplier that can be used after the demand is known. Therefore, there is a single cost parameter for orders from the offshore supplier, which is the unit cost. For orders from the local supplier, however, decision-makers first incur a capacity reservation cost and then the cost of utilizing the capacity for each unit ordered.

The third approach is to enter into a quantity flexibility contract with suppliers. In quantity flexibility settings, an initial order quantity is determined at the beginning.
The order quantity can be increased or decreased within some limits later. Offshore suppliers sometimes offer such contracts to help their buyers mitigate supply-demand mismatches. The minimum level the buyer agreed to purchase is often transported by ship, which takes a long time. The additional amount that later becomes known to the supplier can be delivered by air cargo in a short time.

The fourth approach is the partial-ordering strategy, in which partial orders can be placed over time during the lead time period. This flexible ordering strategy helps organizations improve supply chain responsiveness, because it allows decision-makers to collect valuable demand information over time and determine their partial order quantities accordingly. There are different ways to improve ordering flexibility, depending on how products are manufactured. Some items are produced after raw materials go through a series of operations. To make clothes, for example, yarns are produced first from cotton or wool. Then, they are woven to make textiles, which are sewn into different models in different sizes. In this setting, the first decision is the amount of yarn to produce. Once yarn production has been completed, the next decision is about the amount of textiles. The order quantities of each SKU are determined once the textiles are produced. Therefore, the final order quantities are determined after some time, during which decision-makers can collect market information about demand and improve the forecast accuracy. The fashion apparel industry has distinct operations that are completed sequentially to make clothes. This setting naturally provides decision-makers with ordering flexibility.

Manufacturers in other industries use innovative techniques to simplify complex production systems by producing and assembling modular components in sequence to make final products. Such innovations divide the production systems into distinct, sequential operations, and the decision about the final order quantities can be postponed until the start of the final operation. This in turn leads to greater ordering flexibility and supply chain responsiveness. For example, ASML, the Dutch technology company that produces modular lithography systems for semiconductor manufacturers, has successfully implemented this strategy to postpone the costly operations to later stages, thereby minimizing the risk of overutilizing expensive resources (Biçer et al., 2022).

If dividing production systems into distinct, sequential operations is not possible, companies can still establish ordering flexibility by reducing set-up times and costs. Reducing set-up times makes it possible to produce in small quantities and switch production between different products frequently. Consequently, faster set-ups enable quicker responses to customer demands. Suppose that a manufacturer plans to produce 100 units of a product in four weeks. If set-up is costly and set-up times are long, 100 units would be produced in a single production run to minimize the number of set-ups. If set-up is not costly and set-up times are relatively short, 100 units would be produced in four production runs, one in each week. Now consider a case where the manufacturer collects demand information over time and updates the demand forecast at the end of the third week to be 75 units. Then, the last production run can be cancelled and only 75 units would be produced to perfectly match supply with demand.
5.1
Lead Time Reduction
Imagine a retailer that can source its products from either an offshore supplier or a domestic supplier. After the products are delivered by one of the suppliers, they are sold on the market. Once the market demand is realized, the goods are immediately delivered to the retailer’s customers. On the one hand, ordering from the domestic supplier makes it possible to defer the ordering decision until there is a partial or full resolution of demand uncertainty, which in turn reduces the mismatch costs. On the other hand, the offshore supplier offers lower purchasing costs than the domestic supplier. The retailer should select one of the suppliers depending on the trade-off between the mismatch and purchasing costs. If the supply-demand mismatch cost is much higher than the purchasing cost difference between the offshore and domestic suppliers, the retailer would be better off selecting the domestic supplier. Otherwise, selecting the offshore supplier would be a better alternative.

Figure 5.1 shows the ordering dynamics and the long and short lead time alternatives for the retailer. The time when the market demand is realized is denoted by $t_n$. If the procurement order is placed with the offshore supplier, it has to be done at time $t_l$. If the procurement order is placed with a domestic supplier, it has to be done at time $t_s$, such that $t_n > t_s > t_l$.
[Figure: two timelines running from the ordering decision to the delivery to customers. Long lead time = $t_n - t_l$ (order placed at $t_l$); short lead time = $t_n - t_s$ (order placed at $t_s$).]
Fig. 5.1 Ordering decisions for long and short lead time cases
To price the value of lead time reduction, we consider a multiplicative demand model, because it is more convenient than an additive one for capturing dynamics with high demand uncertainty. Nevertheless, the analytical results presented here can be extended to an additive demand model by using the formulas in the previous chapter. The conditional demand distribution for the multiplicative process is given by:

$$D_n \mid D_i \sim \log\text{-}N\big(\ln(D_i) + (\mu - \sigma^2/2)(t_n - t_i),\ \sigma\sqrt{t_n - t_i}\big).$$
With this demand distribution, the maximum expected profit for the newsvendor model (i.e. when the profit-maximizing order quantity is placed) is found by:

$$E(\Pi(Q^*)) = (p - s)\int_0^{Q^*} x f(x)\,\partial x = (p - s) D_i e^{\mu(t_n - t_i)} \Phi\big(z_Q - \sigma\sqrt{t_n - t_i}\big),$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function and $z_Q$ is the standard normal quantile at the critical fractile, that is, $z_Q = \Phi^{-1}((p - c)/(p - s))$ for a unit cost of $c$.

We assume that the cost of the product per unit is $c_l$ when the order is placed with the offshore supplier with a long lead time and $c_s$ when it is placed with the domestic supplier with a short lead time. The offshore supplier is cheaper than the local one, so $c_l < c_s$. The demand forecast at $t_l$ is $D_l$, and it is $D_s$ at $t_s$. When the retailer purchases the products from the offshore supplier, the maximum expected profit becomes:

$$E_l(\Pi(Q^*) \mid D_l) = (p - s) D_l e^{\mu(t_n - t_l)} \Phi\Big(\Phi^{-1}\big((p - c_l)/(p - s)\big) - \sigma\sqrt{t_n - t_l}\Big).$$
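The closed-form expression for the maximum expected profit can be sanity-checked against a direct numerical integration of $(p - s)\int_0^{Q^*} x f(x)\,\partial x$. The following is a minimal sketch rather than the book's own code; the function name `max_expected_profit` and the parameter values are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad


def max_expected_profit(p, c, s, d_i, mu, sigma, tau):
    """Closed-form maximum expected newsvendor profit under lognormal demand.

    tau is the remaining time t_n - t_i when the order is placed.
    """
    z_q = stats.norm.ppf((p - c) / (p - s))  # quantile at the critical fractile
    return (p - s) * d_i * np.exp(mu * tau) * stats.norm.cdf(z_q - sigma * np.sqrt(tau))


# Illustrative (assumed) parameter values.
p, c, s = 10.0, 2.0, 1.0
d_i, mu, sigma, tau = 1.0, 0.0, 1.0, 1.0

# Conditional demand distribution D_n | D_i as a frozen scipy lognormal.
demand = stats.lognorm(
    s=sigma * np.sqrt(tau),
    scale=d_i * np.exp((mu - sigma**2 / 2) * tau),
)

# Optimal order quantity and the integral of x f(x) from 0 to Q*.
q_star = demand.ppf((p - c) / (p - s))
integral, _ = quad(lambda x: x * demand.pdf(x), 0.0, q_star)

numeric = (p - s) * integral
closed = max_expected_profit(p, c, s, d_i, mu, sigma, tau)
print(f"numerical: {numeric:.6f}, closed form: {closed:.6f}")
```

The two values should agree up to the tolerance of the numerical integration.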
The ordering decision with the domestic supplier is made when the decision-maker observes the demand forecast at time $t_s$. The maximum expected profit for ordering with the domestic supplier is found by:

$$E_s(\Pi(Q^*) \mid D_s) = (p - s) D_s e^{\mu(t_n - t_s)} \Phi\Big(\Phi^{-1}\big((p - c_s)/(p - s)\big) - \sigma\sqrt{t_n - t_s}\Big).$$

Decision-makers often have to determine whether to use offshore or domestic suppliers at the very beginning. They know the demand forecast $D_l$ at that time. However, the $D_s$ term is not known when a decision to choose between offshore and domestic suppliers is made. For that reason, we replace $D_s$ in the last expression with its expected value, such that $E(D_s \mid D_l) = D_l e^{\mu(t_s - t_l)}$. Thus,

$$E_s(\Pi(Q^*) \mid D_l) = (p - s) D_l e^{\mu(t_n - t_l)} \Phi\Big(\Phi^{-1}\big((p - c_s)/(p - s)\big) - \sigma\sqrt{t_n - t_s}\Big).$$
We now consider a case in which the retailer already has an offshore supplier, for which the cost is $c_l$. The retailer also explores some domestic suppliers as an alternative to the offshore supplier. Before selecting a domestic supplier, the decision-makers want to find the break-even cost in order to assess the feasibility of using the domestic supplier. The break-even cost is denoted by $c_s^*$ such that if the cost of ordering from the domestic supplier is equal to this value, the maximum expected profits for ordering from domestic and offshore suppliers are the same. This value can be found as follows:

$$E_s(\Pi(Q^*) \mid D_l) = E_l(\Pi(Q^*) \mid D_l),$$

$$\Phi^{-1}\big((p - c_s^*)/(p - s)\big) - \sigma\sqrt{t_n - t_s} = \Phi^{-1}\big((p - c_l)/(p - s)\big) - \sigma\sqrt{t_n - t_l},$$

$$c_s^* = p - (p - s)\,\Phi\Big(\Phi^{-1}\big((p - c_l)/(p - s)\big) - \sigma\sqrt{t_n - t_l} + \sigma\sqrt{t_n - t_s}\Big).$$

This last equation helps determine the break-even cost premium that companies are willing to pay to reduce lead times by sourcing domestically (De Treville et al., 2014; Biçer et al., 2018):

$$\text{Break-even Cost Premium} = \frac{c_s^* - c_l}{c_l}.$$
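The break-even derivation above can be verified numerically: at $c_s = c_s^*$, the two expected-profit expressions should coincide. The sketch below is an illustration under assumed parameter values; the helper names `expected_profit` and `break_even_cost` are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def expected_profit(p, c, s, d_l, mu, sigma, t_order, t_l=0.0, t_n=1.0):
    """Maximum expected profit evaluated at t_l when the order is placed at t_order."""
    z = norm.ppf((p - c) / (p - s))
    growth = np.exp(mu * (t_n - t_l))
    return (p - s) * d_l * growth * norm.cdf(z - sigma * np.sqrt(t_n - t_order))


def break_even_cost(p, c_l, s, sigma, t_s, t_l=0.0, t_n=1.0):
    """Domestic unit cost c_s* that equalizes the offshore and domestic expected profits."""
    z_l = norm.ppf((p - c_l) / (p - s))
    shift = sigma * (np.sqrt(t_n - t_s) - np.sqrt(t_n - t_l))
    return p - (p - s) * norm.cdf(z_l + shift)


# Assumed illustrative values; the domestic order is placed halfway through the horizon.
p, s, c_l, sigma, t_s = 10.0, 1.0, 2.0, 1.0, 0.5

c_star = break_even_cost(p, c_l, s, sigma, t_s)
premium = (c_star - c_l) / c_l

profit_offshore = expected_profit(p, c_l, s, 1.0, 0.0, sigma, t_order=0.0)
profit_domestic = expected_profit(p, c_star, s, 1.0, 0.0, sigma, t_order=t_s)

print(f"break-even cost: {c_star:.3f}, premium: {premium:.1%}")
print(f"offshore profit: {profit_offshore:.4f}, domestic profit: {profit_domestic:.4f}")
```

Any domestic cost below `c_star` makes the domestic supplier the more profitable choice under these assumptions.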
Suppose that the break-even cost premium is equal to 40% for a company that reduces the lead time by 80%. That is, the company is indifferent between the offshore and domestic suppliers when $c_s = 1.4c_l$. Then, a domestic supplier is more attractive than an offshore supplier when both of the following conditions hold:
• The cost of ordering from a domestic supplier is less than or equal to $1.4c_l$ (i.e. $c_l < c_s \le 1.4c_l$).
• Ordering locally helps reduce the lead time by no less than 80%.
[Figure: break-even cost premium (vertical axis, 0% to 125%) plotted against the time to order with the domestic supplier (horizontal axis, 0.00 to 1.00).]
Fig. 5.2 Cost premium for domestic production
We now present an example of a buyer that can source products locally or from an offshore supplier. The selling price of the product in the market is $10 per unit during the selling season ($p = \$10$). Unsold products at the end of the selling season can be salvaged at a price of $1 per unit. The cost of ordering from the offshore supplier is $2 per unit ($c_l = \$2$). We normalize the long lead time ($t_n - t_l$) to one so $t_n = 1$ and $t_l = 0$. We also normalize the initial demand forecast to one and assume that $\mu = 0$ and $\sigma = 1$. The forecast evolution process for these demand parameters is given in Fig. 4.3 for the multiplicative model. We vary the time of ordering with the domestic supplier ($t_s$) and calculate the break-even cost premium accordingly.

Figure 5.2 shows the cost-premium frontier for varying $t_s$ values. Any cost and lead time pair that falls below the frontier makes domestic sourcing more profitable than sourcing from the offshore supplier. When $t_s$ is equal to zero, the ordering decision is made at the same time for both domestic and offshore sourcing options. In this case, there is no advantage to sourcing locally. Therefore, the cost-premium value is equal to zero for $t_s = 0$. As $t_s$ is increased, the ordering decision with the domestic supplier can be postponed, which in turn leads to an exponential increase in the cost-premium value. If local sourcing makes it possible to reduce lead times enough to place purchase orders after knowing the demand, the resulting cost premium becomes equal to 135%. From a modeling perspective, this occurs when $t_s = t_n = 1$. From a practical point of view, the buyer knows demand before placing a procurement order if the demand lead time is longer than the sum of the supply lead time and days of inventory (i.e. operating lead time; see Fig. 1.2 in Chap. 1). In such a case, local sourcing would be overwhelmingly more appealing for the buyer, even if its cost is twice the cost of offshore sourcing.

In this example, we focus on the comparison between offshore and local sourcing. But the cost-premium frontier can also be used for alternative cases. Suppose, for example, a buyer sources products from an offshore supplier. The products can be delivered by ocean freight or air cargo. If ocean transshipment is used, the ordering decision has to be made at time $t_l$. On the other hand, air cargo makes it possible to postpone the ordering decision for some time, but it leads to an increase in ordering costs. The cost-premium frontier can help decision-makers determine under which circumstances air cargo should be preferred over ocean transshipment.

The cost-premium frontier strongly depends on demand uncertainty because reducing the lead time is more beneficial for products with high uncertainty. In Fig. 5.3, we only change the volatility parameter to $\sigma = 0.75$ and $\sigma = 1.25$ to investigate the cases with low and high uncertainty. The frontier curve shifts downward when the uncertainty is low but upward when the uncertainty is high. This indicates that the value of lead time reduction increases with demand uncertainty.

[Figure: cost premium (vertical axis, 0% to 175%) plotted against the time to order with the domestic supplier (horizontal axis, 0.00 to 1.00) for three curve types: A, high uncertainty; B, moderate uncertainty; C, low uncertainty.]
Fig. 5.3 Cost-premium frontiers for different demand uncertainty levels: $\sigma = 0.75$ for low uncertainty, $\sigma = 1.00$ for moderate uncertainty, and $\sigma = 1.25$ for high uncertainty

In the online web application, we provide an interactive example that shows the cost-premium frontier for given input parameters. The user can specify the input values and regenerate the graph given in Fig. 5.2. The Python codes of the example are also given in the online web application.
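As a rough companion to Figs. 5.2 and 5.3, the frontier values can be tabulated directly from the break-even cost formula. This is a minimal sketch, not the code from the web application; it uses the example's parameters ($p = 10$, $s = 1$, $c_l = 2$, $t_l = 0$, $t_n = 1$), while the function name `cost_premium` and the five-point grid are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm


def cost_premium(t_s, sigma, p=10.0, s=1.0, c_l=2.0, t_l=0.0, t_n=1.0):
    """Break-even cost premium (c_s* - c_l) / c_l for an ordering time t_s."""
    z_l = norm.ppf((p - c_l) / (p - s))
    arg = z_l - sigma * np.sqrt(t_n - t_l) + sigma * np.sqrt(t_n - t_s)
    c_star = p - (p - s) * norm.cdf(arg)
    return (c_star - c_l) / c_l


# Coarse grid of ordering times with the domestic supplier.
t_grid = np.linspace(0.0, 1.0, 5)

# One frontier per volatility level: low, moderate, and high uncertainty.
frontiers = {sigma: cost_premium(t_grid, sigma) for sigma in (0.75, 1.0, 1.25)}

for sigma, values in frontiers.items():
    print(f"sigma = {sigma}: premium (%) = {np.round(100 * values, 1)}")
```

For $\sigma = 1$, the premium starts at zero for $t_s = 0$ and reaches roughly 135% at $t_s = 1$, in line with the discussion above; the frontier for the higher volatility lies above it and the one for the lower volatility lies below it.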
5.2
Multiple Sourcing
Despite its effectiveness in comparing sourcing alternatives, the cost-premium approach suffers from one limitation. It restricts decision-makers to choosing one of the options while foregoing the benefits of other alternatives. However, it would be possible to use both offshore and domestic suppliers at the same time to benefit from both the low cost offered by the offshore supplier and the responsiveness offered by the domestic supplier (Allon & Van Mieghem, 2010; Cattani et al., 2008; Biçer, 2015).

If both suppliers are used at the same time to meet market demand for a certain product, decision-makers need to determine the order quantities from each of the sources. The order quantity from the offshore supplier should be determined well in advance of the selling season. The order quantity from the domestic supplier can be determined later in the season when the total demand can be estimated with very high accuracy. In this setting, domestic suppliers may end up receiving no orders when demand turns out to be low. Therefore, they do not rely on the orders from buyers that use them as a second source in addition to an offshore supplier. To use a domestic source, buyers would need to reserve some of the source’s capacity, thereby incurring a capacity reservation cost.

We use $Q_1$ and $c_1$ to denote the order quantity and cost of ordering per unit from an offshore supplier, respectively. The reactive capacity (reserved at the site of a domestic supplier) and capacity reservation cost per unit are denoted by $K$ and $c_k$, respectively. The buyer incurs the capacity reservation cost (i.e. $c_k K$) at the beginning. The order quantity from the domestic supplier is limited by the capacity reserved. The order quantity and cost of ordering per unit from the domestic supplier (that is, the cost incurred in addition to the capacity reservation cost) are denoted by $Q_2$ and $c_2$, respectively. Figure 5.4 depicts the sequence of events in this setting (Biçer, 2015).

[Figure: timeline over the planning horizon. At the start, the buyer orders $Q_1$ units from the offshore supplier and reserves a reactive capacity of $K$ units from the domestic supplier. At the end, $Q_1$ units are received from the offshore supplier, customer demand is realized, and excess demand $\max(D - Q_1, 0)$ is satisfied up to $K$ units from the domestic supplier.]
Fig. 5.4 Sequence of events for the dual-sourcing model

At the beginning of the planning horizon, the buyer makes two decisions in the face of demand uncertainty. It orders $Q_1$ units from the offshore supplier and incurs a cost of $c_1 Q_1$. It also reserves $K$ units from the domestic supplier and incurs a cost of $c_k K$. At the end of the planning horizon, the offshore supplier delivers $Q_1$ units to the buyer. Then, the buyer observes the market demand $D$. After knowing the demand, the reactive capacity can be utilized to meet excess demand $\max(D - Q_1, 0)$ up to $K$ units. Thus, the order quantity from the domestic supplier becomes equal to:

$$Q_2 = \max(\min(D, Q_1 + K), Q_1) - Q_1.$$
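The expression for $Q_2$ covers three demand regimes: demand below $Q_1$, demand between $Q_1$ and $Q_1 + K$, and demand above $Q_1 + K$. The short sketch below evaluates it in each regime; the function name `domestic_order` and the quantities used are hypothetical.

```python
def domestic_order(demand, q1, k):
    """Order quantity from the domestic supplier once demand is known."""
    return max(min(demand, q1 + k), q1) - q1


# Hypothetical offshore order quantity and reserved reactive capacity.
q1, k = 100, 40

low = domestic_order(80, q1, k)    # demand below Q1: no domestic order, so 0
mid = domestic_order(120, q1, k)   # demand within the capacity: D - Q1 = 20
high = domestic_order(170, q1, k)  # demand above Q1 + K: capped at K = 40

print(low, mid, high)
```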
The term $\min(D, Q_1 + K)$ gives total sales. This amount is restricted by the maximum availability, which is the sum of the order quantity from the offshore supplier and the reserved capacity ($Q_1 + K$). If demand exceeds this value, only $Q_1 + K$ can be sold and excess demand is foregone. Otherwise, the buyer can meet demand completely and sell $D$ units. The term $\max(\min(D, Q_1 + K), Q_1)$ gives the total inventory available in stock after orders from both offshore and domestic suppliers are delivered and before customer demand is fulfilled. Subtracting $Q_1$ from this term yields the order quantity from the domestic supplier. The buyer incurs an additional cost of ordering from the domestic supplier: $c_2 Q_2$, which can be considered the operating cost of utilizing the domestic supplier. The buyer can benefit from dual sourcing, compared to single sourcing from either an offshore or a domestic supplier, if

$$p > c_k + c_2 > c_1 > c_k.$$
When $p < c_k + c_2$, the cost of reserving capacity and producing one unit with the domestic supplier exceeds the selling price. Thus, utilizing the domestic supplier would not be a viable option for the buyer. Likewise, when $c_1 < c_k$, the cost of reserving capacity with the domestic supplier is higher than the cost of ordering from the offshore supplier; hence, utilizing the domestic supplier would not be a viable option. When $c_k + c_2 < c_1$, the offshore supplier becomes more expensive than the domestic supplier. Therefore, the buyer would only order from the domestic supplier. The total profit for a dual-sourcing model is given by (Biçer, 2015):

$$\Pi(Q_1, K, Q_2) = p \min(D, Q_1 + K) + s \max(Q_1 - D, 0) - c_1 Q_1 - c_k K - c_2 Q_2.$$

The first term on the right-hand side of this expression gives the total revenue from regular sales. Excess inventory, which is salvaged at a price of $s$ per unit, is positive only when demand turns out to be less than $Q_1$. The second term hence gives the revenue collected by selling the excess inventory at the salvage value. This term is independent of the order quantity from the domestic supplier because reactive capacity is utilized after knowing the demand. If demand turns out to be less than $Q_1$ units, no order is placed with the domestic supplier. For this reason, the excess inventory has an upper bound of $Q_1$. The last three terms give the total cost. The optimal values of $Q_1$ and $K$ are found by (Biçer, 2015; Cattani et al., 2008):

$$Q_1^* = F^{-1}\Big(\frac{c_2 + c_k - c_1}{c_2 - s}\Big),$$

$$K^* = F^{-1}\Big(\frac{p - c_2 - c_k}{p - c_2}\Big) - F^{-1}\Big(\frac{c_2 + c_k - c_1}{c_2 - s}\Big),$$

where $F^{-1}(\cdot)$ is the inverse of the cumulative distribution of demand. The proof of these expressions, given by Cattani et al. (2008) and Biçer (2015), is summarized in the appendix to this chapter. The optimal values, however, can also be derived based on the marginal analysis as we have done in Chap. 3. We apply marginal analysis and answer the following questions:
• What happens if the order quantity from the offshore supplier is increased by one unit while keeping the reactive capacity the same?
• What happens if the reactive capacity at the domestic supplier is increased by one unit while keeping the order quantity from the offshore supplier the same?

When $Q_1 = K = 0$, increasing the order quantity and/or the reactive capacity has a positive impact on the profits. As their values increase, the positive impact of increasing the order quantity or the reactive capacity on the profits diminishes. For very large values of $Q_1$ and $K$, increasing their values further would have a negative impact on profits, because there is a high probability of ending up with excess inventory and idle capacity due to demand turning out to be low, compared to $Q_1$ and $K$ values. In this case, additional units would not be needed to meet demand. Instead, they would be salvaged at a loss, leading to a reduction in profits.

Suppose that a decision-maker has already ordered $Q_1 - 1$ units and set the reactive capacity to $K$. Then, we analyse the marginal profit that will be made by increasing the order quantity by one unit. If demand exceeds $Q_1 + K$, the revenue of the decision-maker increases by $p$ as a result of the additional unit. If demand is between $Q_1$ and $Q_1 + K$, the additional unit saves $c_2$: the reactive capacity is utilized at a lower rate, and the decision-maker would pay $c_2$ less than what she would pay when the order quantity from the offshore supplier is equal to $Q_1 - 1$ units.
If demand is less than $Q_1$ units, the additional unit will be salvaged at a salvage price of $s$. The cost of increasing the order quantity by one unit is equal to $c_1$. Then, the marginal profit is derived as follows:

$$\Delta\Pi(Q_1 \mid K) = p(1 - F(Q_1 + K)) + c_2(F(Q_1 + K) - F(Q_1)) + sF(Q_1) - c_1$$

$$= (p - c_1) - (p - c_2)F(Q_1 + K) - (c_2 - s)F(Q_1).$$

We now consider a case in which a decision-maker has already ordered $Q_1$ units and set the reactive capacity to $K - 1$. We then analyse the marginal profit of increasing the capacity level by one unit. If demand exceeds $Q_1 + K$ units, the decision-maker can increase the revenue by $p$. Because the capacity level is increased by one unit, the decision-maker first incurs a capacity reservation cost of $c_k$. If demand exceeds $Q_1 + K$ units, the decision-maker utilizes the capacity at a cost of $c_2$. Then, the marginal profit made by increasing the capacity level is derived as follows:

$$\Delta\Pi(K \mid Q_1) = p(1 - F(Q_1 + K)) - c_k - c_2(1 - F(Q_1 + K))$$

$$= p - c_2 - c_k - (p - c_2)F(Q_1 + K).$$

The optimal values of $Q_1$ and $K$ are found by setting both $\Delta\Pi(Q_1 \mid K)$ and $\Delta\Pi(K \mid Q_1)$ equal to zero. It follows from $\Delta\Pi(K \mid Q_1) = 0$ that

$$F(Q_1^* + K^*) = \frac{p - c_2 - c_k}{p - c_2}.$$
From $\Delta\Pi(Q_1 \mid K) = 0$, we have:

$$(p - c_1) - (p - c_2 - c_k) - (c_2 - s)F(Q_1^*) = 0,$$

$$F(Q_1^*) = \frac{c_2 + c_k - c_1}{c_2 - s},$$

$$Q_1^* = F^{-1}\Big(\frac{c_2 + c_k - c_1}{c_2 - s}\Big).$$

Combining this expression with $F(Q_1^* + K^*)$ yields:

$$K^* = F^{-1}\Big(\frac{p - c_2 - c_k}{p - c_2}\Big) - F^{-1}\Big(\frac{c_2 + c_k - c_1}{c_2 - s}\Big).$$
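The marginal analysis can be illustrated by evaluating the two marginal-profit expressions around the closed-form optimum: both should vanish at $(Q_1^*, K^*)$, be positive just below it, and be negative just above it. The sketch below makes two assumptions for illustration only: the demand is normal with a mean of 100 and a standard deviation of 30, and the price and cost values are those of the example that follows.

```python
from scipy.stats import norm

# Price and cost values from the example that follows; the demand distribution is hypothetical.
p, s, c1, ck, c2 = 14.0, 0.0, 2.0, 1.0, 4.0
demand = norm(loc=100.0, scale=30.0)


def marginal_q1(q1, k):
    """Marginal profit of one more offshore unit, holding the reactive capacity fixed."""
    return (p - c1) - (p - c2) * demand.cdf(q1 + k) - (c2 - s) * demand.cdf(q1)


def marginal_k(q1, k):
    """Marginal profit of one more unit of reactive capacity, holding Q1 fixed."""
    return (p - c2 - ck) - (p - c2) * demand.cdf(q1 + k)


# Closed-form optimum from the two fractile conditions.
q1_star = demand.ppf((c2 + ck - c1) / (c2 - s))
k_star = demand.ppf((p - c2 - ck) / (p - c2)) - q1_star

print(f"Q1* = {q1_star:.2f}, K* = {k_star:.2f}")
print("marginal profit of Q1 below/at/above optimum:",
      marginal_q1(q1_star - 10, k_star),
      marginal_q1(q1_star, k_star),
      marginal_q1(q1_star + 10, k_star))
print("marginal profit of K below/at/above optimum:",
      marginal_k(q1_star, k_star - 5),
      marginal_k(q1_star, k_star),
      marginal_k(q1_star, k_star + 5))
```

The sign change around the optimum reflects the diminishing marginal benefit described in the text.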
We now present an example to illustrate how the decision parameters in the dual-sourcing model impact the expected profit. We assume that a company sells a product in the market at a price of $14 per unit during the selling season. Unsold units at the end of the selling season are donated to a charity so the salvage value is equal to zero. The company can source the items from an offshore supplier and a domestic supplier. The offshore supplier is very cheap compared to the domestic supplier. The purchase cost per unit is $2 from the offshore supplier. If the company purchases items from the domestic supplier, it should first reserve reactive capacity at a cost of $1 per unit. After knowing the demand, the reactive capacity can be utilized at a cost of $4 per unit. Therefore,

$$F(Q_1^*) = \frac{c_2 + c_k - c_1}{c_2 - s} = \frac{4 + 1 - 2}{4 - 0} = 0.75,$$

$$F(Q_1^* + K^*) = \frac{p - c_2 - c_k}{p - c_2} = \frac{14 - 4 - 1}{14 - 4} = 0.9.$$
We consider a multiplicative demand model such that the initial demand forecast is normalized to one, and the drift and volatility parameters are set equal to zero and one, respectively. The planning horizon is normalized to one such that \(t_0 = 0\) and \(t_n = 1\). The actual demand is realized at time \(t_n\). Thus, demand follows a lognormal distribution with the parameters:

\[
D_n \mid D_0 \sim \text{log-}\mathcal{N}\left(\ln(D_0) + (\mu - \sigma^2/2)(t_n - t_0),\ \sigma\sqrt{t_n - t_0}\right) = \text{log-}\mathcal{N}(-0.5,\ 1).
\]

Then, the optimal value of \(Q_1\) is found by:

\[
Q_1^* = F^{-1}(0.75) = 1.19,
\]

where we use the inverse of the lognormal distribution with parameters \(-0.5\) and 1. The optimal capacity level is found by:

\[
K^* = F^{-1}(0.9) - 1.19 = 2.18 - 1.19 = 0.99.
\]
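The two critical ratios and the resulting quantities can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `lognormal_ppf` and the variable names are illustrative choices rather than part of the book's web application.

```python
import math
from statistics import NormalDist


def lognormal_ppf(prob, location, scale):
    """Inverse CDF of a lognormal distribution with log-scale parameters."""
    return math.exp(location + scale * NormalDist().inv_cdf(prob))


# Inputs of the dual-sourcing example
p, s = 14.0, 0.0            # selling price and salvage value
c1 = 2.0                    # offshore purchase cost per unit
ck, c2 = 1.0, 4.0           # domestic capacity reservation and utilization costs
d0, mu, sigma, horizon = 1.0, 0.0, 1.0, 1.0

# Lognormal parameters of demand under the multiplicative model
location = math.log(d0) + (mu - sigma**2 / 2) * horizon
scale = sigma * math.sqrt(horizon)

ratio_q1 = (c2 + ck - c1) / (c2 - s)       # F(Q1*)
ratio_total = (p - c2 - ck) / (p - c2)     # F(Q1* + K*)

q1_star = lognormal_ppf(ratio_q1, location, scale)
k_star = lognormal_ppf(ratio_total, location, scale) - q1_star

print(f"F(Q1*) = {ratio_q1:.2f}, F(Q1* + K*) = {ratio_total:.2f}")
print(f"Q1* = {q1_star:.2f}, K* = {k_star:.2f}")   # Q1* = 1.19, K* = 0.99
```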
Figure 5.5 shows the expected profit values for varying capacity levels. We set the order quantity from the offshore supplier to its optimal value such that \(Q_1^* = 1.19\) and vary the reactive capacity between zero and 1.5. The expected profit curve is concave with the maximum value obtained at \(K = 0.99\), which is exactly the optimal value of the reactive capacity. Figure 5.6 shows the expected profit values for varying order quantity values. We set the reactive capacity at the domestic supplier to its optimal value (i.e. \(K^* = 0.99\)) and vary the \(Q_1\) values between zero and 1.75. Similar to the previous figure, the curve is concave and the maximum value of the expected profit is attained at \(Q_1 = 1.19\), which is equal to the optimal order quantity. We provide an interactive example in the online web application, where the reader can regenerate Figs. 5.5 and 5.6 with the given input parameters. The Python codes that generate these figures are also available in the web application.
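The shape of the curve in Fig. 5.5 can be approximated by simulation. The sketch below is not the code from the web application. It assumes a per-outcome profit function reconstructed from the marginal analysis above (revenue on sales up to \(Q_1 + K\), offshore cost on \(Q_1\), reservation cost on \(K\), and utilization cost only on the capacity actually used). The seed, sample size, and grid spacing are arbitrary choices.

```python
import math
import random


def dual_sourcing_profit(demand, q1, k, p=14.0, c1=2.0, ck=1.0, c2=4.0, s=0.0):
    """Realized profit for one demand outcome (assumed profit structure)."""
    used = min(max(demand - q1, 0.0), k)     # reactive capacity actually utilized
    sales = min(demand, q1 + k)
    leftover = max(q1 - demand, 0.0)         # unsold offshore units
    return p * sales + s * leftover - c1 * q1 - ck * k - c2 * used


rng = random.Random(42)
# Demand follows a lognormal distribution with parameters -0.5 and 1
demands = [math.exp(-0.5 + rng.gauss(0.0, 1.0)) for _ in range(50_000)]

q1 = 1.19                                          # offshore order fixed at its optimum
grid = [round(0.1 * j, 1) for j in range(16)]      # K = 0.0, 0.1, ..., 1.5
expected = {
    k: sum(dual_sourcing_profit(d, q1, k) for d in demands) / len(demands)
    for k in grid
}
best_k = max(expected, key=expected.get)

for k in grid:
    print(f"K = {k:.1f}: expected profit = {expected[k]:.3f}")
print("Grid point with the highest simulated profit:", best_k)
```

With a coarse grid the maximizer can only land on the grid point nearest to \(K^* = 0.99\); a finer grid or more samples would sharpen the estimate at the cost of run time.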
Fig. 5.5 Impact of reactive capacity on the expected profit (x-axis: reactive capacity; y-axis: expected profit)

Fig. 5.6 Impact of order quantity from the offshore supplier on the expected profit (x-axis: order quantity; y-axis: expected profit)
5.3 Quantity Flexibility Contracts
The dual-sourcing model has two different suppliers that are utilized jointly by a buyer. In some cases, buyers source products from a single supplier with a long lead time while having some flexibility to update order quantities after improving demand forecasts. Quantity flexibility contracts provide buyers with such flexibility, thereby helping them minimize the supply-demand mismatches. Figure 5.7 depicts the sequence of events and decisions in a quantity flexibility contract model. At the beginning of the planning horizon, the buyer and the supplier negotiate the initial order quantity and the flexibility percentage. We use \(Q\) and \(\alpha\) to denote the order quantity and flexibility percentage, respectively. The initial order quantity \(Q\) can be considered a reference point, because the buyer has the flexibility to revise the initial order quantity, within limits. As time elapses from \(t_l\) to \(t_s\), the buyer collects information about the customer demand and improves the accuracy of the demand forecast. At time \(t_s\), the buyer determines the final order quantity \(Q_f\), within limits, such that:

\[
(1 - \alpha)Q \le Q_f \le (1 + \alpha)Q.
\]

Finally, actual demand \(D_n\) is realized at time \(t_n\), so the amount sold to customers is equal to \(\min(D_n, Q_f)\) units. The demand forecast \(D_s\) is observed at time \(t_s\). This information is used to estimate the actual demand \(D_n\). We use the multiplicative demand model, so the demand distribution conditioned on \(D_s\) is written as follows:

\[
D_n \mid D_s \sim \text{log-}\mathcal{N}\left(\ln(D_s) + (\mu - \sigma^2/2)(t_n - t_s),\ \sigma\sqrt{t_n - t_s}\right).
\]
Fig. 5.7 Sequence of events and decisions with quantity flexibility contracts (timeline over the planning horizon: the buyer and the supplier negotiate and determine the order quantity \(Q\) and the flexibility percentage \(\alpha\); the final order quantity \(Q_f\) is determined within the limits \((1 - \alpha)Q \le Q_f \le (1 + \alpha)Q\); delivery to customers)

The buyer aims to optimize the final order quantity \(Q_f\), such that the expected profit is maximized based on the demand information available at time \(t_s\). The flexibility percentage is not a decision variable. It is negotiated at the very beginning, and it is part of the contract. Thus, there is no additional cost associated with the revisions of the order quantity. The expected profit function at time \(t_s\) is then formulated as follows:

\[
\begin{aligned}
E(\Pi_s(Q_f \mid D_s)) &= (p - c)Q_f - (p - s)E(\max(Q_f - D_n, 0)) \\
&= (p - c)Q_f - (p - s)\int_0^{Q_f} (Q_f - x)\, f(x \mid D_s)\, dx.
\end{aligned}
\]

For the multiplicative demand model, the last expression is rewritten as follows:

\[
\begin{aligned}
E(\Pi_s(Q_f \mid D_s)) ={}& (p - c)Q_f - (p - s)Q_f F(Q_f \mid D_s) \\
&+ (p - s)D_s e^{\mu(t_n - t_s)}\, \Phi\left(\frac{\ln(Q_f/D_s) - (\mu - \sigma^2/2)(t_n - t_s)}{\sigma\sqrt{t_n - t_s}} - \sigma\sqrt{t_n - t_s}\right).
\end{aligned}
\]
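As a sanity check, the closed-form expression can be compared with a direct Monte Carlo estimate of the expected profit. The following is a minimal sketch; the function name `qf_expected_profit`, the test inputs (\(D_s = 1.1\), \(Q_f = 1\), \(t_n - t_s = 0.2\)), the seed, and the sample size are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal distribution function


def qf_expected_profit(qf, ds, p, c, s, mu, sigma, tau):
    """Closed-form expected profit at t_s; tau = t_n - t_s must be positive."""
    vol = sigma * math.sqrt(tau)
    d = (math.log(qf / ds) - (mu - sigma**2 / 2) * tau) / vol   # F(Qf | Ds) = PHI(d)
    return ((p - c) * qf
            - (p - s) * qf * PHI(d)
            + (p - s) * ds * math.exp(mu * tau) * PHI(d - vol))


# Illustrative inputs (assumed, not taken from the text)
p, c, s = 10.0, 5.0, 0.0
mu, sigma, tau = 0.0, 1.0, 0.2
ds, qf = 1.1, 1.0

closed_form = qf_expected_profit(qf, ds, p, c, s, mu, sigma, tau)

# Monte Carlo estimate using the conditional lognormal demand distribution
rng = random.Random(0)
location = math.log(ds) + (mu - sigma**2 / 2) * tau
vol = sigma * math.sqrt(tau)
draws = [math.exp(location + vol * rng.gauss(0.0, 1.0)) for _ in range(100_000)]
simulated = sum(p * min(x, qf) + s * max(qf - x, 0.0) - c * qf for x in draws) / len(draws)

print(f"closed form: {closed_form:.3f}, simulated: {simulated:.3f}")
```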
If the flexibility is unlimited, the buyer has no limitation while determining the value of \(Q_f\). The optimal quantity for such a hypothetical unconstrained model is found by:

\[
F(Q_f^*) = \Phi\left(\frac{\ln(Q_f^*/D_s) - (\mu - \sigma^2/2)(t_n - t_s)}{\sigma\sqrt{t_n - t_s}}\right) = \frac{p - c}{p - s}.
\]

Let's define a new variable such that \(z_\beta = \Phi^{-1}((p - c)/(p - s))\), where \(\Phi^{-1}(\cdot)\) is the inverse of the standard normal distribution function. Combining this with the last expression yields:

\[
D_s = Q_f^*\, e^{-(\mu - \sigma^2/2)(t_n - t_s) - z_\beta \sigma\sqrt{t_n - t_s}}.
\]
This expression has a very important implication regarding the value of flexibility projected at the beginning of the planning horizon. After the buyer and the supplier negotiate the \(Q\) and \(\alpha\) values at time \(t_l\), the buyer knows that the final order quantity can only be within the limits of \(Q(1 - \alpha)\) and \(Q(1 + \alpha)\). If the demand forecast is equal to \(D_{s1}\) at time \(t_s\), the optimal value of the final order quantity becomes equal to \(Q(1 - \alpha)\). We refer to \(D_{s1}\) as the lower threshold for the demand forecast at time \(t_s\), which is found by:

\[
D_{s1} = Q(1 - \alpha)\, e^{-(\mu - \sigma^2/2)(t_n - t_s) - z_\beta \sigma\sqrt{t_n - t_s}}.
\]
If the demand forecast is equal to \(D_{s2}\) at time \(t_s\), the optimal value of the final order quantity becomes equal to \(Q(1 + \alpha)\). We refer to \(D_{s2}\) as the upper threshold for the demand forecast at time \(t_s\), which is found by:

\[
D_{s2} = Q(1 + \alpha)\, e^{-(\mu - \sigma^2/2)(t_n - t_s) - z_\beta \sigma\sqrt{t_n - t_s}}.
\]
If the demand forecast is between \(D_{s1}\) and \(D_{s2}\) at time \(t_s\), the final order quantity is equal to its unconstrained optimal value \(Q_f^*\). Otherwise, it is limited by the ordering constraints such that:

\[
Q_f =
\begin{cases}
Q(1 - \alpha) & \text{if } D_s < D_{s1}, \\
Q_f^* = D_s\, e^{(\mu - \sigma^2/2)(t_n - t_s) + z_\beta \sigma\sqrt{t_n - t_s}} & \text{if } D_{s1} \le D_s \le D_{s2}, \\
Q(1 + \alpha) & \text{if } D_{s2} < D_s.
\end{cases}
\]

Then, the expression of \(E(\Pi_s(Q_f \mid D_s))\) gives the expected profit.

We now present an example to demonstrate how the expected profit is affected by the flexibility parameter \(\alpha\) and the time to determine the final order quantity \(Q_f\). We normalize the planning horizon \(t_n - t_l\) to one, and set the initial demand forecast \(D_l = 1\). We assume that the drift rate is equal to zero and the volatility is equal to one. The selling price is assumed to be $10 per unit, the ordering cost is $5 per unit, and there is no salvage value for unsold inventory. The initial order quantity is assumed to be equal to one unit. We note that the demand forecast is normalized to one, so the order quantity should be interpreted accordingly.

We first set the time to determine the final order quantity equal to 0.8 (i.e. \(t_s = 0.8\)) and vary the \(\alpha\) parameter between zero and one to see the impact of the flexibility parameter on the expected profit. Figure 5.8 shows the results when we simulate 10,000 random demand paths based on the multiplicative demand model with the given parameter values. We then calculate the expected profit projected at time \(t_s\) for each path using the \(E(\Pi_s(Q_f \mid D_s))\) formula. Finally, we find the mean profit values for 10,000 instances and report them in the figure.

In Fig. 5.9, we present the expected profit for varying \(t_s\) values when \(\alpha = 0.3\). When \(t_s = 0\), flexibility has no value, because the final order quantity has to be determined at the very beginning. As \(t_s\) increases, the decision-maker benefits from the flexibility to update the initial order quantity and the expected profit increases as a result. We remark that the expected profit value for \(t_s = 0.8\) in Fig. 5.9 is the same as the expected profit value for \(\alpha = 0.3\) in Fig. 5.8, because the expected profit is calculated by using the same parameter values at these two points.
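The simulation procedure described above can be sketched as follows. This is not the code from the web application; it is a minimal standard-library version that draws \(D_s\) from the multiplicative model, clips the unconstrained optimum to the contract limits (which is equivalent to the piecewise rule with thresholds \(D_{s1}\) and \(D_{s2}\)), and averages the closed-form expected profit. Function names, the seed, and the default arguments are assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def final_order(ds, q, alpha, z_beta, mu, sigma, tau):
    """Unconstrained optimum clipped to [(1 - alpha) Q, (1 + alpha) Q]."""
    unconstrained = ds * math.exp((mu - sigma**2 / 2) * tau + z_beta * sigma * math.sqrt(tau))
    return min(max(unconstrained, (1 - alpha) * q), (1 + alpha) * q)


def expected_profit_at_ts(qf, ds, p, c, s, mu, sigma, tau):
    """Closed-form expected profit projected at t_s (tau = t_n - t_s > 0)."""
    vol = sigma * math.sqrt(tau)
    d = (math.log(qf / ds) - (mu - sigma**2 / 2) * tau) / vol
    return ((p - c) * qf - (p - s) * qf * STD_NORMAL.cdf(d)
            + (p - s) * ds * math.exp(mu * tau) * STD_NORMAL.cdf(d - vol))


def mean_profit(alpha, ts, n_paths=10_000, seed=1,
                p=10.0, c=5.0, s=0.0, mu=0.0, sigma=1.0, q=1.0, dl=1.0):
    """Average projected profit over simulated forecasts D_s (horizon normalized to one)."""
    rng = random.Random(seed)
    tau = 1.0 - ts                                   # must stay positive, so ts < 1
    z_beta = STD_NORMAL.inv_cdf((p - c) / (p - s))
    total = 0.0
    for _ in range(n_paths):
        # Forecast at t_s generated by the multiplicative model starting from D_l
        ds = dl * math.exp((mu - sigma**2 / 2) * ts + sigma * math.sqrt(ts) * rng.gauss(0.0, 1.0))
        qf = final_order(ds, q, alpha, z_beta, mu, sigma, tau)
        total += expected_profit_at_ts(qf, ds, p, c, s, mu, sigma, tau)
    return total / n_paths


low = mean_profit(0.0, 0.8)     # no flexibility
mid = mean_profit(0.3, 0.8)     # the point shared by Figs. 5.8 and 5.9
high = mean_profit(1.0, 0.8)    # full downward flexibility
print(f"alpha = 0.0: {low:.3f}, alpha = 0.3: {mid:.3f}, alpha = 1.0: {high:.3f}")
```

Because the same seed is reused for every \(\alpha\), the comparison across flexibility levels is made on identical forecast draws, so the simulated profit cannot decrease as \(\alpha\) grows.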
Fig. 5.8 Impact of the flexibility parameter on expected profit when \(t_s = 0.8\) (x-axis: flexibility parameter; y-axis: expected profit)
In the online web application, we provide interactive examples, where the reader can generate Figs. 5.8 and 5.9 with the given input parameters. The Python codes are also available in the online web application.
5.4 Multiple and Sequential Ordering Problems
We now focus on different types of problems in which a decision-maker can place multiple orders during the production horizon at various decision epochs. Suppose a manufacturer produces an item using a capacitated resource such that the weekly capacity is equal to 100 units. The production lasts four weeks, and then the items produced are sold in the market. The manufacturer can collect information about demand and improve the accuracy of demand forecasts over time during the production horizon. Order quantities are determined for each week. Within each week, the manufacturer first improves the demand forecast and then determines the weekly order quantity.

We assume that the manufacturer expects the demand to be in the range [0, 300] at the beginning of the first week. The total production capacity at that time is equal to 400 units given that weekly capacity is 100 units. So, the manufacturer decides not to produce anything in the first week. At the beginning of the second week, the demand is still expected to be in the range [0, 300]. The manufacturer orders the production of 50 units for the second week. Therefore, the manufacturer can have 250 units at most if the capacity is fully utilized in the last two weeks. Thus, it may end up facing excess demand if demand turns out to be more than 250 units. There is also excess inventory risk if demand turns out to be less than 50 units, because 50 units were already produced in the second week. Suppose that the manufacturer gets some very important information about demand and updates the demand forecast at the beginning of the third week such that demand is expected to be in the range [250, 300]. Then, the capacity is fully utilized in the third week and 100 units are produced. This brings the total inventory level to 150 units at the end of the week. In the fourth week, 100 more units are produced, so the total inventory tallies 250 units at the end of the fourth week. True demand is observed to be 275 units, so the manufacturer sells only 250 units and loses the opportunity to sell 25 units. If the manufacturer were to determine the total order quantity at the very beginning, it would have ideally ordered 150 units (assuming that the critical fractile is 50%) and sold 100 units less than the former case. Therefore, placing multiple orders over time helps the manufacturer reduce supply-demand mismatches and increases profits.

Fig. 5.9 Impact of the time to determine final order quantity (i.e. \(t_s\)) on expected profit when \(\alpha = 0.3\) (x-axis: time to determine final order quantity; y-axis: expected profit)

Production capacity plays an important role in this setting, because weekly order quantities strongly depend on the capacity level. If capacity is unlimited, for example, the manufacturer could choose not to produce anything for the first three weeks. In this case, the manufacturer would ideally postpone all production to the last week. Therefore, having unlimited capacity would induce decision-makers to collect demand information during the first three weeks and the order quantity would be better determined in the last week based on accurate demand forecasts. When capacity is limited, however, such a postponement strategy would cause product shortages. For that reason, it would be ideal to start production early when capacity is limited. Therefore, there is a trade-off between demand uncertainty and capacity shortages. If production starts early, the capacity shortage risk is minimized. But the decisions in the early weeks would be based on inaccurate demand forecasts with high uncertainty. If production starts too late, ordering decisions would be based on accurate demand information. However, remaining capacity would be too limited to meet demand peaks. Hence, the manufacturer would be exposed to a high risk of capacity shortages.

Fig. 5.10 Ordering decisions in the multiple ordering problem. Demand forecasts evolve from \(t_0\) to \(t_n\). During this time period, decision-makers can place \(n\) different orders subject to a capacity limitation

We present the ordering dynamics for a multi-ordering problem in Fig. 5.10. We divide the production horizon into \(n\) periods, so the manufacturer places \(n\) orders, one for each period. The order quantity is limited by the capacity constraint \(K\). We now assume that the capacity \(K\) is exogenously determined and has a fixed value. For \(i \in \{0, 1, \cdots, n - 1\}\), the order quantity satisfies \(Q_i \le K\). We define a new variable to denote the accumulated inventory level at \(t_i\) for \(i \in \{1, 2, \cdots, n\}\) such that:

\[
I_i = \sum_{j=0}^{i-1} Q_j.
\]
At the beginning of each period, the decision-maker observes the demand forecast \(D_i\). She reviews the accumulated inventory \(I_i\) and the remaining capacity level \((n - i)K\). Then, the order quantity \(Q_i\) is determined to maximize the profit. At time \(t_n\), actual demand is realized and met from the available inventory. The expected profit is formulated as follows:

\[
E(\Pi(Q_i \mid I_i, D_i)) = E\big(p \min(D, I_n) + s \max(I_n - D, 0) - c I_n\big).
\]

An analytical solution to this problem is given by Biçer and Seifert (2017). The optimal solution has a complex structure. Yet, it can be approximated by:

\[
0 = p\,[1 - F(I_i + Q_i + (n - i - 1)K \mid D_i)] + s F(I_i + Q_i \mid D_i) - c.
\]
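The approximate condition can be solved numerically for \(Q_i\) and then restricted to the feasible range \([0, K]\). The sketch below uses bisection under the multiplicative (lognormal) demand model. The function names and the numerical inputs in the usage lines are illustrative assumptions. With \(s = 0\) the left-hand side is decreasing in \(Q_i\), so the root is unique; for \(s > 0\) this sketch simply returns one root inside the bracket without checking uniqueness.

```python
import math
from statistics import NormalDist


def demand_cdf(x, d_i, mu, sigma, tau):
    """F(x | D_i) for the multiplicative model; tau is the time left until t_n (> 0)."""
    if x <= 0.0:
        return 0.0
    location = math.log(d_i) + (mu - sigma**2 / 2) * tau
    return NormalDist(location, sigma * math.sqrt(tau)).cdf(math.log(x))


def order_quantity(inventory, d_i, periods_after, cap, p, c, s, mu, sigma, tau):
    """Solve 0 = p[1 - F(I + Q + R)] + s F(I + Q) - c on [0, K], with R = periods_after * K."""
    reserve = periods_after * cap        # capacity still available in later periods

    def g(q):
        return (p * (1.0 - demand_cdf(inventory + q + reserve, d_i, mu, sigma, tau))
                + s * demand_cdf(inventory + q, d_i, mu, sigma, tau) - c)

    if g(0.0) <= 0.0:                    # even the first unit is not worth ordering
        return 0.0
    if g(cap) >= 0.0:                    # the capacity limit binds
        return cap
    lo, hi = 0.0, cap
    for _ in range(60):                  # bisection on the sign change of g
        mid = (lo + hi) / 2
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative inputs: p = 10, c = 5, s = 0, mu = 0, sigma = 1, K = 0.25
q_first = order_quantity(0.0, 1.0, 3, 0.25, 10.0, 5.0, 0.0, 0.0, 1.0, 1.0)
q_capped = order_quantity(0.0, 1.2, 2, 0.25, 10.0, 5.0, 0.0, 0.0, 1.0, 0.75)
q_interior = order_quantity(0.5, 0.7, 0, 0.25, 10.0, 5.0, 0.0, 0.0, 1.0, 0.25)
print(q_first, q_capped, round(q_interior, 4))
```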
We now present an example that demonstrates how to calculate order quantities, given the evolution of demand forecasts and a capacity limit. We assume that there are four decision epochs such that \(n = 4\). The order quantities are decided at the beginning of each period at times \(t_0\), \(t_1\), \(t_2\), and \(t_3\). They are denoted by \(Q_0\), \(Q_1\), \(Q_2\), and \(Q_3\), as shown in Fig. 5.10. Periodic capacity is 100 units, and the initial demand forecast is 400 units. We normalize the initial demand forecast to one and change the capacity limit accordingly. Based on the normalized values, the capacity limit becomes \(100/400 = 0.25\). The drift rate is set equal to zero, and the volatility parameter is equal to one. The selling price is $10 per unit, the cost is $5 per unit, and there is no salvage value for unsold items. The initial inventory level is equal to zero. Demand forecasts at each decision epoch are updated as \(D_0 = 1\), \(D_1 = 1.2\), \(D_2 = 1.1\), \(D_3 = 0.9\), \(D_4 = 0.8\). The actual demand is realized at time \(t_4\), so the demand forecast at that time is set equal to the actual demand. Therefore, actual demand is equal to \(D = D_4 = 0.8\) in normalized values and \(0.8 \times 400 = 320\) units in real values. We now solve the problem for each decision epoch in normalized terms and convert them into real values. The optimal order quantity at time \(t_0\) satisfies:

\[
\begin{aligned}
0 &= 10\,[1 - F(Q_0^* + 0.75 \mid D_0 = 1)] + 0 \cdot F(Q_0^* \mid D_0 = 1) - 5, \\
F(Q_0^* + 0.75 \mid D_0 = 1) &= 0.5, \\
Q_0^* &= F^{-1}(0.5 \mid D_0 = 1) - 0.75.
\end{aligned}
\]

If the result of this expression is negative, nothing would be ordered. Otherwise, it gives the optimal order quantity at time \(t_0\). The demand follows a lognormal distribution according to the multiplicative model:

\[
D_4 \mid D_0 \sim \text{log-}\mathcal{N}(-0.5,\ 1).
\]

Then, the inverse for the lognormal distribution with the location and scale parameters of \(-0.5\) and 1 is found as \(F^{-1}(0.5 \mid D_0 = 1) = 0.606\). Then, \(Q_0^* = 0.606 - 0.75 = -0.144\). Therefore, nothing is ordered at \(t_0\) such that \(Q_0 = 0\).

At time \(t_1\), the accumulated inventory level is equal to zero because \(I_1 = Q_0 = 0\). The demand forecast is \(D_1 = 1.2\). The optimal order quantity at \(t_1\) satisfies:

\[
F(Q_1^* + 0.5 \mid D_1 = 1.2) = 0.5.
\]

The demand follows a lognormal distribution such that:

\[
D_4 \mid D_1 \sim \text{log-}\mathcal{N}(-0.19,\ 0.87).
\]
Then, the inverse for the lognormal distribution with the location and scale parameters of \(-0.19\) and \(0.87\) is found as \(F^{-1}(0.5 \mid D_1 = 1.2) = 0.826\). Then, \(Q_1^* = 0.826 - 0.5 = 0.326\). Because \(Q_1^*\) is larger than the capacity limit, \(Q_1\) is set equal to the capacity level per period. Therefore, the order quantity at \(t_1\) is 0.25 in normalized values. In real values, it is \(0.25 \times 400 = 100\) units.

At time \(t_2\), the accumulated inventory level is 0.25 and demand forecast is 1.1 in normalized values. The optimal order quantity at \(t_2\) satisfies:

\[
F(0.25 + Q_2^* + 0.25 \mid D_2 = 1.1) = 0.5.
\]

The demand follows a lognormal distribution such that:

\[
D_4 \mid D_2 \sim \text{log-}\mathcal{N}(-0.15,\ 0.71).
\]

Then, the inverse for the lognormal distribution with the location and scale parameters of \(-0.15\) and \(0.71\) is found as \(F^{-1}(0.5 \mid D_2 = 1.1) = 0.86\). Then, \(Q_2^* = 0.86 - 0.25 - 0.25 = 0.36\). As in the previous period, \(Q_2^*\) is larger than the capacity limit, so \(Q_2\) is set equal to the capacity level per period, which is 0.25 in normalized values. In real values, it is \(0.25 \times 400 = 100\) units.

At time \(t_3\), the accumulated inventory level is \(0.25 + 0.25 = 0.5\) and demand forecast is 0.9 in normalized values. The optimal order quantity at \(t_3\) satisfies:

\[
F(0.5 + Q_3^* \mid D_3 = 0.9) = 0.5.
\]

The demand follows a lognormal distribution such that:

\[
D_4 \mid D_3 \sim \text{log-}\mathcal{N}(-0.23,\ 0.5).
\]
Then, the inverse for the lognormal distribution with the location and scale parameters of \(-0.23\) and \(0.5\) is found as \(F^{-1}(0.5 \mid D_3 = 0.9) = 0.79\). Then, \(Q_3^* = 0.79 - 0.5 = 0.29\). Therefore, the order quantity at \(t_3\) is 0.25 in normalized values because \(Q_3^*\) is larger than the capacity limit and \(Q_3\) is set equal to the capacity level per period. In real values, it is \(0.25 \times 400 = 100\) units.

The total inventory available when the actual demand is observed (i.e. at time \(t_4\)) is \(I_4 = Q_0 + Q_1 + Q_2 + Q_3 = 0 + 100 + 100 + 100 = 300\) units. The actual demand is 320 units. So, there are lost sales of \(320 - 300 = 20\) units due to excess demand. If a single order is placed at \(t_0\), the order quantity should be 242 units according to the newsvendor critical fractile solution. Single ordering would cause excess demand of \(320 - 242 = 78\) units. In other words, multiple ordering helps decrease excess demand by 58 units; hence, an extra profit of \(58 \times (10 - 5) = \$290\) is generated owing to the multiple ordering policy in comparison to the single-ordering policy.

In this example, we show how to calculate order quantities if demand forecasts evolve such that \(D_i = \{1, 1.2, 1.1, 0.9, 0.8\}\) for \(i \in \{0, 1, 2, 3, 4\}\). So, the example indicates the advantages of multiple ordering over the newsvendor model in a limited setting. We now generate 10,000 sample paths (10,000 different evolutions of demand forecasts) and calculate the order quantities at each decision epoch for each sample path. Then, we calculate the realized profit by taking the average of 10,000 instances. The realized profit is compared with the newsvendor profit in which a single order is placed at \(t_0\). Multiple ordering allows decision-makers to update the order quantities over time, as demand forecast updates are observed at successive decision epochs. Therefore, it leads to higher profits in comparison to the newsvendor model. The profit increase in percentage values is reported in Fig. 5.11. The x-axis in the figure represents the volatility parameter, while the y-axis indicates the profit increase. When the volatility value is zero, multiple ordering yields the same profit as the newsvendor policy, because demand can be predicted accurately at the very beginning. As volatility increases, so does the profit increase, because multiple ordering allows decision-makers to adjust order quantities according to the forecast updates. This in turn reduces the supply-demand mismatches, improving the profits compared to the newsvendor model. We again provide an interactive example in the online web application, including the Python codes, so the reader can regenerate Fig. 5.11 for the given input parameters.

Fig. 5.11 Percentage profit increase due to multiple ordering when compared to the newsvendor ordering policy. As the volatility parameter increases, so do the benefits of multiple ordering (x-axis: volatility parameter; y-axis: profit increase in percentages)
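The worked example and the sample-path comparison can be reproduced with a short simulation. This is a minimal sketch rather than the web application's code. It assumes zero salvage value, in which case the approximate condition reduces to ordering up to \(F^{-1}((p - c)/p \mid D_i)\) minus the accumulated inventory and the capacity reserved for later periods, clipped to \([0, K]\). Function names, the seed, and the number of paths are assumptions.

```python
import math
import random
from statistics import NormalDist


def order_path(forecasts, sigma, cap=0.25, p=10.0, c=5.0, mu=0.0):
    """Order quantities Q_0..Q_{n-1} for forecasts D_0..D_{n-1} (horizon normalized to one)."""
    n = len(forecasts)
    z = NormalDist().inv_cdf((p - c) / p)        # critical ratio with zero salvage value
    inventory, orders = 0.0, []
    for i, d_i in enumerate(forecasts):
        tau = 1.0 - i / n                        # time left until demand is realized
        target = d_i * math.exp((mu - sigma**2 / 2) * tau + z * sigma * math.sqrt(tau))
        q = min(cap, max(0.0, target - inventory - (n - i - 1) * cap))
        orders.append(q)
        inventory += q
    return orders


def simulate(sigma, n_paths=10_000, n=4, cap=0.25, p=10.0, c=5.0, mu=0.0, seed=7):
    """Average realized profit of multiple ordering and of a single newsvendor order."""
    rng = random.Random(seed)
    dt = 1.0 / n
    z = NormalDist().inv_cdf((p - c) / p)
    q_single = math.exp((mu - sigma**2 / 2) + z * sigma)   # newsvendor order at t_0, D_0 = 1
    total_multi = total_single = 0.0
    for _ in range(n_paths):
        path = [1.0]
        for _ in range(n):                       # forecasts D_1..D_n; D_n is the actual demand
            shock = rng.gauss(0.0, 1.0)
            path.append(path[-1] * math.exp((mu - sigma**2 / 2) * dt + sigma * math.sqrt(dt) * shock))
        demand = path[-1]
        inventory = sum(order_path(path[:-1], sigma, cap, p, c, mu))
        total_multi += p * min(demand, inventory) - c * inventory
        total_single += p * min(demand, q_single) - c * q_single
    return total_multi / n_paths, total_single / n_paths


example_orders = order_path([1.0, 1.2, 1.1, 0.9], sigma=1.0)
print("Normalized orders:", example_orders)                       # [0.0, 0.25, 0.25, 0.25]
print("Orders in units:", [round(400 * q) for q in example_orders])

multi, single = simulate(sigma=1.0)
print(f"multiple ordering: {multi:.3f}, newsvendor: {single:.3f}, "
      f"increase: {100 * (multi - single) / single:.1f}%")
```

Repeating the `simulate` call over a range of volatility values gives a curve comparable in spirit to Fig. 5.11, although the exact percentages depend on the seed and the number of paths.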
5.5 Multiple Ordering in a Multi-echelon Model
The multi-ordering model in the previous section deals with how to optimize ordering decisions over time for a final product, where the inventory level at the end is the summation of the order quantities in each period. In mathematical terms, \(I_n = Q_0 + Q_1 + \cdots + Q_{n-1}\). We now consider a multi-ordering problem in a multi-echelon setting, such that raw materials go through a series of operations to be transformed into finished goods. For example, apparel manufacturers often procure yarns from suppliers. Yarns are then woven into textiles. Textiles are sewn into clothes. In this setting, the production quantity of clothes is limited by the amount of textile. The amount of textile produced is also limited by the availability of yarn. Let \(Q_0\) denote the procurement amount of yarn, \(Q_1\) the production quantity of textile, and \(Q_2\) the production quantity of clothes. We standardize the units of measurement such that the amount of yarn is measured in terms of the number of clothes it can produce. Likewise, the amount of textile is measured in terms of the number of clothes it can produce. Therefore, the manufacturer brings \(Q_2\) units of clothes to the market during the selling season, which is constrained by the upstream production and procurement quantities such that \(Q_2 \le Q_1 \le Q_0\).

Another important aspect in this setting is that yarn procurement, textile production, and clothing production have different unit costs. Let \(c_0\) denote the procurement cost for yarns per unit, \(c_1\) the incremental production cost for textile per unit, and \(c_2\) the incremental production cost for clothes per unit. The total cost of making one piece of clothing is thus equal to \(c_0 + c_1 + c_2\). As production moves forward, the manufacturer collects information about demand and improves its forecasts. If the manufacturer updates demand forecasts downward over time, \(Q_2\) can be substantially lower than \(Q_1\). In this case, reducing the order quantity for the last operation helps the manufacturer avoid over-utilization of resources in the final operation. This strategy would be beneficial to the manufacturer if the cost of the final operation is higher than those of the first two operations (i.e. \(c_2 \gg c_1\) and \(c_2 \gg c_0\)). In this case, the order quantities \(Q_0\) and \(Q_1\) can be substantially higher, because their costs are relatively low. Demand information collected up to when the order quantity for the final operation is placed would improve the forecast accuracy, so the final decision would be based on more accurate demand forecasts. This in turn reduces the risk of over-utilization of the most expensive resource.

Fig. 5.12 Multi-ordering model in a multi-echelon setting. Order quantity for each decision epoch is constrained by the order quantity in the previous epoch. Cost values are different for each operation

We present the ordering dynamics of the multi-ordering model in a multi-echelon setting in Fig. 5.12. The order quantities are constrained by the order quantity in the previous decision epoch. The amount of final product to be sold in the market is then equal to \(Q_{n-1}\). Therefore, total sales become equal to \(\min(Q_{n-1}, D_n)\). Products move forward along the chain such that raw materials are transformed to subcomponents and then to finished goods through a series of different operations. The cost of each operation is denoted by \(c_i\) for \(i \in \{0, 1, \cdots, n - 1\}\). We also use \(p\) to denote the selling price of the product in the market. Biçer et al. (2022) provide the analytical expression for the optimal order quantity at each epoch. For the last decision epoch, which is at time \(t_{n-1}\), the ordering decision is in the spirit of the newsvendor solution such that:

\[
p\,\mathrm{Prob}(D_n > Q_{n-1}^*) - c_{n-1} = 0.
\]

For notational simplicity, we assume that the salvage value for unsold items is equal to zero. The optimal order quantity is found as:

\[
Q_{n-1}^* = F^{-1}\left(\frac{p - c_{n-1}}{p} \,\Big|\, D_{n-1}\right).
\]

We remark that \(c_{n-1}\) is the incremental cost of processing through the last operation. The unit cost \(c\) is the summation of all incremental costs, so it is higher than \(c_{n-1}\). Therefore, \(Q_{n-1}^*\) is higher than what could be found if the unit cost were used in this expression. The actual order quantity \(Q_{n-1}\) is, however, constrained by the order quantity in the previous decision epoch such that \(Q_{n-1} \le Q_{n-2}\). If \(Q_{n-1}^*\) is larger than \(Q_{n-2}\), \(Q_{n-1}\) is set equal to \(Q_{n-2}\). Otherwise, it is set equal to \(Q_{n-1}^*\). Thus,

\[
Q_{n-1} =
\begin{cases}
Q_{n-2} & \text{if } Q_{n-1}^* > Q_{n-2}, \\
Q_{n-1}^* & \text{otherwise.}
\end{cases}
\]
As we move upstream, the optimal ordering decision at \(t_{n-2}\) satisfies:

\[
p\,\mathrm{Prob}(D_n > Q_{n-2}^*,\ D_{n-1} > \psi_{n-1}(Q_{n-2}^*)) - c_{n-1}\mathrm{Prob}(D_{n-1} > \psi_{n-1}(Q_{n-2}^*)) - c_{n-2} = 0.
\]
The term \(\psi_{n-1}(Q_{n-2}^*)\) gives a threshold value for the demand forecast \(D_{n-1}\). If \(D_{n-1}\) exceeds this threshold value, the \(Q_{n-1}^*\) value calculated at \(t_{n-1}\) becomes higher than \(Q_{n-2}^*\).

The economic interpretation of this expression brings us back to the marginal analysis. Suppose that \((Q_{n-2}^* - 1)\) units are already ordered. We then analyse the marginal value of ordering one additional unit at \(t_{n-2}\). The manufacturer can make revenues of \(\$p\) from one additional unit, if the final demand exceeds \(Q_{n-2}^*\) and the order quantity in the next epoch is set equal to \(Q_{n-2}^*\). If \(Q_{n-1} < Q_{n-2}^*\), it is not possible to generate any revenue from the additional unit, regardless of whether the final demand exceeds \(Q_{n-2}^*\). Therefore, the expected marginal revenue generated from the additional unit is given by:

\[
p\,\mathrm{Prob}(D_n > Q_{n-2}^*,\ D_{n-1} > \psi_{n-1}(Q_{n-2}^*)).
\]

The first condition inside the probability term "\(D_n > Q_{n-2}^*\)" checks if the final demand exceeds \(Q_{n-2}^*\). The second condition "\(D_{n-1} > \psi_{n-1}(Q_{n-2}^*)\)" checks if the order quantity in the next epoch is set equal to \(Q_{n-2}^*\). The expected marginal cost of ordering the additional unit, given that \((Q_{n-2}^* - 1)\) units were already ordered, is found by:

\[
c_{n-1}\mathrm{Prob}(D_{n-1} > \psi_{n-1}(Q_{n-2}^*)) + c_{n-2}.
\]

The manufacturer incurs a cost of \(c_{n-2}\) immediately upon ordering the additional unit. At the next epoch, the manufacturer may reduce the order quantity such that \(Q_{n-1} < Q_{n-2}^*\). In this case, the manufacturer does not incur any marginal cost in the last operation for the additional unit. However, there will be a cost of processing the additional unit, which is equal to \(\$c_{n-1}\), if \(Q_{n-1} = Q_{n-2}^*\). Therefore, incurring the cost of \(c_{n-1}\) due to the additional unit depends on whether the order quantity in the next epoch is set equal to \(Q_{n-2}^*\). The probability term of the marginal cost expression captures this condition. The sum of the two terms hence gives the expected marginal cost. Finally, subtracting the marginal cost from the marginal revenue yields the marginal profit, which becomes equal to zero for the optimal order quantity \(Q_{n-2}^*\).

Fig. 5.13 Example: multi-ordering problem in a two-echelon setting

We now present an example to illustrate how to calculate order quantities in a simple setting with two operations as depicted in Fig. 5.13. The selling price of the product is $10 per unit. The cost of the first operation is equal to $1 per unit, and the cost of the second one is $4 per unit. Therefore, the total cost is $5 per unit. We normalize the initial demand forecast to one and assume that the drift rate is equal to zero, and the volatility is equal to one. The length of the production horizon is normalized to one. The order of \(Q_0\) units is placed at time \(t_0 = 0\). The last order of \(Q_1\) units is placed at time \(t_1 = 0.5\). The optimal value of \(Q_1^*\) is found by the formula:

\[
Q_1^* = F^{-1}(0.6 \mid D_1).
\]
The inverse of the standard normal distribution for 0.6 is \(\Phi^{-1}(0.6) = 0.25\). Then,

\[
Q_1^* = D_1\, e^{(\mu - \sigma^2/2)(t_2 - t_1) + \Phi^{-1}(0.6)\,\sigma\sqrt{t_2 - t_1}} = D_1\, e^{-0.25 + 0.25 \times 0.707} = 0.929 D_1.
\]

With the direct connection between \(D_1\) and \(Q_1^*\) in hand (i.e. \(Q_1^* = 0.929 D_1\)), we can now focus on the ordering problem at \(t_0\). The last expression can be written as follows: \(D_1 = 1.076 Q_1^*\). Thus, \(\psi_1(Q_0) = 1.076 Q_0\). In other words, \(\psi_1(Q_0)\) gives us the threshold value of \(D_1\) that makes \(Q_0\) the optimal order quantity at time \(t_1\). Then, the optimal order quantity at \(t_0\) satisfies:

\[
\begin{aligned}
p\,\mathrm{Prob}(D_2 > Q_0^*,\ D_1 > 1.076 Q_0^*) - c_1 \mathrm{Prob}(D_1 > 1.076 Q_0^*) - c_0 &= 0, \\
10\,\mathrm{Prob}(D_2 > Q_0^*,\ D_1 > 1.076 Q_0^*) - 4\,\mathrm{Prob}(D_1 > 1.076 Q_0^*) - 1 &= 0.
\end{aligned}
\]
To calculate \(Q_0^*\), we simulate random paths based on the multiplicative process and calculate the first and second probabilities empirically from these random paths. Using this approach, we find \(Q_0^* = 0.874\).

We now compare the profit values between the multi-ordering and the newsvendor models in a multi-echelon setting. We vary the volatility parameter between zero and 1.5 and compute the profit values. For each volatility value, we compare the profit based on the multi-ordering policy with that of the newsvendor model. To this end, we simulate random demand paths. We first calculate the newsvendor order quantity assuming that a single order is placed at \(t_0\) such that \(Q_0 = Q_1\) and the unit cost is \(c = c_0 + c_1\). For each realization of \(D_2\), we calculate the revenue "\(p \min(Q_0, D_2)\)" and subtract the cost "\(cQ_0\)" from the revenue to find the profit. Then, the mean value of the profits gives us the expected newsvendor profit. For the multi-ordering model, we first calculate \(Q_0\) as explained above. Then, the \(Q_1\) values for each realization of \(D_1\) are calculated. The profit value for each sample path is found by \(p \min(Q_1, D_2) - c_1 Q_1 - c_0 Q_0\). The average of these profit values gives the expected profit for the multi-ordering model. The difference between the expected profit values is shown in Fig. 5.14 for varying \(\sigma\) values. As the volatility increases, so does the benefit of multi-ordering. This results from the fact that multi-ordering allows the manufacturer to adjust the order quantity at \(t_1\) after improving demand forecasts from \(t_0\) to \(t_1\). The value of the flexibility to adjust the order quantity increases with volatility, as demonstrated in the figure.
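The empirical procedure for \(Q_0^*\) and the profit comparison at \(\sigma = 1\) can be sketched as follows. This is a hedged illustration, not the web application's code: the probabilities are estimated from simulated paths and the root of the marginal profit is located by bisection, which is one possible search method. The seed, sample size, and search bracket are assumptions. The sketch uses the unrounded value \(\Phi^{-1}(0.6) \approx 0.2533\), so its constants differ slightly from the rounded 0.929 and 1.076 in the text.

```python
import math
import random
from statistics import NormalDist

p, c0, c1 = 10.0, 1.0, 4.0
mu, sigma = 0.0, 1.0
t1, t2 = 0.5, 1.0

# Simulate forecast D_1 at t_1 and actual demand D_2 at t_2 (D_0 normalized to one)
rng = random.Random(3)
paths = []
for _ in range(50_000):
    d1 = math.exp((mu - sigma**2 / 2) * t1 + sigma * math.sqrt(t1) * rng.gauss(0.0, 1.0))
    d2 = d1 * math.exp((mu - sigma**2 / 2) * (t2 - t1)
                       + sigma * math.sqrt(t2 - t1) * rng.gauss(0.0, 1.0))
    paths.append((d1, d2))

# Last-epoch rule: Q_1* = factor * D_1, hence psi_1(Q_0) = Q_0 / factor
z_last = NormalDist().inv_cdf((p - c1) / p)
factor = math.exp((mu - sigma**2 / 2) * (t2 - t1) + z_last * sigma * math.sqrt(t2 - t1))
psi = 1.0 / factor


def marginal_profit(q0):
    """Empirical p*Prob(D2 > Q0, D1 > psi*Q0) - c1*Prob(D1 > psi*Q0) - c0."""
    threshold = psi * q0
    proceed = both = 0
    for d1, d2 in paths:
        if d1 > threshold:
            proceed += 1
            if d2 > q0:
                both += 1
    return p * both / len(paths) - c1 * proceed / len(paths) - c0


lo, hi = 0.0, 5.0                      # marginal profit is positive at 0 and negative at 5
for _ in range(30):
    mid = (lo + hi) / 2
    if marginal_profit(mid) > 0.0:
        lo = mid
    else:
        hi = mid
q0_star = (lo + hi) / 2

# Profit comparison on the same paths
c = c0 + c1
z_single = NormalDist().inv_cdf((p - c) / p)
q_single = math.exp((mu - sigma**2 / 2) * t2 + z_single * sigma * math.sqrt(t2))
profit_single = sum(p * min(q_single, d2) - c * q_single for _, d2 in paths) / len(paths)
profit_multi = 0.0
for d1, d2 in paths:
    q1 = min(q0_star, factor * d1)     # Q_1 cannot exceed Q_0
    profit_multi += p * min(q1, d2) - c1 * q1 - c0 * q0_star
profit_multi /= len(paths)

print(f"Q0* = {q0_star:.3f}, factor = {factor:.3f}, psi = {psi:.3f}")
print(f"multi-ordering profit: {profit_multi:.3f}, newsvendor profit: {profit_single:.3f}")
```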
Fig. 5.14 Benefits of multi-ordering policy when compared to the newsvendor approach in a two-echelon model (x-axis: volatility parameter; y-axis: profit increase in percentages)
We provide an interactive example in the online web application that makes it possible to regenerate Fig. 5.14 with the given input parameters. The Python codes are also available in the web application.

In this example, we limit our attention to a simple setting in which there are only two decision epochs. For a general model, the optimal order quantity at any decision epoch satisfies (Biçer et al., 2022):

\[
\begin{aligned}
& p\,\mathrm{Prob}(D_n > Q_i,\ D_{n-1} > \psi_{n-1}(Q_i),\ D_{n-2} > \psi_{n-2}(Q_i),\ \ldots,\ D_{i+1} > \psi_{i+1}(Q_i)) \\
& - c_{n-1}\mathrm{Prob}(D_{n-1} > \psi_{n-1}(Q_i),\ D_{n-2} > \psi_{n-2}(Q_i),\ \ldots,\ D_{i+1} > \psi_{i+1}(Q_i)) \\
& - c_{n-2}\mathrm{Prob}(D_{n-2} > \psi_{n-2}(Q_i),\ \ldots,\ D_{i+1} > \psi_{i+1}(Q_i)) \\
& - \cdots - c_{i+1}\mathrm{Prob}(D_{i+1} > \psi_{i+1}(Q_i)) - c_i = 0,
\end{aligned}
\]

where \(\psi_j(Q_i)\) is equal to the demand forecast at time \(t_j\) that makes the optimal order quantity at \(t_j\) (\(Q_j^*\)) equal to \(Q_i\). We omit the proof of this expression. The reader is referred to Biçer et al. (2022) for the derivation of this expression.
5.6 Chapter Summary
In this chapter, we have covered key responsiveness strategies that would help companies mitigate supply-demand mismatches, which result from high demand uncertainty. The longer the decision lead time, the greater the demand uncertainty. For that reason, supply chain responsiveness strategies aim to alleviate the negative aspects of decision making when the lead time is long. The lead time reduction strategy attempts to cut the lead time, so decision-makers can base their ordering decisions on accurate demand forecasts. The dual-sourcing and quantity flexibility strategies provide decision-makers with some flexibility to adjust their sourcing decisions over time. The multiple ordering strategies improve the flexibility to adjust production decisions over time. One of the challenges of establishing responsiveness is that its value cannot be determined easily. To facilitate the assessment of the value of responsiveness, analytical models should be developed correctly by taking into account the dynamics of demand forecast evolution. In this chapter, we only use the multiplicative demand model and focus on pricing the value of alternative strategies. However, the results given in this chapter can also be extended to the other demand models given in the previous chapter.
5.7 Practice Examples
Example 5.1 A sportswear retailer sells swimming T-shirts during the summer season. The retailer procures the T-shirts from an offshore contract manufacturer. The supply lead time is equal to 6 months. The market demand for the T-shirts is highly uncertain at the time when the procurement order is given. It follows a lognormal distribution with a mean of 1000 units and a coefficient of variation (CV) of one (i.e. demand is forecast 6 months in advance of the realization of market demand, because the supply lead time is equal to 6 months). The retailer also updates demand forecasts according to the multiplicative process such that the CV decreases over time without any change in the mean demand. What are the parameters for the multiplicative demand process? Characterize the probability function of demand if the demand is forecast 3 months in advance of the realization of market demand. Solution 5.1 We first normalize the supply lead time to one by setting tl = 0 and tn = 1. This makes the volatility parameter of the multiplicative demand process equal to the scale parameter of the lognormal distribution. It is given in Example 3.5 that the scale parameter is equal to 0.83 when the CV is equal to one. Therefore, the volatility parameter is also 0.83 (σ = 0.83). The initial demand forecast is 1000 units. We can also normalize the initial demand forecast to one. The normalization does not have any impact on the calculation of the break-even cost premium. But optimal order quantity calculation is affected by the normalization of initial demand forecast. Finally, the drift rate of the multiplicative demand process must be zero, so mean demand does not change over time. Therefore, the multiplicative demand
process to be used in the calculation of the cost premium is characterized by the following: $D_0 = 1$, $\mu = 0$, and $\sigma = 0.83$. If the demand is forecast 3 months in advance of the realization of market demand, the time when the forecast is made is equal to $t_s = 0.5$ in normalized values. The conditional demand distribution is given by:

$$
D_n \mid D_s \sim \text{log-}N\big(\ln(D_s) - \sigma^2/2 \times 0.5,\ \sigma \times \sqrt{0.5}\big) \sim \text{log-}N\big(\ln(D_s) - 0.172,\ 0.587\big).
$$

At time $t_0$, it is not possible to know the actual value of $D_s$. But it can be represented by the following probability distribution by using the properties of the multiplicative demand process:

$$
D_s \mid D_0 \sim \text{log-}N\big(\ln(D_0) - \sigma^2/2 \times 0.5,\ \sigma \times \sqrt{0.5}\big) \sim \text{log-}N(-0.172,\ 0.587).
$$

Thus, $E(D_s \mid D_0) = 1$.

Example 5.2 The retailer given in Example 5.1 sells the T-shirts at a price of $35 per unit. The cost of purchasing from the offshore manufacturer is $15 per unit. Unsold T-shirts at the end of the summer are sold at a salvage value of $10 per unit. Calculate the optimal order quantity and the expected profit when the quantity ordered is equal to its optimal value.

Solution 5.2 The critical fractile is $\alpha^* = (p - c)/(p - s) = (35 - 15)/(35 - 10) = 80\%$, and $z = \Phi^{-1}(\alpha^*) = 0.842$. The demand distribution based on normalized values is given by:

$$
D_n \mid D_l \sim \text{log-}N\big(\ln(D_0) - \sigma^2/2 \times 1,\ \sigma \times \sqrt{1}\big) \sim \text{log-}N(-0.344,\ 0.83).
$$

Then, the optimal order quantity based on normalized values is:

$$
Q^* = e^{-0.344 + 0.83 \times 0.842} = 1.426.
$$

The optimal order quantity in real terms is as follows: $Q^* = 1.426 \times 1000 = 1426$ units. The expected profit in normalized terms is found by:

$$
E_l(\Pi(Q^*) \mid D_0) = (p - s)\,D_0\,\Phi\big(z - \sigma\sqrt{1}\big) = (35 - 10)\,\Phi(0.842 - 0.83) = \$12.62.
$$

Thus, the expected profit in real terms is $E_l(\Pi(Q^*) \mid D_0) = \$12620$.
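The calculations in Solutions 5.2 and 5.3 can be reproduced with a short script. The following is a minimal sketch that uses only the Python standard library; the function name `lognormal_newsvendor` and its arguments are illustrative assumptions and are not taken from the online web application. It works with unrounded values of $\sigma$ and $z$, so its results can differ in the second decimal from the figures obtained with $\sigma = 0.83$ and $z = 0.842$.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_newsvendor(p, c, s, d0=1.0, cv=1.0, horizon=1.0):
    """Return (optimal order quantity, expected profit) in normalized terms.

    Assumes zero drift, so the mean demand stays equal to d0, and a
    lognormal demand whose CV over the full normalized lead time is cv.
    `horizon` is the normalized time between ordering and demand realization.
    """
    std_normal = NormalDist()
    sigma = sqrt(log(1.0 + cv ** 2))          # volatility for a lead time of one
    scale = sigma * sqrt(horizon)             # scale parameter for the remaining horizon
    location = log(d0) - scale ** 2 / 2       # keeps E(D_n) equal to d0
    z = std_normal.inv_cdf((p - c) / (p - s))  # quantile of the critical fractile
    q_star = exp(location + scale * z)
    profit = (p - s) * d0 * std_normal.cdf(z - scale)
    return q_star, profit


# Offshore manufacturer: order placed at t_l = 0 (Example 5.2).
q_off, profit_off = lognormal_newsvendor(p=35, c=15, s=10)
# Domestic manufacturer: order placed at t_s = 0.5 (Example 5.3).
q_dom, profit_dom = lognormal_newsvendor(p=35, c=25, s=10, horizon=0.5)

print(f"offshore: Q* = {1000 * q_off:.0f} units, profit = ${1000 * profit_off:.0f}")
print(f"domestic: Q* = {1000 * q_dom:.0f} units, profit = ${1000 * profit_dom:.0f}")
```

Multiplying the normalized outputs by the initial forecast of 1000 units converts them to real terms, as done in the solutions.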
Example 5.3 A domestic manufacturer offers to supply the retailer with the T-shirts at a purchasing cost of $25 per unit with a supply lead time of 3 months. Calculate the projected optimal order quantity and the expected profit if the T-shirts are purchased from the domestic supplier.

Solution 5.3 The critical fractile for this case is $\alpha^* = (p - c)/(p - s) = (35 - 25)/(35 - 10) = 40\%$, and $z = \Phi^{-1}(\alpha^*) = -0.253$. The demand distribution based on normalized values is given by:

$$
D_n \mid E(D_s) \sim \text{log-}N\big(\ln(E(D_s)) - \sigma^2/2 \times 0.5,\ \sigma \times \sqrt{0.5}\big) \sim \text{log-}N(-0.172,\ 0.587).
$$

It is found in Example 5.1 that $E(D_s \mid D_0) = D_0 = 1$. Then, the optimal order quantity based on normalized values is:

$$
Q^* = e^{-0.172 - 0.253 \times 0.587} = 0.726.
$$

The optimal order quantity in real terms is $Q^* = 0.726 \times 1000 = 726$ units. The expected profit in normalized terms is found by:

$$
E_s(\Pi(Q^*) \mid D_0) = (p - s)\,D_0\,\Phi\big(z - \sigma\sqrt{0.5}\big) = (35 - 10)\,\Phi(-0.253 - 0.587) = \$5.01.
$$

Thus, the expected profit in real terms is $E_s(\Pi(Q^*) \mid D_0) = \$5010$.

Example 5.4 After comparing the expected profit values in the previous two examples, the retailer is better off ordering the T-shirts from the offshore manufacturer. Thus, the retailer decides to decline the offer of the domestic manufacturer. Then, the domestic manufacturer updates its offer. What must be the new purchasing cost offered by the manufacturer to induce the retailer to switch from the offshore to the domestic manufacturer?

Solution 5.4 The break-even cost value is calculated by the formula:

$$
\begin{aligned}
c_s^* &= p - (p - s)\,\Phi\Big(\Phi^{-1}\big((p - c_l)/(p - s)\big) - \sigma\sqrt{t_n - t_l} + \sigma\sqrt{t_n - t_s}\Big), \\
&= 35 - (35 - 10)\,\Phi\big(0.842 - 0.83 \times \sqrt{1} + 0.83 \times \sqrt{0.5}\big) = \$16.87.
\end{aligned}
$$
If the domestic manufacturer offers to supply the T-shirts at a purchasing cost of $16 per unit, the retailer would be better off switching from the offshore to the domestic manufacturer. If the cost offered by the domestic manufacturer is $17 or more, it is more appealing to procure the T-shirts from the offshore manufacturer. The percentage break-even cost premium is then equal to (16.87 − 15)/15, which is approximately 12.5%.
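The break-even cost formula of Solution 5.4 can be checked numerically. The sketch below is a hedged illustration: the function name `break_even_cost` and its parameter names are assumptions introduced here, and the script uses the unrounded volatility $\sigma = \sqrt{\ln 2}$ implied by a CV of one.

```python
from math import log, sqrt
from statistics import NormalDist


def break_even_cost(p, s, c_long, sigma, t_long=0.0, t_short=0.5, t_n=1.0):
    """Unit cost of the short-lead-time supplier that equalizes expected profits.

    Implements c_s* = p - (p - s) * Phi(Phi^{-1}((p - c_l)/(p - s))
                                        - sigma*sqrt(t_n - t_l) + sigma*sqrt(t_n - t_s)).
    """
    std_normal = NormalDist()
    z_long = std_normal.inv_cdf((p - c_long) / (p - s))
    arg = z_long - sigma * sqrt(t_n - t_long) + sigma * sqrt(t_n - t_short)
    return p - (p - s) * std_normal.cdf(arg)


sigma = sqrt(log(2.0))  # CV of one over the full normalized lead time
c_star = break_even_cost(p=35, s=10, c_long=15, sigma=sigma)
premium = (c_star - 15) / 15

print(f"break-even cost: ${c_star:.2f}, premium: {100 * premium:.1f}%")
```

Any domestic price below the returned value makes the domestic option more profitable in expectation, which matches the comparison of the $16 and $17 offers above.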
Example 5.5 The domestic manufacturer can reserve reactive capacity at a cost of $4 per unit, such that the retailer has to determine the reserved capacity at time $t_0$. This provides the retailer with the flexibility to determine the order quantity (up to the reserved capacity) from the domestic manufacturer at time $t_n$. The manufacturer charges the retailer an additional $14 per unit if the reserved capacity is utilized at time $t_n$. If the retailer follows a dual-sourcing policy such that the order quantity from the offshore manufacturer ($Q_1$) and the reactive capacity reserved by the domestic manufacturer ($K$) are determined at time $t_0$, what would be their optimal values?

Solution 5.5 In normalized terms, the demand distribution is:

$$
D_n \mid D_l \sim \text{log-}N(-0.344,\ 0.83),
$$

which is shown in Example 5.2. We also have the following price and cost parameters: $p = \$35$, $c_1 = \$15$, $s = \$10$, $c_2 = \$14$, and $c_k = \$4$ per unit. The optimal value of the order quantity from the offshore manufacturer is found by:

$$
Q_1^* = e^{-0.344 + 0.83 \times z_1},
$$

where

$$
z_1 = \Phi^{-1}\big((c_2 + c_k - c_1)/(c_2 - s)\big) = \Phi^{-1}\big((14 + 4 - 15)/(14 - 10)\big) = 0.674.
$$

Hence, $Q_1^* = 1.24$ in normalized values. The optimal value of the reserved capacity is:

$$
K^* = e^{-0.344 + 0.83 \times z_2} - e^{-0.344 + 0.83 \times z_1},
$$

where

$$
z_2 = \Phi^{-1}\big((p - c_2 - c_k)/(p - c_2)\big) = \Phi^{-1}\big((35 - 14 - 4)/(35 - 14)\big) = 0.876.
$$

Then, $K^* = 0.226$ in normalized values. In real values, $Q_1^* = 1240$ and $K^* = 226$ units. It is shown in Example 5.2 that the optimal order quantity is equal to 1426 units for the single sourcing policy from the offshore manufacturer. The dual-sourcing policy helps the retailer to reduce the order quantity from the offshore manufacturer by 1426 − 1240 = 186 units.

Example 5.6 The offshore manufacturer offers the retailer flexibility to update the initial order quantity by 10% 3 months after the order is placed. What would be the new ordering policy if the retailer only sources the T-shirts from the offshore manufacturer?
Solution 5.6 In normalized terms, the lower threshold of the demand forecast at time $t_s = 0.5$ (i.e. $D_{s1}$) for an initial order quantity $Q$ is given by:

$$
D_{s1} = Q \times 0.9 \times e^{0.83^2/2 \times 0.5 - 0.842 \times 0.83\sqrt{0.5}} = 0.652Q.
$$

Thus, the updated order quantity at time $t_s = 0.5$ is equal to $Q_f = 0.9 \times Q$ when the demand forecast at time $t_s$ turns out to be less than or equal to $D_{s1}$, that is, when $D_s \le D_{s1} = 0.652Q$. The upper threshold of the demand forecast at time $t_s = 0.5$ (i.e. $D_{s2}$) for an initial order quantity $Q$ is found by:

$$
D_{s2} = Q \times 1.1 \times e^{0.83^2/2 \times 0.5 - 0.842 \times 0.83\sqrt{0.5}} = 0.797Q.
$$

Thus, the updated order quantity at time $t_s = 0.5$ is equal to $Q_f = 1.1 \times Q$ when the demand forecast at time $t_s$ turns out to be more than or equal to $D_{s2}$, that is, when $D_s \ge D_{s2} = 0.797Q$. If the demand forecast at time $t_s = 0.5$ is between $D_{s1}$ and $D_{s2}$, the updated order quantity at time $t_s$ is found by:

$$
Q_f = D_s\, e^{-0.83^2/2 \times 0.5 + 0.842 \times 0.83\sqrt{0.5}} = 1.38 \times D_s.
$$

The retailer has the information of $D_0 = 1$, $\mu = 0$, and $\sigma = 0.83$ at time $t_0$. The actual value of $D_s$ is not known at time $t_0$. Nevertheless, the retailer can estimate the expected value of $D_s$ such that $E(D_s \mid D_0) = 1$. Therefore, the projected final order quantity is $Q_f \mid D_0 = 1.38 \times E(D_s \mid D_0) = 1.38$. If the initial order quantity is set equal to $Q = Q_f \mid D_0 = 1.38$, it provides the retailer with the maximum flexibility to adjust the initial order quantity. Then, the ordering policy of the retailer is:

$$
Q_f =
\begin{cases}
1.242 & \text{if } D_s < 0.9, \\
1.38 \times D_s & \text{if } 0.9 \le D_s \le 1.1, \\
1.518 & \text{if } 1.1 < D_s.
\end{cases}
$$

When $0.9 \le D_s \le 1.1$, the retailer can determine the final order quantity based on $D_s$. In that case, the final order quantity does not directly depend on the initial order quantity, and the quantity flexibility provides the retailer with the full benefits of ordering from the domestic manufacturer. It is given in Example 5.1 that $D_s \mid D_0 \sim \text{log-}N(-0.172,\ 0.587)$. Then, the probability that the demand forecast at $t_s$ lies between 0.9 and 1.1 is found by:

$$
\mathrm{Prob}(0.9 \le D_s \le 1.1 \mid D_0) = \Phi\big((\ln(1.1) + 0.172)/0.587\big) - \Phi\big((\ln(0.9) + 0.172)/0.587\big) = 0.13.
$$
Thus, 10% flexibility to adjust the initial order quantity gives the full benefits of ordering from the domestic manufacturer with the probability of 13%.

Example 5.7 We now consider a different setting in which a manufacturer sells a seasonal product to an uncertain market. The manufacturer starts the production 4 months in advance of the realization of the market demand. There is no initial inventory in the system. The monthly production capacity is 400 units. The demand forecast ($D_0$) and the CV are 1000 units and one at time $t_0 = 0$, respectively. The market price of the product is $100 per unit, and the cost of production is $20. The salvage value of unsold products is zero at the end of the selling season. Suppose that the manufacturer updates the forecasts for the following months such that $D_1 = 1100$, $D_2 = 1200$, and $D_3 = 1300$ units. What are the optimal monthly production quantities of the manufacturer? If the actual demand that realizes 4 months after the manufacturer starts the production is 1600 units, what is the value of the multiple-ordering policy compared to the single-ordering (newsvendor) model?

Solution 5.7 In normalized terms, the demand parameters are $D_0 = 1$, $\mu = 0$, and $\sigma = 0.83$. The production capacity for each month is $K = 0.4$. The optimal order quantity for each month is found by:

$$
\begin{aligned}
0 &= p\big[1 - F(I_i + Q_i^* + (n - i - 1)K \mid D_i)\big] + sF(I_i + Q_i^* \mid D_i) - c, \\
\frac{p - c}{p} &= F(I_i + Q_i^* + (n - i - 1)K \mid D_i), \\
Q_i^* &= e^{\ln(D_i) - 0.83^2/2 \times (1 - t_i) + z \times 0.83\sqrt{1 - t_i}} - I_i - (3 - i) \times 0.4,
\end{aligned}
$$

where the second line uses $s = 0$ and $z = \Phi^{-1}((p - c)/p) = \Phi^{-1}(0.8) = 0.842$. The optimal production quantity for the first month is:

$$
Q_0^* = e^{-0.83^2/2 + 0.842 \times 0.83} - 1.2 = 0.225, \qquad I_1 = I_0 + Q_0^* = 0.225.
$$

The optimal production quantity for the second month is:

$$
Q_1^* = e^{\ln(1.1) - 0.83^2/2 \times (1 - 0.25) + 0.842 \times 0.83\sqrt{1 - 0.25}} - 0.225 - 0.8 = 0.531.
$$

Given that the monthly production capacity is $K = 0.4$, we set $Q_1^* = 0.4$. Then,

$$
Q_1^* = 0.4, \qquad I_2 = I_1 + Q_1^* = 0.625.
$$

For the third month, the formula gives

$$
e^{\ln(1.2) - 0.83^2/2 \times (1 - 0.5) + 0.842 \times 0.83\sqrt{1 - 0.5}} - 0.625 - 0.4 = 0.631,
$$

which also exceeds the monthly capacity, so that

$$
Q_2^* = 0.4, \qquad I_3 = I_2 + Q_2^* = 1.025.
$$
For the fourth month, the formula gives

$$
e^{\ln(1.3) - 0.83^2/2 \times (1 - 0.75) + 0.842 \times 0.83\sqrt{1 - 0.75}} - 1.025 = 0.667,
$$

which again exceeds the monthly capacity, so that

$$
Q_3^* = 0.4, \qquad I_4 = I_3 + Q_3^* = 1.425.
$$

With the actual demand being equal to 1.6, the manufacturer faces an inventory shortage of 0.175 in normalized terms (175 units in actual values). The newsvendor order quantity determined at time $t_0$ is equal to:

$$
Q^* = e^{-0.83^2/2 + 0.842 \times 0.83} = 1.425.
$$

With the actual demand being equal to 1.6, the newsvendor policy also leads to the same amount of inventory shortage (i.e. 0.175) as the multiple-ordering policy.

Example 5.8 For the manufacturer given in Example 5.7, suppose that $D_1 = 900$, $D_2 = 800$, and $D_3 = 700$ units. What are the optimal monthly production quantities of the manufacturer? If the actual demand that realizes four months after the manufacturer starts the production is 700 units, what is the value of the multiple-ordering policy compared to the single-ordering (newsvendor) model?

Solution 5.8 Following up from the solution of Example 5.7, we calculate the optimal order quantities as follows:

$$
\begin{aligned}
Q_0^* &= e^{-0.83^2/2 + 0.842 \times 0.83} - 1.2 = 0.225, & I_1 &= I_0 + Q_0^* = 0.225, \\
Q_1^* &= e^{\ln(0.9) - 0.83^2/2 \times (1 - 0.25) + 0.842 \times 0.83\sqrt{1 - 0.25}} - 0.225 - 0.8 = 0.248, & I_2 &= I_1 + Q_1^* = 0.473, \\
Q_2^* &= e^{\ln(0.8) - 0.83^2/2 \times (1 - 0.5) + 0.842 \times 0.83\sqrt{1 - 0.5}} - 0.473 - 0.4 = 0.231, & I_3 &= I_2 + Q_2^* = 0.704, \\
Q_3^* &= e^{\ln(0.7) - 0.83^2/2 \times (1 - 0.75) + 0.842 \times 0.83\sqrt{1 - 0.75}} - 0.704 = 0.206, & I_4 &= I_3 + Q_3^* = 0.91.
\end{aligned}
$$

With the actual demand being equal to 0.7, the manufacturer faces excess inventory of 0.21 in normalized terms (210 units in actual values). The newsvendor order quantity determined at time $t_0$ is equal to the following:

$$
Q^* = e^{-0.83^2/2 + 0.842 \times 0.83} = 1.425,
$$

which is the same as what is found in the solution of Example 5.7. The newsvendor policy leads to excess inventory of 0.725 (725 units in actual values). Therefore, the multiple-ordering policy helps the manufacturer reduce the excess inventory by 515 units.
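The month-by-month calculations of Solutions 5.7 and 5.8 follow the same recursion, which can be sketched in a few lines. The function name `production_plan` is an assumption made for illustration, as is the floor at zero for months in which the formula would return a negative quantity. The script assumes a zero salvage value, equally spaced decision epochs, and the unrounded $\sigma = \sqrt{\ln 2}$.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def production_plan(forecasts, p, c, sigma, capacity):
    """Capacity-constrained multiple-ordering plan with zero salvage value.

    forecasts[i] is the normalized demand forecast D_i at the start of month i.
    Returns the list of monthly production quantities and the final inventory.
    """
    n = len(forecasts)
    z = NormalDist().inv_cdf((p - c) / p)
    inventory, plan = 0.0, []
    for i, d_i in enumerate(forecasts):
        remaining = 1.0 - i / n  # normalized time until demand is realized
        target = exp(log(d_i) - sigma ** 2 / 2 * remaining
                     + z * sigma * sqrt(remaining))
        # Leave room for what can still be produced in the later months.
        desired = target - inventory - (n - i - 1) * capacity
        q_i = min(max(desired, 0.0), capacity)  # assumption: no negative production
        plan.append(q_i)
        inventory += q_i
    return plan, inventory


sigma = sqrt(log(2.0))
plan_up, total_up = production_plan([1.0, 1.1, 1.2, 1.3], p=100, c=20,
                                    sigma=sigma, capacity=0.4)
plan_down, total_down = production_plan([1.0, 0.9, 0.8, 0.7], p=100, c=20,
                                        sigma=sigma, capacity=0.4)

print([round(q, 3) for q in plan_up], round(total_up, 3))
print([round(q, 3) for q in plan_down], round(total_down, 3))
```

With upward forecast revisions the capacity binds in every month after the first, so the total production coincides with the newsvendor quantity; with downward revisions the later quantities shrink and the final inventory is lower.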
5.8 Exercises
1. Suppose that the CV of demand is equal to two for the sportswear retailer given in Examples 5.1 and 5.2. Assuming that all other parameters have the same values as in Examples 5.1 and 5.2, what are the optimal order quantity from the offshore manufacturer and the expected profit when the order quantity is set equal to its optimal value?
2. Suppose that the CV of demand is still equal to two for the retailer, while other parameters preserve the same values as in Examples 5.1, 5.2, and 5.3. Calculate the optimal order quantity from the domestic manufacturer and the expected profit when the order quantity is set equal to its optimal value.
3. For Example 5.7, suppose that the manufacturer updates the forecasts for the following months, such that $D_1 = 1000$, $D_2 = 1050$, and $D_3 = 1000$ units. Calculate the optimal monthly production quantities of the manufacturer. What is the value of the multiple-ordering policy compared to the single-ordering policy if the actual demand turns out to be 1050 units?
4. An inventory analyst working for a hedge fund analyses the results of Examples 5.7 and 5.8. She later concludes that the benefits of the multiple-ordering policy over the single-ordering policy would be relatively low when demand forecasts evolve upward and the critical fractile is high; or the other way around, the benefits of the multiple-ordering policy would be relatively high when demand forecasts evolve downward and the critical fractile is low. Discuss whether the statements of the analyst are correct.
5.9 Appendix to Chap. 5

5.9.1 Derivation of $Q_1^*$ and $K^*$
The expected value of total profit is:

$$
\begin{aligned}
E(\Pi(Q_1, K, Q_2)) &= pE[\min(D, Q_1 + K)] + sE[\max(Q_1 - D, 0)] - c_1 Q_1 \\
&\quad - c_k K - c_2 E[\max(\min(D, Q_1 + K), Q_1) - Q_1], \\
E(\Pi(Q_1, K)) &= p\int_0^{Q_1 + K} x f(x)\,dx + p(Q_1 + K)\big[1 - F(Q_1 + K)\big] + s\int_0^{Q_1} (Q_1 - x) f(x)\,dx \\
&\quad - c_1 Q_1 - c_k K - c_2 \int_{Q_1}^{Q_1 + K} (x - Q_1) f(x)\,dx - c_2 K\big[1 - F(Q_1 + K)\big].
\end{aligned}
$$

Taking the derivative of the last expression with respect to $Q_1$ and $K$ yields:

$$
\begin{aligned}
\frac{\partial E(\Pi(Q_1, K))}{\partial Q_1} &= (p - c_1) - (p - c_2)F(Q_1 + K) - (c_2 - s)F(Q_1), \\
\frac{\partial E(\Pi(Q_1, K))}{\partial K} &= (p - c_2 - c_k) - (p - c_2)F(Q_1 + K).
\end{aligned}
$$

The Hessian matrix of $E(\Pi(Q_1, K))$ is negative definite for $p > c_k + c_2 > c_1 > c_k > s$. Therefore, the optimal values of $Q_1$ and $K$ can be found by setting $\partial E(\Pi(Q_1, K))/\partial Q_1 = 0$ and $\partial E(\Pi(Q_1, K))/\partial K = 0$. It follows from $\partial E(\Pi(Q_1, K))/\partial K = 0$ that:

$$
F(Q_1^* + K^*) = \frac{p - c_2 - c_k}{p - c_2}.
$$

Plugging this expression into $\partial E(\Pi(Q_1, K))/\partial Q_1$ and setting it equal to zero yields:

$$
\begin{aligned}
(p - c_1) - (p - c_2 - c_k) - (c_2 - s)F(Q_1^*) &= 0, \\
F(Q_1^*) &= \frac{c_2 + c_k - c_1}{c_2 - s}, \\
Q_1^* &= F^{-1}\left(\frac{c_2 + c_k - c_1}{c_2 - s}\right).
\end{aligned}
$$

Combining this expression with $F(Q_1^* + K^*)$ yields:

$$
K^* = F^{-1}\left(\frac{p - c_2 - c_k}{p - c_2}\right) - F^{-1}\left(\frac{c_2 + c_k - c_1}{c_2 - s}\right).
$$
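For lognormal demand, the inverse distribution function in the two closed-form expressions above is $F^{-1}(\alpha) = e^{m + v\,\Phi^{-1}(\alpha)}$, where $m$ and $v$ denote the location and scale parameters. The following sketch applies the expressions to the parameters of Example 5.5. The function name `dual_sourcing` and the symbols `location` and `scale` are illustrative assumptions, and the unrounded $\sigma = \sqrt{\ln 2}$ is used.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def dual_sourcing(p, s, c1, c2, ck, location, scale):
    """Return (Q1*, K*) for lognormal demand with the given location and scale."""
    std_normal = NormalDist()

    def quantile(prob):
        # Inverse CDF of the lognormal distribution.
        return exp(location + scale * std_normal.inv_cdf(prob))

    q1 = quantile((c2 + ck - c1) / (c2 - s))      # F^{-1} for the offshore order
    total = quantile((p - c2 - ck) / (p - c2))    # F^{-1} for Q1* + K*
    return q1, total - q1


sigma = sqrt(log(2.0))
q1, k = dual_sourcing(p=35, s=10, c1=15, c2=14, ck=4,
                      location=-sigma ** 2 / 2, scale=sigma)

print(f"Q1* = {1000 * q1:.0f} units, K* = {1000 * k:.0f} units")
```

The reserved capacity is the gap between two quantiles of the same distribution, so it is positive only when the second critical ratio exceeds the first.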
References

Allon, G., & Van Mieghem, J. A. (2010). Global dual sourcing: Tailored base-surge allocation to near- and offshore production. Management Science, 56(1), 110–124.
Biçer, I., Lücker, F., & Boyaci, T. (2022). Beyond retail stores: Managing product proliferation along the supply chain. Production and Operations Management, 31(3), 1135–1156.
Biçer, I. (2015). Dual sourcing under heavy-tailed demand: An extreme value theory approach. International Journal of Production Research, 53(16), 4979–4992.
Biçer, I., Hagspiel, V., & De Treville, S. (2018). Valuing supply-chain responsiveness under demand jumps. Journal of Operations Management, 61(1), 46–67.
Biçer, I., & Seifert, R. W. (2017). Optimal dynamic order scheduling under capacity constraints given demand-forecast evolution. Production and Operations Management, 26(12), 2266–2286.
Callioni, G., de Montgros, X., Slagmulder, R., Van Wassenhove, L. N., & Wright, L. (2005). Inventory-driven costs. Harvard Business Review, 83(3), 135–141.
Cattani, K. D., Dahan, E., & Schmidt, G. M. (2008). Tailored capacity: Speculative and reactive fabrication of fashion goods. International Journal of Production Economics, 114(2), 416–430.
De Treville, S., Bicer, I., Chavez-Demoulin, V., Hagspiel, V., Schürhoff, N., Tasserit, C., & Wager, S. (2014). Valuing lead time. Journal of Operations Management, 32(6), 337–346.
Gustafson, K. (2015). Retailers are losing $1.75 trillion over this. CNBC. https://www.cnbc.com/2015/11/30/retailers-are-losing-nearly-2-trillion-over-this.html
Jeong, E.-Y., & Strumpf, D. (2021). Why the chip shortage is so hard to overcome. Wall Street Journal. https://www.wsj.com/articles/why-the-chip-shortage-is-so-hard-to-overcome-11618844905
Suri, R. (1998). Quick response manufacturing: A companywide approach to reducing lead times. CRC Press.
6 Managing Product Variety
Keywords
Product portfolio selection · Resource allocation · Capacity management · Operational excellence
Customers always demand products or services that are tailored to their unique tastes and needs (Brynjolfsson et al., 2011). Knowing this, companies aim to fulfil their customers’ demand by increasing product variety. Expanding product offerings helps organizations increase sales and stay competitive in the market. However, it also increases the complexity of their supply chains, leading to higher operational costs (Randall & Ulrich, 2001). Operational complexity is probably the most important factor that determines the limitations of product variety. In the retail industry, product variety is often reduced when retailers experience increasing complexity in managing their operations. For example, during the COVID-19 pandemic, product variety declined in retail stores for this reason (Gasparro et al., 2020). In the manufacturing industry, managing product proliferation in supply chains is considered one of the critical factors that shapes the boundaries of product variety (Biçer et al., 2022), because increasing product variety does not translate into an increase in sales at the same level. According to an industry survey covering a time period between 2010 and 2017 (E2Open, 2018), product variety increased more than twofold on average among the manufacturers surveyed, whereas the sales volume only increased by 15%. For that reason, manufacturers that manage large portfolios must develop new processes and redesign operations in order to achieve cost efficiencies.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_6

Philips, a Dutch multinational corporation, suffered from too much product variety in the early 2000s. Although the company was then one of the most innovative companies in Europe, its revenues and market value dropped substantially in the first decade of the millennium due to the operational costs of its excessive product variety outweighing the benefits (Mocker & Ross, 2017). Philips later sold off some of its business units and reduced its product variety in the remaining units to stay competitive in the market. Owing to these efforts, the company’s share price and profitability have increased since 2011 (Mocker & Ross, 2017). Lego Group, a Danish toy company, also suffered from product variety in the first decade of the millennium (Mocker & Ross, 2017). Between 1997 and 2004, the company increased its product variety excessively, such that the costs of the resulting operational complexity outweighed the benefits. As a result, Lego found itself on the edge of bankruptcy in 2004, after which it underwent an operational turnaround that included standardizing its products in order to improve its profitability and growth (Mocker & Ross, 2017).

In Fig. 6.1, we present three steps of product portfolio management. Once product design teams and R&D departments develop new products in a company, senior executives determine which products should be marketed in different categories, which is referred to as the product selection problem. We distinguish the product selection problem from the product rollover, which deals with when to upgrade an existing product. Popular products have high selling volumes, and decision-makers sometimes upgrade those products as technology advances. Product rollover strategies determine the optimal time to market new versions of popular products, which helps reduce the frictions that arise around when to upgrade those products. These strategies are often shaped by competitive forces in the market. For example, Apple and Samsung regularly launch their new smartphone models each fall, so their customers come to expect new models every fall, and they sometimes postpone their purchases until after both of the new models have been launched. The product selection problem mainly focuses on niche items, which may eventually become popular products.

Fig. 6.1 Product portfolio management funnel (stages, from top to bottom: Product design and R&D; Product selection; Resource allocation and capacity management; Operations excellence; Demand fulfillment)
For niche items, decision-makers often have an estimate of operating margin with some level of uncertainty. They also set a target operating margin depending on shareholders’ expectations. Based on these inputs, decision-makers allocate capital budget to the products that will be launched in the market. In this chapter, we will present the classic portfolio theory by Markowitz
(1952), which applies mean-variance analysis to optimize the selection of products in such cases. After the products in each product category are selected, production and design teams work together to determine the best way to produce the new items. Some production resources are used to manufacture a variety of products. The overall capacity of such shared resources must be utilized effectively to guarantee that the total mismatch between supply and demand for all products is minimized. In a single-echelon model, the allocation of shared resources is optimized by determining the optimal time to start production of each product (Biçer & Seifert, 2017). In a multi-echelon model, product proliferation occurs sequentially in supply chains. Production starts with a limited number of raw materials. Then, a variety of products are manufactured as the production moves downstream. In a multi-echelon setting, decision-makers are exposed to high demand uncertainty for upstream ordering decisions. As production moves forward, they collect valuable information from the market to improve the demand forecasts of each item, leading to the partial resolution of demand uncertainty. For downstream decisions, there is less demand uncertainty. However, decision-makers must make ordering decisions for more items or components as the production moves forward (Biçer et al., 2022). The single- and multiple echelon models will be presented in this chapter. Companies continuously look for some operational improvements to reduce costs and increase fill rates. Such efforts help them establish operational excellence. In a multi-product setting, decision-makers can redesign processes to delay the point of product differentiation, so upstream decisions are made based on consolidated demand. This strategy alleviates the negative impact of demand uncertainty for upstream decisions. Reducing the lead time also makes it possible to delay the point of differentiation and the other ordering decisions. 
In addition to the attempts to improve supply chain responsiveness, companies often adopt cost reduction methods to increase their operating margins. Procurement activities may be bundled to take advantage of economies of scale. This makes it possible to reduce the unit cost of the product. In some cases, redesigning the processes to postpone high-cost operations offers the benefit of cost reduction, because it reduces the risk of overutilizing expensive resources (Biçer & Seifert, 2017; Biçer et al., 2022). In this chapter, we will also present how to combine different supply chain responsiveness and cost reduction strategies to achieve operational excellence.
6.1 Mean-Variance Analysis for Product Selection
Mean-variance analysis is commonly used in finance to optimize the investment portfolio of financial products (Markowitz, 1952). Given the expected return on investment and uncertainty levels of stock prices, the output of the mean-variance analysis is the optimal portfolio decomposition that generates a certain amount of return on investment with a minimum risk level. The product selection problem of a retailer or a manufacturer can be conceptualized as a portfolio optimization problem. Once a product is selected to be included in
a product category, a capital budget is allocated to the selected product. The size of a product team depends on the capital allocated to the product. Senior executives make such capital budgeting decisions by analysing the expected return on the capital invested. The output of the portfolio optimization model is the amount of capital that should be allocated to each product. If the optimal amount of capital is equal to zero for a product, the product should be excluded from the category. Otherwise, the product is kept in the product category, and the marketing and sales efforts for it should be constrained by the allocated amount of the capital.

Suppose a manufacturer has a capital budget of $10 million reserved for launching new products. There are three new product candidates. After an exhaustive market analysis, the expected return on the capital invested in each product is determined by decision-makers. The expected return percentages are:

$$
r = \begin{pmatrix} 7.61\% \\ 4.99\% \\ 9.49\% \end{pmatrix}.
$$

The market analysis also reveals that the products are considered complementary by consumers, such that they can be marketed in bundles. Therefore, a positive correlation among their demand values is expected. The uncertainty of expected returns is represented by a covariance matrix:

$$
\Sigma = \begin{pmatrix}
0.007870 & 0.003085 & 0.000989 \\
0.003085 & 0.003696 & 0.000129 \\
0.000989 & 0.000129 & 0.009587
\end{pmatrix}.
$$

The diagonal elements of this matrix correspond to the variance of the return on invested capital. For example, the expected return is equal to 4.99% for the second product, and its variance is 0.003696 (i.e. the second diagonal element of the covariance matrix). The non-diagonal elements correspond to the covariance values of the returns between two different products. For example, the covariance of the returns between the second and third products is equal to 0.000129. Because the matrix is symmetric, the covariance values between two different products appear in two places.

We use $w$ to denote the weight vector such that:

$$
w = \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix},
$$

where $w_i$ is the percentage of capital allocated to product $i \in \{1, 2, 3\}$. Because the total capital is equal to $10 million, the amount allocated to product $i$ is then $w_i \times \$10$ million.
Senior executives target an expected return of $600K on the invested capital, which amounts to 6% of total capital. Then, the product selection problem can be written as follows:

$$
\text{Minimize: } \frac{1}{2} w^T \Sigma w,
$$

$$
\text{such that: } w^T r = 0.06, \qquad w^T \mathbf{1} = 1.
$$

The solution of this problem gives the optimal weight vector $w$. The objective function is the minimization of total variance, which is subject to two constraints. The first constraint guarantees that the target expected return is satisfied based on the capital allocation policy. The second constraint guarantees that all capital is allocated to the three products, so the sum of the weights is equal to one. In the second constraint, $\mathbf{1}$ denotes a vector of ones such as:

$$
\mathbf{1} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
$$

The problem can be written in the Lagrangian form as follows:

$$
J(w, \lambda_1, \lambda_2) = \frac{1}{2} w^T \Sigma w + \lambda_1 (0.06 - w^T r) + \lambda_2 (1 - w^T \mathbf{1}),
$$

where $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers. The optimal weight vector is found as:

$$
w = \Sigma^{-1} (\lambda_1 r + \lambda_2 \mathbf{1})
$$

with

$$
\lambda_1 = \frac{0.06 X_3 - X_1}{X_2 X_3 - X_1^2}, \qquad \lambda_2 = \frac{X_2 - 0.06 X_1}{X_2 X_3 - X_1^2},
$$

and $X_1 = r^T \Sigma^{-1} \mathbf{1}$, $X_2 = r^T \Sigma^{-1} r$, and $X_3 = \mathbf{1}^T \Sigma^{-1} \mathbf{1}$. The formal derivations of these formulas are given in the appendix at the end of this chapter. Applying these equations to our product selection problem, the optimal weight vector is obtained as follows:

$$
w \approx \begin{pmatrix} 0.00 \\ 0.78 \\ 0.22 \end{pmatrix}.
$$
Fig. 6.2 Capital allocation structure for the three products with respect to the target return percentage (x-axis: target return, 0.060 to 0.080; y-axis: allocated capital in millions, 0 to 8; series: Product #1, Product #2, Product #3)
Therefore, total capital should be allocated to the second and third products, and the first product should be excluded from the product portfolio. In particular, the capital budget allocated to the last two products should be $7.8 and $2.2 million, respectively. We remark that we do not restrict the weight to being non-negative while formulating the product selection problem. Therefore, the weights can be negative. In finance, a negative weight indicates short-selling of the asset. Following the principles of Lagrange optimization in our setting, the products with negative weights should first be excluded from the list of candidate products. Then, the meanvariance analysis should be carried out in a restricted set of items. This approach filters the products iteratively and provides the optimal product decomposition when all the weights are non-negative. We now vary the target expected return from 6% to 8%. Figure 6.2 shows the capital allocated to each product with respect to the target return. The capital allocated to the second product decreases as the target return is increased. However, the capital allocated to the first and the third products increases with the target return. This result is in line with our expectation, because the expected return for the second product is the lowest (i.e. the second entry in .r is 4.99%). Its variance is also the lowest, which is given by the second diagonal entry of the covariance matrix .. Therefore, the second product can be considered a low-return, low-risk product, which is preferred when the target return is low. If the target return is high, the
capital reserved for the second product should be reduced and more capital should be allocated to the other two products in order to generate high returns. In the online web application, we provide an interactive example of the mean-variance analysis that also includes the Python codes.
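As a rough illustration of the Lagrangian solution, the sketch below computes minimum-variance weights for a given target return. The return vector, the covariance matrix, and the target are hypothetical placeholders rather than the chapter's inputs, and the closed form is the standard two-constraint mean-variance solution written with \(X_1\), \(X_2\), and \(X_3\); it is assumed, not verified, to coincide with the formulas derived in the appendix.

```python
import numpy as np

def mean_variance_weights(r, cov, target):
    """Minimum-variance weights subject to w @ r == target and w.sum() == 1."""
    r = np.asarray(r, dtype=float)
    cov = np.asarray(cov, dtype=float)
    ones = np.ones_like(r)
    inv_r = np.linalg.solve(cov, r)      # Sigma^{-1} r
    inv_1 = np.linalg.solve(cov, ones)   # Sigma^{-1} 1
    x1 = r @ inv_1                       # r' Sigma^{-1} 1
    x2 = r @ inv_r                       # r' Sigma^{-1} r
    x3 = ones @ inv_1                    # 1' Sigma^{-1} 1
    det = x2 * x3 - x1 ** 2
    lam = (x3 * target - x1) / det       # multiplier of the return constraint
    gam = (x2 - x1 * target) / det       # multiplier of the budget constraint
    return lam * inv_r + gam * inv_1

# Hypothetical inputs for three products (not the chapter's data).
r = np.array([0.080, 0.050, 0.075])
cov = np.array([[0.040, 0.002, 0.004],
                [0.002, 0.010, 0.001],
                [0.004, 0.001, 0.025]])
target = 0.06

w = mean_variance_weights(r, cov, target)
print("weights:", np.round(w, 3))
print("capital (millions, for a 10 million budget):", np.round(10 * w, 2))
```

Negative entries in the result are possible because the weights are unrestricted; following the text, such products would be dropped and the function called again on the remaining ones.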
6.2 Resource Allocation and Capacity Management
Once the selection of products is completed, decision-makers should ensure the availability of the products in the market. To this end, they develop a plan for sourcing and producing the products. In practice, manufacturers often have limited resources that are utilized for producing different items. Retailers often source a group of products from a single supplier that has limited capacity. Therefore, companies are often exposed to a capacity allocation problem, in which a limited resource should be allocated to those products that maximize total profit.

Capacity allocation is considered one of the biggest challenges in the management field (Murray, 2010). Decision-makers are expected to tie their limited resources to activities with the highest value. Ineffective capacity allocation leads to colossal supply-demand mismatch costs because the stock of popular products is quickly depleted, whereas slow-moving items result in excess inventory (Cachon & Lariviere, 1999). Reallocating capacity from slow-moving items to hot-selling items could potentially minimize the mismatches between supply and demand in product categories.

The newsvendor model studied in Chap. 3 can be extended to understand the dynamics of resource allocation when different products are produced by utilizing the same resource. We use \(\Omega\) to denote the set of products in a product category. If a retailer only sells three colours of men's T-shirts (black, blue, and red), the set of products for this product category is \(\Omega\) = {black, blue, and red T-shirts}. We also use the subscript \(i\) to denote the product in a product category. The newsvendor profit for product \(i \in \Omega\) is:
\[ \Pi_i = (p_i - c_i) Q_i - (p_i - s_i) \max(Q_i - D_i, 0), \]
where \(p_i\) is the selling price of the product per unit, \(c_i\) the unit cost, \(s_i\) the residual value for one unit of unsold inventory, \(Q_i\) the order quantity, and \(D_i\) the demand for the product. Following up from Chap. 3, the marginal profit value for product \(i\) is given by:
\[ \Delta_i = p_i (1 - F_i(Q_i)) + s_i F_i(Q_i) - c_i, \]
where \(F_i\) is the cumulative distribution function of the demand \(D_i\).
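The step from the profit expression to the marginal profit can be written out as follows. This is a short derivation sketch that assumes demand is non-negative and continuous with cumulative distribution function \(F_i\), so that the expected leftover inventory is the integral of \(F_i\):
\[ E[\max(Q_i - D_i, 0)] = \int_0^{Q_i} F_i(x)\,dx \quad\Longrightarrow\quad \frac{d}{dQ_i} E[\max(Q_i - D_i, 0)] = F_i(Q_i), \]
\[ \frac{d\,E[\Pi_i]}{dQ_i} = (p_i - c_i) - (p_i - s_i) F_i(Q_i) = p_i (1 - F_i(Q_i)) + s_i F_i(Q_i) - c_i . \]
Setting this derivative to zero recovers the critical fractile \(F_i(Q_i^*) = (p_i - c_i)/(p_i - s_i)\).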
When the capacity is unlimited, the optimal order quantity makes the marginal profit value equal to zero. If the capacity is limited, it should first be allocated to the products that have the highest marginal profit. Suppose there is a manufacturer that has a shared resource to produce the products in product category \(\Omega\). The capacity of the resource is \(C\) units. The optimal
allocation policy can be described as the nested-allocation policy. The nested policy can be implemented by the following algorithm:
1. Set \(Q_i = 0\) \(\forall i \in \Omega\).
2. Calculate \(\Delta_i\) for the given \(Q_i\) values \(\forall i \in \Omega\).
3. Select the item \(j\) that has the highest \(\Delta_i\) value such that \(j = \arg\max_i \Delta_i\).
4. Allocate one unit of capacity of \(C\) to the product \(j\).
5. Update \(C = C - 1\) and \(Q_j = Q_j + 1\).
6. Stop if \(C\) or all \(\Delta_i\) values become equal to zero. Otherwise, go to Step 2.
We now consider a textile manufacturer that produces and sells two different T-shirts: (1) V-neck and (2) polo T-shirts. The selling price of the V-neck T-shirt is $30 per unit, the unit cost is $10, and the residual value is $5 per unsold item. The polo T-shirt has the same cost and residual values, but its selling price ranges from $20 to $60 because the manufacturer will set it later. The initial demand forecasts for both T-shirts are the same, equal to 500 units. They also evolve according to a multiplicative model with a drift rate of zero and a volatility of one. We also normalize the total forecast horizon to one. The manufacturer has a limited capacity of 1000 units to produce both T-shirts.

In Fig. 6.3, we present the capacity allocated to both T-shirts as a function of the price of the polo T-shirt. The red solid curve represents the capacity allocated to the polo T-shirt, while the blue dashed curve shows the results for the V-neck T-shirt. As the price of the polo T-shirt increases, more capacity should be allocated to the polo T-shirt to maximize the total profit. Therefore, the red curve moves upward as the price increases. Because the total capacity is fixed, less capacity remains for the V-neck T-shirt, so the capacity allocated to the V-neck T-shirt decreases as the price of the polo T-shirt increases. When the price of the polo T-shirt is equal to $30 per unit, both items have the same price, cost, and demand parameters. At that point, the total capacity should be allocated to these two products equally. We provide an interactive example of this model in the online web application, where the reader can specify the input parameters and review the Python codes.

Fig. 6.3 Allocated capacity to both T-shirts with respect to the increasing price of the polo T-shirt (vertical axis: allocated capacity; horizontal axis: price of the polo T-shirt)
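The nested-allocation steps can be sketched for this two-product example as follows. This is a minimal illustration, not the code from the online application: it assumes that, under the multiplicative model with zero drift, unit volatility, and a horizon of one, final demand is lognormal with log-mean \(\ln 500 - \sigma^2/2\), and the function names are hypothetical.

```python
import math

def lognormal_cdf(q, forecast, sigma=1.0, mu=0.0, horizon=1.0):
    """P(D <= q) when ln D ~ N(ln forecast + (mu - sigma^2/2) T, sigma^2 T)."""
    if q <= 0:
        return 0.0
    mean_log = math.log(forecast) + (mu - 0.5 * sigma ** 2) * horizon
    z = (math.log(q) - mean_log) / (sigma * math.sqrt(horizon))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def marginal_profit(q, price, cost, salvage, forecast):
    """Marginal profit p(1 - F(Q)) + s F(Q) - c at order quantity q."""
    f = lognormal_cdf(q, forecast)
    return price * (1.0 - f) + salvage * f - cost

def nested_allocation(products, capacity):
    """Allocate capacity one unit at a time to the highest marginal profit."""
    q = {name: 0 for name in products}
    while capacity > 0:
        deltas = {name: marginal_profit(q[name], **par)
                  for name, par in products.items()}
        best = max(deltas, key=deltas.get)
        if deltas[best] <= 0:        # no product benefits from another unit
            break
        q[best] += 1
        capacity -= 1
    return q

def tshirt_example(polo_price):
    products = {
        "v_neck": dict(price=30.0, cost=10.0, salvage=5.0, forecast=500.0),
        "polo": dict(price=polo_price, cost=10.0, salvage=5.0, forecast=500.0),
    }
    return nested_allocation(products, capacity=1000)

for price in (20, 30, 45, 60):
    print(price, tshirt_example(price))
```

Under these assumptions the capacity is binding for every price in the range, so the loop always ends when the 1000 units are used up rather than when the marginal profits reach zero.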
6.3 Multiple Ordering Model with Multiple Products
Manufacturers often sell a group of products on the market, which are produced under a capacity limitation. The production of such items often takes several months, so manufacturers should determine every month how much of each product to produce. In the early months of production, there is limited information about the demand for each product. As time passes, useful information about demand is collected from the market, which helps to improve demand forecasts. The pace of collecting demand information would be different for different items. Therefore, manufacturers would be better off allocating the capacity during the early months to those products with predictable demand. The capacity towards the end of the production period is precious, because it can be utilized to manufacture the products with high demand uncertainty after their demand uncertainty is resolved. Suppose that a textile manufacturer sells both V-neck and polo T-shirts to some retailers. The demand from the retailers is observed every year in March. The manufacturer starts production for both types of T-shirts in December. The production should then be completed in 3 months so the manufacturer would be able to fulfil the retailer demand. Before the manufacturer starts the production, the demand for V-neck T-shirts is estimated to be between 200 and 400 units based on historical demand information. The manufacturer is not able to collect demand information for V-neck T-shirts until the actual demand is observed. Actual demand for V-neck T-shirts turns out to be 300 units, which is observed at the beginning of March. The demand for polo T-shirts is estimated to be between 100 and 500 units at the very beginning. The retailers let the manufacturer know their order quantities for polo T-shirts at the beginning of January, which is equal to 400 units. The monthly production capacity is equal to 200 units. 
The manufacturer can ideally reserve the production capacity in December for V-neck T-shirts, because the demand uncertainty for polo T-shirts is still high in December. This makes it possible to free up capacity during the last 2 months for polo T-shirts after their demand uncertainty is fully resolved. Because the demand for polo T-shirts is 400 units, the total capacity in January and February is fully allocated to polo T-shirts.
Fig. 6.4 Production plan and information flow for V-neck and polo T-shirts
In Fig. 6.4, we present the production plan and information evolution for the manufacturer. The manufacturer faces 100 units of stock-outs for V-neck T-shirts, while there is no mismatch between the inventory produced and the demand for polo T-shirts. Because total demand for both T-shirts is 700 units and total capacity is only 600 units for 3 months, the manufacturer cannot fulfil the total demand in any case.

We now consider a different case in which the demand for polo T-shirts is equal to 200 units. The manufacturer knows the exact demand for polo T-shirts at the beginning of January. In this case, the manufacturer can ideally produce 200 units of V-neck T-shirts in December and 200 units of polo T-shirts in January. Given that the expected demand for V-neck T-shirts is 300 units at the beginning of February, the manufacturer would produce 100 units of V-neck T-shirts in February. Therefore, 100 units of total capacity would not be utilized in February. The demand for V-neck T-shirts turns out to be 300 units, which is observed at the beginning of March. Then, the total demand for both types of T-shirts is perfectly fulfilled without having any excess inventory at the end.

If the manufacturer ignores the evolution of demand forecasts and determines the production plan at the beginning of December, the total capacity would be ideally allocated to the products equally, because the expected demand for each product is equal to 300 units at the very beginning. This causes the manufacturer to face excess inventory of 100 units of the polo T-shirts. Therefore, a dynamic capacity allocation strategy based on the evolution of demand forecasts helps companies minimize mismatches between supply and demand.

In Fig. 6.5, we present ordering decisions for multiple products in the multiple ordering problem. Demand forecasts for all products evolve from \(t_0\) to \(t_n\). We use \(Q_{i,j}\) to denote the order quantity for the product \(i\) at time \(t_j\).
Likewise, we use \(D_{i,j}\) to denote the demand forecast for the product \(i\) at time \(t_j\). The total capacity for each month is equal to \(C\) units. Thus, the total production quantity cannot exceed the capacity. In mathematical terms:
\[ \sum_{i \in \Omega} Q_{i,j} \le C, \quad t_j \in \{t_0, \dots, t_n\}. \]

Fig. 6.5 Ordering decisions for multiple products in the multiple ordering problem. Demand forecasts for all items evolve from \(t_0\) to \(t_n\). Decision-makers determine the order quantity for each product in each period. The sum of order quantities in each period is limited by the capacity

Following up from the marginal profit formulation for the single-product case (given in the previous chapter), we approximate the marginal profit expression for the product \(i\) in period \(j\) as follows:
\[ \Delta_{i,j} = p_i \left[ 1 - F(I_{i,j} + Q_{i,j} + (n - j - 1)\alpha_{i,j} C \mid D_{i,j}) \right] + s_i F(I_{i,j} + Q_{i,j} \mid D_{i,j}) - c_i, \]
where \(I_{i,j}\) is the inventory level for product \(i\) at time \(t_j\):
\[ I_{i,j} = \sum_{k=0}^{j-1} Q_{i,k}. \]
The term \(\alpha_{i,j}\) is the demand weight of product \(i\) at time \(t_j\) such that:
\[ \alpha_{i,j} = \frac{D_{i,j}}{\sum_{u \in \Omega} D_{u,j}}. \]
In each period, the optimal policy is again a nested-allocation policy, which allocates the capacity to the products starting with the ones that have the highest marginal profit. The analytical proof regarding the optimality of this policy is given
in Biçer and Seifert (2017). The following algorithm optimally allocates the capacity for each period \(t_j \in \{t_0, \dots, t_{n-1}\}\) to the products in the set \(\Omega\):
1. Set \(Q_{i,j} = 0\) \(\forall i \in \Omega\).
2. Calculate \(\Delta_{i,j}\) for the given \(Q_{i,j}\) values \(\forall i \in \Omega\).
3. Select the item \(k\) that has the highest \(\Delta_{i,j}\) value such that \(k = \arg\max_i \Delta_{i,j}\).
4. Allocate one unit of capacity of \(C\) to the product \(k\).
5. Update \(C = C - 1\) and \(Q_{k,j} = Q_{k,j} + 1\).
6. Stop if all \(\Delta_{i,j}\) values or \(C\) become equal to zero. Otherwise, go to Step 2.
We now turn back to our example of the textile manufacturer that produces V-neck and polo T-shirts and sells them to retailers. Monthly production capacity is 200 units, and production takes place in December, January, and February. Demand forecasts for both V-neck and polo T-shirts evolve according to a multiplicative demand model. The initial demand forecast for each type of T-shirt is equal to 300 units. We normalize the initial demand forecast to one for both products and change the capacity limit accordingly. Thus, the normalized value of the capacity is equal to \(2/3\). The volatility parameter for the demand of each product is set equal to one.

The normalized demand forecasts for the V-neck T-shirt are updated as \(D_{1,0} = 1\), \(D_{1,1} = 1.2\), \(D_{1,2} = 1.1\), and \(D_{1,3} = 1.1\). The forecasts for the polo T-shirt are \(D_{2,0} = 1\), \(D_{2,1} = 0.9\), \(D_{2,2} = 0.8\), and \(D_{2,3} = 0.9\). The true values of the forecasts are found by multiplying the normalized values by the initial demand forecast of 300 units. For example, the true values for the V-neck T-shirt are 300, 360, 330, and 330 units. The selling price of the V-neck T-shirt is $30; the unit cost, $10; and the residual value, $5 per unit. For the polo T-shirt, the selling price is $45; the unit cost, $10; and the residual value, $5 per unit. The profit margin for the V-neck T-shirt is 66.7%, whereas it is 77.8% for the polo T-shirt.

The parameters of the demand model are identical at the very beginning for both types of T-shirts. Therefore, we expect that most of the capacity should be allocated to the polo T-shirts in December because the profit margin is higher for the polo T-shirts. Starting from January, the demand forecasts for the V-neck T-shirts become higher than those for the polo T-shirts.
Therefore, most of the capacity should be allocated to the V-neck T-shirts in January and February after the manufacturer observes increasing demand forecasts for the V-neck T-shirts. The manufacturer also observes decreasing demand forecasts for the polo T-shirts. We implement the nested-allocation algorithm and report the optimal order quantities in Table 6.1. The results are consistent with our expectation, such that the capacity in December is mostly used to produce the polo T-shirts, whereas most of the capacity in the last 2 months is used to produce the V-neck T-shirts.

Table 6.1 Order quantities for both types of T-shirts

Product          December   January   February
V-neck T-shirt   0.061      0.489     0.465
Polo T-shirt     0.596      0.168     0.192
Total ≈          0.66       0.66      0.66

In Table 6.2, we present the inventory level accumulated over time and the final demand for both types of products. The nested-allocation policy brings the inventory levels close to the final demand at the end of the production period for both types of T-shirts. This results from the fact that the policy uses the information of evolving demand forecasts and the capacity level for the next months to determine the order quantities. Therefore, it calibrates the inventory level according to the demand forecasts reactively, which in turn helps the manufacturer match inventory with demand in an effective way. In the online web application, we present an interactive example based on this model. The web application makes it possible for the reader to specify the input parameters and review the Python codes.

Table 6.2 Inventory level and final demand for both types of T-shirts

Product          December   January   February   Final demand
V-neck T-shirt   0.061      0.550     1.015      1.1
Polo T-shirt     0.596      0.764     0.956      0.9
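A hedged sketch of the multi-period nested allocation for this example is given below. It is not the code behind Tables 6.1 and 6.2: it assumes decisions at \(t_j = j/n\) on a horizon normalized to one, lognormal final demand given the current forecast, and a small step of 0.001 in place of "one unit" of normalized capacity. The function names and the step size are hypothetical choices, so the printed quantities need not match the tables exactly.

```python
import math

def cdf(x, forecast, tau, sigma=1.0):
    """P(D <= x) for lognormal final demand, tau time units after the forecast."""
    if x <= 0:
        return 0.0
    z = (math.log(x / forecast) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def allocate_period(j, n, cap, inventory, forecasts, params, step=0.001):
    """Nested allocation of one period's capacity in small increments."""
    tau = 1.0 - j / n                        # assumed time left until demand is realized
    total = sum(forecasts.values())
    q = {i: 0.0 for i in params}

    def delta(i):
        price, cost, salvage = params[i]
        alpha = forecasts[i] / total         # demand weight of product i
        on_hand = inventory[i] + q[i]
        future = (n - j - 1) * alpha * cap   # capacity share expected in later periods
        return (price * (1.0 - cdf(on_hand + future, forecasts[i], tau))
                + salvage * cdf(on_hand, forecasts[i], tau) - cost)

    for _ in range(int(cap / step)):
        best = max(params, key=delta)
        if delta(best) <= 0:                 # no product benefits from more capacity
            break
        q[best] += step
    return q

# (price, unit cost, residual value) and normalized forecasts from the example.
params = {"v_neck": (30.0, 10.0, 5.0), "polo": (45.0, 10.0, 5.0)}
forecast_path = {"v_neck": [1.0, 1.2, 1.1], "polo": [1.0, 0.9, 0.8]}
n, cap = 3, 2.0 / 3.0

inventory = {i: 0.0 for i in params}
plan = []
for j in range(n):
    forecasts = {i: forecast_path[i][j] for i in params}
    q = allocate_period(j, n, cap, inventory, forecasts, params)
    plan.append(q)
    for i in params:
        inventory[i] += q[i]
    print(j, {i: round(v, 3) for i, v in q.items()})
```

The entries of `plan` correspond to December, January, and February, and `inventory` accumulates the quantities in the same way as Table 6.2.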
6.4 Product Proliferation Model
Manufacturers that sell a variety of products commonly use a limited number of raw materials and differentiate them in the later stages of production. In the early stages of production, there is high demand uncertainty for the products. However, raw material orders are based on aggregate demand forecasts, for which the negative aspects of high uncertainty would be limited. If product differentiation is postponed to the later stages, the risk of high demand uncertainty can be mitigated by postponing the order quantity decision for the final products until there is partial or full resolution of the demand uncertainty. In such a setting, manufacturers have an evolving risk structure. There is high demand uncertainty in the early stages. As time passes, manufacturers collect information from the market and reduce the demand uncertainty. However, they are exposed to high product variety in the later stages of production. To minimize the mismatches between supply and demand, companies aim to delay the differentiation point by redesigning their processes. For example, Benetton was making clothes in three stages: (1) yarn procurement, (2) dyeing yarns, and (3) knitting. Historically, it offered more colour options than size options (Heskett & Signorelli, 1989). The company’s advertising strategy was centred around the richness of the colour options, which was popularized with the quote “the United Colors of Benetton”. To offer a large variety of colour options, Benetton mainly differentiated its clothes during the second stage—that is, when the yarns were dyed. However, this caused high mismatches between supply and demand. To postpone the dyeing operation to a later stage, the company invested in the technology and started to dye the items after the knitting process (Heskett & Signorelli, 1989). Therefore, the company swapped the last two stages and redesigned the sequence
Fig. 6.6 Production processes at Kordsa Inc. with the differentiation points
of its operations as follows: (1) yarn procurement, (2) knitting (all clothes are in white), and (3) dyeing the clothes.

Reebok also followed a similar strategy for the production of National Football League (NFL) jerseys (Parsons & Graves, 2005). The company had the exclusive rights to sell NFL jerseys from 2000 to 2010. The player jerseys had a very high demand uncertainty, because demand depends on the performance of both the players and their teams. Initially, Reebok ordered the jerseys with the players' names and numbers from offshore manufacturing sites. But this strategy resulted in excess inventory of some players' jerseys and stock-outs for the high-performing players' jerseys. Then, Reebok changed this policy and started ordering blank jerseys from the offshore manufacturing sites and printing the blank jerseys in its Indianapolis facility (Parsons & Graves, 2005). This allowed Reebok to place consolidated orders for the team jerseys and postpone the printing process until after it observed valuable information about the players' performance. Consequently, the company reduced the mismatch risk substantially (Parsons & Graves, 2005).

In addition to these examples, Kordsa Inc. increased the variety of its products along its supply chain, as shown in Fig. 6.6 (Biçer et al., 2022). As explained at the end of Chap. 1, the company produces tire cords after processing polypropylene through four production stages. The first step is yarn production, where polymer yarns are produced from polypropylene through a continuous process. The speed of the process is adjustable, so the output rate can be kept within some limits. However, stopping yarn production due to a breakdown results in huge costs, because all the pipes need to be opened and cleaned to set up the system again. The produced yarns are all the same. Thus, there is no differentiation in the first stage. The second step is the twisting operation, whereby two threads are bound together.
Twisted yarns immediately go through the weaving operation. For technical reasons, it is not possible to keep the twisted yarns in stock for more than a day. Otherwise, the twists loosen and degenerate. Tire cords may differ in the number of twists per meter of yarn. Thus, the first level of product differentiation occurs in this stage. The third step is the weaving operation, where the twisted yarns are woven to produce plain cords. The plain cords differ in the number of twisted yarns used for each square meter. Therefore, the second level of differentiation occurs in this stage.
Fig. 6.7 Ordering decisions in the product proliferation model. Demand forecasts for the final products evolve from \(t_0\) to \(t_n\). Product differentiation occurs at time \(t_1\) and \(t_{n-1}\)
The plain cords are then treated with some chemicals to reach the desired levels of elasticity and strength in the final step. Different chemical blends can be used for a single type of plain cord to differentiate the cords in the final stage.

In the product proliferation model, the manufacturer places an order for a component at the beginning. The production takes place sequentially, and the final products are manufactured after the components go through a series of production stages. For each production stage, the amount produced in the previous stage is used as the input. Therefore, the quantity produced in one stage determines the production limit in the following stage. We use \(Q_{i,0}\) to denote the order quantity for component \(i\) at time \(t_0\). If there is a product proliferation from that component in the second step, we use \(\Omega_{i,0}\) to denote the set of items that are produced by using the component \(i\). Therefore, the order quantities for items in the set \(\Omega_{i,0}\) are constrained by the total amount of component \(i\), which is equal to \(Q_{i,0}\). In mathematical terms,
\[ \sum_{j \in \Omega_{i,0}} Q_{j,1} \le Q_{i,0}. \]
In Fig. 6.7, we assume that product differentiation occurs at time \(t_1\) and \(t_{n-1}\). For the second proliferation point, the items that are produced by using the component \(j\) are denoted by the set \(\Omega_{j,n-2}\). Then, the order quantities at \(t_{n-1}\) are limited such that:
\[ \sum_{h \in \Omega_{j,n-2}} Q_{h,n-1} \le Q_{j,n-2}. \]
For the other intermediate steps, no differentiation occurs. Therefore,
\[ Q_{j,k} \le Q_{j,k-1}, \quad \forall k \in \{2, \dots, n-2\}. \]
We have presented the marginal profit expression for the single-product, multiple ordering, and multiple echelon model at the end of the previous chapter. Following up from the previous chapter, the marginal profit for product \(h\) (its production starts at \(t_k\)) is:
\[ \begin{aligned} \Delta_{h,k} = {} & p\,\mathrm{Prob}(D_{n,h} > Q_h,\ D_{n-1,h} > \psi_{n-1}(Q_h),\ D_{n-2,h} > \psi_{n-2}(Q_h),\ \dots,\ D_{k+1,h} > \psi_{k+1}(Q_h)) \\ & - c_{n-1}\,\mathrm{Prob}(D_{n-1,h} > \psi_{n-1}(Q_h),\ D_{n-2,h} > \psi_{n-2}(Q_h),\ \dots,\ D_{k+1,h} > \psi_{k+1}(Q_h)) \\ & - c_{n-2}\,\mathrm{Prob}(D_{n-2,h} > \psi_{n-2}(Q_h),\ \dots,\ D_{k+1,h} > \psi_{k+1}(Q_h)) \\ & - \dots - c_{k+1}\,\mathrm{Prob}(D_{k+1,h} > \psi_{k+1}(Q_h)) - c_k, \end{aligned} \]
where \(p\) is the selling price of the product per unit and \(c_k\) is the cost of processing one unit of product through stage \(k\), which takes place from time \(t_k\) to \(t_{k+1}\). If the marginal profit is calculated for a final product, \(p\) must be the selling price of the product. If the marginal profit is calculated for a component that will be differentiated to produce a group of products, \(p\) can be approximated as the weighted average of the selling prices of the products.

If there is no differentiation occurring at time \(t_k\), the optimal policy increases the order quantity until the marginal profit falls to zero. If this is not possible due to the quantity constraint (i.e. \(Q_{j,k} \le Q_{j,k-1}\)), the order quantity is set equal to the quantity produced in the previous echelon, as discussed in the previous chapter. If differentiation occurs at time \(t_k\), the following algorithm optimizes the production quantities:
1. Set \(Q_{h,k} = 0\) \(\forall h \in \Omega_{j,k-1}\).
2. Calculate \(\Delta_{h,k}\) for the given values of \(Q_{h,k}\) \(\forall h \in \Omega_{j,k-1}\).
3. Select the item \(\nu\) that has the highest \(\Delta_{h,k}\) value such that \(\nu = \arg\max_h \Delta_{h,k}\).
4. Allocate one unit of \(Q_{j,k-1}\) to the product \(\nu\).
5. Update \(Q_{j,k-1} = Q_{j,k-1} - 1\) and \(Q_{\nu,k} = Q_{\nu,k} + 1\).
6. Stop if all \(\Delta_{h,k}\) values or \(Q_{j,k-1}\) become equal to zero. Otherwise, go to Step 2.
This algorithm optimally allocates the quantity produced at time \(t_{k-1}\) to the items produced at time \(t_k\). This allocation mechanism is called the "nested allocation" mechanism. The analytical derivations that show the optimality of the nested-allocation mechanism are given by Biçer et al. (2022).

We now present an example of the product proliferation model for a textile manufacturer that produces V-neck and polo T-shirts in two production steps. The first step is yarn production. The second step is the knitting process. The differentiation occurs during the second step. The cost of the yarn production is $4 per unit, and the cost of the knitting process is $6 per unit. The selling price per unit is $30 for the V-neck T-shirt and $45 for the polo T-shirt. The ordering decisions and the cost values for both production steps are given in Fig. 6.8.

Fig. 6.8 Ordering decisions and the cost values for the yarn production and knitting operations

The initial demand forecast for both types of T-shirts is 300 units. The demand forecast at time \(t_1 = 0.5\) is 360 units for the V-neck T-shirt and 240 units for the polo T-shirt. The final demand is 390 units for the V-neck T-shirt and 270 units for the polo T-shirt. We normalize the initial demand forecast to one and update the other values accordingly. Therefore, the demand parameters for the V-neck T-shirt are \(D_{1,0} = 1\), \(D_{1,1} = 1.2\), and \(D_{1,2} = 1.3\). The normalized values for the polo T-shirt are \(D_{2,0} = 1\), \(D_{2,1} = 0.8\), and \(D_{2,2} = 0.9\). We assume a multiplicative demand process with zero drift and a volatility of one.

The manufacturer should determine the amount of yarn to be produced at time \(t_0\), which is denoted by \(Q_0\). Then, the production quantities for both V-neck and polo T-shirts are determined at the beginning of the knitting process, that is, at time \(t_1\). We use \(Q_{1,1}\) and \(Q_{2,1}\) to denote the production quantities for the V-neck and polo T-shirts, respectively. The total amount of T-shirts produced cannot exceed \(Q_0\). We also define a new variable that denotes the total production quantity of both types of T-shirts such that \(Q_1 = Q_{1,1} + Q_{2,1}\). Thus,
\[ Q_{1,1} + Q_{2,1} = Q_1 \le Q_0. \]
The marginal profit function at \(t_0\) is written as follows:
\[ \Delta_0 = p_{avg}\,\mathrm{Prob}(D_{1,2} + D_{2,2} > Q_0,\ D_{1,1} + D_{2,1} > \psi_1(Q_0)) - c_1\,\mathrm{Prob}(D_{1,1} + D_{2,1} > \psi_1(Q_0)) - c_0, \]
where we recall that \(\psi_1(Q_0)\) is the threshold value for the demand forecast \(D_{1,1} + D_{2,1}\) such that \(Q_0\) becomes the optimal value of \(Q_1\) when \(D_{1,1} + D_{2,1} \ge \psi_1(Q_0)\). The term \(p_{avg}\) is the weighted average of the selling prices, which is equal to $37.5, given that the selling price is $30 per unit for the V-neck and $45 per unit for the polo T-shirt.
To find an expression for \(\psi_1(Q_0)\) in the marginal profit function, the optimal value of \(Q_1\) is first approximated by the critical fractile formula at \(t_1\):
\[ Q_1^* \approx F^{-1}\left( \frac{p_{avg} - c_1}{p_{avg}} \,\middle|\, D_{1,1}, D_{2,1} \right) = F^{-1}(0.84 \mid D_{1,1}, D_{2,1}). \]
The inverse of the standard normal distribution for 0.84 is \(\Phi^{-1}(0.84) = 0.994\). Then,
\[ Q_1^* \approx (D_{1,1} + D_{2,1})\, e^{(\mu - \sigma^2/2)(t_2 - t_1) + \Phi^{-1}(0.84)\,\sigma\sqrt{t_2 - t_1}} = (D_{1,1} + D_{2,1})\, e^{-0.25 + 0.994 \times 0.707} = 1.573\,(D_{1,1} + D_{2,1}). \]
Then, the threshold value is formulated as follows:
\[ \psi_1(Q_0) = Q_0 / 1.573 = 0.636\, Q_0. \]
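The arithmetic behind the multiplier 1.573 and the threshold coefficient 0.636 can be checked with a few lines of Python. The variable names are illustrative, and the parameter values are the ones stated in the example (zero drift, unit volatility, \(t_1 = 0.5\), \(t_2 = 1\)).

```python
import math
from statistics import NormalDist

p_avg, c1 = 37.5, 6.0          # average selling price and knitting cost
mu, sigma = 0.0, 1.0           # drift and volatility of the demand model
t1, t2 = 0.5, 1.0              # knitting starts at t1, demand is realized at t2

fractile = (p_avg - c1) / p_avg                   # critical fractile at t1
z = NormalDist().inv_cdf(fractile)                # inverse standard normal
multiplier = math.exp((mu - 0.5 * sigma ** 2) * (t2 - t1)
                      + z * sigma * math.sqrt(t2 - t1))
threshold_coeff = 1.0 / multiplier                # psi_1(Q0) = threshold_coeff * Q0

print(round(fractile, 2), round(z, 3), round(multiplier, 3), round(threshold_coeff, 3))
```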
Once we develop an expression for \(\psi_1(Q_0)\), the optimal value of \(Q_0\) can be found by setting the marginal profit \(\Delta_0\) equal to zero. In the interactive example given in the online web application, we present the Python codes to find the optimal value, which is equal to \(Q_0^* = 2.71\) in the normalized terms. In real values, the optimal amount of yarns to be produced should be \(2.71 \times 300 = 813\) units. The marginal profit function at time \(t_1\) is written as follows:
\[ \Delta_{j,1} = p_j\,\mathrm{Prob}(D_{j,2} > Q_{j,1}) - c_1, \quad j \in \{\text{V-neck}, \text{polo}\}. \]
We use this marginal profit formulation in our nested-allocation algorithm to determine the order quantities at \(t_1\). The optimal values are found as \(Q_{1,1}^* = 1.48\) and \(Q_{2,1}^* = 1.23\). In real values, the optimal quantity to be produced should be 444 and 369 units for the V-neck and polo T-shirts, respectively.
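Both steps of this example can be sketched numerically. The code below is an illustration under stated assumptions, not the code from the online application: aggregate demand is treated as a single lognormal process starting from the combined initial forecast of 2 (an assumption of this sketch), the joint probability in \(\Delta_0\) is evaluated by a midpoint rule, \(Q_0\) is found by bisection, and the allocation at \(t_1\) uses increments of 0.001 together with the reported \(Q_0^* = 2.71\). All function names are hypothetical.

```python
import math
from statistics import NormalDist

N = NormalDist()
SIGMA, T1, T2 = 1.0, 0.5, 1.0
P_AVG, C0, C1 = 37.5, 4.0, 6.0
D0_TOTAL = 2.0            # combined normalized initial forecast (assumption)
PSI_COEFF = 0.636         # psi_1(Q0) = 0.636 * Q0, from the text

def tail(q, forecast, dt):
    """P(D > q) for lognormal demand dt time units after the forecast."""
    z = (math.log(q / forecast) + 0.5 * SIGMA ** 2 * dt) / (SIGMA * math.sqrt(dt))
    return 1.0 - N.cdf(z)

def marginal_t0(q0, grid=2000):
    """Marginal profit of yarn at t0, integrating over the forecast at t1."""
    psi = PSI_COEFF * q0
    a = (math.log(psi / D0_TOTAL) + 0.5 * SIGMA ** 2 * T1) / (SIGMA * math.sqrt(T1))
    p_knit = 1.0 - N.cdf(a)                  # Prob(forecast at t1 > psi)
    h = 8.0 / grid                           # midpoint rule on [a, a + 8]
    joint = 0.0
    for k in range(grid):
        z1 = a + (k + 0.5) * h
        d1 = D0_TOTAL * math.exp(-0.5 * SIGMA ** 2 * T1 + SIGMA * math.sqrt(T1) * z1)
        joint += N.pdf(z1) * tail(q0, d1, T2 - T1) * h
    return P_AVG * joint - C1 * p_knit - C0

def solve_q0(lo=0.5, hi=10.0, tol=1e-4):
    """Bisection on the sign of the marginal profit at t0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_t0(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def allocate_t1(q0, forecasts, prices, step=0.001):
    """Nested allocation of the yarn quantity q0 to the final products at t1."""
    q = {j: 0.0 for j in forecasts}

    def delta(j):
        if q[j] <= 0:
            return prices[j] - C1
        return prices[j] * tail(q[j], forecasts[j], T2 - T1) - C1

    for _ in range(int(round(q0 / step))):
        best = max(q, key=delta)
        if delta(best) <= 0:
            break
        q[best] += step
    return q

q0_star = solve_q0()
alloc = allocate_t1(2.71, {"v_neck": 1.2, "polo": 0.8}, {"v_neck": 30.0, "polo": 45.0})
print(round(q0_star, 2), {j: round(v, 2) for j, v in alloc.items()})
```

Multiplying the normalized quantities by the initial forecast of 300 units converts them to real units, as in the text.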
6.5 Operational Excellence
There are different forms of trade-offs in supply chains that should be managed carefully in order to reduce the supply-demand mismatches in a high product variety environment. For the multiple ordering model, there are alternative ways to reduce the supply-demand mismatches. One option would be capacity expansion, which makes it possible to postpone the starting time of production. Another option would be to reduce the delivery lead time so production does not have to be completed well in advance of the selling season. When production facilities are located near market bases, the delivery lead time is expected to be relatively short. However, capacity expansion would not be a viable
solution in this case due to high costs. Land costs and other expenses are expected to be much higher in places close to the markets compared to offshore countries. When production facilities are located in offshore countries, the capacity is expected to be higher than that of domestic production. However, the delivery lead time is expected to be longer for offshore production than domestic production. Therefore, decision-makers who must choose between offshore and domestic production may face a trade-off between high capacity levels and short delivery lead times. Biçer and Seifert (2017) explore this trade-off and develop a decision typology that suggests some effective strategies depending on the production flexibility (i.e. the ability to produce alternative items by using some shared resources) and demand uncertainty.

When demand uncertainty is low for a product category and there is no room for increasing the product variety, companies are better off producing the items in an offshore country. Offshore production offers the benefits of lower costs and higher capacity levels compared to domestic production, which are very important for products with low demand uncertainty. Production flexibility is not very important for such product categories, because the demand is predictable at the very beginning of the production horizon. Thus, the optimal order quantities determined at the very beginning can effectively reduce the supply-demand mismatches.

When demand uncertainty is low for a product category and there is room for increasing the product variety, companies would be better off investing in domestic production with high production flexibility. This makes it possible to increase the number of items in the product category and effectively manage the increasing demand uncertainty due to the category expansion. In this case, production flexibility allows decision-makers to update the order quantities according to the evolution of demand forecasts.
To improve the order quantity decisions, demand forecasts should be updated frequently and accurately according to the information collected from the market. For that reason, the domestic production strategy should be complemented with some technology investments to reach valuable information sources about the demand. Some manufacturers of consumer packaged goods implement vendor-managed inventory (VMI) systems to reach the point-of-sales data of some retailers (Biçer & Seifert, 2017). The point-of-sales data helps the manufacturers understand trends, seasonality, and other important factors that influence the demand for their products, which in turn leads to improved demand forecasts. When demand uncertainty is high for a product category, domestic production with short delivery lead times would be a better option than offshore production with high capacity levels. To benefit from domestic production effectively, companies should also invest in production flexibility and the technology to gain valuable information about customer demand as outlined above. Establishing production flexibility may not be possible for manufacturers in some specific industries. Biçer and Seifert (2017) gave an example from the automotive industry, such that production is highly inflexible due to the frozen periods in production schedules. In other industries, such as agriculture and commodities, processing lead times are excessively long, so the production quantities should be
determined well in advance of the selling season without any flexibility to update the initial quantities. Increasing the product variety may hurt the profitability in these industries due to production inflexibility. Such manufacturers would be better off standardizing the products in their portfolios to minimize the mismatches between supply and demand. If product standardization is not possible for a manufacturer that aims to offer a variety of products (e.g. to increase the market share and enter new markets), designing the production process as a series of sequential operations would help deal with the operational challenges of product variety. This would bring a single-echelon and multiple ordering model into the product proliferation model. For the product proliferation model, Biçer et al. (2022) identified four effective strategies to reduce the mismatches between supply and demand, depending on the product characteristics and process flexibility. Process flexibility is the ability to change the sequence of the operations. The Benetton example given above shows how process flexibility can help companies delay the differentiation point in their supply chains. Delaying differentiation is an effective method for reducing supply-demand mismatches, because it allows the ordering decisions for the final products to be postponed until there is a partial or full resolution of demand uncertainty. In the Benetton case, the company swapped the dyeing and knitting operations to postpone the dyeing operation. It helped the company increase its profits substantially (Heskett & Signorelli, 1989), because the dyeing operation was much costlier than the knitting operation. In supply chains, delaying differentiation always helps increase profits if the point of differentiation is also a high-cost operation. In such cases, there are two benefits of delaying the differentiation. 
First, the order quantities of the final products are determined after improving the demand forecasts, which in turn minimizes the risk of supply-demand mismatches. Second, delaying the point of differentiation, which is also a costly operation, makes it possible to avoid the overutilization of expensive resources. When the benefit of delaying the differentiation is coupled with that of avoiding the overutilization of expensive resources, companies can generate a substantial increase in their profits. Delaying the differentiation, however, may hurt the profits if its implementation requires expediting the high-cost operations in the supply chain. Suppose, for example, that a manufacturer produces two products (i.e. Products A and B) in two steps. First, a single type of raw material goes through Step X. There is no product differentiation in the first step. The second step is Step Y, which differentiates the components produced in Step X into either Product A or B. We also assume that the time length to complete each operation is the same. The cost of processing one unit of the raw material through Step X is $45. The cost of processing one unit of component through Step Y is $5. The selling price is the same for both products, which is equal to $75 per unit. We also assume that demand is either zero or 10 units with equal probabilities for both products. The manufacturer observes the true values of demand just before starting Step Y. The order quantity for the component produced by Step X is then determined by the newsvendor model. The
critical fractile is:

$$\text{Critical fractile} = \frac{75 - 5 - 45}{75 - 5} = 35.7\%.$$
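The total demand for both Product A and B can be zero with a probability of 25%, 10 units with a probability of 50%, and 20 units with a probability of 25%. Because the critical fractile is 35.7%, the optimal order quantity for the components is 10 units at the beginning of Step X. With 10 components, the manufacturer loses the Step X cost of $45 per unit when the total demand is zero (probability 25%) and otherwise earns 75 − 45 − 5 = $25 per unit. Then, the total expected profit estimated at the very beginning is:

$$-45 \times 10 \times 0.25 + 25 \times 10 \times 0.75 = \$75.$$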
We now consider the case in which raw materials first go through Step Y and then Step X. In other words, the differentiation occurs at the very beginning, and the manufacturer has to determine the order quantities for both products before starting the first operation. We assume that the demand dynamics remain the same, such that the manufacturer observes the true demand values before starting the second operation. The critical fractile value is the same for each product:

$$\text{Critical fractile} = \frac{75 - 45 - 5}{75 - 45} = 83.3\%.$$
Because the demand for each product can be either zero or 10 units with equal probabilities, the optimal order quantity is equal to 10 units for both products, given that the critical fractile is 83.3%. Then, the total expected profit estimated at the very beginning is:

$$2 \times [-5 \times 10 \times 0.50 + 25 \times 10 \times 0.50] = \$200.$$
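The two expected-profit figures above can be checked by enumerating the four equally likely demand scenarios. The following is a minimal Python sketch under the assumptions of this example; the function names `profit_delayed` and `profit_early` are illustrative and do not come from the text.

```python
from itertools import product

PRICE, COST_X, COST_Y = 75.0, 45.0, 5.0
# Each product's demand is 0 or 10 units with equal probability,
# so the four (demand_A, demand_B) scenarios each have probability 0.25.
SCENARIOS = list(product((0, 10), repeat=2))
PROB = 1.0 / len(SCENARIOS)


def profit_delayed(q_components):
    """Step X first: components are ordered before demand is known,
    and Step Y is applied only to the units that will be sold."""
    total = 0.0
    for d_a, d_b in SCENARIOS:
        sold = min(q_components, d_a + d_b)
        total += PROB * ((PRICE - COST_Y) * sold - COST_X * q_components)
    return total


def profit_early(q_per_product):
    """Step Y first: each product is ordered before demand is known,
    and Step X is applied only to the units that will be sold."""
    total = 0.0
    for demands in SCENARIOS:
        for demand in demands:
            sold = min(q_per_product, demand)
            total += PROB * ((PRICE - COST_X) * sold - COST_Y * q_per_product)
    return total


print(profit_delayed(10))  # 75.0
print(profit_early(10))    # 200.0
```

Evaluating `profit_delayed` at 0, 10, and 20 components also confirms that 10 units is the best of the three candidate quantities for the component order.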
Therefore, swapping the operations leads to an increase in the expected profit of $125 even though it means expediting the differentiation. This example illustrates that delaying the differentiation does not always increase profits, especially when its implementation results in expediting a high-cost operation. Therefore, organizations must leverage cost efficiency and supply chain responsiveness to maximize their profits. Biçer et al. (2022) show that cost reduction efforts should start from upstream operations and move downstream. Reducing the cost of an upstream operation helps increase profits more than reducing the cost of a downstream operation by the same amount. The rationale behind this result is that reducing the cost of an upstream operation leads to an increase in the optimal order quantities for the upstream operation. The downstream order quantities are constrained by the upstream quantities. For that reason, increasing the upstream order quantities provides the manufacturer with more flexibility when the downstream ordering decisions are made, which leads to higher profits. Biçer et al.
(2022) refer to this approach as “systematic cost reduction”, where cost reduction efforts start from upstream operations and then move downstream. When it comes to lead time reduction efforts, organizations should follow the opposite route. Suppose, for example, a manufacturer processes raw materials through three serial operations: Operations X, Y, and Z in sequence. It takes 2 weeks to complete each operation. If the lead time of Operation X is reduced by 1 week, only the ordering decision for that operation is postponed by 1 week. If the lead time of Operation Y is reduced by 1 week, the ordering decisions for Operations X and Y are postponed by 1 week. Finally, if the lead time of Operation Z is reduced by 1 week, the ordering decisions of all the operations are postponed by 1 week. Postponing an ordering decision makes it possible to base the decision on more accurate demand forecasts, which helps increase the profits. Hence, reducing the lead time of Operation Z allows the manufacturer to base all the decisions on more accurate forecasts. For that reason, reducing the lead time of a downstream operation helps increase the profits more than reducing the lead time of an upstream operation by the same amount. Lead time reduction efforts should start from the downstream operations and then move upstream. This approach of reducing the lead times is called “systematic lead time reduction”. In a product proliferation model, the systematic cost reduction and lead time reduction efforts do not necessitate changing the sequence of the operations. Even when there is no process flexibility, manufacturers can adopt these approaches to improve their profits. For standard products that have a low profit margin and low demand uncertainty, improving cost efficiency would be more important than improving supply chain responsiveness. In such a case, manufacturers should follow a systematic cost reduction strategy.
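The three-operation lead time example above can be traced with a short Python sketch. It assumes that every operation starts as late as possible before a fixed delivery date; the helper names `ordering_lead` and `postponed_operations` are hypothetical and used only for illustration.

```python
def ordering_lead(lead_times):
    """Weeks before delivery at which each operation must be ordered,
    assuming every operation starts as late as possible."""
    leads, remaining = [], 0
    for lead_time in reversed(lead_times):
        remaining += lead_time
        leads.append(remaining)
    return leads[::-1]


def postponed_operations(names, lead_times, reduced_op, reduction=1):
    """Operations whose ordering decision moves closer to delivery
    when the lead time of `reduced_op` is cut by `reduction` weeks."""
    before = ordering_lead(lead_times)
    shorter = [lt - reduction if name == reduced_op else lt
               for name, lt in zip(names, lead_times)]
    after = ordering_lead(shorter)
    return [name for name, b, a in zip(names, before, after) if a < b]


names, lead_times = ["X", "Y", "Z"], [2, 2, 2]
for op in names:
    print(op, postponed_operations(names, lead_times, op))
# X ['X']
# Y ['X', 'Y']
# Z ['X', 'Y', 'Z']
```

Cutting the lead time of the most downstream operation postpones every ordering decision, which is the rationale for starting lead time reduction efforts downstream.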
For innovative products that have a high profit margin and high demand uncertainty, supply chain responsiveness becomes more important than cost efficiency. Thus, manufacturers should follow a systematic lead time reduction strategy for their innovative products. When manufacturers achieve a certain level of process flexibility, it is possible to redesign the operations by changing their sequence. Manufacturers can also complement such an operational redesign strategy with systematic cost reduction and/or lead time reduction. In a product proliferation model with process flexibility, Biçer et al. (2022) propose two different strategies: (1) performance-based process redesign and (2) mixed strategy. The performance-based process redesign strategy examines all adjacent operations and swaps their order in the following situations:

1. Differentiation priority: When the differentiation occurs only during the upstream operation, swapping the two operations makes it possible to delay the differentiation. In such a case, the process redesign strategy prioritizes delaying the differentiation point. This strategy always benefits manufacturers when the cost of the upstream operation is not lower than the cost of the downstream operation and the lead time of the upstream operation is not longer than the lead time of the downstream operation. If these conditions are fulfilled, swapping the operations based on the differentiation priority helps manufacturers increase their profits.
2. Cost reduction priority: When the cost of the upstream operation is higher than the cost of the downstream operation, swapping the two operations makes it possible to delay the high-cost activity, possibly reducing the overutilization of expensive resources. Such a process redesign strategy prioritizes delaying the high-cost operation. If no differentiation occurs during the downstream operation and the lead time of the upstream operation is not longer than the lead time of the downstream operation, swapping the operations by prioritizing cost reduction always benefits manufacturers.
3. Lead time priority: When the lead time of the upstream operation is shorter than the lead time of the downstream operation, swapping the operations makes it possible to postpone the ordering decision of the second operation. Suppose, for example, that a manufacturer processes raw materials through Operations A and B in sequence. Operation A takes 1 week, while Operation B takes 3 weeks. At the beginning of each month, the manufacturer determines the order quantity for Operation A. The quantity for Operation B is determined after 1 week. If the manufacturer swaps the operations, the order quantity for Operation B should be determined at the beginning of the month, but the quantity for Operation A is determined after 3 weeks. Therefore, the ordering time for the second operation (i.e. Operation B before swapping and Operation A after swapping) is postponed by 2 weeks after swapping the operations. Such a process redesign strategy that prioritizes the lead times always benefits manufacturers when no differentiation occurs during the downstream operation and the cost of the upstream operation is not lower than the cost of the downstream operation.

Therefore, the performance-based strategy iteratively looks at the level of differentiation, cost values, and lead times for all adjacent operations.
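The three sufficient conditions listed above can be written as a small rule check for one pair of adjacent operations. This is a sketch only: the `Operation` record, the function `swap_priorities`, and all cost and lead time figures are hypothetical, and the function encodes just the conditions stated in the list, not the full analytical model.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    name: str
    cost: float           # processing cost per unit
    lead_time: float      # weeks
    differentiates: bool  # True if product differentiation occurs here


def swap_priorities(up, down):
    """Priorities under which swapping `up` (upstream) and `down`
    (downstream) is stated to always benefit the manufacturer."""
    cost_ok = up.cost >= down.cost            # upstream cost not lower
    lead_ok = up.lead_time <= down.lead_time  # upstream lead time not longer
    no_diff_down = not down.differentiates
    reasons = []
    if up.differentiates and no_diff_down and cost_ok and lead_ok:
        reasons.append("differentiation")
    if up.cost > down.cost and no_diff_down and lead_ok:
        reasons.append("cost reduction")
    if up.lead_time < down.lead_time and no_diff_down and cost_ok:
        reasons.append("lead time")
    return reasons


# Hypothetical figures: a costly differentiating step followed by a cheap one.
dyeing = Operation("dyeing", cost=8.0, lead_time=1.0, differentiates=True)
knitting = Operation("knitting", cost=3.0, lead_time=1.0, differentiates=False)
print(swap_priorities(dyeing, knitting))  # ['differentiation', 'cost reduction']

# Lead time example from the list, with equal (assumed) costs.
op_a = Operation("A", cost=5.0, lead_time=1.0, differentiates=False)
op_b = Operation("B", cost=5.0, lead_time=3.0, differentiates=False)
print(swap_priorities(op_a, op_b))  # ['lead time']
```

An empty result does not mean that a swap is harmful; it only means that none of the three sufficient conditions applies and the trade-off has to be evaluated explicitly.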
Then, it determines which one of these operational levers should be prioritized while swapping the operations. In some cases, however, it may not be so clear whether swapping two adjacent operations would benefit the manufacturer. If an upstream operation has a higher cost than the downstream operation and the differentiation takes place during the downstream operation, the manufacturer faces a trade-off between postponing a high-cost operation and the point of differentiation. Although the analytical model given in the previous section helps the manufacturer optimize this trade-off, Biçer et al. (2022) also propose the mixed strategy as a practical approach to improve the profits in such cases. The mixed strategy first resequences the operations according to their costs in an ascending order. It then complements this cost-based ordering strategy with a lead time reduction strategy. Suppose that a manufacturer processes raw materials through four operations in sequence: Operations A, B, C, D. The cost and lead time values for each operation are denoted by $c_i$ and $l_i$ for $i \in \{A, B, C, D\}$, respectively. We also use $t_i$ and $t_i'$ to denote the starting time of Operation $i \in \{A, B, C, D\}$ before and after resequencing the operations, respectively. The lead time reduction efforts in the mixed strategy should start from the most downstream operation and continue upstream until $t_i' \ge t_i$ for all operations.
Suppose, for example, that the lead times for all four operations are the same and equal to 1 week. The cost values satisfy $c_C > c_B > c_D > c_A$. Thus, the cost-based ordering strategy resequences the operations in the order: (1) Operation A, (2) Operation D, (3) Operation B, and (4) Operation C. For each month, the starting time of Operation A is Week 1, both before and after the resequencing. In other words, the starting time of Operation A is not affected by resequencing the operations because Operation A is the first operation in both cases. The starting time of Operation D becomes Week 2 after the resequencing, which was initially Week 4. The starting time of Operation B becomes Week 3 after resequencing the operations, which was initially Week 2. Finally, the starting time of Operation C is Week 4 after the resequencing, which was initially Week 3. For Operations A, B, and C, the condition $t_i' \ge t_i$ is satisfied after resequencing the operations even without reducing the lead times. However, the condition is violated for Operation D, because Operation D should start 2 weeks earlier than the initial schedule after resequencing the operations. In other words, the cost-based ordering strategy causes Operation D to be started in Week 2, rather than in Week 4 as in the original sequence. The mixed strategy proposes that the lead times of the downstream operations (C, B, and D) should be reduced until the starting time of Operation D becomes Week 4 after resequencing the operations. The total lead time for processing Operations C, B, and D should be reduced from 3 weeks to 1 week. Then, Operation D can be started in Week 4 of each month, and all three operations (C, B, and D) can be completed during the same week. After reducing the lead times, the first operation can also be postponed to Week 3.
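The resequencing step of this example can be reproduced with a few lines of Python. The sketch below is illustrative only: the numeric cost values are hypothetical and chosen solely to satisfy the stated ordering of the costs, and the helper `start_weeks` is not taken from the text.

```python
def start_weeks(sequence, lead_times, first_week=1):
    """Start week of each operation when they run back to back."""
    starts, week = {}, first_week
    for op in sequence:
        starts[op] = week
        week += lead_times[op]
    return starts


original = ["A", "B", "C", "D"]
lead_times = {op: 1 for op in original}
# Hypothetical cost values chosen only to satisfy c_C > c_B > c_D > c_A.
costs = {"A": 1.0, "B": 3.0, "C": 4.0, "D": 2.0}

resequenced = sorted(original, key=lambda op: costs[op])  # ascending cost
before = start_weeks(original, lead_times)
after = start_weeks(resequenced, lead_times)
# Operations that would start earlier than in the original schedule.
expedited = {op: before[op] - after[op]
             for op in original if after[op] < before[op]}

# Lead time target for the block that starts with the expedited operation.
worst = max(expedited, key=expedited.get)
block = resequenced[resequenced.index(worst):]
current = sum(lead_times[op] for op in block)
horizon_end = 1 + sum(lead_times.values())  # first week after the month
allowed = horizon_end - before[worst]

print(resequenced)       # ['A', 'D', 'B', 'C']
print(expedited)         # {'D': 2}
print(current, allowed)  # 3 1
```

The last line restates the conclusion of the example: the combined lead time of Operations D, B, and C has to fall from 3 weeks to 1 week for Operation D to keep its original starting week.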
When a manufacturer has process flexibility, implementing a performance-based process redesign strategy does not require any additional investments. However, this is not the case for the mixed strategy, because reducing lead times can be costly for manufacturers. For standard products that have a low profit margin and low demand uncertainty, the performance-based strategy would be a better option than the mixed strategy, because the benefits of lead time reduction for such products may not justify the investment cost. For innovative products that have a high profit margin and high demand uncertainty, the mixed strategy would be a better option than the performance-based strategy, because the benefits of lead time reduction would be much higher than its investment cost.
6.6
Chapter Summary
Managing product variety is a very complex task that requires the involvement of different stakeholders in organizations. It starts with selecting the products and forming product portfolios for each product category. Once the product portfolios are designed in a company, the marketing department assigns product managers to the products with a budget to be spent for some marketing campaigns. Therefore, senior executives are responsible for not only designing the product portfolios but also allocating budget to each product. For that reason, the capital budgeting
problem is part of the product selection problem in practice. We have presented how to use the mean-variance analysis to select the products and determine the budget for each of the products in the portfolio. After the product selection stage, operations management practices should be aligned with the product portfolios. From the operational point of view, the most challenging part is the allocation of shared resources to the products. If each product utilizes dedicated resources, the single-product models can be used to optimize the operational decisions. However, companies (especially manufacturers) generally use flexible and shared resources that can be allocated to different products. When multiple ordering is not possible, the resource allocation and capacity management model optimizes the allocation of the shared resources to different products. When multiple ordering over time is possible, we then need to look at the echelon structure of the supply chain. The multiple ordering model with multiple products effectively allocates the shared resources to products in a single-echelon case. If production of the products is completed by processing raw materials through some operations in sequence, the allocation decisions in the multi-echelon setting can be optimized with the product proliferation model. Finally, we have gone through the strategies that help organizations achieve operational excellence by redesigning their operations and reducing lead times and production costs. An effective strategy can only be developed by taking into account the supply chain and demand dynamics. Thus, organizations must tailor the operational practices according to the characteristics of their product portfolios.
6.7
Practice Examples
Example 6.1 A manufacturer produces two different products (i.e. Product A and B) and sells them in a market with uncertain demand. Demand for each product is seasonal and identically distributed: It follows a lognormal distribution with a location parameter of 6.56 and a scale parameter of 0.83 for each product. The selling price of Product A is $25 per unit, while it is $20 per unit for Product B. The cost of production is the same for both products, and it is equal to $5 per unit. Unsold items at the end of the selling season are donated to a charity, so the residual value is zero. During the manufacturing process, the products utilize a shared resource. If the capacity of the shared resource is unlimited, calculate the optimal order quantities for Products A and B (Use this information to answer the questions of Examples 6.1–6.6).

Solution 6.1 Given that the capacity is unlimited, the optimal order quantity is calculated by the single-period newsvendor model. The critical fractile for Product A is $\alpha_A = (25 - 5)/25 = 80\%$. The z value for this critical fractile is equal to $z = \Phi^{-1}(0.8) = 0.842$. Hence,

$$Q_A^* = e^{6.56 + 0.842 \times 0.83} = 1421 \text{ units}.$$
The critical fractile for Product B is $\alpha_B = (20 - 5)/20 = 0.75$. The z value is $z = \Phi^{-1}(0.75) = 0.674$. Hence,

$$Q_B^* = e^{6.56 + 0.674 \times 0.83} = 1236 \text{ units}.$$
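The two newsvendor quantities above can be reproduced with the Python standard library. This is a minimal sketch rather than the code of the online web application; the function name `newsvendor_lognormal` is an assumption made for illustration.

```python
from math import exp
from statistics import NormalDist


def newsvendor_lognormal(price, cost, mu, sigma, salvage=0.0):
    """Critical fractile, z value, and optimal quantity for
    lognormal demand with location `mu` and scale `sigma`."""
    fractile = (price - cost) / (price - salvage)
    z = NormalDist().inv_cdf(fractile)
    return fractile, z, exp(mu + z * sigma)


for name, price in (("A", 25.0), ("B", 20.0)):
    fractile, z, quantity = newsvendor_lognormal(price, 5.0, 6.56, 0.83)
    print(name, round(fractile, 2), round(z, 3), round(quantity))
```

Because the code uses the unrounded z value, the quantity for Product A can differ by about one unit from the figure obtained with z rounded to 0.842.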
The total capacity employed to produce the optimal quantities for both products is then equal to $Q_A^* + Q_B^* = 2657$ units.

Example 6.2 Calculate the optimal order quantities in Python if the total capacity of the shared resource is equal to 2000 units.

Solution 6.2 The capacity is now less than the sum of the optimal order quantities calculated in the solution of Example 6.1 (i.e. 2000 < 2657). The optimal quantities with limited capacity can be calculated by employing the nested-allocation algorithm given in the “Resource Allocation and Capacity Management” section. The optimal quantities are $Q_A = 1086$ and $Q_B = 914$, and the marginal profit is the same for both products and equal to 2.56. Python codes of this example are given in the online web application.

Example 6.3 Calculate the optimal order quantities in Python when the total capacity of the shared resource is equal to 500 units. Then, repeat this example when the total capacity is 300 units and discuss the results.

Solution 6.3 When the total capacity of the shared resource is equal to 500 units, we have $Q_A = 369$ and $Q_B = 131$, with a marginal profit of 14.5 for both products. When the total capacity of the shared resource is equal to 300 units, we have $Q_A = 300$ and $Q_B = 0$,
with marginal profits of 16.2 for Product A and 15.0 for Product B. Python codes of this example are given in the online web application. The profit margin of Product A is equal to $20, while it is $15 for Product B. The nested-allocation policy allocates the capacity only to Product A until the marginal profit of Product A drops to $15. Then, it starts to allocate the capacity jointly to both products. When the capacity is 300 units, it allocates the whole capacity to Product A, which in turn reduces the marginal profit of Product A from $20 to $16.2. But the marginal profit of Product A is still higher than $15 (i.e. the profit margin of Product B), so it is not worthwhile for the manufacturer to allocate any of its scarce capacity to Product B. When the capacity is 500 units, the nested-allocation policy allocates the capacity to both products. But most capacity is utilized to produce Product A, because it has a higher profit margin than Product B.

Example 6.4 Suppose that the manufacturer starts the production 4 months in advance of the selling season. She also starts to update the demand forecasts at the beginning of each month according to the multiplicative demand model. Calculate the demand parameters of the multiplicative model that corresponds to the demand distribution following a lognormal distribution with a location parameter of 6.56 and a scale parameter of 0.83.

Solution 6.4 We normalize the lead time of 4 months to 1. Given that both products have an identical demand distribution, the multiplicative demand model parameters are the same for both products. The initial demand forecast is:

$$D_0 = e^{6.56 + 0.83^2/2} = 997 \text{ units}.$$

The volatility parameter of the multiplicative demand model is the same as the scale parameter of the lognormal distribution ($\sigma = 0.83$). The actual demand realizes at time $t_4 = 1$. The monthly production orders are placed at the beginning of each month: $t_0 = 0$, $t_1 = 0.25$, $t_2 = 0.5$, and $t_3 = 0.75$. The demand distribution at the beginning of each month is formalized as follows:

$$D_4 \mid D_i \sim \log\text{-}\mathcal{N}\!\left(\ln(D_i) - \frac{\sigma^2}{2}(1 - t_i),\ \sigma\sqrt{1 - t_i}\right) = \log\text{-}\mathcal{N}\!\left(\ln(D_i) - 0.344\,(1 - t_i),\ 0.83\sqrt{1 - t_i}\right).$$
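The parameters derived in Solution 6.4 can be evaluated numerically. The sketch below uses only the standard library; the function names and the forecast value of 900 units used in the second call are hypothetical and serve only to show how the conditional parameters change over time.

```python
from math import exp, log, sqrt

MU, SIGMA = 6.56, 0.83


def initial_forecast(mu=MU, sigma=SIGMA):
    """Mean of the lognormal demand, used as the forecast at t = 0."""
    return exp(mu + sigma ** 2 / 2)


def conditional_params(forecast, t, sigma=SIGMA):
    """Location and scale of the lognormal distribution of final demand,
    given the forecast observed at normalized time t (0 <= t < 1)."""
    remaining = 1.0 - t
    location = log(forecast) - sigma ** 2 / 2 * remaining
    scale = sigma * sqrt(remaining)
    return location, scale


d0 = initial_forecast()
print(round(d0))                      # 997
print(conditional_params(d0, 0.0))    # recovers approximately (6.56, 0.83)
print(conditional_params(900.0, 0.5)) # hypothetical forecast at t = 0.5
```

At the initial time the conditional parameters coincide with the original location and scale parameters, and the scale shrinks as the selling season approaches.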
Example 6.5 Suppose that the monthly production capacity of the shared resource is equal to 700 units. Calculate in Python the optimal order quantities for each product in each month if the forecasts of Product A are updated such that $D_0 = 1000$, $D_1 = 900$, $D_2 = 900$, and $D_3 = 800$, and the forecasts of Product B are updated as $D_0 = 1000$, $D_1 = 1100$, $D_2 = 1200$, and $D_3 = 1300$.
Solution 6.5 We now use the algorithm given in Sect. 6.3 to determine the optimal order quantities in each month for each product. Python codes of this example are given in the online web application. The order quantities for Product A are found such that:

$$Q_0 = 375 \text{ units}, \quad Q_1 = 268 \text{ units}, \quad Q_2 = 260 \text{ units}, \quad Q_3 = 138 \text{ units}.$$

Therefore, the total amount produced for Product A is equal to 1041 units. The order quantities for Product B are found as:

$$Q_0 = 191 \text{ units}, \quad Q_1 = 419 \text{ units}, \quad Q_2 = 440 \text{ units}, \quad Q_3 = 529 \text{ units}.$$
Therefore, the total amount produced for Product B is equal to 1579 units. The sum of order quantities for the third month is equal to 260 + 440 = 700, which is equal to the available capacity per month. The capacity constraint is binding only for the third month, such that the marginal profit for each product is the same and equal to 0.37 for the third month. For the other months, the sum of order quantities is less than 700 units, and therefore, the capacity is not fully utilized.

Example 6.6 Suppose that actual demand values realized at $t_4$ are 800 units for Product A and 1300 units for Product B. Compare the results of Examples 6.1 and 6.5 to show the value of multiple ordering flexibility (in Example 6.5) over the single-ordering newsvendor policy (in Example 6.1).

Solution 6.6 With the monthly production capacity of 700 units, the total capacity over the lead time of 4 months is equal to 2800 units. This amount is more than the sum of newsvendor order quantities found in Example 6.1, which is equal to 1421 + 1236 = 2657 units. If the manufacturer has to place a single order for each product at the very beginning, the order quantities must be the newsvendor order quantities found in Example 6.1, because the capacity does not limit the order quantities. In this case, the manufacturer faces excess inventory of 1421 − 800 = 621 units for Product A and an inventory shortage of 1300 − 1236 = 64 units for Product B. Given that the overage cost of Product A is $5 per unit and the underage cost of Product B is $15 per unit, the cost of supply-demand mismatches is equal to:

$$5 \times 621 + 15 \times 64 = \$4065.$$

With the multiple ordering flexibility, the manufacturer faces excess inventory of 1041 − 800 = 241 units for Product A and also excess inventory of 1579 − 1300 = 279 units for Product B. The cost of supply-demand mismatches is then equal to:

$$5 \times 241 + 5 \times 279 = \$2600.$$

Therefore, the value of multiple ordering over the single-ordering policy is 4065 − 2600 = $1465.
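The mismatch costs compared in Solution 6.6 follow from one small helper. The sketch below is illustrative; the function name `mismatch_cost` is hypothetical, and the underage cost of Product A (25 − 5 = $20) is included for completeness even though Product A faces no shortage here.

```python
def mismatch_cost(supply, demand, overage_cost, underage_cost):
    """Cost of excess inventory plus cost of lost sales."""
    excess = max(supply - demand, 0)
    shortage = max(demand - supply, 0)
    return overage_cost * excess + underage_cost * shortage


demand = {"A": 800, "B": 1300}
overage = {"A": 5, "B": 5}     # production cost, zero residual value
underage = {"A": 20, "B": 15}  # selling price minus production cost

single_order = {"A": 1421, "B": 1236}    # Example 6.1
multiple_order = {"A": 1041, "B": 1579}  # Example 6.5


def total_cost(supply):
    return sum(mismatch_cost(supply[p], demand[p], overage[p], underage[p])
               for p in demand)


print(total_cost(single_order))    # 4065
print(total_cost(multiple_order))  # 2600
print(total_cost(single_order) - total_cost(multiple_order))  # 1465
```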
Example 6.7 We now consider a different setting in which a manufacturer produces two different SKUs (Product X and Y) in two steps. The first step is an assembly process, which is identical for both products. The second step is a painting process, where the product differentiation occurs. The materials used for production are recycled plastics, such that the material cost is negligible. The process cost for the assembly process is $10 per unit. It is also $10 per unit for the painting process. It takes 1 week to complete a production batch for each operation. The demand lead time is set equal to 1 week such that customers are requested to place their orders 1 week in advance of delivery times. Thus, the painting process is the decoupling point. We assume that demand for each product is identical: It is either 10 or 20 units with equal probabilities. The selling price is the same for both products and equal to $50. Calculate the expected profit for the manufacturer (Use this information to answer the questions of Examples 6.7–6.9).

Solution 6.7 The demand for each product is known before the painting process begins. Therefore, the manufacturer can determine the quantities of Product X and Y after observing their demand. The quantity produced through the assembly process is determined under demand uncertainty. The total demand for both Product X and Y can be 20 units with a probability of 25%, 30 units with a probability of 50%, and 40 units with a probability of 25%. The underage cost is the loss of profit, which is equal to 50 − 10 − 10 = $30. The overage cost is only the process cost of the assembly operation, which is equal to $10. Then, the critical fractile is:

$$\alpha = \frac{50 - 10 - 10}{50 - 10} = \frac{30}{30 + 10} = 0.75.$$

Therefore, the optimal order quantity that corresponds to this critical fractile value is equal to 30 units. The expected profit is calculated as follows:

$$0.25 \times (50 \times 20 - 10 \times 20 - 10 \times 30) + 0.75 \times ([50 - 10 - 10] \times 30) = \$800.$$
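The calculation in Solution 6.7 can be traced step by step in Python. This is a minimal sketch under the assumptions of the example; the variable and function names are illustrative.

```python
from itertools import product

PRICE, ASSEMBLY_COST, PAINT_COST = 50, 10, 10

# Distribution of total demand: each product is 10 or 20 with probability 0.5.
total_demand_pmf = {}
for d_x, d_y in product((10, 20), repeat=2):
    total = d_x + d_y
    total_demand_pmf[total] = total_demand_pmf.get(total, 0.0) + 0.25


def expected_profit(q_assembled):
    """Assembly is ordered under uncertainty; painting is applied
    only to the units that are actually sold."""
    return sum(prob * ((PRICE - PAINT_COST) * min(q_assembled, demand)
                       - ASSEMBLY_COST * q_assembled)
               for demand, prob in total_demand_pmf.items())


fractile = (PRICE - ASSEMBLY_COST - PAINT_COST) / (PRICE - PAINT_COST)

# Smallest demand level whose cumulative probability reaches the fractile.
cumulative = 0.0
for demand in sorted(total_demand_pmf):
    cumulative += total_demand_pmf[demand]
    if cumulative >= fractile:
        q_star = demand
        break

print(total_demand_pmf)         # {20: 0.25, 30: 0.5, 40: 0.25}
print(fractile, q_star)         # 0.75 30
print(expected_profit(q_star))  # 800.0
```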
Example 6.8 What is the expected profit for the manufacturer if the assembly and painting operations are reversed?

Solution 6.8 Because the painting process takes place before the assembly process, the product differentiation occurs under demand uncertainty. The underage cost is the loss of profit, which is equal to 50 − 10 − 10 = $30. The overage cost is only the process cost of the painting operation, which is equal to $10. Then, the critical fractile is:

$$\alpha = \frac{50 - 10 - 10}{50 - 10} = \frac{30}{30 + 10} = 0.75.$$
The optimal order quantity that corresponds to this critical fractile value is 15 units for each product. The expected profit is then equal to:

$$2 \times \left[0.5 \times (50 \times 10 - 10 \times 10 - 10 \times 15) + 0.5 \times ([50 - 10 - 10] \times 15)\right] = \$700.$$
Example 6.9 Discuss the results of Examples 6.7 and 6.8.

Solution 6.9 The assembly and painting operations have the same processing cost. For that reason, the difference in the expected profit between Examples 6.7 and 6.8 can be solely attributed to delaying the differentiation point. In other words, the manufacturer can increase the expected profit by $100 by delaying the differentiation point (i.e. processing the painting operation after the assembly operation).
6.8
Exercises
1. Suppose that a manufacturer produces an item by processing raw materials through two operations. The cost of raw materials is negligible. The processing cost is equal to $5 per unit for the first operation, and it is $10 per unit for the second operation. It takes 1 week to complete a production batch for each operation. The demand lead time is set equal to 1 week such that customers are requested to place their orders 1 week in advance of delivery times. Therefore, the demand for the item is known before the second operation begins. We assume that demand is either 10 or 20 units with equal probabilities. The selling price of the product is $25 per unit. Calculate the expected profit.
2. Calculate the expected profit if the operations in the previous question are reversed.
3. Discuss the results of the previous two questions and comment on the value of delaying a costly operation.
6.9
Appendix to Chap. 6
6.9.1
Mean-Variance Analysis Derivations
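The product selection problem is written as follows:

$$\text{Minimize: } \frac{1}{2} w^T \Sigma w,$$

such that:

$$w^T r = 0.06, \qquad w^T \mathbf{1} = 1.$$

Then, the Lagrange formulation is written as follows:

$$J(w, \lambda_1, \lambda_2) = \frac{1}{2} w^T \Sigma w + \lambda_1 (0.06 - w^T r) + \lambda_2 (1 - w^T \mathbf{1}).$$

Then, we obtain:

$$\begin{aligned}
\frac{\partial J(w, \lambda_1, \lambda_2)}{\partial w} &= \Sigma w - \lambda_1 r - \lambda_2 \mathbf{1} = 0, \\
w &= \Sigma^{-1} (\lambda_1 r + \lambda_2 \mathbf{1}), \\
\frac{\partial J(w, \lambda_1, \lambda_2)}{\partial \lambda_1} &= 0.06 - w^T r = 0, \\
\frac{\partial J(w, \lambda_1, \lambda_2)}{\partial \lambda_2} &= 1 - w^T \mathbf{1} = 0.
\end{aligned}$$

Thus,

$$\begin{aligned}
\lambda_1 r^T \Sigma^{-1} r + \lambda_2 \mathbf{1}^T \Sigma^{-1} r &= 0.06, \\
\lambda_1 r^T \Sigma^{-1} \mathbf{1} + \lambda_2 \mathbf{1}^T \Sigma^{-1} \mathbf{1} &= 1.
\end{aligned}$$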
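The two linear conditions on the multipliers form a 2×2 system that can be solved by Cramer's rule. The following Python sketch does this for a hypothetical three-product portfolio; the return and variance figures are invented for illustration, and the covariance matrix is assumed to be diagonal only so that its inverse can be written without a linear algebra library.

```python
def efficient_weights(returns, variances, target=0.06):
    """Weights minimising (1/2) w' S w subject to w'r = target and w'1 = 1,
    for a diagonal covariance matrix S = diag(variances)."""
    inv = [1.0 / v for v in variances]                  # diagonal of S^{-1}
    x1 = sum(r * s for r, s in zip(returns, inv))       # r' S^{-1} 1
    x2 = sum(r * r * s for r, s in zip(returns, inv))   # r' S^{-1} r
    x3 = sum(inv)                                       # 1' S^{-1} 1
    det = x2 * x3 - x1 ** 2
    lam1 = (target * x3 - x1) / det                     # Cramer's rule
    lam2 = (x2 - target * x1) / det
    weights = [s * (lam1 * r + lam2) for r, s in zip(returns, inv)]
    return weights, lam1, lam2


# Hypothetical expected returns and variances for three products.
returns = [0.04, 0.06, 0.09]
variances = [0.01, 0.02, 0.05]
weights, lam1, lam2 = efficient_weights(returns, variances)

print([round(w, 4) for w in weights])
print(round(sum(weights), 6))                                  # budget constraint
print(round(sum(w * r for w, r in zip(weights, returns)), 6))  # target return
```

The last two printed values check that the resulting weights satisfy the budget and target-return constraints.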
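Let $X_1 = r^T \Sigma^{-1} \mathbf{1}$, $X_2 = r^T \Sigma^{-1} r$, and $X_3 = \mathbf{1}^T \Sigma^{-1} \mathbf{1}$. So,

$$\lambda_1 = \frac{0.06 X_3 - X_1}{X_2 X_3 - X_1^2} \quad \text{and} \quad \lambda_2 = \frac{X_2 - 0.06 X_1}{X_2 X_3 - X_1^2}.$$

6.9.2
Multi-product Newsvendor Model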
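The profit function for the multi-product newsvendor problem is formulated as follows:

$$\Pi = \sum_{i \in \mathcal{S}} p_i \min(Q_i, D_i) - \sum_{i \in \mathcal{S}} c_i Q_i + \sum_{i \in \mathcal{S}} s_i \max(Q_i - D_i, 0).$$

For the sake of exposition of the problem, we set the salvage value equal to zero and rewrite the expression as:

$$\Pi = \sum_{i \in \mathcal{S}} p_i \min(Q_i, D_i) - \sum_{i \in \mathcal{S}} c_i Q_i.$$

We now define a variable $W_i$ denoting the sales of each product:

$$W_i = \min(Q_i, D_i).$$

Total production capacity is denoted by C, which can be considered as the capacity of the shared resource. Then, the ordering problem of the decision-maker is written as follows:

$$\begin{aligned}
\underset{Q_i,\ \forall i \in \mathcal{S}}{\text{Maximize}} \quad & \sum_{i \in \mathcal{S}} p_i W_i - \sum_{i \in \mathcal{S}} c_i Q_i && (6.1) \\
\text{s.t.} \quad & W_i - Q_i \le 0, \quad \forall i \in \mathcal{S}, && (6.2) \\
& W_i \le D_i, \quad \forall i \in \mathcal{S}, && (6.3) \\
& \sum_{i \in \mathcal{S}} Q_i \le C, && (6.4) \\
& Q_i \ge 0,\ W_i \ge 0, \quad \forall i \in \mathcal{S}. && (6.5)
\end{aligned}$$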
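The problem cannot be solved in its current form because demand values are stochastic. Therefore, the model is not a linear programming (LP) model. To make the formulation given by the expressions (6.1)–(6.5) an LP model, we should extend the problem space. Let $\gamma_{ir}$ denote a realization of demand for product i such that $r \in \mathcal{R}$, where $\mathcal{R} = \{1, 2, \cdots\}$ is a large finite set of positive integers. Thus, $\gamma_{ir}$ $\forall r \in \mathcal{R}$ includes all possible realizations of $D_i$. Then, we can reformulate the problem as an LP model:

$$\begin{aligned}
\underset{Q_i,\ \forall i \in \mathcal{S}}{\text{Maximize}} \quad & \sum_{i \in \mathcal{S}} \sum_{r \in \mathcal{R}} p_i \Pr(\gamma_{ir}) W_{ri} - \sum_{i \in \mathcal{S}} c_i Q_i && (6.6) \\
\text{s.t.} \quad & W_{ri} - Q_i \le 0, \quad \forall i \in \mathcal{S},\ \forall r \in \mathcal{R}, && (6.7) \\
& W_{ri} \le \gamma_{ir}, \quad \forall i \in \mathcal{S},\ \forall r \in \mathcal{R}, && (6.8) \\
& \sum_{i \in \mathcal{S}} Q_i \le C, && (6.9) \\
& Q_i \ge 0,\ W_{ri} \ge 0, \quad \forall i \in \mathcal{S},\ \forall r \in \mathcal{R}, && (6.10)
\end{aligned}$$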
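where $\Pr(\gamma_{ir})$ is the probability value of demand $D_i$ being equal to $\gamma_{ir}$. For an LP model, an equivalent dual model can be developed. We use the $\sim$ sign over parameters to denote the dual variables. Then, the dual of this LP is written as follows:

$$\begin{aligned}
\underset{\tilde{\lambda}}{\text{Minimize}} \quad & \sum_{i \in \mathcal{S}} \sum_{r \in \mathcal{R}} \tilde{\beta}_{ir} \gamma_{ir} + \tilde{\lambda} C && (6.11) \\
\text{s.t.} \quad & \tilde{\alpha}_{ir} + \tilde{\beta}_{ir} \ge p_i \Pr(\gamma_{ir}), \quad \forall r \in \mathcal{R},\ \forall i \in \mathcal{S}, && (6.12) \\
& -\sum_{r \in \mathcal{R}} \tilde{\alpha}_{ir} + \tilde{\lambda} \ge -c_i, \quad \forall i \in \mathcal{S}, && (6.13) \\
& \tilde{\alpha}_{ir} \ge 0,\ \tilde{\beta}_{ir} \ge 0,\ \tilde{\lambda} \ge 0, \quad \forall r \in \mathcal{R},\ \forall i \in \mathcal{S}. && (6.14)
\end{aligned}$$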
For the demand realizations such that $\gamma_{ir} \ge Q_i$, we have the following results:

1. $W_{ri} = Q_i$ and the constraint (6.7) becomes binding.
2. The constraint (6.8) becomes non-binding. Due to the complementary slackness theorem, $\{\tilde{\beta}_{ir} \mid \gamma_{ir} \ge Q_i\}$ $\forall r \in \mathcal{R}$ are equal to zero.

For the demand realizations such that $\gamma_{ir} \le Q_i$, we have the following results:

1. $W_{ri} = \gamma_{ir}$ and the constraint (6.7) becomes non-binding. Due to the complementary slackness theorem, $\{\tilde{\alpha}_{ir} \mid \gamma_{ir} \le Q_i\}$ $\forall r \in \mathcal{R}$ are equal to zero.
2. The constraint (6.8) becomes binding.

Then,

$$\begin{aligned}
\tilde{\alpha}_{ir} &= \{p_i \Pr(\gamma_{ir}) \mid \gamma_{ir} \ge Q_i\}, \\
\tilde{\beta}_{ir} &= \{p_i \Pr(\gamma_{ir}) \mid \gamma_{ir} \le Q_i\}.
\end{aligned}$$

Plugging these results into constraint (6.13) yields:

$$\tilde{\lambda} \ge p_i \sum_{\gamma_{ir} \ge Q_i} \Pr(\gamma_{ir}) - c_i = p_i (1 - F(Q_i)) - c_i.$$

We remark that there is an adjustment factor in the objective function of the dual model such that $\tilde{\lambda}$ can be considered as the marginal value of the capacity. Thus, the objective of the dual problem is to minimize the $\tilde{\lambda}$ value to guarantee that the capacity is allocated in the most effective way. Then, the dual problem can be written in the following form:

$$\begin{aligned}
\underset{Q_i,\ \forall i \in \mathcal{S}}{\text{Minimize}} \quad & \tilde{\lambda} && (6.15) \\
\text{s.t.} \quad & \tilde{\lambda} \ge p_i (1 - F(Q_i)) - c_i, \quad \forall i \in \mathcal{S}, && (6.16) \\
& \tilde{\lambda} \ge 0,\ Q_i \ge 0, \quad \forall i \in \mathcal{S}. && (6.17)
\end{aligned}$$
To minimize the objective function in Eq. (6.15), the maximum of the marginal profit (i.e. .pi (1 − F (Qi )) − ci ) should be minimized. Therefore, the capacity has to be allocated to the products in a nested scheme starting with the products that have the highest marginal profit. This approach in turn minimizes the maximum marginal profit.
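The nested scheme described above can be sketched numerically with the data of Examples 6.1–6.3. The code below is not the algorithm from the “Resource Allocation and Capacity Management” section or from the online web application; it is an assumed alternative that searches by bisection for the common marginal profit level at which the capacity is exactly used, and the names `quantity_at` and `allocate` are hypothetical.

```python
from math import exp
from statistics import NormalDist

MU, SIGMA = 6.56, 0.83                           # lognormal demand parameters
PRODUCTS = {"A": (25.0, 5.0), "B": (20.0, 5.0)}  # name: (price, cost)
STD_NORMAL = NormalDist()


def quantity_at(level, price, cost):
    """Quantity at which the marginal profit p(1 - F(Q)) - c equals `level`;
    zero if even the first unit earns less than `level`."""
    tail = (level + cost) / price  # required value of 1 - F(Q)
    if tail >= 1.0:
        return 0.0
    return exp(MU + SIGMA * STD_NORMAL.inv_cdf(1.0 - tail))


def allocate(capacity, tol=1e-9):
    """Common marginal profit level and quantities that use `capacity`."""
    def used(level):
        return sum(quantity_at(level, p, c) for p, c in PRODUCTS.values())

    lo, hi = 0.0, max(p - c for p, c in PRODUCTS.values())
    if used(lo) <= capacity:  # capacity is not binding
        level = lo
    else:
        while hi - lo > tol:  # used() is decreasing in the level
            mid = (lo + hi) / 2
            if used(mid) > capacity:
                lo = mid
            else:
                hi = mid
        level = hi
    return level, {n: quantity_at(level, p, c) for n, (p, c) in PRODUCTS.items()}


for capacity in (2000, 500, 300):
    level, quantities = allocate(capacity)
    print(capacity, round(level, 2), {n: round(q) for n, q in quantities.items()})
```

With a capacity of 300 units the search settles on a level above $15, so Product B receives no capacity, which agrees with the discussion in Solution 6.3. Small differences from the rounded figures reported in the solutions are to be expected.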
References

Biçer, I., Lücker, F., & Boyaci, T. (2022). Beyond retail stores: Managing product proliferation along the supply chain. Production and Operations Management, 31(3), 1135–1156.
Biçer, I., & Seifert, R. W. (2017). Optimal dynamic order scheduling under capacity constraints given demand-forecast evolution. Production and Operations Management, 26(12), 2266–2286.
Brynjolfsson, E., Hu, Y., & Simester, D. (2011). Goodbye Pareto principle, hello long tail: The effect of search costs on the concentration of product sales. Management Science, 57(8), 1373–1386.
Cachon, G. P., & Lariviere, M. A. (1999). Capacity allocation using past sales: When to turn-and-earn. Management Science, 45(5), 685–703.
E2Open (2018). Forecasting and inventory benchmark study. Corporate Report. https://www.e2open.com/demand-sensing-forecasting-and-inventory-benchmark-study-2018/
Gasparro, A., Bunge, J., & Haddon, H. (2020). Why the American consumer has fewer choices—maybe for good. Wall Street Journal. https://www.wsj.com/articles/why-the-americanconsumer-has-fewer-choicesmaybe-for-good-11593230443
Heskett, J. L., & Signorelli, S. (1989). Benetton (A). Harvard Business School Case: 9-685-014.
Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.
Mocker, M., & Ross, J. W. (2017). The problem with product proliferation. Harvard Business Review, 95(3), 104–110.
Murray, A. (2010). The end of management. Wall Street Journal. https://www.wsj.com/articles/SB10001424052748704476104575439723695579664
Parsons, J. C. W., & Graves, S. C. (2005). Reebok NFL replica jerseys: A case for postponement. MIT Sloan School of Management, SCM Case.
Randall, T., & Ulrich, K. (2001). Product variety, supply chain structure, and firm performance: Analysis of the US bicycle industry. Management Science, 47(12), 1588–1604.
7 Managing the Supply Risk

Keywords: Supply disruption · Delivery shortfall · Incoterms · Risk mitigation inventory · Reactive capacity
In the previous chapters, we have mainly focused on demand uncertainty while assuming that the quantity ordered would be delivered in full at the requested delivery time. In practice, however, companies are often exposed to a supply risk such that the ordered items are not delivered on time and in full due to disruptions in their supply chains. In the first quarter of 2021, for example, US manufacturing firms were severely hurt by supply disruptions due to port congestion, semiconductor shortages, and severe weather conditions (Meko & Esteban, 2021; McLain et al., 2021). Companies in other geographical areas were also hurt by supply disruptions for various reasons. For example, the blockage of the Suez Canal for 6 days in March 2021 caused supply disruptions in many industries in Europe, and the cost was estimated to be several billion dollars for world trade (Paris & Malsin, 2021).

The digitalization of supply chains and the adoption of Internet technologies have long made it possible to source products from different geographical areas in the world. Thus, global supply chains have become very complex and vulnerable to disruption risks during the last two decades. Companies that have well-articulated mitigation and contingency plans against supply risks can minimize the negative consequences of those risks and potentially increase their market shares (Tomlin, 2006). In 2000, a fire in the Philips Semiconductor plant caused component shortages for both Nokia and Ericsson (Tomlin, 2006). Nokia responded to this incident by increasing its production quantities in the other suppliers’ facilities owing to its multiple sourcing strategy, whereas Ericsson did not have any contingency plan. In the aftermath of this disruption, Ericsson lost an estimated $400 million in potential revenues. However, Nokia was not severely affected by it (Tomlin, 2006).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 I. Biçer, Supply Chain Analytics, Springer Texts in Business and Economics, https://doi.org/10.1007/978-3-031-30347-0_7
Companies may follow three alternative strategies to mitigate supply risk (Tomlin, 2006; Lücker et al., 2019). As exemplified by Nokia, the first strategy is the multiple sourcing strategy, which makes it possible to utilize a backup source when the main source of the supply is disrupted. The capacity of the backup source can be considered the reactive capacity, which can be reserved by paying a capacity reservation fee to a reliable backup source. The second strategy is to build a risk-mitigation inventory. Companies that anticipate a supply disruption may consider building up inventory to mitigate disruption risk. If a disruption occurs, the risk-mitigation inventory helps fulfil customer demand during the disruption. If no disruption occurs, companies would incur an extra cost of holding the risk-mitigation inventory unnecessarily. The third strategy is demand management, such that during the disruption, customer demand for a particular item is shifted to alternative in-stock products.

When a company experiences supply disruption, the maximum potential risk exposure depends on the operational dynamics and contractual relationship between the company and its suppliers. Companies sometimes take responsibility for the items sourced from an offshore supplier when they are shipped. If a disruption occurs during the shipment, the companies incur both the cost of holding inventory and the cost of lost sales. We refer to this type of disruption as a “Type-1 disruption”. In some cases, suppliers are responsible for the shipments until they are delivered. Then, the companies incur only the cost of lost sales if a disruption occurs during the shipment. We refer to this type of disruption as a “Type-2 disruption”. In other cases, shipments are partially completed such that some items are damaged during the shipment. Therefore, the quantity delivered in good condition is less than the original order quantity. We refer to this type of disruption as a “Type-3 disruption”.
In this chapter, we focus on developing analytical models for each type of disruption risk and quantify the benefits of the risk-mitigation strategies.
7.1 Type-1 Disruption Risk
Companies are exposed to a Type-1 disruption risk when they have full responsibility for the shipments of the items ordered from offshore suppliers. Logistics service providers, such as Maersk, use some well-known routes and ports to transport goods from offshore countries to domestic markets. However, big retailers often prefer to transport their goods using alternative routes and small ports. To do so, they charter their own ships and take full responsibility for the shipments. For example, Walmart, Costco, Home Depot, and Target followed this approach to stock items for the 2021 Christmas season (Nassauer & Paris, 2021). When the buyer in a supply chain takes full responsibility for the shipment, the cost of the disruption for the buyer is higher than it is when the supplier takes full responsibility. In the former case, the buyer incurs both the inventory holding cost and the cost of lost sales when a disruption delays the shipment. In the latter case, the buyer incurs only the cost of lost sales.
We use $b$ and $h$ to denote the penalty cost per unit of unfulfilled demand (including the cost of lost sales) and the inventory holding cost per unit, respectively. In the absence of a disruption, the cost function is formulated as follows:

$$\Pi(Q \mid \text{no disruption}) = b \max(D - Q, 0) + h \max(Q - D, 0),$$

where $Q$ is the order quantity and $D$ is the amount of customer demand. The first term on the right-hand side of this expression gives the total penalty cost that occurs when the demand exceeds the order quantity. The second term is the total inventory holding cost, which occurs when the order quantity exceeds the demand. The demand for the product is uncertain and follows a statistical distribution whose probability density function is denoted by $f(\cdot)$. We denote by $\theta_d$ the probability of a disruption. If a disruption occurs, the cost function is formulated as follows:

$$\Pi(Q \mid \text{disruption}) = bD + hQ.$$

Then, the expected cost function is:

$$\Pi(Q) = \theta_d\, \Pi(Q \mid \text{disruption}) + (1 - \theta_d)\, \Pi(Q \mid \text{no disruption})$$

$$= \theta_d \left[ b\, E(D) + hQ \right] + (1 - \theta_d) \left[ b \int_Q^{+\infty} (x - Q) f(x)\, \partial x + h \int_0^Q (Q - x) f(x)\, \partial x \right],$$

where $E(D) = \int_0^{+\infty} x f(x)\, \partial x$ is the mean demand. To find the optimal order quantity, we take the first derivative with respect to $Q$ and set it equal to zero. Thus,

$$\frac{\partial \Pi(Q)}{\partial Q} = \theta_d h + (1 - \theta_d)\left(-b(1 - F(Q)) + hF(Q)\right) = 0,$$

where $F(\cdot)$ is the cumulative distribution function associated with $f(\cdot)$. The optimal $Q$ value that satisfies the last expression is:

$$F(Q^*) = \frac{b}{b + h} - \frac{h\theta_d}{(1 - \theta_d)(b + h)}.$$
The first term on the right-hand side of this expression is the exact equivalent of the newsvendor critical fractile, such that the optimal order quantity in the absence of the disruption risk is given by this term. The second term can be considered the adjustment factor due to the disruption risk. As the disruption probability increases, so does the second term, which in turn leads to a reduction of the optimal order level. This result is intuitive and consistent with the rationale of the inventory
dynamics. The expected duration of holding inventory increases with the disruption risk. This leads to an increase in the total inventory cost, given that the buyer has full responsibility for the inventory in the event of a disruption. For that reason, the optimal order quantity decreases as the disruption risk increases. Another important aspect of the second term is that it provides valuable insights regarding the impact of the holding and penalty costs on the optimal order quantity. For high profit margin products such that $b \gg h$, the adjustment factor approaches zero, even for a high value of $\theta_d$. Therefore, decision-makers should not respond to the disruption risk by decreasing the order quantity substantially when the profit margin of the product is relatively high.

We now consider an example of a buyer (e.g. a retailer) that purchases $Q$ units of a product from an offshore supplier. The demand for the product in the market is uncertain and follows a lognormal distribution with a location parameter of $\mu$ and a scale parameter of $\sigma$. We assume that both the mean demand and the standard deviation are equal to 1000 units. Thus, the coefficient of variation (CV), which is the ratio of the standard deviation to the mean demand, is equal to one. The lognormal parameters are then found as:

$$e^{\sigma^2} - 1 = CV^2, \qquad \sigma = 0.83,$$

$$e^{\mu + \sigma^2/2} = 1000, \qquad \mu = \ln(1000) - 0.83^2/2 = 6.56.$$

The penalty cost for each unit of demand that is not fulfilled is \$40, including the cost of lost sales and loss of goodwill. The holding cost of each unit of excess inventory is \$10. We present the optimal order quantity for this example as a function of the disruption probability $\theta_d$ in Fig. 7.1. When $\theta_d = 0$, there is no disruption risk, and the optimal order quantity is equivalent to the newsvendor order quantity, which is equal to 1412 units. The optimal order quantity decreases as the $\theta_d$ value increases. It drops to zero when $\theta_d$ becomes equal to 80%. As shown in the figure, its value stays at zero for the $\theta_d$ values that are higher than 80%. These results can be interpreted as follows. The newsvendor solution of the problem without any disruption risk aims to target the critical fractile value of 80%:

$$F(Q^*) = \frac{b}{b + h} = 0.8.$$

Based on this newsvendor solution, the buyer aims to keep the in-stock probability (i.e. the probability of the demand being less than the order quantity) at the 80% level. As the $\theta_d$ value increases, the target in-stock probability is reduced by $h\theta_d/[(1 - \theta_d)(b + h)]$, which is equal to 0.8 for $\theta_d = 0.8$. For that reason, the buyer should not order anything when the disruption probability becomes greater than or equal to 80%.

Fig. 7.1 Optimal order quantity versus the disruption probability (x-axis: disruption probability; y-axis: optimal order quantity)

We now consider a case in which the buyer has a backup supplier that reserves a reactive capacity for the buyer. The capacity is only utilized when a disruption occurs. The buyer pays a capacity reservation fee of $c_k$ per unit capacity. We use $C$ to denote the total capacity reserved by the backup supplier. In the absence of a disruption, the cost function is written as follows:

$$\Pi(Q \mid \text{no disruption}) = b \max(D - Q, 0) + h \max(Q - D, 0) + c_k C.$$

If a disruption occurs, the cost function is updated such that:

$$\Pi(Q \mid \text{disruption}) = b \max(D - C, 0) + hQ + c_k C.$$

The last term on the right-hand side appears in both expressions, because the buyer incurs the capacity reservation cost regardless of whether or not a disruption occurs.
Then, the expected cost is found by:

$$\Pi(Q, C) = \theta_d\, \Pi(Q \mid \text{disruption}) + (1 - \theta_d)\, \Pi(Q \mid \text{no disruption})$$

$$= \theta_d \left[ b \int_C^{+\infty} (x - C) f(x)\, \partial x + hQ \right] + (1 - \theta_d) \left[ b \int_Q^{+\infty} (x - Q) f(x)\, \partial x + h \int_0^Q (Q - x) f(x)\, \partial x \right] + c_k C.$$

The first derivative with respect to $Q$ is the same as above. Therefore, the optimal $Q$ value satisfies the expression:

$$F(Q^*) = \frac{b}{b + h} - \frac{h\theta_d}{(1 - \theta_d)(b + h)}.$$

Hence, Fig. 7.1 also applies to the case with the backup supplier. We also take the derivative of $\Pi(Q, C)$ with respect to $C$ and set it equal to zero to find the optimal $C$ value:

$$\frac{\partial \Pi(Q, C)}{\partial C} = -\theta_d b (1 - F(C)) + c_k = 0.$$

Thus, the optimal value of $C$ satisfies:

$$F(C^*) = \frac{\theta_d b - c_k}{\theta_d b}.$$
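The two fractile conditions derived so far can be evaluated numerically. The following Python sketch is an illustration under stated assumptions, not the code behind the chapter's figures: it assumes lognormal demand with a mean and standard deviation of 1000 units, a penalty cost of 40, a holding cost of 10, and a reservation fee of 10 per unit of capacity (the values of the chapter's running example). The function names are hypothetical, and a fractile at or below zero is mapped to a quantity of zero.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_params(mean, sd):
    """Location and scale of a lognormal with the given mean and standard deviation."""
    sigma = sqrt(log(1.0 + (sd / mean) ** 2))
    mu = log(mean) - sigma ** 2 / 2.0
    return mu, sigma


def lognormal_quantile(p, mu, sigma):
    """Inverse lognormal CDF; a non-positive target fractile means 'order nothing'."""
    if p <= 0.0:
        return 0.0
    return exp(mu + sigma * NormalDist().inv_cdf(p))


def order_fractile_type1(theta_d, b, h):
    """F(Q*) = b/(b+h) - h*theta_d / ((1-theta_d)*(b+h)) for a Type-1 risk."""
    if theta_d >= 1.0:
        return 0.0  # a certain disruption: the regular order never arrives in time
    return b / (b + h) - h * theta_d / ((1.0 - theta_d) * (b + h))


def capacity_fractile(theta_d, b, c_k):
    """F(C*) = (theta_d*b - c_k) / (theta_d*b), floored at zero."""
    if theta_d * b <= c_k:
        return 0.0
    return (theta_d * b - c_k) / (theta_d * b)


# Assumed example values: mean = sd = 1000, b = 40, h = 10, c_k = 10.
mu, sigma = lognormal_params(1000.0, 1000.0)
b, h, c_k = 40.0, 10.0, 10.0

for theta_d in (0.0, 0.25, 0.5, 0.9, 1.0):
    q_star = lognormal_quantile(order_fractile_type1(theta_d, b, h), mu, sigma)
    c_star = lognormal_quantile(capacity_fractile(theta_d, b, c_k), mu, sigma)
    print(f"theta_d={theta_d:.2f}  Q*={q_star:8.1f}  C*={c_star:8.1f}")
```

Values computed this way should be close to those quoted in the text, although small differences can arise from how the lognormal parameters are rounded.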
We now extend the example given in Fig. 7.1 by considering the availability of a backup supplier. The capacity reservation cost is assumed to be \$10 per unit. In our derivations, we do not include the operating cost of utilizing the capacity, nor the purchasing cost from the offshore supplier. Our model implicitly assumes that the operating cost for the backup supplier is the same as the purchasing cost from the offshore supplier, because the two cost parameters (i.e. penalty and holding costs) are the same for both the disruption and no-disruption cases. In Fig. 7.2, we present the optimal capacity as a function of $\theta_d$, where we use the same values of the demand parameters. We obtain $\theta_d b = c_k$ for $\theta_d = 0.25$, which makes $F(C^*) = 0$. Therefore, the capacity level is equal to zero for $\theta_d \le 0.25$. It then increases with $\theta_d$ for $\theta_d > 0.25$. When $\theta_d = 1$, the optimal capacity level becomes equal to 1228 units. At this point, demand is fulfilled by utilizing the reactive capacity only, and the buyer aims to satisfy a target in-stock probability of 75% (i.e. $(\theta_d b - c_k)/(\theta_d b) = 0.75$).

Fig. 7.2 Optimal reactive capacity versus the disruption probability (x-axis: disruption probability; y-axis: optimal reactive capacity)

In some cases, buyers keep risk-mitigation inventory to fulfil customer demand when a disruption occurs (Tomlin, 2006; Lücker et al., 2019). For example, some pharmaceutical companies keep a stock of their life-saving drugs to ensure a continuous supply for patients (Lücker et al., 2019). Public health agencies (e.g. FDA) often demand that pharmaceutical companies improve the resilience of their supply chains and report more information about their risk assessment. Especially during the COVID-19 pandemic, the US Coronavirus Aid, Relief, and Economic Security (CARES) Act in 2020 and the 100-Day Review Plan by the Biden Administration emphasized the importance of building resilient supply chains to avoid disruptions (Colvill et al., 2021).

We now focus on optimizing the risk-mitigation inventory in our disruption model. The risk-mitigation inventory is distinguished from the regular inventory in that it can only be used when a disruption occurs (Lücker et al., 2019). We use $Q_{RMI}$ to denote the risk-mitigation inventory. Companies can build the risk-mitigation inventory by increasing the production and order quantity and keeping the excess inventory to mitigate the disruption risk in further selling periods. Therefore, the holding cost for $Q_{RMI}$ is the same as that of any excess inventory, which is denoted by $h$ dollars per unit. If a buyer keeps $Q_{RMI}$ units of the risk-mitigation inventory, the cost function in the absence of a disruption is formulated as:

$$\Pi(Q \mid \text{no disruption}) = b \max(D - Q, 0) + h \max(Q - D, 0) + hQ_{RMI}.$$
If a disruption occurs, the cost function becomes:

$$\Pi(Q \mid \text{disruption}) = b \max(D - Q_{RMI}, 0) + h \max(Q_{RMI} - D, 0) + hQ.$$

Then, the expected cost is found by:

$$\Pi(Q, Q_{RMI}) = \theta_d\, \Pi(Q \mid \text{disruption}) + (1 - \theta_d)\, \Pi(Q \mid \text{no disruption})$$

$$= \theta_d \left[ b \int_{Q_{RMI}}^{+\infty} (x - Q_{RMI}) f(x)\, \partial x + h \int_0^{Q_{RMI}} (Q_{RMI} - x) f(x)\, \partial x + hQ \right]$$

$$+ (1 - \theta_d) \left[ b \int_Q^{+\infty} (x - Q) f(x)\, \partial x + h \int_0^Q (Q - x) f(x)\, \partial x + hQ_{RMI} \right].$$

The first derivative with respect to $Q$ is the same as above. Therefore, the optimal $Q$ value satisfies the expression:

$$F(Q^*) = \frac{b}{b + h} - \frac{h\theta_d}{(1 - \theta_d)(b + h)},$$

and Fig. 7.1 also applies to risk-mitigation inventory. We also take the derivative of $\Pi(Q, Q_{RMI})$ with respect to $Q_{RMI}$ and set it equal to zero to find the optimal $Q_{RMI}$ value:

$$\frac{\partial \Pi(Q, Q_{RMI})}{\partial Q_{RMI}} = \theta_d \left[-b(1 - F(Q_{RMI})) + hF(Q_{RMI})\right] + (1 - \theta_d) h = 0.$$

Thus, the optimal value of $Q_{RMI}$ satisfies:

$$F(Q^*_{RMI}) = \frac{\theta_d b - (1 - \theta_d) h}{\theta_d (b + h)} = 1 - \frac{h}{\theta_d (b + h)}.$$
We remark that this expression is a different and simplified version of the analytical derivation given by Lücker et al. (2019), where the authors take into consideration the uncertainty about the disruption duration. We again extend our previous example given in Fig. 7.1 and consider the possibility of holding the risk-mitigation inventory. We use the same values for the demand and cost parameters. We present the results in Fig. 7.3, where the x-axis represents the disruption probability $\theta_d$ and the y-axis represents the optimal risk-mitigation inventory $Q^*_{RMI}$. The results indicate that the buyer should not hold any risk-mitigation inventory for $\theta_d \le 0.2$. However, the optimal $Q_{RMI}$ level increases at a decreasing rate as $\theta_d$ increases for $\theta_d > 0.2$. We provide the interactive examples of Figs. 7.1, 7.2 and 7.3 with their Python codes in the online web application, where users can specify the input parameters and get the results.

Fig. 7.3 Risk-mitigation inventory versus the disruption probability (x-axis: disruption probability; y-axis: optimal risk-mitigation inventory)

In this section, we have considered three alternative strategies that a buyer may follow to manage a Type-1 disruption risk. The first strategy, given in Fig. 7.1, can be considered the passive mitigation strategy, where the buyer determines the optimal order quantity in the presence of the disruption risk. However, the buyer does not aim to mitigate the disruption risk completely by reserving capacity at a reliable supplier or holding risk-mitigation inventory. The second strategy, given in Fig. 7.2, is the capacity-based risk-mitigation strategy, where the buyer reserves capacity at a reliable supplier to fulfil the customers’ demand when a disruption occurs. Finally, the third strategy, given in Fig. 7.3, focuses on the risk-mitigation inventory to fulfil the customer demand if a disruption occurs. We consider these strategies in isolation from each other to be consistent with the practice. In particular, companies either ignore the disruption risk or adopt only one of the risk-mitigation strategies.

Comparing the derivations of $F(C^*)$ and $F(Q^*_{RMI})$, we observe that the optimal capacity level $C^*$ does not depend on the holding cost, whereas the optimal risk-mitigation inventory $Q^*_{RMI}$ decreases as the holding cost increases. Therefore, companies should mitigate the disruption risk by building the risk-mitigation inventory when the holding cost for a product is relatively low with respect to the penalty cost. This applies to innovative products, because they have high profit margins and high demand uncertainty. Therefore, for innovative products, the cost of stock-outs is expected to be much higher than the holding cost of the inventory. For that reason, decision-makers ought to build the risk-mitigation inventory to mitigate the disruption risk for innovative products.

When the holding cost is relatively high with respect to the penalty cost, building the risk-mitigation inventory would not be appealing for companies, because $Q^*_{RMI}$ decreases as the holding cost increases. In such cases, companies should reserve the reactive capacity at a reliable supplier to mitigate the disruption risk, because the capacity decision does not depend on the holding cost (as can be seen in the derivation of $F(C^*)$). This applies to standard products that have low profit margins and low demand uncertainty. Their holding cost is expected to be higher than the penalty costs. Therefore, decision-makers ought to follow the capacity reservation strategy to mitigate the disruption risk for standard products.
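The contrast between the two mitigation levers can be made concrete with a short numerical sketch. It is only an illustration: it assumes the lognormal demand of the running example (mean and standard deviation of 1000 units), a penalty cost of 40, a reservation fee of 10, and a disruption probability of 0.5 that is chosen arbitrarily. The function names are hypothetical. The holding cost is varied to show that the reactive capacity stays put while the risk-mitigation inventory shrinks.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Assumed demand model: lognormal with mean = sd = 1000 (CV = 1).
SIGMA = sqrt(log(2.0))
MU = log(1000.0) - SIGMA ** 2 / 2.0


def quantile(p):
    """Inverse lognormal CDF; non-positive fractiles map to a level of zero."""
    return 0.0 if p <= 0.0 else exp(MU + SIGMA * NormalDist().inv_cdf(p))


def optimal_capacity(theta_d, b, c_k):
    """C* from F(C*) = (theta_d*b - c_k) / (theta_d*b); independent of h."""
    if theta_d * b <= c_k:
        return 0.0
    return quantile((theta_d * b - c_k) / (theta_d * b))


def optimal_rmi(theta_d, b, h):
    """Q*_RMI from F(Q*_RMI) = 1 - h / (theta_d*(b+h))."""
    if theta_d <= 0.0:
        return 0.0
    return quantile(1.0 - h / (theta_d * (b + h)))


theta_d, b, c_k = 0.5, 40.0, 10.0  # illustrative values, not taken from a figure
for h in (5.0, 10.0, 20.0, 40.0):
    print(f"h={h:5.1f}  C*={optimal_capacity(theta_d, b, c_k):7.1f}  "
          f"Q*_RMI={optimal_rmi(theta_d, b, h):7.1f}")
```

With these inputs the capacity column is constant across rows, whereas the inventory column falls as the holding cost rises and reaches zero once $h/(\theta_d(b+h)) \ge 1$.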
7.2 Type-2 Disruption Risk
We now look at the Type-2 disruption risk, in which a disruption may occur during a shipment and the supplier takes full responsibility for the shipment. Therefore, the supplier owns the inventory during the shipment, so the buyer does not incur a holding cost when a disruption occurs. In other words, the buyer does not tie up capital in inventory until the delivery is completed. The ownership of the inventory during the shipment has important implications for the inventory policy of the buyer. When the buyer has ownership of the inventory (as occurs in a Type-1 disruption risk), we have observed in Fig. 7.1 a decrease in the order quantity of the buyer as the disruption probability increases. Is the ordering decision of the buyer affected by the disruption probability under the Type-2 disruption risk? In the absence of any risk-mitigation strategy, the ordering decision is not affected by the disruption risk. The expected cost is formulated such that:

$$\Pi(Q) = \theta_d\, \Pi(Q \mid \text{disruption}) + (1 - \theta_d)\, \Pi(Q \mid \text{no disruption}),$$

where

$$\Pi(Q \mid \text{disruption}) = bD,$$

$$\Pi(Q \mid \text{no disruption}) = b \max(D - Q, 0) + h \max(Q - D, 0).$$
We remark that there is no holding cost term in $\Pi(Q \mid \text{disruption})$ under a Type-2 risk, which is not the case for a Type-1 risk. Then, the expected cost function can be rewritten as follows:

$$\Pi(Q) = \theta_d b \int_0^{+\infty} x f(x)\, \partial x + (1 - \theta_d) \left[ b \int_Q^{+\infty} (x - Q) f(x)\, \partial x + h \int_0^Q (Q - x) f(x)\, \partial x \right].$$

To find the optimal value of $Q$, we take the derivative of $\Pi(Q)$ with respect to $Q$ and set it equal to zero:

$$\frac{\partial \Pi(Q)}{\partial Q} = (1 - \theta_d)\left[-b(1 - F(Q)) + hF(Q)\right] = 0.$$

The optimal order quantity is thus found by:

$$F(Q^*) = \frac{b}{b + h},$$
which is the same as the newsvendor order quantity. This derivation of the optimal order quantity also proves that the order quantity under a Type-2 disruption risk is unaffected by the disruption probability in the absence of a risk-mitigation strategy, because the expression does not include the $\theta_d$ term.

We now focus on the risk-mitigation strategy of reserving reactive capacity at a backup supplier. As discussed in the previous section, the capacity is utilized when a disruption occurs, and the buyer pays a capacity reservation fee of $c_k$ per unit of capacity reserved. We recall that the total capacity reserved is denoted by $C$. In the absence of a disruption, the cost function is formulated as follows:

$$\Pi(Q \mid \text{no disruption}) = b \max(D - Q, 0) + h \max(Q - D, 0) + c_k C.$$

If a disruption occurs, the cost function is written as follows:

$$\Pi(Q \mid \text{disruption}) = b \max(D - C, 0) + c_k C.$$
Then, the expected cost is found by:

$$\Pi(Q, C) = \theta_d\, \Pi(Q \mid \text{disruption}) + (1 - \theta_d)\, \Pi(Q \mid \text{no disruption})$$

$$= \theta_d b \int_C^{+\infty} (x - C) f(x)\, \partial x + (1 - \theta_d) \left[ b \int_Q^{+\infty} (x - Q) f(x)\, \partial x + h \int_0^Q (Q - x) f(x)\, \partial x \right] + c_k C.$$

Taking the first derivative of this expression with respect to $Q$, we find:

$$F(Q^*) = \frac{b}{b + h},$$

which is the same as the newsvendor order quantity. Therefore, the ordering decision under a Type-2 risk is still unaffected by the disruption probability if the reactive capacity is reserved at a backup supplier as a risk-mitigation strategy. We also take the derivative of $\Pi(Q, C)$ with respect to $C$ and set it equal to zero to find the optimal $C$ value:

$$\frac{\partial \Pi(Q, C)}{\partial C} = -\theta_d b (1 - F(C)) + c_k = 0.$$

Thus, the optimal value of $C$ satisfies:

$$F(C^*) = \frac{\theta_d b - c_k}{\theta_d b}.$$
The last expression is the same as the optimal capacity derivation for a Type-1 risk. Therefore, Fig. 7.2 applies to a Type-2 disruption risk. This result indicates that the reactive capacity depends on the disruption probability, whereas the order quantity does not depend on it.

When the risk-mitigation inventory is held against the disruption risk, the optimal $Q_{RMI}$ level for a Type-2 disruption risk is the same as that of a Type-1 risk. If the buyer in a supply chain relationship uses the risk-mitigation inventory to mitigate the disruption risk, the cost function conditional on there being no disruption is written as follows:

$$\Pi(Q \mid \text{no disruption}) = b \max(D - Q, 0) + h \max(Q - D, 0) + hQ_{RMI}.$$

If a disruption occurs, the cost function is:

$$\Pi(Q \mid \text{disruption}) = b \max(D - Q_{RMI}, 0) + h \max(Q_{RMI} - D, 0).$$
Then, the total expected cost is formulated as follows:

$$\Pi(Q, Q_{RMI}) = \theta_d\, \Pi(Q \mid \text{disruption}) + (1 - \theta_d)\, \Pi(Q \mid \text{no disruption})$$

$$= \theta_d \left[ b \int_{Q_{RMI}}^{+\infty} (x - Q_{RMI}) f(x)\, \partial x + h \int_0^{Q_{RMI}} (Q_{RMI} - x) f(x)\, \partial x \right]$$

$$+ (1 - \theta_d) \left[ b \int_Q^{+\infty} (x - Q) f(x)\, \partial x + h \int_0^Q (Q - x) f(x)\, \partial x + hQ_{RMI} \right].$$

Taking the first derivative of the total expected cost with respect to $Q$, the optimal order quantity from the supplier is found by:

$$F(Q^*) = \frac{b}{b + h}.$$

Therefore, the order quantity is the same as the newsvendor order quantity for the Type-2 disruption risk, but, as we see above, it is different from the newsvendor order quantity for the Type-1 disruption risk. We now take the derivative of the total expected cost with respect to $Q_{RMI}$ and develop the formula for the optimal $Q_{RMI}$ level:

$$F(Q^*_{RMI}) = 1 - \frac{h}{\theta_d (b + h)},$$

which is the same as the optimal $Q_{RMI}$ level of the Type-1 risk, because the last expression is identical to the $F(Q^*_{RMI})$ formula that we derived for the Type-1 risk.

In summary, the regular order quantity from the supplier (i.e. $Q$) is affected by whether the buyer is exposed to a Type-1 or a Type-2 disruption risk. However, the optimal levels for the risk-mitigation strategies (either the reactive capacity or the risk-mitigation inventory) are not affected by the type of disruption risk. We summarize the formulas for the optimal levels in Table 7.1, which shows a direct comparison between the Type-1 and Type-2 disruption risks.

Table 7.1 Summary of the formulas for both Type-1 and Type-2 disruption risk cases

| Formula | Type-1 risk | Type-2 risk |
| $F(Q^*)$ | $b/(b + h) - (h\theta_d)/((1 - \theta_d)(b + h))$ | $b/(b + h)$ |
| $F(C^*)$ | $(\theta_d b - c_k)/(\theta_d b)$ | $(\theta_d b - c_k)/(\theta_d b)$ |
| $F(Q^*_{RMI})$ | $1 - h/(\theta_d (b + h))$ | $1 - h/(\theta_d (b + h))$ |
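As a rough numerical cross-check of the $F(Q^*)$ row of Table 7.1, the sketch below minimizes a sample-average version of the expected cost over a grid of order quantities, with no risk-mitigation strategy in place. Everything about the setup is an assumption made for illustration: the lognormal demand with mean and standard deviation of 1000, the cost values 40 and 10, the sample size, the random seed, and the grid spacing. The function name `best_order` is hypothetical. A grid search on simulated demand is a swapped-in technique, not the analytical derivation used in the text.

```python
import random
from math import log, sqrt

B, H = 40.0, 10.0                      # penalty and holding cost per unit (assumed)
SIGMA = sqrt(log(2.0))                 # lognormal scale for CV = 1
MU = log(1000.0) - SIGMA ** 2 / 2.0    # lognormal location for a mean of 1000

rng = random.Random(7)                 # arbitrary seed, for reproducibility only
demand = [rng.lognormvariate(MU, SIGMA) for _ in range(5000)]
mean_demand = sum(demand) / len(demand)
grid = range(0, 3001, 25)              # candidate order quantities

# Sample-average cost when no disruption occurs, for every candidate Q.
no_disruption_cost = {
    q: sum(B * max(d - q, 0.0) + H * max(q - d, 0.0) for d in demand) / len(demand)
    for q in grid
}


def best_order(theta_d, risk_type):
    """Grid point minimizing the sample-average expected cost (no mitigation)."""
    def expected_cost(q):
        # Under a Type-1 risk the buyer also holds the delayed order (h * Q).
        disruption_cost = B * mean_demand + (H * q if risk_type == 1 else 0.0)
        return theta_d * disruption_cost + (1.0 - theta_d) * no_disruption_cost[q]
    return min(grid, key=expected_cost)


for theta_d in (0.1, 0.5):
    print(f"theta_d={theta_d}: Type-1 Q*={best_order(theta_d, 1)}, "
          f"Type-2 Q*={best_order(theta_d, 2)}")
```

Under a Type-2 risk the disruption term does not involve $Q$, so the minimizing grid point is the same for every disruption probability below one, whereas under a Type-1 risk it moves down as the disruption probability grows.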
7.3 Implications of the Shipment Ownership for Global Trade
When we analyze Type-1 and Type-2 disruption risks, we make a distinction between them based on the ownership of the shipment. In particular, the disruption risk is defined as a Type-1 risk if the buyer is the responsible party for the shipment. Otherwise, the disruption risk is referred to as a Type-2 risk. For global trade, the International Chamber of Commerce (ICC) has issued a set of predefined commercial terms, which are popularly known as Incoterms (see https://en.wikipedia.org/wiki/Incoterms). The Incoterms are commonly used in international trade to describe the responsibilities of the supply chain parties and streamline the trade process. The Incoterms are categorized into four different groups (see https://partners.wsj.com/ups/the-abcs-of-incoterms/).

The “C Terms” comprise the first group of Incoterms and apply to situations whereby the supplier pays for the shipment, and the buyer assumes the risks during the shipment. There are some subcategories under the C Terms. For example, the Incoterm CPT allocates the responsibilities to the supply chain parties, such that the supplier pays for the carriage of the goods to a predetermined location, but the buyer assumes the risks. Another C Term, CIP, obliges the supplier to pay for the insurance for the shipment; however, the risks beyond the scope of the insurance are assumed by the buyer. Therefore, the buyer would incur the costs of a disruption, given that the supplier is not responsible for the risks associated with the shipment. For that reason, the optimal ordering and risk-mitigation policies for the C Terms can be analysed under a Type-1 disruption risk.

In the second group of Incoterms, the “D Terms”, the supplier is responsible for the shipment, and the risk is transferred to the buyer after the shipped goods are unloaded at a specific location. There are also some subcategories of the D Terms, such as DAT, DAP, and DDP. For example, the supplier is responsible for the delivery of the goods to the buyer under the DAT terms (i.e. delivered at terminal), and the risk is transferred to the buyer after the delivery. Therefore, the supplier would incur the costs of a disruption, given that she is responsible for the risks associated with the shipment. Hence, the optimal ordering and risk-mitigation policies for the D Terms can be analysed under a Type-2 disruption risk.

The third group of Incoterms, the “E Terms”, imposes the maximum level of responsibility on the buyer. The only responsibility assumed by the supplier under the E Terms is the packaging of the goods and making them available in a place to be picked up on time. The buyer is then fully responsible for handling, transporting, and insuring the shipment. Therefore, the optimal ordering and risk-mitigation policies for the E Terms can be analysed under a Type-1 disruption risk.

Both Type-1 and Type-2 disruption risks apply to some parts of the shipment under the last group of Incoterms, the “F Terms”. Under the FAS Term (i.e. one of the subcategories of the F Terms), for example, the supplier is responsible for making sure the goods are delivered and loaded onto the vessels in a port, while the buyer is responsible for the rest of the shipment. If the disruption risk is expected to occur before loading, then a Type-2 model should be used to develop the optimal ordering and risk-mitigation policies. If it is expected to occur during the ocean shipment, then a Type-1 model should be used to develop the optimal policies.

In summary, the Incoterms specify the contractual obligations of the parties in a supply chain, which is very important for determining whether the Type-1 or the Type-2 derivations should be used to develop the optimal policies.
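The mapping described in this section can be written down as a small lookup. The sketch below only restates the classification given above. The function name, the argument names, and the stage labels `"before_loading"` and `"ocean_shipment"` are hypothetical choices made for illustration, and the function is not a substitute for reading the actual contract terms.

```python
def disruption_model(incoterm_group, disruption_stage=None):
    """Return which disruption model the section above associates with an Incoterm group.

    incoterm_group: one of "C", "D", "E", "F" (the four groups discussed in the text).
    disruption_stage: only used for the F Terms; a hypothetical label,
        either "before_loading" or "ocean_shipment".
    """
    group = incoterm_group.strip().upper()
    if group in ("C", "E"):
        return "Type-1"   # the buyer bears the shipment risk
    if group == "D":
        return "Type-2"   # the supplier bears the shipment risk
    if group == "F":
        if disruption_stage == "before_loading":
            return "Type-2"
        if disruption_stage == "ocean_shipment":
            return "Type-1"
        raise ValueError("F Terms need a disruption stage to pick a model")
    raise ValueError(f"unknown Incoterm group: {incoterm_group!r}")


print(disruption_model("C"))                    # CPT, CIP
print(disruption_model("D"))                    # DAT, DAP, DDP
print(disruption_model("F", "before_loading"))  # e.g. FAS, risk before loading
```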
7.4 Type-3 Risk: Delivery Shortfalls
We have analysed the ordering policies and implications of the disruption risk by focusing on Type-1 and Type-2 disruption risks. Supply risks may also occur in a different form, such that the delivery is completed, but the amount delivered is less than the initial order quantity. There are different reasons for a shortfall in the delivery of the goods ordered. For example, fresh produce may spoil during the shipment due to poor storage practices. According to a former Walmart executive, fresh produce spends half of its shelf life in the shipment, and there is a huge cost for the spoilage of fresh produce due to poor shipment practices (Garsten, 2020). In the agriculture industry, a delivery shortfall may occur due to yield uncertainty. In 2021, for example, crop yields were lower than expected in the USA due to drought, which led to food shortages and an increase in food prices (Dougherty & Santilli, 2021). Delivery shortfalls are not limited to perishable items; they are also observed in durable goods. In 2020, nearly 2000 containers managed by a Japanese carrier were lost or damaged due to storms in the Pacific Ocean, potentially causing delivery shortfalls for buyers who expected to receive the goods in full and in good condition (Sheldrick, 2020). A delivery shortfall may also occur for other reasons, such as theft, accidents, poor handling of the items during the shipment, etc.

We now consider a buyer that is exposed to a delivery shortfall for one or more of the reasons mentioned above. The buyer determines the order quantity $Q$ at the beginning under demand uncertainty. We use $Y$ to denote the amount of the delivery shortfall, which is assumed to be stochastic. The amount of the shortfall cannot exceed $Q$; that is, $0 \le Y \le Q$. Therefore, the amount of the items delivered is equal to $Q - Y$ units. The buyer can sell only $Q - Y$ units if the customer demand $D$ exceeds $Q - Y$.

We now consider a case in which the buyer pays the supplier only for $Q - Y$ units and the supplier incurs the cost of the delivery shortfall. We use $p$, $c$, and $s$ to denote the selling price, cost, and salvage value of the product per unit, respectively. Then, the newsvendor profit under the delivery shortfall risk is written as follows:

$$\Pi(Q) = (p - c)(Q - Y) - (p - s) \max(Q - Y - D, 0).$$
We assume that the demand follows a multiplicative demand model, given in Chap. 4, such that the demand forecast is updated over a forecasting horizon between $t_0$ and $t_n$. The demand forecast at time $t_i$ for $t_0 \le t_i \le t_n$ is denoted by $D_i$. The actual demand is realized at time $t_n$, which is equal to $D_n$. Then, the final demand conditional on the initial demand forecast follows a lognormal distribution:

$$D_n \mid D_0 \sim \text{log-}N\left(\ln(D_0) + (\mu - \sigma^2/2)(t_n - t_0),\ \sigma\sqrt{t_n - t_0}\right),$$
where \(\mu\) is the drift rate and \(\sigma\) is the volatility parameter. We assume that the amount of the delivery shortfall is proportional to the final demand. In practice, manufacturers have limited production capacity. When the demand for an item increases beyond the capacity limits, buyers face a shortfall, such that the delivered amount is less than the amount of their initial orders. For example, retailers around the world observed such a shortfall in supplies in the early stages of the COVID-19 pandemic for products such as toilet paper. If the shortfall is directly proportional to the demand, the ratio \(Y/D_n\) has a constant value. Suppose that \(Y/D_n = r_{const}\), where the constant \(r_{const}\) is referred to as the shortfall ratio. Then, the amount of the delivery shortfall predicted at time \(t_0\) is \(Y_0 = r_{const} D_0\). Given that the demand model is multiplicative, the amount of the shortfall follows a lognormal distribution with the following parameters:
\[
Y \mid Y_0 \sim \log\text{-}N\left(\ln(Y_0) + (\mu - \sigma^2/2)(t_n - t_0),\ \sigma\sqrt{t_n - t_0}\right).
\]
We remark that this lognormal distribution may assign positive densities to realizations of \(Y > Q\), although the amount of the delivery shortfall cannot be more than the order quantity. To avoid such circumstances, we need to restrict \(r_{const}\) to being less than 0.2, so that the probability of Y exceeding Q is kept small. We can also use the following lognormal distribution to approximate the distribution of the sum \(D_n + Y\):
\[
D_n + Y \mid D_0, Y_0 \sim \log\text{-}N\left(\ln(D_0 + Y_0) + (\mu - \sigma^2/2)(t_n - t_0),\ \sigma\sqrt{t_n - t_0}\right).
\]
We now use \(h(\cdot)\) and \(g(\cdot)\) to denote the probability density functions of \(Y\) and \(D_n + Y\), respectively. The cumulative distribution functions are denoted by \(H(\cdot)\) and \(G(\cdot)\), respectively. The expected profit is then calculated as follows:
\[
E(\Pi(Q)) = (p - c)Q - (p - c)\int_0^{+\infty} y\,h(y)\,\partial y - (p - s)\int_0^{Q} (Q - z)\,g(z)\,\partial z.
\]
Taking the first derivative of this expression with respect to Q, the optimal order quantity is found as follows:
\[
\frac{\partial E(\Pi(Q))}{\partial Q} = (p - c) - (p - s)G(Q) = 0, \qquad G(Q^*) = \frac{p - c}{p - s}.
\]
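The differentiation step can be made explicit. As a brief sketch (using the Leibniz integral rule, which is not spelled out in the text), only the last term of the expected profit depends on Q through both the integrand and the upper limit, while the shortfall term is constant in Q:

```latex
\[
\frac{\partial}{\partial Q}\int_0^{Q} (Q - z)\,g(z)\,\partial z
  = (Q - Q)\,g(Q) + \int_0^{Q} g(z)\,\partial z
  = G(Q),
\qquad
\frac{\partial}{\partial Q}\int_0^{+\infty} y\,h(y)\,\partial y = 0.
\]
```

Combining these with the derivative of \((p - c)Q\) gives the first-order condition \((p - c) - (p - s)G(Q) = 0\).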
The expected profit for the optimal order quantity is then equal to:
\[
E(\Pi(Q^*)) = (p - s)\int_0^{Q^*} z\,g(z)\,\partial z - (p - c)\int_0^{+\infty} y\,h(y)\,\partial y.
\]
We remark that
\[
(p - c)Q^* = (p - s)Q^* G(Q^*),
\]
for \(G(Q^*) = (p - c)/(p - s)\). Using the derivations of the expected profit given in Chap. 3's appendix, the expected profit for the optimal order quantity is written as:
\[
E(\Pi(Q^*)) = (p - s)(D_0 + Y_0)\,e^{\mu(t_n - t_0)}\,\Phi\left(z_{Q^*} - \sigma\sqrt{t_n - t_0}\right) - (p - c)\,Y_0\,e^{\mu(t_n - t_0)},
\]
where
\[
z_{Q^*} = \frac{\ln(Q^*) - \ln(D_0 + Y_0) - (\mu - \sigma^2/2)(t_n - t_0)}{\sigma\sqrt{t_n - t_0}}.
\]
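The closed form follows from two standard lognormal results: the partial expectation and the mean. A short sketch, where the shorthand symbols \(m\) and \(v\) are introduced here only for convenience and are not part of the chapter's notation:

```latex
\[
m = \ln(D_0 + Y_0) + (\mu - \sigma^2/2)(t_n - t_0), \qquad v = \sigma\sqrt{t_n - t_0},
\]
\[
\int_0^{Q^*} z\,g(z)\,\partial z
  = e^{m + v^2/2}\,\Phi\!\left(\frac{\ln Q^* - m}{v} - v\right)
  = (D_0 + Y_0)\,e^{\mu(t_n - t_0)}\,\Phi\!\left(z_{Q^*} - \sigma\sqrt{t_n - t_0}\right),
\]
\[
\int_0^{+\infty} y\,h(y)\,\partial y = E(Y) = Y_0\,e^{\mu(t_n - t_0)}.
\]
```

The second equality in the middle line uses \(m + v^2/2 = \ln(D_0 + Y_0) + \mu(t_n - t_0)\).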
The expected profit value conditional on \(Y = 0\) reduces to the newsvendor profit, where there is no supply risk. Then, the cost of the delivery shortfall risk for the buyer is calculated as follows:
\[
E(\Pi(Q^*) \mid Y = 0) - E(\Pi(Q^*)).
\]
We now consider an example of a retailer that purchases goods from a supplier and sells them in a market with uncertain demand. The selling price is $10 per unit, the cost is $5 per unit, and the residual value of unsold items is $2 per unit. We normalize the forecasting horizon \(t_n - t_0\) to one and update the parameters accordingly. The drift rate \(\mu\) is set equal to zero, and the volatility parameter is equal to one. The initial demand forecast is 100 units.
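The optimal order quantities for this example can be reproduced numerically. The following is a minimal sketch, assuming the lognormal model for \(D_n + Y\) stated above; the function name `optimal_order_quantity` and its argument names are illustrative choices, not taken from the book.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def optimal_order_quantity(d0, r_const, p, c, s, mu=0.0, sigma=1.0, horizon=1.0):
    """Return Q* solving G(Q*) = (p - c) / (p - s) for lognormal D_n + Y."""
    y0 = r_const * d0                                    # Y_0 = r_const * D_0
    location = log(d0 + y0) + (mu - sigma**2 / 2) * horizon
    scale = sigma * sqrt(horizon)
    z = NormalDist().inv_cdf((p - c) / (p - s))          # standard normal quantile
    return exp(location + scale * z)


if __name__ == "__main__":
    # Example parameters: p = 10, c = 5, s = 2, D_0 = 100, mu = 0, sigma = 1.
    for r_const in (0.00, 0.05, 0.10, 0.15, 0.20):
        q_star = optimal_order_quantity(d0=100, r_const=r_const, p=10, c=5, s=2)
        print(f"r_const = {r_const:.2f}  ->  Q* = {q_star:.1f}")
```

Because the shortfall ratio only shifts the location parameter, the order quantity scales with \(1 + r_{const}\) in this sketch.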
Fig. 7.4 Impact of the delivery shortfall risk on the optimal order quantity (x-axis: shortfall ratio; y-axis: order quantity)
In Fig. 7.4, we present the impact of the delivery shortfall on the optimal order quantities. The x-axis represents the shortfall ratio \(r_{const}\), and the y-axis represents the optimal order quantity \(Q^*\). The order quantity for \(r_{const} = 0\) corresponds to the newsvendor order quantity without any shortfall risk, which is equal to 83 units. As the shortfall ratio increases, so does the optimal order quantity. This result is consistent with our expectation, because a retailer that anticipates a delivery shortfall is induced to increase the order quantity in order to get the desired amount of products delivered. In Fig. 7.5, we present the cost of the shortfall risk for the retailer. The x-axis represents the shortfall ratio \(r_{const}\). The y-axis represents the percentage cost of the shortfall risk, which can be formulated as follows:
\[
1 - \frac{E(\Pi(Q^*))}{E(\Pi(Q^*) \mid Y = 0)}.
\]
When \(r_{const} = 0\), the second term of this expression becomes equal to one, making the percentage cost of the shortfall equal to zero. As the shortfall ratio increases, the expected profit decreases due to the increase in the shortfall risk. Therefore, the percentage cost of the shortfall increases with the shortfall ratio.

Fig. 7.5 Cost of the delivery shortfall risk (x-axis: shortfall ratio; y-axis: cost of the shortfall in percentages)

In our example, we assume that the supplier is responsible for the delivery shortfall, an assumption that may not hold in some situations. In the agriculture industry, for example, the buyers may contract with the farmers to protect them against a delivery shortfall risk. In particular, there are different forms of contract farming (Meemken & Bellemare, 2020), such as a pre-harvest agreement between a farmer and a buyer, which transfers the risk of a low yield from the farmer to the buyer. In the retail industry, poor inventory handling and control practices in retail warehouses can lead to the loss of inventory and, in turn, to inaccurate inventory records (DeHoratius & Raman, 2008; Chuang & Oliva, 2015). In these circumstances, the retailers, not necessarily the suppliers, are responsible for the inventory shortfalls.

When the supplier is not responsible for delivery shortfalls, the buyer pays the supplier for Q units, although it can only sell \(Q - Y\) units in the market. In other words, the buyer incurs an additional cost of cY, compared to our example. Therefore, the buyer's profit will be cY less than when the supplier incurs the cost. Then, the newsvendor profit is written as follows:
\[
\Pi(Q) = (p - c)(Q - Y) - cY - (p - s)\max(Q - Y - D, 0).
\]
The expected profit is also found by:
\[
E(\Pi(Q)) = (p - c)Q - p\int_0^{+\infty} y\,h(y)\,\partial y - (p - s)\int_0^{Q} (Q - z)\,g(z)\,\partial z.
\]
Taking the first derivative of this expression with respect to Q, the optimal order quantity \(Q^*\) is:
\[
G(Q^*) = \frac{p - c}{p - s},
\]
which is the same as the formulation of the optimal order quantity above, that is, when the supplier incurs the cost of cY. The expected profit for the optimal order quantity is:
\[
E(\Pi(Q^*)) = (p - s)\int_0^{Q^*} z\,g(z)\,\partial z - p\int_0^{+\infty} y\,h(y)\,\partial y.
\]
We now extend our example given in Figs. 7.4 and 7.5 to show the cost of the delivery shortfall risk when the buyer incurs the cost of the shortfall (i.e. cY). We present the results in Fig. 7.6. When we compare Fig. 7.6 with Fig. 7.5, we observe that the transfer of the shortfall risk from the supplier to the buyer causes a substantial increase in the buyer's cost, although the optimal order quantity is the same for both cases. For instance, the percentage cost increase due to the shortfall risk is close to 80% in Fig. 7.6 for \(r_{const} = 0.2\), whereas it is around 30% in Fig. 7.5.
Fig. 7.6 Cost of the delivery shortfall risk when the buyer incurs the cost of the shortfall (i.e. cY) (x-axis: shortfall ratio; y-axis: cost of the shortfall in percentages)
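The percentage costs under the two ownership arrangements can be compared numerically. The sketch below assumes the closed-form expected profit given earlier for the case in which the supplier bears the shortfall cost; for the case in which the buyer bears it, the coefficient \(p - c\) of the shortfall term is replaced by \(p\), which is inferred from the integral form rather than stated as a closed form in the text. The function names and the flag `buyer_bears_shortfall` are illustrative.

```python
from math import exp, sqrt
from statistics import NormalDist


def expected_profit(d0, r_const, p, c, s, buyer_bears_shortfall,
                    mu=0.0, sigma=1.0, horizon=1.0):
    """Expected profit at the optimal order quantity Q*."""
    std_normal = NormalDist()
    y0 = r_const * d0                                   # Y_0 = r_const * D_0
    growth = exp(mu * horizon)
    # z_{Q*} equals the standard normal quantile of the critical ratio.
    z_q = std_normal.inv_cdf((p - c) / (p - s))
    sales_term = (p - s) * (d0 + y0) * growth * std_normal.cdf(z_q - sigma * sqrt(horizon))
    unit_shortfall_cost = p if buyer_bears_shortfall else p - c
    return sales_term - unit_shortfall_cost * y0 * growth


def percentage_cost(r_const, buyer_bears_shortfall, d0=100, p=10, c=5, s=2):
    """1 - E(profit with shortfall risk) / E(profit with Y = 0)."""
    no_risk = expected_profit(d0, 0.0, p, c, s, buyer_bears_shortfall)
    with_risk = expected_profit(d0, r_const, p, c, s, buyer_bears_shortfall)
    return 1 - with_risk / no_risk


if __name__ == "__main__":
    for r_const in (0.00, 0.05, 0.10, 0.15, 0.20):
        supplier = percentage_cost(r_const, buyer_bears_shortfall=False)
        buyer = percentage_cost(r_const, buyer_bears_shortfall=True)
        print(f"r_const = {r_const:.2f}: supplier bears {supplier:.1%}, buyer bears {buyer:.1%}")
```

With the example parameters, the two values at \(r_{const} = 0.2\) should come out near 30% and 80%, in line with the figures.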
7.5 Chapter Summary
Owing to the technological advances that enable companies to source products from offshore countries, global supply chains have been extended for the last three decades. This trend causes long lead times and high supply risks in supply chains. In this chapter, we have discussed two types of supply risks: (1) the supply disruption risk and (2) the delivery shortfall risk. Under the supply disruption risk, the ordering policy (i.e. the regular order quantity) differs, depending on the ownership of the costs of the risk. However, the risk-mitigation strategy is not affected by the ownership of the costs of the risk. Building a risk-mitigation inventory would likely be appealing for innovative products, whereas reserving reactive capacity would be highly effective for standard products. Under the delivery shortfall risk, the ordering policy is not affected by who owns the risk costs; however, the expected profit is affected by it. When the probability of being exposed to supply risk increases, the ordering policy (i.e. the regular order quantity) changes for a Type-1 disruption risk and a Type-3 delivery shortfall risk, but not for a Type-2 disruption risk. Therefore, one of the useful insights derived from this chapter regarding the management of supply risk is that the optimal ordering strategy and contingency plans are affected by the type and ownership of the supply risks. Decision-makers should not ignore these aspects when managing the supply risks.
7.6 Practice Examples
Example 7.1 Suppose that a wholesaler sells a seasonal product to some buyers in North America. The demand follows a lognormal distribution with a location parameter of 6.56 and a scale parameter of 0.83. The wholesaler leases a fleet of ships and trucks, so the products can be transported from an offshore supplier at a low cost. Thus, the wholesaler takes responsibility for the shipment. If the demand of buyers is not fulfilled due to a supply disruption, the wholesaler incurs a loss of profit, which is equal to $20 per unit. The holding cost of excess inventory is estimated to be $15 per unit. If the wholesaler fails to deliver orders to buyers on time due to a supply disruption, demand is lost and inventory that is already shipped is kept as excess stock. The probability of a supply disruption is estimated to be 10% by the wholesaler. What is the optimal order quantity? (Use this information to answer the questions of Examples 7.1–7.5)

Solution 7.1 The critical fractile is found as:
\[
\alpha = \frac{b}{b + h} - \frac{h\theta_d}{(1 - \theta_d)(b + h)} = \frac{20}{20 + 15} - \frac{15 \times 0.1}{0.9 \times (20 + 15)} = 0.524.
\]
Then, \(z = \Phi^{-1}(0.524) = 0.06\) and \(Q^* = e^{6.56 + 0.83 \times 0.06} = 742\) units.
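The calculation in Solution 7.1 can be checked with a few lines of code. This is a minimal sketch that simply evaluates the critical fractile formula above and inverts the lognormal distribution; the variable names are illustrative.

```python
from math import exp
from statistics import NormalDist

b = 20.0         # loss of profit per unit of unmet demand
h = 15.0         # holding cost per unit of excess inventory
theta_d = 0.10   # probability of a supply disruption
location, scale = 6.56, 0.83   # lognormal demand parameters

# Critical fractile from Solution 7.1.
alpha = b / (b + h) - h * theta_d / ((1 - theta_d) * (b + h))

# Invert the lognormal distribution: Q* = exp(location + scale * z).
z = NormalDist().inv_cdf(alpha)
q_star = exp(location + scale * z)

print(f"alpha = {alpha:.3f}, z = {z:.2f}, Q* = {q_star:.0f} units")
```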
Example 7.2 A domestic supplier offers the wholesaler the option of reserving reactive capacity that can be used if a supply disruption occurs. The capacity reservation cost charged to the wholesaler is $1 per unit. What is the optimal capacity level?

Solution 7.2 The optimal capacity is found by:
\[
F(C^*) = \frac{\theta_d b - c_k}{\theta_d b} = \frac{0.1 \times 15 - 1}{0.1 \times 15} = 0.333.
\]
Then, \(z = \Phi^{-1}(0.333) = -0.432\) and \(C^* = e^{6.56 - 0.83 \times 0.432} = 494\) units.
Example 7.3 The wholesaler decides to keep risk-mitigation inventory in stock, such that it can be used to fulfil buyers' demand when a supply disruption occurs. What is the optimal level of risk-mitigation inventory?

Solution 7.3 The optimal level of risk-mitigation inventory is found by:
\[
F(Q^*_{RMI}) = 1 - \frac{h}{\theta_d (b + h)} = 1 - \frac{15}{0.1 \times (20 + 15)} = -3.28.
\]
Given that the cumulative probability cannot have any negative value, this expression is unsolvable. So, Q∗RMI = 0. This expression can be positive when θd > 0.429. Therefore, it is not feasible for the wholesaler to keep risk-mitigation inventory when the disruption probability is low (i.e. 0 and .i < n: Step 2a: Solve the Knapsack problem for the invoices in the set .Ji with the capital equal to .Ki . Step 2b: Separate the selected invoices .J∗i and update .Ji+1 = Ji+1 ∪ Ji . Step 2c: Set .i = i + 1. Step 2d: Go to Step 0. Step 3: Solve the Knapsack problem for the invoices in the set .Jn with the capital equal to .min(K1 , K2 , · · · , Kn ).
The algorithm starts from the first period of the problem horizon by calculating the cash surplus: \(\Delta K_1 = K_2 - K_1\). If the cash surplus is negative, a single Knapsack problem given in Step 3 should be solved, including all the invoices in the selection set. The available capital for the Knapsack problem is set equal to the minimum of the free cash flows over the problem horizon: \(\min(K_1, K_2, \cdots, K_n)\). If the cash surplus \(\Delta K_1\) is positive, we then create a subproblem, such that the surplus can be used to pay some of the invoices that are due during the first period. The subproblem is also a Knapsack problem as stated in Step 2 of the algorithm. After solving the Knapsack problem and hence selecting the invoices to be paid early, the remaining invoices that are due during the first month (i.e. \(J_1 \setminus J_1^*\)) should be included in the set \(J_2\) (Step 2b). Next, the algorithm returns to Step 0 after setting \(i = 2\). The cash surplus for the second period is then calculated as follows: \(\Delta K_2 = K_3 - K_2\). If this amount is positive, we then create another subproblem in Step 2 to determine the invoices to be paid early by using the cash surplus of \(\Delta K_2\). Therefore, the algorithm continues to create subproblems as long as the \(\Delta K_i\) values are positive. Once a negative value is observed, the algorithm stops creating subproblems and jumps to the final step (Step 3). This stopping rule is strict because the sliding scale should specify the discount rates for the time epochs of the payment starting from day 0. Therefore, the algorithm can no longer partition the invoices whenever a negative \(\Delta K_i\) value is observed. In that case, it should jump to Step 3 and solve the Knapsack problem for the invoices in set \(J_n\) before terminating.

Dynamic discounting aims to minimize the inefficiencies of the early payment scheme in two different ways. First, it enables buyers to offer the sliding scale to their suppliers, such that the buyers can get a discount if the payment is made anytime up to the payment date of an invoice. Second, it allocates the buyers' capital to the invoices, so the suppliers in need of cash are paid early and the buyers also benefit from this. The main disadvantage of dynamic discounting is that it requires a frequent exchange of information between the supply chain parties and an optimization model to facilitate the decision-making process.
8.5 Chapter Summary
Supply chain finance is an important and essential part of supply chain management. An effective operational strategy would be useless if it is not complemented with a well-designed financing strategy. We have covered four different supply chain finance solutions in this chapter. The first one is the early payment scheme, which is practically easy to implement. It does not involve any financial intermediary, and the supplier simply offers an early payment discount to the buyer. The early payment scheme comes with a hefty cost of implementation inefficiencies, such that the buyer would not be able to realize the full benefits of early payments. The second one is the reverse factoring, where an intermediary (e.g. a bank) finances the trade at a low cost. Reverse factoring makes it possible for the supplier to benefit from the buyer’s credibility and access to low-cost financing. In return, the buyer expects
to increase the payment term on the invoice, so both parties would benefit from reverse factoring programs. The third one is the letter of credit, which is different from the other three supply chain finance solutions, because it not only finances the trade but also establishes trust between the trading parties. To this end, it involves two financial intermediaries (i.e. issuing and advising banks), such that the issuing bank represents the buyer, whereas the advising bank tries to protect the supplier’s rights. Given that there are two banks involved, the letter of credit is costlier than the other supply chain finance solutions due to the fees paid to the banks. Finally, dynamic discounting addresses the implementation inefficiencies of the early payment scheme with the help of technological infrastructure and optimization models. The sliding scale of dynamic discounting is useful when the buyer has an abundance of cash, so she can offer the same sliding scale to all the suppliers. If the buyer has limited cash, the market mechanism of dynamic discounting makes it possible to allocate the limited capital to the invoices for an early payment in the most profitable way. In this chapter, we present the analytical models that help quantify the value of each financing strategy separately. However, the optimal policy would be a hybrid model that integrates different supply chain finance solutions. As stated above, some buyers tend to select a group of suppliers and include them in the reverse factoring programs. This would help the buyers free up capital that can be used for early payments to other suppliers. To arrange the early payments, dynamic discounting is a better alternative than the early payment scheme. For that reason, we envision that companies that effectively use dynamic discounting and reverse factoring together would gain a competitive edge in realizing the full potential of supply chain finance.
8.6 Practice Examples
Example 8.1 What is the effective annual rate for an early payment scheme of 3/40, net 80?

Solution 8.1 The early payment scheme of 3/40, net 80 means that the buyer receives a 3% discount on the invoiced amount if she pays the invoice in the first 40 days. According to the initial agreement between the supplier and the buyer, the invoice must be paid in 80 days. Therefore, the buyer gets the discount if the payment is made 80 − 40 = 40 days earlier than the initially scheduled payment date. The effective rate of the early payment scheme is then calculated as follows:
365/40 × 3% = 27.375%.
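The annualization used in Solution 8.1 can be written as a small helper. This is a minimal sketch of the simple (non-compounded) scaling used in the solution; the function name `annualized_discount_rate` is an illustrative choice.

```python
def annualized_discount_rate(discount_pct, discount_days, net_days):
    """Scale an early-payment discount to a simple annual rate (in percent).

    For terms such as 3/40, net 80 the discount is earned by paying
    (net_days - discount_days) days early.
    """
    days_early = net_days - discount_days
    return 365 / days_early * discount_pct


if __name__ == "__main__":
    rate = annualized_discount_rate(discount_pct=3, discount_days=40, net_days=80)
    print(f"3/40, net 80  ->  {rate:.3f}% per year")
```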
Example 8.2 Suppose that a buyer purchases toys from a contract manufacturer and sells them in a market during a short selling season. The supply chain parties initially agree that the payment will be done in 120 days, so the buyer can sell the products in the market, collect revenues, and then pay the invoice to the supplier. The amount of invoice is $80,000. The manufacturer also offers the buyer an early payment
opportunity with the terms of 2/30, net 120. The buyer does not have enough cash to make an early payment. However, she can take a bank loan at an annual interest rate of 4.5%. What is the value of early payment for the buyer if she takes the bank loan and pays the invoice earlier than the scheduled payment date? (Use this information to answer the questions of Examples 8.2–8.7)

Solution 8.2 The buyer would receive a total discount of 80,000 × 0.02 = $1600 by paying 90 days earlier than the scheduled payment date. She has to make a payment of $78,400. When she takes a loan of $78,400 for 90 days, the interest on the loan is:
$78,400 × 4.5% × 90/365 = $870.
At the time of the original payment date of the invoice, the buyer repays the loan to the bank, which amounts to 78,400 + 870 = $79,270 (including the debt and interest of the loan). Therefore, the value of early payment is equal to 80,000 − 79,270 = $730.

Example 8.3 Suppose that the buyer's bank is involved in a reverse factoring program, such that the bank offers a reverse-factoring rate of 2% to the manufacturer. If the manufacturer accepts the offer, she gets paid in 30 days. What is the value of reverse factoring for the manufacturer, compared to the early-payment scheme of 2/30, net 120?

Solution 8.3 The bank charges an interest of 80,000 × 2% × 90/365 = $395 to the manufacturer if she agrees to use reverse factoring. Therefore, the manufacturer gets paid 80,000 − 395 = $79,605 on the 30th day. With the early-payment scheme, she could get paid only $78,400. Thus, the value of reverse factoring is equal to 79,605 − 78,400 = $1205 for the manufacturer.

Example 8.4 If the buyer asks the manufacturer to extend the payment term in the reverse factoring program, what would be the new payment term that makes the manufacturer indifferent between the reverse factoring and the early payment with 2/30, net 120?

Solution 8.4 With the early-payment scheme, the manufacturer offers the total discount of $1600 to receive the payment on the 30th day. The interest rate in the reverse-factoring program is 2%. Then, the payment term that makes the manufacturer indifferent between the two options is found by:
80,000 × 2% × (x − 30)/365 = 1600,
x = 365/0.02 × 1600/80,000 + 30 = 395 days.
We remark that we use the term “(x − 30)” in the first expression, because the manufacturer gets paid on the 30th day with the reverse-factoring program. The left-hand side of the first expression is equal to the total interest amount of reverse factoring with the new payment term of x days. The right-hand side of the first expression is the total discount of the early-payment scheme. Therefore, the new payment term of 395 days makes the manufacturer indifferent between these two options.

Example 8.5 Suppose that the manufacturer has enough working capital, for which the cost of capital is equal to 8%. Should the manufacturer still offer the early-payment scheme of 2/30, net 120 to the buyer?

Solution 8.5 The early-payment scheme makes receiving the payment 90 days earlier than the scheduled time worth 2%. Thus, the annual interest rate for the early payment is equal to 365/90 × 2% = 8.11%. Because the annual rate is higher than the cost of capital, the manufacturer should not offer the early payment to the buyer.

Example 8.6 The manufacturer considers updating the early-payment scheme to 1.5/30, net 120. Given that her cost of capital is equal to 8%, would the new scheme be appealing for the manufacturer?

Solution 8.6 The annual interest rate for the new scheme is 365/90 × 1.5% = 6.08%. Therefore, the new early-payment scheme is appealing for the manufacturer as its annual interest rate is less than the cost of capital.

Example 8.7 Suppose that the buyer has excess cash, which is available from the 20th day after the delivery of the toys by the manufacturer is completed. If the manufacturer develops a dynamic discounting scheme that is equivalent to the early payment of 2/30, net 120, what would be the benefit of paying the invoice on the 20th day for the manufacturer?

Solution 8.7 The early-payment scheme of 2/30, net 120, offers the same discount rate (i.e. 2%) to the buyer if the payment is made in the first 30 days. Therefore, the buyer gets a total discount of 80,000 × 0.02 = $1600, regardless of whether the payment is made on the 20th or the 30th day. However, dynamic discounting fixes the annual interest to the same value (i.e. 365/90 × 2% = 8.11%) regardless of the payment date as illustrated in Fig. 8.6. If the payment is made by the buyer on the 20th day, the discount rate will be 8.11% × 100/365 = 2.22%. Hence, the total discount is equal to 80,000 × 0.0222 = $1777. Therefore, the benefit of paying the invoice on the 20th day with the dynamic discounting scheme is 1777 − 1600 = $177.
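The arithmetic of Examples 8.2–8.7 can be collected in one short script. This is a sketch under the same simple-interest convention used in the solutions (a 365-day year, no compounding); all variable names are illustrative. Because the script does not round intermediate values, some results differ from the text by less than a dollar.

```python
INVOICE = 80_000.0
DISCOUNT_RATE = 0.02                   # terms 2/30, net 120
DISCOUNT_DAY, NET_DAY = 30, 120
DAYS_EARLY = NET_DAY - DISCOUNT_DAY    # 90 days


def simple_interest(principal, annual_rate, days):
    """Simple (non-compounded) interest over a number of days."""
    return principal * annual_rate * days / 365


# Example 8.2: the buyer borrows at 4.5% to capture the discount.
discount = INVOICE * DISCOUNT_RATE
early_payment = INVOICE - discount
loan_interest = simple_interest(early_payment, 0.045, DAYS_EARLY)
value_of_early_payment = discount - loan_interest

# Example 8.3: reverse factoring at 2% versus the early-payment scheme.
rf_interest = simple_interest(INVOICE, 0.02, DAYS_EARLY)
value_of_reverse_factoring = (INVOICE - rf_interest) - early_payment

# Example 8.4: payment term x at which reverse-factoring interest equals the discount.
indifference_term = DISCOUNT_DAY + 365 * discount / (INVOICE * 0.02)

# Examples 8.5 and 8.6: annualized cost of offering the discount.
annual_rate_2pct = 365 / DAYS_EARLY * 0.02
annual_rate_1_5pct = 365 / DAYS_EARLY * 0.015

# Example 8.7: dynamic discounting keeps the annual rate fixed; payment on day 20.
dynamic_discount = INVOICE * annual_rate_2pct * (NET_DAY - 20) / 365
benefit_day_20 = dynamic_discount - discount

if __name__ == "__main__":
    print(f"8.2 loan interest: {loan_interest:.2f}, value: {value_of_early_payment:.2f}")
    print(f"8.3 interest: {rf_interest:.2f}, value: {value_of_reverse_factoring:.2f}")
    print(f"8.4 indifference term: {indifference_term:.0f} days")
    print(f"8.5 annual rate: {annual_rate_2pct:.2%}; 8.6 annual rate: {annual_rate_1_5pct:.2%}")
    print(f"8.7 dynamic discount: {dynamic_discount:.2f}, benefit: {benefit_day_20:.2f}")
```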
8.7 Exercises
1. What is cost of capital? Why is it an important parameter to assess the value of supply chain finance alternatives (i.e. early payment, reverse factoring, letter of credit, and dynamic discounting)?
2. How is the value of early payment affected by an increase in the seller's cost of capital?
3. How is the value of early payment affected by an increase in the buyer's cost of capital?
4. Discuss the benefits and challenges of reverse factoring, compared to the early payment scheme.
5. When do supply chain parties engage in a letter of credit? What are the advantages and disadvantages of the letter of credit, compared to the early payment scheme?
9 Future Trends: AI and Beyond
Keywords
Deep learning · Artificial neural networks · Activation functions
In this final chapter, we conclude the book by exploring the opportunities and limitations of artificial intelligence (AI) in supply chain management for two reasons. First, AI has the potential to improve the efficiency, visibility, and predictability in supply chains. So, it is important to know the basic concepts of AI. Second, some practitioners believe that new developments in AI are so promising that they will completely revolutionize the supply chain management discipline (Lyall et al., 2018). While the challenges of supply chain management are unfortunately too big to be solved by AI, supply chain analytics experts should keep an eye on recent developments in AI because the field is growing fast, and it may offer some opportunities for supply chain management professionals. AI is a multidisciplinary field that lies at the intersection of computer science, neuroscience, operations research, mathematics, and statistics (Castellanos, 2018). Its roots date back to 1956, when the first AI workshop was organized at Dartmouth College. This event is considered the birth of AI as a new discipline (Castellanos, 2018). Despite its long history, it only became popular in the business world in the 2000s, and we have been experiencing an AI start-up boom since 2010. According to a Wall Street Journal article (Purnell & Olson, 2019), the total value of venture capital funding for AI start-ups increased more than fivefold globally between 2014 and 2018. The number of start-ups with the Internet domain “.ai” soared during the same time period among AI-based technology firms, even though “.ai” is the Internet country domain of Anguilla (Purnell & Olson, 2019), and not directly related to AI. When we look at the evolution of AI over the years, one question that may arise is “Why did it take so long for AI to become popular?” AI leverages computers to mimic humans in reasoning, learning, taking actions, and making decisions.
Therefore, the boundaries of AI are loosely defined because the limits of human behaviour and cognition are not tightly defined. We humans, for example, read a book, listen to music, play an instrument, do sports, build infrastructure, do math calculations, think, dream, learn, teach, etc. In the early stages of AI, researchers focused on mimicking humans in doing basic calculations. The first programmable calculators, invented in the 1960s (see footnote 1), automated some human calculations. AI's popularity then waned between the 1970s and 1990s (Castellanos, 2018). AI regained its popularity in the 2000s and surpassed its earlier peak in the 2010s owing to its success in image recognition, natural language processing (NLP), self-driving cars, etc. Naturally, the tasks that can be done by AI models are now much more complex than what could be done in the 1960s. The popularity and success of AI after the 2010s can be attributed to the deep learning revolution, which makes it possible for computers to learn from data in the same way that humans do (Metz, 2021). Given the evolution of AI, we argue that the definition of AI should be revisited. The current definition of using computers to mimic humans covers a large spectrum of methods and techniques. Suppose, for example, that a manufacturer determines inventory levels for a group of products in order to maximize profit. Traditionally, operations research models (e.g. linear programming) have been used to optimize inventory levels, which was done by humans before such models were developed. Because linear programming, for example, replaced the human decision-making process to better determine stock levels, it is considered an AI tool by the AI community, which puts it in the same pool as deep learning (see footnote 2). This, however, overstates the value of start-ups that use linear programming and sell it as an AI tool to investors that are increasingly interested in AI start-ups (Purnell & Olson, 2019).
To avoid such problems, we believe that AI should be defined as a discipline that teaches computers to mimic humans in reasoning, learning, taking actions, and making decisions by imitating how the human brain works. In the AI community, there is a deeper discussion about the definition of intelligence and to what extent AI tools have the intelligence component. We refer interested readers to a great MIT Technology Review article titled “We'll never have true AI without first understanding the brain” (Heaven, 2021). With this refined definition, we narrow the scope of AI and focus only on deep learning and artificial neural networks (ANNs). ANNs have become very popular during the last decade owing to technological advances that rely on them, such as image recognition and NLP. In other fields, both researchers and practitioners aim to understand the potential of AI to transform other aspects of our lives. Supply chain management is no different. In this chapter, we briefly go through ANNs and the related analytical aspects. Then, we present some simple examples to assess the performance of ANNs in supply chain management.
1 https://en.wikipedia.org/wiki/Calculator
2 https://deepai.org/machine-learning-glossary-and-terms/linear-programming
9.1 Artificial Neural Networks

Fig. 9.1 Neural network with input, hidden, and output layers
A neural network consists of an input layer, an output layer, and hidden layers. Figure 9.1 presents an example with an input layer, an output layer, and two hidden layers. The input nodes are \(X_1\) and \(X_2\); the output nodes are \(Y_1\) and \(Y_2\). When we specify the values of input nodes and feed the ANN with these values, it generates the output values.

Suppose, for example, an online retailer sells an electronic product to the US market. The demand for the product mainly depends on two key parameters: (1) price and (2) inventory level. This information is shared with the customers online. Therefore, the input nodes \(X_1\) and \(X_2\) denote the price and inventory level at the beginning of each day, respectively. The retailer is interested in predicting daily orders that are requested to be delivered with express- and standard-delivery options. Therefore, the output nodes \(Y_1\) and \(Y_2\) denote the total number of express-delivery and standard-delivery orders for each day, respectively. With this representation, the ANN can be considered a predictive analytics tool that returns the estimates of daily orders for a given set of inputs.

Customers have different utility levels for the same product. A customer's utility for a product determines the intrinsic value of the product to the customer. When the utility exceeds the price of the product, customers buy it; otherwise, they do not buy it. When the price is low, customers with both high- and low-utility levels would buy it. Those with a low utility would probably order the product with standard delivery because they do not need it urgently. When the price is high, those with a high utility would buy the product and would probably order the product with express delivery to receive it as soon as possible. Thus, the percentage of customers with high-utility levels is expected to increase with the price level. For that reason, the percentage of express-delivery orders is also expected to increase with the price level.
High-utility customers often place orders when the product is available in stock. Their decision to place an order is less affected by inventory level. However, low-utility customers may postpone their purchases, because they do not need the product immediately. Signalling that the product would be out-of-stock soon may induce these customers not to postpone their ordering decision any longer. Once they order the product, they are likely to choose the standard-delivery option. Thus, the percentage of customers with low-utility levels is expected to increase as the inventory depletes. For that reason, the percentage of standard-delivery orders is expected to increase as the inventory depletes.

The relationship between the input and output nodes can be highly complex, which may not be captured by linear statistical models. In the online retailer example, customers' reactions to inventory availability would differ, depending on the price level. Likewise, customers' reactions to price information would differ, depending on the inventory level. ANNs use hidden nodes to represent the complex relationships between the input and output nodes. In Fig. 9.1, there are two hidden layers and two hidden nodes for each layer. The first hidden layer consists of nodes \(H_1\) and \(H_2\). The second hidden layer has nodes \(H_3\) and \(H_4\). The output values for these nodes can be different from the input values, and they are determined by the activation function. The output values are denoted by \(G_i\) \(\forall i \in \{1, 2, 3, 4\}\) such that:
\[
G_1 = f_1(H_1), \quad G_2 = f_1(H_2), \quad G_3 = f_2(H_3), \quad G_4 = f_2(H_4),
\]
where $f_1(\cdot)$ and $f_2(\cdot)$ are activation functions of the first and second hidden layers, respectively. Historical data is used to train neural networks, such that the values of the input and output nodes are available in those datasets. The weights $W_i$, $\forall i \in \{1, \cdots, 12\}$, are estimated by training the model. The input and output nodes are connected through the weights. For the example given in Fig. 9.1, the following expressions show how to feed the neural network forward towards the output nodes:
\[
\begin{aligned}
H_1 &= X_1 W_1 + X_2 W_3, & G_1 &= f_1(H_1), \\
H_2 &= X_1 W_2 + X_2 W_4, & G_2 &= f_1(H_2), \\
H_3 &= G_1 W_5 + G_2 W_7, & G_3 &= f_2(H_3), \\
H_4 &= G_1 W_6 + G_2 W_8, & G_4 &= f_2(H_4), \\
Y_1 &= G_3 W_9 + G_4 W_{11}, \\
Y_2 &= G_3 W_{10} + G_4 W_{12}.
\end{aligned}
\]
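The feed-forward expressions above can be traced in a few lines of Python. This is only a minimal sketch: the weight values are hypothetical (the text does not give numbers for $W_1, \dots, W_{12}$), and ReLu is assumed for both $f_1$ and $f_2$ purely for illustration.

```python
def relu(x):
    """Rectified linear unit, used here as a stand-in activation function."""
    return max(0.0, x)


def feed_forward(x1, x2, w, f1=relu, f2=relu):
    """Feed the network of Fig. 9.1 forward; w maps the index i to the weight W_i."""
    # First hidden layer
    h1 = x1 * w[1] + x2 * w[3]
    h2 = x1 * w[2] + x2 * w[4]
    g1, g2 = f1(h1), f1(h2)
    # Second hidden layer
    h3 = g1 * w[5] + g2 * w[7]
    h4 = g1 * w[6] + g2 * w[8]
    g3, g4 = f2(h3), f2(h4)
    # Output layer (no activation, as in the expressions for Y1 and Y2)
    y1 = g3 * w[9] + g4 * w[11]
    y2 = g3 * w[10] + g4 * w[12]
    return y1, y2


# Hypothetical weights W_1..W_12, chosen only to keep the arithmetic easy to follow.
weights = {i: 0.1 * i for i in range(1, 13)}
y1, y2 = feed_forward(1.0, 2.0, weights)
print(y1, y2)
```

With these made-up inputs and weights, the two outputs come out at roughly 2.287 and 2.514. Swapping in a different activation only requires passing another function for `f1` or `f2`.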
Once the weights are determined, it is possible to predict the values of output nodes for a given set of inputs. Unlike other statistical models, such as linear regression, the prediction cannot be carried out with a single linear model. To find the predicted values of $Y_1$ and $Y_2$, the model should be fed forward by using all the expressions from $H_1 = X_1 W_1 + X_2 W_3$ to $Y_2 = G_3 W_{10} + G_4 W_{12}$. The number of layers determines the depth of the model, which is equal to four in Fig. 9.1. The number of nodes at each layer determines the breadth of the given layer, which is equal to two for all the layers in Fig. 9.1. Increasing the number of layers helps to improve the accuracy of ANNs, because the model does a good job of fitting the data as the number of layers between input and output nodes increases. As
the number of layers increases, however, it is difficult to understand how connections between the input and output nodes are made. Due to the lack of visibility, ANNs are sometimes considered “black box” methods by practitioners (McCormick, 2021). The improvement in model accuracy due to an increase in the number of layers also causes an overfitting problem (Hastie et al., 2001). Researchers are often concerned about overfitting when the performance of the ANN on a training set is too good to be true. When data is stationary such that the future states and dynamics of the problem setting will be the same as the past states and dynamics, an overfitting problem would not have much effect on the robustness and consistency of ANNs. However, the dynamics of real-world problems are often non-stationary and evolutionary. As conditions change over time, the performance of an overfitted model would deteriorate. This makes the model unreliable and inconsistent. The overfitting problem can be avoided, if the right number of nodes is included in ANNs and relationships among the nodes are constructed properly. Research on causality and causal networks has long aimed to address these problems in networks (Pearl, 1988). Coupled with these efforts, regularization approaches, such as the Lasso and Ridge regression, also help alleviate overfitting problems in ANNs. Even when networks are properly constructed with the right number of input and output nodes, the estimation of weights can be biased due to some irrelevant assumptions in the learning algorithm. This would cause an underspecification problem, such that different combinations of weights would fit the data very well, making it impossible for the learning algorithm to find the correct values for the weights. According to an article that appeared in the MIT Technology Review (Heaven, 2020), underspecification is regarded as one of the most important problems by leading researchers in the field. 
When the current and future states of the setting in which AI is implemented are similar to past states, the negative impact of underspecification would be limited. However, market dynamics in supply chain problems are often evolutionary in that customer preferences and demand levels for different products may change suddenly. In such cases, the underspecification problem can cause substantial inefficiencies in supply chain management practices. Unless the underspecification problem is addressed rigorously, both practitioners and researchers should approach AI cautiously when it is applied to supply chain problems in a dynamic and evolutionary fashion.
9.2 Activation Functions
In ANNs, it is common to assign activation functions to hidden layers. An activation function makes it possible to change the value within a hidden node based on a functional form. In Fig. 9.1, for example, the entering value $H_1$ is changed to $G_1$ in the first hidden node based on activation function $f_1(H_1) = G_1$. Activation functions make ANNs more flexible in terms of fitting data, which in turn improves model accuracy. Activation functions should be specified before training the model. There are five commonly used activation functions. The first one is the binary activation function that returns either zero or one by comparing the input value with
a threshold:
\[
f(x) = \begin{cases} 0 & \text{if } x < c, \\ 1 & \text{otherwise.} \end{cases}
\]
The binary activation function has proven to be useful for basic classification problems. In the supply chain framework presented in the first chapter, the cash conversion cycle determines the financial needs of companies. If it has a positive value (for example, upstream suppliers often have long cash conversion cycles), companies would use a line of credit to finance their operations. Banks charge companies an interest rate for using a line of credit. In some cases, the cash conversion cycle can be negative (for example, retailers often have a negative cash conversion cycle). Such companies may earn interest by depositing their extra cash in a bank. A company would probably be a debtor when the cash conversion cycle is positive; otherwise, it would be a creditor. A neural network that uses the cash conversion cycle to determine corporate tax debt would probably need a binary activation function with a threshold value of $c = 0$ because being a debtor or creditor has a direct impact on corporate tax levels.

Another activation function is a linear function, which returns values according to the following functional form:
\[
f(x) = ax,
\]
where $a$ is a constant. In supply chain problems, the cost function is often defined as a linear function, because sellers often charge buyers on a cost-per-unit basis. If we consider $a$ in this function as the unit cost, the linear function $f(x) = ax$ gives the total cost for a given quantity level. The third commonly used activation function is a rectified linear unit (ReLu), which limits the output to non-negative values:
\[
f(x) = \max(0, x).
\]
ReLu is very important for modeling the interaction between demand and inventory. The amount of net sales directly depends on demand and inventory levels. If demand is less than inventory, net sales becomes equal to demand, and excess inventory is kept in stock. Otherwise, net sales becomes equal to inventory, and excess demand is lost. Therefore,
\[
\text{Net Sales} = \min(D, Q),
\]
where $D$ and $Q$ denote demand and inventory levels, respectively. This expression can be rewritten using ReLu as follows:
\[
\text{Net Sales} = \min(D, Q) = Q + \min(D - Q, 0) = Q - \max(Q - D, 0) = Q - \text{ReLu}(Q - D).
\]
Therefore, ReLu is very useful for representing sales, given inventory and demand values.

The following two activation functions are used for choice models. In a choice model, the output returns the probability that a customer will choose a product offered. Let $x$ denote the utility of a product for a customer. If only one product is offered to customers, the decision is simply a buy vs. no-buy decision. This binary decision is modeled with the sigmoid function:
\[
f(x) = \frac{1}{1 + e^{-x}},
\]
which is referred to as the binary choice model in the marketing literature. If more than one product is offered to customers, the decision to choose product $i$ among $n$ different alternatives depends not only on the input values of product $i$ but also on the other products. In this case, the softmax function is used to represent the probability that product $i$ is chosen:
\[
f(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}},
\]
which is referred to as the multinomial choice model in the marketing literature.
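The five activation functions discussed in this section can be written compactly in plain Python, as sketched below. The function names and default parameter values are illustrative choices, and the softmax is written in its standard form, with $e^{x_i}$ in the numerator, so that a higher utility yields a higher choice probability. The last function checks the net sales identity $\min(D, Q) = Q - \text{ReLu}(Q - D)$ numerically.

```python
import math


def binary(x, c=0.0):
    """Binary activation: 0 below the threshold c, 1 otherwise."""
    return 0 if x < c else 1


def linear(x, a=1.0):
    """Linear activation f(x) = a * x."""
    return a * x


def relu(x):
    """Rectified linear unit f(x) = max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Binary choice model: probability of buying given utility x."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(utilities):
    """Multinomial choice model: one probability per alternative."""
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def net_sales(q, d):
    """Net sales written with ReLu: Q - ReLu(Q - D)."""
    return q - relu(q - d)


# The ReLu form agrees with min(D, Q) whether demand exceeds inventory or not.
for q, d in [(50, 70), (50, 20)]:
    assert net_sales(q, d) == min(q, d)

probs = softmax([1.0, 2.0, 3.0])  # hypothetical utilities of three products
print(probs, sum(probs))
```

The probabilities returned by `softmax` sum to one, and the product with the highest utility receives the largest share.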
9.3 Model Training
Model training is the process of estimating the weights of ANNs. The most popular method for training an ANN is the backpropagation method, which is based on the gradient descent approach. Backpropagation can be conceptually carried out in two steps. The first step is to formulate the gradient of the loss function with respect to weight vectors. The second step is to find the values of the weights using the gradient descent method. Loss functions in ANNs can be defined in different ways, depending on the preferences of decision makers. For example, the sum of squared deviations between the true values of an output and its estimates can be used as a loss function. Training an ANN is the process of estimating its weights in a way that minimizes the value of the loss function. When an ANN has many layers, many different combinations of weights can minimize the loss function equally well, and the model may overfit the data in such a case. To avoid overfitting problems, a regularization term is often included in loss functions.

Fig. 9.2 Hierarchical representation of a neural network

Figure 9.2 depicts the hierarchical structure of an ANN. Level 1 includes two output nodes $y_1$ and $y_2$. Level 2 consists of the weights that connect the hidden and output layers. The next level has four hidden nodes. The fourth level has the weights that connect the input layer with the hidden layer. Let $L(\mathbf{Y})$ denote the loss function that depends on the vector $\mathbf{Y} = (y_1, y_2)$. We also use $\mathbf{W}_2$, $\mathbf{H}_3$, and $\mathbf{W}_4$ to denote the vectors of instances that appear in Levels 2, 3, and 4 in Fig. 9.2, respectively. Then, the gradient of the loss function with respect to the weight vectors is:
\[
\nabla L = \begin{bmatrix} \partial L / \partial \mathbf{W}_2 \\ \partial L / \partial \mathbf{W}_4 \end{bmatrix}, \quad
\frac{\partial L}{\partial \mathbf{W}_2} = \frac{\partial L}{\partial \mathbf{Y}} \frac{\partial \mathbf{Y}}{\partial \mathbf{W}_2}, \quad \text{and} \quad
\frac{\partial L}{\partial \mathbf{W}_4} = \frac{\partial L}{\partial \mathbf{Y}} \frac{\partial \mathbf{Y}}{\partial \mathbf{H}_3} \frac{\partial \mathbf{H}_3}{\partial \mathbf{W}_4}.
\]
After formalizing the gradient function and its dependencies, the weights are estimated using the gradient descent method, which is a numerical optimization method discussed in Chap. 2.
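The two steps of backpropagation can be illustrated on a deliberately small network. The sketch below is not the network of Fig. 9.2: it assumes two inputs, one hidden layer with two sigmoid nodes, a single output, a squared-error loss on one hypothetical data point, and an arbitrary learning rate. It first forms the gradient with the chain rule and then applies plain gradient descent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_in, w_out):
    """w_in[j][k]: weight from input k to hidden node j; w_out[j]: hidden node j to output."""
    h = [sum(w_in[j][k] * x[k] for k in range(2)) for j in range(2)]
    g = [sigmoid(v) for v in h]
    y = sum(w_out[j] * g[j] for j in range(2))
    return g, y


def loss(x, target, w_in, w_out):
    """Squared deviation between the prediction and the true value."""
    _, y = forward(x, w_in, w_out)
    return (y - target) ** 2


def gradients(x, target, w_in, w_out):
    """Step 1: chain rule.
    dL/dW_out = dL/dy * dy/dW_out
    dL/dW_in  = dL/dy * dy/dg * dg/dh * dh/dW_in
    """
    g, y = forward(x, w_in, w_out)
    dl_dy = 2.0 * (y - target)
    grad_out = [dl_dy * g[j] for j in range(2)]
    grad_in = [[dl_dy * w_out[j] * g[j] * (1.0 - g[j]) * x[k] for k in range(2)]
               for j in range(2)]
    return grad_in, grad_out


# Hypothetical data point and starting weights.
x, target = [0.5, -1.0], 0.8
w_in = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.5, -0.6]

initial_loss = loss(x, target, w_in, w_out)
step = 0.1  # learning rate (assumed)
for _ in range(200):
    # Step 2: move every weight against its gradient.
    grad_in, grad_out = gradients(x, target, w_in, w_out)
    w_in = [[w_in[j][k] - step * grad_in[j][k] for k in range(2)] for j in range(2)]
    w_out = [w_out[j] - step * grad_out[j] for j in range(2)]
final_loss = loss(x, target, w_in, w_out)
print(initial_loss, final_loss)
```

A useful sanity check for any hand-written gradient is to compare it with a finite-difference approximation of the loss before relying on it for training.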
9.4 ANNs in Inventory Management
In supply chain settings, ReLu is useful for capturing the inventory dynamics. For inventory management problems under demand uncertainty, decision-makers have to determine the optimal inventory level that maximizes profit. Recall that $Q$ and $D$ denote the inventory level and demand values, respectively. If $Q$ exceeds $D$, the excess inventory of $Q - D$ units is salvaged. If $Q$ falls short of $D$, there is an opportunity cost of not fulfilling $D - Q$ units of excess demand. Therefore, net sales
Fig. 9.3 Inventory example: neural network representation with three layers. ReLu is chosen as the activation function for the hidden layer
Fig. 9.4 Feeding the neural network with inputs $Q = 50$ and $D = 70$ with ReLu as the activation function of the hidden layer
is equal to $\min(Q, D)$. An alternative way to formulate net sales is:
\[
\text{Net Sales} = Q - \max(0, Q - D).
\]
Therefore, a neural network that relates inventory level and demand to net sales can be constructed as in Fig. 9.3. Suppose that demand for a product is random and uniformly distributed between 20 and 70 units. The inventory level is equal to 50 units. If demand turns out to be 70 units, net sales becomes equal to 50. With these inputs, the neural network that has the weights and ReLu as the activation function returns the net sales volume as given in Fig. 9.4. In this example, decision-makers lose the opportunity to fulfil the excess demand of $D - Q = 20$ units. Now, let’s consider an alternative case such that demand turns out to be lower than the inventory level, and there is excess inventory that has to be salvaged at the end. The input parameters are set as follows: $Q = 50$ and $D = 20$. In this case, the neural network returns 20 units of net sales as shown in Fig. 9.5. We have randomly simulated 1000 instances of inventory and demand values and calculated the net sales for each pair of $Q$ and $D$. Then, the model is trained using the gradient descent-based backpropagation algorithm, and the weights are estimated as in Fig. 9.6. We assume that demand is distributed uniformly between 20 and 70 units. For each demand value, we randomly draw an inventory value of
Fig. 9.5 Feeding the neural network with inputs $Q = 50$ and $D = 20$ with ReLu as the activation function of the hidden layer
Fig. 9.6 Estimated model based on 1000 instances generated randomly. The estimated weights are significantly different from their true values given in Fig. 9.3
between 20 and 70 units—that is, the inventory level is randomly drawn from a uniform distribution of between 20 and 70 units—and then calculate the net sales. Comparing the estimated weights given in Fig. 9.6 with true values in Fig. 9.3, we observe that there is substantial deviation between the true and estimated values. In the online web application, we provide a direct comparison between true and estimated models, where demand is assumed to follow a normal distribution. The tool estimates the weights given the parameters of normal distribution and inventory level. A screenshot of this tool is given in Fig. 9.7. Python codes are also available for readers who are interested in coding. The output of the web application also demonstrates that there is a substantial difference between true and estimated weights under the normal distribution. The poor performance of ANNs in finding true values of weights can be attributed to the underspecification problem, because AI models can be constructed in many different ways and many models can do a good job of fitting the data. Therefore, as discussed above, the specification of the right model may not be possible in most cases (Heaven, 2020).
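One simple way to see why estimated weights need not match the true ones is that different weight sets can produce exactly the same net sales. The sketch below is an illustration of that point, not a reproduction of the training experiment behind Fig. 9.6. The layout of the network (two hidden ReLu nodes carrying $Q$ and $Q - D$) and the rescaling factor are assumptions, since the weights printed in Fig. 9.3 are not reproduced in the text.

```python
import random


def relu(x):
    return max(0.0, x)


def net_sales_network(q, d, w_hidden, w_output):
    """Two inputs (Q, D), two hidden ReLu nodes, one output node.
    w_hidden[j] = (weight on Q, weight on D) for hidden node j."""
    hidden = [relu(wq * q + wd * d) for wq, wd in w_hidden]
    return sum(w * g for w, g in zip(w_output, hidden))


# One weight set that reproduces Net Sales = Q - ReLu(Q - D) exactly (assumed layout).
true_hidden = [(1.0, 0.0), (1.0, -1.0)]  # hidden nodes carry Q and Q - D
true_output = [1.0, -1.0]

# A rescaled set: hidden weights multiplied by c > 0, output weights divided by c.
c = 4.0
alt_hidden = [(c * wq, c * wd) for wq, wd in true_hidden]
alt_output = [w / c for w in true_output]

print(net_sales_network(50, 70, true_hidden, true_output))  # demand exceeds inventory
print(net_sales_network(50, 20, true_hidden, true_output))  # inventory exceeds demand

# 1000 simulated (Q, D) pairs, both drawn uniformly between 20 and 70.
random.seed(0)
sample = [(random.uniform(20, 70), random.uniform(20, 70)) for _ in range(1000)]
max_gap = max(
    abs(net_sales_network(q, d, true_hidden, true_output)
        - net_sales_network(q, d, alt_hidden, alt_output))
    for q, d in sample
)
print(max_gap)
```

Both weight sets return 50 units for $Q = 50$, $D = 70$ and 20 units for $Q = 50$, $D = 20$, and they agree on every simulated pair. A learning algorithm that only sees the data therefore has no basis for preferring one set over the other.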
Fig. 9.7 The web application showing the estimated weights given distributional parameters of demand and inventory level
9.5 Chapter Summary
We have illustrated that ANNs perform poorly even for a basic form of inventory management due to the underspecification problem. For more complex problems with sequential decisions, the performance of ANNs may be even worse. On other fronts, AI is still very powerful if it is mainly used for predictive analytics. Therefore, the potential of AI can be realized in some settings, where predictive analytics is carried out separately before prescriptive analytics. However, the potential of supply chain analytics is unlocked when predictive and prescriptive analytics are well integrated. Because the implementation of AI in supply chains would require companies to separate predictive analytics from prescriptive analytics, focusing on AI in its current form would potentially mislead companies and result in silos between demand management and inventory planning.

Proponents of AI may argue that AI tools can help increase forecast accuracy substantially. But this rarely happens for products with volatile demand. For example, many companies that utilized complex forecasting models failed to predict demand peaks during the COVID pandemic’s lockdowns. Most customers simply postponed their purchases during the lockdowns, which caused demand peaks after stores reopened. So, an effective strategy would be to first model customers’ behaviour and then aggregate this information to estimate demand. However, ANNs or other forecasting tools do not rely on an explicit modelling of customer behaviour. Instead, they aim to estimate demand using some explanatory variables. For that reason, in some cases human judgment may still perform better than forecasting models when estimating demand. We argue that the future of AI in SCM directly depends on its effectiveness in addressing the three challenges of modern supply chains discussed in the first chapter.
The first and second challenges—(1) sequential decision-making and dynamic action space and (2) demand uncertainty and its evolutionary dynamics—
are directly related to the planning of complex actions in sequence. Unfortunately, AI is not effective in addressing these challenges. According to Yann LeCun, one of the three pioneers of deep learning along with Geoffrey Hinton and Yoshua Bengio, planning complex action sequences is one of the important challenges of bringing AI to the next level (Council, 2020). In my opinion, AI will address this problem in the future. Once this challenge is addressed, artificially intelligent supply chains would not be just a dream.
References
Castellanos, S. (2018). What exactly is artificial intelligence? Wall Street Journal. https://www.wsj.com/articles/what-exactly-is-artificial-intelligence-1544120887
Council, J. (2020). The future of deep learning is unsupervised, AI pioneers say. Wall Street Journal. https://www.wsj.com/articles/the-future-of-deep-learning-is-unsupervised-ai-pioneers-say-11581330600
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning (2nd ed.). Springer Series in Statistics.
Heaven, W. D. (2020). The way we train AI is fundamentally flawed. MIT Technology Review.
Heaven, W. D. (2021). We’ll never have true AI without first understanding the brain. MIT Technology Review. https://www.technologyreview.com/2021/03/03/1020247/artificial-intelligence-brain-neuroscience-jeff-hawkins/
Lyall, A., Mercier, P., & Gstettner, S. (2018). The death of supply chain management. Harvard Business Review. https://hbr.org/2018/06/the-death-of-supply-chain-management
McCormick, J. (2021). Pinterest’s use of AI drives growth. Wall Street Journal. https://www.wsj.com/articles/pinterest-ai-growth-11621604558
Metz, C. (2021). Genius makers: The mavericks who brought AI to Google, Facebook, and the World. Penguin Random House LLC.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann.
Purnell, N., & Olson, P. (2019). AI startup boom raises questions of exaggerated tech savvy. Wall Street Journal. https://www.wsj.com/articles/ai-startup-boom-raises-questions-of-exaggerated-tech-savvy-11565775004
A Introduction to Python Programming for Supply Chain Analytics
Python is an open-source programming language, which is very popular for data science and web development. It was developed by Guido van Rossum in the Netherlands in the late 1980s. It has been consistently recognized as the top programming language for the last 5 years. IEEE Spectrum (IEEE stands for Institute of Electrical and Electronics Engineers) compares 55 different programming languages according to different metrics. The metrics range from the job prospects for the experts of the programming languages to the social media attention and the Google trends. Python consistently ranks at the top of the list.1 Compared to other programming languages such as Java, C# and C++, Python is easier to parse and read, making it possible to develop a program much faster in Python. Compared to statistics software such as R, Python is much better at data management and web development. Capitalizing on its superior performance in data management, Python offers computational advantages for matrix operations and optimization over R. That being said, there are more packages in R developed by researchers to solve statistical problems than in Python. For example, there is no package in Python for dynamic conditional correlation although it is very popular in finance for asset pricing and correlation analysis (Engle, 2002). Nevertheless, it is still possible to call R packages from Python by using rpy2, an R interface embedded in Python. Thus, Python developers benefit from its several advantages because they can also call other programming languages from Python if necessary. Python code is easier to parse than code in other programming languages owing to Python’s indentation structure. Python encourages developers to use explicit statements, rather than implicit and nested statements.
For example, to have a binary variable that takes zero if an input parameter is less than five and otherwise takes one, we define a function that assigns a binary value according to the input parameter. The Python screenshot for this simple example is given in Fig. A.1. As can be seen in the figure, Python uses an indentation structure rather than cluttering
1 https://spectrum.ieee.org/top-programming-languages/
Fig. A.1 Python example with the indentation structure
the code with curly brackets, so the function and the if statement are clear to the user.

In addition to data science, Python is often used for web development. There are alternative libraries, such as Django and Dash, that make it possible to develop Python-based web applications. Some famous organizations that have developed web applications in Python include Instagram, Spotify, and the Washington Post.2,3 Python’s effectiveness in data science and web development means that it can be used as a unified platform for supply chain projects that include both analytics and web development components.

Python can be downloaded from its website: https://www.python.org. However, it is much more convenient to use a navigator platform to download and use Python. Anaconda Navigator brings together different Python editors and documents. The individual edition is free and can be downloaded from the website: https://www.anaconda.com/products/individual. Once it is downloaded, the Anaconda Navigator shows alternative Python editors, such as Spyder and Jupyter, and the related documents. Figure A.2 shows a screenshot of the Anaconda page. Jupyter notebook is an effective tool for web-based computing that facilitates sharing Python applications in HTML format. Spyder, with its advanced debugging and editing flexibility, is useful for scientific applications in Python. Spyder should be downloaded onto the user’s computer from Anaconda, whereas there is no need to install Jupyter as it is web-based.

There are several libraries in Python such that each of them stores specific functions. The full list of the libraries available in the Anaconda Navigator is given at https://docs.anaconda.com/anaconda/packages/pkg-docs/. If a library has not already been installed in Python, the user can install it in the Jupyter environment. Once the Jupyter Notebook is launched from the Anaconda Navigator (see Fig. A.2), the user can create a new Python kernel (i.e. ipykernel).
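As a concrete version of the indentation example described for Fig. A.1, the following sketch defines a function that returns zero when the input is less than five and one otherwise. The function and parameter names are illustrative and may differ from those in the figure.

```python
def to_binary(value, threshold=5):
    """Return 0 if value is below the threshold and 1 otherwise."""
    if value < threshold:
        return 0
    else:
        return 1


print(to_binary(3), to_binary(5), to_binary(8))
```

The body of the function and the two branches of the if statement are delimited by indentation alone, with no curly brackets.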
2 https://en.wikipedia.org/wiki/Django_(web_framework)
3 https://engineering.atspotify.com/2013/03/20/how-we-use-python-at-spotify/
Fig. A.2 Anaconda Navigator page that includes the Python editors and the related documents
Fig. A.3 Importing libraries in Python
Then, the user has access to the Python programming environment shown in Fig. A.3. In an input cell, the installation can be done by entering the following command: “conda install rpy2”. We remark that rpy2 is the R interface embedded in Python, which is not available in the Anaconda Navigator. Therefore, the user must install it before using it. If a library has already been installed, the user must import it into Python to be able to use its functions. Figure A.3 shows the Python code for importing some libraries. It is possible to relabel the library using the keyword “as” at the end of each line to simplify coding in later stages.

Python has some built-in functions. Those functions are not part of any of the libraries, so it is not necessary to import any of the libraries to be able to use those functions. The list of the built-in functions is available at https://docs.python.org/3/library/functions.html. We recommend new users look at the built-in functions to become familiar with them. Some of them are math functions, such as abs(), max(), min(), sum(), round(), complex(), etc. To check the parameter type, the type() function can be used. The functions available in a library can be checked using the dir() routine. For example, the functions in the NumPy library can be listed by the dir(np) command after relabeling the NumPy library as np as shown in Fig. A.3. The other built-in functions that are worth mentioning are len(), list(), and range(). The len() function returns the number of the elements in a list. For example, we define a vector of four elements such that
x = [1, 2, 3, 4].
Then, len(x) returns four. The data type of x is a list, so that type(x) returns list. The list() function is useful for defining a vector and carrying out some operations. For instance, by using lists, appending new elements to a vector is straightforward. Let us define another list such that
y = [5, 6, 7, 8].
Fig. A.4 Range and list function in Python
Summing up x and y returns another list with eight elements:
x + y = [1, 2, 3, 4, 5, 6, 7, 8].
Instead of computing the vectorial addition, the summation of the two lists just appends new elements, which is sometimes very useful for data management purposes. We remark that the vectorial addition can be done after converting the data type from a list to a numpy.array as we will discuss later. The range() function generates a range of values that makes it practical to construct a loop. The data type of a variable generated by the range() function is a range, so the user needs to convert it to a list to see the elements as given in Fig. A.4. As shown in the figure, the z variable is defined as a range of values from 10 to 100, increasing by an increment of five. The range function does not include the last element, which explains why 100 is missing in z. When we call the z variable in the second line, it does not return all the elements explicitly. Therefore, we use the list function in the third line to see all the elements of z. Python starts indexing from zero unless the indices are specified. There are 18 elements of z, and the first element is called by z[0]. The last element is called by z[17]. Using the minus sign, the elements from the end of a list can be called. For instance, the last element of z can also be called by z[-1] as shown in the last line of the figure.

There are four popular libraries in Python: NumPy, SciPy, pandas, and matplotlib. In addition to those, there are other libraries that are used for AI and machine learning applications. The sklearn library is very popular for machine learning applications. Three Python libraries are commonly employed to address AI-related problems. They are MXNet (developed by Apache Foundation), PyTorch (developed by the Facebook AI team), and TensorFlow (developed by Google Brain). In this Appendix, we exclude the machine learning and AI-related topics and focus on the first four libraries. Then, we show some examples of general programming concepts in Python.
After reviewing this Appendix, we believe that
the reader will easily understand the programming aspects of our online web application.
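The built-in functions discussed above can be tried out directly. The following sketch uses the lists x and y and the range z from the text; the variable names `combined`, `elementwise`, and `z_list` are introduced here only for illustration.

```python
x = [1, 2, 3, 4]
y = [5, 6, 7, 8]

print(len(x))    # number of elements in x
print(type(x))   # the data type of x

combined = x + y                              # concatenation, not element-wise addition
elementwise = [a + b for a, b in zip(x, y)]   # element-wise sum in plain Python

z = range(10, 100, 5)   # 10, 15, ..., 95; the stop value 100 is excluded
z_list = list(z)        # convert to a list to see the elements

print(combined)
print(elementwise)
print(len(z_list), z[0], z[17], z[-1])
```

Indexing starts at zero, so z[17] and z[-1] both refer to the last of the 18 elements.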
A.1 NumPy
In supply chain analytics, it is a common practice for data to be handled in a matrix format. Apart from that, models are also developed in a matrix format. For example, the derivations of the demand regularization model in Chap. 4 are all in a matrix format. The matrix and array operations are not supported by the built-in functions in Python. NumPy is the main library in Python that includes the most important matrix and linear algebra functions. A matrix can be defined in Python by using the array() function of NumPy. Then, the matrix operations, such as getting the transpose of a matrix and showing its rank and its eigenvalues, can be performed by using other NumPy functions. The full list of functions and routines of this library can be observed by the dir() function. We also refer the reader to https://numpy.org/doc/stable/reference/ for a detailed explanation of each function. In Fig. A.5, we present how to define a vector and a matrix by using the NumPy arrays. We begin by importing the NumPy library into our Python environment and defining a vector. The type() function shows the type of input parameter, which is numpy.ndarray for those defined by using the np.array() function. Then, we define a matrix having two rows and four columns. This matrix is a_matrix in the figure. If we use double brackets, it returns one element of the matrix. For example, a_matrix[0][0] returns the element in the first row and first column, which is equal to 1. Similarly, a_matrix[0][1] returns the element in the first row and second column, which is equal to 2. If a single bracket is used, it returns the entire row. For example, a_matrix[0] returns the first row entirely. Likewise, a_matrix[1] returns the second row of the matrix. If the user would like to obtain one of the columns entirely, the transpose() function should be used with a single bracket. For instance, a_matrix.transpose()[1] returns
Fig. A.5 Basic matrix calculations with NumPy
Fig. A.6 Some functions of the numpy.linalg sub-library
the second column of the matrix. When a matrix having two rows and four columns (i.e. a_matrix) is multiplied with another matrix having four rows and two columns (b_matrix), the result should be a matrix having two rows and two columns. The NumPy function dot() makes it possible to multiply two such matrices as shown in the figure.

There are three important sub-libraries of NumPy. The first one is numpy.fft, which includes fast Fourier transform functions. In the online web application, we have provided an example in Chap. 4 including the fast Fourier transform component. So, the reader is referred to the online web application for the Python codes or to https://numpy.org/doc/stable/reference/routines.fft.html for more information about this sub-library. The second one is numpy.linalg, which includes the linear algebra routines in Python (https://numpy.org/doc/stable/reference/routines.linalg.html). The functions related to matrix factorization and decomposition, solving the system of equations and taking the inverse of a matrix can be found in the numpy.linalg sub-library. In Fig. A.6, we show some important functions of the numpy.linalg sub-library. In the figure, we first define a square matrix having a matrix rank of 2. Therefore, the inverse of the matrix exists. The second line of the codes checks the rank of the matrix. Then, we can take the inverse of the matrix by using the numpy.linalg.inv() function. In the demand regularization example, we recall that we have used numpy.linalg.pinv(), instead of numpy.linalg.inv(). For large matrices with some values close to 0, the computation of the inverse matrix may not be reliable. In such cases, the pseudo-inverse matrix that is based on the singular-value decomposition would be more reliable, and this is why we have chosen to use numpy.linalg.pinv(). The last two input lines in the codes show how to obtain the eigenvalues and solve the system of equations. To solve the system of equations, we need to define a
Fig. A.7 Some functions of the numpy.random sub-library
vector with two values. Then, the numpy.linalg.solve() function solves the following two equations with two unknowns:
\[
x_1 - 2x_2 = -1, \qquad 3x_2 = 3,
\]
for which the $x_1$ and $x_2$ values that solve these equations are $[x_1, x_2] = [1, 1]$.

The third sub-library is numpy.random, which includes the functions that generate random values from a statistical distribution. The list of statistical distributions supported by this sub-library is given at https://numpy.org/doc/stable/reference/random/generator.html. In Fig. A.7, we show four examples of generating random numbers with different distributions. The two most popular statistical distributions in supply chain analytics are the normal and lognormal distributions. We also include two other examples with the exponential and multinomial distributions in Fig. A.7. Following the rules of writing each of the functions given on the webpage, the user would be able to generate different sets of arrays and use those values in supply chain applications.
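The NumPy operations described in this section can be sketched as follows. The code is an approximation of what Figs. A.5 to A.7 illustrate rather than a copy of them: only the first two entries of a_matrix are stated in the text, so the remaining entries, b_matrix, the seed, and all distribution parameters are made up for illustration. The square matrix A is inferred from the two equations given above.

```python
import numpy as np

# A 2 x 4 matrix; only the entries 1 and 2 in the first row come from the text.
a_matrix = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8]])
b_matrix = np.ones((4, 2))                 # hypothetical 4 x 2 matrix

first_row = a_matrix[0]                    # single bracket: entire row
second_column = a_matrix.transpose()[1]    # same values as a_matrix[:, 1]
product = np.dot(a_matrix, b_matrix)       # 2 x 2 result

# Coefficients of x1 - 2*x2 = -1 and 3*x2 = 3.
A = np.array([[1.0, -2.0],
              [0.0, 3.0]])
b = np.array([-1.0, 3.0])

rank = np.linalg.matrix_rank(A)
A_inv = np.linalg.inv(A)
A_pinv = np.linalg.pinv(A)                 # pseudo-inverse based on the SVD
eigenvalues = np.linalg.eigvals(A)
solution = np.linalg.solve(A, b)

# Random draws with the Generator interface; parameter values are illustrative.
rng = np.random.default_rng(seed=42)
normal_draws = rng.normal(loc=100, scale=20, size=5)
lognormal_draws = rng.lognormal(mean=4.0, sigma=0.5, size=5)
exponential_draws = rng.exponential(scale=2.0, size=5)
multinomial_counts = rng.multinomial(n=10, pvals=[0.2, 0.3, 0.5])

print(product)
print(rank, eigenvalues, solution)
```

For a well-conditioned square matrix such as A, inv() and pinv() return the same result up to rounding; the difference only matters for matrices that are singular or close to singular.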
A.2 SciPy
Despite its effectiveness in matrix operations and linear algebra, NumPy has some limitations that prevent it from supporting some advanced computing and statistical modelling functions. In Chap. 3, for example, we have discussed that the cumulative distribution and probability density functions appear in the derivations of the optimal ordering policy for different inventory models. The statistics routines of NumPy (https://numpy.org/doc/stable/reference/routines.statistics.html) do not include the cumulative and density functions from a specific statistical distribution, such as the normal and lognormal distributions. Nevertheless, such functions are available in
the SciPy library. The full list of SciPy functions and sub-libraries is available at https://scipy.org. SciPy also has a linear algebra sub-library (scipy.linalg) similar to numpy.linalg, and there is an overlap between the SciPy and NumPy functions. For example, both NumPy and SciPy have functions for applying the eigendecomposition to a square matrix or for solving a system of equations. On the one hand, SciPy includes some matrix-decomposition routines (e.g. LU decomposition) that are not available in NumPy. On the other hand, it is argued that some NumPy functions are more flexible than their SciPy counterparts (see footnote 4). For that reason, we recommend that the reader check the NumPy and SciPy references carefully when there is an overlap in the functions that are intended to be used. In supply chain analytics, the most important SciPy sub-library is the one that includes the statistics functions: scipy.stats. Suppose that we are interested in calculating the in-stock probability when the order quantity is 140 units and the demand follows a normal distribution with a mean of 100 units and a standard deviation of 20 units. Then, we use the following command to find the in-stock probability:

    scipy.stats.norm.cdf(140, 100, 20)

If we are interested in finding the order quantity for a critical fractile of 97.5%, we should call the following function:

    scipy.stats.norm.ppf(0.975, 100, 20)

The probability density for a given x value is calculated by the following:

    scipy.stats.norm.pdf(x, 100, 20)

The statistics functions are kept in a standard format in SciPy:

    scipy.stats.{distribution}.{function}
The list of the distributions is given at https://scipy.github.io/devdocs/reference/stats.html. There are several functions supported by SciPy. A non-exhaustive list includes the following:
• pdf → Probability density function.
• cdf → Cumulative distribution function.
• ppf → Inverse function (the inverse of cdf).
• sf → Survival function, which is equal to $1 - \text{cdf}$.
• isf → Inverse survival function.
4 https://numpy.org/doc/stable/reference/routines.linalg.html.
Fig. A.8 Statistics functions in SciPy
In Fig. A.8, we present some of the statistics functions in SciPy. We start by importing the SciPy library. Next, we import the stats sub-library from SciPy. The figure shows some applications for a normal distribution. The second input line (i.e. In [11] in the figure) returns the probability that the demand turns out to be less than or equal to 140 units when it follows a normal distribution with a mean of 100 units and a standard deviation of 20 units. The survival function in the third input line (i.e. In [12]) gives the probability that the demand turns out to be greater than 140 units under the same distribution. Subtracting its value from 1 returns the cumulative distribution value, which is why we obtain the same results in the Out [11] and Out [12] lines. The fourth and fifth input lines (In [13] and In [14]) are related to the inverse functions. The fourth input returns the inverse of the cumulative distribution function, whereas the fifth input returns the inverse of the survival function. As we enter $1 - 0.975$ in the fifth input, we obtain the same results in the Out [13] and Out [14] lines. The final input is related to the probability density function. The expected profit calculation for the normal distribution explicitly includes the probability density term for the standard normal distribution, so we often use the density function in supply chain analytics applications. The density function is also useful for plotting statistical distributions.
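The five scipy.stats functions discussed above can be tried with the short sketch below. It uses the demand parameters from the text (a mean of 100 and a standard deviation of 20); the variable names and the x value chosen for the density are our own assumptions, and the listing is not a copy of Fig. A.8.

```python
from scipy import stats

mean, std = 100, 20      # demand parameters used in the text
order_quantity = 140

in_stock = stats.norm.cdf(order_quantity, mean, std)      # P(demand <= 140)
stockout = stats.norm.sf(order_quantity, mean, std)       # P(demand > 140)
quantity_from_cdf = stats.norm.ppf(0.975, mean, std)      # inverse of cdf
quantity_from_sf = stats.norm.isf(1 - 0.975, mean, std)   # inverse of sf
density_at_mean = stats.norm.pdf(mean, mean, std)         # density at x = 100 (x chosen for illustration)

print(in_stock, 1 - stockout)               # the two values agree
print(quantity_from_cdf, quantity_from_sf)  # the two values agree
print(density_at_mean)
```

Because the order quantity of 140 units lies two standard deviations above the mean, the in-stock probability is approximately 0.977, and the order quantity for a critical fractile of 97.5% is approximately 139.2 units.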
A.3 Pandas
Another important Python library is pandas, which is popular for data management. Pandas allows users to upload and download data, create pivot tables, handle missing data, etc. The complete list of all pandas functions is available at http://pandas.pydata.org/docs/reference/. Using the pandas library, data from a variety of sources can be uploaded to the Python platform. For example, the routine pandas.read_excel() makes it possible to upload data from an MS Excel spreadsheet (https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html). Once data is
Fig. A.9 Uploading a data file to the Python platform with pandas
uploaded to the Python platform, its type becomes a pandas data frame. Each column of a data frame is called a data series. In Fig. A.9, we provide the Python code to upload an MS Excel file to Python using the pandas functions. Once we upload the data, a small subset of the data can be shown using the DataFrame.head() function. Our dataset includes 5 columns and 12 rows, and the DataFrame.head() function returns the first five rows of all the columns. Next, we show the type of the data uploaded to Python, which is a data frame. Several pandas routines can be applied to a data frame, as listed at https://pandas.pydata.org/docs/reference/frame.html. The full list of the routines that can be applied to a pandas series is given at https://pandas.pydata.org/docs/reference/series.html. We can also create a unique identifier for each row, which is known as “indexing” in the database management literature. In Fig. A.10, we create a new column using the range function. Then, we make this column the index of the data frame. Some data tables have unique identification numbers such as the transaction ID. If the user already knows that one of the columns in the data frame is a unique identification number, this column can be set as the index. Otherwise, indexing is done automatically by pandas. In Fig. A.11, we first remove the newly added column and bring the data back into its original format. To this end, we use the reset_index() function and then the drop() function by specifying the column name “Index Column”. This removes the previously created index column. On the left side of “Column 1”, we observe the integer values starting at 0 and increasing by 1. These numbers indicate the automatic indexing of pandas. To select a specific row, we can use the iloc[] routine as shown in the figure; iloc[] selects rows by integer position, which coincides with the automatic index here. Comparing the results of Fig. A.11 with those of Fig. A.10, we observe that the final outputs are the same in both figures.
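The indexing steps described above can be sketched as follows. Since the Excel file behind Figs. A.9 to A.11 is not reproduced here, we build a hypothetical data frame with the same shape (12 rows, 5 columns, two missing cells); the cell values, the positions of the missing cells, and the start of the range are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel file: 12 rows, 5 columns, two missing cells.
data = {f"Column {k}": [float(k * r) for r in range(1, 13)] for k in range(1, 6)}
df = pd.DataFrame(data)
df.loc[3, "Column 1"] = np.nan
df.loc[7, "Column 5"] = np.nan

print(df.head())   # first five rows of all the columns
print(type(df))    # the uploaded object is a pandas data frame

# Create an identifier column with range() and make it the index.
df["Index Column"] = range(len(df))
indexed = df.set_index("Index Column")

# Undo the step: move the index back to a column, then drop that column.
restored = indexed.reset_index().drop(columns="Index Column")

# Select the row at integer position 3 in both versions.
row_indexed = indexed.iloc[3]
row_restored = restored.iloc[3]
print(row_restored)
```

With a file on disk, the first block would be replaced by a call such as pandas.read_excel() with the path of the spreadsheet.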
Fig. A.10 Indexing with the pandas
Fig. A.11 Automatic indexing feature of the pandas
Fig. A.12 Creating the pivot tables with the pandas
In our example, some cells have missing values. For example, the final output in Fig. A.11 includes two missing values: one in Column 1 and the other in Column 5. The missing values can be checked by using the pandas.DataFrame.isna() function. Then, the pandas.DataFrame.fillna() function fills in the missing values. If the user would like to replace the missing values with specific values, the values should be written in parentheses. If the user would like to replace the missing values with the previous values in their columns, she must add method='ffill' in parentheses. Likewise, the user should select method='bfill' if the missing values should be replaced by the next values in their columns. (Recent pandas versions also provide the dedicated DataFrame.ffill() and DataFrame.bfill() functions for these two cases.) Pivot tables, which are very helpful for aggregating data, can also be created using the pandas library. In Fig. A.12, we first replace the missing values with the previous values in their columns. Then, we create a new column such that its value is equal to 0 for the first six elements and 1 for the last six elements. We then obtain the first pivot table by using the pandas.DataFrame.pivot_table() function. To get a pivot table, the data source should be selected together with the columns included in the pivot table. Next, the index column must be specified. The pivot_table() function simply applies the aggregate function (i.e. aggfunc) to the selected columns for each distinct value of the index column. The index in our example is the new category column that can take either 0 or 1. The first pivot table shows the average values of the columns from 1 to 5 depending on the new category value. In the second pivot table, we change the function from numpy.mean to numpy.sum and calculate the sum of the values for each column depending on the new category value. Data management in Python has several aspects, and pandas is a comprehensive library that helps users handle data effectively. As stated above, we refer the reader to http://pandas.pydata.org/docs/reference/ for a complete list of pandas functions while acknowledging that we have only covered a very small subset of the pandas functions in this section.
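The missing-value and pivot-table steps can be sketched as follows. The data frame is again a hypothetical stand-in for the dataset of Fig. A.12, and the column name “New Category” is our own choice. We also pass the aggregate functions as the strings "mean" and "sum" instead of numpy.mean and numpy.sum; this is an assumption made for convenience and gives the same aggregates.

```python
import numpy as np
import pandas as pd

# Hypothetical data with the same shape as the example: 12 rows, 5 columns.
columns = [f"Column {k}" for k in range(1, 6)]
df = pd.DataFrame({f"Column {k}": [float(k * r) for r in range(1, 13)]
                   for k in range(1, 6)})
df.loc[3, "Column 1"] = np.nan
df.loc[7, "Column 5"] = np.nan

missing_count = int(df.isna().sum().sum())   # number of missing cells
filled_constant = df.fillna(0)               # replace with a specific value
filled = df.ffill()                          # previous value in each column

# Category column: 0 for the first six rows, 1 for the last six rows.
filled["New Category"] = [0] * 6 + [1] * 6

# Aggregate the five columns for each value of the category column.
mean_table = filled.pivot_table(values=columns, index="New Category", aggfunc="mean")
sum_table = filled.pivot_table(values=columns, index="New Category", aggfunc="sum")
print(mean_table)
print(sum_table)
```

Each pivot table has one row per category value (0 and 1) and one column per aggregated column of the data frame.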
A.4 Matplotlib
An important aspect of programming is data visualization, which is the graphical illustration of data. Although a spreadsheet may contain full information about each variable included in a dataset, some trends and correlation structures may not be evident from merely looking at the spreadsheet. To make them visible to users, the data must be represented in plots. Matplotlib is the Python library that makes it possible to plot data in a variety of ways. The full list of the plot types supported by matplotlib is available at https://matplotlib.org/stable/plot_types/index.html. The popular ones are the line graph, scatter plot, histogram, and pie chart. In Fig. A.13, we present a histogram that shows a randomly drawn sample from a normal distribution. We first import the pyplot sub-library that includes all the routines to create visual illustrations: https://matplotlib.org/stable/api/pyplot_summary.html. Then, we generate 10,000 random values from a normal distribution that has a mean of 100 and a standard deviation of 20. Finally, the histogram is plotted for this dataset. While plotting the histogram, the number of bins should be specified. This number determines how many equal-width intervals are created, which is 30 in our example. The y-axis then shows the number of observations that fall into each bin. We present another example in Fig. A.14 that shows one solid curve and one scatter plot. We define three different variables. First, the x variable is a numpy array that starts from 0, increases by 5, and ends at 245. The last two variables (i.e. y and z) represent the demand distributions of two different products. The y variable is a numpy array that corresponds to the density values of the x variable from a normal
Fig. A.13 Histogram of a randomly drawn sample (title: Histogram; x-axis: Data bins; y-axis: Frequency)
Fig. A.14 Probability density of two random variables (legend: First demand distribution, Second demand distribution; x-axis: Value; y-axis: Density)
distribution with a mean of 100 and a standard deviation of 20. We display the y variable with a solid curve. To this end, we use the function matplotlib.pyplot.plot() and label the curve “first demand distribution” as shown in the figure. The third variable is the z variable, which is a numpy array corresponding to the density values from a normal distribution with a mean of 120 and a standard deviation of 20. We display the z variable in the graph as a scatter plot, for which we use the matplotlib.pyplot.scatter() function. We also label this curve “second demand distribution”. As shown in the figure, different curves can be shown in a single graph, and their labels are given in a legend, which is useful for comparing different variables. Finally, we present an example of a pie chart that shows the market shares of four different products. Suppose that the market shares are $4 million, $2 million, $1 million, and $1 million, which correspond to 50%, 25%, 12.5%, and 12.5% of the total, and we would like to present this information in a pie chart. In Fig. A.15, we show the Python code that plots the pie chart. We first define the list of products, which is later used to label the data in the pie chart. The market shares are entered as a numpy array. As the last step, we plot the pie chart and show it.
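The three plots described in this section can be sketched as follows. The listing follows the parameter values given in the text, whereas the seed, the variable names, and the use of the non-interactive "Agg" backend are our own assumptions; the listing is not a copy of the code shown in Figs. A.13 to A.15.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Histogram of 10,000 normal draws with 30 bins (the seed is arbitrary).
rng = np.random.default_rng(seed=1)
sample = rng.normal(loc=100, scale=20, size=10_000)
plt.figure()
counts, bin_edges, _ = plt.hist(sample, bins=30)
plt.title("Histogram")
plt.xlabel("Data bins")
plt.ylabel("Frequency")

# Two demand densities in one graph: a solid curve and a scatter plot.
x = np.arange(0, 250, 5)           # 0, 5, ..., 245
y = stats.norm.pdf(x, 100, 20)     # first demand distribution
z = stats.norm.pdf(x, 120, 20)     # second demand distribution
plt.figure()
plt.plot(x, y, label="First demand distribution")
plt.scatter(x, z, label="Second demand distribution")
plt.xlabel("Value")
plt.ylabel("Density")
plt.legend()

# Pie chart of the four market shares (in $ million).
products = ["Product #1", "Product #2", "Product #3", "Product #4"]
shares = np.array([4, 2, 1, 1])
plt.figure()
plt.pie(shares, labels=products, autopct="%.2f%%")
plt.title("Market analysis")
```

In an interactive session, the matplotlib.use("Agg") line can be removed and plt.show() can be called at the end to display the three figures.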
A.5 General Programming Concepts
In computer programming, there are three important concepts that can be observed in any programming language. They are (1) if-else statements, (2) for and while loops, and (3) user-defined functions. If-else statements are very important for creating some logic trees such that a set of instructions is followed if a logical
Fig. A.15 A pie chart showing the market shares of four products (title: Market analysis; slices: Product #1 50.00%, Product #2 25.00%, Product #3 12.50%, Product #4 12.50%)
Fig. A.16 An if-else statement to calculate the excess inventory and stockout values
statement is true; otherwise, an alternative set of instructions is followed. Suppose that the demand for a product follows a normal distribution with a mean of 100 units and a standard deviation of 20 units. The available inventory is 105 units. If the demand turns out to be more than the available inventory, some customers face stockouts; otherwise, excess inventory is observed. We exhibit the Python code of this example in Fig. A.16. We first draw a random demand value from the normal distribution with the given parameter values. The random value is generally not an integer, so we round it to the nearest integer. The if-else statement starts with the condition that the randomly drawn demand value is greater than or equal to the available inventory. If this condition holds true, the stockout and excess inventory values are calculated accordingly. Otherwise, the expressions under the else statement are used to calculate the stockout and excess inventory values. In the example, the demand value is 94 units; hence, the condition demand >= inventory is false. For this reason, the computer program skips the expressions after the if statement. The
Fig. A.17 Two for loops to calculate the sum of the first five integers and the sum of the integers given in a vector
expressions under the else statement are used to calculate the stockout and excess inventory values such that the stockout value is 0 and the excess inventory is 11 units. In computer programming, the for and while loops are two very popular loops that are used in various settings. The for loop iterates a variable over a sequence one by one until the last element in the sequence is reached. Suppose, for example, we are interested in calculating the sum of the first n positive integers. Or we may be interested in calculating the sum of the values given in a vector. A for loop can be constructed to do these calculations. In Fig. A.17, we present the Python code for these two examples. In the first loop, we calculate the sum of the first five positive integers. In the second loop, we calculate the sum of all values given in a vector. The while loop is a condition-based loop such that it iterates as long as a specific condition holds true. In Fig. A.18, we present three while loops to illustrate how to develop while loops. The first one returns the sum of the first five positive integers. In contrast to the for loop, we now define the i parameter as an integer, not a range. Its starting value is 1, and the while loop iterates as long as it is less than or equal to 5. At each iteration, we increase the value of i by 1; otherwise, the while loop never ends. In the second while loop, we add an if-else statement with the continue command. This command skips the remainder of the current iteration. For example, the second while loop skips one iteration whenever i is an even number. Therefore, it calculates the sum of odd numbers until $i$