From Little’s Law to Marketing Science Essays in Honor of John D. C. Little
John R. Hauser and Glen L. Urban, editors
The MIT Press Cambridge, Massachusetts London, England
© 2016 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

This book was set in Palatino LT Std by Toppan Best-set Premedia Limited. Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data
Names: Little, John D. C., honoree. | Hauser, John R., editor. | Urban, Glen L., editor.
Title: From Little's law to marketing science : essays in honor of John D. C. Little / John R. Hauser and Glen L. Urban, eds.
Description: Cambridge, MA : MIT Press, [2015] | Includes bibliographical references and index.
Identifiers: LCCN 2015038265 | ISBN 9780262029919 (hardcover : alk. paper)
Subjects: LCSH: Marketing–Management. | Little, John D. C.
Classification: LCC HF5415.13 .F76 2015 | DDC 658.8–dc23
LC record available at http://lccn.loc.gov/2015038265

10 9 8 7 6 5 4 3 2 1
Contents
Preface vii

1 John D. C. Little: A Profile 1
John R. Hauser and Glen L. Urban

I Marketing Science: Managerial Models

2 Optimal Internet Media Selection 23
Peter J. Danaher, Janghyuk Lee, and Laoucine Kerbache

3 Strategic Marketing Metrics to Guide Pathways to Growth 49
John H. Roberts, Rajendra Srivastava, and Pamela D. Morrison

4 Moving from Customer Lifetime Value to Customer Equity 85
Xavier Drèze and André Bonfrer

5 Deriving Customer Lifetime Value from RFM Measures: Insights into Customer Retention and Acquisition 127
Makoto Abe

6 Building and Using a Micromarketing Platform 161
Masahiko Yamanaka

7 Dynamic Allocation of Pharmaceutical Detailing and Sampling for Long-Term Profitability 175
Ricardo Montoya, Oded Netzer, and Kamel Jedidi

8 Morphing Banner Advertising 211
Glen L. Urban, Guilherme Liberali, Erin MacDonald, Robert Bordley, and John R. Hauser
II Marketing Science: Decision Information Models

9 Disjunctions of Conjunctions, Cognitive Simplicity, and Consideration Sets 255
John R. Hauser, Olivier Toubia, Theodoros Evgeniou, Rene Befurt, and Daria Dzyabura

10 Decision Process Evolution in Customer Channel Choice 285
Sara Valentini, Elisa Montaguti, and Scott A. Neslin

11 The Value of Social Dynamics in Online Product Ratings Forums 317
Wendy W. Moe and Michael Trusov

12 Uninformative Advertising as an Invitation to Search 349
Dina Mayzlin and Jiwoong Shin

13 Alleviating the Constant Stochastic Variance Assumption in Decision Research: Theory, Measurement, and Experimental Test 399
Linda Court Salisbury and Fred M. Feinberg

III Little's Law: Current State

14 Generalized Little's Law and an Asset Picking System to Model an Investment Portfolio: A Working Prototype 437
Maria Luisa Ceprini and John D. C. Little

15 Closing Statement 467

Contributors 469
Index 471
Preface
John D. C. Little is a pioneer in many fields, but he is, perhaps, best known for his research, teaching, and service in the related fields of operations research and marketing science. He is widely known for a fundamental theorem in queuing theory known as Little's Law, L = λW. Little's Law is now used widely in a variety of fields. John is also known, along with Frank Bass, as the founder of marketing science. His research on optimal advertising experimentation, advertising budgeting, aggregate advertising models, marketing mix models, logit models applied to UPC data, and subsequent applications has generated entire streams of research. Many faculty today owe their careers to John's seminal research. John is also an innovative and devoted educator, a skilled administrator, and an entrepreneur. He is truly a role model. We begin this volume with a profile of John.

On Saturday, June 6 to Sunday, June 7, 2009, after the 2009 INFORMS Marketing Science Conference at the University of Michigan, we held a Festschrift Celebration, including a banquet and full-day seminar to honor John D. C. Little. More than a hundred people attended the event. The papers presented there form the basis of this book. Eight of the ten works presented at the Festschrift are included here in the form of resulting publications and working papers. In addition, twenty-four papers not presented at the conference were submitted for our consideration; we selected five of these to be included in this Festschrift book.

The profile, the Festschrift papers, and the submitted papers are supplemented by a 2014 paper coauthored by Maria Luisa Ceprini and John Little that has been added to the book as a statement of where Little's Law is today and what we might expect in the future. We wanted to wait for this paper so that we could close the loop from Little's Law to Marketing Science and back to Little's Law.

Enjoy this book and the honor the authors and editors want to bestow on John D. C. Little.
1 John D. C. Little: A Profile

John R. Hauser and Glen L. Urban
The development of Operations Research (OR) after World War II was influenced greatly by OR pioneers who applied their knowledge and experiences to the problems of business and industry. In parallel, they brought OR methods and practices into academia, and created the environment for the development and training of a new cadre of OR practitioners and researchers. In the forefront of this new group, we find John D. C. Little. From being the first to receive a PhD in OR, John went on to leave his own lasting imprint on the field as a progenitor of marketing science and its applications, the person behind the eponymous Little's Law for queues, an influential academic leader, and a highly successful OR researcher, practitioner, and entrepreneur.

For his innovative and seminal research in marketing, John received the American Marketing Association (AMA) Charles Parlin Award for contributions to the practice of marketing research (1979), the AMA's Paul D. Converse Award for lifetime achievement (1992), and MIT's Buck Weaver Award for outstanding contributions to marketing (2003). He was president of the Operations Research Society of America (ORSA) in 1979, president of the Institute of Management Sciences (TIMS) in 1984–1985, and the first president (1995–1996) of the Institute for Operations Research and the Management Sciences (INFORMS)—making him one of only two individuals who have served as president of all three organizations.

Boston Born, Massachusetts Bred, and on to MIT

John D. C. Little was born in Boston, Massachusetts, on February 1, 1928, the son of John D. and Margaret J. Little, and grew up in Andover, Massachusetts. Both his father and mother were natives of Massachusetts; he was born in Malden and she was born in Gloucester. Margaret
graduated from Smith College, and, after her three children had grown up, became an English teacher, and then the principal of a private school in Andover. John D. attended Malden public schools and Dartmouth College, but left without graduating to become an ambulance driver in France during World War I. Later, he held various positions including reporter for the Boston Herald; editor for a financial journal in Washington, DC; bond salesman for a Boston brokerage firm; writer for the Office of War Information in Washington during World War II; and a credit manager.
Figure 1.1 John D. C. Little
John (D. C.) lived in the West Parish part of Andover, which was then quite rural, an exurbia from which his father commuted to Boston by train. John attended Andover's elementary and middle public schools and was a good student, especially in mathematics and science. He obtained a scholarship to attend Phillips Academy in Andover and won most of the science-related prizes. He graduated from Andover in 1945 and started college at the Massachusetts Institute of Technology (MIT). Due to the wartime acceleration of MIT's academic program, his freshman year began in the summer. He decided to major in physics, which appealed to him as a worthy intellectual challenge. John did more than study—he became editor-in-chief of Voo Doo, the MIT humor magazine. He also "took a minor in hitchhiking to Wellesley [a women's college]" (Little 2008). He graduated in three years, receiving an SB degree in 1948.

Tired of school and not yet wanting to enter the working world, John hitchhiked around the country for ten months. This brought him to the point where work seemed better than poverty, so he joined the General Electric Company as an engineer. In 1951, he went back to MIT, enrolling as a graduate student in physics. Although he passed his general exams in physics, his intellectual curiosity caused him to search out other areas—a research assistantship (RA) on an unclassified air defense project with the psychologist J. C. R. Licklider, and a course in the new field of OR taught by George Wadsworth of the MIT mathematics department.

Inspired by the challenges offered by the embryonic field of OR, John obtained an RA working for the physicist Philip Morse. Morse, a World War II pioneer in OR, is recognized as the founder of OR in the United States—he had established the first American OR group for the Navy in 1942. After WWII, Morse founded and directed MIT's interdepartmental Operations Research Center. John's RA was for a US Navy-sponsored project, "Machine Methods of Computation and Numerical Methods." Machine computation meant Whirlwind, one of the earliest digital computers—it had been built in the late 1940s by MIT personnel to support the Navy's research program. John's task was to learn about Whirlwind, how to program it, and to compute a book of tables for spheroidal wave functions—rather esoteric functions used for calculations in theoretical physics. John did not think the resulting book was going to be a big seller, but, in assuming the task, he thought "an RA is an RA"; one usually does not question the professor, especially when the job helps pay your tuition. More importantly, John became one of
the few people in the world with access to a digital computer. The knowledge of what a computer could accomplish was central to his future research and consulting activities. He felt that "computers are cool"—and to this day he can often be found "playing" with the latest technological devices (Little 2008).

When John asked Morse about possible thesis topics, Morse mentioned a few from physics and then wondered if John would be interested in an Operations Research topic. Physics or OR? John faced the decision and concluded, "Physics is fine, but look at it this way. Bohr solved the hydrogen atom and it's beautiful. Then somebody named Hylleraas solved the helium atom. It took him seven years on a handcrank calculator and it's ugly. Beyond helium there are another hundred or so elements with bad prognoses. I'm searching for a field with a lot of unsolved hydrogen atoms. OR looks good" (Little 2007, 3).

John's dissertation research dealt with the study of water-flow management in a hydroelectric reservoir and dam system (Little 1955). In particular, his analysis dealt with Washington State's section of the Columbia River, its Grand Coulee hydroelectric plant, and the Franklin Delano Roosevelt Lake (the reservoir that formed behind Grand Coulee Dam). The problem was how best to schedule the amount of water flow used to generate electricity. The system is naturally dynamic, with the available water to generate power a function of seasonal rainfall and the runoff from snow melting in the mountains during the spring and summer. In the fall and winter, when precipitation falls as snow in the mountains, the natural river flow drops drastically. Water leaves the reservoir from spillover (wasted energy) and when the water is drawn down to generate electricity. The problem is interesting because the power is proportional to the head, the height of the water behind the dam relative to the water below the dam, and to the rate of water flow. A greater flow generates more electricity, but also reduces the head at a faster rate. The tradeoff is challenging. As John notes, "… in the spring and summer, the right decisions about water use are obvious—indeed they are hardly decisions—whereas in the fall and winter such decisions require a balancing of the benefits of future against immediate water use in the face of uncertain future flow" (Little 1955, 188).

John formulated the problem as a dynamic program, although, at the time, he did not know it was a dynamic program. John faced the
dynamic-programming "curse of dimensionality," the rapid increase in computing time with the number of state variables (Bellman 1957, ix). The simplest credible formulation required two state variables: one for the amount of water in the reservoir (which determines the head) and the other for the current river flow. Thanks to Whirlwind, John was able to finish his thesis before his RA ran out (Little 2007, 4). The thesis was very likely the first non-defense application of dynamic programming to a problem of practical importance. Real data were used for the historical stream flows. The models of Grand Coulee and its reservoir were simplified to save computing time, but based on actual physical dimensions (Little 1955).

John received his PhD in Operations Research in 1955, the first person in the world to receive such a degree; his dissertation title was "Use of Storage Water in a Hydroelectric System," and his advisor was Philip Morse. John, to be precise, describes his PhD as being in physics and OR, since his general exams were in physics and his thesis in OR (Little 2008).

In 1953, John married Elizabeth (Betty) Alden, an MIT physics graduate student who worked on ferroelectrics under Professor Arthur von Hippel, a pioneer in dielectrics, especially ferromagnetic and ferroelectric materials. She received her PhD in physics from MIT in 1954, one of very few women to do so at the time. After their marriage, Betty and John moved to Marlborough Street in Boston's Back Bay. The rent was high "but not too bad and we could split it" (Little 2007, 4). Beginning a lifetime of exercise (John jogs and bicycles daily at age eighty), they had to walk up five flights to reach their tiny, top-floor, two-room apartment. It had sloping ceilings to conform to the roof, but it suited their purposes admirably.

Army, Case Institute, and Laying Down the Law

In early 1955, three months after completing his thesis, John was drafted into the Army. He was stationed for two years at Fort Monroe, Hampton, Virginia, where he served as an operations analyst working on military OR problems that included probabilistic models of land mine warfare. As an antidote to the Army, John and Betty bought a sailboat and had a wonderful time getting in and out of trouble on the Chesapeake Bay. Their first child, John N. Little, was born in the Fort Monroe Army hospital.

Upon his discharge from the Army in 1957, John began his academic career as an assistant professor at Case Institute of
Technology (now Case Western Reserve University) in Cleveland, Ohio. There he had his first experiences working on industrial OR projects. A project with M&M Candies introduced him to advertising problems, and a project with Cummins Engine, Inc. introduced him to the issues of conflict between a manufacturer and its independent distributors. At M&M, the president had deliberately stopped all advertising after a long period of operating at a high level in order to see what would happen. The Case team analyzed the resulting sales over time. At first, sales changed very little—and then started into a serious decline. With their response analysis and further data, the team calibrated a model that led it to recommend a new policy for buying TV spot advertising. At Cummins, top management was dismayed that, when the company provided extra sales support to its independent distributors, the latter rather quickly reduced their own. John devised a graphical profit analysis showing that such behavior by the distributors was entirely rational. With his accumulation of such real-world experiences, he developed and taught a graduate course—OR in Marketing—that may have been the first such marketing science course.

Although not a problem in marketing, John also worked on the traveling salesman problem (TSP), whose solution procedures for finding the minimum TSP route were computationally challenging. In addressing the problem, he and his coauthors introduced and popularized the term "branch and bound" as a technique for solving the TSP, as well as other combinatorial optimization problems (Little et al. 1963, 972). For a while, the authors held "the indoor record for size of problem solved, forty cities on the fastest machine at MIT at the time, an IBM 7090" (Little 2008). And, as it turns out, even more than four decades later, the TSP is relevant to modern marketing science. With automatic tracking of supermarket shopping carts and recorded checkout data, researchers are using optimal traveling salesman solutions as reference routes for studying how shoppers actually move around in stores.
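Branch and bound explores a tree of partial tours, pruning any branch whose optimistic lower bound already exceeds the cost of the best complete tour found so far. Little, Murty, Sweeney, and Karel bounded partial tours by reducing the cost matrix; the sketch below is a simplified illustration rather than their algorithm, and uses a weaker cheapest-outgoing-edge bound to show the idea.

```python
import math

def tsp_branch_and_bound(dist):
    """Exact TSP by depth-first branch and bound.
    dist: symmetric cost matrix (list of lists); the tour starts and ends at city 0."""
    n = len(dist)
    best_cost, best_tour = math.inf, None
    # Cheapest edge leaving each city, used for an optimistic lower bound.
    min_out = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

    def search(path, visited, cost):
        nonlocal best_cost, best_tour
        if len(path) == n:                       # complete tour: close the loop
            total = cost + dist[path[-1]][0]
            if total < best_cost:
                best_cost, best_tour = total, path + [0]
            return
        # Lower bound: cost so far plus the cheapest way out of every unvisited city.
        bound = cost + sum(min_out[j] for j in range(n) if j not in visited)
        if bound >= best_cost:                   # prune: cannot beat the incumbent
            return
        for j in sorted((j for j in range(n) if j not in visited),
                        key=lambda j: dist[path[-1]][j]):
            search(path + [j], visited | {j}, cost + dist[path[-1]][j])

    search([0], {0}, 0.0)
    return best_tour, best_cost

# Example with four cities on a line; the optimal tour 0-1-2-3-0 costs 6.
d = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
print(tsp_branch_and_bound(d))   # ([0, 1, 2, 3, 0], 6.0)
```

Stronger bounds, such as the matrix-reduction bounds of the 1963 paper, prune far more of the search tree than this simple one.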
John's research at Case (and later work) was clearly influenced by his mentor Philip Morse, who began his career as a theoretical physicist, but, in the course of his OR activities, became both an experimentalist and a theoretician. John was guided by the definition of OR promulgated by Morse and George Kimball in their book, Methods of Operations Research: "Operations Research is a scientific method of providing executive departments with a quantitative basis for decisions
regarding operations under their control" (Morse and Kimball 1951, 1). John, in his retrospective review, "Philip M. Morse and the beginnings" (Little 2002) notes: "The definition leaves room for the tremendous development of methodology that we have witnessed in the past fifty years, but it keeps our feet on the ground with the requirement for data, models, and decisions. I like that, and I am sure it is what Morse intended" (Little 2002, 148).

Being true to this paradigm, John, in a study of a real world problem, would often meet with managers to learn how they perceived the problem. Then he would formulate tentative hypotheses about underlying processes that, if understood, might permit improvement in operations. After that, he would look for relevant data and/or design and execute a plan for collecting it. Whenever feasible, he would do this with students. Often, iteration was required between the observed and modeled worlds until the model was right for the job at hand. This style marked his professional behavior in such disparate subfields of OR as hydroelectric systems, traffic signal synchronization, the process of managing, and eventually marketing.

An exception to the paradigm is the paper underlying Little's Law. This paper, written while John was at Case, established him in OR history. John published the first general proof of the famous queueing formula, L = λW (Little 1961a). Assuming steady state operation, the formula says that the average number of customers in a queueing system over time (L) equals the average arrival rate of customers to the system (λ) multiplied by the average time that each customer spends waiting in the system (W). A customer can be anything from a consumer waiting for a teller in a bank, to an aircraft waiting to land, to a packet of data waiting to be processed in a computer network. Little's Law allows an analyst to obtain all three of these fundamental performance measures of a queue by calculating (or measuring) only two of them. This is useful because the analytic methods used to calculate L and W are usually quite different, and, often, one is much easier to carry out than the other.

John taught a queueing course at Case, and, among other sources, used Morse's pioneering book on queueing theory for OR applications, Queues, Inventories and Maintenance (Morse 1958). In one of John's lectures, he pointed out Morse's observation that the curious formula L = λW always seemed to apply to queues whose operational behavior Morse had solved the long, hard probabilistic way—that is, by making specific assumptions about arrival processes, service time distributions,
and queue disciplines (Morse 1958, 75). John went on to sketch a figure on the blackboard, similar to the one that appears in his paper (Little 1961a, 385). He used the figure to give a heuristic argument as to why the formula should hold in great generality for steady state queues. In a discussion after class, one of the students wondered how difficult it would be to prove the general case. John obligingly answered, "It shouldn't be too hard." "Then you should do it!" was the response (Little and Graves 2008, 99). The discussion stuck in John's mind, and he started to think about how he might turn his heuristic proof into a formal one. He bought and read some books on general stochastic processes.

Box 1.1 What's in a Name? That Which We Call a Rose …

There has been much curiosity about what John's D. C. middle initials stand for—he has been asked often. An extreme case of curiosity occurred when an OR teacher in Oklahoma City challenged his OR class to find out the exact middle names of the person after whom Little's Law is named—the first student who did so would get a reward. This resulted in John receiving much email from Oklahoma City. Out of curiosity, John decided to pursue the challenge and searched the web for the answer. He was surprised to find how difficult the task was. He was able to do it, but admits he was aided by knowing that the names are in two places on the MIT website. Many people have guessed that D. C. came from direct current based on John's early work on hydroelectricity, which, by the way, would be A. C. (alternating current). Others have guessed District of Columbia from John's contributions to OR in the US. It is not DC Comics, although many people consider John a superhero, nor DC Shoes for skateboards—John jogs. It is not Dominican College or the Dublin Core, nor is it D. C. United, the Department of Corrections, desert combat, or digital camera. The answer is Dutton Conant. His father, who was John Dutton Little, did not want John to be called junior and so added another middle name. His father was close to his grandmother whose maiden name was Conant. Thus, John became John D. C. Little. John has noted that, although there are many people named John Little, he has never found another who was John D. C. Little. He finds this helpful in searching for himself in web documents (Little 2008).

In those years, he and Betty would pile themselves, their children, their summer clothes, and a stack of books for John's research projects into their Ford Falcon station wagon and head for Nantucket (Island), Massachusetts to spend the summer. This particular year (1960) he took his new books and worked on L = λW as one of his major projects. The books did not have any magic formulas, but they gave him important ideas—the outcome was his paper, "A proof for the queuing formula: L = λW" (Little 1961a). John decided then and there not to make a career out of being a measure-theoretic stochastic process mathematician—he has never regretted it.

Little's Law has entered Operations Research folklore. At an ORSA conference in New Orleans, T-shirts were sold to raise money for ORSA. A best seller was the one that proclaimed: "It may be Little, but it's the Law."
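The law is also easy to check numerically. The minimal sketch below is an illustration rather than John's general proof: it simulates a first-come-first-served single-server queue with Poisson arrivals and exponential service, then compares the time-average number in the system L against the product of the observed arrival rate and the average time each customer spends in the system.

```python
import random

def mm1_littles_law(lam=0.8, mu=1.0, n=200_000, seed=7):
    """Simulate an M/M/1 queue and return (L, lambda_hat, W)."""
    rng = random.Random(seed)
    arrive, depart = [], []
    t, prev_depart = 0.0, 0.0
    for _ in range(n):
        t += rng.expovariate(lam)              # Poisson arrival stream
        service = rng.expovariate(mu)          # exponential service time
        d = max(t, prev_depart) + service      # FIFO single server
        arrive.append(t)
        depart.append(d)
        prev_depart = d

    horizon = depart[-1]
    # Event sweep: accumulate the time integral of the number in the system.
    events = sorted([(a, +1) for a in arrive] + [(d, -1) for d in depart])
    area, count, last = 0.0, 0, 0.0
    for time, step in events:
        area += count * (time - last)
        count, last = count + step, time

    L = area / horizon                         # time-average number in system
    lam_hat = n / horizon                      # observed arrival rate
    W = sum(d - a for a, d in zip(arrive, depart)) / n
    return L, lam_hat, W

L, lam_hat, W = mm1_littles_law()
print(f"L = {L:.3f}, lambda*W = {lam_hat * W:.3f}")   # essentially identical
print(f"M/M/1 theory rho/(1-rho) = {0.8 / 0.2:.3f}")  # about 4 for rho = 0.8
```

The first two numbers agree almost exactly, which is the content of the law; the last line compares them with the textbook M/M/1 value for the same arrival and service rates.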
Return to MIT, Evolution of the Science of Marketing

In 1962, John interviewed for a faculty position at MIT in the School of Industrial Management, now the MIT Sloan School. He had been a tenured associate professor at Case and, without any qualms, he accepted MIT's offer as an untenured associate professor. John viewed the scope of OR broadly and was attracted back to MIT by the promise of new problems and new research directions. MIT was an excellent base of operations with good colleagues and great students—he has never left!

Perhaps frustrated by Boston drivers, who are alleged to be the worst in the country, John first continued his work on traffic flow and traffic signal control, a problem he had begun at Case—traffic delays due to additional time to travel over a route as a result of traffic and traffic lights (Little 1961b). He had also worked with a master's student, John Morgan, who programmed the synchronization of traffic signals on a two-way street on the Case computer. This was the first time the problem had ever been approached in this manner. Previously it had been done graphically by hand. It is trivial to synchronize the signals on a one-way street so that cars traveling at an average speed can traverse the length of the street without stopping. The problem becomes combinatorial and quite difficult on a two-way street when it is desired to have the cars in both directions be able to do the same. The fraction of the signal cycle time for which cars in both directions can travel without stopping is known as the bandwidth of the street (Morgan and Little 1964). Finding the maximum bandwidth (MAXBAND) is a challenging optimization problem. At MIT, John extended this work to complete street networks, seeking to maximize a linear combination of the bandwidths of the
various arteries in the network (Little 1966a). The methodology was based on mixed-integer linear programming. He was joined by colleagues and research assistants and supported by the Federal Highway Administration to produce the software package, MAXBAND—it was distributed to municipalities so they could optimize their street systems (Gartner, Kelson, Little 1981; Little and Cohen 1982). This stream of research defined a new state of the art in the field of synchronizing traffic signals on arteries and networks.

In a quite different direction, John, now in a business school, had the vision to perceive marketing as a source of interesting and relevant unexplored opportunities for OR and Management Science (MS). As an example, the effectiveness of a company's advertising is likely to vary over time. No matter how good the response function used to calculate an optimal advertising rate at one point in time, it is likely to drift to something different. What to do? Run an experiment to remeasure effectiveness and update the advertising response function. For example, take five medium-sized markets and set them higher than the currently presumed optimal advertising rate and set another five markets lower. The resulting measurement can be used to reset the advertising response function and obtain a more profitable advertising rate for use nationally. But the ten experimental markets are being deliberately operated differently from the perceived best rate, thereby incurring a calculable cost. The adaptive system optimization, however, takes the next step by setting the number of experimental markets so as to maximize total system profit, including the cost of the experiment. John thus became the first scholar to develop adaptive control methods for the field of marketing. He was particularly pleased that his model could be applied readily (Little 1966b). For John, it was not enough to develop a nice mathematical solution—he wanted somebody to use it. He also published a generalized version of the mathematics in terms of optimal adaptive control (Little 1977).
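The logic of the adaptive scheme can be shown with a toy calculation. The sketch below is a stylized illustration, not Little's model: the response form, parameter names, and numbers are invented for the example. Test markets run above and below the current rate are used to re-estimate a concave sales-response curve, and the rate that maximizes profit under the updated curve becomes the next national rate.

```python
import math

def update_advertising_rate(margin, low, high, sales_low, sales_high):
    """Refit sales = a + b*ln(adv) from the low/high test markets and
    return the advertising rate that maximizes margin*sales - adv."""
    b = (sales_high - sales_low) / (math.log(high) - math.log(low))
    # d/dx [margin*(a + b*ln x) - x] = margin*b/x - 1 = 0  ->  x* = margin*b
    return margin * b

# Hypothetical experiment: five markets at 80, five at 120 (current rate 100),
# average unit sales of 990 and 1,010, contribution margin of $2 per unit.
new_rate = update_advertising_rate(margin=2.0, low=80.0, high=120.0,
                                   sales_low=990.0, sales_high=1010.0)
print(f"updated rate per market: {new_rate:.0f}")   # about 99: current rate looks close to optimal
```

Little's model goes further: it also chooses how many test markets to run by trading off the information gained against the profit sacrificed in the experimental markets.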
During this period, John became increasingly interested in advertising budgeting and media selection. He and Leonard Lodish, one of his PhD students, developed an online, computer-based system for selecting and scheduling advertising media. They described it as a media planning calculus and named it MEDIAC (Little and Lodish 1969). MEDIAC replaced heuristic analyses with the optimization of a measure more closely related to sales and profits.

In 1968, while he was conducting an MIT summer session on OR in marketing, John was approached by attendees from Nabisco (formerly
the National Biscuit Company) who asked him to develop a model to set advertising spending levels for Oreo cookies. John realized that Nabisco had some explicit, hard data, but that other key data were buried in managers' heads. Managers with experience in this area had implicit knowledge of how sales would respond to advertising. The challenge was how to unlock that information in a manner that could augment, rather than replace, the hard data. From this challenge, John developed the concept of a decision calculus as described in "Models and managers: The concept of a decision calculus" (Little 1970). That revolutionary paper begins with the sentence: "The big problem with management science models is that managers practically never use them" (Little 1970, B-466). But it was not a negative paper—John wanted to improve matters. First, it broke with standard practice for empirical models that all constants be estimated at once on a single data set. If data were not available on, say, advertising, then advertising could not be in the model. John's paper took the view of those who wanted to apply management science models and set forth guidelines that would be critical to implementation. John defined a decision calculus as "a model-based set of procedures for processing data and judgments to assist a manager in his decision making," and proposed that models, to be useful to managers, should be "simple, robust, easy to control, adaptive, complete on important issues, and easy to communicate with" (Little 1970, B-469, B-483). John's insight was that for managers to use a model, they must understand the model well enough that they could control it. His theme was: "I claim that the model builder should try to design his models to be given away. In other words, as much as possible, the models should become the property of the manager, not the technical people" (Little 1970, B-483). The decision calculus paper was cited as one of the ten most influential papers published in the first fifty years of Management Science.

John demonstrated the relevance of the decision calculus by applying it to the complex problem of selecting the entire marketing mix. He espoused eclectic calibration. Some submodels, like manufacturer advertising and its effect on brand share, are almost sure to include time lags and be dynamic. Others, like seasonality and trend, may be straightforward and standard. Still others, like coupons, premiums, and production capacity constraints, might be handled by simple indices based on data analysis or the product manager's prior experience. The resulting model—ADBUDG (Advertising Budget)—is given in Little (1970). This model was later expanded into BRANDAID, which is a more complete online marketing-mix model that provides AID for the BRAND manager by permitting the evaluation of new strategies with respect to price, advertising, promotion, and related variables (Little 1975a, b). The latter paper describes a case study for a well-established brand of packaged goods sold through grocery stores.
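ADBUDG's appeal to managers came from its small number of interpretable constants. The sketch below uses a common textbook statement of the ADBUDG-style response curve, with share rising from a floor toward a ceiling as advertising grows; the parameter values are invented for illustration and are not taken from Little (1970).

```python
def adbudg_share(adv, share_min, share_max, gamma, delta):
    """ADBUDG-style response: share rises from share_min (no advertising)
    toward share_max (saturation advertising) as spending grows."""
    return share_min + (share_max - share_min) * adv**gamma / (delta + adv**gamma)

# Illustrative calibration: judgmental answers to "what happens to share at
# zero, half, current, and much higher advertising?" pin down the constants.
for adv in (0.0, 0.5, 1.0, 1.5, 3.0):          # advertising relative to the current rate
    share = adbudg_share(adv, share_min=0.30, share_max=0.55, gamma=2.0, delta=1.0)
    print(f"adv = {adv:3.1f} x current  ->  share = {share:.3f}")
```

With gamma greater than one the curve is S-shaped; at or below one it is concave throughout, which is one of the judgments a manager supplies during calibration.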
The Marketing Data Explosion

In his paper "Aggregate advertising models: The state of the art" (Little 1979), John summarized and critiqued the previous decade's modeling knowledge and advances in modeling advertising phenomena. After posing a set of modeling questions, he reviewed the published empirical data and studies that bore on them. He then listed five phenomena that a dynamic model of advertising response should, at a minimum, be able to incorporate: assist annual budget setting, geographic allocation of funds, allocation over time, and incorporate media and copy effects. One of John's concerns was that available data to test and calibrate such models were aggregate in nature, e.g., historical time series at a national or market level. He observed (Little 1979, 629): "Although many models have been built, they frequently contradict each other and considerable doubt exists as to which models best represent advertising processes . . . Future work must join better models with more powerful calibration methods." Central to this objective was the need for accurate data at the point-of-sale (local) level.

John noted that such a data revolution was on its way (Little 1979, 663). Products had begun to be labeled with computer-readable Universal Product Codes (UPC), supermarket scanners were being installed that could read such codes, and computer technology was becoming distributed—no longer just mainframes—to collect, organize, and analyze those data. The marketing field would soon be inundated with data.

To address the new issues in using such data, John and Peter Guadagni, one of his master's students, built a disaggregated model—a logit model of brand choice calibrated on scanner data—that predicted actions at the level of the individual consumer making individual purchases (Guadagni and Little 1983). A novel aspect of the model was that it included what John termed a loyalty variable: an exponentially smoothed history of past purchases treated as 0–1 variables, and, thus, a measure of the customer's past propensity to purchase the product, weighted most for recent purchases. This paper is one of the most cited papers in Marketing Science and has been republished as one of that journal's eight classic papers. The logit model has been improved, reanalyzed, expanded, kicked, and modified. New phenomena have been added and new data have been analyzed. But the basic structure (and the power of the loyalty variable) remain. An entire generation of marketing science academics and students have been influenced by the original and extended UPC logit models.
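The loyalty variable can be computed with one line of arithmetic per purchase occasion. The sketch below shows the exponential-smoothing recursion for a single household and brand; the smoothing constant, starting value, and purchase sequence are illustrative rather than the values calibrated by Guadagni and Little.

```python
def loyalty_series(purchases, alpha=0.75, initial=0.5):
    """Exponentially smoothed brand loyalty.

    purchases: sequence of 0/1 flags, 1 if the brand was bought on that
    purchase occasion. Loyalty weights recent purchases most heavily:
    loyalty_t = alpha * loyalty_{t-1} + (1 - alpha) * buy_t.
    """
    loyalty, out = initial, []
    for buy in purchases:
        loyalty = alpha * loyalty + (1 - alpha) * buy
        out.append(round(loyalty, 3))
    return out

# A household that buys the brand on its first three trips, then switches away.
print(loyalty_series([1, 1, 1, 0, 0, 0]))
# -> [0.625, 0.719, 0.789, 0.592, 0.444, 0.333]
```

In the logit model itself, this loyalty value then enters each brand's utility alongside price and promotion variables.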
The logit models are powerful, but could be intimidating to managers who, according to John's decision calculus theme, should be able to understand the model well enough that they could control it. Managers wanted answers in a form they could digest. More importantly, computer technology had gotten to the point where the logit models could work behind the scenes to create automated reports in the form that managers could use. This thinking led to a decision support (expert) system termed CoverStory that was developed for Ocean Spray Cranberries, a fruit-processing cooperative. Ocean Spray tracks sales and assesses the effectiveness of its marketing program using large data bases collected through barcode scanners in supermarkets (Schmitz et al. 1990). For a brand manager, CoverStory rapidly and automatically computes and summarizes a large amount of output generated by the system's models. The output—structured as a memorandum to the manager—includes a single page of charts and a series of descriptive lines customized for the markets in which the brand competes, showing performance vis-à-vis competitors' brands. The number of brands, individual products, and regions make it infeasible to do such an analysis manually. Modern computers, artificial intelligence, and advanced marketing science models form the CoverStory system—they are combined in such a way as to have a direct, positive impact on marketing practices, as well as managerial efficiency and effectiveness. John had come a long way from Whirlwind.

Educator, Leader, Entrepreneur, and Service

Box 1.2 At Home with JDCL

John often invited students to his house for social functions that usually included squid tasting. A special time was Thanksgiving Day when John would invite foreign students and their families for dinner at his home in Lincoln. John also made it a practice to invite new faculty to Nantucket during the summer to enjoy the island and be exposed to New England culture. A stay in John's little cabin and fishing for bluefish off Miacomet Rip have provided particularly vivid memories for many. John loves seafood and claims that "anything from the sea was good to eat until proven otherwise." Sea urchin roe pizza is a Little specialty (Little 2008).

John is an innovative and devoted educator at all levels of the university—undergraduate, MBA, PhD. Since 1990, he has been chair of the Undergraduate Program Committee of the MIT Sloan School. As such, he is involved in policy matters, but he always has a group of undergraduate advisees. MIT undergraduates who major at MIT Sloan
receive an SB degree in Management Science, which John calls an MIT-style business degree (Little 2008). From the time he developed a course on OR in marketing at Case, John has been interested in teaching master's degree students how to solve marketing science problems. At MIT, he developed MIT's first course in marketing models—a course that was a staple fixture in the marketing group until marketing models ultimately invaded almost all marketing courses. In the 1970s, he pioneered a new specialty program at the MIT Sloan School called Fast Track. John would read all the files for admitted students, and, if they had very strong quantitative skills, he would invite them to join the Fast Track program. He found that the students thrived in the challenging advanced courses in mathematical programming, information technology, and statistics.

John has served MIT in many capacities. He was director of the Operations Research Center from 1969 to 1975, succeeding Philip Morse. For the MIT Sloan School, he headed the marketing group and eventually the Management Science Area (MSA) from 1972 to 1982. During this period, he was instrumental in making the MSA cohesive and interdisciplinary. In 1982, John was asked to work his magic again—the Behavioral and Policy Science Area (BPS) at MIT Sloan was formed after a major reorganization. It was not a cohesive group, and, surprisingly, did not include anyone who might be labeled either an operations researcher or management scientist. It was primarily a collection of faculty from the less quantitative fields of organizational studies, research and development management, human relations, and strategy. John led the group for six years, and his legacy was the establishment of a sound foundation for the area and a potent BPS faculty. In 1989, John was
appointed an MIT Institute Professor—a special rank and honor reserved for a very few faculty at the university. In this capacity, John has undertaken some sensitive and important MIT-wide projects. He reports directly to the Provost.

Box 1.3 John's Family

John's wife and former fellow physics graduate student, Betty, was an impressive scientist in her own right. For her PhD thesis, she studied the dynamic behavior of domain walls in barium titanate; she finished her thesis before John and published a paper on it in the Physical Review (E. A. Little 1955). She did not pursue a full-time career, being the one who agreed to stay at home as they raised their children: John N. (Jack), Sarah A., Thomas D. C. (Tom), and Ruel D. Betty became a teacher's aide during the time her children were in public school, and later, in 1985 at age 58, having become interested in Nantucket history and its Native American archaeology, received an MA in Anthropology, with concentrations in archaeology and geology, from the University of Massachusetts Amherst. Betty continued her archaeological research and writings for many more years. After a two-year battle with cancer, she died in 2003.

Their children could not escape their parents' scientific, engineering, and entrepreneurial influences. Jack Little graduated from MIT in electrical engineering and received an MS in Electrical Engineering from Stanford University. In 1984, he co-founded MathWorks, a leading developer of technical computing software for engineers and scientists in industry, government, and education. Sarah Little graduated from Stanford in physics and then joined the MIT-Woods Hole Oceanographic Institute PhD program, graduating in geophysics with a thesis that involved making dives in the deep-ocean submersible Alvin and collecting data on hydrothermal vents in the Pacific. Tom Little graduated from Rensselaer Polytechnic Institute in biological engineering, and earned a PhD in computer engineering from Syracuse University. He is now a professor in the Department of Computer Engineering, Boston University School of Engineering. With a former student he co-founded a web software and consulting firm, Molecular, Inc., which they have since sold. Ruel Little has a BA in physics from Johns Hopkins University and an SM in mechanical engineering from MIT. After working for many years for solar energy companies, he helped found GreenRay, a solar energy startup that is developing labor-saving technology that simplifies construction and installation of solar modules for delivering electricity directly into home appliances and lighting. As a grandfather, John answers to eight grandchildren.

John's interest in modeling real world problems led him, in 1967, to cofound Management Decision Systems, Inc. (MDS), a company whose objective was to create and commercialize marketing models and marketing decision-support software. MDS grew to more than 200 employees. In 1985, MDS merged with Information Resources, Inc. (IRI), with John serving on the IRI board until 2003. John continued his entrepreneurial activities by investing and serving on the board of a startup company, InSite Marketing Technology. InSite provided a new class of e-business applications that identify the buying style of the customer and the selling style of the company, and dynamically integrate them
Figure 1.2 John D. C. Little’s family, Nantucket Island, summer 2002. (John, Betty, and their children are noted in boldface.) Back row (left to right): John, Ruel, Sara, Max, Tom, Nancy, Jack. Front row (left to right): Betty, Kathy, Avery, Isaac, Sarah, Cora, Doug, Emily, Dyson, Erica. The families of John and Betty’s children are: Jack Little and Nancy Wittenberg/ Erica and Emily Little; Sarah Little and Doug Hersh/Cora and Isaac Hersh; Tom Little and Sara Brown/Max and Stephanie Little (Stephanie not born until 2003); Ruel and Kathy Little/Dyson and Avery Little.
to help the customer through the buying process. In 2000, InSite merged with the Kana Corporation, a multichannel customer service software company that integrates telephone, email, web chat, and collaboration channels with knowledge management capabilities in a unified application.

John served as president of ORSA in 1979 and president of TIMS between 1984 and 1985. During his ORSA term of office, he and Frank Bass, then president of TIMS, persuaded the two societies to found the joint journal Marketing Science. When the two societies merged in 1995 to form INFORMS, John was elected its first president; he also chaired the committee whose efforts led to the merger.

Honors and Awards

John has been recognized for his innovative and seminal research in marketing by: the Paul D. Converse Award, a lifetime achievement award given by the American Marketing Association (AMA) (1992); the AMA Charles Parlin Award for contributions to the practice of
marketing research (1979); and MIT's Buck Weaver Award for outstanding contributions to marketing (2003). He was elected to the National Academy of Engineering for outstanding contributions to operational systems engineering, including research, education, applications in industry, and leadership (1989). He has received the ORSA's George E. Kimball Medal for recognition of distinguished service to the society and profession of OR (1987), and the Distinguished Service Medal from TIMS. He is a member of the International Federation of Operational Research Societies' (IFORS) Operational Research Hall of Fame (Larson 2004), and a Fellow of both INFORMS and the INFORMS Society for Marketing Science (ISMS). John has received honorary degrees from the University of London; University of Liège, Belgium; and Facultés Universitaires Catholiques de Mons, Belgium. He has been honored by having a most prestigious annual award in marketing science named for him—the John D. C. Little Award—given annually by ISMS for the best marketing paper published in an INFORMS journal (including Marketing Science and Management Science).

Figure 1.3 John presenting plaques to the winners of the 2007 John D. C. Little Award at the 2008 Marketing Science Conference in Vancouver. Shown (left to right) are John, P. K. Kannan, and Brian T. Ratchford (co-author, Lan Luo, was not able to attend).

Appendix 1.1: Acronyms

ADBUDG: Advertising Budget
AMA: American Marketing Association
BRANDAID: Brand Aid (model to help evaluate the marketing mix)
INFORMS: Institute for Operations Research and the Management Sciences
ISMS: INFORMS Society for Marketing Science
MAXBAND: Maximum Bandwidth
MEDIAC: Media Calculus
MDS: Management Decision Systems
MS: Management Science(s)
OR: Operations Research/Operational Research
ORSA: Operations Research Society of America
RA: Research Assistant
TIMS: The Institute of Management Sciences
TSP: Traveling Salesman Problem
UPC: Universal Product Code
Notes

The authors, with appreciation, wish to note the following sources of material: "IFORS' operational research hall of fame: John D. C. Little," Larson (2004); and an unpublished
presentation by John at his eightieth birthday celebration. We also thank John Little for adding many historical facts and insights to our summary of his life and work.

This profile originally appeared in A. A. Assad, S. I. Gass (eds.), Profiles in Operations Research, International Series in Operations Research & Management Science 147, DOI 10.1007/978-1-4419-6281-2_36, © Springer Science+Business Media, LLC 2011. Reprinted here with kind permission of Springer Science+Business Media.
References

Bellman, R. E. 1957. Dynamic Programming. Princeton, New Jersey: Princeton University Press.
Gartner, N. H., M. Kelson, and J. D. C. Little. 1981. MAXBAND: A program for setting signals on arteries and triangular networks. Transportation Research Record 795: 40–46.
Guadagni, P. M., and J. D. C. Little. 1983. A logit model of brand choice calibrated on scanner data. Marketing Science 2 (3): 203–238.
Larson, R. 2004. IFORS' Operational Research Hall of Fame: John D. C. Little. International Transactions in Operational Research 11 (3): 361–364.
Little, E. A. 1955. Dynamic behavior of domain walls in barium titanate. Physical Review 98 (4): 978–984.
Little, J. D. C. 1955. Use of storage water in a hydroelectric system. Operations Research 3 (2): 187–197.
Little, J. D. C. 1961a. A proof for the queuing formula: L = λW. Operations Research 9 (3): 383–401.
Little, J. D. C. 1961b. Approximate expected delays for several maneuvers by a driver in Poisson traffic. Operations Research 9 (1): 39–52.
Little, J. D. C. 1966a. The synchronization of traffic signals by mixed-integer linear programming. Operations Research 14 (4): 568–594.
Little, J. D. C. 1966b. A model of adaptive control of promotional spending. Operations Research 14 (6): 1075–1097.
Little, J. D. C. 1970. Models and managers: The concept of a decision calculus. Management Science 16 (8): B466–B485.
Little, J. D. C. 1975a. BRANDAID: A marketing-mix model, Part 1: Structure. Operations Research 23 (4): 628–673.
Little, J. D. C. 1975b. BRANDAID: A marketing-mix model, Part 2: Implementation, calibration, and case study. Operations Research 23 (4): 656–673.
Little, J. D. C. 1977. Optimal adaptive control: A multivariate model for marketing applications. IEEE Transactions on Automatic Control AC-22 (2): 187–195.
Little, J. D. C. 1979. Aggregate advertising models: The state of the art. Operations Research 27 (4): 629–667.
Little, J. D. C. 2002. Philip M. Morse and the beginnings. Operations Research 50 (1): 146–149.
Little, J. D. C. 2007. Life as the first OR doctoral student—and other prehistoric tales. In The Operations Research Center at MIT, ed. I. Y. Larson. Hanover, MD: INFORMS Topics in Operations Research Series.
Little, J. D. C. 2008. Personal communication, May 31.
Little, J. D. C., and S. Cohen. 1982. The MAXBAND program for arterial signal timing plans. Public Roads 46 (2): 61–65.
Little, J. D. C., and S. C. Graves. 2008. Little's Law. In Building Intuition: Insights From Basic Operations Management Models and Principles, ed. D. Chhajed and T. J. Lowe. New York: Springer Science + Business Media.
Little, J. D. C., and L. Lodish. 1969. A media planning calculus. Operations Research 17 (1): 1–35.
Little, J. D. C., K. G. Murty, D. W. Sweeney, and C. Karel. 1963. An algorithm for the traveling salesman problem. Operations Research 11 (6): 972–989.
Morgan, J. T., and J. D. C. Little. 1964. Synchronizing traffic signals for maximal bandwidth. Operations Research 12 (6): 896–912.
Morse, P. M. 1958. Queues, Inventories and Maintenance. New York: John Wiley & Sons.
Morse, P. M., and G. Kimball. 1951. Methods of Operations Research. New York: John Wiley & Sons.
Schmitz, J. D., G. D. Armstrong, and J. D. C. Little. 1990. CoverStory—automated news finding in marketing. Interfaces 20 (6): 29–38.
I Marketing Science: Managerial Models
Introduction by Editors

In 1966, John Little published one of the first papers on the optimization of advertising spending. This paper was well ahead of its time, using advanced concepts in dynamic programming and optimal experimentation. In what was to become a "John Little" style, the math was sophisticated, but the ideas were accessible and easy to implement. Four years later, in 1970, John recognized that many models were becoming increasingly complicated, but divorced from application. He proposed, and illustrated, a roadmap for the successful implementation of marketing models in what is now known as "decision calculus." This paper became one of the most highly cited papers in the first fifty years of Management Science.

The papers in this section derive from John's research. They follow in the style of rigorous analysis and practical application. They focus on real problems faced by real managers, and combine theoretical rigor and the analysis of real data. They are truly John's progeny.
2 Optimal Internet Media Selection

Peter J. Danaher, Janghyuk Lee, and Laoucine Kerbache
2.1 Introduction
The “dot com” crash of 2000 resulted in an immediate downturn in Internet advertising that lasted a further two years. Subsequent to 2003, however, Internet advertising has grown at about 30% annually, to the point where online ad spend in the US was over $21 billion in 2007 (IAB 2008). This growth was also mirrored in Europe and Asia. Part of the reason for the slump, followed by the rapid growth, in online advertising has been a shift in the way advertising has been priced and sold. Advertising pricing models in traditional media are generally based on audience size, but the Internet’s unique ability to track browsing behavior meant that early online ad campaigns were often priced on “pay-per-click.” However, within a short time, clickthrough rates diminished to less than half of one percent (Manchanda et al. 2006). Indeed, the percentage of online ad campaigns priced on the basis of pay-per-click fell from 56% in 2000 to only 14% in 2007 (IAB 2002; 2008). The rapid demise of the pay-per-click pricing model has been matched by an equally dramatic rise in online ad revenue allocated to paid or sponsored search, as provided by search engines such as Google and Yahoo! Although paid search receives a lot of attention, over the past four years, the share of Internet ad spend devoted to this form of advertising has plateaued at about 40% (IAB 2005; 2006; 2007a; 2008). Instead, the silent, but steady, recent growth in online ad spend has been in display-type advertising (banners, popups, skyscrapers, etc.), where ad prices are based on audience size, a pricing model conventionally termed cost-per-thousand (CPM). CPM-based online advertising has increased steadily since a low point in 2004 and has now attained almost 50% of total spend, being over $11 billion in 2007 (IAB
2008). This is more than was spent on radio advertising in the entire US during that year (TNS Media Intelligence 2008). The increasing spending on Internet display advertising is due to a realization of the benefits this advertising format has on awareness, recognition, and attitude formation compared with simple clickthrough. Manchanda et al. (2006) review many industry and academic studies and conclude that: "banner advertising has attitudinal effects and click-through is a poor measure of advertising response." Indeed, their own study shows a significant link between banner ad exposure and downstream purchase.

Regarding paid search, Klaassen (2007) reports that it is largely confined to four major search engines (Google, Yahoo, AOL, and MSN), which dominate the gross online ad spend for sponsored search. In contrast to these four websites, there are millions of others that accrue ad revenue based on CPMs. One of the major reasons for initial caution with CPM-based Internet advertising was a lack of comparability with traditional media (Smith 2003) and inconsistent measurement methods (Bhat et al. 2002; Coffey 2001). Today, these problems are largely resolved, and now it would be unusual for a major ad campaign not to include an online display component.

Despite this shift and growth to CPM-based Internet advertising, there are currently no optimal media vehicle selection methods for the Internet in the published literature.1 Therefore, the purpose of this study is to develop an optimal media selection method suitable for Internet campaigns using display advertising. Media selection on the Internet presents some challenges that do not arise in traditional media, and so our method is not a simple adaptation of previous methods. Instead, we employ a previous audience size model that is customized to the Internet and additionally develop a budget function that is tailored to the unique way online advertising "impressions" are priced and sold by web publishers. For example, on the Internet it is possible to limit the number of times a person sees a particular ad. We find that our method is both accurate and computationally quick. Comparison with optimal schedules obtained by computationally slow complete enumeration shows that the schedules derived by our method are either the same or close to the true optimal schedule.

2.1.1 Audience Estimation and Optimal Scheduling for the Internet

A precursor to any media scheduling method is a requirement to develop reliable estimates of campaign audience size, such as reach and
frequency (Meskauskas 2003). Danaher (2007) reviews several proprietary and nonproprietary methods to estimate the audience for online ad campaigns. One model, a generalization of the negative binomial distribution, was shown to be very accurate at predicting page views and then downstream audience size measures like reach. However, obtaining good estimates of audience size is only part of the requirement for advertisers. Another key aspect is media selection, which optimally allocates the available budget across media vehicles so as to maximize an audience size measure (Bass and Lonsdale 1966; Rust 1986; Rossiter and Danaher 1998). This involves an examination of which media vehicles to select, and how much to advertise in each. Hence, in this study, we build on previous work by Danaher (2007) in which he constructed a model for just Internet audience exposure. Here, we extend this work to Internet media selection, which requires more than just a model for media audience size. In particular, appropriate objective and budget functions must be developed. In turn, the selected media comprise the eventual media schedule. The first comprehensive model for optimal media selection in traditional media was Little and Lodish’s (1969) MEDIAC model, developed for TV media scheduling. Later developments were made by Aaker’s (1975) ADMOD and Rust’s (1985) VIDEAC models. (Also see Danaher’s (1991) review of operations research methods for optimal media selection.) Heuristic optimization methods, like the so-called “greedy algorithm,” have proven popular and surprisingly accurate for optimizing media allocation (Rust 1986; Danaher 1991). There have also been a number of proprietary commercial software packages developed to optimize television advertising schedules. These include SuperMidas and X-pert (www.wpp.com). None of the proprietary methods have been benchmarked against the published methods, but the widespread uptake of commercial advertising optimization software is evidence that the advertising industry views them as extremely useful (Ephron 1998). Even though optimal media selection is now commonplace in traditional media, there are no nonproprietary methods available for online media. As noted earlier, the two major forms of Internet advertising are based on paid search and audience size. Since paid search advertising is a relatively distinctive form of advertising, and depends mostly on the choice of target key words, website design and content, and how much the advertiser is willing to pay per click,2 we do not consider this method of online advertising in this chapter. Instead, we concentrate
on Internet advertising scheduling for display ads, where the purpose is to reach a target group via a banner or some other form of visual advertisement. Our study focuses primarily on maximizing audience size measures across multiple websites. Here, we do not explicitly consider other media planning issues, such as the position, size, and animation of the ad, and creative fit with each website. The chapter is organized as follows. In the next section, we develop objective functions based on a robust Internet exposure model. Section 2.3 then constructs a budget function that allows for a different share of advertising impressions in each website and possible ad restrictions at the individual level, both issues being unique to the Internet. We then describe the data used to illustrate our method in section 2.4 and present the findings and applications in section 2.5.
2.2 Development of Objective Functions
The starting point for any optimal media planning system is the formulation of an objective function. In traditional media, objectives to be maximized include reach, frequency, and effective frequency (Rust 1986), and the Internet is no different. Reach is the proportion of the target audience exposed at least once to the advertising campaign, and is a very common advertising objective. Frequency is the average number of exposures per person among those reached (often expressed as gross rating points, GRPs, divided by the reach). Effective frequency is usually defined to be some threshold level of exposure, such as three (Krugman 1972; Naples 1979). That is, the objective might be to maximize the proportion of the target audience exposed at least three times. A model that forms the basis for measuring all of these objectives is the media exposure distribution. 2.2.1 Internet Advertising Exposure Distribution Models A formal definition of the exposure distribution is: let Xi be the number of exposures3 a person has to website i, Xi = 0 , 1, 2, … , i = 1, … , m , where m is the number of different websites. The exposure random variable m to be modeled is X = ∑ i =1 Xi , the total number of exposures to an advertising schedule. Therefore, X is a simple sum of random variables, but two non-ignorable correlations make modeling it difficult (Danaher 1992), as will be seen later. Danaher (2007) examined exposure distribution models for the Internet in detail.4 For a single website, Danaher (2007) and Huang and
Lin (2006) show that a negative binomial distribution (NBD) model fits the observed exposure distribution better than several other plausible models. Hence, for a single website, we also use the NBD with mass function:
$$\Pr(X_i = x_i \mid r_i, \alpha_i, t_i) = \binom{x_i + r_i - 1}{x_i}\left(\frac{\alpha_i}{\alpha_i + t_i}\right)^{r_i}\left(\frac{t_i}{\alpha_i + t_i}\right)^{x_i}, \quad x_i = 0, 1, 2, \ldots, \qquad (1)$$
where $r_i$ and $\alpha_i$ are the usual parameters for the gamma distribution. An additional parameter $t_i$, not incorporated by Danaher (2007), permits the NBD to be rescaled depending on the time interval used to estimate the parameters (see Lilien, Kotler, and Moorthy 1992, p. 34). For instance, if one week of data are used to estimate the model, but an advertiser wants to predict the exposure distribution for a four-week period, then all that changes is that $t_i$ goes from 1 to 4. This results from the additive property of the Poisson distribution that underpins the NBD, and makes it extremely versatile for Internet exposure distribution applications. For multiple websites, Danaher (2007) considered several possible models, including a multivariate generalization of the NBD. This multivariate NBD (denoted the MNBD) performed the best empirically as a model for predicting audiences to website ad campaigns, so we also employ it in this study. The full details are given by Danaher (2007), but briefly, the MNBD uses a method, developed by Sarmanov (1966), which combines several univariate distributions into a multivariate distribution, simultaneously taking into account pairwise and (possibly) higher-order interactions among the constituent random variables.5 Park and Fader (2004) introduced the Sarmanov method to the marketing literature for an application to purchases across two websites, and Danaher and Hardie (2005) applied it to grocery store purchases between two categories (bacon and eggs). The MNBD model for the full exposure distribution, with truncation after third-order terms, is:
$$f(X_1, X_2, \ldots, X_m) = \left\{\prod_{i=1}^{m} f_i(X_i)\right\}\left[1 + \sum_{j_1 < j_2} \omega_{j_1,j_2}\,\varphi_{j_1}(x_{j_1})\varphi_{j_2}(x_{j_2}) + \sum_{j_1 < j_2 < j_3} \omega_{j_1,j_2,j_3}\,\varphi_{j_1}(x_{j_1})\varphi_{j_2}(x_{j_2})\varphi_{j_3}(x_{j_3})\right], \qquad (2)$$
where, in the case of the NBD,
$$\varphi_i(x_i \mid t_i) = e^{-x_i} - \left(\frac{\alpha_i}{t_i(1 - e^{-1}) + \alpha_i}\right)^{r_i}, \qquad (3)$$
is called a mixing function and has the property that
$$\sum_{x_i} \varphi_i(x_i)\, f_i(x_i) = 0.$$
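As a concrete illustration, equations (1) and (3) are straightforward to compute. The following is a minimal Python sketch, not the authors' code: it assumes SciPy's negative binomial parameterization (number of successes $n = r_i$, success probability $p = \alpha_i/(\alpha_i + t_i)$), and the function names are ours.

```python
import numpy as np
from scipy.stats import nbinom


def nbd_pmf(x, r, alpha, t=1.0):
    """NBD mass function of equation (1): Pr(X_i = x | r_i, alpha_i, t_i)."""
    return nbinom.pmf(x, r, alpha / (alpha + t))


def mixing_phi(x, r, alpha, t=1.0):
    """Sarmanov mixing function of equation (3)."""
    return np.exp(-x) - (alpha / (t * (1.0 - np.exp(-1.0)) + alpha)) ** r
```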
The terms $\omega_{j_1,j_2}$ and $\omega_{j_1,j_2,j_3}$ are parameters measuring the second- and third-order associations among the websites.
2.2.2 Objective Functions
Formulas for reach, average frequency, and effective reach can then be derived from this MNBD model, as we now demonstrate. Applying equations (2) and (3) to model just reach, we have:
$$
\begin{aligned}
\text{Reach} &= 1 - f(X_1 = 0, X_2 = 0, \ldots, X_m = 0 \mid t_1, t_2, \ldots, t_m)\\
&= 1 - \left\{\prod_{i=1}^{m} f_i(x_i = 0 \mid r_i, \alpha_i, t_i)\right\}
\Bigg[1 + \sum_{j_1 < j_2} \omega_{j_1,j_2}
\left(1 - \Big(\tfrac{\alpha_{j_1}}{t_{j_1}(1-e^{-1}) + \alpha_{j_1}}\Big)^{r_{j_1}}\right)
\left(1 - \Big(\tfrac{\alpha_{j_2}}{t_{j_2}(1-e^{-1}) + \alpha_{j_2}}\Big)^{r_{j_2}}\right)\\
&\qquad + \sum_{j_1 < j_2 < j_3} \omega_{j_1,j_2,j_3}
\left(1 - \Big(\tfrac{\alpha_{j_1}}{t_{j_1}(1-e^{-1}) + \alpha_{j_1}}\Big)^{r_{j_1}}\right)
\left(1 - \Big(\tfrac{\alpha_{j_2}}{t_{j_2}(1-e^{-1}) + \alpha_{j_2}}\Big)^{r_{j_2}}\right)
\left(1 - \Big(\tfrac{\alpha_{j_3}}{t_{j_3}(1-e^{-1}) + \alpha_{j_3}}\Big)^{r_{j_3}}\right)\Bigg]. \qquad (4)
\end{aligned}
$$
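Equation (4) can be evaluated directly from fitted parameters. The sketch below is illustrative rather than the authors' implementation: the NBD parameters are passed as arrays, the Sarmanov weights as dictionaries keyed by site pairs and triples, and the optional argument s anticipates the share-of-impressions adjustment of section 2.3 (equation (10)).

```python
import numpy as np


def reach(r, alpha, t, omega2=None, omega3=None, s=None):
    """Reach of equation (4), or of equation (10) when shares s are supplied."""
    r, alpha, t = map(np.asarray, (r, alpha, t))
    s = np.ones_like(t, dtype=float) if s is None else np.asarray(s)
    omega2 = omega2 or {}
    omega3 = omega3 or {}

    # P(X_i = 0): NBD zero class, with alpha effectively rescaled by s_i.
    p0 = (alpha / (alpha + s * t)) ** r
    # phi_i(0) = 1 - (alpha_i / (s_i t_i (1 - e^-1) + alpha_i))^{r_i}
    phi0 = 1.0 - (alpha / (s * t * (1.0 - np.exp(-1.0)) + alpha)) ** r

    correction = 1.0
    for (i, j), w in omega2.items():          # pairwise Sarmanov terms
        correction += w * phi0[i] * phi0[j]
    for (i, j, k), w in omega3.items():       # third-order Sarmanov terms
        correction += w * phi0[i] * phi0[j] * phi0[k]
    return 1.0 - np.prod(p0) * correction
```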
If the objective function is average frequency, then:
$$\text{Av.Frequency} = \frac{\text{GRPs}}{\text{Reach}} = \frac{\sum_{i=1}^{m} E[X_i]}{\text{Reach}} = \frac{\sum_{i=1}^{m} r_i t_i / \alpha_i}{\text{Reach}}. \qquad (5)$$
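Average frequency then follows mechanically from equation (5). A small illustrative helper (names ours), with the shares s entering the expected exposures in the same way as in equation (10):

```python
def average_frequency(r, alpha, t, s, reach_value):
    """Average frequency of equation (5): GRPs divided by reach."""
    grps = sum(si * ri * ti / ai for si, ri, ti, ai in zip(s, r, t, alpha))
    return grps / reach_value
```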
Lastly, if the objective function is effective reach at three exposures, then:
$$\text{Effective Reach} = \Pr(X \geq 3) = 1 - \sum_{x=0}^{2} f_X(x), \qquad (6)$$
$$f_X(x) = \sum_{\{(x_1, \ldots, x_m):\, x_1 + \cdots + x_m = x\}} f(X_1, X_2, \ldots, X_m), \quad x = 0, 1, 2, \ldots, \qquad (7)$$
where the joint distribution $f(X_1, X_2, \ldots, X_m)$ is obtained from equation (2).
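Evaluating the convolution in equation (7) over the full MNBD is tedious; for intuition, $\Pr(X \geq 3)$ can be approximated by simulation. The sketch below deliberately drops the Sarmanov association terms and treats the sites as independent NBDs, so it is only an illustrative approximation, not the calculation used in the chapter.

```python
import numpy as np


def effective_reach_mc(r, alpha, t, s, k=3, n_draws=200_000, seed=0):
    """Monte Carlo approximation of Pr(X >= k), ignoring site associations."""
    rng = np.random.default_rng(seed)
    r, alpha, t, s = map(np.asarray, (r, alpha, t, s))
    p = alpha / (alpha + s * t)             # NBD success probability per site
    draws = rng.negative_binomial(r, p, size=(n_draws, len(r)))
    total = draws.sum(axis=1)               # X = sum_i X_i for each simulated person
    return (total >= k).mean()
```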
Equations (4), (5), and (6) are measures only of audience size. They are only part of the requirement for a system to optimally select websites. An equally important component is the budget function, and we now show how the Internet presents some unique challenges when it comes to formulating a budget function.
2.3 Development of the Budget Function
A very common measure of media cost that is well established in broadcast and print media is cost-per-thousand (CPM) (see, for example, Sissors and Baron 2002; Rossiter and Danaher 1998). CPM is the dollar cost charged by the broadcaster or publisher for 1000 exposures among people in the target audience. CPM has now also become established in online media (for non-search advertising placements), but there is a slight modification. Instead of being the cost for 1000 target audience people, it is the cost for 1000 impressions.6 This difference has important ramifications for Internet media planning, which makes it distinct from traditional media, as we now explain. For traditional media, the 1000 people exposed in a CPM measure are unique, whereas for online, the 1000 impressions need not be served to 1000 different people. The reason for this is that for online media the user controls the rate of ad delivery. For instance, if cnn.com puts the same ad on its home page for a week, then a twice-a-day visitor will be served fourteen impressions of that ad during that week. Hence, depending on the incidence of repeat visits to a website, 1000 impressions will result in an ad being delivered to much fewer than 1000 people. It could be one hundred people exposed ten times, fifty people exposed twenty times, etc. (Chandler-Pepelnjak and Song 2004). This distinctive aspect of online CPMs must be allowed for in any optimal online media selection procedure. The incidence of high repeat visits to a website by the same person is of concern to advertisers who sometimes wish to limit the number of impressions delivered to visitors, known as “frequency capping.” We now discuss media planning and scheduling issues peculiar to the Internet. 2.3.1 Share of Impressions Websites that sell advertising on the basis of CPM usually set a time period of a week or a month in which the ad impressions are delivered
to site visitors. For websites with a lot of traffic, ad impressions can accumulate very quickly, into the hundreds of thousands or millions. For example, at a typical CPM of $20, five million impressions in a week costs an advertiser $100,000, which is very expensive for a single website. Consequently, very few advertisers can afford to purchase every available impression, nor would they want to. It would be akin to buying every available commercial slot on a TV network for a week. Conversely, probably very few web publishers would want to give exclusive coverage to a single advertiser even for a week. Therefore, web advertisers must settle for a share of the available impressions per site for a fixed time period. The concept of share of impressions (SOI) has been an established measure in regard to Internet media planning for some time (see, for example, AdRelevance 2001). We denote this share as si for website i, which can be thought of as the probability that a website visitor who requests a web page has a particular ad delivered on that page. For instance, Dell might purchase a 5% share of impressions for a month-long campaign on msn.com. This means that each page served to msn.com visitors in that month has a 5% chance of containing a Dell advertisement. Accommodating a share of impressions into the reach estimate is straightforward, due to the additive property of the Poisson distribution. All that changes is the α i parameter in equation (4) is divided by si (Lilien, Kotler, and Moorthy 1992, p. 34). A similar adjustment is required for the mixing functions in equation (3). An additional factor alongside the share of impressions is a possible guaranteed number of impressions during the campaign period. Figures of 100,000 to 500,000 impressions are common (Zeff and Aronson 1997). Such impression targets are usually easy for large websites to fulfill, but if the site falls short of its guarantee, then a “make good” is expected. This might entail an extended campaign period until the impression guarantee is achieved or a credit applied to the next advertising purchase. 2.3.2 Frequency Capping As mentioned earlier, online advertising has a particular quirk whereby the same person can be exposed to an ad many times at a rate determined by the user, not the publisher. Chandler-Pepelnjak and Song (2004) report the findings from thirty-eight online ad campaigns and reveal that sales conversions and profits from banner ads are highest for somewhere between three and ten impressions. They
therefore recommend some form of frequency capping where possible. Frequency capping can be executed by using a cookie to identify whether or not a particular computer has previously visited a website and therefore been served an ad. Visitors who have been served an ad, say, ten times already will not be served any more ads for the duration of the campaign. Of course, some people do regularly clear their cookies (comScore 2007), but the majority do not, so that frequency capping can go some way to increasing the number of unique visitors to a site. This will increase the reach. It will also mean that advertisers will not waste money on the "… thousands of users who receive hundreds of impressions without any response" (Chandler-Pepelnjak and Song 2004, p. 4). Operationalizing frequency capping in our framework is relatively straightforward. Recall that we can think of $s_i$ as the probability a visitor will be served an impression on website i. If we additionally apply a frequency cap, then $s_i$ becomes a conditional probability, with an impression being served conditional on the user having not exceeded the frequency cap. The probability of not exceeding the cap is denoted $\Pr(X_i \leq \text{cap})$ and can be obtained from the univariate NBD model as $\sum_{x=0}^{\text{cap}} \Pr(X_i = x)$. Now, the conditional probability for being served an ad impression, given that the frequency cap has not been exceeded, is:
$$s_i / \Pr(X_i \leq \text{cap}). \qquad (8)$$
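In code, the adjustment in equation (8) amounts to dividing each share by an NBD tail probability; a brief illustrative sketch (names ours):

```python
from scipy.stats import nbinom


def capped_soi(s_i, r_i, alpha_i, t_i, cap):
    """Adjusted share of impressions under a frequency cap, equation (8)."""
    p_under_cap = nbinom.cdf(cap, r_i, alpha_i / (alpha_i + t_i))  # Pr(X_i <= cap)
    return s_i / p_under_cap
```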
Notice that the cap is the same for each website. It is generally impossible to impose a cap on total exposures ($X = \sum_i X_i$) rather than on $X_i$, since individual websites typically do not have information about the browsing behavior of visits to sites other than their own.7
2.3.3 Notation for Budget Constraint
As with conventional media planning, advertisers in the online environment must also work within a total budget, which we denote as B. This might be disaggregated into a budget per website, $B_i$, so that $\sum_{i=1}^{m} B_i \leq B$. All websites selling advertising based on audience size report a rate-card CPM, denoted $c_i$, for website i. In addition, they will give standard time intervals for an ad to be delivered on their website, such as a week, a month, or longer. Alternatively, they might report a fixed cost for a week or month of advertising and guarantee a certain number of impressions within that time period. Under this
arrangement, an advertiser could set $B_i$ to be the website's fixed cost and select a subset of sites that maximizes reach subject to $\sum_{i=1}^{m} B_i \leq B$. For the NBD model, the expected number of impressions per person in the time interval $[0, t_i]$ is $r_i t_i/\alpha_i$. Hence, the expected total number of impressions among the target population to website i is $N r_i t_i/\alpha_i$, where N is the target population size. For a share, $s_i$, of the total impressions on website i in a time interval $[0, t_i]$, the expected number of ad impressions delivered is $N s_i r_i t_i/\alpha_i$, at an expected cost per website of $(N/1000)\, c_i s_i r_i t_i/\alpha_i$. The overall budget constraint is, therefore,
$$\frac{N}{1000}\sum_{i=1}^{m}\frac{c_i s_i r_i t_i}{\alpha_i} \leq B. \qquad (9)$$
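The budget constraint in equation (9) is likewise a one-line calculation per website; an illustrative sketch (names ours):

```python
import numpy as np


def campaign_cost(s, c, r, alpha, t, N):
    """Expected campaign cost implied by equation (9).

    Each site's expected ad impressions are N * s_i * r_i * t_i / alpha_i,
    billed at the CPM c_i (dollars per 1000 impressions).
    """
    s, c, r, alpha, t = map(np.asarray, (s, c, r, alpha, t))
    expected_impressions = N * s * r * t / alpha
    return (expected_impressions / 1000.0) @ c


def within_budget(s, c, r, alpha, t, N, B):
    return campaign_cost(s, c, r, alpha, t, N) <= B
```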
2.3.4 Statement of Optimization Problem
Applying the share of impressions adjustment and the budget constraint in equation (9), as well as the reach model in equation (4), the online advertising optimization problem for maximizing reach8 by varying $(s_1, s_2, \ldots, s_m)$ can be stated formally as follows:
$$
\begin{aligned}
\text{Maximize Reach} &= 1 - f(X_1 = 0, X_2 = 0, \ldots, X_m = 0 \mid t_1, t_2, \ldots, t_m, s_1, s_2, \ldots, s_m)\\
&= 1 - \left\{\prod_{i=1}^{m} f_i(x_i = 0 \mid r_i, \alpha_i, t_i, s_i)\right\}
\Bigg[1 + \sum_{j_1 < j_2} \omega_{j_1,j_2}
\left(1 - \Big(\tfrac{\alpha_{j_1}}{s_{j_1} t_{j_1}(1-e^{-1}) + \alpha_{j_1}}\Big)^{r_{j_1}}\right)
\left(1 - \Big(\tfrac{\alpha_{j_2}}{s_{j_2} t_{j_2}(1-e^{-1}) + \alpha_{j_2}}\Big)^{r_{j_2}}\right)\\
&\qquad + \sum_{j_1 < j_2 < j_3} \omega_{j_1,j_2,j_3}
\left(1 - \Big(\tfrac{\alpha_{j_1}}{s_{j_1} t_{j_1}(1-e^{-1}) + \alpha_{j_1}}\Big)^{r_{j_1}}\right)
\left(1 - \Big(\tfrac{\alpha_{j_2}}{s_{j_2} t_{j_2}(1-e^{-1}) + \alpha_{j_2}}\Big)^{r_{j_2}}\right)
\left(1 - \Big(\tfrac{\alpha_{j_3}}{s_{j_3} t_{j_3}(1-e^{-1}) + \alpha_{j_3}}\Big)^{r_{j_3}}\right)\Bigg], \qquad (10)
\end{aligned}
$$
subject to $0 \leq s_i \leq 1$, and
$$\frac{N}{1000}\sum_{i=1}^{m}\frac{s_i c_i r_i t_i}{\alpha_i} \leq B. \qquad (11)$$
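For readers who want to reproduce this kind of calculation, the program can be posed with any off-the-shelf sequential quadratic programming solver (the authors use the IMSL NCONF subroutine; see section 2.5.1). The sketch below uses SciPy's SLSQP together with the helper functions sketched earlier; it is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize


def optimal_soi(r, alpha, t, c, omega2, omega3, N, B):
    """Maximize the reach of equation (10) subject to the budget of equation (11)."""
    m = len(r)

    def objective(s):                     # minimize the negative of reach
        return -reach(r, alpha, t, omega2, omega3, s=s)

    budget = {"type": "ineq",             # B - cost >= 0
              "fun": lambda s: B - campaign_cost(s, c, r, alpha, t, N)}
    bounds = [(0.0, 1.0)] * m             # 0 <= s_i <= 1
    start = np.full(m, 0.01)              # small starting shares

    result = minimize(objective, start, method="SLSQP",
                      bounds=bounds, constraints=[budget])
    return result.x, -result.fun          # optimal shares and the reach achieved
```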
If frequency capping is to be used, the only change to the optimization problem setup is to use equation (8): each $s_i$ is replaced with
$s_i / \Pr(X_i \leq \text{cap})$, with the denominator probability obtained from the univariate NBD in equation (1). Equation (11) is the budget constraint for Internet media selection where websites permit an advertiser to purchase any share of impressions for a fixed time period, and the amount paid will vary depending on the share purchased. That is, the amount paid by the advertiser to the website is determined by the advertiser's chosen share of impressions, the website's CPM, and the expected total impressions for the campaign duration. This might be described as the 'Fully Flexible' allocation method. However, other websites recommend or require that an advertiser have a minimum spend on their site and therefore state a fixed charge per period of advertising (such as a week or month). If we denote the fixed period cost per website as $c_i^f$, then an alternative optimization problem is to vary $(s_1, s_2, \ldots, s_m)$ so as to maximize reach, subject to $0 \leq s_i \leq 1$ and
$$\sum_{i=1}^{m} c_i^{f}\, I_{\{s_i > 0\}} \leq B, \qquad (12)$$
where $I_{\{s_i > 0\}}$ is an indicator function, being 1 if any share of impressions is allocated to site i and 0 otherwise. We call this the "Fixed Cost Per Website" allocation method.
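Because the constraint in equation (12) depends only on which sites receive a positive share, one illustrative way to solve the "Fixed Cost Per Website" problem is to enumerate the feasible subsets of candidate sites (with the ten sites of section 2.4, only 2^10 subsets). The helper site_soi below is hypothetical: it is assumed to return the share-of-impressions vector implied by a fixed weekly purchase on each chosen site, and `reach` is the sketch given earlier.

```python
from itertools import combinations


def best_fixed_cost_schedule(c_f, B, r, alpha, t, omega2, omega3):
    """Pick the subset of sites with the largest model-based reach whose
    fixed charges c_f sum to at most B (equation (12))."""
    m = len(c_f)
    best = (0.0, ())
    for k in range(1, m + 1):
        for subset in combinations(range(m), k):
            if sum(c_f[i] for i in subset) > B:
                continue
            s = site_soi(subset)          # assumed helper: SOI bought per site
            best = max(best, (reach(r, alpha, t, omega2, omega3, s=s), subset))
    return best                            # (reach achieved, chosen sites)
```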
2.3.5 Parameter Estimation
We follow Danaher (2007) and estimate the parameters of the univariate NBD models by the method of means and zeros. The bivariate and trivariate association parameters ($\omega_{j_1,j_2}$ and $\omega_{j_1,j_2,j_3}$) are estimated by using Danaher's (2007) equations (9) and (11), which equate the observed non-reach to that implied by the parametric model. Maximum likelihood estimation is also possible, but Danaher (2007) notes that it does not do as well at reach estimation, so we do not use it here.
2.4 Data
2.4.1 KoreanClick Our Internet data are provided by KoreanClick (www.koreanclick.com), a Korean market research company that specializes in Internet audience measurement. KoreanClick maintains a panel of Internet users, between ages six and seventy-five, selected on the basis of stratified sampling in South Korea. Potential candidates for the panel are selected by telephone random-digit dialing. After the person agrees to be a
panel member, he or she receives authorization from KoreanClick by both email and postal mail to register as a panel member. The panel member is counted as active if they connect to the Internet at least once during the previous four weeks. The Internet usage behavior of each panelist is monitored by a module, called “iTrack,” which captures all Internet activity by panel member at their home or office. This method is very similar to that employed by comScore (Coffey 2001) and Nielsen/Netratings in the US and many other countries, but has the added advantage of capturing panelists’ Internet use at work as well as home. The only major difference is that KoreanClick’s panel size is around 6500, while comScore’s and Netratings’ are, respectively, one million and 30,000 people. This makes KoreanClick information less suitable for purchase transactions, where conversion rates are about 2%, but its sample size is ample for website audience estimation, especially for popular sites, as used in our application. There are several major performance indicators of Internet usage, including page views, visitors, unique visitors, and reach (Novak and Hoffman 1997). A page view is the act of browsing a specific website page. When a visitor accesses a web page, a request is sent to the server hosting the page, and a page view occurs when the page is fully loaded. At this point, an impression or exposure is delivered, as the site visitor is (potentially) exposed to the page contents, including advertisements. Following Danaher (2007), we define a page view as equivalent to an exposure in the case of traditional media such as television, radio, and magazines. We gathered data for two four-week periods, March 3–30 and June 2–29, 2003. The number of “active” panel members, who actually used the Internet in these periods, is 4729 and 4681, respectively, for March and June. We use March for model estimation and June for validation. According to KoreanClick, the “Internet population” in South Korea at the time this data was obtained was 23,658,097. However, since only 4729 of the 6468 total panelists actually used the Internet in March, the effective Internet population size for our data is N = 23658097 × ( 4729 / 6468) = 17 , 297 , 331 . 2.4.2 Website Description To measure a panelist’s Internet usage behavior, we use page views of the index pages (similar to the cover page of magazine) of ten selected websites in three major categories: community portal, news, and search
engine, as shown in table 2.1. We selected these ten sites for their popularity among all types of Internet users and because they have a relatively high reach (if a website has a small reach, its page view data become more volatile). All sites attract similar gender proportions among their visitors except news sites, which receive more visits from men. Portal sites and search engines are highly visited; for example, the community portal site 1 (daum.net) reached more than 70% of the total active Internet users in March 2003. To measure the average impressions, we divided all page impressions by total Internet users, being a measure similar to Gross Rating Points, which is frequently used by traditional media. The average number of page impressions indicates the overall exposure rate of the given site to all Internet users. 2.4.3 Website Costs Table 2.1 also gives realistic CPM and fixed cost per week advertising rates for the ten websites, as provided by Daehong, a Korean advertising agency. For example, msn.co.kr (the Korean version of msn.com) charges $10 per thousand impressions, or alternatively, $8000 for an ad placement for one week on their website. For a CPM of $10, it is not difficult to calculate that $8000 amounts to a purchase of (8000 / 10) × 1000 = 800 , 000 impressions. Since the total impressions for msn.co.kr in a week is 3,269,306, the share of impressions for a weekly ad posting is 0.2447. Hence, an advertiser buying space on msn. co.kr for a week will have their ad delivered on about every fourth page impression that is served up to the site’s visitors. We additionally report frequency capping information in table 2.1. The probabilities given in table 2.1 are those used to adjust the si values, as shown in section 2.3.2, namely replace si with si / Pr(Xi ≤ cap), where Pr(Xi ≤ cap) comes from the univariate NBD model. For instance, the most popular website, daum.net, delivers over sixty-four million impressions per week. If visitors are limited to at most ten ad impressions of a particular advertisement, then 90.7 percent of visitors will not exceed the cap, meaning that about 9% will exceed the cap. Once this happens, these 9% of visitors will not receive any more of the designated ads (in so far as this is possible through the use of cookies) for the remainder of the weeklong period. As would be expected, when the cap is lowered to five impressions, a greater percentage (33.5%) exceed the cap. For the other nine websites, exceeding
Table 2.1 Audience and Cost Information for Korean Websites.

Site no. | Domain Home Page | Category | Reach, March 2003 (%) | Mean Page Impressions, March 2003 | Av.Freq, March 2003 | Total Impressions | CPM c_i ($) | Weekly Fixed Cost c_i^f ($) | Pr(X_i <= cap), cap=10 | Pr(X_i <= cap), cap=5 | Reach, June 2003 (%) | Av.Freq, June 2003
1 | daum.net | portal | 70.4 | 3.74 | 5.3 | 64,654,720 | 15 | 15,000 | .907 | .765 | 69.1 | 6.3
2 | dreamwiz.com | portal | 12.0 | .42 | 3.5 | 7,259,028 | 8 | 9,000 | .993 | .978 | 10.3 | 3.6
3 | msn.co.kr | portal | 9.7 | .19 | 1.9 | 3,269,306 | 10 | 8,000 | .999 | .996 | 9.4 | 2.1
4 | chosun.com | news | 7.8 | .20 | 2.6 | 3,477,751 | 10 | 7,000 | .998 | .992 | 7.2 | 2.6
5 | joins.com | news | 7.1 | .18 | 2.6 | 3,137,656 | 8 | 5,000 | .999 | .993 | 6.1 | 2.3
6 | hani.co.kr | news | 4.5 | .15 | 3.4 | 2,589,114 | 8 | 5,000 | .998 | .993 | 3.3 | 2.7
7 | donga.com | news | 5.2 | .13 | 2.5 | 2,267,304 | 10 | 5,000 | .999 | .995 | 4.6 | 2.1
8 | naver.com | search engine | 41.6 | 1.82 | 4.4 | 31,405,812 | 10 | 7,000 | .961 | .896 | 47.8 | 4.2
9 | kr.yahoo.com | search engine | 52.0 | 2.17 | 4.2 | 37,571,416 | 12 | 5,000 | .957 | .874 | 49.0 | 4.1
10 | empas.com | search engine | 25.3 | .69 | 2.7 | 12,009,395 | 10 | 12,000 | .994 | .971 | 24.2 | 2.5
the cap of ten impressions is rare and is even uncommon for a cap of five, indicating that "frequency overload" for these sites is not especially problematic.
2.5 Results
2.5.1 Optimizing Reach
We wrote a program in FORTRAN to find the optimal solution that maximizes reach subject to either of the cost constraints in equations (11) and (12). Given that we have a nonlinear objective function, a subroutine named NCONF from the IMSL (1997) library was used to solve the optimization problem.9 This algorithm uses a successive quadratic programming method (Schittkowski 1983). The same algorithm and subroutine are used to maximize effective reach and average frequency. We set the campaign duration to be either one or four weeks (thereby using data for either the first or all four weeks of March), and had total budget levels of $50,000 or $100,000. No frequency capping, as well as frequency caps of five and ten impressions, were also employed. In all cases, our method arrives at the optimum solution in less than 0.04 seconds using an IBM laptop running at 1.6GHz. Tables 2.2a and 2.2b, respectively, report the optimal reach values for total budgets of $50,000 and $100,000. For example, a campaign with a budget of $50,000, lasting one week, with no frequency cap, achieves a reach of 21.8% when the purchaser has full flexibility, and 20.4% when there is a fixed cost per website. Table 2.3 gives the share of impressions (SOI), number of impressions, and cost for these two optimal campaigns. It shows that for the fully flexible schedule, nine of the ten possible websites should be used, while the fixed-cost schedule uses seven sites. Interestingly, in both cases, the most popular website
Table 2.2a Reach For Different Campaign Durations and Frequency Capping Levels. Budget = $50,000.

No. Campaign Weeks | Website Purchase Flexibility | Frequency Cap: None | Frequency Cap: 10 | Frequency Cap: 5
1 | Full | 21.8% | 22.2% | 23.0%
1 | None | 20.4% | 20.6% | 21.4%
4 | Full | 22.7% | 26.5% | 31.4%
4 | None | 21.6% | 24.8% | 28.6%
Table 2.2b Reach For Different Campaign Durations and Frequency Capping Levels. Budget of $100,000.

No. Campaign Weeks | Website Purchase Flexibility | Frequency Cap: None | Frequency Cap: 10 | Frequency Cap: 5
1 | Full | 35.0% | 35.5% | 36.8%
1 | None | 33.6% | 34.2% | 35.9%
4 | Full | 36.8% | 42.7% | 48.5%
4 | None | 35.6% | 41.6% | 46.3%
(daum.net) is not in the optimal schedule. This is likely due to its high CPM relative to its reach (see online appendix section A3). As a point of comparison, a naïve media planner might choose to spend the entire budget on the website with the greatest reach. In this case, spending $50,000 on daum.net buys a SOI of 5.16% and achieves a reach of 15.5%. This is well below the optimum reach of 21.8% in table 2.2a. A less naïve, but intuitively reasonable, media schedule, is to buy weeklong campaigns in each of the top five sites (numbered 1, 2, 8, 9, and 10). The cost of doing this is within budget, at $48,000, but the reach is only 19.5%, again lower than the optimal reach obtained by our method. Hence, it is apparent that using an intuitive heuristic, such as picking the most popular sites, does not guarantee the highest overall reach.10 Part of the reason for this is the correlation in exposure among websites. For example, website 7 (donga.com) has moderate correlation with all the other news sites. Therefore, this site is not included in the optimal schedule which has a fixed cost per website, where three other news sites are already included.11 Also worth noting in table 2.2a is that in every case the optimal reach for the fully flexible purchase is higher than when sites charge a fixed cost for a set time period. This is because allowing more flexibility in impression purchases permits the budget to be spread over more websites, thereby increasing the overall reach. Table 2.2a also shows the benefit of frequency capping, with the reach increasing as the frequency cap is lowered, as would be expected. Also demonstrated is the positive effect on reach if a campaign has longer duration. Going to four weeks from one week allows extra time for more unique visitors, thereby increasing the reach. If, additionally, a frequency cap is applied, the reach can increase substantially. For
example, for a budget of $50,000 with a frequency cap of five in a four-week campaign, the fully flexible schedule achieves a reach of 31.4% compared with the no-cap, one-week, fully flexible schedule optimal reach of 21.8%. The nearly ten percentage point increase is achieved at no additional cost. The rightmost half of table 2.3 gives the optimal schedule when a frequency cap of five is applied, and the budget is $50,000. For the fully flexible schedule, the number of impressions declines for sites 2, 3, 4, 5, 6, 7, and 10, but increases for sites 8 and 9. It must be remembered that these impressions have the proviso that, where possible, no person receives more than five ad impressions. There is a large increase in the number of impressions purchased for the Korean version of yahoo.com, as it has a lower probability of being below the cap (0.874) than most of the other sites. The same is true of naver.com. Hence, the additional impressions purchased will be served to a larger pool of unique (new) visitors. Table 2.2b mirrors 2.2a, but with double the budget. Diminishing returns effects are apparent, since the no-cap, one-week, fully flexible optimal schedule produces a reach of 35.0%, up from 21.8%, which is not a doubling of the reach. Again, the same effects of frequency capping and longer campaign duration are evident for the optimal schedules with a larger budget.
2.5.2 Optimizing Other Audience Measures
In addition to reach, we now obtain optimal schedules for three other objective functions: effective reach at three exposures, effective reach subject to a minimum reach,12 and average frequency. Maximizing effective reach subject to a reach minimum is very similar to maximizing $\Pr(X \geq 3)$, but adds a further constraint in addition to those given in section 2.3.4, namely that $\Pr(X \geq 1) \geq r_{\min}$, where $r_{\min}$ is the specified minimum reach. Table 2.4 reports the optimal schedules. As seen from section 2.5.1, the highest reach is obtained by spreading the budget over nine of the ten available websites. By contrast, maximizing effective reach requires the concentration of ad spend into vehicles that are the most cost effective at obtaining high repeat exposures. The second, fifth, and sixth sites fulfill these criteria. If an additional minimum reach threshold is added, the optimal schedule spreads the budget over more sites than for effective reach. We chose a minimum reach of 16%, being midway between the lowest reach in table 2.4 (10.6%) and the highest (21.8%).
Table 2.3 Optimal Schedules for Korean Websites. Budget is $50,000 over a 1-week period.

No Frequency Capping:
Site no. | Domain Home Page | Fully Flexible SOI* | Fully Flexible Impressions | Fully Flexible Cost ($) | Fixed Cost Per Website SOI | Fixed Cost Per Website Impressions | Fixed Cost Per Website Cost ($)
1 | daum.net | 0 | 0 | 0 | 0 | 0 | 0
2 | dreamwiz.com | .0920 | 667,825 | 5343 | .1550 | 1,125,000 | 9000
3 | msn.co.kr | .1419 | 463,401 | 4634 | 0 | 0 | 0
4 | chosun.com | .0468 | 162,864 | 1629 | .2013 | 700,000 | 7000
5 | joins.com | .1322 | 414,729 | 3318 | .1992 | 625,000 | 5000
6 | hani.co.kr | .0621 | 160,742 | 1286 | .2414 | 675,000 | 5000
7 | donga.com | .0232 | 52,680 | 527 | 0 | 0 | 0
8 | naver.com | .0464 | 1,456,107 | 14561 | .0223 | 700,000 | 7000
9 | kr.yahoo.com | .0196 | 736,174 | 8834 | .0111 | 416,667 | 5000
10 | empas.com | .0822 | 986,903 | 9869 | .0999 | 1,200,000 | 12000
Fully Flexible Reach = 21.8%; Fixed Cost Per Website Reach = 20.4%

Frequency Cap = 5 Impressions:
Site no. | Domain Home Page | Fully Flexible SOI | Fully Flexible Impressions† | Fully Flexible Cost ($) | Fixed Cost Per Website SOI | Fixed Cost Per Website Impressions† | Fixed Cost Per Website Cost ($)
1 | daum.net | 0 | 0 | 0 | .0155 | 1,000,000 | 15000
2 | dreamwiz.com | .0828 | 601,396 | 4812 | .1550 | 1,125,000 | 9000
3 | msn.co.kr | .0613 | 200,187 | 2002 | 0 | 0 | 0
4 | chosun.com | .0366 | 127,249 | 1272 | .2013 | 700,000 | 7000
5 | joins.com | .0941 | 295,320 | 2363 | 0 | 0 | 0
6 | hani.co.kr | .0558 | 144,579 | 1157 | 0 | 0 | 0
7 | donga.com | .0176 | 39,803 | 398 | 0 | 0 | 0
8 | naver.com | .0482 | 1,512,465 | 15125 | .0223 | 700,000 | 7000
9 | kr.yahoo.com | .0319 | 1,198,250 | 14379 | 0 | 0 | 0
10 | empas.com | .0707 | 849,362 | 8494 | .0999 | 1,200,000 | 12000
Fully Flexible Reach = 23.0%; Fixed Cost Per Website Reach = 21.4%

*SOI = Share of Impressions
†This is the total number of impressions, with the proviso that no one receives more than five.
Table 2.4 Optimal Schedules for Alternative Objective Functions. Budget is $50,000 over a one-week period.

Objective Function: Reach
Site no. | Website | SOI* | Impressions | Cost ($)
1 | daum.net | 0 | 0 | 0
2 | dreamwiz.com | .0920 | 667,825 | 5343
3 | msn.co.kr | .1419 | 463,401 | 4634
4 | chosun.com | .0468 | 162,864 | 1629
5 | joins.com | .1322 | 414,729 | 3318
6 | hani.co.kr | .0621 | 160,742 | 1286
7 | donga.com | .0232 | 52,680 | 527
8 | naver.com | .0464 | 1,456,107 | 14561
9 | kr.yahoo.com | .0196 | 736,174 | 8834
10 | empas.com | .0822 | 986,903 | 9869
Reach = 21.8%; Effective Reach = 1.50%; Average Frequency = 1.35

Objective Function: Effective Reach
Site no. | Website | SOI | Impressions | Cost ($)
1 | daum.net | 0 | 0 | 0
2 | dreamwiz.com | .5492 | 3,987,457 | 31900
3 | msn.co.kr | 0 | 0 | 0
4 | chosun.com | 0 | 0 | 0
5 | joins.com | .3469 | 1,088,665 | 8709
6 | hani.co.kr | .4533 | 1,173,879 | 9391
7 | donga.com | 0 | 0 | 0
8 | naver.com | 0 | 0 | 0
9 | kr.yahoo.com | 0 | 0 | 0
10 | empas.com | 0 | 0 | 0
Reach = 14.7%; Effective Reach = 4.55%; Average Frequency = 2.46

Objective Function: Effective Reach subject to Reach ≥ 16%
Site no. | Website | SOI | Impressions | Cost ($)
1 | daum.net | 0 | 0 | 0
2 | dreamwiz.com | .4370 | 3,172,562 | 25380
3 | msn.co.kr | 0 | 0 | 0
4 | chosun.com | .0365 | 126,795 | 1268
5 | joins.com | .4294 | 1,347,515 | 10780
6 | hani.co.kr | .4627 | 1,198,281 | 9586
7 | donga.com | .0195 | 44,322 | 443
8 | naver.com | .0047 | 149,082 | 1491
9 | kr.yahoo.com | 0 | 0 | 0
10 | empas.com | .0088 | 105,114 | 1051
Reach = 16.0%; Effective Reach = 4.38%; Average Frequency = 2.22

Objective Function: Average Frequency
Site no. | Website | SOI | Impressions | Cost ($)
1 | daum.net | 0 | 0 | 0
2 | dreamwiz.com | 0 | 0 | 0
3 | msn.co.kr | 0 | 0 | 0
4 | chosun.com | 0 | 0 | 0
5 | joins.com | 1.0 | 3,137,656 | 25107
6 | hani.co.kr | 1.0 | 2,589,114 | 20717
7 | donga.com | .1841 | 417,570 | 4176
8 | naver.com | 0 | 0 | 0
9 | kr.yahoo.com | 0 | 0 | 0
10 | empas.com | 0 | 0 | 0
Reach = 10.6%; Effective Reach = 4.28%; Average Frequency = 3.36

*SOI = Share of Impressions
Lastly, to maximize average frequency, the media strategy requires a heavy concentration in a handful of websites. Indeed, in this example, all the available impressions are purchased for sites 5 and 6, with the balance of the ad budget spent on donga.com. These are all costeffective sites, where the cost of purchasing all the impressions is less than the total budget of $50,000. 2.5.3 Optimization by Complete Enumeration A slow but sure method to optimally select online media vehicles is to laboriously calculate the reach for every possible campaign that stays within the budget. This method is known as complete enumeration and provides a benchmark to compare with the optimal reach obtained via our model-based method. We operationalize the complete enumeration method by using an empirically derived estimate of reach, as detailed in section A4 of the online appendix (appendix 2.A.1). Briefly, we use the raw data to simulate the exposure distribution for share of impression values at each website, while staying within the budget constraint. Since the simulated exposure distribution is not based on the MNBD model, this is a test of both model accuracy as well as the efficiency of the optimization algorithm. For a total budget of $50,000, the complete enumeration method took over sixty-eight hours for the fully flexible schedule (using increments of 0.0001 for each of the si ). We did not even attempt to obtain the fully flexible optimal schedule using complete enumeration for a budget of $100,000, as the projected calculation time is 174 days. For the fixed cost per website schedule, complete enumeration took around twenty-one minutes, but this increases substantially to twenty-two hours when the campaign budget goes to $100,000 (as the number of feasible solutions increases by a factor of over sixty). For a budget of $50,000, the complete enumeration optimal schedule is exactly the same as for our model-based method, while for a budget of $100,000 the complete enumeration schedule is close to ours and has a reach higher by only 0.1 percentage point. Since our method is very much faster, taking no longer than 0.04 seconds, yet produces the same or similar optimal schedules, it is apparent that our method has some advantages. As a final test for our proposed method we undertook a validation check by comparing model-based reach estimates to those derived empirically across two time periods—March for estimation and June for the holdout period. Moreover, we obtained the optimal schedule and reach values in the holdout period. Section A5 of the online
appendix gives the details, and shows that the optimal schedules and reach values across the two months are either the same or similar, thereby demonstrating reasonable validity for our method.
2.6 Discussion and Conclusion
The Internet presents challenges for online media planners that are not prevalent in traditional media. The main difference is the rate of delivery of advertisements, which is determined by the visit frequency to a site by a web user. Although web publishers have the ability to limit the number of ads delivered by frequency capping, this is not without its problems, since the same person can access a website from different computers. The purchase of online advertising is also different from other mass media, and so the budget function needs to be correspondingly suited to the new online arena. Like exposure distribution models, media selection methods for traditional media are not applicable to online media. Instead, our method is to use a multivariate generalization of the NBD as an exposure distribution model that is specifically tailored to the Internet—and is also adaptable to online media selection, where a share of impressions, a fixed number of impressions, or a purchase covering a fixed time period is required. That is, the MNBD is not only a good prediction model for Internet audience exposure, but it can also accommodate several different online advertising purchase methods. The downstream nonlinear optimization problem can be solved with a standard algorithm in a fraction of a second, and the resulting optimal solution is either the same or very similar to that obtained by a complete enumeration. However, the complete enumeration takes an order of magnitude longer, at best twenty-one minutes and at worst sixtyeight hours. Moreover, in prediction validity tests, our method derives optimal schedules that are identical to those obtained in future time periods for moderate budget levels and are very close to the maximum achievable reach for larger budgets, where there are many more feasible solutions. Therefore, the method we develop for optimal online media selection appears to have some merit. Our findings also have some implications for media planners wishing to use the Internet. First, an obvious point is that reach usually increases as the number of websites in the campaign increases. Although this is well known for traditional media vehicles, current practice in online media purchases is for fewer sites (Klaassen 2007).
Second, where possible, online media planners should still attempt to buy a set number of impressions in a fixed time period, where the number of impressions is determined by the media planner, not the web publisher. We call this fully flexible scheduling. A common method of online ad purchasing typically requires buyers to purchase impressions for a fixed period (such as a week or month) at a fixed rate, and then the site’s CPM determines the number of impressions that will be delivered. This fixed cost per website arrangement gives the buyer less flexibility and always delivers lower overall reach. Third, frequency capping always increases the reach, but another, less obvious way of increasing reach is to lengthen the campaign duration. Indeed, the combination of frequency capping and extending the campaign from one to four weeks increases the reach by almost 50% for the same budget in our example. Our method can also be used to maximize average frequency and effective reach. Other aspects of online media scheduling, such as the timing and placement of ads within a website have already been considered by Adler et al. (2002), Kumar et al. (2007), and Nakamura and Abe (2005). Furthermore, targeting to specific demographic segments has not been demonstrated here, but it is not difficult to implement, as it requires that only a subset of the full data be obtained and the model refitted. We leave it to future research to embellish on these facets of optimal online media scheduling. Notes Reprinted by permission, Peter J. Danaher, Janghyuk Lee, and Laoucine Kerbache, “Optimal Internet Media Selection,” Marketing Science 29 (2), 336–347. ©2010, the Institute for Operations Research and the Management Sciences (INFORMS), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228 USA. The authors thank KoreanClick for providing the data. Janghyuk Lee’s work was supported by a Korea University research grant and Laoucine Kerbache’s work was partially supported by the HEC Foundation. An online appendix to this paper is available as part of the online version that can be found at http://dx.doi.org/10.1287/mksc.1090.0507. 1. By optimal media vehicle selection we mean choosing websites and allocating advertising budget to the selected websites, so as to maximize an audience measure, such as reach, while keeping within a budget constraint. There has been previous work on optimal placement and timing of banner ads within a single website. See, for example, Adler, Gibbons, and Matias (2002), Kumar, Dawande, and Mookerjee (2007), and Nakamura and Abe (2005). This placement and timing stage is done separately for each website, but occurs subsequent to the media selection stage that we examine. Both stages comprise the eventual media schedule.
2. In addition, Google, and more recently Yahoo, position sponsored search advertisers on the match between what a web user is looking for and whether a particular advertiser can fulfill that need (using a proprietary matching algorithm). For all major search engines, advertisers pay only when a user clicks on their displayed link. Such a revenue model is often termed ‘performance-based’. See, for example, Seda (2004) for a detailed description of how to conduct search engine advertising. 3. For Internet advertising, by “exposure” we mean an ad impression. Our data do not have explicit information on ad impressions, only page views/impressions. Therefore, we treat an exposure and a page impression as equivalent. 4. Further details on how univariate and multivariate exposure distributions are developed and applied in this setting are given in section A1 of the online appendix. 5. Also see in section A2 of the online appendix, which discusses the sensitivity of the MNBD to pairwise correlations among websites, and the number of interaction terms to use in the Sarmanov model. 6. The Internet Advertising Bureau defines an Impression to be “A measurement of responses from a Web server to a page request from the user browser” (IAB 2007b). Included in the page served to a user is likely to be some form of advertising, such as banners, popups, skyscrapers, etc. 7. Frequency capping of total exposures should be possible when a third-party ad serving company has control over the delivery of impressions for the entire campaign. The same general principle of adjusting the si values can still be applied in this situation. 8. We work primarily with reach, but it is easy to change the objective function to be either average frequency or effective reach, as given, respectively, in equations (5) and (6). See section 2.5.2. 9. Implementation details for this subroutine are in section A1.3 of the online appendix. 10. Such intuitive schedules, which choose just the dominant websites, are common in both the United States (Klaassen 2007) and Europe (LemonAd 2003). Indeed, concentration of online advertising expenditures in fewer, but larger, websites is increasing (Klaassen 2007). 11. As a matter of interest, we also obtained optimal schedules when it is assumed there is no association among websites (i.e., set ω j1 j2 = 0 and ω j1 j2 j3 = 0 in equation (2)). In every case, the reach for this alternative optimal schedule is lower than when correlations are included. The difference increases for larger budgets, stronger correlations, and more websites. This justifies the inclusion of the bivariate and trivariate associations into the media exposure model. Also see section A2.1 in the online appendix. 12. We thank the editor for this suggestion.
References Aaker, David A. 1975. ADMOD: An Advertising Decision Model. Journal of Marketing Research (February):37–45. Adler, Micah, Phillip B. Gibbons, and Yossi Matias. 2002. Scheduling Space-Sharing for Internet Advertising. Journal of Scheduling 5:103–119.
AdRelevance. 2001. “Intelligence: Special Report,” http://www.imagedomainweb.com/ adrelevance/Reportb.htm, (accessed, February 24, 2009). Patrick, Barwise T., and Andrew S.C. Ehrenberg. 1988. Television and its Audience. London, UK: Sage. Bhat, Subodh, Michael Bevans and Sanjit Sengupta. 2002. “Measuring Users’ Web Activity to Evaluate and Enhance Advertising Effectiveness,” Journal of Advertising, 31 (Fall), 97–106. Bass, Frank M., and Ronald T. Lonsdale. 1966. “An Exploration of Linear Programming in Media Selection,” Journal of Marketing Research, May, 179–188. Chandler-Pepelnjak, John, and Young-Bean Song. 2004. “Optimal Frequency: The Impact of Frequency on Conversion Rates,” Atlas Digital Marketing Technologies, www.atlasdmt.com/pdf/OptFrequency.pdf. (accessed March 5, 2007). Coffey, Steve. 2001. “Internet Audience Measurement: A Practitioner’s View,” Journal of Interactive Advertising, 1, 2, (http://jiad.org/article9.html). comScore. (2007) “Cookie-Based Counting Overstates Size of Web Site Audiences.” http://www.comscore.com/press/release.asp?press=1389 [accessed May 1, 2007]. Danaher, Peter J. 1991. Optimizing Response Functions of Media Exposure Distributions. Journal of the Operational Research Society 42 (7):537–542. Danaher, Peter J. 1992. Some Statistical Modeling Problems in the Advertising Industry: A Look at Media Exposure Distributions. American Statistician 46 (4):254–260. Danaher, Peter J. 2007. Modeling Page Views Across Multiple Websites With An Application to Internet Reach and Frequency Prediction. Marketing Science 25 (May/June): 422–437. Danaher, Peter J., and Bruce G.S. Hardie. 2005. Bacon with Your Eggs? Applications of a New Bivariate Beta-Binomial Distribution. American Statistician 59 (November): 282–286. Drèze, Xavier, and Fred Zufryden. 1998. Is Internet Advertising Ready for Prime Time? Journal of Advertising Research 38 (3):7–18. Ephron, Erwin. 1998. Optimizers and Media Planning. Journal of Advertising Research 38 (4):47–56. Huang, Chun-Yao, and Chen-Shun Lin. 2006. Modeling the Audience’s Banner Ad Exposure for Internet Advertising Planning. Journal of Advertising 35 (2):23–37. IMSL. 1997. IMSL MATH/LIBRARY User’s Manual. Houston, TX: Visual Analytics Inc. IAB. 2002. “IAB Internet Advertising Revenue Report,” June 2002, http://www.iab.net/ media/file/resources_adrevenue_pdf_IAB_PWC_2001Q4.pdf. IAB. 2005. “IAB Internet Advertising Revenue Report,” April 2005, http://www.iab.net/ media/file/resources_adrevenue_pdf_IAB_PwC_2004full.pdf. IAB. 2006. “IAB Internet Advertising Revenue Report,” April 20, 2006, http:// www.iab.net/media/file/resources_adrevenue_pdf_IAB_PwC_2005.pdf. IAB. 2007a. “IAB Internet Advertising Revenue Report,” May 23, 2007, http:// www.iab.net/media/file/resources_adrevenue_pdf_IAB_PwC_2006_Final.pdf.
IAB. 2007b. “Glossary of Interactive Advertising Terms,” http://www.iab.net/resources/ glossary.asp#i. IAB. 2008. “IAB Internet Advertising Revenue Report,” May 23, 2008, http:// www.iab.net/resources/adrevenue/pdf/IAB_PwC_2007_full_year.pdf. Klaassen, Abbey. 2007. Economics 101: Web Giants Rule ‘Democratized’ Medium. Advertising Age 8 (April). Krugman, Herbert E. 1972. Why Three Exposures May Be Enough. Journal of Advertising Research 12 (6):11–14. Subodha, Kumar, Lilind Dawande, and Vijay S. Mookerjee. 2007. Optimal Scheduling and Placement of Internet Banner Advertisements. IEEE Transactions on Knowledge and Data Engineering 19 (November): 1571–1584. LemonAd (2002), “Online Advertising Results: Year 2002 in Europe.” Lilien, Gary L., Philip Kotler, and K. Sridhar Moorthy. 1992. Marketing Models. Englewood Cliffs, New Jersey: Prentice Hall. Little, John D. C., and Leonard M. Lodish. 1969. A Media Planning Calculus. Operations Research 17 (1):1–35. Manchanda, Puneet, Jean-Pierre Dube, Khim Yong Goh, and Pradeep K. Chintagunta. 2006. The Effect of Banner Advertising on Internet Purchasing. Journal of Marketing Research 43 (February):98–108. Meskauskas, Jim. (2003), “Reach and Frequency—Back in the Spotlight,” iMedia Connection, www.imediaconnection.com. Morrison, Donald G. 1979. Purchase Intentions and Purchasing Behavior. Journal of Marketing 43 (Spring):65–74. Nakamura, Atsuyoshi, and Naoki Abe. 2005. Improvements to the Linear Programming Based Scheduling of Web Advertisements. Electronic Commerce Research 5:75–98. Naples, Michael J. 1979. Effective Frequency: The Relationship Between Frequency and Advertising Effectiveness. New York, NY: Association of National Advertisers. Novak, Thomas P., and Donna L. Hoffman. 1997. New Metrics for New Media: Toward the Development of Web Measurement Standards. World Wide Web Journal 2 (1):213–246. Park, Young-Hoon, and Peter S. Fader. 2004. “Modeling Browsing Behavior at Multiple Websites.” Marketing Science, 23 (Summer), 280–303. Rossiter, John R., and Peter J. Danaher. 1998. Advanced Media Planning. Norwell, MA: Kluwer Academic Publishers. Rust, Roland T. 1985. Selecting Network Television Advertising Schedules. Journal of Business Research 13:483–494. Rust, Roland T. 1986. Advertising Media Models: A Practical Guide. Lexington, MA: Lexington Books. Salas, Saturino L., and Einar Hille. (1978).Calculus: One and Several Variables with Analytic Geometry. 3rd ed. New York: Wiley.
Sarmanov, O. V. 1966. Generalized Normal Correlations and Two-Dimensional Frechet Classes. [Soviet Mathematics] Doklady 168:596–599. Schittkowski, K. 1983. “On the Convergence of Sequential Programming Method with an Augmented Lagrangian Line Search Function,” Mathematik Operationsforschung und Statistik. Serie Optimization 14:197–216. Seda, Catherine. 2004. Search Engine Advertising: Buying Your Way to the Top to Increase Sales. Indianapolis, IN: New Riders Publishing. Sissors, Jack Z., and Roger B. Baron. 2002. Advertising Media Planning. 6th ed. Chicago: McGraw-Hill. Smith, David L. 2003. “Online Reach and Frequency: An Update,” April, http:// www.mediasmithinc.com/white/msn/msn042003.html. TNS Media Intelligence, 2008. “TNS Media Intelligence Reports U.S. Advertising Expenditures Grew 0.2 Percent in 2007.” http://www.businesswire.com/news/home/ 20080325005529/en/TNS-Media-Intelligence-Reports-U.S.-Advertising-Expenditures# .VVTLgmOQlIE, March 25. Zeff, Robbin, and Brad Aronson. 1997. Advertising on the Internet. New York: John Wiley and Sons.
3 Strategic Marketing Metrics to Guide Pathways to Growth
John H. Roberts, Rajendra Srivastava, and Pamela D. Morrison
3.1 Introduction
Marketing metrics have attracted considerable attention over the past twelve years, with the Marketing Science Institute listing the topic as its top research priority from 1998 to 2010 (Roberts, Kayande, and Stremersch 2014, web appendix 3.2), reflecting the pressure from member companies to be able to prove the value of their marketing investments. This pressure comes partly from an evidence-based management culture summarized by the adage, “If you can’t measure it, you can’t manage it.” Growth of techniques such as Six Sigma (6σ) management, quality control, and performance-based contracts are manifestations of this trend. It is facilitated by the automatic capture of considerable quantities of data through digitization, and an increasing sophistication of the tools to understand them. In areas such as the cost effectiveness of web lead generation expenditures, not only are details of inputs and results readily accessible, but efficiency of conversion is routinely supplied (see, for example, Google Analytics at http:// www.google.com/intl/en_uk/analytics). This has led to easy experimentation and learning, and a high resultant penetration of metrics for decision making in specific areas (Lilien 2011). We can see this analytical approach to calibrating the effectiveness of marketing activity as a natural extension of the pioneering work of Little in using operations research techniques to optimize expenditure (e.g., Little 1975) and make the results actionable by managers (e.g., Little 1979). However, despite efforts to improve the accountability of marketing, many marketers believe that the profession is becoming marginalized, particularly at senior levels of the organization (Verhoef and Leeflang 2010). We feel that one reason for this trend is that a lot of the issues that metrics address are not the ones foremost in the mind of the CEO.
In particular, they are primarily operational rather than strategic. By that we mean that much metrics research deals with optimizing (or at least improving) the performance of existing mature products in their current markets. While this work is undoubtedly valuable, there are also other issues exercising the mind of the CEO. Primary amongst these is the idea of net growth (e.g., Carey and Pasalos-Fox 2006). What is true of areas of management focus is also true of the measures used to calibrate them. Indeed, it is interesting that for the 2010– 2012 Research Priorities, the Marketing Science Institute (2010) subtly changed the wording of its top research priority from “Marketing metrics” to “Using Market Information to Identify Opportunities for Profitable Growth.” Marketers often advocate that marketing expenditure should be seen as an investment not a cost (Doyle 2000). For this to be a credible option, marketers must be able to demonstrate the return on that investment (see, for example, Gupta and Zeithaml 2006) and its impact on market capitalization (Srinivasan and Hanssens 2009). Senior levels of the organization often regard the ability of marketers to provide them with this material as limited. For example, in a survey of CEOs, McKinsey consultants found that close to three out of four agreed with the statement: marketers “are always asking for more money, but can rarely explain how much incremental business this money will generate”(Gordon and Perrey 2015). In a survey of 600 CEOs, a major marketing performance measurement firm, the Fournaise Marketing Group, found that “73% of CEOs think marketers lack business credibility: they can’t prove they generate business growth.” (FournaiseTrack, June 15, 2011, https://www.fournaisegroup.com/ marketers-lack-credibility). We believe that it is essential to match the metrics used to evaluate past performance (and planned future activity) in different productmarkets to the specific environments to those markets, and in particular the stage of the product-market life cycle (PMLC)1 in which they operate. To that extent, we argue for what we call an “unbalanced score card” for individual units of the business. Of course, it is much easier to calibrate response in mature product-markets, but they reveal only a part of the financial performance of the firm and an even smaller part of the change in value of the firm, a subject central to the agenda of the CEO. All too often we get trapped into using historically based metrics in existing (and largely mature) product-markets, largely because actual performance data on new products and new markets cannot be
available and what we are doing today is more salient than what we may do in the future. While it is more difficult to implement, in this paper we concentrate on metrics that are strategic in that they focus on measuring and managing the growth and risk of existing productmarket portfolios in growth and decline, new product-markets, and the balance between these across the organization. We start by looking at the issues facing the manager across different stages of existing product-market life cycles and suggest metrics for each. We proceed from there to address changes in scope: how the manager can calibrate opportunities in new product-markets using the idea of market-based assets and the Ansoff (1965) matrix to categorize growth potential. Finally, in addition to metrics to manage individual units, we consider metrics that might be useful to focus on the organization as a whole, proposing measures to examine the sum of the parts, accounting for synergy and overlap over time. To study which metrics might be useful to C-level managers interested in improving marketing accountability, in figure 3.1 we first define the components necessary to assess accountability: inputs (whether they are available and appropriate), outputs (both immediate and eventual financial returns, and generation of long-term market assets), and the efficiency of transformation (value for resources committed, given the external environment). Marketing metrics can be used in two distinct ways: backward-looking (to understand and reward past performance); and forward-looking (to plan future activity and resource allocation). We focus primarily on the latter.
Figure 3.1 Framework for Assessing Marketing Accountability. (Inputs: elements of the marketing mix and other assets deployed. External environment: competitive environment; channel, customer, and regulatory. Outputs: marketplace and financial returns; market-based assets generated. Efficiency of conversion: effectiveness and efficiency of the marketing mix, given the external environment.)
Marketing inputs refer to the customer-facing resources devoted by the firm to create value. These will primarily be elements of the marketing mix, but may also extend to complementary activities such as market research or training. Inputs will also include the use of other assets, both market-based (such as brand equity) and non-market-based (such as physical plant). Outputs will be in terms both of immediate rewards such as sales (and associated measures of revenue, profit, market share, etc.) and of the market-based assets that marketing activity creates to enable the generation of future income streams. The efficiency of conversion of inputs to outputs will include variables such as return on marketing investment and asset turnover. This efficiency will be a function of the external environment, including customer trends, competitive activity, economic climate, and the actions of channels.
Factors external to the firm are clearly important determinants of future performance and optimal firm strategy, but they represent a largely neglected side of the study of market metrics. For example, a study of three of the most popular books on marketing metrics (Ambler 2003, Davis 2006, and Farris et al. 2010) reveals 165 different metrics. Of these, only seven relate to the external environment. Market-based assets will also moderate the efficiency and effectiveness of the conversion process in figure 3.1.
We proceed to develop a framework for strategic marketing metrics by considering the firm's mature product-markets, where much metrics research to date has taken place. We review the use of marketing metrics in these markets and the types of management decisions that they can inform. From there, we examine product-markets in their growth and decline stages. We look at the marketing challenges involved in these environments, which leads to a consideration of appropriate metrics. We consider growth in new markets and for new products, before looking at the firm as a whole and proposing strategic metrics that should be useful to ensure portfolio balance.

3.2 Managing Existing Product-Markets
There is much successful work using marketing metrics focused on optimization of the marketing mix for the firm's existing products in mature markets (see, for example, Hanssens and Dekimpe 2010). This provides considerable guidance to managers in these product-markets, as well as a foundation from which to build a metrics framework to inform other strategic challenges.
3.2.1 Metrics for Managing Mature Product-Markets
Financial returns to marketing activity may come either from contemporaneous market response, resulting in increased profits during the current period, or from greater intangible assets (such as brand equity, customer equity, or collaborator equity2), allowing the firm to have greater profits in future periods.
Contemporaneous Marketing Effects
When product-markets are stationary, there are a limited number of ways of improving earnings or increasing profits. The profit equation can be written:
\pi_{ijt} = S_{ijt}\,(p_{ijt} - v_{ijt}) - F_{ijt} - C_{ijt}, \quad \text{and} \quad S_{ijt} = I_{ijt}\,M_{ijt} \qquad (1)
where:
\pi_{ijt} = Profit of product j in market i in time period t
S_{ijt} = Unit sales of product j in market i in time period t
p_{ijt} = Price (average) of product j in market i in time period t
v_{ijt} = Variable costs of product j in market i in time period t
F_{ijt} = Fixed costs of product j in market i in time period t (excluding marketing)
C_{ijt} = Marketing costs of product j in market i in time period t
I_{ijt} = Industry size of category in which j competes in market i in time period t
M_{ijt} = Market share of product j in market i in time period t
This profit equation identifies the possible ways in which a firm can increase its profit in such product-markets. Other than fixed cost initiatives, profit growth in a given period can only come from category growth, share growth, or margin growth. Again, with further drill down, it is easy to understand the possible ways in which these may occur. Category growth (market development) may come from new users, new usages, more volume per usage, or more usage occasions. Net share growth (customer wins) can result from increased acquisition, higher retention rates, or a larger share of wallet of current users. Finally, margin growth can be achieved by higher prices (greater differentiation or better value capture) or lower variable costs (for example, see Porter 1980). Costs can usefully be added as long as the resultant value added to (and captured from) consumers exceeds the associated cost. See Kumar (2004) for an example.
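To make the decomposition concrete, the following sketch evaluates equation (1) under a base case and under category, share, and margin growth. All numbers and the specific scenarios are hypothetical and purely illustrative; they are not drawn from the chapter.

```python
# Minimal sketch of the profit identity in equation (1), with hypothetical inputs.
# pi = S * (p - v) - F - C, where S = I * M.

def profit(I, M, p, v, F, C):
    """Profit of product j in market i for one period (equation 1)."""
    S = I * M                      # unit sales = industry size * market share
    return S * (p - v) - F - C     # contribution, less fixed and marketing costs

base = dict(I=1_000_000, M=0.20, p=5.00, v=3.00, F=150_000, C=100_000)
print(f"base profit: {profit(**base):,.0f}")

# Other than fixed-cost initiatives, profit growth can only come from category
# growth, share growth, or margin growth, as the text notes.
scenarios = {
    "category +5%":   {**base, "I": base["I"] * 1.05},
    "share +1 point": {**base, "M": base["M"] + 0.01},
    "price +2%":      {**base, "p": base["p"] * 1.02},
}
for name, inputs in scenarios.items():
    print(f"{name}: profit = {profit(**inputs):,.0f}")
```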
Carryover (Future) Effects of Marketing Activity
In addition to profits in the period in which marketing expenditure is incurred, marketing may result in a company's ability to enhance brand equity (the value of product group j across consumers, π_{jt}) or customer equity (the value of consumers in market i across products, CLV_{it})—that is, increased market-based assets. We summarize the challenges of stationary existing product-markets in figure 3.2. Customer equity (or lifetime value of consumers) from current products is obtained by summing across rows (e.g., Gupta, Lehmann, and Stuart 2004), while brand (or product) equity in existing markets is obtained by summing profits down columns (Ailawadi, Lehmann, and Neslin 2003). Clearly, marketing productivity in existing product-markets is enhanced by increasing the earnings density in the matrix in figure 3.2 (e.g., cross-selling and additional usage to increase revenues; enhancing quality, service levels, and image to justify higher margins, etc.).
Figure 3.2 Illustrative strategies for net growth in existing mature product-markets. (The figure is a matrix of customers (accounts, segments, etc.) against products (brands, SKUs, etc.), with shaded cells representing product-customer combinations with current sales. Summing a customer's row gives total customer equity, CLV_i; summing a product's column gives total brand equity; together these roll up to market value. The numbered strategies are retain, cross-sell, grow usage, and increase margins.)
Unsurprisingly, marketing researchers and academics have developed market metrics such as market share, brand loyalty, customer retention or usage rates, and price premiums to assess the success of marketing actions such as advertising, promotions, and cross-selling efforts (e.g., see Farris et al. 2010). Input variables include product attributes (Mizik and Jacobson 2008), price (Ailawadi, Lehmann, and Neslin 2001), advertising, and promotions (Mela, Gupta, and Lehmann 1997), among others. The effect
of these activities on final outputs—including financial ones, such as profit and firm value (e.g., Pauwels et al. 2004)—has been supplemented by the intermediate variables of market-based assets, including brand equity (Keller and Lehmann 2003) and customer satisfaction (Anderson, Fornell, and Mazvancheryl 2004).
The marketing literature has provided valuable insights into performance calibration in this area. Most of this work uses econometric techniques applied to historical data. For this past performance evaluation contribution to be made more relevant to forward-looking planning, there is room for more research to be undertaken on understanding differences between drivers of past, successfully harvested opportunities and those of potential future ones.
To combine the contemporaneous and future profit opportunities in existing product-markets we move from profits in time t to profits over the planning horizon from t+1 to t+K (adjusted by the appropriate discount rate to reflect risk and the time value of money). Thus, the value of a product-market to the firm (assuming a given marketing strategy), ENPV_{ijt}, may be written:
ENPV_{ijt} = \sum_{k=1}^{K} \frac{\pi_{ijt+k}}{(1 + d_{ijt+k})^k} + T_{ijt+K} \qquad (2)
where d_{ijt} is the discount rate for product j in market i at time t, and T_{ijt+K} is the terminal value of product j in market i at time t+K in current dollars. The value of market-based assets is factored into equation (2) by their ability to enhance profits in periods beyond the one in which they were created.
Figure 3.1 and the above discussion lead directly to metrics that will be useful to the CEO in understanding future potential in mature, existing product-markets (outlined in figure 3.3), and how that potential might respond to alternative strategies and different marketplace scenarios. Figure 3.3 summarizes metrics currently in use (see, for example, Ambler 2003, Davis 2006, and Farris et al. 2010). The CEO can see not only performance measures, such as return on marketing investment (ROMI, which compares current plans to no marketing investment), but can more usefully compare performance measures to other benchmarks, such as increasing or decreasing marketing investment by, say, 20%. That is, return on incremental marketing investment (ROIMI) is a lot more informative than return on marketing investment (ROMI).
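To illustrate how equation (2) supports these plan comparisons, the sketch below values one product-market under the base marketing plan and under plans with spend raised or lowered by 20%, and then computes a simple ROIMI figure. The response curve, contribution margin, horizon, discount rate, and all dollar amounts are assumptions made up for the example.

```python
import math

def enpv(spend, horizon=5, discount=0.10, terminal=0.0):
    """Expected net present value of a product-market (equation 2) under a given
    annual marketing spend, using an assumed concave sales-response curve."""
    value = 0.0
    for k in range(1, horizon + 1):
        sales = 2_000_000 * math.log1p(spend / 100_000)   # hypothetical response curve
        profit = 0.30 * sales - spend                      # 30% contribution margin (assumed)
        value += profit / (1 + discount) ** k
    return value + terminal / (1 + discount) ** horizon

base = 250_000
plans = {"base plan": base, "plan A (+20%)": 1.2 * base, "plan B (-20%)": 0.8 * base}
values = {name: enpv(spend) for name, spend in plans.items()}
for name, v in values.items():
    print(f"{name}: ENPV = {v:,.0f}")

# One simple ROIMI-style measure (definitions vary): change in ENPV per extra
# dollar of planned spend over the horizon, relative to the base plan.
extra_spend = (plans["plan A (+20%)"] - base) * 5
roimi = (values["plan A (+20%)"] - values["base plan"]) / extra_spend
print(f"plan A ROIMI: {roimi:.2f}")
```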
Figure 3.3 Metrics to calibrate alternative marketing plans in existing mature product-markets. Each row of the figure records inputs, environment, outputs, and efficiency of conversion:
Current performance (t): inputs last period (marketing spend, assets deployed); environment last period (competitive activity, market conditions); results achieved (sales, revenue, profit, share, growth) plus future measures (brand equity and value, customer equity and value); margins, return on marketing investment, return on assets deployed.
Base plan and environmental assumptions (t+1): current planned activity; current expected environment; current target outcomes; current expected efficiency.
Plan A (t+1): current planned activity +20%; current expected environment; Plan A target outcomes; Plan A expected efficiency.
Plan B (t+1): current planned activity -20%; current expected environment; Plan B target outcomes; Plan B expected efficiency.
Scenario 1 (t+1): current planned activity; environmental scenario 1; expected target under scenario 1; expected efficiency under scenario 1.
She can see immediate expected rewards (e.g., brand profit) and long-term investments (e.g., brand equity) and the sum of the two (e.g., brand value). Finally, scenario analysis enables her to examine the robustness of plans to changes in environmental assumptions.
We have already argued that marketing accountability in mature existing product-markets is well established (at least in principle, even if industry practice does not always match our methodological potential). We do have two further reservations. First, marketers could gain more insight into the productivity of marketing investments if they decomposed the outcome, return on assets, into return on sales and asset turnover (sales/assets),
analogously to the Du Pont model in accounting (e.g., Kaplan 1984). This introduction of asset utilization, currently largely absent in marketing (with the exception of retailing), can be used to manage market-based assets. For example, metrics such as the ratio of sales or earnings to brand (or customer) equity can be used to measure the utilization of intangible assets, particularly market-based ones.
Second, it is also important to develop measures that capture the benefits of reduced volatility in sales and profits (see Srivastava, Shervani, and Fahey 1998). There is ample evidence in the accounting literature that volatility has a negative effect on firm valuations, leading to earnings management both through accounting practices and by adjusting marketing activities (e.g., Tucker and Zarowin 2006). While there has been discussion of marketing actions such as price promotions that increase volatility, and of the use of product platforms to reduce it across a portfolio of products (Fornell et al. 2006), there has been very limited focus on managing volatility through active management of customer portfolios or incentives to customers to reduce their demand uncertainties (Tarasi et al. 2011). Metrics such as revenue or profit volatility are especially relevant. Incentives to sales and marketing personnel to reduce volatility will help harmonize relationships with other functions, such as manufacturing. Persistence metrics, such as the percentage of profits from recurring business, signal safety to investors and thus result in higher valuation ratios.

3.2.2 Managing Dynamic Product-Market Life Cycles: Growth and Defense
While much metrics work focuses on mature markets, we know from the product life cycle literature (illustrated in figure 3.4) that markets are often not stationary (e.g., Day 1981). Both before markets reach maturity (when we need to concentrate on growth) and after they leave it (when we must focus on managing decline), we face different challenges for which the assumption of stationarity will be a poor one. Varying responsiveness to marketing actions over the PMLC has serious implications for resource allocation decisions and choice of metrics. For both forecasting purposes (prognostics) and resource allocation decisions (diagnostics), we need to understand these differences.
Figure 3.4 Product-market and associated category-market life cycles. (Sales volume plotted against time through introduction and growth, maturity, and decline, showing the category life cycle, I_{ijt}, and the product life cycle, S_{ijt}; annotations mark retention, category and share enhancement, slow decline, and maximum salvage value.)
While retention, acquisition, and category development will be important to sales at each stage, retention will be relatively less important in the growth stage when there is less to retain, while acquisition will be less of a concern in decline when customer lifetime value is likely to be lower than earlier in the PMLC.
Management challenges may be different during growth, maturity, and decline, calling for different metrics at each stage. First, the role of marketing activities may change over the life cycle. For example, advertising may shift from a strategic (and market-building) to a maintenance role. Wind (1982, p. 48) summarizes the differential impact of various marketing mix elements at different stages of the product life cycle. Second, as product-markets become more competitive when we move from the growth to the maturity stage, a decrease in responsiveness to, and hence productivity of, marketing actions might be anticipated. Third, process changes might influence the effectiveness of different marketing activities. For example, advertising support for new products often helps build distribution, leading to both direct (via consumer awareness and preference) and indirect (via availability) effects on sales. Fourth, because of greater uncertainty in the growth and decline stages of the product life cycle, risk management requires different discount rates at different stages. Thus, one must have a contingency approach to metrics selection in different product-markets. In short, there is no holy grail, universally applicable to all situations (Ambler and Roberts 2006, 2008).
To understand market dynamics and to capture the growth potential at each stage from an earnings perspective, it is useful to define a new metric: profit persistence. The profit persistence of product j in market i at time t, P_{ijt}, is defined as:
P_{ijt} = \frac{\pi_{ijt}}{\pi_{ijt-1}} \qquad (3)
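As a quick numerical illustration (the figures are hypothetical, not taken from the chapter): if a product-market earned $100 million in period t−1 and $112 million in period t, then

P_{ijt} = \frac{112}{100} = 1.12.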
At the beginning of the PMLC, we expect profit persistence, P_{ijt}, to be greater than one; in maturity P_{ijt} ≈ 1 (ceteris paribus); and in decline P_{ijt} < 1. Clearly, changes in this variable are going to be of considerable interest to the CEO in search of net growth. As Polli and Cook (1969) point out, for these stages to be useful we need to be able to understand the current stage and, more importantly, to determine when we are likely to transition to the next.
We have some good tools in marketing to assist with this process. For example, in the growth to maturity stage, the diffusion of innovation literature provides a rich range of tools to model growing markets (e.g., see Mahajan, Muller, and Bass 1991), while Golder and Tellis (1997) examine the transition from introduction to growth ("take off"). Correspondingly, in the decline stage, one might use technology substitution models to predict the rate at which a new technology would replace an existing one (e.g., the replacement of CDs by DVDs). (See, for example, Danaher, Hardie, and Putsis 2001.)
While a study of the PMLC provides useful prognostic information to the CEO about the likely persistence of existing earnings, it is by asking "What's driving the product-market life cycle?" that we can empower the manager to change it and address diagnostic challenges relating to its dynamics. The PMLC will be affected by factors external to the firm (such as changing consumer tastes and competitive moves) and internal ones (particularly response to marketing activity). As opposed to the static challenges addressed in the previous section, the dynamic challenges are to get sales earlier (e.g., accelerate earnings by investing in or leveraging off existing market-based assets such as brands, customers, and channels), to obtain more sales at maturity (e.g., by cross-selling), and to get sales longer (e.g., extend sales by building switching costs or via product rejuvenation), as illustrated in figure 3.4. Reducing volatility may be a fourth time-related objective.
We know quite a lot about the likely trajectory of sales in each of these stages, and the strategies best able to assist with the above four objectives, thus maximizing the expected net present value of existing product-markets. For example, in his popular book, "Crossing the Chasm," Moore talks about how to transition from introduction to growth, while in "Inside the Tornado," he describes the transition from growth to maturity. Finally, in "Living on the Fault Line," he addresses
mitigating the effects of factors potentially driving decline (Moore 1991, 1995, 2000). This understanding helps us choose appropriate metrics for performance management and planning. We use figure 3.5 to illustrate changing emphases during the PMLC.
Management Focus
Management goals and concerns shift over the product-market life cycle—from innovation and market development in early stages, to leveraging product and customer-centric brand platforms in order to better manage margins and turnover during mature phases, and to defensive marketing and/or rejuvenation in decline. Because metrics should follow and not dictate strategy, a contingent approach to measurement is necessary. While the majority of product-markets in a firm's portfolios may be in the mature phase (requiring a focus on enhancing operational efficiencies), new transformational opportunities are likely to die on the vine if starved of resources due to application of the same yardsticks, while declining ones end up being over-resourced due to legacy marketing budgets. We summarize the key challenges, senior management concerns, and environmental issues at different stages of the PMLC in figure 3.5.
Key environmental factors influencing decisions in early stages of the PMLC include investments (such as distributor incentives) and marketing activities (such as advertising) to build market-based assets (brands, customer networks, etc.) and to manage emerging competitors. Faced with uncertain revenues and cash flows (and far more concrete investments and expenses), managers charged with exploiting growing product-market opportunities face vexing problems of justifying marketing investments when saddled with the same tools and business models used for mature products. For example, ROI-related metrics favor mature products (Rust et al. 2004).
In the decline stage, firms often face commoditization and the "fight or flight" option. Here the challenge is to develop harvesting strategies, such as increasing market inertia by offering customer solutions that increase switching costs (Tuli, Kohli, and Bharadwaj 2007 and Roberts 2005, p. 154), in order to extend revenues, protect margins, and maximize terminal cash flow—or to cut losses by exiting early when confronted with stronger substitution forces. Because salvage values are higher before the decline phase kicks in, prognostics are critical. While limited relative to the literature on managing in maturity, research on managing markets and maximizing value in decline does exist (e.g., Roberts, Morrison, and Nelson 2004).
Figure 3.5 Strategic marketing challenges during the product-market life cycle. The figure plots sales over time and, for each stage, summarizes:
Growth: management focus on innovation management and market development; key challenges of accelerating growth, risk control, and justifying marketing investments; senior management concerns with the sales trajectory, price trajectory, dynamic cash flow, and uncertainty; major environmental issues of availability/distribution, emerging competition, and assembling solutions and networks (market-based assets).
Maturity: management focus on leveraging product and customer platforms and on optimization; key challenges of maximizing product-customer density and mix optimization and efficiency; senior management concerns with profit maximization, leverage for growth, and generating opportunity; major environmental issues of competition, economic uncertainty, and available adjacencies.
Decline: management focus on defensible perimeters, slow decline, and rejuvenation; key challenges of extending net revenues and margins, harvesting to maximize terminal cash flow, and signaling intent; senior management concerns with profitability and cash (margin and volume protection) and ongoing/salvage value; major environmental issues of sources of substitution, inertial variables, and harvesting versus aggregating; defensive strategies built on areas of continued competitive advantage, defensive (locked-in) customer groups, and managing the benefits of inertia (reinvention and rejuvenation).
The figure also notes the relevant firm resources to be defined and utilized across the stages: market-based assets (brand equity, awareness and associations, the customer base and its relationships, access, word of mouth, routes to market, availability, and collaborators) and non-market-based internal assets (cash, IP, production), together with resource allocation (e.g., size, share, and response) via brand, customer, and channel platforms and a roadmap for category and market (geographic) extensions.
Defining and Utilizing Relevant Firm Resources
At the beginning of the PMLC, the firm will be building market-based assets (brands, customers, and collaborative relationships) or leveraging off those from other product-markets. In the mature phase, not only will the firm protect its assets, it will broaden their appeal, to allow them to provide platforms for entry and migration into other product-markets. For example, in the mid-1990s Amazon.com repositioned itself from being the leading online book store to the leading online store. Finally, in the decline stage of the PMLC, market-based assets need to be focused towards segments or products for which they have continuing value. Unilever PLC classifies its product-markets into three groups corresponding closely to our three stages: "Invest to build," "Invest to maintain," and "Harvest." Investment levels, marketing activities, growth objectives, and hurdle rates are correspondingly different in each.
Figure 3.5 argues the need for multiple business models to manage the various stages of the PMLC (or different horses for different courses). Strategies that drive growth (high persistence) are important in early stages, while defensive strategies that limit or slow decay will enhance the level and longevity of profits. This clearly suggests that firms must shift business models across stages of the life cycle. That is, while cost-minimization strategies might work for mature products, the firm must make product innovation and marketing investments using different criteria that recognize investment in marketing intangibles in early stages. Further, if a company maintains product portfolios in significantly different product-markets, it must balance its multiple business models.
Marketing Metrics for the Growth Stage of the PMLC
It is self-evident that markets in their growth phase will require a greater emphasis on metrics that calibrate change, rather than just levels. In growing markets, marketers have relied on a broad mix of metrics, such as growth in penetration and volume, change in margins, and longitudinal measures, such as time-to-market, time-to-market acceptance, and time-to-money or positive cash flow. Importantly, these metrics must be linked to marketing activity and corporate strategy. In addition, the quality of customers often escapes scrutiny. Recent advances in modeling (e.g., Kumar, Ramani, and Bohling 2004, and Wiesel, Skiera, and Villanueva 2008) promise to reveal metrics that focus on improvements in expected CLV. There is, fortunately, a rich literature base in diffusion models (Mahajan, Muller, and Bass 1980) that can be used to
project sales trajectories and provide information to better manage market dynamics and risk. These models can be combined with options thinking (Lander and Pinches 1998) to better manage the uncertainties associated with new product-market opportunities. Also, because marketing activities are not independent across the marketing mix and have carryover effects, it is important to manage momentum when allocating resources (Larreché 2008).
To develop metrics to guide management action during the growth stage of the PMLC, it is useful to add dynamics to the sales and profit model outlined in equation (1). A model that has proved to be robust to different market evolutions is that of Bass (1969 and 2004). We will demonstrate it at the firm sales level, though decomposition into category sales and brand share evolution is straightforward. Using the Bass model, sales in time t may be written as:
S_{ijt} = \left( p(C_{ijt}) + \frac{q(C_{ijt})}{m(C_{ijt})}\, Y_{ijt} \right) \left( m(C_{ijt}) - Y_{ijt} \right) \qquad (4)
where p(C_{ijt}) and q(C_{ijt}) are rate parameters, m(C_{ijt}) is a cumulative sales saturation level parameter, and Y_{ijt} represents cumulative sales. p(C_{ijt}), q(C_{ijt}), and m(C_{ijt}) are functions of marketing investment.
The metrics that will prove insightful for the CEO flow directly from equation (4) (when taken in conjunction with equation (1)). The CEO wants to know the ultimate size of the opportunity (m), the speed at which it will be realized (p and q), and how that varies as a function of the resources deployed, C_{ijt} (that is, p(C_{ijt}), q(C_{ijt}), and m(C_{ijt})). Expressions for the resultant profit and expected net present value follow by analogy to equations (1) and (2), respectively.
Marketing Metrics for the Decline Stage of the PMLC
The manager in the decline stage of the PMLC has two objectives. The first is to maintain earnings for as long as possible and to slow migration from the product (harvesting). The second is to understand those market segments or customers for whom the product-market will continue to be relevant in the long term and to ensure that the maximum defensible perimeter is retained. (A third objective may be product-market rejuvenation, which we consider under the heading of new product-markets.) The decline stage may be represented by a hazard rate model, which represents the rate at which sales will decline (e.g., Jain and Vilcassim 1991):
S_{ijt} = \frac{a(C_{ijt})}{b(C_{ijt}) - e^{-r(C_{ijt})\,t}} \qquad (5)
where, analogous to the growth case, r(C_{ijt}) is a rate parameter and a(C_{ijt})/b(C_{ijt}) is the residual level of sales once decline is complete. Again, profit and ENPV follow using equations (1) and (2). In decline, the CEO wants to know the ultimate residual size of the opportunity (a/b), the rate at which sales will be lost (r), and how both of these vary as a function of the resources deployed, C_{ijt} (that is, a(C_{ijt}), b(C_{ijt}), and r(C_{ijt})). See Roberts, Morrison, and Nelson (2004) for a practical example of the estimation of these types of parameters as a function of marketing effort. Obviously, an understanding of a/b, the residual value of the opportunity, will involve a consideration of heterogeneity and product-market applications to determine for which segments the firm's product will continue to be able to maintain sales.
3.2.3 Metrics for Managing Dynamic Product-Market Life Cycles: The Unbalanced Scorecard
The need for multiple business models throughout the PMLC has obvious implications for the metrics needed to ensure marketing accountability. Importantly, while short-term, earnings-based metrics may be appropriate at the maturity stage, marketing investments in early stages of the life cycle that have longer-term payoffs will require approaches that capture future revenues and cash flows from risky investments. Therefore, we need "unbalanced" scorecards. Having considered the challenges of different stages of the PMLC, we suggest measures that will help the CEO manage them. Metrics that flow from this discussion are outlined in figure 3.6. In summary, they add to those in the stationary case illustrated in figure 3.3 by considering the equilibrium level of sales (saturation in the case of growth and residual sales in the case of decline), the rate of progression to that level, and the response of those two variables to marketing activity.
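To show how the two trajectory models behind these metrics behave, the following sketch simulates period sales for a growing product-market using the Bass form of equation (4) and for a declining one using the hazard form of equation (5). The parameter values, and the simple ways in which marketing spend C is assumed to shift p, m, and r, are illustrative guesses rather than estimates from the chapter.

```python
import math

def bass_sales(C, periods=12):
    """Period sales for a growing product-market, following equation (4)."""
    p = 0.02 + 0.01 * (C / 1_000_000)            # assumed: spend lifts the innovation rate
    q = 0.35                                      # imitation rate (held fixed here)
    m = 5_000_000 * (1 + 0.1 * (C / 1_000_000))   # assumed: spend lifts the saturation level
    Y, sales = 0.0, []
    for _ in range(periods):
        s = (p + q * Y / m) * (m - Y)             # equation (4)
        sales.append(s)
        Y += s                                     # accumulate cumulative sales
    return sales

def decline_sales(C, periods=12):
    """Period sales for a declining product-market, following equation (5)."""
    a, b = 2_000_000, 1.5                          # residual sales level is a/b
    r = 0.5 - 0.1 * (C / 1_000_000)                # assumed: spend slows the rate of decline
    return [a / (b - math.exp(-r * t)) for t in range(periods)]

print("growth (first 5 periods): ", [round(s) for s in bass_sales(C=1_000_000)[:5]])
print("decline (first 5 periods):", [round(s) for s in decline_sales(C=1_000_000)[:5]])
```

Under these assumptions, the quantities figure 3.6 asks the CEO to track (the saturation level m, the speed parameters p and q, the residual level a/b, and the decline rate r) can all be read off or varied directly.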
Figure 3.6 Metrics to calibrate alternative marketing plans in existing non-stationary product-markets. Each row records inputs, environment, outputs, and efficiency of conversion:
Current performance (t): inputs last period (marketing spend, assets deployed); environment last period (competitive activity, market conditions); results achieved (sales, revenue, profit, share) plus future measures (brand equity and value, customer equity and value); margins, return on marketing investment, return on assets deployed.
Growth markets (base plan in t+1): current planned activity; current expected environment; as above plus penetration, growth rate, and saturation growth; time to money, ultimate market.
Declining markets (base plan in t+1): current planned activity; current expected environment; as above plus rate of decline, inertia/loyalty, and residual sales; persistence, salvage value.
Plan A (t+1): current planned activity +20%; current expected environment; Plan A target outcomes; Plan A expected efficiency.
Plan B (t+1): current planned activity -20%; current expected environment; Plan B target outcomes; Plan B expected efficiency.
Scenario 1 (t+1): current planned activity; environmental scenario 1; expected target outcomes under scenario 1; expected efficiency under scenario 1.
3.3 Managing Beyond the Core: Innovations to Feed the Product-Market Funnel
While a major source of future earnings will be in existing product-markets, growth into new markets, with new products, or using new value chain activities will be central to many firms' futures. Christensen (1997) gives a number of reasons as to why a firm might neglect looking beyond its existing product-markets even when its existing income streams from those are under threat. The degree to which adjacent and nonadjacent growth into new product-markets should figure in a firm's plans clearly depends on the potential of opportunities in those markets (suitably adjusted for risk and resources required), relative to opportunities in its existing product-markets. (Or, to be more precise, it depends on the relative earnings response to allocating scarce marketing resources to each over time.) Thus, Kodak at the end of the last century could see the decline of mass market halide film, and should perhaps have perceived a stronger adjacency imperative than it did. eBay grew revenues in its "Marketplace" Division by 21% in 2007 and faced considerable further growth potential. It decided to focus on its "Payments" Division (basically PayPal) in 2008. The result was a 13% drop in Marketplace revenue and only a 14% increase in Payments revenue.3 Without more data on sales response functions, it is impossible to say whether this change in emphasis was misguided, but certainly it does appear as though there was unrealized potential in the electronic auction marketplace that eBay failed to win.
Frequently, companies tend to focus on existing product-markets rather than look beyond the core to adjacent markets, based on a focus on short-term metrics such as ROI. This is in part because of onerous discount rates in new product-markets and in part due to unwarranted complacency about earnings persistence in current product-markets.
3.3.1 The Imperative for Moving Beyond Existing Product-Markets: Extended Ansoff Matrix
A simple but reasonably insightful way to systematically identify and classify opportunities for growth is the Ansoff Matrix (Ansoff 1965). Ansoff suggested that growth opportunities could be classified by whether they are in new or existing markets, cross-classified by whether these markets use new or existing products. Aaker (2005, p. 245) extends the Ansoff Matrix by also considering how those customers are reached (value chain coverage), which, as we point out in note 2, is a conceptually straightforward extension of product growth and market growth. These three sources of growth bear a strong similarity to Treacy and Wiersema's (1993) value disciplines (product innovation, customer intimacy, and operational excellence). They are also closely aligned to the concept of market-based assets (customers, brands, and channels or value networks), developed by Srivastava, Shervani, and Fahey
(1998) and the role of such assets in value creation via customer-facing business processes (Srivastava, Shervani, and Fahey 1999). This framework provides an approach to establishing and calibrating potential sources of growth opportunities. It gives us an overview of whence new growth could come. We can also drill down to understand the potential and composition of each cell, including its attractiveness and responsiveness to possible management action. Day (2007) discusses how the Ansoff Matrix may be used to assess the size of new product-market opportunities, the position of the firm to take advantage of them, and the risk involved in doing so.
Ansoff's framework also provides strategies to enable firms to leverage existing internal and market-based assets in order to expand in businesses beyond their core (existing) product-markets, described in section 3.2. We now proceed to examine these opportunities in adjacent, and therefore related, cells. The market-based and other assets that these strategies leverage are not independent of each other. While product development strategies may be embedded in existing product development processes, the derived benefit (differentiation) or value is extracted by leveraging strong customer and channel relationships. That is, business performance is enhanced due to positive interactions between the firm's three customer-facing processes (Ramaswami, Srivastava, and Bhargava 2009). We look at the criteria for adjacent (and nonadjacent) growth and how to calibrate its potential, before we consider metrics that should be useful to measure and manage this process.
3.3.2 New Product Growth
New product growth that leverages off existing brands, markets, and channels exploits existing market-based (and other) assets for expansion into a new product area. Using the nomenclature of equation (1), sales from a new product opportunity NP in market i at time t, S_{i,NP,t}, are the size of the opportunity (I_{i,NP,t}) multiplied by the share of the opportunity that the firm is able to capture (M_{i,NP,t}):
S_{i,NP,t} = I_{i,NP,t}(Char_{i,NP,t}, C_{i,NP,t}) \cdot M_{i,NP,t}(Fit_{i,NP,t}, C_{i,NP,t}) \qquad (6)
where Char_{i,NP,t} represents the relevant characteristics of the new product's market i in time t (e.g., population base), Fit_{i,NP,t} represents the fit of the firm in the new product's market at time t (e.g., firm skills and their transferability to this product), and, as previously, C_{i,NP,t} represents marketing activity.
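A minimal sketch of the sizing logic in equations (6) and (7): sales are the size of the opportunity multiplied by the share the firm can capture, with share moderated by how well its existing assets transfer (the degree of fit). The functional forms, the 0–1 fit score, and all numbers below are hypothetical assumptions for illustration only.

```python
# Opportunity sizing in the spirit of equations (6) and (7): size of the opportunity
# times the capturable share, moderated by asset fit. Everything here is assumed.

def opportunity_size(population, spend):
    """I(.): category demand as a function of market characteristics and spend."""
    return population * 0.05 * (1 + 0.2 * spend / 1_000_000)

def capturable_share(fit, spend):
    """M(.): attainable share, scaled by asset transferability, fit in [0, 1]."""
    return min(0.6, 0.10 + 0.25 * fit + 0.05 * spend / 1_000_000)

def new_opportunity_sales(population, fit, spend):
    return opportunity_size(population, spend) * capturable_share(fit, spend)

# Strong fit (brand and channels transfer well) versus weak fit, at the same spend.
print("strong fit:", round(new_opportunity_sales(10_000_000, fit=0.8, spend=2_000_000)))
print("weak fit:  ", round(new_opportunity_sales(10_000_000, fit=0.2, spend=2_000_000)))
```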
The size of the opportunity will depend on the characteristics of the new category or product segment, which is why firms look for attractive markets when launching new products. The market share of the new product's category sales that the firm gains will depend on the strength of the market-based (and other) assets it can leverage, moderated by how transferable those assets are to the new product category (the degree of fit). We know much about sizing new product-markets (see, for example, Urban and Hauser 1980). We can also calibrate the strength and transferability of many market-based assets. For example, the brand extension literature describes the extendibility of brand equity in terms of fit (e.g., see Aaker and Keller 1990). That is, we can estimate the size of, and firm share of, new product opportunities, once identified (albeit with some degree of uncertainty).
A firm entering a category at the beginning of its product-market life cycle (or one that is elastic to marketing activity) will focus its efforts on generating primary demand, while one entering a mature market will concentrate more on gaining market share, that is, secondary demand (e.g., Arora, Allenby, and Ginter 1998). For example, Starbucks' challenge in expanding into Spain, with its established coffee-drinking culture, was mainly to gain share from local cafés, while in China it was to develop the category by persuading local consumers to drink coffee (Schultz and Gordon 2012).
A prerequisite to a firm defining and utilizing relevant firm resources for new product development is to identify and leverage its resources or capabilities. For new product development in existing markets and channels, obviously the firm exploits customer and channel equity by using customer relationships for both market sensing and market relating (Day 1994, 2007), while existing channels provide routes to market. However, brand- and product-based assets are also becoming increasingly valuable in the form of product platforms. In product-centric companies these resources are invariably linked to off-balance sheet assets such as intellectual property (e.g., patents), technological knowledge, and/or manufacturing process skills (Sanchez 1995). This platform thinking is common in the technology sector and is highly visible for companies such as Intel. Intel's ability to expand its X86 design architecture enabled the company to utilize IP from one generation into the next, thereby reducing development costs while accelerating growth. Product platforms can be viewed as investments that provide a basis for growth options (Kogut and Kulatilaka 1994).
3.3.3 New Market Growth
Growth into new markets follows a similar line of argument to that for new products, in that it leverages off existing customer relationships, existing products, and channels. Sales from such a new market opportunity NM at time t, for product j, S_{NM,j,t}, are the size of the opportunity (I_{NM,j,t}) multiplied by the share of the opportunity that the firm is able to gain (M_{NM,j,t}):
S_{NM,j,t} = I_{NM,j,t}(Char_{NM,j,t}, C_{NM,j,t}) \cdot M_{NM,j,t}(Fit_{NM,j,t}, C_{NM,j,t}) \qquad (7)
Again, we know a reasonable amount about various forms of market expansion. For example, global marketing research looks not only at the degree to which a firm can take its existing products into new markets, but also at market dynamics (e.g., see Dekimpe, Parker, and Sarvary 2000). As with existing product-markets, it is straightforward to move from the sales equations (6) and (7) to new product and new market profit and expected net present value, using equations (1) and (2).
As discussed earlier, because basic frameworks for managerial accounting and managing business models were developed almost a century ago, when most business assets and resources were largely tangible, measurement systems do not adequately capture assets such as customers, brands, and distribution networks. As a consequence, brand or customer asset utilization is rarely measured.
Existing products may provide the vehicle to address new markets, and existing channels the route to market. For example, when Dell first went online in the mid-nineties, the company was surprised to learn that a significant proportion of orders were received from unserved geographical regions. One of Coca-Cola's sustainable strengths is its global bottling and distribution network. However, customer equity in the form of customer platforms may also be leveraged in new market development. And, once an expanded, global customer base is developed, relationships based on positive customer experience can be leveraged in cross-selling new related products and services, beyond the core business. Therefore, just as there are supply-side benefits due to economies of scale, experience curves, and product/technology platforms, so there are also demand-side synergies based on "customer platforms." These take advantage of demand-side synergies (i.e., benefits due to common customers) that enhance the rate of adoption of products due
to shared experience, service, reputational assets, and trust. Unlike product platforms, where technology/IP assets provide the basis for expanding into related product-markets, customer platforms rely on exploiting customer relationships to in-sell, cross-sell, up-sell, and on-sell products using customer knowledge and customer-relating processes.
Figure 3.7 summarizes the managerial issues that must be addressed to successfully compete with new products and in new markets. These, together with the profit and value drivers stemming from equations (6) and (7), will drive the metrics that will be diagnostic to the CEO allocating scarce resources across different existing and new product-markets.
Figure 3.7 Strategies and resources for expanding beyond the core. For each source of growth, the figure summarizes:
New products: size of the opportunity is demand for the new category over time; the primary capability for share acquisition is product differentiation; key market-based assets developed are customer equity and channel access, the product platform, and brand equity; support strategies leverage existing customers and channels to launch new products; key firm resources for growth are routes to market (channel availability and access), infrastructure and working capital, and the customer base; management concerns include sales growth, risk mitigation, technology development, (out)sourcing, transferring existing skills, and developing new skills; the major environmental issue is competition in the new product category.
New markets: size of the opportunity is the size of the new market over time; the primary capability for share acquisition is customer management; key market-based assets developed are brand equity and channel access, the customer platform, and customer equity; support strategies leverage existing products/brands and channels in accessing new markets; key firm resources for growth are product/brand platforms and routes to market (channel availability and access); management concerns include sales growth, risk containment, market access and acquisition, transferring CRM processes, and understanding new behaviors; the major environmental issue is competition in the new market.
3.3.4 Performance Metrics for Managing Growth Beyond the Core
Figure 3.8 summarizes key size, profitability, growth, and risk metrics, as well as the investments required, corresponding to figure 3.7's management issues for expanding beyond the core. Thus, cost containment may be due to shared design within product platforms, shared sales, service, and support for customer platforms, and shared order-supply systems for value-chain platforms. Similarly, value drivers are based on differentiation (product platforms), customer solutions (customer platforms), and delivery efficiencies (value-chain platforms).
Figure 3.8 Metrics to calibrate alternative marketing plans in new product-markets. Each row records inputs, environment, outputs, and efficiency of conversion:
Current performance (t): there is no current performance in this product-market; it may be possible to use analogues for benchmarking.
New products (base plan at t+1): R&D, cost of new product development, launch costs, and ongoing marketing costs; expected environment (competition, market conditions); sales dynamics (trial, repeat; category sales, share, margins), word of mouth, availability, cannibalization; efficiency measures as for figure 3.3.
New markets (base plan at t+1): market development, market research, and ongoing marketing costs; expected environment (competition, market conditions); sales dynamics (trial, repeat; category sales, share, margins), word of mouth, availability; efficiency measures as for figure 3.3.
Plan A (t+1): current planned activity +20%; current expected environment; Plan A target outcomes; Plan A expected efficiency.
Plan B (t+1): current planned activity -20%; current expected environment; Plan B target outcomes; Plan B expected efficiency.
Scenario 1 (t+1): current planned activity; environmental scenario 1; target outcome under scenario 1; expected efficiency under scenario 1.
Growth metrics related to product, customer, and value-chain platforms can be captured in time-to-market, time-to-market acceptance, and time-to-volume, respectively. The differential challenges in each column stem from the fact that we are leveraging off different market-based assets, and from the different requirements to establish new resources in each. Note that the principles applicable to adjacent growth also apply to nonadjacent growth. The value of the opportunity to the firm is the size of the opportunity, times the proportion of it that the firm can expect to gain (discounted and summed over time). Risk associated with diversification strategies can often be mitigated by leveraging market-based assets as well as non-market-based assets. For example, Nylon's expansion into new products and new markets (often requiring new channels) after World War II (for example, from parachutes to women's hosiery) was less risky than it might appear, because of the strong intellectual property base on which it was founded (Wind 1982).
CEOs charged with managing long-term growth in firm value are understandably concerned about the management and measurement of value created through risk-laden strategies such as innovation, exploitation of new market opportunities, and expansion into adjacent markets—hence the popularity of topics such as disruptive innovations and blue ocean strategies. Current marketing practices offer little in the way of measuring and managing risk in launching new technologies or in exploiting global markets. Fortunately, the rich literature in diffusion models (Mahajan, Muller, and Bass 1980) can be used to project sales trajectories and provide information to better manage market dynamics and risk. These models can be combined with real options thinking to better manage uncertainties associated with new product-market opportunities—exercising growth options if market projections are better than expected, or exit options should the prognosis be negative.
The identification of growth using the above classification, together with the managerial issues it raises, leads to metrics that can be adopted to monitor performance in each quadrant. At its most aggregate level, the CEO will want to know the future earnings from each of the quadrants of the Ansoff Matrix. The most important quadrant for most firms will be the existing product-market one, where key metrics emerged from the discussion surrounding figures 3.2, 3.3, and 3.4.
3.4 Balancing the Firm's Product-Market Portfolio
We have looked at measuring marketing performance across different product life stages of existing product-markets and calibrating potential new opportunities. The CEO must combine these profit opportunities to manage a portfolio of product-market revenues over time. The questions that arise from such multicell opportunities include resource allocation, particularly how to maximize synergies and minimize cannibalization and risk, across groups and over time.
As well as opportunities in individual product-markets, the CEO is probably also interested in the balance between the cells of the Ansoff Matrix. For example, Kaplan and Norton (1996, p. 52) suggest, as one strategic metric, the percentage of earnings from new products, and cite its use by companies such as HP and 3M. However, we are not yet in a position to set standards for such measures. We cannot advocate relative investments in individual quadrants until we understand the relative size of the opportunities in each and the response functions revealing the marketing investments required to realize them. Therefore, the evaluation of any of these metrics must be undertaken relative to some standard. We have spent little time in marketing addressing the fundamental question, "What is good?" To answer that question, we must systematically enumerate all the opportunities confronting the firm, and research is relatively silent on how the investment opportunity set should be identified. Ironically, this is in contrast to the accounting literature (e.g., Skinner 1993). We regard this as being a particularly exciting area for future research in marketing because it has the potential to reintroduce structured creativity back into our marketing analysis.
3.4.1 Understanding Marketing Mix Response Functions
A major problem facing the manager is to allocate marketing resources to areas of greatest effectiveness. At any point in time, expenditure should be allocated to where it adds most to the long-term value of the firm. That is, the first derivative of value should be maximized for each product-market marketing expenditure. This optimization problem is made more difficult by spillover effects between product-markets and carryover effects over time. Of particular theoretical complexity is the variance-covariance matrix of prediction errors between related product-market opportunities and their response functions. Nonetheless, resource allocation will always involve at least implicit
assumptions about these factors. Our objective is to make this simple, explicit, and systematic. The task of the manager is to maximize the sum of the expected net present value in each product-market, ENPV_t:
ENPV_t = \sum_{i} \sum_{j} \sum_{k} \frac{\pi_{ijt+k}(C_{ijt+k})}{(1 + d_{ijt+k})^k} \qquad (8)
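The allocation logic implied by equation (8) can be sketched as follows: the marketing budget is handed out in small increments, each increment going to whichever product-market currently offers the largest marginal gain in ENPV (the first derivative of value noted above). The three response curves below are hypothetical and concave, and spillover and carryover effects, which the text identifies as the hard part of the problem, are ignored.

```python
import math

# Hypothetical concave ENPV-response curves for three product-markets
# (say a mature core, a growth adjacency, and a declining legacy business).
RESPONSE = {
    "mature":  lambda c: 3.0e6 * math.log1p(c / 1e6),
    "growth":  lambda c: 4.5e6 * math.log1p(c / 2e6),
    "decline": lambda c: 1.0e6 * math.log1p(c / 1e6),
}

def allocate(budget, step=50_000):
    """Greedy marginal allocation: each increment of spend goes where it adds most ENPV."""
    spend = {pm: 0.0 for pm in RESPONSE}
    remaining = budget
    while remaining >= step:
        gains = {pm: f(spend[pm] + step) - f(spend[pm]) for pm, f in RESPONSE.items()}
        best = max(gains, key=gains.get)    # product-market with the highest marginal value
        spend[best] += step
        remaining -= step
    total = sum(f(spend[pm]) for pm, f in RESPONSE.items())
    return spend, total

spend, total = allocate(budget=5_000_000)
print("allocation: ", {pm: f"{c:,.0f}" for pm, c in spend.items()})
print("total ENPV:", f"{total:,.0f}")
```

With concave response curves, this greedy rule approximately equalizes marginal returns across product-markets, which is the condition the first-derivative argument above implies.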
3.4.2 Allocation of Resources across Product-Markets
There is evidence that managers make a number of common errors in these resource decisions. For example, Stiglitz (1996) suggests that lack of information and infrastructure leads to underinvestment in emerging, relative to developed, markets. Similarly, Pavitt (1991) warns of too strong an emphasis on established versus new products.
Historical data are more available in mature product-markets, environmental response is more predictable, and market response relationships are more stable and observable. Therefore, a higher discount rate should be applied to markets in their growth and decline phases. The level of uncertainty in growth markets will be greater because market acceptance for the product in the market is not stable, totally understood, or observable. Greater uncertainty in markets in decline stems from less information about the drivers of decline. However, two considerations may lead the CEO to impose a discount factor that is higher than a value-maximizing one.
The first is the principle of conservatism in accounting (Weygandt, Kimmel, and Kieso 2011). The principle of conservatism suggests that "If a situation arises where there are two acceptable alternatives for reporting an item, conservatism directs the accountant to choose the alternative that will result in less net income and/or less asset amount." Obviously, this is at odds with an expected utility, probabilistic, value-maximizing approach. That is, accounting strongly favors verifiable, low-variance income streams relative to those whose realization may be more uncertain. This conservatism will overly penalize new projects (ironically, potentially increasing overall long-term risk).
The second factor favoring existing product-markets is a cultural one. Managers tend to feel more comfortable with the familiar. Therefore, they are likely to give greater weight to more psychologically and economically proximate opportunities. (See Wehrung and MacCrimmon 1988 for a review of managerial risk aversion.) It is highly likely that the value of activities in new markets or products will have lower correlation with existing activity than will
increasing activity within them. Thus, new activities or increased activity in new product-markets may well have favorable effects on diversifiable risk.
3.4.3 Marketing Resource Allocation over Time
In addition to allocating resources across product-markets, the manager must allocate them over time. There are a number of reasons why the productivity of marketing assets may change over time. For example, Srinivasan, Rangaswamy, and Lilien (2005) show that advertising during a recession may lead to more long-term value than advertising during growth periods, mainly because of the opportunity to communicate during a less cluttered period.
In addition, the manager may value profitability differently at different times. We know from the accounting literature that the market values consistent earnings performance over volatile earnings with the same mean (Roychowdhury 2006). Therefore, at the margin the manager will be rewarded for adjusting marketing activity to smooth earnings performance, such as by diversifying across product categories and geographical markets. For example, GE's customer solutions approach not only increased earnings, but raised the proportion from repeat business (which is less volatile), resulting in a higher valuation of those earnings. The smoothing of resource allocation over time, in terms of its greater effectiveness in times of recession and its ability to even out earnings, must be set against any costs of an uneven distribution of expenditure. Of course, non-uniform marketing may also increase its effectiveness (e.g., Mahajan and Muller 1986).
Heuristics to provide guidance for the allocation of marketing and other resources are readily identified. Expected profit maximization using equation (8) is a good first step. Balance in earnings over time may be examined by calculating the correlation between the earnings of a given product-market and all other members of the firm's portfolio. Balance in terms of resource requirements over time can similarly be looked at using the correlation between the market resource requirements of each product-market and the remainder of the portfolio. Market coverage of the customer base and product categories may be seen by mapping the joint positions of the firm's offerings by product and by market, as can overlap and potential cannibalization.
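The balance heuristics just listed are easy to operationalize. The sketch below, using invented earnings histories, correlates each product-market's earnings with the combined earnings of the rest of the portfolio; a low or negative correlation suggests that the unit helps smooth the firm's overall earnings stream. The data are hypothetical.

```python
from statistics import correlation  # available in Python 3.10+

# Hypothetical annual earnings ($ millions) for three product-markets over six years.
earnings = {
    "mature core":      [110, 112, 115, 114, 116, 118],
    "growth adjacency": [5, 9, 16, 26, 40, 58],
    "declining legacy": [60, 54, 47, 41, 35, 30],
}

years = range(len(next(iter(earnings.values()))))
for unit, series in earnings.items():
    # Combined earnings of all other units in each year.
    rest = [sum(other[t] for name, other in earnings.items() if name != unit) for t in years]
    r = correlation(series, rest)
    print(f"{unit}: earnings correlation with rest of portfolio = {r:+.2f}")
```

The same calculation, applied to resource requirements rather than earnings, gives the resource-balance measure described above.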
3.4.4 Metrics for Balancing Growth across the Product-Market Portfolio
The need to compare the productivity of marketing activity over time, described in sections 3.4.1–3.4.3, directly suggests a number of metrics that will be useful to the manager in evaluating performance and planning future activity across the firm. These are outlined in figure 3.9.
In terms of balance, it may not just be the need for a balanced portfolio in a financial sense that is important. Balance may also be desirable in terms of physical and intellectual assets. For example, the entrepreneurial skill sets that lead to success in product-markets in the growth stage, which may become redundant with the operational efficiency imperative of maturity, can be effectively redeployed into new growth product-markets in the firm's portfolio.
Figure 3.9 Typical strategic metrics to assess portfolio balance of marketing activity. Variables and their measures:
Cross product-market marketing efficiency comparisons: earnings (immediate profitability); financial trends (persistence of returns); trends in competitive advantage (trends in brand and customer equity); volatility and risk (uncertainty).
Allocating resources across product-markets: product-market response (marketing mix elasticities across products and markets); synergy across product-markets (interactions in returns and risks).
Allocating resources over time: cyclical marketing response (advertising return per dollar relative to the average of the prior five years); income smoothing value (budgeted revenue versus budgeted revenue in the prior year); costs and benefits of pulsing (marketing response in planned versus uniform allocation).
Balance of future earnings and risk across the total firm: new product contribution (percentage of future earnings from new products); market development contribution (percentage of future earnings from new and emerging markets); new channel contribution (percentage of future earnings from value chain expansion); risk distribution (earnings variance in new products/markets/channels/total).
3.5 Summary
In terms of new product-market growth, this paper started with the premise that pressures for marketing accountability will increase rather than decrease. If we want one result of that accountability to be a more central role for marketing in the boardroom, we will need to address the questions on the mind of the CEO and the board. That calls for metrics that help calibrate the size, timing, risk, and required resources for all sources of growth and persistence. By looking at existing and new activities of the firm in terms of its market-based (and non-market-based) assets, we can identify and calibrate sources of growth and how they change over time. In turn, that analysis leads to a set of metrics that clearly indicate where growth might come from and what must be done to realize it. These metrics must distinguish between those that are retrospective and analyze past performance (e.g., sales, margins, and turnover) and those that are forward-looking and used to allocate resources in the future. New metrics (or at least metrics that are new to marketing performance evaluation), such as time to penetration, persistence, proportion of earnings from new products, etc., will help focus the activities of the firm and give it a clearer idea of where it can go. This will extend the work on valuing firm marketing activity that started in the early 1970s with the pioneering response function modeling work of Little (1975), later incorporated into usable systems by him (Little 1979), as well as the more recent issues raised by the Marketing Meets Wall Street initiative (Hanssens, Rust, and Srivastava 2009). Practicing marketers now rely on a broad but ad hoc mix of metrics such as growth in penetration and volume, increase in margins, and time-based measures, such as time-to-market, time-to-market acceptance, and time-to-money or positive cash flow. Interestingly, academic research has been slow to examine these measures and their optimization. To be useful, these metrics must be linked to corporate strategy. For example, companies such as 3M track the percent of revenue from new products as a measure of the success of their innovation efforts. However, a corollary of the arguments that we have advanced in this paper is that the appropriate level of metrics, such as percentage of future earnings from new product-markets, should be endogenous, not exogenous. That is, the optimal level is determined by the return and risk opportunities in existing product-markets (given the mix of their earnings persistence) balanced against opportunities in new
product-markets (moderated by the firm's capability to take advantage of them). In reviewing the substantial progress that has been made on measuring the impact of marketing activity, we have argued that for it to be more influential in the boardroom we need to expand our emphasis from a largely mature, existing product-market view. First, we need to look at the specific challenges in non-stationary product-markets because it is from them that the majority of the changes in earnings (positive and negative) will come. We also have to systematically identify, and then calibrate, opportunities in new product-markets. There is some work on the drivers of value in these markets, but it is limited relative to their importance. Having identified the management challenges, we have attempted to suggest metrics that might be useful to inform marketing activities to address them. We have proposed ways to calibrate the expected returns and risks from doing so, and the relevant management issues, which in turn lead to a set of metrics to assist managers with strategic decision making. While we need metrics that speak to the different challenges in different markets (our "unbalanced" scorecard), we also need metrics that are comparable across very different product-markets to enable trade-offs to be made. We have attempted to illustrate how this might work in practice. Finally, while much of the marketing literature (and almost all of the metrics literature) focuses on individual components of the firm's activities, because there are interactions between product-markets, we need marketing metrics that speak to the synergies (and tensions) between different units in the firm, particularly their effect on risk.
Notes
1. Note that we refer to product-market life cycles (PMLCs), in contrast to the more usual product life cycles. With the rise of globalization, the stage of the life cycle in which a product finds itself can vary dramatically from one market to another.
2. It is useful to think of market-based assets in terms of what the firm does (brand equity), to whom it does it (customer equity), and the value chain and channels through which it does it (collaborator equity). See Srivastava, Shervani, and Fahey (1998). For the sake of simplicity, in this paper we discuss only the first two of these assets. Extension to considering collaborator equity is straightforward.
3. http://www.businessinsider.com.au/chart-of-the-day-ebay-revenue-by-segment-2010-11.
References Aaker, David. 2005. Strategic Market Management. 7th ed. New York: John Wiley and Sons Publishers. Aaker, David, and Kevin Keller. 1990. “Consumer Evaluations of Brand Extensions.” Journal of Marketing 54 (1):27–41. Ailawadi, Kussim, Don Lehmann, and Scott Neslin. 2001. “Market Response to a Major Policy Change in the Marketing Mix: Learning from Procter & Gamble’s Value Pricing Strategy.” Journal of Marketing 65 (January):44–61. Ailawadi, Kussim, Don Lehmann, and Scott Neslin. 2003. “Revenue Premium as an Outcome Measure of Brand Equity.” Journal of Marketing 67 (October):1–17. Ambler, Tim. 2003. Marketing and the Bottom Line. Second Edition. London: FT Prentice Hall. Ambler, Tim, and John Roberts. 2006. “A Word of Warning Clarified: Reactions to Peppers and Rogers’ Response.” Marketing Science Institute Working Paper Series, Issue 3, No. 06-115. Ambler, Tim, and John Roberts. 2008. “Assessing Market Performance: Don’t Settle for a Silver Metric.” Journal of Marketing Management 24, No. 7–8 (Special Issue on the Marketing Accounting Interface), pp. 733–750 Anderson, Eric, Claus Fornell, and Sanal Mazvancheryl. 2004. “Customer Satisfaction and Shareholder Value.” Journal of Marketing 68, 4172–4185. Ansoff, Igor. 1965. Corporate Strategy: An Analytic Approach to Business Policy for Growth and Expansion. New York: McGraw-Hill. Arora, Neeraj, Greg M. Allenby, and James L. Ginter. 1998. “A Hierarchical Bayes Model of Primary and Secondary Demand.” Marketing Science 17 (1):29–44. Bass, Frank M. 1969. “A New Product Growth Model for Product Diffusion.” Management Science 15 (5):215–227. Bass, Frank M. 2004. “Comments on “A new product growth for model consumer durables the Bass model.” Management Science 50, no. 12_supplement: 1833–1840. Carey, Dennis, and Michael Pasalos-Fox. 2006. “Shaping Strategy from the Boardroom.” McKinsey Quarterly (3):90–94. Clayton, Christensen. 1997. “Making Strategy: Learning by Doing.” Harvard Business Review (6):141–146, 148–156. Danaher, Peter, Bruce Hardie, and William Putsis. 2001. “Marketing-Mix Variables and the Diffusion of Successive Generations of a Technological Innovation.” Journal of Marketing Research 38 (4):501–514. Davis, John. 2006. 101 Key Metrics Every Marketer Needs. Singapore: Wiley. Day, George. 1981. “The Product Life Cycle: Analysis and Applications Issues.” Journal of Marketing 45 (Fall):60–67. Day, George. 1994. “The Capabilities of Market-Driven Organizations.” Journal of Marketing 58 (4):37–52.
Day, George (2007), “Is It Real? Can We Win? Is It Worth Doing? Managing Risk and Reward in an Innovation Portfolio.” Harvard Business Review (December) Reprint: R0712J-PDF-ENG Dekimpe, Marnik, Philip Parker, and Miklos Sarvary. 2000. “Global Diffusion of Technological Innovations: A Coupled-Hazard Approach.” Journal of Marketing Research 37 (1):47–59. Doyle, Peter. 2000. “Value-Based Marketing.” Journal of Strategic Marketing 8 (4):299–311. Farris, Paul, Neil Bendle, Phillip Pfeifer, and David Reibstein. 2010. Marketing Metrics: The Definitive Guide to Measuring Marketing Performance, 2nd ed. Upper Saddle Creek: Wharton School Publishing. Fornell, Claus, Sunil Mithas, Forrest Morgeson, and M. Krishnan. 2006. “Customer Satisfaction and Stock Prices: High Returns, Low Risk.” Journal of Marketing 70 (January):3–14. Golder, Peter, and Gerry Tellis. 1997. “Will It Ever Fly? Modeling the Takeoff of New Consumer Durables.” Marketing Science 16 (3):256–270. Gordon, Jonathon, and Jesko Perrey. 2015. “The Dawn of Marketing’s New Golden Age.” McKinsey Quarterly (February). Gupta, Sunil, and Valarie Zeithaml. 2006. “Customer Metrics and Their Impact on Financial Performance.” Marketing Science 25:71–739. Gupta, Sunil, Don Lehmann, and Jennifer Stuart. 2004. “Valuing Customers.” Journal of Marketing Research 41 (1):7–18. Hanssens, Dominique, and Marnik DeKimpe. 2010. “Models for the Financial Performance Effects of Marketing.” In Handbook of Marketing Decision Models. New York, ed. Berend Wierenga. Springer. Hanssens, Dominique M., Roland T. Rust, and Rajendra K. Srivastava. 2009. “Marketing Strategy and Wall Street: Nailing Down Marketing’s Impact.” Journal of Marketing 73 (6):115. Jain, Dipak C., and Naufel J. Vilcassim. 1991. “Investigating Household Purchase Timing Decisions: A Conditional Hazard Function Approach.” Marketing Science 10 (1):1–23. Kaplan, Robert. 1984. “The Evolution of Management Accounting.” Accounting Review 59 (3):390–418. Kaplan, Robert, and David Norton. 1996. The Balanced Scorecard: Translating Strategy into Action, Soldiers Field. Harvard Business Press. Keller, Kevin, and Don Lehmann. 2003. “The Brand Value Chain: Optimizing Strategic and Financial Brand Performance.” Marketing Management (May/June):26–31. Kogut, Bruce, and Nalin Kulatilika. 1994. “Options Thinking and Platform Investments: Investing in Opportunity.” California Management Review (Winter):52–71. Kumar, Nirmalya. 2004. Marketing as Strategy. Soldiers Field: Harvard Business School Press. Kumar, Nirmalya, and Jan-Benedict Steenkamp. 2007. Private Label Strategy: How to Meet the Store Brand Challenge. Boston: Harvard Business School Publishing.
Kumar, V., Girish Ramani, and Timothy Bohling. 2004. “Customer Lifetime Value Approaches and Best Practice Applications.” Journal of Interactive Marketing 8 (3):60–72. Lander, Diane and George Pinches. 1998. “Challenges to the Practical Implementation of Modeling and Valuing Real Options.” The Quarterly Review of Economics and Finance 38 (3):537–567. Larreché, Jean-Claude. 2008. The Momentum Effect: How to Ignite Exceptional Growth. Pearson Prentice Hall. Lilien, Gary. 2011. “Bridging the Academic–Practitioner Divide in Marketing Decision Models.” Journal of Marketing 75 (4):196–210. Little, John. 1975. “Brandaid, an On-Line Marketing Mix Model, Part 2: Implementation, Calibration and Case Study.” Operations Research 23 (4):656–673. Little, John. 1979. “Decision Support Systems for Marketing Managers.” Journal of Marketing 43 (Summer):9–26. Mahajan, Vijay, and Eitan Muller. 1986. “Advertising Pulsing Policies for Generating Awareness for New Products.” Marketing Science 5 (2):89–106. Mahajan, Vijay, Eitan Muller, and Frank M. Bass. 1991. “New Product Diffusion Models in Marketing: A Review and Directions for Research.” In Diffusion of Technologies and Social Behavior, pp. 125–177. Springer Berlin Heidelberg. Marketing Science Institute. 2010. 2010–2012 Research Priorities. Cambridge, MA: Marketing Science Institute. Mela, Carl, Sunil Gupta, and Don Lehmann. 1997. “The Long-Term Impact of Promotion and Advertising on Consumer Brand Choice.” Journal of Marketing Research 34 (2):248–261. Mizik, Natalie, and Robert Jacobson. 2008. “The Financial Value Impact of Perceptual Brand Attributes.” Journal of Marketing Research 45:15–32. Moore, Geoffrey. 1991. Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers. New York: Harper-Collins Publishers. Moore, Geoffrey. 1995. Inside the Tornado: Marketing strategies from Silicon Valley’s cutting edge. New York: Harper-Collins Publishers. Moore, Geoffrey. 2000. Living on the Fault Line: Managing for Shareholder Value in the Age of the Internet. Capstone Publishers. Pavitt, Keith. 1991. “Key Characteristics of the Large Innovating Firm.” British Journal of Management 2 (1):41–50. Pauwels, Koen, Jorge Silva-Risso, Shuba Srinivasan, and Dominique Hanssens. 2004. “New Products, Sales Promotions, and Firm Value: The Case of the Automobile Industry.” Journal of Marketing 68 (October):142–156. Polli, Rolando, and Victor Cook. 1969. “Validity of Product Life Cycle.” Journal of Business 42 (4):385–400. Porter, Michael. 1980. Competitive Strategy. New York: The Free Press.
Ramaswami, Sridhar, Rajendra Srivastava, and Mukesh Bhargava. 2009. “Market-based capabilities and financial performance of firms: insights into marketing’s contribution to firm value.” Journal of the Academy of Marketing Science 37 (2):97–116. Roberts, John. 2005. “Defensive Marketing: How a Strong Incumbent Can Protect Its Position.” Harvard Business Review (November):150–157. Roberts, John H., Ujwal Kayande, and Stefan Stremersch. 2014. “From academic research to marketing practice: Exploring the marketing science value chain.” International Journal of Research in Marketing 31 (2):127–140. Roberts, John, Pamela Morrison, and Charlie Nelson. 2004. “Implementing a Pre-Launch Diffusion Model: Measurement and Management Challenges of the Telstra Switching Study.” Marketing Science 23 (2):180–191. Roychowdhury, Sugata. (2006). “Earnings management through real activities manipulation.” Journal of Accounting and Economics 42 (3):335–370. Rust, Roland T., Katherine N. Lemon, and Valarie A. Zeithaml. 2004. “Return on marketing: Using customer equity to focus marketing strategy.” Journal of Marketing 68 (1):109–127. Sanchez, Ron. 1995. “Strategic flexibility in product competition.” Strategic Management Journal 16 (S1):135–159. Schultz, Howard, and Joanne Gordon. 2012. Onward: How Starbucks Fought for Its Life without Losing Its Soul. Emmaus, PA: Rodale Books. Skinner, Douglas J. 1993. “The investment opportunity set and accounting procedure choice: Preliminary evidence.” Journal of Accounting and Economics 16 (4):407–445. Srinivasan, Ravi, Arvind Rangaswamy, and Gary Lilien. 2005. “Turning adversity into advantage: Does proactive marketing during a recession pay off?” International Journal of Research in Marketing 22 (2):109–125. Srinivasan, Shuba, and Dominique Hanssens. 2009. “Marketing and Firm Value: Metrics, Methods, Findings, and Future Directions.” Journal of Marketing Research 46 (3):293–312. Srivastava, Rajendra, Tasadduc Shervani, and Liam Fahey. 1998. “Market-Based Assets and Shareholder Value: A Framework for Analysis.” Journal of Marketing 62 (1):2–18. Srivastava, Rajendra K., Tasadduc Shervani, and Liam Fahey. 1999. “Marketing, Business Processes and Shareholder Value: An organizationally Embedded View of Marketing Activities and the Discipline of Marketing.” Journal of Marketing 63 (Special issue):168–179. Stiglitz, Joseph. 1996. “Some Lessons from the East Asian Miracle.” World Bank Research Observer 11 (2):151–177. Tarasi, Crina, Ruth Bolton, Michael Hutt, and Beth Walker. 2011. “Balancing Risk and Return in a Customer Portfolio.” Journal of Marketing 75 (3):1–17. Treacy, Michael, and Fred Wiersema. 1993. “Customer Intimacy and Other Value Disciplines.” Harvard Business Review (January/February):4–93. Tucker, Jennifer, and Paul Zarowin. 2006. “Does Income Smoothing Improve Earnings Informativeness?” Accounting Review 81:251–270.
Tuli, Kapil, Ajay Kohli, and Sundar Bharadwaj. 2007. “Rethinking customer solutions: From product bundles to relational processes.” Journal of Marketing 71 (7):1–17. Urban, Glen, and John Hauser. 1980. Design and Marketing of New Products. Englewood Cliffs, NJ: Prentice-Hall. Verhoef, Peter, and Peter Leeflang. 2010. “Getting Marketing Back into the Boardroom: The Influence of the Marketing Department in Companies Today.” GfK Marketing Intelligence Rev 2 (4) Wehrung, Donald, and Kenneth Maccrimmon. 1988. Taking Risks. New York: The Free Press. Weygandt, Jerry, Paul Kimmel, and Donald Kieso. 2011. Accounting Principles. 10th ed. New York: John Wiley and Sons. Wiesel, Thorsten, Bernd Skiera, and Julián Villanueva. 2008. “Customer Equity: An Integral Part of Financial Reporting.” Journal of Marketing 72 (2):1–14. Wind, Yoram. 1982. Product Policy: Concepts, Methods, and Strategy. Reading, MA: Addison-Wesley.
4 Moving from Customer Lifetime Value to Customer Equity
Xavier Drèze and André Bonfrer

4.1 Introduction
In an era when Customer Relationship Management (CRM) is widely espoused, many researchers extol the virtues of Customer Lifetime Value (CLV) as the best metric to use to select customers and optimize marketing actions (Farris et al. 2006; Fader, Hardie, and Lee 2005; Reinartz and Kumar 2003). A customer’s lifetime value is the net present value of all profits derived from that customer. It is constructed using core metrics such as customer retention rate, revenue per customer, and discount rate of money. Berger and Nasr (1998) discuss the concept in detail and provide many different ways to compute it depending on the situation the firm is facing. Proponents of CLV argue that it should be used for customer acquisition (CLV is the upper bound of what one should be willing to spend acquiring a customer lest one wants to lose money—Farris et al. 2006; Berger and Nasr 1998), customer selection (one should focus on customers with high CLV—Venkatesan and Kumar 2004), and resource allocation (marketing resources should be allocated so as to maximize CLV—Reinartz, Thomas, and Kumar 2005; Venkatesan and Kumar 2004). Along with this renewed interest in CLV, there has been a move toward using Customer Equity (CE) as a marketing metric both for assessing the return of marketing actions and to value firms as a whole. The metric, proposed by Blattberg and Deighton (1996) and defined by Rust, Lemon, and Zeithaml (2004) as “the total of the discounted lifetime values summed over all the firm’s current and potential customers,” seeks to assess the value of not only a firm’s current customer base, but also its potential or future customer base. This long-term view seems appropriate, since a firm’s actions at any given time do not only affect its customer base at that point in time, but also its ability to recruit
and retain customers in subsequent time periods. Customer equity has been endorsed by many researchers. For instance, Berger et al. (2002) make CE the foundation on which they build their framework for managing customers as assets; Bayon, Gutsche, and Bauer (2002) make a case that CE is the best basis for customer analysis; Gupta, Lehmann, and Stuart (2004) use CE as a basis for valuing firms. Villanueva, Yoo, and Hanssens (2008) use CE to measure the impact of different acquisition methods on the long-term value of the firm. There is a direct link between CLV and CE: customer equity is the sum of the CLV of all current and future customers. When building CE as an aggregation of CLVs, researchers have taken for granted that the precepts coming out of a CLV analysis still apply when looking at customer equity (e.g., CLV is a limit to acquisition spending; maximizing CLV is equivalent to maximizing CE). The distinction between CLV and CE is confounded by the fact that some researchers (e.g., Berger and Bechwatti 2001) use the terms interchangeably. There is no doubt however that the two concepts are distinct; CLV computes the value of a customer to the firm, while CE measures the value of all present and future customers given a firm’s marketing actions (e.g., acquisition policy, marketing mix). There is no doubt that when valuing a firm (Gupta, Lehmann, and Stuart 2004) one should use CE, rather than CLV, as a metric. But, when optimizing marketing decisions, are CLV and CE interchangeable? Since CE is a sum of CLV, are we maximizing CE when we maximize CLV? The theoretical development we conduct in this paper shows that the answer to these two questions is a resounding “No.” Maximizing CLV is suboptimal from a CE perspective—it leads to the wrong marketing actions and the wrong acquisition policy. We find that firms that optimize CLV generate lower profits and retain fewer customers than firms that optimize CE. To show why CLV is suboptimal, we start with a discussion of customers as a resource to the firm (section 4.2.1) and show that a firm should aim to maximize the long-term value of this resource. Based on this discussion, we derive a general functional form for a firm’s customer equity in section 4.2.3. In sections 4.3 and 4.4, we derive the first order condition that must be satisfied to maximize CE and perform comparative statics to show how marketing actions should be adapted when control variables (e.g., retention rate, acquisition effectiveness) change. We show how maximizing CE is different from maximizing CLV such that maximizing CLV is suboptimal from a customer equity standpoint. Then, in section 4.5, we discuss the impact of a long-term
focus on acquisition spending. We show that firms that aim to maximize CE will spend less on acquisition than firms that aim to maximize CLV; however, the former will retain their customers longer and generate larger total profits. In section 4.6, we show our model to be robust to both observed and unobserved heterogeneity. We conclude the paper in section 4.7.
4.2 A Customer Equity Model
For modeling purposes, and in the spirit of Berger et al. (2002), we consider the case of a firm that is trying to generate revenues by contacting members of its database. Customers are recruited through an acquisition policy and are contacted at fixed intervals. The problem for the firm is to set its marketing policy, defined here as the contact periodicity and the amount of money spent on acquisition, that will maximize its expected revenues, taking into account the various costs it faces (acquisition, communication) and the reaction of customers to the firm’s policy (defection, purchase). When developing our model of customer equity, we pay particular attention to two processes. First, we seek to capture the customer acquisition/retention process in a meaningful way. Second, we seek to capture the impact of the firm’s marketing actions on customer retention and expenditure. 4.2.1 Customer Flow: Acquisition and Retention Process Fundamentally, managing a database of customer names is a dynamic problem that hinges on balancing the cultivation of surplus from the firm’s customers with the retention of customers for future rent extraction. The problem of extracting the highest possible profits from a database of customer names is similar to the challenges encountered by researchers studying the management of natural resources. Economists (Sweeny 1992) consider three types of resources: depletable, renewable, and expendable. The distinction is made based on the time scale of the replenishment process. Depletable resources, such as crude oil reserves, are those for which the replenishment process is so slow that one can model them as being available once and only once and with the recognition that spent resources cannot be replaced. Renewable resources, such as the stock of fish in a lake or pond, adjust more rapidly so that they renew within the time horizon studied. However, any action in a given time period that alters the stock of the resource
will have an impact on the stock available in subsequent periods. Finally, expendable resources, such as solar energy, renew themselves at such a speed that their use in one period has little or no impact on subsequent periods. Depending on the type of resource a firm is managing, it faces different objectives. In the case of a depletable resource, the firm is interested in the optimal use of the resource in its path to depletion. With a renewable resource, the firm is interested in long-term equilibrium strategies. Faced with an expendable resource, the firm is interested in its best short-term allocation. When looking at its customer database, a firm can treat its customers as any of these three resource types depending on the lens it uses for the problem. In the cohort view (the basis for CLV), customers are seen as a depletable resource. New cohorts of customers are recruited over time, but each cohort will eventually become extinct (Wang and Spiegel 1994). Attrition is measured by the difference in cohort size from one year to another. In an email-spam environment, where consumers cannot prevent the firm from sending them emails, and where new email names can readily be found by "scouring" the Internet, the firm views consumers as an expendable resource, and thus can take a short-term approach to profit maximization. Customer equity takes a long-term approach to valuing and optimizing a firm's customer base. For a firm with a long-term horizon, a depletable resource view of customers is not viable as it implies that the firm will lose all its customers at some point. Thus, a customer acquisition process must be in place for the firm to survive. Further, if the firm observes defection and incurs acquisition costs to replace lost customers, then it cannot treat its customers as expendable. Hence, we argue that the appropriate way to look at the profit maximization problem is to view customers as a renewable resource. This view is the basis for our formulation and analysis of a firm's customer equity. Further, if the firm views customers as a renewable resource, then it must evaluate its policies in terms of the long-term equilibrium they generate. This is the approach we take in this paper.
4.2.2 The Impact of Marketing Actions on Customer Retention and Expenditure
One cannot capture in a simple model the impact of all possible marketing actions (pricing decisions, package design, advertising
content …). Hence, as in Venkatesan and Kumar (2004), we concentrate on optimizing the frequency with which a firm should contact its customer base. It is a broadly applicable problem that is meaningful for many applications, not only for the timing of mail or email communications, but also the frequency of calls by direct sales force or the intensity of broadcast advertising. It also possesses some interesting characteristics that illustrate typical tensions among marketing levers in terms of spending on retention versus cultivation efforts. For instance, the effect of the periodicity of communication on attrition is not so straightforward. To see why, one can conduct the following thought experiment: let us consider the two extremes in contact periodicity. At one extreme, a firm might contact its clients so often that the relationship becomes too onerous for the clients to maintain, and thus they sever their links to the company, rendering their names worthless. (Evidence of this phenomenon can be seen in the emergence of the “Do Not Call Registry” in 2003 as well as the Anti-Spamming Act of 2001.) At the other extreme, the firm never contacts its customers, and, although the names have a potential value, this value is never realized. Thus, periodicity affects the value of names in two ways. On the one hand, more frequent contact leads to more opportunities to earn money. On the other hand, more frequent contact provides customers with more opportunities to defect. The latter can quickly lead to high long-term attrition. Imagine the case of a company that has a retention rate of 97% from one campaign to another. This might at first glance seem like outstanding loyalty. However, if the firm were to contact its database every week then it would lose 80% of its current customers within a year! Clearly, there must be an intermediate situation where one maximizes the realized value from a name by optimally trading off the extraction of value in the short-term against the loss of future value due to customer defection. 4.2.3 Customer Equity Our customer equity model is in essence a version of the model developed by Gupta et al. (2004) that has been modified to directly take into account the requirement described in the preceding two sections. In particular, we make customer retention and customer spending a function of the time elapsed between communications (τ ). We also explicitly develop a consumer production function (Si (τ )) that characterizes how the customer base evolves from one period to another as a function of acquisition and retention.
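As a quick check of the retention arithmetic in the thought experiment above, the 97%-per-campaign figure compounds as follows (a minimal sketch; the 97% and the weekly cadence are the chapter's own numbers).

```python
# A 97% per-campaign retention rate, applied weekly, leaves roughly 20% of
# customers after a year, i.e., about 80% of the database is lost.
weekly_retention = 0.97
still_active_after_year = weekly_retention ** 52
print(f"fraction still active after 52 weekly campaigns: {still_active_after_year:.2f}")  # ~0.20
print(f"fraction lost within a year: {1 - still_active_after_year:.2f}")                  # ~0.80
```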
Because the time elapsed between each communication is a control variable, we must be careful when discounting revenues and expenses. To apply the correct discount rate to expenses and revenues, one must pinpoint when each instance occurs. Consistent with Venkatesan and Kumar (2004), we assume that acquisition expenses are an ongoing expenditure that the firm recognizes on a periodic basis (e.g., monthly), while communications expenses and revenues are recognized at the time of the communication. This assumption is consistent with our observations of practice, where the acquisition costs are decoupled from the marketing activities once the names are acquired. This is also a recommendation of Blattberg and Deighton (1996), highlighting the different roles of acquisition and retention efforts. Based on these assumptions, customer equity (CE) can be generally defined as:

$$CE(\tau) = \sum_{i=0}^{\infty} e^{-ir\tau}\left(R_i(\tau)S_i(\tau) - FC_i\right) - \sum_{j=0}^{\infty} e^{-jr}AQ_j \tag{1}$$
where:
i is an index of communications,
j is an index of time periods,
$e^{-r}$ is the per-period discount rate,¹
$\tau$ is the periodicity of contact,
$R_i(\tau)$ is the expected profit (revenues net of cost of goods sold and variable communication costs) per customer for communication i,
$S_i(\tau)$ is the number of people in the database when communication i was sent,
$FC_i$ is the fixed cost associated with communication i,
$AQ_j$ is the acquisition cost incurred in time period j.

Further, we define the customer profit ($R_i(\tau)$) and production ($S_i(\tau)$) functions as:

$$R_i(\tau) = \left(A_i(\tau) - VC_i\right), \tag{2}$$

$$S_i(\tau) = S_{i-1}(\tau)P_i(\tau) + g_i(\tau) \tag{3}$$

where:
$A_i(\tau)$ is the expected per-customer gross profit for communication i,
$VC_i$ is the variable cost of sending communication i,
$S_0(\tau) = 0$,
$P_i(\tau)$ is the retention rate for communication i,
$g_i(\tau)$ is the number of names acquired between campaigns i − 1 and i.
The model laid out above is a general case that is applicable to many situations. However, we make the following simplifications in order to make the maximization problem more tractable. First, we assume that the acquisition efforts are constant over time and produce a steady stream of names (i.e., $g_i(\tau) = \tau g$, $AQ_j = AQ(g)$). There is no free production of names, and the cost of acquisition increases with the size of the name stream such that $AQ(0) = 0$ and $\partial AQ(g)/\partial g > 0$. Second, we assume that the fixed and variable communications costs are constant across communications (i.e., $VC_i = VC$, $FC_i = FC$). Third, we assume that the communications are identical in nature, if not in actual content, and that customers' reaction to the communications depends only on their frequency such that $A_i(\tau) = A(\tau)$ and $P_i(\tau) = P(\tau)$. Further, we assume that $A(\tau)$ is monotonically increasing in $\tau$, with $A(0) = 0$ and $A(\infty)$ finite (i.e., the more time the firm has to come up with a new offer, the more attractive it can make it, up to a point), and that $P(\tau)$ is inverted-U shaped (i.e., retention is lowest when the firm sends incessant messages or when it never contacts its customers, and there is a unique optimal communication periodicity for retention purposes) and dependent on $A(\tau)$ such that $\partial P(\tau)/\partial A(\tau) > 0$ (i.e., the better the offer, the higher the retention rate). We assume, for now, that customers are treated by the firm as homogeneous entities; i.e., there is no observed heterogeneity the firm can use to derive a different $\tau$ for different individuals. This is consistent with the recommendation of a number of CLV researchers (Zeithaml, Rust, and Lemon 2001; Berger et al. 2002; Libai, Narayandas, and Humby 2002; Hwang, Jung, and Suh 2004) who talk about generating a set of rules for groups (or segments) of customers. It is also an approach that is widely followed by practitioners; customers are often segmented into fairly homogeneous subgroups and then each subgroup is treated independently. With this in mind, we would argue that each customer segment has its own CE with its own acquisition and retention process; equation (1) can then be applied to each of these segments. We will return to the issue of heterogeneity in section 4.6. We defer to appendixes A, B, and C, later in this paper, for a more complete justification of our assumptions. We also discuss the robustness of the major findings to the relaxation of these assumptions later in this paper when appropriate. Building on these assumptions, we can rewrite the customer equity equations as follows:

$$CE(\tau) = \sum_{i=0}^{\infty} e^{-ir\tau}\left(R(\tau)S_i(\tau) - FC\right) - \sum_{j=0}^{\infty} e^{-jr}AQ(g) \tag{4}$$
$$S_i(\tau) = S_{i-1}(\tau)P(\tau) + \tau g \tag{5}$$

$$R(\tau) = \left(A(\tau) - VC\right). \tag{6}$$
We show (Lemma 1) that, given (5), in the long run the database reaches an equilibrium size, regardless of the number of names the firm is endowed with at time 0 ($S_0$).

Lemma 1: For any given constant marketing actions there exists a steady state such that the database is constant in size. The steady state size is

$$\bar{S} = \frac{\tau g}{1 - P(\tau)}.$$
Proof: See appendix A.1.
The intuition behind Lemma 1 is straightforward given the database production function (5). For any given $\tau$ the firm acquires a constant number of new names ($\tau g$) between every communication, but it loses names in proportion ($P(\tau)$) to its database size. Consequently, the database will be at equilibrium when the firm, between one communication and the next, acquires as many names as it loses due to the communication. This is analogous to Little's Law, which states that an inventory reaches a steady state size, L, which is the product of the arrival rate of new stock ($\lambda$) and the expected time spent by any single unit of stock in the inventory ($W$), or $L = \lambda W$ (Little 1961). When Little's Law is applied to an inventory of customer names we have $L = \bar{S}$, the expected number of units in the system; $\lambda = \tau g$, the arrival rate of new units; and $W = 1/(1 - P(\tau))$, the expected time spent by a unit in the system. Little's Law yields a finite inventory size as long as $\lambda$ and $W$ are both finite and stationary. This law has been the subject of numerous papers and has been shown to hold under very general assumptions. This means that Lemma 1 will hold (in expected value) for any stochastic acquisition and retention process as long as they are stationary in the long run, $\tau$ and g are finite, and $P(\tau) < 1$. The appeal of Little's Law applied to the CE problem is its robustness. This relationship holds even if:
• there is seasonality in retention or acquisition (aggregated to the year level, these processes become stationary; e.g., g = E[yearly acquisition rate]);
• the firm improves its ability to attract new customers (as long as $g = \lim_{t \to \infty} g(t)$ is finite, where t is the length of time the firm has been in business);
• there is heterogeneity in customer retention (see section 4.6);
• customer retention increases (possibly due to habit formation or inertia) as customers stay in the database longer (provided that $\lim_{t \to \infty} P(\tau, t) < 1$, where t is the length of time during which a customer has been active).
Lemma 1 yields some interesting properties for the CE model. First, it allows managers to ex ante predict the long-term database size and CE. All they need to know is their long-term attrition rate and acquisition effectiveness. This is particularly important when valuing young companies (e.g., startups) as it provides a monitoring tool and a base for predicting long-term value. Second, it allows us to further simplify the formulation of CE. Indeed, we can write the steady state CE:

$$CE(\tau) = \sum_{i=0}^{\infty} e^{-ir\tau}\left(R(\tau)\bar{S}(\tau) - FC\right) - \sum_{j=0}^{\infty} e^{-jr}AQ(g),$$

or, taking the limit of the sums:

$$CE(\tau) = \left(R(\tau)\bar{S}(\tau) - FC\right)\frac{e^{r\tau}}{e^{r\tau} - 1} - AQ(g)\frac{e^{r}}{e^{r} - 1}. \tag{7}$$
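To make Lemma 1 and equation (7) concrete, here is a minimal numerical sketch. The functional forms and all parameter values below are illustrative assumptions, not the chapter's calibration; the point is only that the simulated recursion (5) converges to the Lemma 1 steady state, which can then be plugged into (7).

```python
import math

# Illustrative parameters (assumptions for this sketch, not from the chapter)
P = 0.90        # retention rate per communication, P(tau)
R = 4.0         # expected profit per customer per communication, R(tau)
FC = 500.0      # fixed cost per communication
tau = 1 / 12    # intercommunication interval, in years (monthly contact)
g = 1200.0      # names acquired per year
AQ = 2000.0     # yearly acquisition spending, AQ(g)
r = 0.10        # yearly discount rate

# Simulate the production function S_i = S_{i-1} * P + tau * g  (equation (5))
S = 0.0
for _ in range(500):
    S = S * P + tau * g
print(f"simulated long-run database size: {S:,.1f}")

# Lemma 1: closed-form steady state
S_bar = tau * g / (1 - P)
print(f"Lemma 1 steady state tau*g/(1-P):  {S_bar:,.1f}")

# Steady-state customer equity, equation (7)
CE = (R * S_bar - FC) * math.exp(r * tau) / (math.exp(r * tau) - 1) \
     - AQ * math.exp(r) / (math.exp(r) - 1)
print(f"steady-state CE per equation (7): {CE:,.0f}")
```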
Third, the elements of (7) are quite easy to compute for database marketers. Because they track a database of names, they can measure $\bar{S}$ (the number of live names in the database) and compute R as the average per-customer profit per campaign, while the various costs (as well as r) can be obtained from their accounting department. Finally, they can set $\tau$ to maximize $CE(\tau)$ as discussed in proposition 2, below. Fourth, the optimal marketing actions and the resulting steady state database size are a function of the rate of the acquisition stream, not its cost. We formalize this point in lemma 2.

Lemma 2: Given an acquisition stream, the marketing actions that maximize CE depend on the rate of acquisition, but are separable from the cost of acquisition.
Proof: The proof is straightforward. One only needs to recognize that CE can be split into two terms. The first one depends on g and $\tau$; the second one depends on g only. That is:

$$CE(\tau) = \underbrace{\left(R(\tau)\bar{S}(\tau) - FC\right)\frac{e^{r\tau}}{e^{r\tau} - 1}}_{f(\tau, g)} - \underbrace{AQ(g)\frac{e^{r}}{e^{r} - 1}}_{k(g)}$$

Further, to maximize CE with respect to $\tau$ one computes $\partial CE(\tau)/\partial\tau$ such that:

$$\frac{\partial CE(\tau)}{\partial\tau} = \frac{\partial}{\partial\tau}f(\tau, g) - \frac{\partial}{\partial\tau}k(g) = \frac{\partial}{\partial\tau}f(\tau, g) = 0.$$

The essence of the proof stems from the observation that the acquisition expenditures precede, and are separable from, the frequency of contact or the message content. Once names have entered the database, it is the marketer's responsibility to maximize the expected profits extracted from those names, treating the cost of acquisition as a sunk cost. That is, when optimizing the contact strategy, the firm only needs to know how many new names are acquired every month, not how much these names cost. Everything else being constant, two firms having the same acquisition rate, but different acquisition costs, will have identical intercommunication intervals (but different overall profitability). Lemma 2 belies the belief that one should be more careful with names that were expensive to acquire than with names that were acquired at low cost. This does not imply, however, that acquisition costs are irrelevant. The long-term profitability of the firm relies on the revenues generated from the database being larger than the acquisition costs. Further, acquisition costs are likely to be positively correlated with the quality of the names acquired (where better quality is defined either by higher retention or higher revenue per customer). Nevertheless, the optimal marketing activity is a function of the acquisition rate (g) and can be set without knowledge of the cost associated with this acquisition rate (AQ(g)). The importance of lemma 2 is that it allows us to split customer equity into two parts: the value of the customer database and the cost of replenishing the database to compensate for attrition. We can thus study the optimization problem in two stages. First, solve (section 4.3.1) the problem of maximizing the database value given a predetermined
acquisition stream (i.e., find $\tau^*|g$). Second, optimize (section 4.5.1) the acquisition spending given the characterization of $\tau^*$. Further, this lemma formalizes Blattberg and Deighton's (1996) suggestion that, when maximizing customer equity, the "acquisition" and the "customer equity" management tasks are very different and should be treated separately.
4.3 Customer Equity and Customer Lifetime Value
We now turn to the profit maximization problem given an acquisition stream of names (g). As we have shown in the previous section, if the acquisition expenditures are independent of $\tau$, we can ignore them when optimizing the communication strategy. The problem for the firm then becomes:

$$\max_{\tau > 0} V_{ce}(\tau) = \left(R(\tau)\bar{S}(\tau) - FC\right)\frac{e^{r\tau}}{e^{r\tau} - 1}. \tag{8}$$
In this equation V (the database value part of CE) represents the net present value of all future expected profits, and the subscript ce indicates that we take a customer equity approach, as opposed to the customer lifetime value approach that we will formulate shortly (using clv as a subscript). We show in proposition 1 that maximizing the CLV leads to different solutions than maximizing $V_{ce}$, such that maximizing the CLV is suboptimal with regard to long-term customer equity maximization.

Proposition 1: CLV maximization is suboptimal with regard to the long-term profitability of the firm.

Proof: See appendix A.2.
The intuition behind the proof is that when computing a CLV one accounts for the fact that, due to attrition, customers have a decreasing probability of being active as time passes. Given our notation, one would write the CLV of an individual customer as:

$$CLV(\tau) = \sum_{i=0}^{\infty} e^{-ir\tau}P(\tau)^i\left(R(\tau) - \frac{FC}{\bar{S}}\right) = \left(R(\tau) - \frac{FC}{\bar{S}}\right)\frac{e^{r\tau}}{e^{r\tau} - P(\tau)}. \tag{9}$$
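A small grid search makes Proposition 1 concrete. The sketch below borrows the functional forms and parameter values of the chapter's synthetic case in section 4.5 (with FC = 0); the grid search itself is our own illustrative stand-in for the Excel solver used there, and it simply reports how far apart the two maximizers are.

```python
import math

# Functional forms and parameters from the chapter's synthetic case (section 4.5):
# FC = 0, R(tau) = r0*(1 - exp(-r1*tau)) with r0 = 5, r1 = 10,
# log-logistic retention with alpha = 2, lambda = 3/4, discount rate 20% per year.
r, r0, r1, alpha, lam = 0.20, 5.0, 10.0, 2.0, 0.75
g = 1.0   # with FC = 0 the acquisition rate only rescales V_ce; it does not move tau*

R = lambda tau: r0 * (1 - math.exp(-r1 * tau))
P = lambda tau: alpha * lam * (lam * tau) ** (alpha - 1) / (1 + (lam * tau) ** alpha) ** 2
S_bar = lambda tau: tau * g / (1 - P(tau))

V_ce = lambda tau: R(tau) * S_bar(tau) * math.exp(r * tau) / (math.exp(r * tau) - 1)  # eq. (8), FC = 0
CLV  = lambda tau: R(tau) * math.exp(r * tau) / (math.exp(r * tau) - P(tau))          # eq. (9), FC = 0

grid = [i / 1000 for i in range(1, 5001)]          # tau from 0.001 to 5 years
tau_ce, tau_clv = max(grid, key=V_ce), max(grid, key=CLV)
print(f"tau* maximizing CE : {tau_ce:.2f}")
print(f"tau* maximizing CLV: {tau_clv:.2f}")
print(f"CE forgone by using the CLV-optimal tau: "
      f"{(1 - V_ce(tau_clv) / V_ce(tau_ce)) * 100:.1f}%")
```

The two maximizers do not coincide, which is exactly the point of Proposition 1 and of the static comparison reported in table 4.1 later in the chapter.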
Thus, we can restate the database value (8) in terms of the customer lifetime value as:

$$V_{ce}(\tau) = CLV(\tau)\,\bar{S}\,\frac{e^{r\tau} - P(\tau)}{e^{r\tau} - 1}. \tag{10}$$
In this equation, the multiplier $\frac{e^{r\tau} - P(\tau)}{e^{r\tau} - 1} > 1$ accounts for the fact that customers are renewable and lost customers are replaced. The lower the retention rate, the higher the multiplier. Further, the multiplier increases as the discount rate (r) or the communication interval ($\tau$) decreases. Following equation (10), the database value is equal to the CLV multiplied by a correction factor. Maximizing the CLV will thus lead to the maximum customer equity if and only if, at the maximum CLV, the derivative of the multiplier with respect to $\tau$ is equal to 0. We show in appendix A.2 that this can only occur if $\tau^* = 0$. But, as we will show in the next section, $\tau^*$ is always strictly greater than zero. Thus, the marketing actions that maximize the CLV do not also maximize the long-term CE of the firm. Therefore, CLV maximization is suboptimal for the firm! One should note that when computing the CLV in equation (9), we allocated a portion of the campaign costs (FC) to each name. This allocation does not affect the substance of our findings. Indeed, if we were to ignore the fixed costs at the name level, we would compute the CLV as:

$$CLV(\tau) = R(\tau)\frac{e^{r\tau}}{e^{r\tau} - P(\tau)}. \tag{11}$$

Thus, we would have:

$$V_{ce} = \left[CLV(\tau)\bar{S}(\tau)\right]\frac{e^{r\tau} - P(\tau)}{e^{r\tau} - 1} - FC\,\frac{e^{r\tau}}{e^{r\tau} - 1}.$$

The value of the database as a whole is the value of its names minus the costs associated with extracting profits from these names. In valuing each name, we find the same multiplier $\left(\frac{e^{r\tau} - P(\tau)}{e^{r\tau} - 1}\right)$ as we had when we incorporated the costs directly in the name value.
4.3.1 Finding the Optimal Periodicity ($\tau^*$)
In order to optimize the firm's marketing actions for any given acquisition strategy, we calculate the first-order condition for optimality by
differentiating equation (8) with respect to $\tau$. Without further specifying the general functions that constitute the database value, it is not possible to generate an explicit closed-form solution for the optimal $\tau$. However, we can make some inferences using comparative static tools. We start by describing the first-order condition, expressed as a function of the elasticities of the retention and profit functions with respect to changes in the intercommunication interval. Then, we study how the first-order condition changes as a result of changes in retention, profit, acquisition, and discount.

Proposition 2: The first-order condition for a firm that seeks to maximize its database value by optimizing the intercommunication interval is:

$$\eta_R + \eta_S + \eta_D \cdot GM = 0 \tag{12}$$

where
$\eta_R = \frac{\partial R(\tau)}{\partial\tau}\frac{\tau}{R(\tau)} = \frac{\partial A(\tau)}{\partial\tau}\frac{\tau}{A(\tau) - VC}$ is the elasticity of $R(\tau)$ with respect to $\tau$,
$\eta_S = 1 + \frac{\partial P(\tau)}{\partial\tau}\frac{\tau}{1 - P(\tau)}$ is the elasticity of $\bar{S}$ with respect to $\tau$,
$\eta_D = \frac{-r\tau}{e^{r\tau} - 1}$ is the elasticity of the discount multiplier $\left(D(\tau) = \frac{e^{r\tau}}{e^{r\tau} - 1}\right)$ with respect to $\tau$,
$GM = \frac{R(\tau)\bar{S}(\tau) - FC}{R(\tau)\bar{S}(\tau)}$ is the gross margin yielded by each communication.

Proof: See appendix B.1.
The proof of proposition 2 is an algebraic exercise that leads to a simple expression of the first-order condition: a linear combination of elasticities. The optimal intercommunication interval ($\tau^*$) is found when the sum of elasticities is equal to 0. When the sum is positive, the firm would increase its value by increasing $\tau$. When the sum is negative, the firm would be better off decreasing its $\tau$. As we show in appendixes A, B, and C, if there exists a $\tau$ such that $V_{ce}(\tau)$ is positive, then there exists a unique $\tau^*$ that is finite and strictly positive. This is not a restrictive condition, as it only assumes that it is possible for the firm to make some profits. If that were not the case, the firm would never be profitable and the search for an optimal $\tau$ would become meaningless. Further, if $P(\tau)$ is not too convex, then there are no local maxima, and thus
the maximum is unique. This, again, is not restrictive, as $P(\tau)$ will typically be concave over the duration of interest.
4.4 The Impact of Retention Rate, Discounts, Revenues, and Acquisition Rates on CE
Our framework for the valuation of customer equity relies on several metrics commonly used to characterize a customer database: retention rate, acquisition rate, customer margins (revenues and costs), and discount rates. When any of these change, the marketing action ($\tau$) must change to reach the new maximum customer equity. Since we endogenized the marketing action, we can now answer the question: how do these levers of customer value impact both the optimal marketing actions ($\tau^*$) and customer equity (CE*)? To answer this question, we perform a comparative static analysis on the CE problem. Furthermore, we make some generalizations about how these levers impact customer equity.
4.4.1 Change in Retention Rate ($P(\tau)$)
A change in the retention rate response function can come about in two different ways. First, one might see an intercept or level shift that increases or decreases $P(\tau)$ overall without changing the sensitivity of $P(\tau)$ to changes in $\tau$ (i.e., the gradient $\partial P(\tau)/\partial\tau$ is unaffected). Second, one might see a change in the sensitivity of $P(\tau)$ to changes in $\tau$ (i.e., a change of $\partial P(\tau)/\partial\tau$ at $\tau^*$ while $P(\tau^*)$ is constant). One might, of course, observe a combination of these two changes. In such a case, the total impact of the changes will be the sum of the changes due to the level-shift and the sensitivity-shift. The combined impact of both changes is given in the following proposition:

Proposition 3a: An increase in retention sensitivity ($\partial P(\tau)/\partial\tau$) leads to an increase in $\tau^*$. A level-shift increase in retention ($P(\tau)$) leads to an increase in $V_{ce}$. It also leads to an increase in $\tau^*$ when $\tau^*$ is small, and a decrease in $\tau^*$ when it is large. The cut-off level is:

$$\tau^*: \quad \eta_S(\tau^*) > \eta_D(\tau^*)\left(\frac{FC}{R(\tau^*)\bar{S}(\tau^*)} + 1\right).$$
Proof: See appendix B.2. The retention probability affects the first-order condition (FOC) through both ηS and GM. Thus, when looking at changes in P(τ ) we
need to consider the combined impact of both changes. In the case of a change in $\partial P(\tau)/\partial\tau$, the situation is straightforward, as GM is unaffected and thus the increase in $\partial P(\tau)/\partial\tau$ leads to an increase in the FOC through $\eta_S$. Hence, the firm would react by increasing its $\tau$. In other words, an increase in $\partial P(\tau)/\partial\tau$ means that the firm has more to gain by waiting a little longer between communications, and since the system was at equilibrium before, it now leans in favor of a larger $\tau^*$. In terms of database value, $V_{ce}$ is not directly affected by changes in $\partial P(\tau)/\partial\tau$ and thus, strictly speaking, a change in $\partial P(\tau)/\partial\tau$ will not affect the database value. However, a change in $\partial P(\tau)/\partial\tau$ cannot reasonably arise without some change in $P(\tau)$, and thus $V_{ce}$ will be affected through the change in $P(\tau)$. It is straightforward to show that a level-shift increase in $P(\tau)$ increases $V_{ce}$. The envelope theorem tells us that although $\tau^*$ is affected by changes in P, when looking at the net impact on $V_{ce}$, we can ignore the (small) impact of changes in P on $\tau^*$, and simply look at the sign of $\partial V_{ce}/\partial P$. Here, we have $\partial V_{ce}/\partial P > 0$ and thus $V_{ce}$ is increasing in P. In the case of a level-shift, the situation is made complex in that GM always increases when $P(\tau)$ increases (which leads to a decrease in the FOC as $\eta_D$ is negative), but the effect on $\eta_S$ depends on $\partial P(\tau)/\partial\tau$. If we assume that $P(\tau)$ is inverted-U shaped (increasing at a decreasing rate for small $\tau$: $\partial P(\tau)/\partial\tau > 0$ and $\partial^2 P(\tau)/\partial\tau^2 < 0$ for all $\tau < \tau_p$; then decreasing after some threshold: $\partial P(\tau)/\partial\tau < 0$ for all $\tau > \tau_p$), then we find that for small $\tau$ an intercept-shift increase in $P(\tau)$ leads to an increase in $\tau^*$, and for large $\tau$ it leads to a decrease in $\tau^*$. The change in behavior comes from the fact that when $\tau$ is small, an increase in $P(\tau)$ has a large impact on the database size and spurs the company to seek an even larger database, while, when $\tau$ is large, an increase in retention allows the firm to harvest the database to a greater extent.
4.4.2 Change in Expected Profit per Contact ($R(\tau)$)
We study the impact of costs in the next section and focus here solely on $A(\tau)$. Similar to $P(\tau)$, $A(\tau)$ can be changed through an intercept-shift or through a change in sensitivity to $\tau$. The situation is, however, a bit more complex for $A(\tau)$ than for $P(\tau)$ because $P(\tau)$ depends on $A(\tau)$. In other words, defection depends in part on the content that is sent to customers. Thus, we must account not only for the direct effect of a change in $A(\tau)$ on the FOC, but also for indirect effects through changes in $P(\tau)$.
Proposition 3b: An increase in revenue per contact sensitivity ($\partial A(\tau)/\partial\tau$) leads to an increase in $\tau^*$. An intercept-shift in revenue per contact ($A(\tau)$) leads to an increase in $V_{ce}$. It also leads to a decrease in $\tau^*$ when $\tau^*$ is small to moderate, and an increase in $\tau^*$ when it is large. A lower bound to the cut-off level is: $\tau^*: \eta_S = \tau^2 + 1$.

Proof: See appendix B.3.
The impact of $A(\tau)$ on $V_{ce}$ follows from the envelope theorem. In terms of the impact of $A(\tau)$ and $\partial A(\tau)/\partial\tau$ on $\tau^*$, the proof in appendix B.3 is longer than for the previous proposition because of the effects of A on P. The essence of the proposition is, however, straightforward. When the sensitivity of revenue to longer $\tau$ increases, there is pressure toward a longer $\tau^*$. In case of a positive intercept-shift, there is a tendency for the firm to take advantage of the shift by harvesting the database.
4.4.3 Change in Acquisition Rate (g)
Proposition 3c: An increase in acquisition rate leads to a decrease in $\tau^*$ and an increase in $V_{ce}$.

Proof: The proof is similar to the proof of Proposition 3a. Since $\bar{S}$ is a linear function of g, the elasticity $\eta_S$ does not depend on g. The only term in the FOC that depends on the acquisition rate is the gross margin (through $\bar{S}$). Thus all we are interested in is the sign of:

$$\frac{\partial GM}{\partial g} = \frac{\partial}{\partial g}\frac{R(\tau)\bar{S}(\tau) - FC}{R(\tau)\bar{S}(\tau)} = \frac{FC}{R(\tau)\bar{S}(\tau)^2}\frac{\partial\bar{S}(\tau)}{\partial g}.$$

Hence, $\frac{\partial GM}{\partial g} > 0$ since $\frac{\partial\bar{S}(\tau)}{\partial g} = \frac{\tau}{1 - P(\tau)} > 0$. Thus, given that GM acts
as a multiplier to $\eta_D$, which is negative, an increase in acquisition rate, whether through increased acquisition spending or increased acquisition effectiveness, will lead to a decrease in $\tau^*$; conversely, a decrease in acquisition rate will lead to an increase in $\tau^*$. Finally, the increase in $V_{ce}$ follows from the envelope theorem and the fact that $\partial V_{ce}/\partial g$ is positive. This relationship between g and $\tau^*$ is a direct result of treating customers as a resource. When the resource is plentiful (g is high), the firm can harvest it by setting a low $\tau^*$. When the resource is scarce (g is low), the firm must conserve it by setting a high $\tau^*$.
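A quick numerical check of Proposition 3c can be run by re-optimizing equation (8) for two acquisition rates. The functional forms and figures below are illustrative assumptions (note that FC must be positive for g to enter the first-order condition through the gross margin).

```python
import math

# Minimal comparative-statics sketch for Proposition 3c. All functional forms
# and numbers are illustrative assumptions, not the chapter's calibration.
r, FC, VC = 0.15, 300.0, 0.2
A = lambda tau: 6.0 * (1 - math.exp(-5 * tau))        # offer value grows with tau
P = lambda tau: 0.9 * (2 * tau) / (1 + (2 * tau) ** 2)  # inverted-U retention
R = lambda tau: A(tau) - VC

def tau_star(g, grid=[i / 1000 for i in range(1, 4001)]):
    """Grid-search the tau that maximizes the database value V_ce of equation (8)."""
    S_bar = lambda tau: tau * g / (1 - P(tau))
    V_ce = lambda tau: (R(tau) * S_bar(tau) - FC) * math.exp(r * tau) / (math.exp(r * tau) - 1)
    return max(grid, key=V_ce)

for g in (200.0, 2000.0):
    print(f"g = {g:6.0f} names/year  ->  tau* = {tau_star(g):.2f}")
# Per Proposition 3c, the tau* found for the larger g should be no larger than
# the tau* found for the smaller g.
```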
4.4.4 Change in Discount Rate (r)
Proposition 3d: An increase in discount rate leads to an increase in $\tau^*$ and a decrease in $V_{ce}$.

Proof: Although counterintuitive with regard to $\tau^*$, Proposition 3d is straightforward to prove. To show that $\tau^*$ increases in r, we first note that $\eta_R$, $\eta_S$, and GM are independent of r. Hence, a change in discount rate will only affect $\eta_D$. This change is:

$$\frac{\partial\eta_D}{\partial r} = \frac{\partial}{\partial r}\frac{-r\tau}{e^{r\tau} - 1} = \frac{\tau}{e^{r\tau} - 1}\left(\frac{r\tau e^{r\tau}}{e^{r\tau} - 1} - 1\right) > 0, \quad \forall r > 0, \tau > 0.$$

The derivative of $\eta_D$ with respect to r is positive for all positive r and $\tau$.² This implies that the optimal reaction for a firm faced with an increase in discount rate is to increase its intercommunication interval. This may seem counterintuitive at first, as one might believe that increasing r makes future revenues less attractive, and, thus, one might want to decrease $\tau$ so as to realize more profits in the short term. What actually happens is that an increase in r leads to a decrease in the discount multiplier (D). This decrease means that, holding everything else constant, the value of each name decreases, and so does the value of the database. This decrease must be offset by either an increase in database size ($\bar{S}$) or an increase in expected profit per name per communication (R). Both are accomplished by increasing $\tau$. It is straightforward to show that as r increases, the value of the database decreases. Applying the envelope theorem one more time we have:

$$\frac{\partial V_{ce}}{\partial r} = \frac{\partial}{\partial r}\left(R\bar{S} - FC\right)\frac{e^{r\tau}}{e^{r\tau} - 1} = \left(R\bar{S} - FC\right)\frac{-\tau e^{r\tau}}{(e^{r\tau} - 1)^2} < 0.$$

Hence, an increase in r leads to both an increase in optimal intercommunication time ($\tau^*$) and a decrease in database value ($V_{ce}$). The impact of r on $\tau^*$ could be used to study how a firm should vary its communication timing in response to interest rate variations. However, a more interesting question is how $\tau^*$ changes as a firm grows from a startup into a large, established company. Indeed, a startup is a risky venture, and its implicit discount rate will be high. As the firm grows, its future becomes less uncertain and its discount rate diminishes. Proposition 3d states that in such cases, the firm will decrease its intercommunication time as it grows. This decrease in $\tau$ effectively shifts the firm from a "database growth" regime to a "profit generation" regime as it now tries to
extract more revenues from its database and leaves itself less time between communications to replace the names lost.
4.4.5 Changes in Costs
The firm's reaction to any increase in cost, whether fixed or variable, is to increase the optimal intercommunication interval. The intuition for this is straightforward. An increase in cost reduces per-campaign profits. The firm must then increase $\tau^*$ to try to boost its profits per campaign; reducing the number of communications sent raises expected revenue per campaign.

Proposition 3e: An increase in fixed or variable costs leads to a decrease in $V_{ce}$ and an increase in $\tau^*$.

Proof: Follows from

$$\frac{\partial V_{ce}}{\partial FC} < 0, \quad \frac{\partial V_{ce}}{\partial VC} < 0, \quad \frac{\partial\eta_R}{\partial VC} > 0, \quad \text{and} \quad \frac{\partial GM}{\partial VC} < 0.$$

4.4.6 Take Away
The study of the levers of customer value reveals some consistent patterns. The firm has two conflicting objectives that it is trying to balance: database growth and database harvesting. Database growth involves a long-term approach, so the marketer needs to adjust marketing actions so as to increase the database size ($\bar{S}$). In the CE framework this can be achieved by reducing the communication frequency (i.e., increasing $\tau$). On the other hand, when the objective is to harvest the database, marketing actions need to be adjusted so as to increase short-term revenues. This can be achieved by reducing the intercommunication interval (i.e., decreasing $\tau$, except on some occasions when $\tau$ is very small, in which case both objectives are achieved by increasing $\tau$). Whenever there is a change in the firm's performance or in the environment that is beneficial to the firm (e.g., higher return from acquisition, better retention rate, lower interest rate), the firm can adjust its actions in a way that favors harvesting at the expense of growth (i.e., decrease $\tau$). Conversely, if the change is detrimental (e.g., lower return per campaign, higher costs), the firm must counteract this change by leaning more toward growth rather than harvesting (i.e., increase $\tau$).
4.5 Acquisition Policy
Now that we have discussed the key levers of the database value and can maximize the marketing actions (τ * ) for any given acquisition
Moving from Customer Lifetime Value to Customer Equity
103
policy, we can turn to the problem of optimizing acquisition. Indeed, we have thus far assumed that acquisition expenditures were fixed. We were able to do this since lemma 2 shows that the actual spending on name acquisition is separable from the optimization of τ , and therefore only the acquisition rate matters for the derivation of the optimal intercommunication interval. The implication of this result is that the marketer first needs to derive the optimal τ for a given acquisition rate (section 4.3.1), and then optimize acquisition spending given the solution for τ *. Section 4.4 ignored acquisition except for to show that an increase in g leads to a decrease in τ * (proposition 3c). 4.5.1 Optimal Acquisition Policy Our goal for this section is to derive the optimal acquisition policy (g*) as a function of the optimal marketing policy τ * . We found in section 4.4.3 that a higher customer acquisition rate leads to greater database size, and larger revenues, but these benefits must be offset against the cost of acquiring these customers. It is critical, for this analysis, to understand that the acquisition costs correspond to a stream of payment over time, rather than a one-time transaction. Formally, since, by proposition 2, we can separate the optimization of the optimal intercommunication interval (τ * ) from the optimal acquisition rate, then τ * can be expressed as a function of acquisition spending. Thus, we can find the optimal expenditure on acquisition by taking the derivative of CE with respect to g. The optimum is found when marginal revenues are equal to marginal costs. We formalize this in the following proposition: Proposition 4: If the periodicity at which the acquisition expenditures are recognized is equal to the periodicity of the communications, then the optimal acquisition rate occurs when the marginal acquisition cost per name is equal to the CLV of the name. Proof: See appendix C. The idea behind the proof is straightforward. Let us assume that the firm currently acquires g customers per period for a cost of AQ(g) and is considering acquiring one more. This will cost of ΔAQ = AQ ( g + 1) − AQ( g ) . This is worth doing if the net present value (NPV) of the additional expenses is lower than the NPV of the additional revenues. In each period, the cost of acquiring the extra customer is ΔAQ and the NPV of the future revenues derived from the acquired customer is its CLV. Thus, one will acquire customers as long as their acquisition costs are lower than their CLV.
104
Xavier Drèze and André Bonfrer
It may seem odd that our analysis shows on the one hand that maximizing CLV is suboptimal for the firm, and on the other hand that CLV is appropriate to set customer acquisition policies. There seems to be a disconnect here as if we set τ * according to our customer equity framework, the CLV of each of the customers in our database is by definition smaller than if we set τ * so as to maximize the CLV. Thus, since we have just shown that acquisition is set based on the CLV of the people we acquire, we could spend more on acquisition, and thus acquire more customers, if we set τ * according to the CLV rather than the CE. It stands to reason that if maximizing CLV leads to customers that are worth more than maximizing CE, and that maximizing CLV allows the firm to acquire more such customers, then maximizing CLV should be more profitable than maximizing CE. Why is it not the case? To understand why, let us analyze a synthetic case where we set communication and acquisition policies according to both CE and CLV and see how the firm’s decision and its profits change depending on whether it maximizes CLV or CE. To specify the various parameters of our synthetic case, let us assume that there are no fixed costs of communications (FC=0). This insures that any differences we find are due to the revenue and acquisition stream and not due to an allocation of fixed costs over a smaller or larger pool of customers. Further, let us use quadratic acquisition costs ( AQ ( g ) = g 2 ), revenues per communication that follow an exponential recovery process ( R (τ ) = r0 (1 − e − r1τ ), where r0 = 5 and r1 = 10), and a retention rate that follows a log-logistic function ( P (τ ) = αλ (λτ )α −1 /(1 + (λτ )α )2 , where α = 2 and λ = 3/4). These functional forms are arbitrary; they are chosen as simple widely used expressions that conform to the assumptions made in section 4.2. Finally, let us assume that the firm is a startup and thus use a relatively high discount rate of 20% per annum (Haenlein, Kaplan, and Schoder 2006). Using these equations and the solver function in Excel, it is straight* forward to find the τ that maximizes the CE (τ CE ) and compare it to the * τ that maximizes the CLV (τ CLV ). We can then compute the CE that * * would be generated if one used τ CLV rather than τ CE . This static comparison will help us better understand the ramification of maximizing CLV rather than CE. The results of the maximization with regards to CE and CLV are shown in table 4.1. As one can see, maximizing CE leads to an intercommunication interval that is 13% larger than if one were to maximize the
Moving from Customer Lifetime Value to Customer Equity
105
Table 4.1 Static comparison of CE and CLV maximizing strategies
CE
CLV
Diff (CE— CLV)
% Diff (Diff CLV)
Optimal communication interval ( τ * ) Retention rate (P) Revenue per communication per customer Non-discounted lifetime revenues Customer lifetime value (CLV)
0.802
0.708
0.094
13.33%
0.487 4.998 9.734 8.623
0.485 4.996 9.694 8.703
0.002 0.003 0.041
0.39% 0.05% 0.42%
−0.080
−0.92%
Number of customers acquired per year (g)
4.311
4.351
−0.040
−0.92%
18.589
18.935
6.734
5.975
−0.347 0.759
−1.83% 12.70%
33.657 1.247
29.848 1.413
135.898
132.989
Yearly acquisition spending (AQ) Steady state customer base (( S ) Profit per communication # of communications per year Value of the firm (CE)
3.808
12.76%
−0.166 2.909
−11.76% 2.19%
CLV. This leads to a slightly larger retention rate and larger revenue per communication per customer, but as expected a lower CLV (1% smaller). And thus smaller acquisition budgets (–2%) and fewer new customers acquired every year (–1%). However, because the retention rate is larger and fewer communications are sent with the larger τ when maximizing CE, this approach results in a larger pool of customers at any point in time (+13%). This leads to a higher value of the firm (+2%) despite the fewer opportunities to generate revenues (−12% contacts per year). This shows the problem with maximizing CLV as a business practice. CLV favors generating profits now at the expense of future profits. This is rational as future profits are worth less than current ones. However, this comes at the expense of lower retention rates. This is factored in the CLV framework on a cohort-by-cohort basis. However, what the CLV framework does not take into account is that lower retention rates don’t just mean that the firm will extract less revenue from newly acquired customers in the future; they also mean that the firm currently has fewer of its past customers to extract revenue from in the present. To better understand how CLV neglects the impact of past cohorts on present revenues, consider figures 4.1a and 4.1b. Figure 4.1a represents the evolution of one cohort. Figure 4.1b represents the make up
106
Xavier Drèze and André Bonfrer
100
Database size
80
60
40
20
0
Y1
Y2
Y3
Y4
Y5
Y6
Y7
Y8
Y9
Y10
Figure 4.1a Cohort size over time.
100 C10 80
C9
Database size
C8 C7
60
C6 C5
40
C4 C3 C2
20
0
C1
Y1
Y2
Y3
Y4
Y5
Figure 4.1b Database size over time (by cohorts).
Y6
Y7
Y8
Y9
Y10
Moving from Customer Lifetime Value to Customer Equity
107
of the database as new cohorts are acquired and old ones shrink. Focusing on the tenth year (Y10), one can see that at any point in time, the database contains the names acquired as part of the latest cohort, plus the names from the previous cohort that survived one period, plus the names of the cohort before that survived two periods, and so on. Hence, by collating at any one time the individuals retained across cohorts, we can recreate the profile of any single cohort across time. Thus, the revenues generated by the database in each period are equal to the undiscounted revenues generated by one cohort over its entire lifetime. To maximize profits for a given acquisition policy, the firm should maximize the undiscounted cash flow generated from an individual rather than the discounted cash flow (i.e., the CLV). Referring * produces a higher back to table 4.1, we can see that although the τ CLV * CLV than τ CE, it does produce lower undiscounted lifetime revenues (by 0.42%), which, coupled with a smaller steady state customer base, yields a lower value for the firm. A side benefit of the fact that at any point in time the profile of the customer base is equivalent to the profile of a cohort over its lifetime is that it allows us to relax the assumptions of constant retention rate and revenues. Indeed, looking at figure 4.2a we see a cohort profile where retention is not constant over the life of its members. Retention is low in the first year, high for the next four years, then steady for the 30
Cohort size
25 20 15 10 5 0
1
2
3
4
5
6
Years Figure 4.2a Cohort size—uneven attrition.
7
8
9
10
108
Xavier Drèze and André Bonfrer
100
Database size
90
C10
80
C9
70
C8 C7
60
C6
50
C5
40
C4
30
C3 C2
20
C1
10 0
1
2
3
4
5
6
7
8
9
10
Years Figure 4.2b Database size—uneven attrition.
remaining years. This uneven retention violates our assumption of constant retention rate that is necessary to simplify the infinite sum of future revenues into the simple equation that is (9). However, as is demonstrated by figure 4.2b, the database size in this case is still equal to the sum of the cohort size over its life, and the revenues generated by the database in one period are still equal to the revenues generated by a cohort over its life. Thus, lemma 1 will still hold in that there is a fixed long-term size to the database (the actual size will not be
τg 1 − P(τ ) as P is not constant over time, but it can still be calculated, as we show by example in section 4.6.1). Further, one can still use the first order condition developed in equation (12) to maximize the database value. Finally, proposition 4 still applies, and we can compute the maximum acquisition spending by computing the CLV of each cohort. 4.5.2 The Path to Steady State Skeptics might argue that our customer equity framework might work well for an established company that already has a customer base and
Moving from Customer Lifetime Value to Customer Equity
109
20 S(CE) S(CLV) Rev(CLV) Rev(CE)
18 16 14 12 10 8 6 4 2 0
0
1
2
3
4
5
6
7
8
9
10
Figure 4.3 Customer base size and revenues.
does indeed derive revenues from both its new and old customers. But what of a startup company? It does not have ‘old’ customers. Wouldn’t a CLV maximization approach allow it to spend more on customer acquisition and thus grow faster? To show this is not the case, and that the CE maximization approach primes over a CLV approach even away from steady state, we use the synthetic example developed in the preceding section and plot the growth in database size and revenues over time for our hypothetical firm assuming it starts with no customers at time 0. As one can see in figure 4.3, the CE approach grows the customer base faster and to a higher level than the CLV approach. Revenues follow a similar pattern. As a bonus, the startup firm will need to invest less in customer acquisition to reach the faster growth. Thus, it will need to raise less capital and provide a larger ROI than if it were to follow a CLV maximization strategy. These two benefits should make raising capital easier. 4.6
Heterogeneity
An important assumption was made about customers being identical in terms of key constituents of the CE: the revenue function, the retention function, and acquisition policy. We assumed that the database
110
Xavier Drèze and André Bonfrer
value could be expressed as a function of an “average” consumer. This is a simplified view of the derivation of many of the results, but we need to understand how robust the results are if we relax this assumption. First note that marketers deal with two types of heterogeneity: observed and unobserved. This leads to two questions. First, what happens when customers are treated as identical when they are not (unobserved heterogeneity)? Second, how does our framework change, and is it still applicable, when heterogeneity is observed and the marketer treats different customers differently, by sending different communications to different groups of customers, or by sending communications at different rates to different customers? 4.6.1 Unobserved Heterogeneity Customers can be heterogeneous in their retention probability (P) and/or their expected return from each communication (R). If consumers have identical retention probabilities, but are heterogeneous in their expected return, then, if all customers are treated as being identical, the average expected return ( R ) can be used to maximize profits. Indeed, let f(R) be the probability density function of R. The expected database value is obtained by integrating out customer equity across customers: E[Vce ] = ∫ ( RS − FC )
e rτ e rτ = − f ( R ) dR RS FC . ( ) e rτ − 1 e rτ − 1
Taking the same approach is a little more complicated for heterogeneity in the retention rate. Indeed, P appears on the denominator of CE through S . Thus, going back to Little’s Law, we need to compute:
τ .g 1 ⎤ S = τ . g.E ⎡⎢ . ≠ ⎥ ⎣ 1 − P ⎦ 1 − E[P] The difference stems from the fact that customers with higher retention rates stay in the database longer than customers with lower retention rates. For instance, if we assume that the acquisition efforts yield a stream of customers whose retention rate has a Beta distribution, B(α , β ) with β > 1 ,3 then we have: S = τ g∫
1
0
1 Γ(α + β ) f (P) dP = τ g ∫ Pα −1 (1 − P)β − 2 dP. 0 Γ(α )Γ(β ) 1− P
Moving from Customer Lifetime Value to Customer Equity
111
Since Γ ( n + 1) = nΓ(n) , we have: S = τg
τ .g (α + β − 1) 1 Γ(α + β − 1) α −1 P (1 − P)β − 2 dP = , ∫ 0 α (β − 1) Γ(α )Γ(β − 1) 1− (α + β − 1) =1
α as the expected value of a α + β −1 Β(α , β − 1). Hence, if the heterogeneity in retention rate in the acquisition stream is characterized by a Β(α , β ) , then the marketer should optimize its database using the expected value of a Β(α , β − 1) as its average P. where we recognize the term
4.6.2 Observed Heterogeneity A firm faced with observed heterogeneity can adapt its marketing actions in two ways. The traditional method used by marketers is to segment customers into mutually exclusive segments based on their purchase behavior, and then treat each customer segment as homogenous. Airlines, with their tiered frequent flyer programs, are classic examples of this approach. Proponents of Customer Relationship Management (CRM) propose a more radical approach, where each customer is able to receive an individually tailored marketing communication. It is easy to see that the CE framework is appropriate for the traditional customer segmentation approach. One can optimize each segment’s marketing communications and overall value separately, as a function of each segment’s acquisition, margins, and retention rates. One can also easily compute the benefits of having more or less segments by comparing the segment values in an N segment world to those in an N+1 segment world. The tradeoff one makes when increasing the number of segments is that revenues should increase with the number of segments as one can better customize the marketing communications when the segments become more homogenous; but, communication costs will also increase as different communication must be created for different segments. This naturally leads to an optimal number of segments. The promises of CRM are to make the customization costs so low that it now becomes feasible to have segments of size one. If this promise were realized, one could argue that a CLV model is more appropriate than a CE model. Indeed, if segments are of size one then
112
Xavier Drèze and André Bonfrer
they truly are depletable. When the segment member defects, the segment is depleted. Newly acquired customers constitute their own segments rather than replenishing pre-existing segments. If all decision variables (acquisition spending, communication frequency, communication content, etc.) are optimized at the individual level independent of other individuals, then the CLV approach might be the correct one. However, that is not what CRM actually entails. What CRM proposes to do in practice is to develop a set of rules or models that, when applied to the information known about the firm’s customer, yield a customized marketing message (Winer 2001). The nature and the amount of customization vary across applications. For instance, American Airlines uses CRM to select which WebFares to advertise to each of its frequent flyers in its weekly emails. Citibank uses CRM to direct callers on its toll free lines to the sales person that is most adept at selling the type of products that their predictive model selected as a likely candidate for cross-selling. Amazon.com uses CRM to try to up-sell its customers by offering them bundles of books they might be interested in. In each of these examples, different customers will experience different offerings. Different frequent flyers will be alerted to different promotional fares depending on the cities they fly most; different readers will be offered different book bundles depending on the books they are searching for. Nevertheless, the set of rules or models that are used to generate these offers will be identical for all frequent flyers, or all Amazon customers. (In other words, the output will be different, but the process will be identical.) Further, the costs of creating the rules and communication templates will be shared across all individuals rather than borne individually. As such, the CE framework developed here still applies. The CRM objectives will be to develop the set of rules that maximize overall CE rather than any specific individual’s CLV. When doing so, it is critical to use the appropriate objective function. This will be achieved by considering the impact of the CRM efforts on retention and amounts spent and maximizing the long-term database value just as has been done in this paper with the intercommunication interval (τ ). 4.7
Managerial Implications and Conclusion
The lifetime value of a firm’s customer base depends on both the acquisition process used by the firm and the marketing actions targeted at these customers. To take both aspects into account, we draw on
Moving from Customer Lifetime Value to Customer Equity
113
the theory of optimal resource management to revisit the concept of customer lifetime value. Our contention is that customers are a renewable resource, and hence marketing actions directed at acquiring, and extracting revenues from, customers should take a long-term valuemaximization approach. Thus, we study the implications of moving from Customer Lifetime Value maximization to Customer Equity maximization. In answer to the first questions raised in the introduction (Does CLV maximization equate to CE maximization?), our findings indicate that disregarding future acquisition leads to suboptimal revenue extraction strategies (proposition 1). This will then lead to suboptimal acquisition strategies since acquisition expenditures are set according to revenues (proposition 4). Following on from this, our second research question addresses what is the proper benchmark to use to guide customer acquisition policy. We find that the firm is able to generate more profits by spending less on acquiring a new customer if it were to use the CE approach, than if it were to use a CLV approach. To answer the final research questions raised—What is the appropriate metric? How is it computed and maximized?—we first need to consider how the database value is impacted by actions of the marketer. As we summarized in proposition 2, our model of customer equity directly accounts for the impact of marketing actions on the value of a firm’s customer assets. This allows us to derive the optimal actions for the firm, and we derive the long-term steady-state size of a firm’s customer base (lemma 1). Our first-order condition for the maximization of customer equity (proposition 2) shows that a firm should adapt its marketing actions to changes in the critical factors that impact database value. The strength of the first-order condition derived in proposition 2 is that it is easy to optimize empirically. The firm can run a series of tests to estimate the various elasticities and infer from them if it should increase or decrease its communication periodicity. This is simplified by the fact that customer equity has a unique maximum. The first-order condition also lends itself well to static comparisons (propositions 3a to 3d). This exercise shows how firms are attempting to balance two conflicting objectives: database growth and database harvesting. Whenever there is a change in the firm’s performance or in the environment that is beneficial to the firm (e.g., higher return from acquisition, better retention rate, lower interest rate), the firm should adjust its actions in a way that favors harvesting at the expense of growth. Conversely, if the change is detrimental (e.g., lower return per campaign, higher
114
Xavier Drèze and André Bonfrer
costs), the firm must counteract this change by leaning more toward growth rather than harvesting. We finish the paper with a discussion of customer heterogeneity. We show that our model is robust to unobserved heterogeneity. The learning point there is that ignoring heterogeneity in customers’ valuation of the communication is less of an issue than ignoring customer heterogeneity in retention rate. We recognize several limitations inherent in this study. We made specific assumptions regarding the relationship between customer retention and the firm’s communication strategy, both in terms of timing (through τ ) and in content (through the relationship between A and P) that could be relaxed in future work. Further work also needs to be done to incorporate the impact of competitive actions and reactions on the measurement of customer equity. This might potentially be done by folding our approach with a theoretical model such as the one developed by Fruchter and Zhang (2004). We believe, however, that the framework presented in this study is a useful starting point for such a competitive analysis. Appendix A: A.1
Proof of Lemma 1 and Proposition 1
Proof of Lemma 1
Let P (τ ) be the proportion of the database that is retained from one campaign to the next, given that τ is the fixed inter-campaign time interval. Let the acquisition stream be g(τ ) = τ . g . If the firm begins with a database size at S0 ≠ S , we show that the long-term stationary value of the database is still S . To find this value, solve the Law of Motion for the size of the database as: S = S .P(τ ) + τ . g =
τ .g . 1 − P(τ )
What if the database size is not at the stationary value? Then, if τ is constant, the state variable converges to the stationary value. To see this, pick any arbitrary value for the starting size of the database— e.g., pick any ε ≠ 0 such that Si < S or Si > S , as in: Si =
τ .g + ε. 1 − P(τ )
Moving from Customer Lifetime Value to Customer Equity
115
so that for the next period:
τ .g ⎛ τ .g ⎞ Si + 1 = Si P(τ ) + τ . g = ⎜ + ε ⎟ P(τ ) + τ . g = + ε .P(τ ) ⎝ 1 − P(τ ) ⎠ 1 − P(τ ) and for any n>0, Si + n =
τ .g + ε P(τ )n 1 − P(τ , k )
and lim Si + n = S , since P(τ ) ∈ (0 , 1) and therefore ε P(τ )n → 0 as n → ∞ . n →∞
A.2
Maximization of Customer Value versus Customer Equity
Assume that we look at the equilibrium conditions so that Si = S . The value of an individual customer name is defined as: ∞ FC ⎞ ⎛ FC ⎞ e rτ Vclv = ∑ e − irτ P(τ )i ⎛ R(τ ) − = R(τ ) − . τ r ⎝ S ⎠ ⎝ S ⎠ e − P(τ ) i=0
The database value, is defined as: ∞
Vce = ∑ e − irτ ( R(τ )S − FC ) = ( R(τ )S − FC ) i=0
e rτ . e rτ − 1
Hence, we have the database value as a function of the customer value: Vce = Vclv S
e rτ − P(τ ) . e rτ − 1
To maximize, we differentiate with respect to τ : ∂Vce e rτ − P(τ ) ∂Vclv ∂ e rτ − P(τ ) = S rτ + Vclv S rτ . ∂τ e − 1 ∂τ ∂τ e −1 Hence, maximizing the CLV and the CE will be identical if: ∂Vce e rτ − P(τ ) ∂Vclv ∂ e rτ − P(τ ) = S rτ +V S rτ =0 clv ∂τ ∂τ ∂τ e − 1 e− 1 ≠ 0 0?
≠0
The fourth term ⎛ ∂ e rτ − P(τ ) ⎞ ⎜⎝ S rτ ⎟ ∂τ e −1 ⎠
=0
=0 ?
116
Xavier Drèze and André Bonfrer
will be equal to 0 iff:
ηS + η e rτ − P(τ ) = 0 e rτ − 1
,
or
ηS = −η e rτ − P(τ ) . e rτ − 1
We know that
ηS = 1 +
∂P(τ ) τ , ∂τ 1 − P(τ )
we now need to compute
η e rτ − P(τ ) : e rτ − 1
∂ e rτ − P(τ ) = ∂ τ e rτ − 1 =
∂P(τ ) rτ rτ ∂τ − ( e − P(τ )) re 2 rτ e −1 (e rτ − 1)
re rτ −
1 ⎛ rτ ∂P(τ ) ( e rτ − P(τ )) re rτ ⎞ re − − ⎟⎠ . e rτ − 1 ⎜⎝ ∂τ e rτ − 1
Thus: 1 ⎛ rτ ∂P(τ ) ( e rτ − P(τ )) re rτ ⎞ τ re − − ⎠⎟ e rτ − P(τ ) e − 1 ⎜⎝ e rτ − 1 ∂τ e rτ − 1 ⎡ ⎤ ⎛ ⎞ ∂P(τ ) τ 1 − P(τ ) = − ⎢τ re rτ ⎜ rτ + ⎥. rτ rτ ⎟ ∂τ e − P(τ ) ⎦ ⎝ ( e − 1) ( e − P(τ )) ⎠ ⎣
η e rτ − P(τ ) = e rτ − 1
rτ
This means that ηS and −η e rτ − P(τ ) are both affine transformations of e rτ − 1
∂P(τ ) ∂P(τ ) , and thus will be equal for all iff both their intercepts and ∂τ ∂τ their slopes are equal, or: ⎛ ⎞ 1 − P(τ ) 1 = τ re rτ ⎜ rτ and rτ ⎝ ( e − 1) ( e − P(τ )) ⎟⎠
τ τ . = 1 − P(τ ) e rτ − P(τ )
Moving from Customer Lifetime Value to Customer Equity
117
the second condition gives us that they will be equal only when τ = 0. Applying l’Hospital Rule to the first condition we find that, at the limit for τ → 0 , the first condition is also satisfied. Hence, this shows that it ∂ e rτ − P(τ ) S rτ will only be for τ = 0 that . And thus, maximizing the Vclv ∂τ e −1 and the Vce lead to the same optimal only when τ * = 0 , which cannot happen since at τ = 0 the database value is negative (Vce (0) = −∞). QED Appendix B:
Maximum Derivation
We handle the maximization of the database value in three steps. First, we derive the first-order condition that needs to be satisfied for a τ to be optimal. Second we show that such a τ exists. And third, we provide conditions under which the maximum is known to be unique. B.1
First-Order Condition
To derive the first-order condition related to the maximization of customer equity with respect to the intercommunication time, we seek the point at which the derivative of (B-1) with respect to τ is null. We do so in the following steps: CE(τ ) = ( R(τ ).S (τ ) − FC )
e rτ er − AQ e rτ − 1 er − 1
(B-1)
Let Ω(τ ) = R(τ ).S(τ ), hence: ∂CE(τ ) ∂ e rτ = (Ω(τ ) − FC ) rτ ∂τ ∂τ e −1 =
⎡ re rτ e rτ ∂Ω(τ ) re 2 rτ ⎤ − + (Ω(τ ) − FC ) ⎢ rτ 2 ⎥ e − 1 ∂τ ⎢⎣ e − 1 ( e rτ − 1) ⎥⎦
=
e rτ ⎡ ∂Ω(τ ) e rτ ⎞ ⎤ ⎛ + r (Ω(τ ) − FC ) ⎜ 1 − rτ ⎟ ⎢ ⎝ e − 1 ⎣ ∂τ e − 1⎠ ⎥⎦
=
e rτ Ω(τ ) ⎡ Ω(τ ) − FC ⎤ ηΩ + ηD . ⎢ rτ e −1 τ ⎣ Ω(τ ) ⎥⎦
rτ
rτ
Further, since Ω(τ ) = R(τ ).S(τ ) , then ηΩ = ηR + ηS . Hence: ∂CE(τ ) e rτ R(τ ).S (τ ) ⎡ R(τ ).S (τ ) − FC ⎤ = rτ ⎢ηR + ηS + ηD R(τ ).S (τ ) ⎥ ∂τ e −1 τ ⎦ ⎣
(B-2)
118
Xavier Drèze and André Bonfrer
If we restrict ourselves to cases where the optimal database value is positive (otherwise the firm would not engage in database driven marketing) then we have R(τ ) > 0 and R(τ )S (τ ) − FC > 0 and thus, at the maximum, the following first-order condition needs to be satisfied:
ηR + ηS + ηDGM = 0
(B-3)
R(τ ).S (τ ) − FC is the gross margin generated by each R(τ ).S (τ ) communication. where GM =
B.2
Change in Retention Probabilities
B.2.1 ∂P(τ )/ ∂τ The retention sensitivity ( ∂P(τ )/ ∂τ ) only affects the FOC through ηS . An increase in retention sensitivity will lead to an increase in ηS as: ∂ηS ∂ ⎡ ∂P(τ ) τ ⎤ 1+ = ∂P(τ ) ∂P(τ ) ⎢⎣ ∂(τ ) 1 − P(τ ) ⎥⎦ ∂(τ ) ∂(τ ) τ = > 0. 1 − P(τ ) This increase in ηS will lead the firm to increase its τ to reach maximum profits. B.2.2 Intercept-shift in P(τ ) We look here at the change in FOC resulting from an intercept-shift increase in retention probabilities. That is: P1 (τ ) = P(τ ) + p0 ∂P1 (τ ) ∂P(τ ) = . ∂τ ∂τ Since R(τ ) and D are both independent from P(τ ) , we have ∂ ηR ∂ηD = 0 and = 0 . Further: ∂p0 ∂p0
Moving from Customer Lifetime Value to Customer Equity
∂ηS ∂ = ∂p0 ∂p0
119
τ ⎤ ⎡ ∂P1 (τ ) ⎢1 + ∂τ 1 − P (τ ) ⎥ ⎦ ⎣ 1
⎤ ∂P(τ )τ ∂ ⎡ 1 ⎢ ∂τ ∂p0 ⎣ 1 − P(τ ) − p0 ⎥⎦ τ ∂P(τ ) = . 2 (1 − P(τ ) − p0 ) ∂τ
=
(B-4)
Hence, ∂ηS / ∂p0 has the same sign as ∂P(τ )/ ∂τ . For small τ , where ∂P(τ )/ ∂τ is positive, the intercept-shift will have a positive impact on ηS . For large τ , where ∂P(τ )/ ∂τ is negative, the impact will be negative. For GM we have: ∂GM ∂ R(τ ).S (τ ) − FC = ∂p0 ∂p0 R(τ ).S (τ ) ∂ FC =− ∂p0 R(τ ).S (τ ) =
(B-5)
FC ∂S (τ ) . 2 R(τ )S (τ ) ∂p0 >0
And:
τ .g ∂S (τ ) ∂ = ∂p0 ∂p0 1 − P(τ ) − p0 τ .g > 0. = (1 − P(τ ) − p0 )2
(B-6)
Hence, an intercept-shift increase in P(τ ) leads to an increase in GM that leads to a decrease in FOC. Putting equations (B-4), (B-5), and (B-6) back into the FOC, we have that an intercept-shift increase in P(τ ) will lead to higher τ * if:
τ .g FC τ ∂P(τ ) > ηD 2 (1 − P(τ ) − po ) ∂τ (1 − P(τ ) − po )2 R(τ )S (τ )2 g ∂P(τ ) FC > ηD ∂τ S (τ ) R(τ ).S (τ ) FC ∂P(τ ) τ > ηD ∂τ 1 − P R(τ ).S (τ ) FC + 1. ηS > η D R(τ ).S (τ )
(B-7)
120
B.3
Xavier Drèze and André Bonfrer
Change in Revenue per Contact
B.3.1 Change in ∂A(τ )/ ∂τ The revenue sensitivity affects the FOC through both ηR and ηS . We have: ∂ ηR ∂ ∂A(τ ) τ = ∂A(τ ) ∂A(τ ) ∂τ A(τ ) − VC ∂ ∂ ∂τ ∂τ τ = > 0. A(τ ) − VC and ∂ηS ∂ τ ⎤ ⎡ ∂P(τ ) = 1+ ⎢ ∂A(τ ) ∂A(τ ) ⎣ ∂τ 1 − P(τ ) ⎥⎦ ∂ ∂ ∂τ ∂τ ∂ τ ∂ ⎛ A(τ ) − c ⎞ ⎤ ⎡ = f 1+ ∂A(τ ) ⎢⎣ 1 − P(τ ) ∂τ ⎝ τ ⎠ ⎥⎦ ∂ ∂τ ∂f ( x ) ∂ ⎡ A(τ ) − c ⎤ ⎤ ∂ τ ⎡ = 1+ ⎢ ⎥⎦ ⎥⎦ ∂A(τ ) ⎣ 1 − P(τ ) ∂x ∂τ ⎢⎣ τ ∂ ∂τ ⎡ ⎛ ∂A(τ ) ⎞⎤ ∂ f ( x ) ⎜ ∂τ ⎢ A(τ ) − c ⎟ ⎥ ∂ τ − = 1+ ∂A(τ ) ⎢ 1 − P(τ ) ∂x ⎜ τ τ 2 ⎟ ⎥⎥ ⎢ ∂ ⎜⎝ ⎟⎠ ⎦ ∂τ ⎣ ∂f ( x ) 1 = > 0. 1 − P(τ ) ∂x Hence, the database sensitivity to τ increases when the sensitivity of the revenue increases, creating a compounding effect that leads the firm to increase its optimal sending rate (τ * ). B.3.2 Intercept-shift in A(τ ) An intercept-shift in A(τ ) will be felt through ηR , ηS , and GM. Thus, we compute the following: A1 (τ ) = A(τ ) + a0 ∂A1 (τ ) ∂A(τ ) = ∂τ ∂τ
Moving from Customer Lifetime Value to Customer Equity
∂ ηR ∂ ∂A(τ ) τ = ∂a0 ∂a0 ∂τ A(τ ) − VC ∂A(τ ) −τ = ∂τ ( A(τ ) − VC )2 −ηR = 0 τ ∂x ∂
∂P(τ ) ∂τ = ∂ ⎡ ∂f ( x ) ⎛ ∂A(τ ) 1 − A(τ ) − c ⎞ ⎤ τ 2 ⎠ ⎥⎦ ∂a0 ∂a0 ⎢⎣ ∂x ⎝ ∂τ τ 1 ∂f ( x ) =− 2 0 1 − P(τ ) τ ∂x
∂GM ∂ FC =− ∂a0 ∂a0 R(τ ).S (τ ) ∂S (τ ) ⎞ ⎛ ∂R(τ ) .S (τ ) + R(τ ). ⎜ ∂a0 ∂a0 ⎟ = FC ⎜ ⎟ 2 R(τ ).S (τ )) ( ⎜ ⎟ ⎜⎝ ⎟⎠ R(τ ).S (τ ) 1 ∂f ( x) ⎞ ⎛ S (τ ) + ⎜ 1 − P(τ ) τ ∂x ⎟ = FC ⎜ ⎟ 2 R(τ ).S (τ )) ⎜ ⎟ ( ⎜⎝ ⎟⎠ =
FC ⎛ 1 1 1 ∂f ( x ) ⎞ ⎜⎝ R(τ ) + 1 − P(τ ) τ ∂x ⎠⎟ > 0 R(τ ).S (τ )
121
122
Xavier Drèze and André Bonfrer
∂ηS ∂ = ∂a0 ∂a0
⎡ ∂P(τ ) τ ⎤ ⎢1 + ∂τ 1 − P(τ ) ⎥ ⎣ ⎦ ∂P(τ ) ∂P(τ ) ∂ τ ∂P(τ ) ∂a0 τ ∂ +τ = 1 − P(τ ) ∂a0 ∂τ ( 1 − P(τ ) )2 −1 ∂f ( x) −1 ∂f ( x) ∂P(τ ) τ 2 ∂x τ = +τ 1 − P(τ ) τ 2 ∂x ∂τ ( 1 − P(τ ) )2 ∂f ( x ) ⎡ −1 ∂P(τ ) ⎤ . = (1 − P(τ )) + τ (1 − P(τ ))2 ∂x ⎣⎢ ∂τ ⎦⎥
Hence, ∂ηS / ∂a0 is negative for small to moderate levels of τ * (i.e., τ * : 1 − P(τ * ) > −∂P(τ * )/ ∂τ ) and positive for larger τ * . Thus, for small to moderate τ * , the negative impacts on ηR , ηS and the positive impact on GM will both yield a smaller τ * . For large τ * , the net impact might be positive. The point at which the effect reverses itself is given by:
τ:
−ηR −1 ⎛ ∂P(τ ) − τ (1 − P(τ ))⎞ ∂f ( x ) + 2 ⎝ ⎠ ∂x R(τ ) τ (1 − P(τ )) ∂τ + ηD
FC ⎛ 1 1 1 ∂f ( x ) ⎞ ⎜⎝ R(τ ) + 1 − P(τ ) τ ∂x ⎟⎠ = 0 R(τ ).S (τ )
.
This expression is not tractable, but a lower-bound on τ * is given by:
τ* :
∂P(τ ) = τ (1 − P(τ )) ∂τ
or τ * : ηS = τ 2 + 1 . Appendix C
Optimal Acquisition Policy
Let us assume that the firm currently acquires g customers per period for a cost of AQ(g) and is considering acquiring one more. This will cost ΔAQ = AQ ( g + 1) − AQ( g ) . Between communications it will acquire τg customer at a cost of τΔAQ. This is worth doing if the net present value (NVP) of the additional expenses is lower than the NPV of the additional revenues. The NPV of additional expenses is equal to
Moving from Customer Lifetime Value to Customer Equity
123
e rτ . e rτ − 1
τΔAQ
Ignoring any impact of g on τ * , the increase in revenue comes from the increase in database size: Δg = 1 ⇒ ΔS =
τ . 1− P
However, we do not go from S( g ) to S( g + 1) in one period, it only builds over time. The change in database size can be written as i
Si = τ ∑ P i , with ΔS∞ = ΔS . j=0
The change in database size affects only the revenues, it does not affect the fixed cost. Thus, we can write the change in revenue as i ∞ ∞ ∑ Pj RΔSi j=0 ΔCE = ∑ irτ = Rτ ∑ . If we write the long form of the sums, e irτ i=0 e i=0 we have: ∞
∑ i=0
∑
i j=0
Pj
e irτ
= 1+
(1 + P )
+
(1 + P + P 2 ) + (1 + P + P 2 + P 3 ) + …
e rτ e 2 rτ e 3 rτ 2 3 P P P P P2 P3 1 ⎛ ⎞ = 1 + rτ + 2 rτ + 3 rτ + … + rτ ⎜ 1 + rτ + 2 rτ + 3 rτ + …⎟ ⎠ e e e e ⎝ e e e P P2 P3 1 ⎛ ⎞ + 2 rτ ⎜ 1 + rτ + 2 rτ + 3 rτ + …⎟ ⎠ e ⎝ e e e P P2 P3 1 ⎛ ⎞ + rτ ⎜ 1 + rτ + 2 rτ + 3 rτ + …⎟ +… ⎠ e ⎝ e e e
P P2 P3 e rτ , If we substitute D for the series 1 + rτ + 2 rτ + 3 rτ +… = rτ e e e e −P then we can rewrite the sums as: ∞
∑ i=0
∑
i j=0
Pj
e irτ
= D+
e rτ D D D = D + + + … e rτ − 1 e rτ e 3 rτ e 3 rτ
Thus, ∞
ΔCE = Rτ ∑ i=0
∑
i j=0
e
irτ
Pj
= Rτ
e rτ e rτ rτ e − P e −1 rτ
124
Xavier Drèze and André Bonfrer
and: ΔCE > NPV ( ΔAQ ) ⇔ Rτ ⇔R
e rτ e rτ e rτ > τ Δ AQ e rτ − P e rτ − 1 e rτ − 1
e rτ > ΔAQ ⇔ CLV > ΔAQ e −P rτ
In other words, the optimal acquisition policy from a profit perspective is to spend up to the CLV on acquisition. Notes This research was funded in part by the Wharton–SMU research center, Singapore Management University and in part by a WeBI–Mac Center Grant. This paper was originally published as Drèze, Xavier; & Bonfrer, André. (2009). “Moving from customer lifetime value to customer equity.” QME: Quantitative Marketing and Economics, 7 (3), pp. 289–320. The online version of this article (doi:10.1007/s11129-009-9067-y) contains supplementary material, which is available to authorized users. 1. NPV calculations traditionally use 1/(1+d), where d is the discount rate of money, as the discount factor. To make our derivations simpler, we use e − r instead. The two formulations are equivalent if we set r = ln(1+d). 2. To see this, note that at τ = 0 , rτ e rτ = e rτ − 1 = 0 , and for all τ > 0 , ∂ ∂ rτ rτ e rτ = re rτ + r 2τ e rτ > e − 1 = re rτ ∂τ ∂τ
.
3. β>1 is needed to ensure that f(1) = 0. If f(1) > 0, then some people will stay in the system forever, regardless of the firm’s actions. This would lead to degenerate solutions.
References Bayon, Thomas, Jens Gutsche, and Hans Bauer. 2002. “Customer Equity Marketing: Touching the Intangible.” European Management Journal 20 (3):213–222. Berger, Paul D., Ruth N. Bolton, Douglas Bowman, Elten Briggs, V. Kumar, A. Parasuraman, and Creed Terry. 2002. “Marketing Actions and the Value of Customer Assets: A Framework for Customer Asset Management.” Journal of Service Research 5 (1):39–54. Berger, Paul D., and Nada I. Nasr. 1998. “Customer Life Time Value: Marketing Models and Applications.” Journal of Interactive Marketing 12 (1):17–30. Blattberg, Robert C., and John Deighton. 1996. “Manage Marketing by the Customer Equity Test.” Harvard Business Review (July-August):136–144. Fader, Peter S., Bruce G.S. Hardie, and Ka Lok Lee. 2005. “RFM and CLV: Using Iso-Value Curves for Customer Base Analysis.” Journal of Marketing Research XLII (November): 415–430.
Moving from Customer Lifetime Value to Customer Equity
125
Farris, Paul W., Neil T. Bendle, Phillip E. Pfeifer, and David J. Reibstein. 2006. Marketing Metrics: 50+ Metrics Every Executive Should Master. New Jersey: Wharton School Publishing. Fruchter, Gila E., and John Z. Zhang. 2004. “Dynamic Targeted Promotions: A Customer Retention and Acquisition Perspective.” Journal of Service Research 7 (1):3–19. Gupta, Sunil, Donald R. Lehmann, and Jennifer Ames Stuart. 2004. “Valuing Customers.” Journal of Marketing Research XLI (February):7–18. Haenlein, Michael, Andreas M. Kaplan, and Detlef Schoder. 2006. “Valuing the Real Option of Abandoning Unprofitable Customers When Calculating Customer Lifetime Value.” Journal of Marketing 70 (July):5–20. Hwang, Hyunseok, Taesoo Jung, and Euiho Suh. 2004. “An LTV Model and Customer Segmentation Based on Customer Value: A Case Study on the Wireless Telecommunication Industry.” Expert Systems with Applications 26:181–188. Libai, Barak, Das Narayandas, and Clive Humby. 2002. “Toward an Individual Customer Profitability Model: A Segment-Based Approach.” Journal of Service Research 5 (1):69–76. Little, John D. C. 1961. “A Proof of the Queuing Formula: L = λW.” Operations Research 9 (May):383–387. Reinartz, Werner J., Jacquelyn S. Thomas, and V. Kumar. 2005. “Balancing Acquisition and Retention Resources to Maximize Customer Profitability.” Journal of Marketing 69 (January):63–79. Reinartz, Werner J., and V. Kumar. 2003. “The Impact of Customer Relationship Characteristics on Profitable Lifetime Duration.” Journal of Marketing 67 (January):77–99. Rust, Roland T., Katherine N. Lemon, and Valarie A. Zeithaml. 2004. “Return on Marketing: Using Customer Equity to Focus Marketing Strategy.” Journal of Marketing 68 (January):109–127. Sweeny, James L. 1992. “Economic Theory of Depletable Resources: An Introduction.” In Handbook of Natural Resources and Energy Economics. vol. 3. Ed. A. V. Kneese and J. L. Sweeney. North-Holland. Venkatesan, Rajkumar, and V. Kumar. 2004. “A Customer Lifetime Value Framework for Customer Selection and Resource Allocation Strategy.” Journal of Marketing 68 (October):106–125. Villanueva, Julian, Shijin Yoo, and Dominique M. Hanssens. 2008. “The Impact of Marketing-Induced Versus Word-of-Mouth Customer Acquisition on Customer Equity Growth.” Journal of Marketing Research XLV (February):48–59. Wang, Paul, and Ted Spiegel. 1994. “Database Marketing and its Measurement of Success: Designing a Managerial Instrument to Calculate the Value of a Repeat Customer Database.” Journal of Direct Marketing 8 (2):73–81. Winer, Russell S. 2001. “A Framework for Customer Relationship Management.” California Management Review 43 (Summer):89–105. Zeithaml, Valarie A., Roland T. Rust, and Katherine N. Lemon. 2001. “The Customer Pyramid: Creating and Serving Profitable Customers.” California Management Review 43 (4):118–142.
5
Deriving Customer Lifetime Value from RFM Measures: Insights into Customer Retention and Acquisition Makoto Abe
5.1
Introduction
Three customer measures—recency (time since most recent purchase), frequency (number of prior purchases), and monetary value (average purchase amount per transaction), or RFM—have been used extensively in Customer Relationship Management (CRM) with great success. At the heart of CRM is the concept of customer lifetime value (CLV), a long-term approach to identifying profitable customers and cultivating relations. The three customer measures of RFM serve the purpose well by capturing past buying behaviors parsimoniously without burdening a firm’s data storage limits. Although not all firms maintain each customer’s purchase history, most still accumulate customer RFM data (Hughes 2000). This suggests that rich, long-term customer information is condensed into these three measures (Buckinx and Van den Poel 2005). Despite the importance of CLV and the usefulness of RFM measures, however, the literature provides conflicting findings on their relationship. In their predictive analysis, Malthouse and Blattberg (2005) generally found that frequency and monetary measures were positively correlated with CLV, while recency was negatively correlated with CLV. Blattberg, Malthouse, and Neslin (2009), in their empirical generalization article on CLV, further state that practitioners share a similar sentiment. However, they also point out at least five academic studies that have conflicting results on the association of RFM with CLV, leading them to conclude that there is a need for further research on the issue. Here, one must be careful to note that RFM themselves are not behavioral constructs, but mere statistical measures that are observed as a result of the underlying customer behavior. In particular, recency is strongly affected by the analyst’s selected time frame. For the same
128
Makoto Abe
recency, frequent customers are more likely to be inactive than infrequent ones, thereby resulting in different CLVs. Frequency is different from purchase rate. Purchase rate considers a customer’s purchases only when that customer is active. In contrast, frequency counts purchases during the entire observation period that can include a customer’s inactive duration. Hence, observed RFM measures are interrelated in a complex manner, as demonstrated by the iso-value contour plot (Fader, Hardie, and Lee 2005). A more rigorous approach to investigating the relationship between RFM and CLV is to infer the underlying behavior processes from RFM measures and then seek their association with CLV. In a non-contractual setting, appropriate behavior traits that can be derived from RFM measures as sufficient statistics are “purchase rate,” “lifetime,” and “spending per transaction,” which will be denoted as PLS throughout the paper (Fader et al. 2005). Such model-based studies of customer behavior with PLS, however, still generate contradictory findings. Table 5.1 shows a partial list of such research. For instance, Reinartz and Kumar (2003) find that monetary value (monthly spending) is positively related to lifetime duration. Borle, Singh, and Jain (2008), in a discrete-time contractual setting, found a positive relationship between purchase rate and lifetime, a negative relationship between purchase rate and spending per purchase, and no correlation between lifetime and spending. Singh, Borle, and Jain (2009) show that purchase rate is negatively associated with lifetime and spending per purchase, whereas lifetime and spending have a positive relationship, although the consequence of investigating only one of the two gamma parameters (the shape instead of the usual scale parameter) is unknown. Fader et al. (2005) did not find any Table 5.1 Conflicting Findings on the Correlation among Purchase Rate, Lifetime, and Spending Purchase Rate and Lifetime
Purchase Rate and Spending
Lifetime and Spending
Schmittlein & Peterson (1994) Reinartz & Kumar (2003) Reinartz et al. (2005) Fader et al. (2005) Borle et al. (2008)
N/A N/A + 0 +
0 N/A N/A 0 −
+ + N/A N/A 0
Singh et al. (2009)
−
−
+
Deriving Customer Lifetime Value from RFM Measures
129
relationship between purchase rate and average spending in their database, although they noted that its empirical generalization also requires more studies. So why is understanding the relationships among purchase rate, lifetime, and spending important? Borle et al. (2008), for example, found that frequent shoppers spend less per purchase. If that is the case, an obvious question is whether recruiting frequent shoppers leads to increased or decreased CLV. The answer has a potentially severe consequence for managerial decision-making in CRM, where incorrect decisions could be detrimental to the firm. To assess the net influence on CLV accurately, one must tradeoff the relative magnitudes of increase in purchase rate with the decrease in spending, as well as account for any associated change in lifetime. The objectives of this research are threefold. The first is, from the observed RFM measures of a customer, to develop an individual-level CLV model that identifies the underlying customer traits of PLS, which are then linked to CLV. By evaluating the individual difference in the intensity of PLS, one can obtain insights into their correlation, and, thus, accurately assess their net impacts on CLV. The model accounts for both observed (characteristics) and unobserved customer heterogeneity. Figure 5.1 illustrates the framework of our approach. The second objective using this model is to obtain normative implications for marketing programs that maximize the CLV of existing customers. In particular, we investigate implementable programs for customer retention that is most effective in terms of marketing Return on Investment (ROI), such as what action needs to be taken to which customers at which timing. The third objective is, by applying our CLV model to prospective customers, to gain insight into acquisition strategy. Without transaction data, RFM measures of prospective customers do not exist, and, therefore, we cannot calculate their CLV. By examining the relationship between PLS and the demographic characteristics of existing customers, we can infer the PLS of prospective customers from their demographic information alone, and, hence, infer their CLV. The paper is organized as follows. Section 5.2 discusses the general approach of our model, followed by a detailed specification that relates RFM measures to PLS, which are then linked to CLV. Then customerspecific metrics that are useful for CRM, such as expected lifetime, one-year survival rate, etc., are derived from the model. Section 5.3 describes the model estimation. Section 5.4 presents the empirical
130
Makoto Abe
Observed measures
R (recency)
F (frequency)
M (monetary value)
Behavior traits P (purchase rate) Model L (lifetime)
CLV
S (spending)
Customer characteristics Figure 5.1 Underlying customer traits are derived from the observed RFM measures, which are in turn linked to CLV. The model accounts for both observed (characteristics) and unobserved customer heterogeneity.
analysis with actual customer data. Finally, section 5.5 concludes the paper with summary, limitations, and future directions. 5.2
Model
5.2.1 The General Approach Our approach is to construct a Pareto/NBD-based stochastic model of buyer behavior (Schmittlein, Morrison, and Colombo 1987, hereafter referred to as SMC) for estimating CLV in a “non-contractual” setting. In this case, a customer being “alive” or “dead” is inferred from recencyfrequency data through simple assumptions regarding purchase behavior. The CLV research using such a Pareto/NBD model includes Fader, Hardie, and Lee 2005 (hereafter referred to as FHL), Reinartz and Kumar 2003, and Schmittlein and Peterson 1994 (hereafter referred to as SP). Furthermore, for disaggregate modeling, we adapt the hierarchical Bayes extension of Pareto/NBD proposed by Abe 2009 to manage “heterogeneity.” Table 5.2 highlights our methodology in comparison with that of SP and FHL. For recency-frequency data, both SP and FHL adopt a
Deriving Customer Lifetime Value from RFM Measures
131
Table 5.2 Comparison with Existing Methods. Empirical Bayes Model Data
Model
Individual Behavior
Heterogeneity Distribution
RF (recencyfrequency)
Pareto/NBD (SMC 1987)
M (monetary)
normal-normal (SP 1994)
Poisson purchase (λ) Memoryless dropout (μ) Normal spending (mean θ)
λ ~ Gamma μ ~ Gamma λ and μ independent θ ~ Normal θ, λ, μ independent
gamma-gamma (FHL 2005)
Gamma spending (scale ν)
ν ~ Gamma ν, λ, μ independent
Proposed Hierarchical Bayes Model
Data
Model
Individual Behavior
RF (recencyfrequency)
Poisson/exponential (Abe 2009)
M (monetary)
lognormal-lognormal
Poisson purchase (λ) Memoryless dropout (μ) Lognormal spending (location η)
Heterogeneity Distribution λ, μ, η ~ MVL λ, μ, η correlated
Pareto/NBD model that presumes a Poisson purchase and an exponential lifetime processes whose parameters are independently distributed as gamma. For monetary data, SP posit a normal-normal model, whereby purchase amounts on different occasions within a customer are normally distributed with the mean following a normal distribution in order to capture customer heterogeneity. FHL use a gamma-gamma model, whereby the normal distributions within and across customers in SP are replaced by gamma distributions. Each of these methodologies can be characterized as an individual-level behavior model whose parameters are compounded with a mixture distribution to capture customer heterogeneity. This in turn is estimated by an empirical Bayes method. The proposed methodology posits the same behavioral assumptions as SP and FHL, but captures customer heterogeneity through a more general mixture distribution to account for the interdependence among the three behavior processes of PLS using the Hierarchical Bayes (HB) framework. Before describing the model in detail, let us summarize the reasons for our methodology.
132
Makoto Abe
(1) Assumptions on customer behavior are minimal and posited in various past studies. (2) Aggregation can be managed easily without resorting to integration, and individual-level statistics can be obtained as a byproduct of Markov Chain Monte Carlo (MCMC) estimation. (3) It produces correct estimates of standard errors, thereby permitting accurate statistical inference even when the asymptotic assumption does not apply. (4) It is easy to incorporate covariates. 5.2.2 Model Assumptions This section describes the assumptions of the proposed HB model. Individual Customer Assumption 1: Poisson purchases. While active, each customer makes purchases according to a Poisson process with rate λ. Assumption 2: Exponential lifetime. Each customer remains active for a lifetime, which has an exponentially distributed duration with dropout rate μ. Assumption 3: Lognormal spending. Within each customer, the amounts of spending on purchase occasions are distributed as lognormal with location parameter η. Assumptions 1 and 2 are identical to the behavioral assumptions of a Pareto/NBD model. Because their validity has been studied by other researchers (FHL; Reinartz and Kumar 2000, 2003; SMC; SP), the justification is not provided here. Assumption 3 is specified because (1) the domain of spending is positive; and (2) inspection of the distributions of spending amounts within customers reveals a skewed shape resembling lognormal. As described previously, SP and FHL assume normal and gamma, respectively, to characterize the distribution of spending amounts within a customer. Heterogeneity across Customers Assumption 4: Individuals’ purchase rates λ, dropout rates μ, and spending parameters η follow a multivariate lognormal distribution. Assumption 4 permits correlation among purchase rate, lifetime, and spending parameters. Because Assumption 4 implies that log(λ), log(μ), and log(η) follow a multivariate normal, estimation of the variance-covariance matrix is tractable using a standard Bayesian method. A Pareto/NBD combined with either a normal-normal (SP)
Deriving Customer Lifetime Value from RFM Measures
133
or gamma-gamma (FHL) spending model posits independence among the three behavioral processes. We will assess the impact of this independence assumption through comparison between multivariate and independent lognormal heterogeneity. The impact of assuming a different heterogeneity shape (lognormal rather than gamma) is an empirical issue, which will be evaluated by comparing independent lognormals of the HB model with independent gammas of Pareto/NBD. 5.2.3 Mathematical Notations For recency and frequency data, we will follow the standard notations {x, tx, T} used by SMC and FHL. Lifetime starts at time 0 (when the first transaction occurs and/or the membership starts) and customer transactions are monitored until time T. x is the number of repeat transactions observed in the time period (0, T), with the last purchase (xth repeat) occurring at tx. Hence, recency is defined as T–tx. τ is an unobserved customer lifetime. For spending, sn denotes the amount of spending on the nth purchase occasion of the customer under consideration. Using these mathematical notations, the previous assumptions can be expressed as follows: ⎧ (λ T )x − λ T e ⎪⎪ P[x|λ ] = ⎨ x ! x ⎪ (λτ ) e − λτ ⎪⎩ x ! f (τ ) = μ e − μτ
if τ > T
x = 0 , 1, 2,..
if τ ≤ T
τ ≥0
log(sn )~N (log(η), ω 2 )
(1)
(2) sn > 0
⎛ ⎡ σ λ2 ⎡ log(λ )⎤ ⎡θ λ ⎤ ⎢log(μ )⎥ ~ MVN ⎜ θ = ⎢θ ⎥ , Γ = ⎢σ ⎢ ⎥ ⎜ 0 ⎢ μ ⎥ 0 ⎢ μλ ⎢⎣σ ηλ ⎢⎣ log(η) ⎥⎦ ⎢⎣θη ⎥⎦ ⎝
(3)
σ λμ σ μ2 σ ημ
σ λη ⎤⎞ σ μη ⎥⎥⎟ ⎟ σ η2 ⎥⎦⎠
(4)
where N and MVN denote univariate and multivariate normal distributions, respectively. ω2 is the variance of logarithmic spending amounts within a customer. 5.2.4 Expressions for Transactions, Sales, and CLV Given the individual level parameters for (λ, μ), the expected number of transactions in arbitrary time duration w, E[X(w)|λ, μ], is shown by evaluating E[ψ] as:
134
Makoto Abe
E[X(w)|λ , μ ] = λ E[ψ ] =
λ (1 − e − μw ) μ
where ψ = min(τ ,w).
(5)
The expected sales during this period w is simply the product of the expected number of transactions shown in equation (5) and the expected spending E[sn|η, ω] as: E[sales(w)|λ , μ , η , ω ] = E[sn |η , ω ]E[X(w)|λ , μ ] = ηeω
2 /2
λ (1 − e − μw ). μ
(6)
For CLV, we define “value” to be synonymous with “revenue” because margin and cost information is unknown in this study. The general formula of CLV for an individual customer under a continuous time framework, as appropriate for a Pareto/NBD model, is expressed as: ∞
CLV = ∫ V (t)R(t)D(t)dt , 0
where V(t) is the customer’s value (expected revenue) at time t, R(t) is the survival function (the probability that a customer remains active until at least t), and D(t) is a discount factor reflecting the present value of money received at time t (FHL; Rosset et al. 2003). Translating to assumptions 1–3, they are V (t) = λ E[sn ] where E[sn ] = η exp(ω 2 / 2) from the definition of lognormal, and R(t) = exp(− μt). With continuously compounded discounting of an annual interest rate d, D(t) = exp(−δ t), where δ = log(1 + d) with the time unit being a year. Therefore, our CLV reduces to the following simple expression: ∞
∞
0
0
CLV = ∫ V (t)R(t)D(t)dt = ∫ ληeω
2
2 /2
e − μt e −δ t dt =
ληeω /2 μ +δ
(7)
Hence, if we can estimate λ, μ, η, and ω for each customer from RFM data, we can compute CLV as in equation (7). 5.2.5 Incorporating Customer Characteristics To control extraneous factors and gain insight into acquisition, we would like to relate customer characteristic variables for customer i, di (a K × 1 vector) to customer specific parameters λi , μi , and ηi. A straightforward extension of assumption 4 expressed in equation (4) results in a multivariate regression specification as follows: ⎛ ⎡ σ λ2 ⎡ log(λi )⎤ ⎢log(μ )⎥ ~ MVN ⎜ θ = Bd , Γ = ⎢σ i i 0 ⎢ μλ ⎢ ⎥ ⎜ i ⎢⎣σ ηλ ⎢⎣ log(ηi ) ⎥⎦ ⎝
σ λμ σ μ2 σ ημ
σ λη ⎤⎞ σ μη ⎥⎥⎟ ⎟ σ η2 ⎥⎦⎠
(8)
Deriving Customer Lifetime Value from RFM Measures
135
where B is a 3 × K matrix of coefficients. When di contains a single element of 1 (i.e., no characteristic variables), the common mean, θ0 = θi for all customers i, is estimated. 5.2.6
Elasticities
EλCLV =
∂CLV / CLV μ , EηCLV = 1 = 1, EμCLV = − ∂λ / λ μ +δ
(9)
Equation (9) implies that a one percent increase in the purchase rate or spending parameter causes a one percent increase in CLV, whereas a one percent decrease in the dropout rate leads to less than a one percent increase in CLV (with the magnitude depending on the discount rate δ). Under a high interest rate, the impact of prolonging lifetime on CLV is not as rewarding because future customer value would be discounted heavily. The effect of customer characteristics on CLV can be decomposed into purchase rate, lifetime, and spending processes to provide further managerial insight. Defining dik as the kth (continuous) characteristic of customer i, from equations (8) and (9), the elasticity becomes: ∂CLVi / CLVi ∂dik / dik ⎡ ∂CLVi ∂λi ∂CLVi ∂μi ∂CLVi ∂ηi ⎤ dik + =⎢ + ∂ηi ∂dik ⎥⎦ CLVi ∂μi ∂dik ⎣ ∂λi ∂dik
EdCLV = ik
(10)
bμ k μi ⎤ ⎡ = ⎢bλ k − + bηk ⎥ dik μi + δ ⎦ ⎣ CLV CLV = Epdik + Eldik + EsdCLV ik where blk (l ∈ {λ , μ , η}, k = 1,..., K ) denotes (l, k)th element of matrix B. Applying the chain rule, the derivative with respect to dik through λi , μi , and ηi , results in the sum of three elasticities, EpdCLV , EldCLV , and ik ik CLV Esdik due, respectively, to purchase rate, lifetime, and spending. 5.3
Estimation
In the previous section, simple expressions for the customer processes of purchase rate, lifetime, and spending (and thus CLV) are derived from the basic behavioral assumptions 1, 2, and 3. To account for customer heterogeneity, the HB approach is adopted to bypass the complex aggregate expressions of the compounding mixture distribution.
136
Makoto Abe
5.3.1 Hierarchical Bayesian Estimation
The purchase rate and lifetime parts adopt the HB extension of the Pareto/NBD model proposed by Abe (2009). It is estimated by MCMC simulation through a data augmentation method. Because information about a customer being active (z = 1) or not at time T is unknown, and, if not active, the dropout time (y < T) is also unknown, z and y are treated as latent variables. Both z and y are randomly drawn from their posterior distributions. The RFM data of a customer are denoted as x, tx, T, and as, where x, tx, and T are defined as in SMC, and as represents the average spending per purchase occasion. Without knowledge of the spending variation within a customer from one purchase to another, however, there is no means to infer the variance of logarithmic spending, ω2, specified in equation (3), from RFM data alone. Because RFM provides cross-sectional measures, it contains information only on spending variation between customers. Since ω2 is easy to obtain from panel data, here we assume that it is common across customers and estimate it from historical data. Assumption 3 permits standard normal conjugate updating in Bayesian estimation, whereby the posterior mean is a precision-weighted average of the sample and prior means. For this method to work, however, we need the mean of log(spending) (or, equivalently, the logarithm of the geometric mean of spending amounts) for each customer, whereas the M part of RFM data provides only the arithmetic mean of spending, as. Following equation (3), $\log(s_n) \sim N(\log(\eta), \omega^2)$ implies $E[s_n] = \exp(\log(\eta) + \omega^2/2)$. Replacing the expectation $E[s_n]$ and $\log(\eta)$ by their respective sample means, $\frac{1}{x}\sum_{n=1}^{x} s_n$ and $\frac{1}{x}\sum_{n=1}^{x} \log(s_n)$, the following approximation is obtained:

$$\frac{1}{x}\sum_{n=1}^{x} \log(s_n) \cong \log(as) - \frac{\omega^2}{2} \tag{11}$$
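As a quick, hedged check of equation (11), one can simulate lognormal spending and compare the exact mean of log spending with the approximation that uses only the arithmetic mean as; the parameter values below are arbitrary illustrations, not the chapter's data.

```python
import numpy as np

rng = np.random.default_rng(0)
log_eta, omega2, x = np.log(0.04), 0.895, 30        # assumed values for one customer
s = rng.lognormal(mean=log_eta, sigma=np.sqrt(omega2), size=x)

exact = np.log(s).mean()                            # mean log spending needed for conjugate updating
approx = np.log(s.mean()) - omega2 / 2              # equation (11), computable from "as" alone
print(exact, approx)
```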
When we evaluated the accuracy of this approximation with the department store FSP data of 400 customers used in section 5.4, the correlation between the actual values and the approximation was 0.927, and the mean absolute percentage error was 6.8%.

5.3.2 Prior Specification
Reinstating the customer index i, let us denote the customer-specific parameters as $\phi_i = [\log(\lambda_i), \log(\mu_i), \log(\eta_i)]'$, which is normally distributed
with mean θi = Bdi and variance-covariance matrix Γ0 as in equation (8). Our objective is to estimate the parameters {φi, yi, zi, ∀i; B, Γ0} from the observed RFM data {xi, txi, Ti, asi; ∀i}, where the index for customer i is made explicit. In the HB framework, the prior of the individual-level parameter φi corresponds to the population distribution MVN(Bdi, Γ0). The priors for the hyperparameters B and Γ0 are chosen to be multivariate normal and inverse-Wishart, respectively:

$$\mathrm{vec}(B) \sim MVN(b_{00}, \Sigma_{00}), \qquad \Gamma_0 \sim IW(\nu_{00}, \Gamma_{00}) \tag{12}$$
These distributions are standard conjugate priors for multivariate regression models. The constants {b00, Σ00, ν00, Γ00} are chosen to provide very diffuse priors for the hyperparameters.

5.3.3 MCMC Procedure
We are now in a position to estimate the parameters {φi, yi, zi, ∀i; B, Γ0} using an MCMC method. To estimate the joint density, we sequentially generate each parameter, given the remaining parameters, from its conditional density until convergence is achieved. Because these conditional densities are not standard distributions, an independence MH algorithm is used.
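A minimal sketch of one building block of such a sampler is given below, under several simplifying assumptions: it integrates the latent dropout time out of the individual likelihood analytically rather than augmenting z and y as the chapter does, it uses a random-walk rather than an independence MH proposal, and it updates only one customer's (log λ, log μ) while holding the hyperparameters fixed. A full run would also draw B (multivariate regression), Γ0 (inverse-Wishart), and the spending parameter η by normal conjugacy. All variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_lik(phi, x, t_x, T):
    """Individual-level log-likelihood for phi = (log lambda, log mu) with the
    latent dropout time integrated out: the customer is either still active at T
    or dropped out somewhere in (t_x, T)."""
    lam, mu = np.exp(phi)
    a = lam + mu
    alive = x * np.log(lam) - a * T
    if T <= t_x:
        return alive
    dead = (x * np.log(lam) + np.log(mu) - np.log(a)
            - a * t_x + np.log1p(-np.exp(-a * (T - t_x))))
    return np.logaddexp(alive, dead)

def mh_step(phi, x, t_x, T, theta_i, gamma0_inv, scale=0.1):
    """One random-walk MH update of a customer's (log lambda, log mu) under the
    MVN(theta_i, Gamma_0) population prior of equation (8)."""
    def log_post(p):
        r = p - theta_i
        return log_lik(p, x, t_x, T) - 0.5 * r @ gamma0_inv @ r
    proposal = phi + scale * rng.standard_normal(2)
    accept = np.log(rng.uniform()) < log_post(proposal) - log_post(phi)
    return proposal if accept else phi

# Toy RFM summary (in weeks): x repeat purchases, recency t_x, observation length T
x, t_x, T = 8, 20.0, 26.0
theta_i = np.array([np.log(0.3), np.log(0.01)])   # assumed prior mean B d_i
gamma0_inv = np.eye(2)
phi = theta_i.copy()
for _ in range(2000):
    phi = mh_step(phi, x, t_x, T, theta_i, gamma0_inv)
print(np.exp(phi))                                # one posterior draw of (lambda_i, mu_i)
```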
5.4 Empirical Analysis
5.4.1 Frequent Shoppers Program Data for a Department Store
We now apply the proposed model to real data. This dataset contains shopping records from 400 members of a frequent shoppers program (FSP) at a large department store in Japan.1 These members had joined the FSP during the month of July 2000, and their transactions were recorded for fifty-two weeks. The first twenty-six weeks of data were used for model calibration, and the second twenty-six weeks were used for validation. The available customer characteristic variables were (a) Age, (b) Gender, and (c) Food, the fraction of store visits on which food items were purchased (which is a proxy for store accessibility). These same data were also used by Abe (2009), whose descriptive statistics are reported in table 5.3. The variance of log(spending) within a customer, ω2, was estimated to be 0.895 from panel data, as discussed in section 5.3.1.

5.4.2 Model Validation
The MCMC steps were put through 15,000 iterations, with the last 5,000 used to construct the posterior distribution of parameters.
Table 5.3 Descriptive Statistics for the Department Store Data

                                        mean      std. dev.   minimum   maximum
Number of repeat purchases              16.02     16.79       0         101
Length of observation T (days)          171.24    8.81        151       181
Recency (T − tx) (days)                 24.94     42.82       0         181
Average purchase amount (×10^5 yen)     0.067     0.120       0.0022    1.830
Food                                    0.79      0.273       0         1
Age                                     52.7      14.6        22        87
Female                                  0.93      0.25        0         1
Table 5.4 shows the results of three nested HB models: Independent (variance-covariance matrix Γ0 is diagonal, without covariates), Correlated (general Γ0 without covariates), and Full (general Γ0 with all covariates). Correlated models with subsets of covariates are not reported here because the Full model had the best marginal log-likelihood. The performance of the Full HB model was evaluated with respect to the number of transactions and spending, obtained from equations (5) and (6), respectively, in comparison to the benchmark Pareto/NBD-based model. The expected number of transactions predicted by the Pareto/NBD was multiplied by average spending asi to obtain customer i's spending. Figure 5.2 shows the aggregate cumulative purchases over time. Both models provide good fit in calibration and good forecasts in validation, which are separated by the vertical dashed line. With respect to the mean absolute percentage error (MAPE) between predicted and observed weekly cumulative purchases, the HB performed better for validation (1.3% versus 1.9%) and equal to the benchmark Pareto/NBD-based approach for calibration (2.5% for both). Fit statistics at the disaggregate level provide more stringent performance measures. Figure 5.3 shows the predicted number of transactions during the validation period, averaged across individuals and conditional on the number of purchases made during the calibration period. Figure 5.4 compares the predicted total spending during the validation period in a similar manner. Both figures visually demonstrate the superiority of the HB over the Pareto/NBD-based model.
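For concreteness, the aggregate fit statistic quoted above can be computed as in the short sketch below; the weekly cumulative series are hypothetical placeholders, not the study's data.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error between observed and predicted weekly
    cumulative repeat purchases."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs(predicted - actual) / actual)

observed = [5200, 5650, 6100, 6580, 7040]    # hypothetical cumulative totals
hb_pred  = [5150, 5700, 6170, 6520, 7100]
print(round(mape(observed, hb_pred), 2))
```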
Table 5.4 Estimation Results of HB Models (department store)

                               Independent              Correlated               Full
Purchase rate log(λ)
  Intercept                    −0.82 (−0.93, −0.72)^a   −0.81 (−0.92, −0.71)     −2.03 (−2.52, −1.51)
  Food                         —                        —                        1.50*^b (1.11, 1.89)
  Age                          —                        —                        −0.21 (−0.84, 0.40)
  Female (male = 0)            —                        —                        0.15 (−0.20, 0.48)
Dropout Rate log(μ)
  Intercept                    −6.24 (−7.03, −5.52)     −6.13 (−7.10, −5.56)     −5.03 (−6.49, −3.57)
  Food                         —                        —                        −1.09 (−2.66, 0.26)
  Age                          —                        —                        −0.34 (−2.35, 1.49)
  Female (male = 0)            —                        —                        0.01 (−1.20, 1.38)
Spending Parameter log(η)
  Intercept                    −3.59 (−3.67, −3.51)     −3.57 (−3.64, −3.49)     −3.23 (−3.61, −2.86)
  Food                         —                        —                        −1.34* (−1.62, −1.06)
  Age                          —                        —                        1.18* (0.71, 1.65)
  Female (male = 0)            —                        —                        0.11 (−0.15, 0.39)
correlation(log(λ), log(μ))    —                        −0.33 (−0.59, 0.01)      −0.24 (−0.51, 0.09)
correlation(log(λ), log(η))    —                        −0.28* (−0.39, −0.17)    −0.14* (−0.25, −0.01)
correlation(log(μ), log(η))    —                        −0.01 (−0.31, 0.27)      −0.07 (−0.35, 0.24)
marginal log-likelihood        −2111                    −2105                    −2078

a. Figures in parentheses indicate the 2.5 and 97.5 percentiles.
b. * indicates significance at the 5% level.
Figure 5.2 Weekly Cumulative Repeat Transaction Plot. The vertical dashed line separates the calibration and forecast periods. Both the proposed HB and the Pareto/NBD models provide a good fit to the aggregate cumulative purchases.
Figure 5.3 Conditional Expectation of Future Transactions (x-axis: number of transactions in weeks 1–26; y-axis: average number of transactions in weeks 27–52). The proposed HB model gives a better forecast than the Pareto/NBD model of the number of transactions during the validation period, averaged across individuals and conditional on the number of purchases made during the calibration period.
Figure 5.4 Conditional Expectation of Future Spending (x-axis: number of transactions in weeks 1–26; y-axis: average sales in weeks 27–52). The proposed HB model gives a better forecast than the Pareto/NBD model of the amount of spending during the validation period, averaged across individuals and conditional on the number of purchases made during the calibration period.
Table 5.5 compares the correlation and mean squared error (MSE) between prediction and observation with respect to the number of transactions and total spending at the individual customer level during the calibration and validation periods. The difference between the Pareto/NBD and Independent models, aside from Empirical Bayes (EB) versus HB estimation, can be attributed to the different assumptions on the heterogeneity distributions for λ and μ: independent gammas (EB) or independent lognormals (HB). The slight advantage of the Independent model over the Pareto/NBD in predicting spending seems to justify the lognormal heterogeneity for this dataset. The effect of relaxing the independence assumption and of incorporating the covariates is reflected, respectively, by the difference between the Independent and Correlated models and between the Correlated and Full models. Because all HB models perform similarly, the improvement in fit from accommodating correlation and covariates is minor. In sum, the Full HB model seems to fit and predict well in comparison to the Pareto/NBD-based model, in terms of the number of transactions and spending at both the aggregate and disaggregate levels.
Table 5.5 Disaggregate Fit of Pareto/NBD and HB Models (department store)

                              Pareto/NBD   Independent   Correlated   Full
Spending
  Correlation   validation    0.80         0.83          0.83         0.83
                calibration   0.99         0.99          0.99         0.99
  MSE           validation    0.39         0.35          0.35         0.35
                calibration   0.02         0.06          0.06         0.06
Transactions
  Correlation   validation    0.90         0.90          0.90         0.90
                calibration   1.00         1.00          1.00         1.00
  MSE           validation    57.7         57.1          57.0         56.5
                calibration   1.22         4.61          4.06         3.92
However, the difference is minor. Considering that both models use a Bayesian method (HB versus EB) but assume different priors, the result suggests that the estimated posterior distribution is driven mainly by the data. The real advantage of the HB approach is in interpretation rather than prediction, as will be shown in the subsequent sections.

5.4.3 Insights into Existing Customers
Interpretation of the Model Estimation
Having established the validity of the HB model, let us now examine table 5.4 to interpret the estimation result. Food, the fraction of store visits on which food items were purchased and a proxy for store accessibility, is the most important covariate, with a significant positive coefficient for purchasing (log(λ)) and a significant negative coefficient for spending (log(η)). Managerially, food buyers tend to shop more often, but spend a smaller amount on each purchase. This finding is consistent with the story told by a store manager: although food buyers spend a smaller amount on each shopping trip, they visit the store often enough to be considered vital customers. Another significant covariate for log(η) is Age, signifying that older customers tend to spend more at each purchase. Let us now turn our attention to the relationships among the purchase rate (λ), dropout rate (μ), and spending (η) parameters. To check whether the independence assumption of the Pareto/NBD is satisfied, the correlation of Γ0 must be tested on the intercept-only model
Figure 5.5 Scatter Plot of Posterior Means of λ and η (x-axis: λ; y-axis: η). Each point corresponds to one of the 400 households estimated. Note the negative correlation between purchase rate (λ) and spending (η).
(Correlated), but not the covariate model (Full). The reason is that if the covariates explain the correlation among λ, μ, and η completely, then no correlation remains in the error term as captured by Γ0. First, table 5.4 indicates that the correlation between log(λ) and log(μ) is not significantly different from zero, implying that the assumption of the Pareto/NBD holds here. Second, the correlation between log(λ) and log(η) is significantly negative (−0.28), consistent with the Food variable having opposite signs on log(λ) and log(η) in the Full model. Figure 5.5 presents the scatter plot of the posterior means of the individual λi and ηi (i = 1, ..., 400). One can visually observe the correlation. Hence, the assumption of independence between the transaction and spending components in the Pareto/NBD-based model (SP and FHL) does not hold in this dataset. For researchers using the SP and FHL models, this finding emphasizes the importance of verifying the independence assumption (as was done in SP and FHL). Managerially, this negative correlation implies that a frequent shopper tends to spend a smaller amount on each purchase. Furthermore, this correlation remains even after accounting for differences in customer
characteristics, specifically Food (store accessibility), Age, and Gender, as seen from the correlation for the Full model. No correlation was found between purchase rate and lifetime or between lifetime and spending per purchase.

Customer-Specific Metrics and CLV
Table 5.6 presents nine customer-specific metrics for the top and bottom ten customers in terms of CLV, along with the average, minimum, and maximum for the entire sample of 400 customers: posterior means of λi, μi, and ηi; expected lifetime; survival rate after one year; probability of being active at the end of the calibration period; the expected number of transactions (using equation (5)); expected total spending during the validation period (using equation (6)); and CLV (using equation (7)). In computing CLV, an annual interest rate of 15% (δ = 0.0027 per week) was assumed, as in FHL. There exists much heterogeneity across customers despite the use of Bayesian shrinkage estimation. The mean expected lifetime is 10.0 years, with a maximum and minimum of 24.7 and 1.3 years, respectively. The probability of being active at the end of the calibration period ranges from 0.18 to 1.0, with an average of 0.93. Over the validation period of twenty-six weeks, the expected number of transactions was 16.0, with a total spending amount of 74,000 yen on average (divide by 100 for an approximate conversion to US dollars). CLV ranges from 40,000 yen to 10.2 million yen, with an approximate average of 0.69 million yen. Figure 5.6 shows a gain chart (solid line) in which customers are sorted in decreasing order of CLV, and the cumulative CLV (y-axis, where the total CLV is normalized to 1) is plotted against the number of customers (x-axis). In addition, two benchmark gain charts are plotted. The dashed curve is based solely on the recency criterion, whereby customers are sorted in order of increasing recency (from most recent to least recent). The dotted line is a gain chart based on customers ordered according to the sum of the three rankings of recency, frequency, and monetary value. The 45-degree dashed line corresponds to the cumulative CLV for randomly ordered customers. This figure implies that the recency criterion alone is not sufficient to identify good customers, although many companies use this criterion. On the other hand, combined use of the three measures (recency, frequency, and monetary value), even with the naïve equal weighting scheme, seems to provide a rather accurate ordering of CLV. This finding strongly
Table 5.6 Customer-Specific Metrics for Top and Bottom 10 Customers (department store)

ID     mean(λ)   mean(μ)    mean(η)   Expected    1-year     P(active at      Expected no. of      Expected total      CLV
                                      lifetime    survival   end of           transactions in      spending in val.    (×10^5 yen)
                                      (years)     rate       calibration)     validation period    period (×10^5 yen)
1      3.20      0.00165    0.075     24.7        0.926      1.000            81.5                 9.61                102.0
2      1.98      0.00173    0.056     21.5        0.922      1.000            50.3                 4.38                45.4
3      1.15      0.00205    0.097     19.9        0.910      0.999            29.2                 4.43                45.1
4      2.55      0.00188    0.036     22.1        0.918      1.000            64.7                 3.60                37.2
5      1.11      0.00338    0.088     11.3        0.862      0.997            27.7                 3.80                33.9
6      2.23      0.00191    0.034     22.4        0.916      1.000            56.7                 3.01                31.0
7      2.86      0.00206    0.027     18.6        0.910      0.999            72.5                 3.01                30.3
8      1.10      0.00202    0.067     19.3        0.910      0.996            27.8                 2.93                29.6
9      2.19      0.00206    0.034     19.7        0.909      1.000            55.5                 2.91                29.3
10     0.87      0.00273    0.090     13.8        0.886      0.999            21.9                 3.09                29.0
…
391    0.29      0.01218    0.011     3.3         0.665      0.379            1.7                  0.03                0.6
392    0.15      0.00750    0.016     5.7         0.754      0.803            2.7                  0.07                0.6
393    0.29      0.01151    0.011     3.4         0.666      0.381            1.8                  0.03                0.6
394    0.10      0.03915    0.049     1.4         0.463      0.436            0.7                  0.05                0.6
395    0.38      0.02974    0.009     2.1         0.555      0.182            0.5                  0.01                0.6
396    0.10      0.04307    0.044     1.4         0.480      0.450            0.7                  0.05                0.5
397    0.24      0.00586    0.008     6.5         0.786      0.951            5.5                  0.07                0.5
398    0.20      0.00699    0.009     6.1         0.762      0.862            4.1                  0.06                0.5
399    0.10      0.04713    0.034     1.3         0.454      0.420            0.7                  0.04                0.4
400    0.14      0.01709    0.016     2.2         0.581      0.601            1.6                  0.04                0.4
ave    0.66      0.00564    0.038     10.0        0.823      0.929            16.0                 0.74                6.9
min    0.07      0.00165    0.007     1.3         0.454      0.182            0.5                  0.01                0.4
max    3.78      0.04713    0.207     24.7        0.926      1.000            96.1                 9.61                102.0
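The customer-specific metrics in table 5.6 follow directly from the posterior draws of (λi, μi, ηi). The sketch below is a hedged illustration of how such metrics might be computed; it assumes draws are available as arrays, applies the transformations before averaging (plugging in posterior means instead would understate quantities such as expected lifetime because of the nonlinearity in μ), and uses equation (A1) from the appendix for the probability of being active.

```python
import numpy as np

OMEGA2 = 0.895     # within-customer variance of log spending (section 5.3.1)
DELTA = 0.0027     # weekly discount rate corresponding to the 15% annual rate

def customer_metrics(lam, mu, eta, recency_weeks):
    """Table 5.6-style metrics from arrays of posterior draws for one customer."""
    lam, mu, eta = (np.asarray(v, float) for v in (lam, mu, eta))
    lifetime_years = np.mean(1.0 / mu) / 52.0
    survival_1yr = np.mean(np.exp(-52.0 * mu))
    clv = np.mean(lam * eta * np.exp(OMEGA2 / 2.0) / (mu + DELTA))   # CLV in levels, per equation (7)
    a = lam + mu
    p_active = np.mean(1.0 / (1.0 + (mu / a) * np.expm1(a * recency_weeks)))  # equation (A1)
    return lifetime_years, survival_1yr, p_active, clv

def customer_base_and_equity(p_active, clv):
    """Customer base = sum of active probabilities; customer equity = sum of
    CLV weighted by the probability of being active."""
    p, v = np.asarray(p_active, float), np.asarray(clv, float)
    return p.sum(), (p * v).sum()
```

Applied over all 400 customers, the second function corresponds to the aggregation discussed under Customer Base and Customer Equity below.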
Figure 5.6 Gain Chart for CLV Based on the HB Model, Simple Recency, and RFM Scoring (x-axis: number of customers, ordered by each criterion; y-axis: cumulative CLV, normalized so the total equals 1). RFM scoring, whereby the three measures (recency, frequency, and monetary value) are combined with equal weights, provides a rather accurate ordering of CLV, whereas ranking customers by recency alone is not very accurate.
supports the wide use of RFM analysis and regression-type scoring models among practitioners for identifying good customers (Malthouse and Blattberg 2005). While RFM measures can produce the relative ranking of customers well, the absolute CLV figure itself cannot be obtained without the proposed model.

Customer Base and Customer Equity
We can compute the expected number of active customers (the customer base) at the end of the calibration period, that is, January 1, 2001, from table 5.6. The customer base is the sum of the active probabilities (column 7) over all customers. Although the dataset contains 400 customers, the effective number of active customers on January 1, 2001 is only 371.6 (= 400 × 0.929). When customer turnover is high, the customer base becomes much smaller than the registered number of customers in the dataset. Customer equity is the expected CLV generated by all active customers at that time. By aggregating the products of CLV and the active probability on January 1, 2001, over all 400 customers, it becomes 273
Figure 5.7 Retention Response Function (x-axis: investment c, in units of 1,000 yen; y-axis: win-back probability r(c), with ceiling R = 0.8 and r(1,000 yen) = 0.5). An investment level of c per customer would bring an inactive customer back to active status with probability r(c). The function can be approximated by an exponential curve.
million yen. Customer base and customer equity are valuable long-term indicators for the firm, neither of which is provided by the accounting statements.

Retention Program for Existing Customers
Let us assume that, if a customer were indeed inactive, an investment level of c per customer would win her back with probability r(c). Such a response function r(c), as shown in figure 5.7, can be constructed by decision calculus (Blattberg and Deighton 1996; Little 1971). The figure implies that, by sending a coupon valued at 1,000 yen (i.e., investment c = 1,000), there is a 50% chance that an inactive customer can be brought back. However, no matter how large the coupon's value is, the firm cannot bring back inactive customers with a probability of more than 0.8. The effect of such a coupon on the change in CLV is derived in the appendix. We now consider two examples of customer-specific coupon mailing.

Customer Retention Example 1: Given that a coupon with a different value is sent to each customer on January 1, 2001, what level of c* maximizes the increase in her CLV? The second column of table 5.7 presents c* for the same twenty customers as in table 5.6 (i.e., the top ten and bottom ten in terms of CLV). The optimum coupon value varied across customers between 0 yen and 1,500 yen, and the average was 500 yen. For seventy-four out of 400 customers, c* = 0. This implies that, for these customers, a coupon increases their CLV only by an amount too small for the investment to be recouped. The optimal formula for c*, equation (A3) in the appendix, indicates that a higher coupon value is more effective for
Table 5.7 Customer-Specific Retention Action for Top and Bottom 10 Customers (department store)

ID     Coupon value   Recency to     P(active at end     CLV
       (yen)          wait (days)    of calibration)     (×10^5 yen)
1      407            3.5            1.000               102.0
2      235            6.5            1.000               45.4
3      510            7.8            0.999               45.1
4      221            5.9            1.000               37.2
5      623            7.0            0.997               33.9
6      0              6.9            1.000               31.0
7      385            5.8            0.999               30.3
8      661            10.1           0.996               29.6
9      186            6.9            1.000               29.3
10     372            9.6            0.999               29.0
391    916            60.5           0.379               0.6
392    684            110.8          0.803               0.6
393    911            62.6           0.381               0.6
394    887            54.0           0.436               0.6
395    949            37.8           0.182               0.6
396    864            55.2           0.450               0.5
397    372            96.3           0.951               0.5
398    561            106.7          0.862               0.5
399    822            66.2           0.420               0.4
400    713            120.4          0.601               0.4
ave    500            32.6           0.929               6.9
min    0              3.5            0.182               0.4
max    1500           120.4          1.000               102.0
customers with a higher CLV and a lower active probability, which is consistent with the general pattern of table 5.7.

Customer Retention Example 2: To increase CLV by 10,000 yen, how many non-purchase days should a firm wait before mailing a coupon with a face value of 500 yen? Here, the logic is as follows. If the active probability is sufficiently high, the investment would be wasted (as in equation (A3)). The active probability decreases as time passes (as in equation (A1)). But if the firm waited too long, an investment of 500 yen would no longer be sufficient to induce an increase in CLV of 10,000 yen. Therefore, there is an
optimum timing to step in with the coupon. The third column of table 5.7 shows this timing (i.e., the number of non-purchase days, or recency), which varies across customers from 3.5 days to 120.4 days. In general, earlier action is better for customers with a higher CLV. The pattern is consistent with equation (A4) in the appendix.

5.4.4 Insights into Prospective Customers
We now focus our attention on prospective customers for the purpose of acquisition. In particular, to assess the influence of demographic covariates on CLV, we examine the elasticity of these covariates on PLS and CLV.

Elasticity of Covariates on CLV
Table 5.8 reports the decomposition of the elasticity of CLV with respect to each covariate into purchase rate, lifetime, and spending components, as shown in equation (10).

Table 5.8 Decomposition of CLV Elasticity into Three Components (department store)

Accounting for Parameter Uncertainty
                           Food     Age      Female
Total                      0.53     0.63     0.21
purchase rate: Ep^CLV      1.17     −0.11    0.14
lifetime: El^CLV           0.41     0.12     −0.04
spending: Es^CLV           −1.05    0.63     0.10

Ignoring Parameter Uncertainty
                           Food     Age      Female
Total                      0.76     0.69     0.18
purchase rate: Ep^CLV      1.17     −0.11    0.14
lifetime: El^CLV           0.63     0.17     −0.06
spending: Es^CLV           −1.05    0.63     0.10

* Note that only the elasticity for lifetime, not those for purchase frequency or spending, differs when uncertainty is ignored. This is because, as shown in equation (10), only μ enters the elasticity formula in a nonlinear fashion.

To account for parameter uncertainty, the elasticity is computed for each of the 5,000 MCMC draws of blk and μi according to equation (10), and the result is then averaged over the 5,000 draws and 400 customers. When the posterior means of blk and μi are directly substituted into equation (10) (bottom panel of table 5.8) instead of averaging over the MCMC draws (top panel),
the elasticity with respect to the lifetime component is overestimated by about 50% (because of the nonlinearity in μi). This overestimation occurs even when customer heterogeneity is accounted for. One of the advantages of our Bayesian approach is that parameter uncertainty can be evaluated easily with the sampling-based estimation method. Ignoring parameter uncertainty and computing various statistics from point estimates, say the MLE, as if the parameters were deterministic could produce biased results and lead to incorrect managerial decisions (Gupta 1988).

Implication for New Customer Acquisition
To clarify the impact of covariates, the solid line in figure 5.8 plots the value of log(CLV) for different values of a covariate when the remaining two covariates are fixed at their mean values. These graphs are computed using the mean estimates of the coefficients of the Full model shown in table 5.4, assuming that all covariates are continuous. For the Female covariate, therefore, the graph should be interpreted as how log(CLV) varies when the gender mixture is changed from the current level of 93.3% female, while keeping the other two covariates unchanged. The dotted vertical line indicates the mean value of the covariate under consideration. Both Food and Age have strong influences on log(CLV), whereas Female exerts a very weak influence, consistent with the significance of these covariates. Figure 5.8 also decomposes the influence of the covariates on log(CLV) into three components: purchase rate, lifetime, and spending. Taking the logarithm of the basic formula of CLV in equation (7) results in the following summation expression:

$$\log(CLV) = -\log(\mu + \delta) + \log(\lambda) + \log(\eta) + \omega^2/2 = [\text{lifetime } (\mu) \text{ component}] + [\text{purchase rate } (\lambda) \text{ component}] + [\text{spending } (\eta) \text{ component}] + \text{constant}$$
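The stacked decomposition plotted in figure 5.8 can be generated directly from this expression. The sketch below is illustrative and hedged: it sweeps one covariate while, as a simplification, folding the mean contribution of the other covariates into the assumed intercepts, and it omits the per-component rescaling to 1 at the covariate mean that the figure applies.

```python
import numpy as np

def log_clv_components(d, b_lam, b_mu, b_eta, intercepts, omega2=0.895, delta=0.0027):
    """Lifetime, purchase-rate, and spending components of log(CLV) as one
    covariate d varies, per the summation expression above."""
    log_lam = intercepts[0] + b_lam * d
    log_mu = intercepts[1] + b_mu * d
    log_eta = intercepts[2] + b_eta * d
    lifetime = -np.log(np.exp(log_mu) + delta)
    purchase = log_lam
    spending = log_eta + omega2 / 2.0
    return lifetime, purchase, spending, lifetime + purchase + spending

# Sweep the Food share from 0 to 1 with the Full-model coefficients of table 5.4
food = np.linspace(0.0, 1.0, 11)
lifetime, purchase, spending, total = log_clv_components(
    food, b_lam=1.50, b_mu=-1.09, b_eta=-1.34, intercepts=(-2.03, -5.03, -3.23))
```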
Figure 5.8 Impact of Covariates on CLV Decomposed into Three Components (department store). The dotted vertical line indicates the mean value of the covariate under consideration. [panel 5.8a] Increasing the fraction of food buyers improves lifetime and purchase rate, but decreases spending per purchase, yielding a net increase in the overall CLV. [panel 5.8b] Increasing the fraction of elderly people increases the spending without much influence on lifetime and purchase rate, thereby resulting in a net increase in the overall CLV. [panel 5.8c] Increasing the fraction of females leads to little improvement in all three components, and, hence, a negligible increase in the overall CLV.
The graph can be interpreted as stacking the lifetime, purchase rate, and spending components from top to bottom, thus constituting the overall log(CLV). To account for the scale differences among these components, each was adjusted to equal 1 at the mean value of the covariate; therefore, log(CLV) = 3 at the dotted vertical line. The direction and magnitude of the effect of each covariate on the three components are consistent with the signs of the posterior means blk (l ∈ {λ, μ, η}, k = 1, ..., K). Increasing the fraction of food buyers improves lifetime and purchase rate, but decreases spending per purchase, yielding a net increase in the overall CLV. Increasing the fraction of elderly people increases spending without much influence on lifetime and purchase rate, thereby resulting in a net increase in the overall CLV. Increasing the fraction of females leads to little improvement in all three components and, hence, a negligible increase in the overall CLV. Elasticity decomposition provides managers with useful insights into acquisition. An effort to manipulate a certain customer characteristic might move the lifetime, purchase rate, and spending components in opposite directions, so that they partially cancel and produce a reduced total effect on CLV. For example, much of the improvement in purchase rate (from increasing the fraction of food buyers) is negated by the decline in spending per purchase; only the lifetime improvement provides a net contribution to CLV, as can be seen from table 5.8 and the nearly flat dashed line of figure 5.8. In contrast, an effort to increase the proportion of elderly people is met with a boost in CLV, due to increased spending per purchase with only a small negative influence on purchase rate. To build an effective acquisition strategy from these results, managers must strike a fine balance between desired customer characteristics (i.e., demographics), desired behavioral profiles (i.e., purchase rate, lifetime, and spending), the responsiveness (elasticity) of the characteristic covariates on CLV, and the acquisition cost of the desired target customers.

5.4.5 Second Dataset: Retail FSP Data for a Music CD Chain
The second dataset, which was also analyzed in FHL, contains the records of 500 customers in an FSP of a large music CD store chain. The period covers fifty-two weeks beginning September 2003. The following customer characteristics were used as covariates: the amount of the initial purchase, age, and gender.
Table 5.9 Estimation Results of Various Models (music CD chain)

                               Independent              Correlated               Full
Purchase rate log(λ)
  Intercept                    −2.11 (−2.19, −2.03)     −2.11 (−2.19, −2.03)     −2.10 (−2.34, −1.85)
  Initial amount               —                        —                        0.37* (0.11, 0.63)
  Age                          —                        —                        −0.26 (−0.87, 0.34)
  Female (male = 0)            —                        —                        −0.13 (−0.29, 0.03)
Dropout Rate log(μ)
  Intercept                    −5.18 (−5.63, −4.74)     −5.14 (−5.64, −4.72)     −5.06 (−5.89, −4.34)
  Initial amount               —                        —                        0.02 (−1.09, 0.94)
  Age                          —                        —                        −0.15 (−1.84, 1.39)
  Female (male = 0)            —                        —                        0.05 (−0.60, 0.64)
Spending Parameter log(η)
  Intercept                    −1.18 (−1.22, −1.13)     −1.18 (−1.22, −1.13)     −1.49 (−1.63, −1.35)
  Initial amount               —                        —                        0.50* (0.36, 0.65)
  Age                          —                        —                        0.47* (0.12, 0.82)
  Female (male = 0)            —                        —                        −0.03 (−0.10, 0.05)
correlation(log(λ), log(μ))    —                        0.20 (−0.02, 0.43)       0.19 (−0.04, 0.42)
correlation(log(λ), log(η))    —                        0.14* (0.01, 0.27)       0.10 (−0.05, 0.24)
correlation(log(μ), log(η))    —                        0.01 (−0.22, 0.24)       0.01 (−0.20, 0.22)
marginal log-likelihood        −2908                    −2906                    −2889

* indicates significance at the 5% level.
Table 5.9 reports the model estimates. Let us first examine the Full HB model, which has the highest marginal log-likelihood, for significant explanatory variables. First, the amount of the initial purchase is positively significant on log(λ) and log(η), implying that customers with a larger initial purchase tend to buy more often and spend more per purchase in subsequent purchases. Second, older customers appear to spend more per purchase. Next, we turn our attention to the Correlated model for the relationship among λ, μ, and η. First, we see that the correlation between log(λ) and log(μ) is not significantly different from 0, implying that the assumption of the Pareto/NBD holds here. Second, the correlation between log(λ) and log(η) is significantly positive (0.14), consistent with the initial amount covariate having the same sign on log(λ) and log(η). Once again, the independence assumption of the transaction and spending components does not hold here. This time, however, the sign is in the opposite direction, implying that the correlation between purchase rate and spending per occasion is data dependent. Managerially, the correlation implies that frequent buyers spend more per shopping occasion. Also, note that when covariates are included (Full model), the correlation is no longer significant. This indicates that differences in initial amount and age can explain the correlation between purchase rate and spending.

Table 5.10 shows the elasticity decomposition of CLV into purchase rate, lifetime, and spending components.

Table 5.10 Decomposition of CLV Elasticity into Three Components (music CD chain)

                           Initial amount   Age      Female
Total                      0.31             0.12     −0.10
purchase rate: Ep^CLV      0.13             −0.09    −0.06
lifetime: El^CLV           0.00             0.06     −0.02
spending: Es^CLV           0.18             0.15     −0.01

When parameter uncertainty is not accounted for, the lifetime component is overestimated by approximately 20%, as was the case for the department store data. The elasticity decomposition of log(CLV) into the three components for varying levels of the three covariates is presented in figure 5.9. A higher initial purchase amount is related to a higher CLV by increasing the purchase rate and spending, with almost no change in lifetime. Older customers are associated with a lower purchase rate, longer lifetime,
and higher spending per purchase, with a positive net contribution to CLV. Female customers are associated with a lower purchase rate, shorter lifetime, and less spending, with a negative net contribution to CLV.

5.5 Conclusions
The wide use of RFM analysis in CRM suggests that these measures contain rather rich information about customer purchase behavior. However, the existing literature provides conflicting findings on the relation between RFM and CLV, and several authors have advocated the need for further studies to provide empirical generalization (Blattberg, Malthouse, and Neslin 2009). The present research sought to clarify the issue by identifying the underlying customer traits, characterized by the interrelated behavioral processes of purchase rate, lifetime, and spending, from the RFM measures, which serve as sufficient statistics. The PLS process posited the same behavioral assumptions as the established Pareto/NBD-based model studied by other researchers. The hierarchical Bayesian extension for constructing the individual-level CLV model permitted accurate statistical inference on correlations, while controlling for covariates and avoiding complex aggregation. The model also related customer characteristics to the buyer behaviors of purchase rate, lifetime, and spending, which were, in turn, linked to CLV to provide useful insights into retention of existing customers as well as acquisition of new customers. Two FSP datasets, one from a department store and another from a CD chain, were investigated in the empirical analysis. The proposed CLV model provided nine customer-specific metrics: posterior means of λi, μi, and ηi; expected lifetime; survival rate after one year; probability of being active at the end of the calibration period; the expected number of transactions and expected total spending during the validation period; and CLV. These metrics are especially useful for identifying preferred customers and taking marketing actions targeted at the individual level in CRM. By maximizing marketing ROI, we illustrated two examples of retention couponing that specify what coupon value should be mailed to which customers and when. It was also found that the recency criterion alone was not sufficient to identify good customers, although many companies use this criterion. On the other hand, combined use of the three measures (recency, frequency, and monetary value), even with the naïve equal weighting
Figure 5.9 Impact of Covariates on CLV Decomposed into Three Components (CD chain). [panel 5.9a] A higher initial purchase amount is related to higher CLV by increasing the purchase rate and spending with almost no change in lifetime. [panel 5.9b] Older customers are associated with a lower purchase rate, longer lifetime, and higher spending per purchase with a positive net contribution to CLV. [panel 5.9c] Female customers are associated with a lower purchase rate, shorter lifetime, and less spending with a negative net contribution to CLV.
scheme, seemed to provide a rather accurate ordering of CLV. This finding strongly supports the wide use of RFM analysis and regression-type scoring models among practitioners for identifying good customers. Finally, by relating the behavioral traits of PLS to demographic characteristics, we obtained insights into an acquisition strategy for prospective customers with a high CLV. For example, the first dataset exhibited a statistically significant negative correlation between purchase rate and spending (−0.28). In such a case, recruiting food buyers for the purpose of improving purchase rate would be partially negated by the decline in spending per purchase, and, as a result, only the improvement in lifetime contributed to the net increase in CLV. Note that the correlation between purchase rate and spending is data dependent: one of our datasets exhibited a negative relation, whereas the other showed a positive relation. For a negative relationship, the relative magnitudes must be evaluated to assess the net impact on CLV, an insight that is crucial, especially in the context of customer acquisition. The current study is only the beginning of a stream of research addressing customer behavior in a "non-contractual" setting. Possible extensions correspond to the limitations of the proposed method. From the consideration of RFM measures as sufficient statistics, the current model posited the behavioral assumptions of a Poisson purchase process, an exponential lifetime, and lognormal spending. With a customer's complete transaction history, however, more elaborate behavioral phenomena can be modeled.

Appendix: Derivation for the Effect of a Retention Program on CLV
Following Blattberg and Deighton (1996), if a customer were indeed inactive, an investment level of c per customer would win her back with probability r(c), where:
$$r(c) = R\,(1 - e^{-kc}), \qquad c \ge 0$$
Here R and k are parameters, which can be estimated by decision calculus. Because whether a customer is active is unknown, we use its stochastic metric, the predicted probability of being active, p (table 5.6), derived from the proposed model, as shown in equation (A1) (Abe 2009):

$$p = \frac{1}{1 + \dfrac{\mu}{\lambda + \mu}\left[ e^{(\lambda + \mu)(\mathrm{recency})} - 1 \right]} \tag{A1}$$
Let us denote the probability of being active with and without the retention program as pa and p, respectively. Then, by assumption, pa = p + (1 − p) r(c). Therefore, the increase in CLV as a result of the program, Δ, is expressed as equation (A2):

$$\Delta = (p_a V - c) - pV = (1 - p)\, r(c)\, V - c, \qquad \text{where } V = CLV \tag{A2}$$
Customer Retention Example 1. Given that a coupon with a different value is sent to each customer on January 1, 2001, what level of c* maximizes the increase in her CLV? The optimum c* is obtained by maximizing Δ with respect to c, that is, by solving the first-order condition of equation (A2):

$$c^* = \frac{1}{k}\ln\left[(1 - p)\,k\,R\,V\right] \tag{A3}$$
Customer Retention Example 2. To increase CLV by 10,000 yen, how many non-purchase days should a firm wait before mailing a coupon with a face value of 500 yen? Substitute Δ = 10,000 and c = 500, and solve equation (A2) with respect to the active probability p*:

$$p^* = 1 - \frac{c + \Delta}{r(c)\,V}$$

Substitute p = p* in equation (A1) and solve for recency:

$$\mathrm{recency}^* = \frac{1}{\lambda + \mu}\ln\left[\frac{\lambda + \mu}{\mu}\cdot\frac{c + \Delta}{r(c)\,V - c - \Delta} + 1\right] \tag{A4}$$
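Equations (A1)–(A4) translate into a few lines of code. The sketch below is a hedged illustration: the response function is calibrated from the two decision-calculus judgments shown in figure 5.7 (a ceiling of R = 0.8 and a 50% win-back probability at a 1,000-yen coupon), and the customer inputs in the usage lines are hypothetical rather than taken from table 5.7.

```python
import numpy as np

R = 0.8
K = -np.log(1.0 - 0.5 / R) / 1000.0     # chosen so that r(1000) = 0.5

def r(c):
    """Win-back probability R(1 - exp(-k c)) for coupon value c in yen."""
    return R * (1.0 - np.exp(-K * c))

def optimal_coupon(p_active, clv_yen):
    """Equation (A3): coupon value maximizing the expected gain in CLV;
    returns 0 when even the marginal gain cannot recoup the investment."""
    arg = (1.0 - p_active) * K * R * clv_yen
    return np.log(arg) / K if arg > 1.0 else 0.0

def days_to_wait(lam_week, mu_week, clv_yen, c=500.0, target_gain=10000.0):
    """Equations (A2) and (A4): non-purchase days after which a coupon of value c
    is expected to lift CLV by target_gain; None if c can never achieve it."""
    denom = r(c) * clv_yen - c - target_gain
    if denom <= 0:
        return None
    a = lam_week + mu_week
    recency_weeks = np.log((a / mu_week) * (c + target_gain) / denom + 1.0) / a
    return 7.0 * recency_weeks

# Hypothetical customer: weekly purchase rate 0.6, dropout rate 0.006, CLV of 1,000,000 yen
print(optimal_coupon(p_active=0.95, clv_yen=1_000_000))
print(days_to_wait(lam_week=0.6, mu_week=0.006, clv_yen=1_000_000))
```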
Notes
I would like to express my sincere thanks to Prof. John D. C. Little, who has supported me throughout my career since I was a student at MIT. I am also deeply grateful to Glen Urban and John Hauser, who served on my doctoral dissertation committee, for editing this book. This research was supported by grant #19530383 from the Japan Society for the Promotion of Science.
1. In the HB estimation, data on sample households are utilized only to construct a prior for the individual customer-specific parameters. For this reason, the estimation result is rather insensitive to the sample size, as long as it is sufficiently large (e.g., 400). One will not gain much, for example, by using a sample of 10,000 customers. Hence, scalability is not an issue in our approach.
References Abe, Makoto. 2009. “‘Counting Your Customers’ One By One: A Hierarchical Bayes Extension to the Pareto/NBD Model.” Marketing Science 28 (3):541–553. Blattberg, Robert C., and John Deighton. 1996. “Manage Marketing by the Customer Equity Test.” Harvard Business Review (July-August):36–44. Blattberg, Robert C., Edward C. Malthouse, and Scott Neslin. 2009. “Lifetime Value: Empirical Generalizations and Some Conceptual Questions.” Journal of Interactive Marketing 23 (2):157–168. Borle, Sharad, Siddharth S. Singh, and Dipak C. Jain. 2008. “Customer Lifetime Value Measurement.” Management Science 54 (1):100–112. Buckinx, Wouter, and Dirk Van den Poel. 2005. “Customer Base Analysis: Partial Defection of Behaviorally Loyal Clients in a Non-Contractual FMCG Retail Setting.” European Journal of Operational Research 164 (1):252–268. Fader, Peter S., Bruce G. S. Hardie, and Ka Lok Lee. 2005. “RFM and CLV: Using Iso-Value Curves for Customer Base Analysis.” Journal of Marketing Research 42 (4):415–430. Gupta, Sunil. 1988. “Impact of Sales Promotions on When, What, and How Much to Buy.” Journal of Marketing Research 25 (4):342–355. Hughes, Arthur. 2000. Strategic Database Marketing. 2nd ed. New York: McGraw-Hill. Little, John D.C. 1971. “Models and Managers: The Concept of a Decision Calculus.” Management Science 16 (8):B466–B485. Malthouse, Edward C., and Robert C. Blattberg. 2005. “Can We Predict Customer Lifetime Value?” Journal of Interactive Marketing 19 (1):2–15. Reinartz, Werner J., and V. Kumar. 2000. “On the Profitability of Long-Life Customers in a Noncontractual Setting: An Empirical Investigation and Implications for Marketing.” Journal of Marketing 64 (4):17–35. Reinartz, Werner J., and V. Kumar. 2003. “The Impact of Customer Relationship Characteristics on Profitable Lifetime Duration.” Journal of Marketing 67 (1):77–99.
Reinartz, Werner J., J. S. Thomas, and V. Kumar. 2005. “Balancing Acquisition and Retention Resources to Maximize Customer Profitability.” Journal of Marketing 69 (1):63–79. Rosset, Saharon, Einat Neumann, Uri Eick, and Nurit Vatnik. 2003. “Customer Lifetime Value Models for Decision Support.” Data Mining and Knowledge Discovery 7 (July): 321–339. Schmittlein, David C., Donald G. Morrison, and Richard Colombo. 1987. “Counting Your Customers: Who Are They and What Will They Do Next?” Management Science 33 (1):1–24. Schmittlein, David C., and Robert A. Peterson. 1994. “Customer Base Analysis: An Industrial Purchase Process Application.” Marketing Science 13 (1):41–67. Singh, Siddharth S., Sharad Borle, and Dipak C. Jain. 2009. “A Generalized Framework for Estimating Customer Lifetime Value When Customer Lifetimes are not Observed.” Quantitative Marketing and Economics 7 (2):181–205.
6 Building and Using a Micromarketing Platform
Masahiko Yamanaka
Editors' Note
Among the contributors to this Festschrift, Masahiko Yamanaka (Masa) is unique in that he is non-academic, president of the largest supplier and consulting company working with daily (versus weekly) POS (point of sale) data in Japan, and his company is non-US based. Masa's expertise and experience offer new insights about what Marketing Science can do with fine-grained POS data. Masa graduated from the administrative engineering department of Keio University (BS and MS) and the international business department of Sophia University (MA). After graduation, he joined Ajinomoto, one of the largest food and pharmaceutical companies in Japan. Its signature product is monosodium glutamate (MSG), which it started selling in 1909. Masa was an operations researcher in the central research laboratories working on applications in marketing. From 1983 to 1984, he studied marketing decision systems and product management as a visiting fellow at MIT under Professors John Little, Glen Urban, and John Hauser. He continued to work at Ajinomoto in information systems, new product development, sales planning, and sales promotion until he founded KSP-SP in 2003. He also served as an adjunct professor at Hosei University from 2005 to 2008. He translated Product Management by Urban and Hauser into Japanese and has written many application papers.
Glen Urban and John Hauser
6.1 Japan's POS Data Environment
6.1.1 The POS Trend in Retailing
POS cash registers spread rapidly among companies after they were first introduced by Seven-Eleven Japan in 1982. They were easier to
operate, speeded up checkout, and improved the accuracy of sales data. In 1985, Seven-Eleven Japan implemented a data analysis system to evaluate the effectiveness of product selections for small convenience stores by analyzing POS data using Pareto analysis (ABC analysis) (Ogawa 1999). By 1985, the number of registered manufacturers for the universal product code (UPC) reached 10,000. However, the majority of retailers valued the efficiency of checkout and the accuracy of sales data more than the opportunity to do analysis.

6.1.2 The POS Trend for Manufacturers and Data Suppliers
In Japan, two marketing research organizations, Intage and Tokyu Agency (Macromill), provide consumer research panels to serve the needs of manufacturers. Intage was the first company to offer such a panel, using a diary method beginning in 1964. The first scanning system started in 1988. Since the operation of a consumer research panel is costly, the number of consumers on the panel of Intage, Japan's largest marketing research company, is only 12,000 people. This is not enough for an adequate study of the diffusion of new products. To solve this deficiency, POS marketing data was needed. Intage launched a panel research network service linked to retailer POS data in 1992. The first POS system had been started by Nihon Keizai Shimbun (Nikkei) in 1985. The POS marketing data providers now include Intage, Nikkei, the Distribution Economics Institute of Japan, and KSP-SP, which was the last company to join in 2003. The demand for POS marketing data in Japan is increasing as the number of users expands, but the growth has been fairly slow. This caused Nielsen to withdraw from the POS data business in 2007.

6.1.3 Collaborative Movement of Japan's Retailers and Suppliers
In the 1990s, the idea of category management was introduced by leading manufacturers. As part of this, each major manufacturer started software development to support shelf management of regular items. However, it was difficult for retailers to evaluate the recommendations of manufacturers. In addition, many retailers recognized that the underlying principles behind the recommendations are more important than the data analysis itself. As a result, shelf management software developed by a third party has become popular with retailers. Despite this, there have not been major advances in category management in the food industry, whereas category management in the
toiletry industry has progressed because of the efforts of P&G and Kao. The major obstacle to advances in category management in the food industry has been the lack of a good methodology for promotion planning. The shelf management of regular items is a periodic retailer operation carried out twice a year, in spring and autumn. However, promotion planning is conducted on a monthly basis by major retailers, and their sales critically depend on these promotions being successful. In 2003, the retailer Coop Sapporo made its POS data available to suppliers for all merchandise categories. This attracted more than 200 companies, including major national brand manufacturers, as well as local mid-sized manufacturers and wholesalers. Coop Sapporo's action showed that it welcomed recommendations of seasonal promotion plans for individual items and entire categories. Furthermore, a POS system that monitored whether promotions were conducted as planned became desirable.

6.1.4 Movement of Consumers and Retailers in Japan
The consumers in Japan particularly care about the freshness of seafood and vegetables. This desire for quality is due to the climate of Japan. As an island country, it became common for Japanese people to eat fish raw because fresh fish was easily available. In addition, the Japanese archipelago lies from north to south, which produces distinctive differences among the four seasons. Spring starts in the south and autumn starts in the north, which makes it possible to supply seasonal vegetables and fruits for a long period of time. There are also many customers who want to enjoy seasonal delights from distinctive areas. However, many local products are only available in limited quantities. This is a disadvantage for large national chains that wish to procure products in large quantities. Thus, economies of scale are often impossible for wild and seasonal products like fish and vegetables. In addition to supply problems, prices of fish and vegetables fluctuate almost daily, depending on the time of the year. For that reason, these products do not match the EDLP (everyday low price) strategy of selling at stable low prices throughout the year. Retailers who sell products using a High-Low strategy are well received by consumers. This allows local supermarkets to survive in Japan despite competition from national chains. As a result, the share of the top four national chains in the food market is only 24%, which is much smaller than in the United States and European countries.
6.2 Demands of a Micromarketing Strategy
The results of a shopping behavior study were systematized into In-Store Merchandising (ISM), and the knowledge of ISM became common among sales divisions in Japan (Tajima 1989). One of the characteristics of Japanese companies is the close relationship between the marketing divisions at headquarters and the sales divisions at the local branches. Major manufacturers rotate human resources every three to five years, which allows most employees to gain experience in both positions. This practice explains why Japanese management is able to focus on a localized, on-site approach (Ushio 2010). In many companies, headquarters decides on the outline of marketing strategies and then the branches set the goals for their geographic area based on prior performance levels. To better meet consumer needs, many supermarkets rely heavily on local strategies and the abilities of regional sales staff from manufacturers who are in charge of dominant retailers in the areas. This is another reason why the market concentration by large national chains is low in the Japanese supermarket industry. It is desirable for manufacturers to propose strategies that match retailers' own brand strategies. For that reason, the sales personnel of manufacturers need to understand shopper trends through POS data. Against this background, the sales personnel of manufacturers must become "micromarketers" who can take the lead in local marketing strategies. It is important to cultivate many micromarketers from sales personnel by building the infrastructure. To realize this, the following items should be addressed:

Planning support
(1) Product appeals for the main category and each brand, as well as the best timing for promotions.
(2) The effect of price promotion.
(3) The effect of each type of non-price promotion.

Verification support
(4) Causal data for retail stores.
(5) POS data management and analysis for manufacturers and wholesalers.
(6) Information disclosure of retail POS data.
Common infrastructure
(7) Easy-to-use marketing POS data.
(8) Key success factors for display promotion.
(9) Methods for evaluating promotion campaigns.

Items (1)–(3) for planning support and item (4) for verification support are closely related. If brand equity is considered the baseline, it can only be measured by subtracting the effects of price and non-price promotions from sales. To this end, POS data provides one component. It is necessary to incorporate causal data on non-price promotions for the rest.
166
Masahiko Yamanaka
Figure 6.1 ASP (application service provider) of KSP-SP.
this substantially faster than heretofore. It permits users to obtain results from daily POS data by keyword search via the Internet. This enables them to understand PI values for each price level and/or look at detailed trends in the sales price. A flash report enables users to grasp the market situation with just two days’ delay. This has been attractive to major manufacturers needing accurate information at an early stage. This ASP (application service provider) service was made possible by designing a fast search engine. By the end of 2015, KSP-POS collected weekly data from 1,000 stores and daily data from 910 stores from approximately 180 companies, and the number is increasing day by day. 6.3.2 Monitoring End Aisle Display Promotions The majority of high profit companies in the Japanese supermarket business conduct sales promotion weekly. However, most retailers
Building and Using a Micromarketing Platform
167
were not able to determine how well each store adhered to the promotion plan suggested by the head offices. Furthermore, few retailers evaluate individual promotions and analyze the success and failure factors. Therefore, a service was developed to measure the productivity of end aisle displays and other sales events by setting up a sample of stores in twenty-eight locations from seven retail companies. The importance of the impact of end aisle display on sales, found by this system, has been published in a research paper (Yamanaka 2007). 6.3.3 POS Data Disclosure System in the Retail Industry POS data disclosure from retailers to suppliers has become common in Japan. The system for accessing it (developed by KSP-SP) is called SUPLINK and has been adopted by twenty companies. It allows access not only to POS data, but also to basic analysis functions. Systems developed by other companies require an additional step, in which data must be downloaded into Microsoft Excel or statistical software before analysis can begin, requiring technical knowledge and the time of users. SUP-LINK is well received for its user-friendliness. Users can obtain results simply by selecting an analysis menu via the Internet. This system offers a differential advantage over other systems to manufacturers and wholesalers who want to conduct data analysis by themselves. 6.3.4 POS Data Management and Analysis System for Manufacturers and Wholesalers: CCMC Disclosure of POS data by retailers allows suppliers to analyze data and make necessary recommendations to the retailers. To support such operation, KSP-SP offers a system called CCMC (Customer’s Category & Market Comparison) using ASP. Previously, sales personnel developed their own tools to analyze POS data. However, if collected data were not turned over to successors or depended on the skills of the analyst, large amounts of time were lost in transitions. To solve such problems, CCMC is useful for sales teams. The analysis menus of CCMC are made for both beginners and people who require detailed analysis, and offer Excel-ready analysis results, including graphs. 6.3.5 Coaching Service Because POS data is being disclosed by retailers, manufacturers need to make detailed propositions to retailers, supported by verification
168
Masahiko Yamanaka
studies, and continuously improve the quality of their recommendations. In such a drastically altered environment, employees often receive training from outside agencies. The coaching service of KSP-SP aims not only to develop better analytical ability in the sales teams from manufacturers and wholesalers, but also to improve the abilities for business negotiations. If retailers disclose POS data, the salesperson who calls on the retailer can learn how to use the menus of the analysis software (CCMC) to find the challenges facing clients. Participants create proposal sheets for the discovered challenges and make simulated proposals for the coach. The coach, who is an experienced buyer for the retailer and is eager to develop human resources, points out the good points and the points that require improvements from simulated proposals before making the final proposals. 6.4 High Precision Measurement Model of Advertisement and Sales Promotion: M3R. The M3R (Micro Marketing Mix Response model) service focuses on measuring the effectiveness of the marketing mix to permit its improvement and provides new indicators of performance. It measures the synergistic effects of TV commercials and sales promotions and how well commercials are utilized during sales activities. 6.4.1 Background and Purpose For the manufacturers’ national brand products, although Internet ads have become common, the manufacturers value television ads more. After the publication of IMC (Integrated Marketing Communication) by Shultz, Tannenbaum, and Lauterborn (1993), many professionals took an interest in the new concept of M3R. However, it remained a theory because a simple method for measuring the level of integration did not develop. Watanabe (2000) studied the effects of advertisements on shoppers through exit interviews. He named as the primary effect to be the case when advertisements were followed by planned purchases, and the secondary effect to be the case when shoppers purchased unplanned products and recalled the television ads. He showed the secondary effect exceeded the primary one. This indicates the importance of matching the timing of the advertising exposures with increased
in-store exposure, as well as the importance of measuring the secondary effects of an advertisement. 6.4.2 Concepts and Features of the Model The model uses daily POS data from each store and daily GRP (gross rating points) as inputs. Using single-source data, Jones (1995) confirmed that the shopper purchasing attributable to watching a short-term television campaign is maximized at three exposures, meaning that shoppers take action within an extremely short time of watching a commercial. This implies that sales, advertising, and POS data must be collected daily, because changes occur daily. To assess advertising effects accurately, in-store sales promotions must also be accounted for and their effects understood. The objective variable is a quantile of the daily PI (purchase incidence), and the explanatory variables are: (1) daily GRP; (2) the number of stores offering temporary price reductions at different discount levels; and (3) the number of stores conducting non-price promotions, estimated from the price and quantity PI values. When eight quantiles are taken as objective variables, seven regression equations are solved simultaneously. The carryover rate is estimated by maximizing the R² of the regression equation. In general, the upper and lower quantiles are influenced more strongly by advertising and promotion than the median. The average value of the coefficients obtained across the quantiles is used as the final estimate. Taking each quantile separately amounts to a kind of market segmentation for finding market response functions (Guadagni and Little 1983). Basic Data (1) GRP data per advertising copy on a daily basis. (2) POS data per item per store on a daily basis. As an option: (3) causal data on detailed in-store promotions on a weekly basis. Output Weekly baseline: estimated values of quantity PI without sales promotions. Efficiency of advertisements can be obtained from three indicators, effects can be obtained from four indicators, and sales promotion can be broadly divided into pricing and non-pricing (see figure 6.2).
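The carryover-rate search described above can be illustrated with a small sketch. The snippet below is a minimal illustration under stated assumptions, not the KSP-SP implementation: it assumes hypothetical daily arrays of a PI quantile, GRP, and promotion-store counts, builds an advertising-stock variable for each candidate carryover rate, and keeps the rate that maximizes the regression R².

```python
# Minimal sketch of the carryover-rate search described above.
# All variable names and data are hypothetical placeholders.
import numpy as np

def ad_stock(grp, carryover):
    """Daily advertising stock: stock_t = GRP_t + carryover * stock_{t-1}."""
    stock = np.zeros_like(grp, dtype=float)
    for t in range(len(grp)):
        stock[t] = grp[t] + (carryover * stock[t - 1] if t > 0 else 0.0)
    return stock

def r_squared(y, X):
    """R^2 of an OLS fit of y on X (intercept included)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def fit_carryover(pi_quantile, grp, price_promo_stores, nonprice_promo_stores):
    """Grid-search the carryover rate that maximizes the regression R^2."""
    best = (None, -np.inf)
    for carryover in np.arange(0.0, 1.0, 0.01):
        X = np.column_stack([ad_stock(grp, carryover),
                             price_promo_stores,
                             nonprice_promo_stores])
        r2 = r_squared(pi_quantile, X)
        if r2 > best[1]:
            best = (carryover, r2)
    return best

# Example with simulated daily data (90 days).
rng = np.random.default_rng(0)
grp = rng.poisson(30, 90).astype(float)
price_promo_stores = rng.poisson(5, 90).astype(float)
nonprice_promo_stores = rng.poisson(3, 90).astype(float)
pi_quantile = 0.02 * ad_stock(grp, 0.6) + 0.1 * price_promo_stores + rng.normal(0, 1, 90)
print(fit_carryover(pi_quantile, grp, price_promo_stores, nonprice_promo_stores))
```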
Figure 6.2 Basic Indicators Obtained by the M3R Model: advertising efficiency ((1) maximum lift-up rates, (2) carry-over rates, (3) threshold values of advertising stock), the primary through fourth advertising effects, price and non-price sales promotion effects, baseline sales, and sales turnover by promotion type.
Three indicators of advertising efficiency (1) Maximum lift rates: sales gained by short-term advertisements. (Pricing PI subtracted from weekly sales promotion effects with maximum advertising stocks) / (Average baseline prior to airtime of advertising). (2) Advertising carryover rates: the value is the carryover effect from advertising GRP of the previous day. (3) Threshold values of advertising stock: minimum and maximum. The minimum threshold value indicates a value below which advertising does not increase sales. The maximum threshold value indicates a value above which advertising does not lead to incremental sales. Six indicators of advertising effects Indicators measured from sales (4) Primary effect of advertising: traditional way to gain sales. It presumably indicates results of planned purchase. (5) Secondary effect of advertising: synergistic effect of expanded exposure by displaying on-end gondolas and advertising. It presumably
indicates results in which shoppers purchase the product when they find it in the store and are reminded of the television commercials. Indicators measured by the increase in the number of stores (6) Pull effect of end displays: indicates whether the sales staff can use the commercial plan to obtain places to display products. In general, it is the proportion of stores running end displays during the advertising period. (7) Shelving promotional effect: for products that are not new or leading brands, it indicates the degree of success of the sales staff in increasing shelf space before television commercial airtime. Discount effects (8) Effect of sales per unit of discount: indicates how much sales increase depending on the size of the discount, measured as sales turnover per unit of discount compared to the previous day or baseline. Non-pricing sales promotion effects (9) Effect of end displays and flyers: indicates the increase in sales turnover at stores that is presumably due to non-price effects. 6.4.3 Characteristics The major research companies Intage and Nielsen offer measurement models for advertising effectiveness that use weekly POS data and weekly GRP data as inputs. Because weekly POS data are influenced heavily by in-store promotions, it is not appropriate to detect advertising effects from them without adjustment. The classic marketing-mix model is BRANDAID, introduced by Little (1975). That model takes market share as the objective variable and expresses the explanatory variables as ratios relative to competitors. M3R models the sales increase rather than share, and it is easy for practitioners to use because its explanatory variables are absolute values, such as GRP and the number of stores. The model is highly flexible. To estimate the impact of competitive brands, one simply adds the daily GRP of competitive brands as an explanatory variable. One particular brand launched a website as part of a consumer campaign, alongside television ads, to promote product comprehension; the number of daily accesses to the website was added as an explanatory variable, and its coefficient was estimated to be statistically significant. The model makes adding variables
easy. The weekly baseline assists promotion timing by indicating the start and finish of seasonal demand. 6.4.4 Results The model has never failed to detect advertising effects. The t value for the advertising variable, GRP, is usually over 6 and sometimes exceeds 10, which is rarely obtained in ordinary regression models. Comparing the maximum lift-up values for the primary and secondary effects of advertising, the secondary effects are more than three times the primary effects, which shows the importance of displaying products on end displays during the advertising period. The results also indicate that the daily advertising carryover and threshold values can be used to construct effective advertising schedules. 6.4.5 Challenges and Future Development Since the basic M3R model was created in 2005, the process has been enhanced and automated in many ways. However, some basic modeling problems and challenges in measuring advertising effects remain. As a model, it has not been verified that the current method is the best possible formulation, although it represents excellent progress over previous methods. The objective variables are obtained using quantile and sampling methods, but a theoretical explanation is lacking as to why this approach detects advertising and sales promotion effects so readily. In addition, the current model is still simple, and further extension is warranted; it might be possible to enhance accuracy using Bayesian methods. A future challenge for the M3R business is to implement flash reports: if the effects of on-air commercials can be provided in nearly real time, this should be useful for deciding whether to continue showing particular television commercials. 6.5
Concluding Remarks
After working at the manufacturer Ajinomoto for thirty-one years, I established KSP-SP. During this time, I came to realize that it was desirable to provide an infrastructure for sales activities from a neutral organization of manufacturers, wholesalers, and retailers. This concept has been well received by many clients. To facilitate micromarketing, the field of ID-POS has already been an important infrastructure in the drugstore business. KSP-SP will
implement ID-POS when expanding the business to the drugstore industry. Acknowledgments I express my gratitude for the help of Professor Makoto Abe (Tokyo University). Without his support, this essay could not have been accomplished. Finally, the current business could never have been achieved without the support of many people. I would like to take this opportunity to show my appreciation to Professor Yoshio Hayashi (Keio University), as well as Professors Little, Urban, and Hauser (MIT). I am deeply grateful for the various experiences with many associates at Ajinomoto, through which I have learned a tremendous amount. The business of KSP-SP was achieved through the understanding of its shareholders, the passionate work of its employees, and the many clients who agreed with the vision of KSP-SP. References Guadagni, P. M., and J. D. C. Little. 1983. "A Logit Model of Brand Choice Calibrated on Scanner Data." Marketing Science 2 (3): 203–238. Jones, J. P. 1995. "Single-Source Research Begins to Fulfill Its Promise." Journal of Advertising Research 35 (3): 9–16. Little, J. D. C. 1975. "BRANDAID: A Marketing-Mix Model." Operations Research 23 (July–August): 628–673. Ogawa, Kosuke. 1999. Marketing Information Revolution, 173–179. Tokyo: Yuhikaku. Schultz, Don E., S. I. Tannenbaum, and R. F. Lauterborn. 1993. New Marketing Paradigm: Integrated Marketing Communications. Lincoln, IL: NTC Business Books. Tajima, Yoshihiro. 1989. In-Store Merchandising. Tokyo: Business-sha. Ushio, Jiro. 2010. Words of Japanese Management, 146. Tokyo: Chichi Shuppan. Watanabe, Takayuki. 2000. In-Store Buying Behaviors and Marketing Adaption, 101–104. Tokyo: Chikura Shobo. Yamanaka, Masahiko. 2007. "Importance of Irregular End Displays During Sales Promotions: Integration of Micro and Brand Marketing." Japan Marketing Journal 103: 77–94.
7
Dynamic Allocation of Pharmaceutical Detailing and Sampling for Long-Term Profitability Ricardo Montoya, Oded Netzer, and Kamel Jedidi
7.1
Introduction
The pharmaceutical industry is under significant pressure to consider its costs very carefully … Currently, much budget is spent despite marketers being unable to identify which combination of activities has the greatest growth potential, and without knowing what specific effect individual activities are having on physicians over time. —Andrée Bates, Managing Director of Campbell Belman Europe (Bates 2006)
Marketing is essential to company growth. The US pharmaceutical industry spent upwards of $18 billion on marketing drugs in 2005 (Donohue et al. 2007), representing approximately 6% of industry sales revenues. Detailing and drug sampling activities account for the bulk of this spending. To stay competitive, pharmaceutical marketing managers need to optimally allocate these resources and ensure that they achieve the highest possible return on investment for the firm. Marketing resource allocation decisions are complex. Pharmaceutical firms need to determine which physicians to target as well as when and how to target them. Optimizing these decisions requires insights into (1) physicians’ heterogeneity in prescription behavior and their responsiveness to marketing activities, (2) the evolution of physicians’ preferences over time, and (3) the short- and long-term impact of marketing activities on prescription behavior. Perhaps because of these complexities, there is evidence that pharmaceutical firms do not allocate their marketing budgets optimally (Manchanda and Chintagunta 2004, Narayanan et al. 2005). This research offers a first step in providing pharmaceutical marketing managers with a state-of-the-art model and optimization procedure for dynamically targeting marketing activities to individual physicians.
Previous research suggests that physicians are heterogenous and may exhibit dynamic prescription behavior, particularly for a new drug (e.g., Janakiraman et al. 2008; Narayanan et al. 2005). This research stream has also shown that pharmaceutical marketing actions can have both short- and long-term effects (Manchanda and Chintagunta 2004; Mizik and Jacobson 2004; Narayanan et al. 2005). Thus, accounting for physicians’ heterogeneity and dynamics in prescription behavior and the enduring impact of marketing activities are critical for optimizing marketing decisions. Ignoring physicians’ dynamic behavior can result in misleading inferences regarding the temporal pattern of elasticities. Similarly, a myopic firm is likely to underallocate marketing resources with primarily long-term effects. In this paper, we present an integrative approach for dynamically targeting and allocating marketing activities to physicians. We first model the dynamics in physicians’ prescription behavior while accounting for the short- and long-term effects of marketing actions and physicians’ heterogeneity. We then use the estimation results to derive an optimal1 dynamic marketing resource allocation policy. Specifically, we use a nonhomogeneous hidden Markov model (HMM) that accounts for the dynamics in prescription behavior and the enduring effect of marketing actions. We capture physicians’ dynamic behavior by allowing them to transition over time among a set of latent states of prescription behavior. To model the long-term impact of marketing actions, we allow the nonhomogeneous transition matrix to be dynamically affected by these actions. Finally, integrating over the posterior distribution of the physician-level, time-varying parameter estimates, we implement a partially observable Markov decision process (POMDP) to dynamically allocate pharmaceutical marketing resources across physicians. Although implemented within a pharmaceutical context, our approach can be readily used in other domains where firms have access to longitudinal, customer-level data, such as retailing, telecommunication, and financial services firms. We demonstrate the managerial value of the proposed approach using data from a major pharmaceutical company. In this application, we find a high degree of physicians’ heterogeneity, and dynamics and substantial long-term effects for detailing and sampling. Specifically, we find that detailing is most effective as an acquisition tool, whereas sampling is most effective as a retention tool. The optimization results suggest that the firm could increase its profits substantially while decreasing its marketing efforts by as much as 20%.
The rest of this paper is organized as follows. Section 7.2 reviews the relevant literature. Section 7.3 describes the pharmaceutical data we use in our empirical application. Section 7.4 presents the modeling approach. Section 7.5 reports the empirical results. Section 7.6 presents the optimization procedure, and section 7.7 discusses the derived resource allocation policy. Section 7.8 concludes this paper, and discusses limitations and future research directions. 7.2
Literature Review
In this section, we briefly review the pharmaceutical literature and other work related to the different components of our approach: dynamics in physician prescription behavior, long-term effect of marketing activities, and marketing resource allocation. The pharmaceutical marketing literature shows that physicians can be dynamic in their prescription behavior. Such dynamic behavior can arise from internal factors, such as state dependence (Janakiraman et al. 2008; Manchanda et al. 2004) and learning (Narayanan and Manchanda 2009; Narayanan et al. 2005), or from the long-term effect of marketing actions, such as detailing and sampling (Gönül et al. 2001; Janakiraman et al. 2008; Manchanda and Chintagunta 2004; Mizik and Jacobson 2004; Narayanan et al. 2005; Narayanan and Manchanda 2009). Erdem and Sun (2001) demonstrate that dynamics in consumer behavior and the long-term effect of marketing actions need to be accounted for simultaneously to properly quantify their marginal effects. In a pharmaceutical context, Janakiraman et al. (2008) show that ignoring physicians’ habit persistence may bias the estimates of the effectiveness of marketing actions. In this paper, we account for physician dynamics through a nonhomogenous HMM (Netzer et al. 2008), in which the states are defined by both physician behavior and external factors such as marketing activities. From a methodological point of view, our paper belongs to the small but growing number of HMM applications in marketing. HMMs have been used to study the dynamics in consumer attentions (Liechty et al. 2003), web search behavior (Montgomery et al. 2004), competitive environment (Moon et al. 2007), customer relationships (Netzer et al. 2008), and service portfolio choice (Schweidel et al. 2010). The nonhomogenous HMM simultaneously captures physicians’ dynamics, physicians’ heterogeneity, and the short- and long-term effects of marketing activities. It captures dynamics by allowing
physicians to dynamically transition among a set of prescription states. The long-term effects of detailing and sampling are often captured in the literature by the exponential decay or the cumulative detailing stock approaches (e.g., Gönül et al. 2001; Manchanda and Chintagunta 2004). In our HMM, marketing actions can have a “regime shift” effect on physicians’ behavior (i.e., they affect the physician transition to a different state of behavior), thus providing a more flexible approach for capturing their long-term effect. Despite the rich body of research investigating physicians’ responses to detailing and sampling, little work has been devoted to the optimal dynamic allocation of these marketing activities. In this research, we formulate a POMDP (see Littman 2009 for a review; and Aviv and Pazgal 2005, Knox 2006, and Hauser et al. 2009 for marketing applications) that uses the posterior distribution of the HMM parameters as input to dynamically allocate marketing resources across physicians and maximize long-run profitability. Several papers in the marketing literature have used such a two-step approach (i.e., estimation followed by optimization) to optimize advertising effort (e.g., Dubé et al. 2005; Hitsch 2006), catalog mailing (Simester et al. 2006), and pricing (e.g., Nair 2007; Dubé et al. 2009). Our optimization approach advances the marketing resource allocation literature (e.g., Jedidi et al. 1999; Lewis 2005; and Naik et al. 2005) by accounting for the short- and long-term effects of marketing activities as well as physicians’ heterogeneity and latent dynamics when allocating detailing and sampling to physicians over time. 7.3
Data
Our data comprise physician-level new prescriptions, as well as detailing and sampling activities received over a 24-month period after the launch of a new drug used to treat a medical condition in postmenopausal women. Monthly new prescriptions are measured for both the new drug and the total category.2 Detailing activity corresponds to the monthly number of face-to-face meetings in which pharmaceutical representatives present information about the drugs to physicians. Sampling activity corresponds to the monthly number of free drug samples offered to physicians by the pharmaceutical representatives.3 Our sample consists of 300 physicians who have received at least one detail and one sample during the first 12 months of the data. These data are compiled from internal company records and pharmacy audits.
Table 7.1 Descriptive Statistics
                          Mean    Std. dev.   Lower 5%   Upper 95%
New drug prescriptions     1.62      1.35        0.54        3.21
Number of details          2.18      0.63        1.22        3.71
Number of samples          9.07      3.30        4.17       16.33
Months detailed (%)        0.87      0.15        0.35        1.00
Category prescriptions    22.50     13.05       10.10       37.79
New drug share             0.08      0.06        0.03        0.14
Note: Average monthly values computed for each physician across the sample of 300 physicians.
Table 7.1 presents descriptive statistics of the data. On average, a physician writes 22.5 new prescriptions in the category per month, 1.62 of which correspond to the new drug. Each physician receives an average of 2.18 details and 9.07 samples of the new drug per month. Furthermore, an average physician was detailed in 87% of the months, suggesting a relatively non-targeted detailing allocation by the pharmaceutical firm. Finally, there is variability in prescription behavior across physicians, as well as in the number of details and samples received. Figure 7.1 shows the monthly evolution of the total volume of new drug prescriptions, details, samples, and share of new prescriptions for the 24-month span of our data. The figure suggests an increasing trend in the level of new prescriptions of the new drug, but relatively stable detailing and sampling activities by the firm. In addition, the share of the new drug increases from almost 0% in the first month to about 10% in the last month, closely following the increase in prescriptions of the new drug. Thus, the increase in the volume of prescriptions for the new drug cannot be attributed to category expansion. Furthermore, because the new drug reaches only 10% share by month 24, it is evident that demand for the new drug has not reached saturation by the end of our observation period. Several questions arise from figure 7.1: (1) How did the marketing actions (detailing and sampling) influence physicians’ prescribing behavior? (2) Do these marketing activities have primarily short-term or enduring effects? (3) Could the firm have implemented a better targeting policy? We address these and other questions in the following sections.
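The per-physician averages underlying table 7.1 can be computed directly from a monthly panel. The sketch below is only an illustration of that computation on a hypothetical pandas DataFrame with one row per physician-month; the column names and simulated values are assumptions, not the authors' data.

```python
# Sketch of the table 7.1 computation: average each physician's monthly values,
# then summarize across the 300 physicians. Column names and data are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
panel = pd.DataFrame({
    "physician_id": np.repeat(np.arange(300), 24),
    "month": np.tile(np.arange(1, 25), 300),
    "new_rx": rng.poisson(1.6, 300 * 24),
    "details": rng.poisson(2.2, 300 * 24),
    "samples": rng.poisson(9.0, 300 * 24),
    "category_rx": rng.poisson(22.5, 300 * 24),
})
panel["detailed"] = (panel["details"] > 0).astype(float)
panel["share"] = panel["new_rx"] / panel["category_rx"].clip(lower=1)

# Average monthly value for each physician, then mean/std/5%/95% across physicians.
per_physician = panel.groupby("physician_id")[
    ["new_rx", "details", "samples", "detailed", "category_rx", "share"]
].mean()

def q05(s): return s.quantile(0.05)
def q95(s): return s.quantile(0.95)

summary = per_physician.agg(["mean", "std", q05, q95])
print(summary.round(2))
```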
Figure 7.1 Total Number of New Drug Prescriptions, Details, Samples, and Share of New Prescriptions per Month (monthly counts of prescriptions, details, and samples on the left axis; new drug share of prescriptions on the right axis).
7.4
The Nonhomogeneous Hidden Markov Model
In this section, we describe the use of a nonhomogeneous HMM to capture physicians’ dynamics in prescription behavior and the shortand long-term effects of marketing actions. An HMM is a Markov process with unobserved states. In our application, the hidden states represent a finite set of prescription-behavior states. For instance, assume two prescription-behavior states. At the low prescription-behavior state, physicians make only a few prescriptions for the new drug, possibly because of the need to acquire information about the drug. Consequently, physicians in this state may be responsive to information-based marketing initiatives (e.g., journal advertising). In contrast, physicians at the higher prescription-behavior state are likely to be affected by retention-type marketing initiatives (e.g., sampling). Physicians stochastically transition among these states through a Markovian first-order process. The transitions between states are
functions of marketing activities and physicians' intrinsic propensities to switch. For example, detailing may educate physicians about the drug and move them from the low prescription-behavior state to the higher one. Thus, the HMM model can capture the long-term effect of marketing activities through their impact on the transition probabilities. Marketing initiatives also affect physician behavior in the short term. We capture this effect by relating the marketing variables to the observed prescription behavior through a state-dependent component. Let Yit be the number of prescriptions of the new drug written by physician i in month t. In the HMM, the joint probability of a sequence of decisions up to time t {Yi1 = yi1, …, Yit = yit} is a function of three main components: (1) the initial hidden states membership probabilities (πi); (2) a sequence of transition probabilities among the prescription-behavior states (Qit); and (3) a set of prescription probabilities conditioned on the prescription-behavior states (Mit). We describe our formulation of each of these components next. 7.4.1 Initial State Membership Probabilities Let s denote a prescription-behavior state (s = 1, …, S). Let πis be the probability that physician i is initially in state s, where πis ≥ 0 and $\sum_{s=1}^{S} \pi_{is} = 1$. Such a probability can depend, for example, on the physician's prior exposure to detailing and sampling activities for other drugs made by the pharmaceutical firm. That is, physicians with higher levels of exposure to marketing activities prior to the launch of the new drug are likely to be in more favorable states of prescription behavior initially. Because we do not have access to such information and because our application involves a new drug, we assume that all physicians start at state 1, which corresponds to the lowest prescription-behavior state, in the first month.4 Thus, we have equation (1):
$$\boldsymbol{\pi}_i' = [\pi_{i1}, \pi_{i2}, \ldots, \pi_{iS}] = [1, 0, \ldots, 0]. \tag{1}$$
7.4.2 The Markov Chain Transition Matrices The transition matrix Qit governs physician i's transitions among the states after period 1. We model Qit as a function of detailing and sampling activities. Let zit = [f(Detailingit), f(Samplingit)] be the vector of marketing actions, where Detailingit and Samplingit correspond to the number of details and samples that physician i receives in month t, respectively, and $f(x) = \frac{\ln(x+1) - \mu}{\sigma}$, with $\mu = \mathrm{mean}(\ln(x+1))$ and $\sigma = \mathrm{std}(\ln(x+1))$.
We log-transform detailing and sampling to capture the potentially diminishing returns of their effectiveness (Manchanda and Chintagunta 2004). We normalize these variables to ensure a proper identification of the prescription-behavior states. Let Xit ∈ {1, . . ., S} denote physician i's state membership at time t. Then each element of the transition matrix, corresponding to the probability that physician i switches from state s′ to s in period t, can be written as equation (2):
$$q_{is'st} = P(X_{it} = s \mid X_{i,t-1} = s', \mathbf{z}_{i,t-1}), \tag{2}$$
where $q_{is'st} \geq 0$ and $\sum_{s=1}^{S} q_{is'st} = 1$. Thus, the propensity to transition from one state to another is a function of unobserved factors that can be captured by a transition random-effect coefficient and a set of marketing actions zit−1 in period t − 1. Note that we use zit−1 to ensure temporal precedence of detailing and sampling to the physician's transition among states between period t − 1 and period t. In contrast, one should use zit in situations where marketing actions can transition customers in the same period (i.e., in-store promotions). Alternatively, the customer transition could also depend on the cumulative past exposure to marketing activities, such as stock variables for detailing and sampling (e.g., $\sum_{l=1}^{t} \mathbf{z}_{il}$). However, using stock variables would substantially complicate the formulation of the POMDP resource allocation problem (see section 7.6). In our empirical analysis, we have tested these specifications and found that a model with zit−1 fits the data best. We follow Netzer et al. (2008) in parametrizing the nonhomogeneous hidden-state transitions as an ordered logit model. Thus, the transition probabilities in equation (2) are given by equation (3):
$$
\begin{aligned}
q_{is1t} &= \frac{\exp(\hat{\tau}_{is1} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})}{1 + \exp(\hat{\tau}_{is1} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})}, \\
q_{is2t} &= \frac{\exp(\hat{\tau}_{is2} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})}{1 + \exp(\hat{\tau}_{is2} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})} - \frac{\exp(\hat{\tau}_{is1} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})}{1 + \exp(\hat{\tau}_{is1} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})}, \\
&\;\;\vdots \\
q_{isSt} &= 1 - \frac{\exp(\hat{\tau}_{is,S-1} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})}{1 + \exp(\hat{\tau}_{is,S-1} - \boldsymbol{\rho}_{is}' \cdot \mathbf{z}_{i,t-1})},
\end{aligned} \tag{3}
$$
where $\{\hat{\tau}_{iss'}, s' = 1, \ldots, S-1\}$ is a set of ordered logit threshold parameters specific to state s that delineates the regions of switching, and $\boldsymbol{\rho}_{is}$ is a vector of regression weights intended to capture the effect of marketing activities on the propensity of physician i to transition from state s to other states. To constrain the ordering of the thresholds, we set $\hat{\tau}_{is1} = \tau_{is1}$ and $\hat{\tau}_{iss'} = \hat{\tau}_{is,s'-1} + \exp(\tau_{iss'})$ for all i, s, and $s' = 2, \ldots, S-1$, such that $\hat{\tau}_{is1} \leq \hat{\tau}_{is2} \leq \cdots \leq \hat{\tau}_{is,S-1}$.
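A minimal sketch of the ordered-logit transitions in equation (3) follows. Given threshold values and marketing-response weights for one physician and one origin state, it returns that row of the transition matrix for a given marketing vector z. The numerical values are illustrative assumptions, not estimates from the paper.

```python
# Sketch of equation (3): ordered-logit transition probabilities out of one state.
# Thresholds and weights below are illustrative values, not estimates.
import numpy as np

def transition_row(thresholds, rho, z):
    """P(next state = 1..S | current state), with S = len(thresholds) + 1.

    thresholds: increasing ordered-logit cutoffs (tau_hat_{s,1} <= ... <= tau_hat_{s,S-1})
    rho:        marketing-response weights for the current state
    z:          (transformed) detailing/sampling vector from the previous period
    """
    eta = np.asarray(thresholds) - float(np.dot(rho, z))
    cdf = 1.0 / (1.0 + np.exp(-eta))           # logistic CDF at each cutoff
    cdf = np.concatenate(([0.0], cdf, [1.0]))  # pad with 0 and 1
    return np.diff(cdf)                        # cell probabilities sum to 1

# Example: 3 states, covariates z = [detailing, sampling] (standardized logs).
thresholds = [0.4, 1.7]
rho = [0.3, 0.1]
print(transition_row(thresholds, rho, z=[0.0, 0.0]))   # no marketing
print(transition_row(thresholds, rho, z=[1.0, 0.0]))   # one (standardized) detail
```

A larger value of the marketing index shifts probability mass toward the higher states, which is how the model represents the enduring effect of detailing and sampling.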
7.4.3 Conditional Prescription Behavior Conditional on being in state s in month t, we assume that the number of new prescriptions of the new drug written by physician i, Yit, follows a binomial distribution with parameters Wit and pist; that is,
$$P_{ist}(Y_{it} = y_{it} \mid X_{it} = s, \mathbf{z}_{it}) = \binom{W_{it}}{y_{it}} p_{ist}^{\,y_{it}} (1 - p_{ist})^{W_{it} - y_{it}}, \tag{4}$$
where Wit is the total number of new prescriptions in the category written by physician i in month t. Because the new drug is prescribed for a very specific disease that needs to be medically treated, category prescription is not likely to be affected by the introduction of the new drug and its associated marketing efforts. Accordingly, we treat Wit as exogenous.5 To capture the short-term impact of marketing actions, we reparametrize pist, the probability that physician i prescribes the new drug in month t, as
$$p_{ist} = \frac{\exp(\hat{\alpha}_s^0 + \boldsymbol{\alpha}_{is}' \mathbf{z}_{it})}{1 + \exp(\hat{\alpha}_s^0 + \boldsymbol{\alpha}_{is}' \mathbf{z}_{it})}, \tag{5}$$
where $\hat{\alpha}_s^0$ is the intrinsic probability of prescribing given state s and zit includes the transformed Detailingit and Samplingit variables. To ensure identification of the states, we impose the restriction that the choice probabilities are non-decreasing in the behavioral states. That is, $\hat{\alpha}_1^0 \leq \cdots \leq \hat{\alpha}_S^0$ is imposed by setting $\hat{\alpha}_1^0 = \alpha_1^0$ and $\hat{\alpha}_s^0 = \hat{\alpha}_{s-1}^0 + \exp(\alpha_s^0)$ for all $s = 2, \ldots, S$ at the mean of the vector of covariates, zit. There are several advantages for using the binomial distribution in the current application. First, accounting for category prescriptions allows us to control for variation in patients' category demand. For example, consider a physician who experiences a sudden increase of patients in a particular month for non-marketing reasons (e.g.,
practice expansion). In such a case, and in contrast to the binomial model, a prescription volume model (e.g., Poisson) would attribute the change in new prescriptions to marketing activities. Second, category prescriptions help to control for seasonal or time-specific effects that may affect the market or the specific physician. Third, the binomial distribution can easily handle extreme values of share of new prescriptions observed in our data (0 and 1). Following standard notation in HMMs (McDonald and Zucchini 1997), we write the vector of state-dependent probabilities as a diagonal matrix Mit. To summarize, the nonhomogeneous HMM captures the dynamics in physician prescription behavior and allows marketing activities to have both short- and long-term effects. To capture dynamics, we allow physicians to stochastically transition among latent prescription-behavior states. The marketing actions are included in the state-dependent decision (zit in equation (5)) to capture their short-term effect. This means that, conditional on the physician's current state, marketing interventions may have an immediate effect on prescription behavior. Additionally, the marketing actions are included in the transition probabilities (zit−1 in equation (2)) to capture their long-term effect. This means that marketing interventions can move the physician from one state to another (possibly more favorable) state. Such a regime shift may have long-term impact on the physician's decisions depending on the stickiness of the states. 7.4.4 Model Estimation Let $(Y_{i1}, \ldots, Y_{it}, \ldots, Y_{iT_i})$ denote a sequence of Ti drug prescription observations for physician i (i = 1, . . ., N). Given the HMM structure, the likelihood function for a set of N physicians can be succinctly written as
$$L = \prod_{i=1}^{N} P(Y_{i1}, \ldots, Y_{it}, \ldots, Y_{iT_i}) = \prod_{i=1}^{N} \boldsymbol{\pi}_i' M_{i1} \prod_{t=2}^{T_i} Q_{it} M_{it} \mathbf{1}, \tag{6}$$
where $\mathbf{1}$ is an S×1 vector of ones. To ensure that cross-individual heterogeneity is distinguished from time dynamics, we specify the HMM parameters $\theta_i = \{\tau_{is1}, \ldots, \tau_{is,S-1}, \boldsymbol{\rho}_{is}, \boldsymbol{\alpha}_{is}\}_{s=1}^{S}$ at the individual level and use a hierarchical Bayesian Markov chain Monte Carlo (MCMC) procedure for parameter estimation (see the electronic companion, available as part of the online version that can be found at https://www.informs.org/Pubs/MktSci/Online-Supplements, for the complete specification of the prior and the full conditional distributions).
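Equation (6) is the standard HMM forward computation. The sketch below evaluates it for one physician from generic inputs: an initial state vector, per-period transition matrices Q, and diagonal matrices M whose entries are the binomial prescription probabilities of equation (4). The inputs reuse the average values reported later in tables 7.5 and 7.6 purely for illustration; this is a sketch of the likelihood recursion, not the authors' estimation routine.

```python
# Sketch of the likelihood in equation (6) for a single physician:
# L_i = pi' M_1 (prod_{t=2}^{T} Q_t M_t) 1, evaluated with a forward recursion.
# All numbers below are illustrative placeholders.
import numpy as np
from scipy.stats import binom

def physician_likelihood(pi0, Q_list, p_states, y, W):
    """pi0: initial state probabilities (length S)
    Q_list: transition matrices for t = 2..T (each S x S)
    p_states: per-period prescription probabilities by state (T x S), from eq. (5)
    y, W: new-drug and category prescriptions per month (length T)."""
    T, S = p_states.shape
    # M_t is diagonal with Binomial(W_t, p_st) probabilities of the observed y_t.
    M = [np.diag(binom.pmf(y[t], W[t], p_states[t])) for t in range(T)]
    alpha = pi0 @ M[0]                    # forward vector at t = 1
    for t in range(1, T):
        alpha = alpha @ Q_list[t - 1] @ M[t]
    return alpha.sum()                    # equals pi' M_1 (prod Q_t M_t) 1

# Tiny example: 3 states, 4 months.
pi0 = np.array([1.0, 0.0, 0.0])           # all physicians start in state 1 (eq. 1)
Q = np.array([[0.75, 0.25, 0.00],
              [0.17, 0.78, 0.05],
              [0.15, 0.46, 0.39]])
p_states = np.tile([0.004, 0.062, 0.196], (4, 1))
y = np.array([0, 1, 2, 3]); W = np.array([20, 22, 25, 21])
print(physician_likelihood(pi0, [Q] * 3, p_states, y, W))
```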
To be able to interpret and compare physicians' behavior across states, we set the intrinsic probabilities of prescribing $\Phi = \{\alpha_s^0\}_{s=1}^{S}$ in equation (5) to be common across physicians. HMMs are shown to be parametrically and non-parametrically identified under mild identification restrictions (Ephraim and Merhav 2002; Henry et al. 2009; Ryden 1994). Our model specification, however, differs from extant HMMs because it allows both the state-dependent vector and the transition matrix to be a function of covariates in a heterogeneous random-effect framework. Accordingly, we ran several simulations to demonstrate the ability of our estimation algorithm to recover the model's parameters and the number of states under different modeling and data scenarios. Using the context of our pharmaceutical study to simulate data, we find that our estimation procedure does well in recovering the model's parameters and the number of states, even for smaller sample size and number of time periods than the ones observed in our data. Specifically, we could fully recover the true number of states and all the model parameters for a sample size N ≥ 100 and number of time periods T ≥ 8. Further details of the simulation analyses are available upon request from the authors. 7.5
Empirical Results
In this section, we report the results of estimating the nonhomogeneous HMM model using the pharmaceutical data set described in Section 7.3. We use the first twenty months of the data to calibrate the model and the last four months for validation. We ran the hierarchical Bayes estimation for 300,000 iterations. The first 200,000 iterations were used as a “burn-in” period, and the last 100,000 iterations were used to estimate the conditional posterior distributions. The HMM exhibits a high degree of autocorrelation between successive MCMC draws. We therefore use the adaptive Metropolis—Hasting procedure (Atchadé 2006) to improve convergence and mixing properties. Convergence was assessed by running multiple parallel chains following Gelman and Rubin’s (1992) criterion. 7.5.1 Model Selection To infer the number of states that best represents our data, we estimated the HMM for varying number of states. Based on the
Table 7.2 Selecting the Number of States

States   LMD       DIC/2    Log BF   Validation log-likelihood
1        −10,854   11,001   –        −2,842
2        −9,037    9,207    1,817    −2,240
3        −8,468    8,791    568      −2,171
4        −8,489    8,859    −21      −2,179
Note: The best model in each column is in bold.
log-marginal density (LMD), log Bayes factor (Log BF), the deviance information criterion (DIC) value, and the validation log-likelihood for the validation periods (periods 21–24) criteria, we selected a three-state model (see table 7.2).6 7.5.2 Predictive Validity We compare the predictive validity of the selected three-state HMM relative to four benchmark models: two nested versions of the HMM; a latent class (LC) model; and a recency, frequency, and monetary value (RFM) model. The last two are commonly used in the literature to capture heterogeneity and dynamics in buying behavior. In all models, we assume that prescriptions follow a binomial distribution and use an MCMC approach to estimate the models’ parameters. Nested HMM We estimate two nested versions of our full three-state HMM (full HMM-3). The first is a fixed-parameter, three-state HMM, where the parameters do not vary across physicians (fixed-parameter HMM-3). Comparing the full HMM to this model allows us to assess the magnitude of heterogeneity among physicians in the sample. The second is a three-state HMM with a stationary transition matrix. In this model the marketing activities are included only in the conditional choice component (M), allowing for only short-term effect of marketing actions (stationary HMM-3). Comparing the full HMM to this model allows us to assess the value of capturing the long-term effect of detailing and sampling. LC Model The latent class model of Kamakura and Russell (1989) captures heterogeneity in customer behavior through a set of latent segments (or states). However, unlike the HMM, the LC model cannot capture dynamics because customers cannot transition among
Table 7.3 Predictive Validity
Model                    LMD       Validation log-likelihood   RMSE
Full HMM-3               −8,468    −2,171                      0.075
Stationary HMM-3         −8,597    −2,177                      0.077
Fixed-parameter HMM-3    −9,334    −2,232                      0.089
RFM                      −9,084    −2,261                      0.075
Latent class             −10,495   −2,357                      0.087
Note: The best model in each column is in bold.
segments. Thus, the LC model can be viewed as a special case of a HMM. We estimate this model for three segments to emphasize the differences between a model that accounts for heterogeneity only and a model that accounts for both heterogeneity and dynamics. RFM model One of the models most commonly used to capture dynamics and manage the firm’s customer base is the RFM model (Pfeifer and Carraway 2000). We construct the RFM variables as follows. Recency corresponds to the number of months since the last new prescription. Frequency corresponds to the average incidence of new prescription up to the current time period. Monetary value is measured by the average number of monthly new prescriptions of the new drug up to the current time period. These variables are defined at the physician level and are updated every month. Additionally, we include the transformed detailing and sampling variables to account for the effect of marketing activities. Based on the validation log-likelihood and the RMSE criteria (see table 7.3), the selected three-state HMM (full HMM-3) predicted the holdout prescription data best. Comparing the fit and predictive ability of the full HMM-3 and the stationary HMM-3, we conclude that by incorporating the effect of detailing and sampling in both the transition and conditional choice matrices, we not only capture and disentangle the short- and long-term effects of detailing and sampling, but also better represent the physicians’ behavior. The relatively poor predictive ability of the latent class model suggests a high degree of dynamics in the physicians’ prescription behavior. Finally, the RFM model shows a relatively good predictive ability, albeit slightly worse than the full HMM-3. Indeed models that include lagged dependent variables as covariates, such as the RFM or the Guadagni and Little (1983) models,
tend to have good predictive ability. However, these models provide little insight into how the observed-state dynamics can be used to assess the enduring effects of marketing activities and to dynamically allocate these activities across physicians, which is the main objective of our research. 7.5.3 The HMM’s Parameter Estimates We now discuss the parameter estimates for the three-state HMM (Full HMM-3). In table 7.4, we report the posterior means and posterior standard deviations of the parameters, as well as the 95% heterogeneity intervals. We then use the parameter estimates to: (1) interpret the three HMM states; (2) investigate physicians’ dynamics; and (3) disentangle the short- and long-term effects of detailing and sampling. Interpreting the States To characterize the three states, we convert the intercept parameters (α 10 , α 20 , α 30) in table 7.4 into prescription probabilities (i.e., share of new prescriptions) conditional on being in each state (equation (5)) with and without detailing and sampling. The results in table 7.5 suggest that, on average, the share of new prescriptions of physicians in the first state is very close to zero. Accordingly, we call this the “inactive” state. In the second state, physicians present a somewhat more favorable prescription behavior toward the new drug—prescribing to the new drug 6% of their total volume of new prescriptions in the category. Thus, we call this the “infrequent” state. In the third state, physicians frequently prescribe the drug to their patients, with new prescription share nearing 20%. Consequently, we label this state as the “frequent” state. Note that even in the frequent state, the share of new prescriptions reaches only 20%. This result suggests that, even at the frequent state, physicians do not “run out” of patients for whom to prescribe the drug. This result is typical for a new drug. To further characterize the states, we now focus on the prescription dynamics as physicians transition among the three states. Physicians Dynamics Similar to table 7.5, table 7.6 converts the mean posterior transition matrix parameters to probability transition matrices, with and without the effect of marketing activities. Examining the lefthand side matrix in table 7.6, which represents the mean transition matrix with no detailing or sampling, we observe a high degree of stickiness in the inactive and infrequent states. That is, on average, physicians in these states are very likely to remain in the
Table 7.4 Posterior Means, Standard Deviations, and 95% Heterogeneity Intervals Heterogeneity interval
Transition matrix Threshold parameters Low threshold—state 1 High threshold—state 1 Low threshold—state 2 High threshold—state 2 Low threshold—state 3 High threshold—state 3 Marketing effects Detailing—state 1 Detailing—state 2 Detailing—state 3 Sampling—state 1 Sampling—state 2 Sampling—state 3 Conditional choice State specific effects Intercept—state 1 Intercept—state 2 Intercept—state 3 Marketing effects Detailing—state 1 Detailing—state 2 Detailing—state 3 Sampling—state 1 Sampling—state 2 Sampling—state 3
Parameter label
Posterior mean
Posterior std. dev.
τ11 τ12 τ21 τ22 τ31 τ32
0.36
0.12
1.67
0.13
−1.80 1.53
0.14
−1.98 0.77
0.23
ρ1d ρ2d ρ3d ρ1s ρ2s ρ3s
α 10 α 20 α 30 α 1d α 2d α 3d α 1s α 2s α 3s
2.5%
97.5%
−0.34 1.10
1.53
−2.79 0.52
−0.72 2.31
0.19
−3.45 0.07
−1.16 1.30
Long-term 0.31 0.12
−0.36
0.88
0.02
0.15
−0.56
0.47
0.02
0.17
−0.64
0.44
0.11
2.05
0.18
0.11
−0.43
0.78
0.21
0.13
−0.26
0.82
0.28
0.17
−0.19
0.58
1.05
−4.98 0.83 0.19 Short-term 0.27
0.11 0.05 0.04 0.11
−0.72
0.00
0.05
−0.61
0.51
−0.04 0.09
0.09
−0.42
0.46
0.11
−0.73
0.98
0.06
0.05
−0.43
0.60
0.02
0.08
−0.42
0.51
Note: The 95% heterogeneity interval indicates that 95% of the physicians have a posterior mean that falls within that interval.
Table 7.5 State-Specific Share of New Prescription Estimates With and Without Sampling and Detailing
States            No marketing activities   Detailing only   Sampling only
Inactive (p1)     0.004                     0.007            0.004
Infrequent (p2)   0.062                     0.062            0.067
Frequent (p3)     0.196                     0.184            0.201
Note: The three columns titled “No marketing activities,” “Detailing only,” and “Sampling only” represent the mean share of new prescriptions with no detailing or sampling, with the average number of detailing, and with the average number of sampling, respectively.
Table 7.6 Posterior Means of the Transition Matrix Probabilities Across Physicians

                  No marketing activities        Detailing only                 Sampling only
From \ To         Inactive  Infreq.  Freq.       Inactive  Infreq.  Freq.       Inactive  Infreq.  Freq.
Inactive          0.75      0.25     0.00        0.62      0.38     0.00        0.70      0.30     0.00
Infrequent        0.17      0.78     0.05        0.16      0.79     0.05        0.13      0.81     0.06
Frequent          0.15      0.46     0.39        0.15      0.45     0.40        0.10      0.41     0.49
Note: The detailing and sampling matrices are calculated assuming the firm allocates the average number of details and samples to each physician.
same state in the next period. In contrast, in the frequent state, physicians are more likely to drop to the infrequent state than they are to stay in the frequent state. Thus, consistent with Janakiraman et al. (2008), who find a high degree of physician persistence, we find that physicians are reluctant to fully adopt the new drug as they stick to the inactive and infrequent states. It should be noted that we estimate random-effect parameters for the transition matrix (equation (3)). Thus, a separate set of transition matrices similar to the ones presented in table 7.6 can be obtained for each physician. Dynamics in State Membership We further examined physicians' dynamics by calculating the state membership distribution across physicians over time. We use the filtering approach (McDonald and Zucchini 1997) to calculate the probability that physician i is in state s at period t. The filtering probability is given by equation (7):
$$P(X_{it} = s \mid Y_{i1}, \ldots, Y_{it}) = \boldsymbol{\pi}_i' M_{i1} \prod_{\tau=2}^{t} Q_{i\tau}\, m_{is\tau} \Big/ L_{it}, \tag{7}$$
Figure 7.2 Distribution of Physicians' State Membership Over Time (number of physicians in the inactive, infrequent, and frequent states by month).
where $m_{is\tau}$ is the sth column of the matrix $M_{i\tau}$ and Lit is the likelihood of the observed sequence of physician i's decisions up to time t, which is given by $L_{it} = \boldsymbol{\pi}_i' M_{i1} \prod_{l=2}^{t} Q_{il} M_{il} \mathbf{1}$. Figure 7.2 shows that the majority of physicians quickly moved from the inactive state to the infrequent state and to a lesser extent to the frequent state. It took approximately ten months for the aggregate distribution of physicians' state membership to stabilize at approximately 28%, 51%, and 21% in the inactive, infrequent, and frequent states, respectively. Disentangling the Short- and Long-Term Effects of Detailing and Sampling The nonhomogeneous HMM allows us to disentangle the total effect of marketing activities into two components: immediate and enduring effects. The immediate impact of detailing and sampling can be assessed by the effect that these activities have on the share of new prescriptions conditional on being in a particular state (see equation (5)
and table 7.5). On the other hand, the enduring effect of detailing and sampling can be assessed by their effect on the transitions between the states (see equations (2) and (3) and table 7.6). The results in tables 7.4 and 7.5 show that, on average, detailing and sampling have a relatively small short-term effects. This result is consistent with the finding of Mizik and Jacobson (2004). Furthermore, the short-term impact of detailing is strongest for physicians in the inactive state; that is, consistent with prior research (e.g., Narayanan et al. 2005), we find that detailing primarily plays a role in affecting product adoption. In contrast, when physicians are in the frequent state, we find an average small negative (although statistically insignificant) short-term effect for detailing. This result is consistent with the finding of Manchanda and Chintagunta (2004), who suggest that physicians may be conscious about the pressure being put on them by the companies’ sales force and the possible physicians’ backlash as a result of excessive marketing exposure.7 In contrast to their relatively small short-term effects, detailing and sampling have, on average, strong enduring effects. The results in table 7.4, and in the center and right matrices in table 7.6, show that detailing and sampling have substantial effects in switching physicians from lower states to higher ones. By comparing the transition matrices without marketing interventions (left side of table 7.6), with detailing only (center of table 7.6), and with sampling only (right side of table 7.6), we can see that detailing has a strong effect in moving physicians away from the inactive state and sampling is most effective in keeping them in the frequent state. Thus, whereas detailing may be more useful as an acquisition tool, sampling is more useful as a retention tool. A possible explanation for this result is that when physicians are in the inactive state, they are more receptive to new information about the drug. Then, as they move to the frequent state and are familiar with the drug, physicians can primarily benefit from receiving free samples to encourage them to keep prescribing the drug. We take advantage of this behavior when optimizing the detailing and sampling allocation. Magnitude and Duration of the Marketing Actions Effects To assess the magnitude and duration of the impact of detailing and sampling, we use the individual-level parameter estimates from the HMM to simulate the effect of targeting one additional detail or sample to each physician on the number of prescriptions in the first month (shortterm) or the following nineteen months (long-term) after targeting.
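The magnitude-and-duration exercise can be sketched as a simple state-space simulation: start from a belief over the three states, apply the detailing (or baseline) transition matrix in the first month only and the baseline matrix thereafter, and compare expected prescriptions month by month. The sketch below reuses the posterior means reported in tables 7.1, 7.5, and 7.6 purely for illustration; the authors' simulation uses the full individual-level posterior draws rather than these averages.

```python
# Sketch of the "one additional detail" simulation: compare expected prescriptions
# with and without a single marketing touch in month 1. Inputs reuse the average
# values in tables 7.1, 7.5, and 7.6 for illustration only.
import numpy as np

Q_base = np.array([[0.75, 0.25, 0.00],    # no marketing (table 7.6)
                   [0.17, 0.78, 0.05],
                   [0.15, 0.46, 0.39]])
Q_detail = np.array([[0.62, 0.38, 0.00],  # average detailing (table 7.6)
                     [0.16, 0.79, 0.05],
                     [0.15, 0.45, 0.40]])
share_by_state = np.array([0.004, 0.062, 0.196])  # table 7.5, no marketing
category_rx_per_month = 22.5                      # table 7.1

def expected_rx_path(Q_first, months=20, start=np.array([0.28, 0.51, 0.21])):
    """Expected new-drug prescriptions per month, applying Q_first in month 1
    and the baseline matrix afterwards."""
    belief, path = start.copy(), []
    for t in range(months):
        belief = belief @ (Q_first if t == 0 else Q_base)
        path.append(category_rx_per_month * float(belief @ share_by_state))
    return np.array(path)

lift = expected_rx_path(Q_detail) - expected_rx_path(Q_base)
print("extra prescriptions by month:", np.round(lift, 3))
print("total effect of one detail over the horizon:", round(lift.sum(), 2))
```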
Figure 7.3 Duration of the Effect of Marketing Actions (percentage increase in prescriptions over time from one additional detail or sample).
Figure 7.3 shows the percentage increase in prescriptions over time. Relative to a no-detailing and no-sampling policy, we find that in the long run, one detail (sample) produces 0.55 (0.11) additional prescriptions. We observe that in the long run the effect of detailing is both stronger in magnitude and longer in duration relative to the effect of sampling. In particular, the effect of detailing in the long term is about five times the effect of sampling. However, the short-term (one month) effect of sampling is 5.5 times that of detailing. The duration of the effectiveness of detailing and sampling is ten and five months, respectively, after which the increase in prescriptions as a result of the marketing action is less than 1%. Moreover, only 25% (35%) of the total effect of detailing (sampling) occurs in the first two months. Thus, a model that ignores the long-term effect of detailing and sampling is likely to severely underestimate their effectiveness. Similarly, the short-term elasticities for detailing and sampling are much smaller compared to their long-term elasticities (table 7.7). Furthermore, in the short term, sampling has a stronger effect relative to detailing; in the long term, detailing has a stronger effect. These elasticities are consistent in magnitude with the elasticities reported by Manchanda and Chintagunta (2004) and Mizik and Jacobson (2004).
Table 7.7 Detailing and Sampling Elasticities

Marketing action   Short-term   Long-term   Total
Detailing          0.002        0.652       0.654
Sampling           0.021        0.232       0.253
7.5.4 Endogeneity Pharmaceutical marketing resources are often targeted based on physicians’ category prescription volume (Manchanda and Chintagunta 2004). This targeting approach may lead to an endogenous relationship between prescription behavior and marketing efforts. We used the approach proposed by Manchanda et al. (2004) to check for the presence of endogeneity in our study. Specifically, we estimated a simultaneous system of equations where the share of new prescriptions is modeled using a one-state HMM. Detailing is modeled as a Poisson process, where the rate parameter is specified as a function of the physician-level intercept and the detailing response parameters from the share of new prescription model. The system of equations can be written as: Yit ~ Binomial(Wit, pit),
(8)
pit = exp(β0i + β1i ∙ dit)/(1 + exp(β0i + β1i ∙ dit)),
(9)
dit ~ Poisson(λit),
(10)
λit = exp(γ0i + γ1 ∙ β0i + γ2 ∙ β1i),
(11)
where dit is the number of details received by physician i in month t. Endogeneity in detailing exists if the parameters relating the number of details to the physician’s propensity to prescribe the new drug (γ1) and responsiveness to detailing (γ2) are significantly different from zero. Estimating the full system of equations (8)–(11) using our data, we find γˆ 1 = 0.029 and γˆ 2 = −0.085 . Both coefficients are not significantly different from zero. The 95% posterior confidence intervals are, respectively, [−0.167, 0.137] and [−0.690, 0.472]. Therefore, we fail to reject the hypothesis that endogeneity is not present in our study. Full details of this endogeneity analysis are available in the electronic companion. There are several reasons why this result runs counter to previous findings in the literature. First, our application involves a new product.
Thus, endogeneity is less likely in the earlier stages of the drug’s diffusion, prior to observing actual response to the marketing efforts for the drug. Second, unlike previous research, we model the share of new prescriptions rather than the number of new prescriptions. The former variable is less likely to be endogenous because the share of new prescriptions “controls” for the physician’s category prescription volume. Next, we discuss how to use the parameter estimates from the nonhomogeneous HMM to dynamically allocate marketing resources across physicians. 7.6
The POMDP Procedure for Resource Allocation
In this section we outline the formulation of a POMDP and the optimization procedure we use to derive the resource allocation policy. Two aspects of the HMM that make the dynamic optimization difficult are that the firm has uncertainty regarding: (1) the physicians' state at any period t; and (2) how the physicians may evolve over time through the prescription-behavior states. In other words, unlike most dynamic programming (DP) problems that use observed state variables (e.g., past purchases), the state variable Xit in our model is only probabilistically observed. To address the state uncertainty, we formulate the dynamic optimization problem as a POMDP. A POMDP is a sequential decision problem, pertaining to a dynamic Markovian setting, where the information concerning the state of the system is incomplete. Thus, the POMDP approach is well suited for handling control problems of HMMs (see Lovejoy 1991a for a survey). In the POMDP approach, the first step is to define the firm's beliefs about physicians' latent-state membership. We define bit(s) as the firm's belief that physician i is in state s at time t. After observing the physician's decision (yit) and its own marketing intervention decision (zit), the firm can update its beliefs in a Bayesian manner. Specifically, using Bayes' rule, the transition probability estimates (qis′st) from equation (3), and the conditional choice probabilities (Pist) from equation (5), the firm's beliefs about the physician's state can be updated from period t to t + 1 as
$$b_{i,t+1}(s \mid B_{it}, y_{it}, \mathbf{z}_{i,t-1}, \mathbf{z}_{it}) = \frac{\sum_{s'=1}^{S} b_{it}(s')\, q_{is'st}\, P_{ist}}{\sum_{s'=1}^{S} \sum_{l=1}^{S} b_{it}(s')\, q_{is'lt}\, P_{ilt}}, \tag{12}$$
where $\sum_{s=1}^{S} b_{it}(s) = 1$ and $B_{it} = (b_{it}(1), \ldots, b_{it}(S))'$. We model the pharmaceutical firm's resource allocation as a DP problem under state uncertainty. The objective of the firm is to determine, for each period, the optimal allocation of detailing and sampling, so as to maximize the sum of discounted expected future profits over an infinite planning horizon. The optimal resource allocation is the solution to the following problem:
$$\max_{\mathbf{z}_{it}}\; E\!\left[\sum_{\tau=t}^{\infty} \delta^{\tau-t} R_{i\tau}\right], \tag{13}$$
where δ ∈ [0, 1] is the discount rate, $E[R_{it}] = \sum_{s=1}^{S} b_{it}(s)\, r_{ist}$, and rist is the expected profit the firm earns at period t if physician i is in state s and given marketing intervention zit. The firm's optimal scheduling of marketing interventions is the solution to the dynamic program from that time forward, and it needs to satisfy the Bellman optimality equation (Bertsekas 2007):
$$V_i^*(B_{it}) = \max_{\mathbf{z}_{it}}\; E\!\left[\sum_{\tau=t}^{\infty} \delta^{\tau-t} R_{i\tau}\right] = \max_{\mathbf{z}_{it}} \left\{ \sum_{s=1}^{S} b_{it}(s)\, r_{ist} + \delta \sum_{s=1}^{S} \sum_{y_{it} \in D} b_{it}(s)\, P(y_{it} \mid b_{it}(s), \mathbf{z}_{it})\, \big[V_i^*(B_{i,t+1})\big] \right\}, \tag{14}$$
where $V_i^*(B_{it})$ denotes the maximum discounted expected profits that can be obtained for physician i given the current beliefs Bit. The optimal allocation is thus
$$\mathbf{z}_{it}^*(B_{it}) = \arg\max_{\mathbf{z}_{it}} \left\{ \sum_{s=1}^{S} b_{it}(s)\, r_{ist} + \delta \sum_{s=1}^{S} \sum_{y_{it} \in D} b_{it}(s)\, P(y_{it} \mid b_{it}(s), \mathbf{z}_{it})\, \big[V_i^*(B_{i,t+1})\big] \right\}. \tag{15}$$
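A compact sketch of the belief machinery in equations (12)–(15) follows: a Bayesian belief update after observing a month of prescriptions, and a one-period Bellman backup over a small discrete action set. It is a toy illustration under stated assumptions (a crude stand-in for the continuation value, illustrative matrices and costs drawn from tables 7.5 and 7.6 and section 7.7), not the Lovejoy grid-interpolation solver used in the paper.

```python
# Toy sketch of equations (12)-(15): belief update and a one-step Bellman backup.
# Transition/emission inputs are illustrative; the paper's solver uses Lovejoy's
# fixed-grid interpolation over the belief simplex rather than this shortcut.
import numpy as np
from scipy.stats import binom

share_by_state = np.array([0.004, 0.062, 0.196])      # eq. (5) at mean covariates
Q_by_action = {                                       # action -> transition matrix
    "none":   np.array([[0.75, 0.25, 0.00], [0.17, 0.78, 0.05], [0.15, 0.46, 0.39]]),
    "detail": np.array([[0.62, 0.38, 0.00], [0.16, 0.79, 0.05], [0.15, 0.45, 0.40]]),
}
price, cost = 300.0, {"none": 0.0, "detail": 80.0}    # $ per prescription / per detail
delta, W = 0.985, 22                                  # monthly discount, category volume

def update_belief(belief, action, y):
    """Equation (12): posterior over states after observing y prescriptions."""
    lik = binom.pmf(y, W, share_by_state)             # P_ist for the observed y
    joint = (belief @ Q_by_action[action]) * lik
    return joint / joint.sum()

def one_step_value(belief, value_fn):
    """Equations (14)-(15): action maximizing current reward + discounted value."""
    best_action, best_value = None, -np.inf
    for a, Qa in Q_by_action.items():
        reward = float(belief @ (price * W * share_by_state)) - cost[a]
        cont = 0.0
        for y in range(W + 1):                        # expectation over outcomes
            prob_y = float((belief @ Qa) @ binom.pmf(y, W, share_by_state))
            if prob_y > 0:
                cont += prob_y * value_fn(update_belief(belief, a, y))
        total = reward + delta * cont
        if total > best_value:
            best_action, best_value = a, total
    return best_action, best_value

# Example: value future beliefs by their expected one-period profit (a crude proxy).
value_fn = lambda b: float(b @ (price * W * share_by_state))
print(one_step_value(np.array([0.28, 0.51, 0.21]), value_fn))
```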
The Bellman optimality equation can be rewritten as Vi* = Γ (Vi* ). That is, the problem reduces to find a fixed point of the mapping Γ. For the POMDP case, Γ has been shown to be a contraction mapping; thus, the problem has a unique fixed point solution (Lovejoy 1991b). The exact solution to POMDPs involves complex computations and can be found and confirmed only for problems of low computational complexity (Littman 2009). The complexity arises mainly from the use of a continuous space of beliefs to represent the uncertainty under the partial observability of the Markov decision process. Exact
algorithms, like the enumeration algorithm (Monahan 1982) or the witness algorithm (Kaelbling et al. 1998), are not practical for solving our pharmaceutical problem, because the marketing decision variables (detailing and sampling) and the outcome variable (physician's prescriptions) can take on a large number of values. Similarly, other exact algorithms like linear support (Cheng 1988) or incremental pruning (Cassandra et al. 1997) are not possible because they lead to a large number of optimization problems to be solved (Hauskrecht 2000). Consequently, we use the approximation algorithm proposed by Lovejoy (1991b) to solve our infinite-horizon POMDP and find a closed-loop policy. Lovejoy's method combines two approximation approaches: (1) value iteration; and (2) value function interpolation. Value iteration (Bellman 1957) has been used extensively to solve infinite-horizon discounted DP problems in general, and POMDPs in particular (Lovejoy 1991a). The value function interpolation (Hauskrecht 2000; Lovejoy 1991b) procedure has been used to approximate the continuous state space of beliefs using a grid of belief points and then interpolating for other points in the state space. We adopt the fixed-grid interpolation based on Freudenthal triangulation (Lovejoy 1991b). The procedure constructs a piecewise-linear function by evaluating only the vertices of the triangulation. Sondik (1978) demonstrates that the value function of a POMDP can be approximated arbitrarily closely by a convex piecewise-linear function. Lovejoy (1991b) shows that his proposed approximation algorithm is a contraction mapping, and, thus, any stationary point found will be the unique fixed point, "in equilibrium" (Bertsekas 2007).8 Thus, in what follows, we use "optimal" to refer to the approximate solution to the optimization problem. One of the advantages of the Bayesian estimation procedure is that it provides a full posterior distribution for each individual-level parameter. These distributions reflect the uncertainty in the estimation. We incorporate this uncertainty in the optimization procedure by integrating out the parameters' distribution over the MCMC draws (Ansari and Mela 2003) when solving the individual-level POMDP. Specifically, given the estimation results, the value function in equation (14) is calculated by
$$V_i^*(B_{it}) = \int_{\Phi} \int_{\theta_i} V_i^*(B_{it} \mid \Phi, \theta_i)\, f(\Phi)\, f(\theta_i)\, d\Phi\, d\theta_i \approx \frac{1}{K} \sum_{k=1}^{K} V_i^*(B_{it} \mid \Phi_k, \theta_{ik}), \tag{16}$$
where K is the number of retained MCMC draws, $\Phi_k$ is the set of fixed-effect parameters, and $\theta_{ik}$ is the set of random-effect parameters from the kth draw of the MCMC. Next, we use the POMDP procedure and the posterior distribution of the physician-level HMM parameter estimates to dynamically allocate detailing and sampling for each physician in our sample.
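As a minimal sketch of equation (16), the function below averages the optimal value of the physician-level POMDP over the retained MCMC draws. The names are placeholders: solve_pomdp_value stands in for the Lovejoy-style approximate solver described above, and draws is assumed to hold the K retained pairs $(\Phi_k, \theta_{ik})$.

```python
import numpy as np

def value_under_parameter_uncertainty(belief, draws, solve_pomdp_value):
    """Approximate V_i*(B_it) by averaging over K retained MCMC parameter draws,
    as in equation (16).

    belief: current belief vector B_it over the latent states
    draws: iterable of K (Phi_k, theta_ik) posterior draws
    solve_pomdp_value: callable (belief, Phi, theta) -> V_i*(B_it | Phi, theta)
    """
    values = [solve_pomdp_value(belief, Phi_k, theta_ik) for Phi_k, theta_ik in draws]
    return float(np.mean(values))
```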
7.7 Optimization Results
We obtain an optimal forward-looking dynamic (FL dynamic) policy for an infinite time horizon by solving the POMDP problem described previously. We compare the performance of the proposed policy to the performances obtained by three competing policies: (1) a no-marketing policy; (2) a myopic static policy; and (3) a myopic dynamic policy. The no-marketing policy does not allocate detailing or sampling for the entire time horizon. Both myopic policies consider only the short-term effects of detailing and sampling, and can be seen as a special case of equation (14) when δ = 0. The myopic static policy neglects physicians' dynamics in the planning horizon, whereas the myopic dynamic policy considers only the short-term effect of detailing and sampling in updating the firm's beliefs in each period. In other words, the myopic static policy identifies a unique set of physicians based on their initial state-membership probabilities and short-term responsiveness to detailing and sampling, and repetitively targets them during each period of the planning horizon. The myopic dynamic policy does the same except that the set of physicians to target, and the amounts of detailing and sampling to allocate, are optimized each month given the physicians' updated state-membership probabilities.

In solving the optimization problem, we make the following assumptions:9 the retail price of a prescription (including refills) p is $300, the cost of one detail $c_d$ is $80, the cost of one sample $c_s$ is $30, and δ is 0.985 (i.e., 0.9 yearly). The profit $r_{ist}$ in equation (14) can be specified as $r_{ist} = p\,W_{it}\,p_{ist} - c_d\,\mathrm{Detailing}_{it} - c_s\,\mathrm{Sampling}_{it}$, where $W_{it}$ is the total number of new prescriptions in the category written by physician i in month t, and $p_{ist}$ is the share of prescriptions of the new drug allocated by physician i in month t (see equation (5)). Additionally, we impose the constraint that each physician needs to be detailed in order to receive a sample. This constraint is common in the industry (Manchanda et al. 2004) and was observed in our data. To visualize the performance of each policy, we depict in figure 7.4 the effect of
[Figure 7.4: line plot of profits (×10^5, vertical axis, roughly 1.7 to 2.6) against months (0–20, horizontal axis) for the no-marketing, myopic static, myopic dynamic, and FL dynamic policies.]
Figure 7.4 Infinite Horizon Policy Performance Comparison Over Twenty Months.
applying each policy on physicians’ behavior for the first twenty months of the infinite planning horizon used to solve the DP problem. In figure 7.4, the proposed FL dynamic policy performs worse than the myopic policies during the first couple of months as it invests in moving the physicians’ base to higher states. However, within three to four months, the FL dynamic policy substantially outperforms the alternative policies, demonstrating the importance of accounting for dynamics in prescription behavior and the long-term effectiveness of detailing and sampling. Furthermore, the superior profit performance of the myopic dynamic policy over the static one emphasizes the importance of dynamically allocating resources to physicians, even if only short-term effects are considered. Finally, the results in figure 7.4 suggest that after six to eight months, the profits from all policies stabilize. Table 7.8 compares the resource allocation and profits of the alternative policies over the first 20 periods of the infinite planning horizon. The FL dynamic policy resulted in 33% return on investment (ROI) for detailing and sampling. This is substantially higher than the ROI obtained from the myopic static and myopic dynamic policies (20% and
Table 7.8
Comparison of the Resource Allocation Policies

Policy            Prescriptions   Details   Samples   Discounted budget ($)   Discounted profits ($)   Profits increase(a) (%)
No marketing      12,439          0         0         0                       3,243,450                –
Myopic static     16,715          4,080     6,940     464,859                 3,893,091                20
Myopic dynamic    17,567          4,322     7,930     507,243                 4,069,117                25
FL dynamic        20,761          8,583     18,356    1,077,750               4,319,797                33

a. The percentage increase in profits for each policy is relative to a no-marketing policy.
25%, respectively). These differences in ROI performance can be attributed to the failure of the two myopic policies to capture the long-term effect of detailing and sampling. Accordingly, both policies allocate fewer details and samples relative to the FL dynamic policy. The 8% (= 33% – 25%) improvement in ROI for the FL dynamic policy relative to the myopic dynamic policy stresses the importance of accounting for the long-term effect of detailing and sampling and the firm's forward-looking behavior. The 5% (= 25% – 20%) improvement in ROI for the myopic dynamic policy relative to the myopic static policy emphasizes the importance of dynamically allocating resources as physicians' behavioral states change over time.

In summary, our results highlight the possibly substantial financial implications from simultaneously accounting for the dynamics in consumer behavior and the long-term effect of marketing actions when allocating marketing resources.

7.7.1 Comparison with the Current Resource Allocation Policy

We compare the FL dynamic policy to the policy currently applied by the company during the last four months in our data (months 21–24). This analysis provides several insights. First, we find that the pharmaceutical firm is currently overspending on detailing and sampling. Under the proposed FL dynamic policy, the firm should cut its overall spending by 20%. This result is directionally consistent with the finding of Mizik and Jacobson (2004) and the industry cut on detailing and sampling efforts since the data period. Second, despite the 20% cut in spending, the FL dynamic policy allows the firm to increase its
prescriptions by 61.9%, generating an additional $412 in profits per physician per month.

Our targeting approach requires first estimating the HMM parameters for every physician in the potential target market and then using the POMDP procedure to optimize the allocation of detailing and sampling for each of these same physicians. Our optimization approach accounts for uncertainty in the parameter estimates by integrating over the posterior distribution of the parameters. If one wishes instead to estimate the HMM only for a sample of physicians, and then use the resulting posterior distributions of the parameters to optimize and target the allocation of detailing and sampling to the full physician base, the percentage improvement of the policies in table 7.8 relative to the no-marketing policy (and the improvement of the FL dynamic policy over the current policy) may be overstated (Mannor et al. 2007). As demonstrated by Mannor et al., because model parameters are estimated with error on a specific sample, value functions estimated for optimal policies using the same sample are on average positively biased.

We adapted the cross-validation approach of Mannor et al. (2007) to explore the extent to which the POMDP procedure we employ overstates the profit performance in such a context. First, we randomly divided our sample of physicians into two subsamples, a calibration sample and a validation sample, and separately estimated our HMM for each subsample. Second, because our estimation produces a full posterior distribution for each physician, we followed Ansari et al. (2000) and used the population distribution of the parameters from the calibration sample to infer the posterior distribution of the parameters of each physician in the validation sample. We then used these latter estimates to derive an optimal policy for each physician in the validation sample. We used this "calibration-sample optimal policy" to calculate the value function of each physician in the validation sample. Third, we used the validation sample parameters to derive the optimal policy for each physician in the validation sample. We used the "validation-sample optimal policy" to calculate the value function of each physician in the validation sample. The difference between the value function calculated in the second step and the value function calculated in the third step provides an estimate for the bias.

Consistent with Mannor et al. (2007), we find that the optimal FL dynamic policy and the myopic dynamic policy resulted in value
functions that are biased upward by 10.2% and 8.2%, respectively (see the electronic companion for more details about this analysis). However, it is clear from table 7.8 that, even after correcting for the bias suggested by Mannor et al., the improvement of the FL dynamic policy over the current and myopic policies remains substantial. As noted in Mannor et al., biases that arise from functional form and distribution assumptions are not corrected by the approach proposed above.

As we show below, the superior performance of the FL dynamic policy relative to the current policy is due to better decisions on which physicians to target, when to target them, and how many details and samples each physician receives.

Who Is Being Targeted
We compare the physicians targeted by the current policy with those targeted by the FL dynamic policy in terms of their responsiveness (elasticities) to marketing actions. Specifically, we simulate the effect of allocating one detail or one sample in the first period of the planning horizon (period twenty-one), and compute the average elasticities over the first four periods of the infinite planning horizon.10 We find that the FL dynamic policy targets physicians with substantially higher responsiveness to detailing and sampling relative to the current policy. The average detailing elasticities for the physicians targeted by the current and FL dynamic policies were 0.1 and 0.57, respectively. For sampling, the elasticities were 0.07 and 0.22 for the current and FL dynamic policies, respectively. Furthermore, the elasticities corresponding to the current targeting policy did not differ substantially from those corresponding to a random targeting policy (calculated by randomly shuffling the identity of the targeted physicians). Specifically, we find that the current policy is targeting physicians with low (and sometimes even negative) detailing and sampling coefficients. This result is consistent with the findings of Manchanda and Chintagunta (2004).

When to Target Resources
Because of the dynamic structure of our model, the proposed dynamic policy is capable of optimizing the timing of the detailing and sampling allocation to each physician. Figure 7.5 shows the distribution of detailing allocated per state. We can observe that, whereas the current policy allocates details almost uniformly across states, the FL dynamic policy allocates more details to physicians who are in the inactive state. Once these physicians have transitioned to the higher state, the allocation of details decreases. These
[Figure 7.5: bar chart of allocated details (0 to 1) by prescription state (inactive, infrequent, frequent) for the FL dynamic and current policies.]
Figure 7.5 Detailing Allocation per Prescription State.
results suggest that detailing should be used primarily as an acquisition tool.

Detailing Depth versus Breadth
We compared the physicians targeted by the FL dynamic policy with those targeted by the current policy in terms of breadth and depth of detailing over time. Overall, we find that the current policy targets almost twice as many physicians as the FL dynamic policy, but with fewer details. Specifically, over the four-month period, the current and FL dynamic policies have targeted, at least once, 85% and 44% of the physicians every month, with an average of 2.20 and 3.43 details per month, respectively. Thus, it appears that the pharmaceutical company is pursuing a "shotgun" approach to targeting, indiscriminately allocating details to most physicians. In contrast, the FL dynamic policy appears to suggest a "rifle" targeting approach, detailing fewer physicians more intensively. These latter physicians have higher responsiveness to detailing and are more likely to be in the inactive state (state 1). The emphasis of the FL dynamic policy on depth over breadth is in line with our finding that detailing is to be used primarily as an acquisition tool (see also Narayanan et al. 2005). Thus, for a successful physician acquisition, detailers need to educate potential adopters "in depth" about the uses and benefits of the new drug.
In summary, the improved profitability of the proposed dynamic policy relative to the current policy can be attributed to: (1) the targeting of physicians with higher responsiveness to marketing resources; (2) the timing of these resources depending on the physicians' behavioral states; and (3) a better tradeoff between breadth and depth of resource allocation.
7.8 Conclusions
This paper presents an integrative nonhomogeneous HMM model and a POMDP dynamic programming approach to dynamically target and allocate detailing and sampling across physicians. The HMM model accounts for physicians’ heterogeneity and captures the dynamics in physicians’ behavior and the long-term effect of marketing activities. The application of our modeling framework in the context of a new pharmaceutical drug introduction reveals several insights. First, we find three latent prescription-behavior states that characterize physicians’ dynamic prescription behavior. Second, for the particular drug studied, both detailing and sampling have long-term impact on physicians’ prescription behavior. Third, detailing is particularly effective as an acquisition tool, moving physicians from the inactive state, whereas sampling is mostly effective as a retention tool, keeping physicians in a high prescription-behavior state. Fourth, sampling has a stronger short-term effect than detailing, but detailing has a stronger long-term effect. Fifth, we demonstrate that ignoring the dynamics in physician buying behavior and the long-term effects of marketing activities leads to suboptimal allocation of marketing interventions. Specifically, using a counterfactual analysis, we demonstrate that a dynamic policy can lead to a substantial increase in profitability relative to the current and myopic policies, and that the firm should cut its marketing spending by 20% relative to the current policy. The optimal dynamic allocation of sampling and detailing involves first moving physicians away from the inactive state to the frequent state and then retaining these physicians in the frequent state. We highlight several limitations and directions that future research could explore. First, in our empirical application, we find no evidence of endogeneity in the detailing and sampling of the new drug. In general, if endogeneity is present, one could integrate into our modeling approach a targeting process equation along the lines of
Manchanda et al. (2004). Second, an alternative source of dynamics not considered in this research comes from the belief that physicians have foresight regarding their prescription-behavior evolution and the firm’s marketing resource allocation. One could extend our modeling framework by formulating a structural model of state dependence with forward-looking behavior (Erdem and Keane 1996). Third, our optimization procedure did not consider geographical, multiphysician practices, and intertemporal constraints on the sales-force allocation. Such constraints can be added to the optimization procedure. Generally, though, because salespeople detail multiple drugs, the firm often has flexibility in the detailing allocation for any particular drug. Fourth, recent studies have suggested that social interactions among physicians can influence their prescription behavior (Nair et al. 2010). Future research could extend our modeling approach to account for such effects. Finally, although we have applied our model in a pharmaceutical setting, our approach can be readily used in other application areas where firms individually target multiple marketing activities and possess longitudinal, customer-level, transaction data. Notes The first author gratefully acknowledges partial funding by FONDEF (project D06I1015) and the Millennium Institute on Complex Engineering Systems. The authors are also grateful for the support of the Marketing Science Institute through the Alden G. Clayton Dissertation Competition. They thank Asim Ansari, Brett Gordon, Sunil Gupta, Raghu Iyengar, Rajeev Kohli, three anonymous reviewers, and the area editor and editor in chief of Marketing Science for their helpful comments and suggestions. Reprinted by permission, Ricardo Montoya, Oded Netzer, and Kamel Jedidi, “Dynamic Allocation of Pharmaceutical Detailing and Sampling for Long-Term Profitability,” Marketing Science, volume 29, number 5, September–October, 2010. Copyright 2010, the Institute for Operations Research and the Management Sciences (INFORMS), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228 USA. An electronic companion to this paper is available as part of the online version that can be found at https:// www.informs.org/Pubs/MktSci/Online-Supplements. 1. Throughout the paper we use the term “optimal” to refer to our approximate solution to the optimization problem. 2. Throughout the paper, “prescriptions” refer only to new prescriptions made by the physician, excluding refills. 3. The sponsoring pharmaceutical firm did not use direct-to-consumer advertising for marketing the new drug. 4. A more general specification of estimating the vector πi did not provide significant improvement in fit.
5. To test if Wit is indeed exogenous to the marketing activities, we calculated for each physician the correlations between the number of details and samples he or she received for the new drug, and his or her category prescriptions (Wit) across the twenty-four months. The average correlations between category demand (Wit) and detailing and sampling are 0.026 and 0.023, respectively; both correlations are statistically insignificant. This analysis suggests that in the context of our empirical application, category demand may not be affected by marketing efforts. 6. We also tested a four-state model with an absorbing no-prescription (“defected”) state. The fit criteria for this model are LMD = −8517, DIC/2 = 8840, Validation log-likelihood = −2195. Therefore, a model where physicians can move to a defected state is rejected in favor of a model with three states or four states. 7. Another explanation that cannot be ruled out is that competitors may be aware of the favorable behavior of physicians in the frequent state and can consequently increase their marketing efforts to those physicians. However, there is no direct evidence of pharmaceutical companies reacting in such a way. 8. To further examine the accuracy of our approximation approach, we applied our optimization procedure to Sondik’s (1978) POMDP problem, for which the optimal solution is known. In addition, we used a pharmaceutical problem that is similar to our empirical application to compare the approximate solution to an infinite-horizon POMDP with a finite-horizon solution that can be solved numerically. In both cases, our approach was able to recover the optimal policy and accurately approximate the true value function. Further details are available in the electronic companion. 9. These estimates were determined based on discussions with the data provider and industry standards. The cost of one detail considers that three drugs are discussed during a ten to fifteen minute visit. Sampling costs include the drug itself, packaging, shipping, and storing costs. Based on treatment specifications for this condition, we assume one new prescription corresponds to a treatment of three months on average. That is, an average patient needs to obtain two additional refills of the drug. The procedure presented in this section could be easily modified given an alternative cost structure. 10. We cannot use a longer time horizon because there are only four holdout periods for the current policy. References Ansari, A., S. Essegaier, and R. Kohli. 2000. “Internet Recommendation Systems.” Journal of Marketing Research 37 (August):363–375. Ansari, A., and C. Mela. 2003. “E-customization.” Journal of Marketing Research 40 (2):131–145. Atchadé, Y. 2006. “An Adaptive Version for the Metropolis Adjusted Langevin Algorithm with a Truncated Drift.” Methodology and Computing in Applied Probability 8:235–254. Aviv, Y., and A. Pazgal. 2005. “A Partially Observed Markov Decision Process for Dynamic Pricing.” Management Science 51 (9):1400–1416. Bates, A. 2006. “Using ROI Data for Effective Decision Making in Pharmaceutical Marketing.” KeywordPharma Expert Review. (January 26), http://www.keywordpharma.com/ prods/bates.asp#top.
Bellman, R. 1957. Dynamic Programming. Princeton: Princeton University Press. Bertsekas, D. 2007. Dynamic Programming and Optimal Control. vol. I and II. Belmont, MA: Athena Scientific. Cassandra, A., M. Littman, and N. Zhang. 1997. “Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes.” In Proceedings of the 13th Annual Conf. on Uncertainty in Artificial Intelligence (UAI-97), Morgan Kaufmann, San Francisco, 54–61. Cheng, H. 1988. Algorithms for Partially Observable Markov Decision Processes. PhD thesis, University of British Columbia. Donohue, J. M., M. Cevasco, and M. B. Rosenthal. 2007. “A decade of direct-to-consumer advertising of prescription drugs.” New England Journal of Medicine 357 (7):673–681. Dubé, J., G. Hitsch, and P. Manchanda. 2005. “An Empirical Model of Advertising Dynamics.” Quantitative Marketing and Economics 3 (2):107–144. Dubé, J., G. Hitsch, and P. Rossi. 2009. “Do Switching Costs Make Markets Less Competitive?” Journal of Marketing Research 46 (4):435–445. Ephraim, Y., and N. Merhav. 2002. “Hidden Markov Processes.” IEEE Transactions on Information Theory 48 (6):1518–1569. Erdem, T., and M. Keane. 1996. “Decision-Making under Uncertainty: Capturing Dynamic Brand Choice Processes in Turbulent Consumer Goods Markets.” Marketing Science 15 (1):1–20. Erdem, T., and B. Sun. 2001. “Testing for Choice Dynamics in Panel Data.” Journal of Business & Economic Statistics 19 (2):142–152. Gelman, A., and D. Rubin. 1992. “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7:457–511. Gönül, F., F. Carter, E. Petrova, and K. Srinivasan. 2001. “Promotion of Prescription Drugs and Its Impact on Physicians Choice Behavior.” Journal of Marketing 65 (3):79–90. Guadagni, P., and J. Little. 1983. “A Logit Model of Brand Choice Calibrated on Scanner Data.” Marketing Science 2 (3):203–238. Hauser, J. R., G. L. Urban, G. Liberali, and M. Braun. 2009. “Website morphing.” Marketing Science 28 (2):202–223. Hauskrecht, M. 2000. “Value-Function Approximations for Partially Observable Markov Decision Processes.” Journal of Artificial Intelligence Research 13:33–94. Henry, M., Y. Kitamura, and B. Salanie. 2009. Identifying Finite Mixtures in Econometric Models. Working Paper, Columbia University, New York. Hitsch, G. 2006. “An Empirical Model of Optimal Dynamic Product Launch and Exit under Demand Uncertainty.” Marketing Science 25 (1):25–50. Janakiraman, R., S. Dutta, C. Sismeiro, and P. Stern. 2008. “Physicians’ Persistence and Its Implications for Their Response to Promotion of Prescription Drugs.” Management Science 54 (6):1080–1093. Jedidi, K., C. Mela, and S. Gupta. 1999. “Managing Advertising and Promotion for LongRun Profitability.” Marketing Science 18 (1):1–22.
Kaelbling, L., M. Littman, and A. Cassandra. 1998. “Planning and Acting in Partially Observable Stochastic Domains.” Artificial Intelligence 101:99–134. Kamakura, W., and G. Russell. 1989. “A Probabilistic Choice Model for Market Segmentation and Elasticity Structure.” Journal of Marketing Research 26 (4):379–390. Knox, G. 2006. Modeling and Managing Customers in a Multichannel Setting. PhD Dissertation, The Wharton School, University of Pennsylvania. Lewis, M. 2005. “A Dynamic Programming Approach to Customer Relationship Pricing.” Management Science 51 (6):986–994. Liechty, J., M. Wedel, and R. Pieters. 2003. “Global and Local Covert Visual Attention: Evidence from a Bayesian Hidden Markov Model.” Psychometrika 68 (4):519–541. Littman, M. 2009. “A Tutorial on Partially Observable Markov Decision Processes.” Journal of Mathematical Psychology 53:119–125. Lovejoy, W. 1991a. “A Survey of Algorithmic Methods for Partially Observed Markov Decision Processes.” Annals of Operations Research 28 (1):47–66. Lovejoy, W. 1991b. “Computationally Feasible Bounds for Partially Observed Markov Decision Processes.” Operations Research 39 (1):162–175. Manchanda, P., and P. Chintagunta. 2004. “Responsiveness of Physician Prescription Behavior to Salesforce Effort: An Individual Level Analysis.” Marketing Letters 15 (2–3):129–145. Manchanda, P., P. Rossi, and P. Chintagunta. 2004. “Response Modeling with Nonrandom Marketing- Mix Variables.” Journal of Marketing Research 41 (4):467–478. Mannor, S., D. Simester, P. Sun, and J. Tsitsiklis. 2007. “Bias and Variance Approximation in Value Function Estimates.” Management Science 53 (2):308–322. McDonald, I., and W. Zucchini. 1997. Hidden Markov and Other Models for Discrete Valued Time Series. London: Chapman and Hall. Mizik, N., and R. Jacobson. 2004. “Are Physicians ‘Easy Marks’? Quantifying the Effects of Detailing and Sampling on New Prescriptions.” Management Science 50 (12): 1704–1715. Monahan, G. 1982. “A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms.” Management Science 28 (1):1–16. Montgomery, A., S. Li, K. Srinivasan, and J. Liechty. 2004. “Modeling Online Browsing and Path Analysis Using Clickstream Data.” Marketing Science 23 (4):579–595. Moon, S., W. Kamakura, and J. Ledolter. 2007. “Estimating Promotion Response When Competitive Promotions Are Unobservable.” Journal of Marketing Research 44 (3): 503–515. Naik, P., K. Raman, and R. Winer. 2005. “Planning Marketing-Mix Strategies in the Presence of Interaction Effects.” Marketing Science 24 (1):25–34. Nair, H. 2007. “Intertemporal price discrimination with forward-looking consumers: Application to the U.S. market for console video games.” Quantitative Marketing and Economics 5 (3):239–292.
Nair, H., P. Manchanda, and T. Bhatia. 2010. Asymmetric Social Interactions in Physician Prescription Behavior: The Role of Opinion Leaders.” Journal of Marketing Research 47 (5). Narayanan, S., and P. Manchanda. 2009. “Heterogeneous Learning and the Targeting of Marketing Communications for New Products.” Marketing Science 28 (3):424–441. Narayanan, S., P. Manchanda, and P. Chintagunta. 2005. “Temporal Differences in the Role of Marketing Communication in New Product Categories.” Journal of Marketing Research 42 (3):278–290. Netzer, O., J. Lattin, and V. Srinivasan. 2008. “A Hidden Markov Model of Customer Relationship Dynamics.” Marketing Science 27 (2):185–204. Pfeifer, P., and R. Carraway. 2000. “Modeling Customer Relationships as Markov Chains.” Journal of Interactive Marketing 14 (2):43–55. Ryden, T. 1994. “Consistency and Asymptotically Normal Parameter Estimates for Hidden Markov Models.” Annals of Statistics 22 (4):1884–1895. Schweidel, D., E. Bradlow, and P. Fader. 2010. Portfolio Dynamics for Customers of a MultiService Provider. Working paper, The Wharton School, University of Pennsylvania. Simester, D., P. Sun, and J. Tsitsiklis. 2006. “Dynamic Catalog Mailing Policies.” Management Science 52 (5):683–696. Sondik, E. 1978. “The Optimal Control of Partially Observable Markov Processes Over the Infinite Horizon: Discounted costs.” Operations Research 26 (2):282–304.
8 Morphing Banner Advertising
Glen L. Urban, Guilherme Liberali, Erin MacDonald, Robert Bordley, and John R. Hauser
8.1 Introduction
This paper describes the first random-assignment field test of morphing with a sample size sufficient to observe steady-state behavior (116,168 unique CNET consumers receiving 451,524 banner advertisements). A banner advertisement morphs when it changes dynamically to match latent cognitive-style segments, which, in turn, are inferred from consumers’ clickstream choices. Examples of cognitive-style segments are impulsive-analytic, impulsive-holistic, deliberative-analytic, and deliberative-holistic. The website automatically determines the best “morph” by solving a dynamic program that balances exploration of morph-to-segment effectiveness with the exploitation of current knowledge about morph-to-segment effectiveness. Banner morphing modifies methods used in website morphing (Hauser, Urban, Liberali, and Braun 2009), which changes the look and feel of a website based on inferred cognitive styles. (For brevity we use HULB as a shortcut citation to the 2009 website-morphing paper.) Morphing adds behavioral-science-based dynamic changes, which complement common banner-selection methods such as context matching and targeting. HULB projected a 21% improvement in sales for BT Group’s broadband-sales website, but the projections were based on simulated consumers whose behavior was estimated from data obtained in vitro. The BT Group did not allocate resources necessary to obtain sufficient sample for an in vivo field-test.1 (By in vivo we refer to actual websites visited by real consumers for information search or purchasing. By in vitro we refer to laboratory-based websites that simulate actual websites and are visited by a randomly recruited panel of consumers. In vitro experiments attempt to mimic in vivo field experiments, but never do so perfectly.)
Online morphing is designed for high-traffic websites with tens of thousands of visitors. Simulations in HULB (figure 8.3) suggest that 10,000–20,000 consumers are necessary to realize substantial gains from website morphing. Banner morphing is likely to require higher sample sizes than website morphing because successful banner outcomes (click-throughs) occur relatively less often than successful website-morphing outcomes (sales of broadband services). Our field test (section 8.4.9) has sufficient sample to observe a significant 83–97% lift in click-through rates between test and control cells above and beyond context matching.

Although click-through rates are a common industry metric, we also sought to test whether banner morphing increases brand consideration and purchase likelihood. Because brand-consideration and purchase-likelihood measures are intrusive, such metrics are difficult to obtain in vivo. We therefore supplement the large-sample field test with a smaller-sample random-assignment experiment on an in vitro automotive information-and-review website. We avoid the need for extremely large samples with three longitudinal surveys that act as surrogates for the HULB dynamic program. The first two surveys measure advertising preference, cognitive styles, and the stage of the consumer's buying process. The third survey, separated from the premeasures by four and one-half (4 1/2) weeks, exposes consumers to banner advertising while they search for information on cars and trucks. In the test group, consumers see banners that are matched to their cognitive style and buying stage. Banners are not matched in the control group. The sample (588 consumers) is sufficient because: (1) we substitute direct measurement for Bayesian inference of segment membership; and (2) we substitute measurement-based morph assignment for the HULB dynamic program. The in vitro experiment suggests that matching banners to segments improves brand consideration and purchase likelihood relative to the control.
8.2 Banner Advertising—Current Practice
In the last ten years, online advertising revenue has tripled. Banner advertisements, paid advertisements placed on websites, account for 24% of online advertising revenue—about $6.2 billion in 2010. Banner advertisement placements cost roughly $10 per thousand impressions. Click-through rates are low and falling from 0.005 click-throughs per impression in 2001 to 0.001 in 2010 (Dahlen 2001;
PricewaterhouseCoopers 2011). Website managers and marketing managers are highly interested in methods that improve banner effectiveness. Current theory and practice attempt to increase click-through rates with a variety of methods. For example, Sundar and Kalyanaraman (2004) use laboratory methods to examine the effect of the speed and order of animation. Gatarski (2002) uses a genetic algorithm on a training sample to search forty binary features of banners. He achieves a 66% lift above a 1% click-through rate based on sixteen “generations” seeing approximately 200,000 impressions. Iyer, Soberman, and Villas-Boas (2005) and Kenny and Marshall (2000) suggest that click-through rates should improve when banners appear on webpages deemed to be relevant to consumers. Early attempts matched textual context. For example, Joshi, Bagherjeiran, and Ratnaparkhi (2011) cite an example where “divorce” in a banner is matched to “divorce” on the webpage. But context matters—it is not effective to place a banner for a divorce lawyer on a gossip account of a celebrity’s divorce. Instead, Joshi, et al. achieved a 3.3% lift by matching a banner’s textual context to a combination of webpage content and user characteristics. In a related application to Yahoo!’s news articles, rather than banners, Chu, et al. (2009, p. 1103) use context-matching methods to increase click-through rates significantly (3.2% lift based on “several million page views”). Context matching is quite common. For example, General Motors pays Kelly Blue Book to show a banner advertisement for the Chevrolet Sonic when a consumer clicks on the compact-car category. Relevance can also be inferred from past behavior: “behavioral targeting leverages historical user behavior to select the most relevant ads to display” (Chen, Pavlov, and Canny 2009, p. 209). Chen, et al. use cookie-based observation of 150,000 prior banners, webpages, and queries to identify the consumers who are most likely to respond to banners. They report expected lifts of approximately 16–26% based on in-sample analyses. Laboratory experiments manipulate consumers’ goals (surfing the web versus seeking information) to demonstrate that banner characteristics, such as size and animation, are more or less effective depending upon consumers’ goals (Li and Bukovac 1999; Stanaland and Tan 2010). This web-based research is related to classic advertising research that suggests advertising quality and endorser expertise (likability) are more or less effective depending upon relevance
(involvement) for consumers (e.g., Chaiken 1980; Petty, Cacioppo and Schumann 1983).

Morphing differs from prior research in many ways. First, banners are matched to consumers based on cognitive styles rather than context relevance or past behavior. Second, latent cognitive-style segments are inferred automatically from the clickstream rather than manipulated in the laboratory. Third, morphing learns (near) optimally about morph-to-segment matches in vivo as consumers visit websites of their own accord. Thus, morphing is a complement rather than a substitute for existing methods such as context matching. If successful, morphing should provide incremental lift beyond context matching.
8.3 Brief Review of Banner Morphing
The basic strategy of morphing is to identify a consumer’s segment from the consumer’s clickstream and show that consumer the banner that is most effective for the consumer’s segment. Because the clickstream data cannot completely eliminate uncertainty about the consumer’s segment, we treat these segments as latent—we estimate probabilities of segment membership from the clickstream. In addition, there is uncertainty about which banner is most effective for each latent segment. Using latent-segment probabilities and observations of outcomes, such as click-throughs, the morphing algorithm learns automatically and near optimally which morph to give to each consumer. Morphing relies on fairly complex Bayesian updating and dynamic programming optimization. Before we provide those details, we begin with the conceptual description in figure 8.1. In figure 8.1 we label the latent segments as Segment 1 through Segment 4. Typically, the segments represent different cognitive styles, but segments can also be defined by other characteristics, such as the stage of the consumer’s buying process. A design team uses artistic skills, intuition, and past experience to design a variety of alternative websites (HULB) or alternative banners (this paper). We call these banners (or websites) “morphs.” In figure 8.1 we label the morphs as Morph 1 through Morph 4. Designers try to give the system a head start by designing morphs they believe match segments, but, in vivo,
Figure 8.1 Conceptual Diagram of Banner Morphing (Illustrative Values Only).
[The figure shows bar charts of Gittins' indices by morph (Morph 1–4) and segment (Segment 1–4) after the 100th, 20,000th, and 80,000th consumers; the probabilities that the current consumer is in each segment (e.g., the 101st consumer); and the resulting Expected Gittins' indices, which assign Morph 3 to the 101st consumer and Morph 2 to the 20,001st and 80,001st consumers.]
the best matches are identified automatically, and optimally, by the morphing algorithm. If the segments could be measured directly, rather than identified latently, the morphing optimization would be “indexable.” Indexability implies we can solve the optimal allocation of morphs to segments by computing an index for each morph x segment combination. The index is called a Gittins’ index. The Gittins’ indices evolve based on observed consumers’ behavior. The optimal policy for the nth consumer would be to assign the morph with the largest index for the consumer’s segment. For example, if the upper-left bar chart represents the Gittins’ indices computed after one hundred consumers, and if segments were known, the algorithm would assign Morph 3 to Segment 4 because Morph 3 has the largest Gittins’ index for Segment 4 (largest of the dark bars). Similarly, it would assign Morph 1 to Segment 2. But segment membership cannot be observed directly. Instead the HULB algorithm uses a pre-calibrated Bayesian model to infer the probabilities that the consumer belongs to each latent segment. The probabilities are inferred from the clickstream on the website, possibly including multiple visits. Illustrative probabilities are shown by the bar chart in the middle of figure 8.1. We use these segment-membership probabilities and the Gittins’ indices to compute Expected Gittins’ Indices (bar chart in the upper right of figure 8.1). There is now one Expected Gittins’ Index per morph. Based on research by Krishnamurthy and Mickova (1999), the (near) optimal policy for latent segments is to assign the morph with the highest Expected Gittins’ Index. The bar chart in the upper-right corner tells us to assign Morph 3 to the 101st consumer. Because a sample size of one hundred consumers is small, the system is still learning morph-to-segment assignments, and, hence, the bars are more or less of equal height. If the 101st consumer had made different clicks on the website, the segment probabilities would have been different, and, perhaps, the morph assignment would have been different. As more consumers visit the website, we observe more outcomes— sales for website morphing or click-throughs for banner morphing. Using the observed outcomes the algorithm refines the morph x segment indices (details below). The middle-left and lower-left bar charts reflect refinements based on information up to and including the 20,000th and 80,000th consumer, respectively. As the indices become more refined, the morph assignments improve. (In figure 8.1’s illustrative example, the Expected Gittins’ Index assigns Morph 3 after 100
consumers, changes to Morph 2 after 20,000, and discriminates even better after 80,000 consumers.)

State-of-the-art morphing imposes limitations. First, because many observations are needed for each index to converge, the morphing algorithm is limited to a moderate number of morphs and segments. (HULB used 8 x 16 = 128 Gittins' indices rather than the 16 indices in figure 8.1.) Second, although designers might create morphs using underlying characteristics, and morphing may define segments based on underlying cognitive dimensions, the dynamic program does not exploit factorial representations. Schwartz (2012) and Scott (2010) propose an improvement to handle such factorial representations to identify the best banners for the non-morphing case, but their method has not been extended to morphing.

We now formalize the morphing algorithm. Our description is brief, but we provide full notation and equations in appendix A.1. Readers wishing to implement morphing will find sufficient detail in the cited references. Our code is available upon request.

8.3.1 Assigning Consumers to Latent Segments based on Clickstream Data
Figure 8.2 summarizes the two phases of morphing. We call the first phase a calibration study. The in vitro calibration study measures cognitive styles directly using established scales. Such measurement is intrusive and would not be feasible in vivo. Respondents for the calibration study are drawn from the target population and compensated to complete the calibration tasks. Using the questions designed to identify segment membership, we assign calibration-study consumers to segments. For example, HULB asked 835 broadband consumers to complete a survey in which the consumers answered thirteen agree-versus-disagree questions, such as "I prefer to read text rather than listen to a lecture." HULB factor analyzed answers to the questions to identify four bipolar cognitive-style dimensions. They used median splits on the dimensions to identify sixteen (2 × 2 × 2 × 2 = 16) segments.

Calibration study respondents explore an in vitro website as they would in vivo. We observe their chosen clickstream. We record each respondent's clickstream as well as the characteristics of all possible click choices (links) on the website. An example "click characteristic" is whether the click promises to lead to pictures or text. Other click characteristics are dummy variables for areas of the webpage (such as a
[Figure 8.2 is a two-column table.

Calibration study (prior to in vivo morphing)
Tasks:
1. Measure cognitive styles with established questions and define cognitive-style segments.
2. Observe clicks and characteristics of clicks for consumers in each cognitive-style segment.
Outcomes:
1. Assign each calibration-study consumer to a cognitive-style segment (using questions only in the calibration study).
2. Calibrated model which can infer segment membership probabilities from clickstream.

Day-to-day operation (of in vivo website): exploration and exploitation
Tasks:
1. Observe clickstream. Use calibrated model to infer consumers' latent cognitive-style segments.
2. Observe outcomes (e.g., click-throughs). Update Gittins' indices for each segment x morph combination.
3. Use latent segment probabilities and Gittins' indices to compute Expected Gittins' index.
Outcomes:
1. Cognitive-style probabilities for each latent segment.
2. Gittins' index value for each segment x morph combination after the nth consumer.
3. (Near) optimal assignment of a morph to the nth consumer to balance exploration and exploitation.]

Figure 8.2 The Different Roles of the Calibration Study and the Day-to-Day Banner-Morphing Algorithm.
comparison tool), expectations (the click is expected to lead to an overall recommendation), or other descriptions. These calibration data are used to estimate a logit model that maps click characteristics to the chosen clicks (see appendix A.1, equation (A1)). The parameters of the logit model are conditioned on consumers' segments. The calibration study also provides the (unconditioned) percent of consumers in each segment—data that form prior beliefs for in vivo Bayesian calculations.

During day-to-day operation of the in vivo website, we do not observe consumers' segments; instead, we observe consumers' clickstreams. The calibrated model and observed click characteristics give likelihoods for the observed clickstream conditioned upon a consumer belonging to each of the (now latent) segments. Using Bayes Theorem (and prior beliefs) we compute the probabilities that a consumer with the observed clickstream belongs to each segment (as shown in the middle of figure 8.1). See appendix A.1, equation (A1). In notation, let n index consumers, r index segments, and t index clicks. Let $c_{nt}$ be
consumer n's clickstream up to the t-th click. The outcomes of the Bayesian calculations are the probabilities, $\Pr(r_n = r \mid c_{nt})$, that consumer n belongs to segment r conditioned on the consumer's clickstream.

In HULB the first ten clicks on the in vivo website were used to identify the consumer's segment and select the best morph. We adopt the same strategy of morphing after a fixed and predetermined number of clicks. We label the fixed number of clicks with $t_0$. Hauser, Urban, and Liberali (2012) propose a more complex algorithm to determine the optimal time to morph, but their algorithm was not available for our experiments. Thus, our experiments are conservative, because morphing would likely do even better with an improved algorithm.

8.3.2 Automatically Learning the Best Banner for Each Consumer
For ease of exposition, temporarily assume we can observe directly the consumer's latent segment. Let m index morphs, and let $p_{rm}$ be the probability of a good outcome (a sale or a click-through) given that a consumer in segment r experienced morph m for all clicks after the first $t_0$ clicks. One suboptimal method to estimate $p_{rm}$ would be to observe outcomes after assigning morphs randomly to a large number, $N_{large}$, of consumers. This policy, similar to that used by Google's web optimizer and many behavioral-targeting and context-matching algorithms, is suboptimal during the calibration period because $N_{large}$ consumers experience morphs that may not lead to the best outcomes.2 To get a feel for $N_{large}$, assume eight morphs and four segments as in the CNET experiment, assume a typical click-through rate of 2/10ths of 1%, and calculate the sample size necessary to distinguish 2/10ths of 1% from a null hypothesis of 1/10th of 1%. We would need to assign suboptimal banners to approximately 128,000 consumers to obtain even a 0.05 level of significance (exact binomial calculations for each morph x segment). Morphing identifies optimal assignments with far fewer suboptimal banners.

Optimal assignment for directly observed segments is a classic problem in dynamic programming. The dynamic program balances the opportunity loss incurred while exploring new morph-to-segment assignments with the knowledge gained about the optimal policy. The updated knowledge is gained by observing outcomes (sales or click-throughs) and is summarized by posterior estimates of the $p_{rm}$'s. (See appendix A.1, equation (A2).) Improved posterior estimates enable us to assign morphs more effectively to future consumers.
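For intuition, the sketch below maintains a Beta-distributed belief about each $p_{rm}$ and applies the standard conjugate update after each observed outcome, as in the known-segment case just described. The uniform Beta(1, 1) prior and the 4 x 8 segment-by-morph layout are illustrative assumptions, not the priors or dimensions used in the field experiment.

```python
from dataclasses import dataclass

@dataclass
class OutcomeBelief:
    """Beta belief about p_rm, the probability of a good outcome (e.g., a
    click-through) when morph m is shown to a consumer in segment r."""
    alpha: float = 1.0   # prior pseudo-successes (illustrative only)
    beta: float = 1.0    # prior pseudo-failures (illustrative only)

    def record(self, success: bool) -> None:
        # Conjugate Beta-Bernoulli update after observing one outcome.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean estimate of p_rm.
        return self.alpha / (self.alpha + self.beta)

# One belief per segment x morph combination (illustrative 4 segments x 8 morphs).
beliefs = {(r, m): OutcomeBelief() for r in range(4) for m in range(8)}
beliefs[(2, 5)].record(success=True)
```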
For known segments, the optimal solution to the dynamic program has a simple form: we compute an index for each $(r, m)$ combination. The index, called the Gittins' index, $G_{rmn}$, is the solution to a simpler dynamic program that depends only on assignments and outcomes for those consumers who experienced that $(r, m)$ combination (see appendix A.1, equation (A3)). For the nth consumer, the optimal policy assigns the morph that has the largest index for the consumer's segment (Gittins 1979). The indices evolve with n.

Because we do not observe the consumer's segment directly, we must estimate the probabilities that the consumer belongs to each latent segment. Thus, in vivo, the problem becomes a partially observable Markov decision process (usually abbreviated POMDP). Krishnamurthy and Mickova (1999) establish that the POMDP is indexable and that an intuitive policy is near optimal. Their policy assigns the morph with the largest Expected Gittins' Index. The Expected Gittins' Index is defined by $EG_{mn} = \sum_r \Pr(r_n = r \mid c_{nt})\, G_{rmn}$. We still update the $p_{rm}$'s and the $G_{rmn}$'s, but we now do so using the $\Pr(r_n = r \mid c_{nt})$'s. The key differences between the Expected Gittins' Index policy and the naïve calibration-sample policy ($N_{large}$) are that the Expected Gittins' Index policy: (1) learns while minimizing opportunity loss; (2) continues to learn as n gets large; and (3) can adapt when $p_{rm}$ changes due to unobserved shocks, such as changes in tastes, new product introductions, or competitive actions. Recalibration is automatic and optimal.
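A minimal sketch of the Expected Gittins' Index policy follows. The expected index is exactly $EG_{mn} = \sum_r \Pr(r_n = r \mid c_{nt}) G_{rmn}$; the probability-weighted (fractional-count) update of the Beta parameters shown here is one simple way to use the segment probabilities when segments are latent, and recomputing the Gittins' indices themselves from those parameters (the dynamic program referenced in appendix A.1) is assumed to happen elsewhere.

```python
import numpy as np

def assign_morph(segment_probs, gittins):
    """Expected Gittins' Index policy: EG_m = sum_r Pr(r_n = r | c_nt) * G_rm.

    segment_probs: (R,) posterior segment-membership probabilities for consumer n
    gittins: (R, M) current Gittins' index for each segment x morph combination
    Returns the morph with the largest Expected Gittins' Index.
    """
    expected_index = segment_probs @ gittins   # (M,)
    return int(np.argmax(expected_index))

def update_beta_counts(alpha, beta, segment_probs, morph, success):
    """Probability-weighted update of the Beta parameters behind the p_rm estimates.
    alpha, beta: (R, M) arrays of pseudo-successes and pseudo-failures."""
    if success:
        alpha[:, morph] += segment_probs
    else:
        beta[:, morph] += segment_probs
    return alpha, beta
```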
8.4 CNET Field Experiment
8.4.1 Smart Phone Banners on CNET.com
CNET.com is a high-volume website that provides news and reviews for high-tech products such as smart phones, computers, televisions, and digital cameras. It has eight million visitors per day and has a total market valuation of $1.8 billion (Barr 2008). Banner advertising plays a major role in CNET's business model. Context-matched banners demand premium prices. For example, a computer manufacturer might purchase banner impressions on web pages that provide laptop reviews. Non-matched banners are priced lower. Morphing provides a means for CNET to improve upon context matching, and, hence, to provide higher value to its customers. CNET accepted our proposal to compare the performance of morphing versus a control on their website and to explore interactions with context matching.
[Figure 8.3: five square banners (S1–S5) and three top-of-page banners (T1–T3).]
Figure 8.3 Square and Top-of-Page Banner Advertisements (CNET Field Experiment).
The banners advertised AT&T smart phones. Consumers visiting CNET.com were assigned randomly to test and control cells. In each experimental cell some banners were context-matched and some were not (as occurred naturally on CNET). To assure sufficient sample for the morphing algorithm to be effective, we assigned 70% of the consumers to the test cell. CNET’s agency developed a pool of eight AT&T banner advertisements about HTC refurbished smart phones. Five of the banners were square banners that could appear anywhere on the website and three of the banners were wide rectangular banners that appear at the top of the page. See figure 8.3—we provide more detail in section 8.4.3. (AT&T was out of stock on new HTC smart phones; AT&T followed industry practice to focus on refurbished smart phones when new phones were out of stock. Industry experience suggests lower click-through rates for refurbished products, but the decrease should affect the test and control cells equally.)
8.4.2 CNET Calibration Study
We first identified a candidate set of cognitive-style questions using those suggested by HULB augmented from the references therein and from Novak and Hoffman (2009). We drew 199 consumers from the Greenfield Online panel for a pre-study. These consumers answered all cognitive-style questions. Factor analysis and scale purification identified eleven questions likely to categorize CNET consumers. (Detailed questions and pre-study analyses are available from the authors.)

In the calibration study, 1,292 CNET users answered the eleven purified questions. We factor analyzed the answers and identified three factors that we labeled impulsive versus deliberative, analytic versus holistic, and instinctual versus not. See appendix A.2. Following standard procedures (e.g., Churchill 1979), we re-purified these scales, resulting in three multi-item bipolar cognitive-style dimensions with reliabilities of 0.75, 0.66, and 0.57, respectively. CNET's designers felt they could most effectively target consumer segments that varied on the two most-reliable cognitive-style dimensions. We follow the methods in HULB and assign consumers to segments based on median splits of the two bipolar scales (a small illustrative sketch of this assignment appears after section 8.4.3 below). The four segments were deliberative-holistic, deliberative-analytic, impulsive-holistic, and impulsive-analytic. While the dimensions are orthogonal by construction, there is no reason that the four segments contain equal numbers of consumers. In vivo posterior estimates were 9%, 42%, 23%, and 27%, respectively.

8.4.3 Banner Characteristics (designed by CNET's agency)
CNET's agency varied characteristics of the morphs such as the smartphone image (home screen versus pictures of people), the size of the image, the size and colors of the fonts, the background colors, and information content (online only, free shipping versus a list of smart phone features). The designers also varied hot links such as "get it now," "learn more," "watch video," "benefits of Android technology," "see phone details," and "offer details." With a moderate number of banners, it was not feasible to vary all of these banner characteristics in a fractional factorial. Rather, we relied on CNET's designers to provide banners that varied substantially. The morphing algorithm automatically and optimally assigned the banners to latent segments (via the Gittins' indices). CNET's choice of potential banners is an empirical tradeoff—more banners might achieve greater discrimination, but more banners might compromise the optimal policy by spreading updating over a greater number of morph x segment indices.
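The sketch below illustrates the median-split assignment referenced in section 8.4.2: each consumer's scores on the two bipolar cognitive-style dimensions are split at the sample medians, and the 2 x 2 combination defines the segment. The numeric encoding of the four segments is an arbitrary illustrative choice.

```python
import numpy as np

def median_split_segments(factor_scores):
    """Assign consumers to four segments via median splits on two bipolar
    cognitive-style dimensions (e.g., impulsive vs. deliberative and
    analytic vs. holistic).

    factor_scores: (N, 2) array of factor scores for N calibration consumers.
    Returns an (N,) array of segment labels in {0, 1, 2, 3}.
    """
    medians = np.median(factor_scores, axis=0)
    above = (factor_scores > medians).astype(int)   # (N, 2) binary split indicators
    return above[:, 0] * 2 + above[:, 1]            # encode the 2 x 2 split as 0..3
```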
More empirical experience might suggest procedures to determine the number of banners that optimizes this tradeoff. CNET's agency relied on the judgment of their designers. With more empirical experience, banner designers will be better able to design banners to target latent segments. Researchers might use pre-studies to link banner characteristics to segments identified in the calibration study. Analyses similar to the logit model that links click preferences to segments could help designers select banner characteristics.

8.4.4 Calibrated Model of Segment-Specific Click Preferences
We observed the clickstreams for all 1,292 consumers in the calibration study. We decompose every click alternative into a vector of 22 click characteristics, including dummy variables for areas on the homepage ("carousel," "navigation bar," "promotion bar," "more stories," "popular topics," etc.), areas on other pages (product-specific reviews, "CNET says," "inside CNET," etc.), usage patterns (search category, social influences, tech-savvy news, etc.), and independent judges' evaluations of expected click outcomes (pictures, graphs, data, etc.). The same decomposition applied to the website in the calibration study and to the tracked areas of the in vivo website. Using the calibration data we estimated segment-specific click-characteristic weights, $\omega_r$. The specific model mapping characteristics to clicks is a logit model that is conditioned upon segment membership. See appendix A.1 (equation (A1)) and HULB (p. 211, equation 4). Parameter values are given in appendix A.3.

8.4.5 Posterior Beliefs about Latent Cognitive-Style Segments in vivo
During day-to-day operation on the CNET website, we use Bayesian updating to estimate the probabilities that each consumer belongs to each latent segment. See appendix A.1 (equation (A1)) and HULB (p. 211, equation 5). Simulations based on the calibration study suggested that five clicks ($t_0 = 5$) would provide sufficient observations to obtain reasonable posterior estimates of $\Pr(r_n = r \mid c_{nt})$. In CNET, unlike in HULB, we use cookies so that updating continues through multiple consumer visits to CNET. We define an active consumer as a consumer who has made at least five clicks on tracked areas of the website. In the control cell, we track clicks, but only to determine whether a consumer is active. Before becoming active, consumers are not shown any banners (in either the test or the control cell). After becoming
8.4.6 Defining a Successful Click-through When There Are Multiple Sessions

The same banner might be shown in many sessions. (CNET considers a session new after thirty minutes of inactivity.) CNET (and AT&T) consider the banner a success if the consumer clicks through in at least one session. We adopt their definition when we update the p_rm's. To account for interrelated sessions, we use a strategy of temporary updates and potential reversals. This strategy is best illustrated with a three-session example. Suppose that a consumer sees the same banner in three sessions and clicks through only in the second session. A naïve application of HULB would make three updates to the parameters of the posterior distributions for the success probabilities, p_rm. The updates would be based erroneously on observations classified as a failure, then a success, and then a failure. Instead, using CNET's success criterion, the correct posterior is computed after the third session based on one success, because the banners achieved their collective goal of at least one consumer click-through. Until we reach the third session, updates should represent all information collected to that point. We update as follows. After the first session (no click-through), we update the posterior distribution based on a failure—this is the best information we have at the time. After the second session (click-through), we reverse the failure update and update as if success. After the third session (no click-through), we do nothing because the update already reflects a success on CNET's criterion. The mathematical formulae for CNET's success criterion are given in appendix A.1.

8.4.7 Priors for Morph x Segment Probabilities (as Used in Computing Indices)

The morphing algorithm requires that we set priors for the morph x segment click-through probabilities. The findings in HULB suggest that weakly informative priors suffice. We set priors equal to the historic click-through probability for banners for refurbished smartphones—the same for all banners. To ensure that the priors are weakly informative, we select parameters of the prior distribution based on an effective sample size of forty consumers—small compared to the anticipated number of CNET consumers.
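A minimal sketch of the prior construction and the temporary-update-and-reversal bookkeeping appears below. The class and variable names are our own, and we treat each consumer-banner pair as a single observation (at most one success or one failure), which is one reading of CNET's success criterion rather than the authors' exact implementation.

```python
# Weakly informative Beta prior for a morph-by-segment click-through probability:
# prior mean equal to the historic click-through rate, effective sample size 40.
HISTORIC_CTR = 0.003        # illustrative value; the paper uses the refurbished-phone CTR
ESS = 40
alpha0 = HISTORIC_CTR * ESS
beta0 = (1 - HISTORIC_CTR) * ESS

class MorphSegmentCell:
    """Beta posterior for one morph-segment pair, with per-consumer reversal logic."""
    def __init__(self):
        self.alpha, self.beta = alpha0, beta0
        self.recorded_failure = set()   # consumers provisionally recorded as failures
        self.recorded_success = set()   # consumers already recorded as successes

    def observe_session(self, consumer_id, clicked):
        if consumer_id in self.recorded_success:
            return                       # success already counted; ignore later sessions
        if clicked:
            if consumer_id in self.recorded_failure:
                self.beta -= 1           # reverse the provisional failure update
                self.recorded_failure.discard(consumer_id)
            self.alpha += 1              # count one success for this consumer
            self.recorded_success.add(consumer_id)
        elif consumer_id not in self.recorded_failure:
            self.beta += 1               # provisional failure (best information so far)
            self.recorded_failure.add(consumer_id)

# Three sessions for one consumer: no click, click, no click -> net effect: one success.
cell = MorphSegmentCell()
for outcome in (0, 1, 0):
    cell.observe_session("consumer_42", clicked=outcome)
print(cell.alpha - alpha0, cell.beta - beta0)   # -> 1.0 0.0
```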
8.4.8 Interaction between Morphing and Context Matching

CNET uses context matching; thus one goal of the field experiment was to determine whether morphing adds incremental lift. The context-matching literature reports lifts of approximately 3% for in vivo testing and 26% for in-sample projections (see section 8.2). These lifts were calculated for banners or page-views (not on a consumer-by-consumer basis). The information technology literature consistently postulates that context matching is effective because the banner is more relevant to the consumer (e.g., Chen, Pavlov, and Canny 2009; Chu et al. 2009; Joshi, Bagherjeiran, and Ratnaparkhi 2011). Relevance has a long history in advertising research. For example, classic studies postulate that "persuasion may work best depending on whether . . . message-relevant thought occurs" (Petty, Cacioppo, and Schumann 1983). Chaiken (1980) manipulates issue involvement as "personal relevance" and demonstrates that greater quality advertising is more persuasive with high involvement, but not with low involvement. Zaichkowsky (1986) summarizes that "although there does not seem to be a single precise definition of involvement, there is an underlying theme focusing on personal relevance." Her survey of the literature indicates that "under high involvement, attitudes were influenced by the quality of the arguments in the message." Prescriptive theories of targeting make similar predictions (Iyer, Soberman, and Villas-Boas 2005; Kenny and Marshall 2000). If these theories apply to banner advertising, and if morphing increases the effective quality of the communication, then we expect an interaction between morphing (increased quality) and context matching (relevance). If cognitive-style matching makes it easier for consumers to learn their preferences, then a morphing-by-context-matching interaction is also consistent with observed interactions between targeting and preference learning (Lambrecht and Tucker 2011).

In our field experiment we manipulate morphing (test versus control) randomly. Within each experimental cell, some banners match context and some do not. Context-matching occurs naturally on the CNET website and occurs in the same manner in the test cell as in the control cell.

8.4.9 Results of the CNET Field Experiment

CNET placed banners on their website for all active consumers in the test and control cells during April 11, 2011 to May 13, 2011. Naturally, there were non-AT&T-HTC banners placed on CNET during the 31-day
test period, but these banners were placed randomly between test and control. Both we and CNET went to great lengths to ensure there were no systematic effects of non-AT&T-HTC banners or interactions with AT&T-HTC advertising. Sampling appeared random—we detected no systematic differences in the placement of control banners across estimated (latent) cognitive-style segments (χ²(30) = 15.9, p = 0.98).

Table 8.1 summarizes the field-test results. Overall, 116,168 consumers saw 451,524 banners. Of these, 32,084 consumers (27.4%) saw 58,899 banners (13.0%) on webpages where any smart phone was rated, compared, priced, discussed, or pictured. We consider such webpages as context-matched. Consistent with theories of relevance-quality interactions, morphing achieves significant and substantial incremental improvements for banners on context-matched webpages (t = 3.0, p = 0.003). Because many consumers saw multiple banners, we also calculate click-through rates on a consumer-by-consumer basis. Morphing is significantly better than the control on consumer click-through rates when the banners are placed on context-matched webpages (t = 2.2, p = 0.028). Morphing almost doubled click-through rates for context-matched banners (83% and 97% lifts, respectively, for banners and for consumers).

To put these lifts in perspective, context-matching alone achieved a 5% lift in banner click-through rates, but the difference was not significant (t = 0.3, p = 0.803). A 5% lift is consistent with Joshi et al. (2011) and Chu et al. (2009), who report lifts of 3.3% and 3.2%, respectively, on large samples. Context-matching alone had a negative lift on consumer click-through rates, but the lift was not significant (t = 1.4, p = 0.167). Table 8.1 also suggests that gains to morphing require the banners to be relevant to the webpage visited by the consumer. There was a decline for banners and consumers when the banners were not on context-matched webpages (t = 0.5, p = 0.495 and t = 1.74, p = 0.081, respectively), but that decline is marginally significant at best. Interactions between morphing and context-matching were significant for banners (χ² = 161.8, p < 0.01) and for consumers (χ² = 8.2, p = 0.017).

8.4.10 Morphing Discriminates Among Latent Cognitive-Style Segments

Segment membership is latent; we do not observe segment membership directly. Instead we use posterior estimates of segment membership to examine the probability that morph m was assigned to segment r.
Table 8.1
CNET Field Test of Banner Advertisement Morphing

                                 Sample Size             Click-through Rate^a
                                 Test       Control      Test       Control     Lift     Significance
Context-matched webpages
  All banners                    40,993     17,906       0.307^b    0.168       +83%     0.003
  Per consumer                   22,376     9,708        0.250^b    0.127       +97%     0.028
Non-context-matched webpages
  All banners                    262,911    129,714      0.151      0.160       −6%      0.495
  Per consumer                   59,362     24,722       0.144^c    0.197       −27%     0.081

a. Click-through rates are given as fractions of a percent, e.g., 0.307 of 1%.
b. Test cell has a significantly larger click-through rate than control cell at the 0.05 level or better.
c. Test cell has a marginally significantly smaller click-through rate than the control cell at the 0.10 level.
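For reference, the lifts in table 8.1 follow the usual relative definition, lift = (test − control)/control, and the banner-level significance can be approximated with a two-proportion test, as sketched below. The chapter reports t-statistics; the z-approximation here is our assumption, although it reproduces the +83% lift and a statistic near 3.0 for the context-matched banners.

```python
from math import sqrt

def lift(test_rate, control_rate):
    """Relative lift of the test cell over the control cell."""
    return (test_rate - control_rate) / control_rate

def two_proportion_z(test_rate, n_test, control_rate, n_control):
    """z-statistic for the difference between two proportions (pooled variance)."""
    p_pool = (test_rate * n_test + control_rate * n_control) / (n_test + n_control)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_test + 1 / n_control))
    return (test_rate - control_rate) / se

# Context-matched banners in table 8.1 (rates are fractions of a percent).
test_rate, control_rate = 0.00307, 0.00168
n_test, n_control = 40993, 17906

print(round(100 * lift(test_rate, control_rate)))                               # ~83 (% lift)
print(round(two_proportion_z(test_rate, n_test, control_rate, n_control), 1))   # ~3.0
```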
Table 8.2 reports posterior probabilities for square banners and for top-of-page banners. On average, top-of-page banners did better than square banners—a result that does not appear connected to cognitive-style morphing. For example, in the context of search advertising, eye-tracking studies suggest a "golden triangle" or "F-shaped" attention pattern; top-of-page sponsored links receive substantially more attention than right-of-page sponsored links (Buscher, Dumais, and Cutrell 2010). Buscher et al. suggest further that high-quality sponsored links receive twice as much visual attention as low-quality sponsored links. For ease of comparison between different types of banners we renormalize click-through rates for top-of-page banners.

Morph-to-segment matching worked well for square banners. Some square banners are differentially better for specific cognitive-style segments. For example, the best morph for the deliberative-analytic latent segment (r = 2) is Morph S1. The best morph for the impulsive-holistic segment (r = 3) is Morph S5. The morphing algorithm discriminates less well for the deliberative-holistic segment (r = 1), likely because that segment is a much smaller segment than the others (9% of the consumers). The deliberative-analytic segment (r = 2) and the impulsive-holistic segment (r = 3) together account for 65% of the consumers, and each received their best morphs most often. The morphing algorithm does less well for the remaining 35% of the consumers.
Table 8.2
CNET Field Test: Posterior Click-through Rates and Assignment Probabilities

Square Banners

Posterior Click-through Rates^a,b
                                              m = S1    m = S2    m = S3    m = S4    m = S5
Deliberative-holistic segment (r = 1, 9%)     .16%      .12%      .28%      .27%      .13%
Deliberative-analytic segment (r = 2, 42%)    .47%      .43%      .30%      .42%      .43%
Impulsive-holistic segment (r = 3, 23%)       .42%      .25%      .24%      .40%      .44%
Impulsive-analytic segment (r = 4, 27%)       .41%      .65%      .33%      .54%      .63%

Posterior Pr(m|r, square)^a,b
                                              m = S1    m = S2    m = S3    m = S4    m = S5
Deliberative-holistic segment (r = 1, 9%)     14%       7%        9%        18%       51%
Deliberative-analytic segment (r = 2, 42%)    41%       5%        35%       7%        12%
Impulsive-holistic segment (r = 3, 23%)       15%       5%        21%       22%       37%
Impulsive-analytic segment (r = 4, 27%)       11%       24%       30%       10%       24%

Top-of-Page Banners

Posterior Click-through Rates^a,b
                                              m = T1    m = T2    m = T3
Deliberative-holistic segment (r = 1, 9%)     .35%      .26%      .26%
Deliberative-analytic segment (r = 2, 42%)    .45%      .35%      .39%
Impulsive-holistic segment (r = 3, 23%)       .45%      .32%      .33%
Impulsive-analytic segment (r = 4, 27%)       .63%      .48%      .35%

Posterior Pr(m|r, top of page)^a,b
                                              m = T1    m = T2    m = T3
Deliberative-holistic segment (r = 1, 9%)     46%       13%       41%
Deliberative-analytic segment (r = 2, 42%)    76%       12%       12%
Impulsive-holistic segment (r = 3, 23%)       66%       21%       13%
Impulsive-analytic segment (r = 4, 27%)       75%       16%       9%

a. Largest values in a column are shown in bold italics. Rows of Pr(m|r) sum to 100%.
b. Posterior segment sizes are shown in parentheses (percent of total consumers).
For the impulsive-analytic (r = 4) segment and the deliberative-holistic segment (r = 1), the best morph was given more often than average, but other morphs were given more often.

The posterior probabilities for top-of-page banners illustrate a situation where designers did not achieve enough variation. The algorithm learned correctly that Morph T1 was best for all latent segments. Overall, the morph assignments were enough to achieve substantial lift, but the lift would likely have improved if the algorithm had run longer. When click-through rates are low, the CNET data suggest that convergence continues even beyond 82,000 consumers. This result illustrates why large samples are necessary to evaluate in vivo banner morphing.

We attempted to link the features of the best morphs to cognitive-style segments. Some assignments made sense. For example, the best morph for the deliberative-analytic segment included a detailed list of product features and the best morph for the impulsive-holistic segment included a link to "get it now." We are hesitant to overinterpret these qualitative insights because, in the CNET field test, there are many more features than morphs.

8.5 Automotive Experiment to Test Matching Morphs to Segments
Banner advertising generates click-throughs, but banners are also display advertising and may enhance a brand's image whether or not a consumer clicks through. For example, Nielsen (2011) describes a survey in which "54% of those surveyed believe online ads are highly effective at 'enhancing brand/product image.'" Because managers are often interested in more than click-through rates, we supplement the CNET field experiment with an in vitro automotive experiment. (Organizational differences between CNET and AT&T, and proprietary concerns, made it impossible to track click-through rates back to sales of AT&T telephones.) In the automotive experiment we abstract from the mechanics of Gittins' learning to test whether morph-to-segment matching increases brand consideration and purchase likelihood as well as click-through rates. The automotive experiment enables us to further test the hypothesis that banner advertisements are more effective when targeted to consumer segments that vary on cognitive styles.

Measures of brand consideration and purchase likelihood require intrusive questions, unlike measures of click-through rates, which can be observed unobtrusively. To measure brand consideration and purchase likelihood, we invited consumers to complete questionnaires
before and after searching for information on an automotive information-and-review website. Because a sample size of tens of thousands of consumers was not feasible with this design, we used longitudinal methods as a surrogate for dynamic program optimization.

Figure 8.4 summarizes the longitudinal methods. In Phase 1, consumers rated all test and control advertisements for their buying stage and preferred body-type. Two weeks later, in Phase 2, consumers answered a series of questions that enabled us to assign consumers to cognitive-style segments. In Phase 2, we also obtained pre-measures of brand consideration and purchase likelihood. Phases 1 and 2 replaced Bayesian inference and Gittins'-index-based optimization with in vitro measurement. Phases 1 and 2 assigned each consumer to a segment and identified the best banners for each segment, thus replacing two tasks performed in vivo in the CNET experiment. The actual experiment, Phase 3, occurred two and one-half weeks after Phase 2 (four and one-half weeks after consumers rated banners in Phase 1). In the experiment (Phase 3), consumers saw banners while exploring an automotive information-and-review website. In the test cell, banners were matched to cognitive styles (plus buying stage and body-type preference), while in the control cell banners were matched only to body-type preference. (Note that this experiment extends the definition of consumer segments to include buying stage—a practical consideration in the automotive market.)

The experimental design, its implications, and potential threats to validity are best understood and evaluated within context. Thus, before we describe the Phase 3 experiment, we first describe the website, the automotive consumer segments, and the test and control banner advertisements.

8.5.1 Automotive Banners on an Information-and-Recommendation Website

Information-and-recommendation websites, such as Edmunds', Kelley Blue Book, Cars.com, and AutoTrader, play major roles in automotive purchasing. For example, Urban and Hauser (2004) estimate that at least 62% of automotive buyers search online before buying a car or truck. More recently, Giffin and Richards (2011) estimate that 71% of automotive buyers search online and that online search was more influential in purchase decisions than referrals from family or friends, newspapers, and other media sources. Because information-and-recommendation websites attract potential purchasers, automotive
Figure 8.4
Automotive Experiment: Longitudinal Design as Surrogate for Morph-to-Segment Matching. (Phases 1 and 2 replace in vivo Bayesian inference and Expected Gittins' Index optimization.)

Phase 1 (5 minutes): Develop potential banners (morphs) based on pre-studies. Screen consumers for target market. Consumers indicate body-type preference and stage of buying process. Consumers rate potential banners on meaningfulness, relevance, information content, and believability.

Phase 2 (two weeks later, 10 minutes): Consumers complete 29 cognitive-style scales. Pre-measures for consideration and purchase likelihood. Identify consumer segments: (4 cognitive styles) x (3 buying stages). Assign consumers to segments. Identify the best two morphs for each segment (two of 15 possible morphs for each segment). All morphs match body-style preference.

Phase 3 (experiment, four and one-half weeks after Phase 1, 20 minutes): Consumers explore "Consumer Research Power" website. Consumers are exposed to banners in natural search; click-throughs on banners are observed. Test: banners assigned by morph-to-segment rules. Control: current in vivo Chevrolet banners. Post-measures for consideration and purchase likelihood.
Figure 8.5
Simulated Website for Automotive Experiment Matching Morphs to Segments. (Landing page on the left; one of many subsequent pages on the right. The left-most banners are controls. The other columns contain five banners designed for each buying-stage segment. In the experiment there were 10 potential control banners: body type x two banners. There were 75 potential test banners: body type x buying-stage x cognitive-style.)
manufacturers invest heavily in banner advertising on these websites. The importance of such expenditures motivated General Motors to test morph-to-segment matching for banner advertising targeted for their Chevrolet brand. General Motors' managerial motivation matched our scientific desire to test whether morph-to-segment matching would enhance brand consideration and purchase likelihood.

We created a website that simulated actual information-and-recommendation websites. Figure 8.5 illustrates the landing page and an example search page. Consumers could search for information, receive tips and reviews, learn about insurance, and read reviews just like they would on commercial information-and-recommendation websites. To mimic best practices, all test and control banners were targeted by consumers' expressed preferences for one of five body types. Such targeting is typical on commercial websites. For example, Edmunds.com displays body-type category links (coupe, convertible, sedan, SUV, etc.) prominently on the landing page and uses click-through information from these links to place relevant banner advertising on subsequent webpages and site visits. Body-type targeting enhances external validity and relevance. (Recall that morphing was most effective on relevant CNET webpages.)
8.5.2 Cognitive Styles and Stage of the Automotive Buying Process

Body-type preference and the automotive buying stage were measured in Phase 1; cognitive styles were measured in Phase 2. General Motors defines three buying-stage segments: collection, comparison, and commitment. "Collection" segments included consumers who indicated they were more than a year away from buying a car or truck, but in the process of collecting information. "Comparison" segments included consumers less than a year away from buying a car or truck and who had already gathered information on specific vehicles or visited a dealer. "Commitment" segments included consumers who plan to purchase in the next three months, who have collected information on specific vehicles, and who have visited a dealer.

To identify cognitive styles we asked consumers in a pre-study to answer twenty-nine questions adapted from HULB and Novak and Hoffman (2009). We factor analyzed their answers to identify three factors. Based on the questions that load together, we labeled the first two factors as rational-versus-intuitive and impulsive-versus-deliberative. The third factor was hard to define. See appendix A.2. Following standard procedures (e.g., Churchill 1979), we purified the scales, resulting in three multi-item cognitive-style dimensions with reliabilities of 0.87, 0.87, and 0.36, respectively. Because morphing requires a moderate number of discrete segments, we defined four cognitive-style segments by mean splits on the first two cognitive dimensions.3,4 The four segments were rational-impulsive, rational-deliberative, intuitive-impulsive, and intuitive-deliberative.

8.5.3 Test and Control Banner Advertisements

Banner designers created test banners that varied on characteristics they judged would appeal to consumer segments with different cognitive styles. Some banners emphasize information; others compare targeted vehicles to competitors; and still others stress test drives, finding a dealer, and purchase details. The banners also varied on the size of the images, the number of images, the amount of information provided, the size of the headlines, the amount of content in the headlines, whether content emphasized product features or recommendations, and other design characteristics. Clicks on banners took consumers to different target webpages (as promised in the banners). The designers judged that these characteristics provided sufficient variation for
Phases 1 and 2 to target the banners to each cognitive-style segment. In total there were seventy-five test banners: (five variations to appeal to different cognitive styles) x (three variations to appeal to different stages of the buying process) x (five variations using Chevrolet vehicles chosen to appeal to consumers interested in different body types). Figure 8.6 provides examples of fifteen test banners for one body type (Chevrolet Tahoe).

In Phase 1, consumers evaluated potential test (and control) banners on meaningfulness, relevance, information content, and believability. Using the average score on these measures we identified the best two test banners for each consumer segment. In Phase 3, consumers in the test cell saw the banners that were matched to their segment. Consumers in the control cell saw the control banners. We allowed consumers' preferences to override designers' prior beliefs, just as in the CNET field experiment the dynamic program overrode designers' prior beliefs.

There were ten control banners: two banners for each of five body types. Control banners did not vary by cognitive style or buying stage. The control banners were the banners that Chevrolet was using on real information-and-recommendation websites at the time of the automotive experiment. The control banners in figure 8.6 were most relevant to General Motors' business decisions, but if we are to use them as a scientific control we must establish they are a valid control. The literature uses a random selection of "morphs" as a no-morphing control. If General Motors' current banners are better than a random selection of test banners, then any differences between test and control cells would underestimate the gain due to morph-to-segment matching. We could then conclude that the improvement due to matching is at least as large as we measure. However, if current banners are worse than a random selection of test banners, then we could not rule out that the test banners are, on average, simply better than the control banners. The average score for a test banner is 3.36 (out of 5); the average score for a control banner is 3.70. The combined control banners have significantly larger average scores than random test banners (t = 10.3, p < 0.01). For a stronger comparison we compare the two best test banners to the two control banners. Even in this comparison the average test score is still less than the control score (t = 2.7, p < 0.01). We therefore conclude that the current Chevrolet banners are a
Figure 8.6
Example Test and Control Banner Advertisements for the Automotive Experiment. (Columns: Control, Collect, Compare, Commit.)
sufficient control. If morph-to-segment matching is superior to the current Chevrolet banners, then it is highly likely that morph-to-segment matching will be superior to either a randomly selected set of test banners or to a non-matched mix of the two best test banners.

8.5.4 Experimental Design and Dependent Measures

In Phase 3, consumers were invited to explore an information-and-recommendation website called "Consumer Research Power." Consumers searched naturally as if they were gathering information for a potential automotive purchase. They did so for a minimum of five minutes. While consumers searched we recorded click-throughs on the banners. During this search we placed banner advertisements for Chevrolet models as they would be placed in a natural setting. Test consumers received banners that alternated between the best and second-best banner for their cognitive-style and buying-process segment. Control consumers received banners that alternated between the two control Chevrolet banners.5 All banners, both test and control, were targeted by body-type preference.

Consumers who clicked through on banners were redirected to various websites—websites that varied by banner (and hence consumer segment). For example, banners targeted to impulsive consumers in the commitment buying stage linked to maps of nearby dealerships, while banners targeted to rational consumers in the commitment buying stage linked to information on loans, purchasing, and options packages. We balanced the variety of click-through targets to include enough variation to implement targeting by segment, but not so much that consumers were directed outside the in vitro web environment. Our in vitro targeting likely underestimates variation obtainable in vivo, and is, thus, conservative. After consumers completed their search on "Consumer Research Power," we measured Chevrolet brand consideration and purchase likelihood (post-measures).

8.5.5 Potential Threats to Validity

One potential threat to validity is that exposure to banners in Phase 1 might have contaminated the Phase 3 measures. We took steps to minimize this threat. The Phase 1 questionnaire was relatively short (five minutes) and occurred four and one-half weeks before the Phase 3 experiment. In Phase 1, consumers were not allowed
to click through on the banners, and, hence, did not receive the same rich information experience as in Phase 3. Instructions were written carefully to disguise the goals of the later phases—consumers believed the Phase 3 website experience was a test of the website, not an advertising test. We believe that the time delay, the number of banners rated, the lack of active click-through in Phase 1, and instructions that disguised later phases combine to limit contamination from Phase 1 to Phase 3. More importantly, the experimental design minimizes potential false positives that might be due to contamination. First, Phase 2 is more proximate in time than Phase 3. Contamination, if any, should be larger in Phase 2 than in Phase 3, making it more difficult to show an effect on Phase-3-versus-Phase-2 measures. Second, contamination, if any, would affect test and control cells equally and have no impact on statistical tests of differences that are invariant with respect to constant effects.

Another potential threat to validity is that the morph-to-segment test chooses from more banners than the control. If a consumer saw a greater variety of banners in the test cell, then we would be concerned about biases due to wear-out in the control cell or biases because of greater variety in the test cell. All else equal, greater variety in the banners that a consumer actually sees increases the odds that a banner is the best banner for a consumer. Our design minimizes this threat because consumers in both test and control cells saw only two different banners.

8.5.6 Results of the Automotive Experiment Testing the Behavioral Premise of Morphing

We invited 2,292 members of the Gongos Automotive Panel to participate in a multi-phase study of website design. Consumers were screened so that they were an equal or sole decision maker in automotive purchases and planned to purchase a new car or truck in less than three years. This mimics standard practice. Of these, 1,299 consumers agreed to participate (61% response rate) and 588 consumers completed Phases 1, 2, and 3 (45.3% completion rate). More consumers were assigned to the test cell (70%) than the control cell (30%), so that we had sufficiently many consumers in each consumer segment. All statistical tests take unequal cell sizes into account. Dependent measures included click-through rates for banners, click-through rates per consumer, brand consideration, and purchase likelihood.
Table 8.3
Automotive Experiment: Banner Advertisement Morphing (Post-only Results)

                          Sample Size            Outcome Measure^a
                          Test       Control     Test        Control     Lift     Significance
Click-through rates
  All banners             6,348      2,643       0.97%^b     0.26%
  Per consumer            421        167         15.9%^b     8.6%
Brand consideration       421        167         42.8%^b     32.9%
Purchase likelihood       421        167         3.28^b      3.05
[Figure 10.2 flowchart. Line types distinguish: stayers; decision strategy change—change in marketing responsiveness; decision strategy change—same marketing responsiveness; same decision strategy—change in marketing responsiveness.]
Figure 10.2 A Taxonomy of Channel Migration Patterns from Trial to Post-trial. Notes: Customers in Category 1 have well-established preferences for various channels, but can be influenced by marketing activities to switch channels. Customers in Category 2 have well-established preferences, but are less influenced by marketing activities. For example, they always buy on the Internet, and marketing has little influence on which channel they choose. Customers in Category 3 tend to use the channel they used the previous time, unless marketing directs them to do otherwise. These customers are habitual, but their habits can rather easily be changed. Customers in Category 4 are also inertial, but pay little attention to marketing. They will use the same channel out of habit until an unobserved factor induces them to use a different channel. For example, these customers may start off buying from the catalog, but for a particular purchase, they may be pressed for time and therefore use the Internet. They then continue using the Internet. The figure also shows the possibility that the process does not change; that is, the customer is a stayer. (This is signified by the solid lines in the figure.)
The theoretical concept at work in decision process evolution is fundamentally one of learning. What is being learned is the decision process (i.e., how the customer will go about choosing which channel to use to make a purchase). Therefore, learning means that the customer determines his or her channel preferences, how he or she will respond to the firm's marketing communications, and to what extent he or she will simplify things by using the same channel used previously. In terms of Gollwitzer and Bayer's (1999) theory, the customer first adopts a deliberative mind-set, determining what his or her decision process will be. After this process is learned, the customer adopts an implemental mind-set, executing the chosen decision process. This clarifies what is being learned—the decision process—which in turn prompts the question of why it is being learned. There are at least three reasons:

1. Motivation and ability: Motivation and ability are prerequisites to learning (Bettman and Park 1980; Hoch and Deighton 1989; MacInnis, Moorman, and Jaworski 1991). These factors are salient when the cost of an incorrect decision is high. For example, the customer may buy a product for a specific event using a catalog, but if the product is delivered late, the customer pays a cost by not having it when it was needed. These factors are also relevant when customers have the ability to evaluate their decisions. The channel experience is easy to evaluate in terms of service convenience, effort, and so on (Verhoef, Neslin, and Vroomen 2007).

2. Lack of task familiarity: Learning is more likely to take place when customers are not familiar with the task at hand (Alba and Hutchinson 1987; Hoch and Deighton 1989; Johnson and Russo 1984). For example, newly acquired customers may have experience with buying on the Internet or with the policies of certain stores, but not with the channels offered by the particular company for which they have just become a customer. Thus, when they are not familiar with the channel environment, there are important things for them to learn.

3. Unsatisfying experiences: An unsatisfying experience (e.g., a channel was not as convenient as expected) may cause the customer to reassess and try another channel. Previous research has shown that unsatisfying experiences can have a profound impact on future behavior (Mattila 2003; Weiner 1986).

In summary, a customer's decision processes should evolve because of learning, to the extent that motivation and ability, lack
of task familiarity, and unsatisfying experiences are present.1 These factors can be set in motion in several contexts (e.g., directly after the customer has been acquired, when there is an abrupt change in the environment; Moe and Yang 2009). Therefore, we advance the following hypothesis:

H1: The channel choice process for the average customer evolves over time.

We specify "average customer" because the factors driving learning—motivation and ability, lack of task familiarity, and unsatisfying experience—vary across customers. Some customers may be uninvolved shoppers and thus not motivated to evaluate their channel experience. Some (e.g., those who shop for others' needs in addition to their own) may not fully be able to evaluate the outcome of their channel choice. Others may be so experienced with shopping that they are highly familiar with the channel selection process. Some will have unsatisfying experiences; others will not. As a result, we hypothesize the following:

H2: The likelihood that the channel decision process evolves varies across customers.

We now turn to hypothesizing the patterns of evolution most likely to surface when we classify customers into the taxonomy depicted in figure 10.2. The literature suggests the trial stage will be relatively less driven by preferences than the post-trial stage (Heilman, Bowman, and Wright 2000; Meyer and Sathi 1985). Gollwitzer and Bayer (1999) suggest that when customers are in a deliberative mind-set, they are uncertain about their goals. In the channel choice context, this means that they are unsure of what channel attributes are relevant and how channels rate on those attributes. This suggests that consumers are less (more) certain of their channel preferences when they are in the trial (post-trial) stage. This leads to the following hypothesis:

H3: The trial stage of the customer decision process is less preference dominated than the post-trial stage.

The question of which channel to choose is inherently more challenging for a newly acquired customer or a customer facing a new channel environment. Therefore, the trial stage of the customer decision process is characterized by complexity as well as goal uncertainty.
Consumers are known to resort to simplifying heuristics in such situations (Payne, Bettman, and Johnson 1992; Tversky and Kahneman 1974). One possible heuristic, essentially the availability heuristic (Tversky and Kahneman 1974), would be to bias current channel choices toward the channel last chosen. This would apply to the trial stage. However, as consumers make more choices and become more familiar with the channel choice task, that task becomes less complex and consumers develop clearer goals. This suggests that consumers are more (less) likely to rely on heuristics and inertia in the trial (post-trial) stage. Therefore, we propose the following:

H4: The trial stage of the customer decision process is more inertia dominated than the post-trial stage.

Gollwitzer and Bayer (1999) theorize that consumers in the deliberative stage have an open mind-set and are responsive to information. When they move to the implemental stage, they have less need for information, so communications are less influential. Previous work has explored temporal differences in marketing effects. Narayanan, Manchanda, and Chintagunta (2005) find that marketing communications reduce uncertainty about pharmaceutical product quality in the early launch phases and reinforce preferences in subsequent stages. Overall, information has a lower impact over time (see figure 1 in Narayanan, Manchanda, and Chintagunta 2005). Similarly, Lee and Ariely (2006) show that promotion effectiveness varies depending on the consumer's mind-set (i.e., deliberative versus implemental), and, in particular, that it is greater when consumers' goals are less well defined. These considerations suggest the following hypothesis:

H5: The trial stage should be more marketing responsive than the post-trial stage.

10.3 Methodology
Following our goals, we sought to develop a methodology for determining whether the customer channel choice decision process evolves over time, how this differs across customers, and how the trial stage process compares with the post-trial stage process. To accomplish this, we build on previous work (Ansari, Mela, and Neslin 2008; Thomas and Sullivan 2005) and develop an analysis consisting of two components: (1) a model that captures the evolution process; and (2) two logit
channel choice models, one for the trial stage and the other for the post-trial stage. All parameters in these models vary across customers, so we can examine how the evolution process differs from customer to customer and distinguish fast evolvers (learners) and slower evolvers (stayers).

The evolution component quantifies how many purchase occasions the customer takes to transition from the trial to the post-trial stage. We use the purchase occasion as our unit of analysis because our theory posits that the decision process changes as a result of learning from the customer's channel experiences. Specifically, we analyze the number of purchase occasions it takes for the customer to evolve from the trial to the post-trial stage using a simple geometric distribution. Let q_h be the conditional probability that, on any given purchase occasion, customer h moves from the trial to the post-trial stage, given the customer has not moved yet, and let p_ht be the probability that customer h has moved to the post-trial stage by the t-th purchase occasion. These two quantities are related as follows:

p_ht = 1 − (1 − q_h)^(t−1).   (2)

This is because (1 − q_h)^(t−1) is the probability the customer has not moved by the t-th purchase occasion, so the converse, 1 − (1 − q_h)^(t−1), is the probability the customer has moved by the t-th purchase occasion. Note that p_ht is an increasing function of q_h (the more likely the customer is to change on any given purchase occasion, the more likely it is that the customer will have changed by purchase occasion t) and an increasing function of t (the more purchase occasions, the more likely it is that the customer will have evolved).

The geometric distribution with a single parameter (q_h) is parsimonious, though it implies that the most likely time for the customer to evolve is immediately after acquisition or after the decision environment changes. To investigate this assumption, we tested two alternative distributions (the discrete Weibull and a conditional logit). The results show that the geometric specification outperformed the others (see Web Appendix W1 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx). We also note that the estimated values of q_h suggest that the mean time customers take to evolve to the post-trial stage is twenty-three purchase occasions, with a standard deviation of 5.4. This means there is much variability in evolution rates, and by no means do most customers evolve right away.
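As a quick illustration of equation (2), the sketch below computes the probability that a customer has evolved by occasion t and the expected first post-trial occasion implied by the geometric specification; the value of q_h is illustrative only.

```python
def prob_evolved_by(t, q_h):
    """Probability the customer has moved to the post-trial stage by occasion t (eq. 2)."""
    return 1 - (1 - q_h) ** (t - 1)

q_h = 0.05                      # illustrative per-occasion probability of evolving
for t in (1, 5, 10, 23):
    print(t, round(prob_evolved_by(t, q_h), 3))

# Expected purchase occasion at which the post-trial stage is first used,
# implied by equation (2): 1 + 1/q_h.
print("expected first post-trial occasion:", 1 + 1 / q_h)
```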
The second component of our analysis consists of two logit channel choice models, one for the trial stage and one for the post-trial stage. The term U_hjt represents the utility that customer h derives from choosing channel j in purchase occasion t. We distinguish between U^0 and U^1, where the superscript 0 indicates trial and 1 indicates post-trial. Therefore, U^0_hjt is the utility for customer h in choosing channel j at purchase occasion t, given the customer is in the trial stage, and U^1_hjt is the utility of choosing channel j at purchase occasion t, given the customer is in the post-trial stage. These utilities are a function of the factors shown in equation 1 and figure 10.1:

U^0_hjt = α^0_hj + β^0_1hj CS_ht + β^0_2hj ES_ht + β^0_3h LC_hjt + ε^0_hjt, and   (3)

U^1_hjt = α^1_hj + β^1_1hj CS_ht + β^1_2hj ES_ht + β^1_3h LC_hjt + ε^1_hjt,   (4)

where

α^k_hj = customer h's preference for channel j during the trial stage (k = 0) or the post-trial stage (k = 1);

CS_ht = number of catalogs sent to customer h during the quarter in which purchase occasion t occurs;

ES_ht = number of e-mails sent to customer h during the quarter in which purchase occasion t occurs;

LC_hjt = 1 if customer h purchased from channel j on the purchase occasion before purchase occasion t, and 0 if otherwise (this captures state dependence2);

β^k_mhj = impact of catalogs (m = 1) or e-mails (m = 2) on customer h's utility for channel j during the trial stage (k = 0) or the post-trial stage (k = 1);

β^k_3h = impact of the previous purchase being in a particular channel on the utility for that channel, for customer h at purchase occasion t, during the trial stage (k = 0) or the post-trial stage (k = 1); and

ε^k_hjt = error term for customer h for channel j during the trial stage (k = 0) or the post-trial stage (k = 1) for purchase occasion t. For example, this includes situational factors not directly observable by the researcher.

The term α^k_hj captures channel preferences, β^k_mhj captures the importance of marketing, and β^k_3h captures the importance of inertia or state dependence. The relative sizes of these customer-specific parameters enable us to classify customers into the categories defined in figure 10.2, both before and after they change decision processes.
Using equations (3) and (4) and assuming an extreme value distribution for the error terms, we have two logit choice models. Equation (2) indicates the likelihood the customer is using the trial or post-trial model at purchase occasion t. Combining these components yields the following probability—that customer h chooses channel j at purchase occasion t:3

P_hjt = (1 − p_ht) [exp(U^0_hjt) / Σ_{j=1..J} exp(U^0_hjt)] + p_ht [exp(U^1_hjt) / Σ_{j=1..J} exp(U^1_hjt)].   (5)
With probability (1 − p_ht), the customer is still in the trial stage and thus is using the first logit model; with probability p_ht, the customer has evolved and is using the second logit model. On the customer's first purchase occasion, t = 1, so p_ht = 0 (equation (2)), and the customer's decision process is driven entirely by the trial model. Over time, p_ht increases to a degree determined by the customer's parameter q_h, and the process becomes increasingly driven by the post-trial model. For any purchase occasion, the customer's probability of choosing channel j is a weighted average of the trial and post-trial models, the weights being the likelihood that the customer is in the trial stage or the post-trial stage, respectively. Because the weight placed on the post-trial model increases as the customer accumulates more purchases (equation (2)), the post-trial model becomes more important as purchases accumulate.
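A compact sketch of the full choice probability in equation (5): two conditional logit models over the three channels, mixed with the evolution weight p_ht from equation (2). The parameter values are rounded versions of the table 10.3 estimates and are used only for illustration; the function and variable names are our own.

```python
import numpy as np

def logit_probs(intercepts, b_catalog, b_email, b_last, catalogs, emails, last_channel):
    """Conditional logit over channels (0 = store base, 1 = catalog, 2 = Internet)."""
    last = np.zeros(3)
    last[last_channel] = 1.0
    utilities = intercepts + b_catalog * catalogs + b_email * emails + b_last * last
    expu = np.exp(utilities - utilities.max())
    return expu / expu.sum()

def choice_prob(trial_pars, post_pars, q_h, t, catalogs, emails, last_channel):
    """Equation (5): mixture of trial and post-trial logits with weight p_ht (eq. 2)."""
    p_ht = 1 - (1 - q_h) ** (t - 1)
    p_trial = logit_probs(*trial_pars, catalogs, emails, last_channel)
    p_post = logit_probs(*post_pars, catalogs, emails, last_channel)
    return (1 - p_ht) * p_trial + p_ht * p_post

# Rounded illustrative parameters: (intercepts, catalogs-sent, e-mails-sent, state dependence).
trial = (np.array([0.0, 0.8, -2.5]), np.array([0.0, -1.4, -0.1]),
         np.array([0.0, 0.6, 2.3]), 4.0)
post = (np.array([0.0, 0.4, -3.8]), np.array([0.0, -9.0, 0.0]),
        np.array([0.0, 3.3, 0.4]), 3.1)

print(choice_prob(trial, post, q_h=0.05, t=10, catalogs=2, emails=1, last_channel=1))
```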
10.4 Data and Estimation
Our main data set is from a major European book retailer operating in one country. The retailer uses three sales channels—physical retail stores, catalogs, and the Internet—and uses a subscription business model. Thus, to be eligible to purchase, each customer must become a member. A code is associated with each customer, tracking each time she or he purchases and from which channel. Therefore, we have information on which channel each customer selected on each purchase occasion, the date of each purchase, how much was spent, the number and types of communications (e-mails and catalogs) each customer received and when they were received, and age and gender demographics. Each channel shares the same assortment and price. Communications were not targeted according to channel usage. This means that
each channel was treated the same in terms of marketing activities. This gives us the opportunity to measure intrinsic customer preferences for each channel. The period of observation is January 2002–June 2006. The retailer monitored customers and made customer-marketing decisions on a quarterly basis. Therefore, we assemble each customer's purchase occasions, channel choice, and the catalogs and e-mails received on a quarterly basis. On the rare occasions when there was more than one purchase occasion in a quarter (this happened in only 4.2% of the 14,985 customer-quarter observations), we used the single purchase occasion. We found no difference in the results when we estimated the model using all purchase occasions.

We sample a cohort of new customers who live in at least one store's service area and who entered into a subscription agreement with the company during the fourth quarter of 2001. We focus on a static sample of active customers (i.e., customers having an active relationship with the company between January 2002 and June 2006) with all three channels available throughout the relationship. The sample size is 1018 households. Table 10.1 presents descriptive information about the sample. We use Bayesian estimation (see Web Appendix W2 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx) to estimate the model.

Catalogs and e-mails sent (CS_ht and ES_ht) can depend on the customer's past choices and thus might be endogenous. We used an approach similar to Gönül, Kim, and Shi (2000) to minimize endogeneity bias (see Web Appendix W3 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx).
10.5 Results
10.5.1 The Existence of Choice Evolution and Its Variation Across Customers

We test for the existence of decision process evolution (H1) by comparing our proposed model (equations (2)–(5)) with a logit model that does not distinguish between trial and post-trial stages (Model 1).4 We test whether the likelihood of evolution varies across customers (H2) by comparing the proposed model with a logit model that assumes each customer starts off using one logit model and moves to another logit model after the same a priori defined number of purchase occasions
Table 10.1
Descriptive Statistics of the Selected Cohort of Customers (n = 1018)

Variable                                                    M        SD       Min      Max
Average number of purchase occasions (per year)             3.1      1.1      .0       4.0
Average number of purchase occasions over relationship      14.7     2.9      7.0      18.0
Average returns (in US$ per quarter)                        .4       3.5      .0       104.4
Average number of catalogs received (per quarter)           2.0      .7       .0       4.0
Average number of e-mails received (per quarter)            .7       2.0      .0       12.0
Age (years)                                                 43.8     14.6     21.0     88.0
Gender (male)                                               35.2%

Channel Usage^a                   % of Customers     Average Number of Purchase Occasions per Customer
Mainly catalog                    27.6%              14.6
Mainly Internet                   .6%                14.8
Mainly store                      42.4%              15.3
Catalog and store                 15.4%              13.4
Catalog and Internet              8.4%               14.7
Internet and store                1.2%               15.3
Catalog, Internet, and store      4.3%               14.6

a. "Mainly" means that at least 95% of purchases were made on that channel. Similarly, we classified as two-channel/three-channel users those customers who made at least 95% of purchases using two channels/three channels.
(nine; Model 2).5 We compare these models using both the deviance information criterion (DIC) statistic (Spiegelhalter et al. 2002) and the hit rate (see Web Appendix W4 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx). The proposed model (DIC = 6525.2, hit rate = 72.9%), which assumes a distinct trial and post-trial logit model and a heterogeneous trial period length across customers, is superior to both Model 1 (DIC = 6881.0, hit rate = 57.9%) and Model 2 (DIC = 6648.2, hit rate = 56.9%). These results support H1 and H2, indicating the existence of channel choice evolution and suggesting that the likelihood of evolution varies across customers.
10.5.2 Prevalence of Decision Process Evolution and Channel Usage of Learners Versus Stayers

The preceding results suggest that customers change decision processes. Next, we want to know how prevalent this behavior is. To investigate this, we compute the probability that customer h evolves to the post-trial stage by his or her last observed purchase occasion (see Web Appendix W5 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx). The expected value of this probability is 0.22, which means that we expect 22% of customers to have changed decision processes by the end of the observation period. We ordered customers according to their individual probability and labeled the 22% highest as the learners. The logic is that the customers with the highest probability of changing are the ones we would expect to change by the end of the observation period. The remaining customers were classified as stayers (i.e., not expected to change by the end of the observation period). The average probability of changing for stayers is 0.14 and for learners is 0.49. Using equation (2), we calculated that, on average, it takes the learner nine purchase occasions to evolve (see Web Appendix W5 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx). By the same token, it takes an average of 32 purchase occasions for a stayer to evolve, indicating a clear separation between the two groups. (A computational sketch of this learner/stayer classification follows table 10.2.)

Table 10.2 shows learners' versus stayers' usage of various channels and reveals an intriguing finding: Stayers are mainly single-channel users, while learners are predominantly multichannel. We conducted a sensitivity analysis regarding this finding by relaxing the 95% requirement used to identify single-channel customers (see note a in table 10.2). We find that this result is robust to the rule for defining a multichannel customer. This is an important exploratory finding and a characterization of the multichannel customer previously not demonstrated.

10.5.3 Describing and Contrasting the Trial and Post-trial Decision Processes

Table 10.3 presents the estimates of the mean parameters for the proposed model. The intercept estimates suggest that customers prefer the store to the Internet in both trial and post-trial stages. This may seem counterintuitive given the popularity of the Internet in book retailing. However, the customers in our sample essentially had joined a loyalty club for a book retailer and therefore presumably were avid readers.
Table 10.2
Stayers Versus Learners: Channel Choice Behavior^a

                                  Stayers (793 Customers)     Learners (225 Customers)
Channel Usage
  Mainly catalog                  35.4%                       —
  Mainly Internet                 .8%                         —
  Mainly store                    54.5%                       —
  Catalog and store               6.2%                        48.0%
  Catalog and Internet            2.6%                        28.9%
  Internet and store              .1%                         4.9%
  Catalog, Internet, and store    .4%                         18.2%
Multiple-Channel Shopper
  Yes                             9.3%                        100.0%
    Two-channel buyer             8.9%                        81.8%
    Three-channel buyer           .4%                         18.2%
  No                              90.7%                       —

a. "Mainly" means that at least 95% of purchases were made on that channel. Similarly, we classified as two-channel/three-channel users those customers who made at least 95% of purchases using two channels/three channels.
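The learner/stayer split described in section 10.5.2 can be sketched as follows: compute each customer's probability of having evolved by his or her last observed purchase occasion (equation (2)), rank customers, and label the highest-probability share (the expected 22%) as learners. The q_h values and occasion counts below are simulated placeholders, not the estimates from the model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins for the posterior means of q_h and each customer's
# number of observed purchase occasions (the paper uses the estimated values).
n_customers = 1018
q_h = rng.beta(2, 40, size=n_customers)             # small per-occasion evolution rates
t_last = rng.integers(7, 19, size=n_customers)      # last observed purchase occasion

prob_evolved = 1 - (1 - q_h) ** (t_last - 1)        # eq. (2) at the last occasion
expected_share = prob_evolved.mean()                # expected fraction of learners

cutoff = np.quantile(prob_evolved, 1 - expected_share)
is_learner = prob_evolved >= cutoff                 # highest-probability customers

print(round(float(expected_share), 2), int(is_learner.sum()), int((~is_learner).sum()))
```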
Table 10.3
Estimates of Parameter Means: Two-Stage Channel Choice Model

                        Catalog Versus Store             Internet Versus Store
                        Parameter^a     Elasticity^b     Parameter^a     Elasticity^b
Intercept
  Trial                 .85 (.47)       —                −2.48 (.39)     —
  Post-trial            .38 (.92)       —                −3.78 (.46)     —
Catalogs Sent
  Trial                 −1.45 (.36)     −7.24            −.12 (.63)      −.33
  Post-trial            −9.10 (.97)     −20.56           −.03 (.61)      −.46
E-mails Sent
  Trial                 .63 (.59)       1.83             2.35 (.34)      .93
  Post-trial            3.28 (.75)      4.90             .41 (.70)       −.49
State Dependence
  Trial                 4.09 (.60)
  Post-trial            3.14 (.62)

a. A positive coefficient means that a customer is more likely to choose channel j than the base channel. The base channel is the store.
b. We computed elasticities at the mean value of the continuous variables and the mode of the categorical variables.
Notes: Bold indicates that the 95% posterior distribution for the parameter does not include zero. The standard deviation of the posterior distribution is in parentheses. For identification purposes, we set one channel (the store) as the base.
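Footnote b refers to elasticities evaluated at the mean of the continuous variables. For a multinomial logit in which a customer-level variable x enters each channel's utility with a channel-specific coefficient, the elasticity of P_j with respect to x is x(β_j − Σ_k P_k β_k). The sketch below illustrates that calculation with the rounded trial-stage coefficients and the mean catalogs per quarter from table 10.1; it will not reproduce the table 10.3 elasticities, which are computed with the full covariate set and customer-level heterogeneity.

```python
import numpy as np

def mnl_probs(utilities):
    """Multinomial logit choice probabilities."""
    expu = np.exp(utilities - np.max(utilities))
    return expu / expu.sum()

def elasticity(x_value, betas, probs, j):
    """Elasticity of P_j with respect to a customer-level variable x that enters
    each channel's utility with channel-specific coefficient betas[k]:
    dlnP_j / dlnx = x * (betas[j] - sum_k P_k * betas[k])."""
    return x_value * (betas[j] - probs @ betas)

# Illustrative values (channels: 0 = store, 1 = catalog, 2 = Internet).
intercepts = np.array([0.0, 0.85, -2.48])      # trial intercepts from table 10.3
beta_catalogs = np.array([0.0, -1.45, -0.12])  # trial catalogs-sent coefficients
mean_catalogs = 2.0                            # mean catalogs per quarter (table 10.1)

utilities = intercepts + beta_catalogs * mean_catalogs
probs = mnl_probs(utilities)
print(round(float(elasticity(mean_catalogs, beta_catalogs, probs, j=1)), 2))
```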
It is sensible that such customers would enjoy browsing through books and purchasing in the store. Also, the stores were a relatively new channel for the retailer, which opened its first store in 2000. Thus, in contrast to the prevailing tradition in the book industry of bookstores being the "incumbent" channel, in our case, the store was the "novel" channel. While the store is, on average, preferred over the Internet, the intercept for the catalog over the store is not significant for either the trial or post-trial periods, showing that these two channels were equally preferred on average.

10.5.4 Characterizing Evolution Patterns

We now focus on learners and classify them in terms of the taxonomy depicted in figure 10.2; in other words, we quantify the extent to which different decision processes are employed in the trial versus post-trial stages and examine the patterns by which learners evolve from one process to another. The customer-level parameter estimates enable us to accomplish this classification.

By comparing the magnitudes of the channel preference parameters with the state dependence parameters, we classify customers' decision processes as either preference based or non-preference based. For example, if customer h's state dependence parameter is higher than his or her preference for the catalog over the store and the Internet over the store (in absolute values), we classify this customer as non-preference based (for details, see table 10.4, panel A). After classifying customers as preference or non-preference based, we further distinguish them according to their marketing responsiveness. We compare individual marketing elasticities with the median value across customers and classify customers into the high marketing responsiveness group if their e-mail or catalog elasticities are greater than the respective median elasticities (for details, see table 10.4, panel B; a sketch implementing these rules follows the table).

We used the aforementioned decision rules to classify each learner into one of the four decision process categories. We then "cross-tabbed" category membership by decision stage to examine the patterns by which customers evolve from one decision process category to another. Figure 10.3 shows these evolutions. The key findings are as expected: a tendency to evolve from non-preference-based to preference-based decision making and a tendency to evolve to lower marketing responsiveness. In particular, figure 10.3 highlights two main types of evolutions: from Trial Category 3 to Post-trial Category 1 (42% of all learners); and from Trial Category 3 to Post-trial Category 2 (51%).
Table 10.4 Classifying Learners

A: Preference Based or Non-Preference Based

Categories | Decision Strategy | Classification Rule (a)
1 | Preference based | State dependence < Preference (C vs. S); State dependence < Preference (I vs. S)
2 | Preference based | State dependence > Preference (C vs. S); State dependence < Preference (I vs. S)
3 | Preference based | State dependence < Preference (C vs. S); State dependence > Preference (I vs. S)
4 | Non-preference based | State dependence > Preference (C vs. S); State dependence > Preference (I vs. S)

B: High Marketing Versus Low Marketing Responsiveness

Categories (b) | Marketing Responsiveness (c) | Classification Rule (a)
1 | High | Marketing elasticity (C vs. S) > Median marketing elasticity (C vs. S); Marketing elasticity (I vs. S) > Median marketing elasticity (I vs. S)
2 | High | Marketing elasticity (C vs. S) < Median marketing elasticity (C vs. S); Marketing elasticity (I vs. S) > Median marketing elasticity (I vs. S)
3 | High | Marketing elasticity (C vs. S) > Median marketing elasticity (C vs. S); Marketing elasticity (I vs. S) < Median marketing elasticity (I vs. S)
4 | Low | Marketing elasticity (C vs. S) < Median marketing elasticity (C vs. S); Marketing elasticity (I vs. S) < Median marketing elasticity (I vs. S)
a. C stands for the catalog, I for the Internet, and S for the store.
b. We have four categories in total (three under "High Marketing Responsiveness" and one under "Low Marketing Responsiveness") and consider two types of direct marketing communications (e-mails sent and catalogs sent). Therefore, we have a total of 4² = 16 possible outcomes.
c. We classify customers as low responsive to marketing only if all their marketing elasticities are less than the median values (or are not significant) across both the trial and post-trial models.
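For concreteness, the classification rules in table 10.4 can be expressed as a small amount of code. The sketch below assumes that individual-level posterior means for the state dependence, preference, and elasticity parameters are already available; the numbers in the example call are hypothetical.

```python
def classify_decision_strategy(state_dep, pref_c_vs_s, pref_i_vs_s):
    """Table 10.4, panel A: compare state dependence with the channel
    preference parameters (in absolute values). Only category 4 is
    non-preference based."""
    sd = abs(state_dep)
    above_c = sd > abs(pref_c_vs_s)
    above_i = sd > abs(pref_i_vs_s)
    if above_c and above_i:
        return 4, "non-preference based"
    if above_c:
        return 2, "preference based"
    if above_i:
        return 3, "preference based"
    return 1, "preference based"

def classify_responsiveness(elasticities, medians):
    """Table 10.4, panel B (and note c): a customer is classified as having
    low marketing responsiveness only if no elasticity exceeds its median."""
    any_above = any(elasticities[k] > medians[k] for k in elasticities)
    return "high" if any_above else "low"

# Hypothetical posterior means for a single customer.
print(classify_decision_strategy(state_dep=4.1, pref_c_vs_s=0.4, pref_i_vs_s=-0.5))
print(classify_responsiveness(
    elasticities={"catalog_c_vs_s": 0.25, "catalog_i_vs_s": 0.10,
                  "email_c_vs_s": 0.05, "email_i_vs_s": 0.40},
    medians={"catalog_c_vs_s": 0.20, "catalog_i_vs_s": 0.15,
             "email_c_vs_s": 0.10, "email_i_vs_s": 0.30}))
```

In this example the customer's state dependence dominates both preference parameters, so the first call returns category 4 (non-preference based), mirroring the example described in the text.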
[Figure 10.3 is a flow diagram. It maps the four trial-stage decision-process categories to the four post-trial categories through two branching questions ("Pref > Inertia?" and "High marketing responsiveness?") and labels each surviving path with the share of learners who follow it; the two largest flows, 42% and 51%, run from Trial Category 3 to Post-trial Categories 1 and 2.]
Figure 10.3 Estimated Trial to Post-trial Learner Migration Patterns. Notes: Only paths above a 2% threshold are represented. Following table 10.4, panel B, customers are classified as more marketing responsive if any of their four marketing elasticities (catalogs: catalog versus store; catalogs: Internet versus store; e-mails: catalog versus store; e-mails: Internet versus store) are above the median for that elasticity across customers. As long as the customer is exceptionally responsive to some form of marketing, we classify him or her as highly responsive. Although we could have used other classification rules, the important issue is not the level of responsiveness per se, but rather how marketing responsiveness changes from the trial to the post-trial stage. Using this rule, we find that 100% of the learners are highly marketing responsive in the trial period and 44% in the post-trial period. We also tested a more restrictive classification rule, requiring two marketing elasticities, not one, to be greater than the median. Using this rule, the number of responsive consumers decreases (from 100% to 91.6% in the trial period, and from 44% to 3.6% post-trial), but the same general finding holds: learners are less marketing sensitive after they change.
evolutions are from a non-preference-based to a preference-based decision process. Customers in Category 3 are responsive to marketing and non-preference based. This suggests that marketing bounces them from channel to channel during the trial stage. However, these customers eventually form strong channel preferences, which makes them preference-based decision makers. Some of them remain responsive to marketing (i.e., marketing can still influence their choice), but for the majority, marketing diminishes in importance.

10.6 Robustness Checks
10.6.1 Robustness with Respect to Assumptions

We conducted robustness checks of some of our key assumptions by: (1) investigating whether observed heterogeneity and time variance influence the customer's propensity to evolve; and (2) modeling a relationship between trial and post-trial parameters.

To investigate the first issue, we estimated models with q_h a function of customer variables (age, gender: Model A) and time-varying variables including marketing (Model B); lagged channel choice, lagged product returns, age, and gender (Model C); and lagged product returns (Model D). We found that none of these models improved over our proposed model, which assumes that q_h varies randomly across customers and does not change over time (see table 10.5). The proposed model's superiority to Model B indicates that marketing does not significantly affect the length of the trial period. The comparison with Model D suggests that at least one measure of unsatisfying experience—product returns— does not encourage learning. A possible reason for this is that returning a book is not an unsatisfying experience. This is in line with previous work showing that when firms effectively manage service failure, a negative experience might turn into a positive one. Product returns can contribute to generating customer satisfaction and reinforce the relationship with the firm (Petersen and Kumar 2009).

To investigate the second issue, we posited a general correlation matrix in model parameters across both stages or made the post-trial parameters functions of the trial parameters plus age, gender, and first channel chosen. None of these models fit better than the proposed model, and some did not converge (for details, see the Web Appendix W6 at https://www.ama.org/publications/JournalOfMarketing/Pages/JMTOC_2010.6.aspx).
Table 10.5 Robustness Checks for the Proposed Model

Model | Description | DIC | Hit Rate
Proposed model | Two-stage multinomial channel choice model (equations (2)–(5)) | 6525.8 | 72.9%
Model A | Two-stage multinomial channel choice model (with q_h a function of age and gender) | 6598.2 | 53.1%
Model B | Two-stage multinomial channel choice model (with q_ht a function of marketing) | 6999.4 | 49.2%
Model C | Two-stage multinomial channel choice model (with q_ht a function of age, gender, lagged number of channels used, and lagged returns) | 7284.8 | 43.9%
Model D | Two-stage multinomial channel choice model (with q_ht a function of lagged returns) | 7202.8 | 42.4%
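For readers unfamiliar with the fit statistic in table 10.5, the deviance information criterion can be computed directly from MCMC output (Spiegelhalter et al. 2002). A minimal sketch, assuming the sampler has already produced a vector of posterior deviance draws and the deviance evaluated at the posterior mean of the parameters (both variable names are hypothetical):

```python
import numpy as np

def dic(deviance_draws, deviance_at_posterior_mean):
    """Deviance information criterion: DIC = D_bar + p_D, where
    p_D = D_bar - D(theta_bar) (Spiegelhalter et al. 2002)."""
    d_bar = float(np.mean(deviance_draws))
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d

# Toy numbers: a lower DIC indicates better fit after the complexity penalty.
draws = np.array([6520.0, 6531.5, 6524.8, 6528.2])
print(dic(draws, deviance_at_posterior_mean=6510.0))
```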
Overall, we find that the relatively simple proposed model performed better than plausible though more complex models. It might be that the more complex models overtaxed the data or that the proposed model captures the essence of decision process evolution, so that elaborations on that model do not add significantly more explanatory power.

10.6.2 Replication in a Different Business Context

We perform a second empirical analysis to validate our results in a different business context. We obtained data from a US retailer that sells durables and apparel through catalogs, stores, and the Internet. The company opened two stores in July 2000 and May 2001 that were relatively close to each other. We select a sample of customers who first purchased from one of these stores after May 2001 and lived within fifteen miles of these stores. Therefore, we include only customers for whom the catalog, Internet, and store channels were available. The observation period is May 2001–September 2004. We include only active customers who made at least three purchases. The final sample size is 506 households. We have information on channel choice and catalog communications. Table 10.6 summarizes the data.

This is a useful data set because the opening of the stores is a discontinuity in the customer experience. Our theory posits that the
Table 10.6 Descriptive Statistics for the Durable and Apparel Data Set (n = 506)

Variable | M | SD | Min | Max
Average number of purchase occasions (per year) | 5.7 | 3.1 | 1.0 | 20.0
Average number of purchase occasions over relationship | 14.9 | 5.7 | 9.0 | 30.0
Average number of returns (per year) | .6 | 1.1 | .0 | 10.0
Average number of catalogs received (per week) | .6 | .8 | .0 | 6.0
Age (years) | 53.4 | 10.2 | 33.0 | 97.0

Channel Usage (a)
Mainly catalog | 9.3%
Mainly Internet | 5.7%
Mainly store | 13.6%
Catalog and store | 20.6%
Catalog and Internet | 4.0%
Internet and store | 3.8%
Catalog, Internet, and store | 43.1%
a. “Mainly” means that at least 95% of purchases were made on that channel. Similarly, we classified as two-channel users/three-channel users those customers who made at least 95% of purchases using two channels/three channels.
decision process evolves when consumers have the motivation and ability to learn and are unfamiliar with the task. These conditions apply to our first data set, because the customer was just acquired. We believe these conditions would hold in the case of the second data set as well, because there has been an abrupt change in the channel environment. Indeed, Moe and Yang (2009) show that disruptive events force consumers to reexamine and possibly adjust their preferences and habits.

We compare our proposed model with the logit model that does not distinguish between trial and post-trial stages (Model 1). We also compare with a multinomial logit that assumes that each customer changes decision process after the same a priori defined number (fifteen) of purchase occasions (Model 2).6 The DIC statistics confirm the superiority of the proposed model over Models 1 and 2 (DIC_proposed = 11,711; DIC_Model 1 = 12,082; DIC_Model 2 = 12,545). This provides reconfirmation of H1 and H2—that the customer decision process evolves and that this evolution is heterogeneous across customers. The expected value of the probability of changing by the end of the data collection is 0.35, which means that we expect 35% of customers to have changed
their decision processes. The average probability of changing for stayers is 0.25 and for learners is 0.53. We also find that the expected number of purchase occasions a learner needs to update his or her decision process is eleven. In contrast, it takes an average of thirty-nine purchase occasions for a stayer to change.

Table 10.7 presents the estimates for the proposed model. The intercept estimates show that the store is preferred over both the catalog and the Internet in the trial stage, and the magnitude increases considerably in the post-trial stage. However, the pseudo t-statistics for the preference parameters decrease in going from the trial to the post-trial stage (from 7.7 to 2.1 for catalog versus store; from 9.6 to 1.9 for Internet versus store). The number of catalogs sent is significant in the trial stage, but not in the post-trial stage. The pseudo t-statistics for catalogs sent decrease from the trial to post-trial stages (from 5.1 to 0.0 for catalog versus store; from 2.5 to 0.05 for Internet versus store), consistent with the changes in magnitude.

Table 10.7 Estimates of Parameter Means: Two-Stage Channel Choice Model (Durable and Apparel Data Set)

 | Catalog Versus Store | | Internet Versus Store |
 | Parameter (a) | Elasticity (b) | Parameter (a) | Elasticity (b)
Intercept, trial | −1.46 (.19) | — | −2.87 (.30) | —
Intercept, post-trial | −33.25 (15.90) | — | −8.66 (4.60) | —
Catalogs sent, trial | .51 (.10) | .22 | .35 (.14) | .17
Catalogs sent, post-trial | −.01 (2.82) | .04 | −.07 (1.44) | .09
Distance, trial | .14 (.02) | — | .14 (.03) | —
Distance, post-trial | 1.63 (1.08) | — | .54 (.45) | —
State dependence, trial | −.09 (.06) | | |
State dependence, post-trial | 2.35 (2.34) | | |

a. A positive coefficient means that a customer is more likely to choose channel j than the base channel. The base channel is the store.
b. We computed elasticities at the mean value of the continuous variables and the mode of the categorical variables.
Notes: Bold indicates that the 95% posterior interval for the parameter does not include zero. The standard deviation of the posterior distribution is shown in parentheses.
In summary, we cannot clearly support an increase in the importance of preference going from the trial to the post-trial stage. However, we find clear evidence for a decrease in the impact of marketing.

10.7 Managerial Implications: Identifying Learners and Marketing to Them

Our results have established that there is a learner segment—a segment that significantly changes its decision process within a relatively short time. However, two questions naturally arise: In concrete terms, who is the learner segment? And can we use this information to design more effective marketing programs? We answer the first question with a discriminant analysis and the second with a profitability scenario of a "right-channeling" strategy. We hypothesize five variables that should distinguish between learners and stayers:

1. Acquisition channel: Our focal company predominantly uses "street agents" or "door-to-door" agents to acquire customers. Street agents approach customers directly outside the company's bookstores; door-to-door agents go directly to prospects' residences. We hypothesize that customers acquired through door-to-door agents would be less familiar with the company and its various marketing channels. Therefore, they would be more open to information and open to learning about the various channels (i.e., they would more likely be learners).

2. Age: Younger customers should be more open to exploring different channels, and thus are more likely to be learners.

3. Gender: We have no particular hypotheses on gender, but given the literature on gender differences in shopping, we thought it was worthwhile to explore this variable.

4. Immediate e-mail suppliers: The firm asks for customers' e-mail addresses only when the customer is acquired. Some customers supply their e-mail addresses immediately, while others wait before doing so. Therefore, immediate e-mail suppliers are customers who provided their e-mail addresses at the time of acquisition. We hypothesize that the immediate suppliers would be more open to information, and thus more likely to be learners.

5. Big city dwellers: Customers who live in big cities should be more familiar with various shopping alternatives, and thus less likely to be learners.
Table 10.8, panel A, shows the means of these variables for learners and stayers. Three of our hypotheses are confirmed: Learners are more likely to be acquired through door-to-door agents, they are younger, and they are more likely to supply their e-mail address immediately.

Our next step is to estimate the discriminant function to use for classification. Results are shown in table 10.8, panel B.7 The hit rate is 83.8%, significantly better than the random assignment standard of 65.5% (p-value = 0.000) (c_fair in Lehmann, Gupta, and Steckel 1998, p. 663). The model does not perform equally well in classifying learners: 37% of learners are misclassified. However, 63% of our predicted learners are actually learners, whereas only 10.2% of the predicted stayers turn out to be learners. In summary, the discriminant analysis shows we can identify the learners in concrete terms and classify customers as either learners or stayers.

By definition, the learner segment changes its decision process relatively quickly after acquisition. To illustrate the implications of this and how our discriminant analysis could be used in marketing to this group, we conduct a profitability analysis involving right-channeling learners. We assume the firm's profit margin is higher for Internet purchases (50%) versus catalog or store purchases (30%). Therefore, the company wants to encourage learners to use the Internet, and it knows that e-mails tend to do this (table 10.3). The question is: Should those e-mails be used directly after customer acquisition, or later, after the customer has had a chance to get "settled"? Our results suggest the "e-mail-early" approach is better because learners are more receptive to marketing directly after they have been acquired.

The e-mail-early strategy allocates e-mails toward the purchase occasions immediately after the customer has been acquired. The e-mail-late strategy allocates e-mails toward the later purchase occasions. We use the discriminant function to identify learners and the parameter estimates obtained at the individual level to simulate channel choices under each strategy. Table 10.8, panel C, summarizes the results and shows that, as we expected, the e-mail-early campaign generates approximately 24% more profit (($15.52 − $12.54)/$12.54) by encouraging earlier usage of the less costly Internet channel. This analysis shows that learners can be identified sensibly with concrete measures, and profits can be leveraged through appropriate marketing communication activities.
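As a quick arithmetic check, the hit rate and the chance benchmark can be recomputed from the counts reported in table 10.8, panel B. The sketch below assumes c_fair is the proportional chance criterion, which reproduces the 65.5% figure to within rounding:

```python
# Counts from table 10.8, panel B.
correct_stayers, correct_learners, total = 708, 145, 1018
actual_stayers = 708 + 85  # actual stayers, whether predicted as stayer or learner

hit_rate = (correct_stayers + correct_learners) / total   # classification hit rate
p = actual_stayers / total
c_fair = p ** 2 + (1 - p) ** 2                             # proportional chance criterion
print(round(hit_rate, 3), round(c_fair, 3))                # roughly 0.838 and 0.656
```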
Table 10.8 Discriminant Analysis and Profitability Scenarios of Marketing to Learners

A: Mean Characteristics of Learners and Stayers

Variable | Learners | Stayers | p-Value for Difference
Door-to-door agent acquisition | 60.9% | 49.7% | .003
Age | 41.2 years | 44.5 years | .002
Gender | 70.2% female | 72.4% female | .525
Immediate e-mail suppliers | 24.9% | 2.7% | .000
Big city dwellers | 52.0% | 54.4% | .533
B: Classification Matrix of Learners Versus Stayers

 | Actual Group Membership | |
 | Stayer | Learner | Total
Predicted Group Membership (Number)
Stayer | 708 | 80 | 788
Learner | 85 | 145 | 230
Predicted Group Membership (Conditional Probability of Correct Prediction)
Stayer | 89.8% | 10.2% | 100%
Learner | 37.0% | 63.0% | 100%
Hit Rate = (708 + 145)/1018 = 83.8%

C: Profitability Analysis: E-Mail-Early Versus E-Mail-Late Strategy (a)

 | Profits per Purchase Occasion | | | Web Use | |
Strategy | First Period | Second Period | Entire Period | First Period | Second Period | Entire Period
E-mail early | $16.81 | $14.22 | $15.52 | 88.4% | 49.4% | 68.9%
E-mail late | $11.08 | $13.99 | $12.54 | 5.9% | 46.2% | 26.0%
a. We used customers who had at least seven purchases. We noted the number of e-mails each customer received. Then we reallocated them to a 70–30 split for the e-mail-early strategy and a 30–70 split for the e-mail-late strategy. The e-mails were used on purchase occasions 1, 2, and 3 (early) and 5, 6, and 7 (late), with the exact number depending on the number of e-mails the customer received and the allocation dictated by the strategy being simulated.
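The right-channeling simulation behind table 10.8, panel C, can be sketched in simplified form: simulate each purchase occasion's channel choice from a logit whose e-mail effect weakens once the learner moves past the trial stage, and value Internet purchases at the higher margin. Everything below (logit parameters, basket size, the occasion at which the customer changes) is hypothetical and stands in for the authors' individual-level estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

def choice_prob(emails, post_trial):
    """Illustrative logit: e-mails raise the Internet utility, but the effect
    shrinks once the learner has moved to the post-trial stage."""
    email_effect = 0.3 if post_trial else 1.5
    v = np.array([-0.5, -1.0 + email_effect * emails, 0.0])  # catalog, Internet, store
    expv = np.exp(v)
    return expv / expv.sum()

def profit_per_occasion(allocation, change_after=4, basket=20.0, n_customers=2000):
    margins = np.array([0.30, 0.50, 0.30])  # catalog, Internet, store margins
    total = 0.0
    for _ in range(n_customers):
        for occ, emails in enumerate(allocation, start=1):
            p = choice_prob(emails, post_trial=occ > change_after)
            channel = rng.choice(3, p=p)
            total += basket * margins[channel]
    return total / (n_customers * len(allocation))

early = [1, 1, 1, 0, 0, 0, 0]  # e-mails concentrated on occasions 1-3
late = [0, 0, 0, 0, 1, 1, 1]   # e-mails concentrated on occasions 5-7
print(profit_per_occasion(early), profit_per_occasion(late))
```

Under these assumed parameters the early allocation earns more per occasion because the e-mails arrive while the customer is still in the responsive trial stage, which is the qualitative pattern reported in panel C.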
10.8 Conclusions
10.8.1 Key Results and Implications

The central questions of this study are as follows: (1) Do customers evolve to a different channel choice process over time? (2) If so, how many customers evolve? (3) How can we characterize the decision process of customers who evolve? and (4) What are the managerial implications for channel management?

To answer these questions, we characterized the channel choice decision process using the three main factors that previous researchers have used to study channel choice: channel preference, state dependence or inertia, and marketing. We hypothesized that: (1) the channel choice decision process evolves over time; (2) the time of this evolution varies across customers; (3) channel preferences become more relevant over time; (4) state dependence becomes less relevant over time; and (5) marketing communication is less effective over time. We developed a two-stage channel choice model to test these hypotheses and answer our key questions. We investigated customer channel choice behavior in a contractual setting in the book retail industry. As a robustness check, we analyzed a US retailer of durables and apparel. Our key findings and managerial implications are as follows:

• Customers' channel choice decision processes evolve over time. This provides managers with the strategic insight that new-to-the-channel customers need to be treated differently from mature customers in terms of channel decisions. Operationally, it means that managers must take into account how recently the customer has been acquired in predicting channel choices.

• A significant portion of customers are expected to shift (ranging from 22% [newly acquired customers] to 35% [new channel introduction]). This suggests that the customer's speed of evolution can serve as a basis for segmentation: learners versus stayers.

• The predominant pattern of evolution is from a more marketing-responsive to a less marketing-responsive decision process. This means that the newly acquired customer or the customer who experiences a new channel introduction is more receptive to right-channeling than the mature customer. In general, mature customers will be less amenable to marketing efforts aimed at getting them to try new channels. This suggests that channel strategies for mature customers should ensure these customers are very satisfied with the service experience of the channel(s) they have chosen to use.
• Learners can be characterized through discriminant analysis, and the discriminant analysis can be used to identify the learner segment. In our case of the book retailer, we found that learners are likely to be acquired through door-to-door agents, immediately provide their e-mail addresses, and be younger. The result is a concrete simulation of how managers can identify learners and how they might market to them. Policy simulation can be used to examine the impact of various marketing strategies. We illustrate how an e-mail-early strategy can increase learners' profitability. Transitioning consumers early into the cheapest Internet channel increased the company's profits by approximately 24%.

The key hypotheses that (1) the customer decision process evolves, (2) the evolution is heterogeneous across customers, and (3) customers become less responsive to marketing over time were confirmed in two databases in very different situations (after the customer is acquired by a book retailer and after the introduction of a new purchase channel by a durables/apparel retailer). We confirmed our hypotheses that the process becomes more preference based and less inertial over time for the book retailer database but not for the durables/apparel database. The importance of preference in the trial stage in the durables/apparel database could be due to customers being highly familiar with purchase channels in this industry. The type of store the company introduced was similar to that of other US competitors. Therefore, adding a store was not an entirely new experience for the customer. In terms of the learning theory that motivates decision process evolution, customers were relatively familiar with the task, which diminished learning.

The most important conceptual contribution of this work is that the customer's channel choice process evolves over time. We are encouraged regarding the generalizability of this finding, given that it held in two quite different databases. The underlying theoretical reason for evolution is customer learning, stimulated by three conditions: motivation and ability, lack of task familiarity, and dissatisfying experiences. To the extent these conditions are present in other decision processes (e.g., brand choice, price search, service usage), we would expect this evolution to hold beyond the domain of channel choice. We do not advocate our model as the only way to model this evolution; the main point is that evolution should be considered in some way. We believe the taxonomy we developed for characterizing the
channel choice process (figure 10.2) is useful and produced a revealing portrait of choice process evolution (figure 10.3).

10.8.2 Limitations and Further Research

There are limitations to this study that provide opportunities for further research. First, our work is based on what we can observe, namely, customers' purchase histories. A well-designed panel survey could enrich the characterization of stayers and learners, and, of course, of their decision processes.

Second, we estimated different specifications of the proposed model that relax the assumption of independence between trial and post-trial stages. Although our results provide support for the independence assumption, the data we used might not be strong enough to pick this relationship up, and/or the alternative modeling specifications might overtax the model with too many parameters. Additional research could investigate this issue further.

Third, the population of interest in this study is heavy users who account for high sales. This focuses attention on the best customers and avoids including customers active only during the trial stage. Work in the future might examine the evolution of the channel choice decision process for light users while accounting for customer churn. This would allow for a more complete analysis of customer retention and profitability.

Fourth, we focus on channel choice for the purchase decision and do not investigate search behavior (Verhoef, Neslin, and Vroomen 2007). Further research might explore the choice of channels for both search and purchase and the evolution of these two decision processes, but this would require the availability of both search and choice longitudinal data.

Fifth, we examined the impact of a single disruption in the environment on the ensuing transition from a trial to a post-trial decision process. A more general analysis would observe several disruptions. Related to this is the question of whether, in the face of a disruption, the trial model starts from scratch or is related to the previous decision process the customer was using. To explore this, we examined our new store opening data. In that case, we had data for a subset (n = 115 customers) who purchased before the opening of the stores. For these customers, we estimated a binomial logit model for choosing between the catalog and the Internet during the pre-store-opening period. We
estimated individual-specific coefficients and correlated them with the individual-specific coefficients we estimated for the trial model. We found a correlation of 0.38 between the intercept terms representing preference for catalog versus Internet and a correlation of 0.43 between the catalog response coefficients representing impact on catalog choice rather than Internet. These correlations were both statistically significant at the 0.05 level, and, in our view, promising, given the small sample size and noise level in the data. These results also reinforce our conjecture that there was no change in the importance of preference associated with the opening of the store because of already established task familiarity. Notably, and consistent with Moe and Yang's view of environment changes as disruptions of customer habits, we found virtually no correlation (−0.02) between the state dependence coefficients. Overall, this points to the plausibility of multi-event modeling in which trial decision processes build on the decision process that existed before the market disruption.

Sixth, we did not examine the specific content of e-mails and catalogs as it pertains to channel choice. Certainly, the catalog, for example, listed store locations and included the URL of the company's website. However, to our knowledge, there were no time-specific campaigns to encourage customers to use various channels. An examination of the potential for such campaigns would be interesting grounds for further research.

Seventh, a significant finding of our work is that marketing information affects channel choice in the trial stage but does not affect the timing of moving from the trial to the post-trial stage. This suggests that evolution is a customer-specific trait. Our robustness checks verify this result, but it is possible that in different contexts, marketing could influence not only channel choice but also the speed of evolution. It may be possible to design the information content of marketing to influence the timing of evolution. This would be supported, for example, by Petty and Cacioppo (1986), who suggest that the ability to learn can depend on the message. We believe this is a fertile area for further research in the context of learning new decision processes, the domain of our work.

Eighth, the focus of this study is on the customer channel choice decision process. Further research might also investigate whether other aspects of the customers' decisions, particularly purchase incidence and purchase quantity, evolve over time.
Notes

Reprinted by permission from Journal of Marketing, published by the American Marketing Association, Sara Valentini, Elisa Montaguti, and Scott A. Neslin, "Decision Process Evolution in Customer Channel Choice," volume 75 (November 2011), pp. 72–86. The authors thank Leonard Lee and Carl Mela for helpful discussions on behavioral theory and statistical analysis, respectively; Bruce Hardie for comments and suggestions; and Paul Wolfson for computer programming support. They thank two anonymous companies that provided the data used in this research. Finally, they express their appreciation to the three anonymous JM reviewers for their diligence and constructive suggestions.

1. Note that although we emphasize the role of learning, there are other potential reasons the decision process might change. In the channel choice context, these would include obtaining better Internet connectivity.

2. State dependence refers to the influence of previous choices on current choices. We used a simple zero-one formulation of the state dependence variable (LC) because previous research is dominated by the operationalization of state dependence to equal 1 if the alternative was chosen at the previous purchase occasion (e.g., Ansari, Mela, and Neslin 2008; Thomas and Sullivan 2005).

3. We also estimated equation (5) with interaction effects and/or diminishing returns to marketing investments (e.g., Thomas and Sullivan 2005; Venkatesan, Kumar, and Ravishanker 2007). In addition, we tested a model with a catalog stock variable that allows for cumulative and carryover effects as the customer gets more catalogs. However, the fit of all these models was lower, and the interpretation of the parameter estimates was essentially identical.

4. Model 1 essentially assumes that α_hj^0 = α_hj^1 and β_mhj^0 = β_mhj^1 for m = 1, 2, 3 in equations (3)–(4).

5. Nine is the median number of purchase occasions. We tested other numbers and also found the proposed model was superior (DIC = 6715.9 for three purchase occasions, 6725.5 for five, 6804.3 for twelve, and 6970.6 for fifteen versus DIC = 6525.8 for the proposed model). The model with one purchase occasion yielded DIC = 6631.1, but did not converge. We also tested a model for which q_h was constant across customers (DIC = 10,204.0), but it also did not converge.

6. We use fifteen because this was the median number of purchase occasions. We tested other numbers and also found the proposed model was superior (DIC = 12,085.5 for three purchase occasions, 11,995.7 for five, 12,371.5 for twelve, and 12,275.5 for eighteen versus DIC = 11,711.0 for the proposed model).

7. Note that in addition to the variables in table 10.8, panel B, we added fixed effects for zip code to control more specifically for the location of the customer's residence. We included 256 of these fixed effects in the discriminant analysis.
References

Aaker, David A. 1971. "The New-Trier Stochastic Model of Brand Choice." Management Science 17 (5):435–450.
Alba, Joseph W., and Wesley J. Hutchinson. 1987. "Dimensions of Consumer Expertise." Journal of Consumer Research 13 (4):411–454.
Ansari, Asim, Carl F. Mela, and Scott A. Neslin. 2008. "Customer Channel Migration." Journal of Marketing Research 45 (February):60–76.
Bettman, James R., and C. Whan Park. 1980. "Effects of Prior Knowledge and Experience and Phase of the Choice Process on Consumer Decision Processes: A Protocol Analysis." Journal of Consumer Research 7 (3):234–248.
Blattberg, Robert C., Byung-Do Kim, and Scott A. Neslin. 2008. Database Marketing: Analyzing and Managing Customers. New York: Springer.
Fader, Peter S., Bruce G. S. Hardie, and Chun-Yao Huang. 2004. "A Dynamic Changepoint Model for New Product Sales Forecasting." Marketing Science 23 (1):50–65.
Gollwitzer, Peter M., and Ute Bayer. 1999. "Deliberative Versus Implemental Mindsets in the Control of Action." In Dual-Process Theories in Social Psychology, ed. Shelly Chaiken and Yaacov Trope, 403–422. New York: Guilford.
Gönül, Füsun F., Byung-Do Kim, and Mengze Shi. 2000. "Mailing Smarter to Catalog Customers." Journal of Interactive Marketing 14 (2):2–16.
Granbois, Donald H. 1977. "Shopping Behavior and Preferences." In Selected Aspects of Consumer Behavior: A Summary from the Perspective of Different Disciplines, 259–298. Washington, DC: U.S. Printing Office.
Heilman, Carrie M., Douglas Bowman, and Gordon P. Wright. 2000. "The Evolution of Brand Preferences and Choice Behaviors of Consumers New to a Market." Journal of Marketing Research 37 (May):139–155.
Hoch, Stephen J., and John Deighton. 1989. "Managing What Consumers Learn from Experience." Journal of Marketing 53 (April):1–20.
Johnson, Eric J., and J. Edward Russo. 1984. "Product Familiarity and Learning New Information." Journal of Consumer Research 11 (1):542–550.
Knox, George A. H. 2006. "Modeling and Managing Customers in a Multichannel Setting." Working paper, The Wharton School, University of Pennsylvania.
Lee, Leonard, and Dan Ariely. 2006. "Shopping Goals, Goal Concreteness, and Conditional Promotions." Journal of Consumer Research 33 (1):60–71.
Lehmann, Donald R., Sunil Gupta, and Joel H. Steckel. 1998. Marketing Research. Reading, MA: Addison-Wesley.
LeSage, James P., and Manfred M. Fischer. 2010. "Spatial Econometric Methods for Modeling Origin-Destination Flows." In Handbook of Applied Spatial Analysis: Software Tools, Methods, and Applications, ed. Manfred M. Fischer and Arthur Getis, 409–434. Berlin: Springer-Verlag.
MacInnis, Deborah J., Christine Moorman, and Bernard J. Jaworski. 1991. "Enhancing and Measuring Consumers' Motivation, Opportunity, and Ability to Process Brand Information from Ads." Journal of Marketing 55 (October):32–53.
Mattila, Anna S. 2003. "The Impact of Cognitive Inertia on Postconsumption Evaluation Processes." Journal of the Academy of Marketing Science 31 (3):287–299.
Meyer, Robert J., and Arvind Sathi. 1985. "A Multiattribute Model of Consumer Choice During Product Learning." Marketing Science 4 (1):41–61.
Moe, Wendy W., and Sha Yang. 2009. "Inertial Disruption: The Impact of a New Competitive Entrant on Online Consumer Search." Journal of Marketing 73 (January):109–121.
Myers, Joe, Evan Van Metre, and Andrew Pickersgill. 2004. "Steering Customers to the Right Channels." The McKinsey Quarterly (October). Available at http://www.mckinseyquarterly.com/Steering_customers_to_the_right_channels_1504 (accessed July 28, 2011).
Narayanan, Sridhar, Puneet Manchanda, and Pradeep K. Chintagunta. 2005. "Temporal Differences in the Role of Marketing Communication in New Product Categories." Journal of Marketing Research 42 (August):278–290.
Neslin, Scott A., Dhruv Grewal, Robert Leghorn, Venkatesh Shankar, Marije L. Teerling, Jacquelyn S. Thomas, and Peter C. Verhoef. 2006. "Challenges and Opportunities in Multichannel Customer Management." Journal of Service Research 9 (2):95–112.
Neslin, Scott A., and Venkatesh Shankar. 2009. "Key Issues in Multichannel Customer Management: Current Knowledge and Future Directions." Journal of Interactive Marketing 23 (1):70–81.
Noble, Steve, Amy Guggenheim Shenkan, and Christiana Shi. 2009. "The Promise of Multichannel Retailing." The McKinsey Quarterly: McKinsey on Marketing (October). Available at http://www.mckinseyquarterly.com/Marketing/Digital_Marketing/The_promise_of_multichannel_retailing_2448 (accessed July 28, 2011).
Payne, John W., James R. Bettman, and Eric J. Johnson. 1992. "Behavioral Decision Research: A Constructive Processing Perspective." Annual Review of Psychology 43:87–131.
Petersen, Andrew, and V. Kumar. 2009. "Are Product Returns a Necessary Evil? Antecedents and Consequences." Journal of Marketing 73 (May):35–51.
Petty, Richard E., and John T. Cacioppo. 1986. "The Elaboration Likelihood Model of Persuasion." In Advances in Experimental Social Psychology, vol. 19, ed. Leonard Berkowitz, 123–205. New York: Academic Press.
Spiegelhalter, David J., Nicola G. Best, Bradley P. Carlin, and Angelika van der Linde. 2002. "Bayesian Measures of Model Complexity and Fit." Journal of the Royal Statistical Society, Series B 64:583–639.
Thomas, Jacquelyn S., and Ursula Y. Sullivan. 2005. "Managing Marketing Communications with Multichannel Customers." Journal of Marketing 69 (October):239–251.
Tversky, Amos, and Daniel Kahneman. 1974. "Judgment Under Uncertainty: Heuristics and Biases." Science 185 (4157):1124–1131.
Venkatesan, Rajkumar, V. Kumar, and Nalini Ravishanker. 2007. "Multichannel Shopping: Causes and Consequences." Journal of Marketing 71 (April):114–132.
Verhoef, Peter C., Scott A. Neslin, and Björn Vroomen. 2007. "Multichannel Customer Management: Understanding the Research-Shopper Phenomenon." International Journal of Research in Marketing 24 (2):129–148.
Weinberg, Bruce D., Salvatore Parise, and Patricia J. Guinan. 2007. "Multichannel Marketing: Mindset and Program Development." Business Horizons 50:385–394.
Weiner, Bernard. 1986. An Attributional Theory of Motivation and Emotion. New York: Springer-Verlag.
11
The Value of Social Dynamics in Online Product Ratings Forums
Wendy W. Moe and Michael Trusov
In recent years, online product ratings and reviews have taken on a larger role in the consumer decision process. Not only are more consumers contributing their opinions, but potential buyers are also increasingly relying on the information provided by others in these forums. The result is that online customer ratings have the potential to significantly affect product sales (Chevalier and Mayzlin 2006; Clemons, Gao, and Hitt 2006; Dellarocas, Zhang, and Awad 2007).

In theory, online review forums facilitate the exchange of information and help consumers make more informed decisions. For example, Chen and Xie (2008) suggest that online reviews created by users can work as "sales assistants" to help novice consumers identify the products that best match their idiosyncratic preferences. The authors argue that, in the absence of review information, novice consumers may be less likely to buy a product if only seller-created product attribute information is available, suggesting that the availability of consumer-generated reviews may lead to an increase in sales.

The common underlying assumption of studies that investigate the impact of consumer reviews on product sales is that posted product ratings reflect the customers' experience with the product independent from the ratings of others. However, researchers have shown that posted product ratings are subject to a number of influences unrelated to a consumer's objective assessment of the product. For example, Schlosser (2005) shows that posted product ratings are influenced by social dynamics. Specifically, the rating a person posts for a product is affected by previously posted ratings. In addition, Godes and Silva (2009) demonstrate ratings dynamics that result in a negative trend in posted product ratings as the volume of postings increase. Li and Hitt (2008) observe a similar trend.
The consequence of these ratings dynamics is that user-provided product ratings do not always accurately reflect product performance, yet they still have the potential to significantly influence product sales. This can be disconcerting for product marketers, and as a result, many marketers are investing in activities intended to create a more favorable ratings environment for their products with the intention of boosting sales (Dellarocas 2006).

Our objective in this article is to measure the value of the social dynamics found in online customer rating environments. We explicitly model the arrival of posted product ratings and separate the effects of social dynamics on ratings from the underlying baseline ratings behavior (which we argue reflects the consumers' "socially unbiased" product evaluations).1 In many ratings forums, consumers evaluate products along a five-star scale. Therefore, we capture the ratings process by modeling the arrival of ratings within each star level as five separate (but interrelated) hazard processes. This process enables us to capture the timing and the valence of posted ratings simultaneously. In addition, we include time-varying hazard covariates to capture the effect of social influence on ratings behavior. The resulting model estimates enable us to compute a set of ratings metrics that represent the expected ratings behavior both with and without the effect of social dynamics. We then decompose observed ratings into a baseline ratings component, a component that represents the impact of social dynamics, and an idiosyncratic error component. To capture the sales impact of social dynamics, we model sales as a function of these component-ratings metrics.

Our model results indicate that there are substantial social dynamics in the ratings environment. We study these dynamics further by examining both their direct and indirect effects on product sales. We show specifically that the dynamics observed in a product's ratings valence (or average rating) can have direct and immediate effects on sales. We also show that these dynamics can have additional indirect effects on future sales through their influence on future ratings. These indirect effects can mitigate the long-term impact of ratings dynamics on sales, particularly in cases in which the level of opinion variance is increased.

One unique aspect of our research that differentiates it from previous studies is that we model variation in sales over a large set of products over time. Many of the published studies in this area of research use publicly available data on ratings and product sales rank at a fixed point in time, allowing only for cross-sectional analyses (see, e.g.,
Chevalier and Mayzlin 2006). In contrast, we use a data set obtained from an online retailer that contains longitudinal ratings and sales data, which enables us to examine changes in product sales from period to period as a function of changes in the ratings environment. The longitudinal nature of the data also provides multiple observations for each product in the sample. This added richness in the data enables us to explicitly model product heterogeneity.

Several other aspects of our data are worth mentioning. First, the functionality to post ratings on the site was introduced approximately halfway through our data period. The weekly sales levels observed before the introduction of the ratings functionality enable us to estimate a product's baseline sales level absent any ratings effects. The second aspect is the product category. Previous researchers have typically focused on movies and books, likely because of the availability of data in these categories. However, movies and books are unique in that they have product life cycles that are both short and follow predictable exponential patterns (Moe and Fader 2001; Sawhney and Eliashberg 1996). These products experience the greatest level of sales (and ratings activity) immediately after launch, quickly after which sales (and ratings activity) taper off dramatically. The danger of using such product categories is that results can be sensitive to when in the product life cycle the researcher collects the data. In this article, we use sales and ratings data for products in a mature product category with relatively stable sales. As a result, the sales changes observed can be attributed to changes in the ratings environment and are less likely caused by the natural progression of the product life cycle or evolving consumer base (e.g., Li and Hitt 2008).

In the next section, we review the existing literature pertaining to the effects of online user-generated ratings. Included in this review is a discussion of previous research that has revealed how social influences and dynamics can affect the posting of product ratings. After the literature review, we present a conceptual framework that relates product sales to online consumer ratings. We then describe the data used in this article. In particular, we highlight some of the characteristics of the product category featured in our data that make it well suited for the research questions we address. We then develop the model and propose a set of metrics based on the ratings component of the model, which enables us to decompose the ratings effect and measure the sales impact of social dynamics. Finally, we present the results and highlight some of the implications.
11.1 Literature Review
There is a growing body of research in both the marketing literature and the information technology literature that examines the effects of online word of mouth. The authors of these articles have considered various forms of online word of mouth, including user-provided ratings and reviews and newsgroup postings. Although some researchers have focused on measuring the effects of online word of mouth on performance measures (e.g., sales, growth, television viewership), others have studied the dynamics observed in these online word-of-mouth environments.

11.1.1 Effects of Online Word of Mouth

The majority of research in this area has identified three metrics of online word of mouth: valence, variance, and volume (Dellarocas and Narayan 2006). Valence is represented most frequently by an average rating measure (Chevalier and Mayzlin 2006; Clemons, Gao, and Hitt 2006; Dellarocas, Zhang, and Awad 2007; Duan, Gu, and Whinston 2008). It has also been represented by some measure of positivity (or negativity) of ratings (Chevalier and Mayzlin 2006; Godes and Mayzlin 2004; Liu 2006). The variance in ratings has also been measured in a variety of ways, ranging from a statistical variance (Clemons, Gao, and Hitt 2006) to entropy (Godes and Mayzlin 2004), and volume is represented most commonly by the number of postings.

Although several authors have focused on studying the effects of valence, variance, and volume of online word of mouth on product performance, they have found varying empirical results (see table 11.1). Two articles (Duan, Gu, and Whinston 2008; Liu 2006) examine the impact of posted user reviews on movie box office sales. Although the effects of valence and volume were modeled in both cases, only the volume of word of mouth was significant. Dellarocas, Zhang, and Awad (2007) also study the effects of online word of mouth on movie box office sales. However, rather than modeling weekly sales, they examine the patterns of sales growth and find that both valence and volume of online word of mouth have significant effects. Another study by Clemons, Gao, and Hitt (2006) focuses attention on the craft beer category, finding that the valence and variance (but not the volume) of ratings affect sales growth. In particular, the authors find that the valence of the top quartile of ratings has the greatest effect on predicting sales growth.
Table 11.1 Literature Review

Article | Product Category | Dependent Variable | Significant Word-of-Mouth Effects
Liu (2006) | Movies | Sales | Number of posts
Duan, Gu, and Whinston (2008) | Movies | Sales | Number of posts
Dellarocas, Zhang, and Awad (2007) | Movies | Sales diffusion parameters | Average rating, number of ratings
Clemons, Gao, and Hitt (2006) | Beer | Sales growth rate | Average rating, standard deviation of ratings
Godes and Mayzlin (2004) | Television shows | Television-viewership ratings | Entropy of posts, number of posts
Chevalier and Mayzlin (2006) | Books | Sales rank | Average rating, number of ratings
Current study | Bath, fragrance, and beauty products | Cross-product temporal variation in ratings and sales | Static and dynamic effects of ratings
An important challenge that must be addressed when studying the sales effect of ratings is the potentially endogenous relationship between sales and ratings. In other words, a "good" product is likely to experience higher sales and receive more positive ratings than a "bad" product. The consequence is that sales and ratings are correlated, but the relationship is not necessarily causal. Chevalier and Mayzlin (2006) control for product heterogeneity by comparing differences in sales ranks for a sample of books that sold on both Amazon.com and Barnesandnoble.com. Because each site operated independently, each had a different set of posted ratings. Their results indicate that, across the products in their sample, the valence and volume of posted ratings had significant effects on sales performance.

Overall, the existing research has identified a number of important ratings metrics that can influence sales. However, although there is ample research on the effects of ratings on sales, the understanding of the ratings behavior itself is relatively limited. Next, we discuss a few studies that have investigated ratings behavior and the social dynamics that have been observed.

11.1.2 Consumer Ratings Behavior

A few recent studies have found that posted online ratings exhibit systemic patterns over time; specifically, the valence of ratings tends to
trend downward (Godes and Silva 2009; Li and Hitt 2008). Li and Hitt (2008) posit that this trend is part of the product life-cycle process, and as the product evolves, so does the customer base. Specifically, they argue that customers who buy early in the product life cycle have significantly different tastes and preferences from those who buy later. At the same time, initial product ratings tend to be provided by the early customers but consumed by the later customers; this results in an increasing level of dissatisfaction over time because potential buyers are reading the ratings and reviews of existing customers who have dissimilar preferences. Godes and Silva (2009) suggest an alternative explanation. They show that the valence of ratings decreases with the ordinality of the rating rather than time. The downward trend is explained by the decreasing ability of future buyers to assess similarity with past reviewers as the total number of ratings grows, which then leads to more purchase errors and, consequently, lower future ratings.

In an experimental setting, Schlosser (2005) demonstrates the effect of social influences on consumer rating behavior. She finds that consumers who have decided to post their opinions tend to negatively adjust their product evaluations after reading negative reviews, which indicates that consumer posting behavior is affected by social context and the valence of previously posted reviews. She attributes this behavior to the notion that posters strive to differentiate their reviews, and negative reviews are more differentiated because negative evaluators are perceived as more intelligent (Amabile 1983). This same mechanism may also be driving the downward trend in ratings that both Li and Hitt (2008) and Godes and Silva (2009) document. Schlosser (2005) also discusses multiple-audience effects in the context of online posting behavior. Multiple-audience effects occur when people facing a heterogeneous audience adjust the message to offer a more balanced opinion (Fleming et al. 1990). This is yet another form of social influence and suggests that the effects of previously posted ratings on ratings behavior extend beyond the effect of valence and include the effect of variance.

In this article, we consider potential social influences on ratings behavior by modeling the effects of both the valence and variance of previously posted reviews. We also allow for ratings dynamics by modeling the effects of the volume of posted ratings on subsequent rating behavior. By explicitly modeling these covariate effects, we can more effectively separate the effects of social influences and dynamics from the baseline ratings behavior.
11.2 Framework
Before presenting our data and proposed model, we discuss our conceptual framework. Figure 11.1 illustrates the relationship between product sales and ratings over time and incorporates key constructs from the consumer purchasing process. In this research, we observe product sales and product ratings in the data, and we model social dynamics in the ratings environment as a latent construct. We include the remaining constructs in the figure to complete the discussion of the consumer process.

Figure 11.1 Conceptual Framework. Notes: The constructs highlighted in the solid gray boxes are observed in our data, whereas the social dynamics construct highlighted in the textured gray box is modeled as latent. The empirical model proposed in this article focuses on the relationships between these three constructs.

Before any purchasing experience, a consumer constructs a pre-purchase product evaluation based on a number of inputs. First, the consumer can independently evaluate product characteristics and
market conditions to arrive at a purchasing decision. Second, the consumer can also look for signals of product quality in his or her social environment. Information cascade researchers suggest that consumers can be affected by the observable choices of others (Bikhchandani, Hirshleifer, and Welch 1992). These potential buyers can also be affected by the post-purchase product evaluations of other consumers through offline word of mouth (e.g., Westbrook 1987). In this article, we focus on the effect of online product ratings on these potential buyers through online word of mouth.

Consumer pre-purchase product evaluations are related directly to product sales. After purchasing and experiencing the product, consumers update their post-purchase product evaluations with their own personal product experience. In theory, this post-purchase evaluation more heavily weighs the consumer's actual experience with the product and, as such, is more influenced by the consumer's independent assessment of the product and less influenced by the social factors that influenced the pre-purchase evaluations. In theory, these post-purchase evaluations should determine a consumer's posted product rating (if he or she chooses to post). However, other factors can also influence product ratings—namely, the social dynamics in the ratings environment.

We posit that posted product ratings reflect the combined effect of: (1) a consumer's socially unbiased post-purchase product evaluations; and (2) any social dynamics that may influence a rater's public evaluation of the product in the ratings environment. In this article, we conceptualize the effect of social dynamics as the impact of previously posted ratings on the posting of future ratings. We do not differentiate between the influence of social dynamics on a potential rater's decision of whether to post and the decision of what to post. Instead, we examine the net effect of these dynamics on future posted ratings. The objective of our modeling effort is to quantify these effects in terms of how they affect subsequent product sales. To that end, we develop a modeling framework that both captures the ratings process and measures the effect of ratings on product sales. In addition, we propose a set of metrics derived from our ratings model that enables us to decompose observed product ratings into components that represent: (1) consumers' baseline (or socially unbiased) ratings for the product; (2) the effect of social dynamics in the ratings environment; and (3) idiosyncratic error. We present our model after discussing our data set in the next section.
11.3 Data Description
We obtained our data from a national retailer of bath, fragrance, and beauty products; the data include a sample of 500 products rated and sold on the retailer's website. Products include hedonic items such as fragrances, room fresheners, scented candles, and bath salts, as well as more utilitarian items such as skin care products (e.g., anti-aging cream, moisturizer), sunscreen, and manicure/pedicure products. These items are moderately priced and appeal to the mass market (the highest-priced product in our sample is $25); they are not branded or designer products.

The retailer in this study creates and produces its own products and sells them exclusively at its stores, both online and offline. Products tend to come in a large variety of fragrances. As a result, consumers frequently try new fragrances and/or include multiple fragrances of the same product in their purchase. The retailer does not engage in product-specific marketing. Instead, its marketing efforts are focused on promoting the entire store rather than individual products in the assortment. The wide variety of products, the inclusion of purely hedonic products (which are difficult to evaluate only on the basis of product attributes) and utilitarian products, and the absence of product-specific marketing activity make this an ideal data set for studying online product ratings, and thus allow for the generalizability of our results across a number of different product types and contexts.

Our data span a one-year period from December 2006 to December 2007. Sales data are recorded weekly for each product. Although our data period includes two year-end shopping periods, the products we chose for our sample were nonseasonal items and did not display significant holiday sales patterns. The online-ratings functionality was introduced to the site in May 2007, halfway through the data period. The absence of a ratings tool on the site in the beginning of the study enables us to more accurately estimate a sales baseline for each product and thus separate the effects attributed to the rating system. During the post-ratings period, there was a sale event that affected some of our sample for two weeks. We control for this event in our analysis.

The ratings tool enabled customers to post both a star rating (on a five-star scale) and review text.2 Each posting (rating and review) is recorded individually in our data set with a date stamp. Visitors to the website are presented with the average and number of ratings posted for the product being viewed. The entire history of previously posted ratings is also available.
Across the 500 products in our sample, 3801 ratings were posted. At the time we collected the data, the retailer made no promotional efforts to solicit the posting of online ratings. The ratings posted on this site are typical of most ratings posted online in that they are predominantly positive, a pattern consistently identified in previous research (Chevalier and Mayzlin 2006; Dellarocas 2003; Resnick and Zeckhauser 2002). In our sample, 80% of all product ratings were five-star ratings. If we examine the valence of ratings at the product level, 91.8% of the products in our sample received at least one five-star rating. Products also received their fair share of negative and/or neutral ratings. Table 11.2 provides a histogram of the proportion of products that received one-, two-, three-, four-, and five-star ratings. This distribution of ratings results in a fair degree of ratings variance within products, with the average product experiencing a variance of 0.44 across ratings. Table 11.3 provides a brief description of the heterogeneity across products in our data set.

Table 11.2
Histogram of Ratings

Rating         Number of Products    Percentage of Products
Five stars     459                   91.8
Four stars     213                   42.6
Three stars    114                   22.8
Two stars       82                   16.4
One star        77                   15.4
Table 11.3
Description of Data

                               Mean     Minimum            Maximum
Ratings dates                  —        May 31, 2007       November 29, 2007
Sales weeks (week ending)      —        January 6, 2007    December 8, 2007
Across Products
Valence (average rating)       4.60     1                  5
Variance                       .44      0                  4
Volume (number of ratings)     6.19     1                  187
Total product sales            5449     51                 36,999
11.4 Modeling Overview
Our modeling objective is to separate the effects of social dynamics from those resulting from the consumers’ socially unbiased product evaluation. We refer to the latter effect as the baseline ratings behavior and model how social dynamics can cause ratings to deviate from this baseline. We then model the effects of these socially influenced deviations on product sales.

First, we develop a ratings model that is intended to decompose observed ratings into a baseline component and a social influence effect. A challenge we face in modeling the ratings process is that we must capture both what was posted and when it was posted, which enables us to differentiate between a product that received a five-star average rating but was sparsely rated and a more heavily rated product with the same average rating. To address this issue, we adopt a hazard modeling approach that treats the arrival of ratings of each star level as separate (though potentially correlated) timing processes. A well-liked product would tend to receive five-star ratings at a more frequent rate than one-star ratings; this would be discernible from the proposed hazard-modeling framework. We include time-varying covariates in each of the hazard processes to capture the effects of social influence and to control for potential dependencies across different star-level arrivals. The resulting baseline hazard rates would then describe the baseline ratings behavior and can be used to obtain a set of ratings metrics that represent the ratings that would have been received if not for the observed social dynamics in the ratings environment. Furthermore, this model enables us to obtain a set of expected ratings metrics that account for social dynamics by considering covariate effects.

Our ultimate goal here is to measure the sales impact of these dynamics. Therefore, we also present a sales model that captures the effect of ratings on product sales. Specifically, we decompose traditional ratings metrics into baseline ratings metrics, incremental effects resulting from social dynamics, and idiosyncratic error. A challenge in modeling the relationship between sales and ratings is the highly plausible endogenous relationship between the two. That is, a product may show higher sales not necessarily because of better ratings, but rather due to a higher appeal to consumers, which is also reflected in higher ratings. This relationship between ratings and sales needs to be carefully addressed before any attributions can be made.
We can address the potential endogeneity concern by leveraging the power of our unique data set, which includes weekly sales data both before and after the ratings functionality is available on the website. The observed sales data in the pre-ratings period enable us to more effectively estimate a baseline level of sales for each product and to confidently attribute any post-ratings sales changes directly to the posted product ratings themselves.

11.5 Ratings Model
Posted product ratings can be characterized by: (1) the frequency of arrival; and (2) the valence (or star level) of the ratings. To capture this process, we consider each star level separately and model the arrival of ratings at each level as five parallel timing processes. Specifically, we assume that each rating level is associated with its own exponential hazard process (a process consistent with the observed data) with covariates.3 Because we observe the posting of multiple ratings, we write the hazard function governing the time until the next rating of the same valence (τ_{vjk}) as follows:

h(\tau_{vjk}; \lambda_{vj}, \beta_{vj}) = \lambda_{vj} \exp\{\beta_{vj} X_{j\tau}\} \quad \forall v \in \{1, 2, 3, 4, 5\},   (1)
where: j = product index, k = index for rating occasion, τ = time index in the ratings period (days), and X_{jτ} = vector of covariates. We treat the posting of all ratings with valence (v) for a given product (j) as repeated events and model the time (τ) elapsing between the posting of rating k and rating k + 1 of the same valence. Each hazard function describes the frequency of posted ratings of that star level and decomposes it into a baseline hazard rate (λ_{vj}) and covariate effects (β_{vj}X_{jτ}). The baseline hazard rate represents the underlying ratings behavior absent any covariate effects. The β coefficients indicate how the specified covariates affect the rating frequency. Note that this process is comparable to a Poisson model (which assumes exponential arrivals), in which we count the number of one-, two-, three-, four-, or five-star ratings in a given time period. However, the exponential hazard model is more precise because it examines the
arrival time of each individual rating rather than aggregating ratings over a specified time period. Nonetheless, the interpretation of the parameters is the same. We include a number of covariates that capture social influences and dynamics in the ratings environment. Because our context is one in which product-specific marketing is nonexistent, we do not include marketing-mix covariates. However, if necessary, these covariates would be easy to incorporate in our modeling framework for other contexts. One potential covariate of interest is price. Price can affect a consumer’s assessment of the product’s perceived quality and value (Dodds, Monroe, and Grewal 1991) and thus can affect ratings. Specific to our data set, price does not change over time;4 thus, the effects of price can be identified cross-sectionally (across products), but not temporally (within product). We accommodate variations in price through the baseline effects. For our analysis, we specify the following covariates:

LAGVALENCE_{jτ} = average of all ratings for j posted before τ (mean centered),
LAGVARIANCE_{jτ} = variance across all ratings for j posted before τ (mean centered),
LAGVOLUME_{jτ} = total number of ratings for j posted before τ (mean centered), and
SALE_{τ} = indicates a sale event at time τ.

The first three covariates (LAGVALENCE, LAGVARIANCE, and LAGVOLUME) characterize previously posted ratings and are metrics that are mirrored later in the sales model.5 These covariates are updated with the arrival of each new rating (of any valence). The LAGVALENCE covariate enables us to observe how the positivity or negativity of past reviewers can influence future postings. On the basis of previous research, we expect that LAGVALENCE will have a negative effect on the subsequent arrival of negative ratings. The LAGVARIANCE covariate can capture potential multiple-audience effects. High LAGVARIANCE would indicate disagreement among reviewers and the presence of multiple audiences, possibly leading to more balanced and neutral future ratings (Fleming et al. 1990). We model dynamics with the LAGVOLUME covariate, which is similar to the way Godes and Silva (2009) model dynamics and the effect of ordinality. This covariate enables us to capture the trend in ratings as more ratings are posted. In addition, because of the parallel hazard functions,
this covariate would enable us to identify the source of the downward trend. That is, is the downward trend in ratings driven by the posting of more negative ratings (β_{negative,LAGVOLUME} > 0), fewer positive ratings (β_{positive,LAGVOLUME} < 0), or both? Note that these covariates are time varying and continue to be updated as ratings of any valence get posted. Therefore, to accommodate time-varying covariates (measured in discrete time), the pdf and survival function resulting from the hazard function specified in equation (1) are as follows:

f(\tau_{vjk}) = \lambda_{vj} e^{\beta_{vj} X_{j\tau}} \exp\left\{ -\lambda_{vj} \sum_{u = 1+\tau_{vj(k-1)}}^{\tau_{vjk}} e^{\beta_{vj} X_{ju}} \right\}, \quad \text{and}   (2)

S(\tau_{vjk}) = \exp\left\{ -\lambda_{vj} \sum_{u = 1+\tau_{vj(k-1)}}^{\tau_{vjk}} e^{\beta_{vj} X_{ju}} \right\}.   (3)
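To make the timing process concrete, the following Python sketch (our illustration, not the authors’ code) evaluates the discrete-time survival and density in equations (2) and (3) for one valence level of one product; the covariate path, the baseline hazard, and the coefficient values are invented for illustration only.

```python
# Illustrative sketch of equations (2)-(3): discrete-time exponential hazard
# with time-varying covariates, for a single product and a single star level.
import numpy as np

def survival(lam_v, beta_v, X, t_prev, t_now):
    """S(tau): probability that no rating of this valence arrives on days
    t_prev+1, ..., t_now, given baseline hazard lam_v and daily covariates X."""
    days = np.arange(t_prev + 1, t_now + 1)       # discrete days since last arrival
    hazard = lam_v * np.exp(X[days] @ beta_v)     # lambda_vj * exp(beta_vj' X_ju)
    return np.exp(-hazard.sum())

def density(lam_v, beta_v, X, t_prev, t_now):
    """f(tau) from equation (2): hazard on day t_now times survival up to t_now."""
    rate_now = lam_v * np.exp(X[t_now] @ beta_v)
    return rate_now * survival(lam_v, beta_v, X, t_prev, t_now)

# Toy example: 30 days of mean-centered [LAGVALENCE, LAGVARIANCE, LAGVOLUME] values.
rng = np.random.default_rng(0)
X = rng.normal(scale=0.5, size=(31, 3))
print(density(lam_v=np.exp(-3.6), beta_v=np.array([-0.04, -0.19, 0.05]),
              X=X, t_prev=0, t_now=10))
```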
To account for product heterogeneity (and the likely correlation between ratings arrivals of different star levels), we assume that baseline hazard rates, λ_{vj}, are drawn from a multivariate normal distribution as follows: ln λ_j ~ MVN(μ_λ, Σ).6 We also allow for heterogeneous covariate effects such that β_{vj} ~ normal(μ_{βv}, σ²_{βv}). Incorporating repeated ratings observations for each product and the discrete time nature of our data, we write the resulting likelihood function for product j conditional on λ_j as follows:

L_j = \prod_{v=1}^{5} \left[ S_v\!\left( T - \sum_{k=1}^{K_{vj}} \tau_{vjk} \right) \prod_{k=1}^{K_{vj}} p_v(\tau_{vjk}) \right]^{\delta_{vj}} S_v(T)^{1-\delta_{vj}},   (4)
where T is the length of our observed ratings data, K_{vj} is the total number of ratings posted for product j of valence v, and δ_{vj} = 1 if K_{vj} > 0 (δ_{vj} = 0 otherwise). Finally, p_v(τ_{vjk}) is the probability of a ratings arrival of star-level v in the discrete time period τ_{vjk} and is defined as the difference between the survival rates at τ_{vj(k−1)} and τ_{vjk}:

p_v(\tau_{vjk}) = S_v[\tau_{vj(k-1)}] - S_v(\tau_{vjk}).   (5)
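A minimal Python sketch of how equations (4) and (5) combine is given below. This is not the estimation code (the authors estimate the model in WinBUGS, as described in section 11.8); it simply evaluates the product-level likelihood for given parameter values, indexing time by calendar day for simplicity, with arrival days and covariates supplied by the caller.

```python
# Hypothetical sketch of the per-product likelihood in equations (4)-(5),
# treating the five valence levels as conditionally independent hazard processes.
import numpy as np

def interval_survival(lam_v, beta_v, X, t_start, t_end):
    """exp{-lam_v * sum_{u=t_start+1}^{t_end} exp(beta_v' X_u)} in discrete time."""
    days = np.arange(t_start + 1, t_end + 1)
    return np.exp(-lam_v * np.exp(X[days] @ beta_v).sum())

def product_likelihood(lams, betas, X, arrival_days, T):
    """arrival_days[v]: sorted days on which ratings of valence v+1 arrived (may be empty)."""
    L = 1.0
    for v in range(5):
        t_prev = 0
        for t in arrival_days[v]:
            # p_v(tau) = S_v(tau - 1) - S_v(tau), equation (5)
            p = (interval_survival(lams[v], betas[v], X, t_prev, t - 1)
                 - interval_survival(lams[v], betas[v], X, t_prev, t))
            L *= p
            t_prev = t
        # survival of the final, right-censored spell (or S_v(T) if no ratings arrived)
        L *= interval_survival(lams[v], betas[v], X, t_prev, T)
    return L
```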
11.6 Decomposing Ratings Metrics
Our objective in modeling the arrival of posted ratings is to decompose observed ratings metrics and separate social dynamic effects from the
product-specific baseline rating. To this end, we develop a set of baseline metrics representing the ratings that would have arrived in the absence of social dynamics. We then predict the expected ratings with social dynamic effects. Thus, the difference between these expectations and the baseline represents the effect of social dynamics on ratings arrivals. Any remaining difference between observed ratings and expected ratings is attributed to idiosyncratic error.

In the model we presented, the baseline hazard rates represent the consumer population’s underlying propensity to post a one-, two-, three-, four-, or five-star rating for a given product. Because we mean-centered the covariates used in our model, the baseline hazards are given by λ_{vj}exp{β_{vj}X_{j1}}, which represents the initial hazard rate at τ = 1 before any ratings arrive (with non-mean-centered data, this simplifies to λ_{vj}). These baseline hazards enable us to predict, for each product j, the ratings that would have been posted in the absence of social influences and dynamics and to calculate the associated ratings metrics (i.e., average rating, ratings variance, and number of ratings).

In an exponential hazard model, the hazard rate represents the expected number of arrivals in a single time period. As such, we can calculate the proportion of one-, two-, three-, four-, or five-star ratings, absent covariate effects, as a ratio of these baseline hazard rates:

q_{vj} = \frac{\lambda_{vj} \exp\{\beta_{vj} X_{j1}\}}{\sum_v \lambda_{vj} \exp\{\beta_{vj} X_{j1}\}}.   (6)
The expected average rating, R̂_j, and the expected ratings variance, V̂_j, for product j would then follow:

\hat{R}_j = \sum_v (q_{vj} \times v), \quad \text{and}   (7)

\hat{V}_j = \sum_v [q_{vj} \times (v - \hat{R}_j)^2].   (8)
The number of ratings (regardless of valence) expected in the absence of social dynamics would simply be the total baseline hazard rate multiplied by time, τ. Note that unlike the metrics for average rating and ratings variance, we expect the baseline number of ratings to change from period to period as the length of the observation period increases:

\hat{N}_{j\tau} = \tau \times \sum_v \lambda_{vj} \exp\{\beta_{vj} X_{j1}\}.   (9)
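The baseline metrics in equations (6)–(9) are simple functions of the baseline hazards. The sketch below (ours, not the authors’ code) computes them for one product; the numerical values are only illustrative, with the log baseline hazards loosely echoing the posterior means reported later in table 11.5 and all covariates set to zero.

```python
# Sketch of the baseline ratings metrics in equations (6)-(9).
import numpy as np

def baseline_metrics(lam, beta, X1, tau):
    """lam: (5,) baseline hazards; beta: (5, p) coefficients; X1: covariates at tau = 1."""
    rates = lam * np.exp(beta @ X1)             # baseline hazard for v = 1..5
    q = rates / rates.sum()                     # share of each star level, equation (6)
    stars = np.arange(1, 6)
    R_hat = (q * stars).sum()                   # expected average rating, equation (7)
    V_hat = (q * (stars - R_hat) ** 2).sum()    # expected ratings variance, equation (8)
    N_hat = tau * rates.sum()                   # expected number of ratings by tau, equation (9)
    return R_hat, V_hat, N_hat

lam = np.exp(np.array([-8.5, -8.2, -7.8, -5.9, -3.6]))   # illustrative values
beta = np.zeros((5, 3))
print(baseline_metrics(lam, beta, X1=np.zeros(3), tau=30))
```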
In addition to the baseline ratings metrics described in equations 7–9, we can also calculate the metrics associated with a ratings process that is subject to social dynamics. Specifically, we use the time-varying hazard rates to predict the total number of one-, two-, three-, four-, and five-star ratings posted, m_{vjτ}, as follows:

m_{vj\tau} = \sum_{u=1}^{\tau} \lambda_{vj} \exp\{\beta_{vj} X_{ju}\},   (10)
where the vector of covariates includes only LAGVALENCE, LAGVARIANCE, and LAGVOLUME. From this, we can easily compute the average rating (Rjτ), ratings variance (Vjτ), and number of ratings (Njτ) for product j at time τ as follows:
R_{j\tau} = \frac{\sum_v (m_{vj\tau} \times v)}{\sum_v m_{vj\tau}},   (11)

V_{j\tau} = \frac{\sum_v [m_{vj\tau} \times (v - R_{j\tau})^2]}{\sum_v m_{vj\tau}}, \quad \text{and}   (12)

N_{j\tau} = \sum_v m_{vj\tau}.   (13)
The difference between these metrics that account for social dynamics and the baseline metrics specified in equations (7)–(9) would describe the impact of social dynamics on product ratings, whereas the difference between observed ratings metrics and the predictions from equations (11)–(13) would be attributed to idiosyncratic error. We consider the impact of each ratings component on product sales in the following section.
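Before turning to the sales model, here is a minimal Python sketch (ours, with a hypothetical covariate path) of the socially adjusted metrics in equations (10)–(13) and of the resulting three-way decomposition of an observed metric.

```python
# Sketch of equations (10)-(13) and of the decomposition into baseline,
# social-dynamics, and idiosyncratic-error components.
import numpy as np

def adjusted_metrics(lam, beta, X_path):
    """X_path: (tau, p) array of daily LAGVALENCE, LAGVARIANCE, LAGVOLUME values."""
    m = (lam[None, :] * np.exp(X_path @ beta.T)).sum(axis=0)   # m_vj_tau, equation (10)
    stars = np.arange(1, 6)
    R = (m * stars).sum() / m.sum()                            # equation (11)
    V = (m * (stars - R) ** 2).sum() / m.sum()                 # equation (12)
    N = m.sum()                                                # equation (13)
    return R, V, N

# Decomposition of, e.g., the observed average rating at time tau:
#   observed = R_hat (baseline) + (R_tau - R_hat) (social dynamics) + (observed - R_tau) (error)
```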
11.7 Sales Model
Existing research has typically modeled the relationship between sales and ratings by regressing product sales against measures of ratings valence, variance, and volume. To be consistent with the existing research, we also focus on the valence, variance, and volume of posted product ratings. However, unlike existing research, we decompose these metrics into a baseline component, a social dynamics component, and an error component as described previously.
In addition, we divide our data into a pre-ratings period and a post-ratings period. In the pre-ratings period (t < t*), we estimate only baseline sales for a given product, c_j, with no covariate effects. In the post-ratings period (t > t*), we include a set of covariates (Z) that capture the effect of ratings on sales:

\ln(S_{jt}) = \begin{cases} c_j + \varepsilon_{jt} & \text{for } t < t^* \\ c_j + b_j Z_{jt} + \varepsilon_{jt} & \text{for } t > t^*, \end{cases}   (14)
where t indexes time (in weeks) for the entire data period and ε_{jt} ~ normal(0, σ²_ε). The first covariate we include is a post-ratings indicator variable to control for the effect of introducing the ratings functionality to the site (POSTRATING_t). We also include an indicator variable to control for the effect of a sale event that occurred in the post-ratings period (SALE_t). To capture the effects of the posted ratings themselves, we separately consider the baseline, social dynamics, and error components of each of our three ratings metrics.

The baseline ratings metrics include the mean-centered average rating, R̂_j; the mean-centered ratings variance, V̂_j; and the number of ratings, N̂_jt. Because the baseline average rating and ratings variance are product specific and do not vary over time, we include the mean-centered values in our sales model to better reflect the impact of a product’s baseline rating relative to the average product. To measure the effects of social dynamics on sales, we include the difference between the predicted rating metrics after adjusting for social dynamics (computed at the beginning of week t) and the baseline metrics for average rating (R_jt − R̂_j), ratings variance (V_jt − V̂_j), and the number of ratings (N_jt − N̂_jt). We also include the observed deviations from predicted average rating (AVGRATING_jt − R_jt), ratings variance (VARIANCE_jt − V_jt), and number of ratings (NUMRATINGS_jt − N_jt). Finally, we accommodate unobserved product heterogeneity in the sales constant [c_j ~ normal(μ_c, σ²_c)].7 For completeness, we also model covariate effects to be heterogeneous across products according to the following: b_kj ~ normal(μ_k, σ²_k).
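The covariate construction implied by equation (14) can be sketched as follows; this is a hypothetical helper (not the authors’ code), whose metric inputs are assumed to come from the ratings model of the previous sections.

```python
# Sketch of the post-ratings design vector Z_jt and the log-sales equation (14).
import numpy as np

def sales_covariates(post_rating, sale, R_hat, V_hat, N_hat,
                     R, V, N, avg_obs, var_obs, num_obs):
    """One product-week: baseline, social-dynamics, and error components of each metric."""
    return np.array([
        post_rating,       # POSTRATING_t (b0)
        R_hat,             # mean-centered baseline average rating (b1)
        R - R_hat,         # social-dynamics deviation in valence (b2)
        avg_obs - R,       # idiosyncratic error in valence (b3)
        V_hat,             # baseline ratings variance (b4)
        V - V_hat,         # social-dynamics deviation in variance (b5)
        var_obs - V,       # error in variance (b6)
        N_hat,             # baseline volume (b7)
        N - N_hat,         # social-dynamics deviation in volume (b8)
        num_obs - N,       # error in volume (b9)
        sale,              # SALE_t (b10)
    ])

def log_sales(c_j, b_j, Z_jt, post_period):
    """ln(S_jt) = c_j (+ b_j' Z_jt in the post-ratings period), equation (14), without error."""
    return c_j + (b_j @ Z_jt if post_period else 0.0)
```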
11.8 Model Estimation and Results
We estimate our model using WinBUGS, specifying appropriate and diffuse priors. We ran at least 50,000 iterations, discarding the first
25,000 for burn-in. We used multiple starting values to test the sensitivity of the parameter estimates to starting values and to monitor convergence. The results indicate that starting values had no substantial impact on the parameter estimates. In addition, we computed Gelman–Rubin statistics for each parameter to monitor convergence.

We considered two approaches to model estimation: simultaneous and two-stage estimation. The benefit of the simultaneous approach is that the uncertainty in parameter estimates in the ratings model (as reflected in posterior distributions) is naturally incorporated into the sales model. The key downside is a significant computational burden. Therefore, we performed simultaneous and two-stage model estimations for a random subsample of fifty products and then compared posterior distributions of the key parameters of the sales model. Because we found no substantial difference between the two approaches, we adopted the two-stage estimation approach.

Table 11.4 provides fit statistics for both the ratings and the sales models. Using our model estimates, we compute the pseudo-R-square value (i.e., squared Pearson correlation between observed and predicted values) and the mean absolute percentage error for average rating, ratings variance, number of ratings, and product sales. Overall, both the ratings and the sales models fit our data quite well.

Table 11.4
Fit Statistics

                            Pseudo-R²    Mean Absolute Percentage Error
Average rating (R)          .644         .0996
Ratings variance (V)        .757         .0771
Number of ratings (N)       .964         .189
Sales (S)                   .998         .0583
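Both fit measures in table 11.4 are straightforward to compute; the short sketch below (ours, with made-up numbers) shows the two definitions used here.

```python
# Sketch of the fit statistics in table 11.4: pseudo-R-square (squared Pearson
# correlation between observed and predicted values) and mean absolute percentage error.
import numpy as np

def pseudo_r2(observed, predicted):
    return np.corrcoef(observed, predicted)[0, 1] ** 2

def mape(observed, predicted):
    return np.mean(np.abs((observed - predicted) / observed))

obs = np.array([4.8, 4.2, 3.9, 4.6])     # made-up observed average ratings
pred = np.array([4.7, 4.3, 4.0, 4.5])    # made-up model predictions
print(pseudo_r2(obs, pred), mape(obs, pred))
```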
Table 11.5
Parameter Estimates for Proposed Ratings Model

                                   M (μ)              Variance⁻¹ (1/σ²)
Baseline Rating Behavior
  lnλ1                             −8.487 (.331)
  lnλ2                             −8.208 (.400)
  lnλ3                             −7.840 (.305)
  lnλ4                             −5.899 (.183)
  lnλ5                             −3.601 (.116)
Effect of LAGVALENCE (βval)a on
  One-star ratings                 .451 (.134)b       3.659 (1.508)
  Two-star ratings                 .382 (.134)b       3.588 (1.596)
  Three-star ratings               .568 (.143)b       3.009 (1.328)
  Four-star ratings                .0845 (.0582)      5.856 (1.867)
  Five-star ratings                −.0425 (.0198)b    18.760 (2.624)
Effect of LAGVARIANCE (βvar)a on
  One-star ratings                 −.302 (.185)c      2.543 (1.196)
  Two-star ratings                 −.316 (.225)c      2.727 (1.351)
  Three-star ratings               .00934 (.175)      2.927 (1.546)
  Four-star ratings                .000713 (.108)     3.826 (1.516)
  Five-star ratings                −.188 (.0671)b     5.565 (1.689)
Effect of LAGVOLUME (βvol)a on
  One-star ratings                 .0853 (.0300)b     24.600 (4.1936)
  Two-star ratings                 .112 (.0296)b      25.62 (4.13)
  Three-star ratings               .0531 (.0254)b     29.59 (4.275)
  Four-star ratings                .0720 (.0187)b     44.28 (5.246)
  Five-star ratings                .0478 (.0112)b     84.150 (7.226)
Effect of SALEt (βsale) on
  One-star ratings                 −1.598 (.736)b     1.02 (.822)
  Two-star ratings                 −.462 (.472)       1.566 (1.315)
  Three-star ratings               −1.262 (.775)b     .672 (.973)
  Four-star ratings                −.294 (.351)       1.032 (1.245)
  Five-star ratings                .944 (.109)b       2.433 (.833)

a. Measures are mean centered.
b. Zero is not contained in the 95% confidence interval.
c. Zero is not contained in the 90% confidence interval.
Note: Values in parentheses represent standard errors of the estimates.
11.8.1 Ratings Model Results

Table 11.5 provides the parameter estimates resulting from our proposed ratings model.8 The results show that ratings dynamics can substantially affect the arrival of future ratings through valence, variance, and volume effects. The LAGVALENCE coefficients indicate that increases in average ratings tend to encourage the subsequent posting of negative ratings (βval,1 = 0.451, βval,2 = 0.382, and βval,3 = 0.568) and discourage the posting of extremely positive, or five-star, ratings (βval,5 = –0.0425). This result is consistent with the extant literature that provides evidence of differentiation behavior in the ratings environment (Amabile 1983; Schlosser 2005). In addition, LAGVARIANCE has significant, negative effects on extremely negative (βvar,1 = –0.302, βvar,2 = –0.316) and extremely positive (βvar,5 = –0.188) ratings; the effects on moderate three- and four-star ratings are insignificant. In other words, disagreement among raters tends to discourage the posting of extreme opinions by subsequent raters. This is consistent with the multiple-audience effects that indicate that consumers facing a highly varied audience are less likely to offer extreme opinions to avoid alienating any one segment of the audience (Fleming et al. 1990; Schlosser 2005).9

With respect to the LAGVOLUME effects, we observe that as the number of posted ratings increases, ratings of all star levels become more frequent. However, the magnitude of the volume effect on negative (one- and two-star) ratings is noticeably larger than that on more positive ratings, which suggests that although positive ratings may become more likely as ratings volume increases, they are overshadowed by the increased arrival of negative ratings. The net effect is a negative trend in posted product ratings. This result is consistent with the trends documented by Godes and Silva (2009), who show that the average rating decreases as the number of ratings increases. Our results add to their empirical findings and show that the decreasing trend in average ratings is driven by an increase in negative ratings (rather than a decrease in positive ratings, which would generate the same negative trend).

From the ratings-model results we have presented, we compute a set of metrics that decompose ratings into a baseline component, social dynamic effects, and idiosyncratic error for each product over time. Overall, the results from the ratings component of the model suggest that there are significant dynamics in ratings behavior. For each product, we compute the average difference between the socially adjusted and socially unbiased ratings valence (R_jt − R̂_j), variance (V_jt − V̂_j), and volume (N_jt − N̂_jt) and present the median, the 25th percentile, and the 75th percentile values (see table 11.6). For the median product, observed social dynamics result in a lower average rating (–0.0745), a greater variance among ratings (0.133), and a lower volume of ratings (–0.415). These results suggest that observed social dynamics seem to decrease ratings valence and volume while increasing ratings variance in our data.
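The quantities in table 11.6 are simply cross-product percentiles of these per-product differences; a toy illustration (with invented values) of the computation:

```python
# Sketch of the summary in table 11.6: percentiles of per-product differences
# between socially adjusted and baseline metrics (toy values, not the real data).
import numpy as np

valence_diff = np.array([-0.21, -0.07, -0.03, -0.12, 0.01])   # R_jt - R_hat_j per product
for p in (25, 50, 75):
    print(p, np.percentile(valence_diff, p))
```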
Table 11.6
Effects of Social Dynamics on Posted Product Ratings

                          25th Percentile    Mdn       75th Percentile
Valence (R_jt − R̂_j)      −.143              −.0745    −.0287
Variance (V_jt − V̂_j)     .0725              .133      .246
Volume (N_jt − N̂_jt)      −3.977             −.415     .0163
Although it is clear from our results that social dynamics can significantly influence subsequent rating behavior, the more managerially relevant question is: How do these dynamics affect sales? Therefore, we turn next to the results of the sales model and quantify the value of these dynamics in terms of product sales.

11.8.2 Sales Model Results

Table 11.7 presents the results of the proposed sales model, which decomposes ratings metrics into baseline, social dynamic, and error components. In Table 11.7, we also present the results for two benchmark models that measure the effects of observed average rating (AVGRATING_jt), ratings variance (VARIANCE_jt), and number of ratings (NUMRATINGS_jt) without any decomposition. In the first benchmark model (Benchmark 1), the product-specific sales constant is heterogeneous, but the covariate effects are not. In the second benchmark model (Benchmark 2), both the sales constants and the covariate effects are allowed to vary across products.

The results show that Benchmark 2 (deviance information criterion [DIC] = 42,097.4) provides a significantly better fit than Benchmark 1 (DIC = 43,063.3), which suggests that there is heterogeneity in ratings effects across products. When we decompose the ratings covariates in the proposed model, model fit further improves (DIC = 42,048.6), which indicates that the effects of product ratings on sales are more complex than what has been represented by the simple summary metrics in the benchmark models.

The results of the proposed model, similar to those of the benchmark models, show significant effects of product ratings on sales. However, the baseline, social dynamic, and error components of the ratings metrics each have differing effects. With respect to the valence of product ratings, we find that the baseline product rating has a positive impact on product sales (b1 = 0.577).
Table 11.7
Parameter Estimates for Sales Models

                                   Proposed Model                       Benchmark 2                          Benchmark 1
                                   Mean (μ)          Variance⁻¹ (1/σ²)  Mean (μ)          Variance⁻¹ (1/σ²)  Parameter Estimate
Mean sales constant (cj)           3.255 (.0687)     .439 (.0287)       3.253 (.0674)     .437 (.0284)       3.253 (.0688)
POSTRATINGt (b0)                   .0273 (.0673)     7.364 (1.642)      .297 (.0389)a     7.712 (1.640)      .143 (.0176)a
Sales Effect of Valence Metrics
 Observed
  AVGRATINGjt                      —                                    .317 (.102)a      .881 (.218)        .179 (.0349)a
 Decomposed
  R̂j (b1)b                         .577 (.206)a      4.201 (1.69)       —                 —                  —
  Rjt − R̂j (b2)                    1.198 (.359)a     1.925 (1.263)      —                 —                  —
  AVGRATINGjt − Rjt (b3)           .0420 (.0167)a    33.42 (3.901)      —                 —                  —
Sales Effect of Variance Metrics
 Observed
  VARIANCEjt                       —                                    .0422 (.0664)     5.221 (1.631)      .0842 (.0270)a
 Decomposed
  V̂j (b4)b                         .464 (.222)a      4.087 (1.707)      —                 —                  —
  Vjt − V̂j (b5)                    .2424 (.254)      3.425 (1.466)      —                 —                  —
  VARIANCEjt − Vjt (b6)            −.0437 (.0688)    4.181 (1.208)      —                 —                  —
Sales Effect of Volume Metrics
 Observed
  NUMRATINGSjt                     —                                    .0502 (.00846)a   67.280 (5.901)     .00665 (.00196)a
 Decomposed
  N̂jt (b7)                         .0604 (.0160)a    57.88 (6.251)      —                 —                  —
  Njt − N̂jt (b8)                   .0307 (.0237)     34.93 (4.697)      —                 —                  —
  NUMRATINGSjt − Njt (b9)          −.0312 (.0277)    13.78 (2.461)      —                 —                  —
SALEt (b10)                        .443 (.0567)a     4.506 (1.192)      .441 (.0512)a     4.592 (1.245)      .366 (.0405)a
DIC                                42,048.6                             42,097.4                             43,063.3

a. Zero is not contained in the 95% confidence interval.
b. Measures are mean-centered across the sample of 500 products.
Notes: Observed AVGRATING, VARIANCE, and NUMRATINGS covariates are mean-centered in the benchmark model. Values in parentheses represent standard errors of the estimates.
In addition, deviations from the expected baseline ratings caused by social dynamics (R_jt − R̂_j) have positive effects on product sales (b2 = 1.198). This result suggests that positively (or negatively) valenced dynamics in the ratings environment can have direct effects on product sales.

The baseline variance also has a positive effect on sales (b4 = 0.464), which suggests that products appealing to a broad base of customers with a high variety of opinions experience higher sales. However, our model results show that deviations from this baseline caused by social dynamics (V_jt − V̂_j) do not systematically affect product sales. Changes in variance caused by social dynamics may indicate that the dynamics are such that consensus opinions are encouraged (if V_jt − V̂_j < 0) or that more varied opinions are encouraged (if V_jt − V̂_j > 0). Although deviations in ratings variance do not have a significant direct effect on product sales, variance-related dynamics may affect future ratings, which in turn may affect long-term sales. We explore these potential indirect effects in the next section.

Our model results also indicate significant volume effects. It is not surprising that the baseline volume effect (N̂_jt) is significant and positive (b7 = 0.0604). When interpreting these results, it is important to note that unlike the measures for baseline valence and variance, the baseline volume metric does change over time and reflects the total expected number of posted ratings in each week (in the absence of social dynamics). Therefore, this result indicates the increase in sales resulting from the accumulation of posted product ratings over time. Because we also include a POSTRATING effect, this effect is above and beyond the sales increase resulting from the introduction of the ratings tool itself.
Overall, social dynamics have a direct effect on product sales through their effects on ratings valence. Although we have shown that social dynamics can also affect the variance and volume of ratings posted (see table 11.5), our sales model results show that these influences do not have a direct (or immediate) effect on sales. However, because these dynamics do influence subsequent rating behavior, they may have an indirect (or longer-term) effect on sales. We examine these indirect sales effects in the next section.
11.9 The Impact of Social Dynamics on Future Ratings and Sales
In agreement with previous studies (e.g., Chevalier and Mayzlin 2006; Clemons, Gao, and Hitt 2006; Dellarocas, Zhang, and Awad 2007; Godes and Silva 2009; Li and Hitt 2008), our model results provide evidence that consumer-generated product ratings have a direct effect on product sales and are subject to social dynamics. The proposed modeling approach paired with a unique multiproduct longitudinal data set enables us to take the next step in this stream of research by answering the questions of how social dynamics influence sales and how these effects can be quantified. The latter is of high interest to business practitioners who are eager to understand the effect of receiving a negative rating on product sales, the value of stimulating positive posts to rebut negative ones, and the long-term implications of these fluctuations in the rating environment. The comprehensive decomposition approach we developed in this article sheds light on these questions.

We illustrate the overall effects (both direct and indirect) of ratings dynamics on sales by simulating a number of scenarios for which the initial ratings for a given product are varied. Because rating behavior is sensitive to previously posted ratings, each scenario would generate a different dynamic in the ratings environment. Comparisons across these scenarios will enable us to examine the impact of these different dynamics on future ratings and sales. For this purpose, we select one product from our data set that, from the firm’s perspective, can be considered to be representative,10 and we simulate ratings and sales by using its product-specific posterior estimates.
Table 11.8
Simulated Ratings Metrics

                     Scenario 1              Scenario 2                Scenario 3
                     (Consensus Five-Star)   (Consensus Three-Star)    (Varied Three-Star)
Average Ratinga
  One month          5                       3                         3
  Two months         4.64                    3.97                      4.17
  Three months       4.62                    4.56                      4.66
Ratings Variancea
  One month          0                       0                         2.25
  Two months         .61                     .57                       2.10
  Three months       .67                     .63                       1.70
Ratings Volumea
  One month          9                       9                         9
  Two months         13.65                   11.67                     13.48
  Three months       18.00                   16.83                     18.35

a. Ratings metrics refer to the average rating, ratings variance, and number of ratings at the start of the month.
Specifically, we simulate three scenarios. In all three scenarios, we hold constant the ratings volume in the initial month of the observation period. For the selected product, nine ratings of varying valence arrived in the first month. In Scenario 1, we manipulated the valence and variance of these nine ratings such that all were five-star ratings (“consensus five-star”); in Scenario 2, we consider a consensus three-star rating (“consensus three-star”); and in Scenario 3, we consider a ratings environment that averaged three stars but exhibited significant variation around this average rating (“varied three-star”).

Table 11.8 presents the simulated ratings metrics for the three months after the initial period in which ratings were manipulated. Across scenarios, ratings dynamics lead to virtually the same average rating after just a few months regardless of the valence or variance of initial ratings. In other words, in the absence of new external shocks to the system, average ratings converge to the product’s underlying baseline value over time. However, the nature of this convergence is a function of the initial ratings posted.

Figure 11.2 plots the dynamics in ratings valence at the weekly level. Our model results indicate that higher average ratings increase the likelihood that one-, two-, and three-star ratings are subsequently posted. The impact of this dynamic is highlighted when comparing Scenario 1 (which has an average initial rating of five stars) to Scenarios 2 and 3 (which both have an initial average rating of three stars). Comparing Scenarios 2 and 3 highlights the dynamics resulting from ratings variance. Note that, in the varied three-star case, average ratings “recover” more quickly than in the consensus three-star scenario.
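The logic of the simulation can be sketched as follows. This is a stylized Python illustration (not the authors’ simulation code): it seeds the first month with manipulated ratings, then draws daily arrivals from the hazards and updates the lagged-ratings covariates after each draw. The hazard and coefficient values loosely echo table 11.5, and the mean-centering at the seed values is an assumption made purely for illustration.

```python
# Stylized sketch of the scenario simulation: seed ratings, then forward-simulate
# daily arrivals of each star level and track valence, variance, and volume.
import numpy as np

def simulate(lam, beta, seed_ratings, horizon_days, rng):
    """lam: (5,) baseline hazards; beta: (5, 3) effects of [valence, variance, volume]."""
    ratings = list(seed_ratings)
    center = np.array([np.mean(ratings), np.var(ratings), len(ratings)])  # illustrative centering
    for _ in range(horizon_days):
        x = np.array([np.mean(ratings), np.var(ratings), len(ratings)]) - center
        probs = 1.0 - np.exp(-lam * np.exp(beta @ x))      # daily arrival prob. per star level
        for v in range(5):
            if rng.random() < probs[v]:
                ratings.append(v + 1)
    return np.mean(ratings), np.var(ratings), len(ratings)

rng = np.random.default_rng(1)
lam = np.exp(np.array([-8.5, -8.2, -7.8, -5.9, -3.6]))
beta = np.array([[0.45, -0.30, 0.09], [0.38, -0.32, 0.11], [0.57, 0.01, 0.05],
                 [0.08, 0.00, 0.07], [-0.04, -0.19, 0.05]])
print(simulate(lam, beta, seed_ratings=[5] * 9, horizon_days=90, rng=rng))  # "consensus five-star"
```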
Figure 11.2
Average Rating Convergence. [Figure: weekly average rating (2.5–5.0) over weeks 1–11 for the consensus five-star, consensus three-star, and varied three-star scenarios.]
Likewise, the resulting variance in ratings also reveals a converging trend. Ratings variance in Scenarios 1 and 2 quickly increases from 0 to approximately 0.65 and remains at that level. However, the initially high variance in Scenario 3 steadily decreases throughout the simulation period. Although ratings valence and variance both seem to be gravitating toward a baseline value, ratings volume exhibits more interesting differences across scenarios. Note that ratings volume is not manipulated in the initial setup of the simulation; thus, any differences across scenarios in terms of ratings volume are strictly a result of ratings dynamics. Note also that the higher ratings variance in Scenario 3 seems to generate a slightly higher volume of ratings in the long run than the lower-variance scenarios.

These ratings dynamics are noteworthy in and of themselves. However, of particular managerial interest is the impact on product sales. Figure 11.3 plots these dynamics at the weekly level. It is not surprising that the initial positive ratings in Scenario 1 result in higher product sales than those in Scenario 2 (which has a lower initial average rating but the same ratings variance). As ratings valence begins to regress back to the product’s baseline levels, sales decrease, but increase again later as ratings volume increases.
Figure 11.3
Sales Dynamics. [Figure: weekly sales (0–300) over weeks 1–11 for the consensus five-star, consensus three-star, and varied three-star scenarios.]
Scenario 3 (varied three-star) provides another interesting illustration. Compared with the consensus three-star scenario (which has the same average rating but a lower ratings variance), product sales reach the level of the consensus five-star scenario more quickly when there is more variance in ratings. In contrast, it takes more than three months for sales in the consensus three-star condition to “recover” and reach the sales levels seen in the consensus five-star scenario. This result, considered together with the demonstrated effects on ratings volume, indicates that variance in posted opinions can generate a dynamic that encourages subsequent word-of-mouth activity, which in turn facilitates the product’s recovery not only in average ratings but also in product sales. The managerial implication of this result is that marketers should not focus solely on encouraging positive word of mouth but should also encourage variance of opinions in the online discussion. Although the results of these simulations may be specific to the business environment of the collaborating firm, they should provide further impetus for models of consumer-generated product reviews to go beyond investigating the direct effects of ratings to incorporate a variety of potential indirect effects that come in through the dynamics studied here.
11.10 Discussion and Conclusion
Our objective was to decompose the ratings effect on sales into a baseline component and a dynamic component resulting from social dynamics in the ratings environment. We model ratings behavior as a dynamic hazard process and measure the effects that previously posted ratings have on future ratings behavior. The hazard-modeling framework provided measures of expected average product rating, ratings variance, and ratings volume both with and without the influence of social dynamics. These metrics, when included in a model of product sales, enable us to measure the impact of ratings dynamics on product sales. Our model results and simulations show that there are substantial ratings dynamics, and their effects on sales are noticeable. Specifically, our model results show that ratings dynamics can have a direct effect on product sales through their impact on ratings valence, but not variance or volume. Our simulation results further indicate that differences in initial ratings can result in dynamics that have indirect effects on product sales. However, these effects are relatively short lived.

In recent years, the sales impact of ratings has been the focus of many research efforts. However, few researchers have examined how ratings dynamics may influence product sales. We have explicitly studied the ratings dynamics within a product forum and have measured the effects of product-level ratings measures on future ratings behavior. We hope that our results encourage further study of how people are influenced by the social dynamics taking place in a ratings environment. For example, it might be of interest to study whether social dynamics can affect a person’s decision to participate in the ratings forum at all, potentially leading to systematic shifts in the customer base composition over time. Individual-level data would be helpful in addressing this question.

Furthermore, we constructed a model that allows for heterogeneous ratings effects across products. Although our results indicate that products do vary in terms of how they respond to ratings dynamics (in both the ratings and sales models), examining the sources of these variations is outside the scope of this article. We encourage researchers to study these differences across products. Furthermore, the analysis of heterogeneity across products can be improved if more ratings data on a product level become available. Indeed, because our data set is sparse (2500 transition processes are being estimated on the basis of 3801 transitions, and only 945 of 2500 processes are not right censored), we
need to rely on Bayesian shrinkage estimation to infer parameters of each transition process. Richer data sets would allow for nonparametric baseline estimates, which may offer further insights into across-product variations.

Another important dimension of ratings environments that calls for further inquiry is the effect of variation in price and sales volume. In our data set, price does not change over time, precluding us from studying any temporal effects that may be caused by price fluctuation. Furthermore, we assume in our analysis that variations in ratings arrival rates are not a function of variations in previous sales. Although in our specific business setting we find little evidence that previous sales help predict future ratings, we acknowledge that this might be specific to mature product categories such as the ones studied here, and future studies should further investigate the role of previous sales on ratings. To establish the true causality of the effects studied here, further research may need to go beyond secondary data analyses and explore rating dynamics in controlled experiment settings.

Overall, online product ratings represent one type of online user-generated content, and as researchers, we have little understanding of the behavior driving consumers to provide this content or their responses to content provided by others. Although substantial research is still needed in this area, we hope that this article provides a first step in framing the problem, introducing a modeling approach, and presenting some empirical results that not only answer some questions but also stimulate readers to ask new ones.

Notes

Reprinted with permission from Journal of Marketing Research, published by the American Marketing Association, Wendy W. Moe and Michael Trusov, “The Value of Social Dynamics in Online Product Ratings Forums,” volume 48 (June 2011), 444–456. The authors thank the Marketing Science Institute for the research grant provided to this project. They also thank BazaarVoice for the data analyzed in this article, as well as the invaluable conversations surrounding this topic.

1. In the subsequent presentation, we use the term “unbiased” in a narrow sense, specific to the discussed context.

2. In this study, we do not perform text analysis of the reviews. We recognize that text analysis may present another valuable dimension in explaining dynamics in ratings environments. We leave the exploration of review text for further research.

3. Alternatively, the product-ratings arrival can be modeled in a count data framework by using the number of posts per unit of time as a dependent variable. We found the
count data approach to be less appropriate in our case, because it is not obvious what level of (time) aggregation should be used in the multiproduct environment considered in this study. For some products with infrequent review posts, aggregation on a weekly (or even biweekly) level might be necessary to reliably estimate parameters of a count model, whereas for others (with a high level of post activity), weekly aggregation might be suboptimal and would effectively lead to information loss.

4. An exception is the sale event, for which we control.

5. Another covariate that may be predictive of ratings arrival is lagged sales. Although we believe that for mature product categories lagged sales are unlikely to be a strong predictor, we tested model specifications with and without a lagged sales covariate. The difference in fit between the two models is minimal (deviance information criterion of 54,770.5 versus 54,787.8, with and without lagged sales, respectively), and the inclusion of lagged sales does not meaningfully change parameter estimates. Therefore, to avoid potential endogeneity problems, we omit lagged sales from the ratings model. We thank the associate editor for this suggestion.

6. Alternatively, we can estimate a set of nonparametric (fixed-effects) baseline parameters for each product j. However, the sparsity of the data necessitates that we rely on Bayesian shrinkage to a common prior. For example, the maximum-likelihood fixed-effect estimate for the one-star hazard would be perfectly 0 for the 423 products that never received such a rating. Our approach also requires the assumption that the random effects are orthogonal to the covariates. To test the sensitivity of our results to this assumption, we have attempted an alternative specification that replaces LAGVOLUME with an independent and exogenous measure (see the Web Appendix at https://www.ama.org/publications/JournalOfMarketingResearch/Documents/value_of_social_dynamics.pdf). The analysis shows that the predicted metrics resulting from the ratings hazard models are not significantly affected. We thank the associate editor for pointing this out.

7. We also considered a fixed-effects model. We found parameter estimation results for both specifications to be similar.

8. We performed a set of robustness checks using both simulated data and subsampling techniques. The results of these tests confirmed a reasonably good performance of the proposed model. We report the details in the Web Appendix at https://www.ama.org/publications/JournalOfMarketingResearch/Documents/value_of_social_dynamics.pdf.

9. Although there is ample theoretical support for raters to exhibit differentiation behavior in response to ratings valence and variance on subsequently posted ratings, the nature of our secondary data does not allow us to rule out some alternative explanations. Unless rating manipulations are performed in a controlled experiment, the true causality is difficult to establish. We thank the associate editor for bringing this important point to our attention.

10. During the time period in question, this product had an average rating of 4.56, received a fair number of reviews (thirty-six posts), and exhibited a notable ratings variance of 1.58.
References

Amabile, Teresa M. 1983. “Brilliant but Cruel: Perceptions of Negative Evaluators.” Journal of Experimental Social Psychology 19 (2):146–156.
Bikhchandani, Sushil, David Hirshleifer, and Ivo Welch. 1992. “A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades.” Journal of Political Economy 100 (5):992–1026.

Chen, Yubo, and Jinhong Xie. 2008. “Online Consumer Review: Word-of-Mouth as a New Element of Marketing Communication Mix.” Management Science 54 (3):477–491.

Chevalier, Judith, and Dina Mayzlin. 2006. “The Effect of Word of Mouth on Sales: Online Book Reviews.” Journal of Marketing Research 43 (August):345–354.

Clemons, Eric K., Guodong Gao, and Lorin M. Hitt. 2006. “When Online Reviews Meet Hyperdifferentiation: A Study of the Craft Beer Industry.” Journal of Management Information Systems 23 (2):149–171.

Dellarocas, Chrysanthos. 2003. “The Digitization of Word of Mouth: Promise and Challenges of Online Feedback Mechanisms.” Management Science 49 (10):1407–1424.

Dellarocas, Chrysanthos. 2006. “Strategic Manipulation of Internet Opinion Forums: Implications for Consumers and Firms.” Management Science 52 (10):1577–1593.

Dellarocas, Chrysanthos, and Ritu Narayan. 2006. “A Statistical Measure of a Population’s Propensity to Engage in Post-Purchase Online Word-of-Mouth.” Statistical Science 21 (2):277–285.

Dellarocas, Chrysanthos, Xiaoquan Zhang, and Neveen F. Awad. 2007. “Exploring the Value of Online Product Reviews in Forecasting Sales: The Case of Motion Pictures.” Journal of Interactive Marketing 21 (4):23–45.

Dodds, William B., Kent B. Monroe, and Dhruv Grewal. 1991. “Effects of Price, Brand, and Store Information on Buyers’ Product Evaluations.” Journal of Marketing Research 28 (August):307–319.

Duan, Wenjing, Bin Gu, and Andrew B. Whinston. 2008. “Do Online Reviews Matter? An Empirical Investigation of Panel Data.” Decision Support Systems 45 (4):1007–1016.

Fleming, John H., John M. Darley, James L. Hilton, and Brian A. Kojetin. 1990. “Multiple Audience Problem: A Strategic Communication Perspective on Social Perception.” Journal of Personality and Social Psychology 58 (4):593–609.

Godes, David, and Dina Mayzlin. 2004. “Using Online Conversations to Study Word-of-Mouth Communications.” Marketing Science 23 (4):545–560.

Godes, David, and Jose Silva. 2009. “The Dynamics of Online Opinion,” working paper, R.H. Smith School of Business, University of Maryland.

Li, Xinxin, and Lorin M. Hitt. 2008. “Self-Selection and Information Role of Online Product Reviews.” Information Systems Research 19 (4):456–474.

Liu, Yong. 2006. “Word of Mouth for Movies: Its Dynamics and Impact on Box Office Revenue.” Journal of Marketing 70 (July):74–89.

Moe, Wendy W., and Peter S. Fader. 2001. “Modeling Hedonic Portfolio Products: A Joint Segmentation Analysis of Music CD Sales.” Journal of Marketing Research 38 (August):376–385.

Resnick, Paul, and Richard Zeckhauser. 2002. “Trust Among Strangers in Internet Transactions: Empirical Analysis of eBay’s Reputation System,” in The Economics of the Internet
and E-Commerce: Advances in Applied Microeconomics, vol. 11, Michael R. Baye, ed. Amsterdam: Elsevier Science, 127–157.

Sawhney, Mohanbir S., and Jehoshua Eliashberg. 1996. “A Parsimonious Model for Forecasting Gross Box-Office Revenues of Motion Pictures.” Marketing Science 15 (2):113–131.

Schlosser, Ann. 2005. “Posting Versus Lurking: Communicating in a Multiple Audience Context.” Journal of Consumer Research 32 (2):260–265.

Westbrook, Robert A. 1987. “Product/Consumption-Based Affective Responses and Postpurchase Processes.” Journal of Marketing Research 24 (August):258–270.
12
Uninformative Advertising as an Invitation to Search

Dina Mayzlin and Jiwoong Shin
Ask me about my Tempur-Pedic … Ask someone you know. Check out Twitter. Try your friends on Facebook. You’ll hear it all un-edited. —Tempur-Pedic TV commercial that aired in the summer of 2010
12.1 Introduction
Many markets are characterized by imperfect consumer information—consumers are often poorly informed about the existence, price, and attributes of products. Due to this uncertainty, firms invest large amounts of resources into developing effective advertising campaigns to influence consumer learning. Given this, it is surprising to observe that in practice a large proportion of advertising contains no direct information on product attributes. (Abernethy and Butler 1992 find that 37.5% of US TV advertising has no product attribute cues.)

In this paper, we give a formal rationale, within a rational framework, for a firm’s decision to strategically withhold information on product attributes in its advertising. In particular, we explain when a firm would choose to make vague claims (or no claims) as opposed to mentioning specific product attributes in its advertising. We will refer to a campaign that emphasizes product attributes as “attribute-focused” advertising. By definition, this type of advertising contains “hard” information (Tirole 1986) about product benefits, and, hence, the claims are credible and verifiable. In contrast, we will refer to a campaign that does not emphasize any particular product attribute as ostensibly “uninformative” or “non-attribute-focused” advertising.1

Below we illustrate our definitions of the two types of ads with examples that span several industries.2 First, consider Microsoft’s advertising for Windows Vista. In September 2008, Microsoft launched a new
ad campaign for Vista, featuring Jerry Seinfeld and Bill Gates. The commercial showed the two having a conversation while trying on shoes at a mall. One striking feature of the ad was the utter lack of Windows-related information—even the name Vista never appeared during the ad. In contrast, in February of 2009, Microsoft changed its advertising strategy to a new “I am a PC” ad campaign that emphasized several product attributes, such as connectivity and ease of use of Windows Vista. We consider the latter campaign to be “attribute-focused,” while we consider the former to be “non-attribute-focused.”

The advertising content strategy often varies across firms within the same industry. For example, consider the recent TV advertising campaigns for three different digital cameras: Canon PowerShot, Nikon Coolpix, and Fujifilm FinePix. The Nikon ads, featuring the actor Ashton Kutcher, consistently emphasized Coolpix’s strong features, such as its zoom capability and touch screen function. In contrast, both the Canon advertising campaign “Make Every Shot a PowerShot” and Fujifilm’s “A Lifetime is Made of Memories” do not mention any of the cameras’ attributes. Hence, while the Nikon campaign is attribute-focused, the Canon and Fujifilm campaigns are non-attribute-focused.3

We also observe this inter-firm variation in advertising strategy in other industries such as the financial services industry. Some firms emphasize their product attributes. For example, Capital One’s “What’s in Your Wallet?” campaign focuses on a specific attribute, such as the convenience of claiming rewards or the low interest rate. Other firms in the same industry pursue a different advertising strategy in that they convey no direct information about their credit cards’ benefits. For example, the American Express “My Life. My Card.” campaign made no mention of the card’s benefits such as its excellent rewards program. Similarly, the advertisement for First Premier Bank contained no direct product claims.

How will these different types of advertising campaigns affect consumers’ inference about product quality? What is the relationship between product quality and the firm’s decision to make attribute-focused claims? One intuitive hypothesis is that the high-quality product would choose to emphasize its product benefits, which are, by definition, strong. However, the limited bandwidth of communication inherent in any form of advertising implies that a firm can talk about only a small subset of its product’s attributes. It is impossible for a firm to accurately communicate all of the features associated with its product in a 30-second commercial or a print ad (Shapiro 2006, Bhardwaj et al.
2008). Hence, if the firm claims to be good on a few selected attributes, its advertising will be indistinguishable from the advertising of the firm that is only good at those attributes. If, on the other hand, the firm makes no attribute-focused claims, its advertising will be indistinguishable from the advertising of a firm that cannot deliver high quality on any attributes. For example, consider once again the digital camera Canon PowerShot SD940 IS. This camera is high quality on a large number of attributes: the Canon web site lists eighteen technologies that were contained within the camera (consistent with this, in 2010 the camera was ranked number two out of fifty-five in the subcompact digital camera category by Consumer Reports). Clearly, Canon PowerShot cannot emphasize all of its superior attributes in a 30-second commercial or a print ad. On the other hand, if Canon decides to focus on one of its attributes, such as the quality of its flash photos, it cannot distinguish itself from a camera such as Nikon Coolpix S3000, which happens to have high quality flash photos but is dominated by Canon on the versatility dimension. If Canon instead chooses to emphasize the versatility dimension, then it cannot distinguish itself from Pentax Optio I-10, which is equally versatile but is dominated by Canon on LCD quality.4

The argument above highlights the point that the firm may not be able to entirely resolve the uncertainty about its product through advertising alone under limited bandwidth in advertising communication. However, a consumer who is uncertain about the product’s features following exposure to advertising may take actions to resolve this uncertainty (i.e., the consumer is “active”): she can conduct her own search to discover the product’s quality prior to purchase by engaging in activities such as reading online product reviews or talking to her friends. Therefore, the high quality firm may actually prefer to encourage the consumer to search since it is confident that the information uncovered will be positive. We show that there exists an equilibrium where non-attribute-focused advertising serves as an invitation to search. In contrast, an average firm that imitates this strategy risks losing its customer in cases when she uncovers negative information as part of her search. Hence, an average firm may choose to engage in an attribute-focused appeal, despite the fact that this perfectly reveals its type. Upon seeing a non-attribute-focused ad, the consumer chooses to search since she is not sure whether the advertiser is high type (who says little because it has so much to say) or the low type (who says little because it has nothing to say). The Tempur-Pedic ad cited at the
beginning of the paper illustrates this intuition.5 The ad makes no product claims but, instead, encourages the consumer to seek word of mouth about the mattress. While most non-attribute-focused ads do not literally ask the viewer to search for further product information, we argue that the lack of direct attribute information can effectively serve as an invitation to search.

In this paper, we formalize the above argument and develop a framework to analyze the firm’s simultaneous price and advertising content decisions. We also endogenize the consumer’s search behavior, which is again affected by the firm’s price and advertising content. By taking into account the fact that consumers may choose to search for additional information after seeing its advertising, the high quality firm may strategically withhold information on product attributes in its advertising message, and, therefore, advertising content can signal product quality. In an extension, we show that advertising content can signal quality even when the amount of advertising spending can serve as a signal of quality (which is the case in “money-burning” models of advertising) as long as the consumer’s observability of advertising spending is imperfect. We also show that our results are robust to several other alternative model specifications.

The paper is organized in the following manner. In section 12.2, we relate our paper to the existing literature in economics and marketing. Section 12.3 presents the model setup. We discuss the model results and its extensions in sections 12.4 and 12.5, and we conclude in section 12.6.
Literature Review
This paper contributes to the literature on informative advertising.6 The information that advertising provides can be direct, such as the existence of the product or its price (e.g., Grossman and Shapiro 1984), or indirect, where the mere fact that the firm advertises signals the quality of the product (e.g., Nelson 1974). In the existing literature, “indirect” information has been associated with quality signaling. The most prominent theory of the qualitysignaling role of advertising is known as the “money-burning” theory of advertising (see Nelson 1974, Kihlstrom, and Riordan 1984, Milgrom and Roberts 1986, Bagwell and Ramey 1994, and Bagwell 2007 for a comprehensive review). According to this theory, the seller of an “experience” item (a product whose quality cannot be ascertained prior to
purchase) cannot credibly make direct claims about product quality in its ads. Instead, the high quality firm can credibly convey its quality information indirectly through its advertising expenditure. Hence, one of the important take-aways of this theory is that it is the level of spending that signals the quality of the product, and not the content of the message. On the other hand, for “search” items (goods whose quality can be ascertained before purchase), firms may convey quality information directly through the advertising message. Of course, if the information can be conveyed directly through advertising content, there may not be room for quality-signaling through advertising.7 However, we show that the signaling role of advertising is important even for search goods because of limited bandwidth. As we mentioned before, the limited bandwidth in advertising (Shapiro 2006, Bhardwaj et al. 2008) implies that the firm cannot perfectly convey quality information directly through its advertising message alone, and, therefore, it cannot fully resolve the uncertainty concerning the product quality. Thus, the firm needs to further signal its quality in order to resolve this remaining uncertainty. There are several papers that relate to the issue of advertising content. In particular, a number of papers have focused on direct information contained in the advertising message such as price information or information about product existence. For example, Butters (1977) and Grossman and Shapiro (1984) allow the firm to announce the existence of its product or price through advertising. Simester (1995) and Shin (2005) examine the credibility of price claims in advertising messages. Recently researchers have also started to incorporate advertising content (other than price) explicitly in their models. Anand and Shachar (2007) show that while advertising content plays no role in equilibrium, it may shape off-equilibrium beliefs. Anderson and Renault (2006) investigate the amount and the type of information that a seller of search goods would choose to reveal in its advertising messages when consumers are imperfectly informed about product attributes and prices. That is, they investigate whether a firm would choose to inform consumers about product attributes and/or prices in the presence of consumer travel costs. Although our work is similar to Anderson and Renault (2006) in that we also find that the firm may strategically choose to withhold information in its advertising, the model setups, and, hence, the mechanisms behind the two results, are very different.8 In Anderson and Renault (2006) the travel cost creates a misalignment
of incentives between the firm and the consumer. That is, while the firm wants to convey to a high-valuation consumer precise match information in order to encourage her to visit the store, the firm may not want to convey precise match information to lower-valuation consumer prior to her visit. Due to this trade-off, the firm chooses to reveal only partial match information in its advertising. In the current work, on the other hand, the firm chooses to withhold information in order to encourage the consumer to obtain additional information on her own. To summarize, our paper contributes to two different research streams in advertising: (1) the signaling role of advertising (or advertising as indirect information); and (2) the role of advertising content (or advertising as direct information). Existing literature finds that in the search goods case, the signaling role of advertising is irrelevant, while in the experience goods case, advertising content plays no role in signaling quality. However, under limited bandwidth of advertising, a firm cannot perfectly convey its quality information in its advertising even for search goods. In this case, we show that advertising content can serve as an invitation for the consumer to search for further information on product quality. Hence, advertising content can signal product quality. In addition, the firm can signal its quality through actions other than advertising, such as product warranties (Moorthy and Srinivasan 1995), umbrella branding (Wernerfelt 1988), and the selling format (Bhardwaj et al. 2008). In particular, our model relates to Bhardwaj et al. (2008) in that both papers consider quality signaling in the presence of limited bandwidth in communication. However, there are important differences between the two papers. First, we focus on the content of the communication, while Bhardwaj et al. (2008) focus on who initiates the communication. That is, our model assumes that the firm always initiates communication and investigates whether attribute information should be communicated in an ad, while Bhardwaj et al. (2008) assume that attribute information is always sent and ask who should initiate the communication. Second, in our model the costly search is a decision variable on the part of the consumer, while in Bhardwaj et al. (2008) the search is costless and always occurs. Finally, our model also contributes to the literature on countersignaling (Teoh and Hwang 1991, Feltovich et al. 2002, Araujo et al. 2008, Harbaugh and To 2008).9 In contrast to the standard signaling models
where high types send a costly signal to separate themselves from the low types, in countersignaling models the high type chooses not to undertake a costly signaling action. People of average abilities, for example, get more education than bright people in labor markets (Hvide 2003). Mediocre firms reveal their favorable earning information, while both high quality and low quality firms tend to conceal their earnings information in the financial market (Teoh and Hwang 1991). Feltovich et al. (2002) formalize this intuition and show that in the presence of an external signal, the high type may pool with the low type while the medium type prefers to separate. Their motivating example is one of a job seeker, who has not seen his letters of recommendations, deciding whether or not to reveal his high school grades during an interview. We follow the setup of Feltovich et al. (2002) in that we also have three possible types of sender (in our case, the firm), and the receiver (the consumer in our case) can also obtain an additional noisy signal on the firm’s type. While our model’s result is similar to the results in countersignaling models in that the high and low types pool on the same action, the advertising context makes it necessary for us to define a model that is significantly different from the extant countersignaling models. First and most importantly, while in the previous countersignaling models (e.g., Feltovich et al. 2002, Harbaugh and To 2008), the receiver is assumed to always receive the second signal, in our model the receiver (i.e., the consumer) is active and, hence, only receives this additional information if she chooses to search after observing the price and content of the advertising message. That is, we endogenize the presence of a second signal, which plays a critical role in enabling the equilibrium where the high and the low types pool on non-attributefocused advertising. Second, we also allow price to be a potential signal, which was not an issue in the earlier models. This has a substantive as well as a technical implication for the model. Substantively, the consumer’s decision to search is critically impacted by the price that the firm charges. Hence, depending on the level of price that the firm charges, different types of equilibria arise. Technically, endogenizing price and search allows us to identify conditions under which our focal equilibrium is unique. In contrast, in most countersignaling models, the countersignaling equilibrium is not unique. Finally, in existing countersignaling models, the high and low types do not undertake the costly signaling action (hence, the term “countersignaling”), while in
our basic model all types engage in advertising, where the exact content differs by type. Although we have focused on rational explanations for uninformative advertising, there are a number of behavioral-based explanations for this phenomenon (see Carpenter et al. 1994, Holbrook and O’Shaughnessy 1984, Kardes 2005, and Scott 1994).10 These models emphasize the importance of both the cognitive and the emotional response to advertising. Since we predict that firms may choose to engage in uninformative advertising even in the absence of these psychological forces, our work complements these explanations. 12.3
Model
The game consists of one firm and one consumer. There is an informational asymmetry about the quality of the firm's product: the firm knows the quality of its product, while the consumer must infer the product's quality from signals that she receives from the firm as well as information that she may obtain on her own. In particular, the product consists of two attributes, α ∈ {A, a} and β ∈ {B, b}, where the capital letter stands for higher quality on that dimension. We also assume that an attribute is equally likely to be high or low quality, P(α = A) = P(β = B) = 1/2, and that there may be correlation between the levels of the two attributes, P(β = B|α = A) = P(α = A|β = B) = P(β = b|α = a) = P(α = a|β = b) = ρ, where 0 < ρ < 1. Hence, there are three possible types (θ) of products based on the quality levels of the attributes: θ ∈ {H, M, L} = {{A, B}, {A, b} or {a, B}, {a, b}}, with a priori probabilities (ρ/2, 1 − ρ, ρ/2), respectively.11 The consumer has the following utility function:
u = u0 + (V/2)·1{α = A} + (V/2)·1{β = B}.    (1)
where 1{∙} indicates whether the attribute is high quality. For now, we assume that both attributes are equally important to the consumer, but we relax this assumption in section 12.5.1. Hence, a priori, the H-type product delivers utility u0 + V to the consumer, the M-type product delivers utility u0 + V/2, and the L-type delivers u0 (the basic utility from the product). We normalize u0 = 0 for simplicity. Note that while the exact utility levels are not important to our results (for example, we can re-normalize u0 > 0 to better capture the
reality that even inferior products yield some utility to the consumer), the rank-ordering of products from the consumer's perspective is important. Hence, all else equal, a consumer would prefer H to M, and M to L, which in turn implies that the L type wants to imitate H and M; the M type wants to separate itself from L and imitate H; and H wants to separate itself from M and L. The firm can communicate to the consumer through advertising. We assume that the cost of advertising is zero. This allows us to focus on the role of content in advertising above and beyond the well-known effect of money burning, where the firm can signal that it is high type by engaging in costly advertising activity. We also assume that the firm must advertise in order to inform the consumer of its product's existence. These two assumptions imply that the firm always chooses to advertise. While our model primarily deals with the quality-signaling role of advertising, this assumption acknowledges the importance of the awareness role of advertising in reality. In section 12.5, we impose a non-zero advertising cost and allow the firm to invest resources into raising product awareness. The firm's action space consists of two possible advertising choices. First, the firm can choose an ad that centers on the product's attributes, that is, "attribute-focused" advertising. Here, we impose a truth-telling assumption following the literature (Anderson and Renault 2006, Bhardwaj et al. 2008, Simester 1995):12 the firm cannot claim to be high quality on an attribute on which it is in fact low quality. This implies that while H and M can engage in either attribute-focused or non-attribute-focused advertising (since both are high quality on at least one attribute), L can only engage in non-attribute-focused advertising since it cannot claim to be high quality on either attribute.13 To capture the reality of limited bandwidth inherent in a communication medium such as TV, we allow the firm to transmit information about only one attribute—either α or β: a = aj, where j ∈ {α, β}. In practice, a product contains a large number of features. However, given the constraints on the time available for communication, as well as the limited cognitive resources available to the consumer for processing advertisement information (Shapiro 2006), the firm is only able to communicate about a small subset of these features (Bhardwaj et al. 2008). In section 12.5.2, we extend this two-attribute model to a more general multi-attribute setting. We show that the critical assumption is whether the bandwidth of advertising is low enough so that the H type cannot distinguish itself from the M type through the message alone.
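To make the setup concrete, the following Python sketch (ours, not part of the original analysis) encodes the type space, the prior probabilities (ρ/2, 1 − ρ, ρ/2), and the a priori expected utilities. The values of RHO and V, and all variable names, are purely illustrative assumptions.

RHO = 0.6   # assumed correlation P(beta = B | alpha = A); illustrative only
V = 1.0     # value of being high quality on both attributes (u0 normalized to 0)

# Type space with a priori probabilities (rho/2, 1 - rho, rho/2) and expected utilities.
types = {
    "H": {"attributes": ("A", "B"), "prior": RHO / 2, "utility": V},
    "M": {"attributes": ("A", "b"), "prior": 1 - RHO, "utility": V / 2},  # or ("a", "B")
    "L": {"attributes": ("a", "b"), "prior": RHO / 2, "utility": 0.0},
}

priors = {name: t["prior"] for name, t in types.items()}
expected_value = sum(t["prior"] * t["utility"] for t in types.values())
print("prior type probabilities:", priors)
print("prior expected valuation E(V):", round(expected_value, 3))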
Table 12.1
All Types and Possible Actions

Product Type    Attribute α    Attribute β    Expected Utility    Possible Ads    Price
L               a              b              0                   a0              p ≥ 0
M               A (or a)       b (or B)       V/2                 a0, aj          p ≥ 0
H               A              B              V                   a0, aj          p ≥ 0
In contrast to attribute-focused advertising, the firm can choose not to emphasize any particular attribute: a = a0. We refer to this as “nonattribute-focused” advertising. In table 12.1, we summarize the possible types and the actions available to them. Following Meurer and Stahl (1994), we assume that the consumer can costlessly obtain information on the firm price, p, after observing an ad. If the firm is not able to commit to the price initially, the consumer would not choose to invest in search since she would fear that the firm would charge a price that would extract all of her surplus. This is very similar to a hold-up problem that occurs in the presence of consumer travel costs (see Wernerfelt 1994). After the consumer receives the advertising message and observes the price, she can choose to invest a cost c in order to discover the quality of the product. After incurring this cost of search, the consumer obtains extra noisy information about the product quality.14 This may involve searching for online reviews (Chevalier and Mayzlin 2006), observing word-of-mouth (Chen and Xie 2008, Godes and Mayzlin 2004), reading Consumer Reports, or doing other types of search activities. We assume that consumer search yields a binary signal on the product quality, s ∈{ s , s }, where s denotes positive news and s denotes negative news. The search outcome is related to the product’s quality level, θ ∈ {L, M, H}, according to the following probabilities: Pr( s|θ ) = γ θ , where θ ∈ {L, M , H } γL < γ M < γ H.
(2)
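As a quick illustration of the signal structure in equation (2), the short Python simulation below draws the binary search outcome for each type. The specific values γL < γM < γH are hypothetical, and the function name is ours.

import random

GAMMA = {"L": 0.2, "M": 0.5, "H": 0.8}   # assumed Pr(good news | type); illustrative

def draw_signal(theta, rng):
    """Return 'good' with probability gamma_theta and 'bad' otherwise."""
    return "good" if rng.random() < GAMMA[theta] else "bad"

rng = random.Random(0)
for theta in ("L", "M", "H"):
    draws = [draw_signal(theta, rng) for _ in range(10000)]
    share = draws.count("good") / len(draws)
    print(f"type {theta}: simulated Pr(good news) = {share:.3f} (target {GAMMA[theta]})")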
The firm knows that the consumer can obtain this extra information with the above probabilities, but does not observe whether the consumer actually chooses to search for this extra signal, let alone what signal the consumer ultimately receives if she chooses to do so. The signal space of each type has the same support so that no signal is
perfectly informative. Also, equation (1) implies that the higher quality firm is more likely to generate favorable information. This amounts to a MLRP (Monotone Likelihood Ratio Property) assumption over the signal space across types. In other words, positive news ( s ) is really “good news” regarding the firm’s quality (Milgrom 1981). For example, suppose that after viewing an ad for Canon PoweShot, Bob posts an inquiry about this camera on a digital photography forum. Since this camera is excellent, Bob is likely to receive a positive recommendation. Bob is less likely to receive a positive review for Fujifilm FinePix, which is more likely to disappoint a random consumer. This example illustrates several important points. First, the information the consumer receives through search is potentially richer than the information she can obtain after viewing an ad. The binary signal above can be viewed as a summary of all the product attributes. Second, even an excellent product may generate a negative signal: there is noise in the signal due to factors such as individual taste idiosyncrasies or promotional chat generated by firms, for example. However, a better product is more likely to yield a positive signal (Mayzlin 2006). Hence, the additional signal is informative but noisy. After the consumer receives information regarding the product (through either advertising, prices, or own research), she forms a belief on the quality of the product. Here, we signify by Ω the consumer’s information set, and by μ(Ω) the consumer’s belief. In particular, μ(Ω) = (μL(Ω), μM(Ω), μH(Ω)), where μL(Ω) = P(L|Ω), μM(Ω) = P(M|Ω), μH(Ω) = P(H|Ω). The consumer’s information set (Ω) includes the observation of advertising (a), price (p), and consumer’s own search (s) if that takes place. That is, if the consumer performs own search, then Ω = {aj, p, s} for a firm that advertises an attribute, and Ω = {ao, p, s} for a firm that employs non-attribute-focused advertising. If, on the other hand, no consumer search takes place, then Ω = {aj, p} for a firm that advertises an attribute, and Ω = {ao, p} for a firm with non-attribute-focused advertising. The consumer then decides whether to purchase the product at its posted price based on the posterior belief on its quality: μ(a, p, s) in the case of consumer research, and μ(a, p) in the case of no search. We assume that a consumer who is indifferent between purchasing and not purchasing the product chooses to purchase it. The timing of the model can be summarized as follows:
Figure 12.1
Timing of the Game. First, the firm sends an advertising message and charges a price (a, p). Second, the consumer observes the advertising and the price (a, p). Third, the consumer decides whether to search for additional information: if she searches, she updates her belief to μ(a, p, s); if not, she updates her belief to μ(a, p). Finally, the consumer decides whether or not to purchase.
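The belief updating μ(a, p, s) in the timeline can be sketched directly with Bayes' rule. The snippet below is our illustration, using the same hypothetical γ values as above and an arbitrary prior; it shows how good news shifts probability mass toward the H-type and bad news shifts it toward the L-type.

GAMMA = {"L": 0.2, "M": 0.5, "H": 0.8}   # assumed Pr(good news | type); illustrative

def update_beliefs(prior, good_news):
    """Bayes update of (mu_L, mu_M, mu_H) after observing a good or bad search signal."""
    likelihood = {t: (GAMMA[t] if good_news else 1.0 - GAMMA[t]) for t in prior}
    weights = {t: prior[t] * likelihood[t] for t in prior}
    total = sum(weights.values())
    return {t: round(w / total, 3) for t, w in weights.items()}

prior = {"L": 0.3, "M": 0.4, "H": 0.3}   # arbitrary prior for illustration
print("after good news:", update_beliefs(prior, good_news=True))
print("after bad news: ", update_beliefs(prior, good_news=False))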
12.4
Perfect Bayesian Equilibrium
We start with the consumer’s problem and then turn to the firm’s strategy. The consumer observes advertising and price, (a, p), and decides whether to search for additional information before making a final purchase decision. If the consumer is uncertain about the firm’s type even after observing the price and advertising, she can either: (1) forego search for additional information and make a purchase decision based on her belief, μ(a, p), which we abbreviate to μ; or (2) search for additional information s at cost c. In the absence of additional search, the consumer buys the product if and only if E(V|μ) − p ≥ 0. That is, she buys the product if the prior belief is relatively favorable or the price is relatively low. The consumer will search for additional information if: EU(search) ≥ EU(no search) ≡ max (0, E(V|μ) − p)
(3)
Note that the consumer undertakes a costly search only if her decision to purchase differs depending on the outcome of the signal (i.e., there must be value in the information received). In other words, when the consumer chooses to search, she buys only if the signal is high (s = s̄). The conditions for when the consumer chooses to search are specified in the following lemma:
Lemma 1. (Consumer search)
1. If E(V|μ) − p ≥ 0, the consumer will search for additional information if:
c ≤ Pr(s̲|μ)[p − E(V|μ, s̲)]  ⇔  E(V|μ, s̲) + c/Pr(s̲|μ) ≤ p    (4)
2. If E(V|μ) − p < 0, the consumer will search for additional information if:
c ≤ Pr(s̄|μ)[E(V|μ, s̄) − p]  ⇔  p ≤ E(V|μ, s̄) − c/Pr(s̄|μ)    (5)
Moreover, when p = E(V|μ), Pr(s̲|μ)[p − E(V|μ, s̲)] = Pr(s̄|μ)[E(V|μ, s̄) − p]. Proof. See appendix. Equations (4) and (5) compare the marginal cost and the marginal benefit of search. The marginal cost of search (the left-hand side of equations (4) and (5)) is c. The marginal benefit is represented by the right-hand side of these equations and differs depending on the price. If E(V|μ) − p ≥ 0, the consumer would choose to buy the product based on the prior alone in the absence of an additional signal. Hence, the marginal benefit of search is in preventing purchase in the case when the signal is negative (s = s̲). On the other hand, when E(V|μ) − p < 0, the consumer would not purchase the product in the absence of an additional signal. Therefore, the marginal benefit of search is in enabling the consumer to purchase the product in the case when the signal is positive (s = s̄). Note that if the condition in either equation (4) or equation (5) holds, then equation (3) holds—the consumer chooses to search before making her purchase decision. One implication of Lemma 1 is that given a belief, the consumer chooses to search for additional information only if the product's price is within a certain range (see Lemma 2 in the appendix for more details). Hence, we can identify the range of prices and beliefs that ensures the existence of consumer search. For example, figure 12.2 illustrates the consumer's decision to search for extra information when the consumer is not certain whether the firm is H-type or M-type. This can occur if the consumer observes an attribute-focused ad, which implies that the product is not L-type, but could be either H-type or M-type. In figure 12.2, the prior belief μH (the probability that the product is H-type) is graphed on the x-axis (where 0 ≤ μH ≤ 1). For a given belief (μH), if the price is low enough (p < p̲(μH)), the consumer prefers to buy the product without further search (see point D in figure 12.2). As we mentioned in our discussion of Lemma 1, at relatively low levels of p, i.e., p ≤ E(V|μ), the value of additional search is in preventing purchase when the outcome of search is negative, which in this case is captured by p − E(V|μ, s̲). Hence, when p is low, the marginal benefit of search is not high enough to justify the cost of
search. At any point on the convex curve p = p̲(μH), the consumer is indifferent between buying without search and engaging in further search. At a higher price (p̲(μH) < p < p̄(μH)), the consumer prefers to search (see points B and C). That is, here the consumer incurs a cost c to obtain an additional signal and purchases if and only if the outcome is positive, s = s̄, since E(V|μ, s̄) > p and E(V|μ, s̲) < p. On the other hand, at any point on the concave curve p = p̄(μH), the consumer is indifferent between no purchase and engaging in further search, and at p > p̄(μH) the price is so high that the consumer surplus obtained even in the case when the outcome of search is positive (E(V|μ, s̄) − p) is not high enough to justify the cost of search (see point A). As we can see from the figure, given μH, the consumer chooses to search for additional information only if p ∈ [p̲(μH), p̄(μH)]. Moreover, if the belief is extreme (μH < μ̲H or μH > μ̄H), the consumer does not engage in search at any price. However, at an intermediate level of uncertainty, μH ∈ [μ̲H, μ̄H], there exists a price range at which search occurs. Note that the cut-off beliefs, μ̲H and μ̄H, are a function of the search cost c. As c increases, the range [μ̲H, μ̄H] shrinks and becomes empty when c > V(γH − γM)/8 (see Lemma 3 in the appendix). That is, search will not occur under any belief if the search cost is sufficiently high. Since we want to model an active consumer who can choose to engage in her own search, we focus on the region of the parameter space where the cost c is low enough such that search is a feasible option for the consumer.15
Assumption: Search cost is low enough such that c ≤ V(γH − γM)/8.
What is the potential role of search in our model? As we can see from figure 12.2, given the prior belief μH, the possibility of consumer search allows the firm to charge a higher price (see point B, for example), as compared to a situation where no consumer search is possible, in which case the maximum the firm can charge is p = E(V|μ). That is, the fact that the consumer can undertake an action to resolve the uncertainty surrounding the firm's quality enables the firm to charge a higher price. In this sense, the firm may want to invite the consumer to search. We can think of this as the benefit of search to the firm. However, while the possibility of search increases the upside of a transaction through a higher price, it also introduces the possibility that no transaction occurs in the case when the consumer receives a negative signal, which may happen even for the highest type since the signal is noisy. We can think of the no-transaction outcome as the cost of search (or alternatively as the risk inherent in search) to the firm. Since the probability of a negative signal differs across different types, search is
differentially costly to different quality types. Therefore, a firm that “invites” the consumer to search through an advertising action may be able to signal its quality by credibly demonstrating its confidence in the outcome of the search.
Figure 12.2
Consumer Beliefs and Optimal Response Behaviors. The figure plots price p (vertical axis, from 0 to V) against the prior belief μH (horizontal axis, from 0 to 1, with cutoffs μ̲H, μ̂H, and μ̄H). The concave curve p̄(μH) marks indifference between "do not purchase" and "search," the convex curve p̲(μH) marks indifference between "purchase" and "search," and the line E(V|μH) = p lies between them; points A, B, C, and D mark the price levels discussed in the text.
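The decision regions of figure 12.2 can be reproduced numerically from Lemma 1. The sketch below is our illustration, with hypothetical values of V, c, and the γ's, for a consumer who knows the product is not L-type; it compares the expected payoff of buying outright with the expected payoff of searching and buying only after good news.

V, C = 1.0, 0.02                          # assumed valuation and search cost
GAMMA = {"L": 0.2, "M": 0.5, "H": 0.8}    # assumed Pr(good news | type)
UTIL = {"L": 0.0, "M": V / 2, "H": V}     # utilities with u0 normalized to 0

def decision(mu, p):
    """Classify the consumer's best response as 'buy', 'search', or 'no purchase'."""
    ev = sum(mu[t] * UTIL[t] for t in mu)                       # E(V | mu)
    pr_good = sum(mu[t] * GAMMA[t] for t in mu)                 # Pr(good news | mu)
    post_good = {t: mu[t] * GAMMA[t] / pr_good for t in mu}
    post_bad = {t: mu[t] * (1 - GAMMA[t]) / (1 - pr_good) for t in mu}
    ev_good = sum(post_good[t] * UTIL[t] for t in mu)           # E(V | mu, good news)
    ev_bad = sum(post_bad[t] * UTIL[t] for t in mu)             # E(V | mu, bad news)
    eu_no_search = max(0.0, ev - p)
    eu_search = pr_good * max(0.0, ev_good - p) + (1 - pr_good) * max(0.0, ev_bad - p) - C
    if eu_search >= eu_no_search:
        return "search"
    return "buy" if ev - p >= 0 else "no purchase"

# Sweep beliefs mu_H (with mu_M = 1 - mu_H) and prices, as in figure 12.2.
for mu_h in (0.1, 0.3, 0.5, 0.7, 0.9):
    mu = {"L": 0.0, "M": 1.0 - mu_h, "H": mu_h}
    choices = [decision(mu, p) for p in (0.3, 0.55, 0.7, 0.85, 0.95)]
    print(f"mu_H = {mu_h:.1f}:", choices)

For these illustrative numbers, low prices produce "buy," very high prices produce "no purchase," and "search" appears only at intermediate prices and intermediate beliefs, mirroring the regions in the figure.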
We next consider the firm’s strategy in more detail. We focus on pure strategies only. Hence, each type chooses an advertising and price combination: (aθ, pθ), where θ ∈ {L, M, H}. There are a number of equilibria that are possible, ranging from full separation to full pooling (see table 12.2). For example, in HM equilibrium, the H- and M-types send out the same advertising message and post the same price, while the L-type differs in at least one of these actions: (aH, pH) = (aM, pM) ≡ (aHM, pHM) ≠ (aL, pL). This in turn implies that if the consumer observes (aL, pL), she infers that the product is L-type. On the other hand, if she observes (aHM, pHM), she is uncertain whether the firm is H-type or Mtype. Her decision to search for extra information depends on her prior belief as well as the price p. While the advertising action choice is discrete (an advertising action can be either attribute-focused or
Table 12.2
Possible Equilibria

Equilibrium Type    Description        Notation
Full Separating     H, M, L separate   FS
Semi-Separating     H, M pool          HM
Semi-Separating     H, L pool          HL
Semi-Separating     M, L pool          ML
Full Pooling        H, M, L pool       HML
non-attribute-focused), the price variable is continuous, which implies that a continuum of prices is possible for each type of equilibrium. We can quickly rule out two potential equilibria: the fully separating equilibrium (FS) and the semi-separating equilibrium where M and L pool (ML) by contradiction. Suppose that there exists a fully separating equilibrium (FS). Full separation implies that the consumer can simply infer the product’s type by examining the prices and the advertising campaign. Furthermore, in this equilibrium, the consumer does not search since search is costly and the product’s type can be perfectly observed. That is, (aL, pL) ≠ (aM, pM) ≠ (aH, pH). From our model assumptions, the L-type can only send a non-attribute-focused uninformative ad: aL = a0. Also, note that if pH > pM, the M-type will deviate to the H-type’s strategy, and if pH < pM, the H-type will deviate to the Mtype’s strategy, since the consumer will not search in this equilibrium. This implies that pH = pM = p . This in turn implies that it must be the case aH ≠ aM in equilibrium. Hence, either H or M must engage in non-attribute-focused advertising in FS equilibrium. Suppose that aH = a0 and let p < pL . This of course implies that H-type will mimic L’s strategy. If, on the other hand, p > pL , L-type will mimic H’s strategy. Hence, it must be the case that p = pL , which implies that ( aL , pL ) = ( aH , pH ) = ( a0 , p ) . This contradicts the initial assertion that (aL, pL) ≠ (aM, pM) ≠ (aH, pH); therefore, a fully separating equilibrium does not exist in our model. Proposition 1. A fully separating equilibrium does not exist. The result above illustrates the importance of search in enabling signaling in our model. Consumer search cannot occur in a fully separating equilibrium since the consumer has no uncertainty about the firm type after observing price and advertising. The assumption that there are more types (three types) than possible advertising actions (two possible actions: attribute versus non-attribute-focused ad) results
in at least some pooling between different types in advertising action. The remaining question is then whether price can differentiate between types in the absence of search by the consumer. As is illustrated in the proof above, price alone cannot signal quality since our model does not have any of the elements (such as differential costs, demand, or profits from repeat purchases) that would ordinarily enable price to be a signal of quality in standard signaling models. Instead, as we show below, it is consumer search (coupled with price) that enables signaling in our model.16 Similarly, we can show that the semi-separating equilibrium, ML, where M- and L-types pool cannot exist. In ML, it must be the case that pL = pM ≡ pML, aL = aM = a0. Note that pML < V2 since even with search the consumer cannot be absolutely certain that the product is not L-type. However, if M-type deviates to aj, j ∈ {α, β}, it can charge at least V2 since an attribute message credibly signals that it is not type L. Intuitively, since M is able to perfectly separate itself from the L-type player through advertising, it prefers to do so. Hence, an equilibrium where the M and L pool does not exist. Proposition 2. ML equilibrium does not exist. The remaining three equilibria candidates (HML, HM, and HL) can be categorized into two types: one in which H separates from M (HL), and one in which H pools with M (HML, and HM). As is the case for any signaling model, we have to deal with the technical issue of specifying the out-of-equilibrium beliefs. There are two main approaches to dealing with this. The first is to assume a particular set of beliefs following a deviation (see, for example, McAfee and Schwartz 1994). While this method is often used, it is vulnerable to the criticism that any specific set of chosen beliefs is, by definition, arbitrary. The second approach is to start with an unconstrained set of out-of-equilibrium beliefs, but then narrow it using an existing refinement. The strength of this approach is that it imposes some structure on the out-of-equilibrium beliefs—a belief that is consistent with a refinement is more “reasonable.” A number of signaling models employ the Intuitive Criterion (Cho and Kreps 1987) to refine the beliefs (e.g., Simester 1995, Desai and Srinivasan 1995). The idea behind this criterion is as follows. Suppose that a consumer observes the deviation A1 = (a, p). If type θ makes lower profit in deviation than in equilibrium under all possible consumer beliefs, the consumer does not believe that the product could be type θ. That is, if L-type would not benefit from the deviation even under the most optimistic belief, μH = 1, the
consumer does not think that the deviating firm could be type L. In our model, however, no search occurs under extreme beliefs, such as μH = 1, since the consumer would rationally choose not to search under certainty. Of course, if search does not occur, and as was illustrated in our discussion following Proposition 1; all types equally benefit (or are hurt) by a deviation. Hence, the Intuitive Criterion does not narrow the beliefs in our model; in other words, any out-of-equilibrium belief in our model can survive the Intuitive Criterion. Instead, and following other countersignaling papers (e.g., Feltovich et al. 2002, Harbaugh and To 2008), we use a stronger refinement, the D1 criterion (Fudenberg and Tirole 1991), to eliminate unreasonable out-of-equilibrium beliefs. The idea behind this refinement is roughly as follows. Consider the set of best responses associated with a particular out-of-equilibrium belief. Suppose that H-type benefits from the deviation under a bigger set of best responses than L-type. Moreover, this is the case for all possible beliefs. D1 then requires that the consumer does not believe that the deviating type is L. More generally, suppose that in deviation A1 = (a, p), type θ′ makes higher profit than in equilibrium under a strictly bigger set of best responses from the consumer than type θ does. D1 then requires that the consumer does not believe that the product could be type θ. Unlike the Intuitive Criterion, D1 does not require that the L-type must not benefit from the deviation under any possible belief. Instead, it requires that the set of consumer’s best responses, which are based on the consumer’s beliefs, should be strictly smaller than that of the H-type. We show that our equilibrium is supported by out-of-equilibrium beliefs that survive not only the Intuitive Criterion, but also even the stronger D1 refinement. We discuss the D1 criterion and its application in the appendix. 12.4.1 The Countersignaling HL Equilibrium We first consider the equilibrium that is the core of this paper: HL equilibrium. In this equilibrium, H- and L-types pool on non-attributefocused advertising and price, whereas the M-type engages in attributefocused advertising and perfectly reveals its type to the consumer. Since as in Feltovich et al. (2002), the high and the low types undertake the same action, we refer to this equilibrium as the countersignaling equilibrium. Surprisingly, in this equilibrium the type with the most to say (H-type) chooses a message devoid of any information on product
attributes. It is this HL equilibrium that explains the fact that so many brands in practice choose not to mention specific product attributes in their advertisements, even when they are strong on these attributes. That is, this equilibrium offers a rational explanation for non-attributefocused advertising, which is the focus of this research. We first characterize the equilibrium and demonstrate its existence. Second, we show that HL equilibrium survives the D1 refinement. Finally, we show that search is necessary for the existence of the HL equilibrium. In addition, we show that the HL equilibrium is unique in section 12.4.2. Proposition 3. A semi-separating HL equilibrium exists if: ( 1− γ )V + 2 c γ V − 2c max 2 −γHH −γ L , 2γVH < min γHH + γ L , 2γVM . In this equilibrium, H- and L-
types pool on (a0, p*HL), where max{((1 − γH)V + 2c)/(2 − γH − γL), V/(2γH)} ≤ p*HL < min{(γHV − 2c)/(γH + γL), V/(2γM)},
while the M-type separates on ( a j , pM = V2 ). The consumer chooses to * search when she observes (a0, pHL ) and purchases the product only if she receives good news ( s = s ). Here, Π*(H) > Π*(M) > Π*(L). Proof. See the appendix. Proposition 3 demonstrates the existence of the HL equilibrium with consumer search. The condition for existence presented in Proposition 1 summarizes several restrictions on the model parameters. First, consider the consumer’s optimal strategy, given the firm’s equilibrium pricing and advertising strategy. Based on Lemma 1 and given equilibrium beliefs, the consumer chooses to search for additional information * following (a0, pHL ) as long as the equilibrium price is not too low or ( 1− γ H )V + 2 c γ V − 2c * too high: 2 −γ H −γ L ≤ pHL < γHH + γ L . (If the price is too low such that ( 1− γ )V + 2 c * pHL < 2 −γHH −γ L , the consumer’s best response is to buy without search. γ V − 2c * Also, if the price is too high such that pHL > γHH + γ L , the consumer’s best response is not to purchase.) In the proof in the appendix, we show V (γ − γ ) that if the search cost is low enough c ≤ H8 M V , there always * exists a price pHL such that the consumer would choose to search in equilibrium. Next, we turn to the firm’s problem. In equilibrium, all types prefer their equilibrium strategies to reflect the optimal deviation. Of course, the optimal deviation depends on the out-of-equilibrium beliefs. To show existence, we assume the following out-of-equilibrium beliefs: * μL = 1 for all (a0, p ≠ pHL ) and μH = 0 for all ( a j , p ≠ V2 ) (below we show that this belief is indeed reasonable; i.e., survives the D1 refinement). Given this, the firm’s non-deviation conditions are the following:
Π*(a0, p*HL | θ = H) = γH p*HL > max_A1 Π(A1 | θ = H) = V/2,
Π*(aj, pM | θ = M) = V/2 > max_A1 Π(A1 | θ = M) = γM p*HL,
Π*(a0, p*HL | θ = L) = γL p*HL > max_A1 Π(A1 | θ = L) = 0.    (6)
This of course reduces to the following conditions: γH p*HL > V/2 and γM p*HL < V/2, which implies that for the equilibrium to hold, it must be the case that γH is high relative to γM. We can equivalently express this as a condition on price: V/(2γH) < p*HL < V/(2γM). Combining these two conditions, we obtain the result that the HL equilibrium with search exists if the equilibrium price is in the right range such that max{((1 − γH)V + 2c)/(2 − γH − γL), V/(2γH)} < p*HL < min{(γHV − 2c)/(γH + γL), V/(2γM)}. Hence, there exists a pooling price, p*HL, that supports the HL equilibrium with search if max{((1 − γH)V + 2c)/(2 − γH − γL), V/(2γH)} < min{(γHV − 2c)/(γH + γL), V/(2γM)}. From the condition on price, it is clearly the case that p*HL > V/2, which implies that the H-type firm can charge a quality premium based on the reduced consumer uncertainty under consumer search. That is, in the case when the consumer receives good news (s = s̄), she is willing to pay a higher price compared to the price she is willing to pay for the M-type. Hence, H-type may prefer to extend an invitation to search to the consumer by pooling with L-type on non-attribute-focused advertising when it is confident that the consumer is more likely to receive good news (i.e., γH is sufficiently large). Next, we show that this equilibrium can survive the D1 refinement (and the Intuitive Criterion as well).
Proposition 4. A semi-separating HL equilibrium where the consumer chooses to search after observing (a0, p*HL), exists and survives D1 if
[(γH − γM)p̄j + (V/2)(1 − γH)] / [γH(1 − γM)] < min{(γHV − 2c)/(γH + γL), V/(2γM)}, where p̄j = (3/4)V + √(V²(γH − γM)²/4 − 2Vc(γH − γM)) / (2(γH − γM)).
Proof. See the appendix.
In Lemma 3 in the appendix, we characterize the properties of beliefs imposed by D1 and show that the belief we assumed above, μL = 1 for all (a0, p ≠ p*HL) and μH = 0 for all (aj, p ≠ V/2), is consistent with D1. Note that in addition to the conditions we had in Proposition 3, D1 imposes a new lower limit on price.17 Since not all of the conditions are binding, the constraints reduce to the ones given in Proposition 4. To summarize, we have shown in Proposition 3 that the countersignaling equilibrium where the best and the worst types pool on non-attribute-focused advertising can exist. In other words, advertising
content can signal quality. In Proposition 4, we show that this equilibrium survives the D1 refinement. This demonstrates the robustness of the HL equilibrium since D1 eliminates equilibria that are supported by unreasonable out-of-equilibrium beliefs.
12.4.1.1 Discussion
Based on the results of these two propositions, we can demonstrate when we expect to see the HL equilibrium. From equation (6), we can see that in order for the HL equilibrium with consumer search to exist, it must be the case that γH is sufficiently large and γM is sufficiently small. Here H-type prefers to pool with L-type on non-attribute-focused advertising rather than pursue an attribute-focused strategy that perfectly signals that the firm is not L-type. Since the additional signal associated with each type is noisy, after a non-attribute-focused ad and own search, the consumer may mistake an H-type firm for an L-type. Therefore, the risk H bears by pooling with L must be relatively small (γH is large) such that H-type prefers this to the certain outcome of pretending to be M-type by engaging in an attribute-focused ad. Moreover, when γH is large relative to γL, the consumer is willing to pay a higher price following good news (s = s̄), since she is confident that the product is H-type and not L-type. Hence, when γH is large, not only is the probability of a transaction high, but also the price charged can increase. This is the source of H's confidence in extending the invitation to search to the consumer. On the other hand, M-type prefers to separate itself from L-type rather than pool with it. This can happen only if the additional signal cannot effectively separate between M and L types (in other words, γM is small). Hence, M lacks H's confidence and prefers not to mimic H-type because the probability that it may be misjudged as L-type is too high. That is, while H-type is willing to relinquish control in its communication strategy (by engaging in non-attribute-focused advertising with an uncertain outcome following consumer search), the M-type prefers the lower-risk attribute-focused strategy. Finally, note that in HL equilibrium, all types make a positive profit. In particular, L-type is able to extract rents that arise due to the consumer's mistakes as a result of search. However, L's profit is strictly lower than those of H- and M-types: Π*(H) = γH p*HL > Π*(M) = V/2 > Π*(L) = γL p*HL > 0. In particular, as the noise associated with L's signal decreases (γL decreases), L's profit decreases.
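The existence condition in Proposition 3 is easy to check numerically. The sketch below is our illustration only; the parameter values are hypothetical, and any price in the computed interval supports the equilibrium.

V, C = 1.0, 0.02
G_H, G_M, G_L = 0.8, 0.5, 0.2

# Search must be a feasible option for the consumer (assumption in section 12.4).
assert C <= V * (G_H - G_M) / 8

# Lower bounds: the consumer must prefer searching to buying outright,
# and the H-type must prefer pooling to revealing that it is not L at price V/2.
lower = max(((1 - G_H) * V + 2 * C) / (2 - G_H - G_L), V / (2 * G_H))
# Upper bounds: the consumer must prefer searching to walking away,
# and the M-type must prefer separating at V/2 to imitating the pool.
upper = min((G_H * V - 2 * C) / (G_H + G_L), V / (2 * G_M))

if lower < upper:
    p_star = (lower + upper) / 2            # any price in [lower, upper) works
    print(f"HL equilibrium price range: [{lower:.3f}, {upper:.3f})")
    print(f"profits at p* = {p_star:.3f}: "
          f"H = {G_H * p_star:.3f}, M = {V / 2:.3f}, L = {G_L * p_star:.3f}")
else:
    print("No HL equilibrium with consumer search for these parameters.")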
As we see from the discussion above, consumer search is the core mechanism that enables signaling in equilibrium. In fact, we can formally show that this equilibrium does not exist without consumer search:
Proposition 5. A semi-separating HL equilibrium without consumer search does not exist.
Proof. See appendix.
Without consumer search, the firm is constrained to charge a relatively low price due to the consumer's uncertainty about product quality. The maximum price that H and L can charge in equilibrium such that the consumer chooses not to search is strictly less than V/2 when the search cost is sufficiently low. Hence, H-type would prefer to deviate in order to signal that it is not type L, which of course destroys this potential equilibrium.
12.4.2 Other Equilibria and Uniqueness of HL
In the preceding section, we show that the H-type can signal its quality by extending an invitation to search (through non-attribute-focused advertising) to the consumer. Can there be other equilibria where M extends this invitation? As we show below, there indeed can be equilibria where M, as well as H, extends an invitation to search. In the HML full pooling equilibrium, all types engage in non-attribute-focused advertising and post the same price, and the consumer chooses to search in equilibrium. Note that while non-attribute-focused advertising is an invitation to search in HML, it is not a signal of higher quality. In contrast, in the HM semi-separating equilibrium, an attribute ad can serve as an invitation to search, while a non-attribute-focused ad reveals that the firm is L-type. We show that these other equilibria exist only if γM is high enough; the mediocre product is willing to extend an invitation to search only if it is fairly certain that the outcome of search will be positive. In other words, if γM is low or the mediocre product is not confident in the outcome of the search process, only the HL countersignaling equilibrium exists, as we show in Proposition 8. We first turn to the full pooling equilibrium, HML, where all types engage in the same type of advertising and post the same price (aθ = a*HML and pθ = p*HML, where θ ∈ {L, M, H}). Of course, since the L-type can only engage in non-attribute-focused advertising, a*HML = a0. The consumer, in turn, may either choose to purchase the product without search based on the prior information only or may search for extra
information. As we show in Proposition 2 below, given our assumption V (γ − γ ) that search is feasible for the consumer c ≤ H8 M V , there does not exist an HML equilibrium without consumer search, but there does exist a full pooling equilibrium where consumer searches if γM is high enough and the price is in the intermediate range.18 Proposition 6. 1. A full pooling equilibrium (HML) without consumer search does not exist. 2. A full pooling equilibrium HML where the consumer chooses to * search after observing ( a0 , pHML ) , exists if γM is high enough and the price is in the intermediate range. Here, Π*(H) > Π*(M) > Π*(L). Moreover, this equilibrium survives D1. Proof. See technical appendix. The first result is very similar to the result we obtain in Proposition 1. If the higher types pool with the lowest type, and no search occurs, the price that is charged in a potential equilibrium is too low to prevent a deviation. On the other hand, as we can see from the second result, the firm may be able to charge a high enough price such that the consumer would choose to search after observing non-attribute-focused advertising. The mediocre firm prefers this strategy only if it is fairly confident about the positive outcome of search—i.e., γM is high enough. In this equilibrium, search still allows the firm to charge a high price due to the decreased uncertainty. However, since γM is high, the possibility of search is not a credible threat to the M-type and, hence, M prefers to pool with H (and L) as opposed to revealing its quality as is the case in HL equilibrium. All types extend an invitation to search through non-attribute-focused advertising, but this invitation to search does not signal quality. The final remaining equilibrium is the semi-separating HM equilibrium. In this equilibrium, H- and M-types pool on attribute advertising * and price: aθ = aj, where θ ∈ {M, H}, pθ = pHM . The fact that the higher types engage in an attribute-focused communication allows them to separate themselves from the L-type: aL = a0.19 The consumer, of course, * can choose to search following (aj, pHM ) in order to further differentiate whether the firm is H-type or M-type. In this semi-separating equilibrium, and in contrast to HL, both H- and M-types choose to emphasize their strong attribute: the firm that has anything positive to say about its product chooses to do so. In this sense, this equilibrium is a very intuitive one.
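The mediocre type's willingness to extend an invitation to search boils down to the comparison discussed above: a certain payoff of V/2 from separating with an attribute-focused ad versus an expected payoff of γM times the pooling price from imitating the pool. The sketch below is ours; V and the pooling price p_pool are assumed purely for illustration.

V = 1.0
p_pool = 0.7   # assumed pooling price charged under non-attribute-focused ads

def m_prefers_to_separate(gamma_m):
    """M separates when its certain payoff V/2 beats the risky pooling payoff."""
    return V / 2 > gamma_m * p_pool

for gamma_m in (0.3, 0.5, 0.7, 0.9):
    choice = "separate (attribute ad)" if m_prefers_to_separate(gamma_m) else "pool (invite search)"
    print(f"gamma_M = {gamma_m:.1f}: M prefers to {choice}")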
Proposition 7. Suppose that search cost is low enough such that c ≤ V2 ρ(1 − ρ)(γ H − γ M ). 1. A semi-separating HM equilibrium without consumer search does not survive D1. 2. A semi-separating HM equilibrium, where the consumer chooses to * search after observing (aj, pHM ), exists if γM is sufficiently high and the price is in the higher range. Here, Π*(H) > Π*(M) > Π*(L). Moreover, this equilibrium survives D1. Proof. See technical appendix. The first result in Proposition 7 mirrors our earlier results. An equilibrium without consumer search does not survive D1 if cost of search is low enough. This is again due to the fact that without search, the firm is constrained to charge a relatively low price.20 On the other hand, HM equilibrium where the consumer chooses to search can exist. As was the case for HML, here M is willing to extend an invitation to the consumer to search. However, search follows an attribute-focused ad as opposed to a non-attribute-focused ad in HML. The consumer searches in order to differentiate between the H- and M-types. Note that while both of these types deliver relatively high value to the consumer, the pooling price in equilibrium must be high enough so that the consumer prefers to undertake the search in order to further resolve the uncertainty.21 Hence, the condition of low search cost, c ≤ V2 ρ(1 − ρ)(γ H − γ M ), ensures that the consumer searches for additional information in equilibrium. As we can see from the condition on the search cost above, here ρ plays an important role in the decision to search. Recall that ρ is the correlation between attributes, which in this equilibrium translates to a prior belief about the product’s type following aj since P(H|aj) = P(β = B|α = A) = ρ. The consumer chooses to search only if the search cost is low enough relative to the benefit that can be obtained through seeking additional information—i.e., resolving the uncertainty. Therefore, if ρ is either close to 1 or to 0, there is little remaining uncertainty on whether the firm is H-type or M-type following aj. This, in turn, implies that search would not arise in equilibrium unless the search cost is also close to zero. Hence, depending on the magnitude of the search cost c and the correlation ρ, HM equilibrium with search may or may not exist. In summary, for HML and HM to exist, γM (the probability that M receives a positive signal following search) must be high enough. By pooling with a higher type and charging a high price, as is the case for HM and HML, M-type loses control over the consumer’s final inference
since, in both of these equilibria, the consumer chooses to search for additional information. On the other hand, in the HL equilibrium, by revealing its type, M faces lower risk since the consumer has no uncertainty. This decrease in uncertainty, however, comes with a lower upside potential since in this case M cannot charge more than V/2. The amount of risk that M faces—summarized by γM, where a higher γM entails a lower risk—determines whether a pooling equilibrium with H and M exists. Finally, we show that when γH is large and γM is low, HL is the only equilibrium that survives the D1 refinement.
Proposition 8. Under D1, the HL equilibrium is unique when γH is sufficiently large and γM is sufficiently small.
Proof. See the appendix.
We now consider extensions of the basic model. In the basic model, advertising itself was costless; in the first extension, advertising is costly, and each type chooses how heavily to advertise, so that the reach of its campaign, Φθ, determines the probability that the consumer sees its ad, while the consumer observes the ad she receives but not the firm's total advertising expenditure. Proposition 9 (proved in the appendix) shows that the HL equilibrium with consumer search survives in this setting: all three types choose a strictly positive reach, the equilibrium reach is ordered Φ*H > Φ*M > Φ*L, and profits satisfy Π*(H) > Π*(M) > Π*(L).
Proposition 9 highlights several interesting results. First, both the H- and the L-types advertise in equilibrium. Second, the reach of the advertising campaign increases according to the firm type—the higher the type, the more it invests in advertising (Φ*H > Φ*M > Φ*L). This is consistent with the intuition in the money-burning literature where the higher type invests more in advertising. This of course implies that the consumer who observes (a0, p*HL) infers that the sender is more likely to be H-type than L-type, since Φ*H > Φ*L. In other words, the mere fact that the ad reached the consumer makes it more likely that the firm must be the H-type. However, since the L-type does advertise (Φ*L > 0), there is remaining uncertainty on whether the firm is H-type or L-type, which results in search if the price is in the appropriate range. Note that the residual uncertainty, which is present here, but is not present in money-burning models, is due to the partial observability of advertising reach. That is, advertising content can signal quality as long as the consumer does not perfectly observe advertising expenditure or the reach of the advertising campaign.
12.5.1 Consumer Heterogeneity and Asymmetric Attributes
Next, we relax two assumptions of our basic model: (1) the assumption on the homogeneity of consumer preferences; and (2) the assumption on the symmetry of product attributes. First, to introduce heterogeneity in consumer valuations, we assume that there are two types of consumers. The first segment (of size 1 − λ) derives only a basic utility (u = u0) from the product, and its utility function is not sensitive to the attributes that the product possesses. Since search is costly, the consumers in this segment never search for quality information. In contrast, the second segment (of size λ) is affected by the quality of the product and therefore may choose to search for additional information on product quality.
Second, to introduce asymmetry between the two attributes, we assume that the utility derived from the high quality level differs across the attributes. Hence, the consumer in the second segment derives a utility, u = u0 + φV ⋅ 1(α = A) + (1 − φ )V ⋅ 1(β = B), where the parameter ϕ captures the relative importance of attribute α. Without loss of generality, we assume that attribute α is relatively more important ( 12 ≤ φ ≤ 1). In this specification both segments derive a basic utility (u0) from the product. For example, all credit card owners are able to charge purchases on their card, and enjoy this convenience. In addition, customers from the second segment derive utility φV from a credit card with good customer service, utility (1 − φ )V from a card that has generous rewards, and utility of V from a card with both good customer service and generous rewards. Note that under this asymmetric specification, there are four possible types (θ) of products based on the quality levels of the attributes: θ ∈ {H, Mα, Mβ, L} = {{A, B}, {A, b}, {a, B}, {a, b}}. In contrast, under the symmetric case of our basic model, where the consumer values the two attributes equally, the type space essentially reduces to three types, θ ∈ {H, M, L}, since the consumer is indifferent between types Mα and Mβ. Hence, in the asymmetric case, we have to consider the incentives of all four types. In Proposition 10 we show that HL exists in the presence of customer heterogeneity and attribute asymmetry. In this equilibrium, H- and L* types pool on non-attribute-focused advertising and price (a0, pHL ), whereas the M-types engage in attribute-focused advertising, and, therefore, perfectly reveal their types to the consumer (aM = aj, pMα = u0 + φV , pMβ = u0 + (1 − φ )V ). Proposition 10. HL equilibrium with consumer search exists if: min
{(γH(u0 + V) + γLu0 − 2c)/(γH + γL), (u0 + (1 − φ)V)/γM} > max{((1 − γH)(u0 + V) + (1 − γL)u0 + 2c)/(2 − γH − γL), u0/(λγL), (u0 + φV)/γH}.
Here: Π*(H) = λγH p*HL > Π*(Mα) = λ(u0 + φV) > Π*(Mβ) = λ(u0 + (1 − φ)V) > Π*(L) = λγL p*HL.
Proof. See appendix.
Proposition 10 not only serves as a robustness check on the results of our basic model, but also suggests several important take-aways. In order to highlight these, consider the non-deviation conditions for L, M, and H, respectively. (Note that these conditions are contained in the condition given in Proposition 10.)
γL > u0/(λp*HL),   γM < (u0 + (1 − φ)V)/p*HL,   and   (u0 + φV)/p*HL < γH.
From the prior equation, we can see that in order for the L-type not to deviate from its equilibrium, the ratio u0/(λp*HL) must be low enough. That is, for the HL equilibrium to hold, it must be the case that a sizable segment of the population values the quality of the attributes (large λ), and the basic utility derived from the product is relatively small (small u0). Otherwise, the L-type would find it profitable to deviate and reveal its type perfectly by charging the lower price u0, which ensures that it can serve both segments and therefore earn a profit of u0. This condition highlights the point that HL holds only when the attribute levels, and, hence, the quality of the product, are relatively important. Second, the amount of attribute asymmetry also impacts whether the countersignaling equilibrium exists. Consider the scenario where attribute α is far more important than attribute β (i.e., high ϕ). This implies that Mα-type is perceived as very similar to H-type from the consumer's perspective, since only the quality of the α attribute matters to her. Therefore, the H-type would prefer to claim to be an Mα-type rather than take the risk of being confused with the L-type by pooling (i.e., H's non-deviation condition (u0 + φV)/p*HL < γH does not hold as ϕ → 1, since the consumer would not be willing to pay p*HL > u0 + φV). This suggests a boundary condition on the existence of the HL equilibrium. In our basic model, due to limited bandwidth, advertising content alone cannot distinguish between the H- and M-types. This of course is no longer the case when the attribute asymmetry is so great that essentially only one attribute matters. In this case, the H-type can convey its quality information through an attribute-focused ad. Therefore, the HL equilibrium does not exist when attribute asymmetry is sufficiently high.
12.5.2 n-Attribute Case (n > 2)
Here we consider the more general model where products are allowed to differ across a large number of attributes. That is, suppose that a product has n (> 2) attributes: {b1, b2, . . ., bn}, where bj ∈ {Q, q}, with a capital letter representing higher quality on that dimension. H-type is high quality on h attributes (n > h), M-type is high quality on m attributes, and L-type is high quality on only the basic l attributes (n > h > m > l). A priori, the consumer's expected utilities from H-, M-, and L-types are hV/n, mV/n, and lV/n, respectively. Because of limited bandwidth, the firm can only communicate up to k attributes in an advertising message. The crucial assumption is how large k (communication bandwidth) is relative to the number of
378
Dina Mayzlin and Jiwoong Shin
attributes along which M-type is high quality. That is, if k > m, then an attribute-focused ad that emphasizes m + 1 attributes would allow the H-type to perfectly separate itself from the M-type.22 On the other hand, if k < m, the ad alone cannot separate the M- and H-types, since the number of attributes at which M-type is high quality is strictly greater than the number of attributes which can be communicated in an ad. Hence, in the multi-attribute setting, the latter assumption represents the case of limited bandwidth. To link it back to our main model, we can think of α as the set of the k attributes that the firm emphasizes in an ad, and β as the remaining set of attributes. Not surprisingly, since limited bandwidth (as in our basic model) does not allow firms to signal their type through advertising alone, our main results continue to hold. Proposition 11. When k < m, HL equilibrium with consumer search exists if: 1. c ≤
( h − m )V ( γ H − γ M ) 4n
2. max
{
, and
h ( 1− γ H )V + 2 nc n( 2 − γ H − γ L )
,
mV nγ H
,
lV nγ L
* Π * ( H ) = γ H pHL > Π* ( M) =
} < min {
mV n 23
hγ H V − 2 nc n(γ H + γ L )
,
mV nγ M
} . Here:
* . > Π * (L) = γ L pHL
Proof. See the appendix. Proposition 11 demonstrates the robustness of our main results. Even in the case of multiple attributes, as long as there exists limited bandwidth in communication (k < m), the superior and the terrible types may choose not to emphasize product attribute information in their advertising in order to invite the consumer to search. Interestingly, we can also further endogenize the γ′’s (the probability of positive signal as the result of her own search) as a function of the number of attributes along which the product is high quality (h, m, and l). Suppose that a consumer talks to her friend about the product, and her friend focuses on a few attributes in her recommendation. Clearly, this would imply that the probability that the WOM about the product is positive would be a function of the number of high quality attributes that the product possesses. In this setting, we would expect γH to be increasing in h, γM to be increasing in m, and γL to be increasing in l (in the simplest specification, γ H = nh , γ M = mn , γ L = nl ). Hence, this specification would provide a micro-foundation for: (1) our assumption that γH > γM > γL since the probability of good news for each type is a function of the number of attributes on which it is high quality; and (2) our assumption that γH < 1 and γL > 0, since even the highest type is low
Uninformative Advertising as an Invitation to Search
379
quality on some attributes, and the lowest type is high quality on some attributes. 12.6
Conclusion and Limitations
We show that advertising content can be a credible quality signal under the realistic assumptions of limited bandwidth of communication and active consumers. The desire to signal one’s quality may result in the surprising phenomenon that the firm with the most to say may choose not to make any hard claims at all. This withholding strategy may be rational in that vague claims can be made by either the superior or the terrible products, which necessitates search for further information on the part of the consumer. Hence, vague claims serve as an invitation to search. Consumer search, that is determined endogenously, is crucial in enabling this type of equilibrium. While most of the previous literature has focused on the decision to advertise (the mere fact that the firm is willing to burn its money) as a signal of quality, we show that message content, coupled with consumer search, can also serve as a credible signal of quality. One surprising take-away of our model is that the high and the low quality firms may charge the same price and engage in non-attribute-focused advertising. The examples we used in the Introduction provide some anecdotal evidence that is consistent with our prediction—a “reality check” on our theory. Consider, for example, the digital cameras example. The two cameras that engaged in non-attribute-focused advertising, Canon PowerShot and Fujifilm FinePix, both priced at around $200, are indeed high and low types in their category according to the Consumer Reports 2010 rankings. A similar pattern emerges in the credit cards example, where the high type (American Express) and the low type (First Premiere Bank) both engage in non-attribute-focused advertising. The example of Windows Vista illustrates the evolution of the firm’s advertising strategy—from a non-attribute-focused strategy to an attribute-focused strategy later on. We can speculate that that reason for the change was due to new negative information that emerged about the Windows software. That is, as Microsoft became less confident about its software, it preferred to switch from a strategy that encouraged the consumer to search on her own to one where it provided the consumer with product information. In contrast, Tempur-Pedic, the maker of the “Ask me about my Tempur-Pedic” ad cited in our opening quote, is confident that a consumer who engages in her own search
380
Dina Mayzlin and Jiwoong Shin
will find out about the high quality of its mattresses. Hence, it literally invites consumers to search. There are a number of limitations to the current work. First, our model abstracts away from a number of phenomena that may be important in practice. We investigate a monopoly setting, where the firm can be one of three types. While it is technically challenging to extend the signaling mechanism to a competitive setting, it would be interesting to investigate how the current insights can be extended to competitive situation in the future. We also abstract away from the possibility that firms can employ several different advertising campaigns. Allowing multiple campaigns relates to the important marketing issue of targeting and segmentation. As the firms gain the ability to personalize their ads to the viewers (through technologies such as search advertising), they can target each segment through emphasizing those attributes that it particularly values. In this case, the truly excellent firm may not need to invite consumers to search. It may instead persuade the consumer to buy its product by emphasizing the attributes that are tailored to her segment. Also, there are several simplifying assumptions in the model that can be relaxed in future work. First, we impose the truth-telling assumption: we assume that a firm cannot claim to be strong on an attribute on which it is in fact weak. In reality, firms may be able to exaggerate their claims, while staying within the bounds of government regulation. Whether and when the firm would choose to tell the truth in advertising in the presence of imperfect government monitoring is an interesting question for future research. Second, we assume that the cost of attribute-focused and non-attribute-focused advertising is the same. Third, we also assume that the consumer faces a discrete search decision—she can either to choose to search or not to search. Realistically, the decision is continuous—the amount of search will be affected by the uncertainty faced by the consumer along with the product price. We leave these for future research. Finally, there can be other alternative explanations for the existence and effectiveness of non-attribute-focused advertising (in particular, image advertising). For instance, image advertising may be used to increase consumer trust. We do not wish to claim that our explanation is the only possible theory for this phenomenon. Nevertheless, we offer a novel explanation for non-attribute-focused advertising as an invitation to search. A laboratory-based experiment would allow us to test our theory against competing explanations. For example, an
Uninformative Advertising as an Invitation to Search
381
experiment would allow us to test whether consumers make inferences along the lines that we suggest. That is, we can test whether consumers are more likely to search for additional information following a nonattribute-focused ad. In addition, we can investigate how changes in price affect consumer search behavior. Appendix
Proof of Lemma 1
The consumer will search if and only if: EU (search ) = Pr( s|μ )[E(V|μ , s ) − p] − c ≥ EU (no search ) = max(0 , E(V|μ ) − p). Therefore: 1. If E(V|μ) − p ≥ 0, then EU(search) ≥ EU(no search) if: Pr( s|μ )[E(V|μ , s ) − p] − c ≥ E(V|μ ) − p ⇔ Pr( s|μ )E(V|μ , s ) − Pr( s|μ )p − c ≥ Pr( s|μ )E(V|μ , s ) + Pr( s|μ )E(V|μ , s ) − p ⇔ c ≤ Pr( s|μ )[ p − E(V|μ , s )] ≡ g
(A1)
2. If E(V|μ) − p < 0, then EU(search) ≥ EU(no search) if: Pr( s|μ )[E(V|μ , s ) − p] − c ≥ 0 ⇔ c ≤ Pr( s|μ )[E(V|μ , s ) − p] ≡ f
(A2)
Next, we show that f = g at p = E(V|μ) f − g = Pr( s|μ )[E(V|μ , s ) − p] − Pr( s )[ p − E(V|μ , s )] = Pr( s|μ )E(V|μ , s ) − Pr( s|μ )p − Pr( s|μ )p + Pr( s|μ )E(V|μ , s ) = Pr( s|μ )E(V|μ , s ) + Pr( s|μ )E(V|μ , s ) − p = E(V|μ ) − p = 0
(A3)
This completes the proof. Q.E.D. D1 Refinement We apply D1 (Fudenberg and Tirole 1991) to eliminate unreasonable out of equilibrium beliefs. Following Fudenberg and Tirole (1991, p. 452), we define Π*(θ) to be the equilibrium profit of type θ. We also define the set of mixed strategy best responses of the consumer, α2 (α2 = {α21, α22, α23} = {Pr (purchase without search), Pr (no purchase), Pr (search)}) to a deviation by the firm, A1 = (a, p), such that type θ strictly prefers A1 to the equilibrium strategy: D(θ, A1) = {α2 ∈ MBR(μ(A1), A1) s. t. Π*(θ) < Π(A1, α2, θ)| μH(A1) (A4) + μM(A1) + μL(A
382
Dina Mayzlin and Jiwoong Shin
Note that the consumer’s best response depends on her belief, μ(A1) = (μH(A1), μM(A1), μL(A1)). Similarly, we define a set of consumer’s best responses such that the firm is indifferent between deviating and playing the equilibrium strategy. D0 (θ, A1) = {α2 ∈ MBR(μ(A1), A1) s. t. Π*(θ) = Π(A1, α2, θ)| μH(A1) + μM(A1) + μL(A1) = 1}
(A5)
The criterion D1 puts zero probability on type θ if there exists another type θ′ such that: D(θ, A1) ∪ D0(θ, A1) ⊂ D(θ′, A1).
(A6)
Using Lemma 1, below we derive the set of consumer’s mixed best responses, MBR(μ(A1), A1): 1. If E(V|μ(A1)) − p > 0, (a) Consumer will search: α2 = {0,0,1}, if c < Pr (s|μ(A1))[p − E(V|μ(A1), s)] (b) Consumer will purchase without search: α2 = {1,0,0}, if c > Pr (s|μ(A1))[p − E(V|μ(A1), s)] (c) Consumer mixes between search and purchase without search: α2 = {α21, 0,1 − α21}, if c = Pr (s|μ(A1))[p − E(V|μ(A1), s)] 2. If E(V|μ(A1)) − p < 0, (a) Consumer will search: α2 = {0,0,1}, if c < Pr( s|μ( A1 ))[E(V|μ( A1 ), s ) − p] (b) Consumer will not purchase: α2 = {0,1,0}, if c > Pr( s|μ( A1 ))[E(V|μ( A1 ), s ) − p] (c) Consumer mixes between search and no purchase: α2 = {0, α22, 1 − α22}, if c = Pr( s|μ( A1 )) ⋅ [E(V|μ( A1 ), s ) − p] 3. If E(V|μ(A1)) − p = 0 and c = Pr (s|μ(A1))[E(V|μ(A1)) − E(V|μ(A1), s)], consumer chooses either α2 = {0, α22, 1 − α22} or α2 = {α21, 0,1 − 〈21}. Note that α2 = {α21, 1 − α21, 0} ∉ MBR(μ(A1), A1) since we assume that if the consumer is indifferent between purchasing the product and no purchase, she chooses to purchase it. Bounds on prices and beliefs for consumer search Next, using the results above, we derive explicit bounds on prices and beliefs such that the consumer searches as a best response to A1. Lemma 2. 1. Consider the case where the firm engages in attribute-focused adverj j tising, A1 = (aj, p) and the consumer’s belief is μ j = (0 , μ M , μ H ) . There
Uninformative Advertising as an Invitation to Search
383
exists a consumer belief under which search is a best response for the V (γ − γ ) consumer if c ≤ H8 M and p ∈[ p j , p j ], where V 2 ( γ − γ )2 − 2Vc ( γ − γ ) H M H M 4
p j = 34 V −
2( γ H − γ M )
V 2 ( γ − γ )2 − 2Vc ( γ − γ ) H M H M 4
≥ V2 , p j = 34 V +
2( γ H − γ M )
≤ V.
j Moreover, for a given μ H , consumer chooses to search if:
p j (μ Hj ) =
j
j
μ H ( 1− γ H )V + ( 1− μ H )( 1− γ M ) V2 + c j j μ H ( 1− γ H ) + ( 1− μ H )( 1− γ M )
≤p≤
j
j
μ H γ H V + ( 1− μ H )γ M V2 − c j j μ H γ H + ( 1 − μ H )γ M
= p j (μ Hj ) .
2. Consider the case where the firm engages in uninformative advertis0 , μ H0 ), where ing, A1 = (a0, p) and the consumer’s belief is μ 0 = (μ L0 , μ M L = μL0 ≤ μ
1 2
(1 +
)
1 − V (γ H4 c−γ L ) .
There exists a consumer belief (μ0) under which search is a best response for the consumer if c ≤ V (γ H4−γ L ) and p ∈[ p 0 , p 0 ], where p 0 ≡ min p 0 (μ L0 ) ≤ p j , p 0 ≡ max p 0 (μ L0 ) ≥ p j . 0 ≤ μ 0 ≤ μˆ L 0 ≤ μ 0 ≤ μˆ L L
L
Moreover, for a given μ , consumer chooses to search if: 0
p 0 (μ 0 ) =
0 ( 1− γ )V + μ 0 ( 1− γ ) V + c μH H M 2 M 0 ( 1− γ ) + μ 0 ( 1− γ ) + μ 0 ( 1− γ ) μH H M L M L
≤p≤
0 γ V + μ0 γ V − c μH H M M 2 0 γ + μ0 γ + μ0γ μH H M M L L
= p 0 ( μ 0 ).
Proof. See the online Technical Appendix (http://faculty.som.yale.edu/ JiwoongShin/Downloads/articles/UninformativeAdvertising_ V (γ H − γ M ) V (γ − γ ) < H4 L Technical_Appendix.pdf). We can easily show that 8 V (γ − γ ) since γL < γH and γL < γM. This of course implies that if c ≤ H8 M , there exists a belief under which the consumer chooses to search after observing aj and a0. HL Equilibrium Proof of Proposition 3 We show that HL equilibrium with consumer search exists if c
γ M pHL and γ H pHL .
We first turn to the consumer’s problem. As we can see from Lemma 1, in order for the consumer to search in equilibrium, it must be the V (γ − γ ) case that c < H8 M and pHL ∈[ p 0 (μ L ), p 0 (μL )] ,
384
Dina Mayzlin and Jiwoong Shin
where p 0 (μL ) =
0 γ V + μ0 γ V μH H M M 2 −c 0 γ + μ0 γ + μ0γ μH H M M L L
and p 0 (μ L ) =
0 ( 1− γ )V + μ 0 ( 1− γ ) V + c μH H M 2 M 0 ( 1− γ ) + μ 0 ( 1− γ ) + μ 0 ( 1− γ ) μH H M L M L
.
In addition, on the equilibrium path, the probabilities that ρthe firm ρis * ) are 12 and 12 (i.e., ρ 2 ρ and ρ 2 ρ H-type and L-type following (a0, pHL 2
+2
2
+2
for H- and L-types, respectively). Hence, p ( ) = > V and V (γ H − γ M ) ( 1− γ H )V + 2 c 0 1 V . Hence, in order for the consumer p ( 2 ) = 2 −γ H −γ L < 2 since c < 8 to search in equilibrium, the price must be in the appropriate range: 0
( 1− γ )V + 2 c * pHL ∈ ⎡⎣ 2 −γHH −γ L ,
γ HV − 2c γ H +γ L
1 2
γ HV − 2c γ H +γ L
1 2
⎤. ⎦
Next, we need to ensure that all types prefer their equilibrium strategy to an optimal deviation. To show existence, and as we discuss in the body of the paper, we impose the following out-of-equilibrium belief: * μL = 1 for all (a0, p ≠ pHL ) and μH = 0 for all ( a j , p ≠ V2 ) . Given the assumed out-of-equilibrium beliefs, the non-deviation conditions for H-type and M-type reduce to the following: * * > Max A1 Π( A1|q = H ) = Π * ( a0 , pHL |q = H ) = γ H pHL
V 2
V * Π ( a j , pM|q = M ) = > Max A1 Π( A1|q = M ) = γ M pHL 2
(A7)
*
Finally, the L-type by definition cannot deviate on advertising. A deviation on price only yields a maximum profit of 0 under the off* equilibrium beliefs. Hence, Π * ( a0 , pHL|q = L) = γ L pHL > 0 , which is trivially satisfied. Therefore, as long as the condition
{
( 1− γ
)V + 2 c
}
max 2 −γHH −γ L , 2γVH < min which satisfies ( 1− γ H )V + 2 c 2 −γ H −γ L
* ≤ pHL
V2 > γ M pHL
Proof of Proposition 4 We examine the restrictions on the out-of-equilibrium beliefs that are * < 2γVM . We will return to imposed by D1. First, we assume that pHL this assumption below and confirm that it is indeed the case in equilibrium.
Uninformative Advertising as an Invitation to Search
385
* < 2γVM . D1 imposes the following constraints Lemma 3. Suppose that pHL on out-of-equilibrium beliefs: ( γ H − γ M ) p j + V2 ( 1− γ H ) . If the consumer observes A1 = (aj, pdev), 1. Let pˆ ≡ γ H ( 1− γ M ) * (a) when V2 < p dev < min(γ H pHL , p j ), μH(A1) = 0, * , when p j ≤ p dev ≤ p j = min ( pˆ , p j ), μH(A1) = 0, (b) if pˆ ≤ pHL
(
)
* , when max p j , pˆ < p dev ≤ p j , μH(A1) = 1. (c) if pˆ ≤ pHL
2. If the consumer observes the deviation A1 = (a0, pdev), * * (a) when γ L pHL < p dev < min(γ H pHL , p 0 ) , μH(A1) = 0, * * * (b) when γ M pHL ) < p dev < min( pHL , p 0 ), < V2 , and max( p 0 , γ L pHL μL(A1) = 1, * * (c) when γ M pHL < p dev < p 0 , μM(A1) = 0. < V2 , and pHL Proof. Let us first define the sets for θ = {L, M, H}: D0 ( H , A1 ) ∪ D( H , A1 ) = X H ∪ YH = * p dev − pHL ⎧ ⎫ ⎨(0 , α 22 , 1 − α 22 )|α 22 ≤ ⎬ dev p ⎩ ⎭ * γ p p dev ) ⎫ − ( ⎧ H HL ∪ ⎨(α 21 , 0 , 1 − α 21 )|α 21 ≥ ⎬ (1 − γ H )p dev ⎭ ⎩
D0 ( M , A1 ) ∪ D( M , A1 ) = X M ∪ YM = V⎫ ⎧ γ M p dev − ⎪ ⎪⎪ 2⎪ ⎨(0 , α 22 , 1 − α 22 )|α 22 ≤ ⎬ dev γ Mp ⎪ ⎪ ⎪⎩ ⎪⎭ V ⎫ ⎧ − γ M p dev ⎪ ⎪⎪ ⎪ 2 ∪ ⎨(α 21 , 0 , 1 − α 21 )|α 21 ≥ dev ⎬ 1 p γ ( − ) M ⎪ ⎪ ⎪⎭ ⎩⎪ L ∪ Y L = D0 (L, A1 ) ∪ D(L, A1 ) = X * p dev − pHL ⎧ ⎫ ⎨(0 , α 22 , 1 − α 22 )|α 22 ≤ ⎬ dev p ⎩ ⎭ * γ p p dev ) ⎫ − ( ⎧ L HL ∪ ⎨(α 21 , 0 , 1 − α 21 )|α 21 ≥ ⎬ (1 − γ L )p dev ⎭ ⎩
1. Consider a deviation to a price such that the consumer chooses not to purchase at any off-equilibrium belief: A1 = (aj, pdev) where p dev > p j or A1 = (a0, pdev) where p dev > p 0 ; i.e., α22 = 1. Here, D1 does not apply.
386
Dina Mayzlin and Jiwoong Shin
2. Next, consider a deviation to a price such that the consumer chooses to purchase without search at any off-equilibrium belief: A1 = (aj, pdev) where pdev < pj or A1 = (a0, pdev) where pdev < p0; i.e., α21 = 1. Therefore, D1 imposes that μH(A1) = 0 if A1 = (aj, pdev), for all dev * V < min(γ H pHL , p j ). Similarly, if A1 = (a0, pdev), for all 2 ≤ p dev * * γ L pHL ≤ p < min(γ H pHL , p 0 ) , μH(A1) = 0. j j 3. Consider A1 = (aj, pdev), and p ≤ p dev ≤ p . First, we assume that ( γ H − γ M ) p j + V2 ( 1− γ H )
* ≤ pHL ⇔ pj ≤
γ H ( 1− γ M )
* − V ( 1− γ ) γ H ( 1− γ M ) pHL H 2
γ H −γ M
.
If p dev
γ M p dev − V2 ( 1− γ M ) p dev
,
which implies that Y H ⊂ YM . Also, we can see that * − V ( 1− γ ) γ H ( 1− γ M ) pHL H 2
γ H −γ M
* < as long as pHL
p dev
pHL ⇔ pj >
* − V ( 1− γ ) γ H ( 1− γ M ) pHL H 2
γ H −γ M
Then, there exists an interval such that * − V ( 1− γ ) γ H ( 1− γ M ) pHL H 2
γ H −γ M
* ≤ p dev < min( p j , pHL ).
.
Uninformative Advertising as an Invitation to Search
387
Using the same argument as in (a) above, we can show that here Y M ⊂ YH , and X H = X M = ∅ . Hence, as long as
(
max p j ,
* − V ( 1− γ ) γ H ( 1− γ M ) pHL H 2
γ H −γ M
)< p
dev
* < min( p j , pHL ),
D1 constrains the belief to be μH = 1 following A1. Next, consider * p dev ≥ pHL . We can see that when * pHL
Max A1 Π( A1|q = Mα ) * = γ M pHL
Π * ( a j , pMβ |q = Mβ ) = u0 + (1 − φ )V > Max A1 Π( A1|q = Mβ ) * = γ M pHL * Π * ( a0 , pL|q = L) = λγ H pHL > Max A1 Π( A1|q = L) = u0
(A11)
(A12)
(A13) (A14)
* Note that since φ > 12 , u0 + (1 − φ )V > γ M pHL implies that * u0 + φV > γ M pHL . The rest of the proof is the same as before in Proposition 3. Q.E.D.
Proof of Proposition 11 Lemma [TA 4] in the online Technical Appendix (http://faculty. som.yale.edu/JiwoongShin/Downloads/articles/Uninformative
392
Dina Mayzlin and Jiwoong Shin
Advertising_Technical_Appendix.pdf)shows that when c ≤ the consumer decides to search if p 0 ≤ p ≤ p 0 , where p 0 (μL ) =
0 ( 1− γ ) hV + μ 0 ( 1− γ ) mV + c μH M n H n M 0 ( 1− γ ) + μ 0 ( 1− γ ) + μ 0 ( 1− γ ) γH H M L M L
, p 0 (μL ) =
0 γ hV + μ 0 γ mV μH H n M M n −c 0 γ + μ0 γ + μ0γ μH H M M L L
( h − m )V ( γ H − γ M ) 4n
.
In addition, on the equilibrium path, the probabilities that the firm ρ ρ * 1 1 2 p is H-type and L-type following (a0, HL ) are 2 and 2 (i.e., ρ ρ and ρ 2 ρ + + 2 2 2 2 for H- and L-type, respectively). Hence, p 0 ( 12 ) =
γ H hV − 2c n γ H +γ L
and p 0 ( 12 ) =
( 1− γ H ) hV + 2c n 2 −γ H −γ L
.
Thus, in order for the consumer to search in equilibrium, the price must be in the appropriate range: * pHL ∈⎡ ⎣⎢
+ 2c ( 1− γ H ) hV n 2 −γ H −γ L
,
γ H hV − 2c n γ H +γ L
⎤. ⎦⎥
Next, we need to ensure that all types prefer their equilibrium strategy to an optimal deviation. To show existence, we impose the follow* ) and μH = 0 for ing out-of-equilibrium belief: μL = 1 for all (a0, p ≠ pHL V all ( a j , p ≠ 2 ). Given the assumed out-of-equilibrium beliefs, the nondeviation conditions for H-type and M-type reduce to the following: * * Π * ( a0 , pHL |q = H ) = γ H pHL > Max A1 Π( A1|q = H )
mV n mV * > Max A1 Π( A1|q = M ) = γ M pHL Π * ( a j , pM|q = M ) = n =
(A15)
The L-type by definition cannot deviate on advertising. A deviation on price only yields a maximum profit of 0 under the off-equilibrium * beliefs. Hence, Π * ( a0 , pHL|q = L) = γ L pHL > 0 , which is trivially satisfied. The rest of the proof is the same as before in Proposition 3. Q.E.D. Notes Reprinted by permission, Dina Mayzlin and Jiwoong Shin, “Uninformative Advertising as an Invitation to Search,” Marketing Science, 30 (4), 2010. Copyright 2010, the Institute for Operations Research and the Management Sciences (INFORMS), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228 USA. We thank Kyle Bagwell, Dirk Bergemann, Dave Godes, Rick Harbaugh, Yogesh Joshi, Dmitri Kuksov, David Miller, Martin Peitz, K. Sudhir, Duncan Simester, Birger Wernerfelt, and seminar participants at QME conference, SICS conference at Berkeley, JDCL
Uninformative Advertising as an Invitation to Search
393
Festschrift conference, NEMC conference at MIT, Third Conference on the Economics of Advertising and Marketing, Yale Economics Theory Lunch, marketing seminars at Columbia University, Erasmus University, Korea University, Northwestern University, Syracuse University, Tilburg University, University of Maryland, University of Texas at Austin, and Yale University for their helpful comments. 1. Previous advertising literature uses the term “uninformative” advertising to designate advertising that has no direct information on product attributes (Milgrom and Roberts 1986, Bagwell and Ramey 1994, and Bagwell 2007). However, here the term “uninformative” is slightly confusing since in our model “uninformative” ads do convey information about quality to the consumer in equilibrium. Hence, we mostly use the term “nonattribute-focused” advertising to avoid confusion. 2. In practice, the discrete classification of all ads into “attribute-focused” or “nonattribute-focused” is difficult. For example, some non-attribute-focused ads provide basic information about the product while some attribute-focused ads focus on irrelevant or non-differentiating attributes. Hence, the simple classification of “attribute-focused” or “uninformative” is an approximation of what is really a continuum. 3. To view these ads, see http://www.youtube.com/watch?v=RN8El63N6Mo, http:// www.youtube.com/watch?v=6iuyqApJmNc. 4. Our quality information is based on the September 2010 Consumer Reports ranking of digital cameras (www.consumerreports.org). 5. Tempur-Pedic is a manufacturer of mattresses and pillows made from Tempur material. In 2009, the company earned exclusive rights to make, use and sell mattresses made of Tempur material. 6. There are also other perspectives on advertising such as the persuasive and the complementary view of advertising (see Bagwell 2007 for a comprehensive survey of the literature). 7. One notable exception is Anand and Shachar (2009) who investigate the signaling role of advertising for search goods. They suggest that the targeting of the ads can also serve as a signal on the horizontal attributes of the product while our model concerns a quality signal in a vertical setting. 8. Sun (2009) also investigates the monopolistic seller’s incentive to disclose the horizontal matching attributes of a product. Kuksov (2007) studies the incentives of consumers to reveal or conceal information about themselves to others through brand choices in the consumer-matching context. Yoganarasimhan (2009) finds that firms sometimes prefer to conceal information to increase the social value of its product. 9. A similar phenomenon is known in the sociology literature as the middle-status conformity theory: the high- and low-status players may deviate from conventional behavior, while the middle-status players conform to social norms (Phillips and Zuckerman 2001). These models, however, do not involve a signaling story. 10. Lauga (2010) incorporates a behavioral insight to propose a novel role of advertising as affecting the distribution of prior beliefs that consumers have about product quality. In this setting, she shows that advertising does not necessarily signal the quality of the product. 11. Note that if ρ = 1 (perfect positive correlation), only {A,B} and {a,b} products exist, and if ρ = 2/3, all products are equally likely.
394
Dina Mayzlin and Jiwoong Shin
12. The Federal Trade Commission requires that “advertising be truthful and nondeceptive” and that all claims must have a “reasonable basis” (http://www.ftc.gov/ bcp/conline/pubs/buspubs/ad-faqs.shtm). Whether in reality FTC can perfectly enforce truth telling is an interesting question but is beyond the scope of this paper.). 13. Advertisers often advertise irrelevant or basic attributes. For example, any credit card can be used as a payment method and offers convenience to the consumer. Since all types can deliver these basic attributes, this type of ads is uninformative to the consumer. Therefore, we regard such an ad as non-attribute-focused ad. 14. In reality, consumers may search for more than one signal (in particular, this may be more likely when the price is high). However, we abstract away from this and assume that the consumer can choose to obtain only one additional signal at cost c to capture our main intuition in the simplest possible way. 15. This low search cost assumption guarantees that there always exists a consumer belief under which search is the best response for the consumer if she observes either a0 or aj. See Lemma 2 in the appendix for more details. 16. An equilibrium result in signaling models requires that the single-crossing property be imposed across types. In existing models, a single-crossing property is exogenously imposed through differential costs or demand (Schmalensee 1978, Milgrom and Roberts 1986, Bagwell and Ramey 1994). Our model departs from these models in that the singlecrossing property here arises from the consumer’s endogenous search. If we allow for costs to differ across firm types (see Schmalensee 1978) and for firms to be able to convey this cost information to the consumer, our main results unravel since the consumer can perfectly infer the quality of the product from the cost information. However, in practice credibly conveying cost information in an advertising message is very difficult since cost information is not easily observable or verifiable. In fact, conveying cost information is equivalent to conveying attribute information: the higher quality (or higher cost), the more high quality attributes the product possesses. Given limited bandwidth in our setting, fully conveying cost or attribute information is not possible. * * ) 17. We find that if pHL is low enough, then there exists a deviation A1 = ( a j , p dev > pHL such that D1 imposes μH(A1) = 1. This of course would destroy HL. Hence, in order to * rule this out, we need the additional constraint that pHL ≥ for more details.
( γ H −γ M ) p J + V2 ( 1−γ H ) γ H ( 1−γ M )
. See the appendix
18. Please see the Online Technical Appendix (http://faculty.som.yale.edu/JiwoongShin/ Downloads/articles/UninformativeAdvertising_Technical_Appendix.pdf)for the exact statement of the existence conditions, such as the conditions on: (1) γM; and (2) the price bounds. 19. Since L’s advertising perfectly reveals its type, L makes zero profit in equilibrium. Again, this zero profit result is driven by our simplifying assumption that u0 = 0. The results would not change qualitatively if we allow that L-type to yield small but non-zero utility to the consumer. 20. We show that there exists a deviation A1 = (aj, pdev > pHM*), such that following this deviation the consumer believes (based on D1) that the firm must be type H. Therefore, under this refined belief, the consumer chooses to purchase without search. This, in turn, destroys the HM equilibrium without search since both types prefer to deviate to (aj, pdev) rather than play the equilibrium strategy.
Uninformative Advertising as an Invitation to Search
395
21. While the exact expressions for the conditions on the search cost as well as the price bounds differ across HM and HML, since they pool on different actions, the basic conditions remain the same: (1) the cost search is small compared to γH—γM; (2) the price is in a certain range that ensures that the consumer searches; and (3) γM is high enough. However, the lower bound for the price range of HM equilibrium is higher than that of HML equilibrium. Otherwise, the consumer would not engage in costly search in HML. 22. For example, MacBook Air emphasizes its thinness—a single attribute—since this attribute requires a high level of innovation that only the high type can achieve. Hence, the high type would be able to distinguish itself from the medium type by emphasizing a single attribute, and thus the bandwidth constraint would not bind. We thank an anonymous reviewer for pointing this issue to us and suggesting this example. 23. Our main result is a special case of this general result when n = 2, h = 2, and m = 1.
References Abernethy, Avery M., and Daniel D. Butler. 1992. “Advertising Information: Services versus Products.” Journal of Retailing 68 (4):398–419. Anand, Bharat N., and Ron Shachar. 2007. “(Noisy) Communication.” Quantitative Marketing and Economics 5:211–237. Anand, Bharat N., and Ron Shachar. 2009. “Targeted Advertising as a Signal.” Quantitative Marketing and Economics 7:237–266. Anderson, Simon P., and Regis Renault. 2006. “Advertising Content.” American Economic Review 96 (1):93–113. Araujo, Aloisio, Daniel Gottlieb, and Humberto Moreira. 2008. “A Model of Mixed Signals with Applications to Countersignaling and the GEC.” Rand Journal of Economics. Bagwell, Kyle. 2007. “The Economic Analysis of Advertising.” In Handbook of Industrial Organization. vol. 3., 1701–1844. Amsterdam: North-Holland. Bagwell, Kyle, and Gary Ramey. 1994. “Advertising and Coordination.” Review of Economic Studies 61 (1):153–171. Bhardwaj, Pradeep, Yuxin Chen, and David Godes. 2008. “Buyer-Initiated versus SellerInitiated Information Revelation.” Management Science 54 (6):1104–1114. Butters, Gerard R. 1977. “Equilibrium Distributions of Sales and Advertising Prices.” Review of Economic Studies 44 (3):465–491. Carpenter, Gregory, Rashi Glazer, and Kent Nakamoto. 1994. “Meaningful Brands from Meaningless Differentiation: The Dependence on Irrelevant Attributes.” Journal of Marketing Research 31:339–350. Chen, Yubo, and Jinhong Xie. 2008. “Online Consumer Review: Word-of-Mouth as A New Element of Marketing Communication Mix.” Management Science 54 (3):477–491. Chevalier, Judith, and Dina Mayzlin. 2006. “The Effect of Word of Mouth on Sales: Online Book Reviews.” Journal of Marketing Research 43 (3):345–354.
396
Dina Mayzlin and Jiwoong Shin
Cho, In-Koo, and David M. Kreps. 1987. “Signaling Games and Stable Equilibria.” Quarterly Journal of Economics 102 (May):179–221. Desai, Preyas, and Kannan Srinivasan. 1995. “Demand Signalling under Unobservable Effort in franchising: Linear and Non-linear Contracts.” Management Science 41 (10):1608–1623. Feltovich, Nick, Richmond Harbaugh, and Ted To. 2002. “Too Cool for School? Signalling and Countersignalling.” Rand Journal of Economics 33 (4):630–649. Fudenberg, Drew, and Jean Tirole. 1991. Game Theory. Cambridge, MA: MIT Press. Godes, David, and Dina Mayzlin. 2004. “Using Online Conversations to Study Word-ofMouth Communication.” Marketing Science 23 (4):545–560. Grossman, Gene M., and Carl Shapiro. 1984. “Informative Advertising with Differentiated Products.” Review of Economic Studies 51 (1):63–81. Harbaugh, Rick, and Theodore To. 2008. “False Modesty: When Disclosing Good News Looks Bad,” working paper. Hertzendorf, Mark N. 1993. “I’m not a high-quality firm—But I play one on TV.” Rand Journal of Economics 24 (2):236–247. Holbrook, Morris B., and John O’Shaughnessy. 1984. “The Role of Emotion in Advertising.” Psychology and Marketing 1 (2):45–64. Hvide, Hans K. 2003. “Education and the Allocation of Talent.” Journal of Labor Economics 21 (4):945–976. Kardes, Frank R. 2005. “The Psychology of Advertising,” Persuasion:Psychological Insights and Perspectives, Timothy C. Brock, and Melanie Colette Green, ed. Thousand Oaks, CA: Sage Publications, 281–303. Kuksov, Dmitri. 2007. “Brand Value in Social Interaction.” Management Science 53 (10):1634–1644. Kihlstrom, Richard E., and Michael H. Riordan. 1984. Advertising as a Signal. Journal of Political Economy 92 (3): 427–450. Lauga, Dominique O. 2010. “Persuasive Advertising with Sophisticated but Impressionable Consumers,” working paper. McAfee, R. 1994. Preston, and Marius Schwartz. Opportunism in Multilateral Vertical Contracting: Nondiscrimination, Exclusivity, and Uniformity. American Economic Review 84 (1): 210–230. Mayzlin, Dina. 2006. “Promotional Chat on the Internet.” Marketing Science 25 (2):155–163. Meurer, Michael, and Dale O. Stahl, II. 1994. “Informative Advertising and Product Match.” International Journal of Industrial Organization 12 (1):1–19. Milgrom, Paul. 1981. “Good News and Bad News: Representation Theorems and Applications.” Bell Journal of Economics 12:380–391. Milgrom, Paul, and John Roberts. 1986. “Price and Advertising Signals of Product Quality.” Journal of Political Economy 94 (August):796–821.
Uninformative Advertising as an Invitation to Search
397
Moorthy, Sridhar, and Kannan Srinivasan. 1995. “Signaling Quality with a Money-Back Guarantee: The Role of Transaction Costs.” Marketing Science 14 (4):442–466. Nelson, Phillip. 1974. “Advertising as Information.” Journal of Political Economy 78:311–329. Phillips, Damn J., and Ezra W. Zuckerman. 2001. “Middle-Status Conformity: Theoretical Restatement and Empirical Demonstration in Two Markets.” American Journal of Sociology 107 (2):379–429. Scott, Linda M. 1994. “Images in Advertising: The Need for a Theory of Visual Rhetoric.” Journal of Consumer Research 21 (3):461–490. Shapiro, Jesse M. 2006. “A Memory-Jamming Theory of Advertising,” University of Chicago, working paper. Schmalensee, Richard. 1978. “A Model of Advertising and Product Quality.” Journal of Political Economy 86 (3):485–503. Shin, Jiwoong. 2005. “The Role of Selling Costs in Signaling Price Image.” Journal of Marketing Research 32 (August):302–312. Simester, Duncan. 1995. “Signaling Price Image Using Advertised Prices.” Marketing Science 14 (Summer):166–188. Sun, Monic. 2009. “Disclosing Multiple Product Attributes.” Journal of Economics & Management Strategy. Teoh, Siew H., and Chuan Y. Hwang. 1991. “Nondisclosure and Adverse Disclosure as Signals of Firm Value.” Review of Financial Studies 4 (2):283–313. Tirole, Jean. 1986. “Hierarchies and Bureaucracies: On the Role of Collusion in Organizations.” Journal of Law Economics and Organization 2 (2):181–214. Tirole, Jean. 1988. The Theory of Industrial Organization. Cambridge, MA: MIT Press. Wernerfelt, Birger. 1988. “Umbrella Branding as a Signal of New Product Quality: An Example of Signaling by Posting a Bond.” Rand Journal of Economics 19 (3):458–466. Wernerfelt, Birger. 1994. “On the function of sales assistance.” Marketing Science 13 (Winter):68–82. Yoganarasimhan, Hema. 2009. “Cloak or Flaunt? The Firm’s Fashion Dilemma,” working paper.
13
Alleviating the Constant Stochastic Variance Assumption in Decision Research: Theory, Measurement, and Experimental Test Linda Court Salisbury and Fred M. Feinberg
13.1
Introduction
Research in marketing focuses, by and large, on theories of behavior expressed as concrete hypotheses. Verifying such theories overwhelmingly involves experimental manipulations or examining field data for differences in mean, expected, or average responses, even at the individual level. By contrast, stochasticity—whether it is conceptualized as noise, uncertainty, response variability, or in some other way—is typically treated (if considered at all) as residing in the denominator: something to minimize to make mean effects stand out, un-modeled “error” in some ancillary statistical model. Such methods can presume, often tacitly, that differences in behavioral outcomes (e.g., observed choices) are driven primarily by (differences in) what is predictable and deterministic, not what is unobserved and stochastic. Recent research, however, has cautioned researchers to be mindful that different conditions can entail differences in response variability as well (Louviere 2001; Louviere et al. 2002). Employing analysis methods—notably, choice models—that do not allow for (or estimate) differences in response variation can introduce potential misspecifications, ones with important substantive implications. One excellent example, widely studied by behavioral researchers, involves temporal effects on behavior. Most consumer decisions are not only made over time, but also involve outcomes—products, services, investments, etc.—experienced or consumed at some point in the future. That is, consumers often make sequences of choices, each intended to anticipate that consumer’s preferences at some future consumption time. Given the ubiquity of this scenario, a great deal of research in both marketing and behavioral decision theory has attempted to develop, test, and measure key aspects of theories of how
400
Linda Court Salisbury and Fred M. Feinberg
choices are made both dynamically and prospectively. For example, it has been established that consumers making multiple choices tend to opt for greater variety when all items are chosen now for future consumption, rather than later at the time of consumption (Read and Loewenstein 1995, Simonson 1990, Walsh 1995); that decision-makers are more likely to prefer hedonic indulgences over cash of equal or greater value when there is a time delay between the decision and its outcome (Kivetz and Simonson 2002); and that people are more likely to choose a “virtue” rather than a “vice” for future consumption, whereas the opposite holds for immediate consumption (Read, Loewenstein, and Kalyanaraman 1999). Understanding temporal aspects of choice requires the analyst to account for uncertainty in verifying any posited effects, such as decision avoidance (Anderson 2003, Dhar 1997), preference for more immediate, smaller gains (Keren and Roelofsma 1995), and diversification (Simonson 1990), among many others. As noted at the outset, such effects are typically studied as a component of the deterministic, as opposed to the stochastic, portion of utility. It is customary in the psychological and statistical literatures to treat “error” as if it were amorphous, a bydefinition nonexplicable construct that lacks meaningful patterns or structure. However, in recent years, researchers have challenged this view of “error,” including its terminology, referring instead to “unobserved variability.” This is consistent with a critical distinction in hierarchical models, where the higher-level specifications distinguish “observed” and “unobserved” heterogeneity. In this article, although we hew to the simpler and more standard term “error,” a core concept is that put forward by Louviere et al. (2002), who sought to “dissect” the random or “unobserved” component of utility, and who suggested numerous dimensions across which the variance of this unobserved utility could vary. Here, we focus on verifying and measuring that for a specific source, involving choices made prospectively versus at the time of consumption. Beyond helping to understand behavioral drivers, this serves to free random utility models from misidentification based on assuming utility variance equal across brands and temporal conditions. Such misspecifications can potentially alter whether a given behavioral hypothesis is supported or refuted by the data, and it is this we seek to examine. Specifically, we examine whether substantive effects can arise not only from deterministic shifts in brand evaluation, but from swells in unobserved stochastic variation of the sort disallowed by commonly
Theory, Measurement, and Experimental Test
401
used discrete choice specifications and other regression-based methods (Adamowicz et al. 2008). That is, certain theoretical conclusions may be at least in part statistical artifacts, if one does not apply equal care to specifying the deterministic and stochastic nature of utility and preference. Although this issue may appear at first blush of mainly technical interest, it goes to the heart of what drives differences in observed choice patterns: do they primarily result from deterministic influences (perhaps suggesting biased decision processes, inattention, or misunderstanding), from increased environmental uncertainty, or from some amalgam? Specifically, when modeling choice, is there a compelling reason to allow for a flexible, perhaps even parametric, account of the stochastic term? Prior literature in econometrics has amply addressed various sorts of non-constant error variance. For example, the chief motivation behind the GARCH class of models (Engle 1982) is to account for heteroscedasticity, and such models are cornerstones in finance and econometrics in general. Swait and Louviere (1993) argue persuasively that, when modeling multiple “sources” of choice data, estimation of substantively important quantities is confounded with error variance, and that error “scale ratio” parameters should be estimated as well. In their landmark treatment of the econometric modeling of choice data, Louviere, Hensher, and Swait (2000) devote many chapters to various stratagems for alleviating the i.i.d. error assumption, among them error scaling, mixed logit, and general probit models. A key application of error scaling of the type we will explore here combines sources of “revealed” and “stated” preference data (e.g., Hensher, Louviere, and Swait 1999) by setting various parameters equal across data sets and allowing error variances to differ. Even in stated preference experiments, systematic differences in stochastic variation have been found to accompany effects of rank order and fatigue (Bradley and Daly 1994). Our goal is to investigate the effects of presuming a constant value for unobserved stochastic variation across items and over time. To examine this issue, we take a multitiered approach, involving derivation, model formulation, an experiment, parameter estimation, and simulation. First, we demonstrate that firm directional predictions can be made about which items’ choice probabilities will, and will not, be affected by “stochastic inflation.” This allows the development of a formal model of the effects of variation in the unobserved stochastic component of utility; the model generates testable predictions about
402
Linda Court Salisbury and Fred M. Feinberg
which items should display shifts in choice probability (namely, one’s most- and least-favored items), and affords the use of discrete-choicebased methods to analyze participants’ choice patterns. We then show how the total error common in the random utility framework can be decomposed into that for the analyst and that for the consumer, both of whom must “predict” experienced utility; and, moreover, that a firm lower bound on the consumer’s (unobservable) “temporal stochastic inflation factor” can in fact be extracted from quantities measurable via discrete choice methods. Next, we present an experiment designed to verify whether specific choice behavior patterns can stem from increased stochastic variation, and to explore the marginal explanatory power of both the posited, temporally induced effects and several others suggested by prior literature and random utility theory. This allows a sequence of econometric models to test for and rule out a number of alternative effects, while providing strong support for not restricting stochastic variance to be constant across temporal conditions (as well as more modest support for an analogous statement across brands/items). Finally, we demonstrate via simulation that one could draw erroneous conclusions about purported (deterministic) effects when using traditional statistical methods that may be misspecified. 13.2
Time Delay, Uncertainty and Choice Probability Distribution
Throughout our discussion, we will compare “future” and “immediate” decision modes. The former will connote making choices now for future consumption; the latter will mean that choices are made at the time of consumption. In our experiment and the tests that follow, we will further specify, and compare an example of, these two types of decision mode. It is well-known that decision-makers have difficulty estimating future utility (Kahneman and Snell 1992, Kahneman, Wakker, and Sarin 1997), and, whether choosing for future or immediate consumption, decision-makers experience uncertainty to some degree. Preference uncertainty arises when one cannot predict perfectly which of the available choice alternatives will maximize experienced utility. It can be influenced by any of a number of contextual variables, such as the number of available alternatives (Iyengar and Lepper 2000) or their similarity (Dhar 1997). We focus on the effect of time delay between choosing and consuming, specifically in its leading to greater preference uncertainty (Simonson 1990). Note that this phenomenon is distinct from the
Theory, Measurement, and Experimental Test
403
primary focus of the literature on intertemporal choice (e.g., choosing between two alternatives whose outcomes are experienced at different times). One need not presume that the expected utility of available alternatives is affected by choice mode in order to account for between-mode differences. Suppose, for example, that one is choosing between two alternatives, A and B, and alternative B has greater mean utility than alternative A (VB > VA).1 However, if variance is greater when choosing for future rather than immediate consumption, the probability that the experienced utility of A will exceed the experienced utility of B is also greater; consequently, decision-makers would be more likely to choose alternative A in the future than in the immediate consumption mode. This assertion is easily verified in a random utility framework, in a manner to be generalized shortly. Let us introduce a relative error variance scaling parameter, σ ≥ 1 (Swait and Louviere 1993): UAt = VAt + σεAt,
(1)
where, without loss of generality, the scaling factor is normalized to σ = 1 for immediate consumption (t = 1), so σ >1 for choices made for future consumption (t > 1). The probability of choosing alternative A (with the t subscript dropped for simplicity) is: PA = Pr (VA + σε A > VB + σε B ) = Pr ⎛ ε B − ε A < ⎝
VA − VB ⎞ σ ⎠
(2)
We can consider the difference of the errors, εB − εA, a random variable in its own right, and examine equation (2) as a function of σ.2 Because the mean anticipated utility of alternative B is greater than that for A, (VA − VB) is negative, and so (VA − VB) / σ is increasing in σ. Thus, the probability of choosing the alternative with the smaller mean anticipated utility (alternative A) monotonically increases (toward 1/2) in σ, while that of the other alternative (B) decreases. Of course, real decisions typically involve more than two potential options. We show in the appendix that, in a general random utility setting, one can make clear directional predictions about which items’ probabilities change with the error scaling factor, σ, when there are multiple options from which to choose. Specifically, we prove three facts. When degree of stochasticity (σ) increases: (1) the most-favored (highest vj) item is less likely to be chosen; (2) the least-favored (smallest vj) item is more likely to be chosen; and (3) systematic predictions
404
Linda Court Salisbury and Fred M. Feinberg
cannot be made about other items. It is remarkable that the same is never true for the most- and least-favored alternatives, regardless of the joint error density, and so theory-driven consequences of increasing uncertainty will concern only individual-specific “favorites” and “least favorites.” In summary, we expect that, ceteris paribus, choices made for future consumption will entail a greater degree of stochastic variation than those made for immediate consumption. Note that this is not the case for specific times in the future; for example, there is more uncertainty about a typical American’s dinner preference for turkey tonight versus for next Thanksgiving Day. We emphasize that random utility (RU) theory makes testable directional predictions—when choices are made for future consumption, the probability of choosing the most-favored item decreases and the probability of choosing the least-favored item increases, relative to choices for immediate consumption—without requiring systematic changes to the deterministic component of utility, if consumption time delay entails relatively greater stochastic variation. We next examine conditions under which statements can be made regarding the consumer’s (unobservable) temporal inflation factor, based on measurements stemming from discrete choice models, which also include error for predictions made by the analyst. 13.3
Consumer, Analyst, and Total Error
Throughout, we appeal to discrete choice models based on the standard random utility framework, which underlie the vast majority of empirical work in marketing and behavioral research; unless needed for clarity, we suppress subscripts for option (brand), time, and consumer. The analyst can only imperfectly estimate the consumer’s assessment of his/her utility, and so includes error process εT; this is the (total) error typically specified in RU models, where all error is presumed to stem from analysts’ inability to perfectly model consumer utility at the time of choice. As suggested explicitly by Louviere (2001, p. 507, equation 4), this total error can be decomposed into subcomponents “that reference within-subjects, between-subjects, between-contexts, between-measurement instruments, between-time periods, and so forth components of response variability.” Here, because consumption is not necessarily immediate—consumers need to anticipate their experienced utility at the time of (future) consumption (Kahneman, Wakker, and Sarin 1997)—we decompose εT into subcomponents εA and
Theory, Measurement, and Experimental Test
405
εC to represent analyst error and consumer error, respectively (we will later account parametrically for differences in response variability between-brands and between-temporal conditions). These error processes have some joint distribution (ε A , ε C ) , about which we make no assumptions. Total error, reflected in the measured system and all observations, is given by ε T = ε A + ε C , and so we may write: U = V + εT εT = ε A + εC
(3)
σ = σ + σ + 2Cov (ε A , ε C ) 2 T
2 A
2 C
Measuring the ratio in error variances between the Future (F) and Immediate (I) choice conditions is only possible for the total error, σ F2 , T / σ I2, T . However, we are concerned throughout with the ratio of the consumer’s error variances, σ F2 , C / σ I2, C , and so proceed as follows. Given that analyst error is unobservable, we invoke the standard model identification assumption that σ I2, A = σ F2 , A (this will be relaxed later). For simplicity of derivation, let a = σ I , A = σ F , A , r = σ F , T / σ I , T , and ρ = Corr (ε A , ε C ), in which case we can write:
σ F2 , T σ I2, T
=
r2
=
⇒
σ F2 , A + σ F2 , C + 2ρσ F , Aσ F , C σ I2, A + σ I2, C + 2ρσ I , Aσ I , C a 2 + σ F2 , C + 2ρ aσ F , C a 2 + σ I2, C + 2ρ aσ I , C
=
a 2 (1 − ρ 2 ) + (σ F , C + ρ a ) a2 (1 − ρ 2 ) + (σ I , C + ρ a )
2
(4)
2
To establish a relationship between the error variance ratio for the consumer, σ F , C / σ I ,C , and that in total, r , we note that removing the same positive quantity from the numerator and denominator of a ratio greater than one will increase its value; thus: r2
=
⇒ r
0 . Such values for ρ are highly plausible, and it is in fact difficult to imagine forces that would influence consumer error and analyst error in different directions; common unobservables (e.g., shocks) would lead to positive values for ρ . Still more
406
Linda Court Salisbury and Fred M. Feinberg
general, and precise, results are possible: if we do not presume that σ I , A = σ F , A , differentiation and algebra show the desired boundedness σF, A σI , A result holds when ρ > − 12 ⎡⎣ σ F , C + σ I , C ⎤⎦ , which again is always negative, but not necessarily less than –1. In terms of estimation using observed choices and covariates, we therefore know that the measured value of σ F , T / σ I , T provides a lower bound for the (unobservable) temporal stochastic inflation factor, σ F , C / σ I ,C , so long as the correlation in analyst and consumer errors is non-negative (and in fact for any value of σF, A σI , A ρ > − 12 ⎡⎣ σ F , C + σ I , C ⎤⎦). We thus concern ourselves throughout with measuring the total error ratio, σ F , T / σ I , T , and claim it as a lower bound for σ F , C / σ I ,C . We next present a statistical model for choices made among different items and times. Thus far, we have been as general as practicable regarding error distributions. They did not, for example, need to be independent, drawn from the same family, or even have the same mean. In our empirical applications, however, it is important to ensure that manipulations of error variance do not also distort other values in the model. We must take care to ensure that scaling of errors alters only variances, not means, thus allowing for cross-model comparison and interpretation. 13.4
Empirical Model
Consider an individual who is choosing a single item from a set of alternatives. The (conditional) multinomial logit (MNL) model has been used extensively in marketing and decision theory to model individual choice behavior (McFadden 1974), and MNL assumes the degree of uncertainty about individuals’ anticipated utility is the same for all alternatives and time periods. This particular assumption limits our ability to model variations in preference uncertainty in observed choice patterns. For example, a consumer choosing from among a set of snack brands has varying degrees of uncertainty about how much she will enjoy each snack, irrespective of her best guess (i.e., mean anticipated utility). She feels relatively certain about how much she’ll enjoy the snacks she has the most experience eating, and less so about the other snacks, especially snacks she’s never seen before, irrespective of whether she believes she will like them or not. The analyst using MNL to model her choice behavior would be presuming equal degrees of predictability about each of these alternatives, ignoring that uncertainty varies in a meaningful, estimable way across them. Disregarding
Theory, Measurement, and Experimental Test
407
differences in brand-specific variation, when they are present, would not only be an a priori misspecification; it could induce artifacts into the stochastic variance effects whose verification and accurate measurement are our main interest. Imposing equal degrees of stochastic variation on each brand also entails the independence of irrelevant alternatives (IIA) property; Bhat’s (1995) HEV model and Allenby and Rossi’s (1999) diagonal-covariance probit, both explored in the sequel, overcome this well-known problem without a huge penalty in parsimony (for further discussion, see also Baltas and Doyle 2001). Thus, although our focus throughout is on temporally induced variance effects, all models will accommodate brand-specific variation parameters as well, unless they are shown to be statistically non-significant. One approach that allows for different degrees of stochastic variation involves the Heteroscedastic Extreme Value model (HEV; Allenby and Ginter 1995, Bhat 1995). For each alternative, j, in the HEV model, the {εijt}, for individual i at time t, follow a type 1 extreme value distribution with scaling parameter θj. The error terms, {εijt}, are independent but not identically distributed, with variances equal to π2θj2/6. The variance scaling parameter, θj, is directly proportional to the standard deviation of the stochastic component of utility, εijt. HEV nests MNL, with θj = 1 for all j. The multinomial probit (MNP) model is yet more general, allowing MVN errors with arbitrary (estimable) error covariances. To address the problem at hand requires that we incorporate an additional component into the model framework: uncertainty associated with time duration between choice and consumption. Specifically, we must allow individuals’ uncertainty about anticipated utility to vary according to whether choices are made for immediate versus future consumption, so that degree of stochastic variation can vary across choice conditions, as well as by brand/item. To this end, an additional variance scaling parameter, σc (where c represents consumption delay, future or immediate) can be included in the random utility formulation: Uijt = Vijt + θjσcξijt,
(6)
where ξijt is a univariate error draw. The specification represented by equation (6) recognizes and allows estimation (using standard tools) of the variation in the stochastic component of utility associated with two distinct sources: that specific to the brand (θj), and that associated with
the time between choice and consumption (σc). The specific multiplicative format we use, θjσcξijt, is consistent with an additive-in-logs specification typical in modeling variance effects. Although other specifications are possible, we have not systematically explored them here (notably, a linear-additive formulation would require a number of estimation constraints to avoid negativity, and would not naturally allow the baseline variance, brand-specific multipliers, and temporal stochastic inflation parameter to interact as they do in equation (6)). Louviere and Eagle (2006) provide a full discussion of the role and specification of such scale constants. Deterministic components, Vijt, are expressed as linear additive functions of brand-specific characteristics, Xijk, that account for the utility of brand j for individual i. The full model is, therefore:

Uijt = Σ_{k=1}^{K} βk Xijk + θj σc ξijt    (7)
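As an illustration of how a specification like equation (7) can be simulated, the short sketch below (ours, not part of the original study) draws probit-style normal errors, scales them by brand-specific factors θj and a temporal factor σc, and records how often the brand with the highest deterministic utility is chosen; all parameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 6 brands, 2 covariates; every value below is illustrative only.
J, K = 6, 2
beta = np.array([0.4, -0.2])                       # coefficients on brand characteristics X_ijk
theta = np.array([0.8, 1.3, 0.3, 1.1, 0.5, 1.0])   # brand-specific error scaling (theta_j)
sigma_c = {"IMM": 1.0, "FUT": 2.4}                 # temporal scaling; immediate fixed to 1

def simulate_choice(X, condition):
    """One choice: U_j = sum_k beta_k X_jk + theta_j * sigma_c * xi_j, then pick the argmax."""
    xi = rng.standard_normal(J)                    # normal (probit-style) draws; Gumbel would give HEV/MNL
    U = X @ beta + theta * sigma_c[condition] * xi
    return np.argmax(U)

X = rng.standard_normal((J, K))                    # brand characteristics for one person/occasion
favorite = np.argmax(X @ beta)                     # brand with highest deterministic utility
imm = [simulate_choice(X, "IMM") for _ in range(1000)]
fut = [simulate_choice(X, "FUT") for _ in range(1000)]
print("favorite's share, immediate:", np.mean(np.array(imm) == favorite))
print("favorite's share, future:   ", np.mean(np.array(fut) == favorite))
```

Consistent with the appendix result discussed later, inflating the error scale shrinks the favorite brand's simulated choice share.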
Equation (7) enables us to rigorously test for many possible effects over and above others: the relative magnitude of uncertainty across available options (θj); uncertainty stemming from consumption delay (σc); and, via the error correlations possible with MNP, whether the stochastic portions of brand utilities can be treated as independent. We discuss the incorporation of consumer preference heterogeneity below and in our empirical application. As discussed previously, we expect the temporal scaling factor, σc, to be greater when choosing for future time periods. With the convention that FUT refers to all choices made for consumption in future time periods, and IMM refers to choices made for consumption in the immediate time period, we posit:

H1: σFUT/σIMM > 1
As discussed by Louviere (2001), our analysis will consider the same deterministic specification across temporal conditions and ascertain whether equal variances or unequal ones explain empirical data patterns better. This analysis will lead to various estimates for the critical quantity, σFUT /σIMM, which as discussed earlier provides a lower bound for the consumer’s stochastic inflation ratio. We examine the effects of consumption time delay with a controlled laboratory experiment that varies the time between choice and consumption. In all discrete choice models, various elements must be fixed to set location and scale (described fully in the sequel; for further
discussion of relative error scaling, model specification, and identification for this and related classes of models, see Swait and Louviere 1993 and Louviere et al. 2000). Specifically, the scale of the error is confounded with the magnitude of the deterministic component. Importantly, we alter the ratio of stochastic components only across time conditions (FUT versus IMM); model identification is set identically across time conditions otherwise. We test for all effects by comparing nested models using standard likelihood ratio tests, subject to three well-known identification constraints (see Bhat 1995): (1) one of the brand-specific constants (βk) must be fixed, ordinarily to zero; (2) one of the brand-specific scaling parameters (θj) is set (arbitrarily) to one; and (3) one of the temporal scaling parameters is set to one (i.e., σIMM = 1). Therefore, the key test of H1 concerns whether σFUT > 1. We will also test for a number of other effects, as suggested by prior literature on discrete choice and time delay. The two key issues can be stated simply: Is there evidence of temporal variance inflation (σc) over and above other effects? And is there evidence of other effects over and above temporal variance inflation?

13.5 Testing for Non-Constant Stochastic Variation
Experimental research on time delay is abundant, and it is not our goal to recap it here, or to elucidate the drivers of all consequent behaviors. Rather, we wish to replicate aspects of a carefully conducted study in a manner that isolates possible temporal stochastic inflation effects, without identifying their underlying causes, and ruling out certain confounding factors. Perhaps the best-known among such studies is Simonson’s (1990), which found that people choosing multiple items from an assortment tend to choose a greater variety of items when all items are chosen “simultaneously” before (future) consumption versus when each item is chosen “sequentially” for immediate consumption. Similar differences in observed variety have been replicated by other researchers as well (e.g., Read et al. 2001, Read and Loewenstein 1995). Numerous explanations have been posited for the underlying psychological mechanism, although Loewenstein (2001) suggests none yet offers a thorough account. Among the most widely cited explanations is Simonson’s (1990) contention that the difference in variety chosen is due to greater uncertainty of future preferences for participants in the simultaneous condition. We build on this work using a methodology that allows for the possibility that uncertainty differences may arise
because time delay can inflate the (unobserved) stochastic variation in utility. The forthcoming experiment offers a platform to detect and measure temporal inflation in (relative) stochastic variance.

13.6 Experiment
We conducted an experiment to assess the effect of consumption time delay on stochastic variation. The experimental design replicates essential features of Simonson (1990; experiment 2) and of Read and Loewenstein (1995; experiment 1), with one key difference: because preference heterogeneity is critical in accounting for choice, we collected measures of individual participants' preferences before they made any choices. This allowed us to directly test the effect of time delay on the probability of choosing participant-specific most- and least-favored alternatives, the quantities about which the random utility formulation makes sharp predictions.

13.6.1 Method

Design. A between-subjects design was used to test the effect of temporal distance on stochastic variation in utilities and choice probabilities. Participants chose three snacks from among a set of six. Snacks were selected to accord with those used by Simonson (1990) and Read and Loewenstein (1995). They included: Austin cheese crackers, Doritos tortilla chips, Hershey's chocolate bar with almonds, Oreo cookies, Planters peanuts, and Snickers bar. One hundred three undergraduate students participated to earn credit in an introductory business course. In addition, participants who completed all four parts of the study were given a $3 completion bonus.

Procedure. The experimental procedure comprised four consecutive sessions, each conducted one week apart. Four groups of twenty-four to twenty-eight participants took part in the experiment. Each participant group was run as an independent set of four sessions, so that sixteen sessions were conducted in total. Participants within a condition reported on the same day-of-week and time-of-day for each session of the experiment. In session one, participants' preferences for the available snack alternatives were measured by asking participants to rate how much they liked each snack using an 11-point Likert scale ("1" = dislike very much, "11" = like very much). Choices were made during sessions two, three, and four (hereafter referred to as choice weeks one, two, and three, respectively).
In choice week one, participants read the following instructions describing the task to be completed. Only the “simultaneous” participants saw the words in parentheses, while only the “sequential” participants saw the words in square brackets. Today, (the session a week from today, and the session two weeks from today), we will be giving away to students snacks of their choice. [Each student can choose any one of the items from the selection of snacks available on the table.] (Each choice can be any one of the items from the selection of snacks displayed on the table, and you may select the same snack more than once.) There is a sufficient supply of all snacks. Participants in the sequential condition chose one snack to eat, wrote the name of their chosen snack on the same page as the instructions, and ate the snack immediately. All available snacks were displayed on a table, and participants walked to the table to pick up the snack they chose. Participants in the simultaneous condition chose one snack to eat immediately, a second snack to eat one week later (in choice week two), and a third snack to eat two weeks later (in choice week three). They wrote the name of each snack chosen on three separate sheets of paper, with each page designating a specific choice week in the experiment (i.e., choices were assigned to specific time periods). After making the three choices, participants ate their first chosen snack. Note that requiring simultaneous participants to pre-assign each choice to a specific consumption time period eliminated any opportunity to use variety as a “hedge” against changing preferences (Kreps 1979, Walsh 1995). In choice week two, participants in the sequential condition performed the same task as in choice week one. All available snacks were again displayed and participants picked up their chosen snack from the table. Participants in the simultaneous condition were given the sheet of paper from choice week one on which they had written the snack they chose to eat for that day. They selected the snack from the table and ate it. In choice week three, all participants performed the same task as in choice week two. Choice Model Specification. To assess the effects of consumption time delay on participants’ uncertainty about anticipated utility, we adopt a model that accords closely with those used in prior literature that did not allow for differences in stochastic variance across brands or time periods (Simonson, 1990). Latent utility, Uijt, for individual i, brand j and time t, is specified as:
Uijt = βRATE RATEij + βj BRANDj + θj σc ξijt,    (8)
where βRATE represents the effect of prior rating on utility, RATEij is individual i’s prior rating for snack j, βj represents the brand-specific constant for snack j, {BRANDj}are 0–1 brand dummy variables, and the three key stochastic constructs—brand scaling (θj), temporal scaling (σc) and error correlations—are as described previously. Given that simultaneous participants made their first choice for consumption in week one, and the remaining choices for future weeks, σFUT corresponds to all but the first choice in the simultaneous condition. The temporal scaling factor, σFUT, is therefore akin to an “average” inflation factor across time periods two and three. This could be easily generalized to multiple future inflation factors, albeit at the expense of parsimony. Thus, the model specification is: U ijt = β RATE RATEij + βCHEESE CHEESE + βDORITOS DORITOS + β HERSHEY HERSHEY + βOREO OREO + β PEANUT PEANUT + θ jσ c ξijt (9) where βCHEESE and CHEESE represent the brand-specific constant and 0–1 brand dummy variable, respectively, for Austin cheese crackers, and likewise for the remaining snacks. Note that, although equation (9) comprises all available brands, brand dummies are included for all brands but one, for identification purposes; Snickers was arbitrarily selected to have its brand dummy set to zero. All models were estimated by iteratively combining routines in commercially available software packages (LIMDEP, MATLAB, and Stata), and checked by reprogramming the likelihood calculations using Gauss-Laguerre quadrature. Preliminary Bayesian analysis supported the presumed asymptotic normality of parameters’ marginal densities, on which all our classical likelihood ratio tests will be based, as well as accounting for observed heterogeneity (via RATE) in place of an additional random coefficients specification for unobserved heterogeneity. That is, the inclusion of RATE allows for so-called “observed heterogeneity”; “unobserved” (i.e., parametric) heterogeneity cannot be reliably accommodated, due to the small number of (correlated) observations per subject, as cautioned by Andrews, Ainslie, and Currim (2008).3 The utility-based model in equation (8) allows us to evaluate marginal explanatory power: whether a particular effect is supported over and above others. The large number of distinct deterministic and stochastic effects our formulation affords can lead to a combinatorial
explosion in possible models. Table 13.1 summarizes the range of relevant comparisons, and tells a clear story regarding patterns in the choice data. First, however, a practical matter concerns whether the more tractable logit-based (MNL or HEV) model or the probit (MNP) model is more appropriate. We estimated both logit-based and probit-based models and found identical patterns of results. Because in every case the probit specification offered better fit, we formally present the probit model results exclusively, but report informally that effect strengths were nearly identical under the logit specification (except those with error correlations, which the standard logit does not support).

13.6.2 Results

Before presenting our focal results, we first confirm that those of prior research are replicated. As expected, simultaneous participants chose a significantly greater number of unique items than participants choosing sequentially (MSIM = 2.42, MSEQ = 1.82, p < 0.001). This result confirms previous research (Read and Loewenstein 1995, Simonson 1990) showing that simultaneous choice leads to a relatively greater number of observed unique items than sequential choice. Next, we measure the degree of relative temporal stochastic variance effects, and in so doing assess H1. For simplicity of nomenclature in presenting all forthcoming results, the term "uncertainty" refers to the degree of stochastic variation.

Temporal Differences in Uncertainty. H1 suggested that stochastic variation in choices made for future consumption would be significantly greater than in those made for immediate consumption (σFUT > 1). Using the convention that "B" denotes brand scaling and "T" temporal scaling, comparing model "MNP-BT" to (the nested) model MNP-B (i.e., imposing the restriction that σFUT = σIMM = 1) indicated a strongly significant difference (χdiff2(1) = 11.86, p < 0.001; see table 13.1).4 Even when allowing for brand-specific stochastic variation, the temporal scaling parameter estimate was σFUT = 2.37, so that relative stochastic variation is approximately 137% greater when choosing for future (versus immediate) consumption. Recall that the ratio of total error variances is a lower bound for the ratio of consumer error variances, and thus 2.37 is a conservative estimate of the consumers' temporal inflation factor. We report parenthetically that this finding was fairly robust to error specification, as evidenced by testing H1 assuming a logit model (χdiff2(1) = 9.60, p < 0.001; σFUT = 2.33).
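The nested comparisons reported here and in table 13.1 can be reproduced with a few lines of code. The sketch below (ours, not the authors') uses SciPy and follows the convention described in note 4, under which the reported χdiff2 value is the difference in log-likelihoods while the actual test statistic is twice that difference; the log-likelihood values are those listed for MNP-B and MNP-BT in table 13.1.

```python
from scipy.stats import chi2

def lr_test(ll_restricted, ll_general, df_diff):
    """Likelihood-ratio test; the chapter reports Delta-LL, but the chi-square statistic is 2*Delta-LL."""
    delta_ll = ll_general - ll_restricted
    p_value = chi2.sf(2.0 * delta_ll, df_diff)
    return delta_ll, p_value

# MNP-B (LL = -402.43, 11 parameters) versus MNP-BT (LL = -390.57, 12 parameters)
delta_ll, p = lr_test(-402.43, -390.57, df_diff=1)
print(f"chi2_diff(1) = {delta_ll:.2f}, p = {p:.2e}")   # approximately 11.86, p < 0.001
```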
Table 13.1 Empirical Model Comparisons
(Columns: Df; LL; σFUT; VS; then Likelihood Ratio Tests: p-value versus the base models MNP, MNP-B, MNP-T, and MNP-BT.)

| Model | Df | LL | σFUT | VS | vs. MNP | vs. MNP-B | vs. MNP-T | vs. MNP-BT |

Base Models
| MNP: No Brand or Temporal Scaling | 6 | −408.25 | — | — | — | — | — | — |
| MNP-B: Brand, but no Temporal Scaling | 11 | −402.43 | — | — | 0.040 | — | — | — |
| MNP-T: Temporal, but no Brand Scaling | 7 | −397.46 | 2.11 | — | 0.000 | — | — | — |
| MNP-BT: Brand and Temporal Scaling | 12 | −390.57 | 2.37 | — | 0.000 | 0.000 | 0.017 | — |
| MNP-BF: Brand and Free Temporal Scaling | 17 | −385.93 | 2.53a | — | 0.000 | 0.000 | 0.011 | 0.100 |

With Correlated Error Terms
| MNP + EquiCorrelation | 7 | −408.20 | — | — | | | | |
| MNP-B + EquiCorrelation | 12 | −401.55 | — | — | | | | |
| MNP-BT + EquiCorrelation | 13 | −390.25 | 2.44 | — | 0.000 | 0.000 | 0.025 | 0.425 |
| MNP + Free Correlation | 16 | −402.10 | — | — | | | | |
| MNP-B + Free Correlation | 21 | −400.29 | — | — | | | | |
| MNP-BT + Free Correlation | 22 | −388.56 | 2.70 | — | 0.001 | 0.004 | 0.274 | 0.947 |

Allowing Different Preference Weightings
| MNP + RATE*SIM Interaction | 7 | −406.39 | — | — | 0.053 | — | — | — |
| MNP-B + RATE*SIM Interaction | 12 | −399.96 | — | — | 0.011 | 0.027 | — | — |
| MNP-T + RATE*SIM Interaction | 8 | −396.84 | 2.33 | — | 0.000 | — | 0.268 | — |
| MNP-BT + RATE*SIM Interaction | 13 | −390.07 | 2.56 | — | 0.000 | 0.000 | 0.022 | 0.318 |

Variety-Seeking / Inertia Models
| MNP + Variety-Seeking | 7 | −407.25 | — | −0.043 | 0.158 | — | — | — |
| MNP-B + Variety-Seeking | 12 | −402.18 | — | −0.032 | 0.059 | 0.487 | — | — |
|   SIM onlyb | 12 | −202.19 | — | −0.001 | | | | |
|   SEQ only | 12 | −177.70 | — | −0.125 | | | | |
|   (χ2, Δ(VS) = 2.184, p = 0.139) | | | | | | | | |
| MNP-T + Variety-Seeking | 8 | −397.01 | 2.10 | −0.031 | 0.000 | — | 0.344 | — |
| MNP-BT + Variety-Seeking | 13 | −390.38 | 2.38 | −0.021 | 0.000 | 0.000 | 0.028 | 0.538 |

a. The harmonic mean of σFUT,j is listed for "Free Temporal Scaling."
b. Significance test for SIM VS: p = 0.995; for SEQ VS: p = 0.060.
This evidence supports H1 taken on its own. Moreover, and even more compelling, allowing different degrees of stochastic variation (σc) for future and for immediate consumption was supported in every context, no matter which other effects were present, and this is among the main findings of our analysis. A key issue concerns the degree of temporally induced inflation. For all models that include it, the estimated value of σFUT is remarkably consistent. As per table 13.1, it falls between 2.10 and 2.70, and, even when included in models already allowing for brand-specific scaling, is significant at p < 0.001, suggesting that temporal inflation is a strong, robust effect over and above any others included in the model. Brand-Specific Differences in Uncertainty. Though not a main focus of our investigation, it is reasonable to ask whether allowing stochastic variation to differ across brands just adds needless complexity. We find that imposing the restriction that θj takes a common value (across brands, j) leads to a worse fit. Likelihood ratio tests, comparing model MNP-BT to (the nested) model MNP-T, (χdiff2(5) = 6.89, p < 0.02) support that variation does differ across brands, over and above any betweencondition temporal scaling. Further, allowing brands to have differing degrees of stochastic variation was always supported, over and above any other effects in the model (ps < 0.05; brand scaling parameter estimates, θj, along with all supplemental estimation results, are available from the authors). We therefore include brand-specific stochastic variation (uncertainty) in all model comparisons discussed below. Brand-Specific Differences in Temporal Scaling. The most general hypothesis involving temporal inflation is that each brand’s variance is inflated in future choice, but not to the same degree. That is, each brand j has its own temporal inflation factor, σcj, rather than a common temporal inflation factor, σc. This is tested in table 13.1, and designated as “Free Temporal Scaling” (MNP-BF). While strongly supported on its own (χdiff2(11) = 22.32, p < 0.001), it is not supported over and above there being a single value for temporal inflation (χdiff2(5) = 4.64, p ≈ 0.10). In other words, time delay between choice and consumption revealed a consistent, inflationary effect on uncertainty about future experienced utility for each of the choice alternatives. We note parenthetically, however, that the estimated values of σcj varied in a pattern resembling an inverted U-shape relative to mean brand rating: σcj values of {1.76, 2.74, 4.93, 4.29, 2.81, 1.00} corresponded to mean ratings {5.53, 6.78, 6.83, 8.09, 7.99, 8.55} and choice shares {4.2%, 6.2%, 10.7%, 17.9%, 18.2%, 42.7%}, respectively. This suggests the intriguing possibility that brands
that are strongly preferred or weakly preferred may be subject to relatively less temporal inflation. We must bear in mind, however, that statistical support for this contention would likely require far larger samples, given the weak degree of evidence in its favor here (p ≈ 0.10). Error Correlations. It is well known that, while models like MNL are convenient tools, those allowing for correlated errors are far more demanding and, for want of a better term, finicky in applications (Harding and Hausman 2006). Nevertheless, it is important to explore whether the pattern of experimental results could be driven by latent error correlations, and we consider two possible patterns: a highly parsimonious one where all errors are intercorrelated to the same degree (“equicorrelation”), and “free correlation,” where error correlations are completely unrestricted. As shown in table 13.1, neither equicorrelation (χdiff2(1) = 0.32, p > 0.4) nor free correlation (χdiff2(10) = 2.01, p > 0.9) was supported. This suggests that the proposed model, accounting for brand- and temporal-scaling (but no other error effects), is adequate to capture stochastic variation patterns in the choice data. Differences in the Effect of Prior Preference on Choice. Previous literature has posited that underlying preferences are under-weighted in simultaneous choice. To assess this, we allow the coefficient on RATE to differ by condition, thereby allowing an interaction between prior preference and choice condition, RATEij*SIM, where SIM is a binary dummy that equals one for the simultaneous condition and zero for the sequential condition. Intriguingly, consistent with previous research, this interaction was significant in models that did not allow temporal scaling: for the standard probit model (MNP, χdiff2(1) = 1.84, p = 0.053) and the probit with brand-scaling (MNP-B, χdiff2(1) = 2.47, p < 0.03). However, it was never significant for models including temporal scaling (MNP-T, χdiff2(1) = 0.62, p > 0.2; MNP-BT, χdiff2(1) = 0.50, p > 0.3). The reverse is not true: temporal scaling was strongly significant when added to any model including the RATEij*SIM interaction term (MNP versus MNP-T, χdiff2(1) = 9.55, p < 0.001; MNP-B versus MNP-BT, χdiff2(1) = 9.89, p < 0.001). In other words, once temporal variation in uncertainty is taken into account, there is no evidence that simultaneous and sequential participants weight underlying preferences differently during choice. Critically, one might erroneously conclude the opposite if temporal differences in stochastic variation are not accounted for in the analysis. Finally, we tested whether the deterministic component of utility is itself stable over time by estimating a model that nests MNP-BF. Six
additional parameters were introduced to allow bRATE, as well as brand-specific constants, to differ for FUT and IMM, but did not significantly improve fit over MNP-BF (p ≈ 0.08).

State Dependence / Variety-Seeking. Patterns of repeated choice can display various "non-zero-order" behavior, in the form of state dependence, habit persistence, or variety seeking; these can be confounded with error scaling effects. Seetharaman (2004) proposed a utility-theoretic account of four different sources of state dependence—lagged choices (structural state dependence); (two) serially correlated error terms or choices (habit persistence types I and II); and lagged covariates—finding the first to be the most important. As advocated in that paper, we apply a specific utility-based model (Seetharaman and Chintagunta 1998) to recover variety-seeking effects, and reestimated all models subject to their hybrid variety-seeking/inertia specification:

Pij = [(VS + |VS|)/2] [Pj/(1 − Pi)] + (1 − |VS|) Pj   (for i ≠ j);
Pjj = [(|VS| − VS)/2] + (1 − |VS|) Pj    (10)
where Pij is the conditional probability of choosing alternative j for the next time period (given that i was chosen for the previous period), and Pj is the unconditional choice probability resulting from equation (8). The parameter VS, ranging from -1 to 1, represents the degree of inertia or variety-seeking in the data, with negative values indicating inertia. As shown in table 13.1, VS was negative, though non-significant, for all models, suggesting a slight degree of inertial choice behavior (for example: MNP-BT, VS = -0.021, χdiff2(1) = 0.19, p > 0.5). To assess this as a driver for variety differences, however, we reestimated the model separately for the simultaneous (VSSIM = -0.001, p > 0.9) and sequential (VSSEQ = -0.125, p > 0.06) conditions. As expected, there appeared to be less inertia in the simultaneous condition, but the difference was not significant (VSdiff = 0.124, χ2 = 2.184, p > 0.13); it would seem inappropriate to interpret this as “more variety-seeking,” since choices exhibited no variety-seeking in either condition. Finally, temporal scaling was strongly significant when added to any model allowing for variety-seeking (MNP versus MNP-T, χdiff2(1) = 10.24, p < 0.001; MNP-B versus MNP-BT, χdiff2(1) = 11.80, p < 0.001). Taken together, these results indicate that, although average observed variety was greater in the simultaneous condition, there is no evidence that simultaneous decision-makers engage in more variety-seeking.
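A small sketch (ours) of the conditional-probability adjustment in equation (10), as reconstructed above, appears below; it uses the choice shares reported earlier in the chapter as the unconditional probabilities and checks that each conditional row sums to one for both inertial (negative VS) and variety-seeking (positive VS) values.

```python
import numpy as np

def conditional_probs(P, VS):
    """Hybrid variety-seeking/inertia adjustment of equation (10), as reconstructed here.
    P: unconditional choice probabilities. Returns Q with Q[i, j] = P(choose j next | i chosen last)."""
    P = np.asarray(P, dtype=float)
    J = len(P)
    pos = (VS + abs(VS)) / 2.0          # active only when VS > 0 (variety-seeking)
    neg = (abs(VS) - VS) / 2.0          # active only when VS < 0 (inertia)
    Q = np.empty((J, J))
    for i in range(J):
        for j in range(J):
            if i == j:
                Q[i, j] = neg + (1 - abs(VS)) * P[j]
            else:
                Q[i, j] = pos * P[j] / (1 - P[i]) + (1 - abs(VS)) * P[j]
    return Q

P = np.array([0.042, 0.062, 0.107, 0.179, 0.182, 0.428])   # choice shares reported in the text
for VS in (-0.125, -0.021, 0.3):                            # inertia (negative) vs. variety-seeking (positive)
    Q = conditional_probs(P, VS)
    print(VS, "rows sum to 1:", np.allclose(Q.sum(axis=1), 1.0),
          "repeat prob. of favorite:", round(Q[5, 5], 3))
```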
Summary of Key Findings. In summary, our pattern of results underscores the importance of examining, and testing for, both deterministic and stochastic effects on choice behavior. Whenever our stochastic variation constructs, brand and temporal scaling, are added to models including the other posited effects, they are always strongly supported (ps < 0.001). Yet when other potential effects—differential preference weighting, correlated errors, or variety-seeking—are added to the proposed model (with brand and temporal scaling; MNP-BT), none of these effects is supported (ps > 0.10; see the far right column of table 13.1). The data therefore tell a surprisingly direct and parsimonious story: different brands entail different degrees of stochastic variation; there is greater stochastic variation for choices made for future than for immediate consumption; and, moreover, this can be adequately captured by a single temporal variance inflation factor.

We have thus far appealed to real data to demonstrate the value of accommodating both deterministic and stochastic effects when modeling choice. But real data, even in controlled experiments, can contain confounds that interfere with causal inference. Critically, we can never fully determine the effects of model misspecification: can a model lacking some particular feature lead to an incorrect interpretation of one's data? To examine this issue, we next turn to simulated choice scenarios whose generating mechanisms—and whether they contain either deterministic or stochastic effects—are known.

13.7 Simulation Study 1: Presuming Constant Stochastic Variance as Potential Specification Error

Having measured the extent of temporal stochastic inflation, we can now investigate a question of key importance: Could common analytical tools erroneously estimate effects in such a way as to lead to an unsubstantiated conclusion? A Monte Carlo simulation study allows us to examine the extent to which deterministic effects can be either correctly statistically detected, or incorrectly inferred, due to unfounded assumptions about the stochastic portion of utility. We therefore examined two hypothetical choice scenarios, based in part on our experimental design and empirical findings—one in which simultaneous versus sequential choice differences are driven by stochastic differences only; and a second in which choice differences are instead rooted in deterministic differences. In our analysis of each scenario, we examine three factors that influence choice (akin to a 2x2x2 experimental design):
brand-specific uncertainty (fixed θj versus unrestricted θj); temporal differences in uncertainty (fixed σ = 1 versus σ ≥ 1); and differences in preference weighting across choice conditions (bRATE*SIM = 0 versus bRATE*SIM < 0). Previous research posited that sequential decision makers are more likely to choose their favorite option, and less variety, because they weight preferences more heavily (bRATE*SIM < 0) than simultaneous decision makers (Simonson 1990). This “deterministic hypothesis” can be tested with both prior preference (rating) and a preference interaction term (with coefficient bRATE*SIM) as covariates in equations (8) or (9). Thus, our scenario analyses will directly measure and test for both stochastic and deterministic influences on choice in each scenario. The two choice scenarios simulate simultaneous versus sequential choice processes as follows: (1) Stochastic Scenario: Temporal (σ > 1 with time delay) and Brand ({θj} ≠ 1) differences in uncertainty, no difference in preference weighting (bRATE*SIM = 0). (2) Deterministic Scenario: Neither Temporal (σ = 1) nor Brand ({θj} = 1) differences in uncertainty, difference in preference weighting (bRATE*SIM < 0). These scenarios were chosen carefully to examine the key issue in our study: the effects of temporal scaling in choice. Both are described and generated by parameters of the discrete choice model, as follows. In “Stochastic” Scenario 1, prior preferences are not weighted more strongly in the sequential condition. That is, Scenario 1 allows for stochastic variance effects, but a classic (deterministic) explanation of variety is not present. In “Deterministic” Scenario 2, there are no stochastic variance effects, but there is an interaction effect representing a systematic shift in preference weighting across choice conditions. For both scenarios, we will estimate a range of models to quantify how various empirical features of the data can potentially lead to statistical artifacts. Specifically, we explore how a researcher might reach incorrect conclusions: claiming there is an underlying (deterministic) shift in preferences when there is not (possible in Scenario 1); or declaring a systematic difference in stochastic variation when there is not (Scenario 2). Data Generation. Our choice scenarios simulate experimental subjects choosing three snacks from among a set of six, similar to our experimental design described earlier. The simulation study relies on generated data, as specified fully below. Conceptually, we must distinguish the pattern in the underlying covariate data (subject-specific preferences
for the various brands) and in the resulting choice data. Although demonstrating our main theoretical contentions via arbitrarily chosen simulation parameters would not have been difficult, we wish both the covariate and choice data to reflect empirically meaningful patterns. Thus, for the simulated covariates (preferences), we use the preference data from our experiment; for the choice data, we generate them via known statistical processes and parameter values stemming from our analysis of choices made in the experiment. The simulated choice and preference data were generated in a manner consistent with a multivariate ordered probit model (Lawrence et al. 2008). First, we estimated the latent (MVN) distribution underlying the collected brand preferences, using polychoric correlations and maximum likelihood estimation (see Jöreskog 1994). Estimated cutoffs converted draws from this MVN distribution to ordinal data, so that the simulated and real pdfs matched (e.g., the proportion rating Oreo a “7” out of a possible 11 was identical for the simulated draws and real preference data). In this way, we can capture respondent scale usage and the correct (marginal) preference distributions for each brand (as well as their intercorrelations). A final advantage to this approach is that we can completely control the process generating the outcome variable: brand choice, both over time and across choice conditions. Specifically, using carefully selected discrete choice model parameters, we can generate outcomes perfectly consistent with a hypothesized model of choice behavior, and examine the ‘empirical’ effects of various assumptions, temporal and otherwise. We can thereby determine what is truly statistically artifactual, an impossibility using field data on actual choices. We control how choices themselves are generated, as follows. Given our coefficients, and covariates drawn from the master generated list, we calculate a (latent) deterministic utility for each of the six brands on any particular choice occasion. To these, we add a random (Normal) error draw, scaled by condition and choice occasion according to equation (8). “Choice” on a particular occasion is then simply the brand with the largest realized utility. This is carried out for both the Simultaneous (SIM) and Sequential (SEQ) “conditions” using the same covariate draws (i.e., simulated respondents), to eliminate this as a source of betweencondition differences. Error draws are made separately across conditions, as they must be. Our simulated data sets consist of 500 draws from the master covariate list, yielding 500 sets of three choices for each of the SIM and SEQ conditions. Thus, our task will be to generate and statistically account for a total of 3000 choices from among six snacks.
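The data-generation step can be sketched as follows. This is an illustrative stand-in (not the authors' code), with a made-up latent correlation matrix and evenly spaced cutoffs in place of the polychoric-correlation and maximum likelihood estimates used in the study; it shows only the mechanics of cutting latent multivariate normal draws into ordinal ratings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for the estimated latent correlation matrix and marginal cutoffs.
J = 3                                   # three brands here, to keep the sketch small
R = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])         # latent correlations among brand preferences
cutoffs = np.linspace(-2.2, 2.2, 10)    # 10 interior cutoffs -> 11 ordinal rating categories

def draw_ratings(n):
    """Draw latent MVN preferences and convert them to 1-11 ratings via the cutoffs."""
    z = rng.multivariate_normal(mean=np.zeros(J), cov=R, size=n)
    return np.digitize(z, cutoffs) + 1  # categories 1..11

ratings = draw_ratings(500)             # 500 simulated respondents, as in the scenarios
print(ratings.shape, ratings.min(), ratings.max())
```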
Discrete choice models require identification constraints for brand-specific constants (we will use bSNICKERS = 0) and also for variances; to ensure that results are comparable across model specifications, we use the re-scaling rule that the product of all brand-specific variances be one: Πj (θj) = 1. Parametric values for the Stochastic Scenario are taken from model MNP-BT in table 13.1, with bRATE = 0.331, bRATE*SIM = 0, bj = {-0.869, -1.138, -0.370, -0.903, -0.695, 0}, θj = {0.782, 1.336, 0.280, 1.070, 0.467, 1}, and σFUT = 2.37; Deterministic Scenario parameter values are taken from "MNP + RATE*SIM Interaction" in table 13.1, with bRATE = 0.353, bRATE*SIM = -0.112, bj = {-0.857, -0.576, -0.724, -0.565, -1.016, 0}, θj = {1}, and σFUT = 1. For our analyses of both simulated scenarios, we estimate the same set of eight parametrically nested models (M1-M8), in accord with the 2x2x2 'analysis design' described previously. The eight models allow for varying degrees of flexibility. Models M1, M2, M5, and M6 do not allow for brand scaling, while the other four models do; nested models M5-M8 allow for temporal scaling to distinguish the SIM and SEQ conditions. Models M7 and M8 are the most general, and models M1 and M2 are the least.

Estimation and Empirical Results. Confronting the two simulated data sets with a variety of model specifications helps gauge the extent to which deterministic effects can be either correctly statistically detected or incorrectly inferred due solely to model assumptions that fail to match the true data generation process. Note that model M8 accounts for both types of scaling and the interaction, so we can, in theory, detect all three effects. All estimated quantities and resulting tests appear in table 13.2, and we will refer to them throughout our discussion. For conciseness, we have isolated the most relevant tests here; full estimation results for these and other simulated data sets are available from the authors. We examine three classes of tests, as specified by their null hypotheses: Temporal scaling (σ = 1), Brand Scaling ({θj} = 1) and Preference Interaction effect (i.e., whether prior preferences are more heavily weighted in SEQ than SIM, with null bRATE*SIM = 0). Hypotheses are assessed via likelihood ratio tests, rather than Wald-type t-tests, when possible.

Table 13.2 Simulation Study 1: Probit-Based Model Comparisons for "Stochastic" and "Deterministic" Scenarios
(For each model: estimated σ, {θj} (fixed at 1 or freely estimated), and whether bRATE*SIM is included; the last three columns give p-values for likelihood ratio tests against the indicated null.)

"Stochastic" Scenario: Temporal Scaling (σ > 1), Brand Scaling ({θj} ≠ 1), No Preference Interaction (bRATE*SIM = 0)
| Model | σ | {θj} | bRATE*SIM | # Params | LL | σ = 1 | {θj} = 1 | bRATE*SIM = 0 |
| M1 | 1 | 1 | No | 5 | −4059.6 | — | — | — |
| M2 | 1 | 1 | Yes | 6 | −4014.7 | — | — | E-21 |
| M3 | 1 | Free | No | 10 | −4000.0 | — | E-24 | — |
| M4 | 1 | Free | Yes | 11 | −3948.2 | — | E-27 | E-24 |
| M5 | 2.08 | 1 | No | 6 | −3961.6 | E-44 | — | — |
| M6 | 2.00 | 1 | Yes | 7 | −3960.8 | E-25 | — | 0.197 |
| M7 | 2.38 | Free | No | 11 | −3862.9 | E-61 | E-41 | — |
| M8 | 2.38 | Free | Yes | 12 | −3862.9 | E-39 | E-40 | 0.879 |

"Deterministic" Scenario: No Temporal Scaling (σ = 1), No Brand Scaling ({θj} = 1), Preference Interaction (bRATE*SIM ≠ 0)
| Model | σ | {θj} | bRATE*SIM | # Params | LL | σ = 1 | {θj} = 1 | bRATE*SIM = 0 |
| M1 | 1 | 1 | No | 5 | −4051.9 | — | — | — |
| M2 | 1 | 1 | Yes | 6 | −4021.5 | — | — | E-15 |
| M3 | 1 | Free | No | 10 | −4050.3 | — | .665 | — |
| M4 | 1 | Free | Yes | 11 | −4020.2 | — | .754 | E-15 |
| M5 | 1.25 | 1 | No | 6 | −4042.6 | E-05 | — | — |
| M6 | 1.01 | 1 | Yes | 7 | −4021.5 | .840 | — | E-11 |
| M7 | 1.27 | Free | No | 11 | −4040.4 | E-06 | .490 | — |
| M8 | 1.01 | Free | Yes | 12 | −4020.2 | .892 | .757 | E-10 |

"Stochastic" Scenario 1 simulated a choice process with differences in stochastic variation (uncertainty) for SEQ versus SIM choice, but no differences in preference weighting. As shown in table 13.2, all effects—for brand scaling, temporal scaling and interaction—are strongly significant (p < E-21) in all models, with the exception of the two nested models (M6 and M8), where the preference interaction does not
approach significance (p > 0.1). And here is the main point: a model that does not account for temporal scaling (such as the commonly applied M2) may lead to a conclusion that there is a difference in preference weighting, when there is not. These effects are stark and unequivocal, and speak directly to whether the standard framework for testing such hypotheses, in which error variance is identical across time, is appropriate. Note that testing certain behavioral theories, such as the diversification effect originally studied in this context, have relied on precisely this sort of test and underlying, constant variance, random utility model. Finally, one might question whether any sort of tinkering whatever with the latent utility variance in the simulated data can lead to such incorrect inferences. This is not the case: when brand-scalings do differ (in the “Stochastic” scenario), but are restricted to be identical in the analysis, this does not lead to such incorrect inferences. As per table 13.2, M6 imposes (wrongly) that {θj} = 1, yet the preference interaction is still correctly found to be non-significant (bRATE*SIM = −0.029; p > 0.1). The “Stochastic” Scenario 1 simulation amply demonstrates the danger of ignoring temporal (though not necessarily brand) scaling when there is no true interaction. What happens, then, when there is an interaction, but no scaling effects? Would the suggested model (M8) mistake the true interaction effect for some sort of variance scaling? Our second simulation, the “Deterministic Scenario,” was set up to test this possibility.5 Results appear in table 13.2 and indicate that the unrestricted model (M8) is capable of explaining the data pattern well: both temporal and brand scaling are nonsignificant, whereas the interaction is strongly so (p ≈ E-10). Note that all models allowing for an interaction (M2, M4, M6, M8) find extremely strong support for it in the data. And all models that allow for both an interaction and temporal scaling find the stochastic effects not to approach significance (ps > 0.5). Tellingly, M5 and M7, which incorrectly restrict the interaction coefficient to zero, do find small but significant estimated values of σ. Given the very large sample sizes and extreme significance levels for the interaction term, there is little question that interactions are not mistakenly ‘picked up’ as stochastic effects when the remainder of the model is suitably general. Recall that our Stochastic Scenario simulation found exactly the opposite: that stochastic effects, when un-modeled, can appear to be strong interactions. One might question whether we needed such large samples to validate temporal inflation. This can be answered readily using the “Stochastic Scenario” and either bootstrapping (i.e., re-sampling) or simply
scaling our results and recalculating likelihood-ratio tests. Both procedures yield the same answer: all reported significant results for 500 simulated respondents remain so (p < 0.001) with only 50; the (non-significant) interactions yield p-values of about 0.8 with so few simulated respondents. Thus, the pattern of results reported here holds even with what would traditionally be termed small samples.

13.8 Simulation Study 2: Verifying Robustness
We sought to test the robustness of our simulation findings with a supplementary simulation study based on the empirical results of Simonson (1990, study 2). Would we find similar results assuming a different error distribution or with simulated choices based on different empirical analysis results? Reexamining Simonson’s (1990) empirical findings offers such an opportunity. We examined the same two choice scenarios as in study 1—“Stochastic” and “Deterministic”—but instead generated simulated choices using the empirical model coefficients reported by Simonson in his seminal study (1990, study 2). Thus, we have in both scenarios that bj = {0.15, −0.10, 0.53, 0, −0.18, 0.45}. In the Stochastic Scenario: bRATE = 0.806 and bRATE*SEQ = 0, with σFUT = 2.50 to align with our empirical findings above. To avoid having to introduce new, arbitrary values into the simulation, brand-specific variances are scaled by this same inflation parameter; three were scaled up, two were unaltered, and one was scaled down. So that comparisons would be meaningful across models, we maintained the re-scaling convention that Πj (θj) = 1, yielding θj = {1.842, 1.842, 0.737, 0.737, 1.842, 0.295}; that is, as explained previously, 1.842/0.737 = 0.737/0.295 = 2.50. For the Deterministic Scenario: bRATE = 0.60, bRATE*SEQ = 0.40, θj = {1}, and σFUT = 1. That is, in the Deterministic Scenario data set, bRATE and bRATE*SEQ are set such that preference is weighted by 0.6 in simultaneous choice, and increases to 1.0 for sequential choice, yielding the same average preference weighting (0.8) as in the Stochastic Scenario. Finally, note that in Simonson’s data and for both simulations, the mean (i.e., brand constant) and (stochastic) variance for Oreo is set to zero and to one for identification. For our analyses of both simulated scenarios, we estimate the same set of eight parametrically nested models (M1-M8), similar to the analysis for the simulations just described above, but presume a Gumbel error distribution (as did Simonson). The eight models can be conceptualized in terms of shorthand names: Logit, HEV, Nested Logit, and
Nested HEV. The logit-based models do not allow for brand scaling, while the HEV models do; nested models allow for temporal scaling to distinguish the two "branches," the SIM and SEQ conditions. Thus, the "Nested HEV" model is the most general, the "Logit" the least. As shown in table 13.3, the pattern of results replicates those found in the earlier simulation results. All Stochastic Scenario effects—for brand scaling, temporal scaling, and interaction—are strongly significant (p < E-16) in all models, with the exception of the two nested models (M6 and M8), where the preference interaction does not approach significance (p > 0.4). The models that allow for an interaction, but do not allow for temporal scaling effects—M2 and M4—erroneously indicate a strongly significant interaction effect (p < E-24). Examining the Deterministic Scenario, the most general model (M8) explains the data pattern well: both temporal and brand scaling are non-significant, while the interaction is strongly so (p ≈ E-24). Again we find that interactions are not mistakenly detected as stochastic effects when the model is suitably general.

In summary, our simulation results indicate that a model that does not account for temporal scaling may lead to an erroneous conclusion that there is a difference in preference weighting, when there is not. Conversely, a model that accounts for both temporal scaling and interaction effects ably disentangles deterministic and stochastic effects, avoiding that pitfall of misspecification. This result is robust across different presumed error distributions and different empirically based simulated choices.

Table 13.3 Simulation Study 2: Model Comparisons for Scenarios Based On Simonson (1990) Estimates
(For each model: model type, estimated σ, {θj}, and whether bRATE*SEQ is included; the last three columns give p-values for likelihood ratio tests against the indicated null.)

"Stochastic" Scenario: Temporal Scaling (σ > 1), Brand Scaling ({θj} ≠ 1), No Preference Interaction (bRATE*SEQ = 0)
| Model | Model Type | σ | {θj} | bRATE*SEQ | # Params | LL | σ = 1 | {θj} = 1 | bRATE*SEQ = 0 |
| M1 | Logit | 1 | 1 | No | 5 | −4744.6 | — | — | — |
| M2 | Logit | 1 | 1 | Yes | 6 | −4690.5 | — | — | E-24 |
| M3 | HEV | 1 | Free | No | 10 | −4642.7 | — | E-41 | — |
| M4 | HEV | 1 | Free | Yes | 11 | −4566.4 | — | E-50 | E-34 |
| M5 | Nested Logit | 2.63 | 1 | No | 6 | −4655.2 | E-40 | — | — |
| M6 | Nested Logit | 2.50 | 1 | Yes | 7 | −4654.8 | E-16 | — | 0.401 |
| M7 | Nested HEV | 2.33 | Free | No | 11 | −4509.4 | E-59 | E-60 | — |
| M8 | Nested HEV | 2.24 | Free | Yes | 12 | −4509.1 | E-26 | E-60 | 0.458 |

"Deterministic" Scenario: No Temporal Scaling (σ = 1), No Brand Scaling ({θj} = 1), Preference Interaction (bRATE*SEQ ≠ 0)
| Model | Model Type | σ | {θj} | bRATE*SEQ | # Params | LL | σ = 1 | {θj} = 1 | bRATE*SEQ = 0 |
| M1 | Logit | 1 | 1 | No | 5 | −3542.4 | — | — | — |
| M2 | Logit | 1 | 1 | Yes | 6 | −3461.2 | — | — | E-37 |
| M3 | HEV | 1 | Free | No | 10 | −3537.9 | — | .108 | — |
| M4 | HEV | 1 | Free | Yes | 11 | −3456.1 | — | .068 | E-37 |
| M5 | Nested Logit | 1.35 | 1 | No | 6 | −3512.3 | E-15 | — | — |
| M6 | Nested Logit | 0.97 | 1 | Yes | 7 | −3461.1 | .591 | — | E-24 |
| M7 | Nested HEV | 1.35 | Free | No | 11 | −3506.9 | E-15 | .056 | — |
| M8 | Nested HEV | 0.97 | Free | Yes | 12 | −3456.0 | .636 | .070 | E-24 |

13.9 Discussion and Conclusions

Ample evidence suggests that stochastic variation can differ across choice conditions. Marketing theory and research methods, however, in an effort to illuminate individual decision processes, often focus primarily on deterministic effects. While such deterministic influences are doubtless important to explicate and verify, potentially stochastically-driven effects are rarely accounted for in theory testing, as highlighted by Louviere (2001), who detailed specific contexts ripe for dedicated investigations. As such, it is presumed, usually tacitly, that any error "noise" and its specification cannot drive substantive implications for the phenomena under study. In this paper, we examined the validity of this assumption, using a well-established behavioral phenomenon posited to involve greater uncertainty for the future.
Specifically, substantive conclusions can indeed arise artifactually, if researchers fail to provide an account of both deterministic and stochastic components of utility. Using a discrete choice framework and carefully modeling the stochastic term to account for temporal differences in variation, our experiment and series of simulation studies provided strong empirical evidence that stochastic variation in anticipated utility can differ across temporal conditions. This evidence remained equally compelling when other purported (deterministic) explanations were included, and made these other explanations recede to non-significance: aside from purely econometric issues, like various patterns of error intercorrelation, substantive explanations involving state dependence, variety-seeking, or greater weighting of underlying preferences in sequential choice were not supported. Time delay offers an excellent example of a well-established phenomenon that has been posited to entail less ‘certainty’ for future choices. But it must be stressed that the present work makes no claims regarding the “simultaneous” decision-making process per se, and especially not that it is driven wholly by temporal stochastic inflation. Rather, our goal is to show that presuming such inflation away can lead to non-trivial substantive artifacts. A full account of the underlying source of inflated uncertainty would require ruling out several competing process explanations, over and above conceptualizing future uncertainty as comprising any and all contributory sources. Although we did not seek to identify or isolate any such alternative process explanations, our experiment did take care to rule many out, chief among them flexibility-seeking (Kreps 1979, Walsh 1995), as well as uncertainty in time, location, and number of future consumption occasions. We were able, however, to quantify the overall effect, via a proof on the lower bound for anticipated future utility. To our knowledge, this thereby provides the first formal estimation of the magnitude (and significance) of the effect of time delay between choice and consumption. The degree of consistency of variance inflation measurements—between 2.10 and 2.70—across models is reassuring, but should be supplemented by studies of other product classes, time delays, and choice set sizes. Such variance inflation factors are, of course, distinct in magnitude and concept from time-discounting parameters common in studies of intertemporal choice. Moreover, brand- and time-specific error scaling parameters provide only two dimensions across which the degree of “unobserved variability” has been posited to contain structure. As
detailed by Adamowicz et al. (2008), different contexts can evoke distinct choice processes and strategies, which may entail differing degrees of unobserved variability. In our view, this presents a ripe area for further investigation, particularly so when lab (e.g., conjoint) measurements are compared with those arising from market data.

Beyond modeling choices in an experimental context, our approach has implications for investigating choice behavior in the field. 'Real world' choices vary markedly in time-to-consumption, meaning experienced utility can be nearly immediate or far off. Examples abound: videos selected in-store versus those queued for future delivery from some online provider; vacation excursions selected at one's destination versus those chosen in advance at the time of purchase; projections of use of a gym membership over a year-long contract versus at the time of signing; scanner-recorded supermarket purchases made for quick consumption (e.g., produce) versus those for longer-term use (e.g., frozen vegetables); or, more generally, product classes with multiple channels of distribution that differ in time delay or—although we have not studied this directly here—in product information availability (and hence uncertainty; e.g., the possibility of taste testing in the store but not online; ability to try on clothing in a shop but not when purchasing via catalog, etc.). An especially fertile area for future investigation is in the realm of preference elicitation methods, notably conjoint, where "purchase readiness" can mean anything from "right now" up to many months later; given our findings, it is unclear whether it is appropriate to presume that all such potential customers should be considered to have identical degrees of stochastic variation.

In sum, whenever the degree of "error"—whether it be called uncertainty, variability, noise, random component, or something else—may be more pronounced in a particular subset of the data, no matter its source, it needs to be accommodated in the formal analysis of actual choices. That is, researchers must be mindful of the fact that different conditions can entail different variances, and choose their methods accordingly. The class of models examined here offers an avenue for accommodating this distinction and examining its substantive implications.

Appendix

Given any set of deterministic components, {v1, …, vK}, non-degenerate joint error density, {ε1, …, εK}, and utilities {vj + σεj}, we prove three facts about what happens when the degree of stochasticity (σ) increases: (1) the
most-favored (highest vj) item is less likely to be chosen; (2) the least-favored (smallest vj) item is more likely to be chosen; and (3) systematic predictions cannot be made about other items. Consider the probability that item j is chosen from among a fixed set. Without loss of generality, we focus on item j = 1; the probability it is chosen from among items 1, …, K is given by:

Pr[1] = Pr[v1 + σε1 > {vk + σεk}k>1]    (A1)

We can rearrange terms as follows:

Pr[1] = Pr[{ε1 − εk > (vk − v1)/σ}k>1] = Pr[{γk > (vk − v1)/σ}k>1],    (A2)
for error process { γ 2 ,…,γ K }, where γ k = ε 1 − ε k . We now wish to examine the effect of changing σ. If item 1 is the most-favored alternative, (vk – v1)/σ is negative for all k > 1. Increasing σ therefore contracts the domain of integration (over the joint pdf of{γ 2 ,…,γ K }) along all K – 1 dimensions, thereby decreasing Pr[1]. An analogous argument holds for the least-favored option (when v1 is lowest), because increasing σ expands the domain of integration. This argument does not hold for items other than the smallest and largest. Thus it is established that the probability of choosing the most-favored option decreases with σ, while the probability of choosing the least-favored option increases, for any non-degenerate error density. Note that this does not presume any particular relation among the errors {εk}, like independence, to hold. If item 1 is neither the most- nor least-favored, we must show that no clear directional statements can be made about how its choice probability changes with σ; in fact, it will in general depend on the relative values of {v2 ,..., vK } (as well as the joint density {ε1 ,..., ε K }). To see this, consider K – 1 identical “favorite” items and one “non-favorite” item. As shown above, the choice probabilities of each of the K – 1 favorites will decrease as σ increases. Because the last expression in equation (A2) is continuously differentiable in ν1, this holds locally, for values of ν1 ‘near’ the favorite. (Conversely, the case of K – 1 identical nonfavorite items and one favorite item would result in the choice probability of each of the K – 1 non-favorites increasing with σ.) This is readily illustrated by two examples; in both, {v2 ,..., vK } = {.98, 0.99, 1, 1.01, 1.02} and, for simplicity, {ε1 ,..., ε K } are i.i.d. N[0,1]. If v1 = 2, it is the favorite, and it is readily verified that increasing σ will decrease Pr[1] and increase all Pr[k], k > 1 (specifically, numerical integration
shows that dPr[k]/dσ ≈ {-.330, 0.069, 0.067, 0.065, 0.064, 0.063}). If v1 = 0, it is the least favorite, and increasing σ will increase Pr[1] and decrease all Pr[k], k > 1 (dPr[k]/dσ ≈ {.064, −0.007, −0.011, −0.013, −0.015, −0.019}). Thus we see that, although increasing σ causes the choice probabilities of the most-favored item to decrease and of the least-favored item to increase, those of the ‘internal’ items (k = 2, 3, 4, 5) can be influenced positively or negatively. Notes This article is based on the first author’s dissertation at the Stephen M. Ross School of Business, University of Michigan, and was originally published in Marketing Science. The authors thank Christie Brown, Rich Gonzalez, Norbert Schwarz, Frank Yates, and the review team for their valuable comments. Reprinted by permission, Linda Court Salisbury and Fred M. Feinberg, “Alleviating the Constant Stochastic Variance Assumption in Decision Research: Theory, Measurement and Experimental Test,”Marketing Science, volume 29, number 1, January–February 2010. Copyright 2010, the Institute for Operations Research and the Management Sciences (INFORMS), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228 USA. 1. We distinguish this from “expected” utility, because this connotes the standard expectation operator, and there is no guarantee that the error process is zero-mean (e.g., it is not for Gumbel errors, which underlie logit models). Thus, “mean anticipated utility” is associated with the “deterministic portion of utility.” 2. Note that the error terms, εA and εB, need not be independent and identically distributed (i.i.d.), but may have an arbitrary non-degenerate bivariate distribution; nor need they have zero mean (as with Gumbel distributions). However, although not necessary here, it would not be unreasonable to presume that εA and εB have the same form of marginal distribution, an issue we take up later in the empirical application. 3. However, because Louviere and Eagle (2006), Louviere and Meyer (2007), and Adamowicz et al. (2008) caution against ignoring various sources of unobserved heterogeneity—including unobserved coefficient and scale heterogeneity—we re-estimated many of the key models in this article subject to a variety of fixed, reduced coefficient values for bRATE; these resulted in mild decreases in the estimated value of σc. Data with substantially more choices per individual would be required to settle this issue empirically. 4. Given the numerous model comparisons reported using chi-square, we adopt a notation convention of listing the difference in log-likelihood between models (rather than two times that quantity), i.e., χdiff2(Δdf) = ΔLL. This serves to facilitate comparing results reported within the text to those appearing in table 13.1. 5. Several “Stochastic and Deterministic Scenario” simulations were run as well, but we do not discuss them here. They do, however, support the accuracy of the estimation methods, which recovered all parameters successfully. 6. Because Simonson used two presumably correlated preference measures, attractiveness (ATR) and liking (LIK), reported values must be combined to create a single usable coefficient; and the same must be done for ATRSEQ and LIKSEQ, which reflected how
432
Linda Court Salisbury and Fred M. Feinberg
much greater these were in the Sequential condition. These reported values were bATR = 0.36 in SIM and 0.36 + 0.39 = 0.75 in SEQ; bLIK = 0.47 and 0.47 + 0.21 = 0.68, respectively. If these were to be the same across the SIM and SEQ conditions, the new means should be about 0.36 + 1/2(0.39) = 0.555 for bATR and 0.47 + 1/2(0.21) = 0.575 for bLIK. And, because these will be correlated, they should not be used as separate covariates, and so we use bRATE = 0.8 in the Stochastic Scenario (this corresponds to ATR-LIK correlation of approximately 0.55. Substantive model results were insensitive to various (positive) values of this correlation.).
References Adamowicz, W., D. Bunch, T. A. Cameron, B. G. C. Dellaert, M. Hanneman, M. Keane, J. Louviere, R. Meyer, T. Steenburgh, and J. Swait. 2008. “Behavioral frontiers in choice modeling.” Marketing Letters 19 (3/4):215–228. Allenby, G. M., and J. L. Ginter. 1995. “The effects of in-store displays and feature advertising on consideration sets.” International Journal of Research in Marketing 12 (1):67–80. Allenby, G. M., and P. E. Rossi. 1999. “Marketing models of consumer heterogeneity.” Journal of Econometrics 89 (1–2):57–78. Anderson, C. J. 2003. “The psychology of doing nothing: Forms of decision avoidance result from reason and emotion.” Psychological Bulletin 129 (1):139–167. Andrews, R. L., A. Ainslie, and I. S. Currim. 2008. “On the recoverability of choice behaviors with random coefficients choice models in the context of limited data and unobserved effects.” Management Science 54 (1):83–99. Baltas, G., and P. Doyle. 2001. “Random utility models in marketing research: A survey.” Journal of Business Research 51 (2): 115–125. Bhat, C. R. 1995. “A heteroscedastic extreme value model of intercity travel mode choice.” Transp. Res. B. 29 (6):471–483. Bradley, M. A., and A. J. Daly. 1994. “Use of the logit scaling approach to test for rankorder and fatigue effects in stated preference data.” Transportation 21 (2):167–184. Dhar, R. 1997. “Consumer preference for a no-choice option.” Journal of Consumer Research 24 (2):215–231. Engle, R. F. 1982. “Autoregressive conditional heteroscedasticity with estimates of variance of United Kingdom inflation.” Econometrica 50 (4):987–1008. Harding, M. C., and J. Hausman. 2006. “Flexible parametric estimation of the taste distribution in random coefficients logit models.” Working paper, MIT Department of Economics, Cambridge, MA. Hensher, D. A., J. J. Louviere, and J. D. Swait. 1999. “Combining sources of preference data.” Journal of Econometrics 89 (1–2):197–221. Iyengar, S. S., and M. R. Lepper. 2000. “When choice is demotivating: Can one desire too much of a good thing?” Journal of Personality and Social Psychology 79 (6):995–1006. Jöreskog, K. G. 1994. “On the estimation of polychoric correlations and their asymptotic covariance matrix.” Psychometrika 59 (3):381–389.
Theory, Measurement, and Experimental Test
433
Kahneman, D., and J. Snell. 1992. “Predicting a changing taste: Do people know what they will like?” Journal of Behavioral Decision Making 5 (3):187–200. Kahneman, D., P. P. Wakker, and R. Sarin. 1997. “Back to Bentham? Explorations of experienced utility.” Quarterly Journal of Economics 112 (2):375–405. Keren, G., P. Roelofsma. 1995. “Immediacy and certainty in intertemporal choice.” Organ. Behav. Hum. Dec. 63(3):287–297. Kivetz, R., and I. Simonson. 2002. “Self-control for the righteous: Toward a theory of precommitment to indulgence.” Journal of Consumer Research 29 (2):199–217. Kreps, D. M. 1979. “A representation theorem for ‘preference flexibility’.” Econometrica 47 (3):565–577. Lawrence, E., D. Bingham, C. Liu, and V. Nair. 2008. “Bayesian inference for multivariate ordinal data using parameter expansion.” Technometrics 50 (2):182–191. Loewenstein, G. 2001. “The creative destruction of decision research.” Journal of Consumer Research 28 (3):499–505. Louviere, J. J. 2001. “What if consumer experiments impact variances as well as means? Response variability as a behavioral phenomenon.” Journal of Consumer Research 28 (3):506–511. Louviere, J. J., and T. Eagle. 2006. “Confound it! That pesky little scale constant messes up our convenient assumptions!” Proceedings, 2006 Sawtooth Software Conference, Sawtooth Software, Sequem, Washington, 211–228. Louviere, J. J., D. A. Hensher, and J. D. Swait. 2000. Stated Choice Methods. Cambridge, UK: Cambridge University Press. Louviere, J. J., and R. J. Meyer. 2007. “Formal choice models of informal choices: What choice modeling research can (and can’t) learn from behavioral theory.” N. K. Malhotra, ed. Review of Marketing Research: Volume 4, 3–32. M. E. Sharpe, New York. Louviere, J. J., D. Street, R. Carson, A. Ainslie, J. R. DeShazo, T. Cameron, D. Hensher, R. Kohn, and T. Marley. 2002. “Dissecting the random component of utility.” Marketing Letters 13 (3):177–193. McFadden, D. 1974. “Conditional logit analysis of qualitative choice behavior.” In Frontiers in Econometrics, ed. P. Zarembka, 105–142. Academic Press, New York. Read, D., G. Antonides, L. van den Ouden, H. Trienekens. 2001. “Which is better: Simultaneous or sequential choice?” Organ. Behav. Hum. Dec. 84(1):54–70. Read, D., and G. Loewenstein. 1995. “Diversification bias: Explaining the discrepancy in variety seeking between combined and separated choices.” Journal of Experimental Psychology. Applied 1 (1):34–49. Read, D., G. Loewenstein, and S. Kalyanaraman. 1999. “Mixing virtue and vice: Combining the immediacy effect and the diversification heuristic.” Journal of Behavioral Decision Making 12 (4):257–273. Seetharaman, P. B. 2004. “Modeling multiple sources of state dependence in random utility models: a distributed lag approach.” Marketing Science 23 (2):263–271.
434
Linda Court Salisbury and Fred M. Feinberg
Seetharaman, P. B., and P. K. Chintagunta. 1998. “A model of inertia and variety seeking with marketing variables.” International Journal of Research in Marketing 15 (1):1–17. Simonson, I. 1990. “The effect of purchase quantity and timing on variety-seeking behavior.” Journal of Marketing Research 27 (2):150–162. Swait, J., and J. Louviere. 1993. “The role of the scale parameter in the estimation and comparison of multinomial logit models.” Journal of Marketing Research 30 (3):305–314. Walsh, J. W. 1995. “Flexibility in consumer purchasing for uncertain future tastes.” Marketing Science 14 (2):148–165.
III Little’s Law—Current State
Introduction by Editors The Ceprini and Little paper closes the loop. John moved from Little’s Law to Marketing Science, with contributions in many fields, but with Ceprini as co-author, he has now returned to Little’s Law and together they have created an important paper that bridges the gap between OR and Finance. For the past few years, John has taught a special seminar on Little’s Law at MIT in which students from many of MIT’s and MIT Sloan’s programs get to experience the master. Indeed, if you visit the MIT Sloan School, one of the conference rooms is named the Little’s Law Conference room.
14
Generalized Little’s Law and an Asset Picking System to Model an Investment Portfolio: A Working Prototype Maria Luisa Ceprini and John D. C. Little
14.1
Introduction
Our ambitious goal is to create an Asset Picking System (APS) combined with Little’s Law (LL), Generalized Little’s Law (GLL) and its corollaries to generate the GLL-APS model and apply it to generate high quality customized portfolios for individuals. The idea is simple. In the original queuing notation of Little’s Law, the formula is L=λW. The queue consists of discrete entities that we call “items” and the system has some basic time unit, say, “weeks.” Then L = the average number of items in the system, λ = the average arrival rate of items to the system (items/week), and W = the average time that an item spends in the system (weeks). Next we argue that each of the three LL parameters may be thought of as a measure of performance. LL locks them together and, if we are dealing with a finite time period, for example, t ∊ [0,T], with T finite, the relationship is numerically exact. What the items are and what the measures of performance mean depends on the application. Thus, in finance, where items are assets, λ = the average portfolio throughput (assets/week), L = the average number of assets in the portfolio during [0,T] (assets), and W = the average time an asset spends in the portfolio (weeks). Little’s Law was generalized in an important way, usually expressed as H=λG, by Brumelle (1971) and Heyman and Stidham (1980) and is here called Generalized Little’s Law (GLL). The underlying idea is that, during the time when an item, i, is in the system, it can be weighted by a function, fi (t), unique to that item. We define fi (t) to be zero for any t for which i is not in the system. We can expand this idea further by associating i with a vector of quantities: fi(t)={fi 1(t),fi 2(t),.,fi m(t),..,fiM(t)}. These might represent several different properties of an investment:
438
Maria Luisa Ceprini and John D. C. Little
e.g., its dollar amount, its expected rate of return, or its risk measured by the variance of the return rate. Finally, we can expand the concept to a matrix, H i (t). If i ∊ S, is a set of investments of interest, H i (t), could be the covariance matrix of the return rates at t, but, if i is not ∊S, Hi(t) =0. Note that the standard properties of LL still hold along with those of GLL, i.e., L=λW. The practical value of the generalizations is that, instead of just counting items, we can weight them by their value, say, in dollars or other form of utility, or expand them into special properties of an investment, i. Surprisingly, as far as we know, these generalizations have not been exploited in practical applications in the real world very much, if at all. Although the usual theoretical model for H=λG was developed for an infinite time line (Little, 2011), we plan to work on [0,T] and will develop the appropriate formulas as needed. As mentioned earlier, our paper aims to apply finance and queuing concepts to the portfolio building and maintenance process, assuming a financial adviser works with a customer. Then, we model the selection of investments as a queuing process in which each item represents a financial asset. Thus n(t) in figure 14.1 (adapted from figure 3 of Little 2011) is the number of investments the customer holds in the portfolio at time, t. Usually investments are divided into classes (k =1,2,3, . . .,K), e.g., municipal bonds, corporate bonds, common stocks, preferred stocks, mutual funds, real estate funds, ETFs, equities, and n(t)
t (time units)
0
T
Observation period Figure 14.1 n(t) vs. t, showing a sample path (ω) of the number of investments n(t) in a portfolio, ω, over an observation period, t ∊ [0,T], with finite T.
A Working Prototype
439
short term money market funds. Each asset has its own profile of risk/ return, and, in today’s global markets, is widely available. In this paper we use few customer profiles, some financial products, a fairly short time period, [0,T], and therefore few transactions. However, we shall use a general structure and notation that permits easy expansion of scale. Eventually, we seek a level of detail that would be practical and useful. We assume the customer has certain goals. Presumably she/he wants to improve some measure of financial well-being, e.g., growth of the total portfolio value over time or annual income increase. However, the customer would also like to do this at low risk. This creates a trade-off, since high return tends to be associated with high risk. Our system has a set of decision rules for recommending investments for the customer. These rules will be tailored according to the client’s characteristics, and, subsequently, will become parameters in the GLL-APS model for developing a customized investment portfolio. In a computer simulation, the rules are simply applied. In a live session, the customer would be the final decision-maker for the choices made. Therefore, the initial step by the financial adviser is to conduct a thorough interview with the customer to identify her/his profile in terms of financial needs and preferences (see section 14.4.1). The adviser interacts with the customer, aiming to make good decisions for the portfolio by adding and/or removing investments from time to time. This can be either at the request of the customer or the adviser any time one of them perceives relevant shifts in financial markets. The model requires rules for responding to these variables. We shall call them “investment decision rules” and/or “market behavior rules.” For simulation purposes we propose to set up plausible rules and plausible scenarios for the environmental variables. As mentioned, we aim to model a real world investment process. What would this involve? The initial portfolio at, t =0, would be given, although it might be all cash, brought by the customer to the adviser, or residing in his/her savings account waiting for good investment opportunities. Decisions, as time moves along, would be real and therefore may not follow the investment decision rules and/or market behavior rules built into the simulation program. Nevertheless, LL and GLL offer advantages for tracking the process. The first is structure. As is pointed out in Little (2011), and earlier in this paper, LL itself offers three quite different measures of performance that are closely interrelated. They also have the advantage of the “2 out of 3” property. If we
440
Maria Luisa Ceprini and John D. C. Little
know two of the measures, the third is determined. This is a consequence of the numerical exactness of the L=λW relationship over [0,T]. In particular, if one of the measurements is difficult to make we can still obtain its value easily. Or we can use the relationship to check for portfolio performance consistency. GLL could add more performance measures with similar advantages. Modeling a real portfolio for a real customer might require six months or more to achieve insights about how well these ideas work in practice. However, we shall first do computer simulations using data from one of the three real customers for which we have collected customer profile data. For investments, we have collected historical data on ten blue chips for each of fourteen investment classes. Later we plan to work on a larger scale with more investors, advisers and assets. 14.2
Generalized Little’s Law (GLL)
If an investment portfolio is of any size and complexity, some kind of summary measures of performance will be highly useful. An alphabetical listing by name, for instance, would tell very little about the quality of the investments. Stock prices in fifteen-minute increments might be available and could be plotted for a year but, for twenty to thirty investments would be overwhelming to look at and only rarely helpful. In this section, we shall focus on certain averages that are performance measures, and, further, have exact mathematical relationships with each other. We shall present two theorems (standard Little’s Law and Generalized Little’s Law) and three corollaries that expand their usefulness. 14.2.1 Theorem LL.2 on [0,T ] for Standard Little Law (LL) Little (2011) establishes a standard LL result and calls it Theorem LL.2. Figure 14.1 above is a sketch for it, adapted from figure 3 in Little (2011). Shown is the evolution of an investment portfolio of a customer over t ∊ [0,T]. In queuing, this history is known as the sample path, ω. Since we do not really distinguish between the customer and her/ his portfolio as it evolves, we can talk about the customer, ω, or the portfolio, ω. Theorem LL.2 (Little’s Law over [0,T]): For a portfolio observed over t ∊ [0,T] with finite T, L=λW holds.
A Working Prototype
441
Proof: Let S(t) = cumulative number of investments in the portfolio over [0,T]. This includes not only the cumulative arrivals in [0,t] but also any investments that were in the portfolio at t =0. This permits S(0) = n(0)>0 and n(T)>0. We define [A=∫0Tn(t)dt] = the area under n(t) over [0,T]. Repeating arguments in Little (2011), we see that: L=
A T
λ=
S (T ) T
W=
A S (T )
A A ⎡ S (T ) ⎤ ⎡ S (T ) ⎤ ⎡ A ⎤ whence: L = ⎡ ⎤ = ⎡ ⎤ ⎢ ⎢⎣ T ⎥⎦ ⎢⎣ T ⎥⎦ ⎣ S (T ) ⎥⎦ = ⎢⎣ T ⎥⎦ ⎢⎣ S (T ) ⎥⎦ = λW
(1)
Note that S(T) is an integer and only changes by integer amounts. Note also that only its value at T enters into the definitions of λ and W. Corollary C.1 to LL.2: Extension C.1 (Classes of investments). Let (k=1, 2, . . . K) index sets of mutually exclusive investments. Let {Lk, λk, Wk} be the LL parameters for the investment class, k. Then: Lk = λk Wk
(2a)
and defining (λ = S k λ k ), (L = S k Lk ) , and (W = S k (λ k / λ ) Wk )
(2b)
we have L = λW . Proof: (2a) is a special case of LL.2 where we only record data about arrival time, departure time, and waiting time for type k investment class; (2b) follows from the definitions of λ, L, W and by noting that W=Σk(λk /λ)Wk=Σk Lk/λ =L/λ.□ The practical meaning of C.1 is that in a portfolio containing many classes of investments, we can put an individual class, k, e.g., municipal bonds, in isolation and use LL.2. Of course, other activities may be going on and affect the values of Lk, Wk, and λk but will not affect the LL relationship. Any class can be compared with the remainder of the
442
Maria Luisa Ceprini and John D. C. Little
investments in terms of the three measures: Wk, the average time an asset spent in the portfolio; Lk, the average number of assets in the portfolio over [0,T];, and λk, the average arrival rate of assets over [0,T]. Furthermore, the investment adviser can collect data having nothing to do with LL (e.g., news stories about certain stocks), but which make it possible to evaluate the desirability of the investment or its class. Thus LL provides a structure for our analysis through its consistent three measures of performance. Corollary C.2 to LL.2: Nesting Suppose that class, k =5, consists of municipal bonds. In the US, for example, these do not incur federal income taxes, but may incur state taxes. However, in some states, residents do not have to pay state taxes on municipal bonds in their own state. Thus it may be of interest to keep track of municipal bonds by state. Since states are mutually exclusive we could include a new subscript, s, to designate state within the class of municipal bonds. Although the notation becomes cumbersome, the arguments establishing C.2 go through to nest the new attribute “state” within the class of municipal bonds. Such nesting of attributes could continue, if the application warrants. The art of model building suggests the wisdom of keeping things simple. 14.2.1.3 Corollary C.3 to LL.2: Flexibility The basic theorem, LL.2, encourages flexibility. Consider, for example, dynamic control of a portfolio by an adviser. Suppose the adviser is working with a customer on a portfolio with various classes of stocks and bonds. One particular group of investments may be of special interest, e.g., all in the same industry that shows sound growth indexes. Think of these investments as having a check-mark put next to them as they are added to the portfolio. The adviser monitors them more frequently than the others in [0,T]. For example, the adviser can apply LL.2 over [0,t] to the check-marked group and make insertions or deletions at t as needed. LL.2 provides an evaluation of what has happened up to time, t, and the adviser decides what to recommend next. Essentially, we are simply defining a new subclass (defined by check-marks) and a different time period [0,t]. It is a useful way of thinking, applicable to any collection of investments with a common need for monitoring. 14.2.2 Theorem GLL.2 on [0,T] for Generalized LL We now use figure 14.1 and the accompanying theorem LL.2 to motivate a more
A Working Prototype
443
general model. Instead of each investment, i, having unit height as assumed in figure 14.1, its height is fI (t), while it is in the portfolio, otherwise zero. (Thus, LL.2 is the special case of fI (t) =1, while i is in the portfolio, otherwise zero.) In the general case, the figure will look different, as will the curve n(t). We keep λ and S(T) with their same meaning and λ=S(T)/T as before. Let A be the area under the new n(t), and define the new symbols H= A/T and G=A/S(T). We want to show: Theorem GLL.2 (Generalized Little’s Law over [0,T]): For an investment portfolio observed over t ∊ [0,T] with finite (T), H=λG holds. Proof: This completely parallels LL.2. The algebra goes through in the slightly changed notation. The corollaries also hold with symbols L and W replaced with H and G. The advantage of the generalization lies in the new flexibility of application and interpretation. For example, fI(t) could be the dollar amount of investment, i. Then H is the average dollar amount of investments in the portfolio over [0,T]. λ continues to be the average arrival rate of investments to the portfolio. Suppose the time unit is a week. Then G is the average dollar-week of investment in the portfolio during [0,T]. Another way to describe G in this example is that it is the average time an investment is in the portfolio during [0,T] after each investment has been weighted by its dollar amount. Thus big dollar investments contribute more to G than small ones. We can go a step further by thinking of the height as a vector, fI(t), unique for each investment, i. The components of the vector might be the dollar amount of the investment, the income from the investment, the market value of the investment at t, and a measure of the risk of the investment. The function fI(t) can be any function of t. For example, it could be (case 1) the Book Value (dollars) at which an asset is carried on a balance sheet and so corresponds to the cost of an asset minus its accumulated depreciation; or (case 2) the Purchase Value (dollars) at which an investor buys the asset, i. Probably most interesting fI(t) could be (case 3) the Market Value (dollars) at time t. This measure of performance takes into account fluctuations in the value of the assets during the time that they are in the portfolio. In this case H becomes the average market value of the portfolio (dollars) during the interval [0,T].
444
Maria Luisa Ceprini and John D. C. Little
Since H=λG, G=H/λ, and λ= the arrival rate of investments (an integer divided by T and the same in any of the three cases), we see that, in (case 3), G has the interpretation of the average time in the portfolio but with each investment weighted by its market value over the period t ∊ [0,T]. 14.3
Creating an Investment Portfolio in a Global Market
14.3.1 Global Recession and Its Impact on Financial Markets As of this writing, the world has not been cured of deep global recessions. One of the most severe recessions following WWII started in 2001. The difficulties and sacrifices felt by families all over the world constituted a hard lesson to motivate not letting it happen again. For this reason we decided to report briefly all the measures taken during the speculative spiral to global recession. The bubble of the dot-com stock market in 2001 was followed by the real estate (2003) and stock bubbles (2008). These started seeping into the market and involving all economic and production sectors, culminating in a global recession, beginning in 2010. Briefly, the rash speculations in the stock market (2001) and the real estate bubble (2003) gradually spread from the US all over the world, permeating the financial system, the engine of any economy. It would be ingenuous to believe that such huge losses in the markets, whose negative macroeconomic effects continued to prevent a sound growth in spite of the stimulus, cannot but produce deep wounds to the real economy. For several years, governments burned large sums of money to support companies and families. These efforts are likely to worsen the situation unless backup fiscal and monetary measures have been considered in advance. Our views are plentifully supported by the literature, where a large and influential economic theory explains the strong connection between financial market fragility and malaise in the real economy. The first warning signal was the continuous drop of production, which shortly dipped from 20% to 45%, depending on the sector. This phase quickly increased unemployment and produced stagnant real wages, exacerbated by reduced purchasing power. To these roots we need to add the cold speculation on the fragility of the sovereign debt (2012) in almost all industrialized countries, particularly in the euro zone, where, each country separately, but apparently under a European supervision, took austerity measures including
A Working Prototype
445
fiscal taxes, expense cuts including minimum pensions, a freeze in civil-service wages, and a fuel-tax increase. Meanwhile, countries began evaluating plans, never or rarely applied, for increasing the social security benefits for the weakest groups in the population. That is what really went on and on. The blind austerity deterred the weakest countries from any sound economic growth for a long time. The harsh market lesson is that when liquidity evaporates from a market, stronger countries need to step in and backstop the asymmetry of the system. In such a scenario, Europe applied the misleading monetary policy of the ECB (Modigliani and Ceprini 2000), justified by an obsessive fear of inflation, without any sense and without consistency. As a result, the American and European economies experienced deep and worrisome recessions. When the stock market plunged because of heavy losses, the American Fed and the European ECB by mutual consent decided on large-scale strategies of liquidity and large cuts of interest rate that shortly brought the prime rate to a historic low. Nevertheless, all stock markets around the world continued to react negatively in the wake of lack of confidence. Thus, market returns were deflated. It would be best if investors learned from this and became more concerned about market opportunities and held more reasonable expectations than before the crisis. Therefore, this paper tries to define investment rules that are transparent and realistic for picking assets to build and maintain a customized portfolio. It tries to do this in a straightforward and consistent manner that seeks to build confidence in the investment process. 14.4 A Four-Step Process to Design the Asset Picking System (APS) Structure In the first step, we identify those characteristics relevant to a customer profile for generating portfolio asset allocation. We also establish a relationship between customer characteristics and investment classes. In the second step we define investment decision rules for designing the APS structure. In the third step, we select major available investment classes suitable for the customer and employ the APS structure to make the choices. Lastly, in the fourth step, on fixed dates, the GLL checks the investment portfolio consistency for maintaining customer goals over time, taking into account market changes and/or client requests.
446
Maria Luisa Ceprini and John D. C. Little
14.4.1 Step 1: Generating Customer Profiles for Portfolio Asset Allocation We think of a customer as a vector of characteristics and often refer to this as the customer profile. The elements would be age, education, wealth, standard of living, level of risk, etc. The vector of customer characteristics, Cj, needs to be related to a vector of investment classes, Ii . The latter might be municipal bonds, corporate bonds, common stocks, preferred stocks, real estate trusts, mutual funds, ETFs, swaps, derivatives, etc. Each of these will have its own level of risk. The mapping relation is a function that links Ii (investment classes) to Cj (customer characteristics). Thus an elderly person might be mapped into municipal bond, which inherently has low risk. Next we consider the market offerings and opportunities. Although there are many financial products, perhaps only a few meet the requirements of a particular customer. The adviser can help considerably because he/she spends full time working with investments. In times of crisis with volatile markets, there may be very few low risk investments. Some investment classes that would normally be low risk actually have much higher risk. In such times, government regulation or intervention may enter the picture. As an add-on to our model, if the effect of a regulation can change the level of risk of a class of financial products, we want our model to calculate and evaluate the change from the customer point of view. One of the most important and difficult decisions an individual faces is the choice of how to invest a certain amount of money, usually savings from work, to obtain an appealing return for satisfying personal goals. These vary from person to person and are generally oriented toward a better standard of living, whatever is the life stage. Since each asset choice, represents an asset-class and has a different return and risk profile, the choice will partially determine the customer’s financial well-being. The answer to the question of who manages an investment portfolio depends on the investor, who may decide to manage it personally or may commit it to a professional adviser. The answers to the questions how and where to invest and what to invest are mainly affected by the behavioral, economic, and financial characteristics of the investor. These directly or indirectly determine the risk and lead the investor to an expected return target. Therefore, the adviser interviews the customer about the following characteristics, as discussed in (Ceprini 1995 and 2006).
A Working Prototype
447
(i) Age. Tolerance for risk decreases with age. This is especially true for low-income workers for whom a public pension is usually the main and often the sole source of income in old age when, having a different time perspective (life expectancy), they are no longer able to assume appreciable risk. Age and family stage interact in the life cycle. Depending on personal circumstances, a particular asset might be perceived as risk-reducing at an early stage but not at a later stage; (ii) Education. Tolerance for risk increases with education. This guides investors to better investment choices since more education permits customers to evaluate market opportunities more effectively; (iii) Wealth. Tolerance for risk increases with wealth. Big capital offers a better chance to recover from losses due to market setbacks or overvaluation of rash investments. Age and wealth tend to characterize two phases of the life cycle: positive tolerance to risk up to retirement and negative thereafter. The wealth accumulated during working years is spent to maintain the retiree’s living standard; (iv) Attitude toward risk. This is influenced by age, wealth, and other attributes of standard of living. It ranges from Risk Tolerance to Risk Aversion. Risk is a key determinant of portfolio choices. Its influence is demonstrated by observing that, usually, a young wealthy individual is willing to assume a higher risk to realize a higher expected return, thereby showing Risk Tolerance. Nevertheless, there are individuals who, under the same conditions, are unwilling to assume any level of risk showing Risk Aversion; (v) Standard of living. This results from the average lifestyle acquired by an individual during working time. It affects directly her/his attitude toward risk, and depends on wealth, education, and working career. To maintain a lifestyle, some individuals are willing to assume almost any risk, hoping to realize a high return from their chosen investments. Unfortunately, they often have no understanding of the real risk they are taking. An adviser may help steer them; (vi) Investment goals. These play a key role in the investment decision process, particularly for retirement savings that often represent the only source of income to support a retiree’s consumption. Therefore investors decide how to draw their expected return E(Rp), as an Annuity or Lump Sum. A customer of modest wealth would be advised toward low risk investments to support an eventual annuity. A customer with relatively high wealth, might choose a lump sum and be advised toward higher risk investments;
448
Maria Luisa Ceprini and John D. C. Little
(vii) Time Horizon. As individuals go through life, they try to plan fairly far ahead. In their twenties this might be a time horizon of ten to fifteen years. As they get older, this might extend further to a designated retirement age. Time horizon is a major determinant in formulating a portfolio strategy, since it includes planning, deciding, and trading phases. As retirement is approached, the time horizon may be further stretched, if investors perceive increased longevity; (viii) Decision process knowledge. Some people approach financial planning with more knowledge than others. Those with less knowledge can rely on their financial advisers. Our decision rules do much of the same task because they represent an accumulation of financial knowledge. Knowing the customer profile (answers from the questionnaire), adviser and customer can go through our GLL–APS model and appropriately diversify the investment portfolio. As anticipated, the level of risk exposure mainly depends on the customer profile, the portfolio diversification, and the volatility of the financial markets. Figure 14.2, adapted from figure 1 of Ceprini (2006), reflects that stocks realize a high return, R, at a high risk, σ, while bonds offer a lower return, R, at a lower risk, σ. Also cash has a lower R but is riskless over one period, because it is considered as a short term money market fund, ignoring inflation risk which is minimal in the short run. The curved line measures only means/standard deviations of stock/ bond mix of a risky portfolio. When cash is added to this portfolio, the set of means/standard deviations is a straight line. Such a line, known as “the mean/standard deviation efficient frontier,” shows the highest mean return for any given standard deviation. The tangency point, A, where the straight and curved lines meet, defines “the best mix of risky assets (stocks and bonds).” Subsequently, at a certain time, say, t, investors decide whether or not to mix into their portfolio a riskless asset to optimize their investment. We conclude that: (1) the efficient frontier, the dashed line containing C, A, D, has the property of the lowest risk (horizontal axis) for the given level of return (vertical axis). The best mix portfolio depends only on the means/standard deviations and occurs at the tangency point, A; (2) a conservative (low risk) investment will combine the risky portfolio of the best mix with cash, shifting down to the left to a point on the efficient frontier around B; (3) a moderate (medium risk) investment would reduce cash, moving to the right up to a point on the efficient frontier around C; finally (4) an aggressive (high risk)
A Working Prototype
449
Expected return (R)
Planning time horizon (t)
Stocks aggressive (high risk)
D1 Stocks/bonds (best mix) Cash conservative (low risk)
A1 A C1
B1
D
C
Bonds moderate (medium risk)
B Risk ( ) Figure 14.2 “The mean-standard deviation efficient frontier” and the flexible strategies for GLL–APS model.
investment will risk moving to the right up to a point on the efficient frontier even higher than the tangency portfolio around D. But none of these should alter the indicated proportion of risky assets. This one-period mean/standard deviation models of Tobin (1958) and Markowitz (1959), the CAPM model (Capital Asset Pricing Model) of Sharpe-Lintner-Mossin, and the robust theorems of Modigliani– Miller, dated late 1960s, well referenced in Merton (1990), suffered from two big limitations, overcome by Merton’s new rules (1990). He replaced “one period” models with a specified time period between successive portfolio revisions and chose more realistic lognormal distributions for asset prices to produce continuous time models. Paradoxically, this produced optimal portfolio rules that are identical in form to the classic static models used above. However, continuous time models also open up new financial tools. Figure 14.2 also shows what happens when a customer decides to use a newly perceived longevity as a lever to expand his/her investment horizon and consider risky assets over a longer time period than in the past.
450
Maria Luisa Ceprini and John D. C. Little
From the figure we can see that longevity could produce a shift, from (A-B-C-D) to (A1-B1-C1-D1), and so achieve a higher expected return for a given risk. 14.4.2 Step 2: Defining investment decision rules for APS structure The wish for a high expected return from an investment is what pushes investors to bear higher risk than they realize. Virtually there is no limit to the expected return from an investment when one is willing to assume enough risk. For this reason, also during very high losses, such as those experienced from financial markets during recessions, there are investors willing to pay no matter what price for a stock because they believe, or at least hope, to sell it at a high profit. The belief that “from this time forward” the market can go up forever is the mistaken idea that builds up speculation bubbles and ultimately tears down financial markets. The savers continue to invest hoping to make high returns with no understanding of the real risks they are taking, especially when they hold a stock of a firm they believe to know but really do not. Therefore, when the “bear market” hits and savings evaporate the risks become apparent and the stock market shows the false promise of unlimited enrichment. The markets reveal that financial security is weak, and, as a result, customer confidence becomes low. This suggests the need for decision rules that put tailored limits on investors. The payouts of an investment strongly depend on: (1) What the level of regulation is; (2) How much money is invested in the portfolio; and (3) How the capital of the portfolio is managed. The first and the last are controversial because, on one hand, confidence and security of the investors rely on a safe and appropriate legislation; on the other hand, strong regulation can impose too many restrictions on the way that managers set up and maintain the portfolio. We suggest that the limit of “How much money should be invested in the portfolio” basically involves two issues: (A) “What is the goal of the customer?” The question leads to: an Annuity in order to maintain a life style, or a Lump Sum to increase wealth? Any answer relies on (B) An appropriate customized portfolio diversification, which means “How much money should be safely and efficiently put into each financial asset?” The question guides the operational decision rule: “Buy a candidate asset if its expected return is equal or greater than the risk free rate plus the risk premium.” The rule requires special calculations: Since the
A Working Prototype
451
standard deviation, σ, of a candidate asset does not, by itself, determine the extent to which it contributes to the overall portfolio risk, we need the covariance matrix, COV, of the candidate asset and the other assets in the portfolio. Basically, our rule focuses on trading off increased return with possible increased risk, taking account that the money invested is efficiently diversified. A portfolio is efficient when the capital invested is highly diversified, ideally consisting of an appropriate portion of the entire financial market, i.e., an equal fraction of all existing assets, stocks, and bonds (Ceprini and Modigliani 1998, 2002). A replica of such a portfolio can be managed cheaply and safely under the supervision of a blue-ribbon committee, immune from any manipulation. Note that the existence of reciprocity agreements gives the portfolio access to cross-sections of foreign markets. The points (A) and (B) in the previous paragraph define the decision rules; while the customer interview answers define the upper/lower bounds to the rules related to the amount of the investment and the level of customer risk. Market volatility is determined by historical market indexes. Using these parameters, adviser and customer decide how to mix, or not, the assets for selecting her/ his portfolio. From the answers to the interview questionnaire, the adviser assesses key attributes, such as Risk Profile, Investment Goals, and Time Horizon, and guides the client, trading-off his/her risk versus expected return over time. We call this process the “Investment Decision Rules.” 14.4.3 Step 3: Selecting Portfolio Investment Classes The uncertainty of the current financial markets and the lack of investor confidence require new flexible strategies for the traditional investment instruments to protect customer return. As a result, the adviser, based on the customer’s profile, creates a database selecting blue chips of major worldwide financial products, over a period of time. Afterwards, a computer subroutine will select assets and number of shares per asset to build the portfolio, according to the customer budget limits. 14.4.4 Step 4: GLL checking portfolio performance consistency On weekly updates, a computer subroutine, using the GLL formulas, checks for portfolio performance consistency, measuring, respectively, the Average Expected Portfolio Return and the Average Asset Systematic Risk, both defined in paragraph 14.5.2 (subsection V).
452
Maria Luisa Ceprini and John D. C. Little
Simply put, a portfolio of financial assets is thought of as a queue. When an asset is sold, it leaves the queue; when it is bought, it joins the queue, and GLL keeps track of the assets and their attributes, as defined by GLL corollaries, and weekly checks portfolio performance consistency, according to the defined conditions. If the actual portfolio return is less than the return expected by the client, and/or the risk is more than that defined by the customer’s profile, the system goes back to the top of the Portfolio in Process (PIP) block, to verify the customer’s return and risk conditions, make changes and reset the portfolio. The procedure ends at a scheduled time and results are reported to the customer. 14.5
GLL-APS model
The GLL-APS model is a financial engineering tool that we created to generate a customized investment portfolio. GLL-APS works customer by customer and is tailored to each customer’s characteristics. To simplify the modeling process of our portfolio system we use the interactive graphical approach, provided by the Simulink program of MathWorks (1994–2014). We start the process by identifying all system components. 14.5.1 Parameters, States, and Signals The system components are: parameters, states, and signals of the model. Simulink parameters and states represent the inner blocks (subsystems) while the signals are the connection lines. Parameters are constant values unless changed by the adviser or client. States are system variables, exogenous and endogenous, that change over time, including asset prices (market values of the first ten blue chips) of fourteen selected classes over an observation period t ∊ [0,T] and an expected return with associated risk by asset. Signals are functional input and output values that change dynamically during the simulation and connect the inner blocks to the outer block diagram embodying the GLL-APS model. Next, we use the above system components, identified by blocks, to formulate the mathematical equations defining the block diagram of the GLL-APS model to generate our portfolio. The results are reviewed weekly by checking risk and performance consistency.
A Working Prototype
453
14.5.2 Equations and Objective Function for Heuristic Optimization To develop mathematically all activities of the GLL-APS model, we break it into seven subsystems, built separately and then linked together to provide the investment portfolio system: I. Customer Profile II. Investment III. Portfolio Assets and Shares Selection IV. Risk-Reward trade-off V. Checking for Portfolio Consistency VI. Portfolio Resetting VII. Macroeconomic Outlook (suspended prototype run)
during
the
current
I. Customer Profile Subsystem Three parameters from the customer define this block: • Tt Time Horizon of the portfolio, possibly stretched by longevity lever according to customer characteristics, where t ∊ [0,T]: (0–4ys) = Short Term (5–9ys) = Medium Term (>10ys) = Long Term • E(Rp) Investment Goal, or expected portfolio return, of the customer, due as: 1 = Annuity 2 = Lump sum • σp Systematic Risk Level of the portfolio (stretched by three points in the case of the longevity lever): (0–23%) = Low (24–38%) = Medium (>38%) = High In order to run and evaluate our working prototype, we processed the data for only one customer profile, among the three listed. The run was from one to eleven weeks. The above parameters enter directly, as selected from the customer’s profile (See table 14.1). The main notation used to develop the GLL-APS model appears in table 14.2.
454
Maria Luisa Ceprini and John D. C. Little
Table 14.1 CUSTOMER Profiles
Customers Characteristics Risk Tolerance
Investment Goal
Financial Knowledge Wealth 1 (property) Wealth 2 (income and liquidity) Time Horizon
1 Female pensioner 80ys old, well educated, medium economic knowledge, pension cash flow, savings, house owner
2 Male worker 55ys old, PHD, high economic knowledge, salary cash flow, savings, house owner
LOW (towards medium leveraging longevity) ANNUITY (increase yearly income to maintain consumption’s expenses protecting wealth for bequest) medium
MEDIUM
(increase wealth to enhance style of living)
LOW (towards Medium leveraging longevity) ANNUITY (increase yearly income to pay house mortgage)
high
high
high
medium
low
medium
medium
medium
SHORT TERM
MEDIUM TERM
MEDIUM TERM
(towards Medium leveraging longevity)
INVESTMENT
INVESTMENT
INVESTMENT
(towards high leveraging longevity)
3 Female worker 40ys old, high economic knowledge, salary cash flow, savings, house rental
LUMP SUM
(towards Long leveraging longevity)
(towards Long leveraging longevity)
Table 14.2 Notation for the GLL-APS Model
βij
Budget ($) of the Portfolio Investment, a1 and a2 are LO and UP limits Investment Goal, or Portfolio Return (Total Dollars), of the customer, who will specify it as Lump sum (1) or Annuity (2) Time Horizon of the portfolio for t, measured in weeks. (11 weeks) Market indexes before fees (S&P500, Dow-Jones etc.) Beta coefficients to measure Asset Systematic Risk
βp
Beta Coefficient to measure Portfolio Systematic Risk
βm
Beta coefficient of the Market Risk
Ba1,a2 Rp Tt IM
A Working Prototype
455
Table 14.2 (continued) σf
Risk-Free rate
σP
Risk Premium rate
σp
Standard Deviation of the Portfolio
σIi
Standard Deviation of the Asset
σb COVpb s 2p a
R2 RSharpe Ir Ge Tben States Aij,t
Aij,t
K
ᅼAij,t
μp μij Pij,t E(Rp)
E ( Rp t ) E (s p t ) Rp
Standard Deviation of the Benchmark Covariance between excess returns of the portfolio Re t and the benchmark Be t Variance of the Portfolio Excess Return Alpha coefficient, risk-adjusted measure (by betas) of the periodic Portfolio Performance compared to Benchmark Index. It corresponds to the intercept of the SCL line Correlation coefficient to measure if the Portfolio’s Movements are well, or not, explained by Benchmark’s Movements Sharpe coefficient of the Risk-Adjusted Performance of each portfolio’s asset Inflation Rate (suspended during prototype run) Economic Growth (suspended during prototype run) Tax benefits (suspended during prototype run)
Financial Assets, Aij, collected under n(t) over [0, T] from different investment classes, measured by their market value, Pij, and selected comparing asset performance indexes with market sector performance A new Asset Subclass put in isolation, marked K according to Corollaries C.1 and C.2 to LL.2, and measured by its LK, λk and WK parameters A new Asset Subclass of special interest (e.g., growth industry sector), marked (ᅼ) according to Corollaries C.3 to LL.2 and measured by Lk, λk, Wk parameters Portfolio Return Rate (%/wk) Asset Return Rate (%/wk) Price, or market value per unit ($/asset share) Expected Portfolio Return (Total Dollar Value), measured by investment estimated value including price changes and any dividends (or payments). If in the portfolio there are risky assets, the E(Rp) will be the σf of return plus a risk premium σP Periodic (weekly) Expected Portfolio Return over a period t ∊ [0,T] measured by GLL while checking for Portfolio Performance Consistency Periodic (weekly) Portfolio Risk in week over a period t ∊ [0,T] measured by GLL while checking for Portfolio Risk Consistency Portfolio Return (Total dollar value)
456
Maria Luisa Ceprini and John D. C. Little
Table 14.2 (continued) Re t
Excess Return of the portfolio at time (t = Rt-Rf), where Rt is the portfolio return at time t, and Rf is the risk-free return at time t Average Weekly Excess Return of the portfolio over t periods (week). It is a simple arithmetic average Excess return of the benchmark at time (t = Bt-Rf), where Bt is the benchmark return at time t and Rf is the risk-free return at time t. Average Indexes Excess Return of the benchmark over t periods (weeks). It is a simple arithmetic average Error term for portfolio at t
Re Be t
Ḃe ep,t Indicators t
Number of Periods (eleven weeks) for t ∊ [0,T] Event (buy or sell a financial asset/unit) Number of shares of blue-chip i of investment class j at t Indexes investment classes Indexes blue chips within investment class
E sij,t i j
II. Investment Subsystem The Objective function of the GLL-APS model is to maximize the Expected Portfolio Return E(μp) of the customer, trading-off Risk σp against Expected Return E(Rp) over time: m
n
T
E ( Rp T ) = ∑ ∑ ∑ ( m ij ,t Pij ,t Sij ,t + eP ,t )
(3)
j =1 i =1 t = 0
Where: E ( Rp T ) is the expected portfolio return (Total Dollar Value), m ij ,t are the assets return rates; Pij,t are the market values ($ prices) of the financial assets collected under t over a period [0,T]; Sij,t are the number of shares by asset, t ∊ [0,T] is the length of time during which assets remain in the portfolio; and ep,t is the error term at time t. The function E ( Rp T ) is subject to the budget limits: Ba1 ≤ B ≤ Ba 2
(4)
Where: Ba1 is the lower limit under which is not convenient to make any investment, and Ba2 is the upper limit (e.g., $50,000) over which the customer cannot invest; Ai = Fourteen Investment Classes (State Bonds, Corporate Bonds, Short Term Money Market Funds, ETFs, Common Stocks, Preferred Stocks, Real Estate Trusts, Mutual Funds, Insurance policies, Swaps, Options, CDs, Derivatives, Future);
A Working Prototype
457
Aj = Ten Assets, blue chips chosen for each investment class; t = Eleven Weeks, observation time period of the assets in the portfolio. III. Portfolio Assets and Shares Selection Subsystem At this point a Simulink subroutine selects the investment classes Ai according to the customer risk level σp. The link is: σp =(0–23%)→Ai (1:5) σp =(24–38%)→Ai (6:7) σp =(>38%)→Ai (8:14) Then, a heuristic MILP (Mixed Integer Linear Program) subroutine defines the number of shares per asset, Sij,t, entering the portfolio, given budget limits, asset prices, and return rates. IV. Risk-Reward Trade-Off Subsystem According to CAPM (Capital Asset Pricing Model), the Efficient Risk-Reward Trade-off Line, the optimal combinations can be achieved by mixing the market portfolio and the risk-free assets σf. There is no reward for taking on systematic risk. An investor is rewarded with a higher E ( Rp ) only for bearing market risk σM: E ( Rp ) = s f + b p
E (s M ) − s f s sM
(5)
When the market portfolio changes, the βp, and hence all βij including kβij and ᅼβij by Corollaries C.1,C.2,C.3 to LL2, regression coefficients tell us how much expected portfolio return E(μp) tends to change; then: E ( Rp ) − s f = b p [ E (s M ) − s f ]
(6)
Beta coefficients, βij, assume an important role in the Security Market Line (SML shown in figure 14.3) to determine if an asset being considered for a portfolio offers a reasonable expected return for risk. The slope of SML is the risk premium on the market portfolio. If the portfolio return, R, is plotted above SML line the portfolio is undervalued; if it is plotted below, it is overvalued. But such a definition is not quite precise considering the asymmetry of the financial markets.
458
Maria Luisa Ceprini and John D. C. Little
E(Rm)
P2 SML P1 (
PM
pM)
P3
( f)
(1.0)
( )
Figure 14.3 The Security Market Line (SML).
Indeed, Fama and French (1992) demonstrated that classic-static beta is an imperfect measure of investment risk because it doesn’t allow investors to differentiate downside risk (loss) from upside risk (gain). The dual-beta measure, making the distinction, helps investors during their investment decision process. As our main goal is to create a working prototype keeping the model reasonably simple, we decided to use the standard beta, and apply the dual-beta measure during the next phase of development. The portfolio’s Beta, βp, is related to the covariance, CovpM, between the excess return of the portfolio Re and the benchmark, Be, over the variance σ2b of the excess return of the benchmark; hence: bp =
COVpb s 2b
(7)
where: COVpb =
1 n ∑ [(Ret − R e )(Bet − B e )] n − 1 t =1
(8)
A Working Prototype
459
1 n e and Re t is the excess return of each asset and Re = ∑ R t is the n − 1 t =1 simple arithmetic average excess return of the portfolio over t 1 n e periods; also Be t is the excess return and B e = ∑ B t is the simple n − 1 t =1 arithmetic average excess return of the benchmark over i periods. The denominator of βp is the variance of the excess returns of the benchmark (similarly we compute the variance, s 2 p, of the portfolio): s 2p =
1 n ∑ (Bet − B e )2 n − 1 t =1
(9)
while the Standard Deviation, s , is the square root of the portfolio variance: s = s 2p
(10)
Then, we calculate the R-squared, R2, ranging from 0, perfectly uncorrelated, to 100 perfectly correlated to the benchmark, to measure the strength of the relationship between the dependent and independent variables. Since correlation is the square-root of R-squared, then, ⎛ COVpb ⎞ R 2 = 100 ⎜ ⎝ s ps b ⎟⎠
2
(11)
A low R2 tells us the movements of the portfolio are not well explained by the movements of the benchmark. Finally, we compute the coefficient Alpha, α, to measure either the portfolio’s performance over t periods (month/week), or asset’s performance, after adjusting for the portfolio (or the asset) systematic risk computed by the Beta, βp, related to the benchmark index. If (α > 0) means that the portfolio has outperformed its benchmark index earning a return in excess of the reward for the risk taken; if (α = 0) the portfolio earned an adequate return for the assumed risk; if (α < 0) the return of the portfolio is too little for the risk borne. Then: (12) a = Re − b p B e where: α is the periodic measure, Re the average periodic excess return of the portfolio, B e is the average periodic excess return of the benchmark index, and βp is the portfolio Beta coefficient. Looking at figure 14.3, α coefficient is the difference between the average return rate on a portfolio and its SML.
460
Maria Luisa Ceprini and John D. C. Little
Last, we compute the Sharpe ratio, RSharpe, to measure the risk-adjusted performance of each of the portfolio's investments:

R_{\mathrm{Sharpe}} = \frac{E(R_p) - \sigma_f}{\sigma}    (13)

where E(Rp) is the expected portfolio return, σf is the risk-free rate of return (or an index), and σ is the standard deviation of the excess return.

V. Checking for Portfolio Consistency Subsystem
As anticipated in Step IV, in this subsystem we run scheduled weekly updates of the investment to check the data for portfolio performance consistency, so as to assure the investment goals of the customer. The GLL block does this in terms of expected return and risk by applying the following formulas:

E(Rp_t) = \sum_{j=1}^{m}\sum_{i=1}^{n}\big(\mu_{ij,t}\,P_{ij,t}\,S_{ij,t} + e_{P,t}\big)    (14)

E(\sigma p_t) = \sum_{j=1}^{m}\sum_{i=1}^{n}\beta_{ij}\,\mu_{ij,t}    (15)

where E(Rpt) and E(σpt) represent the function H in GLL for return and risk, respectively; Pij,t are the average asset prices and βij the average beta coefficients over a period t ∈ [0, T].

VI. Portfolio Resetting Subsystem
At the end of the checking process, any variation in the assets, or in their risk, implies a resetting of the portfolio [the Aij,t, σij,t variation block]. Portfolio resetting is usually scheduled on a quarterly or yearly basis, but it can occur at any reviewing time (a week is the time unit during the current prototype run). The realized (Rpt) and (σpt) are compared with E(Rpt) and E(σpt). If the resetting loop satisfies the consistency criteria of the customer,

Rp_t \ge E(Rp_t)    (16)
\sigma p_t \le E(\sigma p_t)    (17)

the process stops; otherwise it follows the directions of the Aijt Variation Block. Once all block activities are executed, the process displays the statistics and results of the simulation.
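The weekly check of equations (14) through (17) reduces to a few array operations. The sketch below is an illustration only, not the prototype's Simulink logic; the matrices mu, P, S, and betaCoef, the error term eP, and the realized figures Rp_t and sigma_t are made-up placeholders for one review week.

```matlab
% Consistency check, equations (14)-(17): expected return and risk from the
% GLL block versus the realized values at the weekly review.
mu       = [0.010 0.008; 0.006 0.012];   % average asset returns, mu_ij,t
P        = [25.2  10.3 ; 5.4   23.6 ];   % average asset prices, P_ij,t
S        = [400   2000 ; 3000  150  ];   % shares held, S_ij,t
betaCoef = [0.73  1.31 ; 0.81  1.45 ];   % average beta coefficients, beta_ij
eP       = 0;                            % error term e_P,t (ignored here)

ERp    = sum(sum(mu .* P .* S)) + eP;    % expected return, equation (14)
ESigma = sum(sum(betaCoef .* mu));       % expected risk, equation (15)

Rp_t    = 3300;                          % realized portfolio return (made up)
sigma_t = 0.03;                          % realized portfolio risk (made up)

if Rp_t >= ERp && sigma_t <= ESigma      % criteria (16) and (17)
    disp('Portfolio consistent with the customer profile; process stops.');
else
    disp('Consistency criteria violated; route to the Aij Variation Block.');
end
```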
VII. Macroeconomic Outlook Subsystem
The parameters Inflation, Ir, Economic Growth, Ge, by state and by sector, and Tax Benefits, Tben, play key roles in an investment decision. For the first, consider an investment in real assets, for example real estate: the future cash flow from such an investment is likely to rise in nominal value because of inflation, and, if the appropriate adjustments are not made, a valuable investment opportunity could vanish. For the second, any change in economic growth, as well as in the financial market or in the business sector where the assets are issued, requires a portfolio reset, because some assets are likely to increase in value while others may drop. Finally, a tax approach could provide significant benefits to the customer (e.g., negotiation of individual assets versus fund shares, or state/country issues free of federal tax, are examples that apply respectively for C.2 and C.3 to LL.2). This subsystem is suspended during the current prototype run.

14.5.3 GLL-APS Model-Based Design: Run and Results
On the above premises, to replicate our portfolio we use the Simulink Model-Based Design environment, integrated with specific blocks that we built for finance. The interactive graphical environment, shown in figure 14.5, simplifies the modeling of our GLL-APS. Moreover, the approach provides insight into the model's organization and interactions, allowing us to make changes and display results in real time. Finally, the results (table 14.3 and figure 14.4) are validated to check whether the model accurately reproduces all details of an investment portfolio system.

The design breaks the system into blocks and models each block on its own, following the definitions described in section 14.5.2. Each block represents a subsystem, and each is given the same number as its subsystem for a better understanding of figures 14.4 and 14.5 and table 14.3. The Customer Profile, CPhij, and the Asset List, Aij, enter the system, providing data at an interarrival time of one week, as set by the Simulink time advancement, over an observation period of t = eleven weeks, t ∈ [0, T]. Both CPhij and Aij join the queues QCP and QAij, and the timer starts (Ts = T0), where Ts stands for the timer of the GLL-APS model. The system then declares the data sources for the model.
Figure 14.4 Number of assets in Portfolio 1 versus time (n(t) plotted against the time period T, with asset arrivals and departures marked).
Figure 14.5 The GLL–APS Simulink Design.
Table 14.3 Statistics of Portfolio 1 (Based on Three Years Trailing)

Portfolio 1 at time (t=1): Budget $46,762; Fees** = $2,829; E(Rp1,1) = $3,265
Asset       Shares   Price    Alfa α   Beta β   R-squared R²   Std. dev. σp   Mean μ   Sharpe RSharpe
MPA         400      15.87    -1.60    1.36     98.58          6.12           9.37     1.36
BJBGX       150      14.16     0.79    0.97     82.49          2.66           7.31     2.63
RVGIX       3000      5.77     2.68    1.10     71.69          2.29           5.01     2.47
BMPAX.Iw    2000     10.49     1.01    1.08     80.76          2.45           4.84     1.65
Portfolio average return rate before taxes: 6.05%. Portfolio consistency at t=1 (week): E(Rp t)1 = $62.8; E(σp t)1 = 20%.

Portfolio 1 at time (t=7): Budget $51,138; Fees* = $987; E(Rp1,7) = $3,292
Asset       Shares   Price    Alfa α   Beta β   R-squared R²   Std. dev. σp   Mean μ   Sharpe RSharpe
VTMFX       400      25.10     3.57    0.69     92.61          6.21           9.58     1.46
PZA         150      23.52    -1.74    1.43     95.53          6.01           6.43     1.72
RVGIX       3000      5.77     2.68    1.10     71.69          2.29           5.01     2.47
BMPAX.Iw    2000     10.13     1.10    1.28     76.94          2.78           4.54     1.59
Portfolio average return rate before taxes: 6.35%. Portfolio consistency at t=7 (week): E(Rp t)7 = $63.3; (σp t)7 = 23%.

Portfolio 1 at time (t=11): Budget $50,331; Fees* = $0; E(Rp1,11) = $3,325
Asset       Shares   Price    Alfa α   Beta β   R-squared R²   Std. dev. σp   Mean μ   Sharpe RSharpe
VTMFX       400      25.24     3.62    0.73     92.70          6.25           9.64     1.50
PZA         150      23.60    -1.79    1.45     95.55          6.03           6.45     1.75
RVGIX       3000      5.40     2.56    0.81     67.64          2.16           4.95     2.24
BMPAX.Iw    2000     10.25     1.20    1.31     78.91          2.89           4.68     1.63
Portfolio average return rate before taxes: 6.41%. Portfolio consistency at t=11 (week): E(Rp t)11 = $64.0; (σp t)11 = 23%.

*Fees are computed once every three years, at the time an asset joins the queue. Note: For an explanation of the coefficients, see subsystem IV and table 14.2.
Data inputs are provided by two Constant blocks, labeled CPhij [Level of Risk, σh; Time Horizon, Ti; and Goal, Ij] and Aij [Investment Classes by Blue Chip, Aij; Prices of the selected Assets, Pij; and the Budget limit available to the customer, Bh], and by the Aij Variation block whenever a change in the assets occurs. Finally, the Sink block displays the output, the portfolio results, together with their validation.
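Because LL and GLL hold the system together, one face-validity check on a run such as the one summarized in figure 14.4 is that the number-in-portfolio curve n(t) obeys L = λW. The sketch below is illustrative only; the arrival and departure weeks are invented, not taken from the prototype run.

```matlab
% Sanity check in the spirit of figure 14.4 and Little's Law, L = lambda*W:
% the time-average number of assets held should equal the arrival rate of
% assets into the portfolio times their average residence time.
T      = 11;                          % observation window, in weeks
arrive = [0 0 0 0 3 6];               % week each asset joins the portfolio
depart = [3 6 11 11 11 11];           % week it leaves (or end of window)

W      = mean(depart - arrive);                  % average time in portfolio
lambda = numel(arrive) / T;                      % arrival rate (assets/week)
inPort = @(t) sum(arrive <= t & depart > t);     % n(t), assets held at time t
tGrid  = linspace(0, T, 1e4);
L      = mean(arrayfun(inPort, tGrid));          % time-average of n(t)

fprintf('L = %.3f, lambda*W = %.3f\n', L, lambda * W);
```

The same bookkeeping is what the GLL block generalizes when it ties the expected return and risk measures to the function H.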
14.6 Summary Remarks and Steps Ahead
Customers differ in what constitutes appropriate investments for them at a particular point in time. Therefore, a financial adviser administers a questionnaire asking about financially relevant characteristics of the customer. The vector of answers is called the customer's profile and is an input to the process of selecting and maintaining the customer's portfolio. In our example, we constructed the profiles of three real customers, although we report results only for portfolio 1 during the current prototype run. We worked with a set of available investments having fourteen categories with ten blue-chip alternatives in each.

To go from a customer profile and the available investments to a customized portfolio, we devised investment decision rules. Later, as time unfolded, the rules added (or removed) suitable investments for the portfolio. Our rules are based on models and algorithms from finance and financial engineering, for which there is a great deal of experience. From them we derived our GLL-APS model for recommending investment choices for customers. Our goal is a portfolio that adapts to market conditions and meets the needs of the customer. For him or her, we wish to produce understandable, reliable, and sensible results in order to build trust in our system.

Currently, our model is a working prototype, since our database of available investment histories is short and we input to the model only one of the three available customer profiles. The results meet the test of face validity for a quality portfolio for a customer having those particular characteristics. However, our structure is general and permits constructing other examples with more customers and longer databases of investment histories. A next step is to automate data collection and follow several portfolios over time. In the overall system, LL and GLL add precision to the APS structure, and flexibility and consistency to the GLL-APS model, by locking certain performance measures together. Ahead of us are the choice of the price function and the selection of other financial measures with which the adviser or customer may wish to track performance over time.
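The decision rules themselves are documented in the working paper cited in the notes. Purely to illustrate the kind of profile-to-portfolio mapping involved, and not the authors' actual rules, a first allocation step might look like the following sketch, in which the risk categories, thresholds, and weights are hypothetical.

```matlab
% Purely illustrative mapping from a customer profile to target class
% weights; the thresholds and weights are hypothetical, not the GLL-APS rules.
profile.riskLevel  = 'moderate';   % level of risk, sigma_h, from the questionnaire
profile.horizonYrs = 10;           % time horizon, T_i
profile.goal       = 'income';     % investment goal, I_j

switch profile.riskLevel
    case 'conservative', w = struct('bonds', 0.70, 'stocks', 0.20, 'cash', 0.10);
    case 'moderate',     w = struct('bonds', 0.50, 'stocks', 0.45, 'cash', 0.05);
    case 'aggressive',   w = struct('bonds', 0.20, 'stocks', 0.75, 'cash', 0.05);
    otherwise,           error('Unknown risk level.');
end
if profile.horizonYrs < 5          % trim equity exposure near the horizon
    w.stocks = w.stocks - 0.10;
    w.cash   = w.cash   + 0.10;
end
```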
A further step for our model will be to use a large sample of customers who run hypothetical portfolios with extensive time series of historical data in accelerated time. After that would come an arrangement with a licensed financial adviser to run the system in parallel with several volunteer clients as they go about maintaining their portfolios in real time. However, we believe we have the basic structure for a viable, high-quality model for selecting and maintaining a customized portfolio, while offering flexibility and building trust.

Notes
Due to space constraints, more details appear in MIT SWP 5105–14 or at http://ssrn.com/abstract=2469576.
References

Brumelle, S. L. 1971. "On the Relation between Customer and Time Averages in Queues." Journal of Applied Probability 8:508–520.

Ceprini, M. L. 1995. "Modeling Financial Services Quality Management." MIT Sloan School, Finance Research Center, working paper no. 315–95, October 1995.

Ceprini, M. L. 2006. "Managing Investor's Expectations in Pension Funds of the Private Institutions." Economic Review Wirtschaftspolitische Blätter, no. 1, 2006; update of LIUC working paper no. 177/05. http://www.liuc.uni.it.

Ceprini, M. L., and J. D. C. Little. 2014. "Generalized Little's Law and an Asset Picking System to Model an Investment Portfolio: A Working Prototype." MIT Sloan School, working paper no. 5105–14. http://ssrn.com/abstract=2469576.

Ceprini, M. L., and F. Modigliani. 1998. "Social Security Reform." Bank of Rome, Review of the Italian Economy, no. 1, May–June 1998, and Review of Economic Conditions in Italy, no. 2, May–August 1998.

Fama, E. F., and K. R. French. 1992. "The Cross-Section of Expected Stock Returns." Journal of Finance 47 (2):427–465.

Heyman, D. P., and S. Stidham, Jr. 1980. "The Relation between Customer and Time Averages in Queues." Operations Research 28 (4):983–994.

Little, J. D. C. 2011. "Little's Law as Viewed on Its 50th Anniversary." Operations Research 59 (3):536–549.

Markowitz, H. M. 1959. Portfolio Selection: Efficient Diversification of Investments. New York: Wiley.

MathWorks, Inc. 1994–2014. MATLAB & Simulink software.

Merton, R. C. 1990. Continuous-Time Finance. Basil Blackwell.

Modigliani, F., and M. L. Ceprini. 2000. "A Misguided Monetary Policy Bears the Main Responsibility for the European Unemployment." Working paper of the European Parliament (2000); Review of Economic Policy XC, July–August 2000, Confindustria; Palgrave Macmillan economic series, 2003; MIT World Economy Laboratory (WEL) working paper no. 03, 2000.

Modigliani, F., and M. L. Ceprini. 2002. "Capitalization: Privatization or Risk Sharing through a Common Portfolio?" Economic Review Wirtschaftspolitische Blätter, no. 1, 2002; IEIS working paper, 2002, Luxembourg.

Tobin, J. 1958. "Liquidity Preference as Behavior Towards Risk." Review of Economic Studies (25):68–85.
15
Closing Statement
We are all proud to have worked with John and to have experienced his advice throughout the years. John remains an active researcher and teacher. He still advises PhD students, works with students in other programs, provides insights at marketing seminars, and, in general, just contributes to MIT and MIT Sloan. He remains active with his morning bicycle rides, his legendary Thanksgiving Feasts to which he always invites a few MIT students, and, of course, by spending time with his children and his eight grandchildren. He is truly a treasure that we celebrate with this Festschrift. John, it is our honor to have known you.
Contributors
Makoto Abe, Graduate School of Economics, The University of Tokyo
Rene Befurt, Analysis Group
André Bonfrer, Australian National University
Robert Bordley, Booz-Allen-Hamilton
Maria Luisa Ceprini, MIT Sloan School of Management
Peter J. Danaher, Monash University
Xavier Drèze, Anderson School of Management, University of California Los Angeles
Daria Dzyabura, New York University, Stern School of Business
Theodoros Evgeniou, INSEAD
Fred M. Feinberg, Stephen M. Ross School of Business, University of Michigan
John R. Hauser, MIT Sloan School of Management
Kamel Jedidi, Columbia Business School, Columbia University
Laoucine Kerbache, HEC School of Management
Janghyuk Lee, Korea University Business School
Guilherme (Gui) Liberali, Rotterdam School of Management, Erasmus University
John D. C. Little, MIT Sloan School of Management
Erin MacDonald, Mechanical Engineering, Stanford University
Dina Mayzlin, USC Marshall School of Business
Wendy W. Moe, Robert H. Smith School of Business, University of Maryland
Elisa Montaguti, University of Bologna
Ricardo Montoya, University of Chile
Pamela D. Morrison, University of Technology Sydney
Scott A. Neslin, Tuck School of Business, Dartmouth College
Oded Netzer, Columbia Business School, Columbia University
John H. Roberts, University of New South Wales–Australia Business School and London School of Business
Linda Court Salisbury, Carroll School of Management, Boston College
Jiwoong Shin, Yale University
Rajendra Srivastava, Singapore Management University
Olivier Toubia, Columbia Business School, Columbia University
Michael Trusov, Robert H. Smith School of Business, University of Maryland
Glen L. Urban, MIT Sloan School of Management
Sara Valentini, University of Bologna
Masahiko Yamanaka, KSP-SP and Hosei University
Index
Ability and motivation, 290 Acquisitions budgets for, 105 channel of, 307 Customer Equity and, 87–88 expenditures and, 94, 103 optimization of, 103 policy on, 102–109 rate of, 94, 98, 100, 103 recency, frequency, and monetary value model and, 150–152 stream of, 93–94 ADBUDG (Advertising Budget), 11–12 Additive decision rules, 258 Additive parts-worth rules, 264, 274 Additive q-compensatory rules, 264, 274 ADMOD model, 25 Advertising, 54. See also specific types of attribute-focused, 349–350 banner, 24, 30–31 broadcast, 89 CPM, sold on basis of, 29–30 CPM-based Internet, 23–24 display, 23, 239 of Internet, 23–24 non-attribute-focused, 349–350 pricing models for, 23 radio, 24 research streams in, 354 uninformative, 349, 349–395 Age, 307, 447 “Aggregate advertising models: The state of the art” (Little), 12 Aggressive investment, 448–449 Ajinomoto, 161 Alden, E., 5, 15
“A Lifetime is Made of Memories” advertising campaign, 350 Alvin, 15 Amazon.com, 62, 112, 321 American Airlines, 112 American Express, 350 American Marketing Association (AMA), 1, 17 Analyst, 404–406 Android, 222 Annuity, 447, 450 Ansoff Matrix, 66–67, 72–73 Anticipated utility, 403 Anti-Spamming Act, 89 AOL, 24 Application service provider (ASP), 165–167 Arrival rate, 92 Asset picking system (APS), 445–452 Assets best mix of risky, 448 brand-based, 68–69 business, 69 customer, utilization by, 69 intangible, 53, 57 internal, 67 IP, 70 market-based, 52, 55, 57, 59, 67–68 non-market-based, 52 off-balance sheet, 68 return on, 56 technology, 70 turnover of, 56 utilization of, 57 Asymmetric attributes, 375–377 AT&T, 221, 225–226, 229 Attribute case, 377–379
Attribute-focused advertising, 349–350 Audience measures optimization, 39–42 Austin, 410 Automotive experiment test matching morphs, 229–240 AutoTrader, 230 Average asset systematic risk, 451–452 Average customer, 291 Average expected portfolio return, 451–452 Average frequency, 39 Average rating, 333, 337, 341 Banner advertising, 24, 30–31 automotive experiment test matching morphs, 229–240 banner morphing and, 214–220 CNET, 220–229 current practice on, 212–214 future directions of, 241–243 implications of, 241–243 morphing, 211–249 morph-to-segment-matching for, 232 Banners characteristics of, 222–223 click through, 236–238 control, 234–236 morphing of, 212, 214–220 smart phone, 220–221 square, 227 test, 234–236 top-of-page, 227–229 Bargain-priced product, 165 Barnesandnoble.com, 321 Baseline hazard rates, 331 Baseline ratings metrics, 330–333, 337 Baseline variance, 339 Bates, A., 175 Bayesian methods, 142, 172, 184, 195, 243, 261–262, 277 calculations, 219 estimation procedure, 197, 296 inference of segment membership, 212 Bayesian shrinkage estimation, 144 Bayes Theorem, 243–244 Bear market, 450 Behavioral and Policy Science Area (BPS), 14 Behavioral assumptions, 131 Behavioral targeting, 213 Bellman optimality equation, 196 Benchmarks, 264–265
disjunction of conjunctions and, 269–271 Best mix of risky assets, 448 Best mix portfolio, 448–449 Big city dwellers, 307 Bivariate association parameters, 33 Body-type targeting, 232–233, 238 Bonds, 448 Book value, 443 Boston Herald, 2 Boston University School of Engineering, 15 Bottom-up approach, 263 Branch and bound, 6 BRANDAID, 11–12, 171 Brands assets based on, 68–69 considerations for, 238–239 equity in, 52–55, 57, 68 loyalty to, 54 scaling of, 422 Brand-specific product, 422 Broadcast advertising, 89 BT Group, 211 Buck Weaver Award, 1, 18 Budget for acquisition, 105 constraints on, 31–33 development of functions for, 29–33 frequency capping, 30–31 notation for constraints on, 31–32 parameter estimation, 33 share of impressions, 29–30 statement of optimization problem, 32–33 Buick, 255 Burn-in period, 185 Business assets, 69 Business credibility, 50 Business resources, 69 Buying stage, 230, 233 B&W screen, 259 Calibration-sample optimal policy, 201 Calibration study, 217–218, 222, 242 Canon PowerShot, 350–351 Capital Asset Pricing Model (CAPM), 449, 457 Capital One, 350 Carryover effects, 54–57, 63, 73, 170 Cars.com, 230 Case Institute of Technology, 5–6
Case Western Reserve University, 5–7, 9, 14 Category growth, 53 Category management, 162–163 Ceteris paribus, 404 Change in costs, 102 Channels acquisition of, 307 choice models for, 293 incumbent, 300 novel, 300 Charles Parlin Award, 1, 17–18 Chesapeake Bay, 5 Chevrolet, 213, 234 Choice data, 420 Choice model specification, 411–412 Citibank, 112 Click-alternative characteristics, 242 Click characteristic, 217–218 Clickstream, 218–219 Click-through rates, 24, 212–213 banners and, 236–238 brand consideration and, 239 context-matching and, 241 morphing and, 226, 229 per customer, 237–238 preferences and, 223–224 CNET, 211 banner characteristics, 222–223 calibration study, 222 click-through preferences, 223–224 context-matching, 225 latent cognitive-style segments, 223–224, 226–229 smart phone banners, 220–221 CNET.com, 220 Coca-Cola, 69 Cognitively simple continuous DOC, 277 Cognitive simplicity, 260, 261, 262 Cognitive styles, 233–234 Collaborative movements, 162–163 Collaborator equity, 53 Collection segments, 233 Columbia River, 4 Commitment segments, 233 Commoditization, 60 Common infrastructure, 165 Communications, 104, 292 Comparison segments, 233 Complete enumeration, 42–43 Complexity control, 261 ComScore, 34
Conant, D., 8 Conjoint analysis, 242–243 Conjunctive rules, 257–258, 264, 274 Consensus three-star, 341 Conservative investment, 448 Consideration sets, 255–256 Consider-then-choice rule, 277–278 Consumer -generated product reviews, 343 heterogeneity, 375–377 movement, 163 production function, 89 ratings behavior, 321–322 segments, 242 total error and, 404–406 utility, 404–405 Consumer Reports, 255, 351 Consumer Research Power, 236 Contemporaneous marketing effects, 53 Context-matching, 225–226, 241 Context sensitivity, 272 Control banners, 234–236 Coop Sapporo, 163 Correlated models, 141–143, 154 Cost-per-thousand (CPM), 23–24, 29 advertising sold on basis of, 29–30 impressions and, 33 media selection and, 29 in online media, 29 optimal reach and, 38 rate card, 31 realistic, 35 Countersignaling, 354–356 Covariance mix (COV), 451 Covariate data, 420–421 CoverStory, 13 CPM-based Internet advertising, 23–24 Cronbach’s alpha, 245 “Crossing the Chasm” (Moore), 59 Cross-selling efforts, 54 Cummins Engine, Inc., 6 Curse of dimensionality, 5 Customer acquisition rate for, 103 asset utilization by, 69 average, 291 channel choice by, 287–288 channel decision by, 285–286 characteristics of, 134–135 click-through rates per, 237–238 database for, 98 intimacy of, 66
Customer (cont.) margins for, 98 measures, 127–130, 136, 146 platforms for, 69, 71 profit for, 90 retention of, 54 right-channel, 285–286 satisfaction of, 55 solutions for, 71 value of, 102, 134 Customer base, 146–147 Customer Equity (CE), 53–54, 85–87, 89–96, 95. See also Customer Lifetime Value (CLV) acquisition, 87–88 customer base, 146–147 customer flow, 87–88 Customer Lifetime Value and, 95–98 expenditures, 88–89 marketing actions, impact of, 88–89 maximizing, 104–105, 109, 113 retention process, 87–89 segment size, 111–112 valuation of, 98 Customer flow, 87–88 Customer Lifetime Value (CLV), 62, 85–88, 91, 95–96, 95–98, 96. See also Customer Equity (CE); Recency, frequency, and monetary value (RFM) model acquisition policy, 102–109 acquisition rate, 100 change in costs, 102 customer characteristics on, effect of, 135 discount rate, 101–102 elasticity decomposition of, 154–155 elasticity of covariates on, 149–150 expected profit per contract, 99–100 heterogeneity, 109–112 managerial implications of, 112–114 marginal acquisition cost and, 103–104 maximizing, 104–105, 109, 113 optimal periodicity, 96–98 Pareto/NBD-based stochastic model of buyer behavior for, 130, 134 Power Lead System and, 128–129 retention rates, 98–99 segment size and, 111–112 Customer lifetime value, concept of, 127 Customer Relationship Management (CRM), 85
customer lifetime value, concept of, 127 customer measures and, 127 customized marketing message and, 112 managerial decision-making in, 129 promises of, 111–112 Customer’s Category & Market Comparison (CCMC), 167–168 Customized marketing message, 112 Daehong, 35 Dartmouth College, 2 Data/database choice, 420 covariate, 420–421 generation of, 420–421 growth of, 101–102 harvesting, 102 on Internet media selection, 33–37 marketing, 12–13 online product ratings, 325–326 on pharmaceutical detailing and sampling, 178–180 revealed preference, 401 stated preference, 401 value of, 99, 102, 108 Data suppliers, 162 Daum.net, 38 Decision calculus, 11, 21 Decision information models decision process evolution in customer channel choice, 285–314 disjunction of conjunctions, cognitive simplicity, and consideration sets, 255–279 uninformative advertising, 349–395 value of social dynamics in online product ratings, 317–346 variance in assumption in decision research, 399–432 Decision process evolution, 290–293, 296–297, 310–311 in customer channel choice, 285–314 date on, 295–296 estimating, 295–296 hypothesis on, 288–292 managerial implications of, 307–309 methodology on, 292–295 results of, 296–303 robustness checks and, 303–307 taxonomy in categorizing, 287–288 Decision process knowledge, 448
Decision rules additive, 258 consideration sets and, 255–256 non-compensatory, 257–258, 259, 276–277 notation and established, 256–258 q-compensatory, 258 Decline stage, 60 Delano Roosevelt Lake, 4 Deliberative-analytic segment, 227–229 Deliberative stage, 292 Delivery efficiencies, 71 Dell, 30, 69 Demand-side synergies, 69–70 Depletable resources, 87–88 Detailing and sampling. See Pharmaceutical detailing and sampling Deterministic hypothesis, 420 Deterministic scenario, 420, 424–425 Deterministic Scenario parameter values, 422 Deviance information criterion (DIC) value, 186, 297, 305, 337 Diagonal-covariance probit, 407 Differentiation, 71 Diffusion models, 72 Diminishing returns effects, 39 Direct measurement, 212 Disaggregated model, 12–13 Discount multiplier, 101 Discount rates, 98, 101–102 Discrete choice models, 422 Discriminant analysis, 311 Disjunction of conjunctions (DOC), 255–279, 256, 259–261, 264–265 benchmarks and, 269–271 cognitively simple continuous, 277 cognitive simplicity, 260 defined, 259–260 diagnostic summaries of, 276 machine learning approaches to identify, 260–264 Disjunctive rules, 257, 264, 274 Display-type advertising, 23, 239 Distinguished Service Medal, 18 Distribution Economics Institute of Japan, 162 DOCMP, 262–264, 269 LAD-DOC and, 270–271 mathematical programming, 262–263 Do Not Call Registry, 89 Door-to-door agents, 307, 311
Doritos, 410 “Dot com” crash of 2000, 23 Dot-com stock market, 444 Dropout rate, 142–143 Du Pont model in accounting, 57 Dynamic programming (DP), 195–197, 199, 219–220 Earnings to brand equity, 57 ECB, 445 Economies of scale, 69 Edmunds’, 230, 232 Education, 447 Effective reach, 39–42 Efficiency of conversion, 51–52 Efficient frontier, 448–449 Efficient risk-reward trade-off line, 457 Elasticity, 135 of covariates, 149–150 decomposition, 154–155 E-mail-early approach, 308 Email-SPAM environment, 88 Empirical Bayes (EB) model, 141–142 Empirical results, 422 ENPV, 64 Enumeration algorithm, 197 Envelope theorem, 100–101 Equity brand, 52–55, 68 collaborator, 53 Error, 399–400 correlation’s in, 417 mean absolute percent, 138 mean squared, 141 potential specification, 419–425 total, 404–406 variance in, 405 Estimation, 422 Evaluate-all-profiles GPSs, 273 Everyday low price (EDLP) strategy, 163 Existing product-markets, 69 Expected Gittins’ Indices, 216, 220, 241, 245 Expected lifetime, 129, 144 Expected portfolio return, 456–460 Expected profit per contract, 99–100 Expected time spent, 92 Expected utility, 403 Expendable resources, 88 Expenditures acquisition, 94, 103 Customer Equity, 88–89
Experience curves, 69 Experienced utility, 402–405 Experience item, 352–353 Exponential hazard model, 331 Exponential lifetime, 132 Exposure distribution models defined, 26 for Internet, 26–27 MNBD model for, 27–28 Exposure random variable, 26–28, 31–32 Extension, 441–442 External environment, 51–52 Facebook, 349 Facultés Universitaires Catholiques de Mons, Belgium, 18 Fast Track, 14 Federal Highway Administration, 10 Fellow of both INFORMS, 18 FHL, 130–131 behavioral assumptions of, 131 frequency and, 133 frequent shoppers program and, 152–154 independence assumption and, 143 Pareto/NBD-based stochastic model of buyer behavior and, 132–133 recency and, 133 spending per transaction and, 132 Fight or flight option, 60 Finite-mixture, 277 First-order condition (FOC), 97–99, 118–120 First Premier Bank, 350 Fixed costs of communications, 104 Fixed-grid interpolation, 197 Flexibility, 442 Ford Falcon, 8 Forecasting, 57–58 Fort Monroe, 5 Fort Monroe Army hospital, 5 FORTRAN program, 37 Forward-looking dynamic (FL dynamic) policy, 198–203 Fournaise Marketing Group, 50 Free correlation, 417 Frequency, 26, 127–128 of arrival, 328 average, 39 FHL and, 133 maximizing average, 42 overload on, 37
Pareto/NBD-based stochastic model of buyer behavior and, 133 state of, 188, 191 Frequency capping budget, 30–31 negative binomial distribution and, 35–37 optimal reach and, 38–39 statement of optimization and, 32–33 Frequent shoppers program (FSP), 136–137, 152–154 recency, frequency, and monetary value model, 137, 152–155 Freudenthal triangulation, 197 From this time forward belief, 450 Fujifilm FinePix, 350 Full Hierarchical Bayes (HB) framework, 141–142 Full models, 141–144 Full pooling equilibrium, 370–371 “Fully Flexible” allocation method, 33 FUT, 408–409 Future utility, 402–403 Future value, 89 GARCH class of models, 401 Garmin, 275 Gates, B., 350 Gauss-Laguerre quadrature, 412 Gender, 307 General Electric Company, 3, 75 Generalized Little’s Law (GLL), 437–465 General Motors (GM), 98–101, 119–120, 213, 232–234, 255 George E. Kimball Medal, 18 GfK Group, 266, 272 Gilbride-Allenby formulation, 264–265 Gittins’ Index Theorem, 216–217, 220, 229–230, 243, 245 GLL-APS model, 452–464 Global marketing research, 69 Global Positioning Systems (GPSs), 256, 259, 265–268, 273 Global recession, 444–445 Gongos Automotive Panel, 237 Google, 23–24, 219 Google Analytics, 49 Grand Coulee Dam, 4–5 Grand Coulee hydroelectric plant, 4 Greedy algorithm, 25 Greenfield Online, 222 GreenRay, 15
Gross rating points (GRPs), 26, 35, 169–172 Growth category, 53 Guadagni, J., 12 Guadagni, P., 12 Hard information, 349 “Harvest” stage, 62 Hauser, J., 161 Hazard rate model, 63–64 Hershey’s, 410 Heterogeneity, 109–112, 130, 135 Bayesian shrinkage estimation and, 144 consumer, 375–377 observed, 111–112, 412 physicians’, 175–178 product, 320–321, 333 recency, frequency, and monetary value model and, 132–133 unobserved, 110–111 Heteroscedastic Extreme Value (HEV) model, 407, 413, 425–426 Heuristic optimization methods, 25 Hidden Markov model (HMM), 176–178, 185, 195 Hierarchical Bayes (HB) framework, 131–133, 135–136, 138, 242–243, 256, 264–265 Empirical Bayes (EB) vs., 141–142 Full, 141–142 Pareto/NBD-based stochastic model of buyer behavior, 138, 141–142 validity of, 142 High-Low strategy, 163–165 High precision measurement model of advertisement and sales promotion, 168–172 Hosei University, 161 HP, 73 HTC, 221, 225–226 HULB, 211–212, 214, 217–219, 222–224, 233, 241, 243, 246 “I am a PC” ad campaign, 350 IBM, 6, 37 Identification restrictions on HMM, 185 Image-theory rules, 258 IMM, 408–409 Immediate e-mail suppliers, 307 Impressions click-through-rates and, 212–213 cost-per-thousand and, 33
frequency capping and, 30–31, 39 maximizing average frequency and, 42 negative binomial distribution and, 32–33 share of, 29–30 Impulsive-holistic segment, 227–229 IMSL, 37 Inactive state, 188 Incumbent channel, 300 Independence assumption, 142–144 Independence of irrelevant alternatives (IIA), 407 Independent models, 141 Indexability of partially observable Markov decision process, 220 Indirect information, 352–353 Individual-level POMDP, 197 Industrial Operations Research, 6 Infinite-horizon POMDP, 197–199 Information Resources, Inc. (IRI), 16 INFORMS Society of Marketing Science (ISMS), 18 Infrastructure, common, 165 Infrequent state, 191 Innovation, 42 Inputs, 51–52, 54 “Inside the Tornado” (Moore), 59 InSite Marketing Technology, 16–17 Institute for Operations Research and the Management Sciences (INFORMS), 1, 17 In-Store Merchandising (ISM), 164 Intage, 162 Intangible assets, 53, 57 Integrated Marketing Communication (IMC), 168 Intel, 68 Intellectual property, 68 Intercept-shift, 99–100 Internal assets, 67 International Federation of Operational Research Societies’ (IFORS), 18 Internet. See also Internet media selection; Online product ratings advertising exposure distribution models, 26–29 advertising of, 23–24, 25–26 audience exposure, 25 budget constraints for, 33 display advertising, 24–26 exposure distribution models for, 26–27
Internet. See also Internet media selection; Online product ratings (cont.) exposure model, 26 media planning, 29–30 population, 34 usage behavior, 34, 34–35 Internet media selection, 23–44 audience estimation and, 24–26 budget functions, development of, 29–33 data on, 33–37 objective functions, development of, 26–29 results of, 37–43 scheduling for, 24–26 Intuitive-deliberative segment, 233 Intuitive-impulsive segment, 233 Investment. See also Investment portfolio aggressive, 448–449 classes, 456–457 conservative, 448 decision rules, 439 goals, 447, 453 market/marketing, 55–57, 62–63 moderate, 448 payouts on, 450 return on, 50, 60, 66, 109, 129, 199–200 return on marketing, 55–56 Investment Decision Rules, 451 Investment portfolio asset picking system and, 445–452 average expected return, 451–452 best mix, 448–449 expected return, 456–460 global recession and, 444–445 “Invest to build” stage, 62 “Invest to maintain” stage, 62 In vivo field-test, 211–212, 214 IP assets, 70 John D. C. Little Award, 18 “John Little” style, 21 Johns Hopkins University, 15 JPEGs, 274 Kana Corporation, 17 Kao, 163 Keio University, 161 Kelley Blue Book, 213, 230 Kernel estimators, 277 Kimball, G., 6–7 Kodak, 66
KoreanClick, 33–34 KSP-SP, 161, 166–168 Kullback-Leibler divergence (K-L), 269–272, 274, 277 LAGVALENCE, 329, 332, 334–336 LAGVARIANCE, 329, 332, 336 LAGVOLUME, 329, 332, 336 Latent class (LC) model, 186–187 Latent cognitive-style segments, 223–224, 226–229 Law of Motion, 114 Leave-one-out cross-validation, 263 Leveraging market-based assets, 67–68 Lexus, 255 L’Hospital Rule, 117 Licklider, J. C. R., 3 Lifetime, 128–129, 135–136 expected, 129, 144 exponential, 132 Likert scale, 410 Little, J. D., 1–2 Little, J. D. C. biographical demographics of, 1–2 Case Institute of Technology and, 5–6 computers and, 3–4 honors and awards for, 17–18 marketing data and, 12–13 Massachusetts Institute of Technology and, 1, 3, 5–6, 8–9, 15–16, 18 profile of, 1–19 science marketing evolution and, 9–12 as teacher, 13–17 U.S. Army and, 5–9 Little, J. N., 5, 15 Little, M. J., 1–2 Little, R. D. B., 15 Little, S. A., 15 Little, T. D. C., 15 Little’s Law, 7, 9, 92, 110 asset picking model for investment, 437–465 extension, 441–442 flexibility, 442 Generalized (GLL), 437–465 nesting, 442 Operations Research and, 9 standard, 440–441 “Livingon the Fault Line” (Moore), 59–60 Lodish, L., 10
Log Bayes factor (Log BF), 186 Logical analysis of data (LAD-DOC), 263–264, 270–271 Logit, 425–426 Logit model, 12–13 Log-marginal density (LMD), 186 Lognormal spending, 132, 134 Long-term equilibrium strategies, 88 Lovejoy’s method, 197 Loyalty, 12–13, 253 Lump sum, 447, 450 Machine-learning algorithms, 256, 260–264 “Machine Methods of Computation and Numerical Methods” project, 3 Magellen, 275 “Make Every Shot a PowerShot” advertising campaign, 350 Malden public schools, 2 Management category, 162–163 Management Decision Systems, Inc. (MDS), 16 Management Science (MS), 10, 11, 17–18 Management Science Area (MSA), 14 Managerial decision-making, 129 Managerial models for banner advertising morphing, 211–249 for customer lifetime value and customer equity, 85–124 for customer lifetime value from RFM, 127–159 for micromarketing, 161–173 optimal Internet media selection, 23–44 for pharmaceutical detailing and sampling, 175–206 strategic marketing metrics for growth, 49–78 Manufacturers, 162 Manufacturing process skills, 68 Marginal acquisition cost, 103–104 Marginal costs, 103 Marginal explanatory power, 412–413 Margin growth, 53 Market-based assets, 52, 55, 57, 59, 67–68 Marketing Meets Wall Street initiative, 77 “Marketing metrics,” 50
Marketing mix, 52, 58, 63 Marketing Science, 13, 18, 21 Marketing Science Institute, 49, 50 Market/marketing accountability of, 49, 51–52, 56, 64 actions, 54, 88–89, 98, 102 activity, 54–57 bear, 450 behavior rules, 439 capitalization, 50 communications, 292 contemporaneous effects, 53 data, 12–13 development, 53 dynamics of, 58, 69 effort, 64 expansion, 69 expenditures in, 50, 54 exploitation of new opportunities, 72 growth potential in, 58–59 investment, 55–57, 62–63 marginalization in, 49–50 mature, 57–58 measuring performance, 73 metrics of, 51–52 Operations Research in, 6, 14 productivity, 54 resources, 66, 195 share, 54 stationary, 57–58 value, 443 Market metrics, 49, 51–52, 54 Markov Chain Monte Carlo (MCMC), 132, 136–142, 149, 184–186 random-effect parameters and, 198 recency, frequency, and monetary value model and, 137 Markovian first-order process, 180–181 Marlborough Street, 5 Massachusetts Institute of Technology (MIT), 1, 3, 5–6, 8–9, 15–16, 18, 161 Mathematical programming DOCMP, 262–263 MathWorks, 452 Mature markets, 57–58 Maximizing average frequency, 42 Maximizing effective reach, 39–42 Maximum bandwidth (MAXBAND), 9–10 Maximum likelihood estimation, 33 Mean absolute percent errors (MAPE), 138
Mean squared error (MSE), 141 Means/standard deviation efficient frontier, 448 Measurement-based morph assignment, 212 Media. See also Internet media selection online, 29 planning system, 26, 31 selection, 25–26, 29 MEDIAC model, 10, 25 Merton’s new rules, 449 Methods of Operations Research (Morse and Kimball), 6–7 Metrics for baseline ratings, 330–333, 337 Metropolis—Hasting procedure, 185 MH-algorithm, 137 Miacomet Rip, 14 Micromarketers/micromarketing, 161–173 demands for strategies in, 164–163 high precision measurement model of advertisement and sales promotion, 168–172 M3R (Micro Marketing Mix Response model), 168–172 new service development and, 165–168 point of sale in Japan, 161–163 Microsoft, 349–350 Microsoft Excel, 167 Military Operations Research, 5 Minimum reach, 39 MIT Sloan School, 9, 13–14 MIT-Woods Hole Oceanographic Institute, 15 Mixed Integer Linear Program (MILP), 457 MLE, 150 M&M Candies, 6 MNBD model, 27–28 Models. See also specific types of channel choice, 293 Correlated, 141–143, 154 diffusion, 72 disaggregated, 12–13 discrete choice, 422 Empirical Bayes, 141 exponential hazard, 331 Full, 141–144 hazard rate, 63–64 Independent, 141
Internet, 26–29 latent class, 186–187 logit, 12–13 MEDIAC, 10, 25 MNBD, 27–28 Modigliani-Miller, 449 money-burning, 352 multinomial logit, 406, 413 multinomial probit, 407–408, 413, 417–419 negative binomial distribution, 27, 31 one period, 449 pay-per-click pricing, 23–25 Poisson, 328–329 q-compensatory, 258, 274 Sharpe-Lintner-Mossin, 449 VIDEAC, 25 “Models and managers: The concept of a decision calculus” (Little), 11 Moderate investment, 448 Modigliani-Miller model, 449 Molecular, Inc., 15 Monetary value, 127–128 Money-burning models, 352 Monosodium glutamate (MSG), 161 Monte Carlo simulation, 419 Morgan, J., 9 Morphing automotive experiment test matching, 229–240 banner, 212, 214, 214–220 click-through rates and, 226, 229 online, 211–212, 241 website, 211–212 Morph-to-segment-matching, 234–236, 238 for banner advertising, 232 brand consideration and, 239 Morse, P., 3–8, 14 Motivation and ability, 290 M3R (Micro Marketing Mix Response model), 168, 171–172 MSN, 24 Msn.co.kr, 35 Msn.com, 30, 35 Multilinear decision rule, 277 Multinomial logit analyses, 242–243 Multinomial logit (MNL) model, 406, 413 Multinomial probit (MNP) model, 407–408, 413, 417–419
Multiple-audience effects, 322 Multivariate generalization of the negative binomial distribution (MNBD), 27, 42–43 MVN, 137, 421 “My Life. My Card.” campaign, 350 Myopic dynamic policy, 198–200 Myopic static policy, 198–200 Nabisco, 10–11 Nantucket (Island), 8–9, 15 National Academy of Engineering, 18 National Biscuit Company, 10–11 Naver.com, 39 NCONF, 37 Negative binomial distribution (NBD) model, 27, 31 frequency capping and, 35–37 impressions and, 32–33 multivariate generalization of the, 27, 42–43 univariate, parameters of, 33 Negative tolerance, 447 Nested HEV, 425–426 Nested Logit, 425–426 Nesting, 442 Net growth, 50 Net present value (NPV), 103, 122–124 Netratings, 34 New market growth, 69–71 New market opportunity, 69–70 New product development, 67–68 New product growth, 67–68 New product opportunity, 67–68 New service development, 165–168 Nielsen ratings, 34, 162, 229 Nikon Coolpix, 350–351 No-marketing policy, 198–201 Non-attribute-focused advertising, 349–350 Non-compensatory decision rules, 257–258, 259, 276–277 Non-constant stochastic variation, 409–420 potential specification error and, 419–425 robustness and, 425–426 Nonhomogeneous hidden Markov model, 180–185 Non-market-based assets, 52 Notation and established decision rules, 256–258
Novel channel, 300 Number of ratings, 333, 337 Objective functions, development of, 26–29 Observed heterogeneity, 111–112, 412 Ocean Spray Cranberries, 13 Off-balance sheet assets, 68 Office of War Information, 2 One period models, 449 One-state HMM, 194 One-year survival rate, 129 Online ad campaigns, 23–24, 30–31 Online media, 29 Online morphing, 211–212, 241 Online posting behavior, 322 Online product ratings consumer ratings behavior, 321–322 data on, 325–326 decomposing metrics for, 330–332 framework for, 323–325 model for, 327–330 online word of mouth and, 320–321 sales model for, 332–333 social dynamics of, 340–343 Online word of mouth, 320–321 Operational excellence, 66 Operational Research Hall of Fame, 18 Operations Research (OR) development of, 1 as hydroelectric systems, 7 industrial, 6 Little’s Law and, 9 Management Science and, 10 in marketing, 6, 14 methods of, 1 military, 5 pioneers of, 1 queueing theory for, 7 scope of, 9 in United States, 3 Operations Research Center at MIT, 3, 14 Operations Research Society of America (ORSA), 1, 9, 17–18 Optimal acquisition policy, 103–108 Optimal adaptive control, 10 Optimal intercommunication time, 100–103 Optimal periodicity, 96–98 Optimal reach, 38–39, 42–43 Optimal schedule, 39–40, 42–43
Oreo, 11, 410, 421 Out-of-sample predictions, 263 Outputs, 51–52, 55 Page view, 34 Paid search, 24 Parameters, 33, 452 Pareto analysis (ABC analysis), 162 Pareto/NBD-based stochastic model of buyer behavior (SMC), 130–133 Customer Lifetime Value, 130, 134 FHL, 132–133 frequency and, 133 Hierarchical Bayes framework and, 138, 141–142 independence assumption and, 142–144 Independent models vs., 141 recency and, 133 SP and, 132–133 spending and, 136 Partially observable Markov decision process (POMDP), 176, 178, 182 exact solution to, 196–197 formulation of, 195 hidden Markov model and, 195 indexability of, 220 individual-level, 197 infinite-horizon, 197–199 pharmaceutical detailing and sampling, 195–198 value function of, 197 Paul D. Converse Award, 1, 17 Payouts on investments, 450 Pay-per-click pricing model, 23–25 Pentax Optio I-10, 351 Percentage of profits, 57 Performance-based contracts, 49 Performance calibration, 55 Performance metrics, 71–72 P&G, 163 Pharmaceutical detailing and sampling, 175–206 data on, 178–180 empirical results on, 185–195 literature review on, 177–178 nonhomogeneous hidden Markov model and, 180–185 POMDP procedure, 195–198 results optimization and, 198–204 Phillips Academy, 3 Physical Review, 15 Physician-level HMM, 198
Physicians’ heterogeneity, 175–178 Planning support, 164–165 Planters, 410 Point of sale (POS), 161, 161–163 collaborative movements in, 162–163 consumer movement, 163 data suppliers, 162 manufacturers, 162 retailing, 161–162 Poisson distribution, 27, 30, 132 Poisson model, 328–329 Policies calibration-sample optimal, 201 forward-looking dynamic, 198–203 myopic dynamic, 198–200 myopic static, 198–200 no-marketing, 198–201 resource allocation, 176 validation sample optimal, 201 Portal sites, 35 Portfolio in process (PIP), 452 Positive cash flow, 62 Positive tolerance, 447 Post-purchase product evaluations, 324 POSTRATING effect, 339 Post-trial stage, 292 Potential specification error, 419–425 Potential value, 89 Power Lead System (PLS), 128–129, 131 Predictive tests, 268–272 Preference Interaction effect, 422 Pre-purchase product evaluation, 322–323 Prescription behavior, 175–176 Pricing/price advertising models, 23 premiums, 54 promotions, 57 Primary demand, 68 Prior preference in choice, 417–418 Product. See also Product-market; Product-market life cycle (PMLC) assets based on, 68 attributes, 349 bargain-priced, 165 brand-specific, 422 consumer-generated reviews, 343 development, 67 harvesting, 63 heterogeneity, 320–321, 333 innovation, 62, 66 platforms, 68–70, 71
portfolios, 62 regular-priced, 165 warranties, 354 Product life cycle (PLC), 62 “Product Management” (Urban and Hauser), 161 Product-market applications, 64 existing, 69 management, 52–64 opportunities, 63, 67–68, 73–74 portfolio, 73–76 product life stages of existing, 73 rejuvenation, 63 revenues, 73 Product-market innovations, 64–72 Ansoff Matrix, 66–67 new market growth, 69–71 new product growth, 67–68 performance metrics, 71–72 Product-market life cycle (PMLC), 50–52, 57–59 competition in, 58 decline stage of, 60–62, 63–64 environmental factors influencing, 60 growth and defense in, 57–64 innovation in, 60 market-based assets and, building of, 62 market development and, 60 multiple business models for managing stages of, 62–64 prognostic information provided by, 59 unbalanced scorecard, 64 Profit for customer, 90 equation, 53 generation, 101–102 growth, 53 percentage of, 57 persistence, 58–59 volatility, 57 Promotions, 54 Proof-of-concept test, 277 Property, intellectual, 68 Provost, 16 Pseudo-R-square value, 334 Pull effects, 171 Purchase incidence (PI), 165–166, 169–170 Purchase rate, 128–129, 135–136, 142–143 Purchase value, 443
Q-compensatory model, 258, 274 Q-compensatory rules, 264, 274 Quadratic acquisition costs, 104 Quadratic programming method, 37 Quality control, 49 Queueing theory for Operations Research, 7 Queues, Inventory and Maintenance (Morse), 7 Radio advertising, 24 Random-effect parameters, 198 Random utility (RU), 404 Rate card, 31 Rate/rating acquisition, 94, 98, 103 arrival, 92 average, 333, 337, 341 baseline hazard, 331 dropout, 142–143 Nielsen, 34, 162, 229 number of, 333, 337 one-year survival, 129 purchase, 128–129, 135–136, 142–143 retention, 98–99 survival, 144 usage, 54 variance, 333 yearly acquisition, 92 Rational-deliberative segment, 233 Rational-impulsive segment, 233 Ratio of sales, 57 Reach estimation, 33 Reach optimization, 37–39 Real estate bubble, 444 Realistic cost-per-thousand, 35 Realized value, 89 Recency, 127–128 FHL and, 133 Pareto/NBD-based stochastic model of buyer behavior and, 133 Recency, frequency, and monetary value (RFM) model, 127, 186 acquisition, 150–152 analysis using, 146 assumptions about, 132–133 customer characteristics and, 134–135 customer measures of, 127–130, 136, 146 customer specific metrics for, 144–146 elasticities and, 135 empirical analysis of, 137–155
Recency, frequency, and monetary value (RFM) model (cont.) estimation and, 135–136 for existing customers, 142–149 frequent shoppers program, 137, 152–155 general approach to, 130–132 heterogeneity and, 132–133 for individual customers, 132 interpreting, 142–144 mathematical notations about, 133 MCMC procedure and, 137 for perspective customers, 149–152 prior specification and, 136–137 retention program and, 147–149 sales and, 133–134 transactions and, 133–134 validation of, 137–142 Regime shift effect, 178 Regular-priced product, 165 Renewable resources, 87, 113 Rensselaer Polytechnic Institute, 15 Research assistantship (RA), 3–4 Research Priorities 2010–2012, 50 Research streams in advertising, 354 Resource allocation, 73–75 decisions, 57–58 pharmaceutical detailing and sampling, 195–198 policy, 176 Resources business, 69 depletable, 87–88 expendable, 88 market/marketing, 66, 195 renewable, 87, 113 Retailing, 161–162 Retention of customer, 54 probability, 98–99 process, 87–89 rate, 98 rates, 98–99 sensitivity, 98 Return on assets, 56 Return on investment (ROI), 50, 60, 66, 109, 129, 199–200 Return on marketing investment (ROMI), 55–56 Return on sales, 56 Revealed preference data, 401 Revenues per communication, 104
Revenue volatility, 57 Rifle targeting approach, 203 Right-channel customer, 285–286 Risk average asset systematic, 451–452 aversion, 447 management, 58 systematic, 453 tolerance, 447 RMF, 127–128 RMSE criteria, 187 Rules additive parts-worth, 264, 274 additive q-compensatory, 264, 274 conjunctive, 257–258, 264, 274 consider-then-choice, 277–278 disjunctive, 257, 264, 274 image-theory, 258 Merton’s new, 449 multilinear decision, 277 q-compensatory, 264, 274 subset conjunctive, 258, 264, 274 Sales model for online product ratings, 332–333 ratio of, 57 recency, frequency, and monetary value model and, 133–134 return on, 56 Salvage values, 60 Sample shrinkage, 262 Sarmanov method, 27 School of Industrial Management, 9 Search engines, 35 Search items, 353 Security Market Line (SML), 457, 459 Segments -based click-characteristic, 242 collection, 233 commitment, 233 comparison, 233 consumer, 242 deliberative-analytic, 227–229 impulsive-holistic, 227–229 intuitive-deliberative, 233 intuitive-impulsive, 233 latent cognitive-style, 223–224, 226–229 rational-deliberative, 233 rational-impulsive, 233 size, 111–112 Seinfeld, J., 350
Semi-separating HM equilibrium, 372 Sequential (SEQ) conditions, 411, 421–422, 426 7 (donga.com), 38, 42 Seven-Eleven Japan, 161–162 Shannon’s information measure, 269 Shared order-supply systems, 71 Share growth, 53 Share of impressions (SOI), 29–30, 30 budget constraints and, 32–33 ‘Fully Flexible’ allocation method and, 33 Internet media planning based on, 30 Sharpe-Lintner-Mossin model, 449 Sharpe Ratio, 460 Shelving promotional effect, 171 Shimbun, N. K., 162 Short-term allocation, 88 Signals, 452 Simulated-maximum-likelihood, 277 Simulink Model-Based Design, 461 Simultaneous condition, 411 Simultaneous (SIM) conditions, 421–422, 426 Six Sigma management, 49 Smart phone banners, 220–221 Smith College, 2 Snickers, 410 Social dynamics of online product ratings, 340–343 Socially unbiased product evaluations, 318 Sophia University, 161 SP, 130–131 behavioral assumptions of, 131 independence assumption and, 143 Pareto/NBD-based stochastic model of buyer behavior and, 132–133 spending per transaction and, 132 Spending parameters, 142–143 Pareto/NBD-based stochastic model of buyer behavior and, 136 per transaction, 128–129, 132, 135 Spillover effects, 73 Square banners, 227 Standard Deviation, 459 Standard Little’s Law, 440–441 Standard of living, 447 Stanford University, 15 Starbucks, 68 State dependence, 418
Stated preference data, 401 Statement of optimization problem, 32–33 States, 452 Stationary markets, 57–58 Steady state, 93, 108–109 Stochastic inflation, 401–402 Stochastic scenario, 420 Stochastic variation, 409–419 Stock bubbles, 444 Stocks, 448 Strategic marketing metrics for growth, 49–78 product-market innovations, 64–72 product-market management, 52–64 product-market portfolio, 73–76 Street agents, 307 Subset conjunctive rules, 258, 264, 274 SuperMidas, 25 SUP-LINK, 167 Supply-side benefits, 69 Survival rate, 144 Synergistic effect, 170–171 Syracuse University, 15 Systematic risk, 453 Tangency point, 448 Targeting behavioral, 213 body-type, 232–233, 238 Task familiarity, 290–291 Task variations, 272–273 Technological knowledge, 68 Technology assets, 70 Technology platforms, 69 Temporal scaling, 416–417, 422 Temporal stochastic inflation, 402 Tempur-Pedic, 349, 351–352 Test banners, 234–236 The Institute of Management Sciences (TIMS), 1, 17–18 3M, 73, 77 Three-state HMM (full HMM-3), 186–188 Time delay, 402–404 Time horizon, 448, 453 Time-to-market, 42, 62 Time-to-money, 62 Time-to-volume, 42 Tokyu Agency (Macromill), 162 Tolerance, 447 Top-of-page banners, 227–229 Total error, 404–406
Transactions, 133–134 Transferability of market-based assets, 68 Transportable code, 242–243 Traveling salesman problem (TSP), 6 Trial stage, 291–293 Trivariate association parameters, 33 Twitter, 349 Unbalanced scorecard, 64 Uncertainty brand-specific differences in, 416 choice probability distribution and, 402–404 temporal differences in, 413 Undergraduate Program Committee, 13 Uninformative advertising, 349, 349–395 countersignaling hl equilibrium, 366–369 literature review on, 352–356 model for, 356–360 perfect Bayesian equilibrium for, 360–366 U.S. Army, 5–6 U.S. Navy, 3 U.S. pharmaceutical industry, 175 Units, 92 Univariate negative binomial distribution model, 33 Universal product code (UPC), 12–13, 162, 253 University of Liège, Belgium, 18 University of London, 18 University of Massachusetts Amherst, 15 Unobserved heterogeneity, 110–111 Unobserved variability, 400 Unsatisfying experiences, 290–291 Urban, G., 161 URL, 313 Usage rates, 54 “Use of Storage Water in a Hydroelectric System,” 5 “Using Market Information to Identify Opportunities for Profitable Growth,” 50 Utility, 258 anticipated, 403 consumer, 404–405 expected, 403 experienced, 402–405 future, 402–403 maximization, 260 stochastic inflation and, 401–402
Valence, 320, 328, 340–341 Validation log-likelihood for the validation periods, 186–187 Validation sample optimal policy, 201 Validity of Hierarchical Bayes framework, 142 Value book, 443 creation, 67 of customer, 102, 134 customer lifetime, concept of, 127 Deterministic Scenario parameter, 422 deviance information criterion, 186, 297, 305, 337 function, 197 future, 89 iteration, 197 market/marketing, 443 monetary, 127–128 net present, 103, 122–124 potential, 89 pseudo-R-square, 334 purchase, 443 realized, 89 salvage, 60 of social dynamics in online product ratings, 317–346 Value-chain platforms, 71–72 Variance, 320, 337, 340–342 in assumption in decision research, 399–432 baseline, 339 in error, 405 Varied three-star, 341–343 Variety-seeking, 418 Verification support, 164–165 VIDEAC model, 25 Volatility in sales and profits, 57 Volume, 320 Von Hippel, A., 5 Voo Doo, 3 Wadsworth, G., 3 Wald-type t-tests, 422 Washington State, 4 Wealth, 447 WebFares, 112 Website costs of, 35–37 description of, 34–35 morphing, 211–212
“What’s in Your Wallet?” campaign, 350 Whirlwind, 3, 5, 13 WinBUGS, 333 Windows Vista, 349–350 World War I, 2 World War II, 1–3, 444 X86 design architecture, 68 X-pert, 25 Yahoo!, 23–24, 39, 213 Yamanaka, M, 161 Yearly acquisition rate, 92