Springer Proceedings in Complexity
Springer Proceedings in Complexity publishes proceedings from scholarly meetings on all topics relating to the interdisciplinary studies of complex systems science. Springer welcomes book ideas from authors. The series is indexed in Scopus. Proposals must include the following:

− name, place and date of the scientific meeting
− a link to the committees (local organization, international advisors etc.)
− scientific description of the meeting
− list of invited/plenary speakers
− an estimate of the planned proceedings book parameters (number of pages/articles, requested number of bulk copies, submission deadline)

Submit your proposals to: [email protected]
More information about this series at http://www.springer.com/series/11637
Ted Carmichael • Zining Yang Editors
Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas
Editors

Ted Carmichael
Department of Software and Information Systems
University of North Carolina at Charlotte
Charlotte, NC, USA

Zining Yang
Department of International Studies
Claremont Graduate University
Claremont, CA, USA
ISSN 2213-8684          ISSN 2213-8692 (electronic)
Springer Proceedings in Complexity
ISBN 978-3-030-35901-0          ISBN 978-3-030-35902-7 (eBook)
https://doi.org/10.1007/978-3-030-35902-7

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
About the Book
Computational Social Science (CSS) is the science that investigates social and behavioral dynamics in both nature and society, through computer simulation, network analysis, and the science of complex systems. The Computational Social Science Society of the Americas (CSSSA) is a professional society that aims to advance the field of CSS in all its areas, from fundamental principles to real-world applications, by holding conferences and workshops, promoting standards of scientific excellence in research and teaching, and publishing novel research findings.

This volume comprises a selection of the latest research into CSS methods, uses, and results, as presented at the 2018 annual conference of the CSSSA, held October 25–28, 2018, at the Drury Plaza Hotel in Santa Fe, New Mexico. What follows is a diverse representation of new approaches and research findings, using the tools of CSS and Agent-Based Modeling (ABM) to explore complex phenomena across many different domains. As such, readers will not only have the methods and results of these specific projects on which to build, but will also gain a greater appreciation for the broad scope of CSS, along with a wealth of case-study examples that can serve as meaningful exemplars for new research projects and activities.

This book, we hope, will appeal to researchers and students working in the social sciences, broadly defined, who aim to better understand and apply the concepts of complex adaptive systems to their work.
Contents
Institutional Emergence and the Persistence of Inequality in Hamilton, ON 1851–1861 . . . . 1
Milton J. Friesen and Srikanth P. Mudigonda

Exposing Bot Activity with PARAFAC Tensor Decompositions . . . . 25
Peter A. Chew

To Share or Not to Share: Effect of Peer-to-Peer Mentoring on Dynamics of Graduate Life . . . . 39
Raafat M. Zaini and Saeed P. Langarudi

How to Code Algorithms to Favor Public Goods Over Private Ones . . . . 55
Hugues Bersini

Islamic Extremism and the Crystallization of Norms: An Agent-Based Model of Prison Radicalization . . . . 67
Ryan J. Roberts and Andrew Collins

An Agent-Based Model for Simulating Land Degradation and Food Shortage in North Korea . . . . 83
Yoosoon An and Soo Jin Park

Alleviating Traffic Congestion by the Strategy of Modal Shift from Private Cars to Public Transports: A Case of Dhaka City, Bangladesh . . . . 101
Md Mamunur Rahman, Jinat Jahan, and Yuan Zhou

Segregation-Sensitivity and Threshold-Dependency in Residential Segregation . . . . 117
Sushma and S. Choudhury

Revisiting Markov Models of Intragenerational Social Mobility . . . . 133
Rajesh Venkatachalapathy

NetLogo Meets Discrete Event Simulation . . . . 145
Emmet Beeker III and Matthew Koehler

A Popularity-Based Model of the Diffusion of Innovation on GitHub . . . . 165
Abduljaleel Al-Rubaye and Gita Sukthankar

What Can Honeybees Tell Us About Social Learning? . . . . 179
Robin Clark and Steven O. Kimbrough

Understanding the Impact of Farmer Autonomy on Transportation Collaboration Using Agent-Based Modeling . . . . 201
Andrew J. Collins and Caroline C. Krejci

Modeling Schools' Capacity for Lasting Change: A Causal Loop and Simulation-Based Approach . . . . 215
Roxanne A. Moore and Michael Helms

Model Structure of Agent-Based Artificial System for Reproducing the Emergence of Bullying Phenomenon . . . . 229
Shigeaki Ogibayashi and Kazuya Shinagawa

Predictors of Rooftop Solar Adoption in Rural Virginia . . . . 251
Aparna Gupta, Zhihao Hu, Achla Marathe, Samarth Swarup, and Anil Vullikanti

Evaluation of Simulated Agent-Based Computational Models of Civil Violence with the Boko Haram Conflict . . . . 265
Olufemi N. Oloba

Information Transmission in Randomly Generated Social Networks . . . . 283
Jeff Graham
About the Editors
Ted Carmichael is Senior Research Scientist at TutorGen, a Carnegie Mellon start-up in the education technology space, and an Affiliated Assistant Research Professor in the Department of Software and Information Systems at the University of North Carolina at Charlotte (UNC Charlotte). Dr. Carmichael is currently serving as Vice President of the Computational Social Science Society of the Americas (CSSSA). He received his PhD in Computer Science, along with a certificate in Cognitive Science, in 2010 from UNC Charlotte. His primary research interests include modeling and simulation of complex systems, Educational Data Mining, and Intelligent Tutoring Systems, and he has published in a wide variety of fields, such as Computer Science, Economics, Biology, Sociology, Ecology, and Political Science. Dr. Carmichael has served as PI or Co-PI on multiple grants, including from the National Science Foundation, the US Department of Education, and the Kentucky Science and Engineering Foundation. His dissertation won the Distinguished Dissertation Award for 2010 at UNC Charlotte.

Zining Yang is Data Science Advisor at Southern California Edison. She also serves as Clinical Professor at Claremont Graduate University and as Associate Director of the TransResearch Consortium. She sits on the Board of the Computational Social Science Society of the Americas (CSSSA) and serves as a Scientific Advisory Board member for Human Factors and Simulations. Dr. Yang received her PhD in Computational and Applied Mathematics and Political Economy from Claremont Graduate University in 2015. Her research interests include data analytics, Machine Learning, Modeling and Simulation, Complex Adaptive Systems, Agent-Based Models, and network analysis. Dr. Yang has numerous publications in the fields of Computer Science, Economics, Public Policy, and Political Science. She has been recognized as an outstanding researcher by the government, has worked on a National Science Foundation-sponsored project, and has won multiple awards from various organizations, including the Ministry of Education of the People's Republic of China; International Social Computing, Behavioral Modeling and Prediction; and the International Institute of Informatics and Systemics.
Institutional Emergence and the Persistence of Inequality in Hamilton, ON 1851–1861 Milton J. Friesen and Srikanth P. Mudigonda
Abstract Economic inequality in urban settings is a readily observable phenomenon in contemporary cities, but historical research reflects that the problem is not new. In this paper, we argue that there are citizen-level interactions and arrangements that contribute to the stability of a small group of wealthy citizens alongside a high degree of transience in the poorer and more populous part of the city. We developed an agent-based model that drew on the dynamics revealed in a study of Hamilton, Ontario (1851–1861) by Michael Katz (1977). Our central hypothesis was that the wealthy developed, and had access to, institutional resources that buffered negative externalities, whereas the poor did not. Early results suggest that the presence of an entity that can pool individual agent resources contributes to sustained inequality, expressed as a Gini coefficient. Our model is based on Epstein and Axtell's sugarscape models (1996), in which a probability function led to proto-institutions that emerged from inter-agent interactions and thereafter produced greater stability, higher levels of wealth (sugar), and longer lifespans for their participating agents. The inverse was true for agents who did not have institutional access.

Keywords Economic inequality · Agent based model · Cooperation · Urban · Hamilton · Institutional formation · Collective behavior · Emergence · Computational social science
M. J. Friesen
School of Planning, University of Waterloo, Waterloo, ON, Canada
e-mail: [email protected]

S. P. Mudigonda
School for Professional Studies, Saint Louis University, St. Louis, MO, USA
e-mail: [email protected]

© Springer Nature Switzerland AG 2020
T. Carmichael, Z. Yang (eds.), Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas, Springer Proceedings in Complexity, https://doi.org/10.1007/978-3-030-35902-7_1
1 Introduction

Our study is motivated by the question: "What is the minimal set of rules required to model the emergence of inequality in wealth distribution?" Ultimately, we seek to create a model that replicates the key aspects of the economy of Hamilton, ON, during the period 1850–1860. In the present study, we report on the preliminary models we have built and explain the degree to which these models have generated data indicative of the emergence of inequality in the distribution of wealth. We have used the sugarscape model [1] as a basis for our models. Building on the sugarscape approach to modelling the emergence of economies, we have added the possibility of proximal agents forming what we call proto-financial institutions, which are akin to present-day credit unions, to understand the role they can play in the emergence of a local (sugarscape-based) economy.

This paper is organized as follows. First, we present a review of literature pertinent to our subject of interest: the emergence of inequality in wealth distribution. Then, we describe our interest in the historical context, Hamilton, ON in the 1850s, which serves as the economy that we ultimately intend to model. This section is followed by a discussion, drawn from a survey of pertinent literature, of the role that financial institutions are said to play in the emergence of economies. In this section, we also make a case for the notion of a proto-institution and how introducing it into the sugarscape model may be useful in generating dynamics similar to those of real-world local economies. Then, we present our research questions and discuss their relevance in the context of the goal of our project. This section is followed by a description of the various components of our agent-based model. After this section, we present our results and interpret them in the context of our goal. We then present a discussion of the limitations of simulations in general, and those of our models in particular.
2 Literature Review

There are vast disciplinary literatures and wide-ranging scholarly and practitioner organizations focused on common resource dynamics, where the actions of economic agents competing for a common pool of resources are studied to understand how these lead to higher-order effects in local economies. These dynamics are at the root of persistent inequality [2]. It also appears that inequality is not only persistent but may also be increasing [3]. Theoretically detailed and sustained explorations of this subject matter are considerable, from both historical and contemporary vantage points, as even a passing review attests [4, 5]. These explorations range from examinations of inequality in China, with particular focus on urban contexts [6], to the widening economic gap in Chinese society [7]. To a large extent these changes arise from the social and structural fragmentation that has occurred as a result of market forces in China [8].
Several other studies of the emergence of wealth inequality can be found in the extant literature. For example, in Chile, there appears to be significant and growing socioeconomic disparity in the city of Santiago when examined at neighbourhood-scale resolution [9], and the institutional aspects of this disparity are captured, in part, by research indicating the importance of social participation in addressing equity questions in urban settings in a variety of Latin American cities [10]. Uneven development (formalisation of resources and institutions) in the massive urban region of Mexico City also appears to be driving greater inequality [11]. In the Russian city of Tomsk, the interactions of government policy, urban development patterns and citizen engagement were identified as active drivers of inequality [12]. Anthropological studies of iron age settlements found that inequality has long been a feature of the development of human social structures [13]. Despite these and many other examples of scholarly inquiry, inequality drivers are still not well understood in specific contexts, particularly the linkages between governance, local conditions and social factors [14]. These gaps in knowledge provide room for ongoing examination and experimentation via modelling.

The social inequality present in the post-WWII dynamics of Detroit was found to have depended on conditions that were present well before the war. These include access to work through union patterns, housing choices, and institutional affiliations (union/workplace may be primary), but it is a very complex story, with some of the blame for Detroit's deep depopulation and collapse lying at the feet of big government and big unions [15]. Inequality is a common and persistent feature of urban sociology [16].

Given the above-described history of the study of the dynamics that lead to wealth disparity, we devised our models not as an attempt to provide a simplistic answer to these many facets of inequality or to propose a comprehensive treatment that would narrow the gap in practice. What we offer is an exploration of one aspect of inequality: the role that institutions, which from a modelling perspective can be seen as a key piece of the overall puzzle, play as they interact with individual choices and opportunities present in a local economy. Collective effort will be essential to any inequality strategy [17]. The effort, at every point, is reductive as an exploratory technique. It would, however, be irresponsible to treat it as a model for intervention or policy strategy. Our first, and central, order of business is to interrogate more fully the basic mechanism of agents' resource acquisition and the role that pooling of resources, via institutions, may play in the dynamics of inequality.
3 Outlines of Model Context and Development

This model is designed to better understand, at the most basic agent-interaction level, what leads to persistent socio-economic inequality in Hamilton, Ontario, as observed by Michael B. Katz in "The People of Hamilton, Canada West: Family and Class
in a Mid-Nineteenth-Century City" [18]. Our model does not attempt to simulate that population or replicate the particulars of the city geography [19] but rather to explore the role of elemental agent-level interactions in producing and sustaining inequality [20, 21]. The historic dynamics of inequality persist to the present and remain a substantial social, political and ecological challenge [22].

Katz examined assessment rolls, city directories and other documents from 1851 to 1861 and discovered two key socio-structural dynamics at play in the city of Hamilton: inequality and population transience. In particular, the records indicate that the members of the wealthy, property-owning class […]

[…] 0, or sets alive = FALSE and frees the corresponding sugarscape location
Table 2 (continued)

perform_institution_actions: Performs agent-agent institution formation actions. Agents that are within a distance of 1 form a new institution if they do not already belong to one. If, in a pair of proximal (distance = 1) agents, one is a member and the other is not, then the non-member joins the member's institution. If both agents are already members of an institution (same or different), then no action is taken. An agent can be a member of at most one institution.

runsim: Sets up the simulation environment and runs one replication of it, for all the parameter combinations present in the appropriate input file.
• records: uses a ledger to record the IDs of the two original agents and of any other agents that are proximate and have excess sugar.
• dispenses: the sugar reserve the proto-institution holds is given, upon request, to member-agents whose sugar levels would otherwise be fatal; dispensing is feasible only when the amount of sugar sought is less than the current level of sugar held by the proto-institution.
• static: proto-institutions do not move once formed, but can die if their sugar is used up, in which case member agents become independent, that is, they are no longer members of the respective proto-institutions.
• remote: member agents can top up and deposit sugar without traveling to the proto-institution.
• exclusive: each agent can be a member of at most one proto-institution; member agents cannot form additional proto-institutions if they are already members of one.
• membership growth: a new agent who is currently not affiliated with a proto-institution, has sufficient geographical proximity to one or more member agents, and has a sufficient sugar-level can join the proto-institution by depositing the minimum required amount of sugar.

When these dynamics are written in code, the following functions and class attributes manifest the actions representing them (Table 3):
Table 3 Proto-institution code elements used in R

Institution: Represents the R6 institution class, with class attributes and methods that encapsulate the data and behavior of an institution object.
institution.id: Unique identifier of an institution object.
sugarlevel: Current aggregated sugar level within an institution object.
v.member.ids: A vector representing the IDs of all agents that are members of the institution object.
alive: Status indicator (active/inactive) representing whether an institution object has a non-zero amount of sugar.
transaction.log: A list of all deposit and withdrawal transactions performed on an institution object, accumulated across all the time periods in a given simulation run. These are held even after an institution goes inactive. Presently, we do not have any methods for analyzing these transactions; we can only fetch them (but no code does this currently).
deposit(): Method to deposit sugar, recording which agent deposited, in which time period, and the amount of sugar deposited.
can_withdraw(): Method returning TRUE/FALSE, indicating whether the requested amount of sugar can be withdrawn.
withdraw(): Method recording the withdrawal of a specified amount of sugar, capturing the time period, the agent making the withdrawal, and the amount of sugar.
get_ledger(): Returns a list of all transactions so far.
append_agent_id(): Adds a new agent's ID to the vector of member agent IDs.
get_member_agent_ids(): Method returning a vector of all member agents' IDs.
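To make Table 3 concrete, the following is a minimal sketch of the institution class using the R6 package, with the attribute and method names from the table. The method bodies are our illustrative reconstructions, not the published code (which is available in the repository cited in Sect. 15).

```r
# Minimal sketch of the Institution class from Table 3 (method bodies assumed).
library(R6)

Institution <- R6Class("Institution",
  public = list(
    institution.id  = NULL,  # unique identifier
    sugarlevel      = 0,     # pooled sugar held by the institution
    v.member.ids    = NULL,  # IDs of member agents
    alive           = TRUE,  # active while sugarlevel > 0
    transaction.log = NULL,  # every deposit/withdrawal ever made

    initialize = function(id) {
      self$institution.id  <- id
      self$v.member.ids    <- c()
      self$transaction.log <- list()
    },

    deposit = function(agent.id, t, amount) {
      self$sugarlevel <- self$sugarlevel + amount
      self$transaction.log[[length(self$transaction.log) + 1]] <-
        list(type = "deposit", agent = agent.id, period = t, amount = amount)
      invisible(self)
    },

    can_withdraw = function(amount) {
      amount <= self$sugarlevel
    },

    withdraw = function(agent.id, t, amount) {
      stopifnot(self$can_withdraw(amount))
      self$sugarlevel <- self$sugarlevel - amount
      self$transaction.log[[length(self$transaction.log) + 1]] <-
        list(type = "withdraw", agent = agent.id, period = t, amount = amount)
      if (self$sugarlevel <= 0) self$alive <- FALSE  # institution dies when drained
      amount
    },

    get_ledger = function() self$transaction.log,

    append_agent_id = function(agent.id) {
      self$v.member.ids <- union(self$v.member.ids, agent.id)
      invisible(self)
    },

    get_member_agent_ids = function() self$v.member.ids
  )
)
```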
8 Adjusting Model-Level Parameters

In addition to the sugarscape-specific and agent-specific parameters, there are three other parameters that determine the size of the agent population during each time-step: (a) agent-departure-rate: this determines what proportion of the agent population emigrates from the sugarscape; (b) agent-arrival-rate: this determines the number of new agents added to the sugarscape, as a proportion of the existing sugarscape's agent population, via immigration; (c) agent-birth-rate: this determines the number of new agents that are born during each time-step, as a proportion of the existing sugarscape's agent population.

Note that since the above aspects of our model are not agent-specific and are not emergent, they represent the "equation-based" aspect of our model. Thus, our modelling approach is described as "hybrid": a combination of equation-based
Table 4 Model parameters and output variables used in R

side: The side of the sugarscape; no. of cells in the sugarscape = side × side.
carryingcapacity: The carrying capacity of sugar within each cell in the sugarscape; this is the maximum amount to which sugar will grow in each cell.
regenrate: Rate of regeneration of sugar in each cell per time period.
agentdensity: Determines the total number of agents that will be added to the sugarscape initially; no. of initial agents = side × side × agentdensity (rounded to the nearest integer).
initialsuglevel: Initial sugar level within each agent.
metabolrangemax: Range of metabolism values; each agent's metabolism falls within this range and is determined through a random draw when the agent object is created.
visionrangemax: Range of vision values (how far in the NSEW directions an agent can see sugar in cells on the sugarscape); each agent's vision falls within this range and is determined through a random draw when the agent object is created.
birthrate: No. of agents, as a proportion of the current agent population, that are added to the sugarscape at each time step.
inboundrate: No. of agents arriving into the sugarscape during each time step, via inbound migration, as a proportion of the current population.
outboundrate: No. of agents leaving the sugarscape during each time step via death.
threshold: The threshold of sugar above which an agent can make a deposit to an institution.
period_X: Gini coefficient of the sugar levels of all agents that are alive during each time step.
inst_period_X: Gini coefficient of the sugar levels of all institutions that are active during each time step.
and agent-based modelling approaches. The specific model parameters and output variables are listed in Table 4 above.

Next, we describe the "dynamic" part of our models: the rules that govern the behaviors of the various objects animated in our models.
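The three model-level rates translate into a simple per-time-step head-count calculation. A minimal sketch follows; the function name and the rounding convention are our assumptions, not the published code.

```r
# Hypothetical per-time-step population bookkeeping implied by the three
# model-level rates (birth, arrival, departure), each a proportion of the
# current population.
update_population_counts <- function(n.agents, birthrate, inboundrate, outboundrate) {
  n.births     <- round(n.agents * birthrate)    # new agents born this step
  n.arrivals   <- round(n.agents * inboundrate)  # immigrants added this step
  n.departures <- round(n.agents * outboundrate) # agents leaving this step
  list(births = n.births, arrivals = n.arrivals, departures = n.departures,
       next.population = n.agents + n.births + n.arrivals - n.departures)
}

# Example: a population of 400 with 2% births, 3% arrivals, 2.5% departures.
update_population_counts(400, birthrate = 0.02, inboundrate = 0.03, outboundrate = 0.025)
```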
9 Running the Model

The number of agents populating the sugarscape at the beginning of a simulation-run is defined by a model parameter, agentdensity. This number indicates the proportion of sugarscape cells that will have agents. Each agent is initialized with a unique agent-id, which it retains until it departs via death; a previously-assigned ID will not be re-used. Consistent with the Epstein and Axtell model, our agents have vision, which is an agent-level parameter that indicates how many adjacent cells
(in the vertical and horizontal, but not diagonal, directions) an agent can scan when determining a potential future location to move to in each time-step. Agents also have a metabolic rate, an agent-level parameter that determines how much sugar they need to consume at each time-step in order to continue to live. At its creation, each agent has an initial level of sugar-load assigned; this is the same across all agents at the time of their creation (which can be at the start of the simulation-run, or whenever a new agent is "born" during the simulation-run).

Each agent's initial location (whether it is born at the beginning of a simulation-run or during one of the later time-steps) is a randomly-chosen cell in the sugarscape that does not have another agent in it during that time-step. In addition, an agent maintains a counter of the number of successive time-steps that have elapsed without it being able to find a suitable sugar-rich location.

During each time step, the following actions occur, in the specific order in which they are presented:

1. Agents Added: New agents are added to the sugarscape via birth and immigration.

2. Agents Die: All agents that are dead (defined as having run out of sugar during the previous time-step, in time-steps beyond the first) are removed from the agent population, and their sugarscape locations are marked as unoccupied.

3. Agents Move in the Landscape:
– The remaining agents move to a new location in the sugarscape if they determine that the sugar-level in their present location is below the amount they would consume.
– The new location is any cell, within their range of vision, in the north-south-east-west directions, that has the highest amount of sugar and exceeds the agent's current location's sugar-level. If no such location can be found, the agent remains in its current location and consumes sugar from the load it is carrying, by an amount equal to its metabolic rate. If the agent succeeds in moving to a new location, it consumes the sugar it needs (defined by its metabolic rate) and replenishes its sugar-load to its maximum carrying capacity by extracting sugar from its new location on the sugarscape. If an agent fails to find a suitable location to which it can relocate, the agent increments its internal tally of the number of successive time-steps during which it could not locate sugar. (A sketch of this movement rule appears at the end of this section.)

4. Agents Form Proto-Institutions:
– When two agents share a bordering cell and each agent meets the sugar threshold needed for forming a proto-institution, they form a proto-institution in which they each become members. The proto-institution has its own specific behaviours. The 'institutional bud' holds a proportion of each member agent's sugar. This reserve is held in common and can be used by a member agent, regardless of the agent's current physical location, during a future time-step. This means that any member agent can draw down the reserve if its own sugar nears the death threshold. Although any agent could be allowed to form a proto-institution simply
by being proximate to another agent, adding a minimum sugar-level threshold reflects the historical observation that the very poor appear not to have formed institutions in this way. We were interested in determining whether there is a threshold value that impacts the formation, persistence or demise of proto-institutions within the landscape. Proto-institutions persist as long as they have sufficient levels of sugar. The threshold can be seen as representing the minimum infrastructural investment needed to create a financial institution in the real world.
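As a concrete illustration of the movement rule in step 3 above, the following minimal R sketch scans the four cardinal directions out to an agent's vision and returns the best destination. The function name, the representation of the landscape as a `sugars` matrix, and the omission of occupancy checks are our assumptions; the published code is in the repository cited in Sect. 15.

```r
# Sketch of the step-3 movement rule: look up to `vision` cells N/S/E/W and
# return the coordinates of the richest visible cell, or the current location
# if no visible cell beats it.
choose_destination <- function(sugars, row, col, vision) {
  side <- nrow(sugars)
  best <- c(row, col)
  best.sugar <- sugars[row, col]
  for (d in 1:vision) {
    candidates <- list(c(row - d, col), c(row + d, col),
                       c(row, col - d), c(row, col + d))
    for (p in candidates) {
      # skip cells that fall outside the sugarscape
      if (p[1] >= 1 && p[1] <= side && p[2] >= 1 && p[2] <= side &&
          sugars[p[1], p[2]] > best.sugar) {
        best <- p
        best.sugar <- sugars[p[1], p[2]]
      }
    }
  }
  best
}
```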
10 Monitoring Variables: Simulation-Level

At the beginning of each time-step, the size of the agent population, agents' specific sugar-levels, and their specific waiting-to-find-suitable-location times are recorded, along with the sugar-levels within each cell of the sugarscape.

Sugarscape models capture the equity (or inequity) of the distribution of wealth by plotting a Lorenz curve: the population is ranked by wealth, and the cumulative percentage of wealth is plotted against the cumulative percentage of the population (e.g. 30% of the wealth is owned by 50% of the population). The scales on both axes range from 0% to 100%. Our model uses a Gini coefficient calculation to summarize this distribution. Visual versions could include a Lorenz curve graph, computed for the population of agents that are alive during a current time-step in the simulation; as such, the number of such curves plotted would equal the number of time-steps in our simulation. On a Lorenz curve, the 45-degree line reflects perfect equality: 30% of the people hold 30% of the wealth, 90% hold 90% of the wealth, and so on. In an extreme case, one person (1%) could have all the money (100%), in which case the curve would look like a backwards "L" and the Gini value would be 1. If all wealth is evenly distributed, the Gini value is 0. Thus, the bigger the value of the Gini coefficient, the more unequal the distribution of wealth.
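The Gini coefficient recorded at each time-step reduces to a one-line computation on the sorted wealth vector. The helper below is our own sketch, not the published code.

```r
# Gini coefficient of agent sugar levels, from the Lorenz-curve definition
# above; 0 = perfect equality, 1 = maximal inequality.
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

gini(c(10, 10, 10, 10))  # 0: wealth evenly distributed
gini(c(0, 0, 0, 100))    # 0.75 for n = 4; approaches 1 as n grows
```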
11 Duration of the Simulation

The simulation was run for 30 time steps in each configuration of the parameter input arrangement. We used a Nearly-Orthogonal Latin-Hypercube (NOLH) design for generating an optimal set of combinations of model parameters [33]. This design allowed us to evaluate our model across a wide range of parameter combinations efficiently, and to generate results that could be examined for patterns of significance. At the end of the simulation runs, the outcomes were analyzed, via appropriate numerical and graphical summaries, to assess how a particular set of initial conditions affected the evolution of wealth disparity, birth, death, and migration rates over the course of the simulation-run.
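The design points themselves were generated with Sanchez's NOLH worksheet [33]. For readers who want something runnable, the sketch below substitutes an ordinary (not nearly-orthogonal) Latin hypercube from the lhs package; the parameter ranges shown are illustrative assumptions, not the published design.

```r
# Stand-in for the NOLH design: a plain Latin hypercube scaled to
# illustrative ranges for three of the Table 4 parameters.
library(lhs)

set.seed(42)
design <- randomLHS(n = 33, k = 3)  # 33 design points over 3 parameters
params <- data.frame(
  agentdensity = 0.05 + design[, 1] * (0.50 - 0.05),
  threshold    = round(4 + design[, 2] * (33 - 4)),
  regenrate    = 0.5 + design[, 3] * (2.0 - 0.5)
)
head(params)  # one row per simulation configuration
```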
12 Discussion of Results

Given the range of parameters and variables, the model generated rich datasets. The working hypothesis provided guidance on which interactions would be most likely to answer our research questions. The first analysis examined Gini values across different sugar-levels for the population of agents in our sugarscape model. In Fig. 2, we present the change in the spread of Gini coefficients across simulation runs for a given value of threshold (the parameter that indicates the minimum amount of sugar needed for forming a proto-institution).

A second relevant dynamic that we investigated was the ratio of threshold to agent density. This ratio was calculated for each time-step of the model and captures the likelihood of agents being in proximity (agent density) and forming or joining a financial institution (threshold), a dynamic that is clearly related in the model (Fig. 3).

Agent density did have an effect on the Gini coefficient of the sugar levels in general. However, the minimum and maximum values and standard deviations did not suggest a clear differentiation. Patterns of outliers for the Medium and High ranges of values of agent density offer clues for further investigation. From the patterns observed in Fig. 3, it is evident that:
Fig. 2 Graph of Gini coefficients at each time step of the model calculated using different agent sugar thresholds
Fig. 3 Comparison of sugar threshold to agent density at three interval ranges—low-medium-high mapped with integer values from 4 to 33
1. The average Gini coefficient increases non-monotonically over time in all three intervals of the agent-density-to-threshold ratio;
2. There is not much discernible difference in the overall trends when comparing the Low and Medium intervals of the agent-density-to-threshold ratio, but there appears to be a small but discernible difference when comparing the Low and High, and Medium and High, intervals;
3. There is a greater variance (spread) around the median Gini values across time in the Medium and High intervals;
4. There are more values towards the higher end of the spectrum of Gini values during the initial time periods, and more values towards the lower end of the spectrum later, in the Medium ratio condition; and
5. In the High ratio condition, the greater spread of Gini values (at both ends) that is observed during the initial time periods disappears in the later time periods.

Based on the above-stated observations, it appears that the higher agent-density-to-threshold ratio conditions tend to result in more equal distributions of wealth (lower median Gini values, with smaller variance). As the density of agents in the sugarscape increases for a given value of threshold, and/or as the threshold decreases for a given density of agents, the likelihood of forming a proto-institution
Fig. 4 Comparison of variation in Gini coefficients across time, in populations where protoinstitutions could be formed vs. those populations where proto-institutions could not be formed
increases. This implies that the "poorer" agents in the population, if they were fortunate enough to become members of a proto-institution, are able to rely on the resources (sugar) available through the proto-institutions during times of need, and are thus able to increase their chances of surviving and thriving. So, based on our results, it appears that lowering the threshold for members of society to be part of a financial-buffer-providing institution, as exemplified by proto-institutions in our model, could lead to an overall better outcome for society as a whole.

From the graphs (Fig. 4), it is evident that:

1. The variance in the Gini coefficient across time seems to be higher in the "no institutions" (facet labeled "no") condition;
2. A greater number of instances of high Gini values appear to occur in the "institutions" condition, despite the median value of the Gini being approximately the same as the one observed in the "no institutions" condition.

Together, these observations seem to imply that while the overall spread of wealth appears to be more equal in the "with institutions" condition, the likelihood of high levels of wealth disparity, as seen in the outlier values of the Gini coefficient in the right-hand-side panel, seems to be higher in the "with institutions" condition as well.
Taken together, these observations can be seen as indicative of the following: the overall spread of wealth seems to be comparatively more uniform, across a larger proportion of a population, and stable over longer periods of time, in the "with institutions" condition. However, in this condition, we are also more likely to encounter instances of greater wealth disparity (at the extreme ends of the wealth distribution). In summary, the following appears to be the case:

1. There is limited variance when running with and without institutional probabilities;
2. There appears to be some difference with institutions, where the Gini coefficient increases;
3. Agent density appears to interact with sugar thresholds and institutional formation.

While these are preliminary results, the graphs suggest that there are complex dynamics at play even in our simplified model. Next, we discuss salient limitations of our model and simulation results.
13 Limits of the Simulation

There are a number of ways in which the model may be limited.

1. Information is invariably lost through the creation of an artificial representation of a complex system [34]. It may be that, in making the choices we did in simplifying our model, we have unintentionally eliminated an important dynamic of the system.
2. We have excluded a formal social system, even though that is the very phenomenon we wish to understand. In our efforts to simplify, we have reduced the relational possibilities of agents to a combination of two factors: (a) being in adjacent cells and (b) possessing a given sugar threshold. It may be that this proximity/resource dynamic is too simple to generate insight into the formation of institutions in a city landscape [35]. This is a pervasive problem with modelling: simplifications that make a model useful may remove information or dynamics that are essential to understanding, as noted in the earlier cautions about direct application of results.
3. It is possible that there are structural drivers that are hidden [36] in the approach to coding either the agents or the environment. While we have worked carefully to avoid such hidden dynamics, it is possible we have not succeeded [37].
4. Using a non-uniform sugarscape may introduce a Gini effect into the model from the beginning, since there is a built-in probability of a given agent having more sugar than another agent right from time step one.
14 Conclusions

We have shown that when a population of independent agents is able to form arrangements in which resources can be pooled [38], the proto-institutions that represent these pools improve the life-span of agents and lead to higher levels of resources and social ties when compared with agents who do not contribute to, and thus cannot draw from, these pooled resources (sugar). The generation of a proto-institution is an indirect social relation function: the agents are related via the common proto-institution even if there is no recorded tie external to the proto-institution.

In the case of social ties, the phenomenon of interest is that the historical record appears to reflect that the wealthy have strong, bonding ties among themselves (or at the very least, ties to common and persistent institutions) with few bridging ties to the lower class [39]. The lower class may have small numbers of close ties (as in a nuclear family arrangement of some kind; transience was not a solitary male type but included whole families, as Katz observed in the historic data), fewer ties within the lower class, and many fewer with the upper class. The function of the wealthy agents' proto-institutions may be understood as something like a cluster of stable social pillars that the lower class flows between and around. The overall water level (population) may fluctuate, but the wealthy remain relatively stable.

The table below suggests a possible means of constructing more sophisticated agents to run either separately or within an enriched model [40]. Each agent has a set of features that allow it to behave, relate, process information and so on, with each feature having an accumulation space much like the money/bank model featured in this paper. It would look like this (Table 5):
Table 5 Institutional types for possible future development and testing

Agent feature: Institutional form
Money: Bank
Social ties: Social capital (social network expressed in community organizations like family, school, workplace, church)
Health: Hospital
Status: Class designation
Culture: Art gallery/library
Education: School
Morality: Court of law
deeply with institutions [41]. Ongoing work could entail a variety of models tuned to the particular dynamics associated with a range of institutional forms and dynamics. The proto-institutional function could be a useful addition to other research projects that are trying to capture other agent dynamics. Efforts to connect agent learning and agent-based modelling are underway in the extant literature; see, e.g., [42]. Yang's agents collect human capital (skills, etc.) at agent levels but without an institutional function that can allow social structures to emerge at the meso (middle) level, where civil society institutions mediate between the individual and the meta-structures of state/government/system-wide scale.
15 Code

R code used to generate models and landscape is available on GitHub: https://github.com/IngenuityArts/specscape
References

1. Epstein, J. M., & Axtell, R. L. (1996). Growing artificial societies: Social science from the bottom up. Washington, D.C.: Brookings Institution Press.
2. OECD. (2011). Divided we stand. Paris: Organization for Economic Cooperation and Development.
3. OECD. (2008). Growing unequal? Paris: Organization for Economic Cooperation and Development.
4. Frank, P. M., & Shockley, G. E. (2016). A critical assessment of social entrepreneurship: Ostromian polycentricity and Hayekian knowledge. Nonprofit and Voluntary Sector Quarterly, 45, 615–775. https://doi.org/10.1177/0899764016643611
5. Ostrom, E. (1986). An agenda for the study of institutions. Public Choice, 48(1), 3–25.
6. Yan, F. (2018). Urban poverty, economic restructuring and poverty reduction policy in urban China: Evidence from Shanghai, 1978–2008. Development Policy Review, 36(4), 465–481. https://doi.org/10.1111/dpr.12303
7. Chang, G. H. (2002). The cause and cure of China's widening income disparity. China Economic Review, 13(4), 335–340. https://doi.org/10.1016/S1043-951X(02)00089-5
8. Zhao, W., & Zhou, X. (2017). From institutional segmentation to market fragmentation: Institutional transformation and the shifting stratification order in urban China. Social Science Research, 63, 19–35. https://doi.org/10.1016/j.ssresearch.2016.09.002
9. Luisa Mendez, M., & Otero, G. (2018). Neighbourhood conflicts, socio-spatial inequalities, and residential stigmatisation in Santiago, Chile. Cities, 74, 75–82. https://doi.org/10.1016/j.cities.2017.11.005
10. Rice, M., & Hancock, T. (2016). Equity, sustainability and governance in urban settings. Global Health Promotion, 23, 94–97. https://doi.org/10.1177/1757975915601038
11. Aguilar, A. G., & Lopez, F. M. (2018). The city-region of Mexico City: Social inequality and a vacuum in development planning. International Development Planning Review, 40(1), 51–74. https://doi.org/10.3828/idpr.2018.3
12. Kolodii, N. A., Karlova, L. V., Chaykovskiy, D. V., & Sinyaeva, M. A. (2017). The killing fields of social inequality: Experience of understanding modern urban development. In F. Casati, G.
A. Barysheva, & W. Krieger (Eds.), International scientific symposium on lifelong wellbeing in the world (WELLSO 2016) (Vol. 19, pp. 638–647). Nicosia: Future Academy.
13. Klehm, C. E. (2017). Local dynamics and the emergence of social inequality in iron age Botswana. Current Anthropology, 58(5), 604–633. https://doi.org/10.1086/693960
14. Deslatte, A., Feiock, R. C., & Wassel, K. (2017). Urban pressures and innovations: Sustainability commitment in the face of fragmentation and inequality. Review of Policy Research, 34(5), 700–724. https://doi.org/10.1111/ropr.12242
15. Sugrue, T. J. (2005). The origins of the urban crisis: Race and inequality in postwar Detroit (with a new preface by the author edition). Princeton, NJ: Princeton University Press.
16. OECD. (2018). Divided cities. Paris: Organization for Economic Cooperation and Development.
17. OECD. (2015). In it together: Why less inequality benefits all. Paris: Organization for Economic Cooperation and Development.
18. Katz, M. B. (1977). The people of Hamilton, Canada West: Family and class in a mid-nineteenth-century city. Cambridge, MA: Harvard University Press.
19. Gehl, J., & Rogers, L. R. (2010). Cities for people. Washington, DC: Island Press.
20. Page, S. E. (2010). Complexity in social, political, and economic systems (SSRN Scholarly Paper No. ID 1889359). Rochester, NY: Social Science Research Network.
21. Page, S. E. (2017). Many models thinking. Presentation at the Computational Social Science Society of the Americas, Santa Fe, NM. Retrieved from https://computationalsocialscience.org/events/css2017/
22. OECD. (2018). A broken social elevator? How to promote social mobility (p. 352). Paris: OECD Publishing.
23. Macal, C. M., & North, M. J. (2005). Tutorial on agent-based modeling and simulation. In Proceedings of the 37th conference on winter simulation (pp. 2–15). Orlando, FL: Winter Simulation Conference. Retrieved from http://dl.acm.org/citation.cfm?id=1162708.1162712
24. Lohmann, R. A. (2016). The Ostroms' commons revisited. Nonprofit and Voluntary Sector Quarterly, 45, 7–26. https://doi.org/10.1177/0899764016643613
25. Ostrom, E. (2005). Understanding institutional diversity. Princeton, NJ: Princeton University Press.
26. Keuschnigg, M., Lovsjö, N., & Hedström, P. (2018). Analytical sociology and computational social science. Journal of Computational Social Science, 1(1), 3–14. https://doi.org/10.1007/s42001-017-0006-5
27. Peregrine, P. N. (2017). Toward a theory of recurrent social formations (SFI Working Paper No. 2017-08-026) (pp. 1–33). Santa Fe, NM: Santa Fe Institute. Retrieved from https://www.santafe.edu/research/results/working-papers/toward-theory-recurrent-social-formations
28. Alexander, C. (2001). The nature of order: The phenomenon of life (Vol. 1). Berkeley, CA: Center for Environmental Structure.
29. Mehaffy, M. W., & Alexander, C. (2016). A city is not a tree: 50th anniversary edition. Portland: Sustasis Press.
30. Sieweke, J. (2014). Preserving the natural flow: Natural disasters in ignorance. In IntAR (Vol. 5). Providence, RI: Rhode Island School of Design.
31. Epstein, J. M. (2001). Remarks on the foundations of agent-based generative social science. Retrieved 13 March, 2018, from https://www.brookings.edu/research/remarks-on-the-foundations-of-agent-based-generative-social-science/
32. Epstein, J. M. (2006). Agent-based computational models and generative social science. In Generative social science: Studies in agent-based computational modeling (pp. 4–46). Princeton, NJ: Princeton University Press.
33. Sanchez, S. M. (2018). S-NOLH design worksheet [Excel]. Retrieved from https://my.nps.edu/documents/106696734/108129284/S-NOLH_v1.xls/14b5dea5-266a-409e-a4f1-457cd37a4e8c
34. Bavelas, A. (1947). A mathematical model for group structures. Human Organization, 7(3), 16–30.
35. Sallach, D. L. (2003). Social theory and agent architectures: Prospective issues in rapid-discovery social science. Social Science Computer Review, 21(2), 179–195. https://doi.org/10.1177/0894439303021002004
36. Jolliffe, I., & Morgan, B. (1992). Principal component analysis and exploratory factor analysis. Statistical Methods in Medical Research, 1(1), 69–95. https://doi.org/10.1177/096228029200100105
37. Macal, C. M., & North, M. J. (2009). Agent-based modeling and simulation (pp. 86–98). New York: IEEE.
38. Jacobs, J. (2016). A living network of relationships. In S. Zipp & N. Storring (Eds.), Vital little plans: A collection of the short works of Jane Jacobs. New York: Random House.
39. Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360–1380.
40. Helbing, D. (Ed.). (2012). Social self-organization: Agent-based simulations and experiments to study emergent social behavior. Berlin: Springer.
41. Allen, D. W. (2011). The institutional revolution: Measurement and the economic emergence of the modern world. Chicago, IL: University of Chicago Press.
42. Yang, Z. (2017). Integrating agent learning and agent based models. Presented at the Computational Social Science Society of the Americas, Santa Fe, NM.
Exposing Bot Activity with PARAFAC Tensor Decompositions Peter A. Chew
Abstract Russian disinformation tactics via social media have been very topical in Western countries over the past year, and one of the specific tactics that has been spotlighted is the deceptive use of ‘bots’, or automated accounts, by Russia’s Internet Research Agency, particularly on Twitter. In a sense, bots hide in plain sight, purporting to be something that they are not. A useful function is therefore served if we can use data analytics techniques to automate the process of exposing bot activity in a way which helps direct an analyst’s attention towards the signature activity of bots within crowds of ‘genuine’ users. However, the problem is non-trivial because adversaries may deliberately introduce obfuscation in the form of slight differences between bots’ posts. Notwithstanding this, we show how the problem may be solved using the tensor decomposition method PARAFAC. Keywords Social media · Unsupervised learning · Influence operations · Tensor decomposition · Disinformation · Russia · Internet Research Agency · Bot · Twitter · Data analytics
P. A. Chew
Galisteo Consulting Group, Inc., Albuquerque, NM, USA
e-mail: [email protected]

© Springer Nature Switzerland AG 2020
T. Carmichael, Z. Yang (eds.), Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas, Springer Proceedings in Complexity, https://doi.org/10.1007/978-3-030-35902-7_2

1 Background

In its January 2017 report 'Assessing Russian Activities and Intentions in Recent US Elections' [1], the US Office of the Director of National Intelligence (ODNI) stated: 'Moscow's influence campaign followed a Russian messaging strategy that blends covert intelligence operations—such as cyber activity—with overt efforts by Russian Government agencies, state-funded media, third-party intermediaries, and paid social media users or "trolls"'.

Since the ODNI report was issued, more details have continued to emerge in the West of how this 'influence campaign' worked. Largely thanks to the Robert Mueller
investigation, the general public is now aware that many if not all of Russia's 'paid social media users' worked at the Internet Research Agency (Агентство интернет исследований/Agentstvo internet issledovanii, or IIA) in St. Petersburg. What is less commonly known, however, is that details of the IIA's operation initially became known as a result of the work of Russian investigative journalists as far back as 2013 [2]. It was because some of these journalists posed as IIA job applicants, for example, that we know certain details about the IIA's modus operandi, even including the organization's leaked payroll data [3].

In particular, what has become apparent is that the IIA placed a high priority on performance metrics [2], a fact that was also remarked upon in Mueller's indictment.1 For those posting on social media, 100 posts per day seems to have been the requirement dictated by IIA management, and workers were given specific topics to post on (ibid). One of the Russian journalists who posed as a job applicant, Alexandra Garmazhapova, makes it clear that the metric of number of visits to a site was also prioritized by the IIA. Garmazhapova quotes the manager who interviewed her as saying, 'We have to raise the number of visits to a site. You can do it with robots, but robots work mechanically, and sometimes systems like [Russian search engine] Yandex ban them' (ibid). From this, it is clear that the IIA was well aware of the potential of bots, or automated accounts, for amplifying the reach of the IIA's messaging at low cost.

It should be noted here, by the way, that the use of bots by the IIA was particularly targeted towards the Twitter social media platform, which lends itself to the use of bots. Other tactics were used on other social media platforms; for example, Facebook lends itself better to information operations via advertising, and the IIA appears to have adapted its approach on Facebook accordingly. Because this paper is about bot activity, and in view of the concentration of bot activity on Twitter, references in further discussion will be to Twitter. This notwithstanding, it should be understood that specific references to Twitter may still generalize to social media as a whole.

Non-automated analysis of Twitter data up to this point has revealed that the creators of bot accounts attempt, even if perfunctorily, to hide the fact that the accounts are not human, through various means such as attaching avatar photographs lifted from the internet, purporting that the account owner is in America where this is not the case, or giving misleading names (perhaps the best-known example being @TEN_GOP, which, it is now known from the Mueller indictment, was actually run by Russians based at the IIA2). Yet the attempts were often perfunctory in that simple mistakes were made (mismatches between the gender of the photo and the gender of the name) or shortcuts taken (such as naming an account @asdasda). In this case as elsewhere, it is often the simple mistakes that give the criminal away.
1 See paragraph 37 of the indictment: https://www.politico.com/story/2018/02/16/text-full-mueller-indictment-on-russian-election-case-415670
2 https://www.cnn.com/2018/02/16/politics/who-is-ten-gop/index.html
While undoubtedly some of this influence campaign focused on spreading deceptive messages (or to use the Soviet-era term, disinformation), it is therefore clear that another equally important dimension to the deception was that the IIA’s social media accounts held themselves out to be something that they were not. It is that aspect of the deception, rather than deception in messaging, that we focus on in this paper. And in particular, a key factor that has enabled the IIA to hide its activity in ‘plain sight’ is the sheer quantity of content available overall in social media. Were it not for this, the activity of bots and the fact that this activity emanates from bots would, we believe, be very self-evident. This is simply because bots by their nature move in formation, posting very similar content within seconds of one another. What makes the phenomenon hard to identify, either manually or in automated fashion, is teasing the bot ‘formations’ apart from the mass of other information created by genuine users. And this is where advanced signal processing techniques like PARAFAC have their place. The paper is organized as follows. Section 2 outlines some of the related previous literature. Section 3 outlines how PARAFAC can be applied in a novel way, not just solving an existing problem in a new way, but in fact recasting the problem to begin with, making a close fit between the problem of interest and the data analytics paradigm of PARAFAC. In Sect. 4, we demonstrate an application of our approach to around 115,000 Twitter posts in Russian, English, and other languages surrounding the time of the 2016 US general election; we show how we were quickly able to tease out patterns characteristic to bots, making the trail of evidence very clear to a human reviewer. Finally, Sect. 5 concludes on our findings.
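Although the method is developed in detail later in the paper, a minimal PARAFAC decomposition of a three-way (account × term × time) tensor can be sketched in R using the multiway package. The synthetic data, tensor dimensions, and rank below are illustrative assumptions, not the paper's actual pipeline.

```r
# Sketch: PARAFAC on a toy account x term x time count tensor.
library(multiway)

set.seed(1)
n.accounts <- 50; n.terms <- 200; n.bins <- 24
X <- array(rpois(n.accounts * n.terms * n.bins, lambda = 0.1),
           dim = c(n.accounts, n.terms, n.bins))

fit <- parafac(X, nfac = 5)  # fit five rank-one components

# Each component r pairs an account loading (fit$A[, r]), a term loading
# (fit$B[, r]), and a time loading (fit$C[, r]). A component in which several
# accounts load heavily on the same terms in the same narrow time bins is the
# 'bots moving in formation' signature described above.
head(sort(fit$A[, 1], decreasing = TRUE))
```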
2 Related Literature
In order to situate the contribution of the present work, it is important to distinguish at the outset between two related, but subtly different, problems, both of which are non-trivial. The first research question can be simply stated: given a large Twitter dataset, how can bot accounts be distinguished from human accounts? The second, distinct, question is contingent on the first: assuming that bots can be identified, how can specific patterns of bot activity be drawn out in a useful way? The value of answering the second research question may seem less obvious, but it is key to ensuring that the human is usefully integrated into the 'loop', something which is often overlooked in the development of systems [4]. It is one thing for a machine to label accounts as bots automatically, but it is quite another for the machine to present the results to a human user so that the user can quickly concur, beyond reasonable doubt, that the patterns could not have occurred by chance as a result of human activity. In fact, the value of the second research problem goes beyond this. By extracting patterns of adversarial bot activity, for example, we can also gain insight into the adversary's activities and motivations: insight that the
adversary may not even have intended to make available. To our knowledge, this specific problem has not previously been directly tackled in the literature. Whether because systems designers often have a tendency to ignore the 'human in the loop' dimension, or simply because the first research question is easier to formulate, all the relevant research of which we are aware touches substantively only on the first of the two questions. (This paper attempts to fill the gap by showing how data analytics can be used to address the second.) As regards previous research, we start by looking at the methods used by practitioners who have to contend with and analyze bot activity on a day-to-day basis, as this provides useful and easily understandable context with which to frame the problem more abstractly. We then move on to a typology of approaches that have been attempted to automate the problem of bot identification. First, [5, 6] outline a set of 12 heuristics which are useful in distinguishing bots from human accounts on Twitter; the more of these that are true, the more likely the account is to be a bot:
1. More than 72 posts a day (more than one every 20 min).
2. Anonymity (no name, no profile information).
3. Amplification (few or no original posts, high rate of retweets).
4. Low posts, high results (few posts with a high rate of likes/retweets indicates the account may be part of a botnet).
5. Content mirrors that of other accounts.
6. No avatar image.
7. Avatar image is stolen (e.g. from a celebrity).
8. Account name is a scramble of alphanumeric characters and/or does not match the screen name.
9. Single account posts in multiple languages.
10. Commercial content (advertising mixed with occasional political tweets).
11. Bots often use the same URL shorteners (e.g. ift.tt).
12. Patterns of retweets and likes are very similar to, or match, those of other accounts.
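As an illustration of how such heuristics translate into the rules-based programs discussed below, the following Python sketch scores an account against a few of them. The account fields and thresholds are invented for illustration and do not correspond to any real API.

def heuristic_score(account):
    # naive if-then scoring of heuristics 1, 2, 3 and 6 above;
    # a higher score indicates a more bot-like account
    score = 0
    if account['posts_per_day'] > 72:        # heuristic 1
        score += 1
    if not account['profile_info']:          # heuristic 2
        score += 1
    if account['retweet_fraction'] > 0.9:    # heuristic 3
        score += 1
    if not account['has_avatar']:            # heuristic 6
        score += 1
    return score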
These heuristics are used in practice on a day-to-day basis by the Atlantic Council's Digital Forensic Research Lab in exposing Russian disinformation tactics. However, as we have noted in previous work [7], heuristics are not robust: if patterns of behavior among bot creators change over time, which they may well do if the bot creators try to 'game the system' to avoid detection, then the heuristics will cease to work. A good example of this comes from personal correspondence: Nimmo has mentioned that when the '72 tweets a day' heuristic was published, a number of Twitter accounts were observed to drop down to just below 72 posts per day. A second broad category of previous research tackles the problem of automating bot identification. This category subdivides into different data analytics paradigms. One paradigm, relatively trivial from a data analytics perspective, is rules-based: one could take the 12 heuristics above and simply write a program consisting of if-then statements to distinguish between bots and non-bots. This, however, raises awkward questions of how to prioritize and weight one heuristic over another
in a principled way, and it also suffers from the same non-robustness as its non-automated counterpart: if the heuristics have to change, then the approach will also need to be redeveloped. The other two paradigms are supervised and unsupervised learning, exemplified in [8, 9] respectively. Our own previous work on automated bot identification [7] also follows the unsupervised learning paradigm. Broadly, supervised learning relies on prior labeling of accounts known to be bots; the approach abstracts away from specific heuristics to learn the features of accounts that are predictive of 'botness'. It can then be deployed on new data to predict which Twitter accounts are likely to be bots. Supervised learning approaches suffer from some of the same drawbacks as heuristics-based approaches, namely, that they work well only as long as the data they are deployed on continues to resemble the training data. In commentary on the work of [9], whose technique was initially developed in 2011, [10] pointed out just 3 years later that 'it's quite possible that there are now more advanced bots that are less easy to detect'. Unsupervised learning approaches, in our view, are best positioned to avoid the problems of a changing landscape of bot activity. Such approaches learn patterns of anomalies and similarities directly from the data, without the need for separate training data. The unsupervised paradigm underlies, for example, search engines such as Google, and recommender systems such as Amazon's automated system for recommending products to consumers based on the behavior of other consumers. The goal is simply to look for patterns of similar features among nodes in the data. Unsupervised learning has been successfully applied to the bot detection problem by [7, 8]. Both of these approaches work by looking for time-correlated activities between Twitter accounts ([7] also uses correlations other than temporal ones). Again, the focus of that research is to identify bots rather than to analyze the correlated bot activity. That said, [8] includes a 46-second video at http://www.cs.unm.edu/~chavoshi/debot/ which does part of what we focus on here: presenting bot activity to a human so that it is clear that the correlated activity is machine-generated. The goal of the present paper is to focus squarely on how to tease out patterns like the one demonstrated in the video, using PARAFAC. With this technique, we also show how additional insights can be gained with little effort. PARAFAC, like the above-mentioned research, is an unsupervised technique, and is a cousin of the techniques used for topic analysis and bot detection in [7]; it is therefore a natural candidate for what we want to achieve.
3 PARAFAC
3.1 Linear Algebra Background
Before proceeding to a description of PARAFAC, we consider the related technique of Singular Value Decomposition (SVD). In linear algebra, SVD, which is closely related to Principal Component Analysis, is a method for factorizing any matrix X into its
principal components. SVD is also known as a signal processing technique that allows for abstraction away from 'noise' to the most salient features of a dataset. In mathematical terms, SVD factorizes the matrix X into three further matrices such that X = USV^T, where U is an orthonormal matrix of left singular term vectors, S is a diagonal matrix of singular values, and V is an orthonormal matrix of right singular document vectors [11]. S encodes an ordered list of the singular values of X, corresponding to the orthogonal dimensions implicit in U and V. By discarding the less 'salient' dimensions (those with lower singular values), one can effectively discard the 'noise' to arrive at the best possible approximation to the original matrix given a lower number of features, or dimensions. Chew [7] serves as a highly relevant illustration of the application of SVD to bot identification (again, distinct from highlighting the activity of botnets). Here, X is a matrix of features by Twitter accounts, where 'features' means words used by the accounts, and dates of posts. The matrix is populated by weighted frequencies (e.g. the weighted number of times a word was used by an account, or the weighted number of times an account posted on a date). One of the outputs of SVD is V, which can be thought of as a set of Twitter account vectors in orthonormal space, so that the cosine between any two vectors is a measure of the similarity between the corresponding accounts. Accounts which post identical content on the same dates would have identical vectors, and therefore the cosine between such vectors would be 1. Accounts whose content is totally orthogonal to one another would have a pairwise vector cosine of 0. Because the approach is unsupervised, nothing needs to be known about the content a priori; SVD just 'looks for' patterns of similarity. The approach of Chew [7] is predicated on the idea that bots, by their nature, tend to follow one another in lockstep (automated accounts set to repost content from specific URLs or based on certain keywords will be triggered by the same underlying events). The V matrix from SVD conceived in this way gives a mathematical basis for measuring mutual account similarity and postulating that highly similar groups of accounts (stated another way, those that cluster together) are part of a botnet (Fig. 1).
Fig. 1 Clusters of very similar accounts postulated to be botnets
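As a minimal sketch of the SVD step just described, with toy data standing in for the weighted feature-by-account matrix (the 0.95 cosine threshold anticipates the one used in Sect. 4):

import numpy as np

X = np.random.rand(500, 40)                   # toy features-by-accounts matrix
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 20                                        # keep the k most salient dimensions
V = Vt[:k].T                                  # one k-dimensional vector per account

Vn = V / np.linalg.norm(V, axis=1, keepdims=True)
cos = Vn @ Vn.T                               # pairwise account cosine similarities
pairs = np.argwhere(np.triu(cos, 1) > 0.95)   # highly similar (botnet-candidate) pairs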
3.2 PARAFAC's Relationship to SVD
PARAFAC is, in simple terms, a generalization of SVD from matrices to tensors. Just as a vector is a one-dimensional array and a matrix is a two-dimensional array, a tensor is the multilinear algebraic generalization: an array of any arbitrary number of dimensions. And just as two-dimensional matrices can be factorized, so can tensors (see the illustrative representation in Fig. 2). Rather than going into the mathematics, which are detailed elsewhere, e.g. [12], for present purposes it is more convenient to proceed directly to demonstrating how this can be useful with social media data.
3.3 Application of PARAFAC
Chew [7] starts with a two-way term-by-account matrix, where dates are construed as 'terms' along with the words used in social media posts. This gives a useful view of which accounts are similar to one another, based overall on which dates they post on and how frequently, and the words they use. What it does not allow is focusing in on the cases where different accounts use the same words on the same dates (a key giveaway of bot activity), abstracting away from any noise. Were it not for noise, discovering such correlations between accounts could be trivially achieved by writing a SQL query to join on the date and text of each tweet. The fact, however, that there are often slight variations between otherwise obviously identical texts, as in the example below, makes this a hard problem. A better representation of the source data, which captures the inter-account co-occurrence of words on the same dates, is a term-by-date-by-account tensor (Fig. 3). The question then arises as to how to weight the entries in this tensor. As mentioned above, [7] uses a standard approach of weighting the term-by-account matrix prior to SVD. The particular approach used in [7] is the pointwise mutual information (PMI) between term i and account j, pmi(i, j):
Fig. 2 PARAFAC factorization of a three-way array (a sum of rank-one components with factors a1 . . . ar, b1 . . . br, and C1 . . . Cr)
Fig. 3 Term-by-date-by-account tensor (axes: terms, dates, accounts)
pmi(i, j) = log [ p(i, j) / ( p(i) p(j) ) ]

where p(i, j) is the number of terms of type i used by account j, divided by N; p(i) is the number of terms of type i in the corpus, divided by N; p(j) is the number of terms used by account j, divided by N; and N is the overall number of terms in the corpus. With a three-way array it is a simple matter to adapt the weighting using the generalization of PMI to more than two variables (the multi-way PMI is a component of the 'multivariate mutual information' measure). The PMI between term i, account j, and date k, pmi(i, j, k), following the notation above, is:

pmi(i, j, k) = log [ p(i, j) p(i, k) p(j, k) / ( p(i, j, k) p(i) p(j) p(k) ) ]
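A minimal numpy sketch of this weighting, assuming `counts` is the raw term-by-date-by-account count tensor (toy data below); the axes are (term, date, account), so the marginal sums correspond to the p(·) terms in the formulas above:

import numpy as np

counts = np.random.poisson(0.1, size=(50, 14, 20)).astype(float)  # toy counts
N = counts.sum()

p_tda = counts / N
p_td = counts.sum(axis=2) / N      # term-date marginal
p_ta = counts.sum(axis=1) / N      # term-account marginal
p_da = counts.sum(axis=0) / N      # date-account marginal
p_t = counts.sum(axis=(1, 2)) / N
p_d = counts.sum(axis=(0, 2)) / N
p_a = counts.sum(axis=(0, 1)) / N

with np.errstate(divide='ignore', invalid='ignore'):
    num = p_td[:, :, None] * p_ta[:, None, :] * p_da[None, :, :]
    den = p_tda * p_t[:, None, None] * p_d[None, :, None] * p_a[None, None, :]
    pmi = np.where(counts > 0, np.log(num / den), 0.0)   # weight only observed cells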
When the weighted term-by-date-by-account tensor is factorized with PARAFAC, three matrices are obtained: (1) a term-by-dimension matrix U, (2) an account-by-dimension matrix V, and (3) a date-by-dimension matrix W. As with SVD, the three output matrices encode all three types of element (terms, accounts, dates) in the same dimensional space. This means that for any and all dimensions (i.e. principal components of the source data), it is possible to list simultaneously the words, accounts, and dates/times that are most representative of those dimensions. The common dimension is what turns a non-trivial problem (finding accounts that post near-identical tweets at near-identical times) into one that can then be solved trivially. Using the common dimension we can cross-reference back to the original dataset to make the connection between accounts, the terms used by those accounts, and the dates of posts, and find the actual posts that occurred on those dates, using those terms, by those accounts. We now show how this works in practice.
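In Python, the factorization and cross-referencing step might look like the following sketch, using the tensorly library's `parafac`; the random tensor stands in for the PMI-weighted tensor built above.

import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

X = tl.tensor(np.random.rand(252, 14, 100))   # stands in for the weighted tensor
weights, factors = parafac(X, rank=10)        # 10 PARAFAC factors, as in Sect. 4
U, W, V = factors                             # term, date, and account loadings

# for one factor, list the most representative terms, dates, and accounts;
# the indices map straight back to rows of the source data
f = 0
top_terms = np.argsort(-np.abs(U[:, f]))[:10]
top_dates = np.argsort(-np.abs(W[:, f]))[:5]
top_accounts = np.argsort(-np.abs(V[:, f]))[:10]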
4 Demonstration
The dataset we use to demonstrate the use of PARAFAC to expose bot activity is a collection of 115,169 Twitter posts, gathered from the Twitter 'garden hose' by specifying 'NATO' and the Russian equivalent 'НАТО' as search terms. All posts are from between October 13, 2016 and November 28, 2016 (the period surrounding the US general election), with English and Russian posts distributed approximately evenly across the time period. Statistics for this corpus are shown in Table 1 below. We prototyped the implementation of PARAFAC on a standard laptop (a 64-bit Windows 10 HP 755G2 Elitebook with 16GB RAM). PARAFAC itself was run in R x64 3.1.1, with the 'parafac' function from the open-source R library 'multiway'. In addition, the 'rTensor' R library was used for construction of the tensor. There is no reason in theory that the approach we have described cannot be deployed on the full dataset just described, given appropriate computational resources. However, with the constraints of the system on which we ran PARAFAC, it was necessary to downselect from the full corpus of 115,169 posts. The approach we therefore took was as follows. We first ran topic extraction on the full 115,169 posts using the SVD-based approach of [7]. The purpose of this step is to group the data into natural topical clusters, allowing a 'divide and conquer' approach. It should be emphasized, however, that this step would be unnecessary with sufficient computational resources, because PARAFAC also effectively groups the data into clusters, but with three-way associations instead of two, allowing for more sensitivity to latent correlations in the data. With the output of SVD, we were able to make an initial guess at which accounts were most bot-like, based on extremely high inter-account correlations in activity (we set an arbitrary threshold using a pairwise cosine similarity between accounts of at least 0.95). This then allows accounts to be automatically labeled as suspected bots. We then looked at the topical clusters to identify clusters which had a higher-than-usual preponderance of suspected bots posting on those topics. It is not necessarily the case that the most prominent topical clusters in the data attract bots, something which makes sense given prior estimates that bots constitute no more than around 15% of Twitter traffic [13]. One such topic was the one ranked 23rd in overall importance in the corpus (see Fig. 4). Some keywords for this topic (based on the SVD U matrix) were НАТО (NATO), безопасности (security), and Балтике (Baltic).
Table 1 Statistics by corpus and language
Language | # posts | # distinct accounts
EN | 31,637 | 20,007
RU | 24,553 | 9068
Other | 58,979 | 28,813
Total | 115,169 | 57,888
Fig. 4 A single button click generates a report containing the data of Table 3
The top five accounts posting on this topic (based on the SVD V matrix) were moya_oksana, nina55055, elena_ma_468013, _deti_zhdut_, and 09j0kermen. All five of these were labeled as suspected bots based on very high overall similarity between their activity and that of other accounts. A key post on this topic by these accounts was 'НАТО отказалось обсуждать в Москве вопросы безопасности на Балтике'.3
3 NATO refuses to discuss questions of Baltic security in Moscow.
To demonstrate the utility of PARAFAC, we then focus on a specific use case that SVD cannot directly answer. It is all very well for a machine automatically to label accounts as bots, based on an arbitrary threshold (even if the means by which similarity is measured is well grounded in information theory), but a human should rightly be skeptical. How does the human reviewer know that the 'black box' is reliable? If, for example, the human reviewer is an employee of Twitter tasked with suspending the accounts of bots that abuse the terms of service, is it enough just to know that the black box, incorporating as it does multiple methodological assumptions, says that an account is a suspected bot? We would argue that it is not. The human reviewer needs to be presented with compelling evidence that shows an 'audit trail' back to the source data. This is where PARAFAC comes in. Using the output of SVD, we listed the top 100 accounts most representative of each 'bot-like' cluster, in this case those that coalesce around the 23rd SVD topic mentioned above, on 'NATO Baltic security'. (The number 100 is arbitrary and could be increased depending on available computing power.) Using the 'divide-and-conquer' approach, we gather the posts and form a term-by-date-by-account tensor just for those accounts. For the 'NATO Baltic security' topic, for example, this results in a 252 × 14 × 100 tensor containing 2523 nonzeros. (The tensor that would result from the full original dataset would be of size 118,358 × 47 × 57,449, with 1,667,648 nonzeros.) We factorize this tensor using PARAFAC to yield a second set of 'PARAFAC topics' (we set the number of PARAFAC factors to be returned to 10, which is sufficient for the purposes described here), along with the U, V and W matrices that map terms, accounts, and dates to those topics. This process takes just a couple of minutes to run on the laptop mentioned above. It is then a relatively simple matter to produce output of the sort that can be reviewed by a human, to make the case very convincingly that different accounts in
the cluster post so much in sync with one another that this could not have occurred by chance if the accounts were human-operated. For example, within the above-mentioned cluster, four different accounts can be seen to move in formation, posting almost, but not quite, identical text, as in Table 2. When Table 2 is pivoted on the text (using PARAFAC to abstract away from the 'noise' that differentiates the posts) so as to show the date and time of each post, the case can be clearly and convincingly made to the human reviewer that these accounts are part of a botnet (see Table 3). Again, what makes this pivot a non-trivial problem is that neither the text nor the date/time field is an exact match across accounts. It is worth noting too that the small differences between the texts of the posts may have been introduced deliberately to obfuscate, and to make it appear that the accounts are not posting identical material. However, PARAFAC cuts through this noise to find the signal that unmistakably links the different accounts (Table 3). This kind of analysis can be fully automated: we put together a basic proof-of-concept using Microsoft Access (connecting to underlying SQL Server data) in which a single button click will generate the sort of report shown in Table 3, for any SVD topic in which the user is interested. While the system's decisions on which accounts are considered likely to be bots are displayed to the user (via check boxes), the user need not rely on the output of a black box. The single button click provides the last link in the chain of analytical evidence, in an easy-to-understand format which is also easily traceable to elements in the source data. This, rather than the machine classification of bots, is more like the sort of evidence that could be presented in court.
Table 2 Commonalities between bot activity

molodost__bz:
#news НАТО опасается использования «Адмирала Кузнецова» для атак на Алеппо
#news Bloomberg: Путин заявил о недопустимости провокаций против НАТО

_deti_zhdut_:
НАТО опасается использования «Адмирала Кузнецова» для атак на Алеппо
Bloomberg: Путин заявил о недопустимости провокаций против НАТО

Table 3 Bot activity pivoted on text, to show date and time of similar posts

Account | Post (a) | Post (b) | Post (c)
molodost__bz | 10/25/2016 11:20:00 | 10/28/2016 03:35:54 | 10/31/2016 18:53:22
_deti_zhdut_ | 10/25/2016 11:19:58 | 10/28/2016 03:35:49 | 10/31/2016 18:53:23
korabliks | 10/25/2016 11:19:55 | 10/28/2016 03:35:42 | 10/31/2016 18:53:15
nina55055 | 10/25/2016 11:21:07 | 10/28/2016 03:36:06 | 10/31/2016 18:53:38

a 'НАТО опасается использования «Адмирала Кузнецова» для атак на Алеппо' (NATO fears the 'Admiral Kuznetsov' may be used for attacks on Aleppo)
b 'Bloomberg: Путин заявил о недопустимости провокаций против НАТО' (Putin announces that provocations against NATO would be inadmissible)
c 'НАТО отказалось обсуждать в Москве вопросы безопасности на Балтике' (NATO refuses to discuss questions of Baltic security in Moscow)
5 Conclusion
In this paper we have proposed a novel application of the tensor decomposition method PARAFAC: to solve the non-trivial problem of teasing out patterns of bot activity that would not be characteristic of humans. This application is distinct from, but related to, the problem of identifying the bots in the first place. Here, we assume that problem is to some degree solved; the use of PARAFAC complements automated bot identification by making the chain of analytical evidence more watertight. We show how PARAFAC can be used, among other things, to present evidence to a human reviewer in a way which removes virtually all doubt about the 'bot' nature of certain accounts, and at the same time shows which bots move together in formation, at which times, and on what topics. While adversaries may deploy bots as a faster, better, cheaper way of conducting deceptive influence operations, the kind of analysis we propose turns that activity into a double-edged sword for the adversary. Not only can automated analysis of bot activity potentially allow social media platforms like Twitter to shut down malign bots as fast as they can be set up; it also allows insight to be gained into adversary behavior that would not otherwise be easily available. It may be a truism, but it is true nonetheless, that technology itself is a double-edged sword: while it has opened up new opportunities for adversaries who seek to obfuscate, it opens up at least as many opportunities for doing good and for shining a light on the realities of the data. And we remain optimistic that light, in the end, always overcomes darkness.
Acknowledgments This work was partially funded by the Office of Naval Research under contract number N00014-16-P-3020.
References
1. Office of the Director of National Intelligence. (2017). Assessing Russian activities and intentions in recent US elections. Available at https://www.dni.gov/files/documents/ICA_2017_01.pdf
2. Nimmo, B., & Toler, A. (2018). The Russians who exposed Russia's trolls. Retrieved April 9, 2018, from https://medium.com/dfrlab/the-russians-who-exposed-russias-trolls-72db132e3cd1
3. Toler, A., & Brookie, G. (2018). Matching Mueller's explosive indictment. Retrieved April 9, 2018, from https://medium.com/dfrlab/matching-muellers-explosive-indictment-ecf04e55ef80
4. Nielsen, J. (1993). Usability engineering. London: Academic Press Ltd.
5. Nimmo, B. (2017). #BotSpot: Twelve ways to spot a bot. Retrieved April 9, 2018, from https://medium.com/dfrlab/botspot-twelve-ways-to-spot-a-bot-aedc7d9c110c
6. Nimmo, B. (2017). #BotSpot: How bot-makers decorate bots. Retrieved April 9, 2018, from https://medium.com/dfrlab/botspot-how-bot-makers-decorate-bots-4d2ae35bdf26
7. Chew, P. (2018). Searching for unknown unknowns: Unsupervised bot detection to defeat an adaptive adversary. Social Computing, Behavioral-Cultural Modeling, and Prediction, 2018, 357–366.
8. Chavoshi, N., Hamooni, H., & Mueen, A. (2016). Identifying correlated bots in Twitter. In E. Spiro & Y.-Y. Ahn (Eds.), SocInfo 2016, Part II, LNCS (Vol. 10047, pp. 14–21). Cham, Switzerland: Springer. https://link.springer.com/chapter/10.1007/978-3-319-47874-6_3
9. Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2014). The rise of social bots. Communications of the ACM, 59(7), 96–104.
10. MIT Technology Review. (2014). How to spot a social bot on Twitter. July 28, 2014. Available at https://www.technologyreview.com/s/529461/how-to-spot-a-social-bot-on-twitter/
11. Golub, G., & Van Loan, C. (1996). Matrix computations (3rd ed.). Baltimore, MD: Johns Hopkins University Press.
12. Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an 'explanatory' multi-modal factor analysis. UCLA Working Papers in Phonetics, 16, 1–84.
13. Varol, O., Ferrara, E., Davis, C., Menczer, F., & Flammini, A. (2017). Online human-bot interactions: Detection, estimation, and characterization. Proceedings of the International Conference on Web and Social Media (ICWSM). AAAI.
To Share or Not to Share: Effect of Peer-to-Peer Mentoring on Dynamics of Graduate Life
Raafat M. Zaini and Saeed P. Langarudi
Abstract A simulation model is developed in this paper to show how a peer-to-peer mentoring setting for sharing work-in-progress research can influence the academic journey of graduate students. The authors have used their personal experience in the design, development, and establishment of such a setting to formulate dynamic reference modes of behavior and the model structure. Simulation results show that achieving a balance between a smooth graduate research journey, depth of subject matter knowledge, and sharing of work-in-progress can be challenging, though a hypothetical optimal idea-sharing frequency does exist. We found, through model simulations, that over-sharing, under-sharing, and premature sharing all have significant consequences for graduate life. Regardless of its value, keeping the sharing frequency constant might help students improve their socio-academic life.
Keywords Academic life simulation · System dynamics · Graduate research · Higher education · Graduate students
1 Introduction
Graduate research is stressful by nature, as Jørgen Randers describes in his PhD dissertation [7]. A recent study shows that for a PhD student there is a 51% chance of experiencing psychological distress and a 32% chance of suffering from a common
R. M. Zaini ()
Department of Social Science & Policy Studies, Worcester Polytechnic Institute, Worcester, MA, USA
e-mail: [email protected]
S. P. Langarudi
College of Agricultural, Consumer & Environmental Sciences, New Mexico State University, Las Cruces, NM, USA
e-mail: [email protected]
psychiatric disorder [6]. One reason for such mental pressure could be the increasingly fierce competition in academic job markets. Using a systems approach, [5] show that in the USA only 12.8% of PhD graduates could potentially attain academic positions, assuming a steady state at the time of the study. Another study shows that in 2016 there was approximately one tenure-track position in the USA for every 6.3 PhD graduates in biomedical fields [2]. The current growing competition among research universities adds extra pressure on graduate researchers. Research universities around the world are racing to attract the best talent from the graduate student pool to their education programs, with high aspirations to create socio-economic impact through innovative academic research [8]. In their prevalent growth strategies, universities in the USA have also been expanding graduate programs, increasing graduate student enrollment to resolve economic issues and enhance their academic reputation [9]. Learning about graduate life dynamics helps students and their advisors better manage the process, leading to improved academic outcomes, better use of academic resources, and, hopefully, higher quality of life for the students. Many different perspectives could be taken to address this process. This paper, however, focuses on the impact of students' idea sharing on the dynamics of the graduate research journey. Graduate students usually start their graduate programs with a preconceived research idea to pursue in their thesis or dissertation. As they share their ideas with their advisors, professors, and peers, they are exposed to new ideas they feel curious to explore. Throughout their research journey, graduate students may experience divergence and convergence. Divergence happens when students start exploring new research ideas that differ from their initial research focus. Convergence happens when students start narrowing or combining ideas into the research focus that is necessary for producing a meaningful thesis or dissertation at the end of their program. Divergence from the main topic is oftentimes necessary for exploring different fields of inquiry, strengthening and facilitating the research in significant ways. Sometimes, without that exploration, the completion of the work would not have been possible, or at least not timely. In 2013, the authors initiated a peer-to-peer mentoring setting, called Collective Learning Meetings (CLMs), at the Worcester Polytechnic Institute. In this setting, graduate students share their research ideas with their peers (and, on many occasions, faculty) on a regular basis, seeking collegial, non-evaluative feedback to help them move forward with their work-in-progress projects. This setting has, to date, hosted about a hundred student presentations. As the founders of the CLM, the authors had a question about the impact of sharing research ideas, and of the resulting collegial feedback, on the graduate research journey. This paper employs a simple system dynamics model to address this question. First, the model explains an observed dynamic reference mode of behavior. Then, it is used to investigate the impact of the CLM peer-to-peer mentoring setting on potential trajectories of graduate life. The model structure and its parameters are defined based upon the authors' first-hand experience in design, development, and
establishment of the CLM mentoring setting, reinforced by anecdotal stories of graduate students and alumni from other institutions, shared with the authors through informal dialogues. Simulation outcomes of the model imply that there is a sweet spot for the frequency of sharing research ideas. If students share less frequently than the optimal level, they face the risk of not converging to a research idea and not finishing their study on time. Consequently, their level of subject matter knowledge might be negatively affected, because idea sharing and topic exploration are means of learning growth. Less sharing, hence less feedback, could mean less exploration and might lead to lower levels of subject matter knowledge. If they share more frequently than the optimal level, they will probably finish within the expected period. However, they may experience severe mental pressure at some point in their journey due to the excessive amount of feedback and the plethora of ideas they have to process. In this case, they may also end up with a less-than-optimal level of subject matter knowledge, as sharing takes time and may consume the students' research and exploration time. The paper is organized as follows: the model structure is explained in Sect. 2. Base case simulations are presented in Sect. 3. Results of some hypothetical experimentation on the consequences of students' idea sharing are discussed in Sect. 4. Section 5 explores possibilities of improvement in the dynamic behavior through alternative sharing strategies. Section 6 then concludes the paper.
2 Model Structure
The model tells the story of graduate students who are expected to produce a meaningful research thesis or dissertation at the end of their program. In their journey, the students may diverge from their research focus as a natural process of exploration, but eventually they need to converge toward a meaningful research idea that produces a thesis or dissertation. Therefore, we expect 'divergence' from research focus to increase initially, reach a peak, and decline to a low level at the end of graduate life. The model should be able to reproduce this reference mode of behavior. The stock and flow diagram of the model is shown in Fig. 1. Three stocks represent the state of the system: research ideas (R), collegial feedback (C), and knowledge (K). The students are assumed to start the graduate program with an already established research idea of their own. Thus, the initial value of the research ideas stock (R) is R0 = 1. Research ideas can increase through exploration (P) by an integration process:

Rt = Rt−dt + Pt−dt · dt,  R0 = 1   (1)
Exploration (P) is a linear function of divergence (D). If the students are not distracted from their original research idea, then they will remain focused and stay in a dynamic equilibrium:
Fig. 1 Stock and flow diagram with interacting feedback loops comprising the model structure
Pt = p · Dt,  p = 1 − s   (2)
In this equation, p is a constant representing the number of ideas that each unit of divergence (D) may generate. This parameter, however, depends on the sharing fraction (s), which is explained below. The more time the students spend sharing, the less time they have for exploration, which explains the inverse relationship between the two parameters. The students share their ideas with their advisor(s) and peers in formal or informal settings like the CLM. This activity is called the sharing rate (S):

St = s · r · Rt,  s, r > 0   (3)

Here, s represents the frequency of sharing ideas by the students and r indicates the productivity of sharing. Higher values of either parameter indicate more frequent or more productive sharing activities. For example, the sharing frequency could represent the number of conferences, meetings, or colloquiums that the students attend during an academic year to present their work. The feedback they receive from these exchanges accumulates in the collegial feedback stock (C), which is initially empty, thus C0 = 0:

Ct = Ct−dt + St−dt · dt,  C0 = 0   (4)
The third stock of the model is knowledge (K). Knowledge increases through the learning rate (L), which itself is a simple linear function of exploration (P); l represents the amount of knowledge acquired per unit of exploration. Furthermore, we assume that the relevant subject matter knowledge that leads to a coherent research project is initially zero. This simplified assumption serves our modeling purposes reasonably well; in reality, graduate students usually have a certain level of domain knowledge before starting their research program.

Lt = l · Pt,  l > 0   (5)

Kt = Kt−dt + Lt−dt · dt,  K0 = 0   (6)

We define divergence (D) as the discrepancy between the desired research focus (d) and focused ideas (A). Focused ideas represent the number of ideas that the students are pursuing. Graduate students are usually expected to complete one coherent research project, either in the form of a thesis, a dissertation, or a certain number of thematically related articles. Thus, the desired research focus (d) is assumed to be 1.

Dt = At − d,  d = 1   (7)
The sheer number of research ideas (R) increases focused ideas (A). If A is already higher than d, then increasing research ideas diverges the students from their research focus. The stock of knowledge (K) has the opposite effect: increasing knowledge converges the students' focused ideas (A) toward the desired focus (d). This effect is represented by a decaying function G (see Eq. 8). Figure 2 shows the default shape of this function along with some examples of alternative cases obtained by varying the multiplier v and the exponent w.
Fig. 2 Effect of knowledge on focused ideas
Gt = 1 / (1 + v · Kt^w)   (8)
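To make the model concrete, the following is a minimal Python sketch of the simulation. The Euler time step, the parameter values, the initial perturbation of R, and the assumption that focused ideas are computed as A = R · G are all illustrative guesses, not values taken from the paper.

dt = 0.25                        # time step (months); illustrative
months = 60                      # length of the program; illustrative
s, r, l = 0.2, 1.0, 0.5          # sharing fraction, sharing productivity, learning coefficient
v, w = 0.5, 2.0                  # shape parameters of G (Eq. 8); illustrative
d = 1.0                          # desired research focus (Eq. 7)
p = 1 - s                        # ideas per unit of divergence (Eq. 2)

R, C, K = 1.2, 0.0, 0.0          # stocks; R starts above 1 to perturb the equilibrium

history = []
for step in range(int(months / dt)):
    G = 1.0 / (1.0 + v * K ** w)   # effect of knowledge on focus (Eq. 8)
    A = R * G                      # focused ideas; assumed functional form
    D = A - d                      # divergence (Eq. 7)
    P = p * D                      # exploration (Eq. 2)
    S = s * r * R                  # sharing rate (Eq. 3)
    L = l * P                      # learning rate (Eq. 5)
    R += P * dt                    # Eq. 1
    C += S * dt                    # Eq. 4
    K += L * dt                    # Eq. 6
    history.append((step * dt, D)) # divergence trajectory (the reference mode)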
… change-time then
  move (change-time − entryTime) ∗ (turtle speed)
  schedule turtle event for patch change-time; break
end if
move (distance to edge of patch)
if turtle at edge of grid then
  turtle status ← dead; break
end if
entryTime = exitTime
if next patch is red AND next patch change-time > entryTime then
  schedule turtle event for patch change-time; break
end if
end while
end if
else (agent is patch)
  change patch color
  calculate next change-time and schedule patch event for change-time
end if
end while
Live turtles
and the number of turtle events is measured directly. Experimentally, for 50 turtles, the average simulation time is 85 min.
5.1 Performance of Time-Stepped Model
Every time step, a turtle event is executed once per turtle and a patch event is executed once per patch. Therefore, the total number of events should be (sim-time ∗ 144 + sim-time ∗ n). These results are confirmed in the data tables. It should be noted that time granularity can be improved in the time-stepped simulation. However, decreasing the size of the time step increases the number of events.
Fig. 3 Expected distribution of change times is constant
It is useful to calculate the average number of patches that change color in 1 min. Assume that the change-color times for patches are randomly assigned between 0 and 19, and that at every patch change-color event the patch adds 20 min to the current time for its next change time. In other words, the patch change time is incremented by a constant 20 min. Over a period of 20 min, all patches will change color exactly once. The average number of change-color events per minute is the number of patches divided by 20. If the simulation were stopped at any point, the expected distribution of change times would be 1/20th of the number of patches for each minute of the next 20 min, as depicted in Fig. 3, where the Y-axis is the percentage of patches. Now, consider when the color change time is drawn from a uniform distribution. To simplify, look first at 100 patches that are initially assigned a change time of 0–4. On average, there will be twenty patches with a change time of zero, twenty patches with a change time of one, and so forth, as in Fig. 4a. At its change time, each patch will draw a number between one and five with equal probability (uniform random distribution (1–5)). At time zero, twenty patches draw numbers from 1 to 5. On average, four patches will draw 1, four will draw 2, . . . , four will draw 5. The four patches that add 1 to their change time will be added to the patches that already have a change time of 1. The same occurs for the times 2, 3, and 4. At each of these times, there will now be 24 patches with those change times. There were no patches with a change time of 5, so now there will be four patches with a change time
Fig. 4 Evolution of time distribution of 100 patches with random draw from 1 to 5
of 5. The distribution is in Fig. 4b. At time 1, there are 24 patches to change color. Each one draws a number from 1 to 5, as before. On average, there will be 24/5 = 4.8 patches added to each of the next five change times. Times 2–4 then have 28.8 patches on average, there will be 8.8 patches with a change time of 5, and 4.8 patches with a change time of 6, as shown in Fig. 4c. Figure 4d, e and f continue the series. By time 5, the distribution assumes its triangular shape. Figure 5 shows the steady state for a random distribution of 1–20, where the Y-axis is the percentage of patches with the change time shown on the X-axis. The number of change events scheduled for any time is the percentage of patches multiplied by the total number of patches. Combining the fixed amount with the random amount of time for the change time gives a distribution with a constant value followed by a decreasing value. For the red patches, the distribution of the number of patches that will change at a given time looks like Fig. 6. For the green patches, the distribution looks like Fig. 7. These distributions are important because they represent the 'steady state' after a long period. Often, simulations are executed for a warm-up period to achieve steady state before they are actually used for analysis. Since the distribution after warm-up is given in Figs. 6 and 7, the simulation is initialized with the proper distribution. We can combine Figs. 6 and 7 to determine the expected number of patch changes during the next tick. Since about 20% of the patches (28.8 of them on average) are red, they will change according to Fig. 6, and about 80% of the patches (about 115.2 on average) are green, so they will change with the distribution in Fig. 7. We expect, on average, 1.44 red change events and 1.44 green change events in the next minute. Thus, there are 2.88 total color changes per tick in the time-stepped simulation. Since the average simulation time for a model execution with 50 turtles is 85 min, the total number of events is (m + n) ∗ 85 = (144 + 50) ∗ 85 = 16,490.
Fig. 5 Expected distribution of change times is decreasing
5.2 Performance of First Discrete Event Model
In the first discrete event simulation (DES 1), an event is scheduled for each patch color change, and patches that will change color in the future are not checked. Based upon the analysis at the end of the previous section, we should expect 2.88 patch color changes per minute, or a patch change event about once every 0.347 min. During each patch change event, every turtle is checked and moves, if possible. Therefore, the average number of events per minute is 2.88 ∗ (n + 1). When 2.88 ∗ (n + 1) < m + n, DES 1 executes fewer events than the time-stepped implementation. Solving this inequality:

2.88 ∗ (n + 1) < m + n
1.88 ∗ n < m − 2.88
n < m/1.88 − 1.53
n < 75.1 turtles.

When m = 144, DES 1 executes fewer events than the time-stepped simulation when the number of turtles is less than 75. At 75 turtles, they should execute about the same number of events.
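The break-even point can be checked with a few lines of Python; m = 144 patches as in the model, and the loop simply finds the first turtle count at which the inequality above fails.

m = 144
for n in range(1, 200):
    if 2.88 * (n + 1) >= m + n:
        print('DES 1 stops winning at n =', n)   # prints 76, i.e. fewer events for n <= 75
        break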
Fig. 6 Expected distribution of RED patch change times
Fig. 7 Expected distribution of GREEN patch change times
Comparing the actual run times for DES 1 and the time-stepped implementation for 100 turtles in Table 3, DES 1 takes less time, implying that turtle events are processed more quickly than patch events. It is interesting to note that in all of the cases, the average turtle age for the event-driven simulations is slightly lower than for the time-stepped simulation. This is due to the increased fidelity of the event-driven simulations. In the time-stepped
simulation, the turtle is always considered to exit the grid at an integer time. If the speed and location of the turtle would have it leave the grid at 20.4 min, the time-stepped simulation will record the age at the next tick, when the turtle is 21 min old. This accounts for the difference in average turtle age.
5.3 Performance of Second Discrete Event Model
The second discrete event model has the same number of patch events as the first. However, a turtle event is not executed as part of a patch color-change event. Rather, a turtle event is scheduled when the turtle can move. This means that there will be at least one turtle event for each turtle, plus one event whenever a turtle is blocked and then allowed to move again. Experimentally, there are 2.5 events per turtle over the life of the simulation. For 50 turtles, the total simulation time is about 85 min. The total number of events for this implementation is therefore expected to be about 2.88 ∗ 85 + 2.5 ∗ n = 245 + 2.5 ∗ 50 = 370 events. Experimentally, Table 2 shows 348 events. Using the results in Table 2, the second event-driven version is 16,533/348 = 47 times more efficient than the time-stepped simulation for 50 turtles. This advantage remains nearly constant as the number of turtles changes.
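For readers who want to experiment outside NetLogo, here is a minimal Python sketch of the event-driven pattern DES 2 uses: a binary-heap priority queue (Python's heapq) holds pending events, and the loop jumps from one event to the next instead of ticking every minute. The event kinds, the rescheduling rule, and the times used are illustrative only.

import heapq
import random

events = []                                   # binary-heap priority queue
heapq.heappush(events, (0.0, 'patch', 7))     # patch 7 changes color at t = 0
heapq.heappush(events, (3.2, 'turtle', 2))    # turtle 2 can move at t = 3.2

END_TIME = 85.0                               # minutes, as in the 50-turtle runs
while events:
    t, kind, who = heapq.heappop(events)      # jump straight to the next event
    if t > END_TIME:
        break
    if kind == 'patch':
        # change the patch color, then schedule its next change:
        # a fixed 20 min plus a uniform random draw, as in Sect. 5.1
        heapq.heappush(events, (t + 20 + random.uniform(1, 20), 'patch', who))
    else:
        # move the turtle; if it is blocked by a red patch, a new turtle
        # event would be scheduled for that patch's change time instead
        pass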
6 Conclusions
A given model can be implemented in numerous ways. A simulation practitioner who is able to render a model in both time-stepped and event-driven simulations has an advantage: sometimes, jumping from one event to the next in a model executes more quickly than a time-stepped implementation. The model described by Macal and North as an example ABM has been implemented here in three ways, one time-stepped and two event-driven. The results in the data tables show that the simulations produce equivalent results but have different execution time profiles. Figure 8 shows that DES 1 executes in about half the time of the time-stepped implementation, and DES 2's performance advantage increases as the number of turtles increases. The authors have shown how NetLogo users can construct event-driven simulations for their own models using the NetLogo Array and Table extensions.
Fig. 8 Runtime for 50,000 replications of each implementation on a Dell Latitude E6540 laptop
Fig. 9 All nodes on the same level are stored sequentially, followed by the nodes on the next level
Appendix: Heap Implementation
This description and code have been adapted from the GeeksForGeeks website [6]. A binary heap can add or delete an element in O(log n) time, making it extremely efficient. A binary heap is a binary tree with two properties:
1. It is a complete tree (all levels are completely filled except possibly the last level, and the last level has all values as far left as possible).
2. The value at the root is the minimum value in the whole tree. This property is recursively true for all nodes in the binary tree.
Figure 9 shows how the array is laid out. The following expressions show how to find the parent and children of node i:
parent node: Array[(i − 1)/2]
left child node: Array[(2 ∗ i) + 1]
right child node: Array[(2 ∗ i) + 2]
The binary tree represented by this array is shown in Fig. 10. Below is NetLogo code for the basic heap operations. The code does not include error checks (overflow, empty array, etc.).
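As a quick sanity check of the index arithmetic, in Python (0-indexed, with integer division; the stored values are hypothetical):

heap = [5, 10, 30, 40, 50, 100, 40]   # hypothetical values laid out as in Fig. 9
i = 2                                  # the node holding 30
parent_val = heap[(i - 1) // 2]        # 5: the root
left_val = heap[2 * i + 1]             # 100: left child of node i
right_val = heap[2 * i + 2]            # 40: right child of node i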
Fig. 10 Children of a node are always larger than the parent
; returns the root and restores property 2
; (body reconstructed from the standard binary-heap algorithm; the
; original was garbled in extraction)
to-report extract-min
  ifelse heap-size = 0
  [ report false ]  ; empty heap (no further error checks, as noted above)
  [ let root array:item heap 0
    set heap-size heap-size - 1
    array:set heap 0 array:item heap heap-size
    heapify
    report root ]
end

; add a new value, then sift it up to restore property 2
; (procedure name assumed; the sift-up loop is from the original)
to insert-key [ value ]
  array:set heap heap-size value
  let i heap-size
  set heap-size heap-size + 1
  while [ ( i > 0 ) and ( array:item heap ( parent i ) > array:item heap i ) ]
  [ item-swap i ( parent i )
    set i ( parent i ) ]
end

to-report parent [ n1 ]
  report ( n1 - 1 ) / 2
end

to-report left-child [ n1 ]
  report 2 * n1 + 1
end

to-report right-child [ n1 ]
  report 2 * n1 + 2
end

to item-swap [ n1 n2 ]
  let ntemp array:item heap n1
  array:set heap n1 array:item heap n2
  array:set heap n2 ntemp
end

; restore ordering from the root down; assumes all subtrees have the heap property
to heapify
  let i 0
  let smallest -1
  while [ smallest != i ]
  [ let l ( left-child i )
    let r ( right-child i )
    set smallest i
    if ( ( l < heap-size ) and ( array:item heap l < array:item heap i ) )
    [ set smallest l ]
    if ( ( r < heap-size ) and ( array:item heap r < array:item heap smallest ) )
    [ set smallest r ]
    if smallest != i
    [ item-swap i smallest
      set i smallest
      set smallest -1 ]
  ]
end
References
1. Beeker, E., Bergen-Hill, T., Henscheid, Z. A., Jacyna, D. G., Koehler, M. T. K., Litwin, L., et al. (2010). COIN 1.5 model formulation. MITRE Technical Report.
2. Beeker, E., & Koehler, M. (2018, July). ExampleABM event driven 1. http://ccl.northwestern.edu/netlogo/models/community/ExampleABMEventDriven1. Accessed 17 July 2018.
3. Beeker, E., & Koehler, M. (2018, July). ExampleABM event driven 2. http://ccl.northwestern.edu/netlogo/models/community/ExampleABMEventDriven2. Accessed 17 July 2018.
4. Beeker, E., & Koehler, M. (2018, July). ExampleABM time stepped. http://ccl.northwestern.edu/netlogo/models/community/ExampleABMTimeStepped. Accessed 17 July 2018.
5. Gardner, M. (1970). Mathematical games: The fantastic combinations of John Conway's new solitaire game 'life'. Scientific American, 223, 120–123.
6. GeeksForGeeks. Binary heap. https://www.geeksforgeeks.org/binary-heap/. Accessed 9 July 2018.
7. Law, A. (2007). Simulation, modeling and analysis (4th ed.). New York: McGraw-Hill.
8. Leckie, W., & Greenspan, M. (2005). An event-based pool physics simulator. Advances in Computer Games, 11th International Conference (pp. 143–186).
9. Macal, C. M., & North, M. J. (2008). Agent-based modeling and simulation: ABMS examples. Proceedings of the 2008 Winter Simulation Conference (pp. 101–113).
10. Nance, R. E. (1971). On time flow mechanisms for discrete system simulation. Management Science, 18(1), 59–73.
11. Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1, 143–186.
12. Thompson, J. R., & Page, E. H. (2016). A primer on selected modeling and analysis techniques. MITRE Technical Report MTR160208.
13. Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/. Accessed 22 June 2018.
14. Zeigler, B. P. (2000). Theory of modeling and simulation. Orlando: Academic Press.
A Popularity-Based Model of the Diffusion of Innovation on GitHub
Abduljaleel Al-Rubaye and Gita Sukthankar
Abstract Open source software development platforms are natural laboratories for studying the diffusion of innovation across human populations, enabling us to better understand what motivates people to adopt new ideas. For example, GitHub, a software repository and collaborative development tool built on the Git distributed version control system, provides a social environment where ideas, techniques, and new methodologies are adopted by other software developers. This paper proposes and evaluates a popularity-based model of the diffusion of innovation on GitHub. GitHub supports a mechanism, forking, for creating personal copies of other software repositories that can be used to measure the propagation of code between developers. We examine the effects of repository popularity on two aspects of knowledge transfer, innovation adoption and sociality, measured on a dataset of GitHub fork events. Keywords GitHub · Popularity · Diffusion of innovation · Sociality
1 Introduction
Social media platforms are highways for innovation, enabling ideas to flow rapidly across the world. Unfortunately, on most platforms the ease and velocity of communication make it challenging to disentangle the actual transmission of an innovation from the shared zeitgeist of an idea simultaneously arising from multiple sources. However, in open software development platforms such as GitHub, the same infrastructure used for software version control can be repurposed to track the transmission of innovation from one software developer to the next. In his seminal book on this topic, Rogers identified several important elements that affect the speed of adoption, including communication channels, time and the social
A. Al-Rubaye · G. Sukthankar ()
University of Central Florida, Orlando, FL, USA
e-mail: [email protected]; [email protected]
system [1]. The aim of our research is to understand the effects of the GitHub social system on code adoption. The GitHub community now numbers 24 million developers working across 67 million repositories, making it the largest host of source code in the world [2]. Code on GitHub is stored in repositories, and the repository owner and collaborators make changes to the repository by committing content. Any GitHub user can contribute to a repository by sending a pull request. The owners and collaborators review pull requests and decide whether to accept or reject them. Contributors can attach a comment to their commit or pull request to communicate a message. Unlike other communication channels, the list of public events is available to everyone; this paper uses data from the GHTorrent project [3], which monitors the GitHub public event timeline. Three event types in particular are key for tracking public interest in a repository: forking, watching, and starring. Forks occur when a user clones a repository and becomes the owner of the copy. Sometimes forks are created by the original team of collaborators to manage significant code changes, but anyone can fork a public repository. Developers can watch a repository to receive all notifications of changes, and star repositories to signal approval for the project and receive a compressed list of notifications. Forks are valuable for tracking the spread of innovation, and all three events (fork, watch, and star) have been used as measures of repository popularity. This paper introduces a popularity-based model of GitHub code adoption and uses it to predict the spread of innovation as measured through fork events. Our model tracks relationships between GitHub repositories, users, and followers in order to generate event timelines and social structures. Our results show that the inclusion of popularity is key to improving modeling performance on the dimensions of innovation adoption and sociality.
2 Related Work
GitHub has been used as a laboratory to study social behavior at many scales, including individual differences [4], group collaboration [5], and code ecosystems [6]. Onoue et al. [4] used GitHub to analyze the amount of time different developers spent on coding vs. commenting and issue handling. Although GitHub provides lists of top contributors based solely on the number of commits, their study concluded that active projects were best served by a mix of developer types [4]. Saadat et al. [7] created an agent-based model of developer activities, initialized with archetypes extracted from stable clusters of GitHub activity profiles, in order to model the disparity between user contributions [7]. Since GitHub is a rich source of data on the performance of virtual teams, many studies have centered on collaboration between developers. Dabbish et al. [5] hypothesized that the social transparency of the platform was key for promoting collaborations by empowering users to make inferences about other developers' commitment and work quality [5]. Borges et al. [8] used multiple linear regression
to predict the future popularity of repositories [8], which may also serve as a proxy for understanding virtual team performance [9]. In this paper, we use repository popularity to predict the spread of innovation and the future structure of the network. Yu et al. [10] investigated social communications and innovation diffusion on GitHub through an analysis of growth curves and follower networks [10]. Two types of networks were flagged as significant by Zhang et al. [11]: developer-developer and developer-project. They argue that the interactions within these networks can affect the generation of innovation diffusion [11]. Our model centers on popularity rather than network structure and treats structure as a secondary effect. The next section describes our dataset and our popularity-based model of innovation diffusion.
3 Method

Our research uses data from the GHTorrent project [3], an archive of public event data extracted using GitHub's REST API; the data were retrieved on January 16, 2018. There are 37 listed event types in GitHub, including create, fork, watch, issue, push, and pull request events [12]. These events alert GitHub developers to changes in repositories that they are tracking. The code to execute our model has been made available at: https://github.com/aalrubaye/GithubDataAnalysis.
3.1 Data Filtration

From the GHTorrent dataset, we retained only the fork events (41,014 in total). Each fork event object contains information about the user who generated the event and his/her followers. Additionally, it includes repository properties, such as programming language, fork count, watchers, and size. We created a copy of the dataset but simplified the fork event object to include only the information relevant for our innovation diffusion model. A sample of our fork event object is shown below.

{
  'actor': 'GithubUser',
  'created_at': '2018-01-16T02:00:58Z',
  'followers': ['follower_1', 'follower_2', 'follower_3'],
  'repo_created_at': '2017-01-01T17:44:32Z',
  'repo_forks_count': 4,
  'repo_language': 'C++',
  'repo_name': 'my_repo',
  'repo_network_count': 4,
  'repo_size': 536,
  'repo_stargazers': 2,
  'repo_subscribers_count': 4,
  'repo_watchers': 2
}
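As a concrete illustration of this filtration step, the following Python sketch extracts fork events from a file of GHTorrent-style records and reduces each one to the simplified schema shown above. It is a sketch only: the input file name and the exact layout of the raw records (including how follower lists are attached) are our assumptions for illustration, not a description of the authors' pipeline.

import json

def simplify(event):
    # Reduce a raw fork event record to the fields our model uses.
    repo = event['repo']
    return {
        'actor': event['actor'],
        'created_at': event['created_at'],
        'followers': event.get('followers', []),
        'repo_created_at': repo['created_at'],
        'repo_forks_count': repo['forks_count'],
        'repo_language': repo['language'],
        'repo_name': repo['name'],
        'repo_network_count': repo.get('network_count', 0),
        'repo_size': repo['size'],
        'repo_stargazers': repo['stargazers_count'],
        'repo_subscribers_count': repo.get('subscribers_count', 0),
        'repo_watchers': repo['watchers_count'],
    }

# Keep only fork events, discarding the other 36 event types.
with open('gh_events.json') as f:          # hypothetical input file
    events = [json.loads(line) for line in f]
fork_events = [simplify(e) for e in events if e['type'] == 'ForkEvent']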
The programming language is clearly relevant to code adoption. There are 160 distinct repository languages in our dataset, of which JavaScript was the most common across all fork events (8665). Table 1 details the programming languages of the repositories in our original dataset, ranked by the number of fork events. Clearly there are a few highly used languages, while many languages (aggregated in the table under the category Other Languages) had only a few fork events. To account for the dominance of the most common languages, we retained only the repositories coded in the top 20% of languages (32 languages) and removed the rest. As a result, our dataset included 39,935 fork events, from which we randomly sampled 5000 fork events to create our model.
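A minimal sketch of this language-based filter, assuming the fork_events list from the previous snippet; the 20% cutoff follows the description in the text.

from collections import Counter

lang_counts = Counter(e['repo_language'] for e in fork_events)
ranked = [lang for lang, _ in lang_counts.most_common()]

# Keep repositories coded in the top 20% of languages by fork-event count
# (32 languages out of 160 in the paper's dataset).
top_k = max(1, int(0.20 * len(ranked)))
top_languages = set(ranked[:top_k])
filtered = [e for e in fork_events if e['repo_language'] in top_languages]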
3.2 Model Construction

Our model of innovation diffusion includes three components: (1) repositories, (2) actors (the users who fork the repositories), and (3) their followers. Since GitHub repositories host ideas, technologies, and methods, they can be considered knowledge resources that transmit innovations among projects through the users and their followers.
Table 1 Top 10 repository languages, ranked by number of fork events

Repo language     Fork events
JavaScript        8665
Java              5925
Python            5608
C++               2366
HTML              2278
PHP               1717
Ruby              1555
C                 1533
C-sharp           1349
Go                1180
Other languages   8838
We link the components with three types of connections: repo–actor, repo–repo, and actor–follower. These connections arise naturally from GitHub's event and notification system:
• A GitHub user (actor) can create different events on multiple repositories.
• The followers of an actor are notified about commits made by their followee.
Thus we conclude that a connection exists between a repository and an actor if the actor forks the repository. Likewise, the second interaction (repo–repo) occurs implicitly if a user creates two similar events on two different repositories. These repositories are likely to be related to each other in multiple ways, including language, code structure, and topic; in particular, it is likely that there is a connection between two repositories that were forked by the same user. The actor–follower relationship is explicitly defined as the list of followers for a specific actor. Figure 1 shows an illustration of a sample cluster in the network that contains all three connection types.
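The sketch below builds these three connection types with networkx, consuming the simplified fork event objects from Sect. 3.1. It is a minimal reading of the construction (it mirrors Algorithm 1 in the Appendix); the variable names are ours, not from the authors' released code.

import networkx as nx

G = nx.Graph()
repos_by_actor = {}                        # actor -> set of repos already forked

for e in filtered:                         # simplified fork event objects
    actor, repo = e['actor'], e['repo_name']
    G.add_node(repo, kind='repository')
    G.add_node(actor, kind='actor')
    G.add_edge(repo, actor, kind='repo-actor')
    # repo-repo: two repositories forked by the same actor are linked.
    for other in repos_by_actor.setdefault(actor, set()):
        G.add_edge(repo, other, kind='repo-repo')
    repos_by_actor[actor].add(repo)
    # actor-follower links come from the event's follower list.
    for f in e['followers']:
        G.add_node(f, kind='follower')
        G.add_edge(actor, f, kind='actor-follower')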
3.3 Identifying Structural Changes Over Time

Our model network spreads innovation by connecting repositories to potential adopters (followers). However, the structure of the network changes over time; to model this, we examine the structure at three checkpoints to study the knowledge transmission rate. Assuming that the initial structure is constructed at time t0, three time points t1, t2, and t3 are defined, where t0 < t1 < t2 < t3. The temporal distance that separates them is defined as a period during which a threshold number of additional fork events are created; our results use a threshold of 5000 events between time points. The new fork events add to the existing connections (repository–actor and actor–follower).

Fig. 1 Our model contains three components (repos, actors, and followers) and three types of connections: (a) repo–repo, (b) repo–actor, and (c) actor–follower
Fig. 2 Repo–follower interaction: a probable connection that may occur when follower f of actor A forks repository R
Once followers are notified that their followee has created a fork event, they may be eager to learn about the new repository by forking it themselves. However, this is treated as a probable interaction that could occur at some point in the future, and many other factors may be involved. Figure 2 shows the potential repo–follower interaction. Because the repositories in our model do not all possess the same features, we assume that these differences affect the likelihood of a repository being forked by an actor's followers at a given time point. For instance, if there are two repositories R1 and R2, where R1 has higher popularity than R2, then our assumption is that R1 will be forked at an earlier time than R2. Popularity is a feature that may be considered as encompassing the repository's reputation. In this work we use the popularity of a repository as the factor that identifies its effectiveness at transmitting innovation. Out of the fork event object properties, the features that are correlated with the popularity of a repository are:
• repo fork count: the total number of users that forked a repository
• repo watcher count: the number of users that chose to receive emails or notifications about a repository
• repo stargazer count: the number of users who bookmarked a repository
However, since a GitHub user who stars a repository is automatically set to watching that repository, in most cases the watcher and stargazer counts (repo_watchers and repo_stargazers) are equal. By comparing the number of fork events with the number of watchers or stargazers of a repository, we observed that the two quantities increase together [13]. Figure 3 shows the correlation between repo fork events and watcher count. Thus, to utilize repo popularity in our model, we use the following formulation for the probability of establishing a specific connection between repository R and a follower of the actor A who forked R:

P(forkEvent_{t_j}(R, f_A)) = β · forkEvent_{t_i}(R, A),    (1)

where:
• t_i < t_j,
• forkEvent_{t_i}(R, A) = 1 if A has forked R,
Fig. 3 The correlation between the number of fork events and the watcher count of the repositories in our dataset
• f_A is a follower of A that has not yet forked R,
• β is the likelihood of the connection forming, based on the popularity of R; β is closer to 1 if R is more popular than the other repositories.
To find the value of β, we compute the following fraction:

β = forkCount(R) / Σ_{n=1}^{N} forkCount(R_n),    (2)
where forkCount(R) is the number of fork events of R and N is the total number of repositories. According to our assumption, once a repository is forked by a follower, the popularity of that repository increases by one more fork event. At each time when a connection may occur, a follower is selected randomly from among an actor's followers to interact with the repository. Figure 4 depicts the evolution of a simple network generated using the example data provided in Table 2. We compare this model against a random network evolution model.
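The update step below is a minimal Python reading of Eqs. (1) and (2) (and of Algorithm 2 in the Appendix): each new fork event selects a repository with probability proportional to its current fork count, then links a randomly chosen not-yet-forked follower to it. The data structures are illustrative assumptions.

import random

def evolve(fork_counts, followers_of, forked, n_events=5000):
    """Add n_events popularity-weighted repo-follower fork events.

    fork_counts: dict repo -> current fork count (popularity)
    followers_of: dict repo -> followers of the actor who forked the repo
    forked: set of (follower, repo) pairs already realized
    """
    repos = list(fork_counts)
    for _ in range(n_events):
        total = sum(fork_counts.values())
        # Equation (2): beta is each repo's share of all fork events.
        weights = [fork_counts[r] / total for r in repos]
        repo = random.choices(repos, weights=weights)[0]
        candidates = [f for f in followers_of[repo]
                      if (f, repo) not in forked]
        if candidates:
            f = random.choice(candidates)   # random follower, as in the text
            forked.add((f, repo))
            fork_counts[repo] += 1          # a new fork raises popularity by one
    return forked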
4 Results

After designing a model that translates the existing relationships between repositories and users, based on the occurrence of fork events, we used the data to create a network at the initial time point t0 (Fig. 5). We then generated three versions of the network for the time points t1, t2, and t3, where each one represents the structure with additional introduced forks (repo–follower connections).
Fig. 4 The popularity-based network evolution process generated with the data in Table 2. (t0) is the base model generated according to the data shown in Table 2; in (t1) R1 is the most popular repository, so the likelihood of it being forked by f1 is higher. In (t2) R3 will have a higher chance of being forked, and in (t3) R2. Notice that the temporal distance between the time points is the time required for one repository to be forked

Table 2 A set of fork events used to evolve the sample network (t0) shown in Fig. 4

Fork event    Actor   Actor's follower   Repo   Total repo fork count
forkEvent1    A1      f1                 R1     100
forkEvent2    A1      f1                 R2     20
forkEvent3    A2      f2                 R3     65

The table presents a set of three events performed by the actors A1 and A2 to fork the repositories R1, R2, and R3. The table also shows two extra details: the followers of the actors, and the fork event count for each repository
Since we used repository popularity as the factor that determines the probability of adding fork events, we labeled these structures prob(ability) networks. To evaluate the contribution of popularity, we compare the popularity-based model to a random model in which all repositories have an equal chance of being forked. Consequently, we ended up with one main network at t0 (initialized with the data) and three more for each type (prob and random), representing the structures at t1, t2, and t3. Diffusion of innovation is a multidimensional construct that seeks to explain the spread of knowledge across a population. There are several factors that affect the spread of ideas, techniques, and technologies.
Fig. 5 The initial network of connections resulting from our GitHub fork event dataset. The colors indicate different communities in the network, which were detected with the modularity algorithm [14]. Red indicates higher connectivity among the nodes, and blue marks less connected communities. The node size indicates the type of node (e.g., size of repository nodes > size of actor nodes > size of follower nodes). We used the open source software Gephi [15] to visualize this network (V = 19,250, E = 33,988)
According to Rogers (2010), knowledge spread depends mainly on agents that act as intermediaries in this process [1]. Therefore, an innovation must be adopted by more individuals in order to continue spreading over time and across the community. In our experiments, we evaluate the popularity-based model's ability to forecast the diffusion of innovation along two dimensions:
• Innovation adoption: measured by the rate at which knowledge spreads across a community.
• Sociality: measured by the network's main structural properties and their effect on knowledge spread.
4.1 Innovation Adoption

To evaluate the characteristics of our models, we compare the evolution of both models (popularity and random) at each time point against the original network. Figure 6 shows the cumulative distribution of fork events across repositories. The popularity model preferentially assigns events to the repos with higher popularity, causing the popular repositories to become even more popular sooner. The random model, by contrast, leans toward spreading knowledge equally across repositories, independent of popularity.
Fig. 6 The repositories' rate of acquiring new fork events in the popularity and random models at three time points, where repository popularity is taken to be the total number of fork events since creation. (a) depicts the difference between the popularity and random structures at t1 relative to the structure at t0, while (b) and (c) show the same comparison for the structures at t2 and t3, respectively
In other words, in the popularity model, the users that interacted with popular repositories can be considered potential early adopters, more forward-looking than the late adopters who are not following the popular repositories. The evolution of the popularity-based model is similar to that of a preferential attachment network [16], with an increasing disparity in code adoption over time.
4.2 Sociality

One of the main elements that affects knowledge spread is the way a community is structured. According to Rogers (2010), communities with social characteristics have a positive influence on the diffusion of innovation [1]. We compare the network structures generated by our models using a set of standard network measures:
• Average Path Length: the average number of hops (one-step links) separating two nodes in a network.
• Clustering Coefficient: describes how tightly nodes are connected to one another.
• Degree Distribution: the probability P(k) that a node has degree k.
Two common classes of networks are small-world and scale-free networks. Small-world networks possess highly connected clusters, which result in a high clustering coefficient [17]; because of the low average path length, nodes can be reached in a few steps from anywhere in the network. Scale-free networks tend to have many nodes with low connectivity along with a few highly connected nodes, due to their power-law degree distribution [18]. The degree distribution is a power law [19] if it follows

P(k) = c·k^{−α},

where the exponent α falls in the range 2 < α < 3.
Table 3 Characteristics of the evolving networks at different timesteps

                         Initialization   Popularity model        Random model
                         t0               t1     t2     t3        t1     t2     t3
Average path length      5.75             5.59   5.49   5.42      5.60   5.51   5.44
Clustering coefficient   0.53             0.73   0.80   0.84      0.74   0.79   0.82
Average degree           3.51             4.03   4.55   5.06      4.03   4.55   5.06
Power-law exponent α     2.00             2.04   2.07   2.08      2.41   2.12   2.08
Fig. 7 Power-law fits of the popularity (prob) and random models' degree distributions at all time points
Table 3 compares the predicted network structure of both models at the three checkpoints. In most cases, the popularity-based model exhibits traits common to most social networks, such as shorter path lengths and higher clustering coefficients. The average degrees are identical because we allowed both model types to receive the same number of connections (fork events) for a fair comparison. Figure 7 shows that both model types have scale-free characteristics, meaning that the degree distributions fit a power law; regardless of the time factor, the power-law exponent α falls between 2 and 3 in all cases.
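These structural measures are straightforward to compute for a networkx graph such as the one built in Sect. 3.2. The sketch below reports average path length, clustering coefficient, and average degree, plus a power-law exponent via the continuous maximum-likelihood estimator of Clauset et al. [19]; fixing k_min = 1 is our simplifying assumption (in [19], k_min is fitted from the data).

import math
import networkx as nx

def structure_report(G, k_min=1):
    # Path length is only defined within a connected component.
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    degrees = [d for _, d in G.degree()]
    ks = [k for k in degrees if k >= k_min]
    # Continuous MLE for the power-law exponent: alpha = 1 + n / sum(ln(k/k_min)).
    alpha = 1 + len(ks) / sum(math.log(k / k_min) for k in ks)
    return {
        'average path length': nx.average_shortest_path_length(giant),
        'clustering coefficient': nx.average_clustering(G),
        'average degree': sum(degrees) / G.number_of_nodes(),
        'power-law exponent alpha': alpha,
    }

print(structure_report(G))   # G from the construction sketch above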
5 Conclusion

Our research aims to understand the driving forces behind the diffusion of software innovations, as measured by GitHub code adoption over time. In this paper, we examined the role that repo popularity plays in code adoption. In essence, our model measures how predictive past fork events (popularity) are of future fork events (innovation). To simulate knowledge spread, we computed the allocation of fork events across repositories and the network structure at three different time points, after more fork events were introduced into the model from the dataset. Our results demonstrate that the popularity-based model is superior at modeling the main elements of the diffusion of innovation: innovation adoption and sociality. Note that our model does not account for the introduction of new followers after repo fork events and uses a fairly simple measure of popularity. We are currently developing a general agent-based model of human interactions on social media platforms, including GitHub, Twitter, and Reddit. Many GitHub event sequences are repetitive, with developers systematically committing code, reporting issues, and making pull requests. However, fork events are often more difficult to predict from past interactions: what makes a software developer suddenly become interested in a different code repository? In addition to popularity, the topic and programming language of a repository also affect its future likelihood of being forked. In future work, we plan to expand our model to include other repository features and measures of influence.

Acknowledgements This work was partially supported by grant FA8650-18-C-7823 from the Defense Advanced Research Projects Agency (DARPA). The views and opinions contained in this article are the authors' and should not be construed as official or as reflecting the views of the University of Central Florida, DARPA, or the U.S. Department of Defense.
Appendix

Our code for generating the innovation network from GitHub events can be found at https://github.com/aalrubaye/GithubDataAnalysis. The algorithms for initializing and updating the network follow.
Data: Ground truth data GD
Result: GitHub network
for each input of GD do
    Create a new node;
    node.type = input.type;
end
for all nodes r where r.type = repository do
    a = the actor associated with repository r;
    Set link(r, a);
    if a.degree > 1 then
        for all other nodes m linked to a do
            Set link(r, m);
        end
    end
end
for all nodes a where a.type = actor do
    F = set of all followers of a;
    for all followers f in F do
        Set link(a, f);
    end
end
Algorithm 1: Constructing the model connections (repo–actor, repo–repo, and actor–follower) based on the ground truth dataset GD, which includes repositories with fork events R, actors A, and their follower set F
Data: Ground truth data
Result: GitHub network
Set NF = number of new fork events to add to PM;
for i = 1 to NF do
    Calculate the probability of each repository in PM based on its popularity (Equation 1);
    Set HR = the repository with a higher probability of getting another fork event;
    Set HA = the actor associated with HR;
    Set HF = the set of followers of HA that have not forked HR yet;
    if HF is not empty then
        Set f = a random follower from HF;
        Set link(f, HR);
    end
end
Algorithm 2: Model update procedure for introducing the fork events that have occurred since the last time point into the model network PM
References

1. Rogers, E. M. (2010). Diffusion of innovations. New York: Simon and Schuster.
2. GitHub.com (2017). The State of the Octoverse. Available: https://octoverse.github.com/
3. Gousios, G., & Spinellis, D. (2012). GHTorrent: GitHub's data from a firehose. In IEEE Working Conference on Mining Software Repositories (pp. 12–21).
4. Onoue, S., Hata, H., & Matsumoto, K.-i. (2013). A study of the characteristics of developers' activities in GitHub. In Asia-Pacific Software Engineering Conference (Vol. 2, pp. 7–12).
5. Dabbish, L., Stuart, C., Tsay, J., & Herbsleb, J. (2012). Social coding in GitHub: Transparency and collaboration in an open software repository. In Proceedings of the ACM Conference on Computer Supported Cooperative Work (pp. 1277–1286).
6. Blincoe, K., Harrison, F., & Damian, D. (2015). Ecosystems in GitHub and a method for ecosystem identification using reference coupling. In Proceedings of the Working Conference on Mining Software Repositories (pp. 202–207).
7. Saadat, S., Gunaratne, C., Baral, N., Sukthankar, G., & Garibay, I. (2018). Initializing agent-based models with clustering archetypes. In Proceedings of the International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction, Washington.
8. Borges, H., Hora, A., & Valente, M. T. (2016). Predicting the popularity of GitHub repositories. In Proceedings of the International Conference on Predictive Models and Data Analytics in Software Engineering (p. 9).
9. Saadat, S., & Sukthankar, G. (2018). Predicting the performance of software development teams on GitHub. In International Conference on Computational Social Science, Evanston.
10. Yu, Y., Yin, G., Wang, H., & Wang, T. (2014). Exploring the patterns of social behavior in GitHub. In Proceedings of the International Workshop on Crowd-based Software Development Methods and Technologies (pp. 31–36).
11. Zhang, Z., Yoo, Y., Wattal, S., Zhang, B., & Kulathinal, R. (2014). Generative diffusion of innovations and knowledge networks in open source projects. In 35th International Conference on Information Systems "Building a Better World Through Information Systems", ICIS 2014. Auckland: Association for Information Systems.
12. GitHub.com. Event types & payloads. Available: https://developer.github.com/v3/activity/events/types/
13. Peterson, K. (2013). The GitHub open source development process. http://kevinp.me/githubprocess-research/github-processresearch
14. Duch, J., & Arenas, A. (2005). Community detection in complex networks using extremal optimization. Physical Review E, 72(2), 027104.
15. Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. In International Conference on Weblogs and Social Media (pp. 361–362).
16. Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512.
17. Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of small-world networks. Nature, 393(6684), 440.
18. Barabási, A.-L., & Bonabeau, E. (2003). Scale-free networks. Scientific American, 288(5), 60–69.
19. Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703.
What Can Honeybees Tell Us About Social Learning?

Robin Clark and Steven O. Kimbrough
Abstract We independently implemented and studied the nest-site selection model described in Passino, K. M., and Seeley, T. D., "Modeling and analysis of nest-site selection by honeybee swarms," 2006, focusing on the default parameter values they obtained by field calibration. We focus on aspects of the model pertaining both to imitation and social learning and to the model as a kind of metaheuristic. Among other things, we find that the model is robust to different parameterizations of social learning, but that at least a modicum of social learning is essential for successful nest-site selection (in the model). Regarding the model as a metaheuristic, we find that it robustly produces good but significantly non-optimal nest-site selections. Instead of a single-criterion metaheuristic, the algorithm is best seen as balancing three objectives: choose the best of the available sites in the neighborhood, make the choice quickly, and minimize the risk of failing to choose any site.

Keywords Social learning · Imitation · Metaheuristics · Honeybees
1 Introduction

Honeybees are a remarkable collection of species (any member of the genus Apis). With their earliest fossils appearing 34 million years ago, they are on the order of 30 times older than our own species, and they have survived, even flourished, in the face of drastic changes in the biosphere and geosphere, changes under which human flourishing and civilization would be problematic.
R. Clark
Department of Linguistics, University of Pennsylvania, Philadelphia, PA, USA
e-mail: [email protected]

S. O. Kimbrough
Department of Operations, Information and Decisions, University of Pennsylvania, Philadelphia, PA, USA
e-mail: [email protected]
Among their many striking traits is the process by which a successful colony, having outgrown its nest site, sends out colonists and acquires a new nest site for a portion of the original colony, which continues to operate. This process has only reached mature scientific understanding in the last decade or so, beginning from initial studies in the 1950s. The science is thoroughly and delightfully described in Seeley's book Honeybee Democracy [17]. Briefly and at a high level of description, nest-site selection works as follows. Bees in a successful hive sense the need or opportunity to form a new colony. They begin to nurture a new queen, who is created from an ordinary larva by special feeding. As the queen matures, a large number of bees (on the order of 10,000) in the hive go quiet and gorge themselves on honey, which they otherwise cannot take with them. Rather suddenly, the new colonists, including the old queen (once a new queen is hatched to stay with the old hive), leave the hive and form a large swarm nearby. The swarm quickly identifies a staging place, typically on a nearby tree branch, and congeals into a small living ball of densely packed bees, who adjust their metabolisms so as to conserve energy and to maintain the ball at the proper temperature. A few score or a few hundred scout bees are recruited from bees that had been foragers. These scout bees fly off from the swarm more or less in random directions, in search of candidate nest sites. Their searches may extend several kilometers from the swarm. When the scouts return, those that have found promising sites engage in the famous waggle dance, indicating both the direction to the candidate nest site and the bee's own assessment of its quality. Scouts that do not encounter satisfactory sites rest and observe the dances. After a time, they go exploring again, possibly heading to one of the danced-for sites (with the choice of site biased towards the more highly evaluated sites being danced for). This is a form of imitation grounded not in directly copying an observed behavior, but in interpreting a signal transmitted by a compatriot, which allows the bee to imitate the behavior at a distance. If the bee does not imitate a dancer in this way, the bee explores randomly again, a form of innovation rather than imitation. The process continues until the bees sense a "quorum," a sufficient number of bees at one of the candidate sites, individually exploring its characteristics. Once this happens, scouts returning from the site begin "piping," making distinctive sounds that cause the bees in the swarm to awaken and prepare for flight. Soon after, the compressed swarm explodes into flight, dispersed over several score meters. Directed by scouts, the swarm heads off towards the chosen nest site. When it reaches the new site, if all has gone well, the bees enter it in an orderly fashion, including the queen, and begin operating as a new hive, striving to build up honey reserves as quickly as possible. Of course, much can go wrong, and site-selection ventures may often end in failure and the destruction of the would-be colonists. What is amazing is that over evolutionary time, millions of years, this mechanism has succeeded in preserving the species. Honeybee nest-site selection was investigated and its mechanics discovered by biologists interested in ascertaining the basic biology involved. Investigations continue, but it seems clear that the basic biology of honeybee nest-site selection is understood and the findings are settled.
Our interest in honeybee nest-site selection
complements, extends, and builds upon the basic biology. This interest of ours is mainly focused on two rather distinct topics, which we now discuss. Social learning is the first and indeed primary impetus driving our interest in honeybee nest-site selection. By social learning we mean, roughly, learning by individuals in a population (or society) driven principally by observation of, and interaction with, other individuals. This is to be contrasted with learning in which individuals mainly observe and interact with nature, that is, with the world outside the society. There is now a burgeoning field of study of social learning, motivated in large part by an overarching interest in social and cultural evolution (see, e.g., [6, 9, 10] for recent overviews). Following the important work of Boyd and Richerson (e.g., [3, 16]), who developed theory, complemented by empirical studies, that emphasizes the importance of imitation (and social learning more generally) in cultural evolution, a large amount of research in social and cultural evolution has been, and continues to be, directed at understanding social learning. (See [14] for an important example of a study of this kind.) The second impetus for our interest arises from the fact that the metaheuristic (≈ heuristic optimization algorithm schema) instantiated by the honeybees in selecting new nest sites is itself interesting and worth exploring for other applications. Metaheuristics (the term was coined by Fred Glover) are standardly categorized into two kinds: local search heuristics (such as simulated annealing, GRASP (greedy randomized adaptive search procedure), memetic algorithms, and tabu search) and population-based heuristics (evolutionary algorithms such as genetic algorithms and genetic programming, particle swarm optimization, and ant colony optimization) [1, 8, 11, 15, 19, 20]. There are literally scores of published metaheuristics, many of them "biologically inspired." Particularly notable about the honeybee nest-site algorithm (schema) is that it does not fit well into either the local search category or the population-based category of metaheuristics, although of course it has flavors of both approaches. Instead, it might best be described as an individual-based learning algorithm [7], akin to reinforcement learning in some guise, that is distributed rather than centrally guided. The individual in question, of course, is the swarm of bees, not the individual bees themselves, but the learning mechanism is distributed across all of the scouts and is not controlled by an executive. As such, it is perhaps most similar to ant colony optimization and particle swarm optimization in the canon of metaheuristics. Even so, in its use of copying and imitation, it also resembles memetic algorithms. In any case, there is much to hold one's interest. This is an exploratory study. We reimplemented in Python 3.6.4 the honeybee nest-site selection simulation model presented and discussed by Passino and Seeley in [13]. Our implementation was from scratch, because [13] does not provide the source code or very much in terms of software documentation. Although their paper is very informative and reports much of interest and value, even with a very close reading we were unable to discern certain aspects of their model. Fortunately, our results cohere with those reported in [13].
Moreover, our implementation serves well for exploring the two main topics of interest to us: social learning and the use of the honeybee nest-site selection procedure as a metaheuristic.
We describe our implementation in Sect. 2 and our results in Sect. 3. Section 5 concludes the paper with a discussion of the results and promising avenues for future research. We note that recent releases of NetLogo come with a model of honeybee nest-site selection, called the BeeSmart Hive Finding model. This is an interesting and useful model with nice graphics. No doubt it could be modified for the experimental purposes we have in mind, but that remains to be done. Our Python implementation is quite independent and is well suited for our purposes. Like the authors of the NetLogo model, we are happy to make our source code and documentation publicly available.
2 Methods

Table 1 presents the model's parameters and default values, drawn from Passino and Seeley [13]. The table is given for purposes of reference; we refer to these variables in the sequel as needed. Table 2 presents, also for future reference, observed or computed quantities in the model. Our documented source code is available from the authors. The original article by Passino and Seeley [13] supplies, in addition, valuable documentation and background for the model, including parameter calibration information. In consequence, and because of space limitations, we focus here on the core procedures of the model.
Table 1 Default parameter settings

Symbol           Default value                            Description
Jxsize, Jysize   21, 21                                   Side dimensions of the nest-site quality landscape, J
locations        [(8, 11), (14, 6), (2, 20), (16, 0),     Grid coordinates of candidate sites
                 (15, 19), (16, 18)]
εq (epsilonq)    20                                       Quorum threshold
εs (epsilons)    15                                       Dance decay rate
εt (epsilont)    0.2                                      Quality/dance threshold
σ                400                                      Exploration tendency
SiteValues       [0.1, 0.3, 0.35, 0.5, 0.55, 1]           Site utility scores
B                100                                      Number of scout bees
maxTicks         64                                       Maximum number of ticks in a run
mRate            0.1                                      Nominal mortality rate
pm (pm)          0.25                                     Probability of becoming an observer
σ (sigma)        400                                      Tendency to dance parameter

Program variables given in typewriter font
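For reference in the code sketch below, a Python rendering of these defaults, mirroring the parameterSettings object referenced in this section, might look as follows. The class itself is our illustrative reconstruction, not the authors' released code.

from dataclasses import dataclass, field

@dataclass
class ParameterSettings:
    """Default parameters from Table 1 (Passino & Seeley field calibration)."""
    Jxsize: int = 21
    Jysize: int = 21
    locations: list = field(default_factory=lambda: [
        (8, 11), (14, 6), (2, 20), (16, 0), (15, 19), (16, 18)])
    epsilonq: int = 20        # quorum threshold
    epsilons: int = 15        # dance decay rate
    epsilont: float = 0.2     # quality/dance threshold
    sigma: int = 400          # exploration tendency
    SiteValues: list = field(default_factory=lambda: [0.1, 0.3, 0.35, 0.5, 0.55, 1])
    B: int = 100              # number of scout bees
    maxTicks: int = 64
    mRate: float = 0.1        # nominal mortality rate
    pm: float = 0.25          # probability of becoming an observer

parameterSettings = ParameterSettings()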
Table 2 Important observed or computed values

Symbol    Description
pd        mRate/maxTicks = mortalityRate = probability of death on a single expedition
kji       Step (tick) at which bee i discovered site j
Lij(k)    Dance strength of bee i at step k for site j
Li(k)     Dance strength of bee i at step (tick) k
Lt(k)     Total dance strengths at step k
pe(k)     Probability an observer becomes an explorer

Program variables given in typewriter font
All of the modeled bees are scouts. In both Passino and Seeley [13] and in our implementation, the model assumes that a swarm has been formed and focuses on the actions of the scout bees as they explore for new sites and either come to agreement or not (within the allotted time of 64 ticks, or 32 h) regarding a new site. In our implementation, each bee is an instance of the class scout, and a list of all modeled bees/scouts (alive or dead) is maintained during the run of the model. For present purposes, we call this list Bees. Each run of the model is effected by a call to the main() procedure, which in turn consists largely of a for loop over the range of ticks, [0, 1, . . . , 63], 64 in all by default, range(parameterSettings.maxTicks) in Python. We focus now on the operations occurring in the governing for loop during each tick (a modeled 30-min time period).
1. cullBeesAtRisk(). Scouts foraying from the swarm are at risk of death, determined by parameter values as shown in Table 2. In this step each scout that is either in the 'Committed' or the 'Exploring' state is converted to the 'Cadaver' state with probability pd. This step implements the assumption that committed/dancing bees visit the site they are committed to each tick, thereby making themselves available to be counted as part of a potential quorum for the site in question.
2. goExplore(). Each bee in the 'Exploring' state either has acquired a site during the previous tick (by imitation, and so has a spot value) or has not. For those that have not, randomly assign each a point on the search grid, J, which is a Jxsize×Jysize grid. (See Table 1.) For each (surviving) bee in the 'Exploring' state, record in the visitsCounter at this tick if it visits one of the candidate sites, and determine whether it has found an attractive site and becomes 'Committed' to a candidate site with an initial dance strength or, if not, goes to the 'Resting' state. (Thus, on exiting this procedure, all 'Exploring' bees have either become 'Committed' and are dancing or are in the 'Resting' state.)
3. decayDance(). For each bee in the 'Committed' state, decay its dance strength and compute its new value of Lij(k). (See Table 2 and [13, page 430, right].) For any bee whose Lij(k) value falls below εt, set its dance strength to 0 and its state to 'Resting'.
At this point, all bees that are alive are either 'Committed' or 'Resting'.
4. sleepersAwake(). For each bee in the 'Resting' state, convert it to the 'Observing' state with probability pm. (See Table 1.)
5. exploreOrCommit(). For each bee in the 'Observing' state, calculate pe(k). (See Table 2 and [13, page 432, expression (1)].) With probability pe(k), convert the associated bee to the 'Exploring' state. If there is insufficient dancing ("low enthusiasm"), then pe(k) = 1 or is high, and every (or nearly every) bee is converted from 'Observing' to 'Exploring'.
6. pickSiteCommit(). For each bee now (still) in the 'Observing' state, probabilistically pick one of the sites that is being danced for and target it, that is, acquire the site as the target of its visit during the next tick. (Set its spot value to that of the bee it is imitating.) (See [13, page 433, expression (2)].) Convert the bee to the state 'Exploring'. At this point every bee is either dead, exploring, committed, or resting.
7. quorumCounts(). Check whether a quorum is evident in totalVisits at this tick. If so, break out of the loop, record statistics, and terminate the run.
A condensed sketch of this per-tick loop follows.
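The sketch below compresses the seven steps into a skeletal state machine, for orientation only. The scout methods (explore, decay_dance, imitate_dancer) and the p_explore function standing in for expression (1) of [13] are stubs we assume for illustration; the real implementation tracks considerably more state.

import random

def run_simulation(bees, params, sites):
    """One run of the nest-site selection model (steps 1-7, simplified)."""
    p_death = params.mRate / params.maxTicks                # p_d in Table 2
    for tick in range(params.maxTicks):
        visits = {s: 0 for s in sites}                      # visitsCounter, this tick
        for bee in bees:                                    # 1. cullBeesAtRisk
            if bee.state in ('Committed', 'Exploring') and random.random() < p_death:
                bee.state = 'Cadaver'
        for bee in bees:
            if bee.state == 'Exploring':
                bee.explore(sites, visits)                  # 2. goExplore: commit or rest
            elif bee.state == 'Committed':
                visits[bee.site] += 1                       # committed bees revisit
                bee.decay_dance(params.epsilont)            # 3. decayDance
        for bee in bees:                                    # 4. sleepersAwake
            if bee.state == 'Resting' and random.random() < params.pm:
                bee.state = 'Observing'
        for bee in bees:                                    # 5./6. explore or imitate
            if bee.state == 'Observing':
                if random.random() >= p_explore(bees, params.sigma):
                    bee.imitate_dancer(bees)                # adopt a danced-for site
                bee.state = 'Exploring'
        best = max(visits, key=visits.get)                  # 7. quorumCounts
        if visits[best] >= params.epsilonq:
            return best, tick                               # quorum reached
    return None, params.maxTicks                            # no site chosen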
Note that [13, page 432] sets σ to 4000. This gave us implausible values when we implemented their formulas [13, page 432, expression (1)], values inconsistent with their description. We believe it is a typo of sorts; we found that 400 gave us plausible values, so σ = 400 is the default value for that parameter. All other parameters in our model are, so far as we know, identical with those in [13].
3 Results

Passino and Seeley comment as follows about the foci of their interests with regard to the honeybee site selection model:

    It seemed a priori that all three parameters—quorum threshold [εq], [dance enthusiasm] decay rate [εs], and exploration tendency [σ]—could strongly affect the outcome and the timing of a swarm's decision making. [13, page 428]
We examine all these factors in the present section. In addition, given our interests in social learning and imitation, we explore additional parameters as well, especially pm (pm), the probability of becoming an observer.
3.1 Default Parameter Settings

Implementing our best interpretation of [13] (see Table 1), we obtain, in trials of 100 runs, results that are in broad agreement with those reported by Passino and Seeley. The comparison cannot be terribly precise because they often report only
Fig. 1 Histogram of the chosen sites from 100 runs in the default configuration. Sites chosen: value 1, 41 times; 0.55, 20 times; 0.5, 31 times; 0.35, 6 times; 0.3, 2 times; none, never
Fig. 2 Visits records from 100 runs in the default configuration
results from single, "representative" runs. In this reference trial (see Fig. 1), the hives settled on the best candidate site in 41 of 100 runs and never failed to find a quorum. The second best site, with a value of 0.55, was picked 20 times. The average number of ticks to find the best site, when the best site was chosen, was 7.49. Figure 2 plots the counts of visits to the various sites over the span of the 100 runs. The six sites are numbered from 0 to 5 in increasing order of site value. Thus, site 5 is the best and has a value of 1 in the default case; see Table 1. While noisy, Fig. 2 evidences a clear pattern: better sites tend to receive more visits over 100 runs.
Table 3 Statistical summary of 30 replications of 100 runs of the default settings

        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    45.467   6.872    26.433   0.000   8.015
Std     4.790    1.114    3.803    0.000   0.755
Min     36.000   5.220    18.000   0.000   6.580
25%     42.000   6.011    24.000   0.000   7.490
50%     46.000   6.777    27.000   0.000   7.790
75%     49.000   7.294    28.750   0.000   8.310
Max     54.000   10.023   34.000   0.000   9.970

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
swarm (and the sourcing hive). So this is a complex question, one that is beyond the scope of this focused study. From a third perspective, we can compare the performance of the hive against chance. This, too, is a subtle issue, but given our primary interest in social learning, we can ask what happens if there is no imitation or social learning at all. To model this, we make two parametric modifications. First, we set εs = 0. This eliminates decay in dance strength by a scout, so that once a scout has found a danceable site, it keeps dancing for and visiting the site. Second, we set σ = 1,000,000. This has the effect of virtually eliminating any chance that an observing scout will imitate a dancing scout and adopt its candidate site. Thus, when an uncommitted scout returns to the swarm, either it has found an acceptable candidate or not. In the former case, it dances and maintains its commitment to that site for the duration. In the latter case, the uncommitted scout remains uncommitted and forays again after a rest. Figure 3 plots the visits record from a trial of 100 runs with this setup. It should be compared with Fig. 2. In Fig. 3, the values for each site increase at a more or less constant rate with the tick count. Site 0 is a comparatively poor site, so that a scout visiting it might actually return and not become committed. Site 1, similarly, but less so. The remaining four sites are sufficiently high in quality that every scout visiting them becomes committed, although each scout assesses quality with noise. But in the end, no site is selected under this setup. Without imitation, the swarm fails to settle upon a new nest site, even though its scouts are finding all of the candidates.
3.2 Default Parameter Settings: Random Site Values

The default site values, [0.1, 0.3, 0.35, 0.5, 0.55, 1], are given in Table 1. They perhaps make it easy on the hive to find the best site, since its value, 1, is so much larger than that of the second best site, 0.55. Also, the worst site, at 0.1, is unlikely to meet
Fig. 3 Visits records from a trial of 100 runs in the default configuration, but with no imitation (εs = 0, σ = 1,000,000)
Fig. 4 Histogram of the chosen sites from a trial of 100 runs in the default configuration, but with random site values drawn anew for each run. Sites chosen: best, 44 times; second best, 22 times; and none, 1 time. Average number of ticks to find the best site (when the best site was chosen): 8.64. Average number of ticks overall: 9.32
the quality/dance threshold of 0.2, epsilont in Table 1. This serves to eliminate it quickly, in most cases, from consideration by the hive. In search of more realism, we repeated the experiment by doing 100 runs with the default parameter settings, except that the site values were determined by drawing random values uniformly from the [0, 1] interval. In this experiment, the hive chose the best site available to it in 44 of the 100 cases, the second best in 22 instances, and no site in only 1 instance. Thus, the overall performance matches that of the default case, which is seemingly easier for the hive. Figure 4 is the analog of Fig. 1 but has a rather different interpretation. In the runs underlying Fig. 1, the six site values were constant: [0.1, 0.3, 0.35, 0.5, 0.55, 1], given in Table 1 and used by Passino and Seeley. Thus, the results could in principle be binned into 6 categories. The runs underlying Fig. 4 each had randomly-drawn site values. These were, however, ranked in increasing order: site 5 being the best, site 0 the worst. Thus, in principle there could be 6 × 100 bins of distinct site values. Figure 4 imposes 12 bins on these data. The figure indicates that, generally, the hives were able to select higher-valued candidate sites.
Fig. 5 Visits records from 100 runs in the default configuration, but with randomly-drawn site values for each run
Table 4 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values

        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    38.633   7.664    29.167   0.600   7.770
Std     4.359    1.375    4.496    0.894   0.925
Min     31.000   5.590    19.000   0.000   5.970
25%     36.000   6.521    27.000   0.000   7.168
50%     38.500   7.381    30.000   0.000   7.715
75%     41.500   8.629    31.750   1.000   8.190
Max     48.000   11.091   39.000   3.000   10.330

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
Comparing Fig. 5 with Fig. 2, we see a similar but noisier pattern, in which the best and second best sites are distinctly more visited than the others. Table 4 corresponds to Table 3, but with random site values chosen for each constituent run. There is perhaps some slight degradation in performance, but if so, it is small. Here, the swarm found the best available site (regardless of its absolute score) nearly 39% of the time on average, the second best 29% of the time, and failed to settle only 0.6% of the time.
3.3 Varying the Quorum Requirement

The default quorum value, εq, is 20, which was determined by Seeley and collaborators through observation. Table 5 summarizes the results with εq = 15. Comparing these results to those in Table 4, there is little or no noticeable difference.
Table 5 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and εq = 15

        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    38.367   7.671    28.867   0.233   7.423
Std     4.131    1.358    3.937    0.430   0.776
Min     29.000   5.244    23.000   0.000   6.030
25%     35.250   7.007    25.250   0.000   7.110
50%     39.000   7.488    29.000   0.000   7.310
75%     41.000   8.079    32.000   0.000   7.730
Max     45.000   11.619   37.000   1.000   9.700

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs

Table 6 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and εq = 50
        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    42.600   10.753   30.200   0.833   11.143
Std     4.553    1.255    3.755    0.950   1.081
Min     32.000   8.316    23.000   0.000   9.470
25%     39.000   9.958    27.000   0.000   10.360
50%     43.500   10.656   30.000   1.000   11.095
75%     45.750   11.633   33.000   1.000   11.780
Max     53.000   13.841   36.000   4.000   14.200

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
We now consider the effects of increasing the required quorum size to 50. Table 6 summarizes the results of 30 repetitions of 100-run trials. Surprisingly, the results are not much different from those in Table 5. Table 7 summarizes results for εq = 70. Surprisingly, at least to us, there is seemingly little degradation in the performance of the swarm. The chance of finding either the best or second best site remains at about 72%, while the probability on average of not settling on a site rises to 5.1%. What does change significantly is the average length of a run (column E), which goes from 7.4 ticks when εq = 15 to 17.8 when εq = 70. We suspect that the 5.1% failure rate is comparatively highly disadvantageous, and that, given the exposure and energy use of the swarm, the lengthening of the average run to 17.8 ticks is also a large loss over evolutionary time. Finally, Table 8 summarizes results for εq = 85. Unsurprisingly, the numbers would seem to be catastrophic over evolutionary time, with a failure rate in excess of 56%. When εq = 90, the results are much worse than this. These findings
Table 7 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and εq = 70

        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    44.533   15.647   28.400   5.100   17.832
Std     5.412    1.519    4.889    1.989   1.405
Min     35.000   12.119   18.000   0.000   14.860
25%     40.250   14.757   26.250   4.000   16.742
50%     45.000   15.724   28.000   5.000   17.870
75%     48.750   16.185   30.750   6.750   19.015
Max     54.000   19.154   42.000   9.000   20.230

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs

Table 8 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and εq = 85
        A        B        C        D        E
Count   30.000   30.000   30.000   30.000   30.000
Mean    25.267   15.768   12.533   56.633   42.031
Std     4.051    1.852    3.928    4.460    2.311
Min     17.000   12.800   7.000    47.000   37.660
25%     23.000   14.558   9.000    53.250   40.808
50%     25.000   15.456   12.000   56.500   41.785
75%     27.750   16.730   16.000   59.750   43.692
Max     34.000   19.381   19.000   66.000   47.500

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
are perhaps not relevant biologically, but they do serve to help validate the model and our implementation of it.
3.4 Varying the Number of Scouts

The default value for the number of scouts, B, is 100. It was determined by Seeley and collaborators through observation. With B = 75 instead of 100, we obtain the results shown in Table 9, to be compared with Table 4. Table 10 summarizes results when the number of scout bees is set to 150, compared to the default value of 100. We see no discernible improvement in the success of finding nest sites, but strong improvements in the speed of doing so.
Table 9 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and B = 75

        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    40.167   11.652   29.167   1.367   11.803
Std     3.779    1.427    4.457    1.066   1.025
Min     34.000   8.867    21.000   0.000   9.750
25%     36.250   10.395   27.000   1.000   11.168
50%     41.000   11.577   30.000   1.000   11.820
75%     43.000   12.638   31.750   2.000   12.518
Max     46.000   14.432   40.000   4.000   13.790

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs

Table 10 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and B = 150
        A        B       C        D       E
Count   30.000   30.000  30.000   30.000  30.000
Mean    38.933   4.525   28.433   0.100   4.568
Std     5.723    0.704   4.446    0.403   0.355
Min     29.000   3.300   19.000   0.000   3.970
25%     35.250   4.006   25.000   0.000   4.338
50%     39.000   4.479   28.000   0.000   4.510
75%     42.000   5.055   31.750   0.000   4.908
Max     55.000   6.237   38.000   2.000   5.220

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
3.5 Dance Decay Rate

The dance decay rate parameter, εs (epsilons), has a default value of 15, from Seeley's calibrations. Setting it to 5 yields results quite similar to the default results shown in Table 4. At 50, it yields the data summarized in Table 11. Success rates (chances of finding the best or second best site) are slightly higher than in the default configuration, about 75% compared to about 68%, but with an increase in Nones (failures, column D) and a nearly 50% increase in the time (ticks) taken to reach a resolution.
3.6 Probability of Becoming an Observer, pm (pm)

The default value of pm, the probability in a given tick that a resting bee awakens and begins observing the dances, is 0.25. Table 12 summarizes results with pm = 0.15.
Table 11 Statistical summary of 30 replications of 100 runs of the default settings, but with random nest-site values and εs = 50

        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    46.467   11.677   29.833   2.133   11.684
Std     4.637    1.936    3.833    1.306   1.254
Min     37.000   7.490    22.000   0.000   8.940
25%     42.500   10.668   27.000   1.000   10.832
50%     46.500   11.426   29.000   2.000   11.775
75%     49.000   13.078   32.000   3.000   12.610
Max     55.000   16.370   37.000   6.000   14.760

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs

Table 12 Statistical summary of 30 replications of trials of 100 runs of the default settings, but with random nest-site values and pm = 0.15
        A        B        C        D       E
Count   30.000   30.000   30.000   30.000  30.000
Mean    43.467   12.896   28.700   2.500   13.522
Std     5.296    1.637    4.419    1.570   1.578
Min     32.000   10.250   19.000   0.000   11.130
25%     41.000   11.962   25.250   1.250   12.275
50%     43.500   12.853   28.500   2.000   13.170
75%     46.750   13.538   30.750   3.000   14.853
Max     58.000   18.204   37.000   7.000   17.490

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
The results are quite similar to the default results shown in Table 4. The main difference here is in the time needed to resolve the search (columns B and E), which is increased. More rest for the bees leads to longer times to select nest sites, but no evident change in the quality of choice. When pm increases to 0.5 (Table 13), there may be a slight decrease in the success rate, along with a marked decrease in the time taken for resolution.
3.7 Search Area Size

The bees search randomly (in the model) in a Jxsize×Jysize grid, with Jxsize = Jysize = 21 by default. Table 14 presents summary results for performance on a 41 × 41 grid. Performance degrades substantially compared to the default 21 × 21 configuration. What is perhaps surprising is that the success rate (rate of selecting the best or second best candidate) is only modestly diminished.
Table 13 Statistical summary of 30 replications of trials of 100 runs of the default settings, but with random nest-site values and pm = 0.5

        A        B       C        D       E
Count   30.000   30.000  30.000   30.000  30.000
Mean    35.533   4.240   28.333   0.100   4.298
Std     4.183    0.613   4.097    0.305   0.368
Min     28.000   2.853   20.000   0.000   3.670
25%     33.250   3.897   26.000   0.000   4.090
50%     35.500   4.178   28.000   0.000   4.270
75%     38.750   4.524   31.000   0.000   4.408
Max     43.000   5.794   37.000   1.000   5.270

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs

Table 14 Statistical summary of 30 replications of trials of 100 runs of the default settings, but with random nest-site values and Jxsize = Jysize = 41
        A        B        C        D        E
Count   30.000   30.000   30.000   30.000   30.000
Mean    32.833   19.963   28.033   9.833    23.455
Std     4.907    2.682    6.408    2.755    1.787
Min     22.000   14.816   15.000   4.000    18.840
25%     29.250   18.019   24.000   7.250    22.690
50%     32.500   19.396   28.000   10.000   23.650
75%     36.750   21.758   31.000   12.000   24.475
Max     41.000   27.636   45.000   14.000   26.400

A best candidate site chosen in 100 runs, B ticks to choose the best site, when it is picked, C second best candidate chosen in 100 runs, D no site chosen in 100 runs, E ticks per trial of 100 runs
Much stronger effects are seen in the increase in failures (Nones in column D) and in the time needed to reach resolution (columns B and E).
4 Social Learning and Cultural Evolution

We now conclude with a few more speculative remarks pertaining to social learning and to cultural evolution. On the one hand, it is clear that the bees are learning, and learning from each other, so social learning is definitely present. On the other hand, it is hard to see cultural evolution in bee nest-site selection. As noted above, site selection is an example of a population making a decision, but calling it democracy is misleading and, we submit, a conceptual error. The individual bees are not deliberating and then crowdsourcing the decision. Rather, the individual bees play their roles in a distributed, population-based decision process in which there is no individual decision maker. We will argue below that certain aspects of cultural
evolution use mechanisms similar to what we see with the bees, but do so to arrive at cultural and evolving decisions. In developing this line of thought, we first need to understand and model, at a more abstract level, what it is the bees are doing. To this end, we pursue a suggestion made by Seeley in his book [17, chapter 9]. The particular mechanism we describe here is distinct from Seeley's, although there are affinities. In many ways, honeybee nest-site selection resembles Leaky, Competitive Accumulators (LCA), e.g., [2, 18]. Figure 6 illustrates the basic ingredients of an LCA.

Fig. 6 Schematic of a leaky, competitive accumulator (LCA), after [2]

Annotating the figure:
• Ii: input unit activity
• c: random noise
• yi: accumulators (with values)
• w: mutual inhibition rate parameter
• k: decay rate
• W: a Wiener process

dy1 = (−ky1 − wy2 + I1) dt + c1 dW1    (1)
dy2 = (−ky2 − wy1 + I2) dt + c2 dW2    (2)
y1(0) = y2(0) = 0    (3)
In addition, there is an exogenous decision rule (e.g., a threshold, or a quorum, as with the bees) that, when it fires, occasions the decision. All of this, of course, generalizes to N nodes. The model, represented in Fig. 6, operates as follows. Sensors—I1 and I2—detect (with noise) various features and send signals to their corresponding accumulators—y1 and y2—which accumulate the observations. Each accumulator leaks at a given rate, k, and inhibits other accumulators via the inhibition strength parameter, w. The process continues until a decision condition is reached, e.g.,
one accumulator reaches a threshold (quorum for the bees), or an exogenous event occurs that forces a decision, normally choice of the accumulator with the highest value. Except for the culminating decision, Eqs. (1)–(3) formalize this process: change in the accumulator values (dy1, dy2) is a function of the decay rate, the values of the accumulators, the input levels, and scaling factors. Points arising:
• LCA is a member of the broader class of stochastic information accumulator models.
• LCA was developed as an explanatory model for simple learning behavior, at which it does well.
• Arguably, LCA has a more or less direct neuronal basis.
• Input nodes, I, can be thought of as noisy sensors. The ys accumulate the sensor activity, and the system makes a choice based on a threshold or a deadline.
• The deep point is that brains cannot be centralized decision makers. They have to be distributed, population-based decision makers, and the bees are also population-based, distributed decision makers. Can similar models work to describe them both? Can we abstract an algorithm (family) that describes them both?

How can this be interpreted with bees? We suggest the following mapping.
• Input units, Is: Scouts;
• Accumulators, ys: Emerging in the population of scouts (those that dance for or inspect a candidate site);
• Exogenous decision rule: Quorum detection;
• Leaking, decay: Attenuation of dance strength over time;
• Inhibition: This is indirect; recruiting a bee for a candidate site prevents it from being recruited to a different site.

There are, thus, deep similarities between nest-site selection and LCA. Note as well that both processes are (heuristic) optimization procedures. Moreover, bees engage in social learning: they imitate other bees; they have an ability to switch affiliation from one candidate site to another; accumulators form virtually, dynamically, consisting of the collections of bees dancing for a given site. Finally, the dancing evidences displacement [5], communicating about entities not immediately present. The bees indeed are engaged in sophisticated social deliberations, and there are genuine differences with simple LCAs as well. Scouts, as input units, engage in search and evaluation, but they are active, not passive. Scouts themselves engage in social learning, acquiring information from the neighborhood as well as from other scouts. Can this more sophisticated behavior credibly be modeled with LCA? We think so, by adding exogenous factors to LCA, leading to social LCA (S-LCA) models of learning. Abstracting, the general pattern for the bees is as follows.
• Accumulators
  – At nest sites.
  – On the surface of the hive.
• Agents and roles
  – Scouting/exploring
  – Evaluating
  – Dancing/reporting
  – Observing
  – Piping
That seems clear enough, but can this pattern fit cultural evolution? Bees do not have cultural evolution, after all, because they do not even have culture. The bees are actually making a decision about where to live, and decision-making is not the same as cultural evolution. Hive hunting does not count as cultural evolution:
• There is no ongoing "accretion" of cultural artifacts, material or conceptual.
• For the bees, the accumulators are used simply for decision procedures (and, perhaps, optimization).
• Culture is an accretion, relying on shared memory. In fact, cultural features have persistence [4, 12].
• The bees lose accumulators when they are no longer necessary for decision-making (that is, nest-site selection).
• Humans can save accumulators in the culture and keep them beyond the lifetime of any individual.

So we need to adapt, or at least frame, the S-LCA learning model/algorithm with cultural evolution in mind. To illustrate culture and accumulators, consider Fig. 7.¹ Figure 7 shows the ratio of occurrences of clarity to the total of clarity + clearness from 1500 to 2000 in British English. At first, clarity dominates, but suddenly, around 1650, the frequency of clarity plummets, virtually disappearing until about 1900, a period of 250 years. It is interesting to note that 1650 coincides with the English Civil War; we speculate that the change in frequency of clarity coincides with a cultural change that deprecated clarity as compared to clearness: clarity has the Romance suffix -ity, which was deprecated as foreign. But the accumulator for clarity clearly persisted, allowing clarity to return to fashion around 1900. This latter fact may be due to the frequency of electricity in the popular vocabulary (electrification of Britain was much discussed), which shares the -ity suffix with clarity, as well as the rise of scientific psychology, where clarity might be viewed as a kind of mental force, perhaps on analogy with electricity. These conjectures, of course, remain to be tested.
¹ Thanks to Tony Kroch for directing our attention to this example; he is, of course, not responsible for our interpretation.
Fig. 7 Ratio of “clarity” to “clarity” + “clearness,” 1500–2000 in British English
We can connect accumulators and culture by viewing accumulators as a form of social memory, allowing for the growth (and loss) of cultural artifacts. Thus, we see S-LCA as a schema for a mechanism of social and cultural evolution:
• Individuals from sub-communities of the larger society organize (via stigmergy?) to solve problems; e.g., a new technology arises and names need to be found, as in electronic mail.
• Some individuals explore/innovate.
• Some individuals observe and either imitate or go to explore, changing the accumulator balances over time.
• Eventually, at least one norm is established. (A variety of exogenous decision rules are possible: conformity, prestige, quorum, forced decision, group identity, and so on.)
• Reward may be distributed to the community members (perhaps not equally).
• Possibly, there is adoption by the larger society.

To illustrate, we can see cases in which words are treated as accumulators.
1. text messaging versus texting. Both refer to the activity of composing and sending a written message via telephone. Currently, a search on text messaging yields 13.5 million pages, while texting yields 125 million pages. Criteria for social selection are probably a combination of: (a) intelligibility; (b) economy; (c) frequency. Initially, text messaging was most frequent, presumably because of its intelligibility. texting, although less immediately intelligible, could piggyback on text messaging and so came to accumulate instances. Clearly, texting is more economical than text messaging. In consequence, it eventually achieved the highest frequency and now dominates.
2. Gaps in morphological paradigms can be filled when new niches are created. Friend on Facebook is a neologism (sort of). The meaning of become friends is otherwise expressed by befriend. So the verb form friend was blocked by befriend, which was higher in frequency. The rise of Facebook created a new social relationship, "Facebook friend" (as opposed to a real friend). The new meaning is captured by the accumulator for friend, which is recruited for the purpose. And so friend comes to mean a relationship on Facebook.
3. Conflicting criteria. French has three competing terms for electronic mail: email, courrier électronique, and courriel. email is an English borrowing, seen as franglais and so to be deprecated; courrier électronique is high French, but not economical in comparison with its competitors; courriel is a genuine French word, but—hélas—French-Canadian. There is no clear winner yet; email is still very frequent, but courriel has many adherents (despite its French-Canadian origins).

The reader is invited to supply their own examples (and send the good ones to us), which doubtless abound.
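As a toy illustration of words-as-accumulators (our construction, not a model from the sources cited above), one can let each competing form accumulate uses through imitation weighted by economy, with the persisting counts standing in for cultural memory:

```python
# Toy words-as-accumulators sketch: competing forms accumulate uses via
# imitation weighted by economy. All parameters are illustrative assumptions.
import random

def compete(forms, economy, n_speakers=1000, steps=50, noise=0.05, seed=1):
    random.seed(seed)
    counts = {f: 1.0 for f in forms}          # accumulators persist across steps
    for _ in range(steps):
        for _ in range(n_speakers):
            # choice weight ~ current frequency x economy, plus a noise floor
            weights = [counts[f] * economy[f] + noise for f in forms]
            r, acc = random.random() * sum(weights), 0.0
            for form, w in zip(forms, weights):
                acc += w
                if r <= acc:
                    counts[form] += 1
                    break
    return counts

# 'texting' is assumed more economical than 'text messaging'
print(compete(["text messaging", "texting"],
              {"text messaging": 0.8, "texting": 1.0}))
```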
5 Discussion and Conclusion
We independently implemented and studied the nest-site selection model described in [13], focusing on the default parameter values they obtained by field calibration. We have systematically explored the behavior of the nest-site selection model in response to individual changes to all the important parameters of the model around their default values (Table 1). In doing so, we have obtained some findings that, while not inconsistent with [13], extend what they report. We have also eschewed, for most of our analyses, the fixed, probably atypical, distribution of site values used by [13]; instead, we generate random values with each run. We find the model to be highly robust in the neighborhood of the default parameter values, although we have not reported interaction effects. Jxsize and Jysize (which we vary together) are an exception: quadrupling the size of the search area from 21 × 21 to 41 × 41 noticeably degrades the nest-site selection process. Across changes to each of the other parameters, the main effects concentrate less on success—defined as selecting the best or second-best nest site—and much more on the risk of failing to select any site and on the time taken to resolve the search. In many cases, we have examined parameter value changes larger than the Jxsize and Jysize move to 41 each. Throughout these parameter changes, the chance of success, as just defined, is robustly in the neighborhood of 70%. This suggests the possibility that natural selection may be acting more on time to decision and on the probability of avoiding failure to select any nest than on putting a premium on finding the best nest site. Is there something about the nest-site selection process as an algorithm that fits especially well with such values (robust on reasonably good, quick, avoiding failure)? These are large and important questions, to be addressed in future research.
References
1. Blum, C., & Roli, A. (2003). Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Computing Surveys, 35(3), 268–308.
2. Bogacz, R., Usher, M., Zhang, J., & McClelland, J. L. (2007). Extending a biologically inspired model of choice: Multi-alternatives, nonlinearity and value-based multidimensional choice. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1485), 1655–1670.
3. Boyd, R., & Richerson, P. J. (1988). Culture and the evolutionary process. Chicago: University of Chicago Press.
4. Cohen, D., Nisbett, R. E., Bowdle, B. F., & Schwarz, N. (1996). Insult, aggression, and the southern culture of honor: An 'experimental ethnography'. Journal of Personality and Social Psychology, 70(5), 945–960.
5. Hockett, C. F. (1960). The origin of speech. Scientific American, 203(3), 88–97.
6. Hoppitt, W., & Laland, K. N. (2013). Social learning: An introduction to mechanisms, methods, and models. Princeton: Princeton University Press.
7. Kimbrough, S. O. (2012). Agents, games, and evolution: Strategies at work and play. Boca Raton: CRC Press.
8. Kimbrough, S. O., & Lau, H. C. (2016). Business analytics for decision making. Boca Raton: CRC Press.
9. Laland, K. N. (2017). Darwin's unfinished symphony: How culture made the human mind. Princeton: Princeton University Press.
10. Mesoudi, A. (2011). Cultural evolution: How Darwinian theory can explain human culture & synthesize the social sciences. Chicago: University of Chicago Press.
11. Nikolaev, A. G., & Jacobson, S. H. (2010). Simulated annealing. In M. Gendreau & J.-Y. Potvin (Eds.), Handbook of metaheuristics. International Series in Operations Research & Management Science (2nd ed., Vol. 146, pp. 1–39). New York: Springer.
12. Nunn, N., & Wantchekon, L. (2011). The slave trade and the origins of mistrust in Africa. American Economic Review, 101(7), 3221–3252.
13. Passino, K. M., & Seeley, T. D. (2006). Modeling and analysis of nest-site selection by honeybee swarms. Behavioral Ecology and Sociobiology, 59(3), 427–442.
14. Rendell, L., Boyd, R., Cownden, D., Enquist, M., Eriksson, K., Feldman, M. W., et al. (2010). Why copy others? Insights from the social learning strategies tournament. Science, 328, 208–213.
15. Ribeiro, C., & Hansen, P. (Eds.) (2001). Essays and surveys in metaheuristics. Boston: Kluwer.
16. Richerson, P. J., & Boyd, R. (2005). Not by genes alone: How culture transformed human evolution. Chicago: The University of Chicago Press.
17. Seeley, T. (2010). Honeybee democracy. Princeton: Princeton University Press.
18. Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky, competing accumulator model. Psychological Review, 108(3), 550–592.
19. Voß, S. (2001). Metaheuristics: The state of the art. In A. Nareyek (Ed.), Local search for planning and scheduling. Lecture Notes in Computer Science (Vol. 2148, pp. 1–23). Heidelberg: Springer.
20. Voudouris, C., & Tsang, E. (2003). Guided local search. In F. Glover & G. Kochenberger (Eds.), Handbook of metaheuristics (pp. 185–218). Boston: Kluwer.
Understanding the Impact of Farmer Autonomy on Transportation Collaboration Using Agent-Based Modeling

Andrew J. Collins and Caroline C. Krejci
Abstract Food from small-scale farms has seen a recent resurgence in demand as consumers have become increasingly aware of the benefits of buying from regional food supply chains (RFSCs). However, these farmers still face pressure to reduce their costs and remain financially solvent. One potential solution for farmers is to use collaborative transportation methods to reduce costs. However, shared transportation means a reduction in the autonomy of the farmers, something they highly prize. To investigate this trade-off, an agent-based model (ABM) of farmers forming coalitions was created. This ABM includes cooperative game theory concepts to enable the farmers (agents) to form coalitions strategically. As expected, the model finds that the farmers do not form coalitions when their preference for autonomy is high, or the impact of distance between farmers is too great. These results represent a proof-of-concept for using agent-based modeling to investigate this problem.

Keywords Agent-based modeling · Regional food systems · Cooperative game theory
1 Introduction
The modern industrial food supply system faces many serious challenges concerning long-term sustainability and resilience. This system is characterized by large-scale and highly centralized production methods that rely on energy-intensive inputs and long-distance transport and yield environmentally toxic outputs. One response to these challenges is regional food supply chains (RFSCs), in which small-scale and
mid-sized farms provide food to geographically proximal consumers, often using environmentally sustainable production practices. It has been estimated that $8.7 billion of sales came from food produced and sold regionally by farms in the USA [1]. Consumers are motivated to buy regional food for various reasons, including supporting the local economy, the perception that it is fresher and more nutritious than food from outside the region, its perceived environmental friendliness, and lower prices [2–4]. Small-scale farms, which account for 85% of all U.S. regional food producers, benefit from this increased demand [5]. Federal, state, and local policymakers have created programs that support small farmers and rural economies [6].

However, RFSCs face many challenges. Small-scale and mid-sized farmers often struggle to remain financially solvent. A major contributor to this struggle is logistics activities, which can significantly erode farm profits. The cost of transporting food from rural and geographically dispersed farm locations to distant urban demand centers, especially when refrigerated/frozen goods are involved, can be cost-prohibitive [7]. To help manage transportation costs for small-scale and mid-sized farmers, collaborative transportation is strongly recommended [8]. By sharing transportation, costs are pooled among multiple farmers, allowing them to split the cost of leasing, purchasing, and/or maintaining a vehicle, as well as distributing fuel costs. For example, one study determined that coordinated RFSC transportation has the potential to reduce the number of routes, driving distance, and total transportation time by 68%, 50%, and 48%, respectively [9].

However, collaborative transportation is often very challenging for farmers to adopt in practice. Farmers must balance the potential benefits of collaboration with the additional logistics costs, which include the time, effort, and expenses involved with managing the coordination. In some cases, entirely new systems must be implemented to support the collaboration. In particular, traceability systems (e.g., farm-level labeling) become necessary to enable each product's origins to be identified [3]. Furthermore, while transportation collaboration can reduce farmers' costs, it can potentially increase their business risk, and it can also slow decision-making, since decisions are made by a group rather than by individuals [10]. Additionally, information sharing among collaborating farmers is necessary for efficient coordination but can be challenging to implement, given that they may be competitors [11].

When farmers collaborate, they lose some of their autonomy. Farmers value autonomy highly, so this loss is important [12]; indeed, farmers have been shown to be willing to sacrifice income to maintain autonomy [13, 14]. As a result of this desire for autonomy, any collaboration that significantly reduces a farmer's autonomy will be viewed negatively by farmers [15, 16].

In this research, we investigate the impact of reduced autonomy on farmers' transportation collaboration decisions using agent-based modeling. In our model, the farmers can strategically form coalitions with other farmers based on a utility function that incorporates both the benefit of collaboration and the desire for autonomy. To allow the farmers to form coalitions strategically, an algorithm based on cooperative game theory is used. The following sections provide a brief review
of the literature, a description of the model, preliminary experiments and results, and conclusions.
2 Literature Review
Horizontal collaboration occurs between organizations that are in different supply chains but operate at the same supply chain echelon. The collaborating organizations cluster their logistics activities and assets to more efficiently utilize infrastructure, reduce costs, and reduce environmental load [17, 18]. Most horizontal collaboration efforts focus on transportation management [19] and include shared consolidation centers, joint trucking routes, and optimization of the entire transportation network across multiple competing supply chains for maximum transportation efficiency [20]. Effective transportation collaboration can result in fewer empty truck runs and can reduce the total distance traveled, thereby reducing fuel consumption and greenhouse gas emissions [21]. However, implementing horizontal transportation collaboration is typically very challenging, due to technological barriers and insufficient trust between coordinating partners, particularly because "coopetition"—i.e., cooperation with competitors—is often necessary [22, 23]. Other major barriers include finding suitable collaboration partners and estimating and allocating the costs and benefits of collaboration [24].

The problem of developing routes that minimize transportation costs via horizontal coordination has been studied using traditional optimization techniques (see [25] for a review), often using vehicle routing models that have an objective of minimizing cost and/or greenhouse gas emissions. However, system-wide cost reductions from collaboration do not ensure that the collaboration will remain intact. For example, if an individual member of the collaboration believes that the allocation of costs/savings is unfair, or that their own benefit does not outweigh the costs associated with collaboration, the individual will choose not to participate, or may join and then later leave. Straightforward approaches to allocation (e.g., allocating proportionally) do not guarantee an equitable distribution of collaboration benefits [26]. Thus, in addition to determining a transportation strategy that minimizes overall logistics costs, it is also critical to determine an allocation strategy that keeps an optimal coalition intact [27].

To address this issue, some models have applied cooperative game theory techniques to model coalition formation and optimal cost/savings allocation for logistics cooperation and collaboration. Cooperative (or coalition) game theory is the standard approach for modeling strategic situations with more than two decision makers [28]. Its focus is on which coalitions form among the players, that is, which players cooperate with each other. A critical concept in cooperative game theory is how rewards are divided among the members of a coalition. An "imputation" is a split of the reward that is individually rational for all players. Each game can have more than one imputation. When comparing imputations for a given game, some subgroups might do better than others, in
terms of reward. Thus, an imputation dominates another imputation if there exists a subgroup that would do better in the dominating imputation. An imputation that is not dominated by any other imputation is a member of the core [29], a solution concept for cooperative game theory popularized by Lloyd Shapley in the 1960s [30]. A simpler, informal way to describe a member of the core is as a partition of the agents such that no subgroup of agents has an incentive to form (and leave their current subgroups).

Fiestras-Janeiro et al. [31] review existing models that apply cooperative game theory techniques to optimize collaboration efforts across multiple autonomous but interdependent supply chain actors. These models are used to study coalition formation and to optimize costs/savings allocation for a variety of logistics activities, including collaborative ordering, inventory pooling and shared warehousing, knowledge and information sharing, and collaborative production planning. Most of these models explore ways of creating incentives that encourage sustaining the grand coalition (i.e., the coalition in which all potential members have joined). Lozano et al. [32] present a model that focuses specifically on horizontal transportation collaboration. In their model, the coalition's transportation costs are reduced via more efficient vehicle use, i.e., consolidation of loads in larger vehicles and a balanced flow of vehicles between locations to minimize empty backhauls. They develop a mixed-integer linear program to determine the size and structure of the collaborating coalition that will provide the greatest overall cost savings, and then apply several different cooperative game theory methods to determine how best to allocate these cost savings to each coalition member to ensure that the optimal coalition remains intact.

Agent-based modeling (ABM) has also been used to study supply chain collaboration. For example, Arvitrida et al. [33] describe an ABM that was developed to study the impacts of competition and collaboration on supply chain performance, in terms of responsiveness and efficiency. Serrano-Hernandez et al. [34] describe an ABM of companies seeking to reduce their costs and greenhouse gas emissions through horizontal transportation collaboration. The model includes a heuristic to solve the vehicle routing problem for different cooperative scenarios, which provides the companies with sufficient information about the benefits of collaboration (i.e., reduction in distance traveled) to inform their decisions about joining the coalition. The model also uses cooperative game theory to allocate savings optimally among coalition members. Utomo et al. [35] review the literature on ABMs of agri-food supply chains and determine that there is a need for more research using ABM to investigate cooperation, competition, and collaboration in agri-food supply networks. The existing literature in this area is sparse: Krejci and Beamon [36] describe an ABM of farm-level horizontal collaboration, in which farmers pool their resources and outputs for greater efficiency and scale, and Boero [37] describes an ABM of collaborative inventory pooling among small-scale farmers that enables them to market their products to large mainline distributors.

This paper describes an agent-based model of a theoretical regional food supply system, in which small-scale farmers are represented by autonomous agents.
In each time-step, the farmer agents decide whether to form coalitions for coordinated food transport by comparing the value of the coordination with the farmer’s valuation of
his/her autonomy. The purpose of this model is to demonstrate a proof-of-concept of our approach; our intention is to expand the model for application to real-world scenarios.
3 Model
The game comprises "n" farmers who can form coalitions with each other, using a heuristic algorithm described later in this paper. The size of these coalitions can vary, with the largest being the grand coalition N = {1, 2, . . . , n} and the smallest being the singleton coalition for farmer "i," which is represented as {i}.
3.1 Utility Equation
The utility equation for each farmer agent consists of three components that represent three distinct aspects of the farmer's potential utility. The first aspect is the farmer's desire to remain autonomous and their dislike of large groups. The second aspect is the farmer's desire to maximize profit. The third aspect considers the negative impacts of geographic distance on a coalition's ability to function, including increased transportation costs and logistical planning requirements. These three aspects have been combined into a single utility function for a given farmer "i" in coalition "Ci":

V(i) = λ · (1 / |Ci|) + (1 − λ) · [ (1 − μ) · ( (Σj∈Ci pj − 1) / |Ci| ) − μ · max j,k∈Ci L∞(j, k) ]

λ ∈ [0, 1] is the weighting on the farmer's desire to be autonomous (aspect 1). Autonomy is represented by the inverse of the number of agents in a farmer's coalition, with larger coalitions yielding a lower utility value. Each agent produces goods worth p ∈ [1, 2], the value of which is drawn randomly from a uniform distribution, pooled in the coalition, and split evenly among the coalition members (aspect 2). A single fixed transportation cost is assumed, which is normalized to one. This fixed cost is divided equally among the farmers, thereby encouraging them to form coalitions. In reality, transportation costs would not be fixed, but we appeal to parsimony for this first iteration of the model's development. Finally, the negative impact of distance on a coalition is represented by the maximum norm, which is:

L∞(j, k) = max( |xj − xk| , |yj − yk| )
The negative impact of distance is weighted, relative to profit, using μ ∈ [0, 1]. Note that the distance weight affects only half of the utility equation and has no impact on the autonomy component.
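A direct transcription of the utility function into Python (variable names and the data layout are our assumptions; the paper's implementation is a NetLogo model):

```python
# Sketch of the farmer utility V(i); an illustrative transcription, not the
# authors' NetLogo code. A coalition is a list of (p, x, y) farmer tuples.
from itertools import combinations

def utility(i, coalition, lam, mu):
    """V(i) for farmer i (an index into `coalition`)."""
    size = len(coalition)
    # Aspect 2: pooled production minus the unit transport cost, split evenly.
    pooled = (sum(p for p, _, _ in coalition) - 1.0) / size
    # Aspect 3: maximum L-infinity distance between any two members.
    spread = max((max(abs(a[1] - b[1]), abs(a[2] - b[2]))
                  for a, b in combinations(coalition, 2)), default=0.0)
    # Aspect 1 (autonomy) weighted against profit and distance.
    return lam * (1.0 / size) + (1.0 - lam) * ((1.0 - mu) * pooled - mu * spread)

# Singleton check: reduces to lam + (1-lam)*(1-mu)*(p-1), as in Sect. 3.2.
print(utility(0, [(1.6, 0.0, 0.0)], lam=0.1, mu=0.1))  # 0.1 + 0.9*0.9*0.6 = 0.586
```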
3.2 Implications of Utility Function
The utility function affects the farmer's decision to join or leave a coalition. Since the utility equation is quite complex, it is worth exploring its workings by considering the extremities and the singleton case. The extremities occur when the weights, λ and μ, are given their maximum and minimum values. A summary of this impact is given in Table 1. Table 1 demonstrates that when λ has a high value, there is an expectation that farmers will form only singleton coalitions, since λ represents the weighting on autonomy. It is interesting to note that when μ has a high value, there is also a tendency for the farmers to form singleton groups. Thus, interesting results from this model are likely to occur for low values of the weights. If the farmer belongs to the singleton set "{i}," then the utility equation reduces to:

V(i) = λ + (1 − λ)(1 − μ)(pi − 1)

For low values of the weights, the utility value for the singleton case is close to the farmer's production value net of the transportation cost.
3.3 Coalition Formation Algorithm
The simulation models use a coalition formation algorithm that emulates a cooperative game theory solution concept. The algorithm was originally developed by Collins and Frydenlund [38], and improved upon by Vernon-Bido and Collins [39]. The algorithm creates suggested coalitions for the agents (farmers); the affected agents, in these coalitions, decide whether to form the coalitions.
Table 1 Analysis of weighting extremes on model output
• λ = 0, μ = 0: Farmers only consider profit; autonomy and distance are ignored.
• λ = 0, μ = 1: Farmers are only concerned with the maximum distance of their coalition. As a result, only singletons form.
• λ = 1, μ = 0: Farmers desire to be on their own, and only singletons form.
• λ = 1, μ = 1: Farmers desire to be on their own, and only singletons form.
The cooperative game theory concept that the algorithm emulates is the core [29]. An element of the core is a coalition structure—a disjoint and complete collection of coalitions—where no subgroup of agents (farmers) has an incentive to leave. Determining whether a coalition structure is in the core requires checking whether any subgroup has an incentive to form an alternative coalition given the current coalition structure [40]. Our algorithm provides a heuristic approximation of the core at each simulated time-step by proposing alternative coalitions to the agents, which they will join if the alternative coalition provides an increase in utility for all of its potential members. Alternative coalitions of the following types are created and tested at each time-step:
• Individual leave: Each agent evaluates whether it would receive higher utility if it were in its singleton coalition; if it would be better off, it leaves its current coalition.
• Kick outs: Each coalition checks to see if it would be better off if a randomly chosen member were removed from the coalition. If all the other members would obtain a higher utility if the chosen member left, then the chosen member is removed from the coalition.
• Joining: Two existing coalitions are proposed to join together; if all agents, from both coalitions, would receive higher utility, then the two coalitions join together to form a super coalition.
• Pairs: Two randomly selected agents from two different coalitions are selected. If the two agents would obtain a higher utility by joining together, then they leave their respective coalitions and form a new pair coalition.
• Defects: A random agent is selected, along with a random coalition that the selected agent is not a member of. If the random coalition and the random agent would both obtain higher utility from joining together, then the random agent defects from its current coalition and joins the selected coalition. If all agents are currently in the grand coalition, then this is not possible.
• Splits: A random subgroup from a random coalition is selected. If either the subgroup's members or the remaining members would get higher utility, then the coalition splits.

By repeatedly testing for these alternative coalitions, it was hoped that stability would be found in the farmers' coalition structure. Thus, we intend to understand the impact of the utility function (as determined by the weights) on the final coalition structure.
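A sketch of two of these moves (individual leave and joining), built on the utility() sketch above; the data structures here are our assumptions, and the full heuristic is specified in [38, 39]:

```python
# Sketch of two of the six move types; data structures are our assumptions.
# A coalition is a list of (p, x, y) tuples. Under this V(i), the utility is
# identical for every member of a coalition, so one evaluation suffices.

def member_utility(coalition, lam, mu):
    return utility(0, coalition, lam, mu)   # utility() from the sketch above

def individual_leave(coalitions, lam, mu):
    """A farmer leaves for its singleton if that raises its utility."""
    for c in list(coalitions):
        for farmer in list(c):
            if len(c) > 1 and member_utility([farmer], lam, mu) > member_utility(c, lam, mu):
                c.remove(farmer)
                coalitions.append([farmer])

def join(coalitions, a, b, lam, mu):
    """Merge coalitions a and b only if all members of both would gain."""
    merged = coalitions[a] + coalitions[b]
    if all(member_utility(merged, lam, mu) > member_utility(coalitions[i], lam, mu)
           for i in (a, b)):
        for i in sorted((a, b), reverse=True):
            coalitions.pop(i)
        coalitions.append(merged)
```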
4 Experimentation and Results
The purpose of the simulation runs was to determine the impact of varying the farmers' preference weighting towards autonomy (λ) and the weighting due to the distance between farms in a coalition (μ) on the resulting coalition structure. The simulation was run for 10,000 time-steps and, due to the stochastic nature of the simulation, repeated 100 times for each combination of these two parameters, where each parameter's value is taken from {0, 0.1, 0.2, 0.3, 0.4, 0.5}. As the results will indicate, the convergence of results meant there was little benefit from considering parameter values above 0.5.

Fig. 1 Screenshots from the NetLogo simulation, with circles representing farms, when run for parameters (a) λ = 0.1, μ = 0.1 and (b) λ = 0.2, μ = 0.2; both from the final time-step

The number of coalitions that formed varied depending on the input parameter values; this phenomenon is exemplified by the screenshots in Fig. 1. Since multiple runs were completed for each parameter combination, the average values are presented here. A coalition structure can be difficult to express, especially using averages, and, as such, we have focused on key summary statistics instead. The statistics considered are the percentage of single farmers (farmers in their singleton groups), average coalition size, and maximum coalition size. Each of these statistics is an average over the 100 runs for the input parameter combination.

Figure 2 shows the impact of the input parameters on the number of singleton groups. As both the weighting towards autonomy (a) and the weighting towards distance (b) increase, the number of singleton groups increases. This is as expected: increasing farmer preference for autonomy and increasing the negative impact of distance on the farmer's utility function should indeed lead to more singleton groups forming. Increasing the negative impact of distance between farmers makes the farmers less inclined to form coalitions with other farmers that are far away, which, in turn, increases farmers' preference towards those farmers close to them. This effectively decreases the pool of other farmers that a farmer has available to form a feasible coalition. Since there are fewer farmers to choose from, a farmer is more likely to remain a singleton, because it finds no one compatible to join. Hence, as the negative impact of distance increases, so does the number of singleton groups. This explanation also explains the shape of the graphs in Figs. 3 and 4.

Fig. 2 Graphs showing the relationship between the parameter values and the percentage of farmers that are in singleton groups: (a) shows how λ causes this to vary for different values of μ, and vice versa for (b)

Fig. 3 Graphs showing the relationship between the parameter values and the average coalition size: (a) shows how λ causes this to vary for different values of μ, and vice versa for (b)

The average maximum group size found was 17 (of 100 farmers). The maximum group size over all runs was 27. Thus, under our utility equation, the grand coalition is not shown to form. Since the input parameters explicitly affect the utility function, it would be incorrect to compare utility (and social welfare) across the input parameters.
Fig. 4 Graphs showing the relationship between the parameter values and the maximum coalition size: (a) shows how λ causes this to vary for different values of μ, and vice versa for (b)
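The experimental loop just described has a simple structure; a sketch of it follows (run_simulation is a hypothetical stand-in for the NetLogo model, here replaced by a random partition so the code runs):

```python
# Sketch of the experiment: sweep lambda and mu over {0, 0.1, ..., 0.5},
# 100 replications each, recording the three summary statistics.
import random
import statistics

PARAM_VALUES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

def run_simulation(lam, mu, steps=10_000, n_farmers=100):
    """Hypothetical stand-in for the NetLogo model: returns a partition of
    farmers into coalitions. Here it just partitions randomly, for illustration."""
    farmers = list(range(n_farmers))
    random.shuffle(farmers)
    coalitions, i = [], 0
    while i < n_farmers:
        size = random.randint(1, 5)
        coalitions.append(farmers[i:i + size])
        i += size
    return coalitions

def summarize(coalitions, n_farmers=100):
    sizes = [len(c) for c in coalitions]
    singles = sum(1 for s in sizes if s == 1)
    return (100.0 * singles / n_farmers,   # % single farmers
            statistics.mean(sizes),         # average coalition size
            max(sizes))                      # maximum coalition size

results = {}
for lam in PARAM_VALUES:
    for mu in PARAM_VALUES:
        stats = [summarize(run_simulation(lam, mu)) for _ in range(100)]
        # average each statistic over the 100 replications
        results[(lam, mu)] = tuple(statistics.mean(s) for s in zip(*stats))
```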
5 Conclusions
The research presented in this paper used agent-based modeling to investigate how small-scale regional farmers might form coalitions to facilitate horizontal transportation collaboration, in which their decisions to collaborate are a function of their preference for autonomy and the difficulty of coordinating with other farmers at a distance. Preliminary experimental results demonstrated that the farmer agents form singleton groups when either their preference for autonomy is high or they consider the impact of distance between farms to be too large. These results are as expected and therefore represent a proof-of-concept for using agent-based modeling and cooperative game theory to explore the problem of horizontal collaboration in regional food supply chains. Though a simple analytical model could have been used to yield these results, it is unlikely that such a model could be feasibly adapted to a real-world scenario, due to the limitations of an analytical approach (e.g., the assumption of homogeneity required to amalgamate the agents). This research represents the first step in the modeling process, and the obvious nature of the results serves as a white-box validation of the approach.

The next stage of this research is to adapt the model to a real-world regional food supply network to understand how farmers could be influenced to form coalitions with other farmers. Empirical human behavioral data is currently being collected via semi-structured interviews with small- and mid-sized farmers throughout the state of Texas. The goal of these interviews is to gain a better understanding of the factors that encourage/prevent farmer transportation collaboration, as well as the types of logistics infrastructure (e.g., cross-dock depots, information-sharing software) that are necessary to facilitate more effective and efficient logistics for these farmers. The interview data will be used to inform ABM input parameter values (e.g., farmer agents' relative preference for autonomy versus financial gain) and to provide a
means of model validation. The model will then be used to identify appropriate incentives, and the barriers that must be removed, to encourage coalition formation. Model outputs will also be used to determine the effects of coalition formation on individual farmers’ performance and satisfaction, as well as the overall performance and sustainability of the regional food supply system over time. The long-term research objective is to use model outputs to design a resilient collaborative logistics system for regional food producers that will strengthen rural economies and improve consumer access to healthy, fresh, regionally produced food. A potential alternative use of the model is comparing transportation as a service to farmer coordination. That is, if logistics providers were the ones to coordinate and optimize the collection of goods from multiple farms, then this would reduce the need for the farmers to cooperate altogether. However, this assumes that the logistics providers would pass on the savings to the farmers, which may not be a valid assumption.
References
1. U.S. Department of Agriculture. (2016). Direct farm sales of food: Results from the 2015 local food marketing practices survey. Retrieved July 17, 2018, from https://www.agcensus.usda.gov/Publications/2012/Online_Resources/Highlights/Local_Food/LocalFoodsMarketingPractices_Highlights.pdf
2. Feldmann, C., & Hamm, U. (2015). Consumers' perceptions and preferences for local food: A review. Food Quality and Preference, 40, 152–164.
3. Martinez, S., et al. (2010). Local food systems: Concepts, impacts, and issues. Retrieved July 17, 2018, from https://www.ers.usda.gov/webdocs/publications/46393/7054_err97_1_.pdf?v=4226
4. Schnell, S. M. (2013). Food miles, local eating, and community supported agriculture: Putting local food in its place. Agriculture and Human Values, 30(4), 615–628.
5. Low, S. A., et al. (2015). Trends in U.S. local and regional food systems: Report to congress. Retrieved July 17, 2018, from https://www.ers.usda.gov/webdocs/publications/42805/51173_ap068.pdf?v=42083
6. King, R. P., et al. (2010). Comparing the structure, size, and performance of local and mainstream food supply chains. Retrieved July 17, 2018, from https://files.are.ucdavis.edu/uploads/filer_public/2014/06/19/comparing-the-structure-size-and-performance.pdf
7. Miller, M., Holloway, W., Perry, E., Zietlow, B., Kokjohn, S., Lukszys, P., et al. (2016). Regional food freight: Lessons from the Chicago region. Retrieved July 17, 2018, from https://localfoodeconomics.com/wp-content/uploads/2018/02/miller-et-al-2016Regional-food-freight-final-2.pdf
8. Mittal, A., Krejci, C. C., & Craven, T. J. (2018). Logistics best practices for regional food systems: A review. Sustainability, 10(1), 168.
9. Bosona, T. G., & Gebresenbet, G. (2011). Cluster building and logistics network integration of local food supply chain. Biosystems Engineering, 108(4), 293–302.
10. Lindsey, T., & Slama, J. (2012). Building successful food hubs: A business planning guide for aggregating and processing local food in Illinois. Retrieved July 17, 2018, from http://www.familyfarmed.org/wp-content/uploads/2012/01/IllinoisFoodHubGuide-final.pdf
11. Bloom, J. D., & Hinrichs, C. C. (2011). Moving local food through conventional food system infrastructure: Value chain framework comparisons and insights. Renewable Agriculture and Food Systems, 26(1), 13–23.
12. Gasson, R. (1973). Goals and values of farmers. Journal of Agricultural Economics, 24(3), 521–542.
13. Gillespie, J. M., & Eidman, V. R. (1998). The effect of risk and autonomy on independent hog producers' contracting decisions. Journal of Agricultural and Applied Economics, 30(1), 175–188.
14. Key, N. (2005). How much do farmers value their independence? Agricultural Economics, 22(1), 117–126.
15. Renting, H., Marsden, T. K., & Banks, J. (2003). Understanding alternative food networks: Exploring the role of short food supply chains in rural development. Environment and Planning A, 35(3), 393–411.
16. Lyson, T. A. (2007). Civic agriculture and the North American food system. In C. C. Hinrichs & T. A. Lyson (Eds.), Remaking the North American food system: Strategies for sustainability (pp. 19–32). Lincoln, NE: University of Nebraska Press.
17. Audy, J. F., Lehoux, N., D'Amours, S., & Rönnqvist, M. (2012). A framework for an efficient implementation of logistics collaborations. International Transactions in Operational Research, 19(5), 633–657.
18. van der Vorst, J., Beulens, A., & van Beek, P. (2005). Innovations in logistics and ICT in food supply chain networks. In W. M. F. Jongen & M. T. G. Meulenberg (Eds.), Innovation in agrifood systems (pp. 245–292). Wageningen: Wageningen Academic Publishers.
19. Soosay, C. A., & Hyland, P. (2015). A decade of supply chain collaboration and directions for future research. Supply Chain Management: An International Journal, 20(6), 613–630.
20. Mason, R., Lalwani, C., & Boughton, R. (2007). Combining vertical and horizontal collaboration for transport optimisation. Supply Chain Management: An International Journal, 12(3), 187–199.
21. CLECAT. (2010). Logistics best practice guide. Retrieved July 18, 2018, from https://www.clecat.org/media/sr005osust101201clecatsustlogbpg2nded.pdf
22. Barratt, M. (2004). Understanding the meaning of collaboration in the supply chain. Supply Chain Management: An International Journal, 9(1), 30–42.
23. Pomponi, F., Fratocchi, L., & Rossi Tafuri, S. (2015). Trust development and horizontal collaboration in logistics: A theory-based evolutionary framework. Supply Chain Management: An International Journal, 20(1), 83–97.
24. Cruijssen, F., Cools, M., & Dullaert, W. (2007). Horizontal cooperation in logistics: Opportunities and impediments. Transportation Research Part E: Logistics and Transportation Review, 43(2), 129–142.
25. Pérez-Bernabeu, E., Juan, A. A., Faulin, J., & Barrios, B. B. (2015). Horizontal cooperation in road transportation: A case illustrating savings in distances and greenhouse gas emissions. International Transactions in Operational Research, 22(3), 585–606.
26. D'Amours, S., & Rönnqvist, M. (2010). Issues in collaborative logistics. In E. Bjørndal, M. Bjørndal, P. M. Pardalos, & M. Rönnqvist (Eds.), Energy, natural resources and environmental economics (pp. 395–409). Berlin: Springer.
27. Guajardo, M., & Rönnqvist, M. (2016). A review on cost allocation methods in collaborative transportation. International Transactions in Operational Research, 23(3), 371–392.
28. Thomas, L. C. (2003). Games, theory and applications. Mineola, NY: Dover Publications.
29. Gillies, D. B. (1959). Solutions to general non-zero-sum games. Contributions to the Theory of Games, 4(40), 47–85.
30. Shapley, L. S. (1967). On balanced sets and cores. Naval Research Logistics Quarterly, 14(4), 453–460.
31. Fiestras-Janeiro, M. G., García-Jurado, I., Meca, A., & Mosquera, M. A. (2011). Cooperative game theory and inventory management. European Journal of Operational Research, 210(3), 459–466.
32. Lozano, S., Moreno, P., Adenso-Díaz, B., & Algaba, E. (2013). Cooperative game theory approach to allocating benefits of horizontal cooperation. European Journal of Operational Research, 229(2), 444–452.
33. Arvitrida, N. I., Robinson, S., & Tako, A. A. (2015). How do competition and collaboration affect supply chain performance? An agent-based modeling approach. In Proceedings of the 2015 Winter Simulation Conference (pp. 218–229). New York: IEEE.
34. Serrano-Hernandez, A., Faulin, J., Hirsch, P., & Fikar, C. (2018). Agent-based simulation for horizontal cooperation in logistics and transportation: From the individual to the grand coalition. Simulation Modelling Practice and Theory, 85, 47–59.
35. Utomo, D. S., Onggo, B. S., & Eldridge, S. (2017). Applications of agent-based modelling and simulation in the agri-food supply chains. European Journal of Operational Research, 269(3), 794–805.
36. Krejci, C. C., & Beamon, B. M. (2015). Impacts of farmer coordination decisions on food supply chain structure. Journal of Artificial Societies and Social Simulation, 18(2), 19.
37. Boero, R. (2011). Food quality as a public good: Cooperation dynamics and economic development in a rural community. Mind and Society, 10(2), 203.
38. Collins, A. J., & Frydenlund, E. (2018). Strategic group formation in agent-based modeling and simulation. Simulation, 94(3), 179–193.
39. Vernon-Bido, D., & Collins, A. J. (2018). Advancing the coalition formation heuristic algorithm for agent-based modeling. Norfolk, VA: Old Dominion University.
40. Chalkiadakis, G., Elkind, E., & Wooldridge, M. (2011). Computational aspects of cooperative game theory. Synthesis Lectures on Artificial Intelligence and Machine Learning, 5(6), 1–168.
Modeling Schools’ Capacity for Lasting Change: A Causal Loop and Simulation-Based Approach Roxanne A. Moore and Michael Helms
Abstract Improving the K-12 education system in the USA is one of the wicked problems of the twenty-first century. For all the discussion around STEM (Science, Technology, Engineering, and Mathematics), "Every Student Succeeds," and countless other initiatives and interventions, it seems reasonable to ask—why are K-12 school systems still starved for lasting, meaningful change? While education researchers and evaluators have certainly brought rigor to understanding intervention impacts and outcomes at the student and teacher level, these studies often do not directly account for the social-science-based context that surrounds interventions. In this paper, we present an approach for modeling school settings using causal loop diagrams and accompanying stochastic simulations to better understand schools' capacity for intervention. We present initial evidence to support the claim that the modeling process and the resultant models can aid in the design of quality, school-compatible interventions by improving understanding of the ecosystems in which educational interventions operate. This work attempts to shine light on the relationship between a K-12 school environment and an intervention, paying respect to the fact that the two cannot be cleanly decoupled. We also present a framework that can be adopted by other intervention teams in the future to better understand the settings in which they are operating.

Keywords Complex systems · Education · System dynamics · K-12 schools · Modeling · Simulation · Intervention · Capacity
1 Introduction
Many entities in the USA are working on improving K-12 education, from the macro- to the micro-level. While many interventions and policies may be effective in some situations, there are two big challenges: sustainability and scalability. Sustainability comes into play when interventions are implemented as part of a grant with a finite life. Often, during the grant, a school or set of schools is well-resourced, compensated for its implementation efforts, and well-supported by the intervening agency. However, when the grant ends, it is anyone's guess as to what happens—will a particular school continue to make use of the intervention, or will it fade out like so many past attempts to improve educational outcomes [1]?

Scalability is another buzzword in educational interventions. After collecting initial data on effectiveness, interveners often apply for scale-up funding to see if the intervention is effective more broadly [2]. The question here, however, should not be about quantity, but rather about responsible scaling—for which schools and school systems is an intervention a good fit? The compatibility of a school and an intervention is often not considered, but school ecosystems vary widely, and what works in a magnet school may not be effective or sustainable in an urban public school, and vice versa. Current proposal solicitations from the National Science Foundation (NSF) reflect this understanding by asking the question, "what works, for whom, and under what conditions [3]?" Answering this question systematically, however, requires tools beyond the current state of the art in education research.

Educational research conducted in conjunction with educational interventions has evolved, with increasing consideration of the effects of school and school system variables on project implementation. Design-based implementation research (DBIR) is one framework that recognizes the variability inherent in school settings and embraces the complexities of the educational system. DBIR is a research method increasingly used by educational researchers and practitioners when developing, testing, and implementing educational interventions [2, 4, 5]. However, DBIR does not currently provide any specific quantitative or qualitative modeling frameworks to systematically handle this complexity. Modeling educational systems as complex systems, it follows, may provide useful information for designers of educational innovations who engage in DBIR [6].

In this work we present a process for modeling an intervention and an accompanying suite of school models using both causal loop diagrams (CLD) and an accompanying discrete-time simulation based on the education system intervention modeling (ESIM) framework [7]. While DBIR presents strategies for successful intervention design and implementation, ESIM is specifically an approach for modeling school settings to gain understanding of the factors that enhance or impede intervention sustainability. This research was conducted within an NSF-funded Discovery Research PreK-12 project, EarSketch, in which the primary goal was to design, implement, and assess a STEAM (Science, Technology, Engineering, Arts, and Mathematics) educational intervention in Georgia high schools aimed at broadening interest and participation in computer science. Initially, the aim of the
Modeling Schools’ Capacity for Lasting Change
217
modeling effort was to identify key attributes of school settings likely to impact intervention sustainability in the long term. However, during the intervention, we found that the models also shaped the discussions and decisions among the broader research and intervention team, making the modeling process as valuable as the models themselves. In this paper, strategies for developing and validating models and a suite of modeling results for five participating schools are presented and discussed.
2 Model Setting: EarSketch in High School Computer Science Principles Classes
EarSketch uses a STEAM approach (STEM + Arts) to lower the barriers to entry and increase engagement in computer science through music [8]. It is a web-based learning environment that blends a digital audio workstation (DAW) and a library of culturally relevant music samples with two popular coding languages, JavaScript and Python [8–10]. The environment enables students to quickly create novel musical compositions, or "remixes," using coding and computational thinking, without prior training in music theory or performance [8]. EarSketch has shown promise in facilitating learning of computational principles and improving engagement for student populations traditionally underrepresented in the field [11].

EarSketch is being implemented as a 10-week module within the ~36-week Computer Science Principles (CSP) course. CSP is a new standard for Advanced Placement (AP) and other high school computer science (CS) courses. CSP takes a broader view of computing literacy, focusing not only on algorithms, data structures, and programming, but also on the social, cultural, and technological impacts of computing, as well as emphasizing creativity, collaboration, and communication [12–14]. The goal of this particular project was to scale EarSketch to multiple schools and school districts in Georgia and measure the impacts on students' interest in computer science, their learning of programming, and their intention to persist in the field.
3 Developing the Models
The goal of the modeling effort within EarSketch was to identify the key attributes that would enhance or inhibit the sustainability of EarSketch at a particular school. During our NSF-funded grant, teachers and districts are generally incentivized to participate—the question is, will these schools continue to use EarSketch after the funding runs out and the support from university faculty and staff wanes? The ESIM framework, which was originally developed to analyze the sustainability of a middle school science intervention, was used to guide the modeling process [15–17].
Table 1 Attributes directly contributing to each sustainability dimension

Infrastructure     | Culture                             | Capability
Technology support | Community support for CS            | Student content knowledge (CS)
Instructional time | Teacher support of EarSketch        | Teacher content knowledge (CS)
Class size         | Administration support of EarSketch | Teacher EarSketch knowledge
In this work, as with past work in developing and using ESIM, we operationalize intervention sustainability and capacity using the work of Blumenfeld et al. [18], which proposes three dimensions to assess the compatibility between an intervention and a school: policy management, culture, and capability. Policy management generally includes school attributes such as supplies, teacher preparation time, class time, class size, and space. For the EarSketch intervention, we reduce this dimension to simply "infrastructure," where the focus is on class size and available technology for teaching computer science. Capability generally reflects the attributes of the teachers and staff, such as teaching ability and content knowledge. Culture generally represents support for the intervention among the teachers, administrators, and community in which a school is situated (Table 1).
3.1 Agents and Attributes

The modeling effort for EarSketch occurred in parallel with the intervention design and implementation, not retrospectively, unlike previous work in developing and applying ESIM. Because of this, there was an opportunity to define the agents and attributes exhaustively prior to implementation, and to attempt to measure relevant attributes in real time where possible. The models focus on the school level, as the sustainability question is about compatibility between EarSketch and a particular school setting. To that end, we developed an initial list of attributes based on past experience with schools and interventions, previous ESIM models, and expert input from the EarSketch team regarding the attributes specific to EarSketch implementation. We then compared the list of proposed ESIM model attributes with the constructs proposed for measurement by the education research team for EarSketch and tried to leverage any overlap. We mapped relevant constructs to existing instruments and modified some instruments to include measures of additional attributes where practical. Other attributes that cannot be measured with a quantitative instrument are measured using observational and interview protocols and, where necessary, expert insight. The attributes and relationships are depicted in the causal loop diagram shown in Fig. 1.
Fig. 1 Causal loop diagram of the EarSketch intervention in a high school. Blue arrows indicate positive correlations, red arrows negative correlations, and orange indicates a desired range, outside of which values are less preferred. Each attribute is assigned to a particular "agent" or "resource," e.g., the teacher or the professional development (PD)
3.2 Causal Loop Diagram

Causal loop diagrams were introduced in EarSketch as a new addition to the ESIM framework. These models serve multiple purposes: (1) shared mental models enabled better communication among the EarSketch team members; (2) visual models of observed classroom behavior enabled adjustments to both the EarSketch curriculum and the training provided to teachers; and (3) face validation of the change equations, since seeing the influences visually allowed the broader EarSketch team to understand what the school ecosystem looks like in terms of feedback loops, which is difficult to convey using only equations. For more information on purposes (1) and (2), see prior work by the authors [19–21].

Presented in Fig. 1, above, is a VENSIM PLE rendering of a causal loop diagram for the EarSketch intervention as generally situated in a school. The figure captures attributes and their relationships, where each attribute is color-coded based on the agent, entity, or resource to which it is assigned. The attributes influence one another in a complex way, as depicted in the diagram, and these influences are used to develop the change equations for the simulation. In the figure, blue arrows represent standard positive correlations, red arrows are negative correlations, and the orange arrow indicates a preferred range of values, outside of which the influence is negative.
Table 2 Measured attributes from intake surveys contributing to initial attribute values

Students: active participation                    | Students: CS engagement                                    | Teacher
Students selected course                          | Computing confidence, enjoyment, usefulness, and identity  | CSP self-efficacy
Students' prior CS experience (formal/informal)   | Intention to persist and motivation to succeed             | Experience teaching CS
Students' music appreciation and music knowledge  | Creativity: person and place                               | Experience teaching high school
For example, student computer science (CS) engagement is a measure that is aggregated from multiple constructs on a student pre-survey developed by the educational research team. The constructs are used to set the initial condition on CS engagement for a class of students, but we do not consider changes to those constructs on an individual basis in this model. Rather, we leave that analysis to traditional education research methods. In addition to the attributes listed in Table 2, administrative support for EarSketch is initialized based on any past experiences of the school and/or district with the intervening agency. This diagram was presented to the EarSketch team and updated based on expert feedback. It has additionally been presented to two EarSketch teachers in a qualitative, semi-structured interview setting. While the specifics of each connection were not discussed, the overall impression of the teachers was that the causal loop diagram was generally representative of their school environment. One teacher said the following: I think any subject that you teach, because you’re dealing with people, it’s not an input A, output B, sort of thing. And I think you’ve hit upon the big components that go into student success and teacher success, and by default EarSketch’s success. The big ones are students, teachers, the school, the administrative support, the community.
3.3 Simulation Equations

Changes in the agents' attributes and composite system state are modeled as a discrete-time Markov chain (DTMC). It is therefore assumed that change takes place in discrete time steps and that changes in the agents' attributes depend only on the current attribute level and current relationships, not on past system states. The time horizon for the intervention (generally between 1 and 3 years) is divided into discrete time periods, during which the attributes of the agents have some probability of change. The following equation represents the general structure of the changes taking place in the model:

$p_{\mathrm{change}} = w_{\mathrm{internal}} \cdot p_{\mathrm{internal}} + w_{\mathrm{external}} \cdot p_{\mathrm{external}}$  (1)
where p_change is the overall probability of change, p_internal captures the "modeled" aspects of change that are specific to the school and intervention itself, and p_external captures aspects of the schools and surrounding community that are not modeled explicitly but still affect change. The weights, w_internal and w_external, are non-negative, sum to 1, and are assigned based on one of the four possible attribute types defined in Table 1.

The internal probability of change is further divided into two parts: transient and steady state. As implementation of an intervention begins, it is assumed that a school enters a temporary transient phase where changes occur more readily. Then, the school reverts to a steady state where change is less likely to occur without more significant disturbances to the system. More specifically, in the transient state, the absolute values of the agents' attributes affect change in other agents, whereas in the steady state, change in an agent only occurs when there is a change in the attributes of other agents with whom there are relationships. The following equation represents the structure of the internal component of the change equation:

$p_{\mathrm{internal}}(t) = e^{-kt} \cdot p_{\mathrm{transient}} + \left(1 - e^{-kt}\right) \cdot p_{\mathrm{steady}}$  (2)

where p_internal(t) is the internal component of the change probability corresponding to time period t, k is a time constant affecting how long the system stays in the transient phase, and p_transient and p_steady are the transient and steady-state sub-components of the internal change probability. Note that

$\lim_{t \to \infty} p_{\mathrm{internal}}(t) = p_{\mathrm{steady}}$  (3)
For more details on the equation structures and calculating the probability of change, see prior work [7, 16].
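To make the structure of Eqs. (1)–(3) concrete, the following is a minimal Python sketch of the change-probability computation. All parameter values here are illustrative placeholders, not values from the paper, and the function names are our own; the authors' actual implementation is described in [7, 16].

```python
import math

def p_internal(t, k, p_transient, p_steady):
    """Eq. (2): exponentially decaying blend of the transient and
    steady-state components of the internal change probability."""
    decay = math.exp(-k * t)
    return decay * p_transient + (1 - decay) * p_steady

def p_change(t, w_internal, w_external, p_ext, k, p_transient, p_steady):
    """Eq. (1): overall change probability as a weighted sum of the modeled
    (internal) and unmodeled (external) components; weights sum to 1."""
    assert abs(w_internal + w_external - 1.0) < 1e-9
    return w_internal * p_internal(t, k, p_transient, p_steady) + w_external * p_ext

# Illustrative call with placeholder values:
print(p_change(t=3, w_internal=0.8, w_external=0.2, p_ext=0.05,
               k=0.5, p_transient=0.4, p_steady=0.1))
```

As t grows, the transient term decays and p_internal tends to p_steady, which is exactly the limit stated in Eq. (3).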
3.4 Simulation

To run stochastic simulations for each school, initial values for all attributes had to be calculated for each school at the beginning of the intervention. These starting values were derived from three primary data sources: quantitative data, either collected as part of the research project or from public data (e.g., computer science knowledge test scores or percent of students receiving free lunch); qualitative data from classroom observations (e.g., number of students and number of available computers); and "oracle" or expert data, gathered from knowledgeable experts with first-hand knowledge of the relationships and attitudes of teachers, administrators, and the school environment. To ensure all attributes were operating over the same value range, each attribute was converted to a value scale from 1.0 to 5.0 (ESIM used a scale from 0.0 to 2.0). The selection of the range is immaterial to the calculation algorithm, and we felt a range of 1–5 would aid in the communication of results. Low scores are associated with conditions that inhibit learning or reflect low performance, such as low test scores, overcrowded classrooms, or low teacher support for the intervention, while high scores reflect conditions that promote learning or indicate high performance. The method of converting collected data to a value scale was unique for each attribute. The methods for three example attributes are shown in Table 3.

Table 3 Attribute value conversion examples

Attribute | Data source | Sample input | Conversion method | Sample attribute score
Student CS content knowledge | Pre-intervention student CS test | 55.6 (average class score on test, out of 100) | (Test score/25) + 1 | 3.2
Classroom technology support | Observational | "Classroom had 25 computers for 30 students, with some internet bandwidth issues" | Baseline 5; −0.5 for bandwidth or sound issues; −1 for a few students without computers, −3 for many, −4 for most | 3.5
Teacher support for EarSketch | Oracle | "The teacher was enthusiastic about EarSketch, but had some reservations about the kids being able to handle the content" | 5 for total support with experience, 4 for support without, 3 if reservations, 2 for strong reservations or forced participation, 1 for actively negative | 3

Once initial values were calculated for each school, the simulation progressed through three intervention cycles, each with 12 periods of regular attribute adjustments followed by a "year-end" period for annual adjustments. During each period of regular adjustment, the percentage chance for each attribute to change was calculated according to the formulae described previously, using the attribute "type" parameters, external trend values, and the prior-period values of all influencing attributes. Negative percent-change values reflect the odds that the change would be in the negative direction, and vice versa. To simulate a stochastic change event, a uniform random variable between 0 and 1 is generated and compared to the percentage change. If the random variable is less than the absolute value of the percentage change, the attribute was changed by an increment of 0.154 in the appropriate direction, calculated as the maximum possible movement for one year (2.0) divided by the number of time steps in one year (13), with an attribute floor value of 1.0 and ceiling of 5.0. At the end of the year, a new student class enters the system with attributes that are generally lower than the attributes of the previous class at the end of the
year (e.g., CS content knowledge). This creates a negative adjustment and slight discontinuities in the general trends for attribute values at times 13, 26, and 39. The change caused by these year-end discontinuities can trigger the steady-state portion of the change equation, causing changes to ripple through an otherwise steady environment for several periods until the system settles into a new steady state.

The computational model was developed in Java, and simulations were run on a Dell XPS 8910 (Intel i5-6400, 2.7 GHz quad core, 8 GB RAM, Windows 10). For each of the eight datasets of school initial conditions, 300 simulations were run and expected values were calculated at each time step, approximating a deterministic model, with a processing time of 1.5 s for all runs of all schools. Higher numbers of simulations (500, 1000, and 10,000) were tested on three schools, resulting in final expected values identical to within two decimal places for all attributes. We did not test the lower bound on the number of simulations needed to achieve similarly reliable results. The results of the 300 runs for each school were averaged to show the general likely outcome for that school.

Variable screening analysis of this model was done using the method of Morris [22] (900 input value combinations, each simulated 300 times). Results show that the attributes directly connected to another attribute, as depicted in Fig. 1, strongly influence how that attribute will change, whereas for attributes that are one step removed, the influence is reduced, often to the point of being indistinguishable from model noise. In this sense, the visual model is a useful representation for understanding the key influences on any given attribute. It does not, however, reveal differences in the level of influence of those variables, nor does it directly reveal which attributes influence the sustainability constructs, which are complex composites of many attributes.
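The per-period update rule lends itself to a compact sketch. Below is a minimal Python illustration of the stochastic change event described above, assuming the signed percentage chance of change (pct_change) has already been computed from the change equations; the fixed pct_change used in the demo is a placeholder, not a value from the model.

```python
import random

STEP = 2.0 / 13          # increment per change event: max yearly movement (2.0)
                         # divided by the 13 time steps in one year (~0.154)
FLOOR, CEIL = 1.0, 5.0   # the 1-5 attribute value scale

def update_attribute(value, pct_change, rng):
    """One period's stochastic update for a single attribute. The sign of
    pct_change gives the direction; its magnitude is the chance of moving."""
    if rng.random() < abs(pct_change):
        value += STEP if pct_change > 0 else -STEP
    return min(max(value, FLOOR), CEIL)

# Illustrative run: one attribute over 3 years (3 x 13 periods), averaged
# over 300 stochastic runs, mirroring the averaging used in the paper.
rng = random.Random(42)
finals = []
for _ in range(300):
    v = 3.2
    for _ in range(3 * 13):
        v = update_attribute(v, pct_change=0.10, rng=rng)
    finals.append(v)
print(sum(finals) / len(finals))
```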
4 Results and Discussion

The results of interest for this analysis are the sustainability scores produced for each school, averaged over the 300 simulation runs. Tables 4, 5, 6, and 7 show the initial calculated scores for each of the six schools and the simulated values at the end of years 1, 2, and 3.

Table 4 Total fitness score (weighted sum of capability, culture, and infrastructure)

School | Initial value | Year 1 simulated (calculated) | Year 2 | Year 3 | Change (initial to Y3)
A  | 3.92 | 4.09 (4.16) | 4.14 | 4.16 | 0.24
B  | 4.23 | 4.31        | 4.32 | 4.31 | 0.08
C  | 4.41 | 4.52        | 4.57 | 4.58 | 0.17
D* | 3.48 | 3.74        | 3.86 | 3.89 | 0.29
E  | 2.86 | 3.09 (3.29) | 3.18 | 3.20 | 0.22
F  | 4.02 | 4.20        | 4.26 | 4.25 | 0.16

*School D dropped out of the study after year 1.
Table 5 Capability

School | Initial value | Year 1      | Year 2 | Year 3 | Change
A  | 3.51 | 3.72 (4.04) | 3.79 | 3.79 | 0.28
B  | 2.70 | 2.93        | 3.01 | 3.02 | 0.32
C  | 4.36 | 4.46        | 4.51 | 4.52 | 0.16
D* | 2.61 | 2.86        | 2.97 | 2.99 | 0.38
E  | 1.51 | 1.73 (2.61) | 1.83 | 1.85 | 0.34
F  | 2.86 | 3.13        | 3.22 | 3.23 | 0.37

*School D dropped out of the study after year 1.
Table 6 Culture

School | Initial value | Year 1      | Year 2 | Year 3 | Change
A  | 3.42 | 3.71 (3.96) | 3.84 | 3.90 | 0.48
B  | 5.00 | 5.00        | 4.99 | 4.96 | −0.04
C  | 4.37 | 4.56        | 4.66 | 4.70 | 0.33
D* | 3.48 | 3.78        | 3.94 | 4.02 | 0.54
E  | 2.41 | 2.68 (3.05) | 2.72 | 2.75 | 0.54
F  | 4.35 | 4.54        | 4.62 | 4.61 | 0.26

*School D dropped out of the study after year 1.
Table 7 Infrastructure

School | Initial value | Year 1      | Year 2 | Year 3 | Change
A  | 4.83 | 4.83 (4.50) | 4.81 | 4.77 | −0.06
B  | 5.0  | 5.0         | 4.98 | 4.94 | −0.06
C  | 4.51 | 4.55        | 4.54 | 4.51 | 0
D* | 4.34 | 4.36        | 4.35 | 4.32 | −0.02
E  | 4.67 | 4.69 (4.17) | 4.68 | 4.64 | −0.03
F  | 4.83 | 4.83        | 4.81 | 4.77 | −0.06

*School D dropped out of the study after year 1.
The duration of the grant implementation was 3 years, which is why the simulation spans 3 years. Two schools (A, E) participated in the study during two consecutive years, and values were calculated from data gathered from these schools at the beginning of their second year of implementation. These calculated values are shown in parentheses, providing points of comparison for validation purposes.

Investigating the two schools with multiple years of data, we note that the total fitness score for school A tracked very close to the model, a difference of +0.07, while school E was off by −0.43. When we look at the culture, capability, and infrastructure dimensions, we see that the overall fitness is masking larger differences: capability is off by +0.31 and +0.88, culture by +0.25 and +0.43, and infrastructure by −0.33 and −0.52 for schools A and E, respectively.
While such discrepancies may suggest that the model requires some parameter tuning, the differences in the magnitude of the discrepancies between A and E also suggest that another factor may be affecting school E. When we inquired with the research team, we learned that school E went through an administration change before year 2, bringing in a more supportive principal and improving culture. This was followed by a change of teacher at school E in year 3, which will affect capability. School A, however, remained more stable, with no dramatic changes in year 2, although the class size did increase due to the program's popularity and teacher recruitment activities.

In general, capability and culture improve rapidly in year 1 (where possible) and then stabilize quickly over time, with some minor slippage in year 3 for some schools. Infrastructure may initially improve but is weighed down by negative trends (rising costs of technology, crowded classrooms, etc.) and begins to deteriorate over time. Unsurprisingly, infrastructure tends to change the least over the course of a few years.

In terms of overall variation, we see capability exhibiting the widest range, followed by culture and infrastructure. Because capability is highly affected by teacher-level attributes, it is not surprising that it exhibits the greatest variability. Culture exhibits less variation, but this is in large part because these 6 schools come from only 2 districts, and district support is required to be able to work with a school and collect data. Infrastructure exhibits the least variation, but again, a school without the necessary technology infrastructure would be excluded from this study; this is not representative of a statewide distribution. This clustering does not undermine the importance of culture and infrastructure; rather, it is likely that this sample of schools is skewed positively. Year 3 data will include an additional 7 schools, and more variation is possible.

School stability is a key factor for intervention sustainability. For example, if hours of professional development are provided to a teacher, and that teacher leaves the school, that particular school no longer benefits from that resource. Generally, this loss is perceived as negative, but sometimes there can actually be a "positive destabilization" event, as shown with school E, where a more supportive administrator positively impacted culture. This outcome is in line with past results from ESIM, where it was demonstrated that a school with a low culture dimension could benefit from new teachers or administrators who came in with positive morale [7].
5 Conclusions and Future Work

As demonstrated in the results, some models of schools adhere well to the perceived or measured reality of the school setting, while others indicate differences between the models and reality. The open question is, what does it mean when a school radically deviates from the model prediction? Is the model inaccurate, are assumptions being violated, or is there an instability in the school that we need to
be aware of? The current school models assume a certain level of stability in a school and do not attempt to account for teacher or administrative turnover. As shown in the results, when turnover occurs, model results may deviate radically from measured ones. Predicting turnover is nearly impossible, but an open question is whether there is some way to measure or characterize the stability of a school.

In future work, we will be looking at a larger dataset including more multi-year schools, which will enable richer validation of the model and simulation results. In addition, we will examine the sustainability of EarSketch and what constitutes an acceptable sustainability fitness; this will be more readily accomplished in the next academic year as we follow up with schools to see which ones continue to use the program.

Acknowledgements EarSketch receives funding from the National Science Foundation (CNS #1138469, DRL #1417835, DUE #1504293, and DRL #1612644), the Scott Hudgens Family Foundation, the Arthur M. Blank Family Foundation, and the Google Inc. Fund of Tides Foundation. We would like to acknowledge the other members of the EarSketch team who have provided input on these models, including Jason Freeman, Doug Edwards, Tom McKlin, Brian Magerko, Sabrina Grossman, Anna Xambó, Léa Ikkache, and Morgan Miller, as well as Drs. Marion Usselman and Donna Llewellyn for their influence on this work. EarSketch is available online at http://earsketch.gatech.edu.
References

1. Lyon, A. R., Frazier, S. L., Mehta, T., Atkins, M. S., & Weisbach, J. (2011). Easier said than done: Intervention sustainability in an urban after-school program. Administration and Policy in Mental Health and Mental Health Services Research, 38(6), 504–517.
2. Penuel, W. R., & Fishman, B. J. (2012). Large-scale science education intervention research we can use. Journal of Research in Science Teaching, 49(3), 281–304.
3. National Science Foundation. (2018). Discovery research PreK-12. Available from: https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=500047
4. Fishman, B. J., Penuel, W. R., Allen, A.-R., Cheng, B. H., & Sabelli, N. (2013). Design-based implementation research: An emerging model for transforming the relationship of research and practice. National Society for the Study of Education, 112(2), 136–156.
5. Penuel, W. R., & Fishman, B. J. (2017). Design-based implementation research. Cited February 10, 2017. Retrieved from: http://learndbir.org/
6. Groff, J. S. (2013). Dynamic systems modeling in educational system design & policy. New Approaches in Educational Research, 2(2), 72–81.
7. Mital, P., Moore, R. A., & Llewellyn, D. C. (2017). Education system intervention modeling framework. Policy and Complex Systems, 3(2), 72–87.
8. Magerko, B., Freeman, J., Mcklin, T., Mccoid, S., Jenkins, T., & Livingston, E. (2013). Tackling engagement in computing with computational music remixing. Proceedings of the 44th ACM Technical Symposium on Computer Science Education, ACM.
9. Mahadevan, A., Freeman, J., Magerko, B., & Martinez, J. C. (2015). EarSketch: Teaching computational music remixing in an online Web Audio based learning environment. Web Audio Conference. Citeseer.
10. Mccoid, S., Freeman, J., Magerko, B., Michaud, C., Jenkins, T., Mcklin, T., et al. (2013). EarSketch: An integrated approach to teaching introductory computer music. Organised Sound, 18(02), 146–160.
11. Freeman, J., Magerko, B., Mcklin, T., Reilly, M., Permar, J., Summers, C., et al. (2014). Engaging underrepresented groups in high school introductory computing through computational remixing with EarSketch. Proceedings of the 45th ACM Technical Symposium on Computer Science Education, ACM.
12. Astrachan, O., Barnes, T., Garcia, D. D., Paul, J., Simon, B., & Snyder, L. (2011). CS principles: Piloting a new course at national scale. Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, ACM.
13. Astrachan, O., & Briggs, A. (2012). The CS principles project. ACM Inroads, 3(2), 38–42.
14. Collegeboard. (2016). AP computer science principles: Course and exam description. https://apcentral.collegeboard.org/pdf/ap-computer-science-principles-course-and-exam-description.pdf
15. Llewellyn, D. C., Usselman, M., Edwards, D., Moore, R. A., & Mital, P. (2013). Analyzing K-12 education as a complex system. 120th ASEE Annual Conference & Exposition, Atlanta, GA.
16. Mital, P. (2015). A modeling framework for analyzing the education system as a complex system. H. Milton School of Industrial and Systems Engineering, Georgia Institute of Technology.
17. Mital, P., Moore, R., & Llewellyn, D. (2014). Analyzing K-12 education as a complex system. Procedia Computer Science, 28, 370–379.
18. Blumenfeld, P., Fishman, B. J., Krajcik, J., Marx, R. W., & Soloway, E. (2000). Creating usable innovations in systemic reform: Scaling up technology-embedded project-based science in urban schools. Educational Psychologist, 35(3), 149–164.
19. Helms, M., Moore, R., Edwards, D., & Freeman, J. (2016). STEAM-based interventions: Why student engagement is only part of the story. In Research on equity and sustained participation in engineering, computing, and technology (RESPECT). New York: IEEE.
20. Moore, R. A., Helms, M., & Freeman, J. (2017). STEAM-based interventions in computer science: Understanding feedback loops in the classroom. Columbus, OH: American Society for Engineering Education.
21. Moore, R. A., Helms, M., & Usselman, M. (2018). Effective design-based implementation research using complex systems modeling. American Society for Engineering Education Annual Conference, Salt Lake City, UT.
22. Morris, M. D. (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33(2), 161–174.
Model Structure of Agent-Based Artificial System for Reproducing the Emergence of Bullying Phenomenon

Shigeaki Ogibayashi and Kazuya Shinagawa
Abstract The macrophenomenon associated with bullying is characterized by the emergence of bullies, the bullied, and a third party that makes up the majority, including bystanders who are not directly involved in the conflict of bullying. Another feature of bullying is the persistent and offensive attacks by one agent, the perpetrator, against another agent, the victim. To elucidate the mechanism of bullying through agent-based modeling, this paper analyzes the structural aspects of the model that are considered indispensable in reproducing the emergence of the bullying phenomenon by systematically changing the behavioral rules of the model. The necessary condition for the model structure is found to be that each agent has a characteristic tendency toward tuning and excluding behavior, which is modeled using shared values and an agent-specific threshold for the tuning and excluding actions. This model successfully reproduces the emergence of the third party, as well as the victim and perpetrator, during the process of the agents' actions and interactions. Moreover, the personality conditions for becoming the perpetrator, the victim, and the third-party agents are well explained by the agent's tendency for tuning and excluding behavior. It is concluded that people who are likely to attune with others tend to become members of a large group and are rarely excluded; among these people, those who are likely to exclude others are more likely to become the perpetrator; otherwise, they may become the third-party agents. People who are less likely to attune with others tend to become solo agents, among whom those who are less likely to exclude others are more likely to become victims, with others becoming the third-party agents. However, this model does not reproduce the emergence of intensively repeated attacks by specific perpetrators against specific victims. Based on these results, the mechanisms and countermeasures against bullying are discussed.
S. Ogibayashi () Chiba Institute of Technology, Narashino-shi, Japan e-mail: [email protected] K. Shinagawa FromSoftware, Inc., Tokyo, Japan © Springer Nature Switzerland AG 2020 T. Carmichael, Z. Yang (eds.), Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas, Springer Proceedings in Complexity, https://doi.org/10.1007/978-3-030-35902-7_15
Keywords Agent-based modeling · Bullying · Model structure · Shared value · Tuning and exclusion behavior · Mechanism · Countermeasure
1 Introduction

Although bullying is a crucial social phenomenon, no effective countermeasures have yet been established. One reason seems to be that the underlying mechanism of bullying behavior is not well understood.

There are many previous studies related to bullying [1–12]. According to the literature, bullying refers to negative actions perpetrated by one or more people toward one or more individuals, conducted repeatedly and regularly over a period of time. The negative actions may include harassing, mobbing, offending, and socially excluding a victim or victims [1]. Bullying is an escalating process in the course of which the person confronted ends up in an inferior position, having difficulty defending him or herself because of an imbalance in strength or power in the relationship with the perpetrator [1, 6].

Many studies have focused on the cause of bullying. Some argue that the perpetrator is the origin of the bullying, and his or her envy and self-esteem are the factors responsible [1, 3]. In fact, at least from the victims' perspective, the cause of bullying is identified with a particular perpetrator [1]. Others believe that the personality of the victim is the cause of bullying. There is evidence that the simple fact of being significantly different from the rest of the group increases the risk of becoming the victim [1]. Coyne et al. [12] found the victims of bullying to be less extroverted and independent than a control sample of non-victims, as well as more unstable and conscientious. In addition to the perspectives of perpetrators and victims, an organization such as a school or workplace could be responsible for the occurrence of bullying. Zapf [3] classified the causes of bullying into three categories, namely the victim, the perpetrator, and the organization. Zapf [3] also analyzed a wide range of empirical data and found that bullying can be caused by more than one factor simultaneously; therefore, one-sided explanations should be avoided. He also identified that research into the causes of bullying is insufficient, mainly because many reports are based on interviews with victims, while the perspectives of perpetrators and potential bystanders are not considered.

Because of the limitations of current approaches, there is still much to be clarified regarding the underlying mechanism of bullying. Using conventional approaches, there is clearly a limit to how well the dynamic characteristics of the occurrence of bullying can be elucidated. However, it should be noted that agent-based modeling (ABM) is an effective approach for studying the mechanisms behind the dynamic characteristics of social phenomena. Various features of ABM have been described in the literature [13],
such as it being an individual-based modeling approach and its ability to deal with heterogeneity. However, the most essential feature of ABM is that it is a bottom-up modeling method, in the sense that the artificial society modeled on a computer works, in principle, under the same mechanisms as in the real world. The social phenomenon in question emerges in the artificial system as a result of the actions and interactions of agents, as in a real system.

According to one of the authors' previous works on ABM applied to the macroeconomic system [15–18], there exists a specific system structure of the model that is indispensable in reproducing the desired macrophenomenon. In other words, the class of agents and their behavioral rules, including their attribute variables, are responsible for the emergence of the phenomenon. Therefore, the class of agents and their behavioral rules are required to be similar to those of the real system for the model to reproduce the desired macrophenomenon. If this requirement is not fulfilled, the ABM cannot reproduce the phenomenon, even at a qualitative level. We believe the system structure that is indispensable in reproducing a phenomenon can be elucidated by a series of computer experiments in which factors are systematically changed one by one, while other factors remain constant. Moreover, by elucidating the indispensable model structure and considering why that structure is indispensable for reproducing the phenomenon, we can obtain a greater understanding of the causal mechanism behind the macrophenomenon.

Based on the findings in the literature, two features are considered to characterize the macrophenomenon associated with bullying. The first is the emergence of bullies, the bullied, and a third party that makes up the majority, in which bullies and the bullied are in conflict with each other and their relationship involves an imbalance in strength or power that increases with time. The third-party agents include bystanders who are not directly involved in the conflict of bullying. The second is persistent and repeated attacks conducted by a specific person or group, as the perpetrator, against a particular person or group as the victim.

Some researchers have used ABM to study the bullying phenomenon. For instance, Maeda et al. [14] developed an ABM approach that models the tuning and excluding actions of agents and reproduces the emergence of groups of agents. However, few studies have attempted to elucidate the indispensable system structure of the model required to reproduce the phenomenon. Using the ABM approach, this study analyzes the factors within the model structure that are indispensable in reproducing the characteristic features of the bullying phenomenon. This analysis allows for a discussion of the underlying mechanism of bullying as well as effective measures for preventing bullying from occurring. Based on the behavioral rules proposed by Maeda et al. [14], the present model introduces additional factors relating to the model structure as experimental levels to clarify which conditions are indispensable, and which are not, for reproducing the bullying phenomenon.
2 Method of Study

2.1 Model

The artificial system includes n agents. Each agent has a value vector of size M, each element of which is assigned a value of 1 or 0. This vector represents a set of values with M elements, each of which corresponds to traits in the real world covering preferences, skills, and behavioral patterns; the value of 1 or 0 signifies whether or not that trait is selected or owned by the agent. The value of the kth element of the ith agent is represented by ν_{i,k}. The total number of selected values for the ith agent is given by Eq. (1) and is assumed to range between the upper limit m_max and the lower limit m_min:

$m_i = \sum_{k=1}^{M} \nu_{i,k}$  (1)

where ν_{i,k} = 1 when the kth value is selected and ν_{i,k} = 0 when it is not.

The agent who performs an action and the agent who is the object of the action are denoted by the subscripts act and obj, respectively. Shared values are those for which both ν_{act,k} and ν_{obj,k} are 1; non-shared values are those for which one of the two is 0. Each agent communicates through tuning actions, excluding actions, or doing nothing, depending on the action probability given by Eq. (2), whose numerator is the number of shared values given by Eq. (3). It is assumed that each agent has characteristic threshold values for tuning and excluding actions, defined as uniform random numbers in the range [0, 1].

$p(\mathrm{act}, \mathrm{obj}) = c(\mathrm{act}, \mathrm{obj}) / m_{\mathrm{act}}$  (2)

$c(\mathrm{act}, \mathrm{obj}) = \sum_{k=1}^{M} \nu_{\mathrm{act},k} \cdot \nu_{\mathrm{obj},k}$  (3)
In the calculation, a pair consisting of an active agent and an objective agent is selected at random, and the active agent performs one of three actions on the objective agent. This will be either a tuning action, excluding action, or doing nothing. Repeating this process for all of the agents makes up one step of the calculation. During the repeated steps, the pattern of the selected values of each agent may change. As a result, the number of selected values may increase in some agents through the tuning action, which increases the number of shared values with respect to others. Thus, a group of agents emerges in which the members have the same set of values. In contrast, excluding actions will decrease the number of selected values in some agents, leading to solo agents who do not share any value with other agents.
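As an illustration of this step structure, the following is a minimal, self-contained Python sketch of the value vectors and the shared-value action probability (Eqs. (1)–(3)), with one calculation step of random pair selection. The function names are ours for illustration; the parameter values follow Table 1 of this paper (20 agents, 50 values, 10 initially selected), but this is not the authors' code.

```python
import random

N_AGENTS, M, M_INIT = 20, 50, 10   # parameter values from Table 1
rng = random.Random(0)

def init_agent():
    """Value vector with M_INIT randomly selected values (cf. Eq. (1))."""
    v = [0] * M
    for k in rng.sample(range(M), M_INIT):
        v[k] = 1
    return v

def shared(v_act, v_obj):
    """Eq. (3): number of values shared by the two agents."""
    return sum(a * b for a, b in zip(v_act, v_obj))

def action_probability(v_act, v_obj):
    """Eq. (2): shared values divided by the active agent's selected values."""
    return shared(v_act, v_obj) / sum(v_act)

agents = [init_agent() for _ in range(N_AGENTS)]

# One calculation step: each agent, as the active agent, faces one randomly
# chosen objective agent; p is then compared against the agent's tuning and
# excluding thresholds (Eqs. (4)-(5)) to select an action or do nothing.
for i in range(N_AGENTS):
    j = rng.choice([x for x in range(N_AGENTS) if x != i])
    p = action_probability(agents[i], agents[j])
```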
In this model, an agent who excludes others most frequently without often being excluded corresponds to the bully or perpetrator, an agent who is frequently excluded without excluding others corresponds to the bullied or victim, and the other agents, who exclude others and are excluded less frequently, correspond to the third parties or bystanders. Thus, an excluding action corresponds to a negative action in the real world. Moreover, the number of selected values corresponds to the degree of strength or power held by the agent. In a typical experiment, the tuning and excluding actions are defined as follows.
2.1.1 Tuning Action
The active agent conducts the tuning action defined below when the action probability exceeds the tuning-action threshold, as stated by Eq. (4). The tuning-action threshold is a random number in [0, 1] and is fixed for each agent.

$p(\mathrm{act}, \mathrm{obj}) > g_{\mathrm{act}}$  (4)

where g_act is the agent's threshold for the tuning action.
The active agent randomly selects one of the values k characterized by ν_{act,k} = 0 and ν_{obj,k} = 1, and changes its own value to ν_{act,k} = 1. However, when m_act exceeds the upper limit m_max under this procedure, the active agent additionally selects another value p at random from the set of values characterized by ν_{act,p} = 1 and ν_{obj,p} = 0 and changes that value to ν_{act,p} = 0. Thus, the tuning action modifies the active agent's set of selected values to make it closer to that of the objective agent. For comparison, the case in which g_act is not inherent to each agent but is given by a uniform random number in the range [0, 1], recomputed at each step, is also calculated (see EC3 and EC2 in Table 1).
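A minimal Python sketch of this tuning rule, in the same illustrative style as above (helper names are ours, not the authors'); m_max = 15 is the maximum number of selected values from Table 1:

```python
import random

def tune(v_act, v_obj, rng, m_max=15):
    """Tuning action (Sect. 2.1.1): adopt one value the objective agent holds
    but the active agent does not; if the active agent then exceeds m_max,
    drop one value the objective agent does not hold."""
    gain = [k for k in range(len(v_act)) if v_act[k] == 0 and v_obj[k] == 1]
    if not gain:
        return
    v_act[rng.choice(gain)] = 1
    if sum(v_act) > m_max:
        drop = [k for k in range(len(v_act)) if v_act[k] == 1 and v_obj[k] == 0]
        if drop:
            v_act[rng.choice(drop)] = 0

# Illustrative usage with two small hand-built value vectors:
rng = random.Random(0)
a = [1] * 10 + [0] * 40
b = [0] * 5 + [1] * 10 + [0] * 35
tune(a, b, rng)
```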
2.1.2 Excluding Action
The active agent conducts the excluding action defined below when the conditions given by Eq. (5) are fulfilled. The excluding-action threshold is assumed to be an agent-specific value given by a random number in [0, 1] and is fixed for each agent. When the threshold of the excluding action exceeds that of the tuning action, it is redefined as that value, because the excluding action is only conducted when a tuning action is not conducted.

$p(\mathrm{act}, \mathrm{obj}) < e_{\mathrm{act}} \ \text{and} \ m_{\mathrm{act}} > m_{\mathrm{obj}}$  (5)

where e_act is the agent's threshold for the excluding action.
When the conditions specified by Eq. (5) are fulfilled, with m_obj exceeding the lower limit, the active agent selects one of the values k at random from the set of values characterized by ν_{act,k} = 1 and ν_{obj,k} = 1, and changes the objective agent's value to ν_{obj,k} = 0. Thus, the excluding action modifies the set of selected values in the objective agent to make it more different from that of the active agent (see EC6 in Table 1).

For comparison, the case in which the excluding action is only conducted when m_act > m_obj (Eq. (6)), without depending on the threshold value, is also calculated (see EC4 and EC3 in Table 1). An additional scenario is the case in which the excluding action is only conducted when the condition given by Eq. (7) is fulfilled, without depending on the condition given by Eq. (6) (see EC5 in Table 1). Additionally, the case in which the excluding action is only conducted when the number of shared values has decreased from the previous step by more than a threshold value is considered; in this study, the threshold is assumed to be 1, as stated in Eq. (8) (see EC2 in Table 1). Equation (8) is the assumption made by Maeda et al. [14].

$m_{\mathrm{act}} > m_{\mathrm{obj}}$  (6)

$p(\mathrm{act}, \mathrm{obj}) < e_{\mathrm{act}}$  (7)

$c'(\mathrm{act}, \mathrm{obj}) - c(\mathrm{act}, \mathrm{obj}) > 1$  (8)

where c(act, obj) is the number of shared values in the current step and c′(act, obj) is that in the previous step.

Table 1 Calculation conditions

Condition | Description | Tuning rule | Exclusion rule | Reaction against exclusion
EC1 | Model with tuning only | p(act,obj) > δ | – | –
EC2 | Model presented by Maeda | p(act,obj) > δ | c′(act,obj) − c(act,obj) > 1 | –
EC3 | Model with revised rule of exclusion | p(act,obj) > δ | m(act) > m(obj) | –
EC4 | Model with agent's threshold of tuning | p(act,obj) > g_act | m(act) > m(obj) | –
EC5 | Model with agent's thresholds of tuning and exclusion, where the exclusion condition is changed | p(act,obj) > g_act | p(act,obj) < e_act | –
EC6 | Base model, with agent's thresholds of tuning and exclusion | p(act,obj) > g_act | p(act,obj) < e_act, m(act) > m(obj) | –
EC7 | Model with the structure of the base model and the agent's reaction against exclusion | p(act,obj) > g_act | p(act,obj) < e_act, m(act) > m(obj) | Tuning, exclusion, or neutral, depending on the agent

Experimental parameters (all conditions): number of agents 20; number of values 50; initial number of selected values 10; max. number of selected values 15; min. number of selected values 5; max. number of steps 10,000; number of runs 10.

Note: g_act is the agent's threshold value for the tuning action, defined by a [0, 1] random number; e_act is the agent's threshold value for the exclusion action, defined by a [0, 1] random number; δ is a uniform [0, 1] random number.
2.1.3 Reaction Against Excluding Action
For comparison with the base model, the effect of a reaction against an excluding action is analyzed by assigning a characteristic random number in [0, 1] to each agent. As a retaliation action, the agent selects one of three choices toward the objective agent, namely an excluding action, a tuning action, or doing nothing, when the assigned random number is less than 0.34, greater than 0.67, or between these two values, respectively. The retaliation action is conducted in addition to the abovementioned shared-value-dependent tuning or excluding actions (see EC7 in Table 1).
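This reaction rule maps directly onto a small dispatch function; the sketch below is illustrative, with the characteristic trait drawn once per agent and fixed thereafter, as described above:

```python
import random

def retaliation(trait):
    """EC7 reaction rule: an agent's fixed trait in [0, 1] determines its
    response to being excluded: exclusion (< 0.34), tuning (> 0.67),
    or doing nothing (otherwise)."""
    if trait < 0.34:
        return "exclude"
    if trait > 0.67:
        return "tune"
    return "nothing"

# Each agent draws its trait once at initialization:
agent_trait = random.Random(1).random()
print(retaliation(agent_trait))
```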
2.2 Experimental Conditions

The behavioral rules and parameter values are presented in Table 1. The base model is EC6, which includes agent-specific tuning and excluding actions. The models denoted EC1–EC5 and EC7 represent modified versions for comparison with the base model in which the behavioral rules have been changed. The aim of the
comparison is to elucidate the effect of the model structure on the emergence of the bullying phenomenon and to understand the conditions required to reproduce this phenomenon. For each of the experimental levels, 10 runs of calculation are conducted using the same seed of random numbers, and the typical results among them are employed as experimental results. Moreover, a set of calculations using a different random number seed is also conducted to confirm the consistency of the result for each of the experimental levels.
3 Simulation Results

The simulation results for each of the seven experimental conditions are described in this section. In Figs. 2, 3, 4, 5, 6, 7, 8, and 9, we use the notation "solo," "Mxx," "Mxx_1," and "Mxx_2," where solo refers to an agent who is not a member of any group and Mxx refers to an agent who is a member of a group with xx members. The notation Mxx_1 and Mxx_2 is used when more than one group has the same number of members.
3.1 Results Without Agent-Specific Rules

3.1.1 Model with Tuning Only (EC1)
When the behavioral rules only include the tuning action (i.e., no excluding actions), the set of values becomes the same for all agents in the equilibrium state, as shown in Fig. 1, although the initial set is randomly assigned to each agent. Thus, all agents come to belong to the same group, and no conflict emerges between bullies and victims, which agrees with the findings reported by Maeda et al. [14].
3.1.2 Model with Tuning and Excluding Actions, Where the Exclusion Rule Presented in the Literature Is Employed (EC2)
In this case, the excluding action is only conducted when the number of shared values between the active and objective agents is less than the value in the previous step by at least the constant threshold, as stated in Eq. (8). This rule of exclusion is the same as that employed by Maeda et al. [14]. In this case, two types of agents emerge as a result of the interaction among agents: solo agents, whose set of selected values does not coincide with that of any other agent, and agents in a group, where the set of selected values coincides within the group. However, when looking at the relationship between the number of excluding actions performed by an agent and the number of times the same agent is excluded, it appears that agents who exclude other agents more often are more
Fig. 1 Example of the set of values (i.e., value vector) in the initial and equilibrium states obtained in the model with tuning only

Fig. 2 Example of the relationship between the number of excluding actions and the number of times an agent is excluded by others in model EC2
likely to be excluded by other agents. Figure 2 shows an example of this behavior, indicating that victims and perpetrators do not separately emerge. Thus, it is evident that bullies and the bullied who conflict with each other do not emerge under the conditions of this model.
Fig. 3 Example of the relationship between the number of excluding actions and the number of times an agent is excluded in model EC3
Fig. 4 Example of the relationship between the number of exclusions and the number of times an agent is excluded in model EC4
Fig. 5 Example of the relationship between the number of excluding actions and the number of times an agent is excluded in model EC5
3.1.3 Model with Tuning and Excluding Actions, Where the Exclusion Criterion Defined by Eq. (6) Is Employed (EC3)
When the exclusion rule is changed from that assumed in Eq. (8) to that assumed in Eq. (6), a negative correlation emerges between the number of exclusions performed by an agent and the number of times that agent is itself excluded, as shown in
Fig. 6 Agent’s thresholds for tuning and excluding behavior employed in model EC5
Fig. 7 Example of the relationship between the number of excluding actions and the number of times the agent is excluded in model EC6

Fig. 8 Number of excluding actions applied to agent 16 in relation to the total number of excluding actions
Fig. 3. This result indicates the separate emergence of agents who are more likely to exclude others than to be excluded, and agents who are more often excluded by others. The former are typical candidates for the perpetrator, whereas the latter are candidates for the victim. Thus, bullies and the bullied emerge under the conditions of this model.
Fig. 9 Changes in the number of selected values of the typical perpetrator, bystander, and victim (i.e., Agents 5, 18, and 16 in Fig. 7, respectively)
However, it should be noted that we cannot observe any agents who rarely exclude others and are rarely excluded by others, indicating that third-party agents who are not directly involved in the conflict between the perpetrator and the victim do not emerge with this model.
3.2 Results with Agent-Specific Rules

3.2.1 Model with Agent-Specific Tuning Threshold, Where the Exclusion Criterion Defined by Eq. (6) Is Employed (EC4)
When the threshold value for the tuning action is defined as being specific to each agent and the exclusion criterion is defined by Eq. (6), a negative relationship emerges between the number of excluding actions and the number of times the same agent is excluded by others, as shown in Fig. 4, even though the criterion for the excluding action is not defined as being agent-specific. Moreover, agents in the same group exhibit a similar number of exclusions as other agents, as seen in Fig. 4, indicating that they behave similarly. However, as is evident from Fig. 4, third-party behavior does not emerge with this model.
3.2.2 Model with Agent-Specific Thresholds of Tuning and Exclusion, Where the Exclusion Criterion Defined by Eq. (7) Is Employed (EC5)
This experimental level is the case in which we employ Eq. (7) instead of Eq. (6) as the exclusion criterion. Note that the relationship between the number of excluding actions and the number of times the same agent is excluded exhibits a positive correlation for most of the agents (i.e., except for agents 17 and 13), as shown in Fig. 5. This indicates that agents who are in conflict with each other concerning excluding behavior do not emerge with this model. However, it is noticeable that agent 17 and agent 13 in Fig. 5 are victims, as they are more often excluded by others but do not themselves exclude others. These victim agents are characterized by a large tuning threshold as well as a small
excluding threshold, as shown in Fig. 6. This indicates that agents who are less likely to tune with others and are less likely to exclude others are more likely to become the bullied. Thus, although some agents become victims, conflicting relationships as a whole resulting from excluding behavior do not emerge with this model, indicating that the excluding criterion defined by Eq. (6) is responsible for the model reproducing the set of agents who conflict with each other concerning the excluding behavior.
3.2.3 Model with Agent-Specific Thresholds of Tuning and Exclusion, Where Eq. (5) Is Employed as the Excluding Criterion (EC6, the Base Model in the Present Study)
This experimental level is the case in which we employ Eq. (5) as the exclusion criterion, which includes both Eqs. (6) and (7). An example of the relationship between the number of exclusions and the number of times an agent is excluded is shown in Fig. 7. Note that the agents in Fig. 7 are categorized into three types. The first type consists of agents who are very often excluded but rarely exclude others: these are the victims and the candidates for victims. The second type includes agents who are likely to exclude others while rarely being excluded by others: these are the perpetrators and the surrounding agents. The remaining group of agents, for which the number of exclusions and the cases of being excluded are both relatively low, corresponds to the third party. This group includes bystanders who are rarely involved in the conflict of bullying, as they rarely exclude others and are rarely excluded by others.

The typical victim in Fig. 7 is the agent who is most often subjected to excluding actions. In this case, this is agent 16, who was excluded 38 times (see the vertical axis in Fig. 7). Figure 8 shows the perpetrators of these excluding actions and the number of times they applied this action to agent 16. Note that the agent who excluded others the most often, agent 5 in this case, applied the most excluding actions to agent 16. This indicates that agent 5 is the main perpetrator toward the victim agent. Moreover, as can be seen in Fig. 7, the third-party agents that emerge are categorized into three types, namely agents who rarely exclude others as well as being rarely excluded, agents who perform a similar number of exclusion actions as the perpetrator, and agents who are excluded a similar number of times as the victim. These are the bystanders, the reinforcers of bullies, and the defenders of the victim, respectively, and coincide with groups identified in the literature [7].

Note that the agents who are rarely excluded by others belong to the large group with nine members, and the agents who are often excluded by others are solo agents, as shown in Fig. 7. An essential factor that differentiates these two types of agents is the number of selected values. As shown in Fig. 9, the number of selected values of the typical perpetrator (agent 5) and the typical bystander (agent 18 in Fig. 7) increases with time steps and tends toward the maximum number, 15 in this case, while that of the typical victim (agent 16 in Fig. 7) decreases with time steps and tends toward the minimum, 5 in this case. These changes in the number of selected values are
due to the agents' interaction through tuning actions in the former case and through being subjected to exclusions in the latter. Moreover, this tendency is observed in general, namely the number of selected values of an agent who belongs to a larger group becomes larger with time steps than that of an agent who belongs to a smaller group. This result indicates that the emergence of the imbalance in strength existing between bullies and the bullied, which increases with time, is successfully reproduced with this model.

Let us now look at the features of these three categories of agents from the viewpoint of the threshold values of tuning and exclusion. The effects of the threshold values for tuning and excluding actions on the number of times agents are excluded and on the number of excluding actions are shown in Figs. 10 and 11, respectively. As is clear in Fig. 10, the number of times an agent is excluded by other agents is mainly dependent on the agent's tuning threshold. Namely, the agents whose tuning threshold values are very small correspond to the agents who are more likely to tune with others and tend to become members of a larger group, resulting in a tendency to be less likely to be excluded by others. Moreover, the agents whose tuning threshold values are very large correspond to the agents who are less likely to tune with others and tend to become solo agents, resulting in a tendency to be more likely to be excluded. In contrast, the number of excluding actions applied to others is mainly dependent

Fig. 10 Effect of the agent's tuning threshold on the number of times they are excluded by other agents
Fig. 11 Effect of the agent’s excluding threshold on the number of times they exclude other agents
Fig. 12 Effect of the solo agents’ threshold of exclusion on the number of times they are excluded by other agents
on the agent’s threshold of exclusion, as shown in Fig. 11. As is evident in Fig. 11, the agents who are members of the larger group, and therefore less likely to be excluded by others, can be classified into two types, namely the bystanders, who are characterized by very small exclusion threshold values, and the perpetrator or surrounding agents (i.e., reinforcers), who are characterized by larger exclusion threshold values. Thus, it is concluded that the agents who are likely to tune with others tend to become members of a large group and are rarely excluded; among these agents, those who are likely to exclude others are more likely to become the perpetrator; otherwise, they may become the third-party agents including the bystanders. For solo agents, the effects of the threshold values for excluding actions on the number of times agents are excluded are represented by the negative correlation shown in Fig. 12. Note that, among the solo agents, those who are less likely to exclude others are more likely to become the bullied or surrounding agents (i.e., defenders of the victim). Thus, it is concluded that the agents who are less likely to tune with others tend to become solo agents, among whom those who are less likely to exclude others are more likely to become victims, with others becoming the third-party agents. Thus, we can conclude that the first feature of the bullying phenomenon, namely the emergence of bullies, the bullied, and the third-party agents, is successfully reproduced with this model. Moreover, the emergence of the imbalance in strength existing between bullies and the bullied which increases with time is successfully reproduced with this model and the features of these types of agents can be understood reasonably well from the agents’ tendency for tuning and excluding behavior. However, it should also be noted that this model does not reproduce another feature of the bullying phenomenon, namely the tendency for the perpetrator, as a specific agent or group, to attack the victim, as a particular agent, persistently and repeatedly. In the real world, this tendency is thought to be caused by the personality or characteristic features of the perpetrators and the victims, which are modeled by the set of values in the present model. However, as seen in Figs. 10 and 11, the number of being excluded or the number of excluding actions
are continuously distributed as functions of the agent's tuning and exclusion thresholds, respectively. No specific agents are observed in the present model who exclude others, or are themselves excluded, far more often than the other agents. This result indicates that the conflicts between bullies and the bullied as particular agents are not well determined by the shared-value-dependent excluding actions in this model, despite the presence of agent-specific threshold values for tuning and exclusion.
3.2.4
Model with Agent’s Retaliation Against Exclusion as well as Tuning and Excluding Actions (EC7)
The effect of including some reaction by the excluded agent against the excluding action is now analyzed. Figures 13 and 14 show the number of times each agent was excluded as a function of the number of excluding actions and as a function of the agent's tuning threshold, respectively. As evident in Figs. 13 and 14, the influence of the reactive actions is negligible, indicating that persistent and intensive attacks by the perpetrator toward the victim cannot be explained by simple rules using the tuning and exclusion thresholds
Fig. 13 Effect of reactive actions seen in the relationship between the numbers of excluding actions and cases of being excluded in model EC7. Reaction types are denoted as E for exclusion, N for neutral (doing nothing), and T for tuning
Fig. 14 Effect of the agent’s tuning threshold and the reactive actions on the number of cases of being excluded in model EC7. Reaction types are denoted as E for exclusion, N for neutral (doing nothing), and T for tuning
assumed in the present model. Other factors should be considered to model the agents’ personalities, and this will be clarified in a future study.
3.3 Summary of the Simulated Results

3.3.1
Findings Regarding the System Structure Required to Reproduce the Bullying Phenomenon
The existence of the third-party agents, as well as the perpetrator and the victim, is reproduced under the assumption that the likelihood of both the tuning and excluding actions is agent-specific, and exclusion occurs when the number of values held by the objective agent is lower than that of the active agent, as explained in the results for EC6. Without these conditions, the third-party agents as well as the perpetrator and the victim do not emerge in the artificial society. As for other experimental conditions, the findings are as follows. In the case of experimental condition EC1, all agents come to belong to the same group, and neither the perpetrator nor the victim emerges. In EC2, separate groups emerge, but the agents who exclude others are often excluded, and therefore the victim and perpetrator do not emerge as different, conflicting agents. In EC3 and EC4, the victim and perpetrator emerge as conflicting agents, but the third party does not emerge. In EC5, the victim agents emerge as solo agents or members of a small group, but, for most agents, those who exclude others are often excluded, and therefore agents with conflicting relationships do not emerge. The experimental results are summarized in Table 2. Another feature of bullying, namely that persistent and repeated attacks are conducted by a specific person or group toward another particular person or group, could not be reproduced within the framework of the present model, even when some form of retaliation or reaction was incorporated into the model.
3.3.2
Findings Related to the Mechanism of Bullying
First, the fact that the existence of the third party, as well as the perpetrator and the victim and the increasing conflict between them, is reproduced in model EC6 indicates that differences in the individual-specific tuning and excluding characteristics, as well as the interaction among agents, are indispensable in determining the cause of bullying. According to the literature [7], the third party consists of assistants or reinforcers of bullies, bystanders, and defenders of the victim. This is well reproduced with model EC6, as shown in Fig. 7. Moreover, the fact that the assumption that an agent excludes others only when the number of its values is greater than that of the objective agent is indispensable for reproducing the bullying phenomenon suggests that bullies attack their victims when they recognize that the status or power of the victim is lower than their own.
Table 2 Summary of the experimental results

EC1. Tuning rule: p(act,obj) > δ; exclusion rule: none. Result: All agents come to belong to the same group, and neither the perpetrator nor the victim emerges.
EC2. Tuning rule: p(act,obj) > δ; exclusion rule: c(act,obj)_{t-1} − c(act,obj)_t > l. Result: Separate groups emerge, but the victim and perpetrator do not emerge as different, conflicting agents.
EC3. Tuning rule: p(act,obj) > δ; exclusion rule: m(act) > m(obj). Result: The victim and perpetrator emerge as conflicting agents, but the third party does not emerge.
EC4. Tuning rule: p(act,obj) > g_act; exclusion rule: m(act) > m(obj). Result: The victim and perpetrator emerge as conflicting agents, but the third party does not emerge.
EC5. Tuning rule: p(act,obj) > g_act; exclusion rule: p(act,obj) < e_act. Result: The victim agents emerge, but for most of the agents a conflicting victim-perpetrator relationship does not emerge.
EC6. Tuning rule: p(act,obj) > g_act; exclusion rule: p(act,obj) < e_act and m(act) > m(obj). Result: The victim, the perpetrator, and the third party consisting of three categories emerge.
EC7. Same condition as EC6 plus a reaction rule against exclusion. Result: Same as EC6; the effect of the reaction rule assumed in the present study is negligible.
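Read as pseudocode, the EC6 rules in Table 2 can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation; the value pool size, the threshold ranges, and the update mechanics are assumptions, and p(act,obj) is taken to be the fraction of the active agent's values shared with the objective agent.

```python
# Minimal sketch of the EC6 behavioral rules; all numeric choices are
# illustrative assumptions, not the paper's calibrated values.
import random

N_AGENTS = 30
N_VALUES = 50  # size of the global pool of possible values (assumed)

class Agent:
    def __init__(self, aid, rng):
        self.aid = aid
        self.values = set(rng.sample(range(N_VALUES), rng.randint(5, 20)))
        self.g = rng.uniform(0.1, 0.9)  # agent-specific tuning threshold g_act
        self.e = rng.uniform(0.1, 0.9)  # agent-specific excluding threshold e_act
        self.times_excluded = 0
        self.times_excluding = 0

def p_shared(act, obj):
    """p(act,obj): fraction of the active agent's values shared with obj."""
    return len(act.values & obj.values) / len(act.values) if act.values else 0.0

def step(agents, rng):
    act, obj = rng.sample(agents, 2)
    p = p_shared(act, obj)
    if p > act.g:  # tuning: adopt one of the objective agent's values
        candidates = sorted(obj.values - act.values)
        if candidates:
            act.values.add(rng.choice(candidates))
    elif p < act.e and len(act.values) > len(obj.values):  # EC6 exclusion rule
        act.times_excluding += 1
        obj.times_excluded += 1

rng = random.Random(42)
agents = [Agent(i, rng) for i in range(N_AGENTS)]
for _ in range(20000):
    step(agents, rng)
```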
Second, the fact that the features of these types of agents are reasonably explained by the agents' tendencies for tuning and excluding behavior indicates that the tendency for tuning behavior determines the likelihood of being excluded, and the tendency for excluding behavior determines the likelihood of excluding others. Based on these results, the basic mechanism of bullying is considered to be as follows. The people in an organization have, as essential characteristics, their tendencies to tune with others and to exclude others. Agents who are more likely to tune with others tend to become members of a larger group, enhancing their social power, whereas agents who are less likely to tune with others tend to become solo agents, weakening their social power. These tendencies, as well as the effect of the interaction among agents, determine the type of each agent in the bullying phenomenon. Namely, the agent who is more likely to tune with others and more likely to exclude others tends to become the perpetrator. Conversely, the agent who is less likely to tune with others and less likely to exclude others tends to become the victim. The others are the third-party agents, consisting of bystanders, reinforcers of the bullies, and defenders of the bullied. These results coincide with those described in the literature [7], in which the importance of the role of bystanders, as well as the interaction within the peer group, is identified.
However, some other factors must exist in the mechanism of bullying as a whole, because the present model cannot reproduce another feature of bullying, namely the persistent and repeated attacks by specific agents, i.e., the perpetrator, toward particular agents, i.e., the victim. Let us discuss further details in the next section.
4 Discussion

The fact that persistent attacks by the perpetrator toward the victim, as particular agents, could not be reproduced by the present model indicates that some other factors are responsible for the existence of bullying. What could those factors be? Some interesting hints can be found in the literature in the ideas concerning group involvement in bullying, although they have not been examined empirically. One example of such views is presented in the review article by Salmivalli [7] and can be summarized as follows. In social groups where bullying takes place, initiative-taking "ringleader" bullies can be identified [11]. The bullying behavior is motivated by the bullies' pursuit of high status, which is the individual's relative standing in the peer hierarchy. The bullies choose victims who are submissive, insecure about themselves, physically weak, and in a low-power, rejected position in the group, because they can repeatedly demonstrate their power to the rest of the group and renew their high-status position without the fear of being confronted. The bullies' peer status is enhanced by the bystanders' positive feedback or reinforcement through verbal or nonverbal cues (e.g., smiling, laughing), whereas challenging the bully's power by taking sides with the victim provides negative feedback for them. The bystanders' reaction affects the victims' adjustment as well. Victims who have one or more classmates defending them when victimized are less anxious, less depressed, and have higher self-esteem than victims without defenders [7]. Based on the above arguments, the additional factors to be implemented in the present model to reproduce the other features of bullying should be related to the motivation for the excluding actions as well as the bystanders' reaction, which affects the bullies' motivation. Such factors can be implemented in the present model by assuming additional rules regarding the agents' behavior, and this will be explored in future work. Moreover, if such a model successfully reproduces all aspects of the features of bullying, then we can conclude that the same process could occur in the real world through the same mechanism as in the modeled society. From the results of the present study as well as the above discussions, the following two countermeasures are considered effective. One is intentional tuning behavior with the victim, which could help him/her to become a member of a group, and therefore less likely to be attacked by the bullies. This tuning behavior should be intentionally conducted by people who are likely to tune with others or by people in authority within the organization, such as a teacher, because they may have the power to confront the bullies. Another countermeasure is for the bystanders not to
reinforce the bullies, so that the bullies do not receive positive feedback for their attacks toward the victim. Note that people in authority who have power, such as a teacher, should not ignore bullying when it is occurring in the organization, because an attitude of not siding with the victim could in effect give positive feedback to the bullies.
5 Conclusion

The macrophenomenon associated with bullying is characterized by the emergence of bullies, the bullied, and third-party agents including bystanders, who are the majority. Another characteristic is the persistent, offensive behavior of the perpetrator against the victim as particular agents. To elucidate the mechanism of bullying by agent-based modeling, this paper analyzed the structural aspects of an ABM that are considered indispensable for reproducing the phenomenon, by systematically changing the behavioral rules in the model. As a result, the following findings were obtained. The emergence of the third party, as well as the victim and the perpetrator, is only reproduced under the assumption that each agent has a characteristic tendency for tuning and excluding behavior, which is modeled according to shared values with others, and that exclusion occurs when the number of values held by the objective agent is lower than that of the active agent. This model successfully reproduced the emergence of the third party, as well as the victim and perpetrator and the increasing conflict between them, associated with an imbalance in strength that increases with time, during the process of the agents' actions and interactions. Moreover, the personality conditions for becoming the perpetrator, the victim, and the third-party agents are well explained by the agent's tendency for tuning and excluding behavior. It is concluded that people who are likely to tune with others tend to become members of a large group and are rarely excluded; among them, those who are likely to exclude others are more likely to become the perpetrator, while the others may become third-party agents. In contrast, people who are less likely to tune with others tend to become solo agents; among them, those who are less likely to exclude others are more likely to become the bullied, while the others may become third-party agents. Based on these results, the mechanisms of and countermeasures against bullying were discussed. Despite the success in reproducing the emergence of the third party as well as the victim and the perpetrator, this model could not reproduce the emergence of intensively repeated attacks by specific perpetrators against specific victims. Some motivation-related factors might be required to reproduce this tendency, which remains a subject for future study. Based on these findings, the following two countermeasures are considered effective. One is intentional tuning behavior with the victim, which could help him/her to become a member of a group and therefore less likely to be excluded.
This intentional tuning should be conducted by people who are likely to tune with others or by people in authority within the organization, such as a teacher, because they may have the power to confront the bullies. A second countermeasure is for the bystanders not to reinforce the bullies, so that the bullies do not receive positive feedback for their attacks toward the victim. People in authority who have power, such as a teacher, should not ignore bullying when it is occurring in the organization, because an attitude of not siding with the victim could in effect give positive feedback to the bullies.
References

1. Einarsen, S., Hoel, H., Zapf, D., & Cooper, C. (2003). Bullying and emotional abuse in the workplace. London and New York: Taylor & Francis.
2. Einarsen, S. (2000). Harassment and bullying at work: A review of the Scandinavian approach. Aggression and Violent Behavior, 5(4), 379–401.
3. Zapf, D. (1999). Organizational, work group related and personal causes of mobbing/bullying at work. International Journal of Manpower, 20(1/2), 70–85.
4. Adams, A. (1992). Bullying at work: How to confront and overcome it. London: Virago.
5. Adams, A. (1997). Bullying at work. Journal of Community & Applied Social Psychology, 7, 177–180.
6. Olweus, D. (1994). Bullying at school: Long-term outcomes for the victims and an effective school-based intervention program. In L. R. Huesmann (Ed.), Aggressive behavior: Current perspectives (pp. 97–130). New York: Plenum Press.
7. Salmivalli, C. (2010). Bullying and the peer group: A review. Aggression and Violent Behavior, 15, 112–120.
8. Matthiesen, S. B., & Einarsen, S. (2001). MMPI-2 configurations among victims of bullying at work. European Journal of Work and Organizational Psychology, 10(4), 467–484.
9. Matthiesen, S. B., & Einarsen, S. (2004). Psychiatric distress and symptoms of PTSD among victims of bullying at work. British Journal of Guidance & Counselling, 32(3), 335–356.
10. Ferris, G., Zinko, R., Brouer, R. L., Buckley, M. R., & Harvey, M. G. (2007). Strategic bullying as a supplementary, balanced perspective on destructive leadership. The Leadership Quarterly, 18, 195–206.
11. Craig, W., & Harel, Y. (2004). Bullying, physical fighting, and victimization. In Young people's health in context (International report from the HBSC 2001/02 survey).
12. Coyne, I., Seigne, E., & Randall, P. (2000). Predicting workplace victim status from personality. European Journal of Work and Organizational Psychology, 9(3), 335–349.
13. Wilensky, U., & Rand, W. (2015). An introduction to agent-based modeling. Cambridge: MIT Press.
14. Maeda, Y., & Imai, H. (2005). An agent based model on the bully of mobbed classmates. IEICE Transactions on Information and Systems, J88-A(6), 722–729.
15. Ogibayashi, S., & Takashima, K. (2014). Influence of the corporation tax rate on GDP in an agent-based artificial economic system. In Advances in computational social science (Vol. 11, pp. 157–173). Tokyo: Springer.
16. Takashima, K., & Ogibayashi, S. (2014). Model structure of agent-based artificial economic system responsible for reproducing fundamental economic behavior of goods market. In The 5th World Congress on Social Simulation, Sao Paulo, Brazil.
17. Ogibayashi, S., & Takashima, K. (2019). System structure of agent-based model responsible for reproducing business cycles and the effect of tax reduction on GDP. https://ssrn.com/abstract=3350172
18. Ogibayashi, S., & Takashima, K. (2017). Influential factors responsible for the effect of tax reduction on GDP. Evolutionary and Institutional Economics Review, 14, 431–449.
Predictors of Rooftop Solar Adoption in Rural Virginia Aparna Gupta, Zhihao Hu, Achla Marathe, Samarth Swarup, and Anil Vullikanti
Abstract This paper considers a variety of social, demographic, spatial, and structural features of households to determine the factors that are influential in predicting the adoption of rooftop solar panels in rural regions of Virginia. A novel feature of this work is that it synthesizes anonymized data from different sources to make it usable under one common architecture and to provide a geospatial context. Three different sets of models are used, which add features in a modular manner. Results show that the demographic and neighborhood-level features influence the likelihood of adoption, but the social network-based features do not. Keywords Rooftop solar · Synthetic population · Rural electric cooperatives · Peer effects · Predictors
A. Gupta Department of Computer Science, Virginia Tech, Blacksburg, VA, USA e-mail: [email protected] Z. Hu Department of Statistics, Virginia Tech, Blacksburg, VA, USA e-mail: [email protected] A. Marathe () Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA Network Systems Science and Advanced Computing Division, Biocomplexity Institute & Initiative, University of Virginia, Charlottesville, VA, USA e-mail: [email protected] S. Swarup Network Systems Science and Advanced Computing Division, Biocomplexity Institute & Initiative, University of Virginia, Charlottesville, VA, USA e-mail: [email protected] A. Vullikanti Department of Computer Science, University of Virginia, Charlottesville, VA, USA Network Systems Science and Advanced Computing Division, Biocomplexity Institute & Initiative, University of Virginia, Charlottesville, VA, USA e-mail: [email protected] © Springer Nature Switzerland AG 2020 T. Carmichael, Z. Yang (eds.), Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas, Springer Proceedings in Complexity, https://doi.org/10.1007/978-3-030-35902-7_16
1 Introduction

Adding an extra mile of grid lines can cost up to US $350,000 [10]. Solar energy, on the other hand, is a cost-effective and clean way of delivering energy to residents in developing countries and in remote rural regions where adequate electrical infrastructure may be lacking. Developing countries have been capitalizing on solar power not only as a source of energy, but also as a means to improve sustainability. According to Bloomberg New Energy Finance, solar energy accounted for 19% of all new generating capacity in developing economies. China installed 40 GW, i.e., 40%, of the world's new solar in 2016, while Brazil, Chile, Jordan, Mexico, and Pakistan all at least doubled their solar capacity in 2016 [5].

Although solar research and marketing efforts have been focused mainly on urban regions, rural areas offer higher potential for solar photovoltaic (PV) growth. Solar requires open space and access to sunlight, which are more abundant in rural and remote regions. Low population density and the sparse locations of houses in remote rural areas make it economically infeasible to build an efficient electrical distribution network; hence, a distributed renewable energy source is an ideal solution. Additionally, in rural regions, the increased use of high-tech equipment, precision agriculture, and the mechanization of many farming operations require increasing amounts of energy, which farmers must obtain at affordable prices to keep agricultural production costs under control. As solar is becoming popular worldwide, the price of PV is going down due to economies of scale, cost reductions in PV manufacturing, and increased competition among retailers. There has been a 20-fold increase in solar capacity since 2010 among rural cooperatives in the USA [6].

This research aims to understand factors that contribute to the adoption of solar PV by rural households, especially in the US state of Virginia (VA). Our model creates a spatially explicit and highly resolved synthetic representation of individuals and their households' attributes. Each synthetic household is statistically similar to a real household but is not identical to any particular household in the population. We use a diverse set of datasets to build a comprehensive list of features that are likely to influence a household's decision to adopt solar PVs. This includes information about each household's demographics; its structural characteristics, such as the floor area, residence type, acreage, etc.; spatial attributes, such as its geolocation and the number of PV adopters in the neighborhood within a certain distance; and social network features, such as the number of links with coworkers who are adopters. A separate model is built to impute hourly load profiles of each household based on householders' energy usage activities and their durations [8, 9]. We obtain PV adopter data from a rural electric cooperative in Virginia and use it to build the prediction model. This model is validated by measuring its performance on in-sample and out-of-sample data of the rural electric cooperative in Virginia.
2 Datasets

Solar Adoption Data Through the National Rural Electric Cooperative Association, we obtained data for the year 2017 on 23,000 households located across 51 zip codes in the Shenandoah Valley Electric Cooperative (SVEC). Of these, 221 households have adopted rooftop solar panels in the last 10 years. This dataset contains the following information: (1) service location of the household; (2) zip code of the household; (3) solar panel installation date; (4) solar KW capacity of the system installed; (5) county/city of the household.

Synthetic Population of Virginia We have built an activity-based population model for the state of Virginia by integrating over several public and commercial datasets such as the American Community Survey, Open Street Maps, the American Time Use Survey, the National Household Transportation Survey, and transportation networks from HERE. The population synthesis process preserves the confidentiality of the individuals in the original datasets, yet produces realistic attributes, demographics, and activity patterns for synthetic individuals. The process constructs a representation of the set P of individuals, the activities done by individuals and households, the locations where these are done, and the movement patterns of people during the day; this results in a collocation-based, dynamic, temporal contact network G = (P, E). This social contact network provides detailed information on each person's contacts and their locations throughout the day. See [1–4] for details of this process. In addition, energy usage activities are assigned to individuals in the household to estimate hourly energy load profiles of the household. For more details on the load generation methodology, see [7–9].

To identify rural households, we use the US Census Bureau's urban and rural classification mechanism. According to the US Census: (1) Urban Areas (UAs) have 50,000 or more people; (2) Urban Clusters (UCs) have at least 2500 and less than 50,000 people; and (3) rural areas encompass all the population, housing, and territory not included within an urban area or urban cluster.

Residential Energy Consumption Survey (RECS) RECS data provides energy usage estimates and energy-related characteristics for a nationally representative sample of 5686 households in the USA. This data is collected by the Energy Information Administration (EIA) of the Department of Energy (DOE) every few years.¹ This study uses a number of structural characteristics from the RECS 2015 data that are pertinent to estimating a household's energy usage. These include floor area, fuel equipment, number of bathrooms, residence type, etc. This data is used to assign structural characteristics to the synthetic households.

Peer Effects We generate several features that can be used to measure peer influence in the PV adoption decision. These include spatial features, such as the number of adopters within a 1 mile, 2 mile, 3 mile, and 4 mile radius of the house, and social
1 https://www.eia.gov/consumption/residential/.
network-based features, such as the number of connections with people at locations outside the home (such as work, school, or a shopping center) who are adopters.
3 Methodology

In order to build a model of adoption, we construct a dataset of all households in the SVEC region that is endowed with a complete feature set. The data obtained from NRECA provides information on the date of adoption and the street addresses of the households. We begin by mapping geolocations of adopters to the synthetic households in order to determine which households are the adopters and when they adopted. This data is then combined with the synthetic data on demographics and the RECS data, which provides the structural features of the house. The date of adoption is used to build the neighborhood-based peer effects, since it identifies when the neighbors adopted. It is also used to calculate the social network-based peer effects. More information on how this larger dataset is constructed is given below.
3.1 Mapping Solar Adoption Data to Synthetic Households in VA

To map the SVEC data onto the VA synthetic population, we first created a dataset from the VA synthetic population comprising the 51 zip codes present in the SVEC data. This resulted in over 138,000 synthetic households. Next we mapped the locations of the 221 SVEC households to the geolocations of matching synthetic households as follows. For each household in the SVEC dataset, we calculated the distance between each synthetic household and the SVEC household using the great-circle distance and selected the nearest 5 synthetic households. The great-circle, or orthodromic, distance is the shortest distance between two points on the earth's surface. Once the 5 nearest households were selected, one of those 5 synthetic household locations was randomly selected and mapped to the SVEC household. This approach was used to map each household in the SVEC data to a synthetic household.
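A minimal sketch of this mapping step, assuming households are represented as dictionaries with lat/lon fields; the haversine form of the great-circle distance is one standard choice, and this is an illustration rather than the study's actual code:

```python
import math
import random

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    r = 3958.8  # mean Earth radius in miles
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def map_to_synthetic(svec_home, synthetic_homes, rng=random):
    """Pick one of the 5 nearest synthetic households uniformly at random."""
    nearest5 = sorted(
        synthetic_homes,
        key=lambda h: haversine_miles(svec_home["lat"], svec_home["lon"],
                                      h["lat"], h["lon"]),
    )[:5]
    return rng.choice(nearest5)
```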
3.2 Adding Structural Features to Synthetic Households

We use RECS data to overlay structural features onto the synthetic households. For this we use a recursive feature elimination technique that applies a random forest model to determine the urban/rural class labels. The features identified as important for this classification were residence type (i.e., mobile home, single-family attached home, single-family detached home, etc.) and household income.
To match RECS households with synthetic households, a classification tree of RECS households was created, branching first by urban/rural type, then by residence type, and then by household income. Similar bins were created for the synthetic households. For each synthetic household, a random RECS household was drawn from the matching bin. Once matched, the structural features of the RECS household were assigned to the synthetic household. For more details on the household matching process, see [9].
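A sketch of the bin-matching step; the bin keys follow the text, while the field names and the income discretization are assumptions made for illustration:

```python
import random
from collections import defaultdict

def income_bin(income):
    # Assumed discretization: $25k-wide bins, capped at the sixth bin.
    return min(int(income // 25000), 5)

def build_recs_bins(recs_households):
    bins = defaultdict(list)
    for h in recs_households:
        key = (h["urban_rural"], h["residence_type"], income_bin(h["income"]))
        bins[key].append(h)
    return bins

def assign_structural_features(synthetic, bins, rng=random):
    key = (synthetic["urban_rural"], synthetic["residence_type"],
           income_bin(synthetic["income"]))
    donor = rng.choice(bins[key])  # assumes every synthetic bin has RECS donors
    for feat in ("floor_area", "num_bathrooms", "heating_fuel"):
        synthetic[feat] = donor[feat]
```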
3.3 Constructing Peer Effects

3.3.1
Neighborhood-Based Peer Effects
The neighborhood features were constructed by counting the numbers of solar adopters within a 1 mile, 2 mile, 3 mile, and 4 mile radius of each non-adopter. To build these features we used data on the adopters present in the SVEC dataset. However, some adopters had to be dropped because of missing data on other features. In the final dataset we had a total of 146 adopters for whom all the features could be constructed. The adoptions occurred over a period of 49 months, i.e., from May 2012 to June 2017, and over 140 unique adoption dates. Each of the unique dates was considered as a data point, and the peer influence was calculated based on the number of adopters in the neighborhood of 1–4 miles. Distances up to 4 miles were considered for the neighborhood because houses in rural areas can be very sparsely distributed. At each unique adoption date, the peer influence of all non-adopters in the appropriate neighborhood is updated. If a new adopter falls outside the 1–4 mile radius of a non-adopter, that non-adopter's peer influence remains the same. Also, if a new adopter appears in the neighborhood of existing adopters, it has no peer effect on them, since they have already adopted. The neighborhood features are dynamic and update at every adoption date for each of the 138,044 households in the SVEC region. This results in a total of ≈19 million data points. We use 10% of this data to train the model.
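The dynamic update can be sketched as follows, where distance_miles can be the haversine helper from the mapping step; this is an illustration under assumed field names:

```python
RADII = (1, 2, 3, 4)  # neighborhood radii in miles

def update_peer_counts(new_adopter, households, distance_miles):
    """Called once per unique adoption date for each new adopter;
    distance_miles can be the haversine helper sketched earlier."""
    for h in households:
        if h.get("adopted"):
            continue  # prior adopters gain no further peer influence
        d = distance_miles(new_adopter["lat"], new_adopter["lon"],
                           h["lat"], h["lon"])
        for r in RADII:
            if d <= r:
                key = "adopters_within_%dmi" % r
                h[key] = h.get(key, 0) + 1
```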
3.3.2
Social Network-Based Peer Effects
The social network-based peer effect features were constructed by counting the number of other solar adopters a household is connected to in a social contact network. To construct these features we first generated an activity-based social contact network file. A bi-directional edge exists only if a person is collocated with other people at work, school, shopping, or "other" activity locations. Further, we extracted the households covered under the SVEC region and marked the households that had already adopted.
The peer effects were then generated at each unique adoption date, for each household, for each activity location. For example, one feature counts the contacts a household (adult family members only) has at the work location with other households that are adopters. A total of 4 social network-based features were generated, one each for work, school, shop, and other. These features are also dynamic and update at every adoption date for each of the households in the dataset.
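A sketch of the social network-based feature construction, assuming the contact network is available as (person, contact, activity) edges and a person-to-household lookup, both assumptions for illustration:

```python
from collections import Counter

ACTIVITIES = ("work", "school", "shop", "other")

def contact_features(contact_edges, person_to_household, adopter_households):
    """contact_edges: (person, contact, activity) tuples for one household's
    adult members; returns one adopter-contact count per activity type."""
    counts = Counter()
    for _person, contact, activity in contact_edges:
        if person_to_household[contact] in adopter_households:
            counts[activity] += 1
    return {a: counts.get(a, 0) for a in ACTIVITIES}
```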
3.4 Generalized Linear Model

We use a generalized linear model (GLM) with a logit link function to estimate the binary response variable. GLMs extend ordinary regression models to encompass non-normal response distributions and modeling of functions of the mean. The response, solar adoption, is the binary random component. We use y to denote solar adoption, where y = 1 denotes that the household is an adopter, and y = 0 denotes that the household is not an adopter. Let E(Y) = P(Y = 1) = π; then the link function is log(π / (1 − π)), and π is the mean of the Bernoulli random variable Y.
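For concreteness, a logit GLM of this form could be fitted as sketched below; the use of statsmodels, the DataFrame df, and the column names are assumptions for illustration, not the study's actual code:

```python
import pandas as pd
import statsmodels.api as sm

def fit_adoption_glm(df, predictors):
    # One-hot encode categorical predictors; numeric ones pass through.
    X = sm.add_constant(pd.get_dummies(df[predictors], drop_first=True))
    y = df["adopted"]  # 1 = adopter, 0 = non-adopter
    model = sm.GLM(y, X.astype(float), family=sm.families.Binomial())
    return model.fit()

# Example usage (hypothetical column names):
# result = fit_adoption_glm(df, ["income", "family_size", "acreage"])
# print(result.summary())
```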
3.4.1 Predictors Used in Model Estimation
The predictors used in the GLM include the following demographic and structural characteristics of the household: household income, family size, education level of the survey respondent, daily electricity consumption, total value of the house, number of bathrooms, number of bedrooms, acreage of the house, rural or urban region, built year, number of car storage spaces, square footage of the house, pool present or not, type of climate region the house is in, type of fuel used to heat the house, etc. Additionally, the following peer effect predictors are used: the number of adopters within a 1 mile, 2 mile, 3 mile, and 4 mile radius of the house, and the number of household social contacts (adult members only) with other adopters at work, school, shop, and "other" activity locations.
3.4.2
Three Models with Different Predictors
We built three models with three different sets of predictors. The first model uses demographic and structural features as predictors. The second model adds the neighborhood-based peer features to the Model 1 predictor set. The third model adds the social network-based features to the Model 2 predictor set. The goal is to determine whether the additional predictors, i.e., the neighborhood and social network-based features, improve the prediction of solar panel adopters.
3.4.3
Model Evaluation
We use two methods to evaluate our models. The first one is the in-sample error. We compare the fitted response with the observed response. Typically, we would set ŷ = 1 if the fitted probability is greater than or equal to 0.5 (P ≥ 0.5) and 0 otherwise. However, the fitted probabilities here are all less than 0.5, since the proportion of adopters (y = 1) is very small. Hence, we lower the threshold to a value equal to the n1-th highest fitted probability, where n1 is the number of observations with y = 1. The second is the out-of-sample error. Here the dataset is split randomly into 5 equal groups. Then 4 of them are used as the training set and the hold-out group as the test set. This is repeated 5 times so that each distinct group is tested once. The misclassification error is the average error across all 5 test sets. This out-of-sample error is closer to the true prediction error.
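The two evaluation procedures can be sketched as follows, assuming numpy arrays and scikit-learn's KFold; this is an illustration of the scheme described above, not the authors' code:

```python
import numpy as np
from sklearn.model_selection import KFold

def classify_top_n1(fitted_probs, y):
    """Label the n1 highest fitted probabilities as 1, where n1 = #(y == 1)."""
    n1 = int(np.sum(y == 1))
    threshold = np.sort(fitted_probs)[-n1]
    return (fitted_probs >= threshold).astype(int)

def cv_error(fit_fn, predict_fn, X, y, seed=0):
    """Average misclassification error over 5 random folds."""
    errors = []
    kf = KFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in kf.split(X):
        model = fit_fn(X[train_idx], y[train_idx])
        y_hat = classify_top_n1(predict_fn(model, X[test_idx]), y[test_idx])
        errors.append(float(np.mean(y_hat != y[test_idx])))
    return sum(errors) / len(errors)
```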
4 Results

The results of the Model 1 estimation are shown in Table 1. This model includes all the predictors except the neighborhood and social network features. Based on the p-value and a 0.05 threshold, we find house acreage, built year, number of bedrooms, household income, number of car storage, total value of the house, and family size to be significant in predicting the likelihood of a household being an adopter. Note that the p-value shown in the regression only shows the relative significance of a level of a categorical variable against its base value. To check whether or not a categorical variable is significant, we need to perform the likelihood ratio test (LRT). The LRT statistic is defined as

λ = Likelihood(Reduced model) / Likelihood(Full model),
where the full model is the logistic model that includes all predictors, and the reduced model is the logistic model that excludes the predictor variable we want to test. According to Wilks' theorem, as the sample size approaches infinity, the test statistic −2 log(λ) is asymptotically chi-squared distributed. Based on the corresponding p-value, we determine the significance at the 5% level. There are three likelihood ratio tests shown in Table 2, one for each of the categorical variables. The results show that all categorical variables are significant. Next we consider only the significant features of Model 1 and re-estimate it; the results are shown for all 3 models in Table 3. It shows that the probability for a household to be an adopter increases with the number of bedrooms, built year, number of car storage, and household size, but decreases with acreage, income, and total value of the house. For categorical features, the first level of a categorical feature is treated as the baseline, and the coefficient of the first level is 0. For the climate region variable,
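Given fitted log-likelihoods for the full and reduced models (e.g., from statsmodels), the test can be sketched as:

```python
from scipy import stats

def likelihood_ratio_test(llf_full, llf_reduced, df_diff):
    """llf_*: fitted log-likelihoods (e.g., statsmodels result.llf);
    df_diff: number of parameters dropped in the reduced model."""
    lr_stat = -2.0 * (llf_reduced - llf_full)  # = -2 log(lambda)
    p_value = stats.chi2.sf(lr_stat, df_diff)
    return lr_stat, p_value
```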
Table 1 Regression results of Model 1, which uses demographic and structural features of the house as predictors (variable: estimate, p-value)

(Intercept): −77.87, 0.9048
Acreage of the house: −0.12, 0.0000
Rural/urban region: −8.34, 0.8759
Built year: 0.06, 0.0128
Number of bathrooms: −0.86, 0.0000
Number of bedrooms: 0.13, 0.0000
Household income: −0.28, 0.0000
Number of car storage: 0.43, 0.0000
Square footage of the house: −0.16, 0.1878
Pool/no-pool: 0.36, 0.0000
Total value of the house: −0.51, 0.0000
Family size: 0.14, 0.0000
Daily electricity consumption: 0.02, 0.2259
Climate region the house is in: Hot-Humid: 33.80, 0.9452
Climate region the house is in: Mixed-Humid: 32.42, 0.9474
Education level: 2: 15.17, 0.9592
Education level: 3: 16.02, 0.9569
Education level: 4: 16.52, 0.9555
Education level: 5: 0.94, 0.9980
Type of fuel used for heating: 2: 31.95, 0.9520
Type of fuel used for heating: 3: 17.82, 0.9510
Type of fuel used for heating: 4: 16.92, 0.9535
Type of fuel used for heating: 5: 19.25, 0.9471

There are 3 multi-level categorical variables, i.e., climate region, education level, and the type of fuel used for heating. For each, the first level is used as the base level. The values in bold show that the variables are significant.

Table 2 The likelihood ratio test finds all of the categorical variables to be significant. Variables tested: type of climate region the house is in; education level of the survey respondent; type of fuel used to heat the house.
(R + C)/P > n, where R = total number of revolutionaries within vision radius v; C = total number of active citizens within vision radius v; P = total number of policemen within vision radius v. Revolutionaries become active when the ratio of rebel forces to policemen loyal to the government exceeds a given threshold n, and they then kill a randomly selected policeman within the vision radius v with a probability r; otherwise they remain hidden. They are always active, but they can hide from the policeman agents.

Table 1 Agents' state transition rules
State | G(yi) − N(yi) | State transition
Q | > f | Q =⇒ A
Q | ≤ f | Q =⇒ Q
A | > f | A =⇒ A
A | ≤ f | A =⇒ Q
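Read as code, Table 1 and the activation rule amount to the following minimal sketch (Python for illustration; the models themselves run in NetLogo, and the (R + C)/P > n form is inferred from the ratio description above):

```python
def citizen_next_state(state, net_grievance, f):
    """Table 1: Q = quiescent, A = active; f is the activation threshold.
    net_grievance stands for G(y_i) - N(y_i), assumed precomputed."""
    return "A" if net_grievance > f else "Q"

def revolutionaries_attack(R, C, P, n):
    """Revolutionaries act openly when rebel forces outweigh the police:
    (R + C) / P > n, with R, C, P counted within the vision radius v."""
    return P == 0 or (R + C) / P > n
```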
Fig. 2 Yar'Adua/Jonathan and Buhari administrations' ratings vis-à-vis Boko Haram incidents
In the model, while citizen agents choose to behave according to information local to their vision radius, revolutionaries act on the basis of global information and they decide to start the revolution according to the rule which involves the total number of active citizens.
2.5 Conflict Data

The models are calibrated with empirical data from the Armed Conflict Location & Event Data (ACLED) Project and approval ratings from the Gallup and NOI Polls. Figure 2 shows the Boko Haram incidents between 2009 and February 2018, both dates inclusive, together with the approval ratings of the Yar'Adua/Jonathan and Buhari administrations from 2009 to 2018, both years inclusive. It can be seen that Boko Haram incidents increased even as the popularity of Yar'Adua and of Jonathan's first tenure rose. From 2011, Jonathan's rating had an inverse relationship with the incidents: the rise in Boko Haram incidents may be explained by the decrease in the Jonathan administration's rating, or vice versa. Buhari, coming on the scene in 2015 with a high popularity rating of 72%, saw the Boko Haram incidents stagnate before plummeting. By 2016, Buhari's rating had dropped from 72% to 46%, rising slightly in 2017 to 51%, only to plummet again to 39% by February 2018. An interesting observation is that the Boko Haram incidents rise and fall alongside Buhari's rating (Fig. 2).
3 Simulations

Both models used in the paper locate the agents on a grid of cells, and each agent's state is updated at each time step. Thus, all model calculations and activities are done relative to the grid. In the models, the grid is a 60 × 60 grid, which gives an
Table 2 Model parameters used in both simulations

Grid dimension: 60 × 60. The torus uses a 60 × 60 grid for a total cell space of 3600, representative of the landmass of Northeast Nigeria.
Initial population density: 73%. The initial density of 73% gives a total population of 2628, fairly representative of the 26,263,866 population of Northeast Nigeria [27].
Initial police agent density: 0.002. The proportion of cops to the total population of citizens is 0.2%, fairly representative of the 1:460 national ratio of policemen to citizens in Nigeria. The Nigerian government created a new division for the Northeast, the seventh division, and added 500 soldiers to it, putting the total number of soldiers in the Northeast at 8500.
Government legitimacy: a value that is varied between the perceived governance and approval ratings as observed from the data in the various polls.
Vision: 8. Awareness of the presence of police by agents and awareness of agents by the police.
Hardship: varying each year; this value is taken from the "index of economic freedom" [28].
Grievance: as specified in the model, G = H(1 − L); the Moro model uses the income-indexed parameterized model G(yi) = (1 − L)H(yi).
Maximum jail term: 50. Though life imprisonment is specified in the Nigerian constitution, the models use 50 turns, since most suspects (citizens) are released from prison but revolutionaries are generally killed in battle.
Table 3 Other model parameters used in Moro's simulation

Initial revolutionary density: 1%. This is based on the number of Boko Haram fighters as quoted in various data, such as the CIA data and that by START.
area that is relative to the area of Northeast Nigeria, which is the focus of the models. Both models use the approval/disapproval ratings for the Yar'Adua/Jonathan and Buhari administrations to monitor the agents who join the rebellion, that is, those who do or do not accept the legitimacy of the government. The simulations are done using NetLogo® 6.0.4 with the model parameters indicated in Tables 2 and 3.
Fig. 3 Simulation of Epstein’s model of rebellion outbursts
3.1 Model Parameters

The parameters above are used in the simulations of both the Epstein/Wilensky and Moro models.
3.2 Epstein/Wilensky Model Simulation

The Epstein/Wilensky model, using the legitimacy values observed by the NOI and Gallup Polls, gives the chart in Fig. 3. The model shows continuous bursts of rebellion by the civil populace, though reducing in intensity. A comparison with the ACLED data shows a similar reduction in the intensity of the Boko Haram insurgency, and one may be tempted to say there is a "fit," especially since the model is non-ending, giving an intractable outlook. However, the model is one of civil rebellion and not of insurgency. Thus, it does not capture the Boko Haram insurgency aptly. This differentiates it from the Moro model, which is inspired by the Epstein model of civil rebellion.
3.3 Moro's Model

The Moro model did not give any fit, even with values outside the empirical values obtained from the various data sources. The model is at total variance with the data obtained from the various sources: changes made to the various parameters did not yield the observed ACLED data outcome. Given this, two behavioral characteristics of the Nigerian environment were added to the model of the Boko Haram insurgency. First, a citizen who has been jailed more than five times, with a correspondingly higher grievance level, automatically joins the
Fig. 4 High income values: (a) high income chart; (b) initialization: 11 policemen, 32 revolutionaries, no active citizen; (c) just before new government agents’ deployment: 4 policemen, 20 revolutionaries, no active citizen; (d) killing-off of the revolutionaries: 12 policemen, 0 revolutionaries, no active citizen
insurgency in the model. This is similar to what obtains in the Northern Nigeria region, where citizens who have been imprisoned incessantly or without trial are forced to join the insurgency at the slightest opportunity they have. Similarly, when the density of government agents falls to a very low level, new agents are added to the war zone to mimic the Nigerian government's penchant for throwing personnel at the insurgency. Two consistent but different outcomes were observed:

1. With a high income (Fig. 4), the civil populace refused to join in the rebellion. The revolutionaries went ahead to confront the government's agents. Before long,
Fig. 5 Low income values: (a) low income chart; (b) first burst: 9 policemen, 36 revolutionaries, 1762 of 2638 citizens active; (c) redeployment of new government agents: 14 policemen, 36 revolutionaries, 1938 of 2628 citizens active; (d) joining of the revolutionaries: 8 policemen, 45 revolutionaries, 1924 of 2610 citizens active
they had depleted the number of the agents, and the government responded as usual with the deployment of more agents.

2. With a low income, the civil populace joined the rebellion from the inception, with over 67% of the population becoming active and 49% of the government agents killed off. As usual, the government responded by throwing more agents at the rebellion, though with no resolution in sight (Fig. 5). It can be observed that the ranks of the revolutionaries swell with formerly jailed citizens who join them.
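The two added rules can be sketched as follows (an illustration in Python rather than NetLogo; the reinforcement trigger level is an assumption, while the more-than-five-jailings rule follows the text):

```python
def maybe_radicalize(citizen):
    # A citizen jailed more than five times, with high grievance, defects.
    if citizen["times_jailed"] > 5 and citizen["grievance"] > citizen["threshold"]:
        citizen["role"] = "revolutionary"

def maybe_reinforce(policemen, initial_count, spawn_fn, min_fraction=0.5):
    # min_fraction is an assumed trigger level; spawn_fn creates new agents.
    if len(policemen) < min_fraction * initial_count:
        policemen.extend(spawn_fn(initial_count - len(policemen)))
```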
4 Conclusion

The Moro model shows that the decrease in income led to an uprising, which gave the revolutionaries the fillip they needed to revolt. The mechanisms for interaction were neither obvious nor defined through the agents' interactions, and their decision to rebel was based on lower income. Neither model actually applies to the Boko Haram insurgency as it operates now: the insurgents now rely more on suicide bombing, which to them is more effective. Similarly, the models base the rule for starting the revolution on the number of active citizens, which also does not reflect the insurgency's mode of operation. Using the models, the ratings of each government did not adversely affect the actions of the citizens, and the high-income run shows that the citizens did not care about the ratings of the government, but only about their needs, which is a perfect description of the Nigerian environment. It should be noted that a model's qualitative success, or otherwise, in generating real-world phenomena does not really guarantee that the model's parameters have any realistic relationship with the parameters they nominally represent. Epstein [24] himself noted that "no political or social order is represented in the model (as) the dynamics of decentralized upheaval, rather than its political substance, is the focus . . . ."
Similarly, it is seen that the revolution is a deliberate attempt to establish opposition on other grounds, and not on grievance or legitimacy; the insurgency subsists because there are vested interests beyond grievance and legitimacy. Thus, the parameters specified in the models (hardship, legitimacy, grievance, jail terms) do not reflect the reality. Similarly, greed as a factor of rebellion or revolution should not be discountenanced.
References

1. Henslin, J. (2015). Essentials of sociology: A down to earth approach. New Jersey: Pearson Education.
2. Kriesberg, L. (1998). Constructive conflict: From escalation to resolution. Lanham, MD: Rowman & Littlefield.
3. Lemos, C. (2016). On agent-based modeling of large scale conflict against a central authority: From mechanisms to complex behaviour. Lisboa: Instituto Universitario de Lisboa.
4. Moscona, J., Nunn, N., & Robinson, J. A. (2017). Social structure and conflict: Evidence from sub-Saharan Africa. Journal of Economic Literature, 2, 60.
5. Andrade, L., Plowman, D. A., & Duchon, D. (2008). Getting past conflict resolution: A complexity view of conflict. University of Nebraska, Lincoln: Management Department Faculty Publications.
6. Gallo, G. (2012). Conflict theory, complexity and systems approach. Systems Research and Behavioral Science, 30(2), 156–175.
7. Rummel, R. J. (1976). Understanding conflict and war, volume 2: The conflict helix. Thousand Oaks, CA: SAGE Publications.
8. Goh, C. K., Quek, H. Y., Tan, K. C., & Abbass, H. A. (2006). Modeling civil violence: An evolutionary multi-agent, game theoretic approach. New York: IEEE.
9. Rule, J. B. (1988). Theories of civil violence. Berkeley: University of California Press.
10. McLellan, D. (1971). The thought of Karl Marx: An introduction. New York: Harper & Row.
11. Situngkir, H. (2004). On massive conflict: Macro-micro link. Journal of Social Complexity, 1(4), 1–12.
12. Hayden, N. K. (2016). Balancing belligerents or feeding the beast: Transforming conflict traps. College Park: University of Maryland.
13. Lemos, C., Coelho, H., & Lopes, R. J. (2013). Agent-based modeling of social conflict, civil violence and revolution: State-of-the-art review and further prospects. In Eleventh European Workshop on Multi-Agent Systems (EUMAS 2013), Toulouse.
14. Ademowo, A. (2015). Stigma, violence and the human agents on the Motor Park Space in Ibadan Metropolis, Southwest Nigeria. Bangladesh e-Journal of Sociology, 12(1), 88–98.
15. Kriesberg, L. (1973). The sociology of social conflict. Englewood Cliffs: Prentice Hall.
16. Hill, J. N. (2010). Sufism in Northern Nigeria: Force for counter radicalization? Carlisle: Strategic Studies Institute.
17. Pate, A. (2015). Boko Haram: An assessment of strengths, vulnerabilities, and policy options. College Park: National Consortium for the Study of Terrorism and Responses to Terrorism.
18. Abubakar, A. (2010). Manhunt begins after prison break. Retrieved 9 September, from http://www.iol.co.za/news/africa/manhunt-begins-after-prison-break-1.680173#.VBgRSPldUYM
19. Simonelli, C., Jensen, M., Castro-Reina, A., Pate, A., Menner, S., & Miller, E. (2014). Boko Haram recent attacks. START Background Report. College Park: National Consortium for the Study of Terrorism and Responses to Terrorism.
20. Raleigh, C., Kishi, R., & Russell, O. (2017). Al Shabaab and Boko Haram: Patterns of violence.
21. Raleigh, C., & Dowd, C. (2017). Armed conflict location and event data project (ACLED) codebook.
22. Raleigh, C., Dowd, C., & Linke, A. (2013). African conflict baseline and trends: Armed conflict location and event dataset (ACLED): Overview, uses & applications. Department for International Development (DFID).
23. Wilensky, U. (2004). NetLogo Rebellion model. Evanston, IL: Center for Connected Learning and Computer-Based Modeling, Northwestern University.
24. Epstein, J. M. (2002). Modeling civil violence: An agent-based computational approach. Proceedings of the National Academy of Sciences of the United States of America, 99, 7243–7250.
25. Moro, A. (2016). Understanding the dynamics of violent political revolutions in an agent-based framework. PLoS One, 11(4), e0154175. https://doi.org/10.1371/journal.pone.0154175
26. Trading Economics. (2018). Nigeria GDP per capita 1960–2018 [Online]. Retrieved April 30, 2018, from https://tradingeconomics.com/nigeria/gdp-per-capita
27. Nigeria Bureau of Statistics. (2018). Key statistics: 2016 [Online]. Retrieved March 14, 2018, from http://nigerianstat.gov.ng/#pop
28. Miller, T., Kim, A. B., & Roberts, J. M. (2018). 2018 index of economic freedom. Washington, DC: The Heritage Foundation.
Further Readings

Bak, P. (1996). How nature works: The science of self-organized criticality. New York: Springer.
Berkowitz, L. (1969). The frustration-aggression hypothesis revisited. In L. Berkowitz (Ed.), Roots of aggression. New York: Atherton Press.
Cioffi-Revilla, C. (2014). Introduction to computational social science: Principles and applications. London: Springer.
Gilbert, N., & Troitzsch, K. G. (2005). Simulation for the social scientist. Berkshire: Open University.
Gurr, T. R. (1968). Psychological factors in civil violence. World Politics, 20(2), 245–278. (Reprinted in Political system and change: A world politics reader.)
Gurr, T. R., & Duvall, R. (1973). Civil conflict in the 1960s: Reciprocal theoretical system with parameter estimates. Comparative Political Studies, 6, 135–169.
Hendrick, D. (2009). Complexity theory and conflict transformation: An exploration of potential and implications. Bradford: University of Bradford.
Kaisler, S. H., & Madey, G. (2009). Complex adaptive systems: Emergence and self-organization. Notre Dame, IN: University of Notre Dame.
Nicolescu, B. (2002). Nous, la particule et le monde (2nd ed.). Monaco: Le Rocher, Transdisciplinarité series. (First edition: Le Mail, Paris, 1985.)
Information Transmission in Randomly Generated Social Networks Jeff Graham
Abstract A bee swarm comprises two kinds of bees: a small number of scout bees, with the rest being uninformed bees. The scouts have information that needs to be communicated to the rest of the swarm. This paper explores how the information moves through several random network models as the number of scouts and the number of links between the uninformed bees vary. Under some circumstances, it appears that complicated models might be replaced by smaller, simpler ones. Keywords Social networks · Randomly generated · Honey bees
1 Introduction

When a hive of bees gets too large, a new queen is born, and the old queen and a portion of the hive go looking for a new nest site. The process that bees use to decide on a new site is quite interesting and well understood [4, 6]; however, there is still some controversy over how the swarm finds its way to the new nest, since at most 5% of the swarm (the scout bees) knows where the new nest is located [5]. The most obvious mechanism the bees could use is simply to watch a few other bees and, if one of them flies, follow it. We can think about this process as information flowing from the bees who know where they are going (the scouts) through the network to the bees that do not. The purpose of this paper is not to realistically model what the bees actually do, but to explore how information propagates through networks formed by random processes inspired by the bees' navigation problem. We will examine several models as described below. All of the models presented are implemented in NetLogo [7].
J. Graham () Susquehanna University, Selinsgrove, PA, USA e-mail: [email protected] © Springer Nature Switzerland AG 2020 T. Carmichael, Z. Yang (eds.), Proceedings of the 2018 Conference of the Computational Social Science Society of the Americas, Springer Proceedings in Complexity, https://doi.org/10.1007/978-3-030-35902-7_18
Each of these models will be described using the bees as the example, but the models use nothing specific to bees and should be applicable to any setting where there are small groups of things interacting with other small groups of things. For an example more relevant to the social sciences, one could consult the book The Invisible Cure: Africa, the West, and the Fight Against AIDS by Epstein [1]. She argues in this book that in some African cultures many engage in concurrent relationships, i.e., stable relations with more than one partner, and that this fact helped spread AIDS farther and more quickly in those populations. Other examples might be found in the corporate world. Some businesses organize their workers into teams to accomplish the company's objectives. The communication channels between the teams could be crucial to success. This might matter between small groups working on a project or between larger corporate structures like manufacturing and sales departments.

First, we will examine a network formed randomly as follows: each bee will form links to k randomly chosen other bees. This differs from a Poisson random network [2, 3] in that the outgoing degree of every node in the network is k, and the network under consideration is a directed network; i.e., if Bee X is watching Bee Y, it does not necessarily follow that Bee Y is watching Bee X. All of the networks constructed in this paper are directed networks. We will use a swarm of 625 bees and observe how changing the number of links and scouts affects the percentage of bees who receive the information.

The second type of network we will examine will take the spatial distribution of the bees into account. For these models, the bees can only watch bees that neighbor them in the swarm. The bees will form k random links to their neighbors. We will consider both the Moore¹ and von Neumann² neighborhoods. Once again, we will vary the number of links and scouts to observe the effect on the percentage of bees who get the message.

In the models just considered, the neighborhoods of the bees substantially overlap, so each bee will be a member of multiple neighborhoods. This is probably realistic; however, suppose that the local groups overlapped less? The next series of models examines what happens if we vary the overlap between groups. For these models we will fix the size of the swarm at ten thousand bees divided into 625 groups. The number of scouts and links will vary, and we will observe how this affects the percentage of bees that eventually start moving.
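As a concrete illustration of the first construction, here is a minimal sketch of the directed k-out network and the watch-and-follow spread (Python rather than NetLogo, and the breadth-first spread-to-fixation is a simplifying assumption):

```python
import random
from collections import deque

def build_k_out(n, k, rng):
    """Each bee watches k distinct, randomly chosen other bees."""
    watches = {}
    for bee in range(n):
        others = [b for b in range(n) if b != bee]
        watches[bee] = rng.sample(others, k)
    return watches

def fraction_reached(n, k, scouts, rng):
    watches = build_k_out(n, k, rng)
    # Information flows from the watched bee to its watcher, so invert edges.
    watchers = {b: [] for b in range(n)}
    for bee, targets in watches.items():
        for t in targets:
            watchers[t].append(bee)
    informed = set(rng.sample(range(n), scouts))
    queue = deque(informed)
    while queue:
        bee = queue.popleft()
        for w in watchers[bee]:
            if w not in informed:
                informed.add(w)
                queue.append(w)
    return (len(informed) - scouts) / (n - scouts)

rng = random.Random(1)
print(fraction_reached(625, 2, 3, rng))  # e.g., 625 bees, 2 links, 3 scouts
```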
2 Random Model

In all of the models, the bees are represented by patches. The model was run for 625 bees, with the number of links and the number of scouts each varied from one to six; the number of bees was chosen to match the number of groups in the later models in this paper. Each combination was run one hundred times, and the average number of uninformed bees that the information reached was computed. The results are shown in Fig. 1.

Fig. 1 Results for the random model (percent of the swarm reached versus number of scouts, for one to six links per bee)

The most noticeable result is the big jump that occurs when going from one link to two links. This jump likely arises because, in the one-link model, isolated cycles form. In one model run, a chain of twenty-two bees was found in which the twenty-second bee was watching a bee earlier in the chain, forming a cycle of eight bees within the chain. When there are two or more links, this situation appears to be much less likely. Another interesting result is that, for two or more links and three or more scouts, the resulting network is path connected, or at least every path-connected component contains a scout. For higher numbers of links, the network is also path connected.
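The isolated-cycle explanation can be made concrete. With one link per bee the watch relation is a function, so following links from any starting bee must eventually revisit some bee, tracing a tail that leads into a cycle; the twenty-two bee chain ending in an eight-bee cycle above is exactly this shape. A small sketch (again hypothetical, reusing k_out_network from above):

def tail_and_cycle(watch, start):
    # Follow the single outgoing watch link until a bee repeats.
    # Returns (tail length, cycle length); assumes one link per bee.
    position = {}
    b, steps = start, 0
    while b not in position:
        position[b] = steps
        steps += 1
        b = watch[b][0]  # the one bee that b watches
    return position[b], steps - position[b]

net = k_out_network(625, 1)
print(tail_and_cycle(net, 0))

Because information flows from watched to watcher, a scout sitting on such a cycle informs only that cycle and the chains feeding into it, and a scout partway down a chain reaches even fewer bees, which is why so much of the swarm can stay uninformed in the one-link model.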
3 Random Neighbors Models

For the models in this section, we restrict each bee to linking only with its neighboring bees, using the Moore neighbors for one model and the von Neumann neighbors for the other. The size of the swarm is again fixed at 625 bees. The scouts and links are varied as above, except that with von Neumann neighbors there can be at most four links. The von Neumann neighbor results are shown in Fig. 2 and the Moore neighbor results in Fig. 3.

Once again we can see the big jump from one to two links. The only essential difference between these models and the random model is that the percentage of bees reached by the scouts in the one-link graph stays close to zero. Since the potential links each bee can make are restricted to nearby bees, chains of bees leading into a cycle likely occur more often.
Fig. 2 Results for von Neumann neighbors (percentage reached versus number of scouts, for one to four links)

Fig. 3 Results for Moore neighbors (percentage reached versus number of scouts, for one to six links)
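A sketch of how the two neighbor sets might be generated on a 25 × 25 grid of patches (hypothetical; wrap-around at the edges is assumed here, although the paper states a toroidal geometry only for the later overlap models):

W = 25  # 625 bees arranged on a 25 x 25 grid

def moore(x, y):
    # The nearest eight neighbors in the grid.
    return [((x + dx) % W, (y + dy) % W)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def von_neumann(x, y):
    # The up, down, left, and right neighbors.
    return [((x + 1) % W, y), ((x - 1) % W, y),
            (x, (y + 1) % W), (x, (y - 1) % W)]

Each bee then forms its k watch links by sampling from its own neighbor set, e.g. random.sample(moore(x, y), k), which is why the von Neumann version supports at most four links.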
4 Controlled Overlap Models

In this section, we describe several models in which the number of overlapping bees between neighboring groups varies between one and three. We fix the number of bees at ten thousand, divided into 625 local groups. The number of links and scouts vary as before.
4.1 One Overlap Model

One of the more challenging aspects of these models was making sure that groups overlapped only by the desired number of bees. Each group starts with a core of nine members (shown in black in Fig. 4). The entire world is then laid out with these nine-member cores separated by one row of unassigned individuals in each direction. The trick is to assign the individuals in the gaps to the groups so that we get the desired overlap in each direction and every individual gets assigned to a group.

The design of the local groups for the one overlap model is displayed in Fig. 4. As mentioned above, the black patches are the core group. The gray patches are the ones that overlap with the four neighboring groups. The seven white patches in Fig. 4 represent individuals that are assigned to neighboring groups (including groups that are diagonally adjacent) and are not overlapping. Each local group overlaps by one layer with its north, south, east, and west neighbors, as seen in Fig. 5. In all of the controlled overlap models, a toroidal geometry was used so that the local groups were all the same.

Fig. 4 One overlap model neighbors group
Fig. 5 Overlapping groups

Another important point about the model is that the overlapping bees get a different neighbor set than the other bees. In order to allow an overlapping bee to link to bees in either of the groups of which it is a member, its neighbor set was formed from the union of the non-overlapping bees of those two groups. This prevents the direct linkage of one overlapping bee with another while still allowing the overlapping bee to pass information in either direction between the groups.
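A sketch of this neighbor-set rule (the group bookkeeping here is invented for illustration): bees belonging to two groups draw their candidates from the non-overlapping members of both, so no two shared bees can watch each other directly.

def neighbor_set(bee, groups, membership):
    # groups[g] lists a group's members; membership[bee] lists the
    # one or two group ids the bee belongs to.
    gs = membership[bee]
    if len(gs) == 1:
        # An ordinary bee may watch anyone else in its one group.
        candidates = set(groups[gs[0]])
    else:
        # An overlapping bee draws from the union of the
        # NON-overlapping members of both of its groups.
        candidates = {b for g in gs for b in groups[g]
                      if len(membership[b]) == 1}
    candidates.discard(bee)
    return candidates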
Obviously, other choices for the overlapping members could be made.

The ten thousand bees are divided into 625 groups, with each group sharing one bee with its north, south, east, and west neighbor groups. The model works like the previous models, except that the bees choose random bees within their local group to watch. The model was run 100 times for each combination of the parameters.

Fig. 6 One overlap results (percent reached versus number of scouts, for one to six links)

The results in Fig. 6 are qualitatively similar to the results from the models in the last section. One can see the jump from one link to two links, although it is a smaller leap; probably the same explanation (isolated cycles in the one-link model) applies here as well. Another difference is that the two-link graph seems to top out at about 85%. The lesser magnitude of the jump and the lower ceiling for the two-link graph are probably a result of the limited overlap between the groups.

There is another possible explanation for these differences. Since the local group size is larger in this model (eighteen bees versus eight or four), it may be less likely that each local group is internally path connected. To test this idea, a random network model was run ten thousand times to estimate the probability that eighteen bees would form a path-connected network. The estimates are shown in Table 1, along with estimates for a group of nine bees as a proxy for the smaller groups in the random neighbors models.

Table 1 Probability of internal connectedness

Number of links   Nine bees   18 bees
1                 0.283       0.147
2                 0.857       0.822
3                 0.972       0.956
4                 0.995       0.989
5                 0.999       0.996
6                 1.00        0.998

The probability of eighteen bees being connected with one link is about half that of nine bees, which may partially account for the smaller jump in the graphs.
However, the estimates for higher numbers of links are very similar, so it seems likely that the limited overlap accounts for the lower ceiling of the two-link model.
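The Table 1 estimates are straightforward to reproduce in spirit. The sketch below reads "path connected" as connectivity when link directions are ignored, which is our assumption; the paper does not spell out which notion was used.

def connected_ignoring_direction(watch):
    # Flood-fill from bee 0, treating every watch link as undirected.
    und = {b: set() for b in watch}
    for b, targets in watch.items():
        for t in targets:
            und[b].add(t)
            und[t].add(b)
    seen, stack = {0}, [0]
    while stack:
        for nb in und[stack.pop()] - seen:
            seen.add(nb)
            stack.append(nb)
    return len(seen) == len(watch)

def p_connected(n, k, trials=10_000):
    # Monte Carlo estimate over freshly generated k-out networks.
    return sum(connected_ignoring_direction(k_out_network(n, k))
               for _ in range(trials)) / trials

print(p_connected(18, 1), p_connected(9, 1))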
4.2 Two Overlap and Three Overlap Models

The two overlap model works the same as the one overlap model except that, of course, each local group of bees overlaps by two members with each of its four neighbor groups. The design of the two overlap group is shown in Fig. 7; as above, the overlaps are shown in gray. The local groups consist of twenty bees in this model. Again, the overlapping bees get a neighbor set consisting of the union of the non-overlapping bees from the two groups to which they belong.

Fig. 7 Two overlap model neighbors group

The results in Fig. 8 are similar to the one overlap model. The only noticeable differences are that there is a larger jump from one link to two links and that the two-link graph is able to climb upward to join the higher-link graphs.
Fig. 8 Two overlap results (percent reached versus number of scouts, for one to six links)
The estimates of the probability of internal connectedness for the twenty-bee groups are very similar to those for the eighteen-bee groups of the one overlap model, so two overlapping members seem to be enough to overcome the lower asymptotic behavior of the previous model.

Figure 9 contains the design for the three overlap groups, and the results from running the model are shown in Fig. 10. There does not seem to be much gained over the two overlap model.

Fig. 9 Three overlap model neighbors group
Fig. 10 Three overlap results (percent reached versus number of scouts, for one to six links)
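Finally, the experimental protocol used throughout — one hundred runs per (links, scouts) combination, averaging the fraction of the swarm reached — could be scripted along the following lines. This is a sketch reusing the hypothetical helpers above with the random model's network builder; it reproduces the shape of the experiments, not the author's NetLogo harness.

def sweep(n=625, max_links=6, max_scouts=6, runs=100):
    # Average fraction reached for each (links, scouts) combination.
    results = {}
    for k in range(1, max_links + 1):
        for s in range(1, max_scouts + 1):
            total = 0.0
            for _ in range(runs):
                net = k_out_network(n, k)
                scouts = random.sample(range(n), s)
                total += fraction_reached(net, scouts)
            results[(k, s)] = total / runs
    return results

for (k, s), frac in sorted(sweep().items()):
    print(f"links={k} scouts={s} reached={frac:.3f}")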
References

1. Epstein, H. (2007). The invisible cure: Africa, the West, and the fight against AIDS. New York: Farrar, Straus and Giroux.
2. Erdős, P., & Rényi, A. (1959). On random graphs. Publicationes Mathematicae Debrecen, 6, 290–297.
3. Jackson, M. (2008). Social and economic networks. Princeton: Princeton University Press.
4. Passino, K., & Seeley, T. (2006). Modeling and analysis of nest-site selection by honeybee swarms: The speed and accuracy tradeoff. Behavioral Ecology and Sociobiology, 59, 427–442.
5. Schultz, K., Passino, K., & Seeley, T. (2008). The mechanism of flight guidance in honeybee swarms: Subtle guides or streaker bees? The Journal of Experimental Biology, 211(20), 3267–3295.
6. Seeley, T., & Visscher, P. K. (2004). Group decision making in nest-site selection by honey bees. Apidologie, 35, 101–116.
7. Wilensky, U. (1999). NetLogo. Evanston: Center for Connected Learning and Computer-Based Modeling, Northwestern University.