PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH & PRACTICE
Editors: Hamid R. Arabnia, Leonidas Deligiannidis, Fernando G. Tinetti
Associate Editors: Lamia Atma Djoudi, Ashu M. G. Solo
CSCE’17: July 17-20, 2017, Las Vegas, Nevada, USA
americancse.org
© CSREA Press
This volume contains papers presented at The 2017 International Conference on Software Engineering Research & Practice (SERP'17). Their inclusion in this publication does not necessarily constitute endorsement by the editors or by the publisher.
Copyright and Reprint Permission: Copying without a fee is permitted provided that the copies are not made or distributed for direct commercial advantage, and credit to the source is given. Abstracting is permitted with credit to the source. Please contact the publisher for other copying, reprint, or republication permission.
© Copyright 2017 CSREA Press
ISBN: 1-60132-468-5
Printed in the United States of America
Foreword

It gives us great pleasure to introduce this collection of papers to be presented at the 2017 International Conference on Software Engineering Research and Practice (SERP'17), July 17-20, 2017, at Monte Carlo Resort, Las Vegas, USA. An important mission of the World Congress in Computer Science, Computer Engineering, and Applied Computing, CSCE (a federated congress to which this conference is affiliated) includes "Providing a unique platform for a diverse community of constituents composed of scholars, researchers, developers, educators, and practitioners. The Congress makes a concerted effort to reach out to participants affiliated with diverse entities (such as universities, institutions, corporations, government agencies, and research centers/labs) from all over the world. The congress also attempts to connect participants from institutions that have teaching as their main mission with those who are affiliated with institutions that have research as their main mission. The congress uses a quota system to achieve its institution and geography diversity objectives." By any definition of diversity, this congress is among the most diverse scientific meetings in the USA. We are proud to report that this federated congress has authors and participants from 64 different nations, representing a variety of personal and scientific experiences that arise from differences in culture and values. As can be seen below, the program committee of this conference, as well as the program committees of all other tracks of the federated congress, are as diverse as its authors and participants.

The program committee would like to thank all those who submitted papers for consideration. About 65% of the submissions were from outside the United States. Each submitted paper was peer-reviewed by two experts in the field for originality, significance, clarity, impact, and soundness. In cases of contradictory recommendations, a member of the conference program committee was charged to make the final decision; often, this involved seeking help from additional referees. In addition, papers whose authors included a member of the conference program committee were evaluated using the double-blind review process. One exception to the above evaluation process was for papers that were submitted directly to chairs/organizers of pre-approved sessions/workshops; in these cases, the chairs/organizers were responsible for the evaluation of such submissions. The overall paper acceptance rate for regular papers was 26%; 20% of the remaining papers were accepted as poster papers. (At the time of this writing, we had not yet received the acceptance rate for a couple of individual tracks.)

We are very grateful to the many colleagues who offered their services in organizing the conference. In particular, we would like to thank the members of the Program Committee of SERP'17, members of the congress Steering Committee, and members of the committees of federated congress tracks that have topics within the scope of SERP. Many individuals listed below will be requested after the conference to provide their expertise and services for selecting papers for publication (extended versions) in journal special issues as well as for publication in a set of research books (to be prepared for publishers including Springer, Elsevier, BMC journals, and others).
• Prof. Afrand Agah; Department of Computer Science, West Chester University of Pennsylvania, West Chester, PA, USA
• Prof. Nizar Al-Holou (Congress Steering Committee); Professor and Chair, Electrical and Computer Engineering Department; Vice Chair, IEEE/SEM-Computer Chapter; University of Detroit Mercy, Detroit, Michigan, USA
• Prof. Hamid R. Arabnia (Congress Steering Committee); Graduate Program Director (PhD, MS, MAMS); The University of Georgia, USA; Editor-in-Chief, Journal of Supercomputing (Springer); Fellow, Center of Excellence in Terrorism, Resilience, Intelligence & Organized Crime Research (CENTRIC)
• Dr. Travis Atkison; Director, Digital Forensics and Control Systems Security Lab, Department of Computer Science, College of Engineering, The University of Alabama, Tuscaloosa, Alabama, USA
• Prof. Dr. Juan-Vicente Capella-Hernandez; Universitat Politecnica de Valencia (UPV), Department of Computer Engineering (DISCA), Valencia, Spain
• Prof. Kevin Daimi (Congress Steering Committee); Director, Computer Science and Software Engineering Programs, Department of Mathematics, Computer Science and Software Engineering, University of Detroit Mercy, Detroit, Michigan, USA
• Prof. Zhangisina Gulnur Davletzhanovna; Vice-Rector of Science, Central-Asian University, Almaty, Republic of Kazakhstan; Vice President of International Academy of Informatization, Almaty, Republic of Kazakhstan
• Prof. Leonidas Deligiannidis (Congress Steering Committee); Department of Computer Information Systems, Wentworth Institute of Technology, Boston, Massachusetts, USA; Visiting Professor, MIT, USA
• Dr. Lamia Atma Djoudi (Chair, Doctoral Colloquium & Demos Sessions); Synchrone Technologies, France
• Prof. Mary Mehrnoosh Eshaghian-Wilner (Congress Steering Committee); Professor of Engineering Practice, University of Southern California, California, USA; Adjunct Professor, Electrical Engineering, University of California Los Angeles (UCLA), California, USA
• Prof. Byung-Gyu Kim (Congress Steering Committee); Multimedia Processing Communications Lab (MPCL), Department of Computer Science and Engineering, College of Engineering, SunMoon University, South Korea
• Prof. Louie Lolong Lacatan; Chairperson, Computer Engineering Department, College of Engineering, Adamson University, Manila, Philippines; Senior Member, International Association of Computer Science and Information Technology (IACSIT), Singapore; Member, International Association of Online Engineering (IAOE), Austria
• Dr. Vitus S. W. Lam; Senior IT Manager, Information Technology Services, The University of Hong Kong, Kennedy Town, Hong Kong; Chartered Member of The British Computer Society, UK; Former Vice Chairman of the British Computer Society (Hong Kong Section); Chartered Engineer & Fellow of the Institution of Analysts and Programmers
• Dr. Andrew Marsh (Congress Steering Committee); CEO, HoIP Telecom Ltd (Healthcare over Internet Protocol), UK; Secretary General of World Academy of BioMedical Sciences and Technologies (WABT), a UNESCO NGO, The United Nations
• Prof. Dr., Eng. Robert Ehimen Okonigene (Congress Steering Committee); Department of Electrical & Electronics Engineering, Faculty of Engineering and Technology, Ambrose Alli University, Nigeria
• Prof. James J. (Jong Hyuk) Park (Congress Steering Committee); Department of Computer Science and Engineering (DCSE), SeoulTech, Korea; President, FTRA; EiC, HCIS (Springer), JoC, IJITCC; Head of DCSE, SeoulTech, Korea
• Prof. Dr. R. Ponalagusamy; Department of Mathematics, National Institute of Technology, India
• Prof. Abd-El-Kader Sahraoui; Toulouse University and LAAS CNRS, Toulouse, France
• Prof. Igor Schagaev; Director of ITACS Ltd, United Kingdom (formerly a Professor at London Metropolitan University, London, UK)
• Dr. Akash Singh (Congress Steering Committee); IBM Corporation, Sacramento, California, USA; Chartered Scientist, Science Council, UK; Fellow, British Computer Society; Senior Member, IEEE; Member, AACR, AAAS, and AAAI
• Chiranjibi Sitaula; Head, Department of Computer Science and IT, Ambition College, Kathmandu, Nepal
• Ashu M. G. Solo (Publicity); Fellow of British Computer Society; Principal/R&D Engineer, Maverick Technologies America Inc.
• Prof. Fernando G. Tinetti (Congress Steering Committee); School of CS, Universidad Nacional de La Plata, La Plata, Argentina; Co-editor, Journal of Computer Science and Technology (JCS&T)
• Prof. Hahanov Vladimir (Congress Steering Committee); Vice Rector and Dean of the Computer Engineering Faculty, Kharkov National University of Radio Electronics, Ukraine; Professor, Design Automation Department, Computer Engineering Faculty; IEEE Computer Society Golden Core Member
• Varun Vohra; Certified Information Security Manager (CISM); Certified Information Systems Auditor (CISA); Associate Director (IT Audit), Merck, New Jersey, USA
• Dr. Haoxiang Harry Wang (CSCE); Cornell University, Ithaca, New York, USA; Founder and Director, GoPerception Laboratory, New York, USA
• Prof. Shiuh-Jeng Wang (Congress Steering Committee); Director of Information Cryptology and Construction Laboratory (ICCL) and Director of Chinese Cryptology and Information Security Association (CCISA); Department of Information Management, Central Police University, Taoyuan, Taiwan; Guest Editor, IEEE Journal on Selected Areas in Communications
• Prof. Layne T. Watson (Congress Steering Committee); Fellow of IEEE; Fellow of The National Institute of Aerospace; Professor of Computer Science, Mathematics, and Aerospace and Ocean Engineering, Virginia Polytechnic Institute & State University, Blacksburg, Virginia, USA
• Prof. Jane You (Congress Steering Committee); Associate Head, Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong
We would like to extend our appreciation to the referees and the members of the program committees of individual sessions, tracks, and workshops; their names do not appear in this document but are listed on the web sites of the individual tracks. As sponsors-at-large, partners, and/or organizers, each of the following (separated by semicolons) provided help for at least one track of the Congress: Computer Science Research, Education, and Applications Press (CSREA); US Chapter of World Academy of Science; American Council on Science & Education & Federated Research Council (http://www.americancse.org/); HoIP, Health Without Boundaries, Healthcare over Internet Protocol, UK (http://www.hoip.eu); HoIP Telecom, UK (http://www.hoip-telecom.co.uk); and WABT, Human Health Medicine, UNESCO NGOs, Paris, France (http://www.thewabt.com/). In addition, a number of university faculty members and their staff (names appear on the cover of the set of proceedings), several publishers of computer science and computer engineering books and journals, chapters and/or task forces of computer science associations/organizations from 3 regions, and developers of high-performance machines and systems provided significant help in organizing the conference as well as providing some resources. We are grateful to them all.

We express our gratitude to keynote, invited, and individual conference/track and tutorial speakers; the list of speakers appears on the conference web site. We would also like to thank the following: UCMSS (Universal Conference Management Systems & Support, California, USA) for managing all aspects of the conference; Dr. Tim Field of APC for coordinating and managing the printing of the proceedings; and the staff of Monte Carlo Resort (Convention department) at Las Vegas for the professional service they provided. Last but not least, we would like to thank the Co-Editors of SERP'17: Prof. Hamid R. Arabnia, Prof. Leonidas Deligiannidis, and Prof. Fernando G. Tinetti.

We present the proceedings of SERP'17.
Steering Committee, 2017 http://americancse.org/
Contents

SESSION: TESTING AND RESILIENCE, TEST GENERATION METHODS, FORMAL METHODS, BUG-FIXING TECHNIQUES AND SECURITY RELATED ISSUES

A Comparison of Strategies to Generate Test Requirements for Fail-Safe Behavior
Salah Boukhris, Ahmed Alhaddad, Anneliese Andrews ... 3

The Impact of Test Case Prioritization on Test Coverage versus Defects Found
Ramadan Abdunabi, Yashwant Malaiya ... 10

Behavior Driven Test Automation Framework
Ramaswamy Subramanian, Ning Chen, Tingting Zhu ... 17

Efficient Component Integration Testing of a Landing Platform for Vertical Take Off and Landing UAVs
Anneliese Andrews, Aiman Gannous, Ahmed Gario, Matthew J. Rutherford ... 24

A Software Test Approach to Evaluate the Enforcement of a Workflow Engine
Laurent Bobelin, Christian Toinard, Tuan Hiep Tran, Stéphane Moinard ... 31

An Automated Approach for Selecting Bugs in Component-Based Software Projects
Georgenes Lima, Gledson Elias ... 38

Simple Promela Verification Model Translation Method based on Relative SysML State Machine Diagrams
Bo Wang, Takahiro Ando, Kenji Hisazumi, Weiqiang Kong, Akira Fukuda, Yasutaka Michiura, Keita Sakemi, Michihiro Matsumoto ... 45

Cybersecurity Practices from a Software Engineering Perspective
Aakanksha Rastogi, Kendall Nygard ... 51

A Sound Operational Semantics for Circus
Samuel Barrocas, Marcel Oliveira ... 56

Resilience Methods within the Software Development Cycle
Acklyn Murray, Marlon Mejias, Peter Keiller ... 62

SESSION: WEB-BASED TECHNOLOGIES AND APPLICATIONS + CLOUD AND MOBILE COMPUTING

A Comparison of Server Side Scripting Technologies
Tyler Crawford, Tauqeer Hussain ... 69

The Architecture of a Ride Sharing Application
Mao Zheng, Yifan Gu, Chaohui Xu ... 77

An Automated Input Generation Method for Crawling of Web Applications
Yuki Ishikawa, Kenji Hisazumi, Akira Fukuda ... 81

The Application of Software Engineering to Moving Goods Mobile App
Katherine Snyder, Kevin Daimi ... 88

Support Environment for Traffic Simulation of ITS Services
Ryo Fujii, Takahiro Ando, Kenji Hisazumi, Tsunenori Mine, Tsuneo Nakanishi, Akira Fukuda ... 98

Publishing and Consuming RESTful Web API Services
Yurii Boreisha, Oksana Myronovych ... 104

SESSION: PROGRAMMING ISSUES AND ALGORITHMS + SOFTWARE ARCHITECTURES AND SOFTWARE ENGINEERING + EDUCATION

Aesthetics Versus Entropy In Source Code
Ron Coleman, Brendon Boldt ... 113

KDD Extension Tool for Software Architecture Extraction
Mira Abboud, Hala Naja, Mourad Oussalah, Mohamad Dbouk ... 120

Version Control Open Source Software for Computer Science Programming Courses
Mesafint Fanuel, Tzusheng Pei, Ali Abu El Humos, Xuejun Liang, Hyunju Kim ... 127

Software Engineering Experience - Fraud Detection
Suhair Amer, Zirou Qiu ... 131

Determining Degree of Alignment of Undergraduate Software Engineering Program with SWECOM
Massood Towhidnejad, Mouza Al Balooshi ... 137

Improving Cuckoo Hashing with Perfect Hashing
Moulika Chadalavada, Yijie Han ... 143

SESSION: AGILE DEVELOPMENT AND MANAGEMENT

IEEE 42010 and Agile Process - Create Architecture Description through Agile Architecture Framework
Shun Chi Lo, Ning Chen ... 149

Implementation and Analysis of Autonomic Management SaaSEHR System
Nadir Salih, Tianyi Zang ... 156

SESSION: POSTER PAPERS

Enriching Information Technology Research for Each Video Contents Scene based on Semantic Social Networking using Ontology-Learning
SungEn Kim, TaeGyun Lee, JaeDu Kim ... 165

Strategy Development on Disaster Information Integration System in High-rise Buildings in Korea
Chunjoo Yoon, Changhee Hong ... 167

SW Test Automation System Implementation for Securing SW Quality and Stability
Cheol Oh Jeong, Byoung-Sun Lee, In Jun Kim, Yoola L. Hwang, Soojeon Lee ... 169

SESSION: LATE PAPERS - SOFTWARE ENGINEERING RESEARCH

Object Orientation: A Mathematical Perspective
Nelson Rushton ... 173

An Actor Model of Concurrency for the Swift Programming Language
Kwabena Aning, Keith Leonard Mannock ... 178

Performance Investigation of Deep Neural Networks on Object Detection
Oluseyi Adejuwon, Hsiang-Huang Wu, Yuzhong Yan, Lijun Qian ... 184
SESSION: TESTING AND RESILIENCE, TEST GENERATION METHODS, FORMAL METHODS, BUG-FIXING TECHNIQUES AND SECURITY RELATED ISSUES
Chair(s): TBA
A Comparison of Strategies to Generate Test Requirements for Fail-Safe Behavior

Salah Boukhris, Ahmed Alhaddad, and Anneliese Andrews
Department of Computer Science, University of Denver, Denver, CO, USA
Abstract— This paper uses simulation experiments and case studies to compare the effectiveness and efficiency of strategies for generating test requirements for testing fail-safe behavior. The strategies include a genetic algorithm (GA) and coverage criteria for failure scenarios. The underlying testing method is based on a behavioral model and its test suite. The results show that test requirements generated by a genetic algorithm (GA) are more efficient for large search spaces, and they are equally effective for the two strongest coverage criteria. For small search spaces, the genetic algorithm is less effective. The weakest coverage criterion is ineffective.
1. Introduction

An external failure is an undesirable event in the environment that affects system operation [1]. Examples include hardware failures, sensor failures, or network outages. External failures are not a result of faults in the software. External failures in safety-critical systems (SCSs) such as medical devices, autonomous systems, many control systems, and some robots can cause loss of life, destruction of property, or large financial losses. Even failures in web applications can result in losses of billions of dollars [2]. Testing external failure mitigation is therefore important for many application domains.

Andrews et al. [3] introduce a technique for testing proper external failure mitigation in safety-critical systems. Unlike other approaches, which integrate behavioral and failure models and then generate tests from the integrated model [4], [5], they construct failure mitigation tests (FMT) from an existing behavioral test suite that is generated from extended finite state machines (EFSMs) [3], using an explicit mitigation model (MM) for which they generate mitigation tests (MT); these are then woven at selected failure points into the original test suite to create the failure-mitigation tests (FMT) [3], [6]. The possible combinations of failures and where in the test suite they occur represent failure scenarios. Test requirements state which of these failure scenarios need to be tested. Test requirements have been selected using failure scenario coverage criteria [3] and a genetic algorithm (GA) [6]. Neither [3] nor [6] evaluate their approach with respect to effectiveness and efficiency, nor do they compare under which circumstances coverage criteria may be preferable over the GA. This paper reports on a series of simulation experiments and case studies that compare the two strategies with respect to effectiveness and efficiency.

The paper is organized as follows: Section 2 describes related work in model-based testing, testing fail-safe behavior, and the use of genetic algorithms (GAs) to find defects in software. Section 3 explains the test generation process. Section 4 explains the selection of fail-safe test requirements via GA and coverage criteria. Section 5 describes the simulation experiments. Section 6 compares the efficiency and effectiveness of both types of strategies on a number of case studies. Section 7 discusses threats to validity. Section 8 draws conclusions.

2. Related Work

2.1 Model-Based Testing (MBT)

Utting et al. [7] provide a survey on MBT. They define six dimensions of MBT approaches (a taxonomy): model scope, characteristics, paradigm, test selection criteria, test generation technology, and test execution. Dias-Neto et al. [8] characterize 219 MBT techniques and discuss approaches supporting the selection of MBT techniques for software projects, risk factors that may influence the use of these techniques in industry, and their mitigation. Utting et al. [7] classify MBT notations as state based, history based, functional, operational, stochastic, and transition based. Transition-based notations are graphical node-and-arc notations that focus on defining the transitions between states of the system, such as variants of finite state machines (FSMs), extended finite state machines (EFSMs), and communicating extended finite state machines (CEFSMs). Examples of transition-based notations also include UML behavioral models (like activity diagrams, sequence and interaction diagrams), UML state charts, and Simulink Stateflow charts [7]. Testing with FSM models has a long history [9]-[13]. FSM-based test generation has been used to test a large number of application domains.
2.2 Testing Fail-Safe Behavior

An external failure is an undesirable event in the environment that affects system operation [1]. External failures are not a result of faults in the software. They can occur because of physical failures (network and system domain) or client errors (user-generated interactions). A sensor failure in a safety-critical system is an example of a physical failure. One strategy for testing fail-safe behavior alongside functional behavior is to integrate fault models with behavioral models: [4], [14] integrate state charts and fault trees (FTs), while [15] integrates UML state diagrams and FTs for safety analysis, but not for testing. These approaches have a variety of limitations and challenges, including: (1) possible mismatches
between notations and terminologies used in the FT vs. the behavioral model; (2) potential scalability problems when multiple or large FTs exist or a fault can occur in a large portion of behavioral states; (3) they cannot leverage an existing behavioral test suite; and (4) there is no formal mitigation model.
2.3 Genetic Algorithms and Software Testing

Ali et al. [16] analyze the use and results of search-based algorithms in test generation, including genetic algorithms (GA). The majority of techniques (78%) have been applied at the unit level and do not target specific external faults, but focus on structural coverage criteria. By contrast, the goal of fail-safe mitigation test requirements is to target specific external fault types and to test at various points of the test suite whether mitigation of external faults works properly. Berndt and Watkins [17] and Watkins et al. [18] introduce a multi-objective fitness function that changes based on results from previous testing cycles, i.e., the fitness function changes as the population evolves, based on the knowledge gained from prior generations (the "fossil record"). Individuals are rewarded based on novelty, proximity, and severity. Boukhris et al. [6] used a similar strategy to select failure scenarios that explore the search space for novel failure scenarios, prospect in the larger vicinity of found mitigation defects, and mine for further mitigation defects in the immediate vicinity.
Fig. 1: Test Generation Process (adapted from [6] and [3])

3. Approach

The fail-safe test generation process for both techniques (coverage criteria (CC) [3] and genetic algorithm (GA) [6]) consists of five steps (see Figure 1):
1) Generating test cases from the behavioral model.
2) Identifying external failure events and their required mitigation.
3) Generating test requirements, i.e., selecting external failure types and the positions in the test suite where each failure is to be applied, either via the GA [6] or via coverage criteria (CC) [3]. Evaluating and comparing the efficiency and effectiveness of generating the fail-safe test requirements via coverage criteria or GA is not addressed in [3], [6] and is the core contribution of this paper.
4) Generating mitigation tests from the mitigation models.
5) Applying weaving rules at the points of failure in the behavioral test suite to generate failure mitigation tests (FMT); a sketch of this weaving step follows at the end of this section.

The failure mitigation test process assumes that a technique for testing required functionality exists via a behavioral model (BM), associated behavioral testing criteria (BC), and a behavioral test suite (BT). Boukhris et al. [6] use an existing web application MBT approach, FSMWeb [19]. Both [3] and [6] assume that system requirements exist that identify types of failure events and any required mitigation actions, e.g., via hazard and risk analysis [20]. These are used to build failure mitigation models (MM), for which mitigation coverage criteria (MC) are used to create mitigation tests (MT). A State-Event matrix (SE) determines which failure types are possible in which behavioral states. This matrix and the test suite BT are then used in both techniques to identify fail-safe testing requirements: either through a heuristic search (GA) or through coverage criteria (CC). The failure mitigation test (FMT) is then created by selecting an appropriate mitigation test and weaving it into the behavioral test according to weaving rules. The next section describes the selection of test requirements in more detail.
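To make step 5 concrete, the sketch below shows one plausible way to splice a mitigation test into a behavioral test at a failure point. The list-based data representation and the specific weaving rule (inject the failure event at position p, run the mitigation sequence, then resume the original path) are illustrative assumptions on our part, not the exact weaving rules defined in [3], [6].

```python
# Illustrative sketch of step 5 (weaving), under assumed representations:
# a behavioral test is a list of state names, a mitigation test a list of
# mitigation actions. The weaving rule used here (inject failure e at
# position p, run the mitigation test, then resume the original path) is
# one plausible rule, not necessarily the rule defined in [3], [6].

def weave(behavioral_test, p, e, mitigation_test):
    """Build a failure mitigation test (FMT) for failure scenario (p, e)."""
    prefix = behavioral_test[:p]              # behavior up to the failure point
    failure_event = f"failure_{e}"            # the injected external failure
    suffix = behavioral_test[p:]              # behavior resumed after mitigation
    return prefix + [failure_event] + list(mitigation_test) + suffix

# Example: failure type 2 injected after the third step of a test path.
fmt = weave(["s1", "s2", "s5", "s6"], 3, 2, ["log", "enter_fail_safe_state"])
print(fmt)  # ['s1', 's2', 's5', 'failure_2', 'log', 'enter_fail_safe_state', 's6']
```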
4. Selection of Test Requirements

Let $F = \{f_1, \ldots, f_m\}$ be the failure types and $S = \{s_1, \ldots, s_n\}$ be the behavioral states. Failures may not be applicable in all behavioral states. We express this in a State-Event matrix:

$$SE(i,j) = \begin{cases} 1, & \text{if failure type } j \text{ applies in node } i \text{ of } S \\ 0, & \text{otherwise.} \end{cases}$$

Andrews et al. [3], [6] encode the test suite as a whole by concatenating the test paths in the test suite. Let $BT = \{t_1, t_2, \ldots, t_l\}$ be the behavioral test suite, where $l$ is the number of test paths. Then $CT = t_1 \circ t_2 \circ \cdots \circ t_l$ is the concatenated test path. Let $I = \mathrm{Len}(CT)$, let $|F|$ be the number of failure types, and let $node(p)$ be the index of the state in $S$ at position $p$. Then

$$PE = \{(p, e) \mid 1 \le p \le I,\ 1 \le e \le |F|,\ SE(node(p), e) = 1\}$$

represents all possible failure scenarios. Both GA and CC select $(p, e)$ pairs to build safety/failure mitigation tests. The GA uses PE as the search space, while CC imposes coverage criteria on it. For example, assume the test paths in rows 1 and 2 of Table 2 for a behavioral model with 8 states and three failure types. Table 1 shows a State-Event matrix for this example. Table 2 shows CT in row 2. PE is defined by the entries in the matrix that are marked '1'.

Table 1: State-Event (SE) Matrix

F\S   s1  s2  s3  s4  s5  s6  s7  s8
f1     1   1   1   1   1   1   1   1
f2     0   0   0   0   1   1   1   1
f3     0   0   0   0   1   1   1   1
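The search space PE can be computed mechanically from SE and CT. The sketch below encodes the SE matrix of Table 1; the concatenated test path CT is a made-up example, and all identifiers are our own naming, not taken from [3], [6].

```python
# Sketch: building the failure-scenario search space PE from the
# State-Event matrix SE and the concatenated test path CT. The SE matrix
# encodes Table 1 (f1 applies in every state, f2 and f3 only in s5-s8);
# CT is an assumed concatenation, used only for illustration.

SE = {
    "f1": {f"s{i}": 1 for i in range(1, 9)},
    "f2": {f"s{i}": int(i >= 5) for i in range(1, 9)},
    "f3": {f"s{i}": int(i >= 5) for i in range(1, 9)},
}

CT = ["s1", "s2", "s3", "s4", "s5", "s6", "s1", "s2", "s7", "s8"]  # assumed

# PE = {(p, e) | SE(node(p), e) = 1}: all feasible (position, failure) pairs.
PE = [(p, e) for p in range(1, len(CT) + 1)
             for e in SE
             if SE[e][CT[p - 1]] == 1]

print(len(PE))  # size of the potential search space shared by GA and CC
```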
4.1 Selection of Test Requirements Using a GA

Boukhris et al. [6] used the defect potential of an individual (p, e) to select the initial population, as it is more effective than random selection. Test results are recorded in the fossil record (FR). A fossil record entry is a triplet (p, e, d), where (p, e) is a test requirement from a given generation and d is a boolean value that is 1 if a mitigation defect was found through the test requirement and 0 if it was not. The fitness function uses this information. First, the fitness function uses the novelty (S) of an individual (p, e), computed as the Euclidean distance from the new individual to each entry in the current fossil record. The second part of the fitness function uses the proximity (R) of individuals to individuals with known mitigation defects. R is calculated as the distance of the new individual from all individuals in the fossil record that triggered a defect:

$$R(p_c, e_c) = \sum_{(p,e,1) \in FR_d} \sqrt{\frac{(p_c - p)^2}{length(I)} + \frac{(e_c - e)^2}{|F|}}$$

where $p$ and $e$ are the position and failure type in the fossil record that triggered a mitigation defect, and $FR_d = \{(p,e,1) \mid (p,e,1) \in FR \text{ and } (p,e) \text{ found a mitigation defect}\}$. After executing the test cases associated with a given individual $(p_c, e_c)$, R is computed as follows:

$$R = \begin{cases} R(p_c, e_c), & \text{if } (p_c, e_c) \text{ found a defect} \\ \dfrac{1}{R(p_c, e_c)}, & \text{otherwise.} \end{cases}$$

The overall fitness function is

$$Fitness = (w_s \times S^{1.5})^2 \times (w_r \times R^{1.5})^2$$

where $w_s$ is a weight for exploration and $w_r$ is a weight for prospecting and mining. Crossover between two individuals $(p_1, e_1)$ and $(p_2, e_2)$ creates new individuals $(p_1, e_2)$ and $(p_2, e_1)$ based on a defined crossover rate. An individual $(p, e)$ is mutated to $(p', e)$ based on a given mutation rate. The algorithm removes duplicates and infeasible pairs; the procedure ends when no new generations can be found. Boukhris et al. [6] showed that the GA approach is superior to random generation of test requirements.
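Transcribed directly into code, the fitness computation might look as follows. The novelty distance S is only described in prose above, so its normalization here mirrors the proximity formula by assumption, and the default weights are the D=20% tuning settings reported in Section 5.2.1; none of this is taken verbatim from [6].

```python
# Sketch of the Section 4.1 fitness computation. FR is the fossil record,
# a list of (p, e, d) triplets. The novelty normalization is assumed to
# mirror the proximity formula; the weights default to the D=20% tuning
# (ws=1, wr=3) reported in Section 5.2.1.
import math

def novelty(pc, ec, FR, I, nF):
    """S: summed distance from (pc, ec) to every fossil record entry."""
    return sum(math.sqrt((pc - p) ** 2 / I + (ec - e) ** 2 / nF)
               for (p, e, _d) in FR)

def proximity(pc, ec, FR, I, nF):
    """R(pc, ec): summed distance to fossil record entries with d = 1."""
    return sum(math.sqrt((pc - p) ** 2 / I + (ec - e) ** 2 / nF)
               for (p, e, d) in FR if d == 1)

def fitness(pc, ec, found_defect, FR, I, nF, ws=1.0, wr=3.0):
    S = novelty(pc, ec, FR, I, nF)
    R = proximity(pc, ec, FR, I, nF)
    if not found_defect:                # invert R when (pc, ec) found nothing
        R = 1.0 / R if R else 0.0
    return (ws * S ** 1.5) ** 2 * (wr * R ** 1.5) ** 2
```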
4.2 Selection of Test Requirements Using Coverage Criteria

Coverage criteria are attractive, since they allow for systematic, algorithmic generation of failure scenarios. We evaluate these criteria in Section 6. Andrews et al. [3] defined the following coverage criteria:

Criteria 1 (C1): All combinations, i.e., all positions p, all applicable failure types e (test everything). C1 requires testing all (p, e) pairs that are marked with a '1' in Table 2.

Criteria 2 (C2): All tests, all unique nodes, all applicable failures. Here we require that, when unique nodes need to be covered, they are selected from tests that have not yet been covered. Table 2 shows the (p, e) pairs (marked '1' in gray) that meet C2 for the example.

Criteria 3 (C3): All tests, all unique nodes, some failures (only one failure per node, but covering all failures). Collectively, all failures must be paired with a node at least once, but not with each selected node as in Criteria 2. Criteria 3 can be met with 8 pairs: (1,1), (1,3), (1,6), (1,7), (1,12), (1,18), (2,13), and (3,13).

Table 2: C1: All Positions, All Applicable Failures (the concatenated test path CT of tests t1, t2, and t3 over states s1-s8; each applicable failure type f1-f3 is marked '1', and the (p, e) pairs that meet C2 are shaded gray)
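Read as selections over the search space PE (Section 4), the three criteria can be sketched as follows. The tie-breaking choices, i.e., which position represents a unique node and which failures get paired with it under C3, are left open by the definitions above, so the ones below are our own.

```python
# Sketch: generating (p, e) test requirements for C1-C3 over PE.
# CT is the concatenated test path; the concrete pair choices for C2/C3
# are our own, since the criteria only state what must be covered.

def c1(PE):
    """C1: all positions, all applicable failures, i.e. the whole of PE."""
    return list(PE)

def c2(PE, CT):
    """C2: one representative position per unique node, all failures there."""
    first_pos = {}
    for p in range(1, len(CT) + 1):
        first_pos.setdefault(CT[p - 1], p)     # first occurrence of each node
    chosen = set(first_pos.values())
    return [(p, e) for (p, e) in PE if p in chosen]

def c3(PE, CT):
    """C3: every unique node paired with some failure, every failure used."""
    reqs, nodes_done, fails_done = [], set(), set()
    for (p, e) in PE:
        node = CT[p - 1]
        if node not in nodes_done or e not in fails_done:
            reqs.append((p, e))
            nodes_done.add(node)
            fails_done.add(e)
    return reqs
```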
5. Experiment Design and Results

We built a simulator to compare the effectiveness and efficiency of test requirements generated by the GA and the three coverage criteria. More test requirements lead to more tests; we can hence measure efficiency by the number of test requirements generated. We measure the effectiveness of a test requirement by whether the test(s) generated from it detected a mitigation defect.

5.1 Simulator

We extended the simulator in Boukhris et al. [6] (note that [6] does not account for the duplication factor). The simulator takes the following independent variables as input:
• Test suite size (I)
• Failure types (F)
• Mitigation defect density (D)
• Applicability level (AL): the percentage of '1' entries in the State-Event matrix
• Duplication factor (DF), defined as DF = I/|S|, that is, the length of the concatenated test suite divided by the number of states in the behavioral model; it gives the average number of times a state occurs in CT
• Type of (p, e) pair (test requirement) generation; currently the simulator supports GA, random, and coverage criteria C1-C3

The first two variables describe the problem size and characteristics: I and F determine the size of the search space. The mitigation defect density is used to determine how many mitigations are defective. These are marked with (d), so it can be determined later whether a selected (p, e) pair uncovers a mitigation defect or not. The applicability level is used to determine the number of '1' entries in the State-Event matrix, which is generated next. Using the duplication factor, the simulator determines the number of states in the behavioral model and generates a concatenated test suite. The simulator selects one or more of the (p, e) pair generation approaches and determines the set of test requirements ((p, e) pairs). Then it determines whether or not they found a defect and computes defect coverage. We use the following dependent variables:
• Test requirements ((p, e) pairs) and their number
• The set of mitigation defects found and their number
• The percentage of mitigation defects found
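A minimal version of the simulation loop might look like the sketch below. The random layouts (SE entries drawn at probability AL, defects planted on individual (p, e) scenarios at density D) are simplifying assumptions of ours; the actual simulator marks mitigations, not scenarios, as defective.

```python
# Minimal sketch of the Section 5.1 simulator, under assumed layouts:
# generate a random SE matrix at applicability level AL, derive the number
# of states from the duplication factor DF = I/|S|, plant mitigation
# defects at density D, and score one requirement-generation strategy.
import random

def simulate(I, nF, gen_requirements, D=0.20, AL=0.5, DF=2, seed=0):
    rng = random.Random(seed)
    n_states = max(1, I // DF)                         # DF = I / |S|
    SE = [[int(rng.random() < AL) for _ in range(nF)]  # '1' with probability AL
          for _ in range(n_states)]
    CT = [rng.randrange(n_states) for _ in range(I)]   # concatenated test path
    PE = [(p, e) for p in range(I) for e in range(nF) if SE[CT[p]][e]]
    # Simplification: mark a fraction D of feasible scenarios as defective.
    defects = set(rng.sample(PE, max(1, int(D * len(PE))))) if PE else set()
    reqs = set(gen_requirements(PE))                   # strategy under test
    found = defects & reqs
    coverage = len(found) / len(defects) if defects else 1.0
    return len(reqs), coverage                         # efficiency, effectiveness

# Example: exhaustive C1 trivially reaches 100% defect coverage.
print(simulate(200, 3, gen_requirements=lambda PE: PE))
```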
5.2 Comparison of GA vs. Coverage Criteria

We compare GA performance against the coverage criteria defined in Section 4.2 with respect to effectiveness and efficiency. We also investigated whether the size of the search space matters.

5.2.1 Parameter Settings

For the GA parameters, we use the same parameters as were identified and experimented with in [6] (tuning experiments and experiments comparing the GA to random requirement generation): mutation rate MR=0.3, crossover rate CR=0.5, number of runs NR=10. We chose a defect rate D that is close to the one reported by Sawadpong et al. [21]: D=20%. Based on the results of tuning the GA in [6], we selected exploration weights wr=3 and ws=1 for D=20%, and wr=1, ws=3 for D=5%. We vary the size of I×|F| from 10 to 2000. We keep the applicability level at the same level as in the prior experiments. The duplication factor is kept at 2 for the small search spaces (10-100); that means that, on average, a specific state in the behavioral model occurs in the test suite twice. The duplication factors of the case studies reported in Section 6 for the small search spaces vary from 2-14. A lower duplication factor makes for a more difficult search problem, since there are fewer opportunities to find a mitigation defect for a failure that occurs in a particular state. For the larger search spaces, we set the duplication factor to 20; this is smaller than the duplication factor of 31 for the large case study reported in Section 6, again to make it a harder problem.

5.2.2 GA vs. C1

C1 represents the equivalent of an exhaustive search, as its test requirements ((p, e) pairs) state that all feasible combinations be tested. C1 guarantees 100% mitigation defect coverage. Table 3 shows simulation results for mitigation defect densities of 20% and 5%. The first column shows the potential search space I×|F| (GA and CC remove infeasible pairs based on the SE matrix); it varies from 200-2000. Columns 2 and 5 show the number of test requirements the GA generates for each defect density. Columns 3 and 6 report this information when using coverage criterion C1. Both GA and C1 find all mitigation defects, but the GA does so much more efficiently. Columns 4 and 7 show this quantitatively by computing the fraction of pairs needed by the GA vs. C1. The GA only needs between 3.1%-5.0% of the pairs required by C1; it is clearly more efficient. Comparing the two mitigation defect densities, what is interesting is that the GA does not need many more pairs for the lower mitigation defect density than for the higher one. The relative efficiency is also similar (Table 3 shows a slightly wider range, 3.4%-5.0%, for a 20% defect density versus 3.8%-4.6% for a 5% defect density).

Next, we consider the small search space. Tables 4 and 5 show these results for a 20% and 5% defect density, respectively.
Columns 2 and 4 show the number of test requirements for GA and C1, respectively. Columns 3 and 5 show the proportion of mitigation defects found. While the GA again generates fewer pairs, it is not able to find all mitigation defects until I×|F| reaches 70. The relative efficiency (column 6) is not as good as for the larger search spaces: the GA now needs between 33.3%-62.5% of the number of pairs C1 requires. For this experiment, C1 is recommended over GA as long as I×|F| ≤ 60. For the low mitigation defect density D=5% in the small search space, results are reported in Table 5 (we cannot investigate I×|F| = 10, since D=5% would result in less than one defect). The GA is unable to detect all mitigation defects for the very small search spaces (less than 80 in this case); hence C1 is more effective. When the defect rate decreases, it takes a larger potential search space (80 vs. 70) for the GA to become successful at finding all defects. We do not report relative efficiency until both GA and C1 find all defects, since comparing the efficiency of an ineffective approach to one that does find all defects is pointless.
Table 3: Large Search Space: GA vs. C1 - 20% and 5% defect density

Size          20%                          5%
I×|F|    GA #pairs  C1 #pairs  GA/C1   GA #pairs  C1 #pairs  GA/C1
200          7        160      4.3%        6        160      3.8%
400         11        320      3.4%       12        320      3.8%
600         17        480      3.5%       22        480      4.6%
800         25        640      4.0%       27        640      4.2%
1000        28        800      3.5%       34        800      4.3%
1200        30        960      3.5%       43        960      4.5%
1400        39       1120      3.1%       46       1120      4.1%
1600        53       1280      5.0%       60       1280      4.7%
1800        57       1440      4.0%       61       1440      4.2%
2000        70       1600      4.3%       72       1600      4.5%
Table 4: Small Search Space: GA vs. C1 - 20% defect density

I×|F|   GA #pairs  GA Defect%  C1 #pairs  C1 Defect%  GA/C1
10          5         50%          8        100%      62.5%
20          7         50%         16        100%      43.8%
30          8         50%         24        100%      33.3%
40         12         50%         32        100%      37.5%
50         16         67%         40        100%      40.0%
60         23         83%         48        100%      47.9%
70         26        100%         56        100%      46.4%
80         28        100%         64        100%      43.8%
90         32        100%         72        100%      44.4%
100        35        100%         80        100%      43.8%
5.2.3 GA vs. C2

As before, we first analyze the performance of the GA vs. coverage criteria for the typical mitigation defect rate of 20% (both large and small search spaces), then for the low mitigation defect rate of 5%. All other parameter values are as indicated in Section 5.1. Table 6 shows the results for D=20% and D=5% and I×|F| ∈ {200, ..., 2000}.
Table 5: Small Search Space: GA vs. C1 - 5% defect density

I×|F|   GA #pairs  GA Defect%  C1 #pairs  C1 Defect%  GA/C1
20          9          0%         16        100%        -
30         11          0%         24        100%        -
40         15          0%         32        100%        -
50         16         50%         40        100%        -
60         20         50%         48        100%        -
70         24         50%         56        100%        -
80         30        100%         64        100%      46.9%
90         35        100%         72        100%      48.6%
100        37        100%         80        100%      46.3%
organized identical to Table 3. Both GA and C2 are able to find all mitigation defects. The number of pairs does not differ as much as for GA vs. C1, although the GA needs fewer pairs. The relative efficiency of the GA is between 62.5% and 87.5%. For the lower mitigation defect density of D=5%, both GA and C2 find all defects. Decreasing the defect rate does not appear to affect the number of pairs the GA needs very much although the trend is for an increased number of pairs for the smaller mitigation defect rate. This is also illustrated by comparing relative efficiency: for the lower defect density, GA needs up to 93.8% of pairs C2 needs while for the higher one GA needs no more than 87.5% of the number of pairs C2 requires.
Table 6: Large Search Space: GA vs. C2 - 20% and 5% defect density

Size          20%                          5%
I×|F|    GA #pairs  C2 #pairs  GA/C2   GA #pairs  C2 #pairs  GA/C2
200          7          8      87.5%       6          8      75.0%
400         11         16      68.8%      12         16      75.0%
600         17         24      70.8%      22         24      91.7%
800         25         32      78.1%      27         32      84.4%
1000        28         40      70.0%      34         40      85.0%
1200        30         48      62.5%      43         48      89.6%
1400        39         56      69.6%      46         56      82.1%
1600        53         64      82.8%      60         64      93.8%
1800        57         72      79.2%      61         72      84.7%
2000        70         80      87.5%      72         80      90.0%
Table 7: Small Search Space: GA vs. C2/C3 - 20% defect density

I×|F|   GA #pairs  GA Defect%  C2 #pairs  GA/C2   C3 #pairs  C3 Defect%
10          5         50%          5        -         2         50%
20          7         50%          8        -         5         50%
30          8         50%         12        -         8         50%
40         12         50%         16        -        10         50%
50         16         67%         20        -        13         50%
60         23         83%         24        -        15         43%
70         26        100%         28      92.9%      18         43%
80         28        100%         32      87.5%      20         38%
90         32        100%         36      88.9%      23         20%
100        35        100%         40      87.5%      25         20%
Table 7 shows results for the small search spaces ranging from 10-100 and a mitigation defect density of 20%. The GA does not reach 100% effectiveness until a search space of 70, showing that C2 is more effective for search spaces ranging from 10-60. The number of pairs is comparable: C2 ranges from 5-40 pairs, GA ranges from 5-35 pairs. We thus recommend C2 for potential search spaces of less than 70, and GA for larger ones. Similarly, when comparing results for the small search space between the lower (Table 8) and higher (Table 7) defect densities, initially the GA is not able to find all defects (I×|F| < 80). When it finally does, it requires slightly fewer pairs than C2. It appears that lowering the defect density from 20% to 5% does not have a huge impact on effectiveness or efficiency. In summary, for large search spaces GA and C2 are equally effective; the GA has a slight advantage in efficiency over C2. Based on the results, we recommend using coverage criteria for small search spaces (up to 70 for D=20% and up to 80 for D=5%).
Table 8: Small Search Space: GA vs. C2/C3 - 5% defect density

I×|F|   GA #pairs  GA Defect%  C2 #pairs  GA/C2   C3 #pairs  C3 Defect%
20          9          0%          8        -         5         50%
30         11          0%         12        -         8         50%
40         15          0%         16        -        10          0%
50         16         50%         20        -        13         50%
60         20         50%         24        -        15          0%
70         24         50%         28        -        18          0%
80         30        100%         32      93.8%      20          0%
90         35        100%         36      97.2%      23         33%
100        37        100%         40      92.5%      25          0%
5.2.4 GA vs. C3

C3 is the weakest of the three coverage criteria and requires the fewest (p, e) pairs. In the large search space, C3 does not find any defects for either the 20% or the 5% defect density, while the GA finds them all; the criterion is too weak. We turn to the small search space next. Table 7 reports results for D=20%, while Table 8 summarizes results for D=5%. What is interesting, however, is that C3 does not detect all defects for any of the potential search spaces, although it does better than for the larger potential search spaces. For the 20% defect rate (Table 7), it starts out detecting 50% of the defects for I×|F| = 10 and decreases steadily to 20% when I×|F| = 90. It appears that as the potential search space increases, its ability to find mitigation defects decreases. When considering the lower defect rate D=5%, the results (Table 8, column 7) do not show a trend such as decreasing effectiveness with increasing potential search space. Overall, we cannot recommend C3.
6. Comparison of Various Case Studies

Table 9 lists the characteristics of seven case studies involving three different model types and four different application domains. The first model is FSMWeb [19], a model used to test web applications. The second is an extended finite state machine (EFSM) [13]. The third is a communicating extended finite state machine (CEFSM) [23].
Table 9: Summary of case studies using different behavioral models

Behavioral Model  Case Study                        # of States  # of Transitions  # of F  Size of BT  Length of CT  AL
FSMWeb            CSIS [19]                             16             20             3         6            70       33.33%
FSMWeb            Mortgage System [6]                  127            224            10       266          3998       40.00%
FSMWeb            Closing Documents SubSystem [6]       12             12            10         9           169       41.67%
EFSM              RCCS-1 [3]                             4              8             4         4            24       81.25%
CEFSM             Launch System [22]                    21             34            14         5            49       45.92%
CEFSM             Insulin Pump [6]                      15             23             4        11            74       61.67%
CEFSM             RCCS-2 [6]                            14             19             4        11            58       60.71%
Column 1 in Table 9 shows which of these models has been used in each case study. Column 2 identifies the case studies and provides a reference where details of each case study can be found. The first is a student services web application (CSIS), the second a large mortgage application, underwriting, and management system, and the third is one of its subsystems, which deals with closing documents. These are all web applications. The railroad crossing control system RCCS-1, in EFSM format, is a simplified version of RCCS-2, in CEFSM format (RCCS-2 is the example used in this paper). Additionally, we also use CEFSM models of an aerospace launch vehicle and an insulin pump. These are all examples of safety-critical systems. Hence we cover four application domains: web applications, transportation, medical devices, and aerospace. These case studies show that we successfully used the approach described in this paper with multiple models and multiple application domains.

The next two columns show the number of states and transitions, respectively. They vary from 4 states in RCCS-1 to 127 in the mortgage system, and from 8 transitions in RCCS-1 to 224 in the mortgage system. Column 5 shows the number of failure types. Column 6 lists the number of behavioral tests for each case study, ranging from 4 to 266. The length of the behavioral test suite is given in column 7. The last column shows the applicability level for each case study. Applicability levels are usually lower when certain failures only apply in certain phases of processing. For example, in the launch vehicle case study, failure types are specific to a launch phase and are not applicable to earlier or later phases. These case studies show a range of model sizes, failure types, and applicability levels.

Next, we evaluate the effectiveness and efficiency of the case studies. We seed each case study with mitigation defects, choosing a mitigation defect rate of 5%. While this is lower than the rate reported by [21], it represents a harder search problem. Table 10 shows the efficiency of using GA and C1-C3. The first column identifies the case study. The second reports the size of the potential search space (I×|F|). Columns 3-6 report the number of test requirements ((p, e) pairs) using GA and C1-C3, respectively. Except for C3, the approaches for generating test requirements find all mitigation defects. The last column shows the percentage of defective mitigations found by C3. The case studies show behavior similar to the simulations: the GA is more efficient (i.e., generates fewer (p, e) pairs)
than criterion C2 for the larger search spaces, but C2 is more efficient for RCCS-1 (13 pairs for C2 vs. 17 pairs for GA). The case studies also show that the applicability level makes a difference: for the insulin pump and RCCS-2, C2 needs 48% and 17% more pairs, while at the lower applicability levels of the mortgage system, the closing documents (CD) case study, and the launch system, C2 only needs 4-6% more (p, e) pairs. The application domain does not appear to affect effectiveness. Generally, when the applicability level increases, we need more (p, e) pairs. For example, compare CSIS (AL=33.33%) with RCCS-2 (AL=60.71%): CSIS has I×|F| = 210 while RCCS-2 has I×|F| = 232, yet CSIS requires 16 pairs for C2 vs. 34 for RCCS-2. The number of failure types also makes a difference: with 14 failure types, the launch system requires 128 (p, e) pairs for the GA and 135 (p, e) pairs for C2. This is much higher than any other case study except the mortgage system. If we consider the closing documents (CD) case study, which has a comparable applicability level, a larger potential search space, but 10 failure types instead of the launch vehicle's 14, the launch vehicle requires 128 (p, e) pairs using GA while the closing documents subsystem only requires 47. As we showed in the simulation experiments, using the GA is much more efficient than C1. It compares in efficiency with C2. C3 is too weak.

Table 10: Efficiency Comparison

Application       I×|F|     GA      C1     C2    C3   C3 Defect%
CSIS                210      14      97     16     9       0%
Mortgage System   39980     485   27986    508   127       0%
CD                 1690      47     638     50    12       0%
RCCS-1               96      17      76     13     4     100%
Launch System       686     128     386    135    21       0%
Insulin Pump        296      25     189     37    15      50%
RCCS-2              232      29     138     34    14      50%
7. Threats to Validity

7.1 Simulation Experiments

The threats are related to the choice of simulation parameters. The weights wr and ws are based on tuning experiments performed in [6]. Their tuning relies on the use of published mitigation defect rates (i.e., [21]); they also experiment with a much lower defect rate (5%). This supports our use of a common defect rate of 20%, contrasted with a low one (5%). Test suite size I ranges from 5-1000, while |F| ranges from 2-10. This falls within the range described for the case studies in Table 9, except for the large mortgage system (I=3998). If the trends observed in our simulation experiments do not hold for larger search spaces, validity is limited to the sizes we experimented with. The choice of failure types spans the ranges reported for the case studies, hence is realistic. The applicability level was set close to the highest found in the case studies. Higher applicability levels increase the search space and make the problem more difficult, i.e., our results are conservative. If future case studies follow the same pattern,
the simulation results should carry over. To be conservative, we used the smallest duplication factor DF found in the case studies (DF=2), and a DF=20 that is smaller than the one found in the largest case study (DF=31); in the case studies, DF ranges from 2-31. In summary, we selected simulation parameter values guided by either the literature or the case studies. If a future case study has very different characteristics, our results may not hold. Effectiveness is measured by the proportion of mitigation defects found; this is a common measure of effectiveness in many experiments [24]. Efficiency is measured in terms of the number of test requirements; test requirements have been used in the past to predict test effort [25].
7.2 Case Studies

As with any case study, generalizability is limited. We therefore selected a range of case studies that vary the type of model, the application domain, and the problem size. While this shows that the comparison of GA vs. C1-C3 is applicable to a relatively wide range of case studies, we cannot guarantee the same results for new case studies with very different characteristics.
8. Conclusion

This paper compared the efficiency and effectiveness of using a GA vs. coverage criteria to determine fail-safe test requirements when testing proper mitigation of failures in safety-critical systems and web applications. Its goal was to evaluate and compare the use of the GA [6] versus the use of coverage criteria (C1-C3) [3]; neither [3] nor [6] provided such a comparison. We used a simulator so as to be able to vary problem size (search space), mitigation defect density, and type of approach used (GA versus C1-C3). Our comparisons show that for large search spaces the GA is more efficient (more so when compared to C1 than to C2), while GA, C1, and C2 are equally effective. C3 was ineffective. Dropping the defect rate from the more common 20% in [21] to a much lower 5% did not result in a large increase in test requirements, i.e., the GA is relatively robust in this range. The simulation results favor C1 and C2 when search spaces are small. A smaller mitigation defect rate affected how big the search space had to be for the GA to be effective. We also presented results of case studies in which we compared the performance of GA and C1-C3.
Acknowledgment

This work was supported, in part, by NSF IUCRC grants #0934413 and #1439693 to the University of Denver.
References

[1] P. A. Laplante, Ed., Dictionary of Computer Science, Engineering and Technology. Boca Raton, FL, USA: CRC Press, Inc., 2001.
[2] S. Pertet and P. Narasimhan, "Causes of failure in web applications," Carnegie Mellon University Parallel Data Lab, Tech. Rep. CMU-PDL-05-109, 2005.
[3] A. Andrews, S. Elakeili, and S. Boukhris, "Fail-safe test generation in safety critical systems," in 2014 IEEE 15th International Symposium on High-Assurance Systems Engineering (HASE). IEEE, 2014, pp. 49-56.
[4] M. Sanchez and F. Miguel, "A systematic approach to generate test cases based on faults," in ASSE 2003, ISSN 1666-1087, Buenos Aires, 2003.
[5] J. Kloos, T. Hussain, and R. Eschbach, "Risk-based testing of safety-critical embedded systems driven by fault tree analysis," in IEEE Fourth International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2011, pp. 26-33.
[6] S. Boukhris, A. Andrews, A. Alhaddad, and R. Dewri, "A case study of black box fail-safe testing in web applications," Journal of Systems and Software, 2016.
[7] M. Utting, A. Pretschner, and B. Legeard, "A taxonomy of model-based testing approaches," Software Testing, Verification and Reliability, vol. 22, no. 5, pp. 297-312, 2012.
[8] A. C. Dias-Neto and G. H. Travassos, "A picture from the model-based testing area: concepts, techniques, and challenges," Advances in Computers, vol. 80, pp. 45-120, 2010.
[9] T. Chow, "Testing software designs modeled by finite state machines," IEEE Transactions on Software Engineering, vol. SE-4, no. 3, pp. 178-187, 1978.
[10] W. Howden, "A methodology for generating program test data," IEEE Transactions on Computers, vol. 24, no. 5, pp. 554-560, 1975.
[11] J. Huang, "An approach to program testing," ACM Computing Surveys, vol. 7, no. 3, pp. 113-128, 1975.
[12] S. Pimont and J. C. Rault, "A software reliability assessment based on a structural and behavioral analysis of programs," in Proceedings of the 2nd International Conference on Software Engineering, San Francisco, CA, 1976, pp. 486-491.
[13] D. Lee and M. Yannakakis, "Principles and methods of testing finite state machines - a survey," Proceedings of the IEEE, vol. 84, no. 8, pp. 1090-1123, 1996.
[14] M. A. Sanchez, J. C. Augusto, and M. Felder, "Fault-based testing of e-commerce applications," in Proceedings of the 2nd Workshop on Verification and Validation of Enterprise Information Systems, 2004, pp. 66-71.
[15] H. Kim, W. E. Wong, V. Debroy, and D. Bae, "Bridging the gap between fault trees and UML state machine diagrams for safety analysis," in Proceedings of the 2010 Asia Pacific Software Engineering Conference (APSEC '10). Washington, DC, USA: IEEE Computer Society, 2010, pp. 196-205.
[16] S. Ali, L. Briand, H. Hemmati, and R. Panesar-Walawege, "A systematic review of the application and empirical investigation of search-based test case generation," IEEE Transactions on Software Engineering, vol. 36, no. 6, pp. 742-762, 2010.
[17] D. J. Berndt and A. Watkins, "High volume software testing using genetic algorithms," in Proceedings of the 38th Annual Hawaii International Conference on System Sciences - Volume 09. Washington, DC, USA: IEEE Computer Society, 2005, pp. 318-326.
[18] A. Watkins, E. M. Hufnagel, D. Berndt, and L. Johnson, "Using genetic algorithms and decision tree induction to classify software failures," International Journal of Software Engineering & Knowledge Engineering, vol. 16, no. 2, pp. 269-291, 2006.
[19] A. A. Andrews, J. Offutt, and R. T. Alexander, "Testing web applications by modeling with FSMs," Software and System Modeling, pp. 326-345, 2005.
[20] X. Ge, R. Paige, and J. McDermid, "An iterative approach for development of safety-critical software and safety arguments," in Agile Conference (AGILE), 2010, pp. 35-43.
[21] P. Sawadpong, E. B. Allen, and B. J. Williams, "Exception handling defects: An empirical study," in 2012 IEEE International Symposium on High-Assurance Systems Engineering, 2012, pp. 90-97.
[22] A. Andrews, S. Elakeili, A. Gario, and S. Hagerman, "Testing proper mitigation in safety-critical systems: An aerospace launch application," in 2015 IEEE Aerospace Conference, 2015, pp. 1-19.
[23] J. Li and W. Wong, "Automatic test generation from communicating extended finite state machine (CEFSM)-based models," in Proceedings of the Fifth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC 2002), 2002, pp. 181-185.
[24] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, Experimentation in Software Engineering. Springer Science & Business Media, 2012.
[25] I. Burnstein, Practical Software Testing: A Process-Oriented Approach. Springer Science & Business Media, 2003.
The Impact of Test Case Prioritization on Test Coverage versus Defects Found

Ramadan Abdunabi
Computer Information Systems Dept, Colorado State University, Fort Collins, CO 80523
[email protected]

Yashwant K. Malaiya
Computer Science Dept, Colorado State University, Fort Collins, CO 80523
[email protected]
Abstract—Prior studies demonstrate the importance of the relationship between code coverage and defects found in determining the effectiveness of test inputs. The variation of defect coverage with code coverage has previously been studied for a fixed execution sequence of test cases. We conducted an experiment to evaluate hypotheses expressed in two research questions. The first question addresses the relationship between defect coverage and code coverage for different execution sequences of test cases. The second research question evaluates the effectiveness and cost of employing prioritization techniques. The study confirms that altering the execution order of test cases can affect the relationship between code coverage and defects found. The results show that optimal prioritization outperforms st-total and random prioritization.

Index Terms—Defect Coverage, Statement Coverage, Empirical Study, Test Cases, Prioritization
I. INTRODUCTION

The relationship between test coverage and the number of defects found is an indicator of the effectiveness of the software testing process. Horgan and London [9] showed that code coverage is an indicator of testing effectiveness and completeness. High code coverage with a low fault rate indicates high software reliability. In this work, we conducted an experiment to investigate the relationship between test coverage and the number of defects found for varying execution orders of test cases. The test cases are ordered in each execution sequence based on the Test Case Prioritization (TCP) techniques defined in the study by Elbaum et al. [7]. We propose an approach that can be used to improve the selection of the most cost-effective technique. The research results provide insight into the tradeoffs between techniques, and the conditions underlying those tradeoffs, relative to the programs, test suites, and modified programs that we examine. If these results generalize to other workloads, they could guide the informed selection of techniques by practitioners. The analysis strategy we use demonstrably improves the prioritization technique selection process, and can be used by practitioners to evaluate techniques in a manner appropriate to their chosen testing scenarios.

Earlier experimental studies [4], [10] showed that at the beginning of testing, the number of detected faults grows slowly, which appears as a knee shape in a plot of defects found versus test coverage. Following the knee, once fault detection has started, the number of
detected faults grows linearly with test coverage. Saturation or acceleration is unlikely, and has not been encountered. However, all the studies to date, presented in Section 2, ignore the role of the ordering of test cases in the growth of the number of detected faults with test coverage. Different test cases may have different fault-detecting ability, and altering the execution order of the test cases changes the curve of defects found versus test coverage. The first hypothesis in this study is that the most effective test cases will give higher test coverage and more defects found than less effective test cases; however, we cannot distinguish between test cases unless we execute them and then investigate the reported data. Another hypothesis is that altering the execution sequences of test cases will change the position of the knee. One other important expected result of this study is a better understanding of how to choose the test sequence that leads to maximum defect coverage growth. A further expected effect is a change in the shape of the curve following the knee. The future goal of the study is to obtain better insight into the behavior of the number of defects found with test coverage in order to propose more accurate fault prediction models. An important consideration in our empirical study of fault detection is whether to use natural, manually seeded, or automatically seeded faults. Using automatically seeded faults was the only feasible option. Even apart from resource considerations, automatically seeded faults offer some advantages for experimentation: unlike hand-seeded faults, they are not influenced by the person seeding the fault. We created faulty versions by applying mutation-based fault injection to non-faulty programs. This was done to enlarge our data sets and because mutation-based faults have been widely used for analyzing test effectiveness [6], [2], [13]. For all of our subject programs, any faulty versions that did not lead to at least one test case failure in our execution environment were excluded. We initially had a concern that an effectiveness measure based on artificially seeded faults may not capture the real effectiveness of a test suite, i.e., its fault detection capability. Nevertheless, structural coverage and mutation score have been widely used as successful surrogates of fault detection capability in the software testing literature [6], [2], [13].
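For illustration only, the sketch below shows the kind of single-point change a method-level mutation operator produces; the class and method names are hypothetical, and the operator shown (relational operator replacement) is one of the standard method-level operators implemented by tools such as MuJava.

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertNotEquals;
    import org.junit.Test;

    public class MutantExample {

        // Original (non-faulty) method from a hypothetical subject class.
        static int max(int a, int b) {
            return (a > b) ? a : b;   // original relational operator: >
        }

        // Faulty version produced by a relational-operator-replacement (ROR)
        // style mutation: ">" replaced by "<". Each such mutant is a separate
        // faulty copy of the program; mutants that no test case fails on are
        // excluded from the experiment.
        static int maxMutant(int a, int b) {
            return (a < b) ? a : b;   // mutated relational operator: <
        }

        @Test
        public void mutantIsKilled() {
            assertEquals(5, max(5, 3));           // passes on the clean version
            assertNotEquals(5, maxMutant(5, 3));  // the mutant is detected ("killed")
        }
    }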
Andrews et al. [6] reported that faults generated with mutation operators are similar to hand-seeded faults for seven programs and to natural faults in a space application. MuJava, a mutation testing tool for Java programs, is used in our study to generate faulty program versions that simulate real faults. Mutant programs are commonly used as practical replacements for real faults; the studies in [14], [20], [1] have shown that mutation faults can be representative of real faults. The rest of the paper is organized as follows. Section 2 presents background material on the relationship between code coverage and defects found and on Test Case Prioritization (TCP) techniques. The challenges in the study, the experimental approach, and the data sets are discussed in Section 3. Section 4 analyzes the results with respect to the research questions. The potential threats to validity are discussed in Section 5. Finally, Section 6 summarizes the results and discusses future research work.

II. RELATED WORK

Prior studies showed that test coverage is a good estimator of defects once high coverage has been achieved. Bishop [3] used test coverage to estimate the number of residual faults; the model was applied to a specific data set with known faults, and the results agreed well with the model. Cai et al. [5] proposed a reliability growth model that combines testing time and test coverage measures. The Malaiya-Li-Bieman-Karcich-Skibbe (MLBKS) model [15] relates test coverage and defects found. That study demonstrated the early applicability of the model to describe the relationship between test coverage and defects found; furthermore, it showed that the initial defect density decides the position of the knee of the curve describing the model. Cai and Lyu [4] studied the relationship between code coverage and fault detection under various scenarios: different test case coverage techniques, functional testing vs. random testing, normal operational testing vs. exceptional testing, and different combinations of coverage metrics. The overall result showed that code coverage is a good indicator of testing effectiveness. In contrast to our results, these studies empirically showed that random test case prioritization (a.k.a. random ordering) can be ineffective. It has been a long tradition to deem random ordering the lower-bound control technique. If random ordering is indeed ineffective, we would like to ask: why are other techniques not used to resolve tie cases? Moreover, none of these studies is concerned with the relationship between code coverage and defect coverage under different execution sequences of test cases. In our study, the test cases are given a particular order in each execution to examine the impact on the coverage-defect relationship. These prior studies also investigated only a small number of small programs; thus, we suspect that their results are limited and cannot be generalized.
Recently, a number of test case prioritization approaches have been proposed [11], [23]. Similar to our work, these approaches compare Test Case Prioritization (TCP) techniques in terms of their effectiveness, similarity, efficiency, and performance degradation. While these studies are considered among the most complete in terms of evaluation depth, they ignored the static techniques considered in this paper. Thus, our study is differentiated by the unique goal of understanding the relationships between purely static TCPs. Zhang et al. [23] studied regression prioritization and Fast Mutation Testing prioritization, which are similar in spirit to our study: reordering the test cases to make regression/mutation testing faster. However, their mechanisms are different. Regression test prioritization aims to cover more program units faster, thus increasing the probability of revealing unknown faults earlier, whereas the locations of mutation faults (i.e., mutated statements) are known in mutation testing, and a simple strategy can merely execute the test cases that reach the mutation faults; this makes coverage-based regression test prioritization techniques unsuitable for mutation testing. Rothermel et al. [19], [7] studied the 'Total' and 'Additional' approaches that utilize dynamic program coverage information. That work investigates test prioritization techniques for regression testing, Regression Test Prioritization (RTP), which has different goals than ours: our research aims at sorting a given set of tests into an expected most-productive order of execution and proposes which tests of a previously selected test suite are most advantageous to rerun with the same programs. Total techniques, followed in this study, do not change the values of test cases during the prioritization process, whereas additional techniques adjust the values of the remaining test cases, taking into account the influence of already prioritized test cases; a sketch contrasting the two appears below. For example, a typical test case prioritization technique reorders the tests to execute the effective tests earlier in order to speed up fault detection. Different from this traditional work, our work aims to improve the effectiveness of the existing tests rather than run them more efficiently. Moreover, the study in [7] does not incorporate a testing time budget, and in it two graduate students of computer science were recruited to insert faults that were as realistic as possible based on their experience. Since there is usually a limited amount of time allowed for testing, in our work the fault-seeding process as well as the reordering of the test cases are fully automated to reduce testing time.
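The total/additional distinction can be sketched as follows, assuming each test case carries a precomputed set of covered statement IDs; the class and method names are illustrative, not taken from the study's tooling.

    import java.util.*;

    class TestCase {
        final String name;
        final Set<Integer> coveredStatements; // statement IDs covered by this test
        TestCase(String name, Set<Integer> covered) {
            this.name = name;
            this.coveredStatements = covered;
        }
    }

    class Prioritizer {
        // "Total" (e.g., st-total): sort once by each test's total statement
        // coverage; a test's value never changes during prioritization.
        static List<TestCase> totalOrder(List<TestCase> suite) {
            List<TestCase> ordered = new ArrayList<>(suite);
            ordered.sort((a, b) ->
                Integer.compare(b.coveredStatements.size(), a.coveredStatements.size()));
            return ordered;
        }

        // "Additional": greedily pick the test adding the most statements not
        // yet covered by the tests already chosen, re-scoring the rest each round.
        static List<TestCase> additionalOrder(List<TestCase> suite) {
            List<TestCase> remaining = new ArrayList<>(suite);
            List<TestCase> ordered = new ArrayList<>();
            Set<Integer> covered = new HashSet<>();
            while (!remaining.isEmpty()) {
                TestCase best = Collections.max(remaining,
                    Comparator.comparingInt((TestCase t) -> {
                        Set<Integer> gain = new HashSet<>(t.coveredStatements);
                        gain.removeAll(covered);
                        return gain.size();
                    }));
                ordered.add(best);
                covered.addAll(best.coveredStatements);
                remaining.remove(best);
            }
            return ordered;
        }
    }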
III. THE EXPERIMENTAL STUDY

We address the following research questions.
RQ1: Does the alteration of the test execution sequence change the position of the knee in the defect coverage vs. code coverage curve?
RQ2: Can the curves help us determine which execution sequence is best, based on its effectiveness and cost?
The approach to comparing Test Case Prioritization (TCP) techniques is to first obtain several non-faulty programs, mutant faults, and test suites. Then, the prioritization techniques are applied to the test suites, the resulting ordered suites are executed, and measurements are taken of their effectiveness. The subject programs vary both in size, based on LOC (lines of code), and in functionality, as shown in Table I. This allows us to evaluate across a broad spectrum of programs. Let P be a non-faulty program and T be a test suite for P, where testing is concerned with validating program P. To facilitate this, engineers often begin by reusing T, but reusing all of T (the retest-all approach) can be inordinately expensive. Thus, we aim to find an approach for rendering reuse cost-effective through test selection and test case prioritization. Test Case Prioritization (TCP) techniques are one way to help speed up the testing process: they reorder the test cases in T such that testing objectives can be met more quickly. Our objective involves revealing faults, and we want to find the TCP techniques capable of revealing faults more quickly. Because TCP techniques do not themselves discard (less effective) test cases, they can avoid the drawbacks that can occur with test selection. Alternately, in cases where discarding test cases is acceptable, test case prioritization can be used in conjunction with test selection to prioritize the test cases in the selected test suite. Further, test case prioritization can increase the likelihood that testing time will be spent more beneficially than if test cases were not prioritized. Mutation testing, applied as a test case prioritization technique, measures how quickly a test suite detects the mutants in the testing process; testing sequences are rescheduled based on the rate of mutant killing. Automating test case prioritization can effectively improve the rate of fault detection of test suites. The tool MuJava [16] was used to seed mutation faults. Using MuJava, all possible faults within MuJava's parameters were generated for each sample program. Of these, faults that spanned multiple lines and faults in sample classes were deliberately omitted. Faults not inside methods (i.e., in class-variable declarations and initialization) were also omitted, because their coverage is not tracked by EclEmma, a free Java code coverage tool for Eclipse [12]. Equivalent mutants were not accounted for in this experiment, because it would have been infeasible to examine every mutant to see whether it could lead to a failure. A test suite was generated using Randoop, a free automatic unit test generation tool for Java [17], for each fault in the sample. For each (test suite, fault) pair, each test case was executed on the clean version of the application and on the faulty version to determine whether it covered the line containing the fault. To determine whether a test suite covered a faulty line, the coverage report from EclEmma was examined.

A. Challenges in the Study

The success of an experiment relies on real applications with natural faults. However, the number of natural faults is usually not sufficient to factor into the experiment. Similarly, when there are no faults in the programs, researchers have to seed faults to produce faulty versions; these faults are seeded using artificial fault injection tools such as MuJava. When natural test suites are available, they usually do not give good code coverage; to control this problem, more test cases need to be generated to obtain high code coverage. The alternative would be commercial software applications; however, such applications are restricted in use. Thus, we used the Randoop tool to generate test suites for the subject programs. The most well-known empirical approaches are controlled experiments and case studies. The advantage of a controlled experiment is that the independent variables can be manipulated to identify their impact on the dependent variables; therefore, the results will not depend on unknown factors. The disadvantages of such an approach are the threats to validity (see Section 5) due to the manufacturing of test cases and the seeding of defects. The advantage of case studies is that they are performed on real objects, which reduces the cost of code inspection, of artificially injecting faults, and of generating test cases.

B. Data Sets

Three subject programs, Students Project, Jtopas, and NanoXML, are written in Java. The faults in the Students Project (from a class project at CSU) are artificial faults caused by misconceptions of the project requirements; the test cases were developed based on the project specifications. NanoXML and Jtopas are open source software applications downloaded from the Software-artifact Infrastructure Repository (SIR) for experimentation [18]. For the Jtopas program, the test cases are available; however, there are no natural faults in Jtopas, so faults were injected using MuJava. For the NanoXML program, the faults are natural and test cases are not available; Randoop was used to generate the test cases for NanoXML. For all programs, test cases are developed in Java based on the JUnit testing framework. Table I summarizes the characteristics of the object programs.

Tab. I: Subject Programs Characteristics

Subject Program    Language   Type         Lines of Code   Downloads
Students Project   Java       Artificial   160             –
NanoXML            Java       Real         7,646           269
Jtopas             Java       Real         5,400           340

C. The Experimental Approach

This study executes the test cases in different orders and observes the effects on the relationship between defect coverage and code coverage. For each execution of the test suites, the test cases are ordered based on the prioritization techniques defined by Elbaum et al. [7]. The highest-priority tests are executed first to find faults faster and reduce the cost of the testing process. We order the test cases based on statement coverage. We investigate one version of each of the three subject programs. Each version has multiple defects identified by either bug reports or code inspection, and it is associated with a set of test cases that expose these defects. The test cases are ordered based on the Optimal, Random, and st-total prioritization techniques [7].
Tab. II: Experimental Tool Attributes

Tool      Type   Language          Purpose
Eclipse   Free   Java and others   JDK
Eclemma   Free   Java              Code Coverage
Randoop   Free   Java              Test Case Generation
MuJava    Free   Java              Artificially Seed Faults
The optimal technique optimally orders the test cases in a test suite; it assumes that the faults are known. The random technique randomizes the order of the test cases. St-total is an abbreviation of the total statement technique, in which the test cases are sorted using the statement coverage data of each test case. The data is collected in tabular form such that each row comprises the number of test cases, the cumulative code coverage, and the cumulative number of defects found. The prioritization techniques are applied to the subject programs, and the data is collected in Table III, Table IV, and Table V. To get more insight into the relationship between defect coverage and statement coverage, the data is displayed in plots, as shown in Figure 1, Figure 2, and Figure 3; the curves in the plots help to compare the prioritization techniques. The Eclipse IDE [8] is employed to compile and run the programs and test cases. The free Eclipse plug-in Java code coverage tool EclEmma [12] is used to exhibit the fraction of covered code and the number of defects found for a particular execution order of test cases; EclEmma measures the statement coverage of Java programs within the JUnit testing framework. Table II summarizes the attributes of these software tools. For the Students and NanoXML programs, test cases are automatically generated; in each execution sequence, the first test case is chosen and executed, and the data is collected. The subsequent test case is chosen with respect to the prioritization technique and executed in combination with the prior test cases. This process continues until all test cases in the execution sequence have been executed. This approach is not followed for the Jtopas program, because its faults are artificially seeded; there, the test cases are executed to kill the mutants, and the mutant scores are recorded using MuJava. To obtain the coverage data, test cases are cumulatively executed and the data is recorded (a sketch of this loop is given below).
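As a rough sketch of this cumulative data-collection loop (the coverage-measurement call is a placeholder, since the study reads coverage from EclEmma's report rather than through an API; class and method names here are illustrative only):

    import java.util.ArrayList;
    import java.util.List;
    import org.junit.runner.JUnitCore;
    import org.junit.runner.Result;

    public class CumulativeRunner {

        // Executes an already-prioritized list of JUnit test classes
        // cumulatively: after each test class is added, the accumulated prefix
        // is re-run and one (statement coverage, defects found) row is
        // recorded, yielding rows like those of Tables III-V. Counting each
        // failure as one defect is a simplification (see Section 5).
        static void collect(List<Class<?>> prioritizedTests) {
            List<Class<?>> prefix = new ArrayList<>();
            for (Class<?> next : prioritizedTests) {
                prefix.add(next);
                Result result = JUnitCore.runClasses(prefix.toArray(new Class<?>[0]));
                int defectsFound = result.getFailureCount();
                double coverage = readStatementCoverage(); // placeholder, see below
                System.out.printf("%d tests: SC=%.2f%%, DC=%d%n",
                        prefix.size(), coverage, defectsFound);
            }
        }

        // Placeholder: in the study, cumulative statement coverage is read
        // from EclEmma's coverage report rather than through an API.
        static double readStatementCoverage() {
            return 0.0;
        }
    }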
IV. RESULTS AND DISCUSSION

The analysis of the data is driven by the answers to the two questions, RQ1 and RQ2. Overall, the results indicate that there is enough statistical evidence to reject the null hypothesis. For question RQ1, the determination of the knee is based on manual extrapolation and is subjective: a straight line is fit to the part of the curve that contains the maximum number of points; the straight line along the maximum number of points represents the region of high defects found and high coverage, and the point of intersection of that straight line with the curve at the lowest possible statement-coverage value is the knee. Table VI shows the statement-coverage values at the knee points of the three test execution sequences for NanoXML, Jtopas, and the Students Project, respectively. The results confirm the hypothesis expressed in RQ1: the position of the knee is different for each prioritization technique. After the knee, the curves that represent the relationship between statement coverage and defects found increase semi-linearly.
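The line fitting just described can be made explicit with the standard least-squares formulas (our own addition; the paper performs the extrapolation manually), where x_i is statement coverage and y_i is cumulative defects found for the tail points:

    b = Σ (x_i − x̄)(y_i − ȳ) / Σ (x_i − x̄)²,    a = ȳ − b·x̄

The knee is then the lowest statement-coverage value x* at which the curve meets the fitted line y = a + b·x.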
Fig. 1: Test Execution Sequences of the NanoXML ((a) Optimal, (b) Random, (c) St-Total)

Fig. 2: Test Execution Sequences of the Jtopas ((a) Optimal, (b) Random, (c) St-Total)

Fig. 3: Test Execution Sequences of the Students Project ((a) Optimal, (b) Random, (c) St-Total)
Therefore, the highest defects found and statement coverage are attained after the knee. The position of the knee reflects the point at which a high number of defects and high statement coverage begin to be exposed. Furthermore, exposing fewer faults in the early stage of the execution of a test sequence pushes the knee further along the statement-coverage axis. Hence, the position of the knee is a good measure of the effectiveness of prioritization techniques. Building on the alteration of the knee position, the most effective test cases provide high coverage and high defect density. The knee position confirms that optimal prioritization outperforms st-total and random prioritization in all programs. For optimal, the most effective test inputs are scheduled first, causing most faults to be exposed early; hence the knee occurs at a lower statement-coverage value. The random prioritization technique is the least desirable for all programs, because the least effective test inputs may be non-deterministically executed near the beginning of the test suite. The least effective test cases exhibit low coverage and defect density. The st-total technique outperforms the random technique and is subjectively closer to the optimal technique in terms of the early occurrence of the knee. The effectiveness of the prioritization techniques in research question RQ2 is confirmed in the answer to question RQ1. The expensiveness/cost of employing the prioritization techniques can be assessed informally by inspection of Table III, Table IV, and Table V. For instance, the number of defects found and the coverage obtained by the execution of ten test cases for each technique in Table III are (31.90%, 27), (33.40%, 11), and (29.70%, 10) for optimal, st-total, and random, respectively. The number of defects found therefore compares as 27 > 11 > 10 for optimal, st-total, and random, respectively, which demonstrates that the optimal technique is less expensive than the st-total and random techniques. The coverage compares as 33.40% > 31.90% > 29.70% for st-total, optimal, and random, respectively; there is thus a trade-off between optimal and st-total. Because the number of defects found matters more than the coverage, the optimal technique is better than the st-total technique, and the random technique is clearly the most expensive. Following the same approach for each row in the three tables, we can conclude that the optimal and st-total prioritization techniques significantly outperform the random technique and that, on average, the st-total technique is closer to the optimal technique. Nevertheless, the above analysis is based on subjective observation of the data; the results should be supported by statistical evidence. Therefore, analysis of variance (ANOVA) is applied to measure the differences between the prioritization techniques in terms of the number of exposed defects. To acquire more insight into the differences between techniques, the Tukey statistical model is used to compare the prioritization techniques.

Tab. III: NanoXML TCP Techniques (Statement Coverage vs. Defect Coverage)

      Optimal         Random          st-total
N     SC       DC     SC       DC     SC       DC
1     2.90%    5      3.20%    0      3.50%    1
2     6.40%    9      6.40%    2      6.80%    2
3     9.60%    12     9.40%    2      10.20%   6
4     12.50%   15     12.70%   3      13.70%   8
5     15.90%   17     15.40%   4      17.10%   9
6     19.20%   19     18.70%   4      20.40%   11
7     22.30%   21     21.90%   6      23.70%   11
8     25.50%   23     24.70%   8      27.00%   11
9     28.70%   25     26.60%   8      30.20%   11
10    31.90%   27     29.70%   10     33.40%   11
11    35.10%   29     33.20%   12     36.50%   13
12    37.90%   31     33.90%   12     39.70%   15
13    40.90%   33     36.50%   12     42.80%   16
14    44.30%   34     39.80%   14     46.00%   16
15    47.70%   35     42.70%   19     49.20%   18
16    51.10%   36     45.90%   20     52.50%   21
17    54.20%   37     49.00%   20     55.60%   23
18    56.90%   38     52.30%   20     58.80%   25
19    59.30%   39     55.20%   22     61.80%   25
20    62.60%   39     58.00%   22     64.70%   27
21    65.90%   39     61.20%   24     67.60%   32
22    69.20%   39     64.60%   25     70.50%   35
23    72.30%   39     67.50%   28     73.40%   37
24    75.50%   39     70.70%   31     76.10%   38
25    78.50%   39     74.20%   32     78.90%   38
26    81.30%   39     77.60%   36     81.50%   38
27    83.90%   39     80.70%   38     83.90%   39
28    85.70%   39     83.10%   39     85.70%   39
29    86.40%   39     86.40%   39     86.40%   39

Tab. IV: Jtopas TCP Techniques (Statement Coverage vs. Defect Coverage)

      Optimal         Random          st-total
N     SC       DC     SC       DC     SC       DC
1     23.80%   9      0.40%    0      23.80%   9
2     37.60%   13     8.70%    2      34.70%   10
3     43.80%   17     10.00%   2      47.20%   14
4     47.20%   21     11.20%   4      53.20%   18
5     51.40%   24     11.60%   5      57.40%   21
6     54.10%   26     12.50%   7      60.70%   25
7     56.50%   28     21.10%   11     63.30%   27
8     57.70%   30     22.40%   13     65.80%   29
9     58.70%   32     24.70%   15     66.90%   31
10    60.10%   34     24.90%   15     68.00%   33
11    61.30%   36     36.40%   19     69.30%   35
12    61.60%   38     36.60%   20     70.20%   35
13    61.90%   40     37.10%   20     70.70%   35
14    62.40%   42     40.40%   24     72.00%   37
15    62.90%   44     44.60%   27     72.30%   39
16    63.40%   46     48.90%   28     72.60%   41
17    63.70%   48     49.20%   28     73.10%   43
18    72.90%   49     50.20%   30     73.60%   45
19    77.10%   50     52.60%   32     77.80%   46
20    77.40%   51     53.90%   34     78.10%   46
21    77.60%   52     54.30%   36     78.30%   47
22    77.70%   53     54.60%   38     78.40%   48
23    78.60%   53     63.80%   39     78.60%   49
24    79.10%   53     64.10%   41     79.40%   49
25    79.40%   53     64.40%   41     79.50%   49
26    80.20%   53     65.40%   41     80.00%   51
27    80.30%   53     65.90%   43     80.30%   53
28    80.60%   53     66.10%   44     80.60%   53
29    80.80%   53     80.80%   53     80.80%   53

Tab. V: Students Project TCP Techniques (Statement Coverage vs. Defect Coverage)

      Optimal         Random          st-total
N     SC       DC     SC       DC     SC       DC
1     23.90%   2      6.00%    1      24.20%   1
2     48.00%   4      27.00%   3      62.00%   1
3     52.00%   5      47.00%   4      66.00%   3
4     55.00%   7      52.00%   4      66.40%   3
5     64.00%   9      52.60%   4      73.00%   5
6     65.00%   10     68.00%   4      73.20%   7
7     66.00%   11     69.00%   6      76.00%   9
8     72.00%   12     78.00%   7      84.00%   9
9     86.00%   13     82.00%   7      85.00%   11
10    86.50%   14     82.30%   7      85.40%   13
11    88.00%   14     83.00%   9      88.00%   13
12    90.00%   15     83.60%   11     88.20%   13
13    90.60%   15     87.00%   11     88.40%   14
14    90.80%   15     87.80%   13     88.80%   14
15    91.00%   15     91.00%   15     91.00%   15

Tab. VI: Positions of Knees (coverage value at the knee)

Execution Order   NanoXML   Jtopas   Students Project
Optimal           14%       42%      45%
st-total          45%       49%      60%
Random            60%       53%      67%
The analysis of variance shows no significant differences between the prioritization techniques when all test cases (100%) are executed. This is expected, because when all the test cases are executed we obtain the same code coverage and defects-found values. A prioritization technique is considered effective if, when the testing process is prematurely halted for some reason, a large number of defects has already been detected. More precisely, we can compare the prioritization techniques based on the number of defects covered after applying about 50% of the test cases. Following this approach, the ANOVA and Tukey analyses are applied to compare the prioritization techniques after applying around 50% of the test cases. A one-way ANOVA was conducted to compare the effect of execution order on defects found in the optimal, st-total, and random conditions. As illustrated in Table VII, there was a significant effect of execution order on defects found at the p < .05 level for the three conditions [F(2, 42) = 19.04, p = 0.000].

Tab. VII: The ANOVA Analysis for NanoXML (Defects versus Number of Test Cases)

Source   DF   SS       MS      F       P
Def      2    1795.2   897.6   19.04   0.000
Error    42   1979.9   47.1
Total    44   3775.1
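As a quick arithmetic check on Table VII (our own calculation, implied by the table rather than stated in the paper):

    MS_Def   = SS_Def / DF_Def     = 1795.2 / 2   = 897.6
    MS_Error = SS_Error / DF_Error = 1979.9 / 42  ≈ 47.1
    F        = MS_Def / MS_Error   = 897.6 / 47.1 ≈ 19.04

which matches the reported F(2, 42) = 19.04.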
Post hoc comparisons using the Tukey HSD test, illustrated in Figure 4, indicated that the mean score for the optimal condition (M = 22.33, SD = 9.40) was significantly different from the random condition (M = 7.73, SD = 5.37); however, the st-total condition (M = 10.60, SD = 4.89) did not differ significantly from the random condition.

Fig. 4: Tukey HSD Test for NanoXML (individual 95% CIs for the mean, based on pooled StDev)

Level          N    Mean     StDev
1 (st-total)   15   10.600   4.896
2 (optimal)    15   22.333   9.409
3 (Random)     15   7.733    5.378
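For orientation, the Tukey HSD criterion behind Figure 4 can be reconstructed (our own back-of-the-envelope calculation; q ≈ 3.44 is an approximate studentized-range critical value for k = 3 groups and 42 error degrees of freedom):

    HSD = q(0.05; 3, 42) · sqrt(MS_Error / n) ≈ 3.44 · sqrt(47.1 / 15) ≈ 6.1

The optimal vs. random mean difference (22.33 − 7.73 = 14.60) and the optimal vs. st-total difference (22.33 − 10.60 = 11.73) both exceed this threshold, while the st-total vs. random difference (10.60 − 7.73 = 2.87) does not, consistent with the reported conclusions.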
Taken together, these results suggest that the execution order really does have an effect on defects found. Specifically, our results suggest that when test cases are ordered optimally, errors are exposed early. The same statistical analysis was conducted for the Jtopas and Students projects, with the results shown in the following tables. In summary, there is enough statistical evidence of differences between the prioritization techniques. The optimal technique is statistically more effective than the random technique in all programs. For NanoXML, the optimal technique significantly outperforms st-total. For the Jtopas and Students programs, optimal is not statistically more effective than st-total; st-total is more effective than the random technique for the Jtopas program, and the two are closer for NanoXML and the Students project. Finally, the random technique exposes the smallest number of defects in all programs.

Tab. VIII: The ANOVA Analysis for Jtopas (Defects versus Number of Test Cases)

Source   DF   SS       MS       F       P
Def      2    2435.4   1217.7   12.73   0.000
Error    42   4017.6   95.7
Total    44   6453.0

Fig. 5: Tukey HSD Test for Jtopas (individual 95% CIs for the mean, based on pooled StDev)

Level          N    Mean     StDev
1 (st-total)   15   26.533   9.970
2 (optimal)    15   28.933   10.620
3 (Random)     15   12.267   8.648

Tab. IX: The ANOVA Analysis for Student's Project (Defects versus Number of Test Cases)

Source   DF   SS      MS     F      P
Def      2    81.7    40.8   3.19   0.057
Error    27   345.8   12.8
Total    29   427.5

Fig. 6: Tukey HSD Test for Student's Project (individual 95% CIs for the mean, based on pooled StDev)

Level          N    Mean    StDev
1 (st-total)   10   6.200   4.237
2 (optimal)    10   4.700   2.003
3 (Random)     10   8.700   4.057
To gain more statistical evidence about the differences between the techniques, we attempted to fit a collection of statistical models to the curves in the plots, including:
the linear regression model, the quadratic model, the cubic model, and the logistic model, and then to compare the R² values (goodness of fit). However, for each program there was no single model that fit all the curves, and thus we could not compare the R² values.

V. THREATS TO VALIDITY

In many empirical studies, such as the one presented here, there are potential threats to validity, both construct and external. In this section, we discuss these threats and explain how to reduce their likelihood.

A. Threats to Construct Validity

The inferences in the foregoing section can be affected by the following factors.

Threats due to incorrect estimation of the total number of faults. For instance, the Students project has artificial faults. These faults are detected by manual inspection of the source code and by designing new test cases, and they may not be the complete set of faults present in the program. However, it can be argued that the detected faults are a representative sample of the complete set of faults, and that any analysis of the detected set is also valid for the whole population.

Threats due to the presence of multiple faults. For the Students project, we only considered artificially occurring faults, all of which are present simultaneously in the program. Consider a test case that sensitizes two faults, both of which falsify the condition stated in the assert statement of the JUnit test case. When the JUnit test case fails, it is not possible to identify that the failure was the result of sensitizing multiple faults; the failure will be counted as the detection of one fault, and the fault detection ability of the test case will be underestimated. One way to avoid this problem is to isolate the faults, so that whenever there is a failure in the presence of a single fault, it is certain that the failure is due to that fault only. Mutation testing resolves the problem of multiple coexisting faults in a program; the existence of multiple faults otherwise requires extra effort to identify which fault is sensitized by exercising a particular test case.

Threats due to artificial faults. Faults are artificially inserted into Jtopas. The mutation system applies one mutation at a time and creates a new mutant program for every fault. Moreover, the distribution of the artificial faults may not be the same as that of natural faults, which may result in completely different behavior of the defect coverage vs. statement coverage relationship for artificial faults. Code inspection and development documentation would be required to gain more insight into the natural faults. However, a study by Smith et al. [21] demonstrated the efficiency of the MuJava tool: its mutation operators seed faults quite similar to natural faults.

Threats due to required prior knowledge. Optimal testing assumes prior knowledge of the defects, and the st-total strategy assumes prior knowledge of the coverage of each test case. The two strategies are applicable only when such prior knowledge is available.
B. Threats to External Validity

The generalization of the results can be affected by the following factors.

Threats due to the representativeness of the object programs. The object programs are of small and medium size. However, two of these objects are real applications, and the fault patterns are natural in two objects.

Threats due to automatic test case generation. The test cases for NanoXML are generated using the Randoop tool [17]. This tool automatically generates hundreds of test cases for a given set of classes within a limited time interval. As shown earlier, these test cases achieve significant statement coverage. Furthermore, a large number of the generated test cases are duplicates; such test cases do not factor into the study and are excluded. The elimination process requires investigating all test cases in order to remove the insignificant ones, as there is no way for the tester to choose argument values so as to reduce the replication of test cases.

VI. CONCLUSION AND FUTURE WORK

We conducted an experiment to evaluate how defect coverage varies with statement coverage for varying test execution sequences. Two prioritization techniques (optimal and st-total), relative to random testing, were examined for altering the test case execution order. The results of the study indicate that the position of the knee is altered by changing the test case execution order. We also showed statistical evidence that the optimal prioritization technique outperforms the random and st-total techniques, and that the random technique is the least effective. The presented impact of time and the synthesis of threats to validity open the door to future research. To cope with the threats to validity, we need to extend this study with more subject programs; such programs should be associated with complete and accurate fault data and change log files. Preferably, we intend to investigate test cases developed by test engineers during development and maintenance practices. We also need to seek an automated tool that, given various metrics about programs, modifications, and test suites, can predict the prioritization technique most likely to succeed. The factors affecting prioritization success are, however, complex, and they interact in complex ways. We do not possess sufficient empirical data to allow the creation of such a general prediction algorithm, and the complexities of gathering such data are such that it may be years before it can be made available. Moreover, even if we possessed a general prediction algorithm capable of distinguishing between existing prioritization techniques, such an algorithm might not extend to additional techniques that may be created.

REFERENCES
[1] N.S. Akbar, J.H. Andrews, and D.J. Murdoch. Sufficient mutation operators for measuring test effectiveness. In Proceedings of the 30th International Conference on Software Engineering, ICSE '08, pages 351–360, New York, NY, USA, 2008. ACM.
[2] J.H. Andrews, L.C. Briand, and Y. Labiche. Is mutation an appropriate tool for testing experiments? In Proceedings of the 27th International Conference on Software Engineering, page 411. ACM, 2005.
[3] P. Bishop. Estimating residual faults from code coverage. Computer Safety, Reliability and Security, pages 325–344.
[4] X. Cai and M.R. Lyu. The effect of code coverage on fault detection under different testing profiles. ACM SIGSOFT Software Engineering Notes, 30(4):7, 2005.
[5] X. Cai and M.R. Lyu. Software reliability modelling with test coverage: Experimentation and measurement with a fault-tolerant software project. 2007.
[6] H. Do and G. Rothermel. On the use of mutation faults in empirical assessments of test case prioritization techniques. IEEE Transactions on Software Engineering, 32(9):733–752, 2006.
[7] S. Elbaum, A.G. Malishevsky, and G. Rothermel. Test case prioritization: A family of empirical studies. IEEE Transactions on Software Engineering, 28(2):159–182, 2002.
[8] The Eclipse Foundation. Eclipse IDE, 2010. http://www.eclipse.org/.
[9] P.G. Frankl and E. Weyuker. An applicable family of data flow testing criteria. IEEE Transactions on Software Engineering, 14(10):1483–1498, 1988.
[10] S. Goren and F.J. Ferguson. Test sequence generation for controller verification and test with high coverage. ACM Transactions on Design Automation of Electronic Systems (TODAES), 11(4):916–938, 2006.
[11] C. Henard, M. Papadakis, M. Harman, Y. Jia, and T.Y. Le. Comparing white-box and black-box test prioritization. In Proceedings of the 38th International Conference on Software Engineering, pages 523–534. ACM, 2016.
[12] M.R. Hoffmann, B. Janiczak, and E. Mandrikov. EclEmma - JaCoCo Java code coverage library, 2011.
[13] Y. Jia and M. Harman. An analysis and survey of the development of mutation testing. IEEE Transactions on Software Engineering, 37(5):649–678, 2011.
[14] R. Just. The Major mutation framework: Efficient and scalable mutation analysis for Java. In Proceedings of the 2014 International Symposium on Software Testing and Analysis, ISSTA 2014, pages 433–436, New York, NY, USA, 2014. ACM.
[15] Y.K. Malaiya, C. Braganza, and C. Sutaria. Early applicability of the coverage/defect model. In Software Reliability Engineering, pages 127–128, 2005.
[16] J. Offutt. muJava - a mutation system for Java programs, November 2008. http://cs.gmu.edu/~offutt/mujava/.
[17] C. Pacheco and M.D. Ernst. Randoop: feedback-directed random testing for Java. In Companion to the 22nd ACM SIGPLAN Conference on Object-Oriented Programming Systems and Applications, pages 815–816. ACM, 2007.
[18] G. Rothermel, S. Elbaum, A. Kinneer, and H. Do. Software-artifact Infrastructure Repository (SIR), 2010. http://sir.unl.edu/portal/index.html.
[19] G. Rothermel, R.H. Untch, C. Chu, and M.J. Harrold. Test case prioritization. IEEE Transactions on Software Engineering, 27(10):929–948, 2001.
[20] A. Schwartz and M. Hetzel. The impact of fault type on the relationship between code coverage and fault detection. In 2016 IEEE/ACM 11th International Workshop on Automation of Software Test (AST), pages 29–35, May 2016.
[21] B.H. Smith and L. Williams. An empirical evaluation of the MuJava mutation operators. In Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007), pages 193–202, 2007.
[22] W.E. Wong, J.R. Horgan, S. London, and A.P. Mathur. Effect of test set size and block coverage on the fault detection effectiveness. In Proceedings of the 5th International Symposium on Software Reliability Engineering, pages 230–238, 1994.
[23] L. Zhang, D. Hao, L. Zhang, G. Rothermel, and H. Mei. Bridging the gap between the total and additional test-case prioritization strategies. In 2013 35th International Conference on Software Engineering (ICSE), pages 192–201. IEEE, 2013.
Behavior Driven Test Automation Framework

Ramaswamy Subramanian
Dept. of Computer Science, California State University, Fullerton
[email protected]

Ning Chen
Dept. of Computer Science, California State University, Fullerton
[email protected]

Tingting Zhu
Dept. of Computer Science, California State University, Fullerton
[email protected]
Abstract—Technological advancements and the concept of cloud computing have led today's software industry to adopt agile development practices and have made web-based software applications an integral part of our lives. As cost reduction and cost control are two important driving factors for any business, the success of a business relies on the quality of its software systems. Web application test automation is a key component in achieving superior quality, but complex technological features and rapid application development environments create a multitude of challenges in implementing a powerful, reliable, and robust test automation framework that can support testing multi-tier architectural software systems: automated testing of the user layer, the technical and business layers, and the data layer. In this paper, our proposed solution, the Behavior Driven Test Automation Framework (BDTAF), addresses all of the above constraints: the framework can be implemented and maintained at low cost and can support test automation of all the layers of a three-tier architectural system, which acts as a driving factor behind substantially improving the test coverage, quality, and reliability of the software system.

Keywords—Web application testing, GUI automation testing, Database testing, Services automation, Business driven testing, Agile test automation, Test driven development, Selenium WebDriver, FitNesse.
I. INTRODUCTION

With rapid advancement in the technological arena and the introduction of ever more robust and easy-to-use open-stack web development tools and cloud computing technologies, web applications have become more predominant in today's era than standalone, desktop-based applications. Advances in server-side and client-side scripting and programming languages have made web application development easier by compressing complex business logic into a couple of lines of code, making web applications more polished than ever before. In addition, following agile development methodologies requires more frequent releases of working software systems to the customer. These technological advancements and project environments create unique challenges for software project teams involved in web application testing, especially test automation. It is true that many organizations do not invest enough resources in test automation, due to the cost and complexity involved in setting up and maintaining test automation projects; but a robust test automation framework and appropriate automated regression test cases are the only guaranteed way to thoroughly regression-test software systems and support continuous releases. Without test automation and proper regression testing, a product release can become an absolute failure on the very
first day of the release. A classic example is the technical issues encountered by users of HealthCare.gov [11], a health insurance exchange website operated by the United States federal government, on the first day of its launch (October 1, 2013). In this paper, we propose a powerful and robust test automation framework, the Behavior Driven Test Automation Framework (BDTAF), to support web application test automation and regression testing; it is simple and robust enough that both technical and non-technical project team members can build and execute automated test cases without any prior experience in writing automation tests. BDTAF can seamlessly support test automation of web applications built on a three-tier architecture, and the framework can easily be implemented in agile development projects, the hottest trend in today's market: business analysts and even business users of the organization can provide their requirements in a customized, pre-defined test format, and the tests can be easily incorporated into the existing suite. This cost-effective test automation solution is built by integrating two powerful open source technologies, Selenium WebDriver [1] and FitNesse [2], which keeps the implementation and maintenance cost of test automation very small for project management and helps in quickly realizing the return on investment by achieving the highest level of automation success, improvement in test efficiency, the highest possible test coverage, and superior product quality with every release of the product.

II. RELATED WORK

Successful test automation requires a very powerful test strategy, and, as per R. Potter [3], test automation through user interface interaction has existed since the early 1990s. A very common methodology is Record and Playback [4], where a script or tool captures the keyboard and mouse events while users execute manual tests in a software application, and another driver script plays back these recorded tests against the same application with multiple sets of data. This approach relies heavily on screen coordinates and is hence very brittle and susceptible to failures. Later, more refined test automation modeling frameworks appeared, such as WTM [5], a web application test model presented by Kung et al., where, using forward engineering and reverse engineering tools, static and dynamic test artifacts of the web application are extracted to create a WTM instance model; through that model, behavior and structural tests are derived to complete the test automation.
This test model achieves optimal testing by deriving structural and behavioral test cases, but the setup and maintenance of the test models require heavy investment in cost and resources, as forward and reverse engineering tools are needed to decode the web application so that WTM can parse the necessary information. All these moving parts make maintenance of the framework expensive and the process cumbersome. DART [13] is a regression testing framework supporting nightly builds of web applications by ripping the GUI's underlying Document Object Model information, creating event-flow graphs, and setting up integration trees. This framework is primarily intended for smoke and regression testing of the GUI of a web application that is still under development. Having an automated framework that can create test oracles based on an ever-changing GUI during development increases the efficiency of the development process; nevertheless, this framework does not support full-fledged regression testing of all the business functionalities or of all the layers involved in the application under test. STIQ [6], a story-testing framework, helps build complex test cases by using reusable test components written in the form of stories in the agile context; it supports data-driven testing and helps users define acceptance tests in the form of user stories. Xebium [7] is a data-driven testing framework supporting integration testing of different components in a web application by recording tests using Selenium IDE, an open source tool that has since been deprecated. Both frameworks rely primarily on Selenium IDE to support test automation; as the underlying IDE support has been deprecated, these frameworks cannot support robust test automation needs for software applications developed using current technology. NTAF [14] is an automation tool developed to support test automation of the services layer of web applications, facilitating primarily performance and stress testing by leveraging FitNesse and STAF; it enables communication among developers and stakeholders while executing tests both locally and on remote client/server systems. As its goal indicates, its main intention is to support test automation of the API layer only; it cannot be used to support test automation of the GUI or database layers of web applications. Bures et al. present SmartDriver [8], an extension of Selenium WebDriver that helps users create more efficient automated test cases for web and mobile application automation by providing an advanced mechanism for writing adaptive test cases and a mechanism for script developers to understand test failures easily, which in turn makes script maintenance very adaptive. Nevertheless, this framework focuses only on the GUI layer of a web application, making it support only one layer of
software systems developed using three-tier architecture frameworks. Castor et al. present an extension of Selenium RC to perform database testing [9], which helps validate the data involved in a web application by extending the features of Selenium RC to connect to and query databases and compare the stored data against the expected data set. This framework also relies on the deprecated Selenium RC framework as its underlying principle and supports only one part of the testing needs of three-tier web applications. Our suggested framework, BDTAF, overcomes the constraints of the existing test automation solutions and can be used to build automated test cases to test any web application's (i) GUI or user layer, (ii) services (API) or business layer, and (iii) data layer. Thus it can provide the well-rounded automation coverage needed for any software project. In addition, the automated test cases can be written in a simple, easy-to-understand, table-like format in which any business analyst or business user can write agile story-based automated test cases that can reliably be used to test complex business functionalities.

III. BEHAVIOR DRIVEN TEST AUTOMATION FRAMEWORK

Crispin et al. [10] state that agile testing is a way to learn the application such that customer stories drive the testing, in line with the agile value of responding to change with working software. Some key factors for successful agile testing projects are (i) continuous collaboration with customers, (ii) automated regression testing, (iii) adopting an agile mindset, (iv) implementing an end-to-end test solution, and (v) using the whole-team approach. The core concept of our new approach, the Behavior Driven Test Automation Framework (BDTAF), lives by the above principles. BDTAF can be implemented as an end-to-end test automation solution for any web-based software application by means of (i) automated tests that validate the front-end GUI layer, (ii) automated tests that validate the middle or services layer of the application, (iii) automated tests that include database validation, and, more importantly, (iv) automated tests that can be written and executed by technical team members such as developers and automation test analysts as well as non-technical team members such as functional quality assurance analysts, business analysts, business users, and stakeholders. The sections below explain the BDTAF architecture, deployment, and implementation details.

A. BDTAF – Architecture

BDTAF is built by integrating two powerful test automation solutions available in the current market, Selenium WebDriver and FitNesse. Both are open source frameworks built with the capability to extend to any test automation needs of web-based software applications.
Each of these frameworks has its own advantages and disadvantages, but integrating the two gives users enormous potential and power to successfully complete any kind of test automation need for a software project.

Selenium WebDriver is known for its robustness and flexibility in automating web-based software applications in a multitude of browsers. An array of available programming-language bindings gives users the flexibility to write tests in their own choice of programming language. Leveraging Selenium's remote WebDriver features, automated test execution can be done either in the local development environment or in a dedicated remote test environment.

Likewise, FitNesse, which comprises the FIT engine, FitLibrary, and FitLibraryWeb, is very popular for its easy-to-use wiki-based test writing style. Fit is an open source tool used for automating customer tests; FitNesse is a wiki built on top of Fit, used for automating acceptance testing [15], such that any project team member can write automated tests using a predefined syntax in a table-like wiki format. FitNesse allows excellent collaboration between all the stakeholders of the project and enables communication between the project team members and customers.

The aim of BDTAF is to exploit the advantages of both tools, such that automated tests can be easily developed, executed against a multitude of environments and browsers based on the project needs, and easily maintained. In addition, by implementing a continuous integration model, the regression test suites can be scheduled to execute against deployed build servers after every check-in, and automated tests can even be written before the development activity is completed, supporting the core agile practice of test-driven development. Test-driven development, or test-first development, is a strategy for shortening the cycles between test development and test execution [12]; the aim of this approach is to find ambiguous or incomplete requirements early in the development life cycle, so that rework effort can be dramatically reduced in the later stages of development or maintenance. A few important and notable features of our approach are:

• Test automation support for the GUI layer, API layer, and database layer of a web application.
• Automated test execution of the application under test using a multitude of browsers (Firefox, Chrome, Edge, Safari, Internet Explorer, etc.) and operating systems (Windows, Linux, etc.), by leveraging the power of Selenium WebDriver.
• Test execution support in the local development environment as well as in remote and distributed environments, by implementing remote WebDriver capabilities inside our framework.
• Enabling both technical and non-technical team members to write and execute automated test cases.
• Enabling continuous collaboration between the project team members and the business users and stakeholders of the application, resulting in faster feedback on developed features; requirements can be delivered in the form of acceptance tests.
• Supporting the agile model by implementing test-driven development and involving all project team members in writing and executing automated tests.

As illustrated below in Figure 1, Selenium WebDriver acts as the underlying platform for executing the automated tests against the system under test (SUT). The users interact with FitLibraryWeb and write automated test cases in the predefined wiki format. The automated tests triggered from FitLibraryWeb invoke the FIT engine, are filtered through the BDTAF fixture layer, channeled through the correct test model, and communicated to Selenium WebDriver. The tests are then executed against the SUT; the results are captured and communicated back to BDTAF, which parses the relevant information, channels it back through the FIT engine, and displays the results in FitLibraryWeb to the user in a simple,
understandable, and intuitive format, which completes the full cycle of the test execution activity.

B. BDTAF – Automation Support for Three-Tier Architecture Systems

As shown in Figure 2, BDTAF consists of three independent Java classes which act as the core modules of the framework. These Java classes interpret the commands coming from the user-triggered tests, determine what type of testing to perform on the SUT, and provide the results back to the user.
The three core Java classes of the framework are:

• SeleniumWebFixture
• SeleniumServicesFixture
• SeleniumDatabaseFixture

As the names indicate, SeleniumWebFixture is the primary class, where all the necessary commands and methods have been implemented to drive a web application and perform GUI-based test automation. This module acts as the heart of the framework, where all the necessary command transformation happens for successfully completing the test automation. A few important and key implementations in this fixture, which are very common commands used in GUI-based automated test cases, are listed below:

• start browser: starts the user-defined browser.
• open url: navigates to the website the user wants to test.
• sendKeys: types user-supplied data into web application input fields.
• selectItem: selects the appropriate data from select boxes and combo boxes in the web application.
• clickElement: clicks a button or hyperlink in the web application.
• verifyElement: performs an assertion in the test to make sure the expected element is available in the SUT.
• verifyText: performs an assertion in the test to make sure the expected text is available in the SUT.
• captureScreenshot: captures a screenshot of the SUT wherever the user wants to see the state of the system during the test at run time and during any failures.
• close browser: closes the browser successfully after completing the testing.

By reading the above sample commands, it is clear that anyone on the project team who wants to write and build automated test cases can use these commands, follow the sequential steps of the functional test cases, and write the automated tests in the predefined wiki format; a minimal sketch of how such commands can wrap Selenium WebDriver appears below. The user writing automated tests does not need to worry about Java syntax, declaring variables, or following any kind of programming paradigm, as BDTAF encapsulates all those nuances and enables test cases to be written in simple, plain-English statements. Similarly, the test commands required for automating the services or API layer of the SUT are implemented in the SeleniumServicesFixture class; this class has commands like GET and POST and helps users write API-layer automated test cases.
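For illustration only, here is a stripped-down sketch of how a GUI fixture of this kind might wrap Selenium WebDriver; the paper does not publish BDTAF's source, so the method bodies below are our own guesses at an obvious implementation, using standard WebDriver calls:

    import org.openqa.selenium.By;
    import org.openqa.selenium.OutputType;
    import org.openqa.selenium.TakesScreenshot;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;

    // Hypothetical reconstruction of a few SeleniumWebFixture commands.
    public class SeleniumWebFixture {
        private WebDriver driver;

        public void startBrowser(String browser) {
            // A real fixture would switch on the browser name; Firefox shown here.
            driver = new FirefoxDriver();
        }

        public void openUrl(String url) {
            driver.get(url);
        }

        public void sendKeys(String locatorId, String data) {
            driver.findElement(By.id(locatorId)).sendKeys(data);
        }

        public void clickElement(String locatorId) {
            driver.findElement(By.id(locatorId)).click();
        }

        public boolean verifyText(String locatorId, String expected) {
            // Returned booleans are rendered by FitNesse as pass/fail cells.
            return driver.findElement(By.id(locatorId)).getText().contains(expected);
        }

        public byte[] captureScreenshot() {
            return ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES);
        }

        public void closeBrowser() {
            driver.quit();
        }
    }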
Finally, SeleniumDatabaseFixture implements the methods needed to perform database-related validations as part of test automation. Since no web application test is complete without validating the underlying data stored in the database, BDTAF provides a mechanism to evaluate the resulting data as part of either GUI-based or API-based tests. The framework can validate expected data against the actual data from a multitude of databases, such as Microsoft SQL Server, IBM DB2, MySQL, Oracle and PostgreSQL, provided a valid connection string and query are passed in the test along with the expected data. C. BDTAF – Elegant automation test cases As detailed in Section III-B above, BDTAF lets users write automation test cases in a simple, easy-to-read and understandable format by following a structured mechanism. A test case is a wiki-format table with three columns: the first column specifies the command to perform, the second column specifies the identifier or element locator in the web application, and the third column specifies the test data to be passed into the SUT. Consider the following example of testing the login page of a web email application. The necessary steps for a positive test are given below:
Start browser: FitNesse sends the raw test request to BDTAF to launch a browser. BDTAF parses the request, identifies the browser, and invokes Selenium WebDriver to launch the correct browser type.
Navigate to the target web email application (gmail.com): BDTAF's SeleniumWebFixture passes commands to the browser to navigate to gmail.com.
Verify the login button is available: the verification method in BDTAF's SeleniumWebFixture is called to complete the assertion.
Type the user name into the Login field: SeleniumWebFixture sends the parsed element id and data to WebDriver, which locates the Login field in the web form and types the data into it.
Type the password into the Password field: SeleniumWebFixture sends the parsed Password element id and encrypted data to WebDriver to type into the password field.
Click the Login button: a click command is issued from SeleniumWebFixture to the browser to locate and click the login button of the Gmail application.
Verify there is no error: the verification method in SeleniumWebFixture completes the assertion that the application did not display any validation error.
Verify that the link with the username is displayed on the home page of the email application: the verification method in BDTAF's SeleniumWebFixture completes the assertion, and the parsed, user-friendly result is displayed back in FitNesse.
Log out of the email application: FitNesse sends the raw test request to log out. SeleniumWebFixture parses the request, identifies the element id and the operation to perform, and instructs Selenium WebDriver to complete the operation.
Close the browser: SeleniumWebFixture issues a shutdown command to Selenium WebDriver to close the browser and complete the test. BDTAF passes all parsed verification results back to FitNesse to display to the user in a user-friendly format.
The above sequence of steps can be written in wiki test table format, as illustrated below. Once the test is executed as explained in Section III-A, the results are displayed to the user in a formatted layout: green shading confirms successful results, whereas red shading indicates a failure, that is, a deviation of the application's behavior from the expected result.
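The wiki table itself appears as a figure in the original layout. As an illustration of the three-column format described in Section III-C, a hypothetical version of the login test table could look as follows (the element locators and data values are invented placeholders):

| start browser  | firefox           |           |
| open url       | https://gmail.com |           |
| verifyElement  | loginButton       |           |
| sendKeys       | usernameField     | testuser  |
| sendKeys       | passwordField     | secret123 |
| clickElement   | loginButton       |           |
| verifyText     | accountLink       | testuser  |
| clickElement   | logoutLink        |           |
| close browser  |                   |           |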
If there is an error in the input data, for example when the username and password combination is mistyped in the above example, the resulting failed test case is formatted in red.
D. BDTAF – Maintenance
Maintenance being the pain point of any automated test framework, we have architected BDTAF so that its three core modules are completely self-contained and are not affected by new releases of the underlying open-source tools. Whenever there is a new release of Selenium WebDriver or FitNesse, we can upgrade to the new stable version by dropping the new jar files into the framework library and continue supporting test automation with the following simple steps:
1. Download the new stable release of the underlying open-source tool (Selenium WebDriver or FitNesse) from its official download page.
2. Copy the jar files and their source files, with all dependencies, into the BDTAF library directory.
3. Remove the old, deprecated libraries.
4. Update the class path settings of the project to reference the new libraries.
5. Recompile the framework core classes against the new libraries.
6. Execute the tests in the existing framework setup with all new dependencies.
7. Check the new libraries into the source code control system to trigger the continuous integration builds using the new libraries.
The whole upgrade process acts as simple plug and play, so framework maintenance and upgrades can happen without affecting the productivity of the team.
IV. CONCLUSION
This paper summarizes the key features and benefits of BDTAF, a powerful and robust test automation framework that can be quickly implemented and leveraged to create and maintain a solid test automation solution. The framework enables management and project teams to participate in the test automation process and improves the visibility of test reporting. It also improves communication among project teams, including developers, testers, business analysts, business users and stakeholders. BDTAF's strength in comparison with related work is its elegant test automation support for web applications built on a multi-tiered architecture: end-to-end test automation covering the GUI layer, API layer and database layer can be implemented for both simple and complex software applications using this framework.
With its continuous integration support model, BDTAF enables users to write automation tests even before development is complete and to follow a test-driven development methodology. The framework is also easily extendable to support additional browser versions and operating systems. Software project teams can benefit greatly from BDTAF, which bridges the complexities involved in developing and maintaining a powerful and robust test automation solution, ensuring product quality in every release and supporting core Agile principles without increasing project cost.
Efficient Component Integration Testing of A Landing Platform for Vertical Take Off and Landing UAVs A. Andrews, A. Gannous, A. Gario, and M. J. Rutherford Department of Computer Science, University Of Denver, Denver, Colorado, USA Abstract—This paper describes a multi-phase test generation strategy for testing systems whose components function in parallel. In this type of system, some or all components do not need to communicate directly with each other. Testing concurrent functionality requires an efficient method to generate test cases for concurrent behavior. Existing concurrent coverage criteria such as Rendezvous are not applicable, since the components do not need to interact with each other. In this paper we propose a new, efficient serialization technique to execute test paths of concurrent system components. We used Extended Finite State Machines (EFSMs) as a model-based test generation technique to produce test paths for each component. We apply this strategy to a fully operational prototype of a mobile landing platform robot designed to launch, recover and re-launch Vertical Take Off and Landing (VTOL) Unmanned Aerial Vehicles (UAVs).
Keywords: Model-Based Testing, Combinatorial Testing, Concurrent Coverage Criteria.
1 INTRODUCTION
Systems consisting of components that function in parallel exist in many civil and military applications, such as search and rescue robots. Testing this type of system as a black box requires testing system components that operate concurrently, which poses challenging combinatorial problems. The proposed test suite generation process for this type of system follows a series of steps. First, it generates test paths for each component independently; for this step, model-based testing is used to model each component's behavior using Extended Finite State Machines (EFSMs). Second, combinatorial testing criteria are defined to generate an efficient combination of test paths. Finally, we need to serialize the execution of these combined test paths. In this paper, we propose a new concurrent execution technique that is applicable when the system components under test do not need to communicate or interact with each other but must work in parallel to execute a desired task jointly. The main contributions of this paper are: re-purposing and implementing the In-Parameter-Order (IPO) algorithm [9] to combine test paths, a new technique for executing combined test paths, and empirical results showing that the proposed technique produces an efficient test suite size. We apply the test case generation steps to a fully-operational prototype of a mobile, remotely controlled leveling robot that functions as a landing platform designed to launch, recover and re-launch Vertical Take Off and Landing (VTOL) Unmanned Aerial Vehicles (UAVs) [11]. The paper is organized as follows: Section 2 gives background on model-based testing, combinatorial testing and concurrent coverage criteria. Section 3 describes the case study used to apply the proposed test suite generation process. Section 4 covers the process of test suite generation in detail.
Section 5 contains the results and a discussion. Section 6 presents the conclusions.
2 STATE OF RESEARCH
2.1 Model-Based Testing (MBT)
Utting et al. [1] defined MBT as "an automatable derivation of concrete test cases from abstract formal models and their execution". MBT consists of five steps: 1. build the model; 2. define test selection criteria; 3. generate test paths from the model based on the selected criteria; 4. generate operational test cases from the test paths; and 5. execute the tests. Dias-Neto et al. [4] conducted a systematic survey on MBT, classifying the MBT methods presented in 78 papers into five categories based on the representation of information from the software requirements. MBT using Finite State Machines (FSMs) has a long history. However, FSMs cannot model all behavioral aspects of software components [5], because they do not model the conditions that trigger a specific state transition. This capability is provided by Extended Finite State Machines (EFSMs). Fantinato and Jino [5] used an EFSM model of system behavior to test interactive systems. For more details about EFSMs see [6], [7].
2.2 Combinatorial testing
Since exhaustive testing is expensive, combinatorial testing techniques such as pairwise coverage are used to reduce the size of the test suite while maintaining its effectiveness [10]. In a survey by Grindal et al. [8], combination strategies are considered test case selection methods in which test cases are identified by choosing values and then combining them based on a combinatorial strategy. Grindal et al. classified the combination strategies into deterministic and non-deterministic approaches; the deterministic combination strategies are either instant or iterative. The parameter-based combination strategy In-Parameter-Order (IPO) [9] is an example of an iterative combinatorial algorithm. It starts by creating a test suite for a subset of the parameters, then adds one parameter at a time until all parameters are included in the test suite, so as to satisfy pairwise coverage. In this paper, we repurposed the parameter-based combination algorithm IPO by treating the test paths as the parameters in the combinations. Nguyen et al. [10] presented a new approach that combines a pairwise combinatorial testing strategy with MBT to generate effective test cases by extracting paths from an FSM model and transforming them into classification trees to get an efficient
combination. They further reduce the test suite size by applying a post-optimization algorithm.
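To make the pairwise coverage goal concrete, the following minimal Java sketch enumerates the set of value pairs that any pairwise-adequate test suite must cover. It illustrates the coverage requirement only, not the IPO algorithm itself, and the parameter names are invented for the example.

import java.util.ArrayList;
import java.util.List;

// Enumerates every pair of values drawn from two different parameters.
// A pairwise-adequate test suite must cover each such pair at least once;
// iterative algorithms such as IPO aim to do so with far fewer tests
// than the exhaustive cross product.
public class PairwiseGoal {
    public static void main(String[] args) {
        String[][] parameters = {
            {"wheel-fwd", "wheel-rev"},    // invented moving-system values
            {"arm-extend", "arm-retract"}, // invented leveling-system values
            {"flap-fold", "flap-unfold"}   // invented flap-system values
        };
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < parameters.length; i++) {
            for (int j = i + 1; j < parameters.length; j++) {
                for (String a : parameters[i]) {
                    for (String b : parameters[j]) {
                        pairs.add(new String[] {a, b});
                    }
                }
            }
        }
        // Here there are 3 parameter pairs x 2 x 2 values = 12 pairs to
        // cover. Exhaustive testing needs 2*2*2 = 8 tests, while a
        // pairwise suite can cover all 12 pairs with only 4 tests.
        for (String[] p : pairs) {
            System.out.println(p[0] + " with " + p[1]);
        }
    }
}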
2.3 Concurrent coverage criteria
Hierons et al. [13] identify challenges that arise when testing concurrent systems: (1) controllability problems, when the tester does not know when to feed the inputs; and (2) observability problems, when the local tester cannot define a unique sequence of inputs and outputs that the system under test has to produce. According to Andrews et al. [3], two serialization techniques are usually used in execution: All-Possible Serializations and Rendezvous. For the All-Possible Serialized Execution Sequences Coverage Criterion (APSESCC), the test suite is generated by permuting the execution sequence of combined test paths while maintaining the execution order of each test path corresponding to each component. This is very expensive in both execution time and test suite size. Yang et al. [12] introduced the Rendezvous graph, which models possible rendezvous sequences among tasks. The Rendezvous Coverage Criterion (RCC) is based on Rendezvous graphs, which can be constructed when two or more components in a system have distinct models that can be represented as connected graphs sharing a node. The RCC is applicable when the task execution in a concurrent system traverses a task path of its rendezvous graph; RCC therefore produces the set of all paths that have rendezvous nodes. However, our case study has components that do not interact with each other, making the Rendezvous criterion inapplicable.
3 CASE STUDY
The system under test, designed by Conyers et al. [11], is a fully-operational prototype of a mobile leveling landing platform to launch, recover and re-launch VTOL UAVs. The landing platform is rugged, lightweight and inexpensive, making it ideal for civilian applications that require a base station from which a rotorcraft UAV can be launched and/or recovered on terrain that is normally unsuitable for UAV takeoff and landing. The first prototype of the landing platform is controlled remotely by a human operator, who moves it to the desired landing location and levels the landing surface on rough terrain and inclines of up to 25◦.
There are three subsystems. The moving system consists of four electric motors, each operating a wheel separately. The leveling system is designed such that each of the four wheels is mounted on the end of a control arm that rotates up and down to change the position of each wheel independently; this gives the platform the ability to level on a sloped surface or uneven terrain. Each control arm is connected to a linear actuator that adjusts the position of the control arm and its corresponding wheel by extending and retracting individually. The third subsystem provides the capability of increasing the area of the landing surface: a collapsible landing surface folds up from an octagon into a square to make the platform more compact. For servicing and monitoring the subsystems, the XMOS XC1A micro-controller is used. This micro-controller is capable of handling up to 32 simultaneous tasks with tight real-time scheduling, a capability that allows it to control the wheel motors, operate the linear actuators, and communicate with human operators in parallel. Communication between the human operator and the landing platform is wireless: the operator sends commands (shown in Table 1) and receives routine information regarding the status of the on-board systems. Since the system under test receives commands from a human to follow a path over hard terrain, level the landing surface, and fold and unfold the flaps, it forms a good case study for concurrent coverage criteria that serialize a combination of commands for different components.
Table 1: System Commands.
     (x > x_MAX) ->
        S_controller_top_autocruise_exit();
        m_event!stop;
        S_controller_top_goal_entry()
  :: true ->
     if
     :: atomic{ c_event?overheat } ->
        S_controller_top_autocruise_exit();
        S_controller_top_emergency_entry()
     :: else -> skip
     fi
  :: true ->
     if
     :: atomic{ cinput_event?emergency_stop } ->
        S_controller_top_autocruise_exit();
        S_controller_top_emergency_entry()
     :: else -> skip
     fi
  :: else -> skip
  fi
}

3.7 Translation of Regions
A region in the state machine diagram represents a container that includes the states and transitions. In our translation rules, each region of the state machine diagram is translated into an inline macro that combines the inline macro calls of the states and transitions in the region. The inline macro represents the behavior in the region.

inline R_motor_top() {
   if
   :: (motor_topState == motor_top_idle) ->
      S_motor_top_idle(); T_motor_top_idle()
   :: (motor_topState == motor_top_rotation) ->
      S_motor_top_rotation(); T_motor_top_rotation()
   :: (motor_topState == motor_top_overheat) ->
      S_motor_top_overheat(); T_motor_top_overheat()
   :: (motor_topState == motor_top_final) ->
      S_motor_top_final()
   fi
}

3.8 Translation of Final State
The final state is translated into an inline macro that terminates the operations of the behavior in the region.

inline S_controller_top_final_entry() {
   controller_topState = controller_top_final
}

inline S_controller_top_final() {
   break
}

3.9 The Event Occurrence Model
In our verification model, we adopted the following input event occurrence model for each state machine. In the model, the input events that can occur are sent to each corresponding channel as triggers for the state machines. When the state machine reaches the final state, the event occurrence model terminates.

active proctype cinput_eventOccur() {
   do
   :: cinput_event!go
   :: cinput_event!turnoff
   :: cinput_event!emergency_stop
   :: (controller_topState == controller_top_final) -> break
   od
}

3.10 Process for State Machine Behavior
In our Promela verification model, only one process is present for each state machine. This process represents the overall operation of the original state machine diagram. It starts from the inline macro call of the initial pseudo state and executes the transitions and behaviors in the region.

active proctype controller_stm() {
   T_controller_top_init();
   do
   :: R_controller_top()
   od
}
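For completeness, the fragments above presuppose mtype and channel declarations along the following lines; the declaration style and buffer sizes are assumptions on our part, since the actual declarations appear on a page not reproduced here.

mtype = { go, turnoff, emergency_stop, overheat, stop };
chan cinput_event = [0] of { mtype }; /* operator inputs to the controller */
chan c_event      = [0] of { mtype }; /* plant events observed by the controller */
chan m_event      = [0] of { mtype }; /* commands from the controller to the motor */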
4. Case Study
In this section, we confirm the feasibility of our translation method for SPIN model checking. We applied our translation method to the related state machine diagrams shown in Fig. 2 and generated the corresponding Promela code. The interaction information of each state machine is implemented in the Promela code, and LTL verification can be conducted using the code as the input to the SPIN model checker. To verify the reachability of the goal state in the Promela code, we used the following LTL formula, which asserts that the controller state never becomes the goal state, so that a counterexample to it exhibits a transition sequence reaching the goal state:
[]!(controller_topState == controller_top_goal)
The result of this verification case study is shown in Fig. 3. The figure shows that an error has been found, which means that a transition sequence has reached the controller's goal state. This is the expected result for this verification case study, showing that the generated model can be used for SPIN model checking. A counterexample for this case study is shown in Fig. 4. The green region in the figure clearly shows the interactions between the state machines, and the yellow region represents the counterexample sequence: when (x > x_MAX), the controller state machine sends a "stop" signal to the motor state machine and the controller's state becomes the goal state; the motor state machine receives the "stop" signal and the motor changes to the idle state. The red region in the figure shows the values of each variable when the sequence ends, at which time the controller has reached the goal state. Thus, we can easily relate the verification model code to the state machines, and the code relating the verification model to the interactions between the processes can also be observed. These results show that our translation method can be useful in the model checking process.
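For reference, such a check is typically run with the standard SPIN workflow (assuming the generated model and the LTL property are saved in a file named model.pml, a name we have invented here):

spin -a model.pml   # generate the pan.c verifier from the model
gcc -o pan pan.c    # compile the verifier
./pan -a            # run the verification, searching for LTL violations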
5. Conclusions
In this study, we improved our previously proposed translation method, extending the translation objects from a single SysML state machine diagram to plural related SysML state machine diagrams. Using our method, related state machine diagrams can be converted into a simple SPIN verification model. Using inline Promela macros in the verification model, we can easily recognize the correspondence between the Promela code and each component of the original diagram. Moreover, we can identify the interaction signals between related state machines using channel variables in Promela. Thus, we can more easily recognize the source of each trigger event and clearly analyze the verification results, and the flexibility of the translation method is increased. In future work, we intend to explore other representations of the message communication in the translation method; for example, we aim to provide only one channel for each state machine, receiving both input events and interaction messages, which may help increase the efficiency of the verification process. We also plan to apply our method to various examples and refine our translation rules based on the feedback obtained while doing so. Moreover, we plan to develop a translation method that translates state machine diagrams into parallel verification models rather than simple sequential models, and an automatic translation system that can translate the target objects into different types of models to meet different needs.
Fig. 3: Result of Verification Case Study Using the SPIN Model Checker
Fig. 4: Counterexample of a Verification for the System Using the SPIN Model Checker
Cybersecurity Practices from a Software Engineering Perspective Aakanksha Rastogi, Kendall E. Nygard Department of Computer Science North Dakota State University Fargo, ND, USA Email: [email protected], [email protected]
Abstract. Building a software application that is equipped with up-to-date security measures and is compliant with standard security rules, regulations and standards is a complex task. Patching software after a hacker attack may often be the only situationally reasonable solution. However, incorporating security requirements while building the software in the first place can help shield it from hacking attacks. Determining when and how to integrate security considerations is a debatable issue. We assert that including security aspects at each phase of the software development life cycle adds significant protection to the software. This paper provides a survey of relevant literature on secure software development practices that have been developed and utilized to build software applications. We also address related issues of trust and autonomous systems. Keywords. Cybersecurity, Trust, Security, Cyber Attacks, Software Development Life Cycle (SDLC), Analysis.
I. INTRODUCTION
Incorporating cybersecurity protections into software applications during development is a complex issue. In the ever-expanding digital age, virtually every aspect of human endeavor relies on secure transactions and operations. However, consideration of cybersecurity issues is often inadequate, leading to problems such as financial losses, data losses, and privacy breaches. From a systems and networking view, enormous efforts have been made to develop tools to combat specific types of cyber-attacks as they appear. However, hackers tend to think differently than developers of applications and are constantly and proactively developing increasingly notorious and creative attack strategies. Such attacks, which plant malicious pieces of code that corrupt the application, steal sensitive customer information, or introduce malware such as viruses, worms, spyware, phishing, extortion schemes and spam [1], can exploit vulnerabilities introduced at any step of the development process. Software applications that are vulnerable to cyber-attacks can drive potential customers and users of the application away. To gain user trust in purposeful applications, it is important to carry out application development while carefully addressing security issues at each step. Software developers tend to focus on functional requirements, with little emphasis on nonfunctional requirements such as security. Some authors report benefits of addressing security measures at the distinct phases of the software development life cycle [2][3][4][5][6]. Futcher and von Solms proposed guidelines for secure software development [6]. In this paper we provide a survey of literature that is relevant to secure software development practices. Several security issues, concerns, challenges, and solutions at different phases of the software development life cycle, as described in the cybersecurity literature, are also presented. However, the scope of this paper is limited to the Analysis, Design, Implementation and Testing phases of the Software Development Life Cycle (SDLC). The organization of the rest of the paper is as follows. Section II describes the characterization of trust in cybersecurity. Section III describes the phases of a typical software development life cycle. Section IV describes the security concerns encountered during the Analysis phase of the SDLC and potential ways to address those concerns. Section V discusses future work, and Section VI wraps up the paper with the conclusion.
II. CYBERSECURITY AND TRUST
With technological advancement and the mass digitalization of user personal data, establishing user trust has become an important factor in the use of software systems. Most software systems are potentially vulnerable to attacks even if there is strict adherence to leading-edge principles of encryption and decryption. Security of software systems is classified into three categories: Confidentiality, Integrity and Availability [7][8][1], collectively known as the CIA triad. Confidentiality is defined as "Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information…" [9]. Integrity is defined as "Guarding against improper information modification or destruction, and includes ensuring information non-repudiation and authenticity…" [9]. Availability is defined as "Ensuring timely and reliable access to and use of information…" [9]. Security is often intertwined with trust. In the context of software systems, trust refers to the level of confidence or reliability that a person places in a software system, including the expectations that they have for the software fulfilling its purpose. The software system can consist of multiple elements, including programs, configuration files, and documentation. In addition, the concept of trust in the context of cybersecurity includes the expectations that people have of all aspects of software development, including requirements, design, platform-specific issues and networks, for which various security practices, processes and technologies are in use. Trust also refers to a relationship that a person forms with software applications that are online or over a network. The
trust relationship is betrayed if the user's expectations of these applications are not met. This raises questions concerning the kinds of expectations that users have of the applications and the factors that diminish trust. One factor arises from any negative risks that are associated with the usage of an application. There are traditional ways of assessing risk in cybersecurity. Oltramari et al. [7] identified endpoint users as key introducers of risk in an application network, since humans, such as software developers, attackers and users of the application, are included as a component of the system. In addition, low-skilled or exhausted software developers tend to increase cybersecurity risk, while users can substantially decrease cybersecurity risks by being aware and attentive to the means of protecting their personal assets from phishing or spam efforts [5]. Again, insiders within an organization are also known to sometimes support and execute malicious attacks of which outsiders have minimal knowledge. As described by Colwill, "A malicious insider has the potential to cause more damage to the organization and has many advantages over an outside attacker" [10]. These human concerns in information and cybersecurity make it important to learn to distinguish between regular users, potential hackers and insiders who can pose a great threat. Trust and human factors in cybersecurity are also of great concern in the rapidly expanding area of autonomous systems, many of which utilize advanced methods of artificial intelligence. Examples of autonomous systems include floor-cleaning robots, agent software, military and private drones, surgery-performing robots and self-driving cars. Autonomous systems are managed and supervised independently by a single administrator, entity, or organization [11][12]. Each autonomous system has a unique identifying label that can be used during data packet transfer between two systems [12]. Some autonomous systems can make decisions and perform tasks in unstructured environments with no need for human control or guidance. Stormont [13] illustrated a spectrum of autonomy ranging from a low end of fully remote control to a high end with no human intervention; Figure 1 illustrates this spectrum.
Fig. 1. Examples of degrees of autonomy.
In addition, Stormont questioned the need for high autonomy for robots in hazardous environments such as combat zones and identified factors contributing to human trust
in robots [13]. In Unmanned Aerial Systems (UAS), vehicles are highly restricted in the national airspace and are not allowed to autonomously release weapons in military battlefield operations, which illustrates a lack of trust in autonomous systems. The legal and moral ramifications of autonomous systems are of rapidly increasing concern today. For example, as self-driving cars become reality, in addition to the direct cybersecurity issues, we have broader issues of trust, reliability, and ethical decision making [14]. This leads to questions surrounding whom to hold responsible if autonomous systems exhibit faults, experience security breaches, injure innocents or damage property. This calls into question whether or not autonomous systems should be allowed to make life and death decisions [13]. When autonomous systems, which are already limited in reliability and lacking human trust, fall victim to hacking attacks and security vulnerabilities, they become dangerous. BGP (Border Gateway Protocol) is an inter-domain routing protocol that each autonomous system in a network can follow to reach every block of IP addresses. Although BGP is a widely used protocol, it is still vulnerable to potential hijacking, malicious and cryptography-related attacks. This possibility diminishes human trust in autonomous systems. In [15], a method called Pretty Good BGP (PGBGP) for detecting anomalies and responding to them is described; the approach can potentially detect and stop the propagation of invalid origin autonomous systems and invalid paths. III. SECURE SOFTWARE DEVELOPMENT LIFE CYCLE Software development follows a software development life cycle, commonly referred to as an SDLC. An SDLC typically consists of phases for Analysis, Design, Implementation, Testing and Maintenance. In the Analysis phase of a business application, potential stakeholders are identified and requirements are gathered from the customers, who are the future users of the software. Requirements gathering involves customers meeting the business managers and analysts to state their expectations for the software. These requirements are analyzed, and a requirements specification document is formulated and used throughout the life cycle of the software, including usage by the software design team, architects, developers, testers, and the end users of the software. The software design phase involves the creation of a blueprint of the software, based upon the customer requirements; a document is created that defines the guidelines for the design of the system. The Implementation phase involves the coding activities required to create the software, following the customer requirements. The Testing phase follows, which tests the system for fault removal and provides assurance that the system functionality adheres to the requirements. Finally, a Maintenance phase is entered, in which future enhancements, bug fixes and version control are undertaken. We assert that incorporating security concerns at every phase of the software development life cycle can significantly
improve and enhance the overall quality of the software product. Following a secure software development life cycle requires knowledgeable developers who are current with security standards and follow secure coding practices. During the design phase, developers can also work towards a secure implementation by starting with the basics, such as having the code they write reviewed by threat advisors as well as managers and fellow developers. Code reviews not only improve the coding style of developers but also enhance code security by pinpointing known secure-coding errors. Review feedback and hacking demonstrations further enhance code security [16]. According to Grégoire et al. [17], code reviews provide an opportunity to find security-related bugs and fix them relatively early, and they are also useful in teaching secure coding practices and security vulnerabilities to developers [17]. Beyond code reviews, risk assessment at each phase, asset identification and valuation, threat modelling and threat categorization are also important factors that should be taken into consideration. Figure 2, based on information in [18], depicts a model of a secure software development life cycle infused with security at every stage, as suggested by Jones and Rastogi [18].
Fig. 2. Secure Software Development Life Cycle
Jones and Rastogi followed the STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege) protocol for categorizing threats to each attack target node [18]. Spoofing threats refer to the susceptibility of the target to having its authentication information illegally accessed and used by another person. Tampering threats refer to user-sensitive data being maliciously modified. Repudiation threats refer to users denying that they have performed illegal operations in a system that lacks traceability, unless proved otherwise. Information Disclosure threats involve sensitive personal information being exposed to users who are not authorized to access it. Denial of Service attacks refer to denying services to valid users, typically by flooding platforms with extraneous data. Elevation of Privilege threats enable unprivileged users to gain privileged access to a system, sufficient to destroy or compromise the integrity of the system.
IV. SECURITY IN THE ANALYSIS PHASE
To incorporate security during the analysis phase, it is essential that security requirements are generated as part of the initial requirements gathering and analysis. If the requirements are gathered with the security aspects of the software product formalized, the other phases of the SDLC can be driven towards higher security. Nearly every software project is carried out in a time crunch in which security is given minimal to no attention. Defining security requirements in the requirements-gathering use cases and performing an initial risk analysis encourages the elevation of security considerations in the rest of the SDLC phases [3]. For the requirements phase, Futcher and von Solms argued for the importance of gathering and including all known security-related information, including business and legal requirements [6]. They also describe CLASP (Comprehensive, Lightweight Application Security Process), SDL (Security Development Lifecycle) and TSP-Secure (Team Software Process-Secure) as capturing security requirements as a best practice, and they emphasize providing security education and training for developers, acknowledging and endorsing the training, and providing a list of global security requirements that can be used as a baseline for software projects [6]. An excellent comparison of CLASP and SDL is presented by Grégoire et al. [17]. Security being a non-functional requirement reveals that it is a crosscutting concern that can potentially impact multiple functional requirements across a multi-component system. A crosscutting concern is a common functionality that supports different operations in a system, including authentication, authorization, communications, caching, logging, exception management, and validation; these functionalities span layers and tiers [19]. In most cases, requirements for security are specified in terms of achievability rather than in terms of the problem being solved, which results in a lack of clarity as to how the security requirements affect functional requirements [20]. Haley et al. used crosscutting concepts and problem frames from aspect-oriented software development and presented an approach to deriving security requirements from them [20]. They identified threat descriptions, concerns, join points (locations of objects or assets shared by functional requirements and threat descriptions) and vulnerabilities, which are a composition of functional requirements and threats and are found at join points. With the help of several examples, one of them the result of CIA concerns conflicting with ease-of-use and revenue, the authors recognized, interpreted, and clarified conflicts of security requirements with each other. The representation of security requirements as crosscutting threat descriptions illustrates how they assist with the composition of these requirements with functional requirements; the approach allows general security concerns to be converted into instantiations that are closely related to functional requirements. Rosenhainer demonstrated an approach to identify and
document early crosscutting concerns at the requirements engineering level from an aspect-mining point of view [21]. Moreira et al. focused on the need to clearly identify and separate non-functional concerns, including quality attributes such as security, reliability, accuracy, and response time, and to address their crosscutting nature, since these properties affect the entire system [22]. To manage conflicts that arise due to tangled representations, specifications, and code at the requirements stage of the software development process, they proposed a model consisting of three activities: identifying, specifying, and integrating requirements. Their approach first identifies all of the requirements and selects relevant quality attributes, then uses use cases from the Unified Modeling Language to specify functional requirements and special templates to describe quality attributes [22]. This helps to identify all of the quality attributes that can crosscut functional requirements. Finally, they proposed a set of models that represent an integration of functional requirements with crosscutting quality attributes [22].
V. FUTURE WORK
The premises of this survey study were primarily limited to web software applications. Our next steps are to extend this survey of security issues, concerns, challenges and their solutions to hardware and networking as they interface with software. We plan to proactively implement ways to secure software, hardware and networks and to develop tools and techniques to prevent hacker attacks more broadly. Another step is to survey security issues, concerns, challenges, and their solutions in emerging technologies that apply software engineering methodologies in cloud computing, social networking, safety-critical software applications, healthcare software applications, internet banking applications, mobile applications, telecommunication, and smartphone technology.
VI. CONCLUSION
Cybersecurity is of high importance in the development and usage of software applications. For a software application, from gathering requirements through design, architecture, development, testing, quality assurance, distribution and actual usage by endpoint users, security is a core issue. Stakeholders, development teams, customers, and users must be attentive to security issues. We recognize that security is a non-functional requirement. From inception to delivery of a software product, security poses concerns of great importance that heavily impact users' trust in a software application and its continued usage. This survey presents the core concepts from the literature dealing with security-related issues, concerns, challenges, and their solutions from a software engineering perspective. We conclude that patching a software product for emerging issues is not enough and that there is a strong need to incorporate security requirements from the inception of the software life cycle. We also conclude that implementing security requirements at each phase of the software development life cycle and conducting risk analysis at each step is a promising contribution towards building a secure and robust software application.
REFERENCES
[1] J. Jang-Jaccard and S. Nepal, "A survey of emerging threats in cybersecurity", Journal of Computer and System Sciences, vol. 80, no. 5, pp. 973-993, 2014.
[2] A. Apvrille and M. Pourzandi, "Secure Software Development by Example", IEEE Security and Privacy Magazine, vol. 3, no. 4, pp. 10-17, 2005.
[3] K. Balarama, "10 Ways to Incorporate Security Into Your Software Development Life Cycle", 18 May 2016. [Online]. Available: https://www.cigital.com/blog/infuse-security-into-your-software-development-life-cycle/. [Accessed 25 March 2017].
[4] H. Yu, N. Jones, G. Bullock and X. Yuan, "Teaching secure software engineering: Writing secure code", Software Engineering Conference in Russia (CEE-SECR), 2011 7th Central and Eastern European, pp. 1-5, 2011.
[5] M. Howard, "Building more secure software with improved development processes", IEEE Security and Privacy Magazine, vol. 2, no. 6, pp. 63-65, 2004.
[6] L. Futcher and R. von Solms, "Guidelines for secure software development", Proceedings of the 2008 annual research conference of the South African Institute of Computer Scientists and Information Technologists on IT research in developing countries: riding the wave of technology, pp. 56-65, 2008.
[7] A. Oltramari, D. Henshel, M. Cains and B. Hoffman, "Towards a Human Factors Ontology for Cyber Security", Semantic Technology for Intelligence, Defense, and Security, 2015.
[8] R. von Solms and J. van Niekerk, "From information security to cyber security", Computers & Security, vol. 38, pp. 97-102, 2013.
[9] FIPS PUB 199, Standards for Security Categorization of Federal Information and Information Systems, NCSD, February 2004.
[10] C. Colwill, "Human factors in information security: The insider threat – Who can you trust these days?", Information Security Technical Report, vol. 14, no. 4, pp. 186-196, 2009.
[11] "What is an Autonomous System (AS)? - Definition from Techopedia", Techopedia.com, 2017. [Online]. Available: https://www.techopedia.com/definition/11063/autonomous-system-as. [Accessed: 26- Apr- 2017].
[12] "Autonomous system and autonomous system number", 2017. [Online]. Available: http://www.certiology.com/tutorials/ccna-tutorial/autonomous-system-and-autonomous-system-number.html. [Accessed: 26- Apr- 2017].
[13] D. P. Stormont, "Analyzing human trust of autonomous systems in hazardous environments", Proc. of the Human Implications of Human-Robot Interaction workshop at AAAI, pp. 27-32, 2008.
[14] J. Lee, K. Kim, S. Lee and D. Shin, "Can Autonomous Vehicles Be Safe and Trustworthy? Effects of Appearance and Autonomy of Unmanned Driving Systems", International Journal of Human-Computer Interaction, vol. 31, no. 10, pp. 682-691, 2015.
[15] J. Karlin, S. Forrest and J. Rexford, "Autonomous security for autonomous systems", Computer Networks, vol. 52, no. 15, pp. 2908-2923, 2008.
[16] M. Lyman, "Benefits of Secure Code Review: Developer Education", 10 November 2015. [Online]. Available: https://www.cigital.com/blog/benefits-of-secure-code-review-developer-education/. [Accessed 25 March 2017].
[17] J. Grégoire, K. Buyens, B. De Win, R. Scandariato and W. Joosen, "On the Secure Software Development Process: CLASP and SDL Compared", Proceedings of the Third International Workshop on Software Engineering for Secure Systems, p. 1, 2007.
[18] R. Jones and A. Rastogi, "Secure Coding: Building Security into the Software Development Life Cycle", Information Systems Security, vol. 13, no. 5, pp. 29-39, 2004.
[19] "Chapter 17: Crosscutting Concerns", Msdn.microsoft.com, 2017. [Online]. Available: https://msdn.microsoft.com/en-us/library/ee658105.aspx. [Accessed: 27- Apr- 2017].
[20] C. B. Haley, R. C. Laney and B. Nuseibeh, "Deriving security requirements from crosscutting threat descriptions", Proceedings of the 3rd international conference on Aspect-oriented software development, pp. 112-121, 2004.
[21] L. Rosenhainer, "Identifying Crosscutting Concerns in Requirements Specifications", Proceedings of OOPSLA Early Aspects, 2004.
[22] A. Moreira, J. Araújo and I. Brito, "Crosscutting Quality Attributes for Requirements Engineering", Proceedings of the 14th international conference on Software Engineering and Knowledge Engineering, pp. 167-174, 2002.
A Sound Operational Semantics for Circus S. L. M. Barrocas1 and M. V. M. Oliveira1,2 1 Departamento de Informática e Matemática Aplicada, Universidade Federal do Rio Grande do Norte 2 Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte Natal, RN, Brazil Contact: [email protected]
Abstract— The use of formal methods in software engineering considerably reduces the number of errors throughout system development by enforcing a rigorous specification and verification before reaching a final implementation. We provide, in this paper, a full and sound Structural Operational Semantics for Circus, a formal notation that combines Z and CSP. Our work lifts the works of Freitas, Cavalcanti and Woodcock by creating rules that deal with any Circus construct. We also provide, in this paper, a proof of soundness of the Structural Operational Semantics with respect to the Unifying Theories of Programming (UTP).
Keywords: Operational Semantics, Circus, Soundness
1. Introduction
Formal methods are techniques that specify systems using formal notations underpinned by rigorously defined semantics. Due to the effort required to apply such techniques, their application usually becomes expensive; for this reason, their use has normally been justified only in the implementation of concurrent and safety-critical systems. Circus [17] is a specification language whose syntax combines the syntaxes of Z [18] and CSP [15]. This feature of Circus allows the representation of concurrent systems with a large amount of data in a non-implicit fashion. Circus has a refinement calculus [14] with transformation rules that can be used to refine abstract Circus specifications into concrete Circus implementations. Circus has been provided with both a denotational semantics [14] and an operational semantics [9], which contains rules that can be applied to generate labelled predicate transition systems (LPTS) for a given specification. The Operational Semantics shown in [9] is the first version of the Operational Semantics of Circus: it was presented with its rules defined using Z schemas. Later, Woodcock and Cavalcanti developed an updated Structural Operational Semantics using different definitions and a different description, envisaging better automation (using an automatic theorem prover). The rules were developed for CML, a formalism that has constructs similar to those of Circus; these rules are shown in [5], and similar rules are shown in [4]. These works, however, only show rules for Circus actions and basic processes, and Circus has other constructs for processes (compound processes, call processes, etc.). This paper aims at providing a full and sound Structural Operational Semantics for Circus. This is achieved by lifting the Operational Semantics based on ideas described in [4], [5], [9] and [14], and then proving the soundness of the unproved laws with respect to the UTP [10].
2. Circus
Concurrent and Integrated Refinement CalculUS (Circus) [17], [8], [9], [14] is a formal language whose syntax is based on the syntaxes of two other formal languages, Z [18] and CSP [12]. Circus joins Z's ability to represent complex data structures with CSP's process algebra, which represents concurrency. Circus also has a refinement calculus [14], and its syntax is based on Dijkstra's language of guarded commands [6]. A Circus program is formed by zero or more paragraphs, each of which can be a declaration of channels or a channel set, a Z paragraph [16] or a process paragraph. A process paragraph can be a compound process (Parallel, Interleave, External Choice, Internal Choice or Sequence), a unary process (Hide, Rename, Parameterised), a call process (with parameters, which can be indexed or normal), an iterated process (a generalization of compound processes), or a basic process, which has a state (possibly with variables), inner actions (tasks that the process can perform) and a main action (that defines the behaviour of the process). Each action can be a command (Assignment, Variable Block, Specification Statement, If-Guarded or Substitution command), a compound action (using the same operators as compound processes), a unary action (Hide, Rename, Parameterised, Guarded or Prefixing), or a basic action (SKIP, STOP or CHAOS). The syntax of Circus is described in [14]. We give an example of a Circus specification of a simple Cash Machine that allows cash withdrawal (if there is sufficient balance) and balance inquiry. The Cash Machine interacts with a user. It will be specified as the following
process:

channel cash, inquiry
channel entercnum, chosencash, amount : N
channel displaymoney, displaybalance : N
channelset CS ≙ {| cash, inquiry, entercnum, amount |}

process CashMachine ≙ begin
   state St == [ b : seq N ]
   INQUIRY ≙ x : N • displaybalance.(b(x)) → SKIP
   CASH ≙ x : N • amount?y →
      if (b(x) ≥ y) → b := b ⊕ {x ↦ b(x) − y} ; displaymoney!y → SKIP
      [] (b(x) < y) → SKIP
      fi
   • µ X • entercnum?x → ((inquiry → INQUIRY(x)) □ (cash → CASH(x))) ; X
end

process Customer ≙ begin
   state St == [ mycash, cnum : N ]
   INIT ≙ cnum := 2 ; mycash := 50 ; entercnum!cnum → SKIP
   INQUIRYOP ≙ inquiry → SKIP
   CASHOP ≙ cash → amount!mycash → SKIP
   • INIT ; µ X • (CASHOP ⊓ INQUIRYOP) ; X
end

process System ≙ (CashMachine |[ CS ]| Customer) \ CS

The process System consists of the parallel composition of the processes CashMachine and Customer; thus, System is a parallel process with a hidden set of channels. The parallel composition of CashMachine and Customer synchronises on the channels cash, inquiry, entercnum and amount. The channels that compose the parallel interface of System are hidden from the external environment (\ CS). Customer initially stores a card number (with the assignment command cnum := 2) and an amount to be cashed (with another assignment command mycash := 50) through the action INIT. INIT then writes (!) the card number (cnum) on channel entercnum (entercnum!cnum). Then it randomly (using the internal choice operator ⊓) chooses between cashing (CASHOP) and requiring balance information (INQUIRYOP). If Customer chooses to require the balance, it performs the event inquiry. If it chooses to cash, it performs cash and then writes (!) its desired value to be cashed (mycash) on channel amount (amount!mycash). After choosing between requiring the balance or cashing, it recurs (invoking X); when it recurs, it goes back to µ X. The process CashMachine initially inputs a card number (cnum) on channel entercnum, then it offers two choices (□): one in which the user will see the balance (in which the choice is resolved by performing inquiry), and the
other in which the user cashes out money (the choice being resolved by performing the cash event). The amount to be cashed is input (?) on channel amount and stored in variable y; then, if the balance of the Customer is at least the requested amount (b(x) ≥ y), the CashMachine subtracts that amount from the user's balance (b := b ⊕ {x ↦ b(x) − y}) and outputs the money on the displaymoney channel. If (b(x) < y), it terminates successfully (SKIP). After showing the balance or dispensing money, CashMachine recurses. The → operator is used between a channel invocation (possibly followed by communication fields) and an action, and it establishes a Prefixing between the channel and the action (that is, the action is only performed after an event on the channel occurs). The Sequence operator (;) establishes a similar relation, but between two actions. Circus has a theory in the Unifying Theories of Programming (UTP) [10], a framework that provides constructs for defining theories for formalisms. This theory of Circus is its Denotational Semantics, which is explained below.
2.1 Circus Denotational Semantics

The Denotational Semantics of Circus [14] defines each construct of Circus as a UTP reactive design of the form R (Pre ⊢ Post), in which the predicates Pre and Post are defined using UTP variables (tr for trace, ref for refusals, ok for program ready to start and wait for program waiting) and their dashed versions (tr′, ref′, ok′ and wait′). For any UTP variable x, x represents the variable in the current state, and x′ represents the variable in a future state. The meaning of the design is:

Pre ⊢ Post ≙ Pre ∧ ok ⇒ Post ∧ ok′

The expression above means: if the program has started with its precondition holding, then when it finishes its postcondition will hold. The R in the expression is a function that guarantees soundness of the denotational expression with respect to the UTP. This function is called a Healthiness Condition. It is defined in terms of three other healthiness conditions of Circus:

R (P) = R1 ◦ R2 ◦ R3 (P)

R1, R2 and R3 are defined as follows:

R1 (P) ≙ P ∧ tr ≤ tr′
R2 (P (tr, tr′)) ≙ P (⟨⟩, tr′ − tr)
R3 (P) ≙ IIrea ⊲ wait ⊳ P
• The healthiness condition R1 says that the history of events cannot be undone. This is established by the expression tr ≤ tr′. In this expression, tr represents the trace in the current state and tr′ represents the trace in a future state. The trace in the future state cannot be shorter than the trace in the current state;

• R2 says that the behaviour of the process is independent of what happened before. The expression of R2 can be rewritten as:

R2 (P (tr, tr′)) ≙ P [⟨⟩, tr′ − tr / tr, tr′]

The expression above replaces, in the predicate P, tr by the empty trace ⟨⟩ and tr′ by tr′ − tr. Thus, the history of events in tr is irrelevant to the behaviour of the process;

• R3 says that a process that waits for other processes to finish must not start. This is represented by the conditional expression IIrea ⊲ wait ⊳ P, which says: if wait is true, then the conditional expression equals the reactive skip (IIrea); otherwise it equals P. The reactive skip is defined as follows:

IIrea ≙ (¬ok ∧ tr ≤ tr′) ∨ (ok′ ∧ tr′ = tr ∧ wait′ = wait ∧ ref′ = ref ∧ v′ = v)

The expression above states that the process either did not start (having its history of events not undone) or has finished (ok′) with its UTP variables unchanged. Thus the behaviour of R3 is: if the process is waiting for other processes to finish, then it either did not start or has already finished; otherwise it continues.

An important operator from the UTP is the Sequence operator. It is described by the following expression:

P ; Q ≙ ∃ v₀ • P (v, v₀) ∧ Q (v₀, v′)

The definitions of the reactive designs for the constructs of Circus can be seen in [14]. The denotational definition of each Circus construct is effective for expressing its conditions, but it does not explicitly express the operational behaviour of the construct. We present some concepts of the Operational Semantics of Circus in the following subsection.

2.2 Circus' Operational Semantics

The Operational Semantics of Circus was first described in [9]. Cavalcanti and Gaudel [4], [11] described an updated Operational Semantics for Circus. It gives definitions for both labelled and silent transitions between nodes, where each node contains a constraint that indicates whether that node is enabled. In this paper, a node is represented by a triple (c | s |= A), where c is the constraint, s is the sequence of assignments and A is the program text (in this case, an action) that remains to be executed. A similar kind of description appears in Woodcock's technical report [5], where the Operational Semantics of CML, a language with constructs similar to those of Circus, is presented and explained. The denotational definition of each rule of this semantics is as follows:

Definition 1: Silent Transition
(c1 | s1 |= A1) —τ→ (c2 | s2 |= A2) ≙ ∀ w • c1 ∧ c2 ⇒ Lift (s1) ; A1 ⊑ Lift (s2) ; A2

Definition 2: Labelled Transition
(c1 | s1 |= A1) —l→ (c2 | s2 |= A2) ≙ ∀ w • c1 ∧ c2 ⇒ Lift (s1) ; A1 ⊑ (Lift (s2) ; c.w1 → A2) □ (Lift (s1) ; A1)
where
Lift (s) ≙ R1 ◦ R3 (true ⊢ s ∧ tr′ = tr ∧ ¬wait′)

Definition 1 gives the denotational meaning of a Silent Transition. It means: for all loose constants w, if c1 and c2 are true, then the left side of the transition is refined by the right side of the transition. The meaning of the denotational definition in Definition 2 is: if c1 and c2 are true, then the left side of the transition is refined by the external choice between the right side of the transition prefixed by the label and the left side of the transition. The Lift function ensures that an assignment is healthy with respect to the theory of Circus in the UTP. It applies the healthiness conditions R1 and R3 to the assignment passed as parameter with no restriction (true in the precondition of the design), and ensures that the assignment does not change the history of events (tr′ = tr) and that the program will not be in a waiting state (¬wait′).
An example of an operational description can be seen for action INQUIRYOP of process Customer, as follows:

(true | {} |= inquiry → SKIP) —inquiry→ (true | {} |= SKIP)

The above description consists of a transition between two nodes. The first node has inquiry → SKIP as the program text that remains to be executed, and true as its constraint (that is, there is no restriction on this node). The arc labelled inquiry then establishes that, when event inquiry occurs, the program moves to the node (true | {} |= SKIP). As there are no arcs going out from the node containing SKIP, the program terminates.
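For comparison, the analogous description for action CASHOP, which chains two prefixes, takes two labelled transitions. This is our own illustration, applying the same prefixing treatment twice; writing s for the sequence of assignments after INIT (so that s records mycash = 50), the output amount!mycash yields the label amount.50:

(true | s |= cash → amount!mycash → SKIP) —cash→ (true | s |= amount!mycash → SKIP)

(true | s |= amount!mycash → SKIP) —amount.50→ (true | s |= SKIP)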
2.2.1 Extending and Lifting the Operational Semantics to Circus processes
The constructs covered by the existing rules refer only to a subset of Circus actions. Woodcock defined rules for Prefixing, Assignment, External Choice, Internal Choice, Parallelism, Interleaving, Guards, Hiding and Recursion, all of them for actions. For this paper, we created new rules for the remaining constructs of Circus actions (If-Guarded Command, Specification Statements, Alphabetised Parallel Action, Parameterised Action Call, Iterated Actions, etc.), and lifted the semantics to Circus processes by creating a new kind of transition, the syntactic transition, whose denotational definition is as follows:
Definition 3: Syntactic Transition
(c1 |= P1) —σ→ (c2 |= P2) ≙ ∀ w • c1 ∧ c2 ⇒ (Lift (gA (P1)) ; P1 ⊑ Lift (gA (P2)) ; P2) ∧ Lift (gA (P1)) = Lift (gA (P2))

getAssignments (abbreviated as gA) is an auxiliary function that calculates the sequence of assignments of a node whose program text is a process. A node whose program text is a process is defined by a constraint and a process, represented as (c |= P). The transition (c1 |= P1) —σ→ (c2 |= P2) means that, if c1 and c2 are true, then P1 can be syntactically transformed into P2 without semantically changing the program. The syntactic transition, thus, does not specify a path of computation of the program. The idea is to syntactically transform the process in order to reach a Basic Process. During this transformation, the inner assignments of the process cannot change (Lift (getAssignments (P1)) = Lift (getAssignments (P2))) and the assignments have to preserve healthiness.

One of the constructs that Woodcock did not encompass was the If-Guarded Command. We created a set of rules that describe the behaviour of the If-Guarded Command, among other constructs. We show its rules as an example:

Rule 1: If-Guarded Command. Let IGC ≙ if (pred1) → A1 [] (pred2) → A2 [] ... [] (predn) → An fi. Then:

(c | s |= IGC) —τ→ (c ∧ pred1 | s |= A1)
(c | s |= IGC) —τ→ (c ∧ pred2 | s |= A2)
...
(c | s |= IGC) —τ→ (c ∧ predn | s |= An)
(c | s |= IGC) —τ→ (c ∧ (¬pred1) ∧ ... ∧ (¬predn) | s |= CHAOS)

The set of rules in Rule 1 is for If-Guarded Commands. Each rule shows a possibility of computation for the program text, each depending on a predicate from the If-Guarded Command. There is a possibility in which the command goes to the node whose program text is A1 and whose constraint is c ∧ pred1 (indicating that A1 is reachable only if pred1 is true), and so on. When all guards are false, the If-Guarded Command diverges (with CHAOS), and when more than one guard is true, the If-Guarded Command behaves as an internal choice between the true-guarded branches.

We also give an example of a rule, created for this paper, that can be applied to any compound (Parallel, Interleave, External Choice, Internal Choice and Sequence) process:

Rule 2: Compound Process Left:

    (c1 |= P1) —σ→ (c2 |= P3)
    ─────────────────────────────────────
    (c1 |= P1 OP P2) —σ→ (c2 |= P3 OP P2)

where OP ∈ { ⊓, □, ;, |||, |[ CS ]| }

Rule 2 describes a syntactic transformation by syntactically advancing the left branch of the compound operator. It says: if (c1 |= P1) can be syntactically transformed into (c2 |= P3), then (c1 |= P1 OP P2) can be syntactically transformed into (c2 |= P3 OP P2).
3. Soundness with respect to the UTP

In this paper, we provide proofs of soundness with respect to the UTP for all rules of the Operational Semantics of Circus. Each rule was proved using theorems from Woodcock's technical report [5] and lemmas created for this paper.
All proofs and referenced lemmas are shown in [2]. Each proof is divided into sub-goals (intermediate expressions that compose the proof) and tactics between sub-goals (tactics can be composed of lemmas and assumptions that are used to justify the advancement from one sub-goal to another). Each tactic has a source sub-goal and a destination sub-goal. Sometimes tactics will be omitted; in these cases, transformations between the sub-goals consist only of predicate calculus. The first sub-goal is the expression to be proved. The last sub-goal (which is the goal of the proof) will mandatorily be true. In order to modularize the proof, lemmas were created and proved as well. We also provide an explanation for each step of the proof; each tactic is shown throughout the proof as an explanation in parentheses. Proofs are shown in the following format:

Sub-goal 1 (explanation 1)
...
= Sub-goal n (explanation n)

We will show and explain, step by step, the proof of soundness of the first rule of Rule 1, which is

(c | s |= IGC) —τ→ (c ∧ pred1 | s |= A1)

The first step in proving this law is creating a lemma to prove that IGC is refined by A1 (IGC ⊑ A1), provided that pred1 is true. This lemma will be used throughout the proof:

Lemma 1: IGC ⊑ A1, provided that pred1
Proof: Let EXPR = (if (pred2) → A2 [] ... [] (predn) → An fi). Then:

IGC ⊑ A1 (the first step of the proof is to infer that, as pred1 is true, IGC either equals A1 (if pred1 is true and all other predi are false) or an internal choice between A1 and another expression (if pred1 is true and at least one other predicate predi is true). For IGC = A1, the proof is trivial: the expression becomes a refinement between equal sides, A1 ⊑ A1, which is true. For IGC = A1 ⊓ EXPR, the proof evolves as shown below)
= A1 ⊓ EXPR ⊑ A1 (in the UTP, an internal choice between two actions is an OR composition between the denotational definitions of both actions)
= A1 ∨ EXPR ⊑ A1 (a refinement expression A ⊑ B can be written as [B ⇒ A], where the brackets mean universal quantification over the variables of the expression; as, in a proof, the expression under proof is proved for all possible values of the variables, the brackets can be omitted)
= A1 ⇒ A1 ∨ EXPR (from here below, only predicate calculus until reaching true as the final sub-goal)
= ¬A1 ∨ (A1 ∨ EXPR)
= ¬A1 ∨ A1 ∨ EXPR
= true ∨ EXPR
= true

Now we will prove the following expression:

(c | s |= IGC) —τ→ (c ∧ pred1 | s |= A1)

Proof:

(c | s |= IGC) —τ→ (c ∧ pred1 | s |= A1) (we first apply a tactic with Definition 1 of the Silent Transition to transform the expression that we want to prove into a First-Order Logic expression)
= ∀ w • c ∧ c ∧ pred1 ⇒ (Lift (s) ; IGC ⊑ Lift (s) ; A1) (we then apply a tactic combining Lemma 1, the assumption pred1 and Monotonicity of Refinement to prove that the refinement expression is true: Lift (s) ⊑ Lift (s), and IGC ⊑ A1 by Lemma 1 and the assumption (assms) pred1, which holds because it appears on the left side of the implication)
= ∀ w • c ∧ c ∧ pred1 ⇒ true (then two predicate calculus tactics are applied and the final sub-goal (true) is reached)
= ∀ w • true
= true

We will now show the proof of the last rule of Rule 1. The expression is given by

(c | s |= IGC) —τ→ (c ∧ (¬pred1) ∧ ... ∧ (¬predn) | s |= CHAOS)

Proof:
First, let NEGPREDS = (¬pred1) ∧ ... ∧ (¬predn).

(c | s |= IGC) —τ→ (c ∧ NEGPREDS | s |= CHAOS) (we first apply a tactic to expand the first sub-goal into a First-Order Logic expression (Definition 1), which becomes the second sub-goal)
= ∀ w • (c ∧ c ∧ NEGPREDS) ⇒ (Lift (s) ; IGC ⊑ Lift (s) ; CHAOS) (we assume (assms) that c is true in order to reach the third sub-goal; if c were false, the antecedent of the implication would also be false, and an implication with a false antecedent is always true)
= ∀ w • NEGPREDS ⇒ (Lift (s) ; IGC ⊑ Lift (s) ; CHAOS) (if all predicates are false, then IGC equals CHAOS)
= ∀ w • NEGPREDS ⇒ (Lift (s) ; CHAOS ⊑ Lift (s) ; CHAOS) (we then reach a refinement between equal expressions; let E = Lift (s) ; CHAOS)
= ∀ w • NEGPREDS ⇒ E ⊑ E (from now on, all tactics apply only predicate calculus)
= ∀ w • NEGPREDS ⇒ [E ⇒ E]
= ∀ w • NEGPREDS ⇒ [¬E ∨ E]
= ∀ w • NEGPREDS ⇒ [true]
= ∀ w • NEGPREDS ⇒ true
= true
4. Conclusions and future work

In this paper, we provided a full and sound Operational Semantics for Circus, having Woodcock's [5] and Freitas' [9] Operational Semantics as a basis. We also used ideas from Oliveira [14] to build the syntactic transformation rules for Circus processes. We proved the soundness of the laws of the Operational Semantics with respect to the theory of Circus in the Unifying Theories of Programming (UTP), which is the Denotational Semantics of Circus. Our long-term goal is to provide the best possible theory for mechanisation in an automatic theorem prover. As future work, we point out the mechanisation, using Isabelle/UTP [1], [7], of the Structural Operational Semantics we developed in this paper. The automation of the semantics in the theorem prover will strengthen the reliability of the proofs of the laws we created in this paper. Moreover, the Structural Operational Semantics we created for this paper can serve as a basis for all formalisms derived from Circus. OhCircus [3] (an object-oriented version of Circus) and SCJ-Circus [13] (Circus for Safety-Critical Java) are examples of formalisms whose Operational Semantics can be based on that of Circus.
References
[1] Isabelle. http://www.cl.cam.ac.uk/Research/HVG/Isabelle/index.html.
[2] Samuel Lincoln Magalhães Barrocas. Proof of laws from the operational semantics of Circus. Technical report, UFRN (Federal University of Rio Grande do Norte), 2017.
[3] A. L. C. Cavalcanti, A. C. A. Sampaio, and J. C. P. Woodcock. Unifying Classes and Processes. Software and System Modelling, 4(3):277-296, 2005.
[4] Ana Cavalcanti and Marie-Claude Gaudel. Testing for refinement in Circus. Acta Informatica, 48(2):97-147, 2011.
[5] Ana Cavalcanti and Jim Woodcock. CML Definition 2 - Operational Semantics. Technical report, University of York, 2013.
[6] E. W. Dijkstra. Guarded commands, nondeterminacy and the formal derivation of programs. Communications of the ACM, 18(8):453-457, 1975.
[7] Simon Foster, Frank Zeyda, and Jim Woodcock. Isabelle/UTP: A mechanised theory engineering framework. In Unifying Theories of Programming - 5th International Symposium, UTP 2014, Singapore, May 13, 2014, Revised Selected Papers, volume 8963 of Lecture Notes in Computer Science, pages 21-41. Springer, 2014.
[8] A. Freitas. From Circus to Java: Implementation and Verification of a Translation Strategy. Master's thesis, Department of Computer Science, The University of York, Dec 2005.
[9] L. Freitas. Model-checking Circus. PhD thesis, Department of Computer Science, The University of York, 2005. YCST-2005/11.
[10] C. A. R. Hoare and H. Jifeng. Unifying Theories of Programming. Prentice-Hall, 1998.
[11] J. C. P. Woodcock and Ana Cavalcanti. Operational semantics for Circus. Formal Aspects of Computing, 2007.
[12] M. G. Hinchey and S. A. Jarvis. Concurrent Systems: Formal Development in CSP. McGraw-Hill, Inc., New York, NY, USA, 1995.
[13] Alvaro Miyazawa and Ana Cavalcanti. SCJ-Circus: a refinement-oriented formal notation for Safety-Critical Java. In Proceedings 17th International Workshop on Refinement, Refine@FM 2015, Oslo, Norway, 22nd June 2015, pages 71-86, 2015.
[14] M. V. M. Oliveira. Formal Derivation of State-Rich Reactive Programs using Circus. PhD thesis, Department of Computer Science, University of York, 2006.
[15] A. W. Roscoe. The Theory and Practice of Concurrency. Prentice-Hall Series in Computer Science. Prentice-Hall, 1998.
[16] J. M. Spivey. The Z Notation: A Reference Manual. Prentice-Hall, 2nd edition, 1992.
[17] J. C. P. Woodcock and A. L. C. Cavalcanti. A concurrent language for refinement. In A. Butterfield and C. Pahl, editors, IWFM'01: 5th Irish Workshop in Formal Methods, BCS Electronic Workshops in Computing, Dublin, Ireland, July 2001.
[18] J. C. P. Woodcock and J. Davies. Using Z - Specification, Refinement, and Proof. Prentice-Hall, 1996.
Resilience Methods within the Software Development Cycle
Acklyn Murray, Marlon Mejias, Peter Keiller
Department of Electrical Engineering and Computer Science
Howard University
Washington, DC 20059
Abstract — Resilience, attuned to software development, is the ability of a component such as an operating system (OS), server, network, data center or storage system to rapidly recover and continue operating when certain failures or errors occur, with limited disruption that could otherwise affect the progress of operations running on the computer. In the following paper, we will discuss the development cycle in relation to resilience methods, resilience factors that may contribute to unscheduled disruptions, and upcoming possibilities in the examination of resilience methods.

Keywords — Resilience; Software methods; Dependability; Fault Tolerance
1 Introduction
Many of today’s business models are unique, advanced and highly dependent on technology when compared to the models of earlier organizations. The increasing use of technology in business today has changed the nature of business operations [11,16] and has made software an indispensable aspect of operations. In the past, businesses have applied software to support operations during business hours from 9 A.M. to 5 P.M., Monday to Friday. In such operations, computers would process orders, reconcile Point of Sale (POS), cash transactions, and other activities. As technology and operations advanced new models emerged where integrated systems were used for batch processing to complete tasks such as consolidation, reconciliation, and data transfer overnight. Today however, most organizations use software to process transactions on a real-time basis, and for decision support in global operations[1]. This development means that small failures can cripple operations and have far reaching impact. The level of reliance on software today is not comparable to the level of reliance any time before in history. This increased reliance on software has not been met with appropriate improvements in the software design process to ensure that software products are resilient. While resilience is a main stay of ‘high-risk’ industries with physical products such as in aviation, chemical manufacturing, healthcare technology, and traditional manufacturing, much can be learnt and applied to the field of software development [23]. The study of software resilience is relatively new and there are only limited studies or publications [1]. As with any emerging field, it is important to review recent literature and suggested best practices so that conclusions and future areas of research can be identified. This paper presents a review of recent articles on
software engineering and methods of software resilience. Relevant analysis and discussion are provided before conclusions are presented.

2 Defining Software Resilience

Generally, in engineering disciplines, resilience is measured by the ability of a system to persevere or work through a major fault in a critical part of the system [5]. It is the ability of a system to "take a hit to a critical component" and recover in a reasonable manner or acceptable duration [2]. Resilient software should have the capacity to withstand a failure in a critical component but still recover in an acceptable predefined manner and duration. Failures in software can arise from intentional activities/attacks or unintended faults. Either way, a resilient system should have the capacity to recover after such faults. Software that does not recover on its own or applications that recover after a long duration are not resilient.

2.1 Factors affecting software resiliency

There are several factors that affect the resiliency of software. Factors include complexity, globalization, interdependency, rapid change, level of system integration and human factors [2]. These factors are discussed further. Today's organizations have complex networked systems that come from many integrations and interdependencies. For example, a telecommunication company providing a mobile money service through a third-party application has a complex system with multiple points of failure. When one critical system fails, the entire platform cannot deliver the required service [18]. Software that relies on the internet and ever-changing, dynamic networks increases its chances of failure. Net-centricity can introduce complexities leading to greater chances of errors [6]. Combining systems introduces higher chances of failure due to complexity. This complexity makes it extremely hard to provide a platform with consistent levels of resilience. The greater the number of integrated systems, the greater the chances of lower resilience [8]. Another important aspect of highly integrated complex systems to consider is that organizations could be using third-party systems. Failures in such applications affect local systems. This often leads to considerations of complexity as a factor in software resilience. Today, systems rely on multiple other technology platforms to provide services [10]. For instance, a firm can use a combination of locally implemented software and other cloud-based architectures. In such a scenario, failures in the cloud become failures in the local systems. The use of open source software could also complicate matters since more complexity is
introduced. COTS, or commercial off-the-shelf, products could have different designs and features compared to open source software. For instance, when firms integrate a combination of 'off-the-shelf' and open source software, resilience is reduced because of the complexities and different levels of net-centricity. Modern businesses operate a range of interdependent, interconnected, and interrelated systems; hence the failure of one easily becomes the failure of many.

The other factors considered important include globalization and rapid change. In terms of globalization, the new software development models and software supply chains are seen to focus more on cost reduction than on quality. They focus more on reducing their costs of production than on the resilience of the systems. Further, changing requirements and frequent software releases lead to dynamic and complex outcomes. For instance, when new software releases are being provided to the market, new changes may be incorporated and resiliency may be reduced compared to earlier releases [11].

Additional research needs to focus on the influence and impact of human error in software resilience. It is important to consider failure in complex systems as a multifaceted problem in which several factors can be identified. It is not proper to attribute failure to one root cause but to a wider array of interrelated factors. The manner in which users or developers react can affect the resilience of systems. From a socio-technical point of view, it is possible that safety, operational errors, and other considerations of the man-machine interface can affect the resilience of systems [17]. Lapses, rule violations, and process confusion can affect the ability of systems to remain resilient. Such issues can be related to the manner in which operators handle the systems. However, attributing the resilience of entire systems to human error could lead to an insufficient approach to combating failure; attributing failure to human factors alone could be insufficient in providing solutions to resiliency problems. Therefore, human error should be taken as the starting point as we move towards an examination of the wide array of factors that can cause the problems.

3 Resilience in new methods of Software Engineering

One of the obvious approaches to ensuring resilience is adequate testing at each phase within the software development cycle. When software is being completed, a series of tests is necessary to ensure that all aspects of the software satisfy the design requirements [12]. Adequate testing can ensure improved integrity and resilience in the new software. Most specifically, clear and regular testing should be concentrated on the coding stage, as poor coding standards, a lack of peer reviews and missing follow-up regression testing can make the system prone to resilience problems. An untested error routine is a major cause of later system failure and of the inability to easily recover after a disruption. Inadequate testing could lead to disastrous consequences. A second method of resilient software development is designing the software for resilience when the software specifications are being developed. Some systems fail because relevant resilience routines or features were not included in the initial design. Designing for resilience calls for a clear understanding of the requirements of the software and business
process before starting the entire design process. This can be daunting, especially in the age of agile development. When designing technology systems, it is important to ensure that the design captures important aspects related to failure or resilience [11], such as recovery from hardware failure, incorrect input into the system, human factors and graceful degradation. When such features are included early in the design, it becomes easier to design for resilience even at advanced stages of development [10]. However, when such features are absent at the early stages, it becomes a burden to include them at the advanced stages of development.

It is important that human factors are considered early in the design phase. Inappropriate use of software can lead to failure when applications are not used for the correct purposes. This could result in routine failures or process failure, leading to the failure of the entire application. It is also important that designers consider fault-tolerant hardware, load balancing, high-availability clustering [15], and back-up and recovery options. The application should also be designed with fault tolerance and failure recovery at the forefront of the design. High fault tolerance ensures that the application can protect itself from further damage when an error or fault occurs [20]. Hardware checking, preparation for recovery and the recovery process are being addressed through the new general algorithm fault tolerance (GAFT) [22]. With improved design methods and planning to mitigate the consequences of potential failure, the ability to create software and a business process with a good ability to recover is enhanced for the benefit of the entire organization.

Change management is also a vital stage in the software development life cycle. Strategies that handle change in a scheduled manner are preferable and should be applied as early in the software development cycle as possible. The change process should be highly controlled in order to increase the likelihood of security, resiliency and integrity in the software development cycle. The dependencies of all systems and components should be documented so that when changes are scheduled to occur they are discussed with the relevant stakeholders. Unscheduled change management strategies are extremely detrimental to the system, especially when the system is already in production.

Generally, systems are becoming more integrated and development is becoming more complex on a worldwide basis. Reduced complexity in the software design is a good resilience method. The software should be designed modularly and each module should have a method of monitoring its own status. This makes the system less complex to troubleshoot in the event of failure. Despite this, the system should still be able to meet all the needs that were outlined in the objectives while keeping complexity under control. With increased complexity, the inability of the final system to recover after various disruptions also increases.
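As a concrete illustration of the fault-tolerance and graceful-degradation guidance above (this sketch is ours, not drawn from the cited work, and the function names are hypothetical), a minimal retry-with-backoff wrapper might look as follows in Python:

import time

def call_with_retry(operation, fallback, retries=3, base_delay=0.5):
    # Invoke operation; on failure, retry with exponential backoff,
    # then degrade gracefully by returning the fallback result.
    for attempt in range(retries):
        try:
            return operation()
        except Exception:
            time.sleep(base_delay * (2 ** attempt))  # back off before the next attempt
    return fallback()  # graceful degradation instead of crashing

Wrapping calls to unreliable dependencies in such a routine keeps a transient fault in one component from propagating as an outright failure of the whole application.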
4 Software Engineering process for improved resilience
In systems that are already deployed, code refactoring and optimization are important. Improving a system's resiliency calls for the collection of detailed measures of the software process and product quality. To design for improved resilience, engineers should focus on several concerns, which may include planned and unplanned downtime, system repair time, disaster recovery, and uptime requirements [9]. In the system optimization phase, engineers define key performance measures for
system uptime or availability, estimation of functional availability, acceptable downtime and assessments of possible failures [3]. After defining these metrics, the system should be reworked from its core to ensure resilience.

Another approach, as suggested by J. Allspaw, is to ensure that organizations learn from failure. When software fails, the organization should treat it as a lesson, meaning that new resilience features should be introduced into the systems. Companies should consistently pursue strategies to identify the systemic vulnerabilities present in their software. One valid approach is to identify such vulnerabilities when software fails so that new features can be developed. Even though such exercises are very complex and lengthy, the result of a successful exercise is worth the time and resources invested. Indeed, postmortems are opportunities for organizations, users, and developers to learn the underlying vulnerabilities in the systems [17]. This often improves the ability to anticipate and resolve failures in the future. When systems fail, most organizations launch investigations to understand the root cause of the failures [14]. In such endeavors, companies may even discover vulnerabilities that they did not know existed. Even though the system may be working optimally, it is prudent to identify and resolve issues to curb any future failures.

Organizations should develop features, routines, and processes for recovery and restoration after failure [12]. Failure recovery should be governed by processes and activities that have been identified and documented. Recovery procedures should outline other fail-over systems to which the application can redirect operations when there is a failure [9]. This is often provided by backup systems located on-site or off-site. Recent developments in cloud systems have provided another option for fail-over techniques [2]. This does not necessarily point to software features alone: organizational work procedures and processes can include several activities aimed at ensuring that systems recover quickly from failures [5]. Such an approach is important since modern architectures operate in diversified environments where technological and human factors interplay. This is in agreement with the insights of J. Allspaw that human factors are important in system failure [17]. The human-machine interface determines how systems operate, and their resilience might rely on how such interfaces are planned and implemented.

5 Economics of Resilience

Normally, all aspects of business depend on economics. Organizations must weigh costs against the need for resilience. The strategic importance of the application should also be considered. Ultimately, organizations must strike a delicate balance between reducing costs and ensuring that applications are resilient and reliable [2]. Therefore, with costs in mind, organizations should understand the balance between backups, redundancy, fail-over systems, and the economics of their operations [4]. One approach is the "greedy approximation", which adds a new redundant component at each step, yielding the highest gain in reliability on each iteration. The result is to induce the redundancy around the system from the baseline forward. Similarly, the apportionment process spreads the redundancy within the system, but in a top-down fashion. In general, the two techniques may yield a different set of sub-optima; most of
the time, both will yield good solutions, though possibly economically divergent ones. The apportionment approach has a number of advantages, including the following: (a) it fits in well with the philosophy of system design generally adopted by designers of large systems; (b) its computations will in general be simpler; and (c) it may provide more insight into when the suboptimal solutions it generates are good [21]. In some cases, it might not be economically feasible to improve the resilience of systems. However, organizations will always operate better when they rely on resilient systems, especially when such systems form the core or backbone of organizational processes.
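As a simple illustration of the reliability gain that drives the greedy approximation (our example, using the standard series-parallel reliability formulas rather than anything from the cited work): duplicating a component of reliability r raises that component's reliability to 1 - (1 - r)^2, so a greedy step adds redundancy where the resulting system-level gain is largest.

def with_redundancy(r: float, copies: int) -> float:
    # Reliability of `copies` parallel instances of a component of reliability r.
    return 1.0 - (1.0 - r) ** copies

# Greedy step: duplicate the component that yields the largest system gain.
# For a series system, overall reliability is the product of component reliabilities.
components = [0.90, 0.99, 0.95]
gains = []
for i, r in enumerate(components):
    upgraded = components.copy()
    upgraded[i] = with_redundancy(r, 2)
    system = 1.0
    for x in upgraded:
        system *= x
    gains.append((system, i))
print(max(gains))  # duplicating the weakest component (0.90) helps most here

Running this shows the greedy choice is the weakest component: duplicating it lifts system reliability from about 0.846 to about 0.931, a far larger gain than duplicating either of the stronger components.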
6 Selected software resilience projects in literature
The following projects are being conducted through the Center for Resilient Software at MIT. The general notion in resilient features is adding functionality to create resilient software, yet with minimal added functionality so that it remains feasible to formally verify the software. "The Cloud Intrusion Detection and Repair (CIDAR) project is developing a system that observes normal interactions during the secure operation of the cloud to derive properties that characterize a secure operation. The project is funded under the DARPA Mission-Oriented Resilient Clouds (MRC) program with MIT at the helm. If any part of the cloud subsequently attempts to violate these properties, the system intervenes and changes the interaction by adding or removing operations, or changing the parameters that appear in operations, to ensure that the cloud executes securely and survives the attack while continuing to provide uninterrupted service to any legitimate user. The approach revolves around a new technique: input rectification. Applications process the vast majority of inputs securely. Malicious attacks find success in inputs that the application regards as errors, capitalizing on their malformed, novel features. Input rectification research observes inputs that the application processes to derive a model in the form of constraints over input fields. A defined boundary is created for the application: a set of inputs that the application can process error free. When the application encounters an input that is outside the boundary, the rectifier uses the model to change the input, moving it inside the accepted boundary of the application. The results show that this technique eliminates security vulnerabilities in a range of applications, leaves the overwhelming majority of safe inputs unchanged, and preserves much of the useful information in modified atypical inputs [19]." The SPICE project, another promising collaborative project, funded by IARPA under the STONESOUP program, has devised and is fine-tuning a method to "eliminate vulnerabilities triggered by a variety of low level errors in stripped x86 binaries. A combined dynamic and static type inference system analyzes reads and writes to infer how the application structures the flat x86/x64 address space. The information is then utilized to preserve the integrity of the execution. Through this project one can neutralize malicious attacks that require the exploit of buffer overflow vulnerabilities, and a configurable security policy is used to modify the execution to eliminate the vulnerability and enable continued safe execution.
An additional method derived from the SPICE project is a precise taint tracing system. This system combines static and dynamic analysis to help minimize overhead. The taint information enables one to detect the unsafe direct use of untrusted input fields at vulnerable locations such as command invocation, SQL and memory allocation sites. Secondarily, one is able to track memory allocation information to properly eliminate buffer overflow attacks for ongoing monitoring." A third project, under STONESOUP, is focused on eliminating bugs in a particular language and set of libraries. Headed by Kestrel and MIT, the VIBRANCE project is developing techniques to eliminate vulnerabilities in Java applications triggered by unchecked inputs. VIBRANCE is attempting to provide a taint tracer with overhead small enough for routine production use, combining precise dynamic taint tracing with static analysis to reduce the tracing overhead [26]. The inputs monitored will quell SQL, command, LDAP, and XQuery injection attacks, as well as resource allocation vulnerabilities and numeric overflow and underflow vulnerabilities. The International Workshops on Software Engineering for Resilient Systems (SERENE) is now in its 9th edition, bringing together researchers and industry practitioners to discuss advances in engineering resilient systems.
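To make the input-rectification idea described above concrete, the following sketch is ours (not code from the CIDAR or SPICE projects); the field names and learned bounds are hypothetical:

# Learned constraints over input fields, derived from inputs the application
# has processed safely (field names and bounds here are hypothetical).
NUMERIC_BOUNDS = {"width": (1, 4096), "height": (1, 4096)}
MAX_TITLE_LEN = 256

def rectify(inp: dict) -> dict:
    # Change an atypical input so that it falls inside the learned safe boundary.
    out = dict(inp)
    for field, (lo, hi) in NUMERIC_BOUNDS.items():
        if field in out:
            out[field] = max(lo, min(hi, out[field]))  # clamp out-of-range values
    if "title" in out:
        out["title"] = out["title"][:MAX_TITLE_LEN]  # truncate over-long strings
    return out

Inputs already inside the boundary pass through unchanged, while atypical inputs are nudged into the region the application is known to process safely, which is the essence of the technique.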
7 Conclusion

The topic of software resilience is relatively contemporary. Several studies have been conducted, and the results indicate that the software engineering field has been lagging behind other engineering sectors. The studies also noted some important factors that affect resilience; complexity, globalization, hybridization, interdependency, rapid change and net-centricity are important considerations. Further, some design-related factors include human factors, inappropriate use, poor designs, change management and inadequate testing. To ensure that systems are resilient, software engineers should ensure that their development process considers core elements or features in the design for resilience. Fault tolerance, fail-over systems, recovery options, and backup systems are important considerations. Further, human factors should be considered in the design, as well as the relevant economic considerations. Ultimately, organizations perform better when their core software systems are resilient and reliable. Considering the importance of this area of study, it is important for more research to be conducted, especially in the areas of resilience features, new products and features, and relevant organizational case studies.

8 References
[1] Avgeriou, Paris. Software Engineering for Resilient Systems: Fourth International Workshop, SERENE 2012, Pisa, Italy, September 27-28, 2012, Proceedings. Springer, 2012.
[2] Axelrod, Warren. "Investing in Software Resiliency." The Journal of Defense Software Engineering 22 (2009): 20-25.
[3] Berleur, Jacques J. What Kind of Information Society? Governance, Virtuality, Surveillance, Sustainability, Resilience: 9th IFIP TC 9 International Conference, HCC9 2010 and 1st IFIP TC 11 International Conference, CIP 2010, Held as Part of WCC 2010, Brisbane, Australia. Springer Science & Business Media, 2010.
[4] Chang and Chen. 2014 International Conference on Artificial Intelligence and Software Engineering (AISE2014). DEStech Publications, Inc, 2014.
[5] Florio, Vincenzo. Innovations and Approaches for Resilient and Adaptive Systems. IGI Global, 2012.
[6] Majzik, Istvan and Vieira, Marco. Software Engineering for Resilient Systems: 6th International Workshop, SERENE 2014, Budapest, Hungary, October 15-16, 2014, Proceedings. Springer, 2014.
[7] Jackson, Scott. Architecting Resilient Systems: Accident Avoidance and Survival and Recovery from Disruptions. John Wiley & Sons, 2009.
[8] Diaz-Herrera, Jorge and Tucker, Allen. Computing Handbook, Third Edition: Computer Science and Software Engineering. CRC Press, 2014.
[9] Khan, Khaled M. Security-Aware Systems Applications and Software Development Methods. IGI Global, 2012.
[10] Kuhn, Roland. "Creating Resilient Software with Akka." 2013.
[11] Vieira, Marco and Van Moorsel, Ad. Resilience Assessment and Evaluation of Computing Systems. Springer Science & Business Media, 2012.
[12] Merkow and Raghavan, Lakshmikanth. Secure and Resilient Software: Requirements, Test Cases, and Testing Methods. CRC Press, 2011.
[13] Caralli, Richard A., Allen, Julia H., and White, David W. CERT Resilience Management Model (CERT-RMM): A Maturity Model for Managing Operational Resilience. Addison-Wesley Professional, 2010.
[14] SSCA. "Resilient Software." 2015.
[15] Stan, Han. Blue-Prints for High Availability: Designing Resilient Distributed Systems. New York: John Wiley & Sons, 2009.
[16] Cojocaru, Svetlana and Bruderlein, Claude. Improving Disaster Resilience and Mitigation - IT Means and Tools. Springer, 2014.
[17] Théron, Paul. Critical Information Infrastructure Protection and Resilience in the ICT Sector. IGI Global, 2013.
[18] That, Emi Islam. Secure and Resilient Software Development. CRC Press, 2011.
[19] Troubitsyna, Elena A. Software Engineering for Resilient Systems: Third International Workshop, SERENE 2011, Geneva, Switzerland, September 29-30, 2011, Proceedings. Springer, 2011.
[20] Webb, Jenn. "How resilience engineering applies to the web world." 2011.
[21] Woods, Rozanski. "The Availability and Resilience Perspective." 2015.
[22] Mission-Oriented Resilient Clouds. DARPA, 2015.
[23] IARPA. Securely Taking On New Executable Software of Uncertain Provenance (STONESOUP). 2015.
[24] Shooman, Martin L. Reliability of Computer Systems and Networks. John Wiley and Sons, 2002.
[25] Schagaev, Igor and Kaegi, Thomas. Software Design for Resilient Computer Systems. Springer, 2016.
SESSION
WEB-BASED TECHNOLOGIES AND APPLICATIONS + CLOUD AND MOBILE COMPUTING
Chair(s)
TBA
A Comparison of Server Side Scripting Technologies
Tyler Crawford, Tauqeer Hussain*
[email protected], [email protected]
Computer Science & Information Technology Department
Kutztown University of Pennsylvania
Kutztown, PA, U.S.A.

Abstract — In the presence of so many scripting languages or technologies available for server-side processing, it is usually a difficult decision for a beginner which technology to learn or use for development. In this paper, we have presented four leading server-side scripting technologies: PHP, Django, Ruby on Rails, and Node.js. We have identified five comparison attributes, namely ease of getting started, availability of help and support, popularity, availability of development tools and packages, and performance, for the purpose of comparison. We have rated each technology based on these comparison attributes and provided our recommendations for learners and developers. It is expected that developers, software engineers, as well as instructors teaching web development courses, will benefit from this research.

Keywords — web technologies; Node.js; PHP; Django; Ruby on Rails

Type of submission: Regular Research Paper
I. INTRODUCTION

Today a number of scripting languages or technologies are available for server-side processing and integration with databases. These technologies have their advantages and disadvantages, which makes it difficult for developers to choose an appropriate server-side environment for their projects' development. Instructors teaching computer science or information technology courses in the web engineering or web development areas are confronted by a similar problem, that is, which language they should adopt in their program curriculum to teach their students. In this paper, we provide a comparison of four major server-side scripting technologies: Node.js, PHP, Django, and Ruby on Rails. For the purpose of this comparison, we have chosen five comparison attributes: 1) the ease of getting started with a particular technology, 2) the help and support available to developers for that particular technology, 3) its popularity, 4) the development and package management tools available for that specific technology, and 5) its performance.

It is worth mentioning here that many other server-side technologies are not included in this comparison because they are either not scripting languages or are frameworks. For example, two of the largest server-side technologies, ASP.NET MVC and Java servlets (including their frameworks), were not included as they are not classified as scripting languages like JavaScript, PHP, or Ruby. Other languages excluded are Go, which is a systems language and not a scripting language, and ColdFusion, which is a commercial platform by Adobe. Hack, created by Facebook, is a sort of dialect of PHP and was not included because it is not used as widely as PHP on the web.
Finally, Vapor, an upcoming framework for the Swift programming language, was also excluded due to the infancy of the technology, which results in a lack of information and a lack of help and support. Swift itself is not a scripting language either.

In the rest of this section, we provide a brief history of each scripting language that we are comparing. In the following sections, we compare these languages with respect to each comparison attribute mentioned above. In the last section, we conclude this research by rating each technology, giving our recommendations, and providing pointers to further work.

A. PHP
PHP's history dates back to when the internet was still young. PHP is actually a successor to a product named PHP/FI, which was created in 1994 by Rasmus Lerdorf. Lerdorf originally created PHP to track visits to his online resume and named the suite of scripts "Personal Home Page Tools". Development continued on to PHP 3, which most closely resembles PHP as it is today [53]. PHP is written primarily in C with some code in C++.

B. Django
Django was created in the fall of 2003 when programmers for the Lawrence Journal-World newspaper began using Python to build applications. The World Online team, which was responsible for the production and maintenance of multiple news sites, thrived in a development environment that was dictated by journalism deadlines. Simon Willison and Adrian Holovaty developed a time-saving web development framework out of their necessity to maintain web applications under extreme deadlines. In the summer of 2005, the developers decided to release the framework as open source software and named it after the jazz guitarist Django Reinhardt. Since 2008, the newly formed Django Software Foundation has continued maintaining Django [54].

C. Ruby on Rails
Unlike PHP, Django, and Node.js, Ruby on Rails was not quite created out of necessity. Rather, it was pulled from David Heinemeier Hansson's work on the product Basecamp while working for the company Basecamp. The first open source release of Rails was in July 2004, with commit rights to the project following soon after. Rails uses the Model-View-Controller (MVC) programming pattern to organize application programming so that it is easier to work with.

D. Node.js
Node.js was first introduced in May 2009 by Ryan Dahl. When looking at the prevailing server side programming
languages around 2009, Dahl felt that there was a problem with input and output (I/O) and with database processing at the backend. Dahl observed that, instead of waiting for the database to respond with a result set (which wastes millions of clock cycles), an application can proceed in non-blocking mode [52]. The goal of Node.js was therefore set: to provide an event-driven, non-blocking I/O model that is lightweight and efficient [4]. It allows the application to continue its execution without wasting clock cycles, thus making the application run faster. Node.js is primarily written in C, C++, and JavaScript and touts that it is a JavaScript runtime built on Google Chrome's V8 JavaScript engine.
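The non-blocking idea can be illustrated in any language; the sketch below is ours (it is not Node.js code) and uses Python's asyncio to show a program doing useful work while a slow database call is in flight:

import asyncio

async def query_database() -> str:
    await asyncio.sleep(1.0)   # stands in for a slow database round-trip
    return "result set"

async def main():
    pending = asyncio.create_task(query_database())  # start the query
    print("handling other requests while the query runs...")
    print(await pending)       # collect the result once it is ready

asyncio.run(main())

Instead of the program idling for the full second, the event loop is free to service other work until the awaited result arrives, which is the essence of Node.js's model.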
the header, and the data are set. Finally, the server listens on the port and specified host. The for loop in Figure 2 should be quite familiar in that it is very similar to a for loop in many other languages. It is important to note however, that JavaScript is a dynamically typed language. This means that var can be used instead of explicitly defining a variables type but rather its type is inferred. for (var i = 0; i < n; i++) { // code } Fig. 2. For loop in Node.js/JavaScript
II. GETTING STARTED In many programming languages, the starter program is to say “Hello, World!”. For comparison sake, we are doing the same thing in each of the four server-side technology sections given below. Though syntax for many other things can be compared, considering loops as an essential part of every scripting and programming language, a simple for loop is also coded to show that for primitive statements the syntax in the four technologies is quite comparable. A. Node.js To get started with Node.js, a simple installer may be used to obtain the technology locally in order to start developing [4]. The installation typically includes the node package manager as well any command line tools to get started with local server development. However, the installation package does not include any database management tools, but rather is left to the developer to choose freely which database system would be necessary to create their desired product. The code for creating a “Hello, World!” project in Node.js can be seen in Figure 1. const http = require('http'); const hostname = '127.0.0.1'; const port = 3000; const server = http.createServer((req, res) => { res.statusCode = 200; res.setHeader('Content-Type', 'text/plain'); res.end('Hello, World!\n'); }); server.listen(port, hostname, () => { console.log(`Server running at http://${hostname}:${port}/`); }); Fig. 1. Node.js “Hello, World!” server code [48].
Node.js relies heavily on the use of packages from the Node Package Manager (npm) to complete tasks. The first line of code requires that an http package be used to create a server. The next lines specify the hostname and port number. In this case, it is the localhost address and port 3000. To create a server, the createServer function is utilized where the response status code,
B. PHP
To get started with developing and testing PHP code locally, a local server environment is desired. Tools to assist in creating this environment include XAMPP, MAMP, and WampServer [1-3]. Each of these is quite simple to install and get started with on Linux, MacOS, or Windows, but not every tool supports each platform. Typically, the aforementioned technologies include a database management system that interacts well with the PHP server tools. The code for creating a PHP "Hello, World!" project is quite simple and involves only a few lines of code. The code, seen in Figure 3, simply echoes "Hello, World!" onto the page and is used inside PHP tags. This is similar to bash scripting when printing data to a document or to the console.
<?php
  echo "Hello, World!";
?>

Fig. 3. PHP "Hello, World!" code
The code for the PHP for loop shown in Figure 4 is similar to that of Node.js. The variables, however, are prefixed with a $ symbol, which is typical when programming in PHP.
for ($i = 0; $i < $n; $i++) {
  // code
}

Fig. 4. For loop in PHP
C. Django
Getting started with Django is quite simple. The first step is to install Python if it is not already installed, and then use the Python package manager, pip, to install Django [5]. The development version of Django can also be installed by cloning the repository from GitHub. Django does not provide any sort of database management system; this is left to the discretion of the developer creating a project. The code for creating a Django "Hello, World!" project is simple as well, but it is important to modify the file controlling the server settings to be able to properly develop locally. The project code shown in Figure 5 defines a function called index
that takes a request parameter and returns an HttpResponse with the "Hello, World!" text as the parameter.

def index(request):
    return HttpResponse("Hello, World!")

Fig. 5. Primary code responsible for creating "Hello, World!" in Django [49]
Django fortunately uses Python to make creating a for loop quite simple. The code, found in Figure 6, creates a for loop using a range of values between 0 and n.

for i in range(0, n):
    # code

Fig. 6. For loop in Django/Python
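As a complement to Figure 5, a minimal sketch of the URL configuration that wires the index view to the site root might look as follows. This sketch is ours, not from the original figures; the module names are illustrative, and it assumes the django.conf.urls API of the Django 1.x releases that were current at the time:

from django.conf.urls import url
from . import views  # the module containing the index view

urlpatterns = [
    url(r'^$', views.index),  # route the site root to the index view
]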
D. Ruby on Rails
Finally, getting started with Ruby on Rails has proven to be more difficult compared to the other technologies. The first step in getting started with Ruby on Rails, or Rails as it is widely known, is to ensure that the latest compatible version of Ruby is installed for Rails. For Windows, this involves an installer to install Ruby. For Linux and macOS, a Ruby version manager is necessary to manage the versions of Ruby installed. This task can become more complicated if the developer is unfamiliar with version management tools or lacks knowledge of how command line tools typically behave. Once the latest compatible version of Ruby is installed, the gem package manager that comes with Ruby may be used to install the Rails package. The Install Rails website provides clear instructions on how to install Rails on each platform and ensures that Ruby is installed as well [6].

The "Hello, World!" project is handled a little differently with Rails due to the number of files required to start developing and running a local server. No code will be shown for Rails; rather, the process to get a project started and see the "Yay! You're on Rails" webpage is given instead. The commands for creating the project are shown, however, for the sake of being able to replicate this application. The first necessary step is to create a new Rails application via the command line tools, then change directories into the new folder created for the application, and finally start the server. The commands for doing so may be found in Figure 7.

$ rails new <application-name>
$ cd <application-name>
$ rails server

Fig. 7. Commands used to create a new Rails application
Finally, Rails utilizes the Ruby language to create a for loop over a range of values, similar to Django. The difference is that Django calls a range function, whereas Rails uses two periods between the lower and upper bounds of the range. This difference can be quickly identified in Figure 8.

for i in 0..n
  # code
end

Fig. 8. For loop in Rails/Ruby
E. Comparison on “Getting Started”
It can easily be observed that getting started with PHP is the simplest. Upon setting up a local server environment, it takes minimal effort to create a new PHP file and view it on the local server. Node.js, on the other hand, proves to involve more difficult syntax that may not be quite clear to a beginner. Django continues the trend of difficulty, proving to be a bit more complicated to get started with. While writing “Hello, World!” in a function may not be difficult, setting up the initial URL patterns, hostname, and port number creates extra steps that can be confusing and hard to understand. Finally, Rails may be difficult to set up but makes up for it in the ease of creating a new project and running it on a local server. Installing Ruby and Rails can be a daunting task if one is unfamiliar with how the technologies work, but creating a new project and running it is quite straightforward.
III. HELP AND SUPPORT
In the following subsections, the age of each technology and the getting-started guides, help, and documentation available for it are discussed.

A. PHP
PHP first appeared in June of 1995, which makes it 21 years old and the oldest of all technologies discussed in this document. When navigating to the PHP website, a developer will notice that getting-started guides are available from those who develop the language, making it quite clear and simple how to get started writing a basic PHP web application. Due to its age, there are a large number of posts on the internet related to the language, which makes it easy to get help.

B. Django
Django, the second oldest technology, was founded on July 21, 2005, currently making it 11 years old. Django provides well-documented starter guides on its site for developers to begin learning the framework. While Django is used by less than 0.1% of the internet, getting help with it should not be too difficult due to the documentation and quick-start guides on its site. Aside from this, it is important to note that Django is written in Python, which is widely used, making it easier to obtain specific help related to the Python language itself.

C. Ruby on Rails
Ruby on Rails, or Rails, was founded on December 18, 2005, which currently makes it 10 years old. RailsGuides provides numerous guides to working with Rails and getting started as a new developer to the framework [11].

D. Node.js
Finally, the newest of the technologies, Node.js, was founded on May 27, 2009, which currently makes it 7 years old. Node.js provides a minimal usage and example guide on its site, which shows how to get started but provides no further examples or guides to continue with after replicating the “Hello, World!” project. There is, however, API documentation for the long-term support (LTS) and newest versions of Node.js.
E. Comparison on “Help and Support”
It can be concluded that PHP is probably the easiest technology to get started with, given the guides, documentation, and other help available. This is partially due to the age of the technology compared to the other three: PHP has ten more years of history than the next oldest technology, Django. While age and popularity alone do not necessarily mean that a technology is easier to get started with, they help, since a more prevalent technology means more people have knowledge of it and are able to assist when help is needed. Django, on the other hand, is used on only a small portion of websites on the internet, but it provides many guides to get started, as well as API documentation. Following closely behind Django is Rails, which has many guides available on the web via RailsGuides, making it a suitable choice if desired [11]. Finally, for the newest technology, Node.js, though it is gradually gaining popularity, finding well-documented starter guides has been more difficult than desired. The Node.js website provides a usage and example guide in the API documentation but no further guides to getting started with Node.js. It seems that outside sources will be absolutely necessary for a developer just getting started with the technology.
IV. POPULARITY
The following section discusses the popularity of each technology in terms of its use on the web today. However, this may not correctly reflect popularity amongst developers. Trending popularity is derived from the number of packages currently available in each package management system, as this indicates a general level of developer interest in a specific technology.

A. Node.js
Based on statistics from W3Techs, Node.js is currently used by approximately 0.24% of all websites whose server they know [13]. Figure 9 below shows the trend of websites using Node.js from November 2015 to November 2016.

Fig. 9. Usage statistics from W3Techs of Node.js for Nov ’15 – Nov ’16 [13]

B. PHP
Based on the statistics from the site BuiltWith, and as of November 2016, 11.7% of the entire internet uses PHP [8]. Figure 10 shows the trend of websites using PHP from November 2015 to November 2016. It is unclear why there is a sudden drop in the data; one explanation could be that the application used to count websites using PHP had a bug during this time. As of November 2016, approximately 42 million websites use PHP.

Fig. 10. Line graph of websites using PHP from Nov ‘15 to Nov ’16 [8]

C. Django
Based on statistics from W3Techs and BuiltWith, Django is used by less than 0.1% of the internet [9, 10]. As of November 2016, Django is used on approximately 5,200 websites, which is very small compared to PHP. Figure 11 below shows the trend of websites using Django for November 2015 – November 2016.

Fig. 11. Line graph of websites using Django from Nov ‘15 to Nov ’16 [10]

D. Ruby on Rails
Currently, based on statistics from BuiltWith, approximately 0.3% of the entire internet uses Rails [12], as shown in Figure 12. Once again, it is unclear what caused the dip in the data for a few months over the year-long period. Nonetheless, there are approximately 1.09 million websites using Rails as of November 2016, based on statistics from BuiltWith [12].

Fig. 12. Line graph of websites using Rails from Nov ‘15 to Nov ’16 [12]

V. DEVELOPMENT TOOLS AND PACKAGE MANAGEMENT SYSTEMS
Each of the technologies discussed in this paper has development tools and package management systems that support the corresponding languages. The following subsections present the number of packages each package management system currently has. The numbers for each of these package managers are taken from a website called Module Counts [14].

A. Node.js
As stated earlier in this document, npm is the package manager used for Node.js projects and is bundled with Node.js on installation. Node.js relies on npm packages to get tasks done when creating projects. When installing Node.js, a developer receives the Node.js JavaScript runtime, which allows the developer to run Node.js applications. In order to start programming specific applications, like a web server, a package will be necessary. By not bundling everything desired into the technology, each application is kept lightweight. Section 6 discusses a number of the environments that Node.js can run on; if Node.js were to include everything necessary to run in all of these environments, it would be very slow and undesirable to use. It should be said that in order to run any application using Node.js, npm is mandatory to use; if it is not used, the application likely becomes a simple JavaScript application instead. Npm packages can be browsed via the npmjs website, where approximately 345,000 packages currently live [14, 15]. Packages can be installed via the command line tools on a per-project basis, and documentation for each package is typically displayed on the package’s page on the npmjs site.

B. Ruby on Rails
RubyGems is the package manager used for Ruby and Ruby on Rails and is provided with the installation of Ruby. RubyGem packages can be browsed via the RubyGems website, where approximately 125,000 packages are currently listed [14, 16]. Packages for Ruby and Rails can be installed via the command line. Typically, documentation for the packages is provided under the “Links” section and is shown on an external site.

C. PHP
PHP utilizes a technology called Composer as its package management system, which must be installed separately from PHP [17]. Packagist is the site used to browse or search for packages to use with Composer, and it currently contains approximately 115,000 packages [14, 18]. Documentation is typically linked from each package’s page to assist in getting started. Composer is similar to the rest of the package managers in that, in order to install a package, the command line tool that comes with Composer is used.

D. Django
Django uses the package index that comes with Python, called PyPI, or the Python Package Index. Packages, installed via the command line, can be searched using the PyPI site, an extension of the main Python website, where approximately 90,000 packages are currently listed [14, 19]. Documentation for each package is either given on the package page, linked to, or not shown at all. This can cause problems when a developer wants to use a tool but no information is clearly given on its package page, requiring the developer to seek other resources altogether or to search the internet for the package’s documentation.

E. Comparison on “Development Tools and PM Systems”
Based upon these findings, Node.js provides almost three times the number of packages of the next competitor, RubyGems, which gives some indication that there are more developers working with, and interested in, Node.js. This number could be skewed by Node.js lacking features that other technologies have and therefore needing packages to make up for the missing abilities; this should not, however, account for over 200,000 packages’ worth of missing features. Node.js also provides a section on each package page for a developer to include documentation, if it can be shown on a single page. RubyGems and Composer both prove to be good package managers overall, providing a large number of packages to developers with well-rounded package pages. The packages listed on Python’s package index website, PyPI, on the other hand, occasionally lack documentation, and the site lacks a good interface for browsing and discovering new packages. Overall, PyPI is usable but falls short in information and friendliness.

VI. INTEGRATION WITH DATABASES AND DRIVERS
In the following sections, integration with the top five databases is studied for each of the technologies compared in this paper. The top five databases are taken from DB-Engines and are as follows: Oracle, MySQL, Microsoft SQL Server, MongoDB, and PostgreSQL [22-27].

A. Node.js
Node.js does not have support for any database upon installation. It does, however, inherit support from packages. All of the top five databases are supported by at least one package. The following packages are a few examples that provide support: oracledb for Oracle, mysql for MySQL, mssql
for Microsoft SQL Server, mongodb for MongoDB, and pg for PostgreSQL [28-32].

B. PHP
PHP appears to integrate with MySQL, Oracle, PostgreSQL, and MongoDB upon installation. As of PHP 7.x.x, it appears that support for Microsoft SQL Server has been deprecated, but support can be added via the SQL Server Driver on Microsoft’s website [33].
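As an illustration of this bundled support, connecting to MySQL from PHP can be as short as the following sketch using PDO, one of PHP's built-in database interfaces (the host, database name, and credentials are placeholders):

<?php
// PDO ships with PHP and includes a MySQL driver.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$row = $pdo->query('SELECT 1 + 1 AS solution')->fetch(PDO::FETCH_ASSOC);
echo $row['solution'];  // prints 2
?>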
C. Django
Django relies on support added by drivers to the Python language. Support is provided for each of the top five database systems, but it may not be clear how to add or utilize a driver, and drivers may be hard to find. Examples of packages or drivers that support each of the top five databases include the following: psycopg2 for PostgreSQL, cx_Oracle for Oracle, Django MongoDB Engine for MongoDB, pymssql for Microsoft SQL Server, and MySQL Connector/Python for MySQL [34-38].

D. Ruby on Rails
Rails inherits database support primarily from RubyGem packages, as support for databases is not built into the Ruby language. A few examples of RubyGem packages that provide support for the top five databases are mysql2 for MySQL, pg for PostgreSQL, mongoid for MongoDB, activerecord-sqlserver-adapter for Microsoft SQL Server, and activerecord-oracle_enhanced-adapter for Oracle [39-43].

E. Comparison on “Integration with Databases”
After studying how each database can be connected to from each technology, it is evident that PHP has the most support upon initial installation, though it may take some work to utilize Microsoft SQL Server with the newest version of PHP. While Node.js does not support database connections by default, this is made up for by the packages the community provides, with well-rounded documentation for each. Ruby and Rails inherit support via packages as well but fall just slightly behind Node.js in terms of community support and documentation. Finally, support for connecting to a database using Python for Django seems to be less intuitive and requires research to understand how to install and utilize the added drivers.

VII. PERFORMANCE
The following section discusses performance differences among the four technologies. However, there is not much significant data available in the literature or on the web comparing the performance of these technologies. The only source we could find is [50], which compares requests per second for plaintext, JSON, and SQLite workloads on Vapor (Swift), Ruby on Rails (Ruby), Laravel (PHP), Lumen (PHP), Express (JavaScript), Django (Python), Flask (Python), Spring (Java), Nancy (C#), and Go. Of the technologies compared, the important ones for this document are Express (a Node.js web application framework), Laravel and Lumen, Ruby on Rails, and Django. Figures 13–15 show the results of the comparison. Of the technologies being compared, it is evident that Node.js is the clear winner in each test run, with Django coming in second, and Rails and PHP sitting in the last three positions of the bar graphs. One can infer that Node.js has the ability to process more requests per second due to its event-driven, non-blocking I/O strategy. With this strategy, fewer clock cycles are wasted and the program does not wait on another process, resulting in significantly improved performance.

Fig. 13. Plaintext Requests/second [50]
Fig. 14. JSON Requests/second [50]
Fig. 15. SQLite Requests/second [50]
VIII. CONCLUSION AND FUTURE WORK
In this paper, we have compared four web-based server-side scripting technologies: PHP, Django, Ruby on Rails, and Node.js, based on five comparison attributes – ease of getting started, availability of help and support, popularity, availability of development tools and packages, and performance. To summarize and make it convenient for the reader, we have rated each technology for each comparison attribute in the following table (Table 1) on a qualitative scale of excellent, very good, good, fair, and poor. Though these ratings are subjective, they serve the purpose of providing a quick comparison.

Table 1: Qualitative comparison of Node.js, PHP, Django, and Rails

Attribute                                        | Node.js   | PHP       | Django    | Rails
Getting Started                                  | Very good | Excellent | Very good | Good
Help and Support                                 | Good      | Excellent | Excellent | Excellent
Popularity                                       | Very good | Excellent | Fair      | Very good
Development Tools and Package Management Systems | Excellent | Good      | Fair      | Very good
Environments                                     | Excellent | Poor      | Good      | Good
Integrations with Databases                      | Excellent | Very good | Fair      | Very good
Performance                                      | Excellent | Fair      | Very good | Fair

Based on the above ratings, our overall recommendation is Node.js. Node.js is a very fast technology that allows for event-driven, non-blocking I/O, giving it an extra boost compared to the other technologies. It is also a very popular platform amongst developers when comparing the number of packages available in npm to other package management systems. It can also run virtually everywhere you want to run it: aside from running on a server, Node.js can be used to create console applications, run on the desktop with the help of Electron, run in the browser, and run on embedded systems. The integration with five of the most popular databases through packages makes it easy to continue using whichever database system you desire. The documentation provided by these popular packages is usually very good, which makes getting started with them even easier. The limitation of Node.js, however, is its learning curve. Developers are required to find other resources to get started, and these may not be readily available. PHP continues to dominate the internet, primarily due to its age and ease of getting started, as well as the help and support available on the internet. It is limited in the number of packages available, the environments it can run on, and its overall performance. For future work, we intend to run experiments that can help evaluate the performance of each technology. Also, more comparison attributes can be studied, for instance, the native environment supported by each technology, their hosting and deployment, and support for complex programming structures.

REFERENCES
[1] "MAMP & MAMP PRO." MAMP & MAMP PRO. Appsolute GmbH, 2016. Web. 25 Oct. 2016.
[2] Bourdon, Romain. "WampServer." WampServer. N.p., n.d. Web. 25 Oct. 2016.
[3] "XAMPP Installers and Downloads for Apache Friends." Apache Friends. Apache Friends, 2016. Web. 25 Oct. 2016.
[4] "Node.js." Node.js. Node.js Foundation, 2016. Web. 25 Oct. 2016.
[5] "Download Django." Django. Django Software Foundation, 2016. Web. 25 Oct. 2016.
[6] "Install Rails: A Step-by-step Guide." Install Rails. One Month, Inc., 2016. Web. 25 Oct. 2016.
[7] N/A
[8] "PHP Usage Statistics." BuiltWith. BuiltWith Pty Ltd, 2016. Web. 25 Oct. 2016.
[9] "Usage Statistics and Market Share of Django CMS for Websites." Web Technology Surveys. Q-Success, n.d. Web. 25 Oct. 2016.
[10] "Django Language Usage Statistics." BuiltWith. BuiltWith Pty Ltd, 2016. Web. 25 Oct. 2016.
[11] "Ruby on Rails Guides (v5.0.0.1)." Ruby on Rails Guides. N.p., n.d. Web. 25 Oct. 2016.
[12] "Ruby on Rails Usage Statistics." BuiltWith. BuiltWith Pty Ltd, 2016. Web. 25 Oct. 2016.
[13] "Usage Statistics and Market Share of Node.js for Websites." Web Technology Surveys. Q-Success, n.d. Web. 25 Oct. 2016.
[14] DeBill, Erik. "Module Counts." Modulecounts. N.p., n.d. Web. 26 Oct. 2016.
[15] "Build Amazing Things." Npm. Npm, Inc, n.d. Web. 26 Oct. 2016.
[16] "Find, Install, and Publish RubyGems." RubyGems. RubyGems, n.d. Web. 26 Oct. 2016.
[17] "Dependency Manager for PHP." Composer. N.p., n.d. Web. 31 Oct. 2016.
[18] Boggiano, Jordi. "Packagist: The PHP Package Repository." Packagist. N.p., n.d. Web. 31 Oct. 2016.
[19] "PyPI - the Python Package Index." PyPI. Python Software Foundation, 2016. Web. 31 Oct. 2016.
[20] "Electron." Electron. GitHub, n.d. Web. 31 Oct. 2016.
[21] Prévost, Rémi, Mike McQuaid, and Danielle Lalonde. "Homebrew." Homebrew. N.p., n.d. Web. 31 Oct. 2016.
[22] "DB-Engines Ranking." DB-Engines. Solid IT Gmbh, 2016. Web. 31 Oct. 2016.
[23] "Oracle Database." Oracle. Oracle, n.d. Web. 31 Oct. 2016.
[24] "MySQL." MySQL. Oracle Corporation, 2016. Web. 31 Oct. 2016.
[25] "SQL Server 2016." Microsoft. Microsoft, 2016. Web. 31 Oct. 2016.
[26] "MongoDB." MongoDB. MongoDB, Inc., 2016. Web. 31 Oct. 2016.
[27] "PostgreSQL." PostgreSQL: The World's Most Advanced Open Source Database. The PostgreSQL Global Development Group, 2016. Web. 31 Oct. 2016.
[28] "Oracledb." Npm. Npm, Inc, 2016. Web. 01 Nov. 2016.
[29] "Mysql." Npm. Npm, Inc, 2016. Web. 01 Nov. 2016.
[30] "Mssql." Npm. Npm, Inc, 2016. Web. 01 Nov. 2016.
[31] "Mongodb." Npm. Npm, Inc, 2016. Web. 01 Nov. 2016.
[32] "Pg." Npm. Npm, Inc, 2016. Web. 01 Nov. 2016.
[33] "Microsoft | Developer Network." System Requirements for the PHP SQL Driver. Microsoft, 2016. Web. 01 Nov. 2016.
[34] Varrazzo, Daniele. "PostgreSQL + Python | Psycopg." Psycopg. N.p., 2014. Web. 01 Nov. 2016.
[35] "Cx_Oracle." Cx_Oracle. N.p., n.d. Web. 01 Nov. 2016.
[36] "Django MongoDB Engine." Django MongoDB Engine. N.p., 21 May 2016. Web. 01 Nov. 2016.
[37] "Pymssql." Pymssql. Pymssql Developers, 2016. Web. 01 Nov. 2016.
[38] "Download Connector/Python." MySQL. Oracle Corporation, 2016. Web. 01 Nov. 2016.
[39] "Mysql2." RubyGems. RubyGems, n.d. Web. 01 Nov. 2016.
[40] "Pg." RubyGems. RubyGems, n.d. Web. 01 Nov. 2016.
[41] "Mongoid." Mongoid. RubyGems, n.d. Web. 01 Nov. 2016.
[42] "Activerecord-sqlserver-adapter." RubyGems. RubyGems, n.d. Web. 01 Nov. 2016.
[43] "Activerecord-oracle_enhanced-adapter." RubyGems. RubyGems, n.d. Web. 01 Nov. 2016.
[44] "DigitalOcean: Cloud Computing Designed for Developers." DigitalOcean. DigitalOcean Inc., 2016. Web. 01 Nov. 2016.
[45] "Deploy Pre-built Applications." DigitalOcean. DigitalOcean Inc., 2016. Web. 01 Nov. 2016.
[46] "Cloud Application Platform | Heroku." Heroku. Salesforce, 2016. Web. 01 Nov. 2016.
[47] "Amazon Web Services (AWS) - Cloud Computing Services." Amazon Web Services. Amazon Web Services, Inc., 2016. Web. 01 Nov. 2016.
[48] "Usage | Node.js V7.2.0 Documentation." Node.js V7.2.0 Documentation. Node.js Foundation, 2016. Web. 28 Nov. 2016.
[49] "Writing Your First Django App, Part 1." Django. Django Software Foundation, 2016. Web. 28 Nov. 2016.
[50] Harrison, Shaun, Paul Wilson, Willie Abrams, Jonathan Channon, Chris Johnson, and Sven Schmidt. "Server Side Swift vs. The Other Guys - 2: Speed." Medium. N.p., 13 June 2016. Web. 28 Nov. 2016.
[51] Anonymous. "Perl, Python, Ruby, PHP, C, C++, Lua, Tcl, Javascript and Java Benchmark/comparison." Onlyjob. N.p., 08 Mar. 2011. Web. 28 Nov. 2016.
[52] Stri8ted. "Ryan Dahl: Original Node.js Presentation." YouTube. YouTube, 08 June 2012. Web. 28 Nov. 2016.
[53] "PHP: History of PHP - Manual." PHP. The PHP Group, n.d. Web. 28 Nov. 2016.
[54] Holovaty, Adrian, and Jacob Kaplan-Moss. "Introducing Django." The Django Book. N.p., Nov. 2008. Web. 28 Nov. 2016.
The Architecture of a Ride Sharing Application Mao Zheng, Yifan Gu, and Chaohui Xu Department of Computer Science, University of Wisconsin – La Crosse, La Crosse, WI, USA
Abstract - In the development of a lite version of a ride sharing application, we encountered some real-time requirements. For example, a rider wants to track the driver's location in real time once a driver accepts the trip request. Compared to the traditional method, in which the driver sends the location to the app server every minute and the rider retrieves the information from the server every minute, we used Google Cloud Messaging (GCM) to send messages to client devices in this project. That means the driver sends a GCM notification containing the location information to the GCM connection server, and the information is then sent to the rider. As a result, the rider is able to view the driver's location in real time. The benefit of using GCM is its scalability and performance advantages over the app server. There are also battery life savings for the clients' mobile devices. This paper presents the architectural design of using a GCM connection server in a ride sharing application.

Keywords: Mobile App, Google Cloud Messaging, Connection Server

1 Introduction
In this paper, we present a lite version of a ride sharing application. There are two main types of users: riders and drivers. A rider can request a car by providing his/her preferred start location and the destination of the trip. The rider can also view the cost of the trip, track the driver, and pay and rate the driver after the trip. A driver can view all the requests from different riders, select a request, and pick up the rider. This requires some real-time communication between the rider and the driver. The information about the trip and all the requests is recorded in a database. The project is hosted on the Microsoft Azure cloud platform. NodeJS is used to write server-side code providing RESTful APIs. MongoDB is the backbone database, and Redis is used as a cache to allow fast system response. In this ride sharing application, we use Google Cloud Messaging (GCM) to send messages to client devices. That means the driver sends a GCM notification containing his/her location information to the GCM connection server, and the information is then sent to the rider. As a result, the rider is able to view the driver's location in real time. The benefit of using GCM is the scalability and performance advantages over the app server. There are also battery life savings for the clients' mobile devices.
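As an illustration of the server-side setup described above, a RESTful endpoint in Node.js might look like the following sketch (Express, the route path, and the response shape are our assumptions; the paper does not show its server code):

// Hypothetical route returning a driver's last known location.
const express = require('express');
const app = express();

app.get('/api/drivers/:id/location', (req, res) => {
  // In the real system this would be read from the Redis cache or MongoDB.
  res.json({ id: req.params.id, lat: 43.81, lng: -91.25 });
});

app.listen(3000);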
The paper is organized as follows: In section 2, we briefly introduce GCM and explain how we incorporate it in our project. In section 3, we present the architectural design of our ride sharing application. In section 4, we explain our testing scenario for the real-time functionalities. Section 5 concludes the paper and outlines directions for our future work.
2 Google Cloud Messaging (GCM)
Google Cloud Messaging (GCM) is a free service that enables developers to send messages between servers and client apps. This includes downstream messages from servers to client apps and upstream messages from client apps to servers [1]. A GCM implementation includes a Google connection server, an app server that interacts with the connection server via the HTTP or XMPP protocol, and a GCM-enabled client app. When clients need to communicate with each other, traditionally every communication goes through the app server. For example, the driver sends his/her location to the app server every minute or every second, and the rider sends an HTTP request to get the driver's location from the server every minute or second. The smaller the interval between requests, the better the accuracy: the rider can get the driver's location almost immediately after the driver sends it to the server. However, this causes a lot of traffic to and from the server. In addition, all the polling requests from the rider's device would consume the battery significantly and slow down the app or the device itself. In our project, GCM is used to send messages to the clients' devices. That means the driver sends a GCM notification containing the location information to the GCM connection server, and the information is then sent to the rider. As a result, the rider is able to view the driver's location in real time. Hence, the clients' real-time communication goes through the GCM connection server. Once the trip is completed or the rider cancels the request, the information is saved to the app server accordingly. The benefit of using GCM is the scalability and performance advantages over the app server. There are also battery life savings for the clients' mobile devices.
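For concreteness, a downstream message of this kind can be posted to the GCM HTTP connection server roughly as follows (a sketch only: the server key, device token, and payload fields are placeholders, and the paper's own server code is not shown):

// Node.js sketch of a downstream send via GCM's legacy HTTP endpoint.
const https = require('https');

const body = JSON.stringify({
  to: 'RIDER_DEVICE_TOKEN',  // rider's GCM registration token (placeholder)
  data: { driverGeo: { lat: 43.81, lng: -91.25 }, heading: 90 }
});

const req = https.request({
  hostname: 'gcm-http.googleapis.com',
  path: '/gcm/send',
  method: 'POST',
  headers: { 'Authorization': 'key=SERVER_KEY', 'Content-Type': 'application/json' }
}, (res) => console.log('GCM status:', res.statusCode));

req.end(body);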
3 The Application Architecture
Figure 1 is the architecture diagram for our ride sharing application. The system follows a client-server model. There are two client-side applications: one is for the rider, and the other
is for the driver. The rider can request a car by providing his/her preferred start location and the destination of the trip. The rider can also select vehicle type, view the cost of the trip, track the driver’s current location, pay and rate the driver after the trip. The driver can view all the trip requests from different riders within a predefined distance, select a request, and pick up the rider.
Figure 1 The Architecture Diagram

On the server side, there are a total of six modules:
1) Admin module: this module is for the web server part of our project. It deals with searching for a rider or driver, approving a driver's registration, and reviewing all the trip data.
2) Rider module: this module is responsible for all the functionalities of the rider.
3) Driver module: this module implements all the responsibilities of the driver.
4) Authentication module: this module consists of three parts; each part authenticates one type of user: driver, rider, or admin.
5) Email module: this module is responsible for sending four types of email: a) an email to a rider or a driver once the registration is approved; b) a warning email to the driver if his/her rating is below a threshold; c) an email notifying a driver that his/her account will be closed since the rating did not improve within a time period after the warning was issued; d) the trip receipt to the rider.
6) Data model module: this module defines the schema of all the data to be stored in MongoDB. It includes the user's registration and trip details.

Figure 2 is a screen shot of the rider's app for making a trip request. Figure 3 is a screen shot of the driver's app, where the driver has received the rider's request.

Figure 2 Rider's App

Once the driver accepts the trip request, the rider can view the driver's current location. Figure 4 and Figure 5 are screen shots of the rider's app, in which the rider can see the driver's current location on the map.
Figure 3 Driver's App
Figure 4 Rider views the driver's location
Figure 5 Rider views the driver's location

4 Testing
The developers have conducted white-box testing during the development of the app. Black-box testing has been used in integration testing and system testing. The scenario below has been used repeatedly to focus on the real-time communications between riders and drivers.
Scenario Testing:
• Rider1 makes trip1 request
• Rider2 makes trip2 request
• Driver1 is within 10 miles of trip1's start location
• Driver2 is more than 10 miles away from trip2's start location
• Driver1 receives the trip1 request, but Driver2 does not
• Driver1 picks up trip1's order
• Driver2 moves closer to trip2's start location
• Now Driver2 will receive trip2's request
The scenario described above is illustrated in Figure 6. We use a circle to represent the 10-mile range from the start location of the rider's request.

Figure 6 Testing Scenario

The application demonstrated the correct results, and all the trip information was recorded in the database correctly. Figure 7 shows the code segment in the driver's app that sends the driver's location to the rider's app. Figure 8 shows the code segment in the rider's app that receives the driver's location from the driver's app.

sendMyGeo(rider_gcm_token, driverGeo, heading){
  $f.gcm({
    key: FIREBASE_API_KEY,
    token: rider_gcm_token.token,
    data: {
      driverGeo: driverGeo,
      heading: heading
    },
  });
}
Figure 7 Sending the Driver's location to the Rider's App

PushNotification.configure({
  onRegister: (gcm_token) => {
    this.setState({gcm_token});
  },
  onNotification: (notification) => {
    if(notification.driverGeo){
      var driverGeo = JSON.parse(notification.driverGeo);
      this.setState({driverGeo}, this.updateRegion);
      var driverHeading = parseInt(notification.heading);
      this.setState({driverHeading});
    }
  },
  senderID: "728367311402",
  popInitialNotification: true,
  requestPermissions: true,
});
Figure 8 Receiving the Driver's location from the Driver's App

5 Conclusions
This ride sharing application has been used to practice a number of the latest technologies. In this paper, we mainly introduced GCM to implement the real-time communications between the rider and the driver. We have conducted a number of tests to ensure that a small set of functionalities was implemented correctly and efficiently. For our next version, we are interested in adding a reservation functionality so that the rider can request a trip for the next day. In the current version, all the trip requests are for the current time. A response from a driver must come within 15 minutes, or the rider needs to make another trip request.

6 References
[1] https://developers.google.com/cloud-messaging/gcm, 2017.
[2] Weiser, M. "The computer for the 21st century", Scientific American, 1991. pp. 94-104.
[3] Android Developer's Guide. http://developer.android.com/guide/index.html
[4] https://ymedialabs.com/hybrid-vs-native-mobile-apps-the-answer-is-clear/
[5] Android Developer's Guide. http://developer.android.com/guide/index.html
[6] https://facebook.github.io/react-native/
[7] Yifan Gu, Chaohui Xu, Mao Zheng, "Using React Native in an Android App", MICS2017 - Conference Proceedings of Midwest Instruction and Computing Symposium, April 7-8, 2017, La Crosse, WI.
An Automated Input Generation Method for Crawling of Web Applications Yuki Ishikawa13 , Kenji Hisazumi124 , Akira Fukuda125 1 Graduate School and Faculty of Information Science and Electrical Engineering Kyushu University, Motooka 774, Nishi-ku, Fukuoka, Japan 819-0395 2 System LSI Research Center Kyushu University, Motooka 774, Nishi-ku, Fukuoka, Japan 819-0395 3 Email: [email protected] 4 Email: [email protected] 5 Email: [email protected]
Abstract— Even though many tools provide automatic crawling, web applications are sometimes crawled manually, for instance for testing. Manual crawling takes much time and effort because it requires the construction of appropriate input sequences. To solve this problem, we propose and develop an automated input generation method that performs the crawling otherwise done manually. This method employs an observe-select-execute cycle. The cycle enables us to automatically generate consecutive crawling inputs that change the application's state significantly while analyzing the target web application. The paper also demonstrates our method by measuring server-side code coverage and comparing it with manual crawling. As a result, our method covered 47.58% of the code on average, which is close to the 50.21% covered by manual crawling.
Keywords: Web applications, Automated input generation, Event driven, Dynamic analysis
1. Introduction
Nowadays, with the rapid growth of the Internet, research and development targeting various web applications have become prevalent. There are tasks for which researchers and developers must manually provide input to web applications. The most typical example is testing: during testing, the tester provides input to the web application and checks whether the resulting behavior is correct. Another example is crawling. The crawling discussed here repeats actions normally performed by a user on a web page, such as transitions between pages within the same web application or between different web applications, by clicking with a mouse or moving the mouse pointer. Crawling is executed to confirm the behavior of various web applications currently in use and to gather the various data that can be acquired by performing that behavior. For these tasks, it is necessary to construct an appropriate input sequence according to each purpose. This takes much time and effort when accomplished manually, and the efficiency of research and development therefore deteriorates severely.
The technology required to solve the above problem is automated input generation and automatic execution. This paper proposes an automated input generation method specialized for crawling. Various development studies related to crawling have been published [1], [2]. These crawling systems repeat page transitions and are used to acquire various data that the user needs. However, automatic crawling with these tools cannot observe dynamic changes within a page and cannot acquire the various data those changes produce. Many current web applications include functions that dynamically make changes within a page [3]; not supporting these functions lowers the quality of the data that can be acquired. To generate such changes, it is still necessary to perform crawling manually. In this paper, to reduce the time for this manual crawling, we construct an automated input generation system specialized for crawling. One of the problems in constructing automated input generation specialized for crawling is that we seldom know all the information about the target web application: the structure of the web application, what types of input it receives, what kinds of behavior can be expected, and so on. From this small amount of information, it is necessary to select inputs that perform the behavior of crawling while identifying the structure of the web application. We employ an observe-select-execute cycle [4] as the framework of the automated generation system. In the observer stage, the system observes the current state of the target web application and identifies what kinds of input it receives. In the selector stage, one input is selected from the identified input group. In the executor stage, the system actually executes the selected input on the web application. By repeating this cycle, it is possible to automatically generate consecutive inputs for crawling that change the states significantly while analyzing the target web application. The rest of the paper is organized as follows. Section 2 introduces related works dealing with automated input generation. Section 3 describes challenges associated with the purpose of our research. Section 4 describes the automated generation method, employing an observe-select-execute cycle. Section 5 demonstrates the proposed method using code coverage measurement on the server side. Section 6 concludes this paper and mentions future works.
2. Related Works
This section describes related works that automatically generate inputs for other languages in Section 2.1, and for web applications in Section 2.2.
2.1 For Other Languages
Dynodroid is an automated input generation system for Android applications. Dynodroid employs an observe-select-execute cycle. In the observer stage, Dynodroid analyzes the state of the phone for two types of events, UI events and system events, and identifies what kinds of events are expected. In the selector stage, one event is selected (using a selection algorithm called BiasedRandom) from the expected events identified by the observer. In the executor stage, Dynodroid executes the event selected by the selector and reflects the resulting state on the phone. Dynodroid repeats this cycle, enabling the automated generation of consecutive inputs. Our system also uses this observe-select-execute cycle and the BiasedRandom algorithm.
2.2 For Web Applications
Kudzu [5] and Jalangi [6] are tools that apply symbolic execution [7] to JavaScript. Symbolic execution is a technique of evaluating a result by treating variables in the program as symbols and simulating the program. Kudzu automatically generates inputs to analyze security vulnerabilities in web applications. It applies symbolic execution to JavaScript programs, guesses what kinds of inputs the program accepts, estimates what kind of output can be expected, and executes the input. Jalangi performs symbolic execution for the dynamic analysis of JavaScript programs. Artemis [8] and JSEFT [9] are tools that apply feedback testing to JavaScript applications. These tools employ the process of generating inputs and determine how to generate the next input by feeding back the information gathered during execution. To summarize, there are some research studies dealing with automated input generation for web applications. However, Kudzu and Jalangi are only for use with JavaScript and are inappropriate for the entire web application. In addition, these related works mainly aim at testing or program analysis and are not specialized for crawling.

3. Challenges
3.1 Input in Web Applications
First, for the purpose of automated input generation for web crawling, it is necessary to identify the types of input that web applications can receive. We must take into account both HTML [10] and JavaScript [11] to identify inputs. This section describes each type of input and selects the corresponding inputs based on statistics.

Fig. 1: Statistics of HTML tags
3.1.1 HTML
HTML is a language that describes the document structure of a web application as elements in the form of tags. The document structure described by HTML is called the Document Object Model (DOM) [12]. Tags have various roles depending on their type, each described in the HTML specification. Several tags receive user inputs, and the received inputs and resulting behaviors are determined by the specification. The tag most relevant to crawling is the anchor tag. Since the anchor tag fulfills the role of a link for performing behaviors such as page transitions, it must be considered for the purpose of crawling. In addition, there are input tags and button tags described within form tags. The form tag describes an input and transmission form and receives string inputs from the user, on/off inputs from a button, etc. Several kinds of tags receive inputs, but the tags used in practice are heavily skewed. Figure 1 shows the statistics of HTML tags used in the top 100 sites of the traffic (site visitor) rankings published by Alexa; tags with counts below 1000 are omitted. From the figure, it can be seen that anchor tags are the most frequently used among tags receiving input. Button tags also receive inputs, but they are far fewer in number than anchor tags. Therefore, automated generation dealing with anchor tags can perform sufficient behavior.
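For illustration, the input-receiving tags discussed above look like the following (the URLs and names are placeholders):

<a href="/next-page">Link that triggers a page transition</a>
<form action="/search">
  <input type="text" name="q">          <!-- receives string input -->
  <button type="submit">Search</button> <!-- receives click input -->
</form>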
Fig. 2: Statistics of event listeners

Fig. 3: Requirement analysis for crawling
3.1.2 JavaScript
JavaScript is a language used to implement the dynamic behavior of web applications. JavaScript employs an event-driven programming model, which handles operations executed by users and other programs as events and processes them accordingly. In JavaScript, inputs are received as various events, and the corresponding processes are performed. Events include clicks and scrolls performed by the user. An event listener decides what kind of behavior occurs as a result of an event received as input. Event listeners are registered on HTML elements. For example, if a click event listener is registered on an element and the user clicks the element, the behavior determined by the event listener occurs. JavaScript offers many types of event listeners, and in some cases web applications define new ones, so it is difficult to handle all types of event listeners when building an automated generation system. Therefore, it is necessary to select which types of event listeners to support. Figure 2 shows the statistics of the event listeners used on the same sites as in Figure 1; it shows the most used event listeners in current web applications. Event listeners whose count is 100 or less are omitted. First, the click event listener is used much more than the others. A click is the most general input received from the user: users can easily speculate on the behaviors that will occur by clicking on an element, aided by changes in the shape of the mouse cursor. After the click, the mousedown, mouseover, and mouseout event listeners are the most used. Mousedown is an event fired when the mouse button is pressed, mouseover is an event fired when the mouse
moves onto an element on the page, and mouseout is an event fired when the mouse moves off an element on the page. Since it is easy to assume a sequence of actions in which the mouse moves onto and then off an element, mouseover and mouseout are often used as one set. As for mousedown, the mouseup event, fired when a pressed mouse button is released, is often paired with it. The figure shows that, while not used as often as clicks, these listeners are used much more than other event listeners and are ones we should deal with. They are the mouse inputs considered most used by the user for the purpose of crawling, and the statistics indicate they account for the majority of event listeners in use. Therefore, automated generation dealing with these event listeners can perform sufficient behavior.

3.1.3 Ratio of Anchor Tags to Event Listeners
It can be seen from Figures 1 and 2 that the total number of event listeners is about 10000, while the number of anchor tags is as large as about 23000. Therefore, as a feature of the types of input, there is a bias in the ratio between anchor tags and event listeners.
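For reference, registering one of the listeners counted in Figure 2 on an element takes a single call; a minimal sketch (the element id and handler body are illustrative only):

// Fire when the user clicks the element with id "menu" (a hypothetical id).
var menu = document.getElementById("menu");
menu.addEventListener("click", function (event) {
  console.log("clicked at", event.clientX, event.clientY);
});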
3.2 Requirement Analysis for Crawling
This section analyzes the properties that automated input generation for crawling should satisfy. Figure 3 shows the requirement analysis for crawling. There are four major requirements for crawling:
• Page transition
• Checking behavior on pages
• Not repeating the same behavior
• Execution on the client side
Page Transition, Checking Behavior on Pages. The page transition is the main behavior when crawling. By repeating the page transition, information on new pages is obtained one after another. Checking behavior on pages is also a property that crawling should satisfy. Although it is not done to the extent of testing and debugging in a web application, some degree of checking behavior is
also accomplished during crawling. It is a basic process in crawling to perform some degree of behavior checking on the same page while mainly making page transitions.
Not Repeating the Same Behavior. In normal crawling, it is rare to repeat the same behavior. This is because the information that can be acquired by repeating the same behavior does not change, and as a result it is likely to be meaningless.
Execution on the Client Side. When crawling is done manually, it is basically accomplished on the client side. In many cases, it is not possible to access the source code of the web application being crawled, so crawling must be executed on the client side through a browser in order to target all web applications.
3.3 Challenges
For the automated input generation system to satisfy the requirements described in Section 3.2, the following four challenges must be addressed:
• Responding to dynamic state change
• Specifying the input space and type
• Avoiding redundant inputs
• Method of generating inputs
Responding to Dynamic State Change. The state of the web application is changed dynamically by page transitions or behavior on the page. The currently open page changes owing to a page transition, and the DOM changes owing to behavior on the page. This requires a method that can deal with these dynamic state changes.
Specifying the Input Space and Type. In order to automatically generate an input that causes a page transition or an action on a page, it is necessary to specify the input space and the input types that produce some output (behavior). Unless this problem of specifying the input space and type is solved, there are cases in which no page transition or dynamic change on the page occurs even when an input is generated and executed, and the effectiveness of automated generation deteriorates.
Avoiding Redundant Inputs. If the generated inputs overlap multiple times, the behavior obtained as a result will also be redundant. In order to satisfy the property of not repeating the same behavior, it is necessary to minimize the duplication of generated inputs.
Method of Generating Inputs. In manual crawling, the user performs a behavior such as a page transition by clicking a link with the mouse. However, automated crawling cannot use the mouse. In addition, as mentioned in Section 3.2, inputs must be executed on the client side. The problems are how to generate an input, execute it on the client side, and reflect it as behavior in the web application.

4. Approach
This section describes our solutions to the challenges described in Section 3.

4.1 Overview
We propose an automated generation method that supports dynamic state changes, such as page transitions and structural changes of the DOM. Our method follows the observe-select-execute cycle employed in Dynodroid [4], an automated input generation system for Android applications. Dynodroid observes which inputs are relevant in the current state of the Android terminal (observer), selects one event from the inputs obtained as a result of observation (selector), and reflects the selected input on the terminal (executor). By employing this cycle, our method can cope even when the web application being crawled changes dynamically. Figure 4 shows an overview of the automated generation method. The role of the observer in our method is to specify the input space and type, as shown in Section 4.2; our method observes the state of the current web application from the DOM and uses the command line API of the Developer Tools described later. The role of the selector is to execute the input selection algorithm described in Section 4.3; our method selects one input from the list obtained by specifying the input space and type, using the algorithm. The role of the executor is to execute inputs via the Developer Tools; our method executes an input through the Developer Tools and reflects the resulting state in Google Chrome, which has the target web application open. Details are discussed in Section 4.4.

Fig. 4: Overview of automated generation method
4.2 Observer
The observer specifies the input space and type of the current web application. First, it is necessary to consider what, concretely, the input space of a web application is. As described in Section 3.1.1, the document structure (DOM) of the web application is described as HTML elements. Each element can be identified individually, so specifying the input space of the web application can be rephrased as specifying the elements that can receive input. The observer must produce a list of the input space and types. This section describes the approach for HTML tags (Section 4.2.1) and event listeners (Section 4.2.2).
4.2.1 HTML Tag
As mentioned in Section 3.1.1, the anchor tag is essential for the purpose of crawling. Therefore, all anchor tags must be identified in the DOM. To solve this problem, our method employs the HTML API getElementsByTagName(name). This function takes the name of a tag and returns all elements with the specified tag name in the DOM. By passing the anchor tag name to this function, all anchor elements are obtained. Regarding the type of input, the inputs an HTML tag can receive are limited. Since the anchor tag receives a click as input, the observer defines its input type as a click.
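For instance, collecting every anchor element on the current page amounts to one call (a sketch; the variable names are ours):

// All <a> elements in the current DOM, as a live HTMLCollection.
var anchors = document.getElementsByTagName("a");
for (var k = 0; k < anchors.length; k++) {
  console.log(anchors[k].href);  // candidate click targets for crawling
}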
4.2.2 Event Listener
For event listeners, it is difficult to specify the input space in the DOM as it is for HTML tags. This is because of the various libraries in current JavaScript, each of which has a complicated form. To solve this problem, our method employs the Developer Tools of Google's web browser, Google Chrome. In the command line of the Developer Tools, it is possible to debug behavior by writing JavaScript, and several APIs are provided that are available only in the Developer Tools [13]. Among them is the function getEventListeners(element). This function takes an element and returns all of the event listeners registered on that element. The observer applies this function to all elements on the current page to identify the space and types of the event listeners.
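In the DevTools console, this lookup is a single call; a sketch (getEventListeners exists only in the Developer Tools command line API, and the selector is illustrative):

// Returns e.g. { click: [{listener: f, useCapture: false, ...}], ... }
var listeners = getEventListeners(document.querySelector("a"));
console.log(Object.keys(listeners));  // event types registered on the element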
4.3 Selector
The selector selects an input from the list of inputs the observer makes. In order to avoid generating redundant inputs, the selector employs a selection algorithm, which is called once per cycle. Our method follows the BiasedRandom algorithm employed in Dynodroid [4].

4.3.1 BiasedRandom Algorithm

Algorithm 1 Input selection algorithm using BiasedRandom.
Require: List I of inputs, URL u
Ensure: An input in I
 1: i := an input
 2: G := global map
 3: for each input i in I do
 4:   if G(i, u) does not exist then
 5:     G(i, u) := init_score(i)
 6:   end if
 7: end for
 8: L := init local map
 9: while true do
10:   i_s := an input selected at random from I
11:   //Choose an input at random
12:   if L(i_s) = G(i_s, u) then
13:     G(i_s, u) := G(i_s, u) + 1
14:     //Select i_s this time, but decrease chance of
15:     //selecting i_s after this cycle.
16:     return i_s
17:   else
18:     L(i_s) := L(i_s) + 1
19:     //Increase chance of selecting i_s in this cycle.
20:   end if
21: end while
22: procedure init_score(i)
23:   if i = anchor tag then
24:     return 1
25:   else
26:     return 0
27:   end if
28: end procedure

Algorithm 1 shows the input selection algorithm using BiasedRandom. In this algorithm, the number of times each input has been selected is recorded in a map, and inputs that have not been selected often are favored. G(i, u) is the number of times an input i on the page whose URL is u has been selected. Since the global map G exists per page, the algorithm obtains the URL to use the map of the current page. In lines 3-7, the algorithm adds inputs that are in the input list I but not yet in G(u). The initialization function used for adding is init_score, on lines 22-28. The initial selection count of an anchor tag is 1, and that of an event listener is 0; the reason for this is described in Section 4.3.2. L is a local map used only within one cycle, whose initial selection counts are 0. In lines 9-21, the algorithm selects an input using a comparison between L and G. In line 10, an
input i_s is chosen at random from I. In lines 12-16, if i_s satisfies L(i_s) = G(i_s, u), it is selected as the input, and its selection count in G is incremented. In lines 17-19, selection of i_s is avoided because the input has been selected frequently in previous cycles; instead, the algorithm increases the selection count of i_s in the local map, increasing the chance of selecting i_s later in the current cycle. This process repeats until an input is selected. As a result, inputs selected many times in past cycles become hard to select, and those with few selections are more likely to be chosen.

4.3.2 Preventing Bias of Input Types
As mentioned in Section 3.1.3, the ratio of anchor tags is much larger than that of event listeners. As a result, when an input is randomly chosen from the list, it is highly probable that an anchor tag is drawn, so anchor tags would very easily be selected as inputs. Since anchor tags are more likely to be selected, the number of page transitions becomes
very large. However, since crawling requires not only simple page transitions but also various behaviors on the same page, it is necessary to make a certain number of dynamic changes by the event listeners. Therefore, by setting the initial selection number of the anchor tag to 1 and the event listener to 0, the anchor tag is never actually selected as an input unless it has been randomly chosen at least once. As a result, this intentionally makes anchor tags difficult to select.
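As an illustration, the selection loop of Algorithm 1, including the init_score bias just described, can be written in JavaScript roughly as follows (a sketch under our own naming; the paper's implementation is not shown):

function selectInput(inputs, url, G) {
  // G maps url -> Map(input -> times selected across past cycles).
  var page = G[url] || (G[url] = new Map());
  inputs.forEach(function (i) {
    if (!page.has(i)) page.set(i, i.isAnchor ? 1 : 0); // init_score
  });
  var L = new Map(); // local map, used only within this cycle
  while (true) {
    var i = inputs[Math.floor(Math.random() * inputs.length)];
    var local = L.get(i) || 0;
    if (local === page.get(i)) {
      page.set(i, page.get(i) + 1); // chosen: less likely in later cycles
      return i;
    }
    L.set(i, local + 1); // skipped: more likely later in this cycle
  }
}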
4.4 Executor The executor executes the input selected by the selector. This section describes our method of generating inputs for the two types of input: JavaScript and HTML. In JavaScript, the actions a user performs during manual input generation are handled as events. Therefore, the executor generates these events directly.
function mouseevent(type, element) {
  var event = document.createEvent("MouseEvents");
  event.initMouseEvent(type);
  element.dispatchEvent(event);
}
Listing 1: Generating events

Listing 1 shows the JavaScript code for generating events. Since all events handled in our method are generated by the mouse, the function is named mouseevent. It takes the type of mouse event and the element on which to execute it. By passing "MouseEvents" to createEvent, an event is generated on line 2. By passing the type of mouse event to initMouseEvent, the event is initialized as the specified mouse event on line 3. By passing the event to dispatchEvent on the element, the input is executed on line 4. Regarding HTML, our method deals with the anchor tag, and the received input is a click. Therefore, by passing a click and the target anchor tag to mouseevent, it is possible to follow the link specified by the anchor tag. By executing these commands in the Developer Tools, the executor can generate a selected input on the client side.
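Combining the pieces, the executor step could be sketched as follows. This again assumes the hypothetical devtools_eval() helper from the observer sketch and that the mouseevent() function of Listing 1 has already been injected into the page; the selectors are illustrative.

def execute_input(devtools_eval, i):
    """Execute one selected input on the client side."""
    kind, target = i
    if kind == "anchor":
        # An anchor tag receives a click, which follows its link.
        devtools_eval(
            "mouseevent('click', document.querySelector('%s'))" % target)
    else:
        # An event listener receives an event of its registered type.
        event_type, selector = target
        devtools_eval(
            "mouseevent('%s', document.querySelector('%s'))"
            % (event_type, selector))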
5. Evaluation This section describes an evaluation of the automated generation method. We used code coverage measurement of server-side programs for the evaluation. Code coverage is the ratio of the program exercised during a test. The higher the code coverage, the more closely the test covers the behavior of the target program and the higher the reliability. During crawling, code coverage is also an indicator of how much of the crawling operation is performed. There are several types of code coverage; we used line coverage in this evaluation. Line coverage is the ratio of executed lines to executable lines in the program.
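As a small illustration of the metric (a sketch only; the actual measurement setup uses Xdebug, as described below):

def line_coverage(executed, executable):
    """Line coverage (%): executed lines over executable lines.
    Both arguments are sets of line numbers."""
    return 100.0 * len(executed & executable) / len(executable)

# e.g., line_coverage({1, 2, 5}, {1, 2, 3, 4, 5, 6}) returns 50.0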
Table 1: Result of code coverage (%)

CMS         | initial | manual | automated
baserCMS    |  27.11  |  50.31 |   49.39
concrete5   |  29.73  |  33.68 |   32.84
Drupal      |  49.23  |  68.19 |   66.38
EC-CUBE     |  43.10  |  50.89 |   49.46
Joomla!     |  38.19  |  50.70 |   40.48
MODX        |  32.97  |  48.06 |   47.33
PrestaShop  |  34.33  |  39.90 |   41.83
WordPress   |  52.71  |  59.96 |   52.89
Average     |  29.84  |  50.21 |   47.58
5.1 Environment The target for code coverage measurement is a server-side PHP program. PHP is a language that dynamically generates files such as HTML and JavaScript, and can generate pages in web applications. In other words, the code coverage of the PHP program is directly related to the rate at which page transitions and dynamic state changes are made. We used open-source content management systems (CMSs) for the server-side web applications. Open-source CMSs usually have templates that are available on installation on the server, and we used these templates. The target web applications are eight CMSs. To measure code coverage, we used Xdebug [14], the code coverage measurement function of PHP's debugging extension module. It monitors the PHP files of the target CMS and measures the code coverage of the executed files. As a baseline for comparison, manual crawling by a human was executed. By comparing the code coverage of manual crawling and automated generation, we evaluated how well the automated generation executes the inputs expected from manual crawling. Both manual and automated generation were executed on the client side using Google Chrome. First, both opened the URL of the top page of the CMS as the initial state. Then, manual crawling was executed for a sufficient time: as many page transitions as possible were completed, and behavior other than a page transition was produced at least once on the same page. Automated generation was given 100 inputs. Finally, we measured the coverage at the time each generation was completed.
5.2 Evaluation Result Table 1 shows the result, which indicates that the average of manual crawling is 50.21% and that of automated generation is 47.58%. The difference is 2.63%, which indicates that the quality of the automated generation is very close to that of manual crawling.
5.3 Discussion Although the result indicates that the coverages for manual and automated generation are close, both cover only about 50% of the lines. This occurs because the PHP program describes not only page generation and dynamic state changes but also the operations of various databases, so there are relatively many lines that crawling cannot execute. Compared with the initial state, both manual and automated generation cover about 20% more lines, which indicates that both perform the crawling sufficiently.
6. Conclusion and Future Works
We presented an automated input generation method for web crawling. Our method employs an observe-select-execute cycle: automated generation repeats the sequence of specifying the input space and types, selecting a non-redundant input, and executing that input. We applied our method to eight open-source CMSs and compared it with manual crawling by code coverage measurement on the server side. In this evaluation, the average coverage of the automated generation system is 47.58%, which is very close to the average coverage of manual crawling (50.21%). This shows the usefulness of our system. As future work, we plan to deal with all types of input. Regarding the event listener statistics shown in Section 4.3, we can handle the most frequently used event listeners but not yet the others. Dealing with all types of input will increase the quality of the automated generation.

References
[1] “Googlebot - search console help,” https://support.google.com/webmasters/answer/182072?hl=en, accessed 2017/03/23.
[2] “Scrapy | a fast and powerful scraping and web crawling framework,” https://scrapy.org/, accessed 2017/03/23.
[3] G. Richards, S. Lebresne, B. Burg, and J. Vitek, “An analysis of the dynamic behavior of javascript programs,” in Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2010, pp. 1–12.
[4] A. Machiry, R. Tahiliani, and M. Naik, “Dynodroid: An input generation system for android apps,” in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, 2013, pp. 224–234.
[5] P. Saxena, D. Akhawe, S. Hanna, F. Mao, S. McCamant, and D. Song, “A symbolic execution framework for javascript,” in Proceedings of the 2010 IEEE Symposium on Security and Privacy. IEEE Computer Society, 2010, pp. 513–528.
[6] K. Sen, S. Kalasapur, T. Brutch, and S. Gibbs, “Jalangi: A selective record-replay and dynamic analysis framework for javascript,” in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, 2013, pp. 488–498.
[7] J. C. King, “Symbolic execution and program testing,” Commun. ACM, vol. 19, no. 7, pp. 385–394, Jul. 1976.
[8] S. Artzi, J. Dolby, S. H. Jensen, A. Møller, and F. Tip, “A framework for automated testing of javascript web applications,” in Proceedings of the 33rd International Conference on Software Engineering, 2011, pp. 571–580.
[9] S. Mirshokraie, A. Mesbah, and K. Pattabiraman, “Jseft: Automated javascript unit test generation,” in Software Testing, Verification and Validation (ICST), 2015 IEEE 8th International Conference on. IEEE, 2015, pp. 1–10.
[10] “Html | mdn - mozilla developer network,” https://developer.mozilla.org/ja/docs/Web/HTML, accessed 2017/02/08.
[11] “Javascript | mdn - mozilla developer network,” https://developer.mozilla.org/ja/docs/Web/JavaScript, accessed 2017/02/08.
[12] “W3c document object model,” https://www.w3.org/DOM/, accessed 2017/02/08.
[13] “Command line api reference | web | google developers,” https://developers.google.com/web/tools/chrome-devtools/console/command-line-reference, accessed 2017/02/08.
[14] “Xdebug - debugger and profiler tool for php,” https://xdebug.org/, accessed 2017/02/08.
The Application of Software Engineering to Moving Goods Mobile App
Katherine Snyder and Kevin Daimi
Department of Mathematics, Computer Science and Software Engineering
University of Detroit Mercy, 4001 McNichols Road, Detroit, MI 48221
{snyderke, daimikj}@udmercy.edu
Abstract— GoodTurn, the moving goods mobile system, is designed and implemented by the University of Detroit Mercy with a grant from Ford Motor Company. The app is intended to facilitate and manage Ford employees' donation of their time and vehicles to serve the community by moving goods and resources. This paper introduces the requirements, analysis, and design of the GoodTurn system for the iPhone environment. The software tools needed for its development are also highlighted.
Index Terms— GoodTurn, Requirements, Specification, Design, System Models, Architecture, Interface

I. INTRODUCTION
The goal of the Goods Moving Mobile System, GoodTurn, is to allow non-profit requesters to select drivers who volunteer their time and vehicles to move goods and materials from donors to a location specified by the requesting organizations. Requesters are either Non-Profit Organizations (NPOs) or Non-Government Organizations (NGOs). A team of software engineers from the University of Detroit Mercy designed and implemented this system following software engineering processes. Currently, GoodTurn runs on iPhone only. The next phase of development will allow the app to be used on Android-based phones. The Ford Motor Company offered a grant to build this system after three Ford employees proposed the idea for the app. Xcode [1] was used to develop the GoodTurn application using the Swift programming language [2]. Furthermore, Firebase 3.0 was utilized for the underlying database and to capture analytics of app use [3]. GoodTurn was developed using a thorough requirement engineering process and was implemented using an agile software development approach. Unlike the classical software development techniques, agile methods involve extensive collaboration and face-to-face communication with the customer. Agile requirements engineering defines the way of planning, executing, and reasoning about requirements engineering activities without having to wait for the requirements to be completed before analysis and design start [4]. Once a subset of requirements is available, analysis, design, and
programming begin. It is broadly established in software engineering (SE) research that requirements engineering (RE) is one of the most decisive sub-processes of software development. There is wide unanimity that understanding software requirements is crucial for designing the right software system [5], making software requirements extremely important. Recent studies revealed that 56% of system defects are the result of poor requirements and that requirements errors cost 10 times more than coding errors [6]. There are many methodologies to develop software requirements through interactions with the clients. Substantial software defects can be detected during the requirement analysis phase. This is considered a sensitive task in which mistakes or incorrect perceptions may result in a major catastrophe for the software product [7]. Agile Software Development (ASD) is frequently adopted to handle the growing complexity in system development. Hybrid development models with the integration of User-Centered Design (UCD) can be utilized with the aim of delivering competitive products with a suitable user experience [8]. In the development of the GoodTurn app, software modeling notations were also used to facilitate the analysis of some of the requirements. UML notation, use-cases, and data-flow diagrams were employed where useful. Model notations provide efficient graphical views for sharing knowledge between the professionals responsible for documenting information and those who need to understand it and put it into practice [9]. Model notations are also useful for communicating hardware issues. Vogel-Heuser, Braun, Kormann, and Friedrich [10] indicated that object-oriented model-based design can be constructively utilized in industry and that the code automatically derived from the UML model can be implemented on industrial Programmable Logic Controllers (PLCs) without supplementary work. Modeling notations also facilitate converting an analysis model to a software design seamlessly and efficiently implementing it into a programming language [11]. Software architecture is normally the first design artifact that focuses on quality issues, such as performance, reliability, and security. In addition, it is a
reference point for other development activities including coding and maintenance [12]. Conventionally, software architecture is considered the result of the software architecture design process, usually symbolized by a set of components and connectors. Currently, the set of design decisions crafted by the software architect complements the solution-oriented definition of software architecture [13]. Visualization of the software architecture supports software engineers in reasoning about the functionality and features of a software system without the need to get involved in coding and implementation details [14]. Alenezi [15] stressed that software systems are becoming more complex, larger, more integrated, and implemented using various technologies, which need to be managed and organized to ensure a quality product. Quality attributes of the product require an architecture to enable designers to assess and analyze them. The GoodTurn system relied on the three-tiered client-server architecture style that includes iPhone software, an application server, and a database server. The user interface (UI) is the locus of interaction between the user and computer software. The success and failure of a software application are influenced by User Interface Design (UID). The ease of using software and the time needed to learn it are impacted by UID [16]. In the past, application functionality dominated most of the attention, effort, and development time. User interaction was looked upon as the least significant aspect in developing a software application [17]. When the services of a software system need to be described, two main elements must be considered: the User Interface (UI) which is presented to the user, and the User Actions on this UI. The system reacts to a user action through on-screen feedback, with the possibility of asking for further information and providing error or help messages [18]. Designing User Interfaces (UIs) is a creative and human-intensive activity, which prevents the adoption of computer-aided tools to explore alternative solutions [19]. The GoodTurn system followed the standards and policies of Apple's current user interface conventions. This paper presents the design and implementation of the GoodTurn software system. Section II presents the Functional Requirements together with the specification. Section III provides the Design Constraints. Nonfunctional Requirements are discussed in Section IV. The System Models and System Architecture are organized in Sections V and VI respectively. Section VII presents the User Interface. Finally, conclusions are presented in Section VIII. II. FUNCTIONAL REQUIREMENTS Functional requirements embody the foundation for any system construction. They dictate what the system should do. For the GoodTurn app, functional
requirements were collected from Ford employees, nonprofit organizations (NPOs), non-government organizations (NGOs), and the public. Samples of these requirements are depicted below. Note that the term "requesters" refers to NPOs and NGOs. A. Login Screen Requirement: The login screen must contain an option to save the user's email. Specification: 1. The login screen must advance the user to the requestor or driver screen if the user has an active login session. 2. The login screen must have an input for the user to input their username/email. 3. The login screen must have an input for the user to input their password. 4. The application should allow the user to register if they do not already have an account. 5. The application must authenticate the user's email and password combination. 6. The application must contain an option for the user to initiate the password recovery process. B. Account Management Requirement-1: The system must allow deactivation of a user's account. Specification-1: 1. Set account status field to 'deactivate' in firebase. 2. The user will receive a notification when logging in if their account is deactivated. 3. The requestor will not be able to create new jobs if their account is deactivated. 4. The driver will not be able to view any jobs if their account is deactivated. Requirement-2: The system must allow the requestor to reject a specific driver in the future. Specification-2: 1. Add the driver to the blacklisted list for the requester's account in firebase. 2. The application must allow for the requestor to remove a driver from the blacklist. 3. Remove the driver from the blacklisted list for the user's account in firebase. Requirement-3: The system must allow the driver to reject a specific requestor/organization in the future.
Specification-3: 1. Add the requestor to the blacklist for the user's account in firebase. 2. Allow the driver to remove a requestor from the blacklist. 3. Remove the requestor from the blacklist for the user's account in firebase. Requirement-4: The system must be able to put a user under review. Specification-4: 1. If a user requests reactivation, then put their account under review. 2. Set account status field to 'review' in firebase. C. Registration Screen Requirement: The system should allow the user to register if they do not already have an account. Specification: 1. A user should be able to specify whether they are a driver, requestor, or both. 2. The registration screen must have an input text field for the user to enter their full name. 3. The registration screen must have an input text field for the user to enter their address. 4. The registration screen must have an input text field for the user to enter their company name. 5. The registration screen must have an input text field for the user to enter their phone number. 6. The registration screen must have an input text field for the user to enter their email address. 7. The registration screen must have an input text field for the user to input their new password. 8. The registration screen must have an input text field for the user to input their new password again to verify they entered the same password. 9. The registration screen component must verify that the password conforms to a set of text restrictions: at least 8 characters, one uppercase letter, one lowercase letter, and one special character. 10. The registration screen component must verify that the two password fields match. 11. The registration screen must have an input text field for drivers to input the vehicle make they will be using to deliver goods. 12. The registration screen must have an input text field for drivers to input the vehicle model they will be using to deliver goods.
13. The registration screen must have an input text field for drivers to input the vehicle year they will be using to deliver goods. 14. The registration screen must have an input text field for the drivers to input the license plate of the vehicle they will be using to deliver the goods. 15. The registration screen must have a button for users to register their entered data. 16. The registration component will verify the password and check if all the fields are correctly populated. 17. If successful, hash the password and create a new user on firebase with all the information populated by the user. If not successful, then inform the user what went wrong. D. Job Completion Requirement-1: The system must allow rating of users that were involved in a job. Specification-1: 1. Users will be rated on a number system, 1 to 5, with 5 being the highest quality. 2. Drivers provide their rating of the requestor. 3. Requestors provide their rating of the driver. 4. Average ratings will be associated with the respective driver or requestor. Requirement-2: The system must allow users that were involved in a job to provide feedback. Specification-2: 1. The feedback given by users should be a text input for the user. 2. The application must allow the driver to provide any feedback they have for the requestor. 3. The requestor screen must have an input for the requestor to provide any feedback they have for the driver. E. Driver Screen/Dashboard Requirement-1: The system must provide a list of available jobs. Specification-1: 1. A job must contain the items to be transported. 2. A job must contain the address of the pick-up and drop-off. 3. A job must specify the miles from the pick-up location to the drop-off location. 4. A job must include the estimated time for completion. 5. The job list should be updated to reflect jobs that are accepted, rejected, or no longer available.
6. A job must include what type of vehicle is required. 7. A job must include the time of pick-up.
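Items 1-7 together describe the job record shown to a driver. A minimal sketch of such a record follows; the field names are illustrative assumptions (the app itself is written in Swift and stores jobs in Firebase), not the actual data model.

from dataclasses import dataclass
from typing import List

@dataclass
class Job:
    """Illustrative job record implied by Specification-1, items 1-7."""
    items: List[str]            # items to be transported (1)
    pickup_address: str         # pick-up address (2)
    dropoff_address: str        # drop-off address (2)
    miles: float                # distance from pick-up to drop-off (3)
    estimated_minutes: int      # estimated time for completion (4)
    pickup_time: str            # time of pick-up (7)
    vehicle_type: str = "sedan" # required vehicle type (6)
    status: str = "available"   # accepted / rejected / unavailable (5)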
Requirement-2: The system should allow the driver to accept a job. Specification-2: 1. The requestor must be notified that their job has been accepted. 2. A reminder should be issued for the driver regarding the accepted job. 3. The job should be set as no longer available for other drivers. Requirement-3: The system must allow the driver to cancel an accepted job. Specification-3: 1. If a job is cancelled by the driver, the requestor must be informed that their job was cancelled. 2. If a job is cancelled by the driver, the job must be reposted for other drivers. F. New Job Requirement: The system must allow the requester to start a new job. Specification: 1. The new job screen must have an optional input field for the user to specify a new item to the job. 2. The new job screen must have an input field for the user to specify the number of items to be added to the new job. 3. The new job screen must allow the user to add to the list of items of the new job. 4. The new job screen must allow the user to remove an item from the list of items for a new job. 5. The new job screen must allow the user to move to the next screen to specify pickup and drop-off locations. 6. The new job screen must allow the user to specify the size of vehicle needed, such as pickup truck, van, or sedan. 7. The new job screen must allow the user to specify if any heavy items are in the job. 8. The new job screen must contain a search address field for the user to enter their pickup location for the job. 9. The new job screen must contain a search address field for the user to enter their drop-off location for the job. 10. The new job screen component must calculate the miles from the pickup location to the drop-off location.
11. The new job screen must allow the user to save the job. The requestor should then be brought back to the Requestor Dashboard. 12. The new job screen must have an option for cancelling the current job being created. G. Requester Screen/Dashboard Requirement-1: The system must allow canceling a job that is in the queue. Specification-1: 1. The job must be removed from the available jobs. 2. The driver should be alerted that the job has been cancelled if the job was accepted. Requirement-2: The system must allow the requestor to modify a job. Specification-2: 1. The requestor should not be allowed to modify a job within 24 hours of the accepted job's scheduled delivery date. 2. The driver must be notified if the requestor makes any changes to an accepted job. 3. The driver should re-accept the changes to the job. 4. If a driver rejects a modified job, the job gets reposted to available drivers. 5. The requestor is notified of the driver's decision to accept or reject the modified job. H. Support/Feedback Requirement: The system must allow users to provide feedback, submit problems, or seek help. Specification: 1. The support screen should have a way for users to specify what type of problem they are inquiring about or whether they are providing feedback. 2. The support screen must have an option for users to inquire about a lost item. 3. The support screen must have an option for users to file a complaint. 4. The file-complaint option must allow users to file complaints against another user. 5. The support screen must have an option to provide feedback on the vehicle used to transport items. 6. The support screen must have an option to reset a password. 7. The support screen must have an option to report a broken link.
8. The support screen must have an option to delete the user's own account. 9. The support screen must have an option to allow the user to request reactivation of their account.
I. Profile
Requirement: The system must have a profile screen for the users to review or change their information.
Specification: 1. The profile screen must contain the user's name that can be changed. 2. The profile screen must contain the user's address that can be changed. 3. The requester's profile screen must contain the user's company name that can be changed. 4. The requester's profile screen must contain an option for how users wish to be notified. 5. The driver's profile screen must have an option for adding a vehicle. 6. The driver's profile screen must have an option to edit the details of their vehicle. 7. The driver's profile screen must have an option for removing a vehicle. 8. The profile screen must contain a button for the user to save their changes.

III. DESIGN CONSTRAINTS
GoodTurn was developed under the following constraints:
• The system must be developed using the client-server methodology.
• The clients will be accessing the app using iPhones 5 or more recent models.
• Information about drivers and requesters should be stored using the Firebase database.
• Swift should be the programming language used.
• The system must be developed under the Xcode development environment.

IV. NONFUNCTIONAL REQUIREMENTS
Nonfunctional requirements represent constraints on the functional requirements. They express quality characteristics that the software system should possess. In this section, the performance, usability, security, privacy, reliability, and maintainability features of the GoodTurn system are illustrated.

A. Performance
• The system should allow drivers and requesters to sign in within 5 seconds.
• Driver and requester registration should not take more than 5 seconds.
• Deactivation and reactivation of accounts should take 5 seconds.
• Displaying blacklisted drivers for a specific requester should take no more than 5 seconds.
• The system should display the job history details within 5 seconds.
• Completed jobs for a requester should be listed within 3 seconds.
• The system should display the list of available jobs within 5 seconds.

B. Usability
• Drivers and requesters should be able to use the system without any training.
• The system administrator should be provided with online training.
• The system should provide messages to guide the users when invalid information is entered.

C. Security
• Drivers, requesters, and system administrators should be authenticated.
• Messages exchanged between all parties (drivers, requesters, system administrators) should be confidential.
• No party can deny sending a message to another party.
• No party can deny receiving a message from another party.
• No authorized party should be denied service.
• An unauthorized party will be denied service.

D. Privacy
• The system should not disclose requester information to non-drivers.
• The system should not disclose driver information to non-requesters.
• The system should not disclose a driver's information to another driver.
• The list of rejected drivers for a requester should not be available to other requesters.
• The list of rejected requesters for a driver should not be available to other drivers.
E. Reliability and Availability
• The system must detect, isolate, and report faults.
• The mean time between failures should be six months.
• The system should be up and running within an hour after a failure.
• The system should be backed up weekly.
• Backup copies must be stored at a different location specified by the NPO/NGO (requester or mover).
F. Maintainability
• Errors should be easily corrected using effective documentation.
• Additional features should be added without considerable changes to the design.
• The system should be easily ported to iPad.
• The system should be easily ported to Android.
V. SYSTEM MODELS System models help the requirements engineer understand the functionality of the system. In addition, models help the analyst when communicating with stakeholders. Software engineering relies on the use of abstract models to portray and infer the properties of software systems. A system model symbolizes attributes of a system and its development environment. Normally, systems have many details; system models concentrate on a few vital pieces of the system so that it can be fully understood, and they facilitate reasoning. A use-case tells a formalized story about how an end-user interacts with a system under a specific set of situations. The story may be narrative text, an outline of tasks or interactions, a template-based description, or a diagrammatic representation [20]. Regardless of its form, a use-case represents the system from the end-user's point of view. Examples of use-cases for the GoodTurn system are given in Figures 1 and 2 below.
Figure 1. Driver Use-Case
VI. SYSTEM ARCHITECTURE A software system architecture represents the structure of the system, comprised of the elements of the system and the relationships between them. It depicts the high-level structures of a software system. In addition to the relationships between the elements, the architecture emphasizes the characteristics of both the elements and relationships. Architecture is normally thought of as a blueprint for the software system, in other words, the big picture. The architecture is the main artifact for further designs of components, interfaces, data, and code.
Figure 2. Requester Use-Case
By examining the architecture, one can conclude how multiple software elements collaborate to fulfill their tasks. Figure 3 illustrates the architectural style used by the system. GoodTurn follows the three-tiered client-server architectural style. In this style, the client presentation includes the iPhones with the GoodTurn application installed. The business logic is represented by the GoodTurn application server. Finally, the Firebase database server represents the information system.
Figure 3. GoodTurn Architectural Style
Figure 4. Top-Level Decomposition

Figure 5. Second/Third Levels Decomposition
The GoodTurn system architecture is functionally decomposed into various functional components. Decomposition assists designers in recognizing and identifying the key problems that the system needs to tackle. Figure 4 demonstrates the top-level and first-level decomposition. The component MVS Startup is further decomposed into second and third levels in Figure 5. VII. USER INTERFACE A user interface is the means by which users interact with, manipulate, control, and use the system. Current software systems have a graphical user interface (GUI). A graphical user interface relies on several components, such as menus, toolbars, windows, and buttons. The first step in interface design is to identify the humans who will interact with the system. For the GoodTurn system, these included drivers, requesters, and system administrators. Then, scenarios for each way the user can interact with the system were developed. The classes needed to implement these scenarios were designed and integrated with other classes of the system. The GoodTurn user interface is illustrated below; Figures 6 and 7 provide samples. A. Splash Interface This interface will be the first screen the user encounters when starting the application. It will only show for a brief time (2 seconds) while the application is loading the first screen. B. Login Interface The Login screen will be seen immediately after the splash screen. If the user is already logged in, then the appropriate dashboard screen will be loaded next. This screen contains fields for the user to enter their email/password combination. It also contains a register/create new account button so that the user can navigate to the Registration screen. C. Registration Interface When a new user/account is being created, this interface will be displayed. The first screen will be seen regardless of user type. The second screen will only be available if a user registers as a driver. After the user fills out all the information, they can confirm by pressing a button. Then, terms of service will be displayed to the user, which they must agree to before the account is created. Once the account is created, the appropriate dashboard will be loaded for the user.
For both drivers and requesters (movers), all general details of the account will be encompassed on this screen including name, address, company, phone number, email, password, and confirm password. Drivers will be able to enter Vehicle Make, Vehicle Model, Vehicle Year, Vehicle Color, Vehicle License Plate. D. Requester Dashboard A requestor, who has successfully logged into their account, will arrive at this screen next. This screen will display all jobs the requestor has scheduled with options to edit jobs, cancel jobs, or create new jobs. Jobs should also have a status to let requestors know if the job has been accepted, on route, completed, or still pending (searching for driver). There are options to navigate to these screens: Profile, Support/Feedback, Driver Dashboard (if registered as driver as well). E. Driver Dashboard A driver who has successfully logged into their account will arrive at this screen. This screen will display a job list that will be populated with any jobs made by requesters. The driver will be able to sort jobs by distance to pick-up, distance between pick-up and drop-off, pickup time, or vehicle type. The driver can accept a job and the requestor will be notified that their job was accepted. The driver will also be able to reject a job. This will just dismiss the job from their list. There are options to navigate to the screens: Profile, Support/Feedback, Requestor Dashboard (if registered as a requestor as well). F. Profile Interface This screen is available to all users of the application. The screen gives information about the user and the ability to edit this information. The Profile screen for any user includes: name, address, company, and notification settings. If the user is a driver, it will also provide vehicle information with the ability to add, edit, or remove a vehicle from their profile. An option to save changes is also displayed. G. New Job Interface When requesters want to create a new job for drivers, this screen becomes available. Requestors can enter information about the job they wish to create, such as what they expect to be transported, how many items need to be transported, the size of the vehicle they think they will require, whether there will be heavy objects to be lifted, where the items will be picked up, and where the items will be dropped off. This screen has options for submitting the job details and creating a new job as well as cancelling the created job, and removing all job details.
Figure 6. Available jobs for drivers

H. Support/Feedback Interface This screen should be available to all users of the application at any time. If an account is locked, this will be the screen used to request unlocking the account. This screen has options for reporting lost items, filing a complaint, filing a complaint against a specific user, giving feedback on a driver, resetting a password, reporting a broken link, deleting an account, reactivating an account, and "other" to cover any other possible help required. These are represented as drop-down menus to select a topic, and then appropriate text fields based on the selected topic will be shown to report accordingly.

Figure 7. Requester's new job

VIII. CONCLUSION
The application of software engineering to the creation of the GoodTurn system was described. Specifically, the sub-processes of requirements engineering, modeling, software architecture, and user interface were highlighted. It was concluded that the best development approach would be the agile process. The user interface was constrained by the standards and policies of Apple for iPhone applications. Since iPhones, an application server, and a database server were needed, the GoodTurn system architecture followed the three-tiered client-server architecture. Future work will expand the scope to include Android-based phones.

REFERENCES
[1] MacUpdate, “Xcode: Integrated Development Environment (IDE) for OS X,” https://www.macupdate.com/app/mac/13621/xcode, 2016, [retrieved: March, 2017].
[2] Swift Documentation, “The Swift Programming Language,” https://swift.org/documentation, 2006, [retrieved: March, 2017].
[3] Firebase, “App Success Made Simple,” https://firebase.google.com, [retrieved: March, 2017].
[4] I. Inayat, S. S. Salim, S. Marczak, M. Daneva, and S. Shamshirband, “A systematic literature review on agile requirements engineering, practices and challenges,” Computers in Human Behavior, vol. 51, pp. 915–929, 2015.
[5] R. Mohanani, “Implications of requirements engineering on software design: A cognitive insight,” Proc. The 2016 IEEE/ACM 38th IEEE International Conference on Software Engineering Companion, Austin, Texas, May 2016, pp. 835–838.
[6] S. Besrour, L. B. Rahim, and P. D. Dominic, “A quantitative study to identify critical requirement engineering challenges in the context of small and medium software enterprise,” Proc. 3rd International Conference on Computer and Information Sciences (ICCOINS), Kuala Lumpur, Malaysia, Aug. 2016, pp. 606–610.
[7] M. Geogy and A. Dharani, “A Scrutiny of the Software Requirement Engineering process,” Procedia Technology, vol. 25, pp. 405–410, 2016.
[8] E. M. Schön, J. Thomaschewski, and M. J. Escalona, “Agile Requirements Engineering: A systematic literature review,” Computer Standards and Interfaces, vol. 49, pp. 79–91, 2017.
[9] K. Sousa, J. Vanderdonckt, B. Henderson-Sellers, and C. Gonzalez-Perez, “Evaluating a graphical notation for modelling software development methodologies,” Journal of Visual Languages and Computing, vol. 23, pp. 195–212, 2012.
[10] B. Vogel-Heuser, S. Braun, B. Kormann, and D. Friedrich, “Implementation and evaluation of UML as modeling notation in object oriented software engineering for machine and plant automation,” Proc. The 18th World Congress of the International Federation of Automatic Control, Milano, Italy, Aug. 2011, pp. 9151–9157.
[11] L. Li, “Ontological modeling for software application development,” Advances in Engineering Software, vol. 36, pp. 147–157, 2005.
[12] M. Galster and S. Angelov, “What Makes Teaching Software Architecture Difficult?” Proc. The 38th IEEE International Conference on Software Engineering Companion (ICSE’16), Austin, Texas, May 2016, pp. 356–359.
[13] H. van Vliet and A. Tang, “Decision making in software architecture,” The Journal of Systems and Software, vol. 117, pp. 638–644, 2016.
[14] E. Kouroshfar, M. Mirakhorli, H. Bagheri, L. Xiao, S. Malek, and Y. Cai, “A Study on the Role of Software Architecture in the Evolution and Quality of Software,” Proc. The 12th Working Conference on Mining Software Repositories (MSR’15), Florence, Italy, May 2015, pp. 246–257.
[15] M. Alenezi, “Software Architecture Quality Measurement Stability and Understandability,” International Journal of Advanced Computer Science and Applications (IJACSA), vol. 7, pp. 550–559, 2016.
[16] B. Faghih, M. R. Azadehfar, and S. D. Katebi, “User interface design for e-Learning software,” The International Journal of Soft Computing and Software Engineering (JSCSE), vol. 3, pp. 786–794, 2013.
[17] A. I. Molina, W. J. Giraldo, J. Gallardo, M. A. Redondo, M. Ortega, and G. García, “CIAT-GUI: A MDE-compliant environment for developing Graphical User Interfaces of information systems,” Advances in Engineering Software, vol. 52, pp. 10–29, 2012.
[18] M. Gómez and J. Cervantes, “User Interface Transition Diagrams for customer–developer communication improvement in software development projects,” The Journal of Systems and Software, vol. 86, pp. 2394–2410, 2013.
[19] L. Troiano and C. Birtolo, “Genetic algorithms supporting generative design of user interfaces: Examples,” Information Sciences, vol. 259, pp. 433–451, 2014.
[20] J. van der Poll, P. Kotzé, A. Seffah, T. Radhakrishnan, and A. Alsumait, “Combining UCMs and Formal Methods for Representing and Checking the Validity of Scenarios as User Requirements,” Proc. The 2003 annual research conference of the South African institute of computer scientists and information technologists on Enablement through technology (SAICSIT), Johannesburg, South Africa, 2003, pp. 59–68.
Support Environment for Traffic Simulation of ITS Services
Ryo Fujii1, Takahiro Ando1, Kenji Hisazumi2, Tsunenori Mine1, Tsuneo Nakanishi3, and Akira Fukuda1
1 Graduate School / Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
2 System LSI Research Center, Kyushu University, Fukuoka, Japan
3 Department of Electronics Engineering and Computer Science, Faculty of Engineering, Fukuoka University, Fukuoka, Japan
Abstract— In this paper, we introduce a support environment, currently under development, for traffic simulation of ITS-related systems. To confirm the effect of ITS-related systems on society via simulation, it is generally necessary to implement a model of the system on a simulator. Our support environment aims to realize a simulation environment by directly connecting ITS-related systems and simulators, without modeling the system. In this paper, we introduce the overview and architecture of our support environment for traffic simulation, and discuss its usefulness.
Keywords: Smart Mobility Service, Intelligent Transportation System, Traffic Simulator
1. Introduction With the development of IoT technology, expectations for realizing a smart mobility society equipped with a new social information infrastructure system are increasing. Information infrastructure systems that support the smart mobility society are called smart mobility systems, and research on their system life cycle, including development technology and operation, is increasingly important. The realization of a sustainable smart mobility society is currently being discussed as a challenging task [1]. In particular, there are still a number of challenges concerning the development of smart mobility systems such as services and applications related to ITS that are responsible for the social infrastructure. In the development of systems related to ITS, it is important to confirm the effect of the system on society, and field trials are often carried out before full-scale introduction. However, field trials are time consuming and expensive, and the cost of preparing them is large, so it is not realistic to conduct field trials of a single system several times. Therefore, it is common to check the effect of these systems via simulation before release or field trials. In general, when we check the effect of ITS services via simulation, we need to create models of the behavior of the ITS services, in addition to modelling the vehicles or pedestrians and road networks. That is, apart from the ITS service itself, we have to create a behavior model with the same functionality.
This paper is based on the idea that the creation of an ITS service behavior model can be omitted by constructing a simulation support environment that uses the implementation of the ITS service itself for simulation.
2. Traffic Simulator
Traffic simulators simulate the behavior of moving objects such as vehicles or pedestrians on a road network and the behavior of a signal system in virtual spaces. These simulators are systems that observe the behavior of the moving objects and the condition of traffic in the virtual spaces. Traffic simulators are classified into macro traffic simulators, micro traffic simulators, and human flow simulators according to the size of the virtual space in focus and the observation granularity of the moving objects. An appropriate simulator is selected according to the simulation purpose. The characteristics of each simulator are given below.
Macro traffic simulator: It covers a relatively wide area and targets a large-scale road network. It does not treat moving objects as individual objects but treats them as fluids that flow over the network, and it observes their flow rates. Visum [2] and others serve as representative macro traffic simulators.
Micro traffic simulator: It covers an area of relatively narrow range. It treats each moving object as an individual object and simulates its detailed movement. SUMO [3], Vissim [4], Aimsun [5], and others serve as representative micro traffic simulators.
Human flow simulator: It primarily covers areas such as stations or commercial facilities and simulates the movement of people in that space. For the simulation of the evacuation of people during a disaster, artisoc [6] and the like are used as human flow simulators.
3. Related Work
As research on building environments for traffic simulation, research [7][8] on the construction of an integrated
simulation environment connecting a traffic simulator and various other simulators, research [9] on an interface enabling connection with a traffic simulator, and other similar studies have been conducted. In the literature [7], an integrated simulation environment that combines various simulators such as Vissim, MATLAB [10], and ns-3 [11] has been proposed as a simulation environment for V2X (Vehicle-to-everything) communication technology. In the literature [8], an integrated simulation environment TraNS (Traffic and Network Simulation Environment) for VANETs (Vehicular Ad Hoc Networks) has been proposed, in which the traffic simulator SUMO and the network simulator ns-2 [12] are interconnected by TraCI, described below. In the literature [9], an interface TraCI (Traffic Control Interface) for interconnecting a traffic simulator and a network simulator has been proposed. By using TraCI, it becomes possible to acquire congestion information on the road from the traffic simulator, or to control the behavior of the moving objects in real time during simulation. In the simulation support environment proposed in this paper, TraCI is used to realize the connection between the simulator and the implementation of the ITS service.

Fig. 1: Conceptual diagram of the support environment
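For concreteness, the following is a minimal sketch of observing and controlling a SUMO simulation through TraCI's Python client; the network and route file names are placeholders.

import traci

# Start SUMO with a network and route file (placeholder names) and step
# through the simulation, observing and controlling vehicles in real time.
traci.start(["sumo", "-n", "network.net.xml", "-r", "routes.rou.xml"])
for step in range(3600):
    traci.simulationStep()                      # advance one simulation step
    for veh_id in traci.vehicle.getIDList():
        speed = traci.vehicle.getSpeed(veh_id)  # observe a moving object
        # A service could also steer the vehicle, e.g., toward a parking lot:
        # traci.vehicle.changeTarget(veh_id, "edge_to_parking_1")
traci.close()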
4. Overview of the Support Environment of Traffic Simulation In this section, we describe the simulation support environment currently under development, which uses the implementation of ITS services. The simulation support environment proposed in this paper acts as an intermediary between the traffic simulator and the implementation of the ITS service as shown in Fig. 1. The environment provides an interconnection between the traffic simulator and the ITS service. By connecting the implementation of the ITS service with the traffic simulator, simulation can be performed without creating a model representing the behavior of the ITS service on the simulator. Therefore, besides implementing the ITS service,
the creation of a behavior model for simulation is omitted, which further reduces development costs. Moreover, because the traffic simulator and the implementation of the ITS services can exchange information in real time through this support environment during simulation, the behavior of the ITS services affects the environment being simulated in real time. It is likewise possible to observe how environmental changes on the simulator affect the behavior of the ITS services in real time. In order to realize these features, this support environment has the following functions:
• It obtains information such as the current location and speed from a moving object such as a vehicle or pedestrian.
• It provides the information from moving objects to the ITS services.
• It operates the ITS services on behalf of the moving object.
• It acquires the response results of the ITS services, the information actively transmitted by the ITS services, and so on.
• It changes the behavior of the moving object based on the information from the ITS services.
• It acquires POI (Point of Interest) information, such as the current degree of congestion, from the POI.
• It provides the POI information acquired from the simulator to the ITS services.
In addition, this support environment can connect multiple ITS services to the traffic simulator and can simulate how the influences of the services interfere with each other. It is also possible to check by simulation whether the effect of coordinated services will be as expected. Furthermore, this support environment enables mutual connection between ITS services and multiple or different kinds of simulators, which enables the simulation of scenarios such as the following:
1) Under the guidance of the car navigation system, drive to the parking lot closest to the destination commercial facility, and move to the facility on foot.
2) Within the commercial facility, move from the entrance to the target store on foot using the in-facility guidance system.
In this scenario, simulation is carried out using a micro traffic simulator outside the commercial facility and a human flow simulator inside the facility, and the two simulators are executed at the same time. In the simulation, the moving object performing the scenario behaves as a moving object on the micro traffic simulator when outside the facility, and as a moving object on the human flow simulator when inside the facility. To enable such simulation, this support environment has the following function:
• When a moving object on one simulator moves into a space covered by another simulator, it supports the movement between the simulators.
Fig. 2: Architecture of the support environment for traffic simulation

5. Overview of the Architecture of the Support Environment
In this section, the architecture of the simulation support environment, currently under development, is outlined based on the architecture diagram shown in Fig. 2. This support environment consists of several kinds of core modules as follows:
Manager: This module forms the base of the support environment and is responsible for managing instances of each module appearing in the support environment.
Sim: This is responsible for connecting an individual simulator environment to the support environment.
Vehicle: This is responsible for connecting with moving objects in the simulator environment. When a moving object moves to a space on another simulator, it is moved to another Sim module via the Manager module.
POI: This is responsible for connecting with POI objects within the simulator environment.
Agent: This is responsible for acting as an intermediary for information exchange between Vehicle and ITS services or between POI and ITS services. It is also responsible for changing the movement of the moving object connected with this instance based on information from the ITS service.
In each connection between modules, information is transmitted and received as follows.
Information flowing from the moving objects on the simulator to the Agent: The present location and speed of the moving object are acquired from the simulator through the Vehicle module and sent to the Agent module in charge.
Information flowing from the POI on the simulator to the Agent: POI information such as the degree of congestion is acquired from the simulator through the POI module and sent to the Agent module in charge.
Information flowing from the Agent to the ITS services: The Agent module that obtained the information from the simulator transmits the information to the ITS services. In addition, the Agent acting as an intermediary for a moving object tells the ITS services, on behalf of the moving object, which operation is to be performed and the information necessary for that operation.
Information flowing from the ITS service to the Agent: The ITS service returns the proper response information for the operation to the Agent. For example, a route recommendation service recommends a proper route for the user.
Information flowing from the Agent to the moving objects on the simulator: The Agent responsible for the moving object sends information for changing the behavior of the moving object based on the information obtained from the ITS services.
Based on the architectural design, the class design of the simulation support environment was performed as shown in the class diagram in Fig. 3. In addition, we are currently developing a prototype of this simulation support environment in Python. Functions for connecting different types of simulators are not yet implemented in the prototype, but connection of the same kind of simulator and connection of multiple kinds of ITS services are supported.
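A skeletal view of these core modules, as they might look in the Python prototype, is given below. The class responsibilities follow Fig. 2, while the method and attribute names are illustrative assumptions rather than the actual prototype code.

class Manager:
    """Base of the support environment; manages all module instances."""
    def __init__(self):
        self.sims, self.agents = [], []

class Sim:
    """Connects one simulator (e.g., SUMO via TraCI) to the environment."""
    def __init__(self, manager, connection):
        self.manager, self.connection = manager, connection
        self.vehicles, self.pois = {}, {}

class Vehicle:
    """Proxy for a moving object on a simulator."""
    def __init__(self, sim, vehicle_id):
        self.sim, self.vehicle_id = sim, vehicle_id

class POI:
    """Proxy for a point of interest (e.g., a parking lot)."""
    def __init__(self, sim, poi_id):
        self.sim, self.poi_id = sim, poi_id

class Agent:
    """Mediates between a Vehicle (or POI) and an ITS service."""
    def __init__(self, vehicle, service):
        self.vehicle, self.service = vehicle, service

    def step(self):
        # Forward observations to the ITS service and apply its response,
        # e.g., change the vehicle's destination to a recommended parking lot.
        pass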
6. Case Study In this section, we describe a case study of simulation using the prototype of the simulation support environment implemented in Python based on the architecture design shown in Section 5. In this case study, we implement several simple ITS services, connect them with SUMO as the traffic simulator, and run the simulation in this environment. We then confirm that the interference between the influences of the ITS services can be observed by using this support environment.
6.1 Data Used for Simulation Execution
Road network: A road network shown in Fig. 4 was prepared as the application space of the case study. The three places marked in gray represent parking lots, which are POIs. The POI information of each parking lot is its degree of congestion at a given time, and the maximum numbers of storable cars are 40, 40, and 25 for parking lots 1 through 3, respectively.
Moving object data: The data of the moving objects on the road network was created using the tool bundled with SUMO. Each moving object enters from the left end, the right end, or the lower end of the network and moves to one of the three parking lots. In this case study, the simulation runs for 3,600 steps; each moving object is generated at one of these steps and discarded when it reaches a parking lot. In addition, some of the moving objects use the ITS services described below, and those that conform to the recommendation information from the ITS services are included. Depending on the recommendation information, the destination parking lot may be changed to another parking lot. The number of moving objects using each ITS service is defined as shown in Table 1.

Table 1: Number of vehicles according to ITS service recommendation
ITS Service 1       83 units
ITS Service 2       98 units
Not using service  119 units
Total              300 units

Fig. 3: Class diagram of our support environment implementation
Fig. 4: Road network for simulation case study
ITS Service: As ITS services connecting to the simulator, we implemented a simple service that recommends vacant parking lots according to their degree of congestion, and from it we prepared two simple parking lot recommendation services that monitor different parking lots. The first service monitors parking lots 1 and 3 and recommends one of them, and the second service monitors parking lots 2 and 3 and recommends one of them. In this case study, simulations were carried out by simultaneously connecting these two simple parking lot recommendation services to the simulation support environment.
6.2 Result of Simulation Execution
Simulation was carried out using the data described in the previous section and the prototype of the simulation support environment currently under development. The simulation ran to the configured final step without abnormal termination, which confirms that simulation using actual implementations of ITS services is possible with this support environment. In addition, we observed the change in the degree of congestion of each parking lot during the simulation, shown in the graph in Fig. 5. In this graph, the horizontal axis represents the number of steps and the vertical axis the congestion rate of each parking lot, where the congestion rate is given by

Congestion rate = (Number of cars parked / Maximum number of storable cars) × 100

Fig. 5: Result of parking congestion ratios
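For example, a parking lot holding 30 of its maximum 40 cars has a congestion rate of (30 / 40) × 100 = 75%.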
In this graph of the degree of congestion, it can be confirmed that when the simulation reaches around step 1,500, the congestion rate of parking lot 3 rises suddenly and exceeds 70%. This is because parking lot 3 was less congested than parking lots 1 and 2 in the immediately preceding steps, so parking lot 3 was recommended by both services at the same time. From this, it can be seen that this support environment is useful for exposing the mutual interference of the influences of multiple ITS services.
7. Conclusions
In this paper, we proposed a simulation support environment that does not require modeling of ITS services for simulation, and we outlined the support environment and its architecture design. The simulation support environment proposed in this paper has the following features.
• Rather than requiring models of ITS services, it provides a simulation environment that connects the implementations of the ITS services to the simulators.
• During simulation, information is exchanged between the simulator and the ITS services, making it possible to simulate ITS service behavior and environmental changes affecting each other in real time.
• It enables connection between a simulator and multiple ITS services, making it possible to confirm how the effects of the ITS services interfere with each other.
• It enables multiple simulators, including simulators of different types, to be connected, so simulation can also cover moving objects that move between different simulator worlds.
In addition, we developed a prototype of the support environment in Python based on the proposed architecture design; with it, it is possible to simulate an environment in which multiple ITS services affect each other. Through the case study, the effectiveness of the proposed support environment was confirmed. In future work, we plan to research and develop methods for integrating or dividing the simulation space, which becomes a problem when connecting a plurality of simulators to this support environment. We also plan to implement data conversion for connecting different types of simulators. We will continue to develop the prototype and, using case studies, strengthen the support environment based on their feedback.
Acknowledgment This work is partially supported by JSPS KAKENHI Grant Number 15H05708.
References
[1] A. Fukuda, K. Hisazumi, S. Ishida, T. Mine, T. Nakanishi, H. Furusho, S. Tagashira, Y. Arakawa, K. Kaneko, and W. Kong, “Towards sustainable information infrastructure platform for smart mobility - project overview,” in 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), 2016, pp. 211–214.
[2] PTV Group, “PTV Visum 16,” http://vision-traffic.ptvgroup.com/en-us/products/ptv-visum/, 2016.
[3] D. Krajzewicz, J. Erdmann, M. Behrisch, and L. Bieker, “Recent development and applications of SUMO - Simulation of Urban MObility,” International Journal On Advances in Systems and Measurements, vol. 5, no. 3&4, pp. 128–138, December 2012.
[4] PTV Group, “PTV Vissim 9,” http://vision-traffic.ptvgroup.com/en-uk/products/ptv-vissim/, 2016.
[5] TSS-Transport Simulation Systems, “Aimsun 8.1,” https://www.aimsun.com/aimsun/, 2015.
[6] KOZO KEIKAKU ENGINEERING Inc., “artisoc 4,” http://mas.kke.co.jp/modules/tinyd0/index.php?id=13, 2016.
[7] A. Choudhury, T. Maszczyk, C. B. Math, H. Li, and J. Dauwels, “An integrated simulation environment for testing V2X protocols and applications,” Procedia Computer Science, vol. 80, pp. 2042–2052, 2016.
[8] M. Piorkowski, M. Raya, A. L. Lugo, P. Papadimitratos, M. Grossglauser, and J.-P. Hubaux, “TraNS: realistic joint traffic and network simulator for VANETs,” ACM SIGMOBILE Mobile Computing and Communications Review, vol. 12, no. 1, pp. 31–33, 2008.
[9] A. Wegener, M. Piórkowski, M. Raya, H. Hellbrück, S. Fischer, and J.-P. Hubaux, “TraCI: an interface for coupling road traffic and network simulators,” in Proceedings of the 11th Communications and Networking Simulation Symposium. ACM, 2008, pp. 155–163.
[10] The MathWorks, “MATLAB,” https://www.mathworks.com/products/matlab.html.
[11] K. Wehrle, M. Güneş, and J. Gross, Modeling and Tools for Network Simulation, 2010, ch. 2, “The ns-3 Network Simulator”, pp. 15–34.
[12] “The Network Simulator ns-2,” http://www.isi.edu/nsnam/ns/.
Publishing and Consuming RESTful Web API Services
Yurii Boreisha, Oksana Myronovych
Department of Computer Science, Minnesota State University Moorhead, Moorhead, MN, USA
Department of Computer Science, North Dakota State University, Fargo, ND, USA
Abstract - This paper is dedicated to the key issues of building RESTful Web API services using ASP.NET and Entity Framework. It shows how to enable cross-origin requests and secure ASP.NET Web API using token-based authentication. Examples of JavaScript clients that consume RESTful Web API services are provided.
Keywords: REST, Web API, cross-origin requests, token-based authentication.
1 Introduction
Contemporary software development is heavily built on publishing and consuming web services. As expected of any software product, these services and their clients should be maintainable, testable, and adaptive to change. Web API technology provides a way to implement a RESTful web service using all of the goods that the .NET framework offers. In Web APIs, communication between applications is done over the underlying HTTP protocol. With Web API we create endpoints that can be accessed using a combination of descriptive URLs and HTTP verbs: GET – request data/resource; POST – submit or create data/resource; PUT – update data/resource; and DELETE – delete data/resource [1-4].
The Cross-Origin Resource Sharing (CORS) mechanism is required when a client requests access to a resource (for example, data from an endpoint) from an origin (domain) that is different from the domain where the resource itself originates. In the CORS workflow, before sending a DELETE, PUT, or POST request, the client sends an OPTIONS request to check that the domain from which the request originates is the same as the server's. The server must include various access headers that describe which domains have access: the Access-Control-Allow-Headers (ACAH) header describes which headers the API can accept; the Access-Control-Allow-Methods (ACAM) header describes which HTTP verbs are supported/permitted [5].
Since Web API adoption is increasing at a rapid pace, there is a serious need for implementing security for all types of clients trying to access data from Web API services. This can be done with token-based authentication, which allows each request to be authenticated and authorized by exposing OAuth2 endpoints using the OAuth2 Authorization Framework and the Open Web Interface for .NET (OWIN) [6]. This paper is dedicated to the key issues of building RESTful Web API services using ASP.NET and Entity Framework. It shows how to enable cross-origin requests and secure ASP.NET Web API using token-based authentication. Examples of JavaScript clients that consume RESTful Web API services are provided.
2 Creating a RESTful Web API Service
It is easy to create a Web API service with Individual User Accounts using Visual Studio 2017. Let us assume that we would like to use this service to enable CRUD operations for a simple database (Figure 1). To create all data access classes we can add an ADO.NET Entity Data Model (database-first approach). To enable CRUD operations we should create a Web API controller with actions, using Entity Framework (Figure 2). The RESTful Web API service is now ready. We can add Swashbuckle/Swagger UI to represent the API operations – just install Swashbuckle using the NuGet Package Manager. Swashbuckle seamlessly adds Swagger to Web API projects. In order to have a direct link to the Swagger API interface, one should add an action link to the top navigation (_Layout.cshtml). To publish this service using IIS Express – just start the application (Figure 3).
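As an illustration of consuming the endpoints generated by such a controller (our sketch, not part of the paper's figures), the default route convention api/{controller}/{id} exposes the PhoneBooks controller as follows; the host/port and the entry schema are hypothetical:

import requests  # Python client sketch; the service URL is hypothetical

BASE = "http://localhost:52936/api/PhoneBooks"

entry = {"Name": "Alice", "Phone": "555-0100"}          # hypothetical schema
created = requests.post(BASE, json=entry).json()        # POST: create
print(requests.get(BASE).json())                        # GET: retrieve all
created["Phone"] = "555-0199"
requests.put(BASE + "/" + str(created["Id"]), json=created)  # PUT: update
requests.delete(BASE + "/" + str(created["Id"]))        # DELETE: remove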
Figure 1: Phone Book Database
Figure 2: Web API Controller Settings
Figure 3: Swagger UI
3 Creating a JavaScript Client
Let us create a simple HTML/CSS/JavaScript client that allows the user to retrieve, add, delete, update, and find entries in the Phone Book database. Fragments of the jQuery code used in the client application are shown in Figures 4-9. For this application to work, one should enable CORS (Cross-Origin Resource Sharing) for the service:
Install Microsoft.AspNet.WebAPI.Cors using the NuGet Package Manager.
Enable CORS (App_Start/WebApiConfig.cs): config.EnableCors();
Allow CORS for Controllers/PhoneBooksController.cs: [EnableCors(origins:"*", headers:"*", methods:"*")]
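To see the preflight described in the introduction in action, the OPTIONS request can be sent by hand (our sketch; the host/port are hypothetical):

import requests  # CORS preflight sketch; the URL is hypothetical

resp = requests.options(
    "http://localhost:52936/api/PhoneBooks",
    headers={
        "Origin": "http://client.example.com",
        "Access-Control-Request-Method": "PUT",
        "Access-Control-Request-Headers": "content-type",
    },
)
# With [EnableCors(origins:"*", headers:"*", methods:"*")] the service
# should answer with the matching Access-Control-Allow-* headers.
for name, value in resp.headers.items():
    if name.lower().startswith("access-control"):
        print(name, "=", value)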
Figure 4: jQuery Code
Figure 5: Function showAll()
Figure 6: Function find()
Figure 7: Function add()
Figure 8: Function update()
Figure 9: Function delete()
4 Securing Web API using Token-Based Authentication
The RESTful Web API service created with Individual User Accounts (Visual Studio 2017) includes an authorization server that validates user credentials and issues tokens, while the Web API controllers act as resource servers. An authentication filter validates access tokens, and the [Authorize] attribute is used to protect a resource. The following diagram shows the credential flow in terms of Web API components (Figure 10). To enable token-based authentication, the following changes should be made to the service and the related clients. For the RESTful Web API service we should do the following [5, 6]:
Add the ‘RequireHttpsAttribute’ filter to the MVC pipeline (App_Start/FilterConfig.cs).
Add a custom ‘RequireHttpsAttribute’ filter to the Web API pipeline (App_Start/WebApiConfig.cs).
Remove ‘AllowInsecureHttp’ from ‘OAuthOptions’ (App_Start/Startup.Auth.cs).
Allow CORS for Controllers/AccountController.cs and enable CORS for the /Token endpoint by adding the proper code to App_Start/Startup.Auth.cs (ConfigureAuth method).
Enable authorization for all related controllers – add the [Authorize] attribute.
Enable SSL.
For the client app we should add the Registration and Login features, and modify all functions to work with the Authorization headers (Figures 11-15).
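The resulting credential flow can be sketched end to end as follows (our illustration, not the paper's jQuery figures); the host/port and the account credentials are hypothetical, while /Token is the default endpoint of the Individual Accounts template:

import requests  # token-based credential flow sketch

BASE = "https://localhost:44300"   # hypothetical dev host/port

# 1. Request a bearer token via the OAuth2 resource-owner password grant.
resp = requests.post(BASE + "/Token", data={
    "grant_type": "password",
    "username": "user@example.com",   # hypothetical account
    "password": "P@ssw0rd!",
}, verify=False)                      # self-signed dev certificate
token = resp.json()["access_token"]

# 2. Call a protected controller with the Authorization header.
api = requests.get(BASE + "/api/PhoneBooks",
                   headers={"Authorization": "Bearer " + token},
                   verify=False)
print(api.status_code, api.json())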
Figure 10: Credential Flow
Figure 11: jQuery Code (Part 1) - Token-Based Authentication
Figure 12: jQuery Code (Part 2) - Token-Based Authentication
Figure 13: Function find() – Token-Based Authentication
Figure 14: Function register() – Token-Based Authentication
Figure 15: Function login() – Token-Based Authentication
5 Conclusion
This paper is dedicated to the key issues of building RESTful Web API services using ASP.NET and Entity Framework. It shows a ‘big picture’ development approach based on the following steps:
Creating a RESTful Web API service.
Adding Swashbuckle/Swagger UI to represent API operations/endpoints.
Enabling Cross-Origin Resource Sharing.
Securing Web API using token-based authentication.
Creating JavaScript/jQuery clients to consume RESTful Web API services.
Future developments will be dedicated to the following topics and related new technologies, protocols, standards, and frameworks:
What kind of API to use (for the present – RESTful, SOAP, JavaScript, XML-RPC)?
What data format to use in the request (for the present – JSON, HTML, XML)?
What kind of authorization/authentication to use (for the present – OAuth2, HTTP Basic Authentication)?

6 References
[1] Boreisha, Y. and Myronovych, O. Resolving Cyclic Dependencies – Design for Testability; Proceedings of the 2016 International Conference on Software Engineering Research and Practice, SERP2016, 109-115, July 2016, Las Vegas, Nevada, USA.
[2] Boreisha, Y. and Myronovych, O. Genetic Algorithm and Mutation Analysis for Software Testing; Proceedings of the 2010 International Conference on Software Engineering Research and Practice, SERP2010, 247-252, July 2010, Las Vegas, Nevada, USA.
[3] Boreisha, Y. and Myronovych, O. Modified Genetic Algorithm for Mutation-Based Testing; Proceedings of the 2009 International Conference on Software Engineering Research and Practice, SERP2009, 44-49, July 2009, Las Vegas, Nevada, USA.
[4] Kearn, M. Introduction to REST and .NET Web API, January 2015, https://blogs.msdn.microsoft.com/martinkearn/2015/01/05/introduction-to-rest-and-net-web-api/
[5] Preece, J. Create a RESTful API with Authentication using Web API and JWT, March 2016, http://www.developerhandbook.com/csharp/create-restful-api-authentication-using-web-api-jwt/
[6] Wasson, M. Secure a Web API with Individual Accounts and Local Login in ASP.NET Web API 2.2, October 2014, https://docs.microsoft.com/en-us/aspnet/web-api/overview/security/individual-accounts-in-web-api
SESSION PROGRAMMING ISSUES AND ALGORITHMS + SOFTWARE ARCHITECTURES AND SOFTWARE ENGINEERING + EDUCATION Chair(s) TBA
Aesthetics Versus Entropy in Source Code
Ron Coleman and Brendon Boldt
Computer Science Department, Marist College, Poughkeepsie, NY, USA
Abstract – Although the separate literatures on programming style and information theories of software are large, little attention has been given to information theories of style. What is new in this paper is that we study a measure of programming style or “beauty” and its empirical relationship to disorder or information entropy in source code. In a series of experiments, we use a beauty model based on fractal geometry and a corpus of programs from the GNU/Linux repository. The data show with statistical significance that beauty and entropy in code are inversely related. The data also indicate that beauty and entropy are only weakly to moderately correlated, which is to say, beauty and entropy in source are not proxies for one another. Finally, the data contain statistical evidence of ways in which the beauty model might serve as a new kind of style checker. The main research contribution of this effort is a better, empirically grounded understanding of the roles aesthetic expression and entropy play in crafting and maintaining code.
Keywords: programming style, fractal geometry, aesthetics, information entropy
1 Introduction
Software engineers have observed that a repository, during its lifecycle, often undergoes modifications that disrupt the layout or structure of the code, specifically its programming style, with changes believed to increase disorder or information entropy in the code [1, 2]. However, these observations remain anecdotal. Coleman and Gandhi [3] put forward a relativistic, fractal model of aesthetic appeal in code that makes two predictions related to crafting and maintaining code. First, the measure of style or “beauty” as we define it here is inversely related to entropy in code. In other words, transformations that beautify code tend to lower the bit rate and increase the transparency of style, while transformations that obfuscate or de-beautify code tend to raise the bit rate and reduce the transparency of style. The model furthermore predicts that beauty and entropy are not proxies; that is, the correlation between beauty and entropy is weak-to-moderate. What is new in this paper is that we test these predictions through statistical experiments using the beauty model and semantic-preserving transformations on a corpus of open sources from the GNU/Linux source repository. The general goals of this research effort are to revisit the original ideas of Dijkstra [4], Knuth [5], and others regarding sensorial-emotional or aesthetic appeal in code [6]. However, our approach employs empirical methods which would enable programmers, educators, students and others to reason more
objectively and systematically about programming style using metrics rather than anecdote, ad hoc assessments, guesswork, etc. Furthermore, in view of recent calls by industry and the government to expand computing instruction for kids, underrepresented groups, and the next generation of coders [7, 8], we posit that an understanding of styles and anti-styles through quantitative means may be more important than in the past.
2 Related work
Although the separate literatures on programming style and information theories of software are large (see for example [9, 10, 11, 12, 13, 14, 15, 16, 17]), little attention has been given to information theories of style. Kirk and Jenkins [18], Anckaert et al. [19], Moshen and Pinto [20, 21], and Avidan and Feitelson [22] proposed information approaches to obfuscate code; however, they were fundamentally interested in information security, not the software development lifecycle. While the model of Posnett, Hindle, and Devanbu [23] incorporated information entropy in a logit regression model of readability, they were not investigating style but a means to simplify the readability model put forward by Weimer and Buse [24]. In both cases, they were interested not in aesthetics but readability. Kokol and others [25, 26, 27, 28] showed that programs indeed contain long-range correlations in lexical characters and tokens, i.e., code is self-similar. However, they were not studying style but searching for a fractal-based metric of software complexity using generated Pascal programs. Coleman and Gandhi [35] showed that software complexity and beauty are related, in common-sense directions, with weak-to-moderate correlation; in other words, complexity and beauty are not proxies. The investigations of this paper resemble efforts of researchers who used fractal geometry to assess aesthetic values in paintings and masterpieces, including Pollock’s “action paintings” [29, 30, 31, 32, 33]. The main technical differences are the use of code versus fine art and programming style versus artistic style. Beautiful Code [34] deals with conceptual beauty in the design and analysis of algorithms, testing, and debugging, topics which are outside the scope of this paper.
3 Methods
In this section we give our methods, starting with some working definitions.
3.1 Some working definitions
We follow Coleman and Gandhi [3] who defined “style” operationally to be the layout of code, namely, its lexical
structure. They limited the scope to “basic tenets” of good programming style defined by three general recommendations from a survey of different style guides, namely: 1) use white space; 2) choose mnemonic names; and 3) include documentation. They then defined “beauty” to be the measure of style relative to the basic tenets.
3.2 Semantic-preserving transformations
Let S be some source code called the control or baseline. Then, we have

S' = T(S)    (1)

such that T is a transformation operator or “treatment” of S which results in a new source, S'. The sources S and S' differ only in style; the semantics of S and S' are the same. There are two modalities of T: de-beautification and beautification. De-beautification manipulates S in ways inconsistent with the basic tenets; beautification manipulates S in ways consistent with the basic tenets. The tables below give the de-beautification and beautification treatments. (For details on the algorithms that manipulate the sources, see Coleman and Gandhi [35].)

Table 1 De-beautification treatments
T    Tenet  Semantics
NOI  1      Removes indents.
R2   1      Randomizes indents with 1-2 spaces.
R5   1      Randomizes indents with 1-5 spaces.
NON  2      Refactors names to be less mnemonic.
DEC  3      Removes comments.

Table 2 Beautification treatments
T    Tenet  Semantics
GNU  1      Applies GNU style [36].
K&R  1      Applies K&R style [37].
BSD  1      Applies BSD style [38].
LIN  1      Applies Linux style [39].
MNE  2      Refactors names to be more mnemonic.
REC  3      Adds comments.
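As a small illustration (ours, not the authors' tooling), the DEC treatment can be approximated for C-like sources with two regular expressions; this rough version ignores comment markers inside string literals:

import re                       # sketch of the DEC treatment (Table 1)

def dec(source):
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", "", source)                    # line comments
    return source

print(dec("int x = 1; /* counter */ // init"))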
3.3 Information entropy
If S = {s₀, s₁, s₂, …}, where the sᵢ are lexical tokens in S, then we compute I, the information entropy (or bit rate), as

I(S) = −Σᵢ pᵢ log pᵢ    (2)

where pᵢ is the observed frequency of token sᵢ [40].
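As an illustration (ours, not the authors' tooling), equation (2) can be computed directly over a token stream; the whitespace tokenizer below is a simplifying assumption, since the paper tokenizes lexically:

import math                     # entropy sketch for equation (2)
from collections import Counter

def entropy(source):
    tokens = source.split()     # assumed tokenizer
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(entropy("int a = a + 1 ; int b = a ;"))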
3.4 Fractal dimension
Mandelbrot [41] described fractals as geometric objects that are self-similar on different scales and nowhere differentiable. A quantitative interpretation based on reticular cell counting, or the box-counting dimension, gives D as

D(S) = lim_{r→0} log N_r(S) / log(1/r)    (3)

where S is some surface and N_r(S) is the number of components or subcomponents covered by a ruler of size r. For fractal objects, log N_r(S) will be greater than log(1/r) by a fractional amount. If the ruler is a uniform grid of square cells, then a straight line passes through twice as many cells when the cell length is reduced by a factor of two; a fractal object passes through more than twice as many cells. For r, we use grid sizes of 2, 3, 4, 6, 8, 12, 16, 32, 64, and 128, measured in pixels [42].
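A direct reading of equation (3) is to estimate D as the slope of log N_r(S) against log(1/r) over the listed grid sizes. The sketch below (ours, not the authors' implementation) does this for a binary image stored as a 0/1 array; it assumes the image contains foreground pixels at every scale:

import numpy as np              # box-counting sketch for equation (3)

def box_count(img, r):
    # Number of r-by-r cells that contain at least one foreground pixel.
    h, w = img.shape
    return sum(img[y:y + r, x:x + r].any()
               for y in range(0, h, r) for x in range(0, w, r))

def fractal_dimension(img):
    rs = np.array([2, 3, 4, 6, 8, 12, 16, 32, 64, 128])
    ns = np.array([box_count(img, r) for r in rs])
    slope, _ = np.polyfit(np.log(1.0 / rs), np.log(ns), 1)  # slope estimates D
    return slope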
3.5 Beauty model
Coleman and Gandhi [3] defined a beauty factor, B, to be

B(S|T) = k log(D / D′)    (4)

where k is a constant. When k = 10 and the log is base 10, the units of B are decibels (dB). D and D′ are the fractal dimensions of the sources S and S′, respectively, after they have been converted to “artefacts”, which we describe further below. Beauty factor, B, has the following interpretations:
1. If B

protocol MailBox {
    ... -> Bool
    var hashValue: Int { get }
    static func ==(lhs: MailBox, rhs: MailBox) -> Bool
}

class Worker: ActorProtocol {
    func calculate(start: Int, nrOfElements: Int) -> Double {
        var acc = 0.0
        for item in start ... (start + nrOfElements) {
            // alternating-sign term of the Gregory-Leibniz series
            acc += 4.0 * Double(1 - (item % 2) * 2) / Double(2 * item + 1)
        }
        return acc
    }
}

func processor(msg: Message) -> Void {
    // Message: the framework's actor-message type (not shown in this excerpt)
    // run calculate when an init
    // message is received and send
    // result to sender
}
V. EXAMPLE USAGE (PI)
We revisit the Gregory-Leibniz series [9] mentioned earlier, where the results of a series of fractions are alternately summed to obtain the final result. This work can be broken down into smaller pieces, with addition as the combining operator. And as addition is associative, it lends itself to the possibility of the work being completed at different times and returned as and when values become available. To this end, the different parts can be distributed among concurrently running processes and finally aggregated for the final result, as illustrated in the code below.

var n = 1.0
var pi = 0.0
let seriesLength = 24000
var pass = 0
while (pass < seriesLength) {
    // Loop body reconstructed: accumulate two signed terms per pass;
    // the actor version distributes these partial sums across Workers.
    pi += 4.0 / n
    n += 2.0
    pi -= 4.0 / n
    n += 2.0
    pass += 1
}
VI. RELATED WORK
While our work draws principally on the ACTOR model, there are also other programming approaches that we draw upon to formulate our architecture. We briefly describe those in this section.
A. Concurrent ML
Concurrent ML (CML) is an extension which adds concurrency capabilities to Standard ML of New Jersey, a statically typed programming language with an extensible type system that supports both
imperative and functional styles of programming. CML is described as a higher-order (sometimes referred to as functional) concurrent language [11] designed for high-performance concurrent programming. It extends Standard ML with synchronous message passing and, similarly to CSP described above, over typed channels [12].

B. Occam 2
Occam is an imperative procedural language, implemented as the native programming language of the Occam model. This model also formed the basis for a hardware chip, the INMOS transputer microprocessor [13]. It is one of several parallel programming languages developed based on Hoare's CSP [8]. Although the language is a high-level one, it can be viewed as an assembly language for the transputer [13]. The transputer was built with 4 serial bi-directional links to other transputers to provide message-passing capabilities among transputers. Concurrency in Occam is achieved by message passing along point-to-point channels, that is, the source and destination of a channel must be on the same concurrent process.

VARIABLE := EXPRESSION
CHANNEL ? VARIABLE
CHANNEL ! VARIABLE

This notation takes its exact meaning from Hoare's CSP [8]. The "?" requests an input from the channel to be stored in VARIABLE, whereas "!" sends a message over the channel, the message being the value stored in VARIABLE. Occam is strongly typed, and as such the channels over which messages are passed need to be typed as well, although the type can be ANY, meaning that the channel allows any type of data to be transmitted over it. An inherent limitation in Occam's data structures is that the only complex data type available is ARRAY.

C. Erlang
The Erlang Virtual Machine provides concurrency for the language in a portable manner, and as such it does not rely to any extent on threading provided by the operating system nor on any external libraries. This self-contained nature of the virtual machine ensures that any concurrent programmes written in Erlang run consistently across all operating systems and environments. The simplest unit is a lightweight virtual machine called a process [14]. Processes communicate with each other through message passing, so that a simple process written to communicate will look like the following:

start() -> spawn(module_name, [Parameters]).
loop() ->
    receive
        pattern -> expression;
        pattern -> expression;
        ...
        pattern_n -> expression;
    end
    loop().

start() spawns the process for the current module with any parameters that are required. A loop is then defined which contains directives to execute when it receives messages matching the enumerated patterns that follow. loop() is then called so that the process can once again wait to receive another message for processing.

D. Pony
Pony is an object-oriented, actor-model, capabilities-secure programming language [1]. In object-oriented fashion, an actor designated with the keyword actor is similar to a class except that it has what the language defines as behaviours. Behaviours are asynchronous methods defined in a class. Using the be keyword, a behaviour is defined to be executed at an indeterminate time in the future [1].

actor AnActor
    be(x: U64) => x * x

Pony runs its own scheduler using all the cores present on the host computer for threads, and several behaviours can be executed at the same time on any of the threads/cores at any given time, giving it concurrent capabilities. It can also be viewed within a sequential context, as the actors themselves are sequential: each actor executes one behaviour at a given time.

E. The Akka Library
Earlier versions of Scala had natively implemented actors. Actors were part of the Scala library and could be defined without any additional libraries. Newer versions (2.9 and above) have removed the built-in Actor, and now there is the Akka library. Akka is developed and maintained by Typesafe [15], and when it is included in an application, concurrency can be achieved. Actors are defined as classes that include or extend the Actor trait. This trait enforces the definition of at least a receive function. In the trait, receive is defined as a partial function which takes another function and returns a unit. The function it expects is the behaviour that the developer needs to program into the actor. This is essentially defined as a pattern-matching sequence of actions to be taken when a message is received that matches a given pattern.

import akka.actor.Actor
class MyActor extends Actor {
    def receive = {
        case Message1 =>
            //some action
        case Message2(x:Int) =>
            // another action use
            // x as an int
        ...
        case MessageN =>
            //Other actions
    }
}
At the heart of the Akka Actor implementation is the Java concurrency library java.util.concurrent [16]. This library provides the (multi)threading that Akka Actors use for concurrency. Users of the library do not need to worry about scheduling, forking, and/or joining; this is dealt with by the library's interaction with the executor service and context.

F. Kotlin coroutines
Kotlin coroutines can be described as suspendable operations that can be resumed at a later time, potentially on a different thread of execution. We believe that this model could similarly be implemented in this language, perhaps leveraging coroutines.

VII. CONCLUSIONS AND FUTURE WORK
In this article we have briefly described the architecture and implementation of a prototype ACTOR model implementation for the Swift programming language. The prototype can run actor-based programs and is competitive in performance with manually written concurrent code. We intend to enhance our system by incorporating configuration mechanisms into the architecture to allow different actor systems to interoperate. We also plan to add a remoting feature to allow actors to communicate across platforms and virtual environments. This will enhance the flexibility and performance of our actors. We also plan to add an event bus, further scheduling facilities, logging, and more monitoring capabilities. From a professional usage viewpoint we need to determine a Quality of Service guarantee strategy, but we intend that this process can be determined after more extensive testing.

REFERENCES
[1] S. Clebsch. (2015, 01) The pony programming language. The Pony Developers. [Online]. Available: http://www.ponylang.org/
[2] R. Pike, “Go at google,” in Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, ser. SPLASH ’12. New York, NY, USA: ACM, 2012, pp. 5–6. [Online]. Available: http://doi.acm.org/10.1145/2384716.2384720
[3] M. Odersky, L. Spoon, and B. Venners, Programming in Scala, 2nd ed. Artima Inc, 2011.
[4] R. Hickey, “The clojure programming language,” in Proceedings of the 2008 Symposium on Dynamic Languages, ser. DLS ’08. New York, NY, USA: ACM, 2008, pp. 1:1–1:1. [Online]. Available: http://doi.acm.org/10.1145/1408681.1408682
[5] C. Hewitt, P. Bishop, and R. Steiger, “A universal modular actor formalism for artificial intelligence,” in Proceedings of the 3rd International Joint Conference on Artificial Intelligence, ser. IJCAI’73. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1973, pp. 235–245. [Online]. Available: http://dl.acm.org/citation.cfm?id=1624775.1624804
[6] J. Reppy, Concurrent Programming in ML. Cambridge University Press, 2007. [Online]. Available: https://books.google.co.uk/books?id=V0CCK8wcJUC
[7] P. Butcher, Seven Concurrency Models in Seven Weeks: When Threads Unravel, 1st ed. Pragmatic Bookshelf, 2014.
[8] C. A. R. Hoare, Communicating Sequential Processes. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1985.
[9] J. Arndt and C. Haenel, Pi-unleashed. Springer Science & Business Media, 2001.
[10] A. Inc. (2016) Dispatch - api reference. [Online]. Available: https://developer.apple.com/reference/dispatch
[11] J. H. Reppy, “Cml: A higher concurrent language,” SIGPLAN Not., vol. 26, no. 6, pp. 293–305, May 1991. [Online]. Available: http://doi.acm.org/10.1145/113446.113470
[12] J. Reppy, C. V. Russo, and Y. Xiao, “Parallel concurrent ml,” SIGPLAN Not., vol. 44, no. 9, pp. 257–268, Aug. 2009. [Online]. Available: http://doi.acm.org/10.1145/1631687.1596588
[13] D. C. Hyde, “Introduction to the programming language occam,” 1995.
[14] J. Armstrong, Programming Erlang. Pragmatic Bookshelf, 2007.
[15] Typesafe. (2014, Feb.) Build powerful concurrent & distributed applications more easily. [Online; accessed 05 Feb, 2014]. [Online]. Available: http://www.akka.io/
[16] D. Wyatt, Akka concurrency. Artima Incorporation, 2013.
Performance Investigation of Deep Neural Networks on Object Detection
Oluseyi Adejuwon, Hsiang-Huang Wu, Yuzhong Yan and Lijun Qian
Center of Excellence in Research and Education for Big Military Data Intelligence (CREDIT)
Department of Electrical and Computer Engineering
Prairie View A&M University, Texas A&M University System
[email protected], [email protected], [email protected], [email protected]
Abstract—Object detection in a surveillance system has to meet two requirements: high accuracy and realtime response. To achieve high accuracy, we adopt the TensorBox architecture [1] as the realization that integrates the Long Short-Term Memory (LSTM) of Recurrent Neural Networks (RNN) into a Convolutional Neural Network (CNN). Such deep neural networks are already known for their superior object detection ability. They are, however, usually computation-intensive and demand more resources to meet the realtime requirement. If the system configuration is not properly selected, it often leads to low utilization of the resources, which in turn builds redundancy into the system at significant cost. In our performance investigation, we use a drone to capture video and propose this as the scenario of a surveillance system. In addition, we experiment with different numbers of CPU cores and with a GPU under different amounts of available memory. Our performance investigation concludes that the GPU outperforms the CPU even when all of the CPU cores are used, and that parameter tuning has little effect on the computational time.
I. INTRODUCTION
Surveillance systems face an inevitable problem: they require large memory space to store images containing both non-suspicious and suspicious scenes for post-processing. This characteristic results in memory quickly filling with unwanted information, since objects can encapsulate more information in compact data structures than raw pixels can. On the other hand, it takes more computational time to separate the useful data from the unwanted data, which in turn makes the surveillance system time-sensitive. Here we propose that object detection should be done at the edge device, with the object information sent back to the surveillance system for realtime tracking. Without sacrificing accuracy, we use a CNN architecture integrating LSTM from RNN to guarantee the quality of object detection: feature extraction, classification, and localization. Based on this state of the art, we investigate how performance is affected while object detection is running. Realtime object detection in surveillance systems has become a difficult task for the following reasons [2]: segmentation of the object, viewing point, intensity of the object, deformation, and so on. It becomes more complex when the scene or image gets cluttered, making object detection difficult even with manual parameter tuning. Furthermore,
it is sometimes less accurate and time-consuming. Since deep learning has succeeded significantly in object detection, we choose the deep neural network architecture implemented by [1] as our detection engine and investigate its performance under different platforms and system configurations. The dataset is composed of sequential images extracted from video taken by the drone. The chosen deep neural networks are automatically in charge of model training and object detection. Specifically, the CNN is responsible for feature extraction and also reduces the number of parameters used in comparison with an Artificial Neural Network (ANN). In addition, the LSTM exploits temporal locality, retaining only useful information to reduce memory usage. With this help, a better or smarter surveillance system can be designed for different computational capacities and configurations. There are many works in object detection that use different techniques to tackle multi-target predictions in images [3]–[5]. Better performance in object detection is achieved by [6]–[8], which adopt sliding windows and image scanning. In [9], [10], CNNs are applied for object classification. The former methods produce bounding boxes on the target but fail in performance on multi-target images. In [1], the authors provide a lasting solution to this performance failure by combining a CNN with the LSTM of an RNN to enhance the capacities of the model arrangement.
II. SETUP OF OUR PERFORMANCE INVESTIGATION
We adopt TensorBox [1] as the implementation of our chosen deep neural networks, which has the advantages of inception modules [11]: 1) the capacity for deep convolutional high-level representation, and 2) the capacity to produce an incremental set of predictions of variable length. We present a model that first encodes processed input images from the video stream into high-level descriptors by means of a CNN design [12], and then decodes the feature representation into an arrangement of bounding boxes. As the core apparatus for predicting variable-length outcomes, we rely on a recurrent network of LSTM units. This is the strength for object detection in the surveillance system: only the target or suspicious object in a scene is captured with bounding boxes, which saves system memory.
A grid of 1024-dimensional descriptors is generated from each image at strided locations throughout the image. Each 1024-dimensional vector summarizes the contents of its region and carries rich information regarding the object positions. The LSTM acts as a controller in the decoding of a region after extracting information from the source. At each step, the LSTM yields a new bounding box and a corresponding confidence that a formerly undetected object will be found at that region of the image. The bounding boxes are expected to be generated in descending order of confidence. When the LSTM cannot discover another box in the region with a confidence above a preset threshold, a stopping criterion is satisfied. The sequence of outputs is collected and presented as a final description of all object instances in the image region. The model encodes the image into a 15x20 grid of 1024-dimensional high-level features. The grid comprises cells with a receptive field of 139 by 139 pixels, trained to generate a group of distinct bounding boxes within a 64x64 region. This size is assumed sufficient to bound any object of interest under difficult local occlusion. There are 300 distinct LSTM controllers running in parallel, one for each 1x1x1024 cell of the grid. Our LSTM units have 250 memory states, with neither bias nor non-linearities. We obtained practically identical results by feeding the 480x640 image only to the first LSTM unit, as multiple inputs of the image are unnecessary. The model must learn to regress on bounding box locations using the LSTM decoder. During training, the decoder yields a complete set of bounding boxes, each with its corresponding confidence. To keep the model simple and efficient in batch processing, the cardinality of the complete set remains unchanged, irrespective of the number of ground-truth boxes. The loss of the model is evaluated using a bipartite matching function, as explained in [1]. The systems we use for the performance investigation are CPU- and GPU-based. In the CPU system, an Intel Xeon E7-4850 CPU with 16 cores and 32 threads is chosen. In the GPU system, an Nvidia Titan X GPU with 3584 CUDA cores and 12 GB of GDDR5X memory is selected. Figures 1(a) and 1(b) illustrate two scenarios: detection of one object and of multiple objects, respectively.
III. ANALYSIS OF PERFORMANCE INVESTIGATION
On the CPU platform, the effect of the number of processing cores on the computational time of the algorithm is investigated. All sixteen cores of the Intel Xeon E5-2650 CPU are used in this experiment. TensorBox is run several times under different numbers (2, 4, 8, 12, 16) of cores and the computational times compared. The measurement is based on Real time, User time, and Sys time, which represent the actual clock time, the CPU time spent in user-mode code inside the process, and the CPU time spent in the system kernel on behalf of the process, respectively.
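The real/user distinction can be reproduced at small scale with an illustration of ours that is independent of TensorBox (Unix-only, since os.times() reports child-process CPU time there):

import os, time                 # real vs. user time demo (ours, not TensorBox)
from multiprocessing import Pool

def burn(_):
    s = 0
    for i in range(10_000_000):
        s += i
    return s

if __name__ == "__main__":
    for cores in (2, 4, 8):
        t0 = time.perf_counter()
        u0 = os.times().children_user
        with Pool(cores) as p:
            p.map(burn, range(16))         # 16 equal-sized work items
        real = time.perf_counter() - t0
        user = os.times().children_user - u0
        # real time shrinks with more workers; user time stays roughly flat
        print(cores, "workers: real %.2fs, user %.2fs" % (real, user))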
(a) Two targets of interest. (b) One target of interest.
Figure 1: Left column: prediction; right column: ground truth.
Figure 2: Number of processor cores versus computational time (Real, User, Sys).
With respect to the real time in Figure 2, the number of processing cores is inversely proportional to the computational time: the more cores utilized in the system configuration, the lower the computational time, because multi-threading lets the code execute in parallel. As the number of cores increases, the parallelism of the program increases and the computational time decreases accordingly. The user time, in contrast, increases with the number of cores, since the user time aggregates the CPU time of all threads, as explained above.
The effect of different GPU memory configurations on the performance of the algorithm was also examined. The NVIDIA Titan X was used with its standard memory configuration of 12 GB GDDR5X, and the available memory was varied over the range of 2 G to 12 G while the computational time was recorded. From Figure 3 and Table I, it is observed that the computational time remains essentially constant (no significant variation); that is, the memory is self-sufficient across the selected range.

Figure 3: GPU computational time based on memory allocation.

Table I: GPU computational time based on memory allocation
Memory  Real time     User time      Sys time
2G      807m22.351s   1151m48.588s   30m16.260s
4G      808m19.462s   1153m1.492s    30m33.36s
6G      816m5.669s    1162m42.392s   31m2.032s
8G      795m9.620s    1139m34.828s   30m40.180s
10G     804m37.069s   1149m56.904s   30m15.188s
12G     803m37.313s   1144m45.036s   32m55.272s

This can be analyzed further: the training batch size was set to 33,000, which means that the algorithm learns from a chunk of 33,000 images per step or batch at the preset learning rate. Each 640 x 480 image is 3.68 KB, so the total image size per batch, together with the other processes, is still below 1.3 G. This is far below the minimum RAM selected for the experiment. Considering the computation of the algorithm, selecting more than 2 G of GPU memory would waste memory resources for this application.
The deep learning algorithm, seen as a brain model, exhibits some characteristics of the human brain, as shown in Figure 4. As the number of training iterations increases, the training loss decreases. The loss is high in the early stages of the process for all three datasets and lower as the iteration count increases. There were slight differences in the behavior of the different datasets used in the experiment. The curves for dataset1 and dataset3 are closely related due to the nature of their data: dataset1 comprises a mixture of images with and without one object of interest, and dataset3 contains images with only one object of interest, while dataset2 is a mixture of images with two or more objects of interest or disinterest. Any busy scene with many targets to identify requires more complex processing, and the training accuracy is computed as a summation over all individual targets identified in the image, which is usually responsible for the higher training loss on complex scenes compared to images with one or no target.

Figure 4: Number of iterations versus training losses of three different datasets.
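The learning-rate and dropout hyperparameters examined next can be located in a typical TensorFlow/Keras setup as follows (our sketch, not the TensorBox code):

import tensorflow as tf        # hyperparameter sketch (ours, not TensorBox)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),   # dropout rate under study (0.15-0.75)
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # Adam learning rate
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)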
As the training of the algorithm progresses, the model moves in the direction in which the individual bias and weight values can be changed to produce more accurate predictions. A learning rate is assigned in order to determine how much the connection weights and biases may be adjusted, in terms of both the direction and the rate of change. The learning rate is directly proportional to the training speed of the algorithm: the higher the learning rate, the faster the training of the network. In any case, the network then has a greater chance of converging not at the global minimum but at a local minimum: a point where the network stabilizes on a solution that is not the optimal global minimum. The learning rate of the algorithm was evaluated in terms of the rate of convergence and the behavior of the training loss, as shown in Table II. When the learning rate was set to 0.1 under Adam adaptive learning, the algorithm converged at 660,000 iterations. At a learning rate of 0.01, the algorithm converged at 561,000 iterations, while at 0.001 and 0.0001 the algorithm converged at 429,100 and 330,000 iterations respectively, but with higher computational time.

Table II: Effect of the learning rate on convergence
S/N  Learning Rate  Convergence Point
1    0.1            660000
2    0.01           561000
3    0.001          429100
4    0.0001         330000

A common problem in deep learning is over-fitting. One popular method of tackling this problem is known as dropout. The important trick is to randomly drop some neural network connections during training to avoid co-adapting connections. During testing, it is simple to approximate the effect of averaging the output predictions of all these thinned networks by using a single unthinned network with smaller weights [13]. Different dropout values were investigated in this system to examine how they affect the computational time, the training loss, and the test accuracy. Figure 5 shows the computational times after 60,000 iterations at different dropout values.

Figure 5: Computational times at different dropout values.

Regarding the effect of the dropout values on the training loss of the algorithm, it is evident that 0.15 and 0.25 were not appropriate dropout values for this algorithm, as the loss is erratic during training with them. The popular dropout value is 0.5, but it can be observed that 0.75 also produced results comparable to 0.5.

IV. CONCLUSIONS
Our performance investigation of object detection in a surveillance system under different system configurations and platforms leads to the following conclusions: 1) the number of processor cores is inversely related to the real computational time and directly related to the user computational time; 2) the batch size and the image size play a key role in the RAM utilization of the system and directly impact the performance of the algorithm; 3) the learning rate plays a key role in the rate of convergence of the algorithm, with smaller rates converging in fewer iterations (Table II) at the cost of higher computational time; 4) setting the dropout to 0.5 does not guarantee optimal performance, as 0.75 performed better than the standard value in our tests; 5) the GPU platform is roughly 12 times faster in computation than the CPU. Future work will focus on realtime streaming and tracking of specific targets.

ACKNOWLEDGMENT
This research work is supported in part by the U.S. Dept of Navy under agreement number N00014-17-1-3062 and the U.S. Office of the Assistant Secretary of Defense for Research and Engineering (OASD(R&E)) under agreement number FA8750-15-2-0119. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Dept of Navy or the Office of the Assistant Secretary of Defense for Research and Engineering (OASD(R&E)) or the U.S. Government.
REFERENCES
[1] R. Stewart, M. Andriluka, and A. Y. Ng, “End-to-end people detection in crowded scenes,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2325–2333.
[2] G. Hinton, “Why object recognition is difficult [neural networks for machine learning],” https://www.youtube.com/watch?v=Qx3i7VWYwhI/, (Accessed on 02/14/2017).
[3] M. A. Sadeghi and A. Farhadi, “Recognition using visual phrases,” in Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011, pp. 1745–1752.
[4] W. Ouyang and X. Wang, “Single-pedestrian detection aided by multi-pedestrian detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 3198–3205.
[5] S. Tang, M. Andriluka, and B. Schiele, “Detection and tracking of occluded people,” International Journal of Computer Vision, vol. 110, no. 1, pp. 58–69, 2014.
[6] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, “Overfeat: Integrated recognition, localization and detection using convolutional networks,” arXiv preprint arXiv:1312.6229, 2013.
[7] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580–587.
[8] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Advances in Neural Information Processing Systems, 2015, pp. 91–99.
[9] C. Szegedy, S. Reed, D. Erhan, D. Anguelov, and S. Ioffe, “Scalable, high-quality object detection,” arXiv preprint arXiv:1412.1441, 2014.
[10] J. R. Uijlings, K. E. Van De Sande, T. Gevers, and A. W. Smeulders, “Selective search for object recognition,” International Journal of Computer Vision, vol. 104, no. 2, pp. 154–171, 2013.
[11] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[12] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
[13] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
Author Index
Abboud, Mira - 120 Abdunabi, Ramadan - 10 Abu El Humos, Ali - 127 Adejuwon, Oluseyi - 184 Al Balooshi, Mouza - 137 Alhaddad, Ahmed - 3 Amer, Suhair - 131 Ando, Takahiro - 45, 98 Andrews, Anneliese - 3, 24 Aning, Kwabena - 178 Barrocas, Samuel - 56 Bobelin, Laurent - 31 Boldt, Brendon - 113 Boreisha, Yurii - 104 Boukhris, Salah - 3 Chadalavada, Moulika - 143 Chen, Ning - 17, 149 Coleman, Ron - 113 Crawford, Tyler - 69 Daimi, Kevin - 88 Dbouk, Mohamad - 120 Elias, Gledson - 38 Fanuel, Mesafint - 127 Fujii, Ryo - 98 Fukuda, Akira - 45, 81, 98 Gannous, Aiman - 24 Gario, Ahmed - 24 Gu, Yifan - 77 Han, Yijie - 143 Hisazumi, Kenji - 45, 81, 98 Hong, Changhee - 167 Hussain, Tauqeer - 69 Hwang, Yoola L. - 169 Ishikawa, Yuki - 81 J. Rutherford, Matthew - 24 Jeong, Cheol Oh - 169 Keiller, Peter - 62 Kim, Hyunju - 127 Kim, In Jun - 169 Kim, JaeDu - 165 Kim, SungEn - 165 Kong, Weiqiang - 45 Lee, Byoung-Sun - 169 Lee, Soojeon - 169 Lee, TaeGyun - 165 Liang, Xuejun - 127 Lima, Georgenes - 38
Lo, Shun Chi - 149 Malaiya, Yashwant - 10 Mannock, Keith Leonard - 178 Matsumoto, Michihiro - 45 Mejias, Marlon - 62 Michiura, Yasutaka - 45 Mine, Tsunenori - 98 Moinard, Stéphane - 31 Murray, Acklyn - 62 Myronovych, Oksana - 104 Naja, Hala - 120 Nakanishi, Tsuneo - 98 Nygard, Kendall - 51 Oliveira, Marcel - 56 Oussalah, Mourad - 120 Pei, Tzusheng - 127 Qian, Lijun - 184 Qiu, Zirou - 131 Rastogi, Aakanksha - 51 Rushton, Nelson - 173 Sakemi, Keita - 45 Salih, Nadir - 156 Snyder, Katherine - 88 Subramanian, Ramaswamy - 17 Toinard, Christian - 31 Towhidnejad, Massood - 137 Tran, Tuan Hiep - 31 Wang, Bo - 45 Wu, Hsiang-Huang - 184 Xu, Chaohui - 77 Yan, Yuzhong - 184 Yoon, Chunjoo - 167 Zang, Tianyi - 156 Zheng, Mao - 77 Zhu, Tingting - 17