English Pages 224 [226] Year 2020
INFORMATION TECHNOLOGY AND INTELLIGENT TRANSPORTATION SYSTEMS
Frontiers in Artificial Intelligence and Applications The book series Frontiers in Artificial Intelligence and Applications (FAIA) covers all aspects of theoretical and applied Artificial Intelligence research in the form of monographs, selected doctoral dissertations, handbooks and proceedings volumes. The FAIA series contains several sub-series, including ‘Information Modelling and Knowledge Bases’ and ‘Knowledge-Based Intelligent Engineering Systems’. It also includes the biennial European Conference on Artificial Intelligence (ECAI) proceedings volumes, and other EurAI (European Association for Artificial Intelligence, formerly ECCAI) sponsored publications. The series has become a highly visible platform for the publication and dissemination of original research in this field. Volumes are selected for inclusion by an international editorial board of well-known scholars in the field of AI. All contributions to the volumes in the series have been peer reviewed. The FAIA series is indexed in ACM Digital Library; DBLP; EI Compendex; Google Scholar; Scopus; Web of Science: Conference Proceedings Citation Index – Science (CPCI-S) and Book Citation Index – Science (BKCI-S); Zentralblatt MATH. Series Editors: J. Breuker, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong
Volume 323 Recently published in this series Vol. 322. M. Araszkiewicz and V. Rodríguez-Doncel (Eds.), Legal Knowledge and Information Systems – JURIX 2019: The Thirty-second Annual Conference Vol. 321. A. Dahanayake, J. Huiskonen, Y. Kiyoki, B. Thalheim, H. Jaakkola and N. Yoshida (Eds.), Information Modelling and Knowledge Bases XXXI Vol. 320. A.J. Tallón-Ballesteros (Ed.), Fuzzy Systems and Data Mining V – Proceedings of FSDM 2019 Vol. 319. J. Sabater-Mir, V. Torra, I. Aguiló and M. González-Hidalgo (Eds.), Artificial Intelligence Research and Development – Proceedings of the 22nd International Conference of the Catalan Association for Artificial Intelligence Vol. 318. H. Fujita and A. Selamat (Eds.), Advancing Technology Industrialization Through Intelligent Software Methodologies, Tools and Techniques – Proceedings of the 18th International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques (SoMeT_19) Vol. 317. G. Peruginelli and S. Faro (Eds.), Knowledge of the Law in the Big Data Age Vol. 316. S. Borgo, R. Ferrario, C. Masolo and L. Vieu (Eds.), Ontology Makes Sense – Essays in honor of Nicola Guarino
ISSN 0922-6389 (print) ISSN 1879-8314 (online)
Information Technology and Intelligent Transportation Systems
Edited by
Lakhmi C. Jain Centre for Artificial Intelligence University of Technology Sydney, Broadway, Australia Liverpool Hope University, Liverpool, United Kingdom
Xiangmo Zhao School of Information Engineering, Chang’an University, Xi’an, China
Valentina Emilia Balas Department of Automation and Applied Informatics, Faculty of Engineering, Aurel Vlaicu University of Arad, Arad, Romania
and
Fuqian Shi Rutgers, The State University of New Jersey, New Brunswick, USA
Amsterdam Berlin Washington, DC
© 2020 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher. ISBN 978-1-64368-060-6 (print) ISBN 978-1-64368-061-3 (online) Library of Congress Control Number: 2020932262 doi: 10.3233/FAIA323 Publisher IOS Press BV Nieuwe Hemweg 6B 1013 BG Amsterdam Netherlands fax: +31 20 687 0019 e-mail: [email protected] For book sales in the USA and Canada: IOS Press, Inc. 6751 Tepper Drive Clifton, VA 20124 USA Tel.: +1 703 830 6300 Fax: +1 703 830 2300 [email protected]
LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS
Preface Intelligent transport systems, from basic management systems to more application-oriented systems, vary in the technologies they apply. Information technologies, including wireless communication, are important in intelligent transportation systems, as are computational technologies: floating car data/floating cellular data, sensing technologies, and video vehicle detection. Theoretical and application technologies, such as emergency vehicle notification systems, automatic road enforcement and collision avoidance systems, as well as some cooperative systems are also used in intelligent transportation systems. This book presents papers selected from the 128 submissions in the field of information technology and intelligent transportation systems received from 5 countries. In December 2019 Chang’an University organized a round-table meeting to discuss and score the technical merits of each selected paper, of which 23 are included in this book. The meeting was also co-sponsored by Xi’an University of Technology, Northwestern Polytechnical University, CAS, Shaanxi Sirui Advanced Materials Co., LTD and Special Aircraft Engineering Research Institute. We are grateful to all contributors for their efforts in preparing their manuscripts in a timely manner. We would also like to express our appreciation to the reviewers for their support, and gratefully acknowledge their help in bringing out this volume on time. Lakhmi C. Jain Xiangmo Zhao Valentina Emilia Balas Fuqian Shi
Contents

Preface
Lakhmi C. Jain, Xiangmo Zhao, Valentina Emilia Balas and Fuqian Shi

A New Model-in-the-Loop Test Strategy for Autonomous Parallel Parking System
Yiheng Yang, Feng Gao, Jianli Duan and Caimei Wang

Study on the Determination of Spatio-Temporal Threshold of Bicycle Connection Rail Transit in Haze Under Traffic Restriction
Na Zhang and Jianpo Wang

The Influence Assessment of Platform’s Parameters on Service Level in Subway Transfer Stations
Yao Wang, Shunping Jia and Jiaqi Wu

Visual Traffic Surveillance: A Concise Survey
Atreyee Mondal, Anjan Dutta, Nilanjan Dey and Soumya Sen

Multi-Criteria Evaluation for Energy Consumption Quality of Urban Rail Transit Stations
Tongjun Guo, Baohua Mao and Huiwen Wang

On Impact of Turn-Backs on Capacity at Urban Rail Intermediate Stations
Jiaqi Wu, Baohua Mao, Yao Wang and Qi Zhou

Performance Comparison of Feature Extraction Methods for Iris Recognition
J. Jenkin Winston and D. Jude Hemanth

Emotion Recognition Using Feature Extraction Techniques
M. Kalpana Chowdary and D. Jude Hemanth

Interactive Educational Content Using Marker Based Augmented Reality
K. Martin Sagayam, D. Jude Hemanth, Andrew J, Chiung Ching Ho and Dang Hien

Classification of Melanoma Through Fused Color Features and Deep Neural Networks
Ananjan Maiti, Himadri Shekhargiri, Biswajoy Chatterjee, Venkatesan Rajinikanth, Fuqian Shi and Nilanjan Dey

Two Dimensional DEM Simulation for Truck Tires Rolling onto the Randomly Shaped Pebbles Used in a Truck Escape Ramp
Pan Liu, Qiang Yu, Xuan Zhao and Peilong Shi

Image Examination System to Detect Gastric Polyps from Endoscopy Images
Nilanjan Dey, Fuqian Shi and Venkatesan Rajinikanth

Machine Learning Models for Bird Species Recognition Based on Vocalization: A Succinct Review
Nabanita Das, Atreyee Mondal, Jyotismita Chaki, Neelamadhab Padhy and Nilanjan Dey

Chirp Code Deterministic Compressive Sensing: Analysis on Power Signal
Katia Melo, Mahdi Khosravy, Carlos Augusto Duque and Nilanjan Dey

A Risk Assessment of Ship Navigation in Complex Waters
Chen Mao, Hao Wang, Jinyan Zheng, Mingdong Wang and Chunhui Zhou

Optimized Tang’s Algorithm for Retinal Image Registration
Sayan Chakraborty, Ratika Pradhan, Nilanjan Dey and Amira S. Ashour

Modeling and Simulation of Dynamic Location Allocation Strategy for Stereo Garage
Bowen Li, Jianguo Li and Yaojun Kang

Achieving Midcourse and Terminal Trajectory Handoff for Helicopter-Borne Aircrafts Using a Linear Modulation Factor
Junsheng Liu, Shichao Chen, Yun Chen, Chenlu Duan and Ming Liu

Study on the Influence of Intelligent Evacuation Guidance System of Highway Tunnel on Emergency Evacuation
Zhengmao Cao and Xiao Liu

Dynamic Self-Calibration Algorithm for PTZ Camera in Traffic Scene
Wei Wang, Zhaoyang Zhang, Xinyao Tang and Huansheng Song

Morphological Segmentation of Plantar Pressure Images by Using Mean Shift Local De-Dimensionality
Dan Wang and Zairan Li

IMA System Health Assessment Method Based on Incremental Random Forest
Yujie Li, Dong Song, Zhiyue Liu and Yuju Cao

Research on Automatic Extraction Method of Lane Changing Behavior Based on the Naturalistic Driving Data
Yingzhi Xiong, Qin Xia, Penghui Li, Long Chen and Yi Chai

Subject Index

Author Index
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200040
A New Model-in-the-Loop Test Strategy for Autonomous Parallel Parking System
Yiheng YANG a, Feng GAO b, Jianli DUAN a,1, and Caimei WANG b
a School of Electrical Engineering, Chongqing University, Chongqing, China
b School of Automotive Engineering, Chongqing University, Chongqing, China
Abstract. Combinatorial test (CT) is a commonly used method for conducting comprehensive and efficient tests. To further improve test efficiency, this paper proposes an improved combinatorial test (ICT) method, based on CT, for the model-in-the-loop (MIL) test of the autonomous parallel parking system (APPS). Unlike CT, the ICT takes not only combination coverage but also importance degree into consideration when generating test cases. When selecting a parameter value for a test case, the ICT tends to select the value with the higher importance degree, provided that the combination coverage and the number of test cases remain unchanged. Among the different values of the same parameter, those more likely to lead to bad parking performance are assigned higher importance degrees using the Analytic Hierarchy Process (AHP). The experimental results show that, overall, for different values of the same parameter, the values with higher importance degrees appear more frequently in the ICT test suite. Moreover, compared with CT, the ICT further improves the test efficiency.
Keywords. Autonomous parallel parking system, improved combinatorial test, importance degree, model-in-the-loop
1. Introduction
To obtain a high-quality advanced driving assistance system (ADAS), it must be tested properly during the development stage. According to the V-shape development process, the model-in-the-loop (MIL) test is one of the most important stages [1]. However, unlike traditional onboard control systems such as the battery management system [2], the functionality and performance of ADAS are closely related to the changeable traffic environment. Since the traffic environment is uncontrollable and difficult to define exactly, it poses great challenges for test case design. In this paper, the method of test case design in the MIL test is studied, using the autonomous parallel parking system (APPS) as an example. In recent years, research on autonomous parking systems (APS) has mainly focused on their realization and performance, such as parking path planning [3][4] and trajectory tracking control [5][6]. In these studies, researchers only built some typical working conditions to validate the advantages of their algorithms. For example, K. Demirli et al. designed five parking slots with different widths to demonstrate that the parking
1 Corresponding Author. Jianli Duan, School of Electrical Engineering, Chongqing University, Chongqing, China; Email: [email protected].
Y. Yang et al. / A New Model-in-the-Loop Test Strategy for Autonomous Parallel Parking System
controller they proposed can successfully control the vehicle direction according to its position, without information about the slot width [5]. The test methods mentioned above are only suitable for verifying specific functionality or performance of APS; it is difficult for them to find potential errors. To test APS comprehensively, X. Du et al. adopted an enumeration method to generate test cases when testing the robustness of their path planning algorithm [7]. First, they selected candidate values of the vehicle's initial position and orientation with a fixed step size. Then, test cases were generated by exhaustively combining the candidate values of these parameters. However, this method generates too many test cases: although values of only three parameters were combined to construct a test case, 10208 test cases were generated. When more parameters are taken into consideration, the number of test cases generated in this way increases exponentially, which may make it impossible to complete the test tasks within a defined test cycle. To reduce the number of test cases, M. Oetiker et al. [8] adopted the random test (RT) method to validate the robustness of a parking controller. When constructing test cases, they randomly selected values of the vehicle's initial position, orientation, etc. However, RT depends on "luck" to find errors, so its test efficiency tends to be low. In addition, the consistency of test suites generated by RT is poor. Some researchers discovered that almost all system faults are caused by interactions of a few parameters. It is only necessary for the test suite to cover the N-dimension ($2 \le N \le 6$) combinations of the parameters related to the performance of the system under test (SUT) [9][10][11]. Therefore, testers can use combinatorial test (CT) to detect system faults. In fact, CT is already widely used in the field of testing [12]. It can improve test efficiency to some extent while ensuring comprehensiveness. In addition, it achieves consistency of test suites [13][14]. In view of that, CT can be applied to the APPS MIL test to detect unqualified parking performance.
Current research on CT mainly focuses on minimizing the number of test cases in test suites without decreasing coverage. The greedy algorithm is one of the most popular methods adopted to meet this goal [12][13][15]. By analyzing the characteristics of CT test suites, we find that, for some parameters, only part of the test cases in the test suite are needed to cover all value combinations related to them. In the other test cases, no matter which value of these parameters is chosen, the combination coverage does not change. Considering that, to further improve the probability of detecting unqualified parking performance, we propose the improved combinatorial test (ICT) method for the MIL test of the APPS. The ICT tends to select more important values to generate test cases, on condition that the combination coverage and the number of test cases remain unchanged.
Section 2 introduces the strategy for conducting the APPS MIL test. The ICT algorithm is introduced in Section 3. The experimental results are shown in Section 4. Section 5 concludes the paper.
2. Strategy of the APPS MIL Test
The strategy of the APPS MIL test is depicted in Figure 1. We set up a MIL test platform on a computer equipped with an Intel(R) Core(TM) i7-8550U processor, on which PreScan and MATLAB are installed. Virtual traffic scenarios are constructed in PreScan according to the test suites. The parallel parking algorithm controls the motion of the host vehicle (HV) according to its physical parameters and the information about the traffic scenario perceived by the sensor models. The motion state and position of the HV are
continuously updated in the virtual scenario. To speed up the testing process, we wrote scripts based on the "test automation" function provided by PreScan so that the test process executes automatically.
Figure 1. The strategy of the APPS MIL test.
In Figure 1, $P_i$ represents the i-th parameter related to the working conditions of the SUT. A test case is generated by selecting one value from the set of values of each parameter. In this paper, we propose the ICT algorithm to generate test cases. It intervenes in the process of generating test cases through the importance degrees of parameter values. The algorithm is introduced in detail in Section 3. According to the explanation above, to generate an ICT test suite for testing the APPS, two problems must be tackled first: (1) assigning importance degrees; (2) finding the parameters related to the working conditions of the APPS.

2.1. Calculation of Importance Degrees
G. Rowe et al. proposed that, in the absence of objective data, importance degrees can be evaluated on the basis of experience and subjective perception [16]. The Analytic Hierarchy Process (AHP) is a popular method for subjectively evaluating importance degrees. It integrates quantitative calculation with qualitative analysis, making the results more scientific and convincing [17][18]. We therefore use AHP to assign importance degrees to parameter values. Among the m different values of the same parameter, the values that are more likely to lead to bad parking performance are assigned higher importance degrees. According to [17][18][19], a pairwise comparison matrix $A = [a_{ij}]_{m \times m}$ must first be constructed, whose leading diagonal elements are 1 and for which $a_{ij} \times a_{ji} = 1$ ($1 \le i \le m$, $1 \le j \le m$). Then the consistency of A is evaluated according to $CR = CI/RI$, where CR is the consistency ratio, CI is the consistency index and RI is the random index. $CI = (\lambda_{max} - m)/(m - 1)$, where $\lambda_{max}$ is the maximal eigenvalue of A. RI can be obtained from reference [19]. $CR < 0.1$ means that A meets the consistency requirement; $CR \ge 0.1$ means that some elements in A should be modified. $W = [w_1\ w_2\ \cdots\ w_i\ \cdots\ w_m]$ is the eigenvector corresponding to $\lambda_{max}$. After normalizing W using Eq. (1), we get $W' = [w'_1\ w'_2\ \cdots\ w'_i\ \cdots\ w'_m]$. The elements of W' are the importance degrees of the parameter values.

$$w'_i = w_i \Big/ \sum_{k=1}^{m} w_k \qquad (1)$$
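The AHP steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ahp_importance`, the use of power iteration to approximate the principal eigenvector, the hypothetical 3x3 comparison matrix, and the RI table (the values commonly tabulated for AHP; the paper takes RI from [19]) are all assumptions.

```python
# Random index (RI) per matrix order, as commonly tabulated for AHP.
# Assumed values: the paper obtains RI from its reference [19].
RANDOM_INDEX = {2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_importance(A, iterations=100):
    """Return (importance degrees W', consistency ratio CR) for matrix A."""
    m = len(A)
    if m not in RANDOM_INDEX:
        raise ValueError("this sketch supports matrices of order 2 to 9")

    def times(w):
        # Matrix-vector product A * w.
        return [sum(A[i][k] * w[k] for k in range(m)) for i in range(m)]

    # Power iteration; dividing by the sum at every step is the
    # normalization of Eq. (1), so w ends up being W'.
    w = [1.0 / m] * m
    for _ in range(iterations):
        v = times(w)
        total = sum(v)
        w = [x / total for x in v]

    # Estimate lambda_max from A * w = lambda_max * w.
    Aw = times(w)
    lam_max = sum(Aw[i] / w[i] for i in range(m)) / m

    ci = (lam_max - m) / (m - 1)      # consistency index
    ri = RANDOM_INDEX[m]              # random index
    cr = ci / ri if ri > 0 else 0.0   # consistency ratio
    return w, cr


# Hypothetical pairwise comparison matrix for a three-valued parameter
# (reciprocal, with ones on the leading diagonal).
A = [[1, 1 / 3, 1 / 7],
     [3, 1, 1 / 3],
     [7, 3, 1]]
weights, cr = ahp_importance(A)
print([round(x, 3) for x in weights], round(cr, 3))
```

For this hypothetical matrix the importance degrees come out close to 0.088, 0.243 and 0.669, with CR well below 0.1, so the matrix would be accepted as consistent. These numbers resemble some entries of Table 1, but the paper does not publish its comparison matrices, so the match should not be read as a reconstruction of the authors' inputs.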
2.2. Determination of Parameters Related to Working Conditions of APPS
Inspired by [20][21], we look for the parameters from three aspects: (1) technical manuals; (2) empirical knowledge; (3) related standards. Different values of some parameters may have different influences on the performance of the APPS. In the rest of the paper, we call each of these parameters an influence factor (IF). The resulting set of IFs is listed in Table 1, together with their value ranges. If an IF has too many discrete points or is continuous, it is difficult or even impossible for test suites to cover all its values. In this situation, equivalence partitioning (EP) and boundary value analysis (BVA) should be conducted [22]. The value ranges of the IFs after conducting EP and BVA are also listed in Table 1. Since the virtual scenarios are constructed in PreScan, we select cars/motors models in PreScan as the boundary vehicles of parking slots. In PreScan, Lexus_LS_600h (LL), Fiat_Bravo (FB) and Honda_Pan_European (HPE) are the cars/motors with the maximum, middle and minimum size respectively. Values in square brackets are importance degrees, calculated using the AHP introduced in Section 2.1.

Table 1. Selected IFs and their importance degrees.

| No. | IF | Value's range | Value's range after conducting EP and BVA |
|---|---|---|---|
| 1 | Front vehicle (FV) | All cars/motors that can be driven in downtown | LL [0.091]; FB [0.091]; HPE [0.818] |
| 2 | Rear vehicle (RV) | All cars/motors that can be driven in downtown | LL [0.091]; FB [0.091]; HPE [0.818] |
| 3 | Minimum lateral distance between FV and curb (LDFC, unit: m) | [0.2, 0.4] | 0.2[0.088]; 0.3[0.243]; 0.4[0.669] |
| 4 | Minimum lateral distance between RV and curb (LDRC, unit: m) | [0.2, 0.4] | 0.2[0.088]; 0.3[0.243]; 0.4[0.669] |
| 5 | Direction of FV (DF) | Opposite to HV (OP); The same as HV (SA) | OP [0.5]; SA [0.5] |
| 6 | Direction of RV (DR) | OP; SA | OP [0.5]; SA [0.5] |
| 7 | Orientation of FV ($\alpha_f$, unit: deg) | [-4, 4] | -4[0.230]; -3[0.130]; -2[0.076]; -1[0.042]; 0[0.042]; 1[0.042]; 2[0.076]; 3[0.130]; 4[0.230] |
| 8 | Orientation of RV ($\alpha_r$, unit: deg) | [-4, 4] | -4[0.230]; -3[0.130]; -2[0.076]; -1[0.042]; 0[0.042]; 1[0.042]; 2[0.076]; 3[0.130]; 4[0.230] |
| 9 | Orientation of HV ($\beta$, unit: deg) | [-2, 2] | -2[0.535]; -1[0.093]; 0[0.040]; 1[0.093]; 2[0.238] |
| 10 | Direction of parking slot relative to HV (DS) | Left; Right | Left [0.5]; Right [0.5] |
| 11 | Slot length (SL, it represents multiple of HV length) | [1.5, 1.7] | 1.5[0.770]; 1.6[0.162]; 1.7[0.068] |
| 12 | Lateral distance between the parking slot and mid-point of HV’s rear wheel axis (LDPH, unit: m) | [1.615, 1.915] | 1.615[0.070]; 1.765[0.178]; 1.915[0.751] |
Figure 2 illustrates what some of the IFs in Table 1 represent. Taking the moving direction of the HV as the positive direction, FV is the car/motor in front of the parking slot and RV is the car/motor behind the parking slot. LOD represents the longitudinal distance from the HV to the parking slot. LOD is not selected as an IF, but it is an important factor in defining parking conditions; its value is a constant 2 m.

[Figure 2: parking scenario with HV, FV, RV, SL, LOD, LDPH, LDFC, LDRC and the moving direction labeled.]
Figure 2. Diagram of parking conditions.
3. Generation of ICT Test Suite
This section first introduces the ICT algorithm, and then compares and analyzes the distribution characteristics of the values in the CT and ICT test suites.

3.1. ICT Algorithm
Inspired by tools for generating CT test suites such as AETG and PICT [13][15], when generating N-wise combinatorial test cases, it is first necessary to find the set of N-wise IF combinations, denoted N_CIF. Each IF combination comprises a number of value combinations. Sometimes there are constraints among IFs; the value combinations that do not satisfy these constraints should be removed from N_CIF. As shown in Figure 3, the ICT algorithm first selects values from the value set of each IF successively, with the objective of covering more uncovered value combinations. IFS represents the IF set of the SUT. $IF_i = \{V_i(1), V_i(2), \ldots, V_i(j), \ldots, V_i(m)\}$ represents the i-th IF in IFS, and $V_i(j)$ represents the j-th value in $IF_i$. If no uncovered value combination can be covered whichever value in $IF_i$ is selected, the ICT algorithm selects a value according to importance degree to form the test case. The probability of selecting $V_i(j)$ is $w_{ij}$, the importance degree of $V_i(j)$.

[Figure 3: a test case is assembled from one value of each of IF1, IF2, ..., IFn, taking covering more uncovered value combinations as the objective; otherwise $V_i(j)$ is selected with probability $w_{ij}$.]
Figure 3. Schematic diagram of ICT algorithm.
Assume that p test cases have been generated by the ICT algorithm, that UC represents the set of uncovered value combinations, and that CC represents the set of covered value combinations. The process of generating the (p+1)-th test case is shown in Figure 4. In this process, once the ICT algorithm has selected a value in $IF_i$, it no longer selects any other value in it, and $IF_i$ is stored in CIF. The CIF should be cleared before generating each new test case. To ensure the consistency of test suites, the random selection mentioned in Figure 4 is pseudorandom. The steps shown in Figure 4 are:
(1) Compute UC = N_CIF - CC.
(2) In UC, randomly mark one of the IF combinations with the greatest number of value combinations and select the first value combination in it. Update UC = N_CIF - CC.
(3) If CIF = IFS, end. Otherwise, select a value or value combination that can cover more value combinations than the others by interacting with the selected values, and update UC = N_CIF - CC. Repeat until no value combination in UC can be covered or CIF = IFS.
(4) Select an IF value according to importance degrees; the probability of selecting each value equals its importance degree. Repeat until CIF = IFS.
Figure 4. Flow chart for generating (p+1)-th test case.
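The generation process described above can be illustrated with a simplified 2-wise sketch. It is not the authors' implementation: the function name `ict_pairwise`, the data layout, the tie-breaking rules, and the choice to fill one IF at a time by importance degree (and then retry the greedy step) are assumptions made for illustration, and constraints among IFs are ignored.

```python
import itertools
import random


def ict_pairwise(values, importance, seed=0):
    """Greedy 2-wise suite; free choices are drawn by importance degree.

    values[f] lists the values of IF f; importance[f] lists their weights.
    """
    rng = random.Random(seed)  # fixed seed: pseudorandom, reproducible
    n = len(values)
    # UC: uncovered value pairs for every pair of IFs (i < j).
    uncovered = {(i, j): set(itertools.product(values[i], values[j]))
                 for i, j in itertools.combinations(range(n), 2)}

    def gain(case, f, v):
        # Uncovered pairs that value v of IF f forms with chosen values.
        total = 0
        for g, u in case.items():
            key, pair = ((f, g), (v, u)) if f < g else ((g, f), (u, v))
            total += pair in uncovered[key]
        return total

    suite = []
    while any(uncovered.values()):
        # Seed the case from an IF pair with the most uncovered pairs.
        most = max(len(s) for s in uncovered.values())
        key = rng.choice(sorted(k for k, s in uncovered.items()
                                if len(s) == most))
        i, j = key
        first = min(uncovered[key],
                    key=lambda p: (values[i].index(p[0]),
                                   values[j].index(p[1])))
        case = {i: first[0], j: first[1]}
        while len(case) < n:
            candidates = [(gain(case, f, v), f, v)
                          for f in range(n) if f not in case
                          for v in values[f]]
            best = max(c[0] for c in candidates)
            if best > 0:
                # Coverage-driven choice (first candidate with best gain).
                _, f, v = next(c for c in candidates if c[0] == best)
            else:
                # No value adds coverage: draw by importance degree.
                f = min(g for g in range(n) if g not in case)
                v = rng.choices(values[f], weights=importance[f])[0]
            case[f] = v
        for a, b in itertools.combinations(range(n), 2):
            uncovered[(a, b)].discard((case[a], case[b]))
        suite.append([case[f] for f in range(n)])
    return suite


# Three IFs with values and importance degrees taken from Table 1.
values = [[0.2, 0.3, 0.4], ["LL", "FB", "HPE"], ["OP", "SA"]]
importance = [[0.088, 0.243, 0.669], [0.091, 0.091, 0.818], [0.5, 0.5]]
suite = ict_pairwise(values, importance, seed=1)
print(len(suite), "test cases")
```

Because every generated case covers at least the pair it was seeded with, the loop terminates with full 2-wise coverage. The importance-weighted draw only happens when coverage is indifferent to the choice, which is the mechanism the section describes; a production generator would add constraint handling and a more careful greedy search.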
3.2. Value Distribution in ICT Test Suite
Kuhn et al. analyzed test results of the Mozilla web browser and found that 2-wise CT is more efficient than other N-wise (N > 2) CT [9]. To obtain a higher probability of detecting unqualified parking performance, we generate a 2-wise test suite by ICT using the values after EP and BVA in Table 1. In addition, we generate a test suite by CT, which does not select values according to importance degree: for CT, all values of the same IF are selected with equal probability. The value distributions of some of the IFs in these two test suites are shown in Figure 5. Values in square brackets are importance degrees. Different colors represent the number of times the corresponding values appear in the test suites.

[Figure 5: color maps, on a scale from Min to Max, of how often each value of FV, RV, LDFC (m), LDRC (m), β (deg), LDPH (m) and SL appears in the CT and ICT test suites.]
Figure 5. The value distributions of partial IFs in test suites generated by CT, and ICT.
In general, unlike the value distributions in the CT test suite, the value distributions in the ICT test suite are clearly oriented: the values with higher importance degrees appear more frequently. However, the value distributions of $\alpha_f$ and $\alpha_r$ in the CT and ICT test suites are the same; they are not oriented. This can be explained as follows. Assuming that $T_t$ test cases are needed to cover all value combinations, usually only $T_i$ ($T_i \le T_t$) of them are needed to cover the value combinations related to $IF_i$. In that case, when generating the other $T_t - T_i$ test cases, the selection of values in $IF_i$ depends on importance degree. A larger $T_t - T_i$ means a greater impact of importance degree on the distribution of the values of $IF_i$ in the ICT test suite. However, if the number of values in $IF_i$ is larger than the number of values in the other IFs, $T_i$ may equal $T_t$. In this case, importance degree has no impact on the distribution of the values of $IF_i$ in ICT test suites. In this paper, 81 test cases are generated to cover all value combinations, and 81 test cases are needed to cover the value combinations related to $\alpha_f$ and $\alpha_r$. Therefore, importance degree has no impact on the value distributions of $\alpha_f$ and $\alpha_r$. Moreover, since the importance degrees of the values of DF are the same, its value distributions in the CT and ICT test suites are the same. Similarly, the value distributions of DR and DS in the CT and ICT test suites are also the same.
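The figure of 81 can be cross-checked from the value counts in Table 1 with a short calculation. The sketch below uses only counting arguments: the bound for each IF (its number of values times the largest number of values among the other IFs) is a lower bound on $T_i$, not the exact value produced by the authors' generator, and the dictionary keys are illustrative names.

```python
from itertools import combinations
from math import prod

# Number of values per IF after EP and BVA, in the order of Table 1.
sizes = {"FV": 3, "RV": 3, "LDFC": 3, "LDRC": 3, "DF": 2, "DR": 2,
         "alpha_f": 9, "alpha_r": 9, "beta": 5, "DS": 2, "SL": 3, "LDPH": 3}

# Exhaustive enumeration would need one case per full value combination.
exhaustive = prod(sizes.values())

# Number of 2-wise value combinations a pairwise suite has to cover.
pairs_to_cover = sum(sizes[a] * sizes[b] for a, b in combinations(sizes, 2))

# Any 2-wise suite needs at least as many cases as the largest IF pair.
suite_lower_bound = max(sizes[a] * sizes[b]
                        for a, b in combinations(sizes, 2))


def cases_needed_lower_bound(name):
    """Lower bound on the cases needed to cover all pairs involving one IF."""
    largest_other = max(n for other, n in sizes.items() if other != name)
    return sizes[name] * largest_other


print("exhaustive cases:", exhaustive)
print("2-wise value combinations:", pairs_to_cover)
print("lower bound on suite size:", suite_lower_bound)
for name in sizes:
    bound = cases_needed_lower_bound(name)
    print(f"{name}: at least {bound} cases, "
          f"at most {suite_lower_bound - bound} free choices")
```

Under these counts, the two nine-valued orientation IFs already require 9 x 9 = 81 cases, which equals the suite size reported in the paper and leaves no case in which their values could be chosen by importance degree. For a three-valued IF such as LDFC the bound is 27, so a substantial share of the 81 cases may remain available for importance-driven selection.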
4. Experiment Results
To validate that ICT achieves higher test efficiency, we analyze the test results of RT, CT and ICT in this section. When a test case detects that the HV fails to find the parking slot, collides with other objects, or fails to park into the parking slot, we regard an unqualified parking performance as detected. The test results are shown in Figure 6.

[Figure 6: bar charts for RT, CT and ICT. (a) Quantity of detected unqualified parking performance using 81 test cases. (b) Quantity of test cases needed to detect 50 unqualified parking performances.]
Figure 6. Test results of RT, CT and ICT.
First, considering that there are 81 test cases in the CT and ICT test suites, we also generate an RT test suite of 81 test cases. Figure 6(a) shows that, using the same number of test cases, the quantity of unqualified parking performance detected by CT is 228.57% higher than that detected by RT, and the quantity detected by ICT is 117.39% higher than that detected by CT. Furthermore, to make RT and CT also detect 50 unqualified parking performances, we must add test cases to the RT and CT test suites. Figure 6(b) shows that RT and CT need 407 and 166 test cases respectively to detect 50 unqualified parking performances. Compared with RT, the quantity of test cases needed by CT is decreased by 59.21%; compared with CT, the quantity needed by ICT is decreased by 51.20%. To sum up, the test efficiency of CT is higher than that of RT, and the ICT further improves the test efficiency significantly on the basis of CT.
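As a quick check of the reported reductions, the percentages can be recomputed from the test case counts. The sketch assumes that ICT reaches 50 detections with the 81 cases of its suite, which the text implies but does not state as a separate number.

\[
\frac{407 - 166}{407} = \frac{241}{407} \approx 59.21\%, \qquad
\frac{166 - 81}{166} = \frac{85}{166} \approx 51.20\%
\]

Both values agree with the reductions quoted for CT relative to RT and for ICT relative to CT.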
5. Conclusion
In this paper, to further improve test efficiency while ensuring the comprehensiveness of the test, we develop the ICT algorithm, based on CT, to generate test cases. The ICT intervenes in the process of generating test cases by introducing the importance degree. For different values of the same parameter, the value with the higher importance degree tends to appear more frequently in the ICT test suite. The test results show that, with the same number of test cases, the quantity of unqualified parking performance detected by ICT is clearly higher than that detected by CT and RT. In addition, ICT needs clearly fewer test cases than CT and RT to detect the same number of unqualified parking performances. We therefore conclude that ICT is more efficient than the other two methods. In the future, ICT can be applied to other ADAS such as Autonomous Emergency Braking (AEB) and Adaptive Cruise Control (ACC).
References
[1] A.V. Tumasov, A.S. Vashurin, Y.P. Trusov, E.I. Toropov, P.S. Moshkov, V.S. Kryaskov, A.S. Vasilyev, The Application of Hardware-in-the-Loop (HIL) Simulation for Evaluation of Active Safety of Vehicles Equipped with Electronic Stability Control (ESC) Systems, in 13th International Symposium on Intelligent Systems (2018), 309-315.
[2] M.A. Hannan, M.S.H. Lipu, A. Hussain, A. Mohamed, A review of lithium-ion battery state of charge estimation and management system in electric vehicle applications: Challenges and recommendations, Renewable and Sustainable Energy Reviews 78 (2017), 834-854.
[3] B. Li and Z. Shao, A unified motion planning method for parking an autonomous vehicle in the presence of irregularly placed obstacles, Knowledge-Based Systems 86 (2015), 11-20.
[4] B. Li, Y. Zhang, and Z. Shao, Spatio-temporal decomposition: a knowledge-based initialization strategy for parallel parking motion optimization, Knowledge-Based Systems 107 (2016), 179-196.
[5] K. Demirli and M. Khoshnejad, Autonomous parallel parking of a car-like mobile robot by a neuro-fuzzy sensor-based controller, Fuzzy sets and systems 160 (2009), 2876-2891.
[6] D. Xu, S. Yan, and Z. Ji, Model-Free Adaptive Discrete-Time Integral Sliding-Mode-Constrained-Control for Autonomous 4WMV Parking Systems, IEEE transactions on industrial electronics 65 (2018), 834-843.
[7] X. Du and K. K. Tan, Autonomous Reverse Parking System Based on Robust Path Generation and Improved Sliding Mode Control, IEEE transactions on intelligent transportation systems 16 (2015), 1225-1237.
[8] M. Oetiker, G. Baker, and L. Guzzella, A Navigation-Field-Based Semi-Autonomous Nonholonomic Vehicle-Parking Assistant, IEEE transactions on vehicular technology 58 (2009), 1106-1118.
[9] R. Kuhn and M. J. Reilly, An investigation of the applicability of design of experiments to software testing, in IEEE Software Engineering Workshop - proceedings (2002), 91-95.
[10] R. Kuhn, Y. Lei, and R. Kacker, Practical Combinatorial Testing: Beyond Pairwise, IT Professional 10 (2008), 19-23.
[11] R. Kuhn, D. R. Wallace, and A. M. Gallo, Software fault interactions and implications for software testing, IEEE Transactions on Software Engineering 30 (2004), 418-421.
[12] C. Nie and H. Leung, A survey of combinatorial testing, ACM computing surveys 43 (2011), 1-29.
[13] J. Czerwonka, Pairwise Testing in Real World, in Proceedings of the 24th Quality Conference (2006), 419-430.
[14] D. E. Simos, R. Kuhn, A. G. Voyiatzis, and R. Kacker, Combinatorial Methods in Security Testing, Computer 49 (2016), 80-83.
[15] D. M. Cohen, S. R. Dalal, M. L. Fredman, and G. C. Patton, The AETG system: an approach to testing based on combinatorial design, IEEE Transactions on Software Engineering 23 (1997), 437-444.
[16] G. Rowe and G. Wright, The Delphi technique as a forecasting tool: issues and analysis, International Journal of Forecasting 15 (1999), 353-375.
[17] T. L. Saaty, Axiomatic foundation of the analytic hierarchy process, Management Science 32 (1986), 841-855.
[18] Y. Wind and T. L. Saaty, Marketing Applications of the Analytic Hierarchy Process, Management Science 26 (1980), 641-658.
[19] A. Ishizaka and A. Labib, Review of the main developments in the analytic hierarchy process, Expert Systems with Applications 38 (2011), 14336-14345.
[20] D. Spinellis, State-of-the-Art Software Testing, IEEE Software 34 (2017), 4-6.
[21] C. Berger, D. Block, S. Heeren, C. Hons, S. Kuhnel, A. Leschke, D. Plotnikov, and B. Rumpe, Simulations on Consumer Tests: A Systematic Evaluation Approach in an Industrial Case Study, IEEE Intelligent Transportation Systems Magazine 7 (2015), 24-36.
[22] S.S. Barhate, Effective test strategy for testing automotive software, in International Conference on Industrial Instrumentation and Control (2015), 645-649.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200041
Study on the Determination of Spatio-Temporal Threshold of Bicycle Connection Rail Transit in Haze Under Traffic Restriction
Na ZHANG a,1 and Jianpo WANG a
a Highway College, Chang'an University, middle section of South 2nd Ring Road, Xi'an, Shaanxi 710064, China
Abstract. Urban rail transit accounts for an increasing proportion of the travel structure and has become the backbone of the city. To make full use of efficient and convenient rail transit, combining other modes of transportation such as buses, cars and bicycles with it into an effective multimodal transport system has become a challenge for traffic planners, managers and travelers. The rapid development of urban shared bicycles now provides favorable conditions for commuters to use bicycles to connect to rail transit. By studying the significant passenger flow that uses bicycles to connect to rail transit at Weiqu South Station, the terminal of Xi'an Metro Line 2, this paper establishes a distance attenuation model and a disaggregate price sensitivity measurement method under the influence of winter traffic restriction measures. RP and SP traffic survey data are used to analyze the impact of the restriction measures on the spatio-temporal threshold of bicycle-connected rail transit, providing a reference for planners formulating plans for bicycle-connected rail transit.
Keywords. Urban rail transit; Bicycle feeder; Distance attenuation model; Spatio-temporal threshold
1. Introduction
With the development of rail transit and the concept of green travel, the "bicycle + rail transit" mode has gradually gained public acceptance. This mode not only solves the "first kilometer" and "last kilometer" problems of residents' travel, but also enlarges the attraction scope of rail transit and attracts more passengers to it. Many cities have chosen to build bike-sharing systems as a user-friendly way to promote connectivity with other modes, reduce car use and encourage healthy lifestyles [1]. Cycling to mass transit can reduce commuting time, make commuting more enjoyable, and integrate healthy activities into daily life [2]. From the center of the city to the periphery, station density decreases and the share of bicycle connections gradually increases. Passengers transferring to rail transit are mainly commuters [3]. Commuters
1 Corresponding Author. Na ZHANG, Highway College, Transportation Planning and Management, Chang'an University; PH (86) 17789135658; E-mail: [email protected].
N. Zhang and J. Wang / Study on the Determination of Spatio-Temporal Threshold
have higher requirements for time, congestion and parking during peak hours, and gain more utility from using a bicycle to reach the subway, so they accept a longer access distance. Xi'an is located in northwest China and suffers from haze in winter. To alleviate this environmental problem, Xi'an implements urban traffic restriction measures in winter, which change the traffic structure of the whole city relative to other seasons, especially at suburban metro stations. Therefore, this paper studies the spatio-temporal threshold of bicycle-to-rail transit under the traffic restriction measures in Xi'an. Starting from a suburban metro station, it analyzes the influence of the traffic restriction measures on the determination of the spatio-temporal threshold of bicycle-to-rail transit, and provides a reference for related planning (such as the scale of bicycle parking facilities and the planning of bicycle lanes).
2. Literature Review
Previous studies have shown that the relationship between bicycles and railway stations is weak, and that it can be strengthened through collaborative multimodal transport planning [4]. To give full play to the advantages of bicycles, it is necessary to increase the importance of bicycle access to multimodal transfer hubs, so as to expand the coverage of non-vehicle transfer hubs. The 2 mile (3.219 km) radius around a light rail station is considered a bicycle shed, that is, the distance most people can and would like to ride a bicycle [5]. Cyclists tend to ride longer distances to reach better transit stations [6]. In the Netherlands, cycling is the main mode of transportation, with about 35 percent of home trips used for rail transit [7].
At present, scholars from various countries have studied the spatio-temporal threshold of bicycle-to-rail transit in depth from various perspectives. Yi Fang (2014) used a questionnaire survey to analyze the reasonable attraction scope of metro transit for a single station and for two adjacent stations, and determined the access distance of various modes by using time as the demarcation line. Hai, Rong, Wen, Tao, and Nan (2013) analyzed the spatial distribution characteristics of walking and cycling through SP and RP surveys, and quantified the spatio-temporal thresholds of walking and cycling connections to rail transit by disaggregate price sensitivity analysis, obtaining the reasonable and maximum access distance and access time. Yan and Fei (2014), taking Tokyo, Japan as an example, studied the transfer model of the potential public bicycle connection rate at the egress end of rail transit, and analyzed the impact of access distance on the public bicycle connection rate. In Melbourne, about three quarters of respondents (77 percent) cycled less than 3.5 km to reach their train stations, and no cyclists rode more than 5.5 km. Cyclists traveling to public transport hubs would generally choose to cycle if the distance to be covered was between 2.2 and 5 km [8]. A study from Germany [9] found that bicycle access trips to train service range between 1.8 and 3.9 km, depending on rail type and day of week. These numbers match relatively well with observed mean access distances to train stations in Atlanta and Los Angeles [10], which range between 1.7 km (Atlanta) and 4.5 km (Los Angeles). Rietveld [11] obtained comparable results for the Netherlands, with cycling being the dominant access mode between 1.2 and 3.7 km for train service. Rastogi
and Rao [12] analyzed travel behavior in Mumbai, India, identifying a mean access distance of about 2.7 km to train stations, and found that a few socioeconomic factors (e.g., occupation, housing type) affect access distance. The above literature, which focuses mainly on personal attributes, travel characteristics and the travel environment, lays an important foundation for this paper. However, there is a lack of research on how traffic measures affect the spatio-temporal threshold of bicycle-to-rail transit. From the point of view of traffic restriction measures, this paper analyzes their influence on the determination of the spatio-temporal threshold of bicycle-to-rail transit, and fills this research gap.
3. Data
3.1 Data collection
Metro stations located between the suburbs and the urban area have different characteristics in different periods. In winter, under haze weather, private cars are restricted, more trips gather on public transport, the attraction scope of metro stations expands further, and the proportion and spatial distribution of access modes also change. This paper selects Weiqu South Station, the terminal station of Xi'an Metro Line 2, as the research object. As the connection between the suburbs and the urban area, Weiqu South Station sees the spatio-temporal threshold of its passenger access modes change under private car restriction measures. The data mainly come from a questionnaire survey that combines RP (behavior survey) and SP (intention survey). The main survey subjects are commuters, who account for most of the day's trips. The survey was conducted on December 3, 2017. The survey location is shown in figure 1.
Through site visits, 204 valid questionnaires were collected. The main contents of the survey were:
1) personal attributes, including age, gender, occupation and familiarity with the access path;
2) travel behavior, including access path, origin and destination, and acceptable connection distance and time.
The following five questions were asked under restricted and unrestricted conditions respectively:
1) At what transfer distance do you begin to feel that "this access distance is close"?
2) At what transfer distance do you begin to feel that "this access distance is far"?
3) At what transfer distance do you begin to feel that "this access distance is too far, and I will consider another access mode"?
4) What access time do you think is appropriate?
5) What access time do you think is the maximum?
Figure 1. Survey location
3.2 Data processing
A preliminary analysis of the survey data shows that the respondents are commuters aged 18-60. The 26-45 age group accounts for the largest proportion, which is consistent with the current age structure of Xi'an's traveling population and shows that this group is the main component of commuters. The age distribution is shown in figure 2.
Figure 2. Age and sex ratio distribution
In this survey, 59 percent of the respondents were male and 41 percent were female. On the one hand, male commuters accounted for the majority; on the other hand, men were more likely to be surveyed. Through statistical analysis and fitting of the survey data, the distributions of travelers' acceptance of access distance and access time are obtained, as shown in figure 3 and figure 4. The acceptance curve of access distance rises faster without private car restriction measures.
At the same distance (within 3 km), acceptance of the access distance is higher without private car restriction measures than with them. At the 85th percentile, the acceptable access distance is about 2.5 km without private car restriction measures and about 4 km with them. The access time acceptance curve also rises faster under the unrestricted condition than under the restricted condition, and the unrestricted curve reaches 1.0 first. At the 85th percentile, the acceptable time is about 15 minutes under the unrestricted condition and 25 minutes under the restricted condition. At a cycling speed of 10 km/h, this corresponds to an acceptable access distance of about 2.5 km under the unrestricted condition and about 4 km under the restricted condition, which is consistent with the acceptable access distances in figure 3.
Figure 3. Distribution of accessing distance acceptance
Figure 4. Distribution of accessing time acceptance
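The 85th-percentile reading used above can be illustrated with a short sketch. It assumes the acceptance curve is the empirical cumulative distribution of stated acceptable distances and uses the nearest-rank percentile definition; the function name and the sample responses are hypothetical and are not the survey data.

```python
import math


def percentile_nearest_rank(values, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(q / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical stated acceptable access distances in km (NOT the survey data).
responses_km = [1.0, 1.5, 1.5, 2.0, 2.0, 2.0, 2.5, 2.5, 3.0, 4.0]

# Smallest distance at which at least 85% of respondents are covered.
threshold_km = percentile_nearest_rank(responses_km, 85)
print(threshold_km)
```

Other percentile definitions (for example, linear interpolation between ranks) would give slightly different thresholds on small samples.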
4. Model Specification
4.1 Distance attenuation model
Analyzing how travel volume decreases as the access time or distance between bicycle and rail transit stations increases is one of the important means of determining and predicting the influence scope of the "bicycle + rail transit" mode [13]. A distance attenuation curve can be used to determine the catchment scope of metro stations. Hochmair [14] presents distance decay functions for bicycle access trips to transit services in Los Angeles, Atlanta, and the Twin Cities. Therefore, it is reasonable to use the distance attenuation model to analyze the spatial relationship between bicycle connection passenger flow and distance under restricted and unrestricted conditions. Based on the survey data, the distribution of travelers' preferred access distance to metro transit is obtained. From travelers' stated willingness regarding access distance, a distance attenuation model is established with distance as the single independent variable [15]. This paper analyzes the spatial relationship between bicycle passengers and metro access distance at Weiqu South Station, the terminal of Xi'an Metro Line 2, under restricted and unrestricted traffic conditions, so as to provide a reference for the planning of bicycle facilities around the station. Distance attenuation models take three mathematical forms: general, logarithmic and exponential. The exponential form is given in Eq. (1):
$$T = \alpha \times e^{\beta \times d} \tag{1}$$
where $T$ is the passenger flow distribution at distance $d$, and $\alpha$ and $\beta$ are coefficients. $\alpha$ is a constant and $\beta$ is the distance function coefficient, namely the distance attenuation index, which reflects the influence of distance on the attenuation speed of the access passenger flow. Most studies hold that its value range should be determined by fitting actual data. According to existing research, the value has two meanings: 1) the attenuation of passenger flow as distance increases: the higher $\beta$ is, the lower the sensitivity of passenger flow to distance and the lower its attenuation rate; 2) the degree of applicability of the mode of transportation: the higher $\beta$ is, the longer the distance over which the mode is applicable.
4.2 Disaggregate price sensitivity analysis
The disaggregate price sensitivity method (KLP, Kishi's Logit Price Sensitivity Meter) is developed from the price sensitivity analysis method. Under restricted and unrestricted travel conditions, it is difficult to quantify the acceptable distance and time of bicycle connection. KLP has been used to study the reasonable location of escalators in rail transit stations, the reasonable in-station transfer distance, and the evaluation of transfer distance to the station [16]. This paper considers the model applicable to quantitatively analyzing the connection distance and time range that travelers can bear when reaching rail transit on foot or by bicycle. Therefore, this paper uses the model to calculate the spatio-temporal threshold of bicycle connection to rail transit at Weiqu South Station of Xi'an Metro Line 2 under restricted and unrestricted traffic conditions. The schematic diagram of the model is shown in figure 5. Considering that no traveler abandons the
connection because the distance is too short, the intersection points of the curves are taken as the distance threshold points. The calculation formulas are Eq. (2) and Eq. (3):
$$P_i = \frac{1}{1 + e^{F_i(x)}} \tag{2}$$
$$F_i(x) = a_i x + b_i \tag{3}$$
where $P_i$ is the relevant cumulative frequency, $i = 1, \cdots, 3$; $F_i(x)$ is the distance (time) function; $x$ is the distance (time); and $a_i$, $b_i$ are parameters. $P_1$, $F_1$ correspond to "the connection distance (time) is close"; $P_2$, $F_2$ to "the connection distance (time) is far"; and $P_3$, $F_3$ to "the connection distance (time) is too far, choose other ways". As shown in figure 5, at $x_s$ the traveler feels that the connection distance (time) is neither far nor close, so $x_s$ is used as the planned reference connection distance (time). $x_{max}$ is called the upper limit connection distance (time): if the connection distance (time) exceeds this limit, the traveler feels it is too far, the psychological burden is large, and the traveler may change the way of travel. $x_s$ and $x_{max}$ divide the entire connection distance (time) range into three areas: within 0-$x_s$, the traveler thinks the connection distance (time) is close and the connection is not a burden; within $x_s$-$x_{max}$, the traveler thinks the connection distance (time) is not far, but not close; beyond $x_{max}$, the traveler thinks it is too far and uses other means to connect.
Figure 5. KLP schematic diagram of passenger flow in bicycle feeder rail station
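Because the logistic transform is strictly monotone, the crossing point of two fitted curves follows directly from their exponents. The derivation below is a sketch under an assumed reading of Eqs. (2) and (3), namely cumulative-frequency curves $P_i = 1/(1 + e^{F_i(x)})$ with linear exponents $F_i(x) = a_i x + b_i$; the symbols $P_i$, $a_i$, $b_i$ and $x^{*}_{ij}$ are notation chosen here for illustration.

```latex
\begin{align*}
P_i(x) = P_j(x)
  &\iff \frac{1}{1 + e^{F_i(x)}} = \frac{1}{1 + e^{F_j(x)}} \\
  &\iff F_i(x) = F_j(x) \\
  &\iff a_i x + b_i = a_j x + b_j \\
  &\iff x^{*}_{ij} = \frac{b_j - b_i}{a_i - a_j}, \qquad a_i \neq a_j .
\end{align*}
```

Under this reading, each threshold point in figure 5 depends only on the four fitted coefficients of the two curves that cross there.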
5. Result
5.1 Spatial relationship
Using Python's pandas, SciPy and scikit-learn libraries, the distance attenuation model is fitted and calibrated on the survey data, and the distance attenuation curves of bicycle-to-rail transit are obtained under restricted and unrestricted conditions. As shown in figure 6, the model fits the data well, and the model parameters are given in Table 1.
Table 1 Distance attenuation model for bicycle passenger flow (model function: $T = \alpha \times e^{\beta \times d}$)
traffic measures    α       β        R²
unrestricted        1.08    -0.54    0.99
restricted          1.12    -0.88    0.99
Figure 6. The distance attenuation curve of bicycle feeder rail transit
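The calibration step can be sketched in a few lines. This is not the authors' pipeline: instead of the pandas/SciPy/scikit-learn workflow mentioned above, it assumes a plain log-linear least-squares fit of $T = \alpha e^{\beta d}$ using only the standard library, and it runs on synthetic points generated from one of the parameter pairs in Table 1 rather than on the survey data. The function name is hypothetical.

```python
import math


def fit_exponential_decay(distances, flows):
    """Fit T = alpha * exp(beta * d) by ordinary least squares on ln(T)."""
    n = len(distances)
    logs = [math.log(t) for t in flows]
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    beta = sxy / sxx                           # slope of ln(T) against d
    alpha = math.exp(mean_l - beta * mean_d)   # exp of the intercept
    # Coefficient of determination on the original (untransformed) scale.
    preds = [alpha * math.exp(beta * d) for d in distances]
    mean_t = sum(flows) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(flows, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in flows)
    return alpha, beta, 1 - ss_res / ss_tot


# Synthetic, noise-free points from alpha = 1.08, beta = -0.54 (NOT survey data).
distances_km = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
flows = [1.08 * math.exp(-0.54 * d) for d in distances_km]

alpha, beta, r2 = fit_exponential_decay(distances_km, flows)
print(round(alpha, 2), round(beta, 2), round(r2, 2))
```

A log-linear fit weights relative rather than absolute errors, so on noisy data it can give somewhat different coefficients from a direct nonlinear fit of the same model.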
In both cases, the diagram directly reflects how sensitive travelers who use bicycles to connect to rail transit are to the connection distance: passenger flow gradually decreases as distance increases. In the model, the β value under the unrestricted condition (-0.88) is less than the β value under the restricted condition (-0.54), and as distance increases, the attenuation of the connecting passenger flow under the restricted condition is smaller than that under the unrestricted condition, which is consistent with the survey results. Compared with the unrestricted condition, the connection passenger flow under the restricted condition is less sensitive to distance, its attenuation is slower, and the attraction range of "bicycle + rail transit" around rail transit stations is larger.
5.2 Spatio-temporal threshold of connection
Using the survey data, the disaggregate price sensitivity model is calibrated under the unrestricted condition. The cumulative frequency curves of connection distance are shown in figure 7, and the results are shown in Table 2.
Table 2 Fitting function of model
fitting function                    correlation coefficient
$F_1 = -3.18649x + 3.28264$         0.97
$F_2 = 2.13084x - 4.30594$          0.97
$F_3 = -1.05234x + 3.15797$         0.98
where $F_1$ corresponds to "the connection is close", $F_2$ to "the connection is far", and $F_3$ to "the connection is too far, choose other ways to connect".
Figure 7. The cumulative frequency curve of the bicycle connection distance under the unrestricted condition
The disaggregate price sensitivity model is then calibrated under the restricted condition. The cumulative frequency curves of connection distance are shown in figure 8, and the results are shown in Table 3.
Table 3 Fitting function of model
fitting function                    correlation coefficient
$F_1 = -1.81630x + 2.47164$         0.96
$F_2 = 1.09548x - 3.63363$          0.97
$F_3 = -1.33073x + 6.01842$         0.97
where $F_1$ corresponds to "the connection is close", $F_2$ to "the connection is far", and $F_3$ to "the connection is too far, choose other ways to connect".
Figure 8. The cumulative frequency curve of the bicycle connection distance under the restricted condition
Table 4 Threshold of distance for bicycle feeder
traffic measures    $x_s$ (km)    $x_{max}$ (km)
unrestricted        1.4182        2.3374
restricted          2.0945        3.9783
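The link between the fitted functions in Tables 2 and 3 and the thresholds in Table 4 can be checked with a short sketch. It assumes, as an interpretation rather than a statement of the authors' procedure, that the reference threshold is the crossing of the first and second curves and the upper limit is the crossing of the second and third; because the logit transform is monotone, the curves cross where their linear exponents are equal. Function and variable names are hypothetical.

```python
def intersection(f_i, f_j):
    """Distance at which two logit curves with linear exponents a*x + b cross."""
    (a_i, b_i), (a_j, b_j) = f_i, f_j
    if a_i == a_j:
        raise ValueError("parallel exponents never intersect")
    return (b_j - b_i) / (a_i - a_j)


# (slope, intercept) of F_1, F_2, F_3, copied from Tables 2 and 3.
FITS = {
    "unrestricted": [(-3.18649, 3.28264), (2.13084, -4.30594), (-1.05234, 3.15797)],
    "restricted": [(-1.81630, 2.47164), (1.09548, -3.63363), (-1.33073, 6.01842)],
}


def thresholds(fits):
    """Return (reference distance, upper-limit distance) in km."""
    f1, f2, f3 = fits
    return intersection(f1, f2), intersection(f2, f3)


for name, fits in FITS.items():
    x_s, x_max = thresholds(fits)
    print(f"{name}: x_s = {x_s:.2f} km, x_max = {x_max:.2f} km")
```

Under these assumptions the crossings land within about 0.01 km of the values reported in Table 4, which suggests the reading is at least close to the calculation behind the table.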
Using KLP to analyze the threshold of bicycle connection distance gives the results in Table 4. Under the unrestricted condition, when the bicycle connection distance is 1.4182 km, travelers feel that using a bicycle to connect to rail transit is reasonable, but when the connection distance exceeds 2.3374 km, travelers feel that it is too far and may change their way of travel. Under the restricted condition, when the bicycle connection distance is 2.0945 km, travelers may feel that the connection distance is acceptable because of the car restriction, but when the connection distance exceeds 3.9783 km, even with the restriction in force, travelers still consider the connection distance too far and give up using a bicycle to connect to rail transit. Using the KLP regression fitting model, the distribution curves of travelers' acceptance of bicycle connection time are obtained. Figure 9 shows the acceptance distribution curve of standard connection time, and figure 10 shows the acceptance distribution curve of maximum connection time. The results are shown in Table 5.
Figure 9. Standard acceptance feeder time distribution
Figure 10. Maximum acceptance feeder time distribution
Table 5 Threshold of time for bicycle feeder
traffic measures    standard connection time (min)    maximum connection time (min)
unrestricted        8.8                               14.7
restricted          11.3                              24.8
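The time thresholds in Table 5 can be converted to distances for comparison with Table 4. The sketch below assumes the constant cycling speed of 10 km/h that the paper uses for this conversion; the function and dictionary names are hypothetical.

```python
CYCLING_SPEED_KMH = 10.0  # constant cycling speed assumed for the conversion


def minutes_to_km(minutes, speed_kmh=CYCLING_SPEED_KMH):
    """Distance in km covered in `minutes` at a constant speed in km/h."""
    return speed_kmh * minutes / 60.0


# (standard, maximum) connection times in minutes, copied from Table 5.
TIME_THRESHOLDS_MIN = {
    "unrestricted": (8.8, 14.7),
    "restricted": (11.3, 24.8),
}

# Equivalent (standard, maximum) connection distances in km.
converted = {
    condition: tuple(round(minutes_to_km(t), 4) for t in times)
    for condition, times in TIME_THRESHOLDS_MIN.items()
}

for condition, (standard_km, maximum_km) in converted.items():
    print(f"{condition}: standard {standard_km} km, maximum {maximum_km} km")
```

Placing these converted distances next to the distance thresholds in Table 4 shows how closely the time-based and distance-based thresholds agree for each condition.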
The connection time is calculated at the 85th percentile. Under the unrestricted condition, the standard connection time is 8.8 minutes and the maximum connection time is 14.7 minutes. At a bicycle speed of 10 km/h, the standard connection distance is 1.4667 km and the maximum connection distance is 2.45 km, which is consistent with the distance threshold results. Under the restricted condition, the standard connection time is 11.3 minutes and the maximum connection time is 24.8 minutes, which is also consistent with the distance threshold results. Compared with the unrestricted condition, the standard connection time under the restricted condition is 2.5 minutes longer and the maximum connection time is 10.1 minutes longer. This is due to the restrictions on car use from the suburbs to the downtown area: travelers choose public transport, and the attraction range of bicycle-to-rail transit expands outward. Suburban bus services have long headways, low line density and long distances between stops, so travelers within 4 km of the rail station choose to reach rail transit by bicycle instead of by car, and give up other ways of connecting to rail transit.
6. Conclusion
Under the unrestricted condition, the standard distance of bicycle-to-rail transit is 1.4182 km, the standard time is 8.8 min, the maximum distance is 2.3374 km, and the maximum time is 14.7 min. Under the restricted condition, the standard distance of bicycle-to-rail transit is 2.0945 km, the standard time is 11.3 min, the maximum distance is 3.9783 km, and the maximum time is 24.8 min. Compared with existing research conclusions, the connection distance is larger than 500 m-2 km [18], 2.7-3.2 km [19] and 3 km (Hua, 2006), close to 3.7 km [20], and smaller than 5.5 km [14].
The reasons are analyzed as follows: 1) the station is located at the junction of urban and rural areas, and because the available means of transport are limited, the time and distance travelers accept for interchange are prolonged; 2) because of the traffic restriction in urban areas, the acceptable distance range for travelers who use bicycles to connect to rail transit is relatively large. This paper investigates travelers who use bicycles to connect to rail transit under restricted and unrestricted traffic conditions. A questionnaire survey is conducted around Weiqu South Station of Xi'an Metro Line 2, and the sensitivity of travelers to the distance and time of bicycle connection to rail transit under restricted and unrestricted traffic conditions is analyzed. The conclusions can serve as a reference for Xi'an and other cities in the same situation when formulating infrastructure planning for "bicycle + rail transit" under the different traffic measures implemented in different seasons. Especially for metro stations at the junction of urban and rural areas, the impact of restrictions should be fully considered when planning the scale of bicycle parking facilities. Due to the sample limitation of the questionnaire, this paper mainly analyzes the attraction range of commuters choosing "bicycle + rail transit" under restricted and unrestricted travel, and neglects the location of the rail transit station and the road environment around the station. Further research will take more factors into account and obtain a diversified spatio-temporal threshold of rail transit connection, which will
provide a reference for rational planning, design and determination of the size of bicycle parking lots in the bicycle-to-rail transit mode.
References
[1] A., S. S., Hua, Z., Elliot, M., & Stacey, G. (2018). China's Hangzhou Public Bicycle. Transportation Research Record: Journal of the Transportation Research Board, 2247(1), 33-41.
[2] Bracher, T. (2000). Demand characteristics & co-operation strategies for the bicycle & railway transport chain. World Transport Policy & Practice, 6(3), 18-24.
[3] Chen, W. Z., & Sheng, W. M. (2005). Joining and coordination between urban rail traffic transfer junction and other transportation modes. Research and Design, 4, 46-50. doi:10.13219/j.g jgya t.2005.04.015
[4] D., M. J., & Katharine, H.-Z. (2018). Bicycle network connectivity for passenger rail stations. Transportation Research Record: Journal of the Transportation Research Board, 2448(1), 21-27. doi:10.3141/2448-03
[5] Griffin, G. P., & Texas, I. N. S. (2016). Planning for bike share connectivity to rail transit. Journal of Public Transportation, 19(2), 1-22. doi:10.5038/2375-0901.19.2.1
[6] Hai, Y., Rong, Y. R., Wen, X., Tao, L., & Nan, C. (2013). Critical accessing time and distance for pedestrian and cyclists to urban rail transit. Urban Transport of China, 11(2), 83-90. doi:10.13813/j.cn115141/u.2013.02.015
[7] Hochmair, H. H. (2014). Assessment of Bicycle Service Areas around Transit Stations. International Journal of Sustainable Transportation, 9(1), 15-29. doi:10.1080/15568318.2012.719998
[8] Hua, G. Y. (2006). Guangzhou integration planning of urban rail transport and other transport modes. Journal of Huazhong University of Science and Technology (Urban Science Edition), 23(2), 113-116.
[9] Pucher, J., Dill, J., & Handy, S. (2010). Infrastructure, programs, and policies to increase bicycling: an international review. Prev Med, 50 Suppl 1, S106-125. doi:10.1016/j.ypmed.2009.07.028
[10] Jing, H., Gang, L. Z., & Wang, S. Y. (2010). Transfer convenience analysis based on passengers' physical and psychological impedance. Journal of TongJi university (natural science), 38(1), 92-97. doi:10.3969/j.issn.0253-374x.2010.01.017
[11] Juan, K. L., & Fei, Y. X. (2010). Characteristics of bike-and-ride at urban mass transit station.
[12] Luigi, d. O., Angel, I., & Luis, M. J. (2011). Implementing bike-sharing systems. Proceedings of the Institution of Civil Engineers - Municipal Engineer, 164(2), 89-101. doi:10.1680/muen.2011.164.2.89
[13] Osmonson, B. (2017). An equity analysis of bicycle infrastructure around light rail stations in Seattle, WA (Master). University of Washington. Retrieved from http://hdl.handle.net/1773/40315
[14] Ping, C., & Jun, C. (2008). Bicycle-metro station locating and its demand predicting. Technology&Economy in Areas of Communications, 3(10), 87-89. doi:10.3969/j.issn.1008-5696.2008.03.036
[15] Rastogi, R., & Rao, K. V. K. (2003). Travel characteristics of commuters accessing transit: Case study. Journal of Transportation Engineering, 129, 684-694. doi:10.1061/(ASCE)0733-947X(2003)129:6(684)
[16] Rietveld, P. (2000). The accessibility of railway stations: the role of the bicycle in the Netherlands. Transportation Research Part D, 5, 71-75. doi:10.1016/S1361-9209(99)00019-X
[17] Rong, L. J., & Hui, Y. T. (2011). Strategic design of public bicycle sharing systems with service level constraints. Transportation Research Part E: Logistics and Transportation Review, 47(2), 284-294. doi:10.1016/j.tre.2010.09.004
[18] Rose, G., Weliwitiya, H., Tablet, B., Johnson, M., & Subasinghe, A. (2016). Bicycle access to Melbourne metropolitan rail stations. Australasian Transport Research Forum 2016 Proceedings, 1-10.
[19] Yan, C., & Fei, Y. X. (2014). Transfer model of potential bicycle share on egress journey in urban mass transit. Journal of TongJi university (natural science), 24(12), 1862-1867. doi:10.11908/j.issn.0253374x.2014.12.012
[20] Yong, M. W. (2007). Research on transfer system of bicycle and rail transit in big cities (Master). Southwest Jiaotong University. Retrieved from https://doi.org.10.7666/d.y1131528
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200042
The Influence Assessment of Platform’s Parameters on Service Level in Subway Transfer Stations
Yao WANG a,1, Shunping JIA a, and Jiaqi WU a
a Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing
Abstract: In this paper, per capita area is used to describe service level. An AnyLogic simulation model is built to study the subway design parameters that influence the service level of transfer stations. For the stations in the case study, the results suggest that the platform width should differ between types of transfer station. According to the simulation, the platform width of "One", "Cross" and "T" type transfer stations should be widened by 3.6 m, 1.2 m and 2.4 m respectively relative to the original models. As a result, the service level of the platform at the most crowded moment can be upgraded from B to A, and the minimum per capita area of the platform increases by 34.9%, 15.8% and 21.2%. In addition, the stair width should be widened by 0.4 m, 0.8 m and 0.2 m, so that the service level of the platform reaches its peak and the average per capita area of the platform reaches 1.754 m²/p (square meters per person), 2.000 m²/p and 2.178 m²/p. Finally, if conditions permit, the stairs should be placed at both ends of the platform.
Keywords: Transfer; Level of service; AnyLogic simulation
1. Introduction
In recent years, urban rail transit has developed rapidly in China, yet many transfer stations have low transfer efficiency. This is mainly due to the design of transfer stations, such as unreasonable station design parameters and limited platform space. Scholars at home and abroad have carried out a great deal of research on the service level of metro transfer stations. Gao Fengqiang [1] added a guiding force on pedestrians to the cellular automata evacuation model. Building on Helbing's introduction of random fluctuation into the social force model, Qu Zhaowei [2] gave pedestrians a random fluctuation force by randomly changing their direction and speed. Xu Qi [3] established a distribution model of the subway platform based on self-adaptive agents. W.L. Wang [4] put forward CityFlow, an agent-based pedestrian simulation model. Miho Asano and Takamasa Iryo [5] proposed a micro-pedestrian simulation model, which explained the negotiation process between pedestrians to avoid collision, and assumed that pedestrians predicted the movement of other pedestrians to
Fund Project: The Fundamental Research Funds (2019JBM334).
1 Corresponding Author: Yao WANG (1996-), female, from Nanjing, Jiangsu, master, [email protected]
Y. Wang et al. / The Influence Assessment of Platform’s Parameters on Service Level
avoid collision with them, and also proposed a macroscopic tactical model to determine the macro-path to the destination. Wang Fei [6] set up a theoretical calculation model for the assembling passengers of a one-platform transfer station. There are few quantitative analyses of the design parameters of transfer stations. In this paper, a simulation-based method for assessing service level is proposed. AnyLogic is used to build transfer simulation models for different types of subway transfer stations. By changing the design parameters of the platform, the impact on service level can be analyzed.
2. Evaluation of service level
At present, most research on service level establishes an evaluation system by selecting relevant indicators. Zhang Qi [7] set up an evaluation model of transfer service level based on facility utilization, convenience, patency, safety and coordination. In this paper, crowding degree and safety degree are used to describe the level of service, with a focus on comfort and safety. Per capita area is the most direct descriptor of the crowding level of a station. It equals the effective area divided by the number of assembling passengers. The crowding degree of an area can be judged by comparing its per capita area with the service level classification in Table 1.
Table 1. Level of service in the Transit Capacity and Quality of Service Manual
Level of service   Per capita area (m2/p)   Description
A                  ≥1.2                     Stand or walk freely in a queue without affecting others.
B                  0.9-1.2                  Able to stand in a queue; activities can be partially limited by avoiding others.
C                  0.7-0.9                  Able to stand or move in a queue, but this can affect others.
D                  0.3-0.7                  Contact with others is unavoidable when standing. Walking in a queue is greatly restricted.
E                  0.2-0.3                  Contact with others is unavoidable while standing, and it is impossible to walk in a queue.
F
$$E(x, y) = \left| g_i(x, y) - g_{i-1}(x, y) \right| \tag{I}$$
$$M(x, y) = \begin{cases} \text{foreground}, & \text{if } E(x, y) > \text{Threshold} \\ \text{background}, & \text{if } E(x, y) \le \text{Threshold} \end{cases} \tag{II}$$
A. Mondal et al. / Visual Traffic Surveillance: A Concise Survey
In the above equations, E(x, y) represents the resultant image after subtraction, M(x, y) denotes the binary image and g_i is the ith image frame.
2.2. Background Subtraction
The most frequently used method for vehicle detection is background subtraction. In this approach, the foreground objects are extracted from the stationary background, and the background is then updated over consecutive frames [7]. It is also called a two-class segmentation technique, in which each pixel (x, y) is segmented as either foreground or background, with the classes modelled using a Gaussian probability distribution [8]. Equation (III) gives the background model for a pixel (x, y) at the ith frame using a mean filter.
$$C_{m,i}(x, y) = \frac{1}{i}\left[(i - 1)\, C_{m,i-1}(x, y) + I_i(x, y)\right] \tag{III}$$
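The thresholding rule for frame differencing and the mean-filter background model of equation (III) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the surveyed authors' implementation: the function names, the use of an absolute difference between consecutive frames, and the toy 2×2 frames are all assumptions made for the example.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Binary mask M: True (foreground) where the difference image E exceeds the threshold.

    Assumption: E is the absolute difference of two consecutive frames.
    """
    e = np.abs(curr_frame.astype(np.float64) - prev_frame.astype(np.float64))
    return e > threshold

def update_mean_background(background, frame, i):
    """Running-mean background update of equation (III); i is the 1-based frame index."""
    return ((i - 1) * background + frame) / i

# Toy data (hypothetical): three constant 2x2 frames with intensities 10, 20, 30.
frames = [np.full((2, 2), v, dtype=np.float64) for v in (10.0, 20.0, 30.0)]

background = np.zeros((2, 2))
for i, frame in enumerate(frames, start=1):
    background = update_mean_background(background, frame, i)

mask = frame_difference_mask(frames[0], frames[1], threshold=5.0)
```

After the loop, `background` holds the per-pixel mean of the frames seen so far, which is what the recursive form of equation (III) computes without storing earlier frames.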
In equation (III), C_{m,i}(x, y) denotes the background model at pixel (x, y) and I_i(x, y) represents the intensity at pixel (x, y).
2.3. Temporal Variance based Model
Motion detection using the temporal variance method computes the mean and variance of the intensity at each pixel, followed by a recursive update for the subsequent frames to detect the moving region of the vehicle [9]. The variance of the intensity at a pixel (x, y) reflects both the amplitude and the duration of changes. The drawback of this model is that, after a change occurs, the variance decays back to its initial value, leaving a trail of pixels along the path of the moving vehicles [10].
3. Vehicle Tracking
Considerable research has been done over the years to provide solutions to the automated vehicle tracking problem, and some of it is described in this section.
3.1. 3D Model-based Tracking
The strength of three-dimensional model-based tracking is the retrieval of trajectories by measuring the detailed geometrical shape of the objects of interest (i.e. vehicles). Several studies have addressed vehicle tracking using 3D models. In [11], three-dimensional models for various vehicle types are matched with the detected edges in the image. Occlusions are virtually eliminated there because an aerial view of the video scene is considered. Tracking of a single vehicle under partial occlusion is performed successfully in [12], though its applicability to congested traffic scenarios is not discussed. The 3D model-based technique is an infeasible approach because of its increased computational complexity and its need for detailed models of all moving vehicles on a busy crossroad or a superhighway [13].
3.2. Region based Tracking
Region-based tracking represents each vehicle as a connected region within some geometrical shape. To achieve effective vehicle tracking, a data association technique is used between the region attributes of successive frames [14].
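As a rough illustration of data association between regions in successive frames, the sketch below greedily pairs bounding boxes by intersection over union (IoU). The choice of IoU as the region attribute, the greedy matching strategy, the `min_iou` parameter and all function names are assumptions for this example; the surveyed works may use other attributes and association schemes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def associate_regions(prev_boxes, curr_boxes, min_iou=0.3):
    """Greedy association: accept pairs in order of decreasing IoU, one match per box."""
    pairs = sorted(
        ((iou(p, c), i, j)
         for i, p in enumerate(prev_boxes)
         for j, c in enumerate(curr_boxes)),
        reverse=True,
    )
    matches, used_prev, used_curr = {}, set(), set()
    for score, i, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same vehicle
        if i in used_prev or j in used_curr:
            continue
        matches[i] = j
        used_prev.add(i)
        used_curr.add(j)
    return matches

# Hypothetical regions: two vehicles that each move by one pixel between frames.
prev_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
curr_boxes = [(21, 21, 31, 31), (1, 1, 11, 11)]
matches = associate_regions(prev_boxes, curr_boxes)
```

Here `matches` maps each index in `prev_boxes` to the index of its associated region in `curr_boxes`. Regions left unmatched would typically be treated as vehicles entering or leaving the scene.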
The fundamental approach in region-based tracking is background subtraction followed by detection of foreground objects (i.e. vehicles). In [15], a region-based vehicle tracking algorithm is proposed using image processing techniques. The vehicle location and its probable region in the next frame are predicted by applying a time series analysis model. A hybrid model based on region matching and feature detection was proposed in [16], where the vehicle trajectory was predicted through consecutive frames. Region-based models work well in less congested traffic, but in a dense traffic environment they struggle to track individual vehicles [17].
3.3. Active Contour based Tracking
Active contour models (also known as snakes) track the contour that represents the boundary of the target vehicle. After a vehicle region is recognized in the input frame, the contours are extracted and dynamically updated in the consecutive frames [18]. A Kalman filtering algorithm can be used to estimate the movement and structure of the contour [19]. A novel approach to tracking vehicle entry and exit through license plate recognition is reported in [20]. Here, the active contour method is used to separate the foreground from the background. An automated visual surveillance system that recognizes human pose based on neural networks and active contours is demonstrated in [21]. The elements of interest in a moving image frame sequence are recognized using a two-tier segmentation process. First, the parameters of the initial active contours are evaluated. Thereafter, the GVF-Snake model [22] is applied for high-level object segmentation. An automatic target identification system using a pulse coupled neural network (PCNN) is proposed in [23]. Candidate regions are generated for the car license plate by using the PCNN. Connected objects are detected by applying an active contour detection algorithm. The connected objects with geometrical features similar to those of the plate are selected as candidates. In the active contour-based approach, the computational complexity is lower than in region-based tracking. However, occluded vehicles cannot be precisely tracked using this approach [24].
3.4. Feature-based Tracking
Instead of tracking the target vehicle as a whole, feature-based algorithms track features such as detectable points or lines on the vehicles. This approach performs relatively well on congested roadways because some features of a moving object remain detectable even under partial occlusion [24]. In practice, algorithms such as Kalman filtering [25] and Lucas-Kanade feature tracking [26] are adopted for tracking multiple vehicles in dense traffic by grouping features. In [27], a feature space is constructed for visual surveillance applications by selecting and extracting low-level features from the video frames. Kalman filter-based vehicle sub-feature tracking is used in [28]. That system is more efficient in the presence of the partial occlusions that are frequent in congestion. A novel feature-based methodology is proposed in [29] to track different moving objects in video sequences collected in a complex environment. The features of the individual objects are extracted from successive frames and matched using the chi-square dissimilarity measure. A nearest neighbour classifier is used to track the moving objects.
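The matching step described for [29] can be illustrated with a short sketch that compares feature vectors using a chi-square dissimilarity and assigns each detection to its nearest tracked object. The symmetric form of the chi-square measure used here is one common definition, and the three-bin feature histograms and function names are hypothetical; this is not a reproduction of the cited method.

```python
import numpy as np

def chi_square_distance(h1, h2, eps=1e-12):
    """Symmetric chi-square dissimilarity between two non-negative feature vectors."""
    h1 = np.asarray(h1, dtype=np.float64)
    h2 = np.asarray(h2, dtype=np.float64)
    # eps guards against division by zero when a bin is empty in both vectors
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def nearest_neighbour_match(track_features, detection_features):
    """For each detection, return the index of the tracked object with the smallest distance."""
    return [
        int(np.argmin([chi_square_distance(d, t) for t in track_features]))
        for d in detection_features
    ]

# Hypothetical normalized histograms for two tracked objects and two new detections.
tracks = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
detections = [[0.15, 0.10, 0.75], [0.7, 0.2, 0.1]]
assignment = nearest_neighbour_match(tracks, detections)
```

In this toy case the first detection is assigned to the second track and the second detection to the first track, since those pairs have the most similar histograms.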
4. Traffic Anomaly Detection
Anomalies are abnormalities in a data pattern that deviate from the notion of normal behavior [30]. Identifying anomalies is challenging because of the randomness of video scenes and the shortage of labelled or annotated data from surveillance videos. Traditionally, three main classes of anomalies are mentioned in the literature: (a) point anomalies [31-33], (b) contextual anomalies [34,35] and (c) collective anomalies [36]. A data instance is considered a point anomaly if it is an outlier with respect to the usual distribution. An example of a point anomaly is a stationary car on a busy road. An anomaly that depends on the current context is known as a contextual anomaly. If a car moves faster than the other cars on a congested road, this can be considered abnormal behavior, whereas the same speed is normal on a less crowded road. In some scenarios, data instances are individually normal but collectively result in an anomalous scenario, e.g. the sudden dispersal of a group of people within a very short time span. Anomaly detection is highly important for implementing road safety and traffic control in the real world. In recent years, several works have addressed anomaly detection. A substantial amount of work focuses on learning the feature representation from video clips without any anomaly [37-39]. With the advent of the Generative Adversarial Network (GAN) [40], video prediction has been used for anomaly detection. In [41], R-CNN is used to detect anomalous vehicles in the extracted background. Different anomaly detection approaches are briefly described below.
4.1. Anomaly detection approaches
4.1.1. Model based
In model-based approaches, statistical models are used to learn the set of parameters representing the normal behavior of the data. Statistical approaches are of two types: (i) parametric and (ii) non-parametric.
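As a minimal illustration of a parametric, model-based detector, the sketch below fits a single Gaussian to speeds observed under normal traffic and flags a point anomaly, such as a stationary car on a busy road, by its z-score. The single-Gaussian model, the threshold of three standard deviations, the speed values and the function names are all assumptions chosen for the example.

```python
import numpy as np

def fit_gaussian(normal_samples):
    """Estimate the mean and sample standard deviation of normal-behavior data."""
    x = np.asarray(normal_samples, dtype=np.float64)
    return float(x.mean()), float(x.std(ddof=1))

def is_point_anomaly(value, mean, std, z_threshold=3.0):
    """Flag a value whose distance from the mean exceeds z_threshold standard deviations."""
    return bool(abs(value - mean) / std > z_threshold)

# Hypothetical vehicle speeds (km/h) observed on a busy road under normal conditions.
normal_speeds = [48.0, 52.0, 50.0, 49.0, 51.0]
mean, std = fit_gaussian(normal_speeds)

stationary_car = is_point_anomaly(0.0, mean, std)   # far outside the fitted distribution
typical_car = is_point_anomaly(51.0, mean, std)     # within the fitted distribution
```

A contextual anomaly could be handled in the same spirit by fitting separate parameters per context (for example, congested versus free-flowing traffic) and testing each observation against the model for its own context.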
In the parametric approach, the normal data are assumed to be generated from a parametric distribution with a probability density function. Regression models [42] and Gaussian mixture models [43] are examples of the parametric approach. In a non-parametric approach, by contrast, the structure is not predefined but is determined dynamically from the data. Dirichlet process mixture models (DPMM) [44], histogram-based models [45] and Bayesian network-based models [46] are typical examples of non-parametric approaches.
4.1.2. Proximity-based
Proximity-based models follow a distance-based approach, which assumes that normal data lie in dense neighborhoods. Anomalies are identified by measuring the closeness of the data points to their neighbors. An outlier score is calculated by comparing the relative density of a point with that of its neighbors [47].
4.1.3. Classification-based
In classification-based models, classifiers are used to differentiate between normal and anomalous classes in a particular feature space. Two categories of classification-based anomaly detection are found in the literature: (i) one-class and (ii) multi-class. The basic assumption of anomaly detection by one-class classification is that only one label is associated with all the data points [48-50]. In [51], a Support Vector
Machine (SVM) is used for anomaly detection in a one-class classification-based approach. In multi-class classification-based anomaly detection, it is assumed that labelled instances of both normal and anomalous classes are available in the training data. If a data point falls in the anomalous class, it is considered anomalous [52].
4.1.4. Prediction-based
In the prediction-based model, the predicted and actual spatiotemporal properties of the feature descriptor are compared to detect an anomaly [53]. HMM and LSTM models rely on the prediction-based approach to detect anomalies [54].
4.1.5. Reconstruction based
Reconstruction-based methods assume that the normal data can be embedded in a lower-dimensional subspace in which normal and anomalous instances appear different. The data reconstruction error is calculated to measure anomaly [55]. Autoencoders [56], sparse coding [57-59] and principal component analysis (PCA) are examples of reconstruction-based anomaly detection approaches.
5. Discussion
The state-of-the-art methods presented in this paper clearly show the importance of visual traffic surveillance systems in computer vision. The challenges in this field are prominent because of occlusion, unusual vehicle behavior, changes in illumination [60], irregular vehicle orientation, lower camera viewpoints, etc. Several classifications of intelligent transportation systems are elaborated in [61], which are used to find optimized vehicle trajectories using wireless sensors, postulates of graph theory, etc. The detection of foreground objects and their separation from the background is done effectively by motion segmentation [62] approaches, which include frame differencing, feature point based and temporal variance methods. The significant techniques for generating vehicle motion cues in a video sequence are discussed in [63]. A robust motion estimation algorithm is proposed in [64] for video compression and image reconstruction.
Tracking algorithms such as the Kalman filter, the Lucas-Kanade tracker and particle filtering, as well as recent work built on these conventional models, perform appreciably well in dense traffic environments. The feature-based vehicle tracking approach is more efficient under partial occlusion, unlike the snakes model, which fails to track occluded objects precisely. The wide application domain of these systems includes surveillance and security, traffic congestion control, accident prevention, etc. A few studies of visual traffic surveillance systems highlight performance at night [65] and under poor lighting [66, 67]. The major challenges in congested traffic environments are occlusion handling and shadow effect reduction on highways [68, 69]. The inclusive data set mentioned in [70] is deficient in its coverage of weather and lighting conditions. An outline framework for establishing a video surveillance system is proposed in [71,72]. A behavioral analysis of CCTV footage for surveillance systems is proposed in [73]. An intelligent transport monitoring system using an infrared proximity sensor and microcontrollers, as used in AVS systems, is proposed in [74]. In [75], Unmanned Aerial Systems (UAS) [76, 77] are discussed in the context of computer vision for visual traffic monitoring, with regard to safety, privacy and legislation concerns. Trackers capable of tracking multiple vehicles in parallel are proposed in [78]. A thermal or IR camera can
be used at night to reduce visual obstacles under difficult lighting conditions [79]. This literature considers vehicle lights instead of occlusion of contours at night. Deep learning methodologies such as R-CNN and GAN perform relatively well in classifying objects, anomaly detection, etc.
Conclusion
In this study, a concise review of computer vision techniques for visual traffic surveillance systems is presented. Several conventional approaches to vehicle detection, tracking and traffic anomaly detection are highlighted. Different state-of-the-art motion segmentation approaches in a surveillance system are described. The first stage of an automatic surveillance system is object or vehicle tracking. Different tracking models, e.g. 3D model-based, region-based, active contour-based and feature-based tracking methods, are briefly elaborated. After vehicle tracking is done from video frames, the next step is classifying the vehicles into different categories. In practice, feature-based tracking performs competently and overcomes the limitations of the other approaches. Detection of abnormality in the traffic scenario is of paramount importance in visual tracking. Different anomaly detection methods are briefly illustrated in this paper, with attention to the potential of the methods used in traffic surveillance systems for data interpretation. The present study presents anomaly detection techniques that are capable of handling point, contextual and collective anomalies in a dense traffic environment. Some relevant studies propose a future design combining detectors and classifiers using 3DHOG algorithms for effective traffic management. In future, a framework could be designed that considers the information broadcast between vehicle and infrastructure or vehicle-to-vehicle communication, as studied in Japan.
References
[1] Mountain, L. J., & Garner, J. B. (1980).
Application of photography to traffic surveys. Highway Engineer, 27(11). [2] Qureshi, K. N., & Abdullah, A. H. (2013). A survey on intelligent transportation systems. Middle East Journal of Scientific Research, 15(5), 629-642. [3] Sandim, M., Rossetti, R. J., Moura, D. C., Kokkinogenis, Z., & Rúbio, T. R. (2016, November). Using GPS-based AVL data to calculate and predict traffic network performance metrics: A systematic review. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC) (pp. 1692-1699). IEEE [4] Ambardekar, A., Nicolescu, M., Bebis, G., & Nicolescu, M. (2013). Visual traffic surveillance framework: classification to event detection. Journal of Electronic Imaging, 22(4), 041112. [5] Tian, B., Morris, B. T., Tang, M., Liu, Y., Yao, Y., Gou, C., ... & Tang, S. (2014). Hierarchical and networked vehicle surveillance in ITS: a survey. IEEE transactions on intelligent transportation systems, 16(2), 557-580. [6] Liu, Z., Chen, Y., & Li, Z. (2009, March). Camshift-based real-time multiple vehicle tracking for visual traffic surveillance. In 2009 WRI World Congress on Computer Science and Information Engineering (Vol. 5, pp. 477-482). IEEE. [7] Hadi, R. A., Sulong, G., & George, L. E. (2014). Vehicle detection and tracking techniques: a concise review. arXiv preprint arXiv:1410.5894. [8] Jun, G., Aggarwal, J. K., & Gokmen, M. (2008, January). Tracking and segmentation of highway vehicles in cluttered and crowded scenes. In 2008 IEEE Workshop on Applications of Computer Vision (pp. 1-6). IEEE. [9] Joo, S., & Zheng, Q. (2005, January). A temporal variance-based moving target detector. In IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS)
[10] Abdelkader, M. F., Chellappa, R., Zheng, Q., & Chan, A. L. (2006, January). Integrated motion detection and tracking for visual surveillance. In Fourth IEEE International Conference on Computer Vision Systems (ICVS'06) (pp. 28-28). IEEE. [11] Ferryman, J. M., Worrall, A. D., & Maybank, S. J. (1998, September). Learning Enhanced 3D Models for Vehicle Tracking. In BMVC (pp. 1-10). [12]Hinz, S., Schlosser, C., & Reitberger, J. (2003, May). Automatic car detection in high resolution urban scenes based on an adaptive 3D-model. In 2003 2nd GRSS/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas (pp. 167-171). IEEE. [13] Coifman, B., Beymer, D., McLauchlan, P., & Malik, J. (1998). A real-time computer vision system for vehicle tracking and traffic surveillance. Transportation Research Part C: Emerging Technologies, 6(4), 271-288. [14] Abdulrahim, K., & Salam, R. A. (2016). Traffic surveillance: A review of vision-based vehicle detection, recognition and tracking. International journal of applied engineering research, 11(1), 713-726. [15] Wang, J., Sun, X., & Guo, J. (2013). A region tracking-based vehicle detection algorithm in nighttime traffic scenes. Sensors, 13(12), 16474-16493. [16] Saini, A., Suregaonkar, S., Gupta, N., Karar, V., & Poddar, S. (2017, August). Region and feature matching based vehicle tracking for accident detection. In 2017 Tenth International Conference on Contemporary Computing (IC3) (pp. 1-6). IEEE. [17] Hu, W., Tan, T., Wang, L., & Maybank, S. (2004). A survey on visual surveillance of object motion and behaviors. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 34(3), 334-352. [18] Malik, J., & Russell, S. (1997). Traffic surveillance and detection technology development: new traffic sensor technology final report. [19] Koller, D., Weber, J., & Malik, J. (1994, May). Robust multiple car tracking with occlusion reasoning. In European Conference on Computer Vision (pp. 189-196). 
Springer, Berlin, Heidelberg. [20] Felix, A. Y., Jesudoss, A., & Mayan, J. A. (2017, August). Entry and exit monitoring using license plate recognition. In 2017 IEEE International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials (ICSTM) (pp. 227-231). IEEE. [21] Buccolieri, F., Distante, C., & Leone, A. (2005, September). Human posture recognition using active contours and radial basis function neural network. In IEEE Conference on Advanced Video and Signal Based Surveillance, 2005. (pp. 213-218). IEEE. [22] Xu, C., & Prince, J. L. (1998). Snakes, shapes, and gradient vector flow. IEEE Transactions on image processing, 7(3), 359-369. [23] Chacon, M. I., & Zimmerman, A. (2003, July). License plate location based on a dynamic PCNN scheme. In Proc. Int. Joint Conf. Neural Netw (Vol. 2, pp. 1195-1200). [24] Saunier, N., & Sayed, T. (2006, June). A feature-based tracking algorithm for vehicles in intersections. In The 3rd Canadian Conference on Computer and Robot Vision (CRV'06) (pp. 59-59). IEEE. [25] Li, X., Wang, K., Wang, W., & Li, Y. (2010, June). A multiple object tracking method using Kalman filter. In The 2010 IEEE international conference on information and automation (pp. 1862-1866). IEEE. [26] Suhr, J. K. (2009). Kanade-lucas-tomasi (klt) feature tracker. Computer Vision (EEE6503), 9-18. [27]Altahir, A. A., & Asirvadam, V. S. (2014, June). Building a feature-space for visual surveillance. In 2014 5th International Conference on Intelligent and Advanced Systems (ICIAS) (pp. 1-6). IEEE. [28] Huang, L. (2010, March). Real-time multi-vehicle detection and sub-feature based tracking for traffic surveillance systems. In 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010)(Vol. 2, pp. 324-328). Ieee. [29] Chandrajit, M., Girisha, R., & Vasudev, T. (2016). Multiple objects tracking in surveillance video using color and hu moments. 
Signal & Image Processing: An International Journal (SIPIJ) Vol, 7. [30] Chandala, V., Banerjee, A., & Kumar, V. (2009). Anomaly Detection: A Survey, ACM Computing Surveys. University of Minnesota. [31] Jeong, H., Yoo, Y., Yi, K. M., & Choi, J. Y. (2014). Two-stage online inference model for traffic pattern analysis and anomaly detection. Machine vision and applications, 25(6), 1501-1517. [32] Li, W., Mahadevan, V., & Vasconcelos, N. (2013). Anomaly detection and localization in crowded scenes. IEEE transactions on pattern analysis and machine intelligence, 36(1), 18-32. [33] Roshtkhari, M. J., & Levine, M. D. (2013). An on-line, real-time learning method for detecting anomalies in videos using spatio-temporal compositions. Computer vision and image understanding, 117(10), 1436-1452. [34] Song, X., Wu, M., Jermaine, C., & Ranka, S. (2007). Conditional anomaly detection. IEEE Trans. Knowl. Data Eng., 19(5), 631-645. [35] Yuan, Y., Fang, J., & Wang, Q. (2014). Online anomaly detection in crowd scenes via structure analysis. IEEE transactions on cybernetics, 45(3), 548-561.
[36] Wang, T., & Snoussi, H. (2014). Detection of abnormal visual events via global optical flow orientation histogram. IEEE Transactions on Information Forensics and Security, 9(6), 988-998. [37] Cong, Y., Yuan, J., & Liu, J. (2011, June). Sparse reconstruction cost for abnormal event detection. In CVPR 2011 (pp. 3449-3456). IEEE. [38] Lu, C., Shi, J., & Jia, J. (2013). Abnormal event detection at 150 fps in matlab. In Proceedings of the IEEE international conference on computer vision (pp. 2720-2727). [39] Chong, Y. S., & Tay, Y. H. (2017, June). Abnormal event detection in videos using spatiotemporal autoencoder. In International Symposium on Neural Networks (pp. 189-196). Springer, Cham. [40] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680). [41] Wei, J., Zhao, J., Zhao, Y., & Zhao, Z. (2018). Unsupervised anomaly detection for traffic surveillance based on background modeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 129-136). [42] Cheng, K. W., Chen, Y. T., & Fang, W. H. (2015). Gaussian process regression-based video anomaly detection and localization with hierarchical feature representation. IEEE Transactions on Image Processing, 24(12), 5288-5301. [43] Li, Y., Liu, W., & Huang, Q. (2016). Traffic anomaly detection based on image descriptor in videos. Multimedia tools and applications, 75(5), 2487-2505. [44] Ngan, H. Y., Yung, N. H., & Yeh, A. G. (2015). Outlier detection in traffic data based on the Dirichlet process mixture model. IET intelligent transport systems, 9(7), 773-781. [45] Zhang, Y., Lu, H., Zhang, L., & Ruan, X. (2016). Combining motion and appearance cues for anomaly detection. Pattern Recognition, 51, 443-452. [46] Blair, C. G., & Robertson, N. M. (2014, January). 
Event-driven dynamic platform selection for poweraware real-time anomaly detection in video. In 2014 International Conference on Computer Vision Theory and Applications (VISAPP) (Vol. 3, pp. 54-63). IEEE. [47] Liu, S. W., Ngan, H. Y., Ng, M. K., & Simske, S. J. (2018). Accumulated Relative Density Outlier Detection For Large Scale Traffic Data. Electronic Imaging, 2018(9), 1-10. [48] Patil, N., & Biswas, P. K. (2016, December). Global abnormal events detection in surveillance video— A hierarchical approach. In 2016 Sixth International Symposium on Embedded Computing and System Design (ISED) (pp. 217-222). IEEE. [49] Wang, T., Qiao, M., Zhu, A., Niu, Y., Li, C., & Snoussi, H. (2018). Abnormal event detection via covariance matrix for optical flow based feature. Multimedia Tools and Applications, 77(13), 17375-17395. [50] Xu, D., Yan, Y., Ricci, E., & Sebe, N. (2017). Detecting anomalous events in videos by learning deep representations of appearance and motion. Computer Vision and Image Understanding, 156, 117-127. [51] Zhang, X., Gu, C., & Lin, J. (2006). Support vector machines for anomaly detection. In 2006 6th World Congress on Intelligent Control and Automation (Vol. 1, pp. 2594-2598). IEEE. [52] Chen, Y., Yu, Y., & Li, T. (2016, August). A vision based traffic accident detection method using extreme learning machine. In 2016 International Conference on Advanced Robotics and Mechatronics (ICARM) (pp. 567-572). IEEE. [53] Liu, W., Luo, W., Lian, D., & Gao, S. (2018). Future frame prediction for anomaly detection– a new baseline. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 6536-6545). [54] Medel, J. R., & Savakis, A. (2016). Anomaly detection in video using predictive convolutional long short-term memory networks. arXiv preprint arXiv:1612.00390. [55] Kumaran, S. K., Dogra, D. P., & Roy, P. P. (2019). Anomaly Detection in Road Traffic Using Visual Surveillance: A Survey. arXiv preprint arXiv:1901.08292. 
[56] Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A. K., & Davis, L. S. (2016). Learning temporal regularity in video sequences. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 733-742). [57] Tan, H., Zhai, Y., Liu, Y., & Zhang, M. (2016, March). Fast anomaly detection in traffic surveillance video based on robust sparse optical flow. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1976-1980). IEEE. [58] Yu, B., Liu, Y., & Sun, Q. (2016). A content-adaptively sparse reconstruction method for abnormal events detection with low-rank property. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 47(4), 704-716. [59] Zhang, Z., Mei, X., & Xiao, B. (2015). Abnormal event detection via compact low-rank sparse learning. IEEE Intelligent Systems, 31(2), 29-36. [60] Dey, N. (2019). Uneven illumination correction of digital images: A survey of the state-of-theart. Optik, 183, 483-495.
[61] Biswas, S. P., Roy, P., Patra, N., Mukherjee, A., & Dey, N. (2016). Intelligent traffic monitoring system. In Proceedings of the Second International Conference on Computer and Communication Technologies (pp. 535-545). Springer, New Delhi. [62] Dey, N., Ashour, A., & Patra, P. K. (Eds.). (2016). Feature detectors and motion detection in video processing. IGI Global. [63] Casas, S., Olanda, R., & Dey, N. (2017). Motion cueing algorithms: a review: algorithms, evaluation and tuning. International Journal of Virtual and Augmented Reality (IJVAR), 1(1), 90-106. [64] Acharjee, S., Biswas, D., Dey, N., Maji, P., & Chaudhuri, S. S. (2013, March). An efficient motion estimation algorithm using division mechanism of low and high motion zone. In 2013 International MutliConference on Automation, Computing, Communication, Control and Compressed Sensing (iMac4s) (pp. 169-172). IEEE. [65] Ge, J., Luo, Y., & Tei, G. (2009). Real-time pedestrian detection and tracking at nighttime for driver-assistance systems. IEEE Transactions on Intelligent Transportation Systems, 10(2), 283-298. [66] Hautiere, N., Tarel, J.-P., & Aubert, D. (2010). Mitigation of visibility loss for advanced camera-based driver assistance. IEEE Transactions on Intelligent Transportation Systems, 11(2), 474-484. [67] Johansson, B., Wiklund, J., Forssén, P., & Granlund, G. (2009). Combining shadow detection and simulation for estimation of vehicle size and position. Pattern Recognition Letters, 30(8), 751-759. [68] Kanhere, N. K., & Birchfield, S. T. (2008). Real-time incremental segmentation and tracking of vehicles at low camera angles using stable features. IEEE Transactions on Intelligent Transportation Systems, 9(1), 148-160. [69] Hsieh, J. W., Yu, S. H., Chen, Y. S., & Hu, W. F. (2006). Automatic traffic surveillance system for vehicle tracking and classification. IEEE Transactions on Intelligent Transportation Systems, 7(2), 175-187.
[70] Buch, N., Velastin, S. A., & Orwell, J. (2011). A review of computer vision techniques for the analysis of urban traffic. IEEE Transactions on Intelligent Transportation Systems, 12(3), 920-939. [71] Valera, M., & Velastin, S. A. (2005). Intelligent distributed surveillance systems: a review. IEE Proceedings-Vision, Image and Signal Processing, 152(2), 192-204. [72] Greiffenhagen, M., Comaniciu, D. O. R. I. N., Niemann, H. E. I. N. R. I. C. H., & Ramesh, V. I. S. V. A. N. A. T. H. A. N. (2001). Design, analysis, and engineering of video monitoring systems: an approach and a case study. Proceedings of the IEEE, 89(10), 1498-1517. [73] Dee, H., & Hogg, D. C. (2004). Is it interesting? comparing human and machine judgements on the pets dataset. In ECCV-PETS: the Performance Evaluation of Tracking and Surveillance workshop at the European Conference on Computer Vision. [74] Roy, P., Patra, N., Mukherjee, A., Ashour, A. S., Dey, N., & Biswas, S. P. (2017). Intelligent traffic monitoring system through auto and manual controlling using PC and android application. In Applied video processing in surveillance and monitoring systems (pp. 244-262). IGI Global. [75] Barmpounakis, E. N., Vlahogianni, E. I., & Golias, J. C. (2016). Unmanned Aerial Aircraft Systems for transportation engineering: Current practice and future challenges. International Journal of Transportation Science and Technology, 5(3), 111-122. [76]Mukherjee, A., Chakraborty, S., Azar, A. T., Bhattacharyay, S. K., Chatterjee, B., & Dey, N. (2014, November). Unmanned aerial system for post disaster identification. In International Conference on Circuits, Communication, Control and Computing (pp. 247-252). IEEE. [77]Samanta, S., Mukherjee, A., Ashour, A. S., Dey, N., Tavares, J. M. R., Abdessalem Karâa, W. B., ... & Hassanien, A. E. (2018). Log transform based optimal image enhancement using firefly algorithm for autonomous mini unmanned aerial vehicle: An application of aerial photography. 
International Journal of Image and Graphics, 18(04), 1850019. [78] Betke, M., Haritaoglu, E., & Davis, L. S. (2000). Real-time multiple vehicle detection and tracking from a moving vehicle. Machine vision and applications, 12(2), 69-83. [79] Ferrier, N. J., Rowe, S., & Blake, A. (1994, December). Real-time traffic monitoring. In WACV (pp. 81-88).
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200044
Multi-Criteria Evaluation for Energy Consumption Quality of Urban Rail Transit Stations
GUO Tongjun a,1, MAO Baohua a, and WANG Huiwen a
a Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing, China
Abstract. This study investigates the basic data of urban rail transit station energy consumption, analyzes its composition, identifies the differences in energy consumption between different types of stations, and establishes a tree-structured evaluation system of energy consumption quality for metro stations. The system reflects the energy quality level of urban rail transit stations. Several stations of the Beijing Subway are taken as cases to analyze the energy quality characteristics of different types of stations under the evaluation system.
Keywords. Multi-criteria; Station energy consumption; AHP; Rough set theory; TOPSIS
1 Corresponding Author. GUO Tongjun, Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Email: [email protected]
1. Introduction
Urban rail transit is characterized by large traffic volume, safety and punctuality, which makes it an important part of the urban transportation system. Although the unit energy consumption of the urban rail transit mode is small, the overall energy consumption of the system is still considerable [1]. For example, the Beijing Subway company consumed 64.7 million kWh of electricity for metro operation in April 2018. The energy consumption of an urban rail transit system can be divided into train operation energy consumption and station energy consumption. The former includes the energy consumption of train traction, braking, vehicle equipment, etc. The latter includes the energy consumption of ventilation, air conditioning, lighting, escalators and so on. According to statistics, station energy consumption accounts for about 42% of the total energy consumption of urban rail transit in Beijing [2]. Station energy consumption is therefore an important part of the energy consumption of the rail transit system, and reducing it while improving the energy efficiency of stations is an important social demand.
Although station energy consumption changes little over a short time span, the situation may change over a longer time span. Aging equipment, excessive energy consumption and changes in environmental standards require stations to replace equipment in order to maintain the normal operation and sustainable development of the system. When a station's equipment is updated, its energy consumption can change dramatically. For example, Shenzhen Metro Line 9 adopts a more advanced lighting control system, saving 20% to 30% of lighting power [3]. In 2012, the air conditioning system of Beijing Metro Line 4 was improved to save energy; after the improvement, the average annual energy saving of each station was 137,600 kWh [4]. The Washington Metro installed two new water chillers at the U Street and Navy Yard-Ballpark stations in 2014, saving $15,000 per device per year in energy costs. Metro or light rail stations are typically renovated every 25 years, and the cost of updating them is very high.
The urban rail transit systems in many Chinese cities have been in operation for decades, and the equipment used in stations built in different periods differs considerably. A scientific evaluation system can better guide the station reconstruction process, so an evaluation method for the overall energy efficiency and sustainable development of stations is very important.
This paper proposes the concept of energy consumption quality as a standard to measure the energy use of a station, which includes the efficiency of energy use, sustainability and the quality of service provided. The evaluation of station energy consumption quality can be regarded as an MCDA (Multi-Criteria Decision Aid) problem based on utility theory. In transportation applications, the common MCDA methods include AHP (Analytic Hierarchy Process), PROMETHEE (Preference Ranking Organization Method for Enrichment Evaluations) and TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution). In addition, a key prerequisite for MCDA is to determine the relative importance, or weight, of the different performance criteria based on the decision maker's subjective perceptions or on objective mathematical calculations from facility attributes [5].
Previous studies on the energy consumption of urban rail transit systems tend to focus on train operation, because it is affected by many factors, such as the operation plan, and is therefore relatively easy to optimize by adjusting train operation control, the train schedule or the train marshaling plan. For example, when Bai et al. [6], Feng et al. [7], Acampora et al. [8] and other scholars studied the energy consumption of urban rail transit systems, they all took the station energy consumption as a fixed value.
[Figure 1: flowchart. Data collection feeds the weighting stage, in which AHP yields subjective weights, the rough set method applied to the energy consumption data yields objective weights, and the two are combined into integrated weights. In the evaluation stage, TOPSIS is applied with each set of weights to give subjective, integrated and objective scores.]
Figure 1. Flowchart of multi-criteria evaluation process
Besides, there are few studies on the evaluation of the energy consumption quality of urban rail transit stations, because field data are difficult to collect, the properties of different stations differ considerably, and their energy consumption characteristics are inconsistent. This paper proposes a combined subjective and objective method to evaluate the energy consumption quality of urban rail transit stations, which not only utilizes expert experience but also mines the characteristic information of the evaluation attributes from the data. The method reflects the differences in energy consumption between different types of stations and adopts AHP, the rough set method and TOPSIS to reconcile the inconsistency of energy consumption characteristics. The Beijing Subway is selected as a case to verify the feasibility and effectiveness of the proposed method.
2. Criteria system
In general, the environmental control system, lighting, and escalators account for about 80% of the total energy consumption of a station. Therefore, these three types of equipment are selected to reflect the energy consumption efficiency of the station. Common evaluation criteria for energy consumption include equipment energy efficiency and emission level. Because the emission level is strongly correlated with the total energy consumption level, total energy consumption is not taken as a separate criterion in the system. However, the simple pursuit of low energy consumption often means a decline in service level, which is contrary to the function of the station. Therefore, this paper also introduces a service level index to comprehensively evaluate the energy consumption quality of the station. Station service level is difficult to quantify; in this paper, passenger satisfaction and service reliability are used to measure it quantitatively.
Complex multi-attribute decision problems often have a hierarchical structure. In this paper, a tree-structured evaluation system is constructed according to the characteristics of the energy consumption quality of the station, with six bottom-level criteria. The letter L in parentheses indicates that a lower attribute value gives a better evaluation result, while the letter H indicates the opposite. The criteria are set as follows:
Ec: Energy consumption efficiency of the environmental control system (kWh/m²). Ec indicates the energy consumption of the environmental control system (including air conditioning and ventilation equipment) in one year per unit area of the station. (L)
El: Energy efficiency of the lighting system (kWh/m²). El indicates the energy consumption of the lighting system in one year per unit area of the station. (L)
Ee: Escalator energy efficiency (kWh). Ee indicates the average energy consumption in one year per escalator. (L)
Ic: Station carbon emission intensity (kg/m²). Ic indicates the converted carbon dioxide weight in one day per unit station area. (L)
S: Satisfaction degree of passengers with the station. S indicates the quantitative score of passengers' satisfaction with the station service according to the survey. (H)
R: Service reliability index. R indicates the number of failures of station equipment in a certain operating period. (L)
In addition to the above evaluation criteria, other criteria can also reflect the energy consumption of the station, such as the total energy consumption of the station, the per capita energy consumption of passengers, and the seasonal characteristics of energy consumption. However, these criteria are usually strongly affected by the design and passenger flow characteristics of the station and cannot effectively reflect the quality of energy consumption, or they overlap with existing criteria, so they are not adopted.
3. Weighting methods
The determination of criteria weights is the key to multi-factor integrated evaluation. The rationality of the weights directly affects the reliability and effectiveness of the evaluation results. A review of studies at home and abroad shows that there are various methods to determine criteria weights, which can be roughly divided into three categories: subjective weighting, objective weighting and integrated weighting [9].
3.1. Subjective weighting
The subjective weighting method mainly relies on the knowledge, experience or preference of decision makers or experts, who compare the criteria according to their importance and assign or calculate their weights. This kind of method is subjective and somewhat arbitrary, but the resulting ranking of criteria weights is basically consistent with the actual situation of the evaluation objects. At present, the commonly used subjective weighting methods can be classified into four categories: the expert estimation method, the binomial coefficient method, the sequential scoring method, and the analytic hierarchy process (AHP). AHP can stratify the decision-making thinking of complex systems and organically combine the qualitative and quantitative factors in the decision-making process [10]. This paper applies AHP as the subjective weighting method. The steps are as follows:
• Construct a judgment matrix. Because the evaluation system already has a hierarchical structure, the subordination of the factors between the upper and lower levels has been determined. On this basis, the relative importance of each factor at each level needs to be judged.
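$$
X = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{bmatrix} \tag{1}
$$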
where $x_{ij}$ represents the relative importance of criterion i to criterion j among the evaluation criteria at this level. Table 1 shows the meaning of the scale of relative importance. Saaty originally proposed a nine-point scale, but some studies suggest that a scale with fewer points is easier for experts or decision makers to master and makes it easier for the matrix to meet the consistency requirements [11]. Therefore, this paper adopts a five-point scale.
TABLE 1. Descriptions of Scales between Two Attributes
| Scale | Implication |
| 1 | Two attributes are equally important |
| 2 | One attribute is slightly more important than the other |
| 3 | One attribute is significantly more important than the other |
| 4 | One attribute is highly more important than the other |
| 5 | One attribute is extremely more important than the other |
• Calculate the weight. Use the relative importance to calculate the initial weight and normalize it, as shown in the formula.
$$
w_i = \sqrt[n]{\prod_{j=1}^{n} x_{ij}} \tag{2}
$$
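The weight calculation can be illustrated with a minimal Python sketch. It assumes a square judgment matrix and reads Eq. (2) as the geometric mean of each row, followed by normalization so that the weights sum to 1. The function name `ahp_weights` and the 3-criterion example matrix are hypothetical and are not taken from the case study.

```python
import math


def ahp_weights(judgment):
    """Geometric-mean weights from a square pairwise judgment matrix."""
    n = len(judgment)
    # Initial weight of criterion i: n-th root of the product of row i (Eq. 2).
    initial = [math.prod(row) ** (1.0 / n) for row in judgment]
    total = sum(initial)
    # Normalise so that the weights sum to 1.
    return [w / total for w in initial]


# Hypothetical 3-criterion matrix on the five-point scale of Table 1;
# entry [i][j] is the importance of criterion i relative to criterion j.
example = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_weights(example)
print([round(w, 4) for w in weights])
```

Because the example matrix is perfectly consistent, the weights are in the ratio 4:2:1, which makes the result easy to check by hand.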
• Consistency test. Calculate the maximum eigenvalue and the corresponding eigenvector of each pairwise comparison matrix, then calculate the CR (Consistency Ratio), which determines whether the judgment matrix can be accepted, from the CI (Consistency Index) and the RI (Random Index):
$$
CR = \frac{CI}{RI} \tag{3}
$$
with $CI = \dfrac{\lambda_{\max} - n}{n - 1}$. RI can be looked up in Table 2 according to the number of criteria being compared.
TABLE 2. Correspondence Table of Random Index
| Number of criteria | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| RI | 0 | 0 | 0.58 | 0.9 | 1.12 | 1.24 | 1.32 | 1.41 | 1.45 |
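The consistency test of Eq. (3) can be sketched as follows. This is a pure-Python illustration under stated assumptions: the maximum eigenvalue is estimated by power iteration (a choice made here for self-containment, not a method named in the paper), the RI values are those of Table 2, and matrices with one or two criteria are treated as always consistent. The names `max_eigenvalue` and `consistency_ratio` and both example matrices are hypothetical.

```python
# Random Index values from Table 2, keyed by the number of criteria.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.9, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def max_eigenvalue(matrix, iterations=200):
    """Estimate the largest eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    vec = [1.0 / n] * n  # start from a vector whose entries sum to 1
    eig = float(n)
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        eig = sum(nxt)  # vec sums to 1, so this sum estimates lambda_max
        vec = [v / eig for v in nxt]
    return eig


def consistency_ratio(matrix):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # RI is 0 in Table 2; such matrices are always consistent
    ci = (max_eigenvalue(matrix) - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


consistent = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]]
cyclic = [[1.0, 3.0, 1 / 3], [1 / 3, 1.0, 3.0], [3.0, 1 / 3, 1.0]]
print(consistency_ratio(consistent), consistency_ratio(cyclic))
```

The second matrix encodes a circular preference (the first criterion beats the second, the second beats the third, the third beats the first), so its CR is far above the usual 0.1 acceptance threshold and the judgments would have to be revised.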
Generally, if CR < 0.1, the normalized eigenvector is taken as the weight vector; otherwise, the pairwise comparison matrix needs to be reconstructed.
3.2. Objective weighting
The objective weighting method determines the weights of criteria through quantitative analysis of the actual criteria data. It relies on mathematical theory rather than on the judgment of decision makers, so the weights are derived objectively from the data. The basic idea of the rough set method for determining criteria weights is to first classify the evaluation objects according to all criteria in the system, and then remove one criterion at a time. The importance of a criterion is taken to be proportional to the degree to which the classification changes after its removal, compared with the original classification [12].
The steps are as follows.
• Discretization of data and establishment of the decision table. Abstract the evaluation problem into an information system $S = (U, A, V, f)$, where $U$ is a non-empty finite set of objects (the universe), $A$ is a non-empty finite set of attributes, $V$ is the set of values that the attributes may take, and $f: U \times A \to V$ is an information function. The value set $V = \{1, 2, 3\}$ of the information system is obtained by trisection of the attribute values.
• Calculate criterion significance. The importance of an attribute $a$ is measured by the difference between the classification derived from the full attribute set $A$ and the classification derived from $A$ with $a$ removed.
$$
sig(a) = |U/A| - |U/(A - \{a\})| \tag{4}
$$
• Calculate the weight.
$$
w_i = \frac{sig(a_i)}{\sum_{i=1}^{n} sig(a_i)} \tag{5}
$$
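The rough set steps can be sketched in Python as below. Two assumptions are made for illustration: trisection is taken to mean three equal-width intervals between the minimum and maximum of each attribute (the text does not say whether equal-width or equal-frequency bins were used), and the significance of an attribute is the drop in the number of equivalence classes when it is removed. All function names and the small decision table are hypothetical.

```python
def trisect(values):
    """Map raw values to levels 1, 2, 3 using three equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1] * len(values)
    width = (hi - lo) / 3.0
    # The maximum value would fall into a fourth bin, so it is capped at 3.
    return [min(int((v - lo) / width) + 1, 3) for v in values]


def partition_size(table, attributes):
    """Number of equivalence classes |U/B| induced by the attribute subset B."""
    return len({tuple(row[a] for a in attributes) for row in table})


def rough_set_weights(table):
    """Weights from Eq. (4) and Eq. (5) for a discretised decision table."""
    all_attrs = list(range(len(table[0])))
    base = partition_size(table, all_attrs)
    sig = []
    for a in all_attrs:
        reduced = [b for b in all_attrs if b != a]
        sig.append(base - partition_size(table, reduced))  # Eq. (4)
    total = sum(sig)
    if total == 0:
        raise ValueError("removing any single attribute leaves the classes unchanged")
    return [s / total for s in sig]  # Eq. (5)


# Hypothetical discretised table: 4 objects (rows) and 3 attributes (columns).
table = [(1, 1, 1), (1, 1, 2), (1, 1, 3), (2, 1, 3)]
levels = trisect([0, 1, 2, 3, 6])
weights = rough_set_weights(table)
print(levels, weights)
```

In the example, the second attribute takes the same value for every object, so removing it never merges classes and it receives zero weight. The third attribute separates the most objects and therefore receives the largest weight.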
3.3. Integrated weighting
The integrated weighting method combines the advantages of the subjective and objective weighting methods while offsetting their respective disadvantages, and reflects the importance of the attributes comprehensively, which makes it more suitable for this study. At present, there are many forms of integrated weighting based on different principles, but they can be roughly classified into four categories: integration based on additive or multiplicative synthesis with normalization, integration based on squared deviation, integration based on game theory, and integration based on goal optimization. The normalized synthetic method based on addition or multiplication directly adds or multiplies the criteria weights obtained by the subjective and objective weighting methods, with the chosen preference between them, and obtains the integrated weight of each criterion through normalization.
$$
w_{int} = \alpha w_{sub} + \beta w_{obj} \tag{6}
$$
where $w_{int}$, $w_{sub}$ and $w_{obj}$ are the integrated weight, subjective weight and objective weight, respectively, and $\alpha$ and $\beta$ are the relative importance of the subjective and objective weights, respectively, given by decision makers according to demand. In this study, the subjective weight and the objective weight are equally important.
4. Multi-criteria evaluation method
The purpose of multi-criteria evaluation is to obtain an integrated evaluation by summarizing all the criteria of each scheme, and then to judge the advantages and disadvantages of each scheme. Based on this evaluation, reasonable plans can be made for the energy-saving renovation of stations.
In this paper, the TOPSIS method is used to evaluate the energy consumption quality of each station. TOPSIS makes full use of the information in the original data and fully reflects the gaps between stations. Its results are faithful to the data and reliable, and it places no special requirements on the sample data. The steps of the TOPSIS method are as follows:
• Construct a normalized matrix and assign weights to form a weighted normalized matrix
$$
x_{ij} = \frac{w_j \, y_{ij}}{\sqrt{\sum_{i=1}^{m} y_{ij}^2}} \tag{7}
$$
where $y_{ij}$ is the value of object i for attribute j, and $w_j$ is the weight of attribute j.
• Determine the ideal solution and the anti-ideal solution
$$
x_j^{*} = \begin{cases} \max_i x_{ij}, & (H) \\ \min_i x_{ij}, & (L) \end{cases}
\qquad
x_j^{o} = \begin{cases} \max_i x_{ij}, & (L) \\ \min_i x_{ij}, & (H) \end{cases}
$$
• Calculate the distance between each scheme and the ideal solution and the anti-ideal solution
$$
d_i^{*} = \sqrt{\sum_{j=1}^{n} (x_{ij} - x_j^{*})^2} \tag{8}
$$
$$
d_i^{o} = \sqrt{\sum_{j=1}^{n} (x_{ij} - x_j^{o})^2} \tag{9}
$$
• Calculate the relative closeness to the ideal solution, that is, the final score of each scheme
$$
S_i = \frac{d_i^{o}}{d_i^{o} + d_i^{*}} \tag{10}
$$
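Equations (7) to (10) translate into a short Python sketch. It is an illustration rather than the authors' implementation: the function name `topsis`, the `higher_is_better` flags (True for an H criterion, False for an L criterion) and the three-station example are hypothetical. The sketch assumes every column contains at least one non-zero value and that the alternatives are not all identical, since either case would lead to a division by zero.

```python
import math


def topsis(values, weights, higher_is_better):
    """Closeness scores S_i for a list of alternatives (rows of `values`)."""
    m, n = len(values), len(values[0])
    # Eq. (7): vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(values[i][j] ** 2 for i in range(m))) for j in range(n)]
    x = [[weights[j] * values[i][j] / norms[j] for j in range(n)] for i in range(m)]
    cols = [[x[i][j] for i in range(m)] for j in range(n)]
    # Ideal and anti-ideal solutions, depending on the criterion direction.
    ideal = [max(c) if h else min(c) for c, h in zip(cols, higher_is_better)]
    anti = [min(c) if h else max(c) for c, h in zip(cols, higher_is_better)]
    scores = []
    for row in x:
        d_ideal = math.sqrt(sum((v - s) ** 2 for v, s in zip(row, ideal)))  # Eq. (8)
        d_anti = math.sqrt(sum((v - s) ** 2 for v, s in zip(row, anti)))    # Eq. (9)
        scores.append(d_anti / (d_anti + d_ideal))                          # Eq. (10)
    return scores


# Hypothetical data: 3 stations, an L criterion (energy use) and an H criterion
# (satisfaction), weighted equally.
data = [[10.0, 5.0], [20.0, 3.0], [30.0, 1.0]]
scores = topsis(data, [0.5, 0.5], [False, True])
print(scores)
```

In this example the first station is best on both criteria and the third is worst on both, so they coincide with the ideal and anti-ideal solutions and receive the extreme scores, with the second station in between.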
5. Case study
In order to verify the feasibility and rationality of the proposed method, this paper selected several stations in Beijing rail transit as examples. The names of the lines and stations have been withheld for reasons of confidentiality.
5.1. Criteria and integrated weights
In this paper, 21 stations of the subway company were selected for the evaluation of energy consumption quality. Among them, the data on energy consumption efficiency and emission level were measured by a subway operating company in Beijing. The data for the service level criteria were obtained through a questionnaire survey. The data of these stations are shown in Table 3.
TABLE 3. Attributes of Urban Rail Transit Stations
| Stations | Ec | El | Ee | Ic | S | R |
| A-1 | 19.18 | 60.10 | 35.80 | 4991 | 1.935 | 0.274 |
| A-2 | 21.94 | 69.93 | 31.54 | 4383 | 2.629 | 2.192 |
| A-3 | 18.33 | 64.85 | 42.29 | 3826 | 1.960 | 0.274 |
| A-4 | 20.19 | 79.22 | 38.83 | 4434 | 4.704 | 2.740 |
| A-5 | 32.80 | 65.59 | 35.78 | 7798 | 3.606 | 0.274 |
| A-6 | 41.44 | 87.84 | 19.89 | 6944 | 2.588 | 0.000 |
| A-7 | 77.20 | 26.22 | 26.22 | 6190 | 3.107 | 2.192 |
| B-1 | 19.50 | 59.79 | 40.29 | 4439 | 1.236 | 0.000 |
| B-2 | 21.52 | 67.26 | 30.94 | 4011 | 1.405 | 0.274 |
| B-3 | 22.07 | 60.08 | 26.98 | 3891 | 4.795 | 1.644 |
| B-4 | 19.41 | 51.75 | 46.57 | 6301 | 4.703 | 1.644 |
| B-5 | 33.41 | 62.47 | 33.41 | 6630 | 4.329 | 0.548 |
| B-6 | 60.38 | 60.38 | 17.48 | 5734 | 2.645 | 2.192 |
| B-7 | 33.68 | 52.54 | 35.03 | 6003 | 2.889 | 1.644 |
| B-8 | 23.07 | 61.07 | 36.64 | 6073 | 4.810 | 1.370 |
| B-9 | 17.22 | 47.60 | 25.32 | 6670 | 3.193 | 1.918 |
| B-10 | 40.28 | 72.16 | 41.95 | 6238 | 3.993 | 1.918 |
| C-1 | 24.25 | 28.92 | 21.45 | 2686 | 4.895 | 2.466 |
| C-2 | 41.61 | 49.18 | 11.35 | 3413 | 1.391 | 1.918 |
| C-3 | 19.15 | 55.07 | 21.55 | 2316 | 1.563 | 1.918 |
| C-4 | 70.24 | 33.66 | 29.27 | 3557 | 3.054 | 2.740 |
The subjective weight, objective weight and integrated weight obtained by the methods in Section 3 are shown in Table 4.
TABLE 4. Weight Values under Different Weighting Methods
| Criteria | Subjective weight | Objective weight | Integrated weight |
| Ec | 0.4135 | 0.0909 | 0.2522 |
| El | 0.1777 | 0.1819 | 0.1798 |
| Ee | 0.0458 | 0.0909 | 0.0684 |
| Ic | 0.1047 | 0.0909 | 0.0978 |
| S | 0.0430 | 0.2727 | 0.1578 |
| R | 0.2153 | 0.2727 | 0.2440 |
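The integrated weights in Table 4 can be checked against Eq. (6) with a few lines of Python. The sketch assumes that "equally important" means α = β = 0.5 and that the combined weights are renormalized to sum to 1; the function name `integrate` is hypothetical. The subjective, objective and reported integrated weights are copied from Table 4.

```python
CRITERIA = ["Ec", "El", "Ee", "Ic", "S", "R"]
SUBJECTIVE = [0.4135, 0.1777, 0.0458, 0.1047, 0.0430, 0.2153]
OBJECTIVE = [0.0909, 0.1819, 0.0909, 0.0909, 0.2727, 0.2727]
REPORTED = [0.2522, 0.1798, 0.0684, 0.0978, 0.1578, 0.2440]


def integrate(w_sub, w_obj, alpha=0.5, beta=0.5):
    """Additive integrated weights (Eq. 6), renormalised to sum to 1."""
    combined = [alpha * s + beta * o for s, o in zip(w_sub, w_obj)]
    total = sum(combined)
    return [c / total for c in combined]


computed = integrate(SUBJECTIVE, OBJECTIVE)
for name, value, reported in zip(CRITERIA, computed, REPORTED):
    # Differences should stay within the rounding of the published table.
    print(f"{name}: computed {value:.5f}, reported {reported:.4f}")
```

Under these assumptions the computed values agree with the integrated column of Table 4 to within rounding of the fourth decimal place.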
It can be seen that the weights calculated by the different methods differ considerably, especially for criteria Ec and S. According to experience, the energy consumption efficiency of the ventilation and air conditioning system is very important, so the subjective weight of Ec is high; however, its differences between stations are not obvious, which leads to a low objective weight for Ec. Similarly, experts consider passenger satisfaction to be of low importance, while the objective weight of S is far higher.
5.2. Results and analysis
According to the results obtained above, the TOPSIS method was used to score the 21 stations in the case. The score of station energy consumption quality under subjective weights is generally higher than that under objective weights, while the score obtained with integrated weights lies between the two. This is because, in the multi-criteria evaluation stage, the distance calculation in the TOPSIS method amplifies the influence of the weight coefficients: the subjective weights are more concentrated and the objective weights are more evenly distributed, so a difference in the total score arises after the squaring operation. This phenomenon is caused by the data characteristics of station energy consumption in this case and reflects the different emphases of the different weighting methods.
Different types of stations differ in equipment requirements and equipment operation modes. For example, ground stations can use natural light, while underground stations need all-weather lighting. Ground stations generally have no escalators at their entrances and exits, so they have fewer escalators overall. An above-ground station does not need to run its ventilation equipment for long periods, so its energy consumption is lower than that of an underground station, which must keep the ventilation equipment running for long periods. Transfer stations have higher and more complex passenger flows and higher service quality requirements.
TABLE 5. Scores and Rankings of Stations
| Stations | Subjective scores | Objective scores | Integrated scores |
| A-1 | 0.859 (2) | 0.634 (4) | 0.757 (1) |
| A-2 | 0.663 (12) | 0.333 (19) | 0.522 (15) |
| A-3 | 0.857 (3) | 0.630 (5) | 0.753 (2) |
| A-4 | 0.622 (15) | 0.388 (15) | 0.492 (16) |
| A-5 | 0.720 (8) | 0.714 (2) | 0.717 (5) |
| A-6 | 0.620 (16) | 0.646 (3) | 0.642 (8) |
| A-7 | 0.201 (21) | 0.369 (17) | 0.281 (20) |
| B-1 | 0.867 (1) | 0.616 (6) | 0.746 (3) |
| B-2 | 0.838 (4) | 0.598 (8) | 0.727 (4) |
| B-3 | 0.727 (7) | 0.560 (9) | 0.630 (10) |
| B-4 | 0.735 (6) | 0.550 (10) | 0.633 (9) |
| B-5 | 0.711 (11) | 0.731 (1) | 0.713 (6) |
| B-6 | 0.278 (19) | 0.299 (21) | 0.283 (19) |
| B-7 | 0.632 (14) | 0.446 (12) | 0.546 (13) |
| B-8 | 0.739 (5) | 0.600 (7) | 0.656 (7) |
| B-9 | 0.718 (9) | 0.440 (13) | 0.593 (11) |
| B-10 | 0.526 (18) | 0.427 (14) | 0.462 (17) |
| C-1 | 0.652 (13) | 0.471 (11) | 0.543 (14) |
| C-2 | 0.534 (17) | 0.345 (18) | 0.454 (18) |
| C-3 | 0.716 (10) | 0.369 (16) | 0.569 (12) |
| C-4 | 0.209 (20) | 0.306 (20) | 0.257 (21) |
Note: The ranking of each station is in brackets.
In order to make an accurate and reasonable horizontal comparison, stations should be classified and evaluated according to their laying modes and functions. In this paper, stations are divided into underground transfer stations, underground non-transfer stations, and above-ground non-transfer stations. As can be seen from Table 5, in most cases the energy consumption quality score of a station is higher under subjective weighting and lower under objective weighting. There are some exceptions, such as station A-7, whose equipment and service levels are high, which makes up for its disadvantage in the energy consumption efficiency criteria.
The integrated weight balances the subjective and objective criteria and corrects the traditional expert scoring method. Besides, the shapes of the radar charts are similar, so the integrated weighting does not overturn the overall trend of the original evaluation or substantially change the original station ranking, which indicates that it is reasonable.
6. Conclusion
In this paper, multi-criteria evaluation is used to evaluate the energy consumption quality of urban rail transit stations. Service level is added to the evaluation criteria system to make the evaluation more comprehensive. Classifying stations by type makes the horizontal comparison more reasonable and accurate. Subjective and objective perspectives are combined in weighting and evaluation, and TOPSIS is used to obtain multi-dimensional and intuitive evaluation results. The work has certain reference significance for the evaluation of station energy quality.
Acknowledgment
This research was supported by the Fundamental Research Funds (2019JBM334), the People's Republic of China.
References
[1] DAI Huaming, LI Zhaoxing, SONG Jie. Current Status of Energy Consumption of Urban Rail Transit in Beijing and Suggestions for Energy Saving Measures[J]. Railway Technology Innovation, 2016(4).
[2] YUAN Hongwei, KONG Lingyang. Influencing Factors and Measurement of Urban Rail Transit Energy Consumption[J]. Urban Rapid Transit, 2012, 25(2).
[3] HU Zilin. Energy-saving analysis of frequency conversion chillers in Shenzhen Metro Line 9[C]// Proceedings of the 4th National Symposium on Tunnels and Underground Space and Transportation Infrastructure for Operational Safety and Energy Conservation. 2013.
[4] LI Ganhua. Research on Energy Saving and Environmental Protection Measures in Construction[J]. Science and Technology Innovation Review, 2015, 12(11): 193-193.
[5] CHEN S, LENG Y, MAO B. Integrated weight-based multi-criteria evaluation on transfer in large transport terminals: A case study of the Beijing South Railway Station[J]. Transportation Research Part A: Policy and Practice, 2014, 66: 13-26.
[6] BAI Wei, ZHOU Yuhe, QIU Yu, et al. Energy-saving control method for subway trains in the long downhill section[J]. China Railway Science, 2018.
[7] FENG Jia, XU Qi, FENG Xu-Jie, et al. Analysis of Factors Affecting Energy Consumption of Rail Transit Based on Grey Correlation Degree[J]. Journal of Transportation Systems Engineering and Information Technology, 2011, 11(1): 142-146.
[8] ACAMPORA G, LANDI C, LUISO M, et al. Optimization of Energy Consumption in a Railway Traction System[C]// International Symposium on Power Electronics. IEEE, 2006.
[9] LIU Qiuyan, WU Xinxin. Review on the Method of Determining Index Weights in Multi-element Evaluation[J]. Knowledge Management Forum, 2017(6).
[10] SAATY T L. A Scaling Method for Priorities in Hierarchical Structures[J]. Journal of Mathematical Psychology, 1977, 15(3): 234-281.
[11] ZHU Yin, MENG Zhiyong, YAN Shuyu. Calculating Weights by Analytic Hierarchy Process[J]. Journal of Northern Jiaotong University, 1999, 23(5): 119-122.
[12] CAO Xiuying, LIANG Jingguo. Method for determining attribute weight based on rough set theory[J]. CMS, 2002, V(5): 98-100.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200045
On Impact of Turn-Backs on Capacity at Urban Rail Intermediate Stations
Jiaqi WU a,1, Baohua MAO a, Yao WANG a, and Qi ZHOU a
a Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing
Abstract: The turn-back ability of the intermediate station is a key factor determining the carrying capacity of a rail line in the multi-routing mode of urban rail transit. In this paper, we analyze the calculation principle of the turn-back ability of the intermediate station and establish a calculation model of the minimum turn-back interval time. Taking 2.5 minutes as the train headway, we analyze the limiting factors of the intermediate turn-back station when the ratio of short routing trains to long routing trains is 1:1, and use simulation to measure the relevant influencing factors. The results show that a line-front turning-back station with single cross line restricts the capacity of the rail line, while a line-behind turning-back station does not. When the interval is less than 2.5 minutes, a line-behind station with double turn-back lines can be used to meet the turn-back ability requirement. In addition, the arrival speed at the station, the train length, the position at which the train enters the station and the train dwell time at the station all have a certain impact on the turn-back ability, with the dwell time and the arrival speed having the greatest influence. When the arrival speed is less than 60 km/h, increasing the speed improves the turn-back ability markedly.
Keywords: Urban rail transit; Intermediate station; Turn-back capacity
1. Introduction
With urban population growth and urban expansion, the multi-routing mode of urban rail transit adapts better to unbalanced passenger flow demand. At present, many urban rail transit lines, such as Shanghai Metro Line 1 and Beijing Metro Line 10, have adopted different train routing modes to adapt to different passenger flow demands. At the beginning of the construction of an urban rail transit line, it is generally not possible to purchase many vehicles. Train routing schemes such as long-short routing can accelerate the turn-round of rolling stock, reduce the number of vehicles required and improve the utilization efficiency of metro vehicles. In the multi-routing mode, trains need to turn back at an intermediate station, where there is a certain conflict between the short-routing trains and the long-routing trains, which restricts the line capacity.
Fund Project: the Fundamental Research Funds (2019JBM334).
1 Corresponding Author: Jiaqi WU (1995-), female, from Chongqing, master, [email protected]
J. Wu et al. / On Impact of Turn-Backs on Capacity at Urban Rail Intermediate Stations
Domestic and foreign scholars have carried out comprehensive research on the turn-back ability of ordinary terminal turn-back stations, including analysis of the factors influencing line capacity [1]-[3], calculation methods for the turn-back interval [4]-[6], turn-back ability optimization [7]-[9] and so on. Research on multi-routing is still at an early stage. Xu Ruihua et al. [10] studied the influence of the long-short routing mode on the carrying capacity of rail lines and on the rolling stock scheduling of urban rail transit. Chen et al. [11] analyzed the minimum time intervals between trains at intermediate stations and proposed a computational model to calculate the interval. Zhang Guobao et al. [12] analyzed the calculation method of the turn-back ability of the intermediate station under different train routing modes, and proposed optimization measures for the turn-back ability of the intermediate station based on quantitative analysis.
At an intermediate turn-back station, the mutual interference between short-routing and long-routing trains affects the turn-back ability, and there is relatively little research in this area. This paper provides suggestions for improving the turn-back ability of intermediate turn-back stations by studying the calculation method of the turn-back ability under different train routing modes and the limiting factors of the turn-back ability of the intermediate station.
2. Calculation Principle of Intermediate Station Turn-back Ability
The turn-back ability is mainly determined by the minimum arrival-departure interval time of the trains. The arrival-departure interval is defined as the minimum interval at which adjacent turn-back trains can normally complete the turn-back operation process, taking mutual interference into account. The trains at the intermediate station include short routing trains and long routing trains. Assuming the ratio of the two types of train is 1:1, the arrival-departure interval of the intermediate station is determined by the interval between adjacent trains of the two types, as shown in Figures 1 and 2. T1-T4 represent the long routing trains and R1 represents the short routing train. The following is a detailed analysis of the intermediate line-front and line-behind turn-back stations with single cross line.
[Figures 1 and 2: operation diagrams of trains T1-T4 and R1, marking the intervals I_{T1-R1}, I_{R1-T2}, I_{T3-R1} and I_{R1-T4}.]
Figure 1. The operation diagram of line-front turning-back station
Figure 2. The operation diagram of line-behind turning-back station
J. Wu et al. / On Impact of Turn-Backs on Capacity at Urban Rail Intermediate Stations
2.1 Calculation of the Station-front Turn-back Ability
In Fig. 3, point A is the position at which the train starts to brake when the signal is not open. Point B is the adjacent track-circuit location: only after the departing train has completely passed point B can the route from S1 to S2 be arranged. Point C is the position where the train stops at the platform. Point D is the boundary between the station section and the crossover switch section. Next, we take straight-in arrival and lateral departure at a line-front turning-back station with single cross line as an example.
Figure 3. Schematic diagram of line-front turning-back station
The specific operation process is as follows. The T1 route is handled and the S1 signal opens; T1 arrives at point A and enters in the A-D direction. The train stops at the upstream platform, passengers get on and off, and the departure route is handled at the same time. After the stop, T1 runs in the D-C direction until it has completely passed point M, after which the R1 route is handled. R1 enters in the A-D direction, stops at the upstream platform, and passengers get on and off. Meanwhile, T3 stops at the downstream platform, passengers get on and off, and its departure route is handled at the same time. After the stop, T3 runs in the D'-B direction until it has completely passed point B, after which the R1 departure route is handled. R1 then runs in the D-B direction until it has completely passed point B, after which the T2 and T4 routes are handled. The calculation model of the minimum turn-back interval time of adjacent trains is shown in Table 1.
Table 1 Calculation model of the minimum turn-back interval time of line-front station
$I_{T1-R1} = t_1 + t_2^{R1} + t_3^{T1} + t_4^{T1}$
$I_{R1-T2} = t_1 + t_2^{T2} + t_3^{R1} + t_5^{R1}$
$I_{T3-R1} = t_1 + t_5^{T3}$
$I_{R1-T4} = t_1 + t_2^{T4} + t_3^{R1} + t_5^{T4}$
$I_{\text{line-front station}} = \max\{I_{T1-R1}, I_{R1-T2}, I_{T3-R1}, I_{R1-T4}\}$
Superscripts denote the train to which a time applies. t1 represents the time to handle the route; t2 represents the travel time from point A to point C; t3 represents the stop time for passengers to get on and off the train; t4 represents the time for the long routing train to completely pass point M; t5 represents the time for the short routing train to completely pass point B.
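The structure of Table 1 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each operation is given a single duration regardless of the train, the function name `line_front_intervals` is hypothetical, and the sample values are made up rather than taken from the paper.

```python
# Illustrative sketch of Table 1 (line-front station). All times in seconds.
# Assumption: one duration per operation, shared by every train.

def line_front_intervals(t1, t2, t3, t4, t5):
    """Return the four adjacent-train intervals and the governing one.

    t1: route handling time
    t2: travel time from point A to point C
    t3: dwell time for passengers to get on and off
    t4: time for the long routing train to clear point M
    t5: time for a departing train to clear point B
    """
    intervals = {
        "T1-R1": t1 + t2 + t3 + t4,
        "R1-T2": t1 + t2 + t3 + t5,
        "T3-R1": t1 + t5,
        "R1-T4": t1 + t2 + t3 + t5,
    }
    # The minimum turn-back interval of the station is the largest of these.
    return intervals, max(intervals.values())


# Made-up sample durations, for illustration only.
intervals, governing = line_front_intervals(t1=15, t2=30, t3=30, t4=30, t5=30)
print(intervals, governing)
```

Taking the maximum reflects that the slowest pair of adjacent trains sets the interval the station can sustain.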
2.2 Calculation of the Station-behind Turn-back Ability
As shown in Fig. 4, the station-behind single turn-back line is taken as an example for analysis. Only the E-G segment is used as the turn-back line. The train enters the station from point A, and point B is the adjacent track-circuit location.
Figure 4. Schematic diagram of line-behind turning-back station
The specific operation process is as follows. The T1 route is handled; T1 arrives at point A and enters in the A-D direction. The train stops at the upstream platform, passengers get on and off, and the departure route is handled at the same time. After the stop, T1 runs in the D-C direction until it has completely passed point M, after which the R1 route is handled. R1 enters in the A-D direction, stops at the upstream platform, and passengers get off while the route into the turn-back line is handled. R1 enters the turn-back line in the D-G direction. Once R1 has completely passed point F, the route for T2 to enter the station can be handled, and T2 enters in the A-D direction. Meanwhile, T3 stops at the downstream platform, passengers get on and off, and its departure route is handled. After the stop, T3 runs in the D'-B direction until it has completely passed point B, after which the route for R1 to enter the downstream platform is handled. R1 stops at the downstream platform for passengers to get on, and its departure route is handled. R1 then runs in the D-B direction until it has completely passed point B, after which the T4 route is handled. The calculation model of the minimum turn-back interval time of adjacent trains is shown in Table 2.
Table 2 Calculation model of the minimum turn-back interval time of line-behind station
$I_{T1-R1} = t_1 + t_2^{R1} + t_3^{T1} + t_4^{T1}$
$I_{R1-T2} = t_1 + t_2^{T2} + t_3^{R1} + t_6^{R1}$
$I_{T3-R1} = t_1 + t_5^{T3} + t_7^{R1} + t_3^{R1}$
$I_{R1-T4} = t_1 + t_2^{T4} + t_3^{R1} + t_5^{T4}$
$I_{\text{line-behind station}} = \max\{I_{T1-R1}, I_{R1-T2}, I_{T3-R1}, I_{R1-T4}\}$
t1-t5 are as defined for Table 1. t6 represents the time for the short routing train to completely pass point F on its way into the turn-back line; t7 represents the travel time from the turn-back line to the downstream platform.
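The line-behind model of Table 2 can be sketched in the same way, with a conversion from interval to hourly capacity added. The conversion 3600 / I is a common convention assumed here, not a formula stated in the paper. The function names and sample durations are hypothetical, and each operation is again given one duration for all trains.

```python
# Illustrative sketch of Table 2 (line-behind station). All times in seconds.
# Assumptions: one duration per operation; capacity = 3600 / interval.

def line_behind_interval(t1, t2, t3, t4, t5, t6, t7):
    """Return the governing (largest) interval between adjacent trains.

    t6: time for the short routing train to clear point F
    t7: travel time from the turn-back line to the downstream platform
    """
    intervals = [
        t1 + t2 + t3 + t4,  # T1 followed by R1
        t1 + t2 + t3 + t6,  # R1 followed by T2
        t1 + t5 + t7 + t3,  # T3 followed by R1
        t1 + t2 + t3 + t5,  # R1 followed by T4
    ]
    return max(intervals)


def trains_per_hour(interval_s):
    """Convert a minimum interval in seconds to trains per hour."""
    return 3600.0 / interval_s


# Made-up sample durations, for illustration only.
interval = line_behind_interval(t1=15, t2=30, t3=30, t4=30, t5=30, t6=40, t7=40)
print(interval, round(trains_per_hour(interval), 2))
```

With these sample values the two intervals involving the turn-back line (the R1-T2 and T3-R1 pairs) are the largest, which is the kind of bottleneck the station-behind layout introduces.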
3. Analysis of the Impact of the Turn-back Interval Time on Ability of Intermediate Turn-back Stations
Section 2 described the common turn-back operation process and established the calculation model of the minimum turn-back interval time. Next, we use the multi-train operation diagram to analyze the impact of the turn-back interval on the ability of intermediate turn-back stations. Assume that the ratio of short routing trains to long routing trains is 1:1 and that the train headway is 2.5 minutes. The train operation time standard is given in Table 3.
Table 3 Related parameters setting of train turn-back process
Operation                                          Duration (s)
Handling the route                                 15
Train travel time entering the station             25-40
Train travel time departing the station            25-35
Passengers getting on or off the train             30
Train travel time entering the turn-back line      35-40
Train travel time departing the turn-back line     35-40
Performing the reversing operation                 10
*Data from a station in Nanjing Metro.
3.1 Line-front Turn-back Station
Take the line-front turning-back station with single cross line (Fig. 3) as an example. The short routing train and the long routing train interfere with each other, and the composition of the departure-arrival interval between two adjacent trains is shown in Fig. 5. In Fig. 5, I1 represents the stop time and I2 the departure-arrival interval; t1 represents the time for the short routing train to completely pass point B, t2 the time to handle the route, and t3 the travel time from point A to point C (these symbols are local to Fig. 5 and differ from those of Table 1). Which quantity constrains the ability depends on how long each part of the operation takes.
Figure 5. The operation time of each part of the train at the intermediate line-front turn-back station (I1; I2 = t1 + t2 + t3)
In the first case (Fig. 6), I1 is 90 s and I2 is 60 s. According to the train operation time standard in Table 3, passengers need at least 60 s to get off and on the train, so the stop time is sufficient. However, the departing train needs 25-35 s to pass point B, handling the route needs at least 15 s, and the arriving train needs 25-40 s to enter the station. I2 therefore has to be 65-90 s, which exceeds the 60 s available, and its components are difficult to compress. In this case, the running line of the following train has to be moved later, which increases the train tracking headway and reduces the line capacity. The turn-back ability of the line-front turning-back station with single cross line is then restricted by I2.
Figure 6. The multi-train operation diagram of line-front turning-back station (I1 = 90 s, I2 = 60 s; train headway 2.5 min)
Figure 7. The multi-train operation diagram of line-front turning-back station (I1 = 60 s, I2 = 90 s; train headway 2.5 min)
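The timing check for a given split of the 2.5 min headway into I1 and I2 can be sketched as follows. The durations are the Table 3 ranges and the 60 s boarding-and-alighting requirement quoted in the text. The function names and the returned dictionary layout are hypothetical.

```python
# Illustrative budget check for a line-front station. All times in seconds.
# Each component is a (low, high) range taken from Table 3.
CLEAR_POINT_B = (25, 35)   # departing train completely passes point B
HANDLE_ROUTE = (15, 15)    # route handling
ENTER_STATION = (25, 40)   # arriving train travels into the station
DWELL_NEEDED = 60          # passengers get off and on


def required_range(*components):
    """Add up the lower and upper bounds of consecutive operations."""
    low = sum(c[0] for c in components)
    high = sum(c[1] for c in components)
    return low, high


def check_split(i1, i2):
    """Compare an (I1, I2) split against the required times."""
    low, high = required_range(CLEAR_POINT_B, HANDLE_ROUTE, ENTER_STATION)
    return {
        "i2_required": (low, high),
        "dwell_ok": i1 >= DWELL_NEEDED,
        "i2_ok_best_case": i2 >= low,
        "i2_ok_worst_case": i2 >= high,
    }


print(check_split(90, 60))  # split shown in Fig. 6
print(check_split(60, 90))  # split shown in Fig. 7
```

In the Fig. 6 split, I2 falls short even in the best case. In the Fig. 7 split, both checks pass, but with no margin on the stop time.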
In the second case (Fig. 7), I1 is 60 s and I2 is 90 s: with the train headway kept at 2.5 minutes, the split between I1 and I2 is adjusted. According to the train operation time standard in Table 3, an I2 of 90 s accommodates the train operation process, and a stop time of 60 s just meets the time passengers need to get on and off the train. The stop time now has no margin: once getting on and off takes longer, the following train is affected, the train tracking headway increases and the line capacity is reduced. In this case, the turn-back ability of the line-front turning-back station with single cross line is restricted by I1. In summary, the ability of the line-front turning-back station with single cross line is limited by both the stop time and the departure-arrival interval. The operations are closely linked and their durations are difficult to compress.
3.2 Line-behind Turn-back Station
Take the line-behind turning-back station (Fig. 4) as an example. The departure-arrival interval between two adjacent trains is shown in Fig. 8, where I represents the stop time, t1 the time for passengers to get off the train, t2 the time for the short routing train to completely pass point F, t3 the time for passengers to get on the train, t4 the time for the short routing train to completely pass point B, t5 the travel time to enter the turn-back line, t6 the time to handle the route and perform the reversing operation, and t7 the travel time from the turn-back line to the downstream platform. Under normal circumstances, the train must stop at the station for at least 30 s for passengers to get on or off. It takes 35-40 s for the train to enter or depart the turn-back line, 10 s to perform the reversing operation, and 15 s to handle the route.
When the train headway is 2.5 minutes, the multi-train operation diagram of the intermediate line-behind turning-back station with single cross line is as shown in Figure 9. Both the short routing train and the long routing train have sufficient time to complete all parts of the operation without interference.
Figure 8. Comparison of delay (stop time I and operation times t1-t7)
The above analysis shows that, with a headway of 2.5 min, the turn-back capacity of the intermediate station does not restrict the line capacity.
Figure 9. The multi-train operation diagram of turning-back station with single cross line (train headway 2.5 min)
Figure 10. The multi-train operation diagram of turning-back station with double cross line (train headway 2 min)
If the tracking headway is shortened further, the headway between adjacent short routing and long routing trains becomes shorter, and so does the interval between two adjacent turn-back trains. If the station-behind single turn-back line is still used, the two adjacent turn-back trains may then conflict. The whole turn-back process also becomes very tight: once any operation takes longer than planned, the ability is affected, and there is little room for adjustment in practice. In this situation the station-behind double turn-back line can be used, as shown in Figure 10. Although this shortens the tracking headway, it increases the time the turn-back train stays on the turn-back line and therefore the train turnaround time. To maintain the same interval, more vehicles are then needed.
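The link between turnaround time and fleet size can be illustrated with the usual rule of thumb that the number of trains in service is the turnaround (round-trip) time divided by the headway, rounded up. This rule is an assumption brought in for illustration; the paper does not give a formula. The function name and all numbers below are hypothetical.

```python
import math


def trains_required(turnaround_time_s, headway_s):
    """Trains needed to hold a headway over a given turnaround time."""
    return math.ceil(turnaround_time_s / headway_s)


# Hypothetical figures: a 3600 s turnaround at a 150 s headway, then the
# same service with 60 s of extra waiting on the turn-back line.
print(trains_required(3600, 150))
print(trains_required(3660, 150))
```

Even a modest increase in time spent on the turn-back line can push the requirement up by a whole train once the ratio crosses an integer boundary.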
4. Measure the Influencing Factors of Turn-back Ability
Following the calculation principle of intermediate station turn-back ability in Section 2, the factors affecting the turn-back ability are measured: the speed of the train arriving at the station, the length of the train, the position at which the train enters the station, the train dwelling time at the station, and the slope of the turn-back line. Taking station B as an example, the influence of these factors on the ability of a single-track line-behind station is analyzed. Station B is an intermediate line-behind turning-back station with single cross line (Fig. 4). The station length is 144 m, the platform width is 14 m, and the station line spacing is 17 m. It is arranged with a No. 9 turnout. The line spacing of the turn-back line is 5 m. The safe braking distance of the train is 400 m. The X1, X2, X7, and X8 signals are 5 m from the end of the platform, and the HD and H'D' segments are 13.839 m long. The distance from point B to signal C' is 50 m. The lateral sections HE' and EH' of the turnout zone are 54.2323 m long, the E'F segment is 45.3142 m long, and the straight section of the turnout is 45.0673 m long. The FG segment is 169.0015 m, which comprises the 13.839 m from the center of the ballast to the beginning of the ballast and the 155.1625 m effective length of the turn-back line. The train turn-back process simulation model (Fig. 11) is built in the "Urban Rail Transit Train Operation Calculation Experiment System" software from Beijing Jiaotong University. This section takes the line-behind turning-back station as an example. A, D, F, G, D', C' and B are the main nodes of station B: point A is where the train enters the station, point D is where the train stops at the station, and point F is the point the train must pass, on its way into the turn-back line, to completely clear the upstream platform.
Point G is the position where the train stops on the turn-back line, and point D' is the position at which the train, entering the downstream platform from the turn-back line, has completely cleared the turn-back line. C' is the position where the train stops at the downstream platform, and point B is the position at which the departing train has completely cleared the downstream platform. Fig. 12 shows the speed-distance (V-S) curve of the turn-back train operation process.
Figure 11. Simulation model of station B
Figure 12. The train operation diagram of simulation
Combining the simulation with the formulas of Table 2 gives the results shown in Table 4.
Table 4 Simulation result
The speed arriving at the station (km/h)                             40     50     60     70     80
Turn-back ability (columns/h)                                        25.51  28.38  29.17  29.74  29.94
The train dwelling time at station (s)                               20     30     40     50     60
Turn-back ability (columns/h)                                        32.65  29.94  27.64  25.67  23.96
The length of train (m)                                              60     80     100    120    140
Turn-back ability (columns/h)                                        31.84  31.34  30.86  30.39  29.94
The distance of station and the position entering the station (m)   200    250    300    350    400
Turn-back ability (columns/h)                                        32.95  32.07  31.26  30.65  29.94
The slope of turn-back line (‰)                                      -2     -1     0      1      2
Turn-back ability (columns/h)                                        29.97  30.21  29.94  29.71  29.95
As the speed of the train arriving at the station increases from 40 km/h to 80 km/h, the turn-back ability increases from 25.5 columns/h (trains per hour) to 29.9 columns/h. The simulation results show that the gain is most pronounced below 60 km/h: between 40 km/h and 60 km/h, each 10 km/h increase in speed raises the turn-back ability by about 1.85 columns/h on average. As the train dwelling time increases from 20 s to 60 s, the turn-back ability decreases from 32.65 columns/h to 23.96 columns/h, that is, by 2.17 columns/h for every 10 s of additional dwelling time. The capability of the intermediate station can therefore be improved by optimization measures that reduce the stop time of the train. As the train length increases from 60 m to 140 m, the turn-back ability decreases from 31.84 columns/h to 29.94 columns/h, that is, by 0.48 columns/h for every 20 m of additional train length. Shortening the train therefore improves the turn-back ability to some extent. As the distance between the station and the position at which the train enters the station increases from 200 m to 400 m, the turn-back ability decreases from 32.95 columns/h to 29.94 columns/h. Each 50 m reduction in this distance increases the turn-back ability by about 0.75 columns/h, so shortening the distance improves the turn-back ability. The slope of the turn-back line has little effect: the turn-back ability stays at around 29.9 columns/h as the slope increases from -2‰ to 2‰.
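The average sensitivities can be recomputed directly from the Table 4 values. The short sketch below does this; the dictionary keys and the helper name `mean_change_per_unit` are hypothetical, and the figures are end-to-end averages over each range rather than a fitted model.

```python
# Table 4 values: (factor settings, turn-back ability in columns/h).
TABLE_4 = {
    "arrival_speed_kmh": ([40, 50, 60, 70, 80],
                          [25.51, 28.38, 29.17, 29.74, 29.94]),
    "dwell_time_s": ([20, 30, 40, 50, 60],
                     [32.65, 29.94, 27.64, 25.67, 23.96]),
    "train_length_m": ([60, 80, 100, 120, 140],
                       [31.84, 31.34, 30.86, 30.39, 29.94]),
    "entry_distance_m": ([200, 250, 300, 350, 400],
                         [32.95, 32.07, 31.26, 30.65, 29.94]),
}


def mean_change_per_unit(xs, ys, first=0, last=None):
    """Average change in ability per unit of the factor between two rows."""
    last = len(xs) - 1 if last is None else last
    return (ys[last] - ys[first]) / (xs[last] - xs[first])


# Speed: per 10 km/h, over the 40-60 km/h range only (rows 0 to 2).
speed = 10 * mean_change_per_unit(*TABLE_4["arrival_speed_kmh"], first=0, last=2)
dwell = 10 * mean_change_per_unit(*TABLE_4["dwell_time_s"])         # per 10 s
length = 20 * mean_change_per_unit(*TABLE_4["train_length_m"])      # per 20 m
distance = 50 * mean_change_per_unit(*TABLE_4["entry_distance_m"])  # per 50 m

print(round(speed, 2), round(dwell, 2), round(length, 2), round(distance, 2))
```

The distance figure is taken per 50 m because that is the step size of the table; over a full 100 m the change is twice as large.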
5. Research conclusion
(1) Assuming that the ratio of short routing trains to long routing trains is 1:1 and that the train headway is 2.5 minutes, the line-front turning-back station with single cross line restricts the capacity of the rail line. The main limiting factors are the stop time and the train arrival-departure interval. The line-behind turning-back station imposes no such restriction at this headway. When the train headway is less than 2.5 min, the required interval between two adjacent trains cannot be met, and a line-behind station with double turn-back line can be used instead.
(2) The speed of the train arriving at the station and the train dwelling time at the station have a relatively large impact on the turn-back ability. Taking the simulation results of the single-track line-behind station as an example, every 10 s increase in dwelling time reduces the turn-back ability by 2.17 columns/h. When the train speed is below 60 km/h, each 10 km/h increase in arrival speed raises the ability by about 1.85 columns/h. The length of the train and the position at which the train enters the station also have some impact: every 20 m increase in train length reduces the ability by 0.48 columns/h, and every 50 m reduction in the distance between the station and the position entering the station increases it by about 0.75 columns/h.
References
[1] R. L. Burdett, E. Kozan. Techniques for absolute capacity determination in railways. Transportation Research Part B, Vol. 40, pp. 616-632(2009).
[2] D. C. Gill. Assessment of mass transit turn-back capacity using dynamic simulation models. Computers in Railways VII, 2009: 1077-86.
[3] T. He. Turn-back capacity of urban rail transit lines based on analytical method. Proceedings of the 7th Chinese Traffic Forum. Beijing Jiaotong University, 2011: 139-144.
[4] Z. B. Jiang, Y. Rao. Turnback capacity assessment at rail transit stub-end terminal with multi-tracks. Journal of Tongji University (Natural Science), Vol. 45, No. 9, pp. 1328-1335(2017).
[5] K. Qiu. The line reentry capability and signal system of rail transit. Electric Railway, 2010: 40-43.
[6] Z. Y. Zhang, B. H. Mao, Y. K. Jiang. Calculation method for station-end turn-back capacity of urban rail transit based on train traction. Systems Engineering-Theory & Practice, Vol. 33, No. 2, pp. 450-455(2013).
[7] J. B. Liang. Analysis of subway signal equipment and station turn-back capacity. Tunnel & Underground Engineering, Vol. 32, No. 1, pp. 93-96(2014).
[8] Y. C. Tang, Y. F. Long, C. G. Xiao. Yard type research on the enhancing back-returning capacity of metro terminal station. Journal of Railway Engineering Society, 2013: 90-93.
[9] Y. Chen, B. H. Mao, M. G. Li. Analysis of the reentry ability of post-station double-return line based on safety section. Urban Rapid Rail Transit, Vol. 27, No. 5, pp. 55-59(2014).
[10] R. H. Xu, J. J. Chen, S. M. Du. Study on carrying capacity and use of rolling stock with multi-routing in urban rail transit. Journal of Railway Engineering Society, Vol. 27, No. 4, pp. 6-10(2005).
[11] Y. Chen, B. H. Mao, Y. Bai. Capacity analysis on intermediate turn-back stations with multi-routing in urban rail transit. Journal of Transportation Systems Engineering and Information Technology, Vol. 17, No. 3, pp. 150-156(2017).
[12] G. B. Zhang, M. S. Liu, R. H. Xu. Analysis on carrying capacity for URT train turn-back at wayside station. Urban Mass Trans, 2005: 31-35.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200046
Performance Comparison of Feature Extraction Methods for Iris Recognition
J. Jenkin Winston1, D. Jude Hemanth2
1,2 Dept. of ECE, Karunya Institute of Technology and Sciences, Coimbatore, India
1 [email protected], [email protected]
Abstract. The traditional way of providing security is not enough for providing authentication over a large population. In this era of digital advancements and artificial intelligence, biometric security systems are transforming such security problems across the world. The characteristics of the iris pattern are extraordinarily unique. Hence it is extremely favored among other biometric modalities. In the iris image based biometric system, each image of the iris pattern is transformed into a set of distinct features by the process called feature extraction. It is one of the key steps in any recognition system. In this paper, statistical features are extracted from different domains like the histogram of intensity level, local binary pattern (LBP), histogram of oriented gradients (HoG), Eigenspace and moments of the iris image. A comparison of these different feature extraction methods for iris biometric system is discussed using minimum distance classifier. The experimental results show moment based feature extraction performs better than other feature extraction methods.
Keywords. Moments, Feature extraction, Artificial intelligence, Iris biometrics, Machine learning.
1. Introduction
Security has become a major concern across all sectors. Earlier, passwords, ID cards, and keys were used to protect private data and to control access to private or public areas. Checking the authenticity of such credentials for a large population, however, is laborious and time-consuming. With the rapid development of digital technologies, these issues are resolved by deploying biometric systems at the terminals. A biometric is a physical or behavioral measurement of a person that is unique and can therefore distinguish one individual from others. Biometric systems identify these differences in biometric traits and authenticate an individual. Among the different biometric traits, the iris is the most widely used, because it is highly unique even between twins and does not change over the lifetime of the individual. Following contributions from several renowned researchers, iris-based biometric systems are now used in many government projects, such as Aadhaar in India for identifying citizens, border surveillance in the UAE and immigration systems in the UK [1]. Iris recognition is also incorporated to secure personal electronic gadgets. This indicates a huge market for this biometric technology. The reliability of such biometric systems depends on a low false acceptance rate and a high recognition rate [2]. Choosing a proper feature vector around the iris region can enhance the robustness of the biometric system. The feature vector is a numerical measurement which can help
J.J. Winston and D.J. Hemanth / Performance Comparison of Feature Extraction Methods
the machine process the data and perform statistical analysis. Many methods have been proposed to extract the feature vector from the iris image. Comparing the performance of the biometric system under different feature vectors can help researchers and industrialists choose the best feature vector for their projects. The feature vector can be in the spatial domain, the frequency domain or the spatial-frequency domain. A good feature vector must represent both the local and the global information present in the iris region. It should also be invariant to the noise, scale and translation effects which occur while capturing the iris image in an unconstrained environment. A novel method for feature extraction through wavelet Mel-coefficients is proposed in [3]. The wavelets are derived from the prominent biorthogonal Cohen-Daubechies-Feauveau 9/7 filter bank, which offers high frequency resolution and better spatial-frequency localization. To reduce the dimensions, the coefficients are divided on a logarithmic basis. The feature vectors are extracted by computing the discrete cosine transform of these wavelet coefficients. The iris texture information is better characterized by this method. A code-level approach is recommended by N. Liu et al. in [4]. A Markov model is adapted to develop a binary feature code, and a weighted map is then used to enhance this code. The method is best suited for iris images that vary in imaging sensor, subject condition, and image quality. In [5], F. Alonso-Fernandez et al. suggest the scale invariant feature transform (SIFT) for iris recognition. This method does not need the segmentation and normalization steps followed in conventional iris biometric systems. The texture information around the SIFT feature points is matched using the SIFT operator. This approach is appropriate for iris recognition on the move.
Himanshu Rai has taken advantage of the textural variation of the iris region in [6]. HAAR wavelet coefficients and 1D Gabor wavelets are computed from the zigzag collarette area around the iris. A combined SVM and Hamming distance classifier provides better accuracy for this approach. In [7], the feature vectors are extracted from the iris region using a multiscale Taylor expansion, and the authors experiment with three approaches. Firstly, a phase-based representation of the first-order and second-order Taylor series expansion coefficients is used. Secondly, local extremum points of the first two coefficients of the expansion are used. Thirdly, a combination of the first two methods is used. This method is robust against illumination variations. The local texture information of the iris region is captured using a weighted co-occurrence phase histogram in [8]. The uncertainty introduced by the image gradients in phase angle estimation is balanced by a weighting function. The histogram is computed from co-occurring pixels which are at a fixed distance from each other. This method helps to capture the local features of the iris region and is also insensitive to illumination variations. The paper is organized as follows. Section 2 presents the block diagram of the basic iris-based biometric system and outlines the comparison method. Section 3 discusses the five feature extraction techniques taken for the comparison study. Section 4 reports the results obtained using each feature extraction method. Conclusions are discussed in Section 5.
2. Block diagram of Iris based biometric system
The basic steps in any iris-based recognition system are as follows: 1) iris image capture, 2) pre-processing, 3) localization, 4) normalization, 5) feature extraction and 6) feature matching. The block diagram of the iris-based biometric system is shown in Figure 1.
Figure 1. Block diagram of iris based biometric system
The iris image can be captured by any good-quality iris capture device operating in the visible or infrared band. In an unconstrained environment, the captured image may contain artifacts such as improper illumination. An appropriate enhancement technique, such as filtering or histogram equalization, can be used to pre-process the captured image. Localization is the process of locating the pupil boundary and the iris boundary in the pre-processed iris image and segmenting the iris region. Through normalization, the segmented circular iris region is unwrapped into a rectangular block of fixed size. Feature vectors are extracted from the normalized iris region and matched against the features stored in the database. Based on the matching score, the iris images are classified. In this work, five different feature spaces, namely the histogram of intensity levels, the local binary pattern, the histogram of oriented gradients, the Eigenspace and moments, are considered, and statistical features are computed from each. The performance of these feature vectors is analyzed using a minimum distance classifier. Figure 2 shows the outline of the proposed work.
3. Feature Extraction
A large dataset can heavily load the computation process. Feature extraction helps by reducing the size of the data used to represent the iris. The small set of features into which the raw data is transformed is called a feature vector. The feature vector should carry enough information about the iris region while avoiding the redundant or irrelevant information present in the raw image. The feature vector can be computed in a number of ways. The following subsections discuss the feature spaces, namely the histogram of intensity levels, the local binary pattern, the histogram of oriented gradients, the Eigenspace and moments, along with the statistical parameters computed to form the resultant feature vector. An efficient feature vector [9] can improve the performance of the system.
Figure 2. Outline of the proposed work
3.1 Histogram
In image processing, images are treated as matrices whose elements represent the intensity values of the image. In grayscale the intensity value ranges from 0 to 255, so there are 256 values signifying different shades of gray. A histogram is a graphical chart representing the number of occurrences of each gray level in the iris image [10]. The histogram is a discrete function that provides many useful statistics about the image, and this information is used in a number of image processing applications. For an image with intensity levels indexed by k, the histogram can be given as
$h(k) = \frac{n_k}{L}$   (1)
where $n_k$ is the number of pixels in the image having the kth intensity value and L is the total number of pixels in the image.
3.2 Local Binary Pattern
The local binary pattern (LBP) [11] is a simple approach which captures the texture information of the image. It is obtained by thresholding the neighboring pixels against the center pixel and converting the resulting binary codeword into a decimal value. The LBP of the pixel (x, y) is calculated by the following equation:
$LBP(x, y) = \sum_{n=0}^{N-1} s(i_n - i_c)\, 2^n$, where $s(z) = 1$ if $z \geq 0$ and $s(z) = 0$ otherwise   (2)
Here $i_c$ is the intensity of the center pixel (x, y), $i_n$ is the intensity of its nth neighbor, and N is the number of neighbors. The illustration of the LBP method is shown in Figure 3.
Figure 3. Steps to compute local binary pattern
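The LBP computation, followed by a normalized histogram of the resulting codes, can be sketched in plain Python as below. This is a minimal illustration, not the authors' implementation. It assumes 8 neighbors taken clockwise from the top-left pixel, an ordering chosen here that may differ from the one in Figure 3. The function names and the sample patch are hypothetical.

```python
# Minimal LBP sketch on a grayscale image stored as a list of rows.
# Assumption: 8 neighbours, ordered clockwise starting at the top-left.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Threshold the 8 neighbours against the centre and pack the bits."""
    centre = image[row][col]
    code = 0
    for n, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        if image[row + dr][col + dc] >= centre:
            code += 2 ** n
    return code


def lbp_histogram(image):
    """Normalised histogram of LBP codes over all interior pixels."""
    rows, cols = len(image), len(image[0])
    counts = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            counts[lbp_code(image, r, c)] += 1
    total = (rows - 2) * (cols - 2)
    return [n / total for n in counts]


# Hypothetical 3x3 patch with centre value 6.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch, 1, 1))
```

Border pixels are skipped because they lack a full neighborhood. Padding the image would be an equally reasonable choice.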
This local binary pattern forms the feature space of the normalized iris region. Statistical feature vectors are extracted from the binary pattern obtained.

3.3 Histogram of oriented gradient

Histogram of oriented gradients [12] can be used to describe any object and shape present within the iris image. This method captures the global information present in the iris image. The image is divided into blocks, each of which comprises several cells. On each cell, gradients are computed using a kernel mask. The image gradient is a vector which gives the magnitude and points in the direction of the steepest intensity change. The magnitude values of each block are binned into a finite number of directions between 0 and 180 degrees. The final histogram is obtained by concatenating the histograms of gradients calculated across the blocks.

3.4 Eigen Space

It is important to extract all the significant information from the iris image. In the eigen space approach [13], the information is encoded by capturing the variation in a collection of iris images. A set of iris images is arranged in the form of a matrix in which each column denotes an iris image. The eigenvectors computed from the covariance of this matrix are termed the principal components. They denote the variation of the textural features among the images present in a class. Each iris image can be depicted as a linear combination of these eigenvectors. For a set of N iris images kept for training, we have N column vectors. The column vectors $x_1, x_2, \ldots, x_N$ are obtained by reshaping each 2D image into a 1D column vector. The average of these training column vectors is obtained by

$$x_\mu = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (3)$$
The difference between each column vector and the mean image vector is given by

$$d_i = x_i - x_\mu \qquad (4)$$

The covariance of the matrix $X = [x_1, x_2, \ldots, x_N]$ is a matrix of dimension $n^2 \times n^2$, where $n^2$ is the number of elements in each column vector. It is obtained by the following equation

$$C = \frac{1}{N} \sum_{i=1}^{N} (x_i - x_\mu)(x_i - x_\mu)^T = \frac{1}{N} D D^T \qquad (5)$$

where $D = [d_1, d_2, \ldots, d_N]$. The eigendecomposition of such a huge matrix becomes intractable. Hence, a simpler decomposition method, singular value decomposition, is suggested. The eigenvectors are computed from the covariance matrix by the expression

$$C = U \Sigma V^T \qquad (6)$$
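As an illustration of the eigen space computation, a minimal NumPy sketch is given below. It is not the authors' MATLAB implementation. It assumes that the SVD is applied to the mean-centred data matrix rather than to the covariance matrix itself, a common shortcut that yields the same eigenvectors without forming the large covariance matrix. The function names `eigen_space` and `project` are hypothetical.

```python
import numpy as np


def eigen_space(images, num_components):
    """Return the mean vector, leading principal components and their eigenvalues."""
    # Reshape each 2-D image into a 1-D column and stack the columns into a matrix.
    X = np.stack([np.asarray(img, dtype=float).ravel() for img in images], axis=1)
    mean = X.mean(axis=1, keepdims=True)
    D = X - mean  # difference vectors, one per column
    # The left singular vectors of D are eigenvectors of (1/N) * D @ D.T,
    # so the covariance matrix never has to be built explicitly.
    U, S, _ = np.linalg.svd(D, full_matrices=False)
    eigenvalues = (S ** 2) / X.shape[1]
    return mean, U[:, :num_components], eigenvalues[:num_components]


def project(image, mean, components):
    """Express an image as weights of the principal components."""
    x = np.asarray(image, dtype=float).reshape(-1, 1)
    return (components.T @ (x - mean)).ravel()


# Usage with random stand-in data (not real iris images).
rng = np.random.default_rng(0)
training = [rng.random((4, 4)) for _ in range(5)]
mean, components, eigenvalues = eigen_space(training, num_components=3)
weights = project(training[0], mean, components)
```

Reshaping a column of `components` back to the image dimensions gives the kind of principal texture image described in this section.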
In Eq. (6), U and V are unitary matrices whose columns are the eigenvectors. The eigenvectors span the algebraic subspace of the training images. Rearranging these vectors into the iris image dimensions depicts the principal textures present in the iris region.

3.5 Moments

Iris recognition in an unconstrained environment poses a severe challenge because the captured iris images are degraded for a number of reasons. Hence, it is important to select a feature vector which is less sensitive to environmental changes. Moments are one such feature vector, being invariant to translation and rotation effects. They are also good at capturing local and global features present throughout the image. Five different moments are chosen to represent the feature space [14]. These moments are concatenated as

$$F = [M_1, M_2, M_3, M_4, M_5] \qquad (7)$$
The moments chosen are Hu moments, Tchebychev moments, Coiflet moments, Gabor moments, and affine and blur invariant moments. Moments in the feature space can improve the performance of the classification system.

3.6 Statistical feature vector

The feature space represents the textural distribution of the segmented iris region. Seven different statistical parameters [15], namely mean, standard deviation, standardized moment, root mean square, entropy, skewness, and kurtosis, are computed over the feature space to extract the required feature vector. This vector characterizes the iris region well. It is computed for all the images in the dataset. A few images are taken as the training dataset and a few as the test dataset.

4. Results and discussion
This section describes the database used and analyzes and compares the different feature extraction methods employed for the iris-based biometric system. The experiments were done using MATLAB 2013a (8.1.0.604) of Mathworks, Inc., USA, installed on an Intel i5 processor at 2.7 GHz with 8 GB RAM.

4.1 Database

The different feature extraction methods have been tested on iris images of the IIT-Delhi database [16], acquired in the NIR spectrum. The iris images were captured using a JIRIS JPC1000 digital CMOS camera [17]. The images have a resolution of 320 x 240 pixels and are stored in bitmap format. The database includes 1120 images from 224 subjects in the age group of 14 to 55 years.
Figure 4. Sample iris images from IIT-D database
4.2 Performance evaluation of the feature vector extraction methods
Figure 5. Normalized iris and its feature spaces: (a) normalized iris, (b) LBP of iris, (c) eigen space of iris, (d) histogram of iris, (e) HoG of iris, (f) moments of iris
The feature spaces taken for observation are the histogram of intensity levels, the local binary pattern, the histogram of oriented gradients, the eigen space and moments. From each feature space, statistical parameters are estimated as the feature vector. A minimum distance classifier [19] is used to validate the robustness of the feature vector in iris recognition. The distance between the test feature vector and the feature vector of the kth class is the weighted Euclidean distance [18], given by

$$D(k) = \sum_{i=1}^{N} \frac{\left(L(i) - L_k(i)\right)^2}{\sigma_k^2} \qquad (8)$$
where L is the feature vector of the iris under test, $L_k$ is the feature vector of the kth class, $\sigma_k$ is the standard deviation of the kth feature vector and N is the length of the feature vector.

Table 1. Accuracy (%) of feature space for normalized iris image sizes of 50% and 100%

Feature space   Length of feature vector   Image size 50%   Image size 100%
Histogram       5                          77.8             78.1
Histogram       7                          78.4             79.4
LBP             5                          80.6             81.2
LBP             7                          82               83
HoG             5                          80.6             80.4
HoG             7                          81.3             81
Eigen space     5                          73               77.8
Eigen space     7                          74.2             78.2
Moments         5                          83.8             85.4
Moments         7                          84.2             86
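The minimum distance classification used to produce such recognition rates can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes one feature vector and one scalar standard deviation per class, following the reading of Eq. (8) given above, and the class labels and numbers are made up for the example.

```python
def weighted_euclidean(test_vec, class_vec, class_std):
    """Sum of squared feature differences, scaled by the class variance."""
    squared = sum((t - c) ** 2 for t, c in zip(test_vec, class_vec))
    return squared / class_std ** 2


def classify(test_vec, classes):
    """Return the label of the class with the smallest weighted distance.

    `classes` maps a label to a (feature_vector, standard_deviation) pair.
    """
    return min(
        classes,
        key=lambda label: weighted_euclidean(test_vec, *classes[label]),
    )


# Hypothetical two-class example with feature vectors of length 3.
classes = {
    "subject_A": ([1.0, 2.0, 3.0], 1.0),
    "subject_B": ([4.0, 5.0, 6.0], 2.0),
}
print(classify([1.0, 2.0, 3.5], classes))  # subject_A
```

Dividing by the class variance means that a class with widely spread features tolerates larger deviations than a tightly clustered one.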
Features are extracted from both the full normalized iris image and half of it for performance comparison. Feature vectors of length 5 and 7 are also used to study the performance. Table 1 shows the recognition rate of the feature vectors computed from the different feature spaces for iris recognition. The recognition rates show that the moment-based feature vector performs better than the other methods.

5. Conclusion
This paper compares different feature extraction methods applied to the localized iris region, with a minimum distance classifier used for matching. The experiments are done on the publicly available IIT-Delhi database. The results show an accuracy of 79.4% for the histogram method, 83% for LBP, 81% for HoG, 78.2% for eigen space and 86% for moments. It is inferred that the moment-based feature vector is robust in an unconstrained environment.

References
[1] M. Negin, T. A. Chmielewski, M. Salganicoff, U. M. von Seelen, P. L. Venetianer and G. G. Zhang, "An iris biometric system for public and personal use," Computer, vol. 33, no. 2, pp. 70-75, Feb. 2000. doi: 10.1109/2.820042
[2] J. G. Daugman, "High confidence visual recognition of persons by a test of statistical independence," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1148-1161, Nov. 1993. doi: 10.1109/34.244676
[3] S. S. Barpanda, B. Majhi, P. K. Sai, A. K. Sangaiah, "Iris feature extraction through wavelet mel-frequency cepstrum coefficients," Optics and Laser Technology, vol. 110 (2019), pp. 13-23.
[4] N. Liu, J. Liu, Z. Sun and T. Tan, "A Code-Level Approach to Heterogeneous Iris Recognition," IEEE Transactions on Information Forensics and Security, vol. 12, no. 10, pp. 2373-2386, Oct. 2017. doi: 10.1109/TIFS.2017.2686013
[5] F. Alonso-Fernandez, P. Tome-Gonzalez, V. Ruiz-Albacete and J. Ortega-Garcia, "Iris recognition based on SIFT features," 2009 First IEEE International Conference on Biometrics, Identity and Security (BIdS), Tampa, FL, 2009, pp. 1-8.
[6] H. Rai, A. Yadav, "Iris recognition using combined support vector machine and hamming distance approach," Expert Systems with Applications, vol. 41 (2014), pp. 588-593.
[7] A. Bastys, J. Kranauskas, V. Kruger, "Iris recognition by different representations of multi-scale Taylor expansion," Computer Vision and Image Understanding, vol. 115 (2011), pp. 804-816.
[8] P. Li, X. Liu, N. Zhao, "Weighted co-occurrence phase histogram for iris recognition," Pattern Recognition Letters, vol. 33 (2012), pp. 1000-1005.
[9] S. Lim, K. Lee, O. Byeon, T. Kim, "Efficient iris recognition through improvement of feature vector and classifier," ETRI Journal, vol. 23(2), pp. 61-70.
[10] R. S. Choras, "Image feature extraction techniques and their applications for CBIR and biometric systems," International Journal of Biology and Biomedical Engineering, vol. 1(1), 2007, pp. 6-16.
[11] T. Ojala, M. Pietikainen and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-987, July 2002. doi: 10.1109/TPAMI.2002.1017623
[12] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 2005, pp. 886-893, vol. 1. doi: 10.1109/CVPR.2005.177
[13] G. Lu, D. Zhang, K. Wang, "Palmprint recognition using eigenpalms features," Pattern Recognition Letters, vol. 24 (2003), pp. 1463-1467.
[14] B. Kaur, S. Singh, and J. Kumar, "Robust iris recognition using moment invariants," Wireless Pers Commun, vol. 99(2), pp. 799-828.
[15] I. Guyon, S. Gunn, M. Nikravesh, and L. Zadeh, "Feature Extraction: Foundation and Applications," Springer, 2006, ISBN 3-540-35487-5.
[16] IITD Iris Database, http://web.iitd.ac.in/~biometrics/Database_Iris.htm, 2008.
[17] http://www.jiristech.com/download/jpc_series.pdf
[18] Y. Zhu, T. Tan, and Y. Wang, "Biometric personal identification based on handwriting," in Proc. 15th ICPR, Barcelona, Spain, Sep 3-7, 2000, pp. 797-800.
[19] M. E. Hodgson, "Reducing the computational requirements of the minimum-distance classifier," Remote Sensing of Environment, vol. 25, 1988, pp. 117-128.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200047
Emotion Recognition Using Feature Extraction Techniques
M. Kalpana Chowdary1 and D. Jude Hemanth1,*
1 Department of ECE, Karunya Institute of Technology and Sciences, Coimbatore, India
*Corresponding Author: [email protected]
Abstract. Emotion recognition is one of the better ways to recognize the state of mind of humans. There are several emotions, such as happy, neutral, disgust, sad, anger, surprise, and fear, which are known as universal emotions. In this paper we extract different facial features using some well-known techniques: Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), Scale Invariant Feature Transform (SIFT) and Speeded-up Robust Features (SURF). These techniques are tested on the popular Japanese Female Facial Expression (JAFFE) database, which contains frontal-view face images.
Keywords. LBP, HOG, SIFT, SURF.
1. Introduction

Emotions play a prominent role in human society. Using a facial expression recognition system, we can know an individual's mood at a given instant, whether he or she is happy, sad or angry. Many applications can be developed using such systems, for security and surveillance, criminal interrogation, human-computer interaction [1], mood-based music players, automated driving and many more. Although there have been advances in both hardware and software, there are still problems with accuracy because of variations in faces. To improve the accuracy of a facial emotion detection system, accuracy must be increased at each step of development. Feature extraction is one of the main steps and plays an important role in the accuracy of the system. In this work, four feature extraction techniques, LBP, HOG, SURF and SIFT, are discussed. The major components of any emotion recognition system are pre-processing, segmentation, feature extraction and classification, as shown in Figure 1.
Figure 1. Facial Emotion Recognition System
M.K. Chowdary and D.J. Hemanth / Emotion Recognition Using Feature Extraction Techniques
Pre-processing is required in order to reduce noise, complex backgrounds and illumination variations in the input images. Segmentation is used to extract the non-skin regions such as the eyes, nose and mouth from the image. Feature extraction extracts useful information from the images, such as colour, shape and motion. The extracted features are given as input to the classifier to identify the particular emotion: happy, sad, fear, neutral, anger, disgust or surprise.

2. Facial Expression Database

The Japanese Female Facial Expression (JAFFE) database [2] contains 213 pictures of 10 subjects with seven facial expressions: happy, sad, surprise, angry, disgust, fear and neutral. The images are 256 x 256 pixels in size and stored in TIFF format. Sample images from the JAFFE database are shown in Figure 2.
Figure 2. Sample images from JAFFE database (panels: sad, neutral, disgust, happy, angry, fear, surprise)
3. Facial Feature Extraction Techniques

Feature extraction is mainly used to extract meaningful information from the images. The information is in the form of shape, texture, motion, color, etc. Some of the feature extraction techniques used in emotion recognition are discussed below.

3.1 Local Binary Pattern

One of the most widely used techniques for feature recognition is the local binary pattern. The LBP operator was introduced by Ojala et al. in [3]. The LBP operator works with eight neighbor pixels and one center pixel. In the first step, every neighbor pixel is compared with the center pixel: if the value of the neighbor pixel is greater than the center pixel value, the neighbor pixel is assigned one, otherwise it is assigned zero [4]. In the second step, these binary digits are taken in clockwise order starting from the top, multiplied by powers of two and combined to form a decimal number. Figure 3 shows the calculation of local binary patterns.
Figure 3: Calculation of LBP
Figure 4. Input image and LBP image: (a) input image, (b) LBP of image
3.2 Histogram of Oriented Gradients

HOG was introduced by Navneet Dalal and Bill Triggs in 2005 [5]. HOG is a feature descriptor used in various domains where categorization through shape is needed. The HOG descriptor counts the occurrences of gradient orientations in localized portions of an image, the region of interest (ROI). The main blocks in HOG are shown in Figure 5.
Figure 5. Main blocks in HOG
The steps involved in implementing the HOG descriptor are:
1. Divide the image into small connected regions (cells) and compute a histogram of gradient directions over the pixels in each cell.
2. Discretize each cell into angular bins according to gradient orientation.
3. Each pixel of a cell contributes a weighted gradient to its corresponding angular bin.
4. Groups of neighboring cells, considered as spatial regions, are formed into blocks. These blocks are the basis for the normalization of the histograms.
5. The group of normalized histograms forms the block histogram. The final descriptor is the set of block histograms.
Figure 6. Input image and HOG visualization plot: (a) input image, (b) HOG feature extraction
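The HOG steps listed above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the implementation used for Figure 6: it uses centred differences for the gradient, 4 x 4 pixel cells, 9 unsigned orientation bins over 0 to 180 degrees, blocks of 2 x 2 cells and L2 block normalization. All names and parameter values are illustrative choices.

```python
import math


def gradients(image):
    """Per-pixel gradient magnitude and unsigned orientation in degrees."""
    rows, cols = len(image), len(image[0])
    mag = [[0.0] * cols for _ in range(rows)]
    ang = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Centred differences, with the border pixels clamped.
            gx = image[y][min(x + 1, cols - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, rows - 1)][x] - image[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


def hog(image, cell=4, bins=9):
    """Concatenated, block-normalized orientation histograms."""
    mag, ang = gradients(image)
    rows, cols = len(image), len(image[0])
    bin_width = 180.0 / bins
    cells = []
    for cy in range(0, rows - cell + 1, cell):
        row_hists = []
        for cx in range(0, cols - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # Each pixel votes for its bin, weighted by gradient magnitude.
                    hist[int(ang[y][x] // bin_width) % bins] += mag[y][x]
            row_hists.append(hist)
        cells.append(row_hists)
    descriptor = []
    for by in range(len(cells) - 1):
        for bx in range(len(cells[0]) - 1):
            # A block is a 2x2 group of neighboring cells, normalized as a unit.
            block = (cells[by][bx] + cells[by][bx + 1]
                     + cells[by + 1][bx] + cells[by + 1][bx + 1])
            norm = math.sqrt(sum(v * v for v in block) + 1e-12)
            descriptor.extend(v / norm for v in block)
    return descriptor


# An 8x8 test image with a vertical edge: dark left half, bright right half.
edge = [[0] * 4 + [10] * 4 for _ in range(8)]
features = hog(edge)
```

For the vertical edge above, all gradient energy falls into the first orientation bin of each cell, which is the behaviour a shape descriptor of this kind is meant to capture.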
3.3 SIFT (Scale Invariant Feature Transform)

The SIFT algorithm was published by David Lowe in [6]. It is mainly used to detect and describe local features in images. The main applications of SIFT include 3D modelling and object recognition. The SIFT algorithm [7] consists of four steps:
1. Estimation of scale-space extrema using the Difference of Gaussians
2. Key point localization
3. Assignment of one or more orientations to each key point
4. Creation of the key point descriptor from local image gradients
Figure 7 shows the input image and the key points extracted using the SIFT technique.
Figure 7. Input image and key points of SIFT: (a) input image, (b) key points of SIFT
3.4 SURF (Speeded-up Robust Features)

Speeded-up Robust Features was introduced in [8]. SURF features are used in real-time applications like object detection and tracking. The SURF algorithm consists of the following steps:
1. Generation of the integral image
2. Computation of Hessian matrix determinants (interest point detection)
3. Application of non-maximal suppression
4. Computation of the interest point orientation
5. Computation of the interest point descriptor
Figure 8 shows the input image and the key point descriptors obtained using SURF.
Figure 8. Input image and key point descriptors obtained by using SURF: (a) input image, (b) SURF features
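The first SURF step, the integral image, is simple enough to illustrate directly. The sketch below is a generic summed-area table in plain Python, not code taken from a SURF library. The function names and the convention of padding the table with a zero row and column are assumptions made for the example.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column at the top and left."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += image[y][x]
            # Sum of everything above and to the left of (y, x), inclusive.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
print(box_sum(table, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

Once the table is built, any rectangular sum costs four lookups regardless of the rectangle size, which is what makes the box-filter approximation of the Hessian in the second step fast.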
Conclusion

This paper presents various feature extraction techniques used in automatic facial emotion recognition systems. For all feature extraction techniques, the input images are taken from the standard JAFFE database. From the above results, we conclude that the features extracted using these techniques can help in the classification step. Classification can then be carried out using classifiers such as KNN and SVM to obtain the desired result.

References
[1] Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W. and Taylor, J.G., 2001. Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine, 18(1), pp. 32-80.
[2] M. J. Lyons, M. Kamachi and J. Gyoba, "Japanese Female Facial Expressions (JAFFE)," Database of digital images, 1997.
[3] T. Ojala, M. Pietikäinen, and D. Harwood, "A comparative study of texture measures with classification based on featured distributions," Pattern Recognition, vol. 29, pp. 51-59, 1996.
[4] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution grayscale and rotation invariant texture classification with local binary patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 971-987, 2002.
[5] Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–25 June 2005; Volume 1.
[6] David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, vol. 60, no. 2, 2004, pp. 91-110.
[7] Brown, M., Lowe, D.: Invariant features from interest point groups. In: BMVC. (2002)
[8] H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded up robust features. In ECCV, 2006.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200048
Interactive Educational Content Using Marker Based Augmented Reality
K. Martin Sagayam1, D. Jude Hemanth2,*, Andrew J3, Chiung Ching Ho4, Dang Hien5
1, 2 Department of ECE, Karunya Institute of Technology and Sciences, India.
3 Department of CSE, Karunya Institute of Technology and Sciences, India.
4 Department of Computation & Information, Multimedia University, Malaysia.
5 Department of Computer Science and Engineering, ThuyLoi University, HaNoi, VietNam.
*Corresponding Author: [email protected]

Abstract. Nowadays, augmented reality (AR) has become an interactive and collaborative tool for educational applications. AR makes the teaching and learning process more effective and allows the user to access virtual objects in the real world. Although AR technology enhances the educational outcome and the psychological and pedagogical aspects of real-time usage in the classroom, the most important problem is the lighting conditions, on which the output clarity depends. The output can be affected by reflections from overhead lights and by light from the windows. Omnidirectional lighting can be used to avoid this problem. If the marker is moved out of the camera view, the recognition will fail. Hence, we use markers with large black and white regions, which are low-frequency patterns. An augmented reality application for mathematics and geometry in the school-level education system is discussed in this paper.
Keywords: Augmented reality, mixed reality, AR Toolkit, marker, and school level education
1. Introduction

An augmented reality system consists of a display, a computational unit, and a camera. The camera captures the image, the system draws the virtual objects on top of the image, and the result is displayed. ARToolkit is a software library that helps programmers work in the field of augmented reality. Augmented reality is the addition of computer-generated content to the real world, and it has many applications in advertising, academic research, industry, media, and entertainment. ARToolkit can be used to calculate the position and orientation of the camera relative to square shapes, which allows the programmer to superimpose virtual objects. ARToolkit supports classical square markers, 2D barcodes, multimarkers and Hiro markers, as well as any combination of these markers. The fast and accurate tracking done by ARToolkit has enabled the rapid development of thousands of new and interesting AR applications.
K.M. Sagayam et al. / Interactive Educational Content Using Marker Based Augmented Reality
2. Literature survey

Recently, there has been tremendous growth in applications based on augmented reality (AR) and virtual reality (VR). The authors of [1] proposed an AR app targeting non-expert visitors, which guides users to a location and provides descriptions. It helps visitors become familiar with a new place quickly without a human guide. The authors of [2] presented the integration of AR and VR for the digital twin (DT), which gives users a more immersive and interactive experience. An AR textbook providing an interactive learning experience for school children was demonstrated in [3]. The authors of [4] investigated structural behavior from finite element analysis (FEA) using augmented reality. It is used to view the interior structure clearly in 3D and to modify the structure easily through collaborative learning. The authors of [5] presented a comparative study between desktop AR and mobile AR using parameters like mean and standard deviation. The aim of the experiment is to improve the visualization skills of students with full support for mobile devices. They discuss AR in educational applications, especially in engineering graphics. The authors of [6] proposed a 3D AR micro integral imaging display system that combines conventional integral imaging and AR techniques. Integral imaging means recording a 3D scene and reconstructing it. They describe the implementation of the micro integral imaging display system and the experimental results. A short survey of AR applications that use ARToolkit is given in [7]. The authors of [8] presented a system for real-world object recognition and camera pose estimation. It recognizes the objects directly without any marker and augments them, making it a markerless augmented reality system. Marker-based augmented reality on the Android platform is discussed in [9]. It can display 3D models from the markers and is compatible with all versions of the operating system. The authors of [10] discussed an augmented reality application in the medical field. The results obtained in that work show that the developed system helps to understand and study the functioning of the heart. The authors of [11] demonstrated an effective teaching method using marker-based augmented reality. That work explains live interaction through an augmented object. The simulation provides a high degree of concentration, and the process was analyzed with an ANOVA test. The authors of [12] proposed an augmented alphabet book for teaching children in a realistic way. This AR textbook is preferred by children between 3 and 6 years of age. It provides an interactive way of learning basic concepts.
3. Methodology

Figure 1 describes the steps and the flow of our work in more detail. When the video stream is captured and the marker is found, the image is converted into a binary image and the marker frame is identified. ARToolkit then searches for the square regions in the binary image. The position and orientation of the marker relative to the camera are calculated. The pattern inside each square is captured and checked against the pre-trained patterns.
Figure 1. Framework of Marker Based Augmented Reality

3.1. Collection of markers

Markers are squares which have a white surface surrounded by a black region. Many kinds of markers are in general use, e.g. 2D barcodes, QR codes, Hiro markers, Kanji markers, and multiple markers. We use the Kanji marker, the Hiro marker and multiple markers in this project for obtaining virtual objects. The specifications of the markers are given in Table 1.

3.2. Square Markers

Markers provide the correct scale and convenient coordinate frames. They can also carry encoded information or an identity, which enables a system to attach certain objects or interactions to the markers.
Square markers have only a few constraints. They must be square. They must have a continuous black or white border. The background must be a contrasting color with the marker in the foreground (that is, a dark versus a light color). The final constraint is that the pattern must be rotationally asymmetric.
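The rotational asymmetry constraint exists so that the orientation of a detected marker can be determined unambiguously. A minimal sketch of such a check is given below. It assumes the marker interior has already been sampled into a small square grid of 0/1 values. The function names are hypothetical and not part of the ARToolkit API.

```python
def rotate90(pattern):
    """Rotate a square 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*pattern[::-1])]


def is_rotationally_asymmetric(pattern):
    """True when no 90, 180 or 270 degree rotation reproduces the pattern."""
    rotated = pattern
    for _ in range(3):
        rotated = rotate90(rotated)
        if rotated == pattern:
            return False
    return True


corner_dot = [[1, 0],
              [0, 0]]
diagonal = [[1, 0],
            [0, 1]]
print(is_rotationally_asymmetric(corner_dot))  # True
print(is_rotationally_asymmetric(diagonal))    # False: a 180 degree turn gives the same pattern
```

A pattern that fails this check would match its template in more than one orientation, so the pose of a virtual object drawn on it could flip between frames.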
3.3. Multimarker

In ARToolKit, the term multimarker refers specifically to the use of multiple square markers fixed to a single object, arranged as multiple markers on a single flat sheet. Multimarker tracking allows for several tracking performance and stability enhancements and has special support in the ARToolKit API. Some of the advantages of multimarker tracking include:

Occlusion can be tolerated: even when one marker is not detected, another may be visible.
Improved pose-estimation accuracy: in a multimarker set, numerical error can be reduced because all marker corners are used to calculate the pose.

Table 1. Specifications of the markers shown in Figure 2

Marker            Size (KB)   Dimension
Hiro              2.52        192x191
Kanji             2.06        181x182
Multiple Marker   10.1        308x182
Figure 2. (a) Hiro marker (b) Kanji marker (c) Multiple marker
4. Detection of markers

The basic marker detection procedure consists of the following steps:
1. Image acquisition: acquisition of an intensity image.
2. Preprocessing: low-level image processing, line detection/line fitting, and detection of the corners of the marker.
3. Detection of potential markers and discarding of obvious non-markers: fast rejection of obvious non-markers and a fast acceptance test for potential markers.
4. Identification and decoding of markers: template matching (template markers) or decoding (data markers).
5. Calculation of the marker pose.

The camera view is captured as video and sent to the computer. The software then searches for any square shapes. The marker is outlined with red and green lines if ARToolkit has identified it, as shown in Figure 3. When the square marker is found and the embedded image content is matched and identified, the software calculates the position of the black square and the camera orientation. Once the position and orientation of the camera are known, the 3D model is drawn so that it appears attached to the marker.

4.1. Template matching

Jung-Chuan Yen et al. [11] used template matching, morphological operations and segmentation to determine the location of the marker. The square marker obtained in the video frame is matched against pre-trained patterns, as shown in Figure 3. These patterns are stored in the Data directory under bin in ARToolkit and are loaded at run time. An important advantage of template matching is its insensitivity to blur and noise. In SSD (sum of squared differences), the dissimilarity value between the marker and the template is

$$D = \sum_{i} (x_i - y_i)^2 \qquad (1)$$

In normalized cross-correlation, the dissimilarity value is

$$D = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \sum_{i} (y_i - \bar{y})^2}} \qquad (2)$$

where $x_i$ and $y_i$ are the values of the ith pixel of the marker and of the template, and $\bar{x}$ and $\bar{y}$ are the corresponding mean pixel values.
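The two matching measures can be illustrated with a short Python sketch. It uses the standard definitions of the sum of squared differences and normalized cross-correlation on flattened pixel lists, and it is not ARToolkit code. The template names and pixel values below are made up for the example.

```python
import math


def ssd(marker, template):
    """Sum of squared differences; 0 means the two patterns are identical."""
    return sum((x - y) ** 2 for x, y in zip(marker, template))


def ncc(marker, template):
    """Normalized cross-correlation in [-1, 1]; 1 means a perfect match."""
    n = len(marker)
    mean_x = sum(marker) / n
    mean_y = sum(template) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(marker, template))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in marker)
                    * sum((y - mean_y) ** 2 for y in template))
    # A flat (constant) pattern has no variation to correlate with.
    return num / den if den else 0.0


def best_match(marker, templates):
    """Return the name of the template with the highest NCC score."""
    return max(templates, key=lambda name: ncc(marker, templates[name]))


# Hypothetical 2x2 patterns, flattened row by row.
templates = {
    "pattern_1": [10, 240, 5, 250],
    "pattern_2": [255, 0, 255, 0],
}
observed = [0, 255, 0, 255]
print(best_match(observed, templates))  # pattern_1
```

Because the means are subtracted and the result is normalized, the cross-correlation score does not change under a uniform change of brightness or contrast, whereas the SSD value does.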
Figure 3. (a) The original image, (b) the image after adaptive thresholding, (c) a cube augmented on top of a detected marker.
5. Drawing virtual object

The augmented reality system consists of a camera or webcam, a computational unit (laptop) and a display. The camera captures an image of the marker. The system then draws the virtual object on top of the image and displays the result. When the virtual object is drawn, it covers the real object completely, so only the virtual object is visible there. When the camera captures the image of the marker, the system detects the marker, finds its location and the camera position, and then draws the virtual objects or shapes. For this, we need a camera, a laptop, and different markers. The capturing module captures the image using the camera. The tracking module calculates the correct location and position of the camera. The rendering module combines the results of these modules with a virtual component and renders the augmented image on the display. In effect, a virtual object is added to the environment, and real objects such as the marker or the paper on which the marker is drawn are removed from view.
6. Results and Discussion

Initially, there were some complications with the output due to different lighting conditions and range issues. Using omnidirectional lighting and a Logitech camera with 720p high-definition resolution and 5 megapixels, this issue has been rectified. Figure 4 shows the experimental results with different objects such as a cube, sphere, cone, multiple cubes, and teapot. High-quality output is characterized by a high PSNR, which is mainly used to measure the quality of an image. The signal here refers to the original data, and the noise is the error caused by compression. A high PSNR value indicates that the image is of higher quality.
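For reference, the usual definitions of PSNR and SNR can be written as a short Python sketch. The paper does not state which reference image or formula was used for its measurements, so the following is an assumption: 8-bit images with a peak value of 255, flattened into pixel lists, with the reference image playing the role of the signal.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized pixel lists."""
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


def snr(reference, test):
    """Signal-to-noise ratio in dB, with the reference image as the signal."""
    noise = sum((r - t) ** 2 for r, t in zip(reference, test))
    if noise == 0:
        return math.inf
    signal = sum(r ** 2 for r in reference)
    return 10 * math.log10(signal / noise)


reference = [100, 100, 100, 100]
degraded = [110, 90, 110, 90]
print(round(psnr(reference, degraded), 2))  # 28.13
print(round(snr(reference, degraded), 2))   # 20.0
```

Both measures grow as the error between the two images shrinks, which is why higher values are read as better output quality.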
Figure 4. Experimental results for augmented reality application: (a) cube, (b) sphere, (c) cone, (d) multiple cubes, (e) teapot

The traditional way of teaching is not enough to understand 3D geometry. Students may find it difficult to draw the shapes and grasp the concepts. Hence, augmented reality has an important role in educational applications. It helps students to compare and examine different shapes and to get an idea of geometric models at an early age. It can also be used in teaching the solar system in higher education. Most of the materials available to students are textbook images and 2D charts. In our experiment, students can understand the concepts and study the shapes because they can change the viewing angle. Geometric shapes are also an unavoidable factor in the field of architecture. Cube shapes are used in religious buildings, modern architectural designs, monuments, etc.
Table 2. Performance Measures of different virtual objects

Shapes            Mean      Variance   Standard Deviation   PSNR      SNR
Cube              98.25     491.35     22.166               20.989    15.22
Sphere            101.80    413.288    20.329               21.32     14.82
Cone              99        483.428    21.987               21.211    14.968
Torus             104.88    505.11     22.47                21.273    16.97
Teapot            99.556    435.271    20.8633              21.91     17.43
Multiple Output   103.91    355.35     18.85                20.65     18.4508
7. Conclusion and Future Work In this work, the authors mainly focus on the application of augmented reality to the educational system. These results show that the concepts of geometry shapes can be simplified and studied easily by the students. By seeing the simple virtual objects, the student community will quickly understand the concepts. Experimental result shows that better PSNR and SNR values are achieved for the proposed approach. The time taken to execute this experiment is less than 10 seconds. Thus, this work is efficient in terms of accuracy and convergence rate which is ideal for practical applications. References [1] Silvia Blanco-Pons, Berta Carrion-Ruiz, Jose Luis Lerma, Valentin Villaverde (2019): Design and implementation of an augmented reality application for rock art visualization in Cova dels Cavalls (Spain), Journal of Cultural Heritage, doi:10.1016/j.culher.2019.03.014. [2] Fei Tao, Meng Zhang, and A. Y. C. Nee (2019): Digital twin and virtual reality and augmented reality/mixed reality, Digital Twin Driven Smart Manufacturing, pp. 219-241, doi: 10.1016/B978-0-12817630-6.00011-4. [3] Haifa Alhumaidan, Kathy Pui Ying Lo, and Andrew Selby (2018): Co- designing with children a collaborative augmented reality book based on a primary school textbook, International Journal of ChildComputer Interaction, Vol. 15, pp. 24-36. [4] J. M. Huangm S. K. Ong, and A. Y. C. Nee (2017): Visualization and interaction of finite element analysis in augmented reality, Vol. 84, pp. 1-14. [5] Jorge Camba, Manuel Contero, Gustavo Salvador-Herranz (2014): Desktop Vs. Mobile: A Comparative Study Of Augmented Reality Systems For Engineering Visualizations In Education, Frontiers in Education Conference (FIE), pp. 1-8, IEEE. [6] Jingang Wang, Xiao Xiao, Hong Hua and Bahram Javidi (2015): Augmented Reality 3D Displays WithMicro Integral Imaging, Journal of display technology, vol. 11, no. 11. 
[7] Ashok kumar M, Sahitya Priyadharshini K (2012): A Survey of ARToolkit Based Augmented Reality Applications, Journal of Computer Applications, Volume-5.
[8] Rehman Ullah Khan, Muh. Inam UlHaq, Moh. Shahrizal Sunar, Shahren Ahmad Zaidi Adruce, Yahya Khan and Yasir Hayat Mughal (2015): Objects Recognition And Pose Calculation System For Mobile Augmented Reality Using Natural Features, Ind. J. Sci. Res. and Tech. Vol. 3(1), pp. 40-50.
[9] Megha Shetty, Vineet Lasrado, Riyaz Mohammed (2015): Marker Based Application in Augmented Reality Using Android, IJIRCCE Vol. 3, October.
[10] Edgard Lamounier, Jr., Arthur Bucioli, Alexandre Cardoso, Adriano Andrade and Alcimar Soares (2010): On the Use of Augmented Reality Techniques in Learning and Interpretation of Cardiologic Data, Annual International Conference of the IEEE EMBS Buenos Aires, Argentina, doi: 10.1109/IEMBS.2010.5628019.
[11] Jung-Chuan Yen, Chih-Hsiao Tsai, Min Wu (2013): Augmented reality in the higher education: Students’ science concept learning and academic achievement in astronomy, Procedia – Social and Behavioural Sciences, Vol. 103, pp. 165-173.
[12] Dayang Rohaya Awang Rambli, Wannisa Matcha, Suziah Sulaiman (2013): Fun Learning with AR Alphabet Book for Preschool Children, Procedia – Computer Science, Vol. 25, pp. 211-219.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200049
Classification of Melanoma Through Fused Color Features and Deep Neural Networks
Ananjan MAITI a, Himadri SHEKHARGIRI a, Biswajoy CHATTERJEE b, Venkatesan RAJINIKANTH c, Fuqian SHI d, Nilanjan DEY a
a Department of Information Technology, Techno India College of Technology, Kolkata-700156, West Bengal, India
b Department of Computer Science and Engineering, University of Engineering & Management, New Town, India
c Department of Electronics and Instrumentation Engineering, St. Joseph’s College of Engineering, Chennai-600 119, Tamilnadu, India
d College of Information and Engineering, Wenzhou Medical University, Wenzhou 325035, China
Abstract. Skin malignancy is a catastrophic health problem witnessed among Europeans and in the western areas of the world because of changes in the ozone layer. Ultraviolet (UV) rays are a common threat to human health. Scientists have studied Computer-Aided Diagnosis (CAD) schemes to ease the interpretation and detection of melanoma. Lesion features vary in several ways, and different Artificial Intelligence (AI) based designs play an essential role in building a CAD system. This study refines skin lesion images with diffusion and the Dull Razor technique. Color-based shape and texture features are then extracted from the lesion images. The new fused color features are found to be effective for melanoma and nevus classification. Details of 2000 images from the ISIC database archive helped to build an improved feature set. These features were analyzed with 12 machine learning models, which reached a highest accuracy of 93.9%. The proposed Deep Neural Network (DNN) reached 95.8% accuracy within a few epochs. The model was assessed under specific limits, which are discussed in the results section. In future, this exercise should motivate investigators to experiment with color features and their variations in other AI-based models.
Keywords. Melanoma, Computer-Aided Diagnosis, Dull razor, Color features, Artificial intelligence, Deep neural network
1 Introduction
In regions such as Europe, Australia and America, melanoma, one of the most destructive diseases, mostly affects light-skinned individuals [1]. Exposure to ultraviolet (UV) light from the sun is the principal cause, and investigations indicate that the disease is more common in areas of high elevation or close to the equator, where daylight exposure is very high [2]. People with brown/blue eyes and blonde/red hair are susceptible to this disease. Scientists have worked on identifying this particular skin cancer through CAD. This strategy also helps patients to identify the disease at an early stage. Skin lesion images carry further details by which lesions can be identified exclusively. Pigmentation color and its associations carry distinct attributes. The clinical features that dermatologists use to detect melanoma in a skin lesion are i) Asymmetry, ii) Border irregularity, iii) Color and iv) Diameter [3]. Following this ABCD rule, a CAD system can identify malignant melanoma from six suspicious colors. The main advantage of color over other features is that it can reveal the presence of melanin in the deeper layers of the epidermis and dermis. Many studies have worked with color features taken from unprocessed images. Such images may contain unwanted noise and objects that hinder proper feature extraction, which affects the efficiency of the classifier. Researchers have also tested shape and texture features, which help to classify melanoma lesions. There is still a gap in the evaluation of various color features, which are extensively tested in this study. The effectiveness of the proposed features was tested on different machine learning models, and the outcome is a good result compared with former studies. Finally, all features were tested in the proposed deep learning models with varying hyper-parameters, which gives better results than former studies. These are discussed in the result analysis section. In the current study, recent literature is highlighted in Section 2. This study encourages the use of color features and offers deep learning models for improving CAD systems; these topics are discussed in Section 3. The proposed CAD system is validated and detailed summaries are given in Section 4. Section 5 sums up the findings and limitations together with the future scope.
2 Recent Works
Researchers have found that color features, especially the properties of the Red and Blue channels, can perform well in classifiers. Cheng et al. used statistical features of the channels (mean, area, skewness in the Green band, entropy of the Red band, etc.) in a multilayer perceptron (MLP). That study shows the efficiency of the MLP and reports 87% accuracy in detecting malignant melanoma [4]. The advancement of deep learning has led different studies to renovate features in many different ways. Color-space based extraction of lesion features is also popular. Chakkaravarthy et al. worked on the HSV color space to segment lesions, which proved an effective strategy. Different classification models could be applied to extend that study [5]. DermoDeep, introduced by Abbas et al., is a color-space based fused feature approach. That study incorporates 2800 images tested with a deep learning model, and the researchers obtained an AUC score of 0.96 and 95% specificity. It is also possible to extract lesion features through a CNN, and Sharma et al. recently worked on this method. The outcome of that evaluation was an accuracy of 75% on the PH2 dataset [6]. Researchers have also compared the roles of color-band and texture features and determined which feature sets give better differentiation. Barata et al. found that color features are more advantageous than texture features when used separately and give excellent results: a sensitivity of 96% and a specificity of 80% for the global method, compared with a sensitivity of 100% and a specificity of 75% for the local methods. In previous studies, global features such as texture, shape and color associated with the whole lesion were used by a binary classifier trained from the data. However, local features are increasing in popularity in many image analysis problems (such as image retrieval and object recognition) [5]. The method proposed by Barata et al. uses a combination of image processing techniques and AI-based techniques. The study describes extracting features such as Energy, Entropy, Contrast and Correlation of the segmented skin region, which helped to determine the severity of melanoma [7,8].
3 Methodology
A CAD framework consists of a combination of different stages [9]. The present study sets out one effective path for classifying melanoma against nevus. In the first phase, skin cancer images from different open archives are collected. These may contain unwanted objects (such as scratches, tapes, bandages and hairs). This study addresses these issues with a few strategies so that proper pigment information can be obtained.
Fig. 1. Different stages of stated CAD Framework
3.1 Image Pre-processing and Segmentation
Image Denoising: Image pre-processing is an essential operation in medical image analysis [10-12]. Color noise in the images may disrupt the features of the skin lesions [13]. The researchers used the Perona-Malik anisotropic diffusion filter (ADF) to preserve lesion properties (such as edges) while removing unwanted noisy pixels. The 2D ADF also helps to protect edges from noisy color pixels, whereas other diffusion filters may lose detail at object edges. After diffusion is complete, the ADF retains the lesion boundary information: the edges keep almost their original values, so practically no edge loss occurs.
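The edge-preserving behaviour described above can be illustrated with a minimal, dependency-free sketch of Perona-Malik diffusion on a gray-scale image. It is not the authors' implementation: the function name `perona_malik`, the exponential edge-stopping function and the values of `n_iter`, `kappa` and `step` are assumptions chosen for illustration.

```python
import math

def perona_malik(img, n_iter=10, kappa=20.0, step=0.2):
    """Perona-Malik anisotropic diffusion on a 2D gray-scale image (list of rows)."""
    h, w = len(img), len(img[0])
    cur = [[float(v) for v in row] for row in img]
    for _ in range(n_iter):
        nxt = [row[:] for row in cur]
        for y in range(h):
            for x in range(w):
                total = 0.0
                # Four nearest neighbours; neighbours outside the image are skipped.
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        grad = cur[ny][nx] - cur[y][x]
                        # Edge-stopping weight: close to 0 across strong edges,
                        # close to 1 in flat regions (assumed exponential form).
                        total += math.exp(-(grad / kappa) ** 2) * grad
                nxt[y][x] = cur[y][x] + step * total  # step <= 0.25 keeps this stable
        cur = nxt
    return cur

# Hypothetical test image: dark region (10) next to a bright region (200),
# with one isolated noisy pixel inside the dark region.
image = [[10.0] * 3 + [200.0] * 3 for _ in range(6)]
image[2][1] = 30.0
smoothed = perona_malik(image)
```

In this toy image the small jump of the noisy pixel is smoothed away, while the large jump between the two regions receives a weight near zero and is left almost untouched, which is the property the section relies on.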
(a) Sample of Melanoma image from ISIC archives
(b) Sample of gray-scale image after dull-razor operation
(c) Sample of segmentation using Active-Contour Model
(d) Trace made by Active-Contour technique
Fig. 2 Results obtained with the existing and proposed approach
Dull Razor: Hairs are noticeable in a few of the images. These images were refined with the Dull Razor algorithm to remove hair from the lesions. The algorithm locates dark hair in the gray-scale images [14-16] by identifying the pixels that contain it, which also reveals the thickness and length of the hair. It then replaces the identified pixels using neighbouring pixels. Fig. 2 presents the results obtained with the existing techniques: Fig. 2 (a) shows a sample melanoma image, Fig. 2 (b) shows the image pre-processed with Dull Razor, and Fig. 2 (c) and (d) depict the results of the active-contour segmentation.
Contour Identification: Image segmentation separates the skin lesion from the rest of the image. The researchers used the Active Contour Model (ACM) segmentation technique to extract the melanoma region [17-19]. This algorithm autonomously searches for the lesion boundary at the minimum-energy state. The skin lesion can be quantified properly through the properties of the snake shape (i.e. energy, entropy, etc.). It preserves the contour or edge information in an intuitive manner. Contour information was critically evaluated through different RGB values, specifically:
Table 1. Description of various colors with their ranges
Sl. No   Color          Range in (R, G, B) channels
1.       Red            (255, 250, 250) to (50, 20, 20)
2.       Blue           (248, 248, 255) to (0, 35, 102)
3.       Whitish Blue   (204, 229, 255) to (102, 178, 255)
4.       Light Brown    (188, 143, 143) to (210, 105, 30)
5.       Dark Brown     (188, 143, 143) to (210, 105, 30)
6.       Violet         (191, 148, 228) to (105, 53, 156)
7.       Black          (0, 0, 0) to (180, 180, 180)
8.       White          (181, 181, 181) to (255, 255, 255)
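As a rough illustration of how the ranges in Table 1 could be turned into per-color pixel clusters, the sketch below treats each range as an axis-aligned box in RGB space and reports the share of pixels that falls in each box. The box interpretation, the dictionary `COLOR_RANGES` and the helper names are assumptions for illustration and are not taken from the paper.

```python
# RGB endpoints transcribed from Table 1. Light Brown and Dark Brown are
# identical in the printed table and are kept as printed.
COLOR_RANGES = {
    "red": ((255, 250, 250), (50, 20, 20)),
    "blue": ((248, 248, 255), (0, 35, 102)),
    "whitish_blue": ((204, 229, 255), (102, 178, 255)),
    "light_brown": ((188, 143, 143), (210, 105, 30)),
    "dark_brown": ((188, 143, 143), (210, 105, 30)),
    "violet": ((191, 148, 228), (105, 53, 156)),
    "black": ((0, 0, 0), (180, 180, 180)),
    "white": ((181, 181, 181), (255, 255, 255)),
}

def in_box(pixel, end_a, end_b):
    """True if every channel lies between the two endpoints (given in either order)."""
    return all(min(a, b) <= c <= max(a, b) for c, a, b in zip(pixel, end_a, end_b))

def color_percentages(pixels):
    """Percentage of pixels inside each color box; boxes may overlap."""
    total = len(pixels)
    return {
        name: 100.0 * sum(in_box(p, a, b) for p in pixels) / total
        for name, (a, b) in COLOR_RANGES.items()
    }

# Three hypothetical lesion pixels.
sample = [(0, 0, 0), (255, 255, 255), (100, 40, 40)]
shares = color_percentages(sample)
```

Because the printed ranges overlap, a pixel such as (100, 40, 40) counts toward both the red and the black cluster in this sketch; a real pipeline would need a rule for resolving such overlaps.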
Table 1 presents the relation between each color and its threshold values in the RGB image for the skin-melanoma cases.
3.2 Color Feature Extraction
The present work incorporates pixel clusters of eight colors, defined by their ranges. A melanoma skin lesion possesses specific color attributes that aid its interpretation [20, 21]. In the ABCD rule, C denotes the number of colors present in the lesion. Former studies claim that variations of blue, black and white are highly suspicious, and several investigators have recommended these eight key shades in their studies of melanoma classification. They yield an adequate analytical feature set. This set involves texture details, for example area and pixel variance, in addition to various shape properties of the color clusters [22, 23]. Besides this, color-based texture features were extracted, such as percentage, contour count, distance mean, distance variance, area mean, area variance, perimeter mean, perimeter variance, harmonic mean, largest area, contour saturation mean, saturation variance, etc. Table 2 presents the melanoma diagnosis features of an individual color. The fused feature set comprises 32 properties for every color. This hybrid approach fuses lesion shape with color and captures specific information about every lesion. According to the color preference, the dominant color clusters and the largest parts of the shape properties are determined. Finally, 256 (8 x 32) features are accumulated to form a data set for classification.
3.3 Feature Preprocessing
The quality of a model or classifier depends on a properly processed feature set [24-26]. Here, the feature set was rescaled with a min-max scaler and its outliers were replaced with the mean. The entire feature set was normalized toward a uniform distribution while maintaining its variance. Table 2 describes the various features considered for the image classification task.
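The two preprocessing steps mentioned above can be sketched per feature column as follows. This is a minimal illustration under stated assumptions: the paper does not say how outliers are detected, so the rule used here (values more than `k` standard deviations from the mean) and the function names are hypothetical.

```python
from statistics import mean, pstdev

def replace_outliers_with_mean(column, k=3.0):
    """Replace values farther than k standard deviations from the mean by the mean.

    The k-sigma rule is an assumption; the source only states that outliers
    are handled using the mean.
    """
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return list(column)
    return [mu if abs(v - mu) > k * sigma else v for v in column]

def min_max_scale(column):
    """Rescale a feature column linearly to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical feature column with one extreme value.
raw = [1.0] * 20 + [100.0]
cleaned = replace_outliers_with_mean(raw)
scaled = min_max_scale([2.0, 4.0, 6.0])
```

Handling outliers before scaling matters in this sketch: a single extreme value would otherwise compress all remaining values of that feature into a narrow part of the [0, 1] interval.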
Table 2. List of extracted color based features used for classification
Percentage                  Contour count                    Distance mean                Distance variance
Area                        Area mean                        Area variance                Perimeter mean
Perimeter variance          Harmonic mean                    Contour area                 Contour saturation variance
Saturation variance         Value mean                       Value variance               Contour eccentricity
Contour circularity         Contour convexity                Contour harmonic             Ratio of percentage of contour
Ratio of area               Ratio of variance                Ratio of saturation          Ratio of circularity
Ratio of contour convexity  Ratio of distance mean           Ratio of distance variance   Color asymmetricity
Ratio of perimeter mean     Ratio of contour harmonic mean   Ratio of perimeter variance  Ratio of color asymmetricity
3.4 Principal Component Analysis
Principal component analysis (PCA) can be extended to qualitative variables through correspondence analysis [27,28], and to several groups of variables through multiple factor analysis (MFA) [29]. It is needed to reduce the number of variables substantially while still preserving much of the essential information in a large original data set. This study used it as a dimension-reduction technique, reducing the 256 features to 128. PCA extracts the critical information from the data set and expresses it as a set of new orthogonal variables.
3.5 Classification
Machine Learning Models: The extensive fused-color feature set is used to train models that predict binary targets [30-32]. In the classification stage, it is tested with 12 machine learning models [33,34]. Logistic Regression is a very dynamic probabilistic scheme. The Support Vector Classifier (SVC) is used because it is very effective on a large feature set; it creates a large margin around a separating hyperplane. NuSVC is like SVC, but it uses a parameter to control the number of support vectors. One of the simplest classifiers is the K-Neighbors classifier, which copes well with large feature sets. Constructing a Decision Tree Classifier is complex, but it is very effective for labeled tuples. An improved version of this classifier is the Random Forest Classifier, which takes an experimental approach with different classifiers and cross-validation. The AdaBoost learning scheme can be used to combine different classifiers. The Gradient Boosting classifier is used here to generalize the construct. Naïve Bayes (NB) is robust to noise and also ignores missing values; two of its variants are employed here, Bernoulli NB and Gaussian NB. Linear Discriminant Analysis (LDA) is effective at separating subgroups. One of its variants, Quadratic Discriminant Analysis (QDA), estimates a covariance for each class to separate them.
Deep Learning Models: A Deep Neural Network (DNN) follows the same learning procedure as an Artificial Neural Network (ANN) [35,36]. It includes a better hidden layer mechanism, which significantly improves the ability of the classifier to fit the training data. It does need noisy features to be eliminated [37]. Its architecture can be exploited by different techniques such as neural networks, long short-term memory, etc. The DNN was assessed with separate hyper-parameters. The proposed network, with five hidden layers of 64, 32, 16, 8 and 4 units, rectified linear unit (ReLU) activations and a final sigmoid activation, gave satisfactory results. Testing was carried out with different hyper-parameter combinations (such as inputs, batch sizes, activation functions and hidden layers).
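The layer structure described above can be made concrete with a small forward-pass sketch in plain Python. It assumes that the 128 PCA components form the network input and that a single sigmoid unit produces the melanoma probability; the weight initialisation, the random seed and the names `LAYER_SIZES`, `init_params` and `forward` are illustrative assumptions, and no training is shown.

```python
import math
import random

# Assumed input of 128 PCA components, five hidden layers, one output unit.
LAYER_SIZES = [128, 64, 32, 16, 8, 4, 1]

def init_params(sizes, seed=0):
    """Random weights and zero biases for each fully connected layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / n_in)  # He-style scaling, an assumption
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def forward(params, x):
    """Forward pass: ReLU on the hidden layers, sigmoid on the output layer."""
    a = x
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, a)) + b for row, b in zip(weights, biases)]
        if i < last:
            a = [max(0.0, v) for v in z]
        else:
            a = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return a[0]

params = init_params(LAYER_SIZES)
probability = forward(params, [0.5] * 128)  # hypothetical feature vector
```

The sigmoid output lies strictly between 0 and 1 and can be thresholded (for example at 0.5) to obtain the binary melanoma or nevus label.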
Fig. 3 Architecture of the proposed Deep Neural Network
Fig 3 depicts the proposed DNN architecture to examine the skin-melanoma pictures of the ISIC database.
4. Results and Discussion
CAD systems have previously been investigated with various open-source image archives. Most of these studies use fewer than one thousand sample images. One of the most popular open archives is ISIC (the International Skin Imaging Collaboration database). This archive contains the largest publicly available collection of quality-controlled images of skin lesions. From it, one thousand images of melanoma and one thousand images of non-melanoma (nevus) were collected. These 2000 images were used in the classification models with an 80:20 ratio for the training and validation sets. Across all 12 machine learning models, the highest accuracy was obtained by the Decision Tree Classifier, 93.9% with a log loss of 1.04. The Random Forest Classifier followed with 90.4% accuracy and a log loss of 1.19. The Gradient Boosting Classifier ranked third with an accuracy of 87.4% and an excellent log loss of 0.32. The dataset was also analyzed with the alternative models, and a detailed description is given in Table 3. For the proposed DNN, several hyper-parameters were considered. Hidden layers are the layers between the input and output layers; here, five hidden layers were found to be optimal. Batch sizes of 16, 32, 64 and 128 were considered, and the highest accuracy of 95.8% was reached at 158 epochs. The learning rate plays a critical role and was set to 0.01. The RELU activation function performed best compared with SRELU, TANH, sigmoid, etc. Figure 4 shows the accuracy and loss during the training and validation phases.
Table 3. Results obtained from different classifiers
Classifier                         Accuracy (%)   Log loss
Logistic Regression                71.36          0.58
SVC                                50.25          0.69
NuSVC                              74.37          0.51
K-Neighbors Classifier             61.81          0.46
Decision Tree Classifier           93.98          1.04
Random Forest Classifier           90.47          1.19
AdaBoost Classifier                77.89          0.67
Gradient Boosting Classifier       87.44          0.32
Bernoulli NB                       55.28          1.75
Gaussian NB                        62.81          7.36
Linear Discriminant Analysis       74.37          0.51
Quadratic Discriminant Analysis    78.39          5.03
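Table 3 reports accuracy and log loss side by side, and the two do not always rank classifiers in the same order. The sketch below computes both metrics for binary labels using their standard definitions; the toy labels and probabilities are invented for illustration and are not the paper's data.

```python
import math

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Hard 0/1 outputs with one mistake (sample 5 is predicted as class 0).
hard = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# Moderate probabilities with two mistakes (samples 5 and 10).
soft = [0.7, 0.7, 0.7, 0.7, 0.3, 0.3, 0.3, 0.3, 0.3, 0.7]

hard_scores = (accuracy(y_true, hard), log_loss(y_true, hard))
soft_scores = (accuracy(y_true, soft), log_loss(y_true, soft))
```

In this toy case the hard classifier has the higher accuracy (0.9 against 0.8) but a much larger log loss (about 3.45 against about 0.53), because a single fully confident mistake is penalised heavily. This is one plausible reading of why a decision tree, which often outputs probabilities of exactly 0 or 1, can top the accuracy column while gradient boosting shows the lower log loss.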
Fig. 4 Accuracy and loss of the DNN during the training and validation phases
Below, the effectiveness of the proposed model is compared with recently reported skin cancer classification techniques based on color features.
Table 4. Comparison of the proposed model with recent studies on color based features
Ref.                   No. of images   Description of features                   Accuracy
Stanley et al. [38]    226             Color histogram analysis techniques       87.7%
Belarbi et al. [39]    200             RGB chromatic co-ordinates techniques     91%
Soumya et al. [40]     200             2 features: Color Correlogram, SFTA       91.5%
Kasmi et al. [41]      200             Atypical pigment network                  94%
Hagerty et al. [42]    10015           Pigment network, Color distribution       94%
Proposed method        2000            Fused Color Features                      95.8%
5. Conclusion
In this extensive work, the CAD framework has been reformed as discussed in the earlier phases of the study. Machine learning techniques such as the decision tree, random forest and gradient boosting classifiers obtained accuracies of 93.9%, 90.4% and 87.4%, respectively. The proposed framework works effectively with the fused color features. The proposed DNN performed better still, reaching 95.8% accuracy, which is higher than former approaches. The proposed framework could be further evaluated with other modern deep learning methods. The pre-processing strategy and the choice of features contributed to this result. The proposed DNN and its counterparts have further scope for improvement. In future, the framework might be evaluated on a larger dataset with a variant of the feature set.
References
1. S. Rajpar, J. Marsden, ABC of skin cancer, Vol. 94, John Wiley & Sons, 2009.
2. D. E. Brash, J. A. Rudolph, J. A. Simon, A. Lin, G. J. McKenna, H. P. Baden, J. Ponten, A role for sunlight in skin cancer: UV-induced p53 mutations in squamous cell carcinoma, Vol. 88, 1991, pp. 10124–10128.
3. F. Nachbar, W. Stolz, T. Merkle, A. B. Cognetta, T. Vogt, M. Landthaler, G. Plewig, The ABCD rule of dermatoscopy: high prospective value in the diagnosis of doubtful melanocytic skin lesions, Journal of the American Academy of Dermatology 30 (4) (1994) 551–559.
4. Y. Cheng, R. Swamisai, S. E. Umbaugh, R. H. Moss, W. V. Stoecker, S. Teegala, S. K. Srinivasan, Skin lesion classification using relative color features, Skin Research and Technology 14 (1) (2008) 53–64.
5. A. P. Chakkaravarthy, A. Chandrasekar, An Automatic Threshold Segmentation and Mining Optimum Credential Features by Using HSV Model, 3D Research 10 (2) (2019) 18.
6. Q. Abbas, M. E. Celebi, DermoDeep-A classification of melanoma-nevus skin lesions using multi-feature fusion of visual features and deep neural network, 2019.
7. C. Barata, M. Ruela, M. Francisco, T. Mendonça, J. S. Marques, Two systems for the detection of melanomas in dermoscopy images using texture and color features, IEEE Systems Journal 8 (3) (2013) 965–979.
8. C. Barata, M. Ruela, T. Mendonça, J. S. Marques, A bag-of-features approach for the classification of melanomas in dermoscopy images: The role of color and texture descriptors, Springer, Berlin, Heidelberg, 2014.
9. K. C. Santosh, S. Antani, D. S. Guru, N. Dey, Medical Imaging, 2019.
10. S. D. Kamble, N. V. Thakur, P. R. Bajaj, Fractal Coding Based Video Compression Using Weighted Finite Automata, International Journal of Ambient Computing and Intelligence (IJACI) 9 (1) (2018) 115–133.
11. A Beginner’s Guide to Image Preprocessing Techniques, CRC Press, 2018.
12. S. C. Satapathy, N. S. M. Raja, V. Rajinikanth, A. S. Ashour, N. Dey, Multi-level image thresholding using Otsu and chaotic, 2018.
13. S. Chakraborty, S. Chatterjee, A. S. Ashour, K. Mali, N. Dey, Intelligent computing in medical imaging: A study, IGI Global, 2018.
14. P. Anitha, S. Bindhiya, A. Abinaya, S. C. Satapathy, N. Dey, V. Rajinikanth, RGB image multi-thresholding based on Kapur’s entropy - A study with heuristic algorithms, Second International Conference on Electrical, Computer and Communication Technologies (ICECCT) (2017). DOI: 10.1109/ICECCT.2017.8117823.
15. P. Roy, S. Goswami, S. Chakraborty, A. T. Azar, N. Dey, Image segmentation using rough set theory: a review. International Journal of Rough Sets and Data Analysis (IJRSDA) 1(2), 62-74, (2014).
16. A. Choudhury, S. Samanta, N. Dey, A. S. Ashour, D. Bălas-Timar, M. Gospodinov, E. Gospodinova, Microscopic image segmentation using quantum inspired evolutionary algorithm, Journal of Advanced Microscopy Research 10 (3) (2015) 164–173.
17. J. Chaki, N. Dey, A Beginner’s Guide to Image Shape Feature Extraction Techniques, CRC Press; 2019 Aug 29.
18. L. Moraru, S. Moldovanu, A. L. Culea-Florescu, D. Bibicu, A. S. Ashour, N. Dey, Texture analysis of parasitological liver fibrosis images. Microscopy Research and Technique, 80(8), 862-869, (2017).
19. V. Rajinikanth, S. C. Satapathy, N. Dey, S. L. Fernandes, K. S. Manic, Skin Melanoma Assessment Using Kapur’s Entropy and Level Set - A Study with Bat Algorithm, Springer, Singapore, 2019.
20. N. Dey, W. B. A. Karâa, S. Chakraborty, S. Banerjee, M. A. Salem, A. T. Azar, Image mining framework and techniques: a review. International Journal of Image Mining, 1(1), 45-64, (2015).
21. N. Dey, A. S. Ashour, A. Singh, Digital analysis of microscopic images in medicine. Journal of Advanced Microscopy Research, 10(1), 1-13, (2015).
22. J. Chaki, N. Dey, F. Shi, R. S. Sherratt, Pattern mining approaches used in sensor-based biometric recognition: A review. IEEE Sensors Journal, 19(10), 3569-3580, (2019).
23. A. Tharwat, T. Gaber, Y. M. Awad, N. Dey, A. E. Hassanien, Plants identification using feature fusion technique and bagging classifier. In The 1st International Conference on Advanced Intelligent System and Informatics (AISI2015), November 28-30, 2015, Beni Suef, Egypt (pp. 461-471). Springer, Cham, (2016).
24. N. Dey, S. Wagh, P. N. Mahalle, M. S. Pathan (Eds.), Applied Machine Learning for Smart Data Analysis. CRC Press, (2019).
25. N. Dey, S. Borra, A. S. Ashour, F. Shi (Eds.), Machine Learning in Bio-Signal Analysis and Diagnostic Imaging. Academic Press, (2018).
26. N. Dey, G. N. Nguyen, D. N. Le, A Special Section on Machine Learning in Medical Imaging and Health Informatics. Journal of Medical Imaging and Health Informatics, 8(4), 809-810, (2018).
27. P. Roy, S. Goswami, S. Chakraborty, A. T. Azar, N. Dey, Image segmentation using rough set theory: a review. International Journal of Rough Sets and Data Analysis (IJRSDA), 1(2), 62-74, (2014).
28. D. Acharjya, A. Anitha, A comparative study of statistical and rough computing models in predictive data analysis, International Journal of Ambient Computing and Intelligence (IJACI) 8 (2) (2017) 32–51.
29. M. A. Belarbi, S. Mahmoudi, G. Belalem, PCA as dimensionality reduction for large-scale image retrieval systems, International Journal of Ambient Computing and Intelligence (IJACI) 8 (4) (2017) 45–58.
30. S. Hemalatha, S. M. Anouncia, Unsupervised segmentation of remote sensing images using FD based texture analysis model and ISODATA, International Journal of Ambient Computing and Intelligence (IJACI) 8 (3) (2017) 58–75.
31. K. Lan, D. T. Wang, S. Fong, L. S. Liu, K. K. Wong, N. Dey, A survey of data mining and deep learning in bioinformatics. Journal of Medical Systems (2018).
32. B. Lakshmi, S. Parthasarathy, Human Action Recognition Using Median Background and Max Pool Convolution with Nearest Neighbor, International Journal of Ambient Computing and Intelligence (IJACI) 10 (2) (2019) 34–47.
33. N. Kausar, A. Abdullah, B. B. Samir, S. Palaniappan, B. S. AlGhamdi, N. Dey, Ensemble clustering algorithm with supervised classification of clinical data for early diagnosis of coronary artery disease. Journal of Medical Imaging and Health Informatics, 6(1), 78-87, (2016).
34. S. Beagum, A. S. Ashour, N. Dey, Bag-of-Features in Microscopic Images Classification, IGI Global, 2017.
35. Z. Li, N. Dey, A. S. Ashour, L. Cao, Y. Wang, D. Wang, F. Shi, Convolutional neural network based clustering and manifold learning method for diabetic plantar pressure imaging dataset. Journal of Medical Imaging and Health Informatics, 7(3), 639-652, (2017).
36. S. Chatterjee, S. Sarkar, N. Dey, A. S. Ashour, S. Sen, A. E. Hassanien, Application of cuckoo search in water quality prediction using artificial neural network. International Journal of Computational Intelligence Studies, 6(2-3), 229-244, (2017).
37. N. Dey, W. B. Karâa, Biomedical Image analysis and mining techniques for improved health outcomes, 2015.
38. R. J. Stanley, W. V. Stoecker, R. H. Moss, A relative color approach to color discrimination for malignant melanoma detection in dermoscopy images, Skin Research and Technology 13 (1) (2007) 62–72.
39. S. Pathan, V. Aggarwal, K. G. Prabhu, P. C. Siddalingaswamy, Melanoma Detection in Dermoscopic Images using Color Features, Biomedical and Pharmacology Journal 12 (1) (2019) 107–115.
40. R. S. Soumya, S. Neethu, T. S. Niju, A. Renjini, R. P. Aneesh, Advanced earlier melanoma detection algorithm using colour correlogram. In 2016 International Conference on Communication Systems and Networks (ComNet) (pp. 190-194). IEEE. R. Kasmi, M. amp, K., Classification of malignant melanoma and (2016).
41. J. Hagerty, J. Stanley, H. Almubarak, N. Lama, R. Kasmi, P. Guo, W. V. Stoecker, Deep Learning and Handcrafted Method Fusion: Higher Diagnostic Accuracy for Melanoma Dermoscopy Images. IEEE Journal of Biomedical and Health Informatics, (2019).
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200050
Two Dimensional DEM Simulation for Truck Tires Rolling onto the Randomly Shaped Pebbles Used in a Truck Escape Ramp
Pan Liu a,1, Qiang Yu a, Xuan Zhao a, and Peilong Shi a
a Chang'an University, Xi'an, 710064, China
Abstract. The truck escape ramp is a type of traffic emergency facility for out-of-control trucks on long downhill slopes. Because rigid pebbles are randomly placed in arrester beds, it is difficult to simulate truck tires rolling onto the randomly shaped pebbles. The randomly shaped pebble DEM models were built with an overlapping method. Based on the side view of a truck tire, the tire DEM model was built using a close packing agglomerate method. Next, the process of trucks running onto the truck escape ramp was simulated, and the truck speed and travel distance signals were recorded. The simulation results are consistent with the test results. The tire-pebble DEM model could be used for truck escape ramp safety prediction and for the paving of arrester beds.
Keywords. truck escape ramp, arrester bed, discrete element method (DEM), randomly shaped pebbles
1. Introduction
Out-of-control truck accidents on mountain roads are a key issue in traffic safety. When a truck drives down a long slope, the temperature of the brake drum rises greatly under continuous braking. Studies show that when the temperature of the brake drum exceeds a certain level, the braking efficiency decreases dramatically and the brakes may even fail. The truck escape ramp is known as the most efficient way to prevent out-of-control truck accidents. It is a type of traffic facility constructed from a bed of pebbles that allows out-of-control trucks to pull off the road and be rescued. Previous studies on the truck escape ramp were mainly concerned with the setting locations [1], efflux angles [2], and materials [3]. However, because the pebbles are randomly placed in the arrester beds, it is difficult to simulate the process of truck tires rolling onto the randomly shaped pebbles. The discrete element method (DEM) is a method for discrete particles that has developed rapidly. The DEM separates an object into isolated particles. With a preset contact algorithm based on Newton’s Second Law, the contact forces and the movements of the particles can be computed. Currently, the DEM is used in a wide range of fields such as soil tillage [4-5], rock mechanics and rock engineering [6-7], road materials analysis [8-9], and tire-road terramechanics [10-11].
1 Corresponding Author: Pan Liu ([email protected])
P. Liu et al. / Two Dimensional DEM Simulation for Truck Tires Rolling
The DEM is an appropriate method for simulating truck tires rolling onto discrete pebbles. A few studies have simulated the truck escape ramp with simplified DEM models. Zhang Gaoqiang [12] and Qin Pinpin [13] simulated truck tires rolling onto arrester beds, with the tire and pebble DEM models simplified as ball elements. Liu Pan [14] simulated trucks running onto an arrester bed using the DEM with an adaptive master-slave simulation method, which greatly decreased the computing time. In that simulation, three typical types of pebble DEM models were built using the overlapping method. However, there have been few studies on construction methods for the randomly shaped pebbles used in truck escape ramps. In the DEM, pebble shapes can be constructed in four ways. The first is the single ball element construction method [15-16], which takes the basic ball element as the object. This method greatly decreases the computing amount, but it does not fit pebbles, which are mostly oval and flat. The second is the basic element shape construction method [17-18], which rebuilds the basic elements. This method increases both the computing amount and the accuracy, but it is difficult to build basic DEM elements for randomly shaped pebbles. The third is the close packing agglomerate method with connected ball elements [19-20]. This method greatly increases the calculating accuracy, but the computing time becomes a limitation. The fourth is the overlapping method [21-22]. Compared with the methods described above, the overlapping method can reduce the computing amount while increasing the calculating accuracy. However, the method for calculating the element coordinates needs to be adapted. The purpose of this study is to find an appropriate way to simulate truck tires rolling onto randomly shaped pebbles.
Based on the overlapping method, a pebble shape reconstruction method was proposed. Next, the truck tire DEM model was built by the close packing agglomerate method. We randomly selected ten pebble DEM samples from the three views of the pebbles, and the corresponding pebble DEM models were built. Using a “rain fall” method, the truck escape ramp DEM model was built. Next, road tests were conducted on a truck escape ramp located on K209+400 Road, Gansu Province, China, and the truck speed and travel distance were recorded. Using an adaptive master-slave simulation method, the process of the truck tires rolling onto the arrester bed was simulated, and the road test results and simulation results were compared.
2. The DEM model construction method
2.1. The pebble shape construction method
• Materials
The pebbles tested were collected from a truck escape ramp located on K209+400 Road, Gansu Province, China (Fig. 1). The shape characteristics of the pebbles can be summarized as follows: (1) The basic shapes of the pebbles were flat and oval. (2) Most of the pebbles were round and smooth. (3) The smoothness of the pebbles was roughly the same. (4) There were a few fractured pebbles with horizontal corners.
Figure 1. The tested pebbles
One hundred pebbles were randomly selected and photographed in three views: the main view, the left view, and the top view. First, the top view was oriented so that the X-axis lay along the maximum length and the Y-axis along the corresponding maximum width; the main view and left view were then set accordingly. Second, the length, width, and height were measured using a vernier caliper, and the corresponding pebble size distributions are shown in Fig. 2.
Figure 2. The pebble size distributions: (a) length, (b) width, (c) height (horizontal axes: length/mm, width/mm, height/mm; vertical axis: num).
The pebble length, width, and height range from 7.77 mm to 29.17 mm, 6.61 mm to 22.20 mm, and 2.81 mm to 12.70 mm, respectively. The average length, width, and height are 15.98 mm, 11.82 mm, and 6.77 mm.
• Pebble DEM model shape construction method
In the DEM, the basic elements are ball elements and wall elements. This paper used the overlapping method to build the pebble DEM model shapes. Taking one pebble as an example, the construction process can be summarized in the following steps. First, each of the three views was divided into two parts by the major axis, and the major axis was divided into equal intervals by equally spaced parallel vertical lines (Fig. 3(a)). Next, the lengths of the vertical lines were measured using a vernier caliper, and the outlines of the three views were drawn in MATLAB; the results are shown in Fig. 3(b). Second, the outlines of the three views were fitted with a polynomial. Here the 7th-order fit was selected (Eq. 1). The fitting coefficients of the outlines are shown in Tab. 1, and the results are shown in Fig. 3(c).
$$F(x_i) = \sum_{i=1}^{7} \left( a_i \times x_i \times (x_i - 100)^{i} \right) \qquad (1)$$
Table 1. Fitting coefficients of the three views
View   Curve   a1       a2        a3        a4        a5        a6         a7
Main   Upper   -0.047   -0.0052   -2.9e-4   -8.4e-6   -1.3e-7   -9.9e-10   -3.1e-12
Main   Lower   0.047    0.0053    3.0e-4    8.9e-6    1.4e-7    1.1e-9     3.7e-12
Left   Upper   0.15     0.015     5.7e-4    1.1e-5    1.1e-7    5.4e-10    9.1e-13
Left   Lower   -0.37    -0.039    -0.0016   -3.4e-6   -3.9e-7   -2.3e-9    -5.4e-12
Top    Upper   0.013    0.0037    2.7e-4    8.6e-6    1.4e-7    1.1e-9     3.2e-13
Top    Lower   0.0048   -0.0018   -1.3e-4   -3.8e-6   -5.4e-8   -3.7e-10   -9.1e-13
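The outline-fitting step can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the study, it fits a 7th-order polynomial in a standard power basis (not necessarily the basis of Eq. 1), and the sampled outline `x`, `y` is a hypothetical stand-in for caliper measurements.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical upper outline: stations x along the major axis and measured
# heights y (a smooth half-wave stands in for real caliper data).
x = np.linspace(0.0, 100.0, 41)
y = 5.0 * np.sin(np.pi * x / 100.0)

# Least-squares fit of a 7th-order polynomial. Polynomial.fit rescales x
# internally, which keeps the high-order fit well conditioned.
fit = Polynomial.fit(x, y, deg=7)

# Largest deviation between the fitted outline and the sampled points.
max_error = float(np.max(np.abs(fit(x) - y)))
```

The fitted object `fit` can be evaluated at any station, which is what the ball-filling step needs when it samples the upper and lower curves.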
Third, with a preset number of ball elements in each of the three views, the major axis of each view was divided into equal intervals. The radius and y-coordinate of each ball element are calculated by Eqs. 2-3.
$$r = (y_2 - y_1)/2 \qquad (2)$$
$$y = (y_2 + y_1)/2 \qquad (3)$$
where $y_1$ and $y_2$ are the coordinates on the lower and upper curves, $r$ is the radius of the ball element, and $y$ is the y-coordinate of the ball element. The results are shown in Fig. 3(d).
Figure 3. Ball elements filled into the three views, panels (a)-(d).
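The ball-filling step of Eqs. 2-3 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fill_ball_elements`, the elliptical outline, and the choice to include both ends of the major axis as stations are assumptions made here.

```python
import numpy as np

def fill_ball_elements(lower, upper, x_start, x_end, n_balls):
    """Place n_balls overlapping circles between a lower and an upper outline."""
    # Equally spaced stations along the major axis (end points included,
    # which is an assumption of this sketch).
    xs = np.linspace(x_start, x_end, n_balls)
    y1 = lower(xs)            # lower curve, y1 in Eqs. 2-3
    y2 = upper(xs)            # upper curve, y2 in Eqs. 2-3
    r = (y2 - y1) / 2.0       # Eq. 2: ball radius
    y = (y2 + y1) / 2.0       # Eq. 3: y-coordinate of the ball centre
    return xs, y, r

# Hypothetical elliptical outline: semi-axes 8 mm and 3 mm.
a, b = 8.0, 3.0

def upper(x):
    return b * np.sqrt(np.clip(1.0 - (x / a) ** 2, 0.0, None))

def lower(x):
    return -upper(x)

xs, ys, rs = fill_ball_elements(lower, upper, -a, a, 13)
```

Changing `n_balls` (for example 3, 13, or 23) trades shape accuracy against the number of elements the solver has to track.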
In the DEM, the computing amount is mainly determined by the number of circles (ball elements) in the pebble DEM model. As the number of ball elements decreases, the computing time decreases correspondingly, but the accuracy of the pebble DEM shape is reduced. Fig. 4 shows the simulation results with ball element numbers of 3, 13, and 23.
Figure 4. The simulation results with different circle numbers: (a) 3, (b) 13, (c) 23.
In this paper, five pebble samples were randomly selected from the outlines of the three views (Fig. 5(a)). Using the pebble shape construction method, the pebble DEM models were built (Fig. 5(b)), and the corresponding simplified pebble DEM models are shown in Fig. 5(c).
Figure 5. The pebble DEM models, panels (a)-(c).
2.2. The tire shape construction method
• Truck tire sample
The truck tire tested is a heavy load wagon tire (11.0R20 16PR). The side view of the tire is shown in Fig. 6. The side view of the truck tire is similar to a tooth structure, and the parameters of the tested tire, such as tire diameter, cog thickness, cover tire thickness, hub diameter, and cog length, are shown in Tab. 2.
Figure 6. Truck tire
Figure 7. Tire model design
Table 2. The parameters of the tested tire
Parameter                  Measured
Tire diameter/cm           110
Cog thickness/cm           1
Cover tire thickness/cm    1.5
Hub diameter/cm            60
Cog length/cm              2.4
• Tire DEM model shape construction method
To obtain an accurate shape for the tire DEM model, this paper used the close packing agglomerate method, in the following steps. First, based on the parameters in Tab. 2, the outlines of the tire were drawn with drawing software; the results are shown in Fig. 7. Second, the circular rim section was built with connected ball elements: (1) Two connected ball elements P1(x1,y1) and P2(x2,y2) in the rim section were randomly selected. (2) The coordinates of each new ball element tangent to the existing elements were calculated according to Eq. 4. (3) Within the limits of the maximum and minimum coordinates of the ball elements, the ball elements were extended to fill the rim zone.
$$\begin{cases} (x - x_1)^2 + (y - y_1)^2 = (r + r_1)^2 \\ (x - x_2)^2 + (y - y_2)^2 = (r + r_2)^2 \end{cases} \qquad (4)$$
Next, the trapezoid groove was constructed. The position of a point P3(x3,y3) relative to a line can be judged by Eq. 5. We randomly selected two points on the line, marked P4(x4,y4) and P5(x5,y5). With the coordinates of P4, P5, and P3, the value of S in Eq. 5 was calculated. If S was negative, P4P5P3 was clockwise; if S was positive, P4P5P3 was anticlockwise. The trapezoid groove on the rim zone of the tire DEM model is shown in Fig. 8(a).
$$S(P_4, P_5, P_3) = (x_4 - x_3)(y_5 - y_3) - (y_4 - y_3)(x_5 - x_3) \qquad (5)$$
Similarly, the inner part of the tire DEM model was built by the close packing agglomerate method (Fig. 8(b)). Next, a ball element representing the hub was placed in the middle of the tire DEM model; the results are shown in Fig. 8(c).
Figure 8. Tire DEM model construction process: (a) rim circle zone, (b) trapezoid groove on the rim zone, (c) tire DEM model.
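The two geometric tests used in the tire construction can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, tangency is taken to be external (centre distance equal to the sum of radii), and the orientation test is written in the standard signed-area (cross-product) form.

```python
import math

def tangent_centers(p1, r1, p2, r2, r):
    """Candidate centres of a ball of radius r tangent to two existing balls (Eq. 4)."""
    (x1, y1), (x2, y2) = p1, p2
    d1, d2 = r + r1, r + r2                 # required centre-to-centre distances
    d = math.hypot(x2 - x1, y2 - y1)        # distance between the existing centres
    if d == 0 or d > d1 + d2 or d < abs(d1 - d2):
        return []                           # no tangent position exists
    a = (d1**2 - d2**2 + d**2) / (2 * d)    # offset from P1 along the line P1 -> P2
    h = math.sqrt(max(d1**2 - a**2, 0.0))   # offset perpendicular to that line
    ux, uy = (x2 - x1) / d, (y2 - y1) / d   # unit vector from P1 to P2
    bx, by = x1 + a * ux, y1 + a * uy
    return [(bx - h * uy, by + h * ux), (bx + h * uy, by - h * ux)]

def orientation(p4, p5, p3):
    """Signed-area test: > 0 anticlockwise, < 0 clockwise, 0 collinear."""
    return (p4[0] - p3[0]) * (p5[1] - p3[1]) - (p4[1] - p3[1]) * (p5[0] - p3[0])

centers = tangent_centers((0.0, 0.0), 1.0, (2.0, 0.0), 1.0, 1.0)
side = orientation((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```

Eq. 4 generally has two solutions, one on each side of the line through P1 and P2; a packing routine would keep the one that falls inside the zone being filled, which is where the orientation test is useful.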
2.3. The shape construction methods for a truck escape ramp
The basic shape of the tested truck escape ramp is shown in Fig. 9. The truck escape ramp can be separated into two sections. The first section runs from 0 to 60 m, over which the laying depth increases from 0.075 m to 0.6 m. The second section runs from 60 to 140 m, with a laying depth of 0.6 m. The surface grade of the truck escape ramp is 7.47%.
Figure 9. Truck escape ramp design diagram
Based on the truck escape ramp design diagram, the truck escape ramp DEM model was built in the following steps. First, the first section of the truck escape ramp was divided into 60 parts, each 1.0 m in length. An extra part representing the second section of the truck escape ramp was also built. Second, the DEM model of the 61 parts of the truck escape ramp was built. Taking one part as an example, a box was first built with a “wall create” command. Next, pebble DEM models were built with preset ball numbers in rows and columns, with randomly generated pebble shapes and orientations. Using a “rain fall” method, the pebble DEM models were dropped into the box DEM model. Last, the few pebbles that fell outside the box were deleted; the results are shown in Fig. 10.
Figure 10. Simulation process of pebble DEM models dropping into the box DEM model
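The division of the ramp into 61 parts can be sketched as below. The depth values and section lengths come from the text; the linear increase of depth over the first 60 m, evaluating the depth at the midpoint of each part, and the names `laying_depth` and `parts` are assumptions of this sketch.

```python
def laying_depth(s):
    """Pebble laying depth in metres at distance s (m) from the ramp entry."""
    if s < 0.0 or s > 140.0:
        raise ValueError("position outside the 140 m arrester bed")
    if s <= 60.0:
        # Assumed linear increase from 0.075 m to 0.6 m over the first section.
        return 0.075 + (0.6 - 0.075) * s / 60.0
    return 0.6                                # constant depth in the second section

# 60 parts of 1.0 m in the first section, depth taken at each part's midpoint,
# plus one extra part representing the whole second section.
parts = [(i, i + 1, laying_depth(i + 0.5)) for i in range(60)]
parts.append((60, 140, laying_depth(100.0)))
```

Each tuple holds the start position, end position, and laying depth of one box to be filled with pebbles.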
3. DEM simulation of truck tires rolling onto randomly shaped pebbles
3.1. Road tests on a truck escape ramp
Road tests were conducted on a truck escape ramp located on K209+400 Road, Gansu Province, China (Fig. 11). The tested truck was a Delong F3000 (Fig. 12). The truck load is shown in Tab. 3.
Figure 11. The truck escape ramp
Figure 12. Delong F3000
Table 3. The truck load
Truck load   Front tire (kg)   Rear tire (kg)   Total mass (kg)
Value        7500              18600            25800
A VBOX – VGPS recorder was used to record truck speed and travel distance (Fig. 13).
Figure 13. VBOX – VGPS recorder
During the road tests, the entry truck speed was set to around 80 km/h. To reproduce the out-of-control condition, at the moment the truck entered the truck escape ramp,
the driver shifted into neutral and disengaged the clutch. While the truck rolled onto the arrester bed, the driver did nothing to control it. With an entry speed of 78 km/h and a truck front tire load of 3750 kg, the measured truck speed and travel distance are shown in Fig. 14.
Figure 14. The road test results: (a) truck speed (Velocity [km/h] vs. Time [s]), (b) travel distance (Distance [m] vs. Time [s]).
The results show that with an entry speed of 78 km/h, the truck stopped within 6.7 s, and the corresponding distance was 75.4 m. The results indicate that the truck's motion is unstable in the initial stage; as the travel distance increases, the motion stabilizes and the truck velocity decreases approximately linearly.
3.2. Simulations of truck tires rolling onto arrester beds
To limit the computing amount, an adaptive master-slave simulation method [14] was used. The process can be described in the following steps: (1) An arrester bed three parts (three meters) in length was built, and the truck tire started to roll onto the arrester bed (Fig. 15(a)). (2) When the truck tire DEM model reached the middle of the second part, the first part was deleted and the fourth part was built (Fig. 15(b)). (3) The former two steps were repeated until the tire DEM model stopped in the arrester bed.
Figure 15. The adaptive master-slave simulation method
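The bookkeeping behind the three steps above can be sketched as a sliding window over part indices. This is only an illustration of the window logic, not of the DEM solver: the tire positions are supplied as a hypothetical list, and the names `active_parts`, `PART_LENGTH`, and `WINDOW` are introduced here.

```python
from collections import deque

PART_LENGTH = 1.0   # length of one arrester bed part in metres (from the text)
WINDOW = 3          # number of parts kept in the model at any time

def active_parts(tire_positions):
    """Yield (position, active part indices) as the tire advances along the bed."""
    window = deque(range(WINDOW))                  # step (1): parts 0, 1, 2
    for s in tire_positions:
        # Step (2): the tire has reached the middle of the second active part.
        while s >= (window[1] + 0.5) * PART_LENGTH:
            window.popleft()                       # delete the part left behind
            window.append(window[-1] + 1)          # build the next part ahead
        yield s, list(window)                      # step (3): repeat until stop

# Hypothetical tire positions in metres; a solver would supply these.
history = list(active_parts([0.0, 1.4, 1.5, 2.6]))
```

Because only three parts exist at once, the number of pebble elements in the model stays roughly constant regardless of how far the tire travels.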
Using the adaptive master-slave simulation method, the process of truck tires rolling onto arrester beds was simulated. In the simulation, the density, normal stiffness, shear stiffness, and friction coefficient of the pebble DEM model were set to 2.777e3 kg/m3, 4.8e6 N/m, 2.4e7 N/m and 0.11, respectively. The corresponding tire DEM model parameters were set to 2.7e3 kg/m3, 9.3219e5 N/m, 3.1073e6 N/m and 0.11, respectively [23]. During the simulation, the truck speed and travel distance were recorded. A comparison of the road test and simulation results is shown in Fig. 16.
Figure 16. Comparison of the road tests and simulation results: (a) truck speed (Velocity [km/h] vs. Time [s]), (b) travel distance (Distance [m] vs. Time [s]); curves: Experiments, DEM.
The results show that in the simulation, with an entry speed of 78 km/h and a truck front tire load of 3750 kg, the truck stopped within 7.199 s, and the corresponding distance was 79.79 m. Compared with the road test results, the simulated travel distance is slightly longer, with an error of 5.82%. Although there are small fluctuations (burrs) between the simulation and road test curves, the difference is small, which verifies the feasibility of the simulation method.
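The reported distance error can be reproduced from the figures quoted in the text. The short check below also derives mean decelerations, which are not reported in the source and are included only as an illustrative by-product; the variable names are chosen here.

```python
# Values quoted in the text.
entry_speed_kmh = 78.0
test_time_s, test_distance_m = 6.7, 75.4      # road test
sim_time_s, sim_distance_m = 7.199, 79.79     # DEM simulation

# Relative error of the simulated stopping distance against the road test.
distance_error_pct = (sim_distance_m - test_distance_m) / test_distance_m * 100.0

# Mean decelerations (entry speed divided by stopping time), derived here.
entry_speed_ms = entry_speed_kmh / 3.6
mean_decel_test = entry_speed_ms / test_time_s
mean_decel_sim = entry_speed_ms / sim_time_s

print(f"distance error: {distance_error_pct:.2f} %")
```

The distance error evaluates to 5.82 %, matching the value stated above.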
4. Conclusion
Because the pebbles are placed randomly in the arrester beds, it is difficult to simulate truck tires rolling onto the randomly shaped pebbles. The DEM is mostly used for rigid discrete particles and is a suitable method for simulating the contact mechanics of randomly shaped pebbles. This paper simulated truck tires rolling onto arrester beds. First, to limit computing time, a pebble shape construction method was proposed based on the overlapping method. Compared with the real pebbles, the method reproduced the real oval and flat pebble shapes well. To improve computing accuracy, the truck tire DEM model was built with a close packing agglomerate method. Using an adaptive master-slave simulation method, the process of truck tires rolling onto the arrester beds was simulated, and road tests were conducted. The road test results verified the feasibility of the simulation method. Further study will focus on the following aspects: (1) an optimized force-displacement contact algorithm for the pebble and tire DEM models; and (2) the influence of vehicle dynamics on the truck tire DEM model.
5. Acknowledgements
This research was supported by the National Key R&D Program of China (2017YFC0803904), the China Postdoctoral Science Foundation (2018T111006, 2017M613034), the Key Research and Development Program of Shaanxi (2019ZDLGY15-02), the Youth Innovation Team of Shaanxi Universities, and the Yulin Municipal Science and Technology Project (211822190134).
References
[1] X. Zhao, S. Wang, M. Yu, Q. Yu and C. Zhou, The position of speed bump in front of truck scale based on vehicle vibration performance, J INTELL FUZZY SYST (2018) 1-13.
[2] P. Qin, Y. Han, H. Hu, G. Qin and J. Wang, Influence of Main Road Alignment on Handling Stability of Runaway Vehicle on Truck Escape Ramp. In: MATEC Web of Conferences. C.W. Lim and X. Zhu, eds. (2018).
[3] I.L. Al-Qadi and L.A. Rivera-Ortiz, Use of Gravel Properties to Develop Arrester Bed Stopping Model, J TRANSP ENG 117 (1991) 566-584.
[4] B. Li, Y. Chen and J. Chen, Modeling of soil-claw interaction using the discrete element method (DEM), SOIL TILL RES 158 (2016) 177-185.
[5] J. Sun, Y. Wang, Y. Ma, J. Tong and Z. Zhang, DEM simulation of bionic subsoilers (tillage depth > 40 cm) with drag reduction and lower soil disturbance characteristics, ADV ENG SOFTW 119 (2018) 30-37.
[6] K. Duan and C.Y. Kwok, Discrete element modeling of anisotropic rock under Brazilian test conditions, INT J ROCK MECH MIN 78 (2015) 46-56.
[7] W. Shen, T. Zhao, G.B. Crosta and F. Dai, Analysis of impact-induced rock fragmentation using a discrete element approach, INT J ROCK MECH MIN 98 (2017) 33-38.
[8] J.C.L. Perez, C.Y. Kwok and K. Senetakis, Effect of rubber size on the behaviour of sand-rubber mixtures: A numerical investigation, COMPUT GEOTECH 80 (2016) 199-214.
[9] C. Wang, A. Deng and A. Taheri, Three-dimensional discrete element modeling of direct shear test for granular rubber-sand, COMPUT GEOTECH 97 (2018) 204-216.
[10] M. Michael, F. Vogel and B. Peters, DEM-FEM coupling simulations of the interactions between a tire tread and granular terrain, COMPUT METHOD APPL M 289 (2015) 227-248.
[11] C.L. Zhao and M.Y. Zang, Application of the FEM/DEM and alternately moving road method to the simulation of tire-sand interactions, J TERRAMECHANICS 72 (2017) 27-38.
[12] G. Zhang, C. Sun, X. Cheng, J. Zhang and H. Liu, Determining method of length of arrested bed of truck escape ramp based on particle flow simulation, Journal of Highway and Transportation Research and Development 28 (2011) 118-123.
[13] P. Qin, C. Chen, S. Pei and X. Li, A PFC2D model of the interactions between the tire and the aggregate filled arrester bed on escape ramp. In: IOP Conference Series: Materials Science and Engineering (2017).
[14] P. Liu, Q. Yu, W. Wang and X. Zhao, Determination of Truck Escape Ramp Parameters in Tire-particle Simulation Using the Discrete Element Method, China Journal of Highway and Transport 32 (2019) 165-173.
[15] B. Yan and R.A. Regueiro, A comprehensive study of MPI parallelism in three-dimensional discrete element method (DEM) simulation of complex-shaped granular particles, COMPUT PART MECH 5 (2018) 553-577.
[16] D. Xu, Z. Tang and L. Zhang, Interpretation of coarse effect in simple shear behavior of binary sand-gravel mixture by DEM with authentic particle shape, CONSTR BUILD MATER 195 (2019) 292-304.
[17] L.J.H. Seelen, J.T. Padding and J.A.M. Kuipers, A granular Discrete Element Method for arbitrary convex particle shapes: Method and packing generation, CHEM ENG SCI 189 (2018) 84-101.
[18] B. Yan and R. Regueiro, Large-scale dynamic and static simulations of complex-shaped granular materials using parallel three-dimensional discrete element method (DEM) on DoD supercomputers, ENG COMPUTATION 35 (2018) 1049-1084.
[19] R. Fu, X. Hu and B. Zhou, Discrete element modeling of crushable sands considering realistic particle shape effect, COMPUT GEOTECH 91 (2017) 179-191.
[20] B. Wang, U. Martin and S. Rapp, Discrete element modeling of the single-particle crushing test for ballast stones, COMPUT GEOTECH 88 (2017) 61-73.
[21] Y. Guo, Y. Yang and X.B. Yu, Influence of particle shape on the erodibility of non-cohesive soil: Insights from coupled CFD-DEM simulations, PARTICUOLOGY 39 (2018) 12-24.
[22] S. Ma, Z. Wei and X. Chen, A systematic approach for numerical research of realistic shaped particle-fluid interactions, POWDER TECHNOL 339 (2018) 377-395.
[23] X. Zhao, P. Liu, Q. Yu, P. Shi and Y. Ye, On the Effective Speed Control Characteristics of a Truck Escape Ramp Based on the Discrete Element Method, IEEE ACCESS 7 (2019) 80366-80379.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200051
Image Examination System to Detect Gastric Polyps from Endoscopy Images
Nilanjan DEY a, Fuqian SHI b, Venkatesan RAJINIKANTH c
a Department of Information Technology, Techno India College of Technology, West Bengal, 740000, India
b College of Information and Engineering, Wenzhou Medical University, Wenzhou, PR China
c Department of Electronics and Instrumentation Engineering, St. Joseph’s College of Engineering, Chennai, 600119, India
Abstract. In pathological practice, a substantial number of procedures are followed to detect and analyze disease in humans. Usually, pathologists examine suspected disease at various levels, from tissues to organs, to discover the cause and the stage of the disease. The proposed study investigates endoscopy recorded digital pictures of Gastric Polyps (GP) and implements a Computer based Disease Examination Tool (CDET) to analyze the abnormal regions in the stomach. The proposed work comprises a threshold process based on the Brain-Storm-Optimization-Algorithm and Kapur’s Function (BSOA+KF) to enhance the polyp fragment, and segmentation based on the Active-Contour (AC) to mine the polyp segment. The performance of the implemented technique is checked using the benchmark GP endoscopy images of the CVC-ClinicDB dataset. The performance of the proposed CDET is confirmed by a relative assessment against the Ground-Truth images available in the considered database. Further, the performance of the AC segmentation is validated against the Chan-Vese (LAC) and Seed-Region-Growing (SRG) segmentation techniques. The results of this study confirm that the AC segmentation technique offers better performance values compared to LAC and SRG.
Keywords. Endoscopy images, Gastric polyps, Kapur’s function, Brain storm optimization, Active contour segmentation.
1. Introduction
Owing to the increase in disease occurrence rates, a considerable number of computer based diagnostic systems are widely adopted in hospitals to obtain pre- and post-analysis of patients’ conditions. Most diagnostic systems act as aiding devices that provide initial information regarding the health condition of the patient. The doctor also performs a routine investigation to confirm the treatment to be implemented to cure the patient [1-4]. In hospitals, disease in an internal or external organ is normally identified through a personal check by an experienced doctor, confirmed with bio-signals collected using electrodes and bio-images collected using suitable imaging techniques. Earlier research confirms that disease in an internal organ can be
N. Dey et al. / Image Examination System to Detect Gastric Polyps from Endoscopy Images
effectively verified with a suitable bio-imaging technique compared to the personal check and the bio-signal based approaches. Moreover, bio-image assisted assessment is widely adopted in medical clinics due to the availability of a range of imaging techniques and assessment software. This approach also provides the essential pre-opinion regarding the disease to be treated [5-9]. Medical imaging supports a variety of modalities, which help to obtain an accurate two-dimensional or three-dimensional view of the internal organ to be examined. Medical images can be examined systematically to recognize the location, category and severity of the disease. These pictures can be examined by an experienced doctor or by a dedicated computer program. Due to their clinical importance, several semi-automated and automated disease examination methods have been discussed and realized to evaluate medical pictures [10-14]. The proposed work aims to develop a Computer based Disease Examination Tool (CDET) to investigate abnormalities in the gastrointestinal system. In particular, it investigates Gastric Polyps (GP), since untreated GP may lead to stomach cancer. A common clinical diagnostic procedure followed to detect and confirm the severity of the GP involves (i) endoscopy assisted recognition and evaluation of the abnormal cell growth (GP) inside the stomach wall, and (ii) biopsy to confirm whether the collected cell is cancerous or not [15,16]. The initial assessment, such as the endoscopic technique, is normally used in clinics to record the GP in the form of digital pictures, which can be evaluated by an experienced doctor. The initial assessment by the doctor can be used to categorize the GP into the adenomatous or nonadenomatous class. After the classification, further confirmation can be done with the help of the polyp cells collected using biopsy [17,18].
The classification and the decision on whether a biopsy is needed are made with the help of the initial GP pictures collected using endoscopy. Further, the disease confirmation and the treatment practice depend mainly on the initial confirmation done with the help of the CDET. Hence, it is necessary to develop a CDET that examines digital GP pictures with better accuracy. In this work, the proposed CDET is developed by combining multi-thresholding and segmentation techniques that are widely adopted in the recent literature [19-23]. Assessment of GP pictures is quite complex because of the image complexity and the RGB (Red, Green, and Blue) channel histogram. Further, separating the GP mass from the stomach wall is difficult, since in most cases the color of the GP and the stomach wall is identical. Previous studies note that combining multi-thresholding and segmentation helps to mine the abnormal section from RGB scaled medical images. Hence, in this work, multi-thresholding using Kapur’s function and segmentation with Active-Contour (AC) are implemented to mine the abnormal region (GP) from the Digital-Endoscopy-Pictures (DEP). Entropy based image appraisal has presented superior outcomes in image enhancement and extraction tasks. Earlier works on Kapur’s entropy also suggest that this technique enhances grayscale and RGB-scale pictures sufficiently for the section of interest to be mined easily. After mining the GP section from the DEP, an evaluation between the Ground-Truth (GT) and the mined GP is executed to compute the essential Image-Performance-Values (IPV). In this work, the benchmark DEP collection called the CVC-ClinicDB dataset [24] is considered for the evaluation. These IPVs are then used to confirm the performance of the CDET on the considered DEP. This work also presents a study of existing segmentation techniques, namely the Active-Contour (AC), Chan-Vese (LAC) and Seed-Region-Growing (SRG) mining techniques available in the literature, and the outcome confirms that the AC offers a better result compared to LAC and SRG. The remaining part of the study is organised as follows: section 2 describes the earlier works executed to evaluate the GP pictures, section 3 presents the details of the proposed CDET, section 4 describes the results attained in this study and their discussion, and section 5 presents the conclusion of the proposed work.
2. Related works
Because of its clinical significance, a significant number of GP examination procedures have been discussed by researchers. Figueiredo et al. (2019) proposed a Local-Binary-Pattern (LBP) based diagnosis system for the GP and tested it on real clinical images and the CVC-ClinicDB [16]. With Support-Vector-Machine (SVM) classification, their work offered the following IPVs: LBP+SVM presented sensitivity = 99.6%, specificity = 78.4% and accuracy = 89.5%. Further, the LBP with polyp detection and SVM classification achieved sensitivity = 99.7%, specificity = 79.6% and accuracy = 90.1%. Mohammed et al. (2018) developed a Deep-Convolutional-Neural-Network (DCNN) model to detect the GP from the DEP and achieved precision = 87.4%, recall = 84.4%, F1 score = 85.9% and F2 score = 85.0% [17]. Shin et al. (2017) proposed a DCNN model and achieved precision = 91.4%, recall = 71.2%, F1 score = 80% and F2 score = 74.5% [18]. The earlier works provide sufficient insight into GP examination techniques. In this work, a heuristic approach based technique is implemented to examine the considered database.
3. Computer based Disease Examination Tool (CDET) Development
This portion of the work presents the outline of the CDET implemented to extract and evaluate the GP from the CVC-ClinicDB dataset.
Figure 1. Structure of implemented CDET
Figure 1 shows the different phases involved in the proposed CDET. First, the selected test pictures go through a multi-threshold operation which improves the GP
region of the DEP according to the preferred threshold value. This study employs the BSOA+KF method with a three-level threshold to improve the GP fragment. Later, the enhanced GP region is segmented with the chosen approach. The segmentation is initially executed with the AC, and then other techniques, such as LAC and SRG, are also adopted to extract the GP. Finally, a comparative investigation between the GT and the mined GP is executed and the necessary IPVs are computed. The benefit of the implemented practice is that it offers a better outcome due to the two-phase evaluation technique. The remaining part of this subdivision presents the details of CVC-ClinicDB, BSOA, KE, segmentation procedures and the IPVs.
3.1 Gastric Polyp Database
In this work, the DEP of the GP available in the CVC-ClinicDB dataset is considered for the examination [24]. This dataset has 612 RGB scaled test pictures of dimension 384x288 pixels, along with 612 GT pictures offered by an expert. Because the images are in RGB form, extracting essential information from them is quite complex. Most researchers adopt this database as the testing dataset for the CDETs they develop. The dataset consists of high-resolution clinical grade images collected using endoscopy. It contains series of pictures collected from the same volunteer, and in future this information can be considered to develop a mathematical model to diagnose the GP.
3.2 Brain-Storm-Optimization-Algorithm
BSOA was first implemented by Shi in 2011 [25], and due to its performance it is widely used to solve a class of optimization tasks. The algorithm is based on human problem solving and information sharing. BSOA consists of three major phases: (i) grouping of persons, (ii) disturbing group centres, and (iii) generating new solutions.
More discussion of the BSOA can be found in recent articles [26,27]. In this work, the traditional BSOA is adopted for the thresholding task, and its mathematical model is described below. Let there be N solutions in M groups formed by the K-means technique. When a new solution is to be developed, the existing cluster centre is disturbed and an individual ($B_{select}$) is picked from one cluster or by combining two clusters:
$$B_{select} = \begin{cases} B^{i} \\ \lambda \cdot B_1^{i} + (1 - \lambda) \cdot B_2^{i} \end{cases} \qquad (1)$$
where $\lambda$ = random variable [0 to 1], and $B_1^{i}$ and $B_2^{i}$ are the i-th dimension in the chosen clusters. The chosen idea is updated as
$$B_{new} = B_{select} + \xi \cdot N(0,1) \qquad (2)$$
where $N(0,1)$ is a Gaussian random value with mean = 0 and variance = 1, and $\xi$ is the varying factor:
$$\xi = \operatorname{logsin}\left( \frac{0.5 \cdot mi - ci}{k} \right) \cdot \lambda \qquad (3)$$
where logsin = logarithmic sigmoid function, k = varying rate of the slope, mi = maximum iteration and ci = current iteration. Other details regarding the BSOA can be found in [28,29].
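The idea-generation step described by Eqs. 1-3 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the logarithmic sigmoid is taken to be the usual logistic function 1/(1 + exp(-z)), and the random factor in the step size is passed in explicitly so that the behaviour is easy to inspect.

```python
import math
import random

def logsig(z):
    """Logistic (log-sigmoid) function; fine for the moderate z used here."""
    return 1.0 / (1.0 + math.exp(-z))

def step_size(mi, ci, k, rand):
    """Eq. 3: varying factor for maximum iteration mi and current iteration ci."""
    return logsig((0.5 * mi - ci) / k) * rand

def combine(b1, b2, lam):
    """Eq. 1, second case: blend two cluster ideas dimension by dimension."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(b1, b2)]

def new_idea(b_select, xi, rng):
    """Eq. 2: perturb the selected idea with scaled Gaussian noise."""
    return [b + xi * rng.gauss(0.0, 1.0) for b in b_select]

rng = random.Random(0)                            # seeded for repeatability
selected = combine([0.0, 10.0], [10.0, 0.0], 0.25)
xi_mid = step_size(mi=100, ci=50, k=20, rand=1.0)
candidate = new_idea(selected, xi_mid, rng)
```

With this form the step size is large in early iterations and shrinks after the midpoint of the run, so the search moves from exploration towards refinement. In the thresholding task, each idea would be a vector of candidate threshold values.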
N. Dey et al. / Image Examination System to Detect Gastric Polyps from Endoscopy Images
3.3 Kapur’s Function
Entropy is widely used in image assessment to characterise abnormality in images [30]. This work utilizes the Kapur’s Entropy (KE) technique discussed in [31]; a detailed explanation of the KE can be found in [20]. Here, the BSOA-based optimization search is employed to enhance the GP section based on the chosen threshold. The BSOA explores the search space until the KE reaches its maximal level.

3.4 Segmentation Schemes
Segmentation is normally adopted in image-processing schemes to extract the vital information from the test image with greater accuracy. In most cases, segmentation can be implemented with or without thresholding-based pre-processing [9,10], and applying segmentation after a suitable pre-processing step generally gives a better result. In most image examination tasks, semi-automated techniques such as Active-Contour (AC), Chan-Vese (LAC) and Seed-Region-Growing (SRG) are widely preferred due to their adaptability and accuracy. The details of these segmentation procedures can be found in [14,19,32].

3.5 Image-Performance-Values (IPV)
During disease examination based on medical pictures, the merit of the CDET is validated by calculating the IPVs, as discussed in earlier research works [14,19]. In this work, IPVs such as the Jaccard, Dice, Precision, Sensitivity, Specificity, Accuracy, Balanced Error Rate (BER) and Balanced Classification Rate (BCR) are calculated; their expressions can be found in [22,23].

4. Results and Discussion
This section presents the experimental results and the related discussion. The CVC-ClinicDB dataset consists of clinical-grade RGB GP images recorded in a controlled environment. Earlier works on GP images also confirm the need for a highly accurate and implementable procedure to examine the GP region from the DEP.
In this work, all 612 pictures are considered for the examination. For 58 images the proposed work gave a poor result; better results were obtained for the remaining 554 images, and these results are presented for the evaluation.

Table 1. Sample test images of CVC-ClinicDB (columns: image class, image, GT; rows: Image1 to Image5)
Figure 2. Outcome attained with implemented scheme. (a) Sample picture, (b) Processed image, (c) AC trace, (d) Mined section
Table 1 shows the GP images chosen for the experimental evaluation; a similar procedure is repeated for every other image of the database. Extraction of the GP region is quite complex since, in most of the images, the GP section and the background of the test image look similar. Hence, it is essential to initially separate the GP from the background using a suitable image processing scheme. In this work, the initial separation is achieved with the BSOA+KE based tri-level thresholding process. Later, the chosen segmentation technique is implemented to mine the GP region from the thresholded picture. Finally, a relative evaluation between the GT and the GP is carried out to verify the significance of the developed CDET. Fig 2 presents the result obtained for Image1 with the proposed CDET and AC segmentation. Fig 2(a) shows the test picture, Fig 2(b) and (c) show the threshold and AC segmentation results respectively, and Fig 2(d) depicts the extracted binary version of Image1. After mining the GP section, the GT and the mined GP are compared at pixel level; the acquired measures form the confusion matrix shown in Fig 3, from which the essential IPVs are computed. Here TP, FP, FN and TN represent true-positive, false-positive, false-negative and true-negative respectively. P and N indicate the positive and negative pixel values. Further, TPR, FNR, TNR and FPR denote the TP, FN, TN and FP rates respectively. From this confusion matrix, it can be observed that the IPVs obtained with AC segmentation for Image1 are superior, and roughly similar results are obtained with the LAC and SRG. This procedure is implemented on every picture in the dataset, and for 58 image cases the obtained IPVs are very low due to the poor result during
threshold and segmentation tasks. Hence, in this discussion, the results obtained with only 554 images (612-58=554 images) are considered to validate the outcome of the proposed CDET.
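The entropy objective that drives the tri-level thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this work: the function names are hypothetical, the histogram is a plain list of bin counts, and an exhaustive search stands in for the BSOA, which is practical only for very small histograms.

```python
import itertools
import math

def kapur_entropy(hist, thresholds):
    """Sum of the Shannon entropies of the classes cut by `thresholds`.

    Class j covers the bins [edges[j], edges[j + 1]).
    """
    edges = [0, *sorted(thresholds), len(hist)]
    entropy = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = sum(hist[lo:hi])
        if weight <= 0:
            return float("-inf")  # reject thresholds that leave a class empty
        for count in hist[lo:hi]:
            if count > 0:
                q = count / weight
                entropy -= q * math.log(q)
    return entropy

def best_thresholds(hist, n_thresholds=3):
    """Brute-force maximiser of Kapur's entropy (stand-in for the BSOA search)."""
    candidates = itertools.combinations(range(1, len(hist)), n_thresholds)
    return max(candidates, key=lambda t: kapur_entropy(hist, t))
```

For a 256-bin histogram the number of threshold combinations grows quickly, which is the usual motivation for replacing the exhaustive search with a heuristic optimiser.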
                     Actual positive       Actual negative
Identified positive  TP = 26140 (pixels)   FP = 570 (pixels)
Identified negative  FN = 3364 (pixels)    TN = 80518 (pixels)
Column total         P = TP+FN (pixels)    N = FP+TN (pixels)
Rates                TPR = 0.8859          TNR = 0.9930
                     FNR = 0.1141          FPR = 0.0070
                     TPR + FNR = 1         TNR + FPR = 1
Total pixels = P + N = 110592
Figure 3. Confusion matrix obtained for Image1
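As a worked check, the IPVs can be derived directly from the pixel counts of Fig 3. The sketch below is illustrative only: the function name is hypothetical, and the BCR and BER expressions (mean of sensitivity and specificity, and its complement) are the commonly used definitions, assumed here because the expressions of [22,23] are not reproduced in this section.

```python
def ipv_from_counts(tp, fp, fn, tn):
    """Pixel-level performance values (as fractions) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # TPR
    specificity = tn / (tn + fp)  # TNR
    bcr = 0.5 * (sensitivity + specificity)  # assumed definition
    return {
        "jaccard": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "bcr": bcr,
        "ber": 1.0 - bcr,  # assumed definition
    }

# Pixel counts reported for Image1 in Fig 3.
image1 = ipv_from_counts(tp=26140, fp=570, fn=3364, tn=80518)
```

Multiplied by 100, the Jaccard, Dice, precision, sensitivity, specificity and accuracy values agree with the Image1 row of Table 2 to within rounding. The BCR and BER depend on the assumed definitions and may therefore differ from the tabulated values.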
Figure 4. Results obtained for sample pictures. (a) Test-image, (b) Threshold picture, (c) Mined GP section, (d) GT
Table 2. Attained IPVs for CVC-ClinicDB with various segmentation procedures
Test image      Jaccard  Dice   Precision  Sensitivity  Specificity  Accuracy  BCR    BER
Image1          86.91    93.00  97.86      88.59        99.30        96.44     90.13  8.873
Image2          77.28    87.19  98.67      99.97        77.62        88.09     88.80  11.20
Image3          80.31    89.08  96.92      99.08        84.03        91.25     91.56  8.438
Image4          83.15    90.80  97.37      98.25        89.12        93.58     93.69  6.311
Image5          87.11    93.11  99.29      95.96        97.86        96.91     96.91  3.085
Average1        82.95    90.64  98.02      96.37        89.58        93.25     92.22  7.581
Average (AC)    83.17    90.20  98.45      98.86        87.06        93.27     93.84  5.948
Average (LAC)   82.86    89.65  98.26      98.44        86.37        90.83     91.15  7.944
Average (SRG)   82.15    88.39  97.55      97.61        84.18        90.28     90.86  8.265
Fig 4 presents the results obtained for the other sample test images. Fig 4(a) presents the RGB sample pictures, Fig 4(b) and (c) depict the threshold and AC segmentation results, and Fig 4(d) presents the GT. The comparison of GP and GT gave good results, and the values are listed in Table 2. Table 2 also gives the average IPVs attained with AC, LAC and SRG for the 554 test pictures of the CVC-ClinicDB database. Fig 5 shows a graphical comparison of the average performance measures obtained with the AC, LAC and SRG techniques. From Table 2 and Fig 5, it can be confirmed that the IPVs attained with the AC are superior to those of the LAC and SRG. Further, the trace convergence in AC is better than in the SRG.
Figure 5. Evaluation of the performance of AC, LAC and SRG based on IPVs (bar chart; y-axis: performance measure (%), 0 to 100; x-axis: Jaccard, Dice, Precision, Sensitivity, Specificity, Accuracy, BCR, BER)
This work aims to develop a CDET to extract the GP fragment from the RGB DEP using a BSOA-assisted technique. In future, a suitable feature mining and classification method can be implemented to categorise the GP images into normal and acute classes. Further, a Deep-Neural-Network (DNN) procedure can be developed to examine the GP regions and, along with the CVC-ClinicDB pictures, other test images available in [24] can be considered to test and validate the CDET. This may help to build a state-of-the-art diagnosis system to assist the doctor in disease evaluation and treatment planning.
5. Conclusion
This work implements an evaluation practice for gastric polyps using CVC-ClinicDB pictures. The study considered 554 test pictures (384x288 pixels) to investigate the performance of the developed CDET. Initially, the DEP is enhanced using the BSOA+KE and the GP segment is mined with the AC procedure. Later, the LAC and SRG techniques are implemented on the test pictures, and the result (GP) is compared against the GT to compute the IPVs. The experimental outcome validates that the proposed CDET is proficient in mining the GP section from the RGB pictures and also provides better result with lesser iteration time compared 87%) compared to LAC and SRG. The result of this work also confirms that, irrespective of the GP group, the proposed CDET offers roughly similar results across the image cases. In future, this tool can be enhanced by implementing a suitable machine-learning or deep-learning technique.

References
[1] B. Lakshmi, S. Parthasarathy, Human action recognition using median background and max pool convolution with nearest neighbor, International Journal of Ambient Computing and Intelligence, 10(2) (2019) 1-17. DOI: 10.4018/IJACI.2019040103.
[2] P. Chandrakar, A secure remote user authentication protocol for healthcare monitoring using wireless medical sensor networks, International Journal of Ambient Computing and Intelligence, 10(2) (2019) 96-116. DOI: 10.4018/IJACI.2019010106.
[3] N. Dey, A.S. Ashour, F. Shi, S.J. Fong, R.S. Sherratt, Developing residential wireless sensor networks for ECG healthcare monitoring, IEEE Transactions on Consumer Electronics, 63(4) (2017) 442-449.
[4] N. Dey, A.S. Ashour, S. Beagum, D.S. Pistola, M. Gospodinov, E.P. Gospodinova, J.M.R.S. Tavares, Parameter optimization for local polynomial approximation based intersection confidence interval filter using genetic algorithm: An application for brain MRI image de-noising, Journal of Imaging, 1(1) (2015) 60-84.
[5] D.R. Kishor, N.B. Venkateswarlu, A novel hybridization of expectation-maximization and K-means algorithms for better clustering performance, International Journal of Ambient Computing and Intelligence, 7(2) (2016) 47-74.
[6] M. Yamin, A.A.A. Sen, Improving privacy and security of user data in location based services, International Journal of Ambient Computing and Intelligence, 9(1) (2018) 19-42.
[7] N. Dey, A.S. Ashour, F. Shi, R.S. Sherratt, Wireless capsule gastrointestinal endoscopy: direction-of-arrival estimation based localization survey, IEEE Reviews in Biomedical Engineering, 10 (2017) 2-11.
[8] N. Dey, A.S. Ashour, F. Shi, S.J. Fong, R.S. Sherratt, Developing residential wireless sensor networks for ECG healthcare monitoring, IEEE Transactions on Consumer Electronics, 63(4) (2017) 442-449.
[9] N. Dey, V. Rajinikanth, A.S. Ashour, J.M.R.S. Tavares, Social group optimization supported segmentation and evaluation of skin melanoma images, Symmetry, 10(2) (2018) 51.
[10] V. Rajinikanth, N. Dey, S.C. Satapathy, A.S. Ashour, An approach to examine magnetic resonance angiography based on Tsallis entropy and deformable snake model, Future Generation Computer Systems, 85 (2018) 160-172. DOI: 10.1016/j.future.2018.03.025.
[11] V. Rajinikanth, N.S.M. Raja, K. Kamalanand, Firefly algorithm assisted segmentation of tumor from brain MRI using Tsallis function and Markov random field, Journal of Control Engineering and Applied Informatics, 19(2) (2017) 97-106.
[12] N. Dey, A.S. Ashour, F. Shi, S.J. Fong, J.M.R.S. Tavares, Medical cyber-physical systems: A survey, Journal of Medical Systems, 42(4) (2018) 74.
[13] S.S. Ahmed, N. Dey, A.S. Ashour, D. Sifaki- -Timar, V.E. Balas, J.M.R.S. Tavares, Effect of fuzzy partitioning in Crohn’s disease classification: a neuro-fuzzy-based approach, Medical & Biological Engineering & Computing, 55(1) (2017) 101-115.
[14] V. Rajinikanth, S.C. Satapathy, S.L. Fernandes, S. Nachiappan, Entropy based segmentation of tumor from brain MR images: A study with teaching learning based optimization, Pattern Recognition Letters, 94 (2017) 87-94.
[15] J. Bernal, F.J. Sánchez, G. Fernández-Esparrach, D. Gil, C. Rodríguez, F. Vilariño, WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians, Computerized Medical Imaging and Graphics, 43 (2015) 99-111.
[16] P.N. Figueiredo, I.N. Figueiredo, L. Pinto, S. Kumar, Y-H.R. Tsai, A.V. Mamonov, Polyp detection with computer-aided diagnosis in white light colonoscopy: comparison of three different methods, Endoscopy International Open, 7 (2019) E209-E215.
[17] A. Mohammed, S. Yildirim, I. Farup, M. Pedersen, Ø. Hovde, Y-Net: A deep convolutional neural network for polyp detection, arXiv:1806.01907 [cs.CV].
[18] Y. Shin, H.A. Qadir, L. Aabakken, J. Bergsland, I. Balasingham, Automatic colon polyp detection using region based deep CNN and post learning approaches, IEEE Access, 6 (2018) 40950-40962.
[19] V. Rajinikanth, S.C. Satapathy, Segmentation of ischemic stroke lesion in brain MRI based on social group optimization and fuzzy-Tsallis entropy, Arabian Journal for Science and Engineering, 43(8) (2018) 4365-4378.
[20] V. Rajinikanth, S.C. Satapathy, N. Dey, S.L. Fernandes, K.S. Manic, Skin melanoma assessment using Kapur’s entropy and level set: A study with bat algorithm, Smart Innovation, Systems and Technologies, 104 (2019) 193-202. DOI: 10.1007/978-981-13-1921-1_19.
[21] V. Jahmunah, S.L. Oh, V. Rajinikanth, E.J. Ciaccio, K.H. Cheong, N. Arunkumar, U.R. Acharya, Automated detection of schizophrenia using nonlinear signal processing methods, Artificial Intelligence in Medicine, (2019) 1-8. DOI: 10.1016/j.artmed.2019.07.006.
[22] N. Dey, F. Shi, V. Rajinikanth, Leukocyte nuclei segmentation using entropy function and Chan-Vese approach, Information Technology and Intelligent Transportation Systems, 314 (2019) 255-264. DOI: 10.3233/978-1-61499-939-3-255.
[23] S.C. Satapathy, V. Rajinikanth, Jaya algorithm guided procedure to segment tumor from brain MRI, Journal of Optimization, 2018, 12 (2018). DOI: 10.1155/2018/3738049.
[24] https://polyp.grand-challenge.org/CVCClinicDB/
[25] Y. Shi, Brain storm optimization algorithm, Lecture Notes in Computer Science, 6728 (2011) 303-309.
[26] A.R. Jordehi, Brainstorm optimisation algorithm (BSOA): An efficient algorithm for finding optimal location and setting of FACTS devices in electric power systems, International Journal of Electrical Power & Energy Systems, 69 (2015) 48-57.
[27] S. Cheng, Y. Shi, Q. Qin, T. Ting, R. Bai, Maintaining population diversity in brain storm optimization algorithm, in: Proceedings of 2014 IEEE Congress on Evolutionary Computation (CEC 2014), IEEE, Beijing, China, (2014) 3230-3237.
[28] S. Cheng, Y. Shi, Q. Qin, T. Ting, Q. Zhang, R. Bai, Population diversity maintenance in brain storm optimization algorithm, Journal of Artificial Intelligence and Soft Computing Research, 4(2) (2014) 83-97.
[29] S. Cheng, Q. Qin, J. Chen, Y. Shi, Brain storm optimization algorithm: a review, Artificial Intelligence Review, 46(4) (2016) 445-458.
[30] J.N. Kapur, P.K. Sahoo, A.K.C. Wong, A new method for gray-level picture thresholding using the entropy of the histogram, Comput Vision Graph Image Process, 29 (1985) 273-285.
[31] D. Shriranjani, S.G. Tebby, S.C. Satapathy, N. Dey, V. Rajinikanth, Kapur’s entropy and active contour-based segmentation and analysis of retinal optic disc, Lecture Notes in Electrical Engineering, 490 (2018) 287-295. DOI: 10.1007/978-981-10-8354-9_26.
[32] T.F. Chan, L.A. Vese, Active contours without edges, IEEE Transactions on Image Processing, 10(2) (2001) 266-277.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200052
Machine Learning Models for Bird Species Recognition Based on Vocalization: A Succinct Review
Nabanita DAS a, Atreyee MONDAL b, Jyotismita CHAKI c, Neelamadhab PADHY d, Nilanjan DEY e
a Department of Computer Science, Bengal Institute of Technology, Kolkata, India, [email protected]
b,e Department of Information Technology, Techno India College of Technology, Kolkata, India, E-mail: [email protected], [email protected]
c School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, 632014, India, [email protected]
d School of Computer Engineering (SOCE), GIET University, [email protected]

Abstract. Classification of birds based on their vocalization has become a significant research field. The acoustic sound produced by birds is very rich and can be used to detect their species. In earlier days ornithologists detected bird species manually, but manual recognition is costly and requires a huge amount of time. With the advancement of machine learning and deep learning, the classification of syllables has become more significant. Methods of automatic sound recognition consist of different stages, such as pre-processing of the input audio file, segmentation of syllables and feature extraction, followed by classification. In this study the models used for audio classification are concisely reviewed. Identification is challenging because of the strong similarity between different species. However, noise reduction in the audio files is possible using several machine learning models. Deep learning techniques, an emerging field in the classification domain, are also discussed in this review. Using these models, researchers can detect species, or even individual birds, from their vocalization in a more time-efficient way. This paper aims to deliver a review summary and to present guidelines for using the widely adopted machine learning techniques, in order to identify the challenges and future research directions of bird song recognition systems.
Keywords. Automatic bird species identification, Machine learning models, Syllable classification, Deep learning.
1. Introduction
Birds add life, sound and colour to our lives. Birds are numerous and responsive to ambient changes, which makes them simpler to observe than other species. Birds are among the most important indicators of the state of the environment. Detecting a bird species or an individual bird from its acoustic language can be a very effective tool for evaluating biodiversity, with many significant applications in ecology, bioacoustic monitoring and behavioural studies [1]. Because they are sensitive to habitat change and easy to census, birds can be the ecologist's favourite tool. Changes in bird
N. Das et al. / Machine Learning Models for Bird Species Recognition
populations are often the first indication of environmental problems [2]. Bird sounds, such as calls and songs, offer ornithologists and ecologists a great deal of information about regional activity and distribution when they assess the biological diversity of a region, which serves as a measure of ambient change and habitat loss. Bird calls are generally shorter than songs and indicate a particular function such as feeding or fighting. Songs are generally longer in duration and more complex than calls. Most bird species sing only during their breeding season, and in many species only the males sing. Bird songs are sometimes repetitions of certain transformative patterns with syllabic diversity. Some species even sing while flying, and some are only able to produce rhythmic sounds. In earlier days, bird sounds were identified by ornithologists using mist netting, transect counts, etc. This requires specialist knowledge, so identification was possible only for professionals. With the advancement of machine learning, bird sounds can be detected by automated systems, which allow non-specialists to identify the species or, in some cases, even individual birds. In this study, different machine learning models for bird sound recognition are concisely discussed. Bird species recognition from sound involves several stages: pre-processing of the input audio containing the bird sound, segmentation of syllables, feature extraction and classification. The main motivations of this study are assessing biodiversity, the state of nature, habitat quality analysis, mating calls, balancing ecology, tracking climate change, etc. Biodiversity is the measure of variation at the genetic, species and ecosystem levels. Birds can be treated as an excellent indicator of environmental health. A previous study shows that woods inhabited by woodpeckers also host more birds of other species [3]. By monitoring the variation in woodpecker species, scientists can therefore characterise bird diversity in the region. For bird biodiversity conservation, accurate bird recognition is essential. Bird population surveys reveal the effect of land utilization and land management on avian species and are fundamental for ornithologists and ecologists around the globe. Birds play a vital role in balancing nature. Birds eat insects [4]; warblers, bluebirds and woodpeckers, for example, are insect-eating birds. They are a natural way to control pests in gardens, on farms and in other places. A group of birds gliding through the air can easily eat hundreds of insects each day. Birds also readily indicate overall habitat quality. To predict the habitat quality for a particular species, both its concentration in a region and the number of birds are considered. A related study reported that, as woodland habitats degrade, the bird species composition changes in a predictable manner [3]. Similarly, a specialist species can indicate habitat quality; for example, the red-cockaded woodpecker has specific nesting requirements. The sensitivity of such species to specific ecosystem elements makes them a valuable sign of habitat standard. Birds attract their mates mostly by their songs. A mating call is the auditory signal used by animals to attract mates. When looking for a healthy mate, birds can produce a variety of songs that show maturity and intelligence, which are highly desirable characteristics [5]. Birds play a significant role in keeping the ecosystem running smoothly. They help in plant dispersal, and they help fertilize plants by moving pollen from flower to flower [4]. Hummingbirds, sunbirds and honeyeaters are common pollinators. Birds tell us about the health of the planet. If the climate suddenly changes, they can sense it first. Because of this, they are our early-warning system for pressing concerns such as climate change [6].
In this study, different methods for bird species identification from sound are discussed in section 2, which covers pre-processing of sound, segmentation of syllables, feature extraction and classification. Some widely used machine learning models based on supervised, unsupervised and semi-supervised approaches are studied in this work. In section 3, the challenges of this domain are addressed and the future scope of this study is reported. Finally, section 4 presents the conclusion.

2. Methods of Bird Species Recognition
In recent years, automatic detection of sound has become significant in the field of signal processing. An audio classification system consists of pre-processing of the sound, segmentation of syllables and feature extraction, followed by classifier design. In this study, an automatic classification system for bird species identification based on sound is considered. Bird song is generated by a special organ, the syrinx, which is similar in function to the human vocal cords but more complicated in structure [7]. This anatomic complexity of the syrinx causes wide variations in bird calls. Very few species, mostly parrots, use their tongues in a speech production mechanism similar to that of human beings [8]. In earlier days, the acoustic communication among birds was detected by ornithologists using mist netting, transect counts, etc. Advances in machine learning make it possible to develop an automatic computer-aided system that identifies an individual bird or its species from the bird song using several methodologies. In this section, the stages needed to implement such a system are reviewed briefly.

2.1. Preprocessing and Segmentation of bird song
The fundamental step in implementing an automatic bird song identification system is the sequential pre-processing of each audio frame [9]. The input audio frames need to be pre-processed before classification; this includes noise reduction, conversion of the raw data into an appropriate format, reduction of processing time, etc. Pre-processing separates the raw audio file into signal and noise parts. A spectrogram is calculated for the noise and signal parts using a short-time Fourier transform (STFT) [10] with a Hamming window (size 512, overlap 74%), and the spectrogram is treated as a grayscale image [11]. The STFT values range between 0 and 1, which is used to separate the bird song from the background noise.
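The spectrogram computation described above can be sketched with NumPy alone. The sketch is illustrative and not taken from the reviewed works: the function name is hypothetical, the magnitude is scaled by its peak so that the values lie between 0 and 1, and the later signal and noise separation is not shown.

```python
import numpy as np

def stft_magnitude(audio, sample_rate, frame=512, overlap=0.74):
    """Hamming-windowed STFT magnitude, scaled to [0, 1].

    Returns (freqs, mag), where mag has shape (frame // 2 + 1, n_frames).
    """
    audio = np.asarray(audio, dtype=float)
    if len(audio) < frame:
        raise ValueError("audio is shorter than one analysis frame")
    hop = frame - int(frame * overlap)  # 134 samples for the defaults
    window = np.hamming(frame)
    n_frames = 1 + (len(audio) - frame) // hop
    frames = np.stack(
        [audio[i * hop : i * hop + frame] * window for i in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # rows: frequency, columns: time
    peak = mag.max()
    if peak > 0:
        mag /= peak
    freqs = np.fft.rfftfreq(frame, d=1.0 / sample_rate)
    return freqs, mag
```

The resulting array can be handled as a grayscale image, so image-processing operations such as filtering can be applied to it directly.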
A further effective approach, reported in [12], applies a 4 by 4 filter to suppress noise in the spectrogram. After pre-processing, the bird calls (signal part) are segmented into syllables. In this context, a syllable is defined as the smallest unit of sound that a bird can produce with a single breath. An iterative time-domain algorithm is proposed in [13], where syllable segmentation is discussed in detail. The aim of time-domain segmentation is to detect false syllables, that is, non-bird or environmental noise, and to group the mixture of syllables belonging to a particular bird species when several birds sing at the same time. The time-domain approach can effectively segment syllables that overlap in time.

2.2. Feature Extraction
Syllable segmentation is followed by feature extraction, and the features are categorized as spectral and temporal. To detect bird species robustly, the features computed from the syllables form the feature vector. The features of the segmented syllables are represented as overlapping frames. In [14],
the spectral feature Mel Frequency Cepstral Coefficients (MFCC) is proposed to represent bird songs. MFCC is widely used to represent the power spectrum of sound and for genre classification in audio systems. A deep neural network (DNN) performs well with MFCC features for the identification of bird species. The temporal features consist of the zero-crossing rate (ZCR) and the time span of the elements in a bird call, as reported in [15]. ZCR indicates the rate at which the signal changes sign from positive to negative or vice versa. It is used to retrieve bird sounds or to recognize acoustic language in an audio file. A comparative study of feature sets, based on their performance in automated bird species identification, is presented in [16]. The maximum probability of the segments extracted from the training audio patterns is determined by template matching using normalized cross-correlation [17].

2.3. Syllable Classification
In wildlife monitoring, the classification of bird song has become more challenging. To detect a bird species or an individual bird from its vocalization, the syllables need to be classified using different classifiers. The main task of the classifier is to decide the most suitable class for the obtained pattern by estimating the similarity between the test sample and the target pattern of each class. There are three conventional machine learning approaches to classification, namely supervised, unsupervised and semi-supervised classification. Those used in bird sound recognition are discussed here.

2.3.1. Machine Learning Approaches
I. Supervised learning
Supervised learning trains the machine on objects with known labels. Various conventional supervised machine learning algorithms are frequently used to identify bird song with high accuracy, such as Random Forest (RF) [18], Support Vector Machine (SVM) [19, 20], Decision Trees [21], Multi-instance multilabel [22] and deep learning based approaches. A random forest (RF) classifier consists of a large collection of decision trees that act together as an ensemble [23]. This structure helps the trees protect each other from errors. The advantage of RF is that, even if some of the decision trees are wrong, the ensemble as a whole can still move in the right direction as long as others are correct [24]. Therefore, RF generates a more accurate classification than individual decision trees. A supervised RF classifier is used in [25] for sound detection in a noisy environment. The artificial neural network is one of the most widely used techniques for classification because of its good learning ability, parallelism, low sensitivity to noise, etc. [26]. It has an input layer, an output layer and one or more hidden intermediate layers. The layers are connected through weighted links [27]. By iteratively adjusting the weights, the performance of the network can be improved. The support vector machine is one of the most widely used conventional supervised machine learning algorithms for bird sound classification. It can handle nonlinear patterns, which are harder to distinguish than linear
patterns [28]. In this algorithm, the fundamental step is to train the machine, followed by prediction on new data. To increase accuracy, it sometimes uses a dynamic kernel parameter for prediction [29]. Here, a dynamic kernel means a kernel with variable length, which is used to find the resemblance between feature vectors [30]. The kernel functions used in SVM include linear, quadratic and polynomial kernels. The decision tree is another popular supervised learning model for the classification of bird sounds. It is a framework that produces a tree-like structure and a set of rules describing the different classes in a dataset. The dataset is broken into progressively smaller parts, which incrementally produces a tree with decision nodes and leaf nodes that represent the different classes [31]. The multi-instance multilabel (MIML) algorithm, a supervised approach, is adopted in [32] to detect bird species from audio recordings. MIML is a framework in which a bag of instances is linked with multiple class labels. Here, the bag of instances denotes the objects to be classified: the input audio file is converted into a bag of instances, which is then passed to the classifiers. The MIML framework generally performs well for complex targets with multiple semantic meanings, e.g. bird song. Another supervised algorithm is K-nearest neighbour (KNN). In KNN, the input consists of the k closest training samples and the output is a class membership. An object is classified by a plurality vote of its neighbours, being assigned to the class most common among its k nearest neighbours (where k is a positive integer) [33]. However, supervised algorithms need a large number of training samples to reach high accuracy. Deep learning is a subset of machine learning that has become popular for classification in recent years because of its higher accuracy when trained on large amounts of data [34].
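The plurality vote of the K-nearest neighbour rule described above can be sketched as follows. This is a minimal illustration, not a method from the reviewed works: the function name is hypothetical, the feature vectors are assumed to be plain numeric sequences (for example, per-syllable feature vectors), and Euclidean distance is an assumed choice of metric.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Assign `query` to the most common label among its k nearest neighbours."""
    # Pair every training sample's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    # Plurality vote over the k closest samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

An odd k avoids tied votes in two-class problems; with more classes, ties are still possible and would need an explicit tie-breaking rule.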
Deep learning learns classes incrementally through its hidden-layer architecture, from lower-level to higher-level concepts; for example, it first learns a word and then the whole sentence. A relevant study shows that the accuracy of deep learning is much higher than that of conventional machine learning. Conventional machine learning algorithms first break the input audio down into clusters, solve the individual clusters and combine them at the final stage, whereas deep learning solves the audio end to end as a whole. However, deep learning algorithms take much longer than traditional machine learning algorithms. Artificial Neural Network (ANN) [35, 36], Convolutional Neural Network (CNN) [37, 38, 39] and Recurrent Neural Network (RNN) are some of the most dominant algorithms in deep learning. CNN works on spatial data and is more powerful than RNN, which is suitable for temporal data. CNN takes an input of fixed size and produces an output of fixed size. RNN can use its internal memory to process arbitrary sequences of inputs, which makes it well suited to speech processing systems. In [40], two different types of CNN are used for large-scale bird species identification based on bird song, to gain higher accuracy in noisy surroundings.

II. Unsupervised learning
Unsupervised learning is also used nowadays for the classification of bird song. In unsupervised learning, the output variables corresponding to the input data are unknown, since no pre-existing set of labels is available [41]. Unsupervised learning does not require manual data labelling and is widely used for clustering. The aim is to find the
best dimensionality reduction of the input data set. The authors of [42] proposed an automatic method to measure the distance between two syllables using unsupervised classification.
III. Semi-supervised learning
Semi-supervised learning (a hybrid of the supervised and unsupervised approaches) has also been reported for syllable classification. In this technique, a large amount of unlabelled data is used together with a smaller amount of labelled data [43]. Previous studies in this domain report that semi-supervised models improve learning accuracy.
3. Discussion
The state-of-the-art methods presented in this study clearly show the significance of machine learning models for classifying birds based on their sound [44]. The challenges in this field are prominent because of ambient noise, large volumes of unlabelled data, similarity between classes, etc. Ambient noise mixed with the bird sound needs to be reduced by preprocessing. Moreover, an input audio file sometimes contains more than one bird species, and the overlapping vocalizations are treated as noise. ANN performs competently for classification because it is less affected by noise. Training on a large amount of data is a tedious process that also requires labelled data, and labelling a large volume of data is time-consuming. If unlabelled data is fed in, a supervised model cannot be trained. To overcome this problem, new methodologies such as in-stream supervision are emerging, in which data is labelled at the time of natural use. Sometimes similarities between different classes prevent distinct identification. Some relevant studies proposed a future design using a much larger dataset to train the automatic identification system. Future work in this field may introduce solutions for low-quality, noisy bird sound recordings that represent real-world scenarios.
4. Conclusion
In this study, different machine learning models and their application to bird recognition based on bird song are discussed. Bird sounds are sometimes used by music composers and are frequently observed by ornithologists and ecologists. Birds are very sensitive to climate change, and as a result some species are almost extinct. Manual detection of bird sounds for species classification requires trained professionals and is costly and time-consuming. Machine learning and deep learning techniques have therefore become popular for automatically identifying bird species, or individual birds, from their vocalizations. These methods include pre-processing of the input audio files containing bird sound, segmentation of the pre-processed files into syllables, feature extraction and, finally, classification using various classifiers. In recent years, with the advancement of machine learning, different classifiers have been applied to this task. The relevant literature proposes supervised, unsupervised and semi-supervised approaches, which are concisely reviewed in this work. Supervised approaches include fundamental algorithms such as SVM, RF and decision trees. ANN is widely used among these because it is less affected by noise. Unsupervised algorithms also give good classification performance. Semi-supervised learning needs to be explored further. Deep learning algorithms have become very
popular nowadays over traditional machine learning algorithms for bird sound recognition, reported briefly in this study.
References
[1] Frommolt, K. H., Bardeli, R., & Clausen, M. (2008). Computational bioacoustics for assessing biodiversity. In Proceedings of the International Expert meeting on IT-based detection of bioacoustical patterns, BfN-Skripten (Vol. 234).
[2] https://birdfriendlyiowa.org/Pages/BirdFriendlyIowa.aspx?pg=6 (last access date: 23/07/2019)
[3] https://www.environmentalscience.org/birds-environmental-indicators (last access date: 23/07/2019)
[4] https://www.ck12.org/biology/bird-ecology/lesson/Importance-of-Birds-MS-LS/ (last access date: 23/07/2019)
[5] https://www.thespruce.com/bird-courtship-behavior-386714 (last access date: 23/07/2019)
[6] https://www.birdlife.org/worldwide/news/why-we-need-birds-far-more-they-need-us (last access date: 23/07/2019)
[7] A. S. King and J. McLelland, eds., Form and Function in Birds, vol. 4. Academic Press, 1989.
[8] D. K. Patterson and I. M. Pepperberg, “A comparative study of human and parrot phonation: Acoustic and articulatory correlates of vowels,” J. Acoust. Soc. Am., vol. 96, pp. 636–648, August 1994.
[9] Lasseck, M. (2014, February). Large-scale Identification of Birds in Audio Recordings. In CLEF (Working Notes) (pp. 643-653).
[10] Sen, S., Dutta, A., & Dey, N. (2019). Audio Processing and Speech Recognition: Concepts, Techniques and Research Overviews. Springer.
[11] Sprengel, E., Jaggi, M., Kilcher, Y., & Hofmann, T. (2016). Audio based bird species identification using deep learning techniques (No. CONF, pp. 547-559).
[12] Fazeka, B., Schindler, A., Lidy, T., & Rauber, A. (2018). A multi-modal deep neural network approach to bird-song identification. arXiv preprint arXiv:1811.04448.
[13] K. Ito, K. Mori, and S. Iwasaki, “Application of dynamic programming matching to classification of budgerigar contact calls,” J. Acoust. Soc. Amer., vol. 100, no. 6, pp. 3947–3956, Dec. 1996.
[14] Dan Stowell and Mark D. Plumbley, “Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning,” PeerJ, vol. abs/1405.6524, 2014.
[15] Fagerlund, S. (2007). Bird species recognition using support vector machines. EURASIP Journal on Applied Signal Processing, 2007(1), 64-64.
[16] Lopes, M. T., Junior, C. N. S., Koerich, A. L., & Kaestner, C. A. A. (2011, October). Feature set comparison for automatic bird species identification. In 2011 IEEE International Conference on Systems, Man, and Cybernetics (pp. 965-970). IEEE.
[17] Lewis, J. P. (1995). Industrial Light & Magic. Fast normalized cross-correlation, 2011.
[18] Ho, T. K. (1995, August). Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition (Vol. 1, pp. 278-282). IEEE.
[19] Fagerlund, S. (2007). Bird species recognition using support vector machines. EURASIP Journal on Applied Signal Processing, 2007(1), 64-64.
[20] Zemmal, N., Azizi, N., Dey, N., & Sellami, M. (2016). Adaptive semi supervised support vector machine semi supervised learning with features cooperation for breast cancer classification. Journal of Medical Imaging and Health Informatics, 6(1), 53-62.
[21] Herr, A., Klomp, N. I., & Atkinson, J. S. (1997). Identification of bat echolocation calls using a decision tree classification system. Complexity International, 4, 1-9.
[22] Zhou, Z. H., & Zhang, M. L. (2007). Multi-instance multi-label learning with application to scene classification. In Advances in neural information processing systems (pp. 1609-1616).
[23] Barandiaran, I. (1998). The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell, 20(8), 1-22.
[24] Ho, T. K. (1995, August). Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition (Vol. 1, pp. 278-282). IEEE.
[25] Neal, L., Briggs, F., Raich, R., & Fern, X. Z. (2011, May). Time-frequency segmentation of bird song in noisy acoustic environments. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2012-2015). IEEE.
[26] I. A. Basheer and M. Hajmeer, “Artificial neural networks: fundamentals, computing, design, and application,” J. Microbiol. Methods, vol. 43, no. 1, pp. 3–31, 2000.
[27] Bala, R., & Kumar, D. D. (2017). Classification Using ANN: A Review. International Journal of Computational Intelligence Research, 13(7), 1811-1820.
[28] Londhe, S. S., & Kanade, S. S. (2015). Bird Species Identification Using Support Vector Machine.
[29] Chakraborty, D., Mukker, P., Rajan, P., & Dileep, A. D. (2016, December). Bird call identification using dynamic kernel-based support vector machines and deep neural networks. In 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 280-285). IEEE.
[30] Dileep, A. D., & Sekhar, C. C. (2013). GMM-based intermediate matching kernel for classification of varying length patterns of long duration speech using support vector machines. IEEE Transactions on Neural Networks and Learning Systems, 25(8), 1421-1432.
[31] Han, J., Pei, J., & Kamber, M. (2011). Data mining: concepts and techniques. Elsevier.
[32] Zhou, Z. H., & Zhang, M. L. (2007). Multi-instance multi-label learning with application to scene classification. In Advances in neural information processing systems (pp. 1609-1616).
[33] Yong, Z., Youwen, L., & Shixiong, X. (2009). An improved KNN text classification algorithm based on clustering. Journal of computers, 4(3), 230-237.
[34] Goëau, H., Glotin, H., Vellinga, W. P., Planqué, R., & Joly, A. (2016, September). Lifeclef bird identification task 2016: The arrival of deep learning.
[35] Balfoort, H. W., Snoek, J., Smiths, J. R. M., Breedveld, L. W., Hofstraat, J. W., & Ringelberg, J. (1992). Automatic identification of algae: neural network analysis of flow cytometric data. Journal of Plankton Research, 14(4), 575-589.
[36] Chatterjee, S., Hore, S., Dey, N., Chakraborty, S., & Ashour, A. S. (2017). Dengue fever classification using gene expression data: a PSO based artificial neural network approach. In Proceedings of the 5th international conference on frontiers in intelligent computing: theory and applications (pp. 331-341). Springer, Singapore.
[37] Stefan Kahl, Thomas Wilhelm-Stein, Hussein Hussein, Holger Klinck, Danny Kowerko, Marc Ritter, and Maximilian Eibl. Large-Scale Bird Sound Classification using Convolutional Neural Networks.
[38] Lasseck, M. (2018, September). Audio-based Bird Species Identification with Deep Convolutional Neural Networks. In CLEF (Working Notes).
[39] Li, Z., Dey, N., Ashour, A. S., Cao, L., Wang, Y., Wang, D., ... & Shi, F. (2017). Convolutional neural network based clustering and manifold learning method for diabetic plantar pressure imaging dataset. Journal of Medical Imaging and Health Informatics, 7(3), 639-652.
[40] Tóth, B. P., & Czeba, B. (2016, September). Convolutional Neural Networks for Large-Scale Bird Song Classification in Noisy Environment. In CLEF (Working Notes) (pp. 560-568).
[41] Hinton, G. E., Sejnowski, T. J., & Poggio, T. A. (Eds.). (1999). Unsupervised learning: foundations of neural computation. MIT press.
[42] Ranjard, L., & Ross, H. A. (2008). Unsupervised bird song syllable classification using evolving neural networks. The Journal of the Acoustical Society of America, 123(6), 4358-4368.
[43] Zhu, X. J. (2005). Semi-supervised learning literature survey. University of Wisconsin-Madison Department of Computer Sciences.
[44] Stastny, J., Munk, M. and Juranek, L., 2018. Automatic bird species recognition based on birds vocalization. EURASIP Journal on Audio, Speech, and Music Processing, 2018(1), p.19.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200053
Chirp Code Deterministic Compressive Sensing: Analysis on Power Signal
Katia MELO a, Mahdi KHOSRAVY a, b, 1, Carlos Augusto Duque a, and Nilanjan DEY c
a Federal University of Juiz de Fora, MG, Brazil
b University of the Ryukyus, Okinawa, Japan
c Techno India College of Technology, Kolkata, India
Abstract. In the era of the Internet of Things and big data, a substantial problem in data transmission, processing and storage is high data volume. In the case of harmonics, because of the high-frequency content and the sampling rate required by the Nyquist sampling theorem, even short-length data becomes impractical for efficient data transmission. Compressive sensing (CS) addresses this problem by exploiting the sparsity of the signal to measure it with considerably fewer samples than conventional methods. While most CS techniques use random sampling matrices, deterministic CS uses well-known matrices and exploits their characteristics. CS by chirp codes is one such deterministic technique; this paper gives a detailed analysis of the technique and its application to power signals. The analysis includes the effect of deviations in the parameters of the harmonics, inter-harmonics and sub-harmonics.
Keywords. Harmonics, Frequency estimation, Signal reconstruction, Compressive Sensing, Sparsity, Chirp codes
1 Corresponding Author.
1. Introduction
Signals, images and natural data sources in general have sparse characteristics: the information is not spread uniformly but concentrated sparsely. For example, in an image of scenery, most areas are nearly uniform with minimal variation in intensity, while the few areas that carry most of the information also contain most of the variation. This property is called sparseness. Based on it, Compressive Sensing (CS) is a novel technique that processes a signal using fewer measurements than the usual methods [1]. CS works with signals that are sparse or compressible, either in the original domain or in some transform domain. Accordingly, CS samples the signal at a lower rate than the one required by the Nyquist theorem. This is attractive because the rate required by Nyquist is sometimes so high that it leads to a very high computational cost [2]. A signal can be sparse in its original domain or in another one, such as the Fourier domain. For example, a voltage signal is not sparse in the time domain, but its Fourier transform is sparse. Measuring a compressible signal without CS requires computing the transform coefficients from all the samples and then, for compression, retaining only the larger coefficients and discarding the smaller ones before storage or transmission. In the case of high compression, many coefficients have to be discarded, which loses information [3]. In CS the sampling rate is governed by the sparsity of the signal, which is why fewer samples suffice. CS measurements are non-adaptive, meaning that they are not obtained by learning from previous measurements, and this reduces the computational cost. CS is therefore particularly practical where measurements are costly or time-consuming, where sensing is power-limited, or where there are health-related limits, as in medical applications [4]. The potential applications of CS cover a wide range, such as ECG signal quantization [5, 6], biomedical acoustic sensing [7], telecommunication systems [8-11], the Internet of Things [12], image enhancement [13, 14], economic data modelling, etc. Choosing the CS matrix is the main challenge. Although CS was originally proposed with random measurement matrices [15], such matrices are problematic because they cannot be stored and reproduced at the receiver, so the matrix has to be transmitted along with the signal. With that in mind, this paper uses a deterministic, structured measurement matrix. Its advantages are clear [16]: faster acquisition, lower storage requirements, reproducibility and reduced transmission overhead [17]. The disadvantage is that more measurements are required than with random matrices. The measurement matrix of this paper is built from chirp signals, that is, sinusoids whose frequency changes at a constant rate [18].
2. Compressive sensing
Compressive sensing requires the signal to be sparse in its current domain or in some orthonormal basis. Because of this sparsity, the signal can be compressed into fewer samples, which requires less memory to store the data. The level of compression depends on the sparsity level of the signal.
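The idea that a signal can be dense in time yet sparse in the Fourier domain can be checked numerically. The sketch below is only an illustration: the waveform (a fundamental plus a weaker third harmonic), the length `N = 256`, the bin positions and the helper names `dft` and `count_significant` are assumptions chosen so that both tones fall exactly on DFT bins.

```python
import cmath
import math

N = 256  # number of samples (illustrative choice)

# Hypothetical voltage waveform: fundamental at DFT bin 4 plus a weaker
# third harmonic at bin 12, both placed exactly on DFT bins.
x = [math.cos(2 * math.pi * 4 * n / N) + 0.3 * math.cos(2 * math.pi * 12 * n / N)
     for n in range(N)]

def dft(signal):
    """Naive discrete Fourier transform (adequate for short signals)."""
    size = len(signal)
    return [sum(signal[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size))
            for k in range(size)]

def count_significant(values, rel_tol=1e-6):
    """Count entries whose magnitude exceeds rel_tol times the largest magnitude."""
    peak = max(abs(v) for v in values)
    return sum(1 for v in values if abs(v) > rel_tol * peak)

time_count = count_significant(x)
freq_count = count_significant(dft(x))
print(time_count, "of", N, "time samples are non-negligible")
print(freq_count, "of", N, "DFT coefficients are non-negligible")
```

Almost every time sample is non-zero, whereas only four DFT coefficients are (two per real-valued tone). This concentration is the sparsity that CS exploits.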
The method is called compressive sensing because it directly measures the signal in its compressed form; in other words, sensing and compression are done at the same time. The basic equation of CS is shown in Eq. (1):

$$y = \Phi x \qquad (1)$$

where $y$ is the compressed sensed signal vector, $\Phi$ is the CS matrix and $x$ is the sparse signal. That is, $y$ is obtained by sensing $x$, a signal of length $N$, with a matrix $\Phi$ of size $M \times N$, where $N$ is much larger than $M$. This rectangular $M \times N$ matrix gives $M$ measurements by linearly combining the $N$ samples of the signal. Compression with a ratio of $N/M$ therefore happens at the same time as the measurement, hence the name compressive sensing. The signal to be sensed may not be sparse in its original domain, but it can be expressed in some orthonormal basis $\Psi$ in which the signal $s$ is a sparse representation of $x$, as shown in Eq. (2):

$$x = \Psi s \qquad (2)$$

This is the basis of CS; the next step is to obtain $y$ and recover $x$. In this method, with a deterministic matrix $\Phi$, we simply multiply the signal by the deterministic CS matrix to compress it, and then recover the signal through the inverse process. So $y$ is also the product of $\Phi$, $\Psi$ and $s$, as shown in Eq. (3), and we can form a matrix $\Theta$ as the combination of the matrices $\Phi$ and $\Psi$:

$$y = \Phi \Psi s = \Theta s \qquad (3)$$

The main challenge of CS is finding a recovery matrix when, in the general case, no prior information is available about the source signal or the CS matrix. Finding the recovery matrix is very similar to finding the un-mixing matrix in blind source separation (BSS) [19-23], which looks for a matrix that maximizes a given characteristic of the recovered signals.
3. Chirp Sensing Matrix
The signal recovery problem can be approached by searching for the signal corresponding to the linear combination of the columns of $\Theta = \Phi\Psi$ that formed $y$. In other words, $\Phi\Psi$ is the key to recovery, which is why the CS matrix matters for both compressive measurement and recovery. Most CS techniques use random CS matrices and are called non-deterministic CS techniques. In contrast, deterministic techniques use a CS matrix with a special formulation, which is especially important on the recovery side. One deterministic CS technique uses chirp codes [24]; it is the main interest of this paper, and we review it here in simple terms. The chirp code CS matrix has $K$-length chirp codes as its columns and achieves a data compression ratio of $K$, since the matrix has dimension $K \times K^2$. A $K$-length chirp code with chirp rate $r$ and base frequency $m$ is as follows:

$$v_{r,m}(l) = e^{\frac{2\pi i}{K} r l^2 + \frac{2\pi i}{K} m l}, \qquad r, m, l \in \mathbb{Z}_K \qquad (4)$$

4. Recovery of the signal sensed by chirp CS matrix
After creating the chirp sensing matrix, we describe the steps to recover the signal. So far we have the sparse signal $s$ and the chirp sensing matrix, which together form the vector $y$. Consider this vector $y$, indexed by $l$ and formed from a linear combination of a few chirp signals, as in Eq. (5):

$$y(l) = s_1 e^{\frac{2\pi i}{K} r_1 l^2 + \frac{2\pi i}{K} m_1 l} + s_2 e^{\frac{2\pi i}{K} r_2 l^2 + \frac{2\pi i}{K} m_2 l} + \cdots \qquad (5)$$

where the $i$-th chirp has base frequency $m_i$ and chirp rate $r_i$, and $i$ indexes the columns of the chirp matrix that contribute to $y$. The chirp rates can be recovered from $y$ by looking at $f(l) = \overline{y(l)}\, y(l+T)$, as shown in Eq. (6), where the index $l + T$ is taken mod $K$. This gives [24]:

$$f(l) = |s_1|^2 e^{\frac{2\pi i}{K}(m_1 T + r_1 T^2)} e^{\frac{2\pi i}{K} 2 r_1 T l} + |s_2|^2 e^{\frac{2\pi i}{K}(m_2 T + r_2 T^2)} e^{\frac{2\pi i}{K} 2 r_2 T l} + \cdots + \mathrm{CT} \qquad (6)$$

where CT denotes the cross terms, which are of the form

$$\overline{s_p}\, s_q\, e^{\frac{2\pi i}{K}(m_q T + r_q T^2)}\, e^{\frac{2\pi i}{K}\left[(r_q - r_p)\, l^2 + (m_q - m_p + 2 r_q T)\, l\right]}, \qquad p \neq q.$$

We see that $f(l)$ has sinusoids at the discrete frequencies $2 r_i T \bmod K$. $K$ must be a prime number for this method, because only then is there a bijection from chirp rates to FFT (fast Fourier transform) bins. The remainder of the signal consists of the cross terms. Since the cross terms are chirps, their energy is spread across all FFT bins. Since $s$ is sparse, $y$ consists of sufficiently few chirps. Thus, the FFT of $f(l)$ has significant peaks at $2 r_i T \bmod K$, and from these peaks we extract the chirp rates, which are then used to reconstruct the signal. Once a chirp rate is known, the chirps with that rate can be turned into sinusoids. To do this we multiply $y(l)$ by the exponential $e^{-\frac{2\pi i}{K} r_i l^2}$. We then take the FFT of the resulting signal and obtain the values of $m_i$ and $s_i$. Accordingly, the CS matrix is chosen with the chirps of Eq. (4) as its columns:

$$\Theta_{l,\, rK + m} = e^{\frac{2\pi i}{K} r l^2 + \frac{2\pi i}{K} m l}$$

We can then write $\Theta$ in terms of a set of matrices $U_r$ such that:

$$\Theta = \left[\, U_0 \;\; U_1 \;\; \cdots \;\; U_{K-1} \,\right]$$

where $U_r$ is a $K \times K$ matrix whose columns are chirp signals with a fixed chirp rate $r$ and base frequency $m$ ranging from 0 to $K - 1$. Thus, for example, for $K = 2$ and $r, m, l \in \{0, 1\}$, $\Theta = [\,U_0 \;\; U_1\,]$ is a $2 \times 4$ matrix. We see that $y = \Theta s$ has the form of Eq. (5). Given $y$ formed using $\Theta$, we summarize the algorithm described in [24]:
1. Choose a $T \neq 0$ and a stopping energy $\varepsilon$.
2. Form $f(l) = \overline{y(l)}\, y(l+T)$ and take its length-$K$ FFT.
3. Find the location of the peak in the FFT as $2 r_i T \bmod K$ and record the unique $r_i$ corresponding to that location.
4. Multiply $y(l)$ by $e^{-\frac{2\pi i}{K} r_i l^2}$ and take the length-$K$ FFT.
5. Find the location of the peak and record it as $m_i$. Use the value of the peak to recover $s_i$.
6. Replace $y$ with $y(l) - s_i e^{\frac{2\pi i}{K} r_i l^2 + \frac{2\pi i}{K} m_i l}$.
7. Repeat steps (2)-(6) until $\|y\|^2 < \varepsilon$ or until $M$ iterations have been performed.
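The seven steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the chirps are taken as unnormalized unit-modulus sequences, the column ordering `j = r*K + m` is assumed, a naive DFT stands in for the FFT (reasonable only for small prime K), the iteration cap `max_iter` plays the role of M, and the function names `chirp`, `dft`, `sense` and `recover` are hypothetical.

```python
import cmath
import math

def chirp(K, r, m):
    """Length-K chirp with chirp rate r and base frequency m (unnormalized)."""
    return [cmath.exp(2j * math.pi * (r * l * l + m * l) / K) for l in range(K)]

def dft(x):
    """Naive DFT, used here in place of an FFT because K is small."""
    K = len(x)
    return [sum(x[l] * cmath.exp(-2j * math.pi * k * l / K) for l in range(K))
            for k in range(K)]

def sense(s, K):
    """Compute y = Theta s, assuming column index j = r*K + m."""
    y = [0j] * K
    for j, coeff in enumerate(s):
        if coeff != 0:
            r, m = divmod(j, K)
            y = [yl + coeff * cl for yl, cl in zip(y, chirp(K, r, m))]
    return y

def recover(y, K, T=1, eps=1e-12, max_iter=50):
    """Iterative chirp detection and removal (steps 1-7); K must be an odd prime."""
    y = list(y)
    s_hat = [0j] * (K * K)
    # Bijection from FFT bin (2*r*T mod K) back to chirp rate r
    rate_of_bin = {(2 * r * T) % K: r for r in range(K)}
    for _ in range(max_iter):
        if sum(abs(v) ** 2 for v in y) < eps:          # step 7: stopping energy
            break
        # Step 2: f(l) = conj(y(l)) * y(l + T), index taken mod K
        f = [y[l].conjugate() * y[(l + T) % K] for l in range(K)]
        spectrum = dft(f)
        # Step 3: the peak location gives the chirp rate
        peak = max(range(K), key=lambda k: abs(spectrum[k]))
        r = rate_of_bin[peak]
        # Step 4: dechirp with the detected rate and transform again
        dechirped = [y[l] * cmath.exp(-2j * math.pi * r * l * l / K) for l in range(K)]
        spectrum = dft(dechirped)
        # Step 5: the peak location gives m, the peak value gives the coefficient
        m = max(range(K), key=lambda k: abs(spectrum[k]))
        coeff = spectrum[m] / K
        s_hat[r * K + m] += coeff
        # Step 6: subtract the detected chirp from the measurement
        y = [yl - coeff * cl for yl, cl in zip(y, chirp(K, r, m))]
    return s_hat

K = 17
s = [0.0] * (K * K)
s[50], s[60] = 1.25, 0.95          # assumed amplitudes of a 2-sparse signal
s_hat = recover(sense(s, K), K)
max_error = max(abs(a - b) for a, b in zip(s_hat, s))
print("largest coefficient error:", max_error)
```

Each coefficient estimate is contaminated by the other chirps still present in the residual. This sketch therefore accumulates corrections for the same column over several iterations rather than stopping after one pass per component.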
We must also consider the use of chirp codes in the special case of signals that are sparse in the Fourier domain, as well as the use of the algorithm to mitigate interference and long-term noise [8].
[Figure 1: plot titled "A Sparse Signal"; horizontal axis 0 to 300 (sample index), vertical axis 0 to 1.5]
Figure 1. A sparse signal.
5. Experimental Results and Discussion
In this section we show the results obtained from simulating the sensing by the chirp code CS matrix and the recovery explained in the previous section.
A. Experiment on a sparse signal
Consider the sparse signal in Figure 1. It has a length of 289 samples, all of which are zero except two, which means high sparsity. We create a chirp matrix, which in this case has $17^2 = 289$ columns and $K = 17$ rows. We then multiply the chirp matrix by the sparse signal vector and obtain $y$, which is a linear combination of chirp codes with coefficients equal to the signal samples. At this stage, we have a $K$-length vector $y$ as the measurement of the signal vector of length $K^2$. Each of the chirp codes involved in the linear combination can be detected and separated by the process explained in Section 4. To decode $y$, we form $f(l) = \overline{y(l)}\, y(l+T)$, in which the index $l + T$ is taken circularly (mod $K$); we then apply the FFT and obtain the chirp rate $r$.
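The sensing and first detection step of this experiment can be reproduced in outline. The sketch below is hedged: the positions 50 and 60 of the two non-zero samples follow the figures, but their amplitudes (1.25 and 0.95), the shift `T = 1`, the column ordering `j = r*K + m` and the helper `chirp` are assumptions made for illustration.

```python
import cmath
import math

K = 17          # number of measurements (rows of the chirp matrix), prime
N = K * K       # signal length: 289 samples (columns of the chirp matrix)
T = 1           # shift used to form f(l); any non-zero value mod K works

def chirp(r, m):
    """Column of the chirp matrix with chirp rate r and base frequency m."""
    return [cmath.exp(2j * math.pi * (r * l * l + m * l) / K) for l in range(K)]

# 2-sparse signal: non-zero samples at indices 50 and 60 (amplitudes assumed)
s = [0.0] * N
s[50], s[60] = 1.25, 0.95

# Compressive measurement y = Theta s: 289 samples become 17 measurements
y = [0j] * K
for j, coeff in enumerate(s):
    if coeff != 0:
        r, m = divmod(j, K)             # assumed ordering j = r*K + m
        y = [yl + coeff * cl for yl, cl in zip(y, chirp(r, m))]

# First detection step: DFT magnitude of f(l) = conj(y(l)) * y(l + T)
f = [y[l].conjugate() * y[(l + T) % K] for l in range(K)]
magnitude = [abs(sum(f[l] * cmath.exp(-2j * math.pi * k * l / K) for l in range(K)))
             for k in range(K)]
peak_bin = max(range(K), key=lambda k: magnitude[k])
detected_rate = next(r for r in range(K) if (2 * r * T) % K == peak_bin)

print("measurements:", len(y), "compression ratio:", N // len(y))
print("(r, m) of samples 50 and 60:", divmod(50, K), divmod(60, K))
print("peak bin:", peak_bin, "detected chirp rate:", detected_rate)
```

Under the assumed ordering, samples 50 and 60 map to chirp rates 2 and 3, so the detected rate should be one of these two; which one is found first depends on how the cross terms add up at the two peaks.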
[Figure 2 panels: "Corresponding chirp code for sample s(60)" and "Corresponding chirp code for sample s(50)" (real and imaginary parts), each beside "Amplitude of FFT of chirp code" for the same sample]
Figure 2. (left) Corresponding chirp codes to the sparse signal samples in Fig. 1, and the amplitude of their FFT (right).
[Figure 3 panels: "Amplitude of FFT of the compressed sensed signal, i.e. the linear mixture of chirps" and "Amplitude of FFT of f(l)"]
Figure 3. (up) Amplitude of FFT of the compressed sensed signal, i.e. the linear mixture of chirp codes, (bottom) amplitude of FFT of f(l).
[Figure 4 panel: "FFT of dechirped"]
Figure 4. Amplitude of FFT of the dechirped f(l).
Figure 2 (left) shows the chirp codes corresponding to the two non-zero samples of the sparse signal s(n) shown in Figure 1. As can be seen, the codes are complex-valued sequences; their real and imaginary parts are shown separately as sequences of circles and stars, respectively. Figure 2 (right) shows the amplitude of the FFT of the chirp sequences in Figure 2 (left). There is no dominant frequency, and the frequency content of a chirp code is flat. Figure 3 shows the amplitude of the FFT of $f(l)$, where the index corresponding to $2rT \bmod K$ is dominant; this is the chirp frequency. Once we have the chirp rate $r$, we want to obtain the base frequency $m$. This is done by dechirping $y$ with the detected chirp rate, after which $m$ appears as the dominant frequency, as can be observed in the FFT of the $r$-dechirped sequence in Figure 4. To dechirp, we multiply $y(l)$ by $e^{-\frac{2\pi i}{K} r l^2}$. Following the same approach of detection and dechirping, the chirp code coefficients are found one by one, from the most dominant to the weakest, until the energy remaining in the signal falls below a certain threshold. Figure 5 shows the reconstructed sparse signal. We obtained a Mean Square Error (MSE) of 0.2219 and a Mean Absolute Error (MAE) of 0.0493, reconstructing a signal of 289 samples from just 17 samples.
[Figure 5: reconstructed signal, horizontal axis 0 to 300 (sample index), vertical axis 0 to 1.4, with zoomed insets around samples 50 and 60]
Figure 5. The reconstructed sparse signal of Figure 1 from the compressed sensed samples measured by chirp code CS.
B. Experiment on synthetic power line signal
A power line signal is a sparse-spectrum signal. A wide range of signals are of this kind, having a sparse spectrum in the Fourier domain. A very good example is harmonics, in which each tone corresponds to one sample in Fourier space. A single-tone signal is 1-sparse in the Fourier domain, which means just one non-zero sample and all the rest zero. Here, we ran an experiment on a double-tone signal composed of a fundamental and its 60th harmonic, with the following frequencies and corresponding amplitudes, as a synthetic example of a power signal:

$$(\text{frequency},\ \text{amplitude}) \in \{(20\ \mathrm{Hz},\ 30),\ (1200\ \mathrm{Hz},\ 100)\} \qquad (11)$$

$K$ is taken as 41, and the signal vector length is $K^2 = 1681$ samples. Figure 6 shows the above-mentioned case study signal and its FFT.
Figure 6. The sparse-spectrum signal under study, composed of a fundamental and its 60th harmonic.
Since the signal is sparse in the Fourier domain, its FFT is given to the chirp compressive sensing matrix. The result is a short complex vector of length 41, whose FFT is shown in Figure 7 (top). Applying the steps given in Section 4, the parameters of the components are estimated. Figure 7 shows the FFT of the compressed sensed signal (up), the FFT of f(l) (middle) and the FFT of the dechirped signal (bottom) for the detection in the first iteration. After the frequencies and amplitudes of the harmonics have been estimated, the signal is synthesized from the estimated parameters. Figure 8 shows a segment of both the original signal and the reconstructed one (up), and their FFTs (bottom left and right). Although the signal is reconstructed well, a visible amount of error remains. The mean absolute error and mean squared error per sample were measured as 10.5215% and 15.2440%, respectively. Error mitigation is therefore essential to increase the efficiency of the technique.
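The final synthesis and error measurement can be illustrated with a short sketch. It rests on several assumptions that are not given in the text: a sampling rate of 3362 Hz (chosen so that both tones fall on DFT bins of a 1681-sample record), zero-phase cosines, and purely hypothetical estimated amplitudes of 29 and 97. The percentage normalization behind the reported error figures is not stated in the text, so plain per-sample errors are computed instead.

```python
import math

FS = 3362.0   # assumed sampling rate in Hz (not given in the text)
N = 1681      # signal length, K^2 with K = 41

def synthesize(components, fs=FS, n_samples=N):
    """Sum of zero-phase cosines from (frequency_hz, amplitude) pairs."""
    return [sum(a * math.cos(2 * math.pi * f * n / fs) for f, a in components)
            for n in range(n_samples)]

def mean_absolute_error(x, x_hat):
    return sum(abs(a - b) for a, b in zip(x, x_hat)) / len(x)

def mean_squared_error(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Original two-tone signal: 20 Hz with amplitude 30, 1200 Hz with amplitude 100
original = synthesize([(20.0, 30.0), (1200.0, 100.0)])
# Hypothetical estimated parameters with small amplitude errors (illustrative only)
reconstructed = synthesize([(20.0, 29.0), (1200.0, 97.0)])

mae = mean_absolute_error(original, reconstructed)
mse = mean_squared_error(original, reconstructed)
print(f"MAE = {mae:.4f}, MSE = {mse:.4f}")
```

With these made-up estimates the error signal is itself a two-tone signal with amplitudes 1 and 3, so the per-sample MSE equals (1² + 3²)/2 = 5. This shows how amplitude estimation errors translate directly into reconstruction error.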
[Figure 7 panels: "FFT of Compressed Sensed Signal", "FFT of fsig", "FFT of dechirped"; horizontal axis 0 to 40]
Figure 7. FFT of the compressed sensed signal (up), the FFT of f(l) (middle), and FFT of the dechirped signal (bottom).
[Figure 8 panels: "Original signal and the reconstructed one by CS" (horizontal axis 0 to 100), "FFT of the original signal" and "FFT of the reconstructed signal" (horizontal axis 0 to 1500)]
Figure 8. Original signal and the recovered signal (up), and their FFT respectively at bottom left and right.
6. Future Scope
In our future work, we will focus on increasing the accuracy and efficiency of the technique. After applying noise mitigation techniques, we will carry out a rigorous statistical analysis of its accuracy and robustness. Furthermore, we will apply the technique to the compressed measurement of power-line harmonics and the estimation of their components. The technique involves many details; another future task is to go through each technical detail of this CS technique and analyse it accurately. In light of the importance of the technique and its attractive aspects, the authors will give a comprehensive tutorial on chirp compressive sensing in their future work.
7. Conclusion
In this paper, we validated the algorithm proposed in [24] and applied it to a synthetic power-line signal. The methodology was applied to two case study signals: a 2-sparse signal in the time domain and a power line signal as a sparse-spectrum signal with two tones. Chirp CS was applied in both cases; the signal was compressively sensed and reconstructed. The process was clarified step by step through figures of the signal and its FFT. Finally, the estimation error of the technique was measured. Chirp code CS is an efficient deterministic technique with a compression capability of K times. In our future work, we will apply the same approach to compressively sensed power line harmonics with noise mitigation.
References
[1] D. L. Donoho, Compressed Sensing, IEEE Transactions on information theory, vol. 52, no. 4, (2006), pp. 1289–1306.
[2] A. Cohen, W. Dahmen, and R. DeVore, Compressed sensing and best k-term approximation, Journal of the American mathematical society, vol. 22, no. 1, (2009), pp. 211–231.
[3] H. Liu, H. Zhang, and L. Ma, “On the spark of binary LDPC measurement matrices from complete protographs,” IEEE Signal Processing Letters, vol. 24, no. 11, (2017), pp. 1616–1620.
[4] M. Khosravy, N. Dey, Compressive Sensing in Health Care, Elsevier, in Press, (2020).
[5] MH. Sedaaghi and M. Khosravi, “Morphological ecg signal preprocessing with more efficient baseline drift removal,” in 7th. IASTED International Conference, ASC, (2003), pp. 205–209.
[6] M. Khosravy, M. R. Asharif, and M. H. Sedaaghi, “Morphological adult and fetal ecg preprocessing: Employing mediated morphology,” in IEICE Tech. Rep., IEICE, vol. 107, (2008), pp. 363–369.
[7] N. Dey, A.S. Ashour, W.S. Mohamed, and N.G. Nguyen, Acoustic sensors in biomedical applications. In Acoustic sensors for biomedical applications, Springer, Cham. (2019), pp. 43-47.
[8] F. Asharif, S. Tamaki, MR. Alsharif, M. Khosravy, HG. Ryu, Performance improvement of constant modulus algorithm blind equalizer for 16 QAM modulation. International Journal on Innovative Computing, Information and Control. 7(4), (2013), 1377-84.
[9] M. Khosravy, MR. Alsharif, M. Khosravi, and K. Yamashita, An optimum pre-filter for ICA based multi-input multi-output OFDM system. In 2010 2nd International Conference on Education Technology and Computer, Vol. 5, (2010), pp. V5-129.
[10] M Khosravy, N Punkoska, F Asharif, MR Asharif, Acoustic OFDM data embedding by reversible Walsh-Hadamard transform, AIP Conference Proceedings 1618 (1), (2014), 720-723.
[11] M. Khosravy, M.R. Alsharif, K. Yamashita, An efficient ICA based approach to multiuser detection in MIMO OFDM systems. In Multi-Carrier Systems & Solutions 2009, Springer, Dordrecht (2009), pp. 47-56.
[12] C. Bhatt, N. Dey, & A.S. Ashour, Internet of things and big data technologies for next generation healthcare, (2017).
[13] M. Khosravy, N. Gupta, N. Marina, I.K. Sethi, and M.R. Asharif, Brain action inspired morphological image enhancement. In Nature-Inspired Computing and Optimization, Springer. Cham., (2017) pp. 381-407.
[14] M Khosravy, N Gupta, N Marina, IK Sethi, MR Asharif, Perceptual adaptation of image based on Chevreul–Mach bands visual phenomenon, IEEE Signal Processing Letters 24 (5) (2017), 594-598.
[15] D. L. Donoho and M. Elad, “Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization,” Proceedings of the National Academy of Sciences, vol. 100, no. 5, (2003), pp. 2197–2202.
[16] H. Rauhut, “Compressive sensing and structured random matrices,” Theoretical foundations and numerical methods for sparse recovery, vol. 9, (2010), pp. 1–92.
[17] S. D. Howard, A. R. Calderbank, and W. Moran, “The finite Heisenberg-Weyl groups in radar and communications,” EURASIP Journal on Applied Signal Processing, vol. 2006, (2006), pp. 111–111.
[18] R. A. DeVore, “Deterministic constructions of compressed sensing matrices,” Journal of complexity, vol. 23, no. 4, (2007), pp. 918–925.
[19] M. Khosravy, M.R. Alsharif, and K. Yamashita, An efficient ICA based approach to multiuser detection in MIMO OFDM systems. In Multi-Carrier Systems & Solutions, Springer, Dordrecht, (2009), pp. 47-56.
[20] M. Khosravy, M.R. Asharif, K. Yamashita, A theoretical discussion on the foundation of Stone’s blind source separation. Signal, Image and Video Processing, 5(3), (2011) pp. 379-388.
[21] M. Khosravy, M.R. Asharif, K. Yamashita, A Probabilistic Short-length Linear Predictability Approach to Blind Source Separation. In 23rd International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2008), Yamaguchi, Japan, (2008) pp. 381-384.
[22] M. Khosravy, M.R. Asharif, K. Yamashita, A PDF-matched short-term linear predictability approach to blind source separation. International Journal of Innovative Computing, Information and Control (IJICIC), 5(11), (2009), pp. 3677-3690.
[23] M Khosravy, N Gupta, N Marina, F Asharif, MR Asharif, IK Sethi, Blind components processing a novel approach to array signal processing: A research orientation, In 2015 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2015, pp. 20-26.
[24] L. Applebaum, S. D. Howard, S. Searle, and R. Calderbank, “Chirp sensing codes: Deterministic compressed sensing measurements for fast recovery,” Applied and Computational Harmonic Analysis, vol. 26, no. 2, (2009), pp. 283–290.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200054
A Risk Assessment of Ship Navigation in Complex Waters
Chen Mao a,1, Hao Wang a, Jinyan Zheng a, Mingdong Wang a and Chunhui Zhou a
a School of Navigation, Wuhan University of Technology, Wuhan, P.R. China, 430063 (E-mail: [email protected])

Abstract: With the development of safety science, traditional accident management has given way to risk management. Based on an analysis of accident cases, this paper proposes a method for establishing a risk assessment model for ship navigation. It constructs a relative risk model based on multiple factors and a Bayesian risk consequence model under stochastic information. Before a ship enters complex waters, factors such as accident frequency, risk degree and accident consequence can be analyzed quantitatively and combined into a risk assessment model for navigation accidents, with the aim of reducing the probability of such accidents.

Keywords: Navigation; Risk assessment; Complex waters; Measurement model
1. Introduction

Safety of navigation is an enduring theme of maritime transport. Advances in modern ship design and manufacture have greatly reduced the probability of navigation accidents, and ships are relatively safe compared with other means of transportation. However, because of the complex structure of ships and of the marine environment, maritime traffic accidents still occur from time to time. Navigation accidents not only cause huge losses to shipping companies but also have a serious impact on the public. Navigation accidents have many causes, and the navigation environment is an important factor in navigation safety. In the course of a voyage, ships often face complex environmental conditions such as fog, numerous channel branches, and high temperature and humidity. At present, such complex navigation waters are found in almost every waterway jurisdiction, for example the section at the Jingzhou Yangtze River Bridge. To reduce the occurrence of navigation accidents, it is necessary to evaluate the risk of ship navigation in complex waters. Many researchers have studied how the risk of the navigation environment changes, and the application of some optimization models has provided useful guidance. For example, Stoschek, Oliver and others have studied in depth the impact of flow changes on the navigation environment of ships in complex waters. However, different shipping companies adopt different optimization models for ships with different missions and different shipping areas.
1 Corresponding Author.
C. Mao et al. / A Risk Assessment of Ship Navigation in Complex Waters
Extreme weather can also change the navigation environment during a voyage and strongly affect navigation. For example, in a storm the ship's position should be determined with reference to the position of the typhoon center, its direction of movement and the change of wind direction, and reasonable measures should be taken accordingly. Dietsche, Daniel et al. studied the impact of storm conditions on the navigation environment of ships. Waves also affect a ship under way; they not only have a great influence on navigation but are also highly uncertain. To ensure safe navigation in complex waters, it is therefore important to study how waves change in such waters. For example, Choi, Junwoo et al. studied the effects of abnormal wave changes on the navigation environment, which provides a useful reference for ships sailing under abnormal wave conditions.

2. Navigation Risk

2.1 Classification of navigation accidents

Ship navigation accidents are traffic accidents involving ships and floating facilities in ocean, coastal and inland navigable waters, such as collision, grounding, water ingress, sinking, capsizing, hull damage, fire, explosion, main engine damage, cargo damage, crew casualties and marine pollution. China's "Statistical Law on Water Traffic Accidents" classifies water traffic accidents as follows:

1. Collision accidents: accidents in which damage is caused by a collision between two or more ships. Collision accidents may cause casualties, ship damage, sinking and other consequences. The level of a collision accident is determined by the casualties or direct economic losses;
2. Grounding accidents: accidents in which a ship runs onto a shoal, causing grounding or damage;
3. Reef-striking accidents: accidents in which a ship touches or runs onto a reef, causing damage;
4. Damage accidents: accidents in which a ship strikes a quay wall, wharf, navigation mark, pier, floating facility, drilling platform or other underwater structure, or a sunken ship, sunken object, wooden pile, fishing shed or other obstacle to navigation, and damage results;
5. Wave damage accidents: accidents in which damage is caused by the wash of other ships;
6. Fire and explosion accidents: accidents in which damage is caused by a fire or explosion on board resulting from natural or man-made factors;
7. Wind disaster accidents: accidents in which ships are damaged by severe storms;
8. Self-sinking accidents: the sinking, capsizing or total loss of a ship caused by overloading, improper stowage or loading, improper operation, leakage of the hull or unknown reasons;
9. Other water traffic accidents causing casualties and direct economic losses.

2.2 Classification of accident levels

Because shipping is a high-risk industry, the degree and level of a navigation accident must be defined once it occurs. According to the class of the ship involved, the casualties and the direct economic losses caused, maritime traffic accidents are divided into five levels: minor accident, general accident, big accident, major accident and super-large accident. However, the classification of major water traffic accidents shall be carried out in accordance with the relevant provisions of the State Council. The classification table of water traffic accidents is as follows.

Table 1. Classification Table of Marine Traffic Accidents

Ship class | Major accident | Big accident | General accident | Minor accident
Ships with a total tonnage of 3000 or more, or a main engine power of 3000 kW or more | More than three deaths; or direct economic losses of more than 5 million yuan | 1-2 deaths; or direct economic losses of less than 5 million yuan and more than 3 million yuan | Serious injuries; or direct economic losses of less than 3 million yuan and more than 500,000 yuan | Accidents that do not reach the general accident level
Ships with a total tonnage of more than 500 tons, or a main engine power of more than 1500 kW | More than three deaths; or direct economic losses of more than 3 million yuan | 1-2 deaths; or direct economic losses of less than 3 million yuan and more than 500,000 yuan | Serious injuries; or direct economic losses of less than 500,000 yuan and more than 200,000 yuan | Direct economic losses of less than 200,000 yuan and more than 100,000 yuan
Note: 1. If any one of the standards in the table is met, the corresponding accident grade applies. 2. "Above" in these Rules and in this Table includes the stated number or level; "below" does not include it.
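The first row of Table 1 can be read as a simple decision rule. The sketch below covers only ships of 3000 tons or 3000 kW and above, and it assumes that each lower bound is inclusive, following note 2; the function and constant names are illustrative and not part of the regulation.

```python
# Loss thresholds in yuan for the largest ship class (first row of Table 1).
# Assumption: lower bounds are inclusive, as suggested by note 2.
MAJOR_LOSS = 5_000_000
BIG_LOSS = 3_000_000
GENERAL_LOSS = 500_000


def classify_accident(deaths: int, serious_injuries: int, loss_yuan: float) -> str:
    """Return the accident level; meeting any single criterion is sufficient."""
    if deaths >= 3 or loss_yuan >= MAJOR_LOSS:
        return "major"
    if deaths >= 1 or loss_yuan >= BIG_LOSS:
        return "big"
    if serious_injuries >= 1 or loss_yuan >= GENERAL_LOSS:
        return "general"
    return "minor"


# Hypothetical cases: (deaths, serious injuries, direct loss in yuan).
for case in [(0, 0, 100_000), (0, 2, 0), (1, 0, 0), (0, 0, 6_000_000)]:
    print(case, "->", classify_accident(*case))
```

Checking the most severe level first means that an accident meeting several criteria is assigned the highest applicable grade.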
2.3 Navigation risk statement

The frequency of ship accidents is analyzed, and the risk patterns of ship accidents are summarized and applied to risk assessment.

Table 2. Risk Table for Ship Navigation

Working waters
Sail
Turn around and berthing
Unberthing Mooring
System buoyancy
Merge
Ocean
International waters
F2 S2( R4)
F2 S2( R4)
Channel
International waters
F1 S1( R2)
F1 S1( R2)
Domestic waters
F1 S1( R2)
F1 S1( R2)
Foreign waters
F1 S1( R2)
F1 S1( R2)
Fishing zone
Anchor area--nonpilotage waters
Anchor area-pilotage waters
Port area-non-pilotage waters
Port area-pilotage waters
Channel-non-pilotage waters
Channel-pilot waters
Merge
Domestic waters
F1 S1( R2)
F1 S1( R2)
Foreign waters
F1 S1( R2)
F1 S1( R2)
Domestic waters
F1 S3( R4)
F1 S3( R4)
Foreign waters
Domestic waters
F2 S2( R4)
F3 S2( R5)
F1 S1( R2)
F2 S1( R3) F3 S1( R4)
Foreign waters
F2 S4( R6)
F3 S4( R7)
F2 S1( R3)
F3 S3( R6)
Domestic waters
F2 S2( R4)
F3 S2( R5)
F2 S1( R3) F1 S1( R2)
F3 S1( R4)
Foreign waters
F2 S2( R4)
F3 S3( R6)
Domestic waters
F3 S3( R6)
F3 S3( R6)
Foreign waters
F3 S2( R5)
F2 S1( R5)
Domestic waters
F3 S4( R7)
F3 S4( R7)
Foreign waters
F3 S3( R6)
F3 S3( R6)
F3 S4( R7)
F3 S4( R7)
F3 S2( R5)
F2 S1( R3) F1 S1( R2) F1 S1( R2)
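The entries of Table 2 pair a frequency class F with a severity class S and a resulting risk class R. In the entries as printed, the R index generally equals the sum of the F and S indices (for example, F3 S4 gives R7), which is how a logarithmic risk matrix is commonly read. The sketch below encodes that reading; the additive rule is inferred from the table entries rather than stated in the text, and the function name is illustrative.

```python
import re


def risk_index(entry: str) -> int:
    """Parse an entry such as 'F3 S4' and return the risk index F + S.

    Assumption: R = F + S, inferred from the entries of Table 2.
    """
    match = re.fullmatch(r"\s*F(\d)\s*S(\d)\s*", entry)
    if match is None:
        raise ValueError(f"not a frequency/severity entry: {entry!r}")
    frequency, severity = (int(group) for group in match.groups())
    return frequency + severity


for entry in ["F1 S1", "F2 S2", "F3 S4"]:
    print(entry, "-> R" + str(risk_index(entry)))
```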
3. Risk Assessment Model for Navigation Accidents

3.1 Relative Risk model based on multi-factors

The traditional FN risk model runs into problems when it is applied to ship navigation risk assessment. To address this, after defining a relative standard for the degree of consequence and with reference to the calculation method of the risk matrix, a relative risk measurement model (MRRM) can be established to assess the risk of ships in different situations. First, the factors involved in the MRRM, such as the degree of risk, are determined. Then the weight coefficients of the different factors are determined and the risk state equation is established in the following matrix form:

$$R_{status} = b \times W \qquad (1)$$

$$b = \begin{bmatrix}
\mu_{\square}(x_1) & \mu_{N}(x_1) & \mu_{R}(x_1) & \mu_{N_r}(x_1) & \mu_{R_r}(x_1) \\
\mu_{\square}(x_2) & \mu_{N}(x_2) & \mu_{R}(x_2) & \mu_{N_r}(x_2) & \mu_{R_r}(x_2) \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
\mu_{\square}(x_n) & \mu_{N}(x_n) & \mu_{R}(x_n) & \mu_{N_r}(x_n) & \mu_{R_r}(x_n)
\end{bmatrix} \qquad (2)$$

$$W = \begin{bmatrix} w_{\square} & w_{N} & w_{R} & w_{N_r} & w_{R_r} \end{bmatrix}^{T} \qquad (3)$$
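As a rough illustration of the risk state equation R_status = b × W, the risk state of each situation is the weighted sum of its membership values. The sketch below uses made-up membership values and weights for three situations, with one column per factor; all numbers and names are hypothetical and do not come from the paper.

```python
def risk_status(b, w):
    """Return R_status = b x W for an n-by-m membership matrix b and m weights w."""
    if any(len(row) != len(w) for row in b):
        raise ValueError("each row of b needs one membership value per weight")
    return [sum(mu * weight for mu, weight in zip(row, w)) for row in b]


# Hypothetical membership values (rows: situations x_1..x_3, columns: factors).
b = [
    [0.2, 0.1, 0.4, 0.3, 0.0],
    [0.6, 0.5, 0.2, 0.1, 0.3],
    [0.9, 0.8, 0.7, 0.6, 0.5],
]
# Hypothetical weights, chosen here to sum to 1.
w = [0.3, 0.2, 0.2, 0.2, 0.1]

for i, value in enumerate(risk_status(b, w), start=1):
    print(f"x_{i}: R_status = {value:.2f}")
```

Because the weights sum to 1 and the memberships lie in [0, 1], each resulting risk state also lies in [0, 1], which makes situations directly comparable.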
Here, $\mu(x)$ is the membership value calculated under the given risk degree and $W$ is the weight vector. The proposed state dimension can incorporate data samples that traditional safety management leaves out, such as hidden dangers and minor accidents, and so enlarges the usable database when data are scarce. The relative risk model can be applied by shipping departments to establish internal risk levels and a risk matrix, and to provide data support for decision-making departments. The state dimension also helps to analyze risk characteristics and thus to obtain the logical and physical characteristics of risk. In this way, a risk assessment model of navigation accidents is established to predict and evaluate the occurrence of accidents in navigation, so as to prevent them and reduce their probability.

3.2 Bayesian risk consequence model with random information

The occurrence of a ship navigation accident is random, with only two possible outcomes: occurrence or non-occurrence. Assuming that the consequences of navigation accidents follow an exponential distribution is sufficient for the purposes of accident analysis. Under this assumption, several observed values of the accident consequence $C$ can be obtained from the prior information: $c_1, c_2, \ldots, c_n$. These are generally collated from previously recorded data, from which a prior mean $\bar{c}$ and a prior variance $s^2$ can be calculated. The conjugate prior of the exponential distribution is the inverse Gamma distribution $IGa(a, \lambda)$, with density

$$\pi(c) = \frac{\lambda^{a}}{\Gamma(a)} \left(\frac{1}{c}\right)^{a+1} e^{-\lambda/c} \qquad (4)$$

The posterior distribution then satisfies

$$f(c \mid t) \propto \left(\frac{1}{c}\right)^{a+n+1} e^{-(t+\lambda)/c} \qquad (5)$$

where $t = \sum_{i=1}^{n} c_i$ is the sum of the observed consequences.
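A minimal sketch of the conjugate update behind the prior and posterior, assuming exponentially distributed consequences with mean c and an inverse Gamma prior IGa(a, λ). The prior parameters are fitted by matching the prior mean and variance, using mean = λ/(a − 1) and variance = λ²/((a − 1)²(a − 2)); the posterior is then IGa(a + n, λ + t), where t is the sum of the n new observations. The function names and sample figures are illustrative and not taken from the paper.

```python
def fit_inverse_gamma(prior_mean: float, prior_var: float):
    """Match IGa(a, lam) to a prior mean and variance (this always gives a > 2)."""
    a = prior_mean ** 2 / prior_var + 2.0
    lam = prior_mean * (a - 1.0)
    return a, lam


def update(a: float, lam: float, observations):
    """Conjugate update for exponential data: IGa(a, lam) -> IGa(a + n, lam + t)."""
    return a + len(observations), lam + sum(observations)


def posterior_mean(a: float, lam: float) -> float:
    """Mean of IGa(a, lam), defined for a > 1."""
    return lam / (a - 1.0)


# Hypothetical figures: prior mean 2.0 and variance 1.0 (in loss units),
# followed by three newly observed accident consequences.
a0, lam0 = fit_inverse_gamma(2.0, 1.0)
a1, lam1 = update(a0, lam0, [1.0, 3.0, 5.0])
print("prior:", (a0, lam0), "posterior:", (a1, lam1))
print("posterior mean consequence:", posterior_mean(a1, lam1))
```

The posterior mean (λ + t)/(a + n − 1) moves from the prior mean towards the sample mean as more observations are added.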
The Bayesian risk consequence model provides a basis for describing the overall pattern of accident probability and for measuring the risk of ships sailing to complex waters. With it, the accident probability of ships sailing to complex waters can be reasonably inferred.

4. Conclusion

Based on case studies and on the collection and analysis of relevant data, this paper proposes a relative risk model based on multiple factors and a Bayesian risk consequence model with stochastic information, which together provide a relatively complete method for the risk assessment of ships sailing in complex waters. In recent years, this approach has been adopted in maritime transport organizations and has been applied by many large ocean shipping companies, port pilotage departments and traffic management departments, where it has reduced the probability of navigation accidents and improved economic returns. The shipping industry can develop further only if the safety of navigation is ensured, so navigation safety deserves sufficient attention in the new period of the industry's development. Before a ship sails into complex waters, establishing the corresponding risk assessment model and carrying out a pre-evaluation can help prevent accidents and reduce their incidence, helping the ship to navigate safely. However, the level of risk changes over time, so reducing the probability of accidents also requires continued study of the mathematical relationship between risk and different navigation environments. Only in this way can the causes of risk be identified and navigation accidents effectively avoided.

References

[1] Zhang Di. Study on Risk assessment and Risk prediction of navigation environment in Tianjin Port waters[D]. Wuhan University of Technology, 2008 (in Chinese)
[2] Kairong Li. Cause analysis and Countermeasures of frequent collision accidents in coastal navigation[A]. Marine Shipping Commission of China Navigation Society. Collection of 2008 papers on ship safety management[C]. Marine Shipping Commission of China Navigation Society: China Navigation Society, 2008:5 (in Chinese)
[3] Zhu Jingjun. Research on Risk assessment method of ship navigation safety[D]. Jimei University, 2016 (in Chinese)
[4] Hu Jingping, Fang Quanggen, Qiao Guimin, et al. Risk Analysis and Risk Control of Large Ship Navigation[J]. China Navigation, 2006 (3):34-38 (in Chinese)
[5] Li Sheng. Study on the Risk assessment method of navigable environment for ships in complex waters[D]. 2014 (in Chinese)
[6] Hu Jingping, Fang Quanggen, Xi Yongtao. Integrated technology of navigation Risk assessment, prediction and decision-making for large ships[J]. Modern Transportation Engineering, 2012(1) (in Chinese)
[7] Li Fei. Reasons for ship collision and Countermeasures[J]. China Shipping: Theoretical Edition, 2007,5(6):27-28 (in Chinese)
[8] Hu S, Fang Q, Xia H, Xi Y. Formal Safety Assessment based on Relative Risk Model in Ship navigation[J]. Reliability Engineering & System Safety, 2007, 92: 369-377
[9] Hu S, Li X, Fang Q. Use of Bayesian Method for Assessing Vessel Traffic Risks at Sea[J]. International Journal of Information Technology & Decision Making, 2008,7(4):627-638
[10] Choi J, Lee J I, Yoon S B. Surface Roller Modeling for Mean Longshore Current over a Barred Beach in a Random Wave Environment[J]. Journal of Coastal Research, 2012,284(5):1100-1120
[11] Hu S, Li X, Fang Q, et al. Use of Bayesian Method for Assessing Vessel Traffic Risks at Sea[J]. International Journal of Information Technology & Decision Making, 2008, 07(04):627-638.
[12] Shenping Hu, Quangen Fang, Haibo Xia, Yongtao Xi. Formal safety assessment based on relative risks model in ship navigation[J]. Reliability Engineering and System Safety, 2006,92(3).
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200055
Optimized Tang's Algorithm for Retinal Image Registration
Sayan CHAKRABORTY a, Ratika PRADHAN a, Nilanjan DEY b, Amira S. ASHOUR c
a Department of Computer Applications, Sikkim Manipal Institute of Technology, Rangpo, Sikkim, India
b Department of Information Technology, Techno India College of Technology, West Bengal, 740000, India
c Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta Univ., Egypt
Abstract. Aligning medical images is essential because angular rotation and translation of objects within the images complicate post-processing. Accordingly, medical image registration has become one of the key tools for aligning images. In image registration, the image frames and the objects in them are transformed by aligning each image with its corresponding reference image. Demons registration is one of the most efficient non-rigid image registration techniques. In this paper, the proposed system optimizes the parameters of Demons registration on retinal images using the firefly optimization algorithm. Several metrics were measured to evaluate the proposed model, including the mean square error, joint entropy, and mutual information between the original and registered images.

Keywords. Demons registration, Tang’s demons, firefly algorithm, mean square error, correlation, joint entropy
1. Introduction

Medical image registration is one of the key techniques for aligning an image with respect to a corresponding reference image. In terms of imaging modality, medical image registration can be categorized as mono-modal or multi-modal [1-3]. Generally, mono-modal image registration [4-6] aligns images captured by the same imaging device at different times, while multi-modal image registration [5] registers images taken by different imaging devices. Image registration can also be categorized by the transformation used: rigid, affine, B-spline, and demons. Apart from rigid registration, all of these are considered non-rigid techniques. In rigid registration, the original shape of an object in the image is not changed by the registration, whereas shape and size may be affected by non-rigid techniques. Rigid registration consists of rotation, translation, and scaling, whereas affine registration involves all the operations of rigid registration along with shearing and mapping. B-spline registration uses spline curves to build a mesh that guides the transformation. Demons registration uses a transformation matrix as a vector and applies a velocity-smoothing kernel to enhance the images, leading to a fluid registration process [1]. To improve registration performance, the parameters of demons registration were optimized in the present work using the firefly algorithm [7, 8] because of its efficiency [9, 10]. The remaining sections are organized as follows. Section 3 describes the methods and materials used. Section 4 discusses the proposed method, followed by the results in Section 5. Finally, Section 6 concludes the present work.
2. Related Studies

Several researchers have developed new image registration methods, while others have applied the firefly algorithm. Thirion [1] proposed demons registration, using Maxwell’s demons algorithm to match brain images. Tang et al. [2] updated Thirion’s demons by introducing a balance coefficient to stabilize the traditional demons algorithm. Zhang et al. [3] proposed log-Demons for large deformations in image registration by using a driving force. The demons algorithm was optimized by further calculating the driving force at the boundary points of log-Demons, which improved the motion direction of the boundary points. Han et al. [4] introduced a diffeomorphic demons framework for image registration using momentum-based acceleration of the Demons algorithm, applied to MR-CT deformable images. Various aspects of image registration were discussed, such as multi-modality, deformable registration, and diffeomorphic registration. In 2019, Lan et al. presented a comparative study [5] of various demons algorithms on retinal images, including Thirion’s, Tang’s and Wang’s demons. A demons algorithm involving all the parameters of Tang’s demons as well as rotation parameters was proposed. Yang et al. [7] proposed the firefly algorithm (FA), which balances exploration and exploitation in the search for an optimum. Chen et al. [8] used a modified FA for global path planning to increase the convergence speed and reduce the error during the local search of the standard firefly algorithm (SFA). Sarangi et al. [9] used the FA to increase the diversity of the search behaviour of the standard algorithm. Additionally, Janah et al. [10] presented a comparative study of the FA and particle swarm optimization (PSO) for solving simultaneous localization and mapping (SLAM) problems. Building on the preceding studies, the present work applies the FA to optimize Tang's algorithm for retinal image registration.

3. Methodology

The current work uses the demons registration proposed by Tang et al. [2] within an optimization framework based on the firefly algorithm.
3.1. Tang’s Demons

Thirion’s demons registration is based on Maxwell’s demons algorithm and incorporates optical flow [11, 12]. The fixed image applies a displacement vector, known as the demons force [13, 14], to the moving (deformed) image. Accordingly, the pixels of the moving image are displaced to align with the fixed image using the deformation field formula, which is applied over multiple iterations. Thirion’s basic demons method [15, 16] computes the deformation field only from the gradient information of the fixed image, so the framework is effective only for small deformations. Wang et al. [16] added the gradient of the moving image to the demons field formula and introduced a force strength to stabilize the framework in each iteration. Tang et al. [2] updated the demons field formula by introducing a balance coefficient k. Accordingly, the force strength is fine-tuned adaptively, and the demons field equation in Tang’s demons is given by:

$$\vec{u} = \left(\lVert R - F \rVert\right)\left( \frac{\vec{\nabla} R}{k^{2}\lVert \vec{\nabla} R \rVert^{2} + \alpha^{2}\lVert R - F \rVert^{2}} + \frac{\vec{\nabla} F}{k^{2}\lVert \vec{\nabla} F \rVert^{2} + \alpha^{2}\lVert R - F \rVert^{2}} \right) \qquad (1)$$

where R and F refer to the moving and fixed images, respectively, and $\vec{u}$ is the update of the deformation field.
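For a single pixel, the demons field can be evaluated as in the following sketch. It assumes that Tang’s update has the form u = |R − F| (∇R / (k²|∇R|² + α²|R − F|²) + ∇F / (k²|∇F|² + α²|R − F|²)), uses plain tuples for the 2-D gradients, and returns a zero update where both denominators vanish. The function name, the argument names and the sample values are illustrative only.

```python
def tang_demons_update(r, f, grad_r, grad_f, k=1.0, alpha=1.0, eps=1e-12):
    """Return the 2-D update vector u at one pixel.

    r, f           : intensities of the moving and fixed image at the pixel
    grad_r, grad_f : (gx, gy) intensity gradients of the moving and fixed image
    k, alpha       : balance coefficient and force-strength factor (assumed roles)
    """
    diff = abs(r - f)

    def term(grad):
        norm_sq = grad[0] ** 2 + grad[1] ** 2
        denom = k ** 2 * norm_sq + alpha ** 2 * diff ** 2
        if denom < eps:  # flat region with matching intensities: no force
            return (0.0, 0.0)
        return (grad[0] / denom, grad[1] / denom)

    term_r, term_f = term(grad_r), term(grad_f)
    return (diff * (term_r[0] + term_f[0]), diff * (term_r[1] + term_f[1]))


# Hypothetical pixel: intensities 10 and 8, orthogonal unit-scaled gradients.
print(tang_demons_update(10.0, 8.0, (2.0, 0.0), (0.0, 2.0)))
```

A larger k increases the weight of the gradient magnitude in the denominator and so damps the update in strongly textured regions, while alpha limits the update where the intensity difference is large.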
3.2 Firefly Algorithm

The FA is one of the most popular and efficient meta-heuristic algorithms. It uses an objective function to search for the optimal solution in an iterative process that improves the local solution. The principle of the FA was inspired by the behavior of fireflies and their flashing patterns. The algorithm adopts the following characteristics of fireflies:
• A firefly moves towards any firefly with a brighter flashing pattern.
• Brightness defines the attractiveness of a firefly. The perceived brightness decreases as the distance between two fireflies increases. Hence, a less bright firefly moves towards a brighter one; if no brighter firefly is found, the firefly moves in a random direction.
• An objective function is defined to specify the problem domain. The brightness of a firefly's flashes is proportional to the value of the objective function.

4. Proposed Method

Since the fitness function of non-rigid registration should involve a similarity measure of the alignment field [18, 19], the correlation is used as the objective function in this work. The current work aims to optimize Tang’s demons, which has all the parameters of Thirion’s demons [1] and Wang’s demons [16], plus a balance coefficient. Lan et al. [5] discussed the advantage of Tang’s demons [2] on retinal images [20], where it achieved superior performance compared to Thirion’s and Wang’s demons in terms of lower error in the demons deformation field [21]. Hence, Tang’s demons is optimized in the present work using the FA; Chakraborty et al. [14] established that the FA outperformed PSO when optimizing the parameters of Thirion’s demons registration on simple tennis video content. In the current work, the framework observes the effect of optimization [15] on the registration of medical content. The medical content chosen to evaluate the current framework was the set of retinal images obtained from FIRE-DB [16]. The current work proposes an optimization framework for the parameters of the Gaussian filter that smooths the kernel values in Tang’s demons [14, 15], namely the window size and sigma. Typically, the default window size in Thirion’s demons as well as Tang’s demons is 60×60 and the default sigma value is 10. The fitness function is the correlation between the original and registered images. Initially, a pair of images is taken from the retina dataset FIRE-DB [16], and the second frame is registered with reference to the first one. During registration, the window size of the Gaussian filter in Tang’s demons varies from 0 to 100, while the sigma of the Gaussian kernel varies from 0 to 20. The upper and lower bounds are taken from the work of Chakraborty et al. [14]. Afterwards, the FA generates a set of solutions according to the given population and iteration sizes, with the population ranging from 5 to 30 and the number of generations fixed at 15. Using the generated solutions, the FA [14] searches for the best fitness value (i.e., the best correlation between the original and registered images) and returns the corresponding scaling factor (Gaussian kernel) values. The framework executes the process until an optimized value is found. The current framework is summarized in Algorithm 1.
Algorithm 1: Proposed Firefly based Tang’s Demons Registration
Start
Read the source and target images
Set upper bound and lower bound for optimization
Apply the firefly algorithm
Generate new solutions
For solution 1 to n
    Apply the solutions on demons algorithm
    Perform demons algorithm for each solution and return the correlation value between the original and registered images
    Choose the highest correlation value and return it as best fit
    Obtain the solution values for which best fit was returned
    Save the registered image for post-processing
Endfor
Calculate the mean square error, joint entropy, and mutual information following the optimization of Demons registration
Stop
Furthermore, the block diagram of the proposed optimization-based registration is shown in Figure 1, where the three parameters to be optimized are k1, k2 and k3, representing the Gaussian filter’s window size and sigma values.
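The optimization loop of Algorithm 1 can be sketched with a basic firefly algorithm, in which a dimmer firefly i moves towards a brighter firefly j by x_i ← x_i + β0·exp(−γ·r²)·(x_j − x_i) plus a small random step. Running the actual demons registration is outside the scope of a short example, so the fitness below is a smooth stand-in that peaks at the default window size of 60 and sigma of 10; it is not a real correlation measurement. The bounds and the 15 generations follow the text, while the function names, the population of 10 and the values of beta0, gamma and alpha are assumptions.

```python
import math
import random


def firefly_maximize(fitness, bounds, n_fireflies=10, n_generations=15,
                     beta0=1.0, gamma=0.01, alpha=0.2, seed=0):
    """Maximize `fitness` over the box `bounds` with a basic firefly algorithm."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    light = [fitness(x) for x in swarm]
    best = max(range(n_fireflies), key=lambda i: light[i])
    best_x, best_f = list(swarm[best]), light[best]
    for _ in range(n_generations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # only move towards brighter fireflies
                r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                beta = beta0 * math.exp(-gamma * r2)  # attractiveness decays with distance
                moved = []
                for xi, xj, (lo, hi) in zip(swarm[i], swarm[j], bounds):
                    step = beta * (xj - xi) + alpha * (hi - lo) * (rng.random() - 0.5)
                    moved.append(min(hi, max(lo, xi + step)))  # keep inside the bounds
                swarm[i], light[i] = moved, fitness(moved)
                if light[i] > best_f:
                    best_x, best_f = list(moved), light[i]
    return best_x, best_f


def surrogate_correlation(params):
    """Hypothetical stand-in for the correlation returned by a registration run."""
    window, sigma = params
    return 1.0 / (1.0 + ((window - 60.0) / 100.0) ** 2 + ((sigma - 10.0) / 20.0) ** 2)


BOUNDS = [(0.0, 100.0), (0.0, 20.0)]  # window size and sigma ranges from the text
best_params, best_corr = firefly_maximize(surrogate_correlation, BOUNDS)
print(f"window = {best_params[0]:.1f}, sigma = {best_params[1]:.1f}, fitness = {best_corr:.4f}")
```

In a real run, `surrogate_correlation` would be replaced by a function that performs the demons registration with the candidate Gaussian parameters and returns the correlation between the original and registered images. The best solution seen so far is tracked separately so that the returned fitness never decreases between generations.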
[Figure 1. Block diagram of the proposed optimization-based registration. Blocks: source frame in FIRE DB; target frame in FIRE DB; Tang’s Demons; generate n initial solutions randomly in K1, K2, K3; apply the firefly algorithm, where each firefly is K1, K2, K3; rank the fireflies according to light intensity and update positions; target registration.]

References

... Demons: Deformable Image Registration with Local Structure-Preserving Regularization Using Supervoxels for Liver Applications”. In: Journal of Medical Imaging 5.2, (2018), pp. 1-9.
S. Chakraborty, N. Dey, S. Samanta, A. S. Ashour, V. E. Balas, "Firefly algorithm for optimized nonrigid demons registration", Bio-Inspired Computation and Applications in Image Processing, (2016), pp. 221-237.
S. Chakraborty, N. Dey, S. Samanta, A. S. Ashour, C. Barna, M. M. Balas, "Optimization of Non-rigid Demons Registration Using Cuckoo Search Algorithm", Cognitive Computation, (2017), 9(6), pp. 817–826.
C. Wang, Q. Ren, X. Qin and Y. Yu, "Adaptive Diffeomorphic Multiresolution Demons and Their Application to Same Modality Medical Image Registration with Large Deformation", International Journal of Biomedical Imaging, (2018), 2018(7314612), pp. 1-9.
https://www.ics.forth.gr/cvrl/fire/, (last access date: 7/8/19).
T. Vercauteren, X. Pennec, e.a., “Nonparametric diffeomorphic image registration with the demons algorithm”. MICCAI, (2007), pp. 319–326.
S. Chakraborty, S. Ghosh, S. Chatterjee, S. Chowdhuri, R. Ray, N. Dey, "Rigid image registration using parallel processing", International Conference on Circuits, Communication, Control and Computing, (2014), pp. 11-15.
S. Chakraborty, A. Mukherjee, D. Chatterjee, P. Maji, S. Acharjee, N. Dey, "A Semi-automated System for Optic Nerve Head Segmentation in Digital Retinal Images", 2014 International Conference on Information Technology, (2014), pp. 1-4.
S. Chakraborty, P. K. Patra, P. Maji, A. S. Ashour, N. Dey, "Image Registration Techniques and Frameworks: A Review", Applied Video Processing in Surveillance and Monitoring Systems, (2016), pp. 102-110.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200056
Modeling and Simulation of Dynamic Location Allocation Strategy for Stereo Garage
LI Bowen, LI Jianguo, and KANG Yaojun
School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China
Abstract. Plane mobile stereo garages suffer from long vehicle queues and low efficiency during parking space allocation. By analyzing the queuing process of vehicles and the time characteristics of vehicles entering and leaving the garage, we propose a parking space allocation method based on the probability characteristics of vehicles entering and leaving the garage. Simulation programs are written separately for allocation under the nearest principle and for allocation under the inbound and outbound probability. The average customer waiting time, the average waiting queue length, the average service time and the idle probability of the carrier are used as evaluation indexes of the service efficiency of the stereo garage, and the two schemes are compared and analyzed. The simulation results show that parking space allocation based on the probability of entering and leaving the garage performs better than the nearest allocation principle. The method can effectively improve the service efficiency of the stereo garage.
Keywords. Stereo garage, parking allocation, probability characteristic, nearest parking space allocation
1. Introduction
With the rapid development of the social economy, the continuous growth in the number of cars has put tremendous pressure on urban road traffic. To make use of limited land resources and ease urban parking difficulties, the plane mobile stereo garage has developed rapidly. It saves land and greatly increases storage capacity, but it also suffers from long customer waiting times and low efficiency of vehicles entering and leaving the garage. To address these problems, Wang Xiaonong et al. [1] established a decision-making allocation model for the stereo garage and optimized it using multi-color set theory and the fruit fly algorithm. Liang Ying et al. [2] used partition management to analyze the allocation of stereo garage parking spaces. Liu Ri et al. [3], building on multi-color set theory, used a support vector machine (SVM) to
Corresponding Author.
B. Li et al. / Modeling and Simulation of Dynamic Location Allocation Strategy
predict the vehicle dwell time range and establish the corresponding parking space allocation model. Zhang Zhicheng [4], aiming to minimize the total vehicle access time in the stereo garage, studied the most suitable access scheduling strategy for different time periods. She Bei [5] established a dual-objective optimization model with the shortest vehicle storage distance and the lowest center of gravity of the vehicle, and solved it with several intelligent algorithms. These studies consider only the efficiency of storing vehicles; they neglect the impact of the outbound service and of the no-load running time of the carrier on the overall efficiency of the garage. By analyzing the queuing model of vehicles [6-7] and referring to access scheduling methods for cargo [8-9] and vehicles [10-11], we propose a parking space allocation scheme based on the probability characteristics of vehicles entering and leaving the garage. A three-dimensional graph is drawn from the row and column coordinates of each location and the dwell time of the vehicle parked there, and a vehicle requesting storage is stored near locations whose vehicles have a short residence time. In this way, outbound requests can be served while the carrier is performing a storage operation, which reduces customer waiting time and queue length and improves the working efficiency of the stereo garage.
2. Queue model and time characteristics analysis of vehicle access
2.1. Stereo garage queuing model
The stereo garage queuing model consists of three parts: the input process, the queuing system and the output process. A schematic diagram of the queuing model is shown in Figure 1. The queuing process can be described as follows. Vehicles arrive at the garage at random time intervals. If all parking spaces in the garage are occupied when a vehicle arrives, the vehicle leaves directly. If a parking space is available, the vehicle enters the service window and waits in line. Because the capacity of the stereo garage is limited, it cannot accommodate an unlimited number of waiting vehicles: if the number of queued vehicles exceeds the capacity of the garage when a vehicle arrives, the vehicle leaves directly; otherwise it waits in line.
Figure 1. Stereo garage queuing model
2.2. Vehicle's time characteristics analysis and garage's evaluation indicators
In the stereo garage service system, customer arrivals follow a certain probability distribution, and the process of storing a vehicle in the garage can be regarded as the service process of the queuing model. Hence, we can use queuing theory to
study and analyze some characteristics of the stereo garage and then define its performance evaluation indexes. Investigation and data collection at a stereo garage in actual operation show that customer arrivals follow a Poisson distribution with parameter $\lambda$, and that the service time follows a negative exponential distribution with parameter $\mu$. Following queuing theory, the average customer waiting time, the average waiting queue length, the average service time and the idle probability of the autonomous vehicle are used as evaluation indicators of the service efficiency of the stereo garage. Let the number of arriving customers be $N$, the average waiting queue length be $Q$, the average waiting time be $W$ and the average service time be $S$, and let the waiting queue length, waiting time and service time of the i-th customer be $Q_i$, $W_i$ and $S_i$, respectively. The average waiting queue length is:
$$Q = \frac{1}{N-1}\sum_{i=1}^{N-1} Q_i = \frac{Q_1 + Q_2 + \cdots + Q_{N-1}}{N-1} \tag{1}$$
Note: The waiting queue length of the first customer is 0. The average waiting time of a customer is:
$$W = \frac{1}{N-1}\sum_{i=1}^{N-1} W_i = \frac{W_1 + W_2 + \cdots + W_{N-1}}{N-1} \tag{2}$$
Note: The waiting time of the first customer is 0. The average service time of a customer is:
$$S = \frac{1}{N}\sum_{i=1}^{N} S_i = \frac{S_1 + S_2 + \cdots + S_{N-1} + S_N}{N} \tag{3}$$
We define the no-load running state of the autonomous vehicle as the idle state. If, in one vehicle access operation, the no-load running time of the autonomous vehicle is $T_k$ and its total running time is $T_A$, then the idle probability $P$ of the autonomous vehicle is:
$$P = \frac{T_k}{T_A} \tag{4}$$
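The four evaluation indexes of equations (1) to (4) can be computed directly from per-customer records. The following is a minimal sketch in Python, not the authors' simulation program. The function name `garage_indices` and its argument layout are hypothetical, and it assumes each input list holds one entry per customer with the first customer's queue length and waiting time equal to 0, so the averages in (1) and (2) are taken over the remaining N - 1 customers.

```python
from typing import Sequence, Tuple


def garage_indices(
    queue_lengths: Sequence[float],
    wait_times: Sequence[float],
    service_times: Sequence[float],
    no_load_time: float,
    total_run_time: float,
) -> Tuple[float, float, float, float]:
    """Return (Q, W, S, P) following equations (1)-(4).

    Assumption: index 0 is the first customer, whose queue length and
    waiting time are 0 and are therefore left out of the N - 1 averages.
    """
    n = len(service_times)
    if n < 2 or len(queue_lengths) != n or len(wait_times) != n:
        raise ValueError("need at least two customers and equal-length inputs")
    if total_run_time <= 0:
        raise ValueError("total running time must be positive")

    q = sum(queue_lengths[1:]) / (n - 1)   # equation (1)
    w = sum(wait_times[1:]) / (n - 1)      # equation (2)
    s = sum(service_times) / n             # equation (3)
    p = no_load_time / total_run_time      # equation (4)
    return q, w, s, p
```

For example, four customers with queue lengths `[0, 1, 2, 3]`, waiting times `[0.0, 2.0, 4.0, 6.0]`, service times `[1.0, 2.0, 3.0, 2.0]`, and a carrier that runs unloaded for 30 out of 120 time units give Q = 2.0, W = 4.0, S = 2.0 and P = 0.25.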
3. Establishment of simulation model for parking space allocation
3.1. Stereo garage parking space allocation principle
Investigation and analysis of stereo garages show that parking space allocation in practice mainly follows these principles:
(1) Overall stability principle. Parking spaces are allocated according to the mass of the parked vehicles. Heavier vehicles are placed on the lower layers as far as possible and lighter vehicles on the upper layers, which lowers the center of gravity of the garage, improves its stability and guarantees the safety of the parked vehicles.
(2) Efficient use of locations principle. Parking spaces are allocated according to the parking time of the vehicle. Vehicles with a short storage time are parked near the entrance and exit, and vehicles with a long storage time are parked far from the entrance and exit, which increases the frequency of use of the locations and improves the overall service.
(3) First come, first served principle. Vehicles receive the storage service in their order of arrival.
(4) Vehicle outbound priority principle. When a customer issues an outbound request, priority is given to ensuring that the requested vehicle receives the outbound service in the shortest possible time.
(5) Minimum energy consumption principle. The length of the running path of the dispatching equipment and its no-load running time have a decisive influence on the overall energy consumption of the garage. When a location is allocated, the running distance and no-load running time of the equipment should be shortened as much as possible to save resources and reduce the operating cost of the stereo garage.
(6) Shortest equipment running time principle. The running time of the equipment directly affects the waiting time of the customer and the overall service efficiency of the stereo garage. Therefore, the running time of the dispatching equipment from the entrance to the exit should be as short as possible to improve the overall working efficiency of the garage.
3.2. Stereo garage parking space allocation model
The three-dimensional model of the plane mobile stereo garage is shown in Figure 2. It is a stereo garage model with two I/O ports. The model contains two AVs (autonomous vehicles) and two elevators: the AV is responsible for the horizontal movement of the vehicle, and the elevator is responsible for its vertical movement. A vehicle access operation proceeds as follows. When a customer issues a storage request, the vehicle is parked at a garage I/O port and the garage system allocates a free parking space to it. According to the assigned location, the AV transports the vehicle to the corresponding column and the elevator transports it to the corresponding layer of the garage, which completes one storage operation. Picking up a vehicle is the reverse of storing it.
Figure 2. Stereo model of a plane mobile stereo garage
Since the vehicle access operations on the left and right sides of the plane mobile stereo garage are the same, one side of the garage can be selected as the research object. In this paper, the locations on the right side of the garage are selected. The mathematical model of the stereo garage is established with the center line of the I/O port as the axis (taking an 11-layer, 16-column garage as an example), and each location is numbered in the model. The mathematical model is shown in Figure 3.
[Figure 3 diagram: X axis = columns, Y axis = layers 1 to 11, with the I/O ports I/O1 and I/O2 at the base. The locations are numbered 1 to 176 row by row: 1-16 on layer 1, 17-32 on layer 2, 33-48 on layer 3, and so on up to 161-176 on layer 11.]
Figure 3. Mathematical model of a plane mobile stereo garage
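The row-by-row numbering of the 11-layer, 16-column model can be expressed as a pair of index conversions. This is a minimal Python sketch. The function names are hypothetical, and the assumption that columns are counted 1 to 16 in the order the numbers increase within a layer is ours; the figure labels its X axis relative to the I/O center line.

```python
from typing import Tuple

LAYERS = 11    # layers in the example garage
COLUMNS = 16   # columns in the example garage


def location_to_cell(number: int) -> Tuple[int, int]:
    """Convert a location number (1..176) to a 1-based (layer, column) pair."""
    if not 1 <= number <= LAYERS * COLUMNS:
        raise ValueError("location number out of range")
    layer_index, column_index = divmod(number - 1, COLUMNS)
    return layer_index + 1, column_index + 1


def cell_to_location(layer: int, column: int) -> int:
    """Convert a 1-based (layer, column) pair back to its location number."""
    if not (1 <= layer <= LAYERS and 1 <= column <= COLUMNS):
        raise ValueError("cell outside the garage")
    return (layer - 1) * COLUMNS + column
```

Under this convention location 33 is the first location on layer 3 and location 176 is the last location on layer 11, which matches the numbers visible in the figure.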
4. Simulation of stereo garage parking space allocation
4.1. Simulation of parking space allocation under the principle of nearest allocation
Statistical analysis of the actual operational data of a plane mobile stereo garage shows that customer arrivals follow a Poisson distribution with $\lambda = 4.7$ car/min and that the service time follows a negative exponential distribution with $\mu = 1.7$ car/min. The simulated garage has 11 layers and 16 columns. Assuming that the garage serves a total of 500 vehicles in a given day, the efficiency evaluation indexes obtained under the nearest allocation principle are shown in Figure 4.
Figure 4. Evaluation index of garage efficiency based on nearest principle. Panels: (a) customer waiting queue length; (b) customer waiting time; (c) customer service time; (d) idle probability of the autonomous vehicle.
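To illustrate how arrival and service times with the stated distributions can drive a queue simulation, the sketch below samples exponential inter-arrival and service times and applies the Lindley recursion for a first come, first served queue. It is a deliberately simplified stand-in, not the authors' simulation program: it assumes a single server and an unlimited waiting room, it ignores the garage layout and the two carriers, and it is not expected to reproduce the values reported for Figure 4. The function name `simulate_waits` is hypothetical.

```python
import random
from typing import List, Tuple


def simulate_waits(
    n_customers: int,
    arrival_rate: float,
    service_rate: float,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Return per-customer (waiting times, service times) in minutes.

    Poisson arrivals are generated through exponential inter-arrival gaps.
    Waiting times follow the Lindley recursion
        W[i+1] = max(0, W[i] + S[i] - A[i+1]),
    where A[i+1] is the gap between customers i and i+1.
    """
    if n_customers < 1:
        raise ValueError("need at least one customer")
    rng = random.Random(seed)
    waits = [0.0]                                # the first customer never waits
    services = [rng.expovariate(service_rate)]
    for _ in range(1, n_customers):
        gap = rng.expovariate(arrival_rate)      # inter-arrival time
        waits.append(max(0.0, waits[-1] + services[-1] - gap))
        services.append(rng.expovariate(service_rate))
    return waits, services


# Parameters quoted in the text: 500 vehicles, 4.7 and 1.7 car/min.
waits, services = simulate_waits(500, arrival_rate=4.7, service_rate=1.7, seed=1)
average_wait = sum(waits[1:]) / (len(waits) - 1)   # equation (2)
average_service = sum(services) / len(services)    # equation (3)
```

Because the arrival rate exceeds the service rate, this single-server sketch builds up a long queue; a model closer to the paper would add the finite capacity, the two carriers and the location-dependent service times.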
According to the data in Figure 4, when parking spaces are allocated by the nearest principle, the average waiting queue length, waiting time, service time and autonomous vehicle idle probability are 4.53, 5.43 min, 8.58 s and 53.25%, respectively.
4.2. Simulation of parking space allocation based on probability characteristics of inbound and outbound
The parking space allocation method based on the probability of entering and leaving the garage relies on the dwell time of the vehicles. A three-dimensional graph is established, and the parking space is selected dynamically from its projection onto the plane coordinates. Figure 5 shows the three-dimensional graph used to predict the outbound probability, together with its top view. The X-axis and the Y-axis represent the column and layer of the garage, that is, the position coordinates of a location, and the Z-axis represents the residence time of the vehicle in the garage. In the top view, the length of stay is indicated by color depth: longer dwell times are shown in yellow and shorter dwell times in blue, and the shorter the dwell time, the darker the color and the higher the probability that a vehicle in that area will exit. For a storage task, parking spaces in the dark blue area are given priority, in anticipation of a possible outbound request, so that storage and pickup can be handled together and customer waiting time is reduced.
Figure 5. Stereo graphics of the probability characteristics of the inbound and outbound. Panels: (a) three-dimensional graphics; (b) top view.
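The difference between the two allocation rules can be sketched as two selection functions over the free locations. This is an illustrative Python sketch under stated assumptions, not the authors' algorithm: the travel cost (columns travelled plus layers lifted from a hypothetical I/O column `IO_COLUMN`), the use of Manhattan distance, and the rule "park next to the occupied location with the shortest remaining dwell time" are all our simplifications of the dark blue area idea described above.

```python
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]   # (layer, column), both 1-based

IO_COLUMN = 1            # hypothetical column of the I/O port on layer 1


def travel_cost(cell: Cell) -> int:
    """Hypothetical cost: columns travelled by the AV plus layers lifted."""
    layer, column = cell
    return abs(column - IO_COLUMN) + (layer - 1)


def manhattan(a: Cell, b: Cell) -> int:
    """Grid distance between two locations."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nearest_slot(free: Iterable[Cell]) -> Cell:
    """Nearest allocation: the free location with the lowest travel cost."""
    return min(free, key=lambda cell: (travel_cost(cell), cell))


def probability_based_slot(
    free: Iterable[Cell], remaining_dwell: Dict[Cell, float]
) -> Cell:
    """Pick the free location closest to the vehicle most likely to leave.

    remaining_dwell maps occupied locations to predicted remaining dwell
    time; a shorter time is treated as a higher outbound probability.
    """
    free = list(free)
    if not remaining_dwell:
        return nearest_slot(free)
    soonest = min(remaining_dwell, key=lambda cell: (remaining_dwell[cell], cell))
    return min(
        free,
        key=lambda cell: (manhattan(cell, soonest), travel_cost(cell), cell),
    )
```

With free locations `(1, 1)`, `(3, 5)` and `(3, 7)`, and occupied locations `(3, 6)` (5 minutes remaining) and `(1, 2)` (120 minutes remaining), `nearest_slot` picks `(1, 1)` while `probability_based_slot` picks `(3, 5)`, next to the vehicle expected to leave first, so that one carrier trip could serve both the storage and the upcoming pickup.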
Similarly, taking the stereo garage with 11 layers and 16 columns as the simulation object, the efficiency evaluation indexes obtained with the parking space allocation method based on the probability of entering and leaving the garage are shown in Figure 6.
Figure 6. Garage efficiency evaluation index under the probability of entering and leaving the garage. Panels: (a) customer waiting queue length; (b) customer waiting time; (c) customer service time; (d) idle probability of the autonomous vehicle.
According to the data in Figure 6, when parking spaces are allocated according to the inbound and outbound probability characteristics, the average waiting queue length, waiting time, service time and autonomous vehicle idle probability are 2.82, 2.63 min, 6.12 s and 40.18%, respectively.
4.3. Simulation data analysis
We now compare the efficiency evaluation indexes of the stereo garage under the two allocation modes. For the same number of vehicles served, compared with allocation by the nearest principle, allocation based on the probability of entering and leaving the garage reduces the average waiting queue length by 1.71 (from 4.53 to 2.82), shortens the average waiting time by 2.80 min (from 5.43 to 2.63 min), shortens the average service time by 2.46 s (from 8.58 to 6.12 s), and lowers the idle probability of the carrier by 13.07 percentage points (from 53.25% to 40.18%).
5. Conclusion
This paper takes the parking space allocation method of the plane mobile stereo garage as its research object. The queuing model and the parking space allocation principles of the stereo garage are analyzed, the mathematical model of the stereo garage is established, and a parking space allocation method based on the probability of entering and leaving the garage is proposed and compared with allocation by the nearest principle. The research results show that:
(1) The parking space allocation method based on the probability of entering and leaving the garage takes full account of the dwell time of each vehicle in the garage and predicts its outbound probability. Parking spaces are allocated dynamically according to the outbound probability of the vehicles. This balances storage tasks against outbound requests and reduces the probability of no-load operation of the carrier.
(2) Adopting this scheme effectively reduces the customer waiting queue, shortens the customer waiting time and the vehicle's service time in the garage, and improves the overall working efficiency of the stereo garage. However, the method needs to estimate the outbound probability of a vehicle from its historical time data. Hence, the scheme is only applicable to parking garages with fixed users, such as those of residential quarters and enterprises. It is not suitable for shopping malls, hospitals and other garages with highly mobile users. Future research should consider the universality of the scheduling scheme, so that the results can serve social life more comprehensively.
References
[1] X.N. Wang, J.G. Li, Y.P. He, Modeling and simulation of parking space allocation in plane mobile stereo garage, Journal of Nanjing University of Science and Technology 43 (2019), 54-62.
[2] Y. Liang, Y. Fang, J.G. Li, Simulation and analysis of partition access strategy for plane mobile stereo garage, Floor Science and Technology 38 (2015), 88-90.
[3] R. Liu, J.G. Li, X.N. Wang, Modeling and simulation of parking space allocation in stereo garage, Journal of Jiangsu University (Natural Science Edition) 39 (2018), 19-25.
[4] Z.C. Zhang, Research on staching stereo garage access strategy and control system, Anhui University of Engineering, Wuhu, 2019.
[5] B. She, Research on scheduling optimization of automatic and multi-layered parking space, Shaanxi University of Science and Technology, Xian, 2017.
[6] G.N. Xu, H.M. Cheng, Y.W. Chen, Optimization of vehicle scheduling principles for stereo garage based on queuing theory, Hoisting and Transport Machinery 5 (2008), 50-55.
[7] Q.C. Zhou, N. Miao, X.L. Xiong, Parking model setup and analyze based on queuing theory, Chinese Journal of Construction Machinery 2 (2005), 161-164.
[8] Y.J. Ma, Z.Y. Jian, Z.M. Yang, Dynamic location assignment of AS/RS based on genetic algorithm, Journal of Southwest Jiaotong University 3 (2008), 415-421.
[9] S.N. Liu, Y.L. Ke, J.X. Li, Optimization for automated warehouse based on scheduling policy, Computer Integrated Manufacturing Systems 9 (2006), 1438-1443.
[10] Z.Y. Li, Research on the scheduling optimization method of stereo garage, Shenyang Jianzhu University, Shenyang, 2015.
[11] T. Xia, B. She, Study on storage and retrieval dispatching of lift-sliding stereo parking system, Logistics Technology 34 (2015), 138-140.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200057
Achieving Midcourse and Terminal Trajectory Handoff for Helicopter-Borne Aircrafts Using a Linear Modulation Factor
Junsheng LIU a,b, Shichao CHEN b,1, Yun CHEN b, Chenlu DUAN b and Ming LIU c
a School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
b Xi’an Modern Control Technology Research Institute, Xi’an 710065, China
c School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
Abstract. A composite guidance mode can achieve better aircraft control than a single guidance mode. However, trajectory disruption and attitude change are inevitable during the handover from midcourse guidance to terminal guidance and can lead to unpredicted control failure. Realizing a smooth handoff is therefore of crucial importance to robust control. To address this problem, this paper proposes an effective method that uses a linear factor to realize a smooth transition between midcourse and terminal guidance for helicopter-borne aircraft. The application scope of the proposed method is analyzed against the strict constraint conditions of smooth handover. The effectiveness of the proposed method is verified by simulation experiments.
Keywords. Composite guidance, midcourse and terminal guidance handover, smooth transition
1. Introduction
The midcourse and terminal guidance handover consists of two main aspects: the handover of the seeker and the handover of the flight trajectory [1]. The seeker handover aims at guiding the aircraft to a region of airspace in which target capture is guaranteed within limited error and time. The trajectory handover guarantees a smooth transition between midcourse guidance and terminal guidance. Successful transition between midcourse and terminal guidance is restricted by various factors [2-3]. For the seeker handover, one can design an appropriate midcourse guidance law so that the seeker-target line lies within the field of view of the seeker when the seeker starts to detect the target. As for the trajectory handover, since different guidance laws are adopted for midcourse and terminal guidance, the two phases have different trajectory characteristics [4-5]. When the aircraft switches from midcourse guidance to terminal guidance, the overload may change dramatically, leading to an unstable body [6-7]. Realizing a smooth transition between midcourse guidance and terminal guidance is therefore of crucial importance to body stabilization and target capture. The relationship of the midcourse and terminal guidance handover is shown in Figure 1.
1 Corresponding Author. E-mail: [email protected].
160 J. Liu et al. / Achieving Midcourse and Terminal Trajectory Handoff for Helicopter-Borne Aircrafts
Figure 1. The relationship of the midcourse and terminal handover.
where $\theta_0$ is the angle between the speed vector and the reference line, which represents the variance rate of the speed vector. $M$ represents the aircraft location, $T_0$ represents the original location of the target of interest, $M_k$ represents the location of the aircraft at the point where the handover between midcourse guidance and terminal guidance starts, $T_k$ is the corresponding position of the target, $PI$ represents the intersection point of the aircraft and the target, and $R$ is the effective detection range of the seeker.
2. Handover of the trajectory
We focus on the handover of the trajectory in this paper. However, successful handover of the seeker is a precondition for a smooth transition between midcourse guidance and terminal guidance. When the target is within the effective detection range of the seeker, the seeker can capture the target with a certain probability as long as the target lies in the cone formed by the field of view of the seeker in the pitching and yawing directions. Algorithms for seeker handover have been studied and discussed in detail in the existing literature [1, 8], so we do not explore them in this paper. As previously discussed, because different guidance laws are chosen for midcourse guidance and terminal guidance, trajectory disruption inevitably emerges during the transition and can cause the target to be missed. Avoiding a jump at the interface is therefore of great importance for a smooth transition. There are two kinds of smooth transition at the trajectory interface: the first order smooth transition and the second order smooth transition. The first order transition corresponds to a smooth transition of the speed vector of the aircraft, i.e., the derivative of the speed vector exists and is continuous. The second order transition corresponds to a smooth transition of the acceleration vector, i.e., the derivative of the acceleration vector exists and is continuous. In fact, during the transition between midcourse guidance and terminal guidance, the seeker already outputs the line of sight (LOS) rates. Therefore, combining the two guidance laws is a good way of achieving a smooth transition. Let $a(t)$ denote the acceleration of the aircraft during handoff. To realize a smooth transition, the following two restrictions need to be satisfied at the initial handover point $t_0$.
$$a(t_0) = a_{c1}(t_0) \tag{1}$$
$$a'(t_0) = a'_{c1}(t_0) \tag{2}$$
where $a_{c1}$ stands for the acceleration before handover. Likewise, the following two restrictions need to be satisfied at the ending point of the handover.
$$a(t_0+T) = a_{c2}(t_0+T) \tag{3}$$
$$a'(t_0+T) = a'_{c2}(t_0+T) \tag{4}$$
where $a_{c2}$ stands for the acceleration after handover and $T$ is the duration of the handover. As can be seen from the above restrictions, to achieve a stable handover the acceleration $a(t)$ should be a weighted sum of $a_{c1}(t)$ and $a_{c2}(t)$:
$$a(t) = \lambda(t)\,a_{c1}(t) + \mu(t)\,a_{c2}(t) \tag{5}$$
where $\lambda(t)$ and $\mu(t)$ are weighting coefficients. Combining equations (5) and (1), we have
$$a(t_0) = a_{c1}(t_0) = \lambda(t_0)\,a_{c1}(t_0) + \mu(t_0)\,a_{c2}(t_0) \;\Rightarrow\; \lambda(t_0) = 1,\ \mu(t_0) = 0 \tag{6}$$
Similarly, combining equations (5) and (3), we have
$$a(t_0+T) = a_{c2}(t_0+T) = \lambda(t_0+T)\,a_{c1}(t_0+T) + \mu(t_0+T)\,a_{c2}(t_0+T) \;\Rightarrow\; \lambda(t_0+T) = 0,\ \mu(t_0+T) = 1 \tag{7}$$
Here, we adopt a linear modulation factor to realize a smooth transition during handover. Accordingly, $\lambda(t)$ and $\mu(t)$ are constructed as
$$\lambda(t) = k_1 (t - t_0) + k_2 \tag{8}$$
$$\mu(t) = k_3 (t - t_0) + k_4 \tag{9}$$
where $k_i$ ($i = 1, 2, 3, 4$) are constants. Substituting equations (6) and (7) into equations (8) and (9), we have
$$\begin{cases} k_1 = -1/T \\ k_2 = 1 \\ k_3 = 1/T \\ k_4 = 0 \end{cases} \tag{10}$$
Equations (8) and (9) can be further updated based on the results of equation (10).
$$\begin{cases} \lambda(t) = 1 - \dfrac{t - t_0}{T} \\[2mm] \mu(t) = \dfrac{t - t_0}{T} \end{cases} \tag{11}$$
Substituting equation (11) into equation (5), we finally obtain the expression of the acceleration.
$$a(t) = \left[1 - \frac{t - t_0}{T}\right] a_{c1}(t) + \left[\frac{t - t_0}{T}\right] a_{c2}(t) \tag{12}$$
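The blending rule of equation (12) is short enough to sketch directly. The following Python sketch is illustrative only: the function names are hypothetical, the two guidance commands are passed in as arbitrary callables, and `lam` stands for the first weighting coefficient because `lambda` is a reserved word in Python.

```python
from typing import Callable, Tuple


def linear_factors(t: float, t0: float, T: float) -> Tuple[float, float]:
    """Return the two weights of equation (11) for t0 <= t <= t0 + T."""
    if T <= 0:
        raise ValueError("handover duration T must be positive")
    s = (t - t0) / T                 # normalised handover time in [0, 1]
    if not 0.0 <= s <= 1.0:
        raise ValueError("t lies outside the handover interval")
    return 1.0 - s, s


def blended_acceleration(
    t: float,
    t0: float,
    T: float,
    a_c1: Callable[[float], float],
    a_c2: Callable[[float], float],
) -> float:
    """Equation (12): weighted sum of the two commanded accelerations."""
    lam, mu = linear_factors(t, t0, T)
    return lam * a_c1(t) + mu * a_c2(t)
```

With made-up commands `a_c1 = lambda t: 1.0 + 0.1 * t` and `a_c2 = lambda t: 0.8 - 0.05 * t`, `t0 = 2.0` and `T = 1.0`, the blended value equals `a_c1(2.0)` at the start of the handover and `a_c2(3.0)` at its end, which are exactly conditions (1) and (3).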
From equation (12), we can see that the acceleration $a(t)$ consists of two parts. At the early stage, the weight of $a_{c2}(t)$ is relatively small and $a_{c1}(t)$ is the principal component. As time approaches $t_0 + T$, the weight of $a_{c2}(t)$ grows and the effect of $a_{c1}(t)$ weakens. However, meeting the restrictions of equations (1) and (3) alone is not enough. To realize a smooth transition during handover, the conditions expressed by equations (2) and (4) also need to be satisfied. Taking the first order derivative of equation (12), we have
$$a'(t) = -\frac{a_{c1}(t)}{T} + \left[1 - \frac{t - t_0}{T}\right] a'_{c1}(t) + \frac{a_{c2}(t)}{T} + \left[\frac{t - t_0}{T}\right] a'_{c2}(t) \tag{13}$$
Evaluating equation (13) at $t = t_0$ and comparing with equation (2), we have
$$a'(t_0) = a'_{c1}(t_0) + \frac{a_{c2}(t_0) - a_{c1}(t_0)}{T} \tag{14}$$
Similarly, evaluating equation (13) at $t = t_0 + T$ and comparing with equation (4), we have
$$a'(t_0+T) = a'_{c2}(t_0+T) + \frac{a_{c2}(t_0+T) - a_{c1}(t_0+T)}{T} \tag{15}$$
As can be seen from equations (14) and (15), the linear modulation factor cannot meet the conditions set by equations (2) and (4) exactly, i.e., in general $a'(t_0) \neq a'_{c1}(t_0)$ and $a'(t_0+T) \neq a'_{c2}(t_0+T)$. How closely the conditions are met depends on how close the error terms are to zero. That is, if the quantities in equations (16) and (17) are close to zero, we can still obtain satisfactory handover results with the linear modulation factor.
$$err_1 = a_{c2}(t_0) - a_{c1}(t_0) \tag{16}$$
$$err_2 = a_{c2}(t_0+T) - a_{c1}(t_0+T) \tag{17}$$
Fortunately, the values of $err_1$ and $err_2$ are quite small when the overload difference is small. Their negative influence can be ignored in such situations, which is typically the case for helicopter-borne aircraft. A smooth transition can therefore be realized during the handover between midcourse guidance and terminal guidance.
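The size of the derivative mismatch left by the linear factor can be checked numerically. The sketch below evaluates the derivative of the blended acceleration and the two error terms for user-supplied commands. The function names and the sample commands used in the usage note are hypothetical; only the formulas come from the derivation above.

```python
from typing import Callable, Tuple

Func = Callable[[float], float]


def blended_derivative(
    t: float, t0: float, T: float,
    a_c1: Func, a_c2: Func, da_c1: Func, da_c2: Func,
) -> float:
    """Derivative of the linearly blended acceleration, as in equation (13)."""
    s = (t - t0) / T
    return (a_c2(t) - a_c1(t)) / T + (1.0 - s) * da_c1(t) + s * da_c2(t)


def handover_errors(t0: float, T: float, a_c1: Func, a_c2: Func) -> Tuple[float, float]:
    """Return (err1, err2) as defined in equations (16) and (17)."""
    err1 = a_c2(t0) - a_c1(t0)
    err2 = a_c2(t0 + T) - a_c1(t0 + T)
    return err1, err2
```

For the made-up commands `a_c1(t) = 1.0 + 0.1 t` and `a_c2(t) = 0.8 - 0.05 t` with `t0 = 0` and `T = 1`, the errors are -0.2 and -0.35, and the derivative of the blended acceleration differs from the derivatives of `a_c1` and `a_c2` at the two end points by exactly `err1 / T` and `err2 / T`. A longer handover duration `T` therefore shrinks the mismatch for the same acceleration difference.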
3. Experimental results
We take the helicopter-borne case as an example to show the effectiveness of the proposed method. The position of the target is set to (8000.0 m, 0.0 m, 0.0 m), the velocity of the target is 15.0 m/s, and the average velocity of the aircraft is 200 m/s. The aircraft flies at a fixed height along the line of sight during midcourse guidance to guarantee target capture by the seeker. The proportional guidance law is adopted during terminal guidance. When the distance between the target and the aircraft reaches the effective detection range of the seeker, the handover starts. The handover duration T is set to 1.0 s. Normally, the seeker of a helicopter-borne aircraft starts to work at a range of 4 km between the target and the seeker, i.e., the handover process starts at 4 km. To verify the advantage of the proposed method, we compare it with direct handover. Simulation results for the aircraft trajectory along the Y direction, the overload along the Y direction, the trajectory inclination angle, the pitching angle, and the pitching angle acceleration are displayed in Figures 2-6. The zero time on the horizontal axis corresponds to the starting point of the handover.
Figure 2. Aircraft trajectory along the Y direction (4km). [Axes: Y(m) versus Time(s); curves: direct handover, linear factor.]
Figure 3. The overload along the Y direction (4km). [Axes: a(g) versus Time(s); curves: direct handover, linear factor.]
Figure 4. The trajectory inclination angle (4km). [Axes: trajectory inclination angle(°) versus Time(s); curves: direct handover, linear factor.]
Figure 5. The pitching angle (4km). [Axes: the pitching angle(°) versus Time(s); curves: direct handover, linear factor.]
Figure 6. The pitching angle acceleration (4km). [Axes: the pitching angle acceleration(°/s) versus Time(s); curves: direct handover, linear factor.]
From Figure 2, we can see that, since the trajectory changes slowly, the difference between the two handover methods is not pronounced; nevertheless, the proposed linear factor still performs slightly better. As can be seen from Figure 3, a step jump appears in the overload if handover is applied directly without a smooth transition. From Figures 4-5, we can see that the proposed linear modulation factor suppresses the fluctuation of the trajectory inclination angle and the pitching angle. The effect is much more pronounced for the pitching angle acceleration: as shown in Figure 6, its range changes from -18°/s to +10°/s with direct handover to -2°/s to +2°/s with the proposed linear factor, which is quite beneficial for the stable flight of the aircraft. We next analyze the errors defined by equations (16) and (17). The error $a_{c2}(t) - a_{c1}(t)$ versus time is shown in Figure 7. As previously discussed, the zero time corresponds to the starting time of the handover $t_0$, and 1 s corresponds to the end time of the handover $t_0 + T$. We can see that $err_1 \approx 0$ and $|err_2| \approx 0.25$, so the amplitude is quite small. Although the conditions imposed by equations (2) and (4) cannot be satisfied exactly, a stable handover and a smooth transition can still be realized thanks to the small error.
Figure 7. The acceleration error during handover (4km). [Axes: $a_{c2} - a_{c1}$ (g) versus Time(s).]
Taking a closer look at the results in Figure 7, we can further discuss how the overload difference affects the proposed method. As previously discussed, a completely smooth transition during the handover requires the strict conditions of equations (1)-(4) to be met. The proposed method satisfies equations (1) and (3). How well it satisfies equations (2) and (4) depends on how close err1 and err2 are to zero: the smaller err1 and err2 are, the better the proposed method performs. When handover is implemented at a range of 4 km from the seeker to the target, the amplitudes of err1 and err2 are very small. The errors are close to zero, and the trajectory makes a smooth transition. The experimental results show that the proposed linear factor is suitable for the handover of helicopter-borne aircraft.
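As an illustration of the discussion above, the following minimal Python sketch blends a midcourse acceleration command into a terminal command over the handover window. The blending form a(t) = (1 - k)·a_c1(t) + k·a_c2(t), with k rising linearly from 0 at t0 to 1 at t0 + T, is an assumption made for illustration and is not taken from the paper's equations; the function names and the sample command values are hypothetical.

```python
def linear_factor(t, t0, T):
    """Assumed linear modulation factor: 0 before t0, 1 after t0 + T."""
    if T <= 0:
        raise ValueError("handover duration T must be positive")
    return min(1.0, max(0.0, (t - t0) / T))


def blended_command(t, t0, T, a_mid, a_term):
    """Blend midcourse (a_mid) and terminal (a_term) commands (assumed form)."""
    k = linear_factor(t, t0, T)
    return (1.0 - k) * a_mid(t) + k * a_term(t)


if __name__ == "__main__":
    # Hypothetical constant commands (in g) that differ by 0.25 g.
    a_mid = lambda t: 1.00
    a_term = lambda t: 0.75
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, blended_command(t, 0.0, 1.0, a_mid, a_term))
```

With a direct handover the command would jump by the full difference at t0; under the assumed blend the same difference is spread over the window T, which is the kind of step suppression the figures describe.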
4. Conclusion
Focusing on the problem of trajectory handover in composite guidance mode, this paper proposes a simple but effective linear modulation factor to realize a smooth transition between midcourse guidance and terminal guidance. However, the proposed method is suitable only for situations with a small acceleration difference during handover. For cases with a large acceleration difference, more complicated factors, such as a second-order factor or a trigonometric function factor, have to be designed to realize effective handover.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants 61701289 and 61601274, the Natural Science Foundation of Shaanxi Province under Grants 2018JQ6083 and 2018JQ6087, the Young Talent Fund of University Association for Science and Technology in Shaanxi under Grant 20190106, and the Fundamental Research Funds for the Central Universities under Grant GK201903084.
References
[1] Z. Shi, H. Wang, P. Zhang, X. Tang, B. Wu and C. Wang, Study on the probability of successful handoff of missile trajectory from midcourse guidance to terminal guidance, 2010 International Conference on Computational and Information Sciences, Chengdu, China, 2010.
[2] R. W. Morgan and A. P. Noseck, Generalized optimal midcourse guidance, 53rd IEEE Conference on Decision and Control, Los Angeles, CA, USA, 2014.
[3] Z. Jin, S. Lei, W. Huaji, Z. Dayuan and L. Humin, Optimal midcourse trajectory planning considering the capture region, Journal of Systems Engineering and Electronics 29 (2018), 587-600.
[4] R. W. Morgan, Midcourse guidance with terminal handover constraint, 2016 American Control Conference (ACC), Boston, MA, USA, 2016.
[5] X. Liu and M. Lv, Design of an optimal midcourse guidance law to enhance the probability of target acquisition, 2015 IEEE International Conference on Mechatronics and Automation (ICMA), Beijing, China, 2015.
[6] C. Chang and D. Lin, Modeling and simulation of kinetic optimal midcourse guidance law for air-to-air missiles, Second International Conference on Information and Computing Science, Manchester, UK, 2009.
[7] Z. Danxu, F. Yangwang and Y. Pengfei, Midcourse trajectory optimization method with strong velocity constraint for hypersonic target interceptor, 29th Chinese Control and Decision Conference (CCDC), Chongqing, China, 2017.
[8] S. Ann and Y. Kim, Trajectory optimization for a missile with strap-down seeker against hypersonic target, Proceedings of the International Conference on Advanced Mechatronic Systems, Melbourne, VIC, Australia, 2016.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200058
Study on the Influence of Intelligent Evacuation Guidance System of Highway Tunnel on Emergency Evacuation
Zhengmao CAO a and Xiao LIU b,1
a China Merchants Chongqing Communications Research and Design Institute Co., Ltd., Chongqing 400067, China
b Department of Foreign Language, Southwest Jiaotong University Hope College, Chengdu 610400, China
Abstract. Based on the evacuation characteristics of people in highway tunnels, an improved cellular automaton (CA) model was used to simulate and analyze the evacuation patterns of different crowds under a mixed behavior mode. The study of the intelligent evacuation system was based on the basic principles of safe evacuation, the installation locations of the intelligent evacuation direction system, the tunnel space structure and typical fire scenarios, with comprehensive consideration of factors such as path length, exit width, population density and the distribution of evacuees. The effects of visual induction, auditory induction and dual induction of the intelligent evacuation guidance system on the evacuation process were studied by changing the range over which the guidance signals influence the crowd, and a multi-parameter method for intelligent dynamic identification of evacuation paths was obtained. The simulation results show that the intelligent evacuation guidance system can offer a dynamic evacuation route through real-time control of guidance signals such as sound and light indicators, and can direct people in a fire to choose the most feasible behavior pattern so as to improve evacuation efficiency. Under the different behavior patterns, the dual induction mechanism of sound and light can effectively reduce the evacuation time if an appropriate number, location and direction of induction signals are chosen and the impact range of those signals is enlarged. In addition, based on the intelligent dynamic identification algorithm, the evacuation efficiency can be expected to rise when the working status of the induction signals is controlled to provide people with a dynamic evacuation route.
Keywords. Highway tunnel, intelligent evacuation guidance system, cellular automaton model, visual induction, dual induction, crowd evacuation
1 Corresponding author at: Department of foreign language, Southwest Jiaotong University Hope College, Chengdu 610400, China. E-mail addresses: [email protected], [email protected].
1. Introduction
The number and construction scale of road tunnels in China continue to increase: as of the end of 2018 there were 17738 highway tunnels with a total length of 17236.1 km (Jiang et al, 2019). There are more and more super-long highway tunnels and large-scale tunnel groups, which have brought great convenience to transportation but have also increased the difficulty of escape in a tunnel fire. The space inside a tunnel easily deprives people of their sense of direction, especially under fire conditions, when people's desire to escape and their psychological state contrast sharply with the harsh environment around them. To ensure the rapid and safe evacuation of people, light-type evacuation indicators are required by the relevant regulations and by operating practice. Because fire detection and automatic alarm equipment alone can no longer meet the requirements of the fire protection system, higher requirements have been put forward for timely evacuation after a fire breaks out. How to apply information technology to tunnel fires so as to achieve safe, accurate and rapid evacuation amid fire and panic is an important issue at present. As an effective means of evacuation guidance and intervention, the intelligent evacuation guidance system has become a hot research direction.
The safe evacuation of people in tunnel fires is a complex problem involving the interaction of three basic factors: tunnel structure characteristics, the fire development process and human behavior. It is one of the hotspots in the field of tunnel operation technology and fire safety science. Domestic and foreign scholars have studied personnel evacuation and intelligent evacuation systems. For example, a new intelligent emergency evacuation indication system adopts centralized control to convert the traditional escape to the nearest exit into a safely guided escape (Li et al, 2006). A wireless sensing emergency lighting system transmits information through a coordinator and a router and realizes intelligent control (Li, 2018). In order to reduce the impact of the predictability of the urban rail transit network on daily operations, a congestion propagation model based on the real-time distribution of passenger flow has been established (Li and Qin, 2016).
The control effect of the indicators has been incorporated into the population distribution prediction model so that the predicted distribution of the evacuating population is closer to the real situation (Zhao, 2016). A subway intelligent evacuation system turns off the marker lights pointing to exits in the danger zone, according to the fire location reported by the fire alarm system, and lights only the marker lights pointing to exits in the safe zone (Xiang, 2007). A dynamic model of evacuation flow enables continuous prediction of the temporal and spatial distribution of pedestrian flow (Hoogendoom et al, 2004). An evacuation model has been used to build an evacuation route construction algorithm and an intersection traffic flow distribution algorithm that guide escapers to the safe area (Chen, 2008). In addition, a number of studies of fire evacuation models and personal evacuation behavior, including numerical simulations and experiments, have been conducted at home and abroad (Chen et al, 2019; Xie et al, 2018; Zeng et al, 2018).
Researchers therefore agree that intelligent evacuation indicators should be used to keep people from evacuating toward dangerous areas. However, how to optimize the overall evacuation efficiency by setting intelligent evacuation indicators is still a topic worthy of further exploration. Most studies ignore the psychological characteristics of evacuees in an emergency and their individual evacuation behavior. The construction of the intelligent evacuation guidance system, the interaction mechanism between the system and the evacuation behavior of the crowd, and the principle for determining the guided evacuation path need further research.
Therefore, this paper comprehensively considers the internal structure of the building, the density of personnel, the psychology of evacuees, the control of people flow and other factors, establishes a crowd safety evacuation model, and studies a new intelligent evacuation guidance system to improve the efficiency of safe evacuation. The work has both academic value and practical significance.
2. Composition of Intelligent Evacuation System
The intelligent evacuation concept is a new evacuation concept that addresses the major defects of the traditional evacuation system. It introduces high-level exit voice prompts, low-level evacuation lighting, and a two-way adjustable, continuous guiding optical flow on the ground or wall. The core idea is to guide people actively, quickly and accurately to safe exits away from the fire source, based on the exact location of the fire. The traditional "nearest exit" evacuation is thus converted into a safe evacuation whose premise is to keep away from the fire source, which greatly reduces the evacuation time and avoids blind escape.
The intelligent evacuation system consists of fire detectors, intelligent evacuation controllers, intelligent emergency lighting, intelligent evacuation sign lamps and other communication and control devices. Based on the concept of "smart evacuation", advanced computer technology has been applied to significantly improve the traditional fire emergency lighting controller. The system realizes functions such as windowed operation, editing and display of building plan graphics, and emergency evacuation in non-fire accidents. On the lighting side, the system consists of an intelligent emergency lighting controller, an emergency lighting power supply, an emergency lighting power distribution device, emergency lights and indicator lights, as shown in Figure 1.
Figure 1. Composition of Intelligent Evacuation System
In the event of a fire, after the intelligent evacuation controller receives the information from the fire detector, the intelligent evacuation system immediately generates the optimal evacuation route and quickly activates the fire emergency signs along that route, from the evacuation channel to the safety exit. The signs flash in turn in the evacuation direction to form an optical flow that escapers can see clearly and follow to evacuate safely. When the system is applied to a highway tunnel, the particularity of the tunnel is taken into account, and the evacuation principle is to keep away from the fire source and to evacuate upwind. When the intelligent evacuation controller makes spatial decisions and evacuation plans, it combines the direction of the fresh air supply and smoke exhaust system with the location information of the evacuation facilities, including the cross passages of the road tunnel. The evacuation situation in a road tunnel fire is shown in Figure 2.
Figure 2. Highway Tunnel Fire Evacuation
3. Evacuation Model
3.1. The Construction of the Evacuation Model and Basic Assumptions
A cellular automaton assigns each cell of a regular grid one of a finite set of discrete states and updates all cells synchronously according to local rules. It is a spatially extended discrete dynamic system composed of a large number of simple, identical individuals that interact locally. On the basis of the improved cellular automaton model, an auditory effect coefficient and a visual effect coefficient are introduced to reflect how the intelligent evacuation guidance system in a highway tunnel acts on crowd evacuation.
3.1.1. Quantification of Repulsive Force and Friction
When m (m ≥ 2) people move toward the same target at the same time, the repulsion probability r ∈ [0, 1] is introduced, and the motion probability of each of the people moving at the same time is:
$$p_i = (1 - r_1)/m, \quad i = 1, 2, \ldots, m \qquad (1)$$
That is, the pedestrians involved remain motionless with probability r_1 and move with probability 1 - r_1, and each person's motion probability is equal. The repulsion probability is defined by the sigmoid function used in neural networks:
$$r = \frac{1 - e^{-\alpha V}}{1 + e^{-\alpha V}} \qquad (2)$$
where V is the relative velocity and α is the hardness coefficient. The model also gives a quantitative relationship for friction, and the friction probability f is defined as:
$$f = \theta \cdot V \qquad (3)$$
where V is the relative velocity and θ is the coefficient of friction. With the above treatment, a quantitative description is obtained of the repulsion and friction among people, and between people and walls, during evacuation.
3.1.2. Determination of Auditory Effect Coefficient and Visual Effect Coefficient
An actual evacuation guidance system indicates the direction by installing, at the safety exits and along the evacuation path, high-level exit voice prompts, low-level evacuation lighting, and continuous guiding light-flow marker lights on the ground or wall, or it directly uses sound and light alarm devices to instruct people to evacuate. The high-level exit voice is defined as auditory induction, and the low-level evacuation lighting and the continuous guiding light-flow marker lights are defined as visual induction. The joint action of the two, or the action of a sound and light alarm device, is called dual induction. The degree to which the guidance influences the evacuation of personnel is defined as k ∈ [0, 1]. When visibility is high, visual induction is stronger than auditory induction; otherwise, auditory induction is stronger than visual induction. The effect of auditory and visual induction on the probability of motion is expressed as:
$$p_i' = (k_a + k_v)/p_i \qquad (4)$$
where k_a is the auditory effect coefficient, k_v is the visual effect coefficient, and k_a + k_v = 1. When k_a tends to 1, the induction is completely auditory, corresponding to the low visibility of the middle and late stages of a fire; when k_v tends to 1, the induction is completely visual, corresponding to the good visibility at the beginning of a fire.
3.2. Simulated Scene and Evacuation Mode
Let the simulation scene be a section of the tunnel. As shown in Figure 3, there are three safety exits. Exits A and B are 2 m wide and exit C is 4 m wide. According to calculations, about 400 people need to be evacuated.
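The conflict rules of Section 3.1.1 (equations (1)-(3)) can be sketched in a few lines of Python. This is only a minimal illustration: the parameter names alpha and theta for the hardness and friction coefficients, the clipping of the friction probability to [0, 1], and the helper resolve_conflict are assumptions introduced here, not part of the paper's model description.

```python
import math
import random


def repulsion_probability(v_rel, alpha):
    """Equation (2): r = (1 - exp(-alpha*V)) / (1 + exp(-alpha*V))."""
    e = math.exp(-alpha * v_rel)
    return (1.0 - e) / (1.0 + e)


def move_probability(m, r):
    """Equation (1): each of m competing pedestrians moves with (1 - r) / m."""
    if m < 2:
        raise ValueError("equation (1) applies to m >= 2 competitors")
    return (1.0 - r) / m


def friction_probability(v_rel, theta):
    """Equation (3): f = theta * V; clipping to [0, 1] is an added safeguard."""
    return min(1.0, max(0.0, theta * v_rel))


def resolve_conflict(candidates, r, rng=random):
    """Hypothetical helper: nobody moves with probability r, otherwise one
    candidate is chosen uniformly, so each moves with probability (1 - r)/m."""
    if rng.random() < r:
        return None
    return rng.choice(candidates)
```

For example, `move_probability(2, repulsion_probability(1.0, 1.0))` gives the chance that either of two pedestrians competing for one cell at unit relative velocity wins it.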
Figure 3. Schematic diagram of the plane layout of the simulated scene
According to people's familiarity with the tunnel, the evacuation behavior can be divided into three modes: the shortest path behavior (mode 1), the behavior of leaving by the same way one entered (mode 2), and the complete herd behavior (mode 3). When people are unfamiliar with the surrounding environment and encounter an emergency, their decision-making is more strongly affected by fear, which shows up as a complete herd behavior that follows the direction of movement of most people in their field of vision. The evacuation grid is set to 0.5 m × 0.5 m, and the parameters of the evacuation model are selected as follows: hardness coefficient α = 1 for repulsion between people and α = 2 for repulsion between people and walls; friction coefficient θ = 0.3 between people and θ = 0.7 between people and walls. To reflect actual crowd evacuation, the evacuation speed of a person depends on the density of the surrounding people and decreases monotonically as the population density increases. The evacuation speed is calculated as follows:
$$v = \begin{cases} 0, & \rho > 2.3 \\ v_0\left[0.35\,(1.32 - 0.82\ln\rho) + 0.02\,(3 - 0.76\rho) + 0.2\right], & 0.5 \le \rho \le 2.3 \\ 1, & \rho < 0.5 \end{cases} \qquad (5)$$
where v is the actual speed during evacuation, v0 is the normal speed of a person, and ρ is the density of people within a certain range around the person under study. When an evacuee reaches an evacuation exit, the evacuation is considered successful. Each step takes a time of 0.5 m / v. When the last person has evacuated successfully, the evacuation times of all personnel are compared, and the maximum value is the final evacuation time. Under the mixed behavior mode, each evacuee is randomly assigned one of the three behavior patterns.
3.3. Simulation Result Analysis
After multiple simulations using the mixed behavior pattern, the average total evacuation time is 125 s, whereas the average total evacuation time for the complete herd behavior pattern under the same conditions is 256 s. The total evacuation time of the mixed behavior pattern is thus less than half that of the complete herd behavior pattern, showing a higher evacuation efficiency and indicating that the evacuation behavior pattern strongly affects the evacuation process. The reason can be explained as follows: in the mixed behavior mode, the people in behavior patterns 1 and 2 exert a "herd effect" on the people in behavior pattern 3 to a certain extent, so the evacuation time is greatly shortened as a whole. Therefore, it is necessary to explore how the dynamic guidance of the evacuation guidance system can be used to improve the evacuation efficiency.
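The speed-density relation of equation (5) and the per-step time of Section 3.2 can be sketched as follows. The sketch assumes that the operators in the middle branch are additions, that the low-density branch is the constant 1 as printed, and that v0 defaults to 1 m/s; the constant and function names are hypothetical.

```python
import math

CELL_SIZE_M = 0.5  # edge length of one grid cell, as stated in Section 3.2


def evacuation_speed(rho, v0=1.0):
    """Equation (5): walking speed as a function of local density rho."""
    if rho > 2.3:
        return 0.0  # crowd too dense to move
    if rho < 0.5:
        return 1.0  # low-density branch, constant as printed in equation (5)
    return v0 * (0.35 * (1.32 - 0.82 * math.log(rho))
                 + 0.02 * (3.0 - 0.76 * rho)
                 + 0.2)


def step_time(rho, v0=1.0):
    """Time needed to advance one cell, 0.5 m / v (infinite when blocked)."""
    v = evacuation_speed(rho, v0)
    return math.inf if v == 0.0 else CELL_SIZE_M / v
```

Summing `step_time` over the cells a pedestrian crosses gives that pedestrian's evacuation time; the scenario's evacuation time is then the maximum over all pedestrians, as described in the text.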
4. Analysis of the Role of Intelligent Evacuation Guidance System
4.1. Working Principle of Intelligent Evacuation Guidance System
The intelligent evacuation guidance system is shown in Figure 4. The system consists of the control computer, a smoke-layer concentration detector, a crowd evacuation speed detector, a crowd density detector, dynamic evacuation route signs and safety exit signs. When a fire occurs, the system continuously optimizes the guided evacuation path according to the smoke concentration of the smoke layer, the evacuation speed of the crowd, the density of the crowd and the structural parameters of the building, and thereby obtains and dynamically marks the best real-time evacuation path for the crowd.
Figure 4. Intelligent evacuation guidance system
4.2. The Effect of Visual Guidance on the Evacuation Process
In the intelligent evacuation guidance system, taking ka = 0 means that only the visual induction effect of the evacuation guidance signs on the evacuees, determined by their position, number and influence range, is considered. The complete herd behavior pattern 3 is analyzed: when an evacuation indicator enters a person's visual range, that person's behavior pattern is directly converted to behavior pattern 1, which means that the evacuation direction is clear. The simulation results are shown in Figure 5 and Figure 6.
Figure 5. Influence of visual guidance on evacuation time
Figure 6. Relationship between visual guidance and the number of people unevacuated
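The pattern-switching rule described above can be sketched as a small Python function. It is a simplified illustration under assumptions: positions are 2-D coordinates in metres, the influence range is a plain Euclidean radius, and the names update_behavior, sign_positions and influence_range are hypothetical.

```python
import math


def update_behavior(pattern, ped_pos, sign_positions, influence_range):
    """Switch a herd-following pedestrian (pattern 3) to shortest-path
    behavior (pattern 1) once any guidance sign is within its range."""
    if pattern == 3 and any(
        math.dist(ped_pos, sign) <= influence_range for sign in sign_positions
    ):
        return 1
    return pattern
```

Enlarging `influence_range` converts more pattern-3 pedestrians earlier, which is the mechanism the simulations vary.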
Figure 5 shows how the range of the visual effect on the evacuees affects the overall evacuation time. Figure 6 shows the number of unevacuated people at the same time step for different influence ranges. It can be seen from Fig. 5 and Fig. 6 that when the influence range of the evacuation guidance signs is small, more evacuees are unable to evacuate quickly because they do not see a sign, and the overall evacuation is delayed. On the contrary, when the influence range of the evacuation guidance signs is large, more evacuees can select a reasonable evacuation path under the guidance of the signs, which saves evacuation time. Figure 6 also reveals that once the influence range has been extended to a certain extent, further extension no longer affects the evacuation time. Therefore, it is important to select and light a reasonable set of dynamic evacuation guidance indicators, which is one of the advantages of the intelligent evacuation guidance system. Simulating the influence of auditory induction on the evacuation process in the same way leads to similar conclusions.
4.3. The Effect of Double Guidance on the Evacuation Process
Based on the fact that in real fire smoke the influence range of auditory induction is slightly larger than that of visual guidance, the evacuation of personnel is simulated for kv = 0.8, kv = 0.6, kv = 0.4 and kv = 0.2, respectively. The simulation results are shown in Figure 7.
Figure 7. Relationship between dual guidance and the number of people unevacuated
It can be seen from Fig. 7 that when the acoustic and light guidance markers are used together, the weakening of visual guidance (kv from 0.8 to 0.2) is accompanied by the strengthening of acoustic guidance (ka from 0.2 to 0.8). This indirectly reflects the importance of sound evacuation signs when fire smoke reduces visibility. Therefore, a double guidance mechanism is adopted in the intelligent evacuation guidance system of the tunnel, which can greatly shorten the evacuation time and improve the evacuation efficiency.
4.4. The Effect of Double Guidance on Evacuation Process in Tunnel Fire
Let the fire occur near exit A, and make the following changes to the simulation scenario:
• The evacuation guidance markers are distributed close to the safety exits and along the evacuation path.
• The intelligent evacuation guidance system can change the direction indicated by the evacuation guidance markers according to the smoke situation, that is, point them to the nearest non-hazardous safety exit.
According to the above scenario, the simulation is performed for two cases. Figure 8 shows the number of unevacuated people under the intelligent evacuation guidance system when kv = 0.4 and the direction indicated by the sound and light indicators changes. Figure 9 shows the number of unevacuated people under the traditional evacuation guidance system when kv = 0.4, where the position and number of sound and light indicators are unchanged and they point in the original direction.
Figure 8. Evacuation effect of intelligent evacuation guidance system with kv = 0.4
Figure 9. Evacuation effect of traditional evacuation guidance system with kv = 0.4
Comparing Figure 8 and Figure 9, it is easy to see that the intelligent evacuation guidance system, by lighting the correct evacuation guidance markers in the right number and locations, can shorten the evacuation time and improve the evacuation efficiency.
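The redirection rule of Section 4.4 (point each marker to the nearest non-hazardous exit) can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: exits and markers are placed on a 1-D axis along the tunnel, distance is the only criterion, and the exit coordinates and function names are hypothetical.

```python
def choose_exit(marker_pos, exits, hazardous):
    """Return the name of the nearest exit that is not flagged hazardous.

    exits: dict mapping exit name -> position along the tunnel (m)
    hazardous: set of exit names currently affected by fire or smoke
    """
    safe = {name: pos for name, pos in exits.items() if name not in hazardous}
    if not safe:
        return None
    # Ties are broken by exit name so the result is deterministic.
    return min(safe, key=lambda name: (abs(safe[name] - marker_pos), name))


def guidance_direction(marker_pos, exits, hazardous):
    """Return +1 or -1 (direction along the axis) or None if no safe exit."""
    target = choose_exit(marker_pos, exits, hazardous)
    if target is None:
        return None
    return 1 if exits[target] >= marker_pos else -1
```

A traditional system corresponds to calling `choose_exit` with an empty `hazardous` set regardless of where the fire is; the intelligent system updates the set as the smoke spreads.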
5. Conclusions
The improved cellular automaton model is used to simulate the evacuation process under a mixed evacuation behavior pattern composed of three behavior modes. The effects of visual induction, auditory induction and dual induction of the intelligent evacuation guidance system on the evacuation process were studied. The mechanism by which sound and light induction influences the evacuation of the crowd was analyzed, and the following conclusions can be drawn:
(1) When a highway tunnel fire occurs, lighting the correct evacuation guidance signs, in the right number, location and direction, is an important way to ensure the rapid and effective evacuation of the crowd.
(2) Extending the influence range of the evacuation guidance signs by means of the dual guidance mechanism of sound and light is an effective way to shorten the evacuation time.
(3) An intelligent evacuation indicator that comprehensively considers path length, exit width and personnel density can plan the evacuation route more rationally. Dynamic identification of evacuation paths according to certain optimization rules will greatly improve the evacuation efficiency of the crowd.
6. Acknowledgments This paper was supported by “National Key Research and Development Program of China” (Project No. 2016YFC0802210).
References
[1] S.P. Jiang, Z. Lin, S.F. Wang, Development of China's highway tunnels in 2018, Tunnel Construction 39 (2019), 1217–1220.
[2] Y.Q. Li, W.L. Liu, Q. Li, New intelligent emergency evacuation indicator system research, Fire control technology and product information 9 (2006), 21–23.
[3] J.J. Li, Design of intelligent emergency lighting evacuation system, Jilin University, Jilin, 2018.
[4] B.Y. Li, X.M. Qin, Study on propagation mechanism of mass passenger flow congestion in urban rail transit network, China Safety Science Journal 26 (2016), 162–168.
[5] W. Zhao, Research on optimizing intelligent evacuation direction signage design and controlling flow of evacuees, China Safety Science Journal 26 (2016), 169–174.
[6] D. Xiang, A subway intelligent evacuation system design research, Building Electrical 5 (2007), 36–39.
[7] S.P. Hoogendoom, P.H. Bovy, Dynamic user-optimal assignment in continuous time and space, Transportation Research Part B: Methodological 38 (2004), 571–592.
[8] Y.M. Chen, D.Y. Xiao, Emergency evacuation model and algorithms, Journal of Transportation System Engineering and Information Technology 8 (2008), 96–100.
[9] Y.Z. Chen, W.T. Chen, W.D. Zhang, Evacuation model of densely populated areas in complex buildings, China Safety Science Journal 29 (2019), 13–18.
[10] H. Xie, Y.W. Xiao, Q.M. Zhang, The effect of acoustic signals on the evacuation in underground space, Chinese Journal of Underground Space and Engineering 14 (2018), 595–600.
[11] Y.H. Zeng, J. Li, K.F. Peng, Study of personnel evacuation mode of Mawan undersea shield tunnel, Tunnel Construction 38 (2018), 551–557.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200059
Dynamic Self-Calibration Algorithm for PTZ Camera in Traffic Scene
Wang Wei a,b, Zhang Zhaoyang a, Tang Xinyao a, and Song Huansheng a
a School of Information Engineering, Chang’an University, Xi’an 710064
b Anhui Science and Technology Information Industry Co. Ltd, Hefei 230088
Abstract. Most current traffic camera self-calibration algorithms are based on vanishing points; however, at certain angles of view a vanishing point tends to infinity and the problem becomes ill-conditioned. To overcome this problem, we first establish typical complementary calibration models. We then obtain the vanishing points through the diamond space built from vehicle trajectories and vehicle body edges, and at the same time detect geometric markings in the active road area. Finally, to avoid the "shock" effect of the vanishing point, we use the redundant information to optimize the parameter space of the self-calibration model and make the model more accurate. Experimental results on real traffic images demonstrate the effectiveness and practicability of our self-calibration method, which is especially suitable for PTZ cameras whose angle of view changes constantly.
Keywords: Dynamic camera self-calibration, PTZ camera, Diamond space, Vanishing point, Parameter optimization.
1. Introduction
In recent years, computer vision based applications in intelligent transportation systems (ITS) have become increasingly popular. Mainstream applications include speed measurement, traffic flow analysis, vehicle size identification and vehicle identification [1][2][3]. Camera calibration is the basis of computer vision applications in ITS [4]. Because PTZ camera platforms are widely used in traffic scenes [5][6], automatic calibration methods are becoming more and more important. Many scholars have proposed solutions to the problem of camera calibration in road traffic scenes, and these fall into two groups. (1) Methods using vanishing points and some prior geometric markings [7][8][9][10][11], such as camera height, lane width, lane marking length or the perpendicular distance from the camera to the road’s edge. Kanhere et al [12] summarized the currently popular roadside camera self-calibration methods and proposed nine calibration methods that meet the minimum calibration conditions. However, in some cases the prior markings in a traffic scene cannot be measured, or the measurements are inaccurate, which causes large errors. (2) Methods that do not use vanishing points [13]. Romil Bhardwaj et al [14] used deep learning to obtain the spatial structure of vehicle feature points and then combined it with known camera internal parameters to calibrate with the PnP method. Because of the high recognition accuracy of deep learning, the calibration accuracy of this method is high; even so, it is limited by the need to know in advance the camera internal parameters and the spatial structure of the vehicles’ feature points.
In summary, having analyzed the advantages and disadvantages of current self-calibration methods for traffic cameras, we propose a camera calibration method based on the fusion of vanishing points and multiple geometric markings.
2. Traffic Camera Calibration Model Under Road Scene
2.1. Space Model of Traffic Camera
a) Left side view of the scene
b) Geometric information
c) Corresponding projection
Figure 1. Camera space model under road scene.
In this model, as in current mainstream approaches, the camera is reduced to a pinhole model whose principal point coincides with the center of the image. The image plane is perpendicular to the optical axis, all internal parameters except the focal length are known, and the road surface is assumed to be flat. Figure 1(a) is a schematic diagram of the camera calibration space model under the road scenario. For the convenience of subsequent analysis, define the following parameters: the camera focal length $f$, the height $h$ of the projection center above the road plane, the tilt angle $\phi$, and the pan angle $\theta$ (the angle between the camera's optical axis and the road extension direction). Since the camera roll angle can be represented by a simple image rotation and has no effect on the calibration results in this article, it is not considered. We define the camera coordinate system as $X_c$-$Y_c$-$Z_c$ and the world coordinate system as $X$-$Y$-$Z$. As shown in Figure 1(a), the two coordinate systems are related by a rotation of $\phi + \pi/2$ around the $X$ axis and a translation of $h$ along the $Z$ axis.
The homogeneous coordinates of a world point are $\mathbf{x} = [x\ \ y\ \ z\ \ 1]^T$; in the image coordinate system the point is expressed as $\mathbf{p} = [\alpha u\ \ \alpha v\ \ \alpha]^T$, $\alpha \neq 0$. As shown in reference [12], the projection equation from world coordinates to image coordinates is:
$$\begin{bmatrix} \alpha u \\ \alpha v \\ \alpha \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & -f\sin\phi & -f\cos\phi & fh\cos\phi \\ 0 & \cos\phi & -\sin\phi & h\sin\phi \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \quad (1)$$
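The projection of equation (1) can be illustrated with a short sketch. It assumes the 3 x 4 matrix form given in [12] (first row $f, 0, 0, 0$; second row $0, -f\sin\phi, -f\cos\phi, fh\cos\phi$; third row $0, \cos\phi, -\sin\phi, h\sin\phi$); the function name `project_point` and its argument names are illustrative and do not come from the paper.

```python
import math


def project_point(x, y, z, f, phi, h):
    """Project a world point (x, y, z) to pixel coordinates (u, v).

    Assumed model: pinhole camera with focal length f (pixels), tilt
    angle phi (radians) and height h above a flat road, using the
    matrix form of equation (1) as given in [12].
    """
    s, c = math.sin(phi), math.cos(phi)
    au = f * x                                   # alpha * u
    av = -f * s * y - f * c * z + f * h * c      # alpha * v
    a = c * y - s * z + h * s                    # alpha
    if a == 0:
        raise ValueError("point projects to infinity")
    return au / a, av / a
```

Under these assumptions, the road point where the optical axis meets the road plane, $(0, h/\tan\phi, 0)$, projects to the image center $(0, 0)$, which is a convenient sanity check for the sign conventions.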
It can be seen from the above formula that only the calibration parameters $f$, $\phi$ and $h$ need to be solved in this model. Because the constraints of this model are based on geometric information on the road, the coordinates of a point on the road plane are set to $(x, y, 0)$. In camera self-calibration models, vanishing points carry a lot of information [15][16]. This paper introduces the vanishing point $(u_0, v_0)$ in the road extension direction and the vanishing point $(u_1, v_1)$ in the direction perpendicular to it. As shown in Figure 1(b), the angle between the $y$ axis and the road extension direction is $\theta$, so the point at infinity along the road extension direction in the world coordinate system is expressed as $\mathbf{x}_0 = [-\tan\theta\ \ 1\ \ 0\ \ 0]^T$, and the point at infinity along the perpendicular direction is expressed as $\mathbf{x}_1 = [1\ \ \tan\theta\ \ 0\ \ 0]^T$. Obviously, $(u_0, v_0)$ and $(u_1, v_1)$ are the projections of $\mathbf{x}_0$ and $\mathbf{x}_1$ in image space. Substituting them into (1) yields:
$$u_0 = -\frac{f\tan\theta}{\cos\phi} \quad (2)$$
$$v_0 = v_1 = -f\tan\phi \quad (3)$$
$$u_1 = \frac{f}{\cos\phi\,\tan\theta} \quad (4)$$
where $v_0 = v_1$ arises from the zero roll angle assumption. Transforming (2), (3) and (4) gives the equivalent expressions for $f$, $\phi$ and $\theta$:
$$f = \sqrt{-(v_0^2 + u_0 u_1)} \quad (5)$$
$$\phi = \tan^{-1}\left(\frac{-v_0}{f}\right) \quad (6)$$
$$\theta = \tan^{-1}\left(\frac{-u_0\cos\phi}{f}\right) \quad (7)$$
In addition to the vanishing points, geometric information in road scenes whose physical dimensions are fixed by national standards, such as dotted lines and road width, can be used as calibration marks. Let the physical length of a dotted line be $l$, the physical and pixel ordinates of its two ends be $y_f$ and $y_b$, $v_f$ and $v_b$, and the physical road width be $w$; the pixel length of the segment that the lane lines cut on the $v = 0$ axis of the image is used as well. The spatial relationship of this prior road knowledge and its perspective projection in the image are shown in Figure 1(b) and (c).
2.2. Camera calibration based on two vanishing points and dotted lines (VVL)
Equations for $f$, $\phi$ and $\theta$ can be established from the vanishing points. In order to obtain $h$, this model introduces the dotted lines in the traffic scene. According to equation (1), the physical coordinate $y$ can be expressed as:
$$y = \frac{h\,(f\cos\phi - v\sin\phi)}{v\cos\phi + f\sin\phi} \quad (8)$$
Equation (8) shows that the physical coordinate $y$ is independent of the corresponding pixel abscissa $u$. Therefore a dotted line parallel to the road direction at any position on the road, as shown in Figure 1(c), establishes the constraint $y_b = y_f + l$; substituting it into (8) gives:
$$y_f = \frac{h\,(f\cos\phi - v_f\sin\phi)}{v_f\cos\phi + f\sin\phi} \quad (9)$$
$$y_f = y_b - l = \frac{h\,(f\cos\phi - v_b\sin\phi)}{v_b\cos\phi + f\sin\phi} - l \quad (10)$$
Combining (9) and (10) eliminates $y_f$ and gives the expression for $h$:
$$h = \frac{l\,(v_f\cos\phi + f\sin\phi)(v_b\cos\phi + f\sin\phi)}{f\,(v_f - v_b)} \quad (11)$$
It can be seen from (11) that $h$ is determined by $l$; therefore, all of the unknown parameters required for calibration in this model are obtained.
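The VVL procedure described above can be sketched as follows, under stated assumptions: the vanishing point relations are taken in the form given in [12] ($u_0 = -f\tan\theta/\cos\phi$, $v_0 = -f\tan\phi$, $u_1 = f/(\cos\phi\tan\theta)$), and the height follows from the road-plane relation $y = h(f\cos\phi - v\sin\phi)/(v\cos\phi + f\sin\phi)$ applied to the two ends of one dotted line of known length $l$. The function name `calibrate_vvl` and the subscripts `v_f`, `v_b` for the near and far ends are illustrative choices, not the authors' implementation.

```python
import math


def calibrate_vvl(u0, v0, u1, v_f, v_b, l):
    """Recover (f, phi, theta, h) from two vanishing points and one dotted line.

    u0, v0: vanishing point along the road direction (pixels, origin at
            the principal point); u1: abscissa of the perpendicular one.
    v_f, v_b: pixel ordinates of the near and far ends of the dotted line.
    l: physical length of the dotted line (same unit as the returned h).
    """
    f = math.sqrt(-(v0 * v0 + u0 * u1))       # focal length
    phi = math.atan(-v0 / f)                  # tilt angle
    theta = math.atan(-u0 * math.cos(phi) / f)  # pan angle
    s, c = math.sin(phi), math.cos(phi)
    # Height from y_b - y_f = l on the road plane.
    h = l * (v_f * c + f * s) * (v_b * c + f * s) / (f * (v_f - v_b))
    return f, phi, theta, h
```

A round trip on synthetic data (choose $f, \phi, \theta, h$, generate the vanishing points and the two dash end ordinates, then call the function) is an easy way to check the sign conventions before applying the formulas to real detections.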
3. Traffic Camera Calibration Model Under Road Scene
a) Quadrant mapping diagram
b) Infinity/axis point mapping
Figure 2. Image/diamond space mapping diagram
Dubská and Herout [17] built a point-line-point transformation based on the idea of cascaded transformation and the parallel coordinate system (PClines); this transformation compresses the infinite image domain into a finite diamond domain. Figure 2(a) shows the mapping relationship between the image domain and the diamond domain: $d$ represents the half-axis length of the diamond space along $y$, $D$ represents the half-axis length of the diamond space along $x$, and the infinite space of the original image domain is mapped to the finite diamond area. In addition, as shown in Figure 2(b), the dotted lines represent the points at infinity distributed in the four quadrants of the image domain, and their corresponding line segments in the diamond space are drawn in the same color. Due to space limitations, this paper does not go into details and directly uses the transformation between a diamond domain point $[x, y, w]_d$ and an image domain point $[x, y, w]_o$ proven in [17]:
$$[x, y, w]_d \rightarrow [\,Dy,\ \mathrm{sgn}(x)\,dx + \mathrm{sgn}(y)\,Dy - dDw,\ x\,]_o \quad (12)$$
The most common objects in traffic scenes are vehicles, so we extract straight lines from vehicles and transform them into the diamond domain to get the vanishing points. The trajectories of vehicles generate straight lines parallel to the road, and vehicle bodies have a large number of lateral edges, which are perpendicular to the road direction. This paper uses the pyramidal Lucas-Kanade optical flow method [18] to track moving vehicles and obtains the trajectory lines as the line group parallel to the road, $L_t = \{\, l_t^i \mid i = 1, 2, \ldots, N_t \,\}$; in addition, it uses the LSD fast line detection algorithm [19] to detect the edges of the vehicles and obtains the vehicle edge line group $L_e = \{\, l_e^j \mid j = 1, 2, \ldots, N_e \,\}$.
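As a small illustration of formula (12), the sketch below maps a homogeneous diamond-space point back to the image domain. It assumes the form $[x, y, w]_d \rightarrow [Dy,\ \mathrm{sgn}(x)\,dx + \mathrm{sgn}(y)\,Dy - dDw,\ x]_o$ from [17]; the function names and the default values $d = D = 1$ are illustrative assumptions.

```python
def sgn(a):
    """Sign function returning -1, 0 or 1."""
    return (a > 0) - (a < 0)


def diamond_to_image(x, y, w=1.0, d=1.0, D=1.0):
    """Map a diamond-space point [x, y, w] to image coordinates.

    Returns (x_o, y_o) in the image domain, or None when the result is
    a point at infinity (homogeneous weight equal to zero).
    """
    xo = D * y
    yo = sgn(x) * d * x + sgn(y) * D * y - d * D * w
    wo = x
    if wo == 0:
        return None
    return xo / wo, yo / wo
```

For instance, with $d = D = 1$ the diamond point $(0.25, 0.5)$ corresponds to the homogeneous image point $[0.5, -0.25, 0.25]$, that is, the image point $(2, -1)$.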
Let $\bar{k}_t$ be the average slope of the trajectory group $L_t$. According to the principle that the slopes of mutually perpendicular lines in a perspective image have opposite signs, we can pick out the transverse edge line group $L_e'$ from $L_e$; with $k_e^j$ denoting the slope of line $l_e^j$, the selection is:
$$L_e' = \{\, l_e^j \mid j = 1, 2, \ldots, N_e,\ k_e^j \cdot \bar{k}_t < 0,\ l_e^j \in L_e \,\} \quad (13)$$
Figure 4(a) shows an example of vehicle trajectory extraction in a traffic scene, and Figure 4(b) an example of vehicle transverse edge extraction. Based on prior knowledge, $L_t$ and $L_e'$ are mutually orthogonal sets of parallel lines in world coordinates, parallel and perpendicular to the road direction respectively. Therefore, the intersections of the parallel lines in $L_t$ and $L_e'$ correspond to the two vanishing points $(u_0, v_0)$ and $(u_1, v_1)$ in image space. The steps of finding the vanishing points based on the diamond space are as follows:
Step 1. Detect the line groups $L_t$ and $L_e$ using the pyramidal Lucas-Kanade optical flow method and the LSD fast line detection algorithm.
Step 2. Obtain the parallel line group $L_e'$ of vehicle transverse edges according to formula (13), which satisfies $L_e' \subseteq L_e$.
Step 3. Convert the lines in $L_t$ and $L_e'$ into finite polylines in the diamond space, superimpose these polylines to form the accumulated diamond domain, and find the point with the maximum cumulative value in it.
Step 4. To resist noise, smooth the accumulated diamond domain by convolution with a Gaussian kernel. Then find the two points with the maximum cumulative value in the diamond domain, and calculate their corresponding pixel points $(u_0, v_0)$ and $(u_1, v_1)$ in image space according to formula (12).
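The selection rule of formula (13) used in Step 2 can be sketched in a few lines. The sketch assumes that each line is represented only by its image slope and that the rule keeps the edge lines whose slope has the opposite sign to the mean trajectory slope; the function name is illustrative.

```python
def select_transverse_edges(trajectory_slopes, edge_slopes):
    """Keep the edge slopes whose sign is opposite to the mean trajectory slope.

    trajectory_slopes: slopes of the vehicle trajectory lines.
    edge_slopes: slopes of all detected vehicle edge lines.
    """
    mean_k = sum(trajectory_slopes) / len(trajectory_slopes)
    # An edge is treated as transverse when slope * mean slope < 0.
    return [k for k in edge_slopes if k * mean_k < 0]
```

In a full pipeline the slopes would be computed from the line segments returned by the tracker and the line detector; edges with zero slope are discarded by the strict inequality.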
a) Vehicle trajectory extraction
b) Vehicle body line group extraction
Figure 3. Road parallel and vertical line group extraction
a) Traffic flow trajectory
b) Vehicle edges
c) Vanishing points in diamond space
Figure 4. An example of road vanishing point detection in diamond space
Figure 3(a) shows an instance of the detected parallel line group of vehicle flow trajectories, and Figure 3(b) an instance of the detected parallel line group of vehicle transverse edges in image space. Figure 4(c) shows the corresponding diamond space; the red point has the maximum cumulative value in the diamond space and represents the real vanishing point.
4. PTZ Camera Automatic Calibration in Road Scene
4.1. Identification of Auxiliary Calibration Geometrical Marking
According to the calibration model in Section 2, in addition to the vanishing points, some geometrical markings are necessary for camera calibration, such as the dotted line length $l$, the road width $w$ and their corresponding projections in the image. Since dotted lines and road edges present obvious linear characteristics, and the active road region can be located by a background modeling method, this paper identifies the dotted lines and road width as follows:
Step 1. Extract the background of the road scene with a Gaussian mixture model (GMM), and obtain the active area in the background based on the vehicle trajectories mentioned in the previous section, as shown in Figure 5(a).
Step 2. Extract straight lines in the active area using the Hough transform. According to the vanishing point principle, parallel lines in the same plane intersect at the same point; since the dotted lines and the road edges are all parallel to the road direction, the prior information given by the vanishing point $(u_0, v_0)$ and the active area can be used to narrow the line angle detection range and speed up the Hough transform.
Step 3. In the Hough domain, the lines corresponding to the larger accumulated values are examined in the image domain by their pixel spatial distribution. Along a line that overlaps dotted line segments, the pixel values alternate between peaks and valleys.
Therefore, this paper uses this feature to obtain the dashed line group $L_d = \{\, l_d^i \,\}$ in the image domain according to the dotted line detection method [20], where each dotted line segment $l_d^i$ covers the ordinate range between $v_f^i$ and $v_b^i$. Because a road edge is a long line that corresponds to a maximum point in the Hough domain and its pixel values are distributed evenly in image space, these features allow the road edges to be detected in the image. Furthermore, the intercept between the image $u$ axis and the road edges is found. Figure 5(b) shows an example of the dotted line segment detection result.
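The peak/valley criterion described in Step 3 can be illustrated with a simplified sketch. It is not the detection method of [20]; it merely assumes that the gray values sampled along a candidate line are available as a 1-D list and that a fixed threshold separates marking pixels from road pixels. The names `dash_segments`, `looks_dashed` and the `min_segments` parameter are hypothetical.

```python
def dash_segments(profile, threshold):
    """Return (start, end) index pairs of bright runs in a 1-D intensity profile."""
    segments, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i                        # a bright run begins
        elif value < threshold and start is not None:
            segments.append((start, i - 1))  # the run ended at the previous sample
            start = None
    if start is not None:                    # run reaching the end of the profile
        segments.append((start, len(profile) - 1))
    return segments


def looks_dashed(profile, threshold, min_segments=2):
    """A line is treated as dashed when bright runs alternate with dark gaps."""
    return len(dash_segments(profile, threshold)) >= min_segments
```

A profile with evenly bright values yields a single run and is therefore treated as a continuous line such as a road edge, while alternating bright and dark stretches yield several runs whose end indices give the pixel ordinates of the dash ends.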
a) Active area and vanishing point
b) Road sign detection result
Figure 5. Road Auxiliary Calibration Identification
4.2. Camera Self-adapt Optimized Calibration in Road Scene
In Section 3, this paper obtained the vanishing points $(u_0, v_0)$ and $(u_1, v_1)$ from the line groups parallel and perpendicular to the road direction. However, the camera pan angle changes constantly on a PTZ platform in a traffic scene, and it follows from equation (4) that:
$$\lim_{\theta \to 0} u_1 = \lim_{\theta \to 0} \frac{f}{\cos\phi\,\tan\theta} = \pm\infty \quad (14)$$
It can be seen from equation (14) that when the pan angle $\theta$ tends to zero, that is, when the projection of the camera's optical axis tends to be parallel to the road direction, the vanishing point $(u_1, v_1)$ of equation (4) tends to infinity, which causes a large "shock" in the extraction of this point, while the vanishing point $(u_0, v_0)$ is considered to be stable; we therefore adopt an optimization method to correct this error.
According to the calibration model, the calibration of a traffic camera is equivalent to estimating the parameter vector $\Theta = (f, \phi, h)$. Most traffic scenes contain redundant information; for example, multiple road dashed lines can be identified, and the camera installation height can be obtained from the road specification. Therefore, this paper adopts an open optimization method to dynamically calibrate the traffic camera: starting from the minimum calibration conditions, the existing redundant geometric information is used to further improve the calibration accuracy with the least squares method. The constraint function of least squares is:
$$\min_{\Theta} \left( \sum_{i} e(l_i) + \lambda_w\, e(w) + \lambda_h\, e(h) \right) \quad (15)$$
where $e(l_i)$, $e(w)$ and $e(h)$ represent the normalized errors between the measured and actual values of the dashed lines, road width and camera height respectively. $\lambda_w$ is the constraint coefficient of the road width error: when using the VVL calibration model, $\lambda_w = 1$, and when using the VWL model, since the road width is already used as a calibration parameter, $\lambda_w = 0$. If the camera height is fixed, set $\lambda_h = 1$, otherwise $\lambda_h = 0$. Since the constraint function is nonlinear, this paper adopts a trust-region Newton method to solve this nonlinear optimization problem.
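To make the structure of the least-squares objective more concrete, the sketch below evaluates a cost of this kind for a candidate parameter vector $(f, \phi, h)$. It is only an assumption-laden illustration: the errors are taken as squared relative errors, the road-width term is omitted for brevity, the road-plane relation $y = h(f\cos\phi - v\sin\phi)/(v\cos\phi + f\sin\phi)$ is used to measure dash lengths, and the names `road_y`, `calibration_cost`, `h_ref` and `lam_h` are hypothetical. The paper minimizes its objective with a trust-region Newton method; any nonlinear least-squares solver could be applied to a function like this one.

```python
import math


def road_y(v, f, phi, h):
    """Distance along the road for pixel ordinate v (flat road assumed)."""
    s, c = math.sin(phi), math.cos(phi)
    return h * (f * c - v * s) / (v * c + f * s)


def calibration_cost(params, dashes, l_true, h_ref=None, lam_h=0.0):
    """Sum of squared normalized errors for a candidate (f, phi, h).

    dashes: list of (v_f, v_b) pixel ordinates of detected dash ends.
    l_true: physical dash length; h_ref: optional known camera height.
    lam_h: weight of the height term (1 if the height is known, else 0).
    """
    f, phi, h = params
    cost = 0.0
    for v_f, v_b in dashes:
        l_est = road_y(v_b, f, phi, h) - road_y(v_f, f, phi, h)
        cost += ((l_est - l_true) / l_true) ** 2
    if h_ref is not None:
        cost += lam_h * ((h - h_ref) / h_ref) ** 2
    return cost
```

With several detected dashes the cost is over-determined, which is what allows the redundant markings to pull an initial estimate toward a more consistent set of parameters.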
5. Experiments
In this paper, we completed several groups of tests and selected three typical scenarios to validate the efficiency of our algorithm. According to the highway construction specifications, the length of a dotted line is $l = 6$ m and the single lane width is $w = 3.75$ m. The experiments follow the algorithm proposed in the sections above. First, background images are extracted by GMM, as shown in Figure 6(a); meanwhile, vehicle trajectories are tracked and vehicle transverse edges are picked up, these lines are transformed into the diamond space, where the vanishing points are obtained, and multiple dotted lines and road edges are found in the active region. Then, according to the scene conditions, a calibration method is selected automatically for the initial calibration. As shown in Figure 6(b), the green part represents the correct identification result, and the red part is the error between the identified length and the real length calculated with the calibration parameters. Finally, based on the obtained redundant information, such as multiple road dashed lines and multiple road width constraints, the nonlinear calibration algorithm iteratively optimizes the initial calibration parameters $(f, \phi, h)$ in three-dimensional space to obtain the optimal calibration parameters. As shown in Figure 6(c), the error of the identification information is reduced after the calibration parameters are optimized, and Figure 7 shows the optimization iteration process. Tables 1, 2 and 3 list the reduction of the optimization iteration error for the three scenarios.
Scene 1
Scene 2
Scene 3
a) Road scene
b) Initial calibration result
c) Optimize calibration result
Figure 6. Optimize iteration error reduction schematic diagram
Figure 7. Error iteration of three scenes
It can be seen that the overall error declines markedly after optimization (for example, from 2324.5 to 1164.4 in Scene 1), that is, the calibration accuracy is greatly improved. As discussed above, calibration accuracy is closely related to the "shock" of the vanishing point, which in turn depends on whether the pan angle is too small. However, because this paper uses redundant markings to optimize the calibration parameters, the overall calibration error declines sharply. Overall, the experiments show that the algorithm has a smaller overall error and high calibration accuracy under different pan angles.

Table 1. Optimization iteration error values for Scene 1
Iteration      f         φ (rad)   h (mm)   errors
1 (initial)    719.15    0.0403    14653    2324.5
2              706.17    0.0382    14532    2062.2
…
8 (optimal)    626.48    0.0239    13804    1164.4

Table 2. Optimization iteration error values for Scene 2
Iteration      f         φ (rad)   h (mm)   errors
1 (initial)    3947.67   0.0695    12922    1879.2
2              3904.33   0.0691    12815    1720.2
…
8 (optimal)    3639.65   0.0658    12168    1118.8

Table 3. Optimization iteration error values for Scene 3
Iteration      f         φ (rad)   h (mm)   errors
1 (initial)    3888.04   0.2589    12744    995.3
2              3827.38   0.2612    12652    889.9
…
8 (optimal)    3682.73   0.2708    12184    551.3
6. Conclusion and future work
Experimental verification shows that the self-calibration algorithm for PTZ cameras in road scenes proposed in this paper achieves good calibration results in various scenarios. The main contributions of this algorithm are as follows: 1) The appropriate calibration model is selected automatically according to the traffic scene, and the vanishing points are obtained in diamond space, which makes the calibration better targeted and avoids calibration failure caused by ill-conditioned vanishing points. 2) The redundant markings in the scene are used to optimize the initial calibration parameters in three-dimensional space with nonlinear optimization, which reduces the overall calibration error and improves the calibration accuracy. The vanishing points can be obtained from a large number of vehicle trajectories and body edges, whereas the dotted lines and road width depend on the road scene. In future work, we could therefore try to obtain geometrical markings from multiple sources, such as vehicle length and width, zebra crossings and road fences. A variety of marking sources will extend the range of traffic scenes that can be calibrated and improve the calibration accuracy.
Acknowledgments This work is supported by the Joint fund of the ministry of education of China (No. 6141A02022610), Natural Science Foundation of Shaanxi Province (2018ZDXM-GY047), Central University Fund of China (300102249103) and Natural Science Foundation of Shaanxi Province (No. 2019SF-258).
References [1] Xu Y, Yu G, Wang Y, et al. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images[J]. Sensors, 16(8), 1325-1336 (2016). [2] He H, Shao Z, Tan J.: Recognition of Car Makes and Models From a Single Traffic-Camera Image[J]. IEEE Transactions on Intelligent Transportation Systems, 16(6), 3182-3192 (2015). [3] Sochor J, Juranek R, Spanhel J, et al.: Comprehensive Data Set for Automatic Single Camera Visual Speed Measurement[J]. IEEE Transactions on Intelligent Transportation Systems, 1-11(2018). [4] Dubska M, Herout A, Juranek R, et al.: Fully Automatic Roadside Camera Calibration for Traffic Surveillance[J]. IEEE Transactions on Intelligent Transportation Systems, 16(3), 1162-1171(2015). [5] Song K T, Tai J C.: Dynamic Calibration of Pan–Tilt–Zoom Cameras for Traffic Monitoring[J]. IEEE Transactions on Systems Man & Cybernetics, 36(5), 1091-1102(2006). [6] Dong R, Li B, Chen Q M. An Automatic Calibration Method for PTZ Camera in Expressway Monitoring System[C]// Wri World Congress on Computer Science & Information Engineering. 2009. [7] Álvarez, S, Llorca D F, Sotelo M Á.: Hierarchical camera auto-calibration for traffic surveillance systems.[J]. Expert Systems with Applications, 41(4), 1532-1542(2014). [8] He, Chen X.: New method for overcoming ill-conditioning in vanishing-point-based camera calibration[J]. Optical Engineering, 46(3), 037202-1-12(2007). [9] You X, Zheng Y.: An accurate and practical calibration method for roadside camera using two vanishing points[J]. Neurocomputing, 204, 222-230(2016). [10] Zheng Y, Peng S.: A Practical Roadside Camera Calibration Method Based on Least Squares Optimization[J]. IEEE Transactions on Intelligent Transportation Systems, 15(2), 831-843(2014). [11] Sochor J, Roman Juránek, Herout A.: Traffic Surveillance Camera Calibration by 3D Model Bounding Box Alignment for Accurate Vehicle Speed Measurement[J]. Computer Vision & Image Understanding, 161, 87-98(2017). 
[12] Kanhere N K, Birchfield S T.: A Taxonomy and Analysis of Camera Calibration Methods for Traffic Monitoring Applications[J]. IEEE Transactions on Intelligent Transportation Systems, 11(2), 441-452(2010). [13] Song Hongjun, Chen Yangzhou, Gao Yuanyuan.: A Dynamic Camera Calibration Algorithm Based on Homogenous Fog in Unstructured Road[J]. Journal of Computer-Aided Design & Computer Graphics, 25(7), 1060-1073(2013). [14] Bhardwaj R, Tummala G K, Ramalingam G, et al. Autocalib: automatic traffic camera calibration at scale[C]// the 4th ACM International Conference. ACM(2017). [15] Grammatikopoulos L, Karras G, Petsa E.: An automatic approach for camera calibration from vanishing points[J]. ISPRS Journal of Photogrammetry & Remote Sensing, 62(1), 64-76(2007). [16] He B.: Camera calibration with lens distortion and from vanishing points[J]. Optical Engineering, 48(1), 013603-013603-11(2009). [17] M. Dubská and A. Herout.: Real projective plane mapping for detection of orthogonal vanishing points, in Proc. BMVC, pp. 90.1-90.10(2013). [18] Bouguet J Y.: Pyramidal implementation of the affine lucas kanade feature tracker description of the algorithm[J]. Intel Corporation, 5(4), 1-10 (2001). [19] Gioi R G V, Jakubowicz J, Morel J M, et al. LSD: A Fast Line Segment Detector with a False Detection Control[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 32(4), 722-732(2010). [20] Yan Hongping, Wang Lingfen, and Pan Chunhon.: Automatic Self-Calibration of Expressway Surveillance Camera Under Dynamic Conditions[J]. Journal of Computer-Aided Design & Computer Graphics, 25(7), 1036-1044(2013).
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200060
Morphological Segmentation of Plantar Pressure Images by Using Mean Shift Local De-Dimensionality
Dan WANG a,1 and Zairan LI b
a Tianjin Key Laboratory of Process Measurement and Control, School of Electrical Engineering and Automation, Tianjin University, 300072, P.R. China
b Wenzhou Vocational & Technical College, Wenzhou, 325035, P.R. China
Abstract. Existing analyses of plantar pressure imaging do not perform well in preprocessing, dimensionality reduction and feature calculation, so that research on foot comfort by statistical methods suffers from linearization and poor robustness. Using intelligent analysis technology to extract the plantar functional areas, and thereby provide a streamlined and content-rich data set for the study of plantar pressure comfort, is effective and feasible. Unlike existing local segmentation techniques for plantar pressure images, the plantar pressure image mean shift segmentation model segments the image from a global perspective to obtain more accurate segmentation results. In a comparison based on pixel accuracy, mean pixel accuracy, mean intersection over union, frequency-weighted intersection over union, segmentation accuracy, over-segmentation rate, under-segmentation rate and Dice coefficient, the proposed mean shift local de-dimensionality morphological segmentation is more effective.
Keywords. Morphological segmentation; Mean shift local de-dimensionality; Plantar pressure imaging.
1. Introduction
Mean shift is a versatile, non-parametric iterative algorithm based on density gradient ascent that can be used to find modes and clusters. Mean shift treats the feature space as an empirical probability density function. If the input is a set of points, mean shift treats them as samples from the underlying probability density function; dense regions (or clusters) in the feature space correspond to modes (local maxima) of the probability density function. In addition, mean shift can be used to identify the cluster associated with a given mode [1]. For each data point, mean shift associates it with a nearby peak of the probability density function of the data set: it defines a window around the data point, calculates the mean of the data points in the window, moves the center of the window to that mean, and repeats until convergence. After each iteration, the window has moved to a denser area of the data set [2]. In other words, the window used to compute the mean is fixed
Corresponding Authors. [email protected].
D. Wang and Z. Li / Morphological Segmentation of Plantar Pressure Images
around each data point, moved to the mean value, and the procedure is repeated until convergence, at which point the mean shift result is determined. However, simple morphological methods may cause over-segmentation, and problems of structural continuity, difficulty in edge learning and computational complexity are not yet well solved [3-5]. Therefore, it is necessary to introduce local dimensionality reduction and improve the preprocessing of plantar pressure images. The research focuses on Locality Preserving Projection (LPP), which improves the efficiency of segmentation by combining locality preservation with morphological segmentation. LPP has been widely used in feature extraction for high-dimensional data sets. Its purpose is to preserve, in the weighted and transformed low-dimensional vectors, the differences between images in terms of local attributes in the original high-dimensional space [6]. In addition, morphological segmentation can overcome the dimensional weakness of semantic feature extraction from the data set. Since useful information is lost in every feature extraction process, some useful information, such as neighborhood connections and pixel correlation, must be preserved in the data set. Among these methods, especially in image preprocessing, introducing locality preserving projection for image feature extraction makes use of adjacent points in the high-dimensional space and retains their attributes in the low-dimensional space [7]. Plantar pressure image segmentation is the basic premise of image recognition and image understanding. The quality of the segmentation directly affects the further processing of the image, so the role of image segmentation is crucial. Over the past two decades of research, various problems in image segmentation and edge detection have gradually been discussed and solved by researchers [8-9].
Research and experiments on the relevant theoretical methods have also developed greatly. Many practical methods have been proposed for different image structures and imaging sources, and they have achieved certain results in different fields. However, there is still much room for improvement in the search for an image segmentation and detection algorithm that is universally applicable to various complex situations and has high accuracy. In particular, developing a set of image segmentation models suitable for plantar pressure imaging is an important research topic for plantar pressure image analysis [10-11]. So far, the analysis of plantar pressure imaging has not performed well in preprocessing, dimensionality reduction and feature calculation; research on the comfort of the sole by statistical methods alone suffers from linearization and poor robustness. Using intelligent analysis technology to extract the plantar functional areas, thereby providing a streamlined and informative data set for the study of plantar pressure comfort, is effective and feasible [12-13].
2. Methods
Mean Shift Local De-dimensionality (MSLD) is a feature space analysis method; in image segmentation, the problem can be mapped to the color feature space. In essence, image segmentation performs Image Labeling for Classification (ILC) for each pixel. The key to ILC lies in clustering, and the key problems of clustering are the definition of distance and the calculation of the Probability Density Function (PDF). Hill climbing or the gradient method can be used to find the extrema of the PDF, but convergence cannot be guaranteed; MSLD, by contrast, can converge thanks to its adaptive step size
advantage. That is, a smooth PDF must be sought by Kernel Density Estimation (KDE) in the non-parametric setting.
2.1. Sliding operators of plantar pressure images
Let the coordinates of pixel $P$ be $(x, y)$ and its color be $(r, g, b)$; a $1 \times 5$ vector $[x, y, r, g, b]$ is then constructed. In effect, the sliding operator assigns to each pixel the color value of the point to which it converges. Here the hill-climbing method is applied: let point $x_i$ climb through the pixels $\{y_{i0}, y_{i1}, \ldots, y_{ik}, \ldots, y_{ic}\}$, where $y_{i0}$ is the start pixel, $y_{ic}$ is the end pixel, and $c$ indexes the convergence pixel.
Step 1: in x-y space, screen the pixels that are close to $y_{ik}^s$ (the spatial coordinates of $y_{ik}$), then go to the next step; $y_{ik}^s$ satisfies
$$y_{ik}^s \in \{(x, y) \mid x \in [p_x - h_s,\ p_x + h_s],\ y \in [p_y - h_s,\ p_y + h_s]\} \quad (1)$$
Step 2: calculate the center of gravity as
$$y_{i,k+1}^s = \sum_{n=1}^{N} x_n^s\, g\!\left(\left\| \frac{x_n^r - y_{ik}^r}{h_r} \right\|^2\right) \Bigg/ \sum_{n=1}^{N} g\!\left(\left\| \frac{x_n^r - y_{ik}^r}{h_r} \right\|^2\right) \quad (2)$$
where $N$ is the total number of points, $h_r$ is the scale of the sliding kernel, and $g$ is the kernel function
$$g(x) = \begin{cases} 1, & x \le 1 \\ 0, & \text{otherwise} \end{cases} \quad (3)$$
Here $g\!\left(\left\| (x_n^r - y_{ik}^r)/h_r \right\|^2\right)$ describes a sphere with center $y_{ik}^r$ and radius $h_r$; if the points $\{n_1, n_2, \ldots, N_k\}$ are inside the sphere, the center of gravity simplifies to
$$y_{i,k+1}^s = \frac{1}{N_k} \sum_{n=1}^{N_k} x_n^s \quad (4)$$
$h_s$ and $h_r$ are the bandwidths of the coordinate space and the color (HSV) space, respectively.
Step 3: if the climb reaches a sample point, stop. If the change in color or position after a move is small, end the search and skip to Step 4. Otherwise, go back to Step 1 and start again from $y_{i,k+1}$. The stop condition can be set to a coordinate distance or a color change that is less than a given threshold.
Step 4: assign the color $y_{ic}^r$ of $y_{ic}$ to the start point, i.e. $x_i^r \leftarrow y_{ic}^r$.
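Steps 1 to 4 can be sketched on a one-dimensional gray-level signal. This is a simplified illustration rather than the authors' implementation: it assumes a flat kernel (all samples inside the window have weight 1), a spatial bandwidth `hs` measured in sample positions, a range bandwidth `hr` measured in gray levels, and a combined stopping threshold `eps`; all names are illustrative.

```python
def mean_shift_filter_1d(values, hs, hr, max_iter=50, eps=1e-3):
    """Mean shift filtering of a 1-D signal with a flat joint kernel.

    For every sample, the window center (position, value) is moved to
    the mean of the samples lying within hs in position and hr in value,
    until the shift is below eps; the converged value is then assigned
    to the starting sample.
    """
    n = len(values)
    out = []
    for i in range(n):
        pos, val = float(i), float(values[i])
        for _ in range(max_iter):
            # Step 1: samples close to the current window center.
            members = [(j, values[j]) for j in range(n)
                       if abs(j - pos) <= hs and abs(values[j] - val) <= hr]
            if not members:
                break
            # Step 2: center of gravity of the window (flat kernel).
            new_pos = sum(j for j, _ in members) / len(members)
            new_val = sum(v for _, v in members) / len(members)
            shift = abs(new_pos - pos) + abs(new_val - val)
            pos, val = new_pos, new_val
            # Step 3: stop when the move is small.
            if shift < eps:
                break
        # Step 4: assign the converged value to the starting sample.
        out.append(val)
    return out
```

On a signal with two plateaus, such as `[10, 12, 11, 50, 52, 51]` with `hs=2` and `hr=5`, each sample converges to the mean of its own plateau, so the small variations are smoothed while the edge between the plateaus is preserved.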
2.2. Similar area merge
The filtered image from the previous step can be merged using a flood fill algorithm; starting from a point, the algorithm fills nearby pixels with a new color until all pixels in the enclosed region are filled. The four-neighbor pixel filling method, the eight-neighbor pixel filling method and the scan-line-based pixel filling method are common in flood filling. According to the implementation, they can be divided into recursive and non-recursive (stack-based) variants. It is assumed here that the filtered image is represented by $z$; whether two filtered pixels $z_i$ and $z_j$ are merged can be judged by color similarity and spatial position similarity. If the pixels $x_i$ and $x_j$ satisfy $\| z_i^s - z_j^s \| < h_s$ or $\| z_i^r - z_j^r \| < h_r$, the two pixels can be merged. This step mainly achieves modular clustering. Some smaller areas whose color differs greatly from the surrounding area will also form regions of their own; these small areas need to be further merged using density estimation and density gradient estimation.
3. Results and discussion
Mean shift de-dimensionality is applied with the window parameters set to $h_r = 30$ and $h_s = 30$. To briefly explain the mean shift (MS) process, 4 frames of left-foot plantar pressure images and 4 frames of right-foot plantar pressure images are taken for calculation. The original images are shown in Fig. 1, and the mean shift results are shown in Fig. 2.
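The similar-area merge of Section 2.2 can be sketched with a stack-based (non-recursive) four-neighbor flood fill. The sketch makes simplifying assumptions that are not in the paper: the filtered image is a 2-D list of gray values, spatial similarity is reduced to four-neighbor adjacency, and two adjacent pixels are merged when their values differ by at most `hr`. The function name `label_regions` is illustrative.

```python
def label_regions(image, hr):
    """Label connected regions of a filtered image with a stack-based flood fill.

    image: 2-D list of filtered gray values.
    hr: maximum value difference for two adjacent pixels to be merged.
    Returns a 2-D list of integer region labels starting at 0.
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] != -1:
                continue                      # pixel already belongs to a region
            labels[r0][c0] = next_label
            stack = [(r0, c0)]
            while stack:
                r, c = stack.pop()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and labels[rr][cc] == -1
                            and abs(image[rr][cc] - image[r][c]) <= hr):
                        labels[rr][cc] = next_label
                        stack.append((rr, cc))
            next_label += 1
    return labels
```

Using an explicit stack instead of recursion avoids deep call chains on large regions; very small regions can afterwards be merged into a neighbor in a separate pass.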
Figure 1 Original plantar pressure images (1 to 8 are the 4 middle states of the left and right feet respectively)
Figure 2 Mean shift results for the eight state pressure images of the left and right feet
The morphological segmentation process includes erosion and dilation, opening and closing operations, skeleton extraction and thinning, etc. To simplify the calculation results, the target component is located using the ICA-LPP transformation. To simplify the presentation, the first image is taken as an example; the remaining images are processed in the same way. Using the neighborhood pixel geometric distance function based on the LPP transform, Fig. 3 is calculated using the geometric mean.
(a) original image after morphological processing (b) watershed (c) segmentation result (d) local minimum Figure 3 Morphological segmentation results
Table 1 compares the methods in terms of pixel accuracy (PA), mean pixel accuracy (MPA), mean intersection over union (MIoU), frequency-weighted intersection over union (FWIoU), segmentation accuracy (GT-SA), over-segmentation rate (OR), under-segmentation rate (UR), and Dice coefficient (DICE).
Table 1 Evaluation index of segmentation effect (standardized to [0, 1])

| Index | Description | Color texture filter | K-means | Normal watershed | Gradient watershed | Mean shift de-dimensionality |
|---|---|---|---|---|---|---|
| PA | Pixel accuracy | 0.781 | 0.812 | 0.692 | 0.610 | 1 |
| MPA | Mean pixel accuracy | 0.682 | 0.825 | 0.841 | 0.692 | 1 |
| MIoU | Mean intersection over union | 1.120* | 0.921 | 0.981 | 0.852 | 1 |
| FWIoU | Frequency-weighted intersection over union | 0.854 | 0.765 | 0.92 | 0.822 | 1 |
| GT-SA | Segmentation accuracy | 0.841 | 0.625 | 0.812 | 0.698 | 1 |
| OR | Over-segmentation rate | 1.158 | 0.899(*) | 1.42 | 1.587 | 1 |
| UR | Under-segmentation rate | 0.852 | 1.210(*) | 0.921 | 0.692 | 1 |
| DICE | Dice coefficient | 0.752 | 0.855 | 0.682 | 0.821 | 1 |
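Several of the indices in Table 1 are standard segmentation metrics. The sketch below shows how PA, MIoU and the Dice coefficient are commonly computed from a predicted and a ground-truth label mask. It uses the usual textbook definitions on flattened masks, which is an assumption, since the paper does not give its formulas. It also returns raw values in [0, 1], not values rescaled against a reference method as in the table.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground truth."""
    correct = sum(p == t for p, t in zip(pred, truth))
    return correct / len(truth)


def mean_iou(pred, truth, classes):
    """Average intersection-over-union over the classes that occur."""
    ious = []
    for k in classes:
        inter = sum(p == k and t == k for p, t in zip(pred, truth))
        union = sum(p == k or t == k for p, t in zip(pred, truth))
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)


def dice(pred, truth, k=1):
    """Dice coefficient of class k: 2|A and B| / (|A| + |B|)."""
    inter = sum(p == k and t == k for p, t in zip(pred, truth))
    total = sum(p == k for p in pred) + sum(t == k for t in truth)
    return 2 * inter / total if total else 1.0


# Toy flattened masks: 1 = foot region, 0 = background.
pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
pa = pixel_accuracy(pred, truth)        # 3 of 4 pixels agree
miou = mean_iou(pred, truth, classes=(0, 1))
d = dice(pred, truth)
```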
In Table 1, (*) marks an index on which the proposed method is inferior to the comparison method; the value for the proposed method (mean shift de-dimensionality) is the standard reference value 1. As the table shows, across the eight indicators and five methods, the proposed method is inferior to a comparison method on only three indicators. Overall, the efficiency of the proposed segmentation method is higher than that of the other segmentation methods.
4. Conclusion
The mean shift data reduction method and the mode-preserving morphological operation proposed in this paper effectively reduce the computational complexity of the original plantar pressure image dataset. This dimensionality reduction is in effect a basic feature extraction transformation. Experimental results, including the histogram of the plantar pressure image and the weighted and corrected covariance power spectral density, show that the segmented image is obtained with higher segmentation efficiency. The DICE, MIoU and other indicators also confirm the effectiveness of the method for plantar pressure image segmentation.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200061
IMA System Health Assessment Method Based on Incremental Random Forest
Yujie Li, Dong Song¹, Zhiyue Liu and Yuju Cao
College of Aviation, Northwestern Polytechnical University, China
Abstract. An Integrated Modular Avionics (IMA) system hosts various avionics functions on standardized hardware, but this integration also brings problems such as the spread of resource defects and the cross-linking of functional errors. To ensure the health of the aircraft, it is essential to carry out a comprehensive health assessment of the IMA system. This paper proposes an IMA system health assessment method based on random forest. First, the IMA system health assessment framework is built, and the mapping between the factors that affect the health of the IMA system and its health status is transformed into a classification problem in supervised learning. Then a health assessment method based on random forest is given; through incremental learning, the new data generated during the operation of the IMA system improve the evaluation effect in real time. Finally, a comprehensive evaluation of the health status of the IMA system is carried out on the system health dataset. The results show that the method can perform a comprehensive health assessment of the IMA system with high accuracy.
Keywords. Health assessment, random forest, incremental learning
1. Introduction
With the development of electronics and software technology, modern avionics systems have developed rapidly. As avionics evolved from the earliest discrete systems to the joint (federated) avionics systems of the 1960s, it was recognized that the increased complexity of avionics systems would result in high development, maintenance and management costs. Therefore, after the development from integrated avionics systems to advanced integrated avionics systems from the 1980s to the 1990s, the IMA system, with its resource savings and short development cycles, gradually took shape [1]. At present, international research on the IMA system regards ensuring system safety, reliability, and health management as essential research topics. However, domestic research on health assessment technology started late, and most of it is aimed at public facilities such as bridges and transformers. There are few studies on health assessment and fault prediction technology for conventional electronic systems.
1 Corresponding author, College of Aviation, Northwestern Polytechnical University, Youyi West Road 127, Xi’an, China; Email: [email protected].
Y. Li et al. / IMA System Health Assessment Method Based on Incremental Random Forest
Health assessment research on the IMA system is still at the initial discussion stage, and many problems in health assessment techniques remain to be carefully studied and resolved. This paper proposes a method based on random forest and incremental learning to evaluate the health status of the IMA system. By constructing the IMA system health assessment framework, the mapping between the health factors affecting the IMA system and its health status is transformed into a classification problem in supervised learning. With the incremental random forest classification method, the new data generated during the operation of the IMA system improve the evaluation effect in real time and realize the health assessment of the IMA system. The health status of the IMA system is then comprehensively evaluated on the system health dataset to verify the effectiveness of the method.
2. System Health Assessment
2.1. System Comprehensive Health Assessment Index System
The IMA system health assessment is divided into three layers: system layer, functional layer, and resource layer. This paper comprehensively evaluates the health of the IMA system from three influencing factors: resource health, functional health, and abnormal system resources, as shown in Figure 1.
[Figure 1: the system health assessment model takes the Resources Health Index (RHI), the Functions Health Index (FHI) and the Resources Abnormal Index (RAI) as inputs and outputs the System Health Index (SHI).]
Figure 1. Basic framework of system health assessment
The System Health Index (SHI) is defined by Eq. (1).
$$\mathrm{SHI} = f(\mathrm{RHI}, \mathrm{FHI}, \mathrm{RAI}) \tag{1}$$
The value of SHI is between 0 and 10, where 0 means the worst degree of health and 10 means the highest. The relationship between the index and the function execution capability of the IMA system is shown in Table 1; a reduced health status indicates that the functional execution capability of the IMA system is declining.
The IMA Resources Health Index (RHI) indicates the overall health status of system resources; the IMA Functions Health Index (FHI) indicates the overall health status of system functions; the IMA Resources Abnormal Index (RAI) shows the overall status of system resource anomalies. RHI, FHI, and RAI have values between 0 and 10, with 0 being the worst health and 10 being the most healthy.

Table 1. System health and full function execution capability

| Functional execution capability | No decrease | Slightly reduced | Basic skills | Danger | Malfunction |
|---|---|---|---|---|---|
| System health | 9-10 | 6-8 | 4-5 | 2-3 | 0-1 |
| IMA system impact | No effect | Functional execution capability is slightly reduced | Keep basic functions running | Greatly reduced performance | Loss of function execution |
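The banding in Table 1 amounts to a simple lookup from a health index to a capability level. A minimal sketch follows, assuming integer-valued indices as in the table; the function name `execution_capability` is hypothetical, and sending a fractional value such as 8.5 to the lower band is an assumption not stated in the paper.

```python
def execution_capability(shi):
    """Map a System Health Index in [0, 10] to its Table 1 level."""
    if not 0 <= shi <= 10:
        raise ValueError("SHI must lie between 0 and 10")
    if shi >= 9:
        return "No decrease"
    if shi >= 6:
        return "Slightly reduced"
    if shi >= 4:
        return "Basic skills"
    if shi >= 2:
        return "Danger"
    return "Malfunction"


levels = [execution_capability(v) for v in (10, 7, 4, 3, 0)]
```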
3. IMA System Health Assessment Method
This paper studies the application of supervised learning algorithms to the system health assessment dataset. The mapping from RHI, FHI and RAI to SHI is transformed into a classification problem. The system health dataset and the system health information continuously obtained during operation are taken as training samples, and the supervised learning algorithm is used to classify the evaluation indicators.
3.1. Random Forest
The random forest used in this paper belongs to the Bagging family of ensemble learning algorithms [2]. A random forest is built from decision trees. A decision tree is a simple classifier consisting of a root node, intermediate nodes, and leaf nodes. Each node represents a division on a feature, each branch of a node corresponds to a possible feature value, the root node takes the training sample set as input, and the leaf nodes output the final classification result. Each decision tree splits from the root node according to certain rules, and leaf nodes are formed once the requirements for continued splitting are no longer satisfied. Each leaf node represents a split result. When an unevaluated sample is input into the decision tree, a path is selected at each node according to the feature value until a leaf node is reached, which gives the classification result of the decision tree model for that sample. The random forest uses the classification and regression tree (CART) as the base classifier. As a classification model, a single CART tree not only has complicated growth and pruning rules, but also easily overfits the training samples and generalizes poorly. CART is therefore introduced into the ensemble learning model to overcome these shortcomings: multiple CART trees are combined, and the classification result is obtained by voting over all trees, forming a random forest.
The algorithm is described in Figure 2. Each base classifier classifies the new test sample; the classification results of all CART trees are then collected, and the class with the highest number of votes is taken as the final classification result.
Figure 2. Description of the random forest algorithm
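The voting step of the random forest can be sketched in a few lines. In this illustration each base classifier is represented by a plain Python callable; the three threshold rules below are hypothetical stand-ins for trained CART trees, not trees learned from the IMA dataset.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the class that receives the most votes from the base classifiers."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-in "trees": each looks at one health index only.
trees = [
    lambda s: "healthy" if s["RHI"] >= 6 else "degraded",
    lambda s: "healthy" if s["FHI"] >= 6 else "degraded",
    lambda s: "healthy" if s["RAI"] >= 6 else "degraded",
]
sample = {"RHI": 9, "FHI": 8, "RAI": 2}
prediction = forest_predict(trees, sample)  # two of three trees vote "healthy"
```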
3.2. Incremental Learning
Incremental learning [3] refers to a learning algorithm that can continuously learn classification knowledge from new samples. While learning new knowledge, it retains the classification ability acquired from the original knowledge, gradually strengthens its classification ability with each new sample, and corrects the initial classification knowledge so that the whole model adapts to the new data. Not having to retrain on all the historical data is the biggest advantage of incremental learning, as it reduces time and space overhead. Incremental learning has long been studied in areas such as computer vision [4] and handwriting recognition [5]. With an incremental learning method, a large amount of health state characteristic data is not required in the initial stage of IMA system operation, and the system continuously enhances the evaluation model with new samples during operation.
3.3. Incremental Random Forest Classification Assessment Method
To address the small number of initial samples in the actual operation of IMA, incremental learning is realized for a random forest that uses the decision tree as the base classifier [6]. The basic idea is as follows: when only a few samples are available, the training samples are retained, and each trained sample is stored in the leaf node that gives its final classification in each subtree. When subsequent samples arrive, the new samples are combined with this historical information. Since the random forest allows each subtree to grow fully without pruning, the data in a leaf node after incorporating the new samples are used as the basis for reconstruction, which realizes incremental learning and completes the random forest incremental learning model under small samples. The incremental method is shown in Figure 3.
[Figure 3: the new samples are fed into the original model (Subtree 1 to Subtree M, each holding Sample subset 1 to Sample subset M); Sample subset M is merged with the new sample, and the subtree is rebuilt as Subtree M′.]
Figure 3. Random forest increment method
Since the random forest randomly selects both features and samples during construction, the subtree reconstruction process would be limited by these random constraints. This paper therefore uses an extension of the random forest algorithm, extremely randomized trees (ERT) [7], instead of random forest as the incremental learning model. Like the random forest, ERT adopts a top-down structure to combine unpruned decision trees. The difference is that each decision tree used as a base classifier divides nodes completely at random and uses all samples in the training process instead of a random selection of samples. The main idea is as follows. The tree to be split starts from the root node. If the stop division condition is satisfied, the node is no longer split and becomes a leaf node. Otherwise, for the subset $S$ at the node, $K$ features $\{a_1, a_2, \ldots, a_K\}$ are randomly selected from all features of $S$, giving $K$ candidate divisions $\{s_1, s_2, \ldots, s_K\}$. For a subset $S$ and feature $a$, $a_{\min}^S$ and $a_{\max}^S$ denote the minimum and maximum values of feature $a$ in $S$, and a division point $a_C$ is randomly selected from $[a_{\min}^S, a_{\max}^S]$. According to Eq. (2), the optimal division is returned, a node partition $[a < a_C]$ and two tree nodes are generated, and the above steps are repeated.

$$\mathrm{Score}(s_*, S) = \max_{i = 1, 2, \ldots, K} \mathrm{Score}(s_i, S) \tag{2}$$

Each subtree is divided as above until it is fully grown, and all subtrees together form the extremely randomized trees model. Its two most critical parameters are the number of features $K$ to be selected for each division and the minimum number of samples $n_{\min}$ required to divide a node, which determines whether the node continues to be divided; the number of subtrees plays the same role as in random forests. The more subtrees there are, the better the model performs, but the higher the corresponding cost.
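The node-splitting rule of Eq. (2) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not define Score, so the decrease in Gini impurity is used here as an assumed scoring function, and the names `gini`, `score` and `pick_random_split` are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def score(samples, labels, feature, cut):
    """Assumed Score: impurity decrease of the split [feature < cut]."""
    left = [y for x, y in zip(samples, labels) if x[feature] < cut]
    right = [y for x, y in zip(samples, labels) if x[feature] >= cut]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def pick_random_split(samples, labels, k, rng):
    """Draw k random features, one random cut point per feature,
    and return the best (score, feature, cut) candidate, as in Eq. (2)."""
    candidates = []
    for feature in rng.sample(range(len(samples[0])), k):
        values = [x[feature] for x in samples]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # a constant feature cannot separate the samples
        cut = rng.uniform(lo, hi)
        candidates.append((score(samples, labels, feature, cut), feature, cut))
    return max(candidates) if candidates else None


# Toy samples with three features (standing in for RHI, FHI, RAI).
samples = [(1.0, 5.0, 0.0), (2.0, 5.0, 0.0), (8.0, 5.0, 1.0), (9.0, 5.0, 1.0)]
labels = ["degraded", "degraded", "healthy", "healthy"]
best = pick_random_split(samples, labels, k=2, rng=random.Random(0))
```

Because the cut points are drawn at random rather than optimized, a single tree is weak; the ensemble relies on averaging many such trees.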
The specific implementation of the incremental process is as follows:
(1) Use the initial sample set to construct the ERT, and let all base classifiers grow fully according to the above rules.
(2) In the construction process of the ERT base classifier, when a sample $x$ enters the decision tree and falls into a leaf node $L$ through decision tree classification, add the sample to the sample list of that node. Considering that IMA may periodically add multiple samples during operation, this paper improves the method by updating $n$ samples at a time and storing the $n$ samples on the leaf nodes at the same time.
(3) Calculate the splitting threshold of the samples stored on the leaf node $L$, using the Gini coefficient in Eq. (3).

$$\mathrm{Gini}(p) = \sum_{i=1}^{N} p_i (1 - p_i) = 1 - \sum_{i=1}^{N} p_i^2 \tag{3}$$
In the Gini coefficient, there are a total of $N$ classes, and the probability that a sample belongs to class $i$ is $p_i$.
(4) If the split threshold is reached, $L$ becomes a tree node and splits into two new leaf nodes. The splitting method of the node is the same as that of the ERT subtree.
Regardless of the types of samples stored on the leaf nodes, the Gini coefficient satisfies $0 \le \sum_{i=1}^{N} p_i^2 \le 1$. The Gini coefficient takes the ratio between all categories into account. The following theorem is proposed for this problem [8].
Theorem 1 Suppose that there are $K$ types of samples in the leaf node $L$, and the category with the largest number of samples is $L_{\max}$. The proportion of $L_{\max}$ in the sample is $p_{\max}$, and $p_i$ ($i \ne \max$, $i = 1, 2, \ldots, K$) is the proportion of each remaining class. If the sample size of remaining class $i$ relative to $L_{\max}$ is $\delta_i$, then the proportion of all the other classes to $L_{\max}$ is $\delta = \sum_{i=1, i \ne \max}^{K} \delta_i$. The splitting threshold $\Delta$ can then be calculated through Eq. (4).

$$\Delta = 1 - \frac{1}{(1 + \delta)^2} \tag{4}$$
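Steps (2) to (4) can be sketched as a leaf that stores labels, recomputes the Gini coefficient of Eq. (3) after each batch, and compares it with the threshold of Eq. (4). This is a sketch under assumptions: the Greek symbols did not survive in the source text, so the ratio is named `delta` here, the threshold form 1 - 1/(1 + delta)^2 is the one consistent with the values 0.2 and 0.306 reported in Section 4, and the `Leaf` class and "greater than or equal" comparison are hypothetical choices.

```python
from collections import Counter


def gini(labels):
    """Eq. (3): 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_threshold(delta):
    """Eq. (4) as reconstructed: threshold for a given class ratio delta."""
    return 1.0 - 1.0 / (1.0 + delta) ** 2


class Leaf:
    """Leaf node that keeps the labels of the samples routed to it."""

    def __init__(self):
        self.labels = []

    def add_batch(self, new_labels, delta):
        """Store n new samples at once; report whether the leaf should split."""
        self.labels.extend(new_labels)
        return gini(self.labels) >= split_threshold(delta)


leaf = Leaf()
pure = leaf.add_batch(["healthy"] * 10, delta=0.2)   # one class only
mixed = leaf.add_batch(["danger"] * 5, delta=0.2)    # second class arrives
```

A pure leaf has Gini 0 and is left alone; once a second class makes up a third of the stored samples, the Gini coefficient (4/9) exceeds the threshold and the leaf would be rebuilt as a tree node.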
The above steps realize the incremental process of the random forest. First, the ERT algorithm, which avoids the influence of randomness on the decision tree, is used as the base model. When training on the samples is completed, the samples are not discarded but stored in the sample lists of the leaf nodes. Then the Gini coefficient of the stored samples is calculated to judge whether the node reaches the split threshold, and the node is split when it does.
4. Instance Verification
The incremental random forest algorithm was applied to the IMA system health assessment, and the basic model was built from the primary training samples of the IMA system health assessment dataset. After each increment of n samples, 10 experts each evaluated 50 samples, forming a system health dataset of 500 samples to verify the accuracy of the model. The workflow is shown in Figure 4.
Figure 4. IMA system health assessment process
As shown in Figure 4, incremental learning of the random forest first needs to determine the splitting threshold Δ, which depends on the ratio δ of the other classes to the largest class in a leaf (Theorem 1). Different values of δ are tested; for each value, incremental training is run 50 times with 10 samples added each time, and the ERT is set to 100 subtrees. Since the IMA system health dataset has only three-dimensional features, the number of randomly selected features is set to 2, and the minimum number of samples for dividing a leaf node is set to 2 to enhance model classification. Under this condition, the relationship between classification accuracy and the value of δ is shown in Figure 5.
Figure 5. Relationship between classification accuracy and the value of δ
As can be seen from the figure, the value of δ is between 0 and 1. When δ is close to 1, the evaluation accuracy decreases rapidly. When δ is between 0 and 0.4, the accuracy stays at around 94%. However, the smaller the value of δ, the more often the decision tree must be reconstructed after new samples are included. This paper therefore takes δ = 0.2, which gives the splitting threshold Δ = 1 - 1/(1 + 0.2)² ≈ 0.306.
The method of this paper is used to evaluate the index parameters in various cases, and the results are compared with those of the constant weight model, as shown in Table 2. To verify the evaluation effect of the algorithm under various conditions, 8 samples were selected. In sample 1 and sample 2 the differences among the evaluation indicators are small; they are used to compare the two models when the indicators are similar. In sample 3 and sample 4 the evaluation indicators are poor overall; they are used to compare the two models when the indicators are similar and low. In samples 5-8 some of the indicators are poor; they are used to compare the two models when the IMA system operating characteristic indicators follow the "wooden bucket principle", in which the weakest indicator limits the overall health.

Table 2. Comparison of results of two health assessment methods (RHI, FHI and RAI are the indicators to be evaluated)

| Sample | RHI | FHI | RAI | Constant weight model weighted evaluation result | Incremental random forest assessment result |
|---|---|---|---|---|---|
| 1 | 5 | 6 | 8 | 6 | 6 |
| 2 | 4 | 4 | 5 | 4 | 4 |
| 3 | 1 | 0 | 3 | 1 | 0 |
| 4 | 0 | 0 | 2 | 1 | 0 |
| 5 | 10 | 5 | 1 | 6 | 4 |
| 6 | 9 | 8 | 2 | 7 | 5 |
| 7 | 0 | 1 | 6 | 2 | 1 |
| 8 | 7 | 0 | 0 | 3 | 1 |
The evaluation results in Table 2 show that when the differences among the indicators are small, the constant weight assessment model and the random forest assessment model give similar results, and both accurately reflect the comprehensive health state of the system. This is because constant weights are reasonable when the indicators to be evaluated are relatively similar. When one evaluation index is significantly worse than the others, the constant weight model cannot adequately account for the impact of the degraded index on the IMA system: as long as the other indicators are healthy, the overall weighted result still judges the system to be in an excellent working state. The random forest model, by contrast, can learn from new samples singly or in batches, so that the evaluation result conforms to the actual operation of the IMA system. The health assessment should also consider the accuracy of the evaluation under different states of functional decline. The 500 new samples evaluated by experts are used to increment the health assessment method based on incremental random forest. The overall evaluation results are shown in Table 3.
Table 3. Accuracy rate of each operation function reduction degree

| Operation function is reduced | No decrease | Slightly reduced | Basic skills | Danger | Malfunction | Total |
|---|---|---|---|---|---|---|
| Health index | 9-10 | 6-8 | 4-5 | 2-3 | 0-1 | |
| Accuracy | 100.00% | 96.24% | 91.48% | 99.13% | 100.00% | 97.01% |
Table 3 shows that the evaluation method of this paper obtains good evaluation results: the accuracy under each health state is above 90%, which demonstrates the effectiveness of the method.
5. Conclusion
This paper presents a health assessment method for IMA systems based on random forests. First, the system comprehensive health assessment index system was constructed, and the evaluation indices were built to detect the health status. An incremental learning method was then implemented in combination with random forests, which addresses the insufficient generalization ability of the basic random forest algorithm in the initial stage of IMA system operation. By using the classification knowledge in new samples, an IMA system health assessment method with high evaluation accuracy is realized. Finally, experiments on the IMA system health dataset demonstrate the effectiveness of real-time assessment of the health status of the IMA system.
References
[1] Watkins C B, Walter R. Transitioning from federated avionics architectures to Integrated Modular Avionics[C]// IEEE/AIAA Digital Avionics Systems Conference. IEEE, 2007: 2.A.1-1-2.A.1-10.
[2] Dietterich T G. Ensemble Methods in Machine Learning[J]. 2000, 1857(1): 1-15.
[3] Liu Chengxuan, Dong Zhenjiang, Xie Siyuan, Pei Ling. Human Motion Recognition Based on Incremental Learning and Smartphone Sensors[J]. ZTE Communications, 2016, 14(S1): 59-66.
[4] ARINC Specification 653, Avionics Application Software Standard Interface[S].
[5] Hilbrich R, Dieudonné L. Deploying safety-critical applications on complex avionics hardware architectures[J]. Journal of Software Engineering and Applications, 2013(6): 229-235.
[6] Gaska T, Watkin C, Chen Y. Integrated Modular Avionics - Past, present, and future[J]. IEEE Aerospace & Electronic Systems Magazine, 2015, 30(9): 12-23.
[7] STANAG 4626, Final Draft of Proposed Standards for Architecture[S].
[8] Luo B, Zhou ZH, Chen ZQ, Chen SF. Induce: An incremental decision tree algorithm. Journal of Computer Research and Development, 1999, 36(5): 518-522 (in Chinese with English abstract).
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved. doi:10.3233/FAIA200062
Research on Automatic Extraction Method of Lane Changing Behavior Based on the Naturalistic Driving Data
Xiong Yingzhi a,b, Xia Qin b,a, Li Penghui c,d, Chen Long c,d and Chai Yi b
a Intelligent Vehicle Testing & Evaluation Center, Chongqing Xibu Automobile Proving Ground Management Co., Ltd, Chongqing, 404100
b Chongqing University, Chongqing, 400044
c China Automotive Engineering Research Institute Co., Ltd, Chongqing, 400044
d Tsinghua University, Beijing, 100084
Abstract. This paper is based on naturalistic driving data collected by the Intelligent Vehicle Test and Evaluation Center of China Automotive Engineering Research Institute. To extract typical scenarios from naturalistic driving data, driver behavior characteristics need to be studied, and for massive naturalistic driving data a method for extracting driving behaviors is needed. This paper focuses on the automatic extraction of lane changing behavior from naturalistic driving data. After analyzing existing methods and the data in our project, an algorithm that automatically extracts lane change data is proposed based on the distance between the vehicle and the lane markings. The results show that the algorithm reliably obtains lane change behaviors on straight sections of highways and urban roads, and supports research on typical traffic scenarios in China.
Keywords. automatic extraction lane changing behavior, naturalistic driving data, typical traffic scenario in China, Dynamic Time Warping
1. Introduction
With the rapid development of autonomous driving technology, autonomous vehicles will gradually take over the role of the human driver and deal with increasingly complex traffic environments [1]. Lane changing behavior is influenced by driving needs, driving safety, surrounding vehicles, traffic conditions, etc. A lane changing vehicle leaves its current lane and enters an adjacent lane. Lane changing is one of the main causes of traffic accidents; therefore, research on lane change behavior is important. A Naturalistic Driving Study (NDS) studies the real driving process, recorded by high-accuracy data acquisition without interference and without an experimenter [2]. With the development of image processing technology, naturalistic driving data can record not only drivers' behavior and vehicle movement, but also the traffic environment and dynamic traffic flow elements. Driving traffic scenarios can thus be reproduced, and naturalistic driving data reflect real driving behavior better than field test data.
Y. Xiong et al. / Research on Automatic Extraction Method of Lane Changing Behavior
There has been much research on lane changing behavior extraction. For example, although the turn signal (corner lamp) can be considered a basis for detecting vehicle lane changing, its usage is low: Liu [3] analyzed 8669 lane changing samples, and the results show that the turn signal was used with a probability of 44%. Kuge [4] identified lane changing behavior by steering wheel angle. However, this parameter was obtained from a driving simulator, which differs from data acquired from a vehicle on the road. The higher the velocity, the less obvious the change in steering wheel angle. Moreover, the steering wheel angle acquired from a real vehicle is subject to large disturbances, which interfere with the identification of lane changing behavior. F. Lethaus [5] proposed using the driver's eye movement as a criterion of lane changing intention, but this method cannot be applied to naturalistic driving data, because it would interfere with drivers. In [6], Joel C. McCall applied Bayesian strategies to lane changing recognition based on lane direction, road curvature, road width, vehicle speed, steering wheel angle, acceleration, the driver's head information, etc. This method needs many input parameters, and the identification is affected if the acquisition of any one of them fails. In [7], using driving simulator data and a support vector machine, Yang Diange obtained a high recognition rate; however, driving simulator data are quite different from real vehicle data. Peng Jinshuan [8] addressed the lane changing recognition problem with a neural network, but when the number of input parameters is small, the recognition rate is low. This paper proposes a method that recognizes left and right lane changes based on the distance between the vehicle and the lane markings, extracted from naturalistic driving data. With this method, a large number of videos and data recorded during lane changing can be extracted from naturalistic driving data to support lane changing behavior analysis and typical scenario extraction.
2. Naturalistic Driving Data Acquisition
The original data in this paper are naturalistic driving data acquired by the Intelligent Vehicle Testing & Evaluation Center of China Automotive Engineering Research Institute Co., Ltd (CAERI). By July 2019, data covering 500,000 kilometers of highway and urban roads had been collected in 27 cities. The coverage area, road type and driver's age rate of the data are shown in Figure 1.
(a) Coverage area: mileage (km) by region (Northeast China, North China, Central China, East China, South China, Northwest China, Southwest China)
Y. Xiong et al. / Research on Automatic Extraction Method of Lane Changing Behavior
(b) Road type: share of data on highway, national, urban and other roads
(c) Driver's age rate: share of drivers aged 20-30, 30-40, 40-50 and 50-55
Figure 1. Overview of naturalistic driving data.
The driving data acquisition system includes a driving data acquisition host, a Mobileye, a millimeter wave radar, three cameras and a GPS module. The installation of the equipment is shown in Figure 2. The driving data acquisition host records and stores the data collected by the sensors. The Mobileye outputs road information (type and color of lane markings, distance between the vehicle and the lane markings, etc.). The Mobileye and the millimeter wave radar output information on surrounding objects (object type and velocity, lateral and longitudinal distance between the object and the vehicle, TTC, object ID, etc.). A camera records the video in front of the vehicle. The GPS module records the location of the vehicle (altitude, longitude, latitude, etc.). The system can also collect the vehicle's CAN data (velocity, acceleration, steering wheel angle, cornering lamp signal, etc.). The acquired naturalistic driving data support research on driving behavior and typical traffic scenarios in China.
Figure 2. Driving data acquisition system.
3. Automatic Extraction Lane Changing Behavior Method
3.1. Character Analysis of Lane Changing Behavior
This paper focuses on a method for automatically extracting lane changing behavior from naturalistic driving data. Lane changing is a continuous behavior that takes place over a period of time. Its most direct reflection is the distance between the vehicle and the lane marking, which the Mobileye identifies well. Therefore, the distance between the vehicle and the lane marking is chosen as the characteristic parameter of lane changing behavior. The process of lane changing is shown in Figure 3. A complete lane changing process can be divided into three phases: start of lane changing, touching lane marking, and end of lane changing.
(1) Start of lane changing: from the first lateral deviation of the vehicle, through its continuous movement towards the target lane, until the vehicle touches the lane marking.
(2) Touching lane marking: from the vehicle first touching the lane marking in the current lane until the vehicle leaves the lane marking in the target lane.
(3) End of lane changing: the vehicle drives in the target lane until it has completely returned to normal in-lane driving.
Figure 3. The process of lane changing.
Because the distance between the vehicle and the left lane marking is negatively correlated with the distance between the vehicle and the right lane marking, we choose the distance to the left lane marking as the characteristic parameter (Dleft). The change pattern of Dleft is shown in Figure 4. When the vehicle moves to the left lane, Dleft rises in phase (1) and phase (3). When the vehicle moves to the right lane, Dleft falls in phase (1) and phase (3).
(a) Left lane changing
(b) Right lane changing
Figure 4. Change pattern of Dleft (Dleft in m plotted against time in s).
3.2. Dynamic Time Warping Method
As analyzed in the previous section, the distance between the vehicle and the left lane marking is a time series, and its pattern differs between left and right lane changing. The two can therefore be distinguished by time series matching [9]. The measurement distance between two time series determines their degree of similarity: the smaller the distance, the higher the similarity. Dynamic time warping (DTW) is the most common method for similarity matching of time series of different lengths. Measures such as the Euclidean distance require the two series to have equal length and to correspond point by point; DTW avoids this restriction and can accurately align the peaks and troughs of two series (shown in Figure 5). It is therefore well suited to time series pattern matching, with high accuracy and strong robustness to time shifts and amplitude changes [10][11].
(a) Dynamic time warping
(b) Euclidean distance
Figure 5. Matching of dynamic time warping and Euclidean distance.
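Consider two time series $T = \{t_1, t_2, \ldots, t_n\}$ and $Q = \{q_1, q_2, \ldots, q_m\}$. Construct an $n \times m$ matrix $d$ whose element $d(i, j)$ is the distance between $t_i$ and $q_j$:

$$d(i, j) = \lVert t_i - q_j \rVert_k \tag{1}$$

where $k$ can be any non-zero natural number. When $k = 2$, $d(i, j)$ is the Euclidean distance. The smaller $d(i, j)$, the higher the similarity. The DTW distance is obtained by finding the optimal warping path $W$ between $T$ and $Q$ that minimizes the cumulative distance between $T$ and $Q$:

$$W = \{w_1, w_2, \ldots, w_K\}, \quad \max(n, m) \le K \le n + m - 1 \tag{2}$$

$$w_k = d(i, j) \tag{3}$$

where the $k$-th path element $w_k$ aligns $t_i$ with $q_j$. Thus, $DTW(T, Q)$ is calculated as follows:

$$
\begin{aligned}
DTW(T, Q) &= D(n, m) \\
D(i, j) &= d(i, j) + \min \{ D(i-1, j-1),\; D(i-1, j),\; D(i, j-1) \} \\
D(0, 0) &= 0 \\
D(i, 0) &= D(0, j) = \infty
\end{aligned}
\tag{4}
$$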
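The recurrence in Eq. (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `dtw_distance` is hypothetical, the samples are assumed to be scalars so that the point distance d(i, j) reduces to an absolute difference, and no path normalization or windowing constraint is applied.

```python
import math


def dtw_distance(t, q):
    """Return the cumulative DTW distance D(n, m) between sequences t and q."""
    n, m = len(t), len(q)
    # D carries one extra row and column for the boundary conditions:
    # D(0, 0) = 0 and D(i, 0) = D(0, j) = infinity.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Point distance d(i, j); for scalar samples this is |t_i - q_j|.
            cost = abs(t[i - 1] - q[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][m]
```

Because the path may repeat an index of either series, two series that trace the same shape at different speeds (for example `[0, 1, 2]` and `[0, 1, 1, 2]`) obtain a distance of zero, which a point-by-point Euclidean comparison cannot provide for series of unequal length.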
3.3. Automatic Extraction Lane Changing Behavior Method
The basic principle of the automatic extraction of lane changing behavior is that Dleft follows different patterns for left and right lane changing. When the vehicle first touches the lane marking at the moment tlc, Dleft changes abruptly. A fixed time window of 2.5 s before and after tlc is clipped, and the clipped segment is matched against the typical lane changing segments. If the DTW distance between the clipped segment and a typical segment is less than the threshold, the clipped segment is identified as the corresponding left or right lane changing. The method proceeds as follows.
Step 1: Read the distance between the vehicle and the left lane marking from the driving data (dl), from the typical left lane changing data (dsl) and from the typical right lane changing data (dsr).
Step 2: If dl is not empty, go to Step 3. Otherwise, return to Step 1.
Step 3: Find the moment tlc at which the vehicle first touches the lane marking (an abrupt change in dl of more than 3 meters). If tlc exists, go to Step 4. Otherwise, return to Step 1.
Step 4: Clip the data segment with a fixed time window of 2.5 s before and after tlc, denoted as dtemp.
Step 5: Calculate the DTW distance between dtemp and dsl. If this distance is less than the threshold, the segment is left lane changing data.
Step 6: Calculate the DTW distance between dtemp and dsr. If this distance is less than the threshold, the segment is right lane changing data.
Step 7: Clip and save the driving data with a fixed time window of 15 s before and after tlc.
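Steps 2 to 7 can be sketched as follows. This is a hedged illustration under several assumptions that the text does not state: samples arrive at a fixed rate `rate_hz`, the raw (unnormalized) cumulative DTW distance is compared with the threshold, and when both templates match, the closer one is chosen. The function names and the synthetic 2 Hz values are hypothetical and are not taken from the CAERI data.

```python
import math


def dtw_distance(t, q):
    """Cumulative DTW distance between two scalar sequences."""
    n, m = len(t), len(q)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(t[i - 1] - q[j - 1])
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][m]


def find_touch_index(dl, jump_m=3.0):
    """Step 3: index of the first sample where dl jumps by more than jump_m."""
    for i in range(1, len(dl)):
        if abs(dl[i] - dl[i - 1]) > jump_m:
            return i
    return None


def extract_lane_change(dl, dsl, dsr, rate_hz, threshold,
                        match_s=2.5, save_s=15.0):
    """Return (label, (start, stop)) for the first lane change in dl, or None."""
    if not dl:                                   # Step 2: nothing to process
        return None
    i_lc = find_touch_index(dl)                  # Step 3
    if i_lc is None:
        return None
    half = round(match_s * rate_hz)              # Step 4: +/- 2.5 s window
    dtemp = dl[max(0, i_lc - half): i_lc + half + 1]
    dist_left = dtw_distance(dtemp, dsl)         # Step 5
    dist_right = dtw_distance(dtemp, dsr)        # Step 6
    if min(dist_left, dist_right) >= threshold:
        return None                              # matches neither template
    # Assumption: if both templates match, keep the closer one.
    label = "left" if dist_left <= dist_right else "right"
    keep = round(save_s * rate_hz)               # Step 7: +/- 15 s clip
    return label, (max(0, i_lc - keep), min(len(dl), i_lc + keep + 1))


# Synthetic 2 Hz templates and drive data, invented for illustration only.
dsl = [-1.0, -0.8, -0.6, -0.4, -0.2, -3.4, -3.2, -3.0, -2.8, -2.6, -2.4]
dsr = [-2.6, -2.8, -3.0, -3.2, -3.4, -0.2, -0.4, -0.6, -0.8, -1.0, -1.2]
dl = [-1.75] * 10 + dsl + [-1.75] * 10
result = extract_lane_change(dl, dsl, dsr, rate_hz=2, threshold=0.5)
```

The returned index range is half-open, so `dl[start:stop]` yields the samples to save. A production version would also continue scanning after the first detected touch point rather than stopping at one candidate.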
4. Results and Discussion
This section verifies the effectiveness of the proposed method using naturalistic driving data collected from one car over 2 days. The driving areas include urban roads and highways, and the road sections include both straight and curved ones. Vehicle driving states fall into two categories: lane changing and non-lane changing. According to CAERI statistics on lane changing behavior, the mean lane changing time is 7.61 s. The data include 190 left lane changing segments and 189 right lane changing segments. Of these 379 lane changes, 11 occurred on curves and 368 on straight sections; 139 occurred on urban roads and 240 on highways.
4.1. Distance Between Vehicle and Left Lane Marking of Typical Lane Changing Data
This section describes how the distance between the vehicle and the left lane marking is calculated for typical left and right lane changes. Typical left and right lane changing data require manual selection. We selected 50 left and 50 right lane changing segments with a lane changing time of 6-9 s from the naturalistic driving data, and manually marked the starting point, middle point and ending point of every selected segment. Phase (1) runs from the starting point to the middle point, and phase (3) runs from the middle point to the ending point. For each of phase (1) and phase (3), 50 sample points are taken, and the value at each point is obtained by spline interpolation. The distance between the vehicle and the left lane marking of the typical lane changing data in phase (1) and phase (3) is then obtained by averaging the selected segments at each sample point, as shown in Figure 6. The blue points are the discrete values of the distance between the vehicle and the left lane marking, and the red points represent the typical left and right lane changing data. According to the statistics, the vehicle is 1.75 m from the left lane marking before lane changing and 1.72 m from it after lane changing.
(a) Left lane changing
(b) Right lane changing
Figure 6. Distance between vehicle and left lane marking of typical lane changing data.
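The template construction described in Section 4.1 (resample each manually marked phase to 50 points, then average point by point) can be sketched as below. Note one deliberate substitution: the paper uses spline interpolation, whereas this dependency-free sketch uses linear interpolation as a stand-in. The function names `resample` and `typical_phase` are hypothetical.

```python
def resample(values, n_points=50):
    """Resample a sequence to n_points evenly spaced samples.

    Linear interpolation is used here as a simple stand-in for the
    spline interpolation mentioned in the text.
    """
    if len(values) < 2 or n_points < 2:
        raise ValueError("need at least two input samples and two output points")
    last = len(values) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)    # fractional index into values
        lo = min(int(pos), last - 1)       # left neighbour, clamped at the end
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


def typical_phase(segments, n_points=50):
    """Point-wise mean of all segments of one phase after resampling."""
    resampled = [resample(s, n_points) for s in segments]
    return [sum(column) / len(resampled) for column in zip(*resampled)]
```

Applying `typical_phase` separately to the phase (1) and phase (3) portions of the selected segments and concatenating the two results would give one template per lane changing direction. Resampling first is what allows segments of different durations to be averaged at all.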
4.2. Performance Validation
Lane changing extraction is based on selecting the middle point of a lane change and judging the trend of the distance within a fixed 2.5 s time window before and after the middle point. If this trend matches neither the left nor the right lane changing trend, the selected middle point is not a real lane changing middle point. To verify the effectiveness of the proposed method, the distance threshold must be determined first. The recall and precision for different thresholds are shown in Table 1 and Table 2.

Table 1. Recall for different thresholds.
Threshold   Left lane changing   Right lane changing   Total
0.4         63.16%               64.74%                63.95%
0.5         68.42%               69.47%                68.95%
0.55        70.00%               69.47%                69.74%
0.6         71.05%               69.47%                70.26%
0.7         71.58%               70.53%                71.06%
0.8         72.63%               71.58%                72.11%
Table 2. Precision for different thresholds.
Threshold   Left lane changing   Right lane changing   Total
0.4         95.24%               98.40%                96.82%
0.5         91.55%               94.96%                93.26%
0.55        89.26%               93.62%                91.44%
0.6         79.88%               82.50%                81.19%
0.7         59.39%               61.19%                60.29%
0.8         52.08%               50.37%                51.23%
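One way to weigh recall against precision across the thresholds in Tables 1 and 2 is the F1 score, the harmonic mean of the two. The paper does not state that F1 was used, so the following is only an illustrative check; the dictionaries copy the "Total" columns of the two tables, and the helper name `f1` is hypothetical.

```python
# "Total" recall and precision per threshold, copied from Tables 1 and 2.
recall = {0.4: 0.6395, 0.5: 0.6895, 0.55: 0.6974,
          0.6: 0.7026, 0.7: 0.7106, 0.8: 0.7211}
precision = {0.4: 0.9682, 0.5: 0.9326, 0.55: 0.9144,
             0.6: 0.8119, 0.7: 0.6029, 0.8: 0.5123}


def f1(p, r):
    """Harmonic mean of precision p and recall r (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# F1 for every threshold, and the threshold that maximizes it.
scores = {th: f1(precision[th], recall[th]) for th in recall}
best_threshold = max(scores, key=scores.get)
```

Under this criterion the thresholds 0.5 and 0.55 score almost identically (about 0.79), with 0.5 marginally ahead, while the score drops noticeably from 0.6 upward as precision falls.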
The results show that a smaller threshold yields lower recall and higher precision, whereas a larger threshold yields higher recall and lower precision. The threshold should therefore be selected according to the data and the requirements. When the scale of the naturalistic driving data is small, as many lane changing segments as possible should be extracted, so a larger threshold can be used. When the scale is large, the lane changing segments should be extracted as accurately as possible, so a smaller threshold can be chosen. For the naturalistic driving data used in this paper, not all lane changing segments can be identified even when the threshold is 0.7 or 0.8, because irregular or unclear lane markings affect lane recognition and, in turn, lane changing recognition. However, too large a threshold causes non-lane changing segments to be recognized as lane changing segments, which greatly reduces the precision. When the threshold is 0.4, the precision is 96.82%, but because the threshold is small, fewer segments are correctly extracted. Considering both recall and precision, the threshold should be 0.5.
The traffic environment and flow on highways are relatively simple compared with urban roads. This section compares the performance of the automatic extraction method on highways and urban roads. The results are shown in Table 3 and Table 4.

Table 3. Recall for different thresholds on highways and urban roads.
Threshold   Urban roads   Highways   Total
0.4         41.01%        77.50%     59.25%
0.5         50.36%        80.00%     65.18%
0.55        51.08%        80.83%     65.96%
0.6         51.08%        81.67%     67.09%
0.7         52.52%        82.08%     67.30%
0.8         53.24%        83.33%     68.29%
Table 4. Precision for different thresholds on highways and urban roads.
Threshold   Urban roads   Highways   Total
0.4         90.48%        98.94%     94.71%
0.5         83.33%        97.46%     90.40%
0.55        79.78%        96.52%     88.15%
0.6         70.30%        85.96%     78.13%
0.7         56.15%        61.95%     59.05%
0.8         48.68%        52.22%     50.45%
The results show that both recall and precision are lower on urban roads than on highways. Considering both recall and precision, the threshold should be 0.5.
5. Conclusion
There are many lane changing extraction algorithms; however, they either have a low identification ratio or are not suitable for naturalistic driving data. The proposed algorithm is designed for naturalistic driving data. It uses the distance between the ego vehicle and the left lane marking as the criterion for identifying lane changing behavior, and DTW is used to distinguish left lane changing from right lane changing. With a distance threshold of 0.5, the recall and precision are 68.42% and 91.55%, respectively, for left lane changing, and 69.47% and 94.96%, respectively, for right lane changing. Compared with the piecewise approximation algorithm, the proposed DTW-based algorithm performs better on naturalistic driving data.
The proposed algorithm has a high level of recall and a better identification ratio for left and right lane changing, and it is well suited to large-scale automatic extraction. The extracted data are used for lane changing behavior analysis, for which recall and precision are important to ensure the accuracy and reliability of the analysis results. In future work, the time window needs to be adjusted.
Acknowledgement
This paper is supported by the key project "Intelligent Vehicle and Intelligent Transport Application Demonstration Project and Industrialization Public Service Platform Based on Broadband Mobile Internet" (bidding number: 0714-EMTC025593/20).
References
[1] Li Keqiang. The development trend of automobile technology and the countermeasure of our country. Automotive Engineering, 2009, 31(11): 1005-1016.
[2] Fitch G M, Hanowski R J. Using naturalistic driving research to design, test and evaluate driver assistance systems. Handbook of Intelligent Vehicles. London: Springer, 2012: 559-580.
[3] Liu A. What the Driver's Eye Tells the Car's Brain. In: Underwood G. Eye Guidance in Reading and Scene Perception. Oxford: Elsevier Science, 1998: 431-452.
[4] Kuge N, Yamamura T, Shimoyama O, et al. A Driver Behavior Recognition Method Based on a Driver Model Framework. SAE Paper, 2000-01-0349.
[5] Lethaus F, Rataj J. Do eye movements reflect driving maneuvers. IET Intell, September, 2007.
[6] McCall J C, Wipf D P, Trivedi M M, et al. Lane Change Intent Analysis Using Robust Operators and Sparse Bayesian Learning. IEEE Transactions on Intelligent Transportation Systems, 2007, 8(3): 431-440.
[7] Yang Diange, He Changwei, Li Man, He Qiguang. Vehicle steering and lane-changing behavior recognition based on a support vector machine. J Tsinghua Univ (Sci & Technol), 2015, 55(10): 1093-1097.
[8] Peng Jinshuan, Guo Yingshi, Fu Rui, et al. Multi-parameter Prediction of Drivers' Lane-changing Behaviour with Neural Network Model. Applied Ergonomics, 2015, 50: 207-217.
[9] Salvador S, Chan P. Toward Accurate Dynamic Time Warping in Linear Time and Space. Intelligent Data Analysis, 2007, 11(5): 561-580.
[10] Li Hailin, Liang Ye, Wang Shaochun. Review on dynamic time warping in time series data mining. Control and Decision, 2018, 33(8): 1345-1353.
[11] Xiong Yingzhi. Research on Feature Representation and Clustering Method for Time Series. Chongqing University, 2016.
Information Technology and Intelligent Transportation Systems L.C. Jain et al. (Eds.) IOS Press, 2020 © 2020 The authors and IOS Press. All rights reserved.
Subject Index active contour segmentation 107 AHP 42 anomaly detection 32 AnyLogic simulation 22 AR Toolkit 77 arrester bed 97 artificial intelligence 62, 86 augmented reality 77 automatic bird species identification 117 automatic extraction lane changing behavior 202 automatic video surveillance (AVS) 32 autonomous parallel parking system 1 bicycle feeder 10 brain storm optimization 107 cellular automaton model 166 chirp codes 125 color features 86 complex waters 135 composite guidance 159 compressive sensing 125 computer-aided diagnosis 86 correlation 142 crowd evacuation 166 deep learning 117 deep neural network 86 demons registration 142 diamond space 176 discrete element method (DEM) 97 distance attenuation model 10 dual induction 166 dull razor 86 dynamic camera self-calibration 176 dynamic time warping 202 endoscopy images 107 feature extraction 62 firefly algorithm 142 frequency estimation 125 gastric polyps 107 harmonics 125
health assessment 193 highway tunnel 166 HOG 71 importance degree 1 improved combinatorial test 1 incremental learning 193 intelligent evacuation guidance system 166 intermediate station 52 iris biometrics 62 joint entropy 142 Kapur’s function 107 LBP 71 level of service 22 machine learning 62 machine learning models 117 marker 77 mean shift local de-dimensionality 186 mean square error 142 measurement model 135 melanoma 86 midcourse and terminal guidance handover 159 mixed reality 77 model-in-the-loop 1 moments 62 morphological segmentation 186 motion segmentation 32 multi-criteria 42 naturalistic driving data 202 navigation 135 nearest parking space allocation 150 parameter optimization 176 parking allocation 150 plantar pressure imaging 186 probability characteristic 150 PTZ camera 176 random forest 193 randomly shaped pebbles 97 risk assessment 135 rough set theory 42 school level education 77 SIFT 71 signal reconstruction 125 smooth transition 159 sparsity 125 spatio-temporal threshold 10 station energy consumption 42 stereo garage 150 SURF 71 syllable classification 117 Tang’s demons 142
TOPSIS 42 transfer 22 truck escape ramp 97 turn-back capacity 52 typical traffic scenario in China 202 urban rail transit 10, 52 vanishing point 176 vehicle tracking 32 visual induction 166
Author Index Ashour, A.S. 142 Balas, V.E. v Cao, Y. 193 Cao, Z. 166 Chai, Y. 202 Chaki, J. 117 Chakraborty, S. 142 Chatterjee, B. 86 Chen, L. 202 Chen, S. 159 Chen, Y. 159 Chowdary, M.K. 71 Das, N. 117 Dey, N. 32, 86, 107, 117, 125, 142 Duan, C. 159 Duan, J. 1 Duque, C.A. 125 Dutta, A. 32 Gao, F. 1 Guo, T. 42 Hemanth, D.J. 62, 71, 77 Hien, D. 77 Ho, C.C. 77 J, Andrew 77 Jain, L.C. v Jia, S. 22 Kang, Y. 150 Khosravy, M. 125 Li, B. 150 Li, J. 150 Li, P. 202 Li, Y. 193 Li, Z. 186 Liu, J. 159 Liu, M. 159 Liu, P. 97 Liu, X. 166
Liu, Z. 193 Maiti, A. 86 Mao, B. 42, 52 Mao, C. 135 Melo, K. 125 Mondal, A. 32, 117 Padhy, N. 117 Pradhan, R. 142 Rajinikanth, V. 86, 107 Sagayam, K.M. 77 Sen, S. 32 Shekhargiri, H. 86 Shi, F. v, 86, 107 Shi, P. 97 Song, D. 193 Song, H. 176 Tang, X. 176 Wang, C. 1 Wang, D. 186 Wang, Hao 135 Wang, Huiwen 42 Wang, J. 10 Wang, M. 135 Wang, W. 176 Wang, Y. 22, 52 Winston, J.J. 62 Wu, J. 22, 52 Xia, Q. 202 Xiong, Y. 202 Yang, Y. 1 Yu, Q. 97 Zhang, N. 10 Zhang, Z. 176 Zhao, X. v, 97 Zheng, J. 135 Zhou, C. 135 Zhou, Q. 52