Lecture Notes in Networks and Systems 853
Pandian Vasant · Mohammad Shamsul Arefin · Vladimir Panchenko · J. Joshua Thomas · Elias Munapo · Gerhard-Wilhelm Weber · Roman Rodriguez-Aguilar Editors
Intelligent Computing and Optimization Proceedings of the 6th International Conference on Intelligent Computing and Optimization 2023 (ICO2023), Volume 3
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation (DCA), School of Electrical and Computer Engineering (FEEC), University of Campinas (UNICAMP), São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems.

The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure, which enable both a wide and rapid dissemination of research output.

The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them.

Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose.
Editors
Pandian Vasant, Faculty of Electrical and Electronics Engineering, Modeling Evolutionary Algorithms Simulation and Artificial Intelligence, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Mohammad Shamsul Arefin, Department of Computer Science, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
Vladimir Panchenko, Laboratory of Non-traditional Energy Systems, Department of Theoretical and Applied Mechanics, Federal Scientific Agroengineering Center VIM, Russian University of Transport, Moscow, Russia
J. Joshua Thomas, Department of Computer Science, UOW Malaysia KDU Penang University College, George Town, Malaysia
Elias Munapo, School of Economics and Decision Sciences, North West University, Mmabatho, South Africa
Gerhard-Wilhelm Weber, Faculty of Engineering Management, Poznań University of Technology, Poznan, Poland
Roman Rodriguez-Aguilar, Facultad de Ciencias Económicas y Empresariales, School of Economic and Business Sciences, Universidad Panamericana, Mexico City, Mexico
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-031-50326-9 ISBN 978-3-031-50327-6 (eBook) https://doi.org/10.1007/978-3-031-50327-6 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.
Preface
The sixth edition of the International Conference on Intelligent Computing and Optimization (ICO’2023) was held during April 27–28, 2023, at G Hua Hin Resort and Mall, Hua Hin, Thailand. The objective of the international conference is to bring together research scholars, experts and scientists in the areas of intelligent computing and optimization from all over the world to share their knowledge and experience of current research achievements in these fields. This conference provides a golden opportunity for the global research community to interact and share novel research results, findings and innovative discoveries with colleagues and friends. The proceedings of ICO’2023 are published by SPRINGER (in the book series Lecture Notes in Networks and Systems) and indexed by SCOPUS.

Almost 70 authors submitted their full papers for the 6th ICO’2023. They represent more than 30 countries, such as Australia, Bangladesh, Bhutan, Botswana, Brazil, Canada, China, Germany, Ghana, Hong Kong, India, Indonesia, Japan, Malaysia, Mauritius, Mexico, Nepal, the Philippines, Russia, Saudi Arabia, South Africa, Sri Lanka, Thailand, Turkey, Ukraine, UK, USA, Vietnam, Zimbabwe and others. This worldwide representation clearly demonstrates the growing interest of the global research community in our conference series. The organizing committee would like to sincerely thank all the authors and the reviewers for their wonderful contributions to this conference. The best and high-quality papers will be selected and reviewed by the International Program Committee in order to publish extended versions of the papers in international journals indexed by SCOPUS and ISI WoS.

This conference could not have been organized without the strong support and help from LNNS SPRINGER NATURE, Easy Chair, IFORS and the Committee of ICO’2023. We would like to sincerely thank Prof. Roman Rodriguez-Aguilar (Universidad Panamericana, Mexico), Prof. Mohammad Shamsul Arefin (Daffodil International University, Bangladesh), Prof. Elias Munapo (North West University, South Africa) and Prof. José Antonio Marmolejo Saucedo (National Autonomous University of Mexico, Mexico) for their great help and support for this conference. We also appreciate the wonderful guidance and support from Dr. Sinan Melih Nigdeli (Istanbul University-Cerrahpaşa, Turkey), Dr. Marife Rosales (Polytechnic University of the Philippines, Philippines), Prof. Rustem Popa (Dunarea de Jos University, Romania), Prof. Igor Litvinchev (Nuevo Leon State University, Mexico), Dr. Alexander Setiawan (Petra Christian University, Indonesia), Dr. Kreangkri Ratchagit (Maejo University, Thailand), Dr. Ravindra Boojhawon (University of Mauritius, Mauritius), Prof. Mohammed Moshiul Hoque (CUET, Bangladesh), Er. Aditya Singh (Lovely Professional University, India), Dr. Dmitry Budnikov (Federal Scientific Agroengineering Center VIM, Russia), Dr. Deepanjal Shrestha (Pokhara University, Nepal), Dr. Nguyen Tan Cam (University of Information Technology, Vietnam) and Dr. Thanh Dang Trung (Thu Dau Mot University, Vietnam).

The ICO’2023 committee would like to sincerely thank all the authors, reviewers, keynote speakers (Prof. Roman Rodriguez-Aguilar,
Prof. Kaushik Deb, Prof. Rolly Intan, Prof. Francis Miranda, Dr. Deepanjal Shrestha, Prof. Sunarin Chanta), plenary speakers (Prof. Celso C. Ribeiro, Prof. José Antonio Marmolejo, Dr. Tien Anh Tran), session chairs and participants for their outstanding contribution to the success of the 6th ICO’2023 in Hua Hin, Thailand. Finally, we would like to sincerely thank Prof. Dr. Janusz Kacprzyk, Dr. Thomas Ditzinger, Dr. Holger Schaepe and Ms. Varsha Prabakaran of LNNS SPRINGER NATURE for their great support, motivation and encouragement in making this event successful on the global stage.

April 2023
Dr. Pandian Vasant (Chair)
Prof. Dr. Gerhard-Wilhelm Weber
Prof. Dr. Mohammad Shamsul Arefin
Prof. Dr. Roman Rodriguez-Aguilar
Dr. Vladimir Panchenko
Prof. Dr. Elias Munapo
Dr. J. Joshua Thomas
Contents
Clean Energy, Agro-Farming, and Smart Transportation

UV-A, UV-B, and UV-C Irradiation Influence on Productivity and Anthocyanin Accumulation in Lettuce, Mustard and Basil Plants in Reduced Light Conditions
A. Smirnov, N. Semenova, Y. Proshkin, A. Ivanitskikh, N. Chilingaryan, and V. Panchenko

Optimization of Electrocontact Welding Wear-Resistant Functional Coatings Regime in the Use of Engineering Industrial Wastes
A. V. Serov, N. V. Serov, S. P. Kazantsev, I. Y. Ignatkin, and O. V. Chekha

Accelerated Growth and Development of Plants as a Result of Their Stimulation in the Impulsed Electric Field
S. Vasilev, S. Mashkov, P. Ishkin, V. Syrkin, M. Fatkhutdinov, and I. Yudaev

A Deep Reinforcement Learning Framework for Reducing Energy Consumption of Server Cooling System
Abdullah Al Munem, Md. Shafayat Hossain, Rizvee Hassan Prito, Rashedul Amin Tuhin, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

Far North: Optimizing Heating Costs
I. Yu. Ignatkin, N. A. Shevkun, A. S. Kononenko, V. Ryabchikova, and V. Panchenko

Justification for the Need to Develop and Implement Remote Monitoring Systems of the Grain Embankment Condition Which Operate by Using Renewable Energy Sources
Dmitry Budnikov, Vladimir Panchenko, and Viktor Rudenko

Justification of the Technology of Keeping Animals to Maintain the Microclimate
Igor M. Dovlatov, Ilya V. Komkov, Sergey S. Jurochka, Alexandra A. Polikanova, and Vladimir A. Panchenko

Cattle Icare Monitoring System (CIMS): Remote Monitoring of CATtle’s Heart Rate, Temperature, and Daily Steps with Smart Sprinkler System
Deane Cristine Castillo, Alvin Bulayungan, John Carl Lapie, Dorothy Mary Ann Livida, Bryant Macatangay, Kim Francis Sangalang, and Marife Rosales

Identification of the Distribution of the Viral Potato Infections
P. Ishkin, V. Rakitina, M. Kincharova, and I. Yudaev

Smart Irrigation System for Farm Application Using LoRa Technology
Alfredo P. Duda, Vipin Balyan, and Atanda K. Raji

The Effect of Illumination on the Productivity of Dairy Cattle
Igor M. Dovlatov, Ilya V. Komkov, Dmitry A. Blagov, and Alexandra A. Polikanova

Improvement of Technological Process of Growing Hydroponic Green Fodder Triticale (Triticosecale Wittm.) in Indoor Farming Using Light Emitting Diodes
N. I. Uyutova, N. A. Semenova, N. O. Chilingaryan, V. A. Panchenko, and A. S. Dorokhov

Design of a Device with a Thermoelectric Module for Transporting Milk
Irina Ershova, Dmitrii Poruchikov, Vladimir Kirsanov, Vladimir Panchenko, Gennady Samarin, Gennady Larionov, and Natalia Mardareva

Energy-Efficient AI Models for 6G Base Station
Mahadi Karim Munif, Mridul Ranjan Karmakar, Sanjida Alam Tusi, Banalata Sarker, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

Green IT, IoTs and Data Analytics

Model-Based Design of User Story Using Named Entity Recognition (NER)
Aszani and Sri Mulyana

Intrinsic and Extrinsic Evaluation of Sentiment-Specific Word Embeddings
Sadia Afroze and Mohammed Moshiul Hoque

Movie Recommender System: Addressing Scalability and Cold Start Problems
Pradeesh Prem Kumar, Nitish U., Nimal Madhu M., and Hareesh V.

E-waste Management and Recycling Model for Dhaka with Collection Strategy Application: A More Effective and Sustainable Approach
Md. Nazmus Sakib, Md. Mainul Hasan, Anika Faiza, Shahinur Rahman Nova, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

CoBertTC: Covid-19 Text Classification Using Transformer-Based Language Models
Md. Rajib Hossain and Mohammed Moshiul Hoque

Glaucoma Detection Using CNN and Study on Class Imbalance Problem
Nitish U., Pradeesh Prem Kumar, Nimal Madhu M., Hareesh V., and V. V. Sajith Variyar

Identification of Deceptive Clickbait Youtube Videos Using Multimodal Features
Sheikh Sowmen Rahman, Avishek Das, Omar Sharif, and Mohammed Moshiul Hoque

Perception and Knowledge of South African Creatives with Regards to Crypto Art, NFTs, and Crypto Art Platforms
Siyanda Andrew Xaba, Xing Fang, and Dhaneshwar Shah

Automated Bone Age Assessment Using Deep Learning with Attention Module
Maisha Fahmida, Md. Khaliluzzaman, Syed Md. Minhaz Hossain, and Kaushik Deb

Green Banking Through Blockchain-Based Application for Secure Transactions
Md. Saiful, Nahid Reza, Maisha Mahajabin, Syada Tasfia Rahman, Farhana Alam, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

Application of Decision Tree Algorithm for the Classification Problem in Bank Telemarketing
Ngoc Nguyen Minh Lam, Ngoc Hong Tran, and Dung Hai Dinh

Robust Feature Extraction Technique for Hand Gesture Recognition System
V. Yadukrishnan, Abhishek Anilkumar, K. S. Arun, M. Nimal Madhu, and V. Hareesh

Adaptive Instance Object Style Transfer
Anindita Das

Case Study: A Review of Cybersecurity Policies and Challenges in Indonesia
Adriel A. Intan and Rolly Intan

Lowering and Analyzing the Power Consumption of Smartphones
Imtiaj Ahmed, Samiun Rahman Sizan, Fariha Tabassum, Md. Mostafijur Rahman, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

Comparison for Handwritten Character Recognition and Handwritten Text Recognition and Tesseract Tool on IJAZAh’s Handwriting
Alexander Setiawan, Kartika Gunadi, and Made Yoga Mahardika

Secure Communication Through Quantum Channels: A Study of Quantum Cryptography
Seema Ukidve, Ramsagar Yadav, Mukhdeep Singh Manshahia, and M. P. Chaudhary

A Study on Android Malware Classification by Using Federated Learning
Vo Quoc Vuong and Nguyen Tan Cam

Post-harvest Soybean Meal Loss in Transportation: A Data Mining Case Study
Emmanuel Jason Wijayanto, Siana Halim, and I. Gede Agus Widyadana

Android Application Behavior Monitor by Using Hooking Techniques
Nguyen Tan Cam, Trinh Gia Huy, Vo Ngoc Tan, Phuc Nguyen, and Sang Vo

Blockchain in Concurrent Green IoT-Based Agriculture: Discussion, Analysis, and Implementation
Md Janay Alam, Ashiquzzaman Choudhury, Kazi Sifat Al Maksud, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

Automatic Document Summarization of Unilingual Documents: A Review
Sabiha Anan, Nazneen Islam, Mohammed Nadir Bin Ali, Touhid Bhuiyan, Md. Hasan Imam Bijoy, Ahmed Wasif Reza, and Mohammad Shamsul Arefin

Author Index
About the Editors
Pandian Vasant is Research Associate at Modeling Evolutionary Algorithms Simulation and Artificial Intelligence, Faculty of Electrical and Electronics Engineering, Ton Duc Thang University, Ho Chi Minh City, Vietnam, and Editor in Chief of International Journal of Energy Optimization and Engineering (IJEOE). He holds a Ph.D. in Computational Intelligence (UNEM, Costa Rica), an M.Sc. in Engineering Mathematics (University Malaysia Sabah, Malaysia) and a B.Sc. (Hons, Second Class Upper) in Mathematics (University of Malaya, Malaysia). His research interests include soft computing, hybrid optimization, innovative computing and applications. He has co-authored research articles in journals, conference proceedings, presentations, special issues (as Guest Editor) and chapters (312 publications indexed in ResearchGate), and was General Chair of the EAI International Conference on Computer Science and Engineering in Penang, Malaysia (2016) and Bangkok, Thailand (2018). In the years 2009 and 2015, he was awarded top reviewer and outstanding reviewer for the journal Applied Soft Computing (Elsevier). He has 35 years of working experience at universities. Currently, Dr. Pandian Vasant is General Chair of the International Conference on Intelligent Computing and Optimization (https://www.icico.info/) and Research Associate at Modeling Evolutionary Algorithms Simulation and Artificial Intelligence, Faculty of Electrical and Electronics Engineering, Ton Duc Thang University, HCMC, Vietnam.

Professor Dr. Mohammad Shamsul Arefin is on lien from Chittagong University of Engineering and Technology (CUET), Bangladesh, and is currently affiliated with the Department of Computer Science and Engineering (CSE), Daffodil International University (DIU), Dhaka, Bangladesh. Earlier he was the head of the CSE Department, CUET. Prof. Arefin received his Doctor of Engineering degree in Information Engineering from Hiroshima University, Japan, with the support of a MEXT scholarship, Japan.
As a part of his doctoral research, Dr. Arefin was with IBM Yamato Software Laboratory, Japan. His research includes data privacy and mining, big data management,
IoT, Cloud Computing, Natural Language Processing, Image Information Processing, Social Networks Analysis and Recommendation Systems and IT for agriculture, education and environment. Prof. Arefin is the Editor in Chief of Computer Science and Engineering Research Journal (ISSN: 1990-4010) and was the Associate Editor of BCS Journal of Computer and Information Technology (ISSN: 2664-4592) and a reviewer as well as TPC member of many international journals and conferences. Dr. Arefin has more than 120 refereed publications in international journals, book series and conference proceedings. He has delivered more than 30 keynote speeches and invited talks. He has also received a good number of research grants and funds from home and abroad. Dr. Arefin is a senior member of IEEE, Member of ACM, Fellow of IEB and BCS. Prof. Arefin is or was previously involved in many professional activities, such as Chairman of Bangladesh Computer Society (BCS) Chittagong Branch; Vice-President (Academic) of BCS National Committee; Executive Committee Member of IEB Computer Engineering Division; Advisor, Bangladesh Robotic Foundation. He was also a member of the pre-feasibility study team of the CUET IT Business Incubator, the first campus-based IT Business Incubator in Bangladesh. Prof. Arefin is a Principal Editor of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, Volume 95) published by Springer and an editor of the books Applied Informatics for Industry 4.0, Applied Intelligence for Industry 4.0 and Computer Vision and Image Analysis for Industry 4.0, to be published by Taylor and Francis. Prof. Arefin is the Vice-Chair (Technical) of IEEE CS BDC for the year 2022. He was the Vice-Chair (Activity) of IEEE CS BDC for the year 2021 and the Conference Co-Coordinator of IEEE CS BDC for two consecutive years, 2018 and 2019. He is acting as a TPC Chair of MIET 2022 and the TPC Chair of IEEE Summer Symposium 2022.
He was the Organizing Chair of International Conference on Big Data, IoT and Machine Learning (BIM 2021) and National Workshop on Big Data and Machine Learning (BDML 2020). He served as the TPC Chair, International Conference on Electrical, Computer and Communication Engineering (ECCE 2017); Organizing Co-chair, ECCE 2019, Technical Co-chair, IEEE CS BDC Winter Symposium 2020 and Technical Secretary, IEEE CS BDC Winter Symposium 2021. Dr. Arefin helped different international conferences
in the form of track chair, TPC member, reviewer and/or session chair etc. He is a reviewer of many reputed journals including IEEE Access, Computing Informatics, ICT Express, Cognitive Computation etc. Dr. Arefin visited Japan, Indonesia, Malaysia, Bhutan, Singapore, South Korea, Egypt, India, Saudi Arabia and China for different professional and social activities.

Vladimir Panchenko is an Associate Professor of the “Department of Theoretical and Applied Mechanics” of the “Russian University of Transport”, Senior Researcher of the “Laboratory of Non-traditional Energy Systems” of the “Federal Scientific Agroengineering Center VIM” and a Teacher of additional education. Graduated from the “Bauman Moscow State Technical University” in 2009 with the qualification of an engineer. Ph.D. thesis in the specialty “Power plants based on renewable energy” was defended in 2013. From 2014 to 2016 Chairman of the Council of Young Scientists and Member of the Academic Council of the All-Russian Institute for Electrification of Agriculture, Member of the Council of Young Scientists of the Russian University of Transport, Member of the International Solar Energy Society, Individual supporter of Greenpeace and the World Wildlife Fund, Member of the Russian Geographical Society, Member of the Youth section of the Council “Science and Innovations of the Caspian Sea”, Member of the Committee on the use of renewable energy sources of the Russian Union of Scientific and Engineering Public Associations. Diplomas of the winner of the competition of works of young scientists of the All-Russian Scientific Youth School with international participation “Renewable Energy Sources”, Moscow State University M.V.
Lomonosov in 2012, 2014, 2018 and 2020, Diploma with a bronze medal of the 15th Russian agro-industrial exhibition “Golden Autumn-2013”, Diploma with a gold medal of the 18th Russian agro-industrial exhibition “Golden Autumn-2016”, Diploma with a silver medal of the XIX Moscow International Salon of Inventions and Innovative Technologies “Archimedes-2016”, Diploma for preparing schoolchildren who have achieved high results in significant events of the Department of Education and Science of the City of Moscow (2020–2021, School No. 2045). Scientific adviser of schoolchildren-winners and prize-winners of the Project and Research Competition “Engineers of the Future” at NUST MISiS 2021 and
RTU MIREA 2022. Invited expert of the projects of the final stages of the “Engineers of the Future” (2021, 2022) and the projects of the “Transport of the Future” (2022, Russian University of Transport). Grant “Young teacher of MIIT” after competitive selection in accordance with the Regulations on grants for young teachers of MIIT (2016–2019). Scholarship of the President of the Russian Federation for 2018–2020 for young scientists and graduate students carrying out promising research and development in priority areas of modernization of the Russian economy. Grant of the Russian Science Foundation 2021 “Conducting fundamental scientific research and exploratory scientific research by international research teams”. Reviewer of articles, chapters and books for IGI, Elsevier, Institute of Physics Publishing, International Journal of Energy Optimization and Engineering, Advances in Intelligent Systems and Computing, Journal of the Operations Research Society of China, Applied Sciences, Energies, Sustainability, AgriEngineering, Ain Shams Engineering Journal, Concurrency and Computation: Practice and Experience. Presenter of the sections of the Innovations in Agriculture conference, keynote speaker of the ICO 2019 conference session, keynote speaker of the special session of the ICO 2020 conference. Assistant Editor since 2019 of the “International Journal of Energy Optimization and Engineering”, Guest Editor since 2019 of the Special Issues of the journal MDPI (Switzerland) “Applied Sciences”, Editor of the book of the “IGI GLOBAL” (USA), as well as book of the “Nova Science Publisher” (USA). Participated in more than 100 exhibitions and conferences of various levels. Published more than 250 scientific papers, including 14 patents, 1 international patent, 6 educational publications, 4 monographs.

J. Joshua Thomas has been an Associate Professor at UOW Malaysia KDU Penang University College, Malaysia, since 2008. He obtained his Ph.D.
(Intelligent Systems Techniques) in 2015 from Universiti Sains Malaysia, Penang, and his Master’s degree in 1999 from Madurai Kamaraj University, India. From July to September 2005, he worked as a research assistant at the Artificial Intelligence Lab at Universiti Sains Malaysia. From March 2008 to March 2010, he worked as a research associate at the same university. Currently, he is working on Machine Learning, Big Data, Data Analytics and Deep Learning, especially targeting
Convolutional Neural Networks (CNN) and Bi-directional Recurrent Neural Networks (RNN) for image tagging with embedded natural language processing, end-to-end steering learning systems and GAN. His work involves experimental research with software prototypes and mathematical modelling and design. He is an editorial board member for the Journal of Energy Optimization and Engineering (IJEOE), and an invited guest editor for the Journal of Visual Languages Communication (JVLC-Elsevier) and, more recently, for Computer Methods and Programs in Biomedicine (Elsevier). He has published more than 40 papers in leading international conference proceedings and peer-reviewed journals.

Elias Munapo has a Ph.D. obtained in 2010 from the National University of Science and Technology (Zimbabwe) and is a Professor of Operations Research at the North West University, Mafikeng Campus in South Africa. He is a Guest Editor of the Applied Sciences Journal and has co-published two books. The first book is titled Some Innovations in OR Methodology: Linear Optimization and was published by Lambert Academic publishers in 2018. The second book is titled Linear Integer Programming: Theory, Applications, and Recent Developments and was published by De Gruyter publishers in 2021. Professor Munapo has co-edited a number of books, is currently a reviewer of a number of journals, and has published over 100 journal articles and book chapters. In addition, Prof. Munapo is a recipient of the North West University Institutional Research Excellence award and is a member of the Operations Research Society of South Africa (ORSSA), EURO, and IFORS. He has presented at both local and international conferences and has supervised more than 10 doctoral students to completion. His research interests are in the broad area of operations research.
Gerhard-Wilhelm Weber is a Professor at Poznan University of Technology, Poznan, Poland, at the Faculty of Engineering Management. His research is on mathematics, statistics, operational research, data science, machine learning, finance, economics, optimization, optimal control, management, neuro-, bio- and earth-sciences, medicine, logistics, development, cosmology and generalized spacetime research. He is involved in the organization of scientific life internationally. He received his Diploma and Doctorate in Mathematics, and Economics/Business Administration, at RWTH Aachen, and his Habilitation at TU Darmstadt (Germany). He held substitute professorships at the University of Cologne and TU Chemnitz, Germany. At the Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey, he was a Professor in Financial Mathematics and Scientific Computing, and Assistant to the Director, and has been a member of five further graduate schools, institutes and departments of METU. G.-W. Weber has affiliations at the University of Siegen (Germany), Federation University (Ballarat, Australia), University of Aveiro (Portugal), University of North Sumatra (Medan, Indonesia), Malaysia University of Technology, Chinese University of Hong Kong, KTO Karatay University (Konya, Turkey), Vidyasagar University (Midnapore, India), Mazandaran University of Science and Technology (Babol, Iran), Istinye University (Istanbul, Turkey), Georgian International Academy of Sciences, at EURO (Association of European OR Societies), where he is “Advisor to EURO Conferences”, and IFORS (International Federation of OR Societies). He is a member of many national OR societies, honorary chair of some EURO working groups, subeditor of IFORS Newsletter, member of IFORS Developing Countries Committee, of Pacific Optimization Research Activity Group, etc. G.-W. Weber has supervised many M.Sc. and Ph.D.
students, authored and edited numerous books and articles, and given many presentations from a diversity of areas, in theory, methods and practice. He has been a member of many international editorial, special issue and award boards; he participated at numerous research projects; he received various recognitions by students, universities, conferences and scientific organizations. G.-W. Weber is an IFORS Fellow.
Roman Rodriguez-Aguilar is a professor in the School of Economic and Business Sciences of the “Universidad Panamericana” in Mexico. His research is on large-scale mathematical optimization, evolutionary computation, data science, statistical modeling, health economics, energy, competition, and market regulation. He is particularly interested in topics related to artificial intelligence, digital transformation, and Industry 4.0. He received his Ph.D. at the School of Economics at the National Polytechnic Institute, Mexico. He also holds a master’s degree in Engineering from the School of Engineering at the National University of Mexico (UNAM), a master’s degree in Administration and Public Policy from the School of Government and Public Policy at Monterrey Institute of Technology and Higher Education, a postgraduate qualification in applied statistics from the Research Institute in Applied Mathematics and Systems of the UNAM, and a degree in Economics from the UNAM. Prior to joining Panamericana University, he worked as a specialist in economics, statistics, simulation, finance, and optimization, occupying different management positions in various public entities such as the Ministry of Energy, Ministry of Finance, and Ministry of Health. At present, he holds the second-highest country-wide distinction granted by the Mexican National System of Research Scientists for scientific merit (SNI Fellow, Level 2). He has co-authored research articles in science citation index journals, conference proceedings, presentations, and book chapters.
Clean Energy, Agro-Farming, and Smart Transportation
UV-A, UV-B, and UV-C Irradiation Influence on Productivity and Anthocyanin Accumulation in Lettuce, Mustard and Basil Plants in Reduced Light Conditions
A. Smirnov1(B), N. Semenova1, Y. Proshkin1, A. Ivanitskikh1, N. Chilingaryan1, and V. Panchenko1,2
1 Federal Scientific Agroengineering Center VIM, 1st Institutsky Passage 5, 109428 Moscow, Russia [email protected], [email protected], [email protected], [email protected]
2 Russian University of Transport, Obraztsova St. 9, 127994 Moscow, Russia
Abstract. The photoprotective reaction of plants to UV irradiation depends on the plant species, so studying it is of great importance for creating optimal conditions for growing plants with desired properties in indoor farming. A study of the photoprotective reactions of ‘Robin’ lettuce (Lactuca sativa L.) under low illumination and exposure to UV irradiation of various ranges showed a positive effect of UV-B irradiation on fresh mass (40% increase), dry mass (17% increase) and leaf area (48% increase); the anthocyanin content also increased 2.5 times. For ‘Vitamin’ red mustard plants (Brassica juncea L.), additional UV-B irradiation increased the plant fresh mass by an average of 46%, the leaf surface area by 15% and the anthocyanin content 2.2 times. UV-A irradiation treatment increased red mustard fresh mass by 32%, dry mass by 35% and leaf surface area by 15%. Supplementary UV-A irradiation of ‘Gastronom’ sweet basil (Ocimum basilicum L.) increased fresh mass 2.4 times, dry mass by 2%, and leaf surface area 2 times. We established that purple-leaved basil plants can be used as an indicator of the presence of UV-C radiation in the lighting spectrum. Supplementary UV-A and UV-B irradiation of plants at low doses increases the synthesis of phenolic compounds beneficial to health, improves product quality and, for a number of crops, increases yields in indoor farming conditions.
Keywords: UV-irradiation · Red mustard · Sweet basil · Lettuce · Anthocyanin · Chlorophyll · Photosynthesis
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 3–12, 2024. https://doi.org/10.1007/978-3-031-50327-6_1
1 Introduction
Climate change, rising average annual temperatures and increasing exposure to UV radiation, caused by depletion of the stratospheric ozone layer and the release of greenhouse gases into the Earth's atmosphere, negatively affect biological organisms: they reduce ecosystem biodiversity, increase cancer incidence, and depress plant growth and productivity [1]. According to recent studies [2], the combination of these factors can strongly affect the quantity and quality of agricultural produce. Many scientific articles devoted to this problem pay considerable attention to the effect of UV-B radiation on plants in general, and to the mechanisms of synthesis and accumulation of compounds that protect the plant from the destructive effects of UV-B radiation [3, 4]. The main plant protection mechanism against UV radiation is stimulation of the synthesis and accumulation of phenolic compounds [5, 6], which play a key role in alleviating stress and increasing resistance to biotic and abiotic influences. The flavonoid group synthesized by plants includes compounds such as anthocyanins, phenols, flavonoids, and a number of other compounds with antioxidant properties [7]. Most studies have examined the effect of high-intensity UV-B radiation on the concentration and accumulation rate of secondary metabolites. Of great interest are studies of mechanisms for increasing flavonoid synthesis and reducing the stress caused by UV-B in agricultural crops through the use of sulfhydryl compounds such as thiourea, dithiothreitol and other active additives [8, 9]. It should be noted that a high concentration of flavonoids and phenolic acids in natural plant products, which have an increased ability to trap free radicals and high antioxidant activity, can have a positive effect on human health [10]. Studies in mouse models have shown that food rich in pigments, in particular anthocyanins, has anticancer activity and a beneficial effect on the liver [11].
Flavonoids in plant foods can have both positive and negative effects on the immunity and health of living organisms [12–14]; in addition, they can affect a number of processes in ecosystems [7, 15, 16]. When plant adaptation mechanisms and the influence of UV radiation of various ranges are studied to identify photoprotective reactions, the indicators usually monitored are both morphological (leaf surface area, shape, fresh mass, dry mass, stem length, etc.) and biochemical (chemical composition and plant pigment concentration). Studies have been carried out both in natural growing environments with an elevated background of UV-A and UV-B irradiation [17, 18] and under artificially created conditions in indoor farming [19, 20]. Plant responses to UV radiation include changes in leaf surface area, leaf thickness, stomatal density, photosynthetic pigment production and stem length [21, 22]. It was found that repeated UV-B and UV-C exposure inhibited the growth of lettuce plants [23], whereas exposure of basil plants to supplementary UV-B and UV-A irradiation increased the assimilating leaf surface area and the fresh and dry mass [24, 25]. Additional UV-B irradiation stimulated physiological functions in young basil plants and increased the total chlorophyll content but did not influence the carotenoid content [22]. Recent studies make it possible to improve the growing of leafy greens with specified parameters in indoor farming and to regulate the content of biologically active substances in the product. Additional adequate levels of UV-A/B cause a positive mild
stress, stimulating oxidative stress and antioxidant mechanisms and increasing phenolic compounds in plants. This is now considered a new paradigm of modern horticulture [23, 25]. Taking into account the cost of and increasing demand for purple-coloured leafy greens, we decided to conduct research on the photoprotective reactions of plants exposed to UV radiation of various ranges (UV-A, UV-B, UV-C). The photoprotective reactions were assessed by changes in morphological parameters and pigment concentration in plant leaves.
2 Materials and Methods
Red-leaved varieties of the following crops, widespread in the Russian Federation and belonging to different botanical families, were selected for the research: lettuce (Lactuca sativa L.) of the ‘Robin’ variety (Asteraceae), sweet basil (Ocimum basilicum L.) of the ‘Gastronom’ variety (Lamiaceae) and red mustard (Brassica juncea L.) of the ‘Vitamin’ variety (Brassicaceae). The plants were grown in a climatic chamber in 1-L pots with neutralized high-moor peat as the substrate. The phytochamber was divided into four equal compartments separated by an opaque material (Fig. 1). Plants were cultivated for seven weeks after germination. After thinning, 5 plants were left in each pot. The chemical composition of the nutrient solution used for watering was as follows: P-PO4 (1.00 µM/l); Cu (1.00 µM/l); Mo (1.00 µM/l); N-NH4 (1.07 µM/l); Mg (1.65 µM/l); S-SO4 (1.75 µM/l); Ca (2.00 µM/l); Zn (5.00 µM/l); K (5.77 µM/l); N-NO3 (9.64 µM/l); Mn (10.00 µM/l); Fe (15.00 µM/l); B (20.00 µM/l). The plants were kept in the climatic chamber around the clock, where an automation system maintained the temperature at 23/16 ± 1.0 °C and the relative air humidity at 60 ± 10%.
Fig. 1. Experimental plant appearance in 4 compartments of the chamber
Fluorescent lamps Osram Fluora 36W (OSRAM Licht AG, Munich, Germany) were used as the main irradiators in the phytochamber. The illumination intensity
was measured using a TKA-VD spectrocolorimeter (scientific-technical enterprise ‘TKA’, St. Petersburg, RF). The average irradiation intensity was 120 ± 10.3 µmol m−2 s−1. Recent studies with lettuce and basil crops showed that a PPFD of 250 µmol m−2 s−1 was optimal in terms of yield and energy efficiency [26]. The reduced illumination was chosen to create conditions in which the plant photosynthetic apparatus works more actively and the effect of UV irradiation is stronger [19]. In the first three compartments of the chamber, UV irradiators were installed in addition to the main light: UV-A 365 nm FERON T8 (FERON Moscow, RF), UV-A+B Arcadia T8 and UV-C LED 275 nm. The main illumination, common to all chamber compartments, had the following spectral composition: blue (400–500 nm), 31 µmol m−2 s−1; green (500–600 nm), 25 µmol m−2 s−1; red (600–700 nm), 64 µmol m−2 s−1; and far red (700–800 nm), 7 µmol m−2 s−1. Additional UV irradiation was as follows: compartment 1, UV-C (100–280 nm) at 0.055 W/m2 and UV-A (315–400 nm) at 0.093 W/m2; compartment 2, UV-A (315–400 nm) at 1.28 W/m2; compartment 3, UV-A (315–400 nm) at 0.32 W/m2 and UV-B (280–315 nm) at 0.06 W/m2. Compartment 4 served as a control, in which only the main illumination was used (UV-A, 0.06 W/m2). The photoperiod for both the main and the UV irradiation was 16 h. The intensity of UV irradiation in the chamber did not exceed that of natural solar UV radiation. The level of UV irradiation was measured using a portable TKA-PKM UV radiometer (NTP “TKA”, St. Petersburg, RF). Ultraviolet light was turned on from the first day after seed germination. Sampling for the biometric measurements was performed on the 45th day after germination. Five plants of each species were selected from each compartment of the phytochamber. The total mass of the aboveground part of the plant was determined by weighing on a Sartorius LA230S laboratory scale (Germany).
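As a quick consistency check, the lighting regime described above can be converted into daily integrals. The Python sketch below sums the photosynthetically active bands (400–700 nm) of the main illumination and converts the stated PPFD and UV irradiances into a daily light integral and daily UV doses for the 16 h photoperiod. The function and variable names are illustrative assumptions, and the daily values are derived here rather than reported by the authors.

```python
# Illustrative helpers; names are assumptions, not taken from the paper.
SECONDS_PER_HOUR = 3600


def daily_light_integral(ppfd_umol_m2_s, photoperiod_h):
    """Daily light integral in mol m^-2 d^-1 from PPFD in umol m^-2 s^-1."""
    return ppfd_umol_m2_s * photoperiod_h * SECONDS_PER_HOUR / 1e6


def daily_uv_dose_kj(irradiance_w_m2, photoperiod_h):
    """Daily UV dose in kJ m^-2 d^-1 from irradiance in W m^-2."""
    return irradiance_w_m2 * photoperiod_h * SECONDS_PER_HOUR / 1e3


# Spectral composition of the main illumination (umol m^-2 s^-1), from the text.
bands = {"blue": 31, "green": 25, "red": 64, "far_red": 7}
# PPFD conventionally covers 400-700 nm, so far red is left out of the sum.
par_ppfd = bands["blue"] + bands["green"] + bands["red"]

PHOTOPERIOD_H = 16
dli = daily_light_integral(par_ppfd, PHOTOPERIOD_H)

# Main supplementary UV band per compartment (W m^-2), from the text.
uv = {
    "compartment 1, UV-C": 0.055,
    "compartment 2, UV-A": 1.28,
    "compartment 3, UV-B": 0.06,
}
doses = {name: daily_uv_dose_kj(e, PHOTOPERIOD_H) for name, e in uv.items()}

if __name__ == "__main__":
    print(f"PAR PPFD: {par_ppfd} umol m^-2 s^-1")
    print(f"DLI: {dli:.2f} mol m^-2 d^-1")
    for name, dose in doses.items():
        print(f"{name}: {dose:.2f} kJ m^-2 d^-1")
```

The three bands between 400 and 700 nm sum to 120 µmol m−2 s−1, which matches the stated average intensity and corresponds to a daily light integral of about 6.9 mol m−2 d−1 under these assumptions.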
The leaf surface area was measured with a LI-COR LI-3100 Area Meter photo-planimeter (USA). Quantitative pigment analysis was performed on a SPEKS SSP705 spectrophotometer (Moscow, Russian Federation) after solvent extraction of the pigments from the plant tissues according to the accepted methodology [27]. Leaf CO2 gas exchange was measured using a LI-COR LI-6800 portable photosynthesis system (LI-COR Corp., Nebraska, USA). The microclimate parameters in the leaf chamber during gas exchange measurements corresponded to the values in the phytochamber where the test plants were cultivated. The CO2 concentration in the leaf chamber was maintained at 500 ppm. CO2 assimilation by the leaves was calculated from the difference in gas concentrations at the inlet and outlet of the leaf chamber. Photosynthetic light response curves were obtained by decreasing PPFD from 600 µmol m−2 s−1 to zero. All experiments were repeated three times. The measurement results were processed using statistical methods of data analysis. The diagrams were produced in MS Excel 2016. To assess the reliability of the results at a significance level of P < 0.05, a two-factor analysis of variance was performed using the Fisher and Tukey criteria.
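The differential calculation of CO2 assimilation mentioned above can be sketched as a simple mass balance over the leaf chamber. This is a simplified illustration that ignores the correction for dilution by transpired water vapour applied by commercial gas-exchange systems; the function name, flow rate, leaf area and concentrations are hypothetical values chosen for the example, not measurements from this study.

```python
def net_assimilation(flow_umol_s, co2_in_ppm, co2_out_ppm, leaf_area_cm2):
    """Net CO2 assimilation (umol CO2 m^-2 s^-1) from an open-system mass balance.

    flow_umol_s   : molar air flow through the leaf chamber, umol air s^-1
    co2_in_ppm    : CO2 mole fraction at the chamber inlet, umol mol^-1
    co2_out_ppm   : CO2 mole fraction at the chamber outlet, umol mol^-1
    leaf_area_cm2 : leaf area enclosed in the chamber, cm^2
    """
    flow_mol_s = flow_umol_s * 1e-6      # umol s^-1 -> mol s^-1
    leaf_area_m2 = leaf_area_cm2 * 1e-4  # cm^2 -> m^2
    # CO2 removed from the air stream per second, per unit leaf area.
    return flow_mol_s * (co2_in_ppm - co2_out_ppm) / leaf_area_m2


# Hypothetical example values (not from the paper).
a = net_assimilation(flow_umol_s=500.0, co2_in_ppm=510.0,
                     co2_out_ppm=500.0, leaf_area_cm2=6.0)
print(f"A = {a:.2f} umol CO2 m^-2 s^-1")
```

Repeating this calculation at each PPFD step, from 600 µmol m−2 s−1 down to zero, would give the points of a light response curve.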
3 Results and Discussion
From the first days of cultivation, the sweet basil plants under UV-C irradiation showed a noticeable lag in growth and damaged leaves (burn-like lesions and twisting) compared with the mustard and lettuce plants. This indicates that mustard and lettuce are more adaptable to this type of radiation (Fig. 2). Our research showed that sweet basil plants can be used as an indicator of increased UV-C radiation in the lighting spectrum. The effect of UV irradiation of various ranges on plant fresh and dry mass and on leaf surface area was reliably established (Table 1). In the lettuce and mustard crops, the greatest increase in leaf surface area and fresh mass was observed with UV-B irradiation (the 3rd compartment of the chamber), while a greater accumulation of dry mass was observed with UV-A and UV-C irradiation (by 22–33% and 35–38%, respectively). Basil grown with additional UV-A and UV-B treatment had a larger leaf surface area and a larger fresh mass. The greatest dry mass accumulation in basil was observed under UV-C irradiation (43% higher than in the control compartment).
Fig. 2. Appearance of lettuce (a), mustard (b) and basil (c) plants on the 45th day of cultivation. From left to right: plants irradiated with UV-C, UV-A, UV-B and control plants with no additional UV irradiation.
According to the experimental results, no direct correlation was observed in any of the studied crops between the morphological parameters, dry mass accumulation and the concentration of the main photosynthetic pigments (Fig. 3). It should be noted that, under sufficient illumination, the chlorophyll b content in plants is usually about 1/3 of the chlorophyll a content; an increase in this ratio indicates plant adaptation to a lack of illumination through an increase in the size of the light-harvesting antenna of photosystem II [28]. The highest total chlorophyll concentration in the mustard and basil plants was observed in the control variant, where no supplementary UV radiation was used, while in the lettuce plants the highest value was observed with UV-B irradiation (35% higher). The mustard plants did not suffer from a lack of illumination; their light stress and the decrease in chlorophyll concentration relative to the control plants were caused specifically by UV irradiation. Sweet basil and lettuce are more light-demanding plants. Judging by the ratio of chlorophyll b to chlorophyll a, these plants were growing under light-deficient conditions. This is confirmed by the averaged photosynthesis light response curves for the basil, lettuce and mustard plants (Fig. 4).
Table 1. Plant growth indicators for mustard ‘Vitamin’, lettuce ‘Robin’ and basil ‘Gastronome’ varieties on 45th day of cultivation.
Crop/variety | Light treatment | Fresh mass, g | Dry mass, % | Number of leaves | Leaf surface area, cm2
Lettuce ‘Robin’ | UV-C | 22.7 ± 3.2 | 9.36 ± 0.5 | 23.0 ± 4.2 | 618.0 ± 102.8
Lettuce ‘Robin’ | UV-A | 23.8 ± 2.1 | 9.01 ± 0.5 | 22.3 ± 3.4 | 604.4 ± 84.9
Lettuce ‘Robin’ | UV-B | 31.7 ± 6.5 | 7.88 ± 0.6 | 21.0 ± 3.6 | 966.7 ± 139.1
Lettuce ‘Robin’ | Control | 22.7 ± 7.5 | 6.76 ± 0.3 | 17.5 ± 5.1 | 651.6 ± 209.2
Mustard ‘Vitamin’ | UV-C | 7.0 ± 1.6 | 14.5 ± 0.9 | 8.0 ± 0.1 | 132.9 ± 22.9
Mustard ‘Vitamin’ | UV-A | 12.0 ± 0.5 | 15.9 ± 1.4 | 9.0 ± 0.8 | 230.1 ± 19.9
Mustard ‘Vitamin’ | UV-B | 13.3 ± 2.4 | 11.5 ± 2.1 | 8.3 ± 1.2 | 279.0 ± 44.9
Mustard ‘Vitamin’ | Control | 9.1 ± 2.7 | 11.8 ± 2.0 | 7.5 ± 0.5 | 200.2 ± 55.0
Basil ‘Gastronome’ | UV-C | 1.3 ± 0.1 | 12.0 ± 4.4 | 5.0 ± 1.0 | 36.7 ± 15.6
Basil ‘Gastronome’ | UV-A | 3.3 ± 0.7 | 8.6 ± 0.2 | 4.5 ± 0.5 | 84.5 ± 11.8
Basil ‘Gastronome’ | UV-B | 2.5 ± 0.5 | 8.5 ± 0.3 | 5.0 ± 0.8 | 65.6 ± 10.1
Basil ‘Gastronome’ | Control | 1.4 ± 0.6 | 8.4 ± 1.4 | 4.3 ± 0.5 | 39.1 ± 17.7
Least significant difference (p < 0.05) | | 4.70 | 2.44 | – | 139.45
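The relative effects quoted in the text can be recomputed from the treatment means in Table 1. The Python sketch below does this for three of the treatments; the dictionary layout and function name are illustrative assumptions, and only the means (not the ± deviations) are used.

```python
# Treatment means copied from Table 1 (fresh mass in g, dry mass in %,
# leaf surface area in cm^2). Only a subset of treatments is included.
TABLE1_MEANS = {
    "lettuce": {
        "UV-B": {"fresh_mass": 31.7, "dry_mass": 7.88, "leaf_area": 966.7},
        "Control": {"fresh_mass": 22.7, "dry_mass": 6.76, "leaf_area": 651.6},
    },
    "mustard": {
        "UV-A": {"fresh_mass": 12.0, "dry_mass": 15.9, "leaf_area": 230.1},
        "Control": {"fresh_mass": 9.1, "dry_mass": 11.8, "leaf_area": 200.2},
    },
    "basil": {
        "UV-A": {"fresh_mass": 3.3, "dry_mass": 8.6, "leaf_area": 84.5},
        "Control": {"fresh_mass": 1.4, "dry_mass": 8.4, "leaf_area": 39.1},
    },
}


def percent_change(crop, treatment, indicator):
    """Relative change of a treatment mean versus the control mean, in percent."""
    control = TABLE1_MEANS[crop]["Control"][indicator]
    treated = TABLE1_MEANS[crop][treatment][indicator]
    return (treated - control) / control * 100.0


if __name__ == "__main__":
    for crop, treatments in TABLE1_MEANS.items():
        for treatment in treatments:
            if treatment == "Control":
                continue
            for indicator in ("fresh_mass", "dry_mass", "leaf_area"):
                change = percent_change(crop, treatment, indicator)
                print(f"{crop:8s} {treatment:5s} {indicator:10s} {change:+6.1f}%")
```

With these means, UV-B-treated lettuce comes out at roughly +40% fresh mass, +17% dry mass and +48% leaf surface area relative to the control, in line with the figures quoted in the abstract and conclusion.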
Fig. 3. Content of photosynthetic pigments in lettuce ‘Robin’, mustard ‘Vitamin’ and basil ‘Gastronome’ plants under various ranges of UV radiation on 45th day of cultivation (mg per 1 g of fresh material).
At PPFD = 125 µmol m−2 s−1, which corresponds to the cultivation conditions, photosynthesis is far from its maximum. At higher PPFD values, the photosynthetic response to light intensity began to level out and reached a saturation plateau. For the mustard and lettuce plants, the light saturation point was at PPFD = 300–400 µmol m−2 s−1, while for the basil plants no saturation occurred even at 600 µmol m−2 s−1.
Fig. 4. Averaged photosynthetic light response curves of ‘Robin’ variety lettuce, ‘Vitamin’ variety mustard and ‘Gastronome’ variety basil on the 45th day of cultivation.
In our experiments, UV-B radiation intensified anthocyanin synthesis 2.5 times in the lettuce plants and 2.2 times in the red mustard (Fig. 5). The first study of lettuce transcriptome changes under UV-B radiation showed that UV-B induces the expression of anthocyanin biosynthesis genes and also stimulates the synthesis of stress proteins, which increases resistance to oxidative stress [29].
Fig. 5. Anthocyanin content (mg per 100 g of fresh material) on 45th day of cultivation of lettuce ‘Robin’, mustard ‘Vitamin’ and basil ‘Gastronome’ varieties; Least significant difference = 0.2, p < 0.05.
Although purple-leaved plants generally absorb more radiation than green-leaved ones, under insufficient illumination their photosynthetic efficiency is often lower than that of green-leaved plants. At high light intensity, the photosynthetic apparatus of plants can be damaged, and anthocyanins serve as protection. In addition, anthocyanins increase light energy absorption within the visible range of the spectrum [30]. In cell cultures, UV-B radiation stimulates anthocyanin production and thereby reduces DNA damage [31]. In our experiments, under the UV-A and UV-B treatments, the purple-leaved basil increased its fresh mass and leaf surface area more intensively and accumulated less anthocyanin (by 15 and 24.5%, respectively). This suggests that, under light stress caused by a lack of illumination, the decline in vital indicators was associated specifically with an increase in the concentration of acylated anthocyanins, while the total anthocyanin amount decreased.
4 Conclusion
In closed artificial agroecosystems, the photosynthetic photon flux density (PPFD) is much lower than the light intensity under direct sunlight, which affects the quality of the grown products, including their secondary metabolite content. Additional UV irradiation of plants in low doses increases the synthesis of phenolic compounds beneficial to human health and improves product quality. In a controlled environment, UV irradiation also increases the yield of a number of crops. Our research showed that for ‘Robin’ lettuce, adding UV-B radiation to the main illumination spectrum increased the plant fresh mass by 40%, the dry mass by 17%, the leaf surface area by 48%, and the anthocyanin content 2.5 times. For ‘Vitamin’ red mustard, additional UV-B irradiation (0.06 W/m2) increased the plant fresh mass by an average of 46%, the leaf surface area by 15% and the anthocyanin content 2.2 times, while adding UV-A irradiation (1.28 W/m2) increased the fresh mass by 32%, the dry mass by 35% and the leaf surface area by 15%. For ‘Gastronome’ sweet basil, additional UV-A irradiation (1.28 W/m2) increased the fresh mass 2.4 times, the dry mass by 2% and the leaf surface area 2 times. Thus, the plant response to UV irradiation of various ranges added to the main illumination spectrum is species-specific and requires individual selection of irradiation parameters and modes for each crop.
References
1. Neale, R.E., et al.: Environmental effects of stratospheric ozone depletion, UV radiation, and interactions with climate change: UNEP environmental effects assessment panel, update 2020. Photochem. Photobiol. Sci. 20(1), 1–67 (2021). https://doi.org/10.1007/s43630-020-00001-x
2. Urban, L., Charles, F., de Miranda, M.R., Aarrouf, J.: Understanding the physiological effects of UV-C light and exploiting its agronomic potential before and after harvest. Plant Physiol. Biochem. 105, 1–11 (2016). https://doi.org/10.1016/j.plaphy.2016.04.004
3. Yamasaki, S., Mizoguchi, K., Kodama, N., Iseki, J.: Lowbush Blueberry, highbush blueberry and cranberry extracts protect cucumber (Cucumis sativus L.) cotyledons from damage induced by UV-B irradiation. Jarq.-Jpn. Agr. Res. Q. 51, 241–250 (2017)
4. Prado, F.E., Perez, M.L., Gonzalez, J.A.: Effects of B ultraviolet radiation (UV-B) on different varieties of Quinoa. II. Effects on the synthesis of photosynthetic and protective pigments and soluble sugars under controlled conditions. Bol. de la Soc. Argent. de Bot. 51, 665–673 (2016). https://doi.org/10.31055/1851.2372.v51.n4.16353
5. Lin, L.Z., Sun, J., Chen, P., Harnly, J.: UHPLC-PDA-ESI/HRMS/MSn analysis of anthocyanins, flavonol glycosides, and hydroxycinnamic acid derivatives in red mustard greens (Brassica juncea Coss Variety). J. Agric. Food Chem. 59, 12059–12072 (2011). https://doi.org/10.1021/jf202556p
6. Dias, M.C., Santos, C., Silva, S., Pinto, D.C.G.A., Silva, A.M.S.: Physiological and metabolite reconfiguration of Olea Europaea to cope and recover from a heat or high UV-B shock. J. Agric. Food Chem. 68, 11339–11349 (2020). https://doi.org/10.1021/acs.jafc.0c04719
7. Shah, A., Smith, D.L.: Flavonoids in agriculture: chemistry and roles in biotic and abiotic stress responses, and microbial associations. Agronomy 10(8), 1209 (2020). https://doi.org/10.3390/agronomy10081209
8. Singh, S., Pandey, B., Agrawal, S.B., Agrawal, M.: Modification in spinach response to UVB radiation by nutrient management: pigments, antioxidants and metabolites. Indian J. Exp. Biol. 56, 922–931 (2018)
9. Golob, A., Novak, T., Marsic, N.K., Sircelj, H., Stibilj, V., Jerse, A., Kroflic, A., Germ, M.: Biofortification with selenium and iodine changes morphological properties of Brassica oleracea L. var. gongylodes) and increases their contents in tubers. Plant Physiol. Bioch. 150, 234–243 (2020). https://doi.org/10.1016/j.plaphy.2020.02.044
10. Benko, B., Toth, N., Zutic, I.: Flavonoids and phenolic acids in lettuce: how can we maximize their concentration? And why should we? VI Balkan Symp. Vegetables Potatoes, Acta Hortic. 1142, 1–10 (2016). https://doi.org/10.17660/ActaHortic.2016.1142.1
11. Cammarisano, L., Donnison, I.S., Robson, P.R.H.: Producing enhanced yield and nutritional pigmentation in Lollo Rosso through manipulating the irradiance, duration, and periodicity of LEDs in the visible region of light. Front. Plant Sci. 11, 598082 (2020). https://doi.org/10.3389/fpls.2020.598082
12. Mani, J.S., Johnson, J.B., Hosking, H., Ashwath, N., Walsh, K.B., Neilsen, P.M., Broszczak, D.A., Naiker, M.: Antioxidative and therapeutic potential of selected Australian plants: a review. J. Ethnopharmacol. 268, 113580 (2021)
13. Choudhary, D., Pan, M.H.: Antiviral effects of anthocyanins and phytochemicals as natural dietary compounds on different virus sources. Curr. Res. Nutr. Food Sci. 8, 674–681 (2020). https://doi.org/10.12944/CRNFSJ.8.3.01
14. Langer, S., Kennel, A., Lodge, J.K.: The influence of juicing on the appearance of blueberry metabolites 2 h after consumption: a metabolite profiling approach. Brit. J. Nutr. 119(11), 1233–1244 (2018). https://doi.org/10.1017/S0007114518000855
15. Brunetti, C., Fini, A., Sebastiani, F., Gori, A., Tattini, M.: Modulation of phytohormone signaling: A primary function of flavonoids in plant-environment interactions. Front. Plant Sci. 9, 1042 (2018). https://doi.org/10.3389/fpls.2018.01042
16. Thoma, F., Somborn-Schulz, A., Schlehuber, D., Keuter, V., Deerberg, G.: Effects of light on secondary metabolites in selected leafy greens: a review. Front. Plant Sci. 11, 497 (2020). https://doi.org/10.3389/fpls.2020.00497
17. Contreras, R.A., Pizarro, M., Kohler, H., Zamora, P., Zuniga, G.E.: UV-B shock induces photoprotective flavonoids but not antioxidant activity in Antarctic Colobanthus quitensis (Kunth) Bartl. Environ. Exp. Bot. 159, 179–190 (2019). https://doi.org/10.1016/j.envexpbot.2018.12.022
18. Bakhtiari, M., Formenti, L., Caggìa, V., Glauser, G., Rasmann, S.: Variable effects on growth and defense traits for plant ecotypic differentiation and phenotypic plasticity along elevation gradients. Ecol. Evol. 9, 3740–3755 (2019). https://doi.org/10.1002/ece3.4999
19. Dou, H., Niu, G., Gu, M.: Pre-harvest UV-B radiation and photosynthetic photon flux density interactively affect plant photosynthesis, growth, and secondary metabolites accumulation in basil (Ocimum basilicum) plants. Agronomy 9(8), 434 (2019)
20. Li, Y., Shi, R., Jiang, H., Wu, L., Zhang, Y.T., Song, S.W., Su, W., Liu, H.C.: End-of-day LED lightings influence the leaf color, growth and phytochemicals in two cultivars of lettuce. Agronomy 10, 1475 (2020). https://doi.org/10.3390/agronomy10101475
21. Gitz, D.C., Liu-Gitz, L.: How do UV photomorphogenic responses confer water stress tolerance? Photochem. Photobiol. 78(6), 529–534 (2003). https://doi.org/10.1562/0031-8655(2003)0780529HDUPRC2.0.CO2
22. Sakalauskaitė, J., et al.: The effects of different UV-B radiation intensities on morphological and biochemical characteristics in Ocimum basilicum L. J. Sci. Food Agric. 93(6), 1266–1271 (2013)
23. Lee, M.-J., Son, J.E., Oh, M.-M.: Growth and phenolic compounds of Lactuca sativa L. grown in a closed-type plant production system with UV-A, -B, or -C lamp. J. Sci. Food Agric. 94(2), 197–204 (2014). https://doi.org/10.1002/jsfa.6227
24. Semenova, N.A., Smirnov, A.A., Ivanitskikh, A.S., Izmailov, A.Y., Dorokhov, A.S., Proshkin, Y.A., Yanykin, D.V., Sarimov, R.R., Gudkov, S.V., Chilingaryan, N.O.: Impact of ultraviolet radiation on the pigment content and essential oil accumulation in sweet basil (Ocimum basilicum L.). Appl. Sci. 12, 7190 (2022). https://doi.org/10.3390/app12147190
25. Mariz-Ponte, N., Mendes, R.J., Sario, S., Ferreira de Oliveira, J.M.P., Melo, P., Santos, C.: Tomato plants use non-enzymatic antioxidant pathways to cope with moderate UV-A/B irradiation: a contribution to the use of UV-A/B in horticulture. J. Plant Physiol. 221, 32–42 (2018). https://doi.org/10.1016/j.jplph.2017.11.013
26. Pennisi, G., et al.: Optimal light intensity for sustainable water and energy use in indoor cultivation of lettuce and basil under red and blue LEDs. Sci. Hortic. 272, 109508 (2020). https://doi.org/10.1016/j.scienta.2020.109508
27. Semenova, N.A., et al.: The effect of plant growth compensation by adding silicon-containing fertilizer under light stress conditions. Plants 10(7), 1287 (2021). https://doi.org/10.3390/plants10071287
28. Borisova-Mubarakshina, M.M., Vetoshkina, D.C., Rudenko, N.N., Shirshikova, G.N., Fedorchuk, T.P., Naidov, I.A.: The size of the light-harvesting antenna of higher plant photosystem II is regulated by illumination intensity through transcription of antenna protein genes. Biochem. 79(6), 520–523 (2014). https://doi.org/10.1134/S0006297914060042
29. Zhang, L., Gong, F., Song, Y., Liu, K., Wan, Y.: De novo transcriptome analysis of lettuce (Lactuca sativa L.) and the identification of structural genes involved in anthocyanin biosynthesis in response to UV-B radiation. Acta. Physiol. Plant 41, 148 (2019)
30. Krasilnikova, L.A., Avksentieva, O.A.: Biochemistry of plants. Rostov: Phoenix 224 (2004)
31. Yoshida, K., Oyama, K., Kondo, T.: Structure of polyacylated anthocyanins and their UV protective effect. Recent Advances in Polyphenol Research, vol. 5, pp. 171–192 (2016)
Optimization of Electrocontact Welding Wear-Resistant Functional Coatings Regime in the Use of Engineering Industrial Wastes
A. V. Serov1, N. V. Serov2(B), S. P. Kazantsev2, I. Y. Ignatkin2, and O. V. Chekha2
1 Bauman Moscow State Technical University, Moscow 105005, Russia
2 Russian State Agrarian University—Moscow Timiryazev Agricultural Academy, Moscow 127550, Russia [email protected]
Abstract. Improving the quality and service life of machines is an important problem in mechanical engineering, so there is a strong need for technologies that improve the technical and economic indicators of production. Such technologies can be applied both to restore worn parts of machinery and equipment and to manufacture new ones. Resource-saving technologies are especially relevant in the production of agricultural machinery parts that operate under abrasive and corrosive influences accompanied by increased dynamic loads. Such technologies are developing toward the creation of coatings on the working surfaces of machines that provide, first of all, high wear and corrosion resistance while reducing the consumption of expensive materials with the required properties. One effective and promising way to obtain such coatings on the working surfaces of parts is electrocontact welding of compact materials. This technology makes it possible to obtain coatings from, among other things, waste from tool and machine-building production, which further reduces the cost of purchasing materials and is in line with the current trend toward recycling technologies that reduce harmful emissions into the environment. This paper presents one such technology for creating wear-resistant coatings. With it, a wear-resistant coating was obtained on the zone (toe) of the ploughshare exposed to the greatest abrasive wear by electrocontact welding of unusable hacksaw blades made of steel grade 11R3AM3F2.
Keywords: Functional coatings · Hardening · Restoration · Recycling · Electric contact welding · Environmental friendliness · Economy
1 Introduction The wear of machine parts and equipment used in agriculture is an urgent problem and requires increased attention. One of the ways to solve this problem is to increase the wear resistance of the working surfaces of agricultural machinery parts, which should lead to an increase in the entire life of the machine as a whole. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 13–22, 2024. https://doi.org/10.1007/978-3-031-50327-6_2
At the same time, the tasks of increasing the efficiency and environmental friendliness of production are becoming more and more acute in modern society, and the need for recycling technologies is growing as a result. A promising direction of recycling is the use of machine-building and tool production waste as additive materials for the restoration and/or hardening of machine parts, including agricultural ones [1–5]. A promising method [6–10] for strengthening and/or restoring machine parts is the creation of functional coatings with increased wear resistance by electrocontact welding (ECW). To implement this method, a technology has been developed for obtaining functional wear-resistant coatings during the restoration and/or hardening of agricultural machinery parts with simultaneous disposal of mechanical engineering and tool production waste, without labor-intensive preparatory operations and without the harmful emissions that accompany metallurgical waste disposal methods [6–10]. Waste in the form of strips or wires of structural, carbon tool, alloyed tool or high-speed steel (hand, machine, band and jigsaw blades, files, drills and their scraps; scraps, stubs and chips from tool production) is used as additive material [6–9]. This method was used to harden flat parts, with ploughshares made of 65G steel as an example. It was implemented on the “Remdetal” installation “011–1–10” with a specially developed device for seam electrocontact welding of additive material onto the surface of flat parts [7–8, 10]. To intensify the welding process, the surfaces to be welded were pretreated with corundum by cold gas-dynamic spraying using the “Dimet” unit. In addition, an intermediate layer was introduced between the filler material and the base during welding to improve the quality of the joint. Studies [10–11] have shown that the best choice for the intermediate layer is amorphous nickel-based tape solder.
The quality of coatings obtained by ECW is influenced by various factors, which can be divided into controllable and uncontrollable. The uncontrollable factors include the ambient temperature, humidity and pressure, the temperature of the cold mains water, and the stability and frequency of the voltage in the power supply network of the plant. The controllable factors are those set directly during the welding process [11]: the compression force of the welding electrodes $P$, which is set by the pressure in the pneumatic cylinders of the welding head; the welding current $I_w$; the duration of the welding current pulse $t_i$ and the pause between pulses $t_p$; the feed rate $S_s$ or the feed speed $v_S$ of the welding electrodes (workpiece), depending on the scheme of the ECW process; the cooling fluid (cold mains water) flow rate $G$; and the welding speed $v_w$, which, depending on the process scheme and the type of equipment used, is set by the rotation frequency $n$ of the workpiece when bodies of revolution are treated or by the rate of longitudinal motion of the workpiece relative to the electrode (or vice versa) when flat parts are treated. It is the controlled parameters that have the greatest influence on the properties and quality of the coatings obtained, and they constitute the parameters of the welding mode. Their rational and well-founded choice, based on accumulated experimental experience and theoretical research, will make it possible to control (program) the properties and quality of functional coatings obtained by this method.
To evaluate the economic indicators of the offered technology, it is necessary to calculate the productivity W of the process of electrocontact welding. The purpose of the study: calculation, assignment and verification of optimal parameters of the process of obtaining wear-resistant functional coatings on a plowshare from unusable hacksaw blades by electrocontact welding (ECW), as well as measuring the hardness and testing the wear resistance of the coatings obtained.
2 Materials and Methods The “011–1–10” “Remdetal” unit was used for the ECW. Before welding, the connected surfaces of the main and additive materials were subjected to abrasive blasting operation. Abrasive blasting operation of the surfaces was performed with DIMET K-00-04-16 corundum powder, and STEMET 1301 nickel-based amorphous tape solder was used as an intermediate layer. The preliminary abrasive blasting operation of the parts’ surfaces was carried out on the “DIMET 405” cold gas-dynamic spraying unit. As an additive material, the following were used: spent hacksaw blades made of steel grade 11R3AM3F2 with a thickness of 2 mm; as well as steel strips with a thickness of 0.5 mm made of steel grades 65G, U12A, 50HFA. As a basis, samples made of strips of grade 65G steel with a thickness of 12 mm were used. The samples made of 65G steel, in one case, passed through electrocontact tempering (ECT)—run–in with rollers without welding the coating. The samples were cut off using the LC-300 Metallographic cutting machine. Microhardness was measured using a METALLAB 502 hardness tester. The microhardness of the coatings obtained by welding hacksaw blades made of steel grade 11R3AM3F2 was measured between two welding spots with a width of 0.5 mm in length at depths of 0.2 mm, 0.7 mm and 1.2 mm from the surface. The hardness of the samples was obtained by the Super Rockwell method using the N scale of small loads on a TH320 hardness tester at a load of 147 N (15 kgf). Wear resistance of obtained coatings was determined on IM-01 (Fig. 1) device at average contact pressure in the friction zone of 0.33 MPa, that at the rotation speed of the roller 115 min−1 and its diameter of 50 mm corresponded to the friction path of 540 m. Abrasive material consumption (fraction 0.16…0.32 mm) 7.0 g/min, test duration 30 min. Samples were weighed before and after the tests on the scales of VL-120C. 
The wear per unit of time for each of the samples under consideration was calculated by the formula $m_i = (m_b - m_a)/t$, where $m_b$ is the mass of the sample before the test; $m_a$ is the mass of the sample after the test; $t$ is the duration of the test. The relative wear resistance $\varepsilon$ was defined as $\varepsilon = m_i/m_s$, where $m_s$ is the wear per unit of time of a reference sample made of steel 45 with a hardness of 190…200 HV.
Fig. 1. General view a and installation diagram b IM-01: 1—drum with abrasive; 2—chute; 3—screw; 4—tested sample; 5—elastic roller; 6—weight; 7—holder.
3 Results For electrocontact welding of a wear-resistant coating through an intermediate layer of highly active amorphous nickel-based tape, the controlled parameters of the ECW process were calculated. The welding current $I_w$ for welding spent hacksaw blades made of steel grade 11R3AM3F2 with a thickness of 2 mm, using amorphous nickel-based tape “Stemet 1301” as an intermediate layer, in the soft and hard welding modes is calculated by the formula [12]:

$$I_w = 4.2\, d_c \sqrt{\frac{T_\kappa \lambda}{\eta_T \rho_e}}, \tag{1}$$

for soft mode: $I_w = 4.2 \times 3 \times 10^{-3} \sqrt{\dfrac{(1200+273) \times 0.030 \times 10^{3}}{0.25 \times 1.19 \times 10^{-6}}} \approx 4.8$ kA,

for hard mode: $I_w = 4.2 \times 5 \times 10^{-3} \sqrt{\dfrac{(1310+273) \times 0.030 \times 10^{3}}{0.45 \times 1.19 \times 10^{-6}}} \approx 6.3$ kA,

where $d_c$ is the diameter of the welding spot, m; $\lambda$ is the thermal conductivity of the substance, W/(m·K); $T_\kappa$ is the temperature in the connection zone, K; $\eta_T$ is the thermal efficiency (0.45 for the hard mode, 0.3 for the medium mode, 0.25 for the soft mode); $\rho_e$ is the specific electrical resistance, Ω·m. In the soft mode $d_c = 3$ mm and $T_\kappa = 1473$ K; in the hard mode $d_c = 5$ mm and $T_\kappa = 1583$ K; in both modes $\lambda = 0.030$ kW/(m·K) and $\rho_e = 1.19 \times 10^{-6}$ Ω·m. The duration of the welding current pulse $t_i$ in the soft and hard modes was calculated by the formula [11]:

$$t_i = \frac{K_\tau \delta h_d}{\lambda}, \tag{2}$$

for soft mode: $t_i = \dfrac{4 \times 2 \times 1}{0.030 \times 10^{3}} = 0.26$ s,

for hard mode: $t_i = \dfrac{1.5 \times 2 \times 1}{0.030 \times 10^{3}} = 0.1$ s,
where $\delta$ is the thickness of the filler material, mm; $h_d$ is the depth of heating, mm; $K_\tau$ is the coefficient of mode rigidity (1.5 for the hard mode, 2.5 for the medium mode, 4 for the soft mode). In our case $\delta = 2$ mm, $h_d = 1$ mm, $K_\tau = 1.5$ for the hard mode and $K_\tau = 4$ for the soft mode. To ensure the required switch-on time (ST) of the welding current source and to minimize the area that is tempered while the next spot is welded, the condition $t_i < t_p$ must be observed; in practice the pause duration $t_p$ is set 0.02 s longer than the pulse duration. The electrical resistance of the welding zone sufficient for the formation of the joint is achieved at the optimal compression force of the roller electrodes, which is determined by the formula:

$$P = f \times \frac{\pi d_c^2}{4} = 86 \times \frac{3.14 \times 5^2}{4} = 1.7\ \text{kN}, \tag{3}$$

where $f$ is the specific pressure, 30…140 MPa. To determine the optimum weld spot overlap coefficients within a welding row $k_{Pn}$ and between welding rows $k_{PS}$, the following condition $k_{Pn} = k_{PS}$ must be met. The overlap coefficients of the welding spots are related by the formulas:

$$k_{PS} = \sqrt{1 - k_{Pn}^2}, \tag{4}$$

$$k_{Pn} = \sqrt{1 - k_{PS}^2}. \tag{5}$$

The overlap coefficients of the welding spots affect the productivity $W$ of the ECW process, mm²/s. Productivity is maximal at the value of $k_{Pn}$ at which the derivative $W'$ of the productivity with respect to $k_{Pn}$ is equal to zero. Up to a factor that does not vanish for $0 < k_{Pn} < 1$, this derivative is

$$W' = d_c^2\, \frac{t_i k_{Pn}^3 + 2 t_p k_{Pn}^2 - t_p}{\left(t_p + t_i k_{Pn}\right)^2}, \tag{6}$$

where $t_p$ is the pause duration, s. Setting the derivative to zero gives

$$0 = d_c^2\, \frac{t_i k_{Pn}^3 + 2 t_p k_{Pn}^2 - t_p}{\left(t_p + t_i k_{Pn}\right)^2}. \tag{7}$$

Formula (7) holds only if the numerator of the fraction is zero:

$$t_i k_{Pn}^3 + 2 t_p k_{Pn}^2 - t_p = 0, \tag{8}$$

for soft mode: $0.26 k_{Pn}^3 + 2 \times 0.28 k_{Pn}^2 - 0.28 = 0$, which gives $k_{Pn} = 0.623$,

for hard mode: $0.1 k_{Pn}^3 + 2 \times 0.12 k_{Pn}^2 - 0.12 = 0$, which gives $k_{Pn} = 0.629$.

The overlap coefficient $k_{PS}$ is then found by formula (4):

for soft mode: $k_{PS} = \sqrt{1 - 0.623^2} = 0.782$,

for hard mode: $k_{PS} = \sqrt{1 - 0.629^2} = 0.777$,
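where $k_{PS}$ is the overlap coefficient of welding spots between welding rows. The welding speed $v_w$ is calculated by the formula:

$$v_w = \frac{k_{Pn} d_c}{t_p + t_i k_{Pn}}, \tag{9}$$

for soft mode: $v_w = \dfrac{0.623 \times 3}{0.28 + 0.26 \times 0.623} = 4.22$ mm/s $= 0.26$ m/min,

for hard mode: $v_w = \dfrac{0.629 \times 5}{0.12 + 0.1 \times 0.629} = 17.192$ mm/s $= 1.08$ m/min.

The area of the filler material through which the current passes during the entire pulse, for spots whose shape is close to elliptical (Fig. 2), is determined by the formula [11]:

$$S_i = 4\,\frac{b}{a}\left(\frac{\pi a^2}{4} - \frac{v_w t_i}{4}\sqrt{a^2 - \frac{(v_w t_i)^2}{4}} - \frac{a^2}{2}\sin^{-1}\frac{v_w t_i}{2a}\right). \tag{10}$$

In soft mode: $S_i = 4 \times \dfrac{1.5}{2}\left(\dfrac{3.14 \times 2^2}{4} - \dfrac{4.22 \times 0.26}{4}\sqrt{2^2 - \dfrac{(4.22 \times 0.26)^2}{4}} - \dfrac{2^2}{2}\sin^{-1}\dfrac{4.22 \times 0.26}{4}\right) = 7.06$ mm².

In hard mode: $S_i = 4 \times \dfrac{2.5}{3}\left(\dfrac{3.14 \times 3^2}{4} - \dfrac{17.92 \times 0.1}{4}\sqrt{3^2 - \dfrac{(17.92 \times 0.1)^2}{4}} - \dfrac{3^2}{2}\sin^{-1}\dfrac{17.92 \times 0.1}{6}\right) = 18.52$ mm².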
Fig. 2. Geometric parameters of an elliptical welding spot.
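The calculations in formulas (1)–(3) can be reproduced with a short script. The sketch below is only an illustration under stated assumptions: the function names and the dictionary layout are hypothetical, the inputs are the values quoted in the text, and formula (2) is evaluated exactly as in the text, with the thickness and heating depth in millimetres and the thermal conductivity in W/(m·K).

```python
import math

def welding_current(d_c, T_k, lam, eta, rho_e):
    """Formula (1): welding current in A (d_c in m, T_k in K, lam in W/(m*K), rho_e in Ohm*m)."""
    return 4.2 * d_c * math.sqrt(T_k * lam / (eta * rho_e))

def pulse_duration(K_tau, delta_mm, h_d_mm, lam):
    """Formula (2) as evaluated in the text: delta and h_d in mm, lam in W/(m*K); result in s."""
    return K_tau * delta_mm * h_d_mm / lam

def compression_force(f_mpa, d_c_mm):
    """Formula (3): electrode compression force in N (MPa * mm^2 = N)."""
    return f_mpa * math.pi * d_c_mm ** 2 / 4

LAMBDA = 0.030e3  # thermal conductivity, W/(m*K), as quoted in the text
RHO_E = 1.19e-6   # specific electrical resistance, Ohm*m, as quoted in the text

# Mode inputs quoted in the text (hypothetical container, for illustration only)
modes = {
    "soft": {"d_c": 3e-3, "T_k": 1200 + 273, "eta": 0.25, "K_tau": 4.0},
    "hard": {"d_c": 5e-3, "T_k": 1310 + 273, "eta": 0.45, "K_tau": 1.5},
}

results = {}
for name, m in modes.items():
    I_w = welding_current(m["d_c"], m["T_k"], LAMBDA, m["eta"], RHO_E)
    t_i = pulse_duration(m["K_tau"], 2.0, 1.0, LAMBDA)  # delta = 2 mm, h_d = 1 mm
    results[name] = (I_w, t_i)
    print(f"{name}: I_w = {I_w / 1e3:.2f} kA, t_i = {t_i:.3f} s")

P = compression_force(86.0, 5.0)  # f = 86 MPa, d_c = 5 mm
print(f"P = {P / 1e3:.2f} kN")
```

With these inputs the soft-mode pulse duration evaluates to about 0.267 s, which the text reports as 0.26 s.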
When functional coatings are applied to flat surfaces [11], the main movement is the movement of the welding head relative to the workpiece (or vice versa), and the feed in the perpendicular direction per full stroke is:

$$S_s = k_{PS} d_c,\ \text{mm/stroke}, \tag{11}$$

for soft mode: $S_s = 0.782 \times 3 = 2.346$ mm/stroke,
for hard mode: $S_s = 0.777 \times 5 = 3.885$ mm/stroke. The authors of [11] obtained analytical expressions for determining the coolant flow rate during ECW of flat parts (workpieces):

$$G = \frac{\alpha F (T_\kappa - t_w)}{a c (t_c - t_w) + (1 - a) r} = \frac{0.664\, \mathrm{Re}_{jd}^{0.5}\, \mathrm{Pr}_j^{0.333} \left(\dfrac{\mathrm{Pr}_j}{\mathrm{Pr}_c}\right)^{0.25} \lambda (T_\kappa - t_w)\, \pi l}{a c (t_c - t_w) + (1 - a) r} = \frac{660.96 \times 0.0045 \times (1200 - 11)}{0.9 \times 4190 \times (30 - 11) + (1 - 0.9) \times 2.4} = 0.05\ \text{kg/s} \approx 3\ \text{l/min}, \tag{12}$$

where $t_c$ is the temperature of the coolant after contact with the surface, °C; $t_w$ is the temperature of the water supplied from the city cold water supply network (or of another coolant), °C; $r$ is the specific heat of vaporization, J/kg; $\alpha$ is the heat transfer coefficient; $F$ is the surface area washed by water (coolant) during ECW, m²; $a$ is the fraction of unevaporated water (coolant); $c$ is the specific heat capacity of the substance, J/(kg·K); $\mathrm{Nu}_{jd}$ is the Nusselt number. The calculation was made for electrocontact welding of hacksaw blades made of steel grade 11R3AM3F2, 2 mm thick, through an intermediate layer of amorphous nickel-based tape Stemet 1301, 40 µm thick, onto a 12 mm thick flat sample made of 65G steel. For the soft mode ($I_w = 4.8$ kA, $t_i = 0.26$ s, $P = 1.7$ kN): $\mathrm{Re}_{jd}^{0.5} = 500$; $\mathrm{Pr}_j = 9.46$ (at $t_w = 11$ °C); $\mathrm{Pr}_c = 4.31$ (at $T_\kappa = 40$ °C); $\lambda = 0.6$ W/(m·K); $d = 0.04$ m; $c = 4190$ J/(kg·K); $t_c = 30$ °C; $t_w = 11$ °C; $r = 2406.5$ kJ/kg (at $T_\kappa = 1200$ °C); $a = 0.9$. For the hard mode ($I_w = 6.3$ kA, $t_i = 0.1$ s, $P = 1.7$ kN) the same values were used, with $r = 2406.5$ kJ/kg taken at $T_\kappa = 1310$ °C. The maximum productivity $W$ of the ECW process is calculated by the formula:

$$W = d_c^2 \left(\frac{k_{Pn}\sqrt{1 - k_{Pn}^2}}{t_p + t_i k_{Pn}}\right), \tag{13}$$

in the soft welding mode: $W = 3^2 \times \dfrac{0.623 \times 0.782}{0.28 + 0.26 \times 0.623} = 9.92$ mm²/s $= 5.952$ cm²/min,

in the hard welding mode: $W = 5^2 \times \dfrac{0.629 \times 0.777}{0.12 + 0.1 \times 0.629} = 66.80$ mm²/s $= 40.08$ cm²/min.

Thus, all the parameters needed to weld spent hacksaw blades made of steel grade 11R3AM3F2 onto a ploughshare by ECW were obtained. The productivity of the ECW process in the hard mode is 6.73 times higher than in the soft mode; therefore, the hard mode was chosen for the welding process: $I_w = 6.3$ kA, $t_p = 0.12$ s, $t_i = 0.1$ s, $P = 1.7$ kN, $v_w = 1.08$ m/min, $S_i = 18.52$ mm², $S_s = 3.885$ mm/stroke, $G = 3$ l/min, $W = 40.08$ cm²/min.
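The chain from the cubic condition (8) to the productivity (13) can be checked numerically. The following is a minimal sketch, not the authors' software: the function names are hypothetical, the cubic is solved by plain bisection (a method chosen here for illustration; the text does not say how the roots were found), and the inputs are the pulse and pause durations and spot diameters quoted in the text.

```python
import math

def optimal_row_overlap(t_i, t_p, tol=1e-10):
    """Root of t_i*k**3 + 2*t_p*k**2 - t_p = 0 (formula (8)) on (0, 1), by bisection."""
    def g(k):
        return t_i * k ** 3 + 2 * t_p * k ** 2 - t_p

    lo, hi = 0.0, 1.0  # g(0) = -t_p < 0 and g(1) = t_i + t_p > 0, g is increasing
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mode_parameters(d_c_mm, t_i, t_p):
    """Overlap coefficients, welding speed, feed and productivity for one mode."""
    k_pn = optimal_row_overlap(t_i, t_p)
    k_ps = math.sqrt(1 - k_pn ** 2)            # formula (4)
    v_w = k_pn * d_c_mm / (t_p + t_i * k_pn)   # formula (9), mm/s
    s_s = k_ps * d_c_mm                        # formula (11), mm/stroke
    w = v_w * s_s                              # same value as formula (13), mm^2/s
    return {"k_pn": k_pn, "k_ps": k_ps, "v_w": v_w, "s_s": s_s, "W": w}

soft = mode_parameters(3.0, t_i=0.26, t_p=0.28)
hard = mode_parameters(5.0, t_i=0.10, t_p=0.12)
ratio = hard["W"] / soft["W"]

for name, p in (("soft", soft), ("hard", hard)):
    # 1 mm^2/s = 0.6 cm^2/min
    print(f"{name}: k_Pn = {p['k_pn']:.3f}, k_PS = {p['k_ps']:.3f}, "
          f"v_w = {p['v_w']:.2f} mm/s, W = {p['W'] * 0.6:.2f} cm^2/min")
print(f"hard/soft productivity ratio = {ratio:.2f}")
```

Because the welding speed multiplied by the feed per stroke equals expression (13), the productivity does not need a separate formula in the code.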
4 Discussion The hardness of the 11R3AM3F2 steel blades in the initial state, before welding, was 57.0 HRC; in the heat-affected (tempering) zone after ECW it decreased to 51.4 HRC, and inside the welding spot (re-hardening zone) it increased to 64.0 HRC.
Fig. 3. Microhardness of the coating zone of steel 11R3AM3F2
Similar data were obtained from microhardness measurements by the Micro-Vickers method: the maximum average hardness is 904 HV at the welding spot, 595.6 HV in the tempering zone and 804 HV in the zone with the initial structure (Fig. 3). Wear resistance studies established that the coating obtained by electrocontact welding of U12A carbon steel tape through amorphous tape solder 1301 has a wear resistance 1.48 times higher than the coating of 50HFA tape and 4 times higher than steel 45, while the wear resistance of steel 65G after electrocontact tempering (ECT) is 2 times higher than that of steel 45 and 1.5 times higher than that of steel 65G in the initial state. The wear resistance of 11R3AM3F2 steel coatings on a 65G steel base, obtained at the calculated values for the hard ECW mode, is 7.3 times higher than that of the reference steel 45 and 1.7 times higher than that of samples coated with steel U12A by the same method, with comparable hardness on the HRN15 scale (Fig. 4).
Fig. 4. Relative wear resistance and hardness of the tested specimens. 1—Steel 45; 2—Steel 65G (initial state); 3—Steel 65G (ECT); 4—Steel 65G + coating of steel 50HFA (ECW); 5—worn-out blade of steel 11R3AM3F2; 6—Steel 65G + coating of steel U12 (ECW); 7—St3 + coating of worn-out blade 11R3AM3F2
The presented data allow us to conclude that the utilization of mechanical engineering waste, combined with the restoration and/or hardening of parts by electrocontact welding under optimal conditions, with preliminary abrasive treatment of the joined surfaces and intermediate layers of STEMET 1301 amorphous tape solder, is a promising resource-saving direction in mechanical engineering that makes it possible to significantly increase wear resistance, reduce material costs and improve environmental friendliness. Future research will be aimed at: creating and continuously updating databases of the properties of functional coatings obtained by ECW; producing a prototype of modern equipment for obtaining functional coatings by ECW; and developing a technique for programming the properties of coatings obtained by ECW. It is also important to expand the range of parts that are repaired and/or hardened. For example, a promising direction is the strengthening of parts of livestock farm equipment [12–14].
5 Conclusion The main parameters of the ECW process are established to be: the welding current $I_w$, the compression force of the welding electrodes $P$, the pulse duration $t_i$ and the pause between pulses $t_p$, the coolant flow rate $G$, the welding speed $v_w$ and the process productivity $W$. The modes of applying a functional coating by electrocontact welding of unusable hacksaw blades made of steel grade 11R3AM3F2 were calculated. The productivity of the ECW process in the hard mode is 6.73 times higher than in the soft mode; therefore, the hard mode was chosen to implement the process: $I_w = 6.3$ kA, $t_p = 0.12$ s, $t_i = 0.1$ s, $P = 1.7$ kN, $v_w = 1.08$ m/min, $S_i = 18.52$ mm², $S_s = 3.885$ mm/stroke, $G = 3$ l/min, $W = 40.08$ cm²/min. It was found that the wear resistance of 11R3AM3F2 steel coatings on a 65G steel base obtained at the calculated values of the ECW mode is 7.3 times higher than that of normalized steel 45. Acknowledgments. We express our special gratitude to R. A. Latypov, Dr. Sci., Professor of Moscow Polytechnic University, Moscow, Russia, for help in the research.
References
1. Kononenko, A.S., Ignatkin, I.Y., Drozdov, A.V.: Recovering a reducing-gear shaft neck by reinforced-bush adhesion. Polym. Sci., Ser. D. 15(2), 137–142 (2022). https://doi.org/10.1134/S1995421222020113
2. Fanidi, O., Kostryukov, A.A., Shchedrin, A.V., Ignatkin, I.Y.: Comparison of analytical and experimental force in cylindrical workpiece drawing process. Tribol. Ind. 43(1), 1–11 (2021). https://doi.org/10.24874/ti.1000.11.20.02
3. Latypov, R.A., Serov, A.V., Serov, N.V.: Repair of radiator leaks by cold spraying. J. Phys.: Conf. Ser., Yalta, 17–20 May 2021. 012042 (2021). https://doi.org/10.1088/1742-6596/1967/1/012042
4. Skorokhodov, D., Krasnyashchikh, K., Kazantsev, S., Anisimov, A.: Theory and methods of means and modes selection of agricultural equipment spare part quality control. Eng. Rural Dev. Jelgava, 20–22 May 2020. 19, 1140–1146 (2020). https://doi.org/10.22616/erdev.2020.19.tf274
5. Erokhin, M., Kazantsev, S., Pastukhov, A., Golubev, I.: Theoretical basis of justification of electromechanical hardening modes of machine parts. Eng. Rural Dev. Jelgava, 20–22 May 2020. 19, 147–152 (2020). https://doi.org/10.22616/ERDev.2020.19.TF032
6. Altukhov, A.Y., Latypova, G.R., Andreeva, L.P.: Effect of the technological melting parameters of cobalt–chromium powders produced by electric discharge dispersion on the properties of the additive products made from them. Russ. Metall. (Met.). 2022(6), 694–698 (2022). https://doi.org/10.1134/S0036029522060040
7. Latypov, R.A., Serov, A.V., Ignatkin, I.Y., Serov, N.V.: Utilization of the wastes of mechanical engineering and metallurgy in the process of hardening and restoration of machine parts. Part 1. Metallurgist. 65(5–6), 578–585 (2021). https://doi.org/10.1007/s11015-021-01193-y
8. Latypov, R.A., Serov, A.V., Ignatkin, I.Y., Serov, N.V.: Utilization of the wastes of mechanical engineering and metallurgy in the process of hardening and restoration of machine parts. Part 2. Metallurgist. 65, 689–695 (2021). https://doi.org/10.1007/s11015-021-01206-w
9. Ageev, E.V., Podanov, V.O., Ageeva, A.E.: Microstructure and elemental composition of powders obtained under conditions of electroerosive metallurgy of heat-resistant nickel alloy ZhS6U wastes in water. Metallurgist. 66, 578–585 (2022). https://doi.org/10.1007/s11015-022-01362-7
10. Latypov, R., Serov, A., Serov, N., Chekha, O.: Technology of hardening plowshares by electrocontact welding using waste from tool production. Smart Innovation, Syst. Technol. 247, 197–203 (2022). https://doi.org/10.1007/978-981-16-3844-2_21
11. Burak, P.I., Serov, A.V., Latypov, R.A.: Optimization of the process of electric resistance welding of metallic strips through an amorphous solder. Weld. Int. 26(10), 814–818 (2012). https://doi.org/10.1080/09507116.2011.653168
12. Samarin, G.N., Vasilyev, A.N., Dorokhov, A.S., Mamahay, A.K., Shibanov, A.Y.: Optimization of power and economic indexes of a farm for the maintenance of cattle. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent Systems and Computing, vol. 1072. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4_66
13. Samarin, G.N., Vasilyev, A.N., Zhukov, A.A., Soloviev, S.V.: Optimization of microclimate parameters inside livestock buildings. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and Computing, vol. 866. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_35
14. Ignatkin, I., Kazantsev, S., Shevkun, N., Skorokhodov, D., Serov, N., Alipichev, A., Panchenko, V.: Developing and testing the air cooling system of a combined climate control unit used in pig farming. Agriculture 13, 334 (2023). https://doi.org/10.3390/agriculture13020334
Accelerated Growth and Development of Plants as a Result of Their Stimulation in the Impulsed Electric Field S. Vasilev1
, S. Mashkov1
, P. Ishkin1 , V. Syrkin1 and I. Yudaev2(B)
, M. Fatkhutdinov1
,
1 Samara State Agrarian University, Uchebnaya St., 2, 446442 Ust-Kinelskiy, Russia 2 Kuban State Agrarian University, Kalinina St. 13, 350044 Krasnodar, Russia
[email protected]
Abstract. Research is aimed at increasing energy saving in the cultivation of vegetable, green and aromatic crops in an environmentally friendly way. At present, there is a need to increase energy saving and intensify the production of vegetable products, while simultaneously improving their quality. The most affordable, environmentally friendly and energy-saving way to accelerate the cultivation of plants is electrical stimulation, that is, the impact on plants of an electric field of high intensity. The use of a pulsed electric field of high intensity accelerates metabolic processes in plant cells and photosynthesis. As a result, their accelerated growth, development and fruiting occur. The accelerated growth and development of plants leads to shorter growing times and thus higher energy savings per production cycle. In the conducted experimental studies, the field strength varied in the range from 10 to 50 kV/m. As a result of the research, it was found that the highest stimulation efficiency was achieved at an electric field strength of 30 kV/m. The increase in plant height was 32% compared to the control experiment. At the same time, the impact on plants was carried out only in the morning and evening, for 3 h, respectively. Keywords: Accelerated cultivation · Electrical technologies · Impulsed electric field
1 Introduction Increasing the yield of vegetable, green, berry and spice-aromatic crops grown in protected ground is usually carried out by using a high amount of minerals (fertilizers) and growth stimulants. However, increased chemicalization, when growing the above crops, inevitably and negatively affects both the environment and the quality of the products [1, 2]. In this regard, it became necessary to improve the technology of growing vegetable, green, berry and aromatic crops, which consists in the use of environmentally friendly methods of influencing plants and leading to an increase in the level of energy saving in the growing process, as well as the quality of the products obtained. The most promising © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 23–31, 2024. https://doi.org/10.1007/978-3-031-50327-6_3
is the use of a pulsed electric field of high intensity. The advantage of electrical stimulation (electrotechnology), in comparison with traditional methods, is its environmental friendliness and low energy costs [2, 4]. The purpose of the research, in this regard, is to increase energy saving and the quality of the products obtained, when growing vegetable, green, berry and spice-aromatic crops in protected ground, by using a high-tension pulsed electric field. Thus, the studies were aimed at establishing qualitative and quantitative causal relationships between the characteristics of the process of exposing plants to an electric field and accelerating their growth and development. Currently, there are many studies on the use of thermal, light, electromagnetic and other physical effects on plants in order to increase their productivity and product quality. In most cases, the conducted studies have a positive effect [1, 3, 6, 7]. Of the wide range of physical factors affecting plants, their greatest sensitivity is manifested to such as light, sound, electric, magnetic, and electromagnetic fields. It is likely that this is due to the fact that the above factors have been and remain natural components of the environment during the evolution of plants [5, 10]. Of the listed methods of influencing plants, the least complex, from a technical point of view, while the most energy-saving, are methods of influencing plants with electric, magnetic or electromagnetic fields. This study considers stimulation only by an electric field [2, 8]. However, for the successful application of the presented stimulation technologies, it is necessary to solve a number of problems that do not yet have an unambiguous solution and, therefore, limit the use of these methods. 
The main problems include: lack of clear and justified parameters of the electric or magnetic field (strength, frequency, shape of the function in time); lack of a reasonable value for the duration of stimulation, the duration of each stimulation cycle, the number of cycles and their distribution during the day; lack of justification for the alternation of stimulation and relaxation cycles; lack of substantiation of the direction of the electric and magnetic fields relative to the stimulated plants. Thus, in the course of continuing research, it is necessary to solve the problems presented above, as well as many related ones that are not presented in this paper. This will increase the level of energy saving in the process of growing vegetable products in greenhouses, as well as improve the quality of the products obtained by reducing the amount of mineral substances (fertilizers) and growth stimulants used.
2 Materials and Methods Research was carried out at the theoretical and experimental levels. This paper presents the results of only experimental studies. In theoretical studies, a constructive and electrical circuit of an installation for electrical stimulation of plants was developed, theoretical dependencies were obtained that make it possible to establish the relationship between the parameters of the electric field and the magnitude of the current passing through the body of the stimulated plant. At the experimental stage, it is necessary to test the
theoretical prerequisites and substantiate the optimal parameters of the electric field for stimulating vegetable green crops. In the course of the experiment, plants are exposed to a pulsed electric field of high intensity. To do this, the plants to be grown are located between two electrodes of different polarity, so that the direction of the electric field created between these electrodes coincides with the direction of plant growth. To do this, electrode 2 with a positive potential is located under the roots of plants, and electrode 1 with a negative potential is placed above the plants (Fig. 1). The electrode 2 can be placed directly in the soil 7, or below the soil container (or hydroponic plant tank). With a sufficient width of the lower electrode, plants 6 will be in a relatively uniform electric field [7]. The upper electrode is made in the form of a sufficiently thin wire so as not to shade the grown plants [3, 9].
Fig. 1. Plant electrical stimulation circuit: 1—upper electrode (with negative potential); 2—bottom electrode (with positive potential); 3—impulse voltage source; 4—traverse for attaching the upper electrode; 5—insulators; 6—stimulated plants; 7—the soil
The pulse frequency of the voltage applied to the electrodes in this experiment is 100 Hz, since the voltage is generated from the mains voltage by rectifying and amplifying it. In the future, the optimal frequency will have to be determined experimentally from the response of plants to particular frequency values [1, 3]. It is also possible to apply to the electrodes a voltage modulated according to any function. The amplitude of the voltage applied to the electrodes is determined by the distance between the electrodes h and the required electric field strength E in which the plants are located. These studies are aimed at determining the required field strength, which likewise depends on the response of plants to a particular value. Some researchers who conducted similar experiments recommend a field strength in the range from 10 to 50 kV/m [2, 5]; the same interval is used in these studies. The task of the research is to establish experimentally the dependence of the efficiency of plant growth on the parameters of the electric field (field strength).
Five field strength options were set in the studies: B1 = 10 kV/m, B2 = 20 kV/m, B3 = 30 kV/m, B4 = 40 kV/m, B5 = 50 kV/m, plus the control K = 0 kV/m (no voltage was applied in the control). The distance between the electrodes is determined by the design of the laboratory research facility and is 40 cm (0.4 m) for all options. Thus, the voltages required to create the corresponding field strengths are 4 kV, 8 kV, 12 kV, 16 kV and 20 kV, and 0 kV for the control. The wide range of field strength values is explained by the need to identify the effective field strength according to the criterion of plant responsiveness to stimulation, since this value was unknown before the start of the experiment. In accordance with the established methodology for such studies, each variant of the studied factor, including the control, was replicated four times. The arrangement of repetitions by variants is presented in Table 1.

Table 1. Scheme of arrangement of plants by replicates and variants

Repetitions | Factor options
            | B5 |   | B4 |   | B3 |   | B2 |   | K  |   | B1
1           | U  | X | U  | X | U  | X | U  | X | U  | X | U
2           | U  | X | U  | X | U  | X | U  | X | U  | X | U
3           | U  | X | U  | X | U  | X | U  | X | U  | X | U
4           | U  | X | U  | X | U  | X | U  | X | U  | X | U
Options B5–B1 are arranged in descending order of the electric field strength. This is done to reduce the possible influence of the field on the control variant K. That is, so that the control variant is located near the electrode with the lowest potential. To reduce the mutual influence of the electric fields of neighboring options, a free space is provided between them, marked with an “X” sign (Table 1). During the experiment, electrical stimulation of plants was carried out in the morning and evening. The duration of stimulation was 3 h, respectively, from 6.00 to 9.00 in the morning and from 16.00 to 19.00 in the evening. The total daily duration was 6 h. The duration of the experiment was 50 days. Shoots began to appear on the 9–12th day. At the end of the experiment, all plants were cut at the soil level, measured by the height of the aerial part and weight.
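The electrode voltages and the daily exposure time quoted above follow from simple arithmetic. The sketch below assumes a uniform field between the electrodes, so that the voltage is the field strength multiplied by the electrode gap; the names are hypothetical and serve only to illustrate the calculation.

```python
# Field strength options from the text, kV/m (K is the control)
FIELD_STRENGTHS_KV_PER_M = {"B1": 10, "B2": 20, "B3": 30, "B4": 40, "B5": 50, "K": 0}
ELECTRODE_GAP_M = 0.4  # distance between the electrodes, m

def electrode_voltage_kv(field_kv_per_m, gap_m):
    """U = E * h, assuming a uniform field between the two electrodes."""
    return field_kv_per_m * gap_m

voltages = {
    option: electrode_voltage_kv(field, ELECTRODE_GAP_M)
    for option, field in FIELD_STRENGTHS_KV_PER_M.items()
}

# Stimulation windows from the text: 6.00-9.00 and 16.00-19.00
daily_hours = (9 - 6) + (19 - 16)

for option, u in voltages.items():
    print(f"{option}: {u:.0f} kV")
print(f"daily stimulation: {daily_hours} h")
```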
3 Results and Discussion The results of measuring the height of the aerial parts of plants are presented by variant in Table 2, and the results of the analysis of the experimental data are presented in Table 3. The repetitions of each variant are groups of plants grown in one section (Fig. 1). Accordingly, each variant has four repetitions, with three plants in each repetition.
Table 2. Plant height measurement results

Option | Replication number | Plant 1, mm | Plant 2, mm | Plant 3, mm | Average height over repetitions, mm | Average height by options, mm
B5 | 1 | 289 | 262 | 292 | 281   | 292.3
B5 | 2 | 296 | 292 | 298 | 295.3 |
B5 | 3 | 284 | 294 | 302 | 293.3 |
B5 | 4 | 302 | 287 | 310 | 299.7 |
B4 | 1 | 301 | 316 | 312 | 309.7 | 310.2
B4 | 2 | 314 | 310 | 295 | 306.3 |
B4 | 3 | 298 | 306 | 315 | 306.3 |
B4 | 4 | 342 | 301 | 312 | 318.3 |
B3 | 1 | 300 | 321 | 309 | 310   | 315.3
B3 | 2 | 332 | 320 | 311 | 321   |
B3 | 3 | 299 | 315 | 307 | 307   |
B3 | 4 | 338 | 306 | 325 | 323   |
B2 | 1 | 265 | 259 | 251 | 262   | 264.4
B2 | 2 | 272 | 248 | 256 | 260   |
B2 | 3 | 281 | 277 | 262 | 279   |
B2 | 4 | 278 | 263 | 261 | 270.5 |
B1 | 1 | 240 | 262 | 251 | 251   | 256.8
B1 | 2 | 236 | 278 | 265 | 259.7 |
B1 | 3 | 271 | 277 | 267 | 271.7 |
B1 | 4 | 229 | 248 | 258 | 245   |
K  | 1 | 260 | 245 | 249 | 251.3 | 256.4
K  | 2 | 277 | 278 | 248 | 267.7 |
K  | 3 | 252 | 251 | 253 | 252   |
K  | 4 | 248 | 254 | 262 | 254.7 |
S. Vasilev et al.
The data obtained from the plant height measurements were subjected to brief mathematical and statistical processing to determine the main coefficients that characterize the accuracy and reliability of the studies. The results of the analysis of the experimental data are presented in Table 3. From the analysis of the data (Table 3), it can be seen that the coefficient of variation for the second and fourth repetitions exceeds 5%.

Option B1 was carried out with the lowest field strength. The height of the plants is comparable with the control, that is, with non-stimulated plants. However, the coefficient of variation is much higher than in the control (υB1 > υK, 6.28% versus 4.28%). That is, stimulation by a low-strength electric field (10 kV/m) has no effect or, as in this case, a negative effect: no increase in height is observed, but the variation increases.

The data obtained for option B2 indicate that the average height of the aerial parts of plants is 264.7 mm, which exceeds the control by 3%, which is not significant. That is, stimulation with a field strength of 20 kV/m does not give a significant positive effect. The coefficient of variation for option B2 is 4.02%, which is less than in the control. That is, the plants in this option are more even in height, which already indicates a positive effect of stimulation.

The average height of the above-ground part for option B3 is 315.25 mm, which significantly exceeds the control, by 58.55 mm (22.8%). That is, stimulation with a field strength of 30 kV/m has the greatest positive effect on the growth rate of the aerial parts of plants. The increase in the growth rate in this variant is also accompanied by a decrease in the coefficient of variation, which has the lowest value in the experiment, 3.89%, approximately 0.4 percentage points lower than in the control.
That is, stimulation by a high-intensity pulsed electric field makes it possible to better align the plants in height, as well as to obtain a greater increase in biomass.

The average height of the aerial part for option B4 is 310.2 mm, which also significantly exceeds the control, by 53.5 mm (20.8%); however, the results are slightly worse than in option B3. That is, stimulation with a field strength of 40 kV/m has a fairly good effect on the growth rate of the aerial parts of plants. As in variant B3, the increase in the growth rate is accompanied by a decrease in the coefficient of variation, which has a comparable value of 3.96%, approximately 0.3 percentage points lower than in the control.

The average height of the aerial part for option B5 is 292.3 mm, which also significantly exceeds the control, by 35.6 mm (13.9%); however, the results are worse than in options B3 and B4. That is, although stimulation with a field strength of 50 kV/m has a fairly good effect on the growth rate of the aerial parts of plants, there is an obvious tendency toward a depressing effect of an excessively high electric field. The coefficient of variation in this variant is 4.1%, which is higher than in options B3 and B4 and about 0.18 percentage points lower than in the control.

In the control experiment (option K), there is a significant spread in the coefficient of variation over repetitions, from 0.39% to 6.36%. The reason for this is not yet clear, because all repetitions of this option, as well as of the other options, were located next to each other. It is possible that the absence of an external electric field adversely affects the evenness of plants.
Table 3. Results of the analysis of experimental data

Option | Replication number | Standard deviation of replicates σp, mm | Coefficient of variation for replicates υ, % | Standard deviation by options σ, mm | Coefficient of variation by options υ, %
B5 | 1 | 16.50 | 5.90 | 11.97 | 4.10
B5 | 2 | 3.10  | 1.00 |       |
B5 | 3 | 9.00  | 3.10 |       |
B5 | 4 | 11.70 | 3.90 |       |
B4 | 1 | 7.80  | 2.50 | 12.27 | 3.96
B4 | 2 | 10.00 | 3.30 |       |
B4 | 3 | 8.50  | 2.80 |       |
B4 | 4 | 21.20 | 6.70 |       |
B3 | 1 | 10.50 | 3.40 | 12.27 | 3.89
B3 | 2 | 10.50 | 3.30 |       |
B3 | 3 | 8.00  | 2.60 |       |
B3 | 4 | 16.10 | 5.00 |       |
B2 | 1 | 7.00  | 2.70 | 10.64 | 4.02
B2 | 2 | 12.20 | 4.70 |       |
B2 | 3 | 10.00 | 3.60 |       |
B2 | 4 | 9.30  | 3.40 |       |
B1 | 1 | 11.00 | 4.40 | 16.12 | 6.28
B1 | 2 | 21.50 | 8.30 |       |
B1 | 3 | 5.00  | 1.90 |       |
B1 | 4 | 14.70 | 6.00 |       |
K  | 1 | 7.80  | 3.10 | 10.98 | 4.28
K  | 2 | 17.0  | 6.40 |       |
K  | 3 | 1.0   | 0.40 |       |
K  | 4 | 7.0   | 2.80 |       |
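As a check on how the replicate-level statistics relate to the raw heights, the following sketch recomputes the mean, standard deviation and coefficient of variation for replicate 1 of option B5 from the three heights in Table 2. It assumes the sample standard deviation (divisor n − 1). The text does not state which formula was used, but this choice reproduces the tabulated values for this replicate.

```python
import statistics

# Heights (mm) of the three plants in replicate 1 of option B5, from Table 2.
heights_mm = [289, 262, 292]

mean_mm = statistics.mean(heights_mm)
sd_mm = statistics.stdev(heights_mm)   # sample standard deviation (n - 1), an assumption
cv_percent = 100 * sd_mm / mean_mm     # coefficient of variation in percent

print(f"mean = {mean_mm:.1f} mm, sd = {sd_mm:.1f} mm, cv = {cv_percent:.1f} %")
```

The standard deviation comes out in millimetres, and only the coefficient of variation is a percentage.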
For a visual presentation of the research results and analysis of the data obtained, a diagram was constructed that displays the values of the coefficients of variation ν and standard deviations σ for the options (Fig. 2). For clarity, the diagram also shows the average height of plants by options. The variants under study are located on the diagram in the same order in which they were physically located during the experiment (Fig. 1), that is, in descending order of electric field strength. The diagram shows that the optimal field strength, which gives the best stimulating effect, is 30 kV/m. A further increase in the field strength causes a certain depressing effect: the height of plants decreases, and the coefficient of variation, which characterizes the evenness of plants in height relative to each other, increases.

Fig. 2. Diagram of the distribution of the coefficients of variation ν and standard deviations σ by options (axes: options; plant height h, mm; coefficient of variation ν and standard deviation σ)
4 Conclusion Energy saving in technologies for growing vegetable, green, berry and aromatic crops under controlled conditions (in protected soil) can be increased in environmentally friendly ways by using electrical technology, namely, by exposing plants to a high-intensity pulsed electric field. Plants located in a pulsed electric field begin to interact with it. The nature of the interaction is determined by the nature of the field, the frequency of its pulsation and its intensity, as well as by the physical characteristics of the plant itself, its conductivity and dielectric constant. The experimental studies confirmed this theoretical assumption. In option B3, the field strength is 30 kV/m. It is at this value that the best effect of stimulation is observed. The average height of the above-ground part is 315.25 mm, which significantly exceeds the control, by 22.8%. At the same time, the increase in the growth rate is accompanied by a decrease in the coefficient of variation to 3.89%, which is approximately 0.4 percentage points lower than in the control. That is, stimulating plants with a high-intensity pulsed electric field makes it possible not only to increase the growth rate of plants, but also to level them in height, which is fundamentally important in the production of green vegetable products for commercial purposes. Thus, exposure of plants to a high-intensity electric field with substantiated optimal characteristics contributes to an increase in the growth rate of plants, which reduces the time of their cultivation in one cycle, from sowing to harvesting. This makes it possible to increase energy saving in technologies for growing vegetable, green, berry and spice-aromatic crops in protected ground.
A Deep Reinforcement Learning Framework for Reducing Energy Consumption of Server Cooling System
Abdullah Al Munem1, Md. Shafayat Hossain1, Rizvee Hassan Prito1, Rashedul Amin Tuhin1, Ahmed Wasif Reza1(B), and Mohammad Shamsul Arefin2(B)
1 Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh {mcctuhin,wasif}@ewubd.edu
2 Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chattogram, Bangladesh [email protected]
Abstract. Data centers consume a tremendous amount of energy for cooling the servers. The cooling system of a data center accounts for around 40–55% of its total energy consumption. Thus, the energy consumption of the server cooling system must be reduced to minimize the electricity cost. In this experiment, a general framework for cooling systems integrated with a reinforcement learning agent is proposed to reduce the cooling cost. The proposed model was developed with the Deep Q-Learning algorithm and was trained and evaluated in a simulated environment. A traditional threshold-based cooling system was evaluated in the same environment to determine the efficiency of the proposed framework. The proposed framework reduced energy consumption by 21% over 36 months compared to the threshold-based cooling system.
Keywords: Energy consumption · Datacenter · Server cooling · Reinforcement learning · Deep Q-learning
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 32–42, 2024. https://doi.org/10.1007/978-3-031-50327-6_4

1 Introduction The current world is increasingly digitized and runs on data centers. From artificial intelligence and cloud computing to websites and databases, all are hosted in data centers. Since data centers contain many devices for processing, storing, and serving data, a massive heat load is produced, so cooling systems are essential for any data center [1]. Cooling requires 40–55% of the total data center energy. Reinforcement learning (RL) can be used to find more efficient cooling methods to reduce costs. Data center cooling system controllers follow preset settings or straightforward, conventional, and sometimes hand-tuned configurations specific to particular data center layouts [2]. Managing these controllers for different situations is tedious and sometimes requires human interaction. This can be automated by using RL to find optimal adjustments of these controllers. For example, setting a desired value for a control variable such as the temperature of an air conditioner makes the conditioner run with a certain amount of power [3].

In this experiment, a general deep RL framework, combining deep learning (DL) and reinforcement learning, is proposed to reduce the energy consumption of data center cooling systems [4]. DL-based models are now used for solving problems in several domains [5]. DL uses multiple artificial neural network layers, loosely modeled on the structure of the human brain, to help machines gain knowledge as humans do [6, 7]. In RL, an agent is reinforced to learn a policy that executes the best actions based on the experience it has gathered [8]. The model acts as an agent that receives data from different data center sensors, learns from the environment, and makes safe, optimal decisions to adjust these controllers in different situations while satisfying a set of safety constraints. The RL model in this experiment is implemented using a deep Q-network, which combines deep neural networks and Q-learning. In a deep Q-network, a neural network maps state-action pairs to their corresponding Q-values. The network is initialized, actions are chosen using the epsilon-greedy exploration strategy, and the network weights are updated using the Bellman equation. The agent then picks the action with the best Q-value [9, 10].

The cooling cost of a data center is its most dominant expense [11]. Servers host websites, databases, and other information that users request, so data is constantly transmitted to and from these servers. The combination of these factors affects the surface temperature of the servers. Therefore, an integrated cooling system must automatically regulate the surface temperature by always bringing it into the optimal range. Energy consumption is correlated with surface temperature. This project focuses on developing a cooling system integrated with a deep RL model that can minimize the energy consumption of a server compared to the traditional rule-based cooling system. The objectives of this project are given below:
1. To design a Deep Q-Network architecture for the Reinforcement Learning model.
2. To construct an environment for the learning phase.
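The two ingredients of deep Q-learning named in the introduction, epsilon-greedy exploration and the Bellman update, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the Q-values are made-up numbers standing in for a network's output, and the discount factor (0.9) and epsilon (0.3) are the values the paper later lists in its hyperparameter table.

```python
import random

GAMMA = 0.9     # discount factor (value reported in the paper's hyperparameter table)
EPSILON = 0.3   # exploration probability (30% exploration)


def epsilon_greedy(q_values, epsilon=EPSILON, rng=random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def bellman_target(reward, next_q_values, gamma=GAMMA):
    """Q-learning target from the Bellman equation: r + gamma * max_a' Q(s', a')."""
    return reward + gamma * max(next_q_values)


# Made-up Q-values for five actions in the current and next state.
q_current = [0.1, 0.5, -0.2, 0.0, 0.3]
q_next = [0.2, 0.4, 0.1, 0.0, 0.3]

action = epsilon_greedy(q_current, epsilon=0.0)   # epsilon = 0 forces the greedy choice
target = bellman_target(reward=1.0, next_q_values=q_next)
```

In a deep Q-network the network's prediction for the chosen action is regressed toward `target`, for example with a mean squared error loss.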
2 Related Work Reinforcement learning has been used for reducing energy consumption in many data centers including Google. In July 2016, DeepMind AI reduced the cooling bill of Google’s data center by 40% [12]. The environment of each data center is different from others. They developed a general intelligence framework using reinforcement learning that is applicable to every environment. The agent of this framework can interact with the environment to take optimal action and adapt to the changes in the environment. A similar experiment has been done to optimize the energy consumption of a data center in some other relevant works. The authors of the paper [2] described that the application of reinforcement learning (RL) algorithms to regulate temperatures can reduce a significant amount of energy consumption in a data center. They demonstrated that an RL agent can effectively regulate the temperatures of the server room after a few hours of exploration. Yuanlong et al. [3]
proposed an end-to-end cooling control algorithm (CCA) based on the deep deterministic policy gradient (DDPG) algorithm. They trained the RL agent on the EnergyPlus simulation platform and achieved an average of 11% cooling cost savings. They also trained this model at the National SuperComputing Centre (NSCC) of Singapore and achieved an average of 15% cooling energy savings when the temperature threshold was set at 26.6 degrees Celsius. Qingchen et al. [13] proposed a reinforcement learning model based on deep Q-learning for an energy-efficient scheduling scheme. To learn the Q-value, they used an auto-encoder instead of a Q-function. They evaluated this model on several simulated environments. The results demonstrated that this method could save 4.2% more energy than hybrid dynamic voltage and frequency scaling (DVFS) scheduling based on Q-learning (QL-HDS). Reducing energy consumption for an edge computing device is a challenging problem. Another paper by Qingchen et al. [14] proposed a deep Q-learning model with multiple DVFS algorithms for energy-efficient scheduling (DQL-EES). The DQL-EES model has some limitations regarding stability and distinguishing the continuous system state. Thus, they modified this model and proposed a new model called the double deep Q-learning model for energy-efficient edge scheduling (DDQ-EES). This model was trained on EdgeCloudSim and saved on average 2–2.4% energy. Yongyi et al. [15] proposed a Deep Reinforcement Learning (DRL) framework called DeepEE to reduce the energy consumption of data centers (DC) by considering the cooling systems and information technology (IT) concurrently. In this framework, they proposed a Parameterized action space-based Deep Q-Network (PADQN) algorithm and a two-time-scale control mechanism to coordinate cooling systems and IT more effectively. They demonstrated that this approach could save up to 15% of energy consumption in a DC. Jianxiong et al.
[16] suggested a DRL model that takes safety as its top priority in decision-making. They proposed an actor-critic Model-Based RL (MBRL) model called SafeCool, which comprises a risk model to calculate the potential risk of a certain action and a transition model to predict the future state of the environment. In real-world simulations on OCA and ECA systems, the SafeCool MBRL algorithm saved 13.18% cooling power, reduced thermal violations by 48%, and learned the optimal cooling policies faster than state-of-the-art MBRL. Ruihang et al. [17] proposed a safety-aware deep reinforcement learning (DRL) framework for a single-hall data center. It implements offline imitation learning and online post-hoc rectification to remove the risk of thermal unsafety that occurs in online DRL. The main goal of post-hoc rectification is to make the smallest adjustment to the actions suggested by the DRL agent that prevents unsafe states from occurring. The rectification is built on a thermal state transition model, which can predict the transitions to unsafe states caused by the DRL agent's actions. The proposed model reduced safety violations by 94.5 to 99% compared to reward shaping and saved 22.7 to 26.6% of the overall power of the data center compared to conventional regulation. RL has also been applied to home energy management. Paulo et al. [18] achieved 8% energy savings compared to rule-based algorithms by applying a DRL algorithm.
To conclude, RL models can reduce a significant amount of energy consumption compared to rule-based programs. Hence, an experimental study was conducted to find an efficient approach to reducing the energy consumption of a DC in a generic environment. Guidelines and suggestions on related issues can be found in [19–28].
3 Methodology In this experiment, a Deep Q-Learning model is developed that takes inputs from the environment and performs actions to reduce the energy consumption of the server cooling system. A custom environment has been created to simulate the model's performance. A server with a threshold-based cooling system and a server with an AI-based cooling system run simultaneously so that their energy consumption can be compared. The performance of both cooling systems has been evaluated over 36 months to determine the reduction in energy consumption. The literature review indicated that reinforcement learning is the most suitable method for this experiment, since the model can learn from the environment. A deep Q-learning algorithm has been used because deep learning-based RL models can work efficiently in a large environment with many diverse states. A stochastic policy has been used instead of a deterministic policy so that the agent can both explore and exploit the environment. Figure 1 shows the workflow of this experiment.
Fig. 1. Working flow of this experiment.
3.1 Environment Creation To simulate the AI-based cooling system, an environment has been created in this experiment. The environment consists of multiple parameters. Table 1 shows the selected values for the parameters. The RL agent takes input from this environment and performs an action on it. The action changes the state of the environment, and the parameters are updated after each action. Some important variables have been defined in this environment: (1) the temperature of a server at a specific time, (2) the number of users in the server at a specific time, (3) the rate of data transmission of the server at a specific time, (4) energy consumption by the AI-integrated cooling system, and (5) energy consumption by the traditional threshold-based cooling system. Two separate simulations have been performed simultaneously in this environment, one for the AI-integrated cooling system and the other for the threshold-based cooling system. The temperature of a server is calculated separately for the two types of cooling system. The temperature of the server depends on three parameters: atmospheric temperature, number of users, and rate of data transmission. Equation (1) shows the formula for calculating the server temperature. The values of the coefficients are estimated using multiple linear regression. The number of users and the rate of data transmission are assigned randomly within an optimal range. Since this experiment focuses on minimizing energy consumption using an AI-integrated cooling system, it has been assumed that the energy consumption of both cooling systems is proportional to the absolute temperature change between two consecutive time steps. Equation (2) shows the formula for calculating energy consumption. Since both cooling systems have been evaluated in the same environment, it is assumed that if the AI-integrated cooling system outperforms the threshold-based cooling system here, it will also work in a real-world scenario.

$$T = b_0 + b_1 \times a_t + b_2 \times n_u + b_3 \times r_d \tag{1}$$

here, $T$ = server temperature, $a_t$ = atmospheric temperature, $n_u$ = number of users, $r_d$ = rate of data transmission

$$E_t = |T_{t+1} - T_t| \tag{2}$$

here, $E_t$ = energy spent at time $t$, $T_t$ = server temperature at time $t$

Table 1. The selected values for the environment's parameters

Parameters | Values
Average atmospheric temperature over a month (January to December in Dhaka, Bangladesh) | [20.65, 23.6, 27.65, 30.85, 31.2, 30.65, 29.7, 29.55, 29.05, 27.55, 24.85, 21.7] (Celsius)
Optimal range of server temperatures | [18 °C, 24 °C]
Minimum temperature | 5 °C
Maximum temperature | 50 °C
Minimum number of users | 10
Maximum number of users | 100
Number of users that can be changed (up or down) per minute | 5
Minimum rate of data transmission | 20 Mb/minute
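Equations (1) and (2) can be expressed as a short sketch of one simulation step. The regression coefficients are not reported in the paper, so the values in `COEFFS` below are placeholders chosen only to make the example run. The user update rule in `next_users` is likewise an assumed reading of the "up or down by 5 per minute" entry in Table 1. All function names are hypothetical.

```python
import random

# Placeholder coefficients (b0, b1, b2, b3). The paper estimates them by
# multiple linear regression but does not report the values.
COEFFS = (0.0, 1.0, 0.05, 0.02)

MIN_USERS, MAX_USERS, MAX_USER_STEP = 10, 100, 5   # from Table 1


def server_temperature(at, nu, rd, coeffs=COEFFS):
    """Eq. (1): T = b0 + b1*at + b2*nu + b3*rd."""
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * at + b2 * nu + b3 * rd


def energy_spent(t_now, t_next):
    """Eq. (2): energy is the absolute temperature change between two steps."""
    return abs(t_next - t_now)


def next_users(nu, rng=random):
    """Assumed rule: users change by at most 5 per minute, kept within [10, 100]."""
    step = rng.randint(-MAX_USER_STEP, MAX_USER_STEP)
    return max(MIN_USERS, min(MAX_USERS, nu + step))


at = 20.65          # January average atmospheric temperature, from Table 1
nu, rd = 50, 20     # example number of users and data rate (Mb/minute)
t_now = server_temperature(at, nu, rd)
nu = next_users(nu)
t_next = server_temperature(at, nu, rd)
print(f"energy spent this step: {energy_spent(t_now, t_next):.3f}")
```

Because the energy in Eq. (2) depends only on temperature changes, a controller that keeps the temperature steady inside the optimal range is the one rewarded by this model.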
3.2 Deep Q-Network (DQN) The DQN (Fig. 2) takes three inputs (server temperature, number of users, and rate of data transmission) from the environment and gives the optimal action as output. In this experiment, the agent can perform five types of action based on the inputs: increasing the temperature by 1.5 °C or 3 °C, decreasing the temperature by 1.5 °C or 3 °C, and not changing the current temperature. The neural network part of this DQN is a stack of multiple dense layers. The activation function for the hidden layers is sigmoid. For the output layer, the SoftMax activation function has been used, which gives the probability of taking each action for the input state. The output of this network is the Q-value of each action for the input state. The agent selects the action with the highest Q-value and performs it on the environment. After the action is taken, the state of the environment is updated. The agent gets a reward from the environment based on the action. The reward is calculated by subtracting the energy consumption of the agent-based cooling system from that of the threshold-based cooling system. Thus, if the energy consumption of the agent-based cooling system is lower than that of the threshold-based cooling system, the agent gets a positive reward; otherwise, it gets a negative reward. The goal of the agent is to maximize the reward by performing optimal actions. Mean squared error (MSE) has been used as the loss function of this neural network. The current (input) state, the performed action, the reward, and the next (updated) state are saved in the replay memory. The experience replay technique has been used in the training phase of this DQN to improve sample efficiency. If the model takes consecutive samples from the states, the samples will be highly correlated, which leads to inefficient learning. If the model instead takes random samples from the replay memory, the correlation between consecutive inputs is broken and the model performance improves. The selected hyperparameters of this model are shown in Table 2.

Fig. 2. Interaction of the RL agent with the environment.

3.3 Research Ethics This experiment has no ethical issues in terms of psychological, social, physical, or legal standards, since it has been conducted in a simulated environment.
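The reward rule and the experience replay described in Sect. 3.2 can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the authors' code: the ordering of the five actions in `ACTIONS_C`, the memory capacity, and the class and function names are all hypothetical.

```python
import random
from collections import deque

# Temperature changes (deg C) for the five actions in the text; the index
# order is an arbitrary choice made for this sketch.
ACTIONS_C = [-3.0, -1.5, 0.0, 1.5, 3.0]


def reward(energy_threshold, energy_agent):
    """Positive when the agent spends less energy than the threshold-based system."""
    return energy_threshold - energy_agent


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped first

    def push(self, state, action, reward_value, next_state):
        self.buffer.append((state, action, reward_value, next_state))

    def sample(self, batch_size, rng=random):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=1000)           # capacity is a hypothetical value
for step in range(10):
    state = (22.0 + step, 50, 20)              # (temperature, users, data rate)
    next_state = (22.0 + step + ACTIONS_C[1], 50, 20)
    memory.push(state, 1, reward(3.0, 1.5), next_state)

batch = memory.sample(4)
```

During training, each sampled batch would be used to compute Q-learning targets and update the network with the MSE loss.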
4 Experimental Result and Discussion Both the AI-integrated cooling system and the threshold-based cooling system have been evaluated during the training phase of the RL model. For each epoch, the energy consumption of both systems has been calculated using Eq. (2). The evaluation during training is shown in Fig. 3, where the red line is for the threshold-based cooling system and the green line is for the AI-integrated cooling system. This figure shows that over time (epochs) the AI-integrated cooling system consumed less energy than the threshold-based cooling system.

Table 2. The selected values of the hyperparameters

Hyperparameters | Value(s)
Epoch | 36
Batch size | 256
Discount factor | 0.9
Epsilon | 0.3 (30% exploration)
Activation, hidden layers | Sigmoid
Activation, output layer | SoftMax
Loss | MSE
Optimizer | Adam
Fig. 3. Performance of the AI-integrated cooling system and the threshold-based cooling system during the training phase.
Each epoch represents a time period of one month. In each one-minute interval, the model takes inputs from the environment and performs an action to regulate the cooling system. In each month, the model therefore takes input from the environment 30 (days) * 24 (hours) * 60 (minutes) times. Figure 3 shows that after 36 epochs (months) the AI agent learns to minimize energy consumption compared to the traditional cooling system. After the training phase, the RL model-based cooling system and the threshold-based cooling system have been simulated for a duration of 36 months. Table 3 shows the result of this evaluation. The total energy consumption of a server is 131736 units with the AI-integrated cooling system and 166320 units with the threshold-based cooling system. The AI-integrated cooling system thus reduces server energy consumption by about 21% over the 36 months.

Table 3. Experimental result: total energy consumption of the two cooling systems over the 36-month evaluation

Cooling system | Energy consumption
AI-integrated cooling system | 131736 units
Threshold-based cooling system | 166320 units
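The reported saving can be recomputed directly from the two totals in Table 3, and the number of agent decisions per epoch from the one-minute interval. The sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# One decision per minute over a 30-day month.
STEPS_PER_EPOCH = 30 * 24 * 60

# Total energy over the 36-month evaluation, from Table 3 (arbitrary units).
energy_ai = 131736
energy_threshold = 166320

saving_percent = 100 * (energy_threshold - energy_ai) / energy_threshold

print(f"steps per epoch: {STEPS_PER_EPOCH}")
print(f"energy saving: {saving_percent:.1f} %")
```

The saving evaluates to roughly 20.8%, which the paper rounds to 21%.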
Table 4 compares the proposed framework with previous works. Although those models were trained and evaluated in different environments, they all used an RL model to reduce energy consumption. As Table 4 shows, DeepMind AI reduced the cooling bill of Google's data center by 40%. The other models also reduced a significant amount of energy consumption by using RL. The proposed framework of this experiment also reduces a significant amount of energy consumption compared to the threshold-based cooling system.

Table 4. Comparison of the reduction in energy consumption between the proposed framework and previous works

Previous work | Method | Percentage (%) | Environment
[12] | DeepMind AI (RL) | 40 | Google data center
[3] | DDPG algorithm (RL) | 11 | EnergyPlus
[3] | DDPG algorithm (RL) | 15 | National super computing centre
[13] | Deep Q-learning algorithm (RL) | 4.2 | Simulated environment
[15] | Parameterized action space-based deep Q-network algorithm (RL) | 10–15 | Simulated environment
The proposed framework | Deep Q-learning algorithm (RL) | 21 | Simulated environment
Since the values of the parameters for the simulated environment have been assigned randomly in a limited range and a smaller number of variables has been considered for the input states, this model performed well from the beginning of the training phase. The model will take more time during training in a real-world scenario with more complicated environments which have a large number of parameters and variables. Since the model outperformed the threshold-based cooling system, it can be concluded that AI-based cooling systems can reduce the energy consumption of a server and eventually of a data center.
5 Conclusion and Future Work In this paper, a general framework for an AI-integrated server cooling system has been proposed to reduce the energy consumption of a server, which can eventually reduce the energy consumption of a data center. A traditional threshold-based cooling system and the proposed cooling system have been simulated simultaneously to compare their energy consumption within the same environment. The experimental result shows that the proposed AI-integrated cooling system reduces the energy consumption of a server by 21%. This percentage can vary depending on the environment. Since the proposed framework outperformed the threshold-based cooling system, it has been concluded that integrating RL agents into the environment can reduce the cooling cost of a data center. In this experiment, a general framework has been trained and evaluated in simulation. In the future, the proposed framework can be trained and evaluated in a real-world data center with more complex parameters. In this experiment, only the server temperature has been considered when calculating the energy consumption of a server. When the proposed framework is tested in a real-world data center, other factors can be included in the calculation of energy consumption. The number of inputs to the neural network can also be increased depending on the environment.
Far North: Optimizing Heating Costs

I. Yu. Ignatkin1, N. A. Shevkun1(B), A. S. Kononenko2, V. Ryabchikova1, and V. Panchenko3,4

1 Russian State Agrarian University – Moscow Timiryazev Agricultural Academy, Moscow 127550, Russia
{ignatkin,energo-shevkun}@rgau-msha.ru
2 Bauman Moscow State Technical University, Moscow 105005, Russia
3 Russian University of Transport, Obraztsova St. 9, 127994 Moscow, Russia
4 Federal Scientific Agroengineering Center VIM, 1st Institutsky Passage 5, 109428 Moscow, Russia
Abstract. Pork production in the Far North is expensive. Heating of production facilities is a significant cost item because the heating period is long. The expansion of the production capacity of Farm Enterprise “Sibir” required a review and optimization of the cost of heating the production facilities. To this end, measures for optimizing heating costs were analyzed, and it was decided to apply a heat recovery system. Technical and economic indicators were calculated to justify the decision. Based on the results, a heat recovery system with two heat generators and four recuperative units operating in a series shunt network was chosen. The theoretical and practical data obtained from implementing the system show an annual economic benefit of 477.1 thousand rubles, with the possibility of saving up to 89.2% of energy resources. The payback period is 2.5 years.

Keywords: Ventilation · Indoor climate · Heat recovery · Pig breeding · Indoor climate system · Heating and ventilation system · Energy saving
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 43–50, 2024. https://doi.org/10.1007/978-3-031-50327-6_5

1 Introduction

The expansion of the Farm Enterprise “Sibir”, located in Yakutsk, required optimization of the cost of heating the production facilities. Heating of production facilities accounts for a major part of the cost of finished products [1]. This is particularly noticeable in the Far North, where pork production requires significant heating costs [2], so reducing them is important. Within the framework of the established regulations, the indoor climate parameters, including the indoor air temperature, have a significant influence on animal productivity [3]. Deviation from the specified indoor climate values leads to a decrease in animal productivity by 20–30% [4], a shortening of the productive life of breeding animals by 15–20%, an increase in piglet mortality by 5–40%, a risk of respiratory diseases in the animals, and an increase in the fattening period and in feed costs per unit of production.
In an effort to save money on heating, or because of insufficient heating capacity, pork producers reduce the air exchange below the required level during the heating period [3]. This inevitably leads to a deterioration of the climate in the barn, which negatively affects animal productivity [4, 5] and, ultimately, production costs. To solve this problem, it was necessary to analyze the available options for optimizing heating costs and to carry out a feasibility study. The desired result can be achieved in two ways: either by increasing the heat production capacity, which requires significant capital expenditure, or by modernizing the indoor climate system of the production facilities. One way to save on heating is to recover heat from the exhaust air [6, 7]. This can be achieved by using air-to-air heat exchangers [8] of various types: tubular, plate [9, 10], and spiral [11, 12]. In addition to the hardware components [13], automated control systems [14] are important for maintaining an indoor climate with the required parameters [15].

The major Western companies offering exhaust air heat recovery systems are concentrated in Germany, the Netherlands, France and other European countries with a relatively mild climate. The heat recovery units (HRUs) they offer are designed for ambient temperatures not lower than minus 15–20 °C. Using these HRUs, with a heat recovery efficiency of up to 0.4–0.5, in the extreme conditions of the Far North would heat the supply air from minus 52 °C to only minus 17–25 °C. This temperature range is far too low for air directed into the animals' living area, as it can lead to respiratory diseases. Therefore, additional heating is required. In addition, the purchase and operating costs of imported equipment are extremely high [16] and are ultimately included in the production cost [17].
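As a rough check of the quoted range, assume balanced supply and exhaust flows and an exhaust (indoor) air temperature of $t_{ex} = 18$ °C. Both are assumptions made here for illustration, not values stated for the imported units. The supply air temperature after a single heat exchanger with efficiency $\eta$ is then

$$t_{su} = t_a + \eta\,(t_{ex} - t_a),$$

which for $t_a = -52$ °C gives

$$\eta = 0.4:\ t_{su} = -52 + 0.4 \cdot 70 = -24\ ^{\circ}\mathrm{C}, \qquad \eta = 0.5:\ t_{su} = -52 + 0.5 \cdot 70 = -17\ ^{\circ}\mathrm{C}.$$

Under these assumptions the result falls inside the range of minus 17–25 °C given above, and the supply air remains well below freezing.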
An alternative way of using exhaust air heat and reducing the heating costs of pig houses in the Far North is the Russian HRU based on a series shunt network [2], in which the outside air supplied to the production facilities passes through two stages of heating by the exhaust air removed from them.
2 Materials and Methods

The buildings housing the insemination and gestation areas of the Farm Enterprise “Sibir” breeding pig farm, located in the city of Yakutsk, were selected as the production facilities in which heat recovery should be applied to optimize heating costs [18]. The building measures 12 × 50 × 2.8 m and houses 300 pigs weighing 150–200 kg. It is heated by a water heating system; the heating medium is supplied by a boiler house that uses natural gas as fuel.

The choice of the final technical solution for reducing heating costs [19] was based on a feasibility study of the effectiveness of introducing the heat recovery system in comparison with the base case. PSI 225 open combustion heat generators with an output of 66 kW were taken as the base variant. The alternative variant combines the 66 kW PSI 225 heat generator with the UT-6000C recuperative heat recovery unit, which has a capacity of 6000 m³/h. A series shunt network is used to operate the heat recovery unit. The unit is designed as a recuperative air-to-air heat exchanger in which the warm exhaust air heats the incoming cold supply air. Heat is transferred through the heat exchanger wall, which separates the supply and exhaust air streams and prevents them from mixing.

The principle of heat recovery using a series shunt network is as follows (Fig. 1): the exhaust air enters the ventilation chamber and is distributed between the first-phase and second-phase recuperators. The first-phase recuperator heats the supply air to a temperature of minus 20 to minus 15 °C. The air heated in the first recuperator enters the second-phase recuperator as supply air and is heated further by the exhaust air. The supply air is then delivered to the production room. Under its operating conditions, the exhaust duct of the first-phase recuperator is prone to icing, so when the ambient temperature is below minus 20 °C it operates in the cyclic mode “Heat recovery” - “Defrost”.
Fig. 1. Simulation circuit of two-phase shunt network
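The two-stage principle can be illustrated with a small numerical sketch. It is not the authors' model: it assumes balanced supply and exhaust flows, that both recuperators receive exhaust air at the indoor temperature (taken here as 18 °C), that each stage has a hypothetical efficiency of 0.5, and it ignores icing and the defrost cycle. The function names are made up for this example.

```python
def stage_outlet_temp(t_inlet: float, t_exhaust: float, efficiency: float) -> float:
    """Supply-air temperature after one recuperator stage (balanced flows assumed)."""
    return t_inlet + efficiency * (t_exhaust - t_inlet)


def series_outlet_temps(t_ambient: float, t_exhaust: float, efficiencies: list) -> list:
    """Pass the supply air through the stages one after another.

    Each stage is assumed to be fed with exhaust air at the same temperature,
    which is how the shunt (parallel) distribution of exhaust air is modelled here.
    """
    temps = []
    t = t_ambient
    for eta in efficiencies:
        t = stage_outlet_temp(t, t_exhaust, eta)
        temps.append(t)
    return temps


def overall_efficiency(t_ambient: float, t_exhaust: float, t_supply: float) -> float:
    """Share of the ambient-to-exhaust temperature difference gained by the supply air."""
    return (t_supply - t_ambient) / (t_exhaust - t_ambient)


# Hypothetical inputs: design ambient temperature -52 °C, indoor air 18 °C,
# two stages with an assumed efficiency of 0.5 each.
temps = series_outlet_temps(-52.0, 18.0, [0.5, 0.5])
print(temps)                                        # [-17.0, 0.5]
print(overall_efficiency(-52.0, 18.0, temps[-1]))   # 0.75
```

With these assumed inputs the first stage delivers air at minus 17 °C, inside the minus 20 to minus 15 °C band described above, and the second stage brings it just above 0 °C. Two stages of 0.5 combine to an overall supply-side value of 1 − (1 − 0.5)² = 0.75.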
In order to determine the energy savings for heating using a heat recovery system, we calculated the heat and moisture balance for the entire heating period, taking into account the climatic conditions of the region (Table 1). The annual heat energy demand for heating is determined by the formula [2]:

$$Q = \sum_{i=1}^{n} P_i \cdot \tau_i. \tag{1}$$
where $P_i$ is the $i$-th heating power, kW, and $\tau_i$ is the duration of the $i$-th temperature, h.

Table 1. Climatic parameters of the cold period for the city of Yakutsk

| Republic, territory, region, settlement | Air temperature of the coldest five-day period (reliability 0.92), °C | Duration of the period with average daily air temperature ≤ 8 °C, days | Average air temperature of the period with average daily air temperature ≤ 8 °C, °C |
| --- | --- | --- | --- |
| Yakutsk | −52 | 252 | −20.9 |

When using an HRU, the heating capacity required to heat the production facility is the difference between the heating capacity without recovery and the recuperator heating capacity at design temperature. The values are calculated using the following formula [20]:

$$P_{p.i} = \frac{W_{su.i} \cdot \rho_{su.i} \cdot C_m \cdot \Delta t_{max} \cdot \eta_i}{3600}. \tag{2}$$
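where $W_{su.i}$ is the supply air flow rate at the $i$-th ambient temperature, m³/h; $\rho_{su.i}$ is the supply air density at the $i$-th ambient temperature, kg/m³; $C_m$ is the mass specific heat capacity of air, kJ/(kg·°C); $\Delta t_{max} = t_{in} - t_a$ is the maximum temperature difference between the indoor air and the ambient air, °C; and $\eta_i$ is the efficiency of heat recovery at the $i$-th ambient temperature. Dividing by 3600 converts the hourly air flow to a per-second basis, so that $P_{p.i}$ is obtained in kW.

The estimated heat energy consumption for heating, taking heat recovery into account, is determined by the following dependence:

$$Q_p = \sum_{i=1}^{n} \left(P_i - P_{p.i}\right) \cdot \tau_i. \tag{3}$$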
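Equations (1)–(3) amount to a sum over temperature intervals of the heating period, which can be sketched as follows. The sketch interprets the sum in Eq. (3) as the energy the heat generators still have to supply once the recovered power $P_{p.i}$ is subtracted. The class and function names, the value of 1.005 kJ/(kg·°C) for $C_m$, and all numbers in the example are assumptions for illustration; they are not data from the study.

```python
from dataclasses import dataclass

AIR_HEAT_CAPACITY = 1.005  # C_m in kJ/(kg*°C); assumed typical value for air


@dataclass
class TemperatureBin:
    heating_power_kw: float      # P_i, heating power without recovery
    duration_h: float            # tau_i, hours spent at this ambient temperature
    supply_flow_m3h: float       # W_su.i
    air_density_kg_m3: float     # rho_su.i
    recovery_efficiency: float   # eta_i


def recovered_power_kw(b: TemperatureBin, delta_t_max: float) -> float:
    """Eq. (2): heating capacity of the recuperator for one bin, in kW."""
    return (b.supply_flow_m3h * b.air_density_kg_m3 * AIR_HEAT_CAPACITY
            * delta_t_max * b.recovery_efficiency / 3600.0)


def annual_demand_kwh(bins: list) -> float:
    """Eq. (1): annual heat energy demand without heat recovery, in kWh."""
    return sum(b.heating_power_kw * b.duration_h for b in bins)


def annual_demand_with_recovery_kwh(bins: list, delta_t_max: float) -> float:
    """Eq. (3): sum of (P_i - P_p.i) * tau_i over all bins, in kWh."""
    return sum((b.heating_power_kw - recovered_power_kw(b, delta_t_max)) * b.duration_h
               for b in bins)


# Hypothetical bins, for illustration only.
bins = [
    TemperatureBin(100.0, 500.0, 6000.0, 1.5, 0.5),
    TemperatureBin(60.0, 2000.0, 6000.0, 1.4, 0.5),
]
q = annual_demand_kwh(bins)
q_p = annual_demand_with_recovery_kwh(bins, delta_t_max=30.0)
print(f"without recovery: {q:.0f} kWh, with recovery: {q_p:.0f} kWh")
```

The sketch follows Eq. (2) as written and applies the same $\Delta t_{max}$ to every bin; it does not clamp $P_i - P_{p.i}$ at zero, which a practical calculation might need when the recovered power exceeds the demand.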
The energy efficiency of the unit is estimated by the recuperator efficiency, which reflects the proportion of heat from the exhaust air that is transferred to the supply air. The actual recuperator efficiency ($E$) is calculated from the formula based on preliminary measurements of supply and exhaust air temperatures before and after the heat exchanger [16]:

$$E = \frac{t_{su} - t_a}{t_{ex} - t_a} \cdot \frac{W_{su}}{W_{ex}}, \tag{4}$$

where $t_a$ is the ambient air temperature, °C; $t_{su}$ is the supply air temperature at the exit of the recuperator, °C; $t_{ex}$ is the exhaust air temperature, °C; $W_{su}$ is the supply air flow from the recuperator, m³/h; and $W_{ex}$ is the volume of air removed by the recuperator, m³/h.
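Equation (4) translates directly into a small helper for processing measurements. The function name, argument order and the sample readings below are hypothetical and serve only to show how the formula is applied.

```python
def recuperator_efficiency(t_ambient: float, t_supply: float, t_exhaust: float,
                           supply_flow: float, exhaust_flow: float) -> float:
    """Eq. (4): E = (t_su - t_a) / (t_ex - t_a) * (W_su / W_ex).

    Temperatures in °C, flows in m3/h (any common unit works, since only
    the ratio of the flows enters the formula).
    """
    if t_exhaust == t_ambient:
        raise ValueError("exhaust and ambient temperatures must differ")
    if exhaust_flow <= 0:
        raise ValueError("exhaust flow must be positive")
    return (t_supply - t_ambient) / (t_exhaust - t_ambient) * (supply_flow / exhaust_flow)


# Hypothetical readings: ambient -30 °C, supply air leaving at -6 °C, exhaust at 18 °C.
balanced = recuperator_efficiency(-30.0, -6.0, 18.0, 6000.0, 6000.0)
reduced_supply = recuperator_efficiency(-30.0, -6.0, 18.0, 5400.0, 6000.0)
print(round(balanced, 2), round(reduced_supply, 2))
```

The flow ratio matters: the same temperature rise achieved with less supply air than exhaust air corresponds to a lower efficiency.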
3 Results

The feasibility study of the technical solutions was carried out on the basis of data on production areas, livestock and climatic conditions. In particular, heating the premises without heat recovery was compared with heating with heat recovery. Energy savings on heating for the entire heating period were calculated under the condition that the indoor temperature is not lower than 18 °C and the relative humidity is not higher than 70%.
The calculations compared the following combinations of equipment and their efficiency: the base variant with two PSI 225 open combustion heat generators, and the proposed variant with two PSI 225 open combustion heat generators and two UT-6000C recuperative heat recovery units. The results of the calculation are shown in Table 2.

Table 2. Feasibility study of the efficiency of the heat recovery system for the insemination section according to the scheme 2 PSI 225 + 2 UT-6000C

| Items | Without heat recovery | With heat recovery |
| --- | --- | --- |
| Fixed heating capacity, kW | 108.7 | 55.5 |
| Natural gas, thousand m³ | 34.7 | 4 |
| Annual heating cost, thousand RUB | 225.6 | 87.3 |
| Cost of equipment, thousand RUB | 540.5 | 1008.3 |
| Cost of heat recovery system, thousand RUB | – | 467.8 |
| Profit from the increase in the animals' quality of life, thousand RUB | – | 275.7 |
| Annual income from using the heat recovery system, thousand RUB | – | 414.1 |
| Payback period of the heat recovery system, years | – | 1.1 |
| Payback period of the system without taking into account the additional life expectancy of the animals, years | – | 3.4 |
| Gas saving, % | – | 61.3 |
In addition, we compared the base variant with two PSI 225 open combustion heat generators against a proposed variant with two PSI 225 open combustion heat generators and four UT-6000C recuperative HRUs. The results of the calculations are presented in Table 3.

Based on the results given in Tables 2 and 3, the variant with two PSI 225 open combustion heat generators and four UT-6000C recuperators was approved for implementation, as it provided the greatest energy savings. The chosen technical solution was implemented by installing the recuperators in the ventilation chamber located in the gallery connecting the insemination and gestation areas (Fig. 2).

In order to assess the reliability of the theoretical calculations in the feasibility study and to determine the energy efficiency of the adopted technical solutions, we measured the main indoor climate parameters and calculated the efficiency of the recuperative unit. Calculations and measurements were carried out in accordance with the Standard of the Association of Agricultural Machinery and Technology Testers 31.2–2007: Testing of agricultural machinery. Sets of equipment for creation of indoor climate in cattle and poultry houses. Methods of estimation of functional characteristics.
Table 3. Feasibility study of the efficiency of the heat recovery system for the insemination section according to the scheme 2 PSI 225 + 4 UT-6000C

| Items | Without heat recovery | With heat recovery |
| --- | --- | --- |
| Fixed heating capacity, kW | 108.7 | 37.8 |
| Natural gas, thousand m³ | 34.7 | 4 |
| Annual heating cost, thousand RUB | 225.6 | 24.3 |
| Cost of equipment, thousand RUB | 540.5 | 1746.1 |
| Cost of heat recovery system, thousand RUB | – | 1205.6 |
| Profit from the increase in the animals' quality of life, thousand RUB | – | 275.7 |
| Annual income from using the heat recovery system, thousand RUB | – | 477.1 |
| Payback period of the heat recovery system, years | – | 2.5 |
| Payback period of the system without taking into account the additional life expectancy of the animals, years | – | 6.0 |
| Gas saving, % | – | 89.2 |
Fig. 2. Ventilation chamber layout
The calculations showed that the series shunt recuperation network made it possible to achieve a high heat recovery efficiency for the supply air, with a coefficient of 0.75, and to warm the supply air to a positive temperature at ambient air temperatures as low as minus 45 °C. The efficiency coefficient of heat recovery from the exhaust air was 0.5. On the basis of the experimental data obtained and the calculations made, the heat consumption with and without heat recovery was plotted (Fig. 3).
Fig. 3. Annual energy demand for heating the insemination area with and without heat recovery.
The income from the introduction of the heat recovery system is the sum of the reduction in the cost of energy resources [18] and the gain from improved preservation of livestock due to the reduced incidence of respiratory diseases [4, 5]. The saving of energy resources is 89.2%. In monetary terms, the annual income is 477.1 thousand rubles, and the payback period of the system, taking into account the additional preservation of animals, is 2.5 years.
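The economic figures in Tables 2 and 3 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the annual income is the heating cost saving plus the livestock-related profit, and that the payback period is a simple (undiscounted) ratio of the heat recovery system cost to the annual income. The study does not spell out these formulas, so they are assumptions, and the class and field names are invented for the example. The input numbers are taken from the two tables.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    recovery_system_cost: float   # thousand RUB
    heating_cost_base: float      # thousand RUB per year, without heat recovery
    heating_cost_recovery: float  # thousand RUB per year, with heat recovery
    livestock_profit: float       # thousand RUB per year


def evaluate(v: Variant) -> dict:
    """Simple, undiscounted payback estimate (an assumed formula)."""
    heating_saving = v.heating_cost_base - v.heating_cost_recovery
    annual_income = heating_saving + v.livestock_profit
    return {
        "annual_income": round(annual_income, 1),
        "payback_years": round(v.recovery_system_cost / annual_income, 1),
        "payback_years_heating_only": round(v.recovery_system_cost / heating_saving, 1),
        "heating_cost_saving_pct": round(100.0 * heating_saving / v.heating_cost_base, 1),
    }


variants = [
    Variant("2 PSI 225 + 2 UT-6000C", 467.8, 225.6, 87.3, 275.7),
    Variant("2 PSI 225 + 4 UT-6000C", 1205.6, 225.6, 24.3, 275.7),
]
for v in variants:
    print(v.name, evaluate(v))
```

Under these assumptions the payback periods come out at 1.1 and 2.5 years (3.4 and 6.0 years when only the heating cost saving is counted), and the heating cost reductions at 61.3% and 89.2%, in line with the tables. The computed annual incomes, 414.0 and 477.0 thousand rubles, differ from the tabulated 414.1 and 477.1 by 0.1, presumably because the tabulated inputs are rounded.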
4 Conclusion

The data obtained show that applying a heat recovery system based on the series shunt network is expedient in the conditions of the Far North. It makes it possible to save up to 89.2% of energy resources, to expand production capacity without increasing the capacity of the heating equipment, and to heat the supply air to positive temperatures at ambient temperatures down to minus 45 °C.
References

1. Khimenko, A., Tikhomirov, D., Trunov, S., Kuzmichev, A., Bolshev, V., Shepovalova, O.: Electric heating system with thermal storage units and ceiling fans for cattle-breeding farms. Agriculture 12, 1753 (2022). https://doi.org/10.3390/agriculture12111753
2. Ignatkin, I.Y.: Energosberezhenie pri otoplenii v usloviyakh kraynego severa (Energy saving when heating in the conditions of the Far North). Vestnik NGIEI 1, 52–58 (2017). (in Russian)
3. Tikhomirov, D.A., Tikhomirov, A.M.: Improvement and modernization of systems and means of power supply is the most important direction of solving the problems of increasing the energy efficiency of agricultural production. J. Mach. Equip. Rural Areas 11, 32–36 (2017)
4. Ignatkin, I.Y., Arkhiptsev, A.V., Stiazhkin, V.I., Mashoshina, E.V.: A method to minimize the intake of exhaust air in a climate control system in livestock premises. In: Proceedings of the International Conference on Agricultural Science and Engineering, Michurinsk, Russia, 12–14 April 2021. IOP Conference Series: Earth and Environmental Science. IOP Publishing, Bristol, UK (2021). https://doi.org/10.1088/1755-1315/845/1/012132
5. Ignatkin, I., Kazantsev, S., Shevkun, N., Skorokhodov, D., Serov, N., Alipichev, A., Panchenko, V.: Developing and testing the air cooling system of a combined climate control unit used in pig farming. Agriculture 13, 334 (2023). https://doi.org/10.3390/agriculture13020334
6. Roulet, C.A., Heidt, F.D., Foradini, F., Pibiri, M.C.: Real heat recovery with air handling units. Energy Build. 33, 495–502 (2001)
7. Gubina, I.A., Gorshkov, A.S.: Energy saving in buildings with heat recovery exhaust air. Constr. Unique Build. Struct. 4, 209–219 (2015)
8. El Fouih, Y., Stabat, P.: Adequacy of air-to-air heat recovery ventilation system applied in low energy buildings. Energy Build. 54, 29–39 (2012)
9. Tikhomirov, D.A.: Methodology of calculation heat and energy saving ventilation and heating units used in animal farms. J. Altern. Energy Ecol. 2, 125–131 (2013)
10. Tikhomirov, D.A.: Electrical and thermal calculation of air heater recuperative heat exchanger. J. Mech. Electrif. Agric. 1, 15–17 (2013)
11. Adamski, M.: Longitudinal flow spiral recuperators in building ventilation systems. Energy Build. 40, 1883–1888 (2008)
12. Adamski, M.: Ventilation system with spiral recuperator. Energy Build. 42, 674–677 (2010)
13. American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE): ASHRAE Handbook. Heating, Ventilating, and Air-Conditioning Systems and Equipment, I-P ed. ASHRAE, Atlanta, GA, USA (2008)
14. Tikhomirov, D., Vasilyev, A.N., Budnikov, D., Vasilyev, A.A.: Energy-saving automated system for microclimate in agricultural premises with utilization of ventilation air. Wireless Netw. 26(7), 4921–4928 (2019). https://doi.org/10.1007/s11276-019-01946-3
15. Ogunlowo, Q.O., Akpenpuun, T.D., Na, W.-H., Rabiu, A., Adesanya, M.A., Addae, K.S., Kim, H.-T., Lee, H.-W.: Analysis of heat and mass distribution in a single- and multi-span greenhouse microclimate. Greenhouse technology and management. Agriculture 11, 891 (2021). https://doi.org/10.3390/agriculture11090891
16. Ignatkin, I.Y.: Teploutilizatsionnaya ustanovka s adaptivnoy retsirkulyatsiyey (Heat recovery unit with adaptive recirculation). Vestnik NGIEI 10(65), 102–110 (2016). (in Russian)
17. Samarin, G.N., Vasilyev, A.N., Zhukov, A.A., Soloviev, S.V.: Optimization of microclimate parameters inside livestock buildings. Advances in Intelligent Systems and Computing, ICO 2018, AISC 866, pp. 337–345. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_35
18. Samarin, G.N., Vasilyev, A.N., Dorokhov, A.S., Mamahay, A.K., Shibanov, A.Y.: Optimization of power and economic indexes of a farm for the maintenance of cattle. Advances in Intelligent Systems and Computing, ICO 2019, AISC 1072, pp. 679–689. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4_66
19. Tikhomirov, D., Trunov, S., Kuzmichev, A., Rastimeshin, S., Ukhanova, V.: Floor-mounted heating of piglets with the use of thermoelectricity. Advances in Intelligent Systems and Computing, ICO 2020, AISC 1324, pp. 1146–1155. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-68154-8_96
20. Yudaev, B.N.: Teploperedacha [Heat Transfer], 2nd edn., p. 319. Vyssh. Shkola, Moscow, USSR (1981). (in Russian)
Justification for the Need to Develop and Implement Remote Monitoring Systems of the Grain Embankment Condition Which Operate by Using Renewable Energy Sources

Dmitry Budnikov1(B), Vladimir Panchenko1,2, and Viktor Rudenko1

1 Federal State Budgetary Scientific Institution, Federal Scientific Agroengineering Center VIM (FSAC VIM), 1-St Institutskij 5, Moscow, Russia 109428
[email protected]
2 Department of Theoretical and Applied Mechanics, Russian University of Transport, 127994 Moscow, Russia
Abstract. The aim of this study is to assess the post-harvest storage methods used by small-scale agricultural producers, the systems for monitoring the condition of the grain layer, and the possibility of supplying these systems with off-grid energy. Based on existing data, the study discusses the grain losses arising at different stages of post-harvest processing and storage. Based on the research conducted, it is concluded that monitoring systems need to be developed and implemented for grain embankments stored in warehouses remote from centralized power supply systems. According to developers' data, the power capacity of a monitoring system for a remote warehouse is 2 kW. The power capacity of ventilation systems consisting of air lances is 15–30 kW or more. Portable power platforms, as well as local energy sources, can be used to supply them with electricity.

Keywords: Grain embankment · Condition monitoring · Grain losses · Grain storage · Renewable energy sources
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 51–57, 2024. https://doi.org/10.1007/978-3-031-50327-6_6

1 Introduction

According to the Food and Agriculture Organization of the United Nations, about 2756 million tons of cereals are currently harvested worldwide each year [1]. Most of this crop is grown by large-scale agro-industrial producers. However, there are also many small-scale producers (more than 500 million) operating on less than 10 ha of land [2]. Smallholder farming is more common in developing countries [3]. Many smallholders produce grain for domestic consumption, for example for animal feed. Wheat, corn and rice are the most produced and consumed cereals in the world. A peculiarity of cereal growing is that production is seasonal. Many farms have limited access to mechanized post-harvest processing and modern storage facilities [4]. All of this leads to losses during harvesting, post-harvest processing and storage. In such conditions, post-harvest losses are up to 15% in the field, 13–20% in processing and 15–25% in storage [4]. This leads to huge food losses and a decline in food quality, which contributes to food insecurity for agricultural households. Storage losses occur due to increased storage humidity, self-heating, weather and other deviations in the storage conditions.

The need to improve equipment and technology for post-harvest processing of grain is due not only to the increase in grain harvests, but also to the need to ensure the safety of the crop. To ensure reliable preservation of grain in the country, granaries are needed whose total capacity exceeds the average annual gross harvest by a factor of 1.5–1.8. At the same time, more and more producers are striving to carry out post-harvest processing and storage on their own farms rather than in elevators. To solve this problem, it is necessary to improve the equipment for post-harvest processing of grain, including through the use of intensifying electrophysical effects.

Storage losses are a major factor affecting overall post-harvest grain losses [5]. Efficient grain storage with minimal losses can significantly reduce overall food losses for small-scale farmers, which can have a direct and considerable impact on their livelihoods. Thus, improving the post-harvest processing and storage of agricultural products, as well as the means of monitoring their condition, is essential for reducing cereal losses during storage and processing.

Post-harvest grain losses include all losses from harvesting until the grain is used for food or other purposes. They may be losses in the quantity or in the quality of the grain, both of which significantly reduce its value. Quantitative losses are due to direct losses of grain, to pests such as birds and mice, or to mechanical damage, whereas qualitative losses are mainly due to infection with mould, mycotoxins and mechanical damage. On-farm grain stores can be divided into ground warehouses, concrete silos, metal silos, etc.
At the same time, about 56% of grain storage capacity belongs to agricultural producers, and the remainder consists of the capacity of elevators, bakeries and grain processors. In many cases, storage systems organized in remote warehouses have neither a centralized nor a local electricity supply. In these cases, it is difficult to monitor the condition of the grain embankment, and the quality and volume of the grain mass may be lost because storage problems are detected late. Both centralized and off-grid energy sources can be used to supply power. Here, the renewable energy sources that can be used to power thermal processes are understood to be the energy of the sun, soil, air, water and biomass used for heat generation.

1.1 Main Part

1.2 Physical Factors

Grain is a product intended for long-term storage. It is stored for several months after harvest. This stage is the longest of all the post-harvest grain processing stages [6, 7]. In addition, grain is minimally monitored during storage. Certain storage conditions must be met to reduce grain losses. Storage losses are affected by physical, biological and socio-economic factors. The physical factors influencing grain preservation include oxygen, moisture content, relative humidity and temperature. Physical factors influence the conditions for insect
reproduction and mould growth during grain storage, which ultimately affects the preservation of grain. Harmful insects live and reproduce in the grain layer during storage. Temperatures in the range of 25 to 35 °C create favorable conditions for the rapid growth of most storage insects [8, 9]. At temperatures below 13 °C or above 40 °C, insects tend to become less active, migrate or eventually die [10, 11]. Most mould species also grow within this favorable range, at 25–30 °C [10]. Mould development is also typical at high humidity: mould spores settle and grow rapidly on the surface of the grain. Temperature gradients contribute to moisture accumulation in problem areas of the storage system, which creates favorable conditions for mould growth and for the appearance of self-heating centers [11]. When grain is stored in a silo or a warehouse, the temperature at the center of the grain bulk remains approximately unchanged and corresponds to the temperature at which the grain was put into storage. The grain in contact with the walls of the store experiences temperature fluctuations that follow the ambient temperature. When grain is stored at high moisture content and high relative humidity, the temperature of the outer wall decreases faster, which promotes condensation and the formation of areas of high humidity. Mould fungi grow faster in such conditions. The presence of moisture and oxygen increases the intensity of grain respiration, which leads to the release of heat, carbon dioxide and enzymes that break down starch, protein and lipids in the grain. Insect activity also depends on available oxygen for metabolism, and insect respiration increases the concentration of carbon dioxide in a sealed storage system.
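The temperature thresholds quoted above can be turned into a simple rule-of-thumb check. The sketch below is only an illustration: the function name `storage_risk` and the label strings are hypothetical, and the bands are taken directly from the figures cited in this section (25–35 °C for insects, 25–30 °C for moulds, below 13 °C or above 40 °C for reduced insect activity).

```python
def storage_risk(temp_c: float) -> list[str]:
    """Return illustrative risk labels for a grain temperature in deg C.

    The bands follow the ranges quoted in the text; the labels themselves
    are hypothetical and not part of the source.
    """
    risks = []
    if 25.0 <= temp_c <= 35.0:
        risks.append("insect growth favoured")
    if 25.0 <= temp_c <= 30.0:
        risks.append("mould growth favoured")
    if temp_c < 13.0 or temp_c > 40.0:
        risks.append("insect activity suppressed")
    return risks


if __name__ == "__main__":
    for t in (10.0, 20.0, 28.0, 33.0, 42.0):
        print(t, storage_risk(t) or ["no band from the text applies"])
```

A real monitoring system would combine such a check with humidity and carbon dioxide readings; temperature alone is used here only because it is the factor for which the text gives explicit numbers.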
Together, these processes can damage the grain mass during storage, so self-heating centers must be detected and eliminated promptly.
1.3 Grain Store Monitoring
At present, there are many monitoring systems for grain stores, silos and elevators. Most of them consist of groups of temperature sensors that track temperature changes throughout the grain volume during storage. Existing systems monitor temperature, carbon dioxide concentration, humidity, smoke and fire [12]. Many of them automatically transmit the readings to a centralized control server over wired or wireless links. The data obtained are used both to monitor problem areas directly and to eliminate them by ventilation and cooling. In contrast to centralized warehouses, elevators and similar facilities supplied by centralized power systems, there are remote stores: un-electrified sites provided with roofs and minimal protection against weather anomalies. Grain can be kept in such warehouses both to speed up harvesting and to reduce downtime while grain is moved to remote stores. In addition, some grain producers and suppliers use such stores for temporary storage while waiting for the best price. The topology of a system for monitoring the condition of the grain layer in a silo is shown in simplified form in Fig. 1. The controller not only collects information from the sensors but also transmits it to the operator’s SCADA
system. Often, status indication and accident notification are provided not only at the operator’s workplace but also in a smartphone application. Intelligent grain drying systems that use electrophysical effects on grain are implemented through process control with learning. Data on the energy intensity of microwave-convective drying are accumulated in the SCADA system and periodically uploaded to the optimization unit, where probabilistic models are analyzed and corrected and then returned as modified equipment operating modes. To ensure the safety of personnel and of the technological process, the microwave energy sources can operate only when capacitive sensors detect that material is present. The current grain moisture during microwave-convective processing is determined by sensors that record the decrease in electric field strength between the source and the control point (based on simulation results).
Fig. 1. Topology of grain storage monitoring system.
A system for grain stores or post-harvest sites can be constructed in the same way. Since these stores are not equipped with a centralized power supply, the list of equipment to be used (sensors, controllers, etc.) must be determined. According to the available data, the power demand of the monitoring system is estimated at 2 kW.
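A rough energy budget helps when choosing equipment for an off-grid site. The sketch below assumes that the 2 kW figure is a continuous electrical load; the autonomy period, peak-sun hours, system efficiency and usable battery fraction are hypothetical inputs chosen for illustration, not values from the source.

```python
from dataclasses import dataclass


@dataclass
class OffGridInputs:
    load_kw: float = 2.0                  # monitoring load quoted in the text
    autonomy_days: float = 3.0            # hypothetical: days without usable sun
    peak_sun_hours: float = 4.0           # hypothetical: full-sun hours per day
    system_efficiency: float = 0.8        # hypothetical: controller and wiring losses
    usable_battery_fraction: float = 0.8  # hypothetical: allowed depth of discharge


def daily_energy_kwh(inp: OffGridInputs) -> float:
    """Energy drawn by the load over 24 hours."""
    return inp.load_kw * 24.0


def pv_array_kw(inp: OffGridInputs) -> float:
    """PV peak power needed to replace one day of consumption."""
    return daily_energy_kwh(inp) / (inp.peak_sun_hours * inp.system_efficiency)


def battery_kwh(inp: OffGridInputs) -> float:
    """Nominal battery capacity that covers the autonomy period."""
    return daily_energy_kwh(inp) * inp.autonomy_days / inp.usable_battery_fraction


if __name__ == "__main__":
    inputs = OffGridInputs()
    print(f"daily energy: {daily_energy_kwh(inputs):.1f} kWh")
    print(f"PV array:     {pv_array_kw(inputs):.1f} kW peak")
    print(f"battery:      {battery_kwh(inputs):.1f} kWh nominal")
```

If the monitoring equipment runs intermittently rather than continuously, `load_kw` should be scaled by the fraction of time it is actually on, which reduces both results proportionally.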
1.4 Results and Discussion
When choosing a source of local energy generation, both the power demand of the monitoring system and the predicted possibility of accumulating energy over a few days should be considered. Since weather conditions are not constant during storage, the efficiency of energy accumulation from individual renewable sources depends on the location of the warehouse and on average weather conditions. Cereal harvesting should take place under favorable weather conditions (temperature and humidity), in which case solar panels are the most appropriate choice. In some cases, ventilation equipment can be provided for rapid drying and cooling of grain and for eliminating self-heating. An example of such equipment is the air-lance, which allows electrophysical means to be used to intensify heat and moisture transfer (Fig. 2).
Fig. 2. Sketch model of the experimental portable grain drying plant: a: assembled; b: waveguide-emitter.
In Fig. 2: 1: fan, which can be connected for blowing air (when electrophysical sources are used) or for sucking air (without a source); 2: electrophysical action module; 3: waveguide-extender; 4: waveguide-emitter; 5: emitting slot; 6: ventilation holes; 7: PTFE ring (prevents the material being processed from entering the waveguide-emitter). The power of this plant is 1.5–3 kW, depending on the design and the equipment used. The power required for the ventilation system can reach 15–30 kW, depending on the size of the warehouse and the number of ventilation units used. Furthermore, such a system can operate in pulse power supply mode or with partial activation of portable units.
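The effect of pulse supply or partial activation on the average electrical demand can be estimated with simple arithmetic. In the sketch below the unit power comes from the 1.5–3 kW range given above, while the number of units, the active fraction and the duty cycle are hypothetical.

```python
def average_power_kw(unit_kw: float, units: int,
                     active_fraction: float, duty_cycle: float) -> float:
    """Mean demand of a group of air-lances.

    active_fraction: share of units switched on at a time (0..1).
    duty_cycle: share of time an active unit actually draws power (0..1).
    """
    if not (0.0 <= active_fraction <= 1.0 and 0.0 <= duty_cycle <= 1.0):
        raise ValueError("fractions must lie between 0 and 1")
    return unit_kw * units * active_fraction * duty_cycle


if __name__ == "__main__":
    # Hypothetical warehouse with ten 3 kW units.
    full = average_power_kw(3.0, 10, 1.0, 1.0)    # all units on continuously
    pulsed = average_power_kw(3.0, 10, 0.5, 0.5)  # half the units, 50 % duty
    print(f"full load: {full:.1f} kW, pulsed/partial: {pulsed:.1f} kW")
```

Under these assumed settings the mean demand falls to a quarter of the installed power, which is the kind of reduction that makes a portable or local source easier to size.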
2 Conclusions
The following conclusions can be drawn from the reviewed data:
1. Monitoring systems powered by renewable energy sources need to be developed to reduce grain losses during storage in remote un-electrified warehouses.
2. According to developers’ data, the power demand of a monitoring system for remote warehouses is 2 kW.
3. Solar panels are advisable as the energy source for grain layer monitoring systems.
4. Air-lances, including those that use electrophysical intensification, are advisable for eliminating self-heating and for drying and cooling the grain layer in remote warehouses.
5. The power of ventilation systems consisting of air-lances is 15–30 kW or more. Portable power platforms, as well as local energy sources, can be used to supply them with electricity.
References
1. World Wheat Production by Country. https://www.atlasbig.com/en-us/countries-wheat-production. Accessed 31 Jan 2023
2. Manandhar, A., Milindi, P., Shah, A.: An overview of the post-harvest grain storage practices of smallholder farmers in developing countries. Agriculture 8, 57 (2018). https://doi.org/10.3390/agriculture8040057
3. Food and Agriculture Organization: The State of Food and Agriculture: Innovation in Family Farming. Food and Agriculture Organization, Rome, Italy (2014)
4. Abass, A.B., Ndunguru, G., Mamiro, P., Alenkhe, B., Mlingi, N., Bekunda, M.: Post-harvest food losses in maize-based farming system of semi-arid savannah area of Tanzania. J. Stored Prod. Res. 57, 49–57 (2014)
5. Kumar, D., Kalita, P.: Reducing postharvest losses during storage of grain crops to strengthen food security in developing countries. Foods 6, 8 (2017)
6. Chigoverah, A.A., Mvumi, B.M.: Efficacy of metal silos and hermetic bags against stored-maize insect pests under simulated smallholder farmer conditions. J. Stored Prod. Res. 69, 179–189 (2016)
7. Budnikov, D.A.: Study of the ratio of heat and electrical energy expended in microwave-convective drying of grain. In: Popkova, E.G., Sergi, B.S. (eds.) Sustainable Agriculture. Environmental Footprints and Eco-design of Products and Processes. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-8731-0_38
8. Belov, A., Vasilyev, A., Dorokhov, A.: Effect of microwave pretreatment on the exchange energy of forage barley. J. Food Process. Eng. 44, #9 (2021). https://doi.org/10.1111/jfpe.13785
9. Proctor, D.L.: Grain Storage Techniques: Evolution and Trends in Developing Countries. Food and Agriculture Organization, Rome, Italy (1994)
10. Lindblad, C.: Programming and Training for Small Farm Grain Storage. Appropriate Technologies for Development. Manual No. M-2B. Burton International School, Detroit, MI, USA (1981)
11. Fields, P.G.: The control of stored-product insects and mites with extreme temperatures. J. Stored Prod. Res. 28, 89–118 (1992)
12. Lydia, J., Vimalraj, S.L.S., Monisha, R., Murugan, R.: Automated food grain monitoring system for warehouse using IOT. Meas.: Sens. 24, 100472 (2022). https://doi.org/10.1016/j.measen.2022.100472
Justification of the Technology of Keeping Animals to Maintain the Microclimate
Igor M. Dovlatov1, Ilya V. Komkov1, Sergey S. Jurochka1, Alexandra A. Polikanova1(B), and Vladimir A. Panchenko1,2
1 Federal Scientific Agroengineering Center VIM, 1st Institutsky Passage 5, 109428 Moscow, Russia
[email protected]
2 Russian University of Transport, Obraztsova St. 9, 127994 Moscow, Russia
Abstract. Introduction: the influence of housing conditions on the physiological state and productivity of animals is studied. The main components of housing conditions and ways to maintain them at the appropriate level are identified, and methods of influencing them are considered and described. Microclimatic conditions are shown to have a strong influence on farm animals. Existing research on the topic is reviewed, and the need for further research in this area is established. The purpose of the work is to analyze modern technical solutions for maintaining the microclimate inside livestock premises in the middle zone of Russia. Materials and methods: a parametric model of boundary conditions and a table of monetary losses in the winter period are presented. Results and discussion: the materials used for the study are given, together with a parametric model and a scheme of the proposed livestock premises. The gas composition of the room was simulated, determining the maximum concentration of hydrogen sulfide at a height of 2.25 m and of ammonia under the roof, the direction of air flows, an air velocity from the axial fan of up to 5 m/s near the cows, and an indoor temperature in winter of 8 °C. Conclusion: the authors analyzed the literature on this issue and the main methods of regulating microclimatic indicators, and highlight the most promising methods that meet the normative values for the required indicators.
Keywords: Microclimate · Animal husbandry · Air flows · Temperature stress · Air duct
1 Introduction
At the current stage of animal husbandry development, there is a fairly extensive genetic base for various farm animals. To realize the existing genetic potential in full, suitable housing conditions must be observed. Housing conditions are formed by several main factors: housing technologies and microclimate parameters [1–3].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 58–66, 2024. https://doi.org/10.1007/978-3-031-50327-6_7
An analysis of the scientific literature showed that non-compliance with the housing technology leads to a deterioration of feed, so its energy value and quality indicators decrease [4–6]. To better realize the genetic potential of cattle, maximum permissible concentrations have been established and reference values for microclimate indicators determined. The norms vary between age and sex groups, but not by much [7, 8]. Air parameters are among the important microclimate indicators: humidity, temperature, air flow velocity inside the room, the air exchange rate, and the gas contamination of the room. The literature shows that when gases accumulate in the room, animals begin to experience stress, their physiological state deteriorates, and a fatal outcome is possible. It was also found that in winter, when air flow velocity is not controlled, herd morbidity increases because of additional exposure to insufficiently heated air [9–11]. One of the most important factors in ensuring a proper level of livestock conditions is the uniformity of air flows and the possibility of mixing them, followed by the removal of exhaust air, with uniform temperature values throughout the room and no drafts.
1.1 Materials and Methods
During the research, a simplified model of the farm was developed in the SketchUp 2020 software package, fully preserving the overall dimensions of the premises and of the supply and exhaust vents (Fig. 1).
The following parameters were set: a cross-section of the building was taken that shows a window acting as a supply channel, a cow stall, a medium-sized cow weighing 550 kg and an exhaust shaft; the heat generated by cows is 1000 W; the carbon dioxide content of exhaled air is 4%; the wind speed at the window inlet is 2 m/s; the mass concentration of ammonia is 50 mg/m3; and the mass concentration of hydrogen sulfide is 1 mg/m3 (Fig. 1). Air movement was modeled in the SolidWorks 2020 software package. To develop the parametric model, the authors additionally studied the gas composition of the air: ammonia derivatives, hydrogen sulfide and ammonia. At the level of the feed table there are organic compounds (in small amounts), as well as ammonia and hydrogen sulfide from manure, urine and feed. The factors governing the distribution of gases in the room include the amount of heat and the mass concentrations of ammonia and hydrogen sulfide. Parametric modeling. The developed parametric model for determining the gas composition of the air in a livestock room takes the form of a farm section in which 10 objects are placed hierarchically (Fig. 1). The model contains simplifications: the shape of the cows is simplified, there are no stalls, the gates are closed, there are no utilities on the columns, and there are other minor assumptions. These assumptions were made to optimize the model so that parametric modeling is feasible. Without them the available computing power would be insufficient, while the effect of the simplifications on the result is insignificant. According to data received from the Zelenogradsk company in 2021 (Table 1), temperature stress has an extremely detrimental effect on the economic performance of production, so this factor needs to be mitigated.
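The boundary conditions listed above can be collected in one place before a simulation run. The sketch below is a hypothetical container, not part of the authors' SolidWorks setup: the class name, field names and the `validate` helper are illustrative, and only the numerical values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarnBoundaryConditions:
    cow_mass_kg: float = 550.0          # medium-sized cow
    cow_heat_w: float = 1000.0          # heat generated by cows, as stated
    co2_exhaled_fraction: float = 0.04  # 4 % carbon dioxide in exhaled air
    inlet_wind_speed_m_s: float = 2.0   # wind speed at the supply window
    nh3_mg_m3: float = 50.0             # ammonia mass concentration
    h2s_mg_m3: float = 1.0              # hydrogen sulfide mass concentration

    def validate(self) -> None:
        """Reject physically meaningless inputs before they reach a solver."""
        if not 0.0 <= self.co2_exhaled_fraction <= 1.0:
            raise ValueError("CO2 fraction must lie between 0 and 1")
        for name in ("cow_mass_kg", "cow_heat_w", "inlet_wind_speed_m_s",
                     "nh3_mg_m3", "h2s_mg_m3"):
            if getattr(self, name) < 0.0:
                raise ValueError(f"{name} must be non-negative")


if __name__ == "__main__":
    conditions = BarnBoundaryConditions()
    conditions.validate()
    print(conditions)
```

Keeping the inputs immutable (`frozen=True`) makes it harder to change a boundary condition by accident between parametric runs; a new case is created as a new object instead.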
Fig. 1. Parametric model of boundary conditions
The research was carried out in the conditions of the central part of Russia, where air temperatures are extremely low in winter and, on the contrary, extremely high in summer. Cows may therefore experience temperature stress, which adversely affects their physiological state and productivity. In winter it is thus necessary to avoid strongly negative temperatures in the animal housing.
1.2 Results and Discussion
The analysis of Table 1 showed that temperature stress is a significant factor. Because the animals experienced a large amount of stress, milk losses per cow were twice the normative value, and the actual losses exceeded the norm by 91%. Losses due to excessive gas contamination of the premises were also significant, exceeding the norm by 98%. Using the developed simulation model (Fig. 1), a simulation was carried out for gases whose concentration is regulated by maximum permissible concentration (MPC) standards. For clarity, the parametric models for ammonia (NH3) and hydrogen sulfide (H2S) were chosen as examples (Fig. 2). Analysis of the parametric model of H2S propagation (a) showed that the gas formed from feces and urine accumulates under the ceiling because it is lighter than other gases. When the speed of incoming fresh air changes, the saturation of
Table 1. Monetary losses in winter from temperature stress.
Indicator | Standard | Actual
Duration of the cold period (temperature below 9 °C, Moscow region), days | 90 | 90
Number of head on the farm | 480 | 480
Milk yield per dairy cow, kg | 37 | 37
Daily milk yield on the farm, tons/day | 17.8 | 12.8
Reduction of milk yield from gas contamination per day, % | 10 | 20
Milk loss per cow, liters/day | 3.7 | 7.4
Cost of a liter of milk, rub | 33 | 33
Milk losses from gas contamination, thousand rubles/day | 58.6 | 117.2
Milk losses from reduced productivity due to gas contamination, million rubles/period | 5.3 | 10.5
Culling of cows for the period, % | 1 | 1
Losses from culling, thousand rubles/period | 240 | 240
Decrease in the level of fertilization of the herd, % | 5 | 5
Losses of calves from decreased fertilization, head | 24 | 24
Loss of funds for the period from untimely fertilization, thousand rubles | 150 | 150
Total loss of money for the period, million rubles | 5.7 | 11
Loss of money per day over the period, rub | 62941.33 | 121549.33
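The loss rows of Table 1 follow from its input rows by simple arithmetic, which can be checked with a short script. The sketch below is an illustration, not the authors' calculation: the function name `winter_losses` and the dictionary keys are hypothetical, and it treats one kilogram of milk as one liter, as the table implicitly does.

```python
def winter_losses(herd: int, yield_kg: float, reduction_pct: float,
                  price_rub: float, days: int,
                  culling_rub: float, fertilization_rub: float) -> dict:
    """Recompute the loss rows of Table 1 from its input rows."""
    loss_per_cow = yield_kg * reduction_pct / 100.0   # liters per cow per day
    gas_loss_day = loss_per_cow * herd * price_rub    # rubles per day
    gas_loss_period = gas_loss_day * days             # rubles per cold period
    total = gas_loss_period + culling_rub + fertilization_rub
    return {
        "loss_per_cow": loss_per_cow,
        "gas_loss_day_rub": gas_loss_day,
        "gas_loss_period_rub": gas_loss_period,
        "total_rub": total,
        "per_day_rub": total / days,
    }


# Inputs taken from the "Standard" and "Actual" columns of Table 1.
standard = winter_losses(480, 37, 10, 33, 90, 240_000, 150_000)
actual = winter_losses(480, 37, 20, 33, 90, 240_000, 150_000)

if __name__ == "__main__":
    for name, row in (("standard", standard), ("actual", actual)):
        print(name, {key: round(value, 2) for key, value in row.items()})
```

With these inputs the script reproduces the last row of the table: about 62,941.33 rubles per day for the standard column and 121,549.33 rubles per day for the actual column.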
Fig. 2. Parametric model of gas propagation with an air flow velocity condition of 2 m/s. a: spread of H2S, 20 mg/m3; b: spread of NH3, 50 mg/m3
the gas in the cloud will change. The highest concentration of hydrogen sulfide was found at a height of 2.25 m. According to the parametric model of NH3 propagation (b), the distribution of ammonia at a velocity of 2 m/s is quite
extensive, and a large gas cloud can be observed throughout the room. The greatest concentration is observed under the roof over its whole area. At present, the usual proposal is to use exhaust shafts located in the roof along the entire length of the room, most often directly in the ridge. They serve as the way out for air masses. Exhaust shafts can also be used together with supply channels. Because of its design, a shaft cannot draw in a large volume of air, so air at and below animal level cannot be removed over the entire area of the building. Consequently, optimal air circulation is not achieved. Modeling was also carried out (Fig. 3) and showed that vortices form in the air flows.
Fig. 3. Parametric model of air movement in the cattle housing. 1, 6—the area of the first turbulence; 2, 5—the area of chaotic air movement; 3, 4—the area of the secondary and strongest turbulence.
Figure 3 shows that the air flows do not move uniformly, which mixes the accumulated gases with each other. In the areas of primary vortices 1 and 6, ammonia is partially returned, and its bulk in these areas is not affected at all. Ammonia and hydrogen sulfide are partially removed from regions 2 and 5, but a significant part does not move close enough to the exhaust shafts to be removed. The strongest vortices are observed in areas 3 and 4, where the air descends to the level of the animals. As a result, the largest accumulations of hydrogen sulfide and some of the ammonia can return to the animals and adversely affect them. Such vortices also make it difficult to remove these gases completely through the exhaust shafts because of the shafts’ small suction capacity. Axial and suspended blade fans are often used in addition to exhaust shafts to adjust the microclimate (Fig. 4). This equipment is one of the least expensive microclimate
monitoring systems, but it has obvious disadvantages. Because of the low power, several units must be installed, depending on the area of the cowshed. In addition, the fans work on the tunnel principle, so the air flows they create are directed in a strictly predetermined direction.
Fig. 4. Axial and suspended blade fans in a room for cattle keeping. a—Axial fan, b—Suspended blade fan.
A simulation was carried out (Figs. 5 and 6), which shows that the fans do not have enough power for the air flow they create to reach the central area of the cowshed.
Fig. 5. Parametric model of axial fan operation.
The simulation (Fig. 5) showed that fans located on the side walls are not efficient enough: in addition to the low power noted above, there are a number of other problems. In the area closest to the fan, air moves at up to 5 m/s on average, which has a detrimental effect on the animals. In the area of the feed passages the speed decreases to 1.5 m/s, pushing harmful gases onto them. Figure 6 shows that the highest air velocity (3 m/s) is reached directly above the fan and that the main flows it captures are at mid-height of the cowshed. Consequently, the
Fig. 6. Parametric model of the suspended blade fan operation.
air located at the level of the cows’ heads and below is not drawn in by the fan, or only partially, since its speed is below 0.8 m/s. About 30% of the flows fall back down after being lifted by the fan, so the animals inhale exhaust air. Such movement of air masses is undesirable and largely harmful. For the most comfortable living conditions, all indicators must be kept uniform throughout the animal housing. This is facilitated by an air duct that runs along the entire length of the room and has several air outlets, which makes the air flows uniform (Fig. 7). Figure 7a shows that the temperature is distributed evenly, which is favorable for the animals. The starting temperature inside the cowshed is −15 °C, and the supply temperature through the duct is +5 °C. The upper part of the figure shows the temperature 2 min after the duct is switched on, and the lower part shows the temperature after 18 min. After 18 min the indoor temperature reached −8 °C, and the room was heated evenly, which is also important: the cattle do not experience sudden temperature changes in different parts of the room. A number of conclusions can be drawn from Fig. 7b. In most areas the speed of indoor air flows is fairly uniform, from 1 to 2 m/s, which has a beneficial effect on the physiological state of the animals. The main areas of increased speed are the air outlets of the duct and the locations of the exhaust shafts. Since the air velocity near the exhaust shafts reaches 4 m/s, it can be concluded that there is a high degree of gas exchange in the room.
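The warm-up figures quoted above (from −15 °C to −8 °C in 18 min with +5 °C supply air) allow a rough time constant to be estimated. The sketch below assumes a first-order, well-mixed response toward the supply temperature, T(t) = Ts + (T0 − Ts)·exp(−t/τ). This lumped model is a simplification introduced here for illustration, not the authors' parametric simulation, and it ignores heat from the animals and losses through the building envelope.

```python
import math


def time_constant_min(t_start_c: float, t_supply_c: float,
                      t_now_c: float, elapsed_min: float) -> float:
    """Time constant tau of T(t) = Ts + (T0 - Ts) * exp(-t / tau)."""
    remaining = (t_supply_c - t_now_c) / (t_supply_c - t_start_c)
    if not 0.0 < remaining < 1.0:
        raise ValueError("temperature must lie between start and supply values")
    return -elapsed_min / math.log(remaining)


def temperature_at(t_start_c: float, t_supply_c: float,
                   tau_min: float, minutes: float) -> float:
    """Indoor temperature predicted by the assumed first-order model."""
    return t_supply_c + (t_start_c - t_supply_c) * math.exp(-minutes / tau_min)


# Values quoted in the text: start -15 C, supply +5 C, -8 C after 18 min.
tau = time_constant_min(-15.0, 5.0, -8.0, 18.0)
mean_rate = (-8.0 - (-15.0)) / 18.0  # average warming rate, K per minute

if __name__ == "__main__":
    print(f"time constant: {tau:.1f} min, mean rate: {mean_rate:.2f} K/min")
    print(f"model temperature after 60 min: {temperature_at(-15.0, 5.0, tau, 60.0):.1f} C")
```

Under this assumption the time constant comes out at roughly 42 min. Any extrapolation beyond the 18 min covered by the simulation should be treated as indicative only.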
2 Conclusions
The air duct is the most suitable solution for controlling microclimatic indicators: air flow velocities are uniform, from 1 to 2 m/s, which favorably affects the physiological state of the animals, and the temperature throughout the room is 8 °C in winter at an external temperature of −15 °C. As a result, the animals do not experience temperature shock.
Fig. 7. Parametric models of indoor air temperature in winter and air flow velocity during duct operation. a is a model of indoor air temperature in winter, b is a model of air flow velocity.
Installing axial and blade ceiling fans is not a sufficient measure for influencing the microclimate. Although inexpensive, they cannot provide normal circulation of air flows, and they create drafts with an air velocity of up to 5 m/s near the cow, which is extremely harmful. Supply and exhaust shafts by themselves are also incapable of maintaining the necessary conditions for livestock. Since they cannot draw air from the entire room and therefore do not remove harmful gases effectively, additional equipment is needed. The results of this research can help increase the productivity of cattle and, in turn, the economic performance of enterprises. Additional detailed studies on this issue are planned.
References
1. Yan, G., Shi, Z., Cui, B., Li, H.: Developing a new thermal comfort prediction model and web-based application for heat stress assessment in dairy cows. Biosyst. Eng. 214, 72–89 (2022)
2. García-Castillo, J.L., Picón-Núñez, M., Abu-Khader, M.M.: Improving the prediction of the thermohydraulic performance of secondary surfaces and its application in heat recovery processes. Energy 261, 125–196 (2022)
3. Kochetova, O.V., Kostarev, S.N., Tatarnikova, N.A., Sereda, T.G.: Development of microclimate control system in cattle barns for cattle housing in the Perm region. IOP Conf. Ser.: Earth Environ. Sci. 839(3), 032030 (2021)
4. Kokunova, I.V., Zhukov, A.A., Podchekaev, M.G.: On the issue of improving the quality of haylage harvested in difficult weather and climatic conditions. Bull. KRASGAU 1(142), 51–55 (2019)
5. Klimenko, V.P.: High-quality bulky feed: the basis of full-fledged rations for highly productive livestock. Adapt. Feed Prod. 3, 102–115 (2019)
6. Ivanov, Y.G., Kirsanov, V.V., Yurochka, S.S.: Studies of microclimate parameters in the zoo station of the Russian state agricultural academy named after K.A. Timiryazev. TLC Reports: Collection of Articles. Issue 291. Ch. V/M., p. 115. Publishing House RGAU-MSHA (2019)
7. Order of the Ministry of Agriculture of the Russian Federation No. 622 dated 21.10.2020. On approval of Veterinary rules for keeping cattle for the purpose of its reproduction, cultivation and sale
8. Kirsanov, V.V., Dovlatov, I.M., Komkov, I.V., Yurochka, S.S.: Modern approaches to systems for ensuring air parameters in livestock premises. Equipment Technol. Anim. Husbandry 4(48), 61–71 (2022)
9. Assatbayeva, G., Issabekova, S., Uskenov, R., Karymsakov, T., Abdrakhmanov, T.: Influence of microclimate on ketosis, mastitis and diseases of cow reproductive organs. J. Anim. Behav. Biometeorol. 10(3), 22–30 (2022)
10. Lovarelli, D., Riva, E., Mattachini, G., Guarino, M., Provolo, G.: Assessing the effect of barns structures and environmental conditions in dairy cattle farms monitored in Northern Italy. J. Agric. Eng. 52(4), 12–29 (2021)
11. Martynova, E.N., Yastrebova, E.A.: Features of the microclimate of cowsheds with a natural ventilation system. Vet. Med., Anim. Sci. Biotech. 6, 52–56 (2015)
Cattle iCare Monitoring System (CIMS): Remote Monitoring of Cattle’s Heart Rate, Temperature, and Daily Steps with Smart Sprinkler System
Deane Cristine Castillo(B), Alvin Bulayungan, John Carl Lapie, Dorothy Mary Ann Livida, Bryant Macatangay, Kim Francis Sangalang, and Marife Rosales(B)
Electronics Engineering Department, De La Salle Lipa, 1962 J.P. Laurel; National Highway, Lipa City, Batangas, Philippines
{deane_cristine_castillo,alvin_bulayuanga,john_carl_lapie, dorothy_mary_livida,bryant_macatangay,kim_francis_sangalang, marife.rosales}@dlsl.edu.ph
Abstract. As cattle production has decreased in recent years, cattle mortality due to disease has become more significant. To reduce cattle deaths, a smart monitoring system is proposed to help caretakers and farm owners detect early signs of disease through the cattle’s vitals. The sensor-based monitoring system detects the cattle’s heart rate (in bpm), body temperature (in degrees Celsius) and number of daily steps. The Cattle iCare Monitoring System gathers the vitals through a collar belt and a pedometer strap with embedded sensors, while a mobile application displays the vitals and their Alert Level classification (Normal, Alert Level 1, Alert Level 2, and Alert Level 3). The monitoring system is accompanied by a smart sprinkler system that can be operated from the CiMS mobile application.
Keywords: Cattle · Monitoring system · Precision livestock farming · IoT
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 67–76, 2024. https://doi.org/10.1007/978-3-031-50327-6_8
1 Introduction
Livestock farming refers to economically oriented animal husbandry with high stocking densities and mechanization [1]. In other words, it is a way of managing and raising livestock to produce large amounts of meat and dairy, providing an abundant source of food for human consumption. Livestock thus plays a major role: it serves as a pillar of the food system, contributes to reducing poverty, provides food security, and drives agricultural development [2]. With this development in mind, livestock farming can become more sustainable, more animal-friendly, and safer for human consumption and health. This study therefore focuses on the health of livestock, particularly cattle. According to the Philippine Statistics Authority (2021), cattle production dropped from 2018 to 2020. The data show that in 2020 there were 229.13 metric
tons produced in the Philippines. This is 12.2% lower than the 2019 production of 260.62 metric tons. The disposition of cattle due to death caused by disease from 2018 to 2022 reached a total of 36,398. Even though disposition decreased from 2018 to 2020, the number of deaths is still significant and affects the market supply, since most cattle farming is small-scale. Precision livestock farming offers many opportunities, since automatic monitoring systems enable caretakers to detect early signs of disease and to monitor the health and welfare of the livestock. As noted in [3], there are many cattle breeds in the world, but this study focuses on a monitoring system for Philippine cattle because they are the most accessible. The different breeds are also managed differently, depending on the location and the techniques of the caretaker. This allowed the researchers to identify the setup in which the Cattle iCare Monitoring System is most applicable. The proposed project focuses on fattening cattle, since dairy farming already practices intensive monitoring. For years, people have relied on traditional ways of monitoring livestock in which the person in charge inspects each animal individually for signs of disease or injury. However, the traditional method is both costly and highly unreliable. In the traditional method, cattle temperature is measured with a thermometer. The normal temperature of an adult cow is around 38.5 °C, and a temperature over 39.5 °C (103 °F) may indicate an infection [4]. Adult cattle have an average heart rate of between 48 and 84 bpm. The heart rate is traditionally assessed with a stethoscope, listening over the left-hand side of the cow’s chest behind the elbow. A caretaker should also know the factors that affect cattle health and welfare, which include environment, genetics, hygiene, and diet.
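The reference values quoted above can be expressed as a simple range check. The sketch below is a hypothetical illustration: the function names and flag strings are not from the paper, the thresholds are the ones stated in the text, and the mapping of such flags to the Alert Levels mentioned in the abstract is not specified in this section, so it is not attempted here.

```python
FEVER_TEMP_C = 39.5        # above this an infection may be indicated (from the text)
HEART_RATE_BPM = (48, 84)  # average adult range quoted in the text


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def check_vitals(temp_c: float, heart_rate_bpm: float) -> list[str]:
    """Return illustrative flags for readings outside the quoted ranges."""
    flags = []
    if temp_c > FEVER_TEMP_C:
        flags.append("temperature above 39.5 C")
    low, high = HEART_RATE_BPM
    if not low <= heart_rate_bpm <= high:
        flags.append("heart rate outside 48-84 bpm")
    return flags


if __name__ == "__main__":
    print(f"39.5 C = {celsius_to_fahrenheit(39.5):.1f} F")
    print(check_vitals(38.5, 60))  # typical adult readings
    print(check_vitals(40.2, 95))  # both readings out of range
```

The conversion confirms that 39.5 °C corresponds to about 103 °F, as stated in the text.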
However, caretakers cannot directly monitor the factors affecting the cattle; instead, the vitals that reflect the cattle’s health and welfare should be considered. Measuring a cow’s temperature can reveal fever, stress, and infection [5]. Moreover, heart rate variability lets caretakers monitor the health of the cattle’s respiratory and nervous systems [6]. Meanwhile, [7] and [8] suggest that tracking the daily steps of cattle with a pedometer can identify the early stages of lameness and indicate fertility. Accordingly, studies using the hardware components were reviewed to establish the components’ credibility and precision. PPG, a noninvasive method of measuring heart rate, is reported to remain accurate even during physical activity. In the study conducted by Bulinova, the error percentage on the bpm parameter of the MAX30102 sensor with an ATMega 2560-16AU is exceptionally low, which supports the accuracy of the pulse oximetry. Moreover, the chosen temperature sensor, the DS18B20, is small and can be calibrated for the setup. In [9], two temperature sensors were used (ambient and core), an arrangement the researchers adopted. In addition, the MPU-6050 pedometer shows an average inaccuracy of 2 missed steps, which indicates its precision. With this, the researchers will use the Wi-Fi module along with the Raspberry Pi 3 Model B+. Aside from the hardware models, the chosen method of analysis, Machine Learning with the KNN algorithm, was discussed. A related study on identifying estrus in cows using the LR and KNN algorithms alongside other machine-learning algorithms (CART, BPNN, and LDA) [10–12] shows a high percentage
Cattle Icare Monitoring System (CIMS): Remote
of accuracy. Aside from the components, the areas where the sensors will be located are relevant because they affect the physical design of the project as well as the accuracy of data acquisition.
Fig. 1. Conceptual framework of the whole study
Figure 1 shows the Input-Process-Output (IPO) conceptual framework of the study. It highlights the processes of the whole system, which are divided into four major parts: (a) gathering of vital parameters, (b) classification of vital parameters, (c) display of vitals and classification, and remote control of the smart sprinkler, and (d) the smart sprinkler system. To address the limitations of traditional livestock monitoring, particularly for cattle, this study aims to develop a monitoring system that can detect and classify cattle’s heart rate, temperature, and number of daily steps, which are useful in recognizing early signs of disease. The study aims to monitor the cattle’s vitals, alert caretakers to abnormalities, and suggest actions when abnormalities occur. The findings of the study would help in detecting disease early and preventing further cattle fatalities.
2 Materials and Methods
Since the study includes a prototype to attain its objectives, applied research methods were used. Applied research systematically uses high-quality research standards, methods, and tools to develop practical solutions for real-world problems.
a. The pre-design stage comprises problem conceptualization and several studies, which are critical in supporting the research and in producing the recommended solution.
b. The design stage focuses on the system design plan used for the project, based on the concepts and methods from the gathered related literature and studies about the use of monitoring systems and sensors on cattle.
c. The development stage is the implementation of the specific ideas that were gathered and considered during the early design stage. This stage is divided into two parts: hardware development and software development.
d. In the testing stage, the CiMS undergoes a series of assessments to verify its accuracy and effectiveness. Each output is evaluated to determine whether the goal accuracy was obtained. The goal accuracy for every output is 90%.
i. A functionality test was done on all sensors and sub-systems. The functionality test for hardware components ensures that the components used are working as expected. Any malfunctioning part is subject to replacement.
ii. There will be 30 trials to evaluate the accuracy of the MAX30102 (PPG sensor). All trials will be conducted while the cattle are at rest, since the cattle must be still to be examined with a stethoscope. The bpm from the PPG sensor and the bpm assessed through a stethoscope will be gathered and compared. Meanwhile, the accuracy of the DS18B20 (temperature sensor) will be assessed in a total of 30 trials while the cattle are at rest, since a thermometer requires the animal to be still for accurate readings. The body temperature readings from the sensor and the thermometer will be compared. In addition, for the ambient temperature, the readings from the sensor and a thermometer will be compared over 30 trials, each with the time of day indicated. For the pedometer, the readings will be compared to the actual steps taken within a specific time frame.
iii. For the accuracy of the Machine Learning classification, a confusion matrix will also be used to represent the result of the classification. The researchers shall train the models on the data sets with different splitting percentages: 80–20, 70–30, and 75–25.
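The sensor comparisons in the testing stage reduce to two small calculations: a percentage error between a sensor reading and its reference instrument, and an average count of missed steps. The sketch below shows one plausible way to compute them. The function names are hypothetical, and treating accuracy as 100% minus the mean percentage error is an assumption, since the text does not define the 90% goal formally.

```python
from statistics import mean


def percent_error(measured: float, reference: float) -> float:
    """Absolute error of the sensor reading relative to the reference, in %."""
    if reference == 0:
        raise ValueError("reference reading must be non-zero")
    return abs(measured - reference) / abs(reference) * 100.0


def mean_percent_error(pairs):
    """Average percent error over (sensor, reference) pairs, one pair per trial."""
    return mean(percent_error(m, r) for m, r in pairs)


def average_missed_steps(pairs):
    """Average of (actual - counted) steps over (counted, actual) pairs."""
    return mean(actual - counted for counted, actual in pairs)


def meets_goal(mean_error_pct: float, goal_accuracy_pct: float = 90.0) -> bool:
    """Assumption: accuracy is taken as 100 % minus the mean percent error."""
    return 100.0 - mean_error_pct >= goal_accuracy_pct
```

For example, `mean_percent_error` would be called with the 30 (sensor bpm, stethoscope bpm) pairs, and `average_missed_steps` with the (pedometer count, observed count) pair recorded for each time frame.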
3 Project Design and Development
The study was designed to gather the three vital parameters of the cow using three (3) sensors: a heart rate sensor, a temperature sensor, and an accelerometer. The sensors were deployed at three different locations, each with its own casing, to ensure the accuracy of the measured parameters. The temperature sensor was placed in the collar together with an ESP8266 and a power supply. Next, the heart rate sensor was positioned on the chest of the cattle for accurate readings. Lastly, the accelerometer was placed on the knee of the cattle with an ESP8266 and a power supply of its own.
Fig. 2. Actual collar belt and pedometer subunit
Figure 2 shows the actual image of the CiMS Collar Belt subunit and pedometer subunit where three casings were designed. The first casing was placed in the neck collar which serves as a housing for DS18B20 (Temperature sensor), WeMos D1 Mini ESP8266, and a power bank for gathering the temperature of the cattle in Celsius. Next, the neck collar has a bridge connected to the belt to ensure the positioning of the
collar belt. The second casing was developed to be the housing of the heart rate sensor, ensuring that the sensor has direct contact with the chest of the cattle to gather the heart rate of the cattle in beats per minute. The last casing was designated for the MPU6050 (Accelerometer) with WeMos D1 Mini ESP8266 and has its own strap placed on the knee of the cattle for accurate reading and data collection of cattle’s daily steps.
Fig. 3. Homepage of the mobile application
Based on Fig. 3, the CiMS mobile application was designed to give easy access to the gathered data for the three parameters, together with their classification, other information about the cattle, and the switch for the Smart Sprinkler System. The application is user-friendly and provides a platform for its users to perform tasks safely, effectively, and efficiently. The application’s homepage allows the user to select which cattle to monitor, provided that each animal is wearing the device so that it has its own unique number. The vitals that users can check in the application are the heart rate, temperature, and number of steps. Through the application, users are given alert levels and suggestion prompts. The Raspberry Pi 3B+ and the two WeMos ESP8266 boards are connected to a Wi-Fi router that is connected to the Internet. The inputs (heart rate sensor, temperature sensor, and pedometer) gather the parameters from the cattle, and the ESP8266 boards send the heart rate, temperature, and number of steps over Wi-Fi to a given IP address. The Raspberry Pi 3B+ acts as a web server accessed by the ESP8266 boards; its Flask server receives commands such as uploading the heart rate, temperature, and number of daily steps. The data are then processed by Machine Learning. The Machine Learning algorithms analyze and classify the gathered data as Normal, Alert level 1, Alert level 2, or Alert level 3. Once processed, the data are uploaded to the cloud hosting and the database over the Internet. Finally, the mobile application requests the data from the cloud hosting and, in response, the parameters and alert level are displayed on the screen of the mobile application. All the data are stored in the database, and the daily number of steps resets every 24 h. The microcontroller used for the Smart Sprinkler System is an Arduino Uno.
Once triggered from the mobile application, the Arduino Uno activates the solenoid valve through a relay module connected to a 12 V DC power supply. The activated solenoid valve releases the water coming from the faucet or water source
which will flow through the hose up to the sprinkler for the misting of the cattle walking through the gate where the sprinkler system will be deployed.
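The server-side bookkeeping described for the data flow above (readings uploaded per animal, with the daily step count reset every 24 h) could be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the class `VitalsStore`, its method names, and the reading kinds are hypothetical, and the actual Flask routes, cloud upload, and database schema are not given in the text and are not shown here.

```python
from datetime import datetime, timedelta

# Hypothetical names for the three kinds of readings sent by the ESP8266 boards.
VALID_KINDS = {"heart_rate", "temperature", "steps"}


class VitalsStore:
    """In-memory stand-in for the server-side record kept for each animal."""

    def __init__(self):
        self.latest = {}        # cattle_id -> {"heart_rate": ..., "temperature": ...}
        self.daily_steps = {}   # cattle_id -> (window_start, step_count)

    def upload(self, cattle_id: str, kind: str, value: float, when: datetime) -> None:
        """Record one reading; step readings are accumulated per 24 h window."""
        if kind not in VALID_KINDS:
            raise ValueError(f"unknown reading kind: {kind}")
        if kind == "steps":
            self._add_steps(cattle_id, int(value), when)
        else:
            self.latest.setdefault(cattle_id, {})[kind] = value

    def _add_steps(self, cattle_id: str, steps: int, when: datetime) -> None:
        start, count = self.daily_steps.get(cattle_id, (when, 0))
        if when - start >= timedelta(hours=24):
            start, count = when, 0   # a new 24 h window begins, so the count resets
        self.daily_steps[cattle_id] = (start, count + steps)

    def steps_today(self, cattle_id: str) -> int:
        """Steps accumulated in the current 24 h window (0 if none recorded)."""
        return self.daily_steps.get(cattle_id, (None, 0))[1]
```

Keying every record by a per-animal identifier mirrors the requirement that each animal wears its own device with a unique number.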
Fig. 4. Components on area 1
The wearable unit composed of the Heart Rate Sensor, Temperature Sensor, and Pedometer will be in Area 1 as shown in Fig. 4. This system will be attached to the cattle.
Fig. 5. Components on area 2
The Main Control Box will be in Area 2 as shown in Fig. 5. It is composed of the Arduino Uno Rev3, relay module, solenoid valve, Raspberry Pi 3B+, ambient temperature sensor, and the main sprinkler system. The first casing is 3.5 cm wide, 10.5 cm long, and 12.5 cm high. It has a fitted hole for the temperature sensor and heart rate sensor, and a sliding cover. The second casing is 2 cm wide, 2.5 cm long, and 2.5 cm high. It has an exposed bottom, since the heart rate sensor requires physical contact with the cattle. The third casing, for the pedometer, is 3.5 cm wide, 9 cm long, and 11.5 cm high. Like the first casing, it has a sliding cover. The Main Control Box shown in Fig. 6 houses the LCD display used to control the system, the router for connection, the Raspberry Pi, and the Arduino Uno board. Figure 7 shows the actual design of the nylon strap. The strap has 2 buckles, located on the neck part and chest part of the subunit. Through the buckle, the cattle
Fig. 6. Main control box
Fig. 7. Design of nylon strap
caretaker can easily remove and place the wearable system. The collar is 90 cm long while the belt is 200 cm long. The purpose of the bridge is to ensure that the sensors stay in place despite the cattle’s movement. The dimensions were determined by measuring the biggest cattle available on the farm. The collar and belt are adjustable so that they fit without endangering the cattle. Figure 8 shows the LCD Main Display of the project. The display has 4 buttons on the right side: the About button, the About Developers button, the User Manual button, and the Shutdown button. When the user clicks the Shutdown button, no data will be fetched or classified.
4 Data Analysis and Interpretation of Results
A. Location of Data Gathering. Data gathering for the Cattle iCare Monitoring System was held at MilkJoy Farm, located in Ayusan II, Tiaong, Quezon Province, for 120 days from August 2022 to December 2022.
B. Functionality Test. Through the functionality test, the proponents gathered enough evidence to confirm that their chosen sensors and program work as stated. There are six major functionality tests covering all the chosen sensors and actuators that make up the system, and each has 3 trials to test whether it will
Fig. 8. LCD main display page
be successful or not. When the proponents finished the functionality test, all trials were deemed successful. The remarks and decisions used in determining the success of the trials are also based on past research and statements from experts in the field.
C. Accuracy Test for Sensors.
Heart rate sensor (MAX30102 PPG sensor). The heart rate sensor was placed on the cattle while the farm veterinarian simultaneously measured the heart rate with a stethoscope. The lowest percentage error is 1.1363%, while the highest is 11.6666%. The average percentage error is 3.247%.
Temperature sensor (DS18B20). The accuracy of the body temperature reading was assessed by comparing the temperature sensor with a rectal thermometer for veterinary use. The measuring time of the traditional method is 60 s. The thermometer was used at the same time as the temperature sensor on the collar. The average percentage error is 0.71229%.
Missed steps on pedometer reading. The accuracy of the pedometer’s step reading was tested by placing the pedometer on the cattle while the researchers observed the motion and compared the number of steps every 5 min. The average number of missed steps is 1.366666667.
D. Accuracy of Machine Learning. The researchers trained and evaluated three Machine Learning models, namely K-Nearest Neighbors (KNN), ANN, and SVM. The models were used to classify the vitals as Normal, Alert 1, Alert 2, or Alert 3, using three data splits: 70%-30%, 80%-20%, and 75%-25%. The researchers used two different Machine Learning models in the prototype, since there are two different timeframes: one for the classification of heart rate and surface temperature, and the other for the daily steps of cattle. The most accurate Machine Learning model is embedded in the Raspberry Pi of the system.
For heart rate and temperature, the most accurate model is ANN (accuracy = 96.30%), while for the steps, the model used was KNN (accuracy = 100%).
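The evaluation procedure described in this section (training on different data splits and summarizing the predictions in a confusion matrix) can be illustrated with a small, dependency-free sketch. Only KNN is shown; the ANN and SVM models are not reproduced. All function names are hypothetical, the features are abstract two-dimensional points, and the data are synthetic clusters generated purely for demonstration; they are not the study's data and carry no clinical meaning.

```python
import math
import random
from collections import Counter

LABELS = ["Normal", "Alert 1", "Alert 2", "Alert 3"]


def train_test_split(rows, test_fraction, seed=0):
    """Shuffle a copy of the rows and hold out test_fraction of them as test data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, features, k=3):
    """Majority label among the k training rows closest in Euclidean distance."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def confusion_matrix(actual, predicted, labels=LABELS):
    """matrix[i][j] counts samples with true label i that were predicted as label j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Synthetic demonstration data: four arbitrary clusters, one per label.
rng = random.Random(1)
centers = {"Normal": (0, 0), "Alert 1": (10, 0), "Alert 2": (0, 10), "Alert 3": (10, 10)}
rows = [((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)), label)
        for label, (cx, cy) in centers.items() for _ in range(20)]

for test_fraction in (0.20, 0.30, 0.25):
    train, test = train_test_split(rows, test_fraction)
    predicted = [knn_predict(train, x) for x, _ in test]
    matrix = confusion_matrix([y for _, y in test], predicted)
    print(f"{1 - test_fraction:.0%}-{test_fraction:.0%} split: accuracy {accuracy(matrix):.2f}")
```

Reading the matrix row by row shows which alert levels are confused with one another, which a single accuracy figure hides.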
5 Conclusions
The researchers developed a monitoring system that can detect and display alert level classifications based on cattle’s heart rate, temperature, and number of daily steps, which are useful in recognizing early signs of disease. A wearable collar belt with embedded sensors for heart rate and temperature was created. The final prototype of the collar-belt unit is composed of an ESP8266, a MAX30102 (heart rate sensor), a DS18B20 (temperature sensor), and a 20,000 mAh power bank. The collar belt is placed on the neck and chest of the cattle. A wearable pedometer belt with a pedometer sensor and a microprocessor was created. The final prototype of the pedometer unit is composed of an ESP8266, an MPU6050 (pedometer), and a 20,000 mAh power bank. The pedometer belt is placed on the upper thigh of the cattle. A smart sprinkler system connected to the mobile application of the system was developed. The researchers gathered the average daily steps of 30 healthy cattle with the same setup. Using a commercial pedometer, the average daily steps of cattle recorded is 412 steps. The researchers trained and evaluated three Machine Learning models, namely K-Nearest Neighbors (KNN), ANN, and SVM. The most accurate Machine Learning model is embedded in the Raspberry Pi of the system. For heart rate and temperature, the most accurate model is ANN with 96.30% accuracy, while for the steps, the model used was KNN with 100% accuracy. The whole system was evaluated through different tests of its accuracy and reliability. The accuracy of the sensors was tested by comparing the data gathered by CiMS with those from traditional tools. Comparing the beats per minute assessed with the stethoscope and the MAX30102 PPG sensor, the percentage error is 2.567%. Meanwhile, comparing the surface temperature from the DS18B20 temperature sensor with the rectal thermometer reading, the percentage error is 0.71229%. Lastly, comparing the actual steps with the pedometer’s step reading, the average number of missed steps is 1.3.
References
1. Rodriguez, I.: Live Stock Farming - Problems & Solutions. [online] seventeen goals Magazin (2022). Available at: https://www.17goalsmagazin.de/en/livestock-farming-problemsand-solutions/. Accessed 9 April 2022
2. The World Bank Group: Moving Towards Sustainability: The Livestock Sector and the World Bank (2022). [online] Available at: https://www.worldbank.org/en/topic/agriculture/brief/moving-towards-sustainability-the-livestock-sector-and-the-world-bank. Accessed 9 April 2022
3. Successful Farming: 16 Common Cattle Breeds. [online] Successful Farming (2022). Available at: https://www.agriculture.com/family/living-the-country-life/16-common-cattlebreeds. Accessed 19 May 2022
4. Nadis.org.uk: NADIS Animal Health Skills - The Healthy Cow (2022). [online] Available at: https://www.nadis.org.uk/disease-a-z/cattle/the-healthy-cow/#:~:text=The%20adult%20cow%20has%20a,in%20conjunction%20with%20several%20diseases. Accessed 3 April 2022
5. Liu, J., Li, L., Chen, X., Lu, Y., Wang, D.: Effects of heat stress on body temperature, milk production, and reproduction in dairy cows: a novel idea for monitoring and evaluation of heat stress - a review. Asian Australas. J. Anim. Sci. 32(9), 1332–1339 (2019)
6. Kashou, A., Basit, H., Chhabra, L.: Physiology, Sinoatrial Node (2021). [online] Ncbi.nlm.nih.gov. Available at: https://www.ncbi.nlm.nih.gov/books/NBK459238/. Accessed 19 May 2022
7. Mazrier, H., Tal, S., Aizinbud, E., Bargai, U.: A field investigation of the use of the pedometer for the early detection of lameness in cattle (2016). [online] Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1555681/. Accessed 19 May 2022
8. Madureira, A., Burnett, T., Bauer, J., Cerri, R.: Pedometers on Dairy Cattle can Point to Better Fertility (2020). [online] Dairy Research Blog. Available at: https://dairyresearchblog.ca/2020/07/21/pedometers-on-dairy-cattle-can-point-to-better-fertility/. Accessed 19 May 2022
9. Bulínová, B.: Remote Monitoring of Animals Vital Functions by Using IoT (2019). [online] Available at: https://dspace5.zcu.cz/bitstream/11025/37342/1/DP_Bulinova.pdf. Accessed 3 April 2022
10. Rosales, M.A., Magsumbol, J.-A.V., Palconit, M.G.B., Culaba, A.B., Dadios, E.P.: Artificial intelligence: the technology adoption and impact in the Philippines. In: 2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), Manila, Philippines, pp. 1–6 (2020). https://doi.org/10.1109/HNICEM51456.2020.9400025
11. Rosales, M.A., Bandala, A.A., Vicerra, R.R.P., Sybingco, E., Dadios, E.P.: Oreochromis niloticus growth performance analysis using pixel transformation and pattern recognition. J. Adv. Comput. Intell. Intell. Informatics 26(5), 808–815 (2022)
12. Rosales, M.A., de Luna, R.G.: Computer-based blood type identification using image processing and machine learning algorithm. J. Adv. Comput. Intell. Intell. Informatics 26(5), 698–705 (2022)
Identification of the Distribution of the Viral Potato Infections
P. Ishkin1, V. Rakitina1, M. Kincharova1, and I. Yudaev2(B)
1 Samara State Agrarian University, Uchebnaya St., 2, 446442 Ust-Kinelskiy, Russia
2 Kuban State Agrarian University, Kalinina St. 13, 350044 Krasnodar, Russia
[email protected]
Abstract. The yield and quality of potatoes depend on the degree of their infection with a number of pathogens, including bacteria, fungi, viruses and viroids. Viral infection can cause significant damage to this food crop. The species composition and prevalence of viruses in potato plantings were determined from statistical material obtained by the staff of the Research Laboratory for Plant Protection of the Samara State Agrarian University for the period 2010–2019. Potato virus infection was established in the laboratory by enzyme-linked immunosorbent assay (ELISA) using monoclonal and polyclonal antibodies and conjugates labeled with alkaline phosphatase from the company “Bioreba” (Switzerland). Over ten years, about 63 thousand plants and tubers from 8 districts of the region were examined. The survey results indicate that the main potato viruses are widespread in the Samara region and are present in almost every farm. Of all the viruses tested for, PVS turned out to be the most common in the region: on average over the 10 years it was found in 31.7% of all selected and tested tubers. Of the group of strong mosaics, PVY is widespread (16.2%), while potato leafroll virus (PLRV) is quite rare (0.8% of tubers). The studies indicate that two main strain groups of PVY are now widespread in the farms of the region: PVYo and PVYn, the latter previously considered uncommon in the region. On heavily infected plantings, PVY was found in 47% of plants; the necrotic strain PVYn was found in 32.6% of PVY-infected plants, while PVYo was found in 38.9%. The spread of PVA (3.1% of all tested tubers), PVM (2.5%) and PVX (1.1%) was also revealed.
The new data obtained on seed potato infection in various districts of the Samara region made it possible to adjust the phytosanitary regulations for the production of seed potatoes in the region. Research in this direction will be continued.
Keywords: Plant protection · Infection · PVY virus
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 77–83, 2024. https://doi.org/10.1007/978-3-031-50327-6_9
1 Introduction
Potatoes are the fourth most important food crop in the world (after rice, wheat and corn) and the first among non-grain crops. Potato consumption per person in Russia is considered one of the highest on the planet, while the yield is one of the lowest [1–4]. The poor quality
of seed material, heavily infected with harmful pathogens, is one of the main factors determining the chronically low level of potato yield [1, 5–10]. The yield and quality of potatoes depend on the degree of their infection with a number of pathogens, including bacteria, fungi, viruses and viroids. Viral infection can cause significant damage to this food crop [11, 12]. Many authors note the marked expansion over the past 20 years of vector insects (aphids, whiteflies, thrips and cicadas) and of the associated viruses of the genera Potyvirus, Begomovirus, Tospovirus, etc. [13–16]. Although more than 40 viruses have been described as infecting potatoes under natural conditions, only nine of them are of great economic importance for world potato growing. These are potato leafroll virus (PLRV), potato viruses A, M, S, V, X and Y (PVA, PVM, PVS, PVV, PVX and PVY), potato mop-top virus (PMTV) and tobacco rattle virus (TRV). PLRV and PVY are currently considered the most dangerous viruses. They are ubiquitous and can cause up to 80% loss of potato yield, depending on the region, variety and various environmental factors [10]. The increase in the number of viruses that cause potato diseases and the changes in the geography of their distribution reflect the general process of the relationship between phytoviruses and their hosts in modern agricultural production. Understanding the global situation is necessary to optimize all the links of integrated protection, which will ensure sustainable production of high-quality potato crops in Russia. The low quality of seed material, heavily infected with harmful pathogens, is one of the main factors that determine the chronically low level of potato yield [17]. For the main potato-producing regions of the Russian Federation, PVY (various strains), PVM and PLRV are especially relevant in terms of distribution and harmfulness.
Where they are widely distributed, crop losses can reach 50% or more. A particularly dangerous trend is the widespread distribution and increasing harmfulness of PVY on many potato varieties in economic and commercial circulation [18]. The great harmfulness of this virus is primarily due to the fact that it is spread quickly by vectors, for example green peach aphids (Myzus persicae), and, causing severe and moderate forms of viral infection on plants (wrinkled and striped mosaics), is always transmitted to vegetatively propagated offspring through tubers [19]. In recent years, the importance of PVY has greatly increased due to the wide spread of new and more harmful strains of this virus [20]. The purpose of the research was to improve the protection of potatoes from viral diseases, taking into account the bioecological features of the viruses and the manifestations of their harmfulness. This protection is based on the use and activation of plant resistance and on adaptation to the agroecological and phytosanitary conditions of crop growing in the forest-steppe of the Middle Volga region, and is intended to increase the yield and quality of tubers and to make protective measures more environmentally friendly. To this end, the following tasks were set: to identify and clarify the species composition of potato viruses in the forest-steppe of the Middle Volga region, given the changed assortment of cultivated potato varieties and the changed phytosanitary situation, and to determine their distribution and harmfulness in the Samara region.
2 Materials and Methods
To determine the species composition and prevalence of viruses in potato plantings, statistical material obtained by the staff of the Scientific Research Laboratory for Plant Protection of the Samara State Agrarian University (2010–2019) was used. Potato varieties were evaluated for viral infection by sampling potatoes on the farms of the region that grow seed potatoes and testing them in the laboratory according to generally accepted methods and national standards (GOST).
Sample selection. The seed fraction of tubers was sampled after the tops had died off. Samples were taken before, during or after harvesting. A certain number of tubers of each reproduction were selected while crossing the site diagonally at 10 or 20 points, one tuber from each of 10 bushes in a row (100 or 200 tubers, depending on the reproduction), regardless of the planting area. Selecting tuber samples in the field makes it possible to form a representative sample and to reliably account for the situation on edge plantings, where plants are more susceptible to infection with viruses (PVY, PVM, PLRV) transmitted by migratory aphid species. Alternatively, the sample size reflected the area of the reproduction planting; for example, 200 tubers were taken from every 5 ha. This approach is more indicative and gives more confidence about the health of the entire crop, but requires more time for laboratory research. To show the infection of a given reproduction more clearly, samples were taken from the row after the tops died off and before harvesting. If sampling was carried out after harvesting, the tubers were taken from storage, from the entire batch of the given planting.
Analysis for the presence of a virus. Post-harvest testing focused on the detection of aphid-borne viruses that cause strong mosaics, such as PVY, PLRV and PVA.
Tests for the viruses that cause weak mosaics (PVS, PVM, PVX) are less important because these viruses have little effect on the crop, so they were performed somewhat less frequently. Post-harvest testing was carried out in the autumn-winter and/or spring periods on plant material obtained by germinating tuber indices (a tuber eye with adjacent tissue) for 30 days (illumination 14–16 h a day) at a temperature of 20–25 °C until green leaves appeared. Tuber indices were obtained by cutting a piece of tuber tissue with an upper eye from each tuber of the selected sample. They were then immediately treated with a solution of gibberellic acid (10 mg/l) by soaking for 10 min. Subsequently, the solution was drained and, after drying for 24 h, the tuber indices were planted in soil in boxes. The infection of potatoes with viruses during the growing season and after harvesting was determined by enzyme-linked immunosorbent assay (ELISA) using monoclonal and polyclonal antibodies and alkaline phosphatase conjugates from Bioreba (Switzerland), according to the manufacturer’s method.
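The two sampling schemes described under sample selection above imply simple sample-size arithmetic, sketched below. The function names are hypothetical, and rounding a partial 5 ha block up to a full block is an assumption, since the text gives only the example of 200 tubers per 5 ha.

```python
import math


def tubers_by_diagonal(points: int, bushes_per_point: int = 10) -> int:
    """One tuber from each of `bushes_per_point` bushes at every sampling point."""
    if points not in (10, 20):
        raise ValueError("the described scheme uses 10 or 20 points")
    return points * bushes_per_point


def tubers_by_area(area_ha: float, tubers_per_block: int = 200, block_ha: float = 5.0) -> int:
    """200 tubers per 5 ha block; rounding up partial blocks is an assumption."""
    if area_ha <= 0:
        raise ValueError("area must be positive")
    return math.ceil(area_ha / block_ha) * tubers_per_block
```

Under these assumptions, the diagonal scheme yields 100 or 200 tubers regardless of the planting area, whereas the area-based scheme grows with the planting, which is why it requires more laboratory time.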
3 Results and Discussion
The main problem of potato growing in the Samara region, as throughout the world, remains the quality of seed material, which is determined by viral infection. In order to determine the species composition and prevalence of viruses in potato plantings,
statistical material was used, obtained by the employees of the Research Laboratory for Plant Protection of the Samara State Agrarian University over ten years (2010–2019). In the Samara region, 6 major potato viruses are monitored annually: PVY, PLRV and PVA, which are classified as strong potato mosaics, and PVS, PVM and PVX, classified as weak ones. Of all the viruses tested for, PVS turned out to be the most common in the region: on average over the 10 years it was found in 31.7% of all selected and tested tubers. This is because the foreign-bred varieties cultivated in the region, as a rule, initially contain this virus. From the group of strong mosaics, PVY (16.2%) has become widespread, whereas potato leafroll virus (PLRV) is much less common (0.8% of all tested tubers) (Fig. 1).
Fig. 1. Average prevalence of potato viruses in the forest-steppe of the Samara region (2010–2019)
The analysis of seed potatoes also revealed PVA in 3.1% of all tested tubers, PVM in 2.5% and PVX in 1.1%. The low prevalence of these viruses is most likely due to their mode of transmission, as well as to a change in the assortment of varieties (Fig. 2). Over ten years, about 63 thousand plants from 8 districts of the region were examined. The results of the survey indicate that the main potato viruses are widespread in the Samara region and are present in almost every farm. A high infectious background has developed on farms located in the Central zone of the region. Here, almost half of the cultivated areas are affected by PVS and one third by PVY. As noted above, in recent years a huge number of varieties of domestic and foreign breeding have appeared on the Russian potato market, so research results often contradict previously established knowledge. In the Samara region, no research on the distribution and strain diversity of potato virus Y was carried out for a long time. The latest data are reflected in the
Fig. 2. The degree of damage to potato plants by viral infection in the Samara region by year, %
works of researchers from the 1970s, which note that under the conditions of the Samara region potatoes are mainly affected by the PVYo strain, which is identical to the known strain of this virus (according to Smith, 1960). The present studies indicate that two main strain groups of PVY are now widespread in the farms of the region: PVYo and PVYn. According to the data obtained, under a strong infectious background (on heavily infected potato plantings), PVY was found on average over the years of research in 47% of plants. Moreover, the necrotic strain PVYn, which was previously considered uncommon, was found in 32.6% of plants infected with PVY, while PVYo was found in 38.9%. There are several reasons for the wide spread of the necrotic strain of potato virus Y: symptoms on the leaves are weak or completely absent, so diseased plants are often not removed from the field during phytocleaning; the virus occurs at low concentrations that are not always detected by ELISA; and it is not controlled in seed material. Active transmission of the virus by various aphid species and by contact should also be taken into account, as should the established false idea that the pathogen is not common in the Volga region. In conclusion, there is in general a downward trend in the number of infected tubers, with the exception of PVY and PVS, which in recent years have spread throughout the region, reaching 36.8 and 47.3%, respectively. Also, according to the data obtained over 10 years on farms of the region with irrigation and a high level of agricultural technology, the following varieties are strongly affected: from the early ones - Feloks, Impala, Zhukovsky early, Latona, Plante, Red Scarlett; mid-early - Rodriga, Lanorma, Volzhanin, Santa; mid-season -
82
P. Ishkin et al.
Aurora, Desiree; middle-late - Hermes and Saturn. They had PVY ranging from 32.1 to 87.5%, and PVS - 0 and 96.8%, respectively.
4 Conclusion
The analysis of the results shows that the main potato viruses are widespread in the Samara region and are present on almost every farm. PVY and PVS are exceptionally widespread in the region and affect the highest reproductions of released and promising varieties, whereas PLRV is quite rare. The studies indicate that two main strain groups of PVY are now widespread in the farms of the region: PVYo and PVYn, the latter previously considered uncommon. A low incidence was found for PVA (3.1% of all tested tubers), PVM (2.5%) and PVX (1.1%). Because PVY, according to our data and the literature, is widespread and highly harmful, we consider it necessary to continue strict control of its spread. Thus, ten years of research yielded new data on the infection of seed potatoes in various potato-producing areas of the Samara region, which made it possible to adjust the phytosanitary regulations for seed potato production in the region.
Smart Irrigation System for Farm Application Using LoRa Technology
Alfredo P. Duda, Vipin Balyan(B), and Atanda K. Raji
Department of Electrical, Electronic and Computer Engineering, Cape Peninsula University of Technology, Cape Town, South Africa
[email protected]
Abstract. The Internet of Things (IoT) is now one of the most promising fields for generating information for future goods and services. One of its uses is smart farming, which automates irrigation systems and monitors agricultural fields to save water. This paper therefore presents an intelligent irrigation system for farm application using LoRa technology, which will help farmers optimise and manage water for crop irrigation, prevent water loss and minimise the cost of labour. The system consists of two main hardware components: the LoRa sensor node and the LoRaWAN gateway. It employs smart sensors, a LoRa communication device, smart control, and cloud-based monitoring and control. The sensor modules on the LoRa node comprise a temperature and humidity sensor, a capacitive soil moisture sensor, a rain sensor and a passive infrared sensor. The sensors gather environmental data, which are then transmitted to the LoRa gateway over the LoRa link; the radio module is interfaced using the Serial Peripheral Interface protocol. Moreover, a cloud monitoring system has been developed for real-time data monitoring, alarm and pump control.
Keywords: LPWAN · Smart irrigation · LoRa · LoRaWAN · IOT · Water wastage · Node · Gateway · Sensors · Agriculture · Crops · Long range
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 84–94, 2024. https://doi.org/10.1007/978-3-031-50327-6_10

1 Introduction
For thousands of years, the human race has relied on agriculture for food production and on irrigation for watering crops. Agriculture is the use of land for breeding animals and growing crops to produce food, medicinal herbs and other goods that support and improve life [1]. Irrigation is the process of watering land or soil to assist the growth of crops, landscape maintenance and the revegetation of damaged soil in dry areas and during periods of inadequate rainfall [2]. Water is the source of life for every living thing and is needed for growth, for example in the agriculture sector [3]. Traditionally, irrigation in the agriculture sector has been done manually, by making water available to crops without knowing the appropriate amount needed. These old systems have been a source of water waste in agriculture and have destroyed crops. Introducing automation in irrigation helps farmers monitor the amount of water being used, which avoids over-saturated crops and improves yields and cost-effectiveness [4, 5].
Irrigation has been a primary component of agriculture, and different irrigation systems have been developed over the years. However, many irrigation systems today do not apply controlled amounts of water to crops at the needed intervals, revegetate disturbed soil in dry areas or optimise water use during periods of low rainfall. Many systems are time-consuming, have inefficient regulation mechanisms, require intense labour and overwater, which eventually results in water wastage [6].
The two most important components of irrigation are the timing and the quantity of watering. It is possible to predict automatically when plants might require water by using sensors and other techniques. The system will be made up of sensors that measure soil moisture, and it will turn on the pump automatically when the moisture falls below the threshold required for healthy plant growth. When the soil reaches the appropriate moisture level for the given time, the pump will shut off automatically.
Field operations in agriculture are changing and now demand high levels of accuracy in order to maximise crop output and quality while lowering production costs. To satisfy these needs, automation mechanisms must be put in place. To achieve a high level of automation, it is imperative that the producer researches and implements the early stages of mechanisation. This research aims to design and implement a smart irrigation system for farm applications using LoRa technology, which will help farmers optimise and manage water for crop irrigation, prevent water loss and minimise the cost of labour.
2 Literature Survey
LPWAN technologies such as LoRa, Sigfox, LTE Cat M1 and NB-IoT have arisen to support a large portion of the growing IoT market; the applications in which they are used differ from one another in their requirements and usage [7, 8]. Given the enormous market opportunity and the advantages for businesses and consumers, there is significant interest in IoT and there are multiple approaches to addressing this market. LPWAN has been one of the fastest-growing technologies in the Internet of Things ecosystem since 2013 and is helping to drive this technological movement. LPWAN is a good solution for mobile devices that have to send data over long ranges while preserving battery life [9, 10]. Because nodes are dispersed over broad geographical areas and are costly to replace, a long battery life is essential in agriculture. Compared to competing technologies (such as GSM, 3G and LTE) [11, 12], LPWAN technologies offer battery savings that keep nodes constantly available while maintaining a relatively high output power [13].
LoRa is a spread-spectrum technology developed by Semtech and standardised by the LoRa Alliance in 2015. This LPWAN technology modulates the signal using chirp spread spectrum in the sub-GHz range to enable bidirectional communication, and it renders the modulated signal resistant to channel noise and interference. LoRaWAN is an open-standard network stack that builds on the LoRa physical layer, aiming to give devices and sensors the ability to transmit and receive data frames at low data rates, over long distances and with distinct time durations for the transmissions [14, 15].
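To illustrate how the chirp spread spectrum parameters trade data rate for range, the following minimal sketch computes the LoRa symbol duration and nominal bit rate from the spreading factor, bandwidth and coding rate. The relations are the standard ones given in the Semtech LoRa modem documentation; the function names and the example settings (125 kHz bandwidth, coding rate 4/5) are assumptions chosen for illustration and are not taken from this paper.

```python
def lora_symbol_time(sf: int, bw_hz: float) -> float:
    """Duration of one LoRa symbol in seconds: T_s = 2^SF / BW."""
    return (2 ** sf) / bw_hz


def lora_bit_rate(sf: int, bw_hz: float, cr: int = 1) -> float:
    """Nominal bit rate in bit/s.

    R_b = SF * (4 / (4 + CR)) / T_s, with CR in 1..4
    corresponding to coding rates 4/5 .. 4/8.
    """
    if cr not in (1, 2, 3, 4):
        raise ValueError("cr must be 1, 2, 3 or 4")
    return sf * (4 / (4 + cr)) / lora_symbol_time(sf, bw_hz)


if __name__ == "__main__":
    # Assumed example settings: 125 kHz bandwidth, coding rate 4/5.
    for sf in range(7, 13):
        ts_ms = 1000 * lora_symbol_time(sf, 125e3)
        rb = lora_bit_rate(sf, 125e3, cr=1)
        print(f"SF{sf}: symbol time {ts_ms:.3f} ms, bit rate {rb:.0f} bit/s")
```

Each step up in spreading factor doubles the symbol time and roughly halves the bit rate, which is the price paid for the longer range discussed in this section.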
LoRaWAN can cover distances of up to 48 km in rural areas or with line of sight and more than 3 km in urban areas, with a link budget of 158 dB to 168 dB depending on the environment and any obstructions present. With such coverage, it can compete with existing cellular technologies. LoRaWAN is enabled by LoRa's unique spread-spectrum modulation scheme [16]. Compared to narrowband schemes, spread-spectrum techniques are more robust in very noisy channel conditions and better at mitigating interference [10, 17].
LoRaWAN was developed to minimise the power demands of connected sensors and increase their battery life [18]. In its lowest power mode it employs an asynchronous communication technique, allowing nodes to sleep most of the time and wake up only when they have data to transmit, returning to power-saving mode immediately or after the transmission is acknowledged [19].
LoRaWAN uses a star-of-stars network architecture, provides long-range bidirectional communication and runs in unlicensed spectrum. End nodes are not tied to a single gateway; instead, they transmit data to all gateways within range. Each gateway can support tens of thousands of sensor nodes. To reduce power consumption and maximise network capacity, LoRaWAN data rates are scalable and governed by an adaptive data rate algorithm. Because a gateway logically functions as a link-layer relay, end devices are logically connected directly to the network server. In this two-way connection, uplink transmission from the end device to the network server is expected to be the predominant traffic. End devices, gateways, network servers and application servers make up the LoRaWAN architecture.
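The link budget figures quoted above can be related to the radio settings with a short calculation. The sketch below estimates receiver sensitivity from the thermal noise floor, the bandwidth, a receiver noise figure and the demodulator SNR limit for each spreading factor, and then subtracts it from the transmit power. The SNR limits are the typical values listed in the Semtech SX127x datasheets; the 6 dB noise figure, the 20 dBm transmit power and the function names are assumptions for illustration, not values reported in this paper.

```python
import math

# Typical LoRa demodulator SNR limits in dB per spreading factor
# (values as listed in the Semtech SX127x datasheets).
SNR_LIMIT_DB = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}


def sensitivity_dbm(sf: int, bw_hz: float, noise_figure_db: float = 6.0) -> float:
    """Estimated receiver sensitivity: -174 + 10*log10(BW) + NF + SNR_limit."""
    return -174.0 + 10.0 * math.log10(bw_hz) + noise_figure_db + SNR_LIMIT_DB[sf]


def link_budget_db(tx_power_dbm: float, sf: int, bw_hz: float,
                   noise_figure_db: float = 6.0) -> float:
    """Maximum tolerable path loss: transmit power minus receiver sensitivity."""
    return tx_power_dbm - sensitivity_dbm(sf, bw_hz, noise_figure_db)


if __name__ == "__main__":
    # Assumed settings: 20 dBm transmit power, SF12, two example bandwidths.
    for bw in (125e3, 7.8e3):
        s = sensitivity_dbm(12, bw)
        lb = link_budget_db(20.0, 12, bw)
        print(f"BW {bw / 1e3:.1f} kHz: sensitivity {s:.1f} dBm, link budget {lb:.1f} dB")
```

Under these assumptions the estimate is of the same order as the 158 dB to 168 dB range cited in the text; narrower bandwidths and higher spreading factors improve sensitivity at the cost of data rate.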
3 System Design
The design of the system is divided into hardware, which consists of the LoRa sensor node and the LoRaWAN gateway, and software, which consists of the algorithms that run on this hardware.

3.1 LoRa Sensor Node
The LoRa sensor node was required to gather environmental data such as soil moisture content, ambient temperature, humidity, rainfall and movement around the hardware. The data were then processed and communicated to the LoRaWAN gateway. The LoRa sensor node uses the SHTC3 for humidity measurement from 0 to 100% RH and temperature measurement from −40 °C to 125 °C, with typical accuracies of ±2% RH and ±0.2 °C. A capacitive soil moisture sensor was used for soil moisture. The system also used a rain sensor to detect rainfall and a passive infrared sensor for motion detection for security purposes. For communication between the LoRa sensor node and the LoRaWAN gateway, the SX1278 LoRa radio was used. The SX1278 transceiver includes the LoRa™ long-range modem, which offers ultra-long-range spread-spectrum communication and high interference immunity while minimising current consumption. The module uses half-duplex communication, allowing it to send and receive signals on the same medium but not simultaneously.

LoRa sensor node hardware: power supply unit
Figure 1 shows the LoRa sensor node power supply unit, which was required to provide a stable DC voltage to charge the battery and power all modules and sensors. The power supply unit operates over an input DC voltage range of 8 V to 24 V, allowing a wide range of power sources to be used, and supports reverse polarity protection in case of a reversed supply connection.
Fig. 1. The complete power supply unit schematic of the LoRa sensor node.
Communication unit
The SX1278 LoRa transceiver module by Semtech was selected. The module interfaces with the LoRa sensor node PCB through a 16-pin female header with a pitch of 2.54 mm. The module is connected to the board over the SPI communication protocol, allowing the computing unit to communicate with the communication unit.

LoRa sensor node program routine
Figure 2 presents the design of the LoRa sensor node along with its mode of operation.

Start-up mode
When the sensor node is switched on or restarted, it first includes the libraries used, initialises the GPIOs and global variables, and initialises the peripheral hardware before arriving at the main loop. This setting up of the GPIOs and initialisation of peripherals is known as the sensor node start-up routine. The node then checks which devices are connected and whether they are functioning. If they are not, the user is alerted via a message on the UART and the display. After that, the LoRa module is set up according to the parameters given in subsection.
Fig. 2. Completed LoRa Sensor node PCB with all surface mount
Once the LoRa module is fully functional, the node start-up routine is complete and the program continues to the main loop.

Continuous mode
The main routine runs inside a continuous loop. The sensor node gathers data from each sensor, packs all the gathered data into a message variable and sends it to the gateway address, after which the module goes to sleep. After 5 min, the MCU wakes up, gathers the data again and goes back to sleep. If the PIR sensor is activated, the MCU wakes up and sends only the PIR message, alerting the user.

3.2 LoRaWAN Gateway
The LoRaWAN gateway is the base station for the LoRa sensor node. It is responsible for connecting the private network to the internet and the cloud. The LoRaWAN gateway consists of a power supply, control unit, water pump control, siren indication, communication module and OLED display.

Power supply unit
The voltage supplied by a 12 V DC adapter or 12 V battery is filtered by C3, C4 and C5 to remove unwanted noise and is then fed to the L7805 fixed positive regulator, which regulates the 12 V down to 5 V. The 5 V rail feeds the LT1117-3.3V LDO and the PWR LED indicator, which shows whether the system is powered.

Control unit
The control unit controls the pump switch, the siren and the irrigation timer, and transmits the data gathered from the LoRa sensor node to the network server. It consists of an ESP32, an OLED display pin header, irrigation timer setting pushbuttons, a siren on/off pushbutton, programming pads, a BOOT pushbutton and a reset pushbutton.

Siren indicator
The siren indicator was designed to report when there is an obstacle around the LoRa sensor node. It is used to detect and identify possible threats to the system and to provide early warning in the event that an attack is able to exploit a system vulnerability.

Water pump relay
The water pump relay was implemented to allow irrigation to be controlled in combination with the soil moisture sensor, as shown in Fig. 3.
Fig. 3. Water pump relay circuit of the LoRaWAN gateway.
The pump controller relay operates as follows. When the LoRa sensor node reports that the soil moisture content has fallen to a level considered dry, the pump controller switches the pump on for the irrigation time set by the user. The pump goes off when the moisture sensor detects that the soil moisture content has reached the desired state, considered wet.

Communication unit
The LoRaWAN gateway uses the same communication module as the LoRa sensor node. The module is mounted on the LoRa gateway PCB through a 16-pin female header with a pitch of 2.54 mm, as shown in Fig. 4. The module is connected to the board over the SPI communication protocol, allowing the computing unit to communicate with the communication unit.
4 Results and Discussion
This section describes the results of several tests performed on the hardware, software and communication of the LoRa gateway and the LoRa sensor node. To verify that the hardware and software function as intended, a communication range test and environmental monitoring tests were performed.
Fig. 4. Completed LoRaWAN gateway PCB with all surface mount and through hole components assembled.
4.1 Communication Range Test
The test site was chosen because it resembles the environment in which the network is intended to operate. Measurements were taken over distances from 0 to 1500 m in steps of 50 m.

4.2 RSSI Measurement
In wireless communications, the Received Signal Strength Indicator (RSSI) is a measurement of the power present in a received radio signal. The RSSI indicates the communication quality between two radios attempting to communicate with one another. It is measured in dBm, with values closer to 0 dBm indicating better communication quality and more negative values indicating worse quality. Figure 5 displays the RSSI readings for the radios over the test area. The distance at which a radio was unable to establish communication can be seen in the figure as the point where the RSSI reached −120 dBm, where the blue points meet the red line. The RSSI decreased as the distance between the two radios increased. This is expected because, as the distance between the radios increases, the signals between them suffer greater free-space losses. The signals are also gradually distorted and refracted by environmental objects or absorbed by the soil and vegetation. In summary, the LoRa sensor node and LoRaWAN gateway were able to exchange packets without error over distances of up to 1 km in an area with busy roads, flat-type buildings and trees. Beyond this distance, the packet delivery rate approached 0%.
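The free-space contribution to the RSSI decay described above can be estimated with the standard free-space path loss formula. The following sketch is only an idealised reference, not the authors' analysis: the 433 MHz carrier and 17 dBm transmit power are assumptions (the paper does not state the operating frequency or power), antenna gains are ignored, and the hypothetical `extra_loss_db` parameter stands in for the additional attenuation from buildings, vegetation and soil.

```python
import math


def fspl_db(distance_m: float, freq_mhz: float) -> float:
    """Free-space path loss in dB: 20*log10(d_km) + 20*log10(f_MHz) + 32.44."""
    if distance_m <= 0 or freq_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 20 * math.log10(distance_m / 1000) + 20 * math.log10(freq_mhz) + 32.44


def expected_rssi_dbm(tx_power_dbm: float, distance_m: float, freq_mhz: float,
                      extra_loss_db: float = 0.0) -> float:
    """Idealised RSSI: transmit power minus free-space loss minus any extra loss."""
    return tx_power_dbm - fspl_db(distance_m, freq_mhz) - extra_loss_db


if __name__ == "__main__":
    # Assumed values: 17 dBm transmit power, 433 MHz carrier.
    # The first waypoint after 0 m is at 50 m, matching the test step size.
    for d in range(50, 1501, 250):
        print(f"{d:5d} m: free-space RSSI {expected_rssi_dbm(17.0, d, 433.0):7.1f} dBm")
```

Free-space loss alone grows by only about 6 dB per doubling of distance, so a measured RSSI that reaches −120 dBm near 1 km points to substantial additional losses from the environment, consistent with the obstructions noted at the test site.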
Fig. 5. The RSSI measurements at each waypoint from 0 m to 1500 m distance.
4.3 Environmental Monitoring Test
In this section, various test cases were carried out, covering temperature and humidity readings, rain checking, moisture and pump commands. The first environmental tests were humidity, shown in Fig. 6, and temperature, shown in Fig. 7. The data were collected over eight days during the winter season. Humidity and temperature were inversely related.
Fig. 6. Humidity monitored over an eight-day period.
Moisture content was monitored over a period of three hours, as shown in Fig. 8. Series 1 (blue) represents the moisture, while Series 2 (orange) represents the water pump state, which was either 0%, meaning the pump is off, or 100%, meaning the pump is on. The moisture decreased slowly over time; once it reached around 20%, the water pump switched on and watered the soil until the wet state was reached. The wet state was set to 100% and the dry state to 20%.
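The switching behaviour described above is a two-threshold (hysteresis) rule. A minimal sketch of such a rule is given below, using the dry and wet set points reported in the text (20% and 100%). The class and method names are hypothetical, and the sketch does not model the user-set irrigation timer or the gateway firmware itself.

```python
from dataclasses import dataclass


@dataclass
class PumpController:
    """Two-threshold pump control (hypothetical illustration)."""
    dry_threshold: float = 20.0   # moisture (%) at or below which the pump starts
    wet_threshold: float = 100.0  # moisture (%) at or above which the pump stops
    pump_on: bool = False

    def update(self, moisture_pct: float) -> bool:
        """Update the pump state from a new moisture reading and return it."""
        if not self.pump_on and moisture_pct <= self.dry_threshold:
            self.pump_on = True    # soil is dry: start irrigating
        elif self.pump_on and moisture_pct >= self.wet_threshold:
            self.pump_on = False   # soil is wet: stop irrigating
        return self.pump_on


if __name__ == "__main__":
    controller = PumpController()
    # Example readings: slow drying, then watering back up to the wet state.
    for reading in [60, 35, 20, 50, 90, 100, 80]:
        state = "ON" if controller.update(reading) else "OFF"
        print(f"moisture {reading:3d}% -> pump {state}")
```

Using two thresholds rather than one keeps the pump from chattering on and off when the reading hovers around a single set point.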
Fig. 7. Temperature monitored over an eight-day period.
Fig. 8. Graph representing the decrease in moisture level.
4.4 Web Interface
To allow the user to interact with the system remotely and to facilitate data logging, Adafruit IO was used. Adafruit IO enables the system to be monitored in real time and also provides the ability to control the siren and pump motor switch. The interface displays the temperature in degrees Celsius, the humidity, the rain status and the moisture in percent. It also provides temperature, humidity and moisture graphs.
5 Conclusion
This paper presented a smart irrigation system for farm application using LoRa technology to reduce water wastage and labour in agriculture by combining LPWAN technology, cloud computing and optimisation. The system successfully provided meaningful results and fast communication by deploying low-cost sensors to sense variables of interest such as humidity, temperature, rain, soil moisture and obstacles around the LoRa sensor node. The system demonstrated that optimising irrigation helps reduce water consumption, improve crop quality and reduce labour.

Recommendations and Future Work
• A solar panel to charge the LoRa sensor node battery should be considered in future implementations in order to further increase the battery life.
• Future work should consider mounting the LoRa sensor node on a pole to facilitate communication between the gateway and the LoRa sensor node.
• Future replacement of the microcontroller for the LoRa sensor node with a cheaper microcontroller will be
• Future work with the Sigfox LPWAN will also be taken into account.
References
1. Kumar, D., Sobti, R., Jain, A., Kum, P.: LoRa based intelligent soil and weather condition monitoring with internet of things for precision agriculture in smart cities. IET Commun. 16, 604–618. https://doi.org/10.1049/cmu2.12352
2. Rasyid, A.M., Shahidan, N., Omar, M.O., Hazwani, N., Choo, C.J.: Design and development of irrigation system for planting part 1, Malaysia (2015)
3. Ragab, M.: IOT based smart irrigation system. Int. J. Ind. Sustain. Dev. 3(1), 76–86 (2022). https://doi.org/10.21608/IJISD.2022.148007.1021
4. Rehman, A., Saba, T., Kashif, M., Fati, S., Bahaj, S., Chaudhry, H.: A revisit of internet of things technologies for monitoring and control strategies in smart agriculture. Agronomy 12, 127 (2022). https://doi.org/10.3390/agronomy12010127
5. Chethan, R., Damodharan, G., Elumalai, K., Eswaran, C., Manjula, C.: Smart irrigation system for agricultural field using labview and IOT. Int. J. Innov. Res. Sci. Eng. Technol. 7(1), 146 (2018)
6. Ahmad, D., Muhammed Yousoof, I., Babar, S., Mohammad, B., Farman, A., Siddique Ullah, B.: A deep reinforcement learning-based multi-agent area coverage control for smart agriculture. Comput. Electr. Eng. 101 (2022). https://doi.org/10.3390/agronomy12010127
7. Ayesha, S., Bhakti, P., Aishwarya, C., Rasika, P.: A review on intelligent agriculture service platform with LoRa based wireless sensor network. Int. Res. J. Eng. Technol. (IRJET) 6(2), 100:7000 (2019)
8. Tu, Y., Haoye, T., Wenyou, H.: An application of a LPWAN for upgrading proximal soil sensing systems. Sensors, 4333 (2022). https://doi.org/10.3390/s22124333
9. Gunjan, G., Robert, V.Z.: Energy harvested end nodes and performance improvement of LoRa networks. Int. J. Smart Sens. Intell. Syst. 14(1) (2021). https://doi.org/10.21307/ijssis-2021-002
10. SEMTECH: LoRa technology: ecosystem, applications and benefits. Mobile World Live (2017)
11. Biščan, J., Has, M., Tržec, K., Kušek, M.: Selecting IoT communication technologies for sensing in agriculture. In: 2022 International Conference on Broadband Communications for Next Generation Networks and Multimedia Applications (CoBCom), pp. 1–7 (2022). https://doi.org/10.1109/CoBCom55489.2022.9880660
12. Gaitan, N., Floarea, P.: The implementation of a solution for low-power wide-area network using LoRaWAN. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 13(6), 1–7 (2022)
13. Esma, K., Bahadır Can, Ç.: Low-power agriculture IoT system with LoRa: open field storage observation. Sciendo 16(2020), 88–94 (2020). https://doi.org/10.2478/ecce-2020-0013
14. Henna, D., Rakesh, V.S.: Smart irrigation management system using LoRaWAN based sensor nodes. Int. J. Appl. Eng. Res. 15(11) (2020). https://doi.org/10.1016/j.atech.2022.100053
15. Chaudhari, B., Zennaro, M., Borkar, S.: LPWAN technologies: emerging application characteristics, requirements, and design considerations. Future Internet 12(3), 46 (2020). https://doi.org/10.3390/fi12030046
16. Gunjan, G., Robert, V.Z., Vipin, B.: Evaluation of LoRa nodes for long-range communication. Nonlinear Eng. 11, 615–619 (2022)
17. Haxhibeqiri, J., De Poorter, E., Moerman, I., Hoebeke, J.: A survey of LoRaWAN for IoT: from technology to application. Sensors 16(11), 3995 (2018). https://doi.org/10.3390/s18113995
18. Cheong, P.S., Bergs, J., Hawinkel, C., Famaey, J.: Comparison of LoRaWAN classes and their power consumption. In: 2017 IEEE Symposium on Communications and Vehicular Technology (SCVT), pp. 1–6 (2017). https://doi.org/10.1109/SCVT.2017.8240313
19. Bouguera, T., Diouris, J.-F., Chaillout, J.-J., Jaouadi, R., Andrieux, G.: Energy consumption model for sensor nodes based on LoRa and LoRaWAN. Sensors 18(6), 2104 (2018). https://doi.org/10.3390/s18072104
The Effect of Illumination on the Productivity of Dairy Cattle
Igor M. Dovlatov, Ilya V. Komkov, Dmitry A. Blagov, and Alexandra A. Polikanova(B)
Federal Scientific Agroengineering Center VIM, 1st Institutsky Passage 5, 109428 Moscow, Russia
[email protected]
Abstract. The introduction presents the factors affecting milk productivity and reviews the relevant literature. The light sources used in cattle housing are considered. It is determined that the greatest efficiency is achieved with an illumination of 200 lx at the level of the feed table, and that the total light-transmitting part of the roof should be 18% of the floor area. The aim of the study was to investigate the effect of the level of illumination on the productivity of cattle. The materials and methods section shows the research scheme, the diet of the livestock and its nutritional value, and identifies the research site and the lamps used. The results and discussion section presents the research data in tables; for clearer demonstration, graphs and diagrams are constructed from the values of the parameters of interest. It was determined that the use of the SP-1 lamp with 200 lx illumination has the most positive effect on the productive qualities of cattle: milk yield increased by 1.5% and quality indicators also improved. Conclusion: the use of the SP-1 lamp with an illumination of 200 lx has a beneficial effect on productive indicators, which makes it possible to increase the dairy productivity of animals.
Keywords: Livestock complex · LED lighting · Light level · Operating modes · Cattle
1 Introduction

At this period of animal husbandry development, there is a fairly extensive genetic [...] Quantitative indicators of milk yield of the lactating herd are extremely important, as they are, as a rule, the main source of income for production. Milk is either sold directly or processed into dairy products that are then sold. Based on the sources [1–4], the level of lactation is influenced by many factors: breed, feed, microclimate, watering, stress and housing conditions. If proper conditions are met, it is possible to increase the dairy productivity of cattle.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 95–103, 2024. https://doi.org/10.1007/978-3-031-50327-6_11
I. M. Dovlatov et al.

To date, the most common lighting installations in the cowshed are LSP-44 (suspended linear fluorescent lamp for industrial buildings), RSP-05 (suspended mercury lamp for industrial buildings), ZHSP-01 (suspended high-pressure sodium lamp for industrial buildings) and linear lamps of the DSP type. The LSP-44 contains low-pressure fluorescent lamps of the LB (low-pressure fluorescent lamps) and LD (low-pressure daylight fluorescent lamps) types. The RSP-05 is equipped with high-pressure mercury arc lamps of the DRL and DRV types. High-pressure tubular sodium arc lamps of the DNaT type are integrated into the ZHSP-01. A DSP lamp has LED (Light Emitting Diode) modules [5] (Table 1).

Table 1. Illumination parameters at different voltage levels.

| Lamp type | Illumination (lx) at 198 V | Illumination (lx) at 220 V | Illumination (lx) at 242 V |
|---|---|---|---|
| Incandescent lamp, 75 W | 313 | 560 | 666 |
| Compact fluorescent lamp, 15 W | 336 | 389 | 405 |
| LED lamp, 9 W | 611 | 611 | 611 |
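The relative changes in illumination between the voltage levels of Table 1 can be computed directly from the tabulated values. The following is a minimal sketch that assumes 220 V is the reference voltage; the dictionary layout and the function name are illustrative and not part of the original study.

```python
# Illumination (lx) from Table 1, keyed by supply voltage (V).
TABLE_1 = {
    "incandescent 75 W": {198: 313, 220: 560, 242: 666},
    "compact fluorescent 15 W": {198: 336, 220: 389, 242: 405},
    "LED 9 W": {198: 611, 220: 611, 242: 611},
}


def percent_change(lamp, voltage, reference=220):
    """Relative change in illumination versus the reference voltage, in %."""
    base = TABLE_1[lamp][reference]
    return (TABLE_1[lamp][voltage] - base) / base * 100.0


for lamp in TABLE_1:
    drop = percent_change(lamp, 198)
    rise = percent_change(lamp, 242)
    print(f"{lamp}: {drop:+.0f}% at 198 V, {rise:+.0f}% at 242 V")
```

Keeping the reference voltage as a parameter makes it easy to repeat the comparison against a different nominal supply.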
Studies have shown that the illumination from incandescent lamps (IL) and compact fluorescent lamps (CFL) decreases by 44% and 14%, respectively, when the voltage is reduced from 220 to 198 V, and increases by 19% and 4% when it is raised to 242 V. The illumination of the working surface under LED lamps did not change as the operating voltage of the network rose from 198 to 242 V and amounted to 611 lx. This confirms the manufacturers' claims that the drivers used in LED lamps stabilize the luminous flux regardless of a decrease or increase in the supply voltage [6].

Currently, electricity costs for lighting livestock premises reach 15% of total electricity consumption, which negatively affects the cost of production. Therefore, work is actively underway in this area to create improved energy-saving lighting systems. Works [7–10] are devoted to the creation and application of universal LED energy-saving systems.

It is known that good lighting of the cowshed increases the productivity of cows. For farm animals, full-spectrum illumination is most effective. In the area where cows are housed, the illumination should be 75 lx (for 14 h per day), and for calves 100 lx (12 h) [11, 12]. Extending the cows' waking period by lengthening the light day to 16 h increases productivity by about 8%. A further increase in daylight hours beyond this value does not improve productivity and only entails additional electricity consumption. The next factor ensuring the maximum efficiency of the cowshed is the illumination in the various zones: 200–300 lx at the drinkers and the feed table, and 200 lx at head level in the rest boxes for animals during lactation. It is very important to provide the necessary illumination in the rest boxes.
If this parameter is insufficient, "light management" will not succeed, since cows spend about 3 h daily near the feed table, whereas they may spend 14 h in the rest area. Another important parameter is the color temperature of the lighting sources, 3000 K. The modern range of LED lamps offered on the market makes it easy to choose lighting devices with the required color temperature. While the animals rest, low-intensity infrared lighting can be used for observation [13, 14].

The Effect of Illumination on the Productivity of Dairy Cattle

The light source should be located 5 m above the floor, preferably above the drinking bowls or the feed table; the resting places may remain darkened. The total light-permeable part of the roof should be 18% of the floor area. In experiments conducted in the USA, proper lighting yielded 8% more milk while dry matter consumption increased by 6%, and reproduction problems decreased by 15% [15, 16].

The aim of the study was to determine the effect of the level of illumination on the productivity of cattle.

1.1 Materials and Methods

The research was carried out at the farm of the Grigorievskoye Federal State Unitary Enterprise in the Yaroslavl region, owned by the Federal Scientific Agroengineering Center VIM, in a typical cowshed measuring 21 × 75 m with a lamp mounting height of 4.0 m. The concentrations of gases in the air during the study were within the maximum permissible concentration (MPC) limits, the air temperature was −8 to −10 °C, and the relative humidity was 68 to 73%. For the experiment, 2 experimental groups of 5 heads each were formed according to the principle of pairs of analogues, taking into account lactation, milk yield, live weight and age. When the experimental groups were formed, the animals were first examined for subclinical mastitis, which made it possible to exclude sick individuals from the study (Table 2). The dairy productivity of the cows was recorded by control milking once a month. The quality of the milk obtained was determined according to GOST 31449-2013.

Table 2. Research scheme.

| Box № | Group | Number of heads, pcs | Illumination in the evening, lx | Working time of artificial lighting, h |
|---|---|---|---|---|
| 1 | Control | 10 | – | 6 |
| 2 | 1 | 10 | 100 | 6 |
| 3 | 2 | 10 | 200 | 6 |
The enterprise keeps black-and-white cattle with a 75% share of Holstein blood. The experimental cows were kept tethered in groups. Lactating cows were milked at 5 a.m. and 5 p.m. into a common milk pipeline for each group. The feed mixture was distributed twice a day using a feed mixer. The hygienic indicators of the drinking water given to the experimental animals complied with the veterinary and sanitary requirements of GOST 32220-2013.
It should be noted that the visual sensitivity of cows is shifted toward the blue region, so the best lighting effect is achieved with LED light sources with a color temperature from 4000 to 9000 K. To ensure light exposure, artificial lighting was added to natural lighting (NL) starting in the morning, for 16 h. In the experiments, RSP-26-125-001 lamps with a D-type light intensity curve (LIC) were used for the 1st group; for the 2nd group, the SP-1 ceiling lamp developed at the FSBI FNAC VIM was used, with a power of 15 W and a color temperature of 6500 K, its main colors being blue and green. The illumination from the developed lamps at floor level was 200 lx. The lamp operates at a frequency of 50 Hz, and its time to failure is at least 7000 h.

The diet of the experimental cows was a feed mixture whose nutritional value was calculated using the Astra+ program on the basis of the detailed feeding norms of A.P. Kalashnikov, which made it possible to balance the mixture for the main nutrients [17]. The experiment was carried out for 30 days during lactation. The developed diet (Table 3) for lactating cows is of the semi-concentrated feeding type, which is characterized by a concentrated feed content of 25 to 39% of the energy nutritional value. The daily ration weighed 33 kg. According to computer modeling, the nutritional value of the diet provided the cattle with the necessary nutrients in line with the detailed feeding norms for lactating cows (Table 4).

1.2 Results and Discussion

The data obtained on the milk productivity of the experimental groups are shown in Table 5, which gives the quantitative and qualitative indicators of milk.
Based on the data in Table 5, the values of the indicators of the experimental groups exceed those of the control. Thus, the milk yield of experimental groups 1 and 2 exceeds the control by 0.23% and 1.5%, respectively. The mass fraction of fat in milk does not differ between the 1st experimental group and the control group, while the value for the 2nd experimental group exceeds that of the control group by 0.01 percentage points. Only in the 2nd experimental group does the mass fraction of protein differ from the control group, exceeding it by 0.33% in relative terms. The milk fat yield is also highest in group 2 and exceeds the values of the control and the first group by 1.72% and 2.23%, respectively.

For a more visual representation of the productivity of the cows, graphs and diagrams are given (Figs. 1 and 2). The diagram shows that the yield of experimental groups 1 and 2 exceeds the control by 0.23% and 1.5%, respectively. This may indicate a positive effect of the equipment used on the productive indicators of livestock. The graph shows that the mass fraction of fat in milk does not differ between the 1st experimental group and the control group, while the value for the 2nd experimental group exceeds that of the control group by 0.01 percentage points. The protein content of the 2nd experimental group exceeds the control by 0.33%.

To analyze the course of the experiment, Table 6 was compiled; it shows the dynamics of lactation of the cows by group, that is, the changes in milk yield throughout lactation. Throughout lactation, milk yield in the 2nd experimental group exceeded that of the control group by 1.5% on average. The average milk yield of the 1st experimental group is also higher than that of the control group, by 0.23%.

Table 7 shows the chemical composition of the milk of the control and experimental groups for a detailed study of the effect of the lamps on the dairy productivity of cattle. The dry matter value of the 1st experimental group is 0.001 percentage points lower than that of the control. The values of dry matter, dry skimmed milk residue and lactose in the 2nd experimental group exceed the control values by 0.08%, 0.02% and 0.02%, respectively. The content of mineral salts was the same in all groups.

Table 3. The economic ration of dairy cattle.

| Type of feed | Quantity |
|---|---|
| Grain-grass hay, kg | 6.0 |
| Corn silage, kg | 20.0 |
| Fodder beet, kg | 3.5 |
| Barley, g | 900.0 |
| Oats, g | 1250.0 |
| Rapeseed meal, g | 500.0 |
| Feed molasses, g | 450.0 |
| Salt, g | 67.0 |
| Dicalcium phosphate, g | 70.0 |
| Three vitamins, ml | 0.32 |
| Total weight of the diet, kg | 33.06 |
| The structure of the diet | |
| Coarse feed, % | 31.04 |
| Juicy feeds, % | 42.69 |
| Concentrated feed, % | 26.27 |
| Total, % | 100.0 |
Table 4. Nutritional value of the diet.

| Nutritional indicators | Total | Standard | Deviation from the norm | Deviation from the norm in % |
|---|---|---|---|---|
| Pure lactation energy, MJ | 72 | 73.9 | −1.9 | −3 |
| Energy feed units | 12.18 | 11.5 | 0.68 | 6.0 |
| Dry matter, g | 13073.5 | 13200 | −126.5 | −1.0 |
| Crude protein, g | 1556.7 | 1445 | 111.7 | 8.0 |
| Digestible protein, g | 942.2 | 940 | 2.2 | 0.2 |
| Sugar, g | 780.2 | 760 | 20.2 | 3.0 |
| Crude fat, g | 434 | 290 | 144 | 50.0 |
| Crude fiber, g | 3142.8 | 3650 | −507.2 | −14.0 |
| Calcium, g | 125.7 | 65 | 60.7 | 93.0 |
| Phosphorus, g | 87.3 | 45 | 42.3 | 94.0 |
| Carotene, mg | 552 | 410 | 142 | 35.0 |
| Vitamin D3, IU | 9801.4 | 9600 | 201.4 | 2.0 |
| Vitamin E, mg | 1196.3 | 385 | 811.3 | 211.0 |
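The two deviation columns of Table 4 follow from the first two: the absolute deviation is the diet total minus the standard, and the percentage deviation is that difference relative to the standard. A minimal Python sketch using a few rows of the table is given below; the variable and function names are illustrative, not taken from the Astra+ program.

```python
# (diet total, feeding standard) for selected rows of Table 4.
DIET = {
    "Dry matter, g": (13073.5, 13200),
    "Crude protein, g": (1556.7, 1445),
    "Crude fiber, g": (3142.8, 3650),
    "Calcium, g": (125.7, 65),
}


def deviation(total, standard):
    """Return (absolute deviation, deviation as % of the standard)."""
    diff = total - standard
    return diff, diff / standard * 100.0


for name, (total, standard) in DIET.items():
    diff, pct = deviation(total, standard)
    print(f"{name}: {diff:+.1f} ({pct:+.0f}%)")
```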
Table 5. Milk productivity of the experimental groups.

| Group | Milk yield for 305 days, kg | Mass fraction of fat, % | Milk fat yield, kg | Mass fraction of protein, % |
|---|---|---|---|---|
| Control | 3080.03 ± 17.58 | 3.55 ± 0.02 | 109.29 ± 0.92 | 3.05 ± 0.02 |
| 1 | 3087.1 ± 20.54 | 3.55 ± 0.02 | 109.26 ± 1.15 | 3.05 ± 0.02 |
| 2 | 3126.03 ± 20.42 | 3.56 ± 0.02 | 111.17 ± 1.10 | 3.06 ± 0.02 |
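For columns that are already expressed in percent, a difference between groups can be read either in percentage points or as a change relative to the control. The sketch below computes both for the group means of Table 5; the function names are illustrative and the calculation ignores the reported standard errors.

```python
# Group means from Table 5 (control vs. experimental group 2).
CONTROL = {"yield_kg": 3080.03, "fat_pct": 3.55, "protein_pct": 3.05}
GROUP_2 = {"yield_kg": 3126.03, "fat_pct": 3.56, "protein_pct": 3.06}


def relative_gain(key):
    """Gain of group 2 over the control as a percentage of the control value."""
    return (GROUP_2[key] - CONTROL[key]) / CONTROL[key] * 100.0


def point_gain(key):
    """Gain of group 2 over the control in percentage points (for % columns)."""
    return GROUP_2[key] - CONTROL[key]


print(f"milk yield: {relative_gain('yield_kg'):.2f}% relative")
for key in ("fat_pct", "protein_pct"):
    print(f"{key}: {point_gain(key):.2f} points, "
          f"{relative_gain(key):.2f}% relative")
```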
2 Conclusions

The studies of the dairy productivity of cattle have shown that the SP-1 lamp, with a power of 15 W and a color temperature of 6500 K, at an illumination of 200 lx is the most suitable. The data indicate that the use of this equipment at 200 lx has a stimulating effect on dairy productivity. The milk yield of experimental groups 1 and 2 exceeds the control by 0.23% and 1.5%, respectively. The mass fraction of fat in milk does not differ between the 1st experimental group and the control group, while the value for the 2nd experimental group exceeds that of the control group by 0.01 percentage points. In the 2nd experimental group, the mass fraction of protein exceeds that of the control group by 0.33%.
Fig. 1. Quantitative indicators of productivity (milk yield for 305 days, kg, for the control group and experimental groups 1 and 2).

Fig. 2. Qualitative indicators of productivity (mass fraction of fat, %, and mass fraction of protein, %, for the control group and experimental groups 1 and 2).
Throughout lactation, milk yield in the 2nd experimental group exceeded that of the control group by 1.5% on average. Consequently, an additional light source has a positive effect on the productivity of animals. The indicators of experimental group 2, where the SP-1 lamp was used, exceed the values of the control group: dry matter, dry skimmed milk residue and lactose exceed the control values by 0.08%, 0.02% and 0.02%, respectively. The results of this research can help increase the productivity of cattle, which would also improve the economic performance of enterprises. Further research on this issue is planned.
Table 6. Dynamics of changes in milk yields in experimental groups during lactation, n = 10.

| Month | Control | Group 1 | Group 2 |
|---|---|---|---|
| 1 | 455.91 | 456.1 | 462.69 |
| 2 | 422.03 | 424.03 | 428.30 |
| 3 | 388.14 | 398.14 | 393.91 |
| 4 | 354.26 | 351.2 | 359.52 |
| 5 | 323.45 | 327.5 | 328.26 |
| 6 | 295.73 | 297.7 | 300.12 |
| 7 | 264.92 | 263.2 | 268.86 |
| 8 | 227.96 | 226.9 | 231.34 |
| 9 | 190.99 | 191.9 | 193.83 |
| 10 | 157.11 | 158.7 | 159.44 |
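The average advantage of experimental group 2 over the control during lactation can be checked from the monthly values of Table 6. The sketch below is one possible way to do this, assuming the average is taken as the mean of the monthly relative differences; the names are illustrative.

```python
# Monthly milk yields from Table 6 (months 1-10 of lactation).
CONTROL = [455.91, 422.03, 388.14, 354.26, 323.45,
           295.73, 264.92, 227.96, 190.99, 157.11]
GROUP_2 = [462.69, 428.30, 393.91, 359.52, 328.26,
           300.12, 268.86, 231.34, 193.83, 159.44]


def monthly_gains(treated, control):
    """Relative gain over the control for each month, in %."""
    return [(t - c) / c * 100.0 for t, c in zip(treated, control)]


gains = monthly_gains(GROUP_2, CONTROL)
mean_gain = sum(gains) / len(gains)
print(f"mean monthly gain of group 2: {mean_gain:.2f}%")
```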
Table 7. Chemical composition of milk of experimental cows, n = 10.

| Group | Dry matter, % | Dry skimmed milk residue, % | Lactose, % | Mineral salts, % |
|---|---|---|---|---|
| Control | 12.221 ± 0.197 | 8.673 ± 0.198 | 4.770 ± 0.109 | 0.720 ± 0.016 |
| 1 | 12.220 ± 0.204 | 8.673 ± 0.199 | 4.770 ± 0.109 | 0.720 ± 0.017 |
| 2 | 12.231 ± 0.209 | 8.675 ± 0.20 | 4.771 ± 0.110 | 0.720 ± 0.017 |
Acknowledgements. The authors thank the Federal Scientific Agroengineering Center VIM and FSUE Grigoryevskoe for providing laboratories and a site for conducting the research. We also express special gratitude to V.V. Kirsanov and D.Y. Pavkin for their support and assistance in the study.
References

1. Guseva, T.A.: Dairy productivity of Holstein cattle. Agroindustrial complex of Russia: education, science, production, pp. 49–51 (2021)
2. Nikolaenko, E.I.: Factors affecting dairy productivity of cattle. Continuing education: current trends and prospects, pp. 79–83 (2021)
3. Sokolov, N.A., Safronov, M.K., Chepushtanova, O.V.: Factors influencing dairy productivity of cattle. Theoretical, practical and safe aspects of farming, pp. 220–222 (2021)
4. Trung, T.D., Quang, M.V., Thanh, V.P.: Weight of factors affecting sustainable urban agriculture development (case study in Thu Dau Mot smart city). In: Intelligent Computing & Optimization. ICO 2021. Lecture Notes in Networks and Systems, vol. 371, pp. 707–717 (2021)
5. Blinkov, B.S., Konyaev, N.V., Nazarenko, Yu.V.: New in lighting of cowsheds. In the collection: Innovative directions of development of technologies and technical means of mechanization of agriculture. Materials of the international scientific and practical conference dedicated to the 100th anniversary of the Department of Agricultural Machinery of the Faculty of Agricultural Engineering of the Voronezh State Agrarian University named after Emperor Peter I. Ministry of Agriculture of the Russian Federation; Voronezh State Agrarian University named after Emperor Peter I, pp. 37–40 (2015)
6. Vysotskaya, E.A., Kornev, A.S., Polkovnikov, E.V.: Improving the lighting system of industrial premises of agricultural enterprises through the introduction of energy-saving lamps. Bull. Voronezh State Agrarian Univ. 1(56), 137–142 (2018)
7. Konyaev, N.V., Firsov, V.S., Blinkov, B.S.: Justification of the use of energy-saving technologies for lighting systems in cowsheds. Bull. Kursk State Agric. Acad. 8, 171–175 (2018)
8. Murphy, B.A., Herlihy, M.M., Nolan, M.B., O’Brien, C., Furlong, J.G., Butler, S.T.: Identification of the blue light intensity administered to one eye required to suppress bovine plasma melatonin and investigation into effects on milk production in grazing dairy cows. J. Dairy Sci. 104(11), 12127–12138 (2021)
9. Purtova, A., Budnikov, D., Panchenko, V.: On application of solar collectors in dairy farms. In: Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569, pp. 756–762 (2022)
10. Dovlatov, I.M., Yuferev, L.Y., Pavkin, D.Y., Panchenko, V.A., Bolshev, V.E., Yudaev, I.V.: Rationale for parameters of energy-saving illumination inside agricultural premises and method of its values calculation. Energies 16(4), 1837 (2023)
11. Semashko, S.V.: Using the Lely light for cows lighting system in a cowshed. In: Scientific and Educational Potential of Youth in Solving Urgent Problems of the XXI Century, No. 15, pp. 120–123 (2019)
12. Order of the Ministry of Agriculture of the Russian Federation No. 622 dated 10/21/2020 “On approval of Veterinary rules for keeping cattle in order to its reproduction, cultivation and sale”, the standard was determined in accordance with the regulatory documentation of the OSN APK 2.10.24.001-04 “Standards for lighting agricultural enterprises, buildings, structures”
13. Egorov, V.P.: Improving the energy efficiency of a cowshed by 200 heads by optimizing lighting. In: Scientific and Educational Potential of Youth in Solving Urgent Problems of the XXI Century, No. 18, pp. 156–158 (2022)
14. Dovlatov, I.M., Yuferev, L.Yu., Kirsanov, V.V., Pavkin, D.Yu., Matveev, V.Yu.: Automated system for providing microclimate in poultry houses. Vestnik NGIEI 7(86), 7–18
15. Timoshenko, V., Muzyka, A., Moskalev, A., Shmatko, N.: Comfort of cows is the key to high productivity. Animal Husbandry of Russia S3, 17–20 (2015)
16. Yan, G., Shi, Z., Cui, B., Li, H.: Developing a new thermal comfort prediction model and web-based application for heat stress assessment in dairy cows. Biosyst. Eng. 214, 72–89 (2022)
17. Kalashnikov, A.P.: Norms and rations of feeding farm animals: a reference manual; Kalashnikov, A.P., Fisinin, I.V., Shcheglov, V.V., Kleimenov, N.I., et al., 3rd ed., reprint; and additional. Moscow: Ministry of Agriculture of the Russian Federation Russian Academy of Agricultural Sciences All-Russian State Research Institute of Animal Husbandry, p. 359 (2003)
Improvement of Technological Process of Growing Hydroponic Green Fodder Triticale (Triticosecale Wittm.) in Indoor Farming Using Light Emitting Diodes

N. I. Uyutova1(B), N. A. Semenova1, N. O. Chilingaryan1, V. A. Panchenko1,2, and A. S. Dorokhov1

1 Federal Scientific Agroengineering Center VIM, 1st Institutsky Passage 5, 109428 Moscow, Russia
[email protected], [email protected], [email protected], [email protected]
2 Russian University of Transport, Obraztsova St. 9, 127994 Moscow, Russia
Abstract. This study is devoted to the cultivation of hydroponic green fodder (HGF) of triticale (Triticosecale Wittm.) under 3 variants of LED irradiators with PPFD = 137 ± 3.2 µmol s−1 m−2 and the following spectral composition (blue (B):green (G):red (R) spectrum with supplementary ultraviolet (UV) and far red (FR) irradiation): solar-like (S), 18B:38G:44R with additional UV = 5 µmol s−1 m−2 and FR = 16.5 µmol s−1 m−2; white (W), 18B:46G:36R with additional FR = 4.6 µmol s−1 m−2; and red-blue (R-B), 20B:4G:76R. According to the set of growth indicators, the best light spectrum for growing triticale HGF of the ‘Nemchinovsky 56’ cultivar was the R-B spectrum. It gave the largest increase in total mass (12%) and in nutritional value (dry mass of greenery by 4.4%, protein by 9.4%, ash by 2.5%, NDF by 14.4%).
Keywords: Hydroponic green feed · Light Emitting Diode (LED) · Light spectrum · Vegetation indices · Triticale
1 Introduction

Vertical farming is gaining more and more importance around the world because of the beneficial role it plays in agriculture [1]. The global hydroponics market reached $2.56 billion in 2021 and, according to the latest analysis by Emergen Research, is expected to grow at a CAGR of 19.2% over the forecast period. It is assumed that, regardless of external climatic conditions and increasing desertification, market revenue will continue to grow from 2022 to 2030 [2].

One of the topical segments of the hydroponics market is the production of hydroponic green fodder (HGF) as a way to improve the feed supply and the digestibility of roughage [3]. This technology is an economical solution, especially where traditional green fodder production is limited or unavailable, and in the case of forage shortages caused by the lack of green fodder during dry seasons and in urban areas [4]. In addition, recent studies have proven the economic efficiency of growing HGF [5]. Hydroponic feed is a natural product produced without the use of any hormones, growth promoters or chemical fertilizers. It does not contain residual amounts of pesticides, fungicides or toxic substances that could contaminate livestock products [6].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 104–113, 2024. https://doi.org/10.1007/978-3-031-50327-6_12

Cereal grains and legume seeds can be used for germination and green fodder production. During germination, beneficial changes occur in the grain: enzyme activity increases, as do the protein content and the amounts of more bioavailable amino acids and vitamins [7, 8]. There are many varieties of HGF, depending on the type and variety of seed. Crops such as barley, oats, rye, corn, wheat, soybeans, grass pea, peas and vetch, as well as mixtures of cereals and legumes, are used for the production of HGF. However, triticale seeds can also serve as a potential basis for green fodder. Triticale (× Triticosecale Wittmack) is an artificial species created by crossing wheat (Triticum spp.) and rye (Secale cereale L.). It includes favorable genes from both parent species, which allows it to adapt to conditions less favorable for wheat while providing a higher biomass yield and forage quality. In a comparative study of three small grain crops (wheat, rye and triticale), triticale showed consistently better fodder yields [9]. In addition to high productivity (biomass and grain), triticale has a good protein content and composition of essential amino acids (lysine). Also, due to the higher digestibility of its starch, triticale is a better feed for ruminants than other cereals [10, 11].
Published studies on the cultivation of HGF contain many contradictions regarding profitability and cultivation conditions, but there is no doubt about the usefulness of such feed. As a crop for growing HGF, triticale has hardly been studied. The aim of our work was to determine the optimal conditions for supplementary illumination of triticale grown as HGF, using 3 types of LED illuminators.
2 Materials and Methods

To study the impact of light spectral composition on the morphophysiological and qualitative indicators and the productivity of HGF triticale (× Triticosecale Wittmack), the ‘Nemchinovsky 56’ cultivar, with a growing season of 316–340 days, was chosen as the object of research. This plant is widely used in compound feed production and for the production of starch, malt and baking flour. Before growing, the seed material was washed with running water to remove dust. The seeds were disinfected in a weak solution of potassium permanganate (0.01%) for 1 h, then the grain was soaked in water at 20 °C for 6 h and placed in plastic trays (1245 × 560 × 50 mm) for germination. The seeding rate was 4 kg/m2. Germination energy and seed germination were determined in advance in Petri dishes in a thermostat, in 3 replicates, and amounted to 96% and 94%, respectively. The grain was germinated for 2 days in the dark at 25 °C, with the trays of planting material covered with a dense black film. On the third day, after the appearance of young seedlings, the film was removed.
HGF was grown in a hydroponic installation in a climatic room (Fig. 1) with an automatic microclimate system that maintained a day/night temperature of 25/22 ± 1.0 °C and a relative air humidity of 60 ± 10% for 10 days. The plants were watered by a periodic-flooding irrigation system of the FDS (Flood & Drain System) type 2 times a day. No additional plant nutrition was provided. Purified tap water was used for flooding, and its temperature was maintained within 18–20 ± 1.0 °C.
Fig. 1. The appearance of the shelving construction with 3 lighting options (from top to bottom - daylight (S), white (W) and red-blue (R-B)).
A combination of three variants of the spectral composition of light was installed on the rack, designated as S (daylight), W (white) and R-B (red-blue), with a photosynthetic photon flux density (PPFD) of 137 ± 3.2 µmol s−1 m−2 (Fig. 1). Spectral characteristics of the radiation sources are given in Table 1. W LEDs were taken as the control of the study, as they are the most common and cheapest.

Table 1. Spectral composition of light (all values in µmol s−1 m−2).

| Light spectrum variant | PPFD, B | PPFD, G | PPFD, R | PPFD, total | Supplementary UV | Supplementary FR |
|---|---|---|---|---|---|---|
| S | 24.6 | 54.5 | 61.2 | 140.3 | 5.0 | 16.5 |
| W (control) | 25.0 | 63.6 | 49.6 | 138.3 | 0.4 | 4.6 |
| R-B | 26.5 | 4.9 | 101.4 | 132.7 | 0.3 | 1.3 |
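The B:G:R ratios used to describe each spectrum can be derived from the waveband PPFD values in Table 1. The sketch below assumes that each share is taken relative to the B + G + R sum; because of rounding, a computed share may differ by a percentage point from the ratios quoted in the text. The names are illustrative.

```python
# PPFD by waveband from Table 1, in µmol s^-1 m^-2.
SPECTRA = {
    "S": {"B": 24.6, "G": 54.5, "R": 61.2},
    "W": {"B": 25.0, "G": 63.6, "R": 49.6},
    "R-B": {"B": 26.5, "G": 4.9, "R": 101.4},
}


def band_shares(ppfd):
    """Share of each waveband in the B+G+R total, in %."""
    total = sum(ppfd.values())
    return {band: value / total * 100.0 for band, value in ppfd.items()}


for name, ppfd in SPECTRA.items():
    shares = band_shares(ppfd)
    ratio = ":".join(f"{shares[band]:.0f}{band}" for band in ("B", "G", "R"))
    print(f"{name}: {ratio}")
```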
On the 5th and 10th day after germination, the following parameters were measured and evaluated: seedling height, HGF fresh and dry mass, leaf surface area, leaf spectral reflectance coefficients, and the percentage of protein, fat, fiber (ADF, NDF) and ash in the green mass. The final measurements were made on the 10th day of cultivation, since published studies on wheat showed that the crude protein and metabolizable energy contents of HGF were higher at this time than on days 8 and 12 [12]. It has also been shown for corn that the biomass of HGF decreases after the 10th day of cultivation [13].

For the measurements, 15 samples were taken from each group; each sample was a 10 × 10 cm (100 cm2) cut of the hydroponic mat. In each variant of the experiment, the samples were cut out of the mat at random with a special metal mold. Of these, 10 samples were used to determine morphological indicators (height (cm), fresh mass (g), dry mass (g), leaf surface area (cm2)) and 5 samples to determine biochemical parameters. The fresh and dry mass of the hydroponic mat was weighed using a Sartorius LA230S balance (Laboratory Scale, Germany). To determine the dry fraction of the mat, the selected samples were dried in an oven at 60–70 °C for 3 h, and then at 105 °C for 1 h to constant mass. The leaf area was determined on a LI-3100 AREA METER photoplanimeter (LI-COR, USA). The biochemical parameters of the HGF samples and seed material were determined using a NIRS DS2500 analyzer (FOSS, Denmark). The crushed sample (green mass of triticale) was placed in a large bowl and sent to the measuring device, where each sample was measured at eight points. Five samples of grown HGF triticale from each variant were analyzed.
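The dry fraction obtained from the drying procedure can be expressed as a percentage of the fresh mass of the same sample. The sketch below assumes this is how the dry mass percentage is defined; the function name and the sample masses are hypothetical and serve only as an illustration.

```python
def dry_matter_percent(fresh_mass_g, dry_mass_g):
    """Dry mass as a percentage of the fresh mass of the same sample."""
    if fresh_mass_g <= 0:
        raise ValueError("fresh mass must be positive")
    return dry_mass_g / fresh_mass_g * 100.0


# Hypothetical sample: 25.0 g fresh, 3.5 g after drying to constant mass.
print(f"{dry_matter_percent(25.0, 3.5):.1f}%")
```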
To measure the spectral reflectance of HGF leaves, a portable RP410 meter (PolyPen, Czech Republic) equipped with an internal light source (380–1050 nm xenon incandescent lamp) and one type of detector (UVIS) was used. The data obtained made it possible to calculate a variety of vegetation indices: normalized difference vegetation index (NDVI), Zarco-Tejada and Miller index (ZMI), greenness index (GI) and carotenoid reflectance indices (CRI). All the experiments were performed in triplicate. Statistical processing of the measurement results and plotting were carried out in MS Excel 2010. One-way analysis of variance (ANOVA) at a significance level of P < 0.05 was used to identify significant differences.
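Vegetation indices of this kind are simple combinations of reflectance values at a few wavelengths. The sketch below uses commonly cited definitions; the exact bands, in particular the near-infrared and red bands for NDVI, are assumptions and may differ from those implemented in the instrument software. The reflectance values are hypothetical.

```python
def vegetation_indices(r, nir=780, red=630):
    """Compute indices from r, a mapping of wavelength (nm) to reflectance.

    The band choices follow commonly cited definitions and are assumptions
    here; the instrument may use slightly different wavelengths.
    """
    return {
        "NDVI": (r[nir] - r[red]) / (r[nir] + r[red]),
        "ZMI": r[750] / r[710],             # Zarco-Tejada and Miller index
        "GI": r[554] / r[677],              # greenness index
        "CRI550": 1 / r[510] - 1 / r[550],  # carotenoid reflectance index
        "CRI700": 1 / r[510] - 1 / r[700],  # carotenoid reflectance index
    }


# Hypothetical leaf reflectance values, for illustration only.
sample = {510: 0.05, 550: 0.10, 554: 0.10, 630: 0.10, 677: 0.05,
          700: 0.20, 710: 0.24, 750: 0.48, 780: 0.50}
indices = vegetation_indices(sample)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Passing the NDVI bands as parameters keeps the function usable if the instrument documentation specifies other wavelengths.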
3 Results and Discussion At the beginning of the vegetation phase (sprouts), triticale plants visually did not have a significant difference between the experimental variants on the 5th day of cultivation (Fig. 2a). The factor of light spectral composition did not have a significant effect on the indicators of plant height, fresh mass (green and root), leaf surface area, but a statistically significant effect on the dry mass of HGF green and root parts was observed (Table 2).
In terms of the dry mass of greens, and hence the nutritional value of the HGF as a whole, the best results on the 5th day of cultivation were obtained with phytoirradiators with the R-B spectrum.

Table 2. Average biometric indicators of triticale HGF of ‘Nemchinovsky 56’ cultivar on the 5th and 10th day of cultivation according to the experimental options.

| Light spectrum variant | Height (cm) | Fresh mass (g), green | Fresh mass (g), root | Dry mass (%), green | Dry mass (%), root | Leaf surface area (cm2) |
|---|---|---|---|---|---|---|
| 5th day of cultivation | | | | | | |
| S | 7.6 ± 0.6 ns | 12.8 ± 1.6 ns | 133.6 ± 21.1 ns | 13.4 ± 0.3 a | 26.5 ± 3.1 b | 310.7 ± 40.5 ns |
| W (control) | 7.8 ± 0.6 ns | 15.0 ± 2.4 ns | 154.4 ± 24.3 ns | 13.4 ± 0.4 a | 25.3 ± 2.1 ab | 376.0 ± 65.9 ns |
| R-B | 7.1 ± 0.8 ns | 14.3 ± 3.1 ns | 157.8 ± 22.8 ns | 13.8 ± 0.2 b | 22.1 ± 3.3 a | 361.5 ± 88.2 ns |
| LSD05 | – | – | – | 0.37 | 0.33 | – |
| 10th day of cultivation | | | | | | |
| S | 14.1 ± 0.7 ns | 24.4 ± 4.3 ns | 132.1 ± 23.9 a | 13.8 ± 0.3 a | 20.3 ± 4.7 ns | 779.9 ± 116.1 ns |
| W (control) | 14.4 ± 1.4 ns | 23.8 ± 5.1 ns | 143.5 ± 23.0 ab | 13.7 ± 0.4 a | 20.8 ± 6.8 ns | 796.4 ± 149.7 ns |
| R-B | 13.3 ± 1.5 ns | 25.1 ± 4.1 ns | 161.8 ± 18.7 b | 14.3 ± 0.5 b | 24.5 ± 5.7 ns | 845.0 ± 136.6 ns |
| LSD05 | – | – | 25.62 | 0.46 | – | – |

Values represent mean ± SE (n = 10). Different letters indicate significant differences among treatments, and ns means non-significant differences among treatments, according to Duncan’s test (p ≤ 0.05).
On the 10th day of HGF cultivation (Fig. 2b), the plants differed visibly between the experimental variants. Under R-B irradiation the yield of green mass was higher, and the plants were less elongated and had the largest leaf surface area. This may be due to the highest share of R irradiation, which supports high photosynthetic and developmental activity in long-day plants [14]. Some researchers have shown that supplementary illumination with G and FR light can increase plant fresh mass [15] and leaf surface area [16]. G light can increase plant growth by enhancing photosynthesis in lower canopy leaves [17–19] and can raise plant nutritional value, since it maintains a high net photosynthesis rate and photochemical efficiency [20]. In our study, the variants with a higher share of G and FR in the irradiation spectrum (S and W) did not differ significantly from the R-B variant in fresh leaf mass or leaf surface area. The results showed that the spectral composition of light affects the fresh root mass and the dry mass of the green part of HGF. The HGF yield on the 10th day of cultivation was 15.7 kg/m2 under the S spectrum, 16.7 kg/m2 under the W spectrum and 18.7 kg/m2 under R-B, which is 12% more than in the control. Moreover, R-B LEDs influence plant growth because their radiation spectra coincide well with the absorption spectra of plant photosynthetic pigments, and they are considered the most energy-efficient among LEDs [21, 22].

Fig. 2. Appearance of HGF triticale of ‘Nemchinovsky 56’ cultivar, grown under different types of LED irradiation on 5 (a) and 10 (b) days of cultivation (from left to right: S, W, R-B).

Based on the analysis of HGF nutritional value, the contents of protein, ash, ADF and NDF were plotted against the lighting spectrum (Fig. 3). ADF (acid detergent fiber) is a poorly digestible, inefficient feed component (the recommended content is no more than 24–25%), and NDF (neutral detergent fiber) is an effective fiber that is of great importance in digestion and, consequently, in productivity and milk formation. In order not to overload the stomach of ruminants, this indicator is controlled (no more than 35% in feed). With a lack of NDF, the acidity of the rumen is disturbed, and its excess reduces the absorption of nutrients. It has been shown for barley that the content of NDF, ADF and ash in HGF is much higher than in the grain [23]. Corn HGF had better nutritional value (protein and ash content, vitamins A and K) than corn kernels and forage. However, it did not provide enough fiber compared to feed corn [13].

According to our data, on the 5th day of cultivation the S variant of irradiation was the best in terms of protein and NDF content. However, as the plants grew and the green mass increased, the situation changed, and by the 10th day of cultivation these indicators were highest under the R-B spectrum. In general, compared to triticale grain (Fig. 4), the green mass of HGF contains more fiber, which is consistent with experiments on barley HGF [23]. At the same time, the ash content remains at the same level, but the proportion of protein decreases compared to grain, in contrast to the experiments carried out on corn plants [13]. It can be concluded that the indicators obtained by NIR analysis confirm the preliminary conclusions about the benefits of the R-B spectrum for increasing the nutritional value of green fodder. In any case, the NDF and ADF contents of triticale HGF are relatively low, and in order not to disturb the acidity of the rumen, it is better for ruminant cattle to
N. I. Uyutova et al.
Fig. 3. Ash, protein, ADF and NDF content (% of the green fresh mass) of triticale HGF of the ‘Nemchinovsky 56’ cultivar on the 5th (a) and 10th (b) days of cultivation. [Bar charts; vertical axis: content, %; categories: protein, ash, acid detergent fiber, neutral detergent fiber; series: S, W (control), R-B.]
use it as an additive to the main feed (grain and silage of cereals and legumes), which contains more fiber. We also obtained vegetation indices of the hydroponic green fodder; they are presented in Fig. 5. The graphs show that the variant with R-B lighting is the best for all vegetation indices. The NDVI was approximately the same in all variants of the experiment, which indicates the same amount of photosynthetically active biomass per unit area. The ZMI index, which is associated with the light-harvesting complex of photosystem I, the GM index, which also correlates with chlorophyll content [24], and the carotenoid reflectance indices CRI550 and CRI700 were all higher in the variant with R-B lighting. Moreover, the data indicate that the vegetation indices practically did not change during the cultivation of triticale HGF. An increase in almost all indices in the
Fig. 4. The content of protein, ash, moisture, fat, fiber and starch in the original grain of triticale of the ‘Nemchinovsky 56’ cultivar.
Fig. 5. HGF vegetation indices of triticale ‘Nemchinovsky 56’ cultivar on the 5th (a) and 10th (b) days of cultivation. [Bar charts; indices: NDVI, ZMI, GM, CRI550, CRI700; series: S, W (control), R-B.]
R-B variant indicates a higher rate of photosynthesis and a higher overall activity of the photosynthetic apparatus. The S spectrum gave the lowest vegetation indices.
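The indices discussed above are simple functions of canopy reflectance at a few wavelengths. The sketch below uses the commonly cited definitions of NDVI and of the carotenoid reflectance indices; the paper does not state which formulas or bands its instrument uses, so the band choices and the reflectance values are assumptions made only for illustration.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def cri550(r510: float, r550: float) -> float:
    """Carotenoid reflectance index using the 550 nm band (common definition)."""
    return 1.0 / r510 - 1.0 / r550


def cri700(r510: float, r700: float) -> float:
    """Carotenoid reflectance index using the 700 nm band (common definition)."""
    return 1.0 / r510 - 1.0 / r700


# Hypothetical reflectance values (fractions of incident light), not measured data.
sample = {"nir": 0.50, "red": 0.05, "r510": 0.04, "r550": 0.10, "r700": 0.20}

print(f"NDVI   = {ndvi(sample['nir'], sample['red']):.3f}")
print(f"CRI550 = {cri550(sample['r510'], sample['r550']):.1f}")
print(f"CRI700 = {cri700(sample['r510'], sample['r700']):.1f}")
```

Because each index is a ratio or difference of reflectances, equal NDVI values across variants are compatible with different CRI values, which is the pattern reported for the R-B variant.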
4 Conclusion
According to the set of growth indicators, biochemical indicators and vegetation indices, the optimal lighting for hydroponic green fodder of triticale of the ‘Nemchinovsky 56’ cultivar under fully controlled-environment agriculture conditions is the red-blue spectrum, both in terms of the maximum mass gain of HGF (12%) and in terms of the nutritional value of the fodder (the increase in the dry mass of greenery - by 4.4%,
protein - by 9.4%, ash - by 2.5%, NDF - by 14.4%). The vegetation indices also confirm that photosynthesis proceeds best under red-blue lighting, and the relative content of photosynthetic pigments is highest compared with the other variants of the experiment.
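The 12% mass gain quoted above follows directly from the day-10 yields reported in the Results section, with the W spectrum as the control. A minimal arithmetic check, in which the variable names are illustrative:

```python
# Day-10 HGF yields reported in the text, kg per square metre.
yields_kg_m2 = {"S": 15.7, "W": 16.7, "R-B": 18.7}
control = yields_kg_m2["W"]  # W is the control variant

# Relative change of each variant with respect to the control, in percent.
gain_pct = {name: 100.0 * (value - control) / control
            for name, value in yields_kg_m2.items()}

for name, gain in gain_pct.items():
    print(f"{name}: {gain:+.1f}% vs control")
```

The R-B variant comes out at about 12% above the control, while the S variant is about 6% below it.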
References
1. Lu, C., Grundy, S.: Urban agriculture and vertical farming. In: Abraham M.A. Encyclopedia of Sustainable Technologies, pp. 393–402. Elsevier, Ohio (2017)
2. LNCS Homepage: http://www.emergenresearch.com/press-release/global-hydroponicsmarket. 16 Jan 2023
3. Assefa, G.: Efforts, successes and challenges of green feed production in Ethiopia. Online J. Animal Feed Res. 11, 13–17 (2021)
4. Ghorbel, R., Koşum, N.: Hydroponic fodder production: an alternative solution for feed scarcity. In: 6th International Students Science Congress 2022, pp. 1–8. Izmir (2022)
5. Soto, M., Reyes, A., Ahumada, J., Cervantes, M., Bernal, B.H.: Biomass production and nutritional value of hydroponic green forage of wheat and oat. Interciencia 37(12), 906–913 (2012)
6. Hassen, A., Dawid, I.: Contribution of hydroponic feed for livestock production and productivity: a review. Int. J. Ground Sediment Water 15, 899–916 (2022)
7. Sarıçiçek, B.Z., Yıldırım, Hanoğlu, H.: The Comparison of nutrient composition and relative feed value of barley grain, barley green food and silage grown with grounded system of barley grass grown with hydroponic system. Black Sea J. Agric. 1(4), 102–109 (2018)
8. Jemimah, E.R., Gnanaraj, P.T., Muthuramalingam, T., Devi, T., Vennila, C.: Productivity, nutritive value, growth rate, biomass yield and economics of different hydroponic green fodders for livestock. Int. J. Livestock Res. 8(9), 261–270 (2018)
9. Kim, K.S., Anderson, J.D., Webb, S.L., Newell, M.A., Butler, T.J.: Variation in winter forage production of four small grain species-oat, rye, triticale, and wheat. Pak. J. Bot 49(2), 553–559 (2017)
10. Ayalew, H., Kumssa, T.T., Butler, T.J., Ma, X.F.: Triticale improvement for forage and cover crop uses in the southern great plains of the United States. Front. Plant Sci. 9, 1130 (2018)
11. Talalay, G.S., Matserushka, A.R., Matserushka, V.V., Chagina, Y.: Evaluation of the digestibility of diet nutrients containing green hydroponic feed. Regul. Issues Vet. Med. 1, 278–282 (2020)
12. Herrera-Torres, E., et al.: Effect of harvest time on the protein and energy value of wheat hydroponic green fodder. Interciencia 35(4), 284–289 (2010)
13. Chethan, K.P., et al.: Biomass yield and nutritive value of maize grain sprouts produced with hydroponic technique compared with maize grain and conventional green fodder. Anim. Nutr. Feed. Technol. 21(1), 121–133 (2021)
14. Gabibova, E.N., Mukhortova, V.K.: Vegetable growing: a textbook in the areas of training: 35.03.04 Agronomy, 35.03.05 Gardening, 35.03.07 Technology of production and processing of agricultural products, 35.04.05 Gardening. In 3 Ch. 1. Donskoy GAU. Persianovsky, Donskoy GAU (2019)
15. Meng, Q., Kelly, N., Runkle, E.: Substituting green or far-red radiation for blue radiation induces shade avoidance and promotes growth in lettuce and kale. Environ. Exp. Bot. 162, 383–391 (2019)
16. Claypool, N.B., Lieth, J.H.: Physiological responses of pepper seedlings to various ratios of blue, green, and red light using LED lamps. Sci. Hortic. 268, 109371 (2020)
17. Wheeler, R.M., Sager, J.C., Goins, G.D., Kim, H.H.: A comparison of growth and photosynthetic characteristics of lettuce grown under red and blue light-emitting diodes (LEDs) with and without supplemental green LEDs. In: VII International Symposium on Protected Cultivation in Mild Winter Climates: Production, Pest Management and Global Competition, vol. 659, pp. 467–475 (2004)
18. Kamal, K.Y., et al.: Evaluation of growth and nutritional value of Brassica microgreens grown under red, blue and green LEDs combinations. Physiol. Plant. 169(4), 625–638 (2020)
19. Li, L., Tong, Y.X., Lu, J.L., Li, Y.M., Yang, Q.C.: Lettuce growth, nutritional quality, and energy use efficiency as affected by red–blue light combined with different monochromatic wavelengths. HortScience 55(5), 613–620 (2020)
20. Bian, Z., Cheng, R., Wang, Y., Yang, Q., Lu, C.: Effect of green light on nitrate reduction and edible quality of hydroponically grown lettuce (Lactuca sativa L.) under short-term continuous light from red and blue light-emitting diodes. Environ. Exp. Botany 153, 63–71 (2018)
21. Mitchell, C.A., et al.: Light-emitting diodes in horticulture. Hortic. Rev. 43, 1–88 (2015)
22. Mengxi, L., Zhigang, X., Yang, Y., Yijie, F.: Effects of different spectral lights on Oncidium PLBs induction, proliferation, and plant regeneration. Plant Cell Tiss. Organ Cult. (PCTOC) 106, 1–10 (2011)
23. Bulcha, B., Diba, D., Gobena, G.: Fodder yield and nutritive values of hydroponically grown local barley landraces. Ethiopian J. Agric. Sci. 32(1), 31–49 (2022)
24. Zagajewski, B., Kycko, M., Tommervik, H., Bochenek, Z., Wojtun, B., Bjerke, J.W., Klos, A.: Feasibility of hyperspectral vegetation indices for the detection of chlorophyll concentration in three high Arctic plants: Salix polaris, Bistorta vivipara, and Dryas octopetala. Acta Societatis Botanicorum Poloniae 87(4) (2018)
Design of a Device with a Thermoelectric Module for Transporting Milk
Irina Ershova1,2, Dmitrii Poruchikov2,3(B), Vladimir Kirsanov2, Vladimir Panchenko2,4, Gennady Samarin2, Gennady Larionov3, and Natalia Mardareva3
1 Russian State Agrarian University - Moscow Agricultural Academy Named After K.A. Timiryazev, Listvennichnaya Alley 2, 127550 Moscow, Russian Federation
2 Federal Scientific Agroengineering Center VIM, 1st Institutskiy Proezd, 5, 109428 Moscow, Russian Federation
[email protected]
3 Chuvash State Agrarian University, Karl Marx Street 29, 428003 Cheboksary, Russian Federation
4 Russian University of Transport, Obraztsova St. 9, 127994 Moscow, Russian Federation
Abstract. The developed device with a thermoelectric module for transporting milk can be used to control the temperature of milk as well as of other liquids, such as drinking water. Thermoelectric cooling has great prospects as new thermoelectric materials are designed. Existing thermostatic devices for milk transportation and new thermoelectric materials [1–3] have some disadvantages. For example, a thermostatic device for transporting milk in cars [4] contains a gas refrigeration unit that runs on liquefied gas, which is also used as fuel for the vehicle's engine. The main disadvantage of this device is that not all cars currently run on liquefied gas, so it cannot be used on other vehicles. The device for controlling the temperature of milk during its transportation [2] contains a thermoelectric cooler, which automatically maintains the desired optimum temperature during milk transportation, especially in the hot season.
Keywords: EMF UHF · Thermoelectric module · Milk
1 Materials and Method
The design and creation of a multi-generator continuous-flow microwave unit with dual resonators for defrosting and heating colostrum involves a phased, integrated optimization of design and technological parameters and regression analysis of mathematical models. The research used methods of mathematical statistics and the theory of experiment, together with modern technical means and measuring instruments. Three-dimensional modeling of cavity resonators was carried out in CST Microwave Studio and SolidWorks. The experimental data were processed on a PC using MS Office Excel, Mathcad 14 and STATGRAPHICS Plus for Windows. The patterns of functioning of the hot and cold junctions of the thermoelectric module (TEM) were obtained in order to determine the operating modes of a device with a thermoelectric module that provides a given temperature of raw materials during transportation, along with a methodology for developing such a device and a technological process for cooling and heating milk during its transportation. A patent search was carried out for technical solutions and devices for regulating the temperature of milk during its transportation. The results obtained were analyzed and used as the basis for the device being developed.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 114–120, 2024. https://doi.org/10.1007/978-3-031-50327-6_13
2 Results and Discussion
The invention relates to the cooling and storage of agricultural products, including milk, yoghurts, juices, etc., and can be used in agricultural production, in the food industry and in everyday life. In the device, the sealed reservoir is heat-insulated and is connected by a heat-insulated pipeline to the filter and the submersible pump, which are located in the well below the freezing level of groundwater. The lower part of the reservoir has a safety valve and a pipe with a valve, which connects it to the cold accumulator through the float valve installed in it. The evaporator of the refrigerating machine is placed in the cold accumulator. The cold accumulator is connected by a pipeline, through a float valve and a valve, to the heat-insulated tank. The lower part of the heat-insulated tank has a safety valve, a warm-water distribution pipe with a valve, and a pipe with a valve that connects the tank to the equipment for reheating warm water. A float valve is installed on the cover of the heat-insulated tank, and a warm-water temperature sensor is installed in the upper part of its side wall. The temperature of the coolant in the cold accumulator is monitored by a temperature sensor.
The innovative idea is a device with a thermoelectric module that contains a car engine cooling system, a milk tank, a thermoelectric cooler, a temperature sensor, a control unit, thermoelectric cooler heat exchangers and a milk heat exchanger, and that additionally contains a thermoelectric generator module installed on the engine exhaust pipe. The generator includes a hot heat exchanger, which is connected to the hot junctions and is in contact with the exhaust gases, and a cold heat exchanger, which is connected to the cold junctions and to the cooling system of the automobile engine. In addition, the device contains a storage battery and a main switchboard whose input is connected to the sources of electrical energy and whose output is connected to the control unit. The device also acts as an additional source of electrical energy, increasing the power of the onboard power plant that is required to run the thermoelectric cooler.
The developed device works as follows. Electric pump 18, channel 25, coil heat exchanger 13, channel 26 and cold heat exchanger 5 form a closed milk cooling circuit through which a CaCl2 or NaCl salt solution circulates. A hot heat exchanger 8 is installed on the exhaust pipeline 7; it heats up through heat exchange with the exhaust pipeline 7 and transfers the heat to the hot junctions of the thermoelectric modules 9. The thermoelectric modules 9 are interconnected, and their number is selected by calculation (Table 1). In accordance with the above procedure, a schematic solution (Fig. 1; RF patent 100199) and a laboratory installation with a thermoelectric module for transporting milk (Fig. 2) were developed.
Table 1. Form for the design of a device with a thermoelectric module for transporting milk.
Identification of functional criteria: the device is automated; high-precision temperature sensor; battery-powered device; device with heat exchangers; salt solution of the required concentration for heat transfer; the power supply for the TEM must provide a current with ripple of no more than 5%; reliable thermal contact of the module with the cooling radiator; voltage up to 12 V.
Design of several laboratory facilities with different design versions of the TEM.
Substantiation of the operating modes of the device for transporting raw materials under various external conditions.
Evaluation of the economic performance of the device: economic effect due to the difference in the reduced costs and due to the sale of the product without a noticeable increase in acidity before delivery to dairy enterprises; calculation of the profitability of the device.
The cold heat exchanger 10 is pressed against the cold junctions of the thermoelectric modules 9. It has through holes A and B, through which the coolant of the cooling system of engine 11 passes, so that heat is exchanged between the coolant and the cold junctions of the thermoelectric modules 9. The electronic three-way valve 16 according to patent No. 2270923 [6] is designed so that part of the coolant can be directed through channel 28 and the other part through channel 30, or the entire coolant flow can be directed through channel 28 while channel 30 is closed. After the tank has been filled with cold milk (milk temperature Tm = 2 °C), the device starts to work. Suppose that the milk in tank 1 must be kept at 4 °C. The operation of the device can then be considered as consisting of two modes.
First mode. When the engine 11 is started, the cooling system, the standard generator 12 and the thermoelectric generator (TEG) 6 start working. The standard generator 12 generates electricity, which is supplied through channels 36 and 40 to the main switchboard 24 and to the battery 23, so that the battery 23 is charged. At the same time, TEG 6 starts operating. At the hot junctions of TEG 6, heat is absorbed from the exhaust gases (EG), and the coolant removes heat from the cold side; the removed heat equals the absorbed heat minus the electricity generated at the external load. On an external load, TEG 6 produces a voltage equal to the electromotive force (emf) minus the voltage drop across its internal resistance. The generated electricity passes through channels 39 and 41 to the main switchboard 24 and the battery 23, which speeds up charging.
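The relation between the emf of the TEG and the voltage on its load can be illustrated with a short calculation. The sketch below treats the TEG as an ideal emf source in series with an internal resistance feeding a resistive load; this simple circuit model and all numerical values are assumptions for illustration and are not taken from the paper.

```python
def teg_load_voltage(emf_v: float, r_internal_ohm: float, r_load_ohm: float):
    """Return (current in A, load voltage in V) for a TEG on a resistive load.

    Load voltage = emf minus the voltage drop across the internal resistance.
    """
    current = emf_v / (r_internal_ohm + r_load_ohm)
    voltage = emf_v - current * r_internal_ohm
    return current, voltage


# Hypothetical values: 14 V emf, 1 ohm internal resistance, 6 ohm load.
current, voltage = teg_load_voltage(emf_v=14.0, r_internal_ohm=1.0, r_load_ohm=6.0)
print(f"I = {current:.2f} A, U = {voltage:.2f} V")  # I = 2.00 A, U = 12.00 V
```

With these illustrative numbers the load sees 12 V, which matches the upper voltage limit listed among the functional criteria in Table 1.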
Fig. 1. Schematic solution of a device with a thermoelectric module: milk storage tank 1; thermoelectric cooler 2, including thermoelectric modules 3, hot heat exchanger 4, cold heat exchanger 5; thermoelectric generator 6, including exhaust pipeline 7, hot heat exchanger 8, thermoelectric modules 9, cold heat exchanger 10; automobile engine 11; regular generator 12; coiled milk heat exchanger 13; heat exchanger (radiator) 14; thermostat 15; electronic three-way valve 16 [3]; cooling system pump 17; electric pump 18; temperature sensor 19; setter 20; comparison unit 21; control unit 22; battery 23; main switchboard 24; brine supply channels 25, 26; coolant supply channels of the engine cooling system 27, 28, 29, 30, 31, 32, 33, 34; exhaust gas supply channel 35; power supply channels 36, 37, 38, 39, 40, 41, 42, 43, 44; electrical signal channels 45, 46, 47.
Since the temperature of the filled milk is Tm = 2 °C, the thermoelectric cooler (TEC) 2 does not operate after the engine is started. In this case, the three-way valve 16 keeps channel 28 closed, and the coolant enters TEG 6 through channel 30 and then enters thermostat 15 through channels 31 and 29. Thermostat 15 sends part of the coolant flow through channel 33 to engine 11; the other part is sent through channel 32 to the heat exchanger (radiator) 14 and, after heat exchange, returns to the engine 11 through channels 34 and 33. Second mode. If Tm ≥ 4 °C, the signal from the temperature sensor 19 is sent to the comparison unit 21, to which the setter 20 also sends a signal through channel 46. The comparison unit generates a control signal and feeds it to the control unit 22, which supplies electricity through channel 44 to the electronic three-way valve 16, through channel 37 to TEC 2, and through channel 38 to electric pump 18. The electronic three-way valve 16 opens channel 28, and TEC 2 and electric pump 18 start. At the same time, part of the coolant is directed through three-way valve 16 through
Fig. 2. Laboratory installation with a thermoelectric module for transporting milk
channel 30 to TEG 6, and part through channel 28 to hot heat exchanger 4, where it cools the hot junctions of thermoelectric modules 3; it then enters thermostat 15 through channel 29 and reaches engine 11 in the same way as in the first mode (Fig. 1). The electric pump 18 supplies the cooled CaCl2 or NaCl salt solution through channel 25 to the coiled heat exchanger 13 located in the milk tank 1, where heat is exchanged between the milk and the salt solution and the milk temperature decreases. The salt solution then enters the cold heat exchanger 5 through channel 26; heat exchange with the cold junctions of thermoelectric modules 3 lowers its temperature, and the cycle repeats. When the set temperature Tm = 4 °C is reached, the temperature sensor 19 sends a signal to the comparison unit 21, and a control signal is generated and fed to the control unit 22, which cuts off the power supply to the three-way valve 16, TEC 2 and the electric pump 18, so that the device switches to the first mode. During forced stops of the car, when TEG 6 and the standard generator 12 cannot generate electricity, all electrical consumers of the device are powered by battery 23. When milk is transported in the cold season, TEC 2 can work as a heater, and the milk can be heated to the optimum temperature. Highly efficient thermoelectric materials [8–10] can be created, for example, by taking into account the 18 VEC (valence electron count) rule with aliovalent substitution [8].
The above technical result is achieved in the proposed heat-refrigerating hybrid plant for cooling agricultural products, which contains a sealed reservoir, a heat exchanger, a milk pump, a coolant pump and a control unit, as follows. According to the invention, the sealed reservoir is heat-insulated. Its upper part, in which a float valve, a level sensor and a water temperature sensor are installed, is connected by a heat-insulated pipeline to a filter and a submersible pump located in the well below the freezing level of groundwater. A safety valve and a pipe with a valve are installed in the lower part of the heat-insulated sealed reservoir, connecting it to a cold accumulator through a float valve. The evaporator of the refrigerating machine is placed in the cold accumulator, and the cold accumulator is connected to the lower part of the heat-insulated container through the coolant pump, the heat exchanger and a valve. The cold accumulator is also connected by a pipeline, through the float valve and a valve, to a heat-insulated tank. The lower part of the heat-insulated tank has a safety valve, a warm-water distribution pipe with a valve, and a pipe with a valve connecting the tank to the equipment for reheating warm water. A float valve is installed on the cover of the heat-insulated tank, and a warm-water temperature sensor is installed in the upper part of its side wall, while the temperature of the coolant in the cold accumulator is controlled by a temperature sensor. Thus, the proposed heat-refrigerating hybrid plant for cooling agricultural products saves energy by reducing the energy consumed for pre-cooling the coolant water (through natural cooling by groundwater), by lowering electricity costs through cold accumulation during the preferential electricity tariff, and by recovering the thermal energy of agricultural products for industrial needs. The proposed installation has the features of a hybrid system, since energy sources of different nature are united under a single control system.
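The two operating modes of the milk-transport device described earlier in this section (cooling off while the milk is below the set point, cooling on once Tm reaches it) amount to simple on/off control. The sketch below is one possible reading of that logic. The 0.5 °C hysteresis band is an assumption added here to avoid rapid switching around the set point; the text itself only gives the 4 °C threshold. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Actuators:
    tec_on: bool = False       # thermoelectric cooler 2
    pump_on: bool = False      # electric pump 18
    valve_open: bool = False   # three-way valve 16 opening channel 28


def control_step(milk_temp_c: float, state: Actuators,
                 setpoint_c: float = 4.0, hysteresis_c: float = 0.5) -> Actuators:
    """One pass of the comparison unit 21 / control unit 22 logic (sketch)."""
    if milk_temp_c >= setpoint_c + hysteresis_c:
        cooling = True              # second mode: start cooling
    elif milk_temp_c <= setpoint_c:
        cooling = False             # first mode: set temperature reached
    else:
        cooling = state.tec_on      # inside the band: keep the current mode
    return Actuators(tec_on=cooling, pump_on=cooling, valve_open=cooling)


# Hypothetical temperature readings during a trip, in degrees Celsius.
state = Actuators()
history = []
for reading in [2.0, 3.5, 4.6, 4.2, 4.0, 3.9]:
    state = control_step(reading, state)
    history.append(state.tec_on)
print(history)  # [False, False, True, True, False, False]
```

All three actuators are switched together because the text powers the valve, the cooler and the pump from the same control signal.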
3 Conclusion
The developed device with a thermoelectric module for transporting milk and colostrum maintains the specified temperature during transportation, at which the product can be stored without a noticeable increase in acidity until it is delivered to dairy enterprises. It also allows colostrum of animals (bovine and goat) to be defrosted and warmed with preservation of its feed value. A laboratory sample of a device with a thermoelectric module (TEM) for milk was built; it provides a given temperature of raw materials during transportation while increasing the power of the onboard power plant required for the use of the TEM. In confirmation of the results obtained, RF patent 100199 and patent No. 2270923 were obtained.
References
1. Mahan, G., Sales, B., Sharp, J.: Thermoelectric materials: new approaches to an old problem. Phys Today 3, 42 (1997)
2. Yang, R., Chen, G.: Nanostructured thermoelectric materials: from superlattices to nanocomposites. Mater Integr, 1–12 (2006)
3. Sano, S., Mizukami, H., Kaibe, H.: Design of high-efficiency thermoelectric power generation system. Komatsu Technical Report, vol. 49, no. 152 (2003)
4. Zhdankin, G.V., Belova, M.V., Mikhailova, O.V., Novikova, G.V.: Radio wave installations for heat treatment of non-food waste of animal origin. Proc. Orenburg State Agrarian Univ. 4(72), 198–202 (2018)
5. Poruchikov, D., Vasilyev, A., Samarin, G., Ershova, I., Kovalev, A., et al.: Experimental data of meat raw parameter change by electrophysical impact. Helix 9(4), 5144–5151 (2019)
6. Timofeev, V.N., Kuzin, N.P., Krasnov, A.N.: Patent RF No. 2270923, IPC F01P 7/16. Electric thermostat. Published 27 Feb 2006. Bull. No 6
7. Zhao, D., Tan, G.: A review of thermoelectric cooling: materials, modeling and applications. Appl. Thermal Eng. 66(1–2), 15–24
8. Choudhary, M.K., Ravindran, P.: First principle design of new thermoelectrics from TiNiSn based pentanary alloys based on 18 valence electron rule. Comput. Mater. Sci. 209, 111396 (2022)
9. Hue Do, T., Skirving, R., Chen, T., Williams, J.L., Bottema, C.D.K., Petrovski, K.: Colostrum source and passive immunity transfer in dairy bull calves. J. Dairy Sci. 104(S1) (2021). https://doi.org/10.3168/jds.2020-19318
10. Churyumov, G., Denisov, A., Frolova, T., Nannan, W., Qiu, J.A.: High-power source of optical radiation with microwave excitation. In: Proceedings of the 17th International Conference Microwave and High Frequency Heating (AMPERE 2019), pp. 43–50. UPV Press, Valencia (2019)
Energy-Efficient AI Models for 6G Base Station
Mahadi Karim Munif1, Mridul Ranjan Karmakar1, Sanjida Alam Tusi1, Banalata Sarker1, Ahmed Wasif Reza1(B), and Mohammad Shamsul Arefin2(B)
1 Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh
[email protected]
2 Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
[email protected]
Abstract. An intelligent base station is designed to use artificial intelligence (A.I.) and machine learning techniques to optimize its performance and improve overall energy efficiency. It is unclear what specific features an intelligent base station for 6G might have, as 6G technology has yet to be created. However, an intelligent base station for 6G will likely be able to adapt to changing network conditions and make intelligent decisions about allocating resources to provide the best possible service to users. It is also possible that an intelligent base station for 6G will learn from past performance and use that information to improve future performance. This paper presents the primary considerations for the 6G base station. We look at how A.I. models are being used to manage the 6G base station network and increase energy harvesting in the transition to a greener future. We investigate how cutting-edge Machine Learning (ML) and Deep Learning (DL) techniques can work together with traditional A.I. methodologies. Finally, we discuss the most efficient ML and DL models for building the 6G base station.
Keywords: A.I. · Base station · Machine learning · Deep learning · 6G
1 Introduction
The demand for technology, networks and services keeps growing. Several countries have already installed and implemented fifth-generation (5G) networks. Although 5G has yet to show its full potential and has yet to be deployed in all countries, technologically developed countries have to lead the way toward the next-generation network (sixth generation, 6G). To reach the next level of performance and to support novel services and applications, 6G introduces more demanding requirements in terms of communication, hybrid data transfer, latency, reliability, data rate and many other aspects of the infrastructure. 6G is a high-performance, immersive networking system, so in implementing and developing it, scientists, technologists and developers have to pay great attention to the environment, because 6G will be the next step in fulfilling user demands. Numerous aspects, including the progress of device-to-device communication, will be included. The main problem is that, for the implementation and use of 6G, this very high level of performance will increase energy consumption and cause substantial carbon (CO2) emissions. This is a serious problem for green communication with 6G. A.I. modeling for 6G automation that preserves green operation is a great challenge, as are the maintenance and distribution of power consumption and sustainable resource allocation. To maximize green communication, it is important to optimize resource utilization, reduce power consumption, harvest energy and build efficient communication. 6G communication will involve massive network traffic, high-energy and power-consuming devices, and dynamic network environments, which makes resource optimization and energy-loss problems more complex. Data science, A.I.-based optimization and automated A.I. models can solve the problems related to resource optimization. They enable proper utilization and automatic task assignment in distributed systems for green communication, thanks to data-driven decisions and operation with dynamic flexibility and self-adjustment.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 121–132, 2024. https://doi.org/10.1007/978-3-031-50327-6_14
2 Related Work
In this section, we discuss some of the related work on 6G communication. The work in [1] discusses network architecture and the growing number of linked terminals; the methods used are machine learning, deep learning and mathematical models. In [2], the researchers discuss the 6G network, which has stringent security, flexibility, Quality of Service and intelligence requirements; the methods used are machine learning and deep learning. The paper [3] describes antennas, bandwidth and denser base stations; the methods used are machine learning and extensive data analysis. The paper [4] discusses various green communication and technology techniques; the methods used are device-to-device communication and a massive MIMO system. The paper [5] proposes a 6G cellular vehicle-to-everything (C-V2X) scheme; the method used is deep reinforcement learning (DRL). The article [6] uses machine learning: the authors introduce 6G and present potential learning applications. They also describe the critical technical challenges, the learning methods and open problems for future research on 6G communications.
3 Some A.I. Models for Communication
The use of A.I. techniques to improve network performance is a hot topic, and green communication is a significant application. Various A.I. models have been developed to improve the performance of A.I. strategies, and new trends toward more intelligent communication management have appeared. Artificial intelligence has been established as a crucial concept for 6G to achieve autonomous network systems. However, the increased network complexity and the stricter service standards present substantial difficulties for the A.I. solutions now in use. Intelligent network management will require cooperation between diverse components: designing, deploying and allocating network resources. Various A.I. techniques will be adopted to realize intelligence in every part. Figure 1 shows all three types of algorithms.
Fig. 1. Algorithms used in communication
A.I. techniques suit different application scenarios and should be chosen according to the specific problem. Deep learning approaches have gained popularity for more challenging tasks as computer hardware has advanced. This does not mean, however, that established A.I. methods, such as heuristic algorithms and shallow ML models, are no longer appropriate. Many classical A.I. techniques are appropriate for situations with limited resources, since they have substantially lower computational complexity than DL.
3.1 Traditional A.I. Methods
6G is the next generation of mobile communication technology and is currently being researched and developed. It is expected to bring significant advances in speed, capacity and connectivity, as well as the integration of new technologies such as edge computing and quantum computing. Many traditional A.I. methods could be used in 6G communication, including:
1. Heuristic Algorithm: Heuristic algorithms use trial and error and experiential knowledge to solve problems and make decisions. They are often used when a more formal, mathematical approach is impossible, and they rely on the expertise and intuition of the person developing the algorithm. Heuristic algorithms generally do not follow a fixed equation or set of rules; the specific steps and techniques depend on the problem being solved and on the developer's expertise. They are typically used when a more formal, mathematical approach is not feasible or would be too time-consuming, and they are designed to find a quick solution to a problem rather than an optimal one.
In the context of 6G communication, heuristic algorithms could solve a wide range of problems, such as optimizing network configurations, allocating resources within a network, or routing data to minimize latency and maximize capacity. The steps and techniques used by a heuristic algorithm in a 6G communication system depend on the specific problem and the system's requirements.
M. K. Munif et al.
Particle Swarm Optimization (PSO) is a heuristic optimization algorithm that was initially developed for artificial intelligence and computer science and has since been applied to a wide range of other fields, including communication networks. In PSO, each particle updates its movement based on its personal best position, the global best position, and a random component. The specific values of the constants and parameters in the update equation depend on the problem being solved and the system's requirements. This update can be used for network optimization.

2. Genetic Algorithm: Genetic algorithms (G.A.s) are optimization algorithms inspired by the principles of natural evolution and genetics. They are commonly used to solve complex optimization problems in various fields, including communication networks.

Pseudocode for Genetic Algorithm:

    for i = 1 to population_size:
        fitness[i] = calculate_fitness(individual[i])
    for i = 1 to number_of_generations:
        parent1, parent2 = select_parents(fitness)
        child = crossover(parent1, parent2)
        child = mutate(child)
        replace_least_fit_individual(child)

Here, the G.A. begins by calculating the fitness of each individual in the population using a fitness function that is specific to the problem being solved. It then iterates through several generations, selecting pairs of parents using a selection function (such as roulette wheel selection), combining their genetic material using a crossover function, and mutating the resulting child using a mutation function. The least fit individual in the population is then replaced with the child. This process is repeated until the desired optimization level is achieved or a stopping condition is reached. The specific values of the parameters and functions used in the G.A. depend on the problem being solved and the system's requirements.
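The G.A. loop described above can be sketched in Python as follows. This is only an illustrative sketch: the bit-string encoding, the toy fitness function (the number of 1-bits), the mutation rate, and every function name are assumptions made for the example and are not taken from the paper.

```python
import random


def calculate_fitness(individual):
    # Hypothetical objective: count the 1-bits in the bit string.
    return sum(individual)


def select_parents(population, fitness, rng):
    # Roulette-wheel selection; the small offset avoids an all-zero weight list.
    weights = [f + 1e-9 for f in fitness]
    return rng.choices(population, weights=weights, k=2)


def crossover(parent1, parent2, rng):
    # Single-point crossover at a random cut position.
    point = rng.randrange(1, len(parent1))
    return parent1[:point] + parent2[point:]


def mutate(individual, rng, rate=0.05):
    # Flip each bit independently with probability `rate`.
    return [1 - gene if rng.random() < rate else gene for gene in individual]


def run_ga(population_size=20, genome_length=16, generations=300, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    fitness = [calculate_fitness(ind) for ind in population]
    initial_best = max(fitness)

    for _ in range(generations):
        parent1, parent2 = select_parents(population, fitness, rng)
        child = mutate(crossover(parent1, parent2, rng), rng)
        # Replace the least fit individual with the child.
        worst = min(range(population_size), key=fitness.__getitem__)
        population[worst] = child
        fitness[worst] = calculate_fitness(child)

    best = max(range(population_size), key=fitness.__getitem__)
    return population[best], fitness[best], initial_best


if __name__ == "__main__":
    best, best_fitness, initial_best = run_ga()
    print(best, best_fitness, initial_best)
```

Because only the least fit individual is ever replaced, the best fitness in the population cannot decrease from one generation to the next. For a network problem, the bit string could, for example, encode which base stations are switched on, with a fitness function that reflects coverage and power.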
3.2 Development of Deep Learning Models

We describe the attributes, difficulties, and future technologies of 6G networks and find that deep learning can address these difficulties. Deep learning techniques and their possible applications in 6G networks are also discussed. We highlight the key arguments and research findings and offer suggestions for building 6G networks. Numerous devices could connect to 6G networks at any time or location, and these networks are expected to offer high speed and throughput, low latency, and dependable communication services. We used two deep learning models: a convolutional neural network (CNN) and long short-term memory (LSTM) [7].

1. Convolutional Neural Network: CNN is one of the most effective deep learning techniques in the context of 6G technology. A CNN's general structure consists of four layers: the input layer, the convolution layer, the pooling layer, and the output layer. The convolution layer receives the input image and applies a filter to create a feature map. The pooling layer then receives the feature maps from the convolution layer, weights them by a scalar factor $W_{x+1}$, and adds a bias $b_{x+1}$. One of CNN's key advantages is that parallel learning makes the network less complex.

2. Long Short-Term Memory: An LSTM-based deep learning model is employed to learn the channel properties between the B.S. and the device for better power allocation. Multiuser identification technology is then applied for uplink message decoding. The LSTM can bridge time lags of up to a thousand time steps. It does so by maintaining a constant error flow through cells that are grouped into special units, and access to the cells is controlled by multiplicative gate units. The CNN model predicts the elements affecting 6G adoption most effectively. Even though LSTM models appear to have a high degree of accuracy, the quality of the data they output is relatively poor. Hence, CNN is essential for high accuracy from a 6G perspective [7].

3.3 Future Perspective Learning Methods

Machine learning algorithms are categorized as supervised, unsupervised, and reinforcement learning algorithms. Supervised and unsupervised algorithms are widely used to train on datasets, whereas a reinforcement learning algorithm uses an agent that is trained to interact with and act on its environment. Strategies are set for the agent so that it can follow them and take proper actions in the current environment. Reinforcement learning has three approaches: the value-based approach, the policy-based approach, and the model-based approach. For an energy-efficient 6G network implementation, we use the model-based approach with the positive type of R.L. algorithm, which rewards specific agent behavior. It strengthens the desired operating and network behavior, which helps the agent take better actions in the future. R.L. has two learning models for implementing the energy-efficient 6G base station: "Markov Decision Process" and "Q-learning." The reinforcement learning algorithm is applied as follows:

• To apply the reinforcement learning algorithm here, agents must be allocated. These can be power-consuming components, power supply components, switch components, bandwidth meters, etc., which interact with the current situation by switching on or off, or connecting and disconnecting, according to the given strategies.
• Strategies define the states in which, and when and how, the agent takes action. To reduce power consumption, the R.L. algorithm can turn off a power supply unit when it does not need to be on.
• First, we prepare an agent with some initial instructions and strategies. The agent then observes its surroundings in its current state. After observing, the algorithm selects the optimal solution for the current state of the environment and performs the corresponding action. The agent can then determine the expected result. By observing the result, R.L. can update, regenerate, or restore the action if required.
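The observe, act, and update cycle described above can be illustrated with a small tabular Q-learning sketch in which an agent decides whether a base station should sleep or stay active. Everything specific here is an assumption for illustration only: the two traffic states, the power costs, the QoS penalty, the random traffic model, and the hyperparameters are hypothetical and do not come from the paper.

```python
import random

# Hypothetical per-step power costs in arbitrary units.
P_SLEEP = 1.0
P_ADD = 2.0
P_TRANS = 2.0
QOS_PENALTY = 10.0  # assumed penalty for sleeping while traffic is high


def reward(traffic, action):
    """Negative cost of an action (0 = sleep, 1 = active) given traffic (0 = low, 1 = high)."""
    if action == 1:
        utilization = 1.0 if traffic == 1 else 0.0
        return -(P_SLEEP + P_ADD + utilization * P_TRANS)
    penalty = QOS_PENALTY if traffic == 1 else 0.0
    return -(P_SLEEP + penalty)


def train(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    state = rng.randint(0, 1)
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randint(0, 1)
        else:
            action = 0 if q[state][0] >= q[state][1] else 1
        r = reward(state, action)
        # Toy traffic model: the next traffic level is random and independent of the action.
        next_state = rng.randint(0, 1)
        # Standard Q-learning update toward the bootstrapped target.
        target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    # Best action per state according to the learned Q-table.
    return [0 if row[0] >= row[1] else 1 for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With these assumed costs, the learned policy puts the station to sleep under low traffic and keeps it active under high traffic, because the QoS penalty outweighs the power saved by sleeping.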
4 Some Proposed A.I. Models for Intelligent Base Stations

Potential applications of A.I. in 6G base stations include:

• Resource management: A.I. could optimize the allocation of resources such as bandwidth and power to maximize efficiency and minimize interference.
• Predictive maintenance: A.I. could predict when equipment is likely to fail and trigger preventative maintenance to minimize downtime.
• Network security: A.I. could detect and prevent cyber attacks on the network.
• Energy efficiency: A.I. could be used to optimize power consumption to reduce energy costs and improve sustainability.

4.1 Energy-Efficient Base Station

A base station is a stationary transceiver that connects to mobile devices such as tablets and smartphones. It offers wireless coverage over a particular region and is often situated in a central place, such as a phone exchange or a cell tower. Other wireless communication systems, such as Wi-Fi and satellite, can also employ base stations. A base station's energy usage varies with its workload, while part of its power usage remains constant during sleep and idle phases. It is usually summarized as Eq. (1):

$$P_{bs} = P_{sleep} + I_{bs}\,\{P_{add} + \eta P_{trans}\} \tag{1}$$
where $P_{bs}$ and $P_{trans}$ stand for the base station's overall power consumption and maximum transmission power consumption, respectively, and the utilization rate is denoted by $\eta \in [0, 1]$. $P_{sleep}$ is the constant energy use needed to maintain vital processes while in sleep mode. $P_{add}$ stands for the additional constant power required in active mode for power supply, backhaul connectivity, and processing. The binary parameter $I_{bs}$ indicates whether the base station is awake or asleep. Energy efficiency measures the performance obtained per unit of energy consumed. In cellular networks, it is therefore typically described as the ratio of the achieved transmission rate to the power consumption, expressed in "bits per joule." For green communication, increasing energy efficiency is just as essential as direct energy-saving measures. Here, we analyze several optimization strategies and give the energy efficiency equations for a U.E. in cellular networks. The same derivation also applies to base stations. Each cell utilizes the same portion of the spectrum. If the transmission power of one U.E. and the channel gain to the accompanying base station are $P_u$ and $G_u$, respectively, then

$$R_u = B \log_2\left(1 + \frac{G_u P_u}{N + I}\right) \tag{2}$$

In Eq. (2), $R_u$ is the highest data transmission rate for the uplink of the considered U.E., $N$ and $I$ stand for the noise and interference on the active channel, respectively, and $B$ represents the given bandwidth.
Eq. (3) is the energy efficiency equation, where $P_{u0}$ and $\mu$ stand for the static power consumption and the power amplifier coefficient, respectively:

$$EE_u = \frac{R_u}{\mu P_u + P_{u0}} \tag{3}$$
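As a worked illustration, Eqs. (1) to (3) can be evaluated numerically with the short sketch below. The function names and the sample values are hypothetical, and the logarithm in Eq. (2) is assumed to be base 2 so that the rate is in bits per second.

```python
import math


def base_station_power(p_sleep, p_add, p_trans, utilization, awake):
    """Eq. (1): P_bs = P_sleep + I_bs * (P_add + eta * P_trans)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    indicator = 1 if awake else 0
    return p_sleep + indicator * (p_add + utilization * p_trans)


def uplink_rate(bandwidth_hz, gain, tx_power, noise, interference):
    """Eq. (2): R_u = B * log2(1 + G_u * P_u / (N + I)), assuming a base-2 logarithm."""
    return bandwidth_hz * math.log2(1.0 + gain * tx_power / (noise + interference))


def energy_efficiency(rate, tx_power, mu, static_power):
    """Eq. (3): EE_u = R_u / (mu * P_u + P_u0), in bits per joule."""
    return rate / (mu * tx_power + static_power)


if __name__ == "__main__":
    # Hypothetical sample values in consistent units.
    print(base_station_power(10.0, 50.0, 100.0, 0.5, awake=True))
    rate = uplink_rate(1e6, 1.0, 3.0, 0.5, 0.5)
    print(rate, energy_efficiency(rate, 3.0, 2.0, 4.0))
```

The sleeping case of Eq. (1) reduces to $P_{sleep}$ alone, which is why switching idle base stations off is attractive whenever the service quality can be maintained.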
4.2 Research Works for 6G Communication

Table 1 summarizes research on 6G communication, its focus, and the algorithms and frameworks used.

4.3 Base Station Deployment

Base station deployment influences communication performance and energy usage throughout the network-building phase. Although specific deployment locations may be selected manually based on population density [26], increasing dynamics, varying propagation characteristics, complex physical settings, and even climates force researchers and operators to consider more efficient and automated methods. Dai and Zhang consider deploying fewer base stations by adopting a multi-objective genetic algorithm. In their study, the suggested method first isolates the key characteristics that control the received signal strength (RSS). The link between the collected features and RSS values is then mapped using various ML models, such as KNN, random forest (R.F.), SVM, and MLP. In the second stage, the multi-objective genetic algorithm [27] is used to optimize the locations and operating conditions. In particular, different numbers of base stations are tried during the genetic algorithm process, and the lowest number meeting the coverage requirement is chosen. The suggested ML models then assess the workable options in further detail. According to simulation data, MLP performs better than the other ML models in terms of Mean Absolute Error (MAE). Furthermore, compared to the real-world deployment, the coverage rate increases by 18.5%.

4.4 Work State Scheduling

Network traffic changes dynamically owing to user mobility. Thus, multi-tier base stations may be continuously switched on and off to conserve energy [28]. To guarantee a qualified connection, the user association information should be updated whenever a base station's operational state changes. As a result, it is essential to plan the work state of base stations properly in order to reduce energy consumption while still meeting QoS requirements.
A base station on/off switching policy may be created from the correlation between current traffic data and previous experience, because users' daily travels give traffic patterns the same recurring tendency [29, 30]. The simplest option is to use a historical profile to estimate future traffic and turn off the base stations with little utilization. The potential degradation of QoS is the primary concern when turning off some base stations. To allay this concern, Gao et al. examine the complexity, speed, and rate of traffic prediction using a variety of ML models, including ARIMA [31], R.F., LSTM, and E.L. According to simulation data, it is possible to satisfy more than 99.9% of requests while reducing energy usage by 63%.
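The simplest option mentioned above, a historical profile with a utilization threshold, can be sketched as follows. This is a minimal illustration under stated assumptions: the station names, the four-slot days, the utilization values, and the 0.2 threshold are hypothetical, and the sketch ignores the QoS checks and user re-association that a real policy would need.

```python
from statistics import mean


def slot_profile(history):
    """Average utilization per time slot across past days.

    `history` is a list of days; each day is a list of utilizations in [0, 1].
    """
    slots = len(history[0])
    return [mean(day[s] for day in history) for s in range(slots)]


def schedule(history_by_station, threshold=0.2):
    """Return {station: [True if the station stays on in that slot]}."""
    return {
        name: [u >= threshold for u in slot_profile(days)]
        for name, days in history_by_station.items()
    }


if __name__ == "__main__":
    # Hypothetical utilization history: two past days, four time slots per day.
    history = {
        "bs_a": [[0.1, 0.5, 0.9, 0.1], [0.1, 0.7, 0.7, 0.2]],
        "bs_b": [[0.6, 0.6, 0.6, 0.6], [0.8, 0.8, 0.8, 0.8]],
    }
    print(schedule(history))
```

A forecasting model such as those compared by Gao et al. would replace `slot_profile` with a learned predictor, while the thresholding step could stay the same.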
Table 1. Some research work for intelligent base station

Reference | Focus | Algorithms/framework
[6] | Services and road map for 6G communication | Federated learning methods for 6G; federated learning solution and architecture
[8] | Resource allocation, power consumption | Neural network, node optimization
[9] | Network performance and guaranteeing seamless connectivity | Mobile edge computing, innovative spectrum management, and intelligent mobility and handover management
[10] | Security and privacy issues | Edge based on ML models
[11] | Advantages in addressing impending 6G wireless communications issues and improving various systems | Deep learning, machine learning
[12] | Enhancing the quality of communication services by fostering end users' trust in in-network communication | Federated learning of XAI models
[13] | Cache capacity, data usage amount, average access time | Q-learning algorithm, SaaS server interaction, SaaS caching model
[14] | Environment observation, sequential decision-making | Reinforcement learning, MARL algorithms
[15] | 6G intelligent vehicular network, energy-efficient IoV techniques, sustainable communication and networking in 6G | SDN-based framework
[16] | Higher transmission rates, large data processing, low-latency communication | OFDM-based communication system, support vector machines (SVMs)
[17] | Resource management and channel estimation, energy use, computation delay, the pace of network utility offloading in 6G | Recurrent neural networks (RNNs), convolutional neural networks (CNNs), feed-forward neural networks (FNNs), LSTM
[18] | Resource allocation, cross-layer optimization framework | Unsupervised and supervised deep learning for URLLCs, DR
[19] | Future 6G PHY systems will need effective DL deployment tactics due to the prohibitive complexity of training, limitations in computing power, memory, control, and energy sources, and security concerns | DNN, Q-learning
[20] | Massive systems with many inputs and outputs, complex multi-carrier waveform designs, and PHY security | Deep learning, machine learning, CNN, RNN
[21] | Green and sustainable solutions for many multidimensional applications | Internet of things (IoT), distributed computing, and the internet of everything (IoE)
[22] | IRS-assisted communication systems, ML-based IRS-assisted wireless communication | Machine learning
[23] | Increased energy efficiency, low access network and backhaul congestion, and improved data security | Artificial intelligence
[24] | Efficient use of 6G networks, 6G wireless network key performance indicators, and performance improvement | Machine learning
[25] | Several aspects of the green 6G design | Distributed A.I., spectrum communication techniques
4.5 General Power Control of Base Station

Since transmit power control influences the transmission rate, it is essential for increasing the system's energy efficiency. Based on the user density and the SINR obtained, Zhang et al. employ the R.L. approach in [32] to optimize the transmit power and reduce interference in surrounding cells. The target base stations apply a greedy strategy over the Q-function to select the transmission power value. The performance is evidenced by lower interference, increased network speed, and reduced energy usage. The authors of [32] also propose a DRL-based CNN model that maps network states, such as received SINR, user density in the target cell, and projected channel conditions in surrounding cells, to the transmit power level. The network performance shows that the DRL-based method can improve energy efficiency, throughput, and interference. Another notable advantage is that the DRL strategy converges much faster than the RL-based approach.

Liu et al. [33] use the KNN model to maximize the spectrum and energy efficiency of distributed antenna systems. This study covers a single-cell distributed antenna system with numerous Remote Access Units (RAUs), and it is recommended that the RAUs' broadcast power be raised. Under the assumption that orthogonal channel resources and Channel State Information (CSI) are available, they use KNN to map the relationship between user location and power allocation. Furthermore, a traditional method is used to collect the data samples for the KNN models. After the Euclidean distance between users in the training and testing groups is calculated, the power of the nearest neighbor in the training samples is transferred to the user in the test group.

For multi-layered HetNets, achieving the global optimum in power management is more complicated. Liang and Zhang [34] propose a DRL technique with a core neural network. The core neural network trains the essential DNN using the global experience in order to avoid locally optimal solutions.

4.6 Green Energy Base Station

Renewable energy sources, encouraged by the development of energy harvesting and motivated by concerns about climate change, are expected to reduce reliance on the power grid. However, the dynamics of renewable energy sources make it more challenging to maintain and run cellular networks. The use of A.I. algorithms to track the dynamics of the harvested energy and improve network performance has received much research attention. Predicting the harvested power is the most straightforward way to enhance network performance with green-energy-enabled base stations. Vallero and Renga [35] employ BLR [36] and LSTM [37] to forecast traffic for the case where the base station is powered by a photovoltaic panel, a battery, and the power grid. The harvested power is estimated concurrently using a linear regression model. Then, to save energy, some small base stations with low utilization can be turned off based on the forecast results.
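The forecasting step in Sect. 4.6 can be illustrated with a least-squares line fit. This is only a sketch under stated assumptions: the single input feature (for example, a measured irradiance level), the sample values, the 0.2 utilization threshold, and all function names are hypothetical, and it is not the model used in [35].

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


def grid_power_needed(load_w, harvested_w):
    """Power still drawn from the grid after using the harvested power."""
    return max(0.0, load_w - harvested_w)


def can_sleep(forecast_utilization, threshold=0.2):
    """Assumed rule: a small base station may sleep when forecast utilization is low."""
    return forecast_utilization < threshold


if __name__ == "__main__":
    # Hypothetical history: input feature versus harvested power.
    model = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    harvested = predict(model, 4.0)
    print(harvested, grid_power_needed(12.0, harvested), can_sleep(0.1))
```

Combining the harvest forecast with a traffic forecast lets an operator estimate how much grid power a site will need and which lightly loaded small cells are candidates for switching off.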
5 Conclusion

The advantages and effectiveness of A.I. techniques in resolving challenging problems have been demonstrated. This paper covers AI-related research on network optimization and configuration to enhance energy efficiency. We evaluate the advantages and disadvantages of several A.I. models, including traditional heuristic approaches and deep learning models, and we show how they may work together systematically to minimize energy usage and boost energy efficiency. From the discussion of the intelligent base station models, we conclude the following:

• Heuristic algorithms are widely used and well developed for improving work state scheduling and base station deployment. Combining machine learning with heuristic algorithms has the potential to increase flexibility and productivity.
• The R.L. and DRL techniques effectively determine the best course of action for power distribution and control problems. However, the extensive action space caused by continuity and the wide range of conceivable metric combinations mean that additional effort is required for the training process.

Current research focuses on creating A.I. models rather than on their deployment in cellular networks or on the energy required to run their computationally intensive versions. The research will be more valuable and applicable if it focuses on implementing the suggested A.I. models.
References

1. Mao, B., Tang, F., Kawamoto, Y., Kato, N.: A.I. Models for green communications towards 6G. IEEE Commun. Surv. Tutor. 24(1), 210–247 (2022)
2. Mao, B., Tang, F., Yuichi, K., Kato, N.: A.I. based service management for 6G green communications (2021). arXiv
3. Chih-Lin, C.I.: A.I. as an essential element of a green 6G. IEEE Trans. Green Commun. Netw. 5(1), 291–307 (2021)
4. Malik, N.A., Ur-Rehman, M.: Green communications: techniques and challenges. EAI Endorsed Trans. Energy Web 4(14) (2017)
5. Sanghvi, J., Bhattacharya, P., Tanwar, S., Gupta, R., Kumar, N., Guizani, M.: Res6Edge: an edge-AI enabled resource sharing scheme for C-V2X communications towards 6G. In: 2021 International Wireless Communications and Mobile Computing, IWCMC 2021, pp. 149–154 (2021)
6. Liu, Y., Yuan, X., Xiong, Z., Kang, J., Wang, X., Niyato, D.: Federated learning for 6G communications: challenges, methods, and future directions (2020)
7. Zamzami, I.F.: Deep learning models applied to prediction of 5G technology adoption. Appl. Sci. 13, 119 (2023). https://doi.org/10.3390/app13010119
8. Kamble, P., Shaikh, A.B., Shaikh, A.N.: Optimization of base station for 6G wireless networks for efficient resource allocation using deep learning (n.d.)
9. Yang, H., Alphones, A., Xiong, Z., Niyato, D., Zhao, J., Wu, K.: Artificial-intelligence-enabled intelligent 6G networks. IEEE Netw. 34(6), 272–280 (2020)
10. Xue, R., Tan, J., Shi, Y.: Exploration and application of A.I. in 6G field (2022)
11. Rekkas, V.P., Sotiroudis, S., Sarigiannidis, P., Wan, S., Karagiannidis, G.K., Goudos, S.K.: Machine learning in beyond 5G/6G networks—state-of-the-art and future trends. Electronics (Switzerland) 10(22)
12. Renda, A., Ducange, P., Marcelloni, F., Sabella, D., Filippou, M.C., Nardini, G., Stea, G., Virdis, A., Micheli, D., Rapone, D., Baltar, L.G.: Federated learning of explainable A.I. models in 6G systems: towards secure and automated vehicle networking. Information (Switzerland), 13 (2022)
13. Chen, H., Tan, G.: A Q-learning-based network content caching method. EURASIP J. Wirel. Commun. Netw. 2018(1), 1 (2018). https://doi.org/10.1186/s13638-018-1268-1
14. Zhang, K., Yang, Z., Başar, T.: Multi-agent reinforcement learning: a selective overview of theories and algorithms. In: Vamvoudakis, K.G., Wan, Y., Lewis, F.L., Cansever, D. (eds.) Handbook of Reinforcement Learning and Control. SSDC, vol. 325, pp. 321–384. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-60990-0_12
15. Wang, J., Zhu, K., Hossain, E.: Green internet of vehicles (IoV) in the 6G era: toward sustainable vehicular communications and networking. IEEE Trans. Green Commun. Netw. 6(1), 391–423 (2022)
16. Juwono, F.H., Reine, R.: Future OFDM-based communication systems towards 6G and beyond: machine learning approaches. Green Intell. Syst. Appl. 1(1), 19–25 (2021)
17. Iliadis, L.A., Zaharis, Z.D., Sotiroudis, S., et al.: The road to 6G: a comprehensive survey of deep learning applications in cell-free massive MIMO communications systems. J. Wirel. Comput. Netw. 2022, 68 (2022)
18. Kashyap, P.K., et al.: DECENT: deep learning enabled green computation for edge centric 6G networks. IEEE Trans. Netw. Serv. Manage. 19(3), 2163–2177 (2022). https://doi.org/10.1109/TNSM.2022.3145056
19. Zhang, S., Liu, J., Rodrigues, T.K., Kato, N.: Deep learning techniques for advancing 6G communications in the physical layer. IEEE Wirel. Commun. 28(5), 141–147 (2021)
20. Ozpoyraz, B., Dogukan, A.T., Gevez, Y., Altun, U., Basar, E.: Deep learning-aided 6G wireless networks: a comprehensive survey of revolutionary PHY architectures. IEEE Open J. Commun. Soc. 3, 1749–1809 (2022)
21. Green 6G with Integrated Communications, Sensing and Computing | Frontiers Research Topic (n.d.). Retrieved 7 Jan 2023
22. Karagiannidis, G.K., Goudos, S.K., Wan, S., Salucci, M., Abrar, M., Sejan, S., Rahman, H., Shin, B.-S., Oh, J.-H., You, Y.-H., Song, H.-K.: Machine learning for intelligent-reflecting-surface-based wireless communication towards 6G: a review. Sensors 22(14), 5405 (2022)
23. Chowdhury, M.Z., Shahjalal, M., Ahmed, S., Jang, Y.M.: 6G wireless communication systems: applications, requirements, technologies, challenges, and research directions (n.d.)
24. Integration of Communication and Computing Networks for 6G | Frontiers Research Topic (n.d.). Retrieved 7 Jan 2023
25. Huang, T., Yang, W., Wu, J., Ma, J., Zhang, X., Zhang, D.: A survey on green 6G network: architecture and technologies. IEEE Access 7, 175758–175768 (2019)
26. Fundamental Green Tradeoffs: Progresses, Challenges, and Impacts on 5G Networks | IEEE Journals & Magazine | IEEE Xplore (n.d.)
27. Borah, J., Hussain, M., Bora, J.: Effect on energy efficiency with small cell deployment in heterogeneous cellular networks. Internet Technol. Lett. 2(3), e97 (2019)
28. Feng, M., Mao, S., Jiang, T.: Base station ON-OFF switching in 5G wireless networks: approaches and challenges. IEEE Wirel. Commun. 24(4), 46–54 (2017)
29. Sun, Y., Peng, M., Zhou, Y., Huang, Y., Mao, S.: Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues (2018)
30. International Union of Radio Science; Institute of Electrical and Electronics Engineers: 2019 URSI Asia-Pacific Radio Science Conference (AP-RASC) (n.d.)
31. Jian, M., Alexandropoulos, G.C., Basar, E., Huang, C., Liu, R., Liu, Y., Yuen, C.: Reconfigurable intelligent surfaces for wireless communications: overview of hardware designs, channel models, and estimation techniques. Intell. Converged Netw. (2022)
32. Xiao, L., Zhang, H., Xiao, Y., Wan, X., Liu, S., Wang, L.C., H.V.: Reinforcement learning-based downlink interference control for ultra-dense small cells. IEEE Trans. Wirel. Commun. (2020)
33. Liu, Y., He, C., Li, X., Zhang, C.: Power allocation schemes based on machine learning for distributed antenna systems. IEEE Access (2019)
34. Zhang, L., Liang, Y.-C.: Deep Reinforcement Learning for Multi-Agent Power Control in Heterogeneous Networks (2020)
35. Vallero, G., Renga, D., Meo, M., Marsan, M.A.: Greener RAN operation through machine learning. IEEE Trans. Netw. Service Manag. 16(3), 896–908 (2019)
36. Pan, H., Liu, J., Zhou, S., Niu, Z.: A block regression model for short-term mobile traffic forecasting. In: Proceedings of IEEE/CIC International Conference on Communications in China (ICCC), Shenzhen, China, Nov. 2015, pp. 1–5
37. Understanding LSTM Networks – colah's blog (n.d.). Retrieved 7 Jan 2023
Green IT, IoTs and Data Analytics
Model-Based Design of User Story Using Named Entity Recognition (NER) Aszani and Sri Mulyana(B) Department of Computer Science and Electronics, Universitas Gadjah Mada, Yogyakarta 55281, Indonesia [email protected], [email protected]
Abstract. The needs analysis used to determine system specifications is generally represented as a user story. A user story is built to capture the goals and values from the point of view of the user, who is the party that requests and will use the system. In this study, Named-Entity Recognition (NER) has been applied to build a model-based design for analysis techniques that identify, find, and categorize user stories. It can therefore help system analysts and software development teams understand and improve the quality of user stories. The design model, built with the spaCy library, was evaluated on ten datasets. The evaluation yielded a precision of 97.45%, a recall of 99.67%, and an F1-score of 84.61%. The experimental results show that the number of sentences does not affect the recall or the F1-score, and that the amount of data does not determine the accuracy of the label prediction.

Keywords: Named entity recognition · Natural language processing · SpaCy · User story
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 135–144, 2024. https://doi.org/10.1007/978-3-031-50327-6_15

1 Introduction

Nowadays, practically every sector and institution needs software engineers to stay competitive [1]. Although expert techniques are rarely used to identify system breakdowns, diagnosing and detecting system failures in information systems strengthens their features and efficacy, which is highly significant [2]. Every software development effort is preceded by a system requirements analysis process. The needs analysis used to determine system specifications is generally represented as a user story. A user story is not only about the features to be built but also about the goals and values to be achieved from the view of the user, who is the party that requests and will use the system [3]. Based on the standard format, user stories represent at least three main aspects of defining system requirements: who wants the functionality, what stakeholders expect from the functionality to be built, and why the stakeholders need this functionality [4]. The answers to these questions can describe the design of the system to be built in detail. A detailed system design can assist the development team in building, sizing, and defining system complexity before starting the coding process [5]. People are generally better at classifying than computers [6], but larger data requires more sophisticated machine calculations. The advent of high-throughput technology has produced massive amounts of data. In tandem with this technical advancement, new mathematical techniques were developed to analyse these intricately interconnected systems, allowing for a deeper understanding of the dynamic behaviour of complex regulatory systems [7]. Finding user stories about the same themes or claiming the same benefit can be tricky when trying to get a high-level picture of the relationships. Filtering and sorting procedures, however, can provide a basic overview [8]. One study implements user stories as requirements for developing a decision support system in the health sector, or Clinical Decision Support (CDS), by implementing an agile process. The study concluded that estimating story size using story points can improve planning and predict the development period more accurately [9]. However, user stories are not popular in industry because no one can ensure the quality of a user story. Some available approaches are too general or use a highly qualitative metric [10]. Approximately 50% of real-world user stories contain linguistic errors that can be easily fixed. A tool was developed to solve this problem and encourage the development of improved user stories in a pipeline [11]. User stories, the most popular requirements model in agile software development, take significant time. There are numerous advantages and disadvantages to using natural language processing (NLP) to identify user stories. A rigorous exploration of NLP approaches and assessment methods is necessary to provide high-quality research [12]. One of the essential subtasks in natural language processing is named entity recognition (NER) [13]. The effectiveness of the most recent statistical models is the primary distinction between the NER systems of the past and the present [14].
Several libraries can be used for NER. The environment in which the library will be used, the entities to be extracted, and the target language must all be taken into account when choosing the best one [15]. spaCy offers a simple training procedure that scales effectively to both small and large datasets. NER practitioners are not required to create a bespoke neural network using PyTorch/FastAI or TensorFlow/Keras, all of which have a steep learning curve despite being some of the most user-friendly frameworks [16]. Other research uses a filter to improve the data received from text fields after it has passed through a typical spaCy entity recognizer model. When evaluated on a large web corpus, the model was reported to have an accuracy of roughly 80% [17]. This study aims to identify software development requirements documents in the form of unstructured text, which are then processed by reviewing the information according to the user story format. The expected result is a more efficient software development process, with accuracy measured using the model-based design.
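The who, what, and why aspects of a user story mentioned in the Introduction can be illustrated with a small rule-based sketch. It assumes the widely used template "As a <role>, I want <goal> so that <benefit>", which the text above does not spell out, and it uses a plain regular expression rather than the spaCy NER model developed in this study. The pattern, the function name, and the sample sentences are hypothetical.

```python
import re

# Assumed template: "As a <who>, I want (to) <what> (so that <why>)."
USER_STORY = re.compile(
    r"^As an? (?P<who>.+?),? I want (?:to )?(?P<what>.+?)"
    r"(?:,? so that (?P<why>.+?))?\.?$",
    re.IGNORECASE,
)


def parse_user_story(text):
    """Return the who/what/why parts of a user story, or None if it does not match."""
    match = USER_STORY.match(text.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    story = "As a customer, I want to reset my password so that I can regain access."
    print(parse_user_story(story))
```

A rule like this only covers stories that follow the template exactly, which is one motivation for learning the who, what, and why entities with an NER model instead.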
2 Related Work

Several studies have examined the relevance of user stories in system design. Natural Language Processing (NLP) can help improve the quality of user stories.
Model-Based Design of User Story Using
2.1 Natural Language Processing for User Story
User stories are widely used to formulate requirements in agile processes. However, they are not popular in the industry because no one can ensure the standard of a user story [18]. Roughly 14.81% of agile practitioners apply user stories, and even fewer are familiar with them, which indicates how poorly most agile practitioners handle user stories [3]. Other studies explain that customer satisfaction is obtained if all customer needs are correctly framed as user stories [19]. Furthermore, user stories have been used as requirements for developing a Clinical Decision Support (CDS) system in the health sector using an agile process, where they became a communication tool during management decision-making. That research concluded that estimating story size using story points can improve planning and predict the development period more accurately [9]. NLP algorithms have also been used to classify similar user stories into an agile release plan. The technique applied is the RV coefficient, which is widely used in corpus linguistics. The algorithms make the planning process standardized and automated, using a word corpus built specifically for banking projects [20]. Other research presents an NLP pipeline that automatically generates goal models from a series of realistic user stories, so that model automation can save significant time in completing analysis tasks [8]. Based on these studies, NLP techniques are known to give good results for user story extraction.

2.2 Named Entity Recognition
Named Entity Recognition (NER) is a component of natural language processing that aids in the identification of entities present in a given text. NER can be used for information extraction, and NER methods range from basic rule-based and dictionary methods, through statistical methods, to deep learning methods. NER is known to be the basis of information extraction for tasks such as question answering systems, machine translation, and others [21]. The following steps are taken for each natural language text input: preprocessing, labelling with NER, and pattern matching [22]. NER has also been used to extract space-related information from text, such as the names of satellites, rockets, and space agencies. That research shows that building domain-specific NER techniques can improve accuracy [23].

2.3 SpaCy
SpaCy is an open-source software library designed for advanced natural language processing, and it has a considerably better and more accurate recognition approach for our data. In addition to recognizing the usual Names, Places, and Organizations, it enhances outputs with new degrees of detail. Locations, for instance, are contextually divided into three categories; airports, roads, and bridges, for example, fall under non-GPE places and facilities. These additional layers could be significant and offer fresh viewpoints on our geographic and cultural facts [16].
Aszani and S. Mulyana
3 Methodology
The research begins with a literature study of previous research on user stories and Natural Language Processing (NLP), obtained from various sources such as books, articles, journal publications, and papers. These studies show that defining user stories is the task of a system analyst. However, the work can be done automatically by applying NLP algorithms to save time.

3.1 Proposed Model-Based
The aim of this research is to automatically extract user stories from unstructured text that contains information about software development needs, in order to reduce time and costs in the agile development process. The implementation process consists of several stages, as depicted in Fig. 1.
Fig. 1. The research workflow begins with preprocessing and concludes with evaluation.
This research builds on earlier work on user stories [11] and uses the same dataset [24]. Labelling is then carried out. We define the NER labels as who (user), what (action), and why (explain) and assign them manually using the NER annotator tool. This tool was created primarily to shorten the annotation process. With its help, we can export the annotations together with the completed text into a .json file and proceed to modeling.
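To make the labelling step concrete, the following is a minimal sketch of what one annotated user story might look like once it is stored with character offsets. The example sentence, the label names WHO/WHAT/WHY, the helper `span`, and the JSON layout are all illustrative assumptions; the actual export format of the annotator tool is not specified in this paper.

```python
import json


def span(text, phrase, label):
    """Return [start, end, label] for the first occurrence of phrase in text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return [start, start + len(phrase), label]


# Hypothetical user story; not taken from the dataset used in the paper.
story = "As a site visitor, I want to read the latest news so that I stay informed."

# Hypothetical label names standing in for who (user), what (action), why (explain).
entities = [
    span(story, "site visitor", "WHO"),
    span(story, "read the latest news", "WHAT"),
    span(story, "I stay informed", "WHY"),
]

# Assumed JSON layout: one record per sentence, text plus entity offsets.
record = {"text": story, "entities": entities}
exported = json.dumps({"annotations": [record]}, indent=2)
```

Storing offsets rather than the phrases themselves keeps each label tied to an exact position in the text, which is what span-based NER training generally expects.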
3.2 Model-Based NER with Spacy
We create a blank English model because we will only use this model temporarily and do not need the other components. This model only has an Entity Ruler, which we use temporarily to generate the training set. The steps to train the NER model in SpaCy are as follows:
(a) Initiate a blank model.
(b) Define a task (pipe) named ‘ner’.
(c) Add the labels prepared in the training data to the ‘ner’ pipe.
(d) Start iterating for training. Iterations correspond to epochs: we train for n epochs, and in every epoch we train on m mini-batches.
(e) Once all mini-batches in one epoch are completed, processing moves on to the next epoch. This continues until the last epoch is reached.
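Steps (d) and (e) describe a generic epoch and mini-batch schedule. The sketch below shows only that schedule in plain Python; the function names `minibatches` and `train`, the batch size, and the `update_fn` callback are hypothetical stand-ins. In spaCy v3 the update role would be played by the pipeline's own `nlp.update` call, which is deliberately not reproduced here.

```python
import random


def minibatches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def train(examples, n_epochs, batch_size, update_fn, seed=0):
    """Run n_epochs passes; each pass shuffles the data and updates per mini-batch."""
    rng = random.Random(seed)
    examples = list(examples)
    n_updates = 0
    for _ in range(n_epochs):
        rng.shuffle(examples)                    # new order every epoch
        for batch in minibatches(examples, batch_size):
            update_fn(batch)                     # one training step per mini-batch
            n_updates += 1
    return n_updates


# Toy run: 10 examples, 3 epochs, mini-batches of 4.
seen = []
updates = train(range(10), n_epochs=3, batch_size=4,
                update_fn=lambda batch: seen.append(len(batch)))
```

With 10 examples and a batch size of 4, each epoch consists of three mini-batches, so the number of update steps grows as n times m, matching the description in step (d).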
3.3 Main Results
After the information extraction and entity generation processes, the first step is to convert the training data into binary SpaCy objects, which are smaller in size and faster to load. The db.to_disk call is then used to store the data on disk. We use each dataset both for training and for validation. This method establishes a workflow and exposes some issues caused by small datasets. Training runs from 25 to 500 epochs. The training results for one of the datasets are shown in Table 1.

Table 1. Training data
| Epoch | ENTS_F | ENTS_P | ENTS_R | SCORE |
|-------|--------|--------|--------|-------|
| 25    | 75.87  | 90.9   | 65.1   | 0.76  |
| 50    | 82.28  | 70.68  | 98.43  | 0.82  |
| 75    | 78.74  | 81.23  | 76.4   | 0.79  |
| …     | …      | …      | …      | …     |
| 450   | 82.83  | 70.94  | 99.5   | 0.83  |
| 475   | 82.85  | 70.89  | 99.67  | 0.83  |
| 500   | 82.76  | 70.88  | 99.42  | 0.83  |
Table 1 shows the epochs and the corresponding metrics for our model. We then evaluate the trained model. Evaluation measures how good the research results are and how close the predicted values are to the actual values. A commonly used method is the confusion matrix, from which precision, recall, and F1-score are derived. The four terms used in the confusion matrix to evaluate classification performance are True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). TN is the number of negative data correctly classified, while FP is the number of negative data classified as positive. TP is the number of positive data correctly classified, while FN is the number of positive data classified as negative. Precision, recall, and accuracy are performance metrics that can be calculated from these values. Accuracy measures how well the system classifies data by comparing the correctly classified data to all data. Precision is the proportion of data classified as positive that is actually positive. Precision can be calculated by formula 1:

$$\text{Precision (P)} = \frac{TP}{FP + TP} \qquad (1)$$

Recall, on the other hand, is the percentage of positive data that the system classifies correctly. The recall value can be calculated by formula 2:

$$\text{Recall (R)} = \frac{TP}{FN + TP} \qquad (2)$$

With the values of P and R, the F-Measure can be derived:

$$\text{F1 Measure} = \frac{2PR}{P + R} \qquad (3)$$
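Formulas (1) to (3) can be checked numerically with a few lines of Python. This is a minimal sketch; the counts `tp`, `fp`, and `fn` are made-up illustrative values, while the last line reuses the precision and recall reported for the ‘User’ label in Table 2.

```python
def precision(tp, fp):
    """Formula (1): TP / (FP + TP); returns 0.0 when nothing was predicted positive."""
    return tp / (fp + tp) if (fp + tp) else 0.0


def recall(tp, fn):
    """Formula (2): TP / (FN + TP); returns 0.0 when there are no positive items."""
    return tp / (fn + tp) if (fn + tp) else 0.0


def f1_measure(p, r):
    """Formula (3): harmonic mean 2PR / (P + R)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only (not from the paper).
tp, fp, fn = 90, 30, 10
p = precision(tp, fp)
r = recall(tp, fn)
f = f1_measure(p, r)

# Formula (3) applied to percentages: P and R of the 'User' label in Table 2.
user_f1 = f1_measure(98.39, 99.67)
```

Because formula (3) is scale-free in P and R, it can be applied directly to percentages, which makes it a quick consistency check for reported tables.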
Based on Table 1, the best F-Measure among the listed epochs is 82.85% (epoch 475), with a precision of 70.89% and a recall of 99.67%, but we still need a better model. The NER test outcomes for each label are shown in Table 2.

Table 2. Testing for every entity in first case study
| Label    | Precision (%) | Recall (%) | F1-score (%) |
|----------|---------------|------------|--------------|
| User     | 98.39         | 99.67      | 99.03        |
| Action   | 59.29         | 100.00     | 74.45        |
| Describe | 49.57         | 99.13      | 66.09        |
Of the three entities, ‘user’ is the easiest to identify, while ‘describe’ still has a low level of recognition. The models improve as they are given more diverse training data. Starting with 200 training samples and making modifications later is a solid general rule. We should therefore collect more diverse training data or reevaluate our annotation. Fig. 2 visualizes how the trained model detects labels in the test data. After training with one dataset, we examine the models for the other datasets. The outcomes for all 10 datasets on which we ran training and testing are shown in Table 3.
Fig. 2. Visualization result of the training data model
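Table 3. Testing results for the 10 case studies
| Case study | Sentences | Precision (%) | Recall (%) | F1-score (%) |
|------------|-----------|---------------|------------|--------------|
| Dataset 1  | 98        | 97.45         | 55.85      | 71           |
| Dataset 2  | 58        | 70.89         | 99.67      | 82.85        |
| Dataset 3  | 51        | 73.5          | 92.68      | 81.98        |
| Dataset 4  | 53        | 75.05         | 96.67      | 84.61        |
| Dataset 5  | 66        | 70.64         | 94.29      | 80.77        |
| Dataset 6  | 87        | 73.42         | 91.05      | 81.29        |
| Dataset 7  | 73        | 75.55         | 91.72      | 82.85        |
| Dataset 8  | 55        | 77.57         | 86.26      | 81.69        |
| Dataset 9  | 53        | 75.75         | 88.06      | 81.44        |
| Dataset 10 | 67        | 78.1          | 87.39      | 82.49        |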
4 Conclusion and Outlook
Based on the results in Table 3, dataset 1, which has the most sentences, has the lowest recall and F-measure but the highest precision, 97.45%, among the ten datasets labelled using SpaCy with custom labels. The highest recall, 99.67%, came from dataset 2, while the highest F1-score, 84.61%, came from dataset 4. The experimental results show that the number of sentences does not affect the recall value and F1-score. Nevertheless, the precision value is influenced: the smaller the number of sentences, the lower the precision. However, the amount of data does not determine the accuracy of the label prediction. In our tests and experiments, a relatively small training dataset performed well in the domain on which it was trained. This result indicates that quality-labelled data relevant to the domain yields better results than a large dataset that is not labelled for, or relevant to, the required domain. For further work, the research can explore alternative methods for labelling data beyond the NER Annotator, such as active learning, crowdsourcing, or distant supervision. Additionally, the research can incorporate similarity measurements to improve the quality of the labelled data before applying machine learning models. In particular, techniques such as semantic similarity, word embeddings, or clustering can be employed to identify similar user stories and reduce the labelling effort required. Finally, further machine learning models for NER, such as transformer-based models, can be studied to evaluate their performance in the context of user story NER.

Acknowledgments. The authors would like to take this opportunity to offer their sincere thanks to University of Gadjah Mada for providing financial support through the Final Project Recognition Grant Number 3550/UN1.P.III/Dit-Lit/PT.01.05/2022.
References
1. Maimuna, M., Rahman, N., Ahmed, R., Arefin, M.S.: Data mining for software engineering: a survey. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) ICO 2021. LNNS, vol. 371, pp. 905–916. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-93247-3_86
2. Alhendawi, K.M., Al-Janabi, A.A., Badwan, J.: Predicting the quality of MIS characteristics and end-users’ perceptions using artificial intelligence tools: expert systems and neural network. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) ICO 2019. AISC, vol. 1072, pp. 18–30. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4_3
3. Pokharel, P., Vaidya, P.: A study of user story in practice. In: 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI 2020) (2020). https://doi.org/10.1109/ICDABI51230.2020.9325670
4. Dalpiaz, F., Brinkkemper, S.: Agile requirements engineering with user stories. In: Proceedings of 2018 IEEE 26th International Requirements Engineering Conference (RE), pp. 506–507 (2018). https://doi.org/10.1109/RE.2018.00075
5. Algarni, A., Magel, K.: Applying software design metrics to developer story: a supervised machine learning analysis. In: Proceedings of 2019 IEEE First International Conference on Cognitive Machine Intelligence (CogMI 2019), pp. 156–159 (2019). https://doi.org/10.1109/CogMI48466.2019.00030
6. Krak, I., Barmak, O., Manziuk, E., Kulias, A.: Data classification based on the features reduction and piecewise linear separation. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) ICO 2019. AISC, vol. 1072, pp. 282–289. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4_28
7. Weber, G.-W., Defterli, O., Alparslan Gök, S.Z., Kropat, E.: Modeling, inference and optimization of regulatory networks based on time series data. Eur. J. Oper. Res. 211, 1–14 (2011). https://doi.org/10.1016/j.ejor.2010.06.038
8. Gunes, T., Aydemir, F.B.: Automated goal model extraction from user stories using NLP. In: Proceedings of 2020 IEEE 28th International Requirements Engineering Conference (RE), 2020-Aug, pp. 382–387 (2020). https://doi.org/10.1109/RE48521.2020.00052
9. Kannan, V., et al.: User stories as lightweight requirements for agile clinical decision support development. J. Am. Med. Informatics Assoc. 26, 1344–1354 (2019). https://doi.org/10.1093/jamia/ocz123
10. Lucassen, G., Dalpiaz, F., Van Der Werf, J.M.E.M., Brinkkemper, S., Zowghi, Di.: Behavior-driven requirements traceability via automated acceptance tests. In: Proceedings of 2017 IEEE 25th International Requirements Engineering Conference Workshops (REW 2017), pp. 431–434 (2017). https://doi.org/10.1109/REW.2017.84
11. Dalpiaz, F., Brinkkemper, S.: Agile requirements engineering: from user stories to software architectures. In: Proceedings of 2021 IEEE 29th International Requirements Engineering Conference (RE), pp. 504–505 (2021). https://doi.org/10.1109/RE51729.2021.00076
12. Raharjana, I.K., Siahaan, D., Fatichah, C.: User stories and natural language processing: a systematic literature review. IEEE Access 9, 53811–53826 (2021). https://doi.org/10.1109/ACCESS.2021.3070606
13. Jiang, R., Banchs, R.E., Li, H.: Evaluating and combining named entity recognition systems. In: Proceedings of Sixth Named Entity Workshop, pp. 21–27 (2016). https://doi.org/10.3115/1572392.1572430
14. Dawar, K., Samuel, A.J., Alvarado, R.: Comparing topic modeling and named entity recognition techniques for the semantic indexing of a landscape architecture textbook. In: 2019 Systems and Information Engineering Design Symposium (SIEDS 2019) (2019). https://doi.org/10.1109/SIEDS.2019.8735642
15. Orellana, M., Farez, C., Cardenas, P.: Evaluating named entities recognition (NER) tools vs algorithms adapted to the extraction of locations. In: Proceedings of 2020 International Conference of Digital Transformation and Innovation Technology (INCODTRIN 2020), pp. 123–128 (2020). https://doi.org/10.1109/Incodtrin51881.2020.00035
16. Mattingly, W.J.B.: How to Train spaCy NER Model. https://ner.pythonhumanities.com/03_02_train_spacy_ner_model.html
17. Kostakos, P.: Strings and things: a semantic search engine for news quotes using named entity recognition. In: Proceedings of 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2020), pp. 835–839 (2020). https://doi.org/10.1109/ASONAM49781.2020.9381383
18. Lucassen, G., Fabiano, D., van der Werf, J.M.E.M., Brinkkemper, S.: Forging high-quality user stories: towards a discipline for agile requirements. In: IEEE 23rd International Requirements Engineering Conference (RE), pp. 126–135 (2015). https://doi.org/10.1109/RE.2015.7320415
19. Chopade, R.M., Dhavase, N.S.: Agile software development: positive and negative user stories. In: 2017 2nd International Conference for Convergence in Technology (I2CT 2017), 2017-Jan, pp. 297–299 (2017). https://doi.org/10.1109/I2CT.2017.8226139
20. Sharma, S., Kumar, D.: Agile release planning using natural language processing algorithm. In: Proceedings of 2019 Amity International Conference on Artificial Intelligence (AICAI 2019), pp. 934–938 (2019). https://doi.org/10.1109/AICAI.2019.8701252
21. Guo, Q., Wang, S., Wan, F.: Research on named entity recognition for information extraction. In: Proceedings of 2020 2nd International Conference on Artificial Intelligence and Advanced Manufacture (AIAM 2020), pp. 121–124 (2020). https://doi.org/10.1109/AIAM50918.2020.00030
22. Mulyana, S., Hartati, S., Wardoyo, R., Subandi: Utilizing natural language processing in case-based reasoning for diagnosing and managing schizophrenia disorder. ICIC Express Lett. 15, 1083–1091 (2021). https://doi.org/10.24507/icicel.15.10.1083
23. Maurya, P., Jafari, O., Thatte, B., Ingram, C., Nagarkar, P.: Building a comprehensive NER model for Satellite Domain. SN Comput. Sci. 3(3), 1–8 (2022). https://doi.org/10.1007/s42979-022-01085-1
24. Fabiano, D.: Requirements data sets (user stories). Mendeley Data, V1. https://data.mendeley.com/datasets/7zbk8zsd8y/1. https://doi.org/10.17632/7zbk8zsd8y.1
Intrinsic and Extrinsic Evaluation of Sentiment-Specific Word Embeddings
Sadia Afroze1,2 and Mohammed Moshiul Hoque1(B)
1 Department of Computer Science & Engineering, Chittagong University of Engineering & Technology, Chittagong 4349, Bangladesh
[email protected]
2 Department of Computer Science & Engineering, Green University of Bangladesh, Dhaka 1207, Bangladesh
Abstract. Sentiment-specific word embedding model generation and evaluation are crucial for low-resource languages. This paper explores the challenges of sentiment-specific embedding model generation and evaluation for a low-resource language, namely Bengali. It examines the effectiveness of three distinct embedding techniques (Word2Vec, GloVe, and FastText) for sentiment-specific word embeddings (SSWE). This study evaluates the performance of each embedding technique using intrinsic and extrinsic evaluation methods. Results demonstrate that the GloVe-based SSWE model achieved the highest syntactic and semantic similarity accuracy, with Pearson correlations of 61.78% and 60.23%, respectively, and Spearman correlations of 60.88% and 60.34%, respectively. The extrinsic evaluation involved sentiment classification using various classifiers, and the highest accuracy of 92.88% was achieved using the GloVe+CNN model. Overall, this study provides insights into effective techniques for sentiment analysis in low-resource languages.
Keywords: Sentiment analysis · Sentiment specific word embedding · Intrinsic evaluation · Extrinsic evaluation · Sentiment classification
1 Introduction
Word embeddings are vector representations of words that capture their semantic meaning in a continuous space, and sentiment-specific word embeddings (SSWE) are a distinct type of word embedding that incorporates sentiment information into word representations [8]. The development of SSWE is a critical research issue for well-resourced languages such as English. However, creating an embedding model for resource-constrained languages like Bengali is highly challenging due to limited resources. Bengali is the most widely spoken language in Bangladesh and the second most widely spoken of the 22 official languages in India, making it the 7th most spoken language globally [7]. The scarcity of resources has presented challenges in the development of NLP tools, making it difficult for Bengali speakers to access modern NLP tools and impeding their ability to utilize language technologies effectively. To address this issue, Bengali word embedding is a crucial foundation for the development of any NLP tool for the Bengali language. This paper utilizes two widely used evaluation methods, extrinsic and intrinsic, to assess embedding techniques [24]. Extrinsic evaluation involves downstream tasks such as machine translation [2] and part-of-speech tagging [19], while intrinsic evaluation measures quality on language processing tasks such as semantic and syntactic word similarity [17], word relatedness [5], and word analogy [22]. However, the lack of a standard Bengali embedding corpus and limited resources pose significant challenges to generating and evaluating such models. To address this gap, the proposed work aims to develop a Bengali embedding model, specifically a sentiment-specific word embedding model, and evaluate it using intrinsic and extrinsic evaluation.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 145–154, 2024. https://doi.org/10.1007/978-3-031-50327-6_16
2 Related Works
Word embedding is a widely studied technique that has been applied to various NLP tasks. However, generating accurate and reliable word embeddings for low-resource languages with limited annotated text data remains a challenge. Overfitting is a common issue in such cases, resulting in poor generalization and low performance on downstream NLP tasks [14, 20]. Among the commonly used methods, Word2Vec is a window-based technique that learns word embeddings by optimizing a loss function that maximizes the probability of predicting the correct words in a training corpus [15, 16]. Despite being widely used, Word2Vec has some limitations, such as not considering morphological information. To address this limitation, GloVe was introduced as a hybrid model that uses matrix factorization and context window-based methods [18]. GloVe is known for effectively capturing global word co-occurrence statistics and can also capture morphological information, which is particularly important for domain-specific languages [3]. Embedding models are commonly evaluated through intrinsic and extrinsic methods. Intrinsic evaluation measures the quality of word embeddings based on their ability to capture specific language aspects such as word similarity or analogy. The datasets RG-65 and WordSimilarity-353 are widely used for the intrinsic evaluation of word embedding models [4, 21]. The training corpus used for word embeddings can have an impact on their performance, especially across different domains. To address this issue, researchers have proposed various domain adaptation techniques that adapt pre-trained embeddings to specific domains [1, 9, 13, 23]. However, there is a lack of research on sentiment-specific embedding corpora and embedding model generation for low-resource languages like Bengali.
To address this gap, this research presents a sentiment-specific word embedding (SSWE) model for the Bengali language and evaluates its performance using intrinsic and extrinsic evaluation methods.
3 Methodology
Our research primarily aims to examine the impact of intrinsic and extrinsic evaluations on Bengali sentiment-specific word embedding (SSWE) models. To achieve this, we have outlined five main modules: (i) Dataset development, (ii) Sentiment-specific embedding model training, (iii) Intrinsic evaluation & Best embedding model selection, (iv) Feature extraction, and (v) Extrinsic evaluation. The abstract methodology of the proposed system is shown in Fig. 1.
Fig. 1. An abstract diagram of the proposed system
3.1 Dataset Development
This module consists of three basic parts: embedding corpus, intrinsic dataset, and sentiment classification corpus.

Embedding Corpus This corpus is built for SSWE model generation. Over a span of six months (June 20, 2022, to December 20, 2022), we gathered 50,000 Bengali sentiment text files, which were subsequently sent to the data preprocessing stage. These textual sentiment files were obtained from a combination of two open-source datasets and news portals. We used a Python script to collect the news portal’s corpus data, while a Python crawler was utilized to retrieve data from the online news portals. After the data is collected, the text files are first filtered to remove non-Bengali alphabets and digits. The subsequent preprocessing step eliminates HTML tags, hashtags, URLs, punctuation, and white spaces. The final step removes duplicate texts from the archive. Overall, the preprocessing step eliminated 2,000 blank text documents from the initial dataset. The remaining 48,000 texts were used for the word embedding corpus, which contained a total of 912,000 words and 34,080 unique words.

Intrinsic Dataset Two datasets, a syntactic similarity dataset and a semantic similarity dataset, are collected as intrinsic datasets. Two undergraduate students are responsible for gathering these datasets. One student selects a word, while the other finds a word that is syntactically or semantically similar. They then individually assign a similarity score to each word pair. The two students collect a total of 100 word pairs for each dataset. The average of the two scores for each word pair represents its syntactic or semantic similarity score.

Sentiment Classification Corpus A dataset of 11,807 documents is randomly selected to conduct the extrinsic evaluation. The dataset is manually labeled, and majority voting is used to assign each document a suitable sentiment label (positive or negative). Two linguistic experts annotate each document with one of the sentiment labels. Out of the 11,807 documents, both experts agreed on 10,854 text labels. The resulting corpus achieved a Kappa score (K) of 69.467%, indicating a reasonable agreement between the annotators for downstream tasks.

3.2 Sentiment-Specific Word Embedding Model Training
The sentiment-specific word embedding (SSWE) model training module utilizes the sentiment embedding corpus. We consider three popular embedding methods, namely Word2Vec, GloVe, and FastText, to generate ten sentiment-based embedding models: four Word2Vec-based SSWE models, two GloVe-based SSWE models, and four FastText-based SSWE models.
The module takes into account various hyperparameters for each embedding technique, including embedding dimension (size), minimum word frequency count (min count), contextual window size (window), and the number of iterations (epoch).
– Word2Vec: Word2Vec is a neural embedding technique that randomly initializes word vectors and optimizes them through a single neural hidden layer [6]. In this study, we use two Word2Vec variants, Skip-gram and CBOW. The following hyperparameters are considered for developing the Word2Vec-based SSWE model: an embedding dimension of 200, a contextual window of 15, a minimum word frequency count of 1, and 30 epochs. The remaining hyperparameters are the same as those used in a previous study on Bengali embedding model generation [12]. We employ the gensim-based library for generating the embedding models.
– GloVe: GloVe is an embedding technique based on word frequency, where each word is assigned a vector representation based on local and global word semantics [18]. To account for the morphological variation in Bengali, this study uses the following hyperparameters: an embedding dimension of 200, a minimum word count of 1, X_MAX of 100, 30 epochs, and a window size of 15. The remaining parameters are adopted from a previous study on Bengali GloVe model generation and evaluation [10]. Two GloVe-based SSWE models are produced using the sentiment-specific corpus.
– FastText: FastText is an embedding technique that takes the sentiment-specific corpus as input and generates four embedding models as output. It is similar to Word2Vec but differs in that it can generate sub-word or character-level embeddings to capture morphological variations of words. To achieve optimal performance, various hyperparameters are considered, including embedding dimension (200), contextual window size (15), minimum word frequency threshold (1), and the number of epochs (30). These values were chosen based on previous Bengali text embedding research [11], while other hyperparameters were set accordingly.

3.3 Intrinsic Evaluation & Best Embedding Model Selection
This research evaluates ten SSWE models, including four based on Word2Vec, two based on GloVe, and four based on FastText. The intrinsic evaluation involves calculating the word similarity scores (both semantic and syntactic) using Cosine similarity (CoS). The performance of the models is then assessed using both Spearman (SP) and Pearson (PR) correlation coefficients [12].

3.4 Feature Extraction
The feature extraction module takes two inputs: the best embedding models after intrinsic evaluation and the sentiment classification corpus. The embedding model holds a vector representation of each word, which we call the embedding features ($E_m$) of that word. We collect 11,807 text documents as the sentiment classification corpus. The feature extraction module generates the sentiment classification features using Eq. 1:

$$S_{m_i} = F_x(E_b, SC_i), \quad i \in \{1, \ldots, |SC|\} \qquad (1)$$

where $S_{m_i}$ denotes the $i$th sentiment feature matrix, i.e., $S_{m_i} \in \mathbb{R}^{M_{xl} \times F_d}$. $F_x$ denotes the feature extraction function, $E_b$ denotes the best-performing embedding models, $SC$ is the sentiment classification corpus, $M_{xl}$ denotes the maximum length of each text, and $F_d$ denotes the feature dimension. Our research sets $M_{xl}$ to 120 and $F_d$ to 200. The output of this module is finally passed to the extrinsic evaluation module.

3.5 Extrinsic Evaluation
The output of the feature extraction module is divided into two parts: sentiment model training (70%) and sentiment model testing (30%). The sentiment models are built on a Convolutional Neural Network (CNN), Long Short Term Memory (LSTM), Bidirectional Long Short Term Memory (BiLSTM), CNN + BiLSTM, and a Gated Recurrent Unit (GRU).
The following subsections provide detailed descriptions of each technique.

CNN A multi-kernel single-layer CNN has been utilized for this research. The model employs three kernels with sizes of (3 × 200), (4 × 200), and (5 × 200). The convolution layer is followed by a 1D max pool layer and activation layers (ReLU). Finally, a CNN model is constructed for sentiment classification purposes.

LSTM This research uses a two-layer LSTM with the following sequential layer parameters: an embedding dimension of 200, a maximum sequence length of 120, hidden dimensions of (200 × 128) and (128 × 2), and a batch size of 16. Finally, an LSTM model is constructed for the purpose of sentiment classification.

BiLSTM This study employs a sentiment classification model that utilizes a two-layer BiLSTM. The BiLSTM model consists of sequential layers with the following parameters: an embedding dimension of 200, a maximum sequence length of 120, and hidden dimensions of (200 × 128) and (128 × 2). Additionally, a batch size of 16 is used to train the model.

GRU The GRU model is configured with an embedding dimension of 200, a maximum sequence length of 120, and hidden dimensions of (200 × 128) and (128 × 2). The model is trained using a batch size of 16.

Once a sentiment classification model has been built, it is evaluated on unseen text to identify whether the sentiment is positive or negative.
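The feature extraction of Eq. 1 in Sect. 3.4 and the 70/30 split of Sect. 3.5 can be illustrated with a small, dependency-free sketch. It assumes that the extraction function looks up each token's vector, truncates or zero-pads every text to 120 rows, and maps out-of-vocabulary tokens to a zero vector; the paper does not state these details, and the names `extract_features` and `toy_embeddings` are hypothetical.

```python
MAX_LEN = 120      # maximum text length (M_xl in Eq. 1)
FEATURE_DIM = 200  # feature dimension (F_d in Eq. 1)


def extract_features(tokens, embeddings, max_len=MAX_LEN, dim=FEATURE_DIM):
    """Build a max_len x dim matrix for one text.

    Assumptions (not from the paper): unknown tokens map to a zero vector,
    long texts are truncated, short texts are zero-padded at the end.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(token, zero)) for token in tokens[:max_len]]
    rows.extend([0.0] * dim for _ in range(max_len - len(rows)))
    return rows


# Toy embedding table standing in for the best SSWE model.
toy_embeddings = {"good": [0.1] * FEATURE_DIM, "film": [0.2] * FEATURE_DIM}
matrix = extract_features(["good", "film", "unknown"], toy_embeddings)

# 70/30 split of the 11,807-document sentiment classification corpus.
corpus_size = 11807
n_train = corpus_size * 70 // 100
n_test = corpus_size - n_train
```

Under these assumptions every document yields a matrix of the same 120 × 200 shape, which is what fixed-size CNN and recurrent inputs require, and the integer split reproduces the 8,264 training and 3,543 test documents reported in the results.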
4 Experiments
Python 3.5, seaborn, and sklearn were used to implement the proposed system, with training being carried out on a machine equipped with a Core i7 processor and 32 GB of RAM. The developed embedding models are evaluated in three different ways: cosine similarity of word pairs (CoS), Spearman correlation (SP), and Pearson correlation (PC). The cosine similarity (CoS) is computed using Eq. 2.

$$\mathrm{CoS}(\vec{D}_X, \vec{D}_Y) = \frac{\vec{D}_X \cdot \vec{D}_Y}{\lVert \vec{D}_X \rVert \times \lVert \vec{D}_Y \rVert} \tag{2}$$

The distributed feature vectors of the word pair (DX and DY) are denoted by $\vec{D}_X$ and $\vec{D}_Y$, while $\lVert \vec{D}_X \rVert$ and $\lVert \vec{D}_Y \rVert$ represent the ℓ2 norms of the corresponding distributed feature vectors. The performance of the model is calculated using Spearman (SP) and Pearson (PC) correlations.
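As a rough illustration of the three intrinsic measures, the sketch below computes the cosine similarity of Eq. 2 for a few word-pair vectors and then correlates the model scores with human similarity judgements. The vectors and human scores are made-up toy values, and the helper names (`cosine_similarity`, `pearson`, `ranks`, `spearman`) are hypothetical; this is not the authors' implementation.

```python
import math

def cosine_similarity(x, y):
    """Cosine similarity of two equal-length vectors, as in Eq. 2."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(u, v):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(u), ranks(v))

# Toy word-pair vectors and human judgements (illustrative assumptions only).
pairs = [([1.0, 0.0], [1.0, 0.1]),
         ([1.0, 0.0], [0.7, 0.7]),
         ([1.0, 0.0], [0.0, 1.0])]
human_scores = [9.0, 5.0, 1.0]
model_scores = [cosine_similarity(x, y) for x, y in pairs]

print("SP:", spearman(human_scores, model_scores))
print("PC:", pearson(human_scores, model_scores))
```

In this toy setting the model ranks the pairs in the same order as the human scores, so the Spearman value is 1 while the Pearson value depends on the actual magnitudes.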
5 Results
The study aims to evaluate ten SSWE models intrinsically. These models include four Word2Vec-based SSWE models, two GloVe-based SSWE models, and four FastText-based SSWE models. The results, presented in Table 1, display the cosine similarity (CoS) scores (CoSsy and CoSse) for the top three embedding models. Among them, the GloVe-based SSWE model performed the best, achieving the highest CoS score for both syntactic (67.90%) and semantic (66.68%) word pairs.
Table 1. Syntactic cosine similarity (CoSsy) and semantic cosine similarity (CoSse) for best-performed SSWE models

| SSWE model | CoSsy | CoSse |
|---|---|---|
| Word2Vec | 65.67 | 65.45 |
| GloVe | 67.90 | 66.68 |
| FastText | 66.87 | 66.49 |
| Avg | 66.81 | 66.20 |
The intrinsic evaluation results for both syntactic and semantic word pairs are presented in Table 2. The GloVe-based SSWE model achieved the highest score for syntactic word pairs with SP: 60.88% and PR: 61.78%, while for semantic word pairs, the model obtained SP: 60.34% and PR: 60.23%, respectively.

Table 2. A summary of intrinsic evaluation using Spearman's (SP) and Pearson's (PR) correlation

| SSWE model | Size | Syntactic SP (%) | Syntactic PR (%) | Semantic SP (%) | Semantic PR (%) |
|---|---|---|---|---|---|
| Word2Vec | 200 | 45.45 | 46.78 | 33.72 | 32.66 |
| GloVe | 200 | 60.88 | 61.78 | 60.34 | 60.23 |
| FastText | 200 | 51.77 | 51.64 | 52.44 | 51.24 |
The sentiment classification corpus has been divided into two sets for training (8264) and testing (3543). The sentiment classifier model's performance is evaluated using extrinsic evaluators such as accuracy (AC), precision (P), and recall (R). Table 3 presents a summary of the various models for sentiment classification. According to the results, the GloVe + CNN model achieved the highest AC at 92.88%, together with the highest R for both classes and the highest P on the positive class. On the other hand, the Word2Vec + GRU model had the lowest P and R, resulting in the lowest AC at 89.11%.
Table 3. Statistical summary of sentiment classifier

| Model | Sentiment | P | R | AC (%) |
|---|---|---|---|---|
| Glove + CNN | Positive | 85.88 | 88.78 | 92.88 |
| | Negative | 85.67 | 88.56 | |
| FastText + CNN | Positive | 84.28 | 87.48 | 91.28 |
| | Negative | 84.87 | 87.34 | |
| Word2Vec + CNN | Positive | 84.58 | 85.48 | 90.66 |
| | Negative | 83.56 | 84.59 | |
| Glove + LSTM | Positive | 84.01 | 85.23 | 91.01 |
| | Negative | 86.12 | 83.25 | |
| FastText + LSTM | Positive | 84.21 | 84.56 | 90.43 |
| | Negative | 83.44 | 83.20 | |
| Word2Vec + LSTM | Positive | 83.01 | 82.23 | 89.45 |
| | Negative | 83.35 | 82.95 | |
| Glove + BiLSTM | Positive | 83.66 | 85.71 | 91.23 |
| | Negative | 83.42 | 83.67 | |
| FastText + BiLSTM | Positive | 82.44 | 84.24 | 90.67 |
| | Negative | 85.24 | 82.57 | |
| Word2Vec + BiLSTM | Positive | 82.44 | 84.27 | 90.09 |
| | Negative | 85.34 | 82.39 | |
| Glove + GRU | Positive | 84.34 | 83.22 | 89.45 |
| | Negative | 85.63 | 82.11 | |
| FastText + GRU | Positive | 83.54 | 82.21 | 89.23 |
| | Negative | 84.23 | 81.90 | |
| Word2Vec + GRU | Positive | 81.41 | 81.43 | 89.11 |
| | Negative | 81.23 | 81.21 | |
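For readers unfamiliar with the extrinsic evaluators, the following minimal sketch shows how precision, recall and accuracy are conventionally derived from confusion-matrix counts for one class. The counts are hypothetical and are not taken from the experiments reported in Table 3.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of true positives that the classifier recovered."""
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for the "positive" class of a binary sentiment classifier.
tp, fp, fn, tn = 80, 20, 10, 90

print("P  =", precision(tp, fp))
print("R  =", recall(tp, fn))
print("AC =", accuracy(tp, tn, fp, fn))
```

Precision and recall for the negative class follow from the same formulas with the roles of the two classes swapped.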
6 Conclusion
This research developed sentiment-specific word embedding models on a sentiment-specific corpus and evaluated them using intrinsic and extrinsic approaches. Ten sentiment-specific word embedding models for the Bengali language were generated in this study using a combination of three embedding techniques (GloVe, Word2Vec, and FastText) and various hyperparameters. The GloVe model showed better performance than the Word2Vec and FastText models. The best sentiment-specific model was then used for sentiment classification with four classifiers (CNN, LSTM, BiLSTM and GRU).
In future studies, additional analogy tasks could be incorporated to evaluate the effectiveness of various embedding models using a range of intrinsic and extrinsic evaluation methods. We also plan to evaluate the performance of transformer-based language models, i.e., mBERT, XLM-RoBERTa and Bangla-BERT.
Movie Recommender System: Addressing Scalability and Cold Start Problems
Pradeesh Prem Kumar1(B), Nitish U.1, Nimal Madhu M.2, and Hareesh V.1
1 Center for Computational Engineering and Networking (CEN), Amrita School of Artificial Intelligence, Amrita Vishwa Vidyapeetham, Coimbatore 641112, India
[email protected]
2 Department of Electrical Engineering, National Institute of Technology Calicut, Kozhikode, Kerala, India
Abstract. A recommender system is a tool used to forecast the inclination of a user for a specific item. Matrix factorization is one of the most extensively employed and well-known techniques for constructing a recommender system. This approach has proven efficacious in numerous practical applications and has been employed in a variety of domains, including films, literature, songs, and electronic commerce. In this paper, we conducted a bias-minimized analysis of latent feature factorization and achieved successful results on small datasets. Our findings suggest that when studying a smaller segment of users, it is not necessary to incorporate all biases, resulting in lower computational costs. We also discuss the cold start issue, which arises whenever a new user enters the system, and possible solutions to it.
Keywords: Matrix factorization · Cold start · Cosine similarity · Scalability
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 155–164, 2024. https://doi.org/10.1007/978-3-031-50327-6_17

1 Introduction
A recommender system [1–9, 11, 13–15], also referred to as a recommendation system, is a type of information filtering system that aims to predict the "rating" or "preference" a user would assign to a particular item. These systems are widely used in various fields, most commonly as playlist generators for video and music services, product recommenders for online stores, and content recommenders for social media platforms. There are two primary approaches to building a recommender system: collaborative filtering (CF) and content-based filtering. CF forms a model from a user's prior behaviour and other users' decisions. Content-based filtering suggests items based on a comparison between the content of the items and a user's past preferences. A good recommender system is essential for making sure that the right item is recommended to the user in a timely fashion. If the wrong item is recommended, the user will not stay engaged with the system, and if the recommendation takes too long, the user will be dissatisfied and stop using the platform. To keep users on a platform, it is important that the right amount of good content is available and that the system can recommend correct items based on their preferences without delays.

However, large-scale recommendation systems commonly use complex algorithms that take into account various biases and factors such as user preferences, item popularity, and past interactions. These algorithms can be computationally intensive and require significant computational resources [9]. To reduce computation, prior work has implemented the CF algorithm with MapReduce on a Hadoop cluster [9]. The ratings given by a user are influenced by factors such as age, gender, time, place, and sentiments [1, 10], among various others [12]. Our intuition is that users from similar regions or cultural backgrounds will have similar biases toward various items. Hence, if users are clustered based on geographical location, language, etc., it is possible to simplify the calculation by eliminating all biases, thus significantly reducing the computational burden on the system while still achieving good recommendations. The whole process would therefore be simple and easy to implement. Fig. 1 shows film genre search history in the Indian subcontinent between Feb '22 and Feb '23, as per Google Trends. From this example it is evident that certain genres are preferred in specific zones, which could be due to cultural differences, movies of a certain genre being released in regional languages, specific news driving hype, etc. Hence, if we could cluster users based on the intuition mentioned above, we could achieve our goal.
Fig. 1. Region wise genre search trends (Courtesy: Google trends)
With this idea, in this paper we evaluate two different approaches based on simple equations, compute the predicted ratings with each, and compare their accuracy using the mean square error.
2 Literature Review
Neeharika et al. [1] in 2017 proposed a hybrid recommendation strategy that used a hierarchical approach to evaluate both content-based and collaborative filtering approaches and hence provided users with more personalized movie suggestions. The authors relied on the observation that a better correlation between the user and the item can be obtained from the social media posts, likes, and preferences of the individual user and his or her friends, including posts suggesting certain items or even pictures related to those items. In 2014, Urszula Kużelewska [2] used clustering to create recommender systems, with various similarity measures based on Euclidean distance, cosine, correlation coefficient, and the log-likelihood function. Y. Koren, R. Bell, and C. Volinsky [5] in 2009 gave a very descriptive analysis of the collaborative recommendation system, which forms the basis of this paper and is the source of the matrix factorization equations. The minimization of the function was carried out using stochastic gradient descent and alternating least squares. At the time of publishing that paper, they claimed Netflix's model had an RMSE (Root Mean Square Error) of 0.9514. Bogdan Walek and Vladimir Fojtik [7] in 2020 proposed a hybrid recommender system called Predictory for movie recommendation. The system combines collaborative filtering, content-based filtering, and a fuzzy expert system. The efficacy of the system was evaluated on the MovieLens dataset and compared against other traditional recommenders. A. M. Elmisery and D. Botvich [8] in 2011 described data mashups being used by IPTV services to improve their recommendation services relative to their competitors. The idea is to aggregate data from various sources (such as Netflix or IMDb) to obtain more accurate recommendations. J. Jiang et al. [9] in 2011 implemented the CF algorithm using MapReduce on a Hadoop cluster and showed that the performance of the CF algorithm improved.

All the above papers address the problem of building a good recommender system. CF-based algorithms proved to be among the best at generating accurate recommendations, but they suffer from scalability issues as the dataset grows larger [9]. Hadoop and clustering have been applied, but scalability issues remained as the dataset grew larger. Our work assumes that clustering users based on geographical location, language, etc., which are major influences on users' preferences for particular types of movies, would allow us to give better ratings with computationally less expensive methods. To address the cold start issue we use cosine similarity, which is flexible and easy to implement as the dimension of the data increases, whereas other methodologies such as Euclidean distance and correlation [2] do not perform well on higher-dimensional data.
3 Working Principle
3.1 Matrix Factorization
Matrix factorization is a widely used approach in recommender systems that is based on collaborative filtering. The large user-item interaction matrix is decomposed into two lower-dimensional matrices, one representing the user preferences and the other representing the item features. These matrices are then used to make recommendations by computing the dot product of the user and item matrices. In matrix factorization, the user-item interaction matrix is factorized into two matrices, θ (representing the users) and X (representing the items). Each row of θ represents a user, and each row of X represents an item. The entries in these matrices are learned by minimizing the difference between the predicted ratings and the actual ratings in the user-item interaction matrix. The dot product of the ith row of θ and the jth row of X gives the predicted rating for user i and item j. The dot product can be interpreted as the similarity between the user and item preferences. The predicted ratings can then be used to rank the items for a user, and the top-ranked items can be recommended to the user (Fig. 2).
Fig. 2. Matrix factorisation
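The prediction and ranking step described above can be sketched in a few lines. The latent vectors, item names and the helpers `predict_rating` and `top_n` below are hypothetical and only illustrate how a row of θ and the rows of X combine into a ranking; they are not taken from the paper's implementation.

```python
def predict_rating(user_vec, item_vec):
    """Predicted rating: dot product of a user row of theta and an item row of X."""
    return sum(u * x for u, x in zip(user_vec, item_vec))

def top_n(user_vec, items, n=2):
    """Rank items by predicted rating for one user and return the n best names."""
    scored = [(predict_rating(user_vec, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:n]]

# Hypothetical latent vectors with two latent features.
theta_user = [1.0, 0.5]
X_items = {
    "movie_a": [4.0, 1.0],
    "movie_b": [1.0, 2.0],
    "movie_c": [2.0, 3.0],
}

print(top_n(theta_user, X_items, n=2))
```

With these toy values the predicted ratings are 4.5, 2.0 and 3.5, so the first and third items are recommended.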
3.2 Content-Based Filtering
A content-based recommender system [1–10] is a recommendation system that suggests items to users based on their past preferences and behaviour, as well as the characteristics of the items. It uses the content, or attributes, of the items to make recommendations. The idea is that if a user likes a particular type of item, they will also like similar items.
4 Dataset
The MovieLens dataset downloaded from Kaggle was used. Its small ratings dataset serves as a representation of a small group of users with similar features such as geographical location or language preferences. It contains 610 users and 9742 items (Fig. 3).
Fig. 3. Ratings Csv
5 Methodology
This paper suggests that when we have a large user-item matrix dataset, more complex equations are needed to obtain the most accurate ratings. However, if users are grouped by their characteristics into small datasets, simpler optimization equations can be used. This would reduce the system's load and make it easier to update ratings as they come in from users within the same group.

5.1 Approach 1 (Considering Linear Dependency)
Here we assume that the user matrix and the item matrix are linearly dependent, and we minimize the error between the actual ratings and the ratings obtained from the user and item matrices, i.e., we minimize

$$\min f(x, \theta) = R - X\theta \tag{1}$$
To elaborate, consider user U1 and item M1, and suppose that four latent features determine the rating:

$$U_1 = [a_1, a_2, a_3, a_4], \qquad M_1 = [b_1, b_2, b_3, b_4]$$
If we take the dot product of these two arrays it would be

$$a_1 b_1 + a_2 b_2 + a_3 b_3 + a_4 b_4 = d_{11} \tag{2}$$

where $d_{11}$ is the resultant predicted rating. This is a more or less linear equation and we could formulate the equation as below:

$$\gamma = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

$$\gamma = \sum_{i=1}^{n} a_i b_i \tag{3}$$

$$a_1 = \frac{\gamma - (a_2 b_2 + a_3 b_3 + \cdots + a_n b_n)}{b_1} \tag{4}$$

So, starting with random initial values for $a_1, b_1, a_2, b_2, a_3, b_3, a_4$ and $b_4$, written as $a_1', a_2', a_3', a_4', b_1', b_2', b_3'$ and $b_4'$ respectively, the next value of $a_1$, written $a_1''$, can be obtained by

$$a_1'' = \frac{\gamma - (a_1' b_1' + a_2' b_2' + \cdots + a_n' b_n') + a_1' b_1'}{b_1'}$$

$$a_1'' = \frac{\gamma - \gamma' + a_1' b_1'}{b_1'}$$

$$a_1'' = \frac{e}{b_1'} + a_1'$$

where $e = \gamma - \gamma'$ is the error between the actual rating and the predicted rating, the same as in (1).

$$a_1'' = a_1' + \frac{1}{(b_1')^2} \cdot e \cdot b_1'$$

$$a_1'' \approx a_1' + \alpha \cdot e \cdot b_1' \tag{5}$$

where α is some small step size for the equation. Similarly,

$$b_1'' \approx b_1' + \alpha \cdot e \cdot a_1' \tag{6}$$
We keep updating each weight factor in the user and item matrices as in (5) and (6), and continue minimizing the error until a set number of iterations is reached or the error falls within the required tolerance. This is more or less the Gauss-Jacobi iterative method for solving linear equations. The approach has a small flaw: the random values taken at the beginning can at times become too small or too large, which makes the values increasingly erratic as the iterations proceed and can cause the code to raise an error or warning. This can be addressed by adding error-handling code that repeats the process when an error or warning occurs. With α = 0.02 and an original rating of 5, the convergence of the weights of the equation toward that value is shown in Fig. 4.

Fig. 4. Convergence in approach 1 (plotted series: predicted rating and error)

5.2 Approach 2 (Considering Non Linear Dependency)

$$\mathrm{MSE} = (R - X\theta)^2 \tag{7}$$

where
MSE: mean square error
R: original rating matrix
X: movie matrix
θ: user matrix
This is the most common approach, in which the mean square error is used to predict ratings. To make it more accurate, many biases are usually added, but we do not include any because we consider only a small user database. The model is fitted to the previously observed ratings. To prevent overfitting of the observed data, regularization is applied to the learned parameters. The level of regularization is determined by the constant λ.

$$\min J(x, \theta) = \frac{1}{2}(X\theta - R)^2 + \frac{\lambda}{2}\theta^2 + \frac{\lambda}{2}X^2 \tag{8}$$

$$\nabla_x = \frac{\partial J}{\partial x} = (X\theta - R)\theta + \lambda X \tag{9}$$

$$\nabla_\theta = \frac{\partial J}{\partial \theta} = (X\theta - R)X + \lambda\theta \tag{10}$$

$$X_{k+1} = X_k - \alpha\nabla_x \tag{11}$$

$$\theta_{k+1} = \theta_k - \alpha\nabla_\theta \tag{12}$$
To determine the X and θ matrices, the initial ratings matrix R is decomposed using SVD to obtain the matrices U, S and Vt. A scree plot is checked to determine the most appropriate number of singular values (components) to retain for the closest approximation of the original ratings matrix. Then the singular value percentage is taken, and the matrices S[:components] and Vt[:components,:] are multiplied to obtain the matrix X, whereas U[:,:components] is taken as the matrix θ. The optimal λ is determined by checking its performance against the MSE; the best λ value for the objective function (8) was observed to be 0.9. Once the λ value is fixed, we minimize the objective function (8) using gradient descent as in (11) and (12). Finally, the dot product of X and θ is taken and the values are scaled to the range 1 to 5; these are considered the predicted ratings. Before starting this process in approaches 1 and 2, some of the ratings are hidden during training, and those are then used to test the accuracy of the model.
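A minimal sketch of the regularized gradient-descent updates (11) and (12) is given below. It departs from the procedure in the text in two labelled ways: the factors are initialised randomly rather than from an SVD, and the gradient is accumulated only over observed ratings (hidden ratings are marked `None`). The function names, the toy rating matrix, the step size and the iteration count are assumptions for illustration; only λ = 0.9 comes from the text.

```python
import random

def predict(X, theta, j, i):
    """Predicted rating of item j by user i: dot product of the factor rows."""
    return sum(xf * tf for xf, tf in zip(X[j], theta[i]))

def observed_mse(R, X, theta):
    """Mean square error over the observed (non-None) ratings only."""
    errors = [(predict(X, theta, j, i) - r) ** 2
              for j, row in enumerate(R) for i, r in enumerate(row)
              if r is not None]
    return sum(errors) / len(errors)

def factorize(R, k=2, lam=0.9, alpha=0.01, iters=2000, seed=0):
    """R[j][i] is the rating of item j by user i, or None if hidden."""
    rng = random.Random(seed)
    n_items, n_users = len(R), len(R[0])
    # Random initialisation (assumption; the text initialises from an SVD).
    X = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    theta = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    for _ in range(iters):
        # Gradients start from the regularization terms lam*X and lam*theta.
        grad_X = [[lam * x for x in row] for row in X]
        grad_theta = [[lam * t for t in row] for row in theta]
        for j in range(n_items):
            for i in range(n_users):
                if R[j][i] is None:
                    continue
                err = predict(X, theta, j, i) - R[j][i]
                for f in range(k):
                    grad_X[j][f] += err * theta[i][f]
                    grad_theta[i][f] += err * X[j][f]
        # Simultaneous updates, as in (11) and (12).
        X = [[x - alpha * g for x, g in zip(row, grow)]
             for row, grow in zip(X, grad_X)]
        theta = [[t - alpha * g for t, g in zip(row, grow)]
                 for row, grow in zip(theta, grad_theta)]
    return X, theta

# Toy 2-item x 3-user rating matrix with one hidden rating.
R = [[1.0, 2.0, 2.5],
     [2.0, 4.0, None]]
X, theta = factorize(R)
print("MSE on observed ratings:", observed_mse(R, X, theta))
print("prediction for hidden rating:", predict(X, theta, 1, 2))
```

Masking the hidden entries mirrors the hold-out evaluation mentioned in the text: the hidden rating never contributes to the gradient and can be compared with the prediction afterwards.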
5.3 Cosine Similarity
The major drawback of this collaborative recommender system is that it depends on the user's ratings and assumes that the user would prefer items similar to those already rated. Whenever there is a new user, there are no ratings to start the comparison with, which makes the system incapable of making any suggestion. This problem is referred to as the cold start problem. One way to tackle it is to consider the cosine similarity between different features. Cosine similarity is a measure of similarity between two vectors, given by the cosine of the angle between them. It is obtained by dividing the dot product of the two vectors by the product of their magnitudes.

$$\cos\varphi = \frac{\vec{A} \cdot \vec{B}}{\lVert \vec{A} \rVert \cdot \lVert \vec{B} \rVert} \tag{13}$$
The result lies in the range −1 to 1, and a value closer to 1 indicates more similar vectors. To test it on our dataset, we took the genres as the reference vector: we created a list of all genres available within the dataset and then converted each movie's genres into a one-hot encoding based on that list. The idea is that whenever a user opens an item in the list, there is a high chance that the person is interested in it (Fig. 5).
Fig. 5. Cosine similarity geometrical inference
Keeping that in mind, we take the genre vector created for that movie and compute its cosine similarity against every movie in the list. Once all cosine values are obtained, we sort the items in descending order to obtain the most similar items for recommendation.
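The genre-based cold start procedure described above can be sketched as follows. The movie titles and genre strings are hypothetical, the helper names are illustrative, and the pipe-separated genre format is assumed to follow the MovieLens movies file.

```python
import math

def one_hot(genres, vocabulary):
    """Binary genre vector: 1 where the movie has the genre, else 0."""
    return [1 if g in genres else 0 for g in vocabulary]

def cosine(a, b):
    """Cosine similarity as in Eq. (13); returns 0.0 for an all-zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(opened_title, movies, n=2):
    """Return the n movies whose genre vectors are most similar to the opened one."""
    genre_sets = {title: set(g.split("|")) for title, g in movies.items()}
    vocabulary = sorted({g for gs in genre_sets.values() for g in gs})
    vectors = {title: one_hot(gs, vocabulary) for title, gs in genre_sets.items()}
    query = vectors[opened_title]
    scored = [(cosine(query, vec), title)
              for title, vec in vectors.items() if title != opened_title]
    # Descending similarity; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:n]]

# Hypothetical catalogue with pipe-separated genre strings.
movies = {
    "Movie A": "Action|Adventure",
    "Movie B": "Action|Adventure|Sci-Fi",
    "Movie C": "Comedy|Romance",
    "Movie D": "Action|Comedy",
}
print(recommend("Movie A", movies))
```

Since no ratings are involved, the same call works for a user with no history, which is the cold start situation the section addresses.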
6 Results and Discussion
With approach 1, the model had an RMSE of 1.38, whereas with approach 2 it was 0.82 (Table 1). The dataset used in this paper was not actually clustered on any feature, so it contains mixed users, but we were still able to produce good results using simple equations, which makes the method computationally less expensive. This can be especially useful in situations where real-time recommendations are required or when resources are limited. So, scalability should not become an issue if the user database is grouped into smaller batches based on certain features, and the recommendations would not be much influenced by global or other sorts of biases.

Table 1. Results

| | RMSE | MAE |
|---|---|---|
| Approach 1 | 1.38 | 1.06 |
| Approach 2 | 0.82 | 0.64 |
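For reference, the two error measures reported in Table 1 can be computed from held-out ratings as in the sketch below. The rating values are made-up examples, not data from the experiments.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length rating lists."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error between two equal-length rating lists."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)

# Hypothetical hidden ratings and the corresponding model predictions.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
```

RMSE penalises large individual errors more heavily than MAE, which is why it is never smaller than MAE on the same predictions.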
Cosine similarity was used to provide recommendations in the cold start situation. One major advantage is that it remains effective as the dimension increases, since it is a measure of the angle between the vectors, whereas other methodologies such as Euclidean distance and Manhattan distance suffer from the curse of dimensionality, and their computation time increases. This flexibility makes cosine similarity a versatile and effective tool for recommendation systems to solve cold start problems.
7 Conclusion and Future Works
In our work, we were able to show that with clustered user data we could still achieve a good recommendation system while using computationally less expensive methods. In future work, we could explore various ways of sub-clustering users based on language preferences, age group, gender, education, etc., and see whether better results can be obtained.
References
1. Immaneni, N., Padmanaban, I., Ramasubramanian, B., Sridhar, R.: A meta-level hybridization approach to personalized movie recommendation. In: 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, pp. 2193–2200 (2017). https://doi.org/10.1109/ICACCI.2017.8126171
2. Urszula, K.: Clustering algorithms in hybrid recommender system on MovieLens data. Stud. Logic Grammar Rhetoric 37(1), 125–139. https://doi.org/10.2478/slgr-2014-0021
3. Subramaniyaswamy, V., Logesh, R., Chandrashekhar, M., Challa, A., Vijayakumar, V.: A personalised movie recommendation system based on collaborative filtering. Int. J. High Perform. Comput. Networking 10, 54 (2017). https://doi.org/10.1504/IJHPCN.2017.083199
4. Lekakos, G., Caravelas, P.: A hybrid approach for movie recommendation. Multimed. Tools Appl. 36, 55–70 (2008). https://doi.org/10.1007/s11042-006-0082-7
5. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009). https://doi.org/10.1109/MC.2009.263
6. Ponnam, L.T., Punyasamudram, S., Nallagulla, S., Yellamati, S.: Movie Recommender System Using Item Based Collaborative Filtering Technique (2016)
7. Walek, B., Fojtik, V.: A hybrid recommender system for recommending relevant movies using an expert system. Expert Syst. Appl. 158, 113452 (2020). ISSN 0957-4174. https://doi.org/10.1016/j.eswa.2020.113452
8. Elmisery, A.M., Botvich, D.: Agent based middleware for private data mashup in IPTV recommender services. In: 2011 IEEE 16th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), Kyoto, Japan, pp. 107–111. https://doi.org/10.1109/CAMAD.2011.5941096
9. Jiang, J., Lu, J., Zhang, G., Long, G.: Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop. In: 2011 IEEE World Congress on Services, Washington, DC, USA, pp. 490–497 (2011). https://doi.org/10.1109/SERVICES.2011.66
10. Zhao, G., Qian, Lei, X., Mei, T.: Service quality evaluation by exploring social users' contextual information. IEEE Trans. Knowl. Data Eng. 28(12), 3382–3394 (2016). https://doi.org/10.1109/TKDE.2016.2607172
11. Rao, N.K., Challa, N.P., Chakravarthi, S.S., Ranjana, R.: Movie recommendation system using machine learning. In: 2022 4th International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, pp. 711–716 (2022). https://doi.org/10.1109/ICIRCA54612.2022.9985512
12. Stephen, G., Inbaraj, D., Anbuudayasankar, S.P., Poongkundran, T.: Investigating the influence of audiences' movie-viewing motives on attitude towards brand placement in movies. J. Glob. Scholars Market. Sci. 31(4), 487–510 (2021). https://doi.org/10.1080/21639159.2020.1808813
13. Anbazhagan, M., Arock, M.: Collaborative filtering algorithms for recommender systems. Int. J. Control Theory Appl. 9(27), 127–136 (2016)
14. Bindu, K.R., Visweswaran, R.L., Sachin, P.C., Solai, K.D., Gunasekaran, S.: Reducing the cold-user and cold-item problem in recommender system by reducing the sparsity of the sparse matrix and addressing the diversity-accuracy problem. In: Modi, N., Verma, P., Trivedi, B. (eds.) Proceedings of International Conference on Communication and Networks. AISC, vol. 508, pp. 561–570. Springer, Singapore (2017). https://doi.org/10.1007/978-981-10-2750-5_58
15. Islam, M.S., Forhad, M.S.A., Uddin, M.A., Arefin, M.S., Galib, S.M., Khan, M.A.: Developing an intelligent system for recommending products. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) ICO 2020. AISC, vol. 1324, pp. 476–490. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-68154-8_43
E-waste Management and Recycling Model for Dhaka with Collection Strategy Application: A More Effective and Sustainable Approach
Md. Nazmus Sakib1, Md. Mainul Hasan1, Anika Faiza1, Shahinur Rahman Nova1, Ahmed Wasif Reza1(B), and Mohammad Shamsul Arefin2,3(B)
1 Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh
[email protected]
2 Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
[email protected]
3 Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chattogram, Bangladesh
Abstract. E-waste, which contains chemicals and metals, is a term used for abandoned electronic devices. The increasing use of electronic devices has made it difficult to handle the enormous volume of e-waste, because developing countries lack a strong organizational and governmental e-waste management infrastructure. To reduce the harm to the environment and human health, scientific methods must be used for e-waste disposal. The recycling and management models of Azizu Recycling and E-waste Company Ltd. have been analyzed and compared, and a new model has been proposed with an environmental pollution control system, a sustainable reuse plan, a detailed recycling process, a suggested landfill away from residential areas, and an app system for e-waste collection.
Keywords: Electronic gadgets · e-waste disposal · Recycling · Effective execution
1 Introduction E-waste management involves collecting, recycling, and disposing of e-waste safely to minimize its negative environmental impact. Electronics contain hazardous compounds that can harm human health if they are not properly handled. These compounds can enter our bodies through air, land, and soil, causing harm over time. [1]. The groundwater quality is declining, and it is consumed by humans and animals, which poses health risks [2]. The production of electrical and electronic waste in the country is increasing due to the fact that not all imported or domestically produced electrical and electronic products are made with environmental protection in mind [3]. Separation can be done manually or semi-mechanically in the factory. The local firm Azizu Recycling and E-Waste Limited © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 165–178, 2024. https://doi.org/10.1007/978-3-031-50327-6_18
Md. Nazmus Sakib et al.
collects e-waste from institutions and waste collection vendors and recycles it to a high standard by separating large aluminum, copper, steel, rubber, and plastic pieces and supplying them as raw materials to other businesses [4]. According to our findings, the general public is unconcerned about the dangers of inappropriate e-waste disposal [5]. The recycling industry in Bangladesh is not paying sufficient attention to hazardous electronic waste imported for reprocessing. There are no scientific studies on IT waste management or systematic research on e-waste management in Bangladesh [6]. Our research objectives are as follows:
i. To identify the main shortcomings in Bangladesh’s e-waste management and recycling process,
ii. To analyze the existing model to find its drawbacks,
iii. To propose a more efficient and greener model based on comparative data analysis.
This study investigates industrial activities, e-waste practices, and disposal systems in Dhaka, with the aim of eliminating inappropriate e-waste management structures. It examines the handling of electronic waste in Dhaka and the trading of improperly treated electronic waste in underdeveloped nations. The study analyzes how various firms manage waste from outdated and worn-out machinery, computers, and other electronic equipment, and describes the harm done to the environment and public health. The study is based in Dhaka, Bangladesh, and draws on observations of the context for e-waste management in Bangladesh.
2 Literature Review
By law, the country’s registered electronic waste producers and recyclers are required to submit their WEEE management plans [7]. The Department of Education held a consultative session on Bangladesh’s e-waste management on January 9, 2022 [8]. As many as 92% of businesses agreed with the statement that “We must take a serious view on ensuring that all devices used to equip the workforce throughout the COVID-19 pandemic are appropriately stored and disposed of” [8]. The Bangladesh government released the “Hazardous Waste (e-waste) Management Rules 2021” on June 10, 2021, to regulate e-waste. Under the Bangladesh Environmental Protection Act of 1995, e-waste is divided into six categories, including cooling and freezing devices, televisions, monitors, laptops, notebooks, and tablets [7]. The papers [9–18] provide different guidelines and solutions. Rapid technological advancement leads individuals to discard numerous electronic devices daily, which poses a risk to human health because of the harmful chemicals in e-waste, including lead, mercury, and chromium, encountered during collection and unprocessed recycling.
E-waste Management and Recycling Model for Dhaka
3 Materials and Methods
3.1 Data Collection Plan
Our research goal is to propose a new model that is more effective and efficient. The proposed model aims to reduce the cost of waste management and improve the effectiveness of the recycling process. To validate our model, we need quantitative data to calculate the present and proposed costs of waste management and recycling. We obtain the current waste management model by visiting recycling industries in Dhaka and gathering online data. We planned to interview the industries and prepared the following questions:
(i) How do they collect e-waste?
(ii) How could we improve the collection method?
(iii) How much does the whole process of e-waste collection and management cost?
(iv) In what condition do they consider items for recycling?
3.2 Research Process
This study aims to fulfill the objectives stated above, using both qualitative and quantitative data. Qualitative data are used to propose a new, efficient model, and quantitative data are used to justify the proposed model. Our research process is shown in the flow chart below (Fig. 1):
Fig. 1. Research process
The figure shows the research process: collecting and analyzing data on the current E-waste management system to identify its shortcomings, then proposing a more ecofriendly model.
3.3 Data Analysis
The waste stream method was used to estimate waste generation, with consent forms provided to stakeholders. The estimate is based on the following factors: (i) total population, (ii) stock data, and (iii) average product weight. The electronic waste generation in a region is computed as

$$\text{Electronic waste generation in a region} = \frac{\text{Total amount of electronic waste}}{\text{Total population in that region}} \qquad (1)$$
Once obtained, the e-waste generation amount informs our proposed result.
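The per-capita calculation in Eq. (1) can be sketched in a few lines. This is only an illustration: the function name and the input figures are hypothetical and are not taken from the study's data.

```python
def ewaste_generation_per_capita(total_ewaste_kg: float, population: int) -> float:
    """Eq. (1): total e-waste in a region divided by that region's population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total_ewaste_kg / population


# Hypothetical figures chosen only to show the arithmetic (not survey data).
example_total_kg = 1_200_000.0
example_population = 400_000

per_capita_kg = ewaste_generation_per_capita(example_total_kg, example_population)
print(f"{per_capita_kg:.2f} kg per person")
```

The guard on the population keeps the division well defined when a region has no recorded population.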
4 Results and Discussion
Our analysis shows little public concern about improper e-waste disposal, as well as challenges faced by recyclers. We collected data from field trips and developed sustainable models to improve e-waste management, including landfilling and the promotion of repair and reusability.
4.1 Recycling Procedure
This section describes the current recycling process of a well-known recycling company located beside Dhaka city. Most of the e-waste collected from Dhaka and the rest of the country is recycled through this process. In Fig. 2, the company collects the e-waste and sends it to the plant. The company recycles the waste and distributes the output to manufacturers. After the distribution and manufacturing process, the leftover e-waste goes back to the recycling plant. The manufacturers provide the end product to sellers, who sell it to corporate and household consumers. In the waste collection step, the e-waste that is not possible or profitable to recycle is landfilled. The current model provided and described by recycling companies does not include environmental protection or reusability plans. We therefore propose a model that ensures a reusability plan together with environmental safety. In Fig. 3, the reusability of each product is checked first. If the product can be repaired, it is sent to the recovery section; if not, it moves on to the next step of the recycling procedure. The scrap-breaking and pellet-producing section can produce tiny plastic particles that pollute the air, and the melting furnace does the same. Recycling companies must therefore ensure environmental safety measures during the breaking and melting processes. For this, they have to set up an environmental pollution control system (EPCS) in their facility, which purifies the air before releasing it into the environment.
In this recycling model, green methods such as hydrometallurgical, pyrometallurgical, and electrometallurgical processes are also proposed to make the model greener and more environmentally friendly.
Fig. 2. Current recycling procedure of Azizu
4.2 E-Waste Management
From the collected data, we analyzed the current situation of e-waste management around Dhaka city and drew the process flow model shown below. In Fig. 4, small scrap dealers gather e-waste from many sources. The waste is then sent to the major scrap dealers. Sometimes the small dealers sell electronic parts to neighborhood repair shops before sending the rest on. The big scrap dealers then supply the recycling businesses with e-waste as raw material. Because the recycling firms have limited capacity, the part of the waste that they cannot recycle is exported, and the businesses recycle the remainder. After recycling, some e-waste components remain that can no longer be recycled. These parts are sent to the disposal facility. The dumping station authority disposes of e-waste in the same manner as domestic rubbish. They sometimes set fire to it, causing severe environmental damage. Reviewing this current model, it is easy to notice the absence of some important steps, such as uninterrupted e-waste collection, repairability, and safe disposal. We have added these steps to the proposed model to make it more acceptable. In Fig. 5, we propose setting up a central collection point where the recycling companies can collect the waste easily. Another shortcoming is the absence of a proper repair and reusability plan. The proposed model addresses this as well: only expert, trained employees repair the usable products, under defined safety measures. A common mistake occurs in the safe disposal of e-waste. Landfilling
Fig. 3. Proposed model to recycle e-waste effectively
can be a good option for burying unusable e-waste particles, and it is also suggested in this model.
4.3 Analysis and Comparison of the Current and Proposed Models
A detailed comparison between the existing and proposed frameworks of the e-waste management system and recycling process is given in Table 1, organized by comparison issue. For the recycling process, we selected environmental pollution control, the reusability plan, and separate recycling to compare the existing model with the proposed one. The management system needed
Fig. 4. Current e-waste management process of Dhaka
safe disposal, effective e-waste collection, and repair with experts to make a meaningful comparison.
4.4 Waste Collection
Most people are unaware of the harmful effects of e-waste, and the e-waste management system is not easy to use. People also do not want to spend time on recycling because they receive no reward for it. After analyzing all the collected information, we modeled the current waste collection process, which is given below. In Fig. 6, four stakeholders are engaged in e-waste management. Each performs its task asynchronously, and because the stakeholders are not connected to one another in sequence, two time gaps appear in the model. During those time gaps, e-waste pollutes the environment heavily. Our to-be model therefore tries to remove the time gaps. After analyzing all the shortcomings, we designed a proposed model of the waste collection process, which is given below. In Fig. 7, the basic stakeholders of the proposed model are the mobile application, the user, and the recycling company. The customer will post about their e-waste through
Fig. 5. A proposed e-waste management system for Dhaka
the application. On a scheduled date and time, the company can plan to collect all the e-waste area by area. Analyzing waste collection through the application, we found that most of the waste comes from households, so we interviewed households of different sizes about their concerns regarding e-waste management. Table 2 shows the number of participants from families of different sizes, and Table 3 gives their responses. Table 3 shows that most respondents (60%) do not manage e-waste properly, citing the difficulty they face in managing it; only 20% think that the current management system is easy, while 90% agree that a mobile application can make it easier for them to manage e-waste properly. That is why we developed a mobile application for e-waste management.
4.5 Interface of Application
The amount of e-waste is increasing day by day. Because it is growing so quickly, making the e-waste management system even a little more environmentally friendly will have a large impact in the future. To improve the management system, we have
Table 1. Analysis and comparison between the existing and proposed models

| Category | Issues | Existing | Proposed |
|---|---|---|---|
| Recycle | Environmental pollution control system | The system is absent in the existing model | Proposed to set up EPCS in the facility |
| Recycle | Reusability plan | The current model does not have a proper reusability plan | Provided a sustainable reusability plan |
| Recycle | Recycle separately | Not mentioned in the model | Shown detailed process |
| Management | Safe disposal | Follow the traditional burning system | Landfilling away from the residential area |
| Management | Effective e-waste collection | They do not have any collection point | Proposed to set up collection points and develop app system |
| Management | Repair with experts | Directly repair in shops without any safety measure | Will be repaired by experts with safety measures in a repairing facility |
developed an Android application through which the house owner can easily contact the recycling companies. The interface of the application is given below. In Figs. 8 and 9, users post their waste details in the app, and the company owner is notified along with the location details. The company can collect waste area by area because it has each customer's house number and area number. The recycling companies can organize events in different areas to collect e-waste through the application. Customers can receive a monetary reward for their e-waste, or they can donate it. The app makes the e-waste collection process more efficient and environmentally friendly: collection becomes easier, and the time gap is removed.
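The area-wise collection described above can be illustrated with a small sketch that groups user posts by area so a company can plan one trip per area. It is not the app's actual code: the `WastePost` record, its field names, and the sample posts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WastePost:
    """One user post; every field name here is a hypothetical illustration."""
    user: str
    house_no: str
    area_no: str
    item: str
    donate: bool  # True = donated, False = user asks for a cash reward


def group_posts_by_area(posts):
    """Return {area_no: [posts...]} so pickups can be scheduled area by area."""
    by_area = defaultdict(list)
    for post in posts:
        by_area[post.area_no].append(post)
    return dict(by_area)


# Made-up posts used only to exercise the grouping.
posts = [
    WastePost("user-1", "12/A", "area-3", "old monitor", donate=False),
    WastePost("user-2", "7", "area-1", "broken phone", donate=True),
    WastePost("user-3", "44", "area-3", "laptop battery", donate=False),
]

pickup_plan = group_posts_by_area(posts)
for area, area_posts in sorted(pickup_plan.items()):
    print(area, [p.item for p in area_posts])
```

Grouping by area number before scheduling is one simple way to support the area-to-area collection the text describes; a real system would also need dates, notifications, and persistence.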
Fig. 6. Current model of the waste collection process
Fig. 7. Proposed model of the waste collection process
Table 2. Family size of the respondents

| Sl. No. | Particulars | No. of members | No. of respondents | Percentage |
|---|---|---|---|---|
| 1 | Small family | 2 | 3 | 30 |
| 2 | Medium family | Between 2 and 5 | 4 | 40 |
| 3 | Large family | More than 5 | 3 | 30 |
| Total | | | 10 | 100 |

Source: Primary data
Table 3. Responses from the respondents

| Questions | Yes (%) | No (%) |
|---|---|---|
| Do they manage e-waste properly? | 40 | 60 |
| Is the current management system easy for them? | 20 | 80 |
| Can a mobile application make it easier to handle and connect them with recycling companies? | 90 | 10 |
Fig. 8. a Splash screen b dashboard
Fig. 9. a Create post b upcoming events
5 Conclusion
The goal of this study was to find a sustainable method of e-waste treatment for Dhaka and to compare the suggested model with the present one. The current model has some flaws, which the suggested model addresses. The proposed models are effective and sustainable models for recycling and e-waste management. We have also developed an application for waste collection, which will help save time and reduce environmental pollution. Recycling obsolete electronic equipment is critical, but it must be done in a safe and consistent manner. Improving working conditions for all employees of e-waste companies is essential. Tons of e-waste are discarded every year, and the situation is only growing worse.
Acknowledgments. The authors would like to thank Mr. Abdulla Al Mamun, Senior Executive Officer for Business Development at Azizu Recycling and E-waste Company Ltd.
References
1. Halim, L., Suharyanti, Y.: E-Waste: current research and future perspective on developing countries. Int. J. Indus. Eng. Eng. Manage. 1(2), 25–42 (2019)
2. Masud, M.H., et al.: Towards the effective E-waste management in Bangladesh: a review. Environ. Sci. Pollut. Res. 26(2), 1250–1276 (2018). https://doi.org/10.1007/s11356-018-3626-2
3. Ahmad, S., Wong, K., Rajoo, S.: Sustainability indicators for manufacturing sectors. J. Manuf. Technol. Manag. 30(2), 312–334 (2019)
4. Rahman, M.: E-waste management challenges in the country. The Financial Express (2021)
5. Ahirwar, R., Tripathi, A.: E-waste management: a review of recycling process, environmental and occupational health hazards, and potential solutions. Environ. Nanotechnol. Monitor. Manage. 15, 100409 (2021)
6. Aboelmaged, M.: E-waste recycling behaviour: an integration of recycling habits into the theory of planned behaviour. J. Clean. Prod. 278, 124182 (2021)
7. Roy, H., Islam, M., Haque, S., Riyad, M.: Electronic waste management scenario in Bangladesh: policies, recommendations, and case study at Dhaka and Chittagong for a sustainable solution. Sustain. Technol. Entrepreneurship 1(3), 100025 (2022)
8. Haque, R., Rahman, M.: E-waste management in Bangladesh. Published by Syed Manzur Elahi for International Publications Limited, Tropicana Tower (4th floor), 45, Topkhana Road, GPO Box: 2526 Dhaka, 1000 (2022)
9. Yeasmin, S., Afrin, N., Saif, K., Reza, A.W., Arefin, M.S.: Towards building a sustainable system of data center cooling and power management utilizing renewable energy. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_67
10. Liza, M.A., Suny, A., Shahjahan, R.M.B., Reza, A.W., Arefin, M.S.: Minimizing E-waste through improved virtualization. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_97
11. Das, K., Saha, S., Chowdhury, S., Reza, A.W., Paul, S., Arefin, M.S.: A sustainable e-waste management system and recycling trade for bangladesh in green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_33
12. Rahman, M.A., Asif, S., Hossain, M.S., Alam, T., Reza, A.W., Arefin, M.S.: A sustainable approach to reduce power consumption and harmful effects of cellular base stations. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_66
13. Ahsan, M., Yousuf, M., Rahman, M., Proma, F.I., Reza, A.W., Arefin, M.S.: Designing a sustainable e-waste management framework for Bangladesh. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_104
14. Mukto, M.M., Al Mahmud, M.M., Ahmed, M.A., Haque, I., Reza, A.W., Arefin, M.S.: A sustainable approach between satellite and traditional broadband transmission technologies based on green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_26
15. Meharaj-Ul-Mahmmud, Laskar, M.S., Arafin, M., Molla, M.S., Reza, A.W., Arefin, M.S.: Improved Virtualization to Reduce e-Waste in Green Computing. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_35
16. Banik, P., Rahat, M.S.A., Rafe, M.A.H., Reza, A.W., Arefin, M.S.: Developing an Energy Cost Calculator for Solar. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_75
17. Ahmed, F., Basak, B., Chakraborty, S., Karmokar, T., Reza, A.W., Arefin, M.S.: Sustainable and profitable IT infrastructure of bangladesh using green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_18
18. Ananna, S.S., Supty, N.S., Shorna, I.J., Reza, A.W., Arefin, M.S.: A Policy Framework for improving e-waste management in Bangladesh. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_95
CoBertTC: Covid-19 Text Classification Using Transformer-Based Language Models
Md. Rajib Hossain(B) and Mohammed Moshiul Hoque
Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong 4349, Bangladesh [email protected], [email protected]
Abstract. Covid-19 has significantly impacted human life, decreasing face-to-face communication and causing an exponential rise in virtual interactions. Consequently, online platforms such as news websites, blogs, and social media have become the primary source of information on many topics, particularly Covid-19-related news. Nonetheless, accurately categorizing Covid-19-related text data remains an ongoing research challenge during and after the pandemic. To address this issue, this paper introduces a Covid-19-related text classification system named CoBertTC, which consists of three primary modules: transformer-based language model fine-tuning, transformer-based language model inference, and best-performing model selection. Six transformer-based language models, namely mBERT, XLM-RoBERTa, mDistilBERT, IndicBERT, MuRIL, and mDeBERTa-V3, are exploited for the text classification task on the English Covid-19 text classification corpus (ECovC). The findings reveal that XLM-RoBERTa achieved the highest accuracy of 94.22% for the Covid text classification task among the six models.
Keywords: Natural language processing · Text processing · COVID-19 · Transformer-based language models · Fine-tuning
1 Introduction
Text classification related to Covid-19 is an essential task that aims to automatically determine whether text data includes information related to Covid-19. In response to this public health crisis, governments worldwide have declared states of emergency and enforced limitations on daily movements. The pandemic has led to the emergence of online platforms as a crucial information source for various entities such as the ministries of education and health, different types of limited and government companies, law enforcement agencies, and legislators [10]. As a result, a significant amount of textual data has been generated. Nevertheless, this abundance of textual data is frequently characterized by inaccuracies, falsehoods, rumors, and counterfeit news, collectively leading to an infodemic. Researchers in linguistics have been strongly motivated to extract Covid-related information from published data, both current and past [4]. However, most of the text is unstructured and unlabeled, posing a challenge for Covid text mining. The process of
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 179–186, 2024. https://doi.org/10.1007/978-3-031-50327-6_19
Md. R. Hossain and M. M. Hoque
manual text classification can be time-consuming and expensive. To overcome the limitations of manual classification and to assist governments and security agencies in conveying their directives to the public, an automatic Covid text classification system is needed. However, developing such a system is challenging because it requires a labelled corpus and domain-specific fine-tuned models. Although several studies have focused on identifying Covid toxicity and rumours in English text, they do not address Covid text classification, which involves determining whether a text contains Covid-19-related facts. While pre-trained multilingual or non-contextual language models have been widely used in text identification/classification studies for feature extraction, domain-specific embedding models have been shown to extract superior semantic and syntactic features compared to generalized embeddings or pre-trained language models. This study aims to create an English Covid-19 text classification system by fine-tuning six transformer-based language models using the ECovC corpus. The primary focus is to address the following research questions (RQs):
– RQ1: What is the approach for designing a Covid-19 text classification system for a specific domain and language?
– RQ2: How can out-of-vocabulary words be addressed when using transformer-based language models for Covid-19 text classification?
The answers to RQ1 and RQ2, which are also the contributions of this study, are as follows:
– ARQ1: Designing a Covid-19 text classification system for the English language involves fine-tuning a transformer-based language model using a Covid-19 text corpus in English. This involves preparing a labeled Covid-19 text corpus, i.e., ECovC, and fine-tuning a pre-trained language model (Sect. 3).
– ARQ2: Out-of-vocabulary words are addressed by employing subword tokenization techniques, such as Byte Pair Encoding (BPE), during the pre-processing stage. This technique breaks words down into subword units that the language model can recognize, reducing the chances of encountering out-of-vocabulary words during classification. Additionally, techniques such as dynamic vocabulary expansion or the use of additional pre-trained language models can be used to handle out-of-vocabulary words in Covid-19 text classification (Sect. 4).
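To make the subword idea in ARQ2 concrete, the following is a toy character-level BPE sketch: it learns merge rules from a handful of words and then splits an unseen word into known pieces instead of treating it as out-of-vocabulary. The function names, the tiny word list, and the number of merges are hypothetical; the pre-trained models used in the study come with their own trained subword vocabularies, which this toy does not reproduce.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the joined symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules, most frequent adjacent pair first."""
    vocab = Counter(tuple(word) for word in words)  # each word starts as characters
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best_pair = max(pair_counts, key=pair_counts.get)
        merges.append(best_pair)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best_pair)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical mini-corpus; a real tokenizer is trained on far more text.
training_words = ["covid", "covid", "covid19", "vaccine", "vaccines", "video"]
merges = learn_bpe_merges(training_words, num_merges=10)

seen_tokens = encode("covid", merges)
unseen_tokens = encode("covidiot", merges)  # word that is not in the training list
print(seen_tokens, unseen_tokens)
```

Because encoding always falls back to single characters, any word built from known characters can be represented, which is the property that reduces out-of-vocabulary cases.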
2 Related Work
The COVID-19 pandemic has sparked widespread interest in text analysis research, especially in detecting fake news, misinformation, and disinformation. Recently, researchers have focused on developing text analysis systems for both low- and high-resource languages. Pranesh et al. [14] employed Dense-CNN+MBERT to create a multilingual misinformation detection system that attained a maximum accuracy of 82.17%. They compared their model’s performance against eight monolingual transformer-based language models. Sentiment- and emotion-aware information mining from Covid text has been studied in recent years, but this work does not address the prerequisite task, i.e., Covid data detection [13]. Machine learning techniques have been used to develop a sentiment detection system for Covid-related vaccine data collected from tweets [15]. Another study developed a Covid fake sentence classification and factify system using transformer-based models
CoBertTC: Covid-19 Text Classification Using …
(i.e., BERT and ALBERT models) [16]. Their system was evaluated on the test set and obtained a maximum accuracy of 85.50%. Additionally, the scarcity of linguistic resources has made the analysis of Covid text in resource-constrained languages such as Arabic, Hindi, and Bengali a subject of interest. Ameur et al. [2] developed an Arabic COVID-19 toxicity identification system (including fake news). Researchers have also explored sentiment analysis and emotion detection during the pandemic. These advancements in COVID-19 text analysis have opened up new avenues for research in areas such as fake news detection, multilingual misinformation detection, sentiment analysis, and emotion detection. However, these studies do not focus on improving the accuracy and scalability of these systems or on developing new systems for languages with fewer linguistic resources. Most previous research has addressed Covid toxicity text measurement systems, i.e., sentence-level misinformation, document-level disinformation, mining of vaccination data, Covid-aware sentiment and emotion analysis, and rumour identification. Static embeddings (i.e., Word2Vec, FastText, GloVe) are insufficient for overcoming out-of-vocabulary issues and do not represent domain-specific word semantics well [1]. These studies have relied on generalized multilingual transformer-based language models, which overcome the shortcomings of traditional static embedding models and represent features better. We also address this gap between static embeddings and embeddings from transformer-based language models. Finally, we fine-tuned six transformer-based language models on the previously developed ECovC corpus [7] for COVID-19 text classification. Our approach offers a better solution for improving the accuracy and scalability of COVID-19 text classification.
3 Methodology
The primary objective of this work is to create a Covid-19 text classification system capable of automatically detecting whether textual content pertains to Covid. To achieve this aim, this study designed the CoBertTC system, which comprises three core modules: (i) transformer-based language model fine-tuning, (ii) transformer-based language model inference, and (iii) best-performing model selection. The fine-tuning module trains pre-trained transformer-based language models on Covid-19-related text data. The inference module applies the fine-tuned models to classify new textual content. Finally, the best-performing model selection module selects the model with the highest accuracy in Covid-19 text classification. Figure 1 depicts the details of CoBertTC. This study used the ECovC corpus developed in previous English Covid-19 text mining research [7].
3.1 Transformer-Based Language Model Fine-Tuning
In this module, the inputs are the training dataset (i.e., $X_t$) and the transformer-based language models (i.e., $L_m$). Here, $t$ denotes the total number of training samples and $L_m$ indicates the transformer-based language models, i.e., $L_m \in \{\text{mBERT}, \text{XLM-RoBERTa}, \text{Indic-BERT}, \text{MuRIL}, \text{mDeBERTa-V3}, \text{mDistilBERT}\}$. The transformer-based language
Fig. 1. Abstract view of CoBertTC
models are fine-tuned using Eq. 1.

$$F_m^i = T_f(X_t^j, L_m^i), \quad i = \text{mBERT}, \ldots, \text{MuRIL}; \; j = 1, \ldots, t \qquad (1)$$
where $F_m^i$ denotes the $i$th fine-tuned model of $L_m^i$. The function $T_f(\cdot,\cdot)$ extracts the features of the $j$th text using the $i$th language model and forwards them to the fine-tuning phase. Each language model is fine-tuned on the training set $X_t$. After fine-tuning, the tuned models are forwarded to the language model inference module.
3.2 Transformer-Based Language Model Inference
In this module, the inputs are the test set (i.e., $X_s$) and the set of fine-tuned models (i.e., $F_m$), and the outputs are the set of performance matrices (i.e., $Y_m$). Inference with the $i$th fine-tuned model on the $k$th unlabelled sample is given by Eq. 2.

$$Y_m^i = T_I(F_m^i, X_s^k), \quad i = \text{mBERT}, \ldots, \text{MuRIL}; \; k = 1, \ldots, s \qquad (2)$$

where $s$ represents the total number of test samples in $X_s$ and $Y_m^i$ denotes the performance of the $i$th fine-tuned model. The function $T_I(\cdot,\cdot)$ sequentially takes each unlabelled sample text, i.e., $X_s^k \in X_s$, and feeds it to the fine-tuned model. Each fine-tuned model produces a performance matrix $Y_m^i$, $i \in \{\text{mBERT}, \text{XLM-RoBERTa}, \text{Indic-BERT}, \text{MuRIL}, \text{mDeBERTa-V3}, \text{mDistilBERT}\}$. This matrix contains the accuracy, precision, recall, average precision, average recall, and F1-score. The performance of each model is forwarded to the best-performing model selection module.
3.3 Best-Performed Model Selection
This module takes the performance matrices $Y_m^i$, $i \in \{\text{mBERT}, \text{XLM-RoBERTa}, \text{Indic-BERT}, \text{MuRIL}, \text{mDeBERTa-V3}, \text{mDistilBERT}\}$, and outputs the desired predicted label $y \in \{\text{Covid}, \text{Non-Covid}\}$ based on the best-performing model.
In this research, we have taken six transformer-based model performances, and each model accuracy is normalized by Eq. 3.

$$y_j^i = \max\left(\frac{e^{Y_m^{i,j}}}{\sum_{k=1}^{s} e^{Y_m^{i,k}}}\right) \qquad (3)$$

where $y_j^i$ denotes the $j$th sample normalized prediction value of the $i$th fine-tuned model. After the accuracy normalization, the best model result was selected, which is considered the proposed system performance. This study empirically analyzes the six transformer-based language model performances and selects the best one for the English Covid text classification system.
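A minimal sketch of this selection step follows, under one reading of Eq. 3: a softmax normalization followed by taking the maximum. As an assumption made for illustration, the softmax is applied to the two class scores of a single sample (Eq. 3 itself sums over $k = 1, \ldots, s$), and the best model is the one with the highest reported accuracy. The function names and the raw scores are hypothetical; the accuracy values are those reported in Table 1 of Sect. 4.1.

```python
import math


def softmax(scores):
    """Numerically stable softmax: exp(s) / sum(exp(s)) over the given scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(class_scores, labels=("Covid", "Non-Covid")):
    """Return the label with the largest normalized score and that score."""
    probs = softmax(class_scores)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return labels[best_index], probs[best_index]


def select_best_model(accuracy_by_model):
    """Pick the model name with the highest accuracy."""
    return max(accuracy_by_model, key=accuracy_by_model.get)


# Accuracy values (%) as reported in Table 1 of the paper.
table1_accuracy = {
    "Indic-BERT": 91.98,
    "MuRIL": 94.07,
    "mBERT": 90.83,
    "XLM-RoBERTa": 94.22,
    "mDeBERTa-V3": 93.92,
    "mDistilBERT": 90.68,
}

best_model = select_best_model(table1_accuracy)
label, confidence = predict_label([2.0, 0.5])  # hypothetical raw class scores
print(best_model, label, round(confidence, 3))
```

Subtracting the maximum score before exponentiating does not change the softmax result but avoids overflow for large scores.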
4 Experiments and Results

The fine-tuning phase of the six transformer-based language models was executed using PyTorch, Python 3.6, and Scikit-learn. The transformer models were run on a Core i7 processor with an NVIDIA GTX 1070 GPU (8 GB) and 32 GB of physical memory. The ECoVC test set performance is evaluated with the six fine-tuned transformer-based models. The accuracy (A), precision (PR), recall (RC), F1-score (F1), micro-average (MA) and weighted average (WA) statistical measures are used to evaluate this system, i.e., CoBertTC [9].

4.1 Results

The CoBertTC system performance is shown in Table 1. The MuRIL [12] and Indic-BERT [11] language models are trained for the local Indian languages, including English, whereas XML-RoBERTa, mDistilBERT, mBERT, and mDeBERTa-V3 support more than a hundred languages [3]. Among these models, XML-RoBERTa obtained the maximum accuracy of 94.22%, with weighted and macro-average precision, recall and F1 scores of 94.

Table 1. Summary of transformer-based Covid-19 text classification statistics

Models         A (%)           MA (%)            WA (%)
                               PR   RC   F1      PR   RC   F1
Indic-BERT     91.98 ± .001    92   92   92      92   92   92
MuRIL          94.07 ± .001    94   94   94      94   94   94
mBERT          90.83 ± .001    91   91   91      91   91   91
XML-RoBERTa    94.22 ± .001    94   94   94      94   94   94
mDeBERTa-V3    93.92 ± .001    94   94   94      94   94   94
mDistilBERT    90.68 ± .001    91   91   91      91   91   91
The minimum accuracy of 90.68% was obtained from mDistilBERT due to its smaller number of trainable parameters compared to the other five language models. MuRIL and Indic-BERT obtained better accuracy than mBERT. mBERT was trained on over a hundred languages, whereas Indic-BERT and MuRIL were trained on only 12 Indian languages. However, the overall maximum accuracy of CoBertTC was obtained from XML-RoBERTa. Figure 2 shows the word-cloud representation of Covid-related words (i.e., Fig. 2a) and non-Covid-related words (i.e., Fig. 2b).
Fig. 2. Word-cloud representation of the 100 most frequent words of the English Covid and non-Covid training samples: (a) Covid word-cloud, (b) non-Covid word-cloud
Performance Comparison: As this research focuses on Covid-19 text classification and utilizes a unique dataset, a direct comparison of the developed method's performance with earlier studies is not feasible. Therefore, we compared the developed method's performance with other text classification methods. Table 2 compares the best-performing developed method with earlier text classification methods on the previously developed corpus (i.e., ECoVC) in terms of accuracy. Four methods are compared with the proposed model, XML-RoBERTa. SGD-Word2Vec [5] achieves an accuracy of 72.73%, DCNN-RNN [6] achieves 84.05%, CNN-Fasttext [8] achieves 82.68%, and dGloVe+CNN [7] achieves 88.89%. In comparison, the proposed model, XML-RoBERTa, achieves the highest accuracy of 94.22%, an improvement of 5.33 percentage points over the best earlier method (dGloVe+CNN). The results suggest that the proposed model, which uses the XML-RoBERTa architecture, outperforms the other methods on the ECoVC dataset. This model may also be a promising approach for other text classification tasks. However, the comparison is limited to accuracy, and other performance metrics may also need to be considered in evaluating the models.
Table 2. Performance summary of the best-performed developed method and other text classification methods

Corpus    Methods                   A (%)
ECoVC     SGD-Word2Vec [5]          72.73
          DCNN-RNN [6]              84.05
          CNN-Fasttext [8]          82.68
          dGloVe+CNN [7]            88.89
          XML-RoBERTa (proposed)    94.22
5 Conclusion

In this study, we developed a Covid-19 text classification system for English using six transformer-based language models, fine-tuned on ECoVC. The system was evaluated using various performance metrics, including accuracy, precision, recall, and F1-score. The results of the experiments demonstrated that the fine-tuned transformer-based models significantly outperformed the baseline models in classifying Covid-19 text data. Among the six models, the XML-RoBERTa model achieved the highest accuracy of 94.22%, while the other models achieved accuracies ranging from 90.68% to 94.07%. Furthermore, the evaluation of the system on the independent test set demonstrated its robustness and generalization ability. The system could effectively classify Covid-19 text data with high accuracy, precision, recall, and F1-score. Overall, this study has demonstrated the effectiveness of transformer-based language models in developing accurate Covid-19 text classification systems for the English language. The proposed system has the potential to be applied in various domains, such as healthcare, public health, and social media analysis. We hope this study can contribute to the fight against the Covid-19 pandemic by enabling efficient and accurate analysis of large volumes of Covid-19-related data in English. Although the suggested approach demonstrated satisfactory performance, future research will combine multiple transformer-based language models and include score fusion to improve efficacy and accuracy.
References

1. Afroze, S., Hoque, M.M.: Sntiemd: Sentiment specific embedding model generation and evaluation for a resource constraint language. In: Intelligent Computing and Optimization. pp. 242–252. Springer International Publishing, Cham (2023)
2. Ameur, M.S.H., Aliane, H.: Aracovid19-mfh: Arabic covid-19 multi-label fake news and hate speech detection dataset. Procedia Comput. Sci. 189, 232–241 (2021). https://doi.org/10.1016/j.procs.2021.05.086, AI in Computational Linguistics
3. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota (2019). https://doi.org/10.18653/v1/N19-1423, aclanthology.org/N19-1423
4. Gadri, S., Chabira, S., Mehieddine, S.O., Herizi, K.: Sentiment analysis: Developing an efficient model based on machine learning and deep learning approaches. In: Intelligent Computing and Optimization. pp. 237–247. Springer International Publishing, Cham (2022)
5. Hossain, M.R., Hoque, M.M.: Automatic Bengali document categorization based on word embedding and statistical learning approaches. In: Proc. IC4ME2. pp. 1–6. Rajshahi, Bangladesh (2018)
6. Hossain, M.R., Hoque, M.M.: Semantic meaning based Bengali web text categorization using deep convolutional and recurrent neural networks (dcrnns). In: Internet of Things and Connected Technologies. pp. 494–505. Springer International Publishing, Cham (2021)
7. Hossain, M.R., Hoque, M.M.: Covtexminer: Covid text mining using cnn with domain-specific glove embedding. In: Intelligent Computing and Optimization. pp. 65–74. Springer International Publishing, Cham (2023)
8. Hossain, M.R., Hoque, M.M., Sarker, I.H.: Text classification using convolution neural networks with fasttext embedding. In: Proc. HIS. pp. 103–113. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-73050-5_11
9. Hossain, M.R., Hoque, M.M., Siddique, N., Sarker, I.H.: Bengali text document categorization based on very deep convolution neural network. Expert Syst. Appl. 184, 115394 (2021)
10. Hossain, M.R., Hoque, M.M., Siddique, N., Sarker, I.H.: Covtinet: Covid text identification network using attention-based positional embedding feature fusion. Neural Comput. Appl. (2023)
11. Kakwani, D., Kunchukuttan, A., Golla, S., N.C., G., Bhattacharyya, A., Khapra, M.M., Kumar, P.: IndicNLPSuite: Monolingual corpora, evaluation benchmarks and pre-trained multilingual language models for Indian languages. In: Findings of the Association for Computational Linguistics: EMNLP 2020. pp. 4948–4961. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.findings-emnlp.445
12. Khanuja, S., Bansal, D., Mehtani, S., Khosla, S., Dey, A., Gopalan, B., Margam, D.K., Aggarwal, P., Nagipogu, R.T., Dave, S., Gupta, S., Gali, S.C.B., Subramanian, V., Talukdar, P.P.: Muril: Multilingual representations for indian languages. CoRR abs/2103.10730 (2021)
13. Pacheco, M.L., Islam, T., Mahajan, M., Shor, A., Yin, M., Ungar, L., Goldwasser, D.: A holistic framework for analyzing the COVID-19 vaccine debate. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp. 5821–5839. Association for Computational Linguistics, Seattle, United States (2022)
14. Pranesh, R., Farokhenajd, M., Shekhar, A., Vargas-Solar, G.: CMTA: COVID-19 misinformation multilingual analysis on Twitter. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop. pp. 270–283. Association for Computational Linguistics, Online (2021). https://doi.org/10.18653/v1/2021.acl-srw.28
15. Sarirete, A.: Sentiment analysis tracking of covid-19 vaccine through tweets. J. Ambient Intell. Human. Comput. (2022). https://doi.org/10.1007/s12652-022-03805-0
16. Vijjali, R., Potluri, P., Kumar, S., Teki, S.: Two stage transformer model for COVID-19 fake news detection and fact checking. In: Proceedings of the 3rd NLP4IF Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda. pp. 1–10. International Committee on Computational Linguistics (ICCL), Barcelona, Spain (Online) (2020)
Glaucoma Detection Using CNN and Study on Class Imbalance Problem

Nitish U.1, Pradeesh Prem Kumar1(B), Nimal Madhu M.2, Hareesh V.1, and V. V. Sajith Variyar1

1 Center for Computational Engineering and Networking, Amrita School of Artificial Intelligence, Amrita Vishwa Vidyapeetham, Coimbatore 641112, India
[email protected], [email protected]
2 Department of Electrical Engineering, National Institute of Technology Calicut, Kozhikode, Kerala, India
Abstract. Glaucoma refers to eye disorders that result in harm to the optic nerve, leading to either vision loss or blindness. Without treatment, glaucoma tends to worsen gradually over time, making early detection and treatment crucial. Deep learning based on CNN was used for the detection of glaucoma using fundus images. However, deep learning requires datasets that contain large numbers of labelled, high-quality images in order to predict with high accuracy, because the model must extract important and complex patterns or features from the images to tune its parameters and reduce its loss. Such datasets are often scarce, and the datasets that are available are sometimes highly imbalanced. In this paper, the issue of an imbalanced dataset has been addressed by using 2 sets of data augmentation techniques and GAN to make the dataset balanced and thereby increase the size of the dataset. The first set includes operations such as changing rotation, height, flipping etc. The second set includes operations such as changing hue, saturation, and contrast. The second set of augmentation provided the best result with an accuracy of 0.821 when compared with the rest of the approaches.

Keywords: Glaucoma · Optic disk (OD) · CNN · Data augmentation (DA) · Generative adversarial network (GAN)
1 Introduction

The visual sensation experienced by humans depends on the optic nerve, which is composed of millions of nerve fibers that transmit signals from the eye to the brain. Primary Open-Angle Glaucoma, the most widely occurring form of glaucoma, is caused by an increase in pressure inside the eye, which damages the optic nerve and leads to a gradual loss of nerve fibers. Left untreated, it can lead to blindness [1, 2]. Not all individuals with elevated intraocular pressure develop glaucoma, and those with normal eye pressure can also be susceptible to the condition. Glaucoma develops when the eye pressure is too high for a particular optic nerve.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 187–198, 2024. https://doi.org/10.1007/978-3-031-50327-6_20
There are several ways to detect glaucoma:
• Tonometry: measuring the pressure inside the eye.
• Visual field test: measuring the extent of peripheral vision loss.
• Pachymetry: measuring the thickness of the cornea, which can affect the pressure reading.
• Optic nerve head evaluation: examining the appearance of the optic nerve head to ascertain any signs of damage.

The "cup-to-disk ratio" (CDR) is a crucial measurement used in detecting as well as monitoring glaucoma [3]. This measurement is used to assess the appearance of the optic nerve head. The CDR is calculated by comparing the size of the optic nerve head cup, which is the central depression in the nerve, to the size of the entire nerve head disk. A healthy eye has a small cup and a large disk. In a patient suffering from glaucoma, the optic nerve head can become damaged, and as a result the cup may increase in size and occupy a larger portion of the disk. Hence, in an eye with glaucoma, the CDR will be high compared to healthy eyes. With the recent technological advancements in both the medical field and the field of artificial intelligence, it is possible to develop an algorithm that automatically detects glaucoma. Using deep learning and image processing, fundus images can be classified as healthy or glaucoma-infected based on the CDR. A fundus image (Fig. 1) is a medical image of the inner surface of the eye, which includes the retina, OD, macula, and other features. The image is typically captured using specialized cameras and is a non-invasive diagnostic tool in ophthalmology.
Fig. 1. Fundus image
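The cup-to-disk ratio described above is a simple quotient, which the following minimal sketch illustrates. The function names, the use of diameters in pixels, and in particular the 0.6 cut-off are illustrative assumptions; the paper does not state a threshold or a measurement procedure.

```python
def cup_to_disc_ratio(cup_diameter, disc_diameter):
    """Ratio of optic cup size to optic disc size (same unit for both, e.g. pixels)."""
    if disc_diameter <= 0:
        raise ValueError("disc diameter must be positive")
    if not 0 <= cup_diameter <= disc_diameter:
        raise ValueError("cup diameter must lie between 0 and the disc diameter")
    return cup_diameter / disc_diameter


def is_suspicious(cdr, threshold=0.6):
    """Flag a high CDR. The threshold is a hypothetical value, not taken from the paper."""
    return cdr >= threshold


healthy_cdr = cup_to_disc_ratio(60, 200)    # small cup, large disc
enlarged_cdr = cup_to_disc_ratio(150, 200)  # cup occupies most of the disc
print(healthy_cdr, is_suspicious(healthy_cdr))
print(enlarged_cdr, is_suspicious(enlarged_cdr))
```

A larger cup relative to the disc pushes the ratio toward 1, which is the pattern the text associates with glaucomatous eyes.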
One of the main complications that arises here is the lack of positive images compared with negative images. This leads to an imbalanced dataset in which the distribution of classes in the training data is skewed, meaning one of the classes has significantly more instances than the other class. A model that is trained on such datasets will perform poorly on the minority class and will be highly biased towards the majority class. This paper deals with the various techniques that can be used to tackle the issue of imbalance in our datasets. A comparison study of the efficiency of our CNN-based model on the datasets obtained using these techniques is carried out at the end.
2 Literature Review

In 2022, Singh et al. [4] developed a CNN framework using the self-adaptive butterfly optimization algorithm. The fundus images were pre-processed using Gaussian filtering to reduce unwanted noise. The authors were able to achieve a higher precision score than other similar models.

In 2020, Fu et al. [5] proposed a deep learning algorithm in which the segmentation of the Optic Disc (OD) and Optic Cup (OC) in fundus images was automated. The authors used a U-Net model, which is based on an encoder-decoder architecture, for detecting the OD.

In 2016, Singh et al. [6] proposed a methodology in which the blood vessels are removed and the OD is segmented using wavelet feature extraction. The feature extraction was followed by dimensionality reduction using PCA and normalization using z-score. Various machine learning classifiers were used, of which KNN and SVM gave the best results.

In 2022, Joshi et al. [7] proposed a model using a convolutional neural network to differentiate between healthy and glaucomatous fundus images by extracting features from the images. The proposed ensemble architecture was then compared with ResNet50, VGGNet-16, and GoogLeNet in terms of performance, and the test was carried out on both public and private datasets. The proposed model outperformed the existing state-of-the-art methods.

In 2015, Akram et al. [8] proposed an approach for accurately detecting glaucoma from coloured retinal images. The system used a novel approach to extract the optic disc by analyzing vessel-based features. After the OD was detected, the required region was extracted for evaluation. The system extracted various features, including CDR, rim-to-disc ratio, and spatial and spectral features, and used a classifier based on multivariate medoids for detecting glaucoma.

In 2019, Afzal et al. [9] discussed the challenges faced when developing a model using an imbalanced dataset.
The paper addresses the problem of overfitting due to the insufficiency of data available for training. AlexNet was used with transfer learning for training. The dataset used was OASIS, an imbalanced dataset that contains MRI scans for Alzheimer's Disease. Various data augmentation techniques were applied before training the model, and the authors were able to show significant improvement during evaluation.

All the papers mentioned above focus on the pre-processing of the images and especially the training of the model. Although these studies provide valuable insights, there is still room for research on how to deal effectively with the issue of imbalance in datasets. An imbalanced dataset occurs when one class in the dataset has a greater number of instances than the others. This is common in real-world datasets, especially in the medical field, where disease-positive data is far scarcer than disease-negative data. The major issue when training with an imbalanced dataset is that the trained model will be overfitted and heavily biased towards the majority class [10, 11]. This is because the model is trained on more instances belonging to the majority class, which therefore becomes over-represented in the model's decision-making process. Evaluation metrics such as the accuracy score become unreliable because they take into account only the number of correct predictions. This means that a model can achieve high accuracy even if it is making poor predictions for the minority class. This paper studies how to handle imbalanced datasets by generating new data points using different augmentation techniques and using GAN (Generative Adversarial Network).
3 Dataset Description

This paper uses two different datasets containing fundus images.
• The first dataset, ACRIMA, contains a total of 705 fundus images, of which 396 are glaucomatous (positive) and 309 are non-glaucomatous (negative). The images are already preprocessed: they are cropped around the optic disc, taking 1.5 times the radius of the optic disc.
• The second dataset is imbalanced and contains 168 positive and 482 negative images. These images are not cropped like those in the ACRIMA dataset.

Both datasets were downloaded from Kaggle (Figs. 2 and 3).
Fig. 2. From ACRIMA dataset
4 Methodology Description

In this paper, a CNN (Convolutional Neural Network) based model is used for classifying fundus images. The same CNN model is trained on the following datasets.
1. ACRIMA
2. Imbalanced dataset
3. Balanced dataset after using Augmentation 1
4. Balanced dataset after using Augmentation 2
5. Balanced dataset after using GAN
Fig. 3. From Imbalanced dataset
All the images are pre-processed by resizing them to 200 × 200 pixels. The pixel values are then normalized from a scale of (0–255) to (0–1). The imbalanced dataset is then balanced by applying two sets of data augmentation techniques and GAN, only to the positive (minority) class.

4.1 Data Augmentation

Data augmentation is an approach in which the size of the dataset is increased by producing modified versions of the existing images [12, 13]. The main goal is to introduce variations in the dataset that reduce the chance of the model overfitting. Common DA techniques include flipping, rotation, scaling, and adding noise to images. This paper uses the following data augmentation techniques:
• The first set of DA techniques includes Rotation, Shift, Shear, Width Shift, Height, Zoom and Horizontal Flip.
• The second set of DA techniques includes changes in Hue, Saturation and Contrast.

4.2 Generative Adversarial Network (GAN)

A GAN uses deep learning to produce fake images from real existing images. It contains two neural networks (Fig. 4): a generator network, which tries to create new images, and a discriminator network, which tries to differentiate between original and generated data [14]. The loss generated by each network is backpropagated to adjust its weights. Initially, a random array is fed into the generator and the images produced are of low quality. As the number of epochs increases, images of better quality are generated. After each epoch, a sample of both fake and real images is fed into the discriminator to distinguish between them.
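The pixel normalization and the balancing of the minority class described in this section can be sketched as follows. This is a minimal, dependency-free illustration under stated assumptions: images are small single-channel nested lists rather than 200 × 200 colour images, resizing is omitted, and the helper names (`normalize`, `horizontal_flip`, `adjust_contrast`, `synthetic_needed`) are hypothetical. Only the class counts (168 positive, 482 negative) come from the dataset description.

```python
def normalize(image):
    """Scale 8-bit pixel values from the range 0-255 to the range 0-1."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror each row, one of the geometric operations in the first augmentation set."""
    return [row[::-1] for row in image]


def adjust_contrast(image, factor):
    """Stretch normalized pixels around mid-grey (0.5) and clip to [0, 1].

    A simple stand-in for the contrast change in the second augmentation set.
    """
    return [[min(1.0, max(0.0, 0.5 + factor * (p - 0.5))) for p in row] for row in image]


def synthetic_needed(n_minority, n_majority):
    """Number of new minority-class images required to balance the two classes."""
    return max(0, n_majority - n_minority)


tiny_image = [[0, 255], [51, 102]]          # toy 2 x 2 single-channel image
scaled = normalize(tiny_image)
flipped = horizontal_flip(scaled)
contrasted = adjust_contrast(scaled, 2.0)
needed = synthetic_needed(168, 482)         # class counts of the imbalanced dataset
print(scaled, flipped, contrasted, needed)
```

With the reported class counts, 314 additional positive images would be needed to match the 482 negative images, whichever technique produces them.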
Fig. 4. GAN

5 Proposed Workflow

The imbalanced dataset is balanced by applying the 2 sets of data augmentation and GAN to the positive class. The resulting datasets are then split into train, test and validation sets. The CNN model has 3 convolution layers. The first layer consists of 32 filters of size (3, 3). The second layer contains 64 filters and the third layer 128 filters, each with kernel size (3, 3). Between each of these layers there is a max pooling layer of kernel size (2, 2). The convolution layers use no padding, and the stride is kept at the default, i.e. (1, 1). The max pooling layer keeps only the most important features or pixels from the feature map and reduces the dimension of the feature map. After the convolution layers come the dense layers. Before the transformed feature map is sent to the dense layers, it has to be flattened. There are 4 dense layers with 256, 128, 64, and finally 2 nodes respectively. The first 3 dense layers use the ReLU activation function, and the final dense layer uses the sigmoid activation function. The model is compiled with the Adam optimizer, which updates the weights using the loss during backpropagation. The metrics used for evaluation are precision, recall, f1-score and accuracy score (Fig. 5).
Fig. 5. Flowchart
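The layer description in this section implies specific feature-map sizes, which the following sketch works out. It assumes square 200 × 200 inputs, and it reads "between each of these layers" as two pooling layers (none after the third convolution); both readings are assumptions, and the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output width of a convolution with no padding and stride 1 by default."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Output width of a max pooling layer with a (2, 2) window."""
    return size // window


def feature_map_sizes(input_size=200, filters=(32, 64, 128)):
    """Trace (layer type, spatial size, channels) through the convolutional stack."""
    size = input_size
    trace = []
    for index, n_filters in enumerate(filters):
        size = conv_out(size)
        trace.append(("conv", size, n_filters))
        # Assumption: pooling only between convolution layers, not after the last one.
        if index < len(filters) - 1:
            size = pool_out(size)
            trace.append(("pool", size, n_filters))
    return trace


trace = feature_map_sizes()
_, final_size, final_channels = trace[-1]
flattened = final_size * final_size * final_channels  # input width of the first dense layer
for layer in trace:
    print(layer)
print("flattened features:", flattened)
```

Under these assumptions the spatial size shrinks from 200 to 46, so the flattened vector entering the 256-node dense layer has 46 × 46 × 128 = 270,848 values; a third pooling layer, if present, would reduce this further.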
6 Results and Discussion

6.1 ACRIMA

For the ACRIMA dataset, the CNN model was able to achieve an accuracy of 0.9574, with an f1 score of 0.96 (Figs. 6 and 7).
Fig. 6. Confusion matrix
Fig. 7. Classification report
6.2 Imbalanced Dataset

For the imbalanced dataset, an accuracy score of 0.753 and an f1 score of 0 were obtained. Here the model does not classify any image as Glaucoma Positive, and hence the precision, recall and f1 score for that class are 0, whereas the accuracy of 0.753 is quite misleading (Figs. 8 and 9).
Fig. 8. Confusion matrix
Fig. 9. Classification report
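Why accuracy stays high while the f1 score collapses can be reproduced with a few lines. The sketch below is illustrative only: it applies an "always negative" predictor to the full class counts of the imbalanced dataset (168 positive, 482 negative) rather than to the paper's actual test split, so the accuracy differs slightly from the reported 0.753. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive class, from raw labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Zero-division cases are reported as 0.0, as classification reports commonly do.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# 1 = glaucoma positive (minority), 0 = glaucoma negative (majority).
y_true = [1] * 168 + [0] * 482
y_pred = [0] * len(y_true)  # a model that never predicts the positive class

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The predictor is right on every negative image, so accuracy equals the share of the majority class, while precision, recall and F1 for the positive class are all zero.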
6.3 Balanced Dataset After Using Augmentation 1

For the balanced dataset acquired after using augmentation techniques like Rotation, Shift, Shear, Width Shift, Height, Zoom and Horizontal Flip, an accuracy score of 0.81 and an f1 score of 0.783 were obtained (Figs. 10, 11, 12, 13, 14 and 15).
Fig. 10. Confusion matrix
Fig. 11. Classification report
Balanced Dataset after using Augmentation 2. For the balanced dataset acquired after using augmentation techniques like changes in Hue, Saturation and contrast, an accuracy of 0.821 and an f1 score of 0.797 were obtained.
Fig. 12. Confusion matrix
Fig. 13. Classification report
Fig. 14. Confusion matrix
6.4 Balanced Dataset After Using GAN

For the balanced dataset acquired after using GAN, an accuracy of 0.781 and an f1 score of 0.66 were obtained.

Fig. 15. Classification report

It is evident from Table 1 that, without augmentation of the imbalanced dataset, the model does not classify any image as Glaucoma Positive, and hence the precision, recall and f1 score for that class are zero, whereas the accuracy of 0.753 is misleading. After using the two sets of data augmentation and GAN, the accuracy scores obtained are 0.810, 0.821 and 0.781 respectively, with the highest accuracy given by the second set of augmentation. The main point to notice in the table is that the f1 score for the positive class is no longer zero, which means the accuracy is no longer misleading.

Table 1. Result summary

Dataset              Method                  Class               Precision   Recall   F1 score   Accuracy
Balanced dataset     Without augmentation    Glaucoma negative   0.94        0.97     0.96       0.957
                                             Glaucoma positive   0.97        0.94     0.96
Imbalanced dataset   Without augmentation    Glaucoma negative   0.75        1        0.86       0.753
                                             Glaucoma positive   0           0        0
                     Using augmentation 1    Glaucoma negative   0.73        0.86     0.93       0.810
                                             Glaucoma positive   0.94        0.67     0.78
                     Using augmentation 2    Glaucoma negative   0.75        0.96     0.84       0.821
                                             Glaucoma positive   0.95        0.69     0.80
                     Using GAN               Glaucoma negative   0.72        1        0.84       0.781
                                             Glaucoma positive   1           0.49     0.66
7 Conclusion This paper addressed the issue of how an imbalanced dataset may affect the final results of a model and the various methods by which one can tackle this problem. The methods used include two different sets of augmentation techniques and GAN.
The datasets produced after augmentation and GAN have shown promising results. By creating synthetic images from existing images, the challenges of limited or imbalanced datasets can be overcome, providing a more robust and diverse training set for deep learning models. In addition to improving model performance, this also helps to reduce overfitting, which occurs when a model is excessively tailored to the training data and results in poor performance on new data. GAN was found to be quite promising for generating new images, but it is time-consuming and computationally expensive compared with the other methods discussed in the paper. Of the two sets of augmentation techniques used, Augmentation 2 gave the best result for this dataset. This might not hold for all datasets, and the best method depends on the individual dataset. Future work can examine how data generated by autoencoders improves the performance of the model. It has also been observed that Generative Adversarial Networks (GANs) require a substantial amount of training data in order to generate high-quality synthetic data. Subsequent research could concentrate on devising GANs that learn effectively from a limited number of examples, either through transfer learning or other methods.
References

1. Barros, D.M.S., Moura, J.C.C., Freire, C.R., et al.: Machine learning applied to retinal image processing for glaucoma detection: review and perspective. BioMed Eng OnLine 19, 20 (2020). https://doi.org/10.1186/s12938-020-00767-2
2. Lee, D.A., Higginbotham, E.J.: Glaucoma and its treatment: a review. Am. J. Health-Syst. Pharm. 62(7), 691–699 (2005). https://doi.org/10.1093/ajhp/62.7.691
3. Sarhan, A., Rokne, J., Alhajj, R.: Glaucoma detection using image processing techniques: a literature review. Comput. Med. Imaging Graph. 78, 101657 (2019). https://doi.org/10.1016/j.compmedimag.2019.101657
4. Singh, P.B., Singh, P., Dev, H.: Optimized convolutional neural network for glaucoma detection with improved optic-cup segmentation. Adv. Eng. Softw. 175, 103328 (2023). https://doi.org/10.1016/j.advengsoft.2022.103328
5. Fu, H., et al.: A retrospective comparison of deep learning to manual annotations for optic disc and optic cup segmentation in fundus photographs. Transl. Vision Sci. Technol. 9, 33 (2020). https://doi.org/10.1167/tvst.9.2.33
6. Singh, A., Dutta, M.K., ParthaSarathi, M., Uher, V., Burget, R.: Image processing based automatic diagnosis of glaucoma using wavelet features of segmented optic disc from fundus image. Comput. Methods Programs Biomed. 124, 108–120 (2016). https://doi.org/10.1016/j.cmpb.2015.10.010
7. Joshi, S., Partibane, B., Hatamleh, W.A., Tarazi, H., Yadav, C.S., Krah, D.: Glaucoma detection using image processing and supervised learning for classification. J. Healthcare Eng. 2022, 12 (2022). Article ID 2988262. https://doi.org/10.1155/2022/2988262
8. Akram, M.U., Tariq, A., Khalid, S., Javed, M.Y., Abbas, S., Yasin, U.U.: Glaucoma detection using novel optic disc localization, hybrid feature set and classification techniques. Australas. Phys. Eng. Sci. Med. 38(4), 643–655 (2015). https://doi.org/10.1007/s13246-015-0377-y
9. Afzal, S., et al.: A data augmentation-based framework to handle class imbalance problem for Alzheimer's stage detection. IEEE Access 7, 115528–115539 (2019). https://doi.org/10.1109/ACCESS.2019.2932786
10. Kotsiantis, S., Kanellopoulos, D., Pintelas, P.: Handling imbalanced datasets: a review. GESTS Int. Trans. Comput. Sci. Eng. 30, 25–36 (2005)
11. Johnson, J.M., Khoshgoftaar, T.M.: Survey on deep learning with class imbalance. J. Big Data 6(1), 1–54 (2019). https://doi.org/10.1186/s40537-019-0192-5
12. van Dyk, D.A., Meng, X.-L.: The art of data augmentation. J. Comput. Graph. Stat. 10(1), 1–50 (2001). https://doi.org/10.1198/10618600152418584
13. Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 1–48 (2019). https://doi.org/10.1186/s40537-019-0197-0
14. Wang, K., Gou, C., Duan, Y., Lin, Y., Zheng, X., Wang, F.-Y.: Generative adversarial networks: introduction and outlook. IEEE/CAA J. Automatica Sinica 4(4), 588–598 (2017). https://doi.org/10.1109/JAS.2017.7510583
15. Bagavathi, C., Lakshmi, S.J., Bhavani, S.: A complete analysis on classification strategies for class imbalanced datasets. In: International Conference on Electrical, Electronics and Communication Technology (ICEECT 2021) (2021)
16. Kumar, K., Sowmya, V., Gopalakrishnan, E.A., Soman, K.P.: Classification of class-imbalanced diabetic retinopathy images using the synthetic data creation by generative models. In: Raj, J.S., Palanisamy, R., Perikos, I., Shi, Y. (eds.) Intelligent Sustainable Systems. LNNS, vol. 213, pp. 15–24. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-2422-3_2
17. Chinnaswamy, Ramakrishnan, S., Sooraj, M.P.: Rough set based variable tolerance attribute selection on high dimensional microarray imbalanced data. Data Enabled Discovery and Applications (2018)
18. Jose, C., Gopakumar, G.: An improved random forest algorithm for classification in an imbalanced dataset. In: 2019 URSI Asia-Pacific Radio Science Conference (AP-RASC), New Delhi, India, pp. 1–4 (2019). https://doi.org/10.23919/URSIAP-RASC.2019.8738232
Identification of Deceptive Clickbait Youtube Videos Using Multimodal Features

Sheikh Sowmen Rahman, Avishek Das, Omar Sharif, and Mohammed Moshiul Hoque(B)

Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong 4349, Bangladesh
[email protected], {avishek,omar.sharif,moshiul_240}@cuet.ac.bd
Abstract. On social media platforms like YouTube, clickbait content has become a standardized method of driving public attention to video content for the creator's benefit. With the gradual standardization of clickbait on YouTube, an increasing trend is observed in the probability of deceptive intentions within the content. Smarter clickbait and masquerading techniques allow fake sensationalized media information to generate public attention. Thus, it is harder for users on video platforms to distinguish between misleading and legitimate videos. This work explores the deception aspect of clickbait on YouTube. It proposes a clickbait classification technique exploiting various DL and ML models on the multimodal features of the collected clickbait videos. This work also developed a YouTube video corpus containing recent clickbait videos and devised techniques for utilizing multimodal features such as video text titles, video thumbnails, and video metadata properties using natural language processing techniques. We have explored standard deep learning techniques such as CNN, LSTM, and Bi-LSTM for textual features and ResNet50, VGG16, and VGG19 for visual feature extraction. In the multimodal approach, the combined model CNN ⊕ ResNet50 achieved the highest F1-score (74.8%) among all multimodal techniques.

Keywords: Natural language processing · Clickbait deception · Clickbait video dataset · Multimodal classification · Clickbait classification
1 Introduction
A clickbait video on YouTube is carefully crafted with an attractive thumbnail and a flashy, catchy title that draw the viewers’ attention. According to recent worldwide usage trends, clickbait videos are often indecent and deceptive. Still, more and more YouTubers use clickbait for their legitimate videos to attract more viewers to their content. This work proposes a learning system to classify YouTube clickbait videos into three classes: decorative, deceptive, and satirical. For example, in Fig. 1a, the title and thumbnail suggest that the person in the video catches something huge while fishing. Upon watching the video, it is seen that the message of the title-thumbnail pair is truthfully delivered. In Fig. 1b, by contrast, the thumbnail shows a man holding a small missile behind a tank; however, the title is misleading because the actual content does not depict any scenario that might be fatal or injurious to this man. Instead, it only displays random scenarios where he blows up the environment using heavy artillery.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 199–208, 2024. https://doi.org/10.1007/978-3-031-50327-6_21
Fig. 1. Clickbait deception classes
The heterogeneity in the combination of text and thumbnail images in YouTube videos makes it hard to determine by manual scrutiny whether a video is worth watching and whether it delivers the promised message. In the last few years, a new mixed class of clickbait has emerged that is neither deceptive nor decorative but rather comical, sarcastic, or satirical, which adds to the overall difficulty of this classification task. This is seen in Fig. 1c, a parody video that mocks the high tolerance of spice and spicy foods by people of Asian ethnicity, compared to people in Western culture who generally prefer less spice. This work explores several machine learning and deep learning approaches using the multi-modal features (text and images) extracted from YouTube videos to develop a working solution. The key contributions of this work are listed below:
– Development of a labeled corpus of 1402 YouTube videos containing video title text, thumbnail images, and metadata properties.
– Investigation of various word embedding techniques, including GloVe, FastText, and the Keras Embedding Layer, with hyper-parameter tuning for textual deception classification of clickbait YouTube videos.
– Development of an appropriate model to identify deceptive content from clickbait YouTube videos by exploiting various deep learning algorithms trained on textual and visual features of the subject videos.
2 Related Work
A few studies in recent years have addressed uni-modal clickbait detection. To identify clickbait posts on Facebook, Rony et al. [1] created a large corpus of 1.67 million posts. They also developed a clickbait detection model using an artificial neural network, achieving an accuracy of 98.3%. Another work built a social bot and browser extension named BaitBuster that detects clickbait floating around the web and briefly explains its action. Sisodia [2] proposed an ensemble learner-based classification model to separate clickbait from genuine article headlines and achieved an accuracy of 91.16%. In the social media space, multi-modal works on clickbait rely on text, video, metadata, and audio entities. A few works have explored several modalities of YouTube videos and proposed clickbait models. These works, however, strictly detect whether or not a video is clickbait. Gamage et al. [3] presented a multi-model architecture that uses a deep learning technique in which six inference models are jointly consulted to make the final classification decision. These models focus on different attributes of the video, including title, comments, thumbnail, tags, video statistics, and audio transcript. They achieved a test accuracy of 98% with an inference time of ≤ 2 s. Mowar et al. [4] introduced a novel model to detect clickbait on YouTube that considers the relationship between video content and title/thumbnail. Their clickbait detection model achieved a high accuracy of 95.38% on the Misleading Video Dataset (MVD) [5]. Huette et al. [6] explored machine learning techniques for clickbait classification and proposed using logistic regression and Naïve Bayes classifiers. They also introduced two attributes to the dataset used in this work to improve the logistic regression accuracy and achieved a test accuracy of 86.2%. Mowar et al. [4], in their two-level classification approach for YouTube clickbait detection and classification, proposed a classification model with two classes of deceptive clickbait videos and achieved an accuracy of 92.5%.
3 Dataset Development Processes
This work considers three clickbait classes: decorative, deceptive, and satirical. Other works [4, 7] have either treated clickbait classification from a purely deceptive stance or used the term ’detection’ interchangeably with ’classification’. We hypothesize that there is more to clickbait than meets the eye regarding deception in its messages.
– Decorative: A clickbait YouTube video is termed decorative if the underlying content of the video carries the same summary as hinted at in the title and thumbnail of the video. The video is deliberately designed by its creators to attract the attention of its target audience. However, it is not misleading and does not frustrate the user after viewing.
– Deceptive: A video is termed deceptive if the content it delivers does not match in veracity with the thumbnail and title. As a result, users who clicked to fill the curiosity gap created by the title and thumbnail will often feel frustrated and betrayed because the video does not deliver on its promise, causing the YouTube platform to lose the trust of its consumers.
– Satirical: A video is termed satirical if it falls into any other category that is not misleading due to its ’baiting’ property. A satirical clickbait video is characteristically different from regular clickbait videos as it delivers a diverted version of reality, a partially accurate message that imitates the truth. The message may be rhetorical, comical, or a distorted message relating to specific real-life incidents.
Figure 2 depicts the process of dataset development.
Fig. 2. Dataset development process
The collected video information includes the title, description, thumbnail, and metadata such as views, likes, and the number of comments. Approximately 1402 videos were manually collected from YouTube, and their details were fetched using the YouTube REST API (v3). The title and thumbnail images were extracted for further processing in their separate modalities. The whole corpus was labeled manually, and majority voting was used to assign the initial label: the label chosen by the largest number of annotators was selected as the initial label for each item. The initial annotation was performed by three undergraduate students with computer engineering, business administration, and English backgrounds. An expert academician who has worked on NLP for several years manually verified each annotator’s labeling. The initial label assigned by the annotators was taken as final if it matched the expert’s label; otherwise, the expert corrected the label. We measured the agreement between annotators in assigning classes using Cohen’s kappa [8]. The average kappa value for our dataset is 0.792, which corresponds to substantial agreement on the commonly used kappa interpretation scale (values above 0.80 indicate almost perfect agreement). Table 1 summarizes the documents in the training and test sets. In the training set, the decorative class contains a total of 24971 words, of which 5876 are unique. In the deceptive class, 2969 of 8653 words are unique. In the satirical class, 1394 of 2880 words are unique.
Table 1. Document-level semantic statistics
Type | Class | Total documents | Total words | Unique words
Train | Decorative | 4686 | 24971 | 5876
Train | Deceptive | 1666 | 8653 | 2969
Train | Satirical | 576 | 2880 | 1394
Test | Decorative | 815 | 4287 | 2123
Test | Deceptive | 300 | 1446 | 903
Test | Satirical | 108 | 488 | 378
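The labeling procedure described above (a majority vote over three annotators, followed by an agreement check with Cohen’s kappa) can be sketched as follows. This is a minimal illustration rather than the authors’ script: the function names and the toy annotations are hypothetical, and averaging kappa over annotator pairs is an assumption, since the text only reports an average value.

```python
from collections import Counter
from itertools import combinations


def majority_label(votes):
    """Return the most frequent label, or None when the top labels are tied."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no majority; such items would go to the expert
    return counts[0][0]


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators labeling the same items."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of the two annotators' label proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def average_pairwise_kappa(annotations):
    """Mean kappa over all annotator pairs (assumed averaging scheme)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy annotations for six videos by three annotators.
ann1 = ["decorative", "deceptive", "satirical", "decorative", "deceptive", "decorative"]
ann2 = ["decorative", "deceptive", "decorative", "decorative", "deceptive", "decorative"]
ann3 = ["decorative", "satirical", "satirical", "decorative", "deceptive", "deceptive"]

initial_labels = [majority_label(v) for v in zip(ann1, ann2, ann3)]
mean_kappa = average_pairwise_kappa([ann1, ann2, ann3])
```

Items for which `majority_label` returns `None` have no majority and would be the natural candidates for the expert review step.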
4 Clickbait Identification Techniques
Figure 3 shows the overall architecture of the proposed framework for YouTube clickbait video deception classification. Model-1 (ResNet50) accepts preprocessed thumbnails as input and provides the semantic expression of the visual part by extracting suitable features. Model-2 (CNN) accepts preprocessed texts as input and provides suitable features for further use. Table 2 shows the hyperparameter space explored and the values selected for Model-2 (CNN). The input to Model-1 is an image of dimensions (112, 112, 3). Model-1 has 48 convolutional layers, one max-pooling layer, and one average-pooling layer. The convolution and max-pool layers are consistently arranged throughout the whole architecture. The conv-1 block consists of one Conv layer with 64 filters of size 7 × 7; the conv-2 block has 9 Conv layers, consisting of three 1 × 1 and three 3 × 3 Conv layers with 64 filters each and three 1 × 1 Conv layers with 256 filters; the conv-3 block has 12 Conv layers, consisting of four 1 × 1 and four 3 × 3 Conv layers with 128 filters each and four 1 × 1 Conv layers with 512 filters. The conv-4 and conv-5 blocks follow the same layering pattern but with 18 and 9 Conv layers, respectively, and their filter counts vary from 256 to 2048 with window sizes of 1 × 1 and 3 × 3. The Conv blocks are connected sequentially, and ReLU (Rectified Linear Unit) activation is applied to each layer so that negative values are set to zero before being passed to the next layer. While training the model, we used the ‘Adam’ optimizer with a learning rate of 0.001 to minimize the loss. The model is trained using the ‘categorical_crossentropy’ loss function because the task is multi-class classification, and the softmax activation function is used in the dense layer.
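To make the roles of ReLU, softmax, and categorical cross-entropy in this setup concrete, the following plain-Python sketch evaluates them for a single three-class example. The logit values are hypothetical and the helper names are illustrative; a deep learning framework would compute the same quantities in batched form.

```python
import math

CLASSES = ["decorative", "deceptive", "satirical"]


def relu(x):
    """ReLU: negative activations become zero, positive ones pass through."""
    return max(0.0, x)


def softmax(logits):
    """Turn raw scores into class probabilities (shifted for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Loss for one sample: negative log-probability of the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Hypothetical dense-layer outputs for one thumbnail.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
predicted = CLASSES[probs.index(max(probs))]

# The loss is small when the true class (here "decorative") gets high probability.
loss = categorical_crossentropy([1, 0, 0], probs)
```

Because the true label is one-hot, only the probability assigned to the correct class contributes to the loss.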
Fig. 3. Proposed framework of clickbait identification
Table 2. Hyperparameters for CNN model
Hyperparameters | Hyperparameter space | CNN
Filter size | 3, 5, 7 | 5
Filters count | 32, 64, 128 | 64
Pooling type | max, average | max
Embedding dimensions | 100, 300 | 300
Batch size | 32, 64, 128, 256 | 256
Activation function | ReLU, Softmax, Tanh | ReLU
Optimizer | Adam, SGD, RMSProp, Adagrad | Adam
Learning rate | 1e−4, 1e−3, 1e−2, 1e−1 | 1e−4
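The search space in Table 2 can be written down as a small configuration object, which also makes its size explicit. The sketch below enumerates every combination; the text does not say how the space was searched, so the exhaustive grid and the dictionary key names are assumptions made for illustration.

```python
from itertools import product

# Hyperparameter space as listed in Table 2 (key names are illustrative).
SEARCH_SPACE = {
    "filter_size": [3, 5, 7],
    "filters": [32, 64, 128],
    "pooling": ["max", "average"],
    "embedding_dim": [100, 300],
    "batch_size": [32, 64, 128, 256],
    "activation": ["relu", "softmax", "tanh"],
    "optimizer": ["adam", "sgd", "rmsprop", "adagrad"],
    "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1],
}

# Values reported in the "CNN" column of Table 2.
SELECTED = {
    "filter_size": 5,
    "filters": 64,
    "pooling": "max",
    "embedding_dim": 300,
    "batch_size": 256,
    "activation": "relu",
    "optimizer": "adam",
    "learning_rate": 1e-4,
}


def iter_configs(space):
    """Yield one dict per point of the full grid over the search space."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


n_configs = sum(1 for _ in iter_configs(SEARCH_SPACE))
```

A full grid over this space contains several thousand configurations, so in practice a random or staged search over a subset may be the more realistic option.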
Model-2 accepts preprocessed text as input in the form of a vectorized, padded sequence. In the embedding layer, the Keras embedding technique is used. For the Keras embedding, we used 64 as the output dimension and 2000 as the input dimension, as we limited the tokenization to the top 2000 words in the vocabulary index. The input length is 65, the maximum length of a text caption in the whole corpus. In the convolution layer, the kernel size is 7 with 64 filters, and the activation function is ReLU. Max-pooling is applied after the convolution layer. We connect this convolutional block to another similar convolutional layer with ReLU activation, followed by global max pooling. A dropout layer is introduced after that. Thirty-two neurons are used in the dense layer with the ReLU activation function. Another dense layer with three neurons, one for each class, and the ‘softmax’ activation function produces the final output. The model is trained using the ‘categorical_crossentropy’ loss function and the ‘adam’ optimizer. The outputs from the dense layers of Model-1 and Model-2 are concatenated to obtain the multi-modal model output. Finally, a dense layer with three neurons and the softmax activation function produces the final output of the multi-modal model.
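The text input described above (a vocabulary limited to the top 2000 words and sequences padded to length 65) can be illustrated with a small plain-Python sketch. It is not the Keras tokenizer itself: the whitespace tokenization, the reservation of index 0 for padding, and padding at the end of the sequence are assumptions, and the example titles are hypothetical.

```python
from collections import Counter

VOCAB_SIZE = 2000  # tokenization limited to the top words, as stated in the text
MAX_LEN = 65       # maximum title length in the corpus, as stated in the text


def build_vocab(titles, vocab_size=VOCAB_SIZE):
    """Map the most frequent words to integer ids; id 0 is kept for padding."""
    counts = Counter(word for title in titles for word in title.lower().split())
    most_common = counts.most_common(vocab_size - 1)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=1)}


def vectorize(title, vocab, max_len=MAX_LEN):
    """Convert a title to a fixed-length id sequence, dropping unknown words."""
    ids = [vocab[w] for w in title.lower().split() if w in vocab]
    ids = ids[:max_len]                      # truncate overly long titles
    return ids + [0] * (max_len - len(ids))  # pad with zeros at the end


# Hypothetical titles standing in for the corpus.
titles = ["You won't believe this catch", "This catch is huge"]
vocab = build_vocab(titles)
sequence = vectorize("this catch zzz", vocab)
```

Every title then has the same length, which is what allows the sequences to be stacked into a single input array for the embedding layer.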
5 Results
The weighted F1-score determines the models’ superiority. Other evaluation criteria, such as precision (P), accuracy (A), and recall (R), are also considered to better understand the models’ performance. Table 3 shows the results on the test set using only textual features. Features are extracted by the Keras embedding layer, GloVe, and FastText embeddings.
Table 3. Results of textual modality
Modality | Embedding | Classifier | A | P | R | F1-score
Textual | Keras | Bi-LSTM | 0.676 | 0.666 | 0.666 | 0.686
Textual | Keras | CNN | 0.716 | 0.757 | 0.680 | 0.729
Textual | TF-IDF | Bi-LSTM | 0.634 | 0.672 | 0.729 | 0.699
Textual | TF-IDF | CNN | 0.671 | 0.673 | 0.667 | 0.669
Textual | FastText (CBOW) | Bi-LSTM | 0.665 | 0.671 | 0.671 | 0.671
Textual | FastText (CBOW) | CNN | 0.798 | 0.799 | 0.799 | 0.796
Textual | GloVe | Bi-LSTM | 0.714 | 0.714 | 0.619 | 0.670
Textual | GloVe | CNN | 0.746 | 0.885 | 0.781 | 0.787
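Since the weighted F1-score is the main selection criterion, a short sketch of how weighted precision, recall, and F1 can be computed from predictions may be useful. The function below uses the usual support-weighted definition (per-class scores weighted by the number of true instances of each class); that the authors used exactly this definition is an assumption, and the label lists are toy data.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall, weighted F1)."""
    n = len(y_true)
    support = Counter(y_true)  # number of true instances per class
    w_precision = w_recall = w_f1 = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        precision = tp / predicted if predicted else 0.0
        recall = tp / count
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = count / n  # class weight = share of true instances
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return accuracy, w_precision, w_recall, w_f1


# Toy labels; "a" and "b" stand in for two of the clickbait classes.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
```

Weighting by support means that the majority class dominates the score, which matters for an imbalanced corpus such as this one.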
Results revealed that the CNN model achieved the highest F1-score of 0.796 using FastText embedding with the CBOW embedding strategy. Moreover, using the other two embeddings, GloVe and Keras, CNN obtained F1-scores of 0.787 and 0.729, respectively. The Bi-LSTM models performed worse than the CNN models with each of these three embedding techniques. Table 4 presents the visual modality results, indicating that ResNet50 is better than VGG16 and VGG19 according to the weighted average F1-score. ResNet50 also has markedly higher P, R, and A measures. The image-based models proved superior to the text-based models, with the highest accuracy of 0.872, a precision of 0.855, a recall of 0.876, and an F1-score of 0.899 based on a weighted average. Table 5 exhibits the evaluation results of the multi-modal approaches. The proposed multi-modal approach that utilizes the CNN ⊕ ResNet50 model, where ⊕ denotes the concatenation operation between the two models, obtained a precision of 0.76 (decorative class), 0.54 (deceptive class), and 0.55 (satirical class), with a weighted average precision of 0.793. It also obtained a weighted average recall of 0.708 and an F1-score of 0.748, the highest among all multi-modal models.
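The concatenation-based fusion used by the multi-modal models can be sketched as a toy forward pass: the feature vectors of the textual and visual branches are joined end to end and passed through a three-neuron softmax layer. The feature sizes, random weights, and function names below are hypothetical placeholders, not trained values from the paper.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one weighted sum per output neuron."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Normalize scores into probabilities (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(text_features, image_features, weights, bias):
    """Concatenate both modalities, then apply a 3-way softmax classifier."""
    fused = text_features + image_features  # list concatenation
    return softmax(dense(fused, weights, bias))


rng = random.Random(0)
text_features = [rng.random() for _ in range(32)]   # e.g. output of the textual dense layer
image_features = [rng.random() for _ in range(32)]  # hypothetical visual feature size
weights = [[rng.uniform(-0.1, 0.1) for _ in range(64)] for _ in range(3)]
bias = [0.0, 0.0, 0.0]

probabilities = fuse_and_classify(text_features, image_features, weights, bias)
```

In a trained model the weights of this final layer are learned jointly with both branches, so the classifier can weigh textual and visual evidence against each other.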
Table 4. Results on visual modality
Modality | Classifier | A | P | R | F1-score
Visual | VGG16 | 78.97 | 79.2 | 77.59 | 78.38
Visual | VGG19 | 67.68 | 70.3 | 66.16 | 68.15
Visual | ResNet50 | 87.24 | 85.52 | 87.63 | 89.89
Table 5. Results of multi-modal approaches
Embedding | Models | A | P | R | F1-score
Keras | Bi-LSTM ⊕ VGG16 | 72.13 | 66.63 | 67.12 | 66.87
Keras | Bi-LSTM ⊕ VGG19 | 71.49 | 64.21 | 67.54 | 65.83
Keras | Bi-LSTM ⊕ ResNet50 | 69.13 | 57.22 | 57.23 | 57.22
Keras | CNN ⊕ VGG16 | 68.61 | 60.59 | 60.51 | 60.55
Keras | CNN ⊕ VGG19 | 69.88 | 54.31 | 49.67 | 51.89
Keras | CNN ⊕ ResNet50 | 73.44 | 62.15 | 63.89 | 63.01
FastText (CBOW) | Bi-LSTM ⊕ VGG16 | 57.62 | 67.17 | 67.88 | 67.52
FastText (CBOW) | Bi-LSTM ⊕ VGG19 | 54.11 | 64.43 | 64.76 | 64.59
FastText (CBOW) | Bi-LSTM ⊕ ResNet50 | 55.49 | 67.62 | 66.37 | 66.99
FastText (CBOW) | CNN ⊕ VGG16 | 71.34 | 70.44 | 70.54 | 70.49
FastText (CBOW) | CNN ⊕ VGG19 | 72.08 | 68.77 | 68.76 | 68.76
FastText (CBOW) | CNN ⊕ ResNet50 | 75.77 | 79.33 | 70.77 | 74.81
GloVe | Bi-LSTM ⊕ VGG16 | 65.62 | 67.23 | 67.22 | 67.22
GloVe | Bi-LSTM ⊕ VGG19 | 74.39 | 63.76 | 61.98 | 62.86
GloVe | Bi-LSTM ⊕ ResNet50 | 68.49 | 64.11 | 68.46 | 66.21
GloVe | CNN ⊕ VGG16 | 71.63 | 69.14 | 73.07 | 71.05
GloVe | CNN ⊕ VGG19 | 70.01 | 53.88 | 53.65 | 53.76
GloVe | CNN ⊕ ResNet50 | 74.89 | 59.42 | 60.77 | 60.09
Table 6 shows the class-wise performance of the best models from the textual, visual, and multi-modal approaches.
5.1 Comparisons with Baselines
To validate the proposed model (CNN ⊕ ResNet50), we compared its performance with existing clickbait classification techniques. The results revealed that the proposed model achieved a higher F1-score than the previous techniques. Table 7 summarizes the comparison between the existing and proposed models.
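Table 6. Class-wise results of best visual, textual and multi-modal approaches
Approach | Model | Class | P | R | F1-score
Visual | ResNet50 | Decorative | 0.94 | 0.91 | 0.92
Visual | ResNet50 | Deceptive | 0.69 | 0.61 | 0.65
Visual | ResNet50 | Satirical | 0.52 | 0.55 | 0.53
Textual | CNN (FastText) | Decorative | 0.83 | 0.93 | 0.88
Textual | CNN (FastText) | Deceptive | 0.75 | 0.56 | 0.64
Textual | CNN (FastText) | Satirical | 0.58 | 0.50 | 0.54
Multi-modal | CNN ⊕ ResNet50 | Decorative | 0.76 | 0.90 | 0.82
Multi-modal | CNN ⊕ ResNet50 | Deceptive | 0.54 | 0.39 | 0.45
Multi-modal | CNN ⊕ ResNet50 | Satirical | 0.55 | 0.20 | 0.29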
Table 7. Performance comparison with previous approaches
Methods | Classes | Result (F1-score)
Mowar et al. [4] | 2 | 0.73
Pujahari et al. [7] | 2 | 0.60
Proposed Model | 3 | 0.74
Models: Stacked 6-model, RF (metadata), SVM (text), and RF (features, metadata) for the previous approaches; CNN ⊕ ResNet50 for the proposed model.
6 Conclusion
This work presented a deep learning-based system to detect the class of deception in clickbait YouTube videos by exploiting multi-modal features, namely the textual and visual elements of the videos. The results showed that, among the textual models, the CNN with FastText word embeddings achieved the best results, with an accuracy of 0.798 and an F1-score of 0.796. The visual model (ResNet50) outperformed the textual model, scoring an accuracy of 0.872 and an F1-score of 0.899. The multi-modal model CNN ⊕ ResNet50, formed by concatenating the textual model and the pretrained image model, had an accuracy of approximately 0.758 and an F1-score of approximately 0.748. We noticed that the textual model suffered from high misclassification for the satirical class, resulting from the lack of proper balance in the collected dataset. The metrics of the multi-modal model dropped somewhat after aggregating the textual and visual features. In the future, we plan to explore a few more classes for finer identification. A secondary direction for improvement would be experimenting with more recent deep learning techniques such as transformers, GANs, and auto-encoders, which use large, sophisticated models to reconstruct output from the input.
References
1. Rony, M.M.U., Hassan, N., Yousuf, M.: Diving deep into clickbaits: who use them to what extents in which topics with what effects? In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, pp. 232–239 (2017)
2. Sisodia, D.S.: Ensemble learning approach for clickbait detection using article headline features (2019). www.informingscience.org/Publications/4279
3. Gamage, B., Labib, A., Joomun, A., Lim, C.H., Wong, K.: Baitradar: a multi-model clickbait detection algorithm using deep learning. In: ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2665–2669. IEEE (2021)
4. Mowar, P., Jain, M., Goel, R., Vishwakarma, D.K.: Clickbait in Youtube prevention, detection and analysis of the bait using ensemble learning. ArXiv abs/2112.08611 (2021)
5. Varshney, D., Vishwakarma, D.K.: A unified approach for detection of clickbait videos on youtube using cognitive evidences (2021). link.springer.com/article/10.1007/s10489-020-02057-9
6. Huette, J., Al-khassaweneh, M.A., Oakley, J.: Using machine learning techniques for clickbait classification. In: 2022 IEEE International Conference on Electro Information Technology (eIT), pp. 091–095 (2022)
7. Pujahari, A., Sisodia, D.S.: Clickbait detection using multiple categorisation techniques. J. Inf. Sci. 47(1), 118–128 (2021). https://doi.org/10.1177/0165551519871822
8. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measure. 20(1), 37–46 (1960). https://doi.org/10.1177/001316446002000104
Perception and Knowledge of South African Creatives with Regards to Crypto Art, NFTs, and Crypto Art Platforms
Siyanda Andrew Xaba1(B), Xing Fang1, and Dhaneshwar Shah2
1 Wuhan University of Technology, 205 Luoshi Road, Wuhan 430070, China
[email protected]
2 Springer Heidelberg, Tiergartenstr. 17, 69121 Heidelberg, Germany
Abstract. Worldwide, the creative economy has given great attention to crypto art, with auction houses like Christie’s and Sotheby’s selling some of it for high sums. At Christie’s, Michael Joseph Winkelmann, also known as Beeple, had his NFT auctioned and sold for $69 million. The exploration of this novel environment, which seems to provide limitless opportunities and possibilities, has piqued the imagination of artists all over the world. But there is skepticism about crypto art. Since crypto art is still a relatively new phenomenon among artists, there are still many questions about it, especially in the South African art scene. Our study examines South African creatives’ perceptions and knowledge of crypto art, NFTs, and crypto art platforms. We employed a quantitative methodology and used questionnaires to gather data. The questionnaire was sent to all selected participants, of whom 55 responded. The data were analyzed with SPSS software and presented using charts. The majority of participants in the survey are familiar with NFTs and crypto art; however, they are less knowledgeable about platforms for crypto art. The study also reveals that the majority of South African artists are not engaged in the crypto art scene and are unaware of its economics.
Keywords: Creatives · Crypto art · Crypto art platforms · NFTs · South Africa
1 Introduction
Originally associated with the rise of cryptocurrencies such as Bitcoin and Ethereum, crypto art is now associated with any digital art that has been tokenized on the blockchain in order to attach a digital proof of ownership. Franceschet et al. [1] define crypto art as limited-edition collectible digital art that is cryptographically registered with a token on a blockchain. McAvoy and Kidd [2] define crypto art as an art trend inherent to the internet and the blockchain. It is a recent art movement in which creatives use blockchain technology to create still or animated visual works that are shown at crypto art exhibitions or on their own digital channels [2]. As a result of artists making significant money from selling their NFTs, crypto art has impacted the creative economy and sparked global attention. With the introduction of NFTs, it is now possible to authenticate the copyright of digital works [3]. Obtaining copyright protection for digitally created works was challenging in traditional art marketplaces, but with NFTs, creatives can have their works validated through a minting process that converts uploaded digital works into NFTs. As a result, NFTs are regarded as rare collectibles that are highly coveted, which accounts for their high prices. In its current state, the blockchain and cryptocurrency ecosystem is difficult for the typical person to understand [4]. Concerns about energy usage, security, and legality, together with the prevalent notion that cryptocurrency is a hoax, are slowing widespread acceptance [4]. The goal of our research is to learn about South African creatives’ perceptions and knowledge of crypto art, NFTs, and crypto art platforms.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 209–216, 2024. https://doi.org/10.1007/978-3-031-50327-6_22
2 Methodology
The study looked at crypto art platforms, NFTs, and crypto art. Our survey aimed to find out how knowledgeable South African designers and artists are about crypto art platforms, NFTs, and crypto art. We employed a quantitative approach to accomplish this goal. An online survey was created using Google Drive and sent to digital, graphic, and visual artists. The chosen participants were emerging and established visual, graphic, and digital artists from many categories. We received responses from 55 participants and used SPSS to analyze the data.
3 Findings 3.1 Knowledge About Crypto Art, NFTs, and Crypto Art Platforms
Fig. 1. Knowledge about crypto art, NFTs, and crypto art platforms
Participants were asked whether they know what crypto art is and could only answer “yes” or “no”. According to the study, 66% of participants are aware of what crypto art is, and 34% are not. Given the level of interest that crypto art has generated in the creative community, it is not surprising that a large majority of participants are aware of it. Gutierrez et al. [5] support this claim by noting that total sales of NFTs increased from USD 183,121 to USD 38 million in 2021, indicating a rise in their popularity. Participants also responded positively when asked whether they understood what NFTs are, again answering only “yes” or “no”: 76% said yes, they know about NFTs, whereas 23% said no. This result is not unexpected given how NFTs have affected the art industry and how even non-artists are now selling them. A select few artists and designers have had their creations sold at high prices in Christie’s and Sotheby’s auctions. NFTs are generating so much interest that media outlets and academic journals are writing about them. We compiled a list of crypto art platforms and asked participants, with a yes or no answer, whether they were familiar with any of them. Our survey reveals that, across all of the listed crypto art platforms, 68% of participants selected “no”. This suggests that while participants are knowledgeable about crypto art and NFTs, they are less knowledgeable about the platforms used to buy and sell NFTs. It might imply that the sample’s participants have no interest in entering the NFT sector. More education and awareness are required to better inform South African artists about crypto art platforms and markets (Fig. 1).
3.2 Use of Crypto Art Platforms
Fig. 2. Use of crypto art platforms
We compiled a list of available crypto art platforms and asked participants which one they were currently using. Only 9% of participants in the sample use OpenSea and 7% use Nifty Gateway, while 82% don’t use any crypto art platform. This is not surprising given that the majority of the sample responded negatively to our prior question about their familiarity with crypto art platforms. It demonstrates that South African artists are not particularly active in the crypto art scene. Given that there have been several frauds and scams involving these platforms, it’s possible that they are hesitant to use them. Another possibility is that they regard these platforms as pyramid schemes that are likely to fail. According to [6], severe digital art scams include wallet hacking, illegal NFT production, and copyright violations (Fig. 2).
3.3 Crypto Art Comprehension and Perception
In this section, we wrote a series of statements about crypto art to find out whether the participants have any knowledge of crypto art. Participants had to respond with strongly agree, agree, neutral, disagree, or strongly disagree (Fig. 3). We wrote down the following statements:
Fig. 3. Crypto art comprehension and perception
1. Crypto art is a digital artwork that is a type of an NFT. The majority agreed with this assertion, while some chose to remain neutral. This demonstrates that participants are aware of what crypto art is, but the study also reveals that a sizable portion of them is unsure.
2. Crypto art remains reproducible and visible to all, but only the purchasing collector has ownership of the artwork. The majority of participants remained neutral about this statement. Participants may not have done any research to familiarize themselves with crypto art, or they may not have understood the statement.
3. Crypto art is preserved on the blockchain in the form of NFT’s. While a sizable portion of participants agreed, the majority opted to remain neutral. These findings imply that participants had only a cursory understanding of crypto art rather than in-depth expertise.
4. Crypto art is not real art. The majority of participants elected to remain neutral about this statement, with only a tiny percentage strongly disagreeing. This finding implies that South African artists cannot categorize crypto art as a type of art, since they did not take a position. South African artists need to be well informed about what crypto art is.
5. It is difficult to sell crypto art. The majority of participants remained neutral, although a small percentage agreed. This outcome is hardly unexpected given that the majority of participants claimed to have no prior experience with crypto art platforms. It is also important to note that this claim is subjective, making it challenging to give a definitive answer.
6. I have no interest in crypto art. While the majority of participants opted to remain neutral, a sizable portion strongly agreed. Those who strongly agree with this statement might be wary of this technology because of reports of fraud and scams, as well as the perception that it is a pyramid scheme. They might also consider it to be a complicated piece of technology.
3.4 NFT Comprehension and Perception
See Fig. 4. In this section, we wrote a series of statements in order to determine participants’ comprehension and perception of NFTs. Participants had to respond with strongly agree, agree, neutral, disagree, or strongly disagree. We wrote down the following statements:
Fig. 4. NFT comprehension and perception
1. NFT’s are used to create verified digital ownership and are used in applications that offer crypto art. A large portion of participants strongly agreed with this statement, and the majority agreed with it. This suggests that the South African artists in this sample are aware of NFTs.
2. Only digital art can be sold as an NFT. While a sizable portion of the participants disagreed with this assertion, the majority elected to remain neutral. These findings indicate that the sample’s creatives are unsure whether tangible works of art may be sold as NFTs. However, it is encouraging that some participants are aware that selling physical artwork as an NFT is a possibility. This only works if artists provide images of their actual work; the image becomes the NFT. When selling the NFT of a physical work, creatives can choose to inform the collector that they will also send the physical piece.
3. Minting an NFT is costly. The majority of participants chose to be neutral about this statement, while just a small percentage strongly agreed. Because participants are not frequent users of crypto art platforms, the fact that a large percentage of them remained neutral is not surprising.
4. Anyone can see and download digital art but with NFT, only one can prove ownership. The majority of participants agreed with this statement, although a sizable portion remained neutral. Given that they are not active in the NFT community, it is surprising that the participants agree with this statement. It could imply that they heard it in NFT-related YouTube videos or read it somewhere.
5. I have no interest in NFT’s. Only a small portion of participants agreed with this statement, while a sizable majority chose to remain neutral. This outcome could be caused by lack of knowledge, news of fraud and scams, and other factors. It is also possible that the findings result from allegations that NFTs are pyramid schemes.
6. I can easily make money out of NFT’s. The majority of participants remained neutral on this statement, while just a tiny percentage agreed. Since participants explicitly said that they were not part of the NFT community, this result was expected; to be fair, it is also not an easy statement to agree with. The NFT market is not a get-rich-quick plan; in order to succeed, one must engage in marketing and cultivate a base of devoted followers.
3.5 Crypto Art Platform Comprehension and Perception See Fig. 5. In this section, we presented a series of statements about crypto art platforms to determine whether the participants have any knowledge of them.
Fig. 5. Crypto art platform comprehension and perception
Participants had to respond with strongly agree, agree, neutral, disagree, or strongly disagree. We presented the following statements: 1. Creatives can promote their art on crypto art platforms. A large percentage of participants agreed with this statement, while a considerable proportion chose to remain neutral. Although participants are not active on crypto art platforms, much of the hype surrounding crypto art is about new creatives making a name for themselves and creatives having their work sold at high prices. 2. Creatives can sell their art on crypto art platforms. The majority of participants agreed with this statement, and a considerable number strongly agreed with it. It is unsurprising that participants agree with this notion, given that the hype around crypto art centers on creatives selling their work and making a living from this new market. Much of the news surrounding crypto art revolves around how artists are able to sell their work for large sums of money, which has sparked a lot of interest in the creative industry. 3. Creatives can gain exposure through crypto art platforms. The majority of participants agreed with this statement, and a considerable percentage strongly agreed with it. Participants agree with this notion because many emerging creatives have found fresh success on these platforms. Michael Joseph Winkelmann, popularly known as Beeple, is an example of a designer who went unnoticed until his NFT was auctioned and sold for $69 million at Christie’s. This story went viral and sparked a lot of interest in NFTs among creatives. Artists like Lethabo Huma from South Africa and Osinachi from Nigeria, who had struggled to build a name for themselves in the traditional visual art market, have found fresh success in NFT-based technologies [7]. 4. You can sell physical art on crypto art platforms.
Regarding this statement, the majority of participants chose to remain neutral, while a small number strongly agreed with it. Participants have shown throughout this survey that they are not active on crypto art platforms, which is why the majority of them elected to be neutral. It is worth noting that some participants strongly agreed with this assertion. Physical art can be sold on crypto art platforms, but in order to make it an NFT, creators must photograph the physical work and mint the photograph [8]. This is frequently followed by the creatives informing the collector that they can also ship the work to them. 5. I find it complex to operate crypto art platforms. The majority of participants elected to remain neutral on this statement, although a sizable proportion agreed. This is hardly surprising given that the participants are not active on crypto art platforms. However, some participants agreed with this assertion. McConaghy et al. [9]
claim that collecting physical art is simple. It is possible that these participants had trouble setting up their cryptocurrency accounts. 6. It is costly to have your work on crypto art platforms. A large proportion of participants elected to remain neutral, while just a small proportion agreed with this assertion. Again, as demonstrated by prior questions, the majority of creatives are not participating in the crypto art market; hence, it is not surprising that the majority chose to be neutral on this statement. The small number of participants who agreed may have prior experience with crypto art and have encountered these fees.
4 Conclusion Our study aimed to investigate South African creatives’ perceptions and knowledge about crypto art, NFTs, and crypto art platforms. According to our analysis, the majority of participants (66%) have some understanding of crypto art and NFTs, although the majority (68%) are unaware of any of the crypto art platforms identified in the study. We also observed that the majority of participants (82%) had no prior experience with crypto art platforms. The majority of participants opted to remain neutral in the bulk of the offered statements, which led us to think that participants only had a basic understanding of crypto art, NFTs, and crypto art platforms. The study showed that there are South African creators who use crypto art platforms to promote their crypto art, notwithstanding their small quantity. These artists may educate South African artists on how to prevent fraud and scams and impart their understanding of crypto art. The study advises South African creatives to educate themselves and seek out information from individuals who are more knowledgeable about crypto art.
References 1. Franceschet, M., Hernandez, S., Colavizza, G.: Crypto art: A decentralized view. Leonardo 54(4), 402–405 (2021) 2. McAvoy, E, Kidd, J.N.: Crypto art and questions of value: a review of emergent issues. Creative Indus. Policy Evidence Centre 1–50 (2022) 3. Wang, S.: Crypto art: NFT art trading and the art market. Asian J. Soc. Sci. Stud. 7(10), 14–17 (2022) 4. White, B., Mahanti, A., Passi, K.: Characterizing the OpenSea NFT marketplace. WWW’22: Companion Proceedings of the Web Conference, 488–496. (2022) 5. Gutierrez, C., Gaitan, S.P., Jaramillo, D., Velasquez, S.: The NFT hype: what draws attention to non-fungible tokens? Mathematics 10(3), 1–13 (2022) 6. Radermecker, A.S.V., Ginsburgh, V.: Questioning the NFT “Revolution” within the art ecosystem. Arts 12(25), 1–17 (2023) 7. Notaro, A.: All that is solid melts in the ethereum: the brave new (Art) World of NFTs. J. Visual Art Practice 21, 359–382 (2021) 8. Gunduz, C.S., Eryilmaz, G.M., Yazici, Y.: Crypto art perception of university students within the scope of Nft-Blockchain platform. IDIL 90, 261–273 (2022) 9. McConaghy, M., McMullen, G., Parry, G., McConaghy, Holtzman, D.: Visibility and digital art: blockchain as an ownership layer on the internet. Wiley Online Library 26(5), 461–470 (2017)
10. Kapoor, A., Guhathakurta, D., Mathur, M., Yadav, R., Gupta, M., Kumaraguru, P.: TweetBoost: influence of social media on NFT valuation. WWW ’22: Companion Proceedings of the Web Conference, 621–629. (2022)
Automated Bone Age Assessment Using Deep Learning with Attention Module Maisha Fahmida1 , Md. Khaliluzzaman2 , Syed Md. Minhaz Hossain3 , and Kaushik Deb3(B) 1 Department of Computer Science and Engineering, Chittagong University of Engineering and
Technology (CUET), Chattogram 4349, Bangladesh [email protected] 2 Department of Computer Science and Engineering, International Islamic University Chittagong, Chattogram 4318, Bangladesh [email protected] 3 Department of Computer Science and Engineering, Premier University, Chattogram 4000, Bangladesh [email protected], [email protected]
Abstract. Bone age assessment (BAA) based on hand X-ray images is used by pediatricians to measure the growth of children, predict their final height, and diagnose some diseases. In addition, it may be used for forensic, sports, and other legal purposes. The conventional manual approach, however, takes considerable time and is prone to observer variability. This study proposes an end-to-end BAA method based on a Convolutional Neural Network (CNN). The proposed model contains three separate modules: a transferred InceptionV3 module, an attention module, and a regression module. Initially, a pretrained InceptionV3 model is used for initialization. Then four different attention modules (channel attention, spatial attention, squeeze and excitation attention, and channel attention with spatial attention) are used separately at the end of the InceptionV3 module in order to choose the best module for bone age assessment. The attention module emphasizes the most important features to improve the performance of the proposed model. Finally, a regression module is introduced to estimate the bone age. Experiments with the squeeze and excitation attention model on the Digital Hand Atlas (DHA) dataset show that it outperforms the baseline models. Keywords: Bone age assessment (BAA) · Deep learning · Transfer learning · Attention module · Convolutional neural network (CNN)
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 217–226, 2024. https://doi.org/10.1007/978-3-031-50327-6_23
1 Introduction
Pediatric radiology and endocrinology departments all around the world use the term bone age to define skeletal growth for both medical and non-medical purposes. For a more accurate assessment of bone age, X-rays of the hand and wrist are most frequently employed. The description of shape and changes in arrangement of different hand and
wrist bone segments provides crucial factors for evaluating biological bone maturity and thus bone age. Currently, two common and effective methods are used in medical practice to determine bone age: the Greulich and Pyle (GP) method and the Tanner-Whitehouse (TW) method. However, both methods are time consuming and require an expert, and intra-observer as well as inter-observer variability may negatively affect how accurately bone age is determined [1]. Fortunately, computer-aided diagnosis (CAD) systems combined with deep learning offer a potential method for automatically determining a child’s bone age [2]. Spampinato et al. [5] used a deformation layer to apply transformations and to compute nonrigid geometric deformations. However, the deformation layer is computationally expensive when used in hardware. Using the non-subsampled contourlet transform (NSCT), Chen et al. [6] proposed a two-stage deep learning network in which a variety of factors could affect the overall performance. Koitka et al. [7] employed a two-phase network that allowed for the improvement of both the detection and regression stages. In a few cases, however, Faster R-CNN failed to detect the RoIs (regions of interest) or detected multiple unwanted RoIs in the hand X-ray image. Following the established medical method, Wibisono et al. [8] segmented the hand region into five essential portions using Faster R-CNN, which required 0.2 s to process each image. This method then used multiple networks to extract features and merged the resulting feature maps. Guo et al. [9] used simulated data, which differs from real-world data, and focused on handling poor image quality. Considering those limitations, this paper introduces a BAA system that requires less manual effort and is more efficient. The proposed model uses a CNN with an attention module.
The proposed model uses a pretrained InceptionV3, which serves as a feature extractor, followed by different attention modules that automatically emphasize the most important features of the hand. The attention module forces the system to focus on the most relevant features and thus improves system performance. Using the attention module with the CNN reduces the need to segment the hand bones externally before feeding them into the regression model. Thus, it reduces the computational cost as well as the processing time. The main contributions of this paper are:
1. To propose an end-to-end CNN model based on transfer learning.
2. To add an attention module that facilitates the identification of regions of interest through training.
3. To eliminate the need for manual annotation when extracting RoIs.
4. To evaluate different attention modules with the CNN model and observe their performance.
2 Related Work
Much research has been done to improve computer-based BAA systems. The techniques that have significantly advanced automated BAA can be divided into three types: image processing, machine learning, and deep learning. The image processing based methods mainly performed age estimation by extracting low-level features from hand X-ray images. Those methods used the logic of the traditional TW or GP techniques to extract the low-level features of hands and
lacked the ability to make use of more advanced features that could facilitate the age assessment task. As deep learning (DL) techniques have recently shown significant success in identifying advanced and hidden features of input data, researchers have been encouraged to use these methods in medical image and data processing. BoNet was the earliest deep learning based procedure to accurately predict the bone age of individuals of various racial and age groups from a publicly available dataset, and it achieved an average difference of 9.6 months compared to the experts’ observations [5]. Chen et al. [6] proposed a multi-scale feature fusion network that used the non-subsampled contourlet transform (NSCT) to extract multi-scale and directional hand segment bands; those bands were fed into convolutional layers for feature extraction and finally into a regression model. That model resulted in an MAE of about 8.64 months on the Digital Hand Atlas dataset. To inspect the features of hand bones more precisely, some existing methods first extract the most informative regions. For example, the authors in [7] used two different neural networks: the first was a detector network that used Faster R-CNN to locate the RoIs, and the second part contained multiple RoI-specific regression networks. This method resulted in an MAE of 4.56 months on the RSNA dataset and 11.93 months on the Digital Hand Atlas dataset. Wibisono et al. [8] proposed a Region-Based Feature Connected layer, which used different CNN models to extract features from different hand segments as well as from the whole hand. That evaluation produced an MAE of 6.97 months on the RSNA dataset and 7.10 months on the Digital Hand Atlas dataset. Guo et al. [9] established ‘BoNet+’, a regression model for BAA, and additionally added a quality improvement method for poor quality X-ray images, which resulted in an MAE of 9.12 months on the Digital Hand Atlas dataset.
3 Methodology
To improve the performance of bone age assessment from hand X-ray images, this paper introduces a fully automatic model based on a CNN with an attention module. This section discusses the proposed architecture in detail. The proposed model uses a pretrained InceptionV3, which serves as a feature extractor, followed by different attention modules that automatically emphasize the most important features of the hand. In the next stage, a regression module is added to estimate the bone age. Finally, all the parameters of the backbone model, attention module, and regression module are trained to obtain better performance. Figure 1 shows the proposed architecture.
3.1 InceptionV3
CNN methods employ Inception modules to reduce the cost of computation. The Inception module uses stacked 1 × 1 convolutions, which enables efficient computation as well as deeper networks while avoiding overfitting [13]. This arrangement enables the module to extract features from the input data at varying scales. The original Inception module block is depicted in Fig. 2.
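The saving from a 1 × 1 convolution can be illustrated with a small weight count. The sketch below is only an arithmetic illustration under assumed channel sizes (192 input channels, a 16-channel 1 × 1 reduction, 32 output filters of size 5 × 5); these numbers and the helper `conv_weights` are hypothetical and are not taken from the configuration used in this paper.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in a square 2D convolution (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


# Assumed, illustrative channel sizes (not from the paper).
in_ch, reduced_ch, out_ch = 192, 16, 32

# A 5 x 5 convolution applied directly to all input channels.
direct = conv_weights(5, in_ch, out_ch)

# The same 5 x 5 convolution preceded by a 1 x 1 channel reduction.
bottleneck = conv_weights(1, in_ch, reduced_ch) + conv_weights(5, reduced_ch, out_ch)

print(f"direct 5x5:        {direct} weights")
print(f"1x1 then 5x5:      {bottleneck} weights")
print(f"reduction factor:  {direct / bottleneck:.1f}x")
```

Under these assumptions the 1 × 1 reduction cuts the weight count of the 5 × 5 branch by roughly an order of magnitude, which is the effect the Inception design relies on.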
Fig. 1. The proposed methodology for bone age assessment.
Fig. 2. Original Inception module block.
3.2 Attention Module
Woo et al. [11] proposed the lightweight Convolutional Block Attention Module (CBAM), which can be added to any base CNN model in an end-to-end fashion. The CBAM architecture applies two types of attention sequentially: first a channel attention module and then a spatial attention module. Hu et al. [15] proposed an attention module named squeeze and excitation, which improves the modeling of channel-wise feature correlations, thus boosting performance at minimal computational cost, and can be added to any CNN model.
Channel Attention Module is used to apply channel-based attention. As the channels of a feature map represent different features of the input image, the channel attention module helps to identify the most important and meaningful features of the input image for a specific purpose. Following [11], in this attention module the input feature map is passed through a max-pooling layer and an average-pooling layer in parallel. The output of each of these two layers is then forwarded to a multilayer perceptron (MLP). Finally, the output features from the MLP are summed elementwise and passed through a sigmoid activation. The formula for the channel attention (CA) module is presented in Eq. (1):
$$\mathrm{CA} = \sigma\big(\mathrm{MLP}(\mathrm{MaxPool}(\mathrm{FeatureMap})) + \mathrm{MLP}(\mathrm{AvgPool}(\mathrm{FeatureMap}))\big) \tag{1}$$
where σ denotes the sigmoid function. Spatial Attention Module is used to apply attention based on the spatial relationships of features. This module helps to identify where the most important features are located in
the input image for a given problem domain. Following [11], in the spatial attention module the channel-refined feature map is passed through a max-pooling layer and an average-pooling layer in a sequential manner. The outputs of these two layers are then concatenated and passed through a convolution layer. Finally, a sigmoid activation is applied to the output feature map of the convolution layer. The formula for the spatial attention (SA) module is presented in Eq. (2):
$$\mathrm{SA} = \sigma\big(\mathrm{CONV}([\mathrm{MaxPool}(\mathrm{FeatureMap});\ \mathrm{AvgPool}(\mathrm{FeatureMap})])\big) \tag{2}$$
where σ denotes the sigmoid function. Squeeze and Excitation Module can be used as an attention module that dynamically recalibrates the channel-wise features, thus enabling the model to represent the extracted features more efficiently. Following [15], the module has three sections: squeeze, excitation, and scaling. Squeeze is done by applying global average pooling to the feature map. Next, excitation is done by applying dense layers with ReLU and sigmoid activation functions, respectively. Finally, in the scaling section, each channel of the feature map is multiplied by the corresponding channel attention weight from the side network. The formula for the squeeze and excitation (SE) module is presented in Eq. (3):
$$\mathrm{SE} = \mathrm{FeatureMap} \times \sigma\big(\delta(\mathrm{AvgPool}(\mathrm{FeatureMap}))\big) \tag{3}$$
where σ and δ denote the sigmoid and ReLU functions, respectively.
3.3 Regression Module
At the end of the channel attention module, the regression module was added to estimate the bone age. The feature map was passed through a max-pooling layer and flattened, and the result was concatenated with the gender information. The gender information was stored numerically, with male and female represented by 1 and 0 respectively, and before concatenation it was mapped to a fully connected layer with 16 nodes. Finally, we passed the concatenated features to a layer with a linear activation function to estimate the bone age.
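The channel-wise attention computations of Eqs. (1) and (3) can be sketched in plain Python on a tiny feature map. This is a minimal illustration, not the authors' Keras implementation: the feature map values, the weight matrices `w1` and `w2`, the bias-free dense layers, and the ReLU inside the shared MLP of the channel attention branch are all assumptions made for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def mean(values):
    return sum(values) / len(values)


def dense(vec, weights, activation=None):
    """Bias-free fully connected layer; each row of `weights` yields one output."""
    out = [sum(w * v for w, v in zip(row, vec)) for row in weights]
    return [activation(o) for o in out] if activation else out


def global_pool(fmap, reducer):
    """Reduce an H x W x C feature map (nested lists) to one value per channel."""
    channels = len(fmap[0][0])
    return [reducer([px[c] for row in fmap for px in row]) for c in range(channels)]


def se_weights(fmap, w1, w2):
    """Eq. (3): squeeze (global average pool), then ReLU and sigmoid dense layers."""
    squeezed = global_pool(fmap, mean)
    return dense(dense(squeezed, w1, relu), w2, sigmoid)


def channel_attention_weights(fmap, w1, w2):
    """Eq. (1): shared MLP on max- and average-pooled vectors, summed, then sigmoid."""
    def mlp(vec):
        return dense(dense(vec, w1, relu), w2)
    pooled_max = mlp(global_pool(fmap, max))
    pooled_avg = mlp(global_pool(fmap, mean))
    return [sigmoid(a + b) for a, b in zip(pooled_max, pooled_avg)]


def scale(fmap, weights):
    """Multiply every channel of the feature map by its attention weight."""
    return [[[v * w for v, w in zip(px, weights)] for px in row] for row in fmap]


# Hypothetical 2 x 2 feature map with 2 channels and hand-picked weights.
fmap = [[[1.0, 0.0], [3.0, 0.0]],
        [[1.0, 2.0], [3.0, 2.0]]]
w1 = [[1.0, -1.0]]       # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channels

se = se_weights(fmap, w1, w2)
ca = channel_attention_weights(fmap, w1, w2)
recalibrated = scale(fmap, se)
print("SE weights:", se)
print("CA weights:", ca)
```

Both functions return one weight in (0, 1) per channel; multiplying the feature map by these weights, as `scale` does, is what lets the network emphasize some channels and suppress others.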
4 Result and Observation
4.1 Dataset
For training and evaluating the proposed architecture, the Digital Hand Atlas, a publicly available hand X-ray image dataset [4], was used. There are 1391 left-hand X-ray images in the dataset. The dataset contains evenly distributed hand X-ray images of normally developing children of Caucasian (CA), Asian (AS), African-American (AA), and Hispanic (HI) origin, ranging in age from 1 to 18 years, as well as pertinent patient demographic data and radiologists’ bone age readings for each sample. The dataset contains 700 images of males and 691 images of females.
4.2 Experiment Setting
The proposed model was implemented using Keras 2.9.0 and trained on an NVIDIA Tesla T4 GPU and 16 GB of RAM. We augmented the data using rotation, width shift, height shift, and horizontal flip, which resulted in a total of 5995 images, and resized the images to 256 × 256 pixels. We used 3357 images for training, 1199 for validation, and 1439 for testing. The mean absolute error (MAE) was used as the evaluation metric for the quantitative analysis of the performance of the proposed model:
$$\mathrm{MAE} = \frac{1}{M} \sum_{k=1}^{M} \left| X^{(k)} - Y^{(k)} \right| \tag{4}$$
where $X^{(k)}$, $Y^{(k)}$, and $M$ denote the ground truth label, the model-estimated label, and the total number of samples, respectively.
4.3 Result
We initially trained three popular CNN networks (VGG19, InceptionV3, and ResNet50) to perform bone age regression on the Digital Hand Atlas dataset in order to choose a superior baseline network for the BAA task. The input image size was set to 256 × 256, and the gender information was not used. We trained those networks, pretrained on ImageNet, for 50 epochs. The results of this training are given in Table 1.
Table 1. Performance summary of baseline models pretrained on the ImageNet dataset.
| VGG19 | InceptionV3 | ResNet50 |
|-------|-------------|----------|
| 55.63 | 13.65       | 45.53    |
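The MAE metric of Eq. (4) can be sketched in a few lines of Python. The bone ages below are hypothetical values in months chosen only to make the arithmetic easy to follow; they are not samples from the Digital Hand Atlas dataset.

```python
def mean_absolute_error(ground_truth, predictions):
    """Eq. (4): average absolute difference between true and estimated labels."""
    if not ground_truth or len(ground_truth) != len(predictions):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(x - y) for x, y in zip(ground_truth, predictions))
    return total / len(ground_truth)


# Hypothetical bone ages in months (illustrative only).
true_ages = [24.0, 60.0, 120.0, 180.0]
predicted_ages = [20.0, 66.0, 118.0, 184.0]

mae = mean_absolute_error(true_ages, predicted_ages)
print(f"MAE = {mae} months")
```

Because every error contributes linearly, a single badly estimated image shifts the MAE less than it would shift a squared-error metric.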
To select the best attention module for the proposed architecture, we added different attention modules to the pretrained InceptionV3 and observed the performance of the resulting models. We experimented with spatial attention, squeeze and excitation attention, channel attention, and channel attention with spatial attention. The performance of the different attention modules is listed in Table 2, with the MAE value and time consumption of each model. We ran those experiments for 100 epochs using a batch size of 32 and the ADAM optimizer. From Table 2, we can clearly observe that the model with squeeze and excitation attention outperforms the other attention models, as it gives the lowest mean absolute error. To find the right optimizer for the proposed squeeze and excitation model, we carried out experiments with different optimizers (ADAM, SGD, RMSprop). To select the appropriate batch size for the model, we performed experiments with different batch sizes (4, 8, 16, 32, 64, and 128). The results of all experiments are listed in Table 3. In each case, we trained our model for 40 epochs.
Table 2. Summary of the performance of different attention models.
| Attention model | MAE (month) | MS per step |
|-----------------|-------------|-------------|
| CA              | 5.93        | 143         |
| SA              | 5.72        | 121         |
| SE              | 4.86        | 133         |
| SA+CA           | 5.29        | 169         |
Table 3. Summary of proposed model performance with different optimizer and batch size settings.
| Batch size | RMSprop MAE | SGD MAE | ADAM MAE |
|------------|-------------|---------|----------|
| 4          | 9.28        | 11.40   | 9.87     |
| 8          | 9.92        | 9.18    | 6.88     |
| 16         | 6.96        | 8.83    | 7.44     |
| 32         | 10.15       | 9.61    | 7.32     |
| 64         | 6.7         | 10.74   | 6.46     |
| 128        | 7.61        | 12.9    | 8.83     |
From Table 3, it is visible that the ADAM optimizer with a batch size of 64 performed better than all other listed combinations of optimizer and batch size. Finally, the proposed pretrained InceptionV3 with squeeze and excitation attention model was trained and tested on the Digital Hand Atlas dataset with the selected hyperparameters. Figure 3 shows the loss curve for the selected model. The figure depicts the variation of the training and validation loss over 100 epochs.
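The selection step described above amounts to picking the grid cell with the lowest MAE. The following sketch reproduces that choice from the values reported in Table 3; the dictionary layout and variable names are illustrative assumptions, and the code does not retrain any model.

```python
# MAE values copied from Table 3, keyed by optimizer and then batch size.
table3_mae = {
    "RMSprop": {4: 9.28, 8: 9.92, 16: 6.96, 32: 10.15, 64: 6.7, 128: 7.61},
    "SGD": {4: 11.40, 8: 9.18, 16: 8.83, 32: 9.61, 64: 10.74, 128: 12.9},
    "ADAM": {4: 9.87, 8: 6.88, 16: 7.44, 32: 7.32, 64: 6.46, 128: 8.83},
}

# Flatten the grid into (mae, optimizer, batch_size) tuples and take the minimum MAE.
candidates = [
    (mae, optimizer, batch_size)
    for optimizer, row in table3_mae.items()
    for batch_size, mae in row.items()
]
best_mae, best_optimizer, best_batch_size = min(candidates)

print(f"best setting: {best_optimizer}, batch size {best_batch_size}, MAE {best_mae}")
```

A selection like this is normally made on validation results, so that the test set remains untouched until the final evaluation.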
Fig. 3. Proposed model loss curve.
Table 4 summarizes the performance of our approach in comparison with a number of state-of-the-art BAA techniques. Although some techniques have achieved significant accuracy, some of them rely on extra labels such as manual annotation, bone part segmentation, or detection, and some use heavy ensemble learning.
Table 4. Performance comparison between the proposed model and the state-of-the-art methods.
| Methods   | Extra label | Data augmentation | Model ensembling | MAE (month) |
|-----------|-------------|-------------------|------------------|-------------|
| Model [7] | Yes         | Yes               | Yes              | 11.93       |
| Model [5] | No          | Yes               | No               | 9.6         |
| Model [9] | No          | Yes               | No               | 9.12        |
| Model [6] | Yes         | Yes               | No               | 8.64        |
| Model [8] | Yes         | No                | Yes              | 7.10        |
| Ours      | No          | Yes               | No               | 4.21        |
Our proposed model achieved considerable results without applying any extra mechanism, as the squeeze and excitation attention module is able to emphasize the channels that contain the most important features for age estimation. We believe that the squeeze and excitation attention module has boosted the identification of the most important features and enhanced the representation of hidden features for bone age prediction. Modeling inter-channel interactions rather than inter-spatial relationships may be more significant, since determining bone age from hand X-rays requires focusing on tiny changes in bone formation across different parts of the hand. Moreover, the SE module is a simple and computationally effective attention mechanism that modifies feature maps by learning channel-wise weights to highlight crucial features and suppress irrelevant ones. Figure 4 visualizes which portions of the hand images the proposed model identifies as the most important for bone age assessment for different samples.
Fig. 4. Examples of attention maps representing the focused areas of different hand images. The highlighted colors indicate the most important areas identified by the proposed model.
From the figures, it is clearly observable that for different bone images the model was able to focus on different parts of the hand as required. The highlighted color of the given images indicates different focused portions of hand for different samples of the dataset.
5 Conclusion
With the aim of enhancing the performance of BAA, we proposed a deep learning model based on a CNN. As the performance of BAA depends on the localization of RoIs in the hand X-ray image, we used different attention modules that helped the model to automatically identify and emphasize the most important features of the hand X-ray. Among those attention modules, the squeeze and excitation module performed better than the others. Our end-to-end model also surpassed some state-of-the-art methods with minimal computational expense. The experimental results show that the proposed model achieved an MAE of 4.21 months. In future work, we would like to apply the method to larger datasets and to introduce a fusion of different feature extractor models to see how they perform with different attention modules.
References
1. Ayala-Raggi, S.E., et al.: A supervised incremental learning technique for automatic recognition of the skeletal maturity, or can a machine learn to assess bone age without radiological training from experts? Int. J. Patt. Recogn. Artific. Intell. 32(01), 1860002 (2018)
2. Booz, C., et al.: Evaluation of a computer-aided diagnosis system for automated bone age assessment in comparison to the Greulich-Pyle atlas method: a multireader study. J. Comput. Assisted Tomogr. 43(1), 39–45 (2019)
3. Ren, X., et al.: Regression convolutional neural network for automated pediatric bone age assessment from hand radiograph. IEEE J. Biomed. Health Inf. 23(5), 2030–2038 (2018)
4. Digital Hand Atlas Dataset, https://ipilab.usc.edu/research/baaweb/. Accessed: 2023-02-01
5. Spampinato, C., Palazzo, S., Giordano, D., Aldinucci, M., Leonardi, R.: Deep learning for automated skeletal bone age assessment in x-ray images. Med. Image Anal. 36, 41–51 (2017)
6. Chen, X., Zhang, C., Liu, Y.: Bone age assessment with x-ray images based on contourlet motivated deep convolutional networks. In: IEEE 20th International Workshop on Multimedia Signal Processing (MMSP), pp. 1–6. IEEE (2018)
7. Koitka, S., Kim, M.S., Qu, M., Fischer, A., Friedrich, C.M., Nensa, F.: Mimicking the radiologists’ workflow: estimating pediatric hand bone age with stacked deep neural networks. Med. Image Anal. 64, 101743 (2020)
8. Wibisono, A., Mursanto, P.: Multi region-based feature connected layer (rb-fcl) of deep learning models for bone age assessment. J. Big Data 7(1), 1–17 (2020)
9. Guo, J., Zhu, J., Du, H., Qiu, B.: A bone age assessment system for real-world x-ray images based on convolutional neural networks. Comput. Electr. Eng. 81, 106529 (2020)
10. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. IEEE Conf. Comput. Vision Pattern Recogn. 2009, 248–25 (2009)
11. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: Cbam: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp. 3–19 (2018)
12. Gertych, A., Zhang, A., Sayre, J., Pospiech-Kurkowska, S., Huang, H.: Bone age assessment of children using a digital hand atlas. Comput. Med. Imaging Graphics 31(4–5), 322–331 (2007)
13. Szegedy, C. et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015) 14. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 15. Jie, H., Li, S., Gang, S., Albanie, S.: Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell., 42(8), 2011–2023 (2017)
Green Banking Through Blockchain-Based Application for Secure Transactions Md. Saiful1 , Nahid Reza1 , Maisha Mahajabin1 , Syada Tasfia Rahman1 , Farhana Alam1 , Ahmed Wasif Reza1(B) , and Mohammad Shamsul Arefin2(B) 1 Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
[email protected]
2 Department of Computer Science and Engineering, Chittagong University of Engineering and
Technology, Chattogram 4349, Bangladesh [email protected]
Abstract. Blockchain technology is a foundational, supporting technology with fascinating financial implications. Today, blockchain technology can be used in banking and other sectors of the economy. Blockchains have the potential to update and revolutionize the core technology of credit information and payment clearing processes in banks. We are developing a blockchain-based solution to secure bank transaction operations. The research aimed to determine the potential outcomes of shifting from traditional banking to decentralized banking and the environmental impact of this shift. In this work, we illustrate a blockchain-based application. We demonstrate how a blockchain-based application for secured transactions affects the environment: it decreases fraud, reduces human effort, and reduces the use of media and physical components, which in turn reduces expenses, energy consumption, carbon emissions, and so on. Keywords: Blockchain · Decentralized · Trusted computing · Green banking · Secured transaction · Paperless
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 227–240, 2024. https://doi.org/10.1007/978-3-031-50327-6_24
1 Introduction
Being “Green” is being ethical, sustainable, and socially responsible. Green banking means promoting environment-friendly practices, bringing down the carbon footprint of activities undertaken by banks, and developing banking strategies that will guarantee sustainable economic development, thereby leading to a green economy [1]. All banks should adopt the proper use of automation, renewable energy, and other strategies to reduce the carbon footprint of their operations [2]. Due to their extensive energy use (such as electricity, air conditioning systems, electronic/electrical components, computers, and more), high paper usage, the absence of green architecture, etc., banks’ current scale of operations has significantly expanded their carbon footprint [2]. Blockchain is a decentralized, shared digital ledger used by all parties involved in a transaction. It can be used in any banking transaction to perform a coordinated, decentralized transaction. More broadly, we can say that the use of a decentralized
Md. Saiful et al.
system, rather than a centralized one, is becoming mandatory because everyone wants their system to be secure, traceable, and resilient, and emerging technologies like blockchain can assist in completing this critical task [3]. Using blockchain technology, banks may provide clients with precise information regarding their transactions and give customers full access to it. By implementing this technology, financial companies may save money and give clients more safety, power, and control over the money they commit to those institutions. Moreover, for banking institutions that use the technology, blockchain further reduces the growing number of fraud cases [4]. Blockchain is a powerful digital tool for modernizing the complicated banking service regime: it supports effective administration and assures the openness, speed, and data integrity of various banking activities [5]. All nations have stringent laws overseeing the financial industry, and bankers are known for their conservative viewpoints. However, given the widespread use of blockchain in recent years, the meteoric rise in popularity of cryptocurrencies, and the ICO craze, many banking institutions and bank managers have realized that blockchain technology offers tremendous opportunities. Blockchain can help financial organizations keep their books accurately and efficiently while involving their clients. Banks, for example, may use blockchain technology to provide precise information to customers about how their money is invested and when interest is paid. Banks may develop and use securely encrypted blockchain-based apps instead of encrypting and distributing banking cards. Software and digitalization have already significantly reduced the internal and external theft of company assets, although this is rarely reported.
Digitally kept books are far more difficult for an employee to alter, and in any case it is possible to look up the last person who logged in to the system and made modifications. Blockchain, however, gives banking firms that embrace the technology a way to reduce the occurrence of fraud even further. It makes every transaction simple to verify and makes it practically impossible to change any copy of the books. Consequently, client accounts will be secure, and fraudulent operations may be stopped before the money leaves the bank.
2 Literature Review
The author of [6] discusses the possible consequences of utilizing blockchain technology in banking and financial administration. A primary objective was to determine whether blockchain technology could manage accounts and funds. The research looked at how blockchain has developed to date; the gaps it identified are being filled, and a few of the remaining problems are being resolved in Bitcoin. The authors of [7] examine the application of blockchain technology without tokens to secure data, describing the intricacies of financial transactions. The article analyzes the protection mechanisms of distributed databases and offers a workaround for preserving the integrity of the data without relying on blockchain technology. The authors conclude by offering suggestions for integrating blockchain technology into established financial institutions. According to the authors, a blockchain
Green Banking Through Blockchain-Based Application
without mining or coins will significantly improve the support cycle for the validity and originality of data on bank transactions. The authors of [8] look into the benefits and problems associated with implementing blockchain technology in banking. Blockchain can simplify the current financial system by adopting more effective frameworks to support economic development. The authors suggest that by removing the current blockchain obstacles linked to Bitcoin, such as costly technology and excessive energy use, it may be possible to harness blockchain technology in financial processes. For analyzing and charting the course of development, the authors of [9] recommend using the fundamental advancement model. Any industry may use the model to understand the development cycle, its progress, and strategies for capturing a share of the market. The study’s findings show that most institutions compete to build their own blockchain banking systems. Based on the same model, the study also points to the weak foundations of blockchain banking at present. From the viewpoints of asset securitization, cross-border payment, and billing, the study in [10] explores the advantages of blockchain for commercial banks: blockchain lowers transaction costs for both sides while increasing the operational and management efficiency of banks. The authors of [11] look at blockchain technology as a distinct development in the financial services industry. Through a study of 12 financial service providers, they found that these companies frequently treat blockchain as a low priority because they lack a comprehensive value approach.
They believe that organizations should adopt expert methods when they come across new technological developments such as blockchain, in order to examine and assess how much they can profit from applying them. The authors of [12] suggest how blockchain technology may be used to support the lending, monitoring, and evaluation of development projects at the Brazilian Development Bank. The concept streamlines the allocation of public funds, improves manual tasks, reduces operational costs, and generates data that can be used to develop a comprehensive understanding of the benefits of the bank’s credit. After first describing the difficulties of putting the proposal into practice, the work analyzes what is feasible with blockchain and presents the designs that were implemented. The works in [16–25] present guidelines and suggestions on green computing issues.
3 Methodology
We use a discussion of Bitcoin’s operation to demonstrate the blockchain concept since the two are inextricably linked. In layman’s terms, a blockchain is a group of computers working together to modify a database record according to an agreement based only on mathematics. Any online transaction involving digital assets can therefore use blockchain technology. Figure 1 shows the current centralized and decentralized systems.
Fig. 1. Centralized and decentralized system
3.1 Implementation Details
User Interface Module. This module provides the user interface for our system and has no blockchain storage or capabilities of its own. It uses a standard database, rather than a blockchain, to store and display financial transactions. Users may register with the network, make transactions, and request funds from trusted sources by exchanging cash for decentralized coins (DC). A trusted individual has the power to transfer money upon request. The system uses the SHA-256 hash function to construct account identities and GUIDs to distinguish transactions. Among other things, it offers interfaces for login, the dashboard, and sending, receiving, and requesting payments. During login, a one-time password (OTP) is used to boost system security.
Module for Block Generator and Web Mining. The previous module provides the user interface (UI) through which clients send and receive DC independently of the blockchain platform. This module creates a simple proof-of-work (miner) system and a blockchain on a particular node or server. A blockchain is a chain of blocks, and every block has a unique hash and digital signature. Once the blockchain has been created, its integrity can be verified by looping over the blocks and comparing each block’s stored hash with a freshly computed hash and with the hash recorded for the previous block. Mining a block, that is, searching for a hash that meets the difficulty condition, is referred to as “Proof of Work.” Whenever an old block is modified, that block and every block after it must be mined again.
Module for Transactions and Wallet. Module 2, shown in Figs. 2 and 3, preserves only the raw transaction message as data. In this module, we replace that data with transaction records and give each user a wallet with public and private keys generated using elliptic-curve cryptography. It is safe to share the public key, but never the private key, with other people in order to receive money, because the public key acts as the address (the sender or recipient identifier) for our decentralized currency.
To prevent anybody other than the private key owner from spending our money, our transactions are signed using our private key. The public key is transmitted along with the transaction. Since the signature covers Sender + To + NodeCoins, the public key can be used to confirm both the validity of the signature and the integrity of these contents. In short, the data we do not want to be altered is signed with the private key, and the signature is checked with the public key.
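The hash-linked chain and the proof-of-work check of the Block Generator module can be sketched as follows. This is a minimal illustration, not the authors' implementation: the block fields, the `DIFFICULTY` value, and all function names are assumptions, and the digital signature step of the wallet module is left out.

```python
import hashlib
import json
import time

DIFFICULTY = 3  # assumed: number of leading zero hex digits a block hash must have


def block_hash(index, prev_hash, timestamp, data, nonce):
    """SHA-256 digest over all block fields, serialized deterministically."""
    payload = json.dumps([index, prev_hash, timestamp, data, nonce])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mine_block(index, prev_hash, data):
    """Proof of work: try nonces until the hash meets the difficulty target."""
    timestamp = time.time()
    nonce = 0
    while True:
        digest = block_hash(index, prev_hash, timestamp, data, nonce)
        if digest.startswith("0" * DIFFICULTY):
            return {"index": index, "prev_hash": prev_hash, "timestamp": timestamp,
                    "data": data, "nonce": nonce, "hash": digest}
        nonce += 1


def build_chain(messages):
    """Create a genesis block, then mine one block per raw transaction message."""
    chain = [mine_block(0, "0" * 64, "genesis")]
    for message in messages:
        previous = chain[-1]
        chain.append(mine_block(previous["index"] + 1, previous["hash"], message))
    return chain


def is_chain_valid(chain):
    """Recompute every hash and check the links between neighbouring blocks."""
    for i, block in enumerate(chain):
        recomputed = block_hash(block["index"], block["prev_hash"],
                                block["timestamp"], block["data"], block["nonce"])
        if block["hash"] != recomputed:
            return False  # block contents were changed after mining
        if not block["hash"].startswith("0" * DIFFICULTY):
            return False  # proof-of-work condition not met
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False  # link to the previous block is broken
    return True
```

Changing the data of any mined block makes `is_chain_valid` fail until that block and all later blocks are mined again, which is the property the module relies on.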
Fig. 2. Blockchain transactions module
Fig. 3. Chronological sequence of blockchain transactions
Nodes in Peer-to-Peer Networks. In Module 2 we built only one node or server. In this step, we create two nodes, the minimum required to form a P2P network. Web miners monitor the network’s integrity, while each node maintains a copy of the blockchain. Here we use Proof of Authority (PoA), a consensus method that can be applied to permissioned ledgers. It relies on several “authorities,” or designated nodes, which can add new blocks and safeguard the ledger. In PoA-enabled ledgers, a majority of the authorities must approve a block before it can be added. Block storage is nothing more than a ledger and database that holds the blockchain’s information. Now we discuss why we chose proof-of-stake Ethereum.
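The majority rule of Proof of Authority can be illustrated with a short sketch. The authority identifiers and function names below are hypothetical, and real PoA networks add signatures, rotation of block proposers, and network communication that are omitted here.

```python
AUTHORITIES = {"node-a", "node-b", "node-c"}  # hypothetical designated nodes


def has_majority(approvals, authorities=AUTHORITIES):
    """True when more than half of the designated authorities approved."""
    valid = set(approvals) & set(authorities)  # ignore votes from non-authorities
    return 2 * len(valid) > len(authorities)


def try_append(ledger, block, approvals, authorities=AUTHORITIES):
    """Append the block to the ledger only if a majority of authorities approved."""
    if has_majority(approvals, authorities):
        ledger.append(block)
        return True
    return False
```

With three authorities, two approvals are enough to add a block, while one approval plus a vote from an unknown node is rejected.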
4 Software Architecture and Design
4.1 Data Flow Diagram (DFD)
A data flow diagram (DFD) graphically represents the “movement” of data across an information management system. Figures 4 and 5 show the process characteristics of the system. A DFD is typically used as a first step to give a general overview of the system without going into too much detail. “Level 0” refers to the context diagram, the top-level flow chart. It has a single process node, “Process 0,” which abstracts the whole system’s functionality with regard to external entities. When a DFD has several levels, the context diagram is drawn first.
4.2 Algorithm
1. Call the “receive_payment_from_bank” function with the “sender” and “amount” as inputs
Fig. 4. Data flow diagram level 0
Fig. 5. Data flow diagram level 1
2. Check the success status of the payment received
   a. If success is False, return “Error: could not receive payment from bank”
3. Create a transaction using the “create_transaction” function with the “sender”, “receiver”, and “amount” as inputs
4. Convert the amount to standard cryptocurrency using the “convert_to_cryptocurrency” function
5. Mine a block in the sender’s blockchain using the “mine_block” function with the “sender”, “transaction_id”, and “crypto_currency” as inputs
6. Verify the keys of both parties using the “verify_keys” function with the “sender” and “block” as inputs
   a. If the keys are invalid, return “Error: Invalid keys”
7. Save the block using the “save_block” function
8. Transfer the coins to the receiver using the “transfer_coins_to_receiver” function with the “sender” and “crypto_currency” as inputs
9. Convert the cryptocurrency to the receiver’s currency using the “convert_from_cryptocurrency” function
10. Transfer the currency to the receiver’s bank account using the “transfer_to_bank” function with the “receiver” and “receiver_currency” as inputs
11. Return “Transaction successful”
The function initiates a payment from a sender to a receiver. It first checks whether the payment from the sender’s bank is successful. If not, it returns an error message. The function then creates a transaction and converts the amount to a cryptocurrency. After that, it mines a block in the sender’s blockchain, verifies the keys of both parties, and saves the block. Then it transfers the cryptocurrency to the receiver, converts it to the receiver’s currency, and finally transfers it to the receiver’s bank. If all steps are successful, the function returns a message saying the transaction was successful.
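The eleven steps above can be sketched in Python as follows. The function names follow the algorithm, but every function body is a hypothetical stub: the fixed exchange rate `RATE`, the in-memory `ledger` and `bank_accounts`, the `process_payment` wrapper, and the simplified key check are assumptions made only to keep the example runnable.

```python
import uuid

RATE = 2.0  # assumed fixed exchange rate: coins per unit of bank currency


def receive_payment_from_bank(sender, amount):
    # Stub for step 1: a real system would contact the sender's bank.
    return amount > 0


def create_transaction(sender, receiver, amount):
    # A GUID distinguishes transactions, as in the user interface module.
    return {"id": str(uuid.uuid4()), "sender": sender,
            "receiver": receiver, "amount": amount}


def convert_to_cryptocurrency(amount):
    return amount * RATE


def convert_from_cryptocurrency(coins):
    return coins / RATE


def mine_block(sender, transaction_id, crypto_currency):
    # Stub for step 5: stands in for the proof-of-work mining step.
    return {"sender": sender, "transaction_id": transaction_id,
            "coins": crypto_currency}


def verify_keys(sender, block):
    # Stub for step 6: a real check would verify digital signatures.
    return block["sender"] == sender


def process_payment(sender, receiver, amount, ledger, bank_accounts):
    if not receive_payment_from_bank(sender, amount):           # steps 1-2
        return "Error: could not receive payment from bank"
    transaction = create_transaction(sender, receiver, amount)  # step 3
    coins = convert_to_cryptocurrency(amount)                    # step 4
    block = mine_block(sender, transaction["id"], coins)         # step 5
    if not verify_keys(sender, block):                           # step 6
        return "Error: Invalid keys"
    ledger.append(block)                                         # step 7: save_block
    receiver_currency = convert_from_cryptocurrency(coins)       # steps 8-9
    bank_accounts[receiver] = bank_accounts.get(receiver, 0) + receiver_currency  # step 10
    return "Transaction successful"                              # step 11
```

Keeping each step in its own function mirrors the algorithm and lets the bank interface, the mining step, and the key verification be replaced independently.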
5 Results and Analysis
Sustainable banking is a new type of banking that has gained popularity in recent years, although it is still not widely understood. At a high level, sustainable banking focuses on banks’ role in contributing to long-term development. Its fundamental advantage is the preservation of natural resources and the environment. While it is well understood that banks play an essential role in encouraging sustainable development, the banking sector has only recently begun to see sustainability as a priority.
5.1 Energy Consumption Analysis
Proof-of-Stake Ethereum. By switching from proof of work (PoW) to proof of stake (PoS), Ethereum consumes about 99.95% less power. Using Eq. (1), we calculated the power consumption of our proposed system and compared the results to show the lower power consumption of PoS. Since its inception, Ethereum has sought to implement a proof-of-stake consensus mechanism; however, doing so without jeopardizing Ethereum’s vision of being a safe, scalable, and decentralized blockchain has required years of dedicated research and development. The network therefore started with proof-of-work consensus. After the Merge, Ethereum emits significantly less carbon while remaining secure. To reach proof-of-work consensus, miners must use their hardware to solve a puzzle.
$$E = P \cdot \frac{t}{1000} \qquad (1)$$
In Eq. (1), P is the power in watts, t is the time in hours over which the power is drawn, and E is the energy in kilowatt-hours (kWh). The solution to the puzzle shows that the miner expended energy, proving that they had to pay money or something else of value in order to have the right to contribute
to the blockchain. Proof of work and proof of stake are simply mechanisms for deciding who is allowed to add the next block. By transitioning from proof of work to proof of stake, where the real-world value at risk is ETH staked directly in a shared ledger, miners are no longer required to expend energy to contribute to the blockchain. As a result, securing the network has a significantly lower environmental impact. Further adoption of this technology will be accelerated by its potential advantages and benefits globally, notably in terms of automation, security, and the reliability of transaction data [13].
5.2 Proof-of-Stake Energy
According to current Beacon Chain estimates, the Merge to proof of stake may result in a 99.95% decrease in overall energy consumption, that is, roughly a 2000-fold gain in energy efficiency over PoW. Each node on the Ethereum network will need about as much energy as a cheap laptop would. Numerous studies evaluate the energy used “per transaction” to compare blockchains with other sectors of the economy. Although such a metric is simple to understand, the amount of energy required to mine a block is unrelated to the number of transactions the block contains. A per-transaction energy figure would imply that fewer transactions lead to a commensurate drop in energy consumption, but this is not the case. Table 1 compares the two consensus mechanisms.
Table 1. Consensus mechanism
| Proof of stake | Proof of work |
|---|---|
| Validators are those who create blocks | Miners are those who create blocks |
| To become a validator, participants must purchase tokens or currencies | To become a miner, participants must purchase energy and equipment |
| Uses less energy | Not energy efficient |
| Increases scalability | Does not permit increased scalability |
| Network can be purchased | Robust security as a result of the high initial cost |
| Transaction fees are paid to validators as compensation | Block rewards are given to miners |
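As a worked instance of Eq. (1) from Sect. 5.1, the sketch below converts a constant power draw into yearly energy. The 50 W figure is an assumed, illustrative value for a laptop-class node, not a measurement from this work, and the function name is hypothetical.

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Eq. (1): E = P * t / 1000, with P in watts and t in hours, giving kWh."""
    return power_watts * hours / 1000


# Hypothetical node drawing 50 W continuously for one year (24 h * 365 days).
annual_kwh = energy_kwh(50, 24 * 365)
print(annual_kwh)  # 438.0
```

The division by 1000 only converts watt-hours to kilowatt-hours, so the time must be given in hours for the result to be in kWh.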
How a blockchain’s transaction throughput is defined strongly affects the per-transaction calculation, and changing this definition can make the number appear higher or lower. On Ethereum, for instance, the total transaction capacity includes that of all the “layer two” rollups, which are generally left out of such calculations even though they would change the result considerably. The network’s overall energy use and carbon footprint are therefore more meaningful measures. Based on them, we can weigh the advantages the network offers to its users and to society and decide whether its energy use is justified. Per-transaction measurements, by contrast, imply that the network’s value comes solely from its function as a means of transferring cryptocurrency
between accounts, which restricts the ability to conduct a thorough cost-benefit analysis. Under proof of work, Ethereum used about 112 terawatt-hours (TWh) of energy annually, comparable to the Netherlands, and emitted about 53 megatonnes (MT) of carbon annually, comparable to Singapore. Today’s Bitcoin, in comparison, consumes more than 200 TWh of energy annually, emits around 100 MT of carbon, and generates more than 32,000 tonnes of electronic waste annually from out-of-date equipment. When proof of stake is used instead of proof of work, this cost is reduced by more than 99.95%, putting the total energy cost of securing ETH closer to 0.01 TWh per year. Figure 6 displays the estimated yearly energy consumption of different industries in TWh per year, retrieved in June 2022 [15]. The figures in the chart are approximations drawn from publicly available sources that are acknowledged in the text below. They are intended only as illustrations and do not represent any official forecast, commitment, or projection.
Fig. 6. Estimated yearly energy usage for various industries
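The relative saving implied by the annual figures quoted above can be checked with a few lines of Python. The dictionary simply restates the estimates from the text (in TWh per year); the names are illustrative only.

```python
# Annual energy estimates quoted in the text, in TWh per year.
ANNUAL_TWH = {
    "Bitcoin (PoW)": 200.0,
    "Ethereum (PoW)": 112.0,
    "Ethereum (PoS)": 0.01,
}


def reduction_percent(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


eth_reduction = reduction_percent(ANNUAL_TWH["Ethereum (PoW)"],
                                  ANNUAL_TWH["Ethereum (PoS)"])
print(f"{eth_reduction:.2f}%")  # 99.99%
```

The result is consistent with the statement that the reduction exceeds 99.95%.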
To put Ethereum’s energy consumption in context, we can compare it with annualized estimates for other industries. If we consider Ethereum as a network for securely storing digital assets as investments, we can compare it with gold mining, which uses an estimated 240 TWh per year. As a digital payment platform, it could be compared with PayPal, which consumes an estimated 0.26 TWh of electricity per year. As a content delivery platform, it could be compared with the gaming industry, which uses an estimated 34 TWh per year. Estimates of Netflix’s annual energy demand vary greatly, ranging from 0.45 TWh based on its own figures released in 2019 to 94 TWh as calculated by the Shift Project. Carbon Brief discusses the fundamental assumptions behind these estimates [14].
5.3 A Greener Ethereum
Although Ethereum has historically consumed much energy, much development time and expertise has gone into switching from energy-intensive to energy-efficient block production. According to Bankless, the simplest way to limit the amount of energy used
by proof of work is to “shut it down,” which is what Ethereum has done. At the same time, a sizable, growing, and highly active regenerative finance community is developing on Ethereum. Regenerative finance applications use DeFi elements to give financial applications positive environmental externalities. Regenerative finance is part of the broader “Solarpunk” movement, which, like Ethereum, seeks to combine technical advancement and environmental responsibility [15].
5.4 Proof-of-Stake Security
The 51% attack has long been portrayed as a threat to cryptocurrencies, but it is unlikely to happen when PoS is in use. In a 51% attack, someone controls 51% of a currency and uses that dominance to alter the blockchain. In a PoS system, a person or group would have to own 51% of the staked cryptocurrency. Table 2 compares traditional banking, internet finance, and blockchain transactions.
Table 2. Comparison of traditional banking transactions, internet finance business, and blockchain transactions
| Aspect | Current banking transactions | Online finance transactions | Blockchain coin transactions |
|---|---|---|---|
| User experience | Uniform scenarios | Rich scenarios | Rich scenarios |
| | Homogeneous service | Personalized service | Personalized service |
| Efficiency and performance | Many intermediate links | Many intermediate links | Disintermediation and point-to-point transmission |
| | Complex clearing process | Complex clearing process | Clearing, distributed ledger, transaction |
| | Low efficiency | Low efficiency | High efficiency |
| Cost | Large amount of manual inspection | Small amount of manual inspection | Completely automated |
| | Many intermediate links | Many intermediate links | Disintermediation |
| | Very high costs | High costs | Low costs |
| Safety | Centralized data storage | Centralized data storage | Distributed data storage |
| | Can be tampered with | Can be tampered with | Cannot be tampered with |
| | Easy to leak users’ personal information | Easy to leak users’ personal information | Use of asymmetric encryption |
| | Poor safety | Poor safety | Good safety |
Many common problems exist in traditional online banking systems, namely efficiency gridlock, transaction desync, fraud, and operation risks. Blockchain technology
is believed to solve most of these problems [12]. Owning 51% of the staked cryptocurrency is not only prohibitively costly: because the staked funds serve as collateral for the right to create blocks, validators who attempt to revert a block via a 51% attack lose all of their staked coins. As a result, validators are encouraged to act honestly in the interest of the coin and the network [15].
6 Discussion
Blockchain raises transfer speeds, and its validation system allows transactions to be processed and settled in close to real time. It lowers the cost and complexity of transactions within an organization’s own operations, and it organizes and automates interactions with other parties. Because blockchain gives the network a shared understanding of the truth, redundant data entry and validation are reduced. Because blockchain is distributed, there is no single point of failure, so it is far more resilient than present systems. Our system supports the operations CreateAcount, Deposit, getLoan, CloseAccount, PayLoan, and Withdraw. One of its primary constraints is that it can process at most 100 transactions per second. A second constraint is the lack of governance and deployment, which could become an issue if the system were used on a large scale. Remix cannot establish actual (non-test) user accounts to transfer money. Security is essential to the integrity of any financial platform, and with real user accounts in place, a full guarantee of security could be established. Because of its decentralized nature, the framework cannot enforce any standards or regulations on the systems that use it. With proper governance and deployment, our proposed system could fully realize its potential of processing up to 100 transactions per second.
Therefore, to provide a safe, secure, and reliable platform for conducting financial transactions, Remix must first implement a governance structure and deployment strategy to ensure that user accounts can be established and protected securely and efficiently. Despite these limitations, this system can still considerably improve processing power compared to traditional methods. We are actively developing a blockchain system for online money transfers. However, with a few modifications, the blockchain system can be extended to many more domains, including land registration and title insurance, healthcare, supply chain management, energy distribution networks, and smart contracts. We are confident that this blockchain system will benefit money transfers and other businesses such as the public sector, the Internet of Things, Online Donation Services, and Fund Raising.
7 Conclusion
Banks must seriously consider sustainable development and take the necessary steps to make it a reality. Thanks to its recent development, blockchain can now be used by banks. Despite its drawbacks, it has additional benefits that deserve consideration for the stability and sustainability of the banking industry. The adoption of blockchain technology will undoubtedly mark a turning point in banking history because it delivers the same banking services with less paper and in less time. Accountability will be enhanced, enabling quick detection and resolution of any fraud committed. Because transactions are digitized, the expense of keeping records and other data will be significantly reduced. While understanding blockchains in the context of Bitcoin is beneficial, one should not assume that all blockchain ecosystems require Bitcoin features such as proof of work, the longest chain rule, and so on. Bitcoin is the first attempt at maintaining a decentralized public ledger with no formal control or governance, and considerable obstacles remain to be overcome. Private distributed ledgers and blockchains, on the other hand, can be used to tackle different types of problems. Every solution, as always, has tradeoffs, benefits, and drawbacks that must be considered individually for each use case.
References
1. Anitha, K., H.M., Rajeshwari, K., Preetha, S.: Design of blockchain technology for banking applications. In: Raj, J.S., Kamel, K., Lafata, P. (eds.) Innovative Data Communication Technologies and Application. Lecture Notes on Data Engineering and Communications Technologies, vol. 96
2. Uddin, M., Ahmmed, M.: Islamic banking and green banking for sustainable development: evidence from Bangladesh. Al-Iqtishad: Jurnal Ilmu Ekonomi Syariah 10 (2018). https://doi.org/10.15408/aiq.v10i1.4563
3. Khalid, M.I., et al.: Blockchain-based land registration system: a conceptual framework. Appl. Bionics Biomech. (2022). https://doi.org/10.1155/2022/3859629
4. Snehal, A., Abhishek, M.: Blockchain based banking application. Int. Res. J. Eng. Technol. (IRJET) 06(04)
5. Di Silvestre, M.L., Gallo, P., Guerrero, J.M., Musca, R., Sanseverino, E.R., Sciumè, G., et al.: Blockchain for power systems: current trends and future applications. Renew. Sustain. Energy Rev. 119, 109585 (2020)
6. Eyal, I.: Blockchain technology: transforming libertarian cryptocurrency dreams to finance and banking realities. Computer 50, 38–49 (2017)
7. Popova, N.A., Butakova, N.G.: Research of a possibility of using blockchain technology without tokens to protect banking transactions. In: 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), Saint Petersburg and Moscow, 28–31 January 2019, 1764–1768 (2019)
8. Cocco, L., Pinna, A., Marchesi, M.: Banking on blockchain: costs savings thanks to the blockchain technology. Future Internet 9, 25 (2017)
9. Harris, W.L., Wonglimpiyarat, J.: Blockchain platform and future bank competition. Foresight 21, 625–639 (2019)
10. Wu, B., Duan, T.: The advantages of blockchain technology in commercial bank operation and management. In: Proceedings of the 2019 4th International Conference on Machine Learning Technologies, Nanchang, 21–23 June 2019, 83–87 (2019)
11. Dozier, P.D., Montgomery, T.A.: Banking on blockchain: an evaluation of innovation decision making. IEEE Trans. Eng. Manage. 67, 1129–1141 (2019)
12. Arantes, G.M., D’Almeida, J.N., Onodera, M.T., Moreno, S.M.D.B.M., Almeida, V.D.R.S.: Improving the process of lending, monitoring and evaluating through blockchain technologies: an application of blockchain in the Brazilian development bank (BNDES). In: 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Halifax, 30 July–3 August 2018, 1181–1188 (2018)
13. Rijanto, A.: Blockchain technology adoption in supply chain finance. J. Theor. Appl. Electron. Commer. Res. 16(7), 3078–3098 (2021). https://doi.org/10.3390/jtaer16070168
14. Andrian, H.R., Kurniawan, N.B., Suhardi: Blockchain technology and implementation: a systematic literature review. In: 2018 International Conference on Information Technology Systems and Innovation (ICITSI), pp. 370–374 (2018). https://doi.org/10.1109/ICITSI.2018.8695939
15. Metcalfe, W.: Ethereum, smart contracts, DApps. In: Yano, M., Dai, C., Masuda, K., Kishimoto, Y. (eds.) Blockchain and Crypto Currency. Economics, Law, and Institutions in Asia Pacific (2020)
16. Yeasmin, S., Afrin, N., Saif, K., Reza, A.W., Arefin, M.S.: Towards building a sustainable system of data center cooling and power management utilizing renewable energy. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_67
17. Liza, M.A., Suny, A., Shahjahan, R.M.B., Reza, A.W., Arefin, M.S.: Minimizing E-waste through improved virtualization. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_97
18. Das, K., Saha, S., Chowdhury, S., Reza, A.W., Paul, S., Arefin, M.S.: A sustainable E-waste management system and recycling trade for Bangladesh in green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_33
19. Rahman, M.A., Asif, S., Hossain, M.S., Alam, T., Reza, A.W., Arefin, M.S.: A sustainable approach to reduce power consumption and harmful effects of cellular base stations. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_66
20. Ahsan, M., Yousuf, M., Rahman, M., Proma, F.I., Reza, A.W., Arefin, M.S.: Designing a sustainable E-waste management framework for Bangladesh. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_104
21. Mukto, M.M., Al Mahmud, M.M., Ahmed, M.A., Haque, I., Reza, A.W., Arefin, M.S.: A sustainable approach between satellite and traditional broadband transmission technologies based on green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_26
22. Meharaj-Ul-Mahmmud, Laskar, M.S., Arafin, M., Molla, M.S., Reza, A.W., Arefin, M.S.: Improved virtualization to reduce e-waste in green computing. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_35
23. Banik, P., Rahat, M.S.A., Rafe, M.A.H., Reza, A.W., Arefin, M.S.: Developing an energy cost calculator for solar. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_75
24. Ahmed, F., Basak, B., Chakraborty, S., Karmokar, T., Reza, A.W., Arefin, M.S.: Sustainable and profitable IT infrastructure of Bangladesh using green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_18
25. Ananna, S.S., Supty, N.S., Shorna, I.J., Reza, A.W., Arefin, M.S.: A policy framework for improving e-waste management in Bangladesh. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds.) Intelligent Computing and Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol. 569. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-19958-5_95
Application of Decision Tree Algorithm for the Classification Problem in Bank Telemarketing
Ngoc Nguyen Minh Lam1, Ngoc Hong Tran2, and Dung Hai Dinh3(B)
1 Business Administration, Vietnamese-German University, Ben Cat, Vietnam
2 Computer Science, Vietnamese-German University, Ben Cat, Vietnam
3 Business Information Systems, Vietnamese-German University, Ben Cat, Vietnam
[email protected]
Abstract. This paper describes the application of the Classification and Regression Tree (CART) to a data mining problem in bank telemarketing. Tasks of telemarketing include questions such as which customer segments to target, which prices and promotions to offer, or when to make a phone call to the customer. We make use of a common, publicly available dataset and construct a decision tree for the classification of customers with or without deposits. By eliminating certain attributes of the dataset to obtain a reasonable performance of the decision tree algorithm, we show that domain knowledge of an industry sector is important in a data mining project. After removing unnecessary features that may cause bias, we improve both the structure and the performance of the decision tree (DT). The metric used in this work for measuring information impurity is the Gini index. The results show that our DT model performs effectively, with values of 92%, 99%, 86%, and 92% for Accuracy Score (AS), Recall, Precision, and F1-score, respectively. Our findings revealed interesting and valuable knowledge. The attribute “previous” has a relatively strong impact on the classification, resulting in it being chosen as the root node of the tree. This means that customers who are contacted for the first time would have a higher probability of accepting the subscription.
Keywords: Classification and regression tree · Data mining · Bank telemarketing · Classification · Confusion matrix
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 241–249, 2024. https://doi.org/10.1007/978-3-031-50327-6_25
1 Introduction
In 1960, the US telecommunication company AT&T introduced Wide Area Telephone Service, a new service which later made telecommunication more accessible to everyone [6]. Not only did households benefit from this launch; the financial and banking sectors also soon popularized a new channel of marketing: telemarketing [15]. Telephones made it possible for firms to promote their products or services over long distances, which also improved cost efficiency. Nowadays, telemarketing can achieve higher productivity with the available technology. Companies can collect and store more data on more customers more easily [14]. Data become a valuable resource for firms, as they might contain new, hidden knowledge about current and potential customers. The process of deriving meaningful knowledge from mass data is called data mining (DM). In telemarketing, DM helps firms save time and cost by targeting customers with a higher chance of accepting a deal or an offer [5].
In DM, the Decision Tree (DT) algorithm is a popular predictive analytics methodology. It supports managers in detecting future trends from existing data and improving the firm’s performance accordingly. DT is well known for combining predictive power with easy interpretation, since it does not require complex statistical knowledge [7]. This paper applies the Decision Tree algorithm using the Gini index to classify the telemarketing data of a Portuguese bank. To avoid problems with unbalanced data, oversampling is performed before building the model. For accuracy assessment, the Confusion Matrix (CM), Accuracy Score (AS), Precision, Recall, and F1-score are used. Unlike the preceding studies [5, 11] on the same dataset, which tested multiple data mining models, this paper focuses only on establishing and evaluating the model built with the DT algorithm, eliminating certain features, and testing the efficiency of oversampling.
The paper is organized as follows: Sect. 2 reviews the literature on telemarketing and DM applications in the field, Sect. 3 explains the DT algorithm, the concept of information impurity, and the Gini index, Sect. 4 presents the data and methodology, Sect. 5 discusses the results, and Sect. 6 concludes the work.
2 Literature Review
Direct marketing is a form of marketing that creates real-time interaction with customers [6]. As its name suggests, it is the most straightforward strategy to reach out to new customers and maintain relationships with existing ones [14]. Telemarketing belongs to this segment of marketing activities. According to Warf [15], as telecommunication became cheaper in the late 1980s, many financial institutions used phone calls as a way to introduce new products to their buyers. Marketers started incorporating telemarketing into their well-planned promotion programs without face-to-face meetings [6].
Though the terms “telemarketing” and “telesales” are often used interchangeably, they do not have the same meaning. According to Moncrief (1989), telemarketing is an ongoing and well-planned process, which is more periodic than telesales. Moreover, marketers do not need to meet their customers in real life, unlike telesellers. In other words, telesales can be considered the last stage of telemarketing, when the products are finally delivered to the clients. As the differences between the two terms are relatively small, within the scope of this paper we treat telesales as a part of the telemarketing process.
The most crucial part of direct marketing is finding the right customers to contact [8]. Companies achieve specific goals of their marketing campaign by focusing on specific customer segments [10]. Companies can effectively target their potential buyers with the help of DM techniques [3]. Fayyad et al. [4] define DM as a “non-trivial process” that finds meaningful patterns in a large dataset. In the context of marketing, given some data about the buyers, firms can find the patterns that lead to their customers’ buying decisions [8].
Three previous studies worked on the same dataset that we use in this paper. Moro et al. [11] applied CRISP-DM to test the performance of three DM algorithms (Naïve Bayes, Support Vector Machines, Decision Tree) on the telemarketing dataset. The results from AUC and ALIFT analysis showed that DT ranked second, after Support Vector Machines, with a score above 0.83 in both metrics. The authors concluded that call duration is the driving factor leading to higher success of this campaign. Unlike Moro, Tékouabou et al. [14] applied a new approach called “class membership-based” (CMB) to improve prediction results. Compared with other classical classification models, CMB has advantages in terms of processing time and avoidance of overfitting issues. Considering the AS, AUC, and F1 measures, the efficiency of the DT algorithm is comparable with that of CMB. Ilham [5] also tested the same dataset by comparing the predictive capacity of different classification methods. The author built six predictive models: Naïve Bayes, Random Forest, K-Nearest Neighbor, Decision Tree, Support Vector Machine, and Logistic Regression. AUC and AS were used to evaluate the performance of those models. Among the six algorithms, DT received AS and AUC scores of 90% and 0.645, respectively. No conclusion about the campaign was drawn, as the author employed this dataset merely for model evaluation purposes.
In this paper, we concentrate on DT as the only algorithm built to classify and predict the outcomes of a telemarketing campaign. We modify the model by using domain knowledge and eliminate some attributes that can affect the prediction result. Furthermore, we apply oversampling to reduce bias and thus improve the result. Each customer falls into one of two classes: people who accept the deposit offer and people who reject it. The dataset used for analysis is the same as the one initially used in Moro et al. [11]. For performance assessment, AS, Precision, Recall, and F1-score are used.
3 The Decision Tree with Gini Index
Decision Tree is categorized as supervised learning since the model “learns” the relationship between attribute values and class values [9]. In business, attribute values are records of past data, such as customers’ age, occupation, or latest product bought. The class value assigns each observation to a specific class and is commonly expressed as a binary value. To execute the learning process, the original dataset is divided into two subsets: training data and test data. The training set is used for learning and model building. The test data are then fed into the final model, and its predictions are compared with the actual class values of the test set to evaluate the predictive ability of the model.
The basic structure of the tree contains a root node, internal nodes, and leaf nodes, as shown in Fig. 1. The topmost node, the root, covers all available data. It then splits into two or more branches. An internal node continues to split until the data are well classified. A leaf node has no further path, since all the data within it belong to similar classes.
Fig. 1. Decision tree structure
The most crucial part of building the model is feature selection [9]. It is the process of choosing the “best” attribute on which to split. One of the methods for identifying decision nodes is the Gini index [16]. The Gini index ranges from 0 to 1. Zero indicates that there are no mixed values within a dataset; in other words, only data with the same characteristics are gathered in one group. Therefore, the higher the score, the lower the purity of a dataset. The goal is to find the attribute with the lowest Gini number, as it indicates that the majority of the data will end up in one specific class. The general formula of the Gini index is expressed in the following equation.
\[ \mathrm{Gini}(D) = 1 - \sum_{i=1}^{K} p_i^2 \tag{1} \]
$D$ is the dataset and $p_i$ is the probability of the $i$th class, with $i$ running from 1 to $K$. With $V$ attributes in $D$, the Gini index is generated for each unique observation. Then, the weighted sum of all Gini indexes in the $v$th attribute is calculated as
\[ \sum_{v=1}^{V} \frac{|D^v|}{|D|} \times \mathrm{Gini}(D^v) \tag{2} \]
where $D^v$ denotes the $v$th subset of $D$ and $|\cdot|$ the number of observations in a set.
The process is repeated for all $V$ attributes. The selection ends with the decision node being assigned the attribute that has the lowest Gini score. The existing data are split according to the condition of the selected node. The tree continues to branch down, repeating the calculation described above. The process ends when all the data are well classified.
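The two formulas above can be illustrated with a minimal sketch in plain Python. The function names `gini` and `weighted_gini` and the toy attributes are hypothetical and chosen only for illustration; they are not taken from the authors' code, and the sketch assumes that each distinct value of an attribute defines one subset.

```python
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum_i p_i^2 (Eq. 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(values, labels):
    """Weighted sum of the Gini indexes of the subsets of one attribute (Eq. 2).

    `values` holds one attribute value per observation; each distinct
    value defines one subset of the dataset (an assumption of this sketch).
    """
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    return sum(len(sub) / n * gini(sub) for sub in subsets.values())


# Hypothetical toy data: six customers and two candidate attributes.
deposit = ["yes", "yes", "yes", "no", "no", "no"]
housing = ["no", "no", "no", "yes", "yes", "yes"]  # separates the classes
loan = ["yes", "no", "yes", "no", "yes", "no"]     # mixes the classes

scores = {
    "housing": weighted_gini(housing, deposit),
    "loan": weighted_gini(loan, deposit),
}
# The attribute with the lowest weighted Gini score is chosen for the split.
best_attribute = min(scores, key=scores.get)
print(gini(deposit), scores, best_attribute)
```

In this toy case the attribute that separates the two classes completely receives the lowest score and would be selected for the split.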
4 Dataset and Methodology
4.1 Dataset
The dataset was obtained from the UCI Machine Learning Repository, where it was posted in 2014 [10]. It contains 45,211 telemarketing records of a Portuguese bank from 2008 to 2010. There are 16 attributes describing the customers and the details of their phone calls during a campaign. “Deposit” is the target variable, a binary value (Yes/No) indicating whether or not the customer subscribed to the offer after the call. The shares of “yes” and “no” values are 12% and 88%, respectively. The attributes are summarized in Table 1.
4.2 Methodology
This paper used Python for data cleaning and the Scikit-learn library for Decision Tree modeling. This library has built-in functions for DT classification and model evaluation, which are useful for common data mining purposes. We remove from the learning process some attributes that may cause bias in the result analysis, as described next.
Table 1. Summary of attributes of telemarketing dataset
| Variable name | Description | Value type |
| --- | --- | --- |
| Age | Customer’s age during campaign time | Numeric |
| Job | Customer’s occupation | Categorical |
| Marital | Customer’s marriage status | Categorical |
| Education | Customer’s level of education | Categorical |
| Default | Has credit in default? | Categorical |
| Balance | Bank balance at time of campaign | Categorical |
| Housing | Has housing loan? | Categorical |
| Loan | Has personal loan? | Categorical |
| Contact | Mean of contact | Categorical |
| Poutcome | Outcome of the previous marketing campaign | Categorical |
| Day | Last contact day of week | Numeric |
| Month | Last contact month of year | Numeric |
| Duration | Last contact duration, in seconds | Numeric |
| Campaign | Number of contacts performed during this campaign and for this client | Numeric |
| Pdays | Number of days that passed by after the client was last contacted from a previous campaign | Numeric |
| Previous | Number of contacts performed before this campaign and for this client | Numeric |
| y (target) | Has the client subscribed a term deposit? | Binary |
In the first step, the data are cleaned by removing unnecessary variables, including “month”, “day”, and “contact”. They only provide information about the call and do not describe the customers’ characteristics. The feature “duration” of the call was removed because it strongly influences the outcome of the class variable [11]. Furthermore, the “pdays” column is highly correlated with the variable “previous”, because both attributes reflect past interactions between the customer and the firm. Therefore, “pdays” is deleted, along with the “poutcome” column, which contains a large number of “unknown” values. Moreover, rows where “education” and “job” are unknown were also dropped. In the end, the dataset is left with 10 input variables, 1 output variable, and 39,478 observations, as illustrated by the Python screenshot in Fig. 2.
Fig. 2. Selected attribute for model building
The frequency ratio between the “yes” and “no” classes indicates imbalanced data, in which one class dominates the other. This may reduce the accuracy of the model because of the limited information about potential customers. One remedy for this problem is oversampling, which creates random duplicates of samples in the minority class [1]. In that study, DT performance improved after oversampling was applied to the data before modeling. Therefore, this technique is applied to this dataset, which results in an equal number of 34,671 observations in both classes, see Fig. 3.
Fig. 3. Distribution of unique values in target variables after oversampling
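Random oversampling as described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper `random_oversample`, the toy rows, and the fixed seed are hypothetical and do not reproduce the authors' code or data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many observations as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    new_rows, new_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            new_rows.append(rng.choice(pool))
            new_labels.append(cls)
    return new_rows, new_labels


# Hypothetical toy data with an imbalance of roughly 88:12.
rows = [{"age": 20 + i, "previous": i % 3} for i in range(25)]
labels = ["no"] * 22 + ["yes"] * 3

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Following the text, the sketch balances the classes before any split into training and test data. Duplicating rows only within the training portion is a common alternative when duplicated observations in the test set are a concern.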
The dataset is then split into two parts, one for training and one for testing, with a ratio of 3:1. The tree is split with the Gini index criterion under the condition of a maximum of three splits at each branch. In the next step, the test data enter the model for a validation check. Testing results are assessed with the accuracy score, confusion matrix, and classification report tools provided in Scikit-learn. The confusion matrix contains four counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) [12]. Similarly, the report has four scores: Recall, Precision, F-measure, and Accuracy [12]. The formulas for these scores are presented in Table 2.
Table 2. Formula of evaluation methods.
| Metric | Formula |
| --- | --- |
| Accuracy | (TP + TN) / (TP + FP + TN + FN) |
| Precision | TP / (TP + FP) |
| Recall | TP / (TP + FN) |
| F-measure | (2 × Precision × Recall) / (Precision + Recall) |
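The workflow described above (3:1 split, Gini criterion, evaluation with Scikit-learn tools) might look as follows. This is a hedged sketch rather than the authors' script: the data are synthetic stand-ins generated with `make_classification`, and `max_depth=3` is only an assumed reading of the three-level limit mentioned in the text.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned, balanced dataset (10 input variables).
X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                           random_state=0)

# 3:1 split between training and test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Gini criterion; max_depth=3 is an assumption about the paper's setting.
model = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Scikit-learn convention: rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, y_pred)
print(accuracy)
print(matrix)
print(classification_report(y_test, y_pred))
```

Note that Scikit-learn places true classes in rows and predicted classes in columns, so the layout of its confusion matrix may differ from a table that lists predictions by row.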
In the next section, we will discuss the performance results achieved by applying the DT algorithm.
5 Results and Discussion
The model achieved an AS of approximately 92%. Table 3 summarizes the numbers of correct and incorrect predictions of the model as a Confusion Matrix.
Table 3. Confusion matrix
| Type | Status | YES | NO | Total |
| --- | --- | --- | --- | --- |
| Count | YES | 11,273 (TP) | 1,823 (FP) | 13,068 |
| Count | NO | 74 (FN) | 9,713 (TN) | 9,815 |
| Proportion | YES | 86.26% | 13.74% | 100% |
| Proportion | NO | 1.04% | 98.96% | 100% |
The model is correct in 86.26% of its predictions of potential customers, i.e. “yes”. As for the “no” class, it categorizes 98.96% of the subset correctly. While there is still misclassification, these metrics express positive results for this DT model. Table 4 shows the performance report of the final model.
Table 4. Report of classification metrics
| Class | Recall (%) | Precision (%) | F-measure (%) |
| --- | --- | --- | --- |
| YES | 99 | 86 | 92 |
| NO | 84 | 99 | 91 |
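As a cross-check, the scores in Table 4 can be recomputed from the four cell counts of Table 3 with the formulas of Table 2. The sketch below uses the cell counts as printed, with the TN entry read as 9,713, which is an assumption about a garbled entry. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure as defined in Table 2."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure


# Cell counts read from Table 3 (TN = 9,713 is an assumed reading).
TP, FP, FN, TN = 11273, 1823, 74, 9713

# "YES" as the positive class.
acc, prec_yes, rec_yes, f_yes = classification_metrics(TP, FP, FN, TN)
# "NO" as the positive class: the roles of the four cells are swapped.
_, prec_no, rec_no, f_no = classification_metrics(TN, FN, FP, TP)

for name, value in [
    ("accuracy", acc),
    ("precision YES", prec_yes), ("recall YES", rec_yes), ("F-measure YES", f_yes),
    ("precision NO", prec_no), ("recall NO", rec_no), ("F-measure NO", f_no),
]:
    print(f"{name}: {value:.1%}")
```

Rounded to whole percentages, these values agree with the accuracy score and with both rows of Table 4.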
A recall of 99% for the “yes” class indicates that, of the customers who actually subscribed, only about 1% were misclassified. On the Precision scale, the negative class (“no”) has a higher score than the positive class (“yes”). Regarding the F-score, the two classes are similar, and both have good scores of over 90%. The next figure shows the final decision tree. In this model, the “previous” variable is chosen as the root node. It is expected that a customer with no previous contact will likely accept the offer. Further splits on other variables are illustrated in Fig. 4. This tree was plotted under the condition of a maximum of three hierarchical levels from the root node, which restrains the growth of the tree and thereby limits overfitting. To obtain a good depiction of a decision tree, pre-pruning or post-pruning needs to be implemented [2]. Otherwise, the tree will fit every piece of information it gains and grow into an enormous tree. However, those steps are beyond the scope of this paper and are not explained here. It should be mentioned that this model needs further improvement for practical use (Fig. 5).
Fig. 4. Decision tree constructed from telemarketing data (3 levels)
Fig. 5. Decision tree constructed from telemarketing data (4 levels)
6 Conclusion
This paper aims at applying a data mining method to real-world data. The Decision Tree algorithm was chosen to classify data from a telemarketing campaign of a Portuguese bank. By building a DT model, marketers can identify the characteristics of customers who are likely to subscribe to a deposit. Because of the imbalanced distribution of the class data, oversampling was carried out to improve the performance of the model. Certain attributes were removed to simplify the model and reduce unnecessary information that can create bias. The results show that our DT model performs quite effectively, with 92%, 99%, 86%, and 92% for AS, Recall, Precision, and F-score, respectively. The attribute “previous” has a relatively strong impact on the classification: customers who are contacted for the first time would have a higher probability of accepting the subscription. This finding raises a concern about the application of this model, because “previous” does not describe specific customer traits. Therefore, it does not identify a specific customer segment to be chosen for the marketing campaign. Although the metrics return a positive assessment, future studies should be conducted to improve the predictive ability of this model, especially by using insights provided by the domain knowledge of decision makers.
There are limitations and potentials for future studies. The illustration of the tree in this paper is plotted for simple interpretation and explanation of how the classification and regression tree works on a real dataset. Though the metrics are satisfactory, an applicable model must be precise and well constructed, so that marketers can rely on it to create successful marketing strategies and improve operational excellence as well as customer intimacy.
References
1. Bee Wah Yap, K. A.: An Application of Oversampling, Undersampling, Bagging and Boosting in Handling Imbalanced Datasets. Singapore, pp. 13–22. Singapore, Springer (2014)
2. Dietterich, T.: Overfitting and undercomputing in machine learning. ACM Comput. Survey, 326–327 (1995)
3. Elsalamony, H.A.: Bank direct marketing analysis of data mining techniques. Int. J. Comput. Appl., 0975–8887 (2014)
4. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: From Data Mining to Knowledge Discovery in Databases (1996). Retrieved 1 21, 2022 from AI Magazine: https://ojs.aaai.org//index.php/aimagazine/article/view/1230
5. Ilham, A., Khikmah, L., Indra, Ulumuddin, Iswara, I.B.: Long-term deposits prediction: a comparative framework of classification model for predict the success of bank telemarketing. J. Phys.: Conf. Series 012035 (2019)
6. Johnson, E.M., Meiners, W.J.: Selling and sales management in action: telemarketing. J. Pers. Selling Sales Manage., 65–68 (1987)
7. Lee, C.S., Cheang, P.Y., Moslehpour, M.: Predictive analytics in business analytics: decision tree. Adv. Decis. Sci., 1–29 (2022). From https://www.proquest.com/scholarly-journals/predictive-analytics-business-decision-tree/docview/2674049708/se-2
8. Ling, C. X., Li, C.: Data mining for direct marketing: Problems and solutions. In: The fourth international conference on knowledge discovery and data mining, pp. 73–79. New York, AAAI Press (1998)
9. Liu, B.: Supervised learning. In: Liu, B. (ed.) Web Data Mining, pp. 63–132. Springer, Berlin (2011)
10. Moro, S., Cortez, P., Rita, P.: A data-driven approach to predict the success of bank telemarketing. In: UCI Machine Learning Repository [Dataset] (2014). Retrieved Jan 21, 2023 from https://archive.ics.uci.edu/ml/datasets/Bank+Marketing
11. Moro, S., Laureano, R., Cortez, P.: Using data mining for bank direct marketing: an application of the CRISP-DM methodology. In: Proceedings of European Simulation and Modelling Conference-ESM 2011, pp. 117–1221. Guimaraes (2011)
12. Sulim Kim, H.L.: Customer churn prediction in influencer commerce: an application of decision trees. In: The 8th International Conference on Information Technology and Quantitative Management, pp. 1332–1339. Elsevier science bv (2021)
13. Chakraborty, S., Islam, S., Samanta, D.: Supervised learning-based data classification and incremental clustering. In: Data Classification and Incremental Clustering in Data Mining and Machine Learning, pp. 33–72. Springer Cham (2022)
14. Tékouabou, S., Gherghina, S., Toulni, H., Neves Mata, P., Mata, M., Martins, J.: A machine learning framework towards bank telemarketing prediction. J. Risk Financ. Manage., 269 (2022)
15. Warf, B.: Telecommunications and the globalization of financial services. The Professional Geographer 257–271 (1989)
16. Yuan, Y., Wu, L., Zhang, X.: Gini-impurity index analysis. IEEE Trans. Inf. Forensics Secur., 3156–3157 (2021)
Robust Feature Extraction Technique for Hand Gesture Recognition System
V. Yadukrishnan1, Abhishek Anilkumar1(B), K. S. Arun1, M. Nimal Madhu2, and V. Hareesh1
1 Center for Computational Engineering and Networking, Amrita School of Artificial Intelligence, Amrita Vishwa Vidyapeetham, Coimbatore 641112, India
[email protected], [email protected]
2 Department of Electrical Engineering, National Institute of Technology Calicut, Kozhikode, Kerala, India
Abstract. Hand gesture recognition (HGR) is a vital and versatile technology widely employed in various applications such as gaming, robotics, and interactive media. Beyond its broad utility, HGR systems play a crucial role in facilitating interactions for individuals with disabilities, empowering them with newfound capabilities. This paper introduces an innovative approach to feature extraction, leveraging Autoencoders to derive essential characteristics from the data. The extracted features are subsequently utilized for a comparative analysis of the accuracy achieved across different data sizes within the same dataset. By mapping the visual vocabulary of images, the Autoencoder generates a unified vector, which is then fed into Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbor (KNN) classifiers. Through this research, we aim to enhance the efficiency and effectiveness of hand gesture recognition systems, fostering advancements in human-computer interaction and augmenting the quality of life for individuals with disabilities.
Keywords: Autoencoder · SVM classifier · KNN classifier · Naïve Bayes
1 Introduction
Hand gesture recognition is a technology that enables machines to interpret and understand human sign language. It is an important part of machine learning, as it allows computers to interpret the physical movements of humans and provide useful feedback. Hand gesture recognition systems can be used in various applications, from controlling robots to recognizing sign language. To implement hand gesture recognition, a system must first be trained with data that contains a variety of hand gestures. This data is then used to create a machine learning model that can recognize and classify hand gestures. The model must be able to accurately predict the hand gesture, even when the gesture is slightly different than the training data.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 250–259, 2024. https://doi.org/10.1007/978-3-031-50327-6_26
Once the machine learning model is trained, it can be used to create a hand gesture recognition system. This system usually consists of a camera, sensors, and a computer that can interpret the data collected from the camera and sensors. The computer then uses the machine learning model to accurately classify the hand gesture, providing useful feedback for the user.
2 Literature Survey
Much research has already been carried out on HGR systems using machine learning and deep learning. Vijay et al. [1] studied an HGR system using different machine learning models (Naive Bayes, KNN, SVM, and a Hybrid Ensemble Classifier) for predicting the output, with Speeded Up Robust Features (SURF) used for feature extraction. The study found that the Hybrid Ensemble Classifier gives the best results compared with the other models, with an accuracy of 98%. Abhishek et al. [2] studied an HGR system for interaction with electronic systems. The HGR system is used to perform actions such as switching pages and scrolling up or down a page. The paper used a 3D Convolutional Neural Network for predicting the output on two types of dataset, one for hand detection and the other for motion or gesture detection: hand detection uses the EGO dataset, and motion or gesture recognition uses the Jester dataset. In 2020, Mohanarathinam et al. [3] studied an HGR system using a Gabor filter for feature extraction followed by principal component analysis (PCA) for dimensionality reduction. The study found that SVM gives the best results compared with a Convolutional Neural Network (CNN) and an Artificial Neural Network (ANN), with an accuracy of 97.5%. A shortcoming of that paper is that it does not check the accuracy for different SVM kernels. In 2017, Benalcázar et al. [4] studied an HGR system whose input was electromyography measured by a commercial sensor, the Myo armband. The paper used KNN and dynamic time warping algorithms for predicting the results, but did not test other machine learning models such as SVM or Naive Bayes. Padhy [5] used the Multilinear Singular Value Decomposition (MLSVD) technique for feature extraction on three different datasets: NinaPro, CapgMyo, and CSL-HDEMG. Zhang [6] compared two different autoencoders, a simple autoencoder (SAE) and a convolutional autoencoder (CAE), and found that the convolutional autoencoder gives better results than the simple autoencoder.
The proposed research aims to compare three machine learning models, KNN, Naïve Bayes, and SVM, in predicting hand gestures using the Indian Sign Language dataset. This research makes use of autoencoders for feature extraction, which helps in extracting all the necessary features. These features are then passed to the machine learning models for predicting the hand gestures. For SVM, we compare the results of all four kernels to find which kernel predicts with the highest accuracy.
3 Dataset
We have used the Indian Sign Language (ISL) dataset for training and testing the proposed HGR system. It consists of 1200 images of each number and letter of the alphabet, except the number “0”, as shown in Fig. 1. For the first dataset, we took 50 images of each hand gesture, for a total of 1749 images. For the second dataset, we took 200 images of each hand gesture, for a total of 6994 images. For the third dataset, we took all images, for a total of 41,998 images. The dataset was split in a proportion of 1:1 for testing and training purposes.
Fig. 1. Data sample
4 Methodology
The HGR process encompasses multiple stages of processing, each requiring the implementation of specific algorithms for successful execution. Preprocessing the images, extracting features from the preprocessed images, and predicting the gestures with different classification models are the main stages of HGR. Figure 2 gives an overview of the proposed HGR system.
4.1 Singular Value Decomposition (SVD)
SVD is a linear algebra technique that can be used to decompose a matrix into three matrices: U, S, and V. U and V are unitary matrices, and S is a diagonal matrix containing the singular values of the matrix. SVD is useful for data reduction and for finding patterns in data. It can also be used for data compression, for solving linear least squares problems, and for finding eigenvectors and eigenvalues.
SVD is also used as an image compression technique that works by breaking down an image matrix into its component singular values, which are then used to reconstruct the image. This technique has been used for decades and has become increasingly popular due to its high compression ratio and low distortion levels. By decomposing the image matrix into its singular values, SVD is able to drastically reduce the size of the image without significantly impacting its quality. This makes it an ideal technique for reducing the amount of space needed to store an image, as well as speeding up transmission time.
Here we have applied SVD to all three colour channels of the image and plotted the singular values to find the optimum number of singular values to retain, so that the least amount of information is lost during the compression of the image. The image size has been reduced from 168 to 48 by applying SVD. The images after compression are reconstructed by Eq. (1):
\[ \text{New Image} = (U_{new} \times S_{new}) \times V_{new} \tag{1} \]
where $U_{new}$ is the matrix U with all rows and columns 0 to N, $S_{new}$ is the diagonal matrix S with N rows and N columns, and $V_{new}$ is the matrix V with rows 0 to N and all columns.
4.2 Canny Edge Detection
Canny Edge Detection is an image processing technique used to detect edges in an image by looking for abrupt changes in intensity. It is based on the idea that the intensity of an edge is related to the gradients of the image. The algorithm works by first smoothing the image to reduce noise and then applying the Sobel operator to calculate the approximate gradients in the x and y directions. Finally, it applies non-maximum suppression, followed by hysteresis thresholding, to identify edges.
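The SVD compression step of Sect. 4.1 and its reconstruction formula, Eq. (1), can be sketched with NumPy as follows. This is an illustration under stated assumptions: the random array stands in for a real image, the function names are hypothetical, and reading "from 168 to 48" as keeping N = 48 of 168 singular values is an assumption, not something the text states explicitly.

```python
import numpy as np


def compress_channel(channel, n):
    """Rank-n approximation of one image channel, following Eq. (1)."""
    u, s, vt = np.linalg.svd(channel, full_matrices=False)
    u_new = u[:, :n]        # all rows, first n columns of U
    s_new = np.diag(s[:n])  # n x n diagonal matrix of singular values
    v_new = vt[:n, :]       # first n rows, all columns of V as returned by NumPy
    return (u_new @ s_new) @ v_new


def compress_image(image, n):
    """Apply the rank-n approximation to each colour channel separately."""
    channels = [compress_channel(image[:, :, c], n) for c in range(image.shape[2])]
    return np.stack(channels, axis=2)


# Hypothetical stand-in for a 168 x 168 RGB image.
rng = np.random.default_rng(0)
image = rng.random((168, 168, 3))

approx = compress_image(image, n=48)   # keep 48 singular values (assumed N)
full = compress_image(image, n=168)    # keeping all of them restores the image
error = np.linalg.norm(image - approx) / np.linalg.norm(image)
print(approx.shape, error)
```

Plotting the singular values `s` of each channel, as the text describes, is one way to choose `n`: values beyond the point where the curve flattens contribute little to the reconstruction.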
Fig. 2. Different stages of proposed HGR system (Raw image → SVD → Canny edge detection → Autoencoder → Machine learning algorithm)
Fig. 3. a Grey scale image, b Threshold image, c Canny edge image
After the images are compressed, they are converted into grey scale images (Fig. 3a) and then into threshold images (Fig. 3b), in which every pixel is either black or white. The threshold image is used as the input for Canny edge detection (Fig. 3c). The edges can be detected easily because of the sharp gradient change from black to white.
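The grey scale, threshold and gradient steps described above can be sketched in plain Python. This is a simplified illustration, not the authors' implementation: it stops at the Sobel gradient magnitude and omits the smoothing, non-maximum suppression and hysteresis stages of the full Canny detector. The helper names, the threshold of 128 and the 6×6 test image are assumptions.

```python
import math

def to_grey(rgb):
    """Average the three channels of each pixel (a simple greyscale rule)."""
    return [[sum(px) / 3 for px in row] for row in rgb]

def threshold(grey, t=128):
    """Binarise: pixels >= t become white (255), all others black (0)."""
    return [[255 if v >= t else 0 for v in row] for row in grey]

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Gradient magnitude at interior pixels; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out

# 6 x 6 test image: dark left half, bright right half
rgb = [[(20, 20, 20)] * 3 + [(230, 230, 230)] * 3 for _ in range(6)]
binary = threshold(to_grey(rgb))
edges = sobel_magnitude(binary)
```

In this toy image the gradient is non-zero only in the two columns on either side of the black-to-white boundary, which is why thresholding first makes the edge easy to locate.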
V. Yadukrishnan et al.
4.3 Autoencoder
In this stage we extract the necessary features from the processed images. We use a deep learning model, the autoencoder, to find and extract the features from the image. An autoencoder is an unsupervised learning algorithm that uses an artificial neural network (ANN) to encode data efficiently. Autoencoders are generally used to reduce the dimensionality of data while preserving important features. As shown in Fig. 4, an autoencoder consists of two parts: an encoder and a decoder. The encoder compresses the input data, while the decoder reconstructs it. Training is driven by a loss function that measures the difference between the original data and the reconstructed data, and the goal of an autoencoder is to minimize this reconstruction error.
Fig. 4. Working of Autoencoder
Here we use the encoder part of the autoencoder to extract the features from the image. We convert the Canny edge image into an array of pixel values and use this as the input to the autoencoder. The encoder extracts the important features and converts the image into a smaller representation of itself: the 72*72 image matrix is converted into a 1*32 feature vector. These 32 features form the compact code from which the decoder reconstructs the image, so they capture the information in the pixel values that contain the edges of the hand.
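To make the shapes concrete, the mapping from a 72*72 edge image to a 1*32 feature vector can be sketched as follows. Only the two sizes come from the text. The single dense layer in each direction, the ReLU and sigmoid activations, the mean squared error and the random, untrained weights are all assumptions made for illustration, since the architecture is not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)
IN_DIM, CODE_DIM = 72 * 72, 32   # sizes quoted in the text

# Untrained, randomly initialised weights (assumption: one dense layer each way)
w_enc = rng.normal(0.0, 0.01, (IN_DIM, CODE_DIM))
w_dec = rng.normal(0.0, 0.01, (CODE_DIM, IN_DIM))

def encode(x):
    """Encoder: compress a flattened image into a 32-value code (ReLU)."""
    return np.maximum(x @ w_enc, 0.0)

def decode(z):
    """Decoder: expand the code back to pixel space (sigmoid output)."""
    return 1.0 / (1.0 + np.exp(-(z @ w_dec)))

def reconstruction_loss(x):
    """Mean squared error between the input and its reconstruction."""
    return float(np.mean((x - decode(encode(x))) ** 2))

# Stand-in for a Canny edge image: sparse binary pixels (assumption)
edge_image = (rng.random((72, 72)) > 0.9).astype(float)
x = edge_image.reshape(1, -1)    # flatten to a 1 x 5184 row vector
features = encode(x)             # 1 x 32 feature vector passed to the classifiers
```

After training, only `encode` would be needed at inference time, which matches the text's use of the encoder alone for feature extraction.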
5 Machine Learning Algorithm
After the feature vectors are created, they are mapped to the corresponding output labels, divided into training and test sets, and passed to the machine learning algorithms.
5.1 K-Nearest Neighbor
KNN (K-Nearest Neighbors) is a simple supervised machine-learning algorithm for classification problems. It works by locating the K training points closest to a given point and assigning that point to the class that is most common among those neighbors. KNN is a non-parametric algorithm, which means that it does not make any assumptions about the underlying data distribution. KNN is simple yet powerful and is widely used for both classification and regression tasks.
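The majority vote described above can be written out directly. The sketch below is a minimal pure-Python version with Euclidean distance. The function name `knn_predict`, the two-dimensional toy points and the gesture labels are hypothetical, and in practice the inputs would be the 32-value feature vectors.

```python
from collections import Counter
import math

def knn_predict(train_x, train_y, query, k=5):
    """Label `query` by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data (assumption): two well-separated gesture classes
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["fist", "fist", "fist", "palm", "palm", "palm"]

near_origin = knn_predict(train_x, train_y, (0.5, 0.5), k=3)
near_corner = knn_predict(train_x, train_y, (5.2, 5.1), k=3)
```

Because every prediction scans the whole training set, the choice of K and the size of the data set both affect accuracy and run time.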
5.2 Naive Bayes Classifier
The Naive Bayes classifier is a probabilistic supervised machine learning algorithm based on Bayes' theorem, Eq. (2), that assumes independence between features. It is used mainly for classification and is especially common in text classification with high-dimensional training datasets. Naive Bayes classifiers make predictions based on the probability of each class given the input features. The method is quick and easy to implement and works well with small datasets.

$$P(a \mid b) = \frac{P(b \mid a)\, P(a)}{P(b)} \tag{2}$$
5.3 Support Vector Machine
The Support Vector Machine (SVM) is a supervised machine-learning algorithm used for both classification and regression problems. It is a powerful and flexible algorithm that can be applied to both linearly and non-linearly separable data sets. SVM works by finding the optimal hyperplane that separates the data into different classes. It is robust to outliers and works well with high-dimensional data sets. With kernel functions it can also find non-linear decision boundaries.
6 Results and Discussion
The software was developed in the Python programming language (Python 3). The program was run on a personal computer with an i5-1235U 4.4 GHz processor and 8 GB of RAM. The accuracy score is the standard metric for comparing classifier results; however, when dealing with an imbalanced dataset, precision, recall, and F1-score are more relevant metrics for understanding the results. A confusion matrix is a visual representation of the performance of a classifier on a set of data whose true values are known, which makes the results easier to interpret (Fig. 5). The accuracy score is given by

$$\text{Accuracy} = \frac{TP + TN}{N} \tag{3}$$

where TP and TN are the numbers of true positives and true negatives, respectively, and N is the total number of data samples used. Precision is given by

$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$

It measures the accuracy of a model's positive predictions. FP is the number of false positive predictions.
Recall is given by

$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$

It measures the ability of a model to identify all of the data points it should. FN is the number of false negative predictions.
Fig. 5. Confusion matrix
F1-score is given by

$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6}$$

It is the harmonic mean of precision and recall.
For KNN we tried values of K from 1 to 10. Comparing the results, we found the highest accuracy at K = 5. From Fig. 6 we can see that the accuracy is highest for the large data set. For Naïve Bayes we used the default parameters, and from Fig. 7 we can see that the small data set has the highest accuracy. This is because the linear separability of the data decreases as the data size increases. For SVM we calculated the accuracy score for all the kernels and found that the linear kernel has the highest accuracy. From Figs. 8, 9, 10 and 11 we can also see that the accuracy of the linear kernel decreases as the size of the data set increases, whereas the accuracy of all the other kernels increases with the size of the data set. Across the graphs, the highest accuracy is generally obtained for the larger data sets, but the accuracy remains high for the smaller data sets. From this we can infer that the feature extraction technique we used extracted the necessary features, which gave high accuracy even for a small data set.
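The four metrics in Eqs. (3) to (6) can be computed from the confusion-matrix counts as in the sketch below. It assumes a binary (one-vs-rest) view of a single class, and the counts are invented purely for illustration; they are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    n = tp + tn + fp + fn                      # total number of samples
    accuracy = (tp + tn) / n                   # Eq. (3)
    precision = tp / (tp + fp)                 # Eq. (4)
    recall = tp / (tp + fn)                    # Eq. (5)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (6)
    return accuracy, precision, recall, f1

# Hypothetical counts for one gesture class
acc, prec, rec, f1 = classification_metrics(tp=90, tn=85, fp=10, fn=15)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

For a multi-class problem such as gesture recognition, these per-class values are typically averaged over the classes to give a single score per metric.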
Fig. 6. Plot for KNN (accuracy, precision, recall and F1 score for data sets of 41998, 6994 and 1749 images)
Fig. 7. Plot for Naïve Bayes (accuracy, precision, recall and F1 score for data sets of 41998, 6994 and 1749 images)
Fig. 8. Plot for RBF kernel SVM (accuracy, precision, recall and F1 score for data sets of 41998, 6994 and 1749 images)
Fig. 9. Plot for Linear kernel SVM (accuracy, precision, recall and F1 score for data sets of 41998, 6994 and 1749 images)
Fig. 10. Plot for Sigmoid kernel SVM (accuracy, precision, recall and F1 score for data sets of 41998, 6994 and 1749 images)
Fig. 11. Plot for Poly kernel SVM (accuracy, precision, recall and F1 score for data sets of 41998, 6994 and 1749 images)
7 Conclusion
From the results we can conclude that high accuracy is obtained even for a small data set: KNN reached an accuracy of 99.6% and Naïve Bayes an accuracy of 99.4% on the small data set. From this we can infer that, compared to other feature extraction techniques, the autoencoder extracted the features correctly. Of all the machine learning models, SVM gave the highest accuracy, 99.9%. We compared the accuracy of all the SVM kernels and found that the linear kernel has the highest accuracy. We therefore conclude that, with the help of an autoencoder, high accuracy can be achieved even for a small dataset.
Adaptive Instance Object Style Transfer Anindita Das(B) Assam Down Town University, Guwahati, India [email protected]
Abstract. Style transfer is the procedure to regenerate a given target image in the style of another style image. In this work, a new instance object style transfer scheme is proposed which uses GrabCut segmentation along with a single-image super-resolution (SISR) network to preprocess the inputs and perform the transformation using a traditional style transfer network. A complete analysis is done with various combinations of target and style image pairs taking users’ opinions. For more effective comparison, the contour of the output images is extracted and also similarity is measured. Experimental outcomes show that the resulting images are better in visual quality, less distorted, and preserve more semantic information from the target image than the existing schemes. Keywords: Style transfer · NST · Artistic style · Photorealistic style · SISR · Segmentation · GrabCut segmentation
1 Introduction
Vincent van Gogh, Pablo Picasso and Leonardo da Vinci are a few of the artists who mastered the art of painting to create distinctive visual representations. Combining a target image with the unique style and skill of such an artist has always fascinated people. Redrawing an artwork in a particular style takes a skilled artist a lot of time, so many researchers began to study techniques that automatically render images into synthetic artforms. Researchers started with supervised learning methods [1–3], but their performance does not improve as the amount of data increases, and finding or creating such a huge dataset of image pairs (target image and style image) is not feasible. The computer vision technique Neural Style Transfer (NST) [4] opened a new direction by generating an output that keeps the content of the target image but appears to be "painted" in the style of the style image. This approach needs only one style reference rather than a pair of target and style images. In the context of instance object style transfer, we try to accomplish three competing objectives simultaneously: segmenting the object of interest perfectly, preserving the geometric features of the object, and transforming its local features. Another challenge, when realism is considered, is the scene complexity of the real world. Thus, our goal is to synthesize styles in the required objects in a way that can be applied to any input images
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 260–265, 2024. https://doi.org/10.1007/978-3-031-50327-6_27
creating artistic or photorealistic stylized images. In this work, a new pipeline is proposed that segments the object of interest and performs style transformation to render the output object in either artistic or photorealistic style by varying the amount of content information. The network must segment the objects of interest accurately, produce results with better visual quality, preserve more information from the target image, create less distortion, and give distinguishable outputs for different weights across all combinations of different types of images.
Wang et al. [5] presented a knowledge distillation method (named Collaborative Distillation) for encoder-decoder based neural style transfer, introducing a linear embedding loss to overcome the feature size mismatch. In 2020, Chiu and Gurari [6] designed a framework for universal style transfer based on an autoencoder and bottleneck feature transformation, which balances content preservation and style transfer. The video multi-style transfer (VMST) framework [7] performs fast multi-style video transfer within a single network. It combines four loss functions, namely perceptual loss, short-term temporal loss, long-term temporal loss and total variation loss, to solve the temporal flickering issue in stylization while preserving the content. Liang et al. [8] performed medical image transformation using CycleGAN with a combination of perceptual loss and total variation loss. Wavelet corrected transfer based on whitening and coloring transforms (WCT2) [9] has also been used for stylization. Several segmentation-based style transfer approaches have been proposed earlier. Kurzman et al. [10] proposed a Class-Based Styling (CBS) method that can style various objects of a content image using semantic segmentation; DAB-Net is used for segmentation in this scheme. In 2019, the instance style transfer (IST) model [11] combined segmentation and style transfer to render humans in the image in distinct styles. Schekalev and Kitov [12] proposed an approach that detects the central object automatically and transfers the style non-uniformly. Yu et al. [13] use forward stretching to transfer style to an instance directly: it maps pixels to fixed positions and interpolates the pixel values into a tensor representation.
2 Proposed Scheme
Segmenting the object of interest in the target image and performing style transfer on it accurately is a challenging task. To address this problem, the proposed pipeline (Fig. 1) starts with a segmentation technique, GrabCut segmentation [14], which is based on a graph cut algorithm. In this technique the user supplies a rectangle; pixels outside the rectangle are taken as background and pixels inside it as candidate foreground. A Gaussian Mixture Model (GMM) is fitted to the data and creates a new distribution of pixels in which unknown pixels are labeled as foreground or background depending on their relation to the neighboring pixels, based on a non-parametric fusion of texture information and color information. A min-cut algorithm is then used to segment the graph with a minimum cost function, where the cost is the sum of the weights of all the edges that are cut. At last, all the
Fig. 1. Schematic diagram
pixels connecting to source nodes become foreground and pixels connecting to sink nodes become background. The algorithm iterates until the classification converges and generates the final segmented image containing the object of interest. The resulting segmented image is then passed into a combination of networks [15]. It first enters the SISR network, which converts the segmented image to a high-resolution image. This network has two advantages over other networks: its high- and low-resolution subnets are connected in parallel, and it performs repeated multi-scale fusion with the low-resolution representations so that a single high-resolution output can be produced. A VGG-19 architecture pretrained on ImageNet is then used; the inputs are passed through it to extract the feature and texture representations of the target and style images, respectively, which are used in the loss functions. An adaptive content weight (Cw) and style weight (Sw) are introduced to shift the outputs between artistic and photorealistic images. These adaptive weights can be tuned as required. To achieve a good result in style transfer, the correct choice of loss function [16] is very important. The losses are computed on extracted features. The perceptual loss consists of a content loss and a style loss. We want the content image and the output image to have similar feature maps, as computed by the loss network VGG-19, which results in a low content loss. Similarly, we want a low style loss between the style image and the output image. The stylized instance object is reconstructed using the perceptual loss, the total variation loss and a regularization term, as shown in Eq. (1):

$$\hat{y} = \arg\min_{y} \; C_w\, l_c^{\phi,i}(y, y_c) + S_w\, l_s^{\phi,i}(y, y_s) + TV_w\, l_{TV}(y) + \lambda_R\, l_R(y) \tag{1}$$
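The weighted sum in Eq. (1) can be illustrated with NumPy on stand-in feature maps. This is a sketch under several assumptions: the feature arrays are random rather than VGG-19 activations, the style loss uses the Gram-matrix formulation common in neural style transfer, the default weights are arbitrary, and the regularization term is omitted because its form is not given in the text.

```python
import numpy as np

def content_loss(f_out, f_content):
    """Mean squared difference between output and content feature maps."""
    return float(np.mean((f_out - f_content) ** 2))

def gram(f):
    """Gram matrix of a (channels, height, width) feature map."""
    c, h, w = f.shape
    flat = f.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def style_loss(f_out, f_style):
    """Squared difference between Gram matrices (texture statistics)."""
    return float(np.sum((gram(f_out) - gram(f_style)) ** 2))

def total_variation(img):
    """Sum of absolute differences between neighbouring pixels."""
    dh = np.abs(img[1:, :] - img[:-1, :]).sum()
    dw = np.abs(img[:, 1:] - img[:, :-1]).sum()
    return float(dh + dw)

def total_loss(img, f_out, f_content, f_style, c_w=1.0, s_w=1e3, tv_w=1e-4):
    """Weighted objective of Eq. (1) without the regularization term."""
    return (c_w * content_loss(f_out, f_content)
            + s_w * style_loss(f_out, f_style)
            + tv_w * total_variation(img))

# Stand-in data (assumption): random feature maps and a small random image
rng = np.random.default_rng(0)
f_out, f_content, f_style = (rng.random((4, 8, 8)) for _ in range(3))
img = rng.random((8, 8, 3))
loss = total_loss(img, f_out, f_content, f_style)
```

Raising `c_w` relative to `s_w` pulls the optimum toward the content image, which is how the adaptive weights move the output between photorealistic and artistic results.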
After the reconstruction, the stylized segment is blended into the target image. For comparison, image fingerprinting is considered an efficient technique [17] for measuring the similarity between images. It uses the perceptual hash values of an image as its fingerprint. Perceptual hashing belongs to the class of comparable hash functions: features of the input images are used to generate fingerprints that can be compared with one another. Here the similarity is computed between two sets of images, which gives a measure of content preservation and realism.
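One simple perceptual hash is the average hash, sketched below in plain Python. The text does not say which hash variant was used, so this choice is an assumption, as are the function names and the tiny 4×4 greyscale images. A practical version would first convert each image to greyscale and downscale it to a small fixed size.

```python
def average_hash(grey):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [v for row in grey for v in row]
    mean = sum(pixels) / len(pixels)
    return [1 if v > mean else 0 for v in pixels]

def similarity(hash_a, hash_b):
    """Fraction of matching bits (1.0 means identical fingerprints)."""
    same = sum(a == b for a, b in zip(hash_a, hash_b))
    return same / len(hash_a)

# Toy 4 x 4 greyscale images (assumption): dark left half, bright right half
target = [[10, 10, 200, 200] for _ in range(4)]
stylised = [[30, 5, 180, 220],
            [25, 15, 210, 190],
            [0, 40, 170, 230],
            [20, 20, 160, 250]]
inverted = [[255 - v for v in row] for row in target]

same_structure = similarity(average_hash(target), average_hash(stylised))
opposite = similarity(average_hash(target), average_hash(inverted))
```

The stylised image changes individual pixel values but keeps the light and dark layout, so its fingerprint matches the target, whereas the inverted image does not.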
3 Experimental Results
The whole experiment was carried out on an NVIDIA GeForce GTX 1650 Max-Q GPU. The input and output images are of size (500, 500) with 3 channels. We set up the experiment with GrabCut segmentation, followed by SISR and the pretrained VGG-19 with the loss function. The proposed scheme was tested on all combinations of different types of images, including night-to-day and vice versa, with varied content weight while keeping the style weight fixed. Some of the results of the proposed scheme are shown in Fig. 2. The stylized image is generated by updating the parameters through backpropagation until the total loss is minimal. An empirical study was conducted to compare the visual effects. Twenty volunteers from different departments were invited to rate the output images from 1 to 5 according to their visual quality, with 1 the lowest and 5 the highest quality. Table 1 compares the mean opinion scores (MOS) of this scheme with the results of other methods.

Table 1 MOS comparison of artistic image and photorealistic image with other's results

Criteria            | Schekalev's [12] (%) | Artistic result (%) | Yu's [13] (%) | Photorealistic result (%)
Style information   | 20.33                | 38.14               | 48.83         | 55.3
Content information | 12.71                | 60.96               | 29.42         | 56.89
Visual effect       | 20.20                | 69.66               | 15.27         | 90.81
For a more effective comparison, the contours of the stylized images are extracted by converting the images to greyscale and applying the Sobel operator, and a similarity measure is then computed. This experiment shows that the resulting images have contours closer to those of the target image, with the highest similarity.
Fig. 2. Instance object style transfer
4 Conclusion
In instance object style transfer, the proposed approach uses GrabCut segmentation along with a single-image super-resolution (SISR) network to preprocess the inputs and performs the transformation using a traditional style transfer network with adaptive weights. It achieves good results in both artistic and photorealistic styles, with finer structure and less distortion. An empirical study of the scheme shows better visual effects with less memory usage and time consumption than other models. In addition, the contours of the stylized images are extracted and their similarity is measured, which shows that semantic information is preserved.
References
1. Gooch, B., Gooch, A.: Non-photorealistic Rendering. CRC Press (2001)
2. Rosin, P., Collomosse, J.: Image and Video-Based Artistic Stylisation, vol. 42. Springer Science & Business Media (2012)
3. Strothotte, T., Schlechtweg, S.: Non-photorealistic Computer Graphics: Modeling, Rendering, and Animation. Morgan Kaufmann (2002)
4. Gatys, L.A., Ecker, A.S., Bethge, M.: A Neural Algorithm of Artistic Style (2015). arXiv preprint arXiv:1508.06576
5. Wang, H., Li, Y., Wang, Y., Hu, H., Yang, M.H.: Collaborative distillation for ultra-resolution universal style transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1860–1869 (2020)
6. Chiu, T.Y., Gurari, D.: Iterative feature transformation for fast and versatile universal style transfer. In: European Conference on Computer Vision, pp. 169–184. Springer (2020)
7. Gao, W., Li, Y., Yin, Y., Yang, M.H.: Fast video multi-style transfer. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3222–3230 (2020)
8. Liang, Y., Lee, D., Li, Y., Shin, B.S.: Unpaired medical image colorization using generative adversarial network. In: Multimedia Tools and Applications, pp. 1–15 (2021)
9. Yoo, J., Uh, Y., Chun, S., Kang, B., Ha, J.W.: Photorealistic style transfer via wavelet transforms. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 9036–9045 (2019)
10. Kurzman, L., Vazquez, D., Laradji, I.: Class-based styling: Real-time localized style transfer with semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, pp. 0–0 (2019)
11. Stahl, F., Meyer, M., Schwanecke, U.: Ist-style transfer with instance segmentation. In: 2019 11th International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 277–281. IEEE (2019)
12. Schekalev, A., Kitov, V.: Style transfer with adaptation to the central objects of the scene. In: The International Conference on Neuroinformatic, pp. 342–350. Springer (2019)
13. Yu, Z., Wu, Y., Wang, T.: A Method for Arbitrary Instance Style Transfer (2019). arXiv preprint arXiv:1912.06347
14. Yong, Z., Jiazheng, Y., Hongzhe, L., Qing, L.: Grabcut image segmentation algorithm based on structure tensor. J. China Univ. Posts Telecommun. 24(2), 38–47 (2017)
15. Das, A., Sen, P., Sahu, N.: Adaptive style transfer using SISR. In: Proceedings of the International Conference on Computer Analysis of Images and Patterns (2021)
16. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super resolution. In: The European Conference on Computer Vision, pp. 694–711. Springer (2016)
17. Ito, K., Aoki, T.: Phase-based image matching and its application to biometric recognition. In: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (2013)
Case Study: A Review of Cybersecurity Policies and Challenges in Indonesia Adriel A. Intan and Rolly Intan(B) Informatics Engineering Department, Petra Christian University, Surabaya, Indonesia [email protected]
Abstract. The emergence of Industry 4.0 has made human life more dependent on the internet. However, because crucial sectors such as business and government systems are also involved, the security of cyberspace is being questioned. In this article, the authors examine the state of cybersecurity in Indonesia, particularly from 2018 to 2021. In general, the number of cyberattacks increased significantly over these four years. Furthermore, those attacks were greatly influenced by the political and economic conditions of Indonesia. As for motivation, the perpetrators may want to express protest and disappointment, or just have fun. This article also reviews and discusses the cybersecurity difficulties that Indonesia needs to handle along with the benefits that Indonesia can pursue, particularly in relation to Industry 4.0 and Society 5.0. It is also found that, to achieve strong cyber resilience, Indonesia needs collaboration among government, educational institutions, industries, and communities. Keywords: Cybersecurity · Cyberattacks · Cybercrime · Indonesia
1 Introduction
In the era of digitalization, cybersecurity plays a vital role in line with the development of IoT as one of the important technological pillars of Industry 4.0. Users of IoT devices, who generally lack information technology skills and knowledge, need to be protected from cybercrime, because they are frequent targets of cyberattacks with various motives. Along with the development of information technology, both cybersecurity and cybercrime activities are growing fast and becoming more complex and advanced. Based on its purchasing power parity (PPP) and population, Indonesia is without doubt a large country. This makes Indonesia rank first in Southeast Asia in terms of spending on information technology [1, 2]. On 19 May 2017, Indonesia's president issued Presidential Decree No. 53/2017 and 133/2017 (constituting document) to establish the National Cyber and Crypto Agency (Indonesian: Badan Siber dan Sandi Negara, abbreviated as BSSN). This agency aims to implement effective and efficient cybersecurity by utilizing, developing, and consolidating all elements related to cybersecurity. In 2020, the International Telecommunication Union, through its Global Cybersecurity Index (GCI) report, stated
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 266–273, 2024. https://doi.org/10.1007/978-3-031-50327-6_28
that Indonesia's cybersecurity score is 94.88 [3]. This score put Indonesia 24th globally and 6th in the Asia Pacific, a significant improvement compared to its position in 2018, when Indonesia ranked 41st globally and 9th in the Asia Pacific [4]. Nevertheless, according to BSSN, Indonesia encountered more cyberattacks each year from 2018 to 2021 [5–8] (as shown in Table 1). Moreover, the cyberattacks of 2021 were estimated to have caused Indonesia a potential economic loss of about IDR 14.2 trillion (USD 1 billion), and an estimated 22% of companies were attacked at that time [9]. The worsening cyberattacks were strongly influenced by the covid-19 pandemic, which pushed most people to work from home. According to Dan Lohrmann, the covid-19 pandemic caused a cyber pandemic [10]. Hence, the improvement in Indonesia's Global Cybersecurity Index was not accompanied by a decrease in the number of cyberattacks during the covid-19 pandemic.

Table 1 The number of cyberattacks in Indonesia.

Year | The number of cyberattacks
2018 | 232,447,974
2019 | 290,381,283
2020 | 495,337,202
2021 | 1,637,973,022
Reflecting on the damages that cyberattacks cause, Indonesia needs to come up with strategies to not only strengthen its cyber resilience but also recover from past attacks. This article will first discuss the cybersecurity situation in Indonesia, especially cyber policies, cybercrimes, and cyberattacks in 2018 and the three subsequent years. Afterward, considering the new coming era of Society 5.0 and Industry 4.0, this article will also explain about challenges and opportunities that Indonesia will face, particularly concerning social welfare and disruptive innovation and technology. Then, this article will conclude with a summary and suggestions on the cyber-policies that Indonesia may adopt.
2 Condition of Cybersecurity in Indonesia
As one of the biggest and fastest growing economies in the world, Indonesia has become a main target of cyberattacks and cybercrime. Based on the reports published by BSSN, as seen in Table 1, the number of cyberattacks increased sharply from 2018 to 2021, even though Indonesia's GCI in 2020 was much better than in 2018. The covid-19 pandemic is considered the main factor behind the extreme increase in cyberattacks because it increased internet usage. There were 171,200,000 internet users in 2019 and 210 million in 2021 (77% of the population), so in just two years the number of internet users grew by 38.8 million, or about 22.7 percent.
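The year-on-year growth implied by Table 1 can be checked with a short calculation. The figures below are copied from the table, and the helper name `yearly_growth` is hypothetical.

```python
# Number of cyberattacks per year, as reported in Table 1
attacks = {
    2018: 232_447_974,
    2019: 290_381_283,
    2020: 495_337_202,
    2021: 1_637_973_022,
}

def yearly_growth(series):
    """Percentage change of each year relative to the previous year."""
    years = sorted(series)
    return {
        year: (series[year] - series[prev]) / series[prev] * 100
        for prev, year in zip(years, years[1:])
    }

growth = yearly_growth(attacks)
for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}% vs previous year")
```

The same formula applied to the internet user counts, (210,000,000 - 171,200,000) / 171,200,000, gives the two-year growth in users.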
Every year, a different major issue arises regarding cyberattacks. In 2018 [5], the top three cyberattacks detected as anomalous traffic were the use of Trojans (80,325,815 attacks), attempted user privilege gain (37,298,158 attacks), and attempted DOS (24,459,816 attacks). Among 2885 public reports of cyberattacks, the most common case was malware (1758 cases), followed by fraud (862 cases). Malware detected in traffic reached 122 million attacks. Most cyberattacks on Indonesia in 2018 originated from Indonesia itself.
Fig. 1 The distribution of the number of cyberattacks from January to December 2019 [6].
Fig. 2 The number of anonymous connecting users in Indonesia [6].
The year 2019 is referred to as the political year because of the election and inauguration of the president and vice president in Indonesia. Different from 2018, the most
cyberattacks in Indonesia in 2019 came from US. In the last three years, from 2016 to 2018, the biggest cyberattack threat was malware. However, the biggest threat of cyberattacks changed to be attempted information leak in 2019. Cyberattacks spiked sharply in September and October and dropped dramatically in November (seen Fig. 1) [6]. This seems to be closely related to the presidential inauguration on October 20, 2019. Furthermore, the number of anonymous users in Indonesia has increased sharply since the end of 2017, 2018 and until the end of 2019. This is believed to be related to the 2017 DKI Jakarta gubernatorial election, 2018 West Java gubernatorial election, and 2019 presidential and vice-presidential elections (see Fig. 2). Internet users log in as anonymous users because they want to maintain privacy and do not want their identities to be known because there is a tendency to plan to carry out illegal actions [6]. In 2020, the US is still the largest cyber attacker in Indonesia with 128,713,177 attacks. Trojans are the anomaly with the highest number based on monitoring from BSSN for 2020. Among several types of Trojans, Allaple (72,374,625 attacks), ZeroAccess (59,099,810 attacks) and Scada Moxa (24,239,572 attacks) ranked the highest in cyberattacks. Data breach incidents became a big topic in Indonesia during 2020 due to 91,000,000 data identity users of Tokopedia as the largest e-commerce in Indonesia leaked on the internet [7]. Cyber attackers took advantage of the covid-19 issue to spread various malware. Some malwares related to the issue of covid-19 were COVIDLOCK, Coronavirus-Ransomware, Spyware Corona, etc. The biggest spike in cyberattacks occurred in 2021, which was 330% compared to 2020. In 2021, around 46.62% of all cyberattacks, or as many as 730,946,448 attacks were dominated by the MyloBot Botnet. 
Apart from MyloBot, several of the top 10 anomalies were also related to botnets, such as ZeroAccess and Discover Using Sock Agent [8]. A robot network, or botnet, is a collection of networked computers that are infected by malware and controlled by a party called a bot-herder. Botnets can be designed for spamming, data theft, ransomware, click fraud, DoS attacks, and other purposes. In 2021, Indonesia was again the home country of most cyber attackers targeting Indonesia, and the US was in second place. Furthermore, most web defacement attacks occurred at universities, with 2217 cases in 2021 [8].

The increasing number of internet users also gives rise to more malicious users. These so-called hackers are not necessarily professionals; they might be “script kiddies” who are trying to gain popularity from their immoral actions. Although such people are not skillful, they might still pose some danger. A recent example is the case of a hacker called Bjorka, who claimed to have stolen critical data, including citizen data from the General Elections Commission and SIM card registration data from the Ministry of Communication and Information Technology [11]. Despite the data theft, Mahfud MD, the Coordinating Minister for Political, Legal, and Security Affairs, claimed that Bjorka was nothing [12]. He further said that all the data revealed by Bjorka were fabricated and hence had no threat value. Nevertheless, this incident caused anxiety among citizens, who then urged the government to quickly solve the issue [13].

Regarding policies and laws to regulate and monitor cybersecurity, Indonesia still does not have a single rule or law that serves as a legal umbrella. Some of the legal rules and laws that regulate and relate to cybersecurity are as follows.
• UU No. 11/2008 as amended by UU No. 19/2016 regulates Information and Electronic Transactions.
• UU No. 14/2008 concerning Public Information Disclosure.
• UU No. 36/1999 concerning Telecommunications.
• UU No. 19/2002 as amended by UU No. 28/2014 concerning Copyright.
• UU No. 8/2010 concerning the Prevention and Eradication of the Crime of Money Laundering.
• Regulation of the minister of communication and information technology No. 5/2017 concerning Security of the Telecommunication Networks via Internet Protocol-based.
• Government Regulation (PP) No. 71/2019 concerning the Operation of Electronic Systems and Transactions.
• Regulation of the Ministry of Defense No. 82/2014 provides cyber defense guidelines.
• Presidential Decree No. 53/2017 and 133/2017 as the constituting documents for the establishment of BSSN.
• Presidential Regulation No 28/2021 concerning BSSN.

Apart from being still fragmented, all the rules and laws mentioned above have not covered all the current and future problems of cybersecurity in dealing with cybercrime which are increasingly diverse, complex, and advanced.
3 Challenges and Opportunities

With a ranking of 24th out of 194 countries in 2020, Indonesia’s GCI is quite high globally. The GCI assessment is based on only five pillars, namely: Legal, Technical, Organizational, Capacity Development and Cooperation (see Table 2 for detailed scores of all pillars).

Table 2 Scores of all GCI pillars of Indonesia in 2020 [14].

| Overall score | Legal measures | Technical measures | Organizational measures | Capacity development | Cooperative measures |
|---|---|---|---|---|---|
| 94.88 | 18.48 | 19.08 | 17.84 | 19.48 | 20.00 |

In addition to the GCI, there is another credible cybersecurity index that must be considered, namely the National Cyber Security Index (NCSI). Unlike the GCI, the NCSI uses 12 parameters to measure a country’s cybersecurity. Compared to the GCI, the NCSI is more comprehensive and accurate in evaluating a country’s cybersecurity. In the NCSI evaluation for 2020, Indonesia ranked 84th out of 160 countries, which is very low globally, with a score of 38.96% (see Table 3 for detailed scores of all parameters) [14]. The challenge for Indonesia is to improve its NCSI in the future by raising the scores of these 12 parameters.

Table 3 Scores of all NCSI parameters of Indonesia in 2020 [14].

| Parameter of assessment | Score (%) |
|---|---|
| Cybersecurity policy development | 0 |
| Cyber threat analysis and information | 20 |
| Education and professional development | 44 |
| Contribution to global cyber security | 17 |
| Protection of digital services | 20 |
| Protection of essential services | 0 |
| E-Identification and trust service | 89 |
| Protection of personal data | 25 |
| Cyber incident response | 67 |
| Cyber crisis management | 20 |
| Fight against cybercrime | 78 |
| Military cyber operations | 33 |

As explained in Sect. 2, Indonesia does not yet have one single rule or law that can be a reference and legal umbrella for solving various problems comprehensively in the cybersecurity sector. This is one of the reasons why Cybersecurity Policy Development is rated at 0% (see Table 3). There were attempts to create one single rule or law to regulate and monitor cybersecurity, but they failed because the drafts were deemed not to accommodate various interests. As a democratic country, Indonesia requires quite a long process to ratify a regulation or law because it must accommodate all the interests of the community. Considering the increasingly widespread and complex cybercrime in the digital era, the big challenge for the Indonesian government is to immediately finalize a comprehensive law on cybersecurity.

Cyberthreats are not a nationally independent issue, but rather a global issue, because most cyberthreats, such as malware and DDoS attacks, usually involve multiple countries. Besides, since the world is interconnected, one country with weak cybersecurity will affect other countries. Just as we overcame the Covid-19 pandemic by developing global immunity, we also need to build global resilience against cyberthreats. Indonesia, however, scored only 17% in Contribution to Global Cyber Security according to the NCSI report 2020 (see Table 3). Hence, contributing more to the development of global cybersecurity is a challenge for Indonesia.

Different political ideologies require different approaches to cybersecurity. For example, in a country where people’s privacy is sacred, securing a network using SSL decryption might lead to a privacy violation [15]. Hence, while there are already many cybersecurity strategies, Indonesia needs to carefully choose the ones that align with its ideology. Moreover, following the era of Society 5.0, the chosen strategies need to be focused on the welfare of society. Thus, Indonesia’s lack of organizational measures mentioned in GCI 2020 can be improved.

Not many digital applications or digital equipment used for personal or business purposes are made in Indonesia or created by Indonesians.
This is a challenge for Indonesia to ensure the security of products that are widely used by the public, both for personal and work needs. A survey conducted by Secure Code Warrior stated that 86% of technology
developers do not view application security as a priority [16]. This increasingly threatens cybersecurity and cyber resilience in Indonesia, especially for users who do not understand data security.

Along with the improving economy and education in Indonesia, and supported by digitalization in almost all aspects of life, more and more people and companies are aware of the important role of cybersecurity. Cybercrime experiences that are widely publicized through various social media and digital media provide very effective education in overcoming cybercrime. This provides an opportunity to develop and improve cybersecurity in Indonesia more quickly.

Another opportunity comes from government policies related to the role of BSSN in improving cybersecurity. Presidential Regulation No 28/2021 provides a wider space for BSSN to work more effectively, efficiently, and on target in the development of cybersecurity.

Entering the era of Industry 4.0 and Society 5.0, business, society, and government depend heavily on the internet in all aspects of life. Data and network security is essential to ensure that business, society, and government can run well. In this era, the function and role of the government in maintaining and ensuring social security applies not only in the real world but also in cyberspace. Therefore, to protect cybersecurity, many countries, including Indonesia, now have cyber police to combat cybercrime. Cyber police in Indonesia generally handle two types of cases, namely computer crimes and computer-related crimes. In addition to the threat of domestic cybercrime, Indonesia also faces the threat of cyberwarfare. This is a challenge as well as an opportunity for Indonesia to immediately establish a cyber force.
4 Conclusion

In the era of Industry 4.0 and Society 5.0, almost all sectors of people’s lives, companies, and governments are connected to the internet. Cybersecurity plays an increasingly important role in providing comfort and security when carrying out various transactions and activities. Cyberattacks motivated by cybercrime are generally closely related to political and economic interests. The number and types of cyberattacks can fluctuate every year depending on the situation, especially the political and economic situation at that time. For example, 2019 is considered a political year in Indonesia because of the presidential election and inauguration, and it recorded cyberattacks with political agendas coming not only from within the country but also from abroad. From 2016 to 2018, most cyberattacks were in the form of malware. In 2019, the dominant form changed to information leaks. Interestingly, in the first 9 months of 2019, the number and types of attacks still followed the previous years, but there was a sudden increase in attacks in September and October 2019. This phenomenon did not occur by chance; it is closely related to the election and inauguration of the president and vice president. It is also interesting to note that at that time most of the attacks came from the United States, in contrast to previous years. In addition to committing cybercrimes, cyberattacks can also be considered expressions of dissatisfaction, disappointment, and protest toward individuals, institutions, and
even governments. In this case, the track record of cyberattacks can be used as material for investigation and self-evaluation of a government or private agency. In the digital era, the role of the state to ensure social security is not only focused on the real world but also in cyberspace. The government is responsible for ensuring cyber security and cyber resilience that provides security, comfort, and prosperity in the life of the nation. However, to build strong cyber resilience, a quad helix collaboration is needed between governments, educational institutions, industry, and communities that support and build each other.
References

1. World Bank: https://www.worldbank.org/en/country/indonesia/overview. Last accessed 16 Oct 2022
2. Information Technology Sectors in Indonesia: https://www.cekindo.com/sectors/informationtechnology-in-indonesia. Last accessed 16 Oct 2022
3. Global Cybersecurity Index 2020: The International Telecommunication Union, p. 25, 29 (2021)
4. Global Cybersecurity Index 2018: The International Telecommunication Union, p. 58 (2019)
5. Indonesia Cyber Security Monitoring Report 2018: Id-SIRTII/CC (2019)
6. Indonesia Cyber Security Monitoring Report 2019: Id-SIRTII/CC (2020)
7. Laporan Tahunan Monitoring Keamanan Siber 2020: Id-SIRTII/CC (2021)
8. Laporan Tahunan Monitoring Keamanan Siber 2021: Id-SIRTII/CC (2022)
9. IDN Times: https://www.idntimes.com/business/economy/ridwan-aji-pitoko-1/kerugian-ekonomi-akibat-serangan-siber-mencapai-rp142-triliun?page=all. Last accessed 16 Oct 2022
10. Government Technology: The Year the COVID-19 Crisis Brought a Cyber Pandemic (2020). https://www.govtech.com/blogs/lohrmann-on-cybersecurity/2020-the-yearthe-covid-19-crisis-brought-a-cyber-pandemic.html. Last accessed 16 Oct 2022
11. Kompas.com: https://www.kompas.com/tren/read/2022/09/20/121000865/kilas-balik-isukebocoran-data--munculnya-bjorka-hingga-ruu-pdp-disahkan. Last accessed 16 Oct 2022
12. Detik: https://news.detik.com/berita/d-6304063/mahfud-md-bjorka-ndak-ada-apa-apanya. Last accessed 16 Oct 2022
13. CNN news: https://www.cnnindonesia.com/nasional/20220913161624-20-847280/rk-soalbjorka-masyarakat-jabar-resah-mohon-upaya-maksimal-pusat. Last accessed 16 Oct 2022
14. National Cyber Security Index: https://ncsi.ega.ee/country/id/470/#details. Last accessed 17 Oct 2022
15. CBT Nuggets: https://www.cbtnuggets.com/blog/certifications/security/ssl-decryption-benefits-challenges-and-best-practices. Last accessed 17 Oct 2022
16. Secure Code Warrior: https://www.securecodewarrior.com/press-releases/secure-code-warrior-survey-finds-86-of-developers-do-not-view-application-security-as-a-top-priority. Last accessed 17 Oct 2022

Lowering and Analyzing the Power Consumption of Smartphones

Imtiaj Ahmed1, Samiun Rahman Sizan1, Fariha Tabassum1, Md. Mostafijur Rahman1, Ahmed Wasif Reza1(B), and Mohammad Shamsul Arefin2,3(B)

1 Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh
[email protected]
2 Department of CSE, Daffodil International University, Dhaka, Bangladesh
[email protected]
3 Department of CSE, Chittagong University of Engineering and Technology, Chattogram, Bangladesh

Abstract. In this paper, we measure and analyze the power consumption of two widely used smartphone network technologies, 4G and Wi-Fi, and their usage patterns. 4G has a significant tail power consumption: the radio remains in high-power states after a transfer completes. These observations allow us to build a model of the power used by network activity for each technology. Using this model, we aim to reduce the power used by popular mobile applications. To accomplish this, we use TailEnder, a protocol that schedules transfers to minimize overall power use while still meeting user-specified delay-tolerance limits. TailEnder is proven to be within a factor of 1.28 of the optimal, and no deterministic online solution can achieve a better competitive ratio. We examine the benefits of TailEnder in case studies of news feeds, email, and web search, which demonstrate a considerable decrease in power use based on actual user logs. On mobile devices, TailEnder downloaded 60% more news feed updates and about 50% more web search results than the default configuration.

Keywords: Power analysis · Mobile network technology · Power usage · TailEnder protocol

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 274–288, 2024. https://doi.org/10.1007/978-3-031-50327-6_29

1 Introduction

Phones currently support 4G and Wi-Fi for data exchange. 4G is estimated to account for more than 50% of all mobile subscriptions globally and more than 70% in a few nations. To explore the power cost of these technologies, we first carried out an extensive measurement study of the power used for data transmission over Wi-Fi and 4G. In addition to overall transfer size, we discovered that power usage is directly connected to workload characteristics: the sporadic transmission of a few hundred bytes over 4G may use more power than the continuous transmission of a megabyte. Our main findings are as follows.

i. Almost 60% of the power used in 4G is tail power, which is spent in the high-power state that follows a typical transfer. By comparison, the ramp power spent switching to the high-power state before a transmission is small. Tail and ramp power are constant overheads that are amortized over larger transfer volumes or more frequent successive transfers.
ii. In Wi-Fi, the association overhead is comparable to the 4G tail power, but for all transfer sizes Wi-Fi’s data transfer is substantially more efficient than 4G.

Based on these findings, we created a simple model of the power used by network activity for each of the two technologies.

We assess TailEnder’s performance for email, news feeds, and web search. For each application, we obtained real user data giving the arrival time and size of each transfer. Compared to the default strategy, TailEnder can download 60% more news feed updates and more than 50% more web search results. TailEnder can cut power usage for email, news feeds, and web search by 35–52%. We also found that opportunistically using a Wi-Fi connection greatly reduces power consumption in comparison to pure 4G usage. Sending data via Wi-Fi when it is accessible saves power even if Wi-Fi is available only 50% of the time; in these applications, the reduction in consumption was more than fourfold. This is the first study to examine the power usage of two commonly utilized networks, 4G and Wi-Fi.
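The workload effect described above can be illustrated with a toy cost model. The following Python sketch is only an illustration under stated assumptions: the class `RadioModel`, its fields, and every numeric value are hypothetical placeholders and are not measurements from this study. It charges each isolated transfer a fixed ramp and tail overhead plus a part that grows with the transfer size, and it assumes that the sporadic transfers are spaced further apart than the tail time, so each one pays the full overhead.

```python
from dataclasses import dataclass


@dataclass
class RadioModel:
    """Toy per-transfer cost model; all values used below are placeholders."""
    ramp_cost: float     # fixed cost of switching to the high-power state
    tail_cost: float     # fixed cost of lingering in the high-power state
    cost_per_kb: float   # cost that grows with the amount of data sent

    def transfer_cost(self, size_kb: float) -> float:
        """Cost of one isolated transfer: fixed overheads plus a size-dependent part."""
        return self.ramp_cost + self.tail_cost + self.cost_per_kb * size_kb


def total_cost(model: RadioModel, sizes_kb: list) -> float:
    """Total cost of a list of transfers that do not share a tail."""
    return sum(model.transfer_cost(size) for size in sizes_kb)


# Hypothetical parameters, chosen only to show the shape of the model.
cellular = RadioModel(ramp_cost=1.0, tail_cost=6.0, cost_per_kb=0.01)

sporadic = total_cost(cellular, [0.5] * 20)   # twenty isolated 0.5 KB transfers
bulk = total_cost(cellular, [1024.0])         # one continuous 1 MB transfer
print(f"sporadic: {sporadic:.2f}, bulk: {bulk:.2f}")
```

With these placeholder values, the fixed overheads dominate the sporadic workload even though it moves far less data, which is the qualitative behavior reported for 4G.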
2 Related Work

The study in [1] assesses the power consumption of the Android system by introducing the Wakelock mechanism and the power management modules. The authors used DC power sources to determine how much electricity each hardware module of an Android phone uses. The paper shows real-time current intensity curves and the hardware modules’ indicators of power consumption. In standby, 4G consumes the most power and has the shortest standby time, 40 h, which is half of the 2G standby time. The least power is consumed in flight mode, where the standby time exceeds 10 days. The paper also reports that LCD screens in multimedia scenarios use the most electricity [1].

Another study analyzed the power usage of various display-related smartphone applications. A wide range of display content was tested, and the experiments were run on several Samsung smartphones. The authors confirmed that the sub-pixel matrix design strongly influences the AMOLED screen power model and that watching a video on an AMOLED screen consumes less power than expected. They also confirmed that camera recording causes high power usage that is limited by internal data transformation [2].

Another paper presents a thorough examination of the power use of an Android phone based on measurements of a device, showing how the device’s different
components affect overall power consumption. The authors created a model of power consumption for different scenarios and showed how it translates into overall power consumption and battery life under several usage patterns. For the experiments, they used the Openmoko Neo Freerunner smartphone rather than a commercial device. The detailed measurements were compared with a coarse-grained analysis of modern phones [3].

A good understanding of how power is used in smartphones enables good power management. A group of students in Sweden studied the power consumption of common mobile phone functions and some internet services. The authors surveyed the literature on mobile phone power consumption and operating systems and analyzed the basic services and applications of various 4G phones. The paper also presents the parameters that affect battery life in smartphones. The authors reported experiments on power measurements for basic applications such as voice calls, SMS, MP3 playback, Bluetooth, and the camera, as well as for tasks related to internet browsing. They stated that a hardware-based methodology using power monitors, which they also tested, provides a more reliable measurement of power consumption than a software-based management system [4].

Another paper presents a system framework for CPU power consumption on Android smartphones. The authors observed that the calculated ideal frequency is greater than the ideal frequency derived from user opinions, because they measured CPU utilization only at the initial stage, whereas usage over the whole run may give different results. Their approach works with various application states to choose the optimum frequency. The proposed method is also stated to save battery life while ensuring user satisfaction [5].

In Mauritius, one study made a statistical evaluation of the power consumption of cellular phones, focusing on web-based applications.
The authors performed the analysis in two parts. The first was a survey that identified the main types of cellular phone used in Mauritius, how frequently web applications are used, and the network access used for those applications. The survey results show that the most common phones were Nokia phones with the Symbian operating system (OS), and the most popular smartphones were Samsung models running Android. The second was an experiment on the amount of power used by mobile devices for social networking, YouTube video streaming, web surfing, and file downloading over 2G, 4G, and Wi-Fi connections. The study showed that the average power consumption for application usage on Nokia and Samsung phones in Mauritius is 16.76 MWh [6].

To assess three crucial metrics, namely availability, execution time, and power consumption, DSPN models have been created and validated for smartphone apps operating in an MCC environment. The authors estimated the execution time and power consumption of the upload procedure over Wi-Fi and 4G connections. The values for the experimental scenario were measured with an MCC infrastructure in which the application’s cloudlet can be accessed through wireless networks. The article concludes from numerical analysis that using a cloudlet improves performance and is power-efficient [7].
In the mobile world, numerous errors occur in systems such as operating systems, applications, firmware, hardware, or external devices. A newer class of bug is the power bug, or e-bug [8]: when an e-bug occurs, the mobile phone consumes an unexpectedly large amount of power. In response to the demands of mobile consumers, numerous strategies and technologies are being swiftly introduced. A system used to assess the performance, efficiency, and power estimation of a specific system is known as an “energy management system,” or “EMS.” According to mobile phone manufacturers, the operating system and its applications are the only two fundamental components that affect power usage. However, as illustrated in Fig. 1, several additional factors, namely user interaction, sensors, and wireless optimization, also play a role in power usage [9].
Fig. 1. Fraction of requests versus document rank.
This paper discusses these factors in relation to smartphone power consumption. Since mobile devices are a popular way to access cloud computing, giving rise to mobile cloud computing, the researchers in [10] and [11] have made several efforts to optimize the performance of mobile cloud computing. In [12], power-aware load planning algorithms for various software and programs were created to gauge the effects of application load on mobile phone power consumption. The approach investigated in [13] showed the connection between power usage and radio signal strength during data transmission, and new performance models for Wi-Fi and 4G were proposed. This helped measure the effect of strong and weak signals on the power consumption of the wireless components built into smartphones. Balasubramanian et al. [7] and Tawalbeh et al. [14] measured the power used for 802.11 connection and data transfer in wireless networks. The study concluded that the factors influencing the optimal data transfer strategy also include device context, phone, and operating system. A thorough investigation of effective and dependable mobile cloud computing was carried out in [8, 15] in a similar context and framework.
The study in [16] suggested a hybrid renewable system that uses sleep mode in addition to other energy-saving features. The works in [17–25] focus strongly on the field of green technologies.
3 Materials and Methods

3.1 Sampling

We used different mobile phones for testing and tested multiple times on each device. We used a power-profiling application that provides samples of instantaneous power measurements. To estimate the amount of energy consumed, we approximate the area under the power measurement curve over the relevant time frame. The idle power is subtracted from the power measured during the ramp, transfer, and tail periods. The power required for each data transmission is then estimated as the area under the power curve from the end of the ramp to the beginning of the tail.

We tested on the Xiaomi Poco F1, iPhone 13, Xiaomi Redmi Note 10 Lite, Note 7 Pro, and Nokia 6.1. Xiaomi devices are among the most popular and widely available, and the iPhone is also well known and widely used. Because most people use these devices, we selected them as our test devices and collected data from them.

3.2 Experimental Setup

We chose these devices (Table 1) because all were readily available and common among students. It was difficult for us to measure power at the component level. Measuring display power required particular attention to brightness, because the display consumes more or less power at different brightness levels.

Table 1. Device specifications.

| Component | Device 1 (Xiaomi Poco F1) | Device 2 (iPhone 13) |
|---|---|---|
| Battery | Li-Po 4000 mAh | Li-Ion 3240 mAh non-removable |
| Cellular | GSM/LTE | GSM/LTE/5G |
| Wi-Fi | 802.11 ac (Wi-Fi 5) | Wi-Fi 802.11 |

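The area-under-the-curve estimate described in Sect. 3.1 can be sketched in Python as follows. This is a minimal illustration, not the profiling application used in the experiments: the function `energy_above_idle`, the sample trace, and the idle level are hypothetical, and the trapezoidal rule is one assumed way of approximating the area between samples.

```python
def energy_above_idle(times_s, power_w, idle_w, start_s, end_s):
    """Approximate the area under (power - idle) between start_s and end_s.

    The trapezoidal rule is applied to the samples that fall inside the window.
    """
    samples = [(t, p - idle_w) for t, p in zip(times_s, power_w)
               if start_s <= t <= end_s]
    energy_j = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        energy_j += 0.5 * (p0 + p1) * (t1 - t0)
    return energy_j


# Hypothetical trace: idle, ramp up, transfer, tail, back to idle.
times_s = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
power_w = [0.1, 1.1, 1.1, 0.6, 0.6, 0.1]
IDLE_W = 0.1

# Whole burst (ramp + transfer + tail) versus the transfer window only,
# taken here as the span from the end of the ramp (1 s) to the start of the tail (2 s).
whole_burst_j = energy_above_idle(times_s, power_w, IDLE_W, 0.0, 5.0)
transfer_only_j = energy_above_idle(times_s, power_w, IDLE_W, 1.0, 2.0)
print(f"whole burst: {whole_burst_j:.2f} J, transfer only: {transfer_only_j:.2f} J")
```

Separating the two windows makes the ramp and tail overhead visible as the difference between the whole-burst value and the transfer-only value.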
4 Implementation and Experimental Result Analysis

4.1 Delay Tolerance

First, we offer a straightforward illustration of how applications could save power through delay tolerance. Consider a user who sends two emails in quick succession. By default, each email is sent as soon as it arrives, so the device stays in the high-power mode for two periods of inactivity. If the user is willing to wait a few minutes, the device can send both emails at once and stay in the high-power mode for only one inactivity period. According to our tests, the second approach halves the power used for small to medium-sized emails.

4.2 Transmission to Minimize Power

TailEnder schedules the transmission of phone-generated requests so as to minimize overall power consumption while ensuring that all requests are transmitted by their deadlines. Consider $n$ requests of the same size, where each request $r_i$ has an arrival time $a_i$ and a deadline $d_i$ by which its transmission must start. When a request $r_i$ is transmitted, the radio switches to a high-power mode, sends $r_i$ immediately, and then stays in this mode for $T$ units of time, which is equal to the tail time. We neglect the comparatively small extra power needed to transition into the high-power state. Note that the device runs in the high-power mode for only $T$ time units if several requests are sent at once. For a given request schedule, let $\Phi$ denote the total amount of time spent in high-power states. The problem is to find a schedule $s_1, s_2, \ldots, s_n$ that minimizes $\Phi$ subject to $a_i \le s_i \le d_i$.

TailEnder is a simple online algorithm for scheduling the transmission of incoming requests $r_i$. The main idea is to transmit a request if its deadline has been reached or if it arrives within $x \cdot T$ time units after the previous deadline. Because arrival times are not known in advance, the scheduling problem must be solved online. To bound the competitive ratio, let ALG denote the online algorithm and ADV an adversary that generates the requests.

i. Case 1: Suppose that ALG schedules the request right at $x \cdot T$. ADV delays the request up to the nearest deadline and generates new requests only after that deadline. Up to the deadline, $\Phi(\mathrm{ADV}) = T$ because ADV schedules only the first transmission, whereas $\Phi(\mathrm{ALG}) = (1 + x)T$. The competitive ratio up to the final deadline is $(1 + x)/1$. This ratio remains valid even if we consider the total cost of subsequent requests.
ii. Case 2: Suppose that ALG does not schedule the request before $x \cdot T$. ADV then schedules the request at $x \cdot T$ on arrival and stops generating more requests. Since ADV has no additional requests to schedule, $\Phi(\mathrm{ADV}) = (1 + x)T$. However, ALG must still schedule the request after $x \cdot T$, and thus $\Phi(\mathrm{ALG}) = (2 + x)T$. The competitive ratio in this case is $(2 + x)/(1 + x)$.

TailEnder is a protocol that reduces power consumption while still meeting user-specified delay-tolerance limits. For applications that benefit from it, TailEnder prefetches data aggressively, even data that may turn out to be useless, when this lowers overall power usage. Tests on mobile devices show that TailEnder can download 50% more web search results and 60% more news feed updates than conventional methods. The TailEnder scheduler follows the rule described above.
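The scheduling rule and the cost $\Phi$ described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' listing: the function names `high_power_time` and `tailender_schedule`, the queue handling, and the values chosen for $T$ and $x$ are hypothetical. A request is sent on arrival if it arrives within $x \cdot T$ of the last deadline at which a transmission took place; otherwise it waits, and all waiting requests are sent together when the earliest waiting deadline is reached.

```python
def high_power_time(send_times, tail):
    """Phi: total length of the union of the intervals [s, s + tail]."""
    total, covered_until = 0.0, float("-inf")
    for s in sorted(send_times):
        end = s + tail
        if s >= covered_until:
            total += tail                    # a fresh tail period
        elif end > covered_until:
            total += end - covered_until     # extends the current tail
        covered_until = max(covered_until, end)
    return total


def tailender_schedule(requests, tail, x):
    """requests: list of (arrival, deadline) pairs; returns one send time per request."""
    order = sorted(range(len(requests)), key=lambda i: requests[i][0])
    send = [None] * len(requests)
    queue = []                               # indices of requests waiting for a deadline
    last_deadline = float("-inf")            # last deadline at which data was sent

    def flush(now):
        nonlocal last_deadline
        for j in queue:
            send[j] = now                    # batch every waiting request
        queue.clear()
        last_deadline = now

    for i in order:
        arrival, _deadline = requests[i]
        if queue:
            earliest = min(requests[j][1] for j in queue)
            if earliest <= arrival:          # a waiting deadline expired first
                flush(earliest)
        if arrival <= last_deadline + x * tail:
            send[i] = arrival                # radio is still in its tail: send now
        else:
            queue.append(i)
    if queue:
        flush(min(requests[j][1] for j in queue))
    return send


# Hypothetical example: two emails 60 s apart, each tolerating a 300 s delay.
T, X = 10.0, 0.5                             # placeholder tail time and threshold
emails = [(0.0, 300.0), (60.0, 360.0)]
default_sends = [arrival for arrival, _ in emails]
tailender_sends = tailender_schedule(emails, T, X)
print(high_power_time(default_sends, T), high_power_time(tailender_sends, T))
```

In this example the default policy pays for two tail periods while the batched schedule pays for one, which mirrors the two-email illustration in Sect. 4.1.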
The fraction of requests versus document rank can also be evaluated using the TailEnder scheduling method. In this context, the graph shows the percentage of user requests that can be fulfilled within their respective deadlines by documents at different ranks or positions in the search engine result page. The TailEnder scheduling method determines whether or not to transmit a request at time $t$; in Fig. 1, the deadline is $d_i$ and the request’s arrival time is $a_i$.

Using built-in and third-party power management applications, we analyzed the power consumption of different applications. The results for the mobile phones are given in Tables 2, 3, 4, 5 and 6.

Table 2. Device 1 Wi-Fi usage

| Application | Battery usage (%) | Usage in milliampere-hours (mAh) |
|---|---|---|
| YouTube | 22.7 | 693.0 |
| Messenger | 2.3 | 71.8 |
| AccuBattery | 1.8 | 55.1 |
| Settings | 1.1 | 33.0 |
| QuickStep | 0.4 | 11.7 |

Table 3. Device 1 4G usage

| Application | Battery usage (%) | Usage in milliampere-hours (mAh) |
|---|---|---|
| YouTube | 25.9 | 781.1 |
| Messenger | 2.4 | 73.2 |
| AccuBattery | 1.8 | 55.1 |
| Settings | 1.1 | 33.0 |
| QuickStep | 0.4 | 11.7 |

Table 4. Device 2 usage

| Application | Battery usage (%) |
|---|---|
| YouTube | 46 |
| Messenger | 30 |
| AccuBattery | 1 |
| Settings | 1 |
| QuickStep | 1 |

Table 5. Device 3 usage

| Application | Battery usage (%) |
|---|---|
| YouTube | 38 |
| Messenger | 22 |
| AccuBattery | 2 |
| Settings | 1 |
| QuickStep | 1 |

Table 6. Device 4 usage

| Application | Battery usage (%) |
|---|---|
| YouTube | 50 |
| Messenger | 38 |
| AccuBattery | 2 |
| Settings | 2 |
| QuickStep | 1 |

We performed 5–10 tests on each mobile device to obtain accurate results, running data transfer experiments driven by application-level traces. We convert an application trace into a series of transfers $S = \{\langle x_1, a_1 \rangle, \langle x_2, a_2 \rangle, \ldots, \langle x_n, a_n \rangle\}$ such that the mobile device downloads data of size $x_i$ at time $a_i$. Then, starting from a fully charged state, we repeat this series of transfers until the battery is completely drained. We run two transfer sequences, one generated by Default and the other by TailEnder. TailEnder schedules the transfers based on a hint from the application indicating whether it is delay-tolerant or would benefit from prefetching. Default processes transfers as they arrive.

Our experiments focus on two applications: web search and downloading Tech news feeds. The metric for the news feed application is the number of stories downloaded, whereas the metric for web search is the number of requests for which all user-requested documents have been delivered. Table 7 shows the results of the news feed experiment. TailEnder downloads more than 50% more news updates than Default and nearly doubles the total data downloaded, from 117 to 230 MB. TailEnder uses 51% less power than Default for the Tech news stream.

Table 7. News feed experiment.

| | Default | TailEnder |
|---|---|---|
| Stories | 1401 | 3910 |
| Total transfer size (MB) | 117 | 230 |

The average number of transfers is reduced by 45% for the same amount of power. Although prefetching transfers about 10 times more data per transfer, it is nevertheless power-efficient. According to our model-driven analysis (Fig. 2), TailEnder uses 40% less power than Default when performing web searches. The results of the web search experiment are presented in Table 8. Compared to Default, TailEnder serves 50% more queries. The power of a mobile phone is a limited resource, and the power drawn by wireless technologies becomes critical as these devices become more prevalent. For this reason, we carried out a thorough measurement analysis and discovered a significant tail power overhead in GSM and 4G (Fig. 3).

4.3 Result Analysis
This analysis assumes that the time users spend thinking before requesting a document is longer than the inactivity timeout value. This assumption is not made in the evaluation or the prefetching process. If the top x documents are prefetched, the fraction of power saved is estimated as:

\[ \frac{Y \cdot p(x) - R(x)}{TY} \tag{1} \]

Figure 4 shows the power savings predicted by Eq. 1 as x increases. Y, R(x), and TY are derived from the 4G power measurements, and the values of p(x) are calculated from the statistics displayed in Fig. 5. The document size is fixed at the typical size of the web documents found in the search history. The most power is saved by prefetching 10 web documents. When more documents are prefetched, the cost of prefetching exceeds the power saved. When too few documents are prefetched, users may request documents that were not prefetched, so the projected power savings are minimal. For each user query, TailEnder therefore prefetches ten web pages. The measurements report the average power consumed to download 50 KB of data with 20 s between transfers, and compare the power used to download data of various sizes against the inter-transfer time. All values, for both 4G and GSM, are averaged over more than 100 trials.
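
The estimate in Eq. 1 can be sketched in a few lines of Python. This is only an illustration: it reads Eq. 1 as (Y · p(x) − R(x)) / TY, and the function names `expected_saving` and `best_prefetch_count` are hypothetical. The values of p(x) and R(x) must come from measurements such as those behind Figs. 4 and 5; none are supplied here.

```python
def expected_saving(Y, p_x, R_x, TY):
    """Fraction of power saved when the top x documents are prefetched.

    Assumes Eq. 1 is read as (Y * p(x) - R(x)) / TY.
    """
    return (Y * p_x - R_x) / TY


def best_prefetch_count(Y, TY, p, R):
    """Return the candidate x with the largest expected saving.

    p and R are dictionaries mapping each candidate x to p(x) and R(x);
    both are supplied by the caller from measured data.
    """
    return max(p, key=lambda x: expected_saving(Y, p[x], R[x], TY))
```

Because p(x) grows slowly once the most requested documents are covered while R(x) keeps growing with x, a search of this kind has an interior maximum, which is consistent with the peak described in the text.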
Fig. 2. 95% confidence interval displayed by vertical bars.

Table 8. Web search experiment.
                               Default    TailEnder
Queries                        680        1010
Documents                      872        10101
Transfers                      1478       1010
Per-query average transfers    9.5 K      148.05 K

Fig. 3. Online search—using TailEnder for each query (the power improvement CDF).
Fig. 4. Power savings in percentages to be expected in relation to the volume of documents finalized.
4.4 Result Evaluation
TailEnder is evaluated using real-world tests on phones. TailEnder's power minimization efficacy depends heavily on application traffic and user behavior. News feeds and e-mail are examples of applications that can tolerate some latency. We use
Fig. 5. Wi-Fi, 4G, and GSM measurements.
the application traces we acquired to conduct a trace-driven evaluation of TailEnder's performance for varied parameters and configurations. Our measurement analysis yields a power model (Table 9) for per-byte data transfer.

Table 9. Modeling the power required to obtain X bytes of data through 4G, GSM, and WiFi networks.
                            4G                 GSM                WiFi
Transferring power, R(X)    0.035(X) + 4.65    0.045(X) + 3.8     0.010(X) + 6.0
Power, P                    0.68 J/s           0.26 J/s           N/A
Maintenance, M              0.04 J/s           0.04 J/s           0.06 J/s
Time, T                     12.6 s             7 s                N/A
Power Transferring          12.6 J             6.0 J              9.7 J

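
To make the model in Table 9 concrete, the following Python sketch compares sending several transfers separately with sending them back to back. It is an illustration under stated assumptions, not the authors' evaluation code: the coefficients are copied from Table 9, but treating X as kilobytes, treating the N/A entries for WiFi as zero, charging one tail (P × T) per isolated transfer, and letting a batch share a single tail and a single fixed offset are all assumptions made here. The function names are hypothetical.

```python
# Coefficients copied from Table 9; the dictionary layout is illustrative.
MODEL = {
    "4G":   {"slope": 0.035, "offset": 4.65, "tail_power": 0.68, "tail_time": 12.6},
    "GSM":  {"slope": 0.045, "offset": 3.8,  "tail_power": 0.26, "tail_time": 7.0},
    # Assumption: the N/A entries for WiFi are treated as "no tail".
    "WiFi": {"slope": 0.010, "offset": 6.0,  "tail_power": 0.0,  "tail_time": 0.0},
}


def transfer_cost(network, size):
    """R(X) from Table 9: linear cost of moving `size` units of data."""
    m = MODEL[network]
    return m["slope"] * size + m["offset"]


def tail_cost(network):
    """Cost of lingering in the high-power state after a transfer (P * T)."""
    m = MODEL[network]
    return m["tail_power"] * m["tail_time"]


def separate_cost(network, sizes):
    """Every transfer is sent on its own and pays its own tail."""
    return sum(transfer_cost(network, s) + tail_cost(network) for s in sizes)


def batched_cost(network, sizes):
    """All transfers are sent back to back and share one tail (assumption)."""
    return transfer_cost(network, sum(sizes)) + tail_cost(network)
```

Under these assumptions, calling `separate_cost("4G", [50, 50, 50])` and `batched_cost("4G", [50, 50, 50])` shows the batched schedule costing less, which is the intuition behind deferring delay-tolerant transfers.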
Wi-Fi consumes less power per byte than 4G but is less widely accessible. One possible solution is to switch to the 4G interface when Wi-Fi is inaccessible. When WiFi is always accessible, the power consumption is 15% lower than Default and less than 4 times lower than TailEnder. This result shows that the combination of 4G and Wi-Fi networks can provide notable power advantages to mobile nodes without compromising network accessibility. Next, we run a data experiment on smartphones using application-level traces. We convert the application trace into a sequence of transfers S = {(s1, a1), (s2, a2), …, (sn, an)} such that data of size si is downloaded by the smartphone at time ai. The metric for news feed applications is the number of articles downloaded, while the metric for web search is the number of queries for which all of the user's requested documents were returned. In a web search
experiment with prefetching, TailEnder needs 45% fewer transfers on average while serving 50% more queries for the same amount of power. Prefetching uses less power even though it transfers 10 times as much data per transfer (Fig. 6).
Fig. 6. Web search: Switching between WiFi and 4G networks having average power savings.
5 Conclusion
TailEnder offers a straightforward API for applications and may be used in operating systems. A request only has to include a delay tolerance for each transmitted item. To save battery life, modern mobile phones such as the iPhone already require users to specify latency tolerance limits for particular applications. Future work is required to integrate TailEnder into the kernel and to improve the interface so that it is simpler for end users and programmers to use. However, TailEnder's power savings depend on how the user uses applications on the mobile device. Knowledge of mobile usage habits brings two advantages. First, it aids in calculating the power advantages of cross-application optimization. Second, usage trends show how much time mobile users spend in each application, which makes it possible to calculate a user's typical daily power savings for a specific usage pattern. To properly quantify the power advantages of TailEnder for mobile users and to identify cross-application opportunities, we will collect evidence of mobile usage trends as part of our ongoing work.
References 1. Riaz, M.N.: Energy consumption in hand-held mobile communication devices a comparative study. In: 2018 International Conference on computing, Mathematics and Engineering Technologies (iCoMET) (2018)
2. Fowdur, T.P., Hurbungs, V., Beeharry, Y.: Statistical analysis of energy consumption of mobile phones for web-based applications in Mauritius. In: 2016 International Conference on Computer Communication and Informatics (ICCCI), pp. 1–8 (2016). https://doi.org/10. 1109/ICCCI.2016.7480018 3. Rumi, M.A., Asaduzzaman, Hasan, D.M.H., Bai, G., Mou, H., Hou, Y., Lyu, Y., Yang, W.: Android power management and analyses of power consumption in an android smartphone. In: 2015 3rd International Conference on Green Energy and Technology (ICGET) (2015) 4. Chen, X., Chen, Y., Ma, Z., Fernandes, F.C.A.: How is energy consumed in smartphone display applications? In: Proceedings of the 14th Workshop on Mobile Computing Systems and Applications—HotMobile’13 (2013). https://doi.org/10.1145/2444776.2444781 5. Bai, G., Mou, H., Hou, Y., Lyu, Y., Yang, W.: Android power management and analyses of power consumption in an android smartphone. In: 2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (2013). https://doi.org/10.1109/ HPCC.and.EUC.2013.338 6. Mendonça, J., Andrade, E., Lima, R.: Assessing mobile applications performance and energy consumption through experiments and Stochastic models. Computing 101(12), 1789–1811 (2019). https://doi.org/10.1007/s00607-019-00707-6 7. Balasubramanian, N.A., Balasubramanian, Venkataramani, A.: Energy consumption in mobile phones: a measurement study and implications for network applications. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, pp. 280–293. ACM (2009) 8. Shye, A., Scholbrock, B., Memik, G., Dinda, P.A.: Characterizing and modeling user activity on smartphones: summary. In: ACM SIGMETRICS performance evaluation review, vol. 38, no. 1, pp. 375–376. ACM (2010) 9. 
Al-Ayyoub, M., Jararweh, Y., Tawalbeh, L., Benkhelifa, E., Basalamah: A power optimization of large-scale mobile cloud computing systems. In: 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud), 24 Aug 2015, pp. 670–674. IEEE 10. Bahwaireth, K., Tawalbeh, L., Basalamah, A.: Efficient techniques for energy optimization in mobile cloud computing. In: 12th ACS International Conference on Computer Systems and Applications-AICCSA, Morocco, 17–20 Nov 2015. IEEE 11. Schulman, A., Navda, V., Ramjee, R., Spring, N., Deshpande, P., Grunewald, C., Jain, K., Padmanabhan: Bartender: a practical approach to energy-aware cellular data scheduling. In: Proceedings of the Sixteenth Annual ˙International Conference on Mobile Computing and Networking, pp. 85–96. ACM (2010) 12. Ding, N., Wagner, D., Chen, X., Pathak, A., Hu, Y.C., Rice, A.: Characterizing and modeling the impact of wireless signal strength on smartphone battery drain. In: n ACM SIGMETRICS Performance Evaluation Review (Vol. 41, No. 1, pp. 29–40). ACM 13. Rice, A. and Hay:Measuring mobile phone energy consumption for 802.11 wireless networking. Pervasive and Mobile Computing”, 6(6), pp.593–606,2010 14. Tawalbeh, L.A. , Alassaf , N. W., Bakheder ,Tawalbeh ,A. :Resilience Mobile Cloud Computing: Features, Applications and Challenges. 2015 Fifth International Conference on e-Learning (econf) 2015 Oct 18 (pp. 280–284) IEEE 15. Fekete,K., Csorba.K., Vajk,T., Forstner, B. And Pandi, B. : Towards an energy efficient code generator for mobile phones.Cognitive Info communications (CogInfoCom), 2013 ˙IEEE 4th International Conference on (pp. 647–652) 2013, December 16. Rahman,M, A ., Asif,S., Hossain,M, S., Alam,T., Reza,A, W., Arefin, M, S. : A Sustainable Approach to Reduce Power Consumption and Harmful Effects of Cellular Base Stations, ): ICO 2022, LNNS 569, pp. 695–707, 2023. https://doi.org/10.1007/978-3-031-19958-5_66
17. Yeasmin, S., Afrin, N., Saif, K., Reza, A.W., Arefin, M.S. (2023). Towards Building a Sustainable System of Data Center Cooling and Power Management Utilizing Renewable Energy. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_67 18. Liza, M.A., Suny, A., Shahjahan, R.M.B., Reza, A.W., Arefin, M.S. (2023). Minimizing EWaste Through Improved Virtualization. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/9783-031-19958-5_97 19. Das, K., Saha, S., Chowdhury, S., Reza, A.W., Paul, S., Arefin, M.S. (2023). A Sustainable E-waste Management System and Recycling Trade for Bangladesh in Green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_33 20. Ahsan, M., Yousuf, M., Rahman, M., Proma, F.I., Reza, A.W., Arefin, M.S. (2023). Designing a Sustainable E-Waste Management Framework for Bangladesh. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_104 21. Mukto, M.M., Al Mahmud, M.M., Ahmed, M.A., Haque, I., Reza, A.W., Arefin, M.S. (2023). A Sustainable Approach Between Satellite and Traditional Broadband Transmission Technologies Based on Green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. 
https://doi.org/10.1007/978-3-031-199585_26 22. Meharaj-Ul-Mahmmud, Laskar, M.S., Arafin, M., Molla, M.S., Reza, A.W., Arefin, M.S. (2023). Improved Virtualization to Reduce e-Waste in Green Computing. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_35 23. Banik, P., Rahat, M.S.A., Rafe, M.A.H., Reza, A.W., Arefin, M.S. (2023). Developing an Energy Cost Calculator for Solar. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-03119958-5_75 24. Ahmed, F., Basak, B., Chakraborty, S., Karmokar, T., Reza, A.W., Arefin, M.S. (2023). Sustainable and Profitable IT Infrastructure of Bangladesh Using Green IT. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_18 25. Ananna, S.S., Supty, N.S., Shorna, I.J., Reza, A.W., Arefin, M.S. (2023). A Policy Framework for Improving E-Waste Management in Bangladesh. In: Vasant, P., Weber, GW., MarmolejoSaucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10. 1007/978-3-031-19958-5_95
Comparison for Handwritten Character Recognition and Handwritten Text Recognition and Tesseract Tool on IJAZAh's Handwriting
Alexander Setiawan(B), Kartika Gunadi, and Made Yoga Mahardika
Informatics Department, Faculty of Industrial Technology, Petra Christian University, Surabaya, Indonesia
{alexander,kgunadi}@petra.ac.id
Abstract. Handwriting recognition is the ability to recognize various types of writing in various forms. Unlike consistent computer fonts, each person's handwriting is unique in its form and consistency. This problem arises in documents where the data is handwritten. Segmentation of the data locations uses a run length smoothing algorithm with dots as the segmentation feature. The Handwritten Text Recognition (HTR) technique requires data segmented into words. The Handwritten Character Recognition (HCR) technique requires data segmented into individual characters. The HCR technique uses the LeNet5 model trained on the EMNIST dataset. HTR uses the Tesseract tool and a convolutional recurrent neural network trained on the IAM database. In an experiment on 10 scanned sample images, segmentation obtained an average accuracy of 95.1%. The HCR technique failed in the letter segmentation process on cursive handwriting. The easiest technique to use is HTR with the help of the Tesseract tool, which also performs well: Tesseract achieved word accuracy above 70% when tested on 5 scanned samples with 15 data fields.
Keywords: Handwritten Text Recognition (HTR) · Handwritten Character Recognition (HCR) · Segmentation · Tesseract
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 289–298, 2024. https://doi.org/10.1007/978-3-031-50327-6_30
1 Introduction
Research on handwriting recognition, especially for Latin numerals and letters, is a topic in pattern recognition that is still developing today [1]. Writing recognition techniques can be divided into two kinds, namely character recognition and text recognition. Both proceed through feature extraction and classification. The two techniques take different approaches to writing recognition, especially in classification. The problem that arises in letter recognition is how a recognition technique can recognize various types of writing with different sizes, thicknesses, and shapes [2]. This problem is especially apparent in handwriting. In contrast to computer fonts, which are consistent, the handwriting of every person is unique. This problem can be found, for example, in diploma documents whose owner's personal data is still filled in by hand. These handwriting problems can be addressed by applying writing recognition techniques, namely character recognition and text recognition. The character recognition technique uses the Extended MNIST dataset. This dataset is a variant of the complete NIST dataset, called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset [3]. The text recognition technique uses the IAM dataset. The IAM database is a database of handwritten English sentences, which includes 1066 forms produced by approximately 400 different writers [4]. This work aims to provide information on handwriting recognition, as well as an overview of its application and performance.
2 Theoretical Basis
The Run Length Smoothing Algorithm (RLSA) is a method used for block segmentation and text discrimination. RLSA is applied to binary images to merge nearby pixels into blobs so that word or block segmentation can be performed. RLSA operates on black (0) and white (1) pixels with the rule that a run of consecutive white pixels is changed to black (0) if its length is less than or equal to a threshold value. The algorithm is generally applied twice, first vertically and then horizontally. A parameter can restrict it to only one of the two passes [5]. The rule iteration function is written for the horizontal pass and can be reused for the vertical pass: transposing the image and applying the horizontal iteration yields the vertical result. An example of the RLSA input and output is shown in Fig. 1.
Fig. 1. Run length smoothing algorithm output
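
The smoothing rule and the transpose trick described above can be sketched with NumPy as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, and filling only those white runs that are bounded by black pixels on both sides is an assumption, since the text does not say how runs touching the image border are handled.

```python
import numpy as np


def rlsa_horizontal(img, threshold):
    """Turn short white runs (1) into black (0), row by row.

    img is a 2-D array of 0 (black) and 1 (white). Only runs enclosed by
    black pixels on both sides are filled (an assumption of this sketch).
    """
    out = img.copy()
    for row in out:                                # each row is a view into `out`
        black = np.flatnonzero(row == 0)           # column indices of black pixels
        for left, right in zip(black[:-1], black[1:]):
            gap = right - left - 1                 # white pixels between two blacks
            if 0 < gap <= threshold:
                row[left + 1:right] = 0
    return out


def rlsa(img, h_threshold, v_threshold):
    """Vertical pass (via transpose), then horizontal pass."""
    vertical = rlsa_horizontal(img.T, v_threshold).T
    return rlsa_horizontal(vertical, h_threshold)
```

For example, with a threshold of 2 the row `[0, 1, 1, 0, 1, 1, 1, 1, 0]` becomes `[0, 0, 0, 0, 1, 1, 1, 1, 0]`: the two-pixel gap is closed and the four-pixel gap is left open.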
2.1 Convolutional Neural Network (CNN)
A Convolutional Neural Network (CNN) is a neural network model made specifically for tasks such as image recognition, image classification, object detection, and face recognition. CNN is a type of artificial neural network designed specifically for processing pixel data [6]. A CNN receives an image as input, processes it, and classifies it into one of several categories (for example: cat, tiger, lion). CNNs perform particularly well at finding patterns. A CNN has 3 main layers, namely the convolutional layer, the pooling layer, and the fully connected layer. The convolution process in the convolutional layer aims to extract features from the input image [7]. This process multiplies each region of the image matrix element-wise by a filter matrix of a given size and sums the products to obtain the output feature. The convolution process is visualized in Fig. 2.
Fig. 2. Convolution process
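
The multiply-and-sum step of the convolution layer can be written out directly. The sketch below is illustrative only: the function name is hypothetical, it uses stride 1 and no padding (assumptions, since the text does not specify them), and, as in most CNN libraries, the filter is not flipped.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)    # element-wise multiply, then sum
    return out
```

A 4 × 4 input with a 2 × 2 filter therefore yields a 3 × 3 feature map, one value per filter position.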
The pooling layer is a layer that reduces the dimensions of the feature map. The process at this layer is known as down-sampling. This is useful for speeding up computation, because fewer parameters need to be updated, and for reducing overfitting [8]. There are 2 types of pooling layers commonly used, namely max pooling and average pooling. Max pooling takes the maximum value per filter shift, while average pooling takes the average value. An example of the pooling layer output can be seen in Fig. 3.
Fig. 3. Max pooling
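
The two pooling variants differ only in the reduction applied to each window, as the following sketch shows. It is a minimal illustration with a hypothetical function name; non-overlapping windows (the filter shifts by its own size) and dropping rows or columns that do not fill a whole window are assumptions of the sketch.

```python
import numpy as np


def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling: the window moves by `size` in both directions."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size              # drop rows/cols that do not fit
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))             # largest value in each window
    return blocks.mean(axis=(1, 3))                # average value in each window
```

On a 4 × 4 feature map with 2 × 2 windows, both modes return a 2 × 2 result, so the number of values passed to the next layer falls by a factor of four.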
The fully connected layer is a layer that is fully connected like an ordinary neural network. This layer calculates the class scores. As in an ordinary neural network, each neuron in this layer is connected to all neurons in the previous volume. The flatten layer converts the matrix into a vector so that it can be fed to the fully connected layer. An example of flatten and fully connected layer results is shown in Fig. 4.
2.2 Convolutional Recurrent Neural Network (CRNN)
A convolutional recurrent neural network (CRNN) is a neural network model that has a convolution layer, a recurrent layer, and finally a Connectionist Temporal Classification
Fig. 4. Flattening & fully-connected layer
(CTC) layer. Unlike the usual CNN model, in a CRNN the features extracted by the convolution layer are fed to the recurrent layer (not to a fully connected layer), and the output layer uses CTC. CTC is a layer function that decodes the final output matrix into a string. It is used as the final stage in a neural network sequence that aims to recognize characters. The output of the preceding network is generally a matrix, which is processed to obtain a text string as the final result of the whole series of processes. CTC is used as a transcription layer that takes the recurrent layer output matrix and the ground truth text and calculates the loss value. In short, CTC takes the raw output matrix of the RNN and translates it into the final text. The length of the ground truth text and of the recognized text is at most n characters, the maximum length in the training dataset. A visualization of the CRNN can be seen in Fig. 5.
Fig. 5. Convolutional recurrent neural network
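
How a CTC output matrix becomes a text string can be illustrated with greedy (best-path) decoding: take the most likely class at each time step, collapse consecutive repeats, and drop the blank symbol. This is a sketch of one common decoding strategy, not necessarily the decoder used in the study. The function name and the convention that class 0 is the blank are assumptions.

```python
import numpy as np

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(probs, alphabet):
    """Decode a (time_steps, num_classes) matrix into a string.

    alphabet[i] is the character for class index i + 1.
    """
    best = np.argmax(probs, axis=1)                # most likely class per time step
    chars = []
    previous = BLANK
    for idx in best:
        if idx != previous and idx != BLANK:       # collapse repeats, skip blanks
            chars.append(alphabet[idx - 1])
        previous = idx
    return "".join(chars)
```

The blank is what allows doubled letters: the per-step path a, a, blank, a, b, b, blank decodes to "aab", because the blank separates the two occurrences of "a" that would otherwise be collapsed into one.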
3 System Design
The system accepts input in the form of scanned images of elementary to high school diplomas in colour or grayscale. The system does not accept camera photos, because these still require a lot of preprocessing to remove noise, correct perspective, and adjust the brightness of the document. The recommended input resolution is at least 850 (width) × 1100 (height), because the image will be pre-processed at that dimension. The output of the system is the OCR result in the form of a string. This study uses a segmentation method with a point feature to segment diplomas. This method is used to handle diplomas that have different formats. The areas on the diploma where data is filled in are marked with dots. The input image first undergoes a process called crop and reshape, which gives all diploma images a uniform resolution for the subsequent segmentation process. This process affects the dot size and the focus area of the input image. The diploma image is cropped to the focus area and resized to a width of 650 and a height of 850. This process is specific to diploma images. The process flowchart is shown in Fig. 6.
Fig. 6. Preprocess image flowchart
The next process segments the locations of the data on the diploma. Dot segmentation, or point segmentation, uses the run length smoothing algorithm to connect vertically adjacent areas of the image. The dot
segmentation process consists of only 3 stages: separating the dot locations from the rest of the image, connecting them with the RLSA method, and finally segmenting the connected dots (lines). The flowchart is shown in Fig. 7.
Fig. 7. Flowchart segment data location.
Before image data can be fed into the model for recognition, it must first be processed. Each technique requires a different processing stage. The text recognition technique requires data to be segmented into one image per word, whereas the character recognition technique requires data to be segmented into individual letters or numbers. Letter segmentation failed on cursive writing, for which an adjustment algorithm was made in this study. The letter segmentation process uses only regular contour searches. Each letter image is processed to resemble the images in the MNIST dataset: 28 × 28 pixels and centered. The process flowchart is shown in Fig. 8.
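
The conversion of a segmented letter to a centered 28 × 28 image can be sketched as below. This is an illustration under assumptions, not the study's implementation: the function name, the 4-pixel margin, and the NumPy-only nearest-neighbour resize are choices made for this sketch, and the input is assumed to contain at least one ink pixel on a zero background.

```python
import numpy as np


def to_mnist_format(letter, size=28, margin=4):
    """Crop a letter to its bounding box, scale it, and centre it on a canvas.

    letter: 2-D array with ink > 0 on a zero background (must contain ink).
    """
    rows = np.flatnonzero(letter.any(axis=1))
    cols = np.flatnonzero(letter.any(axis=0))
    crop = letter[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]   # tight bounding box

    inner = size - 2 * margin                      # side of the box the glyph must fit
    scale = inner / max(crop.shape)
    new_h = max(1, round(crop.shape[0] * scale))
    new_w = max(1, round(crop.shape[1] * scale))

    # Nearest-neighbour resize using index arithmetic (keeps the sketch NumPy-only).
    r_idx = np.arange(new_h) * crop.shape[0] // new_h
    c_idx = np.arange(new_w) * crop.shape[1] // new_w
    resized = crop[r_idx][:, c_idx]

    canvas = np.zeros((size, size), dtype=letter.dtype)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized        # centre the glyph
    return canvas
```

Scaling by the longer side preserves the aspect ratio of the letter, so a tall, narrow glyph stays narrow on the canvas instead of being stretched to fill it.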
4 Implementation System and Testing
System testing must be done, and the purpose of the test is to receive feedback from participants so that the level of usefulness of this application can be determined [9]. The results of word segmentation are processed again so that they can be fed into the model for prediction. This uses the same image processing function that was used for training the text recognition model. The image is converted to grayscale, and its aspect ratio is changed to match the input shape of the model. The programming language used is Python. A machine learning API (Cloud Vision API) can be used to detect images such as logos or photos, video, natural language (text), voice, and translation [10]. The input image must be grayscale with a size of 128 × 32 pixels, with white text on a black background (inverted). An example of changing the image size to 128 × 32 can be seen in Fig. 9. We test the accuracy of RLSA segmentation with point features against RLSA segmentation with text features. Segmentation aims to obtain all pieces of data separately. The calculation
Fig. 8. Convert image to MNIST format flowchart
Fig. 9. Preprocessing in text recognition
of accuracy uses 10 sample diplomas, selected because they have a fairly good standard of image quality and also contain cases that can reduce segmentation accuracy. Each sample is segmented, and its accuracy is the number of correctly segmented items divided by the number of slots in the sample form. The per-sample results are then summed and divided by the number of samples. The test results can be seen in Table 1. A visualization of the final segmentation results can be seen in Fig. 10. Recognition testing was carried out on 5 diploma samples, on 3 parts of the data in each. The diplomas were specially chosen because they have quite good image quality and contain cases that affect segmentation and recognition performance. The test was carried out with the LeNet5 character recognition model trained on Extended MNIST data. The text recognition model uses 5 convolution layers and 2 bidirectional LSTM layers of 256 units, trained on the IAM database. The Tesseract tool is also used for text recognition. Table 2 shows the accuracy of the models when reading the name field of each sample. The name on a diploma is mostly written in capital letters, so this test shows how well the models read handwriting in capital letters. The character recognition model (Extended MNIST) reaches quite high accuracy here because capital letters can be segmented quite well.

Table 1. Segmentation accuracy on 10 diplomas.
Sample         RLSA-dot    RLSA-text-5
ijazah1.jpg    0.98        0.86
ijazah2.jpg    0.85        0.33
ijazah3.jpg    0.95        0.55
random1.jpg    0.97        0.50
random2.jpg    1.00        0.62
random3.jpg    1.00        0.90
random4.jpg    1.00        0.87
random5.jpg    0.77        0.22
random6.jpg    1.00        0.45
Random7.jpg    0.96        0.57
ACC            0.951       0.587

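
The averaging step behind the ACC row can be reproduced from the values printed in Table 1. The sketch below is illustrative and its function names are hypothetical. Note that averaging the rounded RLSA-dot values gives about 0.948 rather than the reported 0.951; the difference presumably comes from rounding of the per-sample values, which is an assumption. The RLSA-text-5 column reproduces the reported 0.587.

```python
# Per-sample accuracies transcribed from Table 1 (already rounded to two decimals).
rlsa_dot = [0.98, 0.85, 0.95, 0.97, 1.00, 1.00, 1.00, 0.77, 1.00, 0.96]
rlsa_text = [0.86, 0.33, 0.55, 0.50, 0.62, 0.90, 0.87, 0.22, 0.45, 0.57]


def sample_accuracy(correct_items, total_slots):
    """Accuracy of one diploma: correctly segmented items over form slots."""
    return correct_items / total_slots


def mean_accuracy(per_sample):
    """Sum the per-sample accuracies and divide by the number of samples."""
    return sum(per_sample) / len(per_sample)


dot_acc = mean_accuracy(rlsa_dot)
text_acc = mean_accuracy(rlsa_text)
```

Keeping the per-sample counts (rather than only the rounded ratios) would let the overall figure be recomputed exactly.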
The character recognition model also has slightly higher accuracy than the text recognition model (IAM). Based on observations, the IAM model was trained with little data written in capital letters (the data is mostly cursive handwriting).

Table 2. Testing the accuracy (ratio) of the model on the sample dataset.
Sample          EMNIST-LeNet5    IAM-CRNN    Tesseract
ijazah1.jpg     0.5              0.65        0.89
ijazah3.jpg     0.61             0.34        0.99
random3.jpg     0.40             0.48        0.97
random11.jpg    0.48             0.51        0.88
random12.jpg    0.55             0.52        0.94
Average         0.51             0.50        0.93

A further test examines how well the models read handwritten numeric data. Here the character recognition accuracy is low because it depends on the segmentation results, as can be seen in Fig. 10.
Fig. 10. Character recognition accuracy result
5 Conclusion
Based on the test results, the following can be concluded:
• The run length smoothing algorithm (text) can be used to search for candidate data on diplomas, but it is weak in accuracy when filtering the candidates.
• The dot_size configuration is influential in segmentation, especially on images whose resolutions differ greatly. A larger image resolution makes the dots in the image larger, so the process needs to be standardized for automation.
• The minimum contour width parameter affects scanned images that have perspective distortion. This parameter is useful as a filter for detected candidate lines.
• The RLSA segmentation method with point features has much higher accuracy than RLSA-TextFeature: the average accuracy obtained is 95.1%, while RLSA-TextFeature reaches an average of only 58.7%.
References
1. Supriana, I., Ramadhan, E.: Pengenalan Tulisan Tangan untuk Angka tanpa Pembelajaran. Konferensi Nasional Informatika, Bandung, Indonesia (2015)
2. Wirayuda, T.A.B., Syilvia, V., Retno, N.D.: Pengenalan Huruf Komputer Menggunakan Algoritma Berbasis chain code dan Algoritma sequence alignment, pp. 19–24. Konferensi Nasional Sistem dan Informat, Bali, Indonesia (2009)
3. Cohen, G., Afshar, S., Tapson, J., van Schaik, A.: EMNIST: extending MNIST to handwritten letters. In: International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, pp. 2921–2926 (2017). https://doi.org/10.1109/IJCNN.2017.7966217
4. Bunke, H., Marti, U.: The IAM-database: an English sentence database for offline handwriting recognition. IJDAR 5, 39–46 (2002). https://doi.org/10.1007/s100320200071
5. SuperDataScienceTeam: Convolutional Neural Networks (CNN): Step 4 - Full Connection (2018). Retrieved from superdatascience: https://www.superdatascience.com/blogs/convolutional-neural-networks-cnn-step-4-full-connection
6. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell., pp. 2298–2304 (2016)
7. Borlepwar, A.P., Borakhade, S.R., Pradhan, B.: Run Length Smoothing Algorithm for Segmentation (2017)
8. Ujjwalkarn: An Intuitive Explanation of Convolutional Neural Networks, 29 May 2017. Retrieved from ujjwalkarn: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
9. Setiawan, A., Hadi, I.P., Yoanita, D., Aritonang, A.I.: Virtual application technology of citizen journalism based on mobile user experience. J. Phys. Conf. Ser. 1502(1), 012057. IOP Publishing
10. Setiawan, A., Rostianingsih, S., Widodo, T.R.: Augmented reality application for chemical bonding based on android. Int. J. Electr. Comput. Eng. 9(1), 445 (2019)
Secure Communication Through Quantum Channels: A Study of Quantum Cryptography
Seema Ukidve1, Ramsagar Yadav2(B), Mukhdeep Singh Manshahia2, and M. P. Chaudhary3
1 Department of Mathematics, L. S. Raheja College of Arts and Commerce, Santacruz (W), Mumbai, Maharashtra, India
2 Department of Mathematics, Punjabi University Patiala, Patiala, Punjab, India
[email protected]
3 International Scientific Research and Welfare Organization, New Delhi, India
Abstract. Quantum communication and quantum cryptography are two rapidly growing fields that aim to utilize the principles of quantum physics for secure communication and data encryption. In this paper, we provide an overview of the current state of research in quantum communication and quantum cryptography. We discuss the basic principles and concepts of quantum communication and cryptography, including quantum key distribution, quantum teleportation, and quantum digital signatures. We also review the various physical implementations of quantum communication systems, such as fiber-optic and free-space links, and their applications in practical scenarios, such as secure financial transactions and military communications. We analyze the security of quantum communication and cryptography protocols and compare them with classical encryption methods. We highlight the advantages of quantum communication and cryptography over traditional methods, such as its ability to provide unconditional security and its resistance to attacks from quantum computers. Keywords: Quantum communication · Quantum cryptography · Quantum systems · Secure communication · Encryption
1 Introduction
Quantum communication and quantum cryptography form a rapidly growing and highly active area of research. Quantum communication research focuses on developing practical and scalable systems for quantum key distribution (QKD) and other quantum communication protocols, such as quantum teleportation and entanglement-based communication [1]. Scientists are exploring new methods for transmitting quantum states over long distances, improving the stability and efficiency of quantum communication systems, and developing practical applications for quantum communication in various domains. Quantum cryptography research focuses on developing secure and efficient methods for generating and distributing encryption keys using quantum mechanics [2]. Researchers are exploring new quantum cryptographic protocols, improving the implementation of existing protocols, and developing practical applications for quantum cryptography in various domains, such as secure financial transactions, military communications, and cloud computing [3]. There are also ongoing efforts to integrate quantum communication and cryptography with classical communication and cryptography systems, creating hybrid systems that can take advantage of the strengths of both [4].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 299–305, 2024. https://doi.org/10.1007/978-3-031-50327-6_31
2 Basic Principles and Concepts of Quantum Communication and Cryptography
Quantum communication and cryptography are based on the principles of quantum mechanics and use quantum states of matter and light to transmit information and perform cryptographic tasks.
Quantum Key Distribution (QKD) is a protocol for securely distributing a shared secret key between two parties. In QKD, the parties exchange a series of quantum states of light, from which they derive a secret key. Because measuring an unknown quantum state generally disturbs it, eavesdropping introduces errors that the parties can detect. The security of QKD therefore rests on the laws of quantum mechanics rather than on computational hardness, so the key exchange is not exposed to code-breaking or brute-force attacks [5].
Quantum Teleportation is a protocol for transferring a quantum state from one location to another without physically transporting the system that carries it. Two parties, Alice and Bob, share a pair of entangled quantum states. Alice performs a joint measurement on her half of the pair and the state to be sent, and transmits the classical measurement result to Bob, who uses it to reconstruct the state [6].
Quantum Digital Signatures are cryptographic signatures that use the principles of quantum mechanics to provide authenticity and integrity of digital messages. A sender creates a signature by encoding information into quantum states, and a recipient verifies the signature by performing quantum measurements on the received states [7].
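The QKD exchange described above can be illustrated with a small classical simulation. The sketch below assumes the BB84 protocol of Bennett and Brassard [13] as the concrete example, an ideal noiseless channel, and no eavesdropper. The function name `bb84_sift` and the basis labels "Z" and "X" are illustrative choices and do not come from the paper.

```python
import random


def bb84_sift(n_qubits, seed=0):
    """Simulate BB84 key sifting on an ideal channel (no noise, no eavesdropper)."""
    rng = random.Random(seed)
    # Alice picks a random bit and a random encoding basis for every qubit.
    alice_bits = [rng.randint(0, 1) for _ in range(n_qubits)]
    alice_bases = [rng.choice("ZX") for _ in range(n_qubits)]
    # Bob measures each qubit in his own randomly chosen basis.
    bob_bases = [rng.choice("ZX") for _ in range(n_qubits)]
    bob_bits = [
        bit if a == b else rng.randint(0, 1)  # mismatched basis: random outcome
        for bit, a, b in zip(alice_bits, alice_bases, bob_bases)
    ]
    # The bases (not the bits) are compared publicly; matching positions are kept.
    keep = [i for i in range(n_qubits) if alice_bases[i] == bob_bases[i]]
    return [alice_bits[i] for i in keep], [bob_bits[i] for i in keep]


alice_key, bob_key = bb84_sift(1000)
print(len(alice_key), alice_key == bob_key)
```

On average about half of the transmitted positions survive sifting, and on an ideal channel the two sifted keys agree. A real system would add error estimation, error correction, and privacy amplification, which this sketch omits.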
3 A Review of the Various Physical Implementations of Quantum Communication Systems
Quantum communication systems can be physically implemented using fiber-optic and free-space links. Fiber-optic links use optical fibers to transmit quantum states of light, making them a secure and fast means of communication. However, attenuation and dispersion grow with the length of the fiber and degrade the quality of the transmitted quantum states, which limits the usable distance. Free-space links transmit quantum states of light through the atmosphere, avoiding the limitations of fiber-optic links. However, this method is vulnerable to atmospheric turbulence and other environmental factors, leading to a lower transmission rate and a more complex system [8].
In practical scenarios, quantum communication systems can be used in secure financial transactions, where the transmission of sensitive information must be protected against eavesdropping. For example, QKD can be used to establish a shared secret key between financial institutions, enabling secure communication and protection against unauthorized access. Quantum communication systems are also of interest in military communications, where security is of utmost importance. In such scenarios, QKD can be used to establish secure communication between military units, protecting sensitive information from being intercepted by adversaries [9].
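The distance limit of fiber links mentioned in this section can be made concrete with the standard exponential attenuation model. The sketch below assumes a loss coefficient of 0.2 dB/km, a typical value for telecom fiber near 1550 nm. This figure and the function name `fiber_transmittance` are assumptions for illustration and are not stated in the paper.

```python
def fiber_transmittance(length_km, alpha_db_per_km=0.2):
    """Fraction of photons that survive a fiber of the given length.

    Uses the usual decibel loss model: eta = 10 ** (-alpha * L / 10).
    """
    return 10 ** (-alpha_db_per_km * length_km / 10)


for length in (10, 50, 100, 200):
    print(f"{length:>4} km: {fiber_transmittance(length):.2e}")
```

Because the survival probability falls exponentially with distance, the rate at which single photons arrive, and with it the achievable key rate, drops quickly on long links.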
4 Analysis of the Security of Quantum Communication and Cryptography Protocols and its Comparison with Classical Encryption Methods
Quantum communication and cryptography protocols are considered more secure than classical encryption methods. In quantum communication, QKD enables two parties to generate a shared secret key that is secure against eavesdropping. Unlike classical public-key methods, whose security relies on the assumed computational hardness of mathematical problems, the security of QKD is based on the laws of quantum mechanics, so it does not depend on the computing power of an attacker. When a QKD-generated key is used with the one-time pad, which is itself a classical cipher, the combination provides unconditional security, meaning that the security does not rely on unproven mathematical assumptions. This is in contrast to classical encryption methods, which can be vulnerable to code-breaking or brute-force attacks if the encryption keys are weak [10]. However, quantum communication and cryptography are still in their infancy and have several practical limitations, such as the requirement for specialized hardware, limited transmission distances, and high costs. In conclusion, while quantum communication and cryptography protocols offer enhanced security compared to classical encryption methods, they are not yet a replacement for these methods due to practical limitations [11].
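The one-time pad mentioned in this section is simple to state in code. The sketch below is a minimal illustration in which the key is drawn from the operating system's random source as a stand-in for a QKD-generated key. The function name `otp_xor` and the sample message are hypothetical.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length (encryption and decryption are identical)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"transfer approved"
key = secrets.token_bytes(len(message))  # stand-in for a key delivered by QKD
ciphertext = otp_xor(message, key)
recovered = otp_xor(ciphertext, key)
print(recovered == message)
```

The unconditional security of the one-time pad holds only if the key is truly random, as long as the message, kept secret, and never reused. This is why the rate at which QKD can deliver key material matters in practice.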
5 The Advantages of Quantum Communication and Cryptography Over Traditional Methods
Quantum communication and cryptography offer several advantages over traditional methods, making them a promising technology for secure communication and encryption. The key advantages are described below.
Quantum communication and cryptography offer unconditional security, meaning that the security of the communication is guaranteed by the laws of physics rather than by the computational difficulty of solving mathematical problems. This makes quantum communication and cryptography immune to attacks from classical computers and provides a higher level of security compared to traditional encryption methods [2].
Traditional public-key methods, such as RSA and Elliptic Curve Cryptography, are vulnerable to attacks from quantum computers, because Shor's algorithm solves the underlying factoring and discrete-logarithm problems efficiently. In contrast, quantum communication and cryptography are designed to be resistant to attacks from quantum computers, ensuring the security of sensitive information even in the presence of quantum computers.
Quantum communication can enable high-speed communication, as it does not rely on mathematical computations but instead uses the properties of quantum systems to transmit information. This can make quantum communication well-suited for high-bandwidth applications, such as high-definition video and large data transfers [12].
The inherent randomness of quantum systems can be used to generate random numbers for cryptographic applications, such as key generation and digital signatures, providing a higher level of security compared to traditional methods.
The use of quantum systems in communication enables the detection of unauthorized access to the communication channel, as any attempt to intercept or tamper with the communication alters the quantum state being transmitted, making the attempt evident to the legitimate users [13].
In summary, the unconditional security, resistance to attacks from quantum computers, high-speed communication, randomness, and tamper-evident communication provided by quantum communication and cryptography make them a valuable technology for various industries and applications [14].
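The tamper-evidence property described in this section can be illustrated with a simulated intercept-and-resend attack. The sketch below assumes BB84-style encoding in two bases (Bennett and Brassard [13]) and an otherwise noiseless channel. The names `measure` and `sifted_error_rate` are illustrative and not part of the paper.

```python
import random


def measure(bit, prepared_basis, measured_basis, rng):
    """Measuring in the preparation basis returns the bit; otherwise a coin flip."""
    return bit if prepared_basis == measured_basis else rng.randint(0, 1)


def sifted_error_rate(n_qubits, eavesdrop, seed=1):
    """Error rate between Alice and Bob on the positions where their bases match."""
    rng = random.Random(seed)
    errors = sifted = 0
    for _ in range(n_qubits):
        alice_bit, alice_basis = rng.randint(0, 1), rng.choice("ZX")
        bit, basis = alice_bit, alice_basis
        if eavesdrop:
            # Eve measures in a random basis and resends what she observed.
            eve_basis = rng.choice("ZX")
            bit, basis = measure(bit, basis, eve_basis, rng), eve_basis
        bob_basis = rng.choice("ZX")
        bob_bit = measure(bit, basis, bob_basis, rng)
        if bob_basis == alice_basis:
            sifted += 1
            errors += bob_bit != alice_bit
    return errors / sifted


clean = sifted_error_rate(20000, eavesdrop=False)
attacked = sifted_error_rate(20000, eavesdrop=True)
print(f"no eavesdropper: {clean:.3f}, intercept-resend: {attacked:.3f}")
```

Without an eavesdropper the sifted keys agree, while the intercept-and-resend attack produces an error rate close to 25% in this idealized model. The legitimate users can detect such an attack by publicly comparing a sample of their sifted bits.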
6 Challenges and Limitations of Quantum Communication and Cryptography
Quantum communication and cryptography face several challenges and limitations that need to be addressed before widespread adoption.
The performance of quantum communication systems is highly dependent on the quality of the quantum hardware used. High-quality components, such as single-photon sources and highly stable quantum detectors, are required for the implementation of quantum communication systems, but they can be challenging to produce and maintain.
Currently, quantum communication can only be performed over limited distances due to the loss of quantum signals in optical fibers or free-space links. This limits the practicality of quantum communication systems for long-distance communication, such as global communication networks [15].
The implementation of quantum communication systems can be complex and requires specialized knowledge and expertise. This can limit the adoption of quantum communication systems in practical scenarios, particularly in industries and applications with limited technical resources [16].
The interoperability of different quantum communication systems remains a challenge, as different systems may use different protocols and hardware. This can limit the ability of different quantum communication systems to communicate with each other, reducing the potential benefits of a quantum communication network.
The cost of implementing quantum communication systems can be high, particularly due to the need for high-quality quantum hardware and the specialized expertise required. This can limit the adoption of quantum communication systems in industries and applications with limited budgets [17].
These challenges and limitations need to be addressed to ensure the widespread adoption of these technologies in various industries and applications. However, with continued research and development, it is likely that these challenges will be overcome and that the potential benefits of quantum communication and cryptography can be realized [18].
7 The Current Trends and Future Directions of Research in Quantum Communication and Quantum Cryptography
The current trends and future directions of research in quantum communication and quantum cryptography are focused on improving the practicality and scalability of quantum communication systems, as well as enhancing the security of quantum encryption protocols.
Researchers are working towards making quantum communication systems more practical and scalable, by improving the efficiency and robustness of quantum communication protocols and reducing the cost of quantum communication systems. This includes the development of new quantum communication technologies, such as satellite-based quantum communication, and the integration of quantum communication with classical communication systems.
Researchers are also working on enhancing the security of quantum encryption protocols, by developing new methods for detecting and preventing eavesdropping and hacking attacks. This includes the development of new quantum key distribution protocols, as well as the study of the security of quantum communication systems against quantum computers [19].
Another trend in quantum communication research is the development of quantum communication networks, which aim to connect multiple quantum communication systems to form a larger communication network. This will allow for the creation of secure communication channels between multiple parties, and enable the sharing of quantum resources, such as quantum keys, for secure communication. In addition, researchers are exploring new applications of quantum communication and cryptography, such as secure quantum computing and the quantum internet, which aim to harness the unique properties of quantum systems for secure computation and communication [20].
The future of quantum communication and cryptography research is focused on improving the practicality and security of quantum communication systems, and on exploring new applications of these technologies in various industries.
8 The Potential Impact of Quantum Communication and Cryptography on Various Industries and Applications
The potential impact of quantum communication and cryptography on various industries and applications is significant.
In the finance industry, quantum communication and cryptography can provide a higher level of security for financial transactions, such as online banking, stock trading, and payment systems. The use of quantum encryption can prevent unauthorized access to sensitive financial information, such as credit card numbers and bank account details, ensuring the privacy and security of consumers [21].
Quantum communication and cryptography can play a crucial role in military communications, as they offer a high level of security and privacy for the transmission of sensitive information. With the threat of cyber-attacks and eavesdropping, quantum communication can provide a secure communication channel for military operations, ensuring the confidentiality and integrity of sensitive information [22].
In the government sector, quantum communication and cryptography can provide secure communication channels for the transmission of sensitive information, such as classified documents and diplomatic communications. The use of quantum encryption can prevent unauthorized access to sensitive information, ensuring the privacy and security of government agencies [23].
9 Conclusion
Quantum communication and cryptography are emerging fields with great potential for secure communication and encryption in the digital age. The use of quantum systems provides a higher level of security compared to traditional methods, with advantages such as unconditional security, resistance to attacks from quantum computers, high-speed communication, randomness, and tamper-evident communication. With continued research and development, it is likely that the challenges and limitations faced by these fields will be overcome, and the potential benefits of quantum communication and cryptography can be fully realized. The development of these fields is essential for ensuring the privacy and security of sensitive information in the digital age, and has the potential to transform the way we communicate and protect information.
Acknowledgements. The authors are grateful to Punjabi University, Patiala for providing adequate library and internet facilities.
References 1. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information, 10th Anniversary edn. Cambridge University Press (2010) 2. Gisin, N., Ribordy, G., Tittel, W., Zbinden, H.: Quantum cryptography. Rev. Mod. Phys. 74(1), 145 (2002) 3. Scarani, V., Bechmann-Pasquinucci, H., Cerf, N.J., Dušek, M., Lütkenhaus, N., Peev, M.: The security of practical quantum key distribution. Rev. Mod. Phys. 81(3), 1301 (2009) 4. Barrett, M.D., et al.: Deterministic quantum teleportation of atomic qubits. Nature 429(6988), 737–739 (2004) 5. Renner, R.: Security of Quantum Key Distribution. Ph.D. thesis, Swiss Federal Institute of Technology Zurich (2005) 6. Wen, Q.Y.: Quantum Communication and Cryptography. Springer (2015) 7. Briegel, H.J., Dür, W., Cirac, J.I., Zoller, P.: Quantum repeaters: the role of imperfect local operations in quantum communication. Phys. Rev. Lett. 81(26), 5932–5935 (1998) 8. Acin, A., Brunner, N., Gisin, N., Massar, S., Pironio, S., Scarani, V.: Device-independent security of quantum cryptography against collective attacks. Phys. Rev. Lett. 98(23), 230501 (2007) 9. Lo, H.K., Chau, H.F.: Unconditional security of quantum key distribution over arbitrarily long distances. Science 283(5410), 2050–2056 (1999) 10. Tamaki, K., Lo, H.K., Qi, B.: Security proof of quantum key distribution with imperfect devices. New J. Phys. 5(1), 4 (2003) 11. Vohra, R., Pahuja, G., Chaudhary, M.P.: Meta Data to meta information: a case study from health services. Int. J. Math. Arch. 2(3), 315–319 (2011)
12. Ekert, A.: Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. 67(6), 661 (1991) 13. Bennett, C.H., Brassard, G.: Quantum cryptography: Public key distribution and coin tossing. In: Proceedings of the IEEE International Conference on Computers, Systems, and Signal Processing, pp. 175–179 (1984) 14. Scarani, V., Acin, A., Ribordy, G., Gisin, N.: Quantum cryptography protocols robust against photon number splitting attacks for weak laser pulse implementations. Phys. Rev. A 69(4), 042313 (2004) 15. Ma, X., et al.: Quantum key distribution: a comprehensive review. Rev. Mod. Phys. 90(3), 045005 (2018) 16. Pirandola, S., et al.: Quantum communication through quantum relay channels. Nat. Photonics 11, 641–646 (2017) 17. Yin, J., et al.: Experimental demonstration of quantum key distribution with true random numbers. Phys. Rev. Lett. 118(21), 200501 (2017) 18. Ma, X., et al.: Long-distance quantum communication with a decoy-state method. Phys. Rev. Lett. 94(23), 230502 (2005) 19. Ambainis, A.: Quantum digital signatures. In: Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 612–619 (2000) 20. Zhang, Q., et al.: Quantum secure direct communication with confidentiality. Phys. Rev. A 66(5), 052302 (2002) 21. Pirandola, S., et al.: Quantum teleportation of continuous variables. Nat. Photonics 9, 397–402 (2015) 22. Bennett, C.H., Bessette, F., Brassard, G., Salvail, L., Smolin, J.: Experimental quantum cryptography. J. Cryptol. 5(1), 3–28 (1992). https://doi.org/10.1007/BF00191318 23. Pirandola, S., et al.: Quantum cryptography: a review of recent developments. Adv. Phys. X 6(3), 714–738 (2021)
A Study on Android Malware Classification by Using Federated Learning
Vo Quoc Vuong1,2 and Nguyen Tan Cam1,2(B)
1 Faculty of Information Science and Engineering, University of Information Technology, Ho Chi Minh City, Vietnam
[email protected], [email protected]
2 Vietnam National University, Ho Chi Minh City, Vietnam
Abstract. Android is the mobile operating system that has held the highest market share over the last five years. Security risks on the Android operating system are increasing over time, and Android users can be attacked through malicious software. Many studies propose solutions to detect Android malware. While some studies detect Android malware by analyzing sensitive data leakage, others detect malware by using machine learning algorithms. Most current studies perform model training on a centralized computer. However, model training can be deployed on many different computers using federated learning. In this study, we analyze different federated learning frameworks. We also test the implementation of one federated learning framework. The results of this study can be used for future research on implementing federated learning frameworks for Android malware classification.
Keywords: Federated learning · Android malware classification · Artificial neural network · Privacy protection
1 Introduction
The Android operating system is the most popular choice among smartphone users. There are more than three billion smartphone users worldwide, and this number is expected to grow in the coming years. Along with the development of the Android operating system, the amount of Android malware is also increasing. According to Kaspersky statistics, 1,661,743 pieces of malware were detected in 2022. The same statistics report 196,476 mobile banking Trojan installers in 2022, more than double the number in 2021 [1]. This shows that Android malware is constantly evolving.
Many studies on detecting and classifying Android malware have been proposed [2–7], and they have shown very positive results. The majority of current research uses centralized learning: the dataset is stored on one computer, and the machine learning model is also trained on that computer. This makes it hard to take advantage of training models on multiple machines. In addition, centralized data storage limits the privacy of datasets that originate on different computers.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 306–315, 2024. https://doi.org/10.1007/978-3-031-50327-6_32
Federated learning (FL) [8] is a distributed machine learning method that trains a machine learning model on many different devices or servers without requiring central data collection. Instead of downloading all the data from the devices or servers and training the model on all of it, each device or server trains the model on its own data and then shares the resulting model information with a central server or with other devices, which combine it to improve a global model. Federated learning reduces the load of data collection and transmission across the network, protects user data privacy, and allows local devices or servers to take part in training a shared machine learning model more efficiently.
In this study, we analyze different federated learning frameworks for Android malware classification. The results of this study can be used in future studies. The rest of the paper is organized as follows: related studies are presented in Sect. 2; Sect. 3 analyzes federated learning frameworks that can be used to classify Android malware; Sect. 4 presents a test implementation of a federated learning framework; Sect. 5 is the conclusion.
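The aggregation step described in the introduction, where locally trained models are combined into a global model, is commonly implemented as a weighted average of the client parameters (the FedAvg algorithm named in Sect. 3). The sketch below is a minimal illustration under that assumption. The function name `fedavg` and the flat-list representation of model weights are simplifications and do not reflect the API of any particular framework.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients: the second holds three times as much data.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3])
print(global_weights)
```

Only the parameter vectors and the sample counts leave the clients in this scheme; the raw training data stays local, which is the privacy argument made above.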
2 Related Works
Ma et al. [3] proposed a method to detect Android malware using API information. First, they build a flow graph of the application's execution to obtain API information. From it they construct three datasets, for API calls, API frequency, and API call order, and train a detection model on each. Finally, an overall model is built from the three.
Arora et al. [9] proposed a detection method that identifies potentially dangerous permission pairs. Their malware detection model, called PermPair, builds and compares graphs of malicious and normal patterns. The patterns are permission pairs extracted from the application's manifest file.
Khariwal et al. [10] improved malware detection by gathering information about the application's intents and permissions. They rank this information and select the best intents and permissions to detect Android malware with high performance. They also proposed new algorithms that use machine learning to find the best set.
Gao et al. [4] developed a system called GDroid that uses neural networks for malware classification. They map applications and APIs to a large heterogeneous graph and perform the classification on it.
Fatima et al. [2] proposed a machine learning-based approach to detect and classify Android malware that uses a genetic algorithm for feature selection. The features selected by the genetic algorithm are used for training and classifying Android malware.
In addition to increasing the efficiency and accuracy of detecting and classifying Android malware, related studies also use large datasets. Bosun et al. [5] proposed an approach that not only improves the performance of Android malware detection and classification but also reduces the time cost, by using datasets collected from different sources to increase accuracy.
Current studies have very positive results. However, the datasets they use are all located in a centralized place for training. This poses a risk to privacy and integrity because the dataset has to be shared and transmitted. We need to ensure data privacy and also to reduce network traffic during training on different nodes. Federated learning is the recommended method to address both concerns.
Gálvez et al. [11] showed how to use FL to detect and classify Android malware. They use machine learning algorithms to train the model in a distributed environment. Jiang et al. [12] proposed a method called FedHGCDroid to detect Android malware. First, they designed a multidimensional classification model called HGCDroid that uses both a convolutional neural network and a graph neural network. Second, they introduced a framework that allows Android clients to cooperate in training an Android malware classification model while protecting data privacy. Rey et al. [13] proposed a framework to detect malware on IoT devices by using federated learning. They evaluate the framework on a dataset consisting of the network traffic of several malicious and benign IoT devices.
3 Federated Learning Frameworks
There are many federated learning frameworks [14–19], each with different characteristics. Some support horizontal federated learning, while others support vertical federated learning. In horizontal federated learning, the devices or servers participating in training hold data of the same type, that is, with the same input domain (feature space), but for different samples. In vertical federated learning, the participants hold different data (features), but these data belong to the same users or the same data domain.
Some frameworks support both horizontal and vertical federated learning, such as FedML [14], PaddleFL [15], and Fedlearner [16], while others, such as TFF [17] and Flower [18], support only horizontal federated learning. Built on TensorFlow, TFF [17] from Google provides a federated learning API and a federated core API so that users can apply and design federated learning algorithms. In addition, there are specialized frameworks: CrypTen [19] is designed to provide secure multi-party computation, while FedTree [20] is designed for federated training of decision trees.
Different federated learning frameworks support different sets of functions, such as horizontal and vertical federated learning, deployment options, and privacy protection. Table 1 shows some characteristics of federated learning frameworks. Regression and neural network models are supported by most frameworks in the horizontal setting. Tree-based models are supported by only a few frameworks (such as FedTree).
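The difference between horizontal and vertical federated learning described above comes down to how one logical dataset is partitioned among the parties. The sketch below illustrates both partitions on a tiny hypothetical table of applications. All field names (`app_id`, `n_permissions`, `n_api_calls`, `label`) and both helper functions are invented for illustration.

```python
apps = [
    {"app_id": "a1", "n_permissions": 12, "n_api_calls": 340, "label": "benign"},
    {"app_id": "a2", "n_permissions": 31, "n_api_calls": 910, "label": "malware"},
    {"app_id": "a3", "n_permissions": 8, "n_api_calls": 120, "label": "benign"},
    {"app_id": "a4", "n_permissions": 27, "n_api_calls": 770, "label": "malware"},
]


def horizontal_split(rows, n_parties):
    """Each party gets different samples that share the same feature columns."""
    return [rows[i::n_parties] for i in range(n_parties)]


def vertical_split(rows, feature_groups, key="app_id"):
    """Each party gets the same samples but a different subset of the columns."""
    return [
        [{key: row[key], **{name: row[name] for name in group}} for row in rows]
        for group in feature_groups
    ]


h_parties = horizontal_split(apps, 2)
v_parties = vertical_split(apps, [["n_permissions"], ["n_api_calls", "label"]])
print([row["app_id"] for row in h_parties[0]], sorted(v_parties[0][0]))
```

In the horizontal case every party can train the same model architecture locally, whereas in the vertical case the parties must first align their records on the shared key and then cooperate to evaluate a single model, which is one reason fewer frameworks support it.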
Table 1. The main characteristics of federated learning frameworks [21].
Framework types: all-in-one frameworks (FedML, PaddleFL, Fedlearner); frameworks that support horizontal FL only (TFF, Flower); specialized frameworks (CrypTen, FedTree).

Characteristic                                     | FedML | PaddleFL | Fedlearner | TFF | Flower | CrypTen | FedTree
---------------------------------------------------+-------+----------+------------+-----+--------+---------+--------
Horizontal based model
  Regression                                       | Yes   | Yes      | Yes        | Yes | Yes    | N/A     | No
  Neural network                                   | Yes   | Yes      | Yes        | Yes | Yes    | N/A     | No
  Tree-based model                                 | No    | No       | No         | No  | No     | N/A     | Yes
Vertical based model
  Regression                                       | Yes   | Yes      | Yes        | Yes | Yes    | Yes     | No
  Neural network                                   | No    | Yes      | Yes        | No  | No     | Yes     | No
  Tree-based model                                 | No    | No       | Yes        | No  | No     | No      | Yes
Support deployment
  Single-host                                      | Yes   | Yes      | Yes        | Yes | Yes    | Yes     | Yes
  Multi-host (< 16 hosts)                          | Yes   | Yes      | Yes        | No  | Yes    | Yes     | Yes
  Cross-device (> 100 hosts)                       | No    | Yes      | Yes        | No  | Yes    | N/A     | Yes
Privacy protection
  Does not require a 3rd party aggregator          | No    | Yes      | Yes        | No  | No     | Yes     | Yes
  Aggregation does not learn model parameters      | No    | Yes      | No         | No  | No     | N/A     | Yes
  Aggregation does not learn local model gradients | Yes   | Yes      | No         | Yes | No     | N/A     | Yes
All federated learning frameworks support single-host deployment with basic functionality. A multi-host deployment option is provided by most frameworks except TensorFlow Federated, where this feature is under development. Most federated learning frameworks claim that they support cross-device training. Support for privacy protection is one of the important features of these frameworks. To keep information private from a central server, frameworks such as CrypTen, PaddleFL, FedTree, and Fedlearner support protocols that do not require arbiters while still providing protection. Furthermore, PaddleFL and FedTree can use arbiters for better computational efficiency without revealing any model parameters.
The implementations of these frameworks often differ, which leads to different performance characteristics. Federated learning frameworks usually require data preprocessing. They also need to compute certain functions locally and to communicate with the aggregator to collaborate on model learning. In most cases, the framework improves the model iteratively, repeating these steps for a fixed number of rounds or until certain criteria are met.
Table 2 shows further characteristics of several federated learning frameworks. Most federated learning frameworks provide details on how to install and use them. Fedlearner and FedML do not provide API documentation for easily setting up federated learning scenarios. Moreover, all frameworks support using GPUs to speed up the training process. Most federated learning frameworks have integrated CNN and RNN models.
We also compare the frameworks by the algorithms they implement and by their support for asynchronous aggregation. For horizontal FL, FedAvg is the only algorithm supported by all frameworks except the specialized ones. To handle heterogeneous data, FedML supports FedNova and TFF supports FedProx. For server-side optimization, FedML supports FedOpt and Flower supports FedAdam and FedAdagrad, which helps with FL across multiple devices. Horizontal federated learning is supported by many frameworks. In contrast, vertical federated learning is less widely supported. Table 3 shows the algorithms supported by the federated learning frameworks.
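Among the algorithms named above, FedProx addresses heterogeneous client data by adding a proximal term to each client's local objective, which penalizes drift away from the current global model. The sketch below shows one local gradient step under that definition. The function name `fedprox_step`, the flat-list weight representation, and the hyperparameter values are illustrative assumptions and not the API of TFF or any other framework.

```python
def fedprox_step(w_local, w_global, grad, lr=0.1, mu=0.01):
    """One local gradient step on F_k(w) + (mu / 2) * ||w - w_global||^2.

    `grad` is the gradient of the client's own loss F_k at w_local;
    the proximal term contributes mu * (w_local - w_global).
    """
    return [
        wl - lr * (g + mu * (wl - wg))
        for wl, wg, g in zip(w_local, w_global, grad)
    ]


# With a zero loss gradient, the proximal term alone pulls the client
# weights back toward the global model.
pulled = fedprox_step([1.0, -2.0], [0.0, 0.0], [0.0, 0.0], lr=0.1, mu=1.0)
plain_sgd = fedprox_step([1.0], [0.0], [2.0], lr=0.1, mu=0.0)
print(pulled, plain_sgd)
```

Setting `mu` to zero recovers an ordinary gradient step, so FedAvg-style local training is the special case without the proximal term.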
4 Experimental Deployment
In this study, we test the federated learning platform TFF. The experimental deployment scenario is shown in Fig. 1. We deploy a topology with 1 server and 5 nodes. The nodes use TensorFlow. Flow (1) illustrates distributing the initial model and the updated global models to the nodes. Flow (2) illustrates the local training that each node performs to create its local model. Local models are sent to the server via flow (3). The server aggregates the parameters of the local models into the global model. In this test, the neural network has an input layer with 10386 neurons, 1 hidden layer with 20 neurons, and an output layer with 5 neurons using the softmax activation function, corresponding to 5 types of Android malware.
We used the CCCS-CIC-AndMal-2020 dataset [22] for detecting and classifying Android malware. This dataset contains about 400 thousand samples (200 thousand benign and 200 thousand malware) and 5 types of malware. We sort the dataset by label and then divide it into 5 separate sub-datasets of 1000 samples each. Each node uses one sub-dataset. When a node is trained only on its own sub-dataset and then tested on the global dataset, the accuracy is very low. Table 4 presents the test results of the nodes using their local models.
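One plausible reading of the data preparation described above is that sorting by label before splitting gives each node samples of a single class, which would explain why a locally trained model performs poorly on the global test set. The sketch below reproduces that partitioning on synthetic data. The sample identifiers, the integer labels, and the assumption of a perfectly balanced dataset are hypothetical and not taken from the paper.

```python
from collections import Counter


def label_sorted_shards(samples, n_nodes):
    """Sort (features, label) pairs by label and cut them into contiguous shards."""
    ordered = sorted(samples, key=lambda sample: sample[1])
    shard_size = len(ordered) // n_nodes
    return [ordered[i * shard_size:(i + 1) * shard_size] for i in range(n_nodes)]


# Hypothetical balanced data: 5 malware types with 1000 samples each.
samples = [(f"app_{i}", i % 5) for i in range(5000)]
shards = label_sorted_shards(samples, n_nodes=5)
for node, shard in enumerate(shards, start=1):
    label_counts = dict(Counter(label for _, label in shard))
    print(f"Node {node}: {len(shard)} samples, label counts {label_counts}")
```

Under these assumptions every node sees exactly one malware type, so no single node can learn to separate the five classes. Aggregating the local models is then the only way for the global model to cover all of them.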
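Table 2. The other characteristics of federated learning frameworks [21].
Framework types: all-in-one frameworks (FedML, PaddleFL, Fedlearner); frameworks that support horizontal FL only (TFF, Flower); specialized frameworks (CrypTen, FedTree).

Characteristic              | FedML   | PaddleFL     | Fedlearner | TFF        | Flower              | CrypTen | FedTree
----------------------------+---------+--------------+------------+------------+---------------------+---------+--------
Documentation
  (label not recoverable)   | Yes     | Yes          | No         | Yes        | Yes                 | Yes     | Yes
  Code example              | Yes     | Yes          | Yes        | Yes        | Yes                 | Yes     | Yes
  API document              | No      | No           | No         | Yes        | Yes                 | Yes     | Yes
GPU support                 | Yes     | Yes          | Yes        | Yes        | Yes                 | Yes     | Yes
Support CNN                 | No      | Yes          | Yes        | No         | No                  | Yes     | No
Support RNN                 | No      | No           | Yes        | No         | No                  | No      | No
Machine learning backend    | PyTorch | PaddlePaddle | TensorFlow | TensorFlow | PyTorch, TensorFlow | PyTorch | N/A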
Table 3. The algorithms supported by common federated learning frameworks [21].

Category                               Framework    Horizontal algorithms         Vertical algorithms                  Asynchronous aggregation
All in one frameworks                  FedML        FedAvg, FedOpt, FedNova       VFL-LR                               No
All in one frameworks                  PaddleFL     FedAvg                        Two-party PrivC, Three-party ABY3    Yes
All in one frameworks                  Fedlearner   FedAvg                        Two-party split learning             No
Frameworks support horizontal FL only  TFF          FedAvg, FedProx, FedSGD       /                                    No
Frameworks support horizontal FL only  Flower       FedAvg, FedAdam, FedAdagrad   /                                    No
Specialized frameworks                 CrypTen      /                             sMPC                                 Yes
Specialized frameworks                 FedTree      HistSecAgg                    SecureBoost                          No
Fig. 1. The diagram of the experimental deployment scenario
When the global model is aggregated on the central server from the local models, the accuracy increases greatly. Table 5 shows the experimental results when testing with the global model aggregated from the nodes, using a batch size of 8, 30 epochs, 30 rounds, and a scaling factor of 0.35.

Table 4. The evaluation results of the nodes using the local model.

Node     F1-score      Accuracy
Node 1   0.454508378   0.603693728
Node 2   0.008779865   0.06848788
Node 3   0.025111931   0.118507118
Node 4   0.005055844   0.051558292
Node 5   0.042990178   0.157752982

Table 5. The experimental results when testing with the global model aggregated from nodes.

Round  Accuracy      Round  Accuracy      Round  Accuracy
0      0.58330127    10     0.477876106   20     0.469411312
1      0.601385148   11     0.435552135   21     0.503655252
2      0.061177376   12     0.420546364   22     0.533282032
3      0.597537514   13     0.407464409   23     0.554444017
4      0.084647942   14     0.408233936   24     0.566371681
5      0.27395152    15     0.431319738   25     0.578299346
6      0.470950366   16     0.431704502   26     0.588303194
7      0.541746826   17     0.439015006   27     0.597922278
8      0.497883801   18     0.448249327   28     0.605617545
9      0.497499038   19     0.451712197   29     0.614851866
5 Conclusion

In this study, we analyze federated learning frameworks that can be used to classify Android malware. We also tested a federated learning deployment to illustrate Android malware classification: we deployed the TFF framework on 1 server and 5 nodes. The experimental results indicate that the global model reaches higher accuracy than the local models at the nodes (0.615 in the final round, compared with 0.052 to 0.604 for the local models). The results show that these frameworks can be used not only for studies related to Android malware classification, but also for future machine learning and deep learning research.

Acknowledgement. This research is funded by Vietnam National University HoChiMinh City (VNU-HCM) under grant number C2023-26-02.
References

1. Kaspersky: 200,000 New Mobile Banking Trojan Installers Discovered, Double the 2021 (2022). Available: https://www.kaspersky.com/about/press-releases/2023_200000-new-mobile-banking-trojan-installers-discovered-double-the-2021
2. Fatima, A., Maurya, R., Dutta, M.K., Burget, R., Masek, J.: Android malware detection using genetic algorithm based optimized feature selection and machine learning. In: 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), pp. 220–223 (2019)
3. Ma, Z., Ge, H., Liu, Y., Zhao, M., Ma, J.: A combination method for android malware detection based on control flow graphs and machine learning algorithms. IEEE Access 7, 21235–21245 (2019)
4. Gao, H., Cheng, S., Zhang, W.: GDroid: Android malware detection and classification with graph convolutional network. Comput. Secur. 106, 102264 (2021)
5. Sun, B., Takahashi, T., Ban, T., Inoue, D.: Detecting android malware and classifying its families in large-scale datasets. ACM Trans. Manage. Inf. Syst. (TMIS) 13, 1–21 (2021)
6. Kim, J., Ban, Y., Ko, E., Cho, H., Yi, J.H.: MAPAS: a practical deep learning-based android malware detection system. Int. J. Inf. Secur. 21, 725–738 (2022)
7. Musikawan, P., Kongsorot, Y., You, I., So-In, C.: An enhanced deep learning neural network for the detection and identification of Android malware. IEEE Internet of Things J. (2022)
8. Banabilah, S., Aloqaily, M., Alsayed, E., Malik, N., Jararweh, Y.: Federated learning review: fundamentals, enabling technologies, and future applications. Inf. Process. Manage. 59, 103061 (2022)
9. Arora, A., Peddoju, S.K., Conti, M.: Permpair: android malware detection using permission pairs. IEEE Trans. Inf. Forensics Secur. 15, 1968–1982 (2019)
10. Khariwal, K., Singh, J., Arora, A.: IPDroid: android malware detection using intents and permissions. In: 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), pp. 197–202 (2020)
11. Gálvez, R., Moonsamy, V., Diaz, C.: Less is More: A Privacy-Respecting Android Malware Classifier Using Federated Learning (2020). arXiv preprint arXiv:2007.08319
12. Jiang, C., Yin, K., Xia, C., Huang, W.: FedHGCDroid: an adaptive multi-dimensional federated learning for privacy-preserving android malware classification. Entropy 24, 919 (2022)
13. Rey, V., Sánchez, P.M.S., Celdrán, A.H., Bovet, G.: Federated learning for malware detection in iot devices. Comput. Netw. 204, 108693 (2022)
14. He, C., Li, S., So, J., Zeng, X., Zhang, M., Wang, H., et al.: Fedml: A Research Library and Benchmark for Federated Machine Learning (2020). arXiv preprint arXiv:2007.13518
15. Ma, Y., Yu, D., Wu, T., Wang, H.: PaddlePaddle: an open-source deep learning platform from industrial practice. Front. Data Comput. 1, 105–115 (2019)
16. Cai, F.: ByteDance Breaks Federal Learning: Open Source Fedlearner Framework, 209% Increase in Advertising Efficiency (2020). Available: https://github.com/bytedance/fedlearner
17. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., et al.: Tensorflow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (2016). arXiv preprint arXiv:1603.04467
18. Labs, F.: Flower A Friendly Federated Learning Framework (2023). Available: https://flower.dev/
19. Knott, B., Venkataraman, S., Hannun, A., Sengupta, S., Ibrahim, M., van der Maaten, L.: Crypten: Secure multi-party computation meets machine learning. Adv. Neural. Inf. Process. Syst. 34, 4961–4973 (2021)
20. Al-Quraan, M., Khan, A., Centeno, A., Zoha, A., Imran, M.A., Mohjazi, L.: FedTrees: A Novel Computation-Communication Efficient Federated Learning Framework Investigated in Smart Grids (2022). arXiv preprint arXiv:2210.00060
21. Liu, X., Shi, T., Xie, C., Li, Q., Hu, K., Kim, H., et al.: Unifed: A Benchmark for Federated Learning Frameworks (2022). arXiv preprint arXiv:2207.10308
22. Gagnon, F., Massicotte, F.: Revisiting static analysis of android malware. In: 10th USENIX Workshop on Cyber Security Experimentation and Test (CSET 17) (2017)
Post-harvest Soybean Meal Loss in Transportation: A Data Mining Case Study

Emmanuel Jason Wijayanto, Siana Halim(B), and I. Gede Agus Widyadana

Industrial Engineering Department, Petra Christian University, Jl. Siwalankerto 121-131, Surabaya, Indonesia
[email protected]

Abstract. A poultry company in Indonesia loses raw material, Soybean Meal (SBM), during transportation from the port to the factory. To reduce the loss, the company created a raw material transport (RMT) system, which records the time and activities during loading, unloading, and transporting the material from the port to the factory warehouses. This study mines the RMT data on the loss of raw materials. Orange data mining is used to find the relationship between lost material and other attributes, create clusters, and classify the standardized loss. The clustering exhibits two classes, namely the standard and non-standard conditions. The classification process uses five different algorithms. The random forest algorithm was chosen because it produces the second-best AUC value and can visualize the classification through a decision tree. The classification process also produces rules based on the decision tree.

Keywords: Data mining · Clustering · Classification · Random forest
1 Introduction

A poultry company in Surabaya loses raw material, Soybean Meal (SBM), while it is transported from the port to the factory. The SBM is imported from, e.g., Brazil and Argentina. There are two ports in Surabaya: Tanjung Perak, which is 34 km from the factory, and Teluk Lamong, which is 45 km from the factory. Most of the SBM is shipped to Teluk Lamong. The company currently imports three types of SBM: SBM Argentine HiPro (50%), SBM Brazil Lopro (26.4%), and SBM Brazil HiPro (23.6%). The company employs a third-party logistics (TPL) provider to transport the SBM from the port to the factory. The TPL uses dump trucks, each with a capacity of 25 tons. Because material is lost during transportation, the company created a monitoring system called raw material transport (RMT) to reduce the loss. The RMT records the vehicle's plate number, the weight of the empty dump truck before and after transporting the SBM, and the weight of the loaded dump truck measured at the port and on arrival at the factory. It also records the time of each activity: departure from the port, arrival at the factory, queuing for weighing and unloading at the factory, and unloading at the warehouse. The weighing and unloading process at the factory is hectic.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 316–324, 2024. https://doi.org/10.1007/978-3-031-50327-6_33
Post-harvest losses during transportation can occur due to many factors, such as physical damage to the crops during loading and unloading, temperature and humidity, contamination, pest infestation, and poor packaging [1]. Many researchers have studied this problem. Medeiros et al. [2] studied post-harvest soybean loss during transportation in Brazil and recorded loss statistics against the distance from the farm to the destination and the road condition. Wang and Shi [3] researched the function optimization of bulk grain transportation. Iordăchescu et al. [4] investigated transportation and storage losses for Romania's fresh fruits and vegetables. They found that transportation losses can result from long shipping and delivery times, transporting unsuitable products together, product damage due to rough mechanical handling, and not transporting products in a suitable atmosphere. Jia et al. [5] provided a systematic literature review on supply chain management of soybeans. Machine learning and data mining have also been used widely in agricultural problems: Borse and Agnihotri [6] used fuzzy logic rules to predict crop yields, and Vasilyev et al. [7] studied processing plants for post-harvest disinfection of grain. This study aims to investigate the loss of soybean meal in transportation through data mining.
2 Methods

2.1 RMT Process

The RMT flow starts with recording the weight of the loaded dump truck before it leaves the port. The truck is identified by its vehicle plate number and travel document. The travel document number is barcoded and attached to the truck body. At every security post, security scans the barcode and records the time. The truck then travels from the port to the factory. Once it arrives at the factory, security scans the barcode and the truck enters the queue to be weighed while loaded. The weighing queue time is recorded in the RMT. After weighing, the truck enters the queue to unload the cargo. The unloading queue time is also recorded in the RMT. The truck then unloads the cargo in the designated warehouse, and the RMT records the unloading time. Finally, the empty truck is weighed, security scans the factory departure ticket, and the truck returns to the port.

2.2 Data Preprocessing

All collected data are processed through merging, selecting, cleaning, and transforming. First, merging unifies the two datasets obtained so that the amount and percentage of the weight difference can be calculated. Next, selection removes unnecessary attributes from the combined data so that the amount of data is minimized while still representing the actual data. In data cleaning, we remove records that deviate significantly and would distort the data distribution (outliers). Finally, transformation converts the selected data into other forms to add the information needed in the mining process.
2.3 Data Mining

The next stage is the core of this research: mining the processed and cleaned data. Data mining is the study of collecting, cleaning, processing, analysing, and obtaining useful information from data [8]. The mining process looks for attributes that may be related to the loss of raw material through correlation tests and analysis of variance (ANOVA). Clustering then groups the losses so that the groups can serve as a benchmark for a standard loss. In clustering, the algorithm examines the data to find groups of similar items [9]. This study uses the k-Means algorithm, which produces two clusters, namely standard and non-standard conditions. Classification is then carried out, assigning each object to one of several categories commonly known as classes [9]. Five algorithms are used: naive Bayes, KNN, tree, random forest, and neural network. Classification aims to predict whether a given condition falls into the standard or non-standard group, and it yields rules that the company can use for classification.
3 Results and Discussions

3.1 Data

This study uses two datasets recorded from January 2020 to August 2022. The first dataset was obtained from the RMT report data and contains 28 attributes. It includes information about the dump trucks used for shipping, the time and the name of the user involved at each stop in the flow, and the weight of raw materials transported at the port. There are seven files of this type, separated by year, period, and type of raw material transported. The second dataset is the factory's SAP data, which the factory updates internally after raw materials are weighed at the factory. This dataset has only 9 attributes, including delivery truck information, time details, and the weight of raw materials weighed at the factory. The two datasets are preprocessed to obtain a clean dataset of 27 attributes (14 numerical, 8 categorical, and 5 meta attributes in text form) and 12,027 rows.

3.2 Data Descriptive

This study aims to mine the SBM lost during transportation from the port to the factory. Here, the loss is defined as the difference in the dump truck's weight when weighed at the port and at the factory. According to previous studies, post-harvest loss is related to distance, time traveled, temperature, and humidity [1, 2]. Therefore, we use correlation analysis and hypothesis tests to draw inferences about the loss. Since the distance from the port to the factory is constant, we start with the time traveled. The company did not record the temperature and humidity. However, the temperature is related to the time of day, since day and night temperatures differ, and the humidity is related to the month.
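The loss definition above can be written out as a small calculation. This is a sketch under stated assumptions: the field names are hypothetical, the sign convention (negative means material was lost) follows the negative values reported in the tables, and the percentage is taken relative to the net cargo weight at the port, which is an interpretation rather than something the text states explicitly.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    port_gross_kg: float     # loaded truck weighed at the port
    factory_gross_kg: float  # loaded truck weighed at the factory
    tare_kg: float           # empty truck weight


def sbm_loss(trip: Trip):
    """Return (loss_kg, loss_pct); negative values mean material was lost."""
    net_at_port = trip.port_gross_kg - trip.tare_kg
    loss_kg = trip.factory_gross_kg - trip.port_gross_kg
    loss_pct = 100.0 * loss_kg / net_at_port
    return loss_kg, loss_pct


# Illustrative numbers only: a truck carrying 25 t that arrives 50 kg lighter.
trip = Trip(port_gross_kg=37_000.0, factory_gross_kg=36_950.0, tare_kg=12_000.0)
loss_kg, loss_pct = sbm_loss(trip)
print(f"{loss_kg:.1f} kg, {loss_pct:.3f} %")
```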
3.2.1 Time Traveled

Time traveled is the travel time from when the truck driver scans the barcode at the port until the driver scans the barcode again at the factory gate. This attribute was chosen because the company initially assumed that the longer the time traveled, the greater the SBM loss. The time traveled has an average of 01:33:01. When it is grouped in 10-min bins, the highest frequency occurs from 01:10:00 to 01:20:00, with 2779 trips (23.11% of the total data). The fastest time traveled is 00:58:40 (58 min 40 s), while the slowest is 03:31:07 (3 h 31 min 7 s). The correlation coefficient between the time traveled and the SBM loss is +0.034, which is not significant and is negligible [10]. Therefore, the time traveled is not related to the SBM loss.

3.2.2 Temperature

The temperature is related to the port departure time, which records when a truck leaves the port to deliver SBM to the factory. The departure time can be anywhere from 00:00 to 23:59, and it is well known that temperature and wind speed vary over the day [11]. Most trucks (766 trips or 6.37%) depart from the port between 16:00 and 17:00, and the fewest (108 trips or 0.90%) depart between 07:00 and 08:00. The correlation between port departure time and the SBM loss in percentage is +0.012, which is not significant. In Surabaya, day and night temperatures are not extremely different, so the lack of correlation between SBM loss and temperature is reasonable. Surabaya, the capital of East Java Province, is located on the northern coast of the province and is a tropical city. It lies between 7°9′–7°21′ South Latitude and 112°36′–112°54′ East Longitude. Topographically, Surabaya is 80% flatland, with a height of 3–6 m above sea level [12]. Additionally, the temperature in Surabaya varies little from day to day; it is between 24 and 32 °C. Figure 1 shows the monthly average temperature in Surabaya [13].

3.2.3 Humidity

Indonesia has two seasons, dry and wet. Officially, the dry season runs from April to September and the wet season from October to March. However, the starting month of each season shifts. Therefore, in this study, we relate the humidity to the month in which the imported SBM arrived at the port of Surabaya. Figure 2 shows the monthly average humidity in Surabaya. During the wet season, the humidity is higher than in the dry season. The dataset shows that July is the busiest month with 2,235 trips (18.58%) from the port to the factory, while September is the quietest with only 78 trips (0.65%). However, the SBM loss in September is higher than in July. The loss tends to be higher in the wet season than in the dry season. The monthly average SBM loss in percentage is given in Table 1. The ANOVA test shows that the monthly average percentage SBM loss differs significantly between months (F-value 73.25, p-value < 0.001).
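The two tests used in Sects. 3.2.1 to 3.2.3, a correlation coefficient and a one-way ANOVA, can be sketched in plain Python as below. The study itself used Orange data mining, so this is not the authors' implementation; the function names and the small sample values are assumptions chosen only to make the formulas concrete.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up values: travel time in minutes against loss in percent.
travel_min = [70, 75, 93, 120, 211]
loss_pct = [-0.15, -0.20, -0.12, -0.18, -0.16]
print("r =", round(pearson_r(travel_min, loss_pct), 3))

# Made-up loss percentages grouped by month.
by_month = {"July": [-0.14, -0.16, -0.17], "November": [-0.25, -0.27, -0.26]}
print("F =", round(anova_f(list(by_month.values())), 2))
```

A coefficient near zero, as reported for time traveled and departure time, means the scatter shows no linear trend; a large F value means the monthly means differ by more than the within-month spread would explain.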
Fig. 1. Monthly average temperature in Surabaya [13]
Fig. 2. Monthly average humidity in Surabaya [13].
Table 1. Monthly average SBM loss in percentage

Month      Average loss (%)   Month      Average loss (%)
January    −0.158             July       −0.155
February   −0.123             August     −0.184
March      −0.153             September  −0.175
April      −0.169             October    −0.141
May        −0.206             November   −0.264
June       −0.135             December   −0.219
3.3 Clustering and Classification

3.3.1 Clustering

The descriptive analysis only shows the relationship between a single attribute and the SBM loss. We therefore analyze the data further by clustering the SBM loss. The analysis is carried out using Orange data mining [14]. We use K-means since it is simple and efficient and is commonly used to cluster features with many data types [15]. The data are standardized to reduce the noise in the dataset [16]. To find the number of clusters, we use the Silhouette scores [17], which are highest for two clusters (see Table 2). Figure 3 shows the distribution of the two clusters, and the summary statistics are given in Table 3.
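The clustering steps above (standardize, run K-means, score with the silhouette) can be sketched in plain Python. This is an illustration only: the study used Orange data mining, whereas the code below is a basic Lloyd's-style K-means on a single variable. Treating the loss percentage as the only clustering feature is an assumption, and the sample values and function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def kmeans_1d(values, k, iters=100, seed=0):
    """Plain K-means on one variable; returns a cluster label per value."""
    centers = random.Random(seed).sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(values, labels):
    """Mean silhouette score; assumes at least two non-empty clusters."""
    clusters = set(labels)
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        own = [abs(v - w) for j, (w, m) in enumerate(zip(values, labels)) if m == lab and j != i]
        if not own:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(mean([abs(v - w) for w, m in zip(values, labels) if m == c])
                for c in clusters if c != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


# Made-up loss percentages with two visible groups.
loss_pct = [-0.31, -0.29, -0.33, -0.30, -0.08, -0.09, -0.07, -0.10]
z = standardize(loss_pct)
labels = kmeans_1d(z, k=2)
print(labels, round(silhouette(z, labels), 3))
```

Repeating the last three lines for k = 2 to 8 and keeping the k with the highest score mirrors the selection procedure summarized in Table 2.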
Fig. 3. Distribution of the two clusters
Table 2. Silhouette scores

Number of clusters   Silhouette scores
2                    0.557
3                    0.515
4                    0.532
5                    0.526
6                    0.540
7                    0.542
8                    0.542
Table 3. Characteristics of Cluster 1 (C1) and Cluster 2 (C2)

Cluster   Mean (in %)   St. Dev (in %)   Mean (in kg)   Characteristic
C1        −0.302        0.091            −76.3          Non-standard
C2        −0.083        0.082            −20.7          Standard
3.3.2 Classification

Classification rules allow class predictions when several variables are known [18]. Many classification methods are available; this study uses five algorithms, namely KNN [19], Naive Bayes [20], Tree [21], Random Forest [22], and Neural Network [23]. Overfitting of the decision tree is prevented by pre-pruning, limiting the depth of the tree to 5 levels. Max depth five was chosen because it increases the AUC value significantly compared to the depth below it (up to 4.7%), even though the AUC value is higher than depth 6. The classification results are tested with an 80:20 split: the algorithm uses 80% of the data to learn data patterns and models (training data), and the remaining 20% is used for testing (testing data). Table 4 reports five metrics, but this study uses the AUC value as the reference for assessment. AUC, the area under the Receiver Operating Characteristic (ROC) curve, is a global index of classification accuracy [24].

Table 4. Classification algorithm metrics

Model               AUC     CA      F1      Precision   Recall
kNN                 0.672   0.654   0.645   0.645       0.654
Tree                0.652   0.668   0.681   0.681       0.668
Random forest (2)   0.673   0.662   0.685   0.685       0.662
Neural network      0.524   0.605   0.596   0.596       0.605
Naive Bayes         0.642   0.639   0.635   0.635       0.639
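The 80:20 evaluation and the AUC metric described above can be sketched as follows. This is not the Orange implementation used in the study: the AUC here is computed from its rank interpretation (the probability that a randomly chosen positive case is scored above a randomly chosen negative one), and `split_80_20`, `auc`, and the example scores are hypothetical.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and return (training 80%, testing 20%)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count one half.
    `labels` uses 1 for the positive class and 0 for the negative class."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train, test = split_80_20(range(100))
print(len(train), len(test))

# Made-up classifier scores for four test trips (1 = non-standard loss).
print(auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the values in Table 4.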
The classification algorithm with the highest AUC value is the kNN algorithm, followed by random forest, tree, and naive Bayes. Although the random forest algorithm produces only the second-highest AUC value, it has an advantage: it can visualize the classification process as a decision tree. Figure 4 shows the decision tree generated by the random forest algorithm, with one root node, 22 internal nodes, and 24 leaf nodes consisting of 16 standard leaves (C2) and eight non-standard leaves (C1). Three features are significant in classifying the SBM loss as standard or non-standard: month, port departure time, and time traveled. Three rules that classify the SBM loss as non-standard are:

• Month = April and Port departure time < 09:26 and Time-traveled > 01:30.
• Month = February and Port departure time > 22:54 and Time-traveled > 01:45.
• Month = December and Time-traveled > 03:21.
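The three rules above can be encoded directly as a check on a trip record. The sketch assumes the thresholds are in hh:mm and covers only these three rules; the full tree has eight non-standard leaves, so a `False` result means "not matched by the quoted rules", not "standard". The function name is hypothetical.

```python
from datetime import time, timedelta


def matches_quoted_rule(month: str, departure: time, traveled: timedelta) -> bool:
    """True when a trip satisfies one of the three quoted non-standard rules."""
    if month == "April":
        return departure < time(9, 26) and traveled > timedelta(hours=1, minutes=30)
    if month == "February":
        return departure > time(22, 54) and traveled > timedelta(hours=1, minutes=45)
    if month == "December":
        return traveled > timedelta(hours=3, minutes=21)
    return False


# Example trips (made up).
print(matches_quoted_rule("April", time(8, 0), timedelta(hours=2)))
print(matches_quoted_rule("July", time(8, 0), timedelta(hours=2)))
```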
Fig. 4. Random forest decision tree
4 Conclusion

This study analyzes three data attributes related to SBM loss during transportation from the port to the factory: time traveled, port departure time, and month. Time traveled and port departure time are not significantly correlated with the SBM loss, but the month is significantly related to it. Additionally, two clusters are discovered that classify the SBM loss as a standard loss, with a mean loss of −0.083% of the total SBM delivered, or a non-standard loss, with a mean loss of −0.302%. Finally, the random forest is used to predict whether particular features will cause the SBM loss to be standard or non-standard. The wet season is found to make the SBM loss more severe. The AUC is still under 70%. In future work, we need to examine other variables that influence the soybean meal loss during transportation from the port to the factory.
References

1. Al-Dairi, M., Pathare, P.B., Al-Yahyai, R.: Mechanical damage of fresh produce in postharvest transportation: current status and future prospect. Trends Food Sci. Technol. 124, 195–207 (2022). https://doi.org/10.1016/j.tifs.2022.04.018
2. Medeiros, P.O., Naas, I., Vendrametto, O., Soares, M.: Post-harvest soybean loss during truck transport: a case study of Piaui State, Brazil. In: Advances in Production Management Systems, Initiative for a Sustainable World. APMS 2019. IFIP Advances in Information and Communication Technology, vol. 488 (2019). https://doi.org/10.1007/978-3-319-51133-7_72
3. Wang, X., Shi, H.: Research on the function optimization of the bulk grain transportation central control system. IOP Conf. Ser. Earth Environ. Sci. 512(1), 012163 (2020). https://doi.org/10.1088/1755-1315/512/1/012163
4. Iordăchescu, G., Ploscutanu, G., Pricop, E.M., Baston, O., Barna, O.: Postharvest Losses in transportation and storage for fresh fruits and vegetables sector. Agric. Food 7, 244–249 (2019)
5. Jia, F., Peng, S., Green, J., Koh, L., Chen, X.: Soybean supply chain management and sustainability: a systematic literature review. J. Clean Prod. 255, 120254 (2020). https://doi.org/10.1016/j.jclepro.2020.120254
6. Borse, K., Agnihotri, P.G.: Prediction of crop yields based on fuzzy rule-based system (FRBS) using the Takagi Sugeno-Kang approach. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and Computing, vol. 866, pp. 438–447. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_46
7. Vasilyev, A.A., Samarin, G.N., Vasilyev, A.N.: Processing plants for post-harvest disinfection of grain. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent Systems and Computing, vol. 1072, pp. 501–505. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4_49
8. Aggarwal, C.C.: Data Mining: The Textbook. Springer International Publishing (2015)
9. Bramer, M.: Principles of Data Mining. Springer London (2016)
10. Schober, P., Boer, C., Schwarte, L.: Correlation coefficients: appropriate use and interpretation. Anesth. Analg. 126(5), 1763–1768 (2018)
11. Zhu, Y., Kuhn, T., Mayo, P., Hinds, W.C.: Comparison of daytime and nighttime concentration profiles and size distributions of ultrafine particles near a major highway. Environ. Sci. Technol. 40(8), 2531–2536 (2006)
12. Geographic of Surabaya: http://dpm-ptsp.surabaya.go.id/v3/pages/geografis. Last access 3 Jan 2023
13. Weather Atlas in Surabaya: https://www.weather-atlas.com/en/indonesia/surabaya-climate. Last access 3 Jan 2023
14. Orange Data Mining: https://orangedatamining.com/download/#windows (2022)
15. Wu, J.: Advances in K-Means Clustering: A Data Mining Thinking. Springer, Berlin Heidelberg (2012)
16. De Amorim, R.C., Hennig, C.: Recovering the number of clusters in datasets with noise features using feature rescaling factors. Inf. Sci. 324, 145 (2015)
17. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Comput. Appl. Math. 20, 53–65 (1987)
18. Tanjung, S.Y., Yahya, K., Halim, S.: Predicting the readiness of indonesia manufacturing companies toward industry 4.0: a machine learning approach. Jurnal Teknik Industri, Ind. Eng. J. Res. Appl. 23(1), 1–10 (2021). https://doi.org/10.9744/jti.23.1.1-10
19. Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is Nearest Neighbor Meaningful? Computer Sciences Department, University of Wisconsin, Technical Report #1377 (1998). https://minds.wisconsin.edu/bitstream/handle/1793/60174/TR1377.pdf?sequence=1
20. Rennie, J.D.M., Shih, L., Teevan, J., Karger, D.R.: Tackling the Poor Assumptions of Naïve Bayes Text Classifiers (2003). http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf
21. Rokach, L., Maimon, O.: Top-down induction of decision trees classifiers - a survey. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 35(4), 476–487 (2005)
22. Shi, T., Horvath, S.: Unsupervised learning with random forest predictors. J. Comput. Graph. Stat. 15(1), 118–138 (2006)
23. Jeatrakul, P., Wong, K.W.: Comparing the performance of different neural networks for binary classifications problems. In: Proceeding of the Eighth International Symposium on Natural Language Processing (2009). https://doi.org/10.1109/snlp15315.2009
24. Faraggi, D., Reiser, B.: Estimation of the area under the ROC curve. Stat. Med. 21, 3093–3106 (2002)
Android Application Behavior Monitor by Using Hooking Techniques

Nguyen Tan Cam1,2(B), Trinh Gia Huy1,2, Vo Ngoc Tan1,2, Phuc Nguyen1,2, and Sang Vo1,2

1 Faculty of Information Science and Engineering, University of Information Technology, Ho Chi Minh City, Vietnam
{camnt,tanvn}@uit.edu.vn, {20520556,17520128,17520981}@gm.uit.edu.vn
2 Vietnam National University, Ho Chi Minh City, Vietnam

Abstract. Android has been the most popular mobile operating system in recent years. Therefore, security risks in Android applications have a great impact on many users. Monitoring the behavior of applications while they are being used is a technique that can overcome the obfuscation techniques of malware. In this study, we propose a system (uit2ABM) that monitors an application's behavior during execution on an Android device. We use hooking techniques to perform the behavior monitoring. The results of this study can be used to detect sensitive data leakage and other malicious behavior in Android applications without changing the application's source code before execution.

Keywords: Android · Sensitive data leakage · Hooking · Application behavior monitoring
1 Introduction

According to Statcounter [1], Android is the most popular operating system on smartphones. It accounted for 71.77% of the market in January 2023, while iOS accounted for 27.6%; other mobile operating systems made up the remaining 0.73%. Today's mobile phones contain a lot of sensitive data such as location information, contacts, messages, bank accounts, and health information. The data stored on Android devices is becoming increasingly important to users, and the need to monitor how the various types of data on the phone are used is growing. To control the threat of resource exploitation on Android, many analysis methods have been proposed and applied. Static and dynamic analysis techniques are commonly used to analyze Android applications. However, almost all applications, benign as well as malicious, use protection techniques such as source code obfuscation and encryption to avoid being analyzed or decompiled.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 325–333, 2024. https://doi.org/10.1007/978-3-031-50327-6_34
For static analysis, obfuscation techniques and the lack of source code are a big challenge for analysts. In addition, static analysis can only check the behavior of the application at the API level; it cannot observe values that exist only at execution time, such as the URL used when connecting to the network or the recipient's phone number when sending an SMS. For dynamic analysis, anti-analysis techniques that detect virtualized environments (Anti-VM) are also a big challenge, especially when the application is analyzed in a virtualized environment. Because of these difficulties, hooking techniques are used to monitor and analyze application behavior on an Android device and make Android malware easier to analyze. The main contributions of the paper are as follows:

• Build an application that supports real-time monitoring of the behavior of applications running on Android. The application promptly detects illegal data access, network connections, or message sending that leak personal data to the outside, and alerts the user when malicious behavior is detected.
• Build an application that monitors some common and typical APIs, such as connecting to URLs, sending SMS, accessing the clipboard, and starting an activity.
• Build an application that supports pentesters during penetration testing of Android apps.
• Build an application suited to end users, allowing the user to interact with the target application in real time and respond appropriately to malicious behavior (Allow / Block / Warning). The application also provides an interface that displays logs of the behavior that has taken place in the target application.

The paper is organized into five sections. The introduction is the first section. The second section presents the related works. The proposed system is described in Sect. 3. Section 4 and Sect. 5 are the evaluation and conclusion.
2 Related Works

Zhang et al. proposed FastDroid [2], a system for detecting sensitive data leakage in Android applications. They used static analysis to build graphs that represent the propagation paths of data in an application. Their test results show that the system maintains high accuracy and has several advantages over FlowDroid [3]. However, because the study relies on static analysis, it is limited when analyzing malicious applications that use encryption and obfuscation.

Samhi et al. proposed CoDoC [4], which uses machine learning to identify privacy-related source and sink API methods. In contrast to previous methods, their system applies machine learning to a combination of API documentation and source code. First, they propose new definitions of sensitive source and critical sink functions for identifying behaviors that cause loss of sensitive data. Second, based on these definitions, they analyze the source code and related documentation to classify the methods. The results of their study can supplement the
Android Application Behavior Monitor by Using Hooking Techniques
previous source-sink function calls of SuSi [5] to support works related to determining Android application behavior.

Schindler et al. [6] proposed a method for integrating open source tools to assist developers in testing Android apps. The method can be used to detect security risks in third-party libraries. Tools such as Frida [7], FlowDroid [3], and mitmproxy [8] are integrated in a simple and feasible way to test for sensitive data leakage in third-party apps. However, their study was not implemented on mobile devices for real-time application behavior monitoring.

Chen et al. proposed AUSERA [9], a system for classifying vulnerabilities in Android applications that has been extended to cover 50 vulnerability types. They built a new benchmark dataset covering all 50 types to demonstrate the effectiveness of AUSERA. However, they did not evaluate application behavior during execution.

Natesan et al. [10] proposed a solution that detects sensitive data leakage using both static and dynamic analysis. Android applications in the dataset are mapped to their sensitive data leakage, activity list, permissions list, method list, and library classes. Their solution uses logcat information and the runtime permissions list to detect sensitive data leakage. Their study was not fully implemented on mobile devices for real-time application behavior monitoring.

In this paper, we propose a solution that uses the Xposed Framework toolkit to analyze, exploit, and monitor application behavior. The results of this study can be used to detect anomalous behavior of Android applications in real time.
3 Proposed System

In this study, uit2ABM (Android Application Behavior Monitoring system) is proposed to monitor application behavior on Android phones. The architecture of uit2ABM is shown in Fig. 1. The proposed system is divided into four main modules: the API hooking module, the Behavior collector module, the Malicious behavior detector module, and the User interactor module.

3.1 API Hooking Module

To collect application behavior information, the proposed system must be able to monitor API functions. The API hooking module does this by observing the parameters passed to API functions. The values of these parameters are used to build the behavior of the application in the next module (the Behavior collector module). In the API hooking module, we use the Xposed Framework to intercept API functions. The steps to set up this module are as follows [11]:

Step 1: Add the XposedBridge library to the dependencies section of the app/build.gradle file so that the project can use the Xposed API. Figure 2 shows an example of adding XposedBridge.

Step 2: Add metadata tags to the AndroidManifest.xml file to declare the app as a module for the Xposed Framework. Xposed Installer finds and reads these special metadata flags. They provide the basic information that the Xposed Framework needs to run and display
Fig. 1. The architecture of the proposed system, uit2ABM
Fig. 2. An example of adding XposedBridge
application information. Make sure the metadata is added inside the <application> element of the AndroidManifest.xml file. Here, "xposeddescription" is the developer's description of the module, and "xposedminversion" is the minimum version of the Xposed Framework API that the module requires.

Step 3: Under app/src/main/, create an assets folder and, inside it, a file named xposed_init. The content of this file is the fully qualified name of the newly created hook class. It tells XposedBridge which class contains the entry points for hooking.

Step 4: Create a new class at /app/java/[project path]/main/src and implement the hooking functions in it. In this study, the list of functions to hook is stored in the Sensitive API list database. The execution flow of the Xposed module is separate from that of the other classes in the same project. The MainActivity class is used to interact with the Xposed Framework running in the background.

Step 5: The Xposed Framework intercepts the API calls, collects the parameter information, and displays it in the "logs" section of Xposed Manager. Figure 3 shows an example of the "logs" section of Xposed Manager.
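The idea behind Steps 4 and 5, running a callback before and after an intercepted method, can be illustrated with a small Python analogy. This is only a sketch of the hooking concept and not the Xposed API: `install_hook`, `FakeClipboard`, and `get_primary_clip` are hypothetical names introduced for illustration.

```python
import functools


def install_hook(cls, method_name, before=None, after=None):
    """Replace cls.method_name with a wrapper that runs the callbacks."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        if before is not None:
            before(method_name, args, kwargs)  # inspect the parameters
        result = original(self, *args, **kwargs)
        if after is not None:
            after(method_name, result)  # inspect the return value
        return result

    setattr(cls, method_name, wrapper)
    return original  # keep a handle so the hook can be undone


class FakeClipboard:
    """Hypothetical stand-in for a sensitive API class."""

    def __init__(self, text):
        self._text = text

    def get_primary_clip(self):
        return self._text


log = []
install_hook(
    FakeClipboard,
    "get_primary_clip",
    before=lambda name, args, kwargs: log.append(("before", name, args)),
    after=lambda name, result: log.append(("after", name, result)),
)
clip = FakeClipboard("secret")
value = clip.get_primary_clip()
```

The caller still receives the original return value, while `log` records one entry before the call and one after it, which is the information the next module consumes.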
Fig. 3. An example of “logs” section of Xposed Manager
3.2 Behavior Collector Module

This module receives the values collected by the API hooking module and uses them to generate information about application behaviors. In this study, we collect behaviors related to accessing the clipboard, accessing URLs, sending passwords in messages, etc. Figure 4a shows information about the collected application behaviors. The recorded information includes the time, the class name, the method or constructor name, and the warning content.

3.3 Malicious Behavior Detector

In this module, we build functions that detect malicious behavior of the application.

3.3.1 Clipboard Access Detection

In the target application, the hooking module listens for and intercepts the getPrimaryClip() method of the android.content.ClipboardManager class. After getPrimaryClip() returns, a message about the application's clipboard access is displayed in the user interface. Figure 4b shows an example of clipboard access detection.

3.3.2 Password Detection in SMS

The module hooks the sendTextMessage() method of the android.telephony.SmsManager class before the method is called. The hooking function then checks the parameters passed to the method, including the SMS content and the recipient's address. This module checks the message content against a list of 10 million common passwords [12]. If it detects that an outgoing message contains the user's password, the hooking function displays an Alert Dialog warning that the message contains sensitive information, with two options (Allow and Block). If the user selects "Allow", the sendTextMessage() method is called again and the message is sent to the destination. If the user selects "Block", the sending action is blocked.
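The password check of Sect. 3.3.2 and the behavior record of Sect. 3.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the whitespace tokenization, the `BehaviorRecord` type, and the `ask_user` callback (standing in for the Alert Dialog) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BehaviorRecord:
    """One logged behavior, with the fields listed in Sect. 3.2."""
    time: str
    class_name: str
    method_name: str
    warning: str


def find_passwords(message, password_list):
    """Return the message tokens that appear in the common-password set."""
    # Assumption: split on whitespace and strip trailing punctuation.
    tokens = [token.strip(".,;:!?") for token in message.split()]
    return [token for token in tokens if token in password_list]


def check_outgoing_sms(destination, message, password_list, ask_user):
    """Decide whether an SMS may be sent; returns (send, record)."""
    hits = find_passwords(message, password_list)
    if not hits:
        return True, None  # nothing suspicious: let the call proceed
    record = BehaviorRecord(
        time=datetime.now().isoformat(timespec="seconds"),
        class_name="android.telephony.SmsManager",
        method_name="sendTextMessage",
        warning=f"SMS to {destination} contains a known password",
    )
    decision = ask_user(record)  # hypothetical dialog: "allow" or "block"
    return decision == "allow", record
```

Keeping the password list in a set makes each lookup constant time, which matters with a list of this size. Exact token matching can also flag ordinary words that happen to be common passwords, so a real deployment would probably need a stricter rule.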
Fig. 4. An example of the result from the application behavior collector
3.3.3 Malicious URL Access Detection

When an HTTP connection request occurs, the Android API instantiates an object of the HttpURLConnection class. The value passed to the constructor is a URL. The Xposed module listens for and intercepts this constructor, HttpURLConnection(URL u), before it is called. The module then checks the URL of the connection against the URL blacklist provided by the proposed system and responds appropriately. In this study, the URL blacklist is taken from [13]. The list is saved as a .txt file in the /data/data/package_name_of_target/files directory. The proposed system relies on this list to detect access to sensitive URLs. If the URL is in the blacklist, the system displays a warning and waits for the user's choice.

3.3.4 Sensitive Data Leakage Detection

If any behavior involves accessing sensitive data and sending it out of the device, the system raises an alert. In this study, we not only detect access to sensitive data but also provide a mechanism that monitors the spread of this data across different applications through a tagging method. The list of sensitive source and critical sink API calls is taken from SuSi [5]. We hook functions related to inter-application communication, such as startActivity() and startActivityForResult().
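The blacklist lookup of Sect. 3.3.3 can be sketched in Python as follows. The file format (one domain per line, with `#` comments) and the parent-domain matching rule are assumptions made for illustration, and `load_blacklist` and `is_blacklisted` are hypothetical helper names. The paper does not specify how a URL is compared with the list.

```python
from urllib.parse import urlsplit


def load_blacklist(lines):
    """Normalize blacklist entries (assumed: one domain per line)."""
    entries = set()
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            entries.add(entry)
    return entries


def is_blacklisted(url, blacklist):
    """Return True if the URL's host or any parent domain is listed."""
    host = urlsplit(url).hostname
    if host is None:
        return False  # no host component, nothing to compare
    parts = host.lower().split(".")
    # Check "a.b.c", then "b.c", then "c" against the blacklist.
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))
```

In the hook, a positive result would lead to the warning dialog, and a negative one would let the connection proceed unchanged.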
4 Evaluation

4.1 Dataset

In this study, we built our own dataset. The application in this dataset performs the following sensitive behaviors:
• The application has a feature for creating a login account and saving the account information. The application is capable of exploiting this information.
• The application has a feature for sending SMS messages, with two options. The first option sends the SMS directly through the built-in Android class SmsManager, without using an intent object. The second option sends an implicit intent to the Android system to request that a message be sent.
• The application has a feature that simulates a real-life mobile banking or e-wallet application. Users have to log in with a PIN provided by their bank in order to authenticate.
• The application has a feature for downloading and displaying an image from an entered URL. This feature is equivalent to downloading a web page as HTML from a URL.
• The application has a feature for accessing the camera. It uses an Intent to open the device's camera to take a photo or record a video and save it in the device's memory.
• The application has a feature for opening the calendar at a date entered in the application window.
• The application has a feature that allows users to set an alarm timer.

In addition, we use samples from the DroidBench dataset [14]. Specifically, we tested the proposed system with samples from the groups ImplicitFlows (4 files), InterAppCommunication (3 files), and InterComponentCommunication (18 files).

4.2 Evaluation

The evaluation results are presented in Table 1. They show that the proposed system can accurately detect the behavior of the applications from the two datasets (the DroidBench dataset and our dataset).

One of the key goals of the proposed system is the ability to interact with the user while monitoring the application's behavior. During testing, the proposed system was able to interact with users when malicious behavior was detected. Figure 5 illustrates the system's response when malicious behavior is detected.
Table 1. The evaluation results of the proposed system

Dataset | Dataset component | Correctly detected
Our dataset | Send SMS |
Our dataset | Load Image |
Our dataset | Use the camera |
Our dataset | Calendar view |
Our dataset | Set alarm |
Our dataset | Mobile banking |
DroidBench dataset | ImplicitFlows |
DroidBench dataset | InterAppCommunication |
DroidBench dataset | InterComponentCommunication |

Fig. 5. An example of the system's response during malicious behavior detection

5 Conclusion and Future Works

In this study, we propose a system that monitors the behavior of Android applications on Android phones. The proposed system does not need the source code of the Android application. It monitors the behavior of the application from the moment the application is launched. We also built a test dataset that illustrates malicious behaviors such as accessing the clipboard, sending sensitive data via SMS, and accessing sensitive URLs. The test results show that the system performs well on both our own dataset and the DroidBench dataset. The results of this study can be applied in future studies, such as detecting the sharing or downloading of sensitive images and detecting spam SMS in real time.
Acknowledgement. This research is funded by the University of Information Technology, Vietnam National University HoChiMinh City (VNU-HCM) under Grant No. D1-2023-02.
References

1. StatCounter: Mobile Operating System Market Share Worldwide (2023). Available: https://gs.statcounter.com/os-market-share/mobile/worldwide
2. Zhang, J., Tian, C., Duan, Z.: FastDroid: efficient taint analysis for android applications. In: 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 236–237 (2019)
3. Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., et al.: FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, Edinburgh, United Kingdom (2014)
4. Samhi, J., Kober, M., Kabore, A.K., Arzt, S., Bissyandé, T.F., Klein, J.: Negative Results of Fusing Code and Documentation for Learning to Accurately Identify Sensitive Source and Sink Methods: An Application to the Android Framework for Data Leak Detection (2023). arXiv preprint arXiv:2301.03207
5. Rasthofer, S., Arzt, S., Bodden, E.: Machine-Learning Approach for Classifying and Categorizing Android Sources and Sinks (2014)
6. Schindler, C., Atas, M., Strametz, T., Feiner, J., Hofer, R.: Privacy leak identification in third-party Android libraries. In: 2022 Seventh International Conference On Mobile And Secure Services (MobiSecServ), pp. 1–6 (2022)
7. Frida: Dynamic Instrumentation Toolkit for Developers, Reverse-Engineers, and Security Researchers (2022). Available: https://frida.re/
8. Cotise, M., Raum: Mitmproxy: An Interactive HTTPS Proxy (2023). Available: https://mitmproxy.org/
9. Chen, S., Zhang, Y., Fan, L., Li, J., Liu, Y.: AUSERA: automated security vulnerability detection for android apps. In: 37th IEEE/ACM International Conference on Automated Software Engineering, pp. 1–5 (2022)
10. Natesan, S., Gupta, M.R., Iyer, L.N., Sharma, D.: Detection of data leaks from android applications. In: 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), pp. 326–332 (2020)
11. Hacking Android. Packt Publishing Ltd, UK (2016)
12. Miessler, D.: The 10 Million Password List (2022). Available: https://github.com/danielmiessler/SecLists
13. Cucchiarelli, A., Morbidoni, C., Spalazzi, L., Baldi, M.: Algorithmically generated malicious domain names detection based on n-grams features. Expert Syst. Appl. 170, 114551 (2021)
14. SPRIDE, E.: DroidBench: Benchmarks (2022). Available: http://sseblog.ec-spride.de/tools/droidbench/
Blockchain in Concurrent Green IoT-Based Agriculture: Discussion, Analysis, and Implementation

Md Janay Alam¹, Ashiquzzaman Choudhury¹, Kazi Sifat Al Maksud¹, Ahmed Wasif Reza¹(B), and Mohammad Shamsul Arefin²

¹ Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
[email protected]
² Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
[email protected]
Abstract. The adoption of IoT-based technology in agriculture has been increasing rapidly in recent times. IoT devices are of great assistance to farmers in monitoring dairy farms, greenhouses, and other agricultural facilities. The most crucial requirement for an IoT device is data security and privacy. Data leakage can have significant negative consequences, as it may result in financial loss for farmers. It is important to prioritize data security to protect the interests of farmers and ensure the success of their agricultural endeavors. Blockchain technology can secure data transactions in the network layer. With private and public key pairs, data hashing, and decentralized systems, important and sensitive data can be secured. In this study, we discuss the structure of a three-layer architecture for green IoT technology and identify the privacy issues present in each layer. We then propose a solution that implements a secure Ethereum-based blockchain system for data transactions at the network layer.

Keywords: Blockchain · Authentication · Data security · Data transaction · Three-layer architecture
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 334–344, 2024. https://doi.org/10.1007/978-3-031-50327-6_35

1 Introduction

The Internet of Things (IoT) is one of the most widespread technologies of the contemporary era, and its use in the agricultural sector is growing. A farmer can monitor humidity levels and temperature using IoT devices. IoT-based technologies enable devices to communicate with each other automatically, without requiring any human intervention [1]. In Bangladesh, the usage of IoT in agriculture has increased significantly. For example, many dairy farms currently use IoT-based devices to monitor dairy cows. The monitoring covers how much the cows eat in a day, their pregnancy status, their milk productivity, etc. In addition, IoT-based systems can provide detailed data on greenhouse conditions. Humidity, temperature, and light are the most crucial factors in a greenhouse system. By using greenhouse techniques, farmers are able to produce crops and other agricultural products during off-seasons as well. By closely monitoring the humidity, temperature, and light conditions in the growing environment, farmers can make data-driven decisions about cultivation practices and promote the production of high-quality, healthy crops. Using IoT devices in our agricultural sector will strengthen our economy, which is heavily dependent on agriculture.

In IoT, security and safety cannot be neglected. Through the use of IoT, a huge volume of data containing sensitive information is uploaded to the cloud. Any leakage of, or unauthorized access to, this sensitive information can be catastrophic. Hence, strict monitoring to secure the data is a foremost necessity. Blockchain is a viable and innovative option for protecting data from other parties. A blockchain is a chain of blocks in which each block contains a cryptographic hash of the previous block, making the chain tamper-resistant. Each block holds its own address, the data inside it, and the address of the previous block. Each block in a blockchain is unique. Tampering with one block changes its hash and breaks the link to every subsequent block, so the modification is detectable. Thus, it is an extremely popular and secure way to transmit data through a medium.
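The hash-linking property described above can be illustrated with a short Python sketch. It is a toy hash chain under stated assumptions (SHA-256 over a JSON encoding of the block fields, an all-zero hash for the first block), not a full blockchain and not the Ethereum implementation. `block_hash`, `append_block`, and `is_valid` are hypothetical helper names.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(index, data, prev_hash):
    """Hash the block fields in a stable (sorted-key) JSON encoding."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a block that stores the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False  # link to the previous block is broken
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False  # block content no longer matches its hash
        prev_hash = block["hash"]
    return True
```

Changing the data of any block makes `is_valid` fail at that block. Recomputing the hash of the modified block moves the failure to the next block, whose `prev_hash` no longer matches.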
2 Literature Review

Ferrag et al. [2] conducted a survey of the threat model and security in green IoT-based agriculture utilizing blockchain technology. They described the architecture of green IoT-based devices, which contains four layers: (i) the agricultural sensors layer, which contains IoT-enabled devices with GPS, (ii) the fog layer, which receives data from the IoT devices with the help of the GPS, (iii) the core layer, which acts as a bridge that transports data between the fog layer and the cloud layer, and (iv) the cloud layer, where the data are stored. The study covered a range of potential hazards, mainly attacks that target the privacy, authentication, confidentiality, availability, and integrity of the system. In their discussion, they reviewed prior efforts and proposed a potential solution for security vulnerabilities in green IoT devices.

Access control management is essential for preventing data from being shared with unwanted users or resources. However, because it is centralized, a traditional access control system is not a good way to ensure safety in IoT devices. Ding et al. [3] addressed this problem with blockchain-based access management. The article discusses three main types of blockchain and recommends a consortium blockchain because it saves time compared with public blockchains. Data can still be retrieved if the blockchain is tampered with, because it is recovered from subsequent nodes. Furthermore, a hash is computed over an address header and a body to provide security. A public and private key pair secures the communication between two hosts, which makes data leakage highly unlikely. A set of code checks also verifies whether the two keys are presented by different hosts. Finally, the scheme reduces storage and computational overhead: the authors state that the access control protocol has been simplified so that both parties perform only basic hash and signature operations, making it an efficient solution for IoT devices with limited processing and energy resources [3]. While managing access control efficiently is a valuable contribution, the public and private key system leaves a potential risk of data leakage. If two hosts obtain the keys, they could tamper with the data unless the authorized panel is vigilant in running the code checks regularly.

Hammi et al. [4] examined the bubbles of trust paradigm, a decentralized authentication mechanism for IoT. The scale and variety of IoT features make it difficult to design an effective decentralized authentication system. To address this limitation, they also provided an actual implementation using Ethereum. In the IoT environment, they created "bubbles of trust", in which each device communicates only with devices in its own zone, making the system reliable. Non-members are not allowed in that zone. They used a public blockchain to make the system accessible to any client. When a contract is created and submitted to the blockchain network, miners validate it. If it is approved, the contract owner receives a public address that can be used by any client. They used the Ethereum blockchain to implement the approach and distinguished three object types: Master, Follower, and Both. Their work demonstrates that the approach meets the security, efficiency, and low-cost requirements of IoT. This system could be useful in green IoT-based agriculture.

Sections 3, 4, and 5 present the three-layer architecture, its issues, and blockchain-based solutions, respectively.
3 Three-Layer Architecture of Green IoT Devices

The architecture of green IoT devices refers to the overall design and structure of an IoT-based system focused on maximizing energy efficiency and minimizing environmental impact [5]. It consists of three main layers (Fig. 1): the sensor layer, the fog layer or network layer, and the cloud layer [6]. The main idea of sensor computing is to direct compute resources toward sources of data, e.g., sensors, actuators, and other devices [7, 8].

Sensor Layer. The sensor layer comprises a network of physical devices or sensors that collect data from the environment and transmit it to the cloud layer for analysis and storage. These sensors can be embedded in various objects, such as appliances, buildings, vehicles, and industrial equipment, to gather data on parameters such as temperature, humidity, air quality, and energy usage. The sensor layer plays a critical role in the architecture of green IoT devices, as it is responsible for collecting and transmitting the data that is used to optimize energy efficiency and reduce environmental impact. It enables the devices in the network to sense and respond to their surroundings, making it possible to automate various tasks and processes.
Fog Layer. The fog layer in the architecture of green IoT devices is a network of intermediate devices between the sensor layer and the cloud layer. It is also known as the "edge" or "mist" layer [9]. Its main purpose is to perform data processing, analytics, and decision-making tasks close to the sensors rather than in the cloud. This reduces the amount of data that needs to be transmitted to the cloud, which lowers the load on the network and improves the response time of the IoT system. The fog layer is especially useful for applications that require low latency or high reliability, such as real-time monitoring and control of industrial systems, because it enables the devices in the network to operate independently of the cloud.

Cloud Layer. The cloud layer in the architecture of green IoT devices is the central hub or server where data from the devices in the network is collected and analyzed [9]. It is the topmost layer of the architecture and provides cloud computing resources, storage, and analytics capabilities to IoT devices. The cloud layer also plays a critical role in enabling remote management and control of IoT devices and in providing a platform for developing and deploying IoT applications. It acts as a bridge between physical devices and end users, enabling users to interact with the devices and access the data and services they provide.
Fig. 1. Architecture for three-layer fog computing
Table 1 describes some of the key functions of the sensor layer, fog layer, and cloud layer in the architecture of green IoT devices. Overall, the architecture of green IoT devices [9] enables the devices in the network to sense and respond to their surroundings, making it possible to automate various tasks and processes in an energy-efficient and environmentally friendly way.
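The data-reduction role of the fog layer (filtering and aggregating sensor readings before anything is sent to the cloud) can be sketched in Python as follows. The valid range, the summary fields, and the function name `aggregate_readings` are assumptions chosen for illustration. The paper does not prescribe a specific aggregation scheme.

```python
from statistics import mean


def aggregate_readings(readings, low, high):
    """Drop out-of-range values and summarize the remaining readings."""
    valid = [value for value in readings if low <= value <= high]
    if not valid:
        return None  # nothing plausible to forward to the cloud
    return {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": mean(valid),
    }


# Four raw temperature readings, one of them a faulty sensor value.
summary = aggregate_readings([20.0, 22.0, -999.0, 24.0], low=-40.0, high=60.0)
```

Forwarding one summary instead of every raw reading is one simple way a fog node can reduce the traffic that reaches the cloud layer.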
Table 1. Key functions of the three layers

Sensor layer | Fog layer | Cloud layer
Data collection: The sensor layer collects data from the environment using various sensors, such as temperature sensors, humidity sensors, and air quality sensors | Data processing and analytics: The fog layer performs data processing and analytics tasks closer to the sensors, reducing the amount of data that needs to be transmitted to the cloud | Data collection and storage: The cloud layer receives and stores data from the devices in the network, making it available for analysis and visualization
Data transmission: The sensor layer transmits the collected data to the cloud layer for analysis and storage using wireless communication technologies, such as Wi-Fi, Bluetooth, and LoRaWAN | Decision-making: The fog layer can make local decisions based on the data it receives from the sensors, allowing the devices in the network to operate independently of the cloud | Data analysis and processing: The cloud layer processes and analyzes the data from the devices to extract insights and identify patterns and trends
Energy efficiency: The sensor layer is designed to be energy efficient, using low-power sensors and communication technologies to minimize energy consumption | Data management: The fog layer manages the data it receives from the sensors, including tasks such as data filtering, aggregation, and compression | Application development and deployment: The cloud layer provides a platform for developing and deploying IoT-based applications, enabling the devices to perform various tasks and functions
Scalability: The sensor layer can accommodate many sensors and devices, allowing the IoT-based network to scale up as needed | Communication: The fog layer acts as a gateway between the sensor layer and the cloud, enabling the devices in the network to communicate with each other and the cloud | Remote management and control: The cloud layer enables remote management and control of the devices in the network, allowing users to monitor and control them from a distance
Interoperability: The sensor layer is designed to be interoperable, allowing different sensors and devices to communicate and work together seamlessly | Scalability: The fog layer can accommodate many devices, allowing the IoT-based network to scale up as needed | Scalability and flexibility: The cloud layer provides the necessary scalability and flexibility to accommodate many devices and handle a large volume of data

4 Security and Privacy Issues in Every Layer

Sensor Layer. To ensure a safe and dependable connection, the sensor layer needs to meet certain requirements that establish trustworthiness and sufficient security. These requirements protect the layer from attacks. Establishing and maintaining security measures is vital, because a disorganized sensor network leaves its nodes vulnerable to attacks from outside. Once inside the domain, attackers can launch a variety of attacks that could leave the whole system broken. Several limitations hinder setting up a complete and sufficient security system, and they need to be addressed before security measures are put in place.

Node Limitations. The nodes are heterogeneous and sometimes unique in nature. As a result, it can be quite difficult to set them up in a network of sensors, and the result could be catastrophic.
Network and Physical Limitations. Mobile ad hoc networks have limitations such as low power, memory, and processing abilities. Additionally, deploying sensors in environments with unreliable wireless connectivity can pose significant challenges. Moreover, a sensor network set up in a hostile environment may suffer vandalism and damage. Tamper-resistant materials with strong security features can be expensive and drive costs beyond initial estimates. Strict security measures are crucial for preventing unauthorized access to the network, securing both the sensors and the network itself, and ensuring seamless communication between nodes. The potential security breaches that can occur in sensor networks are listed below.

Information Gathering. In a sensor network, adversaries may attempt to extract information from the sensors, resulting in data leakage. With enough resources and materials, information leakage is always possible if the information is not encrypted [10].

Hello Node Attacks. In this attack, the attacker chooses a node and uses it to transmit malicious code into the network. The receiving node may then accept this code and integrate it into the system [11]. Sensitive information is sent to the attacker, and the whole system is left unprotected. Figure 2 shows how an attacker may attack the sensor layer using a hello node attack.
Fig. 2. Hello node attacks
Malfunctioning Nodes. Malfunctioning sensors generate false or inaccurate data that can result in disasters. The integrity of the network system will be questioned if a malfunctioning node is present in the network. Fake routing/sinkhole attack. A fake node is created by the attacker to lure the other nodes. All the connecting nodes try to find the shortest path in the nodes and the attacker node claims to have a short path that generates all the subsiding nodes toward it. With the trap set by the attacker, important and risky information could get trampled easily. Fog Layer. The fog layer comes second in the hierarchy between the three layers of the network system, and it is considered a medium or a bridge between the sensor and cloud layer. Each part of the fog layer is prone to vandalism and attacks from the outside. The issues and security measures that might occur are listed below: Object layer.
1. Device/node tampering: The fog layer is a complex layer comprising many devices that can be viewed as nodes. The object layer is vulnerable to attacks that cause a device to malfunction, which leaves it open to tampering by attackers. Such tampering can compromise the security of the system, reveal sensitive data, and weaken the overall structure of the system.
2. Malicious/fake nodes: Attackers sometimes create fake nodes to intrude into the system and gain access to its full details [12]. Similarly, malicious nodes feed other nodes false and inaccurate data. As a result, the authenticity of the whole system is called into question.
3. Service denial/node outage: An attacker who gets into a system can flood it with false packets, which may exhaust batteries and bring down the entire network. Most of the connections between nodes can be cut in the same way, leaving the network short of links.
Middleware. Middleware is concerned with the security of storage and data transmission. Some possible attacks are listed below.
1. Selective forwarding: Malicious nodes block some of the data packets, and sometimes nodes skip routing certain data packets.
2. Sybil attack: Some nodes take on multiple identities and in turn try to perform multiple tasks. This reduces the efficiency of the nodes, that is, of the devices.
3. Black-hole attack: Just as a black hole pulls in nearby objects, a black-hole attack relies on an attacker node that attracts all nearby nodes. Such an attack could result in a large information leak.
Fog Server. The fog server is the front end of the fog layer, and the security and privacy requirements differ from platform to platform.
For a green IoT-based system, the possible security breaches caused by malicious attacks are listed below.
1. Extraction: Attackers extract passwords, OTPs, emails, and similar data from files by sniffing. Many protocols and systems are vulnerable to attacks of this kind.
2. Phishing attacks: The email addresses of the main authorities are seized and used to damage a particular system.
3. Injected data: Injected code is sent to the system server, which results in data loss and data damage.
4. Session takeover: The attacker hijacks a user's session and gains access to personal information by exploiting weaknesses in the system's authentication.
5. Application vulnerabilities: Errors left in a system during development can be exploited by attackers to breach and harm the system.
Cloud Layer. The cloud layer is the main component for storing permanent data, using servers and data centers. With the help of servers, data can be collected from any platform, medium, or site, anytime and anywhere. Providers must ensure that any user with an internet connection can view their own data stored in the data center or on the servers [13]. Providers also ought to take additional security measures to protect the sensitive data stored on the servers. Some of the potential attacks that can breach security measures on the cloud layer are listed below.
Traffic Hijacking. This type of attack redirects a user to other sites that are harmful and dangerous. These sites may collect confidential information without the user's knowledge, and this is one of the primary security threats in the cloud layer [13].
Malware Injection in Cloud Service. Attackers often build malicious service implementations in various forms (depending on the attacker's goal) to trick the cloud system into believing that they are an existing part of the system. Once a user enters the system, the cloud misdirects the user to the malicious implementation [13]. Confidential information is thereby fully exposed.
Flooding. Because hardware can be expensive, cloud servers are often the most economical solution. The drawback is that intruders can send a huge load of pointless requests, and the server expends a heavy workload processing them. This extra workload slows the server down and can result in denial of service for users [14].
DoS (Denial of Service). Denial of service occurs when servers spend a large share of their computational power dealing with an attacker. The attacker sends requests to one specific node, or sometimes to several entry points, to flood the system. As a result, the extra workload is expended on the attack and the server is no longer available to serve user requests. This failure to serve legitimate users is the denial of service. Since the servers' computation is not capped, users' bills may rise far beyond what they expected, resulting in a large loss for the user.
5 Blockchain-Based Solutions
All three layers can be secured by applying blockchain. In the sensor layer, IoT devices can transmit data by adding it as a block to the blockchain network. The fog layer validates each block's authorization, and blocks found to be valid are added to the network. The fog layer can validate a block by matching the previous block number. In the fog layer, machine learning models can also benefit farmers and consumers. The Support Vector Machine (SVM) is a widely used machine learning classifier that can forecast information according to the needs of an application, and such predictions can be beneficial for a farmer. Shen et al. [15] proposed a variant of the Support Vector Machine (SecureSVM) that can learn from blockchain-encrypted Internet of Things (IoT) data. The data gathered by IoT devices in the sensor layer can be used to train this SecureSVM model. The rest of this section discusses the blockchain-based authentication system and secure data transactions.
Authentication. Authentication is one of the most important parts of security. If it is vulnerable, attackers can easily manipulate and damage the data, so every application needs a secure authorization system. This study analyzes a possible authentication system for an application that controls IoT devices.
Patel et al. [16] investigated blockchain technology and its various applications, and proposed an alternative method for authenticating users in blockchain-based applications. Their service, called 'DAuth', uses MetaMask for authentication and implements the OAuth 2.0 concept (Fig. 3). OAuth 2.0 lets users use one account to authenticate to third-party applications, and it authorizes users by means of an access token.
Fig. 3. Authentication system using OAuth 2.0
If an application uses the OAuth 2.0 service together with MetaMask, it can be secured. Users can authenticate themselves through the OAuth 2.0 service and monitor the data of the IoT devices. Only the MetaMask web browser extension needs to be installed. After adding an account to MetaMask, users can authorize themselves to the secure decentralized application. The user's blockchain network address serves as the username, and the private key is used for any sensitive interaction. Figure 3 visualizes the authentication system.
Data Security on the Network Layer. Securing data in transit is one of the most important security concerns for any application or system. IoT data must remain secure and unaltered in the network, and no one other than the owner should be able to see the actual data. The data therefore has to be secured in the transport network layer. For that, the application uses decentralized data transactions over an Ethereum-based network. To simulate the process, we used the Ganache software to create a local network on the localhost server, which provides 10 blockchain accounts to work with. The smart contract was written in the Solidity programming language and contains a mapping of the IoT data list. For our experiment, we assume that the IoT device is placed in a greenhouse and sends CO, humidity, temperature, and similar data, which the smart contract stores. This study used a Node.js program (for the experiment) to send transactions from the IoT devices to the network. Before a transaction is sent to the network, it is signed with a private key. Every transaction goes through the Ethereum blockchain network with hashed data. Figure 4 is the visual representation of the system.
Fig. 4. Appending blocks into the blockchain network
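The append-and-validate flow described in Section 5 (a device submits a reading as a block, and the fog layer accepts it only if it points at the previous block) can be illustrated with a minimal sketch. This is not the authors' Solidity contract or Node.js client. The names `Block`, `is_valid_next`, `append_reading`, and the device identifiers are hypothetical, and SHA-256 from the Python standard library stands in for Ethereum's Keccak-256 hashing.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


@dataclass
class Block:
    index: int       # position in the chain (the "block number")
    prev_hash: str   # digest of the preceding block
    device: str      # hypothetical device identifier
    reading: dict    # e.g. {"co": 0.4, "humidity": 61, "temperature": 24.5}

    def digest(self) -> str:
        # Canonical JSON (sorted keys) so the same content always hashes the same way.
        payload = json.dumps(
            {"index": self.index, "prev_hash": self.prev_hash,
             "device": self.device, "reading": self.reading},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_valid_next(chain: list, block: Block) -> bool:
    """Fog-layer style check: the new block must extend the current chain tip."""
    expected_prev = chain[-1].digest() if chain else GENESIS_HASH
    return block.index == len(chain) and block.prev_hash == expected_prev


def append_reading(chain: list, device: str, reading: dict) -> Block:
    """Build a block for a sensor reading and append it if it validates."""
    prev = chain[-1].digest() if chain else GENESIS_HASH
    block = Block(len(chain), prev, device, reading)
    if not is_valid_next(chain, block):
        raise ValueError("block does not extend the chain")
    chain.append(block)
    return block


chain = []
append_reading(chain, "greenhouse-1", {"co": 0.4, "humidity": 61, "temperature": 24.5})
append_reading(chain, "greenhouse-2", {"co": 0.5, "humidity": 58, "temperature": 25.1})

# A block that does not reference the current tip is rejected.
stale = Block(index=1, prev_hash=GENESIS_HASH, device="intruder", reading={"co": 9.9})
print(is_valid_next(chain, stale))  # False
```

In a real deployment the validation and storage would be carried out by the blockchain network and the smart contract rather than by a local list; the sketch only shows why a block that does not match its predecessor cannot be appended.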
6 Results and Findings
Ten accounts were available to simulate the data transaction process on the local server, each with its own public and private keys. We assigned four accounts to four programs that acted as Internet of Things (IoT) devices, while the remaining accounts served as users. The four devices sent data through the blockchain network, and the data for the four transactions were hashed before being transmitted. We used a separate account to retrieve the data, simulating a farmer who wants to view greenhouse environmental data from the IoT devices. Random accounts cannot access the data, because a user's account must be authorized to do so, and the system verifies the user account before returning any information. Because the data are hashed, any attempt to change them fails verification. In addition, because each transaction is linked to the next, the system as a whole becomes tamper-proof.
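The two properties reported above (only authorized accounts can read, and altered data no longer matches its hash) can be sketched as follows. This is an illustrative stand-in for the smart contract's behavior, not the contract itself. `ReadingStore`, the account strings, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
import json


def digest(reading: dict) -> str:
    """Hash a reading in a canonical form (SHA-256 used here for illustration)."""
    return hashlib.sha256(json.dumps(reading, sort_keys=True).encode("utf-8")).hexdigest()


class ReadingStore:
    """Hypothetical store: keeps each reading with the hash computed at submission."""

    def __init__(self, authorized):
        self.authorized = set(authorized)  # accounts allowed to read
        self.records = []                  # list of (reading, stored_hash) pairs

    def submit(self, reading: dict) -> None:
        self.records.append((dict(reading), digest(reading)))

    def read_all(self, account: str) -> list:
        # Verify the account before returning any information.
        if account not in self.authorized:
            raise PermissionError(f"account {account} is not authorized")
        # Verify that every reading still matches the hash stored with it.
        for reading, stored_hash in self.records:
            if digest(reading) != stored_hash:
                raise ValueError("stored reading does not match its hash")
        return [reading for reading, _ in self.records]


store = ReadingStore(authorized={"0xFARMER"})
store.submit({"co": 0.4, "humidity": 61, "temperature": 24.5})

print(store.read_all("0xFARMER"))      # the authorized farmer account sees the data

try:
    store.read_all("0xRANDOM")         # an unauthorized account is refused
except PermissionError as err:
    print(err)

store.records[0][0]["temperature"] = 99.0   # simulate tampering with stored data
try:
    store.read_all("0xFARMER")
except ValueError as err:
    print(err)                         # the mismatch with the stored hash is detected
```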
7 Conclusion
Green IoT-based agriculture has the potential to revolutionize the way we approach farming and food production. By combining the strengths of IoT and blockchain technologies, it is possible to create more sustainable, secure, and efficient agricultural systems that can better withstand the challenges of a rapidly changing world. However, it is important to be aware of the security issues that can arise at each layer of the system, from the sensor layer all the way up to the cloud layer. Carrying out data transactions over a blockchain network makes it possible to create a secure and trustworthy system that can help ensure the success of green IoT-based agriculture. Both farmers and consumers stand to gain a great deal from these technologies, and this is an active area of research that will continue to evolve in the coming years.
References
1. Akkaş, M.A., Sokullu, R.: An IoT-based greenhouse monitoring system with Micaz motes. Procedia Comput. Sci., pp. 603–608 (2017). https://doi.org/10.1016/J.PROCS.2017.08.300
2. Ferrag, M.A., Shu, L., Yang, X., Derhab, A., Maglaras, L.: Security and privacy for green IoT-based agriculture: review, blockchain solutions, and challenges. IEEE Access, pp. 32031–32053 (2020). https://doi.org/10.1109/ACCESS.2020.2973178
3. Ding, S., Cao, J., Li, C., Fan, K., Li, H.: A novel attribute-based access control scheme using blockchain for IoT. IEEE Access, pp. 38431–38441 (2019). https://doi.org/10.1109/ACCESS.2019.2905846
4. Hammi, M.T., Hammi, B., Bellot, P., Serhrouchni, A.: Bubbles of trust: a decentralized blockchain-based authentication system for IoT. Comput. Secur. 15 (2018). https://doi.org/10.1016/j.cose.2018.03.011
5. Bader, A., Ghazzai, H., Kadri, A., Alouini, M.S.: Front-end intelligence for large-scale application-oriented internet-of-things. IEEE Access, pp. 3257–3272 (2016). https://doi.org/10.1109/ACCESS.2016.2580623
6. Sarkar, S., Chatterjee, S., Misra, S.: Assessment of the suitability of fog computing in the context of internet of things. IEEE Trans. Cloud Comput., pp. 46–59 (2018). https://doi.org/10.1109/TCC.2015.2485206
7. Varghese, B., Wang, N., Barbhuiya, S., Kilpatrick, P., Nikolopoulos, D.S.: Challenges and opportunities in edge computing. In: IEEE International Conference on Smart Cloud (SmartCloud), pp. 20–26 (2016). https://doi.org/10.1109/SmartCloud.2016.18
8. Shi, W., Cao, J., Zhang, Q., Li, Y., Xu, L.: Edge computing: vision and challenges. IEEE Internet of Things J., pp. 637–646 (2016). https://doi.org/10.1109/JIOT.2016.2579198
9. Mukherjee, M., Matam, R., Shu, L., Maglaras, L., Ferrag, M.A., Choudhury, N., Kumar, V.: Security and privacy in fog computing: challenges. IEEE Access, pp. 19293–19304 (2017). https://doi.org/10.1109/JIOT.2016.2579198
10. Kumar, V., Jain, A., Barwal, P.N.: Wireless sensor networks: security issues, challenges and solutions. Int. J. Inf. Comput. Technol. 4, 859–868 (2014)
11. Pathan, A.S.K., Lee, H.-W., Hong, C.S.: Security in wireless sensor networks: issues and challenges. In: International Conference Advanced Communication Technology, vol. 2, IEEE, Phoenix Park (2006). https://doi.org/10.1109/ICACT.2006.206151
12. Puthal, D., Mohanty, S.P., Bhavake, S.A., Morgan, G., Ranjan, R.: Fog computing security challenges and future directions [energy and security]. IEEE Cons. Electron. Mag. 8, 92–96 (2019). https://doi.org/10.1109/MCE.2019.2893674
13. Sen, J.: Security and privacy issues in cloud computing. In: Architectures and Protocols for Secure Information Technology Infrastructures, pp. 1–45 (2013). https://doi.org/10.4018/978-1-4666-4514-1.CH001
14. Schwenk, J., Gruschka, N., lo Iacono, L., Jensen, M.: On technical security issues in cloud computing (2009). https://doi.org/10.1109/CLOUD.2009.60
15. Shen, M., Tang, X., Zhu, L., Du, X., Guizani, M.: Privacy-preserving support vector machine training over blockchain-based encrypted IoT data in smart cities. IEEE Internet Things J. 6, 7702–7712 (2019). https://doi.org/10.1109/JIOT.2019.2901840
16. Patel, S., Sahoo, A., Mohanta, B.K., Panda, S.S., Jena, D.: DAuth: a decentralized web authentication system using ethereum based blockchain. In: Proceedings of the International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN) (2019). https://doi.org/10.1109/VITECON.2019.8899393
Automatic Document Summarization of Unilingual Documents: A Review
Sabiha Anan1, Nazneen Islam1, Mohammed Nadir Bin Ali2, Touhid Bhuiyan2, Md. Hasan Imam Bijoy2, Ahmed Wasif Reza3, and Mohammad Shamsul Arefin1,2(B)
1 Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
{sabiha.anan,sarefin}@cuet.ac.bd, [email protected]
2 Department of Computer Science and Engineering, Daffodil International University, Birulia, Bangladesh
[email protected], [email protected]
3 Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
[email protected]
Abstract. Automatic document summarization is the creation of a short and accurate summary of a source document. Given the abundance of information on the many platforms of our social life, condensing that information is greatly needed. Document summarization automatically provides a short, precise, and informative abstract of a large document. In this era of information, as the need for document summarization keeps increasing, the techniques for summarizing documents have also evolved over the years. This paper presents a review of the summarization of unilingual documents. Some prominent approaches to document summarization are described in detail, and their performance is analyzed.
Keywords: Documents summarization · Semantic graphs · Extractive techniques
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 345–358, 2024. https://doi.org/10.1007/978-3-031-50327-6_36
1 Introduction
Owing to the rapid expansion of data both on the web and offline, the volume of literature has grown considerably in recent years. Finding the pertinent information in such a large volume of text documents is often a laborious task, so automatic summarization is needed to extract the useful and important information. Summarization shortens texts, documents, and literature while preserving the main content and extracting the pertinent information. Automatic summarization is becoming essential in many aspects of day-to-day life, such as media monitoring, newsletters, search marketing, social media marketing, and legal contract analysis. Automatic document summarization has therefore gained the attention of many researchers. The foremost aim of summarizing documents is to produce a shortened outline of a particular record, or of a group of records on a particular topic. Whenever a user tries to retrieve information about a particular topic, he or she may get many documents related to that topic, not all of which are useful. These retrieved documents may also contain redundant information. Summarization can help users considerably in this regard: it gives them a short and informative summary of the desired topic, saving the time and effort that manual summarization would require.
While summarizing a document, the important aspects that should be considered are:
• Summaries need to be short
• Summaries need to have less redundant information
• Summaries should conserve important information of the source documents
Two well-accepted methods are used to summarize a document: abstractive summarization and extractive summarization. Abstractive summarization creates a new, brief text, which involves understanding the given document and compressing its information to deliver the primary content of the document. It is a highly challenging task in NLP (Natural Language Processing). Extractive summarization, on the other hand, selects key words and/or sentences from the source document and merges them to construct the desired summary. The generated summary should preserve the context of the source text.
Here, we offer a survey of the research on automatic summarization of documents written in a single language, known as unilingual document summarization. We considered 37 papers in total. Of these, 19 papers address unilingual single-document summarization and 18 address unilingual multi-document summarization.
2 Distribution of Papers
The distribution of the selected papers over the years is shown in Table 1.
Table 1. Overview of the unilingual document summarization related research
Period | Unilingual single document summarization | Unilingual multi-document summarization
Before and at 2002, 2003–2006, 2007–2009 | DBRRDPS [1], ULCTS [2], LCDSDS [3], ATSUFTFRM [4], LSDSGFDS [5], SQSDS [6], ESDSCRTS [7], ESDSRI [8], SDSDE [9], AGDSBNMF [10] | MSBSE [20], CBACMS [21], iNeATS [22], CSMD [24], MRBTFMDS [25], MSSSASMF [26]
2010–2012 | ENKSDSKE [11], FTSSBSDTSA [12], DSBDR [13], CSSAESDS [14] | IPKTRAMDS [27], DSGMMDS [28], IDCMS [29], QMDSUDL [30]
2013–2015 | ASDTSUKCD [15], ESSBGOGLS [16], CDSSO [17] | MDSBCRUVT [31], MSAACGJM [32], JMFMRTFMDS [33], StarSum [34]
2016–2017 | ADSGANNM [18] | IMDSTC [35]
After 2017 | ESDSMO [19] | EMSUMABCOA [36], MeanSum, AMDSBSLN [38]
3 Detailed Review
3.1 Unilingual Single Document Summarization
Single-document summarization techniques produce a summary from a single document written in one language. A number of approaches have been put forward over the years; some of them are discussed here.
In [1], summarization is done using Maximal Marginal Relevance (MMR) among the sentences in the document. First, the documents are ordered by producing a ranked list relevant to the user's query. The main criterion is relevant novelty. A first approach to measuring relevant novelty is to measure relevance and novelty independently and combine them linearly into a metric known as "marginal relevance." A document has high marginal relevance if it is both relevant to the query and bears little resemblance to the documents that were previously selected. For summarization, this marginal relevance needs to be maximized, which is why the method is labeled "Maximal Marginal Relevance." In document summarization, anti-redundancy is an important factor, because redundancy defeats the purpose of summarization. To create single-document summaries, the document is segmented into passages, which are reranked using MMR with a cosine similarity metric against a user- or system-generated query. The system scored 70% accuracy on informative abstracts, corresponding to an F-score of 0.73. MMR-based summarization works particularly well for longer documents, and it is also useful for extracting passages from several documents on the same topic.
The work in [2] addresses the challenge of creating a summary of any text without requiring a complete understanding of it, relying instead on widely available knowledge sources. To do so, it uses lexical chains derived from the text. To build a chain, a new word is taken from the text and relatedness criteria are used to identify an existing chain that it fits. The factors that contribute to the strength of a chain, such as repetition, density, and length, are also considered. Lexical chains are computed by combining several reliable knowledge sources, including the WordNet thesaurus, a part-of-speech tagger, a shallow parser for identifying nominal groups, and a segmentation technique. Summarizing a document involves four steps: segmenting the original text, creating lexical chains, identifying strong chains, and then extracting significant sentences from the text. Strong chains are sought because they provide a better estimate of the central topic of the content than simply selecting the words that appear most frequently in the document. A number of disadvantages to the technique have been noted.
Long sentences, for example, are more likely to be chosen, and the extracted sentences may contain anaphoric links to the rest of the text. In addition, this method does not allow users to control the length and level of detail of the summary.
The study in [3] proposed a method for reducing readability deterioration, coherence degradation, and topical under-representation by examining lexicalized elements in the original text and thereby generating more coherent summaries. Lexical repetition is one distinctive feature used in building the models. The authors created a system in which a lexical repetition-based approach to discourse segmentation, which can identify topic changes, is combined with a linguistically aware summarizer that makes use of the notion of salience and a dynamically adjustable summary size. Cohesion indicators in the text are examined so that segmentation can find the points where the sub-stories change. The summarizer then uses the resulting set of discourse segments to produce more informative and comprehensive summaries. Under certain conditions, which can be described as a function of the length of the original document and the source-to-summary ratio, segmentation-based summarization is comparable to the state-of-the-art baseline technology. In addition, when salience calculation based on a background collection is not possible, the segmentation-based approach delivers realistic summaries of similar quality that are appreciably cheaper to generate. The method works best for small to medium-sized documents.
Paper [4] explains an extractive technique for document summarization. It consists of identifying relevant sentences in the source and combining them into a brief summary. This work combines statistical and linguistic methods to improve the quality of the summary produced. The framework identifies some feature terms of the sentences and then calculates their ranks using feature scores. After that, it uses statistical and linguistic methods to identify semantically important sentences for summary creation. The performance of the method is evaluated against manual summaries generated by human evaluators.
Leskovec et al. [5] outline a method for document summarization that builds a semantic graph of the original content and identifies its sub-structure in order to extract sentences. The process starts with a thorough syntactic analysis of the text. It collects logical triplets (subject, predicate, object) from each sentence. The triplet sets are then refined and combined into a semantic graph using co-reference resolution, semantic normalization, and cross-sentence pronoun resolution. This procedure is applied to both the original document and the extracted summary. A support vector machine (SVM) is trained on the logical triplets to learn which triplets belong to sentences in the document summaries. This classifier is then used to generate summaries of test documents automatically. The technique was tested on DUC 2002 data, and the F1 measure of the extracted summaries improved by a statistically significant margin. The SVM summarization model performs better because it is trained with a rich set of linguistic attributes obtained through sophisticated linguistic analyses. In comparison to summaries written by humans, this approach achieved an average precision of 30% and a recall of 70%. However, the method works best on smaller datasets and on topic-specific documents.
Varadarajan and Hristidis [6] describe a query-specific document summarization technique. Their method produces query-specific summaries by selecting the fragments that are most pertinent to the query and combining them with the help of the document's semantic associations.
For this, the document is treated as a set of interconnected text fragments. The authors focused on keyword queries owing to their power and ease of use. The approach comprises the following main steps. At the preprocessing step, a structure is added to every document, which is subsequently viewed as a labeled, weighted graph called the document graph. At query time, a keyword proximity search is performed on the document graphs for a set of keywords to determine how the keywords are associated in them. For each document, the minimal spanning tree on the associated document graph that includes all the keywords serves as its summary. The document is processed and split into text fragments before the graph is built. Each of these fragments becomes a node, and a weighted edge is added between two nodes that are semantically connected. The weight of the edge indicates the degree of the relationship. The method requires less processing time and offers better performance than other relevant state-of-the-art techniques.
The paper in [7] introduces NetSum, a novel neural-network-based strategy for automated document summarization. Each sentence is given a set of features that help determine how significant it is in the text. The authors applied novel features derived from news search query logs and Wikipedia entities. A pair-based sentence ranker is trained using the RankNet algorithm; it scores individual sentences and identifies the crucial ones. RankNet is a neural network approach that ranks inputs by comparing them in pairs. In this instance, the inputs are the sentences of a specific document, and pairs are formed between sentences of the same document. For more than 70% of CNN.com documents, the system outperforms the standard baseline on the ROUGE-1 metric.
To improve NetSum's performance, extracting content across sentence boundaries needs to be considered. Such work requires a system that produces abstractive summaries.
The study in [8] describes a text summarization technique that uses semantic similarity among sentences to reduce repetition in the summary. The sentences are mapped to a semantic space using random indexing in order to compute semantic similarity scores. Random indexing is a computationally efficient way of reducing dimensionality implicitly, and it requires only low-cost vector computations. It therefore offers a practical technique for calculating word, sentence, and text similarity. Graph-based ranking methods are then used to generate a summary of the source text. The technique outperforms commercially available summarizers such as Copernic and the Word summarizer. Nonetheless, considerable abruptness has been observed in the summaries produced by this method.
The work in [9] presents a document expansion technique that provides additional knowledge to aid single-document summarization. With this technique, a given document is expanded to a small document set by adding a few neighbor documents that are close to it. A graph-ranking-based algorithm is then applied to the expanded document set to extract sentences from the single document. The algorithm uses both the cross-document relationships between sentences of all the documents in the set and the within-document relationships between sentences of the specified document. Cross-document links between sentences in the expanded document set have been shown to be very important for single-document summarization. The method proved effective when summarizing multiple documents on the same topic separately. Nevertheless, the proposed approach has higher computational complexity than the baseline approach for single-document summarization.
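The graph-based ranking idea used in [8] and [9] can be illustrated with a minimal sketch: sentences are nodes, edge weights are pairwise similarities, and a PageRank-style iteration assigns each sentence a score. This is a generic illustration, not the algorithm of either paper. It assumes plain bag-of-words cosine similarity instead of random indexing or document expansion, and the function names, the damping factor of 0.85, and the example sentences are choices made for the example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; no stemming or stop-word removal in this sketch."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """PageRank-style scores over a sentence similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(tokenize(s)) for s in sentences]
    weights = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to the edge weight.
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


sentences = [
    "The greenhouse sensors report temperature and humidity.",
    "Humidity and temperature readings are stored by the greenhouse system.",
    "The farmer views temperature readings from the sensors.",
    "Bananas taste sweet.",
]
scores = rank_sentences(sentences)
summary = summarize(sentences, k=2)
print(summary)
```

The last example sentence shares no words with the others, so it receives no score from its neighbors and ends up with the lowest rank. A similarity measure that captures meaning rather than exact word overlap, as in [8], would change which sentences count as related.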
Several unsupervised approaches use Latent Semantic Analysis (LSA) for sentence selection. However, because singular vectors are employed for phrase selection, the results obtained are less meaningful. This is due to the fact that the single vector components might have values less than zero. As a result, based on Non-negative Matrix Factorization [10], provides a novel unassisted generic document summarizing approach (NMF). The algorithm operates like this: dissecting a document into individual sentences, executing stopword elimination and word- stemming procedures, generating a terms-bysentence matrix, NMF on the terms-by-sentence matrix, calculating generic relevance for each phrase, and picking k phrases with the greatest relevance values. Against LSArelated algorithms, NMF chooses sentences that are more meaningful and is better at understanding document structure. As a result, it enables superior representation of document subtopics. To enhance the quality of summarization [11], offers a technique based on keywords extraction and the closest similar documents. It is assumed that neighboring texts will contribute for giving more information and explanations. The framework is separated into two steps: neighborhood development and keyword extraction based on neighborhood knowledge. The graph-based ranking algorithm is then used to the small set of documents created by expanding the given source document to include a few nearby neighbor documents. This enables the algorithm to make use of both local features in the source text and global information in neighboring texts. Two types of sentence-to-sentence relations are employed for document summarization: intra-document relationships and
cross-document relationships between sentences in different texts. Two types of word-to-word relations are employed for keyphrase extraction: word co-occurrence patterns in the source document and word co-occurrence patterns in the neighbor documents. Although the method is effective, it demands more processing time and space.
Nagwani and Verma [12] proposed a single-document summarization algorithm based on frequent terms and semantic similarity. The technique is split into three main parts: an input text document, a summarizer algorithm, and a summarized text document as output. A corpus of 183 documents from the Computation and Language collection of the TIPSTER SUMMAC project was used. The reported results are favorable, and the original meaning of the document is preserved in the summary.
He et al. [13] propose a paradigm called Document Summarization Based on Data Reconstruction (DSDR). The process produces the summary that can most accurately reconstruct the original document. Two objective functions have been established to model this reconstruction relationship: (1) linear reconstruction and (2) nonnegative linear reconstruction. An efficient algorithm for solving the associated optimization problem is developed for each objective function. DSDR (with both reconstruction types) can outperform other state-of-the-art summarization methods. DSDR with linear reconstruction is more efficient, whereas DSDR with nonnegative reconstruction performs better in terms of producing fewer redundant words.
The paper [14] introduces a system designed to capture both semantic and syntactic qualities of a document's text. The primary focus is the sentence scoring algorithm, which has a significant impact on the process of extracting sentences from a document in order to create a summary.
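The NMF-based selection steps listed for [10] (stop-word removal, term-by-sentence matrix, factorization, relevance scoring, top-k selection) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published implementation: stemming is omitted, the stop-word list is a toy one, the factorization uses the standard multiplicative updates for the squared-error objective, and the relevance score (each semantic feature weighted by its share of the total mass in H) is an assumed reading of "generic relevance".

```python
import random
import re

STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}  # toy list


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences, entries are raw term counts."""
    tokenized = [[w for w in re.findall(r"[a-z]+", s.lower()) if w not in STOPWORDS]
                 for s in sentences]
    vocab = sorted({w for sent in tokenized for w in sent})
    row = {w: i for i, w in enumerate(vocab)}
    matrix = [[0.0] * len(sentences) for _ in vocab]
    for j, sent in enumerate(tokenized):
        for w in sent:
            matrix[row[w]][j] += 1.0
    return matrix


def matmul(x, y):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*y)] for r in x]


def transpose(x):
    return [list(c) for c in zip(*x)]


def nmf(a, rank, iters=200, seed=0, eps=1e-9):
    """Factor a (terms x sentences) into w (terms x rank) and h (rank x sentences)."""
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(m)]
    return w, h


def generic_relevance(h):
    """Score sentence j as sum_i h[i][j] * weight_i, weight_i = share of row i in h."""
    total = sum(sum(row) for row in h) or 1.0
    weights = [sum(row) / total for row in h]
    return [sum(h[i][j] * weights[i] for i in range(len(h))) for j in range(len(h[0]))]


def summarize(sentences, k=2, rank=2):
    """Return the k highest-scoring sentences in their original order."""
    _, h = nmf(term_sentence_matrix(sentences), rank)
    scores = generic_relevance(h)
    top = sorted(range(len(sentences)), key=lambda j: scores[j], reverse=True)[:k]
    return [sentences[j] for j in sorted(top)]
```

Because every entry of `h` stays non-negative under the multiplicative updates, each sentence's score is a sum of non-negative contributions, which is the property that distinguishes this selection from one based on signed singular vectors.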
The evaluation in [14] used two distinct datasets: (1) cognitive experiment data consisting of scientific magazine articles and related human-generated summaries, and (2) the DUC 2002 newspaper article set (supported by the National Institute of Standards and Technology, NIST). ROUGE unigram evaluations were performed.
Sarkar [15] introduced automatic single-document summarization using key concepts in documents. The document is preprocessed, keyphrases are extracted, and finally the summary is generated. The technique used the DUC 2001 dataset for training and was tested on the DUC 2002 dataset. ROUGE 1.5.5 was used to evaluate performance. The suggested method performs better than the DUC baseline.
The study [16] introduces MA-SingleDocSum, an extractive single-document summarization approach based on genetic operators and guided local search. The DUC2001 and DUC2002 datasets were used for evaluation. In terms of ROUGE-1, MA-SingleDocSum is outperformed by the DE method by 6.67% on DUC2001 and by UnifiedRank by 0.41% on DUC2002.
Yao et al. [17] propose a new formulation of text summarization via sparse optimization with a decomposable convex objective function, together with an efficient ADMM algorithm to solve it. To foster diversity in summaries, they also introduce a sentence dissimilarity term. The DUC 2006 and DUC 2007 datasets were used for evaluation. This compressive framework surpasses all other unsupervised systems and achieves very competitive results compared with the best peer system in DUC 2006/2007. The procedure is completely unsupervised; it could be extended to supervised cases for various optimization problems.
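Since most of the studies above report ROUGE scores, a small sketch of the n-gram overlap behind ROUGE-N may help. It is a simplified illustration with a single reference summary and whitespace tokenization; the official ROUGE 1.5.5 toolkit additionally supports options such as stemming, stop-word removal, and multiple references, none of which are modeled here. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Clipped n-gram overlap between a candidate and one reference summary."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # per-n-gram minimum of the two counts
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}
```

For the candidate "the cat sat on the mat" and the reference "the cat lay on the mat", five of the six reference unigrams are matched, so ROUGE-1 recall is 5/6; three of the five reference bigrams are matched, so ROUGE-2 recall is 0.6.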
The summarization in [18] is abstractive and adopts an encoder-decoder framework. The authors introduce a graph-based attentional neural model and propose a hierarchical decoding algorithm to tackle the difficulty of abstractive summary generation. They ran experiments on two large-scale corpora, CNN and DailyMail, both of which are frequently used in neural document summarization. The approach surpasses classic extractive methods as well as the distraction-based abstractive model, and it generates more informative and concise summaries.
Based on search approaches such as self-organized multi-objective differential evolution, the multi-objective grey wolf optimizer, and the multi-objective water cycle algorithm, [19] presents three extractive single-document text summarization (ESDS) systems. The DUC 2001 and DUC 2002 datasets were used for performance evaluation. On the DUC2001 dataset, the technique improves the ROUGE-2 score by 6.49%, 40.86%, and 43.03%, respectively. On the DUC2002 dataset, it improves the ROUGE-2 score by 49.44%, 81.19%, and 81.44%, respectively.
3.2 Unilingual Multi-document Summarization
Because of the massive expansion of data, it is possible to find a significant number of documents on the same subject written in the same language. Multi-document summarization comes in handy for making use of these related documents on the same topic.
Goldstein et al. [20] discuss a multi-document summarization approach that builds on single-document summarization approaches by using additional available information about the document set. The technique works by segmenting the documents into passages and indexing them with inverted indices. It then identifies the passages relevant to the query using cosine similarity with a threshold, and applies the multi-document form of maximal marginal relevance (MMR) to them.
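The two steps just described for [20], relevance filtering by thresholded cosine similarity followed by MMR selection, can be illustrated with the sketch below. It uses the basic MMR criterion of [1] (relevance to the query traded off against similarity to already selected passages) rather than the full multi-document variant, and the bag-of-words representation, the parameter names `lam` and `threshold`, and their default values are assumptions made for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts from whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mmr_select(passages, query, k=2, lam=0.7, threshold=0.1):
    """Greedily pick up to k passages that are relevant but mutually dissimilar."""
    q = bow(query)
    vecs = [bow(p) for p in passages]
    relevance = [cosine(v, q) for v in vecs]
    # Step 1: keep only passages whose query similarity reaches the threshold.
    candidates = [i for i, r in enumerate(relevance) if r >= threshold]
    selected = []
    # Step 2: repeatedly add the candidate with the best MMR score.
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return [passages[i] for i in selected]
```

Setting `lam` closer to 1 favors relevance to the query, while lower values penalize redundancy more strongly and therefore increase diversity among the selected passages.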
In the approach of [20], a number of passages is selected to compute passage redundancy, and passage similarity scoring is used to cluster passages. The selected passages are reassembled into a summary using one of the summary cohesion criteria. It is a domain-independent strategy for decreasing redundancy and increasing diversity that is based on fast, statistical processing. It enables simple parameterization for different genres, corpus characteristics, and user requirements. However, in some cases the summaries lack co-reference resolution, contain passages that are disjoint from each other, or carry false implicature.
The paper [21] summarizes information from multiple documents. Classic clustering approaches are used to separate the text into distinct clusters, each of which is taken to address a single topic. Clusters are ranked based on their similarity to the vector of term frequencies from all texts to be summarized. The summaries that are created contain complete sentences from the original documents. Although the method seems promising, it performed poorly on the data provided by NIST. The authors identified several areas for future improvement, such as methods for determining clusters, topic detection methods other than classical clustering techniques, and better boundary identification tools.
iNeATS [22] is an interactive multi-document summarization system. It combines a state-of-the-art summarization technique with an advanced user interface. The main goal of the approach is to give the user direct control over the summarization parameters,
to support exploration of the document set using the summary as a starting point, and to offer different methods for structuring and displaying the summary. NeATS [23] is an extraction-based multi-document summarization technique. It has three primary components: content selection, content filtering, and content presentation. Content selection identifies important concepts mentioned in the documents; content filtering uses sentence position, stigma words, and redundancy filters to find the N lead sentences; content presentation ensures the coherence of the generated summary. Interactive NeATS gives users control of the summarization process. Users can control the summary size, sentence position, and redundancy filters. If any of the parameters is changed, a new summary is generated. iNeATS supports browsing of the document set by giving an overview of the documents, linking the summary to the original documents, and highlighting the most relevant sentences. It uses a map-based visualization for displaying alternative summaries.
Radev et al. [24] present a technique for multi-document summarization referred to as MEAD. Here, cluster centroids produced by a topic detection and tracking system are used to create summaries. The primary feature of MEAD is its use of cluster centroids. The words that make up a cluster centroid are important to the cluster as a whole, not just to a single article in it. MEAD ranks the sentences according to a set of criteria before deciding which ones should be included in the extract. A collection of articles and a compression rate are the inputs of MEAD. The documents are arranged chronologically, and the extracted sentences are output in the same order as they appear in the original texts. Interjudge agreement on sentence utility for this method is quite high. However, agreement on cross-sentence subsumption is rather low.
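A centroid-based ranking in the spirit of MEAD [24] can be sketched as follows. This is a simplified illustration, not the MEAD system: it scores each sentence by a centroid value plus a positional value only, omits MEAD's other criteria and any handling of cross-sentence subsumption, and takes a hypothetical `idf` table (with a default weight for unknown words) as input instead of corpus-derived statistics. The weights `w_centroid` and `w_position` are likewise assumptions.

```python
from collections import Counter


def build_centroid(documents, idf, default_idf=1.0, threshold=0.0):
    """Centroid = terms whose (average count per document x IDF) exceeds threshold."""
    counts = Counter(w for doc in documents for s in doc for w in s.lower().split())
    centroid = {}
    for word, count in counts.items():
        value = count / len(documents) * idf.get(word, default_idf)
        if value > threshold:
            centroid[word] = value
    return centroid


def mead_extract(documents, idf, compression=0.5, w_centroid=1.0, w_position=1.0):
    """documents: chronologically ordered list of documents (lists of sentences)."""
    centroid = build_centroid(documents, idf)
    scored = []
    for d, doc in enumerate(documents):
        # Centroid value of a sentence: sum of the centroid weights of its words.
        c_vals = [sum(centroid.get(w, 0.0) for w in s.lower().split()) for s in doc]
        c_max = max(c_vals, default=0.0)
        for i, sentence in enumerate(doc):
            position = (len(doc) - i) / len(doc) * c_max  # decays with position
            score = w_centroid * c_vals[i] + w_position * position
            scored.append((score, d, i, sentence))
    # The compression rate fixes how many sentences are kept (at least one).
    keep = max(1, int(compression * len(scored)))
    best = sorted(scored, key=lambda t: t[0], reverse=True)[:keep]
    best.sort(key=lambda t: (t[1], t[2]))  # restore document and sentence order
    return [sentence for _, _, _, sentence in best]
```

The final sort reflects the output convention described above: documents are taken in chronological order and the selected sentences keep their original order within each document.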
The goal of topic-focused multi-document summarization is to provide a summary that is biased toward a specific topic. The work [25] provides a novel extractive approach to this task based on manifold-ranking of sentences. The manifold-ranking process can make full use of both the relationships among all the sentences in the documents and the relationships between the sentences and the given topic. The approach first applies manifold-ranking to compute a score for each sentence, which reflects the sentence's biased information richness, and then uses a greedy algorithm to penalize sentences that heavily overlap other informative sentences. The sentences with the highest overall scores, which are informative, novel, and strongly biased toward the topic, are chosen to make up the summary. The suggested method greatly outperformed the baseline methods and the top-performing systems in the DUC tasks. However, for better results the parameters need to be tuned empirically for different summarization tasks.
The study [26] develops a multi-document summarization system based on sentence-level semantic analysis (SLSS) and symmetric non-negative matrix factorization (SNMF). The sentence similarity matrix is formed using SLSS, which can capture the semantic relationships between sentences. The SNMF technique is applied to cluster the sentences based on the similarity matrix. Traditional non-negative matrix factorization (NMF) works with rectangular matrices and is hence not suitable here. Finally, the most important sentences in each cluster are chosen, taking into account
both internal and external information. The presented method performs well owing to its use of sentence-level semantic information, clustering over a symmetric similarity matrix, and within-cluster sentence selection.
For extractive multi-document summarization, the paper [27] provides a transductive method for learning ranking functions. In the first step, the approach finds topic themes inside a document collection, which helps identify two sets of sentences, relevant and irrelevant to a query. It then iteratively trains a ranking function over these two sets by optimizing a ranking loss and fitting a prior model built from keywords. The technique first determines a prior probability of relevance for each sentence using the set of keywords associated with a query. It then learns a scoring function that fits the prior probabilities and minimizes the number of irrelevant sentences scored higher than relevant ones. In every iteration, new relevant and irrelevant sentences are identified using the scores predicted by the current function. These sentences are added to the training set in order to train a new function, whose output is in turn used to find more relevant and irrelevant sentences. This process is repeated until a predetermined stopping criterion is reached.
Wei et al. [28] provide a document-sensitive graph model that emphasizes the influence of global document set information on local sentence evaluation. Intra-document sentence relations are distinguished from inter-document sentence relations by taking document-document and document-sentence relations into account. Using this model, they developed an iterative sentence ranking algorithm, DsR (Document-Sensitive Ranking). It takes into account how the evaluation of each sentence is influenced by the overall document set. The system significantly outperforms the baseline system. However, the generated summaries are very short and require further post-processing.
Wang et al. [29] incorporate document clustering into multi-document summarization. By simultaneously using the document-term and sentence-term matrices obtained from the original texts, they propose a new language model, factorization with given bases (FGB), where the given bases are sentence bases. The DUC2002 and DUC2004 datasets were used for generic multi-document summarization, while the DUC2005, DUC2006, and TAC2008 (set A) datasets were used for query-relevant summarization. The FGB approach was evaluated with the ROUGE toolkit, which is widely used by DUC for performance evaluation. The technique obtains high ROUGE scores and performs better than the majority of baseline systems.
Liu et al. [30] created query-oriented deep extraction (QODE), an unsupervised deep learning framework with a new deep architecture and an unsupervised deep learning model designed for query-oriented multi-document summarization. They evaluate performance using DUC 2005, DUC 2006, and DUC 2007. They also employ a single document collection, D376e, which has 26 documents and 9 human summaries. The method does not assume that a training corpus is available; incorporating supervised training could therefore further improve the results.
Kumar et al. [31] offer a supervised learning approach for automatic cross-document relationship identification that is built on the case-based reasoning (CBR) method and improved using a genetic learning model. They used the CSTBank dataset, a corpus of clusters of English news articles annotated with CST relationships. They
accumulated 582 sentence pairs with the relationship types Identity, Subsumption, Description, and Overlap. The overall performance of the model was assessed using the DUC 2002 dataset. The experimental results demonstrated that the model produced better outcomes, hence supporting the hypothesis.
Canhasi and Kononenko [32] provide multi-document summarization using archetypal analysis of the content-graph joint model. The method selects important sentences from a given set of documents while minimizing redundant information in the summary and maintaining coverage of the document collection. Summarization is formalized as an archetypal analysis problem that takes into account relevance, diversity, information coverage, and the length limit. The DUC2004 and DUC2006 datasets were used in this study. The method outperforms the best systems in DUC2006 on ROUGE-1 and ranks among the top systems in DUC2004. The position of a sentence within a document, however, has not yet been exploited in their current work.
Manifold-ranking has been shown to be an effective strategy for topic-focused multi-document summarization. Nevertheless, it relied solely on cosine similarity to build relationships between sentences. An improved similarity metric is provided in [33] to further improve the manifold-ranking algorithm. The authors present a joint optimization approach that combines the manifold-ranking method with a similarity metric learning method. The integrated system attempts to improve both sentence similarity and sentence ranking. It outperformed the basic manifold-ranking algorithm in terms of summarization. However, further research is needed to incorporate syntactic and semantic features of documents into the summarization process.
Al-Dhelaan [34] proposes a graph-based method, StarSum, which combines topic signature phrases with sentences in a star graph for multi-document summarization.
StarSum uses a simple but efficient star graph to combine topic signature phrases and sentences. It provides diversity by dividing the StarSum graph into multiple components and picking the best sentences from each, and it provides coverage by ranking sentences according to their degree of connectivity to other topic phrases and sentences. ROUGE-1, a unigram metric with the highest agreement with human assessors, was used. In addition, the bigram measure ROUGE-2, the trigram measure ROUGE-3, ROUGE-4, and ROUGE-L, which is based on the longest common subsequence of words, were used.
Cao et al. [35] developed a summarization method called TCSum, which uses text classification to improve multi-document summarization performance. The DUC 2001, 2002, and 2004 datasets were used, and ROUGE scores were used to assess performance. TCSum outperforms all other models that rely entirely on automatically learned features on all three datasets.
The extractive approach is also used for single-document summarization. Sanchez-Gomez et al. [36] employed a self-organized multi-objective differential evolution-based ESDS technique, as well as a grey wolf optimizer-based ESDS approach and a water cycle algorithm-based ESDS approach. The DUC2001 and DUC2002 datasets were used. For DUC2001, the best strategy improves by 6.49% points, while for DUC2002 it improves by 49.44% points over the best existing technique, namely MA-SingleDocSum.
Chu and Liu [37] present a neural model for unsupervised multi-document abstractive summarization. The suggested MeanSum model is made up of two parts:
(1) an auto-encoder and (2) a summarization module that learns to generate summaries. They used a dataset of customer reviews from the Yelp Dataset Challenge, with each review accompanied by a 5-star rating. They also tested on a product review dataset from Amazon. Neural abstractive summarization methods typically rely on supervised learning with a large number of document-summary pairs, which are difficult to produce at scale. To overcome this restriction, the authors developed an unsupervised abstractive model. Their technique, however, does not offer an unsupervised solution to the more challenging (due to fewer redundancy cues) single-document summarization problem.
The study [38] describes an abstractive multi-document summarization approach that first converts documents into a Semantic Link Network (SLN) of concepts and events and then summarizes the SLN. The DUC 2005, DUC 2006, and DUC 2007 datasets were used to assess performance. On the DUC2006 dataset, the system outperforms all extractive and abstractive baselines on ROUGE-2 and ROUGE-SU4, while on ROUGE-1 it achieves performance comparable to several state-of-the-art extractive baselines. The system also surpasses all extractive and abstractive baselines on the DUC 2007 dataset.
4 Conclusion
In this study, we have analyzed various automatic summarization techniques. As there are many languages and many types of text documents, this field has a great diversity of work. We presented an overview of recent research on a few combinations of single-document, multi-document, and multilingual document summarization. We have shown the major contributions, datasets, and performance evaluation of research studies spanning from 1998 to 2019. We have found that there are two basic types of summarization techniques: abstractive and extractive. Supervised and unsupervised learning techniques are used in different cases. Initially, automatic summarization was mostly applied to single documents only. With the increasing need for summarization, however, multi-document and multilingual summarization have become very popular. As this is a very dynamic field, researchers are still actively working in it. We believe that this paper will be a good starting point for exploration in this field.
References
1. Carbonell, J., Goldstein, J.: The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 335–336 (1999)
2. Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. Adv. Auto. Text Summ., 111–121 (1999)
3. Boguraev, B., Neff, M.S.: Lexical cohesion, discourse segmentation and document summarization. In: RIAO, pp. 962–979 (2000)
4. Kulkarni, A.R., Apte, M.S.: An automatic text summarization using feature terms for relevance measure (2002)
5. Leskovec, J., Grobelnik, M., Milic-Frayling, N.: Learning sub-structures of document semantic graphs for document summarization. In: LinkKDD Workshop, pp. 133–138 (2004)
6. Varadarajan, R., Hristidis, V.: A system for query-specific document summarization. In: Proceedings of the 15th ACM international conference on Information and knowledge management, pp. 622–631 (2006)
7. Svore, K., Vanderwende, L., Burges, C.: Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), pp. 448–457 (2007)
8. Chatterjee, N., Mohan, S.: Extraction-based single-document summarization using random indexing. In: 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), Vol. 2, pp. 448–455. IEEE (2007)
9. Wan, X., Yang, J., Xiao, J.: Single document summarization with document expansion. In: AAAI, pp. 931–936 (2007)
10. Lee, J.H., Park, S., Ahn, C.M., Kim, D.: Automatic generic document summarization based on non-negative matrix factorization. Inf. Process. Manage. 45(1), 20–34 (2009)
11. Wan, X., Xiao, J.: Exploiting neighborhood knowledge for single document summarization and keyphrase extraction. ACM Trans. Info. Syst. (TOIS) 28(2), 1–34 (2010)
12. Nagwani, N.K., Verma, S.: A frequent term and semantic similarity based single document text summarization algorithm. Int. J. Comp. Appl. 17(2), 36–40 (2011)
13. He, Z., Chen, C., Bu, J., Wang, C., Zhang, L., Cai, D., He, X.: Document summarization based on data reconstruction. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 26, No. 1 (2012)
14. Barrera, A., Verma, R.: Combining syntax and semantics for automatic extractive single-document summarization. In: International Conference on Intelligent Text Processing and Computational Linguistics, pp. 366–377. Springer, Berlin, Heidelberg (2012)
15. Sarkar, K.: Automatic single document text summarization using key concepts in documents. JIPS 9(4), 602–620 (2013)
16. Mendoza, M., Bonilla, S., Noguera, C., Cobos, C., León, E.: Extractive single-document summarization based on genetic operators and guided local search. Expert Syst. Appl. 41(9), 4158–4169 (2014)
17. Yao, J.G., Wan, X., Xiao, J.: Compressive document summarization via sparse optimization. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)
18. Tan, J., Wan, X., Xiao, J.: Abstractive document summarization with a graph-based attentional neural model. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1171–1181 (2017)
19. Saini, N., Saha, S., Jangra, A., Bhattacharyya, P.: Extractive single document summarization using multi-objective optimization: exploring self-organized differential evolution, grey wolf optimizer and water cycle algorithm. Knowl.-Based Syst. 164, 45–67 (2019)
20. Goldstein, J., Mittal, V.O., Carbonell, J.G., Kantrowitz, M.: Multi-document summarization by sentence extraction. In: NAACL-ANLP 2000 Workshop: Automatic Summarization (2000)
21. Boros, E., Kantor, P.B., Neu, D.J.: A clustering based approach to creating multi-document summaries. In: Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval (2001)
22. Leuski, A., Lin, C.Y., Hovy, E.: iNeATS: interactive multi-document summarization. In: The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics, pp. 125–128 (2003)
23. Lin, C.Y., Hovy, E.: NeATS: a multidocument summarizer. In: Proceedings of the Document Understanding Workshop (DUC) (2001)
24. Radev, D.R., Jing, H., Styś, M., Tam, D.: Centroid-based summarization of multiple documents. Inf. Process. Manage. 40(6), 919–938 (2004)
25. Wan, X., Yang, J., Xiao, J.: Manifold-ranking based topic-focused multi-document summarization. IJCAI 7, 2903–2908 (2007)
26. Wang, D., Li, T., Zhu, S., Ding, C.: Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization. In: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 307–314 (2008)
27. Amini, M.R., Usunier, N.: Incorporating prior knowledge into a transductive ranking algorithm for multi-document summarization. In: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp. 704–705 (2009)
28. Wei, F., Li, W., Lu, Q., He, Y.: A document-sensitive graph model for multi-document summarization. Knowl. Inf. Syst. 22(2), 245–259 (2010)
29. Wang, D., Zhu, S., Li, T., Chi, Y., Gong, Y.: Integrating document clustering and multidocument summarization. ACM Trans. Knowl. Discovery Data (TKDD) 5(3), 1–26 (2011)
30. Liu, Y., Zhong, S.H., Li, W.: Query-oriented multi-document summarization via unsupervised deep learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 26, No. 1 (2012)
31. Kumar, Y.J., Salim, N., Abuobieda, A., Tawfik, A.: Multi document summarization based on cross-document relation using voting technique. In: 2013 International Conference on Computing Electrical and Electronic Engineering (ICCEEE), pp. 609–614. IEEE (2013)
32. Canhasi, E., Kononenko, I.: Multi-document summarization via archetypal analysis of the content-graph joint model. Knowl. Inf. Syst. 41(3), 821–842 (2014)
33. Tan, J., Wan, X., Xiao, J.: Joint matrix factorization and manifold-ranking for topic-focused multi-document summarization. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 987–990 (2015)
34. Al-Dhelaan, M.: StarSum: a simple star graph for multi-document summarization. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 715–718 (2015)
35. Cao, Z., Li, W., Li, S., Wei, F.: Improving multi-document summarization via text classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 31, No. 1 (2017)
36. Sanchez-Gomez, J.M., Vega-Rodríguez, M.A., Pérez, C.J.: Extractive multi-document text summarization using a multi-objective artificial bee colony optimization approach. Knowl. Based Syst. 159, 1–8 (2018)
37. Chu, E., Liu, P.: MeanSum: a neural model for unsupervised multi-document abstractive summarization. In: International Conference on Machine Learning, pp. 1223–1232. PMLR (2019)
38. Li, W., Zhuge, H.: Abstractive multi-document summarization based on semantic link network. IEEE Trans. Knowl. Data Eng. (2019)
Author Index
A
Afroze, Sadia 145
Ahmed, Imtiaj 274
Al Maksud, Kazi Sifat 334
Al Munem, Abdullah 32
Alam, Farhana 227
Alam, Md Janay 334
Ali, Mohammed Nadir Bin 345
Anan, Sabiha 345
Anilkumar, Abhishek 250
Arefin, Mohammad Shamsul 32, 121, 165, 227, 274, 334, 345
Arun, K. S. 250
Aszani, 135
B
Balyan, Vipin 84
Bhuiyan, Touhid 345
Bijoy, Md. Hasan Imam 345
Blagov, Dmitry A. 95
Budnikov, Dmitry 51
Bulayungan, Alvin 67
C
Cam, Nguyen Tan 306, 325
Castillo, Deane Cristine 67
Chaudhary, M. P. 299
Chekha, O. V. 13
Chilingaryan, N. O. 104
Chilingaryan, N. 3
Choudhury, Ashiquzzaman 334
D
Das, Anindita 260
Das, Avishek 199
Deb, Kaushik 217
Dinh, Dung Hai 241
Dorokhov, A. S. 104
Dovlatov, Igor M. 58, 95
Duda, Alfredo P. 84
E
Ershova, Irina 114
F
Fahmida, Maisha 217
Faiza, Anika 165
Fang, Xing 209
Fatkhutdinov, M. 23
G
Gunadi, Kartika 289
H
Halim, Siana 316
Hareesh, V. 250
Hasan, Md. Mainul 165
Hoque, Mohammed Moshiul 145, 179, 199
Hossain, Md. Rajib 179
Hossain, Md. Shafayat 32
Hossain, Syed Md. Minhaz 217
Huy, Trinh Gia 325
I
Ignatkin, I. Y. 13
Ignatkin, I. Yu. 43
Intan, Adriel A. 266
Intan, Rolly 266
Ishkin, P. 23, 77
Islam, Nazneen 345
Ivanitskikh, A. 3
J
Jurochka, Sergey S. 58
K
Karmakar, Mridul Ranjan 121
Kazantsev, S. P. 13
Khaliluzzaman, Md. 217
Kincharova, M. 77
Kirsanov, Vladimir 114
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. Vasant et al. (Eds.): ICO 2023, LNNS 853, pp. 359–360, 2024. https://doi.org/10.1007/978-3-031-50327-6
Komkov, Ilya V. 58, 95
Kononenko, A. S. 43
L
Lam, Ngoc Nguyen Minh 241
Lapie, John Carl 67
Larionov, Gennady 114
Livida, Dorothy Mary Ann 67
M
M., Nimal Madhu 155, 187
Macatangay, Bryant 67
Madhu, M. Nimal 250
Mahajabin, Maisha 227
Mahardika, Made Yoga 289
Manshahia, Mukhdeep Singh 299
Mardareva, Natalia 114
Mashkov, S. 23
Mulyana, Sri 135
Munif, Mahadi Karim 121
N
Nguyen, Phuc 325
Nova, Shahinur Rahman 165
P
Panchenko, V. A. 104
Panchenko, V. 3, 43
Panchenko, Vladimir A. 58
Panchenko, Vladimir 51, 114
Polikanova, Alexandra A. 58, 95
Poruchikov, Dmitrii 114
Prem Kumar, Pradeesh 155, 187
Prito, Rizvee Hassan 32
Proshkin, Y. 3
R
Rahman, Md. Mostafijur 274
Rahman, Sheikh Sowmen 199
Rahman, Syada Tasfia 227
Raji, Atanda K. 84
Rakitina, V. 77
Reza, Ahmed Wasif 32, 121, 165, 227, 274, 334, 345
Reza, Nahid 227
Rosales, Marife 67
Rudenko, Viktor 51
Ryabchikova, V. 43
S
Saiful, Md. 227
Sakib, Md. Nazmus 165
Samarin, Gennady 114
Sangalang, Kim Francis 67
Sarker, Banalata 121
Semenova, N. A. 104
Semenova, N. 3
Serov, A. V. 13
Serov, N. V. 13
Setiawan, Alexander 289
Shah, Dhaneshwar 209
Sharif, Omar 199
Shevkun, N. A. 43
Sizan, Samiun Rahman 274
Smirnov, A. 3
Syrkin, V. 23
T
Tabassum, Fariha 274
Tan, Vo Ngoc 325
Tran, Ngoc Hong 241
Tuhin, Rashedul Amin 32
Tusi, Sanjida Alam 121
U
U., Nitish 155, 187
Ukidve, Seema 299
Uyutova, N. I. 104
V
V., Hareesh 155, 187
Variyar, V. V. Sajith 187
Vasilev, S. 23
Vo, Sang 325
Vuong, Vo Quoc 306
W
Widyadana, I. Gede Agus 316
Wijayanto, Emmanuel Jason 316
X
Xaba, Siyanda Andrew 209
Y
Yadav, Ramsagar 299
Yadukrishnan, V. 250
Yudaev, I. 23, 77