English Pages 311 [306] Year 2023
Nirav Joshi Vinod Kushvaha Priyanka Madhushri Editors
Machine Learning for Advanced Functional Materials
Editors Nirav Joshi University of Sao Paulo, Sao Carlos Institution of Physics, Grupo de Polimeros São Paulo, Brazil
Vinod Kushvaha Department of Civil Engineering Materials and Structures Indian Institute of Technology Jammu Jammu, India
Priyanka Madhushri Proof of Concept and Innovation Group Stanley Black and Decker (United States) Atlanta, GA, USA
ISBN 978-981-99-0392-4 ISBN 978-981-99-0393-1 (eBook) https://doi.org/10.1007/978-981-99-0393-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The evolving field of Machine Learning (ML) and Artificial Intelligence (AI) has contributed tremendously to the advancement of various branches of science and technology. Machine learning has attracted great interest from the materials science research community because of its ability to statistically analyze large collections of data. Beyond computational tasks, time-efficient machine learning tools have also been applied to predict the design and properties of new materials. A noticeable shift from trial-and-error laboratory approaches to modeling- and simulation-based software techniques in the preparation and characterization of functional materials reflects the emergence of big data in materials science. Efficient algorithms enable secure data collection and storage, fast processing, and interpretation of physically generated results. This has put research in the physical and chemical sciences at the forefront of advances in image processing, photonics, optoelectronics, and other emerging areas of materials science. Although many research articles have been published on this topic, no comprehensive scientific reference book on machine learning for advanced functional materials has been published so far. This book aims to provide an in-depth examination of recent achievements in materials science by focusing on topical issues using machine learning methods. The book contains a total of 13 chapters. Muhammad Abdul Basit reviews ML for organic and inorganic solar cells, discussing the use of machine learning, its various classes, common algorithms, and the basic steps of ML. The chapter also gives a detailed discussion of specific types of ML for solar cells and of the application of ML to the prediction of suitable materials, the optimization of device structure and fabrication processes, and the reconstruction of measured data for solar cells.
Shirong Huang demonstrates the application of machine learning to gas identification, focusing on two main strategies: the conventional electronic nose strategy, which relies on sensor arrays with different materials or functionalizations that each provide a different maximum response value, and the use of single sensors that exploit multiple transient features of their response. Similarly, Elsa M. Materón discusses artificial intelligence tools used to assist electrochemical, optical, and gas sensors, including the advantages, advances, limitations, and strategies of the most commonly used machine learning algorithms. Nitesh Sureja discusses the design, implementation, and effectiveness of the application of ML to data generated by healthcare devices. Gisela Ibáñez-Redin gives a general overview of the application of machine learning algorithms to wearable technologies, with special emphasis on health monitoring, sports analytics, veterinary medicine, and agriculture. Humaira Rashid Khan demonstrates the need to develop new energy materials that contribute to global carbon neutrality and reviews the most recent advancements in data-driven materials research and engineering, including alkaline ion battery materials, photovoltaic materials, catalytic materials, and carbon dioxide capture materials. Purvi Bhatt reviews applications of machine learning algorithms to the study of experimentally obtained results of physical systems and concludes the chapter with a discussion of future directions and challenges in the acceptance of this advanced technique in the vast existing area of materials science. Ramandeep Kaur gives a brief discussion of perovskite materials and the developments made in lead-free perovskites for photovoltaics; the discussion then turns to the collection and analysis of materials data and extends to the descriptors used to describe the performance and properties of lead-free perovskites. Shulin Yang discusses the models applied for machine learning and the enhanced mechanism of novel sensors or sensor arrays to better understand machine learning methods. R. Vignesh covers the challenges in advanced functional materials research and the role of machine learning in design, simulation, and evaluation. Significant pointers to successful machine learning applications are also addressed, as well as the remaining hurdles in machine learning for next-generation functional materials.
Tulsi Satyavir Dabodiya provides a brief introduction to the ML process that could benefit the photocatalysis (PC) field. The chapter further provides basic PC research knowledge that could be useful for machine learning methods and describes the pre-existing ML practices in PC for the quick identification of novel photocatalysts. G. Sudha Priyanga explains the state of the art in materials informatics as it pertains to photocatalysts and indicates that combining ML with photocatalysis expertise could pave the way for a comprehensive catalyst screening platform, which would greatly help to make photocatalysis both prominent and technologically relevant. V. Balasubramani focuses on impedance-based sensors, such as chemical sensors and biosensors, that use graphene-based nanocomposite materials, and on the role of machine learning in addressing the challenges these sensors face in the detection of chemicals and biomolecules; their advantages, drawbacks, current improvements, and future directions are illustrated. This book will benefit students, researchers, and professionals working in the fields of machine learning, pattern recognition, and artificial intelligence using advanced functional materials. São Paulo, Brazil Jammu, India Atlanta, USA
Nirav Joshi Vinod Kushvaha Priyanka Madhushri
Contents
Solar Cells and Relevant Machine Learning
Muhammad Abdul Basit, Muhammad Aanish Ali, and Mamoona Yasmeen

Machine Learning-Driven Gas Identification in Gas Sensors
Shirong Huang, Alexander Croy, Bergoi Ibarlucea, and Gianaurelio Cuniberti

A Machine Learning Approach in Wearable Technologies
Gisela Ibáñez-Redin, Oscar S. Duarte, Giovana Rosso Cagnani, and Osvaldo N. Oliveira

Potential of Machine Learning Algorithms in Material Science: Predictions in Design, Properties, and Applications of Novel Functional Materials
Purvi Bhatt, Neha Singh, and Sumit Chaudhary

The Application of Novel Functional Materials to Machine Learning
Humaira Rashid Khan, Fahd Sikandar Khan, and Javeed Akhtar

Recent Advances in Machine Learning for Electrochemical, Optical, and Gas Sensors
Elsa M. Materón, Filipe S. R. Silva Benvenuto, Lucas C. Ribas, Nirav Joshi, Odemir Martinez Bruno, Emanuel Carrilho, and Osvaldo N. Oliveira

Perovskite-Based Materials for Photovoltaic Applications: A Machine Learning Approach
Ramandeep Kaur, Rajan Saini, and Janpreet Singh

A Review of the High-Performance Gas Sensors Using Machine Learning
Shulin Yang, Gui Lei, Huoxi Xu, Zhigao Lan, Zhao Wang, and Haoshuang Gu

Machine Learning for Next-Generation Functional Materials
R. Vignesh, V. Balasubramani, and T. M. Sridhar

Contemplation of Photocatalysis Through Machine Learning
Tulsi Satyavir Dabodiya, Jayant Kumar, and Arumugam Vadivel Murugan

Discovery of Novel Photocatalysts Using Machine Learning Approach
G. Sudha Priyanga, Gaurav Pransu, Harshita Krishna, and Tiju Thomas

Machine Learning in Impedance-Based Sensors
V. Balasubramani and T. M. Sridhar

Machine Learning in Wearable Healthcare Devices
Nitesh Sureja, Komal Mehta, Vraj Shah, and Gautam Patel
Solar Cells and Relevant Machine Learning Muhammad Abdul Basit, Muhammad Aanish Ali, and Mamoona Yasmeen
Abstract Machine learning (ML) and data science are emerging computational tools that have recently been incorporated into emerging fields of materials science and engineering, including but not limited to solar cells. They help to optimize materials and their photovoltaic performance for various types of solar cells through algorithms and models, which is easy, cost-efficient, and rapid compared to conventional programming methods. Although the family of solar cells has been classified into various types based on their generations, the two basic types (i.e., organic and inorganic solar cells) are more distinct owing to the contrast in their materials, fabrication techniques, and corresponding characterizations. A large number of materials can be used for developing the photoanode/photocathode in solar cells; however, it is too difficult and complex to identify the most proficient one experimentally. In this chapter, we comprehensively review ML for organic and inorganic solar cells, discussing the use of machine learning, its various classes, common algorithms, and the basic steps of ML. A detailed discussion is given of specific types of ML for solar cells and of the application of ML to the prediction of suitable materials, the optimization of device structure and fabrication processes, and the reconstruction of measured data for solar cells. Finally, we cover the current research status, future challenges, and expected progress of ML, and propose suggestions that can enhance the usefulness of machine learning.
1 Introduction
The modern world faces unprecedented challenges caused by global warming and the energy crisis. Researchers and scientists continue to work on finding novel and smart solutions to address these challenges.
M. A. Basit (B) · M. Aanish Ali · M. Yasmeen, Department of Materials Science and Engineering, Institute of Space Technology, Islamabad 44000, Pakistan. e-mail: [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_1
There is a large gap between the supply of fossil fuels and energy demand, and burning fossil fuels for energy production is a cause of global warming. Green, renewable solar energy can maintain a balance between the environment and energy demand [1]. Annually, 23,000 TWy of energy can be harvested from solar radiation, while the annual energy consumption of the world is 1,600 TWy. Similarly, the current global energy needs could be met for eight years if the sunlight that reaches the earth in a single day could be harvested. These facts and figures are good evidence that solar energy can meet the ever-increasing energy demand and resolve the environmental problems of the world [2].
1.1 Generations of Solar Cells
In 1839, Alexandre-Edmond Becquerel introduced the concept of photovoltaics, and in 1940, the first silicon-based solar cell was developed at Bell Labs [3]. The highest solar cell efficiency reported to date is around 47.1%, a value achieved with multi-junction concentrator devices rather than single-junction silicon cells [4]. The following discussion categorizes solar cells into three generations based on the manufacturing cost, performance, and limitations of the photoanode junction. Figure 1 presents a broad overview of the major categories of solar cells and their classification according to type [5].
1.1.1 First-Generation Solar Cells
The first generation comprises wafer-based solar cells such as mono-crystalline silicon, polycrystalline silicon, and multi-junction cells. Mono-crystalline silicon solar cells are expensive but durable under ambient conditions, stable at high temperatures, and have a longer life span. Polycrystalline silicon solar cells are low cost but have a shorter life span, are less durable under ambient conditions, and are less stable at elevated temperatures. Multi-junction solar cells consist of multiple p–n junctions made of different semiconductors, which respond to a wider range of light; more electron-hole pairs are generated, more current is produced, and the efficiency increases [6].
1.1.2 Second-Generation Solar Cells
This type of solar cell is made of thin films and is cheaper than the first generation. The absorbing layer is mounted on a suitable substrate, and three material families are mostly used in this generation: copper-, cadmium-, and silicon-based absorbers. Amorphous silicon solar cells are extremely low cost, but they degrade under sunlight, which decreases their efficiency [7]. Cadmium telluride (CdTe) solar cells are inexpensive and chemically stable and have a wide absorption range due to a low band gap, but cadmium is highly toxic. Copper indium gallium selenide (CIGS) solar cells use less expensive and less toxic materials and reach efficiencies of around 20%, but they are expensive to manufacture [8].
Fig. 1 Classification of solar cells. Reprinted with permission from Ref. [5]. Copyright 2017 Elsevier
1.1.3 Third-Generation Solar Cells
In 1839, calcium titanium oxide (CaTiO3) perovskite was discovered by the German researcher Gustav Rose. In the perovskite structure, organic cations occupy the corners, inorganic cations occupy the body center, and anions occupy the face centers. In 2009, perovskites were reported for solar cells with around 4% efficiency, and 29% efficiency has been reported for tandem perovskite solar cells [9]. Perovskites are cheap and have a wide absorption range due to a low band gap, but they suffer from a short life span because they are unstable under ambient conditions and at high temperature. Over the last few decades, Quantum Dot Sensitized Solar Cells (QDSSCs) have been a hot area of research owing to their wide absorption range, high stability against heat and moisture, facile fabrication methods, and low cost. Various photoanode sensitizers, electrolytes, and counter electrodes have been investigated to increase the efficiency of QDSSCs [10, 11].
1.2 Machine Learning
Machine learning is a subcategory of Artificial Intelligence (AI) that uses knowledge of computer science, mathematics, and statistics to build algorithms for a specific purpose. ML algorithms and tools now provide new ways to investigate the chemical and physical properties of materials and their corresponding efficiencies, and they allow hidden properties of a material to be uncovered. Generally, ML is categorized into three groups, namely Supervised Learning (SL), Unsupervised Learning (UL), and Reinforcement Learning (RL), as shown in Fig. 2. The basic difference between them lies in the input and output data. In SL, labeled data are used as input and the output is known. In UL, unlabeled data are used as input, and the output is discovered after the underlying pattern is understood. In RL, the learning data are not predefined and the model learns by trial and error. The quality of the predicted material performance depends significantly on the selection of the ML algorithm.
An ML analysis proceeds in four steps. The first step is data preparation: data can be collected from computational simulations or experimental measurements, with at least 50 data points necessary for a reasonable ML model. In the second step, the data are cleaned to avoid inconsistencies and normalized for better interpretation. The third step is the proper selection of an algorithm for the analysis; clustering, classification, regression, and probability estimation are widely used in materials science. The last step is model evaluation, for which the root mean squared error (RMSE) and the coefficient of determination (R2) are used. R2 typically ranges from 0 to 1, and values close to 1 indicate an accurate model [12].
Fig. 2 Various types of machine learning relevant to solar cells
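The two evaluation metrics named above can be written out in a few lines. The following is a minimal sketch using only the Python standard library; the band gap values are hypothetical and serve purely as an illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical band gaps in eV (illustrative values, not from the chapter)
measured = [1.10, 1.45, 1.60, 2.20]
predicted = [1.00, 1.50, 1.70, 2.10]

print(f"RMSE = {rmse(measured, predicted):.4f} eV")
print(f"R2   = {r_squared(measured, predicted):.4f}")
```

A lower RMSE and an R2 closer to 1 both indicate a better fit. Note that R2 computed this way can become negative when a model predicts worse than simply using the mean of the measured values.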
In the first section of this chapter, we have highlighted the importance of ML for predicting the power conversion efficiency (PCE) of solar cells by exploiting the properties of materials and the optimal device structure. In the second section, the basic workflow of ML in materials science and solar cells is elaborated. In the third section, we briefly describe the ML algorithms that can be used to predict the band gap of materials and the role of donors and acceptors (D/A) in the performance of solar cells. In the fourth section, we explore the role of ML algorithms in predicting highly efficient materials and their potential to discover new efficient materials that improve the performance of solar cells. ML algorithms have also helped to optimize the device structure of solar cells. This work provides practical guidance for accelerating the discovery of new efficient materials to improve the performance of solar cells.
2 Workflow of Machine Learning
The main purpose of ML is to explore hidden patterns in data and to help predict new data patterns. For this purpose, ML follows a systematic workflow of data collection and preparation, model building, and model evaluation. The steps most relevant to solar cells are discussed here.
2.1 Data Collection and Preparation
The first and most crucial step in the machine learning process is data collection. The type and amount of data to collect depend on the algorithm. SL requires labeled input data, whereas UL does not. The amount of data also varies from model to model; for example, an ANN requires a large amount of data, but in general about 50 data points are necessary for a meaningful ML response. The performance of the model also depends on how the data are split: the original dataset is divided into training and testing subsets in a chosen ratio. ML reveals the hidden patterns in the given data and helps to predict the patterns of new data. For efficient model performance, data cleaning and transformation are performed to remove noise and redundant information. The properties used as descriptors fall on different scales, for example structural properties, which can be controlled by synthetic rules. If the number of descriptor features exceeds the number of observations, dimensionality reduction tools such as principal component analysis (PCA) and independent component analysis (ICA) are used [13].
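The splitting and rescaling steps described above can be sketched as follows. This is a minimal standard-library illustration: the 80/20 ratio, the two descriptors, and the synthetic values are assumptions chosen for the example, and dimensionality reduction is not shown.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle row indices reproducibly and split into training and testing subsets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_ratio))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

def fit_scaler(train):
    """Per-feature mean and standard deviation, computed on training rows only."""
    means, stds = [], []
    for j in range(len(train[0])):
        column = [row[j] for row in train]
        mean = sum(column) / len(column)
        var = sum((v - mean) ** 2 for v in column) / len(column)
        means.append(mean)
        stds.append(var ** 0.5 or 1.0)  # guard against constant features
    return means, stds

def transform(rows, means, stds):
    """Rescale every feature to zero mean and unit variance (training statistics)."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical descriptors: [band gap (eV), film thickness (nm)], 50 synthetic rows
rng = random.Random(42)
data = [[rng.uniform(1.0, 3.0), rng.uniform(100.0, 600.0)] for _ in range(50)]

train, test = train_test_split(data, test_ratio=0.2)
means, stds = fit_scaler(train)
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)
```

The scaling statistics are taken from the training subset only and then reused for the test subset, so that no information from the test data leaks into model building.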
Fig. 3 The general workflow of ML for solar cells includes data collection and feature engineering, model selection and evaluation, model application, and experimental evaluation
2.2 Model Building and Evaluation
Every algorithm has its pros and cons. Clustering, regression, and classification are mostly used for the prediction of material properties, whereas probability estimation is used for the discovery of new materials. A good model should perform well on both training and testing data. Different metrics are used to check the performance of a model; they vary from algorithm to algorithm, but the root mean squared error (RMSE) and the coefficient of determination (R2) are the most common. R2 typically ranges between 0 and 1, and values near 1 indicate accurate predictions [13]. To verify the performance of the model, the original data are split into training and test sets several times, with a different group of samples held out each time, and the model is evaluated on each split. Finally, the performance over all splits is combined to assess the model. This method is called cross-validation [14]. Figure 3 presents a general workflow of ML for solar cells.
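The cross-validation procedure described above can be illustrated with a small k-fold sketch. The single-feature least-squares model, the choice of k = 5, and the synthetic data are assumptions made for this example only; any regression model could take the place of `fit_line`.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return the held-out RMSE of each of the k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint held-out groups
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a * xs[i] + b)) ** 2 for i in fold]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return scores

# Synthetic data: y = 2x + 1 plus small Gaussian noise (illustrative only)
rng = random.Random(1)
xs = [rng.uniform(0.0, 10.0) for _ in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

scores = k_fold_rmse(xs, ys, k=5)
mean_rmse = sum(scores) / len(scores)
print(f"fold RMSEs: {[round(s, 3) for s in scores]}, mean: {mean_rmse:.3f}")
```

Here the fold scores are averaged; a large spread between folds would suggest that the model is sensitive to which samples it was trained on.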
3 Machine Learning for Solar Cells
3.1 Naïve Bayes (NB)
The Naïve Bayes classifier is a supervised ML algorithm that works on the principle of Bayes' theorem. In the eighteenth century, Thomas Bayes, a clergyman, described the concept of the conditional probability of events.
Fig. 4 Structure of Naïve Bayes
$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)} \tag{1}$$
Here, the posterior probability $P(A \mid B)$ is the conditional probability of the hypothesis A in the light of the relevant observations B, the prior probability $P(A)$ is the probability of the hypothesis before new data are collected, the likelihood $P(B \mid A)$ is the probability of the observations given the hypothesis, and the marginal likelihood $P(B)$ is the evidence. The NB classifier treats each input feature as independent of the others given the class, applies Bayes' theorem, and computes conditional probabilities from prior knowledge, so that results are obtained on the basis of prior records, as shown in Fig. 4 [15]. The NB classifier is simple and computationally efficient, can handle categorical features directly without pre-processing, and can also handle missing and noisy data fairly well [16].
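Equation (1) and the feature-independence assumption can be illustrated with a small categorical Naïve Bayes classifier. This is a sketch under stated assumptions: the toy samples and labels are hypothetical, and add-one (Laplace) smoothing is an illustrative choice that the chapter does not mention.

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    feature_values = defaultdict(set)    # feature index -> observed values
    for features, label in zip(samples, labels):
        for j, value in enumerate(features):
            value_counts[(label, j)][value] += 1
            feature_values[j].add(value)
    return class_counts, value_counts, feature_values

def predict_nb(model, features):
    """Return the posterior P(class | features), assuming independent features."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(A)
        for j, value in enumerate(features):
            # likelihood P(B_j | A) with add-one smoothing
            seen = value_counts[(label, j)][value]
            score *= (seen + 1) / (count + len(feature_values[j]))
        scores[label] = score
    evidence = sum(scores.values())  # P(B), the normalizing constant
    return {label: s / evidence for label, s in scores.items()}

# Hypothetical toy data: (band gap class, stability class) -> efficiency label
samples = [("low", "stable"), ("low", "unstable"), ("high", "stable"),
           ("high", "unstable"), ("low", "stable"), ("high", "unstable")]
labels = ["good", "good", "good", "poor", "good", "poor"]

model = train_nb(samples, labels)
posterior = predict_nb(model, ("low", "stable"))
print(posterior)
```

The product over features inside `predict_nb` is where the "naïve" independence assumption enters; dividing by the summed scores plays the role of P(B) in Eq. (1).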
3.2 Artificial Neural Network (ANN)
In 1943, Warren McCulloch and Walter Pitts conceived the idea of the ANN; the purpose of the ANN approach is to create a computational system that can solve real-life problems the way the human brain does. Just as neurons are interconnected in the human brain, the neurons of an ANN are interconnected across various layers, where they form nodes [17]. Generally, an ANN consists of three types of layers, as shown in Fig. 5. The input layer accepts inputs in various formats provided by the user. Hidden layers lie between the input and output layers; their main function is to compute and extract hidden features and patterns in the data. The output layer receives the data after the series of transformations performed by the hidden layers and presents the result. The main advantages of the ANN algorithm are that it can still perform properly when inadequate data are provided, that the learned information is stored in the network rather than in a database, and that it can perform more than one task simultaneously.
Fig. 5 Structure of an artificial neural network
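The flow of data through the input, hidden, and output layers can be sketched as a single forward pass. The layer sizes, the sigmoid activation, and the weights below are hypothetical and untrained; training (for example by backpropagation) is not shown.

```python
import math

def sigmoid(z):
    """Logistic activation used by the hidden neurons in this sketch."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer (sigmoid) -> linear output layer."""
    hidden = dense(inputs, hidden_w, hidden_b, sigmoid)
    return dense(hidden, out_w, out_b, lambda z: z)

# Hypothetical, untrained weights: 2 inputs, 3 hidden neurons, 1 output
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[1.0, -1.0, 0.5]]
out_b = [0.2]

prediction = forward([1.5, 0.3], hidden_w, hidden_b, out_w, out_b)
print(prediction)
```

Each row of a weight matrix belongs to one neuron, so adding neurons or layers only means adding rows or further calls to `dense`.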
3.3 Decision Trees (DT)
DT is a supervised ML algorithm that is very popular and user-friendly compared to other algorithms. Its tree-like structure represents the relationship between the conditions and the predicted outcome. It works for both discrete targets (classification tree) and continuous targets (regression tree) [18]. It works on the principle of recursive partitioning, starting from a single partition known as the root node. Internal nodes are conditions, branches represent the results of the conditions, and leaf nodes are the decisions based on those conditions. The tree is grown by splitting the original dataset into subsets according to the conditions until each subset is homogeneous. Leaf nodes (where the tree terminates) represent the predicted results, which depend on the conditions at the root and decision nodes. The advantage of DT is that it can handle missing and noisy data very well and works with both discrete and continuous features. Figure 6 shows the structure of the decision tree.
Fig. 6 Structure of decision trees
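Recursive partitioning can be sketched with a tiny single-feature regression tree. The variance-based split criterion, the depth limit, and the temperature and PCE values are assumptions for illustration, not the setup of any study cited in this chapter.

```python
def variance(ys):
    """Population variance of the target values in one subset."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_split(xs, ys):
    """Find the threshold that minimizes the size-weighted variance of the children."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        cost = len(left) * variance(left) + len(right) * variance(right)
        if best is None or cost < best[0]:
            best = (cost, threshold)
    return best

def build_tree(xs, ys, depth=0, max_depth=3):
    """Recursively partition until a subset is homogeneous or max_depth is reached."""
    split = best_split(xs, ys)
    if depth == max_depth or variance(ys) == 0.0 or split is None:
        return {"leaf": sum(ys) / len(ys)}
    _, threshold = split
    left = [(x, y) for x, y in zip(xs, ys) if x < threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x >= threshold]
    return {
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([x for x, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, x):
    """Follow the conditions from the root node down to a leaf."""
    while "leaf" not in tree:
        tree = tree["left"] if x < tree["threshold"] else tree["right"]
    return tree["leaf"]

# Hypothetical data: annealing temperature (°C) -> PCE (%), illustrative values only
temps = [80, 90, 100, 110, 120, 130]
pce = [3.0, 3.0, 5.0, 5.0, 4.0, 4.0]
tree = build_tree(temps, pce)
print(predict(tree, 85), predict(tree, 105), predict(tree, 125))  # 3.0 5.0 4.0
```

For this toy data the root node splits at 100 °C and the right branch splits again at 120 °C, after which every subset is homogeneous and becomes a leaf.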
3.4 Other Machine Learning Techniques
ANN, NB, DT, and SVM models are mostly used to investigate optimal material prediction, fabrication processes, and device structures for solar cells. The choice is not limited to these techniques; several other models can be used for this purpose, such as Random Forest (RF) [19], K-nearest neighbors (K-NN) [20], Boosted Regression Trees (BRT) [21], Kernel Ridge Regression (KRR) [22], Convolutional Neural Network (CNN) [23], Gradient Boosting Regression Tree (GBRT) [24], Artificial Neural Network (ANN) [25], Support Vector Machine (SVM) [26], and Genetic Algorithm (GA) [27].
4 Typical Applications of ML Tools for Solar Cells
Solar cells can meet human energy demands, but finding suitable materials for high-performance solar cells is a crucial problem. Designing or predicting materials with conventional methods (lab experiments) is usually expensive and time-consuming and needs a lot of manpower. ML is a promising avenue for finding suitable and highly efficient materials without a trial-and-error approach. Recently, researchers have published significant articles on using ML to predict material properties such as band gap, solubility, density, refractive index, scattering, absorption, transmittance, and reflectance. The ANN algorithm is the most extensively used due to its superior properties, and other algorithms are also used for this purpose, as shown in Table 1 [28].
Table 1 List of ML models used for material prediction

| Sr. No. | Material | Property | Algorithm for ML | Precision accuracy | References |
|---|---|---|---|---|---|
| 1 | Perovskite | Optical | ANN | 66.5% prediction coefficient improved | [29] |
| 2 | Y6 non-fullerene acceptor | PCE | DFT, RF | Mean absolute error (MAE) ≈ 0.43, correlation coefficient r ≈ 0.97 | [30] |
| 3 | Organic solvent | Solubility of solvent | LightGBM | log ± 0.59 | [31] |
| 4 | Dye-sensitizer | Charge transfer, optoelectronic properties, PCE of device | DFT, TD-DFT, GA-MLR | Average relative error ≈ 3.1% | [32] |
| 5 | Pentacene | Charge carrier mobility | KRR | MAE of 3.0 ± 0.2, 5.3 ± 0.4, 9.7 ± 0.6 | [33] |
| 6 | ZnO | Morphology, PCE | DT, RF, ANN | Adj. R2 = 0.7232 | [34] |
| 7 | Perovskite | Stability and band gap | XGBoost, RF | XGBoost: R2 = 0.9935, MAE = 0.0126; Random Forest: R2 = 0.9410, MAE = 0.1492 | [35] |
| 8 | Kesterite I2-II-IV-V4 semiconductors | Band gap | SVR with radial basis kernel | Root mean squared error = 283 meV | [36] |
| 9 | Perovskite | PCE | SCAPS-1D | N/A | [37] |
| 10 | Mono-crystalline module, CIGS thin film | I–V | ANN | Mean absolute percentage error (MAPE) = 0.874% | [38] |
| 11 | Benzene | HOMO-LUMO | GA | N/A | [39] |
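Several of the accuracy measures reported in Table 1 (MAE, MAPE, and the correlation coefficient r) follow directly from their definitions. The sketch below computes them for hypothetical measured and predicted PCE values; the numbers are illustrative and are not taken from any of the cited studies.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined if any true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)

# Hypothetical measured vs. predicted PCE values in % (not taken from Table 1)
measured = [10.0, 12.0, 15.0, 8.0]
predicted = [9.0, 12.5, 14.0, 8.0]

print(f"MAE  = {mae(measured, predicted):.3f}")
print(f"MAPE = {mape(measured, predicted):.2f}%")
print(f"r    = {pearson_r(measured, predicted):.3f}")
```

MAE keeps the units of the predicted quantity, MAPE expresses the error relative to the measured value, and r only measures how well the predictions track the trend, so a high r alone does not guarantee small absolute errors.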
4.1 Effect of Material Properties on PCE of Solar Cells
4.1.1 Effect of Band Gap
Li and his co-workers used real experimental data from the literature to predict band gap values and power conversion efficiency (PCE) with the ANN algorithm, and then correlated the predictions with their own experimental results on thin films. In the first model, the material composition is the input and the band gap is the output. In the second model, the PCE is predicted using the band gap, ΔH, and ΔL as inputs [39]. In another, largely theoretical study, the DT algorithm was used to predict the PCE of perovskite solar cells, as shown in Fig. 7. The simulation parameters for the DT algorithm were obtained with the solar cell capacitance simulator-one-dimensional (SCAPS-1D) software. The PCE of perovskite solar cells depends especially on the band gap, but many other input parameters are also important, such as hole mobility, electron affinity, and the donor and acceptor concentrations. Furthermore, the device efficiency was improved from 13.29 to 16.68% by optimizing the band gap energy, thickness, number of defects, relative permeability, and other parameters [37].
Fig. 7 Machine learning approach to delineate the impact of material properties on the physics of a solar cell device. Reprinted with permission from Ref. [37]. Copyright 2022 American Chemical Society
4.1.2 Effect of Non-Fullerene Acceptor
Non-fullerene acceptors are a special type of acceptor used in organic solar cells. Recently, Zhao et al. predicted the efficiency of novel non-fullerene acceptors using the leave-one-group-out (LOGO) cross-validation method, which gave more accurate predictions than conventional ML methods [40]. Mahmood and his co-workers theoretically investigated planar non-fullerene acceptors by DFT. Planarity was introduced by restricting the movement of specific bonds; it helped to tune the band gap and improved the absorbance of the material, and the highly planar conjugated system improved charge carrier mobility and photovoltaic performance [41]. Similarly, Xiao and his co-workers designed small-molecule acceptors for P3HT-based organic solar cells; Voc increased and the device efficiency improved from 4.93 to 6.08% [42]. In another study, five models (RL, MLR, RF, ANN, and BRT) were constructed by Yoa and his co-workers. When the predictive ability of these five models was evaluated, the RL, MLR, and ANN algorithms showed low correlation coefficients, whereas the RF and BRT algorithms performed better. To discover optimal D/A pairs for non-fullerene solar cells, 32 million donor-acceptor (D/A) pairs were screened using the RF and BRT algorithms, and six photovoltaic D/A pairs were selected and synthesized by a high-throughput screening method. Both models show a good correlation coefficient between experimental and predicted PCE, r ≈ 0.72 for RF and r ≈ 0.70 for BRT [21]. Padula et al. evaluated 280 datasets of organic donors and acceptors and exploited the electronic and chemical properties of the organic molecules to predict the efficiency of solar cells. K-NN and R-NN were used for this purpose, and the prediction capability of the models was found to be up to r ≈ 0.7. Such prediction capability may allow the discovery of new materials [43].
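The leave-one-group-out idea mentioned above can be illustrated by the splitting step alone. In this minimal sketch each sample carries a group label, and every group is held out once while the model would be trained on the remaining groups; the group labels are hypothetical, and the grouping used in Ref. [40] is not specified here.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test

# Hypothetical group labels, e.g. one label per acceptor family
groups = ["A", "A", "B", "C", "C", "B"]
splits = list(leave_one_group_out(groups))

for held_out, train, test in splits:
    print(f"hold out {held_out}: train on {train}, test on {test}")
```

Because a whole group is withheld at once, the test score reflects how well a model extrapolates to a family of compounds it has never seen, which is a stricter check than randomly shuffled folds.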
4.1.3 Effects of Morphology
To predict the PCE of organic photovoltaics (OPVs), Random Forest (RF), Gradient Boosting (GB), Artificial Neural Network (ANN), and k-Nearest Neighbor (k-NN) algorithms have been used, with 13 important microscopic properties of the molecules considered as descriptors [35]. The thickness of the film, morphology, topology, and band gap have been studied to predict the performance of OPVs, but here we elaborate on the importance of donors and acceptors in photocurrent generation. Firstly, the Bayesian Machine Scientist (BMS) algorithm is used for the intrinsic JSC–vol% dependence of materials synthesized by a high-throughput screening method. Secondly, Random Decision Forest (RDF) is used for the normalized JSC–vol% of binary OPVs, and the mean absolute error (MAE) is below 0.20. The Decision Tree (DT) algorithm can be used to study the effects of hydrothermal experimental parameters on morphology, while the effect of synthetic parameters on the PCE of ZnO DSSCs can be investigated by ANN and Regression Tree (RT) [34]. Another study investigated the effect of annealing temperature on the morphology, grain size, and PCE of organic solar cells (OSCs). In a simulation-based analysis using the DT and Support Vector Machine (SVM) algorithms, the optimal grain size was found to be 9.5 nm at 110 °C [45].
4.2 Prediction of Optimal Device Structure
The RF algorithm was employed to analyze and optimize the effect of the mesoporous TiO2 structure on the efficiency of DSSCs and PSCs. Simulation data were collected from SCAPS-1D software, and more than 200 simulation samples were seeded. The thickness of mmp-TiO2 was increased from 0.0 to 0.5 μm, and that of comp-TiO2 from 0.1 to 1.0 μm [46].
4.2.1 Anti-Reflective Coating (ARC)
Yan et al. investigated anti-reflection (AR) coatings using a genetic algorithm [46]. Double-layer and four-discrete-layer low-refractive-index AR coatings of SiO2/TiO2, fabricated by RF magnetron sputtering, were compared. The four-discrete-layer low-refractive-index coating shows omnidirectional AR characteristics: a 34.4% improvement in Jsc is observed, which contributes to the enhancement of the photovoltaic efficiency of inverted metamorphic (IMM) solar cells [47]. The authors successfully deposited 4-layer AR coatings on a Si substrate, and the SEM results (at a scale of 200 nm) can be viewed in Fig. 8. The micrograph in Fig. 8a clearly shows the distinct layer boundaries. The SEM image further indicates the nanoporosity of the thin film and hence its lower refractive index. Figure 8b shows the top view of the same substrate. It can be seen that the diameter of the nanorods is much smaller than the wavelength of visible light. The Ant Colony Algorithm (ACA) was used to optimize a three-layer refractive-index coating for Si-based solar cells. An average reflectance of 2.98% is obtained when the angle of incidence varies from 0° to 80°, and the average reflectance rises to 6.56% when the range is extended to 0° to 90° [48]. A GA was used to find the optimal combination among almost 1.4 × 10^9 potential candidates. The optimal combination of a uniform and an opal-like structure of the active layer shows 98.1% quantum efficiency, which is higher than that of homogeneous photoactive perovskite materials [49]. AR coatings for CuInGaSe (CIGS) solar cells were investigated with the Simulated Annealing (SA) algorithm. An average reflectance of 8.46% is observed for a MgF2 layer over the wavelength range 350–1200 nm and angles of incidence from 0° to 80°.
On the other hand, a reduction of 11.30% in average reflection is reported when a double-layer SiO2/TiO2 AR coating is investigated over wavelengths from 400 to 1100 nm and incident angles from 0° to 80° [50].
Fig. 8 a SEM micrograph of multiple layers of AR coatings on Si wafer. b Top view of the substrate. Reprinted with permission from Ref. [47]. Copyrights Material Views 2012
Figure 9 shows the schematic of a typical perovskite solar cell. As indicated by the arrow, the incident radiation comes from a medium with n = 1.5. The incident ray first interacts with fluorine-doped tin oxide, commonly referred to as FTO. The “tFTO” refers to the thickness of this layer and is typically between 50 and 800 nm. Right above the perovskite layer is a uniform, homogeneous layer of TiO2 of thickness 40 nm that serves as a hole-blocking layer. Since the optical band gap of TiO2 corresponds to a wavelength of 385 nm, it is able to absorb some of the visible light. Below is the hybrid photoactive layer of methylammonium lead triiodide (MAPbI3) perovskite. It is arranged in a face-centered cubic (FCC) structure with an “ABC” stacking sequence. The radius of the spheres and the thickness of this perovskite layer are important attributes that play a significant role in determining its photoactive nature. It is pertinent to mention that the perovskite layer beneath induces a hybrid character in the photoactive layer. The “tuni” indicated in the diagram varies between 0 (absence of the layer) and 100 nm. The hole-transporting layer is placed below this uniform layer of perovskite and above the gold (Au) counter electrode. The “tspiro” ranges from 50 to 800 nm. An optimum thickness allows the layer to maximize the reflection of non-absorbed light by the Au counter electrode. Moreover, it aids constructive interference, which in turn increases the light absorption inside the hybrid photoactive layer. Generally, the thickness of Au is limited to 100 nm so that it is able to completely block and reflect the remaining light. In conclusion, the iQE value was about 2% higher than that of the unstructured photoactive layer [49].
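The kind of search described in this subsection can be sketched with a small genetic algorithm that tunes the two layer thicknesses of a SiO2/TiO2 coating to minimize the average reflectance. This is a simplified illustration and not the method of the cited works: it assumes normal incidence only, lossless layers, and constant refractive indices (about 1.46 for SiO2, 2.40 for TiO2, and 3.9 for Si, all chosen for illustration), whereas the cited studies consider incident angles up to 80° or 90°. The function names, thickness bounds, and GA settings are likewise hypothetical.

```python
import math
import random


def reflectance(n_layers, d_layers, wavelength, n_in=1.0, n_sub=3.9):
    """Normal-incidence reflectance of a lossless multilayer (characteristic-matrix method).

    Layers are listed from the incident medium towards the substrate;
    thicknesses and wavelength share the same length unit (nm here).
    """
    b, c = 1.0 + 0j, complex(n_sub)  # field vector at the substrate: [1, n_sub]
    for n, d in zip(reversed(n_layers), reversed(d_layers)):
        delta = 2 * math.pi * n * d / wavelength  # phase thickness of the layer
        cos_d, sin_d = math.cos(delta), math.sin(delta)
        b, c = cos_d * b + 1j * sin_d / n * c, 1j * n * sin_d * b + cos_d * c
    r = (n_in * b - c) / (n_in * b + c)
    return abs(r) ** 2


WAVELENGTHS = range(400, 1101, 50)  # nm, illustrative sampling of the spectrum


def mean_reflectance(thicknesses, indices=(1.46, 2.40)):
    """Average reflectance of a SiO2-on-TiO2 coating over the sampled wavelengths."""
    return sum(reflectance(indices, thicknesses, wl) for wl in WAVELENGTHS) / len(WAVELENGTHS)


def genetic_search(fitness, n_genes=2, bounds=(10.0, 200.0),
                   pop_size=30, generations=40, seed=0):
    """Minimize `fitness` with truncation selection, arithmetic crossover, and mutation."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]  # arithmetic crossover
            # Gaussian mutation, clipped to the allowed thickness range
            child = [min(hi, max(lo, g + rng.gauss(0.0, 5.0))) for g in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)


best_thicknesses = genetic_search(mean_reflectance)
best_mean_reflectance = mean_reflectance(best_thicknesses)
```

Because the best half of each generation is carried over unchanged, the best fitness never gets worse from one generation to the next. Extending the sketch to oblique incidence would require polarization-dependent admittances and a wavelength-dependent refractive index, which are omitted here for brevity.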
4.2.2 Effects of Light Scattering
Generally, absorbance is improved by introducing a light-trapping structure, but this also causes electrical degradation. Wang and co-worker employed finite element method (FEM) software to address the issue of electrical degradation [50]. Optical properties were improved by introducing GaP (high refractive index > 3) as the first layer, and a flat a-Si:H active layer was used to avoid electrical degradation in hydrogenated amorphous silicon (a-Si:H) solar cells. Through this strategy, the power conversion efficiency of the solar cells was enhanced [51]. Alsaigh et al. [52] improved the power conversion efficiency of crystalline and amorphous solar cells by adding multi-layer optical designs as the top layer. Two designs were highlighted: an inverted multi-element lenslet array (MELA) and a non-uniform MELA. These layers reduce reflection losses and promote light trapping over a wide range of incident angles. The optical simulations for these designs were performed with COMSOL Multiphysics [53]. Further work on the prediction of optimal solar cells (device optimization) through machine learning is summarized in Table 2.
Fig. 9 Schematic of perovskite solar cell composed of hybrid photoactive structured and homogeneous layer designed to increase the light absorption. Reprinted with permission from Ref. [49]. Copyrights 2020 MDPI
Table 2 List of ML models used to predict optimal device structure
| Sr. No | Material | Properties | ML (model) | References |
|---|---|---|---|---|
| 1. | SiO2/TiO2 | Anti-reflection coating | GA | [54, 55] |
| 2. | TiO2 | Light scattering | RF | [56] |
| 3. | – | Doping profile | GA | [57] |
| 4. | Polymer-fullerene | Charge transfer, optoelectronic properties, PCE of device | RF, ANN | [58] |
| 5. | GaN/AlGaN quantum wells | Structural parameter of GaN/AlGaN quantum well | GA | [12] |
5 Conclusion and Future Recommendations
This book chapter aimed to introduce the importance of machine learning in materials science and engineering. It reviewed solar cells and relevant ML algorithms such as ANN, DT, and NB. The review was classified into four categories. The first category briefly introduced the workflow of machine learning, e.g., raw data collection and feature engineering, selection and evaluation of models, and applications. The second category briefly introduced ML algorithms that play a role in efficient material prediction and device optimization. The third category reviewed the impact of morphology, band gap, non-fullerene acceptors, solubility, and charge transfer on the PCE of solar cells. Furthermore, device optimization was investigated, e.g., anti-reflective coatings of low-refractive-index materials and light trapping with high-refractive-index materials. GA and ANN algorithms are extensively used for this purpose because they are user-friendly. This review highlights that ML has considerable potential to accelerate the discovery of high-performance, efficient materials for solar cells. Recently, research on ML for Organic Solar Cells (OSCs) has increased tremendously. The performance of OSCs depends specifically on solvents, crystallinity, the molecular orientation of the absorbing layer, and the morphology of the active and interfacial layers. The complex nature of organics demands more efficient, eco-economic, and eco-friendly ML models, because photovoltaic phenomena are related to microscopic properties and require high-accuracy quantum calculations. High accuracy requires large-scale virtual screening, but the high computational cost makes such screening difficult. To trade off cost against accuracy, molecular fingerprints and descriptors are used for organic solar cells.
Experimental validation is another important requirement for obtaining fruitful results from ML algorithms. Materials are screened through heuristic rules, but it is not certain whether they are synthesizable [60]. Few papers in the literature report such validation; for example, Saeki et al. synthesized a donor and fabricated a device whose PCE was 0.53%, whereas the RF model had predicted 5.0–5.8% [58]. Much more work is needed in this respect. At present, some degree of programming ability is required to use ML tools. This is an easy task for those with deep knowledge of data science and programming, but many materials scientists working on OSCs do not have a good command of ML tools, which may lead to incorrect interpretation of results. This problem can be solved by providing user-friendly graphical interfaces for materials scientists and engineers. With the evolution of modern photo-energy sensitization in solar cells, i.e., quantum dots (QDs) and metal clusters (MCs), ML tools have gained even more importance, as it has become critical to control the size of these sensitizers (i.e., QDs/MCs) to achieve well-optimized solar light harvesting, electron-hole transfer, and recombination control. All such characteristics often depend on the size of the photosensitizers in emerging photovoltaic devices and on the relative energy band positions. The experimental development of such materials is quite expensive compared to conventional absorbing
layers, which makes it vital to investigate and apply the ML concepts in solar cell technology for the cost-effective and time-saving development of modern solar cells.
References
1. Shaikh, M. R., Shaikh, S., Waghmare, S., Labade, S., & Tekale, A. (2017). A review paper on electricity generation from solar energy. International Journal for Research in Applied Science and Engineering Technology, 887. https://doi.org/10.22214/ijraset.2017.9272
2. This month in physics history. https://www.aps.org/publications/apsnews/200904/physicshistory.cfm
3. Fraas, L. M. (2014). History of solar cell development. In Low-Cost Solar Electric Power (p. 1).
4. Ibn-Mohammed, T., et al. (2017). Perovskite solar cells: An integrated hybrid lifecycle assessment and review in comparison with other photovoltaic technologies. Renewable and Sustainable Energy Reviews, 80, 1321–1344. https://doi.org/10.1016/j.rser.2017.05.095
5. Systematic review elucidating the generations and classifications of solar cells contributing towards environmental sustainability integration. Reviews in Inorganic Chemistry. https://www.researchgate.net/publication/343261055_Systematic_review_elucidating_the_generations_and_classifications_of_solar_cells_contributing_towards_environmental_sustainability_integration
6. Ballaji, A., Mh, A., Swamy, K., Oommen, S., & Ankaiah, B. (2019). A detailed study on different generations of solar cell technologies with present scenario of solar PV efficiency and effect of cost on solar PV panel. International Journal of Research in Advent Technology, 7, 364–372. https://doi.org/10.32622/ijrat.74201963
7. Review on life cycle assessment of solar photovoltaic panels. https://www.researchgate.net/publication/338384189_Review_on_Life_Cycle_Assessment_of_Solar_Photovoltaic_Panels
8. Pseudo-halide anion engineering for α-FAPbI3 perovskite solar cells. https://www.researchgate.net/publication/350641338_Pseudohalide_anion_engineering_for_a-FAPbI3_perovskite_solar_cells
9. Monolithic perovskite/silicon tandem solar cell with >29% efficiency by enhanced hole extraction. Science. https://www.science.org/doi/10.1126/science.abd4016
10. Chebrolu, V. T., & Kim, H.-J. (2019). Recent progress in quantum dot sensitized solar cells: An inclusive review of photoanode, sensitizer, electrolyte, and the counter electrode. Journal of Materials Chemistry C, 7(17), 4911–4933. https://doi.org/10.1039/C8TC06476H
11. Choudhary, R., & Gianey, H. K. (2017). Comprehensive review on supervised machine learning algorithms. In 2017 International Conference on Machine Learning and Data Science (MLDS) (pp. 37–43). https://doi.org/10.1109/MLDS.2017.11
12. Mahmood, A., & Wang, J.-L. (2021). Machine learning for high performance organic solar cells: Current scenario and future prospects. Energy & Environmental Science, 14(1), 90–105. https://doi.org/10.1039/D0EE02838J
13. Parikh, N., et al. (2022). Is machine learning redefining the perovskite solar cells? Journal of Energy Chemistry, 66, 74–90. https://doi.org/10.1016/j.jechem.2021.07.020
14. Practical Machine Learning in R. Wiley. https://www.wiley.com/en-us/Practical+Machine+Learning+in+R-p-9781119591535
15. Abdualgalil, B., & Abraham, S. (2020). Applications of machine learning algorithms and performance comparison: A review. In 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE) (pp. 1–6). https://doi.org/10.1109/ic-ETITE47903.2020.490
16. A review on machine learning algorithms to predict daylighting inside buildings. https://www.sciencedirect.com/science/article/abs/pii/S0038092X20303509
17. Review on machine learning techniques for developing pavement performance prediction models. Sustainability. https://www.mdpi.com/2071-1050/13/9/5248
18. Sun, W., et al. (2019). Machine learning-assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials. Science Advances, 5(11), eaay4275. https://doi.org/10.1126/sciadv.aay4275
19. Padula, D., & Troisi, A. (2019). Concurrent optimization of organic donor-acceptor pairs through machine learning. Advanced Energy Materials, 9(40), 1902463. https://doi.org/10.1002/aenm.201902463
20. Machine learning for accelerating the discovery of high-performance donor/acceptor pairs in non-fullerene organic solar cells. NPJ Computational Materials. https://www.nature.com/articles/s41524-020-00388-2
21. Effect of increasing the descriptor set on machine learning prediction of small molecule-based organic solar cells. Chemistry of Materials. https://pubs.acs.org/doi/abs/10.1021/acs.chemmater.0c02325
22. Pokuri, B. S. S., Ghosal, S., Kokate, A., Sarkar, S., & Ganapathysubramanian, B. (2019). Interpretable deep learning for guided microstructure-property explorations in photovoltaics. NPJ Computational Materials, 5(1). https://doi.org/10.1038/s41524-019-0231-y
23. Sahu, H., & Ma, H. (2019). Unraveling correlations between molecular properties and device parameters of organic solar cells using machine learning. The Journal of Physical Chemistry Letters, 10(22), 7277–7284. https://doi.org/10.1021/acs.jpclett.9b02772
24. Majeed, N., Saladina, M., Krompiec, M., Greedy, S., Deibel, C., & MacKenzie, R. C. I. (2020). Using deep machine learning to understand the physical performance bottlenecks in novel thin-film solar cells. Advanced Functional Materials, 30(7), 1907259. https://doi.org/10.1002/adfm.201907259
25. Pilania, G., Balachandran, P. V., Kim, C., & Lookman, T. (2016). Finding new perovskite halides via machine learning. Frontiers in Materials, 3. https://www.frontiersin.org/articles/10.3389/fmats.2016.00019
26. Boosting photoelectric performance of thin film GaAs solar cell based on multi-objective optimization for solar energy utilization. Solar Energy, 230, 1122–1132. https://doi.org/10.1016/j.solener.2021.11.031
27. A review on machine learning algorithms, tasks and applications. https://www.researchgate.net/publication/320609700_A_Review_on_Machine_Learning_Algorithms_Tasks_and_Applications
28. Kim, S. M., Naqvi, S. D. H., Kang, M. G., Song, H.-E., & Ahn, S. (2022). Optical characterization and prediction with neural network modeling of various stoichiometries of perovskite materials using a hyperregression method. Nanomaterials (Basel, Switzerland), 12(6), 932. https://doi.org/10.3390/nano12060932
29. Zhang, Q., et al. (2022). High-efficiency non-fullerene acceptors developed by machine learning and quantum chemistry. Advanced Science, 9(6), 2104742. https://doi.org/10.1002/advs.202104742
30. Ye, Z., & Ouyang, D. (2021). Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms. Journal of Cheminformatics, 13(1), 98. https://doi.org/10.1186/s13321-021-00575-3
31. Accelerated discovery of high-efficient N-annulated perylene organic sensitizers for solar cells via machine learning and quantum chemistry. https://www.sciencedirect.com/science/article/abs/pii/S2352492820326155
32. Machine learning-based charge transport computation for pentacene. Lederer, 2019. Advanced Theory and Simulations. https://onlinelibrary.wiley.com/doi/abs/10.1002/adts.201800136
33. Analysis and prediction of hydrothermally synthesized ZnO-based dye-sensitized solar cell properties using statistical and machine-learning techniques. ACS Omega. https://pubs.acs.org/doi/10.1021/acsomega.1c04521
34. Machine learning stability and band gap of lead-free halide double perovskite materials for perovskite solar cells. https://www.sciencedirect.com/science/article/abs/pii/S0038092X21007878
35. Weston, L., & Stampfl, C. (2018). Physical Review Materials, 2(8), 085407. https://doi.org/10.1103/PhysRevMaterials.2.085407
36. Machine learning approach to delineate the impact of material properties on solar cell device physics. ACS Omega. https://pubs.acs.org/doi/10.1021/acsomega.2c01076
37. Prediction model for the performance of different PV modules using artificial neural networks. Applied Sciences. https://www.mdpi.com/2076-3417/12/7/3349/htm
38. Huwig, K., Fan, C., & Springborg, M. (2017). From properties to materials: An efficient and simple approach. The Journal of Chemical Physics, 147(23), 234105. https://doi.org/10.1063/1.5009548
39. Predictions and strategies learned from machine learning to develop high-performing perovskite solar cells. Li, 2019. Advanced Energy Materials. https://onlinelibrary.wiley.com/doi/abs/10.1002/aenm.201901891
40. Zhao, Z.-W., del Cueto, M., & Troisi, A. (2022). Limitations of machine learning models when predicting compounds with completely new chemistries: Possible improvements applied to the discovery of new non-fullerene acceptors. Digital Discovery, 1(3), 266–276. https://doi.org/10.1039/D2DD00004K
41. Mahmood, A., Tang, A., Wang, X., & Zhou, E. (2019). First-principles theoretical designing of planar non-fullerene small molecular acceptors for organic solar cells: Manipulation of noncovalent interactions. Physical Chemistry Chemical Physics, 21(4), 2128–2139. https://doi.org/10.1039/C8CP05763J
42. Xiao, B., et al. (2017). Non-fullerene acceptors with A2=A1–D–A1=A2 skeleton containing benzothiadiazole and thiazolidine-2,4-dione for high-performance P3HT-based organic solar cells. Solar RRL, 1(11), 1700166. https://doi.org/10.1002/solr.201700166
43. Combining electronic and structural features in machine learning models to predict organic solar cells properties. Materials Horizons (RSC Publishing). https://pubs.rsc.org/en/content/articlelanding/2019/mh/c8mh01135d
44. Lan, F., Jiang, M., Wei, F., Tao, Q., & Li, G. (2016). Study of annealing induced nanoscale morphology change in organic solar cells with machine learning. In 2016 IEEE 16th International Conference on Nanotechnology (IEEE-NANO) (pp. 329–332). https://doi.org/10.1109/NANO.2016.7751398
45. Al-Saban, O., & Abdellatif, S. O. (2021). Optoelectronic materials informatics: Utilizing random-forest machine learning in optimizing the harvesting capabilities of mesostructured-based solar cells. In 2021 International Telecommunications Conference (ITC-Egypt) (pp. 1–4). https://doi.org/10.1109/ITC-Egypt52936.2021.9513898
46. Yan, X., et al. (2013). Enhanced omnidirectional photovoltaic performance of solar cells using multiple-discrete-layer tailored- and low-refractive index anti-reflection coatings. Advanced Functional Materials, 23(5), 583–590. https://doi.org/10.1002/adfm.201201032
47. Guo, X., et al. (2014). Design of broadband omnidirectional antireflection coatings using ant colony algorithm. Optics Express, 22(104), A1137–A1144. https://doi.org/10.1364/OE.22.0A1137
48. Lobet, M., et al. (2020). Opal-like photonic structuring of perovskite solar cells using a genetic algorithm approach. Applied Sciences, 10(5). https://doi.org/10.3390/app10051783
49. Broadband omnidirectional antireflection coatings for metal-backed solar cells optimized using simulated annealing algorithm incorporated with solar spectrum. https://opg.optica.org/oe/abstract.cfm?uri=oe-19-s4-a87
50. Wang, D., & Su, G. (2015). New strategy to promote conversion efficiency using high-index nanostructures in thin-film solar cells. Scientific Reports, 4(1), 7165. https://doi.org/10.1038/srep07165
51. Jäger, K., Fischer, M., van Swaaij, R. A. C. M. M., & Zeman, M. (2013). Designing optimized nano textures for thin-film silicon solar cells. Optics Express, 21(S4), A656. https://doi.org/10.1364/OE.21.00A656
52. Alsaigh, R. E., Bauer, R., & Lavery, M. P. J. (2020). Multi-layer light trapping structures for enhanced solar collection. Optics Express, 28(21), 31714–31728. https://doi.org/10.1364/OE.403990
53. Schubert, M. F., Mont, F. W., Chhajed, S., Poxson, D. J., Kim, J. K., & Schubert, E. F. (2008). Design of multilayer antireflection coatings made from co-sputtered and low-refractive-index materials by genetic algorithm. Optics Express, 16(8), 5290–5298. https://doi.org/10.1364/OE.16.005290
54. Zhang, Y.-J., Li, Y.-J., Lin, J., Fang, C.-L., & Liu, S.-Y. (2018). Application of millimeter-sized polymer cylindrical lens array concentrators in solar cells. Chinese Physics B, 27(5), 058801. https://doi.org/10.1088/1674-1056/27/5/058801
55. Al-Sabana, O., & Abdellatif, S. O. (2022). Optoelectronic devices informatics: Optimizing DSSC performance using random-forest machine learning algorithm. Optoelectronics Letters, 18(3), 148–151. https://doi.org/10.1007/s11801-022-1115-9
56. Alì, G., Butera, F., & Rotundo, N. (2013). Geometrical and physical optimization of a photovoltaic cell by means of a genetic algorithm. Journal of Computational Electronics, 13(1), 323.
57. Nagasawa, S., Al-Naamani, E., & Saeki, A. (2018). Computer-aided screening of conjugated polymers for organic solar cell: Classification by random forest. The Journal of Physical Chemistry Letters. https://doi.org/10.1021/acs.jpclett.8b00635
58. Radosavljević, S., Radovanović, J., Milanović, V., & Tomić, S. (2014). Frequency upconversion in nonpolar a-plane GaN/AlGaN based multiple quantum wells optimized for applications with silicon solar cells. Journal of Applied Physics, 116(3), 033703. https://doi.org/10.1063/1.4890029
Machine Learning-Driven Gas Identification in Gas Sensors
Shirong Huang, Alexander Croy, Bergoi Ibarlucea, and Gianaurelio Cuniberti
Abstract Gas identification plays a critical role in characterizing our (chemical) environment. It makes it possible to warn of hazardous gases and may help to diagnose medical conditions. Miniaturized gas sensors, and especially those based on chemiresistive detection mechanisms, have undergone rapid development and commercialization in the past decades due to their numerous advantageous characteristics, such as simple fabrication, easy operation, high sensitivity, ability to detect a wide range of gases, and compatibility with miniaturization as well as integration for portable applications. However, they suffer from a remarkable limitation, namely their low selectivity. Recently, machine learning-driven approaches to enhance the selectivity of gas sensors have attracted considerable interest in the gas sensor community, increasing the ability to identify analyte gases. In this chapter, we first introduce the general approaches to enhancing the selectivity of gas sensors with machine learning techniques, which consist of the architecture scheme design of gas sensors (sensor array and single sensor architecture), the selection of gas sensing response features (steady-state feature and transient-state feature), and the utilization of gas sensing signal modulation techniques (sensing materials modulation, concentration modulation, and temperature modulation). Afterward, a specific application case using a machine learning-enabled smart gas sensor for the identification of industrial gases (PH3 and NH3) is presented, which is based on a single-channel device and utilizes multiple transient features of the response. We believe machine learning in combination with efficient sensing signal modulation techniques could be a feasible way to provide gas sensors with gas identification capability.
Keywords Chemiresistive gas sensors · Selectivity · Machine learning · Electronic nose · Transient response features · Signal modulation
S. Huang (B) · B. Ibarlucea · G. Cuniberti (B) Institute for Materials Science and Max Bergmann Center for Biomaterials, TU Dresden, 01062 Dresden, Germany e-mail: [email protected]
G. Cuniberti e-mail: [email protected]
A. Croy Institute for Physical Chemistry, Friedrich Schiller University Jena, Helmholtzweg 4, 07743 Jena, Germany
G. Cuniberti Dresden Center for Computational Materials Science (DCMS), TU Dresden, 01062 Dresden, Germany
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_2
1 Introduction
Air pollution is the presence of substances in the atmosphere that are detrimental to the health of humans and other living beings; it is presently one of the major global social and environmental issues, with terrible consequences. According to an estimate by the World Health Organization (WHO), air pollution kills seven million people around the world annually [1]. The European Union (EU) has identified seven main air pollutants (excluding greenhouse gases): nitrogen oxides (NOx), carbon monoxide (CO), particulate matter (PM), sulfur dioxide (SO2), ozone (O3), ammonia (NH3), and volatile organic compounds (VOCs) [2]. Take NH3 as an example: it is an inorganic compound widely utilized in many industrial processes, such as manufacturing, food processing, refrigeration systems, and fertilizer production [3]. However, even exposure to ultra-low concentrations of NH3 adversely affects human health [4]. Another example is phosphine (PH3), a colorless, flammable, toxic gas that is widely used in the semiconductor industry (silicon semiconductors and photovoltaic process applications) [5]. Nevertheless, PH3 is extremely toxic and exhibits an acute lethal effect on humans and animals by inhibiting aerobic respiration even at extremely low concentrations. Another large category of pollutants, volatile organic compounds (VOCs), are organic chemicals with a high vapor pressure at room temperature, some of which are prominent and representative indoor pollutants, such as benzene, toluene, ethylbenzene, xylene (BTEX), formaldehyde, and acetaldehyde [6]. The United States Environmental Protection Agency (U.S. EPA) estimates that the VOC level in indoor air is typically 2–5 times higher than that of outdoor air [7]. Commonplace items in our dwellings such as building materials, paints, furniture, cleaning agents, and cosmetics are potential sources of VOCs.
However, considerable evidence suggests that a substantial number of these VOCs could cause adverse health effects, including sensory irritation, respiratory symptoms (asthma, allergy, etc.), and even cancer [8], as shown in Fig. 1. Indoor air quality in residential units and workplaces has been a serious concern, since human beings spend more than 80% of their lifetime indoors, including domestic residences and workplaces [9].
Fig. 1 Major impacts of environmental pollution on human health. Adapted with permission from [12]
Gas sensors play vital roles in monitoring and detecting hazardous gases, ensuring public safety and air quality, and analyzing environments in many different fields [10]. They are useful not only for monitoring toxic gases emitted into the atmosphere by industry, but also for controlling indoor air quality [11] and for safety in vehicles. Their industrial applications are virtually countless and span many branches, including automotive, underground mining, the gas and oil industry, the petrochemical industry, etc. [13]. In most of these applications, gas sensors can be interfaced with a control system so that emergency measures can be taken automatically [14]. Moreover, they can sound an alert to operators in the working area where a leak is occurring, warning them to evacuate the dangerous area [15]. In this manner, gas sensors can prevent accidents caused by gas leakages, thereby saving lives. Therefore, the development of highly efficient, sensitive, reliable, low-cost gas sensors is of great significance for guaranteeing work safety and human health. The implementation of machine learning has become a recent trend for enhancing gas sensor performance, especially for gas classification and identification tasks. The structure of this chapter is as follows. First, the relationship between air pollution and human health is introduced. The second section provides an overview of gas sensor classification in terms of the transduction mode, with a focus on chemiresistive sensors. The electronic nose is proposed as an effective solution to enhance the selectivity and gas identification performance of gas sensors. In the third section, gas sensing response features, consisting of both steady-state and transient-state features, are presented. In the fourth section, general approaches for gas sensing signal modulation are introduced, including sensing materials modulation, gas concentration modulation, and working temperature modulation. Following that, an application case using a single-channel gas sensor (or virtual sensor array
e-nose) for industrial gas identification (PH3 and NH3) is presented. Finally, a summary and outlook are given.
2 Gas Sensor and Electronic Nose
2.1 Gas Sensors Classification
Gas sensors generally consist of a receptor component and a transducer component [16], as shown schematically in Fig. 2. The receptor is the sensing material, which upon exposure to the analyte gas changes its physical properties, such as conductivity (σ), permittivity (ε), work function (ϕ), and mass (m), or responds by emitting heat or light [17]. The transducer component, on the other hand, is responsible for converting such a variation into a variation of electrical parameters, such as resistance (R), capacitance (C), and inductance (L). Finally, the circuit connected to the sensor gives rise to the sensing signal (e.g., sensing response), which is either a current (I) or a voltage (V) signal. According to the working principle of the transducer, there are approximately six general categories of gas sensors: electrochemical, optical, mass-sensitive, calorimetric, magnetic, and electrical [18]. Electrical gas sensors are the most popular type and comprise a large family of gas sensors, such as the chemiresistive type (conductometric sensors including sensing materials such as metal oxides, metals, polymers, one- or two-dimensional materials, or other semiconductors), the capacitive type, and the semiconductor type (e.g., work function, Schottky barrier, FET, etc.). So far, numerous gas sensors have been developed based on the aforementioned working principles. Nevertheless, the majority of them show some common drawbacks, such as relatively high cost, sophisticated design, a need for additional equipment, bulkiness, or lack of portability [19]. Chemiresistive sensors have attracted much attention and become an attractive research topic due to their many advantageous characteristics, such as high sensitivity, easy operation, simple fabrication/circuitry, fast response/recovery, good stability, low cost, ability to detect a wide spectrum of gases, and compatibility with miniaturization as well as integration for portable applications [20]. For these reasons, we focus on chemiresistive gas sensors in this chapter.
Fig. 2 Schematic diagram of electrical gas sensor working principle
2.2 Characteristics of Chemiresistive Type Gas Sensors
Chemiresistive sensors operate through a change in resistivity upon exposure to analyte gases [21], as illustrated in Fig. 3. Gas molecules arrive at the surface of the sensor and adsorb onto the sensing material by physical adsorption, chemical adsorption, or both. Charge exchange then takes place between the gas molecules and the sensing material, changing the charge carrier population of the sensing material. Both the number of adsorbed gas molecules and the amount of transferred charge depend on the intrinsic properties of the gas molecules. The sensing signal is characterized by the conductance or resistance of the sensor, and the response sensitivity is usually employed as the main output feature.
Fig. 3 Typical structure of chemiresistive type gas sensor, including interdigital electrodes (IDEs), sensing materials, silicon substrate, and heating element
Despite the aforementioned advantages, the major issue encountered by chemiresistive sensors is their intrinsic cross-sensitivity toward various kinds of gases and volatile organic compounds, namely poor selectivity. This limitation poses a great challenge for their application, as they fail to deliver correct gas detection results, in particular in complex settings without prior information on the surrounding gas species. Selectivity is one of the most crucial figures of merit for gas sensors in trace-gas sensing [22]. Therefore, it is of great significance to endow chemiresistive gas sensors with gas identification capability. To evaluate the sensing performance, some critical quality indicators have to be taken into consideration, such as sensitivity, selectivity, response/recovery time, limit of detection and resolution, stability, and working temperature [23]. In a chemiresistive gas sensor, sensitivity (S) refers to the relative change of resistance upon exposure to analyte gas [24], which can be described as the ratio between the absolute resistance change upon exposure to the analyte gas and the resistance in the reference medium (e.g., ambient air):
$$S(\%) = \frac{R_g - R_a}{R_a} \times 100,$$
where Ra is the resistance of the gas sensor in the reference medium, also termed the baseline resistance, and Rg is the resistance of the gas sensor upon exposure to the analyte gas. Selectivity indicates the capability of a gas sensor to discriminate the target gas from interfering gases and to exhibit a target-specific sensor response [25]. Response time characterizes the period during which the resistance value increases/decreases by a certain percentage (e.g., 90%) of its baseline value at a given gas concentration [27]. Likewise, recovery time is the period required for the gas sensor to recover to its baseline value fully or partially (e.g., 90%) after the analyte gas supply is switched off. The limit of detection represents the lowest gas concentration that can be distinguished from the noise [28]. The smallest distinguishable concentration difference corresponds to the resolution of the gas sensor. Stability denotes the ability to provide reproducible measurement results over prolonged use.
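The sensitivity formula and the response/recovery times can be illustrated with a minimal Python sketch. The traces below are hypothetical values, not measured data, and the helper names are illustrative. The sketch assumes the common t90 reading of the definition above, namely the time needed to cover 90% of the total resistance change between the start and end levels; other conventions exist.

```python
def sensitivity_percent(r_gas, r_air):
    """Sensitivity S(%) = (Rg - Ra) / Ra * 100."""
    return (r_gas - r_air) / r_air * 100.0


def time_to_fraction(times, resistances, r_start, r_end, fraction=0.9):
    """Elapsed time until the trace has covered `fraction` of the change
    from r_start to r_end; returns None if the threshold is never reached."""
    threshold = r_start + fraction * (r_end - r_start)
    rising = r_end >= r_start
    for t, r in zip(times, resistances):
        reached = r >= threshold if rising else r <= threshold
        if reached:
            return t - times[0]
    return None


# Hypothetical exposure trace (time in s, resistance in ohm).
exposure_t = [0, 1, 2, 3, 4, 5]
exposure_r = [100.0, 140.0, 170.0, 192.0, 198.0, 200.0]
# Hypothetical recovery trace after the analyte supply is switched off.
recovery_t = [5, 6, 7, 8]
recovery_r = [200.0, 150.0, 112.0, 101.0]

s_percent = sensitivity_percent(r_gas=200.0, r_air=100.0)
t_response = time_to_fraction(exposure_t, exposure_r, r_start=100.0, r_end=200.0)
t_recovery = time_to_fraction(recovery_t, recovery_r, r_start=200.0, r_end=100.0)
print(s_percent, t_response, t_recovery)
```

In practice the start and end levels would be taken from the stabilized baseline and the saturated response, which is why noisy traces are usually smoothed before such thresholds are evaluated.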
2.3 Gas Sensor with Identification Capability: Electronic Nose
The human nose can smell and identify a great number of odorants through the sense of smell, a sensory function termed olfaction. The olfactory system is responsible for the perception of odorant molecules and spans the region from the olfactory epithelium to the olfactory cortex in the human brain. The working mechanism of the human olfactory system is shown schematically in Fig. 4 (top part). Odorant molecules first reach the mucus-covered olfactory epithelium in the nasal cavity and then bind to the olfactory receptors, which reside in the cell membrane of the olfactory sensory neurons [29]. These binding events alter the conformation of olfactory receptor proteins, triggering signal transduction through the IP3 pathway or the cAMP pathway [30]. The signal is finally transmitted to the olfactory cortex in the brain [31]. Our nose can identify a perceived odor after prior perception training on odors. It has been reported that there are approximately 400 functional olfactory receptors in the human olfactory system and that the human nose can identify approximately 10,000 odorants through the combination of various olfactory receptors [32].
Fig. 4 The work principle of gas identification (e.g., coffee quality identification) in the e-nose system, which mimics the human olfactory system [26]
Inspired by the human olfactory system, researchers have found a way to give gas sensors gas identification capability by coupling multiple gas sensors or gas sensor arrays with artificial intelligence techniques, an approach known as the electronic nose (e-nose) or artificial nose. The concept of the e-nose was first proposed by Prof. Julian Gardner in 1994, referring to an electronic instrument composed of an array of electronic chemical sensors (hardware) with partial specificity and an appropriate pattern-recognition system (software), capable of recognizing simple or complex odors [33]. The working principle of e-noses is shown in Fig. 4 (bottom part) and mimics the working principle of the human olfactory system shown in Fig. 4 (top part). The typical working procedure consists of four steps: (i) adsorption interaction of odorant molecules on sensing materials, (ii) odor sensing response signal generation and signal acquisition, (iii) odor signal processing and feature extraction, and (iv) classifier model training and odor identification by artificial intelligence techniques. Upon exposure to a specific odor (individual odorant or odorant mixture), each sensor in the array generates a sensing response feature, and all the sensing response features contributed by the sensor array constitute the odor fingerprint.
The dimension of the odor fingerprint is proportional to the number of gas sensors in the sensor array. Since chemiresistive gas sensors are cross-sensitive to a variety of gases and VOCs, each odor can be represented by a unique fingerprint vector generated by the sensor array. Lastly, the fingerprint vectors of various gases are used as input for machine learning/deep learning techniques, which are trained to identify odors or gases.
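The fingerprint idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-sensor array, the gas labels, and all response values are hypothetical, and a nearest-centroid rule stands in for the machine learning/deep learning classifier, which the text does not specify at this point.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equally long fingerprint vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_centroid(sample, centroids):
    """Return the label whose centroid is closest (Euclidean) to the sample."""
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))


# Hypothetical training fingerprints: one response feature per sensor,
# so a 3-sensor array yields 3-dimensional fingerprint vectors.
training = {
    "gas_A": [[0.80, 0.10, 0.30], [0.78, 0.12, 0.33]],
    "gas_B": [[0.20, 0.60, 0.50], [0.22, 0.58, 0.47]],
}
centroids = {label: centroid(vectors) for label, vectors in training.items()}

# Identify two unseen (hypothetical) fingerprints.
label_1 = nearest_centroid([0.75, 0.15, 0.30], centroids)
label_2 = nearest_centroid([0.25, 0.55, 0.50], centroids)
print(label_1, label_2)
```

Adding a sensor to the array simply appends one component to every fingerprint, which is the proportionality between fingerprint dimension and array size described above.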
3 Gas Sensing Response Features
There are three typical ways to improve the gas identification performance of electronic noses: optimizing the sensing element materials, selecting enhanced response features, and optimizing the classifier algorithms. In particular, feature selection plays a critical role in enhancing the performance of an electronic nose. Consequently, a variety of sensing response features have been proposed. Depending on the feature extraction method, sensing response features are categorized into three groups: features extracted directly from the original response curve (e.g., steady-state maximum response, integrals, etc.), features extracted from mathematical models fitted to the response curve (e.g., exponential fitting parameters, polynomial fitting parameters, etc.), and features extracted from transformed response curves (e.g., Fourier transform, discrete wavelet transform, etc.). In terms of the saturation status of the acquired response curve, gas sensing response features are categorized into two groups: steady-state features extracted from the steady-state response curve (usually achieved after a long exposure, once the sensor reaches saturation), and transient-state features extracted from the transient-state response curve (achieved after a short exposure, before the sensor reaches saturation). Because most gas sensors suffer from resistance drift, transient-state features have recently gained much attention in the e-nose community [34, 35]. It has been reported that transient-state response features can significantly enhance the gas discrimination performance of an e-nose and prolong the lifetime of the e-nose system due to reduced acquisition time [36, 37]. Steady-state and transient-state features are presented in turn in the following sections.
3.1 Steady-State Features
The most well-known and most common steady-state feature used in e-nose systems is the maximum response value Smax, which represents the maximum fractional change of sensor resistance at saturation upon exposure to the analyte gas, relative to the sensor resistance at baseline upon exposure to the reference gas (such as ambient air), as shown in Fig. 5. In addition to the maximum fractional change of sensor resistance (fractional response), there are other maximum resistance-derived steady-state features, such as the maximum differential change of sensor resistance (differential response), maximum relative change of sensor resistance (relative response), maximum logarithmic change of sensor resistance (logarithm response), and maximum normalized change of sensor resistance (normalization response), as summarized in Table 1. The differential method can eliminate the additive noise or drift in the sensing signal, while the relative method can eliminate the multiplicative drift incurred by temperature shift and provides a dimensionless response [38]. Using the fractional method, both additive drift and multiplicative drift can be removed. The fractional response feature is dimensionless and normalized, which can compensate for sensors that exhibit intrinsically large (or small) response levels. The log method is preferred when the gas concentration varies over a large range, and it helps linearize the nonlinear relationship between gas concentration and sensor resistance [35]. With normalization, the sensor response outputs are scaled to the same magnitude, which is suitable for applications in which qualitative rather than quantitative gas analysis is the main interest.
Fig. 5 Characteristic response of chemiresistive type gas sensor. a The concentration profile of analyte gas in the gas chamber, gas on from ta to tb. b The signal response profile of chemiresistive gas sensor exposed to analyte gas displayed in (a). Ra is sensor resistance at baseline exposed to ambient air, Rg is sensor resistance upon exposure to analyte gas, and Rgmax is sensor maximum resistance upon exposure to analyte gas
Table 1 Maximum resistance-derived steady-state features of gas sensing response
| Sensor response models | Descriptions |
|---|---|
| Differential response | $S_{max} = R_{gmax} - R_a$ |
| Relative response | $S_{max} = \frac{R_{gmax}}{R_a}$ |
| Fractional response | $S_{max} = \frac{R_{gmax} - R_a}{R_a}$ |
| Logarithm response | $S_{max} = \log \frac{R_{gmax}}{R_a}$ |
| Normalization response | $S_{max} = \frac{R_{gmax}}{R_{gmax} - R_a}$ |
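The response models of Table 1 translate directly into code. The following Python sketch covers the differential, relative, fractional, and logarithm responses; the normalization response is left out, and base-10 logarithm is an assumption because the table does not state the base. The resistance and drift values are hypothetical. The two checks at the end show the drift behavior stated above for the differential method (additive drift cancels) and the relative method (multiplicative drift cancels).

```python
import math


def differential_response(r_gmax, r_a):
    """S_max = R_gmax - R_a."""
    return r_gmax - r_a


def relative_response(r_gmax, r_a):
    """S_max = R_gmax / R_a (dimensionless)."""
    return r_gmax / r_a


def fractional_response(r_gmax, r_a):
    """S_max = (R_gmax - R_a) / R_a (dimensionless)."""
    return (r_gmax - r_a) / r_a


def logarithm_response(r_gmax, r_a):
    """S_max = log10(R_gmax / R_a); base 10 is assumed here."""
    return math.log10(r_gmax / r_a)


# Hypothetical baseline and maximum resistances (ohm).
r_a, r_gmax = 100.0, 150.0
offset = 20.0  # hypothetical additive drift on both readings
gain = 1.2     # hypothetical multiplicative drift on both readings

# Additive drift cancels in the differential response.
diff_clean = differential_response(r_gmax, r_a)
diff_drift = differential_response(r_gmax + offset, r_a + offset)

# Multiplicative drift cancels in the relative response.
rel_clean = relative_response(r_gmax, r_a)
rel_drift = relative_response(r_gmax * gain, r_a * gain)

print(diff_clean, diff_drift, rel_clean, rel_drift)
print(fractional_response(r_gmax, r_a), logarithm_response(1000.0, 100.0))
```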
3.2 Transient-State Features
In addition to the above-mentioned steady-state features, a variety of transient-state features can be derived from the transient-state response curve. These include sensor maximum resistance-derived transient-state features, which are analogous to the corresponding steady-state features summarized in Table 1. They also include parameters derived from mathematical models fitted to the transient-state response curve, and coefficients obtained by transforming the transient-state response curve, as summarized in Table 2. In the case of continuous exposure, gas sensors in e-nose systems are exposed to patches of analyte gases at varying concentrations for varying periods of time, and a polynomial model is generally used to fit the gas sensing response curve [39]. The polynomial coefficients of the fitted model can be extracted as features of the analyte gases. In addition, quantities derived from the polynomial fitting model can be extracted as features as well, such as the maximum/minimum values of the first/second derivative of the fitted curve, and integrals (the area under the fitted curve), which have specific physical meanings in terms of the adsorption reaction kinetics. For example, the first derivative indicates the rate of the adsorption reaction, and the second derivative indicates its acceleration [40]. Integrals, or the area under the curve, signify the cumulative change in the degree of the gas sensing adsorption reaction. In the case of three-stage sampling, an exponential model is often used to fit the gas sensing response curve, since the gas sensors are exposed to the analyte gases in a stepwise fashion [41, 42]. As shown in Table 2, two exponential models are applied to fit the exposure curve and the recovery curve, respectively. The b1 and b2 features represent the response time and the recovery time, each corresponding to the time constant of a first-order dynamic system [43]. Therefore, exponential coefficients are widely used as fingerprint information of analyte gases as well (Table 2). Besides, the Discrete Fourier Transform (DFT) and the Discrete Wavelet Transform (DWT) are widely employed to extract characteristic features of the analyte gas from the sensing response signal [44]. The DFT method converts the gas sensing signal from the time domain into the frequency domain, assuming that the sensing signal is periodic, namely a stationary signal. The Fourier frequency features can be further analyzed because the Fourier coefficients of the transformed function represent the contribution of each sine and cosine function at each frequency. However, in most situations the signal is dynamic, and the DWT method is then preferable. The DWT method decomposes the gas sensing signal into a number of sets, where each set is a time series of coefficients (approximation coefficients and detail coefficients) describing the time evolution of the signal in the corresponding frequency band. These coefficients are used as important features of gases [45]. The main difference between these two methods is that DWT provides both frequency and temporal information of the sensing signal, while DFT provides only frequency information for the complete duration of the signal, so that the temporal information is lost.
Table 2 Typical transient-state features derived by fitting and transforming approaches
| Domain | Features | Characteristics |
|---|---|---|
| Temporal | Polynomial coefficients, $A_i$, $i = 0, 1, 2, 3, \ldots$ | $S(t) = \sum_{i=0}^{N} A_i t^i$, where $S(t)$ is sensing response, $N$ is the degree of the fitting polynomial function, and $t$ is time |
| Temporal | $S'(t)_{max}$, $S'(t)_{min}$ | First derivative $S'(t)$, with $S(t) = \sum_{i=0}^{N} A_i t^i$, where $S(t)$ is sensing response |
| Temporal | $S''(t)_{max}$, $S''(t)_{min}$ | Second derivative $S''(t)$, with $S(t) = \sum_{i=0}^{N} A_i t^i$, where $S(t)$ is sensing response |
| Temporal | Area | Integrals, $\mathrm{Area} = \int_0^t S(t)\,dt$, where $S(t)$ is sensing response |
| Temporal | Exponential coefficients, $a_1, b_1, c_1, a_2, b_2, c_2$ | $S(t) = a_1\left(1 - e^{-t/b_1}\right) + c_1$, exponential fitting for exposure curve; $S(t) = a_2 e^{-t/b_2} + c_2$, exponential fitting for recovery curve, where $S(t)$ is sensing response and $t$ is time |
| Spectral | Frequency coefficients $f$ | Fast Fourier transformation of sensing signal $S(t)$, frequency coefficients $f$ |
| Spectral | Wavelet coefficients, $c_A$, $c_D$ | Discrete wavelet transformation of sensing signal $S(t)$, approximation coefficients $c_A$ and detail coefficients $c_D$ |
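Several of the features in Table 2 can be sketched with the Python standard library alone. The sketch below makes some simplifying assumptions that differ from the text: derivatives are taken by finite differences on the sampled response rather than from a fitted polynomial, the DFT is the naive O(n²) sum rather than an FFT, and a single-level Haar wavelet is used because the text does not name a wavelet family. The response samples are hypothetical.

```python
import cmath
import math


def derivative(times, values):
    """Forward finite-difference derivative, one value per sampling interval."""
    return [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
            for i in range(len(values) - 1)]


def trapezoid_area(times, values):
    """Area under the sampled curve by the trapezoidal rule."""
    return sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(values) - 1))


def dft_magnitudes(values):
    """Magnitudes of the discrete Fourier transform (naive summation)."""
    n = len(values)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(values)))
            for k in range(n)]


def haar_dwt(values):
    """Single-level Haar DWT; expects an even number of samples."""
    s = math.sqrt(2.0)
    pairs = range(0, len(values) - 1, 2)
    approx = [(values[i] + values[i + 1]) / s for i in pairs]
    detail = [(values[i] - values[i + 1]) / s for i in pairs]
    return approx, detail


# Hypothetical response: rise during exposure, decay during flushing.
times = [0, 1, 2, 3, 4, 5, 6, 7]
response = [0.0, 0.4, 0.7, 0.9, 1.0, 0.6, 0.3, 0.1]

first = derivative(times, response)
second = derivative(times[:-1], first)
k_max, k_min = max(first), min(first)   # extreme reaction rates
a_min = min(second)                     # strongest deceleration
area = trapezoid_area(times, response)  # cumulative response
spectrum = dft_magnitudes(response)
c_a, c_d = haar_dwt(response)
print(k_max, k_min, a_min, area, spectrum[0], c_a, c_d)
```

The Haar detail coefficients are indexed by position along the signal, which illustrates the point made above: the DWT keeps temporal information that the DFT magnitudes discard.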
4 Gas Sensing Signal Modulation Methods
To improve the gas identification capability and accuracy of e-nose systems, a method for obtaining robust and less redundant features has to be established. Signal modulation is an effective way to obtain characteristic features of an analyte gas. Traditionally, the critical hardware component of an e-nose system is an array of gas sensors based on various sensing materials (namely a sensor array), which generates a specific response pattern to a specific analyte that is used as the feature information. In this case, the gas sensing signal is modulated by applying a variety of sensing materials. Besides, the gas sensing signal can also be modulated by the gas concentration and the operating temperature, particularly in a single gas sensor-based e-nose system, which is also termed a virtual sensor array e-nose system. The sensing response of a single gas sensor under a range of gas concentrations or a range of operating temperatures constitutes the unique features of a specific analyte gas. These gas signal modulation methods are summarized in Fig. 6.
Fig. 6 Main components in the e-nose system, including sensor module, signal modulation, feature extraction, and machine learning classification analysis. Gas sensing signal modulation techniques consist of sensing material modulation methods, gas concentration modulation methods, and operating temperature modulation methods
These signal modulation methods are determined by the gas sensing mechanism, which depends on the sensing element materials in the gas sensors. Currently, metal oxide semiconductor (MOS) gas sensors prevail in the chemiresistor-type gas sensor market. The resistance of n-type MOS sensors, in which electrons are the majority carriers, decreases upon heating, because electrons in the valence band are excited to the conduction band and the number of electron charge carriers increases [46]. When the metal oxide materials are heated to different temperatures T in ambient air, various forms of oxygen ions are generated on the surface of the metal oxides [47], such as $\mathrm{O_2^-}$ (T = 100–200 °C), $\mathrm{O^-}$ (T = 200–300 °C), and $\mathrm{O^{2-}}$ (T > 300 °C), as the adsorbed oxygen molecules withdraw electrons from the conduction band of the metal oxides [48]. This results in a decrease in the concentration of electron carriers and thus an increase in the resistance of the gas sensor. Upon exposure to a reducing gas, such as CH4, C2H4, or CO, a chemical reaction occurs between the oxygen ions ($\mathrm{O_2^-}$, $\mathrm{O^-}$, $\mathrm{O^{2-}}$, etc.) adsorbed on the surface of the n-type semiconductor metal oxide and the reducing gas [49], in which electrons are released and returned to the conduction band of the n-type semiconductor metal oxide, giving rise to a decrease in the sensor resistance. The change in the number of carriers in MOS materials is also determined by the analyte gas concentration. Therefore, for a specific analyte gas, the gas sensing response behavior of the gas sensor is highly dependent on the sensing element materials (e.g., MOS materials), the working temperature, and the analyte gas concentration. In other words, by modulating the sensing element materials, working temperature, and analyte gas concentration, a variety of sensing response behavior information can be acquired. In combination with effective feature extraction techniques and efficient classifier algorithms, e-nose systems can be established.
5 Machine Learning-Enabled Smart Gas Sensor for Industrial Gas Identification
In the previous sections, the working principles of smart gas sensors, sensing response features, and signal modulation methods were summarized. To develop a smart gas sensor for a specific application, the sensor architecture is to be considered first: either a sensor array (≥2 sensors) or a single-sensor scheme. Then, the signal modulation approach should be determined: sensing element material modulation, concentration modulation, or operating temperature modulation. Afterward, an appropriate set of sensing response features is extracted, either steady-state or transient-state features. With these preprocessing procedures, the robust gas features obtained can contribute to outstanding identification performance of smart gas sensors in supervised machine learning analysis. In this section, a specific use case using a single gas sensor for the identification of industrial gases (NH3 and PH3) in combination with machine learning techniques is presented, which is partially drawn from our previously published work [50]. As presented earlier in this chapter, ammonia (NH3) and phosphine (PH3) are both common inorganic compounds widely utilized in many industrial processes; however, both gases adversely affect human health even at low concentrations. Therefore, the development of highly selective, highly sensitive, reliable, and efficient gas sensors to monitor NH3 and PH3 gas is of the utmost importance. In this application case, an ultrasensitive, highly discriminative, graphene-based gas sensor for the detection and identification of ammonia and phosphine at room temperature is demonstrated. A copper phthalocyanine derivative (CuPc)-functionalized graphene dispersion, prepared by a liquid-phase exfoliation approach, was deposited on interdigital electrodes by dielectrophoresis (DEP) and acts as the sensing material in a chemiresistive format.
The gas concentration modulation method was employed to modulate the gas sensing signal by switching the analyte gas supply on and off. Multiple transient-state features were extracted from the transient-state sensing response profile and used to represent each analyte gas. A single-channel gas sensor scheme was adopted to construct a virtual sensor array. In combination with machine learning techniques, the developed graphene nanosensor demonstrates excellent gas identification performance when exposed to both analyte gases, as shown schematically in Fig. 7.
Fig. 7 Schematic illustrations of the smart nanosensor development workflow. Adapted with permission from [50]
Sensing signal data were acquired by a homemade gas sensing system [51]. A constant direct voltage (0.1 V) was applied to the sensor device and the current was recorded. The typical gas sensing signal acquisition procedure was as follows. Initially, to remove contaminants from the sensor surface, the sensor was flushed with dry N2 until the sensor resistance stabilized, which was recorded as the baseline resistance. Afterward, the gas sensor was exposed to the analyte gas for 15 min using dry N2 as the carrier gas. The total gas flow rate was around 100 sccm. The analyte gas supply was then switched off and the gas sensor was recovered by dry N2 flushing at a flow rate of 2000 sccm for 10 min. Then, the analyte gas exposure step and the gas flushing step were repeated. To produce a large amount of data for machine learning training and validation, 24 repetitions of analyte gas exposure and flushing were performed for each analyte gas. The output time-series current signal was the original data for each analyte gas. The acquired gas sensing response was thus a transient-state response signal for each analyte. Following the sensing signal data acquisition, the time-series currents upon exposure to NH3 and PH3 at different concentrations were transformed into time-dependent response signals. The response profiles for each analyte gas consist of 24 repetitions, as shown in Fig. 8a. Each individual test contains two phases, namely the analyte gas exposure phase (15 min) and the analyte gas flushing phase (10 min). As shown in Fig. 8a, upon exposure to 100 ppb analyte gas, the average response magnitude of NH3 is 92% higher than that of PH3, while the response magnitude of PH3 is close to that of the reference gas (pure N2). This suggests that, at 100 ppb, the interaction between NH3 and functionalized graphene is stronger than that between PH3 and functionalized graphene, the latter being quite weak. To extract transient features for each analyte gas, the response profile was further processed, as illustrated in Fig. 8b. In order to efficiently discriminate the analyte gases, multiple transient features were extracted from the transient-state response profile rather than only the maximum response value. Eleven transient parameters were extracted from each individual response profile. The response profiles in the analyte gas exposure phase (t1–t2) and the analyte gas flushing phase (t2–t3) were each fitted by an exponential function. With that, 3 coefficients (a1, b1, c1) for the analyte gas exposure fitting curve and 3 coefficients (a2, b2, c2) for the analyte gas flushing fitting curve were obtained.
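How a time constant such as b2 can be recovered from a flushing-phase curve is sketched below in Python. The fitting routine used in the original work is not stated, so this is a swapped-in simplification: the offset c2 is assumed to be known (for example, taken from the stabilized baseline), which turns S(t) = a2·exp(−t/b2) + c2 into a straight line in ln(S − c2) that ordinary least squares can fit. A full three-parameter nonlinear fit would be needed when c2 is unknown. The function name and the synthetic data are hypothetical.

```python
import math


def fit_decay(times, values, offset):
    """Fit values ~ a * exp(-t / b) + offset with a known offset.

    Uses linear least squares on ln(values - offset) = ln(a) - t / b.
    Samples at or below the offset are skipped. Returns (a, b).
    """
    xs, ys = [], []
    for t, v in zip(times, values):
        if v - offset > 0.0:
            xs.append(t)
            ys.append(math.log(v - offset))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope


# Synthetic, noise-free recovery curve sampled every 30 s for 10 min,
# generated with hypothetical parameters a2 = 5.0, b2 = 120 s, c2 = 1.0.
times = [30.0 * i for i in range(21)]
values = [5.0 * math.exp(-t / 120.0) + 1.0 for t in times]

a_fit, b_fit = fit_decay(times, values, offset=1.0)
print(a_fit, b_fit)
```

The exposure curve S(t) = a1·(1 − exp(−t/b1)) + c1 can be handled the same way after rewriting it as (a1 + c1) − S(t) = a1·exp(−t/b1), provided the plateau level a1 + c1 is known.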
Fig. 8 a Typical response profile of graphene gas sensor toward 100 ppb analyte gas (NH3, PH3, and reference gas N2) under cycling exposure testing. A complete test is composed of 24 repetitions. b Schematic of sensing response profile S(t) for a single cycle test, consisting of analyte exposure phase (t1–t2, 15 min in this work) and analyte flushing phase (t2–t3, 10 min in this work). The feature vector representing each analyte gas consists of 11 parameters: a1, b1, c1, a2, b2, c2, S, kmax, kmin, amin, and area. Adapted with permission from [50]
Moreover, the first derivative and the second derivative of the response profile as a function of time were analyzed. With that, both the maximum value (kmax) and the minimum value (kmin) of the first derivative of the response profile were obtained, and the minimum value (amin) of the second derivative of the response profile was determined. Additionally, the transient response S over the whole exposure phase (t1–t2) was calculated, and the area under the whole response profile (t1–t3) was integrated. Therefore, each analyte gas is represented by an 11-dimensional feature vector (24 such vectors, one per repetition). All these data were then analyzed using unsupervised machine learning (principal component analysis, PCA) and supervised machine learning (e.g., linear discriminant analysis, LDA), as shown in Fig. 9. Specifically, PCA is a non-parametric statistical technique primarily used for dimensionality reduction, that is, compression of a high-dimensional dataset onto a lower-dimensional feature subspace with the aim of retaining most of the relevant information [52]. The PCA score plots of all data are presented in Fig. 9a, b. The first principal component explains 49.1% of the variance, while the second and third principal components explain 24.7% and 11.0%, respectively. Together, the first three principal components explain 84.8% of the variance. As shown in the 2D score plot, NH3 clusters are situated on the left side, PH3 clusters in the middle, and reference gas clusters on the right side. Notably, PH3 clusters are close to the reference gas clusters, especially at 100 ppb, while NH3 clusters remain far from the reference gas cluster. This suggests that NH3 induces a more discriminative signal than PH3 upon adsorption on functionalized graphene flakes. The PH3 cluster overlaps somewhat with the reference gas cluster at 100 ppb, showing that the response of functionalized graphene to PH3 is extremely low at low concentration. As the PH3 concentration increases, the distance between the PH3 cluster and the reference cluster grows, suggesting a stronger response at higher concentrations.
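The notion of principal components and explained variance can be illustrated with a minimal pure-Python sketch. The implementation used in the original work is not stated; the version below is only one possible approach, using power iteration with deflation on the sample covariance matrix. The three-feature toy dataset is hypothetical and much smaller than the 11-dimensional feature vectors described above.

```python
import math


def covariance_matrix(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def power_iteration(matrix, iterations=500):
    """Dominant eigenvalue and eigenvector of a symmetric PSD matrix."""
    d = len(matrix)
    vec = [float(i + 1) for i in range(d)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(d))
                for i in range(d))
    return value, vec


def pca(data, n_components):
    """Return (scores, explained_variance_ratios) for the leading components."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    centered = [[x - m for x, m in zip(row, means)] for row in data]
    cov = covariance_matrix(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for _ in range(n_components):
        value, vec = power_iteration(cov)
        components.append(vec)
        ratios.append(value / total_variance)
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r * c for r, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, ratios


# Hypothetical feature vectors (5 samples, 3 features).
data = [[2.0, 1.0, 0.1], [4.0, 2.1, 0.0], [6.0, 2.9, 0.1],
        [8.0, 4.2, 0.0], [10.0, 5.0, 0.1]]
scores, ratios = pca(data, n_components=2)
print(ratios)
```

Because the first two toy features are almost perfectly correlated, the first component carries nearly all of the variance here. In practice, features with different units are usually standardized before PCA so that no single feature dominates the covariance.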
Fig. 9 PCA score plot for both NH3 and PH3 analyte gas at different concentrations. a 2D space plot. b 3D space plot. c Linear discriminant analysis (LDA) score plot for both NH3 and PH3 analyte gas at 100 ppb concentration. d LDA score plot for both NH3 and PH3 analyte gas at 500 ppb concentration. Adapted with permission from [50]
With supervised machine learning techniques, the classification results of both NH3 and PH3 against the reference gas (pure N2) are presented in Fig. 9c, d. In contrast to the PCA algorithm, the LDA algorithm attempts to find a feature subspace that optimizes class separability [53]. At 100 ppb, NH3 forms an isolated cluster while the PH3 cluster exhibits some overlap with the reference gas cluster. At 500 ppb, the three clusters are well separated from each other, indicating a perfect classification among these gases. To characterize the classification performance of the developed gas sensor, several critical metrics were calculated for each analyte gas. The hold-out cross-validation approach was used to derive the confusion matrix (training data/testing data = 70/30%). The confusion matrix results for 100 ppb analyte gases are depicted in Fig. 10a. NH3 exhibits good classification results while PH3 exhibits moderate classification results owing to cluster overlap with the reference gas. With the obtained confusion matrix results, performance metrics, including accuracy, sensitivity, specificity, precision, and F1-score, were calculated from the quantities described in Table 3 as follows.
Accuracy denotes the ratio of the number of correct predictions to the total number of input samples achieved by classifier algorithms.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$
Sensitivity denotes the ratio of the number of correct positive results to the number of all relevant samples that should have been identified as positive.
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
Specificity denotes the ratio of negatives that are correctly identified.
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
Precision denotes the ratio of the number of correct positive results to the number of all predicted positive results.
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
F1-score indicates the harmonic mean of the sensitivity and the precision.
$$\mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{precision} \times \mathrm{sensitivity}}{\mathrm{precision} + \mathrm{sensitivity}} = \frac{2TP}{2TP + FP + FN}$$
Table 3 Confusion matrix table
| Predicted label | Actual label: Positive | Actual label: Negative |
|---|---|---|
| Positive | True positive (TP) | False positive (FP) |
| Negative | False negative (FN) | True negative (TN) |
Fig. 10 Confusion matrix of analyte gas classification using LDA classifier algorithm at 100 ppb concentration (a), 500 ppb concentration (d), and 1000 ppb concentration (g). Sensor performance metrics toward NH3, PH3, and N2 at 100 ppb concentration (b), 500 ppb concentration (e), and 1000 ppb concentration (h) using hold-out cross-validation method. Sensor classification accuracy relationship with classifier algorithms at 100 ppb concentration (c), 500 ppb concentration (f), and 1000 ppb concentration (i) using k-fold cross-validation method. Adapted with permission from [50]
As can be seen in Fig. 10b, the developed gas sensor exhibits outstanding identification performance for NH3 (accuracy 100.0%, sensitivity 100.0%, and specificity 100.0%) and moderate identification performance for PH3 (accuracy 77.8%, sensitivity 75%, and specificity 78.6%). The sensor's overall classification accuracy with respect to various classifier algorithms was also analyzed with the k-fold cross-validation (k = 10) approach, as shown in Fig. 10c. The overall classification accuracies achieved by most classifier algorithms are above 80%. Likewise, the sensor's performance at 500 ppb and 1000 ppb is shown in Fig. 10d–i. The sensor presents excellent classification performance for the analyte gases upon exposure to 500 ppb. Upon exposure to 1000 ppb, good classification performance is obtained for both NH3 (accuracy 94.4%, sensitivity 100%, and specificity 90.9%) and PH3 (accuracy 94.4%, sensitivity 75%, and specificity 100.0%). For 500 and 1000 ppb analyte gas, the overall classification accuracy achieved by most classifier algorithms remains higher than 80%, as illustrated in Fig. 10f, i.
In this part, a highly discriminative e-nose system for gas identification (NH3 and PH3) using a single-channel graphene nanosensor is presented. With the concentration modulation approach and multiple transient response features, the developed graphene nanosensor demonstrates excellent gas identification performance with unsupervised and supervised machine learning techniques. This smart sensor prototype paves the way for designing highly discriminative, highly sensitive, miniaturized, non-dedicated gas sensors for a wide spectrum of industrial gases.
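The performance metrics defined earlier in this section (accuracy, sensitivity, specificity, precision, and F1-score) can be computed from the four confusion matrix entries with a short Python sketch. The counts used below are hypothetical: they are not read from Fig. 10a and were chosen only because they yield the accuracy, sensitivity, and specificity percentages quoted for PH3 at 100 ppb. The resulting precision and F1-score are therefore illustrative and are not values reported in the text.

```python
def classification_metrics(tp, fp, fn, tn):
    """One-vs-rest metrics from the entries of a 2 x 2 confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts for one class treated as "positive" (18 test samples).
metrics = classification_metrics(tp=3, fp=3, fn=1, tn=11)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

For a three-class problem such as NH3, PH3, and N2, the same function is applied once per class, each time treating that class as positive and the other two as negative.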
Machine Learning-Driven Gas Identification in Gas Sensors
6 Summary and Outlook

In this chapter, we have summarized the application of gas sensors in environmental monitoring and its benefit for human health. However, a long-standing issue of chemiresistive gas sensors is their poor selectivity. To address this issue, we presented an effective approach based on machine learning techniques to enhance the selectivity of gas sensors. Feature extraction, covering both steady-state and transient-state features, is a crucial step for enhancing the identification performance of gas sensors. Gas sensing response features can be effectively modulated by sensing materials, the working temperature of gas sensors, and analyte gas concentration. To provide more insight into this method, a specific application case using a single-channel gas sensor in combination with machine learning techniques for industrial gas identification is given. We believe the effective combination of artificial intelligence and gas sensors will facilitate the digitization of the human olfactory sense and will develop rapidly within the Internet of Things (IoT).

Acknowledgements The authors appreciate the below funding projects: VolkswagenStiftung (grant no. 96632, 9B396) project, EU project “Smart Electronic Olfaction for Body Odor Diagnostics” (SMELLODI, grant no. 101046369), 6G-life project (Federal Ministry of Education and Research of Germany in the programme of “Souverän. Digital. Vernetzt.”, project identification no. 16KISK001K), as well as Sächsische Aufbaubank (SAB) project (project no. 100525920).
A Machine Learning Approach in Wearable Technologies

Gisela Ibáñez-Redin, Oscar S. Duarte, Giovana Rosso Cagnani, and Osvaldo N. Oliveira
Abstract The combination of wearable devices with the Internet of Things (IoT) and machine learning technologies has led to innovative analytical tools with potential applications in different fields, ranging from healthcare to smart agriculture. In this chapter, we provide an overview of the application of machine learning algorithms to wearable technologies. After introducing the algorithms more commonly used for analyzing data from wearable devices, we review contributions to the field within the last 5 years. Special emphasis is placed on the application of this approach to health monitoring, sports analytics, and smart agriculture.

Keywords Machine learning · Wearable sensors · Artificial intelligence · Health monitoring · Sports medicine · Smart agriculture
1 Introduction

Wearable devices represent an exciting opportunity to measure physiological parameters continuously in a real-time and nonintrusive manner by leveraging electronics and materials science technologies [1]. Such devices can take many forms and contain built-in sensors that can track body movements, provide biometric information, and help with location tracking. Some examples are temperature sensors, accelerometers, electrocardiography (ECG) sensors, and other biometric sensors embedded in smartwatches and wristbands [2]. Apple Watch, Fitbit, Withings Activité, WHOOP Strap, and Xiaomi Mi Band are examples of such wearable devices that are now commercially available to track daily activities [3, 4]. The wearables industry has experienced tremendous growth over the past few years, notably boosted by recent advances in the Internet of Things (IoT) and computer science. Indeed, it is expected that the global wearable market size will surpass USD 392 billion by 2030 [5]. New technologies based on wearable devices are being developed for highly specific applications in healthcare, sports analytics, and agriculture. Some examples include the Vital Connect multisensory for cardiac monitoring, AirGO™ for respiratory monitoring, and SenseHub Dairy for monitoring cows [6–8]. Besides measuring vital physiological signals and physical parameters, the new generation of wearable devices can be coupled to electrochemical, optical, and piezoelectric (bio)sensors, enabling less invasive measurement of chemical compounds and biomarkers in biological fluids [9].

The increasing availability of wearable devices for various applications has driven exponential data growth. Evaluating these data by conventional methodologies involving human reviewers is no longer feasible, which makes it necessary to implement computer-aided approaches based on artificial intelligence (AI). Machine learning is suitable for managing such an extensive amount of data, allowing for applying automation strategies, detecting signal anomalies, reducing noise, and identifying patterns [10]. It is a data analysis methodology that automates the building of mathematical and statistical models using a range of algorithms [11]. It has already shown great promise in boosting the development of wearable technology for various applications, including diagnosing and predicting diseases [2], monitoring animals and crops [12], monitoring workers [13], and tracking athlete performance [14]. This chapter explores the state of the art of machine learning research for wearable applications. We begin by describing the basics of machine learning, including the most common algorithms used to analyze data from wearable devices, and their evaluation metrics. Then, we provide a brief overview of recent applications of this technology to healthcare, sports analytics, and smart agriculture.

G. Ibáñez-Redin (B) · O. S. Duarte · G. R. Cagnani · O. N. Oliveira
São Carlos Institute of Physics, University of São Paulo, São Carlos, São Paulo 13560-970, Brazil
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_3
2 Machine Learning Algorithms Commonly Used in Wearable Technologies

Most machine learning tasks can be approached with either supervised or non-supervised algorithms. The main difference between these two categories is that the term supervised implies the existence of a response variable, target, or label to be predicted. The target (as we shall refer to it henceforth) can be reduced to a float value, a Boolean, or a discrete categorical field. In the first case, the tasks are regression problems. The second and third cases correspond to classification tasks, which can be the traditional binary (True or False) decision task or a multiclass strategy. Assigning more than one category in a single prediction is also possible; in this case, we refer to multilabel problems. Few algorithms are exclusively for regression or classification, since most traditional supervised algorithms can be used for both purposes with some adjustments and modifications [15]. In the following sections, we list the most common machine learning algorithms used for analyzing data from wearable devices.
2.1 Supervised Machine Learning

Linear models: These simple algorithms model a linear dependency between the feature inputs and the output or response variable. The primary mathematical expression for a linear regression model is given by

$$y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \epsilon_i, \quad \forall i = 1, 2, \dots, n.$$

In matrix form, it reads $y = X\beta + \epsilon$, where $y$ is the response vector, $X$ is the matrix representing the corresponding inputs for each example, $\beta$ are the parameters to be estimated, and $\epsilon$ is usually called the error, an unobserved random variable adding noise to the linear relationship. The goal of linear regression is to estimate the parameters $\beta$ that minimize the error term. Various assumptions are made on the data to apply a linear model. Beyond the initial assumption of an underlying linear relation between input and output, homoscedasticity (constant variance) and independence of the errors are essential. Also, the lack of perfect multicollinearity is assumed. In practice, an approximate fulfillment of those assumptions is accepted, although verifying them is good practice. The parameters $\beta$ are estimated based on a predefined criterion, and the least squares method is the most common choice; any other choice will result in different estimators. Linear regression models are also popular owing to the ease with which the relationships between the features and the response can be interpreted. Even regularization, which is achieved by adding a penalization term to the cost function, is simple in linear regression. The ridge regularization term $\lambda \sum_{j=1}^{p} \beta_j^2$ is efficient for reducing model complexity and multicollinearity. On the other hand, the Lasso regularization term $\lambda \sum_{j=1}^{p} |\beta_j|$, which can lead to coefficients equal to zero, helps with feature selection. A relevant variation of the linear regression model is obtained using the sigmoid function $\mathrm{sig}(t) = 1/(1 + e^{-t})$, which is the base of the Logistic Regression (LR) algorithm for classification.
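To make the least squares and ridge ideas above concrete, here is a minimal Python sketch for the single-feature case, where closed-form expressions exist. The function name `fit_line`, the toy data, and the choice to leave the intercept unpenalized are assumptions made for illustration only.

```python
def fit_line(x, y, lam=0.0):
    """Least squares fit of y = b0 + b1 * x; lam > 0 adds a ridge penalty on b1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / (sxx + lam)     # lam = 0 recovers ordinary least squares
    b0 = y_mean - b1 * x_mean  # the intercept is left unpenalized here
    return b0, b1


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # made-up data lying exactly on y = 1 + 2x

b0_ols, b1_ols = fit_line(x, y)
b0_ridge, b1_ridge = fit_line(x, y, lam=10.0)
print("OLS:  ", b0_ols, b1_ols)
print("Ridge:", b0_ridge, b1_ridge)
```

With the penalty switched on, the slope is pulled toward zero, which is the shrinkage effect that the ridge term is meant to produce.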
Logistic regression may be seen as an extension of linear regression. After we find the fitting parameters in the linear model, the response variable can be used as the argument of the sigmoid function. This maps the output to the interval [0, 1], which can be interpreted as a score that can be converted into a binary label by choosing an acceptable threshold. The actual algorithms for linear and logistic regression follow different and independent implementations due to technical and efficiency considerations. For example, logistic regression uses maximum likelihood estimation instead of least squares and consequently a different cost function [16].

Support Vector Machines (SVMs): This is a successful algorithm for classification (and, with some modifications, regression) problems, whose first implementation for pattern recognition dates to 1963 and is due to Vladimir Vapnik [17]. After its implementation in image recognition, it was used in text categorization and bioinformatics. Given their many years of development and their variety of implementations compared with other algorithms, SVMs are sometimes considered an independent subfield of machine learning [18]. The basic idea behind SVMs for binary classification is to create a hyperplane in high-dimensional feature space that separates one class from the
other. The learning algorithm includes an optimization process subject to constraints guaranteeing the largest geometric margin between the classes. The optimization problem can be written in terms of Lagrange multipliers, and the support vectors are identified as those examples with non-zero multipliers. Geometrically, these are the closest points to the hyperplane. For simplicity, the algorithm is normally presented in its linear form. However, modern implementations allow the treatment of nonlinear problems by applying a Kernel Trick, which uses a mapping function that redefines the inner product on the feature space. The effect is equivalent to transforming the original feature space into a higher-dimensional space in which the data are expected to be separable.

Tree-based models: As the name suggests, a Tree-based algorithm in its most basic implementation splits the data by making decisions based on a series of criteria, which can be a simple yes or no question for categorical variables or a cut-off value for numerical ones. Each chunk of data creates a new child branch that can be submitted to a new set of splitting rules until an established stopping criterion is reached or pure leaves are found. The stopping criterion also serves to prune deep trees and consequently avoid overfitting. This rationale is the basis for more sophisticated and efficient algorithms, including Random Forest (RF) and XGBoost [19, 20], which are both ensemble algorithms widespread in the data science community. RF is formed by many individual decision trees. Each tree makes a prediction which is taken as a vote, and the class with the most votes becomes the final prediction. The success of ensemble models is related to the fact that a large number of uncorrelated models operating together tends to outperform its individual constituents.
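The claim about uncorrelated voters can be checked with a short calculation. The sketch below is an idealized illustration, not a Random Forest: it assumes a binary task, fully independent voters, and the same accuracy for every voter (0.6 is an arbitrary choice), and it uses the binomial distribution to obtain the probability that the majority vote is correct.

```python
from math import comb


def majority_vote_accuracy(n_voters, p_correct):
    """Probability that a majority of independent voters is correct (odd n_voters)."""
    need = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(need, n_voters + 1)
    )


single = 0.6  # assumed accuracy of each individual voter
for n in (1, 11, 101):
    print(n, round(majority_vote_accuracy(n, single), 3))
```

Real trees are never fully independent, so the gain in practice is smaller than this idealized calculation suggests.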
The Random Forest algorithm uses a technique called Bagging [21] to guarantee low correlation, which means that each tree randomly samples from the training dataset with replacement. Also, to make a single decision, each tree picks from a random subset of features, and the result is a forest of uncorrelated trees making independent predictions. XGBoost (extreme gradient boosting) is one of the most popular and efficient implementations of the gradient boosting machine algorithm. The boosting technique starts with a weak learner and iteratively builds new models that correct the errors made by the previous ones [22]. The models are added sequentially until a robust model with high generalization is achieved. It also uses a gradient algorithm to minimize the loss function in each iterative step. XGBoost is known for reaching top positions in machine learning competitions, even outperforming complex implementations of Neural Networks when applied to tabular data [23]. The algorithm scales well with the number of samples and features. It requires fewer data preprocessing steps and is less sensitive to outliers than other algorithms. Some implementations can work directly with categorical variables.

K-nearest neighbors (KNN): A different approach is presented by the KNN algorithm [24], in which there is no learning process in the usual sense but instead the simple storage of training data [25]. When a test example is received, its K nearest neighbors are found by applying a distance function. The output is defined as the mode of the classes of the nearest neighbors, or the average of the continuous response variable in regression problems. The number of neighbors is a user-defined parameter that has
to be tuned case by case. The distance function is a critical choice and depends on the specific application. The most common are the Euclidean and Manhattan distances [26, 27]. A critical drawback arises when working with imbalanced datasets, where KNN tends to favor the majority class since it will be present in greater numbers among the neighbors. This situation can be mitigated by applying distance weights to the neighbors, i.e., closer neighbors (among the K nearest) have priority votes.
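The effect of distance weighting can be seen in a small Python sketch. Everything here is illustrative: the function name `knn_predict`, the inverse-distance weighting (one of several possible weighting schemes), and the four made-up training points are assumptions, and the Euclidean distance is used throughout.

```python
from collections import defaultdict
from math import dist


def knn_predict(train, query, k=3, weighted=False):
    """Classify `query` from (point, label) pairs using its k nearest neighbors."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = defaultdict(float)
    for point, label in neighbors:
        d = dist(point, query)
        # A plain vote counts 1 per neighbor; a weighted vote favors closer points.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Made-up imbalanced data: three "majority" points and one "minority" point.
train = [
    ((0.0, 0.0), "majority"),
    ((0.0, 1.0), "majority"),
    ((1.0, 0.0), "majority"),
    ((3.0, 3.0), "minority"),
]
query = (2.8, 2.9)  # lies right next to the single minority example

plain = knn_predict(train, query, k=3)
weighted = knn_predict(train, query, k=3, weighted=True)
print("plain vote:   ", plain)
print("weighted vote:", weighted)
```

In this toy case the plain vote follows the two distant majority points, whereas the weighted vote follows the much closer minority point.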
2.2 Non-supervised Machine Learning

Non-supervised algorithms are used when there is no labeled data, and the objective is to extract the underlying structure from the data itself. For example, Clusterization can be used to divide the available data into groups with well-defined characteristics. Hence, the examples in a specific group can in principle be taken as being the same entity, and labels can be assigned which can later be used to train a classification algorithm. Another known non-supervised problem is Anomaly Detection, which can be used to find data points outside the expected range for a specific feature, caused by unusual behavior of the studied subject or indicating a possible failure in a machine.

K-means is the classic algorithm for Clusterization [28]. It begins by choosing (randomly or user-defined) k centroids used as initial cluster centers and assigning every point in the dataset to the cluster whose centroid has the smallest squared Euclidean distance. Then the centroids are recalculated by averaging the points inside each cluster. The process is iteratively repeated until the centroids no longer change within a predefined tolerance or the maximum number of iterations is reached. Variations of the same algorithm which use other distance measures are spherical K-means, K-medians, and K-medoids [29]. Another significant variation is K-modes [30], suitable for categorical data where there is no well-defined distance. The K-modes algorithm calculates dissimilarities (non-matching categories) between the data vectors. The smallest number of dissimilarities identifies the closest vectors; the algorithm then redefines the centroids using the cluster modes and repeats the process. K-means is the leading member of a family of methods referred to as Partitional clustering. In Hierarchical clustering methods, the basic algorithm is a sequence of partitions in which each partition is nested into the following partition.
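The K-means loop described above (assign, average, repeat until the centroids stop moving) can be sketched in a few lines of Python. This is a bare-bones illustration with user-supplied initial centroids; the function name, the toy points, and the tolerance value are assumptions, and production code would normally rely on an established library instead.

```python
from math import dist


def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Basic K-means with user-supplied initial centroids."""
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        shift = max(dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:  # centroids no longer change within the tolerance
            break
    return centroids, clusters


# Made-up data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

Because the outcome depends on the initial centroids, practical implementations usually run the procedure several times from different starting points.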
There are two kinds of Hierarchical algorithms. The Agglomerative kind starts with the disjoint clustering in which each element of the dataset is placed in an individual cluster. Then, a distance metric is used to merge two or more of these clusters into a new partition, the nesting process is repeated, and the number of clusters is reduced as the sequence advances until it finishes with a single cluster containing all the data. If clustering is the primary goal, the sequence must be stopped at a point meaningful to the final objective. However, the entire sequence helps to analyze the underlying data structure since it offers a complete view of hierarchical data relationships. A Divisive algorithm executes the same task in reverse order, but it tends to be more computationally demanding [29].
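A minimal sketch of the agglomerative sequence is given below. It assumes single linkage (the distance between two clusters is the smallest point-to-point distance), which is only one of several possible merge criteria, and it merges exactly two clusters per step; the function name and the one-dimensional toy data are likewise illustrative.

```python
from math import dist


def agglomerative(points, n_clusters=1):
    """Single-linkage agglomerative clustering; returns the sequence of partitions."""
    clusters = [[p] for p in points]  # start: one cluster per element
    history = [[list(c) for c in clusters]]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest point-to-point distance.
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: min(
                dist(p, q) for p in clusters[ab[0]] for q in clusters[ab[1]]
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        history.append([list(c) for c in clusters])
    return history


points = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]  # made-up 1-D data
history = agglomerative(points)
for partition in history:
    print(len(partition), partition)
```

Stopping the loop early through `n_clusters` corresponds to cutting the sequence at a meaningful point, while the full `history` gives the complete hierarchical view.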
A different approach is followed in the algorithm called density-based spatial clustering of applications with noise (DBSCAN) [31]. The clustering process is based on detecting regions of high point density and grouping the points in those regions; the points in low-density regions are marked as outliers. As opposed to partitional clustering algorithms (such as K-means), DBSCAN does not require determining the number of clusters in advance but demands the knowledge of a meaningful distance measure to determine regions of high density. Also, DBSCAN can find clusters of arbitrary shape and even a cluster inside another one, without explicit connection between them.
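The density-based idea can be sketched as follows. This is a simplified, quadratic-time illustration rather than a reference implementation: the function name, the parameter names `eps` (neighborhood radius) and `min_pts` (minimum number of points, counting the point itself, for a dense region), and the toy data are assumptions.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Made-up data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Note that the number of clusters is never passed in; it emerges from the two density parameters, and the isolated point is reported as an outlier.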
2.3 Deep Learning

This is a kind of machine learning algorithm inspired by the neural structure of the brain [32]. Though the idea is old, the increase in computational power has made it possible to implement Artificial Neural Networks (ANN) for supervised, unsupervised, and semi-supervised learning [33]. An ANN is constituted by multiple layers of processing. Each layer is formed of several connected units called artificial neurons. A typical neuron receives a signal (input), processes it, and sends a response to neurons in the subsequent layer. The signal travels through layers from the first to the final layer, possibly after passing through internal layers multiple times in both directions. Different layers can implement different transformations, and backward propagation allows for correcting errors. The number of neurons and layers and the complexity of the network are limited only by the computational power available. One of the most famous deep learning models for natural language processing, called GPT-3, has 175 billion parameters and requires 800 GB of storage [34]. Deep learning models have been applied in computer vision, speech recognition, drug design, climate science, board game programs, etc. [35].
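As a very small illustration of how a signal passes from one layer of artificial neurons to the next, the sketch below runs a forward pass through a network with two inputs, two hidden neurons, and one output. The weights and biases are hand-picked, hypothetical values (nothing is trained), and the sigmoid is just one common choice of activation function.

```python
from math import exp


def sigmoid(t):
    return 1.0 / (1.0 + exp(-t))


def layer(inputs, weights, biases):
    """Fully connected layer: each neuron weighs its inputs, adds a bias, applies a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


# Hypothetical, untrained parameters: 2 inputs -> 2 hidden neurons -> 1 output.
hidden_w, hidden_b = [[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]
output_w, output_b = [[1.0, -1.0]], [0.0]

signal = [1.0, 1.0]
hidden = layer(signal, hidden_w, hidden_b)  # response of the hidden layer
output = layer(hidden, output_w, output_b)  # response of the final layer
print(hidden, output)
```

Training would adjust `hidden_w`, `hidden_b`, `output_w`, and `output_b` by propagating the prediction error backward, a step this sketch leaves out.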
2.4 Evaluation Metrics

The final part of model development is related to evaluating results. In addition to the metrics used, an unbiased model evaluation requires steps that generally depend on the specific application, the exact use of the model in production, and the kind of data and data structure feeding the model. As a minimum requirement, one needs a validation and a test dataset (or a set of those). The validation dataset is generally used during training to evaluate how the model is performing; some algorithms actively use the validation dataset to correct classification mistakes and improve the chosen validation metric. During feature selection and hyperparameter optimization, a cross-validation procedure may be applied repeatedly. For example, in k-fold cross-validation with k = 5, the data are shuffled and split into five folds; one fold (20% of the data) is held out for validation and the remaining 80% is used for training. The division is repeated
until k sets of training and validation are formed. The algorithm for selection or optimization is applied, and the validation metric is averaged over the k sets. The main goal of this approach is to guarantee performance over a broader range of data and improve generalization. The test set is used to reproduce, if possible, the conditions of the production environment and must never be used during the training process. The metrics applied to the test set give the performance of the model in production conditions. In the following, the most common metrics used for evaluation are presented depending on the kind of machine learning task.

Classification

• Accuracy [36]: It is the most straightforward metric for a classification task. It accounts for the percentage of correct label predictions. It is calculated as the ratio between the number of correct classifications and the total number of examples, regardless of the number of classes. Accuracy = (Number of correct labels)/(Number of total examples). Accuracy must be treated carefully in cases where the dataset is imbalanced. For example, in a binary problem with 100 examples, where 95 examples are from class 0 and 5 examples are from class 1, a model that arbitrarily assigns label 0 every time will reach an accuracy of 95%, yet it is an entirely useless model. Also, Accuracy is a threshold-dependent metric, so it is good practice to report any non-obvious threshold choice.
• ROC curve: The receiver operating characteristic (ROC) curve is a method derived from radar technology [37]. The curve is created by plotting the True Positive Rate (TPR) on the vertical axis and the False Positive Rate (FPR) on the horizontal axis while the threshold is varied from 0 to 1. A curve for every class is needed to evaluate multiclass problems. The area under the ROC curve (AUC-ROC) is of particular interest. A random classifier will have 0.5 AUC.
Any value above 0.5 AUC is better than random, and a value close to 1 corresponds to a classifier with very high discriminating power. It should be mentioned that situations with imbalanced datasets deserve closer attention. Figure 1 shows what a hypothetical ROC curve looks like when the classifier is excellent (red curve), reasonable (blue curve), and random (dashed line) with their respective AUC values.

Once a threshold is set, the best option to evaluate a classification prediction is to observe the number of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN) directly. These quantities can be combined in useful metrics such as Precision = TP/(TP + FP), i.e., out of all predicted positives, how many are actual positives; Recall = TP/(TP + FN), i.e., out of the actual positives, how many were predicted correctly; and F1_score = 2 (Precision * Recall)/(Precision + Recall), which is the harmonic mean of Precision and Recall [36]. In most problems, evaluation is a tradeoff between Precision and Recall since, in general, it is hard to increase both simultaneously. The tradeoff must be approached by evaluating the cost embedded in each kind of error. For example, in diagnosing a disease, a False Negative is generally worse than a False Positive since the former can leave a person without adequate medical treatment. We can then tune our model during the training process to prioritize one metric or the other. Also, it is possible to choose a threshold aiming to maximize the chosen metric. Precision vs. Recall or F1_score vs. Threshold curves are valuable resources to better understand the model performance.

Fig. 1 ROC curve. The red line corresponds to a very good classifier with AUC = 0.94; blue line is an acceptable model with enough discriminating power, AUC = 0.72; dashed line indicates a random classifier, AUC = 0.5

Regression

In regression problems, the response is a continuous numeric variable, and the metrics used for validation are entirely different. The main idea is that we do not need a prediction matching exactly the actual value of a continuous variable to consider it a success. We only need an approximate prediction, whose acceptable error depends on the measurement scales and the error tolerance of the specific application. The metrics for regression presented here are just different ways of observing the same fundamental error, which is simply the difference $y_i - \hat{y}_i$ between the actual $y_i$ and the predicted $\hat{y}_i$ values. For example, the Mean Absolute Error (MAE) is calculated as

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|,$$

where the absolute value gives the same importance to predicted values above and below the actual value. The mean is necessary since different experiments could have different numbers of examples. The absolute value can be substituted by the square function, and the Mean Square Error (MSE) is obtained [38]. MSE has a similar interpretation to MAE; the main differences are that MSE penalizes large errors more heavily and is expressed in the squared units of the predicted values. Therefore, it may be better to work with the Root Mean Square Error (RMSE), calculated as

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$
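The three regression error measures discussed above (MAE, MSE, and RMSE) are straightforward to compute, as the following Python sketch shows. The function name and the four pairs of actual and predicted values are made up for illustration.

```python
from math import sqrt


def regression_errors(actual, predicted):
    """Return MAE, MSE and RMSE for paired actual and predicted values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    return mae, mse, sqrt(mse)


actual = [3.0, 5.0, 7.0, 9.0]     # made-up values
predicted = [2.0, 5.0, 8.0, 12.0]
mae, mse, rmse = regression_errors(actual, predicted)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}")
```

In this toy case the RMSE exceeds the MAE because the single large error (3 units) is weighted more heavily by the square.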
For linear regression models, one can also evaluate how well the model fits the actual data using the Coefficient of Determination or $R^2$ [39], which measures the proportion of variation in the response variable that is explained by the variation in the features,

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}.$$
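A short Python sketch of this formula is given below, evaluated on three made-up prediction vectors: a perfect one, one that always predicts the mean of the response, and one that is deliberately poor. The function name and the data are illustrative assumptions.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]                    # made-up response values
perfect = r_squared(actual, actual)              # predictions equal the data
baseline = r_squared(actual, [2.5] * 4)          # always predict the mean
poor = r_squared(actual, [4.0, 3.0, 2.0, 1.0])   # predictions in reverse order
print(perfect, baseline, poor)
```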
In theory, $R^2$ goes from 1, being a perfect fit, to 0, meaning the fit is just the average of the response variable. However, in poor models the fit could be worse than the average, thus producing a negative value. $R^2$ can be misleading when comparing models with a different number of features since $R^2$ tends to be higher when the number of features increases, even if the added features carry no real predictive value. In order to solve this issue, it is recommended to use the Adjusted R Square:

$$\text{Adjusted-}R^2 = 1 - \left[ \frac{(1 - R^2)(N - 1)}{N - k - 1} \right],$$
where $N$ is the number of data samples, and $k$ is the number of features. Then, models with a higher number of features are penalized.

Clusterization

Silhouette is a measure of cohesion, meaning how similar an object is to the cluster it belongs to [40]. Its values range from −1 to 1. Assigning a silhouette value to each element inside a cluster is possible by using

$$s = \frac{b - a}{\max(a, b)},$$
where a is the mean distance between a point and all the other elements inside the same cluster, and b is the mean distance between the point and all the points in the nearest neighboring cluster. Positive values closer to 1 mean the element was correctly allocated inside the cluster, and negative values mean a wrong assignment. The global silhouette score is calculated by averaging the individual silhouette values. Also, the average silhouette score can be used to find the optimal number of clusters.
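The silhouette calculation can be sketched directly from the definitions of a and b. The code below is a minimal plain-Python illustration under stated assumptions: Euclidean distance is used, singleton clusters are assigned a value of 0 (a common convention), and the function names and the four sample points are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself contributes zero to the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        values.append(0.0 if denom == 0 else (b - a) / denom)
    return values


def mean_silhouette(points, labels):
    """Global silhouette score: average of the individual values."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)


# Hypothetical 2-D feature vectors forming two well-separated groups.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # matches the geometry
bad_labels = [0, 1, 0, 1]   # mixes the two groups

print(round(mean_silhouette(pts, good_labels), 3))
print(round(mean_silhouette(pts, bad_labels), 3))
```

To choose the number of clusters, one would repeat the clustering for several candidate values and keep the one with the highest mean silhouette score.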
G. Ibáñez-Redin et al.
3 Application of Machine Learning in Wearable Technologies

3.1 Healthcare Applications

Machine learning applications in the medical field are diverse and include analyzing medical images, processing medical records, mining genetic databases, and monitoring patients [41]. In particular, approaches that use machine learning to analyze human signals recorded with accelerometers, temperature, electroencephalography (EEG), and heart rate sensors are growing in popularity [2]. Such methodologies have been used for monitoring patients in intensive care units, with the development of models to predict clinical worsening [42–44]. The development of wearable technology has allowed extending the application of machine learning to different fields in healthcare. This is illustrated in Table 1 with recent applications in disease diagnosis [45–53], survey [54], prediction [55–57], monitoring [58–67], and patient assistance [68]. Diagnosis and risk prediction of cardiovascular diseases, which represent the leading cause of death in the world, have been performed with wearable devices that may offer robustness, simplicity, and increased comfort for patients [46, 49, 53, 57]. Machine learning and wearable technologies have also been applied to elderly health monitoring in surveillance and disease prediction [69]. For example, models have been developed to monitor falls [67] and dehydration [55] in older adults and diagnose aging-related diseases, including Alzheimer’s [50]. More examples are given in the following paragraphs, where some recent applications of wearable devices for healthcare will be detailed. Methodologies for mass surveillance to identify early cases of COVID-19 and prevent the spread of the disease can be developed using wearable sensors, as demonstrated by Lonini and coworkers [54]. They used physiological signals recorded with a sensor adhered to the throat to train classification models and screen individuals at risk of COVID-19 infections (Fig. 2a).
Different classifiers based on LR were trained using single and combined groups of physiological signals, viz. heartbeat, respiration dynamics, walking cadence, and cough frequency spectrum. As observed in Fig. 2b, the model trained on the combined feature set exhibited the highest performance in discriminating COVID-19 patients from the healthy control group, with an AUC of 94%. Hirten and coworkers reported the detection of COVID-19 infections in healthcare workers using physiological data [45]. A total of 407 participants enrolled in the study had their heart rates monitored with Apple smartwatches. Heart rate variability (HRV) was calculated in recording periods of seconds, and parameters such as mean HRV during the day, amplitude, and acrophase were used together with the heart rate measured at rest to train the models. Using the gradient-boosting machines (GBM) algorithm, an AUC-ROC of 86.4%, a sensitivity of 82%, and a specificity of 77% were obtained for the testing datasets. The proposed methodology is promising for identifying and predicting SARS-CoV-2 infections using commercially available wearable devices.
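The figures of merit quoted for these classifiers (sensitivity, specificity, and AUC-ROC) can be computed from predicted scores and true labels. The sketch below is a minimal plain-Python illustration, not the procedure used in the cited studies: the function names, the 0.5 threshold, and the eight labelled scores are hypothetical. The AUC is obtained from its rank interpretation, i.e., the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = infected, 0 = healthy) and classifier scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores)
auc = auc_roc(labels, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.3f}")
```

Unlike sensitivity and specificity, the AUC does not depend on a particular decision threshold, which is why it is commonly reported alongside them.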
Table 1 Recent applications of machine learning to healthcare with wearable devices

| Application type | Application | Device | Sensors | Algorithms | Performance | Total cohort | References |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Diagnosis | SARS-CoV-2 infections | Apple watch | Heart rate sensor | Gradient-boosting machines (GBM), elastic-net, PLS, SVM, and RF | GBM algorithm: AUC-ROC = 86.4%, average sensitivity = 82%, specificity = 77% | 407 | [45] |
| Diagnosis | Heart health status | Prototype | ECG electrodes | K-mean clustering | – | 3 | [46] |
| Diagnosis | Epilepsy seizures | – | Electroencephalography sensor | SVM, RF, NB, KNN, and NN | RF algorithm: accuracy = 97.08% | UCI repository | [47] |
| Diagnosis | Pulmonary abnormalities (crackle, wheeze, stridor, and rhonchi) | Soft wearable stethoscope prototype | Microphone | CNN | Accuracy = 94.78% | 20 | [48] |
| Diagnosis | Hypertrophic cardiomyopathy | Wrist-worn physiological signal recorder prototype | Photoplethysmography sensor | Multiple-instance learning via embedded instance selection | Accuracy = 98%, sensitivity = 95%, specificity = 98% | 85 | [49] |
| Diagnosis | Early stage of Alzheimer | Motion detection device | – | Dynamic time warping (DTW), KNN, SVM, and inertial navigation algorithm | DTW algorithm: sensitivity = 95.9%, specificity = 94% | 325 | [50] |

(continued)
Wrist-worn triaxial accelerometers
Type 2 diabetes
Chronic respiratory diseases (asthma, bronchitis, and chronic obstructive pulmonary disease)
Premature ventricular contractions
COVID-19
Dehydration
Diagnosis
Diagnosis
Diagnosis
Survey
Prediction
Shimmer galvanic skin response
Body-conforming soft sensor prototype
–
Smart face mask prototype
Device
Application Application
Table 1 (continued)
Accelerometer, magnetometer, gyroscope, galvanic skin response sensor, photoplethysmography sensor, temperature, and barometric pressure sensor
Accelerometer
ECG electrodes
Piezoelectric breath sensor
Accelerometer
Sensors
47 [53] (MIT-BIH arrhythmia database)
29
Accuracy = 99.7% Sensitivity = 97.45% Specificity = 99.87% AUC-ROC = 94%
11
20
Accuracy = 95.5%
(continued)
[55]
[54]
[52]
[51]
502,664 (UK biobank)
AUC-ROC = 86%
References
Total cohort
Performance
LR, Lasso, Ridge RF algorithm: Regression, Elastic RMSE = 0.84 Net Regression, SVR, ANN, Gradient Boosting, DNN, and RF
LR
KNN
DT
RF, LR, and XGBoost
Algorithms
Smart watch
Personalized predictions of clinical laboratory measurements
Heart failure exacerbation
Talking in respiratory signals
Fetal movement
Breathing patterns
Rheumatoid arthritis and axial spondyloarthritis
Prediction
Prediction
Monitoring
Monitoring
Monitoring
Monitoring
Withings activité Pop watch
Chest-worn band (AirGO™)
Elastic band prototype
Elastic band prototype
Vital connect multisensor monitoring device
Device
Application Application
Table 1 (continued)
Movement sensor
Breathing sensor and tri-axis accelerometer
Accelerometer
–
Naive Bayesian classification
Sensitivity = 96% Specificity = 97%
Hybrid hierarchical Accuracy = classification 97.22%
Fuzzy ARTMAP classifier
155
12
14
Resistive stretch sensors RF, SVM, NN, and RF algorithm: 15 LDA AUC-ROC = 90% Accuracy = 85%
Accelerometer, Similarity-based electroencephalography, modeling temperature, and impedance sensors
100
Total cohort
AUC-ROC = 89% Sensitivity = 87.5% Specificity = 86.0%
Performance 54
Algorithms
Heart rate, electrodermal RF, LASSO and Multiple activity, temperature, canonical correlation = and movement sensors correlation analysis 0.74
Sensors
(continued)
[61]
[60]
[59]
[58]
[57]
[56]
References
Smart shirt prototype
Patient activities in a free-living environment
Pathogens presence
Mental fatigue
Alcohol consumption
Monitoring
Monitoring
Monitoring
Monitoring
Electrocardiogram, respiration rate, and galvanic skin response sensors
Performance
Extra-trees
DT, SVM, and KNN
3
73
RMSE = 0.013
–
5
Total cohort
DT algorithm: Accuracy: 89%
–
LDA, SVM, KNN, SVM algorithm: DT, NB, and, ANN F1-score: 93.33% Sensitivity = 93.33% Specificity = 97.78% Accuracy = 96.67% Precision = 93.33%
Algorithms
Antibody functionalized SVM surface and portable microscope
Accelerometer, light sensor and pulse sensor
Sensors
Secure continuous Transdermal alcohol remote alcohol sensor monitor (SCRAM™) and BACtrack Skyn™
Multimodal epidermal electronic system prototype
Contact lens biosensor prototype
Device
Application Application
Table 1 (continued)
(continued)
[65]
[64]
[63]
[62]
References
Shimmer3 GSR+
Stress in older adults
Fall detection in elderly people
Eye-movement-controlled Wheelchair wheelchair Based controller
Monitoring
Monitoring
Patient assistant
Wearable belt prototype
Device
Application Application
Table 1 (continued) Algorithms
Flexible hydrogel biosensor
Accelerometer and gyroscope Wavelet Transform-SVM, SVM and transfer learning convolutional CNN
LR
Electrodermal activity RF, KNN, SVM, and Blood volume pulse LR, and LSTM sensors
Sensors
–
Accuracy = 100%
Wavelet 30 transform-SVM: Accuracy = 96.3%
19
Total cohort
LR algorithm: F1-score: 87% Sensitivity = 81% Specificity = 98% AUC = 81%
Performance
[68]
[67]
[66]
References
Fig. 2 Schematic illustration of the development of models to predict COVID-symptoms using wearable accelerometers (a). ROC curves for the detection models trained on different datasets of physiological features derived from data recorded with the wearable device (b). Reprinted (Adapted) with permission from [54] (material published under a creative commons license)
Machine learning approaches with wearable devices have already been used to monitor patients with chronic conditions [70]. One of them is epilepsy, a disease affecting approximately 50 million people worldwide [71]. Epilepsy is characterized by seizures that may be accompanied by loss of consciousness and bowel incontinence [2]. Most patients with epilepsy require continuous monitoring since they are at constant risk of accidents and injuries due to seizures and the use of drugs that promote bone weakening. A methodology for monitoring epileptic seizures using machine learning and EEG data has been developed [47]. The authors tested SVM, RF, NB, KNN, and NN algorithms in classifying EEG data recorded under five different conditions, available at the UC Irvine machine learning repository. The models trained with SVM and RF exhibited superior performance, with an accuracy of ~97%, but only RF fulfills the requirements of low computational cost for applications in continuous monitoring. Seizure forecasting is also possible using commercially available wearable devices and machine learning, as demonstrated by Meisel and coworkers using long short-term memory (LSTM) networks and data from 69 patients with epilepsy [11]. Combining wearable devices and data processing with machine learning can significantly improve the accuracy of diagnosing respiratory and pulmonary-related abnormalities. Zhang and coworkers reported on a smart face mask to monitor respiratory signals in patients with chronic respiratory diseases [52]. The mask contains
a piezoelectric sensor fabricated with eco-friendly materials, capable of recording the waveforms of various breathing conditions with high stability for up to four hours (Fig. 3a). Figure 3b shows a picture of the device, which contains a portable readout system. A bagged DT classifier was trained using data from healthy volunteers and patients with asthma, bronchitis, and chronic obstructive pulmonary disease collected with the mask. The classifier was excellent in distinguishing between the healthy control group and patients with chronic respiratory diseases, with an accuracy of 99.5%. In a similar approach, Lee and coworkers developed a soft wearable stethoscope for remote cardiopulmonary auscultation of patients [48]. The flexible device contains a microphone sensor, a thin-film electronic circuit, a rechargeable battery, and a Bluetooth system for wireless data transmission. Their device allows for continuous detection of different cardiopulmonary sounds with minimal noise. A CNN algorithm was used to train a classification model using data from healthy volunteers and patients with four different types of pulmonary abnormalities (i.e., crackle, wheeze, stridor, and rhonchi). The model shows excellent performance in the multiclass analysis, with about 95% accuracy for the five groups. Although some works have explored innovative wearable devices for clinical applications, most research involving machine learning models has been done with commercial devices such as smartwatches and smart bands. This is surprising, considering the recent emergence of many wearable devices for medical applications [72–74]. In particular, wearable sensors and biosensors for detecting pathogens, biomarkers, and chemical substances remain relatively unexplored in diagnostics and monitoring involving machine learning [10]. However, some algorithms have been used to process the data from wearable biosensors, resulting in improved performance. 
For example, Veli and coworkers used an SVM-based algorithm to process holographic images from contact lens sensors to detect S. aureus [63]. The
Fig. 3 Schematic illustration of the smart mask for diagnosis of chronic respiratory diseases (a). Picture illustrating the respiratory signal recording process, and the portable readout circuit (b). Reprinted (adapted) with permission from [52]. Copyright 2022 American Chemical Society
contact lens surface was modified with specific antibodies to capture S. aureus and then incubated with microbeads functionalized with secondary antibodies. S. aureus detection was accomplished by counting the beads attached to the bacteria from images captured using a portable lens-free microscope. An SVM-based algorithm was used to distinguish microbeads from other particles non-specifically adsorbed at the surface, leading to improved selectivity. Wang and coworkers reported on an eye-movement-controlled wheelchair prototype based on flexible hydrogel biosensors and a Wavelet Transform (WT)-SVM algorithm [68]. Their prototype incorporates flexible sensors into conductive hydrogels and flexible substrates, ensuring comfortable use for patients. The signal recorded by the soft sensor was used to train a WT-SVM model with a Gaussian kernel function that allows for recognizing different eye movements. The WT-SVM model exhibited superior performance compared to SVM and transfer learning CNN models, with a high accuracy of 96.3%. Also, the applicability of the eye-movement-controlled wheelchair prototype was demonstrated in tests with ten volunteers who were asked to maneuver the wheelchair through a bend (Fig. 4).
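To give a rough idea of what a wavelet-transform front end followed by a Gaussian kernel involves, the sketch below extracts sub-band energies with a Haar wavelet and evaluates a Gaussian (RBF) kernel between two feature vectors. This is only an illustration under stated assumptions: the text does not specify the wavelet family, decomposition depth, or kernel width used in [68], so the Haar wavelet, `levels=2`, `gamma=0.5`, the function names, and the two toy signals are all hypothetical choices.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels=2):
    """Energy of each detail band, followed by the final approximation energy."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features


def gaussian_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical sensor traces: a slow step and a fast oscillation.
slow = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
fast = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

f_slow = wavelet_energy_features(slow)
f_fast = wavelet_energy_features(fast)
print(f_slow, f_fast)
print(gaussian_kernel(f_slow, f_fast))
```

In a complete pipeline, feature vectors of this kind would be passed to an SVM trainer; the kernel value measures how similar two signals are in the wavelet-energy feature space.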
Fig. 4 Different eye movements and corresponding wheelchair movement modes using the flexible hydrogel biosensors. Reprinted (Adapted) with permission from [68] (material published under a creative commons license)
3.2 Sports Analytics

The physiological data and movement patterns required to investigate athletic performance in sports, such as heart activity, recovery, muscular strength coordination, and balance, overlap significantly with general health indicators. Given this overlap, data collection tools for these physiological indices can be used to analyze both athletic performance and available health predictors. The application of flexible and wearable sensors in the evaluation of athletes has made it possible for coaches to make decisions aimed at increasing individual performance. From pre-established indicators, it is possible to assess the kinematic and physiological state of athletes and develop a sporting condition supported not only by hard training but also by strategic tools [75]. Understanding the physiology and logistics of sports requires the use of biomechanical, biometric, and positional data. The data obtained, both in real time and after the event, has massive potential for knowledge discovery and R&D across sports. This combination may aid in decoding the physiology and planning of a sport, potentially opening new avenues for academic research or industrial applications. Currently, several AI/machine learning algorithms can perform multi-sensor data fusion. These algorithms can stitch images, analyze and forecast time series data, and detect anomalies and faults. Data collection is performed through kinetic and physiological indicators. Kinetic indicators include the use of sensors to monitor posture, movement, acceleration, pressure, and force [76–78]. Physiological indicators include the analysis of metabolites such as glucose, lactic acid, electrolytes, and sweat pH, and vital signs such as heart rate, blood pressure, temperature, and respiration, among others [79–82]. 
Kinematic indicators comprise a series of physical parameters that monitor the deformation of the body when performing a specific movement and it is the type of movement that will determine the parameters that will be evaluated. Liu and coworkers developed a wearable sensor to assess the skill of volleyball players [78]. In the study, the sensor monitors finger and arm movements during ball placement (Fig. 5a) extracting data on the force applied to throw the ball, frequency (Fig. 5d), and amplitude (Fig. 5e) with which limbs are flexed. The sensor’s operating mechanism is based on the piezoelectric effect from an applied force. The images and performance of the flexible wearable sensor are shown in Fig. 5. Jeong and colleagues [76] worked on a sensor to assess the distribution of pressure on the feet and hands while performing weightlifting exercises. The sensor consists of an electrode and a functional film of polyamide/carbon nanotubes optimized by flat-tip microdomes. In this type of sensor, pressure data are collected through the change in contact resistance between the electrode and the functional film, in this case polyamide, induced by mechanical deformation [83] established when lifting the weight. The sensor had a sensitivity of 5.66 × 10−3 kPa−1 at 50 kPa and a minimum sensitivity of 0.23 × 10−3 kPa−1 at 3000 kPa. Wearable sensors applied to the investigation of physiological indicators are divided into devices for monitoring vital signs and metabolic parameters [84]. Devices that measure vital signs such as heart rate, respiration, body temperature,
Fig. 5 Scheme of data collection with the wearable sensor applied to volleyball (a). Images of the sensor (b) and (c). The output piezoelectric voltage response at different frequencies (d). The output piezoelectric voltage response at different angles (e). Reprinted (Adapted) with permission from [78] (material published under a creative commons license)
and blood pressure assist in assessing exercise load, fatigue conditions, and the athlete’s recovery level. The monitoring of metabolites and electrolytes in body fluids permits the analysis of muscle condition, functional changes, and energy metabolism [85]. Usually, the metabolites monitored in sports are glucose, lactic acid, and concentrations of ions such as sodium, potassium, calcium, and chlorides. Lactic acid is an indicator of muscle fatigue, as it is associated with the anaerobic reaction of glucose due to a lack of oxygen during exercise. Excess lactic acid causes pain and lactic acidosis [86]. Glucose is an important source of energy. Controlling the level of storage in the body is a determining factor for the high performance of athletes. Meanwhile, the concentration of ions in body fluids determines the proper functioning of various organs. A lack of potassium can influence the heart rate and increase fatigue. The excess of sodium, on the other hand, causes muscle cramps and interferes in
movement execution [87]. Gao and coworkers fabricated a wearable sensor on a PET substrate for continuous detection of sodium, lactate, potassium, glucose, and skin temperature [79]. The sensor for skin temperature is used to calibrate the response of the other sensors due to the temperature dependence of the enzymatic reactions. The wearable system was used to measure the sweat profile of volunteers during physical activities and to make a real-time assessment of the physiological state of the subjects. This platform enables a wide range of personalized diagnostic and physiological monitoring applications [1]. Advances in manufacturing, electronic engineering, printing, non-invasive data collection, and monitoring technology have made it possible to create durable, discreet, and non-invasive clothing as an electronic platform capable of detecting human movements and vital signs [88]. This media and sensor miniaturization provides an unprecedented ability to gather a large amount of data in different situations. Choosing a specific set of sensors located on different parts of the human body increases the potential for collecting accurate data to solve strategic problems. In this sense, extensive research to integrate data analysis methods with sports science and smart solutions has emerged to support all phases of sports training. Smart Sports Training (SST) is a new type of sports training that aims to improve training performance through wearable sensors, Internet of Things (IoT) devices, and intelligent data analysis methods. Rajšp and Fister, Jr. [88] published a study on SST with data analysis methods, including computational intelligence, conventional data mining methods, deep learning, machine learning, and other methods. In addition, they pointed out that computational intelligence algorithms have stood out in recent years; however, the most used intelligent data analysis methods remain SVM, ANN, KNN, and RF. 
Table 2 shows the list of research on smart sports training in different sports. The spread of smart apps has influenced every aspect of sports training. Research indicates that SST is an exceptional and decisive tool in transforming the way high-performance athletes are required in all training phases. This new SST approach can only be realized because accurate data can be recorded using wearable devices.
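Of the methods named above, KNN is the simplest to write down: a new sample receives the label held by the majority of its k closest training samples. The following is a minimal plain-Python sketch under stated assumptions. Euclidean distance is used, and the function name, the two features (mean and standard deviation of acceleration magnitude), and the activity labels are hypothetical, not taken from the studies listed in Table 2.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    # Sort training samples by Euclidean distance to the query and keep k.
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical motion-sensor features: (mean, standard deviation) of the
# acceleration magnitude over a short window, in units of g.
train_x = [(1.0, 0.10), (1.1, 0.20), (0.9, 0.15),
           (2.5, 1.00), (2.7, 1.20), (2.4, 0.90)]
train_y = ["walk", "walk", "walk", "jump", "jump", "jump"]

print(knn_predict(train_x, train_y, (1.05, 0.12)))
print(knn_predict(train_x, train_y, (2.60, 1.10)))
```

Choosing an odd k avoids tied votes in two-class problems; in practice, features are usually scaled before distances are computed so that no single feature dominates.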
3.3 Smart Farming and Precision Agriculture Agriculture 4.0, which encompasses the application of robotics, IoT, AI, alternative energies, and sensors to the cultivation and production of food, is experiencing tremendous growth. Policy-makers support this new technology revolution to reduce the negative environmental impact and guarantee the population’s food security [96]. Smart farming approaches are now used to optimize the application of pesticides and fertilizers and for crop irrigation [96, 97]. Recent developments in sensors, in particular, have the potential to revolutionize modern agriculture since these devices are suitable to monitor parameters that are essential for optimizing plant growing conditions, resulting in improved crop yields. Such sensors can be used for real-time measurement of chemical and physical parameters on soil, including temperature,
Motion
Basketball
Algorithm
The system provided real-time feedback with up to 70% accuracy
Accuracy of 95.3% and 99.4% was achieved for activity recognition and repetition count,
99.5% accuracy of activity
Performance
The demonstration of a wearable system, Adaptive Boosting 81.7% accuracy was achieved after 24 based on a fabric force mapping sensor different leg workout sessions matrix, which can measure the muscle movement during various sporting activities, demonstrated with the case of leg workout exercises
KNN
An automatic indoor exercise recognition KNN; SVM; DT model for both in gym and home usage scenarios
Recognizing the basketball training type SVM automatically by sampling data from a battery powered wireless wearable device equipped with motion sensors, by using the SVM classifier
Focus
Temperature hardness The implementation of an ambient intelligence system applied to the practice of outdoor running sports, with support for personalized real-time feedback for sports practitioners
Gym training Motion
Running
Gym training Posture
Sensor
Sport
Table 2 Applications of smart sports training to different sports approaches [88]
(continued)
[92]
[91]
[90]
[89]
References
Sensor
Motion
Motion
Motion
Sport
Swimming
Table tennis
Running
Table 2 (continued) Algorithm
Use of machine learning and wearable sensors to predict energetics and kinematics of cutting Maneuvers
DT; ANN
A deep learning-based coaching assistant LSTM method, for providing useful information in supporting table tennis practice on data collected by an inertial movement unit sensor
Proposal for a methodology for the DT; ANN automatic identification and classification of swimmers’ kinematics (swimming strokes) information, retrieved from sensors, during interval training of competitive swimming
Focus
References
Turn direction classification returned good results (accuracy > 98.4%) with all methods
Experimental results showed that the presented method can yield results for characterizing high-dimensional time series patterns
[95]
[94]
The accuracy of the stroke style [93] classification by both the multi-layered neural network (NN) and the C4.5 Decision Tree was 91.1%
Performance
pH, humidity, nutrients, and pesticide concentrations [98]. Wearable devices can also be used to monitor animals on farms. These sensors can give information about health and stress, ensuring the well-being of the animals [99]. Measuring animal behavior in large, extensively reared herds is expensive, inefficient, and impractical. Introducing wearable sensing technologies makes these measurements possible in a precise, accessible way. The use of sensing in herds brings a second obstacle: the large amount of disconnected data can only be interpreted using computational algorithms. Consequently, the use of sensors in the field is closely tied to the application of machine learning algorithms [12]. For example, wearable sensors have been utilized for predicting sheep metabolizable energy intake [100]. The authors used GPS tracking collars to monitor the position and activity of the sheep. Using a Deep Belief Network (DBN) algorithm, they were able to predict the metabolizable energy intake with an MSE of 20.65 for the testing dataset. Although wearable sensors for human monitoring with machine learning are generally associated with clinical and sports applications, such devices can be used to assess the safety of workers in different industrial sectors. Safety monitoring is one of the main concerns of Agriculture 4.0, as workers may be exposed to several risks from the constant use of chemical substances and mechanical equipment [101]. The manipulation of mechanical tools employed in modern agriculture, for example, is associated with musculoskeletal disorders [102]. Therefore, having tools that allow real-time monitoring of workers’ exposure to this type of risk is essential for applying timely corrective measures. Aiello and coworkers reported on a new strategy for monitoring the exposure of farmworkers to vibration risks due to the use of automated tools [103]. The methodology was tested by monitoring a team of operators during mechanized olive harvesting. 
The data was collected with a wearable prototype device attached to the waist of the operator and connected through a cable to two accelerometers fixed to the right and left wrists. A KNN algorithm was used to train a classification model that distinguishes vibration signals originating during catch and no-catch activities. The classifier achieved excellent performance, with accuracies ranging from 97 to 98% for different values of k (5, 7, and 9), precision ranging between 94 and 96%, and a sensitivity of 100%. Monitoring falls is also relevant to ensure safety in fieldwork, particularly for older farm workers, who are more susceptible to injuries and fractures. Son and coworkers developed a methodology to detect fall and non-fall movements in farmworkers using wearable sensors and supervised machine learning [13]. An inertial sensor measuring two acceleration signals and one angular velocity signal, attached to the waist of healthy volunteers, was used to record data on accidental falls and activities of daily living (ADL). KNN, SVM, and artificial neural network (ANN) algorithms were used to train binary and multiclass classifiers using data from different fall and ADL movements. The ANN algorithm provided the best performance in the binary classification, with a ROC-AUC score of 1.0, an accuracy of 99.84%, an F1-score of 99.83%, and a Matthews correlation coefficient (MCC) of 99.69%. The best multiclass model for distinguishing among different ADL and fall movements was obtained with the SVM algorithm, which exhibited a ROC-AUC score of 0.988 along with accuracy, F1-score, and MCC of 83.94, 74.83, and 81.83%, respectively.
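The accuracy, precision, sensitivity, F1-score, and MCC values reported in these studies all derive from the four entries of a binary confusion matrix. The sketch below shows the standard formulas in plain Python. The function name and the confusion counts are hypothetical; they were chosen only to resemble a classifier with perfect sensitivity and are not data from [103] or [13].

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Returns 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient, ranging from -1 to 1.
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "f1": f1, "mcc": mcc}


# Hypothetical counts: every positive event is detected (fn = 0),
# with three false alarms among 50 negative events.
metrics = binary_metrics(tp=50, fp=3, tn=47, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MCC uses all four counts, it stays informative when the two classes are imbalanced, a situation in which accuracy alone can look deceptively high.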
The concept of wearable sensors has also been extended to plants. Wearable devices with different transduction mechanisms have been produced with a variety of flexible materials to monitor the status of plants’ health by tracking biomarkers, growth rate, and the presence of pathogens and pollutants [104]. For example, strain sensors based on piezoelectric measurements have been developed for measuring the growth rate of plants, fruits, and vegetables [105–107]. Flexible sensors can be attached to the plant leaves to measure microclimate conditions such as humidity, temperature, and light exposure. Maintaining the proper balance of these conditions plays an essential role in preserving the health of most plants, so having easy-to-use tools that allow continuous monitoring could have a significant impact on increasing farms’ productivity [104]. Wearable sensors for plant leaves have been used to monitor microclimate through different transduction mechanisms. For instance, Nassar and coworkers developed a wearable integrated sensor for monitoring growth rate and microclimate [108]. Figure 6 shows a picture of the butterfly-shaped platform containing a resistive temperature sensor and a humidity sensor based on capacitive measurements. The platform was connected to a rechargeable battery and a small programmable system-on-chip using ultra-light electrical wires, which allows for real-time monitoring of the plant’s microclimate, in good correlation with commercially available temperature and humidity sensors. Although research in wearable plant sensors is still in its initial phase, some works have already explored the application of machine learning algorithms for plant monitoring [109–111]. As in monitoring humans and animals, combining wearable sensors with machine learning offers the possibility of training models that allow for detecting anomalies, optimizing processes, and diagnosing diseases in real time. 
These types of tools may impact the agricultural sector, allowing farmers to reduce production
Fig. 6 Photograph of the autonomous plant wearable sensor for microclimate monitoring and its real-time response for monitoring humidity and temperature compared with commercially available devices. Reprinted (Adapted) with permission from [108] (material published under a creative commons license)
Fig. 7 Results of long-term monitoring of the loss of water content at two different temperatures using traditional methodologies and the proposed sensor (a). Plots of determined versus expected loss of water content for impedance (Z) at 30 (1) and 20 °C using the SISSO descriptor (b). Reprinted (adapted) with permission from [109]. Copyright 2022 American Chemical Society
costs, increase product quality, and establish more sustainable strategies [12]. In this regard, Barbosa and coworkers employed wearable impedimetric sensors for on-site monitoring of water loss on soy leaves [109]. Two types of electrodes were used, one based on Ni thin films deposited onto flexible substrates and an eco-friendly alternative based on graphitic paper fabricated by pyrolysis. Both devices were suitable for detecting loss of water content with a wide dynamic range, excellent reproducibility, and minimal variations among signals recorded under windy conditions. A photo of the Ni-based sensor attached to a soy leaf is shown in Fig. 7a, along with plots showing the water loss measured with the sensor under two temperatures. Since the device showed low sensitivity for the measurements at 20 °C using single frequency analysis, machine-learning algorithms were used to treat the data of the whole spectra. As observed in Fig. 7b, using the supervised sure independence screening and sparsifying operator (SISSO) method, it was possible to quantify the loss of water content with a high linear trend (R2 > 0.84) and RMSEs of 0.2% and 0.3% for the training and test sets, respectively.
4 Conclusion and Outlooks
In this chapter, we described various examples of applications of machine learning to wearable technologies. This approach offers countless possibilities for developing tools with potential applications in healthcare, sports, and industry. The literature survey showed that most of the works carried out during the last 5
A Machine Learning Approach in Wearable Technologies
years correspond to applications in the health area. This is not surprising considering the number of commercial devices available to monitor physiological parameters, such as smartwatches and wristbands. In addition, the availability of databases with data from patients monitored with wearable devices facilitates advances in machine learning applications, allowing more research groups to access large amounts of data. Although most machine learning models have been developed using data collected with commercially available devices, several applications using innovative and versatile device prototypes have been reported. It should be noted, however, that most of these devices rely on physical sensors for physiological measurements such as temperature, impedance, and heart rate. The use of machine learning to process data from wearable sensors and biosensors capable of monitoring concentrations of biomarkers in biological fluids remains unexplored. This approach offers excellent opportunities for future research in the biosensing field, with potential applications in personalized medicine.
Acknowledgements This work was supported by FAPESP (2018/22214-6 and 2020/14906-5), CNPq, CAPES, and INEO.
References
1. Seshadri, D. R., et al. (2019). Wearable sensors for monitoring the physiological and biochemical profile of the athlete. NPJ Digital Medicine, 2.
2. Sabry, F., Eltaras, T., Labda, W., Alzoubi, K., & Malluhi, Q. (2022). Machine learning for healthcare wearable devices: The big picture. Journal of Healthcare Engineering, 2022.
3. María, E., Reyes, F., & Joshi, N. (2021). Smart materials for electrochemical flexible nanosensors: Advances and applications.
4. Min, J., Sempionatto, J. R., Teymourian, H., Wang, J., & Gao, W. (2021). Wearable electrochemical biosensors in North America. Biosensors & Bioelectronics, 172, 112750.
5. Wearable Technology Market. (2022). Precedence Research. https://www.precedenceresearch.com/wearable-technology-market
6. Airgo. (2021). https://www.myairgo.com/
7. VitalPatch RTM. (2022). Vital Connect. https://vitalconnect.com/
8. SenseHub Dairy. (2022). Allflex. https://www.allflexsa.com/products/monitoring/cow-monitoring/
9. Sempionatto, J. R., Jeerapan, I., Krishnan, S., & Wang, J. (2019). Wearable chemical sensors: Emerging systems for on-body analytical chemistry. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.9b04668
10. Cui, F., Yue, Y., Zhang, Y., Zhang, Z., & Zhou, H. S. (2020). Advancing biosensors with machine learning. ACS Sensors, 5, 3346–3364.
11. Meisel, C., et al. (2020). Machine learning from wristband sensor data for wearable, noninvasive seizure forecasting. Epilepsia, 61, 2653–2666.
12. Zhang, M., et al. (2021). Wearable internet of things enabled precision livestock farming in smart farms: A review of technical solutions for precise perception, biocompatibility, and sustainability monitoring. Journal of Cleaner Production, 312, 127712.
13. Son, H., et al. (2022). A machine learning approach for the classification of falls and activities of daily living in agricultural workers. IEEE Access, 10, 77418–77431.
14. Kimball, J. P., Inan, O. T., Convertino, V. A., Cardin, S., & Sawka, M. N. (2022). Wearable sensors and machine learning for hypovolemia problems in occupational, military and sports medicine: Physiological basis, hardware and algorithms. Sensors, 22.
15. Torgo, L., & Gama, J. (1997). Regression using classification algorithms. Intelligent Data Analysis, 1, 275–292.
16. Crocker, D. C., & Seber, G. A. F. Linear regression analysis. Technometrics, 22.
17. Vapnik, V. N. (1998). Statistical learning theory. Wiley-Interscience.
18. Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press.
19. Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832–844.
20. Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system (pp. 785–794).
21. Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.1023/A:1018054314350
22. Opitz, D., & Maclin, R. (1999). Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11, 169–198. https://doi.org/10.1613/jair.614
23. Shwartz-Ziv, R., & Armon, A. (2021). Tabular data: Deep learning is not all you need. arXiv:2106.03253
24. Fix, E., & Hodges, J. L. (1989). Discriminatory analysis. Nonparametric discrimination: Consistency properties. International Statistical Review, 57, 238–247.
25. Cover, T. M., & Hart, P. E. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13, 21–27.
26. Black, P. E. (Ed.) (2006). Manhattan distance, in Dictionary of algorithms and data structures [online], 11 February 2019. Available from: https://www.nist.gov/dads/HTML/manhattanDistance.html. Accessed by 3 Nov 2023
27. Black, P. E. (Ed.) Euclidean distance, in Dictionary of algorithms and data structures [online], 17 December 2004. Available from: https://www.nist.gov/dads/HTML/euclidndstnc.html. accessed Today
28. Forgy, E. W. (1965). Cluster analysis of multivariate data: Efficiency vs. interpretability of classifications. Biometrics, 21, 768–769. JSTOR 2528559
29. Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Prentice Hall.
30. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2, 283–304.
31. Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (Vol. 2, pp. 226–231). AAAI Press.
32. Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386–408. https://doi.org/10.1037/h0042519
33. Hardesty, L. (2017). MIT News Office. Explained: Neural networks.
34. Brown, T. B., et al. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 2020-Decem.
35. Deng, L., & Yu, D. (2014). Deep learning: Methods and applications. https://www.microsoft.com/en-us/research/publication/deep-learning-methods-and-applications/
36. Raschka, S. (2015). Looking at different performance evaluation metrics. In Python Machine Learning, 189–198. Packt Publishing Ltd.
37. Fawcett, T. (2006). Introduction to receiver operator curves. Pattern Recognition Letters, 27, 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
38. Willmott, C. J., & Matsuura, K. (2005). Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research, 30, 79–82.
39. Yan, X., & Su, X. (2009). Linear regression analysis: Theory and computing. World Scientific.
40. Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65. https://doi.org/10.1016/0377-0427(87)90125-7
41. Toh, C., & Brody, J. P. (2021). Applications of machine learning in healthcare. In Smart manufacturing: When artificial intelligence meets the internet of things, 65.
42. Desautels, T., et al. (2016). Prediction of sepsis in the intensive care unit with minimal electronic health record data: A machine learning approach. JMIR Medical Informatics, 4, 1–15.
43. Luo, C., et al. (2022). A machine learning-based risk stratification tool for in-hospital mortality of intensive care unit patients with heart failure. Journal of Translational Medicine, 20, 1–9.
44. Murali, S., Rincon, F., Cassina, T., Cook, S., & Goy, J. J. (2020). Heart rate and oxygen saturation monitoring with a new wearable wireless device in the intensive care unit: Pilot comparison trial. Journal of Medical Internet Research, 22.
45. Hirten, R. P., et al. (2022). Evaluation of a machine learning approach utilizing wearable data for prediction of SARS-CoV-2 infection in healthcare workers. JAMIA Open, 5, 1–9.
46. Farooq, A., Seyedmahmoudian, M., & Stojcevski, A. (2021). A wearable wireless sensor system using machine learning classification to detect arrhythmia. IEEE Sensors Journal, 21, 11109–11116.
47. Resque, P., Barros, A., Rosario, D., & Cerqueira, E. (2019). An investigation of different machine learning approaches for epileptic seizure detection. 2019 15th International Wireless Communications and Mobile Computing Conference IWCMC 2019 (pp. 301–306). https://doi.org/10.1109/IWCMC.2019.8766652
48. Lee, S. H., et al. (2022). Fully portable continuous real-time auscultation with a soft wearable stethoscope designed for automated disease diagnosis. Science Advances, 8, 1–13.
49. Green, E. M., et al. (2019). Machine learning detection of obstructive hypertrophic cardiomyopathy using a wearable biosensor. NPJ Digital Medicine, 2, 1–4.
50. Varatharajan, R., Manogaran, G., Priyan, M. K., & Sundarasekar, R. (2018). Wearable sensor devices for early detection of Alzheimer disease using dynamic time warping algorithm. Cluster Computing, 21, 681–690.
51. Lam, B., et al. (2021). Using wearable activity trackers to predict type 2 diabetes: Machine learning-based cross-sectional study of the UK Biobank accelerometer cohort. JMIR Diabetes, 6, 1–15.
52. Zhang, K., et al. (2022). Biodegradable smart face masks for machine learning-assisted chronic respiratory disease diagnosis. ACS Sensors. https://doi.org/10.1021/acssensors.2c01628
53. Yu, J., Wang, X., Chen, X., & Guo, J. (2021). Automatic premature ventricular contraction detection using deep metric learning and KNN. Biosensors, 11.
54. Lonini, L., et al. (2021). Rapid screening of physiological changes associated with COVID-19 using soft-wearables and structured activities: A pilot study. IEEE Journal of Translational Engineering in Health and Medicine, 9.
55. Sabry, F., et al. (2022). Towards on-device dehydration monitoring using machine learning from wearable device’s data. Sensors, 22, 1–20.
56. Dunn, J., et al. (2021). Wearable sensors enable personalized predictions of clinical laboratory measurements. Nature Medicine, 27.
57. Stehlik, J., et al. (2020). Continuous wearable monitoring analytics predict heart failure hospitalization: The LINK-HF multicenter study. Circulation: Heart Failure, 1–10. https://doi.org/10.1161/CIRCHEARTFAILURE.119.006513
58. Ejupi, A., & Menon, C. (2018). Detection of talking in respiratory signals: A feasibility study using machine learning and wearable textile-based sensors. Sensors (Switzerland), 18.
59. Zhao, X., et al. (2019). An IoT-based wearable system using accelerometers and machine learning for fetal movement monitoring. In Proceedings of the 2019 IEEE International Conference on Industrial Cyber Physical Systems ICPS 2019 (pp. 299–304). https://doi.org/10.1109/ICPHYS.2019.8780301
60. Qi, W., & Aliverti, A. (2020). A multimodal wearable system for continuous and real-time breathing pattern monitoring during daily activity. IEEE Journal of Biomedical and Health Informatics, 24, 2199–2207.
61. Gossec, L., et al. (2019). Detection of flares by decrease in physical activity, collected using wearable activity trackers in rheumatoid arthritis or axial spondyloarthritis: An application of machine learning analyses in rheumatology. Arthritis Care and Research, 71, 1336–1343.
62. Kańtoch, E. (2018). Recognition of sedentary behavior by machine learning analysis of wearable sensors during activities of daily living for telemedical assessment of cardiovascular risk. Sensors (Switzerland), 18, 1–17.
63. Veli, M., & Ozcan, A. (2018). Computational sensing of Staphylococcus aureus on contact lenses using 3D imaging of curved surfaces and machine learning. ACS Nano, 12, 2554–2559.
64. Zeng, Z., et al. (2020). Nonintrusive monitoring of mental fatigue status using epidermal electronic systems and machine-learning algorithms. ACS Sensors, 5, 1305–1313.
65. Fairbairn, C. E., Kang, D., & Bosch, N. (2020). Using machine learning for real-time BAC estimation from a new-generation transdermal biosensor in the laboratory. Drug and Alcohol Dependence, 216, 108205.
66. Nath, R. K., Thapliyal, H., & Caban-Holt, A. (2022). Machine learning based stress monitoring in older adults using wearable sensors and cortisol as stress biomarker. The Journal of Signal Processing Systems, 94, 513–525.
67. Desai, K., et al. (2020). A novel machine learning based wearable belt for fall detection. In 2020 IEEE International Conference on Computing Power Communication Technologies GUCON 2020 (pp. 502–505). https://doi.org/10.1109/GUCON48875.2020.9231114
68. Wang, X., Xiao, Y., Deng, F., Chen, Y., & Zhang, H. (2021). Eye-movement-controlled wheelchair based on flexible hydrogel biosensor and WT-SVM. Biosensors, 11.
69. Choi, Y. A., et al. (2021). Machine-learning-based elderly stroke monitoring system using electroencephalography vital signals. Applied Sciences, 11, 1–18.
70. Yu, S., Chai, Y., Chen, H., Sherman, S. J., & Brown, R. A. (2022). Wearable sensor-based chronic condition severity assessment: An adversarial attention-based deep multisource multitask learning approach. MIS Quarterly, 46, 1355–1394.
71. World Health Organization. (2022). Epilepsy. https://www.who.int/news-room/fact-sheets/detail/epilepsy
72. Wang, M., et al. (2022). A wearable electrochemical biosensor for the monitoring of metabolites and nutrients. Nature Biomedical Engineering. https://doi.org/10.1038/s41551-022-00916-z
73. Sempionatto, J. R., et al. (2021). An epidermal patch for the simultaneous monitoring of haemodynamic and metabolic biomarkers. Nature Biomedical Engineering, 5, 737–748.
74. Yang, Y., et al. (2020). A laser-engraved wearable sensor for sensitive detection of uric acid and tyrosine in sweat. Nature Biotechnology, 38, 217–224.
75. Muniz-Pardos, B., et al. (2021). Wearable and telemedicine innovations for Olympic events and elite sport. The Journal of Sports Medicine and Physical Fitness, 61, 1061–1072.
76. Jeong, Y., et al. (2021). Ultra-wide range pressure sensor based on a microstructured conductive nanocomposite for wearable workout monitoring. Advanced Healthcare Materials, 10, 2001461.
77. Menzel, T., & Potthast, W. (2021). Validation of a novel boxing monitoring system to detect and analyse the centre of pressure movement on the boxer’s fist. Sensors, 21, 8394.
78. Liu, W., Long, Z., Yang, G., & Xing, L. (2022). A self-powered wearable motion sensor for monitoring volleyball skill and building big sports data. Biosensors, 12, 60.
79. Gao, W., et al. (2016). Fully integrated wearable sensor arrays for multiplexed in situ perspiration analysis. Nature, 529, 509–514.
80. Hao, J., Zhu, Z., Hu, C., & Liu, Z. (2022). Photosensitive-stamp-inspired scalable fabrication strategy of wearable sensing arrays for noninvasive real-time sweat analysis. Analytical Chemistry, 94, 4547–4555.
81. Zhong, J., et al. (2022). Smart face mask based on an ultrathin pressure sensor for wireless monitoring of breath conditions. Advanced Materials, 34, 2107758.
82. Ji, S., et al. (2020). Water-resistant conformal hybrid electrodes for aquatic endurable electrocardiographic monitoring. Advanced Materials, 32, 2001496.
83. Pan, L., et al. (2014). An ultra-sensitive resistive pressure sensor based on hollow-sphere microstructure induced elasticity in conducting polymer film. Nature Communications, 5, 1–8.
84. Yeung, K. K., et al. (2021). Recent advances in electrochemical sensors for wearable sweat monitoring: A review. IEEE Sensors Journal, 21, 14522–14539.
85. Liu, Y., et al. (2018). Flexible, stretchable sensors for wearable health monitoring: Sensing mechanisms, materials, fabrication strategies and features. Sensors, 18, 645.
86. Liu, G., et al. (2016). A wearable conductivity sensor for wireless real-time sweat monitoring. Sensors and Actuators B: Chemical, 227, 35–42.
87. Tabasum, H., Gill, N., Mishra, R., & Lone, S. (2022). Wearable microfluidic-based e-skin sweat sensors. RSC Advances, 12, 8691–8707.
88. Rajšp, A., & Fister, I., Jr. (2020). A systematic literature review of intelligent data analysis methods for smart sport training. Applied Sciences, 10, 3013.
89. Acikmese, Y., Ustundag, B. C., & Golubovic, E. (2017). Towards an artificial training expert system for basketball. In 2017 10th International Conference on Electrical and Electronics Engineering (ELECO) (pp. 1300–1304). IEEE.
90. Das, D., Busetty, S. M., Bharti, V., & Hegde, P. K. (2017). Strength training: A fitness application for indoor based exercise recognition and comfort analysis. In 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 1126–1129). IEEE.
91. López-Matencio, P., Alonso, J. V., González-Castano, F. J., Sieiro, J. L., & Alcaraz, J. J. (2010). Ambient intelligence assistant for running sports based on k-NN classifiers. In 3rd International Conference on Human System Interaction (pp. 605–611). IEEE.
92. Zhou, B., Sundholm, M., Cheng, J., Cruz, H., & Lukowicz, P. (2016). Never skip leg day: A novel wearable approach to monitoring gym leg exercises. In 2016 IEEE International Conference on Pervasive Computing and Communications (PerCom) (pp. 1–9). IEEE.
93. Ohgi, Y., Kaneda, K., & Takakura, A. (2014). Sensor data mining on the kinematical characteristics of the competitive swimming. Procedia Engineering, 72, 829–834.
94. Lim, S.-M., Oh, H.-C., Kim, J., Lee, J., & Park, J. (2018). LSTM-guided coaching assistant for table tennis practice. Sensors, 18, 4112.
95. Zago, M., Sforza, C., Dolci, C., Tarabini, M., & Galli, M. (2019). Use of machine learning and wearable sensors to predict energetics and kinematics of cutting maneuvers. Sensors, 19, 3094.
96. Rose, D. C., & Chilvers, J. (2018). Agriculture 4.0: Broadening responsible innovation in an era of smart farming. Frontiers in Sustainable Food Systems, 2, 1–7.
97. Alexander, P., et al. (2017). Smart irrigation system for smart farming. In 26th International Conference on Information Systems Development (ISD2017 CYPRUS).
98. Yin, H., et al. (2021). Soil sensors and plant wearables for smart and precision agriculture. Advanced Materials, 33, 1–24.
99. Neethirajan, S. (2020). The role of sensors, big data and machine learning in modern animal farming. Sensing and Bio-Sensing Research, 29, 100367.
100. Suparwito, H., Thomas, D. T., Wong, K. W., Xie, H., & Rai, S. (2021). The use of animal sensor data for predicting sheep metabolisable energy intake using machine learning. Information Processing in Agriculture, 8, 494–504.
101. Stafford, J. V. (2000). Implementing precision agriculture in the 21st century. Journal of Agricultural Engineering Research, 76, 267–275.
102. Benos, L., Tsaopoulos, D., & Bochtis, D. (2020). A review on ergonomics in agriculture. Part II: Mechanized operations. Applied Sciences, 10.
103. Aiello, G., Catania, P., Vallone, M., & Venticinque, M. (2022). Worker safety in agriculture 4.0: A new approach for mapping operator’s vibration risk through machine learning activity recognition. Computers and Electronics in Agriculture, 193.
104. Lee, G., Wei, Q., & Zhu, Y. (2021). Emerging wearable sensors for plant health monitoring. Advanced Functional Materials, 31.
105. Tang, W., Yan, T., Ping, J., Wu, J., & Ying, Y. (2017). Rapid fabrication of flexible and stretchable strain sensor by chitosan-based water ink for plants growth monitoring. Advanced Materials Technologies, 2, 1–5.
106. Jiang, J., Zhang, S., Wang, B., Ding, H., & Wu, Z. (2020). Hydroprinted liquid-alloy-based morphing electronics for fast-growing/tender plants: From physiology monitoring to habit manipulation. Small, 16.
107. Lee, H. J., Joyce, R., & Lee, J. (2022). Liquid polymer/metallic salt-based stretchable strain sensor to evaluate fruit growth. ACS Applied Materials & Interfaces, 14, 5983–5994.
108. Nassar, J. M., et al. (2018). Compliant plant wearables for localized microclimate and plant growth monitoring. npj Flexible Electronics, 2, 1–12.
109. Barbosa, J. A., et al. (2022). Biocompatible wearable electrodes on leaves toward the on-site monitoring of water loss from plants. ACS Applied Materials & Interfaces, 14, 22989–23001.
110. Li, Z., et al. (2021). Real-time monitoring of plant stresses via chemiresistive profiling of leaf volatiles by a wearable sensor. Matter, 4, 2553–2570.
111. Li, D., et al. (2022). Virtual sensor array based on piezoelectric cantilever resonator for identification of volatile organic compounds. ACS Sensors, 7, 1555–1563.
Potential of Machine Learning Algorithms in Material Science: Predictions in Design, Properties, and Applications of Novel Functional Materials
Purvi Bhatt, Neha Singh, and Sumit Chaudhary
Abstract The evolving fields of Machine Learning (ML) and Artificial Intelligence (AI) have contributed tremendously to the advancement of various branches of science and technology. Deep learning has attracted great interest from the material science research community because of its ability to statistically analyze large collections of data. Beyond purely computational tasks, time-efficient machine learning tools have also been applied to predict the design and properties of new materials. A noticeable shift from the trial-and-error laboratory approach to modeling- and simulation-based software techniques in the preparation and characterization of functional materials manifests the emergence of big data in material science. Efficient algorithms enable data collection, highly secure storage, fast processing, and interpretation of physically generated results. Embedding ML in material science research also helps distinguish between simulated data and experimental results. It has put research in the physical and chemical sciences at the forefront, with advancements in image processing, photonics, optoelectronics, and other emerging areas of material science. In this chapter, we review applications of machine learning algorithms to the study of experimentally obtained results of physical systems. A comprehensive study of different deep learning techniques to design and predict new functional materials is detailed. We conclude with a discussion of future directions and of the challenges to the acceptance of this advanced technique within the vast existing area of material science.
P. Bhatt (B)
Department of Physics, School of Engineering, Indrashil University Mehsana, Rajpur, Gujarat 382715, India
N. Singh
Department of Computer Sciences & Engineering, School of Engineering, Indrashil University Mehsana, Rajpur, Gujarat 382715, India
S. Chaudhary
School of CS & AI, Transstadia University, Ahmedabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_4
P. Bhatt et al.
Keywords Machine learning · Data simulation · Image processing · Computational techniques · Functional materials · Modeling and simulation
1 Introduction
The central objective of material science research is a complete understanding of a material’s constituent building blocks, properties, design, and functional mechanism when employed in specific applications. Although the study of materials belongs primarily to physics, chemistry, and nanoscience, it has also attracted great interest from the engineering communities because of advances in synthesis techniques and in the investigation of materials’ properties. State-of-the-art research in material science deals with a noticeably expanded range of applications of novel materials, from basic needs to advances in science, technology, and medicine. Even though material science has already established itself as one of the most in-demand research areas of the present era, it remains practically challenging to apply the numerous methods for the synthesis and fabrication of novel materials, followed by characterization of the prepared samples and their functionalization for the desired applications. One of the major limitations of experimental research is the need for raw materials, a physical set-up, and the economic feasibility to conduct repeated experiments on a trial basis. Additionally, well-trained human resources and adequate infrastructure are integral to any end result. In the past few years, Machine Learning (ML) and Artificial Intelligence (AI) have gained equal popularity in technology and research. The rapid spread of AI in daily life has brought numerous beneficial socio-economic impacts to every aspect of human life. Widely known as big data analytics, this branch involves processing a large collection of data by following a set of instructions to extract meaningful information.
At present, machine learning and AI have emerged as efficient alternatives to laboratory-scale design, fabrication, and application of functional materials. Together, simulation, computation, and modeling on large predicted datasets of materials act as key replacements for lab testing methods. Machine learning involves running algorithms that generate results from given input information about a material’s properties. An added advantage of ML is its operational convenience: it does not require unique programming for each individual task but works on previous related data and produces prediction-based results [1, 2]. Data informatics had already been employed to study chemical and biological systems in the past, and it now shows its potential in solid-state physics and material science as well [3, 4].
Conventional experimental research has always been constrained by time and resources, which has stimulated a continuous need for data-driven simulation and computational science. There are several pioneering modeling methods in this area, including Molecular Dynamics (MD) simulation, the Monte Carlo method, and Density Functional Theory (DFT). These methods enable the manipulation of a large set of materials’ data comprising various phase and composition parameters such as surface properties, number of atoms, energy levels, crystallite phase, and electronic state. Furthermore, it is also possible to calculate properties at the scale of a cluster of materials and generate a massive amount of output data, which forms the foundation for the use of machine learning in material science. As the calculations become complex, they can be governed by algorithms that initiate advanced computational mechanisms and prove effective for studying and investigating scores of materials. These calculations, based on a vast number of input variables, also provide theoretical support for experimentally obtained results, which improves accuracy and makes research considerably more data sensitive. Despite these merits, there are still limitations in deep learning-based material science research: an inadequate understanding of how to apply fundamental laws, together with the complexity of calculations on huge volumes of data beyond human ability, makes the study speculative and calls its validity into question.
In this chapter, we review present limitations in material science research and the adoption of ML in it, followed by its basic principles and existing algorithms and models, with testing and validation. The introduction of AI and its implementation for the prediction of novel functional materials is also detailed. The chapter concludes with the challenges associated with the use of ML and AI and their future role in advanced material science research [1].
2 Fundamentals of Machine Learning Algorithms: In Context of Material Science
ML can be understood as a set of algorithms designed to operate on a specific problem based on a set of instructions and previously available data from the same kind of study. The entire process comprises input data, processing, testing, validation, and final output, and is commonly categorized into the following three mechanisms:
i. Supervised learning.
ii. Unsupervised learning.
iii. Reinforcement learning.
The first refers to the typical data-fitting approach, which operates on predefined input data and uses a function to generate output data. It follows the pattern of previously known data and extrapolates to estimate results, as shown in Fig. 1. Unsupervised learning deals with patterns in unidentified (unlabeled) data and can be applied to data from collective samples. Reinforcement learning processes and interacts with a problem to find the most appropriate solution, that is, the one that obtains the maximum reward [5, 6]. Sometimes, however, the existing data are not sufficient to opt for any single one of the above-mentioned methods. In that case, an intermediate learning method can be applied, which uses an algorithm based on both identified and unidentified data [7]. Among all of these, supervised learning is the most commonly used and most suitable method for the study and investigation of materials. Here, the crucial part is the qualitative and quantitative analysis of the data extracted from materials, which determines the precision and validity of the method. In a typical supervised process, a data subset is selected from the available details and an algorithm is applied to estimate the targeted property. The entire data treatment should be performed with high accuracy and precision. The data fed in should be appropriate for the algorithm, and consistency has to be maintained throughout the process. In a later step, the validity of the model is tested with a parameter called the "cost function", which quantifies the difference between the actual output and the predicted results. A large gap of this kind points to overfitting or underfitting, which has to be considered during extrapolation, testing, and validation. Overfitting results from additional data present in the subset that are not relevant to the target property estimation, and it gives rise to unwanted and random output. In contrast, underfitting relates to missing useful data points, which reduces the model's ability to find a suitable fit. Both situations affect the reliability and accuracy of the output. Overfitting is the major concern when applying supervised learning and can be minimized by continuous assessment of both validation and training errors, accompanied by cross-validation [8, 9]. At the final stage, the model is tested on past data to determine its extrapolation limit and its applicability to a wide spectrum of input data. Testing can be performed using several methods, such as Leave-One-Out Cross-Validation (LOOCV), Leave-One-Cluster-Out Cross-Validation (LOCOCV), Monte Carlo cross-validation, K-fold cross-validation, and hold-out validation [10, 11]. Each of these methods iterates over a training set of data.
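The interplay of cost function, training error, and validation error can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses the mean squared error as the cost function, a one-feature least-squares line as the model, and synthetic data. The helper names (`fit_line`, `mse`, `k_fold_indices`, `cross_validate`) are hypothetical and do not come from the cited works. Setting `k` equal to the number of samples turns K-fold cross-validation into LOOCV.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form, one feature)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def mse(xs, ys, model):
    """Mean squared error, used here as the cost function."""
    a, b = model
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, k):
    """Mean training and validation MSE over k folds (k = n gives LOOCV)."""
    train_err, val_err = [], []
    for fold in k_fold_indices(len(xs), k):
        held = set(fold)
        tr = [i for i in range(len(xs)) if i not in held]
        model = fit_line([xs[i] for i in tr], [ys[i] for i in tr])
        train_err.append(mse([xs[i] for i in tr], [ys[i] for i in tr], model))
        val_err.append(mse([xs[i] for i in fold], [ys[i] for i in fold], model))
    return sum(train_err) / k, sum(val_err) / k

# Synthetic data: y = 2x + 1 plus small Gaussian noise (illustrative only).
rng = random.Random(42)
xs = [i / 2 for i in range(20)]
ys = [2 * x + 1 + rng.gauss(0, 0.1) for x in xs]

kfold_train, kfold_val = cross_validate(xs, ys, k=5)
loo_train, loo_val = cross_validate(xs, ys, k=len(xs))
print("5-fold train/validation MSE:", kfold_train, kfold_val)
print("LOOCV  train/validation MSE:", loo_train, loo_val)
```

A validation error that is much larger than the training error is the usual symptom of overfitting, while two large and similar errors suggest underfitting.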
However, in chemistry research the validity of these methods is still debatable, especially in the quantitative analysis of structure-related properties [12, 13]. The Leave-One-Cluster-Out Cross-Validation (LOCOCV) method, in contrast, has been found effective for extrapolating to novel materials. Here, the prediction is close to the practical result, and the method also exposes the limitations of the model. This is supported by a previous study that analyzed the variation in superconducting behavior across clusters of superconductors [1, 14]. The following section details the fundamental databases and algorithms of machine learning and their adoption and application in material science research (Fig. 1).
Fig. 1 Process flow of supervised learning [1]
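The difference between K-fold validation and leave-one-cluster-out validation can be made concrete with a short sketch. The code below is illustrative only and uses just the Python standard library; the function names and the family labels for the eight superconductors are hypothetical and are not taken from the cited studies.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) for K-fold CV; k == n_samples gives LOOCV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, test_fold in enumerate(folds):
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, sorted(test_fold)


def leave_one_cluster_out_splits(cluster_labels):
    """Yield (cluster, train_idx, test_idx), holding out one whole cluster per split."""
    for cluster in sorted(set(cluster_labels)):
        test_idx = [i for i, c in enumerate(cluster_labels) if c == cluster]
        train_idx = [i for i, c in enumerate(cluster_labels) if c != cluster]
        yield cluster, train_idx, test_idx


# Hypothetical family labels for eight superconductors (illustration only).
labels = ["cuprate"] * 3 + ["iron-based"] * 2 + ["low-Tc"] * 3
kfold = list(k_fold_splits(len(labels), k=4))
loco = list(leave_one_cluster_out_splits(labels))
```

In the K-fold splits, members of every family usually appear in both the training and the test set, so the test mostly probes interpolation. In the leave-one-cluster-out splits, an entire family is unseen during training, which is closer to the task of extrapolating to a novel class of materials.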
Potential of Machine Learning Algorithms in Material Science: …
Fig. 2 Types of algorithms used in material science [15]
To analyze materials’ properties and predict new functional materials with machine learning, the task, the algorithm, and the performance measure collectively provide a pathway for material science research. A challenging step that determines the performance of the machine learning method is the selection of an algorithm that is compatible with the given task. Generalizing an algorithm is also difficult, since each problem has its own solving method and specific application [15, 16]. Widely used ML algorithms in material science can be categorized as shown in Fig. 2 [17, 18]. Predicting the construction of novel materials is the very first step of the entire process, and probability estimation algorithms are commonly used for this purpose. After a material has been discovered, it can be proven functional only once the desired properties have been estimated. Regression, clustering, and classification are the algorithms that determine the characteristics of materials at the macro and micro scale. Examples of advanced algorithms are simulated annealing (SA), genetic algorithms (GA), and particle swarm optimization (PSO), which are generally used in combination with standard ML methods to optimize the parameters of newly discovered materials [17, 18]. Previous studies have shown that these algorithms are also efficient for analyzing complex properties and spatial configurations in materials [19–21]. The primary objective is to predict a new material with exceptional performance for a specific application. Computational and experimental research has so far demonstrated the ability to predict new materials through elemental and structural transformation, although altering the atomic structure at the compositional level to predict different materials remains challenging [22]. The “in silico” approach bridges computer simulation and machine learning algorithms to predict the structure and properties of new materials.
This approach works in two stages, learning the system and predicting a material, as represented in Fig. 3 [15]. The system learning part includes removing irrelevant and unnecessary data, reducing noise, shrinking the input data, and training and testing the model. After completion of this operation, the model is analyzed by a prediction system approach
Fig. 3 General pathway to predict new material by “in silico” [15]
that determines the composition and structure of the material. Additionally, a suggestion feature has been used in several cases to propose constituent and structural insights for the predicted material. First-principles calculations based on Density Functional Theory (DFT) are used to measure the variation between predicted and suggested samples and to assess their stability. Among the previously predicted materials, some were discovered on the basis of crystal phase identification, while others focused on the compositional aspect. A past study predicted 209 new compounds, indicating missing ternary oxides, using a Bayesian machine learning method [23]. This method also predicted new materials with about 20 common ionic substitutions [24]. In crystal identification, a noticeable gain has been observed in the results obtained from the DBSCAN and OPTICS methods [25]. Artificial Neural Network (ANN) algorithms have demonstrated potential in the prediction of guanidinium-based ionic liquids but were not able to perform virtual screening of the materials [26, 27]. Support Vector Machine (SVM) algorithms are widely known for their prediction of properties in polymers [15, 28]. A complete understanding of a new material involves its crystal structure details, structural characterization, and property analysis. Experimental research in crystal physics requires chemical energy for the reaction and the related lab facilities, while DFT-based first-principles calculations require high-performance computational facilities [29]. As a substitute for these methods, ML algorithms have received great interest because of their observation-based analysis of large sets of data obtained from previous experiments. Before 1980, crystal structure estimation was completely ignored in physics research, as stated by Maddox [30]. Nevertheless, this trend changed after 2003, when Curtarolo et al.
introduced a dataset of ab initio calculations and combined quantum computations with machine learning to analyze binary alloys and their crystal phases [31]. Furthermore, DFT calculations also play a crucial role in the study of the surface energy and thermodynamic stability of crystals predicted by ML algorithms. Despite its high-performance prediction ability, the ML method has a limitation: it can predict a material only from information that is already present in the database, and hence it cannot produce an entirely new structure from it. A correlation has been found among atomic size, electronegativity, and crystal phase using the Bayesian probability and regression
method, which indicates that the principles and mechanisms of physical and chemical processes are also instrumental in the prediction of new materials [32]. From previously obtained information, Fischer et al. also developed a suggestion model that gives insights into the compositional and structural parameters of a material [33]. Another model, the Data Mining Structure Predictor (DMSP), has emerged as an efficient tool that extracts relevant experimental data and employs it in quantum computational methods to optimize the stability of the crystal. Two clustering algorithms, namely Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS), have been reported to separate out crystals from a large data set covering various materials [25]. Liu et al. developed an ML pathway that includes defining the data input, selecting features, and categorizing algorithms, and that can estimate the microstructure of magnetoelastic materials, especially Fe-Ga alloys. The time required for this task was observed to be reduced by as much as 80% compared with the conventional computational technique [34]. Another method, derived from regression and Principal Component Analysis (PCA) techniques, has been used to calculate the stability of oxide semiconductors and cubic fluorides at 0 K, 300 K, and 1000 K. In the advancement of organic light-emitting diodes (OLEDs), a training algorithm built on multi-task neural networks has been reported to enhance the efficiency of OLED-based solid-state devices [35]. The next section describes the models used in material science and a few case studies.
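To illustrate how a density-based method such as DBSCAN separates dense groups of samples from isolated ones, a minimal standard-library sketch follows. It is not the implementation used in [25]; the two-dimensional points stand in for hypothetical feature vectors, and the values of eps and min_pts are chosen only for this toy data.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # point i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point is a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # expand only through core points
    return labels


# Two dense groups and one isolated sample (hypothetical feature vectors).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

On this toy data the two dense groups receive labels 0 and 1 and the isolated point is marked as noise (-1). Unlike K-means, the number of clusters does not have to be specified in advance.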
3 Adoption of Machine Learning in Material Science
3.1 Principle
3.1.1 Characterization of Properties
The use of machine learning in material science and related fields has grown rapidly in the last few years, as it addresses several design and development challenges. Since 2014, the number of studies published in this field has been very high and has increased every year [36]. Several key areas in which informatics-based methods are growing and have proved promising are discussed here. Examples addressing challenges in materials modeling, synthesis, and characterization illustrate how machine learning helps overcome various problems and barriers in material design.
3.1.2 Efficient and Predictive Surrogate Models
ML-based surrogate models provide another data-enabled path for connecting processing and structure to desired properties, and they make up the majority of recent work. The descriptors used are easily accessible and carefully devised numerical representations. One of the most important components of surrogate model construction, which often relies heavily on domain-specific expertise, is the selection of a suitable descriptor. To guarantee a truly accurate learning model, recommended practices for statistical inference should be followed, such as choosing training data that reflect the true data distribution, using cross-validation to choose model hyperparameters, and testing on held-out data. The true benefit of such models lies in their remarkable speed, compared with conventional property prediction or measurement pathways, once they have been constructed, verified, and carefully tested to be predictive within a specific domain of application [37]. Because of this, ML-based surrogate models are especially well suited for high-throughput projects that attempt to discover chemistries or compounds with one or more properties falling inside a given range. If a subset of properties exhibits opposing trends or inverse relationships, discovering “optimal” chemistries means finding those that lie on or near the underlying Pareto front, which provides the best possible trade-offs among the conflicting responses (Fig. 4).
Fig. 4 Highlights and elaborates on key areas of ML’s use in materials science
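The notion of a Pareto front over conflicting properties can be sketched in a few lines. The example below assumes that both objectives are to be maximized; the candidate names and property values are hypothetical and serve only to show the dominance test.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name for name, objectives in candidates.items()
        if not any(dominates(other, objectives) for other in candidates.values())
    ]


# Hypothetical candidates with two conflicting properties, both to be maximized.
candidates = {
    "A": (5.0, 3.0),
    "B": (4.0, 4.5),
    "C": (3.0, 6.0),
    "D": (3.5, 4.0),  # dominated by B
    "E": (2.0, 5.0),  # dominated by C
}
front = pareto_front(candidates)
```

Candidates A, B, and C form the front because improving one of their properties always costs something in the other. In a screening campaign the property tuples would come from the surrogate model's predictions rather than from a hand-written table.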
3.1.3 Design and Finding of Materials
Surrogate models based on machine learning (ML) can be used in a variety of ways to facilitate the design and discovery of new materials, building on their fundamental strength of enabling quick yet accurate predictions of material properties. A constructed model can be used to predict outcomes for the whole set of substances that have been combinatorially enumerated and fall within the model’s domain of applicability. Even more intriguingly, several property prediction models can be combined in a hierarchical down-selection pipeline that screens materials using progressively more demanding criteria at each level [16, 38]. The forward property prediction model can also be “inverted” by using an optimization technique such as evolutionary computation, simulated annealing, local minimum search, or swarm optimization-based routines. This optimization-based inversion strategy, which results in a more inclusive approach to materials discovery, concentrates on directly predicting a group of materials that meet pre-specified target objectives. It differs from the straightforward brute-force enumeration method, which relies on the virtual screening of potential candidates from a list of options. In addition to modeling, simulation, and synthesis, the development of atomically precise imaging tools has created new opportunities for ML-based methods to aid in the quick and quantitative characterization of functional materials under static and dynamic conditions. Traditional characterization techniques have typically been used to indicate the qualitative structure or activity of a material system; modern imaging tools, however, now provide far more quantitative and information-rich characterization options.
Techniques such as scanning electron microscopy [39], scanning tunneling microscopy, and atomic force microscopy [40] enable real-space imaging of atomic-level structure and functional features in large multi-component and multi-phase materials. However, extracting structure-property correlations from these very large datasets remains a difficult problem that is beyond the capabilities of standard data-analysis approaches, which mostly rely on human examination by a domain expert. Recent machine learning (ML) efforts have concentrated on generating theory-guided mappings between the known atomic-level structure and the measured response surfaces by utilizing the huge datasets made available by cutting-edge characterization techniques [41, 42]. Electron backscatter diffraction (EBSD) produces orientation maps with a spatial resolution of about 200 nm and a crystal orientation resolution of 0.5°. The fastest commercially available orientation imaging microscopes can perform these measurements very quickly (one diffraction pattern measurement takes less than 1 ms) while remaining extremely accurate [43, 44]. Traditional indexing methods, however, need much longer time scales and are still a bottleneck for orientation reconstruction from highly noisy EBSD patterns. This has been a significant obstacle to implementing the real-time orientation indexing that is necessary, for example, to study in-situ microstructure evolution. Recent research has demonstrated that ML-based algorithms can index orientations as quickly as the highest EBSD scanning rates and are robust to experimentally observed image noise [45, 46].
Fig. 5 Diagram displaying the trade-off between transparency and model performance, including the area that may see future improvement as a result of enhanced explainable AI approaches and tools
Additional notable examples of ML-aided characterization include the classification of specific chemical environments from X-ray absorption spectra [47], the identification of two-dimensional heterostructures in electron microscopy, the automated tuning of electron microscopy controls, data capture and analysis [48], phase identification in spectroscopic techniques [49], and machine vision data processing and image restoration for magnetic resonance imaging. A new paradigm of improved materials will be made possible by combining the information gathered by computer vision on materials with physics-informed models, so that theoretical predictions and experimental discoveries interact at microscopic levels (Fig. 5).
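The optimization-based inversion strategy described earlier in this section can be sketched with simulated annealing. In the example below, the forward model is a hypothetical one-variable stand-in for a trained surrogate (a composition fraction mapped to a property value), and the cooling schedule, step size, and target are assumptions made for illustration rather than recommended settings.

```python
import math
import random


def surrogate_property(x):
    """Hypothetical forward model: composition fraction x in [0, 1] -> property."""
    return 4.0 * x * (1.0 - x)


def invert_by_annealing(model, target, n_steps=2000, step=0.1, t_start=1.0, seed=0):
    """Search for an input whose predicted property is close to the target."""
    rng = random.Random(seed)
    x = rng.random()
    loss = abs(model(x) - target)
    best_x, best_loss = x, loss
    for k in range(n_steps):
        # Linear cooling; the small offset avoids division by zero.
        temperature = t_start * (1.0 - k / n_steps) + 1e-9
        candidate = min(1.0, max(0.0, x + rng.uniform(-step, step)))
        cand_loss = abs(model(candidate) - target)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_loss < loss or rng.random() < math.exp((loss - cand_loss) / temperature):
            x, loss = candidate, cand_loss
            if loss < best_loss:
                best_x, best_loss = x, loss
    return best_x, best_loss


best_x, best_loss = invert_by_annealing(surrogate_property, target=0.75)
```

Instead of enumerating every composition and filtering the predictions, the search proposes inputs directly and keeps those that move the predicted property toward the target. With a real surrogate, the input would be a descriptor vector and the proposal step would have to respect chemical constraints.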
3.2 Automatic Information Acquisition
Text (manuscripts, reports, abstracts, etc.) currently comprises a sizable portion of scientific knowledge and is expanding at an unprecedented rate. Because of the dearth of efficient algorithms that can extract correlations, linkages, and connections directly from text inputs, this information-rich resource is still mostly underutilized. As a consequence, the materials community’s efforts in materials design and discovery have typically relied on expert-curated and well-structured property databases. Over the past 10 years, however, major developments in natural language processing (NLP) have created exciting new opportunities for the study of materials and related fields. Most significantly, a generally applicable method has been developed that uses algorithms such as Word2vec [50] and GloVe to construct high-dimensional vector spaces (commonly referred to as embeddings) for the words that appear in a given text while preserving their relative semantic and syntactic relationships. These word embeddings may be able to capture important scientific concepts and structure-property relationships without the need for specific domain knowledge input. To provide a concrete example of how this method might be used to extract hidden knowledge from the materials science literature, Tshitoyan et al. [51]
gathered and evaluated materials-related research from more than 3.3 million scientific abstracts published between 1922 and 2018. The study’s key finding was that information about future discoveries is, to a considerable extent, already contained in earlier papers in a latent form. As a result, NLP models may be able to suggest new functional materials years before they would normally be discovered. Automated text mining that integrates ML and NLP has also been used to handle the challenging problem of finding realistic materials synthesis routes. In particular, Kim et al. [52] demonstrated how machine learning (ML) techniques can successfully predict the essential synthesis parameters needed to produce a targeted material, in this case titania nanotubes made by hydrothermal methods, with training data auto-compiled from hundreds of thousands of scholarly papers. More significantly, the study demonstrated the ability of the resulting ML models to transfer learning by forecasting the results of synthesis for materials systems that were not part of the initial training set. The examples covered in this section demonstrate how ML-based methodologies and algorithms have been used in a variety of materials design and development applications. The developed models mainly rely on material descriptors, or features, to numerically represent the particulars of the problem, such as the chemical structure and configuration of the material, the processing methods, and the relevant environmental variables. The choice of an appropriate descriptor set is therefore a crucial step. The initial set is usually chosen either on the basis of domain expertise about the underlying problem alone (such as details of the mechanism of action, well-known and physically reasonable relationships, fundamental laws of physics and chemistry, etc.) or by an unbiased selection from the available data, starting with a very large array of combinatorial possibilities.
One could argue that each approach has its own advantages and disadvantages. The latter has a higher potential for discoveries that lie outside the domain of conventional wisdom, whereas the former is likely to produce models that are more amenable to physical interpretation. Regardless, an exploratory investigation that applies a number of techniques to the problem at hand is usually beneficial throughout an ML model development effort [53]. The extent of materials informatics’ influence on the discipline will ultimately depend on our capacity not only to create transparent ML models but also to derive physical insights from these surrogates while maintaining the inherent discovery potential of data-enabled approaches.
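How word embeddings can surface candidate materials for an application may be illustrated by ranking words according to their cosine similarity to an application keyword. The sketch below uses invented three-dimensional vectors; real Word2vec or GloVe embeddings are learned from text and typically have hundreds of dimensions, so the numbers here carry no scientific meaning.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, embeddings):
    """Return all other words, most similar to the query word first."""
    q = embeddings[query]
    scores = {w: cosine_similarity(q, vec) for w, vec in embeddings.items() if w != query}
    return sorted(scores, key=scores.get, reverse=True)


# Invented toy vectors (illustration only, not learned from any corpus).
toy_embeddings = {
    "thermoelectric": [0.9, 0.1, 0.2],
    "Bi2Te3": [0.8, 0.2, 0.1],
    "PbTe": [0.7, 0.1, 0.3],
    "NaCl": [0.1, 0.9, 0.1],
}
ranking = rank_by_similarity("thermoelectric", toy_embeddings)
```

In this toy space the two tellurides rank far above NaCl for the query word. With embeddings trained on abstracts, a high rank for a material that has not yet been studied for the application is the kind of latent signal discussed above.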
3.3 Physical Insights from Materials Learning
Trade-offs between performance and transparency. In the physical sciences, ML models are frequently required to provide new scientific understanding and physical insights directly from observational or simulated data, in addition to making reliable and accurate predictions. Explainability, or the capacity to rationalize individual predictions by scrutinizing a transparent model’s inner workings and subsequently interpreting the results in tandem with expert knowledge, is a prerequisite for domain knowledge extraction using machine learning. As a result, explainability
results from a domain expert evaluating a group of interpretations of a transparent model. Transparency in this context primarily pertains to the specifics of the ML model that was used (i.e., details of the specific choices of model class, model complexity, learning algorithm, hyperparameters, initial constraints, etc.). Interpretability, on the other hand, integrates the input data with the ML model to make sense of the output. To get from interpretability to explainability, a person with a scientific grasp of the problem must be involved. Transparency, interpretability, and explainability have recently become widely accepted as the basic components that are necessary to obtain scientific outcomes from ML initiatives in the quest to learn from learning machines, or intelligible intelligence. Model complexity is directly related to the ideas discussed above. Compared with more complicated “black-box” models, simple and transparent ML models typically have lower accuracy and reliability. Despite this, they are quite amenable to interpretation and explanation. For explainable ML models, the trade-off between efficiency (reliability and accuracy) and transparency needs to be carefully considered, much like the well-known bias-variance trade-off that is invoked to prevent over fitting while building a solid predictive ML model. Strategies for hybrid and local training to provide greater transparency. Fig. 5 graphically depicts this predicament, with deep neural networks, which offer exceptional performance but limited transparency, at one extreme. Decision tree algorithms and easy-to-interpret rule-based algorithms occupy the opposite end of the spectrum, with high transparency but lower performance.
However, more sophisticated hybrid strategies that go beyond traditional single-model frameworks have been proposed to simultaneously increase model transparency and efficiency [54]. For instance, Kailkhura et al. recently devised a method that converts a regression problem into a multi-class classification task and uses subsampled training data to balance the distributions of the least represented material classes. Then, to better capture the various subdomain-specific regimes, smaller and simpler models are trained for the individual classes. This domain-specific learning made possible a rationale generator component of the framework that can provide both model-level and decision-level explanations. Compared with the typical method of training just one regression model on the full dataset, this improved the model’s overall transparency and explainability. Finally, a transfer learning strategy that exploits correlations between different properties was applied to compensate for the performance degradation brought on by the enhanced transparency. Validations based on causality and consistency. A model that can be explained also makes it possible to develop testable hypotheses or more rigorous validation tests for certain predictions in order to address issues of consistency, generalizability, and causality. Ouyang et al. offered a persuasive demonstration in this regard with the sure independence screening and sparsifying operator (SISSO) method, which is based on compressed sensing [1, 55, 56]. This approach has been widely used to handle a variety of materials design and discovery challenges. It enables efficient exploration of enormous descriptor
Fig. 6 Structure maps comparing the tolerance and octahedral factors for perovskite formability. a A traditional structure map drawn as a scatter plot; a convex hull enclosing the known examples gives the perovskite formability region. b An informatics-enhanced structure map that uses the same pair of variables and explicitly takes the likelihood of formation into account
spaces, with the number of unique descriptors frequently exceeding several billion (Fig. 6). These contributions imply that symbolic regression-based approaches and compressed sensing, together with properly specified domain knowledge-based constraints, can be extremely beneficial for gaining a physical understanding of materials data. Design maps with informatics enhancements. The efficient interpolation ability of ML algorithms in high-dimensional spaces can be exploited to create informatics-enhanced design maps that are significantly more information-rich than traditional ones, which have primarily been two-dimensional maps. For illustration, Fig. 6 contrasts a conventional structure map based on the tolerance factor and the octahedral factor, which is frequently used to identify formable perovskite oxides [57], with its informatics-enhanced counterpart. As can be seen in Fig. 6a, the known compounds that have been successfully synthesized in a perovskite crystal structure tend to cluster, so the combination of geometrical descriptors has remarkable predictive power. However, the descriptor pair in such an approach focuses only on purely geometric properties (i.e., coordination-environment-dependent Shannon ionic radii) and completely ignores local bonding interactions, such as ionicity versus covalency and the relative electronegativity differences between cations, that may also be very important in determining formability in perovskites. While one could argue that some of these factors are implicitly taken into account in the relative atomic and ionic size trends, the ability to formally include additional relevant factors could significantly increase the predictive power of such traditional maps. For instance, the equivalent map shown in Fig. 6b was created with a random forest ML model that was trained and validated using a much larger collection of descriptors, including octahedral and tolerance factors, electronegativities, ionization potentials,
electron affinities, and orbital-dependent pseudopotential radii of the cations. After training and validation, the model can be used to produce probabilistic estimates of perovskite formability over the entire multi-dimensional input space. These predictions can then be reprojected onto a two-dimensional plot of the two conventional geometric variables by integrating out (marginalizing) all other feature dimensions, as seen in Fig. 6b.
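The reprojection step can be sketched as an average of the model's predicted probability over the remaining descriptors at each point of the two-dimensional grid. In the example below, toy_formability is a hypothetical stand-in for a trained classifier, and a single extra descriptor (an electronegativity difference, chi_diff) represents all the dimensions that are integrated out; none of the coefficients come from the study cited above.

```python
import math


def toy_formability(t, mu, chi_diff):
    """Hypothetical stand-in for a trained classifier's predicted probability."""
    score = 10.0 * (1.0 - abs(t - 0.95)) + 5.0 * (mu - 0.4) + chi_diff - 9.0
    return 1.0 / (1.0 + math.exp(-score))


def marginalized_map(predict_proba, t_grid, mu_grid, background):
    """Average the prediction over background values of the remaining descriptors."""
    return [
        [sum(predict_proba(t, mu, extra) for extra in background) / len(background)
         for mu in mu_grid]
        for t in t_grid
    ]


t_grid = [0.70, 0.95]          # tolerance factor values (rows)
mu_grid = [0.3, 0.5]           # octahedral factor values (columns)
background = [0.5, 1.0, 1.5]   # sampled values of the extra descriptor
surface = marginalized_map(toy_formability, t_grid, mu_grid, background)
```

Each entry of surface is a formability probability for one (tolerance factor, octahedral factor) pair with the extra descriptor averaged out. In practice the background samples would be drawn from the descriptor values observed in the training data, so that the average reflects realistic chemistries.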
4 Model Generalizability and Performance in the Real World
As previously mentioned, gathering scientific data is not only costly but also time-consuming, and it frequently requires high-end equipment or advanced computing methods. Researchers should expect the data collection process to change over time. This could result from modifying the experimental procedures, adopting more precise computational methods, or changes to the equipment owing to wear and tear, environmental conditions, etc. It would not be unexpected if these modifications changed the underlying distribution of the collected data. While some of these modifications are predictable and simple to account for during model design and training, others may be unanticipated and result in significant errors in the model predictions. Understanding the limits of the trained models with regard to adaptation and generalization to new data is therefore critical, particularly when they are used in real-world settings. This section explains how, despite thorough testing, real-world circumstances can produce less-than-desirable outcomes and why it is crucial to evaluate the model’s performance (where feasible) under these circumstances. ML Model Failures in Real-World Scenarios. The majority of sectors have embraced machine learning; however, there have been failures that were widely reported in the media. These range from the funny (an AI-based soccer ball tracking system locking onto a linesman’s bald head) to the possibly dangerous (sexism and racism in chatbots and employment tools) to the disastrous (failures of self-driving cars). In material science and the broader physical sciences, failures are relatively hard to find because the research communities rarely record them. A few recent studies, however, discuss such shortcomings in machine learning for medicine.
As demonstrated by Oakden-Rayner et al., model accuracies can be misleading when there are subgroups within each class that have different distributions. They used CIFAR-100 along with real-world medical imaging datasets, and they referred to the presence of these subclasses as hidden stratification. A system created by Google researchers has a diabetic retinopathy detection accuracy of over 90%, which is on par with “human professionals.” When the technology was deployed in clinics in Thailand, however, it encountered practical issues such as inadequate lighting for acquiring retinal images and slow network speeds for transferring images to the cloud for analysis. The team that created the technology has since consulted with end
users, including doctors and nurses, to create a more suitable human-centered workflow. The most recent failures of ML systems come from tools used to detect and assess the severity of COVID-19 infections using chest X-rays and computed tomography (CT) scans. In two review papers, Wynants et al. and Roberts et al., respectively, studied the published algorithms and tools. According to the findings of both research teams, no tool was suitable for clinical use. They found a pattern in most of the tools that led them to conclude that these were unfit for application in the real world. This pattern included false assumptions, a dearth of being subject to rapid, the use of several publicly available datasets, etc. [58].
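The effect of hidden stratification on a headline accuracy figure can be shown with a small sketch. The labels, predictions, and subgroup names below are invented for illustration and do not reproduce any result from the studies cited above.

```python
from collections import defaultdict


def accuracy_by_subgroup(y_true, y_pred, subgroups):
    """Return overall accuracy and a dict of accuracy per subgroup."""
    hits, counts = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, subgroups):
        counts[group] += 1
        hits[group] += int(truth == pred)
    overall = sum(hits.values()) / sum(counts.values())
    return overall, {group: hits[group] / counts[group] for group in counts}


# Ten positive samples: eight from a common subtype, two from a rare subtype.
y_true = [1] * 10
y_pred = [1] * 8 + [0] * 2            # the model misses every rare-subtype case
subgroups = ["common"] * 8 + ["rare"] * 2
overall, per_group = accuracy_by_subgroup(y_true, y_pred, subgroups)
```

The overall accuracy of 0.8 hides the fact that the model is always wrong on the rare subtype. Reporting metrics per known or suspected subgroup is one simple way to expose this kind of failure before deployment.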
4.1 Case Study: Prediction of TATB Peak Stress
We faced comparable difficulties in our case study because a change in the experimental setting caused a change in the feature distributions. The case study results we have given thus far on the 30 TATB lots were based on SEM images captured with meticulously calibrated brightness and contrast settings on the microscope. Subsequently, however, the filament was changed as part of standard maintenance on the microscope. As a result of the new filament, the pixel histograms of images captured with the pre-existing microscope settings changed considerably. We therefore evaluated the performance of the previously trained WideResNet models (Table 1) using images that were captured with various brightness and contrast settings on the microscope. These models had never seen data gathered with the new filament. Data were collected for six of the initial 30 TATB lots. The test results are displayed in Table 2. For each feature representation, the average performance over the six lots is given for data obtained using the old filament (old) and data gathered using the new filament (new). According to the metrics, every image representation exhibits a decline in performance. The 2-channel representation, which uses both the SEM and BSIF-coded images and has the lowest error on the old-filament data from the six lots, also has the smallest performance decline on the data obtained with the new filament. The Supporting Information table provides the performance measurements from individual training runs of the WideResNet models using the three image feature representations. To further investigate how adjustments to the brightness and contrast settings affect the performance of the models, we provide the median predicted values for an example lot (AX) at various microscope settings.
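For reference, the two error measures used in this case study can be computed as follows. This sketch assumes the common definition of MAPE as the mean absolute percentage error; the measured and predicted peak-stress values are hypothetical, while the old- and new-filament MAPE values are the ones reported in Table 2.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical peak-stress values in psi, for illustration only.
measured = [1000.0, 2000.0]
predicted = [1100.0, 1900.0]
example_mape = mape(measured, predicted)   # about 7.5 (%)
example_rmse = rmse(measured, predicted)   # 100.0 (psi)

# MAPE (%) on old- and new-filament data, copied from Table 2.
table2_mape = {"SEM": (5.39, 6.77), "BSIFimg": (5.26, 8.65), "SEM + BSIFimg": (4.87, 6.21)}
decline = {name: new - old for name, (old, new) in table2_mape.items()}
most_robust = min(decline, key=decline.get)
```

The decline values come out at about 1.38, 3.39, and 1.34 percentage points for the SEM, BSIF-coded, and 2-channel representations, which matches the statement that the 2-channel representation degrades least under the new filament.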
The peak-stress value estimated from the images acquired using the old filament is shown as the reference value in the plot. The other values were calculated from images collected with six different acquisition settings. An example image reflecting each of these settings is displayed at the top of the plot. The changes in the predicted peak-stress values track the adjustments to the brightness and contrast settings. The settings that produce the darkest images (scan 3) have a considerably higher predicted peak-stress value. The settings that produce the brightest and lowest-contrast images, however
90 Table 1 Measures of feature and proposed model
P. Bhatt et al. Feature
Model
MAPE (%)
RMSE (psi)
None
Training average
28.64
444.23
Human annotation
Lasso
23.73
411.87
BSIFvec
Random forests
13.05
246.40
SEM
WideResNet
11.98
178.69
BSIFimg
WideResNet
10.66
161.67
SEM + BSIFimg
WideResNet
11.44
169.72
RMSE, or root mean squared error, stands for mean average percent error
Table 2 Data collected with both old and new filaments: performance of WideResNet models by using three image representations
Feature
Test data
MAPE (%)
SEM
Old
5.39
82.51
New
6.77
106.04
Old
5.26
76.57
New
8.65
130.79
Old
4.87
74.67
New
6.21
97.48
BSIFimg SEM + BSIFimg
RMSE (psi)
“MAPE, Mean average percent error; RMSE, root mean squared error”
(scan 6), have a rather low anticipated peak-stress value. The plot also demonstrates that each representation’s sensitivity to variations in visual brightness and contrast varies. The SEM image representation is most sensitive with the highest change in the projected value when the image is bright and has little contrast, although having a reasonably small overall error (new AX lot RMSE = 56.57 psi) (scan 6). The revised AX lot RMSE for the BSIF-coded picture representation is 163.57 psi, which is a substantially higher overall error as indicated by the larger departures from the target for each setting and by Table 2. The forecasts, however, indicate less pronounced alterations in response to brightness and contrast variations. This decreased sensitivity is expected given that, as can be seen, the related SEM image’s brightness and contrast information are not included in the BSIF-coded image. A nice balance between the predictions from the SEM and BSIF-coded pictures is provided by the 2-channel image representation, as shown by the projected values. When compared to SEM pictures, it has the least overall inaccuracy (new AX lot RMSE = 53.62 psi) and is less sensitive to changes in brightness and contrast. These outcomes underline the significance of the experiments performed in Section even more. Despite having the lowest mean prediction error of any of the 30 TATB lots (Table 1), the BSIF-coded picture representation was shown to have higher prediction errors when data was gathered using the new filament (Table 2). On the data gathered using the
Potential of Machine Learning Algorithms in Material Science: …
91
new filament, we solely evaluated the performance of models that had already been trained. By including predicted data changes in the pipeline used for model training, other strategies can be employed to enhance the performance of models. For instance, applying histogram correction as a pre-processing step can result in data obtained with the new filament having fewer errors. The same case study was subsequently used by Zhong et al. to conduct a comprehensive analysis of the effects of brightness and contrast on machine learning model performance and approaches to address them. On the other hand, domain knowledge-driven strategies for data augmentation can also be used to increase the generalizability of the models. In our situation, this may be achieved by training the models using more Table 2 data. Data collected with Old and New Filament feature test data: Performance of WideResNet Models Using the Three Image Representations MAPE (%) RMSE (psi) (psi) SEM new 82.51 5.39 old BSIFimg 6.77 106.04 old 5.26 76.57 new 4.87 74.67 new 8.65 130.79 SEM + BSIFimg old 6.21 97.48 a Root mean squared error; MAPE, mean average percent error photos with artificially enhanced contrast and brightness. The likelihood of the input data being out of distribution can be determined using more complex procedures that employ uncertainty quantification. Such methods offer a measure of the projected values’ degree of confidence, which can be used to accept the model’s high-confidence forecasts and use expert intervention for its low-confidence ones [59].
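The accept-or-escalate idea in the last sentence can be sketched as follows. This is a minimal illustration, not the method of the case study: it assumes that uncertainty is estimated from the spread of predictions made by several independently trained models, and the function name `triage_predictions`, the 5% threshold, and all numbers in the usage example are hypothetical.

```python
from statistics import mean, stdev


def triage_predictions(ensemble_predictions, max_relative_std=0.05):
    """Split predictions into accepted and flagged-for-expert-review groups.

    ensemble_predictions maps a sample name to a list of peak-stress
    predictions (psi), one per independently trained model. The relative
    standard deviation across models serves as a simple, assumed proxy
    for predictive uncertainty.
    """
    accepted, flagged = {}, {}
    for name, predictions in ensemble_predictions.items():
        centre = mean(predictions)
        spread = stdev(predictions)  # sample standard deviation, needs >= 2 values
        if spread / abs(centre) <= max_relative_std:
            accepted[name] = centre
        else:
            flagged[name] = (centre, spread)
    return accepted, flagged


# Hypothetical numbers, for illustration only.
example = {
    "lot_A": [1500.0, 1520.0, 1490.0],  # models agree: accept the mean
    "lot_B": [1500.0, 1900.0, 1200.0],  # models disagree: ask an expert
}
accepted, flagged = triage_predictions(example)
```

Ensemble spread is only one of several possible uncertainty estimates; the threshold would have to be calibrated against held-out data before such a rule is trusted.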
4.2 Model Generalizability Takeaways
1. Machine learning models do a great job of fitting the data they are given.
2. Substantial distributional shifts caused by changes in real-world conditions can degrade ML model performance.
3. Data pre-processing methods such as normalization, together with data augmentation, can yield more reliable models.
4. Current ML systems still rely significantly on human awareness of the deployment environment and alertness to changing conditions.
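Takeaway 3 can be illustrated with a short sketch. It assumes a simplified model in which a change of microscope brightness and contrast acts on every pixel as `gain * p + offset` without clipping; under that assumption, per-image standardization removes the change exactly, and the same transform can generate augmented training copies. The function names and parameter ranges are hypothetical and are not taken from the case study.

```python
import random
from statistics import mean, pstdev


def adjust_brightness_contrast(image, gain, offset):
    """Apply p -> gain * p + offset to every pixel (no clipping)."""
    return [[gain * p + offset for p in row] for row in image]


def standardize(image):
    """Rescale an image to zero mean and unit standard deviation."""
    pixels = [p for row in image for p in row]
    mu, sigma = mean(pixels), pstdev(pixels)
    return [[(p - mu) / sigma for p in row] for row in image]


def augment(image, n_copies, rng, gain_range=(0.7, 1.3), offset_range=(-30.0, 30.0)):
    """Create copies of an image with random brightness/contrast changes."""
    return [
        adjust_brightness_contrast(
            image, rng.uniform(*gain_range), rng.uniform(*offset_range)
        )
        for _ in range(n_copies)
    ]


# Synthetic 8 x 8 "image" with hypothetical grey levels.
rng = random.Random(0)
image = [[rng.uniform(50.0, 200.0) for _ in range(8)] for _ in range(8)]
copies = augment(image, n_copies=4, rng=rng)
```

Real detectors clip intensities at the ends of their range, so the invariance is only approximate in practice, which is one reason augmentation and normalization are usually combined rather than treated as alternatives.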
5 Conclusions
Machine learning and data-enabled methodologies are introducing a new paradigm in materials science. As a result, the field’s traditional approach to materials design and discovery is about to change significantly. Materials informatics, which began as a specialized field, has quickly become a mature discipline. Machine learning (ML) algorithms already facilitate the design and discovery of new materials, the organization and prioritization of upcoming experiments, and various other aspects of experimental design.
References
1. Schmidt, J., Marques, M. R., Botti, S., & Marques, M. A. (2019). Recent advances and applications of machine learning in solid-state materials science. npj Computational Materials, 5(1), 1–36.
2. Pilania, G. (2021). Machine learning in materials science: From explainable predictions to autonomous design. Computational Materials Science, 193, 110360.
3. Baldi, P., & Brunak, S. (2001). Bioinformatics: The machine learning approach. The MIT Press.
4. Noordik, J. H. (2004). Cheminformatics developments: History, reviews and current research. IOS Press.
5. Alpaydin, E. (2014). Introduction to machine learning. The MIT Press.
6. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning. The MIT Press.
7. Nguyen, H., Maeda, S.-I., & Oono, K. (2017). Semi-supervised learning of hierarchical representations of molecules using neural message passing. arXiv:1711.10168
8. Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4, 1–58.
9. Sammut, C., & Webb, G. I. (2017). Encyclopedia of machine learning and data mining. Springer Publishing Company.
10. Picard, R. R., & Cook, R. D. (1984). Cross-validation of regression models. Journal of the American Statistical Association, 79, 575–583.
11. Meredig, B., et al. (2018). Can machine learning identify the next high-temperature superconductor? Examining extrapolation performance for materials discovery. Molecular Systems Design and Engineering, 3, 819–825.
12. Tropsha, A., Gramatica, P., & Gombar, V. K. (2003). The importance of being earnest: Validation is the absolute essential for successful application and interpretation of QSPR models. QSAR & Combinatorial Science, 22, 69–77.
13. Golbraikh, A., & Tropsha, A. (2002). Beware of q2. Journal of Molecular Graphics and Modelling, 20, 269–276.
14. Stanev, V., et al. (2018). Machine learning modeling of superconducting critical temperature. npj Computational Materials, 4, 29.
15. Liu, Y., Zhao, T., Ju, W., & Shi, S. (2017). Materials discovery and design using machine learning. Journal of Materiomics, 3(3), 159–177.
16. Bishop, C. M., & Nasrabadi, N. M. (2006). Pattern recognition and machine learning (Vol. 4, No. 4, p. 738). Springer.
17. Pei, J. F., Cai, C. Z., Zhu, Y. M., & Yan, B. (2013). Modeling and predicting the glass transition temperature of polymethacrylates based on quantum chemical descriptors by using hybrid PSO-SVR. Macromolecular Theory and Simulations, 22(1), 52–60.
18. Fang, S. F., Wang, M. P., Qi, W. H., & Zheng, F. (2008). Hybrid genetic algorithms and support vector regression in forecasting atmospheric corrosion of metallic materials. Computational Materials Science, 44(2), 647–655.
19. Paszkowicz, W., Harris, K. D. M., & Johnston, R. L. (2009). Genetic algorithms: A universal tool for solving computational tasks in materials science preface. Computational Materials Science, 45(1), IX–X.
20. Zhang, X. J., Chen, K. Z., & Feng, X. A. (2008). Material selection using an improved genetic algorithm for material design of components made of a multiphase material. Materials & Design, 29(5), 972–981.
21. Mohn, C. E., & Kob, W. (2009). A genetic algorithm for the atomistic design and global optimisation of substitutionally disordered materials. Computational Materials Science, 45(1), 111–117.
22. Meredig, B., Agrawal, A., Kirklin, S., Saal, J. E., Doak, J. W., Thompson, A., Zhang, K., Choudhary, A., & Wolverton, C. (2014). Combinatorial screening for new materials in unconstrained composition space with machine learning. Physical Review B, 89(9), 094104.
23. Hautier, G., Fischer, C. C., Jain, A., Mueller, T., & Ceder, G. (2010). Finding nature’s missing ternary oxide compounds using machine learning and density functional theory. Chemistry of Materials, 22(12), 3762–3767.
24. Hautier, G., Fischer, C., Ehrlacher, V., Jain, A., & Ceder, G. (2011). Data mined ionic substitutions for the discovery of new compounds. Inorganic Chemistry, 50(2), 656–663.
25. Phillips, C. L., & Voth, G. A. (2013). Discovering crystals using shape matching and machine learning. Soft Matter, 9(35), 8552–8568.
26. Carrera, G. V., Branco, L. C., Aires-de-Sousa, J., & Afonso, C. A. (2008). Exploration of quantitative structure-property relationships (QSPR) for the design of new guanidinium ionic liquids. Tetrahedron, 64(9), 2216–2224.
27. Farrusseng, D., Clerc, F., Mirodatos, C., & Rakotomalala, R. (2009). Virtual screening of materials using neuro-genetic approach: Concepts and implementation. Computational Materials Science, 45(1), 52–59.
28. Raccuglia, P., Elbert, K. C., Adler, P. D., Falk, C., Wenny, M. B., Mollo, A., Zeller, M., Friedler, S. A., Schrier, J., & Norquist, A. J. (2016). Machine-learning-assisted materials discovery using failed experiments. Nature, 533(7601), 73–76.
29. Beran, G. J. (2015). A new era for ab initio molecular crystal lattice energy prediction. Angewandte Chemie International Edition, 54(2), 396–398.
30. Maddox, J. (1988). Crystals from first principles. Nature, 335(6187), 201–201.
31. Curtarolo, S., Morgan, D., Persson, K., Rodgers, J., & Ceder, G. (2003). Predicting crystal structures with data mining of quantum calculations. Physical Review Letters, 91(13), 135503.
32. Ceder, G., Morgan, D., Fischer, C., Tibbetts, K., & Curtarolo, S. (2006). Data-mining-driven quantum mechanics for the prediction of structure. MRS Bulletin, 31(12), 981–985.
33. Fischer, C. C., Tibbetts, K. J., Morgan, D., & Ceder, G. (2006). Predicting crystal structure by merging data mining with quantum mechanics. Nature Materials, 5(8), 641–646.
34. Liu, R., Kumar, A., Chen, Z., Agrawal, A., Sundararaghavan, V., & Choudhary, A. (2015). A predictive machine learning approach for microstructure optimization and materials design. Scientific Reports, 5(1), 1–12.
35. Gómez-Bombarelli, R., Aguilera-Iparraguirre, J., Hirzel, T. D., Duvenaud, D., Maclaurin, D., Blood-Forsythe, M. A., Chae, H. S., Einzinger, M., Ha, D. G., Wu, T., & Markopoulos, G. (2016). Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nature Materials, 15(10), 1120–1127.
36. Rydning, D. R. J. G. J., Reinsel, J., & Gantz, J. (2018). The digitization of the world from edge to core (p. 16). International Data Corporation.
37. Larrañaga, P., Atienza, D., Diaz-Rozo, J., Ogbechie, A., Puerto-Santana, C., & Bielza, C. (2018). Industrial applications of machine learning. CRC Press.
38. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
39. Manning, C., & Schutze, H. (1999). Foundations of statistical natural language processing. MIT Press.
40. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
41. Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., & Chen, Y. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354–359.
42. Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., & Dieleman, S. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
43. Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., & Lillicrap, T. (2017). Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv:1712.01815
44. Moravčík, M., Schmid, M., Burch, N., Lisý, V., Morrill, D., Bard, N., Davis, T., Waugh, K., Johanson, M., & Bowling, M. (2017). Deepstack: Expert-level artificial intelligence in heads-up no-limit poker. Science, 356(6337), 508–513.
45. Brown, N., & Sandholm, T. (2018). Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science, 359(6374), 418–424.
46. Ferrucci, D., Brown, E., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A. A., Lally, A., Murdock, J. W., Nyberg, E., Prager, J., & Schlaefer, N. (2010). Building Watson: An overview of the DeepQA project. AI Magazine, 31(3), 59–79.
47. Adadi, A., & Berrada, M. (2018). Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access, 6, 52138–52160.
48. Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., & Chatila, R. (2020). Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115.
49. Morgan, D., & Jacobs, R. (2020). Opportunities and challenges for machine learning in materials science. arXiv:2006.14604
50. Ramprasad, R., Batra, R., Pilania, G., Mannodi-Kanakkithodi, A., & Kim, C. (2017). Machine learning in materials informatics: Recent applications and prospects. npj Computational Materials, 3(1), 1–13.
51. Mueller, T., Kusne, A. G., & Ramprasad, R. (2016). Machine learning in materials science: Recent progress and emerging applications. Reviews in Computational Chemistry, 29, 186–273.
52. Sanchez-Lengeling, B., & Aspuru-Guzik, A. (2018). Inverse molecular design using machine learning: Generative models for matter engineering. Science, 361(6400), 360–365.
53. Häse, F., Roch, L. M., & Aspuru-Guzik, A. (2019). Next-generation experimentation with self-driving laboratories. Trends in Chemistry, 1(3), 282–291.
54. Ziatdinov, M., Dyck, O., Maksov, A., Li, X., Sang, X., Xiao, K., Unocic, R. R., Vasudevan, R., Jesse, S., & Kalinin, S. V. (2017). Deep learning of atomically resolved scanning transmission electron microscopy images: Chemical identification and tracking local transformations. ACS Nano, 11(12), 12742–12752.
55. Shetty, P., & Ramprasad, R. (2021). Automated knowledge extraction from polymer literature using natural language processing. iScience, 24(1), 101922.
56. Batra, R., Song, L., & Ramprasad, R. (2021). Emerging materials intelligence ecosystems propelled by machine learning. Nature Reviews Materials, 6(8), 655–678.
57. Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O., & Walsh, A. (2018). Machine learning for molecular and materials science. Nature, 559(7715), 547–555.
58. Ward, L., & Wolverton, C. (2017). Atomistic calculations and materials informatics: A review. Current Opinion in Solid State and Materials Science, 21(3), 167–176.
59. Jain, A., Hautier, G., Ong, S. P., & Persson, K. (2016). New opportunities for materials informatics: Resources and data mining techniques for uncovering hidden relationships. Journal of Materials Research, 31(8), 977–994.
The Application of Novel Functional Materials to Machine Learning
Humaira Rashid Khan, Fahd Sikandar Khan, and Javeed Akhtar
Abstract Traditional methods of developing energy materials face numerous challenges, such as low success probabilities, long development times, and high computational cost. Screening advanced materials, coupled with modeling of their quantitative structure-activity relationships, has therefore recently become one of the most active topics in next-generation functional materials. New research concepts and technologies are required to promote the study and development of functional materials. With recent advances in artificial intelligence and machine learning, there is growing hope that data-driven materials research will change how scientific discoveries are made and establish new paradigms for energy materials development. Machine learning (ML) is a powerful tool for extracting insights from multidimensional data quickly and efficiently. It provides a much-needed pathway for speeding up the research and investigation of novel materials in order to address time-sensitive global concerns such as climate change. In recent years, large datasets have made it possible to build machine learning algorithms for a variety of applications, such as experimental/device optimization and material discovery. Furthermore, recent breakthroughs in data-driven materials engineering show that machine learning can help not only with the design and development of advanced energy materials but also with their discovery and deployment. This chapter first discusses the need to develop new energy materials in order to contribute to global carbon neutrality. It then reviews the most recent advances in data-driven materials research and engineering, including alkaline ion battery materials, photovoltaic materials, catalytic materials, and carbon dioxide capture materials. Finally, the remaining obstacles in the creation of new energy materials are discussed, along with significant pointers toward effective machine learning applications.

H. R. Khan, Department of Chemistry, Rawalpindi Women University, Satellite Town 6th Road, Rawalpindi, Pakistan
F. S. Khan, HBL Center for Blockchain and Applied Research, Faculty of Engineering Sciences, GIK Institute, Topi, Pakistan
J. Akhtar (corresponding author), Functional Nanomaterials Lab (FNL), Department of Chemistry, Mirpur University of Science and Technology (MUST), Mirpur 10250, AJK, Pakistan, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_5
1 Introduction
Escalating global environmental problems have produced a worldwide consensus on the need to develop clean and renewable energy technologies in order to attain a carbon-neutral society over the next several decades [1]. Creating improved energy materials that enable efficient energy conversion and reliable power output is one of the most important routes to the widespread use of green energy [2, 3]. Laboratory experiments and simulations are the classic methods for developing and discovering energy materials. These methods are time-consuming and yield only a few samples of newly discovered materials [4]. Furthermore, their likelihood of success is limited [5]. In recent years, density functional theory (DFT) calculations have been widely used to screen novel materials [6], mostly because DFT offers improved computational precision and can support searches over large spaces [7]. DFT calculations nevertheless have significant drawbacks, such as high computing costs. Recent advances in artificial intelligence (AI) across a number of research domains have shown the enormous potential of using AI to find novel, energy-efficient materials [8]. AI is a technology that enables a machine to emulate human behavior, whereas machine learning (ML), a subset of AI, uses algorithms and models to learn from historical data or prior knowledge [9]. Because ML is intrinsically well suited to processing vast amounts of data and performing high-dimensional analyses, it can accelerate materials development [10, 11]. Barnett et al. [12], for example, created an ML model based on Gaussian process regression to identify new polymer membrane materials. Using data on the gas permeability of about 700 polymers, the model predicted the gas separation behavior of more than 11,000 untested homopolymers.
Machine learning (ML) techniques are a collection of statistical approaches for multidimensional data analysis, such as interpolating available data and extrapolating to new examples, and they have developed quickly as a result of the boom in data availability [13]. Professionals in other domains can now benefit from ML’s capabilities on their own. One such interesting use of ML is the design and optimization of materials, especially those that facilitate society’s transition toward sustainable energy (e.g., batteries, fuel cells, and photovoltaics). Given the size of the chemical combinatorial space and how frequently human intuition fails to capture trends in material properties, it is increasingly clear that traditional trial-and-error methods are too inefficient for this time-sensitive design problem. For instance, although at least ~10^60 small organic molecules alone are thought to be possible, only ~10^6 crystalline materials and ~10^9 molecules have been examined to date [14]. With the introduction of cheminformatics tools, it is now simple to represent any molecule or material in a form that computers can process: a row of numbers that uniquely identifies it. ML uses these representations to identify the features that govern the behavior, properties, and performance of known materials, which makes it possible to explore new chemical spaces. Even imperfect predictions can help scientists decide where to focus their computational or experimental efforts to increase the likelihood of a discovery. The outcomes of these investigations can then be used to further train and improve the model.
In this perspective, we discuss the latest ML applications in the development of energy-related materials. We analyze the effectiveness of current methods as well as their drawbacks with an experimentalist reader in mind. Both “theoretical” techniques (based on calculated data) and the evaluation and optimization of experimental observations are discussed. We end with our viewpoint on how to resolve the present difficulties in implementing ML in materials design and on potential development prospects. This overview of state-of-the-art efforts in data-driven materials research can thereby support the development of the Materials Genome Initiative (MGI) and offer insights for future perspectives.
A few review papers on machine learning for the creation of advanced materials have already been published [9, 15]. Liu et al. [16] provided a thorough review illustrating how ML speeds up material discovery and design; the most recent developments, though, were not included. The studies of Gu et al. [17] focused on ML applications for renewable energy materials but did not give a detailed tutorial on ML technologies. Li et al. [18] demonstrated how AI methods are implemented at various stages of material development, although the example studies in that work are less focused on energy materials. Chen et al. [19] presented an overview of ML methods and their uses in the study of materials; still, the potential of data-driven materials science leaves room to grow. Correa Baena et al. [20] presented state-of-the-art efforts in automation and ML for materials discovery from the perspectives of theory, policy, and investment. Additionally, the establishment of design guidelines for the synthesis of ML-predicted materials and the production of energy materials has not received enough attention.
Because the choice of an appropriate ML methodology depends entirely on how the design space is represented, this chapter begins by exploring several methods of representing the design space (Fig. 1). The discussion then turns to low-dimensional representations of the high-dimensional design space, where ML methods help to reduce the dimensionality and predict material properties. The focus then shifts to the high-dimensional design space itself, where ML helps to optimize navigation of the space. Finally, we discuss domain-specific examples in functional materials to assess the current state of the field and how ML is accelerating the development of new applications.
Fig. 1 (a) Detailed, high-dimensional representations of various material systems. (b) Low-dimensional representations of systems derived from their high-dimensional counterparts in (a). (c) Representative material properties that can be measured in experiments or computed in simulations. SMILES, simplified molecular-input line-entry system [21, 22]
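As a concrete instance of turning a material into “a row of numbers”, the sketch below converts a chemical formula into a vector of atomic fractions. It is a deliberately simple, assumed representation and not one of the descriptors in Fig. 1: the element vocabulary `ELEMENTS` and the function `composition_vector` are hypothetical, and formulas with parentheses are not handled.

```python
import re

# Hypothetical element vocabulary; a real featurizer would cover the periodic table.
ELEMENTS = ["H", "C", "N", "O", "S", "Br", "I", "Cs", "Pb"]

# One element symbol followed by an optional (possibly fractional) count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def composition_vector(formula, elements=ELEMENTS):
    """Return atomic fractions of `formula`, ordered as in `elements`."""
    counts = {}
    for symbol, amount in _TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    if not counts:
        raise ValueError("no element symbols found in formula")
    unknown = sorted(set(counts) - set(elements))
    if unknown:
        raise ValueError(f"elements not in vocabulary: {unknown}")
    total = sum(counts.values())
    return [counts.get(element, 0.0) / total for element in elements]


pbs = composition_vector("PbS")
perovskite = composition_vector("CsPbI3")
```

A composition vector of this kind cannot distinguish polymorphs or isomers, so it does not identify a material uniquely; structure-aware representations are needed when that distinction matters.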
2 Design of Experiments and Parameter Space Optimization
Materials research, like other fields of science, relies heavily on parameter exploration and optimization. In small parameter spaces, an exhaustive sweep (grid search) is often affordable. For large parameter spaces, optimization often entails changing parameters one at a time and following the local gradient until an optimum is reached. This method quickly becomes costly as the number of parameters grows and does not guarantee that a global minimum or maximum is found. Its accuracy is further compromised by noise in experimental observations. Techniques such as Bayesian optimization are now more widely accessible owing to the development of reliable open-source libraries [23] and free online services [24]. These techniques allow experimentalists to explore the parameter space efficiently and to assess historical data “rationally” in order to reach the intended target, such as electrode voltage, electrolyte conductance, or solar cell efficiency. As in traditional ML regression, Bayesian optimization interpolates all available data points using a set of curves (typically Gaussians), producing a map of the parameter space. A major advantage is that the uncertainty of the predictions is also estimated at each step.
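The loop described above can be sketched in a few dozen lines of plain Python. This is a minimal illustration under stated assumptions, not the algorithm of any library cited in the text: it uses a zero-mean Gaussian process with a Gaussian (RBF) kernel as the interpolating model and an upper-confidence-bound rule to pick the next experiment, and the objective `toy_efficiency`, the length scale, and the value of `kappa` are hypothetical.

```python
import math


def rbf_kernel(a, b, length_scale=0.2):
    """Gaussian (squared-exponential) covariance between two 1D inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(max(matrix[i][i] - partial, 1e-12))
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def forward_substitution(lower, rhs):
    """Solve lower @ y = rhs."""
    y = []
    for i, row in enumerate(lower):
        y.append((rhs[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def backward_substitution(lower, rhs):
    """Solve lower.T @ x = rhs."""
    n = len(rhs)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (rhs[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def gp_predict(xs, ys, candidates, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at each candidate."""
    n = len(xs)
    gram = [[rbf_kernel(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    lower = cholesky(gram)
    weights = backward_substitution(lower, forward_substitution(lower, ys))
    predictions = []
    for c in candidates:
        k_star = [rbf_kernel(x, c) for x in xs]
        mu = sum(k * w for k, w in zip(k_star, weights))
        v = forward_substitution(lower, k_star)
        variance = max(1.0 - sum(vi * vi for vi in v), 0.0)  # prior variance is 1
        predictions.append((mu, math.sqrt(variance)))
    return predictions


def bayesian_optimize(objective, candidates, initial, n_iterations=10, kappa=2.0):
    """Maximize `objective` over `candidates` with an upper-confidence-bound rule."""
    xs = list(initial)
    ys = [objective(x) for x in xs]
    for _ in range(n_iterations):
        untested = [c for c in candidates if c not in xs]
        scores = [mu + kappa * sd for mu, sd in gp_predict(xs, ys, untested)]
        next_x = untested[scores.index(max(scores))]
        xs.append(next_x)
        ys.append(objective(next_x))
    best = ys.index(max(ys))
    return xs[best], ys[best], len(xs)


def toy_efficiency(x):
    """Hypothetical response curve with a single optimum at x = 0.62."""
    return math.exp(-((x - 0.62) / 0.15) ** 2)


grid = [i / 100 for i in range(101)]
best_x, best_y, n_experiments = bayesian_optimize(
    toy_efficiency, grid, initial=[0.0, 0.5, 1.0])
```

The term `kappa * sd` is what distinguishes this from plain regression: candidates are scored by predicted value plus uncertainty, so poorly explored regions are sampled before the search commits to a peak. In practice, a maintained library would replace the hand-written linear algebra.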
Bayesian optimization has, moreover, been demonstrated mathematically to be a more effective search technique [23]. Reaching an optimum in 5D–6D spaces frequently requires fewer than 100 experiments, in contrast to 100,000 data points for an exhaustive search. Packages such as Phoenics [25] make it easy to implement Bayesian optimization and incorporate it into existing workflows, increasing its accessibility to users.
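The scale of an exhaustive search follows from simple counting. The source does not state how the figure of 100,000 was obtained; one reading, assumed here, is a full grid with 10 levels per parameter in five dimensions. The helper `grid_size` is hypothetical.

```python
def grid_size(levels_per_parameter, n_parameters):
    """Number of experiments in a full grid over all parameter combinations."""
    return levels_per_parameter ** n_parameters


# Growth of the grid with dimensionality at 10 levels per parameter.
for n_parameters in (1, 2, 3, 5, 6):
    print(n_parameters, grid_size(10, n_parameters))
```

Each added parameter multiplies the cost by the number of levels, which is why full sweeps are practical only for two or three parameters.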
2.1 Device Fabrication
Device optimization is a field in which ML can be applied to tune layer thickness, composition, and even testing conditions in order to enhance performance. Acquiring a dataset remains the fundamental obstacle in many applications of applied ML. In organic light-emitting diodes (OLEDs), changes to the device layout, such as the material interfaces and layer thicknesses, can significantly alter the device’s photon yield. The ideal device structure can be reached more efficiently by ranking the structural parameters (band structures, layer thicknesses) according to their influence on device performance (current, power, and quantum efficiencies). Janai et al. [26] used random forests to construct a multivariate regression model that connects the device efficiency with the input parameters. This surrogate model was then used to explore new parameter combinations and follow the trends that maximize device performance, drastically reducing the search space for further experiments. Kirkley et al. employed ML to improve the bulk heterojunction in the active layer of solar cells [27]. The authors optimized the donor fraction, solution concentration, annealing time, and annealing temperature in an effort to increase the power conversion efficiency (PCE). With this approach, they optimized the active layer efficiently in two rounds of experiments. The first round produced a preliminary map of the effects of the different variables on the PCE. The second round refined the peaks by increasing the density of data points in those regions [28]. Combining ML with high-throughput experiments can produce a more precise and relevant dataset and makes it possible to explore the parameter subspace in greater detail.
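The two-round strategy can be sketched as a coarse map followed by denser sampling around the best region. This is a simplified stand-in and not the procedure of the cited study: plain grid evaluation replaces the ML model, only two of the four variables are used, and the response surface `toy_pce`, the bounds, and the window size are hypothetical.

```python
import itertools
import math


def linspace(low, high, n):
    """n evenly spaced values from low to high inclusive (n >= 2)."""
    return [low + (high - low) * i / (n - 1) for i in range(n)]


def evaluate_grid(measure, axes):
    """Run `measure` on every combination of the axis values."""
    return [(measure(*point), point) for point in itertools.product(*axes)]


def two_round_search(measure, bounds, coarse_levels=4, fine_levels=5, window=0.25):
    """Round 1 maps the whole space; round 2 resamples around the best point."""
    round_one = evaluate_grid(
        measure, [linspace(lo, hi, coarse_levels) for lo, hi in bounds])
    coarse_best, centre = max(round_one)
    fine_axes = []
    for (lo, hi), c in zip(bounds, centre):
        half_width = 0.5 * window * (hi - lo)
        fine_axes.append(
            linspace(max(lo, c - half_width), min(hi, c + half_width), fine_levels))
    round_two = evaluate_grid(measure, fine_axes)
    best_value, best_point = max(round_one + round_two)
    return best_value, best_point, coarse_best, len(round_one) + len(round_two)


def toy_pce(donor_fraction, anneal_temperature):
    """Hypothetical PCE (%) surface peaking at 0.45 donor fraction and 140 C."""
    return 8.0 * math.exp(-((donor_fraction - 0.45) / 0.2) ** 2
                          - ((anneal_temperature - 140.0) / 40.0) ** 2)


best_value, best_point, coarse_best, n_experiments = two_round_search(
    toy_pce, bounds=[(0.1, 0.9), (80.0, 200.0)])
```

On this toy surface, 41 evaluations reach a better optimum than the 16-point coarse map alone; a model-guided second round would choose its points from the fitted map instead of a fixed window.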
Du and colleagues developed a fully automated platform that can carry out every step of solar cell fabrication, including deposition and annealing, while varying up to 100 different parameters. The automated system could also photograph the layers, measure UV-vis spectra, and record current-voltage curves to characterize the material. The resulting ML model helped to clarify how different factors interact to determine device performance and stability. For example, changing the ordered-phase content of the donor material was shown to increase power conversion efficiency, whereas lower processing temperatures and faster spin-coating speeds were found to have the greatest effects on device stability [29].
Inkjet printing is a typical technique for rapidly depositing device layers in such high-throughput testing. Because tiny droplets are deposited quickly, small amounts of reagents suffice to cover a substrate rapidly. However, to obtain consistent results and retain a reasonable yield, the printer must deliver droplets of consistent size and arrangement. The volume, deposition rate, and dispersion of the droplets are tuned through the printer’s jetting pressure, valve actuation rate, and nozzle movement speed. Siemenn et al. [30] used image processing techniques to score droplet fitness and determine the ideal deposition conditions for testing. Bayesian optimization was shown to find the ideal printer settings for the experiment twice as quickly as stochastic gradient descent approaches. Reducing this convergence time allows quicker testing of new compositions and less material waste.
2.2 Synthesis of Materials
Materials synthesis is another area that can benefit from ML-guided optimization, since even the simplest synthetic route depends on a wide range of factors. Voznyy et al. [31] improved the synthesis of PbS quantum dots using Bayesian optimization, targeting monodispersity (expressed by the full width at half maximum of the first exciton peak) for each nanoparticle size. To construct a continuous map of the parameter space, the ML model was trained on 2000 experimental data points (the majority of which were duplicates) that were digitized from old lab notebooks. This reduced the experimental noise and made it possible to statistically distinguish the intrinsically similar effects of initial concentration and temperature on nanoparticle size. Oleylamine was also found to improve the monodispersity further by amplifying the effects of concentration. Although other research groups had methodically improved this same synthesis over the previous 20 years, the procedure produced appreciable improvements in monodispersity for a variety of quantum dot sizes.
An ML-led retrospective analysis of experimental work thus offers a route to overlooked insights, and several research centers are now seeking to combine the productivity of robotics and machine learning to automate future material discovery and synthesis [32, 33]. For instance, Chan et al. used robotics to perform 8172 metal halide perovskite synthesis reactions with 45 different organic ammonium cations via inverse temperature crystallization (ITC) [34]. The data from this robot-accelerated perovskite investigation and discovery (RAPID) effort were used to train an ML model that predicts the probability of single-crystal formation for candidate perovskite syntheses, and the campaign produced a five-fold increase in the number of metal halide perovskites that can be synthesized via ITC.
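The monodispersity metric used in the quantum-dot study above, the full width at half maximum (FWHM) of a peak, can be computed from a sampled spectrum as sketched below. The sketch assumes a single peak on a zero baseline and locates the half-maximum crossings by linear interpolation; the function names and the synthetic Gaussian spectrum are hypothetical and are not data from the cited work.

```python
import math


def _crossing(x0, y0, x1, y1, level):
    """x position where the straight line through two samples reaches `level`."""
    return x0 + (level - y0) * (x1 - x0) / (y1 - y0)


def full_width_half_max(xs, ys):
    """FWHM of the tallest peak in a sampled curve with a zero baseline."""
    peak = max(range(len(ys)), key=ys.__getitem__)
    half = ys[peak] / 2.0
    left = peak
    while left > 0 and ys[left] > half:
        left -= 1
    right = peak
    while right < len(ys) - 1 and ys[right] > half:
        right += 1
    if ys[left] > half or ys[right] > half:
        raise ValueError("peak does not fall to half maximum within the data")
    x_left = _crossing(xs[left], ys[left], xs[left + 1], ys[left + 1], half)
    x_right = _crossing(xs[right - 1], ys[right - 1], xs[right], ys[right], half)
    return x_right - x_left


# Synthetic exciton-like peak: Gaussian centred at 1000 nm with sigma = 20 nm.
sigma = 20.0
wavelengths = [800.0 + i for i in range(401)]
absorbance = [math.exp(-0.5 * ((w - 1000.0) / sigma) ** 2) for w in wavelengths]
fwhm = full_width_half_max(wavelengths, absorbance)
expected = 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma  # analytic Gaussian FWHM
```

For a Gaussian peak the result should agree with the analytic value of about 2.355 times sigma; measured spectra would first need baseline subtraction and possibly smoothing.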
Other material classes and applications, such as polyoxometalates, nanomaterials, thin films, piezoelectrics, and photocatalysts, have been the subject of related studies [35]. As noted earlier, this combination of robotics and machine learning avoids human bias and labor constraints, resulting in fast and accurate material discovery. The main current drawback of this strategy is the need for specialized equipment and for staff trained to program such instruments, which makes accessibility a barrier to implementation. Nevertheless, accessibility is expected to improve soon as the technology advances and becomes more widely available.
The Application of Novel Functional Materials to Machine Learning
3 Identifying Next-Generation Materials
3.1 Plan for Achieving Carbon Neutrality
A global consensus has emerged on the need to reduce CO2 emissions in order to achieve carbon neutrality. As illustrated in Fig. 1, which shows worldwide fossil CO2 emissions from 1970 to 2019, the power industry and transportation sectors together account for more than half of overall CO2 emissions [36]. Both sectors must therefore implement effective CO2 reduction strategies. To this end, governments and institutions have proposed numerous policies, such as the European Battery 2030+ [37], China’s 13th Five-Year Plan for Renewable Energy [38], and the Paris Climate Agreement [39], in an effort to hasten the development of renewable energy and achieve zero-emission transportation. Figure 2 displays the pathway to carbon neutrality, which comprises power production, energy storage and conversion, and energy utilization, and shows the energy application scenarios of a fossil-free civilization in the long term. Renewable energy, which is derived from natural resources including water, sunlight, wind, and biomass, is the best option for power generation [40]. The most practical way to achieve carbon neutrality is to produce and use renewable energy as a large-scale substitute for fossil fuels [41]. Solar energy, wind energy, hydropower, and nuclear energy are currently available clean energy sources with the potential for large-scale use. To achieve carbon neutrality by 2060, for instance, China must implement negative emissions technology and use renewable energy on a massive scale [42]. The development of renewable energy technology will therefore strongly influence the achievement of a society without carbon emissions, and renewable energy will serve as the cornerstone of future energy growth. Renewable energy sources are, nevertheless, subject to environmental influences.
For instance, under cloudy or nighttime conditions, less solar energy is available because sunlight is reduced or absent, resulting in intermittent energy output. When these sources are connected directly to the grid for clean electricity, it is challenging to ensure that they are used to their full potential. Exploring novel energy storage and conversion technologies is, therefore, important [43]. Common energy storage methods include flywheels, compressed air, and pumped hydropower, among others. Supercapacitors, rechargeable batteries, flow batteries, fuel cells, and other advanced energy conversion and storage technologies have all recently appeared and undergone rapid development [44, 45]. Hydrogen energy is an example of an electrochemical energy storage technology that has recently received a lot of attention [46]. With this technology, clean electric energy can be transformed into a liquid or gaseous fuel that is easy to store and transport. Additionally, hydrogen can be transformed into energy-dense, carbon-neutral liquid fuels (such as CH3OH and NH3) by combining it with CO2 or N2 from the atmosphere [47, 48]. The creation of new energy materials, which may significantly boost energy conversion efficiency and encourage the widespread use of revolutionary energy technologies, is one of the most promising solutions to the aforementioned problems. In conclusion, it is necessary to develop innovative, high-performance energy materials.
H. R. Khan et al.
Fig. 2 Plan for achieving future energy
3.2 Technological Advancements
Conventional approaches for developing energy materials entail theoretical computation, simulation, and experimental investigation [20]. As seen in Fig. 3, combining experiments with calculations, such as DFT calculations, can speed up the development of new materials. DFT calculations do, however, have downsides, including high time and computational cost [49]. Thanks to the advent and ongoing development of AI and big data approaches, the use of ML in screening high-performance energy materials has been thoroughly investigated [50, 51]. A database can be created using information from experiments or DFT calculations. Based on the database and the chosen features, ML algorithms can carry out large-scale data modeling, classification, and optimization. Potential candidates can then be singled out during screening. ML algorithms can also predict the macroscopic and microscopic properties of energy materials. In addition, feature engineering can be used to assess the relative importance of various descriptors, serving as a useful guide for the next round of modeling or classification [52]. In short, ML technology can significantly speed up the creation of innovative energy materials.
Fig. 3 Typical and elevated techniques for developing energy materials [53]
4 Algorithms for Machine Learning
Machine learning (ML), a subset of AI, uses models and algorithms to learn from past data as well as previously acquired knowledge [54, 55]. ML algorithms are typically grouped into supervised learning, unsupervised learning, and reinforcement learning (Fig. 4). Supervised learning uses two categories of ML models, regression models and classification models [56], with logistic regression and neural networks as examples. Unsupervised learning techniques, such as k-means clustering and principal component analysis, are primarily used for clustering and dimensionality reduction [57]. Reinforcement learning, which learns in an interactive environment through trial and error based on feedback, is another crucial component of machine learning. Q-learning and Markov decision processes are two frequently used tools in reinforcement learning [58].
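To make the trial-and-error idea behind reinforcement learning concrete, the sketch below applies tabular Q-learning to a toy problem. Everything in it is an assumption made for illustration: the five-state chain, the reward of 1 at the last state, and the learning rate and discount factor are not taken from the cited works.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the rewarding terminal state
ACTIONS = (-1, +1)    # move left or right along the chain

def step(state, action):
    """Toy environment: a reward of 1 is given only on reaching the terminal state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)                 # explore with a uniformly random action
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best future value
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal states: pick the action with the larger Q-value
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

Although the agent acts randomly while learning, the learned table ranks "move right" above "move left" in every state, which is the feedback-driven behavior the text describes.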
Fig. 4 Algorithms for machine learning
5 Machine Learning Applications
The use of ML algorithms in the design and development of innovative energy materials has been a growing trend in recent years [59]. This section introduces and discusses recent data-driven advances in materials science and engineering, covering alkali-ion battery materials, photovoltaic materials, catalytic materials, and carbon dioxide capture materials.
5.1 Batteries
Owing to their high energy density and environmental friendliness, lithium-ion batteries have advanced quickly in recent years. This battery technology does face certain difficulties, such as safety concerns and a shortage of raw materials. The development of improved battery materials is therefore one of the promising solutions to these problems [60]. Figure 5 depicts a typical lithium-ion battery structure, which consists of a cathode, an anode, and an electrolyte [61]. The cathode material is typically a lithium-containing metal oxide, whereas the anode material is typically made of carbon, graphite, or silicon [62]. Lithium salts and lithium metal oxides are typically used as the electrolyte components in liquid and solid electrolytes, respectively. The use of ML to screen high-performing lithium-ion battery materials has recently received a lot of attention [63], and a brief synopsis of some of the most recent research is provided in this section (Fig. 6).
Fig. 5 Conventional lithium-ion battery design [61]
Fig. 6 Robotic framework for probing electrode materials using automated machine learning [69]
5.1.1 Electrolytes
Innovative electrolyte systems are urgently needed to significantly increase the reliability and safety of lithium-ion batteries [64]. Researchers have recently begun to apply ML approaches to the discovery of new electrolyte materials [65]. Sodeyama et al. [66], for instance, proposed a multi-ML framework based on three separate linear regression techniques to estimate the properties of novel liquid electrolyte materials. The findings demonstrate that an exhaustive search with linear regression offers the most precise estimation of liquid electrolyte properties. Quantum chemistry models were then used by Ishikawa et al. [67] to investigate the coordination energies of five alkali metal ions (Li, Na, K, Rb, and Cs) to electrolyte solvents. According to cross-validation, linear regression algorithms gave the highest prediction accuracy for the coordination energies, reaching 0.127 eV.
Electrolyte additives, in addition to the electrolyte’s composition, have a large impact on how well lithium-ion batteries work. For instance, Okamoto and Kubo [68] modeled and analyzed the redox potentials of 149 electrolyte additives for lithium-ion batteries using structure-based calculations and machine learning techniques. The outcomes demonstrate that the descriptors correctly predicted the redox potentials. Furthermore, a limited number of features selected through feature engineering analysis can adequately capture the fundamental properties of the redox potential. Dave et al. [69] created a digital framework that combines ML technology with intelligent robots, which can autonomously carry out hundreds of consecutive experiments to optimize battery electrolytes, as shown in Fig. 6. This framework aims to accelerate scientific advancements in aqueous electrolytes for lithium-ion batteries. A potential candidate for a mixed-anion sodium electrolyte was found using a database of 251 aqueous electrolytes.
Zhang et al. [70] used an unsupervised ML technique to rank a candidate list drawn from a variety of Li-containing materials in order to screen advanced materials for solid-state Li-ion conductors, and found 16 novel fast Li-conductors. Similarly, Suzuki et al. [71] found two previously unreported lithium-ion conductors for solid-state electrolyte batteries by combining a recommender system with a random forest classification model. Additionally, the recently discovered Li6Ge2P4O17 required ten times less to synthesize than the traditional conductor. Nakayama et al. [72] offered two data-driven methods for screening materials to examine the possible use of non-flammable Li-conducting ceramics as solid electrolytes. Data processing using Bayesian optimization dramatically increased the effectiveness of the searches. Wang et al. [73] created an automated simulation optimization framework to build new solid polymer electrolytes using the same ML approach. In conclusion, ML technology can assist in the search for high-performance lithium-ion battery materials.
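The cross-validated linear regression mentioned above for coordination energies can be illustrated with a small sketch. The data are invented for the example (one hypothetical descriptor per solvent and a coordination energy in eV), and the single-descriptor least-squares fit is a simplification, not the models used in the cited studies.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def cross_validated_rmse(xs, ys, k=5):
    """k-fold cross-validation: each point is predicted by a model that never saw it."""
    squared_errors = []
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up data: a linear trend plus a small alternating perturbation of 0.05 eV
descriptor = [0.5 * i for i in range(10)]
energy_ev = [-2.0 + 0.3 * x + 0.05 * (-1) ** i for i, x in enumerate(descriptor)]
rmse = cross_validated_rmse(descriptor, energy_ev)
```

The returned root-mean-square error is in the same units as the target (eV here), which is how a figure such as a prediction accuracy in eV is typically obtained from held-out predictions.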
5.1.2 Electrodes
The successful industrialization of lithium-ion batteries in recent years has been made possible by the ongoing development of electrode materials. The use of ML to investigate new electrode materials has emerged as a new area of research interest that can hasten this development [74]. For instance, Shandiz et al. [75] used eight different classification algorithms to examine how crystal structure affects battery electrode performance. Three common crystal systems were examined, and the findings demonstrated that the RF model yields the highest predictive performance. Takagishi et al. [76] developed a comprehensive framework using three-dimensional synthetic nanostructures and ML to investigate the microstructure design of lithium-ion battery electrodes. The findings demonstrate that the ANN model’s prediction of the electrode specific resistance agrees quite well with the simulated value. Joshi et al. [77] created a web-based tool that predicts the voltage of electrode materials in metal-ion batteries in order to advance the discovery of battery materials. According to the findings, the online tool can quickly determine the voltage of any bulk electrode material for various metal ions.
To encourage the use of modern battery technology, it is important to consider other factors, such as production and utilization, in addition to creating new battery materials. In light of this, Turetskyy et al. [78] established a digital and intelligent battery production system using data-driven technologies and presented a case study. Kilic et al. [79] created a new ML technique that combines association rule mining with the apriori algorithm to examine the effect of materials and battery design on the performance of lithium-sulfur (Li–S) batteries. Based on information taken from the literature, this study found that the kind and quantity of encapsulating material play a crucial role in boosting battery capacity and extending cycle life. The most recent technological advancements also demonstrated that ML may be used to estimate the vital signs of batteries as well as sustainable life cycles [80].
In essence, the use of data-driven approaches such as ML and big data to hasten the development of next-generation battery technologies is one of the most promising prospects in the field of battery technology.
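The association rule mining mentioned above can be illustrated with a short sketch in the spirit of the apriori algorithm. The battery records and attribute names below are invented for the example and do not come from Kilic et al.; the code only shows how support, confidence, and apriori-style candidate pruning are computed.

```python
from itertools import combinations

# Made-up literature records: each set lists attributes reported for one battery
records = [
    {"encapsulated", "high_capacity"},
    {"encapsulated", "high_capacity", "long_cycle_life"},
    {"encapsulated", "long_cycle_life"},
    {"bare_sulfur"},
    {"bare_sulfur", "high_capacity"},
    {"encapsulated", "high_capacity", "long_cycle_life"},
]

def support(itemset, records):
    """Fraction of records that contain every item in the itemset."""
    return sum(itemset <= r for r in records) / len(records)

def frequent_itemsets(records, min_support, max_size=2):
    """Grow itemsets level by level, extending only those that are already frequent."""
    items = sorted(set().union(*records))
    current = [frozenset([item]) for item in items]
    frequent = {}
    for size in range(1, max_size + 1):
        kept = [s for s in current if support(s, records) >= min_support]
        for s in kept:
            frequent[s] = support(s, records)
        # Apriori pruning: larger candidates are built only from frequent sets
        current = list({a | b for a, b in combinations(kept, 2) if len(a | b) == size + 1})
    return frequent

def confidence(antecedent, consequent, records):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, records) / support(antecedent, records)

frequent = frequent_itemsets(records, min_support=0.5)
rule_confidence = confidence(frozenset({"encapsulated"}), frozenset({"high_capacity"}), records)
```

In this toy data the rule "encapsulated implies high capacity" has a confidence of 0.75, meaning that three of the four encapsulated records also report high capacity.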
5.2 Photovoltaics and Light-Emitting Materials
The primary performance indicators for optoelectronic materials are their bandgap (or wavelength) and their ability to be manufactured with a sufficiently low concentration of defects, which would otherwise eliminate the generated electron-hole pairs. These attributes can be predicted using the DFT-based materials databases that are already available [81]. Although the databases hold up to 20,000 semiconducting compounds, it is still uncertain whether any entirely original, previously undiscovered materials remain, or whether all advancements will be based on alloying and/or doping of existing materials [82, 83]. Current ML research in this area therefore concentrates on improving the precision of bandgap predictions and on forecasting the characteristics of alloyed, deformed, and defective structures using only the data provided for ideal crystals.
Before quantitative descriptors that precisely account for the crystal’s atomic structure were created, qualitative models had already demonstrated their potential for filtering and narrowing the broad search space. Jin et al. [84] used ML to develop a classifier for assessing a material’s potential for success in photovoltaic (PV) applications. The model was developed using 196 data points of known and previously described PV and non-PV materials with bandgaps between 1 and 2 eV. The bandgap was implicitly included in the model along with other significant factors such as experimental defect densities and mobilities. When the trained classifier was applied to 187,000 experimentally known materials from the ICSD, 3011 promising candidates were found. After structural screening and DFT simulations, the list was further reduced to 26 candidates, among which Sb2Te3 and Bi2Se3 were identified as promising photovoltaic materials [84].
Another way to avoid having to encode the atomistic crystal structure is to limit the search to a single crystal group in which cations and anions all occupy the same lattice positions, so that the interactions between them are implicitly encoded and accounted for. Lu and colleagues used this method to forecast the band structure of hybrid organic–inorganic perovskites (HOIPs). The ML model was trained on DFT data from 212 HOIPs using a tolerance factor, octahedral factors, ionic charge, electronegativity, and orbital radii as features. A total of 5158 potential HOIPs were produced by mixing 32 distinct A-site cations, 43 different B-site cations, and 4 halides, and the model was then used to screen this space. Following the preliminary ML screening, six candidate materials were chosen for validation based on the ML results and their expected ease of synthesis. Additional DFT investigation verified the existence of direct band gaps in two materials, C2H5OSnBr3 and C2H6NSnBr3 [85]. Zhuo et al. [86] demonstrated the use of ML to predict advanced materials for white-light phosphors.
The desired property, quantum efficiency, was replaced with a more practical proxy, structural rigidity, as captured by the Debye temperature, which can be accurately predicted [81]. The Debye temperature is an excellent candidate for ML prediction because the DFT calculation is accurate but prohibitively expensive. Using information from the Materials Project [81], 2071 materials of interest with more than three phases and typical starting reagents were screened. NaBaB9O15 was identified as an ideal phosphor host by maximizing the Debye temperature for a given bandgap. Following synthesis of the phosphor host, a very efficient (95% quantum yield) Eu2+-doped, green-emitting material was prepared [86].
Trained ML models can also be analyzed to better understand the underlying physics and the factors affecting the property of interest, so that those factors can be altered to tune the material. Im et al. [87] developed an ML technique that assigns each feature a relevance value based on how it affects the performance of the model. The B3+-halogen bond length and the electronegativity of the halogens played a major role in determining the formation energy. Pilania et al. [88] used similar methods and were able to establish correlations between the feature values and the material bandgap using their model, which allowed them to locate the best features and identify trends.
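One widely used way to assign each feature a relevance value based on its effect on model performance is permutation importance, sketched below. This is a named substitute chosen for illustration and is not claimed to be the method of Im et al. or Pilania et al.; the `trained_model` function and the descriptor values are hypothetical.

```python
import random

def trained_model(row):
    """Hypothetical stand-in for an already-trained regressor; it ignores feature 2."""
    return 3.0 * row[0] + 0.5 * row[1]

def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=5, seed=0):
    """Average increase in error when one feature column is shuffled across samples."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)                      # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Made-up descriptors: three features for each of 20 materials
rows = [[(i % 5) / 4, (i % 4) / 3, (i * 7 % 20) / 19] for i in range(20)]
targets = [trained_model(r) + 0.01 * (-1) ** i for i, r in enumerate(rows)]
scores = permutation_importance(trained_model, rows, targets)
```

A feature the model relies on heavily produces a large error increase when shuffled, while a feature the model ignores produces none, so the scores rank the descriptors by how much the predictions depend on them.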
6 Future Perspective
6.1 Materials for CO2 Capture
The difficulties and opportunities of applying data-driven research to the creation of CO2 capture materials can be summarized in the following prospects. Firstly, further research should be done on the creation and modification of ML algorithms for various CO2 capture material systems. Notably, the RF method has been widely used in the development and identification of new CO2 capture materials, demonstrating its strong potential for real-world application. In addition, algorithms with optimization functions, such as GA and GBRT, can be used to screen CO2 capture materials. Secondly, future studies should concentrate on developing design principles, such as inverse design based on feature engineering, that will direct the creation of new CO2 capture materials. Thirdly, an automated integrated system for the creation of CO2 capture materials should be built by combining DFT calculations, intelligent robot technology, and experimental research. The predicted materials can then be synthesized to improve the ML model further. In addition to the specific difficulties and future prospects stated above for each energy material, general viewpoints that apply to all of these materials are also crucial to furthering data-driven energy materials science and engineering.
6.2 Materials for Catalysis
The application of ML technology to the creation of new catalytic materials is still in its infancy and is mostly driven by experience. Catalysis typically entails chemical reactions on multiple scales and dimensions, which makes it a dynamic and complex process. Furthermore, the experimental parameters for catalytic processes given in the literature are frequently too general, and some specific experimental details are deliberately withheld. In addition, experiments are reported with a variety of data formats and methods, particularly for gray data. These factors have made it harder to build databases and choose parameters, which has slowed the development of improved catalytic materials. The following views may be taken into account in future studies to support the advancement of catalysis informatics. Firstly, to optimize the performance of ML models, catalytic materials scientists should focus on integrating ML technology with the current physical and chemical models of catalytic reactions. Secondly, combining automated technology such as intelligent robots with optimization algorithms such as Bayesian and genetic algorithms can speed up research into undiscovered catalytic materials and reaction mechanisms. Thirdly, information for ML modeling should be gathered from a variety of sources, including computational data sets, online open-source databases, and data from in-lab experiments.
6.3 Model Visualization and an Automated Closed-Loop Optimization Roadmap
Although ML technology has consistently proven to be a helpful tool in data-driven materials research, a number of issues still need to be resolved. For instance, the choice of descriptors and the setting of parameters in the ML modeling process rely heavily on manual judgement. Time costs rise because none of the parameters can be updated automatically based on the outcomes of prior rounds. Furthermore, ML models created by certain algorithms, such as neural networks, are challenging to interpret because they typically act as black boxes. In view of this, the following factors and perspectives for the success of ML applications are highlighted. To hasten materials discovery, a closed-loop optimization framework for ML algorithms should be established first. The major goal of this strategy, as depicted in Fig. 7, is to create a closed-loop iterative process that can build hypotheses about producing materials with specific architectures and properties. The automated framework would thus be able to design, carry out, and analyze experiments. Using a combination of Bayesian active learning and the data obtained from the previous round, the next round of experimental exploration and simulation can then be designed. This approach has drawbacks of its own: in autonomous optimization, the best candidate found may be only a local optimum, whereas the ideal solution is the global optimum. Other promising candidates will then be unintentionally left out. To avoid the trap of local optima, it is recommended that the closed-loop optimization framework be used in conjunction with algorithms that provide global optimization (for instance, evolutionary algorithms).
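The difference between a local and a global optimum, and the role an evolutionary algorithm can play, can be illustrated with a toy comparison. The `measured_property` function is a made-up stand-in for an experimental outcome with two peaks, and the simple elitist evolutionary search shown here is only one possible variant, chosen for brevity.

```python
import math
import random

def measured_property(x):
    """Hypothetical outcome: a local peak near x = 2 and a higher, global peak near x = 7.6."""
    return math.exp(-(x - 2.0) ** 2) + 2.0 * math.exp(-((x - 7.6) ** 2) / 0.5)

def hill_climb(x, step=0.1, max_steps=1000):
    """Greedy local search: accept a neighbouring setting only if it improves the outcome."""
    for _ in range(max_steps):
        best_neighbour = max((x - step, x + step), key=measured_property)
        if measured_property(best_neighbour) <= measured_property(x):
            break
        x = best_neighbour
    return x

def evolutionary_search(generations=30, population_size=11, sigma=0.3, seed=0):
    """Elitist evolutionary search over the interval [0, 10]."""
    rng = random.Random(seed)
    population = [10.0 * i / (population_size - 1) for i in range(population_size)]
    for _ in range(generations):
        population.sort(key=measured_property, reverse=True)
        parents = population[: population_size // 2]        # the best settings survive unchanged
        children = [min(max(rng.choice(parents) + rng.gauss(0.0, sigma), 0.0), 10.0)
                    for _ in range(population_size - len(parents))]
        population = parents + children                      # mutated copies explore nearby settings
    return max(population, key=measured_property)

local_best = hill_climb(1.0)
global_best = evolutionary_search()
```

Starting from x = 1, the greedy search stops at the lower peak near x = 2, whereas the population-based search keeps candidates spread over the whole range and ends on the higher peak, which is the behavior the recommendation above relies on.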
In addition to the automated framework described above, the use of deep neural networks in materials science is advised. Because deep learning has excellent nonlinear fitting capabilities, it can model the intricate relationships among multiple features and help reveal the mechanisms of material formation.
Fig. 7 Autonomous material exploration and optimization in a closed-loop [89]
6.4 Machine-Learned Interatomic Potentials
The high computational cost of simulating complicated and long-term behavior, particularly in systems with many atoms, is a significant drawback of existing DFT-based approaches. For instance, current ab initio molecular dynamics has been shown to be sufficient to identify promising ionic conductors in the design of solid and liquid electrolytes. However, the long-timescale behavior of electrolyte-electrode interfaces, which is crucial to the failure of current materials in devices, is currently impossible to model fully [90, 91]. Classical molecular mechanics simulations are substantially faster but are often incapable of simulating bond breaking or chemical reactions. Bond breaking can be added in some special cases, such as in ReaxFF [92], but it must be painstakingly parametrized and is not yet available for the majority of elements. Additionally, polarization effects are not accurately described by the force fields now in use [93].
ML-based potentials and force fields offer a workable option for enabling accurate molecular dynamics at a fraction of the cost of DFT. With “learning on the fly” (LOTF), the atomic interactions and forces pertinent to a system of interest are learned from the system in question rather than by creating a universal force field. To accomplish this, a DFT molecular dynamics simulation must be run for enough iterations that adequate data is gathered to train the ML model. After being trained on the local atomic environments of interest, the ML model can predict the atomic forces without using DFT. Software packages that explore this capability include MLIP, SchNetPack, and VASP (to be implemented in 2021) [94]. This LOTF technique has recently been used to determine Li-ion diffusivities in solid-state electrolytes, allowing for the identification of candidates for protective coatings for cathode materials [95].
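The workflow described above (gather reference forces, train a model, then predict forces without the reference method) can be sketched with a toy example. The assumptions are substantial and deliberate: a Lennard-Jones dimer stands in for the DFT force call, and a simple kernel-weighted (Nadaraya-Watson) regressor stands in for the far richer models used in real ML potentials. The class and function names are hypothetical and do not correspond to the API of MLIP, SchNetPack, or VASP.

```python
import math

def reference_force(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones force on a dimer; an inexpensive stand-in for a DFT force call."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r

class KernelForceModel:
    """Kernel-weighted (Nadaraya-Watson) regressor trained on reference forces."""

    def __init__(self, bandwidth=0.01):
        self.bandwidth = bandwidth
        self.distances = []
        self.forces = []

    def fit(self, distances, forces):
        self.distances = list(distances)
        self.forces = list(forces)

    def predict(self, r):
        weights = [math.exp(-0.5 * ((r - d) / self.bandwidth) ** 2) for d in self.distances]
        total = sum(weights)
        if total < 1e-6:
            # No nearby training data: an on-the-fly scheme would call the reference method here
            raise ValueError("configuration is far from the training data")
        return sum(w * f for w, f in zip(weights, self.forces)) / total

# Gather reference data once, as the initial reference molecular dynamics run would
training_r = [0.95 + 0.01 * i for i in range(156)]     # separations from 0.95 to 2.50
model = KernelForceModel()
model.fit(training_r, [reference_force(r) for r in training_r])

# Afterwards, forces inside the sampled range are predicted without the reference method
predicted = model.predict(1.5)
```

The error raised for configurations far from the training data marks the point at which a learning-on-the-fly scheme would fall back to a new reference calculation and extend the training set.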
6.5 Data Production and Accessibility
Scientific research is a collaborative process, and its progress depends strongly on the availability of prior information. Open sharing of experimental information is therefore crucial both for future development and for replication and verification. A coordinated effort to disseminate all data could help address some of the problems caused by data scarcity in the field of machine learning. However, the data needed to create a particular ML model is far too often left unpublished. Even when well-known databases such as the Materials Project are used, researchers frequently fail to identify the precise data that was used to create and evaluate the ML algorithm. This is a significant barrier to the replication and improvement of ML models. Data sharing is crucial and ought to be required for all ML reports. This includes sharing the datasets used for training and validation as well as the ML code and model weights.
Using experimental data for ML model training is a promising way to capture factors that DFT simulations find challenging to represent. To address the issue of data scarcity, the scientific community should consider switching to digital laboratory record keeping, which is suitable for fast data analysis and ML. Published pretrained models allow experimentalists to evaluate and rank new concepts and materials quickly. Shields et al. [96] have shown, for instance, that Bayesian optimization can be used to discover synthetic routes to target organic compounds that are more effective than those suggested by a person. Such models, when properly integrated, can be a helpful tool for cutting down the time an experimentalist spends on optimization, thereby enhancing productivity.
7 Conclusion
The resurgence of machine learning has come at a critical juncture and is anticipated to be a key instrument in the fight against global concerns such as climate change, whose resolution requires rapid material discovery. In addition to saving time, the use of such algorithms avoids human biases that would otherwise hinder reproducibility and innovation. Further advancements are anticipated from integrating high-throughput robotics with the ML infrastructure to speed up material discovery by partially or entirely eliminating human labor. The use of ML to automate data processing, forecast fundamental material properties, optimize experimental parameter spaces, and find new materials from databases has shown considerable potential. These tools are now easily accessible and freely available to both theorists and experimenters, and they promise to save a lot of time when looking for new materials. While it is heartening to see that DFT-based screening of big datasets is being replaced, the primary objective of such screening, identifying new functional materials, should not be overlooked. Given that simulations are frequently idealized replicas of nature, it is crucial to evaluate the suggested options through experimental preparation of the materials and functional testing in addition to in silico screening. Such experimental datasets are small by nature, but methods that can make use of less data are actively being developed. If these experimental tests are made public, they will act as the next-generation datasets for even more precise models. According to our assessment, the state of the art in ML application within the materials community, as surveyed in this chapter, will facilitate the creation of high-performance energy materials.
References 1. Lin, L., Gao, L., Kedzierski, M. A., & Hwang, Y. (2020). 2. Gao, P., Chen, Z., Gong, Y., Zhang, R., Liu, H., Tang, P., Chen, X., Passerini, S., & Liu, J. (2020). 10, 1903780. 3. Jurasz, J., Canales, F., Kies, A., Guezgouz, M., & Beluco, A. (2020). 195, 703–724. 4. Kang, Y., Li, L., & Li, B. (2021). 54, 72–88. 5. Kang, P., Liu, Z., Abou-Rachid, H., & Guo, H. (2020). 124, 5341–5351. 6. Neugebauer, J., & Hickel, T. (2013). 3, 438–448. 7. Himanen, L., Geurts, A., Foster, A., & Rinke, P. (2019). A. S. Perspectives. 8. Chen, A., Zhang, X., & Zhou, Z. J. (2020). 2, 553–576. 9. Wang, A. Y.-T., Murdock, R. J., Kauwe, S. K., Oliynyk, A. O., Gurlo, A., Brgoch, J., Persson, & Sparks, K. (2020). 32, 4954–4965. 10. Jablonka, K. M., Ongari, D., Moosavi, S. M., & Smit, B. (2020). 120, 8066–8129. 11. Altintas, C., Altundal, O. F., Keskin, S., & Yildirim, R. (2021). Modeling, 61, 2131–2146. 12. Barnett, J. W., Bilchak, C. R., Wang, Y., Benicewicz, B. C., Murdock, L. A., Bereau, T., & Kumar, S. K. (2020). 6, eaaz4301. 13. Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O., & Walsh, A. (2018). 559, 547–555. 14. Reymond, J. L. (2015). 48, 722–730. 15. Suzuki, Y., Hino, H., Hawai, T., Saito, K., Kotsugi, M., & Ono. K. (2020). 10, 1–11. 16. Liu, Y., Wu, J. M., Avdeev, M., & Shi, S. (2020). Simulations, 3, 1900215. 17. Gu, G. H., Noh, J., Kim, I., & Jung, Y. (2019). 7, 17096–17117. 18. Li, J., Lim, K., Yang, H., Ren, Z., Raghavan, S., Chen, P.-Y., Buonassisi, T., & Wang, X. (2020). 3, 393–432. 19. Chen, C., Zuo, Y., Ye, W., Li, X., & Deng, Z., Ong. (2020). 10, 1903242. 20. Correa-Baena, J.-P., Hippalgaonkar, K., van Duren, J., Jaffer, S., Chandrasekhar, V. R., Stevanovic, V., Wadia, C., Guha, S. & Buonassisi T. (2018). 2, 1410–1420. 21. Kudyshev, Z. A., Kildishev, A. V., Shalaev, V. M., & Boltasseva A., (2020). 7, 021407. 22. Wu, Z., Yang, T., Deng, Z., Huang, B., Liu, H., Wang, Y., Chen, Y., Stoddard, M. C., Li, L., & Zhu, Y. (2019). 8, 559–569. 23. 
Varma, V., Gerosa, D., Stein, L. C., Hébert, F., & Zhang, H., (2019). 122, 011101. 24. Martinez-Cantin, R., Tee, K., McCourt, M. (2018). 1722–1731. 25. Hase, F., Roch, L. M., Kreisbeck, C., & Aspuru-Guzik, A. J. (2018). 4, 1134–1145. 26. Janai, M. A. B., Woon, K. L., & Chan, C. (2018). 63, 257–266. 27. Kirkey, A., Luber, E. J., Cao, B., Olsen, B. C., & Buriak, J. (2020). Interfaces, 12, 54596–54607. 28. Cao, B., Adutwum, L. A., Oliynyk, A. O., Luber, E. J., Olsen, B. C., Mar, A., & Buriak. (2018). 12, 7434–7444. 29. Du, X., Lüer, L., Heumueller, T., Wagner, J., Berger, C., Osterrieder, T., Wortmann, J., Langner, S., Vongsaysy, U., & Bertrand, M. (2021). 5, 495–506. 30. Siemenn, A. E., Beveridge, M., Buonassisi, T., & Drori I. (2021). 31. Voznyy, O., Levina, L., Fan, J. Z., Askerka, M., Jain, A., Choi, M.-J., Ouellette, O., Todorovi´c, P., Sagar, L. K., & Sargent, E. (2019). 13, 11122–11128. 32. Salley, D. S., Keenan, G. A., Long, D.-L., Bell, N. L., & Cronin, L. (2020). 6, 1587–1593.
33. Mekki-Berrada, F., Ren, Z., Huang, T., Wong, W. K., Zheng, F., Xie, J., Tian, I. P., Jayavelu, S., Mahfoud, Z., Bash, D., & Hippalgaonkar, K. (2021). 7, 1–10. 34. Li, Z., Najeeb, M. A., Alves, L., Sherman, A. Z., Shekar, V., Cruz Parrilla, P., Pendleton, I. M., Wang, W., Nega, P. W., & Zeller, M. (2020). 32, 5650–5663. 35. Burger, B., Maffettone, P. M., Gusev, V. V., Aitchison, C. M., Bai, Y., Wang, X., Li, X., Alston, B. M., Li, B., & Clowes, R. J. (2020). 583, 237–241. 36. Crippa, M., Guizzardi, D., Muntean, M., Schaaf, E., Solazzo, E., Monforti-Ferrario, F., Olivier, J., & Vignati, E. (2020). 37. Edström, K., Dominko, R., Fichtner, M., Perraud, S., Punckt, C., Asinari, P., Castelli, I., Christensen, R., Clark, S., & Grimaud, A. (2020). 2030, 60–61. 38. Gosens, J., Kåberger, T., & Wang, Y. (2017). Engineering, 5, 141–155. 39. Erickson, L. E., & Brase, G. (2019). 11–22. 40. Javed, M. S., Ma, T., Jurasz, J., & Amin, M. (2020). 148, 176–192. 41. Levasseur, A., Mercier-Blais, S., Prairie, Y., Tremblay, A., & Turpin, C. (2021). Reviews, 136, 110433. 42. Fuhrman, J., Clarens, A. F., McJeon, H., Patel, P., Doney, S. C., Shobe, W. M., & Pradhan S. (2020). 43. Esan, O. C., Shi, X., Pan, Z., Huo, X., An, L., Zhao, T. (2020). 10, 2000758. 44. Cano, Z., Banham, D., Ye, S., Hintennach, A., Lu, J., Fowler, M., Chen, Z. (2018). 3, 279–289. 45. Ng, M.-F., Zhao, J., Yan, Q., Conduit, G. J., & Seh, Z. W. (2020). 2, 161–170. 46. Pan, Z., Bi, Y., & An, L. (2020). 258, 114060. 47. Li, F., Thevenon, A. J. A., Wang, Z., Li, Y., Gabardo, C. M, Ozden, A., Dinh, C. T., Li, J., Wang, Y., et al. (2020). 509–513. 48. Coady, D., Parry, I., Sears, L., & Shang, B. (2017). 91, 11–27. 49. Cai, J., Chu, X., Xu, K., Li, H., & Wei, J. (2020). 2, 3115–3130. 50. Yang, X., Luo, Z., Huang, Z., Zhao, Y., Xue, Z., Wang, Y., Liu, W., Liu, S., Zhang, H., & Xu, K. (2020). 8, 167. 51. Juan, Y., Dai, Y., Yang, Y., & Zhang, J. (2021). 79, 178–190. 52. 
Blaiszik, B., Ward, L., Schwarting, M., Gaff, J., Chard, R., Pike, D., Chard, K., & Foster, I. (2019). 9, 1125–1133. 53. Liu, Y., Esan, O. C., Pan, Z., & An, AI, L. (2021). 3, 100049. 54. Isayev, O., Fourches, D., Muratov, E. N., Oses, C., Rasch, K., Tropsha, A., & Curtarolo, S. (2015). 27, 735–743. 55. Mrdjenovich, D., Horton, M. K., Montoya, J. H., Legaspi, C. M., Dwaraknath, S., Tshitoyan, V., Jain, A., & Persson, K. (2020). 2, 464–480. 56. Patel, P., & Ong, S. P. (2019). 44, 162–163. 57. Celebi, M. E., & Aydin, K. (2016). 9, 103. 58. Khatib, M. E., & de Jong, W. A. (2020). 59. Vasudevan, R. K., Choudhary, K., Mehta, A., Smith, R., Kusne, G., Tavazza, F., Vlcek, L., Ziatdinov, M., Kalinin, S. V., & Hattrick-Simpers, J. (2019). 9, 821–838. 60. Zhou, M., Gallegos, A., Liu, K., Dai, S., & Wu, J. (2020). 157, 147–152. 61. Deringer, V. L. (2020). 2, 041003. 62. Marom, R., Amalraj, S. F., Leifer, N., Jacob, D., & Aurbach, D. J. (2011). 21, 9938–9954. 63. Liu, K., Wei, Z., Yang, Z., & Li, K. (2021). 289, 125159. 64. Van Duong, M., Van Tran, M., Garg, A., Van Nguyen, H., Huynh, T. & Phung Le, M. (2021). 45, 4133–4144. 65. Kim, S., Jinich, A., & Aspuru-Guzik, A. (2017). 57, 657–668. 66. Sodeyama, K., Igarashi, Y., Nakayama, T., Tateyama, Y., & Okada, M. (2018). 20, 22585– 22591. 67. Ishikawa, A., Sodeyama, K., Igarashi, Y., Nakayama, T., Tateyama, Y., & Okada, M. (2019). 21, 26399–26405. 68. Okamoto, Y., & Kubo, Y. (2018). 3, 7868–7874. 69. Dave, A., Mitchell, J., Kandasamy, K., Wang, H., Burke, S., Paria, B., Póczos, B., Whitacre, J., & Viswanathan, V. (2020). 1, 100264.
70. Zhang, Y., He, X., Chen, Z., Bai, Q., Nolan, A. M., Roberts, C. A., Banerjee, D., Matsunaga, T., Mo, Y., & Ling, C. (2019). 10, 1–7. 71. Suzuki, K., Ohura, A., Seko, Y., Iwamizu, G., Zhao, M., Hirayama, I., Tanaka, & Kanno, R. (2020). 8, 11582–11588. 72. Nakayama, M., Kanamori, K., Nakano, K., Jalem, R., Takeuchi, I., & Yamasaki, H. (2019). 19, 771–778. 73. Wang, Y., Xie, T., France-Lanord, A., Berkley, A., Johnson, J. A., Shao-Horn, Y., & Grossman, J. (2020). 32, 4144–4151. 74. Sendek, A. D., Cubuk, E. D., Antoniuk, E. R., Cheon, G., Cui, Y., & Reed, E. (2018). 31, 342–352. 75. Toyao, T., Suzuki, K., Kikuchi, S., Takakusagi, S., Shimizu, K.-I., & Takigawa, I. (2018). 122, 8315–8326. 76. Takagishi, Y., Yamanaka, T., & Yamaue, T. (2019). 5, 54. 77. Joshi, R. P., Eickholt, J., Li, L., Fornari, M., Barone, V., & Peralta, J. (2019). 11, 18494–18503. 78. Turetskyy, A., Thiede, S., Thomitzek, M., von Drachenfels, N., Pape, T., & Herrmann, C. (2020). 8, 1900136. 79. Soriano-Molina, P., Plaza-Bolaños, P., Lorenzo, A., Agüera, A., Sánchez, J. G., Malato, S., & Pérez, J. (2019). 366, 141–149. 80. Liu, W., & Xu, Y. (2020). 35, 1715–1718. 81. Jain, A., Ong, S. P., Hautier, G., Chen, W., Richards, W. D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., & Ceder, G. (2013). 1, 011002. 82. Wang, Z., Ha, J., Kim, Y. H., Im, W. B., McKittrick, J., & Ong, S. (2018). 2, 914–926. 83. Wang, Z., Chu, I.-H., Zhou, F., & Ong, S. (2016). 28, 4024–4031. 84. Jin, H., Zhang, H., Li, J., Wang, T., Wan, L., Guo, H., & Wei, Y. (2020). 11, 3075–3081. 85. Lu, S., Zhou, Q., Ouyang, Y., Guo, Y., Li, Q., & Wang, J. (2018). 9, 1–8. 86. Zhuo, Y., Mansouri Tehrani, A., Oliynyk, A. O., Duke, A. C., & Brgoch, J. J. (2018). 9, 1–10. 87. Im, J., Lee, S., Ko, T.-W., Kim, H. W., Hyon, Y., & Chang, H. (2019). 5, 1–8. 88. Pilania, G., Mannodi-Kanakkithodi, A., Uberuaga, B., Ramprasad, R., Gubernatis, J., & Lookman, T. (2016). 6(1), 9375. 89. Kusne, A. 
G., Yu, H., Wu, C., Zhang, H., Hattrick-Simpers, J., DeCost, B., Sarker, S., Oses, C., Toher, C., & Curtarolo, S. (2020), 11, 1–11. 90. Zhao, Q., Stalin, S., Zhao, C.-Z., & Archer, L. A. (2020). 5, 229–252. 91. Ceder, G., Ong, S. P., & Wang, Y. J. (2018). 43, 746–751. 92. Senftle, T. P., Hong, S., Islam, M. M., Kylasa, S. B., Zheng, Y., Shin, Y. K., Junkermeier, C., Engel-Herbert, R., Janik, M. J., & Aktulga, H. M. (2016). 2, 1–14. 93. Bedrov, D., Piquemal, J.-P., Borodin, O., MacKerell, Jr, A. D., Roux, B., & Schröder, C. J. (2019). 119, 7940–7995. 94. Novikov, I. S., Gubaev, K., Podryabinkin, E. V., & Shapeev, A. (2020). 2, 025002. 95. Wang, C., Aoyagi, K., Wisesa, P., & Mueller, T. (2020). 32, 3741–3752. 96. Shields, B. J., Stevens, J., Li, J., Parasram, M., Damani, F., Alvarado, J. I. M., Janey, J. M., Adams, R. P., & Doyle, A. G. (2021), 590, 89–96.
Recent Advances in Machine Learning for Electrochemical, Optical, and Gas Sensors

Elsa M. Materón, Filipe S. R. Silva Benvenuto, Lucas C. Ribas, Nirav Joshi, Odemir Martinez Bruno, Emanuel Carrilho, and Osvaldo N. Oliveira
Abstract Machine learning is increasingly used to analyze distinct types of data for clinical diagnosis and environmental monitoring, particularly because of the large amounts of data generated by sensing and biosensing methods. In this chapter, we discuss the use of machine learning for electrochemical sensors, with emphasis on colorimetric principles of detection.
E. M. Materón (✉) · F. S. R. Silva Benvenuto · E. Carrilho
Instituto de Química de São Carlos, Universidade de São Paulo, São Carlos, SP 13566-590, Brazil

E. M. Materón · E. Carrilho
Instituto Nacional de Ciência e Tecnologia de Bioanalítica-INCTBio, Campinas, SP 13083-970, Brazil

E. M. Materón · N. Joshi · O. M. Bruno · O. N. Oliveira
São Carlos Institute of Physics, University of São Paulo, P.O. Box 369, São Carlos, SP 13560-970, Brazil

L. C. Ribas
Institute of Biosciences, Humanities and Exact Sciences, São Paulo State University, São José do Rio Preto, SP 15054-000, Brazil

O. M. Bruno
Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, SP 13566-590, Brazil

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_6

1 Machine Learning

Many human activities involving tasks such as recognizing faces, voices, objects (e.g., chairs, cups), and animals can now be performed with intelligent computer systems. These artificial intelligence (AI) systems must learn how to define and describe the main characteristics that distinguish different objects or animals. The challenges involved are associated with image analysis of animals at different angles, scale,
fur, size, etc. Humans are able to make distinctions based on the recognition of patterns that have been observed and learned over the years. Another example is the traditional stool ova and parasite test (O&P test). This technique uses a microscope to diagnose parasitic infections, whether caused by protozoa or helminths, which cause disease in humans. This type of analysis has been used since the end of the nineteenth century and has helped to improve human health, especially in children. A competent parasitologist is able to identify parasites or eggs based on their morphological, textural, and color characteristics, thanks to the knowledge acquired in academic training. On the other hand, creating a computer program that performs image-based parasite or egg recognition as effectively as the expert is a difficult problem. Other problems that are easy for humans but challenging for computers are speech recognition, facial recognition for biometrics, and text translation. To deal with the problems above and others, AI algorithms have been developed, initially based on expert knowledge and logical rules [1]. With the increasing processing and storage power of computers and the availability of massive amounts of data in the Big Data era, AI systems are now mostly based on the statistical analysis of data. Of particular relevance is Machine Learning (ML), which is aimed at building systems capable of automatically acquiring knowledge from experience accumulated through previous decisions. ML techniques learn from experience using the principle of induction, in which generic conclusions are reached from a set of data, inducing a function or hypothesis to solve a problem. Such techniques are based on concepts from different scientific disciplines such as Statistics, Neuroscience, Computer Science, and Mathematics. 
In recent years, different ML algorithms have been applied to disease diagnosis [2–4], speech recognition [5], object detection [6], facial recognition [7], and virus detection [8]. Next, we will introduce some machine learning concepts.
1.1 Induction of Hypotheses

To explain how ML algorithms work, let us consider a dataset formed by a tuple (X, Y), where X are the instances (or samples or objects), typically represented through the vector-space model, and Y are the output attributes. For example, in a dataset involving customer financial data called “credit dataset,” each customer corresponds to an instance. In this dataset, each instance $x_i$ is represented by a set of $M$ attributes (named input attributes or features), $x_i = [x_i^1, x_i^2, \ldots, x_i^M]$, which can be values describing the consumer information such as age range, geography, debt balances, credit risk scores, assets, and delinquency status at the loan level. Each instance also has an output attribute $y_i$ (or label or class) that contains
the target and can be predicted using the input attributes. This output attribute can determine, for example, whether the consumer can receive a loan or not. ML algorithms aim to learn a model or hypothesis from experience, that is, using a subset of labeled instances from the original dataset, referred to as the training dataset. The model learned from the training dataset must be able to map the input attributes of an instance to its corresponding output attribute. The model will then be able to predict the output attribute of new instances for which it is unknown. In other words, a credit model must correctly predict the credit risk of new customers. To test the generalization power of the trained models, which is the ability to predict unseen data, one can employ a subset of data not used in the model training step, referred to as testing data. A model with good generalization ability is obtained if the training dataset contains enough samples to statistically represent the diversity of the problem domain. A good machine learning model should be able to properly classify any unknown data from the problem domain, obtaining good performance metrics on both the training and testing data. Figure 1a illustrates a good separation frontier (in red) between samples from two categories (squares and stars). However, overtraining may cause the model to learn the detail and noise of the training data exactly, in effect memorizing such data. In this case, the model performs well on the training data, but it cannot predict unknown samples due to its low generalization ability. This phenomenon is known as overfitting, illustrated in Fig. 1b. On the other hand, underfitting, illustrated in Fig. 1c, occurs when the algorithm fails to learn the training data, achieving low performance on both the training and testing datasets. In the learning process, ML techniques perform a search for a hypothesis, within a space of possible hypotheses, which maps and best fits the training data. 
Thus, the algorithms have two biases: search and representation [1]. In the representation bias, the hypothesis can be represented in different ways according to the type of algorithm, such as tree structure in decision tree algorithms and weight values of connections in artificial neural networks. The search bias defines how the search for hypotheses is carried out in the space of possibilities. An important feature in machine learning algorithms is the robustness to deal with datasets with inconsistencies, noise, outliers, incomplete information, missing or redundant values. In addition, information may
Fig. 1 Illustration of different scenarios of generalization of machine learning models: (a) optimal, (b) overfitting, (c) underfitting
be irrelevant to the problem domain, such as the height and name of a consumer for the credit risk prediction task. To assist ML algorithms and minimize the influence of these issues on the learning process, pre-processing techniques can be employed on the dataset to determine the correct data for input to the learning algorithms.
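The roles of the training and testing datasets discussed in this section can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the integer "credit score buckets", the three flipped labels that mimic noise, and the three toy models (a memorizing nearest-neighbor rule, a single learned cut-off, and a constant predictor). It is not a method taken from the cited literature.

```python
# Toy illustration of training/testing evaluation (all data are made up).
# Each instance is an integer "credit score bucket" 0..39; the underlying rule
# is label 1 for scores >= 16. Three labels are flipped to mimic noise.
NOISY = {6: 1, 26: 0, 31: 0}
data = [(x, NOISY.get(x, int(x >= 16))) for x in range(40)]

train = data[0::2]   # even scores: used to induce the models
test = data[1::2]    # odd scores: held out to measure generalization


def accuracy(predict, samples):
    """Fraction of samples whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in samples) / len(samples)


def fit_memorizer(samples):
    """1-nearest-neighbor: reproduces every training label, noise included."""
    def predict(x):
        nearest = min(samples, key=lambda s: abs(s[0] - x))
        return nearest[1]
    return predict


def fit_threshold(samples):
    """Pick the single cut-off t that minimizes the training errors."""
    candidates = [x for x, _ in samples] + [max(x for x, _ in samples) + 1]
    best_t = min(candidates,
                 key=lambda t: sum(int(x >= t) != y for x, y in samples))
    return best_t, (lambda x: int(x >= best_t))


def fit_constant(samples):
    """Always predict the most frequent training label (too simple a model)."""
    labels = [y for _, y in samples]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


memorizer = fit_memorizer(train)
threshold, threshold_model = fit_threshold(train)
constant = fit_constant(train)

results = {
    "memorizer": (accuracy(memorizer, train), accuracy(memorizer, test)),
    "threshold": (accuracy(threshold_model, train), accuracy(threshold_model, test)),
    "constant": (accuracy(constant, train), accuracy(constant, test)),
}

for name, (train_acc, test_acc) in results.items():
    print(f"{name:10s} train={train_acc:.2f} test={test_acc:.2f}")
```

On this toy data the memorizing model is perfect on the training set (1.00) but drops to 0.85 on the held-out set, which is the signature of overfitting; the constant predictor is poor on both (0.60 and 0.55), as in underfitting; and the single cut-off reaches 0.90 and 0.95, the closest of the three to the well-generalizing case.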
1.2 Types and Tasks

ML algorithms can be divided in different ways according to various aspects. First, algorithms can be categorized based on the learning paradigm, as illustrated in Fig. 2. The two basic paradigms are:

• Supervised: in supervised learning, human experts of the problem domain act as teachers of the algorithms by presenting the data and its respective labels so that the computational system learns the patterns. For example, to develop a medical system for diagnosing a disease, specialist doctors will determine positive and negative patients to teach the algorithm. These techniques are called predictive and aim to perform predictions by inducing a function $f: X \to Y$ from a training dataset X and a set of labels Y. In other words, a model is built to try to map the input attributes of an example $x_i$ to its target label $y_i$.
• Unsupervised: in this learning paradigm, there is no one to teach the computer, which means there are no labeled output attributes. The idea is to explore and detect patterns in the input data, identifying how the data are organized into group structures from an unlabeled dataset $X = \{x_1, \ldots, x_n\}$. Unlike their supervised counterparts, unsupervised ML techniques are referred to as descriptive and can help humans find and describe unknown patterns in the input data. For example, in financial data, consumers can be grouped according to their consumption similarities. The most common algorithms in this learning paradigm are clustering and association rule algorithms.
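As a small illustration of the unsupervised paradigm described in the list above, the sketch below groups consumers by spending with a plain k-means procedure on scalar values. The spending figures, the function name `kmeans_1d`, and the choice of the first two values as initial centroids are assumptions made for this example; k-means is used here only as one familiar clustering algorithm.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Plain k-means on scalar values; `centroids` holds the initial guesses."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its closest centroid.
        labels = [min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == k]
            updated.append(sum(members) / len(members) if members else old)
        if updated == centroids:   # nothing moved: converged
            break
        centroids = updated
    return labels, centroids


# Hypothetical monthly spending (arbitrary units) of six consumers.
spending = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centroids = kmeans_1d(spending, centroids=spending[:2])
print(labels, centroids)
```

No labels are supplied at any point: the two groups (low and high spenders) emerge only from the similarity between the values, which is what makes the technique descriptive rather than predictive.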
Fig. 2 Hierarchy of the main types of machine learning and their applications
In supervised learning, the main types of problems are categorized according to the domain of the label values. There are regression and classification problems. The former involve a quantitative result, so for each input instance, the model will estimate a continuous output value. For example, in a property valuation problem, given a set of input features, the regression algorithm can estimate the price of the property. In classification problems, models predict discrete values (i.e., classes or categories). Thus, input instances are categorized according to a finite set of classes. Classification problems can be separated into two types: (i) binary, in which there are two classes (e.g., positive or negative diagnosis for a disease); and (ii) multi-class, when there are more than two classes (e.g., plant species). Over the years, different kinds of learning methods have been proposed, mainly because of the growing demand for algorithms tailored to data and problems of different natures. Faceli et al. [1] divide the supervised algorithms into:

• Instance-based: algorithms in this category are the simplest. They use the similarity between a new instance and known instances to classify it. These algorithms are known as lazy, as they must keep all instances in memory to classify new instances without inducing a model. The most popular algorithm is the Nearest Neighbor (NN), which calculates the distance between the attributes of the instances to obtain the similarity.
• Probabilistic: these algorithms use statistical models to build predictive models. The main methods are based on Bayesian learning, founded on Bayes' theorem and using probabilistic models estimated from training data to predict the class of new instances.
• Symbolic: this type of learning aims to build symbolic representations (logical expressions) using samples. Many of these algorithms are based on decision trees, which solve sub-problems through simple decision rules and combine them in the solution of a complex problem.
• Optimization: these techniques use some optimization function to search for the hypotheses that represent the data. The learning task is modeled as an optimization problem in which the parameters of the model are iteratively adjusted, resulting in the minimization of a cost function that reports the discrepancy between the true and the predicted labels. Methods based on artificial neural networks, employing a biological metaphor of neural connections, have become the most popular since the introduction of deep learning. Support Vector Machines (SVMs), which solve a quadratic optimization problem to build separating hyperplanes, are also among the most popular and robust machine learning techniques.
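The optimization view in the last item of the list above can be made concrete with a minimal regression sketch: the parameters of a straight line are adjusted iteratively by gradient descent so that a cost function (here the mean squared error) decreases. The data, the function names, the learning rate, and the number of steps are assumptions chosen for illustration; real property-valuation models would use many features and a validated training procedure.

```python
def mean_squared_error(w, b, xs, ys):
    """Cost function: average squared gap between predictions and targets."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from price = 2 * size + 1 (arbitrary units).
sizes = [0.0, 1.0, 2.0, 3.0, 4.0]
prices = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(sizes, prices)
print(f"w={w:.4f} b={b:.4f} cost={mean_squared_error(w, b, sizes, prices):.2e}")
```

Because the toy data lie exactly on a line, the procedure recovers a slope close to 2 and an intercept close to 1. Neural networks follow the same pattern with many more parameters and a cost function suited to the task.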
2 Machine Learning in Electrochemistry

Electrochemical systems are promising for many applications, of which we emphasize sensors and biosensors. A sensor is defined by Baldassarre et al. [9] as a device used to gather information from biological, physical, or chemical changes and
then convert the information into a measurable signal [9]. Biosensors are a subclass of sensors that contain a recognition element; in electrochemical biosensors, sensing occurs by measuring changes in Faradaic or capacitive currents [10]. Electrochemical biosensors can be divided into two types according to the biorecognition principle: biocatalytic devices and affinity or non-catalytic biosensors [11–13]. In biocatalytic devices, the analyte–bioreceptor interaction generates a product. This type of biosensor is normally made with enzymes, whole cells, or tissue slices. In affinity biosensors, on the other hand, the analyte and the receptor (such as an antibody, nucleic acid, or cell receptor) bind irreversibly, and no new biochemical reaction product is formed [13]. Biosensors may also be categorized depending on the analytes or reactions they monitor. For this reason, the term biosensor is often used to cover chemical sensors that detect analytes of biological interest, even when they do not utilize a biological system directly [9, 14]. The relatively low cost and rapid response of these technologies make them useful in healthcare, environmental monitoring, agricultural applications, industrial analysis, and biological analysis [15, 16]. To achieve high performance in electrochemical sensors and biosensors, it is normally necessary to optimize electrode materials and detection principles and conditions [17]. Machine learning can be used to assist in the design and optimization of electrochemical sensors, in addition to the analysis of electrochemical data [17]. As already mentioned, ML is a data-driven method based on statistical learning theory that generates numerical models able to generalize to new data not used in the learning process [18–21]. The idea that a “machine can think” was initially proposed by Turing in his work on “Turing’s learning machine.” Concerning voltammetric methods, one may highlight the pioneering work by De Palma and Perone in 1979 [22, 23]. 
Progress in this topic was initially very slow, with further applications appearing only in 2006 [24]. More recently, Benjamin and co-workers reported a deep-learning (DL) algorithm that analyzes cyclic voltammograms and designates a probable electrochemical mechanism [25]. The proposed algorithm can be continuously refined and improved with experimental data, potentially including data contributed by the electrochemistry community [26]. In addition, incorporating power-line noise at 50/60 Hz into the DL model was proposed to increase its utility in practical applications [25]. Undoubtedly, ML methods could change the approach to mechanistic electrochemistry if applications become more adventurous, as commented upon by Compton [27]. Cyclic voltammetry, electrochemical impedance spectroscopy, and chronoamperometry are common detection techniques [25, 26], but the most widely used for studies of electrode processes is cyclic voltammetry [21]. Voltammograms supported by simulations have allowed the formulation of models to study biosensor processes and identify critical system parameters [21, 24, 26]. In biocatalytic electrochemical biosensors, which use enzymes due to their high biocatalytic activity and specificity [11], ML not only enhances ways to discover catalysts, but also allows for understanding the relationships between the properties of materials/compounds and their catalytic activity, selectivity, and stability. This simplifies the design of catalysts and biosensor materials [28]. The use of ML methods in electrochemical biosensors is still embryonic, but the prospects are excellent, including
to help minimize limitations of the biosensing technology. For instance, the signal in biosensors always has some noise, and their shelf life is limited owing to difficulties with biomolecule stability. Consequently, the reproducibility of sensors and biosensors may be low, especially in biological or complex samples, or in cases where there is interference from other molecules and the signal depends on ionic strength, temperature, and pH. To address these limitations, no effort should be spared in studying electrode interface phenomena [29]. In this context, ML may be a tool to enhance the performance of biosensors, particularly in complex matrices [29]. Figure 3 illustrates the possibilities of employing ML in electrochemical sensors, including categorization, anomaly detection, noise reduction, object identification, and pattern recognition. ML has been employed to design more efficient biosensors, as we explore later in this chapter for colorimetric biosensors [29]. An example of ML application is the work by Sheng et al. [30] on the electrochemical detection of maleic hydrazide employing an artificial neural network (ANN) [18]. In that work, glassy carbon electrodes modified with copper nanoparticles (CuNPs) on PEDOT-C4COOH films (carboxyl-functionalized poly(3,4-ethylenedioxythiophene)) detected maleic hydrazide in spiked samples of rice, potato, and cotton leaf. Guo et al. [31] used electrochemical impedance spectroscopy (EIS) and deep neural networks (DNN) to determine the levels of the B-type natriuretic peptide (BNP) for the diagnosis and management of heart failure. Du et al. [32] quantified mixed toxicants (MnCl2, NaNO2, and tetracycline hydrochloride (TCH)) at random concentrations using microbial electrochemical sensors, in which data analysis was performed with neural networks. 
The concentrations of glucose and lactate could be determined with an enzyme-free sensor whose chronoamperometry data were analyzed with a back propagation (BP) neural network [33]. Figure 4 shows the schematic design of a non-enzymatic electrochemical biosensor platform combined with a back propagation neural network. The biosensor platform consists of three non-enzymatic electrodes (NiO/Pt, Ni(OH)2/Au, and Ni(OH)2/Pt) integrated into a single electrochemical unit
Fig. 3 Illustration of the advantages of using ML in biosensors. Reproduced with permission from ref [29]
Fig. 4 Illustration for achieving highly specific non-enzymatic electrochemical sensing using neural networks. Reprinted from ref [33]
for glucose and lactic acid sensing [33]. The biosensor was tested with serum samples and yielded a relative standard deviation of less than 6.5%. In summary, the ML algorithms employed in work with electrochemical sensors and biosensors include ANN [34], deep learning with Convolutional Neural Networks (CNN) [35], Long Short-Term Memory (LSTM)/Fully Convolutional Networks (FCNs) [36], extreme learning machines (ELMs) [37], k-nearest neighbors (k-NN) [38], Multiclass Support Vector Machine (SVM) [39], and a few others [4, 40–42].
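For readers less familiar with the figure of merit quoted above, the relative standard deviation (RSD) is the sample standard deviation of replicate measurements divided by their mean, expressed as a percentage. The sketch below computes it for a set of replicate readings; the numbers are hypothetical and are not data from ref. [33].

```python
import statistics


def relative_standard_deviation(values):
    """RSD (%) = 100 * sample standard deviation / mean of the replicates."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate readings of the same sample (arbitrary units).
replicates = [9.8, 10.1, 10.0, 10.3, 9.9]
rsd = relative_standard_deviation(replicates)
print(f"RSD = {rsd:.2f}%")
```

A lower RSD indicates better repeatability of the measurement; the toy readings above give an RSD of about 1.9%.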
3 Machine Learning in Colorimetric Biosensors

Artificial intelligence is relevant for colorimetric biosensors since automated color intensity discrimination is required for point-of-care testing. The combination of self-learning algorithms with smart biosensors offers efficient classification of digital images based on extracted color features related to color change mechanisms. Computation-driven approaches using ML have the advantage of not requiring bulky equipment for analyses. Moreover, well-designed algorithms will assist with selecting important features, reducing signal noise, and performing qualitative and quantitative analysis. Here we consider existing strategies for applying artificial intelligence and ML algorithms to improve the specificity, sensitivity, and accuracy of colorimetric biosensors.
4 Colorimetric Biosensors

Colorimetric biosensors are optical sensors in which color changes occur when they are exposed to an analyte. This type of sensor is advantageous because the color change may be identified visually, in many cases without requiring instruments. By visualizing the color intensity on sensor devices, the presence of particular analytes can be identified in a cost-effective and field-deployable way [43]. Automated identification of color features can be used in assays to track a specific event based on color change, including biomolecules, cellular events, pathogens, and inorganic molecules [44–46]. Colorimetric testing has been applied in monitoring and analyzing soil, water, food, and biological fluids, as well as in quality and process control [42–44, 47]. Some known color-changing events are redox reactions, plasmon resonance [48], and ion complexation. Examples of molecules traced are antibodies, nucleic acids, enzymes, or their combination (e.g., peroxidase-conjugated antibodies). Despite the advantages mentioned above, important issues have to be addressed in challenging problems with complex matrices. Indeed, colorimetric assays are affected by signal noise and interferences that demand bulky instruments, such as UV-visible, fluorescence, or Raman spectrometers, to isolate a specific signal and remove the interference [49]. Furthermore, colorimetric detection performed in cuvettes, microplates, or flow-through cells is time-consuming [50] and may require specialized personnel [51]. To employ colorimetric platforms in disposable, user-friendly, affordable devices [44], one may resort to computational artificial intelligence methods to process data and mitigate problems with signal acquisition, noise reduction, and normalization. Of particular relevance is the possibility of extracting specific image features. A flowchart of the use of AI and ML in colorimetric biosensing is shown in Fig. 5.
5 AI Feature Extraction

The first step of smart biosensing is to obtain digital data that can be processed with AI algorithms. To this end, digital images of the biosensor are acquired with the frontal or rear camera of a smartphone. Smartphone apps can then extract and process valuable information from the digital images by applying color spaces. The combination of smart colorimetric biosensing and smartphones is a cost-effective alternative to conventional equipment-based techniques [47, 52]. These phone applications require a judicious choice of the color space model that best captures the color-change mechanism. A color space widely applied in biosensors is based on the theory of color perception by the human eye proposed by Thomas Young, in which color information is separated into three colors: Red, Green, and Blue (RGB) [53]. Conversion to the grayscale average of RGB is also applied to simplify data processing. It consists of taking the gray average value of each color before and after the reaction and using the difference as the signal for analysis [54]. Besides color, digital cameras also capture other biosensor phenomena associated with lighting conditions, which
Fig. 5 Workflow of AI biosensing. The contribution of artificial intelligence to biosensing analysis starts with image digitalization using camera sensors. Applications assist with the conversion of image pixels into features. Machine learning, an important segment of AI, helps identify potential image features and reduce dimensions. Principal component analysis and linear discriminant analysis are algorithms applied to this end. Moreover, ML tools also assist with detection, e.g., support vector machines (SVM), random forest (RF), and decision trees. Likewise, corresponding algorithms, such as support vector regression and regression trees (RT), will return multivariate functions to estimate an output value
may be associated with the color intensity in each image pixel [47]. Hue, saturation, and value/lightness (HSV/HSL) are useful pieces of information usually extracted from images [52]. Hue, for instance, is the attribute that describes the color itself, while saturation measures the amount of color. Finally, the amount of lightness or brilliance is termed Intensity or Value, and it is crucial to distinguish between darker and lighter colors [44]. Color spaces from a region of interest (ROI) are obtained manually with many color vision applications, such as World of Color, by Maarten Zonneveld (iOS), and Color Grab (Android), which extract features from digitized pictures [45]. Likewise, computer-based tools such as MATLAB, ImageJ, and Python have been used to extract specific color features automatically from large image data sets, avoiding tedious and time-consuming manual feature selection [55].
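The feature-extraction steps described in this section can be sketched with the Python standard library alone. The example below averages the RGB values of a region of interest (ROI), computes the gray-average difference before and after a reaction, and converts the mean color to HSV with `colorsys.rgb_to_hsv`. The pixel values and the helper names (`mean_rgb`, `gray_average`, `to_hsv`) are assumptions made for illustration; in practice the pixels would come from a camera image read with an imaging library.

```python
import colorsys


def mean_rgb(pixels):
    """Average (R, G, B) over the pixels of a region of interest."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def gray_average(rgb):
    """Simple grayscale value: the mean of the three channels."""
    return sum(rgb) / 3.0


def to_hsv(rgb):
    """Convert 0-255 RGB values to (hue, saturation, value), each in [0, 1]."""
    r, g, b = (channel / 255.0 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


# Hypothetical ROI pixels before (yellowish) and after (bluish) a reaction.
roi_before = [(250, 240, 60), (246, 236, 64)]
roi_after = [(60, 90, 200), (64, 94, 204)]

rgb_before = mean_rgb(roi_before)
rgb_after = mean_rgb(roi_after)

# Gray-average difference used as the analytical signal.
signal = gray_average(rgb_after) - gray_average(rgb_before)

hue_before = to_hsv(rgb_before)[0]
hue_after = to_hsv(rgb_after)[0]
print(f"signal={signal:.1f} hue_before={hue_before:.3f} hue_after={hue_after:.3f}")
```

Which channel or color space carries the most useful signal depends on the color-change mechanism of the assay, so the gray-average difference and the hue shift shown here are only two of several candidate features.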
6 ML Algorithms in Colorimetric Biosensors AI algorithms applied to digital images allow for extracting color data correlated to a specific event. In colorimetric biosensors, ML allows for novel ways of overcoming light exposure and camera settings issues by performing optimized feature extraction, decision-making, and self-learning [29, 56]. ML can improve detection even when there is a highly noisy color pattern in the image data [43]. It can be used for feature extraction, classification (set of categorical data), regression (set of continuous numeric variables), noise reduction, automated tasking, and exploratory
Recent Advances in Machine Learning for Electrochemical, Optical …
analysis. Moreover, with recent hardware advances, smartphones can reliably run sophisticated quantitative and qualitative colorimetric analysis algorithms or connect to more powerful data processing units via the cloud [56]. First, the color from colorimetric biosensors is digitized using the rear or front camera of a smartphone. Then, the digitized images are processed by ML applications. ML-based intelligent sensing platforms use custom algorithms that allow for straightforward application in the field regardless of illumination conditions [44]. ML-based algorithms distinguish themselves from conventional computational approaches by their ability to learn specific patterns from data and make predictions on different data sets [43]. Color spaces obtained from raw data can go through denoising, normalization, and rescaling before being converted into a feature vector and supplied to an ML model.
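The preprocessing chain mentioned above (denoising, normalization, rescaling) can be sketched as follows. This is only one plausible reading: the channel-wise median as denoiser, the division by a white reference patch, and the min-max rescaling are assumptions chosen for illustration, as are all function names and numbers.

```python
from statistics import median


def denoise(pixels):
    """Channel-wise median of the ROI pixels (robust to outlier pixels)."""
    return [median(channel) for channel in zip(*pixels)]


def normalize(rgb, white_reference):
    """Divide each channel by the reading of a white reference patch."""
    return [v / w for v, w in zip(rgb, white_reference)]


def rescale(rows):
    """Min-max rescale every feature (column) to the range 0-1."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column has zero span; use 1.0 to avoid division by zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [(v - lo) / span for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


white = (250, 250, 250)  # hypothetical white-patch reading
samples = [
    [(200, 100, 50), (202, 98, 50), (90, 90, 90)],  # third pixel is an outlier
    [(100, 150, 50), (100, 152, 50), (100, 150, 50)],
]
features = rescale([normalize(denoise(s), white) for s in samples])
```

Each row of `features` is then a feature vector that could be passed to a classifier or regressor.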
7 Machine Learning Selection of Color Spaces
Managing many feature dimensions from sample pictures may demand tremendous computing power. Following image digitization, many tools can be applied to process the raw data and enhance the most relevant features of the images. ML algorithms seek to methodically rank and focus on potential features according to their statistical value and contribution to predicting the expected output [57]. This process reduces the dimensionality of the training space and minimizes the computing power required to run applications. Likewise, it benefits the correlation of meaningful sensing information with highly discriminatory data. Multivariate analysis can be applied to image analysis to improve the visualization of significant features [52]. Dimension reduction algorithms, such as principal component analysis (PCA), transform several variables into a reduced number of uncorrelated variables termed principal components (PCs). This reduction minimizes the computing power required to perform tasks and, at the same time, preserves most of the initial information [42]. Linear discriminant analysis (LDA), on the other hand, reduces the number of variables for classification in a supervised way. LDA estimates the mean and variance of each category and applies Bayes' rule to find the most probable class; assuming that all classes have equal covariance leads to linear terms in the discriminant function [44]. LDA has been used to create a model for linear color classification by reducing the variance within the same concentration and enhancing the variance between distinct concentrations [58].
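For reference, the two reductions described above can be written compactly in their standard textbook form. The symbols below (feature vectors $x_i$, class means $\mu_k$, shared covariance $\Sigma$, class priors $\pi_k$) are notation assumed here for illustration and do not appear in the cited works.

```latex
% PCA: eigen-decomposition of the sample covariance of n feature vectors x_i
S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top},
\qquad S v_j = \lambda_j v_j, \quad \lambda_1 \ge \lambda_2 \ge \dots
% Scores on the first m principal components and the variance they retain
z_i = [v_1, \dots, v_m]^{\top} (x_i - \bar{x}),
\qquad \text{retained variance} = \frac{\sum_{j=1}^{m} \lambda_j}{\sum_{j} \lambda_j}

% LDA: Bayes' rule with Gaussian classes sharing one covariance \Sigma
% gives a discriminant that is linear in x
\delta_k(x) = x^{\top} \Sigma^{-1} \mu_k
            - \tfrac{1}{2}\, \mu_k^{\top} \Sigma^{-1} \mu_k + \ln \pi_k,
\qquad \hat{y}(x) = \arg\max_k \, \delta_k(x)
```

The quadratic term $x^{\top} \Sigma^{-1} x$ is the same for every class when the covariances are equal, so it cancels, which is why only linear terms in $x$ remain.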
8 Machine Learning-Assisted Colorimetric Testing
Computer vision applications are reliable in extracting color information from colorimetric biosensors. However, the most accurate detection depends on combining many image features rather than only a few color parameters [43]. This large-scale feature selection is challenging in basic color applications, demanding
E. M. Materón et al.
time-consuming manual or off-site computer-assisted feature processing to determine interrelations [43]. ML algorithms, on the other hand, can handle an extensive feature data set, identifying in a supervised approach the features that will improve detection. In the context of colorimetric biosensing, ML methods reduce noise associated with nonspecific color change interference and anomalies arising from different camera settings or lighting exposure [29]. In terms of analysis, ML colorimetric testing can be subdivided into two distinct methods: (1) image classification, which sorts samples into categories, and (2) signal regression, which returns a continuous multivariate function to estimate the output [29]. Both approaches rely on features extracted from colored images and use similar learning algorithms to deliver accurate analytical results by discovering hidden feature interrelations in noisy and overlapping signals [29]. However, classification models have proven more robust than concentration regression in colorimetric biosensing [47]. Image classification predicts the label of image samples. Regression creates a multivariate function from a calibration curve based on the interrelation of the color-space components contributing to the output. In linear signal regression, the output is predicted as a linear combination of the inputs, with coefficients obtained from learning. Multilinear regression (MLR) is one example of a regression model in which the prediction is based on multiple input variables. However, color feature signals extracted from images frequently follow nonlinear relationships, limiting the use of multilinear regression [54]. The supervised learning algorithm SVM relies on finding maximum-margin hyperplane boundaries in the feature space that correctly classify as many training points as possible, as illustrated in Fig. 6a [58, 59].
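To make the hyperplane idea concrete, the sketch below evaluates a linear SVM after training: a sample is classified by the side of the hyperplane w·x + b = 0 on which it falls, and the margin between the two supporting hyperplanes has width 2/||w||. The weights `w` and bias `b` are hypothetical values standing in for a trained model; training itself (the optimization that finds them) is not shown.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells the side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical trained parameters for two color features.
w, b = [3.0, 4.0], -5.0
label = classify(w, b, [2.0, 1.0])
width = margin_width(w)
```

A kernel SVM replaces the dot product with a kernel function, which is what allows nonlinear boundaries in the original feature space.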
SVM can be applied to nonlinear problems with inseparable variables that cannot be solved by PCA or LDA [45]. Like SVM, support vector regression (SVR) is a function-fitting model that uses kernels to perform nonlinear regression, avoiding the need to adopt a linear model [54]. SVR circumvents quantification obstacles by selecting inner product functions (kernels) in a high-dimensional space in which a linear decision function can be constructed, eliminating the prerequisite of linearly related variables [54]. SVR can be applied to colorimetric biosensors to improve quantitative detection by combining the best color components and estimating a continuous value based on a function [54]. SVR is a supervised algorithm trained with a symmetrical loss function that penalizes estimates deviating above and below the target equally. One of the advantages of SVM and SVR is their high generalization capacity, and SVR offers a tuned balance between learning and model complexity. Nevertheless, detection based on SVM and SVR may perform satisfactorily during training and lose accuracy when applied to different data sets or generalized to unseen data [48]. Decision tree (DT) models mitigate this overfitting by applying a set of decision rules learned from the data. The branching structure of DT models highlights deep interactions among color spaces. On the other hand, DT models still suffer from high variance, which leads to misclassification. The random forest (RF) algorithm represented in Fig. 6b is a more complex type of DT model capable of identifying
Fig. 6 a SVM draws hyperplane boundaries and uses radial basis functions to classify support vectors. SVM uses an optimization step to classify sample images by checking whether the picture belongs to a specific class. b RF can be used for both classification and regression models. In classification, each tree casts a class vote, and the majority vote determines the final class. In regression, the predictions from each decision tree at a target x coordinate are averaged to produce a function
outliers and relieving overfitting issues. RF algorithms apply a collection of decision trees to different color features and then average their outputs to reduce variance, which can be a problem in both classification and regression models. There are also regression trees based on DT and RF, which are non-parametric methods that require no assumptions about the predictors' underlying distributions. SVM and RF algorithms are robust for building models from training data of smartphone-taken pictures of colorimetric reactions [58]. SVM and RF applications classify colorimetric sample images regardless of smartphone brand, illumination conditions, and image rotation. SVM performs binary image classification, while RF computes multi-class classification. To classify colorimetric biosensors, sample images are processed to obtain information on color and illumination. A training model is then built from these features to classify the images irrespective of the illumination conditions. Employing a color normalizer, such as gray-world or gray-edge, has improved color constancy under varying illumination and increased accuracy [56].
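Two ideas from this passage lend themselves to a short sketch: how a forest combines its trees (majority vote for classes, averaging for regression) and how a gray-world normalizer balances the color channels. The code below is a minimal illustration under stated assumptions; the per-tree predictions are made-up inputs rather than outputs of real trained trees, and the gray-world step follows the common textbook formulation (scale each channel so that its mean equals the overall mean), not necessarily the exact variant used in [56].

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_votes):
    """Majority vote over the class labels predicted by individual trees."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Average of the numeric predictions of the individual trees."""
    return mean(tree_predictions)


def gray_world(pixels):
    """Scale each RGB channel so that its mean equals the overall mean."""
    n = len(pixels)
    channel_means = [sum(p[ch] for p in pixels) / n for ch in range(3)]
    gray = sum(channel_means) / 3.0
    gains = [gray / m if m else 1.0 for m in channel_means]
    # Clip at 255 so corrected values stay in the 8-bit range.
    return [
        tuple(min(255.0, p[ch] * gains[ch]) for ch in range(3))
        for p in pixels
    ]


# Hypothetical outputs of three trees for one sample image.
predicted_class = forest_classify(["low", "high", "high"])
predicted_value = forest_regress([1.0, 2.0, 3.0])

# Hypothetical pixels with a reddish cast from the light source.
balanced = gray_world([(100, 50, 50), (200, 100, 100)])
```

After `gray_world`, the three channel means coincide, which removes a uniform color cast before features are extracted.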
9 Deep Learning Colorimetric Detection
Deep learning algorithms are also applied to image analysis. In some biosensors, color changes are subtle. Linking color intensity with the outcome is difficult since shades might overlap, making it impossible to match the color to the analyte concentration [43]. Deep learning, in turn, allows machines to systematically discover hidden patterns in large datasets and automatically adjust the internal parameters to fit the expected outcome [29]. Another advantage of deep learning algorithms is that they can perform feature extraction and classification simultaneously,
Fig. 7 Simplified CNN image classifier. In this process, the image is converted into pixels representing the color spaces found in the picture. The features from each input are then passed to the hidden layers of the neural network and multiplied by weights (ω) to identify the critical features contributing to the outcome. The subsets of neuron connections with the highest weights then link these features to a specified output. The output layer displays the classification information. Once the neural network is trained, it can classify unlabeled images
which makes them simple to use and highly accurate. A convolutional neural network (CNN) is a type of artificial neural network (ANN), a model that mimics neurons in the way it interrelates variables [43, 58]. These neurons are organized in layers that multiply color features by weights and transfer the resulting values to the subsequent layers until the output is reached. A critical step in a CNN is finding optimized weights from the training data [58]. The process is repeated many times across the stacked layers until the CNN reaches a satisfactory classification, hence the term deep learning [48]. The use of a CNN for image classification is depicted in Fig. 7. It is especially convenient when the color change is not clearly apparent, as in the case of overlapping color reactions [43, 48]. Another application of CNNs involves uneven brightness caused by nonuniform light exposure, camera positioning, and other environmental issues. A CNN applied to image analysis improves color recognition and model accuracy automatically [60].
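The weighted combination of pixel values described above can be illustrated with the operation that gives CNNs their name: sliding a small weight matrix (kernel) over the image and summing the products at each position, followed by a ReLU activation. This is a minimal, dependency-free sketch; the kernel weights are hand-picked hypothetical values, whereas a real CNN learns them during training, and practical models rely on a deep learning library rather than nested lists.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical single-channel image: dark on the left, bright on the right.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
# Hypothetical weights that respond to a dark-to-bright vertical edge.
kernel = [[-1.0, 1.0], [-1.0, 1.0]]
feature_map = relu(conv2d(image, kernel))
```

The feature map is nonzero only where the window straddles the edge, which shows how a learned kernel can pick out a local color transition regardless of where it occurs in the picture.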
10 Gas Sensors with Machine Learning
The demand for gas monitoring systems has increased owing to the damage and air pollution caused by chemical gas leaks from industrial plants and by fossil fuel combustion. Such gases are used or released in many activities, including industrial production, meat processing, refrigeration systems, and agricultural waste handling [61–63]. Even at low concentrations, these gases are harmful to human health. To protect people from excessive exposure to these gases in the workplace, the American
Occupational Safety and Health Administration (OSHA) has established permissible exposure limits for toxic gases. It is therefore necessary to create sensitive, dependable, and effective gas sensors for monitoring gas concentrations in industrial settings [64, 65]. The challenges to be faced are noise reduction and increased shelf life and stability [66, 67], which can be addressed with the use of ML in data analysis [68–71]. ML algorithms may be combined with chemometric methods already employed in gas sensors [72–74], which may involve PCA, PCR, LDA, hierarchical clustering analysis (HCA), MLR, and partial least-squares discriminant analysis or regression (PLSDA or PLSR) [75, 76]. To illustrate how ML can improve gas sensors, consider the points below. ML can handle large amounts of sensing data for complicated matrices or samples, and it also makes it possible to achieve good analytical results from noisy, low-resolution, and highly overlapping sensing data. In particular, ML can be used to interpret raw sensing data from a gas sensor in a variety of ways:
• The algorithms can categorize the sensing signals into multiple groups based on the target analyte.
• Sample matrix and operating circumstances invariably influence gas sensor performance.
• Gas sensor signals shift in seconds or minutes, while electrical noise occurs in sub-seconds. ML models can distinguish the signal from noise.
• ML algorithms help analyze sensing data by revealing latent objects and patterns.
The application of ML in gas sensors is still in its infancy [77–82]. The prospects, nevertheless, are promising because ML can help solve problems with interferents in real samples and issues associated with the sensors themselves, including dependence on humidity and temperature, and limited stability. It has been recognized that one-dimensional data analysis is insufficient to obtain sensitive signals that are highly linked with analyte type and quantity.
This indicates an opportunity for ML to improve the accuracy and reliability of sensor measurements in real samples. Kang et al. [83] employed a CNN to identify patterns in sensor responses with real-time selectivity and an accuracy of 98%. Figure 8a depicts the fabrication of a gas sensor array via wafer-scale glancing angle deposition (GLAD) using four distinct materials: SnO2, In2O3, WO3, and CuO. Figure 8b shows a MEMS-based sensor device with a microheater platform that satisfies the high working temperature requirements of metal oxide gas sensors while consuming little power. The sensor array in Fig. 8c was used to obtain the data for training and classification using the CNN for six gases. Figure 8d, e shows the results: the gases were classified with an accuracy of 98.06%, and the gas concentration was predicted with an average error of 10.15%. Figure 8f shows the response transient with respect to concentration for different gases. Since the algorithms for pattern identification of sensor responses may be utilized in batch-uniform gas sensors, it is anticipated that the time and cost required for E-nose systems will be greatly reduced. Hence, this approach can be utilized for gas sensors in IoT home and industrial applications. Similarly, ML algorithms have been used to discriminate volatile organic compounds (VOCs) with a precision of 96.84% [84].
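Figures of merit such as the classification accuracy and average concentration error quoted above are typically derived from a confusion matrix and from paired true and predicted values. The sketch below shows one common way to compute them. The confusion matrix and concentrations are invented numbers, and the use of the mean relative error is an assumption, since the text does not state how the average error in [83] was defined.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def mean_relative_error(true_values, predicted_values):
    """Average of |predicted - true| / true over all samples."""
    errors = [abs(p - t) / t for t, p in zip(true_values, predicted_values)]
    return sum(errors) / len(errors)


# Hypothetical confusion matrix for three gases (rows: true, columns: predicted).
confusion = [
    [48, 2, 0],
    [1, 49, 0],
    [0, 0, 50],
]
acc = accuracy(confusion)

# Hypothetical true and predicted concentrations (same units).
err = mean_relative_error([10.0, 20.0], [11.0, 18.0])
```

Off-diagonal entries of the matrix also show which gases are confused with each other, information that a single accuracy figure hides.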
Fig. 8 a Schematic for the manufacture of highly batch-uniform semiconductor metal oxide (SMO) gas sensors by use of glancing angle deposition (GLAD). b Optical microscopic image of the suspended microheater platform-based gas sensors. c Classification and regression of target gases in real-time based on convolutional neural network (CNN) analysis of gas-sensing data by fabricating the sensor control module for heater control and sensor data acquisition. d Prediction of results of different gas types in the confusion matrix. e Normalized data in 0–10 for classification of test data. f Response time required by the proposed E-nose system to estimate the gas species for the test data set. Reprinted from ref [83]
In summary, ML algorithms can be used for classification, regression, and clustering of complicated gas analytes [85–87], as they are able to process databases with hundreds of input features, in contrast to traditional data regression methods, in which mathematical equations are employed to determine typically only a few variables. Since a considerable amount of data is required, especially for deep learning approaches, it is necessary to design and implement multiplex or high-throughput gas sensors, such as sensor arrays and E-nose systems. Databanks provided by federal institutions such as the American Occupational Safety and Health Administration (OSHA) are an additional essential data source for training ML algorithms.
Acknowledgments This work was supported by CNPq (402816/2020-0, 102127/2022-0, 115857/2022-2), CAPES, INEO, and FAPESP (2018/22214-6, 2021/08387-8).
References
1. Faceli, K., Lorena, A. C., Gama, J., & de Carvalho, A. C. P. D. L. F. D. (2021). Inteligência artificial: Uma abordagem de aprendizado de máquina (2nd ed.). LTC. https://www.grupogen.com.br/inteligencia-artificial-uma-abordagem-de-aprendizado-de-maquina?event-category=beon&event-action=details&event-label=produto_ultimos
2. Munir, K., Elahi, H., Ayub, A., Frezza, F., & Rizzi, A. (2019). Cancer diagnosis using deep learning: A bibliographic review. Cancers (Basel), 11, 1235. https://doi.org/10.3390/cancers11091235
3. Ribas, L. C., Riad, R., Jennane, R., & Bruno, O. M. (2022). A complex network based approach for knee Osteoarthritis detection: Data from the Osteoarthritis initiative. Biomedical Signal Processing and Control, 71, 103133. https://doi.org/10.1016/j.bspc.2021.103133
4. Rodrigues, V. C., Soares, J. C., Soares, A. C., Braz, D. C., Melendez, M. E., Ribas, L. C., Scabini, L. F. S., Bruno, O. M., Carvalho, A. L., Reis, R. M., Sanfelice, R. C., & Oliveira, O. N. (2021). Electrochemical and optical detection and machine learning applied to images of genosensors for diagnosis of prostate cancer with the biomarker PCA3. Talanta, 222, 121444. https://doi.org/10.1016/j.talanta.2020.121444
5. Nassif, A. B., Shahin, I., Attili, I., Azzeh, M., & Shaalan, K. (2019). Speech recognition using deep neural networks: A systematic review. IEEE Access, 7, 19143–19165. https://doi.org/10.1109/ACCESS.2019.2896880
6. Zhao, Z., Zheng, P., Xu, S., & Wu, X. (2019). Object detection with deep learning: A review. IEEE Transactions on Neural Networks and Learning Systems, 30, 3212–3232. https://doi.org/10.1109/TNNLS.2018.2876865
7. Song, L., Gong, D., Li, Z., Liu, C., & Liu, W. (2019). Occlusion robust face recognition based on mask learning with pairwise differential siamese network. In 2019 IEEE/CVF International Conference on Computer Vision (pp. 773–782). IEEE. https://doi.org/10.1109/ICCV.2019.00086
8. Soares, J. C., Soares, A. C., Rodrigues, V. C., Oiticica, P. R. A., Raymundo-Pereira, P. A., Bott-Neto, J. L., Buscaglia, L. A., de Castro, L. D. C., Ribas, L. C., Scabini, L., & Brazaca, L. C. (2021). Detection of a SARS-CoV-2 sequence with genosensors using data analysis based on information visualization and machine learning techniques. Materials Chemistry Frontiers, 5, 5506–5506. https://doi.org/10.1039/D1QM90058G
9. Baldassarre, A., Mucci, N., Lecca, L. I., Tomasini, E., Parcias-do-Rosario, M. J., Pereira, C. T., Arcangeli, G., & Oliveira, P. A. B. (2020). Biosensors in occupational safety and health management: A narrative review. International Journal of Environmental Research and Public Health, 17, 2461. https://doi.org/10.3390/ijerph17072461
10. Oliveira, O. N., Iost, R. M., Siqueira, J. R., Crespilho, F. N., & Caseli, L. (2014). Nanomaterials for diagnosis: Challenges and applications in smart devices based on molecular recognition. ACS Applied Materials and Interfaces, 6, 14745–14766. https://doi.org/10.1021/am5015056
11. Ronkainen, N. J., Halsall, H. B., & Heineman, W. R. (2010). Electrochemical biosensors. Chemical Society Reviews, 39, 1747–1763. https://doi.org/10.1039/b714449k
12. Wang, J. (2006). Electrochemical sensors. In Analytical electrochemistry (pp. 201–243). Wiley. https://doi.org/10.1002/0471790303.ch6
13. Naresh, V., & Lee, N. (2021). A review on biosensors and recent development of nanostructured materials-enabled biosensors. Sensors, 21, 1109. https://doi.org/10.3390/s21041109
14. Thévenot, D. R., Toth, K., Durst, R. A., & Wilson, G. S. (2001). Electrochemical biosensors: Recommended definitions and classification. Biosensors and Bioelectronics, 16, 121–131. https://doi.org/10.1016/S0956-5663(01)00115-4
15. Grieshaber, D., MacKenzie, R., Vörös, J., & Reimhult, E. (2008). Electrochemical biosensors—Sensor principles and architectures. Sensors, 8, 1400–1458. https://doi.org/10.3390/s80314000
16. Kimmel, D. W., LeBlanc, G., Meschievitz, M. E., & Cliffel, D. E. (2012). Electrochemical sensors and biosensors. Analytical Chemistry, 84, 685–707. https://doi.org/10.1021/ac202878q
17. Puthongkham, P., Wirojsaengthong, S., & Suea-Ngam, A. (2021). Machine learning and chemometrics for electrochemical sensors: Moving forward to the future of analytical chemistry. The Analyst, 146, 6351–6364. https://doi.org/10.1039/D1AN01148K
18. Ferguson, A. L. (2018). ACS central science virtual issue on machine learning. ACS Central Science, 4, 938–941. https://doi.org/10.1021/acscentsci.8b00528
19. Brown, K. A., Brittman, S., Maccaferri, N., Jariwala, D., & Celano, U. (2020). Machine learning in nanoscience: Big data at small scales. Nano Letters, 20, 2–10. https://doi.org/10.1021/acs.nanolett.9b04090
20. Sarker, I. H. (2021). Machine learning: Algorithms, real-world applications and research directions. SN Computer Science, 2, 160. https://doi.org/10.1007/s42979-021-00592-x
21. Bond, A. M., Zhang, J., Gundry, L., & Kennedy, G. F. (2022). Opportunities and challenges in applying machine learning to voltammetric mechanistic studies. Current Opinion in Electrochemistry, 34, 101009. https://doi.org/10.1016/j.coelec.2022.101009
22. DePalma, R. A., & Perone, S. P. (1979). Characterization of heterogeneous kinetic parameters from voltammetric data by computerized pattern recognition. Analytical Chemistry, 51, 829–832. https://doi.org/10.1021/ac50043a013
23. Meuwly, M. (2021). Machine learning for chemical reactions. Chemical Reviews, 121, 10218–10239. https://doi.org/10.1021/acs.chemrev.1c00033
24. Sapozhnikova, E. P., Bogdan, M., Speiser, B., & Rosenstiel, W. (2006). EChem++–An object-oriented problem solving environment for electrochemistry. 3. Classification of voltammetric signals by the Fuzzy ARTMAP neural network with respect to reaction mechanisms. Journal of Electroanalytical Chemistry, 588, 15–26. https://doi.org/10.1016/j.jelechem.2005.11.032
25. Hoar, B. B., Zhang, W., Xu, S., Deeba, R., Costentin, C., Gu, Q., & Liu, C. (2022). Electrochemical mechanistic analysis from cyclic voltammograms based on deep learning. ACS Measurement Science Au. https://doi.org/10.1021/acsmeasuresciau.2c00045
26. Semenova, D., Zubov, A., Silina, Y. E., Micheli, L., Koch, M., Fernandes, A. C., & Gernaey, K. V. (2018). Mechanistic modeling of cyclic voltammetry: A helpful tool for understanding biosensor principles and supporting design optimization. Sensors and Actuators B: Chemical, 259, 945–955. https://doi.org/10.1016/j.snb.2017.12.088
27. Chen, H., Kätelhön, E., Le, H., & Compton, R. G. (2021). Use of artificial intelligence in electrode reaction mechanism studies: Predicting voltammograms and analyzing the dissociative CE reaction at a hemispherical electrode. Analytical Chemistry, 93, 13360–13372. https://doi.org/10.1021/acs.analchem.1c03154
28. Toyao, T., Maeno, Z., Takakusagi, S., Kamachi, T., Takigawa, I., & Shimizu, K. (2020). Machine learning for catalysis informatics: Recent applications and prospects. ACS Catalysis, 10, 2260–2297. https://doi.org/10.1021/acscatal.9b04186
29. Cui, F., Yue, Y., Zhang, Y., Zhang, Z., & Zhou, H. S. (2020). Advancing biosensors with machine learning. ACS Sensors, 5, 3346–3364. https://doi.org/10.1021/acssensors.0c01424
30. Sheng, Y., Qian, W., Huang, J., Wu, B., Yang, J., Xue, T., Ge, Y., & Wen, Y. (2019). Electrochemical detection combined with machine learning for intelligent sensing of maleic hydrazide by using carboxylated PEDOT modified with copper nanoparticles. Microchimica Acta, 186, 543. https://doi.org/10.1007/s00604-019-3652-x
31. Guo, Z., Tian, R., Xu, W., Yip, D., Radyk, M., Santos, F. B., Yip, A., Chen, T., & Tang, X. S. (2022). Highly accurate heart failure classification using carbon nanotube thin film biosensors and machine learning assisted data analysis. Biosensors and Bioelectronics X, 12, 100187. https://doi.org/10.1016/j.biosx.2022.100187
32. Du, L., Yan, Y., Li, T., Liu, H., Li, N., & Wang, X. (2022). Machine learning enables quantification of multiple toxicants with microbial electrochemical sensors. ACS ES&T Engineering, 2, 92–100. https://doi.org/10.1021/acsestengg.1c00287
33. Zhou, Z., Wang, L., Wang, J., Liu, C., Xu, T., & Zhang, X. (2022). Machine learning with neural networks to enhance selectivity of nonenzymatic electrochemical biosensors in multianalyte mixtures. ACS Applied Materials and Interfaces. https://doi.org/10.1021/acsami.2c17593
34. Xu, L., He, J., Duan, S., Wu, X., & Wang, Q. (2016). Comparison of machine learning algorithms for concentration detection and prediction of formaldehyde based on electronic nose. Sensor Review, 36, 207–216. https://doi.org/10.1108/SR-07-2015-0104
35. Yang, Z., Miao, N., Zhang, X., Li, Q., Wang, Z., Li, C., Sun, X., & Lan, Y. (2021). Employment of an electronic tongue combined with deep learning and transfer learning for discriminating the storage time of Pu-erh tea. Food Control, 121, 107608. https://doi.org/10.1016/j.foodcont.2020.107608
36. Dean, S. N., Shriver-Lake, L. C., Stenger, D. A., Erickson, J. S., Golden, J. P., & Trammell, S. A. (2019). Machine learning techniques for chemical identification using cyclic square wave voltammetry. Sensors, 19, 2392. https://doi.org/10.3390/s19102392
37. Daliri, M. R. (2015). Combining extreme learning machines using support vector machines for breast tissue classification. Computer Methods in Biomechanics and Biomedical Engineering, 18, 185–191. https://doi.org/10.1080/10255842.2013.789100
38. Durante, G., Becari, W., Lima, F. A. S., & Peres, H. E. M. (2016). Electrical impedance sensor for real-time detection of bovine milk adulteration. IEEE Sensors Journal, 16, 861–865. https://doi.org/10.1109/JSEN.2015.2494624
39. Islam, M., Wahid, K., & Dinh, A. (2018). Assessment of ripening degree of avocado by electrical impedance spectroscopy and support vector machine. Journal of Food Quality, 2018, 1–9. https://doi.org/10.1155/2018/4706147
40. Murphy, E. K., Mahara, A., Khan, S., Hyams, E. S., Schned, A. R., Pettus, J., & Halter, R. J. (2017). Comparative study of separation between ex vivo prostatic malignant and benign tissue using electrical impedance spectroscopy and electrical impedance tomography. Physiological Measurement, 38, 1242–1261. https://doi.org/10.1088/1361-6579/aa660e
41. Leon-Medina, J. X., Anaya, M., Pozo, F., & Tibaduiza, D. (2020). Nonlinear feature extraction through manifold learning in an electronic tongue classification task. Sensors, 20, 4834. https://doi.org/10.3390/s20174834
42. Schackart, K. E., & Yoon, J. (2021). Machine learning enhances the performance of bioreceptor-free biosensors. Sensors, 21, 5519. https://doi.org/10.3390/s21165519
43. Gunda, N. S. K., Gautam, S. H., & Mitra, S. K. (2019). Editors' choice—Artificial intelligence based mobile application for water quality monitoring. Journal of the Electrochemical Society, 166, B3031–B3035. https://doi.org/10.1149/2.0081909jes
44. Mercan, Ö. B., Kılıç, V., & Şen, M. (2021). Machine learning-based colorimetric determination of glucose in artificial saliva with different reagents using a smartphone coupled μPAD. Sensors and Actuators B: Chemical, 329, 129037. https://doi.org/10.1016/j.snb.2020.129037
45. Xu, Z., Wang, K., Zhang, M., Wang, T., Du, X., Gao, Z., Hu, S., Ren, X., & Feng, H. (2022). Machine learning assisted dual-emission fluorescence/colorimetric sensor array detection of multiple antibiotics under stepwise prediction strategy. Sensors and Actuators B: Chemical, 359, 131590. https://doi.org/10.1016/j.snb.2022.131590
46. Zhou, Y., Yuan, Y., Wu, Y., Li, L., Jameel, A., Xing, X., & Zhang, C. (2022). Encoding genetic circuits with DNA barcodes paves the way for machine learning-assisted metabolite biosensor response curve profiling in yeast. ACS Synthetic Biology, 11, 977–989. https://doi.org/10.1021/acssynbio.1c00595
47. Khanal, B., Pokhrel, P., Khanal, B., & Giri, B. (2021). Machine-learning-assisted analysis of colorimetric assays on paper analytical devices. ACS Omega, 6, 33837–33845. https://doi.org/10.1021/acsomega.1c05086
48. Revignas, D., & Amendola, V. (2022). Artificial neural networks applied to colorimetric nanosensors: An undergraduate experience tailorable from gold nanoparticles synthesis to optical spectroscopy and machine learning. Journal of Chemical Education. https://doi.org/10.1021/acs.jchemed.1c01288
49. Hyeon, D., Kim, Y., Hun, H., Lee, B., Suh, S., Hyuk, J., & Heon, J. (2022). Automatic quantification of living cells via a non-invasive achromatic colorimetric sensor through machine learning-assisted image analysis using a smartphone. Chemical Engineering Journal, 450, 138281. https://doi.org/10.1016/j.cej.2022.138281
50. Pohanka, M. (2020). Colorimetric hand-held sensors and biosensors with a small digital camera as signal recorder, a review. Reviews in Analytical Chemistry, 39, 20–30. https://doi.org/10.1515/revac-2020-0111
51. Sajed, S., Kolahdouz, M., Sadeghi, M. A., & Razavi, S. F. (2020). High-performance estimation of lead ion concentration using smartphone-based colorimetric analysis and a machine learning approach. ACS Omega, 5, 27675–27684. https://doi.org/10.1021/acsomega.0c04255
52. Helfer, G. A., Magnus, V. S., Böck, F. C., Teichmann, A., Ferrão, M. F., & da Costa, A. B. (2017). PhotoMetrix: An application for univariate calibration and principal components analysis using colorimetry on mobile devices. Journal of the Brazilian Chemical Society, 28, 328–335. https://doi.org/10.5935/0103-5053.20160182
53. Leng, Y., Cheng, J., Liu, C., Wang, D., Lu, Z., Ma, C., Zhang, M., Dong, Y., Xing, X., Yao, L., & Chen, Z. (2021). A rapid reduction of Au(I→0) strategy for the colorimetric detection and discrimination of proteins. Microchimica Acta, 188, 1–9. https://doi.org/10.1007/s00604-021-04906-x
54. Liu, T., Jiang, H., & Chen, Q. (2022). Input features and parameters optimization improved the prediction accuracy of support vector regression models based on colorimetric sensor data for detection of aflatoxin B1 in corn. Microchemical Journal, 178, 107407. https://doi.org/10.1016/j.microc.2022.107407
55. Chary, R. V. R. (2012). Feature extraction methods for color image similarity. Advanced Computing: An International Journal, 3, 147–157. https://doi.org/10.5121/acij.2012.3215
56. Solmaz, M. E., Mutlu, A. Y., Alankus, G., Kılıç, V., Bayram, A., & Horzum, N. (2018). Quantifying colorimetric tests using a smartphone app based on machine learning classifiers. Sensors and Actuators B: Chemical, 255, 1967–1973. https://doi.org/10.1016/j.snb.2017.08.220
57. Ballard, Z., Brown, C., Madni, A. M., & Ozcan, A. (2021). Machine learning and computation-enabled intelligent sensor design. Nature Machine Intelligence, 3, 556–565. https://doi.org/10.1038/s42256-021-00360-9
58. Kim, H., Awofeso, O., Choi, S., Jung, Y., & Bae, E. (2017). Colorimetric analysis of saliva-alcohol test strips by smartphone-based instruments using machine-learning algorithms. Applied Optics, 56, 84–92. https://doi.org/10.1364/AO.56.000084
59. Vapnik, V. (1998). The support vector method of function estimation. In J. A. K. Suykens & J. Vandewalle (Eds.), Nonlinear modeling: Advanced black-box techniques (pp. 55–85). Springer. https://doi.org/10.1007/978-1-4615-5703-6_3
60. Hu, Z., Fang, W., Gou, T., Wu, W., & Hu, J. (2019). A novel method based on a Mask R-CNN model for processing dPCR images. Analytical Methods, 11, 3410–3418. https://doi.org/10.1039/c9ay01005j
61. United States Environmental Protection Agency. (2018). Sources of Greenhouse gas emissions | greenhouse gas (GHG) emissions | US EPA. Greenhouse Gas Emissions.
62. Kumar, S., Choudhury, S., & Pandey, V. (2019). A study on the horrendous industrial mass disaster at union carbide plant of Bhopal in light of ethical dimension. Indian Journal of Public Health Research and Development. https://doi.org/10.5958/0976-5506.2019.01251.8
63. Yandrapu, V. P., & Kanidarapu, N. R. (2022). Energy, economic, environment assessment and process safety of methylchloride plant using Aspen HYSYS simulation model. Digital Chemical Engineering. https://doi.org/10.1016/j.dche.2022.100019
64. Zhang, H., & Srinivasan, R. (2020). A systematic review of air quality sensors, guidelines, and measurement studies for indoor air quality management. Sustainability. https://doi.org/10.3390/su12219045
65. Wienemann, E., & Wartmann, A. (2021). Alcohol prevention in the workplace: Current workplace concepts for addiction prevention and addiction assistance programmes. Bundesgesundheitsblatt, Gesundheitsforschung, Gesundheitsschutz. https://doi.org/10.1007/s00103-021-03337-6
66. Kim, S. J., Koh, H. J., Ren, C. E., Kwon, O., Maleski, K., Cho, S. Y., Anasori, B., Kim, C. K., Choi, Y. K., Kim, J., Gogotsi, Y., & Jung, H. T. (2018). Metallic Ti3C2Tx MXene gas sensors with ultrahigh signal-to-noise ratio. ACS Nano. https://doi.org/10.1021/acsnano.7b07460
Recent Advances in Machine Learning for Electrochemical, Optical …
137
67. Shin, W., Hong, S., Jung, G., Jeong, Y., Park, J., Kim, D., Jang, D., Park, B. G., & Lee, J. H. (2021). Improved signal-to-noise-ratio of FET-type gas sensors using body bias control and embedded micro-heater. Sensors Actuators B Chemical. https://doi.org/10.1016/j.snb.2020. 129166 68. Srivastava, S. (2021). Effect on neural pattern classifier for intelligent gas sensor by increasing number of hidden layer. International Journal of Research in Applied Science and Engineering Technology. https://doi.org/10.22214/ijraset.2021.37583 69. Xiong, L., & Compton, R. G. (2014). Amperometric gas detection: A review. International Journal of Electrochemical Science. 70. Song, Z., Ye, W., Chen, Z., Chen, Z., Li, M., Tang, W., Wang, C., Wan, Z., Poddar, S., Wen, X., Pan, X., Lin, Y., Zhou, Q., & Fan, Z. (2021). Wireless self-powered high-performance integrated nanostructured-gas-sensor network for future smart homes. ACS Nano. https://doi. org/10.1021/acsnano.1c01256 71. Kato, Y., & Mukai, T. (2007). A real-time intelligent gas sensor system using a nonlinear dynamic response. Sensors Actuators B Chemical. https://doi.org/10.1016/j.snb.2006.03.021 72. Shafii, N. Z., Saudi, A. S. M., Pang, J. C., Abu, I. F., Sapawe, N., Kamarudin, M. K. A., & Saudi, H. F. M. (2019). Application of chemometrics techniques to solve environmental issues in Malaysia. Heliyon. https://doi.org/10.1016/j.heliyon.2019.e02534 73. Aleixandre-Tudo, J. L., Castello-, L., Aleixandre, J. L., & Aleixandre-, R. (2022). Chemometrics in food science and technology: A bibliometric study. Chemometrics and Intelligent Laboratory Systems. https://doi.org/10.1016/j.chemolab.2022.104514 74. Morita, S. (2020). Chemometrics and related fields in python. Analytical Sciences. https://doi. org/10.2116/analsci.19R006 75. Roy, M., & Yadav, B. K. (2022). Electronic nose for detection of food adulteration: a review. Journal of Food Science and Technololgy. https://doi.org/10.1007/s13197-021-05057-w 76. 
Oleneva, E., Kuchmenko, T., Drozdova, E., Legin, A., & Kirsanov, D. (2020). Identification of plastic toys contaminated with volatile organic compounds using QCM gas sensor array. Talanta. https://doi.org/10.1016/j.talanta.2019.120701 77. Thomas, S., Joshi, N., & Vijay, T. (Eds.). Functional nanomaterials advances in gas sensing technologies. Springer Singapore. https://doi.org/10.1007/978-981-15-4810-9 78. Materon, E. M., Ibáñez-Redín, G., Joshi, N., Gonçalves, D., Oliveira, O. N., & Faria, R. C. (2020). Analytical detection of pesticides, pollutants, and pharmaceutical waste in the environment. https://doi.org/10.1007/978-3-030-38101-1_3 79. Materón, E. M., Lima, R. S., Joshi, N., Shimizu, F. M., & Oliveira, O. N. (2019). Chapter 13— Graphene-containing microfluidic and chip-based sensor devices for biomolecules. In A. Pandikumar, P.B.T.-G.-B.E.S. for B. Rameshkumar (Eds.), Micro and nano technologies (pp. 321–336). Elsevier. https://doi.org/10.1016/B978-0-12-815394-9.00013-3 80. Joshi, N., Pransu, G., & Adam Conte-Junior, C. (2022). Critical review and recent advances of 2D materials-based gas sensors for food spoilage detection. Critical Reviews in Food Science and Nutrition, 1–24. https://doi.org/10.1080/10408398.2022.2078950 81. Joshi, N., Braunger, M. L., Shimizu, F. M., Riul, A., & Oliveira, O. N. (2021). Insights into nanoheterostructured materials for gas sensing: A review. Multifunctional Materials, 4, 032002. https://doi.org/10.1088/2399-7532/ac1732 82. Joshi, N., Hayasaka, T., Liu, Y., Liu, H., Oliveira, O. N., & Lin, L. (2018). A review on chemiresistive room temperature gas sensors based on metal oxide nanostructures, graphene and 2D transition metal dichalcogenides, Microchimica Acta, 185. 83. Kang, M., Cho, I., Park, J., Jeong, J., Lee, K., Lee, B., Del Orbe Henriquez, D., Yoon, K., & Park, I. (2022). High accuracy real-time multi-gas identification by a batch-uniform gas sensor array and deep learning algorithm. ACS Sensors. 
https://doi.org/10.1021/acssensors.1c01204 84. Devabharathi, N., Parasuraman, R., Umarji, A. M., & Dasgupta, S. (2021). Ultra-high response ethanol sensors from fully-printed co-continuous and mesoporous tin oxide thin films. Journal of Alloys and Compdounds. https://doi.org/10.1016/j.jallcom.2021.158815 85. Potyrailo, R. A., Brewer, J., Cheng, B., Carpenter, M. A., Houlihan, N., & Kolmakov, A. (2020). Bio-inspired gas sensing: Boosting performance with sensor optimization guided by “machine learning.” Faraday Discussions. https://doi.org/10.1039/d0fd00035c
138
E. M. Materón et al.
86. Reynolds, M., Duarte, L. M., Coltro, W. K. T., Silva, M. F., Gomez, F. J. V., & Garcia, C. D. (2020). Laser-engraved ammonia sensor integrating a natural deep eutectic solvent. Microchemical Journal. https://doi.org/10.1016/j.microc.2020.105067 87. Khorramifar, A., Rasekh, M., Karami, H., Malaga-Toboła, U., & Gancarz, M. (2021). A machine learning method for classification and identification of potato cultivars based on the reaction of MOS type sensor-array. Sensors. https://doi.org/10.3390/s21175836
Perovskite-Based Materials for Photovoltaic Applications: A Machine Learning Approach Ramandeep Kaur, Rajan Saini, and Janpreet Singh
Abstract The future of our planet depends greatly on sustainable energy sources and environmental preservation. Modern society's primary energy source is fossil fuels, which emit enormous amounts of carbon dioxide and contribute significantly to global warming. Global concern about the environment and the increasing demand for energy are driving technological advances in renewable energy and opening up new possibilities for its use. Solar energy remains the most abundant, inexhaustible, and clean form of renewable energy. In this context, scientists and engineers across the world are working toward the development of highly efficient and cost-effective photovoltaic devices. As we move from the first to the third generation of solar cells, production cost decreases, but efficiency is also reduced. In the past few years, perovskites have emerged as outstanding materials for photovoltaic applications. Halide perovskites have been reported to exhibit a power conversion efficiency of 25.5% owing to their excellent defect tolerance, high optical absorption, low recombination, and long carrier diffusion lengths. Furthermore, halide perovskite materials are more affordable and easier to fabricate than classic silicon-based solar cells. Thus, it is of paramount significance to design advanced perovskite materials with higher photovoltaic efficiency. Although mixed lead-free and inorganic perovskites have been established as promising photovoltaic materials, their enormous composition space makes it difficult to find compositions with the desired bandgap and photovoltaic parameters.
The bottleneck impeding this advancement can be addressed by either (1) following a trial-and-error approach, collecting enormous amounts of experimental data to design advanced materials with the desired properties for high-efficiency photovoltaic devices, or (2) combining the strengths of experimental materials science and machine learning to understand the underlying compositional and structural descriptors governing the efficiency of these devices. This chapter includes a brief discussion of perovskite materials and the developments made in lead-free perovskites for photovoltaics. The discussion then turns to the collection and analysis of materials data and extends to the descriptors used to describe the performance and properties of lead-free perovskites. The overarching aim of this chapter is to discuss how ML can be used to design advanced lead-free perovskites with desirable bandgaps and stabilities.

Keywords Perovskites · Photovoltaic devices · Machine learning

R. Kaur (B) · R. Saini · J. Singh, Department of Physics, Akal University, Talwandi Sabo, Punjab 151302, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_7
1 Introduction

Throughout history, new materials and energy sources have been the driving force behind technological advances. A wide range of sectors in our society, including transportation, manufacturing, agriculture, food production, housing, and commerce, depend on reliable and affordable energy sources. This energy may be thermal, electrical, chemical, nuclear, or light energy. Before the industrial revolution, wood was the primary source of energy. After the invention of the steam engine, coal became the preferred source. With the invention of internal combustion engines, petroleum products such as gasoline and diesel, along with natural gas, came to meet people's energy needs. Fossil fuels like coal, oil, and gas are either burned directly or converted into electricity, which is then delivered to homes and businesses. Electrical energy is the most convenient form of energy because it can be transformed into all other forms, and it is one of the most flexible in terms of how it can be transmitted, distributed, and controlled.

As the world's population and industrialization grow, so does the need for energy. Burning fossil fuels produces greenhouse gases, which harm the environment and worsen global warming. The supply of and demand for fossil fuels are also out of balance: the world's reserves are limited, but our need for these resources is unlimited. At the current rate of ever-increasing fossil fuel consumption, our activities on earth can last at most another century or two. Researchers are therefore working hard to improve energy production from a variety of renewable sources without putting the environment at risk or depleting natural resources.

Considering all these facts, and the fact that most countries have plenty of sunlight, solar photovoltaic (PV) technology stands out as the most promising method for producing energy. As a source of renewable energy, solar PV has attracted a great deal of scientific attention and has grown quickly in the last few decades. Today, solar energy makes up between 2 and 3% of the world's energy use. Solar cells are seen as one of the most promising ways to obtain energy that will not run out. Their success depends heavily on the availability of suitable materials, and recent improvements in solar PV technology show promise for helping the world meet its energy needs in the near future.

In photovoltaic technology, silicon is the most widely used material; it has a band gap of 1.1 eV and a conversion efficiency of more than 30% [1]. However, Si solar panels are produced at extremely high temperatures, which leads to high production costs. The rigidity of silicon solar cells under mechanical deformation further restricts their use. For these reasons, it is crucial to identify alternative solar cell materials that are more efficient, adaptable, and cost-effective to produce. To compete with Si PV, any new technology must provide significant device stability, lasting at least 10–20 years.

In the past few years, halide perovskites have emerged as promising optoelectronic materials, especially for solar cells and light-emitting diodes [2, 3]. Perovskite solar cells (PSCs) have progressed rapidly from 4% efficiency in 2009 to 25.5% efficiency today [2, 4]. Photovoltaic devices have been developed around perovskite materials because their band gaps are well suited to absorbing solar light. The success of this class of materials is largely attributed to their exceptional qualities, which include suitable and tunable bandgaps, large optical absorption coefficients, appreciable diffusion lengths, small and balanced charge carrier effective masses, compositional flexibility, high defect tolerance, and processability in low-temperature solutions [5–7]. Owing to the fundamental characteristics of the ABX3 structure, the halide perovskites in PSCs show a remarkably high level of defect tolerance, which makes the manufacturing of solar cells easier.
Perovskites, on the other hand, are notorious for their structural instability in the presence of moisture and an electric field [8]. Since it is challenging to find a solar cell material that combines high stability with the optimal band gap, a great deal of effort has been dedicated to developing new materials with improved stability and band structure. Because there are so many ways to combine elements, there may still be undiscovered perovskite compounds that can absorb sunlight. Until now, high-throughput first-principles calculations have been used to identify prospective perovskite materials for harvesting solar light; however, because of their long processing times, these calculations have left regions of the perovskite materials space unexplored.

Even though lead halide perovskites have great potential for use in optoelectronics, their widespread commercialization is hindered by two key obstacles: poor stability and lead toxicity [9, 10]. One way to solve these problems is to find stable, lead-free alternatives to lead halide perovskites that have the same optoelectronic properties. Finding new environmentally friendly materials suitable for PSC or PSC-like technology is the only way to address the toxicity issue. In light of the rapid progress in PSCs based on halide perovskites and the desire to preserve most of the advantages of lead-based halide perovskites, novel lead-free compounds are being explored as alternatives to APbX3 perovskite materials.

Among all the potential materials for solar cell technology, perovskites have gained much attention due to their high solar conversion efficiency. According to the Shockley-Queisser limit, a photovoltaic cell with the best band gap can convert sunlight into electricity with an efficiency of 33.5% [11]. To find and develop viable alternatives to lead halide perovskites, computational efforts are required. More than 1622 different crystals with either a single or double perovskite structure have been described so far [12]. An estimate based on known crystal structures and geometric factors suggests that there may be 100,000 unknown compounds with a perovskite structure. This implies that only ~1.7% of all perovskite compounds have been discovered to date. The perovskites found so far are just the tip of the iceberg: about 90,000 compounds still have not been investigated.

Experiments used to be the most important way to find and learn about new materials. Because experimental research needs a lot of resources and tools, it must be carried out over long periods of time on very few materials. Because of these limits, most important discoveries were made by chance or by intuition. The experimental search for new compounds is a time-consuming and expensive endeavor that resembles trial and error. To ease the process, many theoretical and empirical approaches have been developed. The goal is to eliminate compounds that are unlikely to work and to focus research on those that are more promising. High-throughput (HT) computational materials design may be used to predict Pb-free perovskite materials for PV applications by combining quantum-mechanical and thermodynamic computations with methodologies based on database development and intelligent data mining. Nevertheless, these computations are nowhere near direct modeling of materials, and they often cover only a tiny portion of the actual design challenge.
Importantly, descriptors can be calculated at the density functional theory (DFT) level to test potential materials for important properties such as stability, light absorption, carrier mobility, low cost, and nontoxicity. Thanks to the ongoing growth of computing power, the high-throughput (HT) computational materials discovery paradigm has recently emerged as a viable and efficient method of finding novel functional materials, particularly perovskite materials. Tens of thousands of novel perovskite-based compounds have been predicted in this way for use in solar cells. In the HT computational approach, first-principles calculations are used to build an extensive materials database containing both real and hypothetical materials. These computational methods, however, are again very time-consuming.

As a new generation of materials design strategy, a machine learning (ML) driven scheme can achieve high-precision materials discovery in a “cheap” way, without deep physical and chemical knowledge, based only on existing data and appropriate algorithms. Furthermore, ML provides an effective way for researchers to gain a deeper understanding of complex material properties by exploiting the structure-property relationships hidden in the data, offering a unique tool for finding more suitable descriptors of material properties without being limited to the framework of existing knowledge. In general, high-quality training data are needed to develop a good machine learning model, and the accuracy of the model depends on the diversity of the data.
2 Implementing Machine Learning in Pb-Free Perovskites for Photovoltaic Applications

Material properties determine which materials should be designed for a particular application. A prerequisite for the machine learning process is identifying accurate features that correlate strongly with the target properties. To address the common problems that prototypical lead halide perovskites face, stability and nontoxicity are taken into account when designing halide perovskite-like materials for optoelectronics. Optical and electronic properties then dictate the fundamental functionality of materials in optoelectronic applications. Material properties are described by computationally viable descriptors, which are crucial for filtering the large repository of calculated materials and choosing those with desirable properties. The descriptors, or features, can be selected through feature engineering. Machine learning algorithms are then used to develop validated mappings that connect problem-relevant aspects of a material's composition, structure, morphology, processing, etc., to the target property or performance criteria, bypassing traditional experimental and computational approaches that are time and resource intensive.

Four main steps in the systematic implementation of ML are covered in our study of perovskites: defining the study objectives, creating the dataset, choosing the descriptors that describe the data, and choosing the ML approach. The specifics of such general frameworks are available from a number of ML sources, with some variations in terminology and step sequencing. Figure 1 gives the general framework for the discovery of lead-free stable perovskites.
Fig. 1 Framework for lead-free stable perovskite design with ML in combination with DFT. A blue box illustrates the material screening process. A machine learning algorithm is developed based on data gathered in the past. The green boxes show the electronic properties and stability of the candidates using DFT [13]
2.1 Targeted Properties

There is a direct correlation between material properties and material applications. To implement machine learning successfully, the objectives must be clearly stated before the relevant steps, including the selection of the most appropriate machine learning technique, can be undertaken. With so many possible candidates, it is best to first apply the charge neutrality condition to eliminate the many compounds that cannot be made because they are not electrically neutral. This greatly reduces the number of candidates. The perovskite structure is represented by the chemical formula ABX3, where the A cations generally have larger radii and 12-fold coordination, while the relatively smaller B metal ions occupy sixfold-coordinated positions in the oxygen octahedra. The A-site ions typically have +2 or +3 charge states, while the B-site cations have +4 or +3 charge states. For charge neutrality, the ions should satisfy the criterion $q_A + q_B + 3q_X = 0$, where $q_A$, $q_B$, and $q_X$ are the charges on the A-, B-, and X-site ions, respectively [14].

The first step in designing perovskite-based photovoltaic materials is to determine how stable they are; instability is also one of the problems that keep perovskites from practical use. Perovskite stability is mainly evaluated in terms of three different aspects: (1) structural stability (or formability), (2) thermodynamic stability, and (3) dynamic stability. Structural stability is primarily evaluated on the basis of simple structural features. In ideal perovskite structures, the A and B sites are occupied by two different cations with different ionic radii, and in the ideal cubic structure the B-site cation sits in an octahedral environment of six X anions. The ionic radii of the A, B, and X ions must meet certain requirements in order to maintain a stable perovskite structure.
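The charge-neutrality screen described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow: the element lists, the single oxidation state assigned to each element, and the function names `is_charge_neutral` and `neutral_candidates` are all assumptions made for the example.

```python
from itertools import product

# Illustrative oxidation states only (an assumption for this sketch);
# a real screen would take them from a curated table.
A_SITE = {"Cs": +1, "Ba": +2}
B_SITE = {"Sn": +2, "Ge": +2, "Bi": +3, "Ti": +4}
X_SITE = {"I": -1, "Br": -1, "O": -2}


def is_charge_neutral(q_a, q_b, q_x):
    """Return True if an ABX3 composition satisfies q_A + q_B + 3*q_X = 0."""
    return q_a + q_b + 3 * q_x == 0


def neutral_candidates(a_site, b_site, x_site):
    """Enumerate every A/B/X combination and keep the charge-neutral ones."""
    return [
        (a, b, x)
        for (a, q_a), (b, q_b), (x, q_x) in product(
            a_site.items(), b_site.items(), x_site.items()
        )
        if is_charge_neutral(q_a, q_b, q_x)
    ]


candidates = neutral_candidates(A_SITE, B_SITE, X_SITE)
for a, b, x in candidates:
    print(f"{a}{b}{x}3")
```

With these toy inputs, 5 of the 24 possible combinations survive, which shows how quickly a simple neutrality check can shrink the candidate space before any more expensive calculation is run.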
In 1926, Goldschmidt [15] proposed an empirical formula based on crystal structure to describe the stability of cubic perovskites:

$$t = \frac{r_A + r_X}{\sqrt{2}\,(r_B + r_X)} \tag{1}$$
where $r_A$, $r_B$, and $r_X$ are the ionic radii of the A-, B-, and X-site ions, respectively. In the following years, Goldschmidt's tolerance factor became widely accepted as a criterion for perovskite stability in oxide, fluoride, and chloride compounds, and it is still used as a guide in the design, discovery, and study of new perovskites. As more perovskites were explored, however, it was found that the tolerance factor is a necessary but not sufficient condition for the formation of perovskites [16]. The tolerance factor indicates whether the A-site cation can be accommodated within the voids of the BX6 framework; a value of t = 1 indicates an ideal cubic perovskite. Most perovskites have t values in the range 0.8–1.00, and distortions due to tilting of the BX6 octahedra can be observed in the lower part of this range. For a stable perovskite, the t value lies in the range 0.75–1.03 [17, 18]. Below this range, the ABX3 solid is more likely to adopt other structure types, e.g., ilmenite, in which the coordination number of the A-site cation is smaller than in perovskites. Above this range, the solid tends to be hexagonal [17].

To predict the formability of perovskite structures more accurately, an “octahedral factor” ($O_f$) was introduced [17]. It is defined as the ratio of the radius of the small cation B to the radius of the anion X, i.e., $O_f = r_B/r_X$. When $O_f > 0.442$, the BX6 octahedron is considered stable, whereas a lower value destabilizes the BX6 octahedral assembly. The octahedral factor is therefore thought to give a better structural map when combined with the tolerance factor, and it provides new criteria for determining the formability of oxide perovskites. For a stable perovskite, t and $O_f$ must typically lie in the ranges 0.81 < t < 1.11 and 0.44 < $O_f$ < 0.90, respectively [16]. Thus, a two-dimensional structural map built from these two factors yields an efficient predictive model of formability, as seen in Fig. 2.

Fig. 2 A map of tolerance-octahedral factor for halide perovskite ABX3 compounds [16]

The A-site cation does not seem to change the electronic structure and, by extension, the band structure of the resulting materials. Instead, it acts as a charge balancer within the lattice and has been observed to modify the optical properties of the perovskite by deforming the BX6 octahedral framework. A larger or smaller A cation can expand or contract the entire lattice, changing the lengths of the B–X bonds, which has been shown to be crucial for determining the bandgap.

The band gap is usually regarded as the most important performance factor when deciding whether a material can be used in solar cells. The photoelectric properties of a material depend directly on its interaction with light. A material's band gap therefore controls whether photons are emitted or absorbed, and the bandgap type determines whether phonons are necessary for light interactions. The Shockley-Queisser (S-Q) detailed-balance model predicts that the ideal bandgap energy for single-junction solar cells is 1.34 eV [11]. For photovoltaics, the bandgap can be either direct or indirect. For light emission, the photon energy for the desired color of light is determined from

$$E_{\text{photon}} = E_g + \frac{k_B T}{2} \tag{2}$$

where $E_g$ is the band gap, $k_B$ is the Boltzmann constant, and $T$ is the absolute temperature; a direct bandgap is necessary for good emission efficiency. Empirical factors such as the tolerance factor and the octahedral factor not only describe the stability of perovskites but also give a good prediction of their band gap values. In general, the targeted properties act as screening criteria in the search for stable lead-free perovskites with the desired properties (Fig. 3).

Fig. 3 Screening and elimination criteria used to screen lead-free stable perovskite materials [19]
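The formability criteria discussed in this section, Eq. (1) together with the octahedral factor, can be evaluated with a short script. The sketch below is illustrative only: the ionic radii are approximate Shannon values assumed for the example rather than data from this chapter, CsPbI3 is used simply as a familiar reference composition, and the function names are hypothetical. The acceptance windows are the ranges quoted above from [16].

```python
import math

# Approximate Shannon ionic radii in angstroms, used here only as
# illustrative inputs (an assumption, not data from the chapter).
RADII = {"Cs": 1.88, "Pb": 1.19, "I": 2.20}

# Stability windows quoted in the text [16].
T_RANGE = (0.81, 1.11)
OF_RANGE = (0.44, 0.90)


def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor, Eq. (1)."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b, r_x):
    """Octahedral factor O_f = r_B / r_X."""
    return r_b / r_x


def is_formable(r_a, r_b, r_x):
    """Check both factors against the open intervals quoted in the text."""
    t = tolerance_factor(r_a, r_b, r_x)
    o_f = octahedral_factor(r_b, r_x)
    return T_RANGE[0] < t < T_RANGE[1] and OF_RANGE[0] < o_f < OF_RANGE[1]


t = tolerance_factor(RADII["Cs"], RADII["Pb"], RADII["I"])
o_f = octahedral_factor(RADII["Pb"], RADII["I"])
formable = is_formable(RADII["Cs"], RADII["Pb"], RADII["I"])
print(f"t = {t:.3f}, O_f = {o_f:.3f}, formable = {formable}")
```

For these inputs the sketch gives t ≈ 0.85 and O_f ≈ 0.54, both inside the quoted windows. Because the result depends on which radius table is used, any real screen should state its source of ionic radii explicitly.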
2.2 Constructing Datasets

Typically, the dataset used for ML includes dependent and independent variables associated with the materials. Independent variables, also called features or descriptors, give information about the structure and properties of materials. They include the chemical composition; molecular, atomic, and structural parameters; and the technical conditions of synthesis. The target properties of the materials, which are affected by the independent variables, are called dependent variables.

The quantity and quality of data are key factors in the discovery of materials. The number of samples needed depends on the machine learning model, but in general a good model needs at least three times as many samples as descriptors. Neural networks and deep learning, however, need more samples. The quality of the data depends on how well the target properties are covered and on the uncertainty associated with the data. In most cases, ML performs best when given normally distributed data. ML may not work properly if there are too few data for certain targets or if some properties are not covered well enough. Data quality can also be affected by experimental or calculation errors. Data roughness directly influences the results of the constructed model. Several methods can be used to reduce it, including deleting records with missing values, completing the experimental conditions, and normalizing the data.

Data can be collected from various available databases or published papers. The information in these databases comes from many different sources, such as experiments, simulations, and machine learning. Platforms such as AFLOWLIB [20], the Materials Project [21], and the Open Quantum Materials Database (OQMD) [22] contain databases of various materials. Data should be collected from authoritative databases to ensure quantity and quality. If the databases lack the information needed, computational methods can be used to generate datasets. For lab-scale computations, software platforms such as the Vienna Ab initio Simulation Package (VASP) [23], Quantum ESPRESSO [24], and Car-Parrinello Molecular Dynamics (CPMD) [25] can be used. Lab-scale computations yield a large amount of data with good reproducibility, which helps assure data quality. A complex materials calculation, on the other hand, may require a large amount of processing power and take a long time to complete. Using a combination of an atomic structure search method and density functional theory calculations, Kim et al. [26] created a dataset of 1,346 hybrid organic–inorganic perovskites (HOIPs) featuring 16 organic cations, 3 group-IV cations, and 4 halide anions.

Manually going through the literature is another way to build a database, although this method usually involves time-consuming manual editing. Table 1 lists various experimental, computational, and literature-based databases that store the structures or other characteristics of halide perovskite materials for potential future use.
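Two of the data-cleaning steps mentioned in this section, deleting records with missing values and normalizing the data, along with the rule of thumb of at least three samples per descriptor, can be sketched as follows. The records are made-up placeholder values for illustration only, and the field names (`t`, `o_f`, `bandgap_eV`) and function names are hypothetical; z-score scaling is one common choice of normalization, assumed here because the text does not specify one.

```python
from statistics import mean, pstdev

# Made-up records for illustration only; "t" and "o_f" stand in for
# any two descriptors, "bandgap_eV" for the target property.
RECORDS = [
    {"t": 0.85, "o_f": 0.54, "bandgap_eV": 1.7},
    {"t": 0.90, "o_f": 0.50, "bandgap_eV": None},  # missing target value
    {"t": 0.95, "o_f": 0.46, "bandgap_eV": 1.3},
    {"t": 1.00, "o_f": 0.60, "bandgap_eV": 2.1},
]
DESCRIPTORS = ["t", "o_f"]
TARGET = "bandgap_eV"


def drop_incomplete(records, keys):
    """Delete records in which any of the given fields is missing."""
    return [r for r in records if all(r.get(k) is not None for k in keys)]


def normalize(records, keys):
    """Return copies of the records with each listed field z-score scaled."""
    stats = {}
    for k in keys:
        values = [r[k] for r in records]
        stats[k] = (mean(values), pstdev(values))
    scaled = []
    for r in records:
        row = dict(r)  # copy so the input records are left unchanged
        for k in keys:
            mu, sigma = stats[k]
            row[k] = (r[k] - mu) / sigma if sigma > 0 else 0.0
        scaled.append(row)
    return scaled


def enough_samples(n_samples, n_descriptors, ratio=3):
    """Rule of thumb from the text: at least `ratio` samples per descriptor."""
    return n_samples >= ratio * n_descriptors


clean = drop_incomplete(RECORDS, DESCRIPTORS + [TARGET])
scaled = normalize(clean, DESCRIPTORS)
print(len(clean), enough_samples(len(clean), len(DESCRIPTORS)))
```

Here one record is dropped for its missing target, and the remaining three samples fall short of the six that two descriptors would call for under the three-to-one guideline, so more data would be needed before training.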
2.3 Selecting Descriptors

A machine learning model attempts to establish meaningful and informative relationships between the input variables and the output variables. In machine learning, the input variables are most commonly called descriptors or features; depending on the discipline, source, and perspective of the user, they may also be called factors or fingerprints. The choice of descriptors plays an essential role in model performance, so selecting the right descriptor set is essential for obtaining accurate and reliable predictions. For the system to achieve the desired level of resolution and accuracy, the descriptors must be able to account for the similarities and differences between the response variables. For the development of robust and good descriptors for crystalline solids, Jain et al. [28] state four properties that good descriptors should hold: (i) Descriptors
Table 1 List of various datasets for the perovskite materials

| Name | Description | URL | References |
|---|---|---|---|
| Experimental databases | | | |
| ChemSpider | A free chemical structure database from the Royal Society of Chemistry with access to over 100 million chemical structures. The platform has 115 million crystal structures and 277 data sources | http://www.chemspider.com | [27] |
| Citrination | The world's leading open database and analytics platform for material and chemical information, provided by Citrine Informatics | www.citrination.com | |
| Crystallography open database | Crystal structures of organic compounds, inorganic compounds, and metal-organic compounds, excluding biopolymers | www.crystallography.net | |
| Materialsdata | An online repository for storing materials data related to specific publications | www.materialsdata.nist.gov | |
| PubChem | A repository of chemical information that is freely accessible around the world. Chemicals may be searched by name, molecular formula, structure, and other identifiers. The database contains information about chemical and physical properties, biological activities, safety information, and literature citations | www.pubchem.ncbi.nlm.nih.gov | |
| Pearson's crystal data | Crystal structure database for inorganic compounds. The current release 2022/23 contains about 380,000 structural data sets | www.crystalimpact.com/pcd | |
| Computational databases | | | |
| Computational materials repository | Infrastructure to enable collection, storage, retrieval, and analysis of data from electronic-structure codes | www.cmr.fysik.dtu.dk | |
| Materials project | A collaborative effort across institutions and countries to compute the properties of all inorganic materials and provide the data and analysis algorithms for free to every materials researcher. Data for 146,323 materials and 24,989 molecules are available on the platform | www.materialsproject.org | |
| NOMAD | The NOMAD Repository and Archive is an open-access repository of scientific materials data that can be shared, retrieved, and reused | www.nomad-lab.eu | |
| Open quantum materials database | OQMD is a database of 1,022,603 materials with thermodynamic and structural properties calculated through DFT | www.oqmd.org | |
| NREL materials database | NRELMatDB is a database of computational materials with a specific focus on renewable energy materials | www.materials.nrel.gov | |
This database contains 3,528,653 material compounds www.aflowlib.org with 733.959,824 calculated properties, and it’s growing every day
AFLOWLIB
URLs
Description
Name
Table 1 (continued)
(continued)
[18]
[20]
References
Perovskite-Based Materials for Photovoltaic Applications: A Machine … 149
A hybrid organic–inorganic perovskite dataset
C. Kim et al. constructed a dataset of 1,346 HOIPs, which features 16 organic cations, 3 group-IV cations and 4 halide anions
A MPDS presents materials data extracted from scientific www.mpds.io publications by the project PAULING FILE team
MPDS
Databases from literature
Materials science database hosted by Chinese Academy of Sciences Institute of Physics and Computer Network Information Center
Materiae
www.materiae.iphy.ac.cn
A SpringerMaterials database provides curated materials www.materials.springer.com data and advanced features to support materials research, physics, chemistry, engineering, and other disciplines. Material data resources with a high level of quality, the largest in the world
Springer materials
URLs
Description
Name
Table 1 (continued)
[26]
References
150 R. Kaur et al.
Perovskite-Based Materials for Photovoltaic Applications: A Machine …
151
should be meaningful, with relationships between them and the responses not being too complicated. (ii) A good descriptor is universal, i.e., it can be applied to any type of material, whether existing or hypothetical. (iii) The descriptors should be reversible, so that a list of descriptors can be turned back into a description of the material. (iv) The descriptor should be readily available, i.e., easier to obtain than the target property. In addition, the descriptors should not be redundant or strongly correlated with one another. Ward and co-workers [29] examined potential descriptors for material property prediction and identified 148 potential descriptors, which they grouped into four categories: stoichiometric attributes, elemental properties, electronic-structure attributes, and ionic compound attributes. For perovskite materials, the octahedral factor and the tolerance factor are frequently employed to predict the formability of the structures. Zhang et al. [30], in their work on halide perovskites, list the descriptors that are important for the formability and band gap of halide perovskites. They found that, along with the most widely used structural descriptors for stability and band gap prediction, i.e., the Goldschmidt tolerance factor and the octahedral factor, the ionic radii are also important for describing the formability of halide perovskites. Park et al. investigated the effect of BX6 octahedral deformations, rotations, and tilts on the thermodynamic stability and optical characteristics of the compounds and determined the relationship between octahedral deformation and band gap [31]. Filip and Giustino [12] applied the “no-rattling” principle to study the stability limits of perovskite phases. The “no-rattling” principle proposed by Goldschmidt describes perovskite structures with a prediction fidelity of 80% [32].
Bartel et al. introduced a new tolerance factor as a descriptor to predict the stability of perovskites, defined as [33]:

\[ \tau = \frac{r_X}{r_B} - n_A \left( n_A - \frac{r_A / r_B}{\ln (r_A / r_B)} \right) \tag{3} \]
where $n_A$ is the oxidation state of A, $r_A$, $r_B$, and $r_X$ are the ionic radii of the A, B, and X ions, and $\tau < 4.18$ indicates a perovskite structure. For 576 ABX3 compounds, the factor agreed with the experimentally observed stability in more than 90% of cases. These factors not only describe the stability of the perovskites but also give a good prediction of their band gap values. In 2017, Sun and Yin investigated the thermodynamic stability of 138 cubic perovskites and found that the tolerance factor t is not a good descriptor for stability screening [34]. Rather, they found a linear relationship between thermodynamic stability and $(O_f + t)^{\eta}$, where $O_f$ is the octahedral factor and η is the atomic packing fraction (APF). Using this stability descriptor, they were able to predict the relative stability of perovskites with an accuracy between 86 and 90%. It may provide direct guidance in the design of chemical compositions to create stable perovskites, as well as facilitate efficient high-throughput searches for new stable perovskites with maximum photovoltaic potential. Li et al. [35] described HOMO, LUMO, Eg, ΔH, and ΔL as robust descriptors for charge transfer as well as for the power conversion efficiency (PCE). ΔH, defined as the energy difference between the highest occupied molecular orbitals (HOMO) of two neighboring layers, and ΔL, defined as the energy difference between the lowest
unoccupied molecular orbital (LUMO) of the neighboring two layers, correlated well with the power conversion efficiencies. The potential descriptors listed in the literature may not all be required for every problem; which ones uniquely describe the data depends on what the purpose is or what knowledge we are trying to gain from the data, because not every property of a material affects all performance measures simultaneously. Depending on the number of data points in the dataset, it may even be necessary to eliminate descriptors that are less (or even moderately) relevant to defining the system in order to reduce overfitting. Therefore, for robust and simpler models, it is necessary to transform high-dimensional data into low-dimensional data using dimensionality reduction [36, 37]. A feature selection model can be used to reduce dimensions by employing forward or backward elimination, keeping only the significant ones in the model. Feature extraction also contributes to the reduction of dimensionality by constructing a smaller set of descriptors from the original list [36, 37]. The development of crystal structure descriptors plays a significant role in the design of perovskites.
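The structural descriptors discussed in this section are simple functions of ionic radii and can be computed directly. The sketch below evaluates the Goldschmidt tolerance factor and the octahedral factor in their standard forms (assumed here, since they are not restated in this section) together with the tolerance factor of Eq. (3). The radii are illustrative input values in angstrom chosen for a Cs/Pb/I composition; they are assumptions for the example and should be replaced by values from a radii table in real work.

```python
from math import log, sqrt

def goldschmidt_t(r_a, r_b, r_x):
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (sqrt(2) * (r_b + r_x))

def octahedral_factor(r_b, r_x):
    """Octahedral factor: ratio of the B-site radius to the X-site radius."""
    return r_b / r_x

def bartel_tau(r_a, r_b, r_x, n_a):
    """Tolerance factor of Eq. (3); tau < 4.18 indicates a perovskite."""
    ratio = r_a / r_b
    if ratio <= 0 or ratio == 1:
        # ln(ratio) is undefined or zero, so Eq. (3) cannot be evaluated.
        raise ValueError("r_A / r_B must be positive and different from 1")
    return r_x / r_b - n_a * (n_a - ratio / log(ratio))

# Illustrative inputs (assumed values): radii in angstrom, A-site oxidation state.
r_a, r_b, r_x, n_a = 1.88, 1.19, 2.20, 1

t = goldschmidt_t(r_a, r_b, r_x)
mu = octahedral_factor(r_b, r_x)
tau = bartel_tau(r_a, r_b, r_x, n_a)
is_perovskite_by_tau = tau < 4.18
```

Because all three quantities need only tabulated radii and an oxidation state, they satisfy the "readily available" criterion listed above: they are far cheaper to obtain than a DFT-calculated target property.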
2.4 Feature Engineering An effective machine learning model requires carefully designed features. The ML model is therefore able to predict more materials outside the databases with the desired property, which eliminates the need to conduct any more “first-principles” calculations [38]. Despite the fact that many factors can affect a material’s target property, the number of features must be reasonable. To minimize the risk of overfitting it is best to select features that truly represent the corresponding property, and the number of features selected should be less than the number of materials in the input dataset. It is important to note that high-dimensional data can contain irrelevant, misleading, or redundant features, which increases the search space, which can result in difficulty processing more data and thereby inhibit learning. In order to accomplish data-driven material science, feature selection plays a crucial role [39, 40]. Feature engineering (FE) involves creating features that can reveal hidden dependencies in data and help machine learning algorithms predict better. Feature engineering is used to increase the performance of machine learning models. An ML method is designed to analyze a particular material property typically based on a set of features (descriptors). The initial feature set is constructed by selecting some features based on a prior understanding of the physics and chemistry of the materials. After evaluating the model’s performance, feature selection, which involves ranking, is used to select the best features [36]. In general, it entails two steps: the creation of new feature representations by transforming the original raw or primary features, followed by the selection of those engineered features which should be considered useful (interpretable and predictive) for the ML task. In the implementation of ML algorithms, such data transformations are important and often take the majority of the actual effort. 
Feature selection decreases dimensionality by picking a subset of the features without modifying them, whereas feature
extraction reduces dimensionality by computing a transformation of the original data to produce new features that should be more relevant.
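The difference between feature selection and feature extraction can be made concrete with a toy example. In the sketch below, selection ranks the original columns by the absolute Pearson correlation with the target and keeps the top k unchanged, while extraction computes a new feature (here the octahedral factor) from the original columns. The dataset is hypothetical, and correlation ranking is only one simple selection criterion chosen for illustration; it is not the method of any study cited here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical toy dataset: primary features (columns) and a target property.
features = {
    "r_A": [1.6, 1.7, 1.8, 1.9, 2.0, 2.1],
    "r_B": [1.19, 1.10, 0.95, 1.02, 0.76, 0.69],
    "r_X": [2.20, 1.96, 1.81, 2.20, 1.96, 1.81],
    "noise": [0.3, -0.1, 0.2, -0.4, 0.1, 0.0],
}
target = [1.5, 1.9, 2.6, 1.7, 2.4, 3.0]  # hypothetical band gaps in eV

def select_k_best(cols, y, k):
    """Feature selection: keep the k original columns most correlated with y."""
    ranked = sorted(cols, key=lambda name: abs(pearson(cols[name], y)), reverse=True)
    return ranked[:k]

def extract_octahedral_factor(cols):
    """Feature extraction: build a new feature from the original ones."""
    return [rb / rx for rb, rx in zip(cols["r_B"], cols["r_X"])]

selected = select_k_best(features, target, k=2)
mu = extract_octahedral_factor(features)
```

Selected features keep their physical meaning, which helps interpretation; extracted features can be more compact but are only as interpretable as the transformation that produced them.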
2.5 Machine Learning Models
Choosing the right ML algorithm is essential to the success of the ML method. The model used directly affects how well ML can predict the properties of materials and how quickly it can find new functional materials. The successful use of ML methods has made it possible to learn about the microscopic physical and chemical properties of materials and to predict their properties accurately. Machine learning algorithms can be classified into two subcategories, namely supervised and unsupervised learning, as shown in Fig. 4. In supervised learning, a labeled dataset is used to train the model [37]. Based on the type of task they perform, supervised learning techniques are further divided into two categories: regression and classification. Regression is used when the output is a continuous variable, and classification is used when the output is a discrete variable. In the field of materials science, gradient boosting regression (GBR), artificial neural networks, and kernel ridge regression (KRR) are examples of supervised regression algorithms that have been used successfully. Using these regression algorithms, one can predict material properties with DFT accuracy and
Fig. 4 Broad classification of machine learning techniques [41]
gain insight into atomic-level chemistry. Meanwhile, unsupervised learning can be used to identify trends and patterns if the training dataset contains unlabelled samples. Models are trained using a subset of the whole data, known as the training data, and are then used to predict new data. Based on the results of the model analysis, the selected models can also be used to make predictions and find new materials. When applying machine learning to the development of photovoltaic materials, it is important that the developed model is judged not only on the accuracy of its predictions but also on its effectiveness in solving the problem at hand. The accuracy of the model therefore need not be especially high; stability and other factors must also be taken into account. Wu and Wang investigated the applicability of ML models for developing lead-free stable perovskites [14]. Their ML model remains applicable even though its accuracy is not fully satisfactory, i.e., it can be used to screen materials in a fixed compositional space. After obtaining an ML model, it is also necessary to evaluate its accuracy. Three key measures are used to estimate the prediction errors of each model: the coefficient of determination (R2), the Pearson coefficient (r), and the mean square error (MSE). The size of the training dataset significantly influences how well the chosen ML tool performs. It is important to choose a dataset that is big enough, regardless of whether it was generated experimentally or by DFT first-principles calculations. This lowers the likelihood that the ML model will underfit. An ML model is deemed underfitted if it is unable to identify the patterns in the dataset. Training a linear model on non-linear data is another cause of underfitting. An underfitted ML model will predict inaccurate results. If an ML model fits its training data too closely, for example because the model is too complex for the amount of data available, it is said to be overfitted. An overfitted model also learns the noise and uncertainty in the training dataset. Such a model is prone to produce inaccurate results on new data for both prediction and classification.
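The three error measures named above have short closed-form definitions. The following sketch implements them with the standard library only, using the usual textbook formulas; the sample values are hypothetical and serve only to show the calls.

```python
from math import sqrt

def mse(y_true, y_pred):
    """Mean square error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def pearson_r(y_true, y_pred):
    """Pearson coefficient between reference and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = sqrt(sum((t - mt) ** 2 for t in y_true))
    sp = sqrt(sum((p - mp) ** 2 for p in y_pred))
    return cov / (st * sp)

# Hypothetical reference and predicted band gaps (eV).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

error = mse(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
r = pearson_r(y_true, y_pred)
```

Computing these measures on held-out test data rather than on the training data is what exposes overfitting: an overfitted model scores well on the data it was trained on and noticeably worse on data it has not seen.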
3 Review of Machine Learning in Lead-Free Stable Perovskites for Photovoltaic Application
In the search for lead-free stable perovskites, S. Lu and co-workers used a dataset of 346 HOIPs from high-throughput first-principles calculations [13]. Out of these 346, they selected 212 hybrid organic–inorganic perovskites to train the algorithm and then explored 5158 unexplored candidates for photovoltaic properties. For training, they started with 30 initial descriptors. The GBR algorithm was used to find the relationship between the selected descriptors and the targeted properties (i.e., stability and bandgap). They then incorporated “last-place elimination” into the algorithm to identify the most impactful descriptors, arriving at the 14 features that most significantly impact the bandgap. The new feature set contains structural features as well as elemental features. Figure 5a gives the ranking of selected features
obtained using the GBR algorithm, along with a heat map of the Pearson correlation coefficient matrix. Figure 5b shows the results of the GBR model on the test data: R2 = 97.0%, Pearson coefficient r = 98.5%, and MSE = 0.086, which supports the strong performance of the selected model. The predicted bandgaps are very close to those in the original input dataset. This supports the reasonableness and reliability of their ML model and provides a basis for further analysis. Further ML results show an increase in bandgap as the X-site halogen radius decreases, which suggests that candidates with Br and I at the X-site are promising for photoelectric applications. After training the algorithm, the 5158 unexplored candidates were investigated further. Structural stability screening retained 1669 suitable candidates for optoelectronic properties. Of these, 218 candidates were chosen for their suitable predicted bandgap, and 22 Br-based candidates were studied further because Br-based candidates can be easily studied experimentally. Compounds with toxic elements were then excluded, leaving 6 potential candidates. These 6 candidates were explored further through first-principles calculations using PBE as the exchange-correlation functional, and the results were compared with the ML predictions (Fig. 5c).
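The screening funnel described above (structural stability, predicted bandgap, Br-based composition, exclusion of toxic elements) amounts to a sequence of filters. The sketch below shows that structure only. The thresholds, the set of toxic elements, and the candidate pool are all assumptions made for illustration and are not the values or data used in [13].

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    tolerance_factor: float
    octahedral_factor: float
    predicted_gap_eV: float
    halide: str
    b_site: str

# Assumed, illustrative thresholds; not the values used in [13].
T_RANGE = (0.8, 1.1)
MU_RANGE = (0.4, 0.9)
GAP_RANGE_EV = (0.9, 1.6)
TOXIC_B_SITE = {"Pb", "Cd", "Hg"}

def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high

def screen(candidates):
    """Apply the four filters in sequence and return every stage."""
    stable = [c for c in candidates
              if in_range(c.tolerance_factor, T_RANGE)
              and in_range(c.octahedral_factor, MU_RANGE)]
    good_gap = [c for c in stable if in_range(c.predicted_gap_eV, GAP_RANGE_EV)]
    br_based = [c for c in good_gap if c.halide == "Br"]
    nontoxic = [c for c in br_based if c.b_site not in TOXIC_B_SITE]
    return {"stable": stable, "good_gap": good_gap,
            "br_based": br_based, "final": nontoxic}

# Hypothetical candidate pool.
pool = [
    Candidate("C1", 0.95, 0.55, 1.3, "Br", "Sn"),
    Candidate("C2", 1.25, 0.55, 1.3, "Br", "Sn"),
    Candidate("C3", 0.90, 0.60, 2.4, "Br", "Ge"),
    Candidate("C4", 0.92, 0.50, 1.1, "I", "Sn"),
    Candidate("C5", 0.98, 0.62, 1.5, "Br", "Pb"),
]

stages = screen(pool)
counts = {stage: len(items) for stage, items in stages.items()}
```

Keeping every intermediate stage, rather than only the final list, makes it easy to report how many candidates survive each filter, as the study does with its 5158, 1669, 218, 22, and 6 counts.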
Fig. 5 a The ranking of selected features using the GBR algorithm and the heat map of the Pearson correlation coefficient matrix among those selected features. b Visual depiction of results and insights from the ML model: the fitting results of test bandgaps Eg and predicted bandgaps Eg. The coefficient of determination (R2), Pearson coefficient (r), and mean squared error (MSE) are computed to estimate the prediction errors. Data visualization of predicted bandgaps for all possible HOIPs (one color represents a class of halogen perovskites) with tolerance factor, octahedral factor, ionic polarizability for the A-site ions, and electronegativity of B-site ions. The dotted box represents the most appropriate range for each feature. c A comparison between ML-predicted and DFT-calculated results of six selected HOIPs [13]
In 2019, Wu and Wang proposed a target-driven method to speed up the discovery of hidden hybrid organic–inorganic perovskites (HOIPs) for photovoltaic applications [14]. A dataset of 77,748 electrically neutral HOIPs was screened from a huge chemical space of 230,808 HOIPs. The dataset was then screened through the stability condition, leaving 38,086 potential candidates. The ionic radius of the investigated organic A-site cations is one of the important features in predicting HOIPs, along with other important features such as the tolerance factor, octahedral factor, and atomic packing factor. Three machine learning algorithms, gradient boosting regression (GBR) [42], support vector regression (SVR) [43], and kernel ridge regression (KRR) [44], were implemented in order to increase the prediction accuracy of the energy bandgap of HOIPs. The predicted and actual unit cell volumes obtained through SVR are shown in Fig. 6a, while Fig. 6b shows the predicted and DFT-calculated bandgaps obtained through GBR, with R2, MAE, and MSE values to estimate the accuracy of the machine learning model. A total of 686 HOIP candidates were chosen after charge neutrality screening, stability screening, and ML-integrated screening (Fig. 7). Following that, 132 stable and safe orthorhombic-like HOIPs (Cd, Pb, and Hg free) with suitable bandgaps for solar cell use were further confirmed by DFT calculations. A series of unexplored stable HOIP species, e.g., ABSeI2, ABBrI2, ABBr2I, and ABClI2, have been identified as potential photovoltaic light-harvesting materials. In 2020, Wu and co-workers selected 209 HOIPs with appropriate band gaps from a dataset of 290,279 unexplored non-transition-metal-element-based, Pb-free HOIPs [45]. This study introduced a new feature, i.e., the summation of electronegativity.
The 33 features are ranked by their impact on the desired properties, and the atomic packing factor remains the most significant factor in predicting
Fig. 6 a Fitting results of actual and predicted unit cell volume determined by support vector regression (SVR). The inset compares the frequency distributions of the actual and predicted unit cell volumes. b Fitting results of DFT-calculated and predicted bandgaps by gradient boosting regression (GBR). The inset compares the frequency distributions of the DFT-calculated and predicted bandgaps [14]
Fig. 7 A visual representation of DFT-calculated bandgaps for 686 HOIP candidates with a effective organic cation radii, b atomic number of B-site elements, c average atomic number of X-site elements, and d DFT-calculated formation energy. The bandgap range between 0.9 and 2.2 eV is marked by the green dashed lines, and the 15 newly selected HOIPs are represented by red dots [14]
HOIP band gaps. To predict the material properties of potential HOIP candidates, the researchers used three supervised machine learning regression algorithms: GBR, SVR, and KRR. Each model is trained and tested with the same input dataset and cross-validation technique, and the performance of the models is compared. The GBR model exhibits the highest performance (R2 = 0.943, MAE = 0.203, MSE = 0.086). To evaluate the impact of feature selection on the performance of the ML regression algorithm, the 16 most important features (nearly half of the total) are identified using the GBR algorithm and used to build an engineered feature set. Packing fraction ranked highest in importance, followed by tolerance factor, sum of electronegativity, octahedral factor, sum of atomic masses, etc., as shown in Fig. 8a. Figure 8b shows the fit between the band gaps predicted by GBR and those calculated by DFT, along with the coefficient of determination (R2), mean absolute error (MAE), and mean square error (MSE) used to estimate the prediction errors. Figure 8c illustrates the predicted band gaps of unexplored HOIP candidates against the octahedral factor, with green and red dots denoting small and large values, respectively. For unexplored HOIP candidates, Fig. 8d
Fig. 8 a The ranking of 16 selected features through the GBR algorithm. b A plot of the fitting results of the DFT-calculated and predicted band gaps obtained by GBR. The inset shows a comparison of the DFT-calculated and predicted band gap frequency distributions. c The predicted band gaps for unexplored HOIP candidates against the octahedral factor. The green and red dots represent the small and large octahedral factors, respectively. d Heat map of the predicted electronic band gap for unexplored HOIP candidates as a function of the tolerance factor and summation of electronegativity. The color bar indicates the value of the band gap predicted by the GBR algorithm [45]
shows a heat map of the predicted electronic band gap as a function of the tolerance factor and the summation of electronegativity. With cross-validation, the GBR algorithm is then applied to predict the electronic band gap of 9152 potential HOIP candidates by executing the algorithm 100 times. To select the best HOIP candidates for photovoltaic applications, a strict screening condition is used: within the 100 executions, the predicted band gap of a candidate must always lie in the range of 0.9–1.6 eV. A final check is then made using DFT calculations on 209 unexplored, nontoxic HOIP candidates with suitable band gaps (Fig. 9). The DFT calculations show that 96 HOIPs have suitable band gaps for use in photovoltaics. These candidates fall into two groups according to their chemical formulas: 27 belong to ABXY2 and 69 to ABXYZ, while no ABX3 compound is suitable for photovoltaic applications.
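The strict screening condition described above, that a candidate's predicted band gap must stay within 0.9–1.6 eV in every one of 100 executions, can be sketched as a running set intersection. In the sketch below, `train_and_predict` is a hypothetical stand-in that perturbs fixed base predictions with seeded noise; in the actual study each execution would retrain and apply the GBR model, which is not reproduced here. The candidate names and values are also hypothetical.

```python
import random

GAP_MIN_EV, GAP_MAX_EV = 0.9, 1.6
N_RUNS = 100

def train_and_predict(base_predictions, seed):
    """Hypothetical stand-in for one execution: retrain and predict.

    Here the run-to-run variation is simulated with bounded, seeded noise.
    """
    rng = random.Random(seed)
    return {name: gap + rng.uniform(-0.05, 0.05)
            for name, gap in base_predictions.items()}

def always_in_window(base_predictions, n_runs=N_RUNS):
    """Keep only candidates whose prediction is inside the window in every run."""
    survivors = set(base_predictions)
    for seed in range(n_runs):
        preds = train_and_predict(base_predictions, seed)
        survivors = {n for n in survivors
                     if GAP_MIN_EV <= preds[n] <= GAP_MAX_EV}
    return sorted(survivors)

# Hypothetical candidates with nominal predicted band gaps (eV).
base = {"cand_A": 1.25, "cand_B": 0.92, "cand_C": 1.90, "cand_D": 1.50}
selected = always_in_window(base)
```

Requiring agreement across all runs is deliberately conservative: candidates close to the edge of the window, such as `cand_B` here, are likely to be dropped even if their average prediction lies inside it.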
Fig. 9 Visual depiction of a the DFT-calculated band gap and b the DFT-calculated formation energy of the selected 209 HOIP candidates with tolerance factor. The color bar represents the difference between the band gaps predicted by ML and those calculated by DFT. The red dot in b represents the selected representative new HOIPs [45]
M.-H. Jao and colleagues developed novel descriptors based on pseudopotentials to serve as the primary descriptors for machine learning models [46]. To convert electron density distributions into numerical values that could be used as descriptors in ML models, they employed an unsupervised neural network that automatically extracts representative values from the electron density distributions. Using this Element Code as the sole descriptor, they predicted the bandgap of lead-free double halide perovskites with an accuracy of 0.951 and a mean absolute error of 0.266 eV. This paves the way for exploring potential candidates for energy conversion applications efficiently.
4 Conclusion and Outlook
Recent years have seen rapid progress in perovskite research, which has coincided with growing interest in machine learning applications in materials science. As a result, the number of published research papers and articles has increased over the past few years. We can expect this trend to continue and accelerate as experimental and computational tools, as well as machine learning and data management technologies, keep improving. There are, however, some challenges to overcome in order to use machine learning more effectively in perovskite research. Since experimental or highly accurate computational datasets are expensive and time-consuming to produce, they are usually small; as a result, the models built using these datasets are susceptible to noise and imbalanced data structures, which limit their generalization ability. One way to overcome this problem would be to develop new experimental procedures or to use ML techniques that work better with small datasets. The analysis could also be more reliable if smaller ML models (with fewer features) were used. As a means of reducing
the dimensionality of a descriptor set, one may eliminate insignificant descriptors (feature selection) or combine the descriptors into a smaller new set (feature extraction). Lead halide perovskites have great potential for use in optoelectronics, but their widespread commercialization faces two major obstacles: poor stability and lead toxicity. For lead-free stable perovskites to be used as photovoltaic materials, it is essential to achieve both stability and an appropriate band gap. The stability of perovskites is primarily determined by examining their structural characteristics. It is described by various structural parameters, for example, the Goldschmidt tolerance factor, octahedral factor, packing fraction, electronegativity, and ionic radius. These are generally used as descriptors in ML models for the stability and band structure properties of perovskites. The development of crystal structure descriptors plays a significant role in the design of perovskites. ML methods have been successfully applied to understand and predict the microscopic physical and chemical properties of materials. Using regression algorithms, it is possible to predict material properties with DFT accuracy and gain insight into atomic-level chemistry [13]. Interpreting complex ML models in physical and chemical terms can be challenging, since the goal of the learning process is to maximize prediction performance, which may require combining hundreds of features. Researchers will gain a deeper understanding of the structure–property relationships of materials if the model can be explained on the basis of physical and chemical principles.
References
1. Bhattacharya, S., & John, S. (2019). Beyond 30% conversion efficiency in silicon solar cells: A numerical demonstration. Scientific Reports, 9(1), 12482. https://doi.org/10.1038/s41598-019-48981-w
2. Kojima, A., Teshima, K., Shirai, Y., & Miyasaka, T. (2009). Organometal halide perovskites as visible-light sensitizers for photovoltaic cells. Journal of the American Chemical Society, 131(17), 6050–6051. https://doi.org/10.1021/ja809598r
3. Lin, K., et al. (2018). Perovskite light-emitting diodes with external quantum efficiency exceeding 20%. Nature, 562(7726), 245–248. https://doi.org/10.1038/s41586-018-0575-3
4. Green, M., Dunlop, E., Hohl-Ebinger, J., Yoshita, M., Kopidakis, N., & Hao, X. (2021). Solar cell efficiency tables (version 57). Progress in Photovoltaics: Research and Applications, 29(1), 3–15. https://doi.org/10.1002/pip.3371
5. Johnston, M. B., & Herz, L. M. (2016). Hybrid perovskites for photovoltaics: Charge-carrier recombination, diffusion, and radiative efficiencies. Accounts of Chemical Research, 49(1), 146–154. https://doi.org/10.1021/acs.accounts.5b00411
6. Kang, J., & Wang, L.-W. (2017). High defect tolerance in lead halide perovskite CsPbBr3. Journal of Physical Chemistry Letters, 8(2), 489–493. https://doi.org/10.1021/acs.jpclett.6b02800
7. Chen, Y., Peng, J., Su, D., Chen, X., & Liang, Z. (2015). Efficient and balanced charge transport revealed in planar perovskite solar cells. ACS Applied Materials & Interfaces, 7(8), 4471–4475. https://doi.org/10.1021/acsami.5b00077
8. Zhong, W., & Vanderbilt, D. (1995). Competing structural instabilities in cubic perovskites. Physical Review Letters, 74(13), 2587–2590. https://doi.org/10.1103/PhysRevLett.74.2587
9. Ren, M., Qian, X., Chen, Y., Wang, T., & Zhao, Y. (2022). Potential lead toxicity and leakage issues on lead halide perovskite photovoltaics. Journal of Hazardous Materials, 426, 127848. https://doi.org/10.1016/j.jhazmat.2021.127848
10. Davies, M. L. (2020). Addressing the stability of lead halide perovskites. Joule, 4(8), 1626–1627. https://doi.org/10.1016/j.joule.2020.07.025
11. Markvart, T. (2022). Shockley–Queisser detailed balance limit after 60 years. WIREs Energy and Environment, 11(4), e430. https://doi.org/10.1002/wene.430
12. Filip, M. R., & Giustino, F. (2018). The geometric blueprint of perovskites. Proceedings of the National Academy of Sciences, 115(21), 5397–5402. https://doi.org/10.1073/pnas.1719179115
13. Lu, S., Zhou, Q., Ouyang, Y., Guo, Y., Li, Q., & Wang, J. (2018). Accelerated discovery of stable lead-free hybrid organic-inorganic perovskites via machine learning. Nature Communications, 9(1), 3405. https://doi.org/10.1038/s41467-018-05761-w
14. Wu, T., & Wang, J. (2019). Global discovery of stable and non-toxic hybrid organic-inorganic perovskites for photovoltaic systems by combining machine learning method with first principle calculations. Nano Energy, 66, 104070. https://doi.org/10.1016/j.nanoen.2019.104070
15. Goldschmidt, V. M. (1926). Die Gesetze der Krystallochemie. Naturwissenschaften, 14(21), 477–485. https://doi.org/10.1007/BF01507527
16. Li, C., Lu, X., Ding, W., Feng, L., Gao, Y., & Guo, Z. (2008). Formability of ABX3 (X = F, Cl, Br, I) halide perovskites. Acta Crystallographica Section B, 64(6), 702–707. https://doi.org/10.1107/S0108768108032734
17. Li, C., Soh, K. C. K., & Wu, P. (2004). Formability of ABO3 perovskites. Journal of Alloys and Compounds, 372(1), 40–48. https://doi.org/10.1016/j.jallcom.2003.10.017
18. Kumar, A., Singh, S., Mohammed, M. K. A., & Sharma, D. K. Accelerated innovation in developing high-performance metal halide perovskite solar cell using machine learning. International Journal of Modern Physics B, 0(0), 2350067. https://doi.org/10.1142/S0217979223500674
19. Jacobs, R., Luo, G., & Morgan, D. (2019). Materials discovery of stable and nontoxic halide perovskite materials for high-efficiency solar cells. Advanced Functional Materials, 29(23), 1804354. https://doi.org/10.1002/adfm.201804354
20. Curtarolo, S., et al. (2012). AFLOWLIB.ORG: A distributed materials properties repository from high-throughput ab initio calculations. Computational Materials Science, 58, 227–235. https://doi.org/10.1016/j.commatsci.2012.02.002
21. Jain, A., et al. (2013). Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Materials, 1(1), 011002. https://doi.org/10.1063/1.4812323
22. Saal, J. E., Kirklin, S., Aykol, M., Meredig, B., & Wolverton, C. (2013). Materials design and discovery with high-throughput density functional theory: The open quantum materials database (OQMD). JOM Journal of the Minerals Metals and Materials Society, 65(11), 1501–1509. https://doi.org/10.1007/s11837-013-0755-4
23. Kresse, G., & Furthmüller, J. (1996). Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Physical Review B, 54(16), 11169–11186. https://doi.org/10.1103/PhysRevB.54.11169
24. Giannozzi, P., et al. (2009). QUANTUM ESPRESSO: A modular and open-source software project for quantum simulations of materials. Journal of Physics: Condensed Matter, 21(39), 395502. https://doi.org/10.1088/0953-8984/21/39/395502
25. Hutter, J. (2012). Car-Parrinello molecular dynamics. WIREs Computational Molecular Science, 2(4), 604–612. https://doi.org/10.1002/wcms.90
26. Kim, C., Huan, T. D., Krishnan, S., & Ramprasad, R. (2017). A hybrid organic-inorganic perovskite dataset. Scientific Data, 4(1), 170057. https://doi.org/10.1038/sdata.2017.57
27. Villars, P. (2007). Pearson’s crystal data: Crystal structure database for inorganic compounds. ASM International, Materials Park.
28. Jain, A., Hautier, G., Ong, S. P., & Persson, K. (2016). New opportunities for materials informatics: Resources and data mining techniques for uncovering hidden relationships. Journal of Materials Research, 31(8), 977–994. https://doi.org/10.1557/jmr.2016.80
29. Ward, L., Agrawal, A., Choudhary, A., & Wolverton, C. (2016). A general-purpose machine learning framework for predicting properties of inorganic materials. NPJ Computational Materials, 2(1), 16028. https://doi.org/10.1038/npjcompumats.2016.28
30. Zhang, L., He, M., & Shao, S. (2020). Machine learning for halide perovskite materials. Nano Energy, 78, 105380. https://doi.org/10.1016/j.nanoen.2020.105380
31. Park, H., Ali, A., Mall, R., Bensmail, H., Sanvito, S., & El-Mellouhi, F. (2021). Data-driven enhancement of cubic phase stability in mixed-cation perovskites. Machine Learning: Science and Technology, 2(2), 025030. https://doi.org/10.1088/2632-2153/abdaf9
32. Travis, W., Glover, E. N. K., Bronstein, H., Scanlon, D. O., & Palgrave, R. G. (2016). On the application of the tolerance factor to inorganic and hybrid halide perovskites: A revised system. Chemical Science, 7(7), 4548–4556. https://doi.org/10.1039/C5SC04845A
33. Bartel, C. J., et al. (2019). New tolerance factor to predict the stability of perovskite oxides and halides. Science Advances, 5(2), eaav0693. https://doi.org/10.1126/sciadv.aav0693
34. Sun, Q., & Yin, W.-J. (2017). Thermodynamic stability trend of cubic perovskites. Journal of the American Chemical Society, 139(42), 14905–14908. https://doi.org/10.1021/jacs.7b09379
35. Li, J., Pradhan, B., Gaur, S., & Thomas, J. (2019). Predictions and strategies learned from machine learning to develop high-performing perovskite solar cells. Advanced Energy Materials, 9(46), 1901891. https://doi.org/10.1002/aenm.201901891
36. Baştanlar, Y., & Özuysal, M. (2014). Introduction to machine learning. In M. Yousef & J. Allmer (Eds.), miRNomics: MicroRNA biology and computational analysis (pp. 105–128). Humana Press. https://doi.org/10.1007/978-1-62703-748-8_7
37. Rebala, G., Ravi, A., & Churiwala, S. (2019). An introduction to machine learning. Springer International Publishing. https://books.google.co.in/books?id=u8OWDwAAQBAJ
38. Butler, K. T., Davies, D.
W., Cartwright, H., Isayev, O., & Walsh, A. (2018). Machine learning for molecular and materials science. Nature, 559(7715), 547–555. https://doi.org/10.1038/s41 586-018-0337-2 39. Liu, Y., Esan, O. C., Pan, Z., & An, L. (2021). Machine learning for advanced energy materials. Energy and AI, 3, 100049. https://doi.org/10.1016/j.egyai.2021.100049 40. Srivastava, M., Howard, J. M., Gong, T., Rebello Sousa Dias, M., & Leite, M. S. (2021). Machine learning roadmap for perovskite photovoltaics. Journal of Physical Chemistry Letters, 12(32), 7866–7877. https://doi.org/10.1021/acs.jpclett.1c01961 41. Suryakanthi, T. (2020). Evaluating the impact of GINI index and information gain on classification using decision tree classifier algorithm*. International Journal of Advanced Computer Science and Applications, 11. https://doi.org/10.14569/IJACSA.2020.0110277 42. Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451 43. Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14(3), 199–222. https://doi.org/10.1023/B:STCO.0000035301.49549.88 44. Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press. https://books. google.co.in/books?id=RC43AgAAQBAJ 45. Wu, T., & Wang, J. (2020). Deep mining stable and nontoxic hybrid organic-inorganic perovskites for photovoltaics via progressive machine learning. ACS Applied Materials & Interfaces, 12(52), 57821–57831. https://doi.org/10.1021/acsami.0c10371 46. Jao, M.-H., Chan, S.-H., Wu, M.-C., & Lai, C.-S. (2020). Element code from pseudopotential as efficient descriptors for a machine learning model to explore potential lead-free halide perovskites. Journal of Physical Chemistry Letters, 11(20), 8914–8921. https://doi.org/10. 1021/acs.jpclett.0c02393
A Review of the High-Performance Gas Sensors Using Machine Learning
Shulin Yang, Gui Lei, Huoxi Xu, Zhigao Lan, Zhao Wang, and Haoshuang Gu
Abstract High-performance gas sensors are essential for accurately identifying pollutant gases and monitoring their concentrations in the environment, and thus for ensuring human safety in daily life and industrial production. Machine-learning techniques have been used successfully to improve the sensing performance of gas sensors by leveraging the large on-site data sets that the sensors generate. A simple process is introduced to show the typical approach for collecting features from sensing response curves and applying a machine-learning algorithm to analyze the resulting data set. The improved gas sensing performances of recently reported machine-learning-enabled sensors are summarized and compared, especially regarding selectivity and long-term stability (drift compensation). Furthermore, the expanded applications of a gas sensor or sensor array under machine-learning algorithms are reviewed, and the possible challenges and prospects are discussed. This review indicates that machine-learning techniques are effective strategies for improving the gas sensing behavior of a single gas sensor or a sensor array.
Keywords Gas sensor · Sensor array · Machine learning · Algorithms · Review
S. Yang (B) · G. Lei · H. Xu · Z. Lan · H. Gu (B) Hubei Key Laboratory for Processing and Application of Catalytic Materials, School of Physics and Electronic Information, Huanggang Normal University, Huanggang 438000, China e-mail: [email protected] H. Gu e-mail: [email protected]
S. Yang · G. Lei · Z. Wang · H. Gu Faculty of Physics and Electronic Sciences, Hubei University, Wuhan 430062, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_8
1 Introduction
High-performance and reliable gas sensors are urgently required to monitor leaks of explosive or toxic gases and volatile organic compounds (VOCs), as well as their concentrations, in modern life [1–5]. Semiconductor-based gas sensors
have been widely studied in recent years owing to their high sensitivity, ease of production, low cost, portability and low power requirement [6–9]. Furthermore, the miniaturization of semiconductor-based gas sensors makes them potentially compatible with chip-level integration [10, 11]. The rapid development of gas sensors integrated into micro-electromechanical systems (MEMS) is another reason why semiconductor-based gas sensors have attracted so much attention [12, 13]. It has been reported that semiconductor-based gas sensors can exhibit promising sensing performance with a short response time and a low working temperature [14–17]. For example, Zhang et al. showed that Pt-functionalized hierarchical ZnO microspheres exhibited a high sensor response of 242 to 200 ppm triethylamine (TEA) at 200 °C with a short response time of only 15 s [18]. CuO-based nanoflakes exhibited a sensor response of ~400% and a response time of 6.8 s to 100 ppm NO2 at 23 °C [19]. Hierarchical branch-like In2O3, SnO2-boron nitride nanotubes, TiO2 thin film and WO3 nanofibers also presented high sensor responses of 44, ~2610, ~85% and 15,000 to 100 ppb O3, 5 ppm NO2, 100 ppb H2S and 25 ppm NO2, respectively [20–23]. However, the sensor response to a low-concentration target gas might not be higher than that to a high-concentration competing gas [24, 25]. Gas sensors based on semiconductors (especially metal oxides) therefore suffer from cross-sensitivity to several gases even if they show a higher sensor response to a specific target gas [26–28]. In addition, drift during long-term operation might also limit their further adoption and application.
In recent years, much effort has been devoted to improving the gas sensing performance of semiconductor-based gas sensors, mainly through doping, surface modulation, morphology modification or the preparation of composites [29–34]. For example, the gas sensing performance of NiO nanoparticles toward benzyl mercaptan could be effectively enhanced by doping them with Cr [35]. The sensor response to 10 ppb benzyl mercaptan at 200 °C was improved from ~2 for the pure NiO nanoparticles to ~11 for the Cr-doped NiO. The authors also reported that the sensor response of the Cr-doped NiO to 1 ppm aniline was approximately 15.7, higher than that to 10 ppb benzyl mercaptan (~11). This result meant that the Cr-doped NiO might still have difficulty identifying benzyl mercaptan directly when high-concentration aniline was present. Similarly, the gas sensing performance of ZnO was improved by decorating it with Ag nanoparticles. The sensor response of the Ag-decorated ZnO was ~7.5 to 24 ppm CO at 130 °C, higher than that of the pure ZnO (~3). However, the Ag-decorated ZnO was also reported to show a sensor response of ~5 to 1 ppm HCHO at 130 °C. The sensor response of the Ag-decorated ZnO to HCHO might therefore be close to or even higher than that to 24 ppm CO when the HCHO concentration was increased [36]. A similar phenomenon was also reported in studies on the H2 (or NH3) sensing behavior of a mesoporous WO3–TiO2 heterojunction (or SnO2-based composite) [37, 38]. These reports revealed that the cross-sensitivity of semiconductor-based gas sensors could not be fully resolved via the strategies listed above. Machine-learning techniques, powerful tools for quality management and prediction, have been reported to effectively enhance the gas sensing performance of
semiconductor-based gas sensors or sensor arrays (also widely described as e-noses) [39–41]. In a machine-learning strategy, typical features of the sensing behavior are extracted from the sensing curve recorded for a test gas. Ideally, the extracted features are distinct and easy to distinguish among the classes, nearly uncorrelated with one another and helpful in identifying the gas [42]. The sensor response, relative differences, rise/fall time, integrals, derivatives, area under the sensing curve and slopes are commonly extracted as the typical features of a specific gas. The set of features for a specific gas should be unique, so that it can be regarded as the exclusive fingerprint (or footprint) of that target gas [43, 44]. The collected features form the data set, which is likewise unique and indexed to a specific gas. The features are then classified and analyzed with a selected algorithm, such as principal component analysis (PCA), linear discriminant analysis (LDA), discriminant factor analysis (DFA), partial least squares (PLS), principal component regression (PCR) or cluster analysis (CA) [8, 45–47]. The collected data can be further processed with a supervised method such as random forest (RF), multilayer perceptron (MLP) or support vector machine (SVM) to accurately distinguish the target gas among the tested gases [48, 49]. Furthermore, a regression model can be used to predict the concentration of a target gas. Several references have confirmed that a machine-learning strategy can help a gas sensor or sensor array to distinguish a target gas and predict its concentration. For example, Guha et al. improved the gas selectivity of a SnO2-based gas sensor with the RF method [50].
The SnO2-based gas sensor could effectively identify formaldehyde, methanol, propanol and toluene with an average accuracy of 96.84% under the machine-learning algorithm, indicating that the cross-sensitivity had been eliminated. Jeon et al. also assembled a gas sensor array consisting of WO3, NiO and SnO2 with different morphologies to detect CH3COCH3, C6H5CH3, NH3 and H2S [51]. Their study revealed that the target gases were more clearly separated in the principal component (PC) space at an elevated working temperature under a PCA algorithm, indicating that the gas sensor array was able to identify these four gases. However, few reviews have covered the enhanced gas sensing performance of gas sensors (mainly based on semiconductors) under machine-learning methods. Herein, we summarize and review the improved gas sensing behavior of a single gas sensor or sensor array with the help of various machine-learning algorithms. A typical and common machine-learning workflow is introduced, focusing on the extraction of features, the classification of the data set and regression with the built models. Recent studies on machine-learning-enhanced gas sensing performance are then systematically reviewed and compared, including improved gas selectivity and enhanced long-term stability (improved drift compensation). Furthermore, applications of a gas sensor or sensor array in monitoring the freshness of food/meat or the quality of outdoor air, identifying the kind of food and realizing early cancer diagnosis are presented to illustrate the positive effects of machine-learning algorithms. Finally, the possible challenges and prospects for machine-learning-enhanced gas sensors (arrays) are discussed.
2 Common Process to Collect Gas Sensing Features and Conduct Machine-Learning Algorithms
Several references have revealed that the gas sensing performance of a gas sensor can be successfully improved with machine-learning techniques [52–55]. The sensing device can be a single gas sensor or a sensor array consisting of several gas sensors (commonly 3–4) [44, 54–57]. In a gas sensor array, each gas sensor exhibits promising sensing performance toward a particular target gas, so that the array as a whole can be used to detect several kinds of gases. The sensing performance of a single gas sensor (or of each gas sensor in an array) toward gases of different concentrations is systematically investigated under different working temperatures (or working voltages) and/or different relative humidities. Some references have also reported that the sensing performance toward a mixture of two or more gases can be studied to better understand the improved gas selectivity of a gas sensor or sensor array [41, 58, 59]. Several sensing parameters (labeled as features) are extracted from the dynamic sensing response curve to collect enough data for the machine-learning process [44]. Different studies extract different features. For example, Hasan et al. extracted ten typical features from the response curve to build the data set [60], whereas Khan et al. focused mainly on the sensor response of each gas sensor in a sensor array [44]. Figure 1 shows a typical process for collecting features from the sensing response curves and then classifying a target gas and predicting its concentration with machine-learning algorithms. The collected data set is preprocessed with noise filtering or normalization to bring the features onto a uniform scale.
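As an illustration of the feature-extraction and normalization steps described above, the following minimal Python sketch computes a few of the commonly used features (response, rise time, area under the curve and maximum slope) from one synthetic response curve. The synthetic curve, the 10–90% rise-time definition and the helper names `extract_features` and `min_max_normalize` are assumptions made for illustration; they are not taken from the cited works.

```python
import math

def extract_features(t, r, baseline):
    """Return a small feature dictionary for one response curve.

    t: sample times (s); r: sensor signal (e.g. resistance); baseline: signal in air.
    All feature definitions here are illustrative assumptions.
    """
    rel = [(x - baseline) / baseline for x in r]          # relative change of the signal
    peak = max(rel)                                       # "sensor response" of this curve
    # 10-90 % rise time: first sample at or above each threshold
    t10 = next(ti for ti, v in zip(t, rel) if v >= 0.1 * peak)
    t90 = next(ti for ti, v in zip(t, rel) if v >= 0.9 * peak)
    steps = range(len(t) - 1)
    # area under the relative-response curve (trapezoidal rule)
    area = sum(0.5 * (rel[i] + rel[i + 1]) * (t[i + 1] - t[i]) for i in steps)
    # largest finite-difference slope
    max_slope = max((rel[i + 1] - rel[i]) / (t[i + 1] - t[i]) for i in steps)
    return {"response": peak, "rise_time": t90 - t10, "area": area, "max_slope": max_slope}

def min_max_normalize(rows):
    """Scale every feature column of a list of rows to the range [0, 1]."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0                           # avoid division by zero for constant columns
        scaled_cols.append([(v - lo) / span for v in col])
    return [list(row) for row in zip(*scaled_cols)]

# Synthetic exposure: signal rises exponentially toward a 50 % increase over 60 s.
t = [float(i) for i in range(61)]
r = [100.0 * (1.0 + 0.5 * (1.0 - math.exp(-ti / 10.0))) for ti in t]
features = extract_features(t, r, baseline=100.0)
```

In practice one such feature vector would be computed per exposure cycle, and the vectors for all cycles would be stacked and normalized column by column before any classifier is trained.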
An unsupervised method, PCA, and supervised methods such as SVM, neural network (NN) and Naïve Bayes (NB) are the two main approaches widely used to analyze the collected data set. PCA uses an orthogonal linear transformation to convert possibly correlated features into uncorrelated attributes (namely principal components). During this process, the dimension of the collected data is reduced while most of the information it carries is retained. Several references have confirmed that PCA can successfully reduce a batch of
Fig. 1 A typical process of gas sensing measurements, feature extraction and further data analysis under machine-learning algorithms. Reprinted with permission from Ref. [60]. Copyright 2022, Wiley-VCH
data into n (mainly 2–3) useful principal components (PCs). These PCs can be used as the input data for the machine-learning algorithm. A part of the collected data set (usually over 50%) is selected as the training data and the rest is used as the test data. The training set is used to teach the system how to classify the studied gas and estimate its concentration. The test data are used to evaluate the performance of the studied gas sensor or gas sensor array, mainly according to the percentage of correct classifications and the average error of the predicted gas concentration [43]. The cumulative variance explained by PC1 to PCn is evaluated to assess how efficiently PCA separates the clusters (each cluster representing an individual gas). In a practical experiment, a series of algorithms is adopted to conduct and evaluate the classification, mainly including decision tree (DT), linear discriminant (LD), k-nearest neighbor (kNN), ensemble bagged trees (EBT), NB, SVM and NN [40, 44, 61]. The classification accuracy obtained by each algorithm is calculated and compared to screen out the best-performing classifier for accurately identifying the target gases in single-gas or mixed conditions. To compare the algorithms fairly, the training and testing procedure is repeated several times and the average accuracy of each algorithm is calculated. In addition, a regression is performed to predict the gas concentration based on the model built by the algorithms listed above. One possible method is to train the model on the features of higher- and lower-concentration gas, and then to use the features of an intermediate-concentration gas, which the model has not previously seen, to verify the validity of the predicted concentration.
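The workflow above (PCA for dimension reduction, a train/test split, and a comparison of classifiers by their average accuracy over repeated runs) can be sketched as follows. This is only a minimal illustration under stated assumptions: the data are synthetic, the 70/30 split and five repeats are arbitrary choices, and two simple stand-in classifiers (kNN and nearest centroid, written with NumPy only) replace the DT, SVM, NB and NN models named in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic feature table (assumption): 3 hypothetical gases, 40 curves each, 6 features per curve.
centers = rng.normal(0.0, 3.0, size=(3, 6))
X = np.vstack([c + rng.normal(0.0, 0.5, size=(40, 6)) for c in centers])
y = np.repeat(np.arange(3), 40)

def pca_fit(X, n_components):
    """Return (mean, components, explained-variance ratios) from an SVD of the centered data."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    return mean, vt[:n_components], ratio[:n_components]

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])

def centroid_predict(X_train, y_train, X_test):
    """Assign each test point to the class whose training mean is closest."""
    labels = np.unique(y_train)
    cents = np.array([X_train[y_train == c].mean(axis=0) for c in labels])
    d = np.linalg.norm(X_test[:, None, :] - cents[None, :, :], axis=2)
    return labels[d.argmin(axis=1)]

def mean_accuracy(predict, n_repeats=5, train_fraction=0.7):
    """Average test accuracy over several random train/test splits."""
    scores = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        n_train = int(train_fraction * len(y))
        tr, te = idx[:n_train], idx[n_train:]
        mean, comps, _ = pca_fit(X[tr], n_components=2)   # PCA is fitted on the training part only
        Z_tr = (X[tr] - mean) @ comps.T
        Z_te = (X[te] - mean) @ comps.T
        scores.append(np.mean(predict(Z_tr, y[tr], Z_te) == y[te]))
    return float(np.mean(scores))

_, _, ratio = pca_fit(X, n_components=2)                  # variance explained by PC1 and PC2
results = {"kNN": mean_accuracy(knn_predict),
           "nearest centroid": mean_accuracy(centroid_predict)}
```

Fitting the PCA on the training portion only, and then projecting the test portion with the same mean and components, keeps the test data unseen during model building, which is the purpose of the split described in the text.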
3 Enhanced Gas Sensing Behavior Using Machine-Learning Techniques
3.1 Enhancement in Gas Selectivity and Concentration Prediction
Recently, the gas selectivity of a single gas sensor or a gas sensor array has been reported to be effectively enhanced under a selected machine-learning method [62–67]. The authors typically selected different algorithms to classify the features of several gases and then compared the accuracies of the adopted algorithms [66, 68]. Based on the classification results and the training data, the concentration of a target gas can also be predicted in an interfering environment.
3.1.1 Improved Gas Sensing Performance of a Single Gas Sensor
A series of references have reported that the gas sensing performance (including selectivity) of a single gas sensor, especially the semiconductor-based one, could be
successfully enhanced with the help of machine-learning algorithms. For example, Hasan et al. synthesized a novel inkjet-printable rGO/CuCoOx composite through a facile one-pot hydrothermal method [60]. The CuCoOx nanoclusters (with an average size of ~100 nm) were distributed on the rGO nanosheets (Fig. 2c, d). The prepared composite was mixed with a solution of IPA and 2-butanol to form a stable and aggregation-free functional ink (Fig. 2a). The ink was then inkjet-printed on interdigital electrodes with a finger width/gap of 50/42 μm (Fig. 2b). The room-temperature gas sensing performance of the composite was measured at 1 atm (1.01 × 10^5 Pa) under a working voltage of 1 V. The composite showed typical p-type sensing behavior toward 50 ppb NO2 at room temperature (Fig. 2e). The sensor response also depended strongly on the relative humidity (RH). The assembled gas sensor exhibited a sensor response of 9.41% or 13.16% to 250 ppb NH3 when the RH was 10% or 20%, respectively (Fig. 2f, g). As shown in Fig. 2g, the sensor response of the composite to 250 ppb NO2 under an RH of 20% was close to that to 400 ppb NO2. The authors therefore used a machine-learning strategy to distinguish between pure NO2 and NO2 in humid air. The sensor responses of the composite to the different gas species were analyzed with PCA. The class of a target gas and its concentration were used as labels for the 2D scatter points in the PCA space (Fig. 2h). These labeled data were then used as the feature data for the machine-learning process. A PCA-assisted kNN algorithm was selected to predict the concentration of the target gas. The results showed that this method achieved high accuracy in identifying specific gases and predicting their concentrations under an interfering atmosphere.
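A PCA-assisted kNN regression of the kind described here can be sketched in a few lines of Python. The sketch below is not the implementation of Ref. [60]: the three-feature response model, the concentration and humidity grid, the choice of two principal components and k = 5 are all assumptions chosen only to show the mechanics (standardize, project onto principal components, average the concentrations of the nearest labeled neighbors).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training grid: NO2 concentration (ppb) x relative humidity (%), 5 curves per pair.
conc = np.repeat([50.0, 100.0, 250.0, 400.0, 500.0], 20)
rh = np.tile(np.repeat([0.0, 10.0, 20.0, 30.0], 5), 5)

def synthetic_features(conc, rh):
    """Made-up three-feature response model; not fitted to any real sensor."""
    resp = 0.03 * conc * (1.0 + 0.02 * rh)          # response grows with concentration and humidity
    rise = 40.0 / (1.0 + 0.01 * conc) + 0.2 * rh    # rise time shortens with concentration
    F = np.column_stack([resp, rise, resp * rise])
    return F + rng.normal(0.0, 0.02, F.shape) * F.std(axis=0)   # small measurement noise

def fit_pca(F, n_components=2):
    """Standardize the features and return (mean, std, principal axes)."""
    mean, std = F.mean(axis=0), F.std(axis=0)
    _, _, vt = np.linalg.svd((F - mean) / std, full_matrices=False)
    return mean, std, vt[:n_components]

def project(F, mean, std, comps):
    """Map feature vectors into the principal-component space."""
    return ((F - mean) / std) @ comps.T

def knn_regress(P_train, target, P_query, k=5):
    """Predict the target as the mean over the k nearest training points."""
    d = np.linalg.norm(P_query[:, None, :] - P_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return target[nearest].mean(axis=1)

F_train = synthetic_features(conc, rh)
mean, std, comps = fit_pca(F_train)
P_train = project(F_train, mean, std, comps)

# Two new (synthetic) measurements taken at conditions covered by the training grid.
F_query = synthetic_features(np.array([100.0, 400.0]), np.array([10.0, 20.0]))
pred_conc = knn_regress(P_train, conc, project(F_query, mean, std, comps))
```

Because a kNN regressor averages the labels of stored neighbors, its predictions always lie within the range of the training concentrations; it interpolates between measured conditions rather than extrapolating beyond them.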
A high accuracy could be obtained via the PCA-assisted kNN, being 97.12% or 88.5% for NO2 or RH, respectively (Fig. 2i, j). This meant that the assembled gas sensor could reliably identify NO2 under a machine-learning algorithm, indicating the improved gas selectivity of the rGO/CuCoOx. In addition, the 500 ppb NO2 was also accurately predicted by the selected method. Their work showed that the machine-learning algorithm was effective in improving the gas sensing behavior of a metal oxide-based gas sensor. It would be better if the accuracies of other algorithms were also investigated and compared with that of the kNN to find the optimum algorithm. Kim et al. assembled a single gas sensor based on Pt-decorated SnO2 nanowires and studied its improved gas sensing performance under a machine-learning method [69]. In their study, the Pt-decorated SnO2 nanowires showed outstanding gas sensing performance toward 0.1–100 ppm ethanol at 250 °C. The authors also explored the sensing properties toward benzene, acetone, hydrogen, toluene and ethanol at concentrations of 0.1–100 ppm and working temperatures of 200–400 °C. The assembled gas sensor showed similar sensing performance toward these gases, indicating the poor selectivity of the Pt-decorated SnO2. To overcome this issue, the authors defined a “ratio of responses” between the target gas and the interfering gases and obtained a single five-dimensional (5D) data set (named a fingerprint) for each gas. The dimension of the fingerprints was then reduced with a PCA method to better visualize the relationships among the data
Fig. 2 a The prepared functional rGO/CuCoOx ink and b the assembled gas sensor. c, d SEM images of the obtained rGO/CuCoOx composite. e Gas sensing performance of the composite to 50 ppb NO2 at room temperature. f, g Gas sensing behavior of the composite to different gases or mixtures. h Analysis results from the PCA scatter plot for five target gases. i, j Predicted versus true values of i the NO2 concentration and j the RH obtained via a kNN algorithm. Reprinted with permission from Ref. [60]. Copyright 2022, Wiley-VCH
points. Furthermore, supervised methods were selected to help the system distinguish the different gases. Five algorithms, including classification tree, logistic regression, random forest and support vector machine, were applied to classify the points of the five gases and thereby improve the gas selectivity. These five methods were reported to recognize perfectly the gas to which each point belonged. In addition, the obtained dataset
was also used to train a model for the quantitative prediction of the gas concentration. The results showed that the concentrations of benzene, acetone, hydrogen, toluene and ethanol could be well predicted, with all the obtained points distributed along the diagonal. The authors also found that benzene, ethanol and hydrogen were easier to predict, with low mean errors of 3.7%, 4.7% and 7.3%, respectively. In contrast, the mean errors for acetone and toluene were higher, being 19.3% and 35.3%, respectively. More work could be done to decrease the mean errors for acetone and toluene so that these gases are better predicted. Croy et al. assembled a nanosensor based on graphene nanosheets to detect ammonia (NH3) and phosphine (PH3) at room temperature [70]. The graphene-based nanosensor exhibited a better sensing performance toward NH3 than toward PH3 (see Fig. 3a). Specifically, the sensor response of the assembled sensor was 92.0% to 100 ppb NH3, whereas the sensor response to 100 ppb PH3 was only approximately 25%. The authors collected a series of data in 24 individual measurement profiles, and the obtained data were normalized via an L2 norm algorithm. Eleven typical features were extracted from each sensing cycle for the machine-learning algorithm (Fig. 3b). The 11 features of NH3 and PH3 were thoroughly compared and analyzed via an unsupervised PCA and a supervised LDA. In the score plots in Fig. 3c–f, the NH3 was mainly located on the left side and the PH3 in the middle region, with the reference gas (N2) mainly distributed on the right side. The first PC explained 49.1% of the variance, more than the second (24.7%) or third (11.0%) PC. Furthermore, the authors evaluated several critical metrics for the 100 ppb NH3 and PH3 to study the identification performance of the assembled sensor.
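Identification metrics such as accuracy, sensitivity and specificity are commonly derived from a confusion matrix by treating one gas as the positive class and all other gases as negative (one-vs-rest). The short Python sketch below shows that calculation. The confusion-matrix counts are hypothetical and are not taken from Ref. [70]; the function name `one_vs_rest_metrics` is likewise an illustrative choice.

```python
def one_vs_rest_metrics(cm, k):
    """Accuracy, sensitivity and specificity for class k.

    cm[i][j] is the number of samples whose true class is i and predicted class is j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]                                   # class k correctly identified
    fn = sum(cm[k]) - tp                            # class k missed
    fp = sum(row[k] for row in cm) - tp             # other gases mistaken for class k
    tn = total - tp - fn - fp                       # other gases correctly rejected
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts (rows: true NH3, PH3, N2; columns: predicted NH3, PH3, N2).
cm = [[5, 0, 0],
      [0, 3, 1],
      [0, 3, 6]]
nh3 = one_vs_rest_metrics(cm, 0)
ph3 = one_vs_rest_metrics(cm, 1)
```

In this made-up example every NH3 sample is classified correctly, whereas PH3 and N2 are partly confused with each other, so only the PH3 metrics fall below 1.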
70% of the obtained data was used to train the LDA classifier and the remaining 30% was used to validate the trained classifier. The NH3 was reported to be well identified, with the accuracy, sensitivity and specificity all being 100% (see Fig. 3g). For the PH3, the identification was moderate due to the low sensor response of the graphene-based sensor: the accuracy, sensitivity and specificity were only 77.8%, 75.0% and 78.6%, respectively (Fig. 3h). However, the identification of the PH3 was reported to improve when its concentration was increased to 1000 ppb, with the accuracy, sensitivity and specificity reaching 94.7%, 75.0% and 100.0%, respectively. These results revealed that the graphene-based sensor could detect NH3 or PH3 at room temperature with the aid of machine-learning techniques. The research of Yang et al. showed that a machine-learning algorithm could further improve the selective ethanol-sensing behavior of ZnO-based nanomaterials [71]. The authors selected ZnO nanomaterials prepared with 0.5, 1, 3, 6 or 30 g of urea to assemble gas sensors for detecting ethanol, named S1, S2, S3, S4 and S5, respectively. The assembled gas sensors showed promising sensing performance toward 10–120 ppm ethanol under a working voltage of 5 V. The working temperature was also found to effectively modulate the ethanol-sensing behavior of the ZnO. Specifically, the S4 showed a higher sensor response (88) to 100 ppm ethanol than the S5 (60) at 400 °C. The response/recovery time of the best-performing S5 was 15/17 s at 350 °C. For the S4 at 400 °C, the response time and the recovery time were longer, being 26 s and 56 s, respectively. Furthermore, the ZnO nanomaterial exhibited a
Fig. 3 a Dynamic sensing performances of the assembled gas sensor based on graphene (inset) to 100 ppb NH3, PH3 and reference gas N2. b Extracted features from a single sensing curve. c, d 2D and 3D PCA score plots for 100–1000 ppb c NH3 or d PH3. e, f LDA score plots for e 100 ppb or f 500 ppb NH3 and PH3. g Identification results of 500 ppb gas in a confusion matrix under the LDA classifier algorithm. h Sensor performance results to 500 ppb NH3, PH3 and N2 using the hold-out cross-validation method. i Identification accuracy of 500 ppb gas using different classifier algorithms. Reprinted with permission from Ref. [70]. Copyright 2022, Wiley-VCH
relatively high sensor response of ~60 or ~20 at 350 °C to 100 ppm of the competing gas p-propanol or acetone, respectively. The obtained data set was further processed by random shuffling, smoothing and normalization. Four algorithms, namely PCA, SVM, extreme learning machine (ELM) and backpropagation (BP) neural network, were used to train and test the collected data set via the Scikit-learn package in Python. The PCA results showed that the data for the different gases occupied distinct positions without overlapping, indicating that the target gas could be recognized with good verification accuracy. The SVM method exhibited excellent performance in identifying the target VOCs, with an overall accuracy as high as 99%. The concentration of the selected gas was then predicted with BP or ELM. The authors found that the sum of squared errors (SSE) of the BP was 6.994, larger than that of the ELM (only 1.121). The smaller SSE meant that the ELM performed better than the BP as a regressor. In addition, the predicted concentrations were extremely close to the actual concentrations in their work. These results showed that the SVM could be effective in helping the ZnO nanomaterial to distinguish a target gas, and that the ELM would be a possible method to predict the concentration of the target ethanol. Singh et al. also assembled a single gas sensor based on ZnO nanostructures and swept its working temperature to improve its selectivity under machine-learning algorithms [72]. The authors studied the CO2, NH3 and H2S sensing behaviors of the prepared ZnO nanostructures at working temperatures of 250–350 °C. The sensor response of the ZnO increased with increasing concentration of the target gas. The ZnO nanostructure exhibited the highest sensor response to NH3 at the working temperature of 350 °C.
For the NH3 or CO2, the highest sensor response could be obtained at a medium working temperature of 250 °C. A total of 87 data vectors (containing 522 features) were obtained, and the authors used a Python script to process the collected data with the NB, SVM, RF and logistic regression (LR) algorithms. 80% of the collected data set was used as the training data and 20% as the test data. The data processing was conducted five times, and the accuracies of the five runs were averaged. The results showed that the LR performed well in predicting the target gas, with an average accuracy of ~85%. In contrast, the performance of the NB was poor, with an average accuracy of only ~65%. To improve the prediction accuracy, the authors also studied the performance of the four algorithms when the relative logic was taken into account. It was reported that including the relative logic could increase the average accuracy of the RF from ~76% to ~100%. Based on these results, a machine-learning algorithm should also be a feasible way to improve the selective response of a single gas sensor based on a semiconductor metal oxide. It should be noted that the standard deviation of the sensor response of a metal oxide-based gas sensor at different working temperatures should be treated carefully. Operating temperatures close to each other should not be used together because the sensor responses at these temperatures are similar. A possible method would be to test the gas sensing performance of the
metal oxide-based gas sensor at different working temperatures (with a sufficient interval). MoS2 nanosheet-decorated SnO2 nanofibers were synthesized by Hieu et al. via a two-step method [73]. The gas sensor based on the composite showed a lower optimal working temperature (150 °C) than the bare SnO2 nanofibers (300 °C). The sensor response of the composite was as high as ~11 to 10 ppm SO2 at 150 °C. However, like many metal oxides, the composite showed cross-sensitivity to CO, H2 and NH3 at 150 °C, which meant that it might have difficulty distinguishing among CO, H2 and NH3. To solve this problem, the authors used an LDA method to classify these gases based on the variation of the gas response of the composite with temperature. The sensor responses of the composite to each gas concentration at working temperatures of 150, 200 and 250 °C were collected as the feature data for the LDA algorithm. The LDA method could effectively reduce the 3D vector of input features to a 2D point. Furthermore, the discriminant values for CO, H2 and NH3 were distributed in separate regions, in the ranges of −3 to −2, 13 to 15 and −13 to −12, respectively. The authors also stated that an SVM algorithm could be used to construct a training model to further identify the separate regions of the different gases. This result indicated that unknown data could be distinguished according to their position in the 2D plot obtained by the LDA. Therefore, the gas selectivity of the MoS2 nanosheet-decorated SnO2 nanofibers could be improved via a strategy of LDA combined with SVM. A gas sensor based on a single SnO2 nanowire was also assembled by Tonezzer to selectively detect ethanol under machine-learning algorithms [43]. In his work, the sensor response of the single SnO2 nanowire was ~7 to 50 ppm ethanol at its optimum working temperature of 350 °C.
The response/recovery time of the sensor decreased as the operating temperature increased from 200 to 400 °C. The response/recovery times for acetone, ammonia, carbon monoxide, hydrogen, nitrogen dioxide and toluene showed a similar tendency. The relationship between the temperature and the sensor response was unique for each gas and was named the thermal fingerprint by the author. The thermal fingerprint of each gas moved up when the concentration of the test gas was increased. PCA was selected to reduce the dimension of the thermal fingerprints of the gas sensor. Each 5D point was reduced to a 3D point by the PCA, which allowed the gas points to be visualized while keeping most of the information. The variances explained by the principal components PC1, PC2 and PC3 were 84.86%, 10.42% and 4.23%, respectively, so that the three principal components together accounted for as much as 99.51% of the total variance. Furthermore, SVM was used to classify each measurement as one of the test gases. A first 5D data set of 225 gas injections was used as the training set, and a second data set of 200 gas injections was used as the test set. The SVM recognized the gas to which a point belonged in most cases. However, the method was reported to perform poorly for ammonia, which was confused with nitrogen dioxide. The sensitivity and precision for ammonia were only 60%, lower than those for the other gases (100%). The least squares SVM was then used to estimate the gas concentration of each point by dividing the training and
test data set into sub-datasets. The result showed that all the solid points lay quite close to the diagonal line, meaning that the concentrations of the test gases could be well predicted. However, two points for NO2 were found to be far away from the diagonal, meaning that NO2 might not be effectively predicted by the least squares SVM. The average errors for ethanol, hydrogen, CO, acetone, ammonia and toluene were 19.2%, 9.8%, 23.1%, 18.0%, 15.4% and 17.6%, respectively. For NO2, the average error was much larger, at 496.8%. Based on these results, a gas sensor could identify which gas was present and predict its concentration with the help of machine-learning techniques.
Guha et al. developed a gas sensor based on SnO2 hollow spheres and investigated its improved gas sensing performance to VOCs via machine-learning methods [42]. Uniform SnO2 hollow spheres with a rutile-phase tetragonal structure were synthesized via a facile hydrothermal route. The SnO2 hollow spheres exhibited typical n-type gas sensing behavior to the VOCs (mainly formaldehyde, methanol, propanol and toluene) at working temperatures of 200–350 °C. They showed the best gas sensing performance to 200 ppm formaldehyde, methanol, propanol and toluene at 250, 250, 250 and 350 °C, with sensor responses of ~185, 80, 225 and 14, respectively (Fig. 4a–d). As a result, the sensor based on the SnO2 hollow spheres might exhibit cross-sensitivity to the four VOCs. The authors used two different methods, fast Fourier transform (FFT) and discrete wavelet transform (DWT), to extract features from the sensing curves. In total, 192 signals (response curves) were collected during the test. SVM, RF and MLP were selected to conduct the machine-learning process. RF exhibited the best performance when the features were extracted with the FFT.
The average accuracy of the RF method was 93.15%, higher than that of MLP (91.32%) or SVM (90.52%) (Fig. 4e). For the features extracted with the DWT, the RF method also showed the highest average accuracy of 96.84%, while the average accuracies of MLP and SVM were 94.73% and 95.79%, respectively (Fig. 4f). The average accuracy of SVM, MLP and RF was high enough that the four VOCs could be distinguished. These results also showed that all three methods achieved a higher average accuracy with the DWT features than with the FFT features. This is probably because the FFT gives only overall frequency information for the whole duration of the signal, whereas the DWT also retains information about when the frequency components occur. The authors further used the RF algorithm for the quantification analysis. The data set was divided into four parts of 48 measurements each. Of these, 44 measurements were used as the training set and the remaining 4 as the test set. The predicted values for the VOCs were almost the same as the actual values, with the dots (predicted values) lying quite close to the dashed diagonal line (perfect prediction) in Fig. 4g. In addition, the average errors for concentration quantification were 6.21%, 6.88%, 5.62% and 7.27% for formaldehyde, methanol, propanol and toluene, respectively. These results revealed that the feature extraction method might affect the accuracy of the machine-learning algorithms. In their study, the sensor response of the SnO2 hollow spheres to toluene was highest at 350 °C, the upper limit of the tested working temperature range of 200–350 °C. It might be better to do more research to systematically explore the optimal working temperature for toluene. It
Fig. 4 Gas sensor responses of SnO2 hollow spheres to 25–200 ppm of a formaldehyde, b methanol, c propanol and d toluene at working temperatures of 200–350 °C. Classification performance of RF, MLP and SVM based on features extracted by e FFT and f DWT. g Predicted concentration versus true concentration of the different test gases. Reprinted with permission from Ref. [42]. Copyright 2022, Elsevier
should be noted that the same team also used a similar machine-learning strategy to successfully improve the gas sensing behaviors of WO3 nanoplates and SnO2 hollow spheres to other VOCs [49, 50]. The gas sensing performance of single gas sensors based on a SnO2/Pt/WO3 nanofilm or on graphene was also reported to be effectively enhanced via machine-learning methods [74, 75]. In addition, Krivetskiy et al. recently used statistical shape analysis (SSA) to successfully improve the selective CO sensing performance of a SnO2-based gas sensor [76].
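The FFT-based feature extraction that Guha et al. applied to the response curves can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `dft_magnitude_features`, the number of retained coefficients and the synthetic response curve are assumptions made for illustration, and a plain discrete Fourier transform from the Python standard library stands in for an optimized FFT routine (both give the same coefficients).

```python
import cmath
import math


def dft_magnitude_features(signal, n_features=4):
    """Return the magnitudes of the first n_features DFT coefficients.

    A naive O(N^2) discrete Fourier transform is used for clarity;
    an FFT computes the same coefficients faster.
    """
    n = len(signal)
    features = []
    for k in range(n_features):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        # Divide by the length so curves of different duration compare.
        features.append(abs(coeff) / n)
    return features


# Hypothetical response curve: exponential rise toward a plateau.
curve = [1.0 - math.exp(-t / 10.0) for t in range(64)]
features = dft_magnitude_features(curve)
```

The resulting feature vector could then be passed to any classifier, such as the RF, MLP or SVM models discussed above. Because each coefficient summarizes the whole curve, the features carry no information about when a change in the signal occurred.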
3.1.2 Improved Gas Sensing Performance of a Gas Sensor Array
Recent reports revealed that machine-learning techniques could also be used to enhance the gas sensing performance of a gas sensor array [77]. For example, a batch-uniform gas sensor array based on semiconductor metal oxides could realize the selective detection of CO, NH3, NO2, CH4 and acetone (C3H6O) with a deep-learning algorithm, the convolutional neural network (CNN) [78]. The authors have
used a method of glancing angle deposition (GLAD) to prepare nanocolumnar films of metal oxides (SnO2, In2O3, WO3 and CuO). The gas sensors based on these films were assembled on a MEMS-based suspended microheater platform (Fig. 5a, b). Four gas sensors were integrated into one chip, and four chips with a size of 3.3 mm × 3.3 mm were then integrated on a printed circuit board (PCB) to form a gas sensor array (Fig. 5c). To check the stability of the gas sensors, the authors studied the initial resistances of a series of samples. For example, 24 sensors based on In2O3 films were investigated and compared. The average base resistance of these 24 samples was 27.07 kΩ with a standard deviation of 4.03 kΩ. This meant that the base resistance of the assembled gas sensors was relatively stable, which might be attributed to the uniformity of the prepared nanofilms. To promote the interaction between the target gas and the sensing material, the MEMS-based microheater platform, with an applied power as low as 11 mW, was used to operate the gas sensors at a working temperature of 250 °C. An integrated sensor control module was used to drive the gas sensor array and collect the gas sensing response data (Fig. 5d). The four gas sensors fabricated in each chip exhibited uniform sensor responses to a target gas. In the case of the sensors based on SnO2, the relative standard deviation of the four gas sensors was calculated to be approximately 5%, low enough to indicate a promisingly uniform gas sensing performance. The collected response data were first normalized, and then a common logarithm was applied to offset the scale deviation. The interval time of the sensing test was set to 10 s, and the collected data were input into the CNN in one second. An 8 × 10 matrix was obtained and used as the input data for the CNN.
For each sensing material, the sensing data for the five gases from three of the gas sensors in a chip were used as the training data set, and the data from the fourth gas sensor were used as the test data set. The results showed that the types of the test gases could be well classified by the CNN with a total accuracy of 98.06%. Meanwhile, the concentrations of the gases could also be well predicted by the CNN with a low average error of 10.15%. The classification accuracy of the CNN was also found to be higher than that of the SVM (only 60%), whose low accuracy was probably due to the overlap of the responses to the target gases in the transient region. These results revealed that the CNN might be superior to the traditional machine-learning algorithm for classifying the target gas (Fig. 5e–i). In their study, 16 gas sensors were used in the gas sensor array to realize the selective detection of the target gas. It might be better to modify the gas sensing material to decrease the number of required gas sensors and to reduce the power consumption of the device.
In a similar strategy, Xu et al. assembled an e-nose composed of three gas sensors (QS-01, TGS2600 and TGS2602) and a humidity sensor to detect formaldehyde [79]. Three different algorithms, BP, SVM and radial basis function (RBF) neural network, were used to improve the formaldehyde sensing performance of the assembled e-nose. In total, 766 data points were collected by the three gas sensors: 182 were obtained for formaldehyde at concentrations lower than 1 ppm, and the remaining 584 for formaldehyde at higher concentrations. 70% of the data set was used as the training set to establish the models through the machine-learning algorithms. The remaining data formed the test set, which was used to evaluate the behavior of the built
Fig. 5 a Schematic of high batch-uniform gas sensors based on semiconductor metal oxide (SMO) fabricated by a GLAD method. b Microscopic image of the assembled gas sensor on a suspended microheater platform. c Digital image of a 4 × 4 gas sensor array integrated on a PCB and d the corresponding sensor control module. Prediction results of e gas types summarized in a confusion matrix and f gas concentrations normalized in 0–1 for the test data set. g Response time needed to predict the gas types by the assembled e-nose system. The real-time prediction results of gas types and concentrations to the target gas of h CO and i C3H6O. Reprinted with permission from Ref. [78]. Copyright 2022, American Chemical Society
models and find the optimum algorithm. Their further investigation showed that the self-validation results of the BP were quite close to those of the SVM method, indicating that both methods were efficient in processing the data set of the e-nose. For the RBF, the predicted results were found to differ from the detected results, which indicated that the RBF might not be suitable for establishing the prediction models for the assembled e-nose. Further cross-validation showed that the BP also performed best in predicting the results. The mean deviation and the standard deviation for BP were as low as 0.0755 and 0.0819 ppm, respectively, lower than those of the SVM (0.1802 and 0.1362 ppm) and RBF (0.1333 and 0.1540 ppm). These results indicated that the BP method might be more effective for processing the data set of the assembled e-nose and predicting the gas concentration.
The study of Wang et al. revealed that machine-learning methods could help design In2O3 nanotube-based gas sensors with high trimethylamine (TMA) sensing performance [80]. Pure In2O3 nanotubes were first synthesized via electrospinning. The authors then prepared a series of nanotubes doped with different molar percentages of Ga. The pure In2O3 nanotubes and the ones doped with 1% Ga, 10% Ga and 20% Ga were assembled in a sensor array to detect TMA at a working temperature of 280 °C. Four different gases (TMA, xylene, ethanol and H2S) at different concentrations were investigated with the gas sensor array. The results showed that the responses of all four sensors increased with increasing TMA concentration. Meanwhile, the sensor response to TMA was reported to be close to that to ethanol. As a result, it was hard to clearly distinguish TMA in a mixture of TMA and ethanol.
The sensor responses of the sensor array to mixtures of TMA and ethanol at different concentrations were used as the training set for the SVM algorithm. The classification accuracy for the 66 collected samples was as high as 100%. Three typical methods, namely radial basis function neural network (RBFNN), back propagation neural network (BPNN) and principal component analysis combined with linear regression (PCA-LR), were then used to build the prediction models with the classification labels. The sensor responses of the four sensors were used as the input data set. It was found that the RBFNN and BPNN were more effective in predicting the mixture than the PCA-LR. The prediction accuracy of the PCA-LR was reported to be lower than 50%. The lowest average relative error (ARE) for H2S was obtained by the BPNN, while the RBFNN exhibited the lowest ARE for
the TMA, xylene, ethanol and the mixed gas of TMA/ethanol. For TMA, the RBFNN successfully predicted the concentration with a low ARE of only 1.22%, much lower than that obtained by the BPNN (2.5%) or PCA-LR (13.34%). For the mixture, the lowest ARE was also obtained by the RBFNN, at 1.74%. These results revealed that the sensor array based on the Ga-doped In2O3 nanotubes could be used to selectively detect TMA with the RBFNN algorithm. The same team has also proposed a similar method to monitor the freshness of meat and fruits by detecting ethanol, NH3 and TMA [81].
Khan et al. assembled a gas sensor array consisting of eight metal oxide-based gas sensors and used machine-learning algorithms to improve its gas selectivity [44]. The authors studied the sensing performance of the assembled gas sensor array to NO2, ethanol, SO2, H2 and mixtures of these gases in the presence of H2O and O2 at 20 °C under UV light. The sensor response to each gas at each concentration was obtained by averaging three consecutive sensor responses to the gas at the same concentration. The response of each gas sensor in the array to each gas at a given concentration was unique and could be used as the data set for the subsequent data processing. The authors first used an unsupervised PCA method to identify the individual target gases. A 120 × 8 matrix was collected as the data set by studying the gas sensing performance of the gas sensor array to a test gas at 6 different concentrations. 60% of the collected data set was used as the training set, and the rest of the raw data was the test set. The results showed that three principal components (PC1, PC2 and PC3) were obtained. Their variances were calculated to be 64.38%, 18.53% and 12.19%, respectively, giving a total variance of as high as 95.1%.
In the case of the mixtures, two different mixtures were prepared: mixture-1 consisted of NO2, SO2, O2 and H2O, and mixture-2 consisted of ethanol, H2, O2 and H2O. In each mixture, the concentration of O2 was fixed at 20,000 ppm and the RH at 50%. In total, 36 different concentrations were tested, and a new data set of 72 × 8 was obtained. In the following study, four algorithms, DT, SVM, NB and k-NN, were used to classify the gas type. Both the SVM and the NB exhibited a classification accuracy of 100%, higher than that of the DT (98%) or k-NN (96%). Based on these results, it would be reasonable to infer that the SVM and NB were the more effective methods for accurately identifying the target gases and their mixtures. The gas sensor array in their study showed promising gas sensing performance to common gases and their mixtures under UV light at 25 °C. The small size, low power consumption and low cost of the gas sensor array might make it, combined with machine-learning algorithms, a commercially viable gas sensing device, which should help promote the application of sensor arrays in gas monitoring.
Thorson et al. assembled a gas sensor array to detect complex pollutant mixtures and to identify their likely sources with a machine-learning method [82]. In their work, the authors developed a gas sensor array to detect several gases via an Arduino-based open-source platform (the Y-Pod). The Y-Pod was integrated with four types of low-cost gas sensors or detectors. Electrochemical sensors and metal oxide-based gas sensors were then additionally
added to the Y-Pod to enhance its gas sensing performance. The added metal oxide sensors could be used to detect NO2, O3, CO, VOCs and CH4. The electrochemical sensors were added to sense NO, NO2, CO, O3 and H2S. Four main gas sources were simulated in their study: mobile emissions, biomass burning, natural gas emissions and gasoline vapors. The compositions of the gas mixtures, the temperature and the RH were changed during the test to study the environmental effects on the low-cost gas sensor array. Specifically, the temperature was set to 25–45 °C and the RH to 40–80%. In total, 3280 instances were detected and recorded during the study. Multiple linear regression (MLR), ridge regression (RR), RF, Gaussian process regression (GPR) and NN were used to build the regression models. The NN and the RF were found to perform better than the other methods due to their stronger ability to encode complexities. In the classification results, the output data were in a range of 0–1, in which 0 or 1 meant the absence or presence of a gas source, respectively. The authors pointed out that the best combination was a linear regression and an RF classification. A high F1 score (a popular metric providing a balanced indication of both the precision and recall) of 0.718 was obtained during the optimization and model fitting. The results showed that the built system and the adopted method could be successfully used to estimate the concentrations of several compounds and further identify the presence of each of those sources. It could be reasonable to infer that gas mixtures might also be identified and predicted by a gas sensor array with machine-learning algorithms.
Apart from the published reports discussed above, Wei’s team has studied the effectiveness of the RF algorithm in optimizing the CO or CH4 sensing performance of a sensor array consisting of TGS2602, TGS2600, TGS2610, TGS2611, TGS2603 and TGS2620 [83].
Furthermore, the work of Itoh et al. and Kroutil et al. has revealed that machine-learning algorithms could effectively enhance the gas sensing performance of a sensor array [56, 84].
The reported work discussed above revealed that machine-learning methods could successfully improve the gas selectivity of a gas sensor or sensor array. This could effectively solve the problem of the cross-sensitivity of semiconductor-based gas sensors or sensor arrays. The accurate identification of a target gas would further promote the application of high-performance gas sensors or sensor arrays with machine-learning techniques. Furthermore, predicting the concentration of a target gas with machine-learning algorithms would also be helpful in designing and assembling high-performance gas sensors and gas sensor arrays, and even in developing smart e-noses. The boom in published references in this area also revealed the great potential of machine-learning algorithms in enhancing the gas sensing performance of a gas sensor or gas sensor array, especially the gas selectivity [40, 41, 85–87].
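The F1 score quoted above for source identification combines precision and recall into one number. The following minimal sketch shows the calculation for a single emission source; the function name `f1_score`, the 0.5 decision threshold applied to outputs in the 0–1 range and the example labels are all assumptions for illustration and are not taken from Ref. [82].

```python
def f1_score(y_true, y_score, threshold=0.5):
    """F1 score for binary presence/absence labels.

    y_true holds 0 (source absent) or 1 (source present); y_score holds
    model outputs in the range 0-1. The 0.5 threshold is an assumption.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels and model outputs for one emission source.
truth = [1, 1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.8, 0.7, 0.1, 0.6]
score = f1_score(truth, scores)
```

In this toy case there are three true positives, one false positive and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75 as well.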
3.2 Enhanced Long-Term Drift Compensation
Sensor drift or chemical deterioration occurs when a gas sensor or sensor array works for a period of time [88, 89]. This is an essential issue for a high-performance
gas sensor because its sensing behavior deteriorates along with the sensor drift. The sensor drift leads to a fluctuation in the gas sensing performance and results in an inaccurate sensor response, response time or recovery time. This means that a gas sensor suffering from drift cannot accurately detect a target gas and its concentration. Much attention has been paid to the long-term drift compensation of gas sensor arrays. Recent reports revealed that machine-learning techniques could also be effective in improving the drift compensation to stabilize the gas sensing behavior of a gas sensor or sensor array. There is a well-known, publicly donated data set created at the University of California San Diego [88, 89]. The gas sensing features were recorded by 16 gas sensors (four TGS2600, four TGS2602, four TGS2610 and four TGS2620) during 36 months from January 2008 to February 2011. The collected data set is widely known as the gas sensor array drift data set (GSAD) and can be accessed online from the UCI machine learning repository [52, 90, 91]. Six gases (ethanol, ethylene, ammonia, acetaldehyde, acetone and toluene) were detected, and 13,910 measurements in total were collected in the data set. The data set was divided into 10 batches according to time. For example, the data collected in the first and second months formed batch 1, and the data obtained in the 36th month formed batch 10. Based on the GSAD, some researchers have tried to improve the long-term drift compensation of the gas sensor array via a transfer learning approach. For example, Yuan et al. investigated a strategy to compensate for the drift of gas sensors and improve the accuracy of the drift compensation via a novel balanced distribution adaptation (BDA) method [92]. Two static features, three rising dynamic features and three decaying dynamic features were recorded for each sensor and studied.
The collected data were then normalized because the value ranges of the sensor features were inconsistent. Two different settings were adopted by the authors to study the efficiency of the selected BDA strategy. For Setting 1, batch 1 was used as the training set, and batches 2–10 were used as the test sets. The results showed that the accuracy for batch 2, batch 3 and batch 6 was higher than that for the other batches. Specifically, the average accuracy for batch 2, batch 3 or batch 6 was as high as 80%. For batch 8, the average accuracy was only ~30%. Furthermore, the average recognition accuracy of the BDA method was 68.92%, higher than that of the joint distribution adaptation (JDA, not over 62.36%) or the NN (56.69%). In Setting 2, the authors dynamically used batch K-1 as the training set and the following batch K to test the built model. For example, the model trained on batch 1 was tested on batch 2. It was found that the accuracy under Setting 2 was higher than that under Setting 1 with the same BDA method. This might be because the drift between adjacent batches in Setting 2 was smaller than that between batch 1 and the later batches in Setting 1. Specifically, the accuracy of the BDA drift compensation method for batch 3 was 80.71% under Setting 1 but improved to 97.95% under Setting 2. The average accuracy of the BDA method under Setting 2 was 81.06%, also higher than that of the JDA (not over 79.44%) or NN (65.23%). Based on these results, it could be inferred that the BDA method was more effective in improving the drift recognition accuracy compared to the JDA
or KNN. Therefore, the domain adaptation method should be selected carefully to obtain the highest accuracy. These results reasonably revealed that machine-learning algorithms could be a possible way to improve the long-term stability of metal oxide-based gas sensor arrays.
Dong et al. have also proposed an online inertial machine-learning method to improve the reliability of the gas sensor array based on the GSAD [93]. The authors tried to improve the data processing algorithm for drift compensation because the data processing directly affects the output data of the machine olfactory system. In their algorithm design, a null data check and index reconstruction were added to enhance the universality and robustness of the data processing. All the data were converted to their absolute values and further processed via min-max normalization. This method could successfully process the data set and solve some problems during the data processing, including sign errors, decimal point errors and outliers in data samples. Two data sets were also defined by the authors to investigate the long-term drift compensation of the studied sensor array. For data set 1, batch 1 was used as the training set and batches 2–10 were the test set. Different from the study of Yuan et al., for data set 2 the training set was changed to batches 1–2, and batches 3–10 were used to test the built model. The authors used the SVM with a radial basis function kernel as the base classifier. Meanwhile, the multi-classification was achieved with the penalty parameter set to 1. The results showed that data set 2 gave better results than data set 1. This result revealed that the number of initial samples significantly affects the output of the machine-learning method.
For example, the accuracy for batch 3 was 98.38% with data set 2, higher than that with data set 1 (76.21%). Moreover, the method proposed by the authors also performed better than the plain SVM. The accuracy for batch 3 was only 88.27% with the SVM classification algorithm for data set 2, lower than the 98.38% obtained by the authors’ method. In addition, it was found that the accuracy degraded after about 1–2 batches when the SVM was used. With the proposed method, however, the degradation of accuracy was delayed by an extra 1–4 batches, meaning that the effective time of the sensor array could be extended by 4–10 months. Based on these results, the drift of the gas sensor could be effectively compensated by the method proposed by the authors. The strategy in their work successfully improved the drift compensation of a gas sensor array and could serve as a reference for designing high-performance gas sensors.
In addition, Priyonko et al. have used an extreme learning machine technique to mitigate sensor drift via an ensemble of classifiers [94]. The authors first used batch 1 and batch 2 as the training data set, and batches 3–10 were used as the testing data set. The results showed that the classification accuracy decreased with the run time of the sensor. This meant that sensor drift occurred when the sensor worked for a long time. Furthermore, the authors also showed that relying on a stand-alone classifier might limit the accuracy. When multiple classifiers were combined in an ensemble, the accuracy of the extreme learning machine improved to 77.28%, higher than that of the stand-alone one (65.75%). These results revealed that the ensemble classification technique would be a better method to
achieve a higher accuracy, and should therefore also be a more effective strategy for improving the sensor drift compensation. These reports reasonably indicated that machine-learning techniques could effectively improve the long-term stability of semiconductor-based gas sensors.
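Two pieces of the drift-compensation workflows described in this section can be written down compactly: the batch pairing used in Setting 1 and Setting 2 of Yuan et al., and the absolute-value plus min-max normalization step described for Dong et al. The sketch below is only an illustration under stated assumptions: the function names `batch_pairs` and `abs_min_max` are hypothetical, the normalization is assumed to be applied per feature column, and no drift-compensation model (BDA, JDA or the online inertial method) is implemented here.

```python
def batch_pairs(setting, n_batches=10):
    """Return (train_batch, test_batch) pairs for the two protocols.

    Setting 1: train on batch 1, test on each of batches 2..n_batches.
    Setting 2: train on batch K-1, test on the following batch K.
    """
    if setting == 1:
        return [(1, k) for k in range(2, n_batches + 1)]
    if setting == 2:
        return [(k - 1, k) for k in range(2, n_batches + 1)]
    raise ValueError("setting must be 1 or 2")


def abs_min_max(column):
    """Take absolute values, then rescale one feature column to 0-1."""
    values = [abs(v) for v in column]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


setting_1 = batch_pairs(1)   # (1, 2), (1, 3), ..., (1, 10)
setting_2 = batch_pairs(2)   # (1, 2), (2, 3), ..., (9, 10)
scaled = abs_min_max([-4.0, 2.0, 6.0])
```

In Setting 2 the training batch is always adjacent in time to the test batch, which is consistent with the smaller drift between batches suggested above as the reason for its higher accuracy.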
3.3 Accurate Classification of Food
The smell of food or the gases (including VOCs) released by food could also be detected by a gas sensor or sensor array to identify the kind of food with machine-learning algorithms. Swager et al. selected 20 gas sensors or selectors (named S1–S20 in their work) based on functionalized single-walled carbon nanotubes to assemble a gas sensor array that accurately discriminates cheese, liquor and edible oil via a machine-learning strategy [95]. The carbon nanotube-based gas sensors were assembled on a soft polyimide substrate with 16 carbon-based electrodes pre-prepared on it (Fig. 6a). The gas sensing performance was studied by exposing the gas sensor to a target gas for 120 s and then to air for 600–900 s. The authors first studied the gas sensing performance of each gas sensor to the three target categories and repeated the test to choose three suitable items. There were 12 sets of response data for each gas sensor, and 2160 individual sensing traces in total were collected for the machine-learning process. The dynamic gas sensing performance of one selector (S4) to three different kinds of cheese (cheddar, Mahón and pecorino) is shown in Fig. 6b. Two models, a featurized RF model (f-RF) and KNN, were established to investigate the classification ability of the gas sensor array. During the study, the authors explored and analyzed each of the selectors with the f-RF or KNN. 67% of the collected data set was used as the training set and the remaining 33% as the test set. The f-RF model performed better in differentiating the three cheese samples than the KNN model. The classification accuracy of the f-RF model for the three kinds of cheese was 78 ± 20% (Fig. 6c, the error value of 20% being a standard deviation), higher than that of the KNN model (69 ± 14% in Fig. 6d).
These results indicated that the accuracies of the two models were moderate when each selector was used individually. The authors also investigated the possible combinations of 4 selectors to improve the classification accuracy and find the optimum combination of selectors. In total, 4845 combinations were considered during the study. The results showed that the best combination improved the classification accuracy of the f-RF for cheeses to 94 ± 8% (Fig. 6e), also higher than that of the KNN (79 ± 12% in Fig. 6f). Similar improvements in the identification accuracy were also found for the liquors and the oils, as shown in Fig. 6g. In addition, the classification of five kinds of cheese, liquor and edible oil samples was also investigated and compared. The f-RF again performed better than the KNN, and the cheeses were identified most effectively by both models. Specifically, the average accuracy of the f-RF in distinguishing the five cheese samples was 91 ± 5%, which was higher than that for the five liquor or oil samples (78 ± 8% and
73 ± 7%, respectively). For the KNN, the average accuracies were 73 ± 6%, 40 ± 9% and 36 ± 8% for the five-class data sets of cheese, liquor and oil, respectively. These results revealed that the carbon nanotube-based gas sensor array could discriminate the food samples well with the f-RF machine-learning algorithm. The work of Tan et al. also revealed that the fermentation degree of cocoa beans could be effectively sensed by an e-nose consisting of nine gas sensors with a bootstrap forest, ANN or boosted tree machine-learning method [96]. In addition, Enériz et al. used a gas sensor array to track food quality in situ via a field-programmable gate array-based ANN [97]. Very recently, Liu et al. also reported that a sensor array consisting of six semiconductor gas sensors could effectively distinguish five Chinese liquors with different alcohol contents using a deep-learning-based method [98].
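The combinatorial selector scan described for Ref. [95] can be illustrated with a short sketch. The figure of 4845 is simply the number of ways to choose 4 selectors out of 20. The code below is a minimal standard-library illustration under stated assumptions, not the authors' pipeline: the data are synthetic, a plain k-nearest-neighbors vote stands in for the f-RF model, the pool is reduced to 6 synthetic selectors scanned in pairs to keep the run short, and all function names (`knn_predict`, `accuracy`, `scan_combinations`, `make_synthetic`) are hypothetical. The reading of 2160 traces as 20 selectors × 9 items × 12 repeats is also our interpretation of the text.

```python
import itertools
import math
import random

# Number of ways to choose 4 selectors out of 20, as quoted in the text.
N_COMBINATIONS = math.comb(20, 4)  # 4845
# Assumed reading of the trace count: 20 selectors x 9 items x 12 repeats.
N_TRACES = 20 * 9 * 12  # 2160


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def accuracy(train, test, cols, k=3):
    """Test-set accuracy of k-NN using only the selector columns in `cols`."""
    train_x = [[row[c] for c in cols] for row, _ in train]
    train_y = [label for _, label in train]
    hits = sum(
        knn_predict(train_x, train_y, [row[c] for c in cols], k) == label
        for row, label in test
    )
    return hits / len(test)


def scan_combinations(train, test, n_selectors, size):
    """Score every subset of `size` selectors; return (best accuracy, best subset)."""
    return max(
        (accuracy(train, test, cols), cols)
        for cols in itertools.combinations(range(n_selectors), size)
    )


def make_synthetic(n_per_class=30, n_selectors=6, seed=0):
    """Synthetic 3-class data: selectors 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    data = []
    for label in range(3):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_selectors)]
            row[0] += 4.0 * label
            row[1] -= 4.0 * label
            data.append((row, label))
    rng.shuffle(data)
    return data


data = make_synthetic()
split = round(0.67 * len(data))  # 67% training, 33% test, as in the text
train, test = data[:split], data[split:]
best_acc, best_cols = scan_combinations(train, test, n_selectors=6, size=2)
print(N_COMBINATIONS, N_TRACES, best_cols, round(best_acc, 2))
```

An exhaustive scan is feasible here because the number of subsets stays in the thousands; for larger selector pools a greedy or randomized search would likely be needed instead.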
3.4 Monitoring of the Freshness of Meat
Similarly, Astuti et al. used a gas sensor array to detect the typical gas patterns and identify E. coli-contaminated chicken meat with a machine-learning method [99]. In their work, fresh chicken was bought from a market and then intentionally contaminated with E. coli bacteria obtained from the Central Health Laboratory Surabaya, Indonesia. The contaminated chicken meat was kept in an incubator at 37 °C for 24 h. The typical gases produced by the chicken meat were detected by an e-nose system consisting of six gas sensors. The resistance of a gas sensor changed when the produced gas was introduced into the e-nose system, and the output voltage (in mV) changed correspondingly. A significant change in the output voltage indicated the introduction of a high-concentration target gas. The detected data set was then separated into seven groups and classified with SVM or RF. The accuracies of both methods were over 98% for the untreated chicken meat, being 99.25% and 98.61% for the RF and SVM classifiers, respectively. For chicken meat contaminated with E. coli, the SVM classifier was less effective: its accuracy was 86.66%, lower than that of the RF classifier (98.42%). Based on these results, the RF classifier should be more suitable for processing the data set detected by the e-nose system. The accuracy of the RF classifier in their study was also higher than the accuracies reported for fresh (95.2%) and thawed (94.67%) chicken meat with a fuzzy K-nearest neighbors (F-KNN) algorithm. This result supported the method selected by Astuti et al. In addition, the e-nose system was also reported to be able to indicate the damage percentage of the selected meat.
For example, trimethylamine and ammonia are produced as the protein content of meat changes. These two gases could be used to determine the quality of the meat. The more trimethylamine and ammonia were released, the more severe the protein damage in the chicken meat would be. Therefore, the machine-learning method would be
Fig. 6 a Schematic diagram of the assembled gas sensor (or selector) based on single-walled carbon nanotubes. b Dynamic gas sensing performance of one selector (S4) toward three different kinds of cheese (cheddar, Mahón and pecorino). Identification accuracies of f-RF (c) and KNN (d) for three different cheeses (cheddar, Mahón and pecorino) using a single selector. The shaded gray area represents random guessing (33% accuracy). Identification accuracies of f-RF (e) and KNN (f) for three different kinds of cheese (cheddar, Mahón and pecorino) using the combinatorial selector scan. g Highest accuracy of the optimum selector combination for cheese, liquor or oil. Reprinted with permission from Ref. [95]. Copyright 2019, American Chemical Society
effective in improving the gas sensing performance of the e-nose system, with high classification accuracy for the data set. More recently, the same team also proposed a similar machine-learning method to identify the freshness of chicken meat instead of relying on the texture, color or aroma of the meat [100]. The typical setup of their experiments can be seen in Fig. 7a. The authors assembled a gas sensor array containing four metal oxide-based gas sensors (MQ7, MQ8, MQ135 and MQ136) to detect the variation in the smell of the chicken meat. In their study, the chicken meat was also purchased from a local market and was stored in a 150 mL glass beaker. The change in the smell of the chicken meat was detected by the gas sensor array at temperatures of 4, 30, 35, 40, 45 and 50 °C, with an interval of 6 h for each temperature. The sensing time was fixed at 120 s to limit mixing with the surrounding air and keep the test conditions consistent. The NH3 released during the spoiling of the chicken meat was sensed by the gas sensor array, and approximately 1500 data points were collected in the test. The results showed that the concentration of detected NH3 increased with increasing shelf time and storage temperature. This meant that the chicken meat became less fresh when the storage time was longer and the storage temperature was higher. The NH3 concentration was the highest at the storage temperature of 50 °C and was almost unchanged at 4 °C. PCA was chosen to classify the collected data set and reduce its dimensionality. Four PCs were obtained in their study. The percentages of variance were 75%, 18%, 6.8% and 0.2% for PC1, PC2, PC3 and PC4, respectively, as seen in Fig. 7b.
For the classification based on shelf time (storage time), PC1–PC3 was selected (the detailed classification results can be seen in the original reference) because the data overlapped in PC1–PC2 or PC1–PC4. The cumulative percentage of PC1–PC3 was calculated to be ~95% (Fig. 7c). In the case of the storage temperature, PC1–PC4 was chosen (the detailed classification results can also be seen in the original reference) because there were also strongly overlapping clusters in PC1–PC2 or PC1–PC3. The cumulative percentage of PC1–PC4 was 94.8% (Fig. 7d). 80% of the data set was used as the training set and 20% as the test set. Two hidden layers (between the input layer and the output layer) were used to build the deep neural network (DNN) model. The reported accuracy of the DNN model was as high as 98.85% for the training set and 98.7% for the test set, while the error was as low as 0.015 and 0.013, respectively. These results showed that the proposed PCA and DNN model could effectively help the gas sensor array to identify the freshness of the meat. However, more models should be built and compared in terms of accuracy to find the optimal model for identifying the freshness of chicken meat. The study of Zhang et al. revealed that a gas sensor or sensor array could also be used to detect fish quality with the help of machine-learning algorithms [101]. Furthermore, a single SnO2 nanowire-based gas sensor was reported to be able to sense the freshness of fish (one marble trout) and meat (one piece of pork) with an SVM machine-learning method [53]. A similar detection of pork meat
Fig. 7 a Schematic diagram of the setup of the designed experiment. b Data visualization of the results for fresh and unfresh chicken meat based on the PCA method. The separate dashed circles indicate that the clusters of the two categories could be successfully separated. c The result of PC1–PC3 for different shelf times. d The result of PC1–PC4 for different storage temperatures. Reprinted with permission from Ref. [100]. Copyright 2022, Elsevier
with a k-NN algorithm was also reported by Lumogdang and co-workers [102].
3.5 Assisted Early Cancer Diagnosis
Early cancer detection assisted by machine-learning algorithms has also attracted much attention in recent years. Huang and co-workers developed a gas sensor array to detect lung cancer with the help of a machine-learning method [61]. The authors conducted a well-designed study from July 2016 to May 2018 at the National Taiwan University Hospital. The lung cancer patients were carefully selected to ensure the reliability of the study. The authors then collected breath samples from 117 cases and 199 controls via a standardized procedure. The breath samples were obtained when the patient was intubated with an endotracheal tube before surgery. All samples were analyzed and compared within two hours. An e-nose composed of 32 nanocomposite-based gas sensors was used to analyze the alveolar air during the experiment. The selected sensors were effective in sensing ethanol or isopropanol among the lipid peroxidation-related VOCs. In their study, the data obtained from 2016 to 2017 were used to build the prediction model with LDA and non-linear SVM. There were 203 cases chosen to build the prediction model and further used for internal validation.
The average accuracies of LDA and SVM were reported to be higher than 90%, which demonstrated the effectiveness of these two methods. Furthermore, 41 subjects collected in 2018 were used as the independent data set for external validation. Based on the built SVM model, the sensitivity, specificity and overall accuracy for the external validation were higher than 80%. As a result, it could be inferred that machine learning should also be an effective method for detecting cancers from breath gases. This would be helpful for early cancer diagnosis and warning. More recently, the same team also used a gas sensor array to collect and analyze alveolar air to diagnose breast cancer using a machine-learning method [103]. The typical process of their work can be seen in Fig. 8a. The authors investigated 899 subjects from July 2016 to June 2018. A total of 439 subjects with a mean age of 55.03 were included in the final analyses, comprising 351 cases of malignant breast tumors and 88 controls. The breath samples were collected with a mainstream CO2 monitor to exclude contamination from the dead space. A heat-moisture exchanger was applied to remove humidity from the exhaled breath, as seen in Fig. 8b. The breath samples were collected into a Tedlar bag through a three-way valve when the end-tidal CO2 concentration reached its plateau. The obtained samples were then studied by an e-nose consisting of 32 carbon nanotube-based sensors at a temperature of 19.5–23.9 °C and an RH of 53–64%. The volatile metabolites from a patient suffering from breast cancer interact with the sensing material and change the resistance of the gas sensor. The e-nose analyzed the exhaled breath with a fingerprinting approach.
Five typical breath biomarkers, including 2-propanol, 2,3-dihydro-1-phenyl-4(1H)-quinazolinone, 1-phenyl-ethanone, heptanal and isopropyl myristate, were detected and identified during the study to establish a prediction model. The prediction accuracy was reported to be 91% for breast cancer in the test set with a random forest model. Meanwhile, the sensitivity was 86% and the specificity was 97%. The positive predictive value or the negative predictive value could be as high as 97%. In addition, a history of smoking, chemotherapy or diabetes also showed significant effects on the accuracy, as summarized in Fig. 8c. For example, the diagnostic odds ratio was found to be 9.12, 9.62 or 8.51 for the subjects with a history of smoking, chemotherapy or diabetes, respectively. These results further indicated the effectiveness of the selected model in analyzing breath samples and the soundness of the machine-learning technique in predicting breast cancer. However, all subjects received anesthetics for surgery during the study, which might be a limitation of the work.
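The diagnostic figures quoted in this section (sensitivity, specificity, predictive values, overall accuracy and diagnostic odds ratio) all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and were chosen only so that the sensitivity and specificity come out at the quoted 86% and 97%; they are not the study's data, and because the predictive values depend on the ratio of cases to controls, the other outputs are not expected to match the reported values. The function name `diagnostic_metrics` is also an assumption made for illustration.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased subjects classified as positive/negative.
    fp/tn: control subjects classified as positive/negative.
    The odds ratio is undefined (division by zero) if fp or fn is 0.
    """
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "odds_ratio": (tp * tn) / (fp * fn),      # diagnostic odds ratio
    }


# Hypothetical counts: 100 cases and 100 controls (not taken from Ref. [103]).
metrics = diagnostic_metrics(tp=86, fn=14, fp=3, tn=97)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting several of these metrics together matters when cases and controls are imbalanced, since accuracy alone can look high even if one class is poorly recognized.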
3.6 Low-Cost Air Quality Monitoring
An early study by Subramanian et al. focused on a strategy for building low-cost sensors that monitor air quality with machine-learning algorithms [104].
Fig. 8 a Schematic diagram of the principle of breath biopsy and the implementation of the machine-learning process to diagnose breast cancer. b A typical process to collect the alveolar air sample with the help of mainstream carbon dioxide monitoring and a heat-moisture exchanger. c Summary of the SROC curves for diagnostic accuracy including confounding factors or comorbidities. Reprinted with permission from Ref. [103]. Copyright 2021, Springer Nature
The authors pointed out that low-cost sensors often suffer from sensitivity to environmental conditions and from pollutant cross-sensitivities. This issue had not been solved or systematically addressed by laboratory calibrations. In their study, the authors developed a system of 16–19 real-time affordable multi-pollutant (RAMP) monitors and used them to detect five different gases (CO, CO2, NO2, SO2 and O3). The study was conducted over a long period from August 3, 2016 to February 7, 2017 on the Carnegie Mellon University campus in the Oakland neighborhood of Pittsburgh, PA, US. The dominant gas pollution at the site was vehicle emissions in the studied parking lot. The selected lot was small enough,
and there were few other local sources, indicating that the location was essentially an urban background site. The temperature during the study varied from −15 to 34 °C with the RH being 27–98%. There were 2688 data points obtained in 4 weeks at 15 min resolution during the study. 80% of the collected data set was used as the training data with five folds during the machine-learning process. The remaining data were used as the testing data to evaluate the reliability of the built model. Laboratory-based univariate linear regression (LAB), MLR and RF all matched the CO data well, with R² being 0.92, 0.92 and 0.91, respectively. For CO2, the RF model clearly outperformed the LAB and MLR models: the R² of the RF model was 0.85, much higher than that of the LAB (0.07) or MLR (0.06) model. Similar results were also found for NO2 and O3. With the adopted RF model, the fractional error was 14%, 2%, 29% and 15% for CO, CO2, NO2 and O3, respectively. Based on these results, the authors found that only the RF model could meet the recommendations of the US Environmental Protection Agency (EPA) Air Sensors Guidebook. Their work provided a possible way to build effective low-cost air quality sensors with outstanding performance. This study might also provide a possible method to assemble a mini air monitor and to collect effective data from a typical test lot in a big city. Furthermore, the method used to analyze the collected data was also clearly presented. It might be better if the trend of the change in CO, CO2, NO2 or O3 could be predicted with the model built by the authors. In the study of Ning et al., the limitation of short-term routine calibration was investigated and addressed by comparing the results of the existing MLR model with those of machine learning (ML) models [105].
The data were collected by 8 sets of sensor systems in SanMenXia city in Henan Province in Central China from August 2017 to March 2018. The temperature varied from −8 to 38 °C with an RH range of 17–99% during the measurement periods. Four gas pollutants (NO2, CO, O3 and SO2) and PM2.5 were detected and recorded every 5 s with the sensor devices operating in diffusion mode. The data set was then obtained by resampling to a 1 h resolution. In the study, the authors mainly focused on the variation of the NO2 concentration in SanMenXia city. Data from a 5-day moving window were used in the MLR and ML models to study the limitations of short-term routine calibration. For the long-term evaluation strategy, the data obtained from August 7, 2017 to November 20, 2017 (106 days) were used as the training set and the data collected from November 22, 2017 to March 20, 2018 (118 days) were used to validate the built model or adopted algorithm. The authors found that the RF or eXtreme Gradient Boosting (XGB) model could not be used to adjust data not collected during the calibration period. Furthermore, the variation in the NO2 concentration could not be well reflected by the short-term evaluation strategy. The ML models were more effective in providing calibration results with a lower root mean square error (RMSE) for short-term use. However, the authors also pointed out that these models might perform poorly in long-term validation. For the long-term calibration approach, ANN, MLR, RF, support vector regression (SVR), temperature look-up (TLU) and XGB models were used to predict the NO2 concentration. The results showed that all the models could provide a linear relationship between the predicted NO2 concentration
and the reference NO2 concentration from fixed air quality monitoring stations. The MLR model exhibited the worst performance, with the strongest underestimation (slope of 0.31) and the lowest consistency (Pearson R of 0.71). The RF and XGB models performed better, although some of the data deviated from the 1:1 line. In addition, the best performance for long-term validation (4 months) was found for the TLU model, with a high model inner coherence of over 0.91. The variation pattern of the NO2 concentration predicted by the TLU model was reported to be closer to the reference data. This study further revealed that the machine-learning technique could be used to detect or monitor gas pollutants under field conditions and would be helpful in improving air quality in real life.
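The calibration studies in this section are compared through a handful of regression metrics (R², RMSE, regression slope and Pearson R) computed between reference and sensor-predicted concentrations. The sketch below gives textbook definitions of these quantities together with the sampling arithmetic quoted above. It is a minimal standard-library illustration under stated assumptions: the concentrations are hypothetical, the function names are ours, and the cited studies may define R² or the fractional error differently (for example, R² as the squared Pearson correlation).

```python
import math


def resample_mean(values, window):
    """Average consecutive blocks of `window` readings; a trailing partial block is dropped."""
    return [
        sum(values[i:i + window]) / window
        for i in range(0, len(values) - window + 1, window)
    ]


def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    squared = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


def slope_and_pearson(reference, predicted):
    """Least-squares slope of predicted vs. reference, and the Pearson correlation."""
    n = len(reference)
    mx = sum(reference) / n
    my = sum(predicted) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, predicted))
    sxx = sum((x - mx) ** 2 for x in reference)
    syy = sum((y - my) ** 2 for y in predicted)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Sampling arithmetic quoted in the text.
points_4_weeks = 4 * 7 * 24 * (60 // 15)  # 15 min resolution over 4 weeks: 2688
readings_per_hour = 3600 // 5             # 5 s readings averaged to 1 h: 720

# Hypothetical reference concentrations and a model that underestimates them.
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [0.31 * c + 12.0 for c in reference]
slope, pearson = slope_and_pearson(reference, predicted)
print(points_4_weeks, readings_per_hour)
print(round(slope, 2), round(pearson, 2))
print(round(r_squared(reference, predicted), 3), round(rmse(reference, predicted), 2))
```

In this toy case the predictions are perfectly correlated with the reference yet strongly underestimate it, which is why a slope close to 1 and a low RMSE need to be checked alongside the correlation coefficient.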
4 Conclusions and Possible Challenges/Prospects
Semiconductor-based gas sensors have been widely used to detect small gas molecules, and their gas sensing performance can be further improved with machine-learning methods. Features extracted from the sensing curves of semiconductor-based sensors (mainly metal oxides) serve as the data set and are classified or regressed with machine-learning algorithms. The accuracies of the different algorithms should be calculated and systematically compared to find the optimal algorithm for a given target gas. Machine learning can help a gas sensor show better selectivity and long-term stability through post-processing of the features collected from the sensing curves. Furthermore, a gas sensor or sensor array can be successfully used to monitor changes in the indoor or outdoor environment with machine-learning methods. The classification of foods, the evaluation of the freshness of food and meat, the monitoring of outdoor air quality and the assisted diagnosis of lung cancer can also be realized with the help of an optimal machine-learning algorithm. Therefore, machine-learning techniques can improve the gas sensing behavior of gas sensors (or sensor arrays) and broaden their applications. As discussed in the previous section, a large amount of data must be detected and collected for the machine-learning process. A data set composed of various features might have to be obtained over a long time. A majority of semiconductor-based gas sensors were reported to exhibit gas sensing behaviors based on the reaction between adsorbed oxygen ions and target gas molecules [106–109]. The number of active sites on the surface of the semiconductor has an important impact on its sensing performance. However, the active sites might be occupied or covered by dust, H2O molecules in the air or other competitive gases, resulting in a decrease in the number of active sites.
The data detected by a gas sensor or sensor array might not be reliable if the concentration of H2O or VOCs or the dust content in the environment changed significantly during the long-term test. Furthermore, the economic cost of spending a considerable amount of time collecting the data set should also be considered in practice. For each gas sensor in a sensor array, the data set for each test gas might not be the same and should be collected separately.
Collecting a sufficiently large data set requires long-term testing. The stability of a gas sensor or sensor array and the cost of the test should not be ignored during the machine-learning process. Another issue is that an external device (such as a computer) is needed to process and analyze the collected data set. Post-processing of the data set might help to identify the detected gas and predict its concentration, but it cannot identify the kind of detected gas in time. The real-time analysis of the collected data has rarely been reported and has not yet been systematically investigated. It would be better if the sensing curves could be studied and the features extracted and analyzed in real time. A single gas sensor or sensor array might still suffer from cross-sensitivity when high-concentration competitive gases are present if it cannot classify the data set and identify the gas in real time. As a result, more attention should be paid to more effective machine-learning methods that better process the collected data and improve the real-time gas sensing performance of a gas sensor or sensor array. In addition, the need for an external device to conduct the machine-learning process might be another factor hindering the wider adoption of the assembled gas sensor or sensor array. A powerful microprocessor might need to be integrated into the gas sensing device to process the data set with machine-learning algorithms and improve the overall performance of a smart gas sensor (gas sensor array or e-nose).
Acknowledgements This work was financially supported by the National Natural Science Foundation of China (Grant no. 51802109, 51972102, 52072115 and U21A20500), the Department of Education of Hubei Province (Grant no. D20202903) and the Department of Science and Technology of Hubei Province (Grant no. 2022CFB525).
References
1. Chen, X., Wong, C. K., Yuan, C. A., & Zhang, G. (2013). Nanowire-based gas sensors. Sensors and Actuators, B: Chemical Sensors and Materials, 177, 178–195. 2. Tian, X., Cui, X., Lai, T., Ren, J., Yang, Z., Xiao, M., Wang, B., Xiao, X., & Wang, Y. (2021). Gas sensors based on TiO2 nanostructured materials for the detection of hazardous gases: A review. Nano Materials Science, 3, 390–403. 3. Cui, S., Pu, H., Wells, S. A., Wen, Z., Mao, S., Chang, J., Hersam, M. C., & Chen, J. (2015). Ultrahigh sensitivity and layer-dependent sensing performance of phosphorene-based gas sensors. Nature Communications, 6, 8632. 4. van den Broek, J., Abegg, S., Pratsinis, S. E., & Güntner, A. T. (2019). Highly selective detection of methanol over ethanol by a handheld gas sensor. Nature Communications, 10, 4220. 5. Mehdi Pour, M., Lashkov, A., Radocea, A., Liu, X., Sun, T., Lipatov, A., Korlacki, R. A., Shekhirev, M., Aluru, N. R., Lyding, J. W., Sysoev, V., & Sinitskii, A. (2017). Laterally extended atomically precise graphene nanoribbons with improved electrical conductivity for efficient gas sensing. Nature Communications, 8, 820. 6. Wang, J., Ren, Y., Liu, H., Li, Z., Liu, X., Deng, Y., & Fang, X. (2022). Ultrathin 2D NbWO6 perovskite semiconductor based gas sensors with ultrahigh selectivity under low working temperature. Advanced Materials, 34, 2104958.
7. Jeong, S. Y., Kim, J. S., & Lee, J. H. (2020). Rational design of semiconductor-based chemiresistors and their libraries for next-generation artificial olfaction. Advanced Materials, 32, 2002075. 8. Krishna, K. G., Parne, S., Pothukanuri, N., Kathirvelu, V., Gandi, S., & Joshi, D. (2022). Nanostructured metal oxide semiconductor-based gas sensors: A comprehensive review. Sensors and Actuators A: Physical, 113578. 9. Patial, P., & Deshwal, M. (2022). Selectivity and sensitivity property of metal oxide semiconductor based gas sensor with dopants variation: A review. Transactions on Electrical and Electronic, 23, 6–18. 10. Yoon, J.-W., & Lee, J.-H. (2017). Toward breath analysis on a chip for disease diagnosis using semiconductor-based chemiresistors: Recent progress and future perspectives. Lab on a Chip, 17, 3537–3557. 11. Wei, S., Li, Z., John, A., Karawdeniya, B. I., Li, Z., Zhang, F., Vora, K., Tan, H. H., Jagadish, C., & Murugappan, K. (2022). Semiconductor nanowire arrays for high-performance miniaturized chemical sensing. Advanced Functional Materials, 32, 2107596. 12. Yuan, Z., Yang, F., Meng, F., Zuo, K., & Li, J. (2021). Research of low-power MEMS-based micro hotplates gas sensor: A review. IEEE Sensors Journal, 21, 18368–18380. 13. Asri, M. I. A., Hasan, M. N., Fuaad, M. R. A., Yunos, Y. M., & Ali, M. S. M. (2021). MEMS gas sensors: A review. IEEE Sensors Journal, 21, 18381–18397. 14. Gao, X., & Zhang, T. (2018). An overview: Facet-dependent metal oxide semiconductor gas sensors. Sensors and Actuators, B: Chemical Sensors and Materials, 277, 604–633. 15. Wang, J., Shen, H., Xia, Y., & Komarneni, S. (2021). Light-activated room-temperature gas sensors based on metal oxide nanostructures: A review on recent advances. Ceramics International, 47, 7353–7368. 16. Li, Z., Li, H., Wu, Z., Wang, M., Luo, J., Torun, H., Hu, P., Yang, C., Grundmann, M., & Liu, X. (2019). 
Advances in designs and mechanisms of semiconducting metal oxide nanostructures for high-precision gas sensors operated at room temperature. Materials Horizons, 6, 470–506. 17. Rzaij, J. M., & Abass, A. M. (2020). Review on: TiO2 thin film as a metal oxide gas sensor. Journal of Chemical Reviews, 2, 114–121. 18. Liu, J., Zhang, L., Fan, J., Zhu, B., & Yu, J. (2021). Triethylamine gas sensor based on Ptfunctionalized hierarchical ZnO microspheres. Sensors and Actuators, B: Chemical Sensors and Materials, 331, 129425. 19. Bai, H., Guo, H., Wang, J., Dong, Y., Liu, B., Xie, Z., Guo, F., Chen, D., Zhang, R., & Zheng, Y. (2021). A room-temperature NO2 gas sensor based on CuO nanoflakes modified with rGO nanosheets. Sensors and Actuators, B: Chemical Sensors and Materials, 337, 129783. 20. Sui, N., Zhang, P., Zhou, T., & Zhang, T. (2021). Selective ppb-level ozone gas sensor based on hierarchical branch-like In2 O3 nanostructure. Sensors and Actuators, B: Chemical Sensors and Materials, 336, 129612. 21. Sharma, B., Sharma, A., & Myung, J.-H. (2021). Selective ppb-level NO2 gas sensor based on SnO2 -boron nitride nanotubes. Sensors and Actuators, B: Chemical Sensors and Materials, 331, 129464. 22. Pravarthana, D., Tyagi, A., Jagadale, T., Prellier, W., & Aswal, D. (2021). Highly sensitive and selective H2 S gas sensor based on TiO2 thin films. Applied Surface Science, 549, 149281. 23. Morais, P. V., Suman, P. H., Silva, R. A., & Orlandi, M. O. (2021). High gas sensor performance of WO3 nanofibers prepared by electrospinning. Journal of Alloys and Compounds, 864, 158745. 24. Ling, W., Zhu, D., Pu, Y., & Li, H. (2022). The ppb-level formaldehyde detection with UV excitation for yolk-shell MOF-derived ZnO at room temperature. Sensors and Actuators, B: Chemical Sensors and Materials, 355, 131294. 25. Li, Z., Lou, C., Lei, G., Lu, G., Pan, H., Liu, X., & Zhang, J. (2022). Atomic layer deposition of Rh/ZnO nanostructures for anti-humidity detection of trimethylamine. 
Sensors and Actuators, B: Chemical Sensors and Materials, 355, 131347. 26. Yang, W., Chen, H., & Lu, J. (2021). Assembly of stacked In2 O3 nanosheets for detecting trace NO2 with ultrahigh selectivity and promoted recovery. Applied Surface Science, 539, 148217.
27. Chen, L., Song, Y., Liu, W., Dong, H., Wang, D., Liu, J., Liu, Q., & Chen, X. (2022). MOFbased nanoscale Pt catalyst decorated SnO2 porous nanofibers for acetone gas detection. Journal of Alloys and Compounds, 893, 162322. 28. Li, T., Zhang, D., Pan, Q., Tang, M., & Yu, S. (2022). UV enhanced NO2 gas sensing at room temperature based on coral-like tin diselenide/MOFs-derived nanoflower-like tin dioxide heteronanostructures. Sensors and Actuators, B: Chemical Sensors and Materials, 355, 131049. 29. Song, Y. G., Park, J. Y., Suh, J. M., Shim, Y.-S., Yi, S. Y., Jang, H. W., Kim, S., Yuk, J. M., Ju, B.-K., & Kang, C.-Y. (2018). Heterojunction based on Rh-decorated WO3 nanorods for morphological change and gas sensor application using the transition effect. Chemistry of Materials, 31, 207–215. 30. Periyasamy, M., & Kar, A. (2020). Modulating the properties of SnO2 nanocrystals: Morphological effects on structural, photoluminescence, photocatalytic, electrochemical and gas sensing properties. Journal of Materials Chemistry C, 8, 4604–4635. 31. Bai, S., Guo, J., Shu, X., Xiang, X., Luo, R., Li, D., Chen, A., & Liu, C. C. (2017). Surface functionalization of Co3 O4 hollow spheres with ZnO nanoparticles for modulating sensing properties of formaldehyde. Sensors and Actuators, B: Chemical Sensors and Materials, 245, 359–368. 32. Dong, C., Zhao, R., Yao, L., Ran, Y., Zhang, X., & Wang, Y. (2020). A review on WO3 based gas sensors: Morphology control and enhanced sensing properties. Journal of Alloys and Compounds, 820, 153194. 33. Dey, A. (2018). Semiconductor metal oxide gas sensors: A review. Materials Science and Engineering B, 229, 206–217. 34. Korotcenkov, G. (2007). Metal oxides for solid-state gas sensors: What determines our choice? Materials Science and Engineering B, 139, 1–23. 35. Li, P., Cao, C., Shen, Q., Bai, B., Jin, H., Yu, J., Chen, W., & Song, W. (2021). 
Cr-doped NiO nanoparticles as selective and stable gas sensor for ppb-level detection of benzyl mercaptan. Sensors and Actuators, B: Chemical Sensors and Materials, 339, 129886. 36. Wang, Y., Cui, Y., Meng, X., Zhang, Z., & Cao, J. (2021). A gas sensor based on Ag-modified ZnO flower-like microspheres: Temperature-modulated dual selectivity to CO and CH4 . Surf. Interfaces, 24, 101110. 37. Li, H., Wu, C.-H., Liu, Y.-C., Yuan, S.-H., Chiang, Z.-X., Zhang, S., & Wu, R.-J. (2021). Mesoporous WO3 –TiO2 heterojunction for a hydrogen gas sensor. Sensors and Actuators, B: Chemical Sensors and Materials, 341, 130035. 38. Liu, A., Lv, S., Jiang, L., Liu, F., Zhao, L., Wang, J., Hu, X., Yang, Z., He, J., & Wang, C. (2021). The gas sensor utilizing polyaniline/MoS2 nanosheets/SnO2 nanotubes for the room temperature detection of ammonia. Sensors and Actuators, B: Chemical Sensors and Materials, 332, 129444. 39. Yaqoob, U., & Younis, M. I. (2021). Chemical gas sensors: Recent developments, challenges, and the potential of machine learning-A review. Sensors, 21, 2877. 40. Ha, N., Xu, K., Ren, G., Mitchell, A., & Ou, J. Z. (2020). Machine learning-enabled smart sensor systems. Advanced Intelligent Systems, 2, 2000063. 41. Ye, Z., Liu, Y., & Li, Q. (2021). Recent progress in smart electronic nose technologies enabled with machine learning methods. Sensors, 21, 7620. 42. Acharyya, S., Nag, S., Guha, P. K. (2022). Ultra-selective tin oxide-based chemiresistive gas sensor employing signal transform and machine learning techniques. Analytica Chimica Acta, 339996. 43. Tonezzer, M. (2019). Selective gas sensor based on one single SnO2 nanowire. Sensors and Actuators, B: Chemical Sensors and Materials, 288, 53–59. 44. Khan, M. A. H., Thomson, B., Debnath, R., Motayed, A., & Rao, M. V. (2020). Nanowirebased sensor array for detection of cross-sensitive gases using PCA and machine learning algorithms. IEEE Sensors Journal, 20, 6020–6028.
A Review of the High-Performance Gas Sensors Using Machine Learning
195
45. Leite, L. S., Visani, V., Marques, P. C. F., Seabra, M. A. B. L., Oliveira, N. C. L., Gubert, P., Medeiros, V. W. C. D., Albuquerque, J. O. D., & Lima Filho, J. L. D. (2021). Design and implementation of an electronic nose system for real-time detection of marijuana. Instrumentation Science and Technology, 49, 471–486. 46. Calderon-Santoyo, M., Chalier, P., Chevalier-Lucia, D., Ghommidh, C., & Ragazzo-Sanchez, J. A. (2010). Identification of Saccharomyces cerevisiae strains for alcoholic fermentation by discriminant factorial analysis on electronic nose signals. Electronic Journal of Biotechnology, 13, 8–9. 47. Bermak, A., Belhouari, S. B., Shi, M., & Martinez, D. (2006). Pattern recognition techniques for odor discrimination in gas sensor array. Encyclopedia of Sensors, 10, 1–17. 48. Liu, T., Zhang, W., McLean, P., Ueland, M., Forbes, S. L., & Su, S. W. (2018). Electronic nose-based odor classification using genetic algorithms and fuzzy support vector machines. International Journal of Fuzzy Systems, 20, 1309–1320. 49. Acharyya, S., Nag, S., & Guha, P. K. (2020). Selective detection of VOCs with WO3 nanoplates-based single chemiresistive sensor device using machine learning algorithms. IEEE Sensors Journal, 21, 5771–5778. 50. Acharyya, S., Jana, B., Nag, S., Saha, G., & Guha, P. K. (2020). Single resistive sensor for selective detection of multiple VOCs employing SnO2 hollowspheres and machine learning algorithm: A proof of concept. Sensors and Actuators, B: Chemical Sensors and Materials, 321, 128484. 51. Lee, J., Jung, Y., Sung, S.-H., Lee, G., Kim, J., Seong, J., Shim, Y.-S., Jun, S. C., & Jeon, S. (2021). High-performance gas sensor array for indoor air quality monitoring: The role of Au nanoparticles on WO3 , SnO2 , and NiO-based gas sensors. Journal of Materials Chemistry A, 9, 1159–1167. 52. Dennler, N., Rastogi, S., Fonollosa, J., van Schaik, A., & Schmuker, M. (2022). 
Drift in a popular metal oxide sensor dataset reveals limitations for gas classification benchmarks. Sensors and Actuators, B: Chemical Sensors and Materials, 361, 131668. 53. Tonezzer, M. (2021). Single nanowire gas sensor able to distinguish fish and meat and evaluate their degree of freshness. Chemosensors, 9, 249. 54. Abe, H., Kimura, Y., Ma, T., Tadaki, D., Hirano-Iwata, A., & Niwano, M. (2020). Response characteristics of a highly sensitive gas sensor using a titanium oxide nanotube film decorated with platinum nanoparticles. Sensors and Actuators, B: Chemical Sensors and Materials, 321, 128525. 55. Isik, E., Tasyurek, L. B., Isik, I., & Kilinc, N. (2022). Synthesis and analysis of TiO2 nanotubes by electrochemical anodization and machine learning method for hydrogen sensors. Microelectronic Engineering, 262, 111834. 56. Kroutil, J., Laposa, A., Ahmad, A., Voves, J., Povolny, V., Klimsa, L., Davydova, M., & Husak, M. (2022). A chemiresistive sensor array based on polyaniline nanocomposites and machine learning classification. Beilstein Journal of Nanotechnology, 13, 411–423. 57. Shiba, K., Tamura, R., Sugiyama, T., Kameyama, Y., Koda, K., Sakon, E., Minami, K., Ngo, H. T., Imamura, G., & Tsuda, K. (2018). Functional nanoparticles-coated nanomechanical sensor arrays for machine learning-based quantitative odor analysis. ACS Sensors, 3, 1592–1600. 58. Aliramezani, M., Norouzi, A., & Koch, C. R. (2020). A grey-box machine learning based model of an electrochemical gas sensor. Sensors and Actuators, B: Chemical Sensors and Materials, 321, 128414. 59. Laref, R., Losson, E., Sava, A., & Siadat, M. (2019). On the optimization of the support vector machine regression hyperparameters setting for gas sensors array applications. Chemometrics and Intelligent Laboratory, 184, 22–27. 60. Ogbeide, O., Bae, G., Yu, W., Morrin, E., Song, Y., Song, W., Li, Y., Su, B. L., An, K. S., & Hasan, T. (2022). 
Inkjet-printed rGO/binary metal oxide sensor for predictive gas sensing in a mixed environment. Advanced Functional Materials, 2113348. 61. Huang, C.-H., Zeng, C., Wang, Y.-C., Peng, H.-Y., Lin, C.-S., Chang, C.-J., & Yang, H.-Y. (2018). A study of diagnostic accuracy using a chemical sensor array and a machine learning technique to detect lung cancer. Sensors, 18, 2845.
196
S. Yang et al.
62. Barsan, N., Koziej, D., & Weimar, U. (2007). Metal oxide-based gas sensor research: How to? Sensors and Actuators, B: Chemical Sensors and Materials, 121, 18–35. 63. Wang, C., Yin, L., Zhang, L., Xiang, D., & Gao, R. (2010). Metal oxide gas sensors: Sensitivity and influencing factors. Sensors, 10, 2088–2106. 64. Fine, G. F., Cavanagh, L. M., Afonja, A., & Binions, R. (2010). Metal oxide semi-conductor gas sensors in environmental monitoring. Sensors, 10, 5469–5502. 65. Guo, S., Yang, D., Li, B., Dong, Q., Li, Z., Zaghloul, M. E. (2019). An artificial intelligent flexible gas sensor based on ultra-large area MoSe2 nanosheet. In 2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS) (pp. 884–887). IEEE. 66. Tonezzer, M., Izidoro, S. C., Moraes, J. P. A., & Dang, L. T. T. (2019). Improved gas selectivity based on carbon modified SnO2 nanowires. Frontiers in Materials, 6, 277. 67. Tonezzer, M., Le, D. T. T., Iannotta, S., & Van Hieu, N. (2018). Selective discrimination of hazardous gases using one single metal oxide resistive sensor. Sensors and Actuators, B: Chemical Sensors and Materials, 277, 121–128. 68. Yaqoob, U., Lenz, W. B., Alcheikh, N., Jaber, N., & Younis, M. I. (2022). Highly selective multiple gases detection using a thermal-conductivity-based MEMS resonator and machine learning. IEEE Sensors Journal, 22, 19858–19866. 69. Tonezzer, M., Kim, J.-H., Lee, J.-H., Iannotta, S., & Kim, S. S. (2019). Predictive gas sensor based on thermal fingerprints from Pt-SnO2 nanowires. Sensors and Actuators, B: Chemical Sensors and Materials, 281, 670–678. 70. Huang, S., Croy, A., Panes-Ruiz, L. A., Khavrus, V., Bezugly, V., Ibarlucea, B., & Cuniberti, G. (2022). Machine learning-enabled smart gas sensing platform for identification of industrial gases. Advanced Intelligent Systems, 4, 2200016. 71. Wang, T., Ma, H., Jiang, W., Zhang, H., Zeng, M., Yang, J., Wang, X., Liu, K., Huang, R., & Yang, Z. (2021). 
Type discrimination and concentration prediction towards ethanol using a machine learning-enhanced gas sensor array with different morphology-tuning characteristics. Physical Chemistry Chemical Physics: PCCP, 23, 23933–23944. 72. Kanaparthi, S., & Singh, S. G. (2021). Discrimination of gases with a single chemiresistive multi-gas sensor using temperature sweeping and machine learning. Sensors and Actuators, B: Chemical Sensors and Materials, 348, 130725. 73. Viet, N. N., Dang, T. K., Phuoc, P. H., Chien, N. H., Hung, C. M., Hoa, N. D., Van Duy, N., Van Toan, N., Son, N. T., & Van Hieu, N. (2021). MoS2 nanosheets-decorated SnO2 nanofibers for enhanced SO2 gas sensing performance and classification of CO, NH3 and H2 gases. Analytica Chimica Acta, 1167, 338576. 74. Van Toan, N., Hung, C. M., Hoa, N. D., Van Duy, N., Le, D. T. T., Hoa, N. T. T., Viet, N. N., Phuoc, P. H., & Van Hieu, N. (2021). Enhanced NH3 and H2 gas sensing with H2 S gas interference using multilayer SnO2 /Pt/WO3 nanofilms. Journal of Hazardous Materials, 412, 125181. 75. Hayasaka, T., Lin, A., Copa, V. C., Lopez, L. P., Loberternos, R. A., Ballesteros, L. I. M., Kubota, Y., Liu, Y., Salvador, A. A., & Lin, L. (2020). An electronic nose using a single graphene FET and machine learning for water, methanol, and ethanol. Microsystems & Nanoengineering, 6, 50. 76. Krivetskiy, V. V., Andreev, M. D., Efitorov, A. O., & Gaskov, A. M. (2021). Statistical shape analysis pre-processing of temperature modulated metal oxide gas sensor response for machine learning improved selectivity of gases detection in real atmospheric conditions. Sensors and Actuators, B: Chemical Sensors and Materials, 329, 129187. 77. Bae, G., Kim, M., Song, W., Myung, S., Lee, S. S., & An, K.-S. (2021). Impact of a diverse combination of metal oxide gas sensors on machine learning-based gas recognition in mixed gases. ACS Omega, 6, 23155–23162. 78. 
Kang, M., Cho, I., Park, J., Jeong, J., Lee, K., Lee, B., Del Orbe Henriquez, D., Yoon, K., & Park, I. (2022). High accuracy real-time multi-gas identification by a batch-uniform gas sensor array and deep learning algorithm. ACS Sensors, 7, 430–440. 79. Xu, L., He, J., Duan, S., Wu, X., & Wang, Q. (2016). Comparison of machine learning algorithms for concentration detection and prediction of formaldehyde based on electronic nose. Sensor Review, 36, 207–216.
A Review of the High-Performance Gas Sensors Using Machine Learning
197
80. Ren, W., Zhao, C., Liu, Y., & Wang, F. (2021). An In2 O3 nanotubes based gas sensor array combined with machine learning algorithms for trimethylamine detection. In 2021 IEEE 16th International Conference on Nano/Micro Engineered and Molecular Systems (NEMS) (pp. 1042–1046). IEEE. 81. Liu, Y., Zhao, C., Lin, J., Gong, H., & Wang, F. (2020). Classification and concentration prediction of VOC gases based on sensor array with machine learning algorithms. In 2020 IEEE 15th International Conference on Nano/Micro Engineered and Molecular System (NEMS) (pp. 295–300). IEEE. 82. Thorson, J., Collier-Oxandale, A., & Hannigan, M. (2019). Using a low-cost sensor array and machine learning techniques to detect complex pollutant mixtures and identify likely sources. Sensors, 19, 3723. 83. Wei, G., Zhao, J., Yu, Z., Feng, Y., Li, G., & Sun, X. (2018). An effective gas sensor array optimization method based on random forest. 2018 IEEE Sensors (pp. 1–4). IEEE. 84. Itoh, T., Koyama, Y., Shin, W., Akamatsu, T., Tsuruta, A., Masuda, Y., & Uchiyama, K. (2020). Selective detection of target volatile organic compounds in contaminated air using sensor array with machine learning: Aging notes and mold smells in simulated automobile interior contaminant gases. Sensors, 20, 2687. 85. Zhao, W., Bhushan, A., Santamaria, A. D., Simon, M. G., & Davis, C. E. (2008). Machine learning: A crucial tool for sensor design. Algorithms, 1, 130–152. 86. Hanga, K. M., & Kovalchuk, Y. (2019). Machine learning and multi-agent systems in oil and gas industry applications: A survey. Computer Science Review, 34, 100191. 87. Venketeswaran, A., Lalam, N., Wuenschell, J., Ohodnicki, P. R., Jr., Badar, M., Chen, K. P., Lu, P., Duan, Y., Chorpening, B., & Buric, M. (2022). Recent advances in machine learning for fiber optic sensor applications. Advanced Intelligent Systems, 4, 2100067. 88. Liu, T., Li, D., Chen, J., Chen, Y., Yang, T., & Cao, J. (2018). 
Gas-sensor drift counteraction with adaptive active learning for an electronic nose. Sensors, 18, 4028. 89. ur Rehman, A., Bermak, A., & Hamdi, M. (2019). Shuffled frog-leaping and weighted cosine similarity for drift correction in gas sensors. IEEE Sensors Journal, 19, 12126–12136. 90. Amarnath, B., Balamurugan, S., & Alias, A. (2016). Review on feature selection techniques and its impact for effective data classification using UCI machine learning repository dataset. Journal of Engineering Science and Technology, 11, 1639–1646. 91. Frank, A. (2010). UCI machine learning repository. http://archive.ics.uci.edu/ml 92. Jiang, Z., Xu, P., Du, Y., Yuan, F., & Song, K. (2021). Balanced distribution adaptation for metal oxide semiconductor gas sensor array drift compensation. Sensors, 21, 3403. 93. Dong, X., Han, S., Wang, A., & Shang, K. (2021). Online inertial machine learning for sensor array long-term drift compensation. Chemosensors, 9, 353. 94. Das, P., Manna, A., Ghoshal, S. (2020). Gas sensor drift compensation by ensemble of classifiers using extreme learning machine. In 2020 International Conference on Renewable Energy Integration into Smart Grids: A Multidisciplinary Approach to Technology Modelling and Simulation (ICREISG) (pp. 197–201). IEEE. 95. Schroeder, V., Evans, E. D., Wu, Y.-C.M., Voll, C.-C.A., McDonald, B. R., Savagatrup, S., & Swager, T. M. (2019). Chemiresistive sensor array and machine learning classification of food. ACS Sensors, 4, 2101–2108. 96. Tan, J., Balasubramanian, B., Sukha, D., Ramkissoon, S., & Umaharan, P. (2019). Sensing fermentation degree of cocoa (Theobroma cacao L.) beans by machine learning classification models based electronic nose system. Journal of Food Process Engineering, 42, e13175. 97. Enériz, D., Medrano, N., & Calvo, B. (2021). An FPGA-based machine learning tool for in-situ food quality tracking using sensor fusion. Biosensors, 11, 366. 98. Fang, C., Li, H.-Y., Li, L., Su, H.-Y., Tang, J., Bai, X., & Liu, H. (2022). 
Smart electronic nose enabled by an all-feature olfactory algorithm. Advanced Intelligent Systems, 4, 2200074. 99. Astuti, S. D., Tamimi, M. H., Pradhana, A. A., Alamsyah, K. A. Purnobasuki, H., Khasanah, M., Susilo, Y., Triyana, K., Kashif, M., & Syahrom, A. (2021). Gas sensor array to classify the chicken meat with E. coli contaminant by using random forest and support vector machine. Biosensors and Bioelectronics, X(9), 100083.
198
S. Yang et al.
100. Al Isyrofie, A. I. F., Kashif, M., Aji, A. K., Aidatuzzahro, N., Rahmatillah, A., Susilo, Y., Syahrom, A., & Astuti, S. D. (2022). Odor clustering using a gas sensor array system of chicken meat based on temperature variations and storage time. Sensing and Bio-Sensing Research, 37, 100508. 101. Saeed, R., Feng, H., Wang, X., Zhang, X., & Fu, Z. (2022). Fish quality evaluation by sensor and machine learning: A mechanistic review. Food Control, 137, 108902. 102. Lumogdang, C. F. D., Wata, M. G., Loyola, S. J. S., Angelia, R. E., Angelia, H. L. P. (2019). Supervised machine learning approach for pork meat freshness identification. In Proceedings of the 2019 6th International Conference on Bioinformatics Research and Applications, Association for Computing Machinery (pp. 1–6). 103. Yang, H.-Y., Wang, Y.-C., Peng, H.-Y., & Huang, C.-H. (2021). Breath biopsy of breast cancer using sensor array signals and machine learning analysis. Science and Reports, 11, 1–9. 104. Zimmerman, N., Presto, A. A., Kumar, S. P., Gu, J., Hauryliuk, A., Robinson, E. S., Robinson, A. L., & Subramanian, R. (2018). A machine learning calibration model using random forests to improve sensor performance for lower-cost air quality monitoring. Atmospheric Measurement Techniques, 11, 291–313. 105. Wei, P., Sun, L., Anand, A., Zhang, Q., Huixin, Z., Deng, Z., Wang, Y., & Ning, Z. (2020). Development and evaluation of a robust temperature sensitive algorithm for long term NO2 gas sensor network data correction. Atmospheric Environment, 230, 117509. 106. Wusiman, M., & Taghipour, F. (2022). Methods and mechanisms of gas sensor selectivity. Critical Reviews in Solid State, 47, 416–435. 107. Jian, Y., Hu, W., Zhao, Z., Cheng, P., Haick, H., Yao, M., & Wu, W. (2020). Gas sensors based on chemi-resistive hybrid functional nanomaterials. Nano-Micro Letters, 12, 1–43. 108. Al-Hashem, M., Akbar, S., & Morris, P. (2019). Role of oxygen vacancies in nanostructured metal-oxide gas sensors: A review. 
Sensors and Actuators, B: Chemical Sensors and Materials, 301, 126845. 109. Zhao, S., Shen, Y., Yan, X., Zhou, P., Yin, Y., Lu, R., Han, C., Cui, B., & Wei, D. (2019). Complex-surfactant-assisted hydrothermal synthesis of one-dimensional ZnO nanorods for high-performance ethanol gas sensor. Sensors and Actuators, B: Chemical Sensors and Materials, 286, 501–511.
Machine Learning for Next-Generation Functional Materials
R. Vignesh, V. Balasubramani, and T. M. Sridhar
Abstract Machine learning (ML) is a powerful technique for extracting insights from multivariate data quickly and efficiently. It provides a much-needed way to speed up research on functional materials in order to address time-sensitive worldwide issues such as COVID-19. At the same time, scientists are reporting and patenting new functional materials daily, in fields ranging from medicine to aerospace. The challenge for researchers is to choose the ideal materials for designing and fabricating devices and instruments that withstand all weather conditions. In recent years, machine learning has been developed for a variety of applications, including diverse experimentation, device optimization, and materials discovery. Increased functionality in next-generation functional materials can improve applicability, productivity, and energy efficiency, and these qualities can also be used to create new design concepts for renewable energy generation. This chapter gives a thorough introduction to the principles of machine learning for functional materials and covers the challenges in advanced functional materials research and the role of machine learning in design, simulation, and evaluation. Finally, significant pointers to successful machine learning applications are addressed, along with the remaining hurdles in machine learning for next-generation functional materials.
R. Vignesh · V. Balasubramani · T. M. Sridhar (B) Department of Analytical Chemistry, University of Madras, Chennai 600025, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_9
1 Introduction
1.1 Need for Functional Materials
Machine learning platforms play a key role today in the design and development of novel functional materials. Advanced functional materials are desirable because their characteristics can be exploited in a wide range of applications. These materials have to be described in a form that a machine learning platform can work with, including how they deliver their properties. Meeting the need for advanced functional materials and integrating them with machine learning platforms is the biggest challenge. Functional materials are those that can carry out specific functions in response to a distinct stimulus. Electric, magnetic, and optical responses are among the functions covered. Magnetic materials, semiconductors, ionic conductors, superconductors, dielectrics, pyroelectrics, piezoelectrics, ferroelectrics, electro-optics, and relaxor ferroelectrics are examples of functional smart materials. The relationships between composition, structure, processing, and properties make it possible to create improved materials for both established and novel applications. Materials scientists, chemists, and physicists have probed these relationships for many years, trying to engineer the bulk properties of functional materials and to describe them with theoretical models. Today, understanding the relationships between composition, microstructure, processing, and macroscopic properties is crucial, and artificial neural networks go hand in hand with this effort in revealing the fundamentals [1].
2 Advanced Functional Materials
Machine learning, a branch of artificial intelligence (AI), plays a vital role in the development and construction of advanced functional materials, and it needs to be enriched so that it can further accelerate their production. Functional materials pose a growing challenge because innovations and new technologies deliver new materials every day for applications in medicine, aerospace, energy, nanomaterials, and devices. The design and development of advanced functional materials involve a multidimensional approach that includes design, shape, structure-property relationships, and quantum properties at the atomic and molecular levels. ML is an ideal choice for these challenges, but it can succeed only when its algorithms and data sets are evaluated quantitatively and interpreted alongside experimental methods. Scientists and researchers have to combine human intelligence and sound theoretical knowledge of materials with AI to create digital concepts based on the fundamentals of materials science and machine learning tools. This combination can bring out new smart functional materials with high data efficiency and affordable production costs [2]. Concepts that can be considered include processing techniques, pattern recognition, neural networks, trial and error with computational sensing networks, fuzzy logic, materials of construction with data mining, and reverse engineering through supervised and unsupervised learning.
The design and development of new materials have changed with advances in computational algorithms and machine learning tools. The Materials Genome Initiative, supported by the National Institute of Standards and Technology (NIST), encourages U.S. innovation and technological competence through economic productivity and increases the value of human life. Its objective is the production of feasible smart materials at competitive prices with a robust industrial infrastructure [3]. TU Delft in the Netherlands has launched around 16 TU Delft AI Labs since 2020 to interact with domain professionals on applications of AI-based techniques in various fields for the service of mankind. Despite all these efforts, the world went into lockdown because of the pandemic caused by COVID-19 (coronavirus).
2.1 Materials for Combating the Coronavirus Pandemic
The COVID-19 pandemic, which broke out in December 2019, shook the entire world and exposed shortcomings in everything from sensing and testing kits, face masks, and personal protective equipment (PPE) to oxygenators, drugs, and vaccines. Attempts are being made to develop novel materials that aid diagnostic techniques based on machine learning algorithms, which are safe for the person performing the test and deliver results within minutes. The details of the test are also communicated to health care officials. As an illustration, high sensitivity and speed were demonstrated for machine learning-based screening of SARS-CoV-2 assay designs using a CRISPR-based virus detection method. Research has led to widespread screening of COVID-19 patients based on their distinctive breathing patterns, and neural network classifiers were created to follow the patient and the progress of the disease. For automatic detection and long-term tracking of COVID-19 patients, a deep learning-based system for analyzing thoracic CT scans was built. Automated diagnosis tools based on AI and machine learning are evolving significantly. In addition to enhancing diagnostic speed and accuracy, these techniques protect medical professionals by limiting their interaction with COVID-19 patients. The virus keeps mutating, and countries have seen 4–5 waves of the changing virus. To treat the rapidly expanding global COVID-19 patient population effectively, a treatment approach is urgently required. Because no therapy has been proven useful in treating COVID-19 patients, it is essential to create effective strategies to repurpose currently approved medications or to build novel medications against SARS-CoV-2.
Prioritizing current medication options against SARS-CoV-2 for clinical trials requires a machine learning-based repositioning and repurposing framework. In addition, a novel drug-like molecule against SARS-CoV-2 has been designed using a deep learning-based drug discovery process [4]. Google DeepMind has created a deep learning system and used it to publish predicted structures of proteins related to COVID-19. Obtaining such data takes months with conventional experimental procedures, and the predictions provide important information for the development of COVID-19 vaccines. Furthermore, the newly established Vaxign reverse vaccinology technique, combined with machine learning, revealed COVID-19 vaccine candidates that can be translated into products. Physicians in hospitals across the world treated COVID-19 patients using different protocols, and the World Health Organization (WHO) also released treatment protocols from time to time. The death rate was around 3–5% of all affected patients. The common and uncommon causes of death and the presence of co-morbidities have to be identified. The enormous volume of COVID-19 treatment data generated in hospitals around the world also necessitates modern machine learning techniques for analyzing individualized therapy effects, evaluating new patients, predicting hospitalization requirements, and so on. This would improve not only the care of each individual patient but also local hospital arrangement and management [5]. Patients who were affected by COVID-19 and have since taken two doses of vaccine are reporting several side effects, ranging from diabetes to vision problems. The biggest challenge today is to analyze these data and advise policymakers and physicians on how to handle these situations. Machine learning is the ideal tool for developing algorithms that process these data and support planning for the future.
2.1.1 Role of Machine Learning in Combating COVID-19
The large volume of COVID-19 data has to be classified according to the parameters of interest using machine learning methods. Algorithms are then developed to identify trends in each parameter and to predict its variation in the datasets chosen for study. For disease progression, literature reports indicate that machine learning has been able to predict inhibitory synthetic antibodies as a possible cure [6, 7]. In one report, 1933 virus-antibody samples were collected from patients, and their neutralization responses were recorded and analyzed. Graph featurization was chosen to evaluate the data on the machine learning platform in order to identify sequences among the thousands of antibody results and arrive at a hypothesis. The studies indicated that eight stable antibodies were present in the samples, and these were identified as inhibiting the progress of COVID-19 [7–9]. Further processing and analysis of these datasets indicate the theoretical possibility of applying machine learning in the battle against the coronavirus. The machine learning concepts involved in determining the role of antibodies for COVID-19 are presented in Fig. 1. The major issue with processing COVID-19 data is its large volume, which varies from country to country irrespective of climate and temperature. Machine learning reduces the time needed to identify disease progression patterns that could not realistically be processed manually. Interrelationships between parameters can be identified easily, such as the need for oxygenators to provide oxygen support, so that the administration can prepare for shortfalls in the production of drugs and vaccines. The major advantage of machine learning is its capacity to handle large volumes of data: models can be trained on the data, and their results improved through validation, which increases their efficiency and accuracy [10–12]. Machine learning algorithms not only help with faster decision-making and forecasting but can also be trained to make decisions without human intervention or waiting for commands to be approved.
Fig. 1 Different roles of machine learning for the COVID-19 pandemic
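The idea of identifying a trend in a parameter and then checking predictions against held-out data can be illustrated with a minimal sketch. The series below is synthetic and purely illustrative (it is not COVID-19 data), the linear model is an assumption made for simplicity, and the function names `fit_line` and `mean_abs_error` are hypothetical helpers rather than anything taken from the cited studies.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and observations."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))

# Synthetic, illustrative series (an assumption, not real data): a daily
# count that grows by 3 per day with a small alternating perturbation.
days = list(range(20))
counts = [10 + 3 * d + (1 if d % 2 else -1) for d in days]

# Train on the first 15 days and validate on the last 5.
train_x, train_y = days[:15], counts[:15]
valid_x, valid_y = days[15:], counts[15:]

model = fit_line(train_x, train_y)
validation_error = mean_abs_error(model, valid_x, valid_y)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} "
      f"validation MAE={validation_error:.2f}")
```

Keeping the validation days out of the fit is what allows the error to say something about future predictions; real epidemiological series would need a more suitable model than a straight line.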
3 Machine Learning Platform for Biological Materials
In biological applications, two main machine learning paradigms exist: supervised and unsupervised learning. In supervised learning, the objects in a given data set are classified on the basis of selected attributes or features. The classification process assigns each object to a class using a set of rules developed from the feature values. For biological data, the objects are heterogeneous and the object-to-class mapping differs from case to case: protein sequences are mapped, through their amino acid chains, to combinations of secondary structures, and gene expression profiles of the tissues of various organs are mapped to disease groups. Because protein sequence chains are long and adopt different molecular configurations, and because the expression levels of specific genes depend on the nature of the tissue, the features differ from one object to the next. In this situation, supervised learning aims to predict the class membership of new objects from their features in order to build a reliable system. Besides predicting categorical characteristics such as a class label, supervised learning techniques can predict continuous features of the objects. Supervised learning algorithms can also flag a "doubt" or an "outlier" during classification. A "doubt" means that the object could be assigned to more than one class, whereas an "outlier" means that the object does not resemble anything seen previously, so any decision about its class membership is questionable.
In the biomedical field, large-scale use of scanning and imaging techniques generates several thousand images, and processing them to identify diseases is a big challenge. Viewing the data by eye or as printouts does not reveal much information, but machine learning and artificial neural networks (ANN) have made precision diagnosis possible, simplifying the data so that physicians can plan their treatment protocols. Image processing has helped to segment the data, apply detection protocols, classify the results, and predict the clinical condition [13]. This has led to collaborative efforts between academia and research institutions on the one hand and corporations and multinational companies on the other, making extensive use of AI and ML in key areas such as decoding genomic structures and understanding biological structures to develop sequences of viruses, bacteria, and genetic disorders. These methods are currently being used to detect evolving COVID-19 mutations daily across the globe. Drug discovery is the most challenging field. Previously, a synthetic organic chemist synthesized molecules and their biological efficacy was evaluated afterwards. Today, AI and ML have changed this order: prospective drug molecules are selected and evaluated first, and the synthetic chemist then develops the process to produce them [14]. Cancer is a unique disease without a defined proliferation and progression pattern, and it is monitored with various scans such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Positron Emission Tomography (PET), and Single-Photon Emission Computerized Tomography (SPECT). Image processing techniques help to monitor the progression of a disease such as cancer and to derive vital information that assists physicians and surgeons in planning their treatment protocols. In clinical diagnosis, large volumes of data need to be stored and retrieved after a few years or decades to determine whether the same disease has relapsed, and this is possible with algorithms based on ML, ANN, and AI [15].
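The "doubt" and "outlier" decisions described earlier in this section can be sketched with a simple nearest-centroid classifier that is allowed to refuse a decision. This is only one possible way to realize the idea: the class names, the two-feature samples, and the thresholds `outlier_radius` and `doubt_ratio` are hypothetical values chosen for illustration and do not come from the chapter or its references.

```python
import math

def centroid(points):
    """Mean feature vector of a list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def classify(x, centroids, outlier_radius, doubt_ratio):
    """Return a class label, "doubt", or "outlier" for feature vector x.

    outlier_radius and doubt_ratio are hand-picked, illustrative
    thresholds (assumptions), not values from the chapter.
    """
    ranked = sorted((math.dist(x, c), label) for label, c in centroids.items())
    (d1, best), (d2, _) = ranked[0], ranked[1]
    if d1 > outlier_radius:
        return "outlier"  # far from every known class
    if d2 == 0 or d1 / d2 > doubt_ratio:
        return "doubt"    # two classes are almost equally plausible
    return best

# Hypothetical training objects: two features per object, two classes.
training = {
    "class_A": [(1.0, 1.2), (0.8, 1.0), (1.2, 0.8)],
    "class_B": [(4.0, 4.2), (3.8, 4.0), (4.2, 3.8)],
}
centroids = {label: centroid(pts) for label, pts in training.items()}

results = {
    sample: classify(sample, centroids, outlier_radius=3.0, doubt_ratio=0.8)
    for sample in [(1.1, 0.9), (2.5, 2.5), (9.0, 9.0)]
}
for sample, decision in results.items():
    print(sample, "->", decision)
```

In this sketch a sample close to one centroid receives that class label, a sample roughly midway between the two centroids is reported as a "doubt", and a sample far from both is reported as an "outlier". In practice the thresholds would have to be tuned on validation data.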
3.1 Functional Polymeric Materials
Polymers are chains of simple carbon-based repeat units whose molecular weight and structure can be tuned to make them functional. The challenge with today's functional polymers is their multidimensional nature, as their chemical constituents range from combinations of monomers to different types of copolymers. This has diversified polymer synthesis and characterization methods, which makes it difficult to study structure-property relationships systematically. Biological proteins are polymers that also fall into this category. They are built from combinations of the 20 known basic amino acids and range from enzymes to body-building molecules. The complexity of their structures increases with molecular weight, and they cannot be represented as 2D structures. To develop and understand proteins, and polymers in general, computational chemistry plays a pivotal role: complicated data have to be processed, structures viewed in 3D, and properties evaluated. Machine learning combined with AI has been applied in the development of functional polymeric materials. DeepMind's AlphaFold2 was entered in the 2020 Critical Assessment of Protein Structure Prediction (CASP) competition. The AlphaFold2 ML technique has been used to address problems in structural biology such as predicting how a simple chain folds into a protein, which has been a great challenge for biochemists for years [16].
Machine Learning for Next-Generation Functional Materials
One of the latest applications of ML in the life sciences is the synthesis of polymers for 19F Magnetic Resonance Imaging (MRI) using software-controlled, automatically regulated flow synthesis. Around 397 copolymer structures spanning a six-variable compositional space were synthesized by Isayev [17].
3.2 Combinatorial and Automated Polymer Chemistry

Polymers have come to dominate our lifestyle, and functional polymers have made day-to-day tasks much easier, from Teflon in non-stick cookware to bristles in brushes for cleaning specific objects. Large-scale polymer production is a challenge because it is not a single-step reaction process, but the major difficulty is reactivity in an ambient atmosphere, where the oxygen and moisture present in the air trigger multiple reactions. Synthesis therefore requires sealed vessels, vacuum pumps, controlled atmospheres, freeze dryers, and several other pieces of equipment. This restricts the use of automation and robotic control to synthesize a variety of functional polymers [18] and opens up a challenging new area for the implementation of machine learning concepts.
3.3 The Evolution of Molecular Modeling for Polymers

The synthesis of new cutting-edge materials is all about engineering the chemistry of molecules to develop new products. Molecular modeling is a vital tool for achieving these goals: it supports the design and synthesis of functional materials, the prediction of their mechanisms and their interaction with the environment, and their theoretical characterization, from thermodynamics to crystal structure and evaluation. The factors involved in developing polymeric composites are given in Fig. 2. Polymers are long-chain compounds with several side chains, active functional groups, and sites where reactions can occur. Applying Density Functional Theory (DFT) to these large and chaotic polymeric structures with changing configurations is a challenge. On the other hand, the Materials Genome Initiative (MGI) has enabled reactivity and stability predictions with virtual evaluation of properties, but it is mainly restricted to small molecules, compounds, and semiconducting oxides, and rarely extends to high-density polymers. Atomistic simulation remains computationally challenging for macromolecular systems, which motivates the use of machine learning in combinatorial polymer chemistry [19]. Thin films and nano-templates are in high demand for device development in the electronics industry. Linear arrangements of polymer blocks, known as block copolymers, are used to develop these functional sites with intricate morphologies on silicon-based materials. Suwon Bae [20] and their team have explored the orientation of Block Copolymers (BCPs) in thin films using Molecular Dynamics (MD) simulations along
R. Vignesh et al.
Fig. 2 Evaluation of molecular modeling for polymeric composites
with Gaussian Process Regression (GPR). They constructed a multi-dimensional morphological phase diagram in which the number of variables exceeds the number of data points, covering properties ranging from the base materials and flow layout to surface tension. Molecular dynamics simulations together with autonomous experimentation (AE) tools have been used to handle the complicated BCPs. Specific points in parameter space are chosen for running the MD simulations of the BCP, and the structures obtained from these data are analyzed by an autonomous decision-making algorithm based on GPR in order to derive the parameters for the subsequent simulation [21]. The Reinhart group has studied the behavior of several model copolymers by applying unsupervised learning to sequence-specific aggregation, based on local surroundings [22]. Aggregation properties and phase behavior vary with changes in the BCP sequence. ML was used to derive and categorize the overall structure from the local surroundings; the aggregates were found to have lower densities than conventional liquid phases, and the clusters showed intricate structures. The data obtained make it possible to characterize a wide assortment of self-assembled structures and to predict their properties and relations. Unsupervised ML can be used to understand disordered soft-matter systems, especially when order parameters are not available [47].
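The GPR-driven selection of the next simulation point described above can be sketched in a few lines. This is a minimal one-dimensional illustration, not the workflow of the cited studies: the squared-exponential kernel, the "largest posterior variance" acquisition rule, the function names, and the sample data are all assumptions made for the example.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(xs, ys, x_new, length=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new, length) for a in xs]
    alpha = solve(K, ys)            # K^-1 y
    v = solve(K, k_star)            # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

def next_simulation_point(xs, ys, candidates, length=1.0):
    """Pick the candidate where the surrogate is least certain."""
    return max(candidates, key=lambda c: gp_predict(xs, ys, c, length)[1])

# Hypothetical data: one simulation parameter and one measured response.
sampled_x = [0.0, 1.0, 4.0]
sampled_y = [0.2, 0.5, 0.1]
grid = [0.5 * i for i in range(9)]  # candidate parameters 0.0 .. 4.0
print(next_simulation_point(sampled_x, sampled_y, grid))
```

In an autonomous loop, the selected point would be simulated, appended to the sampled data, and the selection repeated. Practical acquisition rules usually balance uncertainty against the predicted value rather than using variance alone.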
3.4 Machine Learning for Polymer Modeling and Evaluation

Simulation techniques have been applied to model polymer synthesis, even though it is a dynamic process. Graph Neural Networks (GNNs) are a class of deep learning models designed to learn from and make predictions on graph-structured data. GNNs have been used to study poly(ethylene oxide) (PEO)/lithium bis(trifluoromethanesulfonyl)imide (LiTFSI) composite electrolytes, where they produce explicit feature vectors for the atoms and their bonding, which are then used to predict configurational stability [23, 48]. The non-linear dynamics are predicted using a linear transition matrix by combining Koopman analysis with GNN models that map the configurational arrangement of the target atoms in space. The lithium ion has four coordination states (0, 1, 2, and 3), which are classified by cluster analysis of the vectors. In this case, the quantities of interest include the prediction of the coordination state of Li and the transport of clusters containing these ions in PEO/LiTFSI maintained at saturated salt conditions [24]. Tracking the changes in the degrees of freedom of high molecular weight polymers using classical molecular dynamics at the atomic level would provide more information on the length of the polymeric chains. The task involves simplifying the complex molecular chain by coarse graining, in which the rapidly changing atomistic degrees of freedom are combined into fewer effective ones. Bottom-up coarse graining, which builds polymer models up step by step while taking thermodynamics and group theory into consideration, is a promising new approach in ML-based applications. Hydrocarbon configurations in the liquid state have been elucidated using Gaussian processes with active learning and on-the-fly ML schemes [25]. These models can translate into onsite production techniques and monitoring.
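The idea of describing state-to-state dynamics with a linear transition matrix can be illustrated with a much simpler stand-in than the Koopman and GNN models cited above: a Markov transition matrix estimated by counting how often one discrete coordination state follows another. The trajectory below is invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

def transition_matrix(trajectory, n_states):
    """Row-normalized matrix T[i][j] = P(next state = j | current state = i)."""
    counts = Counter(zip(trajectory[:-1], trajectory[1:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            matrix.append([0.0] * n_states)  # state never visited as a start
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix

def propagate(distribution, matrix):
    """One linear step of the dynamics: p_next = p . T."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]

# Hypothetical coordination-state labels (0-3) of one Li ion over successive frames.
labels = [0, 1, 1, 2, 2, 2, 3, 2, 1, 1, 0, 1, 2, 3, 3, 2]
T = transition_matrix(labels, n_states=4)
p0 = [0.0, 1.0, 0.0, 0.0]   # start in coordination state 1
p1 = propagate(p0, T)       # state probabilities one frame later
for row in T:
    print([round(v, 3) for v in row])
```

The learned approaches differ in that the states (or features) are themselves obtained from the atomic configuration by the network, whereas here they are assumed to be given as labels.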
3.5 Computing of Electronic Polymer Materials

Polymeric materials offer many advantages, including the ability to be rolled out as thin films and used in anti-corrosion coatings, porous scaffolds, and catalytic applications. Because polymer design and synthesis are intensively data driven, ML is widely used there, as well as to modify and tune polymer properties using several computing methods [26]. Researchers have applied computational methods to various types of advanced polymers. Venkatraman and Alsberg [27] applied ML to the discovery of novel polymers that combine a high refractive index with other superior properties, and evaluated them using DFT calculations. Breneman et al. [26] used materials genomics to prepare nanoparticle-filled polymers, designing the best shape of the polymer with its mechanical and thermal stability in mind, and confirmed the results experimentally. Wu et al. developed conducting polymers with different electronic properties, such as band gap, dielectric constant, and dielectric loss tangent, using proven statistical models of organic polymers [48]. Infinite-chain descriptors of organic polymers served as the ML inputs for forecasting electronic behavior, and these models performed well experimentally. Wu et al. [49] used ML models to develop smart electronic materials by tweaking organic polymers, as they possess superior dielectric and chemical resistance properties when compared to traditional
inorganic materials, and the models were validated against traditional models. Classified NML-based methods have been adopted to design and optimize specific dielectric properties. On the other hand, Mannodi-Kanakkithodi et al. used statistical learning models built on data from first-principles computations, in which simple numerical representations (fingerprints) of the polymers are mapped into the ML algorithm. Smart polymers were designed by working on the constituent blocks of the polymers and using a genetic algorithm to reach the desired properties. Mannodi-Kanakkithodi et al. have worked extensively on polymer genomes to develop future varieties of smart dielectric polymers [29].
4 Energy Material Used for Machine Learning Platform

Energy in all its forms is a pressing need of the world today. With shrinking fossil fuel reserves and growing use of polluting fuels, alternative resources have to be optimally harnessed. ML plays an important role in accelerating the discovery of high-performance energy materials, including battery and superconductor materials, electro-ceramic and thermoelectric materials, and photovoltaic and perovskite materials. Hydrogen is widely available in nature and is the cleanest non-polluting fuel, but its production and storage are challenging. The development of catalysts for converting water, biomass, chemicals, and plastics, and for CO2 capture, is being extensively investigated. Solar and wind energy resources are hampered by their inconsistency, so the requirement for batteries keeps increasing. The current development of battery-operated automobiles, which aims to decrease the charging time and increase the travel distance, is a drive toward a non-polluting sector; the key properties of a battery are given in Fig. 3. Advanced battery technology development is focused on supercapacitors, hybrid batteries, and fuel cells to enable storage, connectivity, and transmission directly to the electricity grid [30].
Fig. 3 Key properties of battery for its performance
4.1 Machine Learning in Renewable Energy Material

4.1.1 Materials Design for CO2 Capture
Traditional industrial CO2 capture uses amines as sorbent materials that absorb the gas rapidly. This process is hampered by the unstable and corrosive nature of amines. Metal-organic frameworks (MOFs) are considered promising because of the distribution of their active sites and the porosity of the framework, which soaks up CO2 like a sponge; their poor chemical and thermal properties are a disadvantage. To improve performance, Woo and co-workers [31] deployed quantitative structure-property relationship (QSPR) models in which MOFs are classified by their CO2 uptake capacity. In this model, the radial distribution function descriptors they developed enable the identification of chemical sites along with their shapes using a Support Vector Machine (SVM) algorithm, which handles classification as well as regression. The electronic interactions were modeled for a large database of 324,500 MOF structures using Grand Canonical Monte Carlo (GCMC) simulation, which combines the mathematics and properties of MOF adsorption in a molecular-level model to study CO2 uptake. Further, the Lennard-Jones potential was applied to account for changes in chemical and physical bonds, including Van der Waals forces. DFT was applied to compute the electronic interactions in the MOF molecules involved in generating the electrostatic potential [32]. On the other hand, Froudakis and co-workers used descriptors that encode the presence or absence of substructures in MOFs [50]. Just Add Data, a fully automated ML platform equipped with statistical modeling and AI, is used to calculate the absorption capacity for CO2 and H2. Compared with the model of Woo and co-workers, the advantage of this model is that it can forecast gas uptake on a dynamic scale, as it is trained on experimental data from about 100 MOFs.
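For reference, the Lennard-Jones potential mentioned above has the standard 12-6 form V(r) = 4ε[(σ/r)^12 − (σ/r)^6], where ε is the well depth and σ is the separation at which the energy crosses zero. The sketch below evaluates it in reduced units; the parameter values are placeholders, not force-field parameters from the cited studies.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy at separation r."""
    if r <= 0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# Illustrative reduced units (epsilon = sigma = 1); real simulations use
# tabulated parameters for each pair of atom types.
epsilon, sigma = 1.0, 1.0
r_min = 2 ** (1 / 6) * sigma  # location of the energy minimum
for r in (0.95, 1.0, r_min, 1.5, 2.5):
    print(f"r = {r:.3f}  V = {lennard_jones(r, epsilon, sigma):+.4f}")
```

The repulsive r^-12 term dominates at short range and the attractive r^-6 term (the Van der Waals contribution) at longer range, with the minimum energy of −ε reached at r = 2^(1/6)σ.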
Gómez-Gualdrón and co-workers have deployed DFT- and GCMC-based ML to predict the CO2 uptake of MOFs [33]. The storage capacity of 400 MOFs was derived, and six different ML models were then trained using thirteen different electronic and symmetry-related factors. Woo and co-workers applied their models to methane, calculating storage capacity from the pore size and void fraction of 130,000 MOFs with multilinear regression, decision trees, and a non-linear support vector machine. These results indicate that building up MOFs algorithmically can yield structures with high methane storage [31, 33].
4.1.2 ML for Solar Cell Developments
The solar cell industry is highly competitive in terms of technology upgrades, ease of maintenance, and return on investment. Panels are generally made of photovoltaic (PV) materials, sometimes blended with organic molecules. A few shortcomings of PV solar panels are their degradation with time, resulting in instability in power generation
which in turn affects the power conversion capacity, and the fluctuating cost of silicon-based semiconductor materials. For organic semiconductors, identifying the correct material is the biggest challenge. Data science can guide the scientist in picking, synthesizing, and incorporating materials into electronics to develop high-performing solar cells. The Shockley-Queisser limit is widely used in solar cell studies because it relates the maximum performance of a single p-n junction to its band gap. DFT is broadly used to determine the bandgap along with the thermal and thermodynamic stability of the materials. Perovskites are preferred materials in solar cells because of their enhanced solar absorption and reduced non-radiative recombination, and these are among the factors that ML algorithms have addressed [34]. Allam et al. used DFT and a neural network model to study the bandgap of perovskites and reported that basic atomic features are required to train ML models. Ramprasad et al. deployed a kernel ridge regression model on double perovskites to identify the combination of input features for a bandgap prediction model [29]. Takahashi et al. [51] used a random forest classifier, an ensemble of many decision trees, to categorize the bandgap with 18 physical descriptors of perovskite materials; the ideal bandgap ranged from 1.5 to 3.0 eV in a sample of 15,000 perovskites, and DFT calculations were applied to ascertain stability. Ward et al. created a flexible framework to classify crystalline or amorphous forms of materials using the Open Quantum Materials Database (OQMD) with 145 attributes as descriptors [52]. Several algorithms, such as LASSO regression, partial least squares regression, Gaussian process regression, kernel ridge regression, and ANN, are deployed for higher data efficiency and prediction accuracy.
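To make the 1.5 to 3.0 eV band gap window quoted above more concrete, the sketch below converts a band gap to the corresponding absorption-edge wavelength using λ = hc/E and keeps only candidates inside the window. The candidate names and band gap values are hypothetical placeholders, not results from the cited studies.

```python
HC_EV_NM = 1239.84  # Planck constant times speed of light, in eV·nm (rounded)

def wavelength_nm(band_gap_ev):
    """Absorption-edge wavelength in nm for a band gap given in eV."""
    if band_gap_ev <= 0:
        raise ValueError("band gap must be positive")
    return HC_EV_NM / band_gap_ev

def in_window(band_gap_ev, low=1.5, high=3.0):
    """True if the band gap lies inside the screening window (in eV)."""
    return low <= band_gap_ev <= high

# Hypothetical predicted band gaps (eV) for four candidate materials.
candidates = {"candidate_A": 1.1, "candidate_B": 1.6,
              "candidate_C": 2.4, "candidate_D": 3.4}
selected = {name: round(wavelength_nm(gap), 1)
            for name, gap in candidates.items() if in_window(gap)}
print(selected)
```

In a screening workflow, the band gaps would come from a trained model or from DFT, and the window filter would be one of several criteria alongside stability.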
For cells based on organic compounds, Ferguson [35] used the Harvard Organic Photovoltaic Dataset, combining molecular fingerprint techniques with Gaussian models to calculate quantities such as the optical density, charge conversion efficiency, and transfer rate. Schmidt and co-workers deployed a variational autoencoder scheme in which molecules are used to determine the LUMO and optical conversion energy. They also established the role of ML models such as ANN, ridge regression, and random forests in forecasting hull energies for large cubic perovskite systems (ABX3) [37].
4.1.3 ML and Novel Rechargeable Battery Materials
Awareness of the pollution of the earth and the adoption of the Reduce, Reuse, and Recycle (RRR) principle have increased the use of rechargeable devices, which are widely used from electronic gadgets to uninterruptible power supply units. The major drawback is that tampering with the charging and discharging cycles can cause fires due to thermal surges, resulting in loss of life. A further challenge is the scarcity of the materials currently used in these batteries, especially lithium. The quest to identify novel materials using ML involves determining the chemical composition based on the size of the device, the power cycles, and the charging and discharging properties, for which several descriptors are involved. Madhavi and her group have developed
several novel methods to develop materials for rechargeable batteries by implementing the RRR principle [37]. ML is used to train systems to identify the chemical composition of composites involving one or more materials in various percentages, and to run simulation models on these candidates to evaluate battery performance. Hautier et al. [38] constructed a probabilistic model of various chemical systems and electrode components, creating models to evaluate large-scale data on materials for which synthetic routes are available. Statistical methods are deployed to compute the posterior probability that identifies the structure of a compound, so that a suitable composite can be prepared, and DFT is applied to evaluate the candidates. This group evaluated 1126 chemical compositions from 2211 A-B-O systems and, after 5546 DFT computations, arrived at 209 new ternary systems that are theoretically viable and cost-effective to synthesize. Table 1 summarizes ML models that have been successfully implemented for battery components or chemical structure calculations for rechargeable battery materials. Applying these models to screen new compounds before DFT computation would not only validate the system but also enable the synthesis of novel materials that are more stable, lightweight, thermally safe, and economically viable.

Table 1 Application description, ML method, and achievement [39]

| Application description | ML method | Achievement |
|---|---|---|
| Finding nature's missing ternary oxide compounds | Bayesian | 209 new compounds |
| Obtaining qualitatively useful guidance for a wide range of perovskite oxide stability | ERT | Predicting 15 new perovskite compounds accurately |
| Discovering elpasolite (stoichiometry ABC2D6) crystals | KRR | 90 unique structures were identified |
| Predicting the thermodynamic stability of solids | RR, RF, ERT, and ANN | Speeding up high-throughput DFT calculations considerably (by at least a factor of 5) |
| Screening new materials in an unconstrained composition space | Bayesian | 4500 new stable materials; predicting the formation energies by Voronoi tessellations |
| Developing a tool for crystal structure prediction | PL | Predicting the formation energy of 114 binary alloy structures with 90% or higher precision |
| Developing a tool for molecular structure prediction | | Reducing the number of search trials required to find the global minimum structure by 30–40% |
| Discovery of new guanidinium ILs | ANN | Six new guanidinium ILs |
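The posterior-probability step used in the probabilistic screening described before Table 1 rests on Bayes' rule. The sketch below applies it to a discrete set of candidate structure types; the structure labels, priors, and likelihoods are invented for illustration and do not come from the cited model.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over discrete hypotheses: P(h | data) is proportional to P(h) * P(data | h)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("all hypotheses have zero probability")
    return {h: value / evidence for h, value in joint.items()}

# Hypothetical prior frequencies of three structure types in a database, and
# hypothetical likelihoods of the observed neighbouring compositions under each.
priors = {"structure_1": 0.2, "structure_2": 0.3, "structure_3": 0.5}
likelihoods = {"structure_1": 0.6, "structure_2": 0.3, "structure_3": 0.1}

ranked = sorted(posterior(priors, likelihoods).items(),
                key=lambda item: item[1], reverse=True)
for name, prob in ranked:
    print(f"{name}: {prob:.3f}")
```

Only the highest-ranked candidates would then be passed on to the more expensive DFT evaluation, which is how such a model reduces the number of computations needed.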
5 Composite Materials for Machine Learning Applications

Space, nuclear, and healthcare are three critical sectors where the failure of materials can have disastrous consequences, including loss of life and collapse of the system, yet these sectors use a wide range of materials with multifunctional properties to build components that must perform as a whole. Advances in computing systems, ML-based tools, and high-performance numerical modeling have enabled applications in the development of composite materials. DFT has been widely applied to determine molecular chemical activity and stability. The mechanical behavior of composites at the nanoscale and the effect of surface modification are determined with molecular dynamics (MD) and the finite element method (FEM), which can be run on laptops, reducing the need for servers. ML simulations offer several advantages. In addition to supporting the development of new tunable properties in an engineered material, they allow its properties to be determined theoretically in more than one application environment, from seawater to high altitude, where physically testing a developed material is a bigger challenge. ML approaches to materials design are framed as inverse design problems, whereas forward modeling is used to forecast properties. In forward modeling, the chemical composition and structure of the material to be examined, from its crystallinity to its configuration, are assumed, and the thermodynamic stability and quantum and solid-state mechanics are calculated. The properties of the desired composites can be calculated numerically using computing tools such as FEM, DFT, and MD at various time scales and molecular distances. Developing a material with desired properties to solve a design problem is still a challenge for modeling tools. Domain knowledge is critical to solving these problems for specific composite materials.
The ideal approach to a problem is to first ascertain the requirements of the system, as in the case of bone scaffolds, which are implanted to treat fractured bones in the body. These materials consist of a blend of bioceramics and biocompatible polymers that degrade naturally and allow new bone to form in the human body. In practice, one common approach to solving inverse design problems is to use domain knowledge and experience (intuition) to narrow down the design space and to propose new materials by trial and error. In most cases where composite materials are required, the properties of the individual constituents differ from the final structural properties. Bone, for example, is made up of inorganic ceramic constituents such as calcium and phosphates, but when these are blended with collagen in a specific architecture, it is able to bear the entire load of a person, as in the hip and knee bones. Materials designed on this concept are termed bio-inspired materials, and ML concepts have to be applied to develop these products without changes in their properties [40].
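The distinction between forward modeling and inverse design can be shown with a toy example. The forward model here is the textbook rule of mixtures for the stiffness of a two-phase blend, chosen only because it is simple; it is an assumption for illustration and not a model used in the chapter. The inverse step scans the design space for the ceramic fraction that best matches a target stiffness. All numbers are hypothetical.

```python
def rule_of_mixtures(fraction_ceramic, e_ceramic, e_polymer):
    """Forward model: upper-bound (Voigt) stiffness of a two-phase blend."""
    return fraction_ceramic * e_ceramic + (1.0 - fraction_ceramic) * e_polymer

def inverse_design(target, e_ceramic, e_polymer, steps=100):
    """Inverse step: scan the design space for the fraction closest to the target."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates,
               key=lambda f: abs(rule_of_mixtures(f, e_ceramic, e_polymer) - target))

# Hypothetical moduli in GPa for a bioceramic and a biocompatible polymer,
# and a hypothetical target stiffness for the scaffold.
E_CERAMIC, E_POLYMER = 100.0, 2.0
best = inverse_design(target=20.0, e_ceramic=E_CERAMIC, e_polymer=E_POLYMER)
print(best, rule_of_mixtures(best, E_CERAMIC, E_POLYMER))
```

In practice the forward model would be an FEM, MD, or ML surrogate evaluated over many design variables, and the exhaustive scan would be replaced by an optimizer, but the structure of the problem is the same.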
6 Machine Learning for Biomedical Applications

In the medical sciences, physicians and surgeons work with a team of scientists and engineers to accurately diagnose and treat diseases in patients of all ages and sexes. Machine learning concepts are widely needed and are currently applied to reduce the area of surgical intervention and the duration of surgery. Image processing techniques and tomographic visualization have brought a new understanding of disease behavior and growth patterns, especially in cancer, where the rate of progression is unpredictable [41]. Post-surgery rehabilitation is a challenging healthcare issue that varies with age. Senior citizens have to be treated with utmost care after operations and hospitalization, and modeling the rehabilitation time and the nature of treatment using ML-based AI algorithms is the need of the hour. ML-based automated analysis of biomedical signal and image data, especially for senior citizens, would help in forecasting the diagnostic tests required. ML-based tools can also be programmed to remotely and dynamically detect changes in disease and treatment progression over time. Human-machine interfaces are another booming area of technology development, in which orthopedic and orthotic devices could be effectively connected to lost, damaged, or amputated parts of the human body. Rehabilitation engineering for persons with special needs and disabilities is inspiring, as the design and automation have to be person-centric and based on the nature of the deformity. Borowska-Terka has developed devices for this group of persons based on an interface controlled by movements of the head. This enables persons affected by stroke and muscular paralysis to communicate, and it also allows visually challenged persons to handle their gadgets.
The system contains a stereovision device and tracks changes in inertial measurements, from which changes in head position are monitored. Two modules have been developed. In one module, signals are recorded over different time frames using a three-axis accelerometer and gyroscope, followed by statistical computing. In the other module, the signals acquired from inertial movements are processed directly. Head position was then classified with 16 different classifiers, which were evaluated. An accuracy of 95% was obtained with the SVM classifier using direct classification of the inertial measurements, and 93% with the decision tree-based random forest classifier. These outcomes point to a greater role for ML tools in developing devices that address the biomedical needs of persons with special needs [42]. Additive manufacturing has revolutionized the quest for novel and smart materials with the development of ML-based 3D printing (3DP), applied in diverse fields from automobiles to healthcare. 3DP is a boon for the healthcare sector, where layers of materials with different mechanical and biological properties have to be designed into a single implant using bio-inspired materials while keeping its biological function intact. This technique enables us to print implants of various anatomical shapes and sizes layer by layer with the required porosity, so that cell adhesion is preserved and blood flow is ensured. It has several advantages, such as the speed
of obtaining the components, reproducibility of shapes and sizes, and control over the composition to ensure that the desired mechanical properties are obtained for the digital model and inputs provided. In the biomedical field, its applications include printing scaffolds made of polymers, bioceramics, metals, and composites for tissue engineering and drug delivery, corneal implants, customized prosthetic implants, breast cancer models, and cancer research. 3D printers work by moving filled cartridges at varying speeds, and their operation involves extrusion or solidification of the powder or liquid [43]. Specific limitations in design and process are a few disadvantages of this technique. The primary requirement of 3DP is a computer-aided design (CAD) model, which the printer constructs by following the commands of the software. The software is designed to take into account the type of materials used, which may be solids such as powders or liquids of varying viscosity, whose reaction chemistry is spontaneous so as to build the design; the process is referred to as additive manufacturing (AM). The shape evolves as the printer head moves along all three dimensions: the x, y, and z axes [44]. 3D printer technology has evolved from 3D to 4D to 5D, and the resolution and control over the shape have increased tremendously. In 4D printing, the object is first obtained in its 3D shape, but the ML-based 4D printing technique applies algorithms and smart materials that are specific to a particular function and respond to stimuli such as light, temperature, and steam. In this way, a non-living object can change its 3D shape and behavior over time, which is the fourth dimension. ML-based additive manufacturing for 3D and 4D printing in biomedical applications is represented in Fig. 4. The evolution of the 3D printing process flow and its progression over the past few decades is traced in Fig. 5 [53].
In 5D printing, the printers are equipped with 5 degrees of freedom to enable exact printing of models with intricate shapes and sizes [44]. A number of polymers have been used in developing ink formulations, including polyethylene glycol (PEG), poly(L-lactic acid) (PLLA), polycaprolactone (PCL), collagen, and chitosan. This technology has opened up several sectors in biomedical applications and is a promising tool for organ manufacturing; hence it is also known as 3D bioprinting. ML computing and systems are widely used in generating design structures and converting them into models. Expert rules from the relevant knowledge domains have to be applied along with parsimonious models to predict the desired level of suitability using a minimal number of variables. Parsimonious models use linear regression and its extensions, such as generalized linear and additive models, but their restricted functional form limits their capacity to describe complex needs robustly, which motivates the search for more advanced ML tools [45]. To handle incomplete data and problems of dimensionality, regularization has been built into ML algorithms, for example by integrating lasso regularization with sparse regression to optimally select a subset of predictors, and neural networks have also been adopted. A multi-layer perceptron (MLP), a supervised feed-forward neural network with at least one hidden layer, together with its supervised training procedure based on the error back-propagation (BP) algorithm, is illustrated in Fig. 6. The intricate patterns and shapes of models can then be obtained with regression and classification algorithms.
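The multi-layer perceptron with error back-propagation described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: one hidden layer, sigmoid activations, squared-error loss, per-sample gradient updates, and a tiny made-up two-feature dataset. It is not tied to any printing dataset or to the architecture in the cited work.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> one hidden layer -> single sigmoid output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, out

def mean_squared_error(data, params):
    return sum((forward(x, *params)[1] - y) ** 2 for x, y in data) / len(data)

def train(data, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train by back-propagation; returns (initial_error, final_error)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = 0.0
    initial = mean_squared_error(data, (w_hidden, b_hidden, w_out, b_out))
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_hidden, b_hidden, w_out, b_out)
            # Error term at the output unit (squared error through a sigmoid).
            delta_out = (out - y) * out * (1.0 - out)
            # Propagate the error back to each hidden unit (uses the old w_out).
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] -= lr * delta_out * hidden[j]
                b_hidden[j] -= lr * delta_hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
            b_out -= lr * delta_out
    final = mean_squared_error(data, (w_hidden, b_hidden, w_out, b_out))
    return initial, final

# Hypothetical two-feature binary classification task (logical OR).
dataset = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
initial_error, final_error = train(dataset)
print(initial_error, final_error)
```

The same forward and backward structure carries over to larger networks; libraries automate the gradient computation that is written out by hand here.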
Fig. 4 ML-based additive manufacturing for 3D and 4D printing for biomedical application [53]
Fig. 5 Evolution and functioning of 3D printing process flow and progression over the few decades
7 Conclusions and Future Directions

Machine learning concepts and computations are part of our day-to-day activities and are expected to revolutionize not only the properties but also the industrial manufacturing practices of novel materials. The complexity of, and disconnect between, hardware and software need to be bridged to enable new strategies for preparing and fabricating materials by applying ML concepts. Despite several strides made in the development of advanced materials with machine learning, many challenges remain with respect to precision and accuracy, the economics of software and investment in computing facilities, the availability of trained technical manpower, and restrictions in design and processing. Training of ML models is dependent
Fig. 6 Multi-layer perceptron (MLP): a supervised feed-forward neural network with at least one hidden layer, trained with the error back-propagation (BP) algorithm
on the availability of large datasets and on applying the laws of the fundamental sciences (physics, chemistry, and biology), along with understanding the results and accounting for their violations. The coronavirus pandemic brought several unpredicted challenges, including the need for the genetic sequence of the virus, drugs, vaccines, kits, and lifesaving equipment within a span of days to months. ML models, together with knowledge models and the simulations based on them, have the potential to inform and complement one another and to address each other's insufficiencies. Further, bringing together data-based and skill-based modeling with scientists and technologists in materials science would lead to massive changes in device performance and manufacturing.
References

1. Choudhary, A. K., Jansche, A., Grubesa, T., Bernthaler, T., & Schneider, G. Machine learning for microstructures classification in functional materials.
2. Schmidt, J., Marques, M. R., Botti, S., & Marques, M. A. (2019). Recent advances and applications of machine learning in solid-state materials science. npj Computational Materials, 5(1), 1–36.
3. The Materials Genome Initiative marks its first decade with a new strategic plan. Created June 26, 2018, Updated September 1, 2021.
4. Alimadadi, A., Aryal, S., Manandhar, I., Munroe, P. B., Joe, B., & Cheng, X. (2020). Artificial intelligence and machine learning to fight COVID-19. Physiological Genomics, 52(4), 200–202.
5. Cheng, Z. J., & Shan, J. (2020). 2019 Novel coronavirus: Where we are and what we know. Infection, 48(2), 155–163.
6. Chi-Hsien, K., & Nagasawa, S. (2019). Applying machine learning to market analysis: Knowing your luxury consumer. Journal of Management Analytics, 6(4), 404–419.
7. Xu, L. D. (2020). Industrial innovation in the intervention and prevention of COVID-19. Journal of Industrial Integration and Management, 5(04), 409–412.
8. Magar, R., Yadav, P., & Barati Farimani, A. (2021). Potential neutralizing antibodies discovered for novel corona virus using machine learning. Scientific Reports, 11(1), 1–11.
9. Roy, S., Menapace, W., Oei, S., Luijten, B., Fini, E., Saltori, C., Huijben, I., Chennakeshava, N., Mento, F., Sentelli, A., Peschiera, E., Trevisan, R., Maschietto, G., Torri, E., Inchingolo, R., Smargiassi, A., Soldati, G., Rota, P., Passerini, A., et al. (2020). Deep learning for classification and localization of COVID-19 markers in point-of-care lung ultrasound. IEEE Transactions on Medical Imaging, 39(8).
10. Kullaya Swamy, A., & Sarojamma, B. (2020). Bank transaction data modeling by optimized hybrid machine learning merged with ARIMA. Journal of Management Analytics, 1–25.
11. Vafeiadis, T., Dimitriou, N., Ioannidis, D., Wotherspoon, T., Tinker, G., & Tzovaras, D. (2018). A framework for inspection of dies attachment on PCB utilizing machine learning techniques. Journal of Management Analytics, 5(2).
12. Topol, E. J. (2019). High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine, 25(1), 44–56.
13. Gawehn, E., Hiss, J. A., & Schneider, G. (2016). Deep learning in drug discovery. Molecular Informatics, 35(1), 3–14.
14. Alipanahi, B., Delong, A., Weirauch, M. T., & Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8), 831–838.
15. Camacho, D. M., Collins, K. M., Powers, R. K., Costello, J. C., & Collins, J. J. (2018). Next-generation machine learning for biological networks. Cell, 173(7), 1581–1592.
16. Gormley, A. J., & Webb, M. A. (2021). Machine learning in combinatorial polymer chemistry. Nature Reviews Materials, 6(8), 642–644.
17. Isayev, O. (2022). Machine-learning-guided discovery of 19F MRI agents enabled by automated copolymer synthesis. In APS March Meeting (Vol. 67, No. 3).
18. Anderson, D. G., Lynn, D. M., & Langer, R. (2003). Semi-automated synthesis and screening of a large library of degradable cationic polymers for gene delivery. Angewandte Chemie, 115(27), 3261–3266.
19. Tamasi, M., Kosuri, S., DiStefano, J., Chapman, R., & Gormley, A. J. (2020). Automation of controlled/living radical polymerization. Advanced Intelligent Systems, 2(2), 1900126.
20. Galant, O., Bae, S., Silberstein, M. N., & Diesendruck, C. E. (2020). Highly stretchable polymers: Mechanical properties improvement by balancing intra- and intermolecular interactions. Advanced Functional Materials, 30(18), 1901806.
21. Statt, A., Kleeblatt, D. C., & Reinhart, W. F. (2021). Unsupervised learning of sequence-specific aggregation behaviour for a model copolymer. Soft Matter, 17(33), 7697–7707.
22. Statt, A. (2021). Unsupervised learning of sequence-specific aggregation behaviour for model copolymers. Materials Science and Engineering, Grainger College of Engineering, University of Illinois, Urbana-Champaign, IL 61801, USA.
23. Grzybowski, B. A., Bishop, K. J., Kowalczyk, B., & Wilmer, C. E. (2009). The 'wired' universe of organic chemistry. Nature Chemistry, 1(1), 31–36.
24. Duschatko, B. R., Vandermause, J. P., & Molinari, N. (2022). Active learning of many-body transferable coarse-grained interactions in polymers. In APS March Meeting (Vol. 67).
25. Wang, C. C., Pilania, G., Boggs, S. A., Kumar, S., Breneman, C., & Ramprasad, R. (2014). Computational strategies for polymer dielectrics design. Polymer, 55(4), 979–988.
26. Venkatraman, V., & Alsberg, B. K. (2018). Designing high-refractive index polymers using materials informatics. Polymers, 10(1), 103.
27. Patel, S. J., Sanjana, N. E., Kishton, R. J., Eidizadeh, A., Vodnala, S. K., Cam, M., Gartner, J. J., Jia, L., Steinberg, S. M., Yamamoto, T. N., & Merchant, A. S. (2017). Identification of essential genes for cancer immunotherapy. Nature, 548(7669), 537–542.
218
R. Vignesh et al.
28. Ramprasad, R., Batra, R., Pilania, G., Mannodi-Kanakkithodi, A., & Kim, C. (2017). Machine learning in materials informatics: Recent applications and prospects. Computational Materials, 3(1), 1–13. 29. Tashie-Lewis, B. C., & Nnabuife, S. G. (2021). Hydrogen production, distribution, storage and power conversion in a hydrogen economy-a technology review. Chemical Engineering Journal Advances, 8, 100172. 30. Woo, H. G., & Tilley, T. D. (1989). Dehydrogenative polymerization of silanes to polysilanes by zirconocene and hafnocene catalysts. A new polymerization mechanism. Journal of the American Chemical Society, 111(20), 8043–8044. 31. Hirscher, M., Yartys, V. A., Baricco, M., von Colbe, J. B., Blanchard, D., Bowman, R. C., Jr., Broom, D. P., Buckley, C. E., Chang, F., Chen, P., & Cho, Y. W. (2020). Materials for hydrogen-based energy storage-past, recent progress and future outlook. Journal of Alloys and Compounds, 827, 153548. 32. Anderson, R., Biong, A., & Gómez-Gualdrón, D. A. (2020). Adsorption isotherm predictions for multiple molecules in MOFs using the same deep learning model. Journal of Chemical Theory and Computation, 16(2), 1271–1283. 33. Allam, R., Martin, S., Forrest, B., Fetvedt, J., Lu, X., Freed, D., Brown, G. W., Jr., Sasaki, T., Itoh, M., & Manning, J. (2017). Demonstration of the Allam cycle: An update on the development status of a high efficiency supercritical carbon dioxide power process employing full carbon capture. Energy Procedia, 114, 5948–5966. 34. Ferguson, A. L. (2018). ACS central science virtual issue on machine learning. ACS Central Science, 4(8), 938–941. 35. Ghosh, K., Stuke, A., Todorovi´c, M., Jørgensen, P. B., Schmidt, M. N., Vehtari, A., & Rinke, P. (2019). Machine learning: deep learning spectroscopy: Neural networks for molecular excitation spectra. Advanced Science, 6(9), 1970053. 36. Lv, C., Zhou, X., Zhong, L., Yan, C., Srinivasan, M., Seh, Z. W., Liu, C., Pan, H., Li, S., Wen, Y., & Yan, Q. (2022). 
Machine learning: An advanced platform for materials development and state prediction in lithium-ion batteries. Advanced Materials, 34(25), 2101474. 37. George, J., & Hautier, G. (2021). Chemist versus machine: Traditional knowledge versus machine learning techniques. Trends in Chemistry, 3(2), 86–95. 38. Liu, Y., Guo, B., Zou, X., Li, Y., & Shi, S. (2020). Machine learning assisted materials design and discovery for rechargeable batteries. Energy Storage Materials, 31, 434–450. 39. Chen, C. T., & Gu, G. X. (2019). Machine learning for composite materials. MRS Communications, 9(2), 556–566. 40. Lundervold, A. S., & Lundervold, A. (2019). An overview of deep learning in medical imaging focusing on MRI. Zeitschrift für Medizinische Physik, 29(2), 102–127. 41. Strzelecki, M., & Badura, P. (2022). Machine learning for biomedical application. Applied Sciences, 12(4), 2022. 42. Melocchi, A., Uboldi, M., Cerea, M., Foppoli, A., Maroni, A., Moutaharrik, S., Palugan, L., Zema, L., & Gazzaniga, A. (2020). A graphical review on the escalation of fused deposition modeling (FDM) 3D printing in the pharmaceutical field. Journal of Pharmaceutical Sciences, 109(10), 2943–2957. 43. Invernizzi, M., Turri, S., Levi, M., & Suriano, R. (2018). 4D printed thermally activated selfhealing and shape memory polycaprolactone based polymers. European Polymer Journal, 101, 169–176. 44. Haleem, A., Javaid, M., & Vaishya, R. (2019). 5D printing and its expected applications in orthopaedics. Journal of Clinical Orthopaedics Trauma, 10(4), 809–810. 45. Ghilan, A., Chiriac, A. P., Nita, L. E., Rusu, A. G., Neamtu, I., & Chiriac, V. M. (2020). Trends in 3D printing processes for biomedical field: Opportunities and challenges. Journal of Polymers and the Environment, 28(5), 1345–1367. 46. Statt, A., Kleeblatt, D. C., & Reinhart, W. F. (2021). Unsupervised learning of sequencespecific aggregation behavior for a model copolymer. Soft Matter, 17(33), 7697–7707. https:// doi.org/10.1039/D1SM01012C
Machine Learning for Next-Generation Functional Materials
219
47. Chen, L., Venkatram, S., Kim, C., Batra, R., Chandrasekaran, A., & Ramprasad, R. (2019) Electrochemical stability window of polymeric electrolytes. Chemistry of Materials, 31(12) 4598–4604. https://doi.org/10.1021/acs.chemmater.9b01553 48. Wu, Y., Guo, J., Sun, R. and Min, J., 2020. Machine learning for accelerating the discovery of high-performance donor/acceptor pairs in non-fullerene organic solar cells. npj Computational Materials, 6(1), 120. https://doi.org/10.1038/s41524-020-00388-2 49. Wu, K., Sukumar, N., Lanzillo, N. A., Wang, C., “Rampi” Ramprasad, R., Ma., Baldwin, A. F, Sotzing, G., Breneman, C. (2016). Prediction of polymer properties using infinite chain descriptors (ICD) and machine learning: Toward optimized dielectric polymeric materials. Journal of Polymer Science Part B: Polymer Physics, 54(20), 2082–2091. https://doi.org/10. 1002/polb.24117 50. Borboudakis, G., Stergiannakos, T., Frysali, M., Klontzas, E., Tsamardinos, I., & Froudakis, G. E. (2017). Chemically intuited, large-scale screening of MOFs by machine learning techniques. npj Computational Materials, 3(1), 40. https://doi.org/10.1038/s41524-0170045-8 51. Takahashi, H., Tampo, H., Arai, Y., Inoue, Y., & Kawashima, H. (2017). Applying artificial intelligence to disease staging: Deep learning for improved staging of diabetic retinopathy. PloS one, 12(6), e0179790. https://doi.org/10.1371/journal.pone.0179790 52. Ward, L., Agrawal, A., Choudhary, A., & Wolverton, C. (2016). A general-purpose machine learning framework for predicting properties of inorganic materials. npj Computational Materials, 2(1), 1–7. https://doi.org/10.1038/npjcompumats.2016.28 53. Sonatkar, J., Kandasubramanian, B., & Ismail, S. O. (2022). 4D printing: Pragmatic progression in biofabrication. European Polymer Journal, 111128. https://doi.org/10.1016/j.eurpol ymj.2022.111128
Contemplation of Photocatalysis Through Machine Learning
Tulsi Satyavir Dabodiya, Jayant Kumar, and Arumugam Vadivel Murugan
Abstract Technological advances in the current era of the Internet of Things (IoT) and artificial intelligence (AI) have led researchers to explore the subfield of data science known as machine learning (ML). ML can benefit the research community across a wide range of applications. Coupling ML with photocatalyst (PC) research can accelerate the understanding of structure-property-application relationships, with practical impact on sustainable hydrogen generation by water splitting and on environmental remediation. ML is a remarkable tool for drawing on the large pool of knowledge relevant to improving photocatalyst efficiency. In this chapter, we give a brief introduction to the ML process as it could benefit the photocatalysis field. We then outline the basic PC research knowledge that is potentially useful for ML methods, and describe existing ML practices in PC research for the quick identification of novel photocatalysts. Finally, we elaborate the available conceptual strategies for complementing data-driven ML with PC knowledge. The chapter thereby underlines the need to use existing databases for ML training and prediction. It aims to provide adequate information on photocatalyst informatics alongside the Edisonian approach, and to demonstrate the potential of, and need for, machine learning to accelerate the discovery of novel photocatalysts.
T. S. Dabodiya (B)
Department of Chemical and Materials Engineering, University of Alberta, Alberta, Edmonton T6G 1H9, Canada
e-mail: [email protected]

T. S. Dabodiya · A. V. Murugan
Centre for Nanoscience and Technology, Madanjeet School of Green Energy Technologies, Pondicherry University (A Central University), Dr. R. Vankataraman Nagar, Kalapet, Puducherry 605014, India

J. Kumar (B)
Delhi Institute of Tool Engineering, Shalimar Bagh, New Delhi 110052, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_10
T. S. Dabodiya et al.
1 Introduction

Information technology has brought the entire world to our fingertips, and it also provides gateways to many other fields. Artificial intelligence (AI) is one of them, and it is of great importance in sectors such as healthcare, agriculture, education, transportation, smart cities, infrastructure, and smart mobility, among many others that benefit society. AI and its subsets, machine learning (ML) and deep learning (DL), play vital roles today. Although DL is a subset of ML and ML is a subset of AI, their approaches differ. In DL, the algorithm takes a long time to program, but it then responds almost instantly and in minute detail. In ML, the programming can be completed sooner, but the responses are less detailed. ML is said to be in its infancy, yet it is already used in diverse areas [1, 2], and photocatalysis is among them [3]. For many years, scientists have worked on innumerable catalysts classified by nature, phase, and structure [4]. The questions that now arise are how experimental work and machine learning can be interlinked, and how ML can play a vital role in multidisciplinary work. This chapter helps the reader understand the connection between the two. It also explains how collected data can be used to describe the behavior of molecules, for example interaction and repulsion forces, and to support high-throughput (HT) methods that screen materials on the basis of several physical and chemical properties satisfied simultaneously. The ML approach notably reduces the search space for identifying materials with desired properties [6].

ML-based studies have been applied successfully to find promising materials for applications in environmental remediation and energy storage devices [6]. Similarly, ML is used to uncover hidden relationships in existing data and to predict the required target properties of materials at a modest cost compared with conventional theoretical or experimental methods. ML methods can unravel the physics of the underlying process and speed up the identification of future materials. In this chapter, we show how ML can simplify the understanding of photocatalyst materials and their applications (Fig. 1a–f).
2 Machine Learning in Photocatalyst

Machine learning is generally defined as programming computers to optimize a performance criterion using example data or past experience. A model is defined in terms of some parameters, and learning is the execution of a computer program that optimizes those parameters using the training data or past experience. The resulting model may be predictive, making predictions about future cases, or descriptive, extracting knowledge from the data. A machine learning framework informed by photocatalysis domain knowledge could be a more practical and feasible route to discovering suitable photocatalyst materials. Coupling domain knowledge with data-driven machine learning at different stages supplements and complements the overall process of photocatalyst discovery.

Fig. 1 A proposed machine learning workflow for target-driven narrow-bandgap photocatalyst design. a Chemical features and photocatalysis data from the literature were used to build the dataset. b Bandgap regression and HER activity classification ML models were trained on these data. c The best ML models were used to scan unknown materials and identify a list of candidates with optimal bandgaps and HER activity. d These candidates were synthesized and e their bandgaps and H2 evolution were measured. f The new experimental data can be added to the dataset to improve subsequent models, closing the loop. Reproduced by permission [5]
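The definition of learning given in this section can be written compactly as an optimization problem. The notation below is an assumption introduced for illustration and does not appear in the cited works: x_i is the feature vector of the i-th photocatalyst, y_i its measured property or activity, f(·; θ) the model with parameters θ, and L a loss function that plays the role of the performance criterion.

```latex
\hat{\theta} \;=\; \arg\min_{\theta}\; \frac{1}{N}\sum_{i=1}^{N} L\bigl(y_i,\; f(x_i;\theta)\bigr),
\qquad
\hat{y}_{\mathrm{new}} \;=\; f\bigl(x_{\mathrm{new}};\hat{\theta}\bigr)
```

The first expression is the training step, in which the parameters are fitted to the N example data points. The second is the predictive use of the fitted model on a material that has not yet been measured.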
2.1 Photocatalyst Domain Knowledge Acquirement

Domain knowledge, in the photocatalyst research field, refers to all current knowledge or information on photocatalysis that rests on existing experimental observations and on theoretical and simulation principles. It is divided into three types, which are illustrated in Fig. 2 and discussed below.

Fig. 2 Domain knowledge acquirement procedures

Theoretical knowledge acquirement: Theoretical knowledge acquirement is the procedure of collecting all reliable, theoretically established information, such as how light harvesting by a photocatalyst material determines its performance. Every underlying physical model for solar photocatalysts begins by bringing together the established calculations and formulas for harvesting photons. Parameters such as the number of absorbed solar photons, N_photon, with energy matching the electronic band gap of the photocatalyst, the optical absorbance given by the Beer-Lambert law, and the radiative recombination calculated with the Shockley-Queisser approach are obtained from the reported literature and defined sources [7]. The database is further examined for entities that create additional absorption sites through secondary light-sensitive components, i.e., plasmonic absorbers such as Au, Ag, Cu, and Pt acting via the localized surface plasmon resonance effect. These additional absorption sites are correlated with the aforementioned equations and formulas to evaluate the performance of photocatalysts [8]. Data are also acquired from standard physical characterization techniques that determine existing material properties such as surface-active sites, carrier mobility, surface area, surface potential, charge recombination, phases, and band gaps. All the characterization data obtained with the various microscopic and spectroscopic techniques used to find surface and bulk defects serve as input for machine learning (Table 1). In addition, data on the conduction band edge (E_CB) and valence band edge (E_VB) positions, the Fermi level, the donor density (N_D), the acceptor density (N_A), and the effective masses of electrons (m_e) and holes (m_h) are obtained from spectroscopic databases [9]. It is also important to gather and input the potentials of the target redox reactions and the overpotentials associated with the reaction barriers. Overpotentials of even a few millivolts matter when predicting the target reaction. This information further helps in identifying suitable candidate catalysts and cocatalysts to place on the photocatalyst surface to catalyze the target redox reactions.
Likewise, it assists in selecting the potentials, relative to the normal hydrogen electrode (NHE), for catalyzing the oxygen evolution and hydrogen evolution half-cell reactions in water splitting (listed in Table 2).

Table 1 Characterization techniques and the parameter data obtained from them for machine learning data acquisition
S. No  Characterization techniques       Estimated properties
1.     DFT calculation                   Surface adsorption energies
2.     BET, ECSA                         Surface area
3.     XPS, Auger, EPR, Raman            Surface defect density
4.     XRD, SEM, TEM, HRTEM              Crystal facet
5.     KPFM, AFM, STM, zeta potential    Surface potential
6.     PL/TRPL, TRMC, TAS, THz/TRTS      Charge lifetime, charge recombination rate
7.     TRMC, PEIS                        Charge diffusion length
8.     TRMC                              Charge mobility
9.     MS, TRMC                          Charge density
10.    UV-vis, PL, SPS                   Bandgap
11.    MS, LSV, XPS valence, UPS         Band position/Fermi level
12.    XRD, ICP                          Composition/phase
13.    XRD                               Crystal structure/crystallinity

Table 2 Standard reduction potentials, E0 (V vs NHE at pH 0), of reduction and oxidation reactions
S. No  SRP, E0 (V vs NHE at pH 0)  Name of reaction
1.     0.0                         Hydrogen evolution reaction (HER)
2.     1.23                        Oxygen evolution reaction (OER)
3.     −1.49                       CO2 reduction reaction (CO2RR)
4.     0.23 to −0.08               Organic oxidation reaction (OOR)
5.     0.55 to −3.20               N2 reduction reaction (NRR)
6.     1.33                        Heavy metal reduction reaction

Experiment knowledge: Besides theoretical knowledge, domain knowledge can also be derived from the observations and mechanisms reported in experiments. Empirical knowledge that cannot be adequately described by experiments will also be scrutinized and added to the machine learning database. This information is taken into account in the design and synthesis of defect-free photocatalysts, since different techniques lead to different activities for the same semiconductor material with nominally identical physical and chemical properties. Cataloguing the various defects identified from experimental data, such as bulk, surface, and line defects at different concentrations, helps to establish the physical and chemical properties [10]. In particular, estimating particle size and specific surface area from established empirical relationships helps in assessing the catalytic activity of active surfaces. To build a strong background of domain knowledge, a large body of experimental data is needed to test the validity of existing hypotheses [11]. This information is coupled with the relevant fundamental theories to predict directly the photocatalytic activity rate and the type of degradation. Morphology is linked to catalytic behavior in particular: crystal facet engineering, edge-defined morphologies, and core-shell morphology strongly regulate the photoactivity, selectivity, and stability of the photocatalyst, and must be examined carefully when building the knowledge domain. These relationships are not evaluated directly by physical equations; therefore, numerous experimental results have to be surveyed to understand these properties [12]. Another important parameter to be recorded is photostability during photocatalytic activity. This characteristic can be established properly only after the experiments have been completed. Metal oxides and sulfides such as Cu2O, ZnO, CdS, and Cu2S undergo extensive photocorrosion during photocatalytic activity. Effects such as defect formation, exposed crystal facets, and photocorrosion still lack a proper theoretical description. Sorting out these limitations beforehand helps in gathering useful knowledge for the photocatalysis research community [13].

Existing databases: Materials and crystallographic databases are the main resources for previously measured or theoretically estimated material properties needed to complete the domain knowledge pipeline. Data hubs offer a considerable advantage because they allow catalysts to be screened widely. Computational or theoretical screening of materials for electrocatalytic HER, OER, and methanation are examples in which new and efficient catalysts were successfully identified from existing databases. Some well-known materials databases available to the photocatalyst field are listed in Table 3.
Table 3 Popular existing database systems of materials
S. No  Database systems                                  Notations
1.     Inorganic crystal structure database              ICSD
2.     NREL materials database                           NRELMD
3.     Materials Project                                 MP
4.     Computational materials repository                CMR
5.     Automatic flow for materials discovery            AFLOW
6.     Open quantum materials database                   OQMD
7.     Crystallography open database                     COD
8.     National Institute of Standards and Technology    NIST
9.     Stanford Catalysis-Hub                            SCH
10.    Materials data facility                           MDF
11.    PubChem                                           –
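The theoretical quantities discussed in Sect. 2.1 lend themselves to simple numerical checks before any model is trained. The following Python sketch is illustrative only and is not taken from the cited works. It converts a band gap into an absorption-edge wavelength using hc ≈ 1239.84 eV nm, evaluates the Beer-Lambert absorbed fraction 1 − exp(−αd), and tests whether a pair of band edges straddles the HER and OER potentials of Table 2 with an optional overpotential margin. The function names, the sample band edges, and the 0.2 V margin are assumptions chosen for the example.

```python
import math

HC_EV_NM = 1239.84  # Planck constant times speed of light, in eV·nm


def absorption_edge_nm(band_gap_ev):
    """Longest wavelength (nm) that a semiconductor with this band gap can absorb."""
    return HC_EV_NM / band_gap_ev


def absorbed_fraction(alpha_per_cm, thickness_cm):
    """Beer-Lambert law: fraction of incident light absorbed, 1 - exp(-alpha * d)."""
    return 1.0 - math.exp(-alpha_per_cm * thickness_cm)


def straddles_water_splitting(e_cb, e_vb, overpotential=0.0, e_her=0.0, e_oer=1.23):
    """Check band edges (V vs NHE, pH 0) against the HER and OER potentials.

    The conduction band edge must lie at or below (more negative than) the HER
    potential and the valence band edge at or above (more positive than) the
    OER potential, each by at least the given overpotential margin.
    """
    return e_cb <= e_her - overpotential and e_vb >= e_oer + overpotential


# Hypothetical candidate, for illustration only (not measured data).
candidate = {"e_cb": -0.3, "e_vb": 2.9}
band_gap = candidate["e_vb"] - candidate["e_cb"]  # 3.2 eV
edge_nm = absorption_edge_nm(band_gap)
suitable = straddles_water_splitting(candidate["e_cb"], candidate["e_vb"], overpotential=0.2)
```

A rule of this kind can serve as a physically motivated pre-filter, so that only candidates that are thermodynamically capable of the target reaction are passed on to a data-driven model.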
3 Photocatalyst ML Framework

In machine learning, a model is built from the available inputs, described by a chosen set of input features, and outputs are estimated from the input data or learned factors. In photocatalysis, likewise, a model relates the properties and characteristics of the catalyst, as inputs, to its performance, as output. Once such models are developed, the catalytic activity of a material can be inferred without conducting any experiment or simulation. A broad overview of the ML-facilitated discovery of active photocatalyst materials and the associated framework is given in Fig. 3. In short, ML models could provide an inexpensive route to selecting photocatalytic materials with suitable properties for better photocatalytic activity, and therefore reduce effort and wasted resources compared with typical experimental and theoretical approaches. Applying ML to speed up photocatalyst screening generally requires a long time to acquire data through reliable and standardized practices. The components of a generic data-driven ML framework for photocatalyst design and discovery are explained below.

Fig. 3 A proposed machine learning framework for photocatalyst discovery

Data curation: Data curation is the preliminary step in implementing the ML framework for discovering photocatalyst materials and their characteristics. It governs the quantity and quality of the data processed and directly influences the validity of the framework. Photocatalyst materials are generally prepared by different synthesis methods, and different synthesis routes give the same material different properties because of distinct morphology, surface area, defect density, and so on. These discrepancies within the same material make it difficult to understand its photocatalytic activity fully. A sequential and standardized synthesis protocol in a controlled environment is therefore required to generate data. Because the data available in the open literature are assumed to be insufficient for ML, it becomes important to generate new datasets in a compatible manner from the same source. Figure 4 is a schematic illustration of a wide range of photocatalyst data curation systems based on common experimental resources [14, 15]. High-throughput computational and theoretical studies should also be considered for data curation in the ML framework for photocatalyst discovery [14, 15].

Fig. 4 Schematic representation of data curation for the synthesis, characterization, and applications of photocatalyst/photoelectrocatalyst materials, including their performance evaluation using the ML approach

The following factors must be considered during data curation:
1. Changes in the photocatalyst material should be monitored properly before data curation, for example alterations after illumination in the oxidation states, morphologies, and/or compositions of the photocatalyst or in the reactive working environment of the co-catalysts. Such factors cause immediate dynamic changes and make the photocatalyst susceptible to photocorrosion during catalytic activity.
2. Photocorrosion or defect pathways should be observed continuously with in situ or operando characterization techniques so that photocorrosion of the semiconductor can be remedied. These advanced characterizations also help to elucidate the reaction mechanisms. For example, X-ray absorption spectroscopy is used to detect changes in the oxidation state of Cu(II) during catalytic activity [16]. Including in situ or operando characterization in data curation will therefore help the ML framework give a clear picture of photocatalyst materials and their performance. Some commonly used frameworks are listed in Table 4.
3. A photocatalysis-oriented database would make it possible to fetch valuable data in a regular and standard way and so mitigate the bottlenecks to the smooth operation of the photocatalyst.
Table 4 List of URLs of available machine learning frameworks with abbreviations [3]
S. No  Available URL                                                                                              Abbreviation
1.     https://topepo.github.io/caret/                                                                            Caret
2.     https://github.com/tsudalab/combo                                                                          COMBO
3.     https://deepchem.io/                                                                                       DeepChem
4.     https://deeplearning4j.konduit.ai/                                                                         Deeplearning4j
5.     https://www.h2o.ai/                                                                                        H2O.ai
6.     https://keras.io/                                                                                          Keras
7.     http://www.mathworks.com/matlabcentral/fileexchange/429-nsga-iiamulti-objective-optimization-algorithm    NSGA-II
8.     https://mlpack.org/                                                                                        MLpack
9.     https://pytorch.org/                                                                                       PyTorch
10.    https://scikit-learn.org/stable/                                                                           Scikit-learn
11.    https://www.tensorflow.org/                                                                                TensorFlow
12.    https://github.com/jparkhill/TensorMol                                                                     TensorMol
13.    https://www.cs.waikato.ac.nz/ml/weka/                                                                      Weka
4. Data converted into general machine-understandable formats can be conveniently arranged, queried, shared, and investigated to construct models or find patterns. This helps in harnessing the available data with modified ML algorithms to reach new findings, rather than spending effort on the time-consuming initial stage. The focus should shift toward resourceful data analytics with consistently available data.

Data quality modulation: A data point generally carries extensive information about a photocatalyst material and its photocatalytic activity, expressed through the material's physicochemical and quantum characteristics. These features are estimated experimentally or calculated from existing materials databases, and they control the level of detail and accuracy built into the ML model. Raw data are noisy and crowded with unwanted data points, so they must be pre-processed before being fed into the ML module. Pre-processing involves evaluating and constructing machine-readable features according to the significance of each feature. Selecting the appropriate and relevant aspects of the data is pivotal to training ML models well. For coherent data curation, a broad dataset needs to be assembled to obtain accurate model parameters [17], whereas using insufficient aspects of the data leads to underfitting and poor predictive performance. Features such as light intensity, catalyst loading, and solution pH should therefore be assessed and quantified in photocatalyst systems. Another important goal of data quality modulation is to reveal the hidden correlations in the complex raw information on photocatalyst material properties. These correlations link photocatalyst properties and photocatalytic activities, and include, for example, human intuition exercised during experiments.
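As an illustration of the pre-processing described above, the following minimal Python sketch removes incomplete records and standardizes the three operating parameters named in the text. The record layout, the field names, and the numerical values are hypothetical and serve only to make the example runnable. A real pipeline would add outlier handling and feature construction suited to the dataset at hand.

```python
from statistics import mean, pstdev

# Hypothetical feature names; units and values are assumptions for illustration.
FEATURES = ("light_intensity", "catalyst_loading", "solution_ph")


def drop_incomplete(records, keys=FEATURES):
    """Discard records in which any required feature is missing (None or absent)."""
    return [r for r in records if all(r.get(k) is not None for k in keys)]


def standardize(records, keys=FEATURES):
    """Rescale each feature to zero mean and unit (population) standard deviation."""
    stats = {}
    for k in keys:
        values = [r[k] for r in records]
        sigma = pstdev(values)
        # Guard against constant features, which would otherwise divide by zero.
        stats[k] = (mean(values), sigma if sigma > 0 else 1.0)
    scaled = [
        {**r, **{k: (r[k] - stats[k][0]) / stats[k][1] for k in keys}}
        for r in records
    ]
    return scaled, stats


raw = [
    {"light_intensity": 100.0, "catalyst_loading": 1.0, "solution_ph": 7.0, "rate": 12.0},
    {"light_intensity": 50.0, "catalyst_loading": 0.5, "solution_ph": 3.0, "rate": 5.0},
    {"light_intensity": None, "catalyst_loading": 2.0, "solution_ph": 9.0, "rate": 1.0},
]
clean = drop_incomplete(raw)
scaled, stats = standardize(clean)
```

Returning the means and standard deviations together with the scaled records allows the same transformation to be applied later to new measurements, which keeps training and prediction data consistent.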
The raw data contain both quantitative and qualitative information; therefore, the manner in which the data are represented affects how effectively ML models learn [8, 10]. Data quality modulation is a promising technique for converting domain knowledge, such as chemical and molecular structures, active absorption sites, and surrounding species, into fingerprint-based information. The processes described above indicate that data quality modulation can help to encode and decode complex scientific data into a physically relevant form for ML applications (Fig. 5).

Fig. 5 A combined approach of machine learning and experimental details used to find new photocatalysts

Development of ML models: An ML model is a combination of functions that map the given inputs onto the expected outputs. It is built by choosing an appropriate set of parametric functions defined over the allotted space and using suitable training algorithms to infer the model parameters from the underlying data. Once the model has been hypothesized, the photocatalytic activity of a material can be estimated from the best previous results, and the model serves as the core resource for identifying new photocatalysts.

Model refinement: When the model is tested, it is run repeatedly on data drawn from large labeled datasets to find out how well the developed model matches the most suitable catalyst properties to the predicted photocatalysts. The same exercise is then used to refine the model and to validate it against the remaining datasets, which gives an indication of real-world performance. This form of active learning and continuous model refinement could be especially useful [9].
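The training and validation steps described above can be sketched with a deliberately small example. The code below is an illustration under stated assumptions and is not the authors' implementation. It fits a linear model by batch gradient descent, holds out every k-th record for validation, and computes the root-mean-square error on the held-out records. The synthetic data (a linear rule on a 5 × 5 grid of two scaled features) are invented for the example. In practice, one of the libraries listed in Table 4 would normally replace the hand-written optimizer.

```python
def predict(weights, features):
    """Linear model: weights[0] is the intercept, the rest multiply the features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def train(X, y, learning_rate=0.1, epochs=5000):
    """Batch gradient descent on half the mean squared error."""
    n, d = len(X), len(X[0])
    weights = [0.0] * (d + 1)
    for _ in range(epochs):
        errors = [predict(weights, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(errors) / n] + [
            sum(e * xi[j] for e, xi in zip(errors, X)) / n for j in range(d)
        ]
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights


def rmse(weights, X, y):
    """Root-mean-square error of the model on the given records."""
    squared = [(predict(weights, xi) - yi) ** 2 for xi, yi in zip(X, y)]
    return (sum(squared) / len(squared)) ** 0.5


def split_every_kth(X, y, k=5):
    """Hold out every k-th record (indices 0, k, 2k, ...) for validation."""
    train_idx = [i for i in range(len(X)) if i % k != 0]
    test_idx = [i for i in range(len(X)) if i % k == 0]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


# Synthetic, noise-free data (an assumption): two features scaled to [0, 1].
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
X = [[a, b] for a in levels for b in levels]
y = [0.5 + 2.0 * a - 1.0 * b for a, b in X]

X_train, y_train, X_test, y_test = split_every_kth(X, y, k=6)
weights = train(X_train, y_train)
test_rmse = rmse(weights, X_test, y_test)
```

Because the held-out records are never used during training, the error computed on them is the quantity that indicates how the model might behave on unseen materials. Refinement then consists of adding data or changing the model and repeating the same check.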
4 Integration of Domain Knowledge with Machine Learning

Human understanding of a domain rests on fundamental relations and laws established through long exposure to experiment and theory. Machine learning works quite differently: it is a powerful tool for recognizing patterns in large datasets, but it lacks human-like sense and knowledge. Integrating machine-driven understanding and data-oriented observations of photocatalysts into scientific models therefore narrows the learning gap between human and machine, and sets up an optimization loop that leads rapidly to novel photocatalysis materials. In general, domain knowledge shapes a machine learning system through training data collection, modeling methodologies, the definition of algorithms, and machine-user inference methods. The resulting model should be able to reduce the deviation between the structure-property-activity relationship given by the theoretical model and the experimental data.
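The optimization loop mentioned above can be sketched as follows. This is a hypothetical illustration rather than a method taken from the cited works. A domain-knowledge rule first removes candidates whose band gap lies outside a usable window, a simple data-driven predictor then ranks the remaining candidates, and the result of each new "experiment" is added to the labelled data before the next round. The lower bound of 1.23 eV is the thermodynamic minimum for water splitting, whereas the 3.0 eV upper bound, the nearest-neighbour predictor, and the `run_experiment` callback are assumptions made for the example.

```python
def domain_filter(candidates, min_gap_ev=1.23, max_gap_ev=3.0):
    """Domain-knowledge step: keep candidates whose band gap lies in the window."""
    return [c for c in candidates if min_gap_ev <= c["band_gap_ev"] <= max_gap_ev]


def predict_activity(candidate, labelled):
    """Data-driven step: activity of the nearest labelled record (1-nearest neighbour)."""
    nearest = min(labelled, key=lambda r: abs(r["band_gap_ev"] - candidate["band_gap_ev"]))
    return nearest["activity"]


def closed_loop(candidates, labelled, run_experiment, n_rounds):
    """Repeatedly pick the most promising candidate, measure it, and extend the data."""
    pool = domain_filter(candidates)
    labelled = list(labelled)  # copy, so that the caller's list is left unchanged
    for _ in range(n_rounds):
        if not pool:
            break
        best = max(pool, key=lambda c: predict_activity(c, labelled))
        pool.remove(best)
        # run_experiment is a stand-in for a synthesis and measurement step.
        labelled.append({**best, "activity": run_experiment(best)})
    return labelled


# Hypothetical candidates and seed measurements, for illustration only.
candidates = [{"band_gap_ev": g} for g in (0.9, 1.5, 2.0, 2.4, 3.4)]
seed = [{"band_gap_ev": 1.4, "activity": 0.2}, {"band_gap_ev": 2.5, "activity": 0.8}]
result = closed_loop(candidates, seed, lambda c: 1.0 - abs(c["band_gap_ev"] - 2.0), n_rounds=2)
```

The division of labour is the point of the sketch: the filter encodes what is already known from theory, so that the learned predictor only has to discriminate among physically plausible candidates.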
5 Conclusion
With the recent emergence of the Internet of Things and artificial intelligence, machine learning (ML) systems for the prompt identification of catalysts, and the opportunities they offer to hasten the discovery of significant photocatalysts, have been actively explored. However, the lack of sufficient state-of-the-art data in existing databases has hindered the effective adoption of ML in photocatalysis. The challenges include incomplete experimental detail, non-standardized equipment and experiments, unreliable results, inconsistent measurements, and difficulty in acquiring up-to-date data. Because ML algorithms are entirely data-driven, the scarcity of reliable data undermines their interpretability and reliability. Coupling domain knowledge and data from photocatalysis with machine learning is therefore a practical way to meet these challenges. Model predictions can then draw on the existing knowledge base to arrive at rationally designed photocatalyst materials that can be synthesized. This may require intensive building of ML models, followed by training guided by human expertise, to simplify the complexities that lie in the way of discovering efficient photocatalysts. Bringing machine learning into the field of photocatalysis may help in developing a facile screening platform for exploring robust photocatalysts.
References
1. Schwab, K. (2017). The fourth industrial revolution. Currency.
2. Catlow, R., Sokol, A., & Walsh, A. (2013). Computational approaches to energy materials. Wiley.
3. Mai, H., Le, T. C., Chen, D., Winkler, D. A., & Caruso, R. A. (2022). Machine learning for electrocatalyst and photocatalyst design and discovery. Chemical Reviews, 122, 13478–13515.
4. Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O., & Walsh, A. (2018). Machine learning for molecular and materials science. Nature, 559, 547–555.
5. Mai, H., Le, T. C., Hisatomi, T., Chen, D., Domen, K., Winkler, D. A., & Caruso, R. A. (2021). Use of metamodels for rapid discovery of narrow bandgap oxide photocatalysts. iScience, 24, 103068.
6. Kumar, R., & Singh, A. K. (2021). Chemical hardness-driven interpretable machine learning approach for rapid search of photocatalysts. npj Computational Materials, 7, 1–13.
7. Shockley, W., & Queisser, H. J. (1961). Detailed balance limit of efficiency of p–n junction solar cells. Journal of Applied Physics, 32, 510–519.
8. Zhang, X., Chen, Y. L., Liu, R.-S., & Tsai, D. P. (2013). Plasmonic photocatalysis. Reports on Progress in Physics, 76, 046401.
9. Masood, H., Toe, C. Y., Teoh, W. Y., Sethu, V., & Amal, R. (2019). Machine learning for accelerated discovery of solar photocatalysts. ACS Catalysis, 9, 11774–11787.
10. Bai, S., Zhang, N., Gao, C., & Xiong, Y. (2018). Defect engineering in photocatalytic materials. Nano Energy, 53, 296–336.
11. Zuo, F., Wang, L., Wu, T., Zhang, Z., Borchardt, D., & Feng, P. (2010). Self-doped Ti3+ enhanced photocatalyst for hydrogen production under visible light. Journal of the American Chemical Society, 132, 11856–11857.
12. Liu, G., Yu, J. C., Lu, G. Q. M., & Cheng, H.-M. (2011). Crystal facet engineering of semiconductor photocatalysts: Motivations, advances and unique properties. Chemical Communications, 47, 6763–6783.
13. Fermín, D. J., Ponomarev, E. A., & Peter, L. M. (1999). A kinetic study of CdS photocorrosion by intensity modulated photocurrent and photoelectrochemical impedance spectroscopy. Journal of Electroanalytical Chemistry, 473, 192–203.
14. Mao, S. S. (2013). High throughput growth and characterization of thin film materials. Journal of Crystal Growth, 379, 123–130.
15. Goldsmith, J. I., Hudson, W. R., Lowry, M. S., Anderson, T. H., & Bernhard, S. (2005). Discovery and high-throughput screening of heteroleptic iridium complexes for photoinduced hydrogen production. Journal of the American Chemical Society, 127, 7502–7510.
16. Yuan, L., Hung, S.-F., Tang, Z.-R., Chen, H. M., Xiong, Y., & Xu, Y.-J. (2019). Dynamic evolution of atomically dispersed Cu species for CO2 photoreduction to solar fuels. ACS Catalysis, 9, 4824–4833.
17. Medford, A. J., Kunz, M. R., Ewing, S. M., Borders, T., & Fushimi, R. (2018). Extracting knowledge from data through catalysis informatics. ACS Catalysis, 8, 7403–7429.
Discovery of Novel Photocatalysts Using Machine Learning Approach G. Sudha Priyanga, Gaurav Pransu, Harshita Krishna, and Tiju Thomas
G. S. Priyanga
Department of Physics, Research Institute for Natural Science, and Institute for High Pressure at Hanyang University, Hanyang University, 222 Wangsimni-Ro, Seongdong-Ku, Seoul 04763, Republic of Korea
G. Pransu
Institute of Physics, Bijenička C. 46, 10000 Zagreb, HR, Croatia
H. Krishna
Department of Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
T. Thomas (B)
Department of Metallurgical and Materials Engineering, Indian Institute of Technology Madras, Chennai 600036, India
Indian Solar Energy Harnessing Center (ISEHC) - An Energy Consortium, Indian Institute of Technology Madras, Chennai 600036, India
e-mail: [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_11

1 Literature Survey
1.1 Photocatalyst
One of the most important technologies for reaching sustainability in the twenty-first century is the generation of solar fuels through solar-induced chemical reactions. In this context, semiconductor photocatalysts that can be activated by visible light (a major part of sunlight) are being thoroughly researched for nitrogen fixation [1], CO2 reduction [2], chemical synthesis [3], water splitting [4], environmental decontamination [5, 6], etc. Highly efficient photocatalysts should meet two fundamental requirements. First, although the free energy threshold to split H2O is 1.23 eV per electron, a photocatalyst needs a band gap (Eg) of about 2 eV to drive the reaction because of overpotentials, the operational voltage of the device, and other losses [7]. The second need is the
redox potentials of photocatalysts that correspond to the reaction, meaning that the conduction band edge should lie at a more negative potential than the H+/H2 couple [8]. Data-driven methods have made it possible to predict a large number of suitable candidates with appropriate direct band gaps and optimum band edges for producing chemical fuels from sunlight, significantly increasing the number of theoretically calculated 2D photoelectrocatalysts that still await experimental confirmation. Before the development of materials informatics, most experimental research in photocatalysis relied on trial-and-error techniques. In two distinct chemical families, oxynitrides [9] and metal-oxide perovskites [10, 11], a huge number of compounds for photocatalysis have been predicted in the last five years by cutting-edge computational screening research. For example, the work of Castelli et al. showed how to select oxide materials with perovskite structures effectively via electronic structure simulations based on the GLLB-SC functional [12, 13]. This method narrowed a space of 5400 distinct materials down to only 15 viable candidates for solar water splitting [13]. To date, around 130 different categories of inorganic photocatalysts have been identified [14–16] since the discovery of the photocatalytic characteristics of TiO2 in 1972 [17]. Commercial application is not yet feasible because of the low water-splitting activity and weak quantum efficiency under visible light, caused by large band gaps or poor alignment of the redox potentials [18]. Far more efficient photocatalysts are clearly needed to make solar hydrogen production economically feasible. Identifying photocatalytic materials that can achieve the requisite conversion targets has consistently been a major bottleneck.
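The two requirements stated above (a band gap of roughly 2 eV and a conduction band edge more negative than H+/H2) lend themselves to a simple screening filter. The sketch below is illustrative only. It additionally assumes the standard counterpart condition that the valence band edge must be more positive than the O2/H2O level at 1.23 V, which the text implies through the 1.23 eV threshold but does not spell out. The function name and the three candidate entries are hypothetical.

```python
# Reference redox potentials in volts vs. NHE at pH 0.
H2_EVOLUTION_V = 0.0    # H+/H2
O2_EVOLUTION_V = 1.23   # O2/H2O (assumed counterpart condition)


def can_split_water(cbm_v, vbm_v, min_gap_ev=2.0):
    """Return True if the band edges straddle both water redox levels
    and the gap meets the practical threshold.

    cbm_v and vbm_v are the conduction-band minimum and valence-band
    maximum expressed as electrochemical potentials (V vs. NHE), so a
    more negative CBM means more strongly reducing electrons.
    """
    gap_ev = vbm_v - cbm_v
    reduces_protons = cbm_v < H2_EVOLUTION_V
    oxidises_water = vbm_v > O2_EVOLUTION_V
    return reduces_protons and oxidises_water and gap_ev >= min_gap_ev


# Hypothetical band edges chosen only to exercise the function.
candidates = {
    "material_A": (-0.3, 2.1),  # straddles both levels, gap 2.4 eV
    "material_B": (0.2, 2.6),   # CBM more positive than H+/H2
    "material_C": (-0.2, 1.5),  # straddles both levels, but gap is 1.7 eV
}
passing = [name for name, (cbm, vbm) in candidates.items()
           if can_split_water(cbm, vbm)]
print(passing)
```

A computational screening study would apply a filter of this kind to calculated band edges for thousands of candidates before any synthesis is attempted.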
1.2 Machine Learning
Machine learning refers to methods that learn from an appropriate training data set, producing a model that relates certain input features to output variables. For instance, ML algorithms in facial identification systems learn the association between face images and the identity of the matching person [19]. Without being explicitly taught, the algorithm picks up on the importance of the characteristics during training (e.g., the position, size, and shape of the eyes are more relevant than the surroundings). After training, the algorithm can be used to identify a person from a new picture that was not included in the training data set. Recently, as computing capabilities have grown and large data sets have become more easily accessible, machine learning algorithms have been broadly applied in various fields, such as speech processing and computer vision. These applications led to the development of complex, yet effective, ML models. Generally, a large set of data is required to train a complex ML algorithm. Similarly, in heterogeneous catalysis, the correlation modeled during training is often the one between catalyst parameters as input and catalyst performance as output [20–22]. The activity of catalysts can be deduced after the algorithms have been trained, eliminating the requirement for actual tests or
simulations. Recent years have seen some early success with ML deployment in the materials arena, including perovskite solar cells, piezoelectrics [23], electrocatalysis [24], and rechargeable batteries. In photocatalysis, ML-based attempts are still in their infancy and are dominated by models focused on water remediation [25]. The volume of initial information available for training ML algorithms in this field is very limited, in contrast to areas like speech and image recognition, where huge amounts of input data are accessible. The general idea in the context of photocatalysis is that predictive ML algorithms could provide a low-cost way of determining photocatalytic activity from a catalyst’s properties, thereby requiring fewer resources and less effort than conventional methods that are solely computational and/or experimental. To use ML to speed up photocatalyst screening, a sizable quantity of training data must be gathered using trustworthy and standardized procedures. It is also essential to evaluate the trained models so that accuracy can be improved with fewer computational and experimental steps. The general components for implementing an ML model for photocatalysts are as follows:
1. Data production
2. Feature engineering
3. Model development and improvement.
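The three components listed above can be illustrated end to end on a toy problem. The sketch below is a minimal illustration, not a recommended model: the records are synthetic values invented for the example (band gap in eV, surface area in m2/g, and an arbitrary activity score), the feature engineering is plain min-max scaling, and the model is a one-nearest-neighbour regressor assessed by leave-one-out error. All function names are hypothetical.

```python
import math

# Step 1, data production: synthetic records invented for illustration only.
# Each record is (band gap, surface area, activity score).
records = [
    (3.2, 50.0, 1.0),
    (3.0, 60.0, 1.4),
    (2.4, 80.0, 3.0),
    (2.2, 90.0, 3.4),
    (2.7, 72.0, 2.0),
]
raw_features = [(gap, area) for gap, area, _ in records]
targets = [activity for _, _, activity in records]


# Step 2, feature engineering: scale every column to the range [0, 1]
# so that no feature dominates the distance purely because of its units.
def min_max_scale(rows):
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
            for row in rows]


features = min_max_scale(raw_features)


# Step 3, model development and improvement: 1-nearest-neighbour
# regression, evaluated by leaving each record out in turn.
def predict_1nn(train_x, train_y, query):
    nearest = min(range(len(train_x)),
                  key=lambda i: math.dist(train_x[i], query))
    return train_y[nearest]


def leave_one_out_mae(xs, ys):
    errors = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        errors.append(abs(predict_1nn(rest_x, rest_y, xs[i]) - ys[i]))
    return sum(errors) / len(errors)


loo_mae = leave_one_out_mae(features, targets)
print(f"leave-one-out MAE: {loo_mae:.2f}")
```

The leave-one-out error is the quantity one would try to lower in the improvement step, for example by adding data or changing the features.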
1.3 Machine Learning as a Major Technique to Discover Novel Photocatalysts
Advanced theoretical methods like density functional theory (DFT) and ML have evolved to speed up materials design [26–28]. DFT can simulate material properties at the atomic level and model processes to help guide experiments and avoid the additional costs of synthesis. However, the computational cost of DFT rises steeply with atom count and basis set size (nominally with the cube of the system size in conventional implementations), so it can take hours to obtain a few desired properties for a system [29–31]. ML algorithms can instead be used to construct mathematical models that rely on sample sets of existing materials to target the desired properties while avoiding the complicated quantum physics of DFT [32]. Researchers have successfully used ML to develop new materials including metal-organic frameworks [33–35], binary alloys [36], and perovskite materials [37]. A number of materials designed using ML approaches have been successfully synthesized and have exhibited excellent performance [26, 38, 39]. Although machine learning has greatly aided many fields of STEM, its application in photocatalysis research is still at an early stage. Incorporating domain information into standard ML algorithms is a possible way around the significant limitation posed by the lack of a consistent training dataset. Overcoming the difficulties in photocatalyst discovery and design is, however, of greater significance. The ISI Web of Science has about 71,000 research papers, dating back
as far as 1911 related to photocatalysis. These could be a useful resource for training ML models. However, recurring problems in the photocatalysis literature are differing measurement techniques, varied reaction conditions, and data discrepancies, which make comparing the findings of different investigations challenging. Additionally, the methods used by various research groups for reporting performance (such as rate of reaction or quantum efficiency) vary [40]. Therefore, using common test procedures and/or identical light sources is essential to make the data-gathering process easier [41, 42]. One of the most crucial elements that should be well defined is the light source (wavelength and intensity), because photoactivity depends strongly on it. The most practicable option in this regard is thought to be standard simulated solar irradiation [43]. These steps would make it possible to provide reliable data for training ML models, which currently appears to be an insurmountable challenge. This calls for a departure from the typical ML paradigm, in which a large collection of data is required to build an effective ML algorithm. For finding effective solar photocatalysts, a domain-knowledge-driven machine learning approach may be more suitable. All available information on the subject that is built on accepted theory and experimental findings is referred to as “domain knowledge” in this context. We classify photocatalysis domain knowledge into the following groups: (i) theoretical information derived from basic principles, (ii) empirical knowledge gathered from experimental methods, and (iii) information from available material databases. A wide range of physical, mechanical, and electronic parameters, such as bandgap [44–46], absorption, and bulk modulus, can be predicted by ML algorithms. Moreover, intricate feature-property connections can be found by using feature engineering in the ML models [47].
For these reasons, ML models are used extensively to find novel materials with specific properties within large material spaces. The use of ML in photocatalyst design has two primary issues [19]. One problem is the large structural variation among photocatalytic materials. Most ML models have fairly narrow applicability domains because they are trained on small subsets of chemically similar materials. The other issue is inconsistent data from experimental photocatalysis. For instance, H2 evolution rates are gathered under a variety of reaction conditions and with various measurement techniques. The photocatalytic properties can also be affected by the preparation process; however, if these experimental details are properly recorded, they can serve as relevant features.
1.4 Scope of This Proposed Work
Massive efforts have been made in recent years to develop photocatalysts and assess how well they work in municipal water treatment processes. Quantifying the effectiveness of photocatalysts against various aqueous pollutants is difficult, however. The characteristics of photocatalysts, such as their crystalline structure, grain size and shape,
specific surface area, and pore structure, affect how well they photodegrade pollutants. Optimizing photocatalyst performance over all of these factors with a full factorial experimental design is costly in time and money, if not impractical. The wide spectrum of water-borne pollutants further undermines the viability of the traditional experimental approach. Photocatalysts made of metal-oxide semiconductors can break down organic molecules in contaminated water. Given the complicated structure of photocatalysts and the variety of pollutants, evaluating their performance by conventional experimental methods entails enormous effort and investment. Recent developments in machine learning (ML) provide a data-driven strategy that allows considerably more effective analysis and forecasting of the performance of various photocatalysts. An ML model makes full use of experimental data from the published literature and can produce findings that guide future experimental plans. Compared with the traditional experimental approach, this dramatically reduces time and labor costs.
2 Materials Used for Photocatalytic Applications, ML Descriptors, and Key Parameters
At present, precious metal semiconductors (AgI, Bi2O3, Bi2WO6), metal oxides (TiO2, ZnO, SnO2, Fe2O3, WO3, In2O3), sulfides (ZnS), and nonmetallic semiconductors (g-C3N4) are the common types of photocatalytic materials [48]. The table below lists the properties that make each material photocatalytically active.

Materials | Band gap (eV) | Properties that make the material suitable for photocatalytic applications
TiO2 | 3.2 | High photostability, abundance, high efficiency, non-toxicity, and low cost [49]
ZnO | 3.37 | Non-toxicity, low price, good thermal conductor, environmentally stable nature, abundance in nature, and excellent optical and electrical properties [50]
SnO2 | 3.6 | Transparent, low cost, environment friendly, chemically and biologically inert, non-toxic, easy production, high photosensitivity, photostable, and thermodynamically stable [51, 52]
Fe2O3 | 1.9–2.2 | Low band gap (2.3 eV) and good harvesting in the visible light spectrum, highly stable, recyclable, and abundant availability [53]
WO3 | 2.6–3 | Narrow Eg (2.4–2.8 eV), no photocorrosion, high photosensitivity, and high stability [54]
In2O3 | 3.75 | Specific surface area and oxygen vacancy [55]
ZnS | 3.54 | Rapidly generates electron-hole pairs under photoexcitation [56]
Bi2O3 | 2.0–3.3 | High surface area and interconnection among the particles form an integrated structure [57]
Bi2WO6 | 2.76 | Strong oxidative ability, efficient charge separation and transfer of electron-hole pairs [58]
g-C3N4 | 2–3 | Outstanding reduction potential [59]
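The band gaps in the table above can be related to the light each material is able to absorb. The sketch below is an illustration rather than part of the original chapter: it converts a band gap to an absorption-edge wavelength using the standard relation wavelength (nm) ≈ 1239.84 / Eg (eV), and it assumes 400 nm as the conventional lower bound of visible light. Only the single-valued band gaps from the table are used.

```python
HC_EV_NM = 1239.84         # Planck constant times speed of light, in eV nm
VISIBLE_START_NM = 400.0   # assumed conventional lower bound of visible light


def absorption_edge_nm(band_gap_ev):
    """Longest wavelength (nm) whose photons can bridge the band gap."""
    return HC_EV_NM / band_gap_ev


# Single-valued band gaps (eV) as listed in the table above.
band_gaps_ev = {
    "TiO2": 3.2,
    "ZnO": 3.37,
    "SnO2": 3.6,
    "In2O3": 3.75,
    "ZnS": 3.54,
    "Bi2WO6": 2.76,
}

edges = {name: absorption_edge_nm(gap) for name, gap in band_gaps_ev.items()}
visible_responsive = [name for name, edge in edges.items()
                      if edge >= VISIBLE_START_NM]

for name, edge in edges.items():
    region = "visible" if edge >= VISIBLE_START_NM else "UV only"
    print(f"{name}: edge at {edge:.0f} nm ({region})")
```

Materials whose edge falls below 400 nm absorb only ultraviolet light, which is consistent with the earlier remark that large band gaps limit activity under visible sunlight.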
The character of a material is largely determined by its type of band gap [60], which can be either direct or indirect. Since computing the band gap from first principles is computationally demanding, ML models offer an effective and viable way to tackle this challenge. Properties such as elemental composition, ionic radius, ionic character, and electronegativity can be used to determine the type of band gap and to study the nature of the material [61–63]. To be an efficient photocatalyst, a material should have several features. A few of them are common to catalysts in general, such as Brønsted centers for proton transfer and a high surface area. In addition, photocatalysts should have the features of semiconductors and other properties [64], as shown in Table 1.
Table 1 Desirable properties of photocatalysts
Properties | How to achieve various properties | Outcomes
High surface area | Small particle size | High adsorption
Single site structure | Crystalline materials | Homogeneity
Light absorption | Engineering band gap | Higher efficiency
Efficient charge separation | Preferential migration along certain direction | Low recombination
Long lifetime of charge separation | Presence of co-catalyst | Possibility of chemical reaction
High charge carrier mobility | High crystallinity | More efficient charge separation
Selectivity toward a single product | Adequate co-catalysts | Efficient chemical process

2.1 Descriptors for Photocatalysis
Clive Humby, a British mathematician, said that “data is the new oil” of the modern world. Exploring this treasure to extract the maximum information is therefore a major requirement for technological advancement. As the amount and scope of materials datasets increase, the significance of statistical learning and data mining approaches in analyzing the information and building predictive models becomes increasingly
significant. The most crucial and difficult stage in constructing a successful data-driven model is mining the important properties from the available datasets. A full database in materials science covers multiple processing factors, materials properties, and elemental contributions. These are part of materials informatics, which plays an important role in the identification of novel materials. Discovering the role of constituent elements in material characteristics is an important but difficult task. A simple example demonstrates the necessity of featurization for materials datasets. Consider a dataset that includes the chemical formulas and densities of various materials. A chemical formula is a combination of elements with their atomic ratios. The chemical formula AiBjCkDl has four elements A, B, C, and D, and the subscripts i, j, k, and l give the number of atoms of each element. For example, the formula for water is H2O, which indicates two hydrogen atoms (H) and one oxygen atom (O). To understand the relationships between properties and design a super-dense material from this kind of data, we must first determine each element’s contribution to density. These elemental contributions form the features used to predict the target variable, in this case density. As a result, the formula must be divided into its constituent elements and their associated amounts. There is no obvious way to do this with a traditional ML preprocessing step such as encoding. However, Python-based materials packages such as pymatgen and matminer make it possible [65]. Materials data include a broad range of descriptions:
• Mechanical properties: Hardness, yield strength, toughness, fracture toughness, ductility, elastic stiffness, etc.
• Processing environments: Relative humidity, temperature, heating and cooling rates, pressure, etc.
• Physical properties: Viscosity, refractive index, density, electrical and thermal conductivity, etc.
• Structural data: Crystal structure, atomic arrangement, microstructural and textural patterns at scales ranging from angstroms to microns, etc.
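The formula-splitting step described above (dividing a formula into its elements and their amounts) can be sketched in plain Python. This is a deliberately simplified stand-in and not the pymatgen or matminer API: the function names are hypothetical, and the parser handles only flat formulas such as H2O or Fe2O3, with no parentheses, hydrates, or charges.

```python
import re
from collections import Counter

# An element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer or decimal amount.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Split a flat formula such as 'Fe2O3' into {element: amount}."""
    amounts = Counter()
    for symbol, count in _TOKEN.findall(formula):
        amounts[symbol] += float(count) if count else 1.0
    return dict(amounts)


def atomic_fractions(formula):
    """Convert element amounts into fractions that sum to one."""
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    return {element: n / total for element, n in amounts.items()}


print(parse_formula("H2O"))
print(atomic_fractions("Fe2O3"))
```

The resulting fractions are numerical features: they can be combined with tabulated elemental properties (for example, a fraction-weighted mean of atomic mass) to build descriptors for a density model.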
2.2 Matminer
2.2.1 Overview
Matminer is an open-source, Python-based software platform for predicting and analyzing materials properties from data. It includes tools for retrieving large data sets from external sources such as Citrination, the Materials Project, and the Materials Data Facility. It also includes 44 featurization classes that can produce hundreds of distinct descriptors and combine them through arithmetic operations, as well as implementations of a wide library of feature extraction methods developed by the materials community. Matminer has a visualization tool that allows you to create
dynamic, shareable graphs. These tools are designed to work well with existing data analysis and ML tools. Recently, the materials field has placed new emphasis on gathering and organizing large data sets for materials design and research, and on the use of ML algorithms and statistical approaches. ML models have been found to predict the properties of crystalline materials significantly faster than other computational tools such as DFT [66–69], to estimate characteristics that are hard to obtain with conventional computational techniques [70, 71], and to guide the search for new materials [72–76]. With the continuing development of data mining tools for various types of material [77–79], this discipline of “materials informatics” has contributed significantly to materials design. Matminer helps the user to extract large data sets from databases, to turn raw data into a machine-learning-friendly form, and to generate interactive content for exploratory research. We should highlight that matminer does not itself implement popular machine learning techniques; the data science community already develops and maintains industry-standard tools such as Keras and scikit-learn for this purpose. Matminer’s purpose is to connect these powerful ML tools to the materials domain. Matminer overcomes several issues that arise in data-driven work. First, each materials database exposes its own API; matminer offers a simple interface that hides the intricacies of these API interactions, allowing users to query and manage large data sets in the standard pandas [80] data format. Second, it offers a variety of feature extraction routines that can be used to determine input–output correlations more effectively. Although many such feature extraction methods have been documented in the literature, many do not have a free software implementation.
Matminer not only implements such features but also provides a common interface for using them, making it simple to replicate, evaluate, and eventually improve these techniques. Finally, matminer has a large set of predefined visualization routines for exploring and identifying data relationships. Overall, these qualities enable cutting-edge research in materials informatics on a high-level, user-friendly platform. Matminer is designed to integrate with mainstream Python data mining packages such as scikit-learn and pandas [81], contains methods for data retrieval and interpretation, and incorporates a set of feature generation algorithms (“featurizers”) for a broad range of material properties (e.g., arrangement of atoms, composition, and electronic structure).
1. Software design
The core principle of matminer is to integrate domain knowledge and materials-related information into the wider ecosystem of Python software for data analysis. The large set of interoperable data processing tools created by the Python community is widely used and is sometimes referred to as the “PyData” or “SciPy” stack [82]. Matminer also provides a separate collection of resources, such as examples of how to use its data retrieval tools. Matminer presently includes 109 tests that are run automatically by a script. The NumPy and SciPy [83]
libraries are included to provide a range of numerical methods, and Jupyter [84] supports interactive data analysis. Updated versions are published regularly on the Python Package Index (https://pypi.python.org/pypi/matminer). Matminer also has various tutorials and examples [85] that help in understanding features such as its data featurization, retrieval, and visualization tools.
2.2.2 Components of Matminer
a. Data Retrieval
The first step is to obtain a large and diverse data set. The availability of databases is a great benefit for materials informatics, but every database has its own schema and authentication process, which makes access difficult. One of the major tasks of matminer is to provide an API that is consistent across the various databases and to return the data in a form that data mining tools can use directly. Matminer supports data retrieval from four databases: Materials Project (MP) [86], Citrination [87], Materials Platform for Data Science (MPDS) [88], and Materials Data Facility (MDF) [89]. It also ships with several data sets that can be loaded directly in Python, with no external configuration needed. Matminer accepts data in various forms such as Excel, CSV, or other formats.
b. Data Featurization
Machine learning involves an intermediate step between processing the raw data and applying the ML techniques. Here, the raw data are converted into a numerical representation that is useful for ML. This process is known as “featurization” [90]. It converts the raw data into specific quantities that expose the connection between input and output. This is one of the main ways in which domain knowledge can be used to greatly improve an ML algorithm. All matminer featurizer classes follow the same code design pattern, defined by a base class known as BaseFeaturizer.
c. Data Visualization
Visualizing data is an important part of the workflow in materials informatics, since it aids in analyzing data, selecting properties, and directing the ML algorithm. Matminer provides a common set of comparable graphs, such as heatmaps or 2D plots, that reduce complex correlations to simple, instructive representations. Visualizing data distributions at different stages of the workflow is a valuable way to filter information and find outliers.
Matminer greatly facilitates the creation of many common visualizations. Although Python has numerous excellent plotting libraries (for example, seaborn [91] and matplotlib [92]), these packages are not intended for making interactive charts that are simple to share and that organize the input information. The Plotly library [93] offers the necessary capability; unfortunately, its interface with typical Python packages such as pandas is limited. Therefore,
matminer contains its own package, FigRecipes, which offers a variety of ways of constructing well-organized, standard figures.
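Returning to the featurization component (b) above, the idea of a shared design pattern for all featurizers can be illustrated with a small stand-alone sketch. The classes below are hypothetical and written only for this illustration; they imitate the general idea of a common base class with a uniform interface and are not matminer’s own code, and the three features are arbitrary choices.

```python
from abc import ABC, abstractmethod


class SketchFeaturizer(ABC):
    """Hypothetical base class: every featurizer exposes the same methods."""

    @abstractmethod
    def featurize(self, entry):
        """Return a list of numbers describing one entry."""

    @abstractmethod
    def feature_labels(self):
        """Return one label per number returned by featurize."""

    def featurize_many(self, entries):
        """Apply featurize to every entry; shared by all subclasses."""
        return [self.featurize(entry) for entry in entries]


class StoichiometryFeaturizer(SketchFeaturizer):
    """Illustrative featurizer for compositions given as {element: amount}."""

    def featurize(self, entry):
        total = sum(entry.values())
        fractions = [amount / total for amount in entry.values()]
        return [len(entry), total, max(fractions)]

    def feature_labels(self):
        return ["n_elements", "n_atoms", "max_fraction"]


compositions = [{"Ti": 1, "O": 2}, {"Bi": 2, "W": 1, "O": 6}]
featurizer = StoichiometryFeaturizer()
table = [dict(zip(featurizer.feature_labels(), row))
         for row in featurizer.featurize_many(compositions)]
print(table)
```

Because every featurizer shares the same interface, downstream code that builds the feature table does not need to change when a different featurizer is swapped in.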
2.3 Python Materials Genomics (Pymatgen)
The Python Materials Genomics (pymatgen) package is a powerful Python toolkit for materials analysis. It is a comprehensive set of software tools that carry out the initial setup for calculations and the subsequent analysis needed to derive important material properties from raw input data. It does so by offering Python objects and methods for representing and studying materials data, providing a collection of structural and thermodynamic analyses suited to many applications, and creating a platform on which scientists and researchers can build detailed analyses of data collected from theoretical and experimental studies. It also contains tools for obtaining information through the Materials Project’s REST (REpresentational State Transfer) API [94]. The Materials Project uses pymatgen for structure generation, modification, and thermodynamic analysis. However, pymatgen is an independent package, and the majority of its tools can be used by any scientist or researcher.
2.3.1 Overview
Pymatgen makes use of a wide variety of packages, such as the commonly used numpy and scipy packages [95]. It is built on an object-oriented programming model to simplify programming and ensure flexibility. It allows data (structures, formulas, calculations, etc.) from different sources (first-principles calculations, the Materials Project, etc.) to be converted into Python objects. These objects are then used for further tasks such as structure modification or analysis.
Structure modifications and compound generation
Pymatgen’s transformations package offers a complete platform for compound creation and structure modification. Transformations help in generating new compounds and structures from existing ones. By substituting other species for those currently in the structure, one can generate new compounds from known materials. Users can, for example, apply the data-mined substitution rules established by Hautier et al. [96]. The transformations package also covers the complete or partial removal of a species from a structure, the generation of supercells and primitive cells, and the ordering of disordered structures [97]. The io (input/output) package supports reading and writing of the major structure and molecule formats, as well as input and output for a wide range of electronic structure codes. The PyCifRW library [98] supports the widely used CIF format, and an adapter to the OpenBabel library [99] supports a
Discovery of Novel Photocatalysts Using Machine Learning Approach
243
vast number of other molecular formats. The vaspio package supports the majority of VASP [100]. Furthermore, pymatgen includes an adapter for converting between both the Structure object in pymatgen as well as the Atoms object ASE [101]. The io package is useful for converting between multiple formats.
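The transformations described above (species substitution, species removal, supercell generation) can be illustrated with a minimal, library-independent sketch. It deliberately does not use pymatgen's own classes; the list-of-sites representation and the function names `substitute`, `remove_species`, and `make_supercell` are hypothetical and serve only to show the idea.

```python
from itertools import product

# Hypothetical minimal structure: (species, fractional coordinates) in a cubic cell.
srtio3 = [
    ("Sr", (0.0, 0.0, 0.0)),
    ("Ti", (0.5, 0.5, 0.5)),
    ("O", (0.5, 0.5, 0.0)),
    ("O", (0.5, 0.0, 0.5)),
    ("O", (0.0, 0.5, 0.5)),
]

def substitute(sites, mapping):
    """Return a copy of `sites` with species swapped according to `mapping`."""
    return [(mapping.get(species, species), xyz) for species, xyz in sites]

def remove_species(sites, species_to_remove):
    """Return a copy of `sites` without any site of the given species."""
    return [(s, xyz) for s, xyz in sites if s != species_to_remove]

def make_supercell(sites, na, nb, nc):
    """Repeat the cell na x nb x nc times.

    Coordinates stay fractional, now relative to the enlarged cell.
    """
    repeated = []
    for i, j, k in product(range(na), range(nb), range(nc)):
        for species, (x, y, z) in sites:
            repeated.append((species, ((x + i) / na, (y + j) / nb, (z + k) / nc)))
    return repeated

# Generate a new compound from an existing one, then enlarge the cell.
batio3 = substitute(srtio3, {"Sr": "Ba"})
supercell = make_supercell(batio3, 2, 2, 2)
print(len(supercell), "sites in the 2x2x2 supercell")
```

In practice the same operations would be performed on pymatgen's Structure object, which also tracks the lattice and handles periodic boundary conditions; the sketch omits both.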
2.3.2 Analysis Tools
Pymatgen includes several packages for assimilating data from electronic structure calculations and for analyzing it.

1. Data processing
The borg package uses various processors to explore a directory tree automatically and assimilate the calculated data. The entries compatibility tool in pymatgen handles the mixing of energies computed with different functionals, such as the generalized gradient approximation (GGA) and GGA + U [102–105]. The standard GGA often breaks down when the degree of electron localization changes substantially between reactants and products, as in a redox reaction [106]. In such cases, the Hubbard U correction significantly improves the accuracy of the computed energies. Using known experimental enthalpies, the "mixing" approach adjusts the GGA + U energies to make them compatible with GGA energies. It also corrects the energies of gaseous species such as N2 and O2 to account for GGA's tendency to overbind such molecules [107]. Jain et al. showed that this "mixing" technique yields fairly accurate phase diagrams and formation enthalpies [105]. With minor adjustments, the package can be used to combine energies obtained from any set of functionals.

2. Reaction calculation
The reaction calculator tool in the analysis package contains classes for analyzing reactions, such as balancing reactions and calculating reaction energies. A user can compute a reaction energy from calculated data (using ComputedEntry objects) or from experimental data (using ExpEntry objects). These features are used in the Materials Project's Reaction Calculator to generate calculated reaction energies and to compare them with experimental reaction energies wherever possible.

3. Phase diagram
The phase diagram package makes it possible to construct and plot phase diagrams. The approach and algorithms follow the work of Ong et al. [108, 109]. Both standard compositional and grand canonical phase diagrams are supported. By comparing the energy of a new compound or material with the energies of the phases on the phase diagram, one can evaluate its phase stability and predict the phase equilibria at a given composition.
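The stability test in the phase diagram paragraph can be sketched for the simplest case, a binary A–B system. The sketch below is an illustration under stated assumptions, not pymatgen's implementation: the formation energies are made-up numbers in eV per atom, and the helper names `lower_hull`, `hull_energy`, and `energy_above_hull` are hypothetical. A candidate with zero (or negative) energy above the hull is predicted stable; a positive value measures its driving force to decompose.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) points (Andrew's monotone chain)."""
    hull = []
    for p in sorted(set(points)):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def hull_energy(hull, x):
    """Energy of the hull at composition x, by linear interpolation."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x2 > x1 and x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the range covered by the hull")

def energy_above_hull(known_phases, x, energy):
    """Distance of a candidate (x, energy) from the hull of the known phases."""
    return energy - hull_energy(lower_hull(known_phases), x)

# Made-up formation energies (eV/atom): pure A, compound AB, pure B.
known = [(0.0, 0.0), (0.5, -1.0), (1.0, 0.0)]

# Candidate A3B at x = 0.25 with a formation energy of -0.3 eV/atom.
e_hull = energy_above_hull(known, 0.25, -0.3)
print(f"energy above hull: {e_hull:.3f} eV/atom")
```

Here the hull at x = 0.25 lies at -0.5 eV/atom (halfway along the tie line from A to AB), so the candidate sits about 0.2 eV/atom above it and would be predicted to decompose into A and AB.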
3 Tools and Approaches Used to Predict Novel Photocatalysis Using Machine Learning
Compound semiconductor photocatalysts can break down organic chemicals in contaminated water. Because photocatalysts have complicated, low-symmetry structures and the pollutants are varied, evaluating their activity and properties through standard experimental methodology entails enormous cost and effort. Recent developments in machine learning (ML) provide a data-driven strategy for analyzing and characterizing photocatalysts efficiently. An ML model makes full use of the empirical data in previously published articles and can produce findings that direct future experimental investigations. Compared with the traditional experimental approach, ML is quicker, less expensive, and more adaptable, and it dramatically reduces time and labor costs. An ML model known as an artificial neural network (ANN) has been used extensively to predict the properties of a variety of materials, including polymers, metals, ceramics, and composites [110–115]. The approaches involved in predicting novel photocatalysts are the following: (a) integrating theoretical and previous experimental information when training on the dataset; (b) embedding knowledge in the feature engineering space; (c) using current materials databases to constrain machine learning predictions. The ML approach (see Fig. 1), which makes use of both human knowledge and machine intelligence, may be able to address the interpretability and reliability issues of data-driven machine learning and, in spite of the lack of evidence, strengthen complex model architectures.
By encouraging this paradigm change away from the Moravec's paradox approach [116], the idea may also significantly advance photocatalysis informatics. The ML approach proposed here provides a pathway toward materials co-discovery through collaboration between human knowledge and machines. This adds further value and momentum to ongoing work in materials discovery and property prediction [117–120]. ML has also been explored as a way to speed up the development of new photocatalysis systems [121, 122] and to predict a photocatalyst's activity [123–125]. ML is an efficient tool for identifying a promising material among thousands of candidates. To screen photocatalytic efficiency, the ML technique requires accurate and adequate descriptors. Formation energy, cohesive energy, binding energy, energy band gap, conduction band minimum (CBM), valence band maximum (VBM), effective masses, dielectric constants, refractive index, absorption coefficient, electron energy loss function, work function, and defect formation energy are considered credible descriptors for evaluating photocatalytic hydrogen evolution. Compared with conventional methods that are purely experimental and/or computational, predictive ML models may offer a cheaper and faster way of determining photocatalytic activity from the catalyst's parameters.

Fig. 1 Schematic representation of proposed road map on Materials Informatics to 'material discovery of novel photocatalysis'. Elements of the diagram: human user (intuitive/creative thinking, new thought/idea on material informatics); input information from experiments, theoretical models, and computationally predicted results (using density functional theory); machine learning technique (generation/formation of data, algorithm selection, data set tuning/optimizing); discovery of novel photocatalysis

The machine learning approach involves the following steps in the computational screening for novel photocatalysts (see Fig. 2):

• Sample Construction:
– Property-driven data collection and curation: generation and selection of available data for problem-solving (using DFT and MD on a need basis)
– Data processing: formatting the collected data (i.e., writing it in a proper format), cleaning corrupted or missing data, and fine graining of data
– Data transformation and representation: transform, scale, and combine data
• Model Building and Evaluation:
– Machine learning algorithm training: split the dataset into three subsets (training, validation, and testing)
– Optimization and testing: evaluate the accuracy and efficiency of the model by means of validation and iterative optimization

Fig. 2 Flow chart of ML approach. It shows the step-by-step progress

As indicated in the introduction, different ML algorithms are tuned and trained on structured datasets to improve the performance of each model. The quantity and quality of the datasets are crucial for producing accurate models, and the importance of model training cannot be overstated. The procedure includes identifying, acquiring, or collecting the data, and in some cases constructing the training dataset is its first stage. The choice of model to be trained also has a significant impact on the machine learning process. The simplest way to determine accuracy is to evaluate the trained ML model on data points that were not included in its training dataset.
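The data processing, transformation, and splitting steps listed above can be sketched in a few lines. This is a minimal illustration with made-up rows of the form [band gap, formation energy, target]; the helper names, the 60/20/20 split, and the use of min-max scaling are assumptions chosen for the example, not the procedure of any particular study. The scaler is fitted on the training subset only, so that no information from the validation or test data leaks into training.

```python
import math
import random

def drop_incomplete(rows):
    """Cleaning step: remove rows containing None or NaN."""
    def present(v):
        return v is not None and not (isinstance(v, float) and math.isnan(v))
    return [row for row in rows if all(present(v) for v in row)]

def split_three(rows, fractions=(0.6, 0.2), seed=0):
    """Shuffle and split into training, validation, and test subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def fit_min_max(rows):
    """Column-wise minima and maxima, computed on the training subset."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lo, hi):
    """Scale each column with the fitted bounds; constant columns map to 0."""
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
            for row in rows]

# Made-up raw data: ten complete rows plus two corrupted ones.
raw = [[1.0 + 0.2 * i, -0.1 * i, 10.0 * i] for i in range(10)]
raw += [[None, -0.5, 3.0], [2.0, float("nan"), 4.0]]

clean = drop_incomplete(raw)
train, val, test = split_three(clean)
lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
val_scaled = apply_min_max(val, lo, hi)  # may fall slightly outside [0, 1]
print(len(clean), len(train), len(val), len(test))
```

Validation or test values outside the training range are scaled to values below 0 or above 1, which is expected and is a useful signal that the model is being asked to extrapolate.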
Useful tools and databases for the discovery of novel photocatalysts
The domain knowledge workflow can be executed using materials genome and crystallographic databases, which contain materials properties that were previously observed or were predicted theoretically by DFT or high-throughput DFT calculations. Large databases are valuable because they can be used to screen catalysts. Examples in which new and effective catalysts were successfully identified from existing materials databases include the screening of candidates for electrocatalytic reactions such as the hydrogen evolution reaction (HER) and the oxygen reduction reaction (ORR), for the photocatalytic water-splitting reaction, and for thermal catalytic methanation. Such databases include AFLOW [126], the Inorganic Crystal Structure Database (ICSD) [127], the NREL Materials Database, the Materials Project (MP) [128], the Computational Materials Repository (CMR) [129], the Open Quantum Materials Database (OQMD) [130], the Crystallography Open Database (COD) [131], the Stanford Catalysis-Hub (SCH) [132], and the National Institute of Standards and Technology (NIST) Materials Data Curation System [133]. Band structures, wave functions, adsorption energies, effective electron/hole masses, and crystal structures are some of the features that can be extracted from them. Additionally, the crystallographic databases can guide the photocatalyst screening stage toward synthesizable materials (and possibly synthesis conditions) [134, 135]. Even though only a small percentage of synthesizable substances may work as active photocatalysts, knowing which substances can be synthesized significantly reduces the number of candidates.
4 ML Algorithms for Predicting Novel Photocatalysts
Machine learning approaches such as random forests and neural networks are widely used to predict novel photocatalysts. In this section, we first discuss the principles underlying neural networks and random forests before turning to the ML algorithms and their application to novel photocatalysts.
4.1 Random Forest
Random forest is an ensemble learning technique [136] in which the final prediction is based on the values predicted by its subtrees. The predictions of the subtrees are aggregated (usually averaged), and the aggregated value is used as the final prediction. Figure 3 shows a simplified random forest for a classification task in which data must be assigned to "class A" or "class B". Random forest can be used for classification as well as regression. The problem addressed here, predicting novel photocatalysts, is a regression problem, so a random forest regression model is appropriate. The model is constructed as follows:
Fig. 3 Classification using Random Forest [136]
1. N training sets T_i (i = 1, 2, …, N) are formed by bootstrap sampling of the training data.
2. A classification and regression tree (CART), which becomes a subtree of the random forest, is built for each training set, so that CART_i is produced from T_i. At each node, a randomly chosen subset of features is used to determine the best split. The tree is not pruned; it is allowed to grow to its full size.
3. The trained trees are then evaluated on the test data, and the subtrees yield the predictions CART_1(Test), CART_2(Test), …, CART_N(Test).
4. The prediction outputs of the N decision trees are collected. The model provides the final prediction after averaging the predictions of all subtrees and applying the inverse normalization.

The advantages of this technique are that it improves prediction accuracy, reduces overfitting, and makes the model less sensitive to missing data and multicollinearity.
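The construction steps above can be sketched in plain Python. This is a simplified illustration, not a production implementation: the data are synthetic (the target depends only on the first feature), normalization is omitted, a depth limit of 10 stands in for fully grown trees, and the names `build_tree`, `fit_forest`, and `predict` are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def build_tree(X, y, n_features, rng, depth=0, max_depth=10):
    """Grow an unpruned regression tree; returns a leaf value or a split tuple."""
    varying = [f for f in range(len(X[0])) if len({row[f] for row in X}) > 1]
    if depth >= max_depth or len(set(y)) == 1 or not varying:
        return mean(y)
    best = None
    # Step 2: only a random subset of the features is tried at each node.
    for f in rng.sample(varying, min(n_features, len(varying))):
        for t in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            ml, mr = mean([y[i] for i in left]), mean([y[i] for i in right])
            sse = (sum((y[i] - ml) ** 2 for i in left)
                   + sum((y[i] - mr) ** 2 for i in right))
            if best is None or sse < best[0]:
                best = (sse, f, t, left, right)
    _, f, t, left, right = best
    children = [build_tree([X[i] for i in idx], [y[i] for i in idx],
                           n_features, rng, depth + 1, max_depth)
                for idx in (left, right)]
    return (f, t, children[0], children[1])

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step 1: bootstrap sample T_i, drawn with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_features, rng))
    return forest

def predict(forest, x):
    """Steps 3 and 4: query every tree and average the outputs."""
    outputs = []
    for node in forest:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if x[f] <= t else right
        outputs.append(node)
    return mean(outputs)

# Synthetic data: two features, target y = 3 * x0 (x1 is irrelevant).
X = [[i / 10, j / 4] for i in range(10) for j in range(4)]
y = [3.0 * x0 for x0, _ in X]
forest = fit_forest(X, y)
low, high = predict(forest, [0.1, 0.5]), predict(forest, [0.9, 0.5])
print(f"prediction at x0=0.1: {low:.2f}, at x0=0.9: {high:.2f}")
```

Because every tree sees a different bootstrap sample and a different sequence of feature subsets, the individual trees disagree slightly; averaging them is what reduces the variance of the final prediction.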
4.2 Gradient Boosting Regression
Gradient boosting regression (GBR) is an ML algorithm [137] that uses the negative gradient of the loss function as an estimate of the residual of the model at each training iteration; this estimate is then used as the fitting target for the following iteration. In this way the model's output is adjusted in the direction that reduces the loss. Gradient boosting with a regression tree as the weak learner is formulated as follows:
Let S = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} be the sample space with N samples. We wish to find a function F(x) mapping x to y that minimizes the loss function L(y, F(x)). Let the output function be

$$F(x) = \sum_{k=1}^{K} \beta_k h(x; a_k) \tag{1}$$

where h(x; a_k) is the kth subtree of the weak learner, a_k is the parameter of this kth subtree, β_k is the weight of the kth subtree, and k = 1, 2, …, K. Let F_k(x) be the output of the model after training the first k weak learners; the problem can then be rephrased as finding the parameters (β_k, a_k) of the new subtree:

$$(\beta_k, a_k) = \arg\min \sum_{i=1}^{N} L\big(y_i, F_{k-1}(x_i) + \beta_k h(x_i; a_k)\big) \tag{2}$$

Updating the gradient boosting algorithm:

1. The first regression tree is initialized as follows:

$$F_0 = \arg\min \sum_{i=1}^{N} L\big(y_i, h_0(x_i; a_0)\big) \tag{3}$$

2. For all subsequent subtrees, that is, for k = 1 through k = K, the negative gradient of the loss function is calculated as

$$y_{ik} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x) = F_{k-1}(x)} \tag{4}$$

y_ik is used as the training target for fitting a new subtree, and a_k and β_k of the subtree are then computed as follows to obtain the leaf-node regions:

$$a_k = \arg\min \sum_{i=1}^{N} \big[y_{ik} - \beta_k h(x_i; a_k)\big]^2 \tag{5}$$

$$\beta_k = \arg\min \sum_{i=1}^{N} L\big(y_i, F_{k-1}(x_i) + \beta_k h(x_i; a_k)\big) \tag{6}$$

The prediction function is then updated as follows:

$$F_k(x) = F_{k-1}(x) + \nu \beta_k h(x; a_k) \tag{7}$$
where ν is the learning rate. With a small learning rate the model converges slowly and therefore needs more training time to reach the required prediction accuracy, whereas with a large learning rate the updates may in some cases overshoot the optimum, so that a very high prediction accuracy is not reached.
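The boosting loop can be sketched for the special case of squared loss, L = ½(y − F)², where the negative gradient of Eq. (4) is simply the residual y − F. This is an illustrative simplification under stated assumptions: the weak learner is a one-split regression stump on a single input, the weight β_k is absorbed into the stump's leaf values (for squared loss the leaf means already minimize the loss), and the data and the names `fit_stump` and `gradient_boost` are made up for the example.

```python
def fit_stump(x, r):
    """Least-squares regression stump (one split) fitted to targets r."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # all inputs identical: fall back to a constant
        m = sum(r) / len(r)
        return lambda v: m
    _, t, ml, mr = best
    return lambda v: ml if v <= t else mr

def gradient_boost(x, y, n_rounds=50, nu=0.1):
    """Return F_K(x) built from n_rounds stumps with learning rate nu."""
    f0 = sum(y) / len(y)  # constant that minimizes the squared loss
    stumps = []
    pred = [f0] * len(y)
    for _ in range(n_rounds):
        # Negative gradient of the squared loss = current residual.
        residual = [yi - pi for yi, pi in zip(y, pred)]
        h = fit_stump(x, residual)
        stumps.append(h)
        # Shrunken update: F_k = F_{k-1} + nu * h_k.
        pred = [pi + nu * h(xi) for xi, pi in zip(x, pred)]
    return lambda v: f0 + nu * sum(h(v) for h in stumps)

def mse(model, x, y):
    return sum((model(a) - b) ** 2 for a, b in zip(x, y)) / len(x)

# Made-up 1-D regression problem: y = x^2 on [0, 2).
x = [i / 10 for i in range(20)]
y = [xi ** 2 for xi in x]
for rounds in (0, 5, 50):
    print(rounds, "rounds, training MSE:", mse(gradient_boost(x, y, rounds), x, y))
```

Rerunning the loop with a smaller `nu` (for example 0.01) and the same number of rounds is a quick way to see the slower convergence discussed above.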
4.3 Artificial Neural Networks (ANN)
An artificial neural network is an algorithm developed in an attempt to simulate the way the human brain processes information [139]. These algorithms are extremely good at pattern recognition and have widespread applications in computer vision, natural language processing, etc. Their ability to model nonlinearity, their tolerance to outliers, and their self-learning through backpropagation are the factors that contribute to their success in pattern recognition, clustering, prediction, optimization, etc. At the heart of these algorithms are artificial neurons. These are essentially mathematical functions inspired by biological neurons, and they are activated by an activation function just as biological neurons fire on receiving specific signals. They take inputs, assign them weights, add them up, and pass the sum through a nonlinear function to produce an output. In 1957, Frank Rosenblatt proposed the perceptron, a learning rule based on the artificial neuron. The perceptron [138] is a single-layer neural network. It has four parameters:

1. Input values
2. Weights assigned to each of the neurons and the associated bias
3. Net sum
4. Activation function.
The perceptron learning rule states that the algorithm automatically learns the optimal weights. The perceptron first multiplies each input value by its weight, then adds the products together with the bias to form a weighted sum, and finally passes this sum through an activation function to obtain an output. This is how a single-layer neural network works; it can be extended to multiple layers, and such an architecture is called a multilayer perceptron (Figs. 4 and 5). Multilayer perceptrons (MLPs) [142] are a common structure used in ANNs. Figure 5 shows a typical ANN with a single hidden layer, in which all the layers are linked with one another. More complex ANNs have more than one hidden layer, where each hidden layer learns a different feature of the data. An activation function is applied to the input at each node, and the result is passed on as output. Each connection between two nodes carries a weight that scales the signal travelling along it. The output of the network depends on the activation function, the weights, and the way the network is connected. Backpropagation (BP) [140] is a supervised learning approach for multilayer feedforward networks. It is the most commonly used way of training an ANN: the error is propagated from the output layer back toward the input layer in order to minimize the loss.
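The perceptron described above (inputs, weights and bias, net sum, activation function) can be sketched as follows. The example trains a single perceptron on the logical AND function, which is linearly separable; the step activation, the integer learning rate of 1, and the name `train_perceptron` are choices made for this illustration.

```python
def step(z):
    """Step activation: the neuron fires (1) when the net sum is non-negative."""
    return 1 if z >= 0 else 0

def net_sum(weights, bias, x):
    """Weighted sum of the inputs plus the bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_perceptron(samples, lr=1, epochs=20):
    """Perceptron learning rule: nudge the weights by lr * error * input."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - step(net_sum(weights, bias, x))
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Truth table of the logical AND function.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_gate)
predictions = [step(net_sum(weights, bias, x)) for x, _ in and_gate]
print("weights:", weights, "bias:", bias, "predictions:", predictions)
```

A single perceptron can only separate classes with a straight line (or hyperplane), which is why functions such as XOR require the hidden layers of a multilayer perceptron.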
Fig. 4 Perceptron rule [141]
Fig. 5 Multilayer perceptron [142]
Backpropagation applies the chain rule repeatedly to determine how much every node contributes to the error. These methods are employed for discovering novel photocatalysts.
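The repeated use of the chain rule can be made concrete with a tiny network: two inputs, one hidden layer of two tanh units, and a linear output trained with the squared loss ½(y − t)². The sketch below computes the gradients by backpropagation and checks one of them against a finite-difference estimate. The architecture, the weight values, and the function names are assumptions made for this illustration.

```python
import math

def forward(x, W1, b1, W2, b2):
    """Hidden layer with tanh activation followed by a linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y

def loss(x, t, W1, b1, W2, b2):
    _, y = forward(x, W1, b1, W2, b2)
    return 0.5 * (y - t) ** 2

def backprop(x, t, W1, b1, W2, b2):
    """Gradients of the loss with respect to all weights via the chain rule."""
    h, y = forward(x, W1, b1, W2, b2)
    dy = y - t                                      # dL/dy
    gW2 = [dy * hi for hi in h]                     # dL/dW2_j = dL/dy * h_j
    gb2 = dy
    dh = [dy * w for w in W2]                       # error sent back to hidden units
    dz = [d * (1.0 - hi ** 2) for d, hi in zip(dh, h)]  # tanh'(z) = 1 - tanh(z)^2
    gW1 = [[d * xi for xi in x] for d in dz]        # dL/dW1_jk = dz_j * x_k
    gb1 = dz
    return gW1, gb1, gW2, gb2

# Arbitrary illustrative values.
x, t = [0.5, -1.0], 0.3
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1]
W2, b2 = [0.7, -0.5], 0.05

gW1, gb1, gW2, gb2 = backprop(x, t, W1, b1, W2, b2)

# Finite-difference check of dL/dW1[0][1].
eps = 1e-6
W1_plus = [row[:] for row in W1]
W1_minus = [row[:] for row in W1]
W1_plus[0][1] += eps
W1_minus[0][1] -= eps
numeric = (loss(x, t, W1_plus, b1, W2, b2) - loss(x, t, W1_minus, b1, W2, b2)) / (2 * eps)
print("backprop:", gW1[0][1], "finite difference:", numeric)
```

A training step would subtract a small multiple of each gradient from the corresponding weight; the gradient check is a common way to confirm that a hand-written backward pass is correct before training.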
4.4 Case Study
In the work "Machine learning aided design of perovskite oxide materials for photocatalytic water splitting" [143], a large set of data on ABO3-type perovskite photocatalysts was gathered from various literature sources: 124 samples in the band gap (E_g) dataset and 77 samples in the hydrogen production rate (R_H2) dataset. E_g is a key parameter in photocatalytic water splitting (PWS) because it determines a photocatalyst's ability to absorb light. The first ML model was therefore created to predict the E_g of perovskite materials, and the second ML model was used to predict R_H2 for PWS. A main objective in photocatalytic water splitting is to exploit the visible spectrum effectively, because visible light makes up the major part of sunlight. In the two models, the targets were the E_g and R_H2 of the perovskite materials, whereas the input variables represented various features and experimental conditions. The datasets were split at random into two subsets, one for training the algorithms and one for assessing their performance. The target property may be influenced by a variety of factors; the ideal approach is to select parameters that correctly reflect the material's properties. Data comparability and reliability are also essential for a suitable machine learning model. The corresponding experimental data of the collected samples were used to lessen the influence of data differences on the machine learning models. Further feature variables were generated from atomic parameters and from experimental conditions such as light intensity and photocatalyst dose. Twenty initial features were gathered for the E_g model, comprising 3 experimental conditions taken from the references and 17 atomic variables produced by an online computation platform for materials data mining (OCPMDM) [144]. The R_H2 model had 24 initial features, comprising 18 atomic features and 6 experimental conditions.

Before the models are constructed, unnecessary or irrelevant features are removed, since such features have a substantial impact on model performance, for example by increasing the risk of overfitting. Deleting them reduces the size of the feature space and improves the models' predictive ability and accuracy [145–147]. In this investigation, the optimal feature subsets for the support vector regression (SVR) [147] and BPANN models were chosen using the max-relevance and min-redundancy (mRMR) method [147]. The feature selection for the GBR and RF models was conducted using the embedded method [146, 147]. The embedded approach integrates feature selection with learner training by selecting features automatically as the model is trained. With the help of the feature score ranking provided by mRMR, the feature variables were eliminated one at a time from back to front. A model was built from the feature variables that remained after each deletion, producing a set of models. The best algorithm was then chosen from among BPANN, SVR, GBR, and RF. Leave-one-out cross-validation (LOOCV) was used to assess each ML model's performance and to choose the best regression models. The root mean square error (RMSE) was used to assess the errors, and the Pearson correlation coefficient (R) was used to assess the correlation between predicted and experimental values. In general, good predictions are indicated by a low RMSE and a high R-value. The BPANN and GBR models had the best predictive power for R_H2 and E_g, respectively. For E_g, GBR had an R-value of 0.92 and an RMSE of 0.29, and the corresponding values of R and RMSE for BPANN were 0.86 and 0.42, respectively. For R_H2, GBR had an R-value of 0.92 and an RMSE of 770.49, and the corresponding values of R and RMSE for BPANN were 0.986 and 290.568, respectively.
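The evaluation protocol described above, leave-one-out cross-validation scored by RMSE and the Pearson correlation coefficient, can be sketched with a deliberately simple model. The least-squares line and the six data points below are stand-ins chosen for illustration; they are not the models or data of the study, and the helper names are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def pearson_r(a, b):
    """Pearson correlation coefficient between two sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((u - mx) * (v - my) for u, v in zip(x, y))
             / sum((u - mx) ** 2 for u in x))
    return slope, my - slope * mx

def loocv_predictions(x, y):
    """Predict each sample with a model trained on all the other samples."""
    predictions = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(slope * x[i] + intercept)
    return predictions

# Made-up data: a noisy linear relationship between a descriptor and a target.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]

y_cv = loocv_predictions(x, y)
print(f"LOOCV RMSE = {rmse(y, y_cv):.3f}, R = {pearson_r(y, y_cv):.4f}")
```

LOOCV is attractive for small datasets such as the ones in this case study because every sample serves as a test point exactly once, at the cost of training as many models as there are samples.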
R_H2 had very high RMSE values, which can be attributed to several causes, such as differing experimental settings. For example, LaFeO3 samples reported in the same literature source have different R_H2 values, 5466.7 and 8600 μmol g⁻¹ h⁻¹ [147], because different calcination temperatures were used during preparation. The difference of 3133.3 μmol g⁻¹ h⁻¹ between these two measurements corresponds to a mean relative error (MRE) of 36.43% relative to the larger value and 57.32% relative to the smaller one, and to an RMSE of 3133. In addition, the dataset spans a wide range of values (from 1.25 to 8600 μmol g⁻¹ h⁻¹), which leads to a higher RMSE. To assess the consistency and generalization of the best algorithms, the data were split at random 100 times into training and test data at a ratio of 4:1, and the models were rebuilt with the chosen parameters. Over the 100 training models, the average values of R were 0.9010 (E_g) and 0.9805 (R_H2) for LOOCV, whereas on the testing sets they were 0.9125 (E_g) and 0.9543 (R_H2). These outcomes indicate that the models are robust and generalizable.

The majority of ML algorithms need suitable hyperparameters to improve prediction performance and generalization capability. After the ML algorithms had been chosen, the best models for E_g and R_H2 were the GBR and BPANN models, respectively. The hyperparameters of both models were therefore optimized using the grid search method. The hyperparameters of the GBR model and, for the BPANN algorithm, the hidden layer, momentum, and learning rate (hidden layer to output layer) were tuned to obtain an optimal model. The models performed better after the hyperparameters were tuned: the GBR model for E_g reached R and RMSE values of 0.92 and 0.27, respectively, and the BPANN model for R_H2 reached R and RMSE values of 0.089 and 257.345, respectively. To validate the models' performance, the 20% of the data that had been set aside was used to test them. Both the R and RMSE values on the testing data were in line with the corresponding values on the training data, confirming the models' predictive ability.

In summary, the GBR and BPANN models performed exceptionally well at predicting E_g and R_H2, respectively. As a result, these models can be used to screen potential perovskite materials virtually. Millions of potential combinations could be created with these models (but only a few were documented because of the enormous search space). Based on the intended formula, 30,000 potential perovskite oxide candidates were produced. The tolerance factor was used to determine whether or not candidate materials are capable of forming perovskites.

In essence, this case study shows how machine learning techniques and models can be used to predict a novel photocatalyst. First, data are collected and cleaned. Feature extraction follows, to obtain the best possible representation of the data (the material, in our case). The model is then trained on the training data, which enables it to learn how to identify novel photocatalysts. Optimizations such as hyperparameter tuning and regularization are performed to improve model accuracy. Lastly, the model is tested on the test data to check that its prediction performance generalizes. These are the procedures one follows when using machine learning for any task; in this case the task is predicting a novel photocatalyst.
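The tolerance-factor screen mentioned above can be sketched as follows. The sketch assumes Goldschmidt's form, t = (r_A + r_O) / (√2 (r_B + r_O)); the text does not say which definition or acceptance window the study used, so both the formula choice and the window of 0.8 to 1.1 are assumptions. The ionic radii are approximate Shannon values quoted as illustrative inputs and should be checked against a tabulated source before any real use.

```python
import math

# Approximate Shannon ionic radii in angstrom (illustrative inputs):
# A-site cations in 12-fold coordination, B-site cations in 6-fold coordination.
A_RADII = {"Ca": 1.34, "Sr": 1.44, "Ba": 1.61}
B_RADII = {"Ti": 0.605}
R_OXYGEN = 1.40

def tolerance_factor(r_a, r_b, r_o=R_OXYGEN):
    """Goldschmidt tolerance factor for an ABO3 perovskite."""
    return (r_a + r_o) / (math.sqrt(2) * (r_b + r_o))

def screen(a_radii, b_radii, window=(0.8, 1.1)):
    """Return (formula, t) for A/B combinations whose t falls inside the window.

    The window is an assumed rule of thumb, not a value taken from the study.
    """
    accepted = []
    for a, r_a in a_radii.items():
        for b, r_b in b_radii.items():
            t = tolerance_factor(r_a, r_b)
            if window[0] <= t <= window[1]:
                accepted.append((f"{a}{b}O3", round(t, 3)))
    return accepted

t_srtio3 = tolerance_factor(A_RADII["Sr"], B_RADII["Ti"])
print(f"t(SrTiO3) = {t_srtio3:.3f}")
print(screen(A_RADII, B_RADII))
```

A value of t close to 1 corresponds to an ideal cubic perovskite, and increasing deviation from 1 signals growing lattice distortion; in a screening workflow this cheap geometric filter is applied before the ML models so that only plausible perovskites are passed on.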
5 Conclusion and Future Perspective
In conclusion, photocatalysts are essential for producing renewable energy and cleaning up the environment. We have discussed how machine learning approaches are changing the way computation and experiment are carried out in photocatalyst research. The examination of catalytic mechanisms, the understanding of correlations between material characteristics and catalytic reactions, and the design and discovery of next-generation photocatalysts all benefit from ML techniques. Traditional methods for finding new photocatalysts rely on trial and error. Machine learning has shown considerable promise for making the search for photocatalysts in a huge candidate pool more effective. In this chapter, we summarized a step-by-step, target-driven consensus strategy that accurately predicts photocatalyst band gaps and H2 evolution activities by stacking meta-learning. After being trained on modest datasets, ML models can quickly screen a huge space (more than a million materials) for non-toxic candidates that could serve as water-splitting photocatalysts.

1. The amount, variety, and quality of the training data have a significant impact on how well ML models function as a data-driven strategy. Most of the studies whose findings are described in Sect. 3 use literature, high-throughput experiments, or calculations to gather training data from materials databases. Prior to model training, data comparability must be examined thoroughly.
2. The use of descriptors is crucial in ML modeling, and it is explained in detail in Sect. 2. Descriptors must have low correlations with one another and must carry sufficient information about the target property.
3. The structured dataset with the desired property is used to choose the ML algorithm and the models to be trained. We outlined the most popular ML algorithms for photocatalysis in Sect. 4, along with their advantages and disadvantages, to give materials scientists alternatives to choose from according to which algorithms best suit their particular projects.
6 Future Perspective
First, large, high-quality datasets are necessary for high accuracy in ML techniques, but obtaining such structured datasets is challenging, time consuming, and expensive. Additionally, because only successful outcomes are usually reported, ML models may be biased, even though low-efficiency materials can still be useful sources of data. These problems should be resolved in the future by the widespread use of high-throughput synthesis and characterization, covering both experiments and computational simulations, rather than by reliance on published data. Secondly, when ML is integrated with data from in situ investigations, molecular dynamics simulations, and density functional theory (DFT), it is expected to contribute more to the understanding of photocatalytic pathways and applications. Thirdly, it is difficult to describe intricate catalytic systems with straightforward and understandable descriptors. Fourth, some ML models can generate synthetic data when large volumes of data are not available, so devising ML models that are able to generate data will in turn increase the accuracy of prediction models.
References
1. Medford, A. J., & Hatzell, M. C. (2017). ACS Catalysis, 7(4), 2624–2643. https://doi.org/10.1021/acscatal.7b00439
2. Tu, W., Zhou, Y., & Zou, Z. (2014). Photocatalytic conversion of CO2 into renewable hydrocarbon fuels: State-of-the-art accomplishment, challenges, and prospects. Advanced Materials, 26, 4607–4626. https://doi.org/10.1002/adma.201400087
3. Schultz, D. M., & Yoon, T. P. (2014). Science, 343(6174), 1239176. https://doi.org/10.1126/science.1239176
4. Kudo, A., & Miseki, Y. (2009). Chemical Society Reviews, 38, 253–278.
5. Chatterjee, D., & Dasgupta, S. (2005). Visible light induced photocatalytic degradation of organic pollutants. Journal of Photochemistry and Photobiology C: Photochemistry Reviews, 6(2–3), 186–205. ISSN: 1389-5567.
6. Teoh, W. Y., Scott, J. A., & Amal, R. (2012). The Journal of Physical Chemistry Letters, 3(5), 629–639. https://doi.org/10.1021/jz3000646
7. Zhu, S., & Wang, D. (2017). Advanced Energy Materials, 7, 1700841. https://doi.org/10.1002/aenm.201700841
8. Wang, Q., & Domen, K. (2020). Chemical Reviews, 120(2), 919–985. https://doi.org/10.1021/acs.chemrev.9b00201
9. Wu, Y., Lazic, P., Hautier, G., Persson, K., & Ceder, G. (2013). Energy & Environmental Science, 6(1), 157–168 (The Royal Society of Chemistry). https://doi.org/10.1039/C2EE23482C
10. Castelli, I. E., Olsen, T., Datta, S., Landis, D. D., Dahl, S., Thygesen, K. S., & Jacobsen, K. W. (2012). Energy & Environmental Science, 5(2), 5814–5819 (The Royal Society of Chemistry). https://doi.org/10.1039/C1EE02717D
11. Castelli, I. E., Landis, D. D., Thygesen, K. S., Dahl, S., Chorkendorff, I., Jaramillo, T. F., & Jacobsen, K. W. (2012). Energy & Environmental Science, 5, 9034–9043.
12. Castelli, I. E., Olsen, T., Datta, S., Landis, D. D., Dahl, S., Thygesen, K. S., & Jacobsen, K. W. (2012). Energy & Environmental Science, 5, 5814–5819.
13. Chen, S., Takata, T., & Domen, K. (2017). Particulate photocatalysts for overall water splitting. Nature Reviews Materials, 2, 17050. https://doi.org/10.1038/natrevmats.2017.50
14. Nursam, N. M., Wang, X., & Caruso, R. A. (2015). ACS Combinatorial Science, 17(10), 548–569. https://doi.org/10.1021/acscombsci.5b00049
15. Fujishima, A., & Honda, K. (1972). Electrochemical photolysis of water at a semiconductor electrode. Nature, 238, 37–38. https://doi.org/10.1038/238037a0
16. Masood, H., Toe, C. Y., Teoh, W. Y., Sethu, V., & Amal, R. (2019). ACS Catalysis, 9(12), 11774–11787.
17. Baumes, L., Farrusseng, D., Lengliz, M., & Mirodatos, C. (2004). Using artificial neural networks to boost high-throughput discovery in heterogeneous catalysis. QSAR & Combinatorial Science, 23, 767–778.
18. Goldsmith, B. R., Esterhuizen, J., Liu, J.-X., Bartel, C. J., & Sutton, C. (2018). Machine learning for heterogeneous catalyst design and discovery. AIChE Journal, 64, 2311–2323. https://doi.org/10.1002/aic.16198
19. Kitchin, J. R. (2018). Machine learning in catalysis. Nature Catalysis, 1, 230–232. https://doi.org/10.1038/s41929-018-0056-y
20. Yuan, R., Liu, Z., Balachandran, P. V., Xue, D., Zhou, Y., Ding, X., Sun, J., Xue, D., & Lookman, T. (2018). Advanced Materials, 30, 1702884. https://doi.org/10.1002/adma.201702884
21. Li, Z., Ma, X., & Xin, H. (2017). Catalysis Today, 280, 232–238. ISSN: 0920-5861. https://doi.org/10.1016/j.cattod.2016.04.013
22. Azadi, S., Karimi-Jashni, A., & Javadpour, S. (2018). 117, 267–277. ISSN: 0957-5820. https://doi.org/10.1016/j.psep.2018.03.038
23. Chakraborty, S., Xie, W., Mathews, N., Sherburne, M., Ahuja, R., Asta, M., & Mhaisalkar, S. G. (2017). ACS Energy Letters, 2(4), 837–845. https://doi.org/10.1021/acsenergylett.7b00035
24. Liu, D., Li, Q., Hu, J., Jing, H., & Wu, K. (2019). Journal of Materials Chemistry C, 7, 371–379. https://doi.org/10.1039/C8TC04065F
25. Mounet, N., Gibertini, M., Schwaller, P., et al. (2018). Two-dimensional materials from high-throughput computational exfoliation of experimentally known compounds. Nature Nanotech, 13, 246–252. https://doi.org/10.1038/s41565-017-0035-5
26. Balachandran, P. V. (2019). 164, 82–90. ISSN: 0927-0256. https://doi.org/10.1016/j.commatsci.2019.03.057
27. Gladkikh, V., Kim, D. Y., Hajibabaei, A., Jana, A., Myung, C. W., & Kim, K. S. (2020). The Journal of Physical Chemistry C, 124(16), 8905–8918. https://doi.org/10.1021/acs.jpcc.9b11768
28. Li, C., Hao, H., Xu, B., Zhao, G., Chen, L., Zhang, S., & Liu, H. (2020). Journal of Materials Chemistry C, 8, 3127–3136.
29. Lu, S., Zhou, Q., Ma, L., Guo, Y., & Wang, J. (2019). Rapid discovery of ferroelectric photovoltaic perovskites and material descriptors via machine learning. Small Methods, 3, 1900360. https://doi.org/10.1002/smtd.201900360
30. Moghadam, P. Z., Rogge, S. M., Li, A., Chow, C. M., Wieme, J., Moharrami, N., Aragones-Anglada, M., Conduit, G., Gomez-Gualdron, D. A., Van Speybroeck, V., & Fairen-Jimenez, D. (2019). Structure-mechanical stability relations of metal-organic frameworks via machine learning. Matter, 1(1), 219–234. ISSN: 2590-2385. https://doi.org/10.1016/j.matt.2019.03.002
31. Shi, Z., Yang, W., Deng, X., Cai, C., Yan, Y., Liang, H., Liu, Z., & Qiao, Z. (2020). Molecular Systems Design & Engineering, 5, 725–742.
32. Wu, Y., Duan, H., & Xi, H. (2020). Chemistry of Materials, 32(7), 2986–2997. https://doi.org/10.1021/acs.chemmater.9b05322
33. Sasikumar, K., Chan, H., Narayanan, B., & Sankaranarayanan, S. K. (2019). Chemistry of Materials, 31(9), 3089–3102. https://doi.org/10.1021/acs.chemmater.8b03969
34. Li, Z., Xu, Q., Sun, Q., Hou, Z., & Yin, W.-J. (2019). Advanced Functional Materials, 29, 1807280. https://doi.org/10.1002/adfm.201807280
35. Lu, H., Li, X., Monny, S. A., Wang, Z., & Wang, L. (2022). Chinese Journal of Catalysis, 43(5), 1204–1215. ISSN: 1872-2067. https://doi.org/10.1016/S1872-2067(21)64028-7
36. Kaufmann, K., Maryanovsky, D., Mellor, W. M., et al. (2020). Discovery of high-entropy ceramics via machine learning. Npj Computational Materials, 6, 42. https://doi.org/10.1038/s41524-020-0317-6
37. Qureshi, M., & Takanabe, K. (2017). Chemistry of Materials, 29(1), 158–167. https://doi.org/10.1021/acs.chemmater.6b02907
38. Mills, A., Hill, C., & Robertson, P. K. (2012). Journal of Photochemistry and Photobiology A: Chemistry, 237, 7–23. ISSN: 1010-6030. https://doi.org/10.1016/j.jphotochem.2012.02.024
39. Buriak, J. M., Kamat, P. V., & Schanze, K. S. (2014). ACS Applied Materials & Interfaces, 6(15), 11815–11816. https://doi.org/10.1021/am504389z
40. American Society for Testing and Materials. Committee G03 on Weathering and Durability. Standard Tables for Reference Solar Spectral Irradiances: Direct Normal and Hemispherical on 37° Tilted Surface; ASTM International (2012).
41. Chen, C., Zuo, Y., Ye, W., Li, X., Deng, Z., & Ong, S. P. (2020). A critical review of machine learning of energy materials. Advanced Energy Materials, 10, 1903242. https://doi.org/10.1002/aenm.201903242
Discovery of Novel Photocatalysts Using Machine Learning Approach
257
42. Himanen, L., Geurts, A., Foster, A. S., & Rinke, P. (2019). Data-driven materials science: Status, challenges, and perspectives. Advancement of Science, 6, 1900808. https://doi.org/10. 1002/advs.201900808 43. Toyao, T., Maeno, Z., Takakusagi, S., Kamachi, T., Takigawa, I., & Shimizu, K.-I. (2020). ACS Catalysis, 10(3), 2260–2297. https://doi.org/10.1021/acscatal.9b04186 44. Abor, D. P., Roch, L. M., Saikin, S. K., et al. (2018). Accelerating the discovery of materials for clean energy in the era of smart automation. Nature Reviews Materials, 3, 5–20. https:// doi.org/10.1038/s41578-018-0005-z 45. Sun, X., Wang, C., Su, D., Wang, G., & Zhong, Y. (2020). Application of photocatalytic materials in sensors. Advanced Materials Technologies, 5, 1900993. https://doi.org/10.1002/ admt.201900993 46. Ismael, M. (2021). 303, 121207. ISSN: 0016-2361. https://doi.org/10.1016/j.fuel.2021. 121207 47. Hanif, M. A., Kim, Y. S., Ameen, S., Kim, H. G., & Kwac, L. K. (2022).Boosting the visible light photocatalytic activity of ZnO through the incorporation of N-doped for wastewater treatment. Coatings, 12(5), 579. https://doi.org/10.3390/coatings12050579 48. Wu, Z., Zhong, H., Yuan, X., Wang, H., Wang, L., Chen, X., Zeng, G. & Wu, Y. (2014).67, 330–344. ISSN: 0043-1354. https://doi.org/10.1016/j.watres.2014.09.026 49. Chang, J., Ma, J., Ma, Q., Zhang, D., Qiao, N., Hu, M., & Ma, H.Applied Clay Science, 119, 132–140. ISSN 0169-1317. https://doi.org/10.1016/j.clay.2015.06.038 50. Hitam, C. N. C., & Jalil, A. A. (2020). Journal of Environmental Management, 258, 110050. ISSN 0301-4797. https://doi.org/10.1016/j.jenvman.2019.110050 51. Khan, M.Y., Ahmad, M., Sadaf, S., Iqbal, S., Nawaz, F., & Iqbal, J. (2019). Journal of Materials Research and Technology, 8(3), 3261–3269. ISSN: 2238-7854. https://doi.org/10.1016/j.jmrt. 2019.05.015 52. Li, Z., Zhang, P., Shao, T., Wang, J., Jin, L., & Li, X. (2013). Journal of Hazardous Materials, 260, 40–46. ISSN: 0304-3894. 
https://doi.org/10.1016/j.jhazmat.2013.04.042 53. Wang, G., Huang, B., Li, Z., et al. (2015). Synthesis and characterization of ZnS with controlled amount of S vacancies for photocatalytic H2 production under visible light. Science and Reports, 5, 8544. https://doi.org/10.1038/srep08544 54. Lim, H., & Rawal, S. B. (2017). Progress in Natural Science: Materials International, 27(3), 289–296. ISSN: 1002-0071. https://doi.org/10.1016/j.pnsc.2017.04.003 55. Salari, H. (2020). Materials Research Bulletin, 131, 110979, ISSN 0025-5408. https://doi. org/10.1016/j.materresbull.2020.110979 56. Hu, K., Liu, P., Zhang, Z., Bian, J., Wang, G., Wu, H., Xu, H., & Jing, L. (2022).The Journal of Physical Chemistry C, 126(23), 9704–9712. https://doi.org/10.1021/acs.jpcc.2c01919 57. Priyanga, G. S., Mattur, M. N., Nagappan, N., Rath, S., & Thomas, T. (2022). Prediction of nature of band gap of perovskite oxides (ABO3) using a machine learning approach. Journal of Materiomics, 8(5), 937–948. ISSN 2352-8478. https://doi.org/10.1016/j.jmat.2022.04.006 58. Behara, S., Rath, S., & Thomas, T. (2022). Machine learning (ML) as a tool for phosphor design: A perspective. Materials Letters, 308, Part A, 131061, ISSN: 0167-577X. https://doi. org/10.1016/j.matlet.2021.131061 59. Behara, S., Poonawala, T., & Thomas, T. (2021). Crystal structure classification in ABO3 perovskites via machine learning. Computational Materials Science, 188, 110191. ISSN 09270256. https://doi.org/10.1016/j.commatsci.2020.110191 60. Rath, S., Priyanga, G. S., Nagappan, N., & Thomas, T. (2022). Discovery of direct band gap perovskites for light harvesting by using machine learning. Computational Materials Science, 210, 111476. ISSN 0927-0256. https://doi.org/10.1016/j.commatsci.2022.111476 61. Neat, u, S, , Maciá-Agulló, J. A., & Garcia, H. (2014). Solar light photocatalytic CO2 reduction: General considerations and selected bench-mark photocatalysts. International Journal of Molecular Sciences, 15(4), 5246–5262. 
https://doi.org/10.3390/ijms15045246 62. https://towardsdatascience.com/uncovering-the-potential-of-materials-data-using-matminerand-pymatgen-83126fadde1c
258
G. S. Priyanga et al.
63. Ward, L., Liu, R., Krishna, A., Hegde, V. I., Agrawal, A., Choudhary, A., & Wolverton, C. (2017). Physical Review B, 96, 24104. 64. Rupp, M., Tkatchenko, A., Muller, K.-R., & von Lilienfeld, O. A. (2012). Physical Review Letters, 108, 58301. 65. Carrete, J., Li, W., Mingo, N., Wang, S., & Curtarolo, S. (2014). Physical Review X, 4, 11019. 66. Ward, L., & Wolverton, C. (2017). Current Opinion in Solid State and Materials Science, 21, 167. 67. Mauro, J. C., Tandia, A., Vargheese, K. D., Mauro, Y. Z., & Smedskjaer, M. M. (2016). Chemistry of Materials, 28, 4267. 68. Bucholz, E. W., Kong, C. S., Marchman, K. R., Sawyer, W. G., Phillpot, S. R., Sinnott, S. B., & Rajan, K. (2012). Tribology Letters, 47, 211. 69. Sparks, T. D., Gaultois, M. W., Oliynyk, A., Brgoch, J., & Meredig, B. (2015). Scripta Materialia, 111, 10. 70. Mannodi-Kanakkithodi, A., Chandrasekaran, A., Kim, C., Huan, T. D., Pilania, G., Botu, V., & Ramprasad, R. (2017). Materials Today. https://doi.org/10.1016/j.mattod.2017.11.021 71. Faber, F. A., Lindmaa, A., von Lilienfeld, O. A., & Armiento, R. (2016). Physical Review Letters, 117, 135502. 72. Ren, F., Ward, L., Williams, T., Laws, K. J., Wolverton, C., Hattrick-Simpers, J., & Mehta, A., Science Advances, 4, eaaq1566; Seko, A., Hayashi, H., Nakayama, K., Takahashi, A., Tanaka, I., Physical Review B, 95, 144110. 73. Seko, A., Hayashi, H., Nakayama, K., Takahashi, A., & Tanaka, I. (2017). Physical Review B, 95, 144110. 74. Ramprasad, R., Batra, R., Pilania, G., Mannodi-Kanakkithodi, A., & Kim, C. (2017). NPJ Computational Materials, 3, 54. 75. Kalidindi, S. R. (2012). ISRN Materials Science, 2012, 1. 76. McKinney, W. (2010). Proceedings of the 9th Python in Science Conference (Vol. 1697900, p. 51). 77. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). 
Journal of Machine Learning Research, 12, 2825. 78. Millman, K. J., & Aivazis, M. (2011). Computing in Science and Engineering, 13, 9. 79. van der Walt, S., Colbert, S. C., & Varoquaux, G. (2011). Computing in Science and Engineering, 13, 22. 80. Perez, F., & Granger, B. E. (2007). Computing in Science and Engineering, 9, 21. 81. Jain, A., Ong, S. P., Hautier, G., Chen, W., Richards, W. D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., Ceder, G., & Persson, K. A. (2013). APL Materials, 1, 11002. 82. O’Mara, J., Meredig, B., & Michel, K. (2016). JOM, 68, 2031. https://citrination.com 83. https://mpds.io/ 84. Blaiszik, B., Chard, K., Pruyne, J., Ananthakrishnan, R., Tuecke, S., & Foster, I. (2016). JOM Journal of the Minerals Metals and Materials Society, 68, 2045. 85. Ward, L., Dunn, A., Faghaninia, A., Zimmermann, N. E., Bajaj, S., Wang, Q., Montoya, J., Chen, J., Bystrom, K., Dylla, M., & Chard, K., Matminer: An open source toolkit for materials data mining. 86. Waskom, M., Botvinnik, O., O’Kane, D., Hobson, P., Lukauskas, S., Gemperline, D. C., Augspurger, T., Halchenko, Y., Cole, J. B., Warmenhoven, J., de Ruiter, J., Pye, C., Hoyer, S., Vanderplas, J., Villalba, S., Kunter, G., Quintero, E., Bachant, P., Martin, M., … Qalieh, A. (2017). https://doi.org/10.5281/ZENODO.883859 87. Hunter, J. D. (2007). Computing in Science and Engineering, 9, 90. 88. https://plot.ly/ 89. https://doi.org/10.1016/j.commatsci.2012.10.028 90. Oliphant, T. E. (2007). Computing in Science & Engineering, 9, 10. 91. Hautier, G., Fischer, C., Ehrlacher, V., Jain, A., Ceder, G. (2010). Inorganic Chemistry, 656.
Discovery of Novel Photocatalysts Using Machine Learning Approach
259
92. Gonze, X., Rignanese, G. M., Verstraete, M. J., Beuken, J. M., Pouillon, Y., Caracas, R., Jollet, F., Torrent, M., Zerah, G., Mikami, M., Ghosez, P., Veithen, M., Raty, J. Y., Olevano, V., Bruneval, F., Reining, L., Godby, R. W., Onida, G., Hamann, D. R., & Allan, D. C. (2005). Zeitschrift für Kristallographie, 220, 558. 93. Hester, J. R. (2006). Journal of Applied Crystallography, 39, 621. 94. O’Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., & Hutchison, G. R. (2011). Journal of Cheminformatics, 3, 33. 95. Kresse, G., & Furthmuller, J. (1996). Physical Review B, 54, 11169. 96. Bahn, S. R., & Jacobsen, K. W. (2002). Computer Science Engineering, 4, 56. 97. Anisimov, V. I., Zaanen, J., & Andersen, O. K. (1991). Physical Review B, 44, 943. 98. Anisimov, V. I., Aryasetiawan, F., & Lichtenstein, A. I. (1997).Journal of Physics: Condensation Matter, 9, 767. 99. Liechtenstein, A. I., Anisimov, V. I., & Zaanen, J. (1995). Physical Review B, 52, R5467. 100. Jain, A., Hautier, G., Ong, S., Moore, C., Fischer, C., Persson, K., & Ceder, G. (2011). Physical Review B, 84, 045115. 101. Zhou, F., Cococcioni, M., Marianetti, C. A., Morgan, D., & Ceder, G. (2004). Physical Review B, 70, 235121. 102. Wang, L., Maxisch, T., & Ceder, G. (2006). Physical Review B, 73, 195107. 103. Ong, S. P., Wang, L., Kang, B., & Ceder, G. (2008). Chemistry of Materials, 20. 104. Ong, S. P., Jain, A., Hautier, G., Kang, B., & Ceder, G. (2010). Electrochemistry Communications, 12, 427. 105. Fidan, S., Oktay, H., Polat, S., & Ozturk, S. (2019). An artificial neural network model to predict the thermal properties of concrete using different neurons and activation functions. Advances in Materials Science and Engineering, 2019, 3831813. 106. Swaidani, A. M., & Khwies, W. T. (2018). Applicability of artificial neural networks to predict mechanical and permeability properties of volcanic scoria-based concrete. Advances in Civil Engineering, 2018, 5207962. 107. 
Zhang, Z., Barkoula, N. M., Karger-Kocsis, J., & Friedrich, K. (2003). Artificial neural network predictions on erosive wear of polymers. Wear, 255, 708–713. 108. Roy, N. K., Potter, W. D., & Landau, D. P. (2006). Polymer property prediction and optimization using neural networks. IEEE Transactions on Neural Networks, 17, 1001–1014. 109. Kumar, G. V., Pramod, R., Rao, C. S. P., & Gouda, P. S. (2018). Artificial neural network prediction on wear of Al6061 alloy metal matrix composites reinforced with-Al2 O3 . Materials Today Proceedings, 5, 11268–11276. 110. Scott, D. J., Coveney, P. V., Kilner, J. A., Rossiny, J. C. H., & Alford, N. M. N. (2007). Prediction of the functional properties of ceramic materials from composition using artificial neural networks. Journal of the European Ceramic Society, 27, 4425–4435. 111. Moravec, H. (1988). Mind Children. Harvard University Press. 112. Nath, P., Plata, J. J., Usanmaz, D., Orabi, R. A., Fornari, M., Nardelli, M. B., Toher, C., & Curtarolo, S. (2016). High-throughput prediction of finite-temperature properties using the quasi-harmonic approximation. Computational Materials Science, 125, 82–91. 113. Sparks, T. D., Gaultois, M. W., Oliynyk, A., Brgoch, J., & Meredig, B. (2016). Data mining our way to the next generation of thermoelectrics. Scripta Materialia, 111, 10–15. 114. Xue, D., Yuan, R., Zhou, Y., Balachandran, P. V., Ding, X., Sun, J., & Lookman, T. (2017). An informatics approach to transformation temperatures of NiTi-based shape memory alloys. Acta Materialia, 125, 532–541. 115. Thankachan, T., Prakash, K. S., Pleass, C. D., Rammasamy, D., Prabakaran, B., & Jothi, S. (2017). Artificial neural network to predict the degraded mechanical properties of metallic materials due to the presence of hydrogen. International Journal of Hydrogen Energy, 42, 28612–28621. 116. Zhu, Z., Dong, B., Guo, H., Yang, T., & Zhang, Z. (2020). 
Fundamental band gap and alignment of two-dimensional semiconductors explored by machine learning. Chinese Physics B, 29, 046101.
260
G. S. Priyanga et al.
117. Masood, H., Toe, C. Y., Teoh, W. Y., Sethu, V., & Amal, R. (2019). Machine learning for accelerated discovery of solar photocatalysts. ACS Catalysis, 9, 11774–11787. 118. Toma, F. L., Guessasma, S., Klein, D., Montavon, G., Bertrand, G., & Coddet, C. (2004). Neural computation to predict TiO2 photocatalytic efficiency for nitrogen oxides removal. Journal of Photochemistry and Photobiology, A: Chemistry, 165, 91–96. 119. Oliveros, E., Benoit-Marquie, F., Puech-Costes, E., Maurette, M. T., & Nascimento, C. A. O. (1998). Neural network modeling of the photocatalytic degradation of 2,4-dihydroxybenzoic acid in aqueous solution. Analusis, 26, 326–332. 120. Emilio, C. A., Litter, M. I., & Magallanes, J. F. (2002). Semiempirical modeling with application of artificial neural networks for the photocatalytic reaction of ethylenediaminetetraacetic acid (EDTA) over titanium oxide (TiO2 ). Helvetica Chimica Acta, 85, 799–813. 121. Curtarolo, S., Setyawan, W., Wang, S., Xue, J., Yang, K., Taylor, R. H., Nelson, L. J., Hart, G. L., Sanvito, S., & Buongiorno-Nardelli, M., et al. (2012). Aflowlib. Org: A distributed materials properties repository from high-throughput Ab initio calculations. Computational Materials Science, 58, 227−235. 122. Belsky, A., Hellenbrandt, M., Karen, V. L., & Luksch, P. (2002). New developments in the inorganic crystal structure database (ICSD): Accessibility in support of materials research and design. Acta Crystallographica Section B: Structural Science, 58(3), 364−369. 123. Jain, A., Ong, S. P., Hautier, G., Chen, W., Richards, W. D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., Ceder, G., & Persson, K. A. (2013). Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Materials, 1(1), 011002. 124. Landis, D. D., Hummelshoj, J. S., Nestorov, S., Greeley, J., Dulak, M., Bligaard, T., Norskov, J. K., & Jacobsen, K. W. (2012). The computational materials repository. 
Computing in Science and Engineering, 14(6), 51. 125. Kirklin, S., Saal, J. E., Meredig, B., Thompson, A., Doak, J. W., Aykol, M., Rühl, S., & Wolverton, C. (2015). The open quantum materials database (OQMD): Assessing the accuracy of DFT formation energies. NPJ Computational Materials, 1, 15010. 126. Gražulis, S., Chateigner, D., Downs, R. T., Yokochi, A. F. T., Quirós, M., Lutterotti, L., Manakova, E., Butkus, J., Moeck, P., & Le Bail, A. (2009). Crystallography open database— An open-access collection of crystal structures. Journal of Applied Crystallography, 42(4), 726−729. 127. Winther, K., Hoffmann, M. J., Mamun, O., Boes, J. R., Nørskov, J. K., Bajdich, M., & Bligaard, T. (2019). Catalysis-Hub. Org: An open electronic structure database for surface reactions. Scientific Data, 6(1), 75. 128. Linstrom, P. J., & Mallard, W. G. (2001). The NIST chemistry webbook: A chemical data resource on the internet. Journal of Chemical and Engineering Data, 46(5), 1059–1063. 129. Blaiszik, B., Chard, K., Pruyne, J., Ananthakrishnan, R., Tuecke, S., & Foster, I. (2016). The materials data facility: Data services to advance materials science research. JOM Journal of the Minerals Metals and Materials Society, 68(8), 2045–2052. 130. Kim, S., Thiessen, P. A., Bolton, E. E., Chen, J., Fu, G., Gindulyte, A., Han, L., He, J., He, S., Shoemaker, B. A., et al. (2016). Pubchem substance and compound databases. Nucleic Acids Research, 44(D1), D1202–D1213. 131. Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. https://doi.org/10.1023/ A:1010933404324 132. Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 1189–1232. https://doi.org/10.2307/2699986 133. Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386–408. https://doi.org/10.1037/h00 42519 134. McCulloch, W., & Pitts, W. (1943). 
A logical calculus of ideas immanent in nervous activity. Bulletin of Mathematical Biophysics., 5(4), 115–133. https://doi.org/10.1007/BF02478259 135. Werbos, P. J. (1975). Beyond regression: New tools for prediction and analysis in the behavioral sciences.
Discovery of Novel Photocatalysts Using Machine Learning Approach
261
136. https://community.tibco.com/wiki/randomforest-template-tibco-spotfirer-wiki-page, CC BYSA 4.0, https://commons.wikimedia.org/w/index.php?curid=68995764. 137. Multilayer Perceptron. https://medium.com/codex/introduction-to-how-an-multilayer-percep tron-works-but-without-complicated-math-a423979897ac 138. Tao, Q., Lu, T., Sheng, Y., Li, L., Lu, W., & Li, M. (2021). Machine learning aided design of perovskite oxide materials for photocatalytic water splitting. Journal of Energy Chemistry, 60, 351–359. ISSN 2095-4956, https://doi.org/10.1016/j.jechem.2021.01.035. https://www. sciencedirect.com/science/article/pii/S2095495621000644 139. Zhang, Q., Chang, D., Zhai, X., & Lu, W. (2018). Chemometrics and Intelligent Laboratory Systems, 177, 26–34. 140. Schmidt, J., Marques, M. R. G., Botti, S., & Marques, M. A. L. (2019). npj Computational Materials, 5, 83. 141. Wang, H., Ji, Y., & Li, Y. (2019). WIREs Computational Molecular Science, 10, e1421. Perceptron. https://www.simplilearn.com/tutorials/deep-learning-tutorial/perceptron. 142. Zhai, X., Chen, M., Lu, W., & Chang, D. (2018). Journal of Mathematical Chemistry, 56, 1744–1758. Multilayer Perceptron. https://medium.com/codex/introduction-to-how-an-mul tilayer-perceptron-works-but-without-complicated-math-a423979897ac 143. Peng, H., Long, F., & Ding, C. (2005). IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 1226–1238. 144. Rodriguez-Galiano, V. F., Luque-Espinar, J. A., Chica-Olmo, M., & Mendes, M. P. (2018).Science of the Total Environment, 624, 661–672. 145. Yusof, M. H. M., Mokhtar, M. R., Zain, A. M., & Maple, C. (2018). International Journal of Advanced Computer Science and Applications, 9, 509–517. 146. Classification using Random Forest. By Venkata Jagannath. https://community.tibco.com/ wiki/random-forest-template-tibco-spotfirer-wiki-page, CC BY-SA 4.0, https://commons.wik imedia.org/w/index.php?curid=68995764 147. Parida, K. M., Reddy, K. H., Martha, S., Das, D. P., & Biswal, N. (2010). 
International Journal of Hydrogen Energy, 35, 12161–12168.
Machine Learning in Impedance-Based Sensors V. Balasubramani and T. M. Sridhar
Abstract The impedance technique is used to understand the electrical properties of conducting surfaces and their interfaces. As an analytical tool in electrochemistry, it is commonly referred to as Electrochemical Impedance Spectroscopy (EIS). Impedance is applied in many fields such as sensors, semiconductors, energy storage devices, corrosion technology, conducting polymers, coatings, ceramics, and advanced materials. Material properties and functions, along with the stability of the system, change with the given environment. EIS is a complex technique, and its output curves are represented as Nyquist and Bode plots. The data obtained from the plots have to be fitted with mathematical models and electrical components to arrive at an equivalent circuit that allows meaningful interpretation. Equivalent circuit analysis helps to interpret the mechanism of the device in the given electrochemical system. Machine learning (ML) tools can be trained to process the data and identify the best-matching equivalent circuit, but several challenges remain, the biggest being the creation of EIS databases.
V. Balasubramani · T. M. Sridhar (B)
Department of Analytical Chemistry, University of Madras, Chennai 600025, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_12

1 Introduction

Analytical science is about decoding information to probe the structure, chemical composition, distribution in composites, etc. of materials by using several types of classical and instrumental analysis. This quest leads to the development of validated scientific protocols for analysis, which become standards once they are accepted across the globe. The challenge keeps evolving for analytical chemists, who are expected to be innovative in designing and applying analytical practices coupled with modern instrumentation techniques and detection strategies that can detect from a few grams down to parts per billion (ppb) levels or even less. The need for chemical analysis is part of our daily activity, as life without materials is unimaginable. On the other hand, development has led to a deterioration in the quality of life and an increase in human exposure to toxic materials, which need to be analyzed and monitored from the water used for drinking to the air that is inhaled.

Sensors play an important role in environmental monitoring, industry, control of chemical processes, medical applications, and day-to-day life. Sensors alert us to changes occurring in the environment, which could be anything from the atmosphere to the human body. They have become a part of our day-to-day activities and are designed to sense a variety of inputs from gases to solar radiation. Gas sensors are designed to detect the presence of toxic gases in the environment, while biosensors respond to changes in the biochemical environment. Recently, graphene-based nanocomposite materials have been considered among the most capable materials for chemical and biosensing applications. The design, functioning, sensitivity, and accuracy of these miniature sensors depend on everything from the quality of the nanomaterials used in their fabrication to the instrument used for detection. This creates challenges for developing analytical methods to analyze these materials and new analytical processes to study the chemical and biochemical changes, and machine learning is going to change the modes of application.

This chapter mainly focuses on impedance-based sensors, such as chemical sensors and biosensors using graphene-based nanocomposite materials, and on the role of machine learning in addressing the challenges of these sensors in the detection of chemicals and biomolecules; advantages and drawbacks are illustrated. A broad introduction to the fundamentals of machine learning from the EIS point of view is provided, along with an analysis of machine learning models. Finally, evidence of successful applications of machine learning and the outstanding challenges in the development of advanced impedance-based sensors are highlighted.
2 Techniques for Chemical and Biomolecules Detection

2.1 Optical-Based Methods

An optical sensor uses light to track variations in optical characteristics, including optical path distance, absorbance, scattering, fluorescence, reflectivity, and refraction, in order to find specific, targeted analytes. For the detection of target analytes, these sensors rely on changes in optical interaction, including UV and IR, or on changes in surface characteristics. Their major advantages over existing sensors include the ability to operate at room temperature and portability. Precise, regular monitoring of glucose levels in diabetic patients has always been complex. Glucose biosensors are an easy solution, but several hurdles remain before they can be made portable and easy to use by a layperson. The challenges in fabricating the sensors are the selection of functional materials consisting of polymers, carbon-based materials (graphene), and biological layers, along with determining their mechanism. The need to calibrate the sensor each time against another device is another drawback.

Beyranvand et al. [1] developed an enzyme-less biosensor with graphene that detects glucose and E. coli using fluorescence, ultraviolet, and electrochemical techniques as sensing signals and is designed to self-calibrate. Figure 1a shows the variation of fluorescence arising from the exchanges between the biosystems and biochemical constituents such as boronic acid groups and dopamine. The biochemical reactions were recorded with UV spectra at 248 nm on the electrode surface containing PTDN/GO/DBA, i.e., triazine-functionalized poly(ethylene glycol) (PTDN), graphene oxide (GO), and dopamine (Fig. 1b). Figure 1c shows the oxidation of GO, where a weak fluorescence resonance energy transfer (FRET) emission signal is obtained at 605 nm for dopamine. Light scattering is decreased in this system by improving its dispersion and its UV interaction with graphene at 270 nm. Glucose and E. coli were added at concentrations of 50 μM and 5000 colony forming units (CFU)/ml, respectively, in phosphate buffered saline (PBS) to obtain another signal for detecting these biosystems (Fig. 1d, e). Figure 1f, g shows a significant increase in the intensity of the fluorescence spectra of PTDN and GO at 605 nm upon the addition of glucose and E. coli. The flexibility and precision of the sensor signal increased due to the interaction of graphene and the biosystems. The system is a real-time biosensor with high efficiency and accuracy, and each signal that is determined can be self-calibrated. Graphene sheets in combination with the optical techniques of fluorescence and UV, together with electrochemical properties, can be used to develop self-regulating biosensors with accuracy and low detection limits in real-time samples.

Sangeetha and Madhan reported MoS2-graphene (G) composites for the detection of NO2 and formaldehyde gases [2]. The sensor shows high sensitivity toward NO2 gas (61%), with response and recovery times of 22 s and 35 s, respectively, and high stability. The spectral responses obtained for NO2 and formaldehyde gas at different concentrations (0–500 ppm) are presented in Fig. 2a, c for MoS2 and Fig. 2b, d for MoS2/G, respectively. The characteristic peaks of the fiber in response to NO2 were observed at three positions in the spectra: 690, 751, and 962 nm. The studies indicate that the intensity of the peaks, along with the sensitivity, increases as the concentration of both gases is increased. Several other optical sensors have been built, including nano zinc oxide with GO as a coating on microfiber to detect ammonia, and a fluorescence sensor based on a fluorescein-reduced graphene oxide (rGO) combination for iron (III) detection [3, 4].
Fig. 1 a Representation of the interaction between glucose/E. coli and catechol and boronic acid. b UV-vis spectra obtained for nGO, dopamine, 2,5-thiophenediyl bis-boronic acid, PTDN/GO, and PTDN/GO/DBA in water. c Fluorescence recorded for dopamine, nGO, PTDN3, PTDN/GO, and PTDN/GO/DBA systems in PBS at pH 8 at the wavelength of 275 nm. UV-vis spectra obtained for these systems at a concentration of 1 mg/mL in PBS at pH 8 before and after the inclusion of d glucose and e E. coli. Fluorescence spectra obtained for these systems at a concentration of 1 mg/mL in PBS at pH 8 before and after the addition of f glucose and g E. coli [1]

Fig. 2 Spectral response obtained for NO2 and formaldehyde gases. a, c MoS2, b, d MoS2/G, respectively, at different concentrations [2]

2.2 Field Effect Transistors (FET)

Field Effect Transistor (FET) based sensors have several advantages: they are compact devices with low energy consumption, are easy to build, and are economically viable. In this device, the sensing layer is sandwiched between the source and drain electrodes. The conductance of the channel is regulated by a voltage applied to the gate electrode across a thin dielectric layer. Changes in the conductance of the semiconducting sensing unit appear as variations in the drain-source current, which form the basis for identifying chemical and biochemical analytes. The principle relies on the alterations brought about by molecules of the target analyte that have adhered to the semiconductor surface. Seo et al. [5] developed a graphene-based FET biosensor, functionalized with an antibody against the SARS-CoV-2 antigen protein, to detect coronavirus present in clinical samples. The sensor was assessed with cultured virus, its antigen protein, and nasal swabs from COVID-19 patients; the device is presented in Fig. 3 [5].
Fig. 3 A representation of the COVID-19 FET biosensor and its operating principle. The COVID-19 spike antibody, extracted from humans, is placed onto the graphene sheet, which acts as the sensing film, along with 1-pyrenebutyric acid N-hydroxysuccinimide ester, which acts as a probe [5]
3 Electrochemical Impedance-Based Sensors

Impedance is a generalization of "resistance" to alternating current (AC) circuits. Electrochemical Impedance Spectroscopy (EIS) is a quantitative assessment technique widely used to study the properties of surfaces. Impedance is a complex number, in which the resistance is the real component and the reactance, arising from capacitance and inductance combined, is the imaginary component. The total impedance of a circuit is the combined opposition of all its resistors, capacitors, and inductors to the flow of electrons. At an applied frequency (f), the impedance is the ratio of the voltage (V) to the current (I) moving through the system:

$$Z(f) = \frac{V(f)}{I(f)} \tag{1}$$

Figure 4 shows how the magnitude of the impedance and the phase shift between current (I) and voltage (V) are related to the real and imaginary parts of the impedance. Impedance is built up from the equivalent circuit components, namely the resistor (R), capacitor (C), and inductor (L). The imaginary part of the impedance describes the system's capacitance and inductance. The phase shift between current and voltage that arises when the imaginary component is not equal to zero is represented in Fig. 5 [6]. The Randles circuit is commonly used when processing electrochemical impedance data with equivalent circuits. In a typical three-element Randles equivalent circuit, RΩ is the uncompensated resistance of the electrolyte between the working and the reference electrodes, Rp is the polarization or charge transfer resistance, and Cdl is the specific double-layer capacitance at the working electrode/electrolyte interface.
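The three-element Randles circuit described above can be sketched numerically. The short Python example below is only an illustration: it assumes the usual arrangement in which RΩ is in series with the parallel combination of Rp and Cdl, and the parameter values and the function name `randles_impedance` are hypothetical, chosen for demonstration rather than taken from any measurement in this chapter. It prints the real part, the negative imaginary part, the magnitude, and the phase of Z(f) at a few frequencies, which are the quantities plotted in Nyquist and Bode representations.

```python
import cmath
import math


def randles_impedance(freq_hz, r_omega, r_p, c_dl):
    """Impedance of R_omega in series with (R_p in parallel with C_dl).

    Assumed topology: Z = R_omega + R_p / (1 + j*omega*R_p*C_dl).
    """
    omega = 2.0 * math.pi * freq_hz
    z_parallel = r_p / (1.0 + 1j * omega * r_p * c_dl)
    return r_omega + z_parallel


# Hypothetical illustrative values, not data from the chapter.
R_OMEGA = 10.0   # solution (uncompensated) resistance, ohm
R_P = 100.0      # polarization / charge transfer resistance, ohm
C_DL = 20e-6     # double-layer capacitance, farad

if __name__ == "__main__":
    for f in (1e-2, 1e0, 1e2, 1e4, 1e6):
        z = randles_impedance(f, R_OMEGA, R_P, C_DL)
        phase_deg = math.degrees(cmath.phase(z))
        print(f"f = {f:9.2e} Hz  Z' = {z.real:8.3f}  -Z'' = {-z.imag:8.3f}  "
              f"|Z| = {abs(z):8.3f}  phase = {phase_deg:7.2f} deg")
```

Under this assumed topology the real part tends to RΩ at high frequency and to RΩ + Rp at low frequency, and the imaginary part peaks at the characteristic frequency 1/(2πRpCdl). This is why a fitted semicircle in a Nyquist plot is often read in terms of these three parameters.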
Fig. 4 Illustration of phase changes in an impedance spectrum [6]
Fig. 5 Schematic representation of a circuit elements, b phase diagrams, c phase shift between voltage and current [6]
More complex circuit element models are used when multiple surfaces and interfaces are involved. A representative equivalent circuit for a coated substrate contains additional circuit elements such as the coating capacitance (Cc), the pore resistance (Rpo), the solution resistance (RΩ), and the substrate interfacial impedance (Zif).
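A coated-substrate model of this kind can be evaluated by composing series and parallel elements. The sketch below assumes the topology often used for porous coatings, in which RΩ is in series with Cc, and Cc is in parallel with the branch Rpo + Zif. The interfacial impedance Zif is represented here, purely as an assumption, by a charge transfer resistance in parallel with a double-layer capacitance. All names and values are hypothetical.

```python
import math


def parallel(z1, z2):
    """Impedance of two elements connected in parallel."""
    return (z1 * z2) / (z1 + z2)


def capacitor(freq_hz, capacitance):
    """Impedance of an ideal capacitor, 1 / (j * omega * C)."""
    return 1.0 / (1j * 2.0 * math.pi * freq_hz * capacitance)


def coated_substrate_impedance(freq_hz, r_omega, c_c, r_po, r_ct, c_dl):
    """R_omega + [ C_c || ( R_po + Z_if ) ], with Z_if = R_ct || C_dl (assumed)."""
    z_if = parallel(r_ct, capacitor(freq_hz, c_dl))  # interfacial impedance
    pore_branch = r_po + z_if                        # path through the pores
    return r_omega + parallel(capacitor(freq_hz, c_c), pore_branch)


if __name__ == "__main__":
    # Hypothetical values: 20 ohm solution, 1 nF coating, 10 kohm pores,
    # 1 Mohm charge transfer resistance, 10 uF double layer.
    for freq in (1e-3, 1.0, 1e3, 1e6):
        z = coated_substrate_impedance(freq, 20.0, 1e-9, 1e4, 1e6, 1e-5)
        print(f"f = {freq:9.2e} Hz  Z' = {z.real:12.2f}  Z'' = {z.imag:12.2f}")
```

With two time constants in the model, the Nyquist plot generally shows two arcs instead of the single semicircle of the Randles circuit, provided the time constants are well separated.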
3.1 Nyquist and Bode Plots
The Nyquist plot, also referred to as a Cole-Cole plot, is used to evaluate impedance data. It plots the imaginary part of the impedance against the real part, with each point recorded at a different frequency. The Bode plot presents the frequency response of the total impedance as changes in magnitude and phase shift in the frequency domain. EIS is a sensitive technique that enables toxic gas sensors to operate at parts per billion (ppb) levels. A typical Nyquist plot for a gas sensor is shown in Fig. 6, which marks the high-, mid-, and low-frequency regions. These regions are significant: grain bulk properties are analyzed in the high-frequency region, grain boundary details in the mid-frequency region, and ion transport at the electrode contact areas in the low-frequency region. The Bode plot provides information about frequency-dependent components and helps to predict the gas sensing signal. The gas sensitivity can be identified in the low-frequency region. When diffusion of the reacting species contributes to the response alongside charge transfer, the diffusion contribution is represented by the Warburg impedance. The Constant Phase Element (CPE) is an equivalent circuit component used in models that analyze EIS data. It describes a double layer that behaves as an imperfect capacitor and is used to fit the data while analyzing Nyquist plots; a typical CPE response is represented in Fig. 7. Over a wide frequency range, a CPE exhibits a constant phase angle. A single CPE gives a straight line on a Nyquist plot, making an angle of ϕ × (−90°) with the abscissa, where ϕ is the phase angle exponent. When ϕ is equal to one, the CPE behaves as an ideal capacitor [7].
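The constant phase angle of a CPE can be checked with a few lines of code. The sketch below assumes the widely used form Z = 1 / (Q (jω)^ϕ), where Q is the CPE coefficient; the symbol Q, the function names, and the numerical values are assumptions introduced for illustration.

```python
import cmath
import math


def cpe_impedance(freq_hz, q, phi):
    """Constant phase element, Z = 1 / (Q * (j*omega)**phi)."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / (q * (1j * omega) ** phi)


def phase_angle_deg(z):
    """Angle of the impedance vector relative to the real axis, in degrees."""
    return math.degrees(cmath.phase(z))


if __name__ == "__main__":
    Q = 1e-5  # hypothetical CPE coefficient
    for phi in (1.0, 0.8, 0.5):
        angles = [phase_angle_deg(cpe_impedance(f, Q, phi)) for f in (0.1, 10.0, 1e4)]
        # The angle is the same at every frequency and equals phi * (-90 deg).
        print(f"phi = {phi:.1f}  angles = {[round(a, 6) for a in angles]}")
```

An exponent of 1 reproduces an ideal capacitor (−90°), while an exponent of 0.5 gives the −45° line that is characteristic of semi-infinite Warburg diffusion.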
Fig. 6 Typical Nyquist plots obtained for a gas sensor
Fig. 7 Illustration of a Nyquist plot containing constant phase element [7]
3.2 Applications of EIS Based Sensors
EIS is an alternating current (AC) based sensitive technique for developing gas sensors, with distinct advantages over direct current (DC) based techniques such as cyclic voltammetry and resistance measurements, and over other approaches such as optical techniques [8]. Semiconducting metal oxides are widely used to build sensors that detect toxic gases using impedance. The success of EIS depends on selecting and fitting the appropriate equivalent circuit, which helps in predicting not only the mechanism but also the interface, surface, and bulk properties of the electrode. Machine learning is well suited to developing algorithms that find the best fit for the Nyquist and Bode plots obtained. Impedance-based sensors show several promising advantages over conventional sensors: they have very low detection limits, and a scan that acquires the data takes only a few minutes. Detection of toxic gases such as hydrogen sulfide at parts per billion levels has been reported in the presence of atmospheric gases [8]. EIS sensors also have a few disadvantages, particularly regarding selectivity toward a specific species, especially gases.
Fig. 8 a Schematic representation of the gold nanoparticle-rGO-based FET as an Ebola sensing device and b Nyquist plot of the impedance spectra for the sensing response in air, buffer, and different concentrations of Ebola glycoprotein. The curves obtained at high frequency are shown in the inset [9]
The miniaturization of the electrode leads and their handling also require specific training. Understanding the fundamentals of EIS is essential for selecting the equivalent circuit and for processing the graphs to obtain specific parameters. The use of ML would go a long way toward simplifying the complex data processing steps of EIS in the future. Maity et al. [9] developed a biosensor to detect the Ebola virus using a gated FET (GFET) with a dielectric of gold nanoparticles on rGO. The device relies on a carrier injection-trapping-release-transfer mechanism to detect Ebola glycoprotein (GP) through electrical resonance frequency modulation and is presented in Fig. 8. Tracking the changes in the impedance spectra with Nyquist plots at the electronic resonance frequency of the GFET device helps in optimizing the sensitivity of Ebola virus detection, as the phase angle changes with frequency. Typical semicircles are obtained for air, buffer, and biomolecule-containing solutions [9]. Zhang et al. [10] used EIS to determine the concentration of Hg2+ in drinking water with a DNA-modified composite of reduced graphene oxide (rGO) and chitosan (CS). The fabricated electrochemical biosensor has high sensitivity (0.1–10 nM) with a low detection limit, good stability, high selectivity, and repeatability [10]. The Nyquist curves change as the concentration of Hg2+ increases, and the CS/rGO/DNA composite exhibits the highest Hg2+ sensing efficiency compared with the others. In the recorded impedance spectra, the semicircle diameter can be monitored in the high-frequency region. It corresponds to the charge transfer process, and the diameter of the semicircle increased with increasing Hg2+ concentration. Balasubramani et al. [11] reported an H2S toxic gas sensor using EIS on an rGO–ZnO composite. Figure 9 shows the impedance spectra and the equivalent circuit of the ZnO–rGO composite with H2S (2–100 ppm) and without H2S at 90 °C. These results show that the grain boundary resistance was significantly affected. The authors reported high sensitivity, selectivity, response, recovery, and stability.
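The link between the semicircle diameter and the charge transfer process can be illustrated with synthetic spectra. The sketch below generates Nyquist data from a simple Randles-type model (an assumption made here, not the circuit fitted in [10] or [11]) and estimates the diameter crudely as the difference between the real parts at the lowest and highest measured frequencies. All names and values are hypothetical.

```python
import math


def randles_impedance(freq_hz, r_omega, r_ct, c_dl):
    """R_omega in series with (R_ct in parallel with C_dl)."""
    omega = 2.0 * math.pi * freq_hz
    return r_omega + r_ct / (1.0 + 1j * omega * r_ct * c_dl)


def log_spaced(start_hz, stop_hz, n_points):
    """Frequencies spaced evenly on a logarithmic axis."""
    lo, hi = math.log10(start_hz), math.log10(stop_hz)
    step = (hi - lo) / (n_points - 1)
    return [10.0 ** (lo + i * step) for i in range(n_points)]


def semicircle_diameter(spectrum):
    """Crude estimate: Re(Z) at the lowest minus Re(Z) at the highest frequency.

    `spectrum` is a list of (frequency, complex impedance) pairs. The estimate
    is only meaningful when the frequency window covers the whole semicircle.
    """
    ordered = sorted(spectrum, key=lambda pair: pair[0])
    return ordered[0][1].real - ordered[-1][1].real


if __name__ == "__main__":
    freqs = log_spaced(1e-2, 1e6, 50)
    # Hypothetical charge transfer resistances standing in for increasing
    # analyte concentration.
    for r_ct in (200.0, 500.0, 1000.0):
        spectrum = [(f, randles_impedance(f, 10.0, r_ct, 1e-6)) for f in freqs]
        print(f"R_ct = {r_ct:7.1f} ohm  estimated diameter = "
              f"{semicircle_diameter(spectrum):8.2f} ohm")
```

For measured data, a least-squares fit of the full equivalent circuit is the more reliable route; the intercept difference is only a quick check on the trend.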
Fig. 9 a Nyquist plot recorded for nanocomposites of ZnO/rGO in air and H2S gas with varying concentrations from 2 to 100 ppm and b mechanism of the H2S gas sensor derived from EIS circuit fitting data [11]
Machine Learning in Sensor Application
Today, a life without sensors is unimaginable: they are found in the remotes that control our devices, in smartphones, and in every segment from agriculture to healthcare. The human body is itself packed with a host of sensors of high selectivity and sensitivity. The reaction of the eye to dark and bright light and the detection of smell by the nose are a few examples [12]. These features have inspired humans to develop many devices and tools, from robots for surgical applications to algorithms for speech therapy. Piezoelectric sensors have aided the development of tactile sensors that enable a mechanical device to sense touch as human skin does [13]. An electronic nose, modeled on the human olfactory system, is presently under development to detect the chemical species present using an array of gas sensors. Wearable devices that detect and monitor human parameters, from measuring blood pressure and recording an electrocardiogram to monitoring the blood sugar level of diabetic patients, are a few examples in which electronics and computer science are integrated with ML and artificial intelligence (AI). The challenge here is to handle the large volume of data and to send alert messages to physicians when a patient's health is deteriorating. The role of ML is immense, as false alarms should not be raised while the data must still be processed accurately and stored for future use. Gas sensors need to be smart because a small increase in the atmospheric concentration of toxic gases such as ammonia, hydrogen sulfide, nitrous oxide, formaldehyde, and carbon monoxide to parts per million levels and above can be fatal. These gases are common in industrial environments, from oil and gas exploration to refining industries, mining, and sewage treatment plants. The fabrication of a smart gas sensor involves assembling an array of procedures and practices that are controlled by AI algorithms. These systems would enable us not only to prevent accidents but also to save precious human lives. Figure 10 illustrates the model for developing ML and AI-based real-time applications that monitor the chemical constituents in air and water, ranging from the contactless detection of diseases such as coronavirus to gas sensor alarms [14]. This can be achieved with materials ranging from nanomaterials to organic polymers, combined with a suitable technique to coat the surfaces, such as physical or chemical vapor deposition, and integrated with FET and micro-electromechanical systems (MEMS). AI and ML algorithms would coordinate the type of catalysis to be detected by the gas sensor (GS) array under varying environmental conditions for a host of applications. The development and fabrication of the sensor should be integrated with ML and AI tools, not only to process the signals but also to create a database and provide periodic analysis to monitor the situation.
Fig. 10 Illustration of the design of functional gas sensors and their applications [14]
4 Machine Learning in Impedance Analysis
In EIS, the electrochemical system is viewed as an equivalent circuit made up of fundamental elements such as resistance (R), capacitance (C), and constant phase elements (CPE), which can be arranged in series or parallel. Using the electrochemical meaning of these components, the structure of the equivalent circuit and the value of each element can be determined, and the specifics of the electrochemical system can be examined [8, 15]. Since the interpretation rests on the premise of an equivalent circuit, selecting the right equivalent circuit model is a vital step. Currently, the standard procedure involves shortlisting several viable models based on the application, followed by mathematical fitting to reproduce the measured pattern. The candidate models that agree with the experiment are then compared to determine the most reasonable choice, which involves arbitrary criteria and judgments [16]. The use of machine learning (ML) technologies can help to remove this human subjectivity. Machine learning refers collectively to a set of techniques for studying multivariable non-linear systems [17]. Machine learning approaches, such as data analysis based on the correlations between different properties of the dataset, can provide the information needed to define the features relevant to a particular application. Support vector machines (SVM) are well suited to identifying EIS plots, as they handle small, non-linear samples effectively [18]. Classification of the impedance curve is the first task, identifying the pattern of changes before circuit fitting is carried out. Although impedance data are plotted in two dimensions, recognizing them is a high-dimensional pattern recognition problem: a spectrum has few data points, and giving them incorrect weightage would change the predicted mechanism. Zhu et al. [19] trained an SVM on a database of 500 sets of EIS data to identify the matching equivalent circuits with minimal error. SVM treats the individual data points as multi-dimensional vectors, and hyperplanes in this space are used to distinguish and classify them. The ideal hyperplane is the one that maximizes the distance to the nearest data point on each side. SVM is deployed in pattern recognition problems such as portrait recognition and text classification. Compared with other algorithms, SVM achieved the best overall performance in identifying the most suitable equivalent circuit model for a given EIS spectrum.
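The maximum-margin idea behind SVM can be made concrete with a small example. The sketch below does not train an SVM; it only computes the geometric margin of a few candidate hyperplanes and keeps the one with the largest margin, which is the criterion an SVM optimizes over all possible hyperplanes. The two-feature points, the class labels, and the candidate hyperplanes are hypothetical stand-ins for features extracted from EIS spectra.

```python
import math


def geometric_margin(points, labels, w, b):
    """Smallest signed distance from any labelled point to the hyperplane w.x + b = 0.

    A positive value means every point lies on the side given by its label.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, label in zip(points, labels)
    )


def best_hyperplane(points, labels, candidates):
    """Pick the candidate (w, b) with the largest geometric margin."""
    return max(candidates, key=lambda wb: geometric_margin(points, labels, wb[0], wb[1]))


# Hypothetical feature vectors for four spectra and their circuit classes.
POINTS = [(0.0, 0.0), (0.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
LABELS = [-1, -1, +1, +1]  # -1: circuit model A, +1: circuit model B

# Three candidate separating hyperplanes, written as (w, b).
CANDIDATES = [((1.0, 0.0), -1.0), ((1.0, 0.0), -0.5), ((1.0, 1.0), -1.5)]

if __name__ == "__main__":
    for w, b in CANDIDATES:
        print(f"w = {w}, b = {b}: margin = {geometric_margin(POINTS, LABELS, w, b):.4f}")
    print("largest margin:", best_hyperplane(POINTS, LABELS, CANDIDATES))
```

In practice, a library implementation such as the `SVC` class in scikit-learn solves the full optimization problem and supports kernels for data that are not linearly separable.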
Database optimization is one of the vital constraints of SVM, but high levels of accuracy in distinguishing the matching equivalent circuit models of EIS can be achieved, which helps to predict the most plausible mechanism of a gas sensor. The EIS database was built by extracting data from more than 500 research articles, since earlier attempts failed because of the low volume of data selected for implementing ML. The disadvantage is that researchers do not all work on the same system or use the same parameters to record the EIS spectra and fit the equivalent circuit. The user base of electrochemical impedance spectroscopy is itself limited owing to the complex nature of the technique, although it is widely used by chemists, physicists, materials scientists, and electrical engineers. An efficient model can be developed only by selecting accurate data after discussion with the authors and with a good understanding of the EIS parameters and the equivalent circuits adopted, so that the trained model gives an accurate analysis. The open-source software WebPlotDigitizer, a web-based tool for extracting numerical data from images, has been widely used for circuit fitting analysis with Nyquist plots [19, 20]. The data points are labeled on the X and Y axes of the original figure before ML tools are applied; the code for WebPlotDigitizer is available as open source at https://github.com/ankitrohatgi/WebPlotDigitizer [21]. The EIS analysis steps are summarized in Fig. 11 [19]. The steps include extracting raw data from the reported curves as data points that can be processed and stored in databases. This is followed by the application of ML tools to generate the equivalent circuit for the curves [22]. The data available exclusively for gas sensors may be limited, but electrochemical data from corrosion, coatings, electrode systems, batteries, fuel cells, and non-renewable energy storage devices are also considered. ML has been applied in the area of biosensors for EIS-based applications. Zhu et al. [19] deployed ML to design and fabricate impedance spectroscopy biosensors for E. coli, the common bacterium present in healthy human intestines. ML is used in this biosensor because it can handle multi-parameter, non-linear problems and can train itself based on the response. Principal Component Analysis (PCA) and Support Vector Regression (SVR), along with several other ML tools, were deployed to process the electrochemical impedance data for the E. coli biosensor, as illustrated in Fig. 12. Further processing of the impedance curves and their data analysis would increase the accuracy of detecting E. coli, and the approach can be extended to other organisms.
Fig. 11 Schematic representation of ML steps involved in processing electrochemical impedance spectroscopy data [19]
Fig. 12 Illustration of the development of the impedance-based E. coli biosensor system with machine learning tools [19]
The biosensor was trained with PCA and SVR machine learning tools on the MATLAB platform. In this process, the electrochemical impedance was recorded at four different concentrations of E. coli, and PCA was executed on the data retrieved from the impedance analysis. These four components of bacterial concentration served as inputs to the SVR, which simulated the detection output model. These inputs and outputs were used to train and test the model. Genetic algorithms (GAs) were applied over successive iterations to find the best forecasting results. In each iteration, the chromosomes that encode candidate solutions are altered by induced mutations until the best model is obtained.
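The PCA step of such a pipeline can be sketched in a few lines. The original work used MATLAB; the Python snippet below is not that implementation. It is a minimal illustration that finds the first principal component by power iteration on the covariance matrix (a substitute for a full eigendecomposition) and projects the data onto it. The feature matrix is hypothetical, and the subsequent regression (SVR) and genetic algorithm steps are not shown.

```python
import math


def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(rows, iterations=200):
    """Dominant eigenvector of the covariance matrix via power iteration.

    Assumes the start vector is not orthogonal to the dominant direction.
    """
    cov = covariance(center(rows))
    d = len(cov)
    v = [1.0 / (j + 1) for j in range(d)]  # deliberately asymmetric start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows, component):
    """Scores of each centered row along the given component."""
    return [sum(x * c for x, c in zip(r, component)) for r in center(rows)]


if __name__ == "__main__":
    # Hypothetical impedance features (one row per measurement).
    features = [[120.0, 35.0, 4.1], [150.0, 42.0, 4.9],
                [210.0, 61.0, 7.2], [260.0, 74.0, 8.8]]
    pc1 = first_principal_component(features)
    print("first principal component:", [round(x, 4) for x in pc1])
    print("scores:", [round(s, 3) for s in project(features, pc1)])
```

The scores obtained in this way are the kind of reduced inputs that a regression model could then be trained on.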
5 Conclusions
Electrochemical impedance spectroscopy is a complex technique in which the Nyquist and Bode plots have to be processed and data extracted by fitting suitable equivalent circuits. Analyzing the data requires an understanding of the fundamentals of EIS so that the correct weightage is given to the data points on the curves obtained. Machine learning tools aid in the analysis of the complex shapes of EIS data and in the fitting of equivalent circuit models. The impedance technique has been employed to fabricate toxic gas sensors and biosensors with graphene-based nanomaterials. EIS has a challenging future, as ML-based tools and algorithms would enable experiments to be performed and recorded continuously to generate highly accurate and sensitive data for sensor applications. The large volumes of data generated during circuit fitting iterations have to be processed, and EIS data recorded in different frequency domains have to be decoded under dynamic conditions. This would change the sphere of electrochemical research and aid the fabrication of new sensor devices with high accuracy and very low detection limits. It would also serve as an ideal platform to propose the sensing mechanism and to evaluate the stability, selectivity, and sensitivity of the sensor. Together, EIS experimental data and ML numerical simulations would lead to strategies for the next generation of experiments, with simulation solutions that reduce processing time and can be applied in industrial, healthcare, and environmental protection applications.
Acknowledgements The authors acknowledge RUSA-MHRD (COMMUNICATION No: C3/RI & QI/PF4-Appoint/Theme-2/Group-1/2021/150) for the financial support and the facilities under UGC SAP DRS-I (Ref. No.F.540/16/DRS-I/2016 (SAP-I)), Department of Analytical Chemistry, University of Madras, Guindy Campus, Chennai-600025, India. Dr. VB is supported by the Council of Scientific & Industrial Research, New Delhi, India, with a Senior Research Fellowship (Ref. No. 09/115/0791/2019-EMR-I).
References
1. Beyranvand, S., Gholami, M. F., Tehrani, A. D., Rabe, J. P., & Adeli, M. (2019). Construction and evaluation of a self-calibrating multiresponse and multifunctional graphene biosensor. Langmuir, 35(32), 10461–10474.
2. Sangeetha, M., & Madhan, D. (2020). Ultra sensitive molybdenum disulfide (MoS2)/graphene-based hybrid sensor for the detection of NO2 and formaldehyde gases by fiber optic clad modified method. Optics & Laser Technology, 127, 106193.
3. Fu, H., Jiang, Y., Ding, J., Zhang, J., Zhang, M., Zhu, Y., & Li, H. (2018). Zinc oxide nanoparticle incorporated graphene oxide as sensing coating for interferometric optical microfiber for ammonia gas detection. Sensors and Actuators B: Chemical, 254, 239–247.
4. Şenol, A. M., Onganer, Y., & Meral, K. (2017). An unusual “off-on” fluorescence sensor for iron (III) detection based on fluorescein-reduced graphene oxide functionalized with polyethyleneimine. Sensors and Actuators B: Chemical, 239, 343–351.
5. Seo, G., Lee, G., Kim, M. J., Baek, S. H., Choi, M., Ku, K. B., Lee, C. S., Jun, S., Park, D., Kim, H. G., & Kim, S. J. (2020). Rapid detection of COVID-19 causative virus (SARS-CoV-2) in human nasopharyngeal swab specimens using field-effect transistor-based biosensor. ACS Nano, 14(4), 5135–5142.
6. Gerasimenko, T., Nikulin, S., Zakharova, G., Poloznikov, A., Petrov, V., Baranova, A., & Tonevitsky, A. (2020). Impedance spectroscopy as a tool for monitoring performance in 3D models of epithelial tissues. Frontiers in Bioengineering and Biotechnology, 7, 474.
7. Rheaume, J. M. (2010). Solid state electrochemical sensors for Nitrogen Oxide (NOx) detection in lean exhaust gases. UC Berkeley. ProQuest ID: Rheaume_berkeley_0028E_10549. Merritt ID: ark:/13030/m5nk3k1r. Retrieved from https://escholarship.org/uc/item/7g8290w7
8. Balasubramani, V., Chandraleka, S., Subba Rao, T., Sasikumar, R., Kuppusamy, M. R., & Sridhar, T. M. (2020). Review—Recent advances in electrochemical impedance spectroscopy based toxic gas sensors using semiconducting metal oxides. Journal of the Electrochemical Society, 167, 037572.
9. Maity, A., Sui, X., Jin, B., Pu, H., Bottum, K. J., Huang, X., Chang, J., Zhou, G., Lu, G., & Chen, J. (2018). Resonance-frequency modulation for rapid, point-of-care Ebola-Glycoprotein diagnosis with a graphene-based field-effect biotransistor. Analytical Chemistry, 90(24), 14230–14238.
10. Zhang, Z., Fu, X., Li, K., Liu, R., Peng, D., He, L., Wang, M., Zhang, H., & Zhou, L. (2016). One-step fabrication of electrochemical biosensor based on DNA-modified three-dimensional reduced graphene oxide and chitosan nanocomposite for highly sensitive detection of Hg (II). Sensors and Actuators B: Chemical, 225, 453–462.
11. Balasubramani, V., Sureshkumar, S., Subba Rao, T., & Sridhar, T. M. (2019). Impedance spectroscopy-based reduced graphene oxide-incorporated ZnO composite sensor for H2S investigations. ACS Omega, 4(6), 9976.
12. Ko, H. C., Stoykovich, M. P., Song, J., Malyarchuk, V., Choi, W. M., Yu, C. J., Geddes III, J. B., Xiao, J., Wang, S., Huang, Y., & Rogers, J. A. (2008). A hemispherical electronic eye camera based on compressible silicon optoelectronics. Nature, 454(7205), 748.
13. Sundaram, S., Kellnhofer, P., Li, Y., Zhu, J. Y., Torralba, A., & Matusik, W. (2019). Learning the signatures of the human grasp using a scalable tactile glove. Nature, 569(7758), 698.
14. Chen, Z., Chen, Z., Song, Z., Ye, W., & Fan, Z. (2019). Smart gas sensor arrays powered by artificial intelligence. Journal of Semiconductors, 40(11), 111601.
15. Magar, H. S., Hassan, R. Y., & Mulchandani, A. (2021). Electrochemical impedance spectroscopy (EIS): Principles, construction, and biosensing applications. Sensors, 21(19), 6578.
16. Alzubaidi, L., Zhang, J., Humaidi, A. J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., Santamaría, J., Fadhel, M. A., Al-Amidie, M., & Farhan, L. (2021). Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data, 8(1), 1–74.
17. Linardatos, P., Papastefanopoulos, V., & Kotsiantis, S. (2020). Explainable AI: A review of machine learning interpretability methods. Entropy, 23(1), 18.
18. Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L., & Lopez, A. (2020). A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing, 408, 189.
19. Zhu, S., Sun, X., Gao, X., Wang, J., Zhao, N., & Sha, J. (2019). Equivalent circuit model recognition of electrochemical impedance spectroscopy via machine learning. Journal of Electroanalytical Chemistry, 855, 113627.
20. Letardi, P. (2000). Electrochemical impedance measurements in the conservation of metals. In Radiation in art and archeometry (pp. 15–39). Elsevier Science BV.
21. https://github.com/ankitrohatgi/WebPlotDigitizer
22. Gao, T., & Lu, W. (2021). Machine learning toward advanced energy storage devices and systems. iScience, 24(1), 101936.
Machine Learning in Wearable Healthcare Devices
Nitesh Sureja, Komal Mehta, Vraj Shah, and Gautam Patel
Abstract The defining characteristic of wearable technology is that it can be worn on, in, or very close to the body, which is why it is termed wearable. These devices are powered by microprocessors and enhanced with the capability to send and receive data through the Internet. The COVID pandemic highlighted the need for handy wearable devices for the continuous monitoring of patients. The fitness activity tracker was the first wearable to draw the world's attention toward these technologies. It was followed by the development of devices such as wearable ECG and BP monitors and biosensors. The technology has transformed into a full-fledged wearable IoT in a very short span. It is very important to collect and analyze the data generated by these devices in a way that makes diagnosis easy. However, the enormous volume of data generated makes effective analysis difficult. Advances in machine learning can help us cope with this problem. An effective machine learning algorithm applied to the collected data can perform preprocessing, feature selection, training, and testing, and produce an effective prediction of the patient's condition. This chapter mainly discusses the design, implementation, and effectiveness of applying ML to the data generated by these devices.
Keywords Wearable devices · Healthcare systems · Machine learning · Wearable IoT · Feature selection
N. Sureja
Computer Science and Engineering, KSET, Drs. Kiran and Pallavi Patel Global University, Vadodara, Gujarat, India
K. Mehta (B)
Civil Engineering Department, KSET, Drs. Kiran and Pallavi Patel Global University, Vadodara, Gujarat, India
e-mail: [email protected]
V. Shah · G. Patel
Department of Applied Chemistry, School of Science, ITM (SLS) Baroda University, Vadodara, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 N. Joshi et al. (eds.), Machine Learning for Advanced Functional Materials, https://doi.org/10.1007/978-981-99-0393-1_13
1 Introduction
Wearable devices are electronically operated devices that can be worn on the body. The smartwatch is a very popular example. Wearables take many forms, ranging from jewellery to medical devices. They also have a significant impact on education, communication, navigation, the health sector, sports, and entertainment. AI-based devices, Google Glass, and similar products that present virtual reality are more sophisticated examples of this technology. Every new facility tends to bring some hidden problems with it. The easy availability of very capable wearable devices for any application is now driving exponential growth in data. For example, in healthcare services, monitoring a patient's condition always generates a large amount of data to process. Given the quantity of data, digital approaches to reviewing it are essential, as manual review by humans is not feasible. Machine learning is increasingly recognized as a potent tool for handling massive amounts of data of any kind. The strength of ML lies in its large range of algorithms, which can be used to build mathematical models ranging from linear equations with a few variables to neural networks with large sets of variables [1]. Today, machine learning plays a role in processing the large amounts of data produced in order to make decisions or predict results. Figure 1 depicts the different applications of wearable devices in day-to-day life [1].
Fig. 1 Model of wearable device application [1]. Reproduced with permission from HINDAWI. Copyright ©2022
The rest of this chapter is organized as follows. In Sect. 2, we describe wearable devices in detail. Section 3 discusses wearable devices in healthcare. Section 4 discusses key challenges with wearable healthcare devices. In Sect. 5, we introduce the concept of machine learning. Section 6 describes applications of machine learning in wearable healthcare devices. We conclude and outline potential directions for further advancements in this subject in Sect. 7.
2 Wearable Devices
Digitalization has enabled a wide variety of devices that can be worn on the human body and monitor it continuously. They also provide real-time data, allowing the required parameter to be tracked. The sensors link the activity of interest to the gadget to obtain monitoring data. Technologically, wearables are claimed to be the biggest invention since the smartphone. They effectively support various fields such as healthcare, medicine, finance, transport, and many more. The objective is to support individuals by entering their daily life and acting as a functional part of their routine. Because they are handy, wearables are attracting considerable attention in business. They improve effectiveness in the workplace, as tracking, monitoring, rescue, and similar tasks become safe, easy, and fast. Researchers, engineers, and technocrats can increase their efficiency through hands-free access to data whenever and wherever they need it. A variety of these devices is currently available on the market. Figure 2 shows various healthcare devices that are aided by machine learning [2].
3 Wearable Healthcare Devices
Wearable healthcare devices can provide continuous data on critical aspects such as human behavior and physiological and biochemical parameters during daily routine. They are mostly used to measure medical parameters of the human body such as BP, ECG, oxygen level, temperature, and BCG. They can be attached to many items in daily use, such as clothes, gloves, earrings, and glasses, and tracking them provides detailed clinical data. Recent research has also shown that they can be attached to the skin and that their sensors can be embedded in components of the environment such as car seats, chairs, and mattresses. They work like remote sensing systems: they collect the data and transmit it to a remote server for storage and various types of analysis. Examples in the field of healthcare monitoring include accelerometers, multi-angle video recorders, and gyroscopes. This technology is promising for solving healthcare problems. Its applications are designed for the prevention, maintenance, and management of various diseases. Applying this technology to clinical problems can influence decision-making. As the devices are handy and remotely operated, regular monitoring of patients is easily possible even outside the hospital. Applying AI to the huge amount of data received through this technology presents challenges and opportunities that researchers can address.
Fig. 2 Various healthcare devices which are fully aided by machine learning and artificial intelligence [2]. Reproduced with permission from RSC. Copyright ©2022
Various examples of wearable healthcare devices are described here. To begin with, the measurement of human biofluids is very important, and as a result, researchers have developed AI-based electrochemical sensors that can analyze human sweat. Human sweat contains various analytes such as glucose, lactate, urea, and nucleotides. Hence, detecting these quantities can provide an indication of human health. Using a simple printing process, Li et al. developed a highly integrated sensing paper (HIS paper) with MXene/methylene blue (Ti3C2Tx/MB) as the active material and foldable all-paper substrates as sweat analysis patches. The HIS paper includes a signal processing system for sweat analysis that can sense glucose and lactate in real time (Fig. 3a–e). The multiple functional parts of the HIS paper were printed on the paper substrate before being folded into a three-dimensional (3D) structure. Driven by capillary force, sweat moves effectively in the vertical direction through the paper substrate because the area of the hydrophilic region widens layer by layer. The sensor is an affordable and adaptable solution for biochemical platforms such as wearable bioelectronics. In addition, methylene blue (MB) has a synergistic effect on Ti3C2Tx, accelerating charge migration and improving electrochemical
Machine Learning in Wearable Healthcare Devices
285
performance during sweat analysis. The all-paper sweat sensor has the advantages of being inexpensive, compact, and simple. The novel 3D structural design promotes the absorption of perspiration from the skin's surface while preventing fluid accumulation. Additionally, Xu et al. developed an electrochemical sensor that detects uric acid (UA) using PEDOT:PSS (poly(3,4-ethylenedioxythiophene):poly(styrene sulfonate)) hydrogels (Fig. 3f, g). Measurements over the range 2.0 to 250 μM revealed an ultrahigh sensitivity of 0.875 μA μM⁻¹ cm⁻² and a low limit of detection (LOD) of 1.2 μM (S/N = 3). Human sweat was tested before and after consumption of a high-purine meal. The sensor was also tested for long-term UA monitoring of a real sweat sample, and it showed significant promise for use as an on-body wearable platform [3, 4]. As per Fig. 4a–e, Zhao et al. proposed a wearable electrochemical biosensor for on-body detection of sodium ions and lactate in sweat during perspiration, relying on sensing electrodes decorated with ZnO nanowires (NWs). The biosensor, as well as the signal reading and transmission circuitry, was fully integrated into a
Fig. 3 a Sweat analysis during physical exercise with the HIS paper-based device installed on the arm, as photographed. b Wireless detection method, c a flexible circuit board in close-up. d An illustration of the analytical system. e Participants’ sweat glucose (Glu.) and sweat lactate (Lac.) concentration curves, f the on-body device configuration was applied to the subject’s upper arm [3]. g Images of the microfluidic device for sweat extraction and skin operation, both cross-sectional and top-down [4]. Reproduced with permission from Elsevier. Copyright ©2021. a Wearable biosensor design for detecting sweat on the body during physical activity. b A user wearing a headband. c An illustration of the signal readout headband. d The signal readout circuit’s schematic design. e The headband in cross-section [5]. Reproduced with permission from Elsevier. Copyright ©2021
Fig. 4 ROAMM framework conceptual diagram. An accelerometer, heart rate monitor, and GPS are among the sensors found in smartwatches. Its networking features and flexibility in constructing customizable apps make it an ideal platform for implementing a real-time activity monitoring system [6]. Reproduced with permission from Elsevier. Copyright ©2019
sweat headband capable of wirelessly transmitting data to a smartphone. Low-temperature hydrothermal growth was employed to synthesize ZnO NWs directly on thread carbon electrodes, and the modified electrodes were fitted with highly specialized sensing membranes for sodium ion and lactate detection [5]. A smartwatch example is also included here. Smartphone and smartwatch technology is transforming how patients and research participants share real-time health information. Flexible, bidirectional, and real-time communication control makes it possible to create a broad variety of healthcare apps that interact with users and react dynamically to their changing environment. Additionally, smartwatches are equipped with a number of sensors that may collect data
on physical activity and location. The combination of all of these elements enables the collected data to be sent to a remote computer, allowing for real-time monitoring of physical and perhaps social behavior. Smartwatches are well suited to tracking activities over extended periods of time in order to explore physical activity patterns in free-living situations and their link with seemingly randomly occurring diseases, which has remained a difficulty in existing research. To address this, Martin et al. created a smartwatch-based framework for real-time online assessment and mobility monitoring (ROAMM). The ROAMM framework comprises a smartwatch application and a server. The smartwatch application gathers and preprocesses data. The server is used for data storage and retrieval, as well as remote monitoring and other administrative functions. ROAMM blends sensor-based and user-reported data collection, enabling real-time data presentation and summary statistics. Figure 4 depicts a high-level overview of this framework [6]. In addition, Zhou and colleagues created a luminescent wearable sweat tape (LWST) biosensor by embedding multi-component nanoprobes in paper substrates with microwell patterns made from hollowed-out double-sided tapes. To detect UA, alcohol, and glucose in blood samples, the researchers employed enzyme-embedded gold nanoclusters (AuNCs) coated in MnO2 nanosheets (NSs). Alcohol, UA, and glucose each produce a distinct luminescent signal, distinguished by color: green, red, and yellow, respectively. Although the LWST biosensors cannot yet detect variations in the concentration of the analytes of interest in sweat, future studies are planned to address this. The authors also recorded colorimetric pictures and converted them digitally into RGB (red, green, and blue) signals using a smartphone app to enhance visual assessment.
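The conversion of a photographed colorimetric patch into RGB signals can be illustrated with a minimal Python sketch. It is not the smartphone app used by the authors: the function names `mean_rgb` and `channel_fraction`, the pixel values, and the choice of a channel share as the readout are all assumptions made for illustration.

```python
from statistics import mean

def mean_rgb(pixels):
    """Average the R, G, and B channels over a region of interest.

    `pixels` is a list of (r, g, b) tuples with 0-255 values, for example
    the pixels inside one microwell of a photographed patch.
    """
    if not pixels:
        raise ValueError("region of interest is empty")
    reds, greens, blues = zip(*pixels)
    return mean(reds), mean(greens), mean(blues)

def channel_fraction(rgb, channel):
    """Return one channel's share of the total intensity (0.0 to 1.0)."""
    total = sum(rgb)
    if total == 0:
        return 0.0
    return rgb["rgb".index(channel)] / total

# Hypothetical region of interest: four pixels from a reddish microwell.
roi = [(200, 40, 30), (210, 50, 35), (190, 45, 25), (205, 42, 33)]
rgb = mean_rgb(roi)
red_share = channel_fraction(rgb, "r")
```

Normalizing a channel by the total intensity is one simple way to reduce the effect of overall brightness differences between photographs. Mapping such a value to a concentration would require a calibration curve measured with known standards.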
To detect UA, glucose, and alcohol, the researchers developed LWST biosensors that use enzyme-embedded AuNC assemblies wrapped with MnO2 NSs as the probe. Owing to its permeable design and the weak bonding within the framework, such as van der Waals forces, the enzyme-based nanogel framework may enrich the relevant sweat analytes (Fig. 5a–i). The schematic design of a sweat sensor based on enzyme/AuNCs@PAH@MnO2 NSs nanoprobes is shown in Fig. 5j. This wearable device was built from two identical double-sided tapes held together by their adhesive surfaces. The microwells were made by manually punching one of the double-sided tapes. Because of the stickiness of the exposed microwells, the functionalized filter sheets adhered firmly in the microwells and acted as independent reaction chambers for detecting the target analytes [7]. Sharma et al. [8] also demonstrated a non-invasive colorimetric sensor for identifying ketones. On a prepared cotton fabric, they alternately applied poly(2-(dimethylamino)ethyl methacrylate) (pDMAEMA) and sodium nitroprusside (SNP). The first layer (pDMAEMA) cationized the cotton surface, boosting its affinity for absorbing the second layer (SNP). The color of the treated cotton fabric results from the incorporation of SNP ions on the cotton surface. Thanks to the sensor's architecture, specific analytes can be detected quickly, cheaply, repeatedly, and reliably. Figure 5k displays the schematic design for producing the sensor, whereas Fig. 5l depicts the most likely sensing mechanism [8]. On the other hand, information on machine learning research for healthcare wearables, covering tasks such as fall detection, activity recognition, calorie tracking, fitness
Fig. 5 a Colorimetric responses of uricase/hGGAuNCs@PAH@MnO2 NSs to UA at concentrations ranging from 0 to 50 μM, studied under UV light. b Colorimetric responses of GOx/HG-AuNCs@PAH@MnO2 NSs to glucose concentrations between 0 and 1 mM, studied under UV light. c Colorimetric responses of ADH/aGGAuNCs@PAH@MnO2 NSs to alcohol concentrations between 0 and 15 mM, studied under UV light. d Color intensity versus UA concentration from 0 to 50 μM. e Color intensity versus glucose concentration in the 0–1 mM range. f Color intensity versus alcohol concentration from 0 to 15 mM. In human studies, the designed LWST biosensor was used to detect g UA, h glucose, and i alcohol in situ. j Schematic showing how LWST biosensors are made and how a smartphone is used for sensing. k Schematic of the layer-by-layer construction on cotton fabric; the inset shows the colors produced by different ketones on the treated cotton. l Illustration of the color-sensing mechanism used to distinguish different ketones [7, 8]. Reproduced with permission from Elsevier. Copyright ©2021
tracking, and stress detection is summarized in Table 1. The table lists the task, the machine learning (ML) technique, and the sensor or signal utilized.
3.1 Prevention of Diseases and Maintenance of Health
Globally, changes in lifestyle have increased the risk of chronic conditions, heart attacks, disabilities, and many other health problems at a younger age [9]. In elderly people, wearable technology can be used effectively to prevent falls. This
Table 1 Machine learning techniques utilized against various tasks in terms of wearable healthcare devices [1]

| Sr. No | Task | Signal/sensors utilized | Machine learning technique(s) |
|---|---|---|---|
| 1 | Fall detection | 3D accelerometer and gyroscope in smartphone | J48 (96.7%), logistic regression (94.9%), MLP (98.2%) |
| 2 | Activity recognition | Accelerometer and gyroscope | CNN; UCI-HAR dataset and study set (UCI-HAR dataset: 95.99%, study set: 93.77%) |
| 3 | Eating monitoring | 3D accelerometer | Proximity-based active learning |
| 4 | Fitness tracking | 2 accelerometers (hip and ankle) | Logistic regression |
| 5 | Stress detection | ECG, GSR, body temperature, SpO2, glucose level, and blood pressure | BN, SVM, KNN, J48, RF, and AB learning methods; neural network model (92% accuracy for metabolic syndrome patients and 89% for the rest) |
| 6 | Arrhythmia detection | ECG and PPG sensors | SVM and K-medoids clustering-based template learning |
| 7 | Seizure detection | Accelerometer and electrodermal activity from Empatica Embrace | SVM (sensitivity > 92% and bearable FAR of 0.2–1) |
| 8 | Rehabilitation tasks | IMU sensor module and plantar pressure measuring foot insoles | K-means clustering, SVM, and artificial neural network (ANN) |
| 9 | Hydration monitoring | Acoustic sensor | SVM for drinking detection; gradient boosting decision tree for activity recognition |
can also detect precise walking and activity patterns [10]. Falls or seizures can be prevented with the application of a genetic algorithm and two triaxial accelerometer bracelets. Researchers showed that falls can be detected at different phases of activity using wireless accelerometers and classification algorithms [11]. They also presented fall detection with the help of a database of routine activities [12]. Regular and continuous monitoring of prolonged sedentary behavior proved useful for dealing with the multiple undesirable health outcomes linked to it. Research was undertaken to check whether reminders or alarms can help people change their posture and maintain their wellbeing [13]. This experiment was successful, with many positive outcomes for participants who received help in maintaining their wellbeing by changing
their posture from time to time. Researchers also presented a case study in which they tracked communication between child and mother to check the effectiveness of this technology in language tracking; feedback drawn from the database was given to the mother, and the experiment proved successful [14].
3.2 Mental Status Monitoring
Recently, this technology has also entered the field of mental health monitoring. Datasets of BP, ECG, BCG, temperature, and similar signals help characterize human physiology, on the basis of which stress conditions can be monitored. In one experiment applying this technology to children, heart rates and audio signals were monitored, and the study showed positive results in correlating the data received with the child's mental condition [15]. In other research, mental health was monitored by correlating stress level with skin conductance; for this purpose, simple electrodermal activity (EDA) sensors were developed [16]. They gave accurate readings of stress conditions and helped in monitoring the mental health of the group of people taking part in the experiments. The technology has also proved helpful in sports, where it helps coaches find the required matches and train athletes systematically. One case study showed that this technology helps in monitoring the jumping load of volleyball players. The resulting database proved useful for jump-specific training and for competition [17]. By monitoring temperature and humidity, a fuzzy logic-based technique helped prevent heat stroke in people doing extensive exercise in high temperatures [18]. Health-conscious people increasingly use devices based on this technology for weight control; three major fitness trackers used for this task are the Fitbit Charge HR, Apple Watch, and Garmin Forerunner 225. In 2017, Yang et al. presented a "Happy Me" device based on this technology for controlling childhood obesity [19]. A random sampling test was carried out on grade 5 and 6 students to assess the effectiveness of this device. The test provided scientific evidence of the device's effectiveness for weight control.
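To make the idea of a fuzzy logic-based technique driven by temperature and humidity more concrete, the following Python sketch scores heat-related risk from two inputs. It is not the method of [18]: the membership functions, every threshold, the three rules, and the weighted-average defuzzification are hypothetical choices made only to illustrate the general mechanism.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def ramp_up(x, start, end):
    """Membership rising linearly from 0 at `start` to 1 at `end`."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)

def heat_risk(temp_c, humidity_pct):
    """Return a 0-1 risk score from three illustrative fuzzy rules."""
    # Fuzzify the inputs (all breakpoints are assumed values).
    hot = ramp_up(temp_c, 28.0, 38.0)
    warm = triangular(temp_c, 22.0, 30.0, 38.0)
    cool = 1.0 - ramp_up(temp_c, 22.0, 30.0)
    humid = ramp_up(humidity_pct, 50.0, 90.0)
    dry = 1.0 - humid
    # Rule 1: IF hot AND humid THEN risk is high (AND = min).
    high = min(hot, humid)
    # Rule 2: IF warm AND humid THEN risk is moderate.
    moderate = min(warm, humid)
    # Rule 3: IF cool OR dry THEN risk is low (OR = max).
    low = max(cool, dry)
    # Weighted-average defuzzification with outputs 1.0, 0.5, and 0.0.
    strength = high + moderate + low
    if strength == 0:
        return 0.0
    return (1.0 * high + 0.5 * moderate + 0.0 * low) / strength
```

A wearable application could compare the returned score with an alert threshold and warn the user during exercise. Any real deployment would need clinically validated membership functions and rules.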
3.3 Patient Management
Wearable devices support the effective management of patients' conditions in hospitals. Researchers want to employ wearable technology to spot health abnormalities early on. Researchers have introduced new point-of-care devices that are more useful for patient monitoring because of their wireless properties [20]. In emergency medical services (EMS) and intensive care unit (ICU) environments, one example involves integrating clothing with wearables such as portable sensors and gadgets, which provides continuous monitoring of hazards that threaten a patient's life. This technology also supports patient mobility.
Monitoring of routine activities has applications in managing patients in critical condition [21]. It is also very effective for increasing self-management abilities. It is useful to many health-conscious people for precise monitoring of weight, diet, and other routine activities. Patients such as endometrial cancer survivors, who are physically inactive, may improve their condition through regular monitoring of routine activities with such a device [22]. In a case study checking the effectiveness and acceptability of the Fitbit Alta, a physical activity monitor, the device was effective for 25% of the targets. Its data revealed an insufficiently active population [23]. Researchers created a monitoring system using this kind of device for patients with brain and spinal cord injuries who need regular exercise for fast recovery. They used embedded inertial and mechanomyographic sensors, algorithms for identifying functional movement, and a graphical user interface that presents significant information to patients to support self-exercise programs [24]. Such patients need to observe some medically important parameters that they cannot interpret themselves. The device therefore provides a dataset that can be transferred to the medical advisor to make their treatment more effective. With this study, the researchers sought to support the therapy of such patients [25]. Patients with severe breathing problems were supported by a novel multimodal sensor-based application that includes a specific set of required exercises, tracking, performance data, and guidance [26].
3.4 Disease Management
Diseases can be managed more effectively with the application of this technology. Applications have been developed for cardiovascular monitoring of cardiac patients. A wearable ECG monitoring device with low power consumption has been produced [27]. Monitoring of heart rate variability has also been developed with this kind of device. In one case study, a system based on this technology made ECG recordings possible [28]. For assessing the three-dimensional seismocardiogram (SCG) in daily life, a textile-based device has been developed that can unobtrusively record ECG, respiration, and other signals. Researchers have also developed a portable, continuous ballistocardiogram (BCG) monitor worn at the ear that can reveal important information about cardiac contractility and its regulation [29]. The authors of [30] investigated the feasibility of using a wireless digital watch as a monitoring device capable of recording patient vital signs. The researchers compared it with current clinical devices. On a statistically considerable dataset, the comparison between the developed device and the current clinical device gave satisfactory results, with reliable heart rate values for 80% of the samples. Tracking devices based on this technology give effective results for tracking blood disorders and have in turn attracted the attention of health sector professionals [31]. According to a 2015–16 survey report in the US, the prevalence of hypertension
in adults was 29%, and these devices identify it from physiological signals [32]. The most effective devices in use include BP evaluation and monitoring devices, cuff-less BP sensors, portable smartphone-enabled upper-arm BP monitors, and mobile and remote monitoring applications. They help patients improve their high BP condition and obtain timely medical advice through simple continuous BP monitoring, better connectivity with medical advisors, and reminder alerts [33]. The study of hemodynamics is concerned with blood flow. Researchers have been studying how people with orthostatic hypotension change their body position and how abnormal hemodynamics are connected to postural changes. They used a cephalic laser meter, which patients wore while sitting, standing, or crouching, to assess blood flow. While the patient is standing, it efficiently detects cephalic hemodynamics and signs of cerebral ischemia [34]. Certain parameters, such as oral or injectable medication, physical workout, quality of sleep, stress level, and diet, affect the diabetic condition; to improve it, blood glucose dynamics must be tracked continuously in relation to these parameters. Current technology helps individual patients, based on their own real-time data for these parameters, to avoid critical conditions [35]. An AI-based wearable device for diabetes management is implanted with an insulin pump [36]. For medical problems such as diabetic glucose control, newly developed closed-loop control is available to help manage type 1 diabetes. With its application, glycemic control in adults with this condition improved [37]. To simplify matters for patients with diabetes mellitus, experiments have been carried out using Google Glass [38].
Wearable technology holds great promise for the management of Parkinson's disease because it can gather extensive data that shed light on diagnosis and on the results of therapeutic interventions. Because bradykinesia is one of the main signs of Parkinson's disease, it is frequently used to gauge the severity of the condition. To evaluate the degree of parkinsonian bradykinesia, researchers created a wearable device [39]. Many subjective evaluations of the degree of dyskinesia in Parkinson's disease patients do not offer long-term follow-up. In a different investigation, a motion capture device was used to gather patient kinematic data and create an objective dyskinesia score [40]. For children with autism, it is very important to be able to recognize and express emotions such as anger, disgust, fear, happiness, sadness, and surprise. Researchers carried out an experiment in which a group of children with autism spectrum disorder (ASD) wore a device that used Google Glass for therapy [41]. It gave positive results and proved effective for the emotion recognition performance of the children with ASD who wore it. This technology can also provide support in the selection, identification, and monitoring of psychiatric disorders such as depression. Research on cognitive and autonomic reactions to emotionally significant inputs may make possible the automatic identification of various mood states under both normal and pathological circumstances. In a different study, a system-on-chip (SoC) approach
was used to build a wearable depression monitoring system for a particular application. To increase the success rate of depression detection, the system sped up the filtering and extraction of heart rate variability (HRV) from the electrocardiogram (ECG) [42].
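As an illustration of what extracting heart rate variability from an ECG involves once the R peaks have been located, the Python sketch below computes two standard time-domain HRV measures, SDNN and RMSSD, from R-peak timestamps. It is not the system-on-chip implementation of [42]. The R-peak times are hypothetical and are assumed to come from an upstream peak detector, and the sample standard deviation is used for SDNN.

```python
import math

def rr_intervals_ms(r_peak_times_s):
    """Convert R-peak timestamps (seconds) to successive RR intervals (ms)."""
    return [
        (later - earlier) * 1000.0
        for earlier, later in zip(r_peak_times_s, r_peak_times_s[1:])
    ]

def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    variance = sum((rr - mean_rr) ** 2 for rr in rr_ms) / (len(rr_ms) - 1)
    return math.sqrt(variance)

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical R-peak times, as an upstream peak detector might report them.
peaks = [0.00, 0.80, 1.62, 2.40, 3.22, 4.00]
rr = rr_intervals_ms(peaks)
overall_variability = sdnn(rr)
beat_to_beat_variability = rmssd(rr)
```

Features such as these are typical inputs to a downstream classifier. Real recordings would also need artifact and ectopic-beat handling before the statistics are computed.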
4 Key Challenges with the Wearable Healthcare Devices
Commercial viability with a continuous flow of users is still a challenge for this technology. Devices are developed and then disappear from the market shortly afterward for lack of users. On the user side, several functional problems keep these products from being user-friendly, such as poor quality, difficulty in coordinating with smartphones, short battery life, and uncomfortable design. On the other hand, devices that are functionally sound and physically robust may still fail because they do not produce a positive impact on the user's routine life, behavior, or habits [43]. Some entrepreneurs regard this technology and the use of these devices as an upcoming opportunity that will drive efficient results and increased communication. As with the growth of the mobile market, this technology is attracting the attention of business, along with concern about data security breaches arising from its application. According to a report by PricewaterhouseCoopers, 86% of people who used such devices felt that their use poses a serious risk to data security and privacy. If the technology is to be applied at the company level, the company or industry needs a very strong policy and standard operating procedure (SOP) for responding to data breaches before it is introduced. The exponential accumulation of data is mostly being driven by the expanding availability of wearable devices (WD) and minimally invasive implantable health monitoring devices. Experts are no longer able to analyze this data manually. It is therefore vital to use computer-controlled techniques. As a powerful method for managing enormous amounts of any kind of data, machine learning (ML) is becoming more and more popular.
The field of machine learning (ML) comprises a wide variety of techniques used to train mathematical models, ranging from linear classifiers with only a few parameters to deep neural networks with millions of parameters to learn. The potential challenges associated with machine learning on wearable devices are illustrated in Fig. 6.
5 Machine Learning
ML is considered a branch of artificial intelligence (AI) that focuses on using algorithms to learn from data without the need for explicit programming. Because machine learning can learn from experience and adjust to new inputs, it can perform tasks much as a human does. Machine learning makes it possible to address shortcomings in the medical field, such as
Fig. 6 Healthcare ML applications on wearable devices face challenges [1]. Reproduced with permission from HINDAWI, copyright©2022
data handling, transforming patient care, and reforming organizational processes. Keeping records of data used to be a major administrative task, which machine learning has simplified. These data can be used as input for ML in health and medical projects. Machine learning makes possible applications ranging from fall prediction to live detection of heart attacks. It can also be used for the detection of critical diseases such as cancer; tracking of activity, workouts, diet, and fitness; identification of seizures; rehabilitation tasks; sleep monitoring; and many other possibilities. In automation, ML algorithms have proved effective for detection, monitoring, and alarming. The studies above showed promising results on retrospective data. The expanding number of ML applications in healthcare offers a glimpse of a future in which data, analytics, and innovation work together to assist countless patients without their even noticing [44].
6 Applications of Machine Learning in the Wearable Healthcare Devices
By drawing on past experience, machine learning enables wearable technology to act or make decisions in a specific situation. Based on the training data available, machine learning can be categorized as unsupervised, semi-supervised, supervised, or reinforcement learning. Past experience is encoded as data samples, which can be either labeled or unlabeled. For labeled data, the target variable may be either categorical or numerical. Machine learning is used for a variety of tasks,
including clustering for unlabeled data, regression for numerical labels, and classification for categorical target variables. The majority of machine learning research for wearable technology focuses on classification tasks, while some studies focus on clustering [45, 46] and a small number on regression problems [47]. The application of ML algorithms to study, analyze, and detect health conditions using data gathered from various body sensors has increased dramatically during the past ten years. Researchers have shown interest in a variety of subjects, including fall detection, cancer detection, activity recognition, eating monitoring, fitness tracking, stress detection, seizure detection, rehabilitation tasks, hydration monitoring, emotion recognition, sleep monitoring, disease diagnosis, and many others. The overall application of ML is shown in Fig. 7 [1]. According to the authors of [48], the accuracy for identifying falls was 96.82% with the random forest classifier and 99.80% with the KNN classifier. The work in [49] proposes a fall detection system (FDS) based on wearable sensors, such as accelerometers and gyroscopes, and machine learning for processing and classifying sensor signals. The technique extracts a number of features from various segments of the signal and classifies the segments as falls or regular daily activities. A Support Vector Machine (SVM) was trained on a manually annotated dataset and then used to categorize human behaviors into falls and routine activities. The study [50] proposed a wearable device that uses a deep learning algorithm to recognize six typical human activities: walking, moving up stairs, moving down stairs, sitting, standing, and lying. The authors created a waist-worn wearable device for individuals who are unable to wear electronics on their hands due to medical conditions.
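The general pipeline described above, in which features are extracted from a segment of the sensor signal and the segment is then classified as a fall or a routine activity, can be sketched in a few lines of Python. This is an illustration only and does not reproduce the systems in [48] or [49]: the two features, the training points, and the example window are hypothetical, and a small k-nearest-neighbor classifier is used in place of the SVM because it can be written with the standard library alone.

```python
import math
from collections import Counter

def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample, in g."""
    return math.sqrt(sum(axis * axis for axis in sample))

def window_features(window):
    """Summarize a window of samples as (peak, range) of the magnitude."""
    mags = [magnitude(s) for s in window]
    return (max(mags), max(mags) - min(mags))

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled feature vectors: (peak magnitude, magnitude range).
# "adl" stands for activities of daily living.
train = [
    ((1.1, 0.2), "adl"), ((1.3, 0.4), "adl"), ((1.2, 0.3), "adl"),
    ((3.2, 2.9), "fall"), ((2.8, 2.5), "fall"), ((3.5, 3.1), "fall"),
]

# Hypothetical window: quiet standing, a sharp impact, then lying still.
window = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (2.0, 1.5, 2.0), (0.0, 1.0, 0.1)]
label = knn_predict(train, window_features(window))
```

In practice the window would slide over a continuous recording, and many more features (for example from the gyroscope) would be used. The sketch keeps two features so that the feature extraction and classification stages remain easy to follow.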
Fig. 7 Healthcare machine learning tasks and the sensors used for each in the literature [1]. Reproduced with permission from HINDAWI, copyright©2022
Very successful techniques for virtual driving control and gait recognition were presented in the study [51]. These methods were developed using multimodal data from patients: surface electromyography (sEMG) signals for the upper limb, and triaxial acceleration and plantar pressure signals for the lower limb. The authors also developed wearable technology to detect human posture. The Support Vector Machine (SVM) method is used for both training and classification. The study [52] proposed a proximity-based active learning approach (PALS) for identifying eating occasions in wearable systems. This model significantly lowers the requirement for labeled data from new users. The authors performed a detailed analysis in both controlled and uncontrolled situations, and the findings indicate that the F-score of PALS ranges from 22 to 39% for between ten and sixty queries. Compared with state-of-the-art methods, off-line PALS also detects eating movements with up to 40% higher recall and 12% higher F-score. The authors of the study [53] developed a small wireless inertial sensing platform and applied it to eating detection by placing the sensor against the jawbone. They also created a data processing pipeline to recognize feeding episodes from inertial sensor data, and they tested the system in a lab setting as well as in real-world situations. In the lab, the system achieved precision and recall of 91.7% and 91.3%, respectively, compared with 92.3% and 89.0% in naturalistic settings. The study [54] presented a novel ML-based method for recognizing intentional jogging periods from conventional accelerometer data collected in the field. The study used measurements from two accelerometers placed at the hip and ankle of children and teenagers, together with the corresponding activity diary.
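Because the studies above report precision, recall, and F-score, a short Python sketch of how these metrics relate may be helpful. The functions below use the standard definitions; the function names are chosen for illustration, and the two F-scores computed at the end are derived here from the precision and recall reported for [53] rather than taken from that study.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from detection counts."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall, f_score(precision, recall)

# Precision and recall as reported for the two settings in [53].
lab_f1 = f_score(0.917, 0.913)
field_f1 = f_score(0.923, 0.890)
```

With these inputs the F-score works out to roughly 0.915 for the lab setting and 0.906 for the naturalistic setting. Since the F-score is a harmonic mean, it is pulled toward the lower of the two values, which is why the drop in recall outside the lab lowers it despite the slightly higher precision.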
Furthermore, the study [54] aimed to determine whether data from two accelerometers, that is, tracking of body movement at two different locations, is needed to reliably determine jogging periods. The researchers in [55] present RecoFit, a system that uses an arm-worn inertial sensor to automatically monitor repetitive workouts such as weight training and calisthenics. The system's goal was to give real-time and post-exercise feedback while requiring no user-specific training or intervention during the session. The approach addressed three problems: separating exercise from brief periods of inactivity, identifying the exercise being performed, and counting repetitions. The paper [56] describes an investigation of a group of frail older people with mild cognitive impairment (MCI) who actively participated in cognitive and motor therapy sessions while wearing wearable physiological sensors and using a smartphone application for physiological monitoring. The collected data were analyzed to see how stressed the frail older participants were during therapy and how exercise helps cognitive training. Finally, the developed stress detection system was evaluated using several machine learning algorithms on the acquired dataset. In the study [57], an ML algorithm was applied to data gathered through wearables to distinguish between physiological states linked to stressful and non-stressful settings in children with ASD. In a safe laboratory environment, wearables were used to record heart rate and RR intervals both at rest and during activities. The authors then subjected the acquired data to outlier removal. They then trained and evaluated support vector machine (SVM)
and logistic regression (LR) classifiers to categorize each validation sample as either a relaxing or stressful period using nested leave-one-out cross-validation. The authors of the study [58] developed an algorithm to recognize atrial fibrillation (AF) episodes using photoplethysmograms (PPG) recorded while the subjects were mobile and going about their daily lives. The authors compiled and analyzed about 4000 h of PPG data collected using a wearable device. They achieved a 95% accuracy rate with a convolutional neural network trained on the acquired data. The authors of [59] examined how well machine learning algorithms such as Support Vector Machine, Random Forest, Naive Bayes, K-Nearest Neighbor, and Neural Network (NN) perform at diagnosing syndromes, as well as how much it would cost to implement one of these algorithms on a wearable device. The algorithms were tested on the electroencephalography (EEG) dataset from the UCI repository. The study [60] proposes a machine learning technique that uses data from accelerometer (ACC) and electrodermal activity (EDA) sensors to recognize generalized tonic-clonic seizures. The Embrace and E4 bracelets (Empatica) were used to record the physiological indicators of generalized tonic-clonic seizures. The study [61] proposes a machine learning model for ambulatory gait analysis during walking and running. The authors used data collected by wearables (Sport Sole) during walking and running exercises. The data were collected in two separate sessions of 6 min intervals of treadmill walking and running at varying speeds. In the study [62], machine learning is used to create a foot strike angle prediction and classification model. The model uses signals collected by Loadsol™ wearable pressure insoles. To evaluate the effectiveness of the developed model, three distinct machine learning methods (linear regression, conditional inference trees, and random forests) are applied.
With all models, they surpassed 90% classification accuracy. The study [63] uses wireless wearable sensors to assess the impact of olfactory training on the autonomic response to olfactory stimuli. For the analysis, three months’ worth of olfactory training data were gathered from a group of university students. Using electrocardiogram (ECG) and galvanic skin response (GSR) signals, the authors found that autonomic responses differed between the start and the conclusion of the training period, with a larger parasympathetically mediated reaction toward the end than at the first evaluation. The study [64] proposes a technique for detecting mild dehydration by evaluating autonomic reactions to cognitive stress. Wearables were used to create nine separate datasets of autonomic control based on pulse rate variability (PRV) and electrodermal activity (EDA). Measurements were obtained over three consecutive days, with the individuals either “wet” (not dehydrated) or “dry” (experiencing mild dehydration caused by fluid restriction). Nine different classification models were examined, and each was evaluated using leave-one-subject-out cross-validation on each of the nine datasets of autonomic nervous system control and on all possible combinations of them. Using machine learning and wearable technology, the study [65] develops a system for the evaluation of emotions. The MUSE headband’s capabilities, along with those
of the Shimmer GSR+ gadget, were to be evaluated as part of the system’s development. The other objective was to observe how people felt while they were exposed to stimuli. The relevant features are extracted from the gathered data using a machine learning strategy, and the signals used to prepare the data came from the Shimmer GSR+ and the headband. The study [66] describes a technique for objectively diagnosing sleep apnea using signals from wearable watch devices fitted with photoplethysmography (PPG) sensors. Its purpose was to determine whether sleep apnea, a common but underdiagnosed medical disorder associated with a lower quality of life and a higher risk of cardiovascular disease, could be detected using PPG pulse wave data. The article [67] proposes a sleep classification system for data obtained from wrist-worn accelerometers, in which a random forest classifier is used to classify the data. The data used to train the algorithm came from 134 adult participants in three independent studies who underwent a one-night polysomnography recording in the clinic while wearing an accelerometer on their wrist. To aid further research, the authors have also made these random forest algorithms open-source.
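Several of the studies surveyed above evaluate their classifiers with leave-one-subject-out cross-validation, in which all samples from one participant are held out for testing while the model is trained on the remaining participants. The following is a minimal sketch of that evaluation loop, not the procedure of any cited study. The nearest-centroid classifier is a simple stand-in chosen so the example needs only the Python standard library, and the function names, feature values, and subject identifiers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def fit_centroids(X, y):
    """Stand-in classifier (an assumption): one mean feature vector per class."""
    grouped = defaultdict(list)
    for features, label in zip(X, y):
        grouped[label].append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def loso_accuracy(X, y, subject_ids):
    """Train on all subjects but one, test on the held-out one, for every subject."""
    per_subject = {}
    for held_out, train, test in loso_splits(subject_ids):
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        per_subject[held_out] = hits / len(test)
    return per_subject


# Hypothetical toy data: two made-up features per sample, two samples per subject.
X = [[0.9, 0.2], [0.3, 0.8], [1.0, 0.1], [0.2, 0.9], [0.8, 0.3], [0.1, 0.7]]
y = ["wet", "dry", "wet", "dry", "wet", "dry"]
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]

scores = loso_accuracy(X, y, subjects)
for subject, accuracy in scores.items():
    print(subject, accuracy)
```

Splitting by subject rather than by sample keeps data from the same person out of both the training and the test set, so the reported accuracy reflects performance on an unseen wearer.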
In the study [68], a mobile health system based on artificial intelligence and a wearable biosensor is designed for the early detection of COVID-19 in quarantined patients. To detect COVID-19 early, wearable biosensors are employed to continually monitor several physiological indicators. The study is an observational investigation of ML-based remote monitoring, using wearable biosensors, of people with mild SARS-CoV-2 symptoms [69, 70]. It investigated how clinical deterioration could be detected by applying ML to the physiological parameters recorded by the wearable biosensors. The study included 34 COVID-19 patients with mild symptoms.
7 Conclusion
The use of wearable technology has clearly grown in recent years. Wearable technology is expected to become even more common as a result of extensive research into the use of artificial intelligence solutions in healthcare-related tasks. It will progress from “nice to have” devices with intriguing implementations to essentials for remote monitoring of patients and the detection of abnormalities in the human body. In this chapter, we evaluated ML tasks studied in the context of wearable medical devices, as well as the machine learning techniques employed, the various modalities
employed, and the relevant datasets. We also discussed the potential solutions found in the literature to various problems with machine learning applications on wearable devices (deployment alternatives, power consumption, storage and memory, utility and user acceptance, data availability and reliability, communication, security, and privacy). Lastly, the chapter highlights issues that need further research in the areas of data accessibility, reliability, and confidentiality in order to allow effective learning from datasets supplied by wearable devices.
References
1. Sabry, F., Eltaras, T., Labda, W., Alzoubi, K., & Malluhi, Q. (2022). Machine learning for healthcare wearable devices: The big picture. Journal of Healthcare Engineering. https://doi.org/10.1155/2022/4653923
2. Sharma, A., Singh, A., Gupta, V., & Arya, S. (2022). Advancements and future prospects of wearable sensing technology for healthcare applications. Sensors & Diagnostics, 1(3), 387–404. https://doi.org/10.1039/D2SD00005A
3. Li, M., Wang, L., Liu, R., Li, J., Zhang, Q., Shi, G., & Wang, H. (2021). A highly integrated sensing paper for wearable electrochemical sweat analysis. Biosensors and Bioelectronics, 174, 112828. https://doi.org/10.1016/j.bios.2020.112828
4. Xu, Z., Song, J., Liu, B., Lv, S., Gao, F., Luo, X., & Wang, P. (2021). A conducting polymer PEDOT:PSS hydrogel based wearable sensor for accurate uric acid detection in human sweat. Sensors and Actuators B: Chemical, 348, 130674. https://doi.org/10.1016/j.snb.2021.130674
5. Zhao, C., Li, X., Wu, Q., & Liu, X. (2021). A thread-based wearable sweat nanobiosensor. Biosensors and Bioelectronics, 188, 113270. https://doi.org/10.1016/j.bios.2021.113270
6. Kheirkhahan, M., Nair, S., Davoudi, A., Rashidi, P., Wanigatunga, A. A., Corbett, D. B., Mendoza, T., Manini, T. M., & Ranka, S. (2019). A smartwatch-based framework for real-time and online assessment and mobility monitoring. Journal of Biomedical Informatics, 89, 29–40. https://doi.org/10.1016/j.jbi.2018.11.003
7. Zhou, Z., Shu, T., Sun, Y., Si, H., Peng, P., Su, L., & Zhang, X. (2021). Luminescent wearable biosensors based on gold nanocluster networks for “turn-on” detection of Uric acid, glucose and alcohol in sweat. Biosensors and Bioelectronics, 192, 113530. https://doi.org/10.1016/j.bios.2021.113530
8. Sharma, A., Singh, A., Khosla, A., & Arya, S. (2021). Preparation of cotton fabric based non-invasive colorimetric sensor for instant detection of ketones. Journal of Saudi Chemical Society, 25(10), 101340. https://doi.org/10.1016/j.jscs.2021.101340
9. Ambrose, A., Paul, G., & Hausdorff, J. (2013). Risk factors for falls among older adults: A review of the literature. Maturitas, 75, 51–61. https://doi.org/10.1016/j.maturitas.2013.02.009
10. González, S., Sedano, J., Villar, J., Corchado, E., Herrero, Á., & Baruque, B. (2015). Features and models for human activity recognition. Neurocomputing, 167, 52–60. https://doi.org/10.1016/j.neucom.2015.01.082
11. Pannurat, N., Thiemjarus, S., & Nantajeewarawat, E. (2017). A hybrid temporal reasoning framework for fall monitoring. IEEE Sensors Journal, 17, 1749–1759. https://doi.org/10.1109/jsen.2017.2649542
12. Gibson, R., Amira, A., Ramzan, N., Casaseca-de-la-Higuera, P., & Pervez, Z. (2017). Matching pursuit-based compressive sensing in a wearable biomedical accelerometer fall diagnosis device. Biomedical Signal Processing and Control, 33, 96–108. https://doi.org/10.1016/j.bspc.2016.10.016
13. Frank, H., Jacobs, K., & McLoone, H. (2017). The effect of a wearable device prompting high school students aged 17–18 years to break up periods of prolonged sitting in class. Work, 56, 475–482. https://doi.org/10.3233/wor-172513
14. Choo, D., Dettman, S., Dowell, R., & Cowan, R. (2017). Talking to toddlers: Drawing on mothers’ perceptions of using wearable and mobile technology in the home. Studies in Health Technology and Informatics, 239, 21–27.
15. Choi, Y., Jeon, Y., Wang, L., & Kim, K. (2017). A biological signal-based stress monitoring framework for children using wearable devices. Sensors, 17, 1936. https://doi.org/10.3390/s17091936
16. Setz, C., Arnrich, B., Schumm, J., La Marca, R., Troster, G., & Ehlert, U. (2010). Discriminating stress from cognitive load using a wearable EDA device. IEEE Transactions on Information Technology in Biomedicine, 14, 410–417. https://doi.org/10.1109/titb.2009.2036164
17. Skazalski, C., Whiteley, R., Hansen, C., & Bahr, R. (2018). A valid and reliable method to measure jump-specific training and competition load in elite volleyball players. Scandinavian Journal of Medicine & Science in Sports, 28, 1578–1585. https://doi.org/10.1111/sms.13052
18. Chen, S., Lin, S., Lan, C., & Hsu, H. (2017). Design and development of a wearable device for heat stroke detection. Sensors, 18, 17. https://doi.org/10.3390/s18010017
19. Yang, H., Kang, J., Kim, O., Choi, M., Oh, M., Nam, J., & Sung, E. (2017). Interventions for preventing childhood obesity with smartphones and wearable device: A protocol for a non-randomized controlled trial. International Journal of Environmental Research and Public Health, 14, 184. https://doi.org/10.3390/ijerph14020184
20. Ghafar-Zadeh, E. (2015). Wireless integrated biosensors for point-of-care diagnostic applications. Sensors, 15, 3236–3261. https://doi.org/10.3390/s150203236
21. Chiauzzi, E., Rodarte, C., & DasMahapatra, P. (2015). Patient-centered activity monitoring in the self-management of chronic health conditions. BMC Medicine. https://doi.org/10.1186/s12916-015-0319-2
22. Basen-Engquist, K., Scruggs, S., Jhingran, A., Bodurka, D., Lu, K., Ramondetta, L., Hughes, D., & Carmack Taylor, C. (2009). Physical activity and obesity in endometrial cancer survivors: Associations with pain, fatigue, and physical functioning. American Journal of Obstetrics and Gynecology, 200, 288.e1-288.e8. https://doi.org/10.1016/j.ajog.2008.10.010
23. Rossi, A., Frechette, L., Miller, D., Miller, E., Friel, C., Van Arsdale, A., Lin, J., Shankar, V., Kuo, D., & Nevadunsky, N. (2018). Acceptability and feasibility of a fitbit physical activity monitor for endometrial cancer survivors. Gynecologic Oncology, 149, 470–475. https://doi.org/10.1016/j.ygyno.2018.04.560
24. Burridge, J., Lee, A., Turk, R., Stokes, M., Whitall, J., Vaidyanathan, R., Clatworthy, P., Hughes, A., Meagher, C., Franco, E., & Yardley, L. (2017). Telehealth, wearable sensors, and the internet: Will they improve stroke outcomes through increased intensity of therapy, motivation, and adherence to rehabilitation programs? Journal of Neurologic Physical Therapy, 41, S32–S38. https://doi.org/10.1097/npt.0000000000000183
25. Burns, A., & Adeli, H. (2017). Wearable technology for patients with brain and spinal cord injuries. Reviews in the Neurosciences, 28, 913–920. https://doi.org/10.1515/revneuro-2017-0035
26. Tey, C., An, J., & Chung, W. (2017). A novel remote rehabilitation system with the fusion of noninvasive wearable device and motion sensing for pulmonary patients. Computational and Mathematical Methods in Medicine, 2017, 1–8. https://doi.org/10.1155/2017/5823740
27. Winokur, E., Delano, M., & Sodini, C. (2013). A wearable cardiac monitor for long-term data acquisition and analysis. IEEE Transactions on Biomedical Engineering, 60, 189–192. https://doi.org/10.1109/tbme.2012.2217958
28. Yang, H. K., Lee, J. W., Lee, K. H., Lee, Y. J., Kim, K. S., Choi, H. J., Kim, D. J. (2008). Application for the wearable heart activity monitoring system: Analysis of the autonomic function of HRV. In 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2008. EMBS 2008.
29. Da He, D., Winokur, E. S., Sodini, C. G. (2012). An ear-worn continuous ballistocardiogram (BCG) sensor for cardiovascular monitoring. In 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).
30. Hernandez-Silveira, M., Ahmed, K., Ang, S., Zandari, F., Mehta, T., Weir, R., Burdett, A., Toumazou, C., & Brett, S. (2015). Assessment of the feasibility of an ultra-low power, wireless digital patch for the continuous ambulatory monitoring of vital signs. British Medical Journal Open, 5, e006606. https://doi.org/10.1136/bmjopen-2014-006606
31. Fryar, C. D., Ostchega, Y., Hales, C. M., Zhang, G., & Kruszon-Moran, D. (2017). Hypertension prevalence and control among adults: United States, 2015–2016. NCHS Data Brief, 289, 1–8.
32. Ghosh, A., Torres, J., Danieli, M., Riccardi, G. (2015). Detection of essential hypertension with physiological signals from wearable devices. In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).
33. Goldberg, E., & Levy, P. (2016). New approaches to evaluating and monitoring blood pressure. Current Hypertension Reports. https://doi.org/10.1007/s11906-016-0650-9
34. Fujikawa, T., Tochikubo, O., Kura, N., Kiyokura, T., Shimada, J., & Umemura, S. (2009). Measurement of Hemodynamics during postural changes using a new wearable cephalic laser blood flowmeter. Circulation Journal, 73, 1950–1955. https://doi.org/10.1253/circj.cj-09-0103
35. Heintzman, N. (2015). A digital ecosystem of diabetes data and technology. Journal of Diabetes Science and Technology, 10, 35–41. https://doi.org/10.1177/1932296815622453
36. Dudde, R., Vering, T., Piechotta, G., & Hintsche, R. (2006). Computer-aided continuous drug infusion: Setup and test of a mobile closed-loop system for the continuous automated infusion of insulin. IEEE Transactions on Information Technology in Biomedicine, 10, 395–402. https://doi.org/10.1109/titb.2006.864477
37. Brown, S., Breton, M., Anderson, S., Kollar, L., Keith-Hynes, P., Levy, C., Lam, D., Levister, C., Baysal, N., Kudva, Y., Basu, A., Dadlani, V., Hinshaw, L., McCrady-Spitzer, S., Bruttomesso, D., Visentin, R., Galasso, S., del Favero, S., Leal, Y., … Kovatchev, B. (2017). Overnight closed-loop control improves Glycemic control in a Multicenter study of adults with type 1 diabetes. The Journal of Clinical Endocrinology & Metabolism, 102, 3674–3682. https://doi.org/10.1210/jc.2017-00556
38. Hetterich, C., Pobiruchin, M., Wiesner, M., Pfeifer, D. (2014). How google glass could support patients with diabetes mellitus in daily life. Paper presented at the MIE.
39. Lin, Z., Dai, H., Xiong, Y., Xia, X., Horng, S. J. (2017). Quantification assessment of bradykinesia in Parkinson’s disease based on a wearable device. In 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).
40. Delrobaei, M., Baktash, N., Gilmore, G., McIsaac, K., & Jog, M. (2017). Using Wearable technology to generate objective Parkinson’s disease dyskinesia severity score: Possibilities for home monitoring. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25, 1853–1863. https://doi.org/10.1109/tnsre.2017.2690578
41. Daniels, J., Haber, N., Voss, C., Schwartz, J., Tamura, S., Fazel, A., Kline, A., Washington, P., Phillips, J., Winograd, T., Feinstein, C., & Wall, D. (2018). Feasibility testing of a wearable behavioral aid for social learning in children with Autism. Applied Clinical Informatics, 09, 129–140. https://doi.org/10.1055/s-0038-1626727
42. Roh, T., Hong, S., Yoo, H. J. (2014). Wearable depression monitoring system with heart-rate variability. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).
43. https://www.happiestminds.com/insights/wearable-technology/
44. Beniczky, S., Karoly, P., Nurse, E., Ryvlin, P., & Cook, M. (2020). Machine learning and wearable devices of the future. Epilepsia. https://doi.org/10.1111/epi.16555
45. Park, S., Lee, S., Han, S., & Cha, M. (2019). Clustering insomnia patterns by data from wearable devices: Algorithm development and validation study. JMIR mHealth and uHealth, 7, e14473. https://doi.org/10.2196/14473
46. Lee, P. T., Chiu, W. C., Ho, Y. H., Tai, Y. C., Lin, C. C. K., Lin, C. L. (2021). Development of wearable device and clustering based method for detecting falls in the elderly. In Proceedings of the 2021 IEEE 10th Global Conference on Consumer Electronics (GCCE) (pp. 231–232).
47. Sabry, F., Eltaras, T., Labda, W., Hamza, F., Alzoubi, K., & Malluhi, Q. (2022). Towards on-device dehydration monitoring using machine learning from wearable device’s data. Sensors, 22, 1887. https://doi.org/10.3390/s22051887
48. Hussain, F., Hussain, F., Ehatisham-ul-Haq, M., & Azam, M. (2019). Activity-Aware fall detection and recognition based on wearable sensors. IEEE Sensors Journal, 19, 4528–4536. https://doi.org/10.1109/jsen.2019.2898891
49. Giuffrida, D., Guido, B., Martini, D. D., Facchinetti, T. (2019). Fall detection with supervised machine learning using wearable sensors. In Proceedings of the 2019 IEEE 17th International Conference on Industrial Informatics (INDIN) (Vol. 1, pp. 253–259).
50. Yen, C., Liao, J., & Huang, Y. (2020). Human daily activity recognition performed using wearable inertial sensors combined with deep learning algorithms. IEEE Access, 8, 174105–174114. https://doi.org/10.1109/access.2020.3025938
51. Li, X., Zhou, Z., Wu, J., & Xiong, Y. (2021). Human posture detection method based on wearable devices. Journal of Healthcare Engineering, 2021, 1–8. https://doi.org/10.1155/2021/8879061
52. Nourollahi, M., Rokni, S. A., Alinia, P., Hassan, G. (2020). Proximity-based active learning for eating moment recognition in wearable systems. In Proceedings of the 6th ACM Workshop on Wearable Systems and Applications, WearSys ’20 (pp. 7–12). Association for Computing Machinery.
53. Chun, K. S., Jeong, H., Adaimi, R., Thomaz, E. (2020). Eating episode detection with jawbone-mounted inertial sensing. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC) (pp. 4361–4364).
54. Morris, D., Scott, S., Guillory, A., Kelner, I. (2014). RecoFit: Using a wearable sensor to find, recognize, and count repetitive exercises. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 3225–3234). ACM.
55. Zdravevski, E., Risteska Stojkoska, B., Standl, M., & Schulz, H. (2017). Automatic machine-learning based identification of jogging periods from accelerometer measurements of adolescents under field conditions. PLoS ONE, 12, e0184216. https://doi.org/10.1371/journal.pone.0184216
56. Masino, A. J., Forsyth, D., Nuske, H., et al. (2019). M-Health and autism: Recognizing stress and anxiety with machine learning and wearables data. In Proceedings of the 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS) (pp. 714–719).
57. Delmastro, F., Martino, F., & Dolciotti, C. (2020). Cognitive training and stress detection in MCI frail older people through wearable sensors and machine learning. IEEE Access, 8, 65573–65590. https://doi.org/10.1109/access.2020.2985301
58. Shen, Y., Voisin, M., Aliamiri, A., Anand, A., Hannun, A., Ng, A. (2019). Ambulatory atrial fibrillation monitoring using wearable photoplethysmography with deep learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery Data Mining, KDD ’19 (pp. 1909–1916). Association for Computing Machinery.
59. Kwon, S., Hong, J., Choi, E., Lee, B., Baik, C., Lee, E., Jeong, E., Koo, B., Oh, S., & Yi, Y. (2020). Detection of atrial fibrillation using a ring-type wearable device (CardioTracker) and deep learning analysis of photoplethysmography signals: Prospective observational proof-of-concept study. Journal of Medical Internet Research, 22, e16443. https://doi.org/10.2196/16443
60. Resque, P., Barros, A., Rosário, D., Cerqueira, E. (2019). An investigation of different machine learning approaches for epileptic seizure detection. In Proceedings of the 15th International Wireless Communications Mobile Computing Conference (IWCMC) (pp. 301–306).
61. Regalia, G., Onorati, F., Lai, M., Caborni, C., & Picard, R. (2019). Multimodal wrist-worn devices for seizure detection and advancing research: Focus on the Empatica wristbands. Epilepsy Research, 153, 79–82. https://doi.org/10.1016/j.eplepsyres.2019.02.007
62. Zhang, H., Guo, Y., & Zanotto, D. (2020). Accurate ambulatory gait analysis in walking and running using machine learning models. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 28, 191–202. https://doi.org/10.1109/tnsre.2019.2958679
63. Moore, S., Kranzinger, C., Fritz, J., Stöggl, T., Kröll, J., & Schwameder, H. (2020). Foot strike angle prediction and pattern classification using Loadsol™ wearable sensors: A comparison of machine learning techniques. Sensors, 20, 6737. https://doi.org/10.3390/s20236737
64. Posada-Quintero, H., Reljin, N., Moutran, A., Georgopalis, D., Lee, E., Giersch, G., Casa, D., & Chon, K. (2019). Mild dehydration identification using machine learning to assess autonomic responses to cognitive stress. Nutrients, 12, 42. https://doi.org/10.3390/nu12010042
65. Laureanti, R., Bilucaglia, M., Zito, M., et al. (2020). Emotion assessment using Machine Learning and low-cost wearable devices. In Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC) (pp. 576–579).
66. Ayata, D., Yaslan, Y., & Kamasak, M. (2020). Emotion recognition from multimodal physiological signals for emotion aware healthcare systems. Journal of Medical and Biological Engineering, 40, 149–157. https://doi.org/10.1007/s40846-019-00505-7
67. Hayano, J., Yamamoto, H., Nonaka, I., Komazawa, M., Itao, K., Ueda, N., Tanaka, H., & Yuda, E. (2020). Quantitative detection of sleep apnea with wearable watch device. PLoS ONE, 15, e0237279. https://doi.org/10.1371/journal.pone.0237279
68. Sundararajan, K., Georgievska, S., te Lindert, B., Gehrman, P., Ramautar, J., Mazzotti, D., Sabia, S., Weedon, M., van Someren, E., Ridder, L., Wang, J., & van Hees, V. (2021). Sleep classification from wrist-worn accelerometer data using random forests. Scientific Reports. https://doi.org/10.1038/s41598-020-79217-x
69. Un, K., Wong, C., Lau, Y., Lee, J., Tam, F., Lai, W., Lau, Y., Chen, H., Wibowo, S., Zhang, X., Yan, M., Wu, E., Chan, S., Lee, S., Chow, A., Tong, R., Majmudar, M., Rajput, K., Hung, I., & Siu, C. (2021). Observational study on wearable biosensors and machine learning-based remote monitoring of COVID-19 patients. Scientific Reports. https://doi.org/10.1038/s41598-021-82771-7
70. Wong, C., Ho, D., Tam, A., Zhou, M., Lau, Y., Tang, M., Tong, R., Rajput, K., Chen, G., Chan, S., Siu, C., Hung, I. (2020). Artificial intelligence mobile health platform for early detection of COVID-19 in quarantine subjects using a wearable biosensor: Protocol for a randomised controlled trial. BMJ Open, 10, e038555. https://doi.org/10.1136/bmjopen-2020-038555