Machine Intelligence for Materials Science
Alessandro Bile
Solitonic Neural Networks An Innovative Photonic Neural Network Based on Solitonic Interconnections
Machine Intelligence for Materials Science
This book series is dedicated to showcasing the latest research and developments at the intersection of materials science and engineering, computational intelligence, and data sciences. The series covers a wide range of topics that explore the application of artificial intelligence (AI), machine learning (ML), deep learning (DL), reinforcement learning (RL), and data science approaches to solve complex problems across the materials research domain. Topical areas covered in the series include but are not limited to:

• AI and ML for accelerated materials discovery, design, and optimization
• Materials informatics
• Materials genomics
• Data-driven multi-scale materials modeling and simulation
• Physics-informed machine learning for materials
• High-throughput materials synthesis and characterization
• Cognitive computing for materials research
The series also welcomes manuscript submissions exploring the application of AI, ML, and data science techniques to the following areas:

• Materials processing optimization
• Materials degradation and failure
• Additive manufacturing and 3D printing
• Image analysis and signal processing
Each book in the series is written by experts in the field and provides a valuable resource for understanding the current state of the field and the direction in which it is headed. Books in this series are aimed at researchers, engineers, and academics in the field of materials science and engineering, as well as anyone interested in the impact of AI on the field.
Alessandro Bile Department of Basic and Applied Sciences for Engineering Sapienza University of Rome Rome, Italy
ISSN 2948-1813 ISSN 2948-1821 (electronic)
Machine Intelligence for Materials Science
ISBN 978-3-031-48654-8 ISBN 978-3-031-48655-5 (eBook)
https://doi.org/10.1007/978-3-031-48655-5

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.
To my wife Angelica and to my daughter Athéna, because together we share a love of knowledge. To my whole family.
Preface
This book collects material from my scientific research, conducted during my three-year (2019–2022) doctoral program at the Department of Basic and Applied Sciences for Engineering of the Sapienza University of Rome (Italy), in collaboration with the FEMTO-ST Institute of Besançon (France). The manuscript presents an innovative optical technology for neuromorphic hardware development. The most innovative element lies in the ability to reproduce a key property of biological neural tissue: neural plasticity. The hardware is realized through spatial soliton technology, which enables the construction of intelligent connections: the solitonic neural networks (SNNs). Unlike the software and neuromorphic artificial intelligence developed to date, SNNs come close to reproducing the complex neural tissue and its functioning. To understand how SNNs work, Chapter 1 provides a detailed description of the functional characteristics of biological neuronal networks. This chapter opens with a detailed and general description of the functional features of the neuron, the fundamental unit of the nervous system. A second part is devoted to synaptic interconnections, which are fundamental to the realization of complex networks capable of processing and storing information. Information is indeed recorded within specific neural pathways, which can be strengthened, modified or eliminated. Finally, some properties that help make the biological brain unique are discussed. In particular, neuroplasticity is presented: it allows the entire nervous system to be highly interconnected. In Chap. 2, some neuromorphic optical realizations are introduced. These are classified into three paradigms based on the neural functionality they are able to reproduce. The third paradigm introduces the novelty of networks constructed by spatial solitons. The book also addresses the concepts of learning and memory. Biological memory is still a great mystery, but modern neuropsychology has succeeded in outlining features of how it works, which are presented in Chap. 3. Chapter 4 gets into the core of the SNN networks, in particular by describing their fundamental unit: the solitonic neuron. The physics of soliton formation, first within photorefractive bulk crystals and then within thin films, is analyzed. Thin films
allow physical limitations in terms of computation and experimental realization to be overcome. The learning dynamics is presented both theoretically and experimentally. The connection of multiple solitonic neuron units realizes complex neural networks, which are extensively described in Chap. 5. Their ability to act as an episodic memory and to perform recognition is presented. This chapter closes with a section highlighting the current limitations of SNNs and discussing some ideas currently under consideration to move forward with their evolution. This book is aimed primarily at researchers and doctoral students who have notions of nonlinear optics. Basic notions of neuron biology and artificial intelligence will also be needed. However, even the less knowledgeable reader will find a detailed description that will aid full understanding. Where the text does not provide adequate background information, appropriate references are suggested. I take this opportunity to ask the Italian state to show sincere regard for research. For too many years, research has been put on the back burner. Italian researchers work with extremely limited resources and receive significantly lower salaries than their non-Italian colleagues. More recognition must therefore be given to researchers, ensuring rapid growth from which the entire nation would benefit.

Rome, Italy
October 2023
Alessandro Bile
Contents

1 Introduction to Neural Networks: Biological Neural Network
  1.1 Introduction to Neural Networks: Biological Neural Networks
    1.1.1 Neurons: Fundamental Units of the Nervous System
    1.1.2 Active Synaptic Interconnections and Neural Networks
    1.1.3 Biological Memory
    1.1.4 Tissue Properties of Biological Neural Networks
  References

2 Overview of Neuromorphic Optical Systems
  2.1 Overview
    2.1.1 Introduction to Optical Neuromorphics
    2.1.2 First Neuromorphic Paradigm
    2.1.3 Second Neuromorphic Paradigm
    2.1.4 Third Neuromorphic Paradigm
  References

3 Towards Neuro-Learning Process: Psychomemories
  3.1 Episodic Memory
  3.2 Semantic Memory
  3.3 Procedural Memory
  References

4 The Solitonic X-Junction as a Photonic Neuron
  4.1 Photorefractive Solitons
  4.2 Learning of the X-Junction Neuron
  4.3 Supervised and Unsupervised Learning
  4.4 Experimental Writing and Erasing of the X-Junction Neuron in LNOI Films
    4.4.1 The Physics of the LNOI
    4.4.2 LNOI X-Junction Writing
    4.4.3 LNOI X-Junction Erasing
  4.5 Mathematical Model of the X-Junction Neuron
  4.6 Tissue Properties of the X-Junction Neuron
  References

5 Solitonic Neural Network Acting as an Episodic Memory
  5.1 Solitonic Neural Network Acting as an Episodic Memory
    5.1.1 Implementation of an Episodic Memory Based on the Plastic Photorefractivity of the SNNs
    5.1.2 A 4-bit SNN Working as an Episodic Memory
    5.1.3 Mathematical Model of the SNN
    5.1.4 Learning and Memorization Process
  5.2 Power Consumption Analysis
  5.3 Limitations of Actual SNNs and Future Perspectives
  References
Chapter 1
Introduction to Neural Networks: Biological Neural Network
Abstract Chapter 1 introduces the functional organization of the biological brain. The first section opens with the description of neurons, fundamental units of the brain. These are structures capable of collecting signals, processing them and delivering them to subsequent units. At the same time, however, they are dynamic and can change according to the conditions of their environment. A second part of the chapter is devoted to the description of synapses: the dynamics of connections between neurons is in fact the basis of the processes of learning and memorization. Both are related to the concepts of signal intensity and iteration. A detailed description of how information is stored is thus proposed. The chapter closes with a descriptive overview of some properties of the nervous environment that make it a highly interconnected tissue.
1.1 Introduction to Neural Networks: Biological Neural Networks

The human brain is a very complex system capable of collecting, processing, and storing data from the external environment. The result of this chain is learning. The whole process is optimized: energy consumption is minimal, and the number of operations that can be performed simultaneously is very high, as is the ability to build new knowledge from the information acquired. For these reasons, in recent years, scientific research has sought first to understand the fundamental neural mechanisms underlying the learning process and then to replicate them through the neuromorphic approach, which consists precisely in the artificial reproduction of the typical functional blocks of nervous tissue. The starting point of any neuromorphic system is neuronal dynamics. In fact, the functioning of neural networks can be analyzed by breaking down their complex structures into simpler parts and observing how the connections between these fundamental units, the neurons, evolve according to environmental information. The general behavior of a neuron can be schematized by its ability to pursue a few main tasks: to receive signals and influence their propagation within the network, as if they were
small chips capable of assessing the magnitude of the received signal and of directing it specifically toward one propagation trajectory rather than another. This is done through the synapses, the biological bridges connecting neurons. These connections make it possible to build precise geometries corresponding to the learning of information (theoretical or practical). As this Chapter will show, there are two types of synapses in relation to the transmitted signal, which can be electrical or chemical [1]. The main difference lies in the fact that in electrical synapses the propagation of nerve signals is extremely fast, almost instantaneous, because the current passes directly from one cell to another [2]. The two types are used in different neural contexts. Moreover, the variability of the components involved also characterizes the reading and translation of external signals. External stimuli are picked up by different types of receptors. The information they carry, in the form of electrical signals, undergoes a processing chain through synaptic connections, forming a neural circuit. The generation of an electrical signal (spike) depends on the dynamics of ion densities, according to the Hodgkin-Huxley model [3]. As will be shown, synapse activation depends on signal iteration and on variations in the density of neurotransmitters. This Chapter explores the fundamental concepts of biological neuronal dynamics: starting with an exploration of the structure and functioning of the individual neuron, it goes deep into the ways in which synapses are built, modified and destroyed between different neurons. Then, it describes how neural network geometries are capable of processing, storing and learning information. The last part of the Chapter is dedicated to the description of the tissue properties of the brain that contribute to the development and sustenance of the entire neural network. This text is not a substitute for a manual on biological neural dynamics. Indeed, the topics covered are approached in such a way as to better clarify the logic and operation of SNNs. The solitonic neural technology described in this book represents an entirely novel and never-before-explored approach that arises from the concept of a self-assembling dynamical system. As we shall see, SNNs are able to change their geometry by reorganizing the map of neural circuits to learn new information and memorize it into specific neural pathways.
1.1.1 Neurons: Fundamental Units of the Nervous System

In recent years, research on brain biology has made important strides [4], enabling the development of in-depth models of the organization of the nervous system and its activity. Neural networks constitute a complex system whose functioning depends on the concatenated and parallel succession of numerous chemical and electrical mechanisms [5]. Information, collected through specialized “sensors” called receptors, reaches the brain in the form of electrical signals [6]. These inputs, once collected, are processed locally by neurons, the fundamental units of the nervous system.
Neurons decide whether to send a signal forward or to block it. The choice is made based on the importance of the information. High-amplitude signals are considered important and, after being evaluated, are transmitted to the next neurons. The combination of “choices” made by individual neurons along the neural circuit results in the delineation of a precise information pathway, which requires changes in the structure of the network. Thus, the learning of information coincides with its neural transposition in the form of a trajectory. This is possible due to the self-organizing capacity of neural tissue [7]. Indeed, signal mapping within the nervous tissue depends on the weight and directionality of each synapse, whose nature is dynamic: synapses can increase in number, reinforce or diminish the intensity of the connection between neurons in order to redirect the information flow, and can even be totally cancelled if the received signal is not evaluated as important or “valuable.” These characteristics give synapses the property of plasticity that underlies the human brain’s way of learning. Plasticity denotes the brain’s ability to change dynamically according to the received information, in order to process and remember it. To analyze brain plasticity, it is necessary to start from the description and characterization of a single neuron unit. To be precise, there is no single type of neuron [8]. Their conformation differs and depends on the nerve region in which they are located and the task they are supposed to perform. In general, and to a good approximation, a common structure can be recognized [9]. Schematically, the neuron is characterized by three functional districts, according to the diagram shown in Fig. 1.1. The dendrites are long channels along which incoming information is collected and transported to the “control and data processing centre,” identified with the soma. Here, electrical signals are processed according to a nonlinear process: if the sum of the total inputs exceeds a set threshold [10], an output signal called a spike is generated and propagated. The interesting aspect is that these signals are regenerated continuously [11]: it follows that they maintain an unchanged amplitude along propagation. Signal regeneration represents a key step in the realization of hardware artificial neurons. Indeed, most models base their theory on this feature, called excitability [12]. However, there are implementations in which the input is transmitted and processed through a process of switching between different paths [13]. This approach, as will be shown in the next chapters, underlies the neuron operation realized in the research proposed in this manuscript. The soma ends in a long channel called the axon, which carries the spike and directs it to numerous output channels that together constitute the axon arborization [14]. Interestingly, substances such as myelin are present on the axon [15] that allow a speeding up of signal transmission, preventing leakage and allowing a focused pathway. The distribution of myelin is not homogeneous but is such that signal transmission is optimized [15]. At the end of this pathway there is a junction region [16], allowing communication between multiple successive neurons. The neuron from which the signal starts, preceding the synaptic cavity, is referred to as pre-synaptic, while the
Fig. 1.1 Schematic representation of the functional districts of a biological neuron: pre-synaptic dendrites represent the input channels for incoming information, which is collected and processed in the soma, which plays the role of computational center. Once weighed through a threshold process, the information can be propagated in the form of an action potential that travels through the entire axon before being sorted into the postsynaptic dendrites, bridges for the following neurons
ones receiving it as post-synaptic neurons. The connection area is called the synapse [17]. Some synapses are activated by variations in the concentration of endogenous chemical messengers, the neurotransmitters. The synapses between all these units constitute the neural network of the biological nervous system [18]. Within it move different types of neurotransmitters that allow the development of a high variability of neural phenomena [19]. Before analyzing the concept of synapses, it is first necessary to focus on the nature of the transmitted signals in order to understand the biological dynamics of neurons. These consist of short electrical pulses, called spikes or action potentials or membrane potentials, which have an amplitude of about 100 mV and a duration of about 1–2 ms, as shown in Fig. 1.2. Normally these pulses never travel alone but in sequence [20], forming the so-called spike trains, Fig. 1.2b. Their manifestation can be regular or irregular, but it is important to be aware that all pulses are characterized by the same shape during propagation [21]. In addition, each pulse of the spike train is well separated from the next. The succession of spikes, with its characteristic periodicity, determines the extent of the information and the formation of connections between pre-synaptic and post-synaptic neurons. After a spike has been emitted, a time period must elapse before a new excitation can arise. The minimum time interval between two spikes is called the absolute refractory period. The action potential corresponds to a change in the difference between the internal potential at the neuronal membrane and the external potential. The potential difference u(t) departs from its baseline only in the presence of a signal; otherwise the neuron is in a resting condition u_rest. After the spike, the potential decays to the resting value. If the change is positive the synapse is excitatory [22]; if it is negative the synapse is inhibitory [23]. An excitatory signal contributes to the consolidation of synaptic binding and thus to learning and memorization processes [20]; on the contrary, an inhibitory signal reduces synapse intensity and blocks the information flow: it is the beginning of the forgetting process. The existence of two types of synaptic responses, excitatory and inhibitory, underlies the construction of specific pathways for each
Fig. 1.2 a Dynamics of the generation of an action potential. The arrival of a stimulus produces a depolarization of the neuronal membrane. If the stimulus is sufficiently intense (above a fixed threshold), depolarization is pronounced and the spike is generated; otherwise it is not. A defined refractory period follows, in which the neuron is unable to generate new spikes. b Successive stimuli, depending on their intensity, can excite the generation of action potentials very close in time, forming trains of action potentials. The refractory period plays a very important physiological role in the transmission of nerve signals, since it influences the frequency with which trains of action potentials can be evoked and propagated
type of information. It is fundamental to the learning process. Therefore, any neuromorphic model must be able to reproduce such functionality. The cell membrane, under resting conditions, has a negative polarization of about −65 mV. An excitatory input reduces the negative polarization and, for this reason, is also called depolarizing [22]. An inhibitory synapse produces an increase in negative polarization and is called hyperpolarizing [23]. Let us now report the mathematical formalism of the above. Let $u_i(t)$ be the potential of the postsynaptic neuron. For $t < 0$, $u_i(t) = u_{rest}$. At $t = 0$ the spike occurs. For $t > 0$ the potential change shown in Eq. 1.1 occurs:

$$u_i(t) - u_{rest} = q_{ij}(t) \qquad (1.1)$$

where $q_{ij}(t)$ is the postsynaptic potential. Let us now imagine a connection with multiple neurons. Each presynaptic spike induces a postsynaptic potential $q_{ij}(t)$, with $j$ varying according to the presynaptic neuron considered. Equation 1.1 then becomes:

$$u_i(t) = \sum_j \sum_k q_{ij}\left(t - t_j^k\right) + u_{rest} \qquad (1.2)$$

where $t_j^k$ is the emission time of the $k$-th spike of presynaptic neuron $j$. The sum represents a linear membrane response to the input. This allows us to extract an important feature of the neuronal dynamics of synapse formation: signals from
multiple pre-synaptic neurons are summed and concur in the formation of the new action potential. However, if many spikes follow each other in a short interval of time and the membrane potential exceeds a threshold value θ, the response will not only be linear but will show a spike excursion of about 100 mV. This variation propagates along the axon to be transmitted to subsequent neurons. In conclusion, if $u_i(t)$ reaches θ, then neuron i emits a spike. The neuronal model based on the definition of the critical voltage θ is called Integrate and Fire [20]. The neuron is thus a cell that must be able to collect signals related to environmental stimuli, process them according to nonlinear laws (which are thus the basis of learning) and eventually communicate them to subsequent units, so as to create specific propagation paths or change their direction of propagation. To satisfy this aim, the neuron is able to modify its own structure according to the received stimuli. It is thus the neuron that, after “understanding” the information it receives, enables the distribution of the signal in a precise network.
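To make the Integrate and Fire description concrete, the following minimal sketch simulates a leaky integrate-and-fire neuron in Python. It is only an illustration of the threshold and refractory mechanisms discussed above: the parameter names and values (dt, tau_m, theta, the stimulus current, and so on) are invented for the example and are not taken from this book or from [20].

```python
import numpy as np

# Minimal leaky integrate-and-fire neuron (illustrative parameters only).
dt = 0.1        # integration time step (ms)
tau_m = 10.0    # membrane time constant (ms)
u_rest = -65.0  # resting potential (mV)
theta = -50.0   # firing threshold (mV)
t_refr = 2.0    # absolute refractory period (ms)

steps = int(100.0 / dt)          # simulate 100 ms
I = np.zeros(steps)
I[200:800] = 20.0                # constant stimulus between 20 and 80 ms

u = u_rest                       # membrane potential
refr_left = 0.0                  # remaining refractory time (ms)
spike_times = []

for k in range(steps):
    if refr_left > 0.0:          # no new spike during the refractory period
        refr_left -= dt
        u = u_rest
        continue
    # leaky integration: the potential relaxes toward u_rest plus the input
    u += (dt / tau_m) * (-(u - u_rest) + I[k])
    if u >= theta:               # threshold crossed: emit a spike and reset
        spike_times.append(k * dt)
        u = u_rest
        refr_left = t_refr

print(f"{len(spike_times)} spikes emitted, first at {spike_times[0]:.1f} ms"
      if spike_times else "no spikes emitted")
```

With these illustrative values, the neuron stays silent until the stimulus arrives and then fires at a regular rate limited by the refractory period, qualitatively reproducing the spike trains of Fig. 1.2b.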
1.1.2 Active Synaptic Interconnections and Neural Networks

Neural networks constitute a complex system in which the different constituent districts are highly connected and dependent on each other even though they are spatially far apart. Small changes in our surroundings can produce large changes in our brain structure and network geometry, generating complex patterns of activity at both the spatial and temporal levels. For this reason, it is more appropriate to speak of connected neural tissue. The complexity of neuronal activity underlies the ability to learn new information and store it. But how do these processes take place? To answer this question, it is necessary to introduce Hebb’s postulate. It states that connections between co-active neurons are strengthened through the mechanism of synaptic plasticity, so that the subsequent activation of a single part of the created set of neurons determines the activation of the whole set [24]. In circuit terms, Hebb’s postulate can be interpreted as a mechanism whereby the activation of a specific connection determines the evolution of an entire signal mapping within the network [25]. Whenever information is received, small changes are triggered that, iterated over time, lead to a new configuration of the structural geometry of the brain. Underlying learning, Hebb identifies a mechanism to which he associates the keyword plasticity [25]. It is thanks to this property that the biological brain is able to learn and store data so efficiently. In Chap. 4, it will be shown that SNN networks also base their innovativeness and efficiency on the plastic behavior of the refractive index typical of the optical crystals in which they are built. Neuroplasticity is realized both through the strong dynamicity of neurons, which are able to reorganize themselves spatially and temporally, and through the evolution of synapses. These bridges are not rigid
structures but are capable of changing over time based on the information received [26]. Moreover, given the high interconnectedness, the modification of a synapse results in a change of the entire previously formed neural pathway. At most synapses, the change in membrane potential depends not on electrical but on chemical principles. When the spike arrives at the presynaptic termination, it causes the release of neurotransmitters that are enclosed within vesicles [27]. These begin to diffuse locally within the inter-synaptic cleft, by a mechanism that, as we shall see in Chaps. 3 and 4, is very similar to charge photo-generation during the formation of a spatial soliton. Upon reaching the postsynaptic dendrites, the neurotransmitters bind to membrane receptors. There are neurotransmitters of two main natures. Excitatory ones induce a membrane depolarization that leads to the generation of a supra-threshold signal (spike) that propagates along the network by strengthening connections between neurons, as already seen. In contrast, inhibitory ones cause a hyperpolarization of the membrane that results in a shift of the signal away from the threshold value; these lead to a weakening of the synapses bound to that neuron. Then, there are also electrical synapses, which are rarer and do not involve neurotransmitters but connect one neural cell to another through junction points called gap junctions [28], where specific proteins enable the connection. These synapses allow faster signal propagation and are therefore located in regions that require a faster rate of communication. The intensity of a synaptic connection is neither static nor permanent but is subject to change over time. This characteristic is defined as synaptic plasticity. Neural plasticity mechanisms also come into play when, as a result of localized damage, a re-organization of the neural mapping is required [29]. Such reorganization enables functional recovery through different types of processes [30] that allow the restoration of specific functional activities and the rehabilitation of already learned mechanisms. For this reason, neuroplasticity is considered the neurobiological property at the base of memory [31]. Neuroplasticity also intervenes in the adaptive capacity of biological neural circuits [32]. Indeed, it has been observed that when a biological organ is removed, the neural region corresponding to it can be reassigned to other functions. In this case, the network mapping associated with the organ’s functioning no longer receives any stimuli and, for this reason, should be erased after some time. But in reality this does not happen. Instead, a remarkable phenomenon occurs. Other districts, which are neurally close to the damaged zone, start using more connections and therefore more neurons. This is one reason why, when the capacity to exercise one sense is lost, other sensory abilities improve. Neural connections change conformation and adapt to the new functionality. This dynamicity is provided by the ability to change the specific weight of each synapse in signal propagation along the neural mapping according to the environment. The strength of synaptic communication can be modified by the application of different stimulation patterns as a function of time [31]. Continuous stimulation of the synaptic connection over time results in its progressive strengthening [33].
Fig. 1.3 Schematic representation of the progressive strengthening of the synaptic connection between two neurons. As the signal repeatedly travels through the connection, the connection becomes stronger
Whenever the neural system “encounters” the same information, it slightly increases the weight of the synapses involved (Fig. 1.3). The absence of a stimulus, on the other hand, weakens the synaptic connection [34]. These mechanisms, as the next section will show, underlie memory storage. The binding strength within a neural pathway can also be modulated by exploiting different messengers that communicate the nature of the incoming signal. Indeed, excitatory information is not always associated with a stimulus: some stimuli encode a synaptic downregulation [35]. Biological neural tissue plays on the presence of different types of neurotransmitters, which, depending on their properties, can act as excitatory or inhibitory, giving this nuance to the relevant synapse as well. For example, among the main excitatory neurotransmitters, which thus increase the synaptic weight, there is glutamate [36]. Its presence facilitates the initiation of a spike, thus promoting the propagation of information. GABA, on the other hand, has an inhibitory effect on 90 percent of synapses [37]: an increase in its concentration in the synaptic cavity results in a decrease in synaptic weight, which results in the arrest of propagation. Its ability to “block” signal propagation has also made it a widely utilized agent in the production of sedative and tranquilizing drugs [38, 39]. The number of synapses in the brain is not fixed but changes with learning [31]. In fact, information from the external environment is able to excite different stimuli that result in neural pathways. This means that new information requires new pathways and thus new connections [31]. The dynamism of synaptic connections is ensured by their plasticity: synapses continuously change their structure according to the density of the received signals and their nature. The wide diversity is ensured by the presence of different types of neurotransmitters that influence the formation and consolidation of connections. This influence depends on their chemical nature, as already seen, but also on their density within the synaptic cleft. When stimuli lead to the formation of a new connection, we
speak of a constructive synapse; when an existing connection is disrupted, of a destructive synapse [40].
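These constructive and destructive dynamics can be caricatured in a few lines of code. The sketch below applies a Hebbian-style update: co-activation of the pre- and post-synaptic units strengthens the weight, silence lets it decay, and a weight falling below a cutoff is pruned (the destructive case). The rule and all the constants are illustrative assumptions, not the plasticity model used later for SNNs.

```python
def update_weight(w, pre_active, post_active,
                  lr=0.05, decay=0.01, prune_below=0.02):
    """Hebbian-style toy rule: strengthen on co-activation, decay otherwise.

    Returns the updated weight, or None when the synapse is pruned
    (the 'destructive synapse' case). All constants are illustrative.
    """
    if pre_active and post_active:
        w += lr * (1.0 - w)        # bounded strengthening toward w = 1
    else:
        w -= decay * w             # slow weakening without co-activation
    return None if w < prune_below else w


# Repeated co-activation consolidates the connection (constructive case)...
w = 0.1
for _ in range(50):
    w = update_weight(w, pre_active=True, post_active=True)
print(f"after 50 co-activations: w = {w:.2f}")

# ...while prolonged silence weakens it until it is pruned.
step = 0
while w is not None:
    w = update_weight(w, pre_active=False, post_active=False)
    step += 1
print(f"synapse pruned after {step} silent steps")
```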
1.1.3 Biological Memory

In the previous paragraphs, the structure and functional dynamics of the single neuron have been illustrated, and the communication between these units has been described. All the mathematical models of neuronal dynamics developed in the literature [20] highlight a very specific feature: neuronal functioning is linked to interaction with the external environment. Without external stimuli, the structure is rigid, does not evolve and therefore does not learn. A dynamic structure is synonymous with learning. This means that a neural tissue, in order to process the stimuli it receives, must be able to change its geometry according to the information coming from outside. Two main features are needed: adaptability and plasticity. Adaptability is the ability to change according to information from the outside [41]; plasticity, on the other hand, is the characteristic of a system that allows such self-modification. What, then, is memory? This is a question that is still not easy to answer today. Memory certainly depends on adaptability and plasticity and is therefore related to the concept of learning, but it cannot be identified with it [31]. Information is stored through changes in synaptic weights that allow its propagation along the neural mapping. A specific trajectory, a neural graph, is thus originated and retraced each time the information is recalled. In the biological world, there are different types of stimuli influencing the form of learning, responsible for different patterns of neural activity. For all types, however, one basic principle applies: neural learning activity changes the geometry of neural pathways and, more generally, the geometry of the brain. When these changes persist, the result is memory storage. These changes occur at the level of synapses. A synapse can be weakened or strengthened, and this determines mnemonic robustness: in fact, the duration of storage, especially for short-term memory [42], depends on how long a synapse has been weakened or strengthened. Therefore, the temporal duration of these processes determines the duration of short-term mnemonic storage. As will be shown in Chaps. 3 and 4, the connections underlying a complex solitonic network can likewise be modified, strengthened or weakened, depending on the incoming signal and its propagation. It is important to emphasize that memory is a complex phenomenon distributed throughout the neural tissue and therefore does not depend on a single district. Memorization can be identified with a geometric change in the entire neural circuitry. These changes occur gradually. Their repetition over time reinforces them and results in a translation of short-term memory into long-term memory. Repetition is thus one of the keys to memory formation as it is commonly understood. In this regard, Dr. Kandel reports that short-term memory “is fixed” in neural circuits through repetition [31]. When information has been stored in circuits as long-term memory and is then lost
in the short term, it takes less time to relearn the information than to learn it from scratch. Some studies [31–43] show that mnemonic memorization is an active process, involving the construction of new synaptic connections, the strengthening or weakening of certain pathways, or the total deletion of neural bridges. Therefore, learning new information may require the activation of new pathways or the modification of pre-existing ones. In the biological case, this occurs according to the mechanisms introduced in a general way in the preceding paragraphs. For the reader who wishes to explore the biological viewpoint in more detail, I highly recommend reading the following references [31–44]. The possibility of building connections that self-modify over time, according to the stimuli they receive from the external environment, has been, and still is, one of the main challenges in making neuromorphic hardware with properties close to those of the biological brain. Chapters 3 and 4 will show how the main strength of SNNs lies in their ability to self-construct and self-modify according to a dynamic that is similar to the biological one. To fully understand the logic of memorization, biological first and solitonic later, an important aspect needs to be analyzed. When information is learned and, contextually, memorized, synapses are organized according to a precise structure. The more solid the memory, the more intense are the synaptic connections that describe it: a well-established memory is characterized by very intense synapses. As introduced in the previous section, a synapse can activate a strong connection either as a result of a signal iterated numerous times along the same pathway or as a result of a signal of large amplitude, indicative of high information content. Generalizing this phenomenon to a number n of synapses, we can say that memory formation can occur following a propagation of signals that are either very intense or iterated over time. In either case, the result is the activation of a synapse mapping. In this sense, memorizing becomes synonymous with iterating. Thus, eliminating stored information requires the contribution of a high density of inhibitory agents that either act on each link or block any subsequent propagation of signals along those paths. This stopping of propagation is read as a reduction in the importance of the information. Gradually, each synapse attenuates its linkage efficiency according to the density of inhibitory agents, until the stored information is eliminated. It is good to keep these processes of biological memory evolution in mind because they are very similar to those governing the formation and evolution of SNN networks. Kandel shows experimentally that information can be recorded, and thus stored, depending on the frequency with which a synapse is activated. This echoes Hebb’s famous postulate that when two or more neurons fire simultaneously, they strengthen the extent of their connection. In his monograph In Search of Memory, he reports: “We discovered that in all three forms of learning the duration of short-term mnemonic storage depends on how long a synapse has been weakened or strengthened.” Input information initiates a process of structural modification of the network that, if iterated, leads to memorization. He also found that the stored information is not localized
but is distributed throughout the whole brain circuit. This means that the neural tissue has a global response. Although one or more neural regions are directly affected by synaptic excitation, it is the entire system that learns and changes accordingly. Furthermore, if the synaptic weight decreases because a path is no longer stimulated, learning the same information again requires fewer iterations of stimuli than learning it from scratch. This is the meaning of memory recovery: a path still exists that need only be reinforced, not created from zero (a toy numerical illustration of this effect closes this section). At this point, it is important to ask how memory works and whether there is a single form of biological memory. Modern neuro-psychology identifies many types of memory that, interacting with each other, perform different tasks. Chapter 3 will consider the three main forms of memory, focusing on their functional descriptions, to understand how the neuromorphic solitonic paradigm is able to reproduce their characteristics.
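Kandel’s observation that relearning is faster than learning from scratch can be mimicked with toy weight dynamics. In the sketch below, a pathway is first consolidated by iterated stimulation, partially decays through forgetting, and is then retrained: the residual weight shortens the second training. The learning rule, the threshold and the decay factor are invented for illustration only.

```python
def train_to_threshold(w, lr=0.05, threshold=0.9):
    """Count the iterated stimulations needed to consolidate a pathway."""
    steps = 0
    while w < threshold:
        w += lr * (1.0 - w)   # each repetition strengthens the synapse
        steps += 1
    return w, steps


w, first = train_to_threshold(0.0)
print(f"learning from scratch: {first} repetitions")    # 45 with these values

w *= 0.5                      # partial forgetting: a residual pathway persists
_, second = train_to_threshold(w)
print(f"relearning after decay: {second} repetitions")  # fewer than the first time
```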
1.1.4 Tissue Properties of Biological Neural Networks

This last section of the chapter introduces some properties of the neuronal dynamics of signal transmission that are indispensable to the processes of learning and memorization. These phenomena are not attributable only to communication between neurons and synapse activations; a much more general consideration must be underlined. Biological neurons are located within an active environment that is capable of transporting nutrients between distant regions and of activating and inhibiting elements of a chemical nature [45]. The activation of neurons thus depends on the occurrence of a number of factors that can modify extensive neural areas. This is precisely why, in the biological case, we speak of neural tissue. When an electrical signal propagates within the network, it runs the risk of dispersing and not reaching its communicative target. This scattering phenomenon can lead to a twofold negative effect: first, the signal does not reach the post-synaptic neurons to which it is addressed, resulting in a loss at the level of the information to be processed and stored; second, unfocused propagation can result in the activation of a variable number of synapses, constructing a mapping that does not correspond to the nature of the stimulus presented as input. The number of connections that can be activated depends on the diffractive nature of the signal propagation: the greater the diffraction, the greater the number of connections that can be stimulated and activated. An iteration of unfocused signals leads to the strengthening of “wrong” synapses, which are then stored. If the synaptic weights are very intense, it then becomes difficult to remove the stored information. But nature, as always, is able to find an optimal solution. Axons, on which the signal runs for a long distance, are externally coated with a substance called myelin. This is essentially a lamellar insulating structure composed mainly of lipids and proteins [46]. The coating can vary from a monolayer to multiple concentric layers that provide the typical sleeve shape, depending on the neural region.
Fig. 1.4 Representation of myelin distribution along the axon of the neuron. Myelin is organized into sleeves that discretely wrap around the exit channel, leaving portions called nodes of Ranvier uncovered. This organization promotes rapid signal conduction through jumping (saltatory) conduction
The presence of myelin gives a characteristic whitish color, compared with the grayish districts that lack it [47]. The main function of myelin is to allow rapid and proper propagation of nerve stimuli, in some cases amplifying signal conduction through so-called “jumping” (saltatory) conduction [48]. This is ensured by the sleeve-like shape of the myelin layers, which do not uniformly cover the axon and give it a sausage-like appearance: it is this conformation that ensures the rapidity of the signal, which does not travel the entire extent of the axon but jumps from one sleeve to the next, decreasing the distance to be traveled. The myelinated segments of the axon are formed by Schwann cells [49] and are periodically interrupted by the so-called nodes of Ranvier [50], where the axon is devoid of myelin. At these structures, the effective passage of ions through the plasmalemma occurs due to the presence of ion gates. The nodes of Ranvier thus contribute to the proper propagation of electrical impulses along the axon channels. The fact that this ion exchange occurs only at the nodes of Ranvier provides significant “savings” in terms of time by ensuring propagation through jumping conduction (Fig. 1.4). Indeed, due to the hopping condition, the signal velocity increases by up to two orders of magnitude, from about 0.5–2 m/s to 100–200 m/s. The qualitative trend of the conduction velocity as a function of the fiber diameter is shown in Fig. 1.5 [51]. It can be seen that, as the diameter increases, the propagation velocity decreases if the axon lacks myelin. On the other hand, if the neuron is myelinated, the velocity remains constant as the axon size changes. As we will see shortly, this is because myelin also acts as a focusing agent for the signal and thus limits its loss and dispersion, ensuring confined low-loss propagation. The second important function of myelin is therefore to isolate electrical propagation, preventing the formation of unintended collateral connections. Indeed, in the absence of myelin, neurons, especially in denser networks, would respond to all signals propagating in their surroundings, even those associated with noise rather than with a precise information signal, behaving somewhat like a conducting wire without an insulating cover. A much larger number of neurons would be involved than would be expected for mapping a specific signal.
Fig. 1.5 Trend of the signal propagation speed along the neuronal axon as a function of fiber diameter. In the presence of myelin, signal conduction is significantly faster than in the case without myelin
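The practical impact of this two-orders-of-magnitude speed-up is easy to quantify with a back-of-the-envelope calculation. In the snippet below, the axon length and the two velocities are representative values chosen from the ranges quoted above, not measurements.

```python
length = 1.0                       # axon length in meters (illustrative)
v_unmyelinated = 1.0               # m/s, within the 0.5-2 m/s range
v_myelinated = 150.0               # m/s, within the 100-200 m/s range

t_slow = length / v_unmyelinated   # about 1 s without myelin
t_fast = length / v_myelinated     # a few ms with saltatory conduction
print(f"unmyelinated: {t_slow * 1e3:.0f} ms, myelinated: {t_fast * 1e3:.1f} ms")
```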
Clinically, this is a condition that is not normal in adult human beings but can occur with different types of pathologies [52, 53]. Figure 1.6a schematically shows the propagation of a diffuse signal, a situation that occurs in the absence of a myelin sheath. The numerous losses that can occur cause possible interactions with multiple neurons (in the figure, the two lateral ones) that affect the learning of the information incorrectly. In contrast, if the axon is sheathed in myelin, the signal is conducted in a focused manner until it reaches the target neuron. In addition, the presence of myelin contributes to the protection of the axon output channel from mechanical stress, preventing possible damage that could result in malfunction of the neuron itself [54]. Finally, at the myelin bundles the exchange of nutrients takes place that allows the sustenance of the neuron and all its activities [55]. In summary, the presence of myelin allows the correct functioning of the neuron and proper signal propagation. In the case of the solitonic neuromorphic paradigm this book describes, a role similar to that of myelin is played by the application of a bias electrostatic field. This ensures the formation of solitonic channels and their stability. The absence of a bias field does not allow soliton realization and results in light propagation in the crystal in the diffractive-diffuse regime. To best understand this functional similarity, it is important to keep in mind the functions of myelin, which are summarized in Table 1.1. As anticipated earlier, a low amount of myelin, or its absence, is a symptom of a nervous system that is not fully developed or prone to pathology. Let us analyze a specific case and see what happens to signal transmission in infants at the time of birth: many nerves lack myelin protection, and the movements of infants are jerky and clumsy.
Fig. 1.6 In the absence of myelin (a), the propagation of the electrical signal along the axon is subject to numerous losses that can interact with neurons other than the target ones, modifying and activating additional synapses; the presence of myelin (b), on the other hand, ensures targeted signal propagation, eliminating losses and promoting excitation of the target synapse
Table 1.1 Main properties and effects of myelin on the biological neuron

Main functions of myelin:
• Focusing of signal conduction
• Increased propagation speed
• Neuron nourishment
• Protection from mechanical stresses
As soon as the myelin sheaths develop, however, movements acquire coordination and fluidity [56]. Similar, and in some cases more severe, symptomatology can result from demyelination processes, in which a progressive loss of myelin sheaths can lead to disorders of various kinds, such as multiple sclerosis [57], Leber’s hereditary optic neuropathy [58], neuromyelitis optica [59], etc.
Recent studies have also shown that, in general, sources of high stress can lead to an overproduction of myelin, which has negative consequences on the communication dynamics inside the brain [60]. An excess of myelin, indeed, results in stronger and faster connections but, at the same time, disrupts the normal patterns of communication between neurons and between brain districts by favoring the activity of one neural region over another. Among the disadvantages of excessive myelin production there is also the accompanying high energy demand: high metabolic energy must be available to support the rich presence of lipids in the myelin membrane. In case of overproduction, the presence of myelin has the opposite effect with respect to its normal function: instead of providing nourishment to the neurons it covers, its development absorbs energy that reduces the nourishment provided to the cells. In the most extreme cases, this process can lead to the progressive death of neurons due to nutrient deficiency [59]. For these reasons, myelin overproduction results in worse conduction of nerve stimuli. Poor conduction of the information-associated electrical signal is thus associated with both underproduction and overproduction of myelin. Chapter 4 will show that the properties of myelin with respect to biological neurons, both in favor of communicative conduction and to its disadvantage, are quite similar to those shown by the application of an electrostatic field for the formation of solitonic neurons. A brief description of the memory mechanisms identified by modern neuropsychology will be given in Chap. 3. There are in fact three main types of memory: episodic memory, procedural memory and semantic memory. These work together, overlapping in information processing, memory construction and memory recall. The presence of all three is necessary for knowledge formation. Their functional details will be covered in order to understand how learning occurs in solitonic neural networks. The idea behind SNNs is to develop a system that functions globally like the biological brain, can learn and remember like it, and is not limited to replicating its basic functionality as the systems developed so far in the literature do.
References

1. Larsen, Rylan S., and P. Jesper Sjöström. 2015. Synapse-type-specific plasticity in local circuits. Current Opinion in Neurobiology 35: 127–135. ISSN 0959-4388. https://doi.org/10.1016/j.conb.2015.08.001
2. Grant, S.G.N. 2019. Synapse diversity and synaptome architecture in human genetic disorders. Human Molecular Genetics 28 (R2): R219–R225. https://doi.org/10.1093/hmg/ddz178
3. Goldwyn, J.H., and E. Shea-Brown. 2011. The what and where of adding channel noise to the Hodgkin-Huxley equations. PLoS Computational Biology. https://doi.org/10.1371/journal.pcbi.1002247
4. Nölting, Svenja, and others. 2022. Personalized management of pheochromocytoma and paraganglioma. Endocrine Reviews 43 (2): 199–239.
5. Yaghini Bonabi, Safa, Hassan Asgharian, Saeed Safari, and Majid Nili Ahmadabadi. 2014. FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model. Frontiers in Neuroscience 8. https://doi.org/10.3389/fnins.2014.00379
6. Gross, Joachim, Bettina Pollok, Martin Dirks, Lars Timmermann, Markus Butz, and Alfons Schnitzler. 2005. Task-dependent oscillations during unimanual and bimanual movements in the human primary motor cortex and SMA studied with magnetoencephalography. NeuroImage 26 (1): 91–98. ISSN 1053-8119. https://doi.org/10.1016/j.neuroimage.2005.01.025
7. Vizcaíno, J., E. Deutsch, R. Wang, et al. 2014. ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nature Biotechnology 32: 223–226. https://doi.org/10.1038/nbt.2839
8. Furness, John B. 2000. Types of neurons in the enteric nervous system. Journal of the Autonomic Nervous System 81 (1–3): 87–96. ISSN 0165-1838. https://doi.org/10.1016/S0165-1838(00)00127-2
9. Palay, S.L., and G.E. Palade. 1955. The fine structure of neurons. The Journal of Biophysical and Biochemical Cytology 1 (1): 69–88. https://doi.org/10.1083/jcb.1.1.69. PMID: 14381429; PMCID: PMC2223597
10. Shaban, A., S.S. Bezugam, and M. Suri. 2021. An adaptive threshold neuron for recurrent spiking neural networks with nanodevice hardware implementation. Nature Communications 12: 4234. https://doi.org/10.1038/s41467-021-24427-8
11. Gutkin, B.S., and G.B. Ermentrout. 1998. Dynamics of membrane excitability determine interspike interval variability: a link between spike generation mechanisms and cortical spike train statistics. Neural Computation 10 (5): 1047–1065. https://doi.org/10.1162/089976698300017331
12. Vezzani, Annamaria, and Barbara Viviani. 2015. Neuromodulatory properties of inflammatory cytokines and their impact on neuronal excitability. Neuropharmacology 96 (Part A): 70–82. ISSN 0028-3908. https://doi.org/10.1016/j.neuropharm.2014.10.027
13. Misra, Janardan, and Indranil Saha. 2010. Artificial neural networks in hardware: A survey of two decades of progress. Neurocomputing 74 (1–3): 239–255. ISSN 0925-2312. https://doi.org/10.1016/j.neucom.2010.03.021
14. Gauthier, Julie, Martin Parent, Martin Lévesque, and André Parent. 1999. The axonal arborization of single nigrostriatal neurons in rats. Brain Research 834 (1–2): 228–232. ISSN 0006-8993. https://doi.org/10.1016/S0006-8993(99)01573-5
15. Stadelmann, Christine, Sebastian Timmler, Alonso Barrantes-Freer, and Mikael Simons. 2019. Myelin in the central nervous system: structure, function, and pathology. https://doi.org/10.1152/physrev.00031.2018
16. Merlo, L., F. Cimino, F.F. Angileri, D. La Torre, A. Conti, S.M. Cardali, A. Saija, and A. Germanò. 2014. Alteration of synaptic junction proteins following traumatic brain injury. Journal of Neurotrauma 31 (16): 1375–1385.
17. Mattson, Mark P., M. Murrain, P.B. Guthrie, and S.B. Kater. 1989. Fibroblast growth factor and glutamate: opposing roles in the generation and degeneration of hippocampal neuroarchitecture. Journal of Neuroscience 9 (11): 3728–3740. https://doi.org/10.1523/JNEUROSCI.09-11-03728.1989
18. Araque, A., V. Parpura, R.P. Sanzgiri, and P.G. Haydon. 2001. Glutamate-dependent astrocyte modulation of synaptic transmission between cultured hippocampal neurons. European Journal of Neuroscience. https://doi.org/10.1046/j.1460-9568.1998.00221.x
19. Carbone, E., C. Calorio, and D.H.F. Vandael. 2014. T-type channel-mediated neurotransmitter release. Pflügers Archiv—European Journal of Physiology 466: 677–687. https://doi.org/10.1007/s00424-014-1489-z
20. Gerstner, Wulfram, Werner M. Kistler, Richard Naud, and Liam Paninski. 2014. Neuronal dynamics: From single neurons to networks and models of cognition. Cambridge University Press; Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep learning. MIT Press.
21. Bak, L.K., A. Schousboe, U. Sonnewald, and H.S. Waagepetersen. 2006. Glucose is necessary to maintain neurotransmitter homeostasis during synaptic activity in cultured glutamatergic neurons. Journal of Cerebral Blood Flow & Metabolism 26 (10): 1285–1297. https://doi.org/10.1038/sj.jcbfm.9600281
References
17
22. Chih, Ben et al. (2005). Control of excitatory and inhibitory synapse formation by neuroligins. Science 307: 1324–1328. https://doi.org/10.1126/science.1107470 23. Scimemi, A., and M. Beato. 2009. Determining the Neurotransmitter Concentration Profile at Active Synapses. Molecular Neurobiology 40: 289–306. https://doi.org/10.1007/s12035-0098087-7. 24. Hebb, D. O. (1949). Temperament in chimpanzees: I. Method of analysis. Journal of Comparative and Physiological Psychology, 42(3), 192–206. https://doi.org/10.1037/h00 56842 25. Sweatt, J.D. 2008. The neuronal MAP kinase cascade: A biochemical signal integration system subserving synaptic plasticity and memory. Journal of Neurochemistry. https://doi.org/10.1046/ j.1471-4159.2001.00054.x. 26. Pereda, A. 2014. Electrical synapses and their functional interactions with chemical synapses. Nature Reviews Neuroscience 15: 250–263. https://doi.org/10.1038/nrn3708. 27. Anne, Christine, and Bruno Gasnier. 2014. Chapter Three—Vesicular neurotransmitter transporters: mechanistic aspects. In Current Topics in Membranes, vol. 73, ed. Mark O. Bevensee. Academic Press, 149–174. ISSN 1063–5823, ISBN 9780128002230. https://doi.org/10.1016/ B978-0-12-800223-0.00003-7 28. Bennett, M.V.L. 1997. Gap junctions as electrical synapses. Journal of Neurocytology 26: 349–366. https://doi.org/10.1023/A:1018560803261. 29. Bile, A., G. Bile, R. Pepino, et al. 2023. Innovative and non-invasive method for the diagnosis of dyschromatopsia and the re-education of the eyes. Res. Biomed. Eng. 39: 321–327. https:// doi.org/10.1007/s42600-023-00263-1. 30. Berlucchi, G., and H.A. Buchtel. 2009. Neuronal plasticity: Historical roots and evolution of meaning. Experimental Brain Research 192: 307–319. https://doi.org/10.1007/s00221-0081611-6. 31. Kandel, Eric R. 2017. In search of memory. Code Editions. ISBN 9788875786755 32. Grafman, Jordan. 2000. Conceptualizing functional neuroplasticity. Journal of Communication Disorders 33(4): 345–356. ISSN 0021–9924. https://doi.org/10.1016/S0021-9924(00)00030-7 33. Hawkins, R.D., E.R. Kandel, and C.H. Bailey. 2006. Molecular mechanisms of memory storage in Aplysia. The Biological Bulletin 210: 3. https://doi.org/10.2307/4134556. 34. Castellucci, Vincent et al. 1970. Neuronal mechanisms of habituation and dishabituation of the gill-withdrawal reflex in Aplysia. Science 167: 1745–1748.https://doi.org/10.1126/science. 167.3926.1745 35. Newman, Ehren L., and Kenneth A. Norman. 2010. Moderate excitation leads to weakening of perceptual representations. Cerebral Cortex, 20(11): 2760-2770. https://doi.org/10.1093/cer cor/bhq021 36. Zhou, Y., and N.C. Danbolt. 2014. Glutamate as a neurotransmitter in the healthy brain. Journal of Neural Transmission 121: 799–817. https://doi.org/10.1007/s00702-014-1180-8. 37. Faingold, Carl L., Greta Gehlbach, and Donald M. Caspary. 1989. On the role of GABA as an inhibitory neurotransmitter in inferior colliculus neurons: iontophoretic studies. Brain Research, 500(1–2): 302–312. ISSN 0006–8993. https://doi.org/10.1016/0006-8993(89)903 26-0 38. Krogsgaard-Larsen, Povl, Bente Frolund, and Karla Frydenvang. 2000. GABA uptake inhibitors. Design, molecular pharmacology and therapeutic aspects. Current Pharmaceutical Design 6, 12: 1193–1209. https://doi.org/10.2174/1381612003399608 39. Jazvinscak Jembrek, Maja, and Josipa Vlainic. 2015. GABA Receptors: Pharmacological Potential and Pitfalls. Current Pharmaceutical Design 21(34): 4943–4959(17) 40. Maham, B., and R.C. Kizilirmak. 2018. 
Neuro-spike communications with multiple synapses under inter-neuron interference. IEEE Access 6: 39962–39968. https://doi.org/10.1109/ACC ESS.2018.2854878. 41. Duarte, Renato, Alexander Seeholzer, Karl Zilles, and Abigail Morrison. 2017. Synaptic patterning and the timescales of cortical dynamics. (2017). Current Opinion in Neurobiology 43: 156–165. ISSN 0959–4388. https://doi.org/10.1016/j.conb.2017.02.007
18
1 Introduction to Neural Networks: Biological Neural Network
42. Jonides, J., R.L. Lewis, E.N. Derek, C.A. Lusting, M.G. Berman, and K.S. Moore. 2008. The mind and brain of short-term memory. Annual Review of Psychology 59 (1): 193–224. 43. Kandel, E.R. 2009. The biology of Memory: A forty-year perspective. Journal of Neuroscience. https://doi.org/10.1523/JNEUROSCI.3958-09.2009. 44. Morillas-España, A., Á. Ruiz-Nieto, T. Lafarga, G. Acién, Z. Arbib, and C.V. GonzálezLópez. 2022. Biostimulant capacity of Chlorella and Chlamydopodium species produced using wastewater and centrate. Biology 11: 1086. https://doi.org/10.3390/biology11071086. 45. Kuffler, S.W., and D.D. Potter. 1964. Glia in the leech central nervous system:Physiological properties and neuron-glia relationship. Journal of Neurophysiology. https://doi.org/10.1152/ jn.1964.27.2.290. 46. Aggarwal, S., L. Yurlova, and M. Simons. 2011. Central nervous system myelin: Structure, synthesis and assembly. Trends in Cell Biology. https://doi.org/10.1016/j.tcb.2011.06.004. 47. Paus, Tomáš. 2010. Growth of white matter in the adolescent brain: Myelin or axon? Brain and Cognition 72(1): 26–35. ISSN 0278–2626. https://doi.org/10.1016/j.bandc.2009.06.002 48. Doi, A. 1987. Ionic conduction and conduction polarization in oxide glass. Journal of Materials Science 22: 761–769. https://doi.org/10.1007/BF01103509. 49. Kidd, Grahame J., Nobuhiko Ohno, and Bruce D. Trapp. 2013. Chapter 5—Biology of schwann cells. In Handbook of Clinical Neurology, vol. 115. ed(s): Gérard Said, Christian Krarup, 55– 79. Elsevier. ISSN 0072–9752, ISBN 9780444529022,https://doi.org/10.1016/B978-0-44452902-2.00005-9 50. Girault, Jean-Antoine, and Elior Peles. 2002. Development of nodes of Ranvier. Current Opinion in Neurobiology 12(5):476–485. ISSN 0959–4388. https://doi.org/10.1016/S09594388(02)00370-7 51. Waxman, S., and M. Bennett. 1972. Relative conduction velocities of small myelinated and non-myelinated fibres in the central nervous system. Nature New Biology 238: 217–219. https:// doi.org/10.1038/newbio238217a0. 52. Reeves, T.M. 2012. Unmyelinated axons show selective rostrocaudal pathology in the corpus callosum after traumatic brain injury. Journal of Neuropathology & Experimental Neurology 71 (3): 198–210. https://doi.org/10.1097/NEN.0b013e3182482590. 53. Ochoa, J. 1978. Recognition of unmyelinated fiber disease: Morphologic criteria. Muscle & Nerve. https://doi.org/10.1002/mus.880010506. 54. Krämer-Albers, Eva-Maria, Niko Bretz, Stefan Tenzer, Christine Winterstein, Wiebke Möbius, Hendrik Berger, Klaus-Armin Nave, Hansjörg Schild, and Jacqueline Trotter. 2007. Oligodendrocytes secrete exosomes containing major myelin and stress-protective proteins: Trophic support for axons? Proteomics Clinical Applications. https://doi.org/10.1002/prca.200700522 55. Shreiber, D.I., H. Hao, and R.A. Elias. 2009. Probing the influence of myelin and glia on the tensile properties of the spinal cord. Biomechanics and Modeling in Mechanobiology 8: 311–321. https://doi.org/10.1007/s10237-008-0137-y. 56. Sternberger, Nancy H., Yasuto Itoyama, Marian W. Kies, and Henry de F Webster et al. 1978. Immunocytochemical method to identify basic protein in myelin-forming oligodendrocytes of newborn rat C.N.S. J Neurocytol 7: 251–263. https://doi.org/10.1007/BF01217922 57. Rossi, Christina, Deepa Padmanaban, Jake Ni, Li-An Yeh, Marcie A. Glicksman, and Hanspeter Waldne. 2007. Identifying druglike inhibitors of myelin-reactive t cells by phenotypic highthroughput screening of a small-molecule library. SLAS Discovery 12(4): 481–489. ISSN 2472– 5552. 
https://doi.org/10.1177/1087057107301272 58. Man, P.Y.W., D.M. Turnbull, and P.F. Chinnery. 2002. Leber hereditary optic neuropathy. Journal of Medical Genetics 39: 162–169. 59. Wingerchuk, Dean M., Vanda A. Lennon, Sean J. Pittock, and Brian G. Weinshenker. 2007. The spectrum of neuromyelitis optica. The lancet neurology 6(9). https://doi.org/10.1016/S14744422(07)70216-8 60. Ravera, S., and I. Panfoli. 2015. Role of myelin sheath energy metabolism in neurodegenerative diseases. Neural Regeneration Research 10 (10): 1570–1571. https://doi.org/10.4103/16735374.167749.PMID:26692843;PMCID:PMC4660739.
Chapter 2
Overview of Neuromorphic Optical Systems
Abstract After analyzing in Chap. 1 some of the most important features of the biological brain that underlie learning and information storage, this chapter provides a quick overview of the optoelectronic and fully optical neuromorphic techniques implemented to date. The operation and physical principles of each approach are described. The approaches are grouped into three main categories, called paradigms, each built around one main neural trait. The first paradigm replicates neural excitability. The second paradigm focuses on the concept of connection between units. Finally, the third paradigm is an absolute novelty in the scientific panorama and encompasses only solitonic networks, which can replicate a tissue structure. The main difference from other neuromorphic hardware lies in their ability to self-organize according to the received stimuli, evolving accordingly over time.
2.1 Overview

The journey into neuromorphic optics begins with a question: why is optics a powerful tool for implementing the mathematical algorithms underlying neural computation? To answer fully, it is important to reflect on some aspects of neural networks. A neural network is a system that must, first of all, be able to collect data and then to analyze, recognize, and memorize them. To accomplish this, each component must be connected to the whole structure and able to transport data between different districts. Moreover, neural networks require an adaptive capacity: they should be able to modify their decision rules and their structure according to the learning environment. This ability to readapt represents the real challenge of neuromorphic research. In various fields, optical devices have proved extremely adaptable [1]. Moreover, they generally suffer considerably lower energy losses than their electronic counterparts [2]. For these main reasons, research first moved towards the realization of optoelectronic neuromorphic circuits and then completely into the domain of optics. Therefore, this chapter aims to introduce the
main all-optical neuromorphic research lines, in order to contextualize the work presented here. Throughout the discussion, the neural peculiarities of each neuromorphic approach are described. In this way, the innovative aspect introduced by this book is highlighted: the presented system overturns the general neuromorphic paradigm and, instead of starting from the single neural units (the neurons), starts from the concept of connection (the synapses) to elaborate a model of solitonic neural tissue.
2.1.1 Introduction to Optical Neuromorphics

Artificial Intelligence was born as an attempt to develop machines that reach the self-adapting reasoning capacity of the human brain, overcoming the limits of traditional computers [3]. Many milestones, in different fields, have been achieved through Machine Learning (ML) and Deep Learning (DL), which have allowed computers to learn and improve their reasoning over time. However, performance limits tied to physical characteristics remain. This is due to several factors: a physical limit to the downsizing of circuits, a huge energy consumption for operation in nanometric devices, and the typical von Neumann architecture of computational systems, which is not efficient [4]. Traditional computers have separate computing and memory units, while biological neural computation is massively parallel. This latter peculiarity is at the root of the low power required for the human brain to function. Neuromorphic research has therefore tried to replicate the benefits of the biological neural system. Where does biological brain efficiency come from? Mainly from the unification of memory with computational areas, made possible by neural plasticity, the ability of the brain to plastically modify its own interconnections to optimize processes and memorize synaptic pathways. This book identifies three fundamental paradigms in both electrical and optical neuromorphics, which will be discussed shortly. However, this book does not aim to discuss electrical neuromorphic systems, which are extensively discussed in other works. Due to the physical nature of their components, electrical realizations suffer from limitations that can be summarized in the following points:
• high energy losses, mainly due to the Joule effect;
• low, or at least limited, computation speed, due to propagation along cables and high energy losses;
• poor adaptability and inability to self-modify.
It is important to focus on this last point for two main reasons. First, because, speaking of biological memory, the fundamental characteristic for a neuromorphic system is plasticity. Second, because rigid systems, unable to adapt when the environment changes, cannot reach the dynamism of biological neural tissues. These limits are overcome by neuromorphic photonics, which exploits the properties of light and of its interaction with matter to achieve high-performance, low-loss computing. In the vision of this book, neuromorphic photonics, to date,
has been developed by responding to three different paradigms. To be precise, the first two paradigms collect the research belonging to the state of the art to date, while the third paradigm is introduced with the presentation of solitonic networks. The three paradigms can be summarized as follows:
• Paradigm 1 focuses on single-neuron dynamics; it arises from the attempt to reproduce the characteristic behavioral and functional traits of the single neuron.
• Paradigm 2 highlights the passage from neurons to synapses; it takes into consideration the dynamics of how single neurons are connected, as an extension of the first paradigm.
• Paradigm 3 proceeds from the complex network, through the synapses, to the neurons. It introduces a complete structural reversal: the starting point is the overall dynamics of the neural tissue (the network), from which, through the characterization of the single synapses, the neural units are defined.
2.1.2 First Neuromorphic Paradigm

Most of today's neuromorphic optical systems are conceived starting from the functioning of a single neuron. This approach can sometimes prove limiting from the point of view of integration into larger systems. The starting point is the reproduction of the main functions of biological neurons: first of all, excitability, that is, the ability to produce a self-consistent pulse from small perturbations through a threshold process, as in integrate-and-fire models [6]. Neuron models used in engineering are simpler than biological ones. To reproduce the excitability feature, many researchers initially turned to the study of optical cavities [7–9]. A strong similarity was subsequently recognized between the excitation patterns of semiconductor lasers for signal processing and the spiking nature of neurons. This paradigm attempts to convert the integrate-and-fire model introduced in Chap. 1 into physical processes. Among the most important models realized are semiconductor lasers embedded with graphene [10], two-section models [11–13], photonic crystal nanocavities [14], polarization-sensitive vertical-cavity lasers [15], lasers with optical feedback or optical injection [16], and linked photodetector-laser systems with receiverless connections [17] or resonant tunneling [18] (Fig. 2.1). The VCSEL neuron [11] stands out as the hardware that best emulates the integrate-and-fire model. These neurons possess the ability to maintain a constant amplitude of the generated signal at each moment in time. This ensures that information can be encoded in the form of spike-train frequencies, mirroring the processes in the biological brain. Moreover, their structure offers an advantage in terms of ultrafast computation, thanks to photonics. They exist as distinct units that can be organized in successive layers. However, there is a significant limitation. These units remain unchangeable over time; their geometric properties cannot be altered. Consequently, the history of processed signals needs to be stored in separate and distinct areas. Typically, this is achieved through electronic means. In this scenario, the processing unit remains distinct from the storage unit. This paradigm lacks the capability to
construct a memory that undergoes processing concurrently with learning, and that adapts over time to the type of information being received.

Fig. 2.1 Schematic functional representation of the neuron model developed according to paradigm 1. It consists of input channels, a computational area able to sum the weighted inputs and to transmit them, according to an activation function, towards the outputs
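The threshold behavior this paradigm emulates can be condensed into a few lines of code. The following sketch is an illustrative leaky integrate-and-fire model in Python, not the actual VCSEL dynamics; the time constant, threshold, and drive are arbitrary example values. It shows the two ingredients mentioned above: sub-threshold integration of a weighted input and emission of a stereotyped pulse once a threshold is crossed.

```python
import numpy as np

def lif_neuron(inputs, dt=0.1, tau=10.0, v_th=1.0, v_reset=0.0):
    """Leaky integrate-and-fire: integrate the input, spike at threshold."""
    v, trace, spikes = v_reset, [], []
    for i, current in enumerate(inputs):
        # Leaky integration: the potential decays toward rest, driven by input.
        v += dt * (-v / tau + current)
        if v >= v_th:        # threshold process -> self-consistent pulse
            spikes.append(i)
            v = v_reset      # stereotyped reset after each spike
        trace.append(v)
    return np.array(trace), spikes

# A constant drive yields a regular spike train whose frequency encodes the
# input strength, mirroring the rate coding mentioned in the text.
trace, spikes = lif_neuron(np.full(1000, 0.15))
print(len(spikes), "spikes in 1000 steps")
```

Note that the model keeps no record of past spike trains beyond its instantaneous state, which is precisely the limitation of paradigm 1 discussed above.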
2.1.3 Second Neuromorphic Paradigm

A second approach to neuromorphic optics consists in conceiving the interconnection between units. This means remembering the passage of signals at previous times and which connections they activated. Indeed, a connection is a bridge that can be strengthened or weakened; therefore, it must keep a memory of its historical evolution. The passage from paradigm 1 to paradigm 2 thus moves from the single neuron to the connection, precisely through the introduction of a memory. To obtain a memory in the network implementation it is necessary to be able to modify the weights of the connections, and therefore to have a tuning system available. In neuromorphic O/E/O devices, this is usually implemented electrically, as shown by the photodetector-modulator neuronal model for MZI meshes [19] and by the superconducting electronic signal pathway [20]. However, all-optical implementations are inherently faster, since the latter are slowed by carrier drift and current-flow stages. The major cause of inefficiency in these systems is the need to de-multiplex and digitize many independent channels. Among the most recent all-optical technologies developed are those that exploit chalcogenide phase-change materials (PCMs) [3]. Briefly, the state of these materials depends on the processing of the inputs and persists until a new input configuration is presented. This system exploits wavelength-division multiplexing techniques to realize a scalable circuit for photonic
neural networks. The synapses are made of optical waveguides, and the weighting process is achieved via PCM cells able to modify light propagation. When the PCM cell is in the amorphous state, the structure is highly transmissive, representing a strong connection between two units. On the contrary, in the crystalline state most of the light is absorbed, reproducing a weak connection (Fig. 2.2). By using multiple cells (varying in size as needed), individual connections can be replicated. These cells can also switch between the ON and OFF states independently, thus creating an information pattern. Furthermore, it is possible to replicate the structure over many successive layers, organizing networks that closely resemble the typical structure of machine learning. However, there are limitations in this case too. Firstly, only two states (on/off) are possible. In contrast, a biological synapse can exhibit a wide range of nuances, adding further complexity to the formation of neural maps. Additionally, PCM materials alone cannot serve as the exclusive foundation for constructing a neural framework. When the crystalline state is switched, the memory of the previous configuration is erased and must be stored in a secondary memory unit, whether hardware or software. Consequently, these memories are very short-term. As previously discussed, biological storage is a complex process characterized by variability, dynamism, and non-definitiveness: it adapts during learning (indeed, the two occur simultaneously), and non-permanent memory can evolve over time through the weakening of some synapses and the strengthening of others [17]. This paradigm still maintains a clear-cut separation between processing and storing procedures.

Fig. 2.2 Schematic representation of the main features of a synaptic connection. It is a bridge between two or more neural units whose structure must be dynamic, so that the connection intensity can be strengthened or weakened according to the nature of the input signals
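The two-state weighting just described can be pictured as a transmission mask applied to the optical channels of a layer. The sketch below is an illustrative Python model of the binary PCM synapse the text contrasts with graded biological weights; the transmission values for the amorphous and crystalline states are invented placeholders, not measured figures.

```python
import numpy as np

# Hypothetical power transmissions of a PCM cell in its two stable states.
T_AMORPHOUS = 0.9    # highly transmissive -> strong connection (ON)
T_CRYSTALLINE = 0.1  # mostly absorbing   -> weak connection (OFF)

def pcm_layer(inputs, states):
    """Weight optical inputs with binary PCM cells, one cell per WDM channel.

    inputs: optical power on each wavelength channel.
    states: boolean array, True = amorphous (ON), False = crystalline (OFF).
    """
    weights = np.where(states, T_AMORPHOUS, T_CRYSTALLINE)
    return inputs * weights  # each channel is weighted independently

signals = np.ones(4)                            # four WDM channels
pattern = np.array([True, False, True, False])  # an information pattern
print(pcm_layer(signals, pattern))              # -> [0.9 0.1 0.9 0.1]
```

Overwriting `pattern` erases the previous configuration, which makes concrete the need, noted above, for a secondary memory unit in this paradigm.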
2.1.4 Third Neuromorphic Paradigm

The third paradigm represents an intense acceleration towards the biological world. If, as mentioned before, the neuromorphic paradigms proposed so far have been based on models that are highly simplified with respect to biological complexity, Solitonic Neural Networks, which pave the way for this new paradigm, do not take as their starting point neurons (paradigm 1) or single synapses (paradigm 2), but implement an entire neural tissue. In this way they capture many properties of the biological world. Starting from the complex, that is, the structural tissue, they come to designate single neurons and single synapses. SNNs arise from the propagation of light in nonlinear crystals. Their description from the physical point of view will be addressed in the next chapters. This section only reflects on the parallelism with biological neural networks, in order to fully understand the novelty introduced by paradigm 3. The process underlying the formation of SNNs is photorefractivity. An important aspect to stress is that the formation of SNNs depends on the movement of charge density; similarly, the birth, modification, and destruction of biological synapses depend on the movement of neurotransmitter densities. The greater the shift in neurotransmitter density inside the synaptic region, the more intense the biological synapse; similarly, the greater the charge-density movement, the higher the refractive index contrast between the light-illuminated region and the dark region, and therefore the deeper the channel excavated. Nervous tissue is also characterized by glial cells, which are not directly involved in the transmission of signals but constitute a structural support to neurons, ensuring their nourishment and protection from injury. There are many types of glial cells, whose description is beyond the scope of this manuscript; the reader is invited to explore them through the references provided [21–23]. An in-depth discussion of the physics of photorefractivity, necessary for understanding the electric-charge redistribution processes underlying the modification of the refractive index and therefore the formation of the soliton, will be given in Chap. 4. Thus, paradigm 3 depends strictly on the photorefractivity of some nonlinear crystals. This electro-optical property guarantees the possibility of realizing a highly connected tissue characterized by fundamental units, the photonic X-Junction neurons, communicating through plastic synapses, and an interaction environment that can favor or disfavor the propagation of signals and the formation of the synapses themselves. The properties of the connections depend on specific factors such as the intensity of the incoming light, the presence of an electrostatic bias field, and possible doping agents. It is these factors that can "feed" the solitonic network by increasing or decreasing its degree of connectivity. With paradigm 3, neuromorphic applications broaden their horizon, no longer focusing on individual structures. Indeed, with SNNs, neuromorphic technology opens to an entire complex learning structure, which is close in function and architecture to the biological brain. In fact, the complexity of the brain cannot be achieved merely by making units and connecting them. A context is needed. It is necessary to immerse them in an environment that integrates with their functionality and can in this way assist the life of artificial neurons.
This increases
the degree of connectedness (see Hebb's postulate on learning) and increases the degree of complexity and variability, improving the characteristics of learning. Moreover, it is precisely the connection between neurons, synapses, and the surrounding neural environment (with all its properties and units) that ensures plastic learning behavior. Since SNNs arise from the local modification of the refractive index with respect to the global configuration, they are able to reproduce such behavior: the created units are in constant contact with the environment in which they are formed and, with it, give rise to more complex superstructures, which are precisely what are called solitonic neural networks (Fig. 2.3).

Fig. 2.3 The third paradigm on photonic neuromorphic hardware encompasses those systems that are capable of reproducing an entire learning environment and is not limited to the realization of individual structures or connections. To date, only solitonic neural networks are able to fulfill this goal
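The analogy between charge-density movement and synaptic reinforcement can be caricatured in a few lines: let the refractive index contrast of each light-written channel grow with the intensity it carries, saturate (as a nonlinearly saturating index does), and relax when unused. This is only a toy model of the behavior described above, with invented growth and decay rates; the actual photorefractive dynamics is derived in Chap. 4.

```python
def update_channel(delta_n, intensity, gain=0.05, decay=0.01, dn_sat=1e-4):
    """Toy plasticity rule for one solitonic channel.

    delta_n: refractive index contrast of the channel (its 'weight').
    intensity: normalized light intensity routed through the channel.
    The contrast grows with use, saturates, and relaxes in the dark.
    """
    growth = gain * intensity * (dn_sat - delta_n)  # saturating, light-driven
    return delta_n + growth - decay * delta_n       # minus dark relaxation

dn = 0.0
for _ in range(100):
    dn = update_channel(dn, intensity=1.0)  # repeated stimulation strengthens
print(f"after learning:   delta_n = {dn:.2e}")
for _ in range(100):
    dn = update_channel(dn, intensity=0.0)  # silence slowly weakens
print(f"after forgetting: delta_n = {dn:.2e}")
```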
References

1. Dupeyroux, Julien, Jesse J. Hagenaars, Federico Paredes-Vallés, and Guido C.H.E. de Croon. 2021. Neuromorphic control for optic-flow-based landing of MAVs using the Loihi processor. In 2021 IEEE International Conference on Robotics and Automation (ICRA), 96–102. Xi'an, China. https://doi.org/10.1109/ICRA48506.2021.9560937
2. Bile, A., F. Moratti, H. Tari, et al. 2021. Supervised and unsupervised learning using a fully-plastic all-optical unit of artificial intelligence based on solitonic waveguides. Neural Computing and Applications 33: 17071–17079. https://doi.org/10.1007/s00521-021-06299-7
3. Feldmann, J., N. Youngblood, C.D. Wright, et al. 2019. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature 569: 208–214. https://doi.org/10.1038/s41586-019-1157-8
4. Tari, H., A. Bile, F. Moratti, et al. 2022. Sigmoid type neuromorphic activation function based on saturable absorption behavior of graphene/PMMA composite for intensity modulation of surface plasmon polariton signals. Plasmonics 17: 1025–1032. https://doi.org/10.1007/s11468-021-01553-z
5. Mead, C. 1990. Neuromorphic electronic systems. Proceedings of the IEEE 78 (10): 1629–1636. https://doi.org/10.1109/5.58356
6. Gerstner, Wulfram, Werner M. Kistler, Richard Naud, and Liam Paninski. 2014. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press.
7. Krauskopf, Bernd, Klaus Schneider, Jan Sieber, Sebastian Wieczorek, and Matthias Wolfrum. 2003. Excitability and self-pulsations near homoclinic bifurcations in semiconductor laser systems. Optics Communications 215 (4–6): 367–379. https://doi.org/10.1016/S0030-4018(02)02239-3
8. Wünsche, H.J., O. Brox, M. Radziunas, and F. Henneberger. 2002. Excitability of a semiconductor laser by a two-mode homoclinic bifurcation. Physical Review Letters 88: 023901.
9. David, S., et al. 1993. LIFRb and gp130 as heterodimerizing signal transducers of the tripartite CNTF receptor. Science 260: 1805–1808. https://doi.org/10.1126/science.8390097
10. Sharma, Deepak, Sweta Shastri, and Pradeep Sharma. 2016. Intrauterine growth restriction: antenatal and postnatal aspects. Clinical Medicine Insights: Pediatrics 10. https://doi.org/10.4137/CMPed.S40070
11. Azin, M., D.J. Guggenmos, S. Barbay, R.J. Nudo, and P. Mohseni. 2011. A battery-powered activity-dependent intracortical microstimulation IC for brain-machine-brain interface. IEEE Journal of Solid-State Circuits 46 (4): 731–745. https://doi.org/10.1109/JSSC.2011.2108770
12. Selmi, F., R. Braive, G. Beaudoin, I. Sagnes, R. Kuszelewicz, and S. Barbay. 2014. Relative refractory period in an excitable semiconductor laser. Physical Review Letters 112: 183902.
13. Selmi, Carlo, and M. Eric Gershwin. 2014. Diagnosis and classification of reactive arthritis. Autoimmunity Reviews 13 (4–5): 546–549. https://doi.org/10.1016/j.autrev.2014.01.005
14. Brunstein, Maia, Alejandro M. Yacomotti, Isabel Sagnes, Fabrice Raineri, Laurent Bigot, and Ariel Levenson. 2012. Excitability and self-pulsing in a photonic crystal nanocavity. Physical Review A 85: 031803(R).
15. Al-Seyab, Rihab, Kevin Schires, Nadir Ali Khan, Antonio Hurtado, Ian D. Henning, and Michael J. Adams. 2011. Dynamics of polarized optical injection in 1550-nm VCSELs: theory and experiments. IEEE Journal of Selected Topics in Quantum Electronics 17 (5): 1242–1249. https://doi.org/10.1109/JSTQE.2011.2138683
16. Gelens, Lendert, Lilia Mashal, Stefano Beri, Werner Coomans, Guy Van der Sande, Jan Danckaert, and Guy Verschaffelt. 2010. Excitability in semiconductor microring lasers: Experimental and theoretical pulse characterization. Physical Review A 82: 063841.
17. Prucnal, Paul R., Bhavin J. Shastri, Thomas Ferreira de Lima, Mitchell A. Nahmias, and Alexander N. Tait. 2016. Recent progress in semiconductor excitable lasers for photonic spike processing. Advances in Optics and Photonics 8: 228–299.
18. Romeira, Bruno, Julien Javaloyes, Charles N. Ironside, José M.L. Figueiredo, Salvador Balle, and Oreste Piro. 2013. Excitability and optical pulse generation in semiconductor lasers driven by resonant tunneling diode photo-detectors. Optics Express 21: 20931–20940.
19. Minkov, M., I.A.D. Williamson, L.D. Andreani, D. Gerace, and L. Beicheng. 2020. Inverse design of photonic crystals through automatic differentiation. ACS Photonics 7 (7): 1729–1741.
20. McCaughan, A.N., V.B. Verma, S.M. Buckley, et al. 2019. A superconducting thermal switch with ultrahigh impedance for interfacing superconductors to semiconductors. Nature Electronics 2: 451–456. https://doi.org/10.1038/s41928-019-0300-8
21. Ransom, B.R., and H. Sontheimer. 1992. The neurophysiology of glial cells. Journal of Clinical Neurophysiology 9 (2): 22–252.
22. Parpura, et al. 2012. Glial cells in (patho)physiology. Journal of Neurochemistry 121: 4–27.
23. Verkhratsky, Alexej, and Christian Steinhäuser. 2000. Ion channels in glial cells. Brain Research Reviews 32 (2–3): 380–412. https://doi.org/10.1016/S0165-0173(99)00093-4
24. Mattson, Mark P., M. Murrain, P.B. Guthrie, and S.B. Kater. 1989. Fibroblast growth factor and glutamate: opposing roles in the generation and degeneration of hippocampal neuroarchitecture. Journal of Neuroscience 9 (11): 3728–3740. https://doi.org/10.1523/JNEUROSCI.09-11-03728.1989
25. Maragos, William F., J. Timothy Greenamyre, John B. Penney Jr, and Anne B. Young. 1987. Glutamate dysfunction in Alzheimer's disease: an hypothesis. Trends in Neurosciences 10 (2): 65–68. https://doi.org/10.1016/0166-2236(87)90025-7
Chapter 3
Towards Neuro-Learning Process: Psychomemories
Abstract Modern neuropsychology recognizes three main types of memory: episodic, procedural, and semantic. These are independent but "work" in close connection, making it possible to construct knowledge from external stimuli, which is then consolidated and recorded. Particular weight is given to episodic-type memory, which is the model mainly considered by the solitonic systems proposed in this manuscript.
In the previous chapter it was shown how biological memory depends on the organization of synaptic connections. Neural architecture can change conformation by orienting toward specific topologies depending on incoming information. In fact, neural connections are dynamic entities that change over time according to the stimuli received, encoding the processed information into real maps. When synaptic connections have been constructed with very high weights, which happens especially when intense stimuli occur or are iterated over a long time, the relevant neural map persists over time, even if new stimuli are received. But how exactly does memory work? Are there different forms of memory? Do they communicate with each other? Modern psychology focuses on these three questions and has developed psychological mnemonic structures. This book does not go into the details of the psychomemories that modern neuropsychology identifies as underlying learning processes. Readers with specific interests in this regard can consult manuscripts dealing with these topics. However, some functional mechanisms that are useful for understanding the innovative learning mode introduced by SNNs will be analyzed. The model describing the structure of memory provides an initial division between two different levels. The first level encloses short-term memory (STM), which retains information for a period limited to a few seconds. Unless strategies are used to reinforce the pathways delineated at this stage, the information is removed. Instead, when the activated connections are reinforced, knowledge is progressively consolidated until the second level, represented by long-term memory (LTM),
is reached, which retains information for a period ranging from several minutes to several years. In LTM two memory types have been recognized [1]: explicit (or declarative) memory and implicit (or nondeclarative) memory. The term explicit refers to memory that allows one to analyze objects (through a process of abstraction starting from the elementary characteristics that compose them) and then to store their features in order to identify them in different contexts. It is also called declarative because it operates on a conscious level and allows information and memories to be recalled voluntarily. Two types of memory that are at first glance very different from each other belong to explicit memory: episodic memory, which records individual events by "photographing" them, and semantic memory, which instead arises from the analysis of saved information and gives it semantic meaning. In contrast, implicit memory is not voluntarily accessible and underlies all those automatic mechanisms that regulate behavior and actions in an unconscious way. Many of these mechanisms allow us to live without even realizing it. Generally, this type of memory is built through constant repetition of information. When a procedure is repeated, it acquires greater importance in memory, according to the neural mechanisms introduced in the previous chapter. The most important type of learning belonging to this kind of memory is procedural learning, which precisely leads to the acquisition of automatic behaviors.
3.1 Episodic Memory

Learning and memorization skills are fundamental to intelligent organisms, as they enable dynamic and adaptive behavior based on changes in the environment [2]. Developing a store of knowledge not only allows a system to have usable cognitive models at its disposal to deal with recurring situations, but also, and more importantly, to merge elements already seen in order to develop new knowledge. New knowledge can be conceived when experienced and contextually stored in memory (empirical knowledge) or as an interaction between the fundamental blocks of already stored knowledge (deductive knowledge). The starting point for learning is thus the ability to record events and therefore to start a process of data collection. One of the first scientists to conceptualize the differences between episodic, procedural, and semantic memory as three psychic processes of human brain activity was Tulving. In one of his most important works [3] he states that episodic memory operates: "On temporally dated episodes or events and the spatiotemporal relations between these events […] in terms of autobiographical reference to the already existing contents of the storehouse of episodic memory". Thus, episodic memory is a record of events that occur during life and are recorded without any processing. Information from the external environment is saved as if it were the pages of a book, with all the characteristics that distinguish it: a kind of notebook that records memories exactly and in their entirety, and that can be flipped through to recall them. If a piece of a page is torn in half, it loses its initial
connotation, becoming other than it was before. It cannot be said that the still intact half is the information that was visualized before, although it contains a portion of it. From a biological point of view, episodic memory is a long-term memory type [4]. As mentioned above, it falls within declarative memories, which are those that act under the awareness of the individual, who can access them whenever he or she wants. Therefore, they are identified as explicit. The major distinction to be made within declarative memories is between episodic and semantic. The latter refers to facts, rules, and background knowledge about the world [5], while episodic memory refers to individual personal events contextualized in time and space. Memory, in this case, is necessarily linked to spatiotemporal information and cannot be detached from it. For these reasons, the adjective personal is often associated with episodic memory: it is different for each subject and is constructed on the basis of lived experiences. Episodic memory is a snapshot of a lived event. Some studies show that this frame is saved along with the emotion experienced during the encoding process [6]. In fact, neuropsychology identifies among the declarative memories another type, called emotional, which works closely with the episodic one. However, the emotional sphere is extremely rich and convoluted, which is why it is still a major unknown for research on the mind and on how memories work. It will therefore not be considered in the discussion of this textbook. Returning to the definition given by Tulving, episodic memory is a system specialized in storing specific idiosyncratic experiences in terms of what happened, and where and when it happened [7]. But this definition only considers an already outlined structure, without considering how the entire storage process takes place. Therefore, Tulving himself later expanded it by describing the phenomenological processes that are specifically associated with the retrieval of episodic memories, but not semantic ones: episodic memory is self-dependent. The foundation of episodic memory is thus itself: it has self-consciousness and, as such, from a functional point of view, it stores information by self-modifying its own structure. Each event can be thought of as an image that is photographed and translated, similarly to what happens in computer science, into a stream of bits that is then mapped within the brain through the activation of a specific signal propagation pathway. The bits are recorded as they are, without being processed. They are recorded, not understood. For simplicity's sake, and without loss of generality, Fig. 3.1 shows an 8 × 8 binary image whose component pixels can be turned on or off. The colors yellow and blue take on the same meaning as black and white: yellow means bits on, while blue means bits off. At the episodic level, the image is recorded as a sequence of numbers. Based on the information contained in each individual pixel, a particular synapse will be excited. Each sequence will therefore have its own mapping. Registration results in a change in the brain region used for storage. Let us go deeper into the concept of self-modification, since this is the fundamental property for the development of SNNs.
Starting from the investigation of the accessibility and functioning of episodic memory, its definition has recently been extended [8] and directly linked to semantic memory: the characteristics of the former are also shared by the latter, in the sense that they function jointly as independent units.
Fig. 3.1 An event can be schematized as an instant picture. Every elementary piece of information in the displayed image is saved as is, without processing. The image is translated into a bit stream that is subsequently stored
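The bit-stream picture of Fig. 3.1 can be stated in a few lines of code. In this illustrative Python sketch (the image content and the storage scheme are invented for the example), an episodic record is the raw, unprocessed sequence of bits, and recognition is nothing more than an exact match against previously stored streams.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 2, size=(8, 8))  # an 8x8 binary event, as in Fig. 3.1

def episodic_record(img):
    """Store the event exactly as it is: a flat bit stream, no processing."""
    return tuple(img.flatten())

memory = {episodic_record(image)}  # each stream maps to its own pathway

# Exact replay is recognized; any alteration of the stream is not.
print(episodic_record(image) in memory)            # True
print(episodic_record(np.rot90(image)) in memory)  # False
```

The failure on the rotated image anticipates the limitation illustrated in Fig. 3.2: without processing, only the exact event can be recognized.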
They are indeed two distinct units: two distinct neural regions are recognized for the two types of memory, with episodic memories found to be closer to the hippocampal regions, while semantic memories are associated with activation of the frontal and temporal cortices. Functionally, how do they work? Very simplistically, episodic memory records specific information and semantic memory processes it, associating words and concepts with the information. What does it mean that episodic memory does not process information? To answer this question, it is important to focus on the two pillars of learning: knowing and remembering. Knowledge coincides with the memorization of information that is not saved as it appears: it is first processed according to learning grids (intellectual abilities, cultural biases, historical context, etc.) and then saved. Contrary to the belief that has become widespread in our culture over time, memorization is in some cases a phenomenon that can also trigger learning. Specifically, in the episodic case it is possible to speak of episodic learning: learning that occurs as a result of an event. The best example of this concept concerns the fear of something [9]: the fear of cats after being scratched by a cat is the result of episodic learning. In this case, memory is a tool for learning and for developing specific rather than general knowledge. Consequently, learning is an analog event that depends on the strength of the information. It lacks a process of abstraction, which occurs instead through the information processing and filtering typical of other types of memory, and which allows general information to be extracted from specific information. As already anticipated, because of its purely specific character, episodic memory is called subjective, precisely because it lacks a general universal notation: my memory is mine alone and can be recognized only by me. In some studies [10], episodic memory is described as a re-recording of a person's experience that contains time-dated information and spatio-temporal relationships.
Although functionally simpler, episodic memory is the fundamental building block for the development of more complex mnemonic structures such as procedural and semantic memory. The latter, in fact, derives from accumulated episodic memory. Episodic memory can be thought of as a "map" of separate locations that are linked together in procedural and semantic memory. In this sense, therefore, the episodic structure corresponds to the first step towards the proper realization of a neuromorphic device able to reproduce the learning system typical of the biological brain, so distant from that of software AI.
3.2 Semantic Memory

Semantic memory allows one to define a first draft of the concept of knowledge, quite different from that of information. In fact, it concerns the general, not the specific, aspect of what is learned. It starts from a set of stored knowledge but is not limited to raw information. It is related to a process of abstraction that allows the conceptualization of data and the extrapolation of a general rule [3]. For this reason, unlike episodic memory, it is not subjective but objective. Whereas episodic memory involves awareness of the feeling of having personally experienced an event or object, independent of meaning, semantic memory involves awareness of meaning unaccompanied by the familiar feeling of having previously experienced the event. Semantic memory is not tied to a specific experience [11]. Rather, it starts from the background of recorded experiences to arrive at the rules that define and connect them structurally and functionally. Using language closer to that of data processing [12], it can be said that episodic memory stores raw information while semantic memory creates links between such information, leading, through a process of abstraction, to the building of new concepts. A clarifying example of how semantic memory acts on episodic records is shown in Fig. 3.2. The image in (a) shows a German shepherd in a snowy environment. Episodic memory records the information as it is, without any processing, as a succession of bits. The information is thus stored as a specific subject, with a precise orientation in space and in a specific environment with its own characteristics. If stored only episodically, the event can be remembered and recognized only if it is replayed exactly as it was experienced. If the same dog is in the same position but in a different environmental context (b), the new situation will not be recognized episodically. Even if the same image is simply rotated with respect to the reference system with which it was learned (c), it will not be recognized episodically. No feature extraction is performed to identify the type of subject (a dog and not another animal), the breed of the dog, the environment, and its place with respect to the surrounding space. Semantic memory, on the other hand, works together with a filtering system that analyzes the information in detail, breaking it down and allowing for an abstraction operation, as shown in Fig. 3.3. Therefore, it recognizes the subject of the image, which belongs to the German shepherd breed and is likely in the mountains due
to the presence of snow in the surrounding environment. This consideration is not tied to a single view. If the image is rotated, the information is recognized all the same. If the dog is shown in a sandy environment, as above, semantic analysis recognizes the same breed of dog and also the new environment.

Fig. 3.2 An event photographed by episodic memory (a) is not subject to processing. This means that the same event, slightly modified, will not be recognized episodically. So if the subject of the memory is transposed to a different environment (b), episodic recognition will not occur. If the event undergoes a transformation, such as a rotation (c), episodic recognition will again not occur. A deeper level of analysis is needed, which is provided by semantic memory
Fig. 3.3 Semantic memory operates an abstraction process through an extrapolation of the basic features of the input event. In this case, subject peculiarities are recognized: type of color, ears and paws, which leads to the identification of the dog’s breed. Similarly, the context is also analyzed, which allows identification of the external environment
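The contrast between Figs. 3.2 and 3.3 amounts to two different matching rules, which the following illustrative Python sketch makes explicit: an episodic match requires the identical bit stream, while a semantic match compares a small set of extracted, transformation-tolerant features. The two features used here (amount of "subject" present and a symmetry invariant) are invented stand-ins for a real extractor of ears, paws, color, and environment.

```python
import numpy as np

def episodic_match(img, stored):
    """Recognize only an exact replay of the stored event."""
    return np.array_equal(img, stored)

def semantic_features(img):
    """Crude abstraction: properties that survive a rotation of the event."""
    return (int(img.sum()),                    # how much 'subject' is present
            int(np.array_equal(img, img.T)))   # a shape-like invariant

def semantic_match(img, stored):
    """Recognize the event through its extracted features."""
    return semantic_features(img) == semantic_features(stored)

stored = np.eye(8, dtype=int)   # the remembered event
rotated = np.rot90(stored)      # same subject, transformed

print(episodic_match(rotated, stored))  # False: episodic recognition fails
print(semantic_match(rotated, stored))  # True: the abstraction recognizes it
```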
Summarizing, semantic memory saves the individual components of the analyzed information, categorizing them and then affixing a label (or name) to them. According to some experts [13], semantic memory includes at least two key elements:
– Semantic knowledge: this element includes perceptual features (such as shapes, edges, colors, etc.) and functional features, i.e., the conceptual structures that explain the functioning of an object.
– The implementation of the semantic-knowledge process within biological processes, which corresponds to the ability to identify relationships between the parts of an object or between objects.
Given these premises, the question that neuropsychological research is still trying to answer is: where does semantic memory physically reside? Some studies [14] seem to identify the complexity of semantic memory in the massively interconnected nature of neural elements and their continuous microscopic changes. On this view, a specific network of connections represents the features of a specific concept, and its learning process corresponds to the microscopic changes of each connection. A second neuroscientific approach hypothesizes that semantic knowledge is represented locally in specific brain areas [15]. This model represents an extension of the previous one: semantic knowledge is no longer identifiable only in connections and their micro-modifications but has its own topological representation in the brain. A malfunction of semantic memory is associated with specific pathologies. Indeed, there is a type of Alzheimer's disease that is related to a disorder of semantic memory and results in errors in the description and naming of objects [14]. Another disorder related to this type of memory is semantic dementia, which can lead to difficulty in the visual recognition of objects and consequently in the pronunciation of words [15].
3.3 Procedural Memory

Procedural memory falls within the subdivision of nondeclarative memory, since it acts on a nonconscious plane [16]. It allows one to acquire and model learned connections between stimulus and response, or rather, chains of responses. It thus allows us to associate a stimulus with a certain type of behavior to be adopted. In this sense, procedural memory underlies human adaptive behavior. When a new situation arises, our brain activates already rehearsed and trained mechanisms that enable us to cope with the novelty [17]. Such behavior avoids losses of time that could in some cases be fatal. On the other hand, if the brain region in which this type of memory resides [18] is damaged, the loss can be enormous, because the subject can "forget" normal actions that are carried out unconsciously on a daily basis, such as knowing how to dress oneself, driving a vehicle, or even walking.
However, it seems that this type of memory is retained much longer than the other two [19]. For this reason, it is not uncommon to see individuals with dementia or Alzheimer's disease who are able to perform musical pieces perfectly on the instrument they have studied throughout their lives. Procedural memory can be recognized as fundamental for building a store of procedural information that, side by side with the data registered by episodic memory, nourishes the individual's cognitive store of knowledge. Starting from processes and raw data, through elaboration, it is then possible to construct new information. This is where semantic memory intervenes. Together, the three types of memory constitute a fundamental triad for learning and memorization. Any artificial neuromorphic system should seek strategies to reproduce the interaction of this triad. The SNNs described in this book represent the first step toward a system that is capable of learning in a manner similar to the biological brain, perfectly replicating the functionality of episodic memory.
References

1. Graf, P., and D.L. Schacter. 1985. Implicit and explicit memory for new associations in normal and amnesic subjects. Journal of Experimental Psychology: Learning, Memory, and Cognition 11 (3): 501–518. https://doi.org/10.1037/0278-7393.11.3.501
2. Pause, B., A. Zlomuzica, K. Kinugawa, J. Mariani, R. Pietrowsky, and E. Dere. 2013. Perspectives on episodic-like and episodic memory. Frontiers in Behavioral Neuroscience 7. https://doi.org/10.3389/fnbeh.2013.00033
3. Tulving, E. 1972. Episodic and semantic memory. In Organization of Memory, 381–403.
4. Conway, M.A. 2009. Episodic memories. Neuropsychologia 47 (11): 2305–2313. https://doi.org/10.1016/j.neuropsychologia.2009.02.003
5. Squire, L.R. 2004. Memory systems of the brain: A brief history and current perspective. Neurobiology of Learning and Memory 82 (3): 171–177. https://doi.org/10.1016/j.nlm.2004.06.005
6. Dere, K., I. Sucullu, E.T. Budak, S. Yeyen, A.I. Filiz, S. Ozkan, and G. Dagli. 2010. A comparison of dexmedetomidine versus midazolam for sedation, pain and hemodynamic control, during colonoscopy under conscious sedation. European Journal of Anaesthesiology 27 (7): 648–652. https://doi.org/10.1097/EJA.0b013e3283347bfe
7. Pause, B.M., A. Zlomuzica, K. Kinugawa, J. Mariani, R. Pietrowsky, and E. Dere. 2013. Perspectives on episodic-like and episodic memory. Frontiers in Behavioral Neuroscience 7. https://doi.org/10.3389/fnbeh.2013.00033
8. Scott, S.A., K. Sangkuhl, E.E. Gardner, C.M. Stein, J.-S. Hulot, J.A. Johnson, D.M. Roden, T.E. Klein, and A.R. Shuldiner. 2011. Clinical pharmacogenetics implementation consortium guidelines for cytochrome P450–2C19 (CYP2C19) genotype and clopidogrel therapy. Clinical Pharmacology & Therapeutics 90 (2): 328–332. https://doi.org/10.1038/clpt.2011.132
9. Oyserman, D., D. Bybee, and K. Terry. 2006. Possible selves and academic outcomes: How and when possible selves impel action. Journal of Personality and Social Psychology 91 (1): 188–204. https://doi.org/10.1037/0022-3514.91.1.188
10. Tulving, E. 1983. Elements of episodic memory. Oxford University Press.
11. Epel, E.S., A. Bandura, and P.G. Zimbardo. 2006. Escaping homelessness: The influences of self-efficacy and time perspective on coping with homelessness. Journal of Applied Social Psychology. https://doi.org/10.1111/j.1559-1816.1999.tb01402.x
12. Wickham, H. 2007. Reshaping data with the reshape package. Journal of Statistical Software 21 (12). http://www.jstatsoft.org/v21/i12/paper
13. Grossman, M., E.E. Smith, P.L. Koenig, et al. 2003. Categorization of object descriptions in Alzheimer's disease and frontotemporal dementia: Limitation in rule-based processing. Cognitive, Affective, & Behavioral Neuroscience 3: 120–132. https://doi.org/10.3758/CABN.3.2.120
14. Ober, B.A., N.F. Dronkers, E. Koss, D.C. Delis, and R.P. Friedland. 1986. Retrieval from semantic memory in Alzheimer-type dementia. Journal of Clinical and Experimental Neuropsychology 8. https://doi.org/10.1080/01688638608401298
15. Mochizuki-Kawai, H. 2008. Neural basis of procedural memory. Brain and Nerve, Shinkei Kenkyu no Shinpo 60 (7): 825–832. PMID: 18646622.
16. Lum, J.A.G., G. Conti-Ramsden, D. Page, and M.T. Ullman. 2012. Working, declarative and procedural memory in specific language impairment. Cortex 48 (9): 1138–1154. https://doi.org/10.1016/j.cortex.2011.06.001
17. Butters, N., N. Wolfe, M. Martone, E. Granholm, and L.S. Cermak. 1985. Memory disorders associated with Huntington's disease: Verbal recall, verbal recognition and procedural memory. Neuropsychologia 23 (6): 729–743. https://doi.org/10.1016/0028-3932(85)90080-6
18. Cavaco, Sara, et al. 2004. The scope of preserved procedural memory in amnesia. Brain 127 (8): 1853–1867. https://doi.org/10.1093/brain/awh208
19. Spetch, M.L., D.S. Grant, and R. Kelley. 1996. Procedural determinants of coding processes in pigeons' memory for duration. Learning and Motivation 27 (2): 179–199. https://doi.org/10.1006/lmot.1996.0011
Chapter 4
The Solitonic X-Junction as a Photonic Neuron
Abstract This chapter describes in detail the physics behind the formation of photorefractive spatial solitons, taking care also of the mathematical aspects. Starting with a description of single-soliton formation, the X-Junction solitonic neuron is introduced from both the theoretical and the experimental perspective. Experiments on the neuron writing process and on the selective erasure of the soliton channels of which it is composed are presented. The fabrication of a solitonic neuron in an 8-micron thin film of lithium niobate on insulator, an extremely promising technology nowadays, is also described.
Software and hardware devices realized to date are unable to offer a system capable of learning by following the dynamics of the biological case. To develop a hardware technology similar to the brain in structure and functioning, it is necessary to change the point of view and conceive of a system that behaves like a tissue, in which each district is connected through a dependency relationship to the whole structure. Indeed, brain biology teaches us that the brain is a system in which the alteration of local properties can have global effects [1]. Reasoning in terms of single independent units, or of dynamics between connections limited to a few units, fails to be comprehensive in the attempt to obtain a dynamic structure. The solitonic neural networks (SNNs) this manuscript introduces, exploiting the properties of crystals with a nonlinearly saturating refractive index, are plastic systems capable of self-modifying their structural and functional organization to learn new information and, contextually, store it. These processes advance simultaneously and, as will be explained shortly, through a progressive and coherent change in the neural geometries.
4.1 Photorefractive Solitons

The spatial soliton is a self-confining light wave able to propagate within certain crystals without giving rise to diffraction phenomena. The first experimental study was carried out by exploiting a third-order nonlinearity of the Kerr type.
However, the process of forming a Kerr soliton is complex both in the geometries and in the energies involved. Exciting this type of nonlinearity requires very high light intensities, of the order of GW/m². Alternatively, one can exploit its cumulative property, whereby intensities can be reduced by proportionally increasing the propagation length. Furthermore, Kerr solitons exist only in planar geometries that ensure their stability in a single dimension [1]. Thus, in general, Kerr soliton formation has numerous limitations, and the idea of using it to develop a neural technology turns out to be unsuitable. Indeed, a neural network is characterized by numerous connections in which information is partitioned and propagated within different channels. Keeping the energy contribution of the information sufficiently high would require working with even higher powers. For this reason, the Kerr soliton is not suitable for the development of integrated chips but only for the development of temporal solitons [2, 3]. Spatial solitons, on the other hand, provide a valuable means of realizing waveguides [4]. One of the first experiments in using this technology for optical signal processing was carried out by Fazio et al. [5], who showed the possibility of exploiting the photorefractive effect for their generation. A photorefractive crystal is a nonlinear medium capable of exciting a second-order electro-optical effect. Its behavior can be studied through the nonlinear polarization given in Eq. 4.1: note that the second-order term "is activated" by the application of an electric bias field \vec{E}(0), while \vec{E}(\omega) is the field associated with the light:

\vec{P}(\omega) = \varepsilon_0 \left[ \overleftrightarrow{\chi}^{(1)} \cdot \vec{E}(\omega) + \overleftrightarrow{\chi}^{(2)} : \vec{E}(0)\,\vec{E}(\omega) \right]   (4.1)

Consequently, the dielectric tensor assumes the form given in Eq. 4.2:

\overleftrightarrow{\varepsilon} = \varepsilon_0 \left[ 1 + \overleftrightarrow{\chi}^{(1)} + \overleftrightarrow{\chi}^{(2)} \cdot \vec{E}(0) \right]   (4.2)
In the most general case, the refractive index maintains the tensor form of the susceptibility. Depending on its polarization, a wave experiences different values of the refractive index and consequently travels at different speeds. This gives rise to the phenomenon of birefringence [6]. Crystals are said to be uniaxial and anisotropic when they are characterized by two indices of refraction: an ordinary index (n_o) for waves polarized along the x or y direction, and an extraordinary index (n_e) for waves polarized along z. In general, the refractive index of crystals is expressed through an ellipsoid according to Eq. 4.3:

\frac{x^2}{n_x^2} + \frac{y^2}{n_y^2} + \frac{z^2}{n_z^2} = 1   (4.3)
in which its variation can be recognized (Eq. 4.4) by considering the n^{-2} terms:

\Delta\!\left(\frac{1}{n_i^2}\right) = \sum_j r_{ij} E_j(0)   (4.4)
Specifically, the photorefractive effect results in a decrease in the refractive index of the crystal proportional to the local electrostatic field E(0), along the crystallographic direction i, according to relation 4.5:

n_i[E(0)] = n_{i,0} - \frac{1}{2}\, n_{i,0}^3 \sum_j r_{ij} E_j(0)   (4.5)
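As an order-of-magnitude illustration of Eq. 4.5 for a single dominant electro-optic coefficient, the following minimal sketch uses assumed textbook values for lithium niobate (the n_e and r_33 figures are approximations, not parameters quoted in this chapter):

```python
# Index change from Eq. 4.5 with one dominant coefficient:
# delta_n = 0.5 * n^3 * r33 * E. Values below are assumed textbook
# approximations for LiNbO3, not figures taken from this chapter.
n_e = 2.2           # extraordinary refractive index (approx.)
r33 = 30.8e-12      # electro-optic coefficient [m/V] (approx.)
E = 4.0e6           # bias field: 40 kV/cm expressed in V/m

delta_n = 0.5 * n_e**3 * r33 * E
print(f"delta_n ~ {delta_n:.1e}")   # about 6.6e-4, a soliton-scale contrast
```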
The photorefractive effect overall reduces the value of the refractive index. For light to self-focus, however, it is necessary that, locally along its propagation path, the refractive index have a higher value than in the surrounding environment in which the light beam travels. In this way, light can switch from a diffractive to a solitonic regime and behave like a waveguide, at a higher refractive index, for subsequent signals at different wavelengths. To achieve this effect, an electrostatic field must be applied inside the crystal, so as to reduce the overall refractive index, and screened locally where the light propagates. Bright screening solitons are thus obtained. For a specific discussion of the formation of this type of soliton, we refer to references [7, 8]. In this text we will limit ourselves to analyzing the equations for the dynamics of charge carriers during the formation of a second-order spatial soliton. Consider a photorefractive crystal. Because donor states (n_D) are often energetically localized within the energy gap, light can trigger electron transitions from donor states to the conduction band. As a result, two charge populations are created: ionized donors (n_D^+), which behave as holes and are physically localized, restricted in their movement because they are bound to the positions of the dopant ions, and electrons, which occupy delocalized conduction states and are thus free to move. The donor dynamics is expressed by Eq. 4.6:

\frac{\partial n_D^+}{\partial t} = \sigma F n_D - \gamma\, n_D^+ n_e   (4.6)
where n_D^+ is the density of ionized donors, n_e the local electron density, n_D the concentration of not-yet-ionized donors, σ the cross section for photon–donor coupling, and F the photon flux. The concentration of electric charges, on the other hand, follows the trend described by Eq. 4.7:

\frac{\partial n_e}{\partial t} = \frac{\partial n_D^+}{\partial t} - \mu\, \vec{\nabla} \cdot \left( n_e \vec{E} + \frac{k_B T}{q} \vec{\nabla} n_e \right)   (4.7)
where μ is the electron mobility, k_B the Boltzmann constant, and T the temperature. The total charge ρ is represented by the set of electrons and holes and, following Gauss's theorem, generates a local electric field that can oppose the bias electric field, generating a region with a higher refractive index than its surroundings:

\begin{cases} \varepsilon\, \vec{\nabla} \cdot \vec{E} = \rho_{SC} \\ \vec{E}_{local} = \vec{E}_{bias} + \vec{E}_{SC} \end{cases}   (4.8)
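As a quick numerical illustration of the carrier dynamics just introduced, the sketch below integrates the donor rate equation (Eq. 4.6) with a forward-Euler step in normalized, dimensionless units; every constant is an assumption chosen for demonstration, and the electron density is held fixed for simplicity:

```python
# Forward-Euler integration of Eq. 4.6 in dimensionless units.
# All constants are illustrative assumptions, not material parameters.
sigma_F = 1.0        # photoionization rate sigma*F (assumed)
gamma = 5.0          # recombination coefficient (assumed)
n_D_total = 1.0      # total donor density (normalized)
n_e = 0.1            # electron density, held fixed in this sketch

n_Dp = 0.0           # ionized-donor density n_D^+
dt = 1e-3
for _ in range(10_000):
    n_D = n_D_total - n_Dp                          # not-yet-ionized donors
    n_Dp += dt * (sigma_F * n_D - gamma * n_Dp * n_e)

# Steady state balances ionization against recombination:
# sigma*F*(N - n_D+) = gamma*n_D+*n_e, giving n_Dp -> 1/1.5 ~ 0.667 here.
print(round(n_Dp, 3))
```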
The nonlinear propagation of light inside the crystal can be described through the nonlinear wave equation (taking into account that, for a solitonic solution, the amplitude A does not depend on the spatial dimension x):

\left[ \frac{\partial}{\partial x} - \frac{i}{2k} \left( \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \right] A(x, y, z) = \frac{ik}{n}\, \delta n(E_{local})\, A(x, y, z)   (4.9)
whose solution is of the type

A(x, y, z) = u(y, z)\, e^{i(\omega t - kx)}   (4.10)
4.2 Learning of the X-Junction Neuron

A spatial soliton consists of local changes in the refractive index of the material over time. The magnitude of these changes can be a function of time or of the intensity of the signals involved. This means that iterating the propagation of a light beam along the same specific path can increase the index contrast, improving the efficiency of the soliton waveguide. Similarly, higher contrasts can be achieved by increasing the intensity with which the light beams enter the crystal. The first process is usually referred to as stigmergy [9] and involves reinforcing specific information pathways. It is a learning strategy widely used in nature, not only in the neural domain but wherever a decentralized learning system arises in which information must be transferred between even very distant districts. Stigmergy is a type of reinforcement learning: the basic idea is to consider the feedback derived from the dynamic interaction of the learning agent with its environment. Reinforcement learning is also a strategy used by the biological brain. In Chap. 1 it was shown that when a piece of information is repeatedly presented, the neural response is to change its geometry so as to save the corresponding information pattern in a structure that arises from specific neural pathways [10]. The intelligent technology developed so far, both software and hardware, finds an insurmountable limitation precisely in reproducing such a mechanism. Software applications find their main limitation in the distinction between memory and data
processing units, which communicate with each other but do not coincide. This separation results in significant consumption of energy and processing time. It should also be added that the learning inherent in software networks does not, as in the biological case, involve updating the structural geometry of the system, which instead always remains the same. What changes is only the intensity with which the individual units are connected to each other. For this reason, scientific research in recent decades has turned toward the neuromorphic hardware approach [11, 12]. The neural model proposed in this book introduces a novel approach to the world of neuromorphic optics, taking a step toward the learning paradigm enacted by the biological brain. As we will see in this and the next chapter, synaptic trajectories are represented through specific light paths within nonlinear crystals. These trajectories are steered and directed by the local refractive index contrast, which changes over time as a function of the transported light signals. The photorefractive propagation mechanism succeeds in reproducing a type of learning that is very close to biological learning. The smallest solitonic neural unit in an SNN network is the X-Junction neuron, a structure that arises from the interaction between spatial solitons. Through the self-confinement of light [13], it is possible to realize a fully optical neural structure that traces the functional model of the biological neuron. Figure 4.1 reports the two structures to be compared: in (a), the biological neuron already seen in Chap. 1 and in (b) the X-Junction neuron. In both cases the signals, electrical for the biological neuron and optical for the X-Junction neuron, are collected through the input channels. Having arrived at the soma, the "computational center," they are processed according to nonlinear processes before being stopped or propagated along specific directions to reach subsequent neural units. The nonlinearity of processing in X-Junction neurons corresponds to that brought into play by the photorefractive crystal in which they are made. The X-Junction structure is realized by the intersection of two spatial solitons, generated by the Pockels effect, at the center of the photorefractive crystal [14]. Each beam is able to change the refractive index independently, according to Eq. 4.5. This results in a local modification of the index relative to the propagation of the single beam and independent of that of the second beam. This autonomy, as we will see shortly, is lost in the region where the two beams of light meet [15]. Symmetric or asymmetric neurons can thus originate. Learning is related to the process by which the structure becomes asymmetric toward one configuration rather than another. Learning occurs when the neuron is able to divert received information toward a specific pathway. When the optical signal is collected by the neuron, at a wavelength to which the crystal is not sensitive, it senses the previously induced local refractive index change and remains, during its propagation, confined within the solitonic waveguide. Having arrived at the region of beam overlap (the soma of the optical neuron), henceforth also referred to as the node, it begins to be processed. The node is an extended region in which the two solitonic channels are so close that an energy exchange is possible. The signal is distributed within the two channels following a nonlinear logic dictated by the photorefractive nature of
Fig. 4.1 a Structural and functional diagram of a biological neuron. Signals are collected through dendrites, processed at the soma (the computational center of the neuron), and transmitted to subsequent neurons through the axon and output dendrites. b Structural and functional diagram of a solitonic neuron. Signals enter through the two input channels, are processed in the node, and are transmitted through the two output channels
the crystal. In order for the two channels to be sufficiently close at the junction, the entry angle of each beam with respect to the direction normal to the input face of the crystal must be very small, below one degree of angular aperture. This aspect, which is of considerable importance, will be dealt with in detail when describing the experimental apparatus used. In particular, if the guides are written with lasers of equal intensity, the resulting neuron is perfectly symmetric at the junction (balanced structure). As a result, the signal is split 50/50 between the two output channels. This happens regardless of which channel was the input. Indeed, even if the signal enters at the input opposite the output with the higher signal percentage, once it reaches the channel-overlap region it follows the higher index contrast. The structure and behavior of a balanced X-Junction neuron are shown in Fig. 4.2: in (a) the writing process of the waveguides with the typical X-shape is reported; the two channels are characterized by the same refractive index modulation; in (b) the propagation of the signal inserted into the bottom input is shown, which is split equally into the two outputs a and b; finally, (c) shows the output face of a bulk crystal, where two nearly identical circular regions of light can be observed. However, if the two channels are written with different input intensities, as shown in Fig. 4.3a, the junction will be characterized by a higher refractive index along the path followed by the light with higher intensity. This results in an unbalanced structure, able to partition the information signal asymmetrically toward the outputs by weighting it according to the nonlinear logic of the index change: the neuron has moved from a
Fig. 4.2 In a is shown the writing of a balanced 2 × 2 solitonic neuron (out-of-scale plot): the two light intensities are equal. b shows the (out-of-scale) propagation of the IR signal inserted in the lower channel: at the node the signal is split equally, 50% into channel α and 50% into channel β. Image c shows the output face of the crystal: the two light spots are perfectly equal, indicating the balanced splitting
neutral state to an informational state. In Fig. 4.3b the signal is inserted into the bottom input and, following linear logic, would be expected to exit at output a. Instead, having arrived at the soma, it senses the local index modulation and diverts its propagation into the channel with the higher index. The output face then shows two spots, this time with different intensities. Figure 4.3c thus shows an unbalanced X-Junction neuron. Learning information coincides with a set of logical steps consisting of collecting external data, processing it, and storing it. Generally, these steps are conceived as a linear succession by way of exemplification. It follows that most currently existing neuromorphic systems also reflect this logic. However, as introduced in the first chapter, these operations are handled by our brain in parallel and coincide with a reorganization of the neural structure. The X-Junction neuron is able to carry out these three main operations simultaneously. Its learning process consists in switching from a balanced to an unbalanced state. The Smart&Neuro Photonics Lab at Sapienza University of Rome has developed a numerical code to investigate the evolution of the refractive index in an X-Junction neuron and to characterize its learning process. To date, no analytical solution to Eq. 4.9 has been found. However, a good numerical approximation is obtained with a Finite-Difference Time-Domain (FDTD) method to simulate the formation of solitons and their evolution. The nonlinear Helmholtz equation that is solved is:
Fig. 4.3 a Shows the writing process of an unbalanced 2 × 2 solitonic neuron (out-of-scale plot): the intensity with which the β channel is written is greater than that of the α channel. b Shows the (out-of-scale) propagation of the IR signal inserted into the lower channel: at the node the signal is split 20% into channel α and 80% into channel β. Image c shows the output face of the crystal: the two luminous spots show the unbalance
\nabla^2 A_i = - \frac{\varepsilon_{NL} \cdot E_{bias}}{1 + \dfrac{|A_{w1}|^2 + |A_{w2}|^2}{|A_{SAT}|^2}} \cdot A_{wi}   (4.11)
where ε_NL is the nonlinear dielectric constant, E_bias the applied electrostatic bias field, A_wi (with i = 1, 2) represents the two writing beams, and |A_SAT|² the saturation intensity. Equation 4.11 describes the case in which the refractive index of the crystal is sensitive only to the propagation of the two writing beams A_w1 and A_w2, by which it is modified. The information is represented by a laser at a wavelength that cannot change the index further but senses the local index contrast and remains confined to the channel characterized by the higher value. As will be described shortly, this type of neuron is able to give rise to an SNN network that can learn in a supervised manner [16]. This means that the network is modified based on externally provided cues [17]. However, when the X-Junction neuron is made within photorefractive materials whose refractive index is also sensitive to the wavelength used for the signal, or by using a signal at a wavelength that can interact with the nonlinearity of the material, the nonlinear propagation equation changes. In this way, it is possible to make neurons whose learning dynamics are totally governed by the propagation of light. Members of the Smart & Neuro Photonics Lab were the first to make this model of neuron by exploiting erbium-doped lithium niobate crystals. In fact, erbium is active for laser emission at 1.55 μm with pumping at 980 nm. Both of these wavelengths are not absorbed by lithium niobate and thus cannot excite
its nonlinearity. However, erbium can undergo a two-step nonlinear absorption process whereby the 980 nm radiation is re-emitted in the green, to which lithium niobate is sensitive:

\frac{dN_{2step}}{dt} = \sigma N_0 F^2 - \frac{N_{2step}}{\gamma}   (4.12)
where σ is the nonlinear absorption cross section, N_0 the ground-level population, F² the square of the photon flux of the 980 nm signal, and γ the relaxation time of the excited level. The N_{2step} population, however, undergoes a change over time due to nonlinear absorption according to Eq. 4.13:

N_{2step} = \gamma\, \frac{\sigma N_0 \lambda^2}{4 n^2 \varepsilon_0^2 c^2}\, |A_3|^4 \left( 1 - e^{-t/\gamma} \right)   (4.13)
Taking Eqs. 4.12 and 4.13 into consideration, the Helmholtz Eq. (4.11) for beam evolution becomes

\nabla^2 A_i = - \frac{\varepsilon_{NL} \cdot E_{bias}}{1 + \dfrac{|A_{w1}|^2 + |A_{w2}|^2 + \eta |A_{w3}|^4 \left( 1 - e^{-t/\gamma} \right)}{|A_{SAT}|^2}} \cdot A_{wi}   (4.14)
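To make the numerical approach concrete, here is a minimal split-step beam-propagation sketch of the saturable nonlinearity of Eq. 4.11 for two mutually incoherent writing beams. It is an illustration under stated assumptions, not the lab's simulator: grid sizes, the peak index change dn_max, the beam waist, and the launch separation are all invented for demonstration; only the 0.4° entry angle and the 532 nm wavelength echo values given in this chapter.

```python
import numpy as np

# Split-step (BPM) sketch of Eq. 4.11: two incoherent writing beams share
# one saturable index change. Parameters are illustrative assumptions.
wl = 532e-9                        # writing wavelength [m]
n0 = 2.2                           # LiNbO3 background index (approx.)
k0 = 2 * np.pi / wl
ny, nsteps = 512, 2000             # transverse samples, propagation steps
dy, dx = 0.25e-6, 5e-6             # transverse / longitudinal steps [m]
dn_max = 6e-4                      # peak photorefractive index change (assumed)

y = (np.arange(ny) - ny // 2) * dy
ky = 2 * np.pi * np.fft.fftfreq(ny, dy)
diffract = np.exp(-1j * ky**2 * dx / (2 * k0 * n0))   # paraxial propagator

w0 = 5e-6                                   # beam waist (assumed)
tilt = k0 * n0 * np.sin(np.deg2rad(0.4))    # 0.4 deg entry angle (from text)
fields = [
    np.exp(-((y + 40e-6) / w0) ** 2 + 1j * tilt * y),  # beam w1, aimed right
    np.exp(-((y - 40e-6) / w0) ** 2 - 1j * tilt * y),  # beam w2, aimed left
]

for _ in range(nsteps):
    # Linear substep: each incoherent beam diffracts independently.
    fields = [np.fft.ifft(np.fft.fft(a) * diffract) for a in fields]
    # Nonlinear substep: the total intensity drives one saturable index
    # profile, the discrete analogue of the denominator of Eq. 4.11
    # (saturation intensity normalized to 1 here).
    intensity = sum(np.abs(a) ** 2 for a in fields)
    dn = dn_max * intensity / (1.0 + intensity)
    fields = [a * np.exp(1j * k0 * dn * dx) for a in fields]

# Over the 10 mm of propagation the two beams self-confine and cross,
# tracing the X-junction geometry described above.
```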
4.3 Supervised and Unsupervised Learning

The definitions of supervised and unsupervised learning descend directly from Artificial Intelligence and Machine Learning. To understand the difference between these learning modes, especially from a procedural point of view, a brief description of the operation of software Machine Learning is given here, from which the X-Junction neuron takes inspiration for the functional structure of its learning. In general, software artificial intelligence requires huge amounts of data to develop competitive and reliable models. The more data is available, the more situations the neural network can observe [18]. At the same time, however, it is necessary to ensure that the network does not become too "accustomed" to the data fed to it, because it may then fail to recognize sudden or never-seen situations. This is what differentiates recognition from an understanding that then allows for eventual prediction. Network learning consists of setting the weight of each connection between neurons (synapses). Generally, the structure of a software neural network is organized into multiple layers communicating with each other. Specifically, each neuron in one layer is connected to each neuron in the next layer. When information reaches the first layer, called the input layer, it activates a precise number of connections in each neural layer. This means that each piece of information activates a precise neural pathway. Before learning, there is no privileged pathway. The synaptic weights
of each neuron are all random. As the network begins to learn, the configuration of connections changes from chaotic to a precise order. The organization of information within synaptic pathways coincides with its learning (Fig. 4.4). Generally, learning in software AI involves three successive phases [20]. In the first step, called training, the network is trained on known data. This step allows the connections to be defined on the basis of previous experience. The greater the amount of experience (data) provided, the higher the ability of the network to reason about new, never-seen aspects. After this process, through the validation phase, the trained neural model is tested on another block of data. In some neural models this step is skipped, going directly to the application of the network on new, never-before-encountered data. This last step is called the testing phase. A learning mode is defined as supervised if the training phase of an intelligent unit is determined by an operator external to the unit [20]. It is this supervisor who decides when the information is correct and when it is wrong. The supervisor therefore provides the correct output, on the basis of which the weights of the inter-neural connections are updated. From a no-knowledge situation, the recognition of precise information is forced by iteratively changing every single connection.
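To fix ideas about what "setting the weights" means operationally, the following is a minimal supervised update for a single software neuron: a generic textbook sketch with arbitrary values, not the photonic model of this chapter. The supervisor supplies the target output, and each iteration nudges every connection weight to reduce the error, exactly the loop described above.

```python
import numpy as np

# One-neuron supervised training sketch: the external supervisor provides
# the target t, and gradient steps on the squared error update the weights.
rng = np.random.default_rng(0)
w = rng.normal(size=3)           # random initial weights: no privileged pathway
x = np.array([0.5, -1.0, 2.0])   # one input pattern (arbitrary)
t = 1.0                          # correct output supplied by the supervisor
lr = 0.1                         # learning rate

for _ in range(100):             # training phase on known data
    y = np.tanh(w @ x)           # neuron activation
    err = t - y
    w += lr * err * (1 - y**2) * x   # gradient step on err^2/2

print(np.tanh(w @ x))            # close to the supervised target 1.0
```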
Fig. 4.4 Architecture of a software artificial neural network. The network is organized into three types of layers: input layers, hidden layers (varying in number) and output layers. Each neuron in each layer is connected to each neuron in the next layer. Connections can be activated through several possible activation functions [20]
How is this functional pattern exploited by photorefraction? Following Eq. 4.11, the X-Junction neuron gives rise to supervised learning by exploiting a reinforcement mechanism of a specific solitonic channel [15–21]. Specificity is provided by the local modulation of the refractive index. To achieve it, it is sufficient to unbalance the input powers of the two light beams. Similarly to the biological case, the structure is changed through two modes of excitation: it is possible to use very different powers with a large proportionality factor, or to iterate a small (low-intensity) excitation many times. In both cases, an evolution of the refractive index dictated by Eq. 4.11 will be obtained. The system is "provided" with the correct output, the one to be learned. From this external information the neural unit changes its geometry and, consequently, the weight with which it switches the information into the output channels. The main difference from machine learning lies in the fact that a software neuron is a unit unable to change its structure: its operation is limited to switching on or off based on the input received and the activation function. This stems from the fact that this type of neuron is characterized by only one output. It is unable to change the direction of signal propagation but can only excite it [22]. The network path arises based on which neurons, in their layers, are activated. In addition, the update of the connection weights occurs only when the information signal has finished its propagation. The solitonic neuron, on the other hand, self-modifies contextually with the propagation of light within it. When light arrives at the junction area, the modulation of the refractive indices corresponding to the two channels weighs the two inputs and directs them to one output, to the other, or to both. In the solitonic neuromorphic paradigm, therefore, it makes sense to speak of single-unit learning as well, according to a logic closely similar to that which occurs in biology. Thus, X-Junction neuron learning corresponds to the process of asymmetrization of the structure, which can occur, depending on the circumstances, as a result of the propagation of very intense light or of a dense succession of light stimuli. In addition, the junction is unbalanced whether light propagates in one direction or in the other. This makes it possible to build, from an optical hardware point of view (see Fig. 4.5), a feedback system capable of taking the light output and reinserting it into the channel one wishes to recognize.
Fig. 4.5 Optical feedback circuit for specific reinforcement of solitonic channels. Depending on the amount of light that is taken at the output and fed back into the target channel, supervised unbalancing of the balanced neuron can be achieved
To investigate the theoretical learning behavior of an X-Junction neuron, the optical properties of lithium niobate were considered [23]. The electrostatic bias field was set at 40 kV/cm, while the power of the writing beams was set at 8 μW, corresponding to about 2.75·10⁵ W/m². The signal light enters at 0.5 μW (corresponding to about 5·10³ W/m²). The initial state of an X-Junction neuron before learning (the equivalent of the random weights in ML) has the two outputs balanced, as shown in Fig. 4.6a. By setting up a feedback system, variable portions of the output light, defined by the external operator, are reintroduced with different weights into the channels. In Figs. 4.6b (learning of channel α) and 4.6d (learning of channel β), 30% feedback is implemented in the selected channel, while the second channel remains without feedback. Progressively the reinforced channel fills while the other empties, until a new equilibrium situation is reached. The final state is a stable unbalanced X-junction structure. Interestingly, once the neuron is unbalanced, it does not matter which input channel is considered: having arrived at the junction region, the signal follows the path characterized by the higher local refractive index contrast. Waveguide unbalancing is a process that evolves over time depending on the amount of light used in the reinforcement feedback: the greater the amount, the faster the refractive index changes and, consequently, the sooner the junction reaches a stable unbalanced configuration. The learning dynamics of the X-Junction neuron therefore coincides with the channel-unbalancing process. Looking at the unbalancing curves, it is possible to recognize a trend very similar to that of the sigmoidal neural activation function, thus simulating the nonlinear behavior of biological spike propagation [24].
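The feedback mechanism can be caricatured in a few lines. The sketch below is a toy model, not the lab's numerical code: the 30% feedback fraction echoes the text, while the sensitivity constant is an invented dimensionless gain. A fraction of the light exiting the target channel is re-injected as extra writing power, which reinforces that channel's index contrast until the splitting saturates, qualitatively reproducing the sigmoid-like curves of Fig. 4.7.

```python
# Toy model of supervised unbalancing via optical feedback (illustrative).
feedback = 0.30        # fraction of output light fed back (as in the text)
gain = 0.05            # assumed sensitivity of the index contrast (invented)
split_alpha = 0.5      # balanced starting state: 50/50 splitting
signal_in = 1.0        # normalized input signal power

history = []
for _ in range(300):
    out_alpha = split_alpha * signal_in
    # Re-injected light reinforces the alpha channel's index contrast;
    # the (1 - split) factor enforces saturation of the splitting ratio.
    split_alpha += gain * feedback * out_alpha * (1.0 - split_alpha)
    history.append(split_alpha)

# history traces a sigmoid-like rise of the alpha share toward 1,
# while the beta share (1 - split_alpha) empties correspondingly.
```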
Fig. 4.6 Supervised learning of the X-Junction neuron, switching from balanced outputs without feedback a–b to unbalanced behaviors, due either to feedback on the α channel c–d or to feedback on the β channel e–f. Individual channels are highlighted by subtracting from the output the specific writing beam of the corresponding channel
Figure 4.7 shows the trend of information learning. The rate of unbalancing depends directly on the contribution of light re-injected into the two channels at each iteration. The more feedback is inserted into the target channel, the fewer iterations are required to obtain an unbalanced neuron. In unsupervised learning, instead, the system must be able to recognize the target information autonomously and then proceed again with the processing and storage steps. No action from outside the system should intervene. We have seen that the X-Junction neuron is able to recognize the information contained within the light signal and to change geometry accordingly. In particular, in accordance with what happens in biological neural systems [25], the neuron changes its conformation from a balanced to an unbalanced state over time. In order to achieve unsupervised learning, it is important that the information itself, as it propagates in the structure, is able to modify it. This can be done by choosing appropriate wavelengths capable of modulating the refractive index of the photorefractive material under consideration. The first attempt to implement unsupervised learning started directly from the supervised results. The logical operating scheme therefore remained unchanged. For this first implementation, an erbium-doped lithium niobate crystal was considered. The initial step is to write the X-structure using a wavelength of 532 nm (green). After that, in order to make the neuron sensitive to the propagation of the information, which in this early stage of solitonic neuromorphic research was encoded in a
Fig. 4.7 Activation function of the X-Junction channels in supervised learning with the α channel reinforced using 30% light feedback. With this amount of light the neuron learns very fast: it switches from the balanced state to the unbalanced one within a few feedback iterations. Reprinted under CC-BY-4.0 license from [15], © The Authors 2021
wavelength of 980 nm (near-infrared), the phenomenon of two-step photon emission enabled by the doping of the crystal was exploited, as introduced earlier. To reconstruct the propagation of light, it is possible to refer to Eq. 4.14, through which a mapping of the refractive index is defined and updated at each iteration of light while respecting the saturating characteristics of the medium. This realizes the logic of photorefractive physics: by changing the refractive index of the crystal, the way in which the two guides (channels) confine the signal and make it propagate changes. To study this type of learning, it is necessary to start from the "impartiality" condition, in which the signal is divided perfectly between the two outputs, since there is not yet a preferred direction. Whenever a specific input pattern of the signal is iterated, the index mapping is changed accordingly and a percentage of the input light is diverted toward the indicated output channel. Figure 4.8a shows the propagation of the signal under the three balanced and unbalanced input conditions. Figure 4.8b shows the evolution curves of the two channels as a function of the refractive index mapping update. The simulations were obtained using very low input powers, as in the experiments, of about 2 μW, and resulted in complete unbalancing after γ = 3000 iterations. As can be seen in Fig. 4.8, after reaching the stable unbalanced state, the light remains confined mainly (though not totally) in one channel: this is synonymous with completed learning. The efficiency factor η depends on the doping used in the nonlinear material; in the case of erbium-doped lithium niobate it was set (by experimental analogy) at η ∝ 10⁻⁶. Progressively the target channel begins to fill up while the other channel empties, until its output intensity reaches near zero. During recognition there may be fluctuations in intensity between the two channels due to a jump in refractive index at the junction: entering channel α, for example, as long as the nonlinearity is still low, the signal undergoes an elastic behavior that partially redistributes the losses into channel β as well. The same behavior is observed in the opposite condition, with the signal entering channel β. This is a temporary condition that vanishes as soon as the nonlinearity of the guide becomes strong enough to confine almost all the light in a single channel, limiting losses. Light intensity is thus synonymous with the level of importance of a piece of information: the higher it is, the more important the information it carries and the faster the neuron must learn to recognize it. The unsupervised X-Junction neuron is always able to recognize which input is more intense, even when the two input powers differ little. To complete the learning configurations, other learning curves are shown in Fig. 4.9. Configurations with the following ratios of input powers were analyzed: (1) 0.2–1, (2) 0.5–1, and (3) 0.8–1. For the first two ratios, a high level (1) and a low level (0.2 and 0.5) are recognized at the output. In the third case, instead, two high levels, which differ little at the input, are recognized.
Fig. 4.8 Learning dynamics of the X-Junction neuron. In a are shown the final states once learning is finished, while in b are shown the learning curves. Starting from the initial balanced condition (synonymous with a pristine system), the junction recognizes the input and switches in accordance with it. Adapted under CC-BY-4.0 license from [15], © The Authors 2021
4.4 Experimental Writing and Erasing of the X-Junction Neuron in LNOI Films

The implementation of an X-Junction requires a crystal characterized by saturating nonlinearity. The Smart & Neuro Photonics Lab has successfully performed X-Junction neuron training operations in LiNbO3 bulk crystals [15, 16, 21]. In [21], the experimental analysis of the X-Junction neuron performing reinforcement learning was conducted. The experiment was realized in the pyroliton configuration, using a Z-cut, congruent, striation-free lithium niobate crystal with dimensions of 12 × 12 × 0.5 mm³. The electric bias field was generated by exploiting the pyroelectric effect, thanks to a temperature gradient of about 30 °C between the (0, 0, −1) and (0, 0, 1) faces, in order to develop a bias field of the order of 40 kV/cm. Bulk crystals, however, present a considerable problem related to their size, which makes them hard to integrate with other systems or with each other. The goal of realizing complex and dense networks comes into sharp contrast with this aspect. The super-efficiency of the human brain lies especially in the density of neural synapses and in the ability to carry out computations in parallel. Any neuromorphic research must therefore meet this requirement. Bulk crystals also present an issue related to the management and control of the light beams. This aspect is experimentally critical because, as seen, if the entry angles slightly exceed the optimal values, there is a risk that the coupling between the channels generated by the solitons is not sufficient to allow energy exchange between them.
Fig. 4.9 Learning curves of an X-Junction neuron with both input channels excited. The recognition is digital: keeping one input channel (A) fixed in state 1, the other (B) is recognized as state 0 or 1 depending on whether its power is less than or greater than 0.8. Reprinted under CC-BY-4.0 license from [15], © The Authors 2021
Lithium niobate on insulator (LNOI) films have much faster dynamics than bulk crystals [14]. As will be shown in the following section, light beams propagating within a LiNbO3 photorefractive film, at the same power as in bulk, are able to focus to complete soliton formation in times two orders of magnitude shorter. Given the speed of the process, it is critical to manage the entry of the beams optimally. The use of crystal films ensures optimal handling experimentally.
Indeed, the planar configuration offers excellent control over the beams' propagation and over their coupling. The experiments that this section describes are performed on 8-μm-thick LiNbO3 films. The fabrication procedure of the LNOI used is extensively described in [14]. It is obtained from a 500-μm-thick z-cut congruent LiNbO3 wafer bonded to a silicon substrate using a UV-cured adhesive. The produced film is ground and mechanically polished to obtain a layer 8 μm thick, separated from the silicon substrate by an adhesive several micrometers thick. The resulting wafer is diced with a diamond saw to produce 10-mm-long samples. Finally, the faces are polished. The adhesive is transparent at the wavelength used, with a refractive index of about 1.5, while the LiNbO3 film (n ∼ 2.2) is a low-loss slab waveguide with high index contrast. As introduced in the previous sections, the solitonic neuron paradigm is based on the photorefractive effect. This phenomenon is usually slow, as it is related to the photogeneration and displacement of electric charges, which differ according to the material used. Among bulk dielectric crystals, a fast response has been observed, for example, in BSO [26] or SBN [8], while some media such as LiNbO3 show a very slow evolution. However, lithium niobate is a versatile medium, suitable for countless applications, as it exhibits large electro-optic and nonlinear coefficients. Recently, work [14] has been published showing that the soliton formation time in LiNbO3 crystals can be reduced drastically, by two orders of magnitude, by using thin films instead of bulk crystals. Figure 4.10 shows the response time of the self-focusing effect of a single light beam as a function of the power of the incoming laser beam, on a logarithmic scale, in the lithium niobate film.
Fig. 4.10 Trend of soliton channel formation time versus writing power in an LNOI. Formation occurs two orders of magnitude faster than in bulk crystals
4.4.1 The Physics of the LNOI

To understand the advantages offered by LiNbO3 layers over their bulk counterpart, it is important to analyze the physical dynamics underlying the light self-confining process (at the base of spatial soliton formation). The dynamics of the refractive index in a photoinduced layer changes profoundly compared to that characterizing a bulk crystal with an applied electrostatic bias field [27] or with the pyroelectric effect [28]. The entire description was formulated within a research collaboration with the optics department of the Femto-ST Institute in Besançon (France), and for more details some of their papers on the subject are recommended [14]. Let us then consider the sequence of Figs. 4.11, 4.12, 4.13, and 4.14: they show the evolution of the electric charges and field in a c-oriented layer of LiNbO3 under the pyroelectric effect. Since this is a ferroelectric crystal, the P-polarization depends on the contributions of the dipole moments of the individual unit cells and is oriented along the c-axis. The P-polarization is accompanied by equivalent electric charges on each face of the layer, as shown in Fig. 4.11. The positive electric charges are on the c face while the negative ones are on the −c face. However, as long as equilibrium persists, the net field inside the crystal is zero, since the presence of free charges offsets that of bound charges. When the Peltier cell is activated, the case shown in Fig. 4.12, a thermal gradient is generated, raising the temperature from T0 to T0 + ΔT.
Fig. 4.11 Schematic representation of the soliton by pyroelectric effect in the LNOI. P-polarization arises due to the presence of electric charges on the layer faces and is oriented as the c-axis
Fig. 4.12 Schematic representation of the soliton by pyroelectric effect in the LNOI. By applying a thermal gradient, electrical equilibrium is lost and an Epy field is generated
Fig. 4.13 Schematic representation of the soliton by pyroelectric effect in the LNOI. The presence of localized light results in local elevation of the refractive index
Fig. 4.14 Schematic representation of the soliton by pyroelectric effect in the LNOI. When the thermal gradient is canceled, the polarization returns to its initial value, but the charge distribution remains inhomogeneous: it follows that the electric field changes sign. The index profile of this configuration is the same as the preceding one
The individual dipole moments decrease instantaneously and consequently so does the polarization P of the crystal. Simultaneously with the change in polarization, a net bias field E_py, oriented along the c-axis and proportional to the thermal gradient, arises:

E_{py} = - \frac{P_y\, \Delta T}{\varepsilon}   (4.15)
where ε is the static dielectric constant of the medium and P_y the pyroelectric coefficient of lithium niobate. A thermal gradient of about 20 °C corresponds to a bias field E_py of just over 40 kV/cm. This value of the bias field, as we shall see, is optimal for good conduction of the light signal. Lower values do not allow the linear diffractive propagation regime to be overcome, and thus the light beams to focus, while higher values lead to a deflection of the written waveguide, not guaranteeing optimal propagation. The bias field results in a general lowering of the refractive index through the electro-optic effect. Inserting a beam of light capable of propagating in the crystal with a fundamental mode shows that the light screens the electro-optic effect, resulting in a local raising of the index; see Fig. 4.13. Specifically, as a result of the photoionization process, free carriers are generated. Their presence increases the local conductivity of the material. Free charges of opposite sign move across the illuminated area with low resistivity to
recombine. It is precisely this process that occurs faster in thin guides than in the bulk counterpart. In the region bounded by the beam there are no free surface charges and no electric field, while they remain constant in the remaining space, as shown in Fig. 4.13. As a consequence of the electro-optic effect, there is a peak in the refractive index in the illuminated region, which decreases as one moves away from the beam. When the thermal gradient is canceled (ΔT = 0) and the initial temperature is reached, the polarization is restored to its initial value, but the distribution of free charges remains inhomogeneous, preserving the refractive index profile just seen. A direct consequence is that the electric field changes sign: it becomes maximum in the central region of the beam; see Fig. 4.14. This configuration is characterized by a refractive index profile almost identical to the previous one. It follows that the same ability to act as a waveguide remains [14]. To summarize, the main difference in the self-trapping mechanism between layers and bulk is that in the former it is due to the recombination of already available charges that alter an electric field, while in the latter the insertion of a light beam results in the photo-generation of electric charges that screen an external electric field. The reduction in the time required for soliton formation depends precisely on this different physical mechanism. In the bulk, a longer wait is required for enough electric charges to be photoionized to screen the field; in the film this situation is already present, since free charges are already available.
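As a rough numerical check of Eq. 4.15, one can plug in textbook values for lithium niobate; the pyroelectric coefficient and static permittivity below are assumed approximations, and published values vary, so the result should be read as an order of magnitude only:

```python
# Order-of-magnitude check of Eq. 4.15 with assumed textbook constants.
eps0 = 8.854e-12      # vacuum permittivity [F/m]
p_y = 8.3e-5          # |pyroelectric coefficient| of LiNbO3 [C m^-2 K^-1] (approx.)
eps_r = 30.0          # static relative permittivity along c (approx.)
dT = 20.0             # thermal gradient [K]

E_py = p_y * dT / (eps_r * eps0)      # ~6e6 V/m
print(E_py / 1e5, "kV/cm")            # a few tens of kV/cm, the same order
                                      # as the ~40 kV/cm quoted in the text
```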
4.4.2 LNOI X-Junction Writing

The solitonic X-Junction neuron has been successfully realized experimentally [29]. Its description will enable the reader to understand the potential of SNN networks, which arises from the plasticity of their constituent X-Junction neurons. For the optical characterization, the beam of a 532 nm CW laser is focused with a spherical lens of 80 mm focal length onto the input face of the layer. The output goes directly to a camera through a microscope objective. The crystal is placed on a stand connected to a Peltier cell for thermal gradient management. The experimental set-up is shown in Fig. 4.15a. A first beam travels in a straight direction, while a second beam is generated from the first through a beam splitter (BS). The two beams recombine within a second BS, whose rotation allows them to be separated by the right distance to intersect at the center of the crystal, according to the pattern shown in Fig. 4.15b. The rotation of the second BS is critical because it allows the angles of the beams to be calibrated. As mentioned earlier, in order to allow energy transfer from one channel to the other, the junction must bring the two channels sufficiently close, which means that the entry angles with respect to the normal passing through the center must be small enough.
Fig. 4.15 a Experimental setup for X-junction formation. b Detailed top view of the trajectories of the two beams before and within the LiNbO3 slab waveguide. Reprinted with permission from [15] © The Optical Society
The CW laser beam at 532 nm is divided into two collimated beams through a Mach–Zehnder-type delay line, set longer than the coherence length so as to achieve a fully incoherent condition between the beams and thus avoid any interference phenomenon. Their mutual power ratio is controlled through neutral-density filters. The Mach–Zehnder output recombination beam splitter sets a small angular deviation between the two beams, as shown in Fig. 4.15b. This angular deviation is magnified by the focusing lens placed at focal distance from the input face of the sample. Each of the two beams forms a small angle of about 0.4° with the normal direction; the total angle between the two beams at the entrance face is therefore 0.8°. Larger angles do not ensure coupling between the channels and thus the learning dynamics typical of the X-Junction neuron. Consequently, the angle must lie strictly in the narrow range around these values. In fact, signal-direction switching depends on the refractive index contrast, especially in the junction area. This is the soma (processing center) of the photonic neuron. Its tasks are first to weigh the injected signals and then to direct them toward specific outputs. If the angle is too large, the resulting junction region will be too small to allow energy coupling between the channels. The optical apparatus just described generates two focused light beams of about 10 μm FWHM on the input face of the LiNbO3 film (which is as long as 5 diffraction lengths). Finally, the output face of the waveguide is imaged by a camera with a magnifying optical system. To achieve optimal junction resolution, alignment is a very important step. An approximate alignment could generate a junction in which the beams are not close enough to exchange energy according to the index modulation, or not close enough over a sufficient extent. In addition, good alignment avoids multimode coupling, which can affect the efficiency of the switch and therefore soliton formation. This procedure is performed with very low powers, around 10 nW, to avoid any photorefraction
effect, since the layer is very sensitive. The two beams are aligned separately: first the transmitted beam and then the reflected one. After optimal coupling is achieved, two diffraction lines, corresponding to the diffractive (linear) propagation of the beams, can be clearly distinguished. At this point, the beams are blocked to stop their propagation inside the crystal, and the temperature of the sample is raised to the target value to induce a constant electric field inside the sample through the pyroelectric effect. Once the temperature is stabilized, the two beams are allowed to enter the sample freely, and the channel-writing phase begins. In the initial stage of the experiment, the beam power is set below 10 nW to avoid nonlinear effects and to capture the diffraction-dominated beam distribution, as shown in Fig. 4.16a. This case corresponds to the linear propagation regime. When the temperature becomes stable and homogeneous over the entire Peltier cell, the beam power is increased by removing the neutral densities. The first step is to inject the two crossing beams with the same power (10 μW) to write the X-junction. Depending on the ratio of the writing powers, symmetric or asymmetric X-junctions can be realized, as described in the previous section. After writing the waveguides, a probe signal at the same wavelength is inserted. Again, a power of 10 nW is used so as not to affect the refractive index of the crystal. Indeed, soliton formation also depends on the injected lasers' intensity. If the ratio between the input powers of the writing beams is balanced, i.e., W1/W2 = 1, the generated solitonic channels are characterized by the same refractive index modulation along their full extent and in the nodal region. The resulting information-signal transfer is about 50% to both output channels, as shown in Fig. 4.16b. By varying the powers of the writing beams, asymmetric X-junctions can be generated. Specifically, by increasing the power ratio up to W1/W2 = 2, the signal switching is about 70% into the more intensely written channel and about 30% into the more weakly written channel, as shown in Fig. 4.16c. For power ratios
Fig. 4.16 Light distribution on the output face of an LNOI film: a beam diffraction in the linear regime; b two self-confined beams with the same writing power in the solitonic regime: the signal splits 50% into the two outputs; c if one channel is written with twice the power of the other, the signal splits 70–30 toward the channel with higher index contrast; d if one channel is written with three times the power of the other, the signal splits 80–20 toward the channel with higher index contrast. Adapted with permission from [15] © The Optical Society
up to W1/W2 = 3, the nodal switching of the junction is just over 80% to the strongest channel and just under 20% to the weakest channel, as shown in Fig. 4.16d, which corresponds to the results found in bulk [21]. Figure 4.17 shows the evolution of the information-signal switching as a function of the power ratio of the writing beams. Interestingly, there is a limiting value of this ratio beyond which the splitting of the junction remains equal in percentage. This result has been found experimentally and theoretically in both bulk crystals [21] and thin films [29] of lithium niobate, and in strontium barium niobate [30]. A physical and mathematical theory has not yet been formulated to fully explain this behavior. However, it is believed to be connected to the characteristic saturability of the refractive index of these materials, the same property that allows the formation of solitons. Once the nonlinearity underlying the refractive index change is excited, the signals inserted into the solitonic waveguides no longer respond to linear physical laws. This means that the behavior just described does not depend on the input channel into which the probe signal is inserted, but only on the ratio of the input powers of the writing beams. This is evidenced by Fig. 4.18, which shows a 70–30% unbalance for the W1/W2 = 2 ratio: in particular, Fig. 4.18a shows the output face for the configuration in which the signal is inserted into the channel corresponding to the strongest output. Figure 4.18b shows the propagation of the signal taking node switching into account. If the signal enters the input opposite the strongest output, the switching remains unchanged, as Fig. 4.18c and d show. This holds for either input: the physics of the junction always responds to the switching law as a function of the input powers. The obtained results suggest that the two solitons, running very close to each other, exchange energy by altering the refractive index of the junction zone. When the information arrives at the node, it encounters a local index alteration that directs it even before the two solitons reach the end of the junction and separate at the exit. Therefore, the unbalance is established within this region.

Fig. 4.17 Experimental switching ratios plotted together with theoretical trends, as a function of one channel's writing power normalized to that of the other channel. Adapted with permission from [15] © The Optical Society
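Since no full theory of this saturation is available, the limiting behavior seen in Fig. 4.17 can still be captured by an empirical saturable fit. The functional form and constants below are purely hypothetical, chosen only to reproduce the reported 50/50, roughly 70/30, and roughly 80/20 splittings and the plateau at large ratios:

```python
import math

# Hypothetical saturable fit of the switching ratio vs writing-power ratio.
# Not the theoretical curve of Fig. 4.17: an empirical form for illustration.
def split_fraction(r, f_max=0.85, a=0.85):
    """Fraction of signal routed to the stronger channel for power ratio r >= 1.
    Saturates at f_max, echoing the experimentally observed limit."""
    return f_max - (f_max - 0.5) * math.exp(-a * (r - 1.0))

for r in (1.0, 2.0, 3.0, 5.0):
    print(r, round(split_fraction(r), 2))   # 0.5, ~0.7, ~0.79, ~0.84
```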
Fig. 4.18 Light distribution on the output face of an LNOI film. An 80–20 unbalance is shown. In figure a the signal is inserted within the channel opposite to the reinforced output b; in figure c the signal is inserted within the channel corresponding to the reinforced output d
4.4.3 LNOI X-Junction Erasing

Biological neural networks are able to build new connections and to modify the intensity of specific connections. In this way, it is possible to build dynamic, self-organizing circuits that can adapt according to the circumstances and the external stimuli received [31]. In some cases, however, it is necessary to "forget" some information in order to learn and store new data. In synaptic terms, this coincides with the deletion of a neural district or with the destruction of a specific connection [10]. Being able only to write a circuit is therefore not synonymous with a system that can replicate the functioning of the brain. One must also know how to "unlearn." SNNs exploit spatial-soliton technology to self-assemble. The same physical laws, however, open up a localized erasing mode that allows connections to be selectively destroyed [32]. This section introduces a novel method for erasing solitonic structures. Exactly as the biological brain resorts to inhibitory neurotransmitters to stop the formation of a synapse and, ultimately, to erase a connection, solitonic neurons exploit photovoltaic currents to open previously formed channels and return from a nonlinear solitonic regime to a linear diffractive one [29]. Numerous studies have investigated the permanence of waveguides made by spatial solitons. In fact, the response time of the material, and consequently of its memory, depends on the chemical and physical characteristics of the crystal. Lithium niobate exhibits medium-to-long times compared to the crystals usually
adopted. In fact, once built, neurons in lithium niobate are stable, which means it takes a long time for them to dissolve. Many papers can be found in the literature recounting the erasure of waveguides obtained by solitons [33–35]. In all cases this process was used to restore the physical condition of the crystal so that a new writing process could be performed. The crystal must be placed near a source of heat and light and allowed to radiate for between hours and a few days, depending on the index modulations to which it was subjected and the type of material. This causes a redistribution of charge, due to the photovoltaic effect [35, 36], which slowly counteracts the self-focusing process introduced in the first section of this chapter, first thinning the channels and then completely eliminating the waveguides. Progressively the local index contrast is lost, and with it the confinement of light. The crystal is thus returned to physical conditions that allow it to be rewritten. Before going into some of the implications that this way of proceeding may have for neuromorphic logic, it is important to point out that this procedure limits any kind of application: it requires a time that varies according to the material but is in any case generally long to restore the experimental physical conditions of the crystal. In addition, it requires that the crystal be placed in the vicinity of a heat and light source, which most of the time demands a displacement of the crystal, leading to a change in the experimental conditions. From the standpoint of realizing neuromorphic hardware, we have already discussed how important it is to be able to realize new paths and consequently new geometries. Similarly, it is critical to be able to change the mappings and weights of individual connections, to the point of completely erasing entire structures. Moreover, a system that intends to replicate the way BNNs learn and reason must be able to act autonomously and plastically. In the course of experiments with X-Junction neurons on LiNbO3 films, it has become apparent that partial or complete erasure can be achieved quickly and selectively. The principle on which this method is based is to restore the physical conditions present before the writing phase, as in [44]. This means being able to shift the charges localized during soliton formation until re-homogenization within the crystal is achieved, by exploiting an amplification of the photovoltaic terms [37, 38]. First, it is necessary to darken the light sources (lasers) with which the guides were written and to turn off the thermal source so that the thermal gradient can be nulled (ΔT = 0). Indeed, as for the writing process, the thermal gradient is also important for erasure: it must be zero with respect to the initial reference temperature to interrupt the excitation of the pyroelectric field (E_py) underlying the photorefractive nonlinearity. Obscuring the beams avoids any reinforcement of the already written guides and facilitates their erasure, while simultaneously preventing local heating due to the propagation of the light itself. At this point, for the balanced (unbalanced) configuration, both inputs (a single input) are activated with an input power about 80 times that used for the writing process. Typical writing powers are of the order of 10 μW, allowing the formation of solitonic waveguides in a time of the order of 5 min. When erasure is performed, the
power of the input beams is increased to 800–1000 μW. This produces a fast movement of charge that redistributes homogeneously throughout the crystal, canceling the local refractive index contrast at the base of the solitonic structures. Once the thermal gradient is cancelled and the initial temperature is restored, the rate of structure cancellation depends only on the power of the incoming beams. If the structure is unbalanced, the strongest channel needs the longest time to be erased, as it has undergone the largest index change, as evidenced by the sequence of frames in Fig. 4.19. The frames were chosen to show the change induced in the X-Junction neuron by the erasing procedure in the short term, within 1 min of the start of the procedure, and in the long term, beyond 1 min. Specifically, the figure shows the deletion of an unbalanced X-Junction neuron whose left channel (looking at the image) is characterized by a higher local index contrast than the right channel. The two channels were written with a power ratio of W1/W2 = 3, which implies that the channel written at power W1 encloses within itself about 80% of the injected light. To erase this channel it is necessary, at the same laser input power, to wait longer. Progressively the light begins to lose its solitonic connotation and to propagate with diffraction. As a result, after about 1 min the weak channel again shows a diffraction-line output pattern. The second channel returns to the linear regime 6 min after the start of the cancellation procedure.
Fig. 4.19 Light distribution on the output face of an LNOI film. The time-dependent erasing sequence of a previously written unbalanced X-Junction neuron is reported. We move from a still-confined situation (a) to a return to the diffractive regime (h). The channel with higher index contrast needs more time to return to the linear regime
This dynamic is reminiscent of the one recounted in Chap. 1 on memory formation. Kandel, in fact, showed that when a stimulus has been presented numerous times, the system's response goes from being short-term to long-term. Iteration over time thus fixes memory. In order to eliminate information that is embedded in neural circuits (long-term memory), it is necessary to wait much longer than for information that is still susceptible to change (short-term memory). Similarly, a solitonic guide with high index contrast, as in the case of the channel written at power W1, is able to keep the information more confined, so that it becomes rooted within the solitonic structure. If, however, the index contrast is low, as in the case of the channel written at power W2, then its erasure becomes faster. In conclusion, this paragraph has shown that it is possible to selectively erase the geometry of an unbalanced X-Junction neuron simply by choosing the powers of the erasure beams. When the neuron is balanced, see Fig. 4.20, the charge shift required for erasing is the same for the two channels, which show virtually identical trends. The small differences that can be observed result from a slight difference in the power used during the writing phase, which is difficult to set precisely in experimental realizations. The availability of more, and more advanced, tools in the future will allow the realization of very precise waveguide-writing systems. This feature nevertheless demonstrates that the erasing process is particularly sensitive to the conformation of the structure being written. In the balanced case, the evolution of the two channels shows the same time course: with injection powers of 1000 μW the light spots begin to break up as early as 30 s; the light then broadens at the sides until, after 6 min, it resolves again into the diffraction lines typical of the initial conditions without self-focusing.
Fig. 4.20 Light distribution on the output face of an LNOI film. The time-dependent erasing sequence of a previously written balanced X-Junction neuron is shown. We move from a still-confined situation to a return to the diffractive regime. The two channels require an equal time interval to return to the linear regime
After studying the erasing cases of an unbalanced and of a balanced structure, it is useful to analyze a neuromorphic application of the weights. We have seen through Figs. 4.19–4.20 that it is possible to unbalance a neuron by iterating the injection of light to which it is sensitive. This procedure coincides with the learning of the neuron. But the question becomes: given an unbalanced neuron, is it possible to operate on the stronger channel to weaken it and thus return to a balanced situation? Indeed, it has already been shown that if the junction is unbalanced it is always possible to reinforce the weak channel in order to balance the neuron. Using terms that echo biological neuroplasticity: is it possible to inhibit a channel? By taking advantage of the selective erasing procedure, that is, by affecting only one target channel, it is possible to achieve an evolution such as that shown in Fig. 4.21, in which an unbalanced neuron is returned to a balanced configuration. Interestingly, it is still possible to speak of an X-Junction neuron, since this operation allows the system to remain in the nonlinear soliton regime without returning to the diffractive pattern. All the described erasure procedures were performed successfully over three successive attempts: the erasure was repeated after each write procedure, always returning the system to its initial diffractive state. Only in the case of the balancing operation, starting from an unbalanced neuron, does the system switch to a still-nonlinear state different from the original linear one. This, however, amounts to erasing a learned piece of information and giving the intelligent unit a chance to learn something new. From this configuration, by applying the erasing procedure for a balanced junction, it is then possible to return to the diffractive state. Research is currently under way to understand whether there are limits to the number of successive repetitions after which the initial experimental conditions can be re-established, or whether there is a progressive slowing of the erasing dynamics as the number of iterations increases. This may depend on the fact that, with each new iteration, the light introduced is unable to promote the recombination of charges that accumulate in inaccessible regions of the crystal. Moreover, the proposed erasure technique is very useful for experiments. It allows the initial experimental conditions to be restored without changing the set-up, which could otherwise lead to slightly different results. It also greatly shortens the waiting time before the same experiment can be repeated. Its advantages are also considerable from an application point of view. It is extremely useful in the implementation of hardware artificial neural networks, as it allows the weight each channel assigns to the incoming information to be varied, increasing or decreasing it with respect to another connection, forming precise signal mappings. Similarly, biological synapses can be strengthened, if the input information is relevant, or weakened, if the information is sparse. The deletion process opens up the possibility of studying the dynamics of solitonic neuron weakening, tying these innovative neuromorphic structures even more closely to their counterparts in biological neural tissue.
Fig. 4.21 Light distribution on the output face of an LNOI film. Selective erasing sequence. The refractive index of only the channel with higher contrast is modulated to return to a balanced condition
4.5 Mathematical Model of the X-Junction Neuron

Underlying the learning of the X-Junction neuron is the dynamics of the energy switch between waveguides generated via spatial solitons. As shown above, their formation depends on the application of a homogeneous bias electric field in the crystal, which can be generated through electrodes or, by the pyroelectric effect, with a Peltier cell. This field lowers the refractive index over the entire region on which it acts. Subsequently, through the propagation of light at precise wavelengths (to which the crystal is sensitive), the bias field is shielded locally, resulting in a
local raising of the index. By exploiting the intersection of two solitons with a precise angle of entry relative to the normal, the characteristic X-shape, in which two inputs and two outputs are recognized, can be realized. When the system is balanced, that is, when the two channels have been written with the same light intensity, the neuron has yet to learn and the information is divided equally between the two outputs. If one of the two channels is more informative, the neuron changes its geometry according to the information propagation. This process simultaneously allows information to be processed and stored through structural evolution. X-junction switching shows that the physics of learning does not allow a channel to be completely emptied: the neuron stabilizes at most in a state of 80–20 imbalance, as reported in the previous paragraph; this means that the stronger output contains 80% of the injected signal while the weaker channel contains 20%. To highlight these mechanisms, a mathematical discussion of the physical signal transmission processes seen for the single neuron and for a simple model of an SNN network is now proposed. The switching dynamics is completely described mathematically by the matrix equation:

$$Y = W(E_{bias}, X)\cdot X \quad (4.16)$$
where X and Y represent the input and output vectors of size two, respectively, and W is the node energy-transfer matrix, which, in neural network terms, corresponds to the synaptic weight of the connection. W underlies the switching behavior and depends on the electrical bias Ebias and on the input vector X (Fig. 4.22). The weight matrix for the X-junction is a 2 × 2 nonlinear matrix that changes according to the learning process. A mathematical expression for it was developed from the experimental results reported in [15, 29]. It is characterized by four terms, two for each channel.
Fig. 4.22 Functional structure of the X-Junction neuron. The neuron is characterized by two inputs (x1 and x2) and two outputs (y1 and y2). Signal processing takes place at the node (where the channels overlap), the behavior of which can be described through the transfer matrix W
$$W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 + \frac{1}{2}\,\frac{Q_1}{Q_1+Q_2}\,(P_1+\chi) & 1 - \frac{1}{2}\,\frac{Q_2}{Q_1+Q_2}\,(P_2+\chi) \\ 1 - \frac{1}{2}\,\frac{Q_1}{Q_1+Q_2}\,(P_1+\chi) & 1 + \frac{1}{2}\,\frac{Q_2}{Q_1+Q_2}\,(P_2+\chi) \end{pmatrix} \quad (4.17)$$
W11 and W22 are the terms responsible for filling channel 1 and channel 2, respectively, while W12 and W21 are the terms responsible for emptying them according to the dynamics shown below. Through these four weights it is possible to completely describe the dynamics of channel building and strengthening. These depend, as already shown, on the input light intensity. In particular, the input can be characterized by successive iterations, so the weights depend on the input signal intensities over time. Therefore, the quantity

$$P_i = P_i(t-1) = \frac{x_i(t-1)}{x_1(t-1)+x_2(t-1)+x_D} \quad (4.18)$$
represents the normalized balance of the i-th channel up to the previous iteration time, while the quantity

$$Q_i = Q_i(t) = \frac{x_i(t)}{x_1(t)+x_2(t)+x_D} \quad (4.19)$$
represents the normalized signal injected into the i-th channel at time t. The term xD describes the dark radiation power [16] circulating within the neural photonic circuit. The photorefractive crystals in which first the X-Junction neuron and then the SNN networks are implemented have a saturating refractive index [39], a characteristic that makes them suitable for the formation of spatial solitons [27]. This aspect is crucial to explain the logic of node switching. To take the saturating trend into account, a χ factor was defined that depends on a maximum saturation level SAT:

$$\chi = \frac{SAT}{1+(P_1+P_2)^2} \quad (4.20)$$
The χ parameter thus weighs the total energy that has flowed at earlier times against the maximum saturation level SAT. It modifies the saturation tendency of the structure and consequently the learning process and the switching speed of the node. In particular, different degrees of maximum unbalance of the X-junction are observed as the SAT parameter changes, as shown in Table 4.1.
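To make the switching formulas concrete, the following minimal Python sketch transcribes Eqs. 4.16–4.20 for a single node and applies them to one iteration. It is a reading aid rather than the published implementation: the dark term x_D = 0.01, the input powers and the choice SAT = 2 are illustrative values, not measured parameters.

```python
import numpy as np

def weight_matrix(P, Q, SAT):
    """Node transfer matrix W of Eq. 4.17.

    P : normalized channel balances up to time t-1 (Eq. 4.18)
    Q : normalized signals injected at time t      (Eq. 4.19)
    """
    chi = SAT / (1.0 + (P[0] + P[1]) ** 2)            # saturation factor, Eq. 4.20
    a1 = 0.5 * Q[0] / (Q[0] + Q[1]) * (P[0] + chi)    # channel-1 filling/emptying term
    a2 = 0.5 * Q[1] / (Q[0] + Q[1]) * (P[1] + chi)    # channel-2 filling/emptying term
    return 0.5 * np.array([[1 + a1, 1 - a2],          # W11, W12
                           [1 - a1, 1 + a2]])         # W21, W22

x_dark = 0.01                             # dark radiation term x_D (illustrative value)
x_now  = np.array([1.0, 0.0])             # only channel 1 is injected at time t
x_hist = np.array([0.5, 0.5])             # balanced history: freshly written junction

Q = x_now  / (x_now.sum()  + x_dark)      # Eq. 4.19
P = x_hist / (x_hist.sum() + x_dark)      # Eq. 4.18
y = weight_matrix(P, Q, SAT=2.0) @ x_now  # Eq. 4.16: Y = W(E_bias, X) . X
print(y)                                  # ~[0.88, 0.12]: the node unbalances toward channel 1
```

Note that the columns of W sum to one, so a single application of Eq. 4.16 redistributes the injected energy between the two outputs without creating or destroying it.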
Table 4.1 The unbalancing of the mathematical model of the X-Junction neuron occurs as a function of the iterations and of the SAT parameter. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022

SAT    SJR (single junction ratio)
1      0.7–0.3
2      0.8–0.2
3      0.9–0.1
4.6 Tissue Properties of the X-Junction Neuron

Biological neural networks function as complex systems. In fact, the different districts that compose them can influence each other even when they are topologically located in areas far apart. The description of learning dynamics therefore does not end with repeating the functionality of the single unit as many times as there are neurons in the network. Neuromorphic hardware developed to date attempts to reproduce the functional behaviors of neurons, synapses and networks but cannot act globally as a tissue. The neuromorphic paradigm introduced by optical hardware based on solitonic technology depends instant by instant on the physical properties and dynamics of the material in which the self-assembling neurons are formed. Moreover, as will be shown in Chap. 5, information learning for the SNN network depends on the whole system and not on individual units or a limited region. The aim of this section is to read signal transmission between X-Junction neurons from a biological perspective and to show how the solitonic neural structure approaches the concept of biological neural tissue. As seen, focused conduction of electrical impulses along axons is ensured by the presence of myelin, which discretely surrounds the output channel. A different amount of myelin results in a different signal conduction efficiency. Too little myelin, or none at all, can result in generalized signal spreading [40]. These are conditions often associated with the development of pathologies that alter communication between neural cells [41]. The presence of the electrostatic bias field Ebias in the X-Junction neuron plays a role quite similar to that of myelin in the biological case, directly intervening in the phenomenon of self-focusing of light in photorefractive media, thus limiting losses by optical diffraction and promoting the confinement of the signal within the written channels and the subsequent transmission of information. The light-beam self-confinement process is a consequence of the electro-optical change in the refractive index induced by the Ebias field, which, according to Eq. 3.5, causes its decrease. To be confined, the light must "feel" a higher refractive index than the surrounding one. For this reason, a positive local change in the refractive index is achieved by applying a polarization field to the entire crystal, which lowers the index everywhere and is shielded in the small region where the light is located, so as to raise the index value there again. Ebias is thus simultaneously a necessary factor for the propagation of information
within the channels and a "nutrient" for the channels themselves. As the electrostatic field strength increases, in fact, a higher local index contrast is generated, as a result of the higher photo-production rate needed to compensate the decrease in index. In this sense, one can recognize in the increased supply of electrical charges a nourishment of the X-Junction neuron, which is "strengthened" as a result and is thus better able to propagate light within itself while avoiding significant losses. If the nonlinear dynamics were not excited, the light propagating in the crystal would be characterized by diffuse diffraction without preferring any path. Two different developments would then be possible. In the case where no pre-existing structure had been written, the light would fail to construct any waveguide and the entire neuromorphic dynamics would be compromised. In the case where, on the other hand, a refractive index mapping was pre-existing, the incoming light would propagate to the entrance of each successive channel without any kind of spatial preference. Diffraction results in a distribution over all channels without establishing a specific path. It is intuitive that when these conditions are present no learning is possible. Chapter 1 showed that from the biological point of view such a situation corresponds to an inaccurate impulse conduction that reaches different districts without focusing on the target. Involuntary movements ensue. This condition is often associated with the nervous system of children, in whom neural tissue is still developing, connections are not stable, and neurons precisely lack myelin. It is also associated with neurodegenerative disorders [42]. To switch to the nonlinear regime, it is necessary to distribute over the crystal a homogeneous electrostatic field that can excite the photorefractive effect. By appropriately increasing the electrostatic bias field, it is thus possible to achieve sufficient index change to confine the light, so that it is unable to leave the channel in which it is propagating and is able to generate a specific path for each incoming light pattern, allowing learning dynamics. With a low-intensity field (up to 10 kV/cm), the achievable local refractive index change is very small. Consequently, the index contrast that is created remains insufficient to confine the light, which continues its path within the crystal following diffractive logic: there is no possibility of learning. By increasing the bias field, the index contrast between the illuminated region and the rest of the crystal becomes considerable and sufficient to self-focus the light according to the dynamics addressed in the previous paragraphs. The signal propagates within a specific channel. If the values of the electrostatic field become too high, however, the written waveguide becomes unstable, starts to pulsate, and the propagating signal bounces within it; this is a condition of strong instability. Figure 4.23 shows the evolution of a solitonic channel as the intensity associated with the bias field (Ebias) changes, together with the relative propagation of the signal. For simplicity, the dynamics related to a single solitonic channel is shown without loss of generality. The second property of myelin introduced in Chap. 1 is the dependence of conduction velocity on myelin. The presence of myelin bundles around the axon branch
Fig. 4.23 From top to bottom: evolution of solitonic waveguide formation (left column) and signal propagation within it (right column) as a function of electric bias field. As Ebias increases, light confinement increases and signal propagation is more focused. However, above 50 kV/cm a pulsation phenomenon occurs and the waveguide begins to lose some of the signal
promotes the condition of jumping (saltatory) conduction, which accelerates electrical propagation by allowing the stimulus not to travel through the entire axon extension but to jump from one bundle to another, resulting in an increase in travel speed. In general, conduction velocity increases proportionally (though not linearly) with the length of the myelin sheath [43]. However, when the sheath length reaches a definite threshold value, the conduction velocity passes its peak and begins to decrease. For precise numerical values, specific manuals on the study of biological neurons and myelin contributions should be consulted. Similar behavior has been observed in X-Junction neurons. If the signal intensity collected from the output channel is plotted as a function of increasing electrostatic field, a characteristic curve is observed, Fig. 4.24b, very similar to the biological curve, Fig. 4.24a: it rises until it reaches a peak, after which it inexorably begins to fall. The effect of myelin on neural signal conductivity is thus quite similar to the effect of the electric bias field on the propagation of optical signals within self-written solitonic guides. Low values of myelin and bias do not allow the proper propagation of information, but excessively high values restrain its evolution. Excessive increases in electrostatic field strength also result in lower information transmission capacity, similar to the problems raised by overproduction of myelin (see Chap. 1).
Fig. 4.24 In (a), trend of signal conduction as a function of myelin length; in (b), trend of output signal intensity as a function of the Ebias polarization field

References
1. Seung, H. 2000. Half a century of Hebb. Nature Neuroscience 3 (Suppl 11): 1166. https://doi.org/10.1038/81430.
2. Kordts, A., M.H.P. Pfeiffer, H. Guo, V. Brasch, and T.J. Kippenberg. 2016. Higher order mode suppression in high-Q anomalous dispersion SiN microresonators for temporal dissipative Kerr soliton formation. Optics Letters 41: 452–455.
3. Leo, F., S. Coen, P. Kockaert, et al. 2010. Temporal cavity solitons in one-dimensional Kerr media as bits in an all-optical buffer. Nature Photonics 4: 471–476. https://doi.org/10.1038/nphoton.2010.120.
4. Fazio, E., M. Alonzo, F. Devaux, A. Toncelli, N. Argiolas, M. Bazzan, C. Sada, and M. Chauvet. 2010. Luminescence-induced photorefractive spatial solitons. Applied Physics Letters 96 (9): 091107. https://doi.org/10.1063/1.3313950.
5. Zitelli, M., E. Fazio, and M. Bertolotti. 1999. All-optical NOR gate based on the interaction between cosine-shaped input beams of orthogonal polarization. JOSA B 16 (2): 214–218.
6. Menyuk, C. 1987. Nonlinear pulse propagation in birefringent optical fibers. IEEE Journal of Quantum Electronics 23 (2): 174–176. https://doi.org/10.1109/JQE.1987.1073308.
7. Fazio, E., F. Renzi, R. Rinaldi, M. Bertolotti, M. Chauvet, W. Ramadan, A. Petris, and V.I. Vlad. 2004. Screening-photovoltaic bright solitons in lithium niobate and associated single-mode waveguides. Applied Physics Letters 85 (12): 2193–2195. https://doi.org/10.1063/1.1794854.
8. Shih, M., and M. Segev. 1996. Incoherent collisions between two-dimensional bright steady-state photorefractive spatial screening solitons. Optics Letters 21: 1538–1540.
9. Dorigo, M., E. Bonabeau, and G. Theraulaz. 2000. Ant algorithms and stigmergy. Future Generation Computer Systems 16 (8): 851–871. https://doi.org/10.1016/S0167-739X(00)00042-X.
10. Kandel, E.R. 2017. In search of memory: The emergence of a new science of mind. New York: W.W. Norton & Company.
11. Qinghua, L., W. Liman, A.G. Frutos, A.E. Condon, R.M. Corn, and L.M. Smith. 2000. DNA computing on surfaces. Nature 403 (13): 175–178.
12. Aaronson, S. 2005. Guest column: NP-complete problems and physical reality. ACM SIGACT News 36 (1): 30–52.
13. Fazio, E., M. Alonzo, F. Devaux, A. Toncelli, N. Argiolas, M. Bazzan, C. Sada, and M. Chauvet. 2010. Luminescence-induced photorefractive spatial solitons. Applied Physics Letters 96 (9): 091107. https://doi.org/10.1063/1.3313950.
14. Chauvet, M., F. Bassignot, F. Henrot, F. Devaux, L. Gauthier-Manuel, H. Maillotte, G. Ulliac, and B. Sylvain. 2015. Fast-beam self-trapping in LiNbO3 films by pyroelectric effect. Optics Letters 40: 1258.
15. Bile, A., F. Moratti, H. Tari, et al. 2021. Supervised and unsupervised learning using a fully-plastic all-optical unit of artificial intelligence based on solitonic waveguides. Neural Computing and Applications 33: 17071–17079. https://doi.org/10.1007/s00521-021-06299-7.
16. Bile, A., H. Tari, and E. Fazio. 2022. Episodic memory and information recognition using solitonic neural networks based on photorefractive plasticity. Applied Sciences 12: 5585. https://doi.org/10.3390/app12115585.
17. Liu, B. 2011. Supervised learning. In Web data mining. Data-centric systems and applications. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-19460-3_3.
18. Bile, A., H. Tari, A. Grinde, F. Frasca, A.M. Siani, and E. Fazio. 2022. Novel model based on artificial neural networks to predict short-term temperature evolution in museum environment. Sensors 22 (2): 615.
19. Biamonte, J., P. Wittek, N. Pancotti, et al. 2017. Quantum machine learning. Nature 549: 195–202. https://doi.org/10.1038/nature23474.
20. Cunningham, P., M. Cord, and S.J. Delany. 2008. Supervised learning. In Machine learning techniques for multimedia, eds. M. Cord and P. Cunningham. Berlin, Heidelberg: Cognitive Technologies, Springer. https://doi.org/10.1007/978-3-540-75171-7_2.
21. Alonzo, M., D. Moscatelli, L. Bastiani, et al. 2018. All-optical reinforcement learning in solitonic X-Junctions. Scientific Reports 8: 5716. https://doi.org/10.1038/s41598-018-24084-w.
22. Sharma, O. 2019. A new activation function for deep neural network. In 2019 International conference on machine learning, big data, cloud and parallel computing (COMITCon), Faridabad, India, 84–86. https://doi.org/10.1109/COMITCon.2019.8862253.
23. Syuy, A.V., N.V. Sidorov, M.N. Palatnikov, N.A. Teplyakova, D.S. Shtarev, and N.N. Prokopiv. 2018. Optical properties of lithium niobate crystals. Optik 156: 239–246. https://doi.org/10.1016/j.ijleo.2017.10.136.
24. Tari, H., A. Bile, F. Moratti, et al. 2022. Sigmoid type neuromorphic activation function based on saturable absorption behavior of graphene/PMMA composite for intensity modulation of surface plasmon polariton signals. Plasmonics 17: 1025–1032. https://doi.org/10.1007/s11468-021-01553-z.
25. Barrett, D.G.T., A.S. Morcos, and J.H. Macke. 2019. Analyzing biological and artificial neural networks: Challenges with opportunities for synergy. Current Opinion in Neurobiology 55: 55–64. https://doi.org/10.1016/j.conb.2019.01.007.
26. Fazio, E., et al. 2003. Complete characterization of (2+1)D soliton formation in photorefractive crystals with strong optical activity.
27. Fazio, E., F. Renzi, R. Rinaldi, M. Bertolotti, M. Chauvet, W. Ramadan, A. Petris, and V.I. Vlad. 2004. Screening-photovoltaic bright solitons in lithium niobate and associated single-mode waveguides. Applied Physics Letters 85 (12): 2193–2195. https://doi.org/10.1063/1.1794854.
28. Safioui, J., F. Devaux, and M. Chauvet. 2009. Pyroliton: Pyroelectric spatial soliton. Optics Express 17: 22209–22216.
29. Bile, A., M. Chauvet, H. Tari, and E. Fazio. 2022. Supervised learning of soliton X-junctions in lithium niobate films on insulator. Optics Letters 47: 5893–5896.
30. Tari, H., A. Bile, A. Nabizada, and E. Fazio. 2023. Ultra-broadband interconnection between two SPP nanostrips by a photorefractive soliton waveguide. Optics Express 31: 26092–26103.
31. Südhof, T.C. 2008. Neurotransmitter release. In Pharmacology of neurotransmitter release. Handbook of experimental pharmacology, vol. 184, eds. T.C. Südhof and K. Starke. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-540-74805-2_1.
32. Nee, I., K. Buse, F. Havermeyer, R.A. Rupp, M. Fally, and R.P. May. 1999. Neutron diffraction from thermally fixed gratings in photorefractive lithium niobate crystals. Physical Review B 60: R9896(R).
33. Ren, L., L. Liu, D. Liu, J. Zu, and Z. Luan. 2004. Optimal switching from recording to fixing for high diffraction from a LiNbO3:Ce:Cu photorefractive nonvolatile hologram. Optics Letters 29: 186–188.
34. de Oliveira, I., J. Frejlich, L. Arizmendi, and M. Carrascosa. 2003. Holographic phase-shift measurement during development of a fixed grating in lithium niobate crystals. Optics Letters 28: 1040–1042.
35. Young, L., M.G. Moharam, F. El Guibaly, and E. Lun. 1979. Hologram writing in lithium niobate: Beam coupling and the transport length in the bulk photovoltaic effect. Journal of Applied Physics 50: 4201–4207.
36. Cook, G., J.P. Duignan, and D.C. Jones. 2001. Photovoltaic contribution to counter-propagating two-beam coupling in photorefractive lithium niobate. Optics Communications 192 (3–6): 393–398. https://doi.org/10.1016/S0030-4018(01)01208-1.
37. Arizmendi, L., et al. 1991. Photorefractive fixing and related thermal effects in LiNbO3. Journal of Physics: Condensed Matter 3: 5399.
38. García-Cabañes, A., A. Blázquez-Castro, L. Arizmendi, F. Agulló-López, and M. Carrascosa. 2018. Recent achievements on photovoltaic optoelectronic tweezers based on lithium niobate. Crystals 8 (2): 65. https://doi.org/10.3390/cryst8020065.
39. Kostritskii, S., and O. Sevostyanov. 1997. Influence of intrinsic defects on light-induced changes in the refractive index of lithium niobate crystals. Applied Physics B 65: 527–533. https://doi.org/10.1007/s003400050308.
40. Sanders, F.K., and D. Whitteridge. 1946. Conduction velocity and myelin thickness in regenerating nerve fibres. Journal of Physiology 105 (2): 152–174.
41. Stadelmann, C., S. Timmler, A. Barrantes-Freer, and M. Simons. 2019. Myelin in the central nervous system: Structure, function, and pathology. Physiological Reviews. https://doi.org/10.1152/physrev.00031.2018.
42. Ettle, B., J.C.M. Schlachetzki, and J. Winkler. 2016. Oligodendroglia and myelin in neurodegenerative diseases: More than just bystanders? Molecular Neurobiology 53: 3046–3062. https://doi.org/10.1007/s12035-015-9205-3.
43. Stämpfli, R. 1954. Saltatory conduction in nerve. Physiological Reviews. https://doi.org/10.1152/physrev.1954.34.1.101.
44. Bile, A., H. Tari, and E. Fazio. 2022. Episodic memory and information recognition using solitonic neural networks based on photorefractive plasticity. Applied Sciences 12 (11): 5585. https://doi.org/10.3390/app12115585.
Chapter 5
Solitonic Neural Network Acting as an Episodic Memory
Abstract This concluding chapter introduces complex solitonic neural networks. Starting from the results obtained for the single X-Junction neuron, the connection between multiple units is described. The ability of these systems to achieve episodic recognition through the construction of refractive index mapping in nonlinear lithium niobate crystals is shown. The chapter concludes with an overview of the current limitations of these systems and a description of future prospects.
5.1 Solitonic Neural Network Acting as an Episodic Memory

In recent decades, physical and mathematical models have been developed to replicate the functioning of biological neural systems. Using complex mechanisms, these are able to collect data on the environment in which they operate, to translate them into electrical stimuli and then to process them [1]. However, this series of operations does not occur sequentially and independently. On the contrary, all neuronal regions are inevitably connected, such that the presence of a stimulus can have consequences not only on the directly stimulated unit but also on neurons that apparently are not connected with the excited region at all [2]. As already described, this phenomenon underlies Hebb's postulate, which introduces the key property of neuro-plasticity into the treatment of neural systems and of the neuromorphics inspired by them [3]. Neuro-plasticity governs learning processes from the stage of data collection, enabling their processing and, contextually, the storage of the decoded information [4]. One of the great challenges of neuromorphic systems lies precisely in the possibility of realizing an artificial neuro-plasticity that, like the biological one, enables learning by connecting the entire neural spatial extent to form a highly interconnected tissue. The neuro-plastic condition has not been fully achieved by any artificial system until now. As a result, all the developed hardware, from electronic to optical, works in a way quite similar to software artificial intelligence such as machine learning and deep learning. Their structures are static since they are able to modify and modulate individual neural connections but are unable to act globally as a tissue. The
great innovation introduced by SNN networks compared with realized neuromorphic systems [5] is the ability to exploit the photorefractive modulation of the refractive index of specific nonlinear crystals to reproduce the plastic behavior of networks [6]. For this reason, it will be called photorefractive plasticity during the following discussion.
5.1.1 Implementation of an Episodic Memory Based on the Plastic Photorefractivity of the SNNs

In the previous pages of this manuscript, the ability of spatial solitons to build reconfigurable light channels was shown. To fully understand the way SNNs learn, let us stress two important concepts. More important information is recognized as stronger signals and less important information as weaker signals. Input signal strength is nonlinearly associated with synapse strength, which characterizes the connection intensity between two or more neurons. Synaptic activity is strictly dependent on the intensity of the received inputs and is regulated by the variation of neurotransmitter density [7]. Their concentration influences signal transmission and also synaptic formation. When new information arrives, it is possible for new neuronal bridges to be built. We have seen that this phenomenon is a direct consequence of the migration of specific neurotransmitters. The second factor involved in the strengthening of biological synaptic connections, and in the realization of long-term memory, is repetition [8]. The reiteration of signal propagation along specific "neural bridges" reinforces their weight, making them privileged over others. This is a strategy generally resorted to by the biological world and is called stigmergy [10]. The stigmergic strategy is, in general, a method of communication used by decentralized and spatially distributed systems, in which individuals communicate with each other by modifying their surroundings; it allows neuronal units to modulate communication by changing the structure of the synaptic connections and, therefore, of the neurons themselves. The X-Junction neuron is able to self-modify its structure according to the inputs it receives, maintaining the processed information in the new conformation assumed, through a plastic memory. What happens when multiple X-Junction neurons are interfaced? The amount of information that can be processed by the SNN network in terms of bits depends on the number of X-Junction neurons. The first studied case of SNN consists of two X-Junction neurons capable of handling 4-bit images [6].
5.1.2 A 4-bit SNN Working as an Episodic Memory

Chapter 3 reviewed the characteristics of the psycho-memories that modern neuroscience and neuropsychology identify. Specifically, an episodic memory is capable of recording information by taking an instantaneous "snapshot." Accordingly, a recognition is episodic if it performs a bit-to-bit matching process. The first implemented SNN model was realized through two X-Junction units arranged in parallel so that a complex interconnected structure could be realized. Between the face of the crystal where the light beams enter and the output face there are, indeed, different overlapping regions, which allow the exchange of light information according to the laws already seen for the operation of the X-Junction neuron. By taking advantage of the reflection on the side edges of the crystal, moreover, channels at the central region of the network that would not have allowed any connection with the network are eliminated. The realization of an SNN network is shown in Fig. 5.1. Specifically, Fig. 5.1a shows a single neural unit (X-Junction neuron); Fig. 5.1b considers 3 parallel units giving rise, by propagation, to a complex network like that in Fig. 5.1c, characterized by hidden layers. Finally, Fig. 5.1d shows how edge reflections allow the elimination of channels that would have no connection with the remaining part of the network.
[Figure 5.1 panels: (a) single neural unit; (b) parallel units constituting a processing level; (c) intermediate hidden processing levels between the input and output levels; (d) suppression of 2 neurons from even hidden levels by total reflection on the lateral faces of the substrate]
Fig. 5.1 Schematic diagram of the implementation of an SNN architecture. The fundamental unit is the X-Junction neuron (a), which arranged several times in parallel generates the first processing layer (b). The propagation of this structure generates the subsequent layers of the network (c) that are connected in series. The even-order layers have two fewer units (d) because of the total reflections at the edges. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
Fig. 5.2 Schematic structure of a 4-bit SNN network. Each node (intersection of solitonic channels) is characterized by a weight W that determines the switch. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
This operation results in an SNN network whose structure is reported in Fig. 5.2. Three different network levels can be recognized, reminiscent of the structure of software artificial intelligence networks [10, 11]. Each layer type has a distinct functionality. The first layer (input layer) collects the input information. The number of channels corresponds to the number of bits that make up the image to be learned. Since each elementary unit has two inputs, the number of units used in the input layer is N/2 for an N-bit image. In the notation shown in Fig. 5.2, the inputs are represented by x1, x2, x3, x4. These four signals run in the corresponding solitonic channels until they reach a junction region (the nodes). For each X-Junction there is a node whose dynamics is described, as in the case of the single neuron, by a weight W. The presence of nodes identifies the transition to the second type of neural layer, termed, by analogy with the corresponding software, the hidden layer. In the case of SNNs, the number of hidden layers is fixed by the size of the network and thus by the number of parallel X-Junctions found in the input layer. Within the hidden layers, signals undergo a series of processing steps that determine the switching of individual nodes and, as a consequence, a precise mapping of light propagation. Once signal processing is finished, the last layer of the network, the output layer, begins. For SNNs, it is characterized by the same number of output channels as input channels. In fact, for episodic memory applications, the number of output channels must coincide with the number of bits to be processed [12]. This also differs from the output layer of software networks, whose dimensionality is usually rather small compared to that of the input layer and of the hidden layers. These differences arise from the learning procedure: software networks operate differently from the way the biological brain works, and therefore episodic recognition is not taken into account [13]. In Machine Learning, the dimensionality of the output layer can change depending on the type of task.
Focusing again on SNNs, the first hidden layer of the 4-bit SNN consists of an even number of nodes. From here, information flows into the second hidden layer, which features a single node that analyzes signals 2 and 3, while signals 1 and 4 are transferred to the next layer without processing. The third hidden layer is the last processing layer and again consists of two nodes. Its outputs are the outputs of the entire network. The SNN learns through successive reiterations of the signals within the structure just described, which modify the nodes of the individual units, determining precise signal encodings in the form of refractive index modulations.
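For orientation, the following small Python helper counts the units per layer implied by this description. It is a hypothetical reading aid: the rule that an N-bit network has N − 1 processing layers, alternating N/2 and N/2 − 1 nodes, matches the 4-bit case (2–1–2) but is our extrapolation, not a statement from the published model.

```python
def snn_layout(n_bits: int) -> list[int]:
    """Nodes per processing layer of an n_bits-wide SNN (hypothetical generalization:
    n_bits - 1 layers, alternating n_bits/2 and n_bits/2 - 1 nodes)."""
    full = n_bits // 2                  # parallel X-Junctions in the odd-order layers
    return [full if i % 2 == 0 else full - 1 for i in range(n_bits - 1)]

print(snn_layout(4))   # [2, 1, 2]: the three hidden layers of the 4-bit SNN
```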
5.1.3 Mathematical Model of the SNN

The mathematical description of the SNN working model is developed from the model of the single X-Junction neuron. All the dynamics are completely described by a matrix equation like Eq. 4.16 of Chap. 4. However, in the SNN case, the X and Y vectors have a number of components equal to the number of bits to be analyzed; for the 4-bit SNN there are therefore four components. In particular, each layer has its own transfer weight matrix, whose elements represent the matrices of the individual units. The odd-numbered hidden layers, in this configuration layers 1 and 3, both consist of two neural units in parallel and therefore show very similar transfer matrices. Their mathematical form is of the type:

$$W_{j\text{-odd}} = \begin{pmatrix} \frac{1}{2} + \frac{Q_1}{Q_1+Q_2}\,\frac{1}{4}\left(P_1+\chi^{j}_{1,2}\right) & \frac{1}{2} - \frac{Q_2}{Q_1+Q_2}\,\frac{1}{4}\left(P_2+\chi^{j}_{1,2}\right) & 0 & 0 \\ \frac{1}{2} - \frac{Q_1}{Q_1+Q_2}\,\frac{1}{4}\left(P_1+\chi^{j}_{1,2}\right) & \frac{1}{2} + \frac{Q_2}{Q_1+Q_2}\,\frac{1}{4}\left(P_2+\chi^{j}_{1,2}\right) & 0 & 0 \\ 0 & 0 & \frac{1}{2} + \frac{Q_3}{Q_3+Q_4}\,\frac{1}{4}\left(P_3+\chi^{j}_{3,4}\right) & \frac{1}{2} - \frac{Q_4}{Q_3+Q_4}\,\frac{1}{4}\left(P_4+\chi^{j}_{3,4}\right) \\ 0 & 0 & \frac{1}{2} - \frac{Q_3}{Q_3+Q_4}\,\frac{1}{4}\left(P_3+\chi^{j}_{3,4}\right) & \frac{1}{2} + \frac{Q_4}{Q_3+Q_4}\,\frac{1}{4}\left(P_4+\chi^{j}_{3,4}\right) \end{pmatrix} \quad (5.1)$$
where j is an index indicating the layer (j = 1, 3). The even hidden layer, on the other hand, has only one junction, and therefore its matrix contains only one central block, with the pass-through channels 1 and 4 left unchanged:

$$W_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{2} + \frac{Q_2}{Q_2+Q_3}\,\frac{1}{4}\left(P_2+\chi^{2}_{2,3}\right) & \frac{1}{2} - \frac{Q_3}{Q_2+Q_3}\,\frac{1}{4}\left(P_3+\chi^{2}_{2,3}\right) & 0 \\ 0 & \frac{1}{2} - \frac{Q_2}{Q_2+Q_3}\,\frac{1}{4}\left(P_2+\chi^{2}_{2,3}\right) & \frac{1}{2} + \frac{Q_3}{Q_2+Q_3}\,\frac{1}{4}\left(P_3+\chi^{2}_{2,3}\right) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad (5.2)$$
where the χ terms take into account the light iterations up to time t − 1 and are defined as:

$$\chi^{j=(1,3)}_{1,2} = \frac{SAT}{1+\left(P^{j}_1+P^{j}_2\right)^2}, \qquad \chi^{j=(1,3)}_{3,4} = \frac{SAT}{1+\left(P^{j}_3+P^{j}_4\right)^2}, \qquad \chi^{2}_{2,3} = \frac{SAT}{1+\left(P_2+P_3\right)^2} \quad (5.3)$$
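As a consistency check on Eqs. 5.1–5.3, the following Python sketch assembles the odd- and even-layer transfer matrices and verifies that each column sums to one, i.e. that every node redistributes, rather than creates, light energy. The history values P, the SAT level and the already-normalized injection Q are illustrative assumptions; the identity entries for the pass-through channels follow from the description above, and the physical channel crossings and edge reflections of Fig. 5.2 are not modeled here.

```python
import numpy as np

EPS = 1e-12  # numerical guard: physically, the dark power x_D keeps Q_a + Q_b > 0

def node_block(Pa, Pb, Qa, Qb, SAT):
    """2x2 block of one X-Junction node, as it enters Eqs. 5.1 and 5.2."""
    chi = SAT / (1.0 + (Pa + Pb) ** 2)              # Eq. 5.3 for this channel pair
    ta = Qa / (Qa + Qb + EPS) * 0.25 * (Pa + chi)   # filling/emptying term, channel a
    tb = Qb / (Qa + Qb + EPS) * 0.25 * (Pb + chi)   # filling/emptying term, channel b
    return np.array([[0.5 + ta, 0.5 - tb],
                     [0.5 - ta, 0.5 + tb]])

def layer_odd(P, Q, SAT):
    """Odd hidden layers (Eq. 5.1): two parallel nodes on channels (1,2) and (3,4)."""
    W = np.zeros((4, 4))
    W[:2, :2] = node_block(P[0], P[1], Q[0], Q[1], SAT)
    W[2:, 2:] = node_block(P[2], P[3], Q[2], Q[3], SAT)
    return W

def layer_even(P, Q, SAT):
    """Even hidden layer (Eq. 5.2): one central node on (2,3); channels 1 and 4 pass through."""
    W = np.eye(4)
    W[1:3, 1:3] = node_block(P[1], P[2], Q[1], Q[2], SAT)
    return W

P = np.full(4, 0.25)                    # balanced history of a freshly written network
Q = np.array([0.99, 0.0, 0.0, 0.0])     # normalized single-bit injection (pattern 1-0-0-0)
for W in (layer_odd(P, Q, SAT=2.0), layer_even(P, Q, SAT=2.0)):
    print(W.sum(axis=0))                # each column sums to 1: nodes only redistribute light
```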
The SAT parameter manages and characterizes the saturation of the crystal and thus the rate of solitonic channel formation. From the neuro-computational point of view, it is responsible for the unbalancing dynamics of each node of the elementary units. Its contribution allows the saturation speed of the node's unbalancing dynamics to be controlled. To fully understand the switching dynamics, it is also interesting to study how quickly the channels change and which energy distribution is associated with the outputs when a precise input pattern is provided. The following tables summarize the unbalance dynamics of the output layers as a function of the maximum single-junction switching capability (SJR), varying between three ratios, in the case of only one channel on out of four (single bit). The numerical values represent the intensities of the information light propagating within the network. In the SJR = 0.7–0.3 configuration, reported in Table 5.1a, the strongest output channel is able to carry up to 70% of the received input while the weakest one gradually empties down to 30%; Table 5.1b reports the case SJR = 0.8–0.2, in which the maximum percentage of information that can be carried in the channel with higher index contrast rises to 80% (the maximum case observed experimentally); finally, the SJR = 0.9–0.1 configuration shows signal switching reaching up to 90% of the total propagating light in the channel characterized by higher refractive index contrast, dropping to 10% in the weaker channel. From each table it clearly emerges that for each input configuration there is a favored channel that shows higher intensity. As an example, it is possible to consider the input configuration 1–0–0–0 for the three types of SJR: if SJR = 0.7–0.3, the output reads 0.224–0.274–0.134–0.368, with an unbalance toward the fourth output; if SJR = 0.8–0.2, the favored output remains the fourth one but with a higher unbalance, 0.132–0.198–0.089–0.581; if SJR = 0.9–0.1, the unbalance increases until the output pattern is 0.054–0.109–0.040–0.797. Thus, the output channel contrast increases with the switching efficiency of each individual junction. To complete the picture, it is important to observe how information flows within the three hidden layers that make up the 4-bit network. Figure 5.3 reports the graph relative to the data shown in Table 5.1, for 1-bit-only input patterns. In each pattern configuration, information flows to the output opposite the input channel, emphasizing the solitonic character of the information. Increasing the SAT value results in a higher SJR and thus in faster dynamics: the system saturates sooner and thus enters the solitonic regime faster [6]. It can also be seen that the speed of learning is well represented by a sharper contrast between colors, which means greater imbalance at each node. Table 5.2 analyzes the case when two bits are activated in the network, thus corresponding to 2-digit 4-bit numbers, showing the output patterns for each input permutation, while Fig. 5.4 represents the flow of signals in all channels. The unbalance logic remains unchanged and increases, as in the single-bit case, with the SJR factor. Thus, the side channels gradually fill up while the center channels empty. At the end of learning, however, the higher-SJR network will have a greater imbalance in favor of the side channels than networks characterized by a lower SJR.
Table 5.1 Intensity values in the four output channels for the four possible single-bit input configurations as a function of SJR switching type. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022

(a) Single junction ratio: 0.7–0.3

Channel      1        2        3        4
Input        1        0        0        0
Output   0.224    0.274    0.134    0.368
Input        0        1        0        0
Output   0.251    0.524    0.056    0.169
Input        0        0        1        0
Output   0.169    0.056    0.505    0.271
Input        0        0        0        1
Output   0.368    0.134    0.271    0.227

(b) Single junction ratio: 0.8–0.2

Channel      1        2        3        4
Input        1        0        0        0
Output   0.132    0.198    0.089    0.581
Input        0        1        0        0
Output   0.133    0.670    0.003    0.193
Input        0        0        1        0
Output   0.193    0.003    0.670    0.133
Input        0        0        0        1
Output   0.581    0.089    0.195    0.135

(c) Single junction ratio: 0.9–0.1

Channel      1        2        3        4
Input        1        0        0        0
Output   0.054    0.109    0.040    0.797
Input        0        1        0        0
Output   0.063    0.837    0.000    0.100
Input        0        0        1        0
Output   0.100    0.000    0.837    0.063
Input        0        0        0        1
Output   0.796    0.041    0.108    0.055
Again, higher color contrast indicates sharper learning, with more light filling the channels characterized by higher refractive index. Finally, the 4-bit SNN can successfully analyze, distinguish and learn incoming configurations with up to three bits lit (3-digit 4-bit numbers). Thus, Table 5.3 shows the cases in which three channels are turned on, in all possible combinations.
Table 5.2 Intensity values in the four output channels for the possible two-bit input configurations as a function of SJR switching type. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022

(a) Single junction ratio: 0.7–0.3

Channel       1         2         3         4
Input         1         0         0         1
Output    0.5747    0.4122    0.4285    0.5846
Input         0         1         1         0
Output    0.4212    0.586     0.5717    0.4211
Input         1         0         1         0
Output    0.3902    0.336     0.6588    0.615
Input         0         1         0         1
Output    0.6121    0.6637    0.3377    0.3865
Input         1         1         0         0
Output    0.508     0.792     0.2054    0.4946
Input         0         0         1         1
Output    0.4946    0.2054    0.7704    0.5296

(b) Single junction ratio: 0.8–0.2

Channel       1         2         3         4
Input         1         0         0         1
Output    0.6641    0.3066    0.3345    0.6948
Input         0         1         1         0
Output    0.3236    0.6887    0.666     0.3218
Input         1         0         1         0
Output    0.3314    0.2209    0.7648    0.7029
Input         0         1         0         1
Output    0.6999    0.7731    0.2227    0.3043
Input         1         1         0         0
Output    0.3635    0.8366    0.1482    0.6518
Input         0         0         1         1
Output    0.6518    0.1482    0.8086    0.3915

(c) Single junction ratio: 0.9–0.1

Channel       1         2         3         4
Input         1         0         0         1
Output    0.7849    0.1639    0.1977    0.8536
Input         0         1         1         0
Output    0.1851    0.827     0.8063    0.1816
Input         1         0         1         0
Output    0.1933    0.1017    0.8785    0.8266
Input         0         1         0         1
Output    0.8264    0.8869    0.1029    0.1839
Input         1         1         0         0
Output    0.1944    0.9057    0.0787    0.8212
Input         0         0         1         1
Output    0.8213    0.0787    0.8804    0.2196
[Figure 5.3: intensity maps across layers IN, 1, 2, 3 = OUT for inputs 1–0–0–0 and 0–0–0–1 at SJR = 0.7–0.3, 0.8–0.2, 0.9–0.1; color scale 0.0–1.0 in steps of 0.2]
Fig. 5.3 Evolution of intensities along the three layers of the 4-bit SNN. The two single-bit input configurations 1–0–0–0 and 0–0–0–1 for the three values of the SJR are analyzed. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
Again, the SNN network is perfectly capable of recognizing, decisively, the three information bits (Fig. 5.5). The case with all 4 bits on is obviously not reported since this situation does not make as much sense from the learning point of view. Indeed, in this case all the nodes of the SNN split the signal 50% into the two branches. It follows that the network is completely balanced. On the other hand, there is nothing to “learn”. If all bits are off, no information exists. No signal enters and therefore the network is not changed. The SNN remains balanced.
86
5 Solitonic Neural Network Acting as an Episodic Memory SJR=0.7-0.3 in1=1
out1
in2=0
out2
in3=0
out3
in4=1
out4
in1=0
out1
in2=1
out2
in3=1
out3
in4=0 layer IN
SJR=0.8-0.2
SJR=0.9-0.1
0.0-0.2 0.2-0.4 0.4-0.6 0.6-0.8
layer 1
layer 2
0.8-1.0
out4 layer 3=OUT
Fig. 5.4 Evolution of intensities along the three layers of the 4-bit SNN. The two-bit input configurations 1–0–0–1 and 0–1–1–0 for the three values of the SJR are analyzed. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
5.1.4 Learning and Memorization Process

Compared with the neuromorphic devices developed to date, the great novelty introduced by SNNs is photorefractive plasticity, an essential property for achieving a device that can operate with functionality similar to that of neuroplasticity in the biological brain. Thanks to this property, SNNs are capable of autonomously recognizing the pattern of incoming signals, processing them according to the logic of the solitonic X-junction [9, 14] and storing them by self-modifying their structure [15]. What are the mechanisms by which an SNN learns and stores information? How does it distinguish one piece of information from another? Like a software network, the SNN is a system that initially does not know, and therefore does not distinguish between, incoming information. Moreover, exactly as happens in the brain of an infant, the connections are all still to be built and strengthened. Indeed, at the beginning, the SNN is written with all equal light intensities in order to achieve a perfectly balanced system. It follows that whatever information arrives will be split 50% at each junction of each layer: there are no privileged paths. However, if the same information pattern is entered a second time, the local refractive index along its path will increase, confining an extra percentage of light. On the third iteration the same phenomenon is repeated, and so on for the next n times, until each node is unbalanced 80% toward the channel with the greatest index contrast and 20% toward the weaker one. When this process ends, there will be a single path that identifies the specific learnt information (Fig. 5.6).
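The reiteration dynamics just described can be caricatured in a few lines of Python. The sketch below is deliberately schematic: the per-iteration reinforcement rate is an invented illustrative quantity, and only the qualitative behavior, a balanced 50/50 start that stiffens toward the 80/20 saturation of Table 4.1 (SAT = 2) under repeated presentations of the same pattern, reflects the text.

```python
SJR_MAX = 0.8  # saturation unbalance reached for SAT = 2 (Table 4.1)

def reiterate(n_iter, rate=0.1):
    """Toy reiteration dynamics of a single node: each presentation of the same
    pattern reinforces the stimulated branch until the 80/20 saturation is reached.
    'rate' is an invented illustrative reinforcement per iteration, not a measured value."""
    split = 0.5                             # freshly written, perfectly balanced junction
    for _ in range(n_iter):
        split = min(SJR_MAX, split + rate)  # index contrast grows, then saturates
    return round(split, 3), round(1.0 - split, 3)

print(reiterate(n_iter=2))    # (0.7, 0.3): short-term, still easily erasable unbalance
print(reiterate(n_iter=10))   # (0.8, 0.2): reiteration fixes the path at the 80/20 limit
```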
Table 5.3 Intensity values in the four output channels for the four possible three-bit input configurations as a function of SJR switching type. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022

(a) Single junction ratio: 0.7–0.3

Channel       1         2         3         4
Input         1         1         1         0
Output    0.6581    0.8561    0.7274    0.7583
Input         0         1         1         1
Output    0.7653    0.7273    0.8416    0.6658
Input         1         0         1         1
Output    0.7305    0.4804    0.9308    0.8583
Input         1         1         0         1
Output    0.8445    0.9342    0.4924    0.7288

(b) Single junction ratio: 0.8–0.2

Channel       1         2         3         4
Input         1         1         1         0
Output    0.4993    0.8724    0.807     0.8213
Input         0         1         1         1
Output    0.8315    0.8083    0.8508    0.5094
Input         1         0         1         1
Output    0.8067    0.3503    0.959     0.884
Input         1         1         0         1
Output    0.8591    0.9612    0.3675    0.8122

(c) Single junction ratio: 0.9–0.1

Channel       1         2         3         4
Input         1         1         1         0
Output    0.2586    0.9174    0.8994    0.8975
Input         0         1         1         1
Output    0.9078    0.9012    0.8952    0.2958
Input         1         0         1         1
Output    0.8973    0.1888    0.983     0.9309
Input         1         1         0         1
Output    0.8913    0.9827    0.2056    0.9205
Two phases are thus recognized in the learning of an SNN: the first step is the training phase, which is necessary for the exploration of the administered data and for setting the junction weights (nodal switches), so as to realize an encoding of the information patterns in the form of an index mapping; the second step is the testing phase, which is used to examine whether the trained information has been stored, checking that the network is actually able to proceed through episodic recognition.
[Figure 5.5: intensity maps across layers IN, 1, 2, 3 = OUT for inputs 1–1–1–0 and 0–1–1–1 at SJR = 0.7–0.3, 0.8–0.2, 0.9–0.1; color scale 0.0–1.0 in steps of 0.2]
Fig. 5.5 Evolution of intensities along the three layers of the 4-bit SNN. The two three-bit input configurations 1–1–1–0 and 0–1–1–1 for the three values of the SJR are analyzed. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022

Fig. 5.6 Perfectly balanced 4-bit SNN network. The network is ready for the first learning phase, in which it is trained to recognize a specific configuration. This process ends with structure asymmetry. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
In this phase, several light patterns are presented, but the network must respond only to the one on which the SNN has been trained. As will be shown shortly, the recognition mechanism is threshold-based. This means that if the input pattern is recognized, the network will respond with output signals above threshold value. In particular, all the learning configurations that a 4-bit SNN can fulfill were explored:
– Single input beam, corresponding to a single illuminated pixel in a 4 × 4 pixel array (1-digit recognition).
– Two input beams, corresponding to two illuminated pixels in a 4 × 4 pixel array (2-digit recognition).
– Three input beams, corresponding to three illuminated pixels in a 4 × 4 pixel array (3-digit recognition).

5.1.4.1 SNN Training and Testing on 1-digit 4-bit Numbers
To evaluate the network learning on a selected single digit of a 4-pixel matrix array, four different trainings were performed, one for each number to be stored, as shown in the first row of Fig. 5.7, which reports all the possibilities for 1-digit 4-bit numbers in terms of training-memorization and validation. To validate the learning, for each memorization process all four numbers are sent to the input channels and the network response is analyzed. Rows 2–5 of the matrix that makes up Fig. 5.7 report all the cases to be validated. Only the configurations arranged along the diagonal, highlighted in red, have above-threshold output. In fact, a stronger response is expected when there is a match between the input pattern and the stored one: in this case, the response is greater than a threshold value, experimentally derived to be about 0.7, which is set as a percentage of the input signal value, as shown by Eq. 5.4:

$$I_i^{output} \geq \theta\, I_i^{input} \quad (5.4)$$
Fig. 5.7 Training and testing steps are shown for 1-digit, 4-bit numbers. The first line shows the training steps. The next rows show the test experiments. Those for which the training and test numbers coincide are framed in red and are the only ones to show above-threshold behavior. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
Fig. 5.8 Amplitudes of output signals for different single-bit trained channels. For each training, there is only one channel whose amplitude exceeds a threshold value identified by the dashed horizontal line. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
In order to learn specific information, it is necessary for the output to exceed the threshold value. This case is verified only if the input bit sequence being tested coincides with the trained pattern. In all other cases, the outputs will be sub-threshold, that is, unrecognized. The reason is simple: if the input sequence is different from the trained one, the propagating light will encounter inappropriately set nodal weights, which will "break" it into several small contributions. To summarize this dynamic, Fig. 5.8 should be observed: the threshold (dashed line) is exceeded, and the recognition is completed, only in the case of pattern matching between the training phase and the testing phase. In addition, channels 1 and 4 show higher outputs than channels 2 and 3, and thus are less prone to attenuation: this is a peculiarity of this type of SNN network and is probably related to how the geometry of the device was realized. In fact, the two side channels (1 and 4) undergo no edge reflections, unlike channels 2 and 3, which are therefore attenuated. The latter consequently show a lower output level even when training and test channels match, as indicated by Fig. 5.8.
5.1.4.2 SNN Training and Testing on 2-digit 4-bit Numbers
The ability of the SNN to learn information is also analyzed for all 2-digit, 4-bit numbers, as shown in Fig. 5.9. Again, for each learning configuration, the only outputs above the threshold are those on which training occurred. For this 2-digit learning, it is important to note that high test output signals are encountered even when signals
are not injected into both trained channels but only into one, which still has to match one of the two trained channels. This phenomenon corresponds to partial recognition of the information. To complete the learning validation, however, two conditions must be met simultaneously, according to Eq. 5.5:

$$\begin{cases} I_i^{output} \geq \theta\, I_i^{input} \\ I_j^{output} \geq \theta\, I_j^{input} \end{cases} \qquad (5.5)$$
where the condition i ≠ j must hold. Therefore, the network is able to recognize both the complete number and individual digits of it. Outputs in which both digits are recognized have higher values than those in which only one digit is recognized. Again, all possible learning configurations on images with 2 bits switched on were tested, and the results are organized in Fig. 5.9 (matrix image), in which the first row reports all training configurations while each column represents a test case for the corresponding training. Since partial recognition, i.e., recognition of only one channel, cannot be considered valid because it satisfies only one of the conditions in Eq. 5.5, each training configuration has a unique output pattern with both channels above the threshold condition. These can be recognized in the matrix along the diagonal highlighted in red.
Fig. 5.9 Training and testing steps are shown for 2-digit, 4-bit numbers. The first line shows the training steps. The next rows show the test experiments. Those for which the training and test numbers coincide are framed in red and are the only ones to show above-threshold behavior. Compared with the single-digit case, partial recognition is also possible, in which one of the two channels is recognized if it matches one of the trained ones. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
Fig. 5.10 Amplitudes of output signals for different two-bit trained channels. For each training, there is only one pair of channels whose amplitudes exceed a threshold value identified by the dashed horizontal line. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
To examine the learning behavior of the channels more closely, a summary image is again proposed, depicting the behavior of each channel for each training pattern. Looking closely at Fig. 5.10, for each configuration there is only one pair in which both channels remain above threshold. Similarly, there is only one pair in which both channels are sub-threshold. In all other cases, partial recognition occurs (Fig. 5.10).
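The same logic extends to multi-digit patterns. The sketch below generalizes the pairwise test of Eq. 5.5 to an arbitrary set of trained channels and distinguishes full from partial recognition; the function and the example values are hypothetical illustrations of the described behavior, not code from [6].

```python
# Sketch of full/partial recognition over a set of trained channels,
# generalizing Eq. 5.5. All names and amplitudes are hypothetical.
THETA = 0.7  # threshold fraction of the input signal, from the text

def classify(outputs: dict, i_input: float, trained: set) -> str:
    """Classify a test as full, partial, or no recognition."""
    above = {ch for ch, out in outputs.items() if out >= THETA * i_input}
    matched = above & trained
    if matched == trained:
        return "full recognition"
    if matched:
        return "partial recognition"
    return "not recognized"

# Hypothetical test: network trained on channels {1, 3}, signal sent on 1 and 2.
print(classify({1: 0.8, 2: 0.3, 3: 0.2, 4: 0.1}, 1.0, trained={1, 3}))
# -> "partial recognition": only trained channel 1 is above threshold.
```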
5.1.4.3 SNN Training and Testing on 3-digit 4-bit Numbers
The SNN is also capable of performing a recognition task with three illuminated pixels (3-digit over 4-bit numbers). The training procedure follows the pattern of the previous two cases. In particular, Fig. 5.11 shows all the possible validation tests for each training performed. After training, if the signal is inserted into two trained channels and one untrained channel, or into two untrained channels and one trained channel, the recognition is partial. Thus, there are two possible types of partial recognition for this configuration. To achieve full recognition (validation and passing the test), three conditions must be met simultaneously, as indicated by Eq. 5.6:

$$\begin{cases} I_i^{output} \geq \theta\, I_i^{input} \\ I_j^{output} \geq \theta\, I_j^{input} \\ I_k^{output} \geq \theta\, I_k^{input} \end{cases} \qquad (5.6)$$
Fig. 5.11 Training and testing steps are shown for 3-digit, 4-bit numbers. The first line shows the training steps. The next rows show the test experiments. Those for which the training and test numbers coincide are framed in red and are the only ones to show above-threshold behavior. Compared with the single-digit case, partial recognition is also possible, in which one or two of the three channels are recognized if they match the trained ones. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
where i ≠ j ≠ k. The results are again organized in a matrix image in which the first row shows all the training configurations, below which, by columns, are all the tests performed. In this case, in order to pass the test successfully and have the network "assimilate" the information, the three conditions must be met simultaneously. Intermediate conditions, i.e., recognition of two channels or of one channel, are considered partial learning. Also in the case of images with 3 bits switched on, a graph was developed to summarize all situations. For each training configuration, there is a unique input combination for which all three channels are above threshold. In this case, however, partial recognition can take two forms: the network can recognize one channel out of three, if it coincides with one of the trained triad, or two channels out of three, if the pair was also present during training (Fig. 5.12).

Mathematical models play a very important role because they describe the phenomenon of learning through photorefractive plasticity. However, if photorefractive crystals are considered, some characteristics might change based on their physical properties. In some materials, the photorefractive response is very fast (BSO, InP), resulting in very fast learning dynamics; in others, it is very slow (LiNbO3); and in still others it falls somewhere in between these two behaviors (SBN). What changes is the stability of the states of the photoexcited charges [16], which relax differently when stimulated.
Fig. 5.12 Amplitudes of the output signals for the different three-bit trained channels. For each training, there is only one triad of channels whose amplitudes exceed a threshold value identified by the dashed horizontal line. Reprinted under CC-BY-4.0 license from [6], © The Authors, 2022
In lithium niobate bulk crystals (a material used to implement these SNNs), for example, dielectric relaxation is so slow that the written waveguides are considered virtually permanent [17]. Therefore, by intervening with appropriate dopants, it is possible to vary the response of the host medium. From this point of view, the importance of SNNs studied in thin layers can be understood even better: there, the photorefractive responses are two orders of magnitude faster than in the bulk case and, under appropriate doping [18], can even reach the nanosecond scale. It is also important to have a means to shake photoexcited charges out of their stable states and allow them to recombine, restoring the original pre-writing state, as seen throughout Chap. 4 when the concept of erasing was introduced.

In anticipation of future developments, a landscape of materials with different dielectric relaxation behaviors could facilitate the integration of different types of SNNs, to achieve more complex behaviors closer to those of biological neural tissue. Fast responses can reproduce short-term memories (RAM-type memories, in computer language), while materials with medium-slow or slow responses favor the construction of long-term memory tools (semi-permanent or permanent ROM-type memories) [6]. SNNs are already interfaceable with each other, and the possibility of
constructing them in different materials could be the way to build devices capable of switching from short-term to long-term memory when external stimuli require it.
5.2 Power Consumption Analysis

The presented SNN structure is based on the X-Junction neuronal unit, characterized by two inputs, two outputs and a single node (the solitonic crossing area), which well schematizes the functional districts of the biological neuron presented in Chap. 1. The numerical models introduced in the previous section show that the switching speed of the single junction depends on the efficiency of the nonlinearity through the handling of the SAT parameter. However, nodal switching also depends on the energy requirement of the single junction, and this varies with the number of channels it consists of. Thus, it is important to ask how the neuromorphic paradigm based on spatial solitons changes if the geometry of its fundamental unit is changed. This section considers how the energy required to perform a complete switch of the neural structure changes as a function of the number of channels of which the fundamental unit is composed.

The analysis is conducted through an electronic model whose starting point is the reproduction of the optical behavior of the X-Junction. In order to model the typical efficiency of photonic switching from the electronic point of view, it is necessary to consider a control system parallel to that of signal propagation. Therefore, the electronic unit consists of two circuits: one dedicated to handling information signals, and another dedicated to handling control signals, which affect the switching speed of the gates and adjust the weights of the neural junction. The control signals are launched from the outputs and propagate in reverse. Thus, a separate current generator is used, according to which the entire network changes conformation to accommodate the desired propagative behaviors. The schematic structure of a 2 × 2 gate able to reproduce the characteristic switch of the X-Junction neuron is shown in Fig. 5.13a. The circuit in which the information circulates is drawn in black, while the control current circulates in the red circuit.

The aim of this section is not to analyze this electrical structure in detail (interested readers are referred to the related study [19]) but to delve into the switching behavior of the solitonic neuron. Therefore, only the information necessary to understand its operation, read the behavior of multichannel optical neurons, and study their energy requirements is provided. The main result is that the energy required to switch the unit channels depends on both the number of channels and the number of nodes.

The fundamental mechanism of operation is described below. The two signals, represented by the currents Isin1 and Isin2, travel in feedforward mode until they reach the crossing point and here, depending on the control currents, are switched to the outputs, becoming Isout1 and Isout2. The path of Icin1 and Icin2 (the input control currents) is opposite. Taking Fig. 5.13b as a reference, they are translated into digital conductances, then normalized and compared with each other. When this process is finished, they are carried into the
Fig. 5.13 Circuit diagram electrically modeling an X-Junction neuron (a). The forward circuit is in black while the feedback circuit is in red. Practical implementation of the circuit (b). Solid lines represent analog signals while dotted lines represent digital signals. Reprinted under CC-BY-4.0 license from [19], © The Authors, 2022
signal propagation circuit and, by multiplying the two input currents, generate two output rates that weight the switching of one signal and the other. This behavior is fully explained by the matrix equation (5.16) of the X-Junction neuron introduced earlier. The conductances into which the control signals are translated have nonlinear dynamics [19]. In particular, their dynamics are characterized by a saturating expression of the type

$$G_i(I_{i,j}) = \frac{G_\infty}{1 + \dfrac{G_\infty}{\gamma\,(I_{i,j} + I_0)}} \qquad (5.7)$$
where G_∞ represents the limit value of the conductance and γ the sensitivity of the conductance to changes in the feedback control currents. These parameters do not depend on the structure geometry. I_0 corresponds to the background light of the optical case and sets the symmetrical behavior of the switching gate, which is then readjusted by the control currents, favoring one or the other output. In particular, it has been found
that the typical switch of an X-Junction neuron can be characterized by nonlinear quadratic conductances of the type [19]:

$$G_i(I_{FB_i}) = \frac{G_\infty}{1 + \left(\dfrac{G_\infty}{\gamma\, I_{FB_i}}\right)^{2}} \qquad (5.8)$$
The parabolic rise at low intensities induces stronger switching at lower current values. Thus, the switching efficiency of the signal currents depends on the value of the saturation conductances, just as in photonic gates [14].

Starting from the realization of a single-node 2 × 2 gate operating as the X-Junction neuron, evolutions of the structure were designed in order to analyze different types of configurations. In one case, the number of channels was varied while keeping the number of nodes fixed; in another, the number of nodes was varied; and finally, configurations with different numbers of both channels and nodes were also studied. Referring to Fig. 5.14, two different evolutions of the 2 × 2 X-Junction unit, shown in its basic form in Fig. 5.14a, can be observed. By increasing both the number of channels and the number of nodes, the topology changes and we get the unit in Fig. 5.14b; by retaining the single-node topology and increasing only the number of channels, we get the structure in Fig. 5.14c. Exactly as in the photonic domain, the two-channel electronic X-junction is able to redirect up to 80% of the information to the strongest output. Considering the 3 × 3 unit with 3 nodes, a strong gradient brings this current to almost 80% of the total input current for a tripling of the corresponding control current.
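To see how the two conductance laws differ, the short sketch below evaluates Eqs. 5.7 and 5.8 over a range of control currents; the values chosen for G_∞, γ and I_0 are illustrative assumptions, not parameters from [19].

```python
import numpy as np

# Sketch comparing the saturating (Eq. 5.7) and quadratic (Eq. 5.8)
# conductance laws. G_INF, GAMMA and I0 are illustrative assumptions,
# not parameter values taken from [19].
G_INF = 1.0   # limit conductance G_infinity
GAMMA = 0.5   # sensitivity of the conductance to the control current
I0 = 0.1      # background-light equivalent offset

def g_saturating(i):
    """Eq. 5.7: conductance saturating toward G_infinity."""
    return G_INF / (1.0 + G_INF / (GAMMA * (i + I0)))

def g_quadratic(i_fb):
    """Eq. 5.8: quadratic law, with a steeper transition near G_inf/gamma."""
    return G_INF / (1.0 + (G_INF / (GAMMA * i_fb)) ** 2)

i = np.linspace(0.1, 10.0, 5)
print("I      :", np.round(i, 2))
print("Eq. 5.7:", np.round(g_saturating(i), 3))
print("Eq. 5.8:", np.round(g_quadratic(i), 3))
# Around I ~ G_inf/gamma the quadratic law crosses from low to high
# conductance much more sharply, i.e., it switches at lower currents.
```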
Fig. 5.14 Evolution of the 2 × 2 X-Junction structure. Shown in (a) is the 2 × 2 single-node gate, in (b) the 3 × 3 3-node gate, and in (c) the 3 × 3 single-node gate
The connection between the signal output channel and the increase in control-current amplitude in the case of a three-channel, three-node unit is shown in Fig. 5.14b. The same value of control current is found for all 3 ports in the network. An important phenomenon should be emphasized: as the number of internal junctions increases, the control current required to reach a given fixed value at the normalized output increases.

Let us then analyze the performance of the current switch in the three proposed cases with the aid of Fig. 5.15, in which the current imbalance in favor of output 1 is shown. Figure 5.15a shows the unbalance of a 2 × 2 single-node gate. It is immediately apparent that the switching rate is significantly higher than in the other two cases. Figure 5.15b shows the current trend for a 3 × 3, 3-node gate: here the curves are already much tighter. Finally, Fig. 5.15c shows a single-node 3 × 3 gate, which represents an intermediate situation between the two already seen. The required increase in control current is therefore a function of the number of internal nodes, but it also depends on how these nodes are spatially distributed and on their proximity to the different channels, which determines when channel saturation is reached. The relationship between the increase in control current and the dimensionality of the neural unit is crucial to consider when developing units with multiple gates or complex neural networks.

Thus, the theoretical evaluation shows that the neural unit requiring the least energy to operate the switch consists of a single node and the same number of input and output channels. If the number of channels remains the same and the number of nodes is increased, more energy must be used: moving from a single 2 × 2 gate to a 3 × 3 gate implies an increase of about 2–4 times in the required feedback current.

What happens, then, if a single node is fixed and the number of channels is increased? To answer this question, it is possible to study a limiting case, as shown in Fig. 5.16a, with 10 input channels and 10 output channels. In this complex multichannel neuron, the pattern of current reinforcement strictly depends on the topology of the structure and on the channel under consideration. Let us take three reinforcement configurations (ch-1, ch-5, ch-10) as examples, chosen because they sit at the top, middle, and bottom of the structural order of the input channels, respectively, and represent the general situation well. All three cases have very high energy requirements: channel 1 requires a 56-fold increase, channel 5 an eightfold increase, and channel 10 a 12-fold increase. These differences are due to the different complexity of the channel geometries along their trajectories. A predominant role is played by the number of nodes present and by their depth within the channel extension [19]: for channel indices close to the reinforced channel, the decreasing trend in the control current required for switching the unit is explained by the increasing number of nodes involved in the trajectories followed by the control current. Each time a node is encountered, the feedback (control) current undergoes a split, decreasing its amplitude, which at the next switch may not be sufficient to generate an unbalance, thus requiring a higher initial energy contribution. Therefore, an increase in the number of nodes requires a significant
Fig. 5.15 The switching of normalized output currents driven by an increase in terminal 1 control current for the 2 × 2 single-node gate (a), for the 3 × 3 3-node gate (b), and for the 3 × 3 single-node gate (c) is reported. Adapted with permission from [19] © Springer, Journal of Computational Electronics
increase in the intensity of the control currents and, consequently, in the total energy involved. To overcome this important problem, topological conformations with the smallest possible number of nodes must be considered. The results obtained in [19] show that single-node structures, i.e., with multiple channels but not multiple nodes, drastically reduce the energy requirements for switching and, thus, for neural unit learning. Therefore, SNNs based on two-input, two-output X-Junction neurons are found to be the most efficient in terms of learning capacity and speed.
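A toy model helps to see why node count dominates the energy budget: if the feedback current loses a fixed fraction at every node it crosses, the launch current must grow geometrically with the number of nodes. The sketch below is our own simplification, not a computation from [19]; the 50% split fraction is an arbitrary assumption.

```python
# Toy model (our assumption, not a computation from [19]) of why more
# internal nodes demand a larger control current: at every node the
# feedback current splits, so only a fraction survives to the next switch.

def required_control_current(i_switch: float, n_nodes: int,
                             kept_per_node: float = 0.5) -> float:
    """Launch current needed so that at least i_switch still remains
    after splitting at n_nodes junctions (kept_per_node kept per split)."""
    return i_switch / (kept_per_node ** n_nodes)

for n in (1, 3, 6):
    factor = required_control_current(1.0, n)
    print(f"{n} node(s): launch {factor:.0f}x the single-switch current")
# 1 node -> 2x, 3 nodes -> 8x, 6 nodes -> 64x: growth of this kind is why
# single-node, multichannel units minimize the switching energy.
```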
Fig. 5.16 In (a), schematic of the structure of a 10 × 10 neuron. Switching of normalized output currents driven by increasing the control current of terminal 1 (b), terminal 5 (c), or terminal 10 (d). The simulation refers to the 10 × 10 network. Adapted with permission from [19] © Springer, Journal of Computational Electronics
Networks whose neurons have more than two channels but maintain a single point of intersection are currently being studied.
5.3 Limitations of Actual SNNs and Future Perspectives

SNNs introduce a genuine novelty within the developmental landscape of neuromorphic systems in general, and of photonic ones in particular. Their most striking feature is the ability to self-determine their structure and learn from external stimuli by changing their geometry. In this sense, they come remarkably close to biological learning strategies, much more so than the hardware systems developed so far. The dynamics of movement of the photogenerated charges of solitonic connections also follow mathematical laws quite similar to those governing the mobility of neurotransmitters [20]. The latter underlie the formation and life of biological synapses, while the creation of solitonic channels depends on the distribution of photoionized charges. SNNs are therefore among the first technologies capable of reproducing the general dynamics of biological neural tissue.

However, there are some structural limitations that currently prevent the realization of dense solitonic neural networks. It is therefore important to describe in detail
the theoretical and practical difficulties encountered, in order to understand along which directions to direct future research.

First, the nonlinear bulk crystals used are on the order of half a centimeter in size. This facilitates the application of electrodes perpendicular to the optical axis for the realization of the bias electric field, but it greatly limits the possibility of constructing small, dense neural devices. It must also be said that, in general, the response speed of bulk crystals (albeit material-dependent) is quite slow. It is not possible to give a precise value because it depends on the type of material considered; however, the slowness of the material must be contextualized in relation to other technologies for the same type of application. Finally, the neural technology in bulk crystals reported in this book has a practical inconvenience: experimentally, it is very difficult to ensure that the two beams of light (which will form the X-Junction neuron) lie on the same plane. Being three-dimensional structures, their intersection could coincide with the intersection of different planes and, at that point, the physics related to refractive index variation and switching would also change.

A solution to these problems is provided by the use of LNOI films. This manuscript showed that it is possible to make smart solitonic structures even in lithium niobate films only 8 microns thick. The small size already constitutes a significant advantage. Moreover, as already pointed out, these films achieve a response a couple of orders of magnitude faster than bulk. LNOI films also solve practical experimental difficulties: since the solitonic guides are planar, there is no risk of building structures in different planes. This allows greater control over the dynamics of splitting and thus over the training of the neurons first and of the network later. However, one unsolved problem remains. The X-Junction neuron must have a particularly large soma to function properly: the overlap region of its component channels must be large enough to ensure energy exchange. This condition requires a sufficiently small entrance angle, relative to the normal to the crystal. Consequently, the overlap region turns out to be fixed at approximately 5 mm (small variations in the entrance angle result in slightly different values). At present, this limit cannot be exceeded.

This manuscript went on to introduce the ability of SNNs to properly perform the functionality of episodic storage and recognition. The recording of a mapping of information bits occurs in both supervised and unsupervised ways. But although the dynamics of solitonic channel formation is a phenomenon ascribed to the nonlinearity of the refractive index, the process of pattern mapping is instead linear. For this reason, at present it is not possible to achieve a learning system of the semantic type. Realizing it would require a process of feature extraction which, once reprocessed, would lead to semantic knowledge. This is an area of solitonic neuromorphic research currently being studied at the Smart & Neuro Photonics Lab at Sapienza University of Rome. Without going into details, the processing capability is achieved using the photorefractive optics of solitonic networks and Fourier optics for feature extraction. At the same time, another goal is being pursued.
After investigating the ability of spatial solitons to build hardware capable of reproducing the dynamism of neural tissue, the idea of moving into the very small has gradually taken hold. Therefore,
the possibility of reproducing the characteristics of SNNs through the use of surface plasmon polaritons (SPPs) [24] is being investigated. This brings a twofold advantage. First, the circuit dimensions are greatly reduced by moving into the nanometer world, ensuring high integrability of the optical device. In addition, the dual nature of SPPs, electrical and photonic, allows interfacing with both electrical and photonic circuits. The laboratory has therefore developed a novel method to interconnect metallic waveguides supporting SPPs using photorefractive soliton channels [22]. Building on these results, a system consisting of two metal strips that serve as waveguides for SPP signals, connected by self-written soliton channels, was studied. The light, propagating in the form of an SPP, begins to diffract if the metal guide strip is abruptly interrupted. Taking advantage of photorefractive substrates, however, the diffraction can be confined into a solitonic channel, whose propagation characteristics can be appropriately controlled and modulated [23] (Fig. 5.17).

Solitonic neural networks have introduced a new way of thinking about neuromorphic photonics. They enable the realization of autonomous intelligent devices that can change in accordance with the stimuli received. Their way of learning approaches the biology of neural tissue: learning, memorizing and recognizing are processes that occur contextually with structural changes. This manuscript recounted the research experience carried out during my Ph.D., which ended with the realization of a solitonic system that, acting as an intelligent tissue, is capable of episodic learning. However, research never stands still. These networks are evolving, and soon it will be possible to implement procedural and semantic functionality. Perhaps this will be the subject of a subsequent manuscript. Thanking you for your attention, I leave you my contact details in case you would like to open a discussion on what you have read. We can move forward together in the evolution of SNNs.
Fig. 5.17 Evolution of the formation process of a screening soliton from the SPP diffraction. Reprinted with permission from [23] © The Optical Society
References

1. Leamnson, R. 2000. Learning as biological brain change. Change: The Magazine of Higher Learning 32 (6): 34–40. https://doi.org/10.1080/00091380009601765.
2. Zhang, W.-H., A. Chen, M.J. Rasch, and S. Wu. 2016. Decentralized multisensory information integration in neural systems. Journal of Neuroscience 36 (2): 532–547. https://doi.org/10.1523/JNEUROSCI.0578-15.2016.
3. Seung, H. 2000. Half a century of Hebb. Nature Neuroscience 3 (Suppl 11): 1166. https://doi.org/10.1038/81430.
4. Cassilhas, R.C., S. Tufik, and M.T. de Mello. 2016. Physical exercise, neuroplasticity, spatial learning and memory. Cellular and Molecular Life Sciences 73: 975–983. https://doi.org/10.1007/s00018-015-2102-0.
5. Ferreira de Lima, T., B.J. Shastri, A.N. Tait, M.A. Nahmias, and P.R. Prucnal. 2017. Progress in neuromorphic photonics. Nanophotonics 6 (3): 577–599. https://doi.org/10.1515/nanoph-2016-0139.
6. Bile, A., H. Tari, and E. Fazio. 2022. Episodic memory and information recognition using solitonic neural networks based on photorefractive plasticity. Applied Sciences 12: 5585. https://doi.org/10.3390/app12115585.
7. Barbour, B., and M. Häusser. 1997. Intersynaptic diffusion of neurotransmitter. Trends in Neurosciences 20 (9): 377–384. https://doi.org/10.1016/s0166-2236(96)20050-5.
8. Kuriscak, E., P. Marsalek, J. Stroffek, and P.G. Toth. 2015. Biological context of Hebb learning in artificial neural networks, a review. Neurocomputing 152: 27–35.
9. Alonzo, M., D. Moscatelli, L. Bastiani, et al. 2018. All-optical reinforcement learning in solitonic X-junctions. Scientific Reports 8: 5716. https://doi.org/10.1038/s41598-018-24084-w.
10. Shuo, W., and M. Ming. 2022. Exploring online intelligent teaching method with machine learning and SVM algorithm. Neural Computing and Applications 34: 2583–2596. https://doi.org/10.1007/s00521-021-05846-6.
11. Bile, A., H. Tari, A. Grinde, F. Frasca, A.M. Siani, and E. Fazio. 2022. Novel model based on artificial neural networks to predict short-term temperature evolution in museum environment. Sensors 22 (2): 615.
12. Rubin, D.C. 2006. The basic-systems model of episodic memory. Perspectives on Psychological Science 1 (4): 277–311. https://doi.org/10.1111/j.1745-6916.2006.00017.x.
13. Feld, G.B., and S. Diekelmann. 2015. Sleep smart: optimizing sleep for declarative learning and memory. Frontiers in Psychology 6: 622. https://doi.org/10.3389/fpsyg.2015.00622.
14. Bile, A., F. Moratti, H. Tari, et al. 2021. Supervised and unsupervised learning using a fully-plastic all-optical unit of artificial intelligence based on solitonic waveguides. Neural Computing and Applications 33: 17071–17079. https://doi.org/10.1007/s00521-021-06299-7.
15. Bile, A., M. Chauvet, H. Tari, and E. Fazio. 2022. Supervised learning of soliton X-junctions in lithium niobate films on insulator. Optics Letters 47: 5893–5896.
16. Skupin, S., O. Bang, D. Edmundson, and W. Krolikowski. 2006. Stability of two-dimensional spatial solitons in nonlocal nonlinear media. Physical Review E 73: 066603.
17. Fazio, E., M. Alonzo, F. Devaux, A. Toncelli, N. Argiolas, M. Bazzan, C. Sada, and M. Chauvet. 2010. Luminescence-induced photorefractive spatial solitons. Applied Physics Letters 96 (9): 091107. https://doi.org/10.1063/1.3313950.
18. Derrien, F., et al. 2000. A thermal (2D+1) spatial optical soliton in a dye doped liquid crystal. Journal of Optics A: Pure and Applied Optics 2: 332.
19. Ianero, B., A. Bile, M. Alonzo, et al. 2021. Stigmergic electronic gates and networks. Journal of Computational Electronics 20: 2614–2621. https://doi.org/10.1007/s10825-021-01799-0.
20. Bile, A., H. Tari, R. Pepino, A. Nabizada, and E. Fazio. Photorefraction well simulates the behaviour of biological neural systems, submitted.
21. Boltasseva, A., V.S. Volkov, R.B. Nielsen, E. Moreno, S.G. Rodrigo, and S.I. Bozhevolnyi. 2008. Triangular metal wedges for subwavelength plasmon-polariton guiding at telecom wavelengths. Optics Express 16 (8): 5252–5260.
22. Camponeschi, F., A. Bile, H. Tari, and E. Fazio. 2021. Plasmonic-solitonic coupling structure. International Journal of Scientific Engineering and Applied Science 7 (3): 162–167.
23. Tari, H., A. Bile, A. Nabizada, and E. Fazio. 2023. Ultra-broadband interconnection between two SPP nanostrips by a photorefractive soliton waveguide. Optics Express 31: 26092–26103.
24. Bile, A., R. Pepino, and E. Fazio. 2021. Study of magnetic switch for surface plasmon-polariton circuits. AIP Advances 11: 045222. https://doi.org/10.1063/5.0040674.