Programming Neural Networks in Java

Programming Neural Networks in Java will show the intermediate to advanced Java programmer how to create neural networks. This book attempts to teach neural network programming through two mechanisms. First the reader is shown how to create a reusable neural network package that could be used in any Java program. Second, this reusable neural network package is applied to several real world problems that are commonly faced by IS programmers. This book covers such topics as Kohonen neural networks, multilayer neural networks, training, back propagation, and many other topics.

Chapter 1: Introduction to Neural Networks (Wednesday, November 16, 2005)
Computers can perform many operations considerably faster than a human being. Yet there are many tasks where the computer falls considerably short of its human counterpart. There are numerous examples of this. Given two pictures a preschool child could easily tell the difference between a cat and a dog. Yet this same simple problem would confound today's computers.

Chapter 2: Understanding Neural Networks (Wednesday, November 16, 2005)
The neural network has long been the mainstay of Artificial Intelligence (AI) programming. As programmers we can create programs that do fairly amazing things. Programs can automate repetitive tasks such as balancing checkbooks or calculating the value of an investment portfolio. While a program could easily maintain a large collection of images, it could not tell us what any of those images are of. Programs are inherently unintelligent and uncreative. Ordinary computer programs are only able to perform repetitive tasks.

Chapter 3: Using Multilayer Neural Networks (Wednesday, November 16, 2005)
In this chapter you will see how to use the feed-forward multilayer neural network. This neural network architecture has become the mainstay of modern neural network programming. In this chapter you will be shown two ways that you can implement such a neural network.

Chapter 4: How a Machine Learns (Wednesday, November 16, 2005)
In the preceding chapters we have seen that a neural network can be taught to recognize patterns by adjusting the weights of the neuron connections. Using the provided neural network class we were able to teach a neural network to learn the XOR problem. We only touched briefly on how the neural network was able to learn the XOR problem. In this chapter we will begin to see how a neural network learns.

Chapter 5: Understanding Back Propagation (Wednesday, November 16, 2005)
In this chapter we shall examine one of the most common neural network architectures: the feed forward back propagation neural network. This neural network architecture is very popular because it can be applied to many different tasks. To understand this neural network architecture we must examine how it is trained and how it processes the pattern. The name "feed forward back propagation neural network" gives some clue as to both how this network is trained and how it processes the pattern.

Chapter 6: Understanding the Kohonen Neural Network (Wednesday, November 16, 2005)
In the previous chapter you learned about the feed forward back propagation neural network. While feed forward neural networks are very common, they are not the only architecture for neural networks. In this chapter we will examine another very common architecture for neural networks.

Chapter 7: OCR with the Kohonen Neural Network (Wednesday, November 16, 2005)
In the previous chapter you learned how to construct a Kohonen neural network. You learned that a Kohonen neural network can be used to classify samples into several groups. In this chapter we will closely examine a specific application of the Kohonen neural network. The Kohonen neural network will be applied to Optical Character Recognition (OCR).

Chapter 8: Understanding Genetic Algorithms (Wednesday, November 16, 2005)
In the previous chapter you saw a practical application of the Kohonen neural network. Up to this point the book has focused primarily on neural networks. In this chapter and Chapter 9 we will focus on two artificial intelligence technologies not directly related to neural networks. We will begin with the genetic algorithm. In the next chapter you will learn about simulated annealing. Finally Chapter 10 will apply both of these concepts to neural networks. Please note that at this time JOONE, which was covered in previous chapters, has no support for genetic algorithms or simulated annealing, so we will build this support ourselves.

Chapter 9: Understanding Simulated Annealing (Wednesday, November 16, 2005)
In this chapter we will examine another technique that allows you to train neural networks. In Chapter 8 you were introduced to using genetic algorithms to train a neural network. This chapter will show you how you can use another popular algorithm, named simulated annealing. Simulated annealing has become a popular method of neural network training. As you will see in this chapter, it can be applied to other uses as well.

Chapter 10: Eluding Local Minima (Wednesday, November 16, 2005)
In Chapter 5 backpropagation was introduced. Backpropagation is a very effective means of training a neural network. However, there are some inherent flaws in the backpropagation training algorithm. One of the most fundamental flaws is the tendency for the backpropagation training algorithm to fall into a "local minimum". A local minimum is a false optimal weight matrix that prevents the backpropagation training algorithm from seeing the true solution.

Chapter 11: Pruning Neural Networks (Wednesday, November 16, 2005)
In Chapter 10 we saw that you could use simulated annealing and genetic algorithms to better train a neural network. These two techniques employ various algorithms to better fit the weights of the neural network to the problem that the neural network is to be applied to. These techniques do nothing to adjust the structure of the neural network.

Chapter 12: Fuzzy Logic (Wednesday, November 16, 2005)
In this chapter we will examine fuzzy logic. Fuzzy logic is a branch of artificial intelligence that is not directly related to the neural networks that we have been examining so far. Fuzzy logic is often used to process data before it is fed to a neural network, or to process the outputs from the neural network. In this chapter we will examine cases of how this can be done. We will also look at an example program that uses fuzzy logic to filter incoming SPAM emails.

Appendix A. JOONE Reference (Wednesday, November 16, 2005)
Information about JOONE.

Appendix B. Mathematical Background (Friday, July 22, 2005)
Discusses some of the mathematics used in this book.

Appendix C. Compiling Examples under Windows (Friday, July 22, 2005)
How to install JOONE and the examples on Windows.

Appendix D. Compiling Examples under Linux/UNIX (Wednesday, November 16, 2005)
How to install JOONE and the examples on UNIX/Linux.
Chapter 1: Introduction to Neural Networks
Introduction
Computers can perform many operations considerably faster than a human being. Yet there are many tasks where the computer falls considerably short of its human counterpart. There are numerous examples of this. Given two pictures a preschool child could easily tell the difference between a cat and a dog. Yet this same simple problem would confound today's computers.

This book shows the reader how to construct neural networks with the Java programming language. As with any technology, it is just as important to learn when to use neural networks as it is to learn how to use them. This chapter begins to answer that question: what programming requirements are conducive to a neural network?

The structure of neural networks will be briefly introduced in this chapter. This discussion begins with an overview of neural network architecture and how a typical neural network is constructed. Next you will be shown how a neural network is trained. Ultimately the trained neural network's training must be validated.

This chapter also discusses the history of neural networks. It is important to know where neural networks came from, as well as where they are ultimately headed. The architectures of early neural networks are examined. Next you will be shown what problems these early networks faced and how current neural networks address these issues. This chapter gives a broad overview of both the biological and historic context of neural networks. We begin by exploring how real biological neurons store and process information. You will be shown the difference between biological and artificial neurons.
Understanding Neural Networks
Artificial Intelligence (AI) is the field of Computer Science that attempts to give computers humanlike abilities. One of the primary means by which computers are endowed with humanlike abilities is through the use of a neural network. The human brain is the ultimate example of a neural network. The human brain consists of a network of over a billion interconnected neurons. Neurons are individual cells that can process small amounts of information and then activate other neurons to continue the process.

The term neural network, as it is normally used, is actually a misnomer. Computers attempt to simulate an artificial neural network. However most publications use the term "neural network" rather than "artificial neural network." This book follows this pattern. Unless the term "neural network" is explicitly prefixed with "biological" or "artificial", you can assume that an artificial neural network is meant. To explore this distinction you will first be shown the structure of a biological neural network.
How is a Biological Neural Network Constructed

To construct a computer capable of "human like thought" researchers used the only working model they had available: the human brain. To construct an artificial neural network the brain is not considered as a whole. Taking the human brain as a whole would be far too complex. Rather the individual cells that make up the human brain are studied. At the most basic level the human brain is composed primarily of neuron cells.

A neuron cell, as seen in Figure 1.1, is the basic building block of the human brain. A neuron accepts signals from its dendrites. When a neuron accepts a signal, that neuron may fire. When a neuron fires, a signal is transmitted over the neuron's axon. Ultimately the signal will leave the neuron as it travels to the axon terminals. The signal is then transmitted to other neurons or nerves.
Figure 1.1: A Neuron Cell (Drawing courtesy of Carrie Spear)

This signal transmitted by the neuron is an analog signal. Most modern computers are digital machines, and thus require a digital signal. A digital computer processes information as either on or off. This is the basis of the binary digits zero and one. The presence of an electric signal represents a value of one, whereas the absence of an electrical signal represents a value of zero. Figure 1.2 shows a digital signal.
Figure 1.2: A Digital Signal

Some of the early computers were analog rather than digital. An analog computer uses a much greater range of values than zero or one. This greater range is achieved by increasing or decreasing the voltage of the signal. Figure 1.3 shows an analog signal. Though analog computers are useful for certain simulation activities, they are not suited to processing the large volumes of data that digital computers typically process. Because of this nearly every computer in use today is digital.
Figure 1.3: Sound Recorder Shows an Analog File

Biological neural networks are analog. As you will see in the next section, simulating analog neural networks on a digital computer can present some challenges. Neurons accept an analog signal through their dendrites, as seen in Figure 1.1. Because this signal is analog the voltage of this signal will vary. If the voltage is within a certain range, the neuron will fire. When a neuron fires, a new analog signal is transmitted from the firing neuron to other neurons. This signal is conducted over the firing neuron's axon. The regions of input and output are called synapses. Later, in Chapter 3, "Using Multilayer Neural Networks", you will be shown that the synapses are the interface between your program and the neural network.

By firing or not firing a neuron is making a decision. These are extremely low level decisions. It takes the decisions of a large number of such neurons to read this sentence. Higher level decisions are the result of the collective input and output of many neurons. These decisions can be represented graphically by charting the input and output of neurons. Figure 1.4 shows the input and output of a particular neuron. As you will be shown in Chapter 3, there are different types of neurons that have different shaped output graphs. As you can see from the graph shown in Figure 1.4, this neuron will fire at any input greater than 1.5 volts.
Figure 1.4: Activation Levels of a Neuron

As you can see, a biological neuron is capable of making basic decisions. This model is what artificial neural networks are based on. You will now be shown how this model is simulated using a digital computer.
Simulating a Biological Neural Network with a Computer

A computer can be used to simulate a biological neural network. This computer simulated neural network is called an artificial neural network. Artificial neural networks are almost always referred to simply as neural networks. This book is no exception and will always use the term neural network to mean an artificial neural network. Likewise, the neural networks contained in the human brain will be referred to as biological neural networks.

This book will show you how to create neural networks using the Java programming language. You will be introduced to the Java Object Oriented Neural Engine (JOONE). JOONE is an open source neural network engine written completely in Java. JOONE is distributed under the Lesser GNU Public License (LGPL). This means that JOONE may be freely used in both commercial and non-commercial projects without royalties. JOONE will be used in conjunction with many of the examples in this book. JOONE will be introduced in Chapter 3. More information about JOONE can be found at http://joone.sourceforge.net/.

To simulate a biological neural network JOONE gives you several objects that approximate the portions of a biological neural network. JOONE gives you several types of neurons to construct your networks. These neurons are then connected together with synapse objects. The synapses connect the layers of an artificial neural network just as real synapses connect the layers of a biological neural network. Using these objects, you can construct complex neural networks to solve problems.
Solving Problems with Neural Networks
As a programmer of neural networks you must know what problems are adaptable to neural networks. You must also be aware of what problems are not particularly well suited to neural networks. Like most computer technologies and techniques, often the most important thing learned is when to use the technology and when not to. Neural networks are no different. A significant goal of this book is not only to show you how to construct neural networks, but also when to use neural networks. An effective neural network programmer knows what neural network structure, if any, is most applicable to a given problem. First the problems that are not conducive to a neural network solution will be examined.
Problems Not Suited to a Neural Network

It is important to understand that a neural network is just a part of a larger program. A complete program is almost never written just as a neural network. Most programs do not require a neural network.

Programs that are easily written out as a flowchart are an example of programs that are not well suited to neural networks. If your program consists of well defined steps, normal programming techniques will suffice.

Another criterion to consider is whether the logic of your program is likely to change. The ability for a neural network to learn is one of the primary features of the neural network. If the algorithm used to solve your problem is an unchanging business rule there is no reason to use a neural network. It might be detrimental to your program if the neural network attempts to find a better solution, and begins to diverge from the expected output of the program.

Finally, neural networks are often not suitable for problems where you must know exactly how the solution was derived. A neural network can become very adept at solving the problem for which it was trained. But the neural network cannot explain its reasoning. The neural network knows because it was trained to know. The neural network cannot explain how it followed a series of steps to derive the answer.
Problems Suited to a Neural Network

Although there are many problems that neural networks are not suited towards, there are also many problems that a neural network is quite adept at solving. Neural networks can often solve problems with fewer lines of code than a traditional programming algorithm. It is important to understand what these problems are.
Neural networks are particularly adept at solving problems that cannot be expressed as a series of steps. Neural networks are particularly useful for recognizing patterns, classification into groups, series prediction and data mining.

Pattern recognition is perhaps the most common use for neural networks. The neural network is presented a pattern. This could be an image, a sound, or any other sort of data. The neural network then attempts to determine if the input data matches a pattern that the neural network has memorized. Chapter 3 will show a simple neural network that recognizes input patterns.

Classification is a process that is closely related to pattern recognition. A neural network trained for classification is designed to take input samples and classify them into groups. These groups may be fuzzy, without clearly defined boundaries. These groups may also have quite rigid boundaries. Chapter 7, "Applying Pattern Recognition", introduces an example program capable of Optical Character Recognition (OCR). This program takes handwriting samples and classifies them into the correct letter (e.g. the letter "A" or "B").

Series prediction uses neural networks to predict future events. The neural network is presented a chronological listing of data that stops at some point. The neural network is expected to learn the trend and predict future values. Chapter 14, "Predicting with a Neural Network", shows several examples of using neural networks to try to predict sun spots and the stock market. Though in the case of the stock market, the key word is "try."
Training Neural Networks

The individual neurons that make up a neural network are interconnected through the synapses. These connections allow the neurons to signal each other as information is processed. Not all connections are equal. Each connection is assigned a connection weight. These weights are what determine the output of the neural network. Therefore it can be said that the connection weights form the memory of the neural network. Training is the process by which these connection weights are assigned.

Most training algorithms begin by assigning random numbers to the weight matrix. Then the validity of the neural network is examined. Next the weights are adjusted based on how well the neural network performed. This process is repeated until the validation error is within an acceptable limit.

There are many ways to train neural networks. Neural network training methods generally fall into the categories of supervised, unsupervised and various hybrid approaches.

Supervised training is accomplished by giving the neural network a set of sample data along with the anticipated outputs from each of these samples. Supervised training is the most common form of neural network training. As supervised training proceeds the neural network is taken through several iterations, or epochs, until the actual output of the neural network matches the anticipated output, with a reasonably small error. Each epoch is one pass through the training samples.

Unsupervised training is similar to supervised training except that no anticipated outputs are provided. Unsupervised training usually occurs when the neural network is to classify the inputs into several groups. The training progresses through many epochs, just as in supervised training. As training progresses the classification groups are "discovered" by the neural network. Unsupervised training is covered in Chapter 7, "Applying Pattern Recognition".

There are several hybrid methods that combine several of the aspects of supervised and unsupervised training. One such method is called reinforcement training. In this method the neural network is provided with sample data that does not contain anticipated outputs, as is done with unsupervised training. However, for each output, the neural network is told whether the output was right or wrong given the input.
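The supervised training cycle just described can be sketched in a few lines of Java. The following is a minimal sketch, not taken from the book or from JOONE: it trains a single threshold neuron on the logical OR function, adjusting random starting weights each epoch until the error is acceptable. The particular learning rate, neuron model and sample data are illustrative assumptions only.

import java.util.Random;

public class SupervisedTrainingSketch {
    public static void main(String[] args) {
        // Training samples (logical OR) and their anticipated outputs.
        double[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[] expected = { 0, 1, 1, 1 };

        Random rnd = new Random();
        double[] weights = { rnd.nextDouble(), rnd.nextDouble() };  // random starting weights
        double bias = rnd.nextDouble();
        double learningRate = 0.1;

        for (int epoch = 1; epoch <= 100; epoch++) {
            int errors = 0;
            for (int i = 0; i < inputs.length; i++) {
                double sum = inputs[i][0] * weights[0] + inputs[i][1] * weights[1] + bias;
                double actual = (sum > 0) ? 1 : 0;          // simple threshold activation
                double error = expected[i] - actual;
                if (error != 0) {
                    errors++;
                    // Adjust the connection weights toward the anticipated output.
                    weights[0] += learningRate * error * inputs[i][0];
                    weights[1] += learningRate * error * inputs[i][1];
                    bias += learningRate * error;
                }
            }
            System.out.println("Epoch " + epoch + " errors: " + errors);
            if (errors == 0) break;   // error is within the acceptable limit
        }
    }
}

Each pass through the four samples is one epoch; the loop stops once an epoch produces no errors, which is the "reasonably small error" condition described above.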
It is very important to understand how to properly train a neural network. This book explores several methods of neural network training, including back propagation, simulated annealing, and genetic algorithms. Chapters 4 through 7 are dedicated to the training of neural networks. Once the neural network is trained, it must be validated to see if it is ready for use.
Validating Neural Networks

Once a neural network has been trained it must be evaluated to see if it is ready for actual use. This final step is important so that it can be determined if additional training is required. To correctly validate a neural network, validation data must be set aside that is completely separate from the training data.

As an example, consider a classification network that must group elements into three different classification groups. You are provided with 10,000 sample elements. For this sample data the group that each element should be classified into is known. For such a system you would divide the sample data into two groups of 5,000 elements. The first group would form the training set. Once the network was properly trained, the second group of 5,000 elements would be used to validate the neural network.

It is very important that a separate group always be maintained for validation. Training a neural network with a given sample set and then using that same set to estimate the error the network will achieve on new, arbitrary data will surely lead to bad results. The error achieved using the training set will almost always be substantially lower than the error on a new set of sample data. The integrity of the validation data must always be maintained.

This brings up an important question. What exactly does happen if the neural network that you have just finished training performs poorly on the validation set? If this is the case then you must examine what exactly this means. It could mean that the initial random weights were not good. Rerunning the training with new initial weights could correct this. While an improper set of initial random weights could be the cause, a more likely possibility is that the training data was not properly chosen. If the validation is performing badly this most likely means that there was data present in the validation set that was not available in the training data. The way that this situation should be solved is by trying a different, more random, way of separating the data into training and validation sets. Failing this, you must combine the training and validation sets into one large training set. Then new data must be acquired to serve as the validation data.

For some situations it may be impossible to gather additional data to use as either training or validation data. If this is the case then you are left with no other choice but to combine all or part of the validation set with the training set. While this approach will forgo the security of a good validation, if additional data cannot be acquired this may be your only alternative.
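The 10,000-element split described above is simple to perform in code. The following is a minimal sketch, not from the book's code: it stands in integer IDs for the samples and shuffles them before splitting, the shuffling being an assumption based on the advice to separate the data in a "more random" way.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TrainValidationSplit {
    public static void main(String[] args) {
        // Stand-in for the 10,000 known, labeled samples described above.
        List<Integer> samples = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            samples.add(i);
        }

        // Shuffle first so the two halves are drawn randomly from the data.
        Collections.shuffle(samples);

        // The first half trains the network; the second half is held back for validation.
        List<Integer> trainingSet = samples.subList(0, 5000);
        List<Integer> validationSet = samples.subList(5000, samples.size());

        System.out.println("Training samples:   " + trainingSet.size());
        System.out.println("Validation samples: " + validationSet.size());
    }
}

Keeping the validation half untouched until training is finished is what preserves the integrity of the validation data.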
A Historical Perspective on Neural Networks

Neural networks have been used with computers as early as the 1950's. Through the years many different neural network architectures have been presented. In this section you will be shown some of the history behind neural networks and how this history led to the neural networks of today. We will begin this exploration with the perceptron.
Perceptron

The perceptron is one of the earliest neural networks. Invented at the Cornell Aeronautical Laboratory in 1957 by Frank Rosenblatt, the perceptron was an attempt to understand human memory, learning, and cognitive processes. In 1960 Rosenblatt demonstrated the Mark I Perceptron. The Mark I was the first machine that could "learn" to identify optical patterns.

The perceptron progressed from the biological neural studies of researchers such as D.O. Hebb, Warren McCulloch and Walter Pitts. McCulloch and Pitts were the first to describe biological neural networks and are credited with coining the phrase "neural network." They developed a simplified model of the neuron, called the MP neuron, that centered on the idea that a nerve will fire an impulse only if its threshold value is exceeded. The MP neuron functioned as a sort of scanning device that read predefined input and output associations to determine the final output. MP neurons were incapable of learning as they had fixed thresholds. As a result MP neurons were hard-wired logic devices that were set up manually. Because the MP neuron did not have the ability to learn, it was very limited when compared with the infinitely more flexible and adaptive human nervous system upon which it was modeled.

Rosenblatt determined that a learning network model could adapt its responses by adjusting the weights on its connections between neurons. This was taken into consideration when Rosenblatt designed the perceptron. The perceptron showed early promise for neural networks and machine learning, but it had one very large shortcoming. The perceptron was unable to learn to recognize input that was not "linearly separable." This would prove to be a huge obstacle that neural networks would take some time to overcome.
Perceptrons and Linear Separability

To see why the perceptron failed you must see what exactly is meant by a linearly separable problem. Consider a neural network that accepts two binary digits (0 or 1) and outputs one binary digit. The inputs and output of such a neural network could be represented by Table 1.1.
Table 1.1: A Linearly Separable Function

Input 1   Input 2   Output
0         0         1
0         1         0
1         0         1
1         1         1
This table would be considered to be linearly separable. To see why, examine Figure 1.5. Table 1.1 is shown on the left side of Figure 1.5. Notice how a line can be drawn to separate the output values of 1 from the output values of 0? This is a linearly separable table. Table 1.2 shows a non-linearly separable table.

Table 1.2: A Non-Linearly Separable Function

Input 1   Input 2   Output
0         0         0
0         1         1
1         0         1
1         1         0
The above table, which happens to be the XOR function, is not linearly separable. This can be seen in Figure 1.5. Table 1.2 is shown on the right side of Figure 1.5. There is no way you could draw a line that would separate the 0 outputs from the 1 outputs. As a result Table 1.2 is said to be non-linearly separable. A perceptron could not be trained to recognize Table 1.2.
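The difference between the two tables can be demonstrated concretely. The following is a minimal sketch, not from the book's code, that tries to reproduce each table with a single threshold unit by searching a small grid of weights and thresholds; the grid bounds and step size are arbitrary assumptions. A single unit can be found for Table 1.1 but not for Table 1.2, which is exactly the perceptron's limitation.

public class LinearSeparabilityDemo {

    // Output of a single threshold unit: fire (1) if the weighted sum reaches the threshold.
    static int output(double w1, double w2, double threshold, int x1, int x2) {
        return (w1 * x1 + w2 * x2 >= threshold) ? 1 : 0;
    }

    // Search a small grid of weights and thresholds for a unit that reproduces the table.
    static boolean representable(int[][] table) {
        for (double w1 = -2; w1 <= 2; w1 += 0.5) {
            for (double w2 = -2; w2 <= 2; w2 += 0.5) {
                for (double t = -2; t <= 2; t += 0.5) {
                    boolean matches = true;
                    for (int[] row : table) {
                        if (output(w1, w2, t, row[0], row[1]) != row[2]) {
                            matches = false;
                            break;
                        }
                    }
                    if (matches) return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] table11 = { {0, 0, 1}, {0, 1, 0}, {1, 0, 1}, {1, 1, 1} }; // Table 1.1
        int[][] table12 = { {0, 0, 0}, {0, 1, 1}, {1, 0, 1}, {1, 1, 0} }; // Table 1.2 (XOR)
        System.out.println("Table 1.1 representable by one unit: " + representable(table11)); // true
        System.out.println("Table 1.2 representable by one unit: " + representable(table12)); // false
    }
}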
Figure 1.5: Linearly Separable and Non-Linearly Separable

The perceptron's inability to solve non-linearly separable problems would prove to be a major obstacle to not only the perceptron, but the entire field of neural networks. A former classmate of Rosenblatt, Marvin Minsky, along with Seymour Papert, published the book Perceptrons in 1969. This book mathematically discredited the perceptron model. Fate was to further rule against the perceptron in 1971 when Rosenblatt died in a boating accident. Without Rosenblatt to defend the perceptron and neural networks, interest diminished for over a decade.

What was just presented is commonly referred to as the XOR problem. While the XOR problem was the nemesis of the perceptron, current neural networks have little problem learning the XOR function or other non-linearly separable problems. The XOR problem has become a sort of "Hello World" problem for new neural networks. The XOR problem will be revisited in Chapter 3. While the XOR problem was eventually surmounted, another test, the Turing Test, remains unsolved to this day.
The Turing Test

The Turing Test was proposed in a 1950 paper by Dr. Alan Turing. In this paper Dr. Turing introduces the now famous "Turing Test". This is a test that is designed to measure the advance of AI research. The Turing Test is far more complex than the XOR problem, and has yet to be solved.

To understand the Turing Test think of an Instant Message window. Using the Instant Message program you can chat with someone using another computer. Suppose a stranger sends you an Instant Message and you begin chatting. Are you sure that this stranger is a human being? Perhaps you are talking to an AI enabled computer program. Could you tell the difference? This is the "Turing Test." If you are unable to distinguish the AI program from another human being, then that program has passed the "Turing Test".

No computer program has ever passed the Turing Test. No computer program has ever even come close to passing the Turing Test. In the 1950's it was assumed that a computer program capable of passing the Turing Test was no more than a decade away. But like many of the other lofty goals of AI, passing the Turing Test has yet to be realized.

The Turing Test is quite complex. Passing this test requires the computer to be able to read English, or some other human language, and understand the meaning of the sentence. Then the computer must be able to access a database that comprises the knowledge that a typical human has amassed from several decades of human existence. Finally, the computer program must be capable of forming a response, and perhaps questioning the human that it is interacting with. This is no small feat. This goes well beyond the capabilities of current neural networks.

One of the most complex parts of solving the Turing Test is working with the database of human knowledge. This has given way to a new test called the "Limited Turing Test". The "Limited Turing Test" works similarly to the actual Turing Test. A human is allowed to conduct a conversation with a computer program. The difference is that the human must restrict the conversation to one narrow subject area. This limits the size of the human experience database.
Neural Networks Today and in the Future

Neural networks have existed since the 1950's. They have come a long way since the early perceptrons that were easily defeated by problems as simple as the XOR operator. Yet neural networks have a long way to go.
Neural Networks Today

Neural networks are in use today for a wide variety of tasks. Most people think of neural networks as attempting to emulate the human mind or passing the Turing Test. Most neural networks used today take on far less glamorous roles than the neural networks frequently seen in science fiction.

Speech and handwriting recognition are two common uses for today's neural networks. Chapter 7 contains an example that illustrates handwriting recognition using a neural network. Neural networks tend to work well for both speech and handwriting recognition because neural networks can be trained to the individual user.

Data mining is a process where large volumes of data are "mined" for trends and other statistics that might otherwise be overlooked. Very often in data mining the programmer is not particularly sure what final outcome is being sought. Neural networks are often employed in data mining due to the ability of neural networks to be trained.

Neural networks can also be used to predict. Chapter 14 shows how a neural network can be presented with a series of chronological data. The neural network uses the provided data to train itself, and then attempts to extrapolate the data out beyond the end of the sample data. This is often applied to financial forecasting.

Perhaps the most common form of neural network that is used by modern applications is the feed forward back propagation neural network. This network feeds inputs forward from one layer to the next as it processes. Back propagation refers to the way in which the neurons are trained in this sort of neural network. Chapter 3 begins your introduction to this sort of network.
A Fixed Wing Neural Network

Some researchers suggest that perhaps the neural network itself is a fallacy. Perhaps other methods of modeling human intelligence must be explored. The ultimate goal of AI is to produce a thinking machine. Does this not mean that such a machine would have to be constructed exactly like a human brain? That to solve the AI puzzle we should seek to imitate nature? Imitating nature has not always led mankind to the most optimal solution. Consider the airplane.

Man has been fascinated with the idea of flight since the beginnings of civilization. Many inventors through history worked towards the development of the "Flying Machine". To create a flying machine most of these inventors looked to nature. In nature we found our only working model of a flying machine, which was the bird. Most inventors who aspired to create a flying machine created various forms of ornithopters. Ornithopters are flying machines that work by flapping their wings. This is how a bird works, so it seemed only logical that this would be the way to create such a device. However none of the ornithopters were successful. They simply could not generate sufficient lift to overcome their weight. Many designs were tried. Figure 1.6 shows one such design that was patented in the late 1800's.
Figure 1.6: An Ornithopter

It was not until Wilbur and Orville Wright decided to use a fixed wing design that airplane technology began to truly advance. For years the paradigm of modeling the bird was pursued. Once the two brothers broke with this tradition, this area of science began to move forward. Perhaps AI is no different. Perhaps it will take a new paradigm, outside of the neural network, to usher in the next era of AI.
Quantum Computing

One of the most promising areas of future computer research is quantum computing. Quantum computing could change every aspect of how computers are designed. To understand quantum computers we must first examine how they are different from the computer systems that are in use today.
Von Neumann and Turing Machines

Practically every computer in use today is built upon the Von Neumann principle. A Von Neumann computer works by following simple discrete instructions, which are the chip-level machine language codes. Such a computer's output is completely predictable and serial. Such a machine is implemented by finite state units of data known as "bits", and logic gates that perform operations on the bits. This classic model of computation is essentially the same as Babbage's Analytical Engine of 1834. The computers of today have not strayed from this classic architecture; they have simply become faster and gained more "bits".

The Church-Turing thesis sums up this idea. The Church-Turing thesis is not a mathematical theorem in the sense that it can be proven. It simply seems correct and applicable. Alonzo Church and Alan Turing created this idea independently. According to the Church-Turing thesis, all mechanisms for computing algorithms are inherently the same. Any method used can be expressed as a computer program. This seems to be a valid thesis. Consider the case where you are asked to add two numbers. You would likely follow a simple algorithm that could be easily implemented as a computer program. If you were asked to multiply two numbers, you would use another approach implemented as a computer program. The basis of the Church-Turing thesis is that there seems to be no algorithmic problem that a computer cannot solve, so long as a solution does exist.

The embodiment of the Church-Turing thesis is the Turing machine. The Turing machine is an abstract computing device that illustrates the Church-Turing thesis. The Turing machine is the ancestor from which all existing computers descend. The Turing machine consists of a read/write head and a long piece of tape. This head can read and write symbols to and from the tape. At each step, the Turing machine must decide its next action by following a very simple program consisting of conditional statements, read/write commands or tape shifts. The tape can be of any length necessary to solve a particular problem, but the tape cannot be of an infinite length. If a problem has a solution, that problem can be solved using a Turing machine and some finite length tape.
Quantum Computing
Practically every neural network thus far has been implemented using a Von Neumann computer. But might the successor to the Von Neumann computer take neural networks to the near human level? Advances in an area called quantum computing may do just that. A quantum computer would be constructed very differently than a Von Neumann computer. But what exactly is a quantum computer?

Quantum computers use small particles to represent data. For example, a pebble is a quantum computer for calculating the constant-position function. A quantum computer would use small particles to represent the neurons of a neural network. Before seeing how to construct a quantum neural network you must first see how a quantum computer is constructed.

At the most basic level of a Von Neumann computer is the bit. Similarly, at the most basic level of the quantum computer is the "qubit". A qubit, or quantum bit, differs from a normal bit in one very important way. Where a normal bit can only have the value 0 or 1, a qubit can have the value 0, 1 or both simultaneously. To see how this is possible, first you will be shown how a qubit is constructed.

A qubit is constructed with an atom of some element. Hydrogen makes a good example. The hydrogen atom consists of a nucleus and one orbiting electron. For the purposes of quantum computing only the orbiting electron is important. This electron can exist in different energy levels, or orbits, about the nucleus. The different energy levels are used to represent the binary 0 and 1. The ground state, when the atom is in its lowest orbit, could represent the value 0. The next highest orbit would represent the value 1. The electron can be moved to different orbits by subjecting the electron to a pulse of polarized laser light. This has the effect of adding photons into the system. So to flip a bit from 0 to 1, enough light is added to move the electron up one orbit. To flip from 1 to 0, we do the same thing, since overloading the electron will cause the electron to return to its ground state. This is logically equivalent to a NOT gate. Using similar ideas other gates can be constructed, such as AND and COPY.

Thus far, there is no qualitative difference between qubits and regular bits. Both are capable of storing the values 0 and 1. What is different is the concept of superposition. If only half of the light necessary to move an electron is added, the electron will occupy both orbits simultaneously. Superposition allows two possibilities to be computed at once. Further, if you take one "qubyte", that is 8 qubits, then 256 numbers can be represented simultaneously.

Calculation with superposition can have certain advantages. For example, to calculate with the superpositional property, a number of qubits are raised to their superpositions. Then the algorithm is performed on these qubits. When the algorithm is complete, the superposition is collapsed. This results in the true answer being revealed. You can think of the algorithm as being run on all possible combinations of the definite qubit states (i.e. 0 and 1) in parallel. This is called quantum parallelism. Quantum computers clearly process information differently than their Von Neumann counterparts.
But does quantum computing offer anything not already achievable by ordinary classical computers? The answer is yes. Quantum computing provides tremendous speed advantages over the Von Neumann architecture. To see this difference in speed, consider a problem that takes an extremely long time to compute on a classical computer. Factoring a 250 digit number is a good example. It is estimated that this would take approximately 800,000 years to factor with 1,400 present day Von Neumann computers working in parallel. Unfortunately, even as Von Neumann computers improve in speed and methods of large scale parallelism improve, the problem is still exponentially expensive to compute. This same problem, posed to a quantum computer, would not take nearly so long. With a quantum computer it becomes possible to factor a 250 digit number in just a few million steps. The key element is that, using the parallel properties of superposition, all possibilities can be computed simultaneously.

Whether the Church-Turing thesis is indeed true for all quantum computers is in some doubt. The quantum computer just described processes information much like a Von Neumann computer, using bits and logic gates. This is not to say that we cannot use other types of quantum computer models that are more powerful. One such model may be a Quantum Neural Network, or QNN. A QNN could certainly be constructed using qubits; this would be analogous to constructing an ordinary neural network on a Von Neumann computer. As a direct result, it would only offer speed, not computability, advantages over Von Neumann based neural networks. To construct a QNN that is not restrained by Church-Turing, a radically different approach to qubits and logic gates must be sought. As of yet there does not seem to be any clear way of doing this.
Quantum Neural Networks

How might a QNN be constructed? Currently there are several research institutes around the world working on a QNN. Two such examples are Georgia Tech and Oxford University. Most are reluctant to publish many details of their work. This is likely because building a QNN is potentially much easier than building an actual quantum computer. This has created a sort of quantum race.

A QNN would likely gain exponentially over classic neural networks through superposition of values entering and exiting a neuron. Another advantage would be a reduction in the number of neuron layers required. This is because neurons can be used to calculate over many possibilities by using superposition. The model would therefore require fewer neurons to learn. This would result in networks with fewer neurons and greater efficiency.
Summary
Computers can process information considerably faster than human beings. Yet a computer is incapable of performing many of the same tasks that a human can easily perform. For processes that cannot easily be broken into a finite number of steps, the techniques of Artificial Intelligence may be useful. Artificial intelligence is usually achieved using a neural network. The term neural network is usually meant to refer to an artificial neural network. An artificial neural network attempts to simulate the real neural networks that are contained in the brains of all animals. Neural networks were introduced in the 1950's, have experienced numerous setbacks, and have yet to deliver on the promise of simulating human thought.

Neural networks are constructed of neurons that form layers. Input is presented to the layers of neurons. If the input to a neuron is within the range that the neuron has been trained for, then the neuron will fire. When a neuron fires, a signal is sent to whatever layer of neurons, or their outputs, the firing neuron was connected to. These connections between neurons are called synapses. Java can be used to construct such a network. One such neural network, which was written in Java, is the Java Object Oriented Neural Engine (JOONE). JOONE is an open source LGPL library that can be used free of charge. Several of the chapters in this book will explain how to use the JOONE engine.

Neural networks must be trained and validated. A training set is usually split in half to give both a training and a validation set. Training the neural network consists of running the neural network over the training data until the neural network learns to recognize the training set with a sufficiently low error rate. Just because a neural network can process the training data with a low error does not mean that the neural network is trained and ready for use. Before the neural network is placed into production use, the results from the neural network must be validated. Validation involves presenting the validation set to the neural network and comparing the actual results of the neural network with the anticipated results. At the end of validation, the neural network is ready to be placed into production if the results from the validation set yield an error level that is satisfactory. If the results are not satisfactory, then the neural network will have to be retrained before it is placed into production.

The future of artificial intelligence programming may reside with the quantum computer, or perhaps something other than the neural network. The quantum computer promises to speed computing to levels that are unimaginable on today's computer platforms.
Early attempts at flying machines attempted to model the bird. This was done because the bird was our only working model of flight. It was not until Wilbur and Orville Wright broke from the model of nature and created the first fixed wing aircraft that airplane technology truly advanced. Perhaps modeling AI programs after nature is analogous to modeling airplanes after birds, and a much better model than the neural network exists. Only the future will tell.
Chapter 2: Understanding Neural Networks
Introduction
The neural network has long been the mainstay of Artificial Intelligence (AI) programming. As programmers we can create programs that do fairly amazing things. Programs can automate repetitive tasks such as balancing checkbooks or calculating the value of an investment portfolio. While a program could easily maintain a large collection of images, it could not tell us what any of those images are of. Programs are inherently unintelligent and uncreative. Ordinary computer programs are only able to perform repetitive tasks.

A neural network attempts to give computer programs human like intelligence. Neural networks are usually designed to recognize patterns in data. A neural network can be trained to recognize specific patterns in data. This chapter will teach you the basic layout of a neural network and end by demonstrating the Hopfield neural network, which is one of the simplest forms of neural network.
Neural Network Structure
To study neural networks you must first become aware of their structure. A neural network is composed of several different elements. Neurons are the most basic unit. Neurons are interconnected. These connections are not equal, as each connection has a connection weight. Groups of neurons come together to form layers. In this section we will explore each of these topics.
The Neuron

The neuron is the basic building block of the neural network. A neuron is a communication conduit that both accepts input and produces output. The neuron receives its input either from other neurons or the user program. Similarly the neuron sends its output to other neurons or the user program. When a neuron produces output, that neuron is said to activate, or fire. A neuron will activate when the sum of its inputs satisfies the neuron's activation function.

Consider a neuron that is connected to k other neurons. The variable w represents the weights between this neuron and the other k neurons. The variable x represents the input to this neuron from each of the other neurons. Therefore we must calculate the sum of every input x multiplied by the corresponding weight w. This is shown in the following equation:

sum = x1*w1 + x2*w2 + ... + xk*wk

This book will use some mathematical notation to explain how the neural networks are constructed. Often this is theoretical and not absolutely necessary to use neural networks. A review of the mathematical concepts used in this book is covered in Appendix B, "Mathematical Background".
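The sum just described can be expressed directly in Java. The following is a minimal sketch, not taken from the book's code or from JOONE; the class and method names, array names and sample values are illustrative assumptions only.

public class WeightedSumSketch {

    // Multiply each input x[i] by its connection weight w[i] and total the products.
    static double weightedSum(double[] x, double[] w) {
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * w[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] inputs  = { 1.0, 0.0, 1.0 };   // hypothetical inputs from three connected neurons
        double[] weights = { 2.0, 4.0, 3.0 };   // hypothetical connection weights
        System.out.println(weightedSum(inputs, weights)); // prints 5.0
    }
}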
This sum must be given to the neuron's activation function. An activation function is just a simple Java method that tells the neuron if it should fire or not. For example, if you chose to have your neuron only activate when the input to that neuron is between 5 and 10, the following activation method might be used.

boolean thresholdFunction(double input)
{
  if( (input>=5) && (input<=10) )
    return true;
  else
    return false;
}

Consider a simple Hopfield neural network made up of four neurons. Every neuron is connected to every other neuron, but not to itself. These connections are summarized in the following table.

              Neuron 1 (N1)   Neuron 2 (N2)   Neuron 3 (N3)   Neuron 4 (N4)
Neuron 1 (N1) (N/A)           N2->N1          N3->N1          N4->N1
Neuron 2 (N2) N1->N2          (N/A)           N3->N2          N4->N2
Neuron 3 (N3) N1->N3          N2->N3          (N/A)           N4->N3
Neuron 4 (N4) N1->N4          N2->N4          N3->N4          (N/A)
The connection weights put into this array, also called a weight matrix, allow the neural network to recall certain patterns when presented. For example, the values shown in Table 2.2 show the correct values to use to recall the patterns 0101 and 1010. The method to create the values contained in Table 2.2 will be covered shortly. First you will be shown how the values in Table 2.2 are used to recall 0101 and 1010.

Table 2.2: Weights used to recall 0101 and 1010

              Neuron 1 (N1)   Neuron 2 (N2)   Neuron 3 (N3)   Neuron 4 (N4)
Neuron 1 (N1)  0              -1               1              -1
Neuron 2 (N2) -1               0              -1               1
Neuron 3 (N3)  1              -1               0              -1
Neuron 4 (N4) -1               1              -1               0
Recalling Patterns

You have been told several times that the connection weight matrix shown in Table 2.2 will correctly recall 0101 and 1010. You will now be shown exactly how a neural network is used to recall patterns. First we will take the example of presenting 0101 to the Hopfield network. To do this we present each input neuron, which in this case are also the output neurons, with the pattern. Each neuron will activate based upon the input pattern. For example, when neuron 1 is presented with 0101 its activation will be the sum of all weights that have a 1 in the input pattern. For example, we can see from Table 2.2 that Neuron 1 has the following weights with all of the other neurons:

0 -1 1 -1

We must now compare those weights with the input pattern of 0101:

0  1  0  1
0 -1  1 -1

We will sum only the values that correspond to a 1 in the input pattern. Therefore the activation of the first neuron is -1 + -1, or -2. The activation of each neuron is shown below.

N1 = -1 + -1 = -2
N2 = 0 + 1 = 1
N3 = -1 + -1 = -2
N4 = 1 + 0 = 1

Therefore, the output neurons, which are also the input neurons, will report the above activations. The final output vector would then be -2, 1, -2, 1. These values are meaningless without a threshold method. We said earlier that a threshold method determines what range of values will cause the neuron, in this case the output neuron, to fire. The threshold usually used for a Hopfield network is to fire on any value greater than zero. So the following neurons would fire.

N1 activation is -2, would not fire (0)
N2 activation is 1, would fire (1)
N3 activation is -2, would not fire (0)
N4 activation is 1, would fire (1)
As you can see, we assign a binary 1 to all neurons that fired, and a binary 0 to all neurons that did not fire. The final binary output from the Hopfield network would be 0101. This is the same as the input pattern. An autoassociative neural network, such as a Hopfield network, will echo a pattern back if the pattern is recognized. The pattern was successfully recognized. Now that you have seen how a connection weight matrix can cause a neural network to recall certain patterns, you will be shown how the connection weight matrix was derived.
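The recall procedure just walked through can be sketched in a few lines of Java. The method below is a minimal illustration, not the Layer and Neuron classes used later in this chapter; it assumes a square weight matrix and a binary pattern of 0s and 1s, sums the weights wherever the corresponding input bit is 1, and fires any neuron whose sum is greater than zero.

// Present a binary pattern to a Hopfield weight matrix and
// return the pattern the network recalls.
public static int[] recall(int[][] weights, int[] pattern) {
  int[] output = new int[pattern.length];
  for (int neuron = 0; neuron < pattern.length; neuron++) {
    int sum = 0;
    for (int other = 0; other < pattern.length; other++) {
      if (pattern[other] == 1) {
        sum += weights[neuron][other]; // only inputs of 1 contribute
      }
    }
    output[neuron] = (sum > 0) ? 1 : 0; // Hopfield threshold: fire above zero
  }
  return output;
}

Calling recall with the Table 2.2 matrix and the pattern {0, 1, 0, 1} produces the activations -2, 1, -2, 1 and therefore the output {0, 1, 0, 1}, exactly as in the walkthrough above.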
Deriving the Weight Matrix

You are probably wondering how the weight matrix shown in Table 2.2 was derived. This section will show you how to create a weight matrix that can recall any number of patterns. First you should start with a blank connection weight matrix, in which every value is zero:

0  0  0  0
0  0  0  0
0  0  0  0
0  0  0  0
We will first train this neural network to accept the value 0101. To do this we must first calculate a matrix just for 0101, which is called 0101's contribution matrix. The contribution matrix will then be added to the actual connection weight matrix. As additional contribution matrixes are added to the connection weight matrix, the connection weight matrix is said to learn each of the new patterns.

First we must calculate the contribution matrix of 0101. There are three steps involved in this process. First we must calculate the bipolar values of 0101. Bipolar simply means that you are representing a binary string with -1's and 1's rather than 0's and 1's. Next we transpose and multiply the bipolar equivalent of 0101 by itself. Finally, we set all the values on the northwest diagonal to zero, because neurons do not connect to themselves in a Hopfield network. Let's take each step one at a time and see how this is done, starting with the bipolar conversion.

Step 1: Convert 0101 to bipolar

Bipolar is nothing more than a way to represent binary values as -1's and 1's rather than 0's and 1's. This is done because binary has one minor flaw: 0 is NOT the inverse of 1. Rather, -1 is the mathematical inverse of 1. To convert 0101 to bipolar we convert all of the zeros to -1's. This results in:

0 = -1
1 =  1
0 = -1
1 =  1
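Step 1 is straightforward to code. The helper below is a small sketch (the method name toBipolar is illustrative, not taken from the example classes); it assumes the binary pattern is stored as an int array of 0s and 1s.

// Convert a binary pattern (0s and 1s) to its bipolar form (-1s and 1s).
public static int[] toBipolar(int[] binary) {
  int[] bipolar = new int[binary.length];
  for (int i = 0; i < binary.length; i++) {
    bipolar[i] = (binary[i] == 0) ? -1 : 1;
  }
  return bipolar;
}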
The final result is the array -1, 1, -1, 1. This array will be used in step 2 to begin to build the contribution matrix for 0101.

Step 2: Multiply -1, 1, -1, 1 by its transpose
For this step we will consider -1, 1, -1, 1 to be a one-row matrix:

[ -1  1 -1  1 ]

Taking the transpose of this matrix gives a one-column matrix:

[ -1 ]
[  1 ]
[ -1 ]
[  1 ]
We must now multiply these two matrixes. Appendix B, "Mathematical Background", contains an exact definition of how to multiply two matrixes. It is a relatively easy procedure, in which the rows and columns are multiplied against each other to produce the following products:

-1 x -1 =  1    -1 x 1 = -1    -1 x -1 =  1    -1 x 1 = -1
 1 x -1 = -1     1 x 1 =  1     1 x -1 = -1     1 x 1 =  1
-1 x -1 =  1    -1 x 1 = -1    -1 x -1 =  1    -1 x 1 = -1
 1 x -1 = -1     1 x 1 =  1     1 x -1 = -1     1 x 1 =  1

Condensed, the above results in the following matrix:

 1 -1  1 -1
-1  1 -1  1
 1 -1  1 -1
-1  1 -1  1
Now that we have successfully multiplied the matrix by its transpose we are ready for step 3.

Step 3: Set the northwest diagonal to zero

Mathematically speaking, we are now going to subtract the identity matrix from the matrix we derived in step two. The net result is that the northwest diagonal gets set to zero. The real reason we do this is that Hopfield networks do not have their neurons connected to themselves. So positions [0][0], [1][1], [2][2] and [3][3] in our two-dimensional array, or matrix, get set to zero. This results in the final contribution matrix for the bit pattern 0101:

 0 -1  1 -1
-1  0 -1  1
 1 -1  0 -1
-1  1 -1  0
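Steps 2 and 3 can also be expressed compactly in Java: the multiplication is simply the outer product of the bipolar array with itself, and zeroing the northwest diagonal removes the self-connections. The method below is a sketch using illustrative names, not code from the book's classes.

// Build the contribution matrix for one bipolar pattern (steps 2 and 3).
public static int[][] contributionMatrix(int[] bipolar) {
  int size = bipolar.length;
  int[][] contribution = new int[size][size];
  for (int row = 0; row < size; row++) {
    for (int col = 0; col < size; col++) {
      // outer product: multiply the bipolar vector by its transpose
      contribution[row][col] = bipolar[row] * bipolar[col];
    }
  }
  for (int i = 0; i < size; i++) {
    contribution[i][i] = 0; // neurons do not connect to themselves
  }
  return contribution;
}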
The contribution matrix can now be added to whatever connection weight matrix you already had. If you only want this network to recognize 0101, then this contribution matrix becomes your connection weight matrix. If you also wanted it to recognize 1001, then you would calculate the contribution matrix for each pattern and add them together, value by value, to produce a combined matrix, which would be the connection weight matrix.
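Training the network on a pattern therefore amounts to adding that pattern's contribution matrix into the connection weight matrix, value by value. A short sketch, reusing the hypothetical helpers shown above:

// Add one binary pattern's contribution matrix into the weight matrix.
public static void train(int[][] weights, int[] binaryPattern) {
  int[][] contribution = contributionMatrix(toBipolar(binaryPattern));
  for (int row = 0; row < weights.length; row++) {
    for (int col = 0; col < weights.length; col++) {
      weights[row][col] += contribution[row][col];
    }
  }
}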
If this process seems a bit confusing, you might try looking at the next section, where we actually build a program that constructs connection weight matrixes. There the process is explained in a more Java-centric way. Before we end the discussion of determining the weight matrix, one small side effect should be mentioned. We went through several steps to determine the correct weight matrix for 0101. Any time you create a Hopfield network that recognizes a binary pattern, the network also recognizes the inverse of that bit pattern. You can get the inverse of a bit pattern by flipping all 0's to 1's and all 1's to 0's. The inverse of 0101 is 1010. As a result, the connection weight matrix we just calculated will also recognize 1010.
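Using the hypothetical sketches above, the side effect is easy to verify: train a blank 4x4 weight matrix on 0101, then recall both 0101 and its inverse 1010; each call echoes its input back. (This fragment assumes java.util.Arrays is imported for printing.)

int[][] weights = new int[4][4];
train(weights, new int[] { 0, 1, 0, 1 });
System.out.println(Arrays.toString(recall(weights, new int[] { 0, 1, 0, 1 }))); // [0, 1, 0, 1]
System.out.println(Arrays.toString(recall(weights, new int[] { 1, 0, 1, 0 }))); // [1, 0, 1, 0]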
Hopfield Neural Network Example
Now that you have been shown some of the basic concepts of neural networks, we will examine an actual Java example of a neural network. The example program for this chapter implements a simple Hopfield neural network that you can use to experiment with Hopfield neural networks. The example given in this chapter implements the entire neural network. More complex neural network examples will often use JOONE. JOONE will be introduced in Chapter 3. The complete source code for this, and all examples, can be found on the companion CD-ROM. To learn how to run the examples refer to Appendix C, "Compiling Examples under Windows" and Appendix D, "Compiling Examples under Linux/UNIX". These appendixes give a thorough discussion of how to properly compile and execute the examples. The classes used to create the Hopfield example are shown in Figure 2.7.
Figure 2.7: Hopfield Example Classes
Using the Hopfield Network

You will now be shown a Java program that implements a 4-neuron Hopfield neural network. This simple program is implemented as a Swing Java application. Figure 2.8 shows the application as it appears when it initially starts up. Initially the network connection weights are all zero. The network has learned no patterns at this point.
Figure 2.8: A Hopfield Example

We will begin by teaching it to recognize the pattern 0101. Enter 0101 under "Input pattern to run or train" and click the "Train" button. Notice that the weight matrix adjusts to absorb the new knowledge. You should now see the same connection weight matrix as Figure 2.9.
Figure 2.9: Training the Hopfield Network

Now you will test it. Enter the pattern 0101 into the "Input pattern to run or train" field (it should still be there from your training) and click "Run". The output will be "0101". This is an autoassociative network, so it echoes the input if it recognizes it.

Now you should try something that does not match the training pattern exactly. Enter the pattern "0100" and click "Run". The output will now be "0101". The neural network did not recognize "0100"; the closest thing it knew was "0101". It figured you made an error typing and attempted a correction.

Now let's test the side effect mentioned previously. Enter "1010", which is the binary inverse of what the network was trained with ("0101"). Hopfield networks are always trained for the binary inverse too, so if you enter "1010", the network will recognize it.

We will try one final test. Enter "1111", which is totally off base and not close to anything the neural network knows. The neural network responds with "0000"; it did not try to correct, because it has no idea what you mean.

You can play with the network more. It can be taught more than one pattern. As you train new patterns it builds upon the matrix already in memory. Pressing "Clear" clears out the memory.
Constructing the Hopfield Example

Before we examine the portions of the Hopfield example application that are responsible for the actual neural network, we will first examine the user interface. The main application source code is shown in Listing 2.1. This listing implements the Hopfield class, which is where the user interface code resides.

Listing 2.1: The Hopfield Application (Hopfield.java)
import java.awt.*;
import javax.swing.*;
import java.awt.event.*;

/**
 * Example: The Hopfield Neural Network
 *
 * This is an example that implements a Hopfield neural
 * network. This example network contains four fully
 * connected neurons. This file, Hopfield, implements a
 * Swing interface into the other two neural network
 * classes: Layer and Neuron.
 *
 * @author Jeff Heaton
 * @version 1.0
 */
public class Hopfield extends JFrame implements ActionListener {

  /**
   * The number of neurons in this neural network.
   */
  public static final int NETWORK_SIZE = 4;

  /**
   * The weight matrix for the four fully connected
   * neurons.
   */
  JTextField matrix[][] =
      new JTextField[NETWORK_SIZE][NETWORK_SIZE];

  /**
   * The input pattern, used to either train
   * or run the neural network. When the network
   * is being trained, this is the training
   * data. When the neural network is to be run,
   * this is the input pattern.
   */
  JComboBox input[] = new JComboBox[NETWORK_SIZE];

  /**
   * The output from each of the four neurons.
   */
  JTextField output[] = new JTextField[NETWORK_SIZE];

  /**
   * The clear button. Used to clear the weight
   * matrix.
   */
  JButton btnClear = new JButton("Clear");

  /**
   * The train button. Used to train the
   * neural network.
   */
  JButton btnTrain = new JButton("Train");

  /**
   * The run button. Used to run the neural
   * network.
   */
  JButton btnRun = new JButton("Run");

  /**
   * Constructor, create all of the components and position
   * the JFrame to the center of the screen.
   */
  public Hopfield() {
    setTitle("Hopfield Neural Network");

    // create connections panel
    JPanel connections = new JPanel();
    connections.setLayout(
        new GridLayout(NETWORK_SIZE, NETWORK_SIZE) );
    for ( int row=0; row<NETWORK_SIZE; row++ ) {
      for ( int col=0; col<NETWORK_SIZE; col++ ) {
        matrix[row][col] = new JTextField(3);
        connections.add(matrix[row][col]);
      }
    }
    // The remainder of the constructor creates the other
    // user-interface components and centers the frame on
    // the screen, as described in the text.