Figure 1: flow diagram of a 15-state universal Turing machine (block comparison, copying, and scanning past zeros to the next block on the tape).
SIMPLE LEARNING BY A DIGITAL COMPUTER

The Computation Laboratory, Harvard University
1. Introduction

Digital computers, which are usually programmed only with mathematical techniques, can readily be made to exhibit modes of behavior associated with the nervous systems of living organisms. Techniques of this type may have some value for those who, like psychologists and neurophysiologists, are interested in the potentialities of digital computers as models of the structure and of the functions of existing nervous systems. This paper describes a concrete example of one practical technique by which the Electronic Delay Storage Automatic Calculator (EDSAC) of the University Mathematical Laboratory, Cambridge, was made capable of modifying its behavior on the basis of experience acquired in the course of operation.

The description will be given in two stages. In the first stage (Section 2) the behavior of the EDSAC under the control of a response learning program will be presented from the point of view of an experimenter who can control the input of the machine and observe its output, but who is denied access to its internal mechanism. This point of view corresponds to that of an experimenter who attempts to deduce the structure and the internal mode of operation of an animal organism from observations of its functions. In the second stage (Section 3), the factors determining the behavior of the machine are revealed, and are analyzed from the privileged point of view of the designer of the learning program.
2. The Functions of the Response Learning Machine

When the response learning program is introduced into the EDSAC, the machine is changed from a general purpose digital computer into a special purpose machine which will be called the response learning machine. An experimenter asked to describe the behavior of this machine would soon observe that it has a sensory input mechanism which is capable of detecting a stimulus, in the form of a number whose magnitude corresponds to an intensity. He would note that when such a stimulus st > 0 is applied at time t, it initiates at random one of a set of possible responses Ri, i = 1, 2, ..., 5. The machine signals the occurrence of a response Ri by printing the number i with its output teleprinter. Following the occurrence of a response, the experimenter may express his approval or disapproval.

The experimenter would find it possible to train the machine to give one particular response only, by expressing his approval (at > 0) whenever this response occurs, and his unconcern (at = 0) or his disapproval (at < 0) otherwise. As the training proceeds, the errors become less frequent, and the learned responses may be initiated by a progressively weaker stimulus. Conversely, a response that is made frequently can be discouraged by repeated disapproval. Even when the experimenter is neutral (at = 0) for every response, it is nevertheless still possible for some particular response to occur more and more frequently and, eventually, to the exclusion of all others; this response becomes a habit. It is also possible, by expressing disapproval for every response, to drive the response learning machine into a lethargic state, making increasingly infrequent responses.

To analyze the machine's behavior in detail, the observer might gather experimental data in a form such as that of Table 1. In this table, the stimulus st is displayed in column 1, the resulting response Rit appears in column 2, and the intensity, at, of approval or disapproval is given in column 3. The remaining columns of Table 1 should, for the moment, be disregarded. An X in column 2 at time t indicates that st was too weak to initiate any response whatsoever at that time. In the interval t ≤ 12 the stimulus st = 2 is frequently too weak to elicit any response, and those responses that are made occur at random; at t = 3, st = 2 initiated the response R2, and at t = 17, st = 2 initiated R3.

Table 1 shows that the approval signals at = 2 given to R1 at t = 27 and t = 29, and at = 1 given at t = 30, were sufficient to train the machine to respond to every stimulus with R1, except at t = 28 when, at an early stage of training, R3 occurred. An earlier attempt, made at t = 16, to teach the machine to make the response R1 failed for reasons whose explanation will be given later. To train the machine to give R1, it was first necessary actively to discourage R3, which showed promise of becoming a habit; the high frequency of R3 from t = 15 onwards is due to this effect.

The attempt to teach the response learning machine to give the response R1 begins at t = 27 with the stimulus 3; at t = 30 the stimulus 2 was tried and found to be sufficient, and at t = 33 R1 occurred when the smallest stimulus, 1, was used. At the same time, the need for approval diminishes: at t = 27 and t = 29, at = 2, but at = 0 at t = 31 and at t = 32, and R1 is nevertheless still initiated by the small stimulus 1 at t = 33. From t = 34 onwards R1 is discouraged. It disappears for the stimulus 1 at t = 37; at t = 38 even the stimulus 2 will not initiate it, but at t = 39 the stimulus 3 brings it back. One last sharp disapproval (-4) finally inhibits it from t = 40 onwards. Note how at t = 42 a premature attempt to reduce the stimulus from 3 to 2 produced no response at all. The same procedure was then repeated with R4.
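The training protocol itself is simple enough to sketch as a loop. The fragment below is only an illustration of the experimenter's procedure described above, written against an invented black-box interface (respond and reward are hypothetical names, not part of the EDSAC program); it approves one target response, disapproves the others, and probes with progressively weaker stimuli:

def train_toward(machine, target=1, steps=20):
    # `machine` is a hypothetical black box with two calls:
    #   machine.respond(s) -> index i of the printed response, or None for an X
    #   machine.reward(a)  -> deliver the approval/disapproval number a_t
    stimulus = 3
    for _ in range(steps):
        i = machine.respond(stimulus)
        if i is None:                        # too weak to initiate any response
            stimulus = min(stimulus + 1, 5)
            continue
        if i == target:
            machine.reward(+2)               # approval
            stimulus = max(stimulus - 1, 1)  # try a progressively weaker stimulus
        else:
            machine.reward(-1)               # disapproval of every other response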
3. The Response Learning Program

To control the occurrence of responses, a threshold state number Si, Si ≥ 0, is associated with each response Ri, for every i. The set of threshold state numbers is held in the response learning machine's number store, and columns 4 through 8 of Table 1 display the threshold state numbers Sit at time t.

When a stimulus st is introduced at time t, the threshold state number store is scanned until the first largest Sit, say Sjt, is found. The machine then forms Sjt + st and tests for the condition Sjt + st ≥ T. T is a fixed number, called the triggering threshold, 7 in the present experiments; a response Rj can occur at time t only if the relation Sjt + st ≥ T is satisfied. When Sjt + st ≥ T, the scanning proceeds to find and count those Skt satisfying the relation Skt = Sjt, k ≠ j. When no such Skt exists, Rj occurs and j is printed in column 2; when two or more responses compete with each other in this way, the winning response is selected at random. When the relation Sjt + st ≥ T is not satisfied, X is printed in column 2, and the machine prepares to receive the next stimulus, st+1.
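In present-day notation the scanning and triggering rule just described can be sketched as follows (a reconstruction for illustration, not the EDSAC routine itself; the function name is mine):

import random

T = 7    # the triggering threshold used in the present experiments

def select_response(S, s):
    # S is the list of threshold state numbers S_1t ... S_5t, s the stimulus s_t.
    # Returns the 1-based index j of the fired response, or None (an X in the tables).
    largest = max(S)                        # the first largest S_jt found by the scan
    if largest + s < T:                     # relation S_jt + s_t >= T not satisfied
        return None
    competitors = [j for j, Sj in enumerate(S) if Sj == largest]
    return random.choice(competitors) + 1   # competing responses are decided at random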
In Table 2, columns 1, 2, and 3 are identical with the corresponding columns of Table 1 for 1 ≤ t ≤ 15. At t = 17, R1 and R3 were in competition, and R3 won. At t = 20 the scanning proceeded until S3,20 was found; S3,20 + s20 = 5 + 3 = 8 ≥ 7, hence R3 occurred. At t = 21 the stimulus was reduced to 1 to test how strong a habit R3 had become by that time; as the stimulus 1 produced no response, st = 3 was used again at t = 22, and R3 reappeared.
The selection of a response by the procedure just outlined relies essentially on the use of conditional orders. Such orders enable the machine to choose among several alternative courses on the basis of the results of earlier operations. Naturally, the programmer must foresee the need for a choice, and provide conditional orders to meet this need, but the actual decision is made by the machine itself, on the basis of information obtained in the course of operation either from its own store or, by means of the input mechanism, from the outside world.

After a response Ri has occurred, the machine signals to the experimenter and asks for approval. The experimenter then introduces a number at of appropriate magnitude and sign into the machine. Given at, the machine proceeds to form Si,t+1 from Sit, for all i. When Ri has occurred at time t,

Si,t+1 = Sit + at + 1 + Nit - Ni,t-1 - d(Sit, t);

the terms of this expression will be described in turn.

By increasing, decreasing, or leaving Sit constant, the addition of the factor at correspondingly modifies the probability that Si,t+1 > Sj,t+1, i ≠ j, and hence the probability that the scanning process will stop at Si,t+1. Adding unity to the threshold state of the response which has been initiated at time t increases the probability that this response will occur again at a time t' > t. This device accounts for the habit-forming effect described earlier. It is this effect which, together with the chance selection of R3 at t = 17, accounts for the machine's delayed learning of the response R1: attempts to teach this response were begun at t = 16, but were unsuccessful until t = 29, after R3 had been effectively discouraged.

Nit and Ni,t-1 are both pseudo-random numbers.* In each interval (t, t+1) a pseudo-random number Nt, -5 ≤ Nt ≤ 5, is added to one Sjt selected at random, and Nt-1 is subtracted from the Skt (k ≠ j, or k = j) to which it had been added in the interval (t-1, t). These random fluctuations are superimposed on the average level of the threshold states. Because of them the machine can make mistakes, that is, it can occasionally make a response other than the one it has been taught to make. More important, provided that no Si is excessively large, each Si has a reasonable probability of being greater than the others at some time. This makes possible the teaching of a new response, or, when no response is favored over the others, an interesting variety of responses.

*"Pseudo-random" numbers good enough for these experiments are generated by squaring a certain constant and selecting a number of digits from the result. The middle digits of the square then serve as a new constant.
The last factor, d(Sit, t), produces a decay trend of all threshold states toward 1. d is different from zero only in the intervals between t = 5n and t = 5n+1, where n = 0, 1, 2, .... In these intervals d = +1 when Sit > 1, d = -1 when Sit ≤ 0, and d = 0 when Sit = 1. The effect of d > 0 is self-explanatory; the negative decay provided when Sit ≤ 0 serves the purely practical purpose of preventing the "death" of the response learning machine. Were all Si to become so strongly negative that Sit + st < T for all i, given the most favorable random effect and stimulus, no further response could be elicited from the machine. The use of negative decay effectively makes 1 the average minimum level of the Si. As illustrated in Table 2, the decay introduces some lethargy into the behavior of the machine by causing all Si to drop, hence requiring ever-increasing stimuli to produce a response.

The effect of decay and of the random variations on the threshold states can best be observed in Table 2, when t > 15. For all those Rj which did not occur at time t,

Sj,t+1 = Sjt + Njt - Nj,t-1 - d(Sjt, t).

In this expression at and the habit-forming term 1 do not appear. The essential features of the preceding description are thus summarized by the three relations which govern the operation of the response learning machine:

1. To initiate a response at time t: Sjt + st ≥ T for some j;
2. Where Ri has occurred at time t: Si,t+1 = Sit + at + 1 + Nit - Ni,t-1 - d(Sit, t);
3. Where Ri has not occurred at time t: Si,t+1 = Sit + Nit - Ni,t-1 - d(Sit, t).
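The three relations translate directly into a short update step. The sketch below is a modern reconstruction for illustration (the function names and the list representation of the Nit are mine); one pseudo-random increment is added to a single threshold state in each interval and withdrawn in the next, as described above:

import random

def fresh_noise(n):
    # In each interval one S_jt, chosen at random, receives a value -5 <= N_t <= 5.
    N = [0] * n
    N[random.randrange(n)] = random.randint(-5, 5)
    return N

def decay(S_i, t):
    # d(S_it, t): non-zero only in the intervals between t = 5n and t = 5n+1.
    if t % 5 != 0:
        return 0
    if S_i > 1:
        return 1       # ordinary decay, pulling the threshold state toward 1
    if S_i <= 0:
        return -1      # negative decay, preventing the "death" of the machine
    return 0           # S_i = 1

def update_thresholds(S, fired, a, t, N, N_prev):
    # Relations 2 and 3: `fired` is the index of the response that occurred
    # (or None), a is the approval a_t, and N, N_prev hold the pseudo-random
    # increments for the intervals (t, t+1) and (t-1, t) respectively.
    S_next = []
    for i, S_i in enumerate(S):
        change = N[i] - N_prev[i] - decay(S_i, t)
        if i == fired:
            change += a + 1    # the approval term and the habit-forming unity
        S_next.append(S_i + change)
    return S_next

Together with select_response above, repeated application of update_thresholds should reproduce the qualitative behavior recorded in Tables 1 and 2.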
It must be remembered that, while he is training the response learning machine, the experimenter does not know the threshold state numbers, and must rely on his recollection of his own past actions and of the responses made to them. This is the data given in the first three columns of Table 1. The behavior pattern of the response learning machine is sufficiently complex to provide a difficult task for an observer required to discover the mechanism by which the behavior of the machine is determined. By examining the data of the first three columns of Table 1 such an observer could easily find regularities in the response pattern, and he might even develop empirical rules for predicting responses with tolerable accuracy. He would find it very difficult by this means to obtain a good approximation to the description of the response learning program given above. Switching the machine off to dissect it would be of limited value only, since this action would make the response learning program vanish from the EDSAC's store.

4. The Digital Computer as a Model

The readiness with which the EDSAC, and other digital computers, can play different roles at a moment's notice is one of their important properties. When the role a digital computer is called upon to play is markedly different from those its designers had in mind, a necessarily large proportion of the orders in a program must be devoted to specifying this role. In the response learning program, for example, an aggregate of elementary EDSAC orders is required to initiate the firing of a response, since this is an operation which does not correspond to any single order. This drawback is balanced by some important advantages. Given the EDSAC, the only additional equipment required to turn it into a learning machine is the length of teleprinter tape which holds the learning program before it is introduced into the machine, and changes in learning machine design and the rectification of errors require at most the preparation of a new program tape. Only a few seconds of input time are required to turn the EDSAC from an orthodox computing machine into an experimental learning device. For these reasons, a flexible digital computer could serve with advantage as a proving ground for a wide variety of models.
Acknowledgments

The author wishes to thank Dr. M. V. Wilkes for suggesting this study. He is indebted to Dr. M. V. Wilkes, Mr. J. W. S. Pringle, Mr. S. Gill, and Mr. M. F. C. Woollett for many valuable suggestions, and to Dr. J. C. P. Miller for helpful advice. This paper is part of a study carried out while the author was the holder of a Henry Fellowship.

References

Oettinger, A. G. (in press), "Programming a Digital Computer to Learn," Phil. Mag.
Turing, A. M., 1950, "Computing Machinery and Intelligence," Mind, 59, 433-460.
Wilkes, M. V., 1951, "Can Machines Think?", Spectator, No. 6424, 177-178.
Wilkes, M. V., Wheeler, D. J., and Gill, S., 1951, "The Preparation of Programs for an Electronic Digital Computer," Cambridge, Mass., Addison-Wesley.
Table 1. A typical response learning experiment (columns 1-8, tabulated against t: st, Rit, at, S1t, S2t, S3t, S4t, S5t).
Table 2. The effect of decay and random fluctuations on the threshold state numbers (columns 1-8, tabulated against t: st, Rit, at, S1t, S2t, S3t, S4t, S5t).
AN ANALYSIS BY ARITHMETICAL METHODS OF A CALCULATING NETWORK WITH FEEDBACK

By L. C. Robbins
Burroughs Adding Machine Company, Philadelphia

At the Wayne University meeting of this Association in the spring of 1951, George W. Patterson presented a paper on Reversing Digit Number Systems. Although these systems can be stated with respect to any base, we will be concerned only with the special case of base 2. The reversing representation for base 2 is also called the Gray code, the symmetric binary code, and the reflected binary code. It appears like this:

Decimal  Reversing  Normal
0        0000       0000
1        0001       0001
2        0011       0010
3        0010       0011
4        0110       0100
5        0111       0101
6        0101       0110
7        0100       0111
8        1100       1000
9        1101       1001
10       1111       1010

where the normal base 2 representation has been shown for comparison.
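For base 2 the conversion between the two representations can be stated very compactly; the sketch below uses the standard rule for the binary-reflected Gray code (the reversing form of n is n XOR n shifted right by one place), which agrees with the table above:

def to_reversing(n):
    # normal binary -> reversing (Gray) representation
    return n ^ (n >> 1)

def from_reversing(g):
    # reversing (Gray) representation -> normal binary
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# e.g. format(to_reversing(6), '04b') yields '0101', the entry shown for decimal 6.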
Note that for