English Pages 340 [352] Year 2019
Modern Communications
By THOMAS H. CROWLEY, GERARD G. HARRIS, STEWART E. MILLER, JOHN R. PIERCE, JOHN P. RUNYON
COLUMBIA UNIVERSITY PRESS
NEW YORK AND LONDON 1962
Copyright © 1962 Columbia University Press Library of Congress Catalog Card Number: 62-18618 Manufactured in the United States of America
Foreword
THE POSITION of man on the evolutionary ladder is, in part, described by the range and complexity of the stimuli that the human organism is capable of receiving, interpreting, and communicating to his fellows. It seems unlikely that man's ability to generate and communicate ideas is rapidly changing, except in so far as each generation is heir to a larger cultural heritage and to a larger range of devices that may extend the range and power of his senses. The degree of elaboration of the processes of communication among men is an indication of the state of development of civilization. The growth of language with an ever-increasing capacity to describe the knowledge that man may achieve of the world about him was perhaps the most important ingredient in man's social and cultural evolution. The development of writing, not only in the form in which we think of it today but also in the more primitive forms which have yielded much knowledge about civilizations millennia old, greatly expanded man's ability to communicate with others at other places and also at other times; man's sense of having a past stems from the written record. The invention of printing greatly expanded the dimensions of the group with which a single individual could communicate. The rise of mathematics rendered much more efficient the communication of certain kinds of information. The increased possibility and speed of travel have, as an important social consequence, an increase in the ease and effectiveness of communication. One of the great triumphs of science and technology has been the development, within little more than a century, of virtually instantaneous communication by electrical methods, between
almost any two points on the globe. From the first commercial installation of a telegraph line in 1844 to the transmission by telephone of the first complete spoken sentence, "Mr. Watson, come here, I want you," took thirty-two years. Twenty-two more years were to elapse before the first paid radio message was transmitted in 1898. Since then the milestones in the development of electrical and electronic techniques of communication have been separated by ever-decreasing intervals. Long before a striking innovation in technique has become a commonplace in the communications system, other innovations appear on the horizon. Sheer inventiveness could not, of itself, have brought our communications system to its present excellence. Coupled to inventiveness and to imaginative engineering was a determined effort, often highly theoretical, to understand the nature of communication, the qualities in man that lead to the ability and need to communicate, and the basic laws of nature that govern all communications systems. The telephone and the arts and techniques that contribute to its success as one of the devices that has formed the quality of contemporary life are, of course, only a part of the modern system of communications. Nevertheless, an understanding of the basic principles of telephony contributes to an understanding of almost all the other means of communications. The development of the telephone system is a fascinating case history of the interplay between inventiveness, imagination, theoretical analysis, and even philosophic insight. It is a case history that is typical of a great many of the profound technological and scientific developments that have changed the dimensions and the nature of the world that man inhabits. No matter what excellence we may see in the modern telephone system, it is certain that further improvements will be made.
One of the world's outstanding industrial research laboratories, Bell Telephone Laboratories, is dedicated to the exploration of ways and means of increasing the effectiveness of the telephone system. This encompasses a great spectrum of activities ranging from studies of the rotting of telephone poles
to research in the behavior of metals in the superconducting state, that very remarkable state at very low temperatures in which the electrical resistance of the metal is zero. From such work will come a system of increased reliability, convenience, and speed, and perhaps a system based on wholly novel principles.
June, 1962
POLYKARP KUSCH
Preface
THIS BOOK is intended to describe the principles of communication technology in a way which will make them easily understood by readers whose training is in other fields. It is the outgrowth of notes prepared for a course designed to provide teachers of basic science with a concise fund of up-to-date background information which is otherwise widely scattered. It may also be of interest to trained engineers and scientists who have to do with the subject matter of one or two of our chapters and are curious to see how their specialty plays its role in communication systems. Communication technology is a branch of electrical engineering. The electrical engineer who earned his degree a generation ago will find the following chapters quite surprising. It will seem to him that about half of the book has nothing to do with electrical engineering, or has at best a remote connection with it. On the other hand, a young electrical engineer with the ink fresh on his diploma may not realize the extraordinary fecundity of his subject. Indeed, an organized understanding of a major part of the book's content has been brought about only during the past generation. The pace of change is so great that it behooves us from time to time to try to explain ourselves and our works in words which will speak to a broad circle of interested but unspecialized persons.
The authors all work in the telephone industry. Naturally we have drawn the great majority of our examples from this field. This may lend a somewhat parochial air to the book. To us "communication" is the sending of a signal from one point to another, usually in a way which permits a two-way "conversation" to take place. We have little knowledge of the
technical problems of those parts of the communications industry where a one-way message is "broadcast" to a large number of recipients, so we do not speak at all of newspapers or advertising, and only to a limited extent of radio and television broadcasting. The book has been written by a committee, which makes it modern in a second sense. None of us thinks that this is the best way to write a book, but in this case it was the only way it could be done in the time available. Although this has led to some unevenness, we offer the book in the belief that it will be useful. We would be remiss if we did not acknowledge the debts we owe our colleagues, at the Bell Telephone Laboratories and elsewhere, who in recent years have created the subjects which we here summarize. We would also like to thank all those who helped so substantially in the preparation of the manuscript. Chapters 13-15 are based on sections of Symbols, Signals and Noise, by J. R. Pierce (New York, Harper & Brothers, 1961), and include material from that book.
June, 1962
THE AUTHORS
Contents
Foreword by Polykarp Kusch
Preface
1. Introduction and Orientation
2. Speech Communication
3. Speech and Other Signals in the Telephone System
4. Modulation Theory
5. Pulse Modulation
6. Multiplex System
7. Transmission Media
8. Amplification and Signal Generation
9. Transmission Systems
10. Trunking and Switching
11. Interconnecting Networks and Trunking Plans
12. Central Office Control
13. Communication Theory
14. The Noisy Channel
15. Continuous Signals and Channels
Index
1
Introduction and Orientation
THIS BOOK will discuss some of the knowledge and technical achievements necessary to produce the means of modern communication. We are all familiar with the fact that the ability to communicate rapidly and conveniently has greatly affected our lives, but by and large we are unfamiliar with the technology of modern communication systems. This technology is intimately connected to the telephone system. The invention of the telephone satisfied a need for person-to-person communication, as evidenced by the immense growth of the telephone system and its complex technology. Therefore, it is no wonder that most of the problems found in modern communications are also those which have had to be solved in the telephone system, and that a discussion of the telephone system will provide a useful framework for the subject of modern communications. It will be helpful to start with an over-all view of the functions that are present in any communication system. Shannon, who contributed greatly to communication theory, has provided a workable description which we shall adopt here.¹ In this description every communication system consists of five functional units:
Source, the originator of a message which has to be communicated;
Transmitter, that unit which accepts the message from the source and converts it to a form suitable for transmission on the channel;
Channel, the link through which the message travels;
Receiver, the unit that accepts the message from the channel and reproduces it in a form comprehensible to the destination;
Destination, the unit to which the message is to be communicated.
¹ C. E. Shannon, The Mathematical Theory of Communication (Urbana, Illinois, University of Illinois Press, 1948).
[Fig. 1.1. Functional units: SOURCE, TRANSMITTER, CHANNEL, RECEIVER, DESTINATION]
This description is simple and logical. Figure 1.1 gives some examples of this classification. In a simple telephone conversation [Fig. 1.1(a)] the speaker is the source. The transmitter takes the sound-wave message and changes it into an electrical signal. The channel may be considered as the electrical connection between the transmitter and receiver. It can be simply two wires or it can also include many switches, telephone exchanges, or even the whole transatlantic cable. The receiver changes the electrical signals back into sound waves which are heard by the destination or listener. Figure 1.1(b) shows the same type of classification for a television program. In this example the channel is taken as that section of the radio spectrum allotted to the station. The particular way in which a communication system is divided into these functional units will depend upon our viewpoint. Figure 1.1(c) illustrates an example in which the mouth acts as the transmitter, the air as the channel, and the ear as the receiver. In this case, the brain of the talker would be the source and the brain of the listener the destination. One classification may be useful at one time, another at another time. We shall be discussing some of the physical properties of communication, not the meaning behind what is being communicated. For our purposes, then, it will be useful to consider the mouth as the source of the message and the ear as the destination. It is easy to see from this picture that the properties of the mouth-ear combination determine many of the requirements of a telephone network. For instance, the network needs to be capable of transmitting only those properties of speech which are important for perception. The exact boundary between different functions is arbitrary. This is particularly true for the concept of a channel. At one time we may regard the whole link between a telephone transmitter and receiver as the channel.
At another time it may be useful to regard only part of that link as the channel, as when the transatlantic cable is viewed as a channel and the complicated equipment at either end is viewed as the transmitter and receiver. The objective of any communication system is to transmit messages correctly and as quickly as possible. Shannon has
shown that every channel has a finite capacity. To show this he first defined that which is being communicated and called it information. Its precise mathematical definition will be considered in Chapter 13. Next, Shannon showed that the capacity of a communication channel can be defined as the maximum rate at which information can be sent over the channel without error. If we try to communicate at a rate faster than this we are certain to make errors. If we communicate at a rate slower than the channel capacity we may still make errors, but it is theoretically possible to send the message error-free. To send a message error-free requires sophisticated procedures and, from a practical standpoint, the goal of error-free transmission of information can only be approached. Nonetheless, the viewpoint of communication theory can be very useful in providing limits for what can and cannot be done with a given type of communication problem. This over-all view of the communication problem will be taken up in Chapters 13-15, after we have learned about the contents of a communication system in some detail. The important point for us to grasp here is that in all communication systems there are limits to the rates at which information can be communicated. We wish to understand why these limits exist because they are basic to problems in communication. What are they? A simple illustration can be given in terms of a transatlantic telegraph line. The requirements of early telegraph lines were relatively simple by today's standards. An electric current was used to represent dots or dashes and the absence of current denoted a space. The current was turned on three times longer for a dash than a dot. Sequences of dots and dashes were used to represent letters through the Morse code. The speed of the early telegraph systems was limited mainly by the speed of the human operators.
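The keying scheme just described can be sketched in a few lines of Python. This is only an illustration: the four-letter table is a small excerpt of the Morse code, and the one-unit gap between the elements of a letter is an assumption, since the text says only that the absence of current denotes a space. The names `keying_sequence` and `total_units` are hypothetical.

```python
# Small excerpt of the Morse code, for illustration only.
MORSE = {"E": ".", "T": "-", "S": "...", "O": "---"}

DOT_UNITS = 1   # duration of a dot, in arbitrary time units
DASH_UNITS = 3  # from the text: current is on three times longer for a dash
GAP_UNITS = 1   # assumption: one unit of "no current" between elements


def keying_sequence(letter):
    """Return a list of (current_on, duration_units) pairs for one letter."""
    sequence = []
    for index, mark in enumerate(MORSE[letter]):
        if index > 0:
            sequence.append((False, GAP_UNITS))  # space: no current
        duration = DASH_UNITS if mark == "-" else DOT_UNITS
        sequence.append((True, duration))        # dot or dash: current on
    return sequence


def total_units(letter):
    """Total time, in units, needed to key one letter."""
    return sum(duration for _, duration in keying_sequence(letter))


if __name__ == "__main__":
    for letter in "ETSO":
        print(letter, keying_sequence(letter), total_units(letter))
```

Under these assumptions a letter made of dashes occupies the line roughly twice as long as one made of the same number of dots, so the time per letter depends on the code as well as on the operator.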
That the telegraph line itself could be a limitation became apparent with the laying of the first serviceable transatlantic telegraph cable in 1866. Lord Kelvin correctly appreciated the factors which limited the rate at which a message could be sent.
He found that the time required for any electrical operation such as turning a current on or off was proportional to the quantity $RCl^2$, where $R$ and $C$ are the resistance and capacitance per unit length of the cable and $l$ is the length from transmitter to receiver. The central wire of the cable has a certain capacitance to ground which has to be charged and discharged through the resistance of the cable itself. If a voltage is put on the cable at one end, a voltage does not immediately appear at the other end of the cable. It increases from zero to the maximum value as the capacity of the cable is charged up. If the voltage at the transmitting end is impressed and then taken away in too short a time, no detectable voltage change will occur at the receiver. The word detectable in the foregoing is important. You might imagine that there would always be some voltage change, but in actuality no signal is detected unless it is larger than a certain threshold. This threshold is determined both by the sensitivity of the receiving apparatus and by the magnitude of the spurious voltage fluctuations which always occur on any real communication link. These fluctuations are termed noise. If the voltage fluctuation at the receiving end due to the signal is sufficiently less than that due to noise, the signal will pass undetected. Figure 1.2 illustrates this. The curves represent the transmitted and received voltages plotted as a function of time. At the transmitting end the voltage is turned on. At the receiving end the voltage begins to rise, comes to a level which is less than the transmitted voltage, and, when the transmitted voltage is switched off, the received voltage slowly drops to zero. The voltage fluctuations due to noise are added on to the received signal. If the transmitted voltage is turned on and off too fast, the received signal does not have a chance to rise to a great enough value to be detected through the noise.
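A rough numerical sketch may make the two ideas above concrete: the $RCl^2$ scaling and the detection threshold. The Python below is a simplification under stated assumptions. It replaces the real cable, whose resistance and capacitance are distributed along its length, by a single lumped charging curve, and it ignores the leakage that keeps the received level below the transmitted one. The function names and the numbers used are hypothetical.

```python
import math


def kelvin_time_scale(r_per_unit_length, c_per_unit_length, length):
    """Quantity R*C*l**2 to which the signalling time is proportional."""
    return r_per_unit_length * c_per_unit_length * length ** 2


def received_peak(pulse_duration, time_constant, transmitted_voltage=1.0):
    """Peak received voltage for one on-off pulse.

    Assumption: the cable charges like a single lumped RC circuit, so the
    received voltage rises as 1 - exp(-t / time_constant) while the
    transmitted voltage is on.
    """
    rise = 1.0 - math.exp(-pulse_duration / time_constant)
    return transmitted_voltage * rise


def is_detected(pulse_duration, time_constant, threshold):
    """True if the received peak exceeds the detection threshold."""
    return received_peak(pulse_duration, time_constant) > threshold


if __name__ == "__main__":
    # Doubling the cable length quadruples the time scale.
    short = kelvin_time_scale(2.0, 3.0, 10.0)
    long_ = kelvin_time_scale(2.0, 3.0, 20.0)
    print(long_ / short)

    # A pulse much shorter than the time constant stays under the threshold.
    print(is_detected(pulse_duration=0.05, time_constant=1.0, threshold=0.1))
    print(is_detected(pulse_duration=1.0, time_constant=1.0, threshold=0.1))
```

The sketch shows why keying faster eventually fails: shortening the pulse lowers the received peak until it falls below whatever threshold the noise and the receiving apparatus impose.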
There are two factors here which limit the rate of communication: There is always noise on the channel. The received signal must be larger than the spurious fluctuations due to noise.
The received signal is always attenuated, i.e., the received signal is always less than the transmitted signal. If the signal is attenuated too much it will not be detected through the noise.
[Fig. 1.2. Schematic of signals from a transatlantic telegraph cable. Curves plotted against time: transmitted signal, received signal without noise, received signal with noise; regions marked "signal detected" and "signal not detected"]
We have mentioned one cause of this attenuation, i.e., the resistance and capacitance of the transatlantic cable. There can be other causes of attenuation. For instance, there is always some leakage resistance in an undersea cable which will cause even a steady current to be less at the receiving end than at the transmitting end. Because of the resistance and capacitance in the telegraph cable the received signal does not have the same form as the transmitted signal. This is shown in Fig. 1.2. This type of distortion can be shown to be another form of attenuation. In one form or another the limitations due to noise, the finite rate of signal change, distortion, and attenuation are present in all communication channels. The study of ways of
dealing with these limitations will form a major portion of this book. The concept of noise is important in communication theory. In general, noise can be considered as any unwanted sound or signal. It usually interferes with comprehension of the message and, unfortunately, it is often of such a nature that it cannot be removed from the signal. There are many different types of noise. The most basic type is thermal noise. This type occurs everywhere because of thermal fluctuations and can never be eliminated. Thermal noise, sometimes called white noise, will be present at every point in the circuit where there is resistance. However, it can be reduced by lowering temperatures. Impulse noise occurs with uncontrolled spasmodic surges of signal. The sudden closing of a switch can cause a surge of current which is heard on a telephone line as a click, or as static. In contrast to impulse noise, thermal noise sounds like the center of a waterfall or heavy rain on a tin roof. Impulse noise can also be said to sound like occasional raindrops on a tin roof. Under proper conditions the flow of individual electrons in a vacuum tube can be detected as impulse noise. Crosstalk is another type of noise. The wires of two separate telephone conversations may come close together and the electrical currents on one line can induce voltages in the other line. These induced voltages may be heard as a faint conversation or garble in the background. Crosstalk is not limited to the telephone system; the term can be used to describe any induced, unwanted signals. We have talked about distortion, which is the altering of a signal. Distortion is not always noise, but the addition of a noise is always distortion. One cannot correct for noise; for some distortions one can. For instance, suppose an amplifier distorts a signal by amplifying the low frequencies more than the high frequencies.
It is possible to add another amplifier which amplifies the high frequencies more than the low frequencies. The net result of both amplifiers is an amplified, undistorted signal. Noise changes the signal in an unpredictable
way, so that, by its very nature, once it is present in the signal it is difficult to eliminate. Thus, the problem with noise is to see that it does not become mixed with the signal in the first place, or at least to keep it as small as possible. We shall see in the chapter on multiplexing that when many signals are mixed together it is very important to have no distortion present. In multiplexing, signals are mixed together, transmitted as one signal, and separated into the original signals at the other end. If distortion is present a complete separation cannot be effected. Noise is a major problem in communications because it is always present and always unwanted. We need some means of measuring the effect of noise. A quantity which has been found to be very useful is the logarithm of the signal-to-noise ratio. The signal-to-noise ratio is the signal power $S$ divided by the noise power $N$. The signal-to-noise ratio, $S/N$, is usually measured in decibels. The signal-to-noise ratio in decibels (db) is given by $10 \log_{10}(S/N)$. A signal-to-noise ratio of 10 db means that the signal power is 10 times the noise power. A signal-to-noise ratio of 20 db means that the signal power is 100 times the noise power and a ratio of -3 db means that the signal power is half the noise power. The example of the transatlantic telegraph has served to illustrate how noise and attenuation limit the rate at which information can be communicated. Throughout the book we shall examine ways of dealing with the problems caused by these factors. Of course, there exist other problems in the telephone system, which are caused by its immense size, rapid rate of growth, and complexity. As the telephone system began to develop, an immediate problem was the connecting of the different telephones. This was done by a human operator. A line from each telephone terminated on a board in front of the operator, who was able
to connect any line with any other line. Such a procedure was fine as long as the number of telephones remained small, but problems arose as the number of telephones increased. An operator's arm is only so long and it will span only a limited number of line connections. The next step was to have two or more operators, each one with a number of phones and also a number of lines interconnecting the two operator positions. It is clear that the number of lines between the operators need not be as large as the number of phones for each operator, as it would be highly unlikely that all the phones of one operator would wish to call all the phones of the other operator at the same time. In order to decide how many lines there should be it is necessary to know some of the statistics of telephone calls, such as the number of calls made, when made, for how long, etc. The function performed by the operators was gradually taken over by electromechanical switches. At present, even with switches, the same problem of growth occurs. Since every telephone has to be able to be linked with every other telephone, the number of links which must be provided increases as the square of the number of telephones. If there are $N$ phones, the number of different calls possible is $N(N-1)/2$; yet, if there were a link between every two phones, the equipment needed for any person to call any other person would be unfeasible. Therefore, a different principle is used: that of common equipment. A number of links are shared by many phones and are assigned to a specific phone only when needed. Such a system is possible because each phone uses the equipment only a small portion of the time. The problems of switching, switching systems, and traffic will be discussed in Chapters 10 through 12. With the increase in the number of phones came an increase in the distance over which messages were transmitted.
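The growth of the pairwise link count $N(N-1)/2$ discussed above is easy to tabulate. The short Python sketch below does only that; the function name `direct_links` is hypothetical and the sample values of $N$ are chosen purely for illustration.

```python
def direct_links(n_phones):
    """Number of distinct pairs among n_phones telephones: N(N - 1) / 2."""
    return n_phones * (n_phones - 1) // 2


if __name__ == "__main__":
    # Each tenfold increase in phones gives roughly a hundredfold
    # increase in the number of dedicated links that would be needed.
    for n in (2, 10, 100, 1000):
        print(n, direct_links(n))
```

Ten phones would need 45 dedicated links and a thousand phones 499,500, which is the growth that makes shared, common equipment attractive.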
Because the signals were always attenuated and distorted, some means of signal amplification and distortion correction became necessary. To perform these functions repeaters were developed for insertion into the telephone lines at appropriate positions.²
² A repeater is a telephone system term for a device which not only amplifies the signal but also corrects for distortion.
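Since amplification matters only in so far as it keeps the signal above the noise, it may help to check the decibel definition of signal-to-noise ratio given earlier in this chapter against its three quoted examples. The Python sketch below is a minimal illustration; the function names are hypothetical.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: 10 * log10(S / N)."""
    return 10.0 * math.log10(signal_power / noise_power)


def power_ratio_from_db(decibels):
    """Inverse relation: the power ratio S / N for a given value in db."""
    return 10.0 ** (decibels / 10.0)


if __name__ == "__main__":
    print(snr_db(10.0, 1.0))          # signal power 10 times the noise power
    print(snr_db(100.0, 1.0))         # signal power 100 times the noise power
    print(snr_db(1.0, 2.0))           # signal power half the noise power
    print(power_ratio_from_db(-3.0))  # close to, but not exactly, one half
```

The last two lines show that "-3 db" for a halving of power is a convenient rounding: half power is about -3.01 db, and -3 db corresponds to a power ratio of about 0.501.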
On one cross-country telephone channel there may be up to 750 repeaters, or one every 4 miles. Because of this large number of repeaters, each repeater must amplify and transmit the signal with practically no distortion so that the total distortion effect of many repeaters is not noticeable. Moreover, the types of electronic circuits which are used have to be very stable and nondefective. Thus, the subject of signal amplification and distortion correction becomes important. Traditionally, phones have been connected by means of a circuit consisting of a pair of copper wires. This is efficient for short distances, and also for long distances if repeaters are used, but it was found to be very inefficient if a separate circuit was used for each cross-country conversation. Consequently, as many conversations as possible were put on a single circuit. This meant that some way of combining several conversations at one end of the circuit and separating them at the other had to be developed. Such methods are grouped under the general heading of multiplexing. We will see that there are three main categories of multiplexing: space-division multiplexing, which means that the different conversations are separated in space (this is just a fancy name for separate wires or physically separate channels); time-division multiplexing, which means that the different conversations are allotted different segments of time; and frequency-division multiplexing. The exact significance of these three types of multiplexing will not be clear until they have been discussed in detail. It should be evident that multiplexing techniques play an important role in modern communications. For instance, television signals and telephone conversations are carried across the country on a system consisting of microwave relay channels. In such a system, microwave radio signals are beamed from the transmitter to a receiver about 25 miles away.
The signal is then amplified and retransmitted to another receiver 25 miles further along the route, and so on across the country. One microwave relay channel can handle as many as 2000 simultaneous telephone conversations, or a television program in place of
1000 telephone conversations. This requires extensive multiplexing equipment at each end of the country, but generally it is far more economical than providing individual repeaters for each telephone circuit. The most common circuits for carrying conversations consist of pairs of wires. Although they are good for carrying conversations over short distances they are not adequate over long distances. Other transmission media have been developed. The microwave system mentioned above is one. Coaxial cables are also used. Perhaps in the future waveguides and microwaves relayed by satellites will come into use. Each transmission medium has its own set of advantages and disadvantages which make it the most appropriate choice for a particular application. Since the present telephone system was developed to transmit the human voice in the form of electrical signals, much of the telephone system was designed around the properties of the voice and of conversation. But modern communication is the transmission of electrical signals, and there are important signals other than voice, as in television, for example. The transmission of data is another important use of a communication system, and its main requirements are speed and accuracy. The properties of these signals are quite different from those of voice signals. The question can be asked: What form of electrical signals is best for sending each type of source? Thus, the study of what type of encoding is used in a communication system becomes essential. We shall see that different types of encoding are appropriate for different types of transmission media. As a final question we can ask: How does one plan and design a vast and complicated collection of interconnected equipment such as the telephone network? Not only must all the separate parts function by themselves, but they must also function properly when interconnected. Many factors enter into the design of a system.
Standards of performance must be attained, the quality of transmission must be satisfactory, the equipment must be reliable, and it must not be too expensive. There may
be many ways of setting up a communication system to perform a specified job. Finding the best way is a formidable task; this is the function of systems analysis. In closing this introduction a word of warning is appropriate. We are going to delve into the details of communications systems and in doing so there is danger of losing an over-all perspective. Somewhere in the back of our minds we should maintain an awareness of the function of communication. An analogy can be made between the relationship of the human being and the human nervous system on the one hand, and a civilization and its means of communication on the other hand. The human nervous system is unbelievably complicated. Its function is to integrate the parts of the body into one whole. The nervous system is not concerned with what the body does. It is not important in itself, but only as a tool to the human being. Similarly, the means of communication, fascinating though they are, are not important in themselves. They are only important as a tool to civilization. The important question concerns the use to which they will be put. Unfortunately, this vital question cannot be answered by technology.
2
Speech Communication
SPEECH COMMUNICATION is made physically possible by a complicated and interrelated pair of organs: the mouth and vocal tract, which act in combination as a transmitting apparatus, and the ear, which acts as a receiving apparatus. In a very real sense the telephone system is merely an extension in space of the distance between the mouth and ear. Much of the design of telephone and radio systems has been made with this fact implicitly in mind. However, communication systems are not exclusively designed for speech. There is also music for the ear and the whole world for the eye. Communication systems are not even designed exclusively for human beings. Communication between machine and machine is becoming increasingly important. A system consisting of a machine on each end of a communication link has vastly different requirements from a system with a human being on either end. But even though speech communication is not by any means the whole of the subject, it is sufficiently important to deserve considerable attention. There is another reason why the study of speech communication will be useful. The proper understanding of such a highly technical phenomenon as a communication system requires a considerable degree of sophistication. There are a number of mathematical techniques and new points of view which must be presented. It is not the purpose of this discussion to be mathematically elaborate, but such concepts as Fourier analysis, signal representation, modulation, and noise, to name a few, must be introduced. The study of speech communication will require these concepts and will provide a convenient basis for their understanding.
[Fig. 2.1. X ray of male vocal tract. Labels: NASAL CAVITY, VOCAL CORDS, TRACHEA]
THE MOUTH AS A MESSAGE SOURCE
The vocal tract (Fig. 2.1) operates in the following manner. Air from the lungs comes up through the trachea and passes through a constriction called the glottis, which is formed by
the space between the vocal cords. The air continues up through the vocal tract, consisting of the throat and mouth, and then flows out through the mouth. Two major types of sounds can be produced: voiced sounds and unvoiced sounds. The vowels a, e, i, o, u, are examples of voiced sounds. In voiced sounds the acoustic energy is produced by the vocal cords, which open and close rapidly, sending puffs of air through the vocal tract. The rate at which the vocal cords open and close determines the pitch of the voice. For male voices the vocal cords usually vibrate at a frequency between 80 and 120 cps, whereas female voices usually range between 120 and 240 cps. The air puffs, if heard by themselves, make a sound something like a buzz. This sound cannot be heard in a pure form but only as altered by its passage through the vocal tract. Thus, the sounds which come from the mouth have different characteristics depending on the shape of the vocal tract. You can make experiments with your own mouth by repeating different voiced sounds and sensing the shape of your mouth and throat. It is possible to vary the pitch, keeping the same shape of the vocal tract, and vice versa, to vary the shape of the vocal tract while keeping your pitch constant. In your natural speaking voice do you use the same pitch for the different vowels? Can you discover why the vowel in the word heed is called a front vowel and why the vowel in the word hoot is called a back vowel? The other main category of sounds, unvoiced sounds, is produced by a turbulence at some point in the vocal tract. The consonants h, t, s, and p are examples of unvoiced sounds: h is produced by a closure at the back of the mouth, t by the tongue at the front teeth, s by front teeth and tongue, and p by the front lips. Some consonants are combinations of voiced and unvoiced sounds; p and b are an example of an unvoiced-voiced consonant pair.
The two consonants are produced with the same mouth movements except that the latter has voicing added.

How many different sounds are there in English? There are about 40 different phonemes, such as p and b, which are recognized as different, and which can change a word if one is
substituted for another, as in the words pat and bat. But each phoneme can be spoken in many different ways, depending upon which other phonemes precede and follow it.

Since the sounds of speech are made by the rapid change in shape of the mouth, the question can be asked: How fast can a person speak? The mouth and throat are rather large, massive organs and the muscles which move them have their limitations of strength. Also, the nerve system of the body, which activates the muscles, has its own speed of working. These factors combine to limit the rate of speech to a maximum of about 10 different syllables per second or about 20-30 different sounds per second. This is the rate of sound production.

How can we characterize the different sounds of speech? A so-called parametric description would be one way. If we could find some parameters which adequately describe the shape of the vocal tract, together with the position of the lips and tongue, we would be making a good start. Of course, the pitch of the vocal cords would also have to be specified. We will come back to such a description when we discuss voice synthesizers, or vocoders.

There is another very useful way of describing the different types of sounds. This is in terms of a time and frequency analysis. We will here digress from our main line of description in order to introduce those parts of frequency analysis, or Fourier analysis as it is called, which will be of use to us. Fourier analysis will not only be useful in describing speech, but will also be of immense value in describing what happens to signals in any electrical circuit.

FOURIER ANALYSIS
Figure 2.2 shows a graph of a short segment of speech. The distance along the vertical axis is proportional to the pressure of the sound wave. The distance on the horizontal axis represents time. In general, the variations of pressure with time in speech are extremely complicated and very few regularities can be detected. In order to see how to analyze speech we will first consider some variations with time which are more regular.
Fig. 2.2. Sound-pressure variations as a time function

One of the simplest types of regularities is exhibited by the motion of a point on the circumference of a wheel moving at constant speed. The motion is periodic, which means that after a certain time (T, called the period) the motion repeats itself. In the case of the revolving wheel the period T is the time of one complete revolution of the wheel. The frequency f of a periodic function is the number of repetitions per second and is given by the reciprocal of the period, f = 1/T.

Fig. 2.3. Plot of sine 2πft and cosine 2πft

If the motion of the point on the circumference of the wheel is projected onto a vertical line which passes through the center of the wheel, a particularly simple form of periodic motion results. A graph of this motion is shown in Fig. 2.3. It is called a sine function and can be represented by the formula

$$Y(t) = A \sin(2\pi f t), \tag{2.1}$$
where Y(t) is the displacement of the projected point and is a function of the time t, and A is the amplitude of the sine function, corresponding to the maximum displacement. In our example, A would be the radius of the wheel and f is the frequency of rotation in cycles per second. The quantity 2πft is the angle of rotation θ measured in radians.

A different type of periodic motion can be obtained by projecting the motion of a point on the circumference of a wheel onto a horizontal axis instead of a vertical axis. This motion is also shown in Fig. 2.3 and is represented by the formula

$$x(t) = A \cos(2\pi f t). \tag{2.2}$$
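The wheel picture behind Eqs. (2.1) and (2.2) can be tried numerically. The following is a minimal sketch; the function name `wheel_point` and the sample values of A and f are illustrative assumptions, not taken from the text.

```python
import math


def wheel_point(t, amplitude, frequency):
    """Return (x, y) for a point on a wheel of radius `amplitude`.

    The wheel turns `frequency` times per second; x is the projection
    on the horizontal axis (cosine) and y the projection on the
    vertical axis (sine).
    """
    theta = 2 * math.pi * frequency * t  # angle of rotation in radians
    return amplitude * math.cos(theta), amplitude * math.sin(theta)


# Assumed example values: radius 2 units, 5 revolutions per second.
A, f = 2.0, 5.0
T = 1.0 / f  # period in seconds

# Sample the motion every quarter of a period.
for k in range(5):
    t = k * T / 4
    x, y = wheel_point(t, A, f)
    print(f"t = {t:.3f} s   x = {x:+.3f}   y = {y:+.3f}")
```

At each quarter period one projection is at its maximum size while the other passes through zero, and after a full period T both return to their starting values.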
It is apparent that the sine function and the cosine function are related by a rotation of the point one-fourth of the way around the circle, or by an angle θ = π/2 rad. This can be expressed by the formula

$$\sin\left(2\pi f t + \frac{\pi}{2}\right) = \cos(2\pi f t). \tag{2.3}$$
If the sine function is rotated through an arbitrary angle φ, it can be expressed as a sum of both a sine and a cosine function,

$$\sin(2\pi f t + \varphi) = \cos\varphi \, \sin(2\pi f t) + \sin\varphi \, \cos(2\pi f t).$$
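The two phase relations just stated (the quarter-turn relation between sine and cosine, and the split of a phase-shifted sine into a sine part and a cosine part) can be checked numerically. This is only a sketch: the function names, the frequency of 100 cps, and the angle of 0.7 rad are arbitrary assumptions chosen for illustration.

```python
import math


def shifted_sine(t, f, phi):
    """A sine of frequency f (cps) rotated through the angle phi (rad)."""
    return math.sin(2 * math.pi * f * t + phi)


def sine_cosine_sum(t, f, phi):
    """The same function written as a weighted sine plus a weighted cosine."""
    return (math.cos(phi) * math.sin(2 * math.pi * f * t)
            + math.sin(phi) * math.cos(2 * math.pi * f * t))


f = 100.0    # assumed frequency in cps
phi = 0.7    # assumed phase angle in radians
times = [k / 8000 for k in range(80)]  # one full period, 80 samples

# Largest disagreement between the two forms over one period.
worst_general = max(abs(shifted_sine(t, f, phi) - sine_cosine_sum(t, f, phi))
                    for t in times)

# Special case phi = pi/2: the shifted sine should equal the cosine.
worst_quarter = max(abs(shifted_sine(t, f, math.pi / 2)
                        - math.cos(2 * math.pi * f * t))
                    for t in times)

print(worst_general, worst_quarter)
```

Both printed differences should be at the level of floating-point rounding error, which is what one expects if the two forms describe the same function.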
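...multiples of the fundamental frequency 1/T. Each frequency 2πn/T will have a specified amplitude S_n and a specified phase φ_n, and all of the terms, up to infinity, should be summed,

$$F(t) = \sum_{n=0}^{\infty} S_n \sin\left(\frac{2\pi n t}{T} + \varphi_n\right). \tag{2.5}$$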
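A sum of harmonics of this kind, each with its own amplitude and phase, can be sketched in a few lines. The example below assumes a sawtooth (a straight-line ramp repeating with period T) as the function to be represented and uses the standard textbook coefficients for that shape, S_n = 2/(πn) with phase 0 for odd n and π for even n; these coefficients and all function names are assumptions for illustration and are not taken from the book's figures. The constant (n = 0) term is zero for this particular shape, so the sum starts at n = 1.

```python
import math


def harmonic_sum(t, period, amplitudes, phases):
    """Sum of S_n * sin(2*pi*n*t/T + phi_n) for n = 1, 2, ..."""
    return sum(s * math.sin(2 * math.pi * n * t / period + p)
               for n, (s, p) in enumerate(zip(amplitudes, phases), start=1))


def sawtooth_terms(count):
    """Amplitudes and phases of the first `count` harmonics of a sawtooth
    that rises from -1 to +1 over one period (assumed test function)."""
    amplitudes = [2 / (math.pi * n) for n in range(1, count + 1)]
    phases = [0.0 if n % 2 == 1 else math.pi for n in range(1, count + 1)]
    return amplitudes, phases


def rms_error(count, period=1.0, samples=400):
    """Root-mean-square difference between the ramp and its partial sum."""
    amplitudes, phases = sawtooth_terms(count)
    total = 0.0
    for k in range(samples):
        t = -period / 2 + period * (k + 0.5) / samples
        target = 2 * t / period  # the straight-line function itself
        total += (harmonic_sum(t, period, amplitudes, phases) - target) ** 2
    return math.sqrt(total / samples)


for count in (1, 4, 16):
    print(count, "terms: rms error =", round(rms_error(count), 3))
```

The error shrinks as more harmonics are included, because the amplitudes of the higher harmonics fall off as 1/n for this shape.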
The multiples of the fundamental frequency are called harmonics. In practice, the amplitudes of the terms which represent high harmonics (i.e., large n) are usually quite small. This means that most time functions can be approximated quite well by a reasonable number of terms. Figure 2.5 shows a straight-line time function and the Fourier series which represents this function. One term of the series does not give a good approximation, whereas four terms represent the function closely.

Fourier series are of immense use in dealing with periodic functions. It is a fact, however, that most variations which occur are not strictly periodic. Thus, our analysis has to be extended to include nonperiodic functions. Without going into the details we will say that this can be done by replacing the sum over the harmonic frequencies, f_n = nf = n/T, by an integral over all frequencies. Thus, F(t) is given by the integral

$$F(t) = \int_0^{\infty} S(f) \sin\left[2\pi f t + \varphi(f)\right] df, \tag{2.6}$$
where S(f) is the amplitude of the sine function whose frequency is f, and ...

Fig. 2.14. Schematic diagram of the human ear (Bogert, 1950). Labeled parts: tympanic membrane, oval window, round window, Eustachian tube, nasal cavity, stirrup, helicotrema, scala vestibuli, scala tympani, cochlear partition.

... ear is functionally divided into three parts: the outer ear, the middle ear, and the inner ear. Each part of the ear serves one or more quite definite functions.

The outer ear consists of the "ear," which is called the pinna and serves to collect the sound waves and direct them to the auditory canal and on to the eardrum. The asymmetry of the pinna also helps in distinguishing the direction of sound. Without this there would be a horizontal axis of symmetry passing through the ears so that sounds coming from the front and from the back of the head would be heard in an identical way. The auditory canal serves the purpose of protecting the eardrum. It also provides a quarter-wave resonance at around 5000 cps which increases the sensitivity of the ear in this frequency region.

The bones of the middle ear transmit the vibrations of the eardrum to the oval window of the cochlea. They help couple
the motions of the air with the different motion of the liquid-filled cochlea. By their mode of vibration the middle-ear bones also serve to protect the inner ear from loud noises; that is, the mode of vibration of these bones at high sound intensity is different from their mode at low intensities.

The inner ear consists of the cochlea and the vestibular apparatus with its associated semicircular canals. The semicircular canals are part of the spatial orientation sense and are not concerned with hearing, so they will not be discussed further. Such a description is highly simplified but it may help to show how remarkably well suited the ear is to its task.

The inner ear and its operation deserve more of our attention. The cochlea, so called because of its shell-like shape, contains a remarkable wave-analyzing mechanism. A schematic of an unrolled cochlea is shown in the lower part of Fig. 2.14. In its essential details the cochlea consists of a tube separated into halves by the cochlear partition, an important part of which is the basilar membrane. The elastic constant of the basilar membrane varies by a factor of about 100 over its length. It is stiffest at its basal end near the oval and round windows, and it is most stretchable or flabby at the apical end where there is an opening connecting the upper and lower halves of the cochlea.

Suppose there is a displacement of the oval window. If the movement is very slow, as would occur for low frequencies, the liquid in the upper half flows from the basal to the apical end of the upper vestibule, flows through the hole in the apex and into the lower vestibule, then back to the basal end of the cochlea where, having no other place to go, it displaces the round window. During this whole process the membrane displaces but slightly, as shown at the top of Fig. 2.15.
Fig. 2.15. Motion of fluid in the cochlea produced by the displacement of the oval window, showing frequency-dependent displacement of the basilar membrane. Panels: zero frequency, low frequency, high frequency ("unrolled" cochlea); bottom curve: envelope of basilar membrane displacement versus distance along the basilar membrane.

As the frequency of motion at the oval window is increased, there is a dynamic conflict between the inertia of the liquid in the cochlea and the force required to displace the elastic membrane. If the membrane is displaced, the mass of liquid beyond the point of displacement need not be set into motion. Because the inertial forces increase with increased frequency, a higher frequency is more likely to cause membrane displacement rather than mass motion. Because of the variation in elasticity along the basilar membrane, the point of maximum displacement is a function of frequency. High-frequency displacements occur only at the basal end of the membrane while low-frequency movements cause the
membrane to move throughout its whole length but with greatest displacement at the apical end. The curve at the bottom of Fig. 2.15 shows the outline of the basilar membrane displacement for pure sine-wave inputs of various frequencies. Low-frequency tones displace the high-frequency region of the membrane, but high-frequency tones do not affect the low-frequency portion of the membrane. Thus, it is not surprising that low-frequency tones can interfere with or mask high-frequency tones while high-frequency tones do not have much effect on the hearing of low-frequency tones.

The next stage in the hearing process is the production of nerve impulses. Along the length of the basilar membrane are found rows of thin cells called hair cells which are attached both to the basilar membrane and to the beginnings of the auditory nerve. Motion of the basilar membrane causes a stimulation of the hair cells which in turn causes electrical impulses to travel up the nerve toward the brain. These electrical impulses are presumably analyzed by the brain, and in such a way we reach a conclusion about what we hear.

The cochlea thus works as a mechanical frequency analyzer. But it also acts as a time analyzer. In Helmholtz's theory of the operation of the ear there were a number of resonant analyzers. This would be analogous to the strings on a piano. There would be a number of such strings, each resonant to a particular frequency. We could thus determine the frequency by detecting which string was resonating.

Suppose that a person listens to two tones alternately. One tone is fixed in frequency at 1000 cps; the other tone can be shifted in frequency. He tries to match the variable-frequency tone to the fixed 1000-cps tone. Most people find a match within the range 996 to 1004 cps about half the time, and so we say that the accuracy of determining pitch at 1000 cps is plus or minus 4 cps. Thus, the bandwidth of our pitch-detecting mechanism is about 8 cps.
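The pitch-matching figures can be put into numbers with a short sketch. It assumes the common rule of thumb that an analyzer of bandwidth B cps needs a time of the order of 1/B sec to respond, and conversely that a time resolution of Δt sec calls for a bandwidth of the order of 1/Δt cps. The rule is only an order-of-magnitude estimate, and the function names are illustrative assumptions.

```python
def bandwidth_from_match_range(low_cps, high_cps):
    """Width of the frequency range within which two tones are matched."""
    return high_cps - low_cps


def shortest_response_time(bandwidth_cps):
    """Order-of-magnitude response time (sec) of an analyzer this narrow."""
    return 1.0 / bandwidth_cps


def bandwidth_for_time_resolution(seconds):
    """Order-of-magnitude bandwidth (cps) needed to resolve this time."""
    return 1.0 / seconds


# Matching range quoted in the text: 996 to 1004 cps.
pitch_bandwidth = bandwidth_from_match_range(996.0, 1004.0)
print("pitch bandwidth:", pitch_bandwidth, "cps")
print("implied response time:", shortest_response_time(pitch_bandwidth), "sec")

# An assumed time resolution of one millisecond, for comparison.
print("bandwidth for 1 msec:", bandwidth_for_time_resolution(0.001), "cps")
```

On this estimate a purely resonant analyzer as sharp as 8 cps would respond in something like an eighth of a second, far slower than a mechanism able to follow changes on a millisecond scale.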
But the time resolution of the ear is about 1 msec. In order for this to be true, the basilar membrane must not be a sharp frequency analyzer but must have a bandwidth of 1000 cps or more. We can conclude from this
that the detection of pitch involves something more than mechanical frequency analysis by the basilar membrane.

The fact that the time resolution of the ear is limited to about 1 msec is related to a change in the way frequencies above and below 1000 cps are heard. For frequencies below 1000 cps the time fluctuations corresponding to the pressure variations of the sound wave can be observed in the nervous system, but this is not so for frequencies above 1000 cps. In our hearing the detailed time structure of the sound wave is lost. Another way of expressing the same thing is to say that the ear is phase-insensitive for high frequencies.

Because the phase of the high-frequency components of speech is not perceptually important, a certain amount of phase distortion in telephone circuits is permissible. This simplifies the design and construction of the circuits, but there are drawbacks to such simplifications. The telephone system is now being used for other signals besides speech, and a distortion which is not important for speech may be undesirable for another type of signal.

We have mentioned that the ear can accurately detect the pitch of a pure tone. It does not have the same accuracy in detecting the pitch period of speech. This fact is used in single-sideband modulation. In this type of modulation, which will be discussed in detail later, the processed speech is often shifted in frequency by a few cycles per second. A frequency shift of a few cycles per second is tolerated quite well by the ear, and so makes this modulation scheme feasible.

The mouth radiates only a small amount of power; thus the ear must be a very sensitive acoustical instrument. There are different types of sensitivities which can be measured. One important measure of the ear's sensitivity is the threshold of audibility for pure sinusoidal tones. This is the minimum intensity at which a tone can be heard when no other sounds are present.
Figure 2.16 shows a curve of the threshold of audibility which is given by the American Standards Association. The scale is in terms of sound-pressure level in decibels with respect to 0.0002 dyne/cm², which is taken as the zero-db reference level.
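The decibel scale referred to 0.0002 dyne/cm² can be illustrated with a small conversion sketch. It assumes the usual convention that a sound-pressure level in decibels is 20 times the base-10 logarithm of the pressure ratio; the function and constant names are illustrative assumptions.

```python
import math

# Reference pressure stated in the text, in dyne/cm^2 (the 0-db level).
REFERENCE_PRESSURE = 0.0002


def sound_pressure_level_db(pressure):
    """Sound-pressure level in db for a pressure given in dyne/cm^2.

    Assumes the usual convention: 20 * log10(pressure / reference).
    """
    return 20.0 * math.log10(pressure / REFERENCE_PRESSURE)


def pressure_from_level_db(level_db):
    """Inverse conversion: pressure in dyne/cm^2 for a level in db."""
    return REFERENCE_PRESSURE * 10 ** (level_db / 20.0)


# Each tenfold increase in pressure adds 20 db to the level.
for pressure in (0.0002, 0.002, 0.02, 2.0):
    level = sound_pressure_level_db(pressure)
    print(f"{pressure:g} dyne/cm^2 -> {level:.1f} db")
```

On this scale the reference pressure itself sits at 0 db, and pressures below it give negative levels.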