UCLA FORUM IN MEDICAL SCIENCES

VICTOR E. HALL, Editor
MARTHA BASCOPÉ-ESPADA, Assistant Editor

EDITORIAL BOARD
Forrest H. Adams        H. W. Magoun
Mary A. B. Brazier      C. D. O'Malley
Louise M. Darling       Sidney Roberts
Morton I. Grossman      Emil L. Smith
William P. Longmire     Reidar F. Sognnaes

UNIVERSITY OF CALIFORNIA, LOS ANGELES
IMAGE PROCESSING IN BIOLOGICAL SCIENCE
UCLA FORUM IN MEDICAL SCIENCES NUMBER 9
IMAGE PROCESSING IN BIOLOGICAL SCIENCE Proceedings of a Conference held November, 1966 Sponsored by the UCLA School of Medicine, University of California, Los Angeles, and the National Institutes of Health
WILFRID J. DIXON, MARY A. B. BRAZIER and BRUCE D. WAXMAN CO-CHAIRMEN
DIANE M. RAMSEY, EDITOR

UNIVERSITY OF CALIFORNIA PRESS
BERKELEY AND LOS ANGELES 1968
University of California Press, Berkeley and Los Angeles, California
© 1969 by The Regents of the University of California
Library of Congress Catalog Card Number: 68-63727
Printed in the United States of America
PARTICIPANTS IN THE CONFERENCE

WILFRID J. DIXON, Co-Chairman
Department of Biomathematics, UCLA School of Medicine, University of California, Los Angeles, California

MARY A. B. BRAZIER, Co-Chairman
Brain Research Institute, UCLA School of Medicine, University of California, Los Angeles, California

BRUCE D. WAXMAN, Co-Chairman*
Special Research Resources Branch, Division of Research Facilities and Resources, National Institutes of Health, Bethesda, Maryland

DIANE M. RAMSEY, Editor†
Astropower Laboratory, Douglas Missile and Space Systems Division, Newport Beach, California

W. ROSS ADEY
Space Biology Laboratory and Brain Research Institute, University of California, Los Angeles, California

STEPHEN L. ALDRICH
Research and Development, Central Intelligence Agency, Washington, D.C.

HARRY BLUM
Synthetic Coding Branch, Data Sciences Laboratory, Air Force Cambridge Research Laboratories, Bedford, Massachusetts

PATRICIA M. BRITT‡
International Business Machines Corporation, Los Angeles, California

Present addresses: * National Center for Health Services Research and Development, Health Services and Mental Health Administration, National Institutes of Health, Bethesda, Maryland. † Division of Research, Reiss-Davis Child Study Center, Los Angeles, California. ‡ Health Sciences Computing Facility, University of California, Los Angeles, California.
DANIEL BROWN
Test and Electronic System Simulation Department, Space Technology Laboratory, Inc., Redondo Beach, California

D. E. CLARK
Medical Computing Unit, University of Manchester, Manchester, England

WESLEY A. CLARK
Computer Systems Laboratory, Washington University, St. Louis, Missouri

GERALD COHEN
Biomedical Engineering Branch, Division of Research Service, National Institutes of Health, Bethesda, Maryland

GEORGE N. EAVES°
Special Research Resources Branch, Division of Research Facilities and Resources, National Institutes of Health, Bethesda, Maryland

MURRAY EDEN
Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts

FRANK ERVIN
Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts

GERALD ESTRIN
Department of Engineering, University of California, Los Angeles, California

HELEN H. GEE
Computer Research Study Section, Division of Research Grants, National Institutes of Health, Bethesda, Maryland

° Present address: Molecular Biology Study Section, Division of Research Grants, National Institutes of Health, Bethesda, Maryland.
DONALD A. GLASER
Department of Molecular Biology and Virus Laboratory, University of California, Berkeley, California

LESTER GOODMAN
Biomedical Engineering and Instrumentation Branch, Division of Research Service, National Institutes of Health, Bethesda, Maryland

MARYLOU INGRAM
Department of Radiation Biology and Biophysics, The University of Rochester School of Medicine and Dentistry, Rochester, New York

RICHARD J. JOHNS
Sub-Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, Maryland

R. DAVID JOSEPH
Astropower Laboratory, Douglas Missile and Space Systems Division, Newport Beach, California

BALDWIN G. LAMSON
Director of Hospitals and Clinics, University of California, Los Angeles, California

JOSHUA LEDERBERG
Department of Genetics and Kennedy Laboratories, Stanford University Medical School, Palo Alto, California

LEWIS LIPKIN
Perinatal Research Branch, National Institute of Neurological Diseases and Blindness, Bethesda, Maryland

JOSIAH MACY, JR.*
Department of Physiology, Albert Einstein College of Medicine, Yeshiva University, Bronx, New York

* Present address: Division of Biophysical Sciences, University of Alabama Medical Center, Birmingham, Alabama.
BRUCE H. McCORMICK
Department of Computer Science, University of Illinois, Urbana, Illinois

MORTIMER L. MENDELSOHN
Department of Radiology, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania

MARVIN MINSKY
Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts

ROBERT NATHAN
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California

PETER W. NEURATH
New England Medical Center Hospitals, Boston, Massachusetts

NILS J. NILSSON
Artificial Intelligence Group, Applied Physics Laboratory, Stanford Research Institute, Menlo Park, California

AMOS NORMAN
Department of Radiology, UCLA School of Medicine, University of California, Los Angeles, California

SEYMOUR PAPERT
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts

ARNOLD PRATT
Division of Computer Research and Technology, National Institutes of Health, Bethesda, Maryland

KENDALL PRESTON, JR.
Electro-Optical Division, The Perkin-Elmer Corporation, Norwalk, Connecticut

JUDITH M. S. PREWITT
Department of Radiology, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania

JEROME A. G. RUSSELL
Research Data Facility, The Institute of Medical Sciences, Presbyterian Medical Center, San Francisco, California

DENIS RUTOVITZ
Medical Research Council, Clinical Effects of Radiation Research Unit, London, England

GEORGE A. SACHER
Division of Biological and Medical Research, Argonne National Laboratory, Argonne, Illinois

ROBERT H. SELZER
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California

ROBERT R. SOKAL
Department of Entomology, University of Kansas, Lawrence, Kansas

EDWARD F. VASTOLA
Division of Neurology, Department of Medicine, State University of New York Downstate Medical Center, Brooklyn, New York

HERMAN W. VREENEGOOR
Division of Computer Research and Technology, National Institutes of Health, Bethesda, Maryland

NIEL WALD
Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania

WILLIAM S. YAMAMOTO
Department of Physiology, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania
FOREWORD
BRUCE D. WAXMAN Co-Chairman
First, let me express our gratitude to the UCLA staff who so willingly worked with us in the development of this conference on biological image processing. We are, of course, very much indebted to Drs. Dixon and Brazier, assisted so ably by Dr. Ramsey.

I should like to exercise an administrator's prerogative to reflect upon the reasons of the National Institutes of Health in supporting this conference. Most of you are aware of current activities in biological image processing in this country and probably share with me the belief that (a) the efforts are comparatively small, (b) the equipment available to biomedical scientists for image processing is generally less than state-of-the-art, and (c) much of the current effort has resulted as spin-offs from work in other areas. The demands of the moment are for reasonable increases in the volume of this activity in the biomedical sciences and for a reorientation of motives. The resolution of well-defined biomedical problems is the principal objective. Adequate technological capability already exists for the resolution of many biological problems of this class. The intent of the National Institutes of Health, in co-sponsoring this conference, is to urge that a systematic effort be made to direct our image processing technology toward significant biomedical targets.

Unfortunately, while administrators are sometimes allowed to talk glibly about technological opportunities, we often prejudge issues as a result of a fundamental misunderstanding of the technologies involved. I have been troubled by a logical flaw in the suggestion that the subject matter of this conference be "biological image processing". This title was chosen because it seemed more appropriate than "biological pattern recognition", which in my mind implies the analysis of data, possibly to the exclusion of data reduction. Recently, I have wondered whether or not the notion of image processing is itself restrictive; it may connote the reduction and analysis of "natural" observations but exclude from consideration two- or three-dimensional data which are abstractions of phenomena rather than the phenomena themselves. By way of example, I refer to the three-dimensional representations of protein molecules by Levinthal (1), and to our own work on the two-dimensional representation of planar chemical structures. I feel impelled to make these observations because the program of the next two days has been defined in such a way as to exclude extensive discussions of formal pattern recognition research and does not speak directly to that domain of images which are "nonnatural".

I should like to reflect upon the nature of the contribution of image processing technology to biomedicine. The question is, do we limit our expectations to tasks in which the technology has the capacity for replicating relatively low-order motor and perceptual capabilities, albeit at great speed and consistency, or are there additional opportunities? It has been suggested that the ability to separate signal from noise automatically has implications for improving the effective resolution of analytic and diagnostic instruments; this has not been indisputably demonstrated. Furthermore, it has been suggested that the ability to quantize data automatically from massive populations provides opportunities for empirical analyses which are unattainable from the standpoint of a non-automatic technology. One may also conclude that the availability of suitably quantized data from large populations will permit the development of stochastically based biological theories.

REFERENCE

1. LEVINTHAL, C., Molecular model-building by computer. Sci. Amer., 1966, 214: 42-52.
CONTENTS

IMAGERY IN ANALYTICAL METHODS
  Mary A. B. Brazier and Wilfrid J. Dixon  1

SURVEY OF IMAGE PROCESSING PHASES
  George N. Eaves and Diane M. Ramsey  5

AUTOMATIC SCREENING OF METAPHASE SPREADS FOR CHROMOSOME ANALYSIS
  Niel Wald and Kendall Preston, Jr.  9

THE APPLICATION OF CHARACTER RECOGNITION TECHNIQUES TO THE DEVELOPMENT OF READING MACHINES FOR THE BLIND
  Murray Eden  35

AN AUTOMATED SYSTEM FOR GROWTH AND ANALYSIS OF BACTERIAL COLONIES
  Donald A. Glaser  57

AUTOMATIC PROCESSING OF MAMMOGRAMS
  Josiah Macy, Jr., Fred Winsberg and William H. Weymouth  75

AUTOMATIC DIFFERENTIATION OF WHITE BLOOD CELLS
  Marylou Ingram, P. E. Norgren and Kendall Preston, Jr.  97

APPROACHES TO THE AUTOMATION OF CHROMOSOME ANALYSIS
  Mortimer L. Mendelsohn, Brian H. Mayall and Judith M. S. Prewitt  119

DISCRIMINANT VERSUS LOGICAL MODELS FOR PATTERN ANALYSIS
  Seymour Papert  137

ADVANCES IN THE DEVELOPMENT OF IMAGE PROCESSING HARDWARE
  Bruce H. McCormick  149

DIGITAL VIDEO DATA HANDLING: MARS, THE MOON AND MEN
  Robert Nathan and Robert H. Selzer  177

BIOLOGICAL IMAGE PROCESSING IN THE NERVOUS SYSTEM: NEEDS AND PREDICTIONS
  Edward F. Vastola  211

SUMMATION AND PERSPECTIVE:
  FOR BIOMEDICINE
    Frank Ervin and Joshua Lederberg  219
  FOR HARDWARE
    Wesley A. Clark and Marvin Minsky  232

NAME INDEX  245
SUBJECT INDEX  247
IMAGERY IN ANALYTICAL METHODS

MARY A. B. BRAZIER and WILFRID J. DIXON
University of California, Los Angeles, California
Biological research involves the study of living organisms. The necessary complexities of this research arise from the many variables which must be studied simultaneously, as well as from the transient nature of many observable conditions and the varieties of accommodation to the same stimulus. Quantification in biology has proceeded slowly with the paucity of analytical methods and the lack or inadequacies of the mathematical models used to assist the observational and analytical processes. In the past, the biologist attempted to compensate for this lack by devising a variety of pictorial aids to his understanding; structures and forms too complex to describe were recorded pictorially. These pictorial aids include: (a) photographs taken in ordinary light in black-and-white or in color, either stills or in motion; (b) photographs recording diffracted or filtered light, opacity produced by various substances, transmission of radiation, or passage of chemicals; (c) direct or photographic viewing of specimens prepared by smear or cross-sectional cuts; (d) derived pictures or images resulting from dichotomized or scaled threshold intensities from other pictures; and (e) various transformations of pictures or signals.

The quantification of these pictures is an emerging science in itself. Scanning devices, systems for computer guidance and analysis in the on-line mode, pattern recognition theories, and mathematical and statistical techniques (these also assisted by computer) are now at a stage of development where their potential fusion will greatly accelerate basic research in biology. This conference's host institution has especial interest in this subject, for its Health Sciences Computing Facility* has installed a Graphics Subsystem (the 2250 Display and the 2282 Recorder Scanner) which is supported for computation by a partition of the core of the IBM 360/75 computer. Portions of this equipment are not available commercially, and systems support as well as various applied programs have been developed by the facility.

* Supported by National Institutes of Health Grant FR-3.

PROJECTS IN PROGRESS
Among the many varieties of image processing applicable to biological research that are currently being explored at UCLA are a project in chromosome analysis being pursued jointly with the Radiology Department, a Fast Fourier Transform algorithm for analysis of electroencephalograms for the Brain Research Institute, and a graphics display of electric fields of EEG potentials for Dr. Brazier's neurophysiological laboratory. In other words, activity is being developed not only in the transform of image to numerical data but also from numerical data to image display.

Chromosome Studies
Dr. H. Frey of the Radiology Department is testing algorithms for recognizing and classifying chromosomes and for identifying abnormal chromosomes resulting from genetic anomalies or radiation. The algorithms must provide techniques for identifying and separating chromosomes which touch or overlap one another, as well as means for categorizing the chromosomes from the computed descriptions derived from the presented pattern. The program currently under development examines the field presented to it, searching for "objects"—for example, connected regions. Each object is classified as a single chromosome, overlapping chromosomes, or neither. Once an object is classified as a chromosome, the program defines a boundary for it, and classification can proceed. The goal of the project is the development of a package program for scanning chromosome photomicrographs, identifying and classifying the chromosomes found, and presenting the results. An example of the display is shown in Figure 1.
Figure 1. Chromosome display showing density ellipse.
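The object search just described, in which the field is scanned for connected regions and each region is given a rough class, can be sketched in a few lines. The following Python fragment is a minimal reconstruction, not the UCLA program itself; the density threshold and the area limits separating single from overlapping chromosomes are invented for illustration.

```python
# A minimal sketch of the object search described above, assuming the
# digitized field is a 2-D array of optical densities. The threshold and
# the area limits separating "single" from "overlapping" are invented.
import numpy as np
from scipy import ndimage

def find_objects(field, threshold=0.5, single_area=(50, 400)):
    """Label connected regions of `field` and give each a rough class."""
    binary = field > threshold                 # figure/ground separation
    labels, n = ndimage.label(binary)          # connected regions
    classes = {}
    for i in range(1, n + 1):
        area = int((labels == i).sum())
        if area < single_area[0]:
            classes[i] = "neither"             # dust, fragments, noise
        elif area <= single_area[1]:
            classes[i] = "single chromosome"
        else:
            classes[i] = "overlapping chromosomes"
    return labels, classes
```

Once an object is classified as a single chromosome, a boundary can be traced around its labeled region and the shape measurements needed for classification computed from it.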
The Fast Fourier Transform

A program for EEG analysis, based on the Fast Fourier Transform algorithm, greatly reduces the computation time previously necessary for spectrum analysis. The program, developed by Dr. R. Jennrich, estimates autospectra, cross-spectra, and coherences for stationary time series. Each series is decomposed into frequency components by means of a finite Fourier transform, and the required estimates are obtained by summing products of the transformed series. Linear trend is removed from each series before transformation.
If desired, series may be prefiltered (either by low-pass filtering or by constructing an Ormsby filter) and decimated before detrending. The user may control the flow of the program by selecting the series to be analyzed, by constructing a desired filter, by choosing the desired functions for simultaneous display (amplitude of autospectra, or amplitude, phase and coherence of cross-spectra), and by scaling the display. Output from each problem is stored in a temporary data set and may be recalled for comparison and simultaneous display. An example of an analysis of amplitude and coherence of two EEG records is shown in Figure 2.

Figure 2. Amplitude and coherence of two EEG records. The graph shows the autospectra of the records on channel one and channel two, together with the amplitude of their cross-spectrum and their coherence. The horizontal axes represent frequency. Labeling, coding and identifying information are displayed simultaneously on the scope face for use by the investigator (this information is not legible at the magnification used in this illustration).
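The estimation scheme just described can be sketched compactly. This is an illustrative reconstruction, not Dr. Jennrich's program: linear trend is removed before transformation, as in the text, but the averaging over segments used here to stabilize the coherence estimate is an assumption, and prefiltering and decimation are omitted.

```python
# A sketch of autospectrum, cross-spectrum and coherence estimation by
# summing products of finite Fourier transforms, with detrending as in
# the text. Segment averaging is an assumed stabilizing choice.
import numpy as np

def detrend(x):
    """Remove a least-squares linear trend from a series."""
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

def spectra(x, y, nseg=8):
    """Autospectra, cross-spectrum and coherence of two series."""
    n = min(len(x), len(y)) // nseg
    sxx = syy = sxy = 0.0
    for k in range(nseg):
        xf = np.fft.rfft(detrend(x[k * n:(k + 1) * n]))
        yf = np.fft.rfft(detrend(y[k * n:(k + 1) * n]))
        sxx = sxx + np.abs(xf) ** 2            # autospectrum, channel one
        syy = syy + np.abs(yf) ** 2            # autospectrum, channel two
        sxy = sxy + xf * np.conj(yf)           # cross-spectrum
    coherence = np.abs(sxy) ** 2 / (sxx * syy)
    return sxx / nseg, syy / nseg, sxy / nseg, coherence
```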
Graphics Display of Electrical Fields on the Head

A research project of Dr. T. Estrin and Mr. R. Uzgalis in the neurophysiological laboratory of Dr. Brazier* has resulted in programs for on-line spatiotemporal plots of the EEG. In conventional electroencephalography, the potential differences recorded between pairs of electrodes fixed to the head are displayed as amplitude functions of time; the spatial character of the electrical field must be inferred from the usual EEG tracings. Recent advances in computer graphics permit the electroencephalogram to be viewed as a spatiotemporal presentation and should be useful in clinical and research electroencephalography. In this new approach, a grid on a cathode ray tube is made congruent with the electrode positions on the scalp, from which multiple channels of EEG data are amplified, multiplexed and digitized. The recorded voltages are spatially interpolated to complete the grid. An algorithm connects points of equal voltage and displays them as contours on the tube face of a 2250 graphics terminal which time-shares the IBM 360/75. The contours in each display are considered as heights defining a topographic surface with respect to a common reference. Successive displays recreate the time history of the electric field and are photographed by a motion picture camera under computer control. An example from a film is shown in Figure 3.

Figure 3. The strip on the left is an excerpt of three successive frames of a film made by photographing displays on a 2250 cathode ray tube face. The camera shutter and film advance mechanism were under control of the IBM 360/75 via an interface panel which utilizes the capability of the 2250 function key. Throughout the film, each display contours the spatial distribution of voltage over the surface of the head at intervals of 5 msec. The seven small black squares on each frame correspond to electrode positions on the head. The figure on the lower right is an enlarged drawing of one of these three frames; for clarity, the numerals which denote contour levels on the film have been emphasized by shadowing and contour lines. Zero isopotential is represented by the broken white line. The unbroken white lines on a darkening field indicate the direction of increasing negativity; black lines on a brightening field denote increasing positivity.

* The work of this laboratory is supported by grants from the National Science Foundation (# GP-6438), National Institutes of Health (# NB 04773) and Office of Naval Research (Contract # 233-69).
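The grid completion and contouring steps can be sketched as follows. The electrode coordinates and the voltages for one frame are invented, and cubic interpolation stands in for whatever interpolation rule the actual programs used.

```python
# A sketch of the display computation: voltages at a few electrode
# positions are interpolated to complete a grid, and points of equal
# voltage are connected as contours. Coordinates and voltages invented.
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt

electrodes = np.array([[0.30, 0.50], [0.50, 0.80], [0.70, 0.50],
                       [0.50, 0.20], [0.40, 0.35], [0.60, 0.65],
                       [0.50, 0.50]])
volts = np.array([12.0, -5.0, 3.0, 8.0, 0.0, -2.0, 4.0])  # one 5-msec frame

gx, gy = np.mgrid[0:1:64j, 0:1:64j]
field = griddata(electrodes, volts, (gx, gy), method="cubic")

plt.contour(gx, gy, field, levels=10)                          # isopotentials
plt.contour(gx, gy, field, levels=[0.0], linestyles="dashed")  # zero line
plt.scatter(electrodes[:, 0], electrodes[:, 1], marker="s", c="k")
plt.show()
```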
SURVEY OF IMAGE PROCESSING PHASES

GEORGE N. EAVES*
National Institutes of Health, Bethesda, Maryland

DIANE M. RAMSEY†
Douglas Missile and Space Systems Division, Newport Beach, California

Present addresses: * Molecular Biology Study Section, Division of Research Grants, National Institutes of Health, Bethesda, Maryland. † Division of Research, Reiss-Davis Child Study Center, Los Angeles, California.
The title "Image Processing in Biological Science" was chosen for this conference with the expectation of providing a reasonable framework within which the organization of our discussions could develop. This choice was not necessarily an attempt to limit the scope of material under consideration. In all of our planning and deliberations, a distinction has been made between "image processing" and "graphic analysis", both of which can employ pattern recognition techniques. While these two categories are not necessarily exclusive, the distinction between them serves to underline the main emphasis of this conference, while reflecting our recognition of important differences and interrelationships between the two fields or endeavor. Perhaps the use of "image" as a descriptive rubric is partially misleading, for we are also concerned with the scanning of actual biological specimens, such as bacterial colonies. Within the context of our definition, we might include other specific examples of biological images, such as radiographs of tissues, electron micrographs, and optically enlarged histological cross sections and hematological material, or photographs of such slides. The workable parameters for interpreting our area of interest will include the use of a computer-controlled image scanner and computer-programmable analysis. For maximum developmental progress it is far more important that we stress the open-endedness of these concepts rather than force a set of premature definitions on an emerging field of endeavor; therefore, even the form of the image must not become a limiting consideration. Prewitt & Mendelsohn ( 1 ) have specified five principal phases in the analysis of digitized images, defining them as: (a) delineation of figure and ground; ( b ) description of images by numeric and nonnumeric parameters, Present addresses: " Molecular Biology Study Section, Division of Research Grants, National Institutes of Health, Bethesda, Maryland. f Division of Research, Reiss-Davis Child Study Center, Los Angeles, California. 5
6
IMAGE
PROCESSING
IN B I O L O G I C A L
SCIENCE
and by relational descriptors; (c ) determination of the range of variation and the discriminatory power of these parameters and descriptors; (d) development of appropriate decision functions and taxonomies for classification; and ( e ) identification of unknown specimens. These five principal phases are closely analogous to the major phases involved in processing and classifying high-altitude or satellite pictures. Thus, the theoretical considerations in interpreting microimagery and high-altitude pictures are quite similar. Because of the urgency of national defense requirements, however, a higher level of effort has been devoted to research and development of techniques for automatic processing of reconnaissance photography than that expended on developing techniques for biological image processing. Much of the present-day technology for processing reconnaissance photography may be directly applicable to the processing of microimagery. A comparable level of effort has not been expended on developing techniques for biological image processing. Much of the present-day technology for processing reconnaissance photography may be directly applicable to the processing of microimagery, and these areas of similarity should be exploited fully. With these considerations serving to guide the planning of this conference, we have utilized three broad categories to form a conceptual framework for viewing developmental accomplishments in biological image processing: ( a ) preprocessing to achieve image enhancement, ( b ) feature extraction to highlight properties considered important for correct recognition and subsequent classification of the image, and ( c ) design of an appropriate classification logic. The succeeding portions of this discussion will attempt to delineate the more general aspects of the various processing phases. PREPROCESSING
The purposes of preprocessing or signal conditioning are (a) to emphasize or highlight aspects of the input signal which are deemed important, (b) to furnish in many cases a reduction in the amount of input data, (c) to supply a convenient input format for subsequent computer processing, and (d) to provide invariance. To accomplish invariance it is desirable that the classification assigned to an object or region in the field of view be reasonably independent of the position of that object in the field of view, the aspect at which it is viewed, the background against which it is seen, partial blocking of the object, and changes in illumination. Preprocessing techniques may include scanning, edge enhancement, enhancement of figure-ground contrast, Fourier transformation, and autocorrelation; two of these operations are sketched below.

Consistent image quality is of salient importance to the accomplishment of maximum preprocessing. In the case of photographs, the image enhancement techniques to be discussed by Dr. Nathan and Mr. Selzer are especially appropriate. We find it particularly noteworthy that these techniques, although developed to meet the demands of a particular reconnaissance technology, were subsequently directed toward biomedical research through the cooperative interaction of alert scientists pursuing seemingly divergent technological objectives. In the case of scanning directly from microscope slides, the problem of enhancement could be reduced by devising techniques that would assure maximum display of morphological features through preparative technology. For example, the preparation of white blood cells and the related histological techniques for demonstrating chromosomes may require a re-evaluation of classic histological techniques. The utilization of biological competence is here an obvious requirement. The investigations to be discussed by Dr. Wald and Mr. Preston will include not only attempts to optimize the preparation of histological material but also the development of a program which can select automatically processable material from the microscope slide.
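As promised above, two of the named preprocessing operations, figure-ground contrast enhancement and edge enhancement, can be sketched in a few lines; the unsharp-style Laplacian formulation and its weight are illustrative choices, not a prescription from the conference.

```python
# A sketch of two preprocessing operations named above, assuming the
# image is a 2-D array of grey levels; the sharpening weight is arbitrary.
import numpy as np
from scipy import ndimage

def stretch_contrast(img):
    """Enhance figure-ground contrast by mapping grey levels onto 0..1."""
    lo, hi = float(img.min()), float(img.max())
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def enhance_edges(img, weight=1.0):
    """Edge enhancement by subtracting a weighted Laplacian."""
    return img - weight * ndimage.laplace(img.astype(float))
```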
FEATURE EXTRACTION
An almost universal phase in pattern recognition of high-altitude pictures is the extraction of features or properties from the original signal. The process of extracting a property profile consists of making a number of decisions as to whether or not the property features are present in the input signal. Techniques for defining properties that carry significant information may be divided into those of either human or automatic design. In the former, the designer constructs property detectors for those features that are known or suspected to be important; this property list may prove to be inadequate, or it may furnish a format not suitable for the decision mechanism, which often provides only for linear discrimination. Statistical property extraction, in which a sample of preclassified images is analyzed automatically, may be used to augment the known property list and to reformat the property profile into a form suitable for the decision mechanism chosen.

Within the context of feature extraction, Dr. Eden will discuss the application to cytological material of contour scanning techniques developed for reading printed text. Dr. Glaser will describe the use of contour scans for counting and analyzing large numbers of colonies of bacteria and other microorganisms and for identifying the organisms through observation of colony morphology and other characteristics observable during growth on solid media. Dr. Ingram will discuss the use of the CELLSCAN system for recognizing cells by topographical analysis of the black-and-white computer-stored image of the cell. In contrast to this approach, Dr. Mendelsohn's laboratory has derived the identifying parameters for leukocyte recognition exclusively from the optical density frequency distribution, without exploiting obvious topological features such as nuclear shape and number of nuclear lobes. Dr. Macy will discuss the use of descriptive vectors that characterize local pattern and density areas in the automated diagnosis of breast tumors; a matching set of vectors related to the inherent symmetry of the two breasts permits detection of a tumor as an anomaly within its context by extracting patterns of disease from discarded, redundant patterns representative of normal tissue.
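As a concrete illustration of a property profile drawn solely from the optical density frequency distribution, in the spirit of the approach credited above to Dr. Mendelsohn's laboratory, the sketch below computes a few summary features from a cell's density histogram; the particular features chosen are assumptions made for this sketch.

```python
# An illustrative density-based property profile: a small feature vector
# from the optical-density frequency distribution of a cell image alone,
# without topological features. The chosen features are assumptions.
import numpy as np

def density_profile(cell, bins=32):
    """Feature vector from the optical-density histogram of a cell image
    (a 2-D array of optical densities with nonzero spread)."""
    d = cell.ravel().astype(float)
    hist, _ = np.histogram(d, bins=bins, density=True)
    mean, std = d.mean(), d.std()
    return np.array([
        mean,                                  # mean optical density
        std,                                   # spread of the distribution
        ((d - mean) ** 3).mean() / std ** 3,   # skewness
        hist.max(),                            # height of the modal class
    ])
```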
DESIGN OF CLASSIFICATION LOGIC
Nearly all current pattern recognition decision mechanisms essentially involve correlating the profile derived from the input pattern or image against one or more prototype patterns. Correlation schemes differ in the number of prototype patterns utilized, and in the means for specifying these paradigms. In most cases, the decision mechanism implements some form of linear discriminant function; a minimal example is sketched below. In some instances, a quadratic decision surface has been employed to achieve separation of the pattern classes. There is increasing evidence that nonlinear discriminant functions may provide more appropriate decision rules for classification of biological images. In fact, it may be necessary to abandon traditional discriminant analysis procedures in favor of developing logical models for pattern analysis. These alternative approaches to the design of appropriate classification logic are currently being considered and formulated at the Massachusetts Institute of Technology. Dr. Papert has been part of the group engaged in this work, and will discuss the problems inherent in designing appropriate decision mechanisms for pattern recognition and classification, as well as likely solutions to these problems in the area of biological image processing.

It is expected that this conference will help not only to increase communication between the pioneering leaders in this emerging field, but will also serve as a model for encouraging the parallel growth of a supporting technology responsive to biomedical needs and requirements. This conference has been charged with providing the incentive for defining the most urgent biomedical needs for automated image processing. While we will, it is hoped, be successful in influencing the direction future research and development will take in meeting these immediate needs, it is also imperative that we do not constrain the growth of this new field. Ultimately, the objective is to provide a strong technology for present-day needs while remaining flexible enough in outlook to recognize the emergence of new requirements and new methodologies.
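The linear discriminant of the correlation type described above can be sketched as follows. Representing each class by its mean profile (the nearest-mean rule) is one simple choice of prototype, not the procedure of any particular laboratory.

```python
# A minimal linear discriminant: each class is represented by a prototype
# (its mean profile), and an unknown profile is scored against each.
import numpy as np

def train_prototypes(profiles, labels):
    """Mean property profile per class, from preclassified samples.
    `profiles` is (n_samples, n_features); `labels` a 1-D array."""
    return {c: profiles[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(profile, prototypes):
    """Largest linear score w.x - |w|^2/2, i.e. the nearest-mean rule."""
    scores = {c: profile @ w - 0.5 * (w @ w) for c, w in prototypes.items()}
    return max(scores, key=scores.get)
```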
REFERENCE

1. PREWITT, J. M. S., and MENDELSOHN, M. L., The analysis of cell images. Ann. N. Y. Acad. Sci., 1966, 128: 1035-1053.
AUTOMATIC SCREENING OF METAPHASE SPREADS FOR CHROMOSOME ANALYSIS*

NIEL WALD
University of Pittsburgh, Pittsburgh, Pennsylvania

KENDALL PRESTON, JR.
The Perkin-Elmer Corporation, Norwalk, Connecticut

* This work was performed under National Aeronautics and Space Administration Contract NASr-169 to the University of Pittsburgh and under subcontracts NASr-169-1 and NASr-169-2 to the Perkin-Elmer Corporation.
The opening remarks by Dr. Eaves, and particularly his comment concerning one of the tasks of this conference, that is, to define the most urgent biomedical needs for automated image processing, are in accord with our thoughts in planning this presentation. Although the main subject is the preprocessing technique which we are utilizing in automatically selecting the mitotic or dividing cells for chromosome analysis, we thought it would be more meaningful initially to define the nature of the biomedical problem, that is, what the chromosomes are that we study, why we are looking at them, and what our present technique is. This will be followed by a description of the overall automatic chromosome analysis system which is now under development, so that one may see in proper perspective where this automatic microscope screening operation, or preprocessing, fits into the system. The details of the automatic microscope itself will be presented last, together with some cytogenetic data collected in the course of the design study of the instrument.

THE NATURE OF THE BIOMEDICAL PROBLEM
A. What are we studying?

Obviously, everyone at this conference is familiar with chromosomes in general, but we would like to narrow down to the characteristics that are pertinent to our automatic system. In essence, we are examining typical somatic cells of the body, which may be considered representative of cells of any tissue other than the reproductive tissue's germ cells. They are observed in metaphase, the middle stage of mitosis or cell division, since this is when the chromosomes are most condensed and visible.
In addition, the nuclear membrane has disappeared, so that the chromosomes have a larger area in which to orient themselves. By processing techniques, which we will describe in more detail subsequently, we arrest the division process at this stage, then fix, mount, and stain the cells for microscopic examination. The biological material most widely used for this purpose at present is a circulating blood cell, the lymphocyte. This cell is readily available, and is suspended in the blood rather than being in the usual syncytial arrangement of cells in tissue; this is a very convenient characteristic for ease of processing.

A photomicrograph of the kind of microscope slide preparation that the scanner has to look at, whether it be a human scanner or an automatic one, is shown in Figure 4. There are many cell nuclei which are not dividing, and some dividing nuclei arrested in midmetaphase. Cell cytoplasm is not seen clearly in this kind of preparation. A distinct difference is evident in density and in many other physical characteristics between the dividing and nondividing nuclei. The structure of the dividing nuclear material can be seen at higher magnification in Figure 5.

Figure 4. Human peripheral blood cell chromosome preparation. × 72
Figure 5. Human blood cell in mitosis. × 800
The 46 chromosomes of the normal human cell, which are oriented in a seemingly random fashion, can be reduced to 23 chromosome pairs on the basis of appearance. The pairs can be arranged in an arbitrary but standardized fashion (1), called a karyotype, by determining total chromosome length and location of the centromere (the point of contact of the two chromatids which make up each metaphase chromosome). The karyotype of the cell in Figure 5 is shown in Figure 6. Note that the chromosome pairs bear a numeric designation and that sets of paired chromosomes which resemble each other are further designated by alphabetic group labels. Additional clues used to aid in the identification of particular chromosomes include the presence of "satellites", small masses of chromosomal material suspended above the short arms of D and G group chromosomes but connected to the main body by a thin thread, and secondary constrictions, or thinned-out sections of the chromosome arms, appearing consistently in various locations on certain chromosomes.

Figure 6. Karyotype of the mitotic human blood cell shown in Figure 5.

The purpose of processing the somatic cell chromosomes as described above is to facilitate the detection of deviations from the normal pattern. There are characteristically 46 chromosomes in man, and a consistent deviation from this number is of biomedical importance. In addition, structural changes, such as breakage of chromosomes with (in some cases) translocation of material, result in chromosomes which no longer have the appearance of the original ones. Various structural abnormalities are shown in Figure 7, such as loss of material with fragments possibly present elsewhere in the cell, and abnormal rejoining of broken ends, either when both ends of one chromosome were broken and rejoined to form a ring, or when two chromosomes were broken and then joined so that a chromosome results with two centromeres instead of one, a so-called dicentric chromosome.
Abnormalities in division may lead to endoreduplicated chromosomes, with extra associated chromosome duplicates of the usual ones, and quadriradial structures in which the chromatids of two chromosomes are closely associated. These are conspicuous structural changes, then, which we want to know about in analyzing the chromosomes.

Figure 7. Chromosomal abnormalities. A: deletion of subterminal region of one member of pair 21, the "Ph1" chromosome; B: chromatid break and fragment; C: ring; D: dicentric; E: endoreduplicated chromosome; F: quadriradial figure.

B. Why are we examining these chromosomes?
The particular reasons why we are doing this are important, for they will determine the characteristics of any system that we develop for the purpose. One objective is the actual extension of our knowledge of cytology. The hope is that, with various improvements in our techniques, including automatic systems for data collection and for quantitative measurement, we will be able to extend human capability in depth, to see more and to get more information from the material which we examine. The work of Dr. Mendelsohn's group at the University of Pennsylvania appears to exemplify this approach.
Another purpose is to utilize chromosome number or shape as a research tool. We may use the chromosome as a marker or an index in the measurement of some other phenomena. For example, in one study in our laboratory in which leukemia was produced by radiation in mice, cell-free material from leukemic animals passaged into normal animals produced the same disease (10). In this system an abnormal marker chromosome was present in the bone marrow cells of each leukemic animal which received the passage material. It was also found in primary radiation-exposed leukemic animals. This marker, then, served as an indication of relationship between the disease appearing in the irradiated animals and in the nonirradiated but inoculated passage animals. There are other markers. Sex chromosome markers are used for experimental purposes in transplantation studies. We have used sex chromosome markers to study the passage of maternal cells into the male fetus (9), for example.

Another objective for cytogenetic studies is to improve the management of various clinical problems. Consider, for instance, a congenital abnormality,
Turner's syndrome, in which ovarian agenesis (the failure to develop ovarian tissue) results in a characteristic clinical appearance. Some of the abnormal features are short stature, absence of secondary sex characteristics, some cardiovascular abnormalities, and a webbed neck. The typical chromosome finding associated with this congenital abnormality is the presence of somatic cells with 45 chromosomes, including a single unpaired X chromosome instead of the normal female complement of two such chromosomes. Although the cytogenetic abnormality is detectable at birth or before, the clinical abnormalities are not fully developed until the time of puberty. Early recognition of this deficiency state would allow time for both physiological and psychological treatment to minimize the extent of abnormality. There are variations of the chromosome abnormalities in this syndrome, such as a partially deleted second X or an abnormal second X in the ring form. Also seen is another type of variation, the so-called mosaic, which is particularly pertinent in this context; here there is more than one cytogenetically distinct line of cells present in the individual, so that one finds some cells with only one X chromosome, and others with two X or even one X and one Y chromosome. The establishment of the presence or absence of this mosaic state is difficult. If a second cell line is present in low frequency, the likelihood of its detection will be a function of the total number of cells studied. Also, the ratio of cells of one line to the other may vary from one organ to another. A recent report by Ross & Tjio (6) pointed out that the major source of difficulty in making the cytogenetic correlation with this clinical abnormality was the failure to rule out the possibility of mosaicism. A major reason for this failure was the fact that counts of adequate numbers of metaphases and of cells from more than one tissue were not carried out. This, in turn, is due to the present laborious nature of such examinations.

Another limitation on the interpretation of human clinical cytogenetic data stems from the fact that a baseline must be established to quantitate the range of variability of the chromosome complement of the general human population. It is rather remarkable that at this time there are only two large randomly selected population studies reported: one from Court Brown et al. (2) in Edinburgh that includes 438 adults from age 15 up, and one from our institution (8) that now includes a little over a thousand newborn infants. We are really at the very beginning in obtaining an adequate baseline of cytogenetic information in the unselected or randomly selected human. This, then, is another important application which contributes to the design specifications for any automatic system.

A final purpose has to do with acquired chromosome abnormalities in humans, the purpose being to measure and later, hopefully, to predict the effects of the environment on man. Radiation exposure has been the stimulus for many cytogenetic studies (3), but actually there are many other agents that produce chromosome changes, including chemicals and viruses.
The acquired abnormalities shown in Figure 7 are rare and nonspecific changes in the chromosomes. They may result from long-term, low-level exposures to environmental agents. They require the study of large numbers of cells from large numbers of people in order to be useful indicators of biologic change. The chromosomes of the circulating lymphocyte would appear to provide a cumulative and integrating indicator of biologic change, minus whatever repair has taken place up to the time of collection of the cell sample. It is cumulative because of the replicative nature of the DNA (including its acquired defects), and integrating in the sense that the lymphocyte circulates throughout the body so that it may be affected by localized, as well as whole-body, exposure to environmental agents. "Biologic change" in this context should not be equated with clinical damage or any other implication of detrimental change. The reason this "biologic change" should not be assumed to be detrimental at this point in time is simply that we have not been able to make enough cytogenetic observations by manual methods to relate chromosome damage to actual biologic harm or clinical injury. We are in the vicious circle of being unable to justify the development of the automatic cytogenetic analysis method as a prognostic indicator until we have enough information with which to support such a claim, and yet the justification for developing such a system, if it lies in its prognostic value, must await the actual design of the system to acquire the supporting data.

Another application, related in terms of environmental change, involves the use of cytogenetic observations of tissue cultures of human cells as a screening technique for the effects of new drugs, pesticides, food additives and whatever other agents are being introduced into our environment. Here, too, obviously, a very rapid observation system would be required for screening purposes.

PRESENT CYTOGENETIC METHODOLOGY
Let us now consider our present manual cytogenetic study technique. It may be divided into three general stages. The sample preparation stage involves collecting a sample of peripheral circulating blood from the fingertip or antecubital vein, the culture of the white blood cells in a tissue culture medium, the arrest of the cells in metaphase, and the preparation of the microscope slides. The second stage, data collection and recording, means in our laboratory scanning through the microscope, finding the usable cells, and photographing them. In some laboratories, ideograms are drawn or notations are simply recorded in a notebook. The third stage, consisting of data analysis, requires examination of the recorded information. In our laboratory, this means cutting out the chromosomes from the enlargement of the photomicrograph, matching up the pairs, and deciding what, if anything, is wrong with them.
TABLE 1
AVERAGE PERFORMANCE TIME OF CURRENT CYTOGENETIC METHODOLOGY

Stages of Manual Technique                         Tips, Smith et al. (7)   Univ. of Pittsburgh
                                                   (15 cells)               (20 cells)

Sample preparation: collection, inoculation,
  harvesting, slide making, staining               50 minutes               2 hours
Data collection: finding mitotic cells,
  photomicrography                                 8 hours                  0 hours
Data analysis: karyotyping, evaluation             8 hours                  7 hours
Regarding preparation time, which is a key factor, we could find only one report in the literature (7) in addition to our own observations. The data are presented in Table 1. Both time studies are in general agreement that it takes something on the order of 2.5 man-days to study 20 cells of an individual. This is prohibitive when dealing with large populations.

AN AUTOMATIC CYTOGENETIC ANALYSIS SYSTEM
In order to advance beyond the limitations of the present methodology, we first examined the possibility that it might be feasible to scan information by one technique or another, mechanical or electronic, through analog-to-digital conversion, then to analyze by a computer program the number of chromosomes in the cells presented to the scanner, and perhaps to identify any abnormalities present. Using a photomicrograph of a human metaphase cell, a mechanical film scanner and a very simple computer program, we were able to arrive at what appeared to us to be a hopefully high degree of matching. We decided, because of our own experience as well as of the more advanced programming work of others (Dr. Rutovitz's group under the Medical Research Council in London, Dr. Mendelsohn's at the University of Pennsylvania, and Dr. Neurath's at Tufts, for example), that it is potentially possible to solve the problem of automating the third stage, that is, the analysis. We then undertook to design an automatic chromosome analysis system that would deal with some of the problems of the second stage also, i.e., the stage of data collection and recording.

We are, therefore, developing the system shown in a block diagram in Figure 8, which consists of an automatic microscope, an ultraprecision electronic flying spot scanner, and a digital computer. This system is designed to perform the following functions: (a) detection of mitotic cells, (b) placement of mitotic cells under the optical microscope, (c) focusing of the microscope, (d) classification of mitotic cells into categories of suitability, (e) photomicrography of cells, with pertinent data recorded on the film, (f) analysis of the cells within the class limits set under function d, above, and (g) output of results of analysis by means of photograph, printed page or magnetic tape. (The purpose of the magnetic tape is for additional or subsequent processing by computer.)

Figure 8. Block diagram of proposed automatic cytogenetic analysis system.

The digital computer is a Model PDP-7, manufactured by the Digital Equipment Corporation in Maynard, Massachusetts. The precision flying spot scanner is modified from the Model 31 instrument of the Digital Equipment Corporation. It is being built at the University of Pittsburgh using flip chip modules, a Celco deflection system and a Litton Micropix tube. The automatic microscope system was designed and developed in collaboration with the Perkin-Elmer Corporation of Norwalk, Connecticut, and is the subject of more detailed consideration later on.

THE AUTOMATIC MICROSCOPE
It can be shown that the electric field in the back focal plane of a lens is equal to the Fourier transform of the electric field in the front focal plane (5). Thus, optics provide a ready method for obtaining Fourier transforms. Photodetectors placed in the back focal plane can be used to detect the square of the magnitude of the Fourier transform, which is called the Wiener spectrum. Frequently, Wiener spectrum techniques can be used to detect the presence of a particular object in the front focal plane of the lens more readily than methods requiring the observation of the electric field in the front focal plane itself. Such a method of detection has been applied to the high-speed location of chromosome spreads on microscope slides. One of the great advantages of this method is that the Wiener spectrum is not critically dependent upon the location of the slide in the front focal plane; that is, focus need not be critically maintained.

Early in 1965, an experiment was performed at the Perkin-Elmer Corporation, the purpose of which was to set the stage for the construction of a pilot or breadboard version of a mitotic cell locator (4).
In the course of this experiment, the Wiener spectra of mitotic cells, normal cells, and "noise" on the microscope slide were measured (Figure 9).

Figure 9. Wiener spectra and photomicrographs of a mitotic blood cell, a nonmitotic blood cell, and a dust particle ("noise").

The graph in Figure 10 is a plot of data showing light intensity as a function of spatial frequency for these three different objects. These three spectra can be separated by two simple threshold measurements; a threshold at 80 cycles/mm separates the noise from the two types of cells (mitotic and nondividing), and a threshold at 350 cycles/mm separates the mitotic cell from the nondividing cell. If the two signals are called f_H (at 300-400 cycles/mm) and f_L (at 65-90 cycles/mm), the criterion for a mitotic cell becomes:

    f_H > T_1
    f_L / f_H < T_2          [1]

where T_1 and T_2 are two threshold levels.
Figure 10. Light intensity as a function of spatial frequency in the Wiener spectra of the three objects in Figure 9.
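Equations [1] translate directly into a digital test. In the sketch below, the band signals are taken from the squared magnitude of a discrete Fourier transform of a sampled field, standing in for the optical photodetector measurement of the actual instrument; the sampling interval and the threshold values are illustrative assumptions.

```python
# Equations [1] as a digital test: integrate the Wiener spectrum (squared
# FFT magnitude) over two spatial-frequency bands and apply the two
# thresholds. Sampling interval and thresholds are illustrative.
import numpy as np

def band_power(patch, lo_cpmm, hi_cpmm, sample_mm):
    """Total Wiener-spectrum power of `patch` over an annulus of spatial
    frequencies given in cycles/mm."""
    spectrum = np.abs(np.fft.fft2(patch)) ** 2
    fy = np.fft.fftfreq(patch.shape[0], d=sample_mm)   # cycles/mm
    fx = np.fft.fftfreq(patch.shape[1], d=sample_mm)
    r = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))  # radial frequency
    return spectrum[(r >= lo_cpmm) & (r < hi_cpmm)].sum()

def is_mitotic(patch, sample_mm=0.001, t1=1e3, t2=0.75):
    """f_H > T1 and f_L/f_H < T2. T1 must be calibrated to the scale of
    the digital spectrum (it is not the 0.08 meter reading in the text);
    1-micron sampling gives a 500 cycles/mm Nyquist limit."""
    f_h = band_power(patch, 300.0, 400.0, sample_mm)   # high band
    f_l = band_power(patch, 65.0, 90.0, sample_mm)     # low band
    return f_h > t1 and f_l / f_h < t2
```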
Figure 11. The low-to-high spatial frequency ratio (f_L, 67-90 cycles/mm; f_H, 300-400 cycles/mm) of 20 mitotic cells and 44 "noise" objects, plotted as histograms of the normalized light intensity ratio.
Measurements were taken on 20 mitotic cells from several slides furnished by the University of Pittsburgh. The histogram in Figure 11 shows that 18 of these 20 mitotic cells had an f_L/f_H ratio of less than one; thirteen had a ratio of less than 0.8. The next step in the initial measurement was to determine the likelihood of confusing other artifacts with mitotic cells on the basis of these two measurements. A test was made in which slides were scanned manually by a human observer and the output of the 300-400 cycles/mm spatial filter was monitored with a meter. When this f_H signal exceeded a value of 0.08 mµ watts per square mm (determined from the measurements on the 20 mitotic cells), the scan was stopped and the f_L signal measured by the use of a different (65-90 cycles/mm) spatial filter. Forty-four objects that were in the preparation, but were not mitotic cells, were measured in this manner; all had ratios exceeding 1. Figure 11 also shows the histogram of these 44 "noise" objects. On the basis of the initial measurements on spectra of mitotic cells and noise, it was felt that a discriminator using these measurements would be satisfactory. For further testing, a larger selection of data was needed, so a semiautomatic breadboard discriminator was constructed.
A block diagram of the breadboard discriminator is shown in Figure 12, and its actual appearance in Figure 13. The breadboard discriminator monitored the spatial frequency content of the front focal plane at 80 cycles/mm and at 350 cycles/mm, and automatically made the calculations required to determine whether the conditions given in Equations [1] were met. As the slide was monitored, it traveled at approximately 1000 fifty-micron fields of view per second. Whenever the presence of a mitotic cell was indicated, the action of the scan was stopped and the operator then caused the scanner to retrace the scan until the object detected appeared in the field of view. At this time, a determination was made as to whether the object was actually a mitotic cell or, instead, some sort of "noise". Partial areas of a total of 45 slides were scanned in order to sample the various cytogenetic preparations adequately. The average area scanned per slide was 200 mm², where one scan line, 40 mm long × 2.5 × 10⁻² mm wide, equals an area of 1.0 mm². It became obvious that the sample preparation was one major variable in this system. Lightly stained preparations result in low-contrast objects, and discrimination was less successful.

A previously unmeasured area on a human cytogenetic preparation (slide 2255A, peripheral blood) was scanned on the breadboard discriminator. This slide was relatively heavily stained and had a density of about 0.5 mitotic cells per mm². A photographic record of the first 20 detections was made, shown in Figure 14. Sixteen of the 20 detections were mitotic cells. Of the remaining four detections, three were refractile bubbles and one was a long refractile fiber in the preparation.
Figure 12. Block diagram of the semiautomatic breadboard discriminator, the pilot model for the automatic microscope.
Figure 13. The semiautomatic breadboard discriminator.
A measurement of the miss rate was necessarily limited to selected slides because the task of visually counting the number of mitotic cells in all areas to be scanned with the breadboard locator was too time-consuming. In order to determine the percentage of misses, a total selected area of 770 mm² on five selected slides was further examined by visually counting all mitotic cells in them and calculating miss rates. The data for this group of slides are shown in Figure 15 as a plot of the detection rate as a function of f_L/f_H. There is a wide variation in the percentage of mitotic cells detected from slide to slide. The more heavily stained slides give a higher detection figure for a given f_L/f_H ratio. In all cases, the detection percentage increases as the ratio is increased. The "false alarm" data, that is, stops at objects which are not mitotic cells, are plotted in Figure 16. They show that false alarms also increase with f_L/f_H. From these data, a ratio of 0.75 appears useful in avoiding the steeply rising portion of the false-alarm curve. It is important to note that, when surface contamination is neglected, the false-alarm rate drops by a factor of 10 or more. The surface contamination generally consists of dust settled on the surfaces of the slide, or of drops of slide mounting cement on top of the cover slip and on the back of the slide. Note also that elimination (or reduction) of this surface problem would allow the selection of a higher f_L/f_H ratio, which would then result in a higher percentage of detection.

In the course of the semiautomatic discriminator phase, approximately 10 million 50 µ fields of view were scanned. The results are presented in Table 2. It is important to note that, for both human and animal slides, 80 per cent of the false stops in the 1965 series were due to surface contamination.
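The choice of the 0.75 operating point can be illustrated by sweeping the threshold T_2 over labeled f_L/f_H measurements and tabulating the two competing rates, as in Figures 15 and 16. The sample ratios below are invented; note that the measured noise objects in this study all had ratios exceeding 1.

```python
# Sweeping T2 over invented, labeled f_L/f_H ratios to show the
# detection versus false-alarm trade-off of Figures 15 and 16.
import numpy as np

mitotic = np.array([0.40, 0.55, 0.60, 0.70, 0.72, 0.80, 0.95, 1.10])
noise   = np.array([1.05, 1.20, 1.40, 1.60, 2.00, 2.50])

for t2 in np.arange(0.50, 1.01, 0.05):
    detection = (mitotic < t2).mean()        # fraction of mitotic cells kept
    false_alarms = int((noise < t2).sum())   # stops at non-cells
    print(f"T2 = {t2:.2f}  detection = {detection:.0%}  "
          f"false alarms = {false_alarms}")
```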
Figure 14. Photomicrographs of the first 20 objects detected as mitotic cells in a typical test run of the semiautomatic breadboard discriminator.
[Figure 15 is a graph of per cent detection of mitotic cells against the fL/fH ratio (0.1 to 1.0) for four preparations: heavily stained human (2272A), heavily stained human (2226BM), lightly stained rat (R #61, Exp. 2), and lightly stained mouse (1-CD, 4th series).]
Figure 15. Detection rate as a function of the low-to-high spatial frequency ratio.
[Figure 16 is a graph of false alarms per sq. mm, averaged for six slides, against the fL/fH ratio (0.1 to 1.0); one curve shows total false alarms, the other total false alarms neglecting signals due to surface contamination.]
Figure 16. Effect of the low-to-high spatial frequency ratio on the "false alarm" rate.
TABLE 2

RESULTS OF MITOTIC CELL LOCATING STUDIES WITH THE SEMIAUTOMATIC BREADBOARD DISCRIMINATOR SYSTEM

                          50-Micron Fields   Total   Mitotic      Total         Causes for False Stops
    Specimens Scanned     of View Scanned    Stops   Cell Stops   False Stops   In preparation   Surface contamination
    1965 Series
    Human blood*          4.9 × 10⁶          1360    377          983           213              770
    Rodent bone marrow*   2.1 × 10⁶           634     41          593           122              471
    1966 Series
    Mouse bone marrow†    10⁴ (approx.)        50     39           11           —                 11
    Human blood‡          10⁴ (approx.)        25     21            4           —                  4

Slides supplied by: * University of Pittsburgh; † National Cancer Institute; ‡ British Medical Research Council.
The conclusion drawn from this part of the project was that a fully automatic mitotic cell locating system using the Fourier transform appeared feasible for use with cytogenetic preparations of human blood cells, but that further study was required to determine feasibility for use with rodent bone marrow preparations. The primary difficulty with the latter appeared to be that the chromosomes showed lower optical contrast than those of human blood cells. It was decided that this problem required further investigation, including: (a) altering the wavelength of illumination, (b) increasing the dynamic range of the spectrum analyzer, and (c) altering biological staining techniques and sample preparation. The third approach was partially tested by using two slides from other laboratories in the 1966 series, with the results indicated in Table 2. The marked improvement in the percentage of valid stops in this necessarily small study was most encouraging. It reflected some technical modifications in the locating apparatus as well as the effect of differences in slide preparation methodology.

At the close of the preliminary study described above, it was decided that the results justified the construction of a complete mitotic cell locating, scanning, and data processing system. This part of the project was initiated early in 1966. The optomechanical portion of this system will be under control of a general purpose digital computer. The heart of the system is the platen and microscope assembly, having a six-slide capability. At a platen rotation rate of 20 rpm, one hour is required to scan the six slides, or an average of ten minutes each. This is equivalent to 2500 fifty-micron fields of view per second. The optomechanical assembly is shown in Figure 17.
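As a rough cross-check on the quoted throughput, the spiral geometry described below (a 3.5 inch average radius and 20 rpm) implies a field rate of the same order. The following back-of-the-envelope sketch is our own, not part of the original design; the difference from the quoted 2500 fields per second presumably reflects a smaller effective average radius.

    import math

    # Back-of-the-envelope check of the scan rate, using figures quoted in the text.
    rpm = 20.0
    avg_radius_mm = 3.5 * 25.4                 # average spiral radius, ~89 mm
    field_mm = 0.05                            # one 50-micron field of view
    track_mm = 2 * math.pi * avg_radius_mm     # scan path per revolution, ~559 mm
    fields_per_rev = track_mm / field_mm       # ~11,000 fields per revolution
    fields_per_sec = fields_per_rev * rpm / 60.0
    print(round(fields_per_sec))               # ~3,700: same order as the quoted 2,500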
Figure 17. Optomechanical assembly of the automatic microscope. Components include 1: slide holder; 2: illuminator; 3: transfer lens; 4: photomultiplier tubes; 5: 35 mm camera.
The slides are mounted circumferentially about the disk, which is driven by either a synchronous rotation motor or a stepping motor through a clutch and gear assembly, causing each of the six slides in turn to traverse between the scanner and its laser-supplied illumination (No. 1 in Figure 17). For each complete revolution of the disk, a radial servo moves the carriage by 25 μ to provide a simultaneous cross traverse. The resulting motion of the disk under the scanner provides a spiral scan with an average radius of about 3.5 inches and a lead of 25 μ. The device includes a pan which is filled with microscope immersion oil to a level which will submerge the slides in the disk and permit the use of an oil-immersed objective for the microscope to record a high-resolution image of the mitotic cell chromosomes.

When a point of interest (that is, a mitotic cell) is detected during scanning, the outputs of the position data units are recorded, and the computer provides a signal that will remove the sync drive to the platen and insert the rotary step motor drive. The step motor drive will position the platen until the encoder output matches the coordinates stored, plus a delta coordinate representing the distance between the viewing and acquisition optics. When a match occurs, the servo will stop and the detected object will be under the viewing microscope, which is illuminated by the illuminator (No. 2 in Figure 17).
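The revisit step reduces to a simple closed loop. The sketch below is an illustration only, with the encoder and step motor abstracted as callables and the coordinate treated as a single axis; in the actual system these servo signals are generated by the computer and hardware just described.

    # Hedged sketch of repositioning a detected object under the viewing optics;
    # all names are illustrative.
    def revisit(stored_coordinate, optics_offset, read_encoder, step_motor):
        """Step the platen until the encoder output matches the stored
        detection coordinate plus the viewing/acquisition optics offset."""
        target = stored_coordinate + optics_offset
        while read_encoder() != target:
            step_motor(+1 if read_encoder() < target else -1)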
Once the slide has reached the desired position, the computer is programmed to perform autofocus and centering by scanning the viewing microscope slide with a precision CRT. The CRT face is imaged on the slide by a transfer lens (No. 3 in the figure) and its output monitored by photomultipliers (No. 4). All servo signals will be generated by the computer by comparing the digital position of the mitotic cell location to the actual position of the visible microscope. When the optimum position and focus have been obtained, photo recording may be done by means of a 35 mm camera (No. 5) having a film capacity of 250 exposures. CRT-generated data may also be photorecorded. When photorecording is complete, a start-scan signal will be sent out from the digital processing system. If necessary to safeguard the microscope, the focus servo will be programmed to back away slightly.

The system (Figure 18) is enclosed in a dust-free, air-conditioned housing in order to maintain absolute cleanliness at all times. The immersion oil is under constant circulation and filtration. A control panel is provided, a detail of which is shown in Figure 19. It includes indicators giving the binary position of the axial, radial, and focal coordinates, as well as indicators showing the particular mode in which the system is operating. When in the manual mode, the system is under control of push-button program switches providing settings for visual slide scanning,
Figure 18. The automatic microscope. The laser light source is housed in the smaller unit on the left. A viewing lens system and the control panel are located on the front of the main instrument housing.
Figure 19. The automatic microscope control panel. Switches provide settings for visual slide scanning, CRT slide scanning, slide photo positioning, and CRT photo positioning.
CRT slide scanning, slide photo positioning, and CRT photo positioning. Additionally, manual shutter controls are provided to activate shutters between the various photomultiplier tubes and the coherent and noncoherent sources of illumination. Individual power controls are furnished for the illuminating system, the electronics blower, the oil cleaning system, and the clean hood system. Manual controls are also furnished which allow the platen to be stepped radially, axially, and in the focus direction by three separate switches. When in the automatic mode, the entire system is under computer control. In this mode, the push-button program switches and shutter control switches are disabled and the computer program determines all operational sequences.

Discussion
Lederberg: How many good cells (that is, cells you would want to examine) were rejected in your studies with the semiautomatic discriminator?

Preston: That I cannot answer, because a stop was considered valid if the cell was mitotic, not necessarily if the cell should be photographed. We did not make this latter judgment.
Lederberg: There is great disparity in that. You do not know how many cells you passed over that you would have wanted to stop at.

Preston: We do have some data on cells missed, as shown in Figure 15. These data were for a smaller number of fields of view taken from four slides.

Mendelsohn: What ratio do the numbers in Table 2 correspond to?

Preston: Those numbers correspond to an average ratio of low-to-high angle scatter of 0.5 for the human cells in the 1965 series. The investigation of variations in ratio was not made until after it was discovered that the optimum ratio found in our preliminary experiments on human cells gave very poor performance; then other ratios were investigated. This investigation, however, was not extended back to the human peripheral blood in order to continue the curves, so that these data result from an average ratio of 0.5.

Eden: What is the scan rate for a human looking through the microscope? How many fields can he examine in a minute or an hour? I presume a human looks at an area larger than 50 μ when he scans a field.

Wald: I am not sure I can convert the working-time figures I gave for the data collection stage in Table 1 into quantitative terms of fields per hour. It would depend on many variables, including the mitotic index of the slides and selection criteria for mitotic cells.

Papert: Qualitatively, how does your automatic scanning device compare with a human? Is it faster or much slower?

Wald: Faster.

Preston: Average performance time shown in Table 1 indicates that between six and eight hours are required to locate and to photograph 15 good mitotic cells. We found in Dr. Wald's preparations that there is approximately one mitotic cell per square millimeter, or a total population of about 200 mitotic cells per slide. This varies considerably, but on the average about 200 mitotic cells per microscope slide is typical. The device we have been describing will scan a square millimeter in approximately one second, and therefore will locate a mitotic cell—not necessarily a good one—about once a second. The information that we do not have (perhaps Dr. Rutovitz could shed some light on this matter) is how many detected mitotic cells are good mitotic cells. As I recall from a meeting held in June of 1966 on this subject, the ratio is somewhere between 10 and 20 per cent. If that is true, then it would be necessary to locate practically all of the mitotic cells on one microscope slide in order to find the 15 or 20 good ones that one would like to photograph.

Rutovitz: In the cells we have studied, the percentage of usable cells was between 2 and 17 per cent of all mitotic cells, so perhaps an average of about 10 per cent is not a bad estimate. One thing to emphasize is the extreme variability of this material.

Johns: Can the figures cited be interpreted as showing that it takes about ten seconds to find a valid cell in this particular preparation?
Preston: Yes. We could state that the time spent in locating a valid cell would be on the order of ten seconds. However, there must be other time spent in decision making and in photographing. Until the completely automatic system is operating, we really do not have the answers to many of the questions that have been raised.

Lederberg: In this question of what one means by a good cell, good means the best one can get. We are working at the boundary of human effort. As much human effort is expended as one is able to afford to get the best preparation. No metaphase spread is ever as good as one would like it to be.

Preston: And if that is true, with a mechanical locator one may become even more selective, and perhaps the ten per cent figure need not apply any more. At least that is the implication. We will have to wait until the more automatic system is in operation for some of these questions to be answered.

Dixon: I wonder if I could ask a question about Figure 15? You had indicated the effect of the contrast ratio in terms of the percentage of items you discovered, out of the total you would like to discover. Do you have any indication as to the effect of changing the contrast ratio on the discovery or nondiscovery of artifacts and contamination?

Preston: Yes. To a certain extent that is answered by Figure 16. Of course, one of the dangers in increasing the ratio is that the probability of picking up artifacts becomes higher. Figure 16 indicates the false alarms, that is, the invalid stops or stops due to artifacts, on a per square millimeter basis. One of the reasons for setting the threshold in the initial experiment at about 0.5 is that the false-alarm rate was low. Two curves are shown in Figure 16. The upper curve is the total false-alarm rate on a per square millimeter basis if all invalid stops are included, both those due to surface contamination and those due to artifacts of some sort in the preparation itself. If invalid stops due to surface contamination are eliminated, then the lower curve results. In both cases, as the ratio is made higher, there is a sharp increase in the total number of invalid stops, where these stops are primarily due to surface contamination.

McCormick: What is the output of the automatic microscope? Is it film?

Preston: There are two outputs. One is 35 mm film, which is basically for record keeping. The primary output is the video signal to the computer, which is then used in automatic chromosome analysis.

McCormick: Then you are planning to do an on-line analysis.

Preston: Yes, that is correct; Dr. Wald is planning to do it.

McCormick: How many frames a second are scanned?

Wald: Approximately four seconds for a full frame scan.

Minsky: Is it a scan rate or is it a program?

Wald: It is a program-generated scan. In other words, each point is individually examined by a program.
Preston: The cathode ray tube is also used to generate alphanumeric characters for recording data on film.

Minsky: Can you program a lens field change, an objective lens change?

Wald: No. That is fixed.

Minsky: Such a capability might be useful for the high-speed scan with low resolution.

Preston: If the high-speed scan were done by the cathode ray tube, which is an alternative that certainly should be investigated, then one would surely want to scan at as low a resolution as possible. This might be a factor of 10 lower than is being used for photographing and scanning of the individual spread.

Eden: It is a fixed lens, but is it substitutable?

Preston: Yes, but that is not programmable. Any change would have to be manual.

Papert: Can you give some indication of why some valid cells were missed?

Preston: I do not believe that the ratio of high-to-low frequency scatter and the absolute threshold of high-frequency scatter were measured for those that were missed. Either the high-angle scatter was not sufficiently great to trigger the threshold, or the ratio of low-angle to high-angle scatter was too high. The indication was either that a nonmitotic cell was present, which would indicate an unusually low amount of high-angle scatter, or that a noise object was present, meaning that both the low- and the high-angle scatter were equally high.

Papert: Those are the false alarms, or the ones that were missed?

Preston: Those would be ones that were missed. In other words, two conditions are required for detection. Either one or both of these two conditions could be violated by one that was missed.

Papert: Is it obvious that there is nothing to be gained by using a more structured analysis than the simple application of these two thresholds? Since approximately a millisecond is used for each scanning field, it appears there is time to do much more analysis than simply testing two thresholds.

Preston: Yes. In the machine that is now being prepared, the particular scatter angles that are measured can be changed. Therefore, more data on what might be called the light scatter signature could be read out. With a 50 μ field of view and a wavelength of approximately 0.6 μ, there are only about 100 meaningful scatter angles. In other words, the angular resolution is always the wavelength of light divided by the size of the illuminated area. We are using only two of these scatter angles in the present signature.

Papert: It seems wrong to carry out an elaborate experimental study and throw away all the data except what passes through a simple threshold filter. Experiments may show that the threshold decision procedure is best, and one would then use it for routine production runs. But, in order to
establish this, one should gather in all the data so as to study and compare different decision procedures. In this case it would have cost little time to record the scatter spectra or at least the real ratios.

Preston: If we had had, let us say, analog-to-digital conversion at both the high- and low-angle scatter angles, then certainly all could have been recorded within the one millisecond of time.

Papert: What is the resolution on the scan in the machine and how much does your machine cost compared with an analog computer?

Preston: The new automatic microscope will be capable of doing all of these things, but the breadboard discriminator, which was built in 1965, does not have this capability. All of the data that have been accumulated to date were provided by the initial breadboard model, and no other data are yet available from the new machine. The optical resolution of the scanner is limited by the numerical aperture of the objective used, and with a high numerical aperture objective of about 1.2, the resolution would be about 0.2 μ. The CRT is imaged below the theoretical optical resolution, so that analytically, at least, it does not degrade the system resolution.

Papert: What resolution in terms of points of CRT scan are you aiming at?

Wald: The CRT itself is rated for 4096 lines in a 75 mm square raster, of which the 36 by 24 mm center is imaged down to an 80 × 50 μ field. However, due to the limits of optical resolution (0.2 μ), the useful number of lines is 275.

Lederberg: I do not quite understand the physics of the laser operation. You have the chromosomes immersed in a transparent medium of similar refractive index. Do you get the same scatter picture that you would if you did not stain the chromosomes?

Preston: No. The scatter is very strongly dependent upon the stain that is used. The chromosome spread acts as a spatial amplitude modulator.

Glaser: It is diffraction around opaque objects generated by the stain.

Preston: It is a diffraction pattern.

Lederberg: Most of the artifacts are not going to be of the same color as the stain. A two-color system could probably give much sharper discrimination than an angular system.

Preston: Yes, if you measured the angular distribution of scattered light at more than one optical frequency, that is, at more than one wavelength.

Glaser: I did not understand why you needed to use a laser to accomplish this. You would have more flexibility in doing what Dr. Lederberg is suggesting if you used white light.

Preston: It would be possible to use white light, except for the fact that you cannot, over a 50 μ field, or any field, create the brilliance required to scan at these speeds.

Glaser: You could if you were using the whole spectrum for your detector, instead of just one wavelength.
Preston: If you were to use the entire spectrum, then there is an uncertainty in your angular scatter of about 2:1 because of the ratio of high-to-low frequencies. Consequently, the scatter pattern is slightly degraded. Whether this would degrade the location performance or not, we do not know. The prime reason for using the laser was its brilliance rather than its being coherent in terms of a single wavelength. It would have been far better if we had used a laser in which the wavelength itself was variable.

Minsky: At the moment, lasers are probably cheaper than other light sources simply because they are in mass production.

Papert: I would like to understand the focus problem a little better. Does it make sense to ask how many focal planes would be necessary with this particular kind of material in order to get a sharp focus?

Preston: When carefully prepared, the chromosome spread over its diameter (about 50 μ) is essentially in one focal plane. What must be determined by any automatic focusing scheme is what focal plane gives the sharpest focus. As you go through focus on one chromosome arm, the video signal is quantized into a series of pulses. In other words, each time a chromosome arm is passed by the scanner, you get a pulse. The duration of this pulse will become shorter as you go either ahead of or behind focus.

Papert: I see how to focus. My point is that the value of the laser system depends on how many focal planes would have to be searched to obtain equally good results by simple-minded optical methods.

Sacher: What is the difference between the mouse diagram and the human diagram?

Preston: Originally we performed essentially two experiments, one with approximately 40 chromosome spreads and the later one with several thousand. In the first experiment, using various spreads of both mouse and human, the curves were averaged over both human and mouse chromosomes. At that point in time there did not seem to be a significant difference. As soon as we ran the larger experiment, and I think this is characteristic of almost all experiments in biology, i.e., as soon as thousands of chromosomes were looked at, the results were quite different. Rather than continue this laborious approach with a simple breadboard, we are waiting for the faster machine in order really to attack this problem. The faster machine will be flexible enough so that advantage can be taken of the fact that there may be some difference between the mouse chromosome signature and the human chromosome signature.
THE APPLICATION OF CHARACTER RECOGNITION TECHNIQUES TO THE DEVELOPMENT OF READING MACHINES FOR THE BLIND*

MURRAY EDEN†

Massachusetts Institute of Technology
Cambridge, Massachusetts
I am very glad that Dr. Waxman broadened the scope of this conference to include nonnatural objects for classification, because letters, and certainly printed characters, are very unnatural. There is nothing in physics, biology, or any other science that I know of that prescribes the particular forms that characters may take. The word that we generally use is "conventional" rather than "nonnatural". As a matter of fact, there are a number of other highly conventional symbol systems, including chemical structure (as Dr. Waxman mentioned), music and speech. I am going to talk about a conventional symbol system and what we can do to recognize these symbols, because fortunately the interpretation of this particular symbol system is quite well defined. After all, almost everyone knows what a letter in his own language is.

Many people have worked on recognizing characters—printed characters—either from transparencies or from opaque material. However, most of them have had quite a different motivation from ours. In most cases, the motivation has been to take a printed text and to store it in a computer, the object being to transduce the characters into this new form (for example, as entries on a magnetic tape) at a very high rate of speed and with a very low error rate. Our motivation is different; it is to transform the printed material into some other form which is acceptable to a blind person or to anyone else who cannot read at that particular moment. Under these circumstances, as humans we have a very good computer available within us to do the processing once the transduction from the sense of sight to some other sense has

* This work was supported in part by the Joint Services Electronics Program (Contract DA28-043-AMC-02536[E]), the National Science Foundation (Grant GK-835), the National Institutes of Health (Grant 2 PO1 MH-04737-06) and the National Aeronautics and Space Administration (Grant NsG-496).
† The work described here was performed primarily by Professor Samuel J. Mason, Professor Donald E. Troxel, Professor Francis F. Lee, Charles Seitz, Glenn Wickelgren, Charles Fontaine, Kenneth Ingham, Armen Gabrielian, Alan Citron, and Eric Jensen.
been accomplished. Thus, we need not concern ourselves too much with the issue of error rate.

Experiments have been performed to determine the error rate that human beings will tolerate in reading mutilated texts (1). While there may be some argument as to the exact number of errors a human reader can correct without any appreciable decrease in his reading rate, it is easily verifiable that in reading a novel, for example, if there is a spelling error rate of one per hundred characters, the reader will never notice the errors except perhaps to say that the proofreading was rather poor. If the error rate rises above about five per cent, the reconstruction can be quite difficult. Obviously, if there is numerical information in the text, no reconstruction is possible no matter what the error rate. Except for this special case, an error rate of one per cent is completely tolerable to human beings in the reading of a natural language.

Another aspect in which we depart from the usual motivation is that we are not concerned with speed. If we take as high-speed conversation one hundred words per minute and estimate the average number of letters per word as six, then we arrive at a rate of about ten characters per second as an adequate recognition rate. That is all that we are aiming for. These ground rules, therefore, change the nature of one's approach to the problem.

Let me describe the problem that we have decided to tackle. We wish to scan printed material—any printed material in English, hopefully printed in justified lines, but exact alignment is not essential. We would not claim that we can read text written helically around a torus, like an advertising sign, but we can read anything written on a flat piece of paper with more or less straight lines. We should be able to read text if it is written in Roman characters or in italics, but it should be an English alphabet, and specifically it should be English. I will refer to the necessity for specifying the language later on.

To summarize, our purpose is to read the individual letters, to produce appropriate output, and to ask the human being who lacks sight to study these kinds of output so as to discover whether it is comfortable, natural and useful to him. Of the kinds of output that we consider, the most exciting one, if we were able to achieve it, would be synthetic speech, or natural speech if we could do so; in other words, to have it sound like Walter Cronkite reading The New York Times. We are not yet able to produce such an output. We have, however, developed two other forms of output, and this part of the work has been completed.

The first of our two nonvisual modes of presentation is spelled speech. We simply spell out the word, which is recognized acoustically; if the word "dog" is read, then our device says "d-o-g". That is spelled speech. Another output is Braille. There is a direct correspondence between the symbol in the alphabet and the Braille symbol in Class I Braille;
translation programs are available for conversion of English words into Class II Braille.

So much for motivation. Let me describe the system very briefly. Let me also, during the first part of my presentation, stick very closely to what we have accomplished, and save for the second part the extensions we hope to be able to make.

Initially, the text to be read is placed upon a program-controlled carriage. Our current carriage holds a piece of paper approximately six inches long and one inch wide, equivalent to about three lines in an average book page. The carriage is programmable in the sense that, when a sufficient number of letters are read, an instruction is sent to the carriage to move left so that the next unread letter will be close to the margin of the window. When the program has completed a line, a carriage return instruction returns the text to its initial position.

Our sensing device is a flying spot scanner with rather coarse spatial quantization. The letter "e" in Figure 20 has approximately 30 spatial quantization levels in the vertical direction and about 30 horizontally. The quantization steps are equal in the horizontal and vertical directions. The spot of the cathode ray oscillograph is imaged on the paper. A pair of photomultiplier tubes are placed off axis so as to detect the intensity of the scattered light. When the cathode ray tube image is focused on a black region, i.e., on a letter, there is less scattered light than when the spot is focused on a white or blank region. The scan is also program controlled, and I will come back to that question in a moment. The logic which controls the scan in its various modes was constructed in our laboratory and is not part of the computer.

Both input and output devices are located physically in our laboratory. They are connected to the computer by a high-speed data link; the principal computer we used is a PDP-1. It is located in another building about 500 yards away, and I daresay that one of the most annoying difficulties we have encountered in the construction of the hardware has been in the reduction of the number of errors in the high-speed lines going between our laboratory and the computer. The fact that we have had to perform our computation elsewhere, as well as the fact that the PDP-1 is used primarily in a time-sharing mode, has probably slowed down the rate at which we can recognize characters by about a factor of ten.

In order to describe the various scanning operations, assume for a moment that the letter "e" in Figure 20 corresponds to the first letter of a word. For example, this might be the first letter of the word "eat". Our first operation is to perform a series of horizontal scans over a field of three or four or five letters—the exact number of letters is not very important—and to count the number of black spots in each horizontal line. Thus, we are essentially preparing a histogram of black area as a function of vertical position in the line.
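This histogram step can be sketched as follows, assuming the scanned field is held as a binary matrix; the real system interrogated the paper directly, and both the function names and the field-finding heuristic here are our own illustration.

    # Minimal sketch: profile of black area per horizontal scan line, used to
    # centre the text line and to locate the upper, middle, and lower fields.
    def vertical_profile(image):
        """image: list of rows of booleans (True = black)."""
        return [sum(row) for row in image]

    def middle_field(profile, fraction=0.5):
        """A crude estimate of the middle field: the band of scan lines whose
        blackness exceeds a fraction of the peak, since every letter of the
        font occupies the middle field."""
        peak = max(profile)
        rows = [i for i, v in enumerate(profile) if v >= fraction * peak]
        return min(rows), max(rows)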
Figure 20. Illustration of the code word generation procedure.
By taking only a few letters at a time, our program can tolerate a few degrees of tilt in the paper. Note that the letter "e" is the kind of letter in the English language which falls entirely within an intermediate range; we have generally referred to this as the middle field. There is an upper field, occupied by letters such as f, h, k, or l in most English fonts, as well as by the capital letters, and there is a lower field occupied by letters such as g, y, and so forth. We prepare the histogram in order to center the line and locate the three fields. As I have mentioned, we do not scan the whole line at a time; we need only a few letters for a histogram.

Having centered the letter, we then go over to another program mode. In this mode we begin a vertical scan starting in the lower left corner
and making successive vertical passes, moving left to right between the lower and the upper field boundaries. The vertical scan will ultimately impinge upon a black region which will correspond either to a letter or perhaps to a little speck on the paper. At this point in the procedure we enter into a contour scan.

For those who do not know what a digitally controlled contour scan is, perhaps I can illustrate it with Figure 21. Consider a square array of points, corresponding to positions which are acceptable to the program; the scan will examine the reflectivity of these points in order to determine whether they are white or black. Assume that we come against some contour of a letter; under these circumstances there is a white region more or less on the left and a black region more or less on the right. Suppose that the original scan finds, as its first point, the point marked "X" in Figure 21. There are two basic instructions in the program which may be paraphrased as follows: "If you find a black point, turn left"—in this case the next point reached, if this instruction is followed, will be a white point. Under these circumstances the command will be, "If you have just found a white point, turn right"; in this particular circumstance the scan turns right and proceeds from one point to the next in the same way until another type of command brings it to a halt.

There are certain other aspects of our procedure that I might mention. Note, first of all, that we do not store the letter as a matrix in the computer. The letter is its own store.
Figure 21. An example of the contour tracing algorithm.
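A minimal sketch of this contour-tracing rule follows, including the three-turn guard against "twinkle" that is described below. It assumes a binary image stored in memory, whereas the actual program re-interrogated the paper for each point; all names are illustrative.

    def contour_scan(image, start, max_steps=10000):
        """Trace a contour by the turn rules quoted in the text."""
        moves = [(-1, 0), (0, 1), (1, 0), (0, -1)]   # up, right, down, left
        r, c = start
        heading = 0
        last_turn, run = 0, 0                        # sign and length of the current turn run
        contour = []
        for _ in range(max_steps):
            if not (0 <= r < len(image) and 0 <= c < len(image[0])):
                break                                # ran off the scanned field
            turn = -1 if image[r][c] else 1          # black: turn left; white: turn right
            if turn == last_turn and run == 3:
                turn = -turn                         # never turn more than three times one way
                run = 1
            elif turn == last_turn:
                run += 1
            else:
                run = 1
            last_turn = turn
            if image[r][c]:
                contour.append((r, c))               # only coordinates are kept, as in the text
            heading = (heading + turn) % 4
            dr, dc = moves[heading]
            r, c = r + dr, c + dc
            if (r, c) == start and contour:
                break                                # back at the starting point: halt
        return contour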
The information relevant to any particular test letter is stored in the computer as simply the coordinates of the points located in the contour scan. Under our present operating conditions it would simply take too much time to dump the matrix corresponding to a letter into the memory of the computer. It may well be, however, that if a core memory is available to us within the laboratory, we will find it more expedient to store the matrix and operate entirely on the matrix rather than referring to the optical image for each successive point.

The fact that we scan directly from the paper makes our procedure vulnerable to certain kinds of errors that would not exist if we had a simple matrix representation. Since we depend upon a parameter measured from the light gathered by a photomultiplier tube, and since the illumination of the paper is subject to certain statistical fluctuations, it is inevitable that boundary points will occasionally be regarded as white and occasionally be regarded as black. We have generally referred to this phenomenon as "twinkle", because that is more or less what it looks like. Our program has had to be modified in order to take care of this particular problem. In terms of our unmodified program, twinkle may result in our entering a small loop of four points, all of the same color. The problem was resolved simply by putting in an additional program step that says in effect, "Don't turn more than three times in any one direction. If you have turned right three times, turn left no matter what the state of the spot; conversely, if you have turned left three times, turn right." Most twinkle configurations can be avoided by this instruction. There is a second-order twinkle which is somewhat more complicated. As was mentioned, the first-order twinkle occurred when the scan was lost in a square all of whose points were of the same color. The second-order twinkle occurs when the scan is lost in a closed path of seven points arranged more or less like a bow tie. In any case, we have not needed to complicate our program further so as to eliminate this kind of error.

So much for the contour algorithm. When the contour scan has been completed, we can display the points of the contour path as shown in Figure 22. You will note in the rightmost object in the figure that only the external contour is available to us in this program, so that the bounded region in the upper part of the "e" is not noted. (The other three objects in Figure 22 represent the output of connecting extremal points and are not relevant to this discussion.) This omission causes certain problems as, for example, in distinguishing an "e" and a "c". In this particular case, an ambiguous identification may occur that will require an additional test to be performed on the original letter. In the great majority of cases, however, only the coordinates of the contour points are used from this point onwards in the program.

The algorithm then proceeds to generate a code word. The procedure for generating a code word which has worked best for us in such pattern recognition tasks as handwriting recognition or sheet music recognition is to try to categorize objects by giving them a description in terms of the
Figure 22. The rightmost display in the figure represents a contour scan display of the letter "e".
algorithmic procedure used to examine the object in question. In the case of character recognition, what we do is to generate a code word in the usual binary alphabet, adding letters to the word which express what happens as we go around the contour. In the particular form that we have used, the code word consists of two parts. The first half of the code word expresses the sequence of extrema: the x and y maxima, and the x and y minima. In the second half of the code word there is a designation corresponding to the coordinate for each of the extrema we have found in the order of the contour scan. The quantization of the coordinate position is very coarse; we simply record the quadrant in which any particular maximum or minimum falls.

Since we scan from left to right in a vertical scan mode, it is obvious that the first time we find a black point, the position is comparable to an x minimum. In the rest of the word we designate successive occurrences of x extrema by ones and y extrema by zeros. Some people may say, "Since you have an x minimum and maximum and a y minimum and maximum, you need two bits to specify an extremum." In actuality only one is needed. The reasoning is rather simple. Suppose that we begin at an x minimum. If the next extremum is also an extremum in x, it obviously must be an x maximum. However, if the next extremum is an extremum in y, it must be a y maximum, since the sense of rotation around the contour is clockwise.
An analogous argument can be made at each extremum around the contour. In the example shown in Figure 23, we see an x minimum (marked with an arrow) followed by an x maximum, an x minimum, an x maximum, a y maximum, etc. In this way we generate a code word for the extrema, and in an analogous way we generate a coordinate word, in this case a word with two bits for each extremum. In this instance the first extremum occurs in the quadrant (0, 0), the second in the quadrant (1, 1), and so on.

Note, however, that in Figure 23 the designations of the position of the respective minima and maxima are rather strange. We have employed a Cartesian coordinate system for the two dimensions of the display, but the positions marked in Figure 23 as extrema are not, strictly speaking, the mathematically defined extrema of the contour of the character in Cartesian coordinates. An explanation of this somewhat bizarre procedure depends on the fact that we are not interested in finding every local maximum and minimum.
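As an illustration, the assembly of such a code word might look as follows; the list format of the extrema and the function name are our assumptions, while the one-bit extremum code, the two-bit quadrant coordinates, and the nine-extremum truncation follow the text.

    # Hedged sketch of code-word assembly from the ordered contour extrema.
    def make_code_word(extrema, width, height):
        """extrema: list of ('x' or 'y', (x, y)) pairs in contour-scan order."""
        extrema = extrema[:9]                    # overflow is simply discarded
        kind = ''.join('1' if axis == 'x' else '0' for axis, _ in extrema)
        quadrant = ''.join(
            ('1' if x >= width // 2 else '0') + ('1' if y >= height // 2 else '0')
            for _, (x, y) in extrema
        )
        # up to 9 + 18 = 27 bits here; the real signature appends four more
        # bits for field and height-to-width ratio, giving the 31-bit maximum.
        return kind + quadrant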
Figure 23. Display of every fifth contour point of lower case "s". The y extrema are marked by a box and the x extrema by an x.
Figure 24. An illustration of the "gear backlash" smoothing algorithm in one dimension. The heavy line represents the original function; the lighter continuous line represents the smooth function.
We are well aware of the fact that, given our algorithm, it is highly likely that there will be a whole host of maxima and minima due simply to the irregularity of the contour edge. In consequence, we need to perform some kind of smoothing operation. The smoothing operation is designed to eliminate the small local maxima and minima. The mechanism that we have used might well be referred to by analogy to mechanics as "gear backlash". My description of this procedure will be in one dimension rather than two, but it obviously is generalizable to higher dimensions. Consider a function similar to that shown in Figure 24. Imagine a forklike device that "rides" on the function from left to right. When the function contacts the upper tine of the fork, the fork is pushed upward; when the function comes against the lower tine of the fork, the fork is pushed downward. As the function in Figure 24 rises, the fork rides on its upper tine until the first local maximum appears. Note that the position of the fork remains constant through the first local minimum and then begins to rise again with the function. However, beyond the second maximum the function comes in contact with the lower tine and the fork begins to move downward. Further, note that the maximum in the graph of the fork's position does not coincide with the maximum of the function it is smoothing. While it is quite feasible to adjust the value of the abscissa so that the maximum occurs at the abscissa value corresponding to the maximal value of the function, this is a minor correction and we have not introduced it in our first recognition program. In our program, the width between the tines of the fork is a variable under our control. The type faces that we have used appear to be optimally discriminated with a smoothing width of approximately one-sixth of the letter size.
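A one-dimensional sketch of this smoother follows, with the fork represented by its center position and the tine gap as a parameter; the names are ours, and the real program applies the idea to the two-dimensional contour.

    def backlash_smooth(values, gap):
        """Smooth a sequence by "gear backlash": the output rides between two
        tines separated by gap, so wiggles smaller than the gap are flattened."""
        center = values[0]
        out = []
        for v in values:
            if v > center + gap / 2:      # function pushes the upper tine upward
                center = v - gap / 2
            elif v < center - gap / 2:    # function pushes the lower tine downward
                center = v + gap / 2
            out.append(center)
        return out

With the tine gap set to about one-sixth of the letter size, as the text reports, the small extrema due to edge irregularity disappear while the structurally significant ones survive.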
Under these circumstances we generate a set of code words that are rather modest in length. The number of extrema in a letter of average length is about six. The maximal-length word which our program permits contains 31 bits: a maximum of nine bits of extrema code, 18 bits corresponding to the coordinates of the extrema, and an additional four bits to specify field and height-to-width ratio. If, for one reason or another, a particular letter generates a code word longer than nine extrema, the overflow is simply discarded. The identification of a particular letter, number, or punctuation mark depends entirely upon this code word or "signature".

Let me now describe the recognition procedures. The memory of the computer contains a little "dictionary" for every upper case and lower case letter, numerical symbol and punctuation mark. Each of these dictionaries contains the signature listings corresponding to that particular symbol. The recognition procedure consists simply of comparing the signature generated by the processes described previously with the entries in each of the dictionaries. If an exact match to the signature is found in one of these lists, that completes the identification. Of course, in some cases either the signature matches identical code words in perhaps two dictionaries or, what is more common, the signature matches no code word already in the dictionaries.

It is obvious that at the outset these dictionaries contain no entries and hence no identification can be made. In consequence, we first operate our reading system in a training mode. We present the system with textual material from a given book, in other words, from a particular type font. The sample shown in Figure 25 is taken from a sixth-grade reader; it was upon this text that the machine was trained. The first letter is entered into the device. It is scanned, a code word is generated, and the code word is stored in the dictionary according to the command of the manual operator. In other words, if a given word begins with, let us say, the letter "e", then the operator will enter the generated signature under that particular dictionary.

In order to speed up the process of generating different code words for the same letter, we have made use of the fact that our system contains inherent sources of noise, for example, in the photomultipliers, in the vibration of the carriage, and perhaps in other sources as well. We proceed to retrace the same letter a second time and note if the same signature is generated. If a new signature is generated, then a new entry is put into the same dictionary. This process is iterated a number of times until it appears that no new signatures are being produced with this letter. It is our experience that, on a single font of type, the great majority of variations of signature may be obtained from a single specimen letter. It is also our experience that on the average each dictionary contains about five signature entries.

When we have done what we consider to be sufficient training—and this means taking every symbol two, three or four times from instances of
    His bare feet had struck the water now. He was splashing in the shallow water, running down river along the edge. It felt cool to his feet, smooth and slippery and cool. George yelled once more, louder than ever, and waved with all his might. The leader of the swans began to climb the stairways of the wind, followed by the others. Their great wings beat the air in flight. Their cries rang forth. George had frightened them away from their danger. But where was the boy who had saved them?

    With his last shout, his last wild wave, George had lost his balance on the slippery stones. The water was not much more than a foot deep, but it was very swift. As he tried to get to his feet, he again slipped and was carried away, choking and struggling. He saw his father running along the shore, trying to overtake him. Fast as his father was coming, the river was faster still. George could not crawl from its hold. His hands reached helplessly for a hold on the smooth rocks as he struggled in the water. He was more frightened than he had ever been in his life. He could think only of the falls toward which he was being pulled, faster and faster.
Figure 25. A Xerox copy of the original trial text read by the program. The machine was trained on other pages of the same book. As the carriage was able to accommodate only three lines at a time, the text page had to be cut into strips (indicated by numbers on left).
occurrence in the text—we then turn control over to the computer and let it simply recognize and print out what it regards the letter to be. As I mentioned previously, the output we are currently able to supply to the blind reader is spelled speech, but that is a little difficult to show in a figure. We can also reproduce the output with the aid of the typewriter; a comparison of the original with the typewriter output is shown in Figure 26.
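The training and recognition loop reduces to simple set membership. The following sketch is our paraphrase, not the original PDP-1 program; the fall-back to the unrecognized-character symbol "/" follows the convention described below.

    from collections import defaultdict

    dictionaries = defaultdict(set)          # symbol -> set of observed signatures

    def train(symbol, signature):
        """Training mode: file the signature under the operator-named symbol."""
        dictionaries[symbol].add(signature)

    def recognize(signature):
        """Recognition mode: exact match against every dictionary."""
        matches = [sym for sym, sigs in dictionaries.items() if signature in sigs]
        if len(matches) == 1:
            return matches[0]
        # No match, or an ambiguous match (e.g. "e" versus "c"): the real system
        # emits "/" or falls back to a special test on the original letter.
        return '/'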
[Figure 26 reproduces the typewriter output, with punctuation spelled out (e.g. "his bare feet had struck the water nowperiod he was splashing in the shallow watercomma running down river along the edgeperiod ..."), slashes at line ends, and handwritten annotations marking the errors.]
Figure 26. Typewriter output of the beginning of the text in Figure 25.
Note that the program spells out the punctuation, as the spelled-speech output also spells out c-o-m-m-a. Accordingly, as you examine the text in Figure 26, you may occasionally find an error in the spelling of the punctuation mark that can hardly be an error in the recognition algorithm! It should also be noted that slash marks appear at the end of the lines. In our program the slash represents an unrecognized letter. In this case the interpretation is simply that the program reached the end of a line. At this stage of development, rather than put in a subroutine to recognize "end of line", we simply used the unrecognized character symbol as an instruction to return to the origin.

In this text of about 950 letters we made either two or three errors, depending upon one's interpretation of an error. One of the errors consisted of interpreting the compound type symbol f-i as h, which seems a reasonable sort of error. The other occurred when the letter r was broken; this was interpreted as an i followed by a dash. It may be worthwhile to note that, although it was not our intention to claim that the reading machine simulated human performance, both of these errors are of the kind that would easily be made by a human subject if he were reading individual letters rather than coherent intelligible text.

Brown: There are many extra spaces. Are those artifacts?

Eden: Right. That is because we were economizing on lines. Both our input and output go over the same lines through the same computer, and we have all sorts of little glitches. Things jump and skip, and so on. In the text you will occasionally see a notation that states, "Do not count this error", and I believe that the notations are correct. These are undoubtedly errors made by the typewriter control rather than by the program.

I think I should make one additional point. Although most of the work has been done on a single type font, we have tried small samples of other fonts. When a new type font is introduced, most of the letters are recognized without any trouble. Although we have not performed any extensive tests on other fonts, it is my impression that something of the order of 90 per cent of the letters in a new font are identified with the code word dictionaries obtained with the old font. The principal source of difficulty in making identifications of letters in a new font is the occurrence of letter ambiguities. To some extent ambiguity occurs within a single font as well. For example, we might confuse the letter e and the letter c in our representation, depending primarily on how deeply the upper right edge of the letter c dips below its maximum; in such cases an identical signature is obtained for either an e or a c. In our experiments to date we have come across surprisingly few such ambiguities. In every case we have resolved ambiguities by introducing a special test for any given binary distinction. In the case of the e and the c, our test consists of running a vertical scan through the middle of the text letter and counting the number of transitions from black to white to black. If there are four transitions, then the letter is c; if there are six transitions, then the letter is an e. Precisely the same test can be used in other situations. For example, it is an excellent test for distinguishing between the capital B and the capital D.
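This transition-count test is easy to state concretely. The sketch below assumes a binary-matrix representation for illustration (the original system scanned the letter itself), and the names are ours.

    def transitions(column):
        """Count colour changes down one vertical scan line of booleans."""
        return sum(1 for a, b in zip(column, column[1:]) if a != b)

    def e_or_c(image):
        """Vertical scan through the middle of the letter: a "c" crosses the
        stroke twice (four transitions), an "e" three times (six)."""
        middle = [row[len(row) // 2] for row in image]
        return 'c' if transitions(middle) == 4 else 'e'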
There are certain other problems that I might mention, problems that arise from the fact that the text is not perfect, but rather noisy. The problem which appears to be most serious from the machine point of view, and yet is completely trivial from the human point of view, is the ability to separate touching letters. We evaded this problem in our first tests by choosing textbook material in which the printing is so clean that there are very few touching letters. Our basic procedure for separating touching letters is to defocus the optical image and to vary the clip level, that is, the level of light intensity at which a decision is made as to whether a particular point is black or white. Note that if two letters touch, as they might in the serifs of the letter i and the letter l, for example, the region of indistinguishability is likely to be rather a narrow one compared to the width of the body of the serif or the letter. If we took a densitometer trace vertically through this "touching" region, we would undoubtedly find that the maximal value for "blackness" in this case is no different from that of any other portion of a text letter. However, defocusing optically will cause the "blackness" at a narrow place to be diffused into the surrounding region. Under these circumstances the "blackness" of the narrow bridge will undoubtedly be less in magnitude than the comparable quantity for a portion of the letter. Hence, by varying the threshold for the binary decision we can in point of fact cut letters apart. Nevertheless, this is a delicate art, because occasionally the choice of threshold appropriate to the task of cutting two letters apart will at the same time cut one letter into two pieces; this is certainly to be deprecated.

I should like to return to the question of our original motivation in building the reading system. Ultimately, this device is intended to be used by blind subjects, and a blind subject is as intelligent as a nonblind subject. It is our hope to be able to make use of the intelligence of the user, so that he may be in a position to explore the domain of the written word in a manner not much different from that of a sighted person. At this stage in the development it is impossible for a blind person to operate the device independently of the help of a sighted person. One blind member of our group has indeed used the reading machine, but there are certain adjustments that he has not been able to make by himself, simply because the parameters by which one would determine these adjustments are displayed on a cathode ray tube. Nevertheless, we do not believe that there is an intrinsic difficulty to the task of changing the display modes of the controls so that a blind person might use them. All the criteria that one might need to be able to make adjustments, such as the justification of the text line, the thresholds for black-white discrimination, the rate at which the presentation is made, or the presentation of the signatures themselves, can be ultimately designed so as to produce tactile or auditory criteria by which the blind subject may make his judgments.
Lederberg: It is not clear to me what output you offer the blind subject.

Eden: Currently our principal output is spelled speech. In the spelled speech the output is prepared in the following way: We had a television announcer speak the letters of the alphabet; we then clipped the letters off so that they are about 100 milliseconds long—very clipped speech; these segments are then quantized, stored in appropriate registers in the computer, called upon when identified, and sent out over our loudspeaker.

There are a number of things to be said about spelled speech. Of course, it does not sound like speech; it sounds like spelled speech. However, one can do a variety of things to increase the rate of presentation, i.e., the information transfer rate. We discovered, for example, that one need not turn off one speech segment before the next one came on. Let us take the word "c-a-t". There are three phonemes, corresponding to three segments, and each is 100 msec long. First k is turned on, then a goes on some time before k is finished, and then t goes on some time before a is finished. The overlap can be as much as 50 per cent without destroying intelligibility. The presentation rate can thus be increased by a factor of two. It takes study and practice to be able to hear words with this much overlap, but it is quite feasible, and I have done it myself to some extent. If you have a more modest amount of overlap, say 75 per cent, then there is no serious difficulty.

Lederberg: Do you not find that different phonemes should have different presentation signs?

Eden: I should have been more precise. These are not phonemes; these are spoken letters. We have a minor problem in English because certain letters are not related at all to their appropriate phonetic representations, like the letter w, that is, "double-you", which does not have a "w" in it, and the letter h, that is, "aitch", which does not have an aspirate. We have used both and, strangely enough, if you say "double-you" rapidly, it is still quite recognizable. However, we invented a letter sound, "wa", which is better.

We have provision for one mode of display other than spelled speech. We can produce Braille I or Braille II as an output on a Braille printer prototype which was constructed in the Mechanical Engineering Department at M.I.T.

Let me add a few words concerning our intentions. We are in the process of installing a new carriage that will be program-controlled for both the x and the y dimensions. In this way we hope to be able to read a whole page at a time rather than a line at a time. As mentioned earlier, we are trying to redesign the controls so that they can be manipulated by nonsighted subjects. Finally, we are investigating methods for producing synthetic speech. I will not attempt to make any prediction regarding the length of time this
Lederberg: Do you not find that different phonemes should have different presentation signs?

Eden: I should have been more precise. These are not phonemes; these are spoken letters. We have a minor problem in English because certain letters are not related at all to their appropriate phonetic representations, like the letter w, that is, "double-you", which does not have a "w" in it, and the letter h, that is, "aitch", which does not have an aspirate. We have used both and, strangely enough, if you say "double-you" rapidly, it is still quite recognizable. However, we invented a letter sound, "wa", which is better.

We have provision for one mode of display other than spelled speech. We can produce Braille I or Braille II as an output on a Braille printer prototype which was constructed in the Mechanical Engineering Department at M.I.T.

Let me add a few words concerning our intentions. We are in the process of installing a new carriage that will be program-controlled for both the x and the y dimensions. In this way we hope to be able to read a whole page at a time rather than a line at a time. As mentioned earlier, we are trying to redesign the controls so that they can be manipulated by nonsighted subjects. Finally, we are investigating methods for producing synthetic speech. I will not attempt to make any prediction regarding the length of time this will take. We are reasonably confident that this is not a long-range development, as we have most of the pieces in our hands right now.

Note what the problems are. As far as recognition itself is concerned, I think it is fair to say that we have solved the problem of character reading under the condition that the number of characters one wishes to read is rather large, in other words, when a book or perhaps a page of type is to be read, but not when the text to be read is simply an arbitrary word standing alone. We would expect that our reader would perform rather poorly on isolated words taken from arbitrary type fonts. Rather, our method is intended to be used with a text of sufficient length, so that some of the errors corrected by the human observer from contextual information may be used to improve the performance of the reading device.

Most people regard English as a nonphonetic language. Let me assure you that this is not at all true. English is as much a phonetic language as any other. The major difference lies in the complexity of the morphophonemic rules needed in English. It may be that when we learned to read English as children it took us longer than it took French children to learn to read French. Nevertheless, learn we did, and the fact that any native speaker and reader of English can speak from the printed page in an unambiguous manner is proof enough that the rules exist. However, there are certain very serious problems. First of all, a computer program which would embody these rules is likely to be exceedingly complicated. The number of morphophonemic rules, at last count, is something on the order of 1000 or more. Clearly, that is a large number of rules. Dr. Francis Lee, who is conducting the research for this part of the problem, has gone over to a somewhat different procedure. Rather than have a large number of rules and a small set of symbols, he has increased the number of symbols and diminished the number of rules. Rather than attempt to proceed letter by letter first from grapheme to phoneme, he uses a portion of the word that he has designated a "morph", which is more or less equivalent to what one might call a syllable. In English we need to store approximately 10,000 such syllables, a not unreasonable task for a magnetic disk. The rules prescribe the concatenations of different syllables in order to produce the appropriate phonemic representations. This problem is also essentially solved, although not all the hardware is available nor all the programs written.

There are other problems in synthesizing English speech. The problems of stress and intonation are exceedingly difficult, because appropriate stress and intonation depend very much on long-range context. In the main, once the spelling is known, context no longer is required. However, there are many cases even in determining word stress in which one must know the part of speech; for example, is a certain English word pronounced "re-fuse" or "ref-use"? You need an entire sentence to decide. Syntactic analysis is difficult to program on a computer and we have not yet done so.
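Dr. Lee's arrangement, many symbols and few rules, reduces in caricature to a longest-match segmentation followed by concatenation. The miniature morph table below is invented for illustration; the real inventory holds some 10,000 syllable-like entries on a magnetic disk:

    # A toy stand-in for the ~10,000-entry morph-to-phoneme store.
    MORPHS = {
        "con": "K AH N",
        "cat": "K AE T",
        "e":   "IH",
        "nate": "N EY T",
    }

    def word_to_phonemes(word):
        """Greedy longest-match segmentation into morphs, then concatenation."""
        phones, i = [], 0
        while i < len(word):
            for j in range(len(word), i, -1):      # try the longest piece first
                if word[i:j] in MORPHS:
                    phones.append(MORPHS[word[i:j]])
                    i = j
                    break
            else:
                raise KeyError("no morph covers position %d of %r" % (i, word))
        return " ".join(phones)

    # word_to_phonemes("concatenate") -> "K AH N K AE T IH N EY T"

Nothing in such a table settles stress, of course; "re-fuse" against "ref-use" remains the syntactic problem described above.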
Let me conclude this topic by saying that even here we are not overly concerned about the quality of our output. If the blind person is sufficiently motivated, he will not himself be concerned if the speech he is learning is not "natural", that is, a high-quality simulation of natural speech. It is probably sufficient for his purpose if the output is speech-like to the extent that he can learn to understand it without great mental labor. For our first trials we will neglect issues such as stress and intonation, and even disregard the "re-fuse/ref-use" problem. If the computer makes that particular mistake, the listener will correct it mentally, since he can himself insert the appropriate syntactic and semantic interpretation of the sentence from his knowledge.

Discussion
Nilsson: There are, of course, some reading machines that are commercially available. Certainly, their goals are different from yours. The question arises, are some of their methods useful to your goals and, conversely, might your methods be useful in building a rapid reading machine? Would you say a word or two about those points?

Eden: It may be that some of our methods are useful to them and some of their methods to us. However, I suspect that with regard to highly standardized objects like characters—letters in a language—a wide variety of methods would work. We can talk very loosely of the notion of redundancy even though it is difficult to compute redundancy of language in a completely satisfactory manner. There are any number of properties of letters that we can measure so as to identify them and be confident that our recognition is correct. Many schemes of which I have heard will work after a fashion. However, as I tried to point out, different motivations imply different design criteria. There are other practical or economic issues; for example, we are limited by the fact that a certain amount of information transfer must take place from the local station to the computer, and that the computer is ordinarily available to us in a time-sharing mode. With a computer of a modest capability but completely at the disposal of our problems, we might increase our recognition rate by a factor of five and even by a factor of ten.

Lederberg: Which would make it what?

Eden: Which would make it something of the order of 100 to 200 characters per second. It is now somewhere in the range of 10 per second.

Estrin: How many concatenation rules did you end up with?

Eden: I believe there are about one hundred.

Lederberg: How much research is being done on what blind people really want to have in the way of presentable input? It seems to me that much of this research may be going at it from the wrong end. It seems to me you have started by choosing a system and then determining whether a blind person will accept it or not.
Eden: To be trite, I am glad you asked that question. We are not trying to produce a machine to read to the blind, although that is one foreseeable outcome. We are trying to produce a tool which we can use in the laboratory in order to determine what a blind person wants to do with printed material.

Lederberg: One does not need to have a machine translator in order to do that. It would be very easy to prepare manually a number of simulated presentations to test their acceptability. Take, for example, the question of what language you speak in. If I were blind, I believe I would prefer to hear English spoken as if it were Italian rather than English spoken as English, because there would be less ambiguity in my reconstruction of what the word actually was. The purpose of speech, in this case, is to me a reconstructive process similar to reading.

Eden: It is certainly true that it is not essential to have a "reading machine" in order to try a variety of possible presentations. Whether it is useful or not is an economic question. Does it save money and researcher's time? We believe the answer is a clear "yes". We have prepared tapes in our laboratory and studied a variety of tactile and auditory stimuli, but the flexibility of the new system will enable us to study these and a number of others. It will enable us to try many different type fonts and formats, many different recognition algorithms, error detection and correction procedures, and so forth. This particular development took less than six months from the time we decided to undertake it until it ran with reasonable reliability. The investment of money and staff time was quite modest and the benefits are large, at least in terms of the number of different problems for which graduate students have already begun to use the reading machine.

Prewitt: Is your dictionary so large that in reading text you never generate a code word that is not in the dictionary?

Eden: We do, very frequently.

Prewitt: How then do you handle these code words?

Eden: We make a guess as to what the letter should be. I say guess, but if I were to spell out to you the word "d-i-f-k-e-r-e-n-c-e" (or "blank" in the fourth place), you would obviously know what the substitute is, and then you would assign that code word to the dictionary for "f."

Prewitt: Do you use the context?

Eden: Very much so. We use word context all the time. We would not claim that we can read anything other than English, but it is obvious that there are special rules for each language.

Nilsson: Context was used on this?

Eden: Not on the text I have shown in Figure 25.

Nilsson: That was individual character-by-character recognition?

Eden: It was individual character-by-character recognition, but workable programs are currently written which involve dictionary look-up to correct errors.
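The contextual repair Dr. Eden describes, recovering "difference" from "d-i-f-k-e-r-e-n-c-e", can be sketched as a one-mismatch search of a word list. The list, the "?" convention for an unclassified character, and the single-error limit are all assumptions made for illustration:

    WORD_LIST = {"difference", "different", "reference"}   # stand-in lexicon

    def repair(word):
        """Return word-list entries reachable by changing at most one letter.

        A "?" in the input marks a position the reader could not classify.
        """
        matches = []
        for cand in WORD_LIST:
            if len(cand) != len(word):
                continue
            mismatches = sum(1 for a, b in zip(word, cand)
                             if a != b and a != "?")
            if mismatches <= 1:
                matches.append(cand)
        return matches

    # repair("difkerence") and repair("dif?erence") both yield ["difference"].

Once the substitute is identified, the unfamiliar signature can be entered in the dictionary under that letter, as Dr. Eden describes.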
McCormick: It seems to me that, if you were to address directly, you would have 65,000 words of storage—2^16. And you are actually storing less than one per cent of those. Are you leaving blanks or do you have some way of recoding your code words to compress the amount of storage required?

Eden: I do not understand your question. Our code word is 36 bits long. We have approximately 700 code words in our dictionary.

McCormick: You simply scan these linearly, then?

Eden: We scan them all, using a logarithmic search procedure.

McCormick: Then, accordingly, that is an alphabet of 70 characters, and you simply scan them serially.

Eden: Right.

McCormick: Why did you not use a vidicon input?

Eden: I do not know why. I suppose the main reason was that we did not have one handy. In addition, it is not very difficult to build a scanner requiring only one bit of intensity discrimination and relatively coarse quantization. In our current input, we use a Tektronix scope with a short-persistence phosphor, a couple of photomultipliers, and some feedback circuitry to take care of fluctuations in light and photomultiplier response.

McCormick: I think the point was well brought out by Dr. Lederberg. You are in a research area that can be divided into two separable parts. One part has to do with the reception of information by the blind, and the other part has to do with character recognition. Commercially produced character recognition devices have been available for several years. I would estimate that commercial developmental efforts are trying to achieve recognition speeds of 10^4 and above characters per second, which is two orders of magnitude faster than the speeds you are talking about. I think you have to justify a slow character recognition device when clearly that is not where the developmental frontier is.

Eden: Our justification is an economic one. We have much less hardware.*

McCormick: You say you use a 50 by 30 raster and throw away everything but the fifth point. Does that mean you can get by with a 10 by 6 raster?

Eden: No. We have found that a scan raster which establishes about 200 points on the contour of a capital letter is close to optimal. This corresponds to a raster of about 50 high and 30 wide.

* Added by Dr. Eden after the conference: Following the conference I made an effort to find out what the state of the art was. A major obstacle is that one must presume that companies refusing to respond to queries may regard their devices as industrial secrets. Commercially available devices, such as bank check readers, have rates somewhat lower than the figure quoted by Dr. McCormick. More important, from the point of view of the motivation of our work, all commercially available devices require a rigidly fixed format, and almost all require special type fonts and preparation.
If we used a coarser raster, we would pass right through some of the narrow lines, as, for example, the verticals on an "m", which are only two or three dots wide at the raster density quoted above. However, we do not need accurate position information for the extrema, since we only specify the quadrant into which an extremum falls. Consequently, we retain only about one point in five in order to generate the signature.

McCormick: If you do not look at the intermediate points, it is difficult for me to understand why you need them.

Eden: We need them to find the boundary, but we do not store all of the points.

Papert: I would like to make some general comments using the question of the speed of the Eden reader as a starting point. The question is really a distraction. The preoccupation with speed has contributed to creating an altogether false impression of the degree of difficulty of character reading. Sound research strategy would suggest studying a variety of procedures with a view to understanding the range of the problems and gaining insight into the value and limitations of various procedures. Naturally, the question of speed must arise. I am suggesting that one ought to understand very thoroughly what has to be done before we can reasonably understand how to do it fast. In fact the opposite strategy has been adopted. The vast bulk of reports on character recognition describe mediocre performance of systems based on extremely simple recognition procedures chosen partly because of the unsubstantiated presupposition that they will yield the fastest and cheapest practical devices. The result is that we have very little experience with success, no overall view of the problem area, and a long, discouraging record of failures. A comparison of Dr. Eden's procedure with other schemes emphasizes the value of two important aspects. The first is simplification of the representation obtained by using a serial scan, which enables him to use a 36-bit code as opposed to the many hundreds of bits used in some purely parallel methods. Along another dimension, his use of a dictionary with several entries per character contrasts with the need to combine characters by an OR operation to permit decision by small numbers of independent binary choices. The latter procedure, often proposed for use with perceptrons, runs into some rather deep theoretical difficulties. Minsky and I have shown that the computational complexity of discriminations can in such cases be increased without bounds by OR-combination (2). It would be interesting to study possible converse results which would provide a theoretical framework for understanding the Eden dictionary.

Eden: My only comment is that I am delighted that I agree with Dr. Papert so much.

Norman: What is the state of the art of the converse process, going from speech to the printed output?
Eden: For a restricted vocabulary, spoken by a single individual, I believe now that it is a rather easy task to recognize something of the order of 50 or 100 different words with high reliability; however, I suspect that could have been done a long time ago. The only question here is one of speed. We are probably doing it a little faster now than before. I believe the problem is impossible to solve if you wish to recognize words in a language spoken by human beings in a continuous discourse. As far as I know, there is no hope.

Nilsson: In the matter of recognizing characters in simple fonts, is this something that can be done with ten examples per character, or is it something you might rather do by first recognizing the font and then reading in the dictionary that corresponds to that particular font?

Eden: I really do not know. We went to a printer and asked him how to prepare our dictionaries. We asked him to print out several representations of each letter in his set of type. Then we scanned the prepared page and thence proceeded to the book. However, we also took typewriting and put it through the same set of dictionaries and achieved reasonable recognition. There is a large difference between typed characters and printed characters. Those which we anticipated we would not be able to read, indeed we did not. They require that we insert more code words in the dictionary. Given the size of our dictionary, we can certainly increase it by a factor of ten without any serious concern regarding the size of storage.

McCormick: But the speed of processing goes down.

Eden: Certainly, the speed of processing goes down.
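The logarithmic search mentioned earlier is why a tenfold larger dictionary costs so little: the number of probes grows only with the logarithm of the number of entries, roughly 10 comparisons for 700 code words and 13 for 7,000. A minimal sketch, assuming the dictionary is held as a sorted list of 36-bit integer signatures:

    from bisect import bisect_left

    def make_lookup(entries):
        """Build a logarithmic lookup over (code word, letter) pairs.

        entries must be pre-sorted by code word; each code word is a
        36-bit integer signature of the kind described in the talk.
        """
        codes = [code for code, _ in entries]
        letters = [letter for _, letter in entries]

        def lookup(code):
            i = bisect_left(codes, code)        # ~10 probes for 700 entries
            if i < len(codes) and codes[i] == code:
                return letters[i]
            return None                         # unknown signature: fall back on context

        return lookup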
REFERENCES

1. CORNEW, R. W., Error Consideration in Reading Machine Design. Doctoral Thesis, Massachusetts Institute of Technology, Cambridge, 1967.
2. MINSKY, M. L., and PAPERT, S., Linearly unrecognizable patterns. Am. Math. Soc. Sympos. Appl. Math., 1967, 19: 176-217.
AN AUTOMATED SYSTEM FOR GROWTH AND ANALYSIS OF BACTERIAL COLONIES°
DONALD A. GLASER
University of California
Berkeley, California
The research program I am going to describe has been in progress for less than a year, so that my description will be mainly of the purposes, methods, and special equipment used. Although I will report a number of preliminary results obtained with the system, I will not be able to give performance data for its large scale use.

Biological Research Applications
The overall scientific purpose of the work is to automate those parts of microbiology that depend on observations of colonies of microorganisms growing on solid media such as agar, silica gel or gelatin. Such observations are usually made in petri dishes about 100 mm in diameter, containing a layer of gel a few millimeters thick, which serves as a sponge for holding nutrients and drugs and as a support for microorganisms growing on the solid surface. We have built or designed four devices for automatic time-lapse photography of such dishes while they are held in carefully controlled incubation conditions. They range from a simple time-lapse camera that can follow events in a single dish up to a large machine able to handle the equivalent of 10,000 petri dishes in one batch and photograph each dish about once every hour. Photographs on 35 mm film are produced by all of these devices for subsequent analysis in a flying spot scanner programmed to find and count colonies and analyze the appearance of isolated colonies.

The principal use of these facilities will be for intensive studies of bacterial genetics and physiology. The flying spot scanner will be used to count colonies; select colonies of mutant organisms; characterize bacterial strains according to their ability to grow under various conditions of temperature, nutrition, illumination and gaseous environment; measure colonial growth rates; and characterize mutants and progeny of genetic crosses for a variety of genetic traits.

° This investigation has been supported in part by the National Aeronautics and Space Administration under Grant NGR-05-003-091, and by the Public Health Service through research grants GM 12524 and GM 13244 from the National Institute of General Medical Sciences.
These operations must be carried out on a large scale under closely controlled conditions in order to make extensive genetic maps of the organisms, to study taxonomic relationships, to study mutation rates and types of mutations, and to examine a wide variety of other features of the organism that seem essential to understanding how it works as an entire biological system. The immediate task of the flying spot scanner will be to find and count colonies, to measure the sizes of colonies, and to record the appearance of colonies as characterized by the optical density profile taken across a colony diameter or around a small spiral or circle inside the colony. The scanner will also measure the degree and type of irregularity of the colony edges.

Health Applications
Many of the operations described above for use in biological experiments can find direct application in the public health and medical fields for assaying contamination levels in food, drugs, water or other supplies and equipment. In principle, the system is able to count the total number of viable organisms in a sample and to recognize how many of each known type of organism are present. The identification of organisms depends on analysis of their colonial growth rate and appearance under particular growth conditions, as will be described in detail below. It is hoped that the same method for identifying organisms that might work in a water or milk sample will also work in a clinical sample of blood, sputum, urine or other body fluid in which a few pathogenic organisms must be detected in a rich flora and fauna of normal and harmless organisms. We have already examined about 50 strains of bacteria obtained from a clinical bacteriology laboratory and can quite readily distinguish one from the other on the basis of photographs of colonies taken under special conditions. Computer programs to do this recognition automatically have been written but not yet tested on a large scale. Preliminary tests have been done only with pure cultures, and we will soon find out how well the identification works using real clinical samples as they occur in the hospital clinical laboratory. Finally, the ability to measure accurately the growth rate of a large number of organisms should allow accurate quantitative measurement of drug sensitivities for these organisms.

DESCRIPTION OF INCUBATOR CAMERAS*
Photographs made in all of the devices described below carry information as to the time, date, environmental conditions and a variety of other facts needed for proper interpretation and collation of the pictures. In addition, there will be a gray wedge containing a number of density levels in each photograph to provide quality control and interpretation of the chemical development of the film, the state of the illumination system and other factors that might affect subsequent interpretations of the photographic appearance of the colonies.

* A detailed description of the feasibility studies and engineering development of much of the equipment to be described below has already been reported (1).

"Candid Camera"

The candid camera consists of a 35 mm Nikon F single lens reflex camera with an automatic electric drive mounted above a holder which contains a single 100 mm plastic petri dish in a small incubator. The dish is usually illuminated from below with parallel white light, but we can also use light at various angles and of various wavelengths. We have sometimes provided a half-silvered mirror above the dish to reflect some of the transmitted light back down to the surface and thus provide a photograph obtained partly with reflected and partly with transmitted light.

Since we will make use of the photographic appearance of colonies to identify the organism, it will be worth discussing some of the relationships between the photograph and the "true appearance" of the colony. As a bacterial colony has lateral extension on the agar as well as variations in height, it behaves as a lens of rather complicated properties which can refract and reflect light incident upon it, as well as scatter and absorb light by means of pigments and particles contained within the colony. Since the detailed structure of a colony seems to be a complicated biological consequence of the genetic constitution of the organism and the properties of its environment, we have not made any serious attempt to make a causal analysis of the shape. Rather, we have tried to develop photographic arrangements which give the most complicated and interesting pictures in order to obtain a "fingerprint" that would serve best to distinguish a colony of one organism from that of another.

We have found that photographing colonies out of focus is very useful because of their lens-like properties. Parallel light transmitted through a colony will often be brought to a point or ring focus above or below the colony, according to its particular refractive properties. Very often the most interesting pictures are obtained with the camera focused deliberately above or below the surface of the agar carrying the colony, and a set of three to five pictures, including one in focus and two to four out of focus, often allows identification of the colony when a single in-focus picture is quite uninteresting and unable to permit identification. I am sure that the same detailed characterization of the colony could be done with stereophotography, holography, use of glancing illumination to produce shadows, and perhaps other techniques as well. Our objective has been to try to avoid expensive and delicate optical arrangements and to try the simplest way of obtaining a complicated fingerprint on a single 35 mm frame if possible. Superposition of several out-of-focus pictures may be the easiest way to do this.

"Lazy Susan"
Figure 27. Environmental growth chamber (lazy susan). An annular aluminum turntable inside the incubator carries 30 petri dishes slowly past stations where agar is poured, inoculation is carried out, drugs, nutrients, or other chemicals are added, and photographs are taken (four photography stations allow use of various color filters, transmission or reflection illumination, and so forth). Gaseous environment, light levels, and humidity can be controlled accurately. Temperature control to better than 0.1°C is maintained with the help of thermistor controllers. A variety of cycles of successive operations can be programmed on the master sequence controller. Ultraviolet lights inside allow sterilization of the contents of the incubator.
The lazy susan, shown in Figure 27, is a doughnut-shaped incubator that contains an annular aluminum ring 42 inches in diameter, which carries 30 petri dishes around in a circle past one or more cameras for time-lapse photography using various illumination systems.
"Roundabout" The roundabout is a camera which allows rapid photography of ordinary 100 mm disposable plastic petri dishes made and processed by hand in conventional laboratory experiments. It is simply a way of entering ordinary experimental dishes into the data handling system. The camera is able to photograph about 3000 dishes per hour.
Figure 28. A sketch of the dumbwaiter environmental chamber under construction, showing the stack of 64 trays within the magazine on the left. In operation these trays are indexed upwards one at a time and carried through the top cross duct to the stack on the right. This stack is lowered, one tray at a time, the bottom tray being discharged into the lower cross duct and returned through it to the tray stack on the left. This cycle is repeated so that each 100 mm square area of agar is exposed to any one of the cameras as often as once an hour. Cameras and equipment for inoculating organisms onto the agar and administering drugs and nutrients, as well as carrying out direct manipulations on the colonies, will be mounted at various places on the two cross ducts shown. On the left side of the sketch is a portable magazine which can carry stacks of 64 trays from the dumbwaiter to cold rooms, warm rooms, or the dishwashing machine as required during the operation of the system.
The "Dumbwaiter"
The dumbwaiter (shown in Figure 28) is a large machine that will automatically pour, incubate, inoculate and photograph the equivalent of 10,000 petri dishes in one batch. The agar is actually carried on 32 X 16 inch glass trays, mounted two to a frame on aluminum frames about 35 inches square.
The machine carries 128 of these trays and circulates them vertically through an "elevator" stack of 64 trays, then horizontally through a cross duct to a "lowerator", which is another stack of 64 trays. As the trays index down through this stack, they finally reach the bottom, where they are sent through the bottom cross duct to reenter the elevator stack and proceed again around the circuit. The glass trays and aluminum frames carrying them are so designed that they can carry large fields of agar or other gels or can hold ordinary petri dishes loaded in by hand. Indeed, the machine is capable of running with large trays of agar only, with petri dishes only, or with any desired combination of those two modes.

Mounted on the top and bottom cross ducts are various cameras, illuminating systems, devices for inoculating the agar with bacteria, for illuminating them with ultraviolet light or X rays, for spraying drugs or nutrients to be added at various stages during growth, and so forth. Large experimental areas are provided on the top and bottom cross ducts for devices that will be needed to carry out future operations. On the left in Figure 28 is seen a portable magazine able to load stacks of 64 trays in and out of the dumbwaiter, so that, once the photography has been completed, the agar can be stored in the cold room until the next operation has been chosen. Alternatively, incubation of the stack can be carried out in a separate warm room if the dumbwaiter is needed for processing a different set of trays. The portable magazines are also used to transport stacks of trays to a special dishwasher which removes the agar and washes and sterilizes the trays for reuse in the dumbwaiter.

The motions of trays in the dumbwaiter are carried out by precision ball screws driven by stepping motors which make 200 steps/turn under direct control of a PDP-8 digital computer. Almost none of the motions of the trays within the dumbwaiter are tied together mechanically, and a great variety of motions is possible under programmed control of the PDP-8. Provision is made to sterilize the interior of the machine by ultraviolet irradiation and to maintain any desired temperature, humidity, illumination level and gaseous composition within the range of biological interest.

DESCRIPTION OF THE FLYING SPOT SCANNER
Analysis of 35 mm photographs will be made by a flying spot scanner with 8000 x 8000 line resolution under control of the PDP-6 computer with 32,000 words of memory. The picture to be scanned will be 24 x 24 mm on 35 mm film. Under direct computer control, the scanner requires 29 microseconds to go from one random point to another within the field, but operating in a raster mode by means of an interfacing "controller" between the computer and the scanner hardware it is able to scan at the rate of 1 μsec./point. The output of the scanner is digitized to 64 levels of gray, although we do not yet have performance figures on the accuracy and stability of the digitizing system.
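The quoted figures imply a spot pitch of about 3 μ on the film and roughly a minute to visit every point of a frame in raster mode, which helps explain why the counting mode described below reports only boundary points to the computer. A back-of-envelope check of the arithmetic (not measured performance):

    # Orders of magnitude implied by the quoted scanner specifications.
    lines = 8000                  # 8000 x 8000 line resolution
    field_mm = 24.0               # 24 x 24 mm picture on 35 mm film
    raster_us_per_point = 1.0     # raster mode, via the interfacing controller
    random_us_per_point = 29.0    # random point-to-point positioning

    spot_pitch_um = field_mm * 1000.0 / lines                   # 3 microns
    full_raster_s = lines * lines * raster_us_per_point / 1e6   # 64 seconds

    print(spot_pitch_um, full_raster_s)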
COMPUTER PROGRAMS
Scan and Count Colonies

In this mode the scanner executes a regular raster scan and reports to the computer only boundary points of objects it encounters. The computer organizes these boundary points into lists, one for each object, and then assumes that all of the objects consist of one or more overlapping circles in order to count the number of colonies.
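The bookkeeping that organizes boundary points into one list per object might look like the sketch below. The adjacency rule and the single-pass clustering are assumptions made for illustration; the actual program is not reproduced in the talk:

    def group_boundary_points(points, adjacency=2):
        """Collect scanner-reported boundary points into one list per object.

        points: (x, y) tuples in raster order.  A point within `adjacency`
        raster units of an existing object's points joins that object;
        otherwise it starts a new one.  (A real implementation would also
        merge objects that later prove to touch.)
        """
        objects = []
        for x, y in points:
            for obj in objects:
                if any(abs(x - px) <= adjacency and abs(y - py) <= adjacency
                       for px, py in obj):
                    obj.append((x, y))
                    break
            else:
                objects.append([(x, y)])
        return objects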
Figure 29. Photograph of a 100 mm plastic petri dish taken in very high contrast to provide sharp boundaries and featureless colonies. This dish contains over 600 colonies which can be found and counted automatically in about 30 seconds by the flying spot scanner.
count" photograph. In order to count colonies in this way the program first assumes that each object is a simple circle and tries to find a least squares radius and center. If this is successful within a prechosen accuracy, the program goes on to the next object. I f a simple circle cannot be fitted, the program looks for cusps in the boundary and, if two cusps are found, the portions of arc bounded by these cusps are fitted to two different circles. I f that is not successful, the program goes into a curve-following mode to locate all of the cusps by searching for the abrupt changes in the second derivative expected in the region of a cusp. For each piece of arc bounded by an adjacent pair of cusps, the program determines a least squares center and radius. W h e n this process is completed for a given object, the number of distinct centers is counted and is called the number of circles that contributed to the total object. Centers closer to each other than some prechosen distance are assigned to the same circle and not counted separately. This strategy was able to count the colonies in Figure 29 and photographs similar to it and get substantially the same answer as obtained by a technician. W e have no statistical results on the performance of this program for very large numbers of photographs, nor for more difficult cases in which the colonies are not such regular circles as in Figure 29. Presumably, the program strategy will need to be enlarged to consider parameters such as the ratio of the circumference to the area, the unevenness of the edge, and other features when the circularity test does not apply. Happily, most of the strains of interest in biological work form colonies which are quite regular circles. Shape
Shape Analysis
When photographed with moderate contrast, bacterial colonies reveal complex patterns due to a combination of refraction, absorption, reflection and scattering as described above. Figure 30 is a photograph of a dish taken under moderate contrast conditions. The only shape analysis we have made routinely on colonies so far has been to measure the optical density profile by taking about one hundred steps across a diameter and measuring the optical density at each place; samples of such profiles are shown in Figures 31 and 32. These profiles were made with a digitized microdensitometer with a mechanical drive and without the gray level detector on the flying spot scanner. We expect to match the performance of the microdensitometer at much higher speed using the flying spot scanner and hope to be able to digitize gray levels at the rate of about 15 μsec. per point. We have altogether photographed about 50 different strains of medical importance and find it quite easy to identify them by eye with the help of the out-of-focus photographs described above. We have also made a straightforward Fourier analysis as well as other shape analyses based on point-by-point correlations with a standard colony.
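The point-by-point comparison with a standard colony amounts to sliding the stored profile along the measured one and looking for an acceptable correlation, as Dr. Glaser explains in the discussion below. A sketch, in which the shift range and the use of the normalized correlation coefficient are assumptions:

    import numpy as np

    def best_profile_match(profile, standard, max_shift=10):
        """Slide `standard` along `profile`; return the best correlation found.

        profile, standard: 1-D arrays of roughly 100 optical density readings.
        """
        best = -1.0
        for shift in range(-max_shift, max_shift + 1):
            lo = max(0, shift)
            hi = min(len(profile), len(standard) + shift)
            a = profile[lo:hi]
            b = standard[lo - shift:hi - shift]
            if len(a) > 1:
                best = max(best, np.corrcoef(a, b)[0, 1])
        return best      # accept the identification if above a chosen threshold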
Figure 30. A sparsely populated petri dish photographed at moderate contrast to allow measurement of colony morphology using optical density profiles. Several photographs of the same dish under different lighting and camera focus conditions are sufficient to identify the organism in the sample of 50 medically important organisms that we have studied. This particular picture was made with parallel transmitted light and an in-focus camera. The organism shown is E. coli.
The Fourier analysis works very well, and only about ten coefficients are needed to approximate the colony in a very satisfactory way. A final choice of method will be based on the speed of identification after the methods have been used on a production scale.
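Keeping only about ten coefficients corresponds to a truncated Fourier description of the profile. A sketch using the discrete real transform; the normalization and the exact truncation rule are details the talk leaves open:

    import numpy as np

    def fourier_descriptor(profile, n_coeff=10):
        """Truncated Fourier description of a ~100-point density profile."""
        return np.fft.rfft(np.asarray(profile, dtype=float))[:n_coeff]

    def reconstruct(coeffs, length):
        """Invert a truncated descriptor back into a smooth profile."""
        full = np.zeros(length // 2 + 1, dtype=complex)
        full[:len(coeffs)] = coeffs
        return np.fft.irfft(full, n=length)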
SYSTEM OUTPUT
Results of simple colony counting may be reported to the experimenter in the form of lists of colony counts versus dish number, since we plan to have all the disposable plastic petri dishes numbered for recording in the laboratory book and for reading by the scanner at the time it is doing the counting. For the dumbwaiter, colony counts versus tray or frame number or perhaps versus Cartesian coordinates in the trays will be reported.
For some purposes we have used the system for direct plotting on a CALCOMP plotter of histograms which contain colony counts versus a variety of parameters important in the genetic or physiological interpretation of the experiment. Calculations of errors, normalization of histograms by standard experiments, and other operations are very conveniently carried out at this stage in addition to outputting of raw data where it is desired.

For selecting mutants which grow slowly at elevated temperatures, for instance, the computer system plots a map of each particular dish which contains a mutant colony and places a cross at the site of the mutant colony. The laboratory technician then places the dish of the correct number on the map and sees immediately from the position of the cross which colony was chosen as out of the ordinary by the scanner. The colony can then be picked for restreaking and further biological operations. The dumbwaiter itself will be provided with an X-Y plotter on one of the experimental cross ducts, so that it can carry out this picking operation under control of its own PDP-8 on the basis of scanning information from the PDP-6.
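The dish map itself is a modest piece of output programming. Schematically (the stroke representation below is invented; the actual CALCOMP command set is not reproduced here):

    def mutant_map_strokes(mutants, cross_mm=2.0):
        """Pen strokes marking each mutant colony with a cross.

        mutants: (x_mm, y_mm) colony centers from the scanner, in dish
        coordinates.  Returns (x0, y0, x1, y1) segments for the plotter;
        the dish outline and identifying legend are omitted here.
        """
        h = cross_mm / 2.0
        strokes = []
        for x, y in mutants:
            strokes.append((x - h, y, x + h, y))   # horizontal bar
            strokes.append((x, y - h, x, y + h))   # vertical bar
        return strokes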
Figure 31. Optical density profiles across the diameters of colony photographs of E. coli strains (among them Berkeley B) made at moderate contrast. When these in-focus profiles are used together with several out-of-focus photographs of the same colonies, identification of the seven strains shown can very readily be made out of a collection of about 50 strains we have studied in this way.
Figure 32. Optical density profiles for four E. coli strains (Hayes, Cavali, a streptomycin-resistant mutant, and Berkeley B), showing the reproducibility of the profiles from one colony to another, in contrast to profiles from closely related but different strains.
For this operation we will probably program the PDP-6 to prepare magnetic tapes which can be read by the PDP-8 in driving the motions of the dumbwaiter to pick particular colonies. In this operation we propose to replace the pen of an ordinary X-Y plotter with a platinum hairpin that can pick a colony or part of a colony, restreak it on fresh agar, or carry out other operations under computer control. The platinum hairpin can be sterilized by passing an electric current through it and then go on to the next picking operation. In some cases it will be desirable to have a list telling the total count of each type of organism in a mixed culture on a dish or a tray.
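Since every motion reduces to stepping motors turning ball screws at 200 steps per revolution, translating a chosen colony's coordinates into motor commands is simple arithmetic. In the sketch the 5 mm screw lead is an assumed figure; it would give 25 μ per step, about a mil, the same order as the tolerances quoted in the discussion below:

    STEPS_PER_TURN = 200        # stated figure for the stepping motors
    SCREW_LEAD_MM = 5.0         # assumed ball screw lead, mm per revolution
    MM_PER_STEP = SCREW_LEAD_MM / STEPS_PER_TURN   # 0.025 mm = 25 microns

    def steps_between(current_mm, target_mm):
        """Whole motor steps to move one axis from current_mm to target_mm."""
        return round((target_mm - current_mm) / MM_PER_STEP)

    # e.g. moving the platinum hairpin 12.40 mm along one axis: 496 steps.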
FUTURE WORK

It should be emphasized that all of the equipment that I have described exists and has been operated, except for the dumbwaiter, which will be completed in the spring of 1969. We expect to be doing production experiments long before the dumbwaiter is working, however, by using the flying spot scanner to analyze the output from petri dishes that have been made and processed by hand in the conventional laboratory way, as well as those photographed in the lazy susan. The direction of future work with the dumbwaiter will depend very much on the results of those experiments, both in the biological and in the medical field. One can readily imagine a number of
other applications for the dumbwaiter and flying spot scanner, since the whole system is actually an automated way of presenting about 100 square meters of growing surface for periodic photography and analysis by the flying spot scanner. The dishes could contain a great variety of microorganisms growing on agar, animal tissue cells growing in liquid medium, or even larger organisms swimming in water or crawling on a dry surface. The dumbwaiter has flexibility to match that of the flying spot scanner in that its environment and motions, as well as frequency of photography and manipulation, are under control of a programmable computer.

One can imagine carrying out studies of behavior and learning in simple organisms using the dumbwaiter and flying spot scanner. In the first part of an experiment the organisms would be watched passively as they moved around, and a record kept of the statistical behavior of their speed of motion, frequency of turns, and response to light and other stimuli. In this way the computer could, after a modest observing time, present to the experimenter a dossier on the statistical behavior of each individual organism, information that would be virtually impossible to acquire without such automation. Once this "normal behavior" has been observed thoroughly, it should be possible to try to train the animals by altering the environment in specific ways. For example, it would be easy to program the computer to direct a bright spot of light to a photosensitive organ of the organism whenever it begins to turn to the right. Assuming that this bright spot of light constitutes "punishment" for the animal, it could serve to train the animal to avoid right turns. After an extended training period in which the scanner could "train and monitor" hundreds of individuals, the flying spot scanner would be returned to its passive observation mode to see whether any of the animals had "learned" to avoid right turns. If this basic training and observation technique is successful for some organism, one can imagine a fascinating series of studies of the genetics of different kinds of behavior and response to stimuli.

Although the system I have described involves photography of growing colonies followed by subsequent analysis by the scanner, and then perhaps manipulations based on the scanner output, in the future it should be possible to operate the scanner on-line in real time with the biological process. The great advantage of photography is that a wide range of illumination systems can be used, and photography can be carried out during the biological experiment even if the computer is not operating properly. On the other hand, the computer can be analyzing film from a library of past experiments even if the dumbwaiter is down for repairs or a particular day's biological work is unsuccessful. Only when the dumbwaiter and the flying spot scanner system are working very reliably would it be feasible to operate the two of them together in real time. Since there are great advantages in being able to make changes in the experiment based on analysis of the data during the experiment, we hope very much to reach that stage.*
Discussion
Nathan: Are you not a bit concerned about digitizing to get the 64 levels of gray from film?

Glaser: Yes, I am. I do not think it has been done at high speed, but I know that people have done it at lower speeds. We are certainly going to try to do it. There is another uncertainty. We are not sure that the biological phenomenon is reproducible to that accuracy. We hope to go as far as the reproducibility allows.

Brown: Would it not be somewhat better to do a considerable amount of preprocessing and look at the kinds of things that one is really interested in? For example, one might examine various moments of density, which could be computed by special purpose equipment; also, fewer parameters might be entered at a lower rate into the computer for the pattern recognition.

Glaser: Yes. I have no doubt that once we know exactly what we want to do with the data, we will do just what you are suggesting. At first we began with a naked computer and discovered it was too slow to make an acceptable raster scan, so we built the controller. If it turns out that curve following is better, we will build a special purpose circuit which can be told, "Follow this curve until a certain event occurs." We have not reached that far into the analysis of these profiles across the diameter. It may be that we shall preprocess the profiles instead of putting all the data into the computer. Our first attempt will involve obtaining a hundred cuts per colony and doing a pattern match, a simple correlation with proposed standard colonies to see if an unknown colony fits one of the known ones.

Lederberg: I have a more specific question. I want to know how the colonies are illuminated. What do the levels of gray mean with respect to transparent bacterial colonies which are scattered a little bit, shaped like a lens, and so on?

Glaser: A colony is indeed a complicated object. Viewed in cross section, it may well have a lenticular shape of one kind or another. The impression one gets from above is some combination of the lens effect and of pigments and particles which may absorb and scatter as turbidity. The picture obtained cannot be interpreted in any simple way in terms of what really may be there. Therefore, it is only a "fingerprint".

* I would like to acknowledge with great respect and gratitude the contributions of my colleagues who have helped design and construct the equipment described here, including Dr. W. H. Wattenburg (the computer system and many of the auxiliary data handling devices); Mr. Robert Henry (the interfaces between the computer and scanner and some of the systems programming); Mr. Ray Kenyon (the flying spot scanner and much of the stepping motor control system); Messrs. Leif Hansen and Ronald Baker (the mechanical, optical, photographic, and environmental equipment), and Messrs. Fraser Bonnell, Don Segal and Steven Sondov (the systems, pattern recognition, and data analysis programs for the two computers).
Our attitude has been to experiment with various optical systems in order to obtain the fingerprint which does the best job of distinguishing among the cases in which we are interested. Sometimes we use scattering from above, top illumination, which gives some idea of the contours; sometimes from below. To save film we now have a system which involves illuminating from below with a half-silvered mirror, so that we get some down-scattered light. Then the picture is a composite, which is more interesting to look at and has more information in it than either simple type of picture. We made a discovery quite by accident which may be very important, and may indeed have quite general application. Since this object is a lens, it will bring light to a focus at various points above and below the colony when illuminated with parallel light. The constellation of image points and halos results from the "refractive shape" of the colony. We found that, if we take pictures deliberately out of focus, we can get sharply defined ring and point images that are in focus at different heights. Our most useful fingerprint is likely to be a composite of an in-focus picture (very often quite uninteresting) and out-of-focus pictures containing a variety of rings and points. It may be that two or three photographs will be necessary for distinguishing among large numbers of different types of colonies. Color filters also make dramatic differences in the appearance of the colony. The answer to the question is that the pictures do not tell literally what is going on, so you simply want to make them as interesting as you can for good discrimination.

Minsky: If you had a polarizing nutrient that is consumed, you could get very large effects and stay in the same place.

Glaser: Yes. Pigment effects can be seen, of course, and there are tests for the fermentation of sugars that produce a color change in the neighborhood of colonies. It is not only the colony that can be analyzed but sometimes its environment too, when there is an interaction.

Johns: I was wondering if deviations of area about the mean area were helpful in detecting overlaps.

Glaser: We do not know that yet. It is often true that isolated colonies are bigger than ones that overlap. The diameters are all different.

Macy: I wondered in general, is your problem dealing with a single plate ordinarily containing a large number of colonies of the same organism, or are you occasionally trying to spot one or more colonies of a different or a slightly different organism?

Glaser: Our immediate uses are for counting colonies which are all alike. Another problem is to pick out a mutant. If, for example, it is desirable to pick mutants that will not grow in the absence of histidine, a plate is prepared with a limiting amount of histidine and small colonies are looked for; these must then be tested against other things to know that it was the histidine gene that was defective. Sometimes one is looking for an oddball, an
unusual colony; this medical application is rather akin to ecology; the whole flora and fauna are examined for one particular type in a background which may be quite varied.

Macy: Do not these three different possibilities call for rather different fingerprint strategies?

Glaser: Yes, I am sure they do. For best accuracy in counting objects, the photographic system is set up to get just black circles. There is no character, no profile.

Macy: Would that sort of defocused fingerprint be at all applicable in the case of overlapped colonies?

Glaser: I doubt it, but I do not know, as we have not looked at them.

Prewitt: The age of a colony must certainly be considered in any attempts to identify colonies by comparing their density profiles with a set of standards. To handle this factor, you might use a dictionary of profiles, or perhaps introduce auxiliary criteria.

Glaser: Yes. For medical purposes one would like to make this identification as early as possible, so our problem is going to be, "How early is the profile distinct enough to differentiate one out of 50 types?" To know that we are comparing types of the same age of development, we are proposing to try such things as insisting that the diameters be alike, or that the total integrated opacity be alike, or that the position of some standard peaks be the same. We do not know what is going to be best.

McCormick: In other words, you have to speak to these nonsymmetrical colonies?

Glaser: We tried deliberately to take pictures in glancing light instead of normally incident light because it is obvious to anyone who has worked with these colonies that visual examination is best when they are examined at some glancing angle, not when looking straight on. But the asymmetry was a little frightening to us from the point of view of the pattern recognition problem. That is what led to defocusing instead. So far our effort has been to symmetrize the photography as much as possible, otherwise we should have to symmetrize in the data handling later on for the simplest kind of recognition. If that fails, then I think we will try to use glancing light and take advantage of the shadow effects. At the moment we are regarding asymmetry as an artifact of the system. It is against nature that there should be an asymmetry for a colony which is in the middle of an infinite sea with nothing else around. The only question is, "How close do you have to be to the edge or to another cell?" In fact, some petri dishes contain ripples which produce optical deviations in the agar. If the agar is not poured and allowed to gel very carefully, optical inhomogeneity results; a colony growing on such an optical inhomogeneity is asymmetric, presumably because differences in density make differences in permeability.

Lederberg: First, many organisms are nonspherical. Secondly, an accident occurring at the very first division of the E. coli cells, for example, may
have a further effect on the development of the colony. I am not at all convinced that one should get perfect symmetry. We are not dealing with isotropic hard black balls that are going to fall at random from their early growth.

Glaser: We have done experiments along that line. We have started colonies and photographed them under the microscope, that is, as single cells. As Dr. Lederberg pointed out, early divisions may produce asymmetries. But we find that in every case of nonswimming organisms (swimmers often make puddles) the colony has usually rounded out quite well by about ten hours. There is also the case where a mutation occurs at some stage, sometimes very early, in which two daughter cells are not identical. Then, when the colony grows up, sectored colonies are found; if this occurred, say, at the fourth stage, a colony containing four quadrants is the result. Such sectored colonies are used very efficiently as ways of picking up mutants. We are doing this in our laboratory now, using very tiny colonies. We get 1000 or 10,000 colonies per dish and study them under the microscope. One can certainly see asymmetry as a result of a genetic event, but in most of the cases we have looked at, the colonies round up by 10 or 12 hours, though different strains require different times. The question, as a whole, is a very interesting one. How large a colony, with how many component cells, does one have to have before the colony properties are independent of statistical fluctuation? It is an interesting sort of problem to study if one is interested in morphogenesis, how organs are formed and get their shape. I am not sure it has any relation to higher animals, but some more ambitious members of our group have wanted to make a theory of why the colony looks as it does.

Brown: In some clinical problems, you may be interested in making the detection as early as possible, and some of the nonsymmetrical growth patterns may be most helpful in that very early identification.

Glaser: Using the microscope, asymmetry in the early colonies would tell something about the shape or division habit of the cell.

Prewitt: At the risk of anticipating some material that might be introduced in subsequent presentations at this conference, I would like to draw attention by means of example to similarities of problem and technique that recur in the kinds of image processing that we are discussing here. Dr. Glaser's work on colony identifications shares certain features with the processing and recognition of chromosome and blood cell images. The delineation of several interacting colonies is reminiscent of the problems of estimating the positions of constrictions in individual chromosomes and of separating proximate chromosomes. The two problems can be treated with similar heuristics: locating a relative minimum in a distance function or analyzing local curvature in the image. A comparison can also be made between the attempt to identify colonies by matching their density profiles with templates
and standards, and our use of optical density distribution functions in characterizing white blood cells. In both cases, the image is (partially) described by a digitized function of one variable, and images are compared indirectly by comparing the associated function-descriptors directly.

Glaser: Our way of dealing with that was to take the profile containing 100 numbers, and slide the proposed standard profile along it, looking for an acceptable correlation.

McCormick: Would you give us a few dimensions on the size of your colonies?

Glaser: The smallest colonies we have measured with the projection comparator are 50 μ in diameter. If we are operating at the 4000-line level and looking at a 10 cm or 100 mm dish, which is how we have designed everything so far, then that is 40 lines per millimeter. A single hit then would correspond to a 25 μ object. I think that we are going to be limited to things that are larger than that, so I would say 100 μ is our minimum size for establishing any kind of circularity. Our noise rejection is related to that dimension, and we have provided in the controller I described not only the ability to scan a raster, which is prescribed by giving a starting point and the increment, but also the capability to vary spot size. In the initial once-over-lightly pass of a picture we would say, "Pick a 200-line raster and make the spot large so that the spot will not see the little grains of dirt, but only objects which cover half of the sampling size." The dimensions are very much in our hands—spot size and raster.

McCormick: Are you intending to provide in your dumbwaiter the facility to index internally to a single colony?

Glaser: Yes.

McCormick: Will you be able to get a little piece of what is inside and reinsert it?

Glaser: Yes. All of the motions are built on the use of a PDP-8 computer. There are no levers, gears or clockwork inside. All of the motions are spring-loaded ball screws driven by stepping motors connected to the computer.

McCormick: And the tolerances on these are a few microns?

Glaser: No; closer to 20 μ, about a mil or half a mil. Those are the machine tolerances for inexpensive, commercially available ball screws. The stepping motors make 200 steps per revolution. Most likely, the PDP-8 will be given tapes out of the PDP-6 telling it to put the dish in a certain position and then pick a colony. We will use either our ball-screw indexing machine or have a separate X-Y plotter in which the ballpoint pen has been replaced by a little hairpin of platinum, which picks the colony and puts it somewhere else. By passing an electric current through the hairpin, it is made sterile and you can go to the next one.

McCormick: I was wondering whether you could index within a colony,
Glaser: If the colony is two millimeters in diameter and we have an accuracy of motion of tens of microns, then within one per cent or so we can decide to pick the center or the edge. That is perhaps what you had in mind.

McCormick: Yes.

Lederberg: When can we buy one of these?

Glaser: Somebody else has to sell you the scanner; many companies make scanners. But the dumbwaiter will be more or less homemade, with major parts manufactured by subcontractors. Once the design has been tested, it could be copied commercially.
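Glaser's matching step, sliding a standard density profile along a measured one and looking for an acceptable correlation, is a one-dimensional template match. The following is a minimal sketch of that idea, not Glaser's actual program (which ran on a PDP-6 and is not reproduced here); the function name and the acceptance threshold are illustrative assumptions.

    import numpy as np

    def best_profile_match(measured, standard, threshold=0.9):
        """Slide `standard` along `measured` and report the best fit.

        Both arguments are 1-D arrays of density samples (Glaser's
        profiles held about 100 numbers). Returns (offset, score) for
        the best alignment, or None if no offset reaches `threshold`.
        Assumes len(measured) >= len(standard)."""
        m = np.asarray(measured, dtype=float)
        s = np.asarray(standard, dtype=float)
        best_off, best_r = None, -1.0
        for off in range(len(m) - len(s) + 1):
            window = m[off:off + len(s)]
            # Pearson correlation of the window with the standard;
            # a flat window would give nan and is simply skipped.
            r = np.corrcoef(window, s)[0, 1]
            if np.isfinite(r) and r > best_r:
                best_off, best_r = off, r
        return (best_off, best_r) if best_r >= threshold else None

Exhaustive search over the offsets is cheap at this scale: with profiles of about 100 samples, only a few dozen alignments need be tested.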
REFERENCE

1. GLASER, D. A., and WATTENBURG, W. H., An automated system for the growth and analysis of large numbers of bacterial colonies using an environmental chamber and a computer-controlled flying-spot scanner. Ann. N. Y. Acad. Sci., 1966, 139: 243-257.
AUTOMATIC PROCESSING OF MAMMOGRAMS°
JOSIAH MACY, Jr.,† FRED WINSBERG and WILLIAM H. WEYMOUTH

Albert Einstein College of Medicine, Yeshiva University
Bronx, New York
A study of the possibility of devising automated methods for the detection of breast cancer was undertaken for three reasons: (a) the need for large-scale screening projects in public health, (b) the lack of trained people to evaluate soft-tissue X rays on such a scale, and (c) the desirability of devising image processing techniques to cope with cases in which no distinctive pattern can be defined as a standard.

The importance of early detection of any form of malignancy is too well known to require further emphasis here. In addition to meeting the problem of finding enough radiological competence for evaluation of the very large numbers of films to be obtained in a screening project, the development of automated techniques will overcome any problem of fatigue or tedium in human scanning of many films per hour or day. The computer will not bog down late in the day and can work effectively at scanning for 24 hours every day. Only those films which cannot safely be categorized as malignant or nonmalignant will require inspection by human radiologists for further evaluation.

The major problem in categorizing patterns in soft-tissue X rays is the lack of any clear definitions or absolute criteria for the properties of a lesion or tumor. There is a wide variation of densities and patterns in normal tissue, and no single distinctive pattern which defines a tumor or lesion. We have assumed that successful techniques for dealing with this variable array of densities and patterns would also be applicable to other areas of image processing in which there is a similar lack of clear criteria.

° We are indebted to the Hogan Faximile Co. for their help and their generosity in giving us the basic mechanism of our scanner. All the prints used in the study were made for us by Mr. Alexander Rota of the American Museum of Natural History; we are grateful for his skill and cooperation. This research was supported in part by the U.S. Public Health Service, National Institutes of Health, grants CA-0718 and NB-03491.

† Now at the University of Alabama Medical Center, Birmingham, Alabama.
Original efforts to follow the procedures of the radiologist were shelved for further investigation when it was discovered that the conventional radiological descriptions of a single lesion were not quantifiable without reference to the complex in which the lesion is viewed. An attempt at programming the computer to look for a sequence of classical signs and symptoms has therefore been postponed; instead, we have turned to a technique which attempts to extract from each picture those features which most readily and accurately define a lesion, and to use these features in combination to characterize the density pattern.

The mammograms used in this study were obtained using immersion material to produce an approximately isodense negative. Prints were then produced from these negatives and scanned by a reflected-light scanner. This scanner had a resolution of 125 points to the inch in both dimensions and about 32 levels of gray. The densities measured at each point were then recorded on digital magnetic tape. These tapes were used as input to the computer, a CDC 160-A with 24K of memory and four tape drives. Some of the techniques used in the early stages of the project were necessary to overcome the limitations of our computing equipment.

Typical problems of radiological evaluation were selected by the project radiologist, printed and scanned for preliminary analyses. These included (a) clear cases with a large dense tumor, shape distortion, skin thickening and tracks to the alveolar area all present in one breast, and normal structures in the other; (b) typical "normals" with fibrocystic disease, uneven photographic exposure and background, variable visibility of skin line, miscellaneous dense masses and structures and photographic artifacts; (c) cases in which there was considerable asymmetry in the shape and size of a patient's breasts; (d) cases which included portions of chest wall and ribs; and (e) cases showing fibrocystic disease, fibroadenomas, calcifications and various combinations of these abnormalities of structure.

Prints were made from the selected films and scanned, with perhaps 750,000 scanned points per print. Each point was recorded on digital magnetic tape as one of 32 density values. Although all 32 values were recorded at first, further investigation indicated that half that number gave equally useful results, and subsequent work was done on the basis of 16 levels of gray scale. For the preliminary work, complete printed outputs for several scanned breasts were produced and hand-colored by the staff, with each numerical value assigned a separate color of felt-tipped marker. The colored printout sheets were then pasted into 6-foot by 10-foot murals which were used for guidance in the early stages of writing programs.

The first step in processing the scanned material was to adjust the density values to compensate for local differences in exposure. The next step was to define the skin line and establish the actual area of tissue to be examined. After normalizing the scan for variations in exposure and discarding background material outside the skin line, the shape of the breast was standardized.
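The paper does not spell out the normalization algorithm itself; the sketch below is one plausible reading, not the authors' method, assuming the scanned print is held as a 2-D array of gray levels. Slowly varying exposure is estimated with a broad moving average and removed, and the result is requantized to 16 levels; the window size and the names are illustrative.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def normalize_exposure(scan, window=201, levels=16):
        """Compensate local exposure differences in a scanned print.

        `scan` is a 2-D array of gray levels (the scanner gave 125
        points to the inch and about 32 levels). A broad moving
        average estimates the slowly varying exposure; removing it
        leaves the local density structure, which is requantized to
        `levels` gray levels (the authors found 16 as useful as 32)."""
        img = scan.astype(float)
        background = uniform_filter(img, size=window)
        detail = img - background + background.mean()  # keep mean brightness
        lo, hi = detail.min(), detail.max()
        q = np.floor((detail - lo) / (hi - lo + 1e-9) * levels)
        return np.clip(q, 0, levels - 1).astype(int)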
AUTOMATIC
PROCESSING
OF
MAMMOGRAMS
77
To do this, a grid was imposed on the tissue area; the size of each grid square was calculated to represent a standard fraction of the tissue measurement in the different parts of the print, rather than a fixed number of scan points. Each case was forced to a standard shape on the grid by forcing the skin line to pass through the intersection points of the grid. We have chosen to distort the grid, not the picture, in this step. That is, the actual set of scan points is not changed; the grid is forced to pass through the skin line in a standard way, resulting in a grid whose squares represent a fixed fraction of the tissue area but are of variable actual size. Subsequent measurements are normalized with respect to each grid square as a unit.

The choice of square size will be statistically optimized when enough data have been collected for a meaningful sample. The patterns in too small a square will be dominated by some very local phenomenon, such as the passage of a blood vessel. Sensitivity will be lost in too large a square, as the amount of averaging will be too great to allow sufficient adjustment for differences from one part of the tissue to another. Our current guess for the optimum grid-square size is about one-third inch to a side; it seems to be a size which works for our purposes.
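The grid construction is described only in outline. The sketch below is one plausible reading, not the authors' program: assuming the tissue region inside the skin line is available as a boolean mask, each horizontal band of tissue is divided into a fixed number of columns of equal tissue width, so that every square covers a fixed fraction of the tissue whatever the breast's size or shape. The band and column counts are illustrative.

    import numpy as np

    def tissue_grid(mask, n_rows=20, n_cols=12):
        """Impose a shape-standardizing grid on a tissue mask.

        `mask` is a 2-D boolean array, True inside the skin line.
        The image rows containing tissue are cut into `n_rows` bands;
        within each band, every row's tissue extent is cut into
        `n_cols` pieces of equal width, so the skin line always falls
        on grid boundaries. Returns an integer array assigning each
        tissue point a flat grid-square index (-1 outside the tissue)."""
        grid = np.full(mask.shape, -1, dtype=int)
        tissue_rows = np.where(mask.any(axis=1))[0]
        for b, band in enumerate(np.array_split(tissue_rows, n_rows)):
            for r in band:
                cols = np.where(mask[r])[0]
                if cols.size == 0:
                    continue
                edges = np.linspace(cols[0], cols[-1] + 1, n_cols + 1)
                c = np.searchsorted(edges, cols, side="right") - 1
                grid[r, cols] = b * n_cols + np.clip(c, 0, n_cols - 1)
        return grid

Distorting the grid rather than resampling the picture, as the authors chose to do, leaves the original scan points untouched; only the bookkeeping of which point belongs to which square varies.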
For each grid square, four characteristic vectors are developed to describe the patterns of densities within the square. All vectors are normalized so that the particular physical size of any square can thereafter be disregarded. The four vectors describe the occurrence and distribution of the different patterns and densities within a grid square; they include sufficient information to characterize the patterns observed without providing detailed information about specific shapes. Each vector has m components, one for each of the m density levels. Values of 8, 16 and 32 have been tried, with m = 16 the usual choice. These vectors are named Area, Distribution, Occurrence, and Coherence, and are used to describe the properties of the different density levels. There is one component of each of these vectors for each of the density levels in each grid square. The total set of vectors is used to extract what we hope is the pertinent information on the film.

Vector I, the "area" vector, has components that represent the fraction of the grid square occupied by each of the possible gray levels. The number of points occupied by each of the possible levels is divided by the overall area of the square to obtain a normalized vector. The vector gives information on the fraction of the square occupied by each level of density regardless of its distribution.

Vector II, the "distribution" vector, is a measure of the degree of uniformity with which a particular density level is distributed through the square. The occurrence of a particular density level is plotted as a histogram, and the ratio of perimeter to area of the histogram is measured to give an index of "jaggedness". The index is the component for that density level, and similar indices are derived for each density level for Vector II. Minimum values result when a density level is distributed throughout the square uniformly, so that the histogram is most nearly a level line. Maximum values result from the distribution which produces the most jagged histogram, such as one in which the density occurs in alternating stripes. (Alternating stripes are an extreme example, and do not occur as part of a soft-tissue X ray of the breast.)

Vector III, the "occurrence" vector, measures the presence or absence of each density level within each of 16 subsets of the grid square; it indicates only whether a density level was or was not present in that one-sixteenth of the grid square and gives no measurement of the area occupied by the density level within the subset.

Vector IV, the "coherence" vector, is a composite that is actually two vectors in one, giving a four-digit component that measures the degree of symmetry with which a density level occurs within a square. The first two digits represent the ratio of the area occupied by a density level in the top half of the grid square to the area occupied by the same density level in the bottom half of the square. When a particular density level occurs to an approximately equal degree in both halves of the square, the square is scored as symmetrical and is assigned a minimum value. Maximum values are assigned to the least symmetrical squares, where the difference in occurrence of a density level between the two halves is greatest. The second two digits of this vector measure the number of distinct areas which contain the same density level. These second two digits are effectively a "blob count" for each density level separately; the program counts "blobs" in much the same manner as several previously described programs.

Given these vectors and the description of a picture in their terms, two ways of proceeding are possible. The first is to search for an anomalous square: an anomalous vector compared to its neighbors, a sudden change in value of one of these components, or something of that sort. In the case of the mammograms, the obvious method to use is the one that radiologists ordinarily follow: comparison with the opposite side of the same patient. To do this comparison automatically, vectors are produced for both photographs, one of the left breast and one of the right breast, for a patient. From these two sets of vectors a single set of comparison vectors is produced. This set summarizes in the same vector form, not the properties of either picture, but the differences between the two pictures. They are presented grid square by grid square, density level by density level, across all four vectors: Area, Distribution, Occurrence and Coherence. In the case of Area and Distribution, the component is a coded difference which preserves a certain amount of additional information, such as the fact that one side has no entry for that particular component.
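By way of illustration (this is an editor's sketch, not the authors' program), three of the four descriptors are computed below for a single grid square held as a small 2-D array of quantized levels. The Distribution component is omitted because the paper does not define its histogram precisely, and scipy's connected-component labeling stands in for the blob counter; all names are assumptions.

    import numpy as np
    from scipy.ndimage import label

    def square_descriptors(square, m=16):
        """Area, Occurrence and Coherence descriptors for one grid square.

        `square` is a 2-D integer array of density levels 0..m-1.
        Returns, per density level: the fraction of the square at that
        level (Vector I), presence in each of the 16 subsets of a 4 x 4
        partition (Vector III), and a (half-to-half ratio, blob count)
        pair approximating the two parts of Vector IV."""
        h, w = square.shape
        area = np.zeros(m)
        occurrence = np.zeros((m, 16), dtype=bool)
        coherence = []
        subcells = [square[i * h // 4:(i + 1) * h // 4,
                           j * w // 4:(j + 1) * w // 4]
                    for i in range(4) for j in range(4)]
        for lev in range(m):
            hits = (square == lev)
            area[lev] = hits.mean()
            occurrence[lev] = [(sc == lev).any() for sc in subcells]
            top = hits[:h // 2].sum()
            bottom = hits[h // 2:].sum()
            ratio = top / bottom if bottom else np.inf
            _, blobs = label(hits)  # count distinct areas at this level
            coherence.append((ratio, blobs))
        return area, occurrence, coherence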
Thus the density patterns in each grid square can be characterized by these four m-dimensional vectors, which are derived for each pair of prints from each patient. The four vectors for each grid square of the print of the right breast are compared with the four vectors for the corresponding grid square of the matching print of the left breast to generate a set of comparison vectors for the patient, in which each component takes a value representing the essential differences, for that component, in that vector, between the left and right breasts. This single set of comparison vectors has then extracted the essential differences in pattern between the two pictures. The comparison will ignore small local shape differences, differences in exposure, differences in average density throughout one picture, and any differences which may exist in the shape or size of the two sides. It will be sensitive to the differences in density patterns or features which are diagnostic.

For purposes of evaluating the differences that have been found, a single-number difference score is produced by algorithm from each m-dimensional comparison vector for every grid square. These are labeled "index values", and there are four for each square. The probability of a lesion is evaluated on the basis of the index values for the square: a lesion is indicated when the difference score for each vector, and the weighted aggregate score, exceed minimum values. The weighting algorithm for the index value for any one vector over any one square is given by°

$$P = (N \cdot K) \sum_{j=1}^{m} W_j \, |D_j| \, B_j$$
This algorithm is applied to produce four index numbers for each grid square, one for each vector. These index numbers can then produce four surfaces, each of which is a plot of the estimate from that vector of the probability of a lesion. These surfaces can be presented as contour maps, with the peaks identifying the probable lesions or anomalies. After trying several methods of coloring contour maps according to index values, we settled on marking the above-critical squares for each vector with a separate symbol on one chart. These charts seem to be intuitively informative, on an interim basis, and therefore useful to planning at this stage.

In cut-and-dried cases, readily visible to nonradiologists on the simplest instructions from the radiologist, medical diagnosis and computer diagnosis agree completely. In these obvious cases, the comparison vectors turn up anomalies in the place and size that the radiologist points out on the photograph. Figures 33 and 34 show the prints and the program output for a typical obvious carcinoma.
° Where P = index value; N = number of levels for which |D_j| > L_c; L_c = "cut-off" level for trivial differences; K = balance factor (K = 1 if the D_j's of the same sign are contiguous; K is an adjustable constant if D_j's of different signs are mixed); W_j = weighting factor (W_j = 0 if |D_j| < L_c, and W_j = an adjustable function of j, or a constant, if |D_j| > L_c); D_j = j-th component of the difference vector; B_j = symmetry or balance factor (B_j = 1 for Vectors I, II and III; B_j = symmetry component for Vector IV).
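For concreteness, here is a minimal sketch of the index-value computation, transcribing the formula and the footnote above into code. The cut-off L_c and the mixed-sign value of K are left adjustable in the paper, so the defaults here are assumptions, and the contiguity condition on K is simplified to a same-sign test.

    import numpy as np

    def index_value(D, L_c=1.0, K_mixed=0.5, W=None, B=None):
        """Index value P for one m-dimensional comparison vector D.

        P = (N * K) * sum_j W_j * |D_j| * B_j, where N counts the
        levels with |D_j| > L_c, W_j is forced to zero for trivial
        differences, and B_j is 1 except for the symmetry component
        of the Coherence vector."""
        D = np.asarray(D, dtype=float)
        m = len(D)
        W = np.ones(m) if W is None else np.asarray(W, dtype=float)
        B = np.ones(m) if B is None else np.asarray(B, dtype=float)
        significant = np.abs(D) > L_c
        N = int(significant.sum())
        W = np.where(significant, W, 0.0)  # W_j = 0 below the cut-off
        # K = 1 when the significant differences all share one sign
        # (a simplification of "same sign and contiguous"); otherwise
        # an adjustable constant for mixed signs.
        signs = set(np.sign(D[significant]))
        K = 1.0 if len(signs) <= 1 else K_mixed
        return (N * K) * float(np.sum(W * np.abs(D) * B))

Applied over all the squares, the four resulting index surfaces are the contour maps described above, with peaks at the probable anomalies.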
Figure 33. Mammogram film pair showing an obvious carcinoma in the left breast.
Figures 35 and 36 show the prints and program output for an obvious cyst. Figures 37 and 38 illustrate a calcified fibroadenoma. Figures 39 and 40 illustrate fibrocystic disease.

In the more ambiguous cases, the differences which show up in computer diagnosis produce index scores much lower than those produced for clearly obvious tumors or lesions. Figures 41 and 42 illustrate a difficult case, diagnosed as "probable carcinoma, recommend biopsy" by a radiologist; it turned out to be an organized hematoma. The computer program spotted the anomaly at a low level of confidence, with a low set of index values: nowhere was there agreement of all four vectors, and the scores were just over the borderline for significance.

The numerical results offer some hope that a larger number of cases will permit optimizing criteria on a statistical basis and will produce a set of separation techniques or algorithms to distinguish between carcinoma and all other anomalies. To whatever extent this becomes possible, the automated diagnosis would be equal to or somewhat better than that of the human radiologist. Figures 43 and 44 show a benign cyst and the corresponding program output; the index numbers clearly define the cyst.
[Figure 34. Program output for the film pair of Figure 33: a chart of the grid, with above-critical squares marked by a separate symbol for each of the four vectors.]