370 98 23MB
English Pages iv, 812 pages: illustrations (black and white, and colour; 28 cm [819] Year 2013;2014
Computer Graphics with Open GL Hearn Baker Carithers Fourth Edition
Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world Visit us on the World Wide Web at: www.pearsoned.co.uk © Pearson Education Limited 2014 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the publisher or a licence permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS. All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this book by such owners.
ISBN 10: 1-292-02425-9 ISBN 13: 978-1-292-02425-7
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library Printed in the United States of America
P
E
A
R
S
O
N
C U
S T O
M
L
I
B
R
A
R Y
Table of Contents
1. Computer Graphics Hardware Donald D. Hearn/M. Pauline Baker, Warren Carithers
1
Computer Graphics Hardware Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
27
2. Computer Graphics Software Donald D. Hearn/M. Pauline Baker, Warren Carithers
29
3. Graphics Output Primitives Donald D. Hearn/M. Pauline Baker, Warren Carithers
45
4. Attributes of Graphics Primitives Donald D. Hearn/M. Pauline Baker, Warren Carithers
99
5. Implementation Algorithms for Graphics Primitives and Attributes Donald D. Hearn/M. Pauline Baker, Warren Carithers
131
6. Two-Dimensional Geometric Transformations Donald D. Hearn/M. Pauline Baker, Warren Carithers
189
7. Two-Dimensional Viewing Donald D. Hearn/M. Pauline Baker, Warren Carithers
227
8. Three-Dimensional Geometric Transformations Donald D. Hearn/M. Pauline Baker, Warren Carithers
273
9. Three-Dimensional Viewing Donald D. Hearn/M. Pauline Baker, Warren Carithers
301
Three-Dimensional Viewing Color Plate Donald D. Hearn/M. Pauline Baker, Warren Carithers
353
10. Hierarchical Modeling Donald D. Hearn/M. Pauline Baker, Warren Carithers
355
11. Computer Animation Donald D. Hearn/M. Pauline Baker, Warren Carithers
365
I
12. Three-Dimensional Object Representations Donald D. Hearn/M. Pauline Baker, Warren Carithers
389
Three-Dimensional Object Representations Color Plate Donald D. Hearn/M. Pauline Baker, Warren Carithers
407
13. Spline Representations Donald D. Hearn/M. Pauline Baker, Warren Carithers
409
14. Visible-Surface Detection Methods Donald D. Hearn/M. Pauline Baker, Warren Carithers
465
15. Illumination Models and Surface-Rendering Methods Donald D. Hearn/M. Pauline Baker, Warren Carithers
493
Illumination Models and Surface-Rendering Methods Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
541
16. Texturing and Surface-Detail Methods Donald D. Hearn/M. Pauline Baker, Warren Carithers
543
Texturing and Surface-Detail Methods Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
567
17. Color Models and Color Applications Donald D. Hearn/M. Pauline Baker, Warren Carithers
569
Color Models and Color Applications Color Plate Donald D. Hearn/M. Pauline Baker, Warren Carithers
589
18. Interactive Input Methods and Graphical User Interfaces Donald D. Hearn/M. Pauline Baker, Warren Carithers
591
Interactive Input Methods and Graphical User Interfaces Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
631
19. Global Illumination Donald D. Hearn/M. Pauline Baker, Warren Carithers
633
Global Illumination Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
659
20. Programmable Shaders Donald D. Hearn/M. Pauline Baker, Warren Carithers
663
Programmable Shaders Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
693
21. Algorithmic Modeling
II
Donald D. Hearn/M. Pauline Baker, Warren Carithers
695
Algorithmic Modeling Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
725
22. Visualization of Data Sets Donald D. Hearn/M. Pauline Baker, Warren Carithers
729
Visualization of Data Sets Color Plates Donald D. Hearn/M. Pauline Baker, Warren Carithers
735
Appendix: Mathematics for Computer Graphics Donald D. Hearn/M. Pauline Baker, Warren Carithers
737
Appendix: Graphics File Formats Donald D. Hearn/M. Pauline Baker, Warren Carithers
773
Bibliography Donald D. Hearn/M. Pauline Baker, Warren Carithers
789
Index
801
III
This page intentionally left blank
Computer Graphics Hardware
1
Video Display Devices
2
Raster-Scan Systems
3
Graphics Workstations and Viewing Systems
4
Input Devices
5
Hard-Copy Devices
6
Graphics Networks
7
Graphics on the Internet
8
Summary
T
he power and utility of computer graphics is widely recognized, and a broad range of graphics hardware and software systems is now available for applications in virtually all fields. Graphics capabilities for both two-dimensional and threedimensional applications are now common, even on general-purpose computers and handheld calculators. With personal computers, we can use a variety of interactive input devices and graphics software packages. For higher-quality applications, we can choose from a number of sophisticated special-purpose graphics hardware systems and technologies. In this chapter, we explore the basic features of graphics hardware components and graphics software packages.
From Chapter 2 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
1
Computer Graphics Hardware
1 Video Display Devices Typically, the primary output device in a graphics system is a video monitor. Historically, the operation of most video monitors was based on the standard cathode-ray tube (CRT) design, but several other technologies exist. In recent years, flat-panel displays have become significantly more popular due to their reduced power consumption and thinner designs.
Refresh Cathode-Ray Tubes Figure 1 illustrates the basic operation of a CRT. A beam of electrons (cathode rays), emitted by an electron gun, passes through focusing and deflection systems that direct the beam toward specified positions on the phosphor-coated screen. The phosphor then emits a small spot of light at each position contacted by the electron beam. Because the light emitted by the phosphor fades very rapidly, some method is needed for maintaining the screen picture. One way to do this is to store the picture information as a charge distribution within the CRT. This charge distribution can then be used to keep the phosphors activated. However, the most common method now employed for maintaining phosphor glow is to redraw the picture repeatedly by quickly directing the electron beam back over the same screen points. This type of display is called a refresh CRT, and the frequency at which a picture is redrawn on the screen is referred to as the refresh rate. The primary components of an electron gun in a CRT are the heated metal cathode and a control grid (Fig. 2). Heat is supplied to the cathode by directing a current through a coil of wire, called the filament, inside the cylindrical cathode structure. This causes electrons to be “boiled off” the hot cathode surface. In
Focusing System
Base
FIGURE 1
Connector Pins
Basic design of a magnetic-deflection CRT.
Magnetic Deflection Coils
Electron Beam
Electron Gun
Cathode
Electron Beam Path
Focusing Anode
Heating Filament
FIGURE 2
Operation of an electron gun with an accelerating anode.
2
PhosphorCoated Screen
Control Grid
Accelerating Anode
Computer Graphics Hardware
the vacuum inside the CRT envelope, the free, negatively charged electrons are then accelerated toward the phosphor coating by a high positive voltage. The accelerating voltage can be generated with a positively charged metal coating on the inside of the CRT envelope near the phosphor screen, or an accelerating anode, as in Figure 2, can be used to provide the positive voltage. Sometimes the electron gun is designed so that the accelerating anode and focusing system are within the same unit. Intensity of the electron beam is controlled by the voltage at the control grid, which is a metal cylinder that fits over the cathode. A high negative voltage applied to the control grid will shut off the beam by repelling electrons and stopping them from passing through the small hole at the end of the controlgrid structure. A smaller negative voltage on the control grid simply decreases the number of electrons passing through. Since the amount of light emitted by the phosphor coating depends on the number of electrons striking the screen, the brightness of a display point is controlled by varying the voltage on the control grid. This brightness, or intensity level, is specified for individual screen positions with graphics software commands. The focusing system in a CRT forces the electron beam to converge to a small cross section as it strikes the phosphor. Otherwise, the electrons would repel each other, and the beam would spread out as it approaches the screen. Focusing is accomplished with either electric or magnetic fields. With electrostatic focusing, the electron beam is passed through a positively charged metal cylinder so that electrons along the center line of the cylinder are in an equilibrium position. This arrangement forms an electrostatic lens, as shown in Figure 2, and the electron beam is focused at the center of the screen in the same way that an optical lens focuses a beam of light at a particular focal distance. Similar lens focusing effects can be accomplished with a magnetic field set up by a coil mounted around the outside of the CRT envelope, and magnetic lens focusing usually produces the smallest spot size on the screen. Additional focusing hardware is used in high-precision systems to keep the beam in focus at all screen positions. The distance that the electron beam must travel to different points on the screen varies because the radius of curvature for most CRTs is greater than the distance from the focusing system to the screen center. Therefore, the electron beam will be focused properly only at the center of the screen. As the beam moves to the outer edges of the screen, displayed images become blurred. To compensate for this, the system can adjust the focusing according to the screen position of the beam. As with focusing, deflection of the electron beam can be controlled with either electric or magnetic fields. Cathode-ray tubes are now commonly constructed with magnetic-deflection coils mounted on the outside of the CRT envelope, as illustrated in Figure 1. Two pairs of coils are used for this purpose. One pair is mounted on the top and bottom of the CRT neck, and the other pair is mounted on opposite sides of the neck. The magnetic field produced by each pair of coils results in a transverse deflection force that is perpendicular to both the direction of the magnetic field and the direction of travel of the electron beam. Horizontal deflection is accomplished with one pair of coils, and vertical deflection with the other pair. The proper deflection amounts are attained by adjusting the current through the coils. When electrostatic deflection is used, two pairs of parallel plates are mounted inside the CRT envelope. One pair of plates is mounted horizontally to control vertical deflection, and the other pair is mounted vertically to control horizontal deflection (Fig. 3). Spots of light are produced on the screen by the transfer of the CRT beam energy to the phosphor. When the electrons in the beam collide with the phosphor
3
Computer Graphics Hardware
Base
FIGURE 3
Electrostatic deflection of the electron beam in a CRT.
FIGURE 4
Intensity distribution of an illuminated phosphor spot on a CRT screen.
FIGURE 5
Two illuminated phosphor spots are distinguishable when their separation is greater than the diameter at which a spot intensity has fallen to 60 percent of maximum.
4
Connector Pins
Focusing System
Electron Gun
Vertical Deflection Plates
Horizontal Deflection Plates
PhosphorCoated Screen
Electron Beam
coating, they are stopped and their kinetic energy is absorbed by the phosphor. Part of the beam energy is converted by friction into heat energy, and the remainder causes electrons in the phosphor atoms to move up to higher quantum-energy levels. After a short time, the “excited” phosphor electrons begin dropping back to their stable ground state, giving up their extra energy as small quantums of light energy called photons. What we see on the screen is the combined effect of all the electron light emissions: a glowing spot that quickly fades after all the excited phosphor electrons have returned to their ground energy level. The frequency (or color) of the light emitted by the phosphor is in proportion to the energy difference between the excited quantum state and the ground state. Different kinds of phosphors are available for use in CRTs. Besides color, a major difference between phosphors is their persistence: how long they continue to emit light (that is, how long it is before all excited electrons have returned to the ground state) after the CRT beam is removed. Persistence is defined as the time that it takes the emitted light from the screen to decay to one-tenth of its original intensity. Lower-persistence phosphors require higher refresh rates to maintain a picture on the screen without flicker. A phosphor with low persistence can be useful for animation, while high-persistence phosphors are better suited for displaying highly complex, static pictures. Although some phosphors have persistence values greater than 1 second, general-purpose graphics monitors are usually constructed with persistence in the range from 10 to 60 microseconds. Figure 4 shows the intensity distribution of a spot on the screen. The intensity is greatest at the center of the spot, and it decreases with a Gaussian distribution out to the edges of the spot. This distribution corresponds to the cross-sectional electron density distribution of the CRT beam. The maximum number of points that can be displayed without overlap on a CRT is referred to as the resolution. A more precise definition of resolution is the number of points per centimeter that can be plotted horizontally and vertically, although it is often simply stated as the total number of points in each direction. Spot intensity has a Gaussian distribution (Fig. 4), so two adjacent spots will appear distinct as long as their separation is greater than the diameter at which each spot has an intensity of about 60 percent of that at the center of the spot. This overlap position is illustrated in Figure 5. Spot size also depends on intensity. As more electrons are accelerated toward the phosphor per second, the diameters of the CRT beam and the illuminated spot increase. In addition, the increased excitation energy tends to spread to neighboring phosphor atoms not directly in the path of the beam, which further increases the spot diameter. Thus, resolution of a CRT is dependent on the type of phosphor, the intensity to be displayed, and the focusing and deflection systems. Typical resolution on high-quality systems is 1280 by 1024, with higher resolutions available on many systems. High-resolution systems are often referred to as high-definition systems.
Computer Graphics Hardware
The physical size of a graphics monitor, on the other hand, is given as the length of the screen diagonal, with sizes varying from about 12 inches to 27 inches or more. A CRT monitor can be attached to a variety of computer systems, so the number of screen points that can actually be plotted also depends on the capabilities of the system to which it is attached.
Raster-Scan Displays The most common type of graphics monitor employing a CRT is the raster-scan display, based on television technology. In a raster-scan system, the electron beam is swept across the screen, one row at a time, from top to bottom. Each row is referred to as a scan line. As the electron beam moves across a scan line, the beam intensity is turned on and off (or set to some intermediate value) to create a pattern of illuminated spots. Picture definition is stored in a memory area called the refresh buffer or frame buffer, where the term frame refers to the total screen area. This memory area holds the set of color values for the screen points. These stored color values are then retrieved from the refresh buffer and used to control the intensity of the electron beam as it moves from spot to spot across the screen. In this way, the picture is “painted” on the screen one scan line at a time, as demonstrated in Figure 6. Each screen spot that can be illuminated by the electron beam is referred to as a pixel or pel (shortened forms of picture element). Since the refresh buffer is used to store the set of screen color values, it is also sometimes called a color buffer. Also, other kinds of pixel information, besides color, are stored in buffer locations, so all the different buffer areas are sometimes referred to collectively as the “frame buffer.” The capability of a raster-scan system to store color information for each screen point makes it well suited for the realistic display of scenes containing subtle shading and color patterns. Home television sets and printers are examples of other systems using raster-scan methods. Raster systems are commonly characterized by their resolution, which is the number of pixel positions that can be plotted. Another property of video monitors
(a)
(b)
FIGURE 6
(c)
(d)
A raster-scan system displays an object as a set of discrete points across each scan line.
5
Computer Graphics Hardware
is aspect ratio, which is now often defined as the number of pixel columns divided by the number of scan lines that can be displayed by the system. (Sometimes this term is used to refer to the number of scan lines divided by the number of pixel columns.) Aspect ratio can also be described as the number of horizontal points to vertical points (or vice versa) necessary to produce equal-length lines in both directions on the screen. Thus, an aspect ratio of 4/3, for example, means that a horizontal line plotted with four points has the same length as a vertical line plotted with three points, where line length is measured in some physical units such as centimeters. Similarly, the aspect ratio of any rectangle (including the total screen area) can be defined to be the width of the rectangle divided by its height. The range of colors or shades of gray that can be displayed on a raster system depends on both the types of phosphor used in the CRT and the number of bits per pixel available in the frame buffer. For a simple black-and-white system, each screen point is either on or off, so only one bit per pixel is needed to control the intensity of screen positions. A bit value of 1, for example, indicates that the electron beam is to be turned on at that position, and a value of 0 turns the beam off. Additional bits allow the intensity of the electron beam to be varied over a range of values between “on” and “off.” Up to 24 bits per pixel are included in high-quality systems, which can require several megabytes of storage for the frame buffer, depending on the resolution of the system. For example, a system with 24 bits per pixel and a screen resolution of 1024 by 1024 requires 3 MB of storage for the refresh buffer. The number of bits per pixel in a frame buffer is sometimes referred to as either the depth of the buffer area or the number of bit planes. A frame buffer with one bit per pixel is commonly called a bitmap, and a frame buffer with multiple bits per pixel is a pixmap, but these terms are also used to describe other rectangular arrays, where a bitmap is any pattern of binary values and a pixmap is a multicolor pattern. As each screen refresh takes place, we tend to see each frame as a smooth continuation of the patterns in the previous frame, so long as the refresh rate is not too low. Below about 24 frames per second, we can usually perceive a gap between successive screen images, and the picture appears to flicker. Old silent films, for example, show this effect because they were photographed at a rate of 16 frames per second. When sound systems were developed in the 1920s, motionpicture film rates increased to 24 frames per second, which removed flickering and the accompanying jerky movements of the actors. Early raster-scan computer systems were designed with a refresh rate of about 30 frames per second. This produces reasonably good results, but picture quality is improved, up to a point, with higher refresh rates on a video monitor because the display technology on the monitor is basically different from that of film. A film projector can maintain the continuous display of a film frame until the next frame is brought into view. But on a video monitor, a phosphor spot begins to decay as soon as it is illuminated. Therefore, current raster-scan displays perform refreshing at the rate of 60 to 80 frames per second, although some systems now have refresh rates of up to 120 frames per second. And some graphics systems have been designed with a variable refresh rate. For example, a higher refresh rate could be selected for a stereoscopic application so that two views of a scene (one from each eye position) can be alternately displayed without flicker. But other methods, such as multiple frame buffers, are typically used for such applications. Sometimes, refresh rates are described in units of cycles per second, or hertz (Hz), where a cycle corresponds to one frame. Using these units, we would describe a refresh rate of 60 frames per second as simply 60 Hz. At the end of each scan line, the electron beam returns to the left side of the screen to begin displaying the next scan line. The return to the left of the screen, after refreshing
6
Computer Graphics Hardware
0 1 2 3
FIGURE 7
Interlacing scan lines on a raster-scan display. First, all points on the even-numbered (solid) scan lines are displayed; then all points along the odd-numbered (dashed) lines are displayed.
each scan line, is called the horizontal retrace of the electron beam. And at the 1 1 to 60 of a second), the electron beam returns end of each frame (displayed in 80 to the upper-left corner of the screen (vertical retrace) to begin the next frame. On some raster-scan systems and TV sets, each frame is displayed in two passes using an interlaced refresh procedure. In the first pass, the beam sweeps across every other scan line from top to bottom. After the vertical retrace, the beam then sweeps out the remaining scan lines (Fig. 7). Interlacing of the scan lines in this way allows us to see the entire screen displayed in half the time that it would have taken to sweep across all the lines at once from top to bottom. This technique is primarily used with slower refresh rates. On an older, 30 frameper-second, non-interlaced display, for instance, some flicker is noticeable. But 1 with interlacing, each of the two passes can be accomplished in 60 of a second, which brings the refresh rate nearer to 60 frames per second. This is an effective technique for avoiding flicker—provided that adjacent scan lines contain similar display information.
Random-Scan Displays When operated as a random-scan display unit, a CRT has the electron beam directed only to those parts of the screen where a picture is to be displayed. Pictures are generated as line drawings, with the electron beam tracing out the component lines one after the other. For this reason, random-scan monitors are also referred to as vector displays (or stroke-writing displays or calligraphic displays). The component lines of a picture can be drawn and refreshed by a random-scan system in any specified order (Fig. 8). A pen plotter operates in a similar way and is an example of a random-scan, hard-copy device. Refresh rate on a random-scan system depends on the number of lines to be displayed on that system. Picture definition is now stored as a set of line-drawing commands in an area of memory referred to as the display list, refresh display file, vector file, or display program. To display a specified picture, the system cycles through the set of commands in the display file, drawing each component line in turn. After all line-drawing commands have been processed, the system cycles back to the first line command in the list. Random-scan displays are designed to draw all the component lines of a picture 30 to 60 times each second, with up to 100,000 “short” lines in the display list. When a small set of lines is to be displayed, each refresh cycle is delayed to avoid very high refresh rates, which could burn out the phosphor. Random-scan systems were designed for line-drawing applications, such as architectural and engineering layouts, and they cannot display realistic shaded scenes. Since picture definition is stored as a set of line-drawing instructions rather than as a set of intensity values for all screen points, vector displays generally have higher resolutions than raster systems. Also, vector displays produce smooth line
7
Computer Graphics Hardware
(a)
(b)
(c)
(d)
FIGURE 8
A random-scan system draws the component lines of an object in any specified order.
drawings because the CRT beam directly follows the line path. A raster system, by contrast, produces jagged lines that are plotted as discrete point sets. However, the greater flexibility and improved line-drawing capabilities of raster systems have resulted in the abandonment of vector technology.
Color CRT Monitors A CRT monitor displays color pictures by using a combination of phosphors that emit different-colored light. The emitted light from the different phosphors merges to form a single perceived color, which depends on the particular set of phosphors that have been excited. One way to display color pictures is to coat the screen with layers of differentcolored phosphors. The emitted color depends on how far the electron beam penetrates into the phosphor layers. This approach, called the beam-penetration method, typically used only two phosphor layers: red and green. A beam of slow electrons excites only the outer red layer, but a beam of very fast electrons penetrates the red layer and excites the inner green layer. At intermediate beam speeds, combinations of red and green light are emitted to show two additional colors: orange and yellow. The speed of the electrons, and hence the screen color at any point, is controlled by the beam acceleration voltage. Beam penetration has been an inexpensive way to produce color, but only a limited number of colors are possible, and picture quality is not as good as with other methods. Shadow-mask methods are commonly used in raster-scan systems (including color TV) because they produce a much wider range of colors than the beampenetration method. This approach is based on the way that we seem to perceive colors as combinations of red, green, and blue components, called the RGB color model. Thus, a shadow-mask CRT uses three phosphor color dots at each pixel position. One phosphor dot emits a red light, another emits a green light, and the third emits a blue light. This type of CRT has three electron guns, one for each color dot, and a shadow-mask grid just behind the phosphor-coated screen. The
8
Computer Graphics Hardware Section of Shadow Mask
Electron Guns
B G R
Red Green Blue
Magnified Phosphor-Dot Triangle
Screen
FIGURE 9
Operation of a delta-delta, shadow-mask CRT. Three electron guns, aligned with the triangular color-dot patterns on the screen, are directed to each dot triangle by a shadow mask.
light emitted from the three phosphors results in a small spot of color at each pixel position, since our eyes tend to merge the light emitted from the three dots into one composite color. Figure 9 illustrates the delta-delta shadow-mask method, commonly used in color CRT systems. The three electron beams are deflected and focused as a group onto the shadow mask, which contains a series of holes aligned with the phosphor-dot patterns. When the three beams pass through a hole in the shadow mask, they activate a dot triangle, which appears as a small color spot on the screen. The phosphor dots in the triangles are arranged so that each electron beam can activate only its corresponding color dot when it passes through the shadow mask. Another configuration for the three electron guns is an in-line arrangement in which the three electron guns, and the corresponding RGB color dots on the screen, are aligned along one scan line instead of in a triangular pattern. This in-line arrangement of electron guns is easier to keep in alignment and is commonly used in high-resolution color CRTs. We obtain color variations in a shadow-mask CRT by varying the intensity levels of the three electron beams. By turning off two of the three guns, we get only the color coming from the single activated phosphor (red, green, or blue). When all three dots are activated with equal beam intensities, we see a white color. Yellow is produced with equal intensities from the green and red dots only, magenta is produced with equal blue and red intensities, and cyan shows up when blue and green are activated equally. In an inexpensive system, each of the three electron beams might be restricted to either on or off, limiting displays to eight colors. More sophisticated systems can allow intermediate intensity levels to be set for the electron beams, so that several million colors are possible. Color graphics systems can be used with several types of CRT display devices. Some inexpensive home-computer systems and video games have been designed for use with a color TV set and a radio-frequency (RF) modulator. The purpose of the RF modulator is to simulate the signal from a broadcast TV station. This means that the color and intensity information of the picture must be combined and superimposed on the broadcast-frequency carrier signal that the TV requires as input. Then the circuitry in the TV takes this signal from the RF modulator, extracts the picture information, and paints it on the screen. As we might expect, this extra handling of the picture information by the RF modulator and TV circuitry decreases the quality of displayed images.
9
Computer Graphics Hardware
Composite monitors are adaptations of TV sets that allow bypass of the broadcast circuitry. These display devices still require that the picture information be combined, but no carrier signal is needed. Since picture information is combined into a composite signal and then separated by the monitor, the resulting picture quality is still not the best attainable. Color CRTs in graphics systems are designed as RGB monitors. These monitors use shadow-mask methods and take the intensity level for each electron gun (red, green, and blue) directly from the computer system without any intermediate processing. High-quality raster-graphics systems have 24 bits per pixel in the frame buffer, allowing 256 voltage settings for each electron gun and nearly 17 million color choices for each pixel. An RGB color system with 24 bits of storage per pixel is generally referred to as a full-color system or a true-color system.
Flat-Panel Displays Although most graphics monitors are still constructed with CRTs, other technologies are emerging that may soon replace CRT monitors. The term flat-panel display refers to a class of video devices that have reduced volume, weight, and power requirements compared to a CRT. A significant feature of flat-panel displays is that they are thinner than CRTs, and we can hang them on walls or wear them on our wrists. Since we can even write on some flat-panel displays, they are also available as pocket notepads. Some additional uses for flat-panel displays are as small TV monitors, calculator screens, pocket video-game screens, laptop computer screens, armrest movie-viewing stations on airlines, advertisement boards in elevators, and graphics displays in applications requiring rugged, portable monitors. We can separate flat-panel displays into two categories: emissive displays and nonemissive displays. The emissive displays (or emitters) are devices that convert electrical energy into light. Plasma panels, thin-film electroluminescent displays, and light-emitting diodes are examples of emissive displays. Flat CRTs have also been devised, in which electron beams are accelerated parallel to the screen and then deflected 90 onto the screen. But flat CRTs have not proved to be as successful as other emissive devices. Nonemissive displays (or nonemitters) use optical effects to convert sunlight or light from some other source into graphics patterns. The most important example of a nonemissive flat-panel display is a liquid-crystal device. Plasma panels, also called gas-discharge displays, are constructed by filling the region between two glass plates with a mixture of gases that usually includes neon. A series of vertical conducting ribbons is placed on one glass panel, and a set of horizontal conducting ribbons is built into the other glass panel (Fig. 10). Firing voltages applied to an intersecting pair of horizontal and vertical conductors cause the gas at the intersection of the two conductors to break down into a glowing plasma of electrons and ions. Picture definition is stored in a refresh buffer, and the firing voltages are applied to refresh the pixel positions (at the intersections of the conductors) 60 times per second. Alternating-current methods are used to provide faster application of the firing voltages and, thus, brighter displays. Separation between pixels is provided by the electric field of the conductors. One disadvantage of plasma panels has been that they were strictly monochromatic devices, but systems are now available with multicolor capabilities. Thin-film electroluminescent displays are similar in construction to plasma panels. The difference is that the region between the glass plates is filled with a phosphor, such as zinc sulfide doped with manganese, instead of a gas (Fig. 11). When a sufficiently high voltage is applied to a pair of crossing electrodes, the
10
Computer Graphics Hardware Conductors
Conductors
Gas Glass Plate
Glass Plate
Glass Plate
Glass Plate Phosphor
FIGURE 10
FIGURE 11
Basic design of a plasma-panel display device.
Basic design of a thin-film electroluminescent display device.
phosphor becomes a conductor in the area of the intersection of the two electrodes. Electrical energy is absorbed by the manganese atoms, which then release the energy as a spot of light similar to the glowing plasma effect in a plasma panel. Electroluminescent displays require more power than plasma panels, and good color displays are harder to achieve. A third type of emissive device is the light-emitting diode (LED). A matrix of diodes is arranged to form the pixel positions in the display, and picture definition is stored in a refresh buffer. As in scan-line refreshing of a CRT, information is read from the refresh buffer and converted to voltage levels that are applied to the diodes to produce the light patterns in the display. Liquid-crystal displays (LCDs) are commonly used in small systems, such as laptop computers and calculators (Fig. 12). These nonemissive devices produce a picture by passing polarized light from the surroundings or from an internal light source through a liquid-crystal material that can be aligned to either block or transmit the light. The term liquid crystal refers to the fact that these compounds have a crystalline arrangement of molecules, yet they flow like a liquid. Flat-panel displays commonly use nematic (threadlike) liquid-crystal compounds that tend to keep the long axes of the rod-shaped molecules aligned. A flat-panel display can then be constructed with a nematic liquid crystal, as demonstrated in Figure 13. Two glass plates, each containing a light polarizer that is aligned at a right angle to the other plate, sandwich the liquid-crystal material. Rows of horizontal, transparent conductors are built into one glass plate, and columns of vertical conductors are put into the other plate. The intersection of two conductors defines a pixel position. Normally, the molecules are aligned as shown in the “on state” of Figure 13. Polarized light passing through the material is twisted so that it will pass through the opposite polarizer. The light is then reflected back to the viewer. To turn off the pixel, we apply a voltage to the two intersecting conductors to align the molecules so that the light is not twisted. This type of flat-panel device is referred to as a passive-matrix LCD. Picture definitions are stored in a refresh buffer, and the screen is refreshed at the rate of 60 frames per second, as in the emissive
FIGURE 12
A handheld calculator with an LCD screen. (Courtesy of Texas Instruments.)
11
Computer Graphics Hardware Nematic Liquid Crystal
Transparent Conductor
Polarizer
Polarizer Transparent Conductor
On State Nematic Liquid Crystal
Transparent Conductor
FIGURE 13
The light-twisting, shutter effect used in the design of most LCD devices.
Polarizer
Polarizer Off State
Transparent Conductor
devices. Backlighting is also commonly applied using solid-state electronic devices, so that the system is not completely dependent on outside light sources. Colors can be displayed by using different materials or dyes and by placing a triad of color pixels at each screen location. Another method for constructing LCDs is to place a transistor at each pixel location, using thin-film transistor technology. The transistors are used to control the voltage at pixel locations and to prevent charge from gradually leaking out of the liquid-crystal cells. These devices are called active-matrix displays.
Three-Dimensional Viewing Devices Graphics monitors for the display of three-dimensional scenes have been devised using a technique that reflects a CRT image from a vibrating, flexible mirror (Fig. 14). As the varifocal mirror vibrates, it changes focal length. These vibrations are synchronized with the display of an object on a CRT so that each point on the object is reflected from the mirror into a spatial position corresponding to the distance of that point from a specified viewing location. This allows us to walk around an object or scene and view it from different sides. In addition to displaying three-dimensional images, these systems are often capable of displaying two-dimensional cross-sectional “slices” of objects selected at different depths, such as in medical applications to analyze data from ultrasonography and CAT scan devices, in geological applications to analyze topological and seismic data, in design applications involving solid objects, and in three-dimensional simulations of systems, such as molecules and terrain.
12
Computer Graphics Hardware Projected 3D Image Timing and Control System
Vibrating Flexible Mirror
CRT
Viewer FIGURE 14
FIGURE 15
Operation of a three-dimensional display system using a vibrating mirror that changes focal length to match the depths of points in a scene.
Glasses for viewing a stereoscopic scene in 3D. (Courtesy of XPAND, X6D USA Inc.)
Stereoscopic and Virtual-Reality Systems Another technique for representing a three-dimensional object is to display stereoscopic views of the object. This method does not produce true threedimensional images, but it does provide a three-dimensional effect by presenting a different view to each eye of an observer so that scenes do appear to have depth. To obtain a stereoscopic projection, we must obtain two views of a scene generated with viewing directions along the lines from the position of each eye (left and right) to the scene. We can construct the two views as computer-generated scenes with different viewing positions, or we can use a stereo camera pair to photograph an object or scene. When we simultaneously look at the left view with the left eye and the right view with the right eye, the two views merge into a single image and we perceive a scene with depth. One way to produce a stereoscopic effect on a raster system is to display each of the two views on alternate refresh cycles. The screen is viewed through glasses, with each lens designed to act as a rapidly alternating shutter that is synchronized to block out one of the views. One such design (Figure 15) uses liquid-crystal shutters and an infrared emitter that synchronizes the glasses with the views on the screen. Stereoscopic viewing is also a component in virtual-reality systems, where users can step into a scene and interact with the environment. A headset containing an optical system to generate the stereoscopic views can be used in conjunction with interactive input devices to locate and manipulate objects in the scene. A sensing system in the headset keeps track of the viewer’s position, so that the front and back of objects can be seen as the viewer “walks through” and interacts with the display. Another method for creating a virtual-reality environment
13
Computer Graphics Hardware
is to use projectors to generate a scene within an arrangement of walls, where a viewer interacts with a virtual display using stereoscopic glasses and data gloves (Section 4). Lower-cost, interactive virtual-reality environments can be set up using a graphics monitor, stereoscopic glasses, and a head-tracking device. The tracking device is placed above the video monitor and is used to record head movements, so that the viewing position for a scene can be changed as head position changes.
2 Raster-Scan Systems Interactive raster-graphics systems typically employ several processing units. In addition to the central processing unit (CPU), a special-purpose processor, called the video controller or display controller, is used to control the operation of the display device. Organization of a simple raster system is shown in Figure 16. Here, the frame buffer can be anywhere in the system memory, and the video controller accesses the frame buffer to refresh the screen. In addition to the video controller, more sophisticated raster systems employ other processors as coprocessors and accelerators to implement various graphics operations.
Video Controller Figure 17 shows a commonly used organization for raster systems. A fixed area of the system memory is reserved for the frame buffer, and the video controller is given direct access to the frame-buffer memory. Frame-buffer locations, and the corresponding screen positions, are referenced in Cartesian coordinates. In an application program, we use the commands
System Memory
CPU
Video Controller
Monitor
System Bus FIGURE 16
Architecture of a simple raster-graphics system.
I/O Devices
CPU
System Memory
Frame Buffer
System Bus FIGURE 17
Architecture of a raster system with a fixed portion of the system memory reserved for the frame buffer.
14
I/O Devices
Video Controller
Monitor
Computer Graphics Hardware
within a graphics software package to set coordinate positions for displayed objects relative to the origin of the Cartesian reference frame. Often, the coordinate origin is referenced at the lower-left corner of a screen display area by the software commands, although we can typically set the origin at any convenient location for a particular application. Figure 18 shows a two-dimensional Cartesian reference frame with the origin at the lower-left screen corner. The screen surface is then represented as the first quadrant of a two-dimensional system, with positive x values increasing from left to right and positive y values increasing from the bottom of the screen to the top. Pixel positions are then assigned integer x values that range from 0 to xmax across the screen, left to right, and integer y values that vary from 0 to ymax , bottom to top. However, hardware processes such as screen refreshing, as well as some software systems, reference the pixel positions from the top-left corner of the screen. In Figure 19, the basic refresh operations of the video controller are diagrammed. Two registers are used to store the coordinate values for the screen pixels. Initially, the x register is set to 0 and the y register is set to the value for the top scan line. The contents of the frame buffer at this pixel position are then retrieved and used to set the intensity of the CRT beam. Then the x register is incremented by 1, and the process is repeated for the next pixel on the top scan line. This procedure continues for each pixel along the top scan line. After the last pixel on the top scan line has been processed, the x register is reset to 0 and the y register is set to the value for the next scan line down from the top of the screen. Pixels along this scan line are then processed in turn, and the procedure is repeated for each successive scan line. After cycling through all pixels along the bottom scan line, the video controller resets the registers to the first pixel position on the top scan line and the refresh process starts over. Since the screen must be refreshed at a rate of at least 60 frames per second, the simple procedure illustrated in Figure 19 may not be accommodated by typical RAM chips if the cycle time is too slow. To speed up pixel processing,
Raster-Scan Generator
x Register
Horizontal and Vertical Deflection Voltages
y Register
y Memory Address
Pixel Register
Intensity
x Frame Buffer
FIGURE 18
FIGURE 19
A Cartesian reference frame with origin at the lower-left corner of a video monitor.
Basic video-controller refresh operations.
15
Computer Graphics Hardware
video controllers can retrieve multiple pixel values from the refresh buffer on each pass. The multiple pixel intensities are then stored in a separate register and used to control the CRT beam intensity for a group of adjacent pixels. When that group of pixels has been processed, the next block of pixel values is retrieved from the frame buffer. A video controller can be designed to perform a number of other operations. For various applications, the video controller can retrieve pixel values from different memory areas on different refresh cycles. In some systems, for example, multiple frame buffers are often provided so that one buffer can be used for refreshing while pixel values are being loaded into the other buffers. Then the current refresh buffer can switch roles with one of the other buffers. This provides a fast mechanism for generating real-time animations, for example, since different views of moving objects can be successively loaded into a buffer without interrupting a refresh cycle. Another video-controller task is the transformation of blocks of pixels, so that screen areas can be enlarged, reduced, or moved from one location to another during the refresh cycles. In addition, the video controller often contains a lookup table, so that pixel values in the frame buffer are used to access the lookup table instead of controlling the CRT beam intensity directly. This provides a fast method for changing screen intensity values. Finally, some systems are designed to allow the video controller to mix the frame-buffer image with an input image from a television camera or other input device.
Raster-Scan Display Processor Figure 20 shows one way to organize the components of a raster system that contains a separate display processor, sometimes referred to as a graphics controller or a display coprocessor. The purpose of the display processor is to free the CPU from the graphics chores. In addition to the system memory, a separate display-processor memory area can be provided. A major task of the display processor is digitizing a picture definition given in an application program into a set of pixel values for storage in the frame buffer. This digitization process is called scan conversion. Graphics commands specifying straight lines and other geometric objects are scan converted into a set of discrete points, corresponding to screen pixel positions. Scan converting a straight-line segment, for example, means that we have to locate the pixel positions closest to the line path and store the color for each position in the frame
DisplayProcessor Memory
CPU
Display Processor
System Bus FIGURE 20
Architecture of a raster-graphics system with a display processor.
16
I/O Devices
Frame Buffer
System Memory
Video Controller
Monitor
Computer Graphics Hardware
buffer. Similar methods are used for scan converting other objects in a picture definition. Characters can be defined with rectangular pixel grids, as in Figure 21, or they can be defined with outline shapes, as in Figure 22. The array size for character grids can vary from about 5 by 7 to 9 by 12 or more for higher-quality displays. A character grid is displayed by superimposing the rectangular grid pattern into the frame buffer at a specified coordinate position. For characters that are defined as outlines, the shapes are scan-converted into the frame buffer by locating the pixel positions closest to the outline. Display processors are also designed to perform a number of additional operations. These functions include generating various line styles (dashed, dotted, or solid), displaying color areas, and applying transformations to the objects in a scene. Also, display processors are typically designed to interface with interactive input devices, such as a mouse. In an effort to reduce memory requirements in raster systems, methods have been devised for organizing the frame buffer as a linked list and encoding the color information. One organization scheme is to store each scan line as a set of number pairs. The first number in each pair can be a reference to a color value, and the second number can specify the number of adjacent pixels on the scan line that are to be displayed in that color. This technique, called run-length encoding, can result in a considerable saving in storage space if a picture is to be constructed mostly with long runs of a single color each. A similar approach can be taken when pixel colors change linearly. Another approach is to encode the raster as a set of rectangular areas (cell encoding). The disadvantages of encoding runs are that color changes are difficult to record and storage requirements increase as the lengths of the runs decrease. In addition, it is difficult for the display controller to process the raster when many short runs are involved. Moreover, the size of the frame buffer is no longer a major concern, because of sharp declines in memory costs. Nevertheless, encoding methods can be useful in the digital storage and transmission of picture information.
FIGURE 21
A character defined as a rectangular grid of pixel positions.
FIGURE 22
A character defined as an outline shape.
3 Graphics Workstations and Viewing Systems Most graphics monitors today operate as raster-scan displays, and both CRT and flat-panel systems are in common use. Graphics workstations range from small general-purpose computer systems to multi-monitor facilities, often with ultra-large viewing screens. For a personal computer, screen resolutions vary from about 640 by 480 to 1280 by 1024, and diagonal screen lengths measure from 12 inches to over 21 inches. Most general-purpose systems now have considerable color capabilities, and many are full-color systems. For a desktop workstation specifically designed for graphics applications, the screen resolution can vary from 1280 by 1024 to about 1600 by 1200, with a typical screen diagonal of 18 inches or more. Commercial workstations can also be obtained with a variety of devices for specific applications. High-definition graphics systems, with resolutions up to 2560 by 2048, are commonly used in medical imaging, air-traffic control, simulation, and CAD. Many high-end graphics workstations also include large viewing screens, often with specialized features. Multi-panel display screens are used in a variety of applications that require “wall-sized” viewing areas. These systems are designed for presenting graphics displays at meetings, conferences, conventions, trade shows, retail stores, museums, and passenger terminals. A multi-panel display can be used to show a large
17
Computer Graphics Hardware
view of a single scene or several individual images. Each panel in the system displays one section of the overall picture. Color Plate 7 shows a 360 paneled viewing system in the NASA control-tower simulator, which is used for training and for testing ways to solve air-traffic and runway problems at airports. Large graphics displays can also be presented on curved viewing screens. A large, curved-screen system can be useful for viewing by a group of people studying a particular graphics application, such as the example in Color Plate 8. A control center, featuring a battery of standard monitors, allows an operator to view sections of the large display and to control the audio, video, lighting, and projection systems using a touch-screen menu. The system projectors provide a seamless, multichannel display that includes edge blending, distortion correction, and color balancing. And a surround-sound system is used to provide the audio environment.
4 Input Devices Graphics workstations can make use of various devices for data input. Most systems have a keyboard and one or more additional devices specifically designed for interactive input. These include a mouse, trackball, spaceball, and joystick. Some other input devices used in particular applications are digitizers, dials, button boxes, data gloves, touch panels, image scanners, and voice systems.
Keyboards, Button Boxes, and Dials An alphanumeric keyboard on a graphics system is used primarily as a device for entering text strings, issuing certain commands, and selecting menu options. The keyboard is an efficient device for inputting such nongraphic data as picture labels associated with a graphics display. Keyboards can also be provided with features to facilitate entry of screen coordinates, menu selections, or graphics functions. Cursor-control keys and function keys are common features on generalpurpose keyboards. Function keys allow users to select frequently accessed operations with a single keystroke, and cursor-control keys are convenient for selecting a displayed object or a location by positioning the screen cursor. A keyboard can also contain other types of cursor-positioning devices, such as a trackball or joystick, along with a numeric keypad for fast entry of numeric data. In addition to these features, some keyboards have an ergonomic design that provides adjustments for relieving operator fatigue. For specialized tasks, input to a graphics application may come from a set of buttons, dials, or switches that select data values or customized graphics operations. Buttons and switches are often used to input predefined functions, and dials are common devices for entering scalar values. Numerical values within some defined range are selected for input with dial rotations. A potentiometer is used to measure dial rotation, which is then converted to the corresponding numerical value.
Mouse Devices A mouse is a small handheld unit that is usually moved around on a flat surface to position the screen cursor. One or more buttons on the top of the mouse provide a mechanism for communicating selection information to the computer; wheels or rollers on the bottom of the mouse can be used to record the amount and direction of movement. Another method for detecting mouse motion is with an optical sensor. For some optical systems, the mouse is moved over a special mouse pad that has a grid of horizontal and vertical lines. The optical sensor detects
18
Computer Graphics Hardware
FIGURE 23
A wireless computer mouse designed with many user-programmable controls. (Courtesy of Logitech® )
movement across the lines in the grid. Other optical mouse systems can operate on any surface. Some mouse systems are cordless, communicating with computer processors using digital radio technology. Since a mouse can be picked up and put down at another position without change in cursor movement, it is used for making relative changes in the position of the screen cursor. One, two, three, or four buttons are included on the top of the mouse for signaling the execution of operations, such as recording cursor position or invoking a function. Most general-purpose graphics systems now include a mouse and a keyboard as the primary input devices. Additional features can be included in the basic mouse design to increase the number of allowable input parameters and the functionality of the mouse. The Logitech G700 wireless gaming mouse in Figure 23 features 13 separatelyprogrammable control inputs. Each input can be configured to perform a wide range of actions, from traditional single-click inputs to macro operations containing multiple keystrongs, mouse events, and pre-programmed delays between operations. The laser-based optical sensor can be configured to control the degree of sensitivity to motion, allowing the mouse to be used in situations requiring different levels of control over cursor movement. In addition, the mouse can hold up to five different configuration profiles to allow the configuration to be switched easily when changing applications.
Trackballs and Spaceballs A trackball is a ball device that can be rotated with the fingers or palm of the hand to produce screen-cursor movement. Potentiometers, connected to the ball, measure the amount and direction of rotation. Laptop keyboards are often equipped with a trackball to eliminate the extra space required by a mouse. A trackball also can be mounted on other devices, or it can be obtained as a separate add-on unit that contains two or three control buttons. An extension of the two-dimensional trackball concept is the spaceball, which provides six degrees of freedom. Unlike the trackball, a spaceball does not actually move. Strain gauges measure the amount of pressure applied to the spaceball to provide input for spatial positioning and orientation as the ball is pushed or pulled in various directions. Spaceballs are used for three-dimensional positioning and selection operations in virtual-reality systems, modeling, animation, CAD, and other applications.
Joysticks Another positioning device is the joystick, which consists of a small, vertical lever (called the stick) mounted on a base. We use the joystick to steer the screen cursor around. Most joysticks select screen positions with actual stick movement; others
19
Computer Graphics Hardware
respond to pressure on the stick. Some joysticks are mounted on a keyboard, and some are designed as stand-alone units. The distance that the stick is moved in any direction from its center position corresponds to the relative screen-cursor movement in that direction. Potentiometers mounted at the base of the joystick measure the amount of movement, and springs return the stick to the center position when it is released. One or more buttons can be programmed to act as input switches to signal actions that are to be executed once a screen position has been selected. In another type of movable joystick, the stick is used to activate switches that cause the screen cursor to move at a constant rate in the direction selected. Eight switches, arranged in a circle, are sometimes provided so that the stick can select any one of eight directions for cursor movement. Pressure-sensitive joysticks, also called isometric joysticks, have a non-movable stick. A push or pull on the stick is measured with strain gauges and converted to movement of the screen cursor in the direction of the applied pressure.
Data Gloves A data glove is a device that fits over the user’s hand and can be used to grasp a “virtual object.” The glove is constructed with a series of sensors that detect hand and finger motions. Electromagnetic coupling between transmitting antennas and receiving antennas are used to provide information about the position and orientation of the hand. The transmitting and receiving antennas can each be structured as a set of three mutually perpendicular coils, forming a threedimensional Cartesian reference system. Input from the glove is used to position or manipulate objects in a virtual scene. A two-dimensional projection of the scene can be viewed on a video monitor, or a three-dimensional projection can be viewed with a headset.
Digitizers A common device for drawing, painting, or interactively selecting positions is a digitizer. These devices can be designed to input coordinate values in either a two-dimensional or a three-dimensional space. In engineering or architectural applications, a digitizer is often used to scan a drawing or object and to input a set of discrete coordinate positions. The input positions are then joined with straight-line segments to generate an approximation of a curve or surface shape. One type of digitizer is the graphics tablet (also referred to as a data tablet), which is used to input two-dimensional coordinates by activating a hand cursor or stylus at selected positions on a flat surface. A hand cursor contains crosshairs for sighting positions, while a stylus is a pencil-shaped device that is pointed at positions on the tablet. The tablet size varies from 12 by 12 inches for desktop models to 44 by 60 inches or larger for floor models. Graphics tablets provide a highly accurate method for selecting coordinate positions, with an accuracy that varies from about 0.2 mm on desktop models to about 0.05 mm or less on larger models. Many graphics tablets are constructed with a rectangular grid of wires embedded in the tablet surface. Electromagnetic pulses are generated in sequence along the wires, and an electric signal is induced in a wire coil in an activated stylus or hand-cursor to record a tablet position. Depending on the technology, signal strength, coded pulses, or phase shifts can be used to determine the position on the tablet. An acoustic (or sonic) tablet uses sound waves to detect a stylus position. Either strip microphones or point microphones can be employed to detect the sound
20
Computer Graphics Hardware
emitted by an electrical spark from a stylus tip. The position of the stylus is calculated by timing the arrival of the generated sound at the different microphone positions. An advantage of two-dimensional acoustic tablets is that the microphones can be placed on any surface to form the “tablet” work area. For example, the microphones could be placed on a book page while a figure on that page is digitized. Three-dimensional digitizers use sonic or electromagnetic transmissions to record positions. One electromagnetic transmission method is similar to that employed in the data glove: A coupling between the transmitter and receiver is used to compute the location of a stylus as it moves over an object surface. As the points are selected on a nonmetallic object, a wire-frame outline of the surface is displayed on the computer screen. Once the surface outline is constructed, it can be rendered using lighting effects to produce a realistic display of the object.
Image Scanners Drawings, graphs, photographs, or text can be stored for computer processing with an image scanner by passing an optical scanning mechanism over the information to be stored. The gradations of grayscale or color are then recorded and stored in an array. Once we have the internal representation of a picture, we can apply transformations to rotate, scale, or crop the picture to a particular screen area. We can also apply various image-processing methods to modify the array representation of the picture. For scanned text input, various editing operations can be performed on the stored documents. Scanners are available in a variety of sizes and capabilities, including small handheld models, drum scanners, and flatbed scanners.
Touch Panels As the name implies, touch panels allow displayed objects or screen positions to be selected with the touch of a finger. A typical application of touch panels is for the selection of processing options that are represented as a menu of graphical icons. Some monitors are designed with touch screens. Other systems can be adapted for touch input by fitting a transparent device containing a touch-sensing mechanism over the video monitor screen. Touch input can be recorded using optical, electrical, or acoustical methods. Optical touch panels employ a line of infrared light-emitting diodes (LEDs) along one vertical edge and along one horizontal edge of the frame. Light detectors are placed along the opposite vertical and horizontal edges. These detectors are used to record which beams are interrupted when the panel is touched. The two crossing beams that are interrupted identify the horizontal and vertical coordinates of the screen position selected. Positions can be selected with an accuracy of about 1/4 inch. With closely spaced LEDs, it is possible to break two horizontal or two vertical beams simultaneously. In this case, an average position between the two interrupted beams is recorded. The LEDs operate at infrared frequencies so that the light is not visible to a user. An electrical touch panel is constructed with two transparent plates separated by a small distance. One of the plates is coated with a conducting material, and the other plate is coated with a resistive material. When the outer plate is touched, it is forced into contact with the inner plate. This contact creates a voltage drop across the resistive plate that is converted to the coordinate values of the selected screen position. In acoustical touch panels, high-frequency sound waves are generated in horizontal and vertical directions across a glass plate. Touching the screen causes
21
Computer Graphics Hardware
part of each wave to be reflected from the finger to the emitters. The screen position at the point of contact is calculated from a measurement of the time interval between the transmission of each wave and its reflection to the emitter.
Light Pens Light pens are pencil-shaped devices are used to select screen positions by detecting the light coming from points on the CRT screen. They are sensitive to the short burst of light emitted from the phosphor coating at the instant the electron beam strikes a particular point. Other light sources, such as the background light in the room, are usually not detected by a light pen. An activated light pen, pointed at a spot on the screen as the electron beam lights up that spot, generates an electrical pulse that causes the coordinate position of the electron beam to be recorded. As with cursor-positioning devices, recorded light-pen coordinates can be used to position an object or to select a processing option. Although light pens are still with us, they are not as popular as they once were because they have several disadvantages compared to other input devices that have been developed. For example, when a light pen is pointed at the screen, part of the screen image is obscured by the hand and pen. In addition, prolonged use of the light pen can cause arm fatigue, and light pens require special implementations for some applications because they cannot detect positions within black areas. To be able to select positions in any screen area with a light pen, we must have some nonzero light intensity emitted from each pixel within that area. In addition, light pens sometimes give false readings due to background lighting in a room.
Voice Systems Speech recognizers are used with some graphics workstations as input devices for voice commands. The voice system input can be used to initiate graphics operations or to enter data. These systems operate by matching an input against a predefined dictionary of words and phrases. A dictionary is set up by speaking the command words several times. The system then analyzes each word and establishes a dictionary of word frequency patterns, along with the corresponding functions that are to be performed. Later, when a voice command is given, the system searches the dictionary for a frequency-pattern match. A separate dictionary is needed for each operator using the system. Input for a voice system is typically spoken into a microphone mounted on a headset; the microphone is designed to minimize input of background sounds. Voice systems have some advantage over other input devices because the attention of the operator need not switch from one device to another to enter a command.
5 Hard-Copy Devices We can obtain hard-copy output for our images in several formats. For presentations or archiving, we can send image files to devices or service bureaus that will produce overhead transparencies, 35mm slides, or film. Also, we can put our pictures on paper by directing graphics output to a printer or plotter. The quality of the pictures obtained from an output device depends on dot size and the number of dots per inch, or lines per inch, that can be displayed. To produce smooth patterns, higher-quality printers shift dot positions so that adjacent dots overlap.
22
Computer Graphics Hardware
FIGURE 24
A picture generated on a dot-matrix printer, illustrating how the density of dot patterns can be varied to produce light and dark areas. (Courtesy of Apple Computer, Inc.)
Printers produce output by either impact or nonimpact methods. Impact printers press formed character faces against an inked ribbon onto the paper. A line printer is an example of an impact device, with the typefaces mounted on bands, chains, drums, or wheels. Nonimpact printers and plotters use laser techniques, ink-jet sprays, electrostatic methods, and electrothermal methods to get images onto paper. Character impact printers often have a dot-matrix print head containing a rectangular array of protruding wire pins, with the number of pins varying depending upon the quality of the printer. Individual characters or graphics patterns are obtained by retracting certain pins so that the remaining pins form the pattern to be printed. Figure 24 shows a picture printed on a dot-matrix printer. In a laser device, a laser beam creates a charge distribution on a rotating drum coated with a photoelectric material, such as selenium. Toner is applied to the drum and then transferred to paper. Ink-jet methods produce output by squirting ink in horizontal rows across a roll of paper wrapped on a drum. The electrically charged ink stream is deflected by an electric field to produce dot-matrix patterns. An electrostatic device places a negative charge on the paper, one complete row at a time across the sheet. Then the paper is exposed to a positively charged toner. This causes the toner to be attracted to the negatively charged areas, where it adheres to produce the specified output. Another output technology is the electrothermal printer. With these systems, heat is applied to a dot-matrix print head to output patterns on heat-sensitive paper. We can get limited color output on some impact printers by using differentcolored ribbons. Nonimpact devices use various techniques to combine three different color pigments (cyan, magenta, and yellow) to produce a range of color patterns. Laser and electrostatic devices deposit the three pigments on separate passes; ink-jet methods shoot the three colors simultaneously on a single pass along each print line. Drafting layouts and other drawings are typically generated with ink-jet or pen plotters. A pen plotter has one or more pens mounted on a carriage, or crossbar, that spans a sheet of paper. Pens with varying colors and widths are used to produce a variety of shadings and line styles. Wet-ink, ballpoint, and felt-tip pens are all possible choices for use with a pen plotter. Plotter paper can lie flat or it can be rolled onto a drum or belt. Crossbars can be either movable or stationary, while the pen moves back and forth along the bar. The paper is held in position using clamps, a vacuum, or an electrostatic charge.
23
Computer Graphics Hardware
6 Graphics Networks So far, we have mainly considered graphics applications on an isolated system with a single user. However, multiuser environments and computer networks are now common elements in many graphics applications. Various resources, such as processors, printers, plotters, and data files, can be distributed on a network and shared by multiple users. A graphics monitor on a network is generally referred to as a graphics server, or simply a server. Often, the monitor includes standard input devices such as a keyboard and a mouse or trackball. In that case, the system can provide input, as well as being an output server. The computer on the network that is executing a graphics application program is called the client, and the output of the program is displayed on a server. A workstation that includes processors, as well as a monitor and input devices, can function as both a server and a client. When operating on a network, a client computer transmits the instructions for displaying a picture to the monitor (server). Typically, this is accomplished by collecting the instructions into packets before transmission instead of sending the individual graphics instructions one at a time over the network. Thus, graphics software packages often contain commands that affect packet transmission, as well as the commands for creating pictures.
7 Graphics on the Internet A great deal of graphics development is now done on the global collection of computer networks known as the Internet. Computers on the Internet communicate using transmission control protocol/internet protocol (TCP/IP). In addition, the World Wide Web provides a hypertext system that allows users to locate and view documents that can contain text, graphics, and audio. Resources, such as graphics files, are identified by a uniform (or universal) resource locator (URL). Each URL contains two parts: (1) the protocol for transferring the document, and (2) the server that contains the document and, optionally, the location (directory) on the server. For example, the URL http://www.siggraph.org/ indicates a document that is to be transferred with the hypertext transfer protocol (http) and that the server is www.siggraph.org, which is the home page of the Special Interest Group in Graphics (SIGGRAPH) of the Association for Computing Machinery. Another common type of URL begins with ftp://. This identifies a site that accepts file transfer protocol (FTP) connections, through which programs or other files can be downloaded. Documents on the Internet can be constructed with the Hypertext Markup Language (HTML). The development of HTML provided a simple method for describing a document containing text, graphics, and references (hyperlinks) to other documents. Although resources could be made available using HTML and URL addressing, it was difficult originally to find information on the Internet. Subsequently, the National Center for Supercomputing Applications (NCSA) developed a “browser” called Mosaic, which made it easier for users to search for Web resources. The Mosaic browser later evolved into the browser called Netscape Navigator. In turn, Netscape Navigator inspired the creation of the Mozilla family of browsers, whose most well-known member is, perhaps, Firefox. HTML provides a simple method for developing graphics on the Internet, but it has limited capabilities. Therefore, other languages have been developed for Internet graphics applications.
24
Computer Graphics Hardware
8 Summary In this chapter, we surveyed the major hardware and software features of computer-graphics systems. Hardware components include video monitors, hardcopy output devices, various kinds of input devices, and components for interacting with virtual environments. The predominant graphics display device is the raster refresh monitor, based on television technology. A raster system uses a frame buffer to store the color value for each screen position (pixel). Pictures are then painted onto the screen by retrieving this information from the frame buffer (also called a refresh buffer) as the electron beam in the CRT sweeps across each scan line from top to bottom. Older vector displays construct pictures by drawing straight-line segments between specified endpoint positions. Picture information is then stored as a set of linedrawing instructions. Many other video display devices are available. In particular, flat-panel display technology is developing at a rapid rate, and these devices are now used in a variety of systems, including both desktop and laptop computers. Plasma panels and liquid-crystal devices are two examples of flat-panel displays. Other display technologies include three-dimensional and stereoscopic-viewing systems. Virtual-reality systems can include either a stereoscopic headset or a standard video monitor. For graphical input, we have a range of devices to choose from. Keyboards, button boxes, and dials are used to input text, data values, or programming options. The most popular “pointing” device is the mouse, but trackballs, spaceballs, joysticks, cursor-control keys, and thumbwheels are also used to position the screen cursor. In virtual-reality environments, data gloves are commonly used. Other input devices are image scanners, digitizers, touch panels, light pens, and voice systems. Hardcopy devices for graphics workstations include standard printers and plotters, in addition to devices for producing slides, transparencies, and film output. Printers produce hardcopy output using dot-matrix, laser, ink-jet, electrostatic, or electrothermal methods. Graphs and charts can be produced with an ink-pen plotter or with a combination printer-plotter device.
REFERENCES A general treatment of electronic displays is available in Tannas (1985) and in Sherr (1993). Flat-panel devices are discussed in Depp and Howard (1993). Additional information on raster-graphics architecture can be found in Foley et al. (1990). Three-dimensional and stereoscopic displays are discussed in Johnson (1982) and in Grotch (1983). Head-mounted displays and virtual-reality environments are discussed in Chung et al. (1989).
4
EXERCISES
5
1
2 3
List the operating characteristics for the following display technologies: raster refresh systems, vector refresh systems, plasma panels, and LCDs. List some applications appropriate for each of the display technologies in the previous question. Determine the resolution (pixels per centimeter) in the x and y directions for the video monitor in use
6
on your system. Determine the aspect ratio, and explain how relative proportions of objects can be maintained on your system. Consider three different raster systems with resolutions of 800 by 600, 1280 by 960, and 1680 by 1050. What size frame buffer (in bytes) is needed for each of these systems to store 16 bits per pixel? How much storage is required for each system if 32 bits per pixel are to be stored? Suppose an RGB raster system is to be designed using an 8 inch by 10 inch screen with a resolution of 100 pixels per inch in each direction. If we want to store 6 bits per pixel in the frame buffer, how much storage (in bytes) do we need for the frame buffer? How long would it take to load an 800 by 600 frame buffer with 16 bits per pixel, if 105 bits can be transferred per second? How long would it take to
25
Computer Graphics Hardware
7
8
9
10
11
12
13
14
15
16
17 18
26
load a 32-bit-per-pixel frame buffer with a resolution of 1680 by 1050 using this same transfer rate? Suppose we have a computer with 32 bits per word and a transfer rate of 1 mip (one million instructions per second). How long would it take to fill the frame buffer of a 300 dpi (dot per inch) laser printer with a page size of 8.5 inches by 11 inches? Consider two raster systems with resolutions of 800 by 600 and 1680 by 1050. How many pixels could be accessed per second in each of these systems by a display controller that refreshes the screen at a rate of 60 frames per second? What is the access time per pixel in each system? Suppose we have a video monitor with a display area that measures 12 inches across and 9.6 inches high. If the resolution is 1280 by 1024 and the aspect ratio is 1, what is the diameter of each screen point? How much time is spent scanning across each row of pixels during screen refresh on a raster system with a resolution of 1680 by 1050 and a refresh rate of 30 frames per second? Consider a noninterlaced raster monitor with a resolution of n by m (m scan lines and n pixels per scan line), a refresh rate of r frames per second, a horizontal retrace time of thori z , and a vertical retrace time of t er t . What is the fraction of the total refresh time per frame spent in retrace of the electron beam? What is the fraction of the total refresh time per frame spent in retrace of the electron beam for a non-interlaced raster system with a resolution of 1680 by 1050, a refresh rate of 65 Hz, a horizontal retrace time of 4 microseconds, and a vertical retrace time of 400 microseconds? Assuming that a certain full-color (24 bits per pixel) RGB raster system has a 1024 by 1024 frame buffer, how many distinct color choices (intensity levels) would we have available? How many different colors could we display at any one time? Compare the advantages and disadvantages of a three-dimensional monitor using a varifocal mirror to those of a stereoscopic system. List the different input and output components that are typically used with virtual-reality systems. Also, explain how users interact with a virtual scene displayed with different output devices, such as two-dimensional and stereoscopic monitors. Explain how virtual-reality systems can be used in design applications. What are some other applications for virtual-reality systems? List some applications for large-screen displays. Explain the differences between a general graphics system designed for a programmer and one
designed for a specific application, such as architectural design.
IN MORE DEPTH 1
2
In this course, you will design and build a graphics application incrementally. You should have a basic understanding of the types of applications for which computer graphics are used. Try to formulate a few ideas about one or more particular applications you may be interested in developing over the course of your studies. Keep in mind that you will be asked to incorporate techniques covered in this text, as well as to show your understanding of alternative methods for implementing those concepts. As such, the application should be simple enough that you can realistically implement it in a reasonable amount of time, but complex enough to afford the inclusion of each of the relevant concepts in the text. One obvious example is a video game of some sort in which the user interacts with a virtual environment that is initially displayed in two dimensions and later in three dimensions. Some concepts to consider would be two- and three-dimensional objects of different forms (some simple, some moderately complex with curved surfaces, etc.), complex shading of object surfaces, various lighting techniques, and animation of some sort. Write a report with at least three to four ideas that you will choose to implement you acquire more knowledge of the course material. Note that one type of application may be more suited to demonstrate a given concept than another. Find details about the graphical capabilities of the graphics controller and the display device in your system by looking up their specifications. Record the following information:
What is the maximum resolution your graphics controller is capable of rendering? What is the maximum resolution of your display device? What type of hardware does your graphics controller contain? What is the GPU’s clock speed? How much of its own graphics memory does it have? If you have a relatively new system, it is unlikely that you will be pushing the envelope of your graphics hardware in your application development for this text. However, knowing the capabilities of your graphics system will provide you with a sense of how much it will be able to handle.
Computer Graphics Hardware Color P lates
Color Plate 7
The 360◦ viewing screen in the NASA airport control-tower simulator, called the FutureFlight Central Facility. (Courtesy of Silicon Graphics, Inc. and NASA. © 2003 SGI. All rights reserved.)
Color Plate 8
A geophysical visualization presented on a 25-foot semicircular screen, which provides a 160◦ horizontal and 40◦ vertical field of view. (Courtesy of Silicon Graphics, Inc., the Landmark Graphics Corporation, and Trimension Systems. © 2003 SGI. All rights reserved.)
From Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
27
This page intentionally left blank
Computer Graphics Software
1
Coordinate Representations
2
Graphics Functions
3
Software Standards
4
Other Graphics Packages
5
Introduction to OpenGL
6
Summary
T
here are two broad classifications for computer-graphics software: special-purpose packages and general programming packages. Special-purpose packages are designed for nonprogrammers who want to generate pictures, graphs, or charts in some application area without worrying about the graphics procedures that might be needed to produce such displays. The interface to a special-purpose package is typically a set of menus that allow users to communicate with the programs in their own terms. Examples of such applications include artists' painting programs and various architectural, business, medical, and engineering CAD systems. By contrast, a general programming package provides a library of graphics functions that can be used in a programming language such as C, C++, Java, or Fortran. Basic functions in a typical graphics library include those for specifying picture components (straight lines, polygons, spheres, and other objects), setting color values, selecting views of a scene, and applying rotations or other transformations. Some examples of general graphics programming packages are
From Chapter 3 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
29
Computer Graphics Software
GL (Graphics Library), OpenGL, VRML (Virtual-Reality Modeling Language), Java 2D, and Java 3D. A set of graphics functions is often called a computer-graphics application programming interface (CG API) because the library provides a software interface between a programming language (such as C++) and the hardware. So when we write an application program in C++, the graphics routines allow us to construct and display a picture on an output device.
1 Coordinate Representations To generate a picture using a programming package, we first need to give the geometric descriptions of the objects that are to be displayed. These descriptions determine the locations and shapes of the objects. For example, a box is specified by the positions of its corners (vertices), and a sphere is defined by its center position and radius. With few exceptions, general graphics packages require geometric descriptions to be specified in a standard, right-handed, Cartesian-coordinate reference frame. If coordinate values for a picture are given in some other reference frame (spherical, hyperbolic, etc.), they must be converted to Cartesian coordinates before they can be input to the graphics package. Some packages that are designed for specialized applications may allow use of other coordinate frames that are appropriate for those applications. In general, several different Cartesian reference frames are used in the process of constructing and displaying a scene. First, we can define the shapes of individual objects, such as trees or furniture, within a separate reference frame for each object. These reference frames are called modeling coordinates, or sometimes local coordinates or master coordinates. Once the individual object shapes have been specified, we can construct (“model”) a scene by placing the objects into appropriate locations within a scene reference frame called world coordinates. This step involves the transformation of the individual modeling-coordinate frames to specified positions and orientations within the world-coordinate frame. As an example, we could construct a bicycle by defining each of its parts (wheels, frame, seat, handlebars, gears, chain, pedals) in a separate modelingcoordinate frame. Then, the component parts are fitted together in world coordinates. If both bicycle wheels are the same size, we need to describe only one wheel in a local-coordinate frame. Then the wheel description is fitted into the world-coordinate bicycle description in two places. For scenes that are not too complicated, object components can be set up directly within the overall worldcoordinate object structure, bypassing the modeling-coordinate and modelingtransformation steps. Geometric descriptions in modeling coordinates and world coordinates can be given in any convenient floating-point or integer values, without regard for the constraints of a particular output device. For some scenes, we might want to specify object geometries in fractions of a foot, while for other applications we might want to use millimeters, or kilometers, or light-years. After all parts of a scene have been specified, the overall world-coordinate description is processed through various routines onto one or more output-device reference frames for display. This process is called the viewing pipeline. Worldcoordinate positions are first converted to viewing coordinates corresponding to the view we want of a scene, based on the position and orientation of a hypothetical camera. Then object locations are transformed to a two-dimensional (2D) projection of the scene, which corresponds to what we will see on the output device. The scene is then stored in normalized coordinates, where each coordinate value is in the range from −1 to 1 or in the range from 0 to 1, depending on the system.
30
Computer Graphics Software
Viewing and Projection Coordinates 1 1
Modeling Coordinates
World Coordinates
Normalized Coordinates
Video Monitor
1
Plotter
Other Output Device Coordinates FIGURE 1
The transformation sequence from modeling coordinates to device coordinates for a three-dimensional scene. Object shapes can be individually defined in modeling-coordinate reference systems. Then the shapes are positioned within the world-coordinate scene. Next, world-coordinate specifications are transformed through the viewing pipeline to viewing and projection coordinates and then to normalized coordinates. At the final step, individual device drivers transfer the normalized-coordinate representation of the scene to the output devices for display.
Normalized coordinates are also referred to as normalized device coordinates, since using this representation makes a graphics package independent of the coordinate range for any specific output device. We also need to identify visible surfaces and eliminate picture parts outside the bounds for the view we want to show on the display device. Finally, the picture is scan-converted into the refresh buffer of a raster system for display. The coordinate systems for display devices are generally called device coordinates, or screen coordinates in the case of a video monitor. Often, both normalized coordinates and screen coordinates are specified in a lefthanded coordinate reference frame so that increasing positive distances from the xy plane (the screen, or viewing plane) can be interpreted as being farther from the viewing position. Figure 1 briefly illustrates the sequence of coordinate transformations from modeling coordinates to device coordinates for a display that is to contain a view of two three-dimensional (3D) objects. An initial modeling-coordinate position (xmc , ymc , zmc ) in this illustration is transferred to world coordinates, then to viewing and projection coordinates, then to left-handed normalized coordinates, and finally to a device-coordinate position (xdc , ydc ) with the sequence: (xmc , ymc , zmc ) → (xwc , ywc , zwc ) → (xvc , yvc , zvc ) → (x pc , ypc , z pc ) → (xnc , ync , znc ) → (xdc , ydc ) Device coordinates (xdc , ydc ) are integers within the range (0, 0) to (xmax , ymax ) for a particular output device. In addition to the two-dimensional positions (xdc , ydc ) on the viewing surface, depth information for each device-coordinate position is stored for use in various visibility and surface-processing algorithms.
2 Graphics Functions A general-purpose graphics package provides users with a variety of functions for creating and manipulating pictures. These routines can be broadly classified
31
Computer Graphics Software
according to whether they deal with graphics output, input, attributes, transformations, viewing, subdividing pictures, or general control. The basic building blocks for pictures are referred to as graphics output primitives. They include character strings and geometric entities, such as points, straight lines, curved lines, filled color areas (usually polygons), and shapes defined with arrays of color points. In addition, some graphics packages provide functions for displaying more complex shapes such as spheres, cones, and cylinders. Routines for generating output primitives provide the basic tools for constructing pictures. Attributes are properties of the output primitives; that is, an attribute describes how a particular primitive is to be displayed. This includes color specifications, line styles, text styles, and area-filling patterns. We can change the size, position, or orientation of an object within a scene using geometric transformations. Some graphics packages provide an additional set of functions for performing modeling transformations, which are used to construct a scene where individual object descriptions are given in local coordinates. Such packages usually provide a mechanism for describing complex objects (such as an electrical circuit or a bicycle) with a tree (hierarchical) structure. Other packages simply provide the geometric-transformation routines and leave modeling details to the programmer. After a scene has been constructed, using the routines for specifying the object shapes and their attributes, a graphics package projects a view of the picture onto an output device. Viewing transformations are used to select a view of the scene, the type of projection to be used, and the location on a video monitor where the view is to be displayed. Other routines are available for managing the screen display area by specifying its position, size, and structure. For three-dimensional scenes, visible objects are identified and the lighting conditions are applied. Interactive graphics applications use various kinds of input devices, including a mouse, a tablet, and a joystick. Input functions are used to control and process the data flow from these interactive devices. Some graphics packages also provide routines for subdividing a picture description into a named set of component parts. And other routines may be available for manipulating these picture components in various ways. Finally, a graphics package contains a number of housekeeping tasks, such as clearing a screen display area to a selected color and initializing parameters. We can lump the functions for carrying out these chores under the heading control operations.
3 Software Standards The primary goal of standardized graphics software is portability. When packages are designed with standard graphics functions, software can be moved easily from one hardware system to another and used in different implementations and applications. Without standards, programs designed for one hardware system often cannot be transferred to another system without extensive rewriting of the programs. International and national standards-planning organizations in many countries have cooperated in an effort to develop a generally accepted standard for computer graphics. After considerable effort, this work on standards led to the development of the Graphical Kernel System (GKS) in 1984. This system was adopted as the first graphics software standard by the International Standards Organization (ISO) and by various national standards organizations, including
32
Computer Graphics Software
the American National Standards Institute (ANSI). Although GKS was originally designed as a two-dimensional graphics package, a three-dimensional GKS extension was soon developed. The second software standard to be developed and approved by the standards organizations was Programmer’s Hierarchical Interactive Graphics System (PHIGS), which is an extension of GKS. Increased capabilities for hierarchical object modeling, color specifications, surface rendering, and picture manipulations are provided in PHIGS. Subsequently, an extension of PHIGS, called PHIGS+, was developed to provide three-dimensional surfacerendering capabilities not available in PHIGS. As the GKS and PHIGS packages were being developed, the graphics workstations from Silicon Graphics, Inc. (SGI), became increasingly popular. These workstations came with a set of routines called GL (Graphics Library), which very soon became a widely used package in the graphics community. Thus, GL became a de facto graphics standard. The GL routines were designed for fast, real-time rendering, and soon this package was being extended to other hardware systems. As a result, OpenGL was developed as a hardwareindependent version of GL in the early 1990s. This graphics package is now maintained and updated by the OpenGL Architecture Review Board, which is a consortium of representatives from many graphics companies and organizations. The OpenGL library is specifically designed for efficient processing of three-dimensional applications, but it can also handle two-dimensional scene descriptions as a special case of three dimensions where all the z coordinate values are 0. Graphics functions in any package are typically defined as a set of specifications independent of any programming language. A language binding is then defined for a particular high-level programming language. This binding gives the syntax for accessing the various graphics functions from that language. Each language binding is defined to make best use of the corresponding language capabilities and to handle various syntax issues, such as data types, parameter passing, and errors. Specifications for implementing a graphics package in a particular language are set by the ISO. The OpenGL bindings for the C and C++ languages are the same. Other OpenGL bindings are also available, such as those for Java and Python. Later in this book, we use the C/C++ binding for OpenGL as a framework for discussing basic graphics concepts and the design and application of graphics packages. Example programs in C++ illustrate applications of OpenGL and the general algorithms for implementing graphics functions.
4 Other Graphics Packages Many other computer-graphics programming libraries have been developed. Some provide general graphics routines, and some are aimed at specific applications or particular aspects of computer graphics, such as animation, virtual reality, or graphics on the Internet. A package called Open Inventor furnishes a set of object-oriented routines for describing a scene that is to be displayed with calls to OpenGL. The Virtual-Reality Modeling Language (VRML), which began as a subset of Open Inventor, allows us to set up three-dimensional models of virtual worlds on the Internet. We can also construct pictures on the Web using graphics libraries developed for the Java language. With Java 2D, we can create two-dimensional scenes within Java applets, for example; or we can produce three-dimensional web displays with Java 3D. With the RenderMan Interface from the Pixar Corporation, we can generate scenes
33
Computer Graphics Software
using a variety of lighting models. Finally, graphics libraries are often provided in other types of systems, such as Mathematica, MatLab, and Maple.
5 Introduction to OpenGL A basic library of functions is provided in OpenGL for specifying graphics primitives, attributes, geometric transformations, viewing transformations, and many other operations. As we noted in the last section, OpenGL is designed to be hardware independent, so many operations, such as input and output routines, are not included in the basic library. However, input and output routines and many additional functions are available in auxiliary libraries that have been developed for OpenGL programs.
Basic OpenGL Syntax Function names in the OpenGL basic library (also called the OpenGL core library) are prefixed with gl, and each component word within a function name has its first letter capitalized. The following examples illustrate this naming convention: glBegin,
glClear,
glCopyPixels,
glPolygonMode
Certain functions require that one (or more) of their arguments be assigned a symbolic constant specifying, for instance, a parameter name, a value for a parameter, or a particular mode. All such constants begin with the uppercase letters GL. In addition, component words within a constant name are written in capital letters, and the underscore ( ) is used as a separator between all component words in the name. The following are a few examples of the several hundred symbolic constants available for use with OpenGL functions: GL_2D, GL_RGB, GL_CCW, GL_POLYGON, GL_AMBIENT_AND_DIFFUSE The OpenGL functions also expect specific data types. For example, an OpenGL function parameter might expect a value that is specified as a 32-bit integer. But the size of an integer specification can be different on different machines. To indicate a specific data type, OpenGL uses special built-in, data-type names, such as GLbyte,
GLshort,
GLint,
GLfloat,
GLdouble,
GLboolean
Each data-type name begins with the capital letters GL, and the remainder of the name is a standard data-type designation written in lowercase letters. Some arguments of OpenGL functions can be assigned values using an array that lists a set of data values. This is an option for specifying a list of values as a pointer to an array, rather than specifying each element of the list explicitly as a parameter argument. A typical example of the use of this option is in specifying xyz coordinate values.
Related Libraries In addition to the OpenGL basic (core) library, there are a number of associated libraries for handling special operations. The OpenGL Utility (GLU) provides routines for setting up viewing and projection matrices, describing complex objects with line and polygon approximations, displaying quadrics and B-splines
34
Computer Graphics Software
using linear approximations, processing the surface-rendering operations, and other complex tasks. Every OpenGL implementation includes the GLU library, and all GLU function names start with the prefix glu. There is also an objectoriented toolkit based on OpenGL, called Open Inventor, which provides routines and predefined object shapes for interactive three-dimensional applications. This toolkit is written in C++. To create a graphics display using OpenGL, we first need to set up a display window on our video screen. This is simply the rectangular area of the screen in which our picture will be displayed. We cannot create the display window directly with the basic OpenGL functions since this library contains only deviceindependent graphics functions, and window-management operations depend on the computer we are using. However, there are several window-system libraries that support OpenGL functions for a variety of machines. The OpenGL Extension to the X Window System (GLX) provides a set of routines that are prefixed with the letters glX. Apple systems can use the Apple GL (AGL) interface for window-management operations. Function names for this library are prefixed with agl. For Microsoft Windows systems, the WGL routines provide a Windows-to-OpenGL interface. These routines are prefixed with the letters wgl. The Presentation Manager to OpenGL (PGL) is an interface for the IBM OS/2, which uses the prefix pgl for the library routines. The OpenGL Utility Toolkit (GLUT) provides a library of functions for interacting with any screen-windowing system. The GLUT library functions are prefixed with glut, and this library also contains methods for describing and rendering quadric curves and surfaces. Since GLUT is an interface to other device-specific window systems, we can use it so that our programs will be device-independent. Information regarding the latest version of GLUT and download procedures for the source code are available at the following web site: http://www.opengl.org/resources/libraries/glut/
Header Files In all of our graphics programs, we will need to include the header file for the OpenGL core library. For most applications we will also need GLU, and on many systems we will need to include the header file for the window system. For instance, with Microsoft Windows, the header file that accesses the WGL routines is windows.h. This header file must be listed before the OpenGL and GLU header files because it contains macros needed by the Microsoft Windows version of the OpenGL libraries. So the source file in this case would begin with #include #include #include However, if we use GLUT to handle the window-managing operations, we do not need to include gl.h and glu.h because GLUT ensures that these will be included correctly. Thus, we can replace the header files for OpenGL and GLU with #include (We could include gl.h and glu.h as well, but doing so would be redundant and could affect program portability.) On some systems, the header files for OpenGL and GLUT routines are found in different places in the filesystem. For instance, on Apple OS X systems, the header file inclusion statement would be #include
35
Computer Graphics Software
In addition, we will often need to include header files that are required by the C++ code. For example, #include #include #include With the ISO/ANSI standard for C++, these header files are called cstdio, cstdlib, and cmath.
Display-Window Management Using GLUT To get started, we can consider a simplified, minimal number of operations for displaying a picture. Since we are using the OpenGL Utility Toolkit, our first step is to initialize GLUT. This initialization function could also process any commandline arguments, but we will not need to use these parameters for our first example programs. We perform the GLUT initialization with the statement glutInit (&argc, argv); Next, we can state that a display window is to be created on the screen with a given caption for the title bar. This is accomplished with the function glutCreateWindow ("An Example OpenGL Program"); where the single argument for this function can be any character string that we want to use for the display-window title. Then we need to specify what the display window is to contain. For this, we create a picture using OpenGL functions and pass the picture definition to the GLUT routine glutDisplayFunc, which assigns our picture to the display window. As an example, suppose we have the OpenGL code for describing a line segment in a procedure called lineSegment. Then the following function call passes the line-segment description to the display window: glutDisplayFunc (lineSegment); But the display window is not yet on the screen. We need one more GLUT function to complete the window-processing operations. After execution of the following statement, all display windows that we have created, including their graphic content, are now activated: glutMainLoop ( ); This function must be the last one in our program. It displays the initial graphics and puts the program into an infinite loop that checks for input from devices such as a mouse or keyboard. Our first example will not be interactive, so the program will just continue to display our picture until we close the display window. In later chapters, we consider how we can modify our OpenGL programs to handle interactive input. Although the display window that we created will be in some default location and size, we can set these parameters using additional GLUT functions. We use the glutInitWindowPosition function to give an initial location for the upperleft corner of the display window. This position is specified in integer screen
36
Computer Graphics Software
Vid 50
eo s
100
An E
xam
Display Window
ple O
penG
cree
L Pr
n
ogra
m
300 400
FIGURE 2
A 400 by 300 display window at position (50, 100) relative to the top-left corner of the video display.
coordinates, whose origin is at the upper-left corner of the screen. For instance, the following statement specifies that the upper-left corner of the display window should be placed 50 pixels to the right of the left edge of the screen and 100 pixels down from the top edge of the screen: glutInitWindowPosition (50, 100); Similarly, the glutInitWindowSize function is used to set the initial pixel width and height of the display window. Thus, we specify a display window with an initial width of 400 pixels and a height of 300 pixels (Fig. 2) with the statement glutInitWindowSize (400, 300); After the display window is on the screen, we can reposition and resize it. We can also set a number of other options for the display window, such as buffering and a choice of color modes, with the glutInitDisplayMode function. Arguments for this routine are assigned symbolic GLUT constants. For example, the following command specifies that a single refresh buffer is to be used for the display window and that we want to use the color mode which uses red, green, and blue (RGB) components to select color values: glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); The values of the constants passed to this function are combined using a logical or operation. Actually, single buffering and RGB color mode are the default options. But we will use the function now as a reminder that these are the options that are set for our display. Later, we discuss color modes in more detail, as well as other display options, such as double buffering for animation applications and selecting parameters for viewing three-dimensional scenes.
A Complete OpenGL Program There are still a few more tasks to perform before we have all the parts that we need for a complete program. For the display window, we can choose a background color. And we need to construct a procedure that contains the appropriate OpenGL functions for the picture that we want to display.
37
Computer Graphics Software
Using RGB color values, we set the background color for the display window to be white, as in Figure 2, with the OpenGL function: glClearColor (1.0, 1.0, 1.0, 0.0); The first three arguments in this function set the red, green, and blue component colors to the value 1.0, giving us a white background color for the display window. If, instead of 1.0, we set each of the component colors to 0.0, we would get a black background. And if all three of these components were set to the same intermediate value between 0.0 and 1.0, we would get some shade of gray. The fourth parameter in the glClearColor function is called the alpha value for the specified color. One use for the alpha value is as a “blending” parameter. When we activate the OpenGL blending operations, alpha values can be used to determine the resulting color for two overlapping objects. An alpha value of 0.0 indicates a totally transparent object, and an alpha value of 1.0 indicates an opaque object. Blending operations will not be used for a while, so the value of alpha is irrelevant to our early example programs. For now, we will simply set alpha to 0.0. Although the glClearColor command assigns a color to the display window, it does not put the display window on the screen. To get the assigned window color displayed, we need to invoke the following OpenGL function: glClear (GL_COLOR_BUFFER_BIT); The argument GL COLOR BUFFER BIT is an OpenGL symbolic constant specifying that it is the bit values in the color buffer (refresh buffer) that are to be set to the values indicated in the glClearColor function. (OpenGL has several different kinds of buffers that can be manipulated. In addition to setting the background color for the display window, we can choose a variety of color schemes for the objects we want to display in a scene. For our initial programming example, we will simply set the object color to be a dark green: glColor3f (0.0, 0.4, 0.2); The suffix 3f on the glColor function indicates that we are specifying the three RGB color components using floating-point (f) values. This function requires that the values be in the range from 0.0 to 1.0, and we have set red = 0.0, green = 0.4, and blue = 0.2. For our first program, we simply display a two-dimensional line segment. To do this, we need to tell OpenGL how we want to “project” our picture onto the display window because generating a two-dimensional picture is treated by OpenGL as a special case of three-dimensional viewing. So, although we only want to produce a very simple two-dimensional line, OpenGL processes our picture through the full three-dimensional viewing operations. We can set the projection type (mode) and other viewing parameters that we need with the following two functions: glMatrixMode (GL_PROJECTION); gluOrtho2D (0.0, 200.0, 0.0, 150.0); This specifies that an orthogonal projection is to be used to map the contents of a two-dimensional rectangular area of world coordinates to the screen, and that the x-coordinate values within this rectangle range from 0.0 to 200.0 with y-coordinate values ranging from 0.0 to 150.0. Whatever objects we define
38
Computer Graphics Software
within this world-coordinate rectangle will be shown within the display window. Anything outside this coordinate range will not be displayed. Therefore, the GLU function gluOrtho2D defines the coordinate reference frame within the display window to be (0.0, 0.0) at the lower-left corner of the display window and (200.0, 150.0) at the upper-right window corner. Since we are only describing a two-dimensional object, the orthogonal projection has no other effect than to “paste” our picture into the display window that we defined earlier. For now, we will use a world-coordinate rectangle with the same aspect ratio as the display window, so that there is no distortion of our picture. Later, we will consider how we can maintain an aspect ratio that does not depend upon the display-window specification. Finally, we need to call the appropriate OpenGL routines to create our line segment. The following code defines a two-dimensional, straight-line segment with integer, Cartesian endpoint coordinates (180, 15) and (10, 145). glBegin (GL_LINES); glVertex2i (180, 15); glVertex2i (10, 145); glEnd ( ); Now we are ready to put all the pieces together. The following OpenGL program is organized into three functions. We place all initializations and related one-time parameter settings in function init. Our geometric description of the “picture” that we want to display is in function lineSegment, which is the function that will be referenced by the GLUT function glutDisplayFunc. And the main function contains the GLUT functions for setting up the display window and getting our line segment onto the screen. Figure 3 shows the display window and line segment generated by this program.
FIGURE 3
The display window and line segment produced by the example program.
39
Computer Graphics Software
#include
// (or others, depending on the system in use)
void init (void) { glClearColor (1.0, 1.0, 1.0, 0.0);
// Set display-window color to white.
glMatrixMode (GL_PROJECTION); // Set projection parameters. gluOrtho2D (0.0, 200.0, 0.0, 150.0); } void lineSegment (void) { glClear (GL_COLOR_BUFFER_BIT); glColor3f (0.0, 0.4, 0.2); glBegin (GL_LINES); glVertex2i (180, 15); glVertex2i (10, 145); glEnd ( ); glFlush ( );
// Clear display window. // Set line segment color to green. // Specify line-segment geometry.
// Process all OpenGL routines as quickly as possible.
} void main (int argc, char** argv) { glutInit (&argc, argv); // Initialize GLUT. glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); // Set display mode. glutInitWindowPosition (50, 100); // Set top-left display-window position. glutInitWindowSize (400, 300); // Set display-window width and height. glutCreateWindow ("An Example OpenGL Program"); // Create display window. init ( ); glutDisplayFunc (lineSegment); glutMainLoop ( );
// Execute initialization procedure. // Send graphics to display window. // Display everything and wait.
}
At the end of procedure lineSegment is a function, glFlush, that we have not yet discussed. This is simply a routine to force execution of our OpenGL functions, which are stored by computer systems in buffers in different locations, depending on how OpenGL is implemented. On a busy network, for example, there could be delays in processing some buffers. But the call to glFlush forces all such buffers to be emptied and the OpenGL functions to be processed. The procedure lineSegment that we set up to describe our picture is referred to as a display callback function. And this procedure is described as being “registered” by glutDisplayFunc as the routine to invoke whenever the display window might need to be redisplayed. This can occur, for example, if the display window is moved. In subsequent chapters, we will look at other types of callback functions and the associated GLUT routines that we use to register them. In general, OpenGL programs are organized as a set of callback functions that are to be invoked when certain actions occur.
40
Computer Graphics Software
T A B L E
1
OpenGL Error Codes Symbolic Constant
Meaning
GL GL GL GL GL GL
A GLenum argument was out of range A numeric argument was out of range An operation was illegal in the current OpenGL state The command would cause a stack overflow The command would cause a stack underflow There was not enough memory to execute a command
INVALID ENUM INVALID VALUE INVALID OPERATION STACK OVERFLOW STACK UNDERFLOW OUT OF MEMORY
Error Handling in OpenGL Many of the features of the OpenGL API are extremely powerful. Unfortunately, they can also be very confusing, particularly for programmers who are just learning to use them. Although we would like to believe otherwise, it’s quite possible (if not likely) that our OpenGL programs may contain errors. Accordingly, it is worth spending a few moments discussing error handling in OpenGL programs. The OpenGL and GLU libraries have a relatively simple method of recording errors. When OpenGL detects an error in a call to a base library routine or a GLU routine, it records an error code internally, and the routine which caused the error is ignored (so that it has no effect on either the internal OpenGL state or the contents of the frame buffer). However, OpenGL only records one error code at a time. Once an error occurs, no other error code will be recorded until your program explicitly queries the OpenGL error state: GLenum code; code = glGetError (); This call returns the current error code and clears the internal error flag. If the returned value is equal to the OpenGL symbolic constant GL NO ERROR, everything is fine. Any other return value indicates that a problem has occurred. The base OpenGL library defines a number of symbolic constants which represent different error conditions; the ones which occur most often are listed in Table 1. The GLU library also defines a number of error codes, but most of them have almost meaningless names such as GLU NURBS ERROR1, GLU NURBS ERROR2, and so on. (These are not actually meaningless names, but their meaning won’t become clear until we discuss more advanced concepts in later chapters.) These symbolic constants are helpful, but printing them out directly is not particularly informative. Fortunately, the GLU library contains a function that returns a descriptive string for each of the GLU and GL errors. To use it, we first retrieve the current error code, and then pass it as a parameter to this function. The return value can be printed out using, for example, the C standard library fprintf function: #include GLenum code;
41
Computer Graphics Software
const GLubyte *string; code = glGetError (); string = gluErrorString (code); fprintf( stderr, "OpenGL error: %s\n", string ); The value returned by gluErrorString points to a string located inside the GLU library. It is not a dynamically-allocated string, so it must not be deallocated by our program. It also must not be modified by our program (hence the const modifier on the declaration of string). We can easily encapsulate these function calls into a general error-reporting function in our program. The following function will retrieve the current error code, print the descriptive error string, and return the code to the calling routine: #include GLenum errorCheck () { GLenum code; const GLubyte *string; code = glGetError (); if (code != GL_NO_ERROR) { string = gluErrorString (code); fprintf( stderr, "OpenGL error: %s\n", string ); } return code; }
We encourage you to use a function like this in the OpenGL programs you develop. It is usually a good idea to check for errors at least once in your display callback routine, and more frequently if you are using OpenGL features you have never used before, or if you are seeing unusual or unexpected results in the image your program produces.
6 Summary In this chapter, we surveyed the major features of graphics software systems. Some software systems, such as CAD packages and paint programs, are designed for particular applications. Other software systems provide a library of general graphics routines that can be used within a programming language such as C++ to generate pictures for any application. Standard graphics-programming packages developed and approved through ISO and ANSI are GKS, 3D GKS, PHIGS, and PHIGS+. Other packages that have evolved into standards are GL and OpenGL. Many other graphics libraries are available for use in a programming language, including Open Inventor, VRML, RenderMan, Java 2D, and Java 3D. Other systems, such as Mathematica, MatLab, and Maple, often provide a set of graphics-programming functions. Normally, graphics-programming packages require coordinate specifications to be given in Cartesian reference frames. Each object for a scene can be defined
42
Computer Graphics Software
in a separate modeling Cartesian-coordinate system, which is then mapped to a world-coordinate location to construct the scene. From world coordinates, three-dimensional objects are projected to a two-dimensional plane, converted to normalized device coordinates, and then transformed to the final displaydevice coordinates. The transformations from modeling coordinates to normalized device coordinates are independent of particular output devices that might be used in an application. Device drivers are then used to convert normalized coordinates to integer device coordinates. Functions that are available in graphics programming packages can be divided into the following categories: graphics output primitives, attributes, geometric and modeling transformations, viewing transformations, input functions, picture-structuring operations, and control operations. The OpenGL system consists of a device-independent set of routines (called the core library), the utility library (GLU), and the utility toolkit (GLUT). In the auxiliary set of routines provided by GLU, functions are available for generating complex objects, for parameter specifications in two-dimensional viewing applications, for dealing with surface-rendering operations, and for performing some other supporting tasks. In GLUT, we have an extensive set of functions for managing display windows, interacting with screen-window systems, and for generating some three-dimensional shapes. We can use GLUT to interface with any computer system, or we can use GLX, Apple GL, WGL, or another systemspecific software package.
REFERENCES Standard sources for information on OpenGL are Woo, et al. (1999), Shreiner (2000), and Shreiner (2010). Open Inventor is explored in Wernecke (1994). McCarthy and Descartes (1998) can be consulted for discussions of VRML. Presentations on RenderMan can be found in Upstill (1989) and Apodaca and Gritz (2000). Examples of graphics programming in Java 2D are given in Knudsen (1999), Hardy (2000), and Horstmann and Cornell (2001). Graphics programming using Java 3D is explored in Sowizral, et al. (2000); Palmer (2001); Selman (2002); and Walsh and Gehringer (2002). For information on PHIGS and PHIGS+, see Howard, et al. (1991); Hopgood and Duce (1991); Gaskins (1992); and Blake (1993). Information on the two-dimensional GKS standard and on the evolution of graphics standards is available in Hopgood, et al. (1983). An additional reference for GKS is Enderle, et al. (1984).
EXERCISES 1
2
What command could we use to set the color of an OpenGL display window to dark gray? What command would we use to set the color of the display window to white? List the statements needed to set up an OpenGL display window whose lower-left corner is at pixel position (75, 200) and has a window width of 200 pixels and a height of 150 pixels.
3
4
5 6 7
List the OpenGL statements needed to draw a line segment from the upper-right corner of a display window of width 150 and height 250 to the lowerleft corner of the window. Suppose we have written a function called rectangle whose purpose is to draw a rectangle in the middle of a given display window. What OpenGL statement would be needed to make sure the rectangle is drawn correctly each time the display window needs to be repainted? Explain what is meant by the term “OpenGL display callback function”. Explain the difference between modeling coordinates and world coordinates. Explain what normalized coordinates are and why they are useful for graphics software packages.
IN MORE DEPTH 1
Develope a few application ideas and identify and describe the objects that you will be manipulating graphically in the application. Explain desirable physical and visual properties of these objects in detail so that you can concretely define what attributes you will develop for them in later exercises. Consider the following:
43
Computer Graphics Software
• Are the objects complex in shape or texture? • Can the objects be approximated fairly well by simpler shapes?
• Are some of the objects comprised of fairly complex curved surfaces?
• Do the objects lend themselves to being represented two-dimensionally at first, even if unrealistically? • Can the objects be represented as a hierarchically organized group of smaller objects, or parts? • Will the objects change position and orientation dynamically in response to user input? • Will the lighting conditions change in the application, and do the object appearances change under these varying conditions? If all of the objects in your application yield a “no” answer to any of these questions, consider modifying the application or your design and implementation approach so that such properties are present in at least one of the objects. Alternatively, come up with two or more applications so that at least one of them contains objects that satisfy each of
44
2
these properties. Draw a rough specification, using visual diagrams and/or text, listing the properties of the objects in your application. You will use and modify this specification as you continue to develop your application. You will be developing your application using the OpenGL API. In order to do this, you will need to have a working programming environment in which you can edit, compile and run OpenGL programs. For this exercise, you should go through the necessary steps to get this environment up and running on your machine. Once you have the environment set up, create a new project with the source code given in the example in this chapter that draws a single line in the display window. Make sure you can compile and run this program. Information about getting an OpenGL environment up and running on your system can be found at http://www.opengl.org/ and http://www.opengl.org/wiki/Getting started. Make sure that you obtain all of the necessary libraries, including the GLU and GLUT libraries. The examples in this text are all written in C++. Follow your instructor’s guidelines on the use of other languages (provided OpenGL bindings can be obtained for them).
Graphics Output Primitives
1
Coordinate Reference Frames
2
Specifying a Two-Dimensional World-Coordinate Reference Frame in OpenGL
3
OpenGL Point Functions
4
OpenGL Line Functions
5
OpenGL Curve Functions
6
Fill-Area Primitives
7
Polygon Fill Areas
8
OpenGL Polygon Fill-Area Functions
9
OpenGL Vertex Arrays
10
Pixel-Array Primitives
11
OpenGL Pixel-Array Functions
12
Character Primitives
13
OpenGL Character Functions
14
Picture Partitioning
15
OpenGL Display Lists
16
OpenGL Display-Window Reshape Function
17
Summary
A
general software package for graphics applications, sometimes referred to as a computer-graphics application programming interface (CG API), provides a library of functions that we can use within a programming language such as C++ to create pictures. The set of library functions can be subdivided into several categories. One of the first things we need to do when creating a picture is to describe the component parts of the scene to be displayed. Picture components could be trees and terrain, furniture and walls, storefronts and street scenes, automobiles and billboards, atoms and molecules, or stars and galaxies. For each type of scene, we need to describe the structure of the individual objects and their coordinate locations within the scene. Those functions in a graphics package that we use to describe the various picture components are called the graphics output primitives, or simply primitives. The output primitives describing the geometry of objects are typically referred to as geometric primitives. Point positions and straight-line segments are the simplest
From Chapter 4 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
45
Graphics Output Primitives
geometric primitives. Additional geometric primitives that can be available in a graphics package include circles and other conic sections, quadric surfaces, spline curves and surfaces, and polygon color areas. Also, most graphics systems provide some functions for displaying character strings. After the geometry of a picture has been specified within a selected coordinate reference frame, the output primitives are projected to a twodimensional plane, corresponding to the display area of an output device, and scan converted into integer pixel positions within the frame buffer. In this chapter, we introduce the output primitives available in OpenGL, and discuss their use.
1 Coordinate Reference Frames To describe a picture, we first decide upon a convenient Cartesian coordinate system, called the world-coordinate reference frame, which could be either twodimensional or three-dimensional. We then describe the objects in our picture by giving their geometric specifications in terms of positions in world coordinates. For instance, we define a straight-line segment with two endpoint positions, and a polygon is specified with a set of positions for its vertices. These coordinate positions are stored in the scene description along with other information about the objects, such as their color and their coordinate extents, which are the minimum and maximum x, y, and z values for each object. A set of coordinate extents is also described as a bounding box for an object. For a two-dimensional figure, the coordinate extents are sometimes called an object’s bounding rectangle. Objects are then displayed by passing the scene information to the viewing routines, which identify visible surfaces and ultimately map the objects to positions on the video monitor. The scan-conversion process stores information about the scene, such as color values, at the appropriate locations in the frame buffer, and the objects in the scene are displayed on the output device.
Screen Coordinates y
5 4 3 2 1 0 0
1
2
3
4
5
x
FIGURE 1
Pixel positions referenced with respect to the lower-left corner of a screen area.
46
Locations on a video monitor are referenced in integer screen coordinates, which correspond to the pixel positions in the frame buffer. Pixel coordinate values give the scan line number (the y value) and the column number (the x value along a scan line). Hardware processes, such as screen refreshing, typically address pixel positions with respect to the top-left corner of the screen. Scan lines are then referenced from 0, at the top of the screen, to some integer value, ymax , at the bottom of the screen, and pixel positions along each scan line are numbered from 0 to xmax , left to right. However, with software commands, we can set up any convenient reference frame for screen positions. For example, we could specify an integer range for screen positions with the coordinate origin at the lower-left of a screen area (Figure 1), or we could use noninteger Cartesian values for a picture description. The coordinate values we use to describe the geometry of a scene are then converted by the viewing routines to integer pixel positions within the frame buffer. Scan-line algorithms for the graphics primitives use the defining coordinate descriptions to determine the locations of pixels that are to be displayed. For
Graphics Output Primitives
example, given the endpoint coordinates for a line segment, a display algorithm must calculate the positions for those pixels that lie along the line path between the endpoints. Since a pixel position occupies a finite area of the screen, the finite size of a pixel must be taken into account by the implementation algorithms. For the present, we assume that each integer screen position references the center of a pixel area. Once pixel positions have been identified for an object, the appropriate color values must be stored in the frame buffer. For this purpose, we will assume that we have available a low-level procedure of the form setPixel (x, y);
This procedure stores the current color setting into the frame buffer at integer position (x, y), relative to the selected position of the screen-coordinate origin. We sometimes also will want to be able to retrieve the current frame-buffer setting for a pixel location. So we will assume that we have the following low-level function for obtaining a frame-buffer color value: getPixel (x, y, color);
In this function, parameter color receives an integer value corresponding to the combined red, green, and blue (RGB) bit codes stored for the specified pixel at position (x, y). Although we need only specify color values at (x, y) positions for a twodimensional picture, additional screen-coordinate information is needed for three-dimensional scenes. In this case, screen coordinates are stored as threedimensional values, where the third dimension references the depth of object positions relative to a viewing position. For a two-dimensional scene, all depth values are 0.
Absolute and Relative Coordinate Specifications So far, the coordinate references that we have discussed are stated as absolute coordinate values. This means that the values specified are the actual positions within the coordinate system in use. However, some graphics packages also allow positions to be specified using relative coordinates. This method is useful for various graphics applications, such as producing drawings with pen plotters, artist’s drawing and painting systems, and graphics packages for publishing and printing applications. Taking this approach, we can specify a coordinate position as an offset from the last position that was referenced (called the current position). For example, if location (3, 8) is the last position that has been referenced in an application program, a relative coordinate specification of (2, −1) corresponds to an absolute position of (5, 7). An additional function is then used to set a current position before any coordinates for primitive functions are specified. To describe an object, such as a series of connected line segments, we then need to give only a sequence of relative coordinates (offsets), once a starting position has been established. Options can be provided in a graphics system to allow the specification of locations using either relative or absolute coordinates. In the following discussions, we will assume that all coordinates are specified as absolute references unless explicitly stated otherwise.
47
Graphics Output Primitives
2 Specifying A Two-Dimensional World-Coordinate Reference Frame in OpenGL The gluOrtho2D command is a function we can use to set up any twodimensional Cartesian reference frame. The arguments for this function are the four values defining the x and y coordinate limits for the picture we want to display. Since the gluOrtho2D function specifies an orthogonal projection, we need also to be sure that the coordinate values are placed in the OpenGL projection matrix. In addition, we could assign the identity matrix as the projection matrix before defining the world-coordinate range. This would ensure that the coordinate values were not accumulated with any values we may have previously set for the projection matrix. Thus, for our initial two-dimensional examples, we can define the coordinate frame for the screen display window with the following statements: glMatrixMode (GL_PROJECTION); glLoadIdentity ( ); gluOrtho2D (xmin, xmax, ymin, ymax);
The display window will then be referenced by coordinates (xmin, ymin) at the lower-left corner and by coordinates (xmax, ymax) at the upper-right corner, as shown in Figure 2. We can then designate one or more graphics primitives for display using the coordinate reference specified in the gluOrtho2D statement. If the coordinate extents of a primitive are within the coordinate range of the display window, all of the primitive will be displayed. Otherwise, only those parts of the primitive within the display-window coordinate limits will be shown. Also, when we set up the geometry describing a picture, all positions for the OpenGL primitives must be given in absolute coordinates, with respect to the reference frame defined in the gluOrtho2D function.
Vide
o Sc
reen
Display Window ymax
ymin FIGURE 2
World-coordinate limits for a display window, as specified in the glOrtho2D function.
48
xmin xmax
Graphics Output Primitives
3 OpenGL Point Functions To specify the geometry of a point, we simply give a coordinate position in the world reference frame. Then this coordinate position, along with other geometric descriptions we may have in our scene, is passed to the viewing routines. Unless we specify other attribute values, OpenGL primitives are displayed with a default size and color. The default color for primitives is white, and the default point size is equal to the size of a single screen pixel. We use the following OpenGL function to state the coordinate values for a single position: glVertex* ( );
where the asterisk (*) indicates that suffix codes are required for this function. These suffix codes are used to identify the spatial dimension, the numerical data type to be used for the coordinate values, and a possible vector form for the coordinate specification. Calls to glVertex functions must be placed between a glBegin function and a glEnd function. The argument of the glBegin function is used to identify the kind of output primitive that is to be displayed, and glEnd takes no arguments. For point plotting, the argument of the glBegin function is the symbolic constant GL POINTS. Thus, the form for an OpenGL specification of a point position is glBegin (GL_POINTS); glVertex* ( ); glEnd ( );
Although the term vertex strictly refers to a “corner” point of a polygon, the point of intersection of the sides of an angle, a point of intersection of an ellipse with its major axis, or other similar coordinate positions on geometric structures, the glVertex function is used in OpenGL to specify coordinates for any point position. In this way, a single function is used for point, line, and polygon specifications—and, most often, polygon patches are used to describe the objects in a scene. Coordinate positions in OpenGL can be given in two, three, or four dimensions. We use a suffix value of 2, 3, or 4 on the glVertex function to indicate the dimensionality of a coordinate position. A four-dimensional specification indicates a homogeneous-coordinate representation, where the homogeneous parameter h (the fourth coordinate) is a scaling factor for the Cartesian-coordinate values. Homogeneous-coordinate representations are useful for expressing transformation operations in matrix form. Because OpenGL treats two-dimensions as a special case of three dimensions, any (x, y) coordinate specification is equivalent to a three-dimensional specification of (x, y, 0). Furthermore, OpenGL represents vertices internally in four dimensions, so each of these specifications are equivalent to the four-dimensional specification (x, y, 0, 1). We also need to state which data type is to be used for the numericalvalue specifications of the coordinates. This is accomplished with a second suffix code on the glVertex function. Suffix codes for specifying a numerical data type are i (integer), s (short), f (float), and d (double). Finally, the coordinate values can be listed explicitly in the glVertex function, or a single argument can be used that references a coordinate position as an array. If we use an array specification for a coordinate position, we need to append v (for “vector”) as a third suffix code.
49
Graphics Output Primitives y 200 150 100 50 FIGURE 3
Display of three point positions generated with glBegin (GL POINTS).
50
100
150
x
In the following example, three equally spaced points are plotted along a twodimensional, straight-line path with a slope of 2 (see Figure 3). Coordinates are given as integer pairs: glBegin (GL_POINTS); glVertex2i (50, 100); glVertex2i (75, 150); glVertex2i (100, 200); glEnd ( );
Alternatively, we could specify the coordinate values for the preceding points in arrays such as int point1 [ ] = {50, 100}; int point2 [ ] = {75, 150}; int point3 [ ] = {100, 200};
and call the OpenGL functions for plotting the three points as glBegin (GL_POINTS); glVertex2iv (point1); glVertex2iv (point2); glVertex2iv (point3); glEnd ( );
In addition, here is an example of specifying two point positions in a threedimensional world reference frame. In this case, we give the coordinates as explicit floating-point values: glBegin (GL_POINTS); glVertex3f (-78.05, 909.72, 14.60); glVertex3f (261.91, -5200.67, 188.33); glEnd ( );
We could also define a C++ class or structure (struct) for specifying point positions in various dimensions. For example, class wcPt2D { public: GLfloat x, y; };
50
Graphics Output Primitives
Using this class definition, we could specify a two-dimensional, world-coordinate point position with the statements wcPt2D pointPos; pointPos.x = 120.75; pointPos.y = 45.30; glBegin (GL_POINTS); glVertex2f (pointPos.x, pointPos.y); glEnd ( );
Also, we can use the OpenGL point-plotting functions within a C++ procedure to implement the setPixel command.
4 OpenGL Line Functions Graphics packages typically provide a function for specifying one or more straight-line segments, where each line segment is defined by two endpoint coordinate positions. In OpenGL, we select a single endpoint coordinate position using the glVertex function, just as we did for a point position. And we enclose a list of glVertex functions between the glBegin/glEnd pair. But now we use a symbolic constant as the argument for the glBegin function that interprets a list of positions as the endpoint coordinates for line segments. There are three symbolic constants in OpenGL that we can use to specify how a list of endpoint positions should be connected to form a set of straight-line segments. By default, each symbolic constant displays solid, white lines. A set of straight-line segments between each successive pair of endpoints in a list is generated using the primitive line constant GL LINES. In general, this will result in a set of unconnected lines unless some coordinate positions are repeated, because OpenGL considers lines to be connected only if they share a vertex; lines that cross but do not share a vertex are still considered to be unconnected. Nothing is displayed if only one endpoint is specified, and the last endpoint is not processed if the number of endpoints listed is odd. For example, if we have five coordinate positions, labeled p1 through p5, and each is represented as a two-dimensional array, then the following code could generate the display shown in Figure 4(a): glBegin (GL_LINES); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glEnd ( );
Thus, we obtain one line segment between the first and second coordinate positions and another line segment between the third and fourth positions. In this case, the number of specified endpoints is odd, so the last coordinate position is ignored. With the OpenGL primitive constant GL LINE STRIP, we obtain a polyline. In this case, the display is a sequence of connected line segments between the first endpoint in the list and the last endpoint. The first line segment in the polyline is displayed between the first endpoint and the second endpoint; the second line segment is between the second and third endpoints; and so forth, up to the last line endpoint. Nothing is displayed if we do not list at least two coordinate positions.
51
Graphics Output Primitives p3
p3
p5
p1
p2
p4
p1
p2
(a)
p3
p4 (b)
p5
p1
p2
p4 (c)
FIGURE 4
Line segments that can be displayed in OpenGL using a list of five endpoint coordinates. (a) An unconnected set of lines generated with the primitive line constant GL LINES. (b) A polyline generated with GL LINE STRIP. (c) A closed polyline generated with GL LINE LOOP.
Using the same five coordinate positions as in the previous example, we obtain the display in Figure 4(b) with the code glBegin (GL_LINE_STRIP); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glEnd ( );
The third OpenGL line primitive is GL LINE LOOP, which produces a closed polyline. Lines are drawn as with GL LINE STRIP, but an additional line is drawn to connect the last coordinate position and the first coordinate position. Figure 4(c) shows the display of our endpoint list when we select this line option, using the code glBegin (GL_LINE_LOOP); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glEnd ( );
As noted earlier, picture components are described in a world-coordinate reference frame that is eventually mapped to the coordinate reference for the output device. Then the geometric information about the picture is scan-converted to pixel positions.
5 OpenGL Curve Functions Routines for generating basic curves, such as circles and ellipses, are not included as primitive functions in the OpenGL core library. But this library does contain functions for displaying B´ezier splines, which are polynomials that are defined with a discrete point set. And the OpenGL Utility (GLU) library has routines for three-dimensional quadrics, such as spheres and cylinders, as well as
52
Graphics Output Primitives
(a)
(b)
FIGURE 5
(c)
A circular arc approximated with (a) three straight-line segments, (b) six line segments, and (c) twelve line segments.
routines for producing rational B-splines, which are a general class of splines that include the simpler B´ezier curves. Using rational B-splines, we can display circles, ellipses, and other two-dimensional quadrics. In addition, there are routines in the OpenGL Utility Toolkit (GLUT) that we can use to display some three-dimensional quadrics, such as spheres and cones, and some other shapes. However, all these routines are more involved than the basic primitives we introduce in this chapter. Another method we can use to generate a display of a simple curve is to approximate it using a polyline. We just need to locate a set of points along the curve path and connect the points with straight-line segments. The more line sections we include in the polyline, the smoother the appearance of the curve. As an example, Figure 5 illustrates various polyline displays that could be used for a circle segment. A third alternative is to write our own curve-generation functions based on the algorithms presented in following chapters.
6 Fill-Area Primitives Another useful construct, besides points, straight-line segments, and curves, for describing components of a picture is an area that is filled with some solid color or pattern. A picture component of this type is typically referred to as a fill area or a filled area. Most often, fill areas are used to describe surfaces of solid objects, but they are also useful in a variety of other applications. Also, fill regions are usually planar surfaces, mainly polygons. But, in general, there are many possible shapes for a region in a picture that we might wish to fill with a color option. Figure 6 illustrates a few possible fill-area shapes. For the present, we assume that all fill areas are to be displayed with a specified solid color. Although any fill-area shape is possible, graphics libraries generally do not support specifications for arbitrary fill shapes. Most library routines require that
53
Graphics Output Primitives
FIGURE 6
Solid-color fill areas specified with various boundaries. (a) A circular fill region. (b) A fill area bounded by a closed polyline. (c) A filled area specified with an irregular curved boundary.
FIGURE 7
Wire-frame representation for a cylinder, showing only the front (visible) faces of the polygon mesh used to approximate the surfaces.
(a)
(b)
(c)
a fill area be specified as a polygon. Graphics routines can more efficiently process polygons than other kinds of fill shapes because polygon boundaries are described with linear equations. Moreover, most curved surfaces can be approximated reasonably well with a set of polygon patches, just as a curved line can be approximated with a set of straight-line segments. In addition, when lighting effects and surface-shading procedures are applied, an approximated curved surface can be displayed quite realistically. Approximating a curved surface with polygon facets is sometimes referred to as surface tessellation, or fitting the surface with a polygon mesh. Figure 7 shows the side and top surfaces of a metal cylinder approximated in an outline form as a polygon mesh. Displays of such figures can be generated quickly as wire-frame views, showing only the polygon edges to give a general indication of the surface structure. Then the wire-frame model could be shaded to generate a display of a natural-looking material surface. Objects described with a set of polygon surface patches are usually referred to as standard graphics objects, or just graphics objects. In general, we can create fill areas with any boundary specification, such as a circle or connected set of spline-curve sections. And some of the polygon methods discussed in the next section can be adapted to display fill areas with a nonlinear border.
7 Polygon Fill Areas Mathematically defined, a polygon is a plane figure specified by a set of three or more coordinate positions, called vertices, that are connected in sequence by straight-line segments, called the edges or sides of the polygon. Further, in basic geometry, it is required that the polygon edges have no common point other than their endpoints. Thus, by definition, a polygon must have all its vertices within a single plane and there can be no edge crossings. Examples of polygons include triangles, rectangles, octagons, and decagons. Sometimes, any plane figure with a closed-polyline boundary is alluded to as a polygon, and one with no crossing edges is referred to as a standard polygon or a simple polygon. In an effort to avoid ambiguous object references, we will use the term polygon to refer only to those planar shapes that have a closed-polyline boundary and no edge crossings. For a computer-graphics application, it is possible that a designated set of polygon vertices do not all lie exactly in one plane. This can be due to roundoff error in the calculation of numerical values, to errors in selecting coordinate positions for the vertices, or, more typically, to approximating a curved surface with a set of polygonal patches. One way to rectify this problem is simply to divide the specified surface mesh into triangles. But in some cases, there may be reasons
54
Graphics Output Primitives
180 180 FIGURE 8
(a)
(b)
A convex polygon (a), and a concave polygon (b).
to retain the original shape of the mesh patches, so methods have been devised for approximating a nonplanar polygonal shape with a plane figure. We discuss how these plane approximations are calculated in the section on plane equations.
Polygon Classifications An interior angle of a polygon is an angle inside the polygon boundary that is formed by two adjacent edges. If all interior angles of a polygon are less than or equal to 180◦ , the polygon is convex. An equivalent definition of a convex polygon is that its interior lies completely on one side of the infinite extension line of any one of its edges. Also, if we select any two points in the interior of a convex polygon, the line segment joining the two points is also in the interior. A polygon that is not convex is called a concave polygon. Figure 8 gives examples of convex and concave polygons. The term degenerate polygon is often used to describe a set of vertices that are collinear or that have repeated coordinate positions. Collinear vertices generate a line segment. Repeated vertex positions can generate a polygon shape with extraneous lines, overlapping edges, or edges that have a length equal to 0. Sometimes the term degenerate polygon is also applied to a vertex list that contains fewer than three coordinate positions. To be robust, a graphics package could reject degenerate or nonplanar vertex sets. But this requires extra processing to identify these problems, so graphics systems usually leave such considerations to the programmer. Concave polygons also present problems. Implementations of fill algorithms and other graphics routines are more complicated for concave polygons, so it is generally more efficient to split a concave polygon into a set of convex polygons before processing. As with other polygon preprocessing algorithms, concave polygon splitting is often not included in a graphics library. Some graphics packages, including OpenGL, require all fill polygons to be convex. And some systems accept only triangular fill areas, which greatly simplifies many of the display algorithms.
Identifying Concave Polygons A concave polygon has at least one interior angle greater than 180◦ . Also, the extension of some edges of a concave polygon will intersect other edges, and some pair of interior points will produce a line segment that intersects the polygon boundary. Therefore, we can use any one of these characteristics of a concave polygon as a basis for constructing an identification algorithm. If we set up a vector for each polygon edge, then we can use the cross-product of adjacent edges to test for concavity. All such vector products will be of the same sign (positive or negative) for a convex polygon. Therefore, if some cross-products yield a positive value and some a negative value, we have a concave polygon. Figure 9 illustrates the edge-vector, cross-product method for identifying concave polygons.
55
Graphics Output Primitives y V6
(E1 E2)z 0
V5
E5
(E2 E3)z 0
E4 V4
(E3 E4)z 0 E3
E6 FIGURE 9
V3
E2
V1
(E5 E6)z 0 (E6 E1)z 0
E1
Identifying a concave polygon by calculating cross-products of successive pairs of edge vectors.
(E4 E5)z 0
V2 x
Another way to identify a concave polygon is to look at the polygon vertex positions relative to the extension line of any edge. If some vertices are on one side of the extension line and some vertices are on the other side, the polygon is concave.
Splitting Concave Polygons Once we have identified a concave polygon, we can split it into a set of convex polygons. This can be accomplished using edge vectors and edge cross-products; or, we can use vertex positions relative to an edge extension line to determine which vertices are on one side of this line and which are on the other. For the following algorithms, we assume that all polygons are in the xy plane. Of course, the original position of a polygon described in world coordinates may not be in the xy plane, but we can always move it into that plane. With the vector method for splitting a concave polygon, we first need to form the edge vectors. Given two consecutive vertex positions, Vk and Vk+1 , we define the edge vector between them as Ek = Vk+1 − Vk Next we calculate the cross-products of successive edge vectors in order around the polygon perimeter. If the z component of some cross-products is positive while other cross-products have a negative z component, the polygon is concave. Otherwise, the polygon is convex. This assumes that no series of three successive vertices are collinear, in which case the cross-product of the two edge vectors for these vertices would be zero. If all vertices are collinear, we have a degenerate polygon (a straight line). We can apply the vector method by processing edge vectors in counterclockwise order. If any cross-product has a negative z component (as in Figure 9), the polygon is concave and we can split it along the line of the first edge vector in the cross-product pair. The following example illustrates this method for splitting a concave polygon.
E5
3
2 E6
E4
EXAMPLE 1
1 E2
E3
Figure 10 shows a concave polygon with six edges. Edge vectors for this polygon can be expressed as
E1 0
1
2
3
FIGURE 10
Splitting a concave polygon using the vector method.
56
The Vector Method for Splitting Concave Polygons
E1 = (1, 0, 0) E3 = (1, −1, 0)
E2 = (1, 1, 0)
E5 = (−3, 0, 0)
E6 = (0, −2, 0)
E4 = (0, 2, 0)
Graphics Output Primitives
where the z component is 0, since all edges are in the xy plane. The crossproduct E j × Ek for two successive edge vectors is a vector perpendicular to the xy plane with z component equal to E j x E ky − E kx E j y : E1 × E2 = (0, 0, 1) E3 × E4 = (0, 0, 2) E5 × E6 = (0, 0, 6)
E2 × E3 = (0, 0, −2) E4 × E5 = (0, 0, 6) E6 × E1 = (0, 0, 2)
Since the cross-product E2 × E3 has a negative z component, we split the polygon along the line of vector E2 . The line equation for this edge has a slope of 1 and a y intercept of −1. We then determine the intersection of this line with the other polygon edges to split the polygon into two pieces. No other edge cross-products are negative, so the two new polygons are both convex. We can also split a concave polygon using a rotational method. Proceeding counterclockwise around the polygon edges, we shift the position of the polygon so that each vertex Vk in turn is at the coordinate origin. Then, we rotate the polygon about the origin in a clockwise direction so that the next vertex Vk+1 is on the x axis. If the following vertex, Vk+2 , is below the x axis, the polygon is concave. We then split the polygon along the x axis to form two new polygons, and we repeat the concave test for each of the two new polygons. These steps are repeated until we have tested all vertices in the polygon list. Figure 11 illustrates the rotational method for splitting a concave polygon.
y
V1
V2
V3
x
V4
Splitting a Convex Polygon into a Set of Triangles Once we have a vertex list for a convex polygon, we could transform it into a set of triangles. This can be accomplished by first defining any sequence of three consecutive vertices to be a new polygon (a triangle). The middle triangle vertex is then deleted from the original vertex list. Then the same procedure is applied to this modified vertex list to strip off another triangle. We continue forming triangles in this manner until the original polygon is reduced to just three vertices, which define the last triangle in the set. A concave polygon can also be divided into a set of triangles using this approach, although care must be taken that the new diagonal edge formed by joining the first and third selected vertices does not cross the concave portion of the polygon, and that the three selected vertices at each step form an interior angle that is less than 180◦ (a “convex” angle).
FIGURE 11
Splitting a concave polygon using the rotational method. After moving V2 to the coordinate origin and rotating V3 onto the x axis, we find that V4 is below the x axis. So we split the polygon along the line of V2 V3 , which is the x axis.
Inside-Outside Tests Various graphics processes often need to identify interior regions of objects. Identifying the interior of a simple object, such as a convex polygon, a circle, or a sphere, is generally a straightforward process. But sometimes we must deal with more complex objects. For example, we may want to specify a complex fill region with intersecting edges, as in Figure 12. For such shapes, it is not always clear which regions of the xy plane we should call “interior” and which regions we should designate as “exterior” to the object boundaries. Two commonly used algorithms for identifying interior areas of a plane figure are the odd-even rule and the nonzero winding-number rule. We apply the odd-even rule, also called the odd-parity rule or the even-odd rule, by first conceptually drawing a line from any position P to a distant point
57
Graphics Output Primitives A A D
exterior
D
exterior
C
C G
G interior
interior
E
E F
F FIGURE 12
Identifying interior and exterior regions of a closed polyline that contains self-intersecting segments.
B
B Odd-Even Rule
Nonzero Winding-Number Rule
(a)
(b)
outside the coordinate extents of the closed polyline. Then we count the number of line-segment crossings along this line. If the number of segments crossed by this line is odd, then P is considered to be an interior point. Otherwise, P is an exterior point. To obtain an accurate count of the segment crossings, we must be sure that the line path we choose does not intersect any line-segment endpoints. Figure 12(a) shows the interior and exterior regions obtained using the odd-even rule for a self-intersecting closed polyline. We can use this procedure, for example, to fill the interior region between two concentric circles or two concentric polygons with a specified color. Another method for defining interior regions is the nonzero winding-number rule, which counts the number of times that the boundary of an object “winds” around a particular point in the counterclockwise direction. This count is called the winding number, and the interior points of a two-dimensional object can be defined to be those that have a nonzero value for the winding number. We apply the nonzero winding number rule by initializing the winding number to 0 and again imagining a line drawn from any position P to a distant point beyond the coordinate extents of the object. The line we choose must not pass through any endpoint coordinates. As we move along the line from position P to the distant point, we count the number of object line segments that cross the reference line in each direction. We add 1 to the winding number every time we intersect a segment that crosses the line in the direction from right to left, and we subtract 1 every time we intersect a segment that crosses from left to right. The final value of the winding number, after all boundary crossings have been counted, determines the relative position of P. If the winding number is nonzero, P is considered to be an interior point. Otherwise, P is taken to be an exterior point. Figure 12(b) shows the interior and exterior regions defined by the nonzero winding-number rule for a self-intersecting, closed polyline. For simple objects, such as polygons and circles, the nonzero winding-number rule and the odd-even rule give the same results. But for more complex shapes, the two methods may yield different interior and exterior regions, as in the example of Figure 12. One way to determine directional boundary crossings is to set up vectors along the object edges (or boundary lines) and along the reference line. Then we compute the vector cross-product of the vector u, along the line from P to a distant point, with an object edge vector E for each edge that crosses the line. Assuming that we have a two-dimensional object in the xy plane, the direction of each vector cross-product will be either in the +z direction or in the −z direction. If the z component of a cross-product u × E for a particular crossing is positive, that segment crosses from right to left and we add 1 to the winding number.
58
Graphics Output Primitives
Otherwise, the segment crosses from left to right and we subtract 1 from the winding number. A somewhat simpler way to compute directional boundary crossings is to use vector dot products instead of cross-products. To do this, we set up a vector that is perpendicular to vector u and that has a right-to-left direction as we look along the line from P in the direction of u. If the components of u are denoted as (ux , u y), then the vector that is perpendicular to u has components (−u y , ux ). Now, if the dot product of this perpendicular vector and a boundary-line vector is positive, that crossing is from right to left and we add 1 to the winding number. Otherwise, the boundary crosses our reference line from left to right, and we subtract 1 from the winding number. The nonzero winding-number rule tends to classify as interior some areas that the odd-even rule deems to be exterior, and it can be more versatile in some applications. In general, plane figures can be defined with multiple, disjoint components, and the direction specified for each set of disjoint boundaries can be used to designate the interior and exterior regions. Examples include characters (such as letters of the alphabet and punctuation symbols), nested polygons, and concentric circles or ellipses. For curved lines, the odd-even rule is applied by calculating intersections with the curve paths. Similarly, with the nonzero winding-number rule, we need to calculate tangent vectors to the curves at the crossover intersection points with the reference line from position P. Variations of the nonzero winding-number rule can be used to define interior regions in other ways. For example, we could define a point to be interior if its winding number is positive or if it is negative; or we could use any other rule to generate a variety of fill shapes. Sometimes, Boolean operations are used to specify a fill area as a combination of two regions. One way to implement Boolean operations is by using a variation of the basic winding-number rule. With this scheme, we first define a simple, nonintersecting boundary for each of two regions. Then if we consider the direction for each boundary to be counterclockwise, the union of two regions would consist of those points whose winding number is positive (Figure 13). Similarly, the intersection of two regions with counterclockwise boundaries would contain those points whose winding number is greater than 1, as illustrated in Figure 14. To set up a fill area that is the difference of two regions (say, A − B), we can enclose region A with a counterclockwise border and B with a clockwise border. Then the difference region (Figure 15) is the set of all points whose winding number is positive.
FIGURE 13
FIGURE 14
A fill area defined as a region that has a positive value for the winding number. This fill area is the union of two regions, each with a counterclockwise border direction.
A fill area defined as a region with a winding number greater than 1. This fill area is the intersection of two regions, each with a counterclockwise border direction.
59
Graphics Output Primitives
Region B Region A
AB
FIGURE 15
A fill area defined as a region with a positive value for the winding number. This fill area is the difference, A − B, of two regions, where region A has a positive border direction (counterclockwise) and region B has a negative border direction (clockwise).
Polygon Tables Typically, the objects in a scene are described as sets of polygon surface facets. In fact, graphics packages often provide functions for defining a surface shape as a mesh of polygon patches. The description for each object includes coordinate information specifying the geometry for the polygon facets and other surface parameters such as color, transparency, and light-reflection properties. As information for each polygon is input, the data are placed into tables that are to be used in the subsequent processing, display, and manipulation of the objects in the scene. These polygon data tables can be organized into two groups: geometric tables and attribute tables. Geometric data tables contain vertex coordinates and parameters to identify the spatial orientation of the polygon surfaces. Attribute information for an object includes parameters specifying the degree of transparency of the object and its surface reflectivity and texture characteristics. Geometric data for the objects in a scene are arranged conveniently in three lists: a vertex table, an edge table, and a surface-facet table. Coordinate values for each vertex in the object are stored in the vertex table. The edge table contains pointers back into the vertex table to identify the vertices for each polygon edge. And the surface-facet table contains pointers back into the edge table to identify the edges for each polygon. This scheme is illustrated in Figure 16 for two V1
E3 E1
E6
S1 S2
E2 V2
V3
V5 E4
E5
V4 VERTEX TABLE
FIGURE 16
Geometric data-table representation for two adjacent polygon surface facets, formed with six edges and five vertices.
60
V1: V2: V3: V4: V5:
x1, y1, z1 x2, y2, z2 x3, y3, z3 x4, y4, z4 x5, y5, z5
EDGE TABLE E1: E2: E3: E4: E5: E6:
V1, V2 V2, V3 V3, V1 V3, V4 V4, V5 V5, V1
SURFACE-FACET TABLE S1: S2:
E1, E2, E3 E3, E4, E5, E6
Graphics Output Primitives
adjacent polygon facets on an object surface. In addition, individual objects and their component polygon faces can be assigned object and facet identifiers for easy reference. Listing the geometric data in three tables, as in Figure 16, provides a convenient reference to the individual components (vertices, edges, and surface facets) for each object. Also, the object can be displayed efficiently by using data from the edge table to identify polygon boundaries. An alternative arrangement is to use just two tables: a vertex table and a surface-facet table. But this scheme is less convenient, and some edges could get drawn twice in a wire-frame display. Another possibility is to use only a surface-facet table, but this duplicates coordinate information, since explicit coordinate values are listed for each vertex in each polygon facet. Also the relationship between edges and facets would have to be reconstructed from the vertex listings in the surface-facet table. We can add extra information to the data tables of Figure 16 for faster information extraction. For instance, we could expand the edge table to include forward pointers into the surface-facet table so that a common edge between polygons could be identified more rapidly (Figure 17). This is particularly useful for rendering procedures that must vary surface shading smoothly across the edges from one polygon to the next. Similarly, the vertex table could be expanded to reference corresponding edges, for faster information retrieval. Additional geometric information that is usually stored in the data tables includes the slope for each edge and the coordinate extents for polygon edges, polygon facets, and each object in a scene. As vertices are input, we can calculate edge slopes, and we can scan the coordinate values to identify the minimum and maximum x, y, and z values for individual lines and polygons. Edge slopes and bounding-box information are needed in subsequent processing, such as surface rendering and visible-surface identification algorithms. Because the geometric data tables may contain extensive listings of vertices and edges for complex objects and scenes, it is important that the data be checked for consistency and completeness. When vertex, edge, and polygon definitions are specified, it is possible, particularly in interactive applications, that certain input errors could be made that would distort the display of the objects. The more information included in the data tables, the easier it is to check for errors. Therefore, error checking is easier when three data tables (vertex, edge, and surface facet) are used, since this scheme provides the most information. Some of the tests that could be performed by a graphics package are (1) that every vertex is listed as an endpoint for at least two edges, (2) that every edge is part of at least one polygon, (3) that every polygon is closed, (4) that each polygon has at least one shared edge, and (5) that if the edge table contains pointers to polygons, every edge referenced by a polygon pointer has a reciprocal pointer back to the polygon.
E1: E2: E3: E4: E5: E6:
V1, V2, S1 V2, V3, S1 V3, V1, S1, S2 V3, V4, S2 V4, V5, S2 V5, V1, S2
FIGURE 17
Edge table for the surfaces of Figure 16 expanded to include pointers into the surface-facet table.
Plane Equations To produce a display of a three-dimensional scene, a graphics system processes the input data through several procedures. These procedures include transformation of the modeling and world-coordinate descriptions through the viewing pipeline, identification of visible surfaces, and the application of rendering routines to the individual surface facets. For some of these processes, information about the spatial orientation of the surface components of objects is needed. This information is obtained from the vertex coordinate values and the equations that describe the polygon surfaces.
61
Graphics Output Primitives
Each polygon in a scene is contained within a plane of infinite extent. The general equation of a plane is Ax + B y + C z + D = 0
(1)
where (x, y, z) is any point on the plane, and the coefficients A, B, C, and D (called plane parameters) are constants describing the spatial properties of the plane. We can obtain the values of A, B, C, and D by solving a set of three plane equations using the coordinate values for three noncollinear points in the plane. For this purpose, we can select three successive convex-polygon vertices, (x1 , y1 , z1 ), (x2 , y2 , z2 ), and (x3 , y3 , z3 ), in a counterclockwise order and solve the following set of simultaneous linear plane equations for the ratios A/D, B/D, and C/D: (A/D)xk + (B/D)yk + (C/D)zk = −1,
k = 1, 2, 3
(2)
The solution to this set of equations can be obtained in determinant form, using Cramer’s rule, as x1 1 z1 1 y1 z1 B = x2 1 z2 A = 1 y2 z2 x3 1 z3 1 y3 z3 (3) x1 y1 1 x1 y1 z1 C = x2 y2 1 D = − x2 y2 z2 x3 y3 1 x3 y3 z3 Expanding the determinants, we can write the calculations for the plane coefficients in the form A = y1 (z2 − z3 ) + y2 (z3 − z1 ) + y3 (z1 − z2 ) B = z1 (x2 − x3 ) + z2 (x3 − x1 ) + z3 (x1 − x2 ) C = x1 (y2 − y3 ) + x2 (y3 − y1 ) + x3 (y1 − y2 )
(4)
D = −x1 (y2 z3 − y3 z2 ) − x2 (y3 z1 − y1 z3 ) − x3 (y1 z2 − y2 z1 ) These calculations are valid for any three coordinate positions, including those for which D = 0. When vertex coordinates and other information are entered into the polygon data structure, values for A, B, C, and D can be computed for each polygon facet and stored with the other polygon data. It is possible that the coordinates defining a polygon facet may not be contained within a single plane. We can solve this problem by dividing the facet into a set of triangles; or we could find an approximating plane for the vertex list. One method for obtaining an approximating plane is to divide the vertex list into subsets, where each subset contains three vertices, and calculate plane parameters A, B, C, D for each subset. The approximating plane parameters are then obtained as the average value for each of the calculated plane parameters. Another approach is to project the vertex list onto the coordinate planes. Then we take parameter A proportional to the area of the polygon projection on the yz plane, parameter B proportional to the projection area on the xz plane, and parameter C proportional to the projection area on the xy plane. The projection method is often used in ray-tracing applications.
Front and Back Polygon Faces Because we are usually dealing with polygon surfaces that enclose an object interior, we need to distinguish between the two sides of each surface. The side of a polygon that faces into the object interior is called the back face, and the visible, or outward, side is the front face. Identifying the position of points in space
62
Graphics Output Primitives
relative to the front and back faces of a polygon is a basic task in many graphics algorithms, as, for example, in determining object visibility. Every polygon is contained within an infinite plane that partitions space into two regions. Any point that is not on the plane and that is visible to the front face of a polygon surface section is said to be in front of (or outside) the plane, and, thus, outside the object. And any point that is visible to the back face of the polygon is behind (or inside) the plane. A point that is behind (inside) all polygon surface planes is inside the object. We need to keep in mind that this inside/outside classification is relative to the plane containing the polygon, whereas our previous inside/outside tests using the winding-number or odd-even rule were in reference to the interior of some two-dimensional boundary. Plane equations can be used to identify the position of spatial points relative to the polygon facets of an object. For any point (x, y, z) not on a plane with parameters A, B, C, D, we have
y 1
A x + B y + C z + D = 0 Thus, we can identify the point as either behind or in front of a polygon surface contained within that plane according to the sign (negative or positive) of Ax + By + C z + D: if if
A x + B y + C z + D < 0, A x + B y + C z + D > 0,
the point (x, y, z) is behind the plane the point (x, y, z) is in front of the plane
1
z
FIGURE 18
These inequality tests are valid in a right-handed Cartesian system, provided the plane parameters A, B, C, and D were calculated using coordinate positions selected in a strictly counterclockwise order when viewing the surface along a front-to-back direction. For example, in Figure 18, any point outside (in front of) the plane of the shaded polygon satisfies the inequality x − 1 > 0, while any point inside (in back of) the plane has an x-coordinate value less than 1. Orientation of a polygon surface in space can be described with the normal vector for the plane containing that polygon, as shown in Figure 19. This surface normal vector is perpendicular to the plane and has Cartesian components (A, B, C), where parameters A, B, and C are the plane coefficients calculated in Equations 4. The normal vector points in a direction from inside the plane to the outside; that is, from the back face of the polygon to the front face. As an example of calculating the components of the normal vector for a polygon, which also gives us the plane parameters, we choose three of the vertices of the shaded face of the unit cube in Figure 18. These points are selected in a counterclockwise ordering as we view the cube from outside looking toward the origin. Coordinates for these vertices, in the order selected, are then used in Equations 4 to obtain the plane coefficients: A = 1, B = 0, C = 0, D = −1. Thus, the normal vector for this plane is N = (1, 0, 0), which is in the direction of the positive x axis. That is, the normal vector is pointing from inside the cube to the outside and is perpendicular to the plane x = 1. The elements of a normal vector can also be obtained using a vector crossproduct calculation. Assuming we have a convex-polygon surface facet and a right-handed Cartesian system, we again select any three vertex positions, V1 , V2 , and V3 , taken in counterclockwise order when viewing from outside the object toward the inside. Forming two vectors, one from V1 to V2 and the second from V1 to V3 , we calculate N as the vector cross-product: N = (V2 − V1 ) × (V3 − V1 )
x
1
The shaded polygon surface of the unit cube has the plane equation x − 1 = 0. y N⫽(A, B, C)
z
x
FIGURE 19
The normal vector N for a plane described with the equation Ax + B y + C z + D = 0 is perpendicular to the plane and has Cartesian components ( A , B , C ) .
(5)
This generates values for the plane parameters A, B, and C. We can then obtain the value for parameter D by substituting these values and the coordinates for one of
63
Graphics Output Primitives
the polygon vertices into Equation 1 and solving for D. The plane equation can be expressed in vector form using the normal N and the position P of any point in the plane as N · P = −D
(6)
For a convex polygon, we could also obtain the plane parameters using the cross-product of two successive edge vectors. And with a concave polygon, we can select the three vertices so that the two vectors for the cross-product form an angle less than 180◦ . Otherwise, we can take the negative of their cross-product to get the correct normal vector direction for the polygon surface.
8 OpenGL Polygon Fill-Area Functions With one exception, the OpenGL procedures for specifying fill polygons are similar to those for describing a point or a polyline. A glVertex function is used to input the coordinates for a single polygon vertex, and a complete polygon is described with a list of vertices placed between a glBegin/glEnd pair. However, there is one additional function that we can use for displaying a rectangle that has an entirely different format. By default, a polygon interior is displayed in a solid color, determined by the current color settings. As options, we can fill a polygon with a pattern and we can display polygon edges as line borders around the interior fill. There are six different symbolic constants that we can use as the argument in the glBegin function to describe polygon fill areas. These six primitive constants allow us to display a single fill polygon, a set of unconnected fill polygons, or a set of connected fill polygons. In OpenGL, a fill area must be specified as a convex polygon. Thus, a vertex list for a fill polygon must contain at least three vertices, there can be no crossing edges, and all interior angles for the polygon must be less than 180◦ . And a single polygon fill area can be defined with only one vertex list, which precludes any specifications that contain holes in the polygon interior, such as that shown in Figure 20. We could describe such a figure using two overlapping convex polygons. Each polygon that we specify has two faces: a back face and a front face. In OpenGL, fill color and other attributes can be set for each face separately, and back/front identification is needed in both two-dimensional and threedimensional viewing routines. Therefore, polygon vertices should be specified in a counterclockwise order as we view the polygon from “outside.” This identifies the front face of that polygon.
FIGURE 20
A polygon with a complex interior that cannot be specified with a single vertex list.
64
Graphics Output Primitives
250 200 150 100 50 FIGURE 21
50
100
150
200
The display of a square fill area using the glRect function.
Because graphics displays often include rectangular fill areas, OpenGL provides a special rectangle function that directly accepts vertex specifications in the xy plane. In some implementations of OpenGL, the following routine can be more efficient than generating a fill rectangle using glVertex specifications: glRect* (x1, y1, x2, y2);
One corner of this rectangle is at coordinate position (x1, y1), and the opposite corner of the rectangle is at position (x2, y2). Suffix codes for glRect specify the coordinate data type and whether coordinates are to be expressed as array elements. These codes are i (for integer), s (for short), f (for float), d (for double), and v (for vector). The rectangle is displayed with edges parallel to the xy coordinate axes. As an example, the following statement defines the square shown in Figure 21: glRecti (200, 100, 50, 250);
If we put the coordinate values for this rectangle into arrays, we can generate the same square with the following code: int vertex1 [ ] = {200, 100}; int vertex2 [ ] = {50, 250}; glRectiv (vertex1, vertex2);
When a rectangle is generated with function glRect, the polygon edges are formed between the vertices in the order (x1, y1), (x2, y1), (x2, y2), (x1, y2), and then back to (x1, y1). Thus, in our example, we produced a vertex list with a clockwise ordering. In many two-dimensional applications, the determination of front and back faces is unimportant. But if we do want to assign different properties to the front and back faces of the rectangle, then we should reverse the order of the two vertices in this example so that we obtain a counterclockwise ordering of the vertices. Each of the other six OpenGL polygon fill primitives is specified with a symbolic constant in the glBegin function, along with a a list of glVertex commands. With the OpenGL primitive constant GL POLYGON, we can display a single polygon fill area such as that shown in Figure 22(a). For this example, we assume that we have a list of six points, labeled p1 through p6, specifying
65
Graphics Output Primitives p6
p5
p1
p6
p4
p2
p1
p3
p4
p2
(a) p6
p3 (b)
p5
p1
p6
p4
p2
p5
p3 (c)
p5
p1
p4
p2
p3 (d)
FIGURE 22
Displaying polygon fill areas using a list of six vertex positions. (a) A single convex polygon fill area generated with the primitive constant GL POLYGON. (b) Two unconnected triangles generated with GL TRIANGLES. (c) Four connected triangles generated with GL TRIANGLE STRIP. (d) Four connected triangles generated with GL TRIANGLE FAN.
two-dimensional polygon vertex positions in a counterclockwise ordering. Each of the points is represented as an array of (x, y) coordinate values: glBegin (GL_POLYGON); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glVertex2iv (p6); glEnd ( );
A polygon vertex list must contain at least three vertices. Otherwise, nothing is displayed. If we reorder the vertex list and change the primitive constant in the previous code example to GL TRIANGLES, we obtain the two separated triangle fill areas in Figure 22(b): glBegin (GL_TRIANGLES); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p6); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glEnd ( );
66
Graphics Output Primitives
In this case, the first three coordinate points define the vertices for one triangle, the next three points define the next triangle, and so forth. For each triangle fill area, we specify the vertex positions in a counterclockwise order. A set of unconnected triangles is displayed with this primitive constant unless some vertex coordinates are repeated. Nothing is displayed if we do not list at least three vertices; and if the number of vertices specified is not a multiple of 3, the final one or two vertex positions are not used. By reordering the vertex list once more and changing the primitive constant to GL TRIANGLE STRIP, we can display the set of connected triangles shown in Figure 22(c): glBegin (GL_TRIANGLE_STRIP); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p6); glVertex2iv (p3); glVertex2iv (p5); glVertex2iv (p4); glEnd ( );
Assuming that no coordinate positions are repeated in a list of N vertices, we obtain N − 2 triangles in the strip. Clearly, we must have N ≥ 3 or nothing is displayed. In this example, N = 6 and we obtain four triangles. Each successive triangle shares an edge with the previously defined triangle, so the ordering of the vertex list must be set up to ensure a consistent display. One triangle is defined for each vertex position listed after the first two vertices. Thus, the first three vertices should be listed in counterclockwise order, when viewing the front (outside) surface of the triangle. After that, the set of three vertices for each subsequent triangle is arranged in a counterclockwise order within the polygon tables. This is accomplished by processing each position n in the vertex list in the order n = 1, n = 2, . . . , n = N − 2 and arranging the order of the corresponding set of three vertices according to whether n is an odd number or an even number. If n is odd, the polygon table listing for the triangle vertices is in the order n, n + 1, n + 2. If n is even, the triangle vertices are listed in the order n + 1, n, n + 2. In the preceding example, our first triangle (n = 1) would be listed as having vertices (p1, p2, p6). The second triangle (n = 2) would have the vertex ordering (p6, p2, p3). Vertex ordering for the third triangle (n = 3) would be (p6, p3, p5). And the fourth triangle (n = 4) would be listed in the polygon tables with vertex ordering (p5, p3, p4). Another way to generate a set of connected triangles is to use the “fan” approach illustrated in Figure 22(d), where all triangles share a common vertex. We obtain this arrangement of triangles using the primitive constant GL TRIANGLE FAN and the original ordering of our six vertices: glBegin (GL_TRIANGLE_FAN); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glVertex2iv (p6); glEnd ( );
For N vertices, we again obtain N − 2 triangles, providing no vertex positions are repeated, and we must list at least three vertices. In addition, the vertices must
67
Graphics Output Primitives p8 p1
p4
p5
p6 p2
p3
p7 (a) p8
p1 p4
p5
FIGURE 23
Displaying quadrilateral fill areas using a list of eight vertex positions. (a) Two unconnected quadrilaterals generated with GL QUADS. (b) Three connected quadrilaterals generated with GL QUAD STRIP.
p6 p2
p3 (b)
p7
be specified in the proper order to define front and back faces for each triangle correctly. The first coordinate position listed (in this case, p1) is a vertex for each triangle in the fan. If we again enumerate the triangles and the coordinate positions listed as n = 1, n = 2, . . . , n = N − 2, then vertices for triangle n are listed in the polygon tables in the order 1, n + 1, n + 2. Therefore, triangle 1 is defined with the vertex list (p1, p2, p3); triangle 2 has the vertex ordering (p1, p3, p4); triangle 3 has its vertices specified in the order (p1, p4, p5); and triangle 4 is listed with vertices (p1, p5, p6). Besides the primitive functions for triangles and a general polygon, OpenGL provides for the specifications of two types of quadrilaterals (four-sided polygons). With the GL QUADS primitive constant and the following list of eight vertices, specified as two-dimensional coordinate arrays, we can generate the display shown in Figure 23(a): glBegin (GL_QUADS); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p3); glVertex2iv (p4); glVertex2iv (p5); glVertex2iv (p6); glVertex2iv (p7); glVertex2iv (p8); glEnd ( );
68
Graphics Output Primitives
The first four coordinate points define the vertices for one quadrilateral, the next four points define the next quadrilateral, and so on. For each quadrilateral fill area, we specify the vertex positions in a counterclockwise order. If no vertex coordinates are repeated, we display a set of unconnected four-sided fill areas. We must list at least four vertices with this primitive. Otherwise, nothing is displayed. And if the number of vertices specified is not a multiple of 4, the extra vertex positions are ignored. Rearranging the vertex list in the previous quadrilateral code example and changing the primitive constant to GL QUAD STRIP, we can obtain the set of connected quadrilaterals shown in Figure 23(b): glBegin (GL_QUAD_STRIP); glVertex2iv (p1); glVertex2iv (p2); glVertex2iv (p4); glVertex2iv (p3); glVertex2iv (p5); glVertex2iv (p6); glVertex2iv (p8); glVertex2iv (p7); glEnd ( );
A quadrilateral is set up for each pair of vertices specified after the first two vertices in the list, and we need to list the vertices so that we generate a correct counterclockwise vertex ordering for each polygon. For a list of N vertices, we obtain N2 − 1 quadrilaterals, providing that N ≥ 4. If N is not a multiple of 4, any extra coordinate positions in the list are not used. We can enumerate these fill polygons and the vertices listed as n = 1, n = 2, . . . , n = N2 − 1. Then polygon tables will list the vertices for quadrilateral n in the vertex order number 2n − 1, 2n, 2n + 2, 2n + 1. For this example, N = 8 and we have 3 quadrilaterals in the strip. Thus, our first quadrilateral (n = 1) is listed as having a vertex ordering of (p1, p2, p3, p4). The second quadrilateral (n = 2) has the vertex ordering (p4, p3, p6, p5), and the vertex ordering for the third quadrilateral (n = 3) is (p5, p6, p7, p8). Most graphics packages display curved surfaces as a set of approximating plane facets. This is because plane equations are linear, and processing the linear equations is much quicker than processing quadric or other types of curve equations. So OpenGL and other packages provide polygon primitives to facilitate the approximation of a curved surface. Objects are modeled with polygon meshes, and a database of geometric and attribute information is set up to facilitate the processing of the polygon facets. In OpenGL, primitives that we can use for this purpose are the triangle strip, the triangle fan, and the quad strip. Fast hardwareimplemented polygon renderers are incorporated into high-quality graphics systems with the capability for displaying millions of shaded polygons per second (usually triangles), including the application of surface texture and special lighting effects. Although the OpenGL core library allows only convex polygons, the GLU library provides functions for dealing with concave polygons and other nonconvex objects with linear boundaries. A set of GLU polygon tessellation routines is available for converting such shapes into a set of triangles, triangle meshes, triangle fans, and straight-line segments. Once such objects have been decomposed, they can be processed with basic OpenGL functions.
69
Graphics Output Primitives
9 OpenGL Vertex Arrays Although our examples so far have contained relatively few coordinate positions, describing a scene containing several objects can get much more complicated. To illustrate, we first consider describing a single, very basic object: the unit cube shown in Figure 24, with coordinates given in integers to simplify our discussion. A straightforward method for defining the vertex coordinates is to use a double-subscripted array, such as GLint points [8][3] = { {0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 0}, {0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1} }; Alternatively, we could first define a data type for a three-dimensional vertex position and then give the coordinates for each vertex position as an element of a single-subscripted array as, for example, typedef GLint vertex3 [3]; vertex3 pt [8] = { {0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 0}, {0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1} }; Next, we need to define each of the six faces of this object. For this, we could make six calls either to glBegin (GL POLYGON) or to glBegin (GL QUADS). In either case, we must be sure to list the vertices for each face in a counterclockwise order when viewing that surface from the outside of the cube. In the following code segment, we specify each cube face as a quadrilateral and use a function call to pass array subscript values to the OpenGL primitive routines. Figure 25 shows the subscript values for array pt corresponding to the cube vertex positions.
z
1 4
6
5
7
y
0 1
0
1
1 x
70
2
FIGURE 24
FIGURE 25
A cube with an edge length of 1.
Subscript values for array pt corresponding to the vertex coordinates for the cube shown in Figure 24.
3
Graphics Output Primitives
void quad (GLint n1, GLint n2, GLint n3, GLint n4) { glBegin (GL_QUADS); glVertex3iv (pt [n1]); glVertex3iv (pt [n2]); glVertex3iv (pt [n3]); glVertex3iv (pt [n4]); glEnd ( ); } void cube ( ) { quad (6, 2, 3, 7); quad (5, 1, 0, 4); quad (7, 3, 1, 5); quad (4, 0, 2, 6); quad (2, 0, 1, 3); quad (7, 5, 4, 6); } Thus, the specification for each face requires six OpenGL functions, and we have six faces to specify. When we add color specifications and other parameters, our display program for the cube could easily contain 100 or more OpenGL function calls. And scenes with many complex objects can require much more. As we can see from the preceding cube example, a complete scene description could require hundreds or thousands of coordinate specifications. In addition, there are various attribute and viewing parameters that must be set for individual objects. Thus, object and scene descriptions could require an enormous number of function calls, which puts a demand on system resources and can slow execution of the graphics programs. A further problem with complex displays is that object surfaces (such as the cube in Figure 24) usually have shared vertex coordinates. Using the methods we have discussed up to now, these shared positions may need to be specified multiple times. To alleviate these problems, OpenGL provides a mechanism for reducing the number of function calls needed in processing coordinate information. Using a vertex array, we can arrange the information for describing a scene so that we need only a very few function calls. The steps involved are 1. Invoke the function glEnableClientState (GL VERTEX ARRAY) to
activate the vertex-array feature of OpenGL. 2. Use the function glVertexPointer to specify the location and data
format for the vertex coordinates. 3. Display the scene using a routine such as glDrawElements, which can
process multiple primitives with very few function calls. Using the pt array previously defined for the cube, we implement these three steps in the following code example: glEnableClientState (GL_VERTEX_ARRAY); glVertexPointer (3, GL_INT, 0, pt); GLubyte vertIndex [ ] = (6, 2, 3, 7, 5, 1, 0, 4, 7, 3, 1, 5, 4, 0, 2, 6, 2, 0, 1, 3, 7, 5, 4, 6); glDrawElements (GL_QUADS, 24, GL_UNSIGNED_BYTE, vertIndex);
71
Graphics Output Primitives
With the first command, glEnableClientState (GL VERTEX ARRAY), we activate a capability (in this case, a vertex array) on the client side of a clientserver system. Because the client (the machine that is running the main program) retains the data for a picture, the vertex array must be there also. The server (our workstation, for example) generates commands and displays the picture. Of course, a single machine can be both client and server. The vertex-array feature of OpenGL is deactivated with the command glDisableClientState (GL_VERTEX_ARRAY); We next give the location and format of the coordinates for the object vertices in the function glVertexPointer. The first parameter in glVertexPointer (3 in this example) specifies the number of coordinates used in each vertex description. Data type for the vertex coordinates is designated using an OpenGL symbolic constant as the second parameter in this function. For our example, the data type is GL INT. Other data types are specified with the symbolic constants GL BYTE, GL SHORT, GL FLOAT, and GL DOUBLE. With the third parameter, we give the byte offset between consecutive vertices. The purpose of this argument is to allow various kinds of data, such as coordinates and colors, to be packed together in one array. Because we are giving only the coordinate data, we assign a value of 0 to the offset parameter. The last parameter in the glVertexPointer function references the vertex array, which contains the coordinate values. All the indices for the cube vertices are stored in array vertIndex. Each of these indices is the subscript for array pt corresponding to the coordinate values for that vertex. This index list is referenced as the last parameter value in function glDrawElements and is then used by the primitive GL QUADS, which is the first parameter, to display the set of quadrilateral surfaces for the cube. The second parameter specifies the number of elements in array vertIndex. Because a quadrilateral requires just 4 vertices and we specified 24, the glDrawElements function continues to display another cube face after each successive set of 4 vertices until all 24 have been processed. Thus, we accomplish the final display of all faces of the cube with this single function call. The third parameter in function glDrawElements gives the type for the index values. Because our indices are small integers, we specified a type of GL UNSIGNED BYTE. The two other index types that can be used are GL UNSIGNED SHORT and GL UNSIGNED INT. Additional information can be combined with the coordinate values in the vertex arrays to facilitate the processing of a scene description. We can specify color values and other attributes for objects in arrays that can be referenced by the glDrawElements function. Also, we can interlace the various arrays for greater efficiency.
10 Pixel-Array Primitives In addition to straight lines, polygons, circles, and other primitives, graphics packages often supply routines to display shapes that are defined with a rectangular array of color values. We can obtain the rectangular grid pattern by digitizing (scanning) a photograph or other picture or by generating a shape with a graphics program. Each color value in the array is then mapped to one or more screen pixel positions. A pixel array of color values is typically referred to as a pixmap.
72
Graphics Output Primitives y
ymax
n ⫽ 5 rows
ymin m ⫽ 8 columns FIGURE 26
xmin
xmax
x
Mapping an n by m color array onto a region of the screen coordinates.
Parameters for a pixel array can include a pointer to the color matrix, the size of the matrix, and the position and size of the screen area to be affected by the color values. Figure 26 gives an example of mapping a pixel-color array onto a screen area. Another method for implementing a pixel array is to assign either the bit value 0 or the bit value 1 to each element of the matrix. In this case, the array is simply a bitmap, which is sometimes called a mask, that indicates whether a pixel is to be assigned (or combined with) a preset color.
11 OpenGL Pixel-Array Functions There are two functions in OpenGL that we can use to define a shape or pattern specified with a rectangular array. One function defines a bitmap pattern, and the other a pixmap pattern. Also, OpenGL provides several routines for saving, copying, and manipulating arrays of pixel values.
OpenGL Bitmap Function A binary array pattern is defined with the function glBitmap (width, height, x0, y0, xOffset, yOffset, bitShape); Parameters width and height in this function give the number of columns and number of rows, respectively, in the array bitShape. Each element of bitShape is assigned either a 1 or a 0. A value of 1 indicates that the corresponding pixel is to be displayed in a previously defined color. Otherwise, the pixel is unaffected by the bitmap. (As an option, we could use a value of 1 to indicate that a specified color is to be combined with the color value stored in the refresh buffer at that position.) Parameters x0 and y0 define the position that is to be considered the “origin” of the rectangular array. This origin position is specified relative to the lower-left corner of bitShape, and values for x0 and y0 can be positive or negative. In addition, we need to designate a location in the frame buffer where the pattern is to be applied. This location is called the current raster position, and the bitmap is displayed by positioning its origin, (x0, y0), at the current
73
Graphics Output Primitives
raster position. Values assigned to parameters xOffset and yOffset are used as coordinate offsets to update the frame-buffer current raster position after the bitmap is displayed. Coordinate values for x0, y0, xOffset, and yOffset, as well as the current raster position, are maintained as floating-point values. Of course, bitmaps will be applied at integer pixel positions. But floating-point coordinates allow a set of bitmaps to be spaced at arbitrary intervals, which is useful in some applications, such as forming character strings with bitmap patterns. We use the following routine to set the coordinates for the current raster position: glRasterPos* ( ) Parameters and suffix codes are the same as those for the glVertex function. Thus, a current raster position is given in world coordinates, and it is transformed to screen coordinates by the viewing transformations. For our two-dimensional examples, we can specify coordinates for the current raster position directly in integer screen coordinates. The default value for the current raster position is the world-coordinate origin (0, 0, 0). The color for a bitmap is the color that is in effect at the time that the glRasterPos command is invoked. Any subsequent color changes do not affect the bitmap. Each row of a rectangular bit array is stored in multiples of 8 bits, where the binary data is arranged as a set of 8-bit unsigned characters. But we can describe a shape using any convenient grid size. For example, Figure 27 shows a bit pattern defined on a 10-row by 9-column grid, where the binary data is specified with 16 bits for each row. When this pattern is applied to the pixels in the frame buffer, all bit values beyond the ninth column are ignored. We apply the bit pattern of Figure 27 to a frame-buffer location with the following code section: GLubyte bitShape [20] = { 0x1c, 0x00, 0x1c, 0x00, 0x1c, 0x00, 0x1c, 0x00, 0x1c, 0x00, 0xff, 0x80, 0x7f, 0x00, 0x3e, 0x00, 0x1c, 0x00, 0x08, 0x00}; glPixelStorei (GL_UNPACK_ALIGNMENT, 1);
// Set pixel storage mode.
glRasterPos2i (30, 40); glBitmap (9, 10, 0.0, 0.0, 20.0, 15.0, bitShape);
10
FIGURE 27
A bit pattern, specified in an array with 10 rows and 9 columns, is stored in 8-bit blocks of 10 rows with 16 bit values per row.
74
0
0
0
0
1
0
0
0
0
0
0
0
008
000
0
0
0
1
1
1
0
0
0
0
0
0
01C
000
0
0
1
1
1
1
1
0
0
0
0
0
03E
000
0
1
1
1
1
1
1
1
0
0
0
0
07F
000
1
1
1
1
1
1
1
1
1
0
0
0
0FF
080
0
0
0
1
1
1
0
0
0
0
0
0
01C
000
0
0
0
1
1
1
0
0
0
0
0
0
01C
000
0
0
0
1
1
1
0
0
0
0
0
0
01C
000
0
0
0
1
1
1
0
0
0
0
0
0
01C
000
0
0
0
1
1
1
0
0
0
0
0
0
01C
000
12 16
Graphics Output Primitives
Array values for bitShape are specified row by row, starting at the bottom of the rectangular-grid pattern. Next we set the storage mode for the bitmap with the OpenGL routine glPixelStorei. The parameter value of 1 in this function indicates that the data values are to be aligned on byte boundaries. With glRasterPos, we set the current raster position to (30, 40). Finally, function glBitmap specifies that the bit pattern is given in array bitShape, and that this array has 9 columns and 10 rows. The coordinates for the origin of this pattern are (0.0, 0.0), which is the lower-left corner of the grid. We illustrate a coordinate offset with the values (20.0, 15.0), although we do not use the offset in this example.
OpenGL Pixmap Function A pattern defined as an array of color values is applied to a block of frame-buffer pixel positions with the function glDrawPixels (width, height, dataFormat, dataType, pixMap); Again, parameters width and height give the column and row dimensions, respectively, of the array pixMap. Parameter dataFormat is assigned an OpenGL constant that indicates how the values are specified for the array. For example, we could specify a single blue color for all pixels with the constant GL BLUE, or we could specify three color components in the order blue, green, red with the constant GL BGR. A number of other color specifications are possible. An OpenGL constant, such as GL BYTE, GL INT, or GL FLOAT, is assigned to parameter dataType to designate the data type for the color values in the array. The lower-left corner of this color array is mapped to the current raster position, as set by the glRasterPos function. As an example, the following statement displays a pixmap defined in a 128 × 128 array of RGB color values: glDrawPixels (128, 128, GL_RGB, GL_UNSIGNED_BYTE, colorShape); Because OpenGL provides several buffers, we can paste an array of values into a particular buffer by selecting that buffer as the target of the glDrawPixels routine. Some buffers store color values and some store other kinds of pixel data. A depth buffer, for instance, is used to store object distances (depths) from the viewing position, and a stencil buffer is used to store boundary patterns for a scene. We select one of these two buffers by setting parameter dataFormat in the glDrawPixels routine to either GL DEPTH COMPONENT or GL STENCIL INDEX. For these buffers, we would need to set up the pixel array using either depth values or stencil information. There are four color buffers available in OpenGL that can be used for screen refreshing. Two of the color buffers constitute a left-right scene pair for displaying stereoscopic views. For each of the stereoscopic buffers, there is a front-back pair for double-buffered animation displays. In a particular implementation of OpenGL, either stereoscopic viewing or double buffering, or both, might not be supported. If neither stereoscopic effects nor double buffering is supported, then there is only a single refresh buffer, which is designated as the front-left color buffer. This is the default refresh buffer when double buffering is not available or not in effect. If double buffering is in effect, the default is either the back-left and back-right buffers or only the back-left buffer, depending on the current state of stereoscopic viewing. Also, a number of user-defined, auxiliary color buffers are supported that can be used for any nonrefresh purpose, such as saving a picture that is to be copied later into a refresh buffer for display.
75
Graphics Output Primitives
We select a single color or auxiliary buffer or a combination of color buffers for storing a pixmap with the following command: glDrawBuffer (buffer); A variety of OpenGL symbolic constants can be assigned to parameter buffer to designate one or more “draw” buffers. For instance, we can pick a single buffer with either GL FRONT LEFT, GL FRONT RIGHT, GL BACK LEFT, or GL BACK RIGHT. We can select both front buffers with GL FRONT, and we can select both back buffers with GL BACK. This is assuming that stereoscopic viewing is in effect. Otherwise, the previous two symbolic constants designate a single buffer. Similarly, we can designate either the left or right buffer pairs with GL LEFT or GL RIGHT, and we can select all the available color buffers with GL FRONT AND BACK. An auxiliary buffer is chosen with the constant GL AUXk, where k is an integer value from 0 to 3, although more than four auxiliary buffers may be available in some implementations of OpenGL.
OpenGL Raster Operations In addition to storing an array of pixel values in a buffer, we can retrieve a block of values from a buffer or copy the block into another buffer area, and we can perform a variety of other operations on a pixel array. In general, the term raster operation or raster op is used to describe any function that processes a pixel array in some way. A raster operation that moves an array of pixel values from one place to another is also referred to as a block transfer of pixel values. On a bilevel system, these operations are called bitblt transfers (bit-block transfers), particularly when the functions are hardware-implemented. On a multilevel system, the term pixblt is sometimes used for block transfers. We use the following function to select a rectangular block of pixel values in a designated set of buffers: glReadPixels (xmin, ymin, width, height, dataFormat, dataType, array}; The lower-left corner of the rectangular block to be retrieved is at screencoordinate position (xmin, ymin). Parameters width, height, dataFormat, and dataType are the same as in the glDrawPixels routine. The type of data to be saved in parameter array depends on the selected buffer. We can choose either the depth buffer or the stencil buffer by assigning either the value GL DEPTH COMPONENT or the value GL STENCIL INDEX to parameter dataFormat. A particular combination of color buffers or an auxiliary buffer is selected for the application of the glReadPixels routine with the function glReadBuffer (buffer); Symbolic constants for specifying one or more buffers are the same as in the glDrawBuffer routine except that we cannot select all four of the color buffers. The default buffer selection is the front left-right pair or just the front-left buffer, depending on the status of stereoscopic viewing. We can also copy a block of pixel data from one location to another within the set of OpenGL buffers using the following routine: glCopyPixels (xmin, ymin, width, height, pixelValues};
76
Graphics Output Primitives
The lower-left corner of the block is at screen-coordinate location (xmin, ymin), and parameters width and height are assigned positive integer values to designate the number of columns and rows, respectively, that are to be copied. Parameter pixelValues is assigned either GL COLOR, GL DEPTH, or GL STENCIL to indicate the kind of data we want to copy: color values, depth values, or stencil values. In addition, the block of pixel values is copied from a source buffer to a destination buffer, with its lower-left corner mapped to the current raster position. We select the source buffer with the glReadBuffer command, and we select the destination buffer with the glDrawBuffer command. Both the region to be copied and the destination area should lie completely within the bounds of the screen coordinates. To achieve different effects as a block of pixel values is placed into a buffer with glDrawPixels or glCopyPixels, we can combine the incoming values with the old buffer values in various ways. As an example, we could apply logical operations, such as and, or, and exclusive or, to combine the two blocks of pixel values. In OpenGL, we select a bitwise, logical operation for combining incoming and destination pixel color values with the functions glEnable (GL_COLOR_LOGIC_OP); glLogicOp (logicOp); A variety of symbolic constants can be assigned to parameter logicOp, including GL AND, GL OR, and GL XOR. In addition, either the incoming bit values or the destination bit values can be inverted (interchanging 0 and 1 values). We use the constant GL COPY INVERTED to invert the incoming color bit values and then replace the destination values with the inverted incoming values; and we could simply invert the destination bit values without replacing them with the incoming values using GL INVERT. The various invert operations can also be combined with the logical and, or, and exclusive or operations. Other options include clearing all the destination bits to the value 0 (GL CLEAR), or setting all the destination bits to the value 1 (GL SET). The default value for the glLogicOp routine is GL COPY, which simply replaces the destination values with the incoming values. Additional OpenGL routines are available for manipulating pixel arrays processed by the glDrawPixels, glReadPixels, and glCopyPixels functions. For example, the glPixelTransfer and glPixelMap routines can be used to shift or adjust color values, depth values, or stencil values. We return to pixel operations in later chapters as we explore other facets of computer-graphics packages.
12 Character Primitives Graphics displays often include textural information, such as labels on graphs and charts, signs on buildings or vehicles, and general identifying information for simulation and visualization applications. Routines for generating character primitives are available in most graphics packages. Some systems provide an extensive set of character functions, while other systems offer only minimal support for character generation. Letters, numbers, and other characters can be displayed in a variety of sizes and styles. The overall design style for a set (or family) of characters is called a typeface. Today, there are thousands of typefaces available for computer applications. Examples of a few common typefaces are Courier, Helvetica, New York,
77
Graphics Output Primitives
Palatino, and Zapf Chancery. Originally, the term font referred to a set of cast metal character forms in a particular size and format, such as 10-point Courier Italic or 12-point Palatino Bold. A 14-point font has a total character height of about 0.5 centimeter. In other words, 72 points is about the equivalent of 2.54 centimeters (1 inch). The terms font and typeface are now often used interchangeably, since most printing is no longer done with cast metal forms. Fonts can be divided into two broad groups: serif and sans serif. Serif type has small lines or accents at the ends of the main character strokes, while sans-serif type does not have such accents. For example, this text is set in a serif font (Palatino). But this sentence is printed in a sans-serif font (Univers). Serif type is generally more readable; that is, it is easier to read in longer blocks of text. On the other hand, the individual characters in sans-serif type are easier to recognize. For this reason, sans-serif type is said to be more legible. Since sans-serif characters can be recognized quickly, this font is good for labeling and short headings. Fonts are also classified according to whether they are monospace or proportional. Characters in a monospace font all have the same width. In a proportional font, character width varies. Two different representations are used for storing computer fonts. A simple method for representing the character shapes in a particular typeface is to set up a pattern of binary values on a rectangular grid. The set of characters is then referred to as a bitmap font (or bitmapped font). A bitmapped character set is also sometimes referred to as a raster font. Another, more flexible, scheme is to describe character shapes using straight-line and curve sections, as in PostScript, for example. In this case, the set of characters is called an outline font or a stroke font. Figure 28 illustrates the two methods for character representation. When the pattern in Figure 28(a) is applied to an area of the frame buffer, the 1 bits designate which pixel positions are to be displayed in a specified color. To display the character shape in Figure 28(b), the interior of the character outline is treated as a fill area. Bitmap fonts are the simplest to define and display: We just need to map the character grids to a frame-buffer position. In general, however, bitmap fonts require more storage space because each variation (size and format) must be saved in a font cache. It is possible to generate different sizes and other variations, such as bold and italic, from one bitmap font set, but this often does not produce good results. We can increase or decrease the size of a character bitmap only in integer multiples of the pixel size. To double the size of a character, we need to double the number of pixels in the bitmap. This just increases the ragged appearance of its edges. In contrast to bitmap fonts, outline fonts can be increased in size without distorting the character shapes. And outline fonts require less storage because each variation does not require a distinct font cache. We can produce boldface,
FIGURE 28
The letter “B” represented with an 8 × 8 bitmap pattern (a) and with an outline shape defined with straight-line and curve segments (b).
78
1
1
1
1
1
1
0
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
1
1
1
0
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
(a)
(b)
Graphics Output Primitives y * 100
*
x
41 94 59 43 85 74 110 59 121 89 149 122
* * *
50 *
x 0
50
100
y
150
FIGURE 29
A polymarker graph of a set of data values.
italic, or different sizes by manipulating the curve definitions for the character outlines. But it does take more time to process the outline fonts because they must be scan-converted into the frame buffer. There is a variety of possible functions for implementing character displays. Some graphics packages provide a function that accepts any character string and a frame-buffer starting position for the string. Another type of function displays a single character at one or more selected positions. Since this character routine is useful for showing markers in a network layout or in displaying a point plot of a discrete data set, the character displayed by this routine is sometimes referred to as a marker symbol or polymarker, in analogy with a polyline primitive. In addition to standard characters, special shapes such as dots, circles, and crosses are often available as marker symbols. Figure 29 shows a plot of a discrete data set using an asterisk as a marker symbol. Geometric descriptions for characters are given in world coordinates, just as they are for other primitives, and this information is mapped to screen coordinates by the viewing transformations. A bitmap character is described with a rectangular grid of binary values and a grid reference position. This reference position is then mapped to a specified location in the frame buffer. An outline character is defined by a set of coordinate positions that are to be connected with a series of curves and straight-line segments and a reference position that is to be mapped to a given frame-buffer location. The reference position can be specified either for a single outline character or for a string of characters. In general, character routines can allow the construction of both two-dimensional and three-dimensional character displays.
13 OpenGL Character Functions Only low-level support is provided by the basic OpenGL library for displaying individual characters and text strings. We can explicitly define any character as a bitmap, as in the example shape shown in Figure 27, and we can store a set of bitmap characters as a font list. A text string is then displayed by mapping a selected sequence of bitmaps from the font list into adjacent positions in the frame buffer. However, some predefined character sets are available in the GLUT library, so we do not need to create our own fonts as bitmap shapes unless we want to display a font that is not available in GLUT. The GLUT library contains routines for displaying both bitmapped and outline fonts. Bitmapped GLUT fonts are rendered using the OpenGL glBitmap function, and the outline fonts are generated with polyline (GL LINE STRIP) boundaries. We can display a bitmap GLUT character with glutBitmapCharacter (font, character);
79
Graphics Output Primitives
where parameter font is assigned a symbolic GLUT constant identifying a particular set of typefaces, and parameter character is assigned either the ASCII code or the specific character we wish to display. Thus, to display the uppercase letter “A,” we can either use the ASCII value 65 or the designation 'A'. Similarly, a code value of 66 is equivalent to 'B', code 97 corresponds to the lowercase letter 'a', code 98 corresponds to 'b', and so forth. Both fixed-width fonts and proportionally spaced fonts are available. We can select a fixed-width font by assigning either GLUT BITMAP 8 BY 13 or GLUT BITMAP 9 BY 15 to parameter font. And we can select a 10-point, proportionally spaced font with either GLUT BITMAP TIMES ROMAN 10 or GLUT BITMAP HELVETICA 10. A 12-point Times-Roman font is also available, as well as 12-point and 18-point Helvetica fonts. Each character generated by glutBitmapCharacter is displayed so that the origin (lower-left corner) of the bitmap is at the current raster position. After the character bitmap is loaded into the refresh buffer, an offset equal to the width of the character is added to the x coordinate for the current raster position. As an example, we could display a text string containing 36 bitmap characters with the following code: glRasterPosition2i (x, y); for (k = 0; k < 36; k++) glutBitmapCharacter (GLUT_BITMAP_9_BY_15, text [k]); Characters are displayed in the color that was specified before the execution of the glutBitmapCharacter function. An outline character is displayed with the following function call: glutStrokeCharacter (font, character); For this function, we can assign parameter font either the value GLUT STROKE ROMAN, which displays a proportionally spaced font, or the value GLUT STROKE MONO ROMAN, which displays a font with constant spacing. We control the size and position of these characters by specifying transformation operations before executing the glutStrokeCharacter routine. After each character is displayed, a coordinate offset is applied automatically so that the position for displaying the next character is to the right of the current character. Text strings generated with outline fonts are part of the geometric description for a twodimensional or three-dimensional scene because they are constructed with line segments. Thus, they can be viewed fromvarious directions, and we can shrink or expand them without distortion, or transform them in other ways. But they are slower to render, compared to bitmapped fonts.
14 Picture Partitioning Some graphics libraries include routines for describing a picture as a collection of named sections and for manipulating the individual sections of a picture. Using these functions, we can create, edit, delete, or move a part of a picture independently of the other picture components. In addition, we can use this feature of a graphics package for hierarchical modeling, in which an object description is given as a tree structure composed of a number of levels specifying the object subparts. Various names are used for the subsections of a picture. Some graphics packages refer to them as structures, while other packages call them segments
80
Graphics Output Primitives
or objects. Also, the allowable subsection operations vary greatly from one package to another. Modeling packages, for example, provide a wide range of operations that can be used to describe and manipulate picture elements. On the other hand, for any graphics library, we can always structure and manage the components of a picture using procedural elements available in a high-level language such as C++.
15 OpenGL Display Lists Often it can be convenient or more efficient to store an object description (or any other set of OpenGL commands) as a named sequence of statements. We can do this in OpenGL using a structure called a display list. Once a display list has been created, we can reference the list multiple times with different display operations. On a network, a display list describing a scene is stored on the server machine, which eliminates the need to transmit the commands in the list each time the scene is to be displayed. We can also set up a display list so that it is saved for later execution, or we can specify that the commands in the list be executed immediately. And display lists are particularly useful for hierarchical modeling, where a complex object can be described with a set of simpler subparts.
Creating and Naming an OpenGL Display List A set of OpenGL commands is formed into a display list by enclosing the commands within the glNewList/glEndList pair of functions. For example, glNewList (listID, listMode}; . . . glEndList ( ); This structure forms a display list with a positive integer value assigned to parameter listID as the name for the list. Parameter listMode is assigned an OpenGL symbolic constant that can be either GL COMPILE or GL COMPILE AND EXECUTE. If we want to save the list for later execution, we use GL COMPILE. Otherwise, the commands are executed as they are placed into the list, in addition to allowing us to execute the list again at a later time. As a display list is created, expressions involving parameters such as coordinate positions and color components are evaluated so that only the parameter values are stored in the list. Any subsequent changes to these parameters have no effect on the list. Because display-list values cannot be changed, we cannot include certain OpenGL commands, such as vertex-list pointers, in a display list. We can create any number of display lists, and we execute a particular list of commands with a call to its identifier. Further, one display list can be embedded within another display list. But if a list is assigned an identifier that has already been used, the new list replaces the previous list that had been assigned that identifier. Therefore, to avoid losing a list by accidentally reusing its identifier, we can let OpenGL generate an identifier for us, as follows: listID = glGenLists (1); This statement returns one (1) unused positive integer identifier to the variable listID. A range of unused integer list identifiers is obtained if we change the argument of glGenLists from the value 1 to some other positive integer. For
81
Graphics Output Primitives
instance, if we invoke glGenLists (6), then a sequence of six contiguous positive integer values is reserved and the first value in this list of identifiers is returned to the variable listID. A value of 0 is returned by the glGenLists function if an error occurs or if the system cannot generate the range of contiguous integers requested. Therefore, before using an identifier obtained from the glGenLists routine, we could check to be sure that it is not 0. Although unused list identifiers can be generated with the glGenList function, we can independently query the system to determine whether a specific integer value has been used as a list name. The function to accomplish this is glIsList (listID}; A value of GL TRUE is returned if the value of listID is an integer that has already been used as a display-list name. If the integer value has not been used as a list name, the glIsList function returns the value GL FALSE.
Executing OpenGL Display Lists We execute a single display list with the statement glCallList (listID); The following code segment illustrates the creation and execution of a display list. We first set up a display list that contains the description for a regular hexagon, defined in the xy plane using a set of six equally spaced vertices around the circumference of a circle, whose center coordinates are (200, 200) and whose radius is 150. Then we issue a call to function glCallList, which displays the hexagon. const double TWO_PI = 6.2831853; GLuint regHex; GLdouble theta; GLint x, y, k; /* Set up a display list for a regular hexagon. * Vertices for the hexagon are six equally spaced * points around the circumference of a circle. */ regHex = glGenLists (1); // Get an identifier for the display list. glNewList (regHex, GL_COMPILE); glBegin (GL_POLYGON); for (k = 0; k < 6; k++) { theta = TWO_PI * k / 6.0; x = 200 + 150 * cos (theta); y = 200 + 150 * sin (theta); glVertex2i (x, y); } glEnd ( ); glEndList ( ); glCallList (regHex);
82
Graphics Output Primitives
Several display lists can be executed using the following two statements: glListBase (offsetValue); glCallLists (nLists, arrayDataType, listIDArray); The integer number of lists that we want to execute is assigned to parameter nLists, and parameter listIDArray is an array of display-list identifiers. In general, listIDArray can contain any number of elements, and invalid displaylist identifiers are ignored. Also, the elements in listIDArray can be specified in a variety of data formats, and parameter arrayDataType is used to indicate a data type, such as GL BYTE, GL INT, GL FLOAT, GL 3 BYTES, or GL 4 BYTES. A display-list identifier is calculated by adding the value in an element of listIDArray to the integer value of offsetValue that is given in the glListBase function. The default value for offsetValue is 0. This mechanism for specifying a sequence of display lists that are to be executed allows us to set up groups of related display lists, whose identifiers are formed from symbolic names or codes. A typical example is a font set where each display-list identifier is the ASCII value of a character. When several font sets are defined, we use parameter offsetValue in the glListBase function to obtain a particular font described within the array listIDArray.
Deleting OpenGL Display Lists We eliminate a contiguous set of display lists with the function call glDeleteLists (startID, nLists); Parameter startID gives the initial display-list identifier, and parameter nLists specifies the number of lists that are to be deleted. For example, the statement glDeleteLists (5, 4); eliminates the four display lists with identifiers 5, 6, 7, and 8. An identifier value that references a nonexistent display list is ignored.
16 OpenGL Display-Window Reshape Function After the generation of our picture, we often want to use the mouse pointer to drag the display window to another screen location or to change its size. Changing the size of a display window could change its aspect ratio and cause objects to be distorted from their original shapes. To allow us to compensate for a change in display-window dimensions, the GLUT library provides the following routine:
glutReshapeFunc (winReshapeFcn); We can include this function in the main procedure in our program, along with the other GLUT routines, and it will be activated whenever the display-window size is altered. The argument for this GLUT function is the name of a procedure that
83
Graphics Output Primitives
FIGURE 30
The display window generated by the example program illustrating the use of the reshape function.
is to receive the new display-window width and height. We can then use the new dimensions to reset the projection parameters and perform any other operations, such as changing the display-window color. In addition, we could save the new width and height values so that they could be used by other procedures in our program. As an example, the following program illustrates how we might structure the winReshapeFcn procedure. The glLoadIdentity command is included in the reshape function so that any previous values for the projection parameters will not affect the new projection settings. This program displays the regular hexagon discussed in Section 15. Although the hexagon center (at the position of the circle center) in this example is specified in terms of the display-window parameters, the position of the hexagon is unaffected by any changes in the size of the display window. This is because the hexagon is defined within a display list, and only the original center coordinates are stored in the list. If we want the position of the hexagon to change when the display window is resized, we need to define the hexagon in another way or alter the coordinate reference for the display window. The output from this program is shown in Figure 30. #include #include #include const double TWO_PI = 6.2831853; /* Initial display-window size. */ GLsizei winWidth = 400, winHeight = 400; GLuint regHex; class screenPt { private: GLint x, y;
84
Graphics Output Primitives
public: /* Default Constructor: initializes coordinate position to (0, 0). screenPt ( ) { x = y = 0; } void setCoords (GLint xCoord, GLint yCoord) x = xCoord; y = yCoord; } GLint getx ( ) const return x; }
{
GLint gety ( ) const return y; }
{
*/
{
}; static void init (void) { screenPt hexVertex, circCtr; GLdouble theta; GLint k; /* Set circle center coordinates. */ circCtr.setCoords (winWidth / 2, winHeight / 2); glClearColor (1.0, 1.0, 1.0, 0.0);
//
Display-window color = white.
/* Set up a display list for a red regular hexagon. * Vertices for the hexagon are six equally spaced * points around the circumference of a circle. */ regHex = glGenLists (1); // Get an identifier for the display list. glNewList (regHex, GL_COMPILE); glColor3f (1.0, 0.0, 0.0); // Set fill color for hexagon to red. glBegin (GL_POLYGON); for (k = 0; k < 6; k++) { theta = TWO_PI * k / 6.0; hexVertex.setCoords (circCtr.getx ( ) + 150 * cos (theta), circCtr.gety ( ) + 150 * sin (theta)); glVertex2i (hexVertex.getx ( ), hexVertex.gety ( )); } glEnd ( ); glEndList ( ); } void regHexagon (void) { glClear (GL_COLOR_BUFFER_BIT); glCallList (regHex); glFlush ( ); }
85
Graphics Output Primitives
void winReshapeFcn (int newWidth, int newHeight) { glMatrixMode (GL_PROJECTION); glLoadIdentity ( ); gluOrtho2D (0.0, (GLdouble) newWidth, 0.0, (GLdouble) newHeight); glClear (GL_COLOR_BUFFER_BIT); } void main (int argc, char** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (100, 100); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Reshape-Function & Display-List Example"); init ( ); glutDisplayFunc (regHexagon); glutReshapeFunc (winReshapeFcn); glutMainLoop ( ); }
17 Summary The output primitives discussed in this chapter provide the basic tools for constructing pictures with individual points, straight lines, curves, filled color areas, array patterns, and text. We specify primitives by giving their geometric descriptions in a Cartesian, world-coordinate reference system. A fill area is a planar region that is to be displayed in a solid color or color pattern. Fill-area primitives in most graphics packages are polygons. But, in general, we could specify a fill region with any boundary. Often, graphics systems allow only convex polygon fill areas. In that case, a concave-polygon fill area can be displayed by dividing it into a set of convex polygons. Triangles are the easiest polygons to fill, because each scan line crossing a triangle intersects exactly two polygon edges (assuming that the scan line does not pass through any vertices). The odd-even rule can be used to locate the interior points of a planar region. Other methods for defining object interiors are also useful, particularly with irregular, self-intersecting objects. A common example is the nonzero winding-number rule. This rule is more flexible than the odd-even rule for handling objects defined with multiple boundaries. We can also use variations of the winding-number rule to combine plane areas using Boolean operations. Each polygon has a front face and a back face, which determines the spatial orientation of the polygon plane. This spatial orientation can be determined from the normal vector, which is perpendicular to the polygon plane and points
86
Graphics Output Primitives
in the direction from the back face to the front face. We can determine the components of the normal vector from the polygon plane equation or by forming a vector cross-product using three points in the plane, where the three points are taken in a counterclockwise order and the angle formed by the three points is less than 180◦ . All coordinate values, spatial orientations, and other geometric data for a scene are entered into three tables: vertex, edge, and surface-facet tables. Additional primitives available in graphics packages include pattern arrays and character strings. Pattern arrays can be used to specify two-dimensional shapes, including a character set, using either a rectangular set of binary values or a set of color values. Character strings are used to provide picture and graph labeling. Using the primitive functions available in the basic OpenGL library, we can generate points, straight-line segments, convex polygon fill areas, and either bitmap or pixmap pattern arrays. Routines for displaying character strings are available in GLUT. Other types of primitives, such as circles, ellipses, and concavepolygon fill areas, can be constructed or approximated with these functions, or they can be generated using routines in GLU and GLUT. All coordinate values are expressed in absolute coordinates within a right-handed Cartesian-coordinate reference system. Coordinate positions describing a scene can be given in either a two-dimensional or a three-dimensional reference frame. We can use integer or floating-point values to give a coordinate position, and we can also reference a position with a pointer to an array of coordinate values. A scene description is then transformed by viewing functions into a two-dimensional display on an output device, such as a video monitor. Except for the glRect function, each coordinate position for a set of points, lines, or polygons is specfied in a glVertex function. And the set of glVertex functions defining each primitive is included between a glBegin/glEnd pair of statements, where the primitive type is identified with a symbolic constant as the argument for the glBegin function. When describing a scene containing many polygon fill surfaces, we can generate the display efficiently using OpenGL vertex arrays to specify geometric and other data. In Table 1, we list the basic functions for generating output primitives in OpenGL. Some related routines are also listed in this table.
Example Programs Here, we present a few example OpenGL programs illustrating the use of output primitives. Each program uses one or more of the functions listed in Table 1. A display window is set up for the output from each program. The first program illustrates the use of a polyline, a set of polymarkers, and bit-mapped character labels to generate a line graph for monthly data over a period of one year. A proportionally spaced font is demonstrated, although a fixed-width font is usually easier to align with graph positions. Because the bitmaps are referenced at the lower-left corner by the raster-position function, we must shift the reference position to align the center of a text string with a plotted data position. Figure 31 shows the output of the line-graph program.
87
Graphics Output Primitives
T A B L E
1
Summary of OpenGL Output Primitive Functions and Related Routines Function gluOrtho2D glVertex*
88
Description Specifies a two-dimensional worldcoordinate reference. Selects a coordinate position. This function must be placed within a glBegin/glEnd pair.
glBegin (GL POINTS);
Plots one or more point positions, each specified in a glVertex function. The list of positions is then closed with a glEnd statement.
glBegin (GL LINES);
Displays a set of straight-line segments, whose endpoint coordinates are specified in glVertex functions. The list of endpoints is then closed with a glEnd statement.
glBegin (GL LINE STRIP);
Displays a polyline, specified using the same structure as GL LINES.
glBegin (GL LINE LOOP);
Displays a closed polyline, specified using the same structure as GL LINES.
glRect*
Displays a fill rectangle in the xy plane.
glBegin (GL POLYGON);
Displays a fill polygon, whose vertices are given in glVertex functions and terminated with a glEnd statement.
glBegin (GL TRIANGLES);
Displays a set of fill triangles using the same structure as GL POLYGON.
glBegin (GL TRIANGLE STRIP);
Displays a fill-triangle mesh, specified using the same structure as GL POLYGON.
glBegin (GL TRIANGLE FAN);
Displays a fill-triangle mesh in a fan shape with all triangles connected to the first vertex, specified with same structure as GL POLYGON.
glBegin (GL QUADS);
Displays a set of fill quadrilaterals, specified with the same structure as GL POLYGON.
glBegin (GL QUAD STRIP);
Displays a fill-quadrilateral mesh, specified with the same structure as GL POLYGON.
glEnableClientState (GL VERTEX ARRAY);
Activates vertex-array features of OpenGL.
glVertexPointer (size, type, stride, array);
Specifies an array of coordinate values.
glDrawElements (prim, num, type, array);
Displays a specified primitive type from array data.
Graphics Output Primitives
T A B L E
1
(continued) Function
Description
glNewList (listID, listMode)
Defines a set of commands as a display list, terminated with a glEndList statement.
glGenLists
Generates one or more display-list identifiers.
glIsList
Queries OpenGL to determine whether a display-list identifier is in use.
glCallList
Executes a single display list.
glListBase
Specifies an offset value for an array of display-list identifiers.
glCallLists
Executes multiple display lists.
glDeleteLists
Eliminates a specified sequence of display lists.
glRasterPos*
Specifies a two-dimensional or threedimensional current position for the frame buffer. This position is used as a reference for bitmap and pixmap patterns.
glBitmap (w, h, x0, y0, xShift, yShift, pattern);
Specifies a binary pattern that is to be mapped to pixel positions relative to the current position.
glDrawPixels (w, h, type, format, pattern);
Specifies a color pattern that is to be mapped to pixel positions relative to the current position.
glDrawBuffer
Selects one or more buffers for storing a pixmap.
glReadPixels
Saves a block of pixels in a selected array.
glCopyPixels
Copies a block of pixels from one buffer position to another.
glLogicOp
Selects a logical operation for combining two pixel arrays, after enabling with the constant GL COLOR LOGIC OP.
glutBitmapCharacter (font, char);
Specifies a font and a bitmap character for display.
glutStrokeCharacter (font, char);
Specifies a font and an outline character for display.
glutReshapeFunc
Specifies actions to be taken when display-window dimensions are changed.
89
Graphics Output Primitives
FIGURE 31
A polyline and polymarker plot of data points output by the lineGraph routine.
#include GLsizei winWidth = 600, winHeight = 500; GLint xRaster = 25, yRaster = 150;
// Initial display window size. // Initialize raster position.
GLubyte label [36] = {'J', 'A', 'J', 'O',
'e', 'a', 'u', 'o',
'a', 'p', 'u', 'c',
'n', 'r', 'l', 't',
'F', 'M', 'A', 'N',
'b', 'y', 'g', 'v',
'M', 'J', 'S', 'D',
'a', 'u', 'e', 'e',
'r', 'n', 'p', 'c'};
GLint dataValue [12] = {420, 342, 324, 310, 262, 185, 190, 196, 217, 240, 312, 438}; void init (void) { glClearColor (1.0, 1.0, 1.0, 1.0); glMatrixMode (GL_PROJECTION); gluOrtho2D (0.0, 600.0, 0.0, 500.0); } void lineGraph (void) { GLint month, k; GLint x = 30; glClear (GL_COLOR_BUFFER_BIT); glColor3f (0.0, 0.0, 1.0);
90
// White display window.
// Initialize x position for chart. // //
Clear display window. Set line color to blue.
Graphics Output Primitives
glBegin (GL_LINE_STRIP); // Plot data as a polyline. for (k = 0; k < 12; k++) glVertex2i (x + k*50, dataValue [k]); glEnd ( );
glColor3f (1.0, 0.0, 0.0); // Set marker color to red. for (k = 0; k < 12; k++) { // Plot data as asterisk polymarkers. glRasterPos2i (xRaster + k*50, dataValue [k] - 4); glutBitmapCharacter (GLUT_BITMAP_9_BY_15, '*'); }
glColor3f (0.0, 0.0, 0.0); // Set text color to black. xRaster = 20; // Display chart labels. for (month = 0; month < 12; month++) { glRasterPos2i (xRaster, yRaster); for (k = 3*month; k < 3*month + 3; k++) glutBitmapCharacter (GLUT_BITMAP_HELVETICA_12, label [k]); xRaster += 50; } glFlush ( ); }
void winReshapeFcn (GLint newWidth, GLint newHeight) { glMatrixMode (GL_PROJECTION); glLoadIdentity ( ); gluOrtho2D (0.0, GLdouble (newWidth), 0.0, GLdouble (newHeight)); glClear (GL_COLOR_BUFFER_BIT); }
void main (int argc, char** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (100, 100); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Line Chart Data Plot"); init ( ); glutDisplayFunc (lineGraph); glutReshapeFunc (winReshapeFcn); glutMainLoop ( ); }
We use the same data set in the second program to produce the bar chart in Figure 32. This program illustrates an application of rectangular fill areas, as well as bitmapped character labels.
91
Graphics Output Primitives
FIGURE 32
A bar chart generated by the barChart procedure.
void barChart (void) { GLint month, k; glClear (GL_COLOR_BUFFER_BIT); //
Clear display window.
glColor3f (1.0, 0.0, 0.0); // Set bar color to red. for (k = 0; k < 12; k++) glRecti (20 + k*50, 165, 40 + k*50, dataValue [k]); glColor3f (0.0, 0.0, 0.0); // Set text color to black. xRaster = 20; // Display chart labels. for (month = 0; month < 12; month++) { glRasterPos2i (xRaster, yRaster); for (k = 3*month; k < 3*month + 3; k++) glutBitmapCharacter (GLUT_BITMAP_HELVETICA_12, label [h]); xRaster += 50; } glFlush ( ); }
FIGURE 33
Output produced with the pieChart procedure.
92
Pie charts are used to show the percentage contribution of individual parts to the whole. The next program constructs a pie chart, using the midpoint routine for generating a circle. Example values are used for the number and relative sizes of the slices, and the output from this program appears in Figure 33.
Graphics Output Primitives
#include #include #include const GLdouble twoPi = 6.283185; class scrPt { public: GLint x, y; }; GLsizei winWidth = 400, winHeight = 300;
// Initial display window size.
void init (void) { glClearColor (1.0, 1.0, 1.0, 1.0); glMatrixMode (GL_PROJECTION); gluOrtho2D (0.0, 200.0, 0.0, 150.0); } . . .
//
Midpoint routines for displaying a circle.
void pieChart (void) { scrPt circCtr, piePt; GLint radius = winWidth / 4;
// Circle radius.
GLdouble sliceAngle, previousSliceAngle = 0.0; GLint k, nSlices = 12; // Number of slices. GLfloat dataValues[12] = {10.0, 7.0, 13.0, 5.0, 13.0, 14.0, 3.0, 16.0, 5.0, 3.0, 17.0, 8.0}; GLfloat dataSum = 0.0; circCtr.x = winWidth / 2; circCtr.y = winHeight / 2; circleMidpoint (circCtr, radius);
// Circle center position. // Call a midpoint circle-plot routine.
for (k = 0; k < nSlices; k++) dataSum += dataValues[k]; for (k = 0; k < nSlices; k++) { sliceAngle = twoPi * dataValues[k] / dataSum + previousSliceAngle; piePt.x = circCtr.x + radius * cos (sliceAngle); piePt.y = circCtr.y + radius * sin (sliceAngle); glBegin (GL_LINES); glVertex2i (circCtr.x, circCtr.y); glVertex2i (piePt.x, piePt.y); glEnd ( ); previousSliceAngle = sliceAngle; } }
93
Graphics Output Primitives
void displayFcn (void) { glClear (GL_COLOR_BUFFER_BIT);
//
Clear display window.
glColor3f (0.0, 0.0, 1.0);
//
Set circle color to blue.
pieChart ( ); glFlush ( ); } void winReshapeFcn (GLint newWidth, GLint newHeight) { glMatrixMode (GL_PROJECTION); glLoadIdentity ( ); gluOrtho2D (0.0, GLdouble (newWidth), 0.0, GLdouble (newHeight)); glClear (GL_COLOR_BUFFER_BIT); /* Reset display-window size parameters. winWidth = newWidth; winHeight = newHeight;
*/
} void main (int argc, char** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (100, 100); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Pie Chart"); init ( ); glutDisplayFunc (displayFcn); glutReshapeFunc (winReshapeFcn); glutMainLoop ( ); }
Some variations on the circle equations are displayed by our last example program, which uses the parametric polar equations (6-28) to compute points along the curve paths. These points are then used as the endpoint positions for straightline sections, displaying the curves as approximating polylines. The curves shown in Figure 34 are generated by varying the radius r of a circle. Depending on how we vary r , we can produce a lima¸con, cardioid, spiral, or other similar figure.
(a)
(b)
(c)
(d)
(e)
FIGURE 34
Curved figures displayed by the drawCurve procedure: (a) lima¸con, (b) cardioid, (c) three-leaf curve, (d) four-leaf curve, and (e) spiral.
94
Graphics Output Primitives #include #include #include #include struct screenPt { GLint x; GLint y; }; typedef enum { limacon = 1, cardioid, threeLeaf, fourLeaf, spiral } curveName; GLsizei winWidth = 600, winHeight = 500;
// Initial display window size.
void init (void) { glClearColor (1.0, 1.0, 1.0, 1.0); glMatrixMode (GL_PROJECTION); gluOrtho2D (0.0, 200.0, 0.0, 150.0); } void lineSegment (screenPt pt1, screenPt pt2) { glBegin (GL_LINES); glVertex2i (pt1.x, pt1.y); glVertex2i (pt2.x, pt2.y); glEnd ( ); } void drawCurve (GLint curveNum) { /* The limacon of Pascal is a modification of the circle equation * with the radius varying as r = a * cos (theta) + b, where a * and b are constants. A cardioid is a limacon with a = b. * Three-leaf and four-leaf curves are generated when * r = a * cos (n * theta), with n = 3 and n = 2, respectively. * A spiral is displayed when r is a multiple of theta. */ const GLdouble twoPi = 6.283185; const GLint a = 175, b = 60; GLfloat r, theta, dtheta = 1.0 / float (a); GLint x0 = 200, y0 = 250; // Set an initial screen position. screenPt curvePt[2]; glColor3f (0.0, 0.0, 0.0); curvePt[0].x = x0; curvePt[0].y = y0;
//
Set curve color to black.
// Initialize curve position.
95
Graphics Output Primitives
switch (curveNum) { case limacon: case cardioid: case threeLeaf: case fourLeaf: case spiral: default: }
curvePt[0].x curvePt[0].x curvePt[0].x curvePt[0].x break; break;
+= += += +=
a + b; a + a; a; a;
theta = dtheta; while (theta < two_Pi) { switch (curveNum) { case limacon: r = a * cos (theta) + b; case cardioid: r = a * (1 + cos (theta)); case threeLeaf: r = a * cos (3 * theta); case fourLeaf: r = a * cos (2 * theta); case spiral: r = (a / 4.0) * theta; default: }
break; break; break; break;
break; break; break; break; break; break;
curvePt[1].x = x0 + r * cos (theta); curvePt[1].y = y0 + r * sin (theta); lineSegment (curvePt[0], curvePt[1]); curvePt[0].x = curvePt[1].x; curvePt[0].y = curvePt[1].y; theta += dtheta; } } void displayFcn (void) { GLint curveNum; glClear (GL_COLOR_BUFFER_BIT); cout cout cout cout cin
1, δy can be set proportional to a small vertical deflection voltage with the corresponding horizontal deflection voltage set proportional to δx, calculated from Equation 5. For lines with m = 1, δx = δy and the horizontal and vertical deflections voltages are equal. In each case, a smooth line with slope m is generated between the specified endpoints. On raster systems, lines are plotted with pixels, and step sizes in the horizontal and vertical directions are constrained by pixel separations. That is, we must “sample” a line at discrete positions and determine the nearest pixel to the line at each sampled position. This scan-conversion process for straight lines is illustrated in Figure 3 with discrete sample positions along the x axis.
yend
y0
x0
xend
FIGURE 3
DDA Algorithm The digital differential analyzer (DDA) is a scan-conversion line algorithm based on calculating either δy or δx, using Equation 4 or Equation 5. A line is sampled at unit intervals in one coordinate and the corresponding integer values nearest the line path are determined for the other coordinate. We consider first a line with positive slope, as shown in Figure 2. If the slope is less than or equal to 1, we sample at unit x intervals (δx = 1) and compute successive y values as yk+1 = yk + m
Straight-line segment with five sampling positions along the x axis between x 0 and x end .
(6)
Subscript k takes integer values starting from 0, for the first point, and increases by 1 until the final endpoint is reached. Because m can be any real number between 0.0 and 1.0, each calculated y value must be rounded to the nearest integer corresponding to a screen pixel position in the x column that we are processing. For lines with a positive slope greater than 1.0, we reverse the roles of x and y. That is, we sample at unit y intervals (δy = 1) and calculate consecutive x values as 1 xk+1 = xk + (7) m In this case, each computed x value is rounded to the nearest pixel position along the current y scan line. Equations 6 and 7 are based on the assumption that lines are to be processed from the left endpoint to the right endpoint (Figure 2). If this processing is reversed, so that the starting endpoint is at the right, then either we have δx = −1 and yk+1 = yk − m
(8)
or (when the slope is greater than 1) we have δy = −1 with 1 xk+1 = xk − (9) m Similar calculations are carried out using Equations 6 through 9 to determine pixel positions along a line with negative slope. Thus, if the absolute value of the slope is less than 1 and the starting endpoint is at the left, we set δx = 1 and calculate y values with Equation 6. When the starting endpoint is at the right (for the same slope), we set δx = −1 and obtain y positions using Equation 8. For a negative slope with absolute value greater than 1, we use δy = −1 and Equation 9, or we use δy = 1 and Equation 7. This algorithm is summarized in the following procedure, which accepts as input two integer screen positions for the endpoints of a line segment. Horizontal and vertical differences between the endpoint positions are assigned to parameters dx and dy. The difference with the greater magnitude determines the value of parameter steps. This value is the number of pixels that must be drawn beyond the starting pixel; from it, we calculate the x and y increments needed to generate
133
Implementation Algorithms for Graphics Primitives and Attributes
the next pixel position at each step along the line path. We draw the starting pixel at position (x0, y0), and then draw the remaining pixels iteratively, adjusting x and y at each step to obtain the next pixel’s position before drawing it. If the magnitude of dx is greater than the magnitude of dy and x0 is less than xEnd, the values for the increments in the x and y directions are 1 and m, respectively. If the greater change is in the x direction, but x0 is greater than xEnd, then the decrements −1 and −m are used to generate each new point on the line. Otherwise, we use a unit increment (or decrement) in the y direction and an x increment (or decrement) of m1 . #include #include inline int round (const float a)
{ return int (a + 0.5); }
void lineDDA (int x0, int y0, int xEnd, int yEnd) { int dx = xEnd - x0, dy = yEnd - y0, steps, k; float xIncrement, yIncrement, x = x0, y = y0; if (fabs (dx) > fabs (dy)) steps = fabs (dx); else steps = fabs (dy); xIncrement = float (dx) / float (steps); yIncrement = float (dy) / float (steps);
Specified Line Path
13
setPixel (round (x), round (y)); for (k = 0; k < steps; k++) { x += xIncrement; y += yIncrement; setPixel (round (x), round (y)); }
12 11 10 10
11
12
13
}
FIGURE 4
A section of a display screen where a straight-line segment is to be plotted, starting from the pixel at column 10 on scan line 11.
50 Specified Line Path
49 48 50
51
52
53
FIGURE 5
A section of a display screen where a negative slope line segment is to be plotted, starting from the pixel at column 50 on scan line 50.
134
The DDA algorithm is a faster method for calculating pixel positions than one that directly implements Equation 1. It eliminates the multiplication in Equation 1 by using raster characteristics, so that appropriate increments are applied in the x or y directions to step from one pixel position to another along the line path. The accumulation of round-off error in successive additions of the floating-point increment, however, can cause the calculated pixel positions to drift away from the true line path for long line segments. Furthermore, the rounding operations and floating-point arithmetic in this procedure are still time-consuming. We can improve the performance of the DDA algorithm by separating the increments m and m1 into integer and fractional parts so that all calculations are reduced to integer operations. A method for calculating m1 increments in integer steps is discussed in Section 10. In the next section, we consider a more general scanline approach that can be applied to both lines and curves.
Bresenham’s Line Algorithm In this section, we introduce an accurate and efficient raster line-generating algorithm, developed by Bresenham, that uses only incremental integer calculations. In addition, Bresenham’s line algorithm can be adapted to display circles and other curves. Figures 4 and 5 illustrate sections of a display screen where
Implementation Algorithms for Graphics Primitives and Attributes
straight-line segments are to be drawn. The vertical axes show scan-line positions, and the horizontal axes identify pixel columns. Sampling at unit x intervals in these examples, we need to decide which of two possible pixel positions is closer to the line path at each sample step. Starting from the left endpoint shown in Figure 4, we need to determine at the next sample position whether to plot the pixel at position (11, 11) or the one at (11, 12). Similarly, Figure 5 shows a negative-slope line path starting from the left endpoint at pixel position (50, 50). In this one, do we select the next pixel position as (51, 50) or as (51, 49)? These questions are answered with Bresenham’s line algorithm by testing the sign of an integer parameter whose value is proportional to the difference between the vertical separations of the two pixel positions from the actual line path. To illustrate Bresenham’s approach, we first consider the scan-conversion process for lines with positive slope less than 1.0. Pixel positions along a line path are then determined by sampling at unit x intervals. Starting from the left endpoint (x0 , y0 ) of a given line, we step to each successive column (x position) and plot the pixel whose scan-line y value is closest to the line path. Figure 6 demonstrates the kth step in this process. Assuming that we have determined that the pixel at (xk , yk ) is to be displayed, we next need to decide which pixel to plot in column xk+1 = xk + 1. Our choices are the pixels at positions (xk + 1, yk ) and (xk + 1, yk + 1). At sampling position xk + 1, we label vertical pixel separations from the mathematical line path as dlower and dupper (Figure 7). The y coordinate on the mathematical line at pixel column position xk + 1 is calculated as y = m(xk + 1) + b
(10)
Then
yk 2
y mx b
yk 1 yk xk
xk 1 xk 2 xk 3
FIGURE 6
A section of the screen showing a pixel in column x k on scan line y k that is to be plotted along the path of a line segment with slope 0 < m < 1.
yk 1 y
d upper dlower
yk xk 1
dlower = y − yk = m(xk + 1) + b − yk
yk 3
(11)
and
FIGURE 7
Vertical distances between pixel positions and the line y coordinate at sampling position x k + 1.
dupper = (yk + 1) − y = yk + 1 − m(xk + 1) − b
(12)
To determine which of the two pixels is closest to the line path, we can set up an efficient test that is based on the difference between the two pixel separations as follows: dlower − dupper = 2m(xk + 1) − 2yk + 2b − 1
(13)
A decision parameter pk for the kth step in the line algorithm can be obtained by rearranging Equation 13 so that it involves only integer calculations. We accomplish this by substituting m = y/x, where y and x are the vertical and horizontal separations of the endpoint positions, and defining the decision parameter as pk = x(dlower − dupper ) = 2y · xk − 2x · yk + c
(14)
The sign of pk is the same as the sign of dlower − dupper , because x > 0 for our example. Parameter c is constant and has the value 2y + x(2b − 1), which is independent of the pixel position and will be eliminated in the recursive calculations for pk . If the pixel at yk is “closer” to the line path than the pixel at yk + 1 (that is, dlower < dupper ), then decision parameter pk is negative. In that case, we plot the lower pixel; otherwise, we plot the upper pixel.
135
Implementation Algorithms for Graphics Primitives and Attributes
Coordinate changes along the line occur in unit steps in either the x or y direction. Therefore, we can obtain the values of successive decision parameters using incremental integer calculations. At step k + 1, the decision parameter is evaluated from Equation 14 as pk+1 = 2y · xk+1 − 2x · yk+1 + c Subtracting Equation 14 from the preceding equation, we have pk+1 − pk = 2y(xk+1 − xk ) − 2x(yk+1 − yk ) However, xk+1 = xk + 1, so that pk+1 = pk + 2y − 2x(yk+1 − yk )
(15)
where the term yk+1 − yk is either 0 or 1, depending on the sign of parameter pk . This recursive calculation of decision parameters is performed at each integer x position, starting at the left coordinate endpoint of the line. The first parameter, p0 , is evaluated from Equation 14 at the starting pixel position (x0 , y0) and with m evaluated as y/x as follows: p0 = 2y − x
(16)
We summarize Bresenham line drawing for a line with a positive slope less than 1 in the following outline of the algorithm. The constants 2y and 2y − 2x are calculated once for each line to be scan-converted, so the arithmetic involves only integer addition and subtraction of these two constants. Step 4 of the algorithm will be performed a total of x times.
Bresenham’s Line-Drawing Algorithm for |m| < 1.0 1. Input the two line endpoints and store the left endpoint in (x0 , y0 ). 2. Set the color for frame-buffer position (x0 , y0 ); i.e., plot the first point. 3. Calculate the constants x, y, 2y, and 2y − 2x, and obtain the starting value for the decision parameter as p0 = 2y − x 4. At each xk along the line, starting at k = 0, perform the following test: If pk < 0, the next point to plot is (xk + 1, yk ) and pk+1 = pk + 2y Otherwise, the next point to plot is (xk + 1, yk + 1) and pk+1 = pk + 2y − 2x 5. Repeat step 4 x − 1 more times.
EXAMPLE 1
Bresenham Line Drawing
To illustrate the algorithm, we digitize the line with endpoints (20, 10) and (30, 18). This line has a slope of 0.8, with x = 10,
y = 8
The initial decision parameter has the value p0 = 2y − x =6
136
Implementation Algorithms for Graphics Primitives and Attributes
and the increments for calculating successive decision parameters are 2y = 16,
2y − 2x = −4
We plot the initial point (x0 , y0 ) = (20, 10), and determine successive pixel positions along the line path from the decision parameter as follows: k
pk
(xk+1 , yk+1 )
k
pk
(xk+1 , yk+1 )
0 1 2 3 4
6 2 −2 14 10
(21, 11) (22, 12) (23, 12) (24, 13) (25, 14)
5 6 7 8 9
6 2 −2 14 10
(26, 15) (27, 16) (28, 16) (29, 17) (30, 18)
A plot of the pixels generated along this line path is shown in Figure 8.
18
15
FIGURE 8
10 20 21 22
25
30
Pixel positions along the line path between endpoints (20, 10) and (30, 18), plotted with Bresenham’s line algorithm.
An implementation of Bresenham line drawing for slopes in the range 0 < m < 1.0 is given in the following procedure. Endpoint pixel positions for the line are passed to this procedure, and pixels are plotted from the left endpoint to the right endpoint. #include #include /* Bresenham line-drawing procedure for |m| < 1.0. */ void lineBres (int x0, int y0, int xEnd, int yEnd) { int dx = fabs (xEnd - x0), dy = fabs(yEnd - y0); int p = 2 * dy - dx; int twoDy = 2 * dy, twoDyMinusDx = 2 * (dy - dx); int x, y; /* Determine which endpoint to use as start position. if (x0 > xEnd) { x = xEnd; y = yEnd; xEnd = x0; }
*/
137
Implementation Algorithms for Graphics Primitives and Attributes
else { x = x0; y = y0; } setPixel (x, y); while (x < xEnd) { x++; if (p < 0) p += twoDy; else { y++; p += twoDyMinusDx; } setPixel (x, y); } }
Bresenham’s algorithm is generalized to lines with arbitrary slope by considering the symmetry between the various octants and quadrants of the xy plane. For a line with positive slope greater than 1.0, we interchange the roles of the x and y directions. That is, we step along the y direction in unit steps and calculate successive x values nearest the line path. Also, we could revise the program to plot pixels starting from either endpoint. If the initial position for a line with positive slope is the right endpoint, both x and y decrease as we step from right to left. To ensure that the same pixels are plotted regardless of the starting endpoint, we always choose the upper (or the lower) of the two candidate pixels whenever the two vertical separations from the line path are equal (dlower = dupper ). For negative slopes, the procedures are similar, except that now one coordinate decreases as the other increases. Finally, special cases can be handled separately: Horizontal lines (y = 0), vertical lines (x = 0), and diagonal lines (|x| = |y|) can each be loaded directly into the frame buffer without processing them through the line-plotting algorithm.
Displaying Polylines Implementation of a polyline procedure is accomplished by invoking a linedrawing routine n − 1 times to display the lines connecting the n endpoints. Each successive call passes the coordinate pair needed to plot the next line section, where the first endpoint of each coordinate pair is the last endpoint of the previous section. Once the color values for pixel positions along the first line segment have been set in the frame buffer, we process subsequent line segments starting with the next pixel position following the first endpoint for that segment. In this way, we can avoid setting the color of some endpoints twice. We discuss methods for avoiding the overlap of displayed objects in more detail in Section 8.
2 Parallel Line Algorithms The line-generating algorithms we have discussed so far determine pixel positions sequentially. Using parallel processing, we can calculate multiple pixel positions along a line path simultaneously by partitioning the computations
138
Implementation Algorithms for Graphics Primitives and Attributes
among the various processors available. One approach to the partitioning problem is to adapt an existing sequential algorithm to take advantage of multiple processors. Alternatively, we can look for other ways to set up the processing so that pixel positions can be calculated efficiently in parallel. An important consideration in devising a parallel algorithm is to balance the processing load among the available processors. Given n p processors, we can set up a parallel Bresenham line algorithm by subdividing the line path into n p partitions and simultaneously generating line segments in each of the subintervals. For a line with slope 0 < m < 1.0 and left endpoint coordinate position (x0 , y0 ), we partition the line along the positive x direction. The distance between beginning x positions of adjacent partitions can be calculated as x p =
x + n p − 1 np
(17)
where x is the width of the line, and the value for partition width x p is computed using integer division. Numbering the partitions, and the processors, as 0, 1, 2, up to n p − 1, we calculate the starting x coordinate for the kth partition as xk = x0 + kx p
(18)
For example, if we have n p = 4 processors, with x = 15, the width of the partitions is 4 and the starting x values for the partitions are x0 , x0 + 4, x0 + 8, and x0 + 12. With this partitioning scheme, the width of the last (rightmost) subinterval will be smaller than the others in some cases. In addition, if the line endpoints are not integers, truncation errors can result in variable-width partitions along the length of the line. To apply Bresenham’s algorithm over the partitions, we need the initial value for the y coordinate and the initial value for the decision parameter in each partition. The change yp in the y direction over each partition is calculated from the line slope m and partition width x p : yp = mx p
(19)
At the kth partition, the starting y coordinate is then yk = y0 + round(kyp )
(20)
The initial decision parameter for Bresenham’s algorithm at the start of the kth subinterval is obtained from Equation 14: pk = (kx p )(2y) − round(kyp )(2x) + 2y − x
(21)
Each processor then calculates pixel positions over its assigned subinterval using the preceding starting decision parameter value and the starting coordinates (xk , yk ). Floating-point calculations can be reduced to integer arithmetic in the computations for starting values yk and pk by substituting m = y/x and rearranging terms. We can extend the parallel Bresenham algorithm to a line with slope greater than 1.0 by partitioning the line in the y direction and calculating beginning x values for the partitions. For negative slopes, we increment coordinate values in one direction and decrement in the other. Another way to set up parallel algorithms on raster systems is to assign each processor to a particular group of screen pixels. With a sufficient number of processors, we can assign each processor to one pixel within some screen region. This
139
Implementation Algorithms for Graphics Primitives and Attributes
yend
⌬y
approach can be adapted to a line display by assigning one processor to each of the pixels within the limits of the coordinate extents of the line and calculating pixel distances from the line path. The number of pixels within the bounding box of a line is x · y (as illustrated in Figure 9). Perpendicular distance d from the line in Figure 9 to a pixel with coordinates (x, y) is obtained with the calculation d = Ax + B y + C
y0
⌬x
x0
where A=
−y linelength
B=
x linelength
C=
x0 y − y0 x linelength
xend
FIGURE 9
Bounding box for a line with endpoint separations x and y .
with linelength =
(22)
x 2 + y2
Once the constants A, B, and C have been evaluated for the line, each processor must perform two multiplications and two additions to compute the pixel distance d. A pixel is plotted if d is less than a specified line thickness parameter. Instead of partitioning the screen into single pixels, we can assign to each processor either a scan line or a column of pixels depending on the line slope. Each processor then calculates the intersection of the line with the horizontal row or vertical column of pixels assigned to that processor. For a line with slope |m| < 1.0, each processor simply solves the line equation for y, given an x column value. For a line with slope magnitude greater than 1.0, the line equation is solved for x by each processor, given a scan line y value. Such direct methods, although slow on sequential machines, can be performed efficiently using multiple processors.
3 Setting Frame-Buffer Values A final stage in the implementation procedures for line segments and other objects is to set the frame-buffer color values. Because scan-conversion algorithms generate pixel positions at successive unit intervals, incremental operations can also be used to access the frame buffer efficiently at each step of the scan-conversion process. As a specific example, suppose the frame buffer array is addressed in rowmajor order and that pixel positions are labeled from (0, 0) at the lower-left corner to (xmax , ymax ) at the top-right corner (Figure 10) of the screen. For a bilevel system (one bit per pixel), the frame-buffer bit address for pixel position (x, y) is calculated as addr(x, y) = addr(0, 0) + y(xmax + 1) + x
(23)
Moving across a scan line, we can calculate the frame-buffer address for the pixel at (x + 1, y) as the following offset from the address for position (x, y): addr(x + 1, y) = addr(x, y) + 1
(24)
Stepping diagonally up to the next scan line from (x, y), we get to the frame-buffer address of (x + 1, y + 1) with the calculation addr(x + 1, y + 1) = addr(x, y) + xmax + 2
140
(25)
Implementation Algorithms for Graphics Primitives and Attributes ymax
… (x, y) 0 0
xmax
(0, 0) (1, 0) (2, 0)
… (xmax, 0)
(xmax, ymax)
(0, 1)
addr (0, 0)
addr (x, y) Frame Buffer
Screen FIGURE 10
Pixel screen positions stored linearly in row-major order within the frame buffer.
where the constant xmax + 2 is precomputed once for all line segments. Similar incremental calculations can be obtained from Equation 23 for unit steps in the negative x and y screen directions. Each of the address calculations involves only a single integer addition. Methods for implementing these procedures depend on the capabilities of a particular system and the design requirements of the software package. With systems that can display a range of intensity values for each pixel, frame-buffer address calculations include pixel width (number of bits), as well as the pixel screen location.
4 Circle-Generating Algorithms Because the circle is a frequently used component in pictures and graphs, a procedure for generating either full circles or circular arcs is included in many graphics packages. In addition, sometimes a general function is available in a graphics library for displaying various kinds of curves, including circles and ellipses.
Properties of Circles
(x, y) r yc
u
xc
A circle (Figure 11) is defined as the set of points that are all at a given distance r from a center position (xc , yc ). For any circle point (x, y), this distance relationship is expressed by the Pythagorean theorem in Cartesian coordinates as (x − xc )2 + (y − yc )2 = r 2
FIGURE 11
Circle with center coordinates (x c , y c ) and radius r .
(26)
We could use this equation to calculate the position of points on a circle circumference by stepping along the x axis in unit steps from xc − r to xc + r and calculating the corresponding y values at each position as y = yc ± r 2 − (xc − x)2 (27) However, this is not the best method for generating a circle. One problem with this approach is that it involves considerable computation at each step. Moreover, the spacing between plotted pixel positions is not uniform, as demonstrated in Figure 12. We could adjust the spacing by interchanging x and y (stepping through y values and calculating x values) whenever the absolute value of the slope of the circle is greater than 1; but this simply increases the computation and processing required by the algorithm. Another way to eliminate the unequal spacing shown in Figure 12 is to calculate points along the circular boundary using polar coordinates r and θ
FIGURE 12
Upper half of a circle plotted with Equation 27 and with (x c , y c ) = (0, 0).
141
Implementation Algorithms for Graphics Primitives and Attributes
(Figure 11). Expressing the circle equation in parametric polar form yields the pair of equations x = xc + r cos θ y = yc + r sin θ
(y, x)
(y, x)
(x, y) 45
(x, y)
(x, y) (y, x)
(x, y)
(y, x)
FIGURE 13
Symmetry of a circle. Calculation of a circle point (x , y ) in one octant yields the circle points shown for the other seven octants.
142
(28)
When a display is generated with these equations using a fixed angular step size, a circle is plotted with equally spaced points along the circumference. To reduce calculations, we can use a large angular separation between points along the circumference and connect the points with straight-line segments to approximate the circular path. For a more continuous boundary on a raster display, we can set the angular step size at r1 . This plots pixel positions that are approximately one unit apart. Although polar coordinates provide equal point spacing, the trigonometric calculations are still time-consuming. For any of the previous circle-generating methods, we can reduce computations by considering the symmetry of circles. The shape of the circle is similar in each quadrant. Therefore, if we determine the curve positions in the first quadrant, we can generate the circle section in the second quadrant of the xy plane by noting that the two circle sections are symmetric with respect to the y axis. Also, circle sections in the third and fourth quadrants can be obtained from sections in the first and second quadrants by considering symmetry about the x axis. We can take this one step further and note that there is also symmetry between octants. Circle sections in adjacent octants within one quadrant are symmetric with respect to the 45◦ line dividing the two octants. These symmetry conditions are illustrated in Figure 13, where a point at position (x, y) on a one-eighth circle sector is mapped into the seven circle points in the other octants of the xy plane. Taking advantage of the circle symmetry in this way, we can generate all pixel positions around a circle by calculating only the points within the sector from x = 0 to x = y. The slope of the curve in this octant has a magnitude less than or equal to 1.0. At x = 0, the circle slope is 0, and at x = y, the slope is −1.0. Determining pixel positions along a circle circumference using symmetry and either Equation 26 or Equation 28 still requires a good deal of computation. The Cartesian equation 26 involves multiplications and square-root calculations, while the parametric equations contain multiplications and trigonometric calculations. More efficient circle algorithms are based on incremental calculation of decision parameters, as in the Bresenham line algorithm, which involves only simple integer operations. Bresenham’s line algorithm for raster displays is adapted to circle generation by setting up decision parameters for finding the closest pixel to the circumference at each sampling step. The circle equation 26, however, is nonlinear, so that square-root evaluations would be required to compute pixel distances from a circular path. Bresenham’s circle algorithm avoids these square-root calculations by comparing the squares of the pixel separation distances. However, it is possible to perform a direct distance comparison without a squaring operation. The basic idea in this approach is to test the halfway position between two pixels to determine if this midpoint is inside or outside the circle boundary. This method is applied more easily to other conics; and for an integer circle radius, the midpoint approach generates the same pixel positions as the Bresenham circle algorithm. For a straight-line segment, the midpoint method is equivalent to the Bresenham line algorithm. Also, the error involved in locating pixel positions along any conic section using the midpoint test is limited to half the pixel separation.
Implementation Algorithms for Graphics Primitives and Attributes
Midpoint Circle Algorithm As in the raster line algorithm, we sample at unit intervals and determine the closest pixel position to the specified circle path at each step. For a given radius r and screen center position (xc , yc ), we can first set up our algorithm to calculate pixel positions around a circle path centered at the coordinate origin (0, 0). Then each calculated position (x, y) is moved to its proper screen position by adding xc to x and yc to y. Along the circle section from x = 0 to x = y in the first quadrant, the slope of the curve varies from 0 to −1.0. Therefore, we can take unit steps in the positive x direction over this octant and use a decision parameter to determine which of the two possible pixel positions in any column is vertically closer to the circle path. Positions in the other seven octants are then obtained by symmetry. To apply the midpoint method, we define a circle function as f circ (x, y) = x 2 + y2 − r 2
(29)
Any point (x, y) on the boundary of the circle with radius r satisfies the equation f circ (x, y) = 0. If the point is in the interior of the circle, the circle function is negative; and if the point is outside the circle, the circle function is positive. To summarize, the relative position of any point (x, y) can be determined by checking the sign of the circle function as follows: ⎧ ⎪ ⎨< 0, if (x, y) is inside the circle boundary f circ (x, y) = 0, if (x, y) is on the circle boundary (30) ⎪ ⎩ > 0, if (x, y) is outside the circle boundary The tests in 30 are performed for the midpositions between pixels near the circle path at each sampling step. Thus, the circle function is the decision parameter in the midpoint algorithm, and we can set up incremental calculations for this function as we did in the line algorithm. Figure 14 shows the midpoint between the two candidate pixels at sampling position xk + 1. Assuming that we have just plotted the pixel at (xk , yk ), we next need to determine whether the pixel at position (xk + 1, yk ) or the one at position (xk + 1, yk − 1) is closer to the circle. Our decision parameter is the circle function 29 evaluated at the midpoint between these two pixels: 1 pk = f circ xk + 1, yk − 2 1 2 2 = (xk + 1) + yk − − r2 (31) 2
x2 y2 r 2 0
yk yk 1
Midpoint
xk
xk 1 x k 2
FIGURE 14
Midpoint between candidate pixels at sampling position x k + 1 along a circular path.
If pk < 0, this midpoint is inside the circle and the pixel on scan line yk is closer to the circle boundary. Otherwise, the midposition is outside or on the circle boundary, and we select the pixel on scan line yk − 1. Successive decision parameters are obtained using incremental calculations. We obtain a recursive expression for the next decision parameter by evaluating the circle function at sampling position xk+1 + 1 = xk + 2: 1 pk+1 = f circ xk+1 + 1, yk+1 − 2 1 2 = [(xk + 1) + 1]2 + yk+1 − − r2 2 or
2 pk+1 = pk + 2(xk + 1) + yk+1 − yk2 − (yk+1 − yk ) + 1 (32) where yk+1 is either yk or yk − 1, depending on the sign of pk .
143
Implementation Algorithms for Graphics Primitives and Attributes
Increments for obtaining pk+1 are either 2xk+1 + 1 (if pk is negative) or 2xk+1 + 1−2yk+1 . Evaluation of the terms 2xk+1 and 2yk+1 can also be done incrementally as 2xk+1 = 2xk + 2 2yk+1 = 2yk − 2 At the start position (0, r ), these two terms have the values 0 and 2r , respectively. Each successive value for the 2xk+1 term is obtained by adding 2 to the previous value, and each successive value for the 2yk+1 term is obtained by subtracting 2 from the previous value. The initial decision parameter is obtained by evaluating the circle function at the start position (x0 , y0 ) = (0, r ): 1 p0 = f circ 1, r − 2 2 1 = 1+ r − − r2 2 or 5 p0 = − r (33) 4 If the radius r is specified as an integer, we can simply round p0 to p0 = 1 − r
(for r an integer)
because all increments are integers. As in Bresenham’s line algorithm, the midpoint method calculates pixel positions along the circumference of a circle using integer additions and subtractions, assuming that the circle parameters are specified in integer screen coordinates. We can summarize the steps in the midpoint circle algorithm as follows:
Midpoint Circle Algorithm 1. Input radius r and circle center (xc , yc ), then set the coordinates for the first point on the circumference of a circle centered on the origin as (x0 , y0 ) = (0, r ) 2. Calculate the initial value of the decision parameter as 5 −r 4 3. At each xk position, starting at k = 0, perform the following test: If pk < 0, the next point along the circle centered on (0, 0) is (xk+1 , yk ) and p0 =
pk+1 = pk + 2xk+1 + 1 Otherwise, the next point along the circle is (xk + 1, yk − 1) and pk+1 = pk + 2xk+1 + 1 − 2yk+1 where 2xk+1 = 2xk + 2 and 2yk+1 = 2yk − 2. 4. Determine symmetry points in the other seven octants. 5. Move each calculated pixel position (x, y) onto the circular path centered at (xc , yc ) and plot the coordinate values as follows: x = x + xc , 6. Repeat steps 3 through 5 until x ≥ y.
144
y = y + yc
Implementation Algorithms for Graphics Primitives and Attributes
EXAMPLE 2
Midpoint Circle Drawing
Given a circle radius r = 10, we demonstrate the midpoint circle algorithm by determining positions along the circle octant in the first quadrant from x = 0 to x = y. The initial value of the decision parameter is p0 = 1 − r = −9 For the circle centered on the coordinate origin, the initial point is (x0 , y0 ) = (0, 10), and initial increment terms for calculating the decision parameters are 2x0 = 0,
2y0 = 20
Successive midpoint decision parameter values and the corresponding coordinate positions along the circle path are listed in the following table: k
pk
(xk+1 , yk+1 )
2xk+1
2yk+1
0 1 2 3 4 5 6
−9 −6 −1 6 −3 8 5
(1, 10) (2, 10) (3, 10) (4, 9) (5, 9) (6, 8) (7, 7)
2 4 6 8 10 12 14
20 20 20 18 18 16 14
A plot of the generated pixel positions in the first quadrant is shown in Figure 15. yx
y 10 9 8 7 6 5 4 3
FIGURE 15
2 1 0 0
1
2
3
4
5
6
7
8
9 10
x
Pixel positions (solid circles) along a circle path centered on the origin and with radius r = 10, as calculated by the midpoint circle algorithm. Open (“hollow”) circles show the symmetry positions in the first quadrant.
The following code segment illustrates procedures that could be used to implement the midpoint circle algorithm. Values for a circle radius and for the center coordinates of the circle are passed to procedure circleMidpoint. A pixel position along the circular path in the first octant is then computed and passed to procedure circlePlotPoints. This procedure sets the circle color in the frame buffer for all circle symmetry positions with repeated calls to the setPixel routine, which is implemented with the OpenGL point-plotting functions.
145
Implementation Algorithms for Graphics Primitives and Attributes
#include class screenPt { private: GLint x, y; public: /* Default Constructor: initializes coordinate position to (0, 0). screenPt ( ) { x = y = 0; } void setCoords (GLint xCoordValue, GLint yCoordValue) { x = xCoordValue; y = yCoordValue; } GLint getx ( ) const return x; }
*/
{
GLint gety ( ) const { return y; } void incrementx ( ) { x++; } void decrementy ( ) { y--; } }; void setPixel (GLint xCoord, GLint yCoord) { glBegin (GL_POINTS); glVertex2i (xCoord, yCoord); glEnd ( ); } void circleMidpoint (GLint xc, GLint yc, GLint radius) { screenPt circPt; GLint p = 1 - radius;
//
circPt.setCoords (0, radius); //
Initial value for midpoint parameter. Set coordinates for top point of circle.
void circlePlotPoints (GLint, GLint, screenPt); /* Plot the initial point in each circle quadrant. */ circlePlotPoints (xc, yc, circPt); /* Calculate next point and plot in each octant. */
146
Implementation Algorithms for Graphics Primitives and Attributes
while (circPt.getx ( ) < circPt.gety ( )) { circPt.incrementx ( ); if (p < 0) p += 2 * circPt.getx ( ) + 1; else { circPt.decrementy ( ); p += 2 * (circPt.getx ( ) - circPt.gety ( )) + 1; } circlePlotPoints (xc, yc, circPt); } } void circlePlotPoints (GLint xc, { setPixel (xc + circPt.getx ( setPixel (xc - circPt.getx ( setPixel (xc + circPt.getx ( setPixel (xc - circPt.getx ( setPixel (xc + circPt.gety ( setPixel (xc - circPt.gety ( setPixel (xc + circPt.gety ( setPixel (xc - circPt.gety ( }
GLint yc, screenPt circPt) ), ), ), ), ), ), ), ),
yc yc yc yc yc yc yc yc
+ + + + -
circPt.gety circPt.gety circPt.gety circPt.gety circPt.getx circPt.getx circPt.getx circPt.getx
( ( ( ( ( ( ( (
)); )); )); )); )); )); )); ));
5 Ellipse-Generating Algorithms
y
Loosely stated, an ellipse is an elongated circle. We can also describe an ellipse as a modified circle whose radius varies from a maximum value in one direction to a minimum value in the perpendicular direction. The straight-line segments through the interior of the ellipse in these two perpendicular directions are referred to as the major and minor axes of the ellipse.
d1 F1 P = (x, y) F2
Properties of Ellipses A precise definition of an ellipse can be given in terms of the distances from any point on the ellipse to two fixed positions, called the foci of the ellipse. The sum of these two distances is the same value for all points on the ellipse (Figure 16). If the distances to the two focus positions from any point P = (x, y) on the ellipse are labeled d1 and d2 , then the general equation of an ellipse can be stated as d1 + d2 = constant
d2
x FIGURE 16
Ellipse generated about foci F1 and F2 .
(34)
Expressing distances d1 and d2 in terms of the focal coordinates F1 = (x1 , y1 ) and F2 = (x2 , y2 ), we have (x − x1 )2 + (y − y1 )2 + (x − x2 )2 + (y − y2 )2 = constant (35) By squaring this equation, isolating the remaining radical, and squaring again, we can rewrite the general ellipse equation in the form A x 2 + B y2 + C x y + D x + E y + F = 0
(36)
147
Implementation Algorithms for Graphics Primitives and Attributes y
ry
yc
rx
x
xc
FIGURE 17
Ellipse centered at (x c , y c ) with semimajor axis r x and semiminor axis r y .
r rx u
y yc
where the coefficients A, B, C, D, E, and F are evaluated in terms of the focal coordinates and the dimensions of the major and minor axes of the ellipse. The major axis is the straight-line segment extending from one side of the ellipse to the other through the foci. The minor axis spans the shorter dimension of the ellipse, perpendicularly bisecting the major axis at the halfway position (ellipse center) between the two foci. An interactive method for specifying an ellipse in an arbitrary orientation is to input the two foci and a point on the ellipse boundary. With these three coordinate positions, we can evaluate the constant in Equation 35. Then, the values for the coefficients in Equation 36 can be computed and used to generate pixels along the elliptical path. Ellipse equations are greatly simplified if the major and minor axes are oriented to align with the coordinate axes. In Figure 17, we show an ellipse in “standard position,” with major and minor axes oriented parallel to the x and y axes. Parameter r x for this example labels the semimajor axis, and parameter r y labels the semiminor axis. The equation for the ellipse shown in Figure 17 can be written in terms of the ellipse center coordinates and parameters r x and r y as x − xc 2 y − yc 2 + =1 (37) rx ry Using polar coordinates r and θ, we can also describe the ellipse in standard position with the parametric equations x = xc + r x cos θ y = yc + r y sin θ
xc
x
FIGURE 18
The bounding circle and eccentric angle θ for an ellipse with r x > r y .
(38)
Angle θ, called the eccentric angle of the ellipse, is measured around the perimeter of a bounding circle. If r x > r y , the radius of the bounding circle is r = r x (Figure 18). Otherwise, the bounding circle has radius r = ry . As with the circle algorithm, symmetry considerations can be used to reduce computations. An ellipse in standard position is symmetric between quadrants, but, unlike a circle, it is not symmetric between the two octants of a quadrant. Thus, we must calculate pixel positions along the elliptical arc throughout one quadrant, then use symmetry to obtain curve positions in the remaining three quadrants (Figure 19).
Midpoint Ellipse Algorithm (x, y)
(x, y) ry rx
(x, y)
(x, y)
FIGURE 19
Symmetry of an ellipse. Calculation of a point (x , y ) in one quadrant yields the ellipse points shown for the other three quadrants.
148
Our approach here is similar to that used in displaying a raster circle. Given parameters r x , r y , and (xc , yc ), we determine curve positions (x, y) for an ellipse in standard position centered on the origin, then we shift all the points using a fixed offset so that the ellipse is centered at (xc , yc ). If we wish also to display the ellipse in nonstandard position, we could rotate the ellipse about its center coordinates to reorient the major and minor axes in the desired directions. For the present, we consider only the display of ellipses in standard position. The midpoint ellipse method is applied throughout the first quadrant in two parts. Figure 20 shows the division of the first quadrant according to the slope of an ellipse with r x < r y . We process this quadrant by taking unit steps in the x direction where the slope of the curve has a magnitude less than 1.0, and then we take unit steps in the y direction where the slope has a magnitude greater than 1.0.
Implementation Algorithms for Graphics Primitives and Attributes
Regions 1 and 2 (Figure 20) can be processed in various ways. We can start at position (0, r y ) and step clockwise along the elliptical path in the first quadrant, shifting from unit steps in x to unit steps in y when the slope becomes less than −1.0. Alternatively, we could start at (r x , 0) and select points in a counterclockwise order, shifting from unit steps in y to unit steps in x when the slope becomes greater than −1.0. With parallel processors, we could calculate pixel positions in the two regions simultaneously. As an example of a sequential implementation of the midpoint algorithm, we take the start position at (0, r y ) and step along the ellipse path in clockwise order throughout the first quadrant. We define an ellipse function from Equation 37 with (x , cy ) c= (0, 0) as f ellipse (x, y) = r y2 x 2 + r x2 y2 − r x2r y2 which has the following properties: ⎧ ⎪ ⎨< 0, if (x, y) is inside the ellipse boundary f ellipse (x, y) = 0, if (x, y) is on the ellipse boundary ⎪ ⎩ > 0, if (x, y) is outside the ellipse boundary
(39)
y Slope 1 ry
Region 1 Region 2
rx
x
FIGURE 20
Ellipse processing regions. Over region 1, the magnitude of the ellipse slope is less than 1.0; over region 2, the magnitude of the slope is greater than 1.0.
(40)
Thus, the ellipse function f ellipse (x, y) serves as the decision parameter in the midpoint algorithm. At each sampling position, we select the next pixel along the ellipse path according to the sign of the ellipse function evaluated at the midpoint between the two candidate pixels. Starting at (0, r y ), we take unit steps in the x direction until we reach the boundary between region 1 and region 2 (Figure 20). Then we switch to unit steps in the y direction over the remainder of the curve in the first quadrant. At each step we need to test the value of the slope of the curve. The ellipse slope is calculated from Equation 39 as 2r y2 x dy =− 2 dx 2r x y
(41)
ry2x2 rx2y2 rx2ry2 0
At the boundary between region 1 and region 2, dy/d x = −1.0 and yk
2r y2 x = 2r x2 y
yk 1
Therefore, we move out of region 1 whenever 2r y2 x ≥ 2r x2 y
(42)
Figure 21 shows the midpoint between the two candidate pixels at sampling position xk +1 in the first region. Assuming position (xk , yk ) has been selected in the previous step, we determine the next position along the ellipse path by evaluating the decision parameter (that is, the ellipse function 39) at this midpoint: 1 p1k = f ellipse xk + 1, yk − 2 1 2 2 2 2 = r y (xk + 1) + r x yk − − r x2r y2 (43) 2
Midpoint
xk
xk 1
FIGURE 21
Midpoint between candidate pixels at sampling position x k + 1 along an elliptical path.
If p1k < 0, the midpoint is inside the ellipse and the pixel on scan line yk is closer to the ellipse boundary. Otherwise, the midposition is outside or on the ellipse boundary, and we select the pixel on scan line yk − 1.
149
Implementation Algorithms for Graphics Primitives and Attributes
At the next sampling position (xk+1 + 1 = xk + 2), the decision parameter for region 1 is evaluated as 1 p1k+1 = f ellipse xk+1 + 1, yk+1 − 2 1 2 2 2 2 = r y [(xk + 1) + 1] + r x yk+1 − − r x2r y2 2 or 1 2 1 2 2 2 2 p1k+1 = p1k + 2r y (xk + 1) + r y + r x yk+1 − − yk − (44) 2 2 where yk+1 is either yk or yk − 1, depending on the sign of p1k . Decision parameters are incremented by the following amounts: ⎧ ⎨2r y2 xk+1 + r y2 , if p1k < 0 increment = ⎩2r 2 x + r 2 − 2r 2 y , if p1 ≥ 0 k y k+1 y x k+1 Increments for the decision parameters can be calculated using only addition and subtraction, as in the circle algorithm, because values for the terms 2r y2 x and 2r x2 y can be obtained incrementally. At the initial position (0, r y ), these two terms evaluate to 2r y2 x = 0 2r x2 y
ry2x2 rx2y2 rx2ry2 0 yk yk 1
Midpoint xk
xk 1 xk 2
FIGURE 22
Midpoint between candidate pixels at sampling position y k − 1 along an elliptical path.
=
2r x2r y
(45) (46)
As x and y are incremented, updated values are obtained by adding 2r y2 to the current value of the increment term in Equation 45 and subtracting 2r x2 from the current value of the increment term in Equation 46. The updated increment values are compared at each step, and we move from region 1 to region 2 when condition 42 is satisfied. In region 1, the initial value of the decision parameter is obtained by evaluating the ellipse function at the start position (x0 , y0 ) = (0, r y ): 1 p10 = f ellipse 1, r y − 2 2 1 = r y2 + r x2 r y − − r x2r y2 2 or 1 p10 = r y2 − r x2r y + r x2 (47) 4 Over region 2, we sample at unit intervals in the negative y direction, and the midpoint is now taken between horizontal pixels at each step (Figure 22). For this region, the decision parameter is evaluated as 1 p2k = f ellipse xk + , yk − 1 2 2 1 = r y2 xk + + r x2 (yk − 1)2 − r x2r y2 (48) 2 If p2k > 0, the midposition is outside the ellipse boundary, and we select the pixel at xk . If p2k ≤ 0, the midpoint is inside or on the ellipse boundary, and we select pixel position xk+1 .
150
Implementation Algorithms for Graphics Primitives and Attributes
To determine the relationship between successive decision parameters in region 2, we evaluate the ellipse function at the next sampling step yk+1 − 1 = yk − 2: 1 p2k+1 = f ellipse xk+1 + , yk+1 − 1 2 2 1 = r y2 xk+1 + + r x2 [(yk − 1) − 1]2 − r x2r y2 (49) 2 or 1 2 1 2 2 2 2 p2k+1 = p2k − 2r x (yk − 1) + r x + r y xk+1 + − xk + (50) 2 2 with xk+1 set either to xk or to xk + 1, depending on the sign of p2k . When we enter region 2, the initial position (x0 , y0 ) is taken as the last position selected in region 1 and the initial decision parameter in region 2 is then 1 p20 = f ellipse x0 + , y0 − 1 2 2 1 = r y2 x0 + + r x2 (y0 − 1)2 − r x2r y2 (51) 2 To simplify the calculation of p20 , we could select pixel positions in counterclockwise order starting at (r x , 0). Unit steps would then be taken in the positive y direction up to the last position selected in region 1. This midpoint algorithm can be adapted to generate an ellipse in nonstandard position using the ellipse function Equation 36 and calculating pixel positions over the entire elliptical path. Alternatively, we could reorient the ellipse axes to standard position, apply the midpoint ellipse algorithm to determine curve positions, and then convert calculated pixel positions to path positions along the original ellipse orientation. Assuming r x , r y , and the ellipse center are given in integer screen coordinates, we need only incremental integer calculations to determine values for the decision parameters in the midpoint ellipse algorithm. The increments r x2 , r y2 , 2r x2 , and 2r y2 are evaluated once at the beginning of the procedure. In the following summary, we list the steps for displaying an ellipse using the midpoint algorithm:
Midpoint Ellipse Algorithm 1. Input r x , r y , and ellipse center (xc , yc ), and obtain the first point on an ellipse centered on the origin as (x0 , y0 ) = (0, r y ) 2. Calculate the initial value of the decision parameter in region 1 as 1 p10 = r y2 − r x2r y + r x2 4 3. At each xk position in region 1, starting at k = 0, perform the following test: If p1k < 0, the next point along the ellipse centered on (0, 0) is (xk+1 , yk ) and p1k+1 = p1k + 2r y2 xk+1 + r y2 Otherwise, the next point along the ellipse is (xk + 1, yk − 1) and p1k+1 = p1k + 2r y2 xk+1 − 2r x2 yk+1 + r y2
151
Implementation Algorithms for Graphics Primitives and Attributes
with 2r y2 xk+1 = 2r y2 xk + 2r y2 ,
2r x2 yk+1 = 2r x2 yk − 2r x2
and continue until 2r y2 x ≥ 2r x2 y. 4. Calculate the initial value of the decision parameter in region 2 as 1 2 2 p20 = r y x0 + + r x2 (y0 − 1)2 − r x2r y2 2 where (x0 , y0 ) is the last position calculated in region 1. 5. At each yk position in region 2, starting at k = 0, perform the following test: If p2k > 0, the next point along the ellipse centered on (0, 0) is (xk , yk − 1) and p2k+1 = p2k − 2r x2 yk+1 + r x2 Otherwise, the next point along the ellipse is (xk + 1, yk − 1) and p2k+1 = p2k + 2r y2 xk+1 − 2r x2 yk+1 + r x2 using the same incremental calculations for x and y as in region 1. Continue until y = 0. 6. For both regions, determine symmetry points in the other three quadrants. 7. Move each calculated pixel position (x, y) onto the elliptical path centered on (xc , yc ) and plot these coordinate values: x = x + xc , EXAMPLE 3
y = y + yc
Midpoint Ellipse Drawing
Given input ellipse parameters r x = 8 and r y = 6, we illustrate the steps in the midpoint ellipse algorithm by determining raster positions along the ellipse path in the first quadrant. Initial values and increments for the decision parameter calculations are 2r y2 x = 0
(with increment 2r y2 = 72)
2r x2 y = 2r x2r y
(with increment −2r x2 = −128)
For region 1, the initial point for the ellipse centered on the origin is (x0 , y0 ) = (0, 6), and the initial decision parameter value is 1 p10 = r y2 − r x2r y + r x2 = −332 4 Successive midpoint decision-parameter values and the pixel positions along the ellipse are listed in the following table:
152
k
p1k
0 1 2 3 4 5 6
−332 −224 −44 208 −108 288 244
(xk+1 , yk+1 )
2r y2 xk+1
2r x2 yk+1
(1, 6) (2, 6) (3, 6) (4, 5) (5, 5) (6, 4) (7, 3)
72 144 216 288 360 432 504
768 768 768 640 640 512 384
Implementation Algorithms for Graphics Primitives and Attributes
We now move out of region 1 because 2r y2 x > 2r x2 y. For region 2, the initial point is (x0 , y0 ) = (7, 3) and the initial decision parameter is 1 p20 = f ellipse 7 + , 2 = −151 2 The remaining positions along the ellipse path in the first quadrant are then calculated as k
p1k
0 1 2
−151 233 745
(xk+1 , yk+1 )
2r y2 xk+1
2r x2 yk+1
(8, 2) (8, 1) (8, 0)
576 576 —
256 128 —
A plot of the calculated positions for the ellipse within the first quadrant is shown in Figure 23. 6 5 4 3 2 FIGURE 23
1 0 0
1
2
3
4
5
6
7
8
Pixel positions along an elliptical path centered on the origin with r x = 8 and r y = 6, using the midpoint algorithm to calculate locations within the first quadrant.
In the following code segment, example procedures are given for implementing the midpoint ellipse algorithm. Values for the ellipse parameters Rx, Ry, xCenter, and yCenter are input to procedure ellipseMidpoint. Positions along the curve in the first quadrant are then calculated and passed to procedure ellipsePlotPoints. Symmetry is used to obtain ellipse positions in the other three quadrants, and the setPixel routine sets the ellipse color in the framebuffer locations corresponding to these positions. inline int round (const float a)
{ return int (a + 0.5); }
/* The following procedure accepts values for an ellipse * center position and its semimajor and semiminor axes, then * calculates ellipse positions using the midpoint algorithm. */ void ellipseMidpoint (int xCenter, int yCenter, int Rx, int Ry) { int Rx2 = Rx * Rx; int Ry2 = Ry * Ry; int twoRx2 = 2 * Rx2; int twoRy2 = 2 * Ry2; int p; int x = 0; int y = Ry; int px = 0; int py = twoRx2 * y; void ellipsePlotPoints (int, int, int, int);
153
Implementation Algorithms for Graphics Primitives and Attributes
/* Plot the initial point in each quadrant. */ ellipsePlotPoints (xCenter, yCenter, x, y); /* Region 1 */ p = round (Ry2 - (Rx2 * Ry) + (0.25 * Rx2)); while (px < py) { x++; px += twoRy2; if (p < 0) p += Ry2 + px; else { y--; py -= twoRx2; p += Ry2 + px - py; } ellipsePlotPoints (xCenter, yCenter, x, y); } /* Region 2 */ p = round (Ry2 * (x+0.5) * (x+0.5) + Rx2 * (y-1) * (y-1) - Rx2 * Ry2); while (y > 0) { y--; py -= twoRx2; if (p > 0) p += Rx2 - py; else { x++; px += twoRy2; p += Rx2 - py + px; } ellipsePlotPoints (xCenter, yCenter, x, y); } } void ellipsePlotPoints { setPixel (xCenter + setPixel (xCenter setPixel (xCenter + setPixel (xCenter }
(int xCenter, int yCenter, int x, int y); x, x, x, x,
yCenter yCenter yCenter yCenter
+ + -
y); y); y); y);
6 Other Curves Various curve functions are useful in object modeling, animation path specifications, data and function graphing, and other graphics applications. Commonly encountered curves include conics, trigonometric and exponential functions, probability distributions, general polynomials, and spline functions. Displays of these curves can be generated with methods similar to those discussed for the circle and ellipse functions. We can obtain positions along curve paths directly from explicit representations y = f (x) or from parametric forms. Alternatively, we could apply the incremental midpoint method to plot curves described with implicit functions f (x, y) = 0.
154
Implementation Algorithms for Graphics Primitives and Attributes
A simple method for displaying a curved line is to approximate it with straight-line segments. Parametric representations are often useful in this case for obtaining equally spaced positions along the curve path for the line endpoints. We can also generate equally spaced positions from an explicit representation by choosing the independent variable according to the slope of the curve. Where the slope of y = f (x) has a magnitude less than 1, we choose x as the independent variable and calculate y values at equal x increments. To obtain equal spacing where the slope has a magnitude greater than 1, we use the inverse function, x = f −1 (y), and calculate values of x at equal y steps. Straight-line or curve approximations are used to generate a line graph for a set of discrete data values. We could join the discrete points with straightline segments, or we could use linear regression (least squares) to approximate the data set with a single straight line. A nonlinear least-squares approach is used to display the data set with some approximating function, usually a polynomial. As with circles and ellipses, many functions possess symmetries that can be exploited to reduce the computation of coordinate positions along curve paths. For example, the normal probability distribution function is symmetric about a center position (the mean), and all points within one cycle of a sine curve can be generated from the points in a 90◦ interval.
Conic Sections In general, we can describe a conic section (or conic) with the second-degree equation y
A x 2 + B y2 + C x y + D x + E y + F = 0
(52)
where the values for parameters A, B, C, D, E, and F determine the kind of curve that we are to display. Given this set of coefficients, we can determine the particular conic that will be generated by evaluating the discriminant B 2 − 4AC: ⎧ ⎪ ⎨< 0, generates an ellipse (or circle) 2 B − 4AC = 0, generates a parabola (53) ⎪ ⎩ > 0, generates a hyperbola For example, we get the circle equation 26 when A = B = 1, C = 0, D = −2xc , E = −2yc , and F = xc2 + yc2 − r 2 . Equation 52 also describes the “degenerate” conics: points and straight lines. In some applications, circular and elliptical arcs are conveniently specified with the beginning and ending angular values for the arc, as illustrated in Figure 24. Such arcs are sometimes defined by their endpoint coordinate positions. For either case, we could generate the arc with a modified midpoint method, or we could display a set of approximating straight-line segments. Ellipses, hyperbolas, and parabolas are particularly useful in certain animation applications. These curves describe orbital and other motions for objects subjected to gravitational, electromagnetic, or nuclear forces. Planetary orbits in the solar system, for example, are approximated with ellipses; and an object projected into a uniform gravitational field travels along a parabolic trajectory. Figure 25 shows a parabolic path in standard position for a gravitational field acting in the negative y direction. The explicit equation for the parabolic trajectory of the object shown can be written as y = y0 + a (x − x0 )2 + b(x − x0 )
(54)
r u1
u2 x
FIGURE 24
A circular arc, centered on the origin, defined with beginning angle θ1 , ending angle θ2 , and radius r .
y0
v0
g
x0 FIGURE 25
Parabolic path of an object tossed into a downward gravitational field at the initial position (x 0 , y 0 ).
155
Implementation Algorithms for Graphics Primitives and Attributes
with constants a and b determined by the initial velocity v0 of the object and the acceleration g due to the uniform gravitational force. We can also describe such parabolic motions with parametric equations using a time parameter t, measured in seconds from the initial projection point: x = x0 + vx0 t
y ry y r x x Left Branch rx
Right Branch
ry
ry
rx
x
FIGURE 26
Left and right branches of a hyperbola in standard position with the symmetry axis along the x axis.
(55) 1 y = y0 + v y0 t − gt 2 2 Here, vx0 and v y0 are the initial velocity components, and the value of g near the surface of the earth is approximately 980 cm/sec2 . Object positions along the parabolic path are then calculated at selected time steps. Hyperbolic curves (Figure 26) are useful in various scientific-visualization applications. Motions of objects along hyperbolic paths occur in connection with the collision of charged particles and in certain gravitational problems. For example, comets or meteorites moving around the sun may travel along hyperbolic paths and escape to outer space, never to return. The particular branch (left or right, in Figure 26) describing the motion of an object depends on the forces involved in the problem. We can write the standard equation for the hyperbola centered on the origin in Figure 26 as 2 2 x y − =1 (56) rx ry
with x ≤ −r x for the left branch and x ≥ r x for the right branch. Because this equation differs from the standard ellipse equation 39 only in the sign between the x 2 and y2 terms, we can generate points along a hyperbolic path with a slightly modified ellipse algorithm. Parabolas and hyperbolas possess a symmetry axis. For example, the parabola described by Equation 55 is symmetric about the axis x = x0 + vx0 v y0 /g The methods used in the midpoint ellipse algorithm can be applied directly to obtain points along one side of the symmetry axis of hyperbolic and parabolic paths in the two regions: (1) where the magnitude of the curve slope is less than 1, and (2) where the magnitude of the slope is greater than 1. To do this, we first select the appropriate form of Equation 52 and then use the selected function to set up expressions for the decision parameters in the two regions.
Polynomials and Spline Curves A polynomial function of nth degree in x is defined as y=
n
a k xk
k=0
= a 0 + a 1 x + · · · + a n−1 x n−1 + a n x n
(57)
where n is a nonnegative integer and the a k are constants, with a n = 0. We obtain a quadratic curve when n = 2, a cubic polynomial when n = 3, a quartic curve when n = 4, and so forth. We have a straight line when n = 1. Polynomials are useful in a number of graphics applications, including the design of object shapes, the specification of animation paths, and the graphing of data trends in a discrete set of data points. Designing object shapes or motion paths is typically accomplished by first specifying a few points to define the general curve contour, then the selected points are fitted with a polynomial. One way to accomplish the curve fitting is to
156
Implementation Algorithms for Graphics Primitives and Attributes
construct a cubic polynomial curve section between each pair of specified points. Each curve section is then described in parametric form as x = a x0 + a x1 u + a x2 u2 + a x3 u3 y = a y0 + a y1 u + a y2 u2 + a y3 u3
(58)
where parameter u varies over the interval from 0 to 1.0. Values for the coefficients of u in the preceding equations are determined from boundary conditions on the curve sections. One boundary condition is that two adjacent curve sections have the same coordinate position at the boundary, and a second condition is to match the two curve slopes at the boundary so that we obtain one continuous, smooth curve (Figure 27). Continuous curves that are formed with polynomial pieces are called spline curves, or simply splines.
FIGURE 27
A spline curve formed with individual cubic polynomial sections between specified coordinate positions.
7 Parallel Curve Algorithms Methods for exploiting parallelism in curve generation are similar to those used in displaying straight-line segments. We can either adapt a sequential algorithm by allocating processors according to curve partitions, or we could devise other methods and assign processors to screen partitions. A parallel midpoint method for displaying circles is to divide the circular arc from 45◦ to 90◦ into equal subarcs and assign a separate processor to each subarc. As in the parallel Bresenham line algorithm, we then need to set up computations to determine the beginning y value and decision parameter pk value for each processor. Pixel positions are calculated throughout each subarc, and positions in the other circle octants can be obtained by symmetry. Similarly, a parallel ellipse midpoint method divides the elliptical arc over the first quadrant into equal subarcs and parcels these out to separate processors. Again, pixel positions in the other quadrants are determined by symmetry. A screen-partitioning scheme for circles and ellipses is to assign each scan line that crosses the curve to a separate processor. In this case, each processor uses the circle or ellipse equation to calculate curve intersection coordinates. For the display of elliptical arcs or other curves, we can simply use the scanline partitioning method. Each processor uses the curve equation to locate the intersection positions along its assigned scan line. With processors assigned to individual pixels, each processor would calculate the distance (or distance squared) from the curve to its assigned pixel. If the calculated distance is less than a predefined value, the pixel is plotted.
8 Pixel Addressing and Object Geometry In discussing the raster algorithms for displaying graphics primitives, we assumed that frame-buffer coordinates referenced the center of a screen pixel position. We now consider the effects of different addressing schemes and an alternate pixel-addressing method used by some graphics packages, including OpenGL. An object description that is input to a graphics program is given in terms of precise world-coordinate positions, which are infinitesimally small mathematical points. However, when the object is scan-converted into the frame buffer, the input description is transformed to pixel coordinates which reference finite screen
157
Implementation Algorithms for Graphics Primitives and Attributes
areas, and the displayed raster image may not correspond exactly with the relative dimensions of the input object. If it is important to preserve the specified geometry of world objects, we can compensate for the mapping of mathematical input points to finite pixel areas. One way to do this is simply to adjust the pixel dimensions of displayed objects so as to correspond to the dimensions given in the original mathematical description of the scene. For example, if a rectangle is specified as having a width of 40 cm, then we could adjust the screen display so that the rectangle has a width of 40 pixels, with the width of each pixel representing one centimeter. Another approach is to map world coordinates onto screen positions between pixels, so that we align object boundaries with pixel boundaries instead of pixel centers.
5 4 3 2 1 0
0
1
2
3
4
5
FIGURE 28
Lower-left section of a screen area with coordinate positions referenced by grid intersection lines.
7 6 5 4 3 2 1 0
0
1
2
3
4
5
6
7
FIGURE 29
Illuminated pixel at raster position (4, 5).
4 3 2 1 0
1
2
3
4
FIGURE 30
Line path for two connected line segments between screen grid-coordinate positions.
158
Figure 28 shows a screen section with grid lines marking pixel boundaries, one unit apart. In this scheme, a screen position is given as the pair of integer values identifying a grid-intersection position between two pixels. The address for any pixel is now at its lower-left corner, as illustrated in Figure 29. A straight-line path is now envisioned as between grid intersections. For example, the mathematical line path for a polyline with endpoint coordinates (0, 0), (5, 2), and (1, 4) would then be as shown in Figure 30. Using screen grid coordinates, we now identify the area occupied by a pixel with screen coordinates (x, y) as the unit square with diagonally opposite corners at (x, y) and (x + 1, y + 1). This pixel-addressing method has several advantages: it avoids half-integer pixel boundaries, it facilitates precise object representations, and it simplifies the processing involved in many scan-conversion algorithms and other raster procedures. The algorithms for line drawing and curve generation discussed in the preceding sections are still valid when applied to input positions expressed as screen grid coordinates. Decision parameters in these algorithms would now be a measure of screen grid separation differences, rather than separation differences from pixel centers.
Maintaining Geometric Properties of Displayed Objects
5
0
Screen Grid Coordinates
5
When we convert geometric descriptions of objects into pixel representations, we transform mathematical points and lines into finite screen areas. If we are to maintain the original geometric measurements specified by the input coordinates for an object, we need to account for the finite size of pixels when we transform the object definition to a screen display. Figure 31 shows the line plotted in the Bresenham line-algorithm example of Section 1. Interpreting the line endpoints (20, 10) and (30, 18) as precise grid-crossing positions, we see that the line should not extend past screen-grid position (30, 18). If we were to plot the pixel with screen coordinates (30, 18), as in the example given in Section 1, we would display a line that spans 11 horizontal units and 9 vertical units. For the mathematical line, however, x = 10 and y = 8. If we are addressing pixels by their center positions, we can adjust the length of the displayed line by omitting one of the endpoint pixels. But if we think of screen coordinates as addressing pixel boundaries, as shown in Figure 31, we plot a line using only those pixels that are “interior” to the line path; that is, only those pixels that are between the line endpoints. For our example, we would plot the leftmost pixel at (20, 10) and the rightmost pixel at (29, 17). This displays a
Implementation Algorithms for Graphics Primitives and Attributes
18
4
17
3
16
2
15
1
14
0
13
0
1
2
3 (a)
4
5
0
1
2
3 (b)
4
5
0
1
2
3
4
5
12 11
FIGURE 31
10 20 21 22 23 24 25 26 27 28 29 30
Line path and corresponding pixel display for grid endpoint coordinates (20, 10) and (30, 18).
4 3 2 1
line that has the same geometric magnitudes as the mathematical line from (20, 10) to (30, 18). For an enclosed area, input geometric properties are maintained by displaying the area using only those pixels that are interior to the object boundaries. The rectangle defined with the screen coordinate vertices shown in Figure 32(a), for example, is larger when we display it filled with pixels up to and including the border pixel lines joining the specified vertices [Figure 32(b)]. As defined, the area of the rectangle is 12 units, but as displayed in Figure 32(b), it has an area of 20 units. In Figure 32(c), the original rectangle measurements are maintained by displaying only the internal pixels. The right boundary of the input rectangle is at x = 4. To maintain the rectangle width in the display, we set the rightmost pixel grid coordinate for the rectangle at x = 3 because the pixels in this vertical column span the interval from x = 3 to x = 4. Similarly, the mathematical top boundary of the rectangle is at y = 3, so we set the top pixel row for the displayed rectangle at y = 2. These compensations for finite pixel size can be applied to other objects, including those with curved boundaries, so that the raster display maintains the input object specifications. A circle with radius 5 and center position (10, 10), for instance, would be displayed as in Figure 33 by the midpoint circle algorithm
0
4 3 2 1 0
(c) FIGURE 32
Conversion of rectangle (a) with vertices at screen coordinates (0, 0), (4, 0), (4, 3), and (0, 3) into display (b), which includes the right and top boundaries, and into display (c), which maintains geometric magnitudes.
15
5
(10, 10)
15
FIGURE 33
5
A midpoint-algorithm plot of the circle equation ( x − 10) 2 + ( y − 10) 2 = 52 using pixel-center coordinates.
159
Implementation Algorithms for Graphics Primitives and Attributes
(y, x 1)
(x, y 1)
(y 1, x 1)
(x 1, y 1) (0, 0) (x 1, y)
(x, y)
FIGURE 34
Modification of the circle plot in Figure 33 to maintain the specified circle diameter of 10.
(y, x) (y 1, x)
using pixel centers as screen-coordinate positions. However, the plotted circle has a diameter of 11. To plot the circle with the defined diameter of 10, we can modify the circle algorithm to shorten each pixel scan line and each pixel column, as in Figure 34. One way to do this is to generate points clockwise along the circular arc in the third quadrant, starting at screen coordinates (10, 5). For each generated point, the other seven circle symmetry points are generated by decreasing the x coordinate values by 1 along scan lines and decreasing the y coordinate values by 1 along pixel columns. Similar methods are applied in ellipse algorithms to maintain the specified proportions in the display of an ellipse.
9 Attribute Implementations for Straight-Line Segments and Curves Recall that line segment primitives can be displayed with three basic attributes: color, width, and style. Of these, line width and style are selected with separate line functions.
Line Width Implementation of line-width options depends on the capabilities of the output device. For raster implementations, a standard-width line is generated with single pixels at each sample position, as in the Bresenham algorithm. Thicker lines are displayed as positive integer multiples of the standard line by plotting additional pixels along adjacent parallel line paths. If a line has slope magnitude less than or equal to 1.0, we can modify a line-drawing routine to display thick lines by plotting a vertical span of pixels in each column (x position) along the line. The number of pixels to be displayed in each column is set equal to the integer value of the line width. In Figure 35, we display a double-width line by generating a parallel line above the original line path. At each x sampling position, we calculate the corresponding y coordinate and plot pixels at screen coordinates (x, y) and (x, y + 1). We could display lines with a width of 3 or greater by alternately plotting pixels above and below the single-width line path.
160
Implementation Algorithms for Graphics Primitives and Attributes
FIGURE 35
A double-wide raster line with slope |m | < 1.0 generated with vertical pixel spans.
FIGURE 36
A raster line with slope |m | > 1.0 and a line width of 4 plotted using horizontal pixel spans.
With a line slope greater than 1.0 in magnitude, we can display thick lines using horizontal spans, alternately picking up pixels to the right and left of the line path. This scheme is demonstrated in Figure 36, where a line segment with a width of 4 is plotted using multiple pixels across each scan line. Similarly, a thick line with slope less than or equal to 1.0 can be displayed using vertical pixel spans. We can implement this procedure by comparing the magnitudes of the horizontal and vertical separations (x and y) of the line endpoints. If |x| ≥ |y|, pixels are replicated along columns. Otherwise, multiple pixels are plotted across rows. Although thick lines are generated quickly by plotting horizontal or vertical pixel spans, the displayed width of a line (measured perpendicular to the line ◦ path) √ depends on its slope. A 45 line will be displayed thinner by a factor of 1/ 2 compared to a horizontal or vertical line plotted with the same-length pixel spans. Another problem with implementing width options using horizontal or vertical pixel spans is that the method produces lines whose ends are horizontal or vertical regardless of the slope of the line. This effect is more noticeable with very thick lines. We can adjust the shape of the line ends to give them a better appearance by adding line caps (Figure 37). One kind of line cap is the butt cap, which has square ends that are perpendicular to the line path. If the specified line has slope m, the square ends of the thick line have slope −1/m. Each of the component parallel lines is then displayed between the two perpendicular lines at each end of the specified line path. Another line cap is the round cap obtained by adding a filled semicircle to each butt cap. The circular arcs are centered at the middle of the thick line and have a diameter equal to the line thickness. A third type of line cap is the projecting square cap. Here, we simply extend the line
161
Implementation Algorithms for Graphics Primitives and Attributes
FIGURE 37
Thick lines drawn with (a) butt caps, (b) round caps, and (c) projecting square caps.
(a)
(b)
(c)
FIGURE 38
Thick line segments connected with a miter join (a), a round join (b), and a bevel join (c).
(a)
(b)
(c)
and add butt caps that are positioned half of the line width beyond the specified endpoints. Other methods for producing thick lines include displaying the line as a filled rectangle or generating the line with a selected pen or brush pattern, as discussed in the next section. To obtain a rectangle representation for the line boundary, we calculate the position of the rectangle vertices along perpendiculars to the line path so that the rectangle vertex coordinates are displaced from the original lineendpoint positions by half the line width. The rectangular line then appears as in Figure 37(a). We could add round caps to the filled rectangle, or we could extend its length to display projecting square caps. Generating thick polylines requires some additional considerations. In general, the methods that we have considered for displaying a single line segment will not produce a smoothly connected series of line segments. Displaying thick polylines using horizontal and vertical pixel spans, for example, leaves pixel gaps at the boundaries between line segments with different slopes where there is a shift from horizontal pixel spans to vertical spans. We can generate thick polylines that are smoothly joined at the cost of additional processing at the segment endpoints. Figure 38 shows three possible methods for smoothly joining two line segments. A miter join is accomplished by extending the outer boundaries of each of the two line segments until they meet. A round join is produced by capping the connection between the two segments with a circular boundary whose diameter is equal to the line width. A bevel join is generated by displaying the line segments with butt caps and filling in the triangular gap where the segments meet. If the angle between two connected line segments is very small, a miter join can generate a long spike that distorts the appearance of the polyline. A graphics package can avoid this effect by switching from a miter join to a bevel join when, for example, the angle between any two consecutive segments is small.
Line Style Raster line algorithms display line-style attributes by plotting pixel spans. For dashed, dotted, and dot-dashed patterns, the line-drawing procedure outputs sections of contiguous pixels along the line path, skipping over a number of intervening pixels between the solid spans. Pixel counts for the span length and
162
Implementation Algorithms for Graphics Primitives and Attributes
inter-span spacing can be specified in a pixel mask, which is a pattern of binary digits indicating which positions to plot along the line path. The linear mask 11111000, for instance, could be used to display a dashed line with a dash length of five pixels and an inter-dash spacing of three pixels. Pixel positions corresponding to the 1 bits are assigned the current color, and pixel positions corresponding to the 0 bits are displayed in the background color. Plotting dashes with a fixed number of pixels results in unequal length dashes for different line orientations, as illustrated in Figure 39. Both dashes shown √ are plotted with four pixels, but the diagonal dash is longer by a factor of 2. For precision drawings, dash lengths should remain approximately constant for any line orientation. To accomplish this, we could adjust the pixel counts for the solid spans and inter-span spacing according to the line slope. In Figure 39, we can display approximately equal length dashes by reducing the diagonal dash to three pixels. Another method for maintaining dash length is to treat dashes as individual line segments. Endpoint coordinates for each dash are located and passed to the line routine, which then calculates pixel positions along the dash path.
(a)
(b) FIGURE 39
Unequal-length dashes displayed with the same number of pixels.
Pen and Brush Options Pen and brush shapes can be stored in a pixel mask that identifies the array of pixel positions that are to be set along the line path. For example, a rectangular pen could be implemented with the mask shown in Figure 40 by moving the center (or one corner) of the mask along the line path, as in Figure 41. To avoid setting pixels more than once in the frame buffer, we can simply accumulate the horizontal spans generated at each position of the mask and keep track of the beginning and ending x positions for the spans across each scan line. Lines generated with pen (or brush) shapes can be displayed in various widths by changing the size of the mask. For example, the rectangular pen line in Figure 41 could be narrowed with a 2 × 2 rectangular mask or widened with a 4 × 4 mask. Also, lines can be displayed with selected patterns by superimposing the pattern values onto the pen or brush mask.
1 1 1
1 1 1
1 1 1 Line Path
(a)
(b)
FIGURE 40
FIGURE 41
A pixel mask (a) for a rectangular pen, and the associated array of pixels (b) displayed by centering the mask over a specified pixel position.
Generating a line with the pen shape of Figure 40.
163
Implementation Algorithms for Graphics Primitives and Attributes
17
14
FIGURE 42
FIGURE 43
A circular arc of width 4 plotted with either vertical or horizontal pixel spans, depending on the slope.
A circular arc of width 4 and radius 16 displayed by filling the region between two concentric arcs.
Curve Attributes Methods for adapting curve-drawing algorithms to accommodate attribute selections are similar to those for line drawing. Raster curves of various widths can be displayed using the method of horizontal or vertical pixel spans. Where the magnitude of the curve slope is less than or equal to 1.0, we plot vertical spans; where the slope magnitude is greater than 1.0, we plot horizontal spans. Figure 42 demonstrates this method for displaying a circular arc with a width of 4 in the first quadrant. Using circle symmetry, we generate the circle path with vertical spans in the octant from x = 0 to x = y, and then reflect pixel positions about the line y = x to obtain the remainder of the curve shown. Circle sections in the other quadrants are obtained by reflecting pixel positions in the first quadrant about the coordinate axes. The thickness of curves displayed with this method is again a function of curve slope. Circles, ellipses, and other curves will appear thinnest where the slope has a magnitude of 1. Another method for displaying thick curves is to fill in the area between two parallel curve paths, whose separation distance is equal to the desired width. We could do this using the specified curve path as one boundary and setting up the second boundary either inside or outside the original curve path. This approach, however, shifts the original curve path either inward or outward, depending on which direction we choose for the second boundary. We can maintain the original curve position by setting the two boundary curves at a distance of half the width on either side of the specified curve path. An example of this approach is shown in Figure 43 for a circle segment with a radius of 16 and a specified width of 4. The boundary arcs are then set at a separation distance of 2 on either side of the radius of 16. To maintain the proper dimensions of the circular arc, as discussed in Section 8, we can set the radii for the concentric boundary arcs at r = 14 and r = 17. Although this method is accurate for generating thick circles, it provides, in general, only an approximation to the true area of other thick curves. For example, the inner and outer boundaries of a fat ellipse generated with this method do not have the same foci. The pixel masks discussed for implementing line-style options could also be used in raster curve algorithms to generate dashed or dotted patterns. For example, the mask 11100 produces the dashed circular arc shown in Figure 44.
164
Implementation Algorithms for Graphics Primitives and Attributes
FIGURE 44
FIGURE 45
A dashed circular arc displayed with a dash span of 3 pixels and an inter-dash spacing of 2 pixels.
A circular arc displayed with a rectangular pen.
We can generate the dashes in the various octants using circle symmetry, but we must shift the pixel positions to maintain the correct sequence of dashes and spaces as we move from one octant to the next. Also, as in straight-line algorithms, pixel masks display dashes and inter-dash spaces that vary in length according to the slope of the curve. If we want to display constant length dashes, we need to adjust the number of pixels plotted in each dash as we move around the circle circumference. Instead of applying a pixel mask with constant spans, we plot pixels along equal angular arcs to produce equal-length dashes. Pen (or brush) displays of curves are generated using the same techniques discussed for straight-line segments. We replicate a pen shape along the line path, as illustrated in Figure 45 for a circular arc in the first quadrant. Here, the center of the rectangular pen is moved to successive curve positions to produce the curve shape shown. Curves displayed with a rectangular pen in this manner will be thicker where the magnitude of the curve slope is 1. A uniform curve thickness can be displayed by rotating the rectangular pen to align it with the slope direction as we move around the curve or by using a circular pen shape. Curves drawn with pen and brush shapes can be displayed in different sizes and with superimposed patterns or simulated brush strokes.
10 General Scan-Line Polygon-Fill Algorithm A scan-line fill of a region is performed by first determining the intersection positions of the boundaries of the fill region with the screen scan lines. Then the fill colors are applied to each section of a scan line that lies within the interior of the fill region. The scan-line fill algorithm identifies the same interior regions as the odd-even rule. The simplest area to fill is a polygon because each scanline intersection point with a polygon boundary is obtained by solving a pair of simultaneous linear equations, where the equation for the scan line is simply y = constant. Figure 46 illustrates the basic scan-line procedure for a solid-color fill of a polygon. For each scan line that crosses the polygon, the edge intersections are
165
Implementation Algorithms for Graphics Primitives and Attributes y
Scan Line y 1
2
1 Scan Line y
1
2
1
1
x 10
14
18
24 FIGURE 47
FIGURE 46
Interior pixels along a scan line passing through a polygon fill area.
Intersection points along scan lines that intersect polygon vertices. Scan line y generates an odd number of intersections, but scan line y generates an even number of intersections that can be paired to identify correctly the interior pixel spans.
sorted from left to right, and then the pixel positions between, and including, each intersection pair are set to the specified fill color. In the example of Figure 46, the four pixel intersection positions with the polygon boundaries define two stretches of interior pixels. Thus, the fill color is applied to the five pixels from x = 10 to x = 14 and to the seven pixels from x = 18 to x = 24. If a pattern fill is to be applied to the polygon, then the color for each pixel along a scan line is determined from its overlap position with the fill pattern. However, the scan-line fill algorithm for a polygon is not quite as simple as Figure 46 might suggest. Whenever a scan line passes through a vertex, it intersects two polygon edges at that point. In some cases, this can result in an odd number of boundary intersections for a scan line. Figure 47 shows two scan lines that cross a polygon fill area and intersect a vertex. Scan line y intersects an even number of edges, and the two pairs of intersection points along this scan line correctly identify the interior pixel spans. But scan line y intersects five polygon edges. To identify the interior pixels for scan line y, we must count the vertex intersection as only one point. Thus, as we process scan lines, we need to distinguish between these cases. We can detect the topological difference between scan line y and scan line y in Figure 47 by noting the position of the intersecting edges relative to the scan line. For scan line y, the two edges sharing an intersection vertex are on opposite sides of the scan line. But for scan line y , the two intersecting edges are both above the scan line. Thus, a vertex that has adjoining edges on opposite sides of an intersecting scan line should be counted as just one boundary intersection point. We can identify these vertices by tracing around the polygon boundary in either clockwise or counterclockwise order and observing the relative changes in vertex y coordinates as we move from one edge to the next. If the three endpoint y values of two consecutive edges monotonically increase or decrease, we need to count the shared (middle) vertex as a single intersection point for the scan line passing through that vertex. Otherwise, the shared vertex represents a local extremum (minimum or maximum) on the polygon boundary, and the two edge intersections with the scan line passing through that vertex can be added to the intersection list. One method for implementing the adjustment to the vertex-intersection count is to shorten some polygon edges to split those vertices that should be counted as one intersection. We can process nonhorizontal edges around the polygon boundary in the order specified, either clockwise or counterclockwise. As we
166
Implementation Algorithms for Graphics Primitives and Attributes FIGURE 48
Scan Line y 1 Scan Line y Scan Line y 1
(a)
(b)
Adjusting endpoint y values for a polygon, as we process edges in order around the polygon perimeter. The edge currently being processed is indicated as a solid line. In (a), the y coordinate of the upper endpoint of the current edge is decreased by 1. In (b), the y coordinate of the upper endpoint of the next edge is decreased by 1.
Scan Line yk 1
(xk 1, yk 1) (xk, yk)
Scan Line yk
FIGURE 49
Two successive scan lines intersecting a polygon boundary.
process each edge, we can check to determine whether that edge and the next nonhorizontal edge have either monotonically increasing or decreasing endpoint y values. If so, the lower edge can be shortened to ensure that only one intersection point is generated for the scan line going through the common vertex joining the two edges. Figure 48 illustrates the shortening of an edge. When the endpoint y coordinates of the two edges are increasing, the y value of the upper endpoint for the current edge is decreased by 1, as in Figure 48(a). When the endpoint y values are monotonically decreasing, as in Figure 48(b), we decrease the y coordinate of the upper endpoint of the edge following the current edge. Typically, certain properties of one part of a scene are related in some way to the properties in other parts of the scene, and these coherence properties can be used in computer-graphics algorithms to reduce processing. Coherence methods often involve incremental calculations applied along a single scan line or between successive scan lines. For example, in determining fill-area edge intersections, we can set up incremental coordinate calculations along any edge by exploiting the fact that the slope of the edge is constant from one scan line to the next. Figure 49 shows two successive scan lines crossing the left edge of a triangle. The slope of this edge can be expressed in terms of the scan-line intersection coordinates: yk+1 − yk m= (59) xk+1 − xk Because the change in y coordinates between the two scan lines is simply yk+1 − yk = 1
(60)
the x-intersection value xk+1 on the upper scan line can be determined from the x-intersection value xk on the preceding scan line as 1 (61) m Each successive x intercept can thus be calculated by adding the inverse of the slope and rounding to the nearest integer. xk+1 = xk +
167
Implementation Algorithms for Graphics Primitives and Attributes
An obvious parallel implementation of the fill algorithm is to assign each scan line that crosses the polygon to a separate processor. Edge intersection calculations are then performed independently. Along an edge with slope m, the intersection xk value for scan line k above the initial scan line can be calculated as k xk = x0 + (62) m In a sequential fill algorithm, the increment of x values by the amount m1 along an edge can be accomplished with integer operations by recalling that the slope m is the ratio of two integers: y x where x and y are the differences between the edge endpoint x and y coordinate values. Thus, incremental calculations of x intercepts along an edge for successive scan lines can be expressed as m=
xk+1 = xk +
x y
(63)
Using this equation, we can perform integer evaluation of the x intercepts by initializing a counter to 0, then incrementing the counter by the value of x each time we move up to a new scan line. Whenever the counter value becomes equal to or greater than y, we increment the current x intersection value by 1 and decrease the counter by the value y. This procedure is equivalent to maintaining integer and fractional parts for x intercepts and incrementing the fractional part until we reach the next integer value. As an example of this integer-incrementing scheme, suppose that we have an edge with slope m = 73 . At the initial scan line, we set the counter to 0 and the counter increment to 3. As we move up to the next three scan lines along this edge, the counter is successively assigned the values 3, 6, and 9. On the third scan line above the initial scan line, the counter now has a value greater than 7. So we increment the x intersection coordinate by 1 and reset the counter to the value 9 − 7 = 2. We continue determining the scan-line intersections in this way until we reach the upper endpoint of the edge. Similar calculations are carried out to obtain intersections for edges with negative slopes. We can round to the nearest pixel x intersection value, instead of truncating to obtain integer positions, by modifying the edge-intersection algorithm so that the increment is compared to y/2. This can be done with integer arithmetic by incrementing the counter with the value 2x at each step and comparing the increment to y. When the increment is greater than or equal to y, we increase the x value by 1 and decrement the counter by the value of 2y. In our previous example with m = 73 , the counter values for the first few scan lines above the initial scan line on this edge would now be 6, 12 (reduced to −2), 4, 10 (reduced to −4), 2, 8 (reduced to −6), 0, 6, and 12 (reduced to −2). Now x would be incremented on scan lines 2, 4, 6, 9, and so forth, above the initial scan line for this edge. The extra calculations required for each edge are 2x = x + x and 2y = y + y, which are carried out as preprocessing steps. To perform a polygon fill efficiently, we can first store the polygon boundary in a sorted edge table that contains all the information necessary to process the scan lines efficiently. Proceeding around the edges in either a clockwise or a counterclockwise order, we can use a bucket sort to store the edges, sorted on the smallest y value of each edge, in the correct scan-line positions. Only nonhorizontal edges are entered into the sorted edge table. As the edges are processed, we can also
168
Implementation Algorithms for Graphics Primitives and Attributes ScanLine Number yC B
E
yA
Scan Line yD Scan Line yA
yC
xD 1/mDC
yE
xD 1/m DE
yE
xA 1/mAE
yB
xA 1/mAB
1/mCB
. . .
Scan Line yC
C
xC
. . . yD
C
yB
D A
. . . 1 0
FIGURE 50
A polygon and its sorted edge table, with edge DC shortened by one unit in the y direction.
shorten certain edges to resolve the vertex-intersection question. Each entry in the table for a particular scan line contains the maximum y value for that edge, the x-intercept value (at the lower vertex) for the edge, and the inverse slope of the edge. For each scan line, the edges are in sorted order from left to right. Figure 50 shows a polygon and the associated sorted edge table. Next, we process the scan lines from the bottom of the polygon to its top, producing an active edge list for each scan line crossing the polygon boundaries. The active edge list for a scan line contains all edges crossed by that scan line, with iterative coherence calculations used to obtain the edge intersections. Implementation of edge-intersection calculations can be facilitated by storing x and y values in the sorted edge list. Also, to ensure that we correctly fill the interior of specified polygons, we can apply the considerations discussed in Section 8. For each scan line, we fill in the pixel spans for each pair of x intercepts starting from the leftmost x intercept value and ending at one position before the rightmost x intercept. Each polygon edge can be shortened by one unit in the y direction at the top endpoint. These measures also guarantee that pixels in adjacent polygons will not overlap.
11 Scan-Line Fill of Convex Polygons When we apply a scan-line fill procedure to a convex polygon, there can be no more than a single interior span for each screen scan line. So we need to process the polygon edges only until we have found two boundary intersections for each scan line crossing the polygon interior. The general polygon scan-line algorithm discussed in the preceding section can be simplified considerably for convex-polygon fill. We again use coordinate extents to determine which edges cross a scan line. Intersection calculations with these edges then determine the interior pixel span for that scan line, where any
169
Implementation Algorithms for Graphics Primitives and Attributes
vertex crossing is counted as a single boundary intersection point. When a scan line intersects a single vertex (at an apex, for example), we plot only that point. Some graphics packages further restrict fill areas to be triangles. This makes filling even easier because each triangle has just three edges to process.
12 Scan-Line Fill for Regions with Curved Boundaries
FIGURE 51
Interior fill of an elliptical arc.
Because an area with curved boundaries is described with nonlinear equations, a scan-line fill generally takes more time than a polygon scan-line fill. We can use the same general approach detailed in Section 10, but the boundary intersection calculations are performed with curve equations. In addition, the slope of the boundary is continuously changing, so we cannot use the straightforward incremental calculations that are possible with straight-line edges. For simple curves such as circles or ellipses, we can apply fill methods similar to those for convex polygons. Each scan line crossing a circle or ellipse interior has just two boundary intersections; and we can determine these two intersection points along the boundary of a circle or an ellipse using the incremental calculations in the midpoint method. Then we simply fill in the horizontal pixel spans from one intersection point to the other. Symmetries between quadrants (and between octants for circles) are used to reduce the boundary calculations. Similar methods can be used to generate a fill area for a curve section. For example, an area bounded by an elliptical arc and a straight line section (Figure 51) can be filled using a combination of curve and line procedures. Symmetries and incremental calculations are exploited whenever possible to reduce computations. Filling other curve areas can involve considerably more processing. We could use similar incremental methods in combination with numerical techniques to determine the scan-line intersections, but usually such curve boundaries are approximated with straight-line segments.
13 Fill Methods for Areas with Irregular Boundaries Another approach for filling a specified area is to start at an inside position and “paint” the interior, point by point, out to the boundary. This is a particularly useful technique for filling areas with irregular borders, such as a design created with a paint program. Generally, these methods require an input starting position inside the area to be filled and some color information about either the boundary or the interior. We can fill irregular regions with a single color or with a color pattern. For a pattern fill, we overlay a color mask. As each pixel within the region is processed, its color is determined by the corresponding values in the overlaid pattern.
Boundary-Fill Algorithm If the boundary of some region is specified in a single color, we can fill the interior of this region, pixel by pixel, until the boundary color is encountered. This
170
Implementation Algorithms for Graphics Primitives and Attributes
(a)
(b)
FIGURE 52
Example of color boundaries for a boundary-fill procedure.
method, called the boundary-fill algorithm, is employed in interactive painting packages, where interior points are easily selected. Using a graphics tablet or other interactive device, an artist or designer can sketch a figure outline, select a fill color from a color menu, specify the area boundary color, and pick an interior point. The figure interior is then painted in the fill color. Both inner and outer boundaries can be set up to define an area for boundary fill, and Figure 52 illustrates examples for specifying color regions. Basically, a boundary-fill algorithm starts from an interior point (x, y) and tests the color of neighboring positions. If a tested position is not displayed in the boundary color, its color is changed to the fill color and its neighbors are tested. This procedure continues until all pixels are processed up to the designated boundary color for the area. Figure 53 shows two methods for processing neighboring pixels from a current test position. In Figure 53(a), four neighboring points are tested. These are the pixel positions that are right, left, above, and below the current pixel. Areas filled by this method are called 4-connected. The second method, shown in Figure 53(b), is used to fill more complex figures. Here the set of neighboring positions to be tested includes the four diagonal pixels, as well as those in the cardinal directions. Fill methods using this approach are called 8-connected. An 8-connected boundary-fill algorithm would correctly fill the interior of the area defined in Figure 54, but a 4-connected boundary-fill algorithm would fill only part of that region. The following procedure illustrates a recursive method for painting a 4-connected area with a solid color, specified in parameter fillColor, up to a boundary color specified with parameter borderColor. We can extend this procedure to fill an 8-connected region by including four additional statements to test the diagonal positions (x ± 1, y ± 1).
Start Position (a)
(a)
(b) FIGURE 53
Fill methods applied to a 4-connected area (a) and to an 8-connected area (b). Hollow circles represent pixels to be tested from the current test position, shown as a solid color.
(b)
FIGURE 54
The area defined within the color boundary (a) is only partially filled in (b) using a 4-connected boundary-fill algorithm.
171
Implementation Algorithms for Graphics Primitives and Attributes
void boundaryFill4 (int x, int y, int fillColor, int borderColor) { int interiorColor; /* Set current color to fillColor, then perform the following operations. */ getPixel (x, y, interiorColor); if ((interiorColor != borderColor) && (interiorColor != fillColor)) { setPixel (x, y); // Set color of pixel to fillColor. boundaryFill4 (x + 1, y , fillColor, borderColor); boundaryFill4 (x - 1, y , fillColor, borderColor); boundaryFill4 (x , y + 1, fillColor, borderColor); boundaryFill4 (x , y - 1, fillColor, borderColor) } }
Recursive boundary-fill algorithms may not fill regions correctly if some interior pixels are already displayed in the fill color. This occurs because the algorithm checks the next pixels both for boundary color and for fill color. Encountering a pixel with the fill color can cause a recursive branch to terminate, leaving other interior pixels unfilled. To avoid this, we can first change the color of any interior pixels that are initially set to the fill color before applying the boundary-fill procedure. Also, because this procedure requires considerable stacking of neighboring points, more efficient methods are generally employed. These methods fill horizontal pixel spans across scan lines, instead of proceeding to 4-connected or 8-connected neighboring points. Then we need only stack a beginning position for each horizontal pixel span, instead of stacking all unprocessed neighboring positions around the current position. Starting from the initial interior point with this method, we first fill in the contiguous span of pixels on this starting scan line. Then we locate and stack starting positions for spans on the adjacent scan lines, where spans are defined as the contiguous horizontal string of positions bounded by pixels displayed in the border color. At each subsequent step, we retrieve the next start position from the top of the stack and repeat the process. An example of how pixel spans could be filled using this approach is illustrated for the 4-connected fill region in Figure 55. In this example, we first process scan lines successively from the start line to the top boundary. After all upper scan lines are processed, we fill in the pixel spans on the remaining scan lines in order down to the bottom boundary. The leftmost pixel position for each horizontal span is located and stacked, in left-to-right order across successive scan lines, as shown in Figure 55. In (a) of this figure, the initial span has been filled, and starting positions 1 and 2 for spans on the next scan lines (below and above) are stacked. In Figure 55(b), position 2 has been unstacked and processed to produce the filled span shown, and the starting pixel (position 3) for the single span on the next scan line has been stacked. After position 3 is processed, the filled spans and stacked positions are as shown in Figure 55(c). Figure 55(d) shows the filled pixels after processing all spans in the upper-right portion of the specified area. Position 5 is next processed, and spans are filled in the upper-left portion of the region; then position 4 is picked up to continue the processing for the lower scan lines.
172
Implementation Algorithms for Graphics Primitives and Attributes
Filled Pixel Spans
Stacked Positions
2 2
1
1 (a)
3
3
1
1 (b)
5
6 6
4
5 1
4 1 FIGURE 55
(c)
5 4
5 1
4 1
(d)
Boundary fill across pixel spans for a 4-connected area: (a) Initial scan line with a filled pixel span, showing the position of the initial point (hollow) and the stacked positions for pixel spans on adjacent scan lines. (b) Filled pixel span on the first scan line above the initial scan line and the current contents of the stack. (c) Filled pixel spans on the first two scan lines above the initial scan line and the current contents of the stack. (d) Completed pixel spans for the upper-right portion of the defined region and the remaining stacked positions to be processed.
173
Implementation Algorithms for Graphics Primitives and Attributes
Flood-Fill Algorithm
FIGURE 56
An area defined within multiple color boundaries.
Sometimes we want to fill in (or recolor) an area that is not defined within a single color boundary. Figure 56 shows an area bordered by several different color regions. We can paint such areas by replacing a specified interior color instead of searching for a particular boundary color. This fill procedure is called a flood-fill algorithm. We start from a specified interior point (x, y) and reassign all pixel values that are currently set to a given interior color with the desired fill color. If the area that we want to paint has more than one interior color, we can first reassign pixel values so that all interior points have the same color. Using either a 4-connected or 8-connected approach, we then step through pixel positions until all interior points have been repainted. The following procedure flood fills a 4-connected region recursively, starting from the input position.
void floodFill4 (int x, int y, int fillColor, int interiorColor) { int color; /* Set current color to fillColor, then perform the following operations. */ getPixel (x, y, color); if (color = interiorColor) { setPixel (x, y); // Set color of pixel to fillColor. floodFill4 (x + 1, y, fillColor, interiorColor); floodFill4 (x - 1, y, fillColor, interiorColor); floodFill4 (x, y + 1, fillColor, interiorColor); floodFill4 (x, y - 1, fillColor, interiorColor) } }
We can modify the above procedure to reduce the storage requirements of the stack by filling horizontal pixel spans, as discussed for the boundary-fill algorithm. In this approach, we stack only the beginning positions for those pixel spans having the value interiorColor. The steps in this modified flood-fill algorithm are similar to those illustrated in Figure 55 for a boundary fill. Starting at the first position of each span, the pixel values are replaced until a value other than interiorColor is encountered.
14 Implementation Methods for Fill Styles There are two basic procedures for filling an area on raster systems, once the definition of the fill region has been mapped to pixel coordinates. One procedure first determines the overlap intervals for scan lines that cross the area. Then, pixel positions along these overlap intervals are set to the fill color. Another method for area filling is to start from a given interior position and “paint” outward, pixelby-pixel, from this point until we encounter specified boundary conditions. The scan-line approach is usually applied to simple shapes such as circles or regions with polyline boundaries, and general graphics packages use this fill method. Fill algorithms that use a starting interior point are useful for filling areas with more complex boundaries and in interactive painting systems.
174
Implementation Algorithms for Graphics Primitives and Attributes
Fill Styles We can implement a pattern fill by determining where the pattern overlaps those scan lines that cross a fill area. Beginning from a specified start position for a pattern fill, we map the rectangular patterns vertically across scan lines and horizontally across pixel positions on the scan lines. Each replication of the pattern array is performed at intervals determined by the width and height of the mask. Where the pattern overlaps the fill area, pixel colors are set according to the values stored in the mask. Hatch fill could be applied to regions by drawing sets of line segments to display either single hatching or cross-hatching. Spacing and slope for the hatch lines could be set as parameters in a hatch table. Alternatively, hatch fill can be specified as a pattern array that produces sets of diagonal lines. A reference point (xp, yp) for the starting position of a fill pattern can be set at any convenient position, inside or outside the fill region. For instance, the reference point could be set at a polygon vertex; or the reference point could be chosen as the lower-left corner of the bounding rectangle (or bounding box) determined by the coordinate extents of the region. To simplify selection of the reference coordinates, some packages always use the coordinate origin of the display window as the pattern start position. Always setting (xp, yp) at the coordinate origin also simplifies the tiling operations when each element of a pattern is to be mapped to a single pixel. For example, if the row positions in the pattern array are referenced from bottom to top, starting with the value 1, a color value is then assigned to pixel position (x, y) in screen coordinates from pattern position (y mod ny + 1, x mod nx+1). Here, ny and nx specify the number of rows and number of columns in the pattern array. Setting the pattern start position at the coordinate origin, however, effectively attaches the pattern fill to the screen background rather than to the fill regions. Adjacent or overlapping areas filled with the same pattern would show no apparent boundary between the areas. Also, repositioning and refilling an object with the same pattern can result in a shift in the assigned pixel values over the object interior. A moving object would appear to be transparent against a stationary pattern background instead of moving with a fixed interior pattern.
Color-Blended Fill Regions Color-blended regions can be implemented using either transparency factors to control the blending of background and object colors, or using simple logical or replace operations as shown in Figure 57, which demonstrates how these operations would combine a 2 × 2 fill pattern with a background pattern for a binary (black-and-white) system. The linear soft-fill algorithm repaints an area that was originally painted by merging a foreground color F with a single background color B, where F = B. Assuming we know the values for F and B, we can check the current contents of the frame buffer to determine how these colors were combined. The current color P of each pixel within the area to be refilled is some linear combination of F and B: P = tF + (1 − t)B
(64)
where the transparency factor t has a value between 0 and 1 for each pixel. For values of t less than 0.5, the background color contributes more to the interior color of the region than does the fill color. If our color values are represented using separate red, green, and blue (RGB) components, Equation 64 holds for
175
Implementation Algorithms for Graphics Primitives and Attributes
and
or
Pattern
Background
FIGURE 57
xor
replace
Combining a fill pattern with a background pattern using logical operations and, or, and xor (exclusive or), and using simple replacement.
Pixel Values
each component of the colors, with P = (PR , PG , PB ),
F = (F R , FG , F B ),
B = (B R , BG , B B )
(65)
We can thus calculate the value of parameter t using one of the RGB color components as follows: t=
Pk − Bk Fk − Bk
(66)
where k = R, G, or B; and Fk = Bk . Theoretically, parameter t has the same value for each RGB component, but the round-off calculations to obtain integer codes can result in different values of t for different components. We can minimize this round-off error by selecting the component with the largest difference between F and B. This value of t is then used to mix the new fill color NF with the background color. We can accomplish this mixing using either a modified flood-fill or boundary-fill procedure, as described in Section 13. Similar color-blending procedures can be applied to an area whose foreground color is to be merged with multiple background color areas, such as a checkerboard pattern. When two background colors B1 and B2 are mixed with foreground color F, the resulting pixel color P is P = t0 F + t1 B1 + (1 − t0 − t1 )B2
(67)
where the sum of the color-term coefficients t0 , t1 , and (1 − t0 − t1 ) must equal 1. We can set up two simultaneous equations using two of the three RGB color components to solve for the two proportionality parameters, t0 and t1 . These parameters are then used to mix the new fill color with the two background colors to obtain the new pixel color. With three background colors and one foreground color, or with two background and two foreground colors, we need all three RGB equations to obtain the relative amounts of the four colors. For some foreground and background color combinations, however, the system of two or three RGB
176
Implementation Algorithms for Graphics Primitives and Attributes
equations cannot be solved. This occurs when the color values are all very similar or when they are all proportional to each other.
15 Implementation Methods for Antialiasing Line segments and other graphics primitives generated by the raster algorithms discussed earlier in this chapter have a jagged, or stair-step, appearance because the sampling process digitizes coordinate points on an object to discrete integer pixel positions. This distortion of information due to low-frequency sampling (undersampling) is called aliasing. We can improve the appearance of displayed raster lines by applying antialiasing methods that compensate for the undersampling process. An example of the effects of undersampling is shown in Figure 58. To avoid losing information from such periodic objects, we need to set the sampling frequency to at least twice that of the highest frequency occurring in the object, referred to as the Nyquist sampling frequency (or Nyquist sampling rate) f s : f s = 2 f max
(68)
Another way to state this is that the sampling interval should be no larger than one-half the cycle interval (called the Nyquist sampling interval). For x-interval sampling, the Nyquist sampling interval xs is xcycle xs = (69) 2 where xcycle = 1/ f max . In Figure 58, our sampling interval is one and one-half times the cycle interval, so the sampling interval is at least three times too large. If we want to recover all the object information for this example, we need to cut the sampling interval down to one-third the size shown in the figure. One way to increase sampling rate with raster systems is simply to display objects at higher resolution. However, even at the highest resolution possible with current technology, the jaggies will be apparent to some extent. There is a limit to how big we can make the frame buffer and still maintain the refresh rate at 60 frames or more per second. To represent objects accurately with continuous parameters, we need arbitrarily small sampling intervals. Therefore, unless hardware technology is developed to handle arbitrarily large frame buffers, increased screen resolution is not a complete solution to the aliasing problem. With raster systems that are capable of displaying more than two intensity levels per color, we can apply antialiasing methods to modify pixel intensities. By
*
*
* (a)
*
*
Sampling Positions
FIGURE 58
(b)
Sampling the periodic shape in (a) at the indicated positions produces the aliased lower-frequency representation in (b).
177
Implementation Algorithms for Graphics Primitives and Attributes
appropriately varying the intensities of pixels along the boundaries of primitives, we can smooth the edges to lessen their jagged appearance. A straightforward antialiasing method is to increase sampling rate by treating the screen as if it were covered with a finer grid than is actually available. We can then use multiple sample points across this finer grid to determine an appropriate intensity level for each screen pixel. This technique of sampling object characteristics at a high resolution and displaying the results at a lower resolution is called supersampling (or postfiltering, because the general method involves computing intensities at subpixel grid positions and then combining the results to obtain the pixel intensities). Displayed pixel positions are spots of light covering a finite area of the screen, and not infinitesimal mathematical points. Yet in the line and fill-area algorithms we have discussed, the intensity of each pixel is determined by the location of a single point on the object boundary. By supersampling, we obtain intensity information from multiple points that contribute to the overall intensity of a pixel. An alternative to supersampling is to determine pixel intensity by calculating the areas of overlap of each pixel with the objects to be displayed. Antialiasing by computing overlap areas is referred to as area sampling (or prefiltering, because the intensity of the pixel as a whole is determined without calculating subpixel intensities). Pixel overlap areas are obtained by determining where object boundaries intersect individual pixel boundaries. Raster objects can also be antialiased by shifting the display location of pixel areas. This technique, called pixel phasing, is applied by “micropositioning” the electron beam in relation to object geometry. For example, pixel positions along a straight-line segment can be moved closer to the defined line path to smooth out the raster stair-step effect.
Supersampling Straight-Line Segments We can perform supersampling in several ways. For a straight-line segment, we can divide each pixel into a number of subpixels and count the number of subpixels that overlap the line path. The intensity level for each pixel is then set to a value that is proportional to this subpixel count. An example of this method is given in Figure 59. Each square pixel area is divided into nine equal-sized square subpixels, and the shaded regions show the subpixels that would be selected by Bresenham’s algorithm. This scheme provides for three intensity settings above zero, because the maximum number of subpixels that can be selected within any pixel is three. For this example, the pixel at position (10, 20) is set to the maximum intensity (level 3); pixels at (11, 21) and (12, 21) are each set to the next highest intensity (level 2); and pixels at (11, 20) and (12, 22) are each set to the lowest intensity above zero (level 1). Thus, the line intensity is spread out over a greater number of pixels to smooth the original jagged effect. This procedure displays a somewhat blurred line in the vicinity of the stair steps (between horizontal runs). If we want to use more intensity levels to antialiase the line with this method, we increase the number of sampling positions across each pixel. Sixteen subpixels gives us four intensity levels above zero; twenty-five subpixels gives us five levels; and so on. In the supersampling example of Figure 59, we considered pixel areas of finite size, but we treated the line as a mathematical entity with zero width. Actually, displayed lines have a width approximately equal to that of a pixel. If we take the finite width of the line into account, we can perform supersampling by setting pixel intensity proportional to the number of subpixels inside the polygon representing the line area. A subpixel can be considered to be inside the
178
Implementation Algorithms for Graphics Primitives and Attributes
22
22
21
21
20
20 10
11
10
12
11
12
FIGURE 59
FIGURE 60
Supersampling subpixel positions along a straight-line segment whose left endpoint is at screen coordinates (10, 20).
Supersampling subpixel positions in relation to the interior of a line of finite width.
line if its lower-left corner is inside the polygon boundaries. An advantage of this supersampling procedure is that the number of possible intensity levels for each pixel is equal to the total number of subpixels within the pixel area. For the example in Figure 59, we can represent this line with finite width by positioning the polygon boundaries parallel to the line path as in Figure 60. In addition, each pixel can now be set to one of nine possible brightness levels above zero. Another advantage of supersampling with a finite-width line is that the total line intensity is distributed over more pixels. In Figure 60, we now have the pixel at grid position (10, 21) turned on (at intensity level 1), and we also pick up contributions from pixels immediately below and immediately to the left of position (10, 21). Also, if we have a color display, we can extend the method to take background colors into account. A particular line might cross several different color areas, and we can average subpixel intensities to obtain pixel color settings. For instance, if five subpixels within a particular pixel area are determined to be inside the boundaries for a red line and the remaining four subpixels fall within a blue background area, we can calculate the color for this pixel as (5 · red + 4 · blue) 9 The trade-off for these gains from supersampling a finite-width line is that identifying interior subpixels requires more calculations than simply determining which subpixels are along the line path. Also, we need to take into account the positioning of the line boundaries in relation to the line path. This positioning depends on the slope of the line. For a 45◦ line, the line path is centered on the polygon area; but for either a horizontal or a vertical line, we want the line path to be one of the polygon boundaries. For example, a horizontal line passing through grid coordinates (10, 20) could be represented as the polygon bounded by horizontal grid lines y = 20 and y = 21. Similarly, the polygon representing a vertical line through (10, 20) can have vertical boundaries along grid lines x = 10 and x = 11. For lines with slope |m| < 1, the mathematical line path will be positioned proportionately closer to either the lower or the upper polygon boundary depending upon where the line intersects the polygon; in Figure 59, for instance, the line intersects the pixel at (10, 20) closer to the lower boundary, but intersects the pixel at (11, 20) closer to the upper boundary. Similarly, for lines with slope |m| > 1, the line path is placed closer to the left or right polygon boundary depending on where it intersects the polygon. pixelcolor =
179
Implementation Algorithms for Graphics Primitives and Attributes
1
2
1
2
4
2
1
2
1
FIGURE 61
Relative weights for a grid of 3 × 3 subpixels.
Subpixel Weighting Masks Supersampling algorithms are often implemented by giving more weight to subpixels near the center of a pixel area because we would expect these subpixels to be more important in determining the overall intensity of a pixel. For the 3 × 3 pixel subdivisions we have considered so far, a weighting scheme as in Figure 61 could be used. The center subpixel here is weighted four times that of the corner subpixels and twice that of the remaining subpixels. Intensities calculated for each of the nine subpixels would then be averaged so that the center subpixel is weighted by a factor of 14 ; the top, bottom, and side subpixels are each weighted 1 by a factor of 18 ; and the corner subpixels are each weighted by a factor of 16 . An array of values specifying the relative importance of subpixels is usually referred to as a weighting mask. Similar masks can be set up for larger subpixel grids. Also, these masks are often extended to include contributions from subpixels belonging to neighboring pixels, so that intensities can be averaged with adjacent pixels to provide a smoother intensity variation between pixels.
Area Sampling Straight-Line Segments We perform area sampling for a straight line by setting pixel intensity proportional to the area of overlap of the pixel with the finite-width line. The line can be treated as a rectangle, and the section of the line area between two adjacent vertical (or two adjacent horizontal) screen grid lines is then a polygon. Overlap areas for pixels are calculated by determining how much of the polygon overlaps each pixel in that column (or row). In Figure 60, the pixel with screen grid coordinates (10, 20) is about 90 percent covered by the line area, so its intensity would be set to 90 percent of the maximum intensity. Similarly, the pixel at (10, 21) would be set to an intensity of about 15 percent of maximum. A method for estimating pixel overlap areas is illustrated by the supersampling example in Figure 60. The total number of subpixels within the line boundaries is approximately equal to the overlap area, and this estimation is improved by using finer subpixel grids.
Filtering Techniques A more accurate method for antialiasing lines is to use filtering techniques. The method is similar to applying a weighted pixel mask, but now we imagine a continuous weighting surface (or filter function) covering the pixel. Figure 62 shows examples of rectangular, conical, and Gaussian filter functions. Methods for applying the filter function are similar to those for applying a weighting mask, but now we integrate over the pixel surface to obtain the weighted average intensity. To reduce computation, table lookups are commonly used to evaluate the integrals.
Pixel Phasing On raster systems that can address subpixel positions within the screen grid, pixel phasing can be used to antialias objects. A line display is smoothed with this technique by moving (micropositioning) pixel positions closer to the line path. Systems incorporating pixel phasing are designed so that the electron beam can be shifted by a fraction of a pixel diameter. The electron beam is typically shifted by 14 , 12 , or 34 of a pixel diameter to plot points closer to the true path of a line or object edge. Some systems also allow the size of individual pixels to be adjusted as an additional means for distributing intensities. Figure 63 illustrates the antialiasing effects of pixel phasing on a variety of line paths.
180
Implementation Algorithms for Graphics Primitives and Attributes
FIGURE 62
Box Filter (a)
Cone Filter (b)
Gaussian Filter (c)
Common filter functions used to antialias line paths. The volume of each filter is normalized to 1.0, and the height gives the relative weight at any subpixel position.
FIGURE 63
(a)
(b)
Jagged lines (a), plotted on the Merlin 9200 system, are smoothed (b) with an antialiasing technique called pixel phasing. This technique increases the number of addressable points on the system from 768 by 576 to 3072 by 2304. (Courtesy of Peritek Corp.)
Compensating for Line-Intensity Differences Antialiasing a line to soften the stair-step effect also compensates for another raster effect, illustrated in Figure 64. Both lines are plotted with the same number of √ pixels, yet the diagonal line is longer than the horizontal line by a factor of 2. For example, if the horizontal line had a length of 10 centimeters, the diagonal line would have a length of more than 14 centimeters. The visual effect of this is that the diagonal line appears less bright than the horizontal line, because the diagonal line is displayed with a lower intensity per unit length. A line-drawing algorithm could be adapted to compensate for this effect by adjusting the intensity of each
181
Implementation Algorithms for Graphics Primitives and Attributes
FIGURE 64
Unequal length lines displayed with the same number of pixels in each line.
line according to its slope. Horizontal and vertical lines would be displayed with the lowest intensity, while 45◦ lines would be given the highest intensity. But if antialiasing techniques are applied to a display, intensities are compensated automatically. When the finite width of a line is taken into account, pixel intensities are adjusted so that the line displays a total intensity proportional to its length.
Antialiasing Area Boundaries The antialiasing concepts that we have discussed for lines can also be applied to the boundaries of areas to remove their jagged appearance. We can incorporate these procedures into a scan-line algorithm to smooth out the boundaries as the area is generated. If system capabilities permit the repositioning of pixels, we could smooth area boundaries by shifting pixel positions closer to the boundary. Other methods adjust pixel intensity at a boundary position according to the percent of the pixel area that is interior to the object. In Figure 65, the pixel at position (x, y) has about half its area inside the polygon boundary. Therefore, the intensity at that position would be adjusted to one-half its assigned value. At the next position (x + 1, y + 1) along the boundary, the intensity is adjusted to about one-third the assigned value for that point. Similar adjustments, based on the percent of pixel area coverage, are applied to the other intensity values around the boundary. Supersampling methods can be applied by determining the number of subpixels that are in the interior of an object. A partitioning scheme with four subareas per pixel is shown in Figure 66. The original 4 × 4 grid of pixels is turned into an 8 × 8 grid, and we now process eight scan lines across this grid instead of four. Figure 67 shows one of the pixel areas in this grid that overlaps an object ce y rfa dar u S un Bo
Scan Line 1 y⫹1 Scan Line 2 y x
182
Subdivided Pixel Area
x⫹1
FIGURE 65
FIGURE 66
FIGURE 67
Adjusting pixel intensities along an area boundary.
A 4 × 4 pixel section of a raster display subdivided into an 8 × 8 grid.
A subdivided pixel area with three subdivisions inside an object boundary line.
Implementation Algorithms for Graphics Primitives and Attributes
boundary. Along the two scan lines, we determine that three of the subpixel areas are inside the boundary. So we set the pixel intensity at 75 percent of its maximum value. Another method for determining the percentage of pixel area within a fill region, developed by Pitteway and Watkinson, is based on the midpoint line algorithm. This algorithm selects the next pixel along a line by testing the location of the midposition between two pixels. As in the Bresenham algorithm, we set up a decision parameter p whose sign tells us which of the next two candidate pixels is closer to the line. By slightly modifying the form of p, we obtain a quantity that also gives the percentage of the current pixel area that is covered by an object. We first consider the method for a line with slope m in the range from 0 to 1. In Figure 68, a straight-line path is shown on a pixel grid. Assuming that the pixel at position (xk , yk ) has been plotted, the next pixel nearest the line at x = xk + 1 is either the pixel at yk or the one at yk + 1. We can determine which pixel is nearer with the calculation y − ymid = [m(xk + 1) + b] − (yk + 0.5)
(70)
This gives the vertical distance from the actual y coordinate on the line to the halfway point between pixels at position yk and yk + 1. If this difference calculation is negative, the pixel at yk is closer to the line. If the difference is positive, the pixel at yk + 1 is closer. We can adjust this calculation so that it produces a positive number in the range from 0 to 1 by adding the quantity 1 − m: p = [m(xk + 1) + b] − (yk + 0.5) + (1 − m)
(71)
Now the pixel at yk is nearer if p < 1 − m, and the pixel at yk + 1 is nearer if p > 1 − m. Parameter p also measures the amount of the current pixel that is overlapped by the area. For the pixel at (xk , yk ) in Figure 69, the interior part of the pixel is trapezoidal and has an area that can be calculated as area = m · xk + b − yk + 0.5
(72)
This expression for the overlap area of the pixel at (xk , yk ) is the same as that for parameter p in Equation 71. Therefore, by evaluating p to determine the next pixel position along the polygon boundary, we also determine the percentage of area coverage for the current pixel. We can generalize this algorithm to accommodate lines with negative slopes and lines with slopes greater than 1. This calculation for parameter p could then yk 0.5 dary Boun e Lin
y mx b
yk
y m(xk 0.5) b
y m(xk 0.5) b
yk 1 yk 0.5
Overlap Area
yk 0.5
yk
xk
xk 0.5
xk1
xk
xk 0.5
FIGURE 69 FIGURE 68
Boundary edge of a fill area passing through a pixel grid section.
Overlap area of a pixel rectangle, centered at position (x k , y k ), with the interior of a polygon fill area.
183
Implementation Algorithms for Graphics Primitives and Attributes
FIGURE 70
Polygons with more than one boundary line passing through individual pixel regions.
be incorporated into a midpoint line algorithm to locate pixel positions along a polygon edge and, concurrently, adjust pixel intensities along the boundary lines. Also, we can adjust the calculations to reference pixel coordinates at their lower-left coordinates and maintain area proportions, as discussed in Section 8. At polygon vertices and for very skinny polygons, as shown in Figure 70, we have more than one boundary edge passing through a pixel area. For these cases, we need to modify the Pitteway-Watkinson algorithm by processing all edges passing through a pixel and determining the correct interior area. Filtering techniques discussed for line antialiasing can also be applied to area edges. In addition, the various antialiasing methods can be applied to polygon areas or to regions with curved boundaries. Equations describing the boundaries are used to estimate the amount of pixel overlap with the area to be displayed, and coherence techniques are used along and between scan lines to simplify the calculations.
16 Summary Three methods that can be used to locate pixel positions along a straight-line path are the DDA algorithm, Bresenham’s algorithm, and the midpoint method. Bresenham’s line algorithm and the midpoint line method are equivalent, and they are the most efficient. Color values for the pixel positions along the line path are efficiently stored in the frame buffer by incrementally calculating the memory addresses. Any of the line-generating algorithms can be adapted to a parallel implementation by partitioning the line segments and distributing the partitions among the available processors. Circles and ellipses can be efficiently and accurately scan-converted using midpoint methods and taking curve symmetry into account. Other conic sections (parabolas and hyperbolas) can be plotted with similar methods. Spline curves, which are piecewise continuous polynomials, are widely used in animation and in CAD. Parallel implementations for generating curve displays can be accomplished with methods similar to those for parallel line processing. To account for the fact that displayed lines and curves have finite widths, we can adjust the pixel dimensions of objects to coincide to the specified geometric dimensions. This can be done with an addressing scheme that references pixel positions at their lower-left corner, or by adjusting line lengths. Scan-line methods are commonly used to fill polygons, circles, and ellipses. Across each scan line, the interior fill is applied to pixel positions between each pair of boundary intersections, left to right. For polygons, scan-line intersections with vertices can result in an odd number of intersections. This can be resolved by shortening some polygon edges. Scan-line fill algorithms can be simplified if fill areas are restricted to convex polygons. A further simplification is achieved if all fill areas in a scene are triangles. The interior pixels along each scan line are assigned appropriate color values, depending on the fill-attribute specifications. Painting programs generally display fill regions using a boundary-fill method or a flood-fill method. Each of these two fill methods requires an initial interior point. The interior is then painted pixel by pixel from the initial point out to the region boundaries. Soft-fill procedures provide a new fill color for a region that has the same variations as the previous fill color. One example of this approach is the linear soft-fill algorithm that assumes that the previous fill was a linear combination of foreground and background colors. This same linear relationship is then
184
Implementation Algorithms for Graphics Primitives and Attributes
determined from the frame buffer settings and used to repaint the area in a new color. We can improve the appearance of raster primitives by applying antialiasing procedures that adjust pixel intensities. One method for doing this is to supersample. That is, we consider each pixel to be composed of subpixels and we calculate the intensity of the subpixels and average the values of all subpixels. We can also weight the subpixel contributions according to position, giving higher weights to the central subpixels. Alternatively, we can perform area sampling and determine the percentage of area coverage for a screen pixel, then set the pixel intensity proportional to this percentage. Another method for antialiasing is to build special hardware configurations that can shift pixel positions.
REFERENCES Basic information on Bresenham’s algorithms can be found in Bresenham (1965 and 1977). For midpoint methods, see Kappel (1985). Parallel methods for generating lines and circles are discussed in Pang (1990) and in Wright (1990). Many other methods for generating and processing graphics primitives are discussed in Foley, et al. (1990), Glassner (1990), Arvo (1991), Kirk (1992), Heckbert (1994), and Paeth (1995). Soft-fill techniques are given in Fishkin and Barsky (1984). Antialiasing techniques are discussed in Pitteway and Watinson (1980), Crow (1981), Turkowski (1982), Fujimoto and Iwata (1983), Korein and Badler (1983), Kirk and Arvo (1991), and Wu (1991). Gray-scale applications are explored in Crow (1978). Other discussions of attributes and state parameters are available in Glassner (1990), Arvo (1991), Kirk (1992), Heckbert (1994), and Paeth (1995).
6
EXERCISES
10
1
2
3
4
5
Implement a polyline function using the DDA algorithm, given any number (n) of input points. A single point is to be plotted when n = 1. Extend Bresenham’s line algorithm to generate lines with any slope, taking symmetry between quadrants into account. Implement a polyline function, using the algorithm from the previous exercise, to display the set of straight lines connecting a list of n input points. For n = 1, the routine displays a single point. Use the midpoint method to derive decision parameters for generating points along a straightline path with slope in the range 0 < m < 1. Show that the midpoint decision parameters are the same as those in the Bresenham line algorithm. Use the midpoint method to derive decision parameters that can be used to generate straightline segments with any slope.
7 8
9
11
12
13 14
Set up a parallel version of Bresenham’s line algorithm for slopes in the range 0 < m < 1. Set up a parallel version of Bresenham’s algorithm for straight lines with any slope. Suppose you have a system with an 8 inch by 10 inch video monitor that can display 100 pixels per inch. If memory is organized in one byte words, the starting frame buffer address is 0, and each pixel is assigned one byte of storage, what is the frame buffer address of the pixel with screen coordinates (x, y)? Suppose you have a system with a 12 inch by 14 inch video monitor that can display 120 pixels per inch. If memory is organized in one byte words, the starting frame buffer address is 0, and each pixel is assigned one byte of storage, what is the frame buffer address of the pixel with screen coordinates (x, y)? Suppose you have a system with a 12 inch by 14 inch video monitor that can display 120 pixels per inch. If memory is organized in one byte words, the starting frame buffer address is 0, and each pixel is assigned 4 bits of storage, what is the frame buffer address of the pixel with screen coordinates (x, y)? Incorporate the iterative techniques for calculating frame-buffer addresses (Section 3) into the Bresenham line algorithm. Revise the midpoint circle algorithm to display circles with input geometric magnitudes preserved (Section 8). Set up a procedure for a parallel implementation of the midpoint circle algorithm. Derive decision parameters for the midpoint ellipse algorithm assuming the start position is (r x , 0) and points are to be generated along the curve path in counterclockwise order.
185
Implementation Algorithms for Graphics Primitives and Attributes
15 Set up a procedure for a parallel implementation of the midpoint ellipse algorithm. 16 Devise an efficient algorithm that takes advantage of symmetry properties to display a sine function over one cycle. 17 Modify the algorithm in the preceding exercise to display a sine curve over any specified angular interval. 18 Devise an efficient algorithm, taking function symmetry into account, to display a plot of damped harmonic motion: y = Ae −kx sin(ωx + θ ) where ω is the angular frequency and θ is the phase of the sine function. Plot y as a function of x for several cycles of the sine function or until the maxA . imum amplitude is reduced to 10 19 Use the algorithm developed in the previous exercise to write a program that displays one cycle of a sine curve. The curve should begin at the left edge of the display window and complete at the right edge, and the amplitude should be scaled so that the maximum and minimum values of the curve are equal to the maximum and minimum y values of the display window. 20 Using the midpoint method, and taking symmetry into account, develop an efficient algorithm for scan conversion of the following curve over the interval −10 ≤ x ≤ 10. y=
y = a x2 + b
25
26
27
28 29 30
31
32
1 3 x 12
21 Use the algorithm developed in the previous exercise to write a program that displays a portion of a sine curve determined by an input angular interval. The curve should begin at the left edge of the display window and complete at the right edge, and the amplitude should be scaled so that the maximum and minimum values of the curve are equal to the maximum and minimum y values of the display window. 22 Use the midpoint method and symmetry considerations to scan convert the parabola x = y2 − 5 over the interval −10 ≤ x ≤ 10. 23 Use the midpoint method and symmetry considerations to scan convert the parabola y = 50 − x 2 over the interval −5 ≤ x ≤ 5. 24 Set up a midpoint algorithm, taking symmetry considerations into account to scan convert any
186
parabola of the form
33
34
35
36
37
with input values for parameters a , b, and the range for x. Define an efficient polygon-mesh representation for a cylinder and justify your choice of representation. Implement a general line-style function by modifying Bresenham’s line-drawing algorithm to display solid, dashed, or dotted lines. Implement a line-style function using a midpoint line algorithm to display solid, dashed, or dotted lines. Devise a parallel method for implementing a linestyle function. Devise a parallel method for implementing a linewidth function. A line specified by two endpoints and a width can be converted to a rectangular polygon with four vertices and then displayed using a scan-line method. Develop an efficient algorithm for computing the four vertices needed to define such a rectangle, with the line endpoints and line width as input parameters. Implement a line-width function in a line-drawing program so that any one of three line widths can be displayed. Write a program to output a line graph of three data sets defined over the same x-coordinate range. Input to the program is to include the three sets of data values and the labels for the graph. The data sets are to be scaled to fit within a defined coordinate range for a display window. Each data set is to be plotted with a different line style. Modify the program in the previous exercise to plot the three data sets in different colors, as well as different line styles. Set up an algorithm for displaying thick lines with butt caps, round caps, or projecting square caps. These options can be provided in an option menu. Devise an algorithm for displaying thick polylines with a miter join, a round join, or a bevel join. These options can be provided in an option menu. Implement pen and brush menu options for a linedrawing procedure, including at least two options: round and square shapes. Modify a line-drawing algorithm so that the intensity of the output line is set according to its slope. That is, by adjusting pixel intensities according to the value of the slope, all lines are displayed with the same intensity per unit length.
Implementation Algorithms for Graphics Primitives and Attributes
38 Define and implement a function for controlling the line style (solid, dashed, dotted) of displayed ellipses. 39 Define and implement a function for setting the width of displayed ellipses. 40 Modify the scan-line algorithm to apply any specified rectangular fill pattern to a polygon interior, starting from a designated pattern position. 41 Write a program to scan convert the interior of a specified ellipse into a solid color. 42 Write a procedure to fill the interior of a given ellipse with a specified pattern. 43 Write a procedure for filling the interior of any specified set of fill-area vertices, including one with crossing edges, using the nonzero winding number rule to identify interior regions. 44 Modify the boundary-fill algorithm for a 4-connected region to avoid excessive stacking by incorporating scan-line methods. 45 Write a boundary-fill procedure to fill an 8-connected region. 46 Explain how an ellipse displayed with the midpoint method could be properly filled with a boundary-fill algorithm. 47 Develop and implement a flood-fill algorithm to fill the interior of any specified area. 48 Define and implement a procedure for changing the size of an existing rectangular fill pattern. 49 Write a procedure to implement a soft-fill algorithm. Carefully define what the soft-fill algorithm is to accomplish and how colors are to be combined. 50 Devise an algorithm for adjusting the height and width of characters defined as rectangular grid patterns. 51 Implement routines for setting the character up vector and the text path for controlling the display of character strings. 52 Write a program to align text as specified by input values for the alignment parameters. 53 Develop procedures for implementing marker attributes (size and color).
54 Implement an antialiasing procedure by extending Bresenham’s line algorithm to adjust pixel intensities in the vicinity of a line path. 55 Implement an antialiasing procedure for the midpoint line algorithm. 56 Develop an algorithm for antialiasing elliptical boundaries. 57 Modify the scan-line algorithm for area fill to incorporate antialiasing. Use coherence techniques to reduce calculations on successive scan lines. 58 Write a program to implement the PittewayWatkinson antialiasing algorithm as a scan-line procedure to fill a polygon interior, using the OpenGL point-plotting function.
IN MORE DEPTH 1
2
Write routines to implement Bresenham’s linedrawing algorithm and the DDA line-drawing algorithm and use them to draw the outlines of the shapes in the current snapshot of your application. Record the runtimes of the algorithms and compare the performance of the two. Next examine the polygons that represent the objects in your scene and either choose a few that would be better represented using ellipses or other curves or add a few objects with this property. Implement a midpoint algorithm to draw the ellipses or curves that represent these objects and use it to draw the outlines of those objects. Discuss ways in which you could improve the performance of the algorithms you developed if you had direct access to parallel hardware. Implement the general scan-line polygon-fill algorithm to fill in the polygons that make up the objects in your scene with solid colors. Next, implement a scan-line curve-filling algorithm to fill the curved objects you added in the previous exercise. Finally, implement a boundary fill algorithm to fill all of the objects in your scene. Compare the run times of the two approaches to filling in the shapes in your scene.
187
This page intentionally left blank
Two-Dimensional Geometric Transformations
1
Basic Two-Dimensional Geometric Transformations
2
Matrix Representations and Homogeneous Coordinates
3
Inverse Transformations
4
Two-Dimensional Composite Transformations
5
Other Two-Dimensional Transformations
6
Raster Methods for Geometric Transformations
7
OpenGL Raster Transformations
8
Transformations between Two-Dimensional Coordinate Systems
9
OpenGL Functions for Two-Dimensional Geometric Transformations
10
OpenGL Geometric-Transformation Programming Examples
11
Summary
S
o far, we have seen how we can describe a scene in terms of graphics primitives, such as line segments and fill areas, and the attributes associated with these primitives. Also, we have explored the scan-line algorithms for displaying output primitives on a raster device. Now, we take a look at transformation operations that we can apply to objects to reposition or resize them. These operations are also used in the viewing routines that convert a world-coordinate scene description to a display for an output device. In addition, they are used in a variety of other applications, such as computer-aided design (CAD) and computer animation. An architect, for example, creates a layout by arranging the orientation and size of the component parts of a design, and a computer animator develops a video sequence by moving the “camera” position or the objects in a scene along specified paths. Operations that are applied to the geometric description of an object to change its position, orientation, or size are called geometric transformations. Sometimes geometric transformations are also referred to as modeling transformations, but some graphics packages make a
From Chapter 7 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
189
Two-Dimensional Geometric Transformations
distinction between the two. In general, modeling transformations are used to construct a scene or to give the hierarchical description of a complex object that is composed of several parts, which in turn could be composed of simpler parts, and so forth. For example, an aircraft consists of wings, tail, fuselage, engine, and other components, each of which can be specified in terms of second-level components, and so on, down the hierarchy of component parts. Thus, the aircraft can be described in terms of these components and an associated “modeling” transformation for each one that describes how that component is to be fitted into the overall aircraft design. Geometric transformations, on the other hand, can be used to describe how objects might move around in a scene during an animation sequence or simply to view them from another angle. Therefore, some graphics packages provide two sets of transformation routines, while other packages have a single set of functions that can be used for both geometric transformations and modeling transformations.
1 Basic Two-Dimensional Geometric Transformations The geometric-transformation functions that are available in all graphics packages are those for translation, rotation, and scaling. Other useful transformation routines that are sometimes included in a package are reflection and shearing operations. To introduce the general concepts associated with geometric transformations, we first consider operations in two dimensions. Once we understand the basic concepts, we can easily write routines to perform geometric transformations on objects in a two-dimensional scene.
Two-Dimensional Translation y
P T P x FIGURE 1
Translating a point from position P to position P using a translation vector T.
We perform a translation on a single coordinate point by adding offsets to its coordinates so as to generate a new coordinate position. In effect, we are moving the original point position along a straight-line path to its new location. Similarly, a translation is applied to an object that is defined with multiple coordinate positions, such as a quadrilateral, by relocating all the coordinate positions by the same displacement along parallel paths. Then the complete object is displayed at the new location. To translate a two-dimensional position, we add translation distances tx and ty to the original coordinates (x, y) to obtain the new coordinate position (x , y ) as shown in Figure 1. x = x + tx ,
y = y + ty
(1)
The translation distance pair (tx , ty ) is called a translation vector or shift vector. We can express Equations 1 as a single matrix equation by using the following column vectors to represent coordinate positions and the translation vector: x x t P= , P = (2) , T= x y y ty This allows us to write the two-dimensional translation equations in the matrix form P = P + T
190
(3)
Two-Dimensional Geometric Transformations y 10
5
0
5
10
15
20 x
15
20 x
(a) y 10
5
0
5
10 (b)
FIGURE 2
Moving a polygon from position (a) to position (b) with the translation vector (−5.50, 3.75).
Translation is a rigid-body transformation that moves objects without deformation. That is, every point on the object is translated by the same amount. A straight-line segment is translated by applying Equation 3 to each of the two line endpoints and redrawing the line between the new endpoint positions. A polygon is translated similarly. We add a translation vector to the coordinate position of each vertex and then regenerate the polygon using the new set of vertex coordinates. Figure 2 illustrates the application of a specified translation vector to move an object from one position to another. The following routine illustrates the translation operations. An input translation vector is used to move the n vertices of a polygon from one world-coordinate position to another, and OpenGL routines are used to regenerate the translated polygon. class wcPt2D { public: GLfloat x, y; }; void translatePolygon (wcPt2D * verts, GLint nVerts, GLfloat tx, GLfloat ty) { GLint k; for (k = 0; k < nVerts; k++) { verts [k].x = verts [k].x + tx; verts [k].y = verts [k].y + ty; } glBegin (GL_POLYGON); for (k = 0; k < nVerts; k++) glVertex2f (verts [k].x, verts [k].y); glEnd ( ); }
If we want to delete the original polygon, we could display it in the background color before translating it. Other methods for deleting picture components
191
Two-Dimensional Geometric Transformations
are available in some graphics packages. Also, if we want to save the original polygon position, we can store the translated positions in a different array. Similar methods are used to translate other objects. To change the position of a circle or ellipse, we translate the center coordinates and redraw the figure in the new location. For a spline curve, we translate the points that define the curve path and then reconstruct the curve sections between the new coordinate positions.
Two-Dimensional Rotation
u yr
xr FIGURE 3
Rotation of an object through angle θ about the pivot point (x r , y r ).
(x, y) r u
r
(x, y)
f FIGURE 4
Rotation of a point from position (x , y ) to position (x , y ) through an angle θ relative to the coordinate origin. The original angular displacement of the point from the x axis is φ.
We generate a rotation transformation of an object by specifying a rotation axis and a rotation angle. All points of the object are then transformed to new positions by rotating the points through the specified angle about the rotation axis. A two-dimensional rotation of an object is obtained by repositioning the object along a circular path in the xy plane. In this case, we are rotating the object about a rotation axis that is perpendicular to the xy plane (parallel to the coordinate z axis). Parameters for the two-dimensional rotation are the rotation angle θ and a position (xr , yr ), called the rotation point (or pivot point), about which the object is to be rotated (Figure 3). The pivot point is the intersection position of the rotation axis with the xy plane. A positive value for the angle θ defines a counterclockwise rotation about the pivot point, as in Figure 3, and a negative value rotates objects in the clockwise direction. To simplify the explanation of the basic method, we first determine the transformation equations for rotation of a point position P when the pivot point is at the coordinate origin. The angular and coordinate relationships of the original and transformed point positions are shown in Figure 4. In this figure, r is the constant distance of the point from the origin, angle φ is the original angular position of the point from the horizontal, and θ is the rotation angle. Using standard trigonometric identities, we can express the transformed coordinates in terms of angles θ and φ as x = r cos(φ + θ) = r cos φ cos θ − r sin φ sin θ y = r sin(φ + θ) = r cos φ sin θ + r sin φ cos θ The original coordinates of the point in polar coordinates are x = r cos φ,
y = r sin φ
(4)
(5)
Substituting expressions 5 into 4, we obtain the transformation equations for rotating a point at position (x, y) through an angle θ about the origin: x = x cos θ − y sin θ y = x sin θ + y cos θ
(6)
With the column-vector representations 2 for coordinate positions, we can write the rotation equations in the matrix form P = R · P where the rotation matrix is
R=
cos θ sin θ
− sin θ cos θ
(7)
(8)
A column-vector representation for a coordinate position P, as in Equations 2, is standard mathematical notation. However, early graphics systems sometimes used a row-vector representation for point positions. This changes the order in which the matrix multiplication for a rotation would be performed. But now, graphics packages such as OpenGL, Java, PHIGS, and GKS all follow the standard column-vector convention.
192
Two-Dimensional Geometric Transformations
Rotation of a point about an arbitrary pivot position is illustrated in Figure 5. Using the trigonometric relationships indicated by the two right triangles in this figure, we can generalize Equations 6 to obtain the transformation equations for rotation of a point about any specified rotation position (xr , yr ):
(x, y)
x = xr + (x − xr ) cos θ − (y − yr ) sin θ y = yr + (x − xr ) sin θ + (y − yr ) cos θ
(9)
These general rotation equations differ from Equations 6 by the inclusion of additive terms, as well as the multiplicative factors on the coordinate values. The matrix expression 7 could be modified to include pivot coordinates by including the matrix addition of a column vector whose elements contain the additive (translational) terms in Equations 9. There are better ways, however, to formulate such matrix equations, and in Section 2, we discuss a more consistent scheme for representing the transformation equations. As with translations, rotations are rigid-body transformations that move objects without deformation. Every point on an object is rotated through the same angle. A straight-line segment is rotated by applying Equations 9 to each of the two line endpoints and redrawing the line between the new endpoint positions. A polygon is rotated by displacing each vertex using the specified rotation angle and then regenerating the polygon using the new vertices. We rotate a curve by repositioning the defining points for the curve and then redrawing it. A circle or an ellipse, for instance, can be rotated about a noncentral pivot point by moving the center position through the arc that subtends the specified rotation angle. In addition, we could rotate an ellipse about its center coordinates simply by rotating the major and minor axes. In the following code example, a polygon is rotated about a specified worldcoordinate pivot point. Parameters input to the rotation procedure are the original vertices of the polygon, the pivot-point coordinates, and the rotation angle theta specified in radians. Following the transformation of the vertex positions, the polygon is regenerated using OpenGL routines.
r (xr , yr )
u
r
(x, y)
f
FIGURE 5
Rotating a point from position (x , y ) to position (x , y ) through an angle θ about rotation point (x r , y r ).
class wcPt2D { public: GLfloat x, y; }; void rotatePolygon (wcPt2D * verts, GLint nVerts, wcPt2D pivPt, GLdouble theta) { wcPt2D * vertsRot; GLint k; for (k = 0; k < nVerts; k++) { vertsRot [k].x = pivPt.x + (verts [k].x - pivPt.x) * cos (theta) - (verts [k].y - pivPt.y) * sin (theta); vertsRot [k].y = pivPt.y + (verts [k].x - pivPt.x) * sin (theta) + (verts [k].y - pivPt.y) * cos (theta); } glBegin {GL_POLYGON}; for (k = 0; k < nVerts; k++) glVertex2f (vertsRot [k].x, vertsRot [k].y); glEnd ( ); }
193
Two-Dimensional Geometric Transformations
Two-Dimensional Scaling To alter the size of an object, we apply a scaling transformation. A simple twodimensional scaling operation is performed by multiplying object positions (x, y) by scaling factors sx and s y to produce the transformed coordinates (x , y ): x = x · sx ,
y = y · s y
(10)
Scaling factor sx scales an object in the x direction, while s y scales in the y direction. The basic two-dimensional scaling equations 10 can also be written in the following matrix form: x sx 0 x = · (11) y 0 sy y (a)
or P = S · P
(b) FIGURE 6
Turning a square (a) into a rectangle (b) with scaling factors s x = 2 and s y = 1.
x
x
FIGURE 7
A line scaled with Equation 12 using s x = s y = 0.5 is reduced in size and moved closer to the coordinate origin.
y P1 (xf , yf) P3 x FIGURE 8
Scaling relative to a chosen fixed point (x f , y f ). The distance from each polygon vertex to the fixed point is scaled by Equations 13.
194
where S is the 2 × 2 scaling matrix in Equation 11. Any positive values can be assigned to the scaling factors sx and s y . Values less than 1 reduce the size of objects; values greater than 1 produce enlargements. Specifying a value of 1 for both sx and s y leaves the size of objects unchanged. When sx and s y are assigned the same value, a uniform scaling is produced, which maintains relative object proportions. Unequal values for sx and s y result in a differential scaling that is often used in design applications, where pictures are constructed from a few basic shapes that can be adjusted by scaling and positioning transformations (Figure 6). In some systems, negative values can also be specified for the scaling parameters. This not only resizes an object, it reflects it about one or more of the coordinate axes. Objects transformed with Equation 11 are both scaled and repositioned. Scaling factors with absolute values less than 1 move objects closer to the coordinate origin, while absolute values greater than 1 move coordinate positions farther from the origin. Figure 7 illustrates scaling of a line by assigning the value 0.5 to both sx and s y in Equation 11. Both the line length and the distance from the origin are reduced by a factor of 12 . We can control the location of a scaled object by choosing a position, called the fixed point, that is to remain unchanged after the scaling transformation. Coordinates for the fixed point, (x f , y f ), are often chosen at some object position, such as its centroid (see Appendix A), but any other spatial position can be selected. Objects are now resized by scaling the distances between object points and the fixed point (Figure 8). For a coordinate position (x, y), the scaled coordinates (x , y ) are then calculated from the following relationships: x − x f = (x − x f )sx ,
P2
(12)
y − y f = (y − y f )s y
(13)
We can rewrite Equations 13 to separate the multiplicative and additive terms as x = x · sx + x f (1 − sx ) y = y · s y + y f (1 − s y )
(14)
where the additive terms x f (1 − sx ) and y f (1 − s y ) are constants for all points in the object. Including coordinates for a fixed point in the scaling equations is similar to including coordinates for a pivot point in the rotation equations. We can set up
Two-Dimensional Geometric Transformations
a column vector whose elements are the constant terms in Equations 14, then add this column vector to the product S · P in Equation 12. In the next section, we discuss a matrix formulation for the transformation equations that involves only matrix multiplication. Polygons are scaled by applying transformations 14 to each vertex, then regenerating the polygon using the transformed vertices. For other objects, we apply the scaling transformation equations to the parameters defining the objects. To change the size of a circle, we can scale its radius and calculate the new coordinate positions around the circumference. And to change the size of an ellipse, we apply scaling parameters to its two axes and then plot the new ellipse positions about its center coordinates. The following procedure illustrates an application of the scaling calculations for a polygon. Coordinates for the polygon vertices and for the fixed point are input parameters, along with the scaling factors. After the coordinate transformations, OpenGL routines are used to generate the scaled polygon.
class wcPt2D { public: GLfloat x, y; }; void scalePolygon (wcPt2D * verts, GLint nVerts, wcPt2D fixedPt, GLfloat sx, GLfloat sy) { wcPt2D vertsNew; GLint k; for (k = 0; k < nVerts; k++) { vertsNew [k].x = verts [k].x * sx + fixedPt.x * (1 - sx); vertsNew [k].y = verts [k].y * sy + fixedPt.y * (1 - sy); } glBegin {GL_POLYGON}; for (k = 0; k < nVerts; k++) glVertex2f (vertsNew [k].x, vertsNew [k].y); glEnd ( ); }
2 Matrix Representations and Homogeneous Coordinates Many graphics applications involve sequences of geometric transformations. An animation might require an object to be translated and rotated at each increment of the motion. In design and picture construction applications, we perform translations, rotations, and scalings to fit the picture components into their proper positions. The viewing transformations involve sequences of translations and rotations to take us from the original scene specification to the display on an output device. Here, we consider how the matrix representations discussed in the previous sections can be reformulated so that such transformation sequences can be processed efficiently.
195
Two-Dimensional Geometric Transformations
We have seen in Section 1 that each of the three basic two-dimensional transformations (translation, rotation, and scaling) can be expressed in the general matrix form P = M1 · P + M 2
(15)
with coordinate positions P and P represented as column vectors. Matrix M1 is a 2 × 2 array containing multiplicative factors, and M2 is a two-element column matrix containing translational terms. For translation, M1 is the identity matrix. For rotation or scaling, M2 contains the translational terms associated with the pivot point or scaling fixed point. To produce a sequence of transformations with these equations, such as scaling followed by rotation and then translation, we could calculate the transformed coordinates one step at a time. First, coordinate positions are scaled, then these scaled coordinates are rotated, and finally, the rotated coordinates are translated. A more efficient approach, however, is to combine the transformations so that the final coordinate positions are obtained directly from the initial coordinates, without calculating intermediate coordinate values. We can do this by reformulating Equation 15 to eliminate the matrix addition operation.
Homogeneous Coordinates Multiplicative and translational terms for a two-dimensional geometric transformation can be combined into a single matrix if we expand the representations to 3 × 3 matrices. Then we can use the third column of a transformation matrix for the translation terms, and all transformation equations can be expressed as matrix multiplications. But to do so, we also need to expand the matrix representation for a two-dimensional coordinate position to a three-element column matrix. A standard technique for accomplishing this is to expand each twodimensional coordinate-position representation (x, y) to a three-element representation (xh , yh , h), called homogeneous coordinates, where the homogeneous parameter h is a nonzero value such that x=
xh , h
y=
yh h
(16)
Therefore, a general two-dimensional homogeneous coordinate representation could also be written as (h·x, h·y, h). For geometric transformations, we can choose the homogeneous parameter h to be any nonzero value. Thus, each coordinate point (x, y) has an infinite number of equivalent homogeneous representations. A convenient choice is simply to set h = 1. Each two-dimensional position is then represented with homogeneous coordinates (x, y, 1). Other values for parameter h are needed, for example, in matrix formulations of three-dimensional viewing transformations. The term homogeneous coordinates is used in mathematics to refer to the effect of this representation on Cartesian equations. When a Cartesian point (x, y) is converted to a homogeneous representation (xh , yh , h), equations containing x and y, such as f (x, y) = 0, become homogeneous equations in the three parameters xh , yh , and h. This just means that if each of the three parameters is replaced by any value v times that parameter, the value v can be factored out of the equations. Expressing positions in homogeneous coordinates allows us to represent all geometric transformation equations as matrix multiplications, which is the standard method used in graphics systems. Two-dimensional coordinate positions are represented with three-element column vectors, and two-dimensional transformation operations are expressed as 3 × 3 matrices.
196
Two-Dimensional Geometric Transformations
Two-Dimensional Translation Matrix Using a homogeneous-coordinate approach, we can represent the equations for a two-dimensional translation of a coordinate position using the following matrix multiplication: ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ x 1 0 tx x ⎣ y ⎦ = ⎣ 0 1 ty ⎦ · ⎣ y ⎦ (17) 1 0 0 1 1 This translation operation can be written in the abbreviated form P = T(tx , ty ) · P
(18)
with T(tx , ty ) as the 3 × 3 translation matrix in Equation 17. In situations where there is no ambiguity about the translation parameters, we can simply represent the translation matrix as T.
Two-Dimensional Rotation Matrix Similarly, two-dimensional rotation transformation equations about the coordinate origin can be expressed in the matrix form ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ x cos θ −sinθ 0 x ⎣ y ⎦ = ⎣ sin θ cos θ 0 ⎦ · ⎣ y ⎦ (19) 1 0 0 1 1 or as P = R(θ ) · P
(20)
The rotation transformation operator R(θ ) is the 3 × 3 matrix in Equation 19 with rotation parameter θ . We can also write this rotation matrix simply as R. In some graphics libraries, a two-dimensional rotation function generates only rotations about the coordinate origin, as in Equation 19. A rotation about any other pivot point must then be performed as a sequence of transformation operations. An alternative approach in a graphics package is to provide additional parameters in the rotation routine for the pivot-point coordinates. A rotation routine that includes a pivot-point parameter then sets up a general rotation matrix without the need to invoke a succession of transformation functions.
Two-Dimensional Scaling Matrix Finally, a scaling transformation relative to the coordinate origin can now be expressed as the matrix multiplication ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ x x sx 0 0 ⎣ y ⎦ = ⎣ 0 s y 0 ⎦ · ⎣ y ⎦ (21) 0 0 1 1 1 or P = S(sx , s y ) · P
(22)
The scaling operator S(sx , s y ) is the 3 × 3 matrix in Equation 21 with parameters sx and s y . And, in most cases, we can represent the scaling matrix simply as S. Some libraries provide a scaling function that can generate only scaling with respect to the coordinate origin, as in Equation 21. In this case, a scaling transformation relative to another reference position is handled as a succession of transformation operations. However, other systems do include a general scaling routine that can construct the homogeneous matrix for scaling with respect to a designated fixed point.
197
Two-Dimensional Geometric Transformations
3 Inverse Transformations For translation, we obtain the inverse matrix by negating the translation distances. Thus, if we have two-dimensional translation distances tx and ty , the inverse translation matrix is ⎡ ⎤ 1 0 −tx T−1 = ⎣ 0 1 −ty ⎦ (23) 0 0 1 This produces a translation in the opposite direction, and the product of a translation matrix and its inverse produces the identity matrix. An inverse rotation is accomplished by replacing the rotation angle by its negative. For example, a two-dimensional rotation through an angle θ about the coordinate origin has the inverse transformation matrix ⎡ ⎤ cos θ sin θ 0 R−1 = ⎣ − sin θ cos θ 0 ⎦ (24) 0 0 1 Negative values for rotation angles generate rotations in a clockwise direction, so the identity matrix is produced when any rotation matrix is multiplied by its inverse. Because only the sine function is affected by the change in sign of the rotation angle, the inverse matrix can also be obtained by interchanging rows and columns. That is, we can calculate the inverse of any rotation matrix R by evaluating its transpose (R−1 = RT ). We form the inverse matrix for any scaling transformation by replacing the scaling parameters with their reciprocals. For two-dimensional scaling with parameters sx and s y applied relative to the coordinate origin, the inverse transformation matrix is ⎤ ⎡ 1 0 0 ⎥ ⎢ sx ⎥ ⎢ 1 ⎥ S−1 = ⎢ (25) 0⎥ ⎢ 0 ⎦ ⎣ sy 0
0
1
The inverse matrix generates an opposite scaling transformation, so the multiplication of any scaling matrix with its inverse produces the identity matrix.
4 Two-Dimensional Composite Transformations Using matrix representations, we can set up a sequence of transformations as a composite transformation matrix by calculating the product of the individual transformations. Forming products of transformation matrices is often referred to as a concatenation, or composition, of matrices. Because a coordinate position is represented with a homogeneous column matrix, we must premultiply the column matrix by the matrices representing any transformation sequence. Also, because many positions in a scene are typically transformed by the same sequence, it is more efficient to first multiply the transformation matrices to form a single composite matrix. Thus, if we want to apply two transformations to point position P, the transformed location would be calculated as P = M2 · M1 · P = M·P
198
(26)
Two-Dimensional Geometric Transformations
The coordinate position is transformed using the composite matrix M, rather than applying the individual transformations M1 and then M2 .
Composite Two-Dimensional Translations If two successive translation vectors (t1x , t1y ) and (t2x , t2y ) are applied to a twodimensional coordinate position P, the final transformed location P is calculated as P = T(t2x , t2y ) · {T(t1x , t1y ) · P} = {T(t2x , t2y ) · T(t1x , t1y )} · P
(27)
where P and P are represented as three-element, homogeneous-coordinate column vectors. We can verify this result by calculating the matrix product for the two associative groupings. Also, the composite transformation matrix for this sequence of translations is ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 1 0 t2x 1 0 t1x 1 0 t1x + t2x ⎣ 0 1 t2y ⎦ · ⎣ 0 1 t1y ⎦ = ⎣ 0 1 t1y + t2y ⎦ (28) 0 0 1 0 0 1 0 0 1 or T(t2x , t2y ) · T(t1x , t1y ) = T(t1x + t2x , t1y + t2y )
(29)
which demonstrates that two successive translations are additive.
Composite Two-Dimensional Rotations Two successive rotations applied to a point P produce the transformed position P = R(θ2 ) · {R(θ1 ) · P} = {R(θ2 ) · R(θ1 )} · P
(30)
By multiplying the two rotation matrices, we can verify that two successive rotations are additive: R(θ2 ) · R(θ1 ) = R(θ1 + θ2 )
(31)
so that the final rotated coordinates of a point can be calculated with the composite rotation matrix as P = R(θ1 + θ2 ) · P
(32)
Composite Two-Dimensional Scalings Concatenating transformation matrices for two successive scaling operations in two dimensions produces the following composite scaling matrix: ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ s2x 0 0 0 0 s1x 0 0 s1x · s2x ⎣ 0 s2y 0 ⎦ · ⎣ 0 s1y 0 ⎦ = ⎣ 0 s1y · s2y 0 ⎦ (33) 0 0 1 0 0 1 0 0 1 or S(s2x , s2y ) · S(s1x , s1y ) = S(s1x · s2x , s1y · s2y )
(34)
The resulting matrix in this case indicates that successive scaling operations are multiplicative. That is, if we were to triple the size of an object twice in succession, the final size would be nine times that of the original.
199
Two-Dimensional Geometric Transformations
(xr , yr )
FIGURE 9
A transformation sequence for rotating an object about a specified pivot point using the rotation matrix R(θ ) of transformation 19.
(xr , yr)
(a)
(b)
(c)
(d)
Original Position of Object and Pivot Point
Translation of Object so that Pivot Point (xr , yr) is at Origin
Rotation about Origin
Translation of Object so that the Pivot Point is Returned to Position (xr , yr)
General Two-Dimensional Pivot-Point Rotation When a graphics package provides only a rotate function with respect to the coordinate origin, we can generate a two-dimensional rotation about any other pivot point (xr , yr ) by performing the following sequence of translate-rotate-translate operations: 1. Translate the object so that the pivot-point position is moved to the coor-
dinate origin. 2. Rotate the object about the coordinate origin. 3. Translate the object so that the pivot point is returned to its original
position. This transformation sequence is illustrated in Figure 9. The composite transformation matrix for this sequence is obtained with the concatenation ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 1 0 xr cos θ −sin θ 0 1 0 −xr ⎣ 0 1 yr ⎦ · ⎣ sin θ cos θ 0 ⎦ · ⎣ 0 1 −yr ⎦ 0 0 1 0 0 1 0 0 1 ⎡ ⎤ cos θ −sin θ xr (1 − cos θ) + yr sin θ = ⎣ sin θ cos θ yr (1 − cos θ) − xr sin θ ⎦ (35) 0 0 1 which can be expressed in the form T(xr , yr ) · R(θ ) · T(−xr , −yr ) = R(xr , yr , θ)
(36)
where T(−xr , −yr ) = T−1 (xr , yr ). In general, a rotate function in a graphics library could be structured to accept parameters for pivot-point coordinates, as well as the rotation angle, and to generate automatically the rotation matrix of Equation 35.
General Two-Dimensional Fixed-Point Scaling Figure 10 illustrates a transformation sequence to produce a two-dimensional scaling with respect to a selected fixed position (x f , y f ), when we have a function that can scale relative to the coordinate origin only. This sequence is 1. Translate the object so that the fixed point coincides with the coordinate
origin.
200
Two-Dimensional Geometric Transformations
(xf , yf)
(xf , yf)
(a) Original Position of Object and Fixed Point
(b) Translate Object so that Fixed Point (xf , yf) is at Origin
(c) Scale Object with Respect to Origin
(d) Translate Object so that the Fixed Point is Returned to Position (xf , yf)
FIGURE 10
A transformation sequence for scaling an object with respect to a specified fixed position using the scaling matrix S(s x , s y ) of transformation 21.
2. Scale the object with respect to the coordinate origin. 3. Use the inverse of the translation in step (1) to return the object to its
original position. Concatenating the matrices for these three operations produces the required scaling matrix: ⎡ ⎤ ⎡ ⎤ ⎤ ⎡ ⎤ ⎡ 1 0 xf 1 0 −x f sx 0 0 sx 0 x f (1 − sx ) ⎣ 0 1 y f ⎦ · ⎣ 0 s y 0 ⎦ · ⎣ 0 1 −y f ⎦ = ⎣ 0 s y y f (1 − s y ) ⎦ (37) 0 0 1 0 0 1 0 0 1 0 0 1 or T(x f , y f ) · S(sx , s y ) · T(−x f , −y f ) = S(x f , y f , sx , s y )
(38)
This transformation is generated automatically in systems that provide a scale function that accepts coordinates for the fixed point.
General Two-Dimensional Scaling Directions Parameters sx and s y scale objects along the x and y directions. We can scale an object in other directions by rotating the object to align the desired scaling directions with the coordinate axes before applying the scaling transformation. Suppose we want to apply scaling factors with values specified by parameters s1 and s2 in the directions shown in Figure 11. To accomplish the scaling without changing the orientation of the object, we first perform a rotation so that the directions for s1 and s2 coincide with the x and y axes, respectively. Then the scaling transformation S(s1 , s2 ) is applied, followed by an opposite rotation to return points to their original orientations. The composite matrix resulting from the product of these three transformations is ⎡ ⎤ s1 cos2 θ + s2 sin2 θ (s2 − s1 ) cos θ sin θ 0 R−1 (θ ) · S(s1 , s2 ) · R(θ ) = ⎣ (s2 − s1 ) cos θ sin θ s1 sin2 θ + s2 cos2 θ 0 ⎦ (39) 0 0 1
y
s2
u
x s1
FIGURE 11
Scaling parameters s 1 and s 2 along orthogonal directions defined by the angular displacement θ.
As an example of this scaling transformation, we turn a unit square into a parallelogram (Figure 12) by stretching it along the diagonal from (0, 0) to (1, 1). We first rotate the diagonal onto the y axis using θ = 45◦ , then we double its length with the scaling values s1 = 1 and s2 = 2, and then we rotate again to return the diagonal to its original orientation. In Equation 39, we assumed that scaling was to be performed relative to the origin. We could take this scaling operation one step further and concatenate the matrix with translation operators, so that the composite matrix would include parameters for the specification of a scaling fixed position.
201
Two-Dimensional Geometric Transformations y
y
(2, 2) (1/2, 3/2)
(0, 1)
(1, 1)
FIGURE 12
A square (a) is converted to a parallelogram (b) using the composite transformation matrix 39, with s 1 = 1, s 2 = 2, and θ = 45◦ .
(3/2, 1/2) (0, 0)
(1, 0)
x
(a)
x
(0, 0) (b)
Matrix Concatenation Properties Multiplication of matrices is associative. For any three matrices, M1 , M2 , and M3 , the matrix product M3 · M2 · M1 can be performed by first multiplying M3 and M2 or by first multiplying M2 and M1 : M3 · M2 · M1 = (M3 · M2 ) · M1 = M3 · (M2 · M1 )
(40)
Therefore, depending upon the order in which the transformations are specified, we can construct a composite matrix either by multiplying from left to right (premultiplying) or by multiplying from right to left (postmultiplying). Some graphics packages require that transformations be specified in the order in which they are to be applied. In that case, we would first invoke transformation M1 , then M2 , then M3 . As each successive transformation routine is called, its matrix is concatenated on the left of the previous matrix product. Other graphics systems, however, postmultiply matrices, so that this transformation sequence would have to be invoked in the reverse order: the last transformation invoked (which is M1 for this example) is the first to be applied, and the first transformation that is called (M3 in this case) is the last to be applied. Transformation products, on the other hand, may not be commutative. The matrix product M2 · M1 is not equal to M1 · M2 , in general. This means that if we want to translate and rotate an object, we must be careful about the order in which the composite matrix is evaluated (Figure 13). For some special cases—such as a sequence of transformations that are all of the same kind—the multiplication of transformation matrices is commutative. As an example, two successive rotations could be performed in either order and the final position would be the same. This commutative property holds also for two successive translations or two successive scalings. Another commutative pair of operations is rotation and uniform scaling (sx = s y ).
Final Position
(a)
Final Position
(b)
FIGURE 13
Reversing the order in which a sequence of transformations is performed may affect the transformed position of an object. In (a), an object is first translated in the x direction, then rotated counterclockwise through an angle of 45◦ . In (b), the object is first rotated 45◦ counterclockwise, then translated in the x direction.
202
Two-Dimensional Geometric Transformations
General Two-Dimensional Composite Transformations and Computational Efficiency A two-dimensional transformation, representing any combination of translations, rotations, and scalings, can be expressed as ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ x r sxx r sxy tr sx x ⎣ y ⎦ = ⎣ r s yx r s yy tr s y ⎦ · ⎣ y ⎦ (41) 1 0 0 1 1 The four elements r s jk are the multiplicative rotation-scaling terms in the transformation, which involve only rotation angles and scaling factors. Elements tr sx and tr s y are the translational terms, containing combinations of translation distances, pivot-point and fixed-point coordinates, rotation angles, and scaling parameters. For example, if an object is to be scaled and rotated about its centroid coordinates (xc , yc ) and then translated, the values for the elements of the composite transformation matrix are T(tx , ty ) · R(xc , yc , θ ) · S(xc , yc , sx , s y ) ⎡ ⎤ sx cos θ −s y sin θ xc (1 − sx cos θ) + yc s y sin θ + tx s y cos θ yc (1 − s y cos θ) − xc sx sin θ + ty ⎦ = ⎣ sx sin θ 0 0 1
(42)
Although Equation 41 requires nine multiplications and six additions, the explicit calculations for the transformed coordinates are x = x · r sxx + y · r sxy + tr sx ,
y = x · r s yx + y · r s yy + tr s y
(43)
Thus, we need actually perform only four multiplications and four additions to transform coordinate positions. This is the maximum number of computations required for any transformation sequence, once the individual matrices have been concatenated and the elements of the composite matrix evaluated. Without concatenation, the individual transformations would be applied one at a time, and the number of calculations could be increased significantly. An efficient implementation for the transformation operations, therefore, is to formulate transformation matrices, concatenate any transformation sequence, and calculate transformed coordinates using Equations 43. On parallel systems, direct matrix multiplications with the composite transformation matrix of Equation 41 can be equally efficient. Because rotation calculations require trigonometric evaluations and several multiplications for each transformed point, computational efficiency can become an important consideration in rotation transformations. In animations and other applications that involve many repeated transformations and small rotation angles, we can use approximations and iterative calculations to reduce computations in the composite transformation equations. When the rotation angle is small, the trigonometric functions can be replaced with approximation values based on the first few terms of their power series expansions. For small-enough angles (less than 10◦ ), cos θ is approximately 1.0 and sin θ has a value very close to the value of θ in radians. If we are rotating in small angular steps about the origin, for instance, we can set cos θ to 1.0 and reduce transformation calculations at each step to two multiplications and two additions for each set of coordinates to be rotated. These rotation calculations are x = x − y sin θ,
y = x sin θ + y
(44)
where sin θ is evaluated once for all steps, assuming the rotation angle does not change. The error introduced by this approximation at each step decreases as the rotation angle decreases; but even with small rotation angles, the accumulated
203
Two-Dimensional Geometric Transformations
error over many steps can become quite large. We can control the accumulated error by estimating the error in x and y at each step and resetting object positions when the error accumulation becomes too great. Some animation applications automatically reset object positions at fixed intervals, such as every 360◦ or every 180◦ . Composite transformations often involve inverse matrices. For example, transformation sequences for general scaling directions and for some reflections and shears (Section 5) require inverse rotations. As we have noted, the inverse matrix representations for the basic geometric transformations can be generated with simple procedures. An inverse translation matrix is obtained by changing the signs of the translation distances, and an inverse rotation matrix is obtained by performing a matrix transpose (or changing the sign of the sine terms). These operations are much simpler than direct inverse matrix calculations.
Two-Dimensional Rigid-Body Transformation If a transformation matrix includes only translation and rotation parameters, it is a rigid-body transformation matrix. The general form for a two-dimensional rigid-body transformation matrix is ⎡ ⎤ r xx r xy tr x ⎣ r yx r yy tr y ⎦ (45) 0
0
1
where the four elements r jk are the multiplicative rotation terms, and the elements tr x and tr y are the translational terms. A rigid-body change in coordinate position is also sometimes referred to as a rigid-motion transformation. All angles and distances between coordinate positions are unchanged by the transformation. In addition, matrix 45 has the property that its upper-left 2 × 2 submatrix is an orthogonal matrix. This means that if we consider each row (or each column) of the submatrix as a vector, then the two row vectors (r xx , r xy ) and (r yx , r yy ) (or the two column vectors) form an orthogonal set of unit vectors. Such a set of vectors is also referred to as an orthonormal vector set. Each vector has unit length as follows: 2 2 2 2 r xx + r xy = r yx + r yy =1
(46)
and the vectors are perpendicular (their dot product is 0): r xx r yx + r xyr yy = 0
(47)
Therefore, if these unit vectors are transformed by the rotation submatrix, then the vector (r xx , r xy ) is converted to a unit vector along the x axis and the vector (r yx , r yy ) is transformed into a unit vector along the y axis of the coordinate system: ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ r xx r xy 0 r xx 1 ⎣ r yx r yy 0 ⎦ · ⎣ r xy ⎦ = ⎣ 0 ⎦ (48) 1 1 0 0 1 ⎤ ⎡ ⎡ ⎤ ⎡ ⎤ r xx r xy 0 r yx 0 ⎣ r yx r yy 0 ⎦ · ⎣ r yy ⎦ = ⎣ 1 ⎦ (49) 1 1 0 0 1 For example, the following rigid-body transformation first rotates an object through an angle θ about a pivot point (xr , yr ) and then translates the object: ⎡ ⎤ cos θ − sin θ xr (1 − cos θ) + yr sin θ + tx cos θ yr (1 − cos θ) − xr sin θ + ty ⎦ T(tx , ty ) · R(xr , yr , θ) = ⎣ sin θ (50) 0
204
0
1
Two-Dimensional Geometric Transformations y
v
y
u FIGURE 14
x
(a)
x
(b)
The rotation matrix for revolving an object from position (a) to position (b) can be constructed with the values of the unit orientation vectors u and v relative to the original orientation.
Here, orthogonal unit vectors in the upper-left 2 × 2 submatrix are (cos θ, − sin θ) and (sin θ, cos θ ), and ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ cos θ − sin θ 0 cos θ 1 ⎣ sin θ cos θ 0 ⎦ · ⎣ − sin θ ⎦ = ⎣ 0 ⎦ (51) 0 0 1 1 1 Similarly, unit vector (sin θ , cos θ ) is converted by the preceding transformation matrix to the unit vector (0, 1) in the y direction.
Constructing Two-Dimensional Rotation Matrices The orthogonal property of rotation matrices is useful for constructing the matrix when we know the final orientation of an object, rather than the amount of angular rotation necessary to put the object into that position. This orientation information could be determined by the alignment of certain objects in a scene or by reference positions within the coordinate system. For example, we might want to rotate an object to align its axis of symmetry with the viewing (camera) direction, or we might want to rotate one object so that it is above another object. Figure 14 shows an object that is to be aligned with the unit direction vectors u and v . Assuming that the original object orientation, as shown in Figure 14(a), is aligned with the coordinate axes, we construct the desired transformation by assigning the elements of u to the first row of the rotation matrix and the elements of v to the second row. In a modeling application, for instance, we can use this method to obtain the transformation matrix within an object’s local coordinate system when we know what its orientation is to be within the overall world-coordinate scene. A similar transformation is the conversion of object descriptions from one coordinate system to another, and we take up these methods in more detail in Section 8.
Two-Dimensional Composite-Matrix Programming Example An implementation example for a sequence of geometric transformations is given in the following program. Initially, the composite matrix, compMatrix, is constructed as the identity matrix. For this example, a left-to-right concatenation order is used to construct the composite transformation matrix, and we invoke the transformation routines in the order that they are to be executed. As each of the basic transformation routines (scale, rotate, and translate) is invoked, a matrix is set up for that transformation and left-concatenated with the composite matrix. When all transformations have been specified, the composite transformation is applied to transform a triangle. The triangle is first scaled with respect to its centroid position, then the triangle is rotated about its centroid, and, lastly, it is translated. Figure 15 shows the original and final positions of the triangle that is transformed by this sequence. Routines in OpenGL are used to dispaly the initial and final position of the triangle.
205
Two-Dimensional Geometric Transformations y
y
200
200
150
150
100
100 Centroid
Centroid
50
50
FIGURE 15
A triangle (a) is transformed into position (b) using the compositematrix calculations in procedure transformVerts2D.
50
100 (a)
150
200
x
#include #include #include /* Set initial display-window size. */ GLsizei winWidth = 600, winHeight = 600; /* Set range for world coordinates. */ GLfloat xwcMin = 0.0, xwcMax = 225.0; GLfloat ywcMin = 0.0, ywcMax = 225.0; class wcPt2D { public: GLfloat x, y; }; typedef GLfloat Matrix3x3 [3][3]; Matrix3x3 matComposite; const GLdouble pi = 3.14159; void init (void) { /* Set color of display window to white. glClearColor (1.0, 1.0, 1.0, 0.0); }
*/
/* Construct the 3 x 3 identity matrix. */ void matrix3x3SetIdentity (Matrix3x3 matIdent3x3) { GLint row, col; for (row = 0; row < 3; row++) for (col = 0; col < 3; col++) matIdent3x3 [row][col] = (row == col); }
206
50
100 (b)
150
200
x
Two-Dimensional Geometric Transformations
/* Premultiply matrix m1 times matrix m2, store result in m2. */ void matrix3x3PreMultiply (Matrix3x3 m1, Matrix3x3 m2) { GLint row, col; Matrix3x3 matTemp; for (row = 0; row < 3; row++) for (col = 0; col < 3 ; col++) matTemp [row][col] = m1 [row][0] * m2 [0][col] + m1 [row][1] * m2 [1][col] + m1 [row][2] * m2 [2][col]; for (row = 0; row < 3; row++) for (col = 0; col < 3; col++) m2 [row][col] = matTemp [row][col]; } void translate2D (GLfloat tx, GLfloat ty) { Matrix3x3 matTransl; /* Initialize translation matrix to identity. matrix3x3SetIdentity (matTransl);
*/
matTransl [0][2] = tx; matTransl [1][2] = ty; /* Concatenate matTransl with the composite matrix. matrix3x3PreMultiply (matTransl, matComposite);
*/
} void rotate2D (wcPt2D pivotPt, GLfloat theta) { Matrix3x3 matRot; /* Initialize rotation matrix to identity. matrix3x3SetIdentity (matRot); matRot [0][0] = cos (theta); matRot [0][1] = -sin (theta); matRot [0][2] = pivotPt.x * (1 pivotPt.y matRot [1][0] = sin (theta); matRot [1][1] = cos (theta); matRot [1][2] = pivotPt.y * (1 pivotPt.x
*/
- cos (theta)) + * sin (theta);
- cos (theta)) * sin (theta);
/* Concatenate matRot with the composite matrix. matrix3x3PreMultiply (matRot, matComposite);
*/
} void scale2D (GLfloat sx, GLfloat sy, wcPt2D fixedPt) { Matrix3x3 matScale;
207
Two-Dimensional Geometric Transformations
/* Initialize scaling matrix to identity. matrix3x3SetIdentity (matScale); matScale matScale matScale matScale
[0][0] [0][2] [1][1] [1][2]
= = = =
*/
sx; (1 - sx) * fixedPt.x; sy; (1 - sy) * fixedPt.y;
/* Concatenate matScale with the composite matrix. matrix3x3PreMultiply (matScale, matComposite);
*/
} /* Using the composite matrix, calculate transformed coordinates. */ void transformVerts2D (GLint nVerts, wcPt2D * verts) { GLint k; GLfloat temp; for (k = 0; k < nVerts; k++) { temp = matComposite [0][0] * verts [k].x + matComposite [0][1] * verts [k].y + matComposite [0][2]; verts [k].y = matComposite [1][0] * verts [k].x + matComposite [1][1] * verts [k].y + matComposite [1][2]; verts [k].x = temp; } } void triangle (wcPt2D *verts) { GLint k; glBegin (GL_TRIANGLES); for (k = 0; k < 3; k++) glVertex2f (verts [k].x, verts [k].y); glEnd ( ); } void displayFcn (void) { /* Define initial position for triangle. */ GLint nVerts = 3; wcPt2D verts [3] = { {50.0, 25.0}, {150.0, 25.0}, {100.0, 100.0} }; /* Calculate position of triangle centroid. wcPt2D centroidPt;
*/
GLint k, xSum = 0, ySum = 0; for (k = 0; k < nVerts; k++) { xSum += verts [k].x; ySum += verts [k].y; } centroidPt.x = GLfloat (xSum) / GLfloat (nVerts); centroidPt.y = GLfloat (ySum) / GLfloat (nVerts);
208
Two-Dimensional Geometric Transformations
/* Set geometric transformation parameters. wcPt2D pivPt, fixedPt; pivPt = centroidPt; fixedPt = centroidPt;
*/
GLfloat tx = 0.0, ty = 100.0; GLfloat sx = 0.5, sy = 0.5; GLdouble theta = pi/2.0; glClear (GL_COLOR_BUFFER_BIT);
//
Clear display window.
glColor3f (0.0, 0.0, 1.0); triangle (verts);
// //
Set initial fill color to blue. Display blue triangle.
/* Initialize composite matrix to identity. matrix3x3SetIdentity (matComposite);
*/
/* Construct composite matrix for transformation sequence. */ scale2D (sx, sy, fixedPt); // First transformation: Scale. rotate2D (pivPt, theta); // Second transformation: Rotate translate2D (tx, ty); // Final transformation: Translate. /* Apply composite matrix to triangle vertices. transformVerts2D (nVerts, verts); glColor3f (1.0, 0.0, 0.0); triangle (verts);
*/
// Set color for transformed triangle. // Display red transformed triangle.
glFlush ( ); } void winReshapeFcn (GLint newWidth, GLint newHeight) { glMatrixMode (GL_PROJECTION); glLoadIdentity ( ); gluOrtho2D (xwcMin, xwcMax, ywcMin, ywcMax); glClear (GL_COLOR_BUFFER_BIT); } void main (int argc, char ** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (50, 50); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Geometric Transformation Sequence"); init ( ); glutDisplayFunc (displayFcn); glutReshapeFunc (winReshapeFcn); glutMainLoop ( ); }
209
Two-Dimensional Geometric Transformations
5 Other Two-Dimensional Transformations Basic transformations such as translation, rotation, and scaling are standard components of graphics libraries. Some packages provide a few additional transformations that are useful in certain applications. Two such transformations are reflection and shear.
Reflection A transformation that produces a mirror image of an object is called a reflection. For a two-dimensional reflection, this image is generated relative to an axis of reflection by rotating the object 180◦ about the reflection axis. We can choose an axis of reflection in the xy plane or perpendicular to the xy plane. When the reflection axis is a line in the xy plane, the rotation path about this axis is in a plane perpendicular to the xy plane. For reflection axes that are perpendicular to the xy plane, the rotation path is in the xy plane. Some examples of common reflections follow. Reflection about the line y = 0 (the x axis) is accomplished with the transformation matrix ⎡ ⎤ 1 0 0 ⎣ 0 −1 0 ⎦ (52) 0 0 1
1
y
Original Position 2
3
2
3
x
Reflected Position 1 FIGURE 16
Reflection of an object about the x axis.
This transformation retains x values, but “flips” the y values of coordinate positions. The resulting orientation of an object after it has been reflected about the x axis is shown in Figure 16. To envision the rotation transformation path for this reflection, we can think of the flat object moving out of the xy plane and rotating 180◦ through three-dimensional space about the x axis and back into the xy plane on the other side of the x axis. A reflection about the line x = 0 (the y axis) flips x coordinates while keeping y coordinates the same. The matrix for this transformation is ⎡ ⎤ −1 0 0 ⎣ 0 1 0⎦ (53) 0 0 1
y
Original Position 2
Reflected Position 2 1
1
3
3 x
FIGURE 17
Reflection of an object about the y axis.
Figure 17 illustrates the change in position of an object that has been reflected about the line x = 0. The equivalent rotation in this case is 180◦ through threedimensional space about the y axis. We flip both the x and y coordinates of a point by reflecting relative to an axis that is perpendicular to the xy plane and that passes through the coordinate origin. This reflection is sometimes referred to as a reflection relative to the coordinate origin, and it is equivalent to reflecting with respect to both coordinate axes. The matrix representation for this reflection is ⎡ ⎤ −1 0 0 ⎣ 0 −1 0 ⎦ (54) 0 0 1 An example of reflection about the origin is shown in Figure 18. The reflection matrix 54 is the same as the rotation matrix R(θ) with θ = 180 ◦. We are simply rotating the object in the xy plane half a revolution about the origin.
210
Two-Dimensional Geometric Transformations y Reflected Position
y
3 3 2 yrfl
1 x
1 2
1
2
Preflect
1 2 3 3 Original Position
xrfl
FIGURE 18
FIGURE 19
Reflection of an object relative to the coordinate origin. This transformation can be accomplished with a rotation in the x y plane about the coordinate origin.
Reflection of an object relative to an axis perpendicular to the x y plane and passing through point Preflect .
Reflection 54 can be generalized to any reflection point in the xy plane (Figure 19). This reflection is the same as a 180 ◦ rotation in the xy plane about the reflection point. If we choose the reflection axis as the diagonal line y = x (Figure 20), the reflection matrix is ⎡ ⎤ 0 1 0 ⎣1 0 0⎦ (55) 0 0 1 We can derive this matrix by concatenating a sequence of rotation and coordinateaxis reflection matrices. One possible sequence is shown in Figure 21. Here, we first perform a clockwise rotation with respect to the origin through a 45◦ angle, which rotates the line y = x onto the x axis. Next, we perform a reflection with respect to the x axis. The final step is to rotate the line y = x back to its original position with a counterclockwise rotation through 45◦ . Another equivalent sequence of transformations is to first reflect the object about the x axis, then rotate it counterclockwise 90◦ . To obtain a transformation matrix for reflection about the diagonal y = −x, we could concatenate matrices for the transformation sequence: (1) clockwise rotation by 45◦ , (2) reflection about the y axis, and (3) counterclockwise rotation by 45◦ . The resulting transformation matrix is ⎡ ⎤ 0 −1 0 ⎣ −1 0 0 ⎦ (56) 0 0 1
x
yx
y
Original Position
3
2 1 1
Reflected Position 3
2 x FIGURE 20
Reflection of an object with respect to the line y = x .
Figure 22 shows the original and final positions for an object transformed with this reflection matrix. Reflections about any line y = mx + b in the xy plane can be accomplished with a combination of translate-rotate-reflect transformations. In general, we
211
Two-Dimensional Geometric Transformations
first translate the line so that it passes through the origin. Then we can rotate the line onto one of the coordinate axes and reflect about that axis. Finally, we restore the line to its original position with the inverse rotation and translation transformations. We can implement reflections with respect to the coordinate axes or coordinate origin as scaling transformations with negative scaling factors. Also, elements of the reflection matrix can be set to values other than ±1. A reflection parameter with a magnitude greater than 1 shifts the mirror image of a point farther from the reflection axis, and a parameter with magnitude less than 1 brings the mirror image of a point closer to the reflection axis. Thus, a reflected object can also be enlarged, reduced, or distorted.
yx
45
(a) y
Shear x
A transformation that distorts the shape of an object such that the transformed shape appears as if the object were composed of internal layers that had been caused to slide over each other is called a shear. Two common shearing transformations are those that shift coordinate x values and those that shift y values. An x-direction shear relative to the x axis is produced with the transformation matrix ⎡ ⎤ 1 sh x 0 ⎣0 1 0⎦ (57) 0 0 1
(b)
45
which transforms coordinate positions as (c)
x = x + sh x · y,
FIGURE 21
Sequence of transformations to produce a reflection about the line y = x : A clockwise rotation of 45◦ (a), a reflection about the x axis (b), and a counterclockwise rotation by 45◦ (c).
2
1 2 Original Position
y = y
(58)
Any real number can be assigned to the shear parameter sh x . A coordinate position (x, y) is then shifted horizontally by an amount proportional to its perpendicular distance (y value) from the x axis. Setting parameter sh x to the value 2, for example, changes the square in Figure 23 into a parallelogram. Negative values for sh x shift coordinate positions to the left.
Reflected Position 3 y
1
y (2, 1)
(0, 1)
3
(3, 1)
(1, 1)
y x (0, 0)
(1, 0) (a)
212
x
(0, 0)
(1, 0)
x (b)
FIGURE 22
FIGURE 23
Reflection with respect to the line y = −x .
A unit square (a) is converted to a parallelogram (b) using the x -direction shear matrix 57 with s hx = 2.
Two-Dimensional Geometric Transformations y
y
(0, 1)
(1, 1)
(1, 1)
(1, 0)
(0, 0)
(1/2, 0)
x
(2, 1)
(3/2, 0)
x FIGURE 24
yref 1
A unit square (a) is transformed to a shifted parallelogram (b) with s h x = 0.5 and y ref = −1 in the shear matrix 59.
yref 1 (a)
(b)
y
y
(1, 2)
(0, 3/2) (0, 1)
(1, 1)
(1, 1) (0, 1/2)
xref
= – 1 (0, 0)
(1, 0)
x
FIGURE 25
A unit square (a) is turned into a shifted parallelogram (b) with parameter values s h y = 0.5 and x ref = −1 in the y -direction shearing transformation 61.
x
xref = – 1
(a)
(b)
We can generate x-direction shears relative to other reference lines with ⎡ ⎤ 1 sh x −sh x · yref ⎣0 1 ⎦ 0 (59) 0 0 1 Now, coordinate positions are transformed as x = x + sh x (y − yref ),
y = y
(60)
An example of this shearing transformation is given in Figure 24 for a shear parameter value of 12 relative to the line yref = −1. A y-direction shear relative to the line x = xref is generated with the transformation matrix ⎡ ⎤ 1 0 0 ⎣ sh y 1 −sh y · xref ⎦ (61) 0 0 1 which generates the transformed coordinate values x = x,
y = y + sh y (x − xref )
(62)
This transformation shifts a coordinate position vertically by an amount proportional to its distance from the reference line x = xref . Figure 25 illustrates the conversion of a square into a parallelogram with sh y = 0.5 and xref = −1. Shearing operations can be expressed as sequences of basic transformations. The x-direction shear matrix 57, for example, can be represented as a composite transformation involving a series of rotation and scaling matrices. This composite transformation scales the unit square of Figure 23 along its diagonal, while maintaining the original lengths and orientations of edges parallel to the x axis. Shifts in the positions of objects relative to shearing reference lines are equivalent to translations.
213
Two-Dimensional Geometric Transformations
6 Raster Methods for Geometric Transformations Pmax
Pmin
(a)
P0 (b) FIGURE 26
Translating an object from screen position (a) to the destination position shown in (b) by moving a rectangular block of pixel values. Coordinate positions Pmin and Pmax specify the limits of the rectangular block to be moved, and P0 is the destination reference position.
The characteristics of raster systems suggest an alternate method for performing certain two-dimensional transformations. Raster systems store picture information as color patterns in the frame buffer. Therefore, some simple object transformations can be carried out rapidly by manipulating an array of pixel values. Few arithmetic operations are needed, so the pixel transformations are particularly efficient. Functions that manipulate rectangular pixel arrays are called raster operations and moving a block of pixel values from one position to another is termed a block transfer, a bitblt, or a pixblt. Routines for performing some raster operations are usually available in a graphics package. Figure 26 illustrates a two-dimensional translation implemented as a block transfer of a refresh-buffer area. All bit settings in the rectangular area shown are copied as a block into another part of the frame buffer. We can erase the pattern at the original location by assigning the background color to all pixels within that block (assuming that the pattern to be erased does not overlap other objects in the scene). Rotations in 90-degree increments are accomplished easily by rearranging the elements of a pixel array. We can rotate a two-dimensional object or pattern 90◦ counterclockwise by reversing the pixel values in each row of the array, then interchanging rows and columns. A 180◦ rotation is obtained by reversing the order of the elements in each row of the array, then reversing the order of the rows. Figure 27 demonstrates the array manipulations that can be used to rotate a pixel block by 90◦ and by 180◦ . For array rotations that are not multiples of 90◦ , we need to do some extra processing. The general procedure is illustrated in Figure 28. Each destination pixel area is mapped onto the rotated array and the amount of overlap with the rotated pixel areas is calculated. A color for a destination pixel can then be computed by averaging the colors of the overlapped source pixels, weighted by their percentage of area overlap. Alternatively, we could use an approximation method, as in antialiasing, to determine the color of the destination pixels. We can use similar methods to scale a block of pixels. Pixel areas in the original block are scaled, using specified values for sx and s y , and then mapped onto a set of destination pixels. The color of each destination pixel is then assigned according to its area of overlap with the scaled pixel areas (Figure 29). An object can be reflected using raster transformations that reverse row or column values in a pixel block, combined with translations. Shears are produced with shifts in the positions of array values along rows or columns.
1 4 7 10
2 5 8 11
3 6 9 12
(a)
3 2 1
6 5 4
9 8 7 (b)
12 11 10
12 9 6 3
11 8 5 2
10 7 4 1
(c)
FIGURE 27
Rotating an array of pixel values. The original array is shown in (a), the positions of the array elements after a 90◦ counterclockwise rotation are shown in (b), and the positions of the array elements after a 180◦ rotation are shown in (c).
214
Two-Dimensional Geometric Transformations
Destination Pixel Array Destination Pixel Areas
Rotated Pixel Array
Scaled Array Destination Pixel Array u
FIGURE
28
A raster rotation for a rectangular block of pixels can be accomplished by mapping the destination pixel areas onto the rotated block.
(xf , yf) FIGURE 29
Mapping destination pixel areas onto a scaled array of pixel values. Scaling factors s x = s y = 0.5 are applied relative to fixed point (x f , y f ).
7 OpenGL Raster Transformations You should already be familiar with most of the OpenGL functions for performing raster operations. A translation of a rectangular array of pixel-color values from one buffer area to another can be accomplished in OpenGL as the following copy operation: glCopyPixels (xmin, ymin, width, height, GL_COLOR);
The first four parameters in this function give the location and dimensions of the pixel block; and the OpenGL symbolic constant GL COLOR specifies that it is color values are to be copied. This array of pixels is to be copied to a rectangular area of a refresh buffer whose lower-left corner is at the location specified by the current raster position. Pixel-color values are copied as either RGBA values or color-table indices, depending on the current setting for the color mode. Both the region to be copied (the source) and the destination area should lie completely within the bounds of the screen coordinates. This translation can be carried out on any of the OpenGL buffers used for refreshing, or even between different buffers. A source buffer for the glCopyPixels function is chosen with the glReadBuffer routine, and a destination buffer is selected with the glDrawBuffer routine. We can rotate a block of pixel-color values in 90-degree increments by first saving the block in an array, then rearranging the elements of the array and placing it back in the refresh buffer. A block of RGB color values in a buffer can be saved in an array with the function glReadPixels (xmin, ymin, width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);
If color-table indices are stored at the pixel positions, we replace the constant GL RGB with GL COLOR INDEX. To rotate the color values, we rearrange the rows and columns of the color array, as described in the previous section. Then we put the rotated array back in the buffer with glDrawPixels (width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);
The lower-left corner of this array is placed at the current raster position. We select the source buffer containing the original block of pixel values with glReadBuffer, and we designate a destination buffer with glDrawBuffer.
215
Two-Dimensional Geometric Transformations
A two-dimensional scaling transformation can be performed as a raster operation in OpenGL by specifying scaling factors and then invoking either glCopyPixels or glDrawPixels. For the raster operations, we set the scaling factors with glPixelZoom (sx, sy);
where parameters sx and sy can be assigned any nonzero floating-point values. Positive values greater than 1.0 increase the size of an element in the source array, and positive values less than 1.0 decrease element size. A negative value for sx or sy, or both, produces a reflection and scales the array elements. Thus, if sx = sy = −3.0, the source array is reflected with respect to the current raster position and each color element of the array is mapped to a 3 × 3 block of destination pixels. If the center of a destination pixel lies within the rectangular area of a scaled color element of an array, it is assigned the color of that array element. Destination pixels whose centers are on the left or top boundary of the scaled array element are also assigned the color of that element. The default value for both sx and sy is 1.0. We can also combine raster transformations with logical operations to produce various effects. With the exclusive or operator, for example, two successive copies of a pixel array to the same buffer area restores the values that were originally present in that area. This technique can be used in an animation application to translate an object across a scene without altering the background pixels.
8 Transformations between Two-Dimensional Coordinate Systems
y axis xis
a y
u
y0
0 FIGURE 30
x0
x
is
ax
x axis
A Cartesian x y system positioned at (x 0 , y 0 ) with orientation θ in an x y Cartesian system.
216
Computer-graphics applications involve coordinate transformations from one reference frame to another during various stages of scene processing. The viewing routines transform object descriptions from world coordinates to device coordinates. For modeling and design applications, individual objects are typically defined in their own local Cartesian references. These local-coordinate descriptions must then be transformed into positions and orientations within the overall scene coordinate system. A facility-management program for office layouts, for instance, has individual coordinate descriptions for chairs and tables and other furniture that can be placed into a floor plan, with multiple copies of the chairs and other items in different positions. Also, scenes are sometimes described in non-Cartesian reference frames that take advantage of object symmetries. Coordinate descriptions in these systems must be converted to Cartesian world coordinates for processing. Some examples of non-Cartesian systems are polar coordinates, spherical coordinates, elliptical coordinates, and parabolic coordinates. Here, we consider only the transformations involved in converting from one two- dimensional Cartesian frame to another. Figure 30 shows a Cartesian x y system specified with coordinate origin (x0 , y0 ) and orientation angle θ in a Cartesian xy reference frame. To transform object descriptions from xy coordinates to x y coordinates, we set up a transformation that superimposes the x y axes onto the xy axes. This is done in two steps: 1. Translate so that the origin (x0 , y0 ) of the x y system is moved to the origin
(0, 0) of the xy system. 2. Rotate the x axis onto the x axis.
Two-Dimensional Geometric Transformations
Translation of the coordinate origin is accomplished with the matrix transformation ⎡ ⎤ 1 0 −x0 T(−x0 , −y0 ) = ⎣ 0 1 −y0 ⎦ (63) 0 0 1 The orientation of the two systems after the translation operation would then appear as in Figure 31. To get the axes of the two systems into coincidence, we then perform the clockwise rotation ⎡ ⎤ cos θ sin θ 0 R(−θ ) = ⎣ −sin θ cos θ 0 ⎦ (64) 0 0 1 Concatenating these two transformation matrices gives us the complete composite matrix for transforming object descriptions from the xy system to the x y system: Mxy, x y = R(−θ) · T(−x0 , −y0 )
y axis
P
y
is
ax y
y
u x
x
x
is
ax
x axis
FIGURE 31
Position of the reference frames shown in Figure 30 after translating the origin of the x y system to the coordinate origin of the x y system.
(65)
An alternate method for describing the orientation of the x y coordinate system is to specify a vector V that indicates the direction for the positive y axis, as shown in Figure 32. We can specify vector V as a point in the xy reference frame relative to the origin of the xy system, which we can convert to the unit vector v=
V = (vx , v y ) |V|
(66)
We obtain the unit vector u along the x axis by applying a 90◦ clockwise rotation to vector v: u = (v y , −vx ) = (ux , u y )
(67)
In Section 4, we noted that the elements of any rotation matrix could be expressed as elements of a set of orthonormal vectors. Therefore, the matrix to rotate the x y system into coincidence with the xy system can be written as ⎡ ⎤ ux u y 0 R = ⎣ vx v y 0 ⎦ (68) 0 0 1 For example, suppose that we choose the orientation for the y axis as V = (−1, 0). Then the x axis is in the positive y direction, and the rotation transformation matrix is ⎡ ⎤ 0 1 0 ⎣ −1 0 0 ⎦ 0 0 1
y axis y
V
y0
is
ax
xis
a x
P0
FIGURE 32
0
x0
x axis
Cartesian system x y with origin at P0 = ( x 0 , y 0 ) and y axis parallel to vector V.
217
Two-Dimensional Geometric Transformations y axis
y
is
V P1 P0
ax
P1
x
is
ax
y0 P0 FIGURE 33
A Cartesian x y system defined with two coordinate positions, P0 and P1 , within an x y reference frame.
0
x0
x axis
Equivalently, we can obtain this rotation matrix from Equation 64 by setting the orientation angle as θ = 90◦ . In an interactive application, it may be more convenient to choose the direction of V relative to position P0 than to specify it relative to the xy-coordinate origin. Unit vectors u and v would then be oriented as shown in Figure 33. The components of v are now calculated as v=
P1 − P 0 |P1 − P0 |
(69)
and u is obtained as the perpendicular to v that forms a right-handed Cartesian system.
9 OpenGL Functions for Two-Dimensional Geometric Transformations In the core library of OpenGL, a separate function is available for each of the basic geometric transformations. Because OpenGL is designed as a three-dimensional graphics application programming interface (API), all transformations are specified in three dimensions. Internally, all coordinate positions are represented as four-element column vectors, and all transformations are represented using 4 × 4 matrices. Fortunately, performing two-dimensional transformations within OpenGL is generally just a matter of using a value for the transformation in the third (z) dimension that causes no change in that dimension. To perform a translation, we invoke the translation routine and set the components for the three-dimensional translation vector. In the rotation function, we specify the angle and the orientation for a rotation axis that intersects the coordinate origin. In addition, a scaling function is used to set the three coordinate scaling factors relative to the coordinate origin. In each case, the transformation routine sets up a 4 × 4 matrix that is applied to the coordinates of objects that are referenced after the transformation call.
Basic OpenGL Geometric Transformations A 4 × 4 translation matrix is constructed with the following routine: glTranslate* (tx, ty, tz);
Translation parameters tx, ty, and tz can be assigned any real-number values, and the single suffix code to be affixed to this function is either f (float) or d (double). For two-dimensional applications, we set tz = 0.0; and a two-dimensional position is represented as a four-element column matrix with the z component equal to 0.0. The translation matrix generated by this function is used to transform
218
Two-Dimensional Geometric Transformations
positions of objects defined after this function is invoked. For example, we translate subsequently defined coordinate positions 25 units in the x direction and −10 units in the y direction with the statement glTranslatef (25.0, -10.0, 0.0);
Similarly, a 4 × 4 rotation matrix is generated with glRotate* (theta, vx, vy, vz);
where the vector v = (vx, vy, vz) can have any floating-point values for its components. This vector defines the orientation for a rotation axis that passes through the coordinate origin. If v is not specified as a unit vector, then it is normalized automatically before the elements of the rotation matrix are computed. The suffix code can be either f or d, and parameter theta is to be assigned a rotation angle in degrees, which the routine converts to radians for the trigonometric calculations. The rotation specified here will be applied to positions defined after this function call. Rotation in two-dimensional systems is rotation about the z axis, specified as a unit vector with x and y components of zero, and a z component of 1.0. For example, the statement glRotatef (90.0, 0.0, 0.0, 1.0);
sets up the matrix for a 90◦ rotation about the z axis. We should note here that internally, this function generates a rotation matrix using quaternions. This method is more efficient when rotation is about an arbitrarily-specific axis. We obtain a 4 × 4 scaling matrix with respect to the coordinate origin with the following routine: glScale* (sx, sy, sz);
The suffix code is again either f or d, and the scaling parameters can be assigned any real-number values. Scaling in a two-dimensional system involves changes in the x and y dimensions, so a typical two-dimensional scaling operation has a z scaling factor of 1.0 (which causes no change in the z coordinate of positions). Because the scaling parameters can be any real-number value, this function will also generate reflections when negative values are assigned to the scaling parameters. For example, the following statement produces a matrix that scales by a factor of 2 in the x direction, scales by a factor of 3 in the y direction, and reflects with respect to the x axis: glScalef (2.0, -3.0, 1.0);
A zero value for any scaling parameter can cause a processing error because an inverse matrix cannot be calculated. The scale-reflect matrix is applied to subsequently defined objects. It is important to note that internally OpenGL uses composite matrices to hold transformations. As a result, transformations are cumulative—that is, if we apply a translation and then apply a rotation, objects whose positions are specified after that will have both transformations applied to them. If that is not the behavior we desired, we must be able to remove the effects of previous transformations. This requires additional functions for manipulating the composite matrices.
219
Two-Dimensional Geometric Transformations
OpenGL Matrix Operations The glMatrixMode routine is used to set the projection mode, which designates the matrix that is to be used for the projection transformation. This transformation. This transformation determines how a scene is to be projected onto the screen. We use the same routine to set up a matrix for the geometric transformations. In this case, however, the matrix is referred to as the modelview matrix, and it is used to store and combine the geometric transformations. It is also used to combine the geometric transformations with the transformation to a viewing-coordinate system. We specify the modelview mode with the statement glMatrixMode (GL_MODELVIEW);
which designates the 4 × 4 modelview matrix as the current matrix. The OpenGL transformation routines discussed in the previous section are all applied to whatever composite matrix is the current matrix, so it is important to use glMatrixMode to change to the modelview matrix before applying geometric transformations. Following this call, OpenGL transformation routines are used to modify the modelview matrix, which is then applied to transform coordinate positions in a scene. Two other modes that we can set with the glMatrixMode function are the texture mode and the color mode. The texture matrix is used for mapping texture patterns to surfaces, and the color matrix is used to convert from one color model to another. We discuss viewing, projection, texture, and color transformations in later chapters. For now, we limit our discussion to the details of the geometric transformations. The default argument for the glMatrixMode function is GL MODELVIEW. Once we are in the modelview mode (or any other mode), a call to a transformation routine generates a matrix that is multiplied by the current matrix for that mode. In addition, we can assign values to the elements of the current matrix, and there are two functions in the OpenGL library for this purpose. With the following function, we assign the identity matrix to the current matrix: glLoadIdentity ( );
Alternatively, we can assign other values to the elements of the current matrix using glLoadMatrix* (elements16);
A single-subscripted, 16-element array of floating-point values is specified with parameter elements16, and a suffix code of either f or d is used to designate the data type. The elements in this array must be specified in column-major order. That is, we first list the four elements in the first column, and then we list the four elements in the second column, the third column, and finally the fourth column. To illustrate this ordering, we initialize the modelview matrix with the following code: glMatrixMode (GL_MODELVIEW); GLfloat elems [16]; GLint k; for (k = 0; k < 16; k++) elems [k] = float (k); glLoadMatrixf (elems);
220
Two-Dimensional Geometric Transformations
which produces the matrix ⎡
0.0 ⎢ 1.0 M=⎢ ⎣ 2.0 3.0
4.0 5.0 6.0 7.0
8.0 9.0 10.0 11.0
⎤ 12.0 13.0 ⎥ ⎥ 14.0 ⎦ 15.0
We can also concatenate a specified matrix with the current matrix as follows: glMultMatrix* (otherElements16);
Again, the suffix code is either f or d, and parameter otherElements16 is a 16-element, single-subscripted array that lists the elements of some other matrix in column-major order. The current matrix is postmultiplied by the matrix specified in glMultMatrix, and this product replaces the current matrix. Thus, assuming that the current matrix is the modelview matrix, which we designate as M, then the updated modelview matrix is computed as M = M · M where M represents the matrix whose elements are specified by parameter otherElements16 in the preceding glMultMatrix statement. The glMultMatrix function can also be used to set up any transformation sequence with individually defined matrices. For example, glMatrixMode (GL_MODELVIEW); glLoadIdentity ( ); glMultMatrixf (elemsM2); glMultMatrixf (elemsM1);
// Set current matrix to the identity. // Postmultiply identity with matrix M2. // Postmultiply M2 with matrix M1.
produces the following current modelview matrix: M = M 2 · M1 The first transformation to be applied in this sequence is the last one specified in the code. Thus, if we set up a transformation sequence in an OpenGL program, we can think of the individual transformations as being loaded onto a stack, so the last operation specified is the first one applied. This is not what actually happens, but the stack analogy may help you remember that in an OpenGL program, a transformation sequence is applied in the opposite order from which it is specified. It is also important to keep in mind that OpenGL stores matrices in columnmajor order. In addition, a reference to a matrix element such as m jk in OpenGL is a reference to the element in column j and row k. This is the reverse of the standard mathematical convention, where the row number is referenced first. However, we can usually avoid errors in row-column references by always specifying matrices in OpenGL as 16-element, single-subscript arrays and listing the elements in a column-major order.
221
Two-Dimensional Geometric Transformations
OpenGL actually maintains a stack of composite matrices for each of the four modes that we can select with the glMatrixMode routine.
10 OpenGL Geometric-Transformation Programming Examples In the following code segment, we apply each of the basic geometric transformations, one at a time, to a rectangle. Initially, the modelview matrix is the identity matrix and we display a blue rectangle. Next, we reset the current color to red, specify two-dimensional translation parameters, and display the red translated rectangle (Figure 34). Because we do not want to combine transformations, we next reset the current matrix to the identity. Then a rotation matrix is constructed and concatenated with the current matrix (the identity matrix). When the original rectangle is again referenced, it is rotated about the z axis and displayed in red (Figure 35). We repeat this process once more to generate the scaled and reflected rectangle shown in Figure 36.
glMatrixMode (GL_MODELVIEW); glColor3f (0.0, 0.0, 1.0); glRecti (50, 100, 200, 150);
// Display blue rectangle.
glColor3f (1.0, 0.0, 0.0); glTranslatef (-200.0, -50.0, 0.0); // Set translation parameters. glRecti (50, 100, 200, 150); // Display red, translated rectangle. glLoadIdentity ( ); glRotatef (90.0, 0.0, 0.0, 1.0); glRecti (50, 100, 200, 150);
// Reset current matrix to identity. // Set 90-deg. rotation about z axis. // Display red, rotated rectangle.
glLoadIdentity ( ); glScalef (-0.5, 1.0, 1.0); glRecti (50, 100, 200, 150);
// Reset current matrix to identity. // Set scale-reflection parameters. // Display red, transformed rectangle.
200 150 100 Original Position
Translated Position 50 FIGURE 34
Translating a rectangle using the OpenGL function glTranslatef (−200.0, −50.0, 0.0).
222
150
100
50
50
100
150
200
Two-Dimensional Geometric Transformations
200
150
100
50
Rotated Position
150
100
Original Position
FIGURE 35
50
50
100
150
200
Rotating a rectangle about the z axis using the OpenGL function glRotatef (90.0, 0.0, 0.0, 1.0).
200 150 Scaled-Reflected Position Original Position 50 FIGURE 36
150 100
50
50
100
150
200
Scaling and reflecting a rectangle using the OpenGL function glScalef (−0.5, 1.0, 1.0).
11 Summary The basic geometric transformations are translation, rotation, and scaling. Translation moves an object in a straight-line path from one position to another. Rotation moves an object from one position to another along a circular path around a specified rotation axis. For two-dimensional applications, the rotation path is in the xy plane about an axis that is parallel to the z axis. Scaling transformations change the dimensions of an object relative to a fixed position. We can express two-dimensional transformations as 3 × 3 matrix operators, so that sequences of transformations can be concatenated into a single composite matrix. Performing geometric transformations with matrices is an efficient formulation because it allows us to reduce computations by applying a composite matrix to an object description to obtain its transformed position. To do this, we express coordinate positions as column matrices. We choose a column-matrix representation for coordinate points because this is the standard mathematical convention, and most graphics packages now follow this convention. A three-element column matrix (vector) is referred to as a homogeneous-coordinate representation. For geometric transformations, the homogeneous coefficient is assigned the value 1. As with two-dimensional systems, transformations between threedimensional Cartesian coordinate systems are accomplished with a sequence of translate-rotate transformations that brings the two systems into coincidence.
223
Two-Dimensional Geometric Transformations
T A B L E
1
Summary of OpenGL Geometric Transformation Functions Function
Description
glTranslate*
Specifies translation parameters.
glRotate*
Specifies parameters for rotation about any axis through the origin.
glScale*
Specifies scaling parameters with respect to coordinate origin.
glMatrixMode
Specifies current matrix for geometric-viewing transformations, projection transformations, texture transformations, or color transformations.
glLoadIdentity
Sets current matrix to identity.
glLoadMatrix* (elems);
Sets elements of current matrix.
glMultMatrix* (elems);
Postmultiplies the current matrix by the specified matrix.
glPixelZoom
Specifies two-dimensional scaling parameters for raster operations.
However, in a three-dimensional system, we must specify two of the three axis directions, not just one (as in a two-dimensional system). The OpenGL basic library contains three functions for applying individual translate, rotate, and scale transformations to coordinate positions. Each function generates a matrix that is premultiplied by the modelview matrix. Thus, a sequence of geometric-transformation functions must be specified in reverse order: the last transformation invoked is the first to be applied to coordinate positions. Transformation matrices are applied to subsequently defined objects. In addition to accumulating transformation sequences in the modelview matrix, we can set this matrix to the identity or some other matrix. We can also form products with the modelview matrix and any specified matrices. Several operations are available in OpenGL for performing raster transformations. A block of pixels can be translated, rotated, scaled, or reflected with these OpenGL raster operations. Table 1 summarizes the OpenGL geometric-transformation functions and matrix routines discussed in this chapter.
REFERENCES For additional techniques involving matrices and geometric transformations, see Glassner (1990), Arvo (1991), Kirk (1992), Heckbert (1994), and Paeth (1995). Discussions of homogeneous coordinates in computer graphics can be found in Blinn and Newell (1978) and in Blinn (1993, 1996, and 1998). Additional programming examples using OpenGL geometric-transformation functions are given in
224
Woo, et al. (1999). Programming examples for the OpenGL geometric-transformation functions are also available at Nate Robins’s tutorial website: http://www.xmission.com/∼nate/opengl.html. Finally, a complete listing of OpenGL geometrictransformation functions is provided in Shreiner (2000).
Two-Dimensional Geometric Transformations
EXERCISES 1
2
Write an animation program that implements the example two-dimensional rotation procedure of Section 1. An input polygon is to be rotated repeatedly in small steps around a pivot point in the xy plane. Small angles are to be used for each successive step in the rotation, and approximations to the sine and cosine functions are to be used to speed up the calculations. To avoid excessive accumulation of round-off errors, reset the original coordinate values for the object at the start of each new revolution. Show that the composition of two rotations is additive by concatenating the matrix representations for R(θ1 ) and R(θ2 ) to obtain R(θ1 ) · R(θ2 ) = R(θ1 + θ2 )
Modify the two-dimensional transformation matrix (39), for scaling in an arbitrary direction, to include coordinates for any specified scaling fixed point (x f , y f ). 4 Prove that the multiplication of transformation matrices for each of the following sequences is commutative: (a) Two successive rotations. (b) Two successive translations. (c) Two successive scalings. 5 Prove that a uniform scaling and a rotation form a commutative pair of operations but that, in general, scaling and rotation are not commutative operations. 6 Multiple the individual scale, rotate, and translate matrices in Equation 42 to verify the elements in the composite transformation matrix. 7 Modify the example program in Section 4 so that transformation parameters can be specified as user input. 8 Modify the program from the previous exercise so that the transformation sequence can be applied to any polygon, with vertices specified as user input. 9 Modify the example program in Section 4 so that the order of the geometric transformation sequence can be specified as user input. 10 Show that transformation matrix (55), for a reflection about the line y = x, is equivalent to a reflection relative to the x axis followed by a counterclockwise rotation of 90◦ . 11 Show that transformation matrix (56), for a reflection about the line y = −x, is equivalent to a reflection relative to the y axis followed by a counterclockwise rotation of 90◦ . 12 Show that two successive reflections about either the x axis or the y axis is equivalent to a single
13
14
15
16
17
3
18
rotation in the xy plane about the coordinate origin. Determine the form of the two-dimensional transformation matrix for a reflection about any line: y = mx + b. Show that two successive reflections about any line in the xy plane that intersects the coordinate origin is equivalent to a rotation in the xy plane about the origin. Determine a sequence of basic transformations that is equivalent to the x-direction shearing matrix (57). Determine a sequence of basic transformations that is equivalent to the y-direction shearing matrix (61). Set up a shearing procedure to display twodimensional italic characters, given a vector font definition. That is, all character shapes in this font are defined with straight-line segments, and italic characters are formed with shearing transformations. Determine an appropriate value for the shear parameter by comparing italics and plain text in some available font. Define a simple vector font for input to your routine. Derive the following equations for transforming a coordinate point P = (x, y) in one twodimensional Cartesian system to the coordinate values (x , y ) in another Cartesian system that is rotated counterclockwise by an angle θ relative to the first system. The transformation equations can be obtained by projecting point P onto each of the four axes and analyzing the resulting right triangles. x = x cos θ + y sin θ
y = −x sin θ + y cos θ
19 Write a procedure to compute the elements of the matrix for transforming object descriptions from one two-dimensional Cartesian coordinate system to another. The second coordinate system is to be defined with an origin point P0 and a vector V that gives the direction for the positive y axis of this system. 20 Set up procedures for implementing a block transfer of a rectangular area of a frame buffer, using one function to read the area into an array and another function to copy the array into the designated transfer area. 21 Determine the results of performing two successive block transfers into the same area of a frame buffer using the various Boolean operations. 22 What are the results of performing two successive block transfers into the same area of a frame buffer using the binary arithmetic operations?
225
Two-Dimensional Geometric Transformations
23 Implement a routine to perform block transfers in a frame buffer using any specified Boolean operation or a replacement (copy) operation. 24 Write a routine to implement rotations in increments of 90◦ in frame-buffer block transfers. 25 Write a routine to implement rotations by any specified angle in a frame-buffer block transfer. 26 Write a routine to implement scaling as a raster transformation of a pixel block. 27 Write a program to display an animation of a black square on a white background tracing a circular, clockwise path around the display window with the path’s center at the display window’s center (like the tip of the minute hand on a clock). The orientation of the square should not change. Use only basic OpenGL geometric transformations to do this. 28 Repeat the previous exercise using OpenGL matrix operations. 29 Modify the program in Exercise 27 to have the square rotate clockwise about its own center as it moves along its path. The square should complete one revolution about its center for each quarter of its path around the window that it completes. Use only basic OpenGL geometric transformations to do this. 30 Repeat the previous exercise using OpenGL matrix operations. 31 Modify the program in Exercise 29 to have the square additionally “pulse” as it moves along its path. That is, for every revolution about its own center that it makes, it should go through one pulse cycle that begins with the square at full size, reduces smoothly in size down to 50normal size by the end of the cycle. Do this using only basic OpenGL geometric transformations. 32 Repeat the previous exercise using only OpenGL matrix operations.
226
IN MORE DEPTH 1
2
In this exercise, you’ll set up the routines necessary to make a crude animation of the objects in your application using two-dimensional geometric transformations. Decide on some simple motion behaviors for the objects in your application that can be achieved with the types of transformations discussed in this chapter (translations, rotations, scaling, shearing, and reflections). These behaviors may be motion patterns that certain objects will always be exhibiting, or they may be trajectories that are triggered or guided by user input (you can generate fixed example trajectories since we have not yet included user input). Set up the transformation matrices needed to produce these behaviors via composition of matrices. The matrices should be defined in homogeneous coordinates. If two or more objects act as a single “unit” in certain behaviors that are easier to model in terms of relative positions, you can use the techniques in Section 8 to convert the local transformations of the objects relative to each other (in their own coordinate frame) into transformations in the world coordinate frame. Use the matrices you designed in the previous exercise to produce an animation of the behaviors of the objects in your scene. You should employ the OpenGL matrix operations and have the matrices produce small changes in position for each of the objects in the scene. The scene should then be redrawn several times per second to produce the animation, with the transformations being applied each time. Set the animation up so that it “loops”; that is, the behaviors should either be cyclical, or once the trajectories you designed for the objects have completed, the positions of all of the objects in the scene should be reset to their starting positions and the animation begun again.
Two-Dimensional Viewing
1
The Two-Dimensional Viewing Pipeline
2
The Clipping Window
3
Normalization and Viewport Transformations
4
OpenGL Two-Dimensional Viewing Functions
5
Clipping Algorithms
6
Two-Dimensional Point Clipping
7
Two-Dimensional Line Clipping
8
Polygon Fill-Area Clipping
9
Curve Clipping
10
Text Clipping
11
Summary
W
e now examine in more detail the procedures for displaying views of a two-dimensional picture on an output device. Typically, a graphics package allows a user to specify which part of a defined picture is to be displayed and where that part is to be placed on the display device. Any convenient Cartesian coordinate system, referred to as the world-coordinate reference frame, can be used to define the picture. For a two-dimensional picture, a view is selected by specifying a region of the xy plane that contains the total picture or any part of it. A user can select a single area for display, or several areas could be selected for simultaneous display or for an animated panning sequence across a scene. The picture parts within the selected areas are then mapped onto specified areas of the device coordinates. When multiple view areas are selected, these areas can be placed in separate display locations, or some areas could be inserted into other, larger display areas.
From Chapter 8 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
227
Two-Dimensional Viewing
Two-dimensional viewing transformations from world to device coordinates involve translation, rotation, and scaling operations, as well as procedures for deleting those parts of the picture that are outside the limits of a selected scene area.
1 The Two-Dimensional Viewing Pipeline A section of a two-dimensional scene that is selected for display is called a clipping window because all parts of the scene outside the selected section are “clipped” off. The only part of the scene that shows up on the screen is what is inside the clipping window. Sometimes the clipping window is alluded to as the world window or the viewing window. And, at one time, graphics systems referred to the clipping window simply as “the window,” but there are now so many windows in use on computers that we need to distinguish between them. For example, a window-management system can create and manipulate several areas on a video screen, each of which is called “a window,” for the display of graphics and text. So we will always use the term clipping window to refer to a selected section of a scene that is eventually converted to pixel patterns within a display window on the video monitor. Graphics packages allow us also to control the placement within the display window using another “window” called the viewport. Objects inside the clipping window are mapped to the viewport, and it is the viewport that is then positioned within the display window. The clipping window selects what we want to see; the viewport indicates where it is to be viewed on the output device. By changing the position of a viewport, we can view objects at different positions on the display area of an output device. Multiple viewports can be used to display different sections of a scene at different screen positions. Also, by varying the size of viewports, we can change the size and proportions of displayed objects. We achieve zooming effects by successively mapping different-sized clipping windows onto a fixed-size viewport. As the clipping windows are made smaller, we zoom in on some part of a scene to view details that are not shown with the larger clipping windows. Similarly, more overview is obtained by zooming out from a section of a scene with successively larger clipping windows. And panning effects are achieved by moving a fixed-size clipping window across the various objects in a scene. Usually, clipping windows and viewports are rectangles in standard position, with the rectangle edges parallel to the coordinate axes. Other window or viewport geometries, such as general polygon shapes and circles, are used in some applications, but these shapes take longer to process. We first consider only rectangular viewports and clipping windows, as illustrated in Figure 1.
Clipping Window
ywmax
Viewport
yvmax
yvmin
ywmin FIGURE 1
A clipping window and associated viewport, specified as rectangles aligned with the coordinate axes.
228
xwmin
xwmax
World Coordinates
xvmin
xvmax
Viewport Coordinates
Two-Dimensional Viewing
MC
Construct World-Coordinate Scene Using Modeling-Coordinate Transformations
WC
Convert World Coordinates to Viewing Coordinates
VC
Transform Viewing Coordinates to Normalized Coordinates
NC
Map Normalized Coordinates to Device Coordinates
DC
FIGURE 2
Two-dimensional viewing-transformation pipeline.
The mapping of a two-dimensional, world-coordinate scene description to device coordinates is called a two-dimensional viewing transformation. Sometimes this transformation is simply referred to as the window-to-viewport transformation or the windowing transformation. But, in general, viewing involves more than just the transformation from clipping-window coordinates to viewport coordinates. In analogy with three-dimensional viewing, we can describe the steps for two-dimensional viewing as indicated in Figure 2. Once a world-coordinate scene has been constructed, we could set up a separate two-dimensional, viewingcoordinate reference frame for specifying the clipping window. But the clipping window is often just defined in world coordinates, so viewing coordinates for two-dimensional applications are the same as world coordinates. (For a threedimensional scene, however, we need a separate viewing frame to specify the parameters for the viewing position, direction, and orientation.) To make the viewing process independent of the requirements of any output device, graphics systems convert object descriptions to normalized coordinates and apply the clipping routines. Some systems use normalized coordinates in the range from 0 to 1, and others use a normalized range from −1 to 1. Depending upon the graphics library in use, the viewport is defined either in normalized coordinates or in screen coordinates after the normalization process. At the final step of the viewing transformation, the contents of the viewport are transferred to positions within the display window. Clipping is usually performed in normalized coordinates. This allows us to reduce computations by first concatenating the various transformation matrices. Clipping procedures are of fundamental importance in computer graphics. They are used not only in viewing transformations, but also in window-manager systems, in painting and drawing packages to erase picture sections, and in many other applications.
2 The Clipping Window To achieve a particular viewing effect in an application program, we could design our own clipping window with any shape, size, and orientation we choose. For example, we might like to use a star pattern, an ellipse, or a figure with spline boundaries as a clipping window. But clipping a scene using a concave polygon or a clipping window with nonlinear boundaries requires more processing than clipping against a rectangle. We need to perform more computations to determine where an object intersects a circle than to find out where it intersects a straight line. The simplest window edges to clip against are straight lines that are parallel to the coordinate axes. Therefore, graphics packages commonly allow only rectangular clipping windows aligned with the x and y axes. If we want some other shape for a clipping window, then we must implement our own clipping and coordinate-transformation algorithms, or we could just edit
229
Two-Dimensional Viewing
the picture to produce a certain shape for the display frame around the scene. For example, we could trim the edges of a picture with any desired pattern by overlaying polygons that are filled with the background color. In this way, we could generate any desired border effects or even put interior holes in the picture. Rectangular clipping windows in standard position are easily defined by giving the coordinates of two opposite corners of each rectangle. If we would like to get a rotated view of a scene, we could either define a rectangular clipping window in a rotated viewing-coordinate frame or, equivalently, we could rotate the world-coordinate scene. Some systems provide options for selecting a rotated, two-dimensional viewing frame, but usually the clipping window must be specified in world coordinates.
Viewing-Coordinate Clipping Window yv
iew
y world Clipping Window
y0 xv
x0
iew
x world
World Coordinates FIGURE 3
A rotated clipping window defined in viewing coordinates.
A general approach to the two-dimensional viewing transformation is to set up a viewing-coordinate system within the world-coordinate frame. This viewing frame provides a reference for specifying a rectangular clipping window with any selected orientation and position, as in Figure 3. To obtain a view of the worldcoordinate scene as determined by the clipping window of Figure 3, we just need to transfer the scene description to viewing coordinates. Although many graphics packages do not provide functions for specifying a clipping window in a two-dimensional viewing-coordinate system, this is the standard approach for defining a clipping region for a three-dimensional scene. We choose an origin for a two-dimensional viewing-coordinate frame at some world position P0 = (x0 , y0 ), and we can establish the orientation using a world vector V that defines the yview direction. Vector V is called the two-dimensional view up vector. An alternative method for specifying the orientation of the viewing frame is to give a rotation angle relative to either the x or y axis in the world frame. From this rotation angle, we can then obtain the view up vector. Once we have established the parameters that define the viewing-coordinate frame, we transform the scene description to the viewing system. This involves a sequence of transformations equivalent to superimposing the viewing frame on the world frame. The first step in the transformation sequence is to translate the viewing origin to the world origin. Next, we rotate the viewing system to align it with the world frame. Given the orientation vector V, we can calculate the components of unit vectors v = (vx , v y ) and u = (ux , u y ) for the yview and xview axes, respectively. These unit vectors are used to form the first and second rows of the rotation matrix R that aligns the viewing xview yview axes with the world xw yw axes. Object positions in world coordinates are then converted to viewing coordinates with the composite two-dimensional transformation matrix MWC, VC = R · T
(1)
where T is the translation matrix that takes the viewing origin P0 to the world origin, and R is the rotation matrix that rotates the viewing frame of reference into coincidence with the world-coordinate system. Figure 4 illustrates the steps in this coordinate transformation.
World-Coordinate Clipping Window A routine for defining a standard, rectangular clipping window in world coordinates is typically provided in a graphics-programming library. We simply specify two world-coordinate positions, which are then assigned to the two opposite
230
Two-Dimensional Viewing
y y world view
y world yv
iew
y0
FIGURE 4
T
x0
x world
R
x view x world
xv
iew
(a)
(b)
corners of a standard rectangle. Once the clipping window has been established, the scene description is processed through the viewing routines to the output device. If we want to obtain a rotated view of a two-dimensional scene, as discussed in the previous section, we perform exactly the same steps as described there, but without considering a viewing frame of reference. Thus, we simply rotate (and possibly translate) objects to a desired position and set up the clipping window— all in world coordinates. For example, we could display a rotated view of the triangle in Figure 5(a) by rotating it into the position we want and setting up a standard clipping rectangle. In analogy with the coordinate transformation described in the previous section, we could also translate the triangle to the world origin and define a clipping window around the triangle. In that case, we define an orientation vector and choose a reference point such as the triangle’s centroid. Then we translate the reference point to the world origin and rotate the orientation vector onto the yworld axis using transformation matrix 1. With the triangle in the desired orientation, we can use a standard clipping window in world coordinates to capture the view of the rotated triangle. The transformed position of the triangle and the selected clipping window are shown in Figure 5(b).
A viewing-coordinate frame is moved into coincidence with the world frame by (a) applying a translation matrix T to move the viewing origin to the world origin, then (b) applying a rotation matrix R to align the axes of the two systems.
V
y world
y0
x0
x world
(a)
y world Clipping Window
x world
(b)
3 Normalization and Viewport Transformations With some graphics packages, the normalization and window-to-viewport transformations are combined into one operation. In this case, the viewport coordinates are often given in the range from 0 to 1 so that the viewport is positioned within a unit square. After clipping, the unit square containing the viewport is mapped to the output display device. In other systems, the normalization and clipping routines are applied before the viewport transformation. For these systems, the viewport boundaries are specified in screen coordinates relative to the displaywindow position.
FIGURE 5
A triangle (a), with a selected reference point and orientation vector, is translated and rotated to position (b) within a clipping window.
Mapping the Clipping Window into a Normalized Viewport To illustrate the general procedures for the normalization and viewport transformations, we first consider a viewport defined with normalized coordinate values between 0 and 1. Object descriptions are transferred to this normalized space using a transformation that maintains the same relative placement of a point in
231
Two-Dimensional Viewing
ywmax
Clipping Window
1
(xw, yw)
Normalization Viewport
yvmax
(xv, yv) yvmin ywmin xwmin
xwmax
0
xvmin
xvmax 1
FIGURE 6
A point (x w, y w ) in a world-coordinate clipping window is mapped to viewport coordinates (x v, y v ), within a unit square, so that the relative positions of the two points in their respective rectangles are the same.
the viewport as it had in the clipping window. If a coordinate position is at the center of the clipping window, for instance, it would be mapped to the center of the viewport. Figure 6 illustrates this window-to-viewport mapping. Position (xw, yw) in the clipping window is mapped to position (xv, yv) in the associated viewport. To transform the world-coordinate point into the same relative position within the viewport, we require that xv − xvmin xw − xwmin = xvmax − xvmin xwmax − xwmin yv − yvmin yw − ywmin = yvmax − yvmin ywmax − ywmin
(2)
Solving these expressions for the viewport position (xv, yv), we have xv = sx xw + tx yv = s y yw + ty
(3)
where the scaling factors are sx =
xvmax − xvmin xwmax − xwmin
yvmax − yvmin sy = ywmax − ywmin and the translation factors are xwmax xvmin − xwmin xvmax tx = xwmax − xwmin ywmax yvmin − ywmin yvmax ty = ywmax − ywmin
(4)
(5)
Because we are simply mapping world-coordinate positions into a viewport that is positioned near the world origin, we can also derive Equations 3 using any transformation sequence that converts the rectangle for the clipping window into the viewport rectangle. For example, we could obtain the transformation from world coordinates to viewport coordinates with the following sequence: 1. Scale the clipping window to the size of the viewport using a fixed-point
position of (xwmin , ywmin ). 2. Translate (xwmin , ywmin ) to (xvmin , yvmin ).
232
Two-Dimensional Viewing
The scaling transformation in step (1) can be represented with the twodimensional matrix ⎡ ⎤ sx 0 xwmin (1 − sx ) S = ⎣ 0 s y ywmin (1 − s y ) ⎦ (6) 0 0 1 where sx and s y are the same as in Equations 4. The two-dimensional matrix representation for the translation of the lower-left corner of the clipping window to the lower-left viewport corner is ⎡ ⎤ 1 0 xvmin − xwmin T = ⎣ 0 1 yvmin − ywmin ⎦ (7) 0 0 1 And the composite matrix representation for the transformation to the normalized viewport is ⎡ ⎤ sx 0 tx Mwindow, normviewp = T · S = ⎣ 0 s y ty ⎦ (8) 0 0 1 which gives us the same result as in Equations 3. Any other clipping-window reference point, such as the top-right corner or the window center, could be used for the scale–translate operations. Alternatively, we could first translate any clippingwindow position to the corresponding location in the viewport, and then scale relative to that viewport location. The window-to-viewport transformation maintains the relative placement of object descriptions. An object inside the clipping window is mapped to a corresponding position inside the viewport. Similarly, an object outside the clipping window is outside the viewport. Relative proportions of objects, on the other hand, are maintained only if the aspect ratio of the viewport is the same as the aspect ratio of the clipping window. In other words, we keep the same object proportions if the scaling factors sx and sy are the same. Otherwise, world objects will be stretched or contracted in either the x or y directions (or both) when displayed on the output device. The clipping routines can be applied using either the clipping-window boundaries or the viewport boundaries. After clipping, the normalized coordinates are transformed into device coordinates. And the unit square can be mapped onto the output device using the same procedures as in the window-to-viewport transformation, with the area inside the unit square transferred to the total display area of the output device.
Mapping the Clipping Window into a Normalized Square Another approach to two-dimensional viewing is to transform the clipping window into a normalized square, clip in normalized coordinates, and then transfer the scene description to a viewport specified in screen coordinates. This transformation is illustrated in Figure 7 with normalized coordinates in the range from −1 to 1. The clipping algorithms in this transformation sequence are now standardized so that objects outside the boundaries x = ±1 and y = ±1 are detected and removed from the scene description. At the final step of the viewing transformation, the objects in the viewport are positioned within the display window. We transfer the contents of the clipping window into the normalization square using the same procedures as in the window-to-viewport transformation. The matrix for the normalization transformation is obtained from Equation 8 by substituting −1 for xvmin and yvmin and substituting +1 for xvmax and yvmax .
233
Two-Dimensional Viewing
ywmax
Clipping Window
(xnorm, ynorm) 1
(xw, yw)
Normalization Square
⫺1
ywmin xwmin
1
yvmax yvmin
⫺1
xwmax
Screen Viewport
(xv, yv) xvmin
xvmax
FIGURE 7
A point (x w, y w) in a clipping window is mapped to a normalized coordinate position (x norm , y norm ), then to a screen-coordinate position (x v, y v) in a viewport. Objects are clipped against the normalization square before the transformation to viewport coordinates occurs.
Making these substitutions in the expressions for tx , ty , sx , and s y , we have ⎡ xwmax + xwmin 2 0 − ⎢ xwmax − xwmin xwmax − xwmin ⎢ ⎢ 2 ywmax + ywmin Mwindow, normsquare = ⎢ 0 − ⎢ ywmax − ywmin ywmax − ywmin ⎣ 0 0 1
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ (9)
Similarly, after the clipping algorithms have been applied, the normalized square with edge length equal to 2 is transformed into a specified viewport. This time, we get the transformation matrix from Equation 8 by substituting −1 for xwmin and ywmin and substituting +1 for xwmax and ywmax : ⎡ xv xvmax + xvmin ⎤ max − xvmin 0 ⎢ ⎥ 2 2 ⎢ ⎥ yv yv − yv + yv ⎢ ⎥ max min max min Mnormsquare, viewport = ⎢ (10) ⎥ 0 ⎣ ⎦ 2 2 0
0
1
The last step in the viewing process is to position the viewport area in the display window. Typically, the lower-left corner of the viewport is placed at a coordinate position specified relative to the lower-left corner of the display window. Figure 8 demonstrates the positioning of a viewport within a display window. As before, we maintain the initial proportions of objects by choosing the aspect ratio of the viewport to be the same as the clipping window. Otherwise, objects yscreen Video Screen
AR
ed T rian
gle
Display Window ys
FIGURE 8
A viewport at coordinate position (x s , y s ) within a display window.
234
Viewport
xs xscreen
Two-Dimensional Viewing
will be stretched or contracted in the x or y directions. Also, the aspect ratio of the display window can affect the proportions of objects. If the viewport is mapped to the entire area of the display window and the size of the display window is changed, objects may be distorted unless the aspect ratio of the viewport is also adjusted.
Display of Character Strings Character strings can be handled in one of two ways when they are mapped through the viewing pipeline to a viewport. The simplest mapping maintains a constant character size. This method could be employed with bitmap character patterns. But outline fonts could be transformed the same as other primitives; we just need to transform the defining positions for the line segments in the outline character shapes. Algorithms for determining the pixel patterns for the transformed characters are then applied when the other primitives in the scene are processed.
Split-Screen Effects and Multiple Output Devices By selecting different clipping windows and associated viewports for a scene, we can provide simultaneous display of two or more objects, multiple picture parts, or different views of a single scene. And we can position these views in different parts of a single display window or in multiple display windows on the screen. In a design application, for example, we can display a wire-frame view of an object in one viewport while also displaying a fully rendered view of the object in another viewport. In addition, we could list other information or menus in a third viewport. It is also possible that two or more output devices could be operating concurrently on a particular system, and we can set up a clipping-window/viewport pair for each output device. A mapping to a selected output device is sometimes referred to as a workstation transformation. In this case, viewports could be specified in the coordinates of a particular display device, or each viewport could be specified within a unit square, which is then mapped to a chosen output device. Some graphics systems provide a pair of workstation functions for this purpose. One function is used to designate a clipping window for a selected output device, identified by a workstation number, and the other function is used to set the associated viewport for that device.
4 OpenGL Two-Dimensional Viewing Functions Actually, the basic OpenGL library has no functions specifically for twodimensional viewing because it is designed primarily for three-dimensional applications. But we can adapt the three-dimensional viewing routines to a twodimensional scene, and the core library contains a viewport function. In addition, the GLU library provides a function for specifying a two-dimensional clipping window, and we have GLUT library functions for handling display windows. Therefore, we can use these two-dimensional routines, along with the OpenGL viewport function, for all the viewing operations we need.
OpenGL Projection Mode Before we select a clipping window and a viewport in OpenGL, we need to establish the appropriate mode for constructing the matrix to transform from world coordinates to screen coordinates. With OpenGL, we cannot set up a
235
Two-Dimensional Viewing
separate two-dimensional viewing-coordinate system as in Figure 3, and we must set the parameters for the clipping window as part of the projection transformation. Therefore, we must first select the projection mode. We do this with the same function we used to set the modelview mode for the geometric transformations. Subsequent commands for defining a clipping window and viewport will then be applied to the projection matrix. glMatrixMode (GL_PROJECTION);
This designates the projection matrix as the current matrix, which is originally set to the identity matrix. However, if we are going to loop back through this statement for other views of a scene, we can also set the initialization as glLoadIdentity ( );
This ensures that each time we enter the projection mode, the matrix will be reset to the identity matrix so that the new viewing parameters are not combined with the previous ones.
GLU Clipping-Window Function To define a two-dimensional clipping window, we can use the GLU function: gluOrtho2D (xwmin, xwmax, ywmin, ywmax);
Coordinate positions for the clipping-window boundaries are given as doubleprecision numbers. This function specifies an orthogonal projection for mapping the scene to the screen. For a three-dimensional scene, this means that objects would be projected along parallel lines that are perpendicular to the twodimensional xy display screen. But for a two-dimensional application, objects are already defined in the xy plane. Therefore, the orthogonal projection has no effect on our two-dimensional scene other than to convert object positions to normalized coordinates. Nevertheless, we must specify the orthogonal projection because our two-dimensional scene is processed through the full threedimensional OpenGL viewing pipeline. In fact, we could specify the clipping window using the three-dimensional OpenGL core-library version of the gluOrtho2D function. Normalized coordinates in the range from −1 to 1 are used in the OpenGL clipping routines. And the gluOrtho2D function sets up a three-dimensional version of transformation matrix 9 for mapping objects within the clipping window to normalized coordinates. Objects outside the normalized square (and outside the clipping window) are eliminated from the scene to be displayed. If we do not specify a clipping window in an application program, the default coordinates are (xwmin , ywmin ) = (−1.0, −1.0) and (xwmax , ywmax ) = (1.0, 1.0). Thus the default clipping window is the normalized square centered on the coordinate origin with a side length of 2.0.
OpenGL Viewport Function We specify the viewport parameters with the OpenGL function glViewport (xvmin, yvmin, vpWidth, vpHeight);
where all parameter values are given in integer screen coordinates relative to the display window. Parameters xvmin and yvmin specify the position of the lowerleft corner of the viewport relative to the lower-left corner of the display window, and the pixel width and height of the viewport are set with parameters vpWidth
236
Two-Dimensional Viewing
and vpHeight. If we do not invoke the glViewport function in a program, the default viewport size and position are the same as the size and position of the display window. After the clipping routines have been applied, positions within the normalized square are transformed into the viewport rectangle using Matrix 10. Coordinates for the upper-right corner of the viewport are calculated for this transformation matrix in terms of the viewport width and height: xvmax = xvmin + vpWidth,
yvmax = yvmin + vp Height
(11)
For the final transformation, pixel colors for the primitives within the viewport are loaded into the refresh buffer at the specified screen locations. Multiple viewports can be created in OpenGL for a variety of applications (see Section 3). We can obtain the parameters for the currently active viewport using the query function glGetIntegerv (GL_VIEWPORT, vpArray);
where vpArray is a single-subscript, four-element array. This Get function returns the parameters for the current viewport to vpArray in the order xvmin, yvmin, vpWidth, and vpHeight. In an interactive application, for example, we can use this function to obtain parameters for the viewport that contains the screen cursor.
Creating a GLUT Display Window Because the GLUT library interfaces with any window-management system, we use the GLUT routines for creating and manipulating display windows so that our example programs will be independent of any specific machine. To access these routines, we first need to initialize GLUT with the following function: glutInit (&argc, argv);
Parameters for this initialization function are the same as those for the main procedure, and we can use glutInit to process command-line arguments. We have three functions in GLUT for defining a display window and choosing its dimensions and position: glutInitWindowPosition (xTopLeft, yTopLeft); glutInitWindowSize (dwWidth, dwHeight); glutCreateWindow ("Title of Display Window");
The first of these functions gives the integer, screen-coordinate position for the top-left corner of the display window, relative to the top-left corner of the screen. If either coordinate is negative, the display-window position on the screen is determined by the window-management system. With the second function, we choose a width and height for the display window in positive integer pixel dimensions. If we do not use these two functions to specify a size and position, the default size is 300 by 300 and the default position is (−1, −1), which leaves the positioning of the display window to the window-management system. In any case, the display-window size and position specified with GLUT routines might be ignored, depending on the state of the window-management system or the other requirements currently in effect for it. Thus, the window system might position and size the display window differently. The third function creates the display window, with the specified size and position, and assigns a title, although the use
237
Two-Dimensional Viewing
of the title also depends on the windowing system. At this point, the display window is defined but not shown on the screen until all the GLUT setup operations are complete.
Setting the GLUT Display-Window Mode and Color Various display-window parameters are selected with the GLUT function glutInitDisplayMode (mode);
We use this function to choose a color mode (RGB or index) and different buffer combinations, and the selected parameters are combined with the logical or operation. The default mode is single buffering and the RGB (or RGBA) color mode, which is the same as setting this mode with the statement glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
The color mode specification GLUT RGB is equivalent to GLUT RGBA. A background color for the display window is chosen in RGB mode with the OpenGL routine glClearColor (red, green, blue, alpha);
In color-index mode, we set the display-window color with glClearIndex (index);
where parameter index is assigned an integer value corresponding to a position within the color table.
GLUT Display-Window Identifier Multiple display windows can be created for an application, and each is assigned a positive-integer display-window identifier, starting with the value 1 for the first window that is created. At the time that we initiate a display window, we can record its identifier with the statement windowID = glutCreateWindow ("A Display Window");
Once we have saved the integer display-window identifier in variable name windowID, we can use the identifier number to change display parameters or to delete the display window.
Deleting a GLUT Display Window The GLUT library also includes a function for deleting a display window that we have created. If we know the display window’s identifier, we can eliminate it with the statement glutDestroyWindow (windowID);
Current GLUT Display Window When we specify any display-window operation, it is applied to the current display window, which is either the last display window that we created or the one we select with the following command: glutSetWindow (windowID);
238
Two-Dimensional Viewing
In addition, at any time, we can query the system to determine which window is the current display window: currentWindowID = glutGetWindow ( );
A value of 0 is returned by this function if there are no display windows or if the current display window was destroyed.
Relocating and Resizing a GLUT Display Window We can reset the screen location for the current display window with glutPositionWindow (xNewTopLeft, yNewTopLeft);
where the coordinates specify the new position for the upper-left display-window corner, relative to the upper-left corner of the screen. Similarly, the following function resets the size of the current display window: glutReshapeWindow (dwNewWidth, dwNewHeight);
With the following command, we can expand the current display window to fill the screen: glutFullScreen ( );
The exact size of the display window after execution of this routine depends on the window-management system. A subsequent call to either glutPositionWindow or glutReshapeWindow will cancel the request for an expansion to full-screen size. Whenever the size of a display window is changed, its aspect ratio may change and objects may be distorted from their original shapes. We can adjust for a change in display-window dimensions using the statement glutReshapeFunc (winReshapeFcn);
This GLUT routine is activated when the size of a display window is changed, and the new width and height are passed to its argument: the function winReshapeFcn, in this example. Thus, winReshapeFcn is the “callback function” for the “reshape event.” We can then use this callback function to change the parameters for the viewport so that the original aspect ratio of the scene is maintained. In addition, we could also reset the clipping-window boundaries, change the display-window color, adjust other viewing parameters, and perform any other tasks.
Managing Multiple GLUT Display Windows The GLUT library also has a number of routines for manipulating a display window in various ways. These routines are particularly useful when we have multiple display windows on the screen and we want to rearrange them or locate a particular display window. We use the following routine to convert the current display window to an icon in the form of a small picture or symbol representing the window: glutIconifyWindow ( );
The label on this icon will be the same name that we assigned to the window, but we can change this with the following command: glutSetIconTitle ("Icon Name");
239
Two-Dimensional Viewing
We also can change the name of the display window with a similar command: glutSetWindowTitle ("New Window Name");
With multiple display windows open on the screen, some windows may overlap or totally obscure other display windows. We can choose any display window to be in front of all other windows by first designating it as the current window, and then issuing the “pop-window” command: glutSetWindow (windowID); glutPopWindow ( );
In a similar way, we can “push” the current display window to the back so that it is behind all other display windows. This sequence of operations is glutSetWindow (windowID); glutPushWindow ( );
We can also take the current window off the screen with glutHideWindow ( );
In addition, we can return a “hidden” display window, or one that has been converted to an icon, by designating it as the current display window and then invoking the function glutShowWindow ( );
GLUT Subwindows Within a selected display window, we can set up any number of second-level display windows, which are called subwindows. This provides a means for partitioning display windows into different display sections. We create a subwindow with the following function: glutCreateSubWindow (windowID, xBottomLeft, yBottomLeft, width, height);
Parameter windowID identifies the display window in which we want to set up the subwindow. With the remaining parameters, we specify the subwindow’s size and the placement of its lower-left corner relative to the lower-left corner of the display window. Subwindows are assigned a positive integer identifier in the same way that first-level display windows are numbered, and we can place a subwindow inside another subwindow. Also, each subwindow can be assigned an individual display mode and other parameters. We can even reshape, reposition, push, pop, hide, and show subwindows, just as we can with first-level display windows. But we cannot convert a GLUT subwindow to an icon.
Selecting a Display-Window Screen-Cursor Shape We can use the following GLUT routine to request a shape for the screen cursor that is to be used with the current window: glutSetCursor (shape);
240
Two-Dimensional Viewing
The possible cursor shapes that we can select are an arrow pointing in a chosen direction, a bidirectional arrow, a rotating arrow, a crosshair, a wristwatch, a question mark, or even a skull and crossbones. For example, we can assign the symbolic constant GLUT CURSOR UP DOWN to parameter shape to obtain an up-down arrow. A rotating arrow is chosen with GLUT CURSOR CYCLE, a wristwatch shape is selected with GLUT CURSOR WAIT, and a skull and crossbones is obtained with the constant GLUT CURSOR DESTROY. A cursor shape can be assigned to a display window to indicate a particular kind of application, such as an animation. However, the exact shapes that we can use are system dependent.
Viewing Graphics Objects in a GLUT Display Window After we have created a display window and selected its position, size, color, and other characteristics, we indicate what is to be shown in that window. If more than one display window has been created, we first designate the one we want as the current display window. Then we invoke the following function to assign something to that window: glutDisplayFunc (pictureDescrip);
The argument is a routine that describes what is to be displayed in the current window. This routine, called pictureDescrip for this example, is referred to as a callback function because it is the routine that is to be executed whenever GLUT determines that the display-window contents should be renewed. Routine pictureDescrip usually contains the OpenGL primitives and attributes that define a picture, although it could specify other constructs such as a menu display. If we have set up multiple display windows, then we repeat this process for each of the display windows or subwindows. Also, we may need to call glutDisplayFunc after the glutPopWindow command if the display window has been damaged during the process of redisplaying the windows. In this case, the following function is used to indicate that the contents of the current display window should be renewed: glutPostRedisplay ( );
This routine is also used when an additional object, such as a pop-up menu, is to be shown in a display window.
Executing the Application Program When the program setup is complete and the display windows have been created and initialized, we need to issue the final GLUT command that signals execution of the program: glutMainLoop ( );
At this time, display windows and their graphic contents are sent to the screen. The program also enters the GLUT processing loop that continually checks for new “events,” such as interactive input from a mouse or a graphics tablet.
241
Two-Dimensional Viewing
Other GLUT Functions The GLUT library provides a wide variety of routines to handle processes that are system dependent and to add features to the basic OpenGL library. For example, this library contains functions for generating bitmap and outline characters, and it provides functions for loading values into a color table. In addition, some GLUT functions are available for displaying three-dimensional objects, either as solids or in a wireframe representation. These objects include a sphere, a torus, and the five regular polyhedra (cube, tetrahedron, octahedron, dodecahedron, and icosahedron). Sometimes it is convenient to designate a function that is to be executed when there are no other events for the system to process. We can do that with glutIdleFunc (function);
The parameter for this GLUT routine could reference a background function or a procedure to update parameters for an animation when no other processes are taking place. We also have GLUT functions for obtaining and processing interactive input and for creating and managing menus. Individual routines are provided by GLUT for input devices such as a mouse, keyboard, graphics tablet, and spaceball. Finally, we can use the following function to query the system about some of the current state parameters: glutGet (stateParam);
This function returns an integer value corresponding to the symbolic constant we select for its argument. For example, we can obtain the x-coordinate position for the top-left corner of the current display window, relative to the top-left corner of the screen, with the constant GLUT WINDOW X; and we can retrieve the current display-window width or the screen width with GLUT WINDOW WIDTH or GLUT SCREEN WIDTH.
OpenGL Two-Dimensional Viewing Program Example As a demonstration of the use of the OpenGL viewport function, we use a splitscreen effect to show two views of a triangle in the xy plane with its centroid at the world-coordinate origin. First, a viewport is defined in the left half of the display window, and the original triangle is displayed there in blue. Using the same clipping window, we then define another viewport for the right half of the display window, and the fill color is changed to red. The triangle is then rotated about its centroid and displayed in the second viewport.
#include class wcPt2D { public: GLfloat x, y; };
242
Two-Dimensional Viewing
void init (void) { /* Set color of display window to white. glClearColor (1.0, 1.0, 1.0, 0.0);
*/
/* Set parameters for world-coordinate clipping window. glMatrixMode (GL_PROJECTION); gluOrtho2D (-100.0, 100.0, -100.0, 100.0);
*/
/* Set mode for constructing geometric transformation matrix. glMatrixMode (GL_MODELVIEW);
*/
} void triangle (wcPt2D *verts) { GLint k; glBegin (GL_TRIANGLES); for (k = 0; k < 3; k++) glVertex2f (verts [k].x, verts [k].y); glEnd ( ); } void displayFcn (void) { /* Define initial position for triangle. */ wcPt2D verts [3] = { {-50.0, -25.0}, {50.0, -25.0}, {0.0, 50.0} }; glClear (GL_COLOR_BUFFER_BIT);
//
Clear display window.
glColor3f (0.0, 0.0, 1.0); glViewport (0, 0, 300, 300); triangle (verts);
// // //
Set fill color to blue. Set left viewport. Display triangle.
/* Rotate triangle and display in right half of display window. glColor3f (1.0, 0.0, 0.0); // Set fill color to red. glViewport (300, 0, 300, 300); // Set right viewport. glRotatef (90.0, 0.0, 0.0, 1.0); // Rotate about z axis. triangle (verts); // Display red rotated triangle.
*/
glFlush ( ); } void main (int argc, char ** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (50, 50); glutInitWindowSize (600, 300); glutCreateWindow ("Split-Screen Example"); init ( ); glutDisplayFunc (displayFcn); glutMainLoop ( ); }
243
Two-Dimensional Viewing
5 Clipping Algorithms Generally, any procedure that eliminates those portions of a picture that are either inside or outside a specified region of space is referred to as a clipping algorithm or simply clipping. Usually a clipping region is a rectangle in standard position, although we could use any shape for a clipping application. The most common application of clipping is in the viewing pipeline, where clipping is applied to extract a designated portion of a scene (either two-dimensional or three-dimensional) for display on an output device. Clipping methods are also used to antialias object boundaries, to construct objects using solid-modeling methods, to manage a multiwindow environment, and to allow parts of a picture to be moved, copied, or erased in drawing and painting programs. Clipping algorithms are applied in two-dimensional viewing procedures to identify those parts of a picture that are within the clipping window. Everything outside the clipping window is then eliminated from the scene description that is transferred to the output device for display. An efficient implementation of clipping in the viewing pipeline is to apply the algorithms to the normalized boundaries of the clipping window. This reduces calculations, because all geometric and viewing transformation matrices can be concatenated and applied to a scene description before clipping is carried out. The clipped scene can then be transferred to screen coordinates for final processing. In the following sections, we explore two-dimensional algorithms for • Point clipping • Line clipping (straight-line segments) • Fill-area clipping (polygons) • Curve clipping • Text clipping
Point, line, and polygon clipping are standard components of graphics packages. But similar methods can be applied to other objects, particularly conics, such as circles, ellipses, and spheres, in addition to spline curves and surfaces. Usually, however, objects with nonlinear boundaries are approximated with straight-line segments or polygon surfaces to reduce computations. Unless otherwise stated, we assume that the clipping region is a rectangular window in standard position, with boundary edges at coordinate positions xwmin , xwmax , ywmin , and ywmax . These boundary edges typically correspond to a normalized square, in which the x and y values range either from 0 to 1 or from −1 to 1.
6 Two-Dimensional Point Clipping For a clipping rectangle in standard position, we save a two-dimensional point P = (x, y) for display if the following inequalities are satisfied: xwmin ≤ x ≤ xwmax ywmin ≤ y ≤ ywmax
(12)
If any of these four inequalities is not satisfied, the point is clipped (not saved for display). Although point clipping is applied less often than line or polygon clipping, it is useful in various situations, particularly when pictures are modeled with particle systems. For example, point clipping can be applied to scenes involving
244
Two-Dimensional Viewing
clouds, sea foam, smoke, or explosions that are modeled with “particles,” such as the center coordinates for small circles or spheres.
7 Two-Dimensional Line Clipping Figure 9 illustrates possible positions for straight-line segments in relationship to a standard clipping window. A line-clipping algorithm processes each line in a scene through a series of tests and intersection calculations to determine whether the entire line or any part of it is to be saved. The expensive part of a line-clipping procedure is in calculating the intersection positions of a line with the window edges. Therefore, a major goal for any line-clipping algorithm is to minimize the intersection calculations. To do this, we can first perform tests to determine whether a line segment is completely inside the clipping window or completely outside. It is easy to determine whether a line is completely inside a clipping window, but it is more difficult to identify all lines that are entirely outside the window. If we are unable to identify a line as completely inside or completely outside a clipping rectangle, we must then perform intersection calculations to determine whether any part of the line crosses the window interior. We test a line segment to determine if it is completely inside or outside a selected clipping-window edge by applying the point-clipping tests of the previous section. When both endpoints of a line segment are inside all four clipping boundaries, such as the line from P1 to P2 in Figure 9, the line is completely inside the clipping window and we save it. And when both endpoints of a line segment are outside any one of the four boundaries (as with line P3 P4 in Figure 9), that line is completely outside the window and it is eliminated from the scene description. But if both these tests fail, the line segment intersects at least one clipping boundary and it may or may not cross into the interior of the clipping window. One way to formulate the equation for a straight-line segment is to use the following parametric representation, where the coordinate positions (x0 , y0 ) and (xend , yend ) designate the two line endpoints: x = x0 + u(xend − x0 ) y = y0 + u(yend − y0 )
(13)
0≤u≤1
We can use this parametric representation to determine where a line segment crosses each clipping-window edge by assigning the coordinate value for that edge to either x or y and solving for parameter u. For example, the left window boundary is at position xwmin , so we substitute this value for x, solve for u, and calculate the corresponding y-intersection value. If this value of u is outside the range from 0 to 1, the line segment does not intersect that window border line. P9 Clipping Window
Clipping Window P2
P2
P4
P8
P1 P3
P6
P5
P10
P1 P6
P5⬘
P8⬘
P7⬘
P7 Before Clipping
After Clipping
(a)
(b)
FIGURE 9
Clipping straight-line segments using a standard rectangular clipping window.
245
Two-Dimensional Viewing
However, if the value of u is within the range from 0 to 1, part of the line is inside that border. We can then process this inside portion of the line segment against the other clipping boundaries until either we have clipped the entire line or we find a section that is inside the window. Processing line segments in a scene using the simple clipping approach described in the preceding paragraph is straightforward, but not very efficient. It is possible to reformulate the initial testing and the intersection calculations to reduce processing time for a set of line segments, and a number of faster line clippers have been developed. Some of the algorithms are designed explicitly for two-dimensional pictures and some are easily adapted to sets of threedimensional line segments.
Cohen-Sutherland Line Clipping bit 4
bit 3
Top
bit 2
bit 1
Right Bottom
Left
FIGURE 10
A possible ordering for the clippingwindow boundaries corresponding to the bit positions in the CohenSutherland endpoint region code.
This is one of the earliest algorithms to be developed for fast line clipping, and variations of this method are widely used. Processing time is reduced in the Cohen-Sutherland method by performing more tests before proceeding to the intersection calculations. Initially, every line endpoint in a picture is assigned a four-digit binary value, called a region code, and each bit position is used to indicate whether the point is inside or outside one of the clipping-window boundaries. We can reference the window edges in any order, and Figure 10 illustrates one possible ordering with the bit positions numbered 1 through 4 from right to left. Thus, for this ordering, the rightmost position (bit 1) references the left clipping-window boundary, and the leftmost position (bit 4) references the top window boundary. A value of 1 (or true) in any bit position indicates that the endpoint is outside that window border. Similarly, a value of 0 (or false) in any bit position indicates that the endpoint is not outside (it is inside or on) the corresponding window edge. Sometimes, a region code is referred to as an “out” code because a value of 1 in any bit position indicates that the spatial point is outside the corresponding clipping boundary. Each clipping-window edge divides two-dimensional space into an inside half space and an outside half space. Together, the four window borders create nine regions, and Figure 11 lists the value for the binary code in each of these regions. Thus, an endpoint that is below and to the left of the clipping window is assigned the region code 0101, and the region-code value for any endpoint inside the clipping window is 0000. Bit values in a region code are determined by comparing the coordinate values (x, y) of an endpoint to the clipping boundaries. Bit 1 is set to 1 if x < xwmin , and
1001
0001
1000
0000
1010
0010
Clipping Window FIGURE 11
The nine binary region codes for identifying the position of a line endpoint, relative to the clipping-window boundaries.
246
0101
0100
0110
Two-Dimensional Viewing
the other three bit values are determined similarly. Instead of using inequality testing, we can determine the values for a region-code more efficiently using bit-processing operations and the following two steps: (1) Calculate differences between endpoint coordinates and clipping boundaries. (2) Use the resultant sign bit of each difference calculation to set the corresponding value in the region code. For the ordering scheme shown in Figure 10, bit 1 is the sign bit of x − xw min ; bit 2 is the sign bit of xwmax − x; bit 3 is the sign bit of y − ywmin ; and bit 4 is the sign bit of ywmax − y. Once we have established region codes for all line endpoints, we can quickly determine which lines are completely inside the clip window and which are completely outside. Any lines that are completely contained within the window edges have a region code of 0000 for both endpoints, and we save these line segments. Any line that has a region-code value of 1 in the same bit position for each endpoint is completely outside the clipping rectangle, and we eliminate that line segment. As an example, a line that has a region code of 1001 for one endpoint and a code of 0101 for the other endpoint is completely to the left of the clipping window, as indicated by the value of 1 in the first bit position of each region code. We can perform the inside-outside tests for line segments using logical operators. When the or operation between two endpoint region codes for a line segment is false (0000), the line is inside the clipping window. Therefore, we save the line and proceed to test the next line in the scene description. When the and operation between the two endpoint region codes for a line is true (not 0000), the line is completely outside the clipping window, and we can eliminate it from the scene description. Lines that cannot be identified as being completely inside or completely outside a clipping window by the region-code tests are next checked for intersection with the window border lines. As shown in Figure 12, line segments can intersect clipping boundary lines without entering the interior of the window. Therefore, several intersection calculations might be necessary to clip a line segment, depending on the order in which we process the clipping boundaries. As we process each clipping-window edge, a section of the line is clipped, and the remaining part of the line is checked against the other window borders. We continue eliminating sections until either the line is totally clipped or the remaining part of the line is inside the clipping window. For the following discussion, we assume that the window edges are processed in the following order: left, right, bottom, top. To determine whether a line crosses a selected clipping boundary, we can check corresponding bit values in the two endpoint region codes. If one of these bit values is 1 and the other is 0, the line segment crosses that boundary. P2
P2⬘ P2⬙ Clipping Window
P3
FIGURE 12
P1⬘
P3⬘
P1 P4
Lines extending from one clipping-window region to another may cross into the clipping window, or they could intersect one or more clipping boundaries without entering the window.
247
Two-Dimensional Viewing
Figure 12 illustrates two line segments that cannot be identified immediately as completely inside or completely outside the clipping window. The region codes for the line from P1 to P2 are 0100 and 1001. Thus, P1 is inside the left clipping boundary and P2 is outside that boundary. We then calculate the intersection position P2 , and we clip off the line section from P2 to P2 . The remaining portion of the line is inside the right border line, and so we next check the bottom border. Endpoint P1 is below the bottom clipping edge and P2 is above it, so we determine the intersection position at this boundary (P1 ). We eliminate the line section from P1 to P1 and proceed to the top window edge. There we determine the intersection position to be P2 . The final step is to clip off the section above the top boundary and save the interior segment from P1 to P2 . For the second line, we find that point P3 is outside the left boundary and P4 is inside. Thus, we calculate the intersection position P3 and eliminate the line section from P3 to P3 . By checking region codes for the endpoints P3 and P4 , we find that the remainder of the line is below the clipping window and can be eliminated as well. It is possible, when clipping a line segment using this approach, to calculate an intersection position at all four clipping boundaries, depending on how the line endpoints are processed and what ordering we use for the boundaries. Figure 13 shows the four intersection positions that could be calculated for a line segment that is processed against the clipping-window edges in the order left, right, bottom, top. Therefore, variations of this basic approach have been developed in an effort to reduce the intersection calculations. To determine a boundary intersection for a line segment, we can use the slopeintercept form of the line equation. For a line with endpoint coordinates (x0 , y0 ) and (xend , yend ), the y coordinate of the intersection point with a vertical clipping border line can be obtained with the calculation y = y0 + m(x − x0 )
(14)
where the x value is set to either xwmin or xwmax , and the slope of the line is calculated as m = (yend − y0 )/(xend − x0 ). Similarly, if we are looking for the intersection with a horizontal border, the x coordinate can be calculated as y − y0 x = x0 + (15) m with y set either to ywmin or to ywmax . (xo, yo) 1 4
Clipping Window FIGURE 13
Four intersection positions (labeled from 1 to 4) for a line segment that is clipped against the window boundaries in the order left, right, bottom, top.
248
3 2 (xend, yend)
Two-Dimensional Viewing
An implementation of the two-dimensional, Cohen-Sutherland line-clipping algorithm is given in the following procedures.
class wcPt2D { public: GLfloat x, y; }; inline GLint round (const GLfloat a)
{ return GLint (a + 0.5); }
/* Define a four-bit code for each of the outside regions of a * rectangular clipping window. */ const GLint winLeftBitCode = 0x1; const GLint winRightBitCode = 0x2; const GLint winBottomBitCode = 0x4; const GLint winTopBitCode = 0x8; /* A bit-mask region code is also assigned to each endpoint of an input * line segment, according to its position relative to the four edges of * an input rectangular clip window. * * An endpoint with a region-code value of 0000 is inside the clipping * window, otherwise it is outside at least one clipping boundary. If * the 'or' operation for the two endpoint codes produces a value of * false, the entire line defined by these two endpoints is saved * (accepted). If the 'and' operation between two endpoint codes is * true, the line is completely outside the clipping window, and it is * eliminated (rejected) from further processing. */ inline GLint inside (GLint code) { return GLint (!code); } inline GLint reject (GLint code1, GLint code2) { return GLint (code1 & code2); } inline GLint accept (GLint code1, GLint code2) { return GLint (!(code1 | code2)); } GLubyte encode (wcPt2D pt, wcPt2D winMin, wcPt2D winMax) { GLubyte code = 0x00; if (pt.x < winMin.x) code = code | winLeftBitCode; if (pt.x > winMax.x) code = code | winRightBitCode; if (pt.y < winMin.y) code = code | winBottomBitCode; if (pt.y > winMax.y) code = code | winTopBitCode; return (code); }
249
Two-Dimensional Viewing
void swapPts (wcPt2D * p1, wcPt2D * p2) { wcPt2D tmp; tmp = *p1; *p1 = *p2; *p2 = tmp; } void swapCodes (GLubyte * c1, GLubyte * c2) { GLubyte tmp; tmp = *c1; *c1 = *c2; *c2 = tmp; } void lineClipCohSuth (wcPt2D winMin, wcPt2D winMax, wcPt2D p1, wcPt2D p2) { GLubyte code1, code2; GLint done = false, plotLine = false; GLfloat m; while (!done) { code1 = encode (p1, winMin, winMax); code2 = encode (p2, winMin, winMax); if (accept (code1, code2)) { done = true; plotLine = true; } else if (reject (code1, code2)) done = true; else { /* Label the endpoint outside the display window as p1. */ if (inside (code1)) { swapPts (&p1, &p2); swapCodes (&code1, &code2); } /* Use slope m to find line-clipEdge intersection. */ if (p2.x != p1.x) m = (p2.y - p1.y) / (p2.x - p1.x); if (code1 & winLeftBitCode) { p1.y += (winMin.x - p1.x) * m; p1.x = winMin.x; } else if (code1 & winRightBitCode) { p1.y += (winMax.x - p1.x) * m; p1.x = winMax.x; } else if (code1 & winBottomBitCode) { /* Need to update p1.x for nonvertical lines only. */ if (p2.x != p1.x) p1.x += (winMin.y - p1.y) / m; p1.y = winMin.y; }
250
Two-Dimensional Viewing
else if (code1 & winTopBitCode) { if (p2.x != p1.x) p1.x += (winMax.y - p1.y) / m; p1.y = winMax.y; } } } if (plotLine) lineBres (round (p1.x), round (p1.y), round (p2.x), round (p2.y)); }
Liang-Barsky Line Clipping Faster line-clipping algorithms have been developed that do more line testing before proceeding to the intersection calculations. One of the earliest efforts in this direction is an algorithm developed by Cyrus and Beck, which is based on analysis of the parametric line equations. Later, Liang and Barsky independently devised an even faster form of the parametric line-clipping algorithm. For a line segment with endpoints (x0 , y0 ) and (xend , yend ), we can describe the line with the parametric form x = x0 + ux y = y0 + uy
0≤u≤1
(16)
where x = xend − x0 and y = yend − y0 . In the Liang-Barsky algorithm, the parametric line equations are combined with the point-clipping conditions 12 to obtain the inequalities xwmin ≤ x0 + ux ≤ xwmax ywmin ≤ y0 + uy ≤ ywmax
(17)
which can be expressed as u pk ≤ q k ,
k = 1, 2, 3, 4
(18)
where parameters p and q are defined as p1 = −x, p2 = x,
q 1 = x0 − xwmin q 2 = xwmax − x0
p3 = −y, p4 = y,
q 3 = y0 − ywmin q 4 = ywmax − y0
(19)
Any line that is parallel to one of the clipping-window edges has pk = 0 for the value of k corresponding to that boundary, where k = 1, 2, 3, and 4 correspond to the left, right, bottom, and top boundaries, respectively. If, for that value of k, we also find q k < 0, then the line is completely outside the boundary and can be eliminated from further consideration. If q k ≥ 0, the line is inside the parallel clipping border. When pk < 0, the infinite extension of the line proceeds from the outside to the inside of the infinite extension of this particular clipping-window edge. If pk > 0, the line proceeds from the inside to the outside. For a nonzero value of pk , we can calculate the value of u that corresponds to the point where the infinitely
251
Two-Dimensional Viewing
extended line intersects the extension of window edge k as qk u= (20) pk For each line, we can calculate values for parameters u1 and u2 that define that part of the line that lies within the clip rectangle. The value of u1 is determined by looking at the rectangle edges for which the line proceeds from the outside to the inside ( p < 0). For these edges, we calculate rk = q k / pk . The value of u1 is taken as the largest of the set consisting of 0 and the various values of r . Conversely, the value of u2 is determined by examining the boundaries for which the line proceeds from inside to outside ( p > 0). A value of rk is calculated for each of these boundaries, and the value of u2 is the minimum of the set consisting of 1 and the calculated r values. If u1 > u2 , the line is completely outside the clip window and it can be rejected. Otherwise, the endpoints of the clipped line are calculated from the two values of parameter u. This algorithm is implemented in the following code sections. Line intersection parameters are initialized to the values u1 = 0 and u2 = 1. For each clipping boundary, the appropriate values for p and q are calculated and used by the function clipTest to determine whether the line can be rejected or whether the intersection parameters are to be adjusted. When p < 0, parameter r is used to update u1 ; when p > 0, parameter r is used to update u2 . If updating u1 or u2 results in u1 > u2 , we reject the line. Otherwise, we update the appropriate u parameter only if the new value results in a shortening of the line. When p = 0 and q < 0, we can eliminate the line because it is parallel to and outside this boundary. If the line has not been rejected after all four values of p and q have been tested, the endpoints of the clipped line are determined from values of u1 and u2 . class wcPt2D { private: GLfloat x, y; public: /* Default Constructor: initialize position as (0.0, 0.0). wcPt3D ( ) { x = y = 0.0; } setCoords (GLfloat xCoord, GLfloat yCoord) { x = xCoord; y = yCoord; } GLfloat getx ( ) const { return x; } GLfloat gety ( ) const { return y; } }; inline GLint round (const GLfloat a)
252
{ return GLint (a + 0.5); }
*/
Two-Dimensional Viewing
GLint clipTest (GLfloat p, GLfloat q, GLfloat * u1, GLfloat * u2) { GLfloat r; GLint returnValue = true; if (p < 0.0) { r = q / p; if (r > *u2) returnValue = false; else if (r > *u1) *u1 = r; } else if (p > 0.0) { r = q / p; if (r < *u1) returnValue = false; else if (r < *u2) *u2 = r; } else /* Thus p = 0 and line is parallel to clipping boundary. if (q < 0.0) /* Line is outside clipping boundary. */ returnValue = false;
*/
return (returnValue); } void lineClipLiangBarsk (wcPt2D winMin, wcPt2D winMax, wcPt2D p1, wcPt2D p2) { GLfloat u1 = 0.0, u2 = 1.0, dx = p2.getx ( ) - p1.getx ( ), dy; if (clipTest (-dx, p1.getx ( ) - winMin.getx ( ), &u1, &u2)) if (clipTest (dx, winMax.getx ( ) - p1.getx ( ), &u1, &u2)) { dy = p2.gety ( ) - p1.gety ( ); if (clipTest (-dy, p1.gety ( ) - winMin.gety ( ), &u1, &u2)) if (clipTest (dy, winMax.gety ( ) - p1.gety ( ), &u1, &u2)) { if (u2 < 1.0) { p2.setCoords (p1.getx ( ) + u2 * dx, p1.gety ( ) + u2 * dy); } if (u1 > 0.0) { p1.setCoords (p1.getx ( ) + u1 * dx, p1.gety ( ) + u1 * dy); } lineBres (round (p1.getx ( )), round (p1.gety ( )), round (p2.getx ( )), round (p2.gety ( ))); } } }
In general, the Liang-Barsky algorithm is more efficient than the CohenSutherland line-clipping algorithm. Each update of parameters u1 and u2 requires only one division; and window intersections of the line are computed only once, when the final values of u1 and u2 have been computed. In contrast, the Cohen
253
Two-Dimensional Viewing
and Sutherland algorithm can calculate intersections repeatedly along a line path, even though the line may be completely outside the clip window. In addition, each Cohen-Sutherland intersection calculation requires both a division and a multiplication. The two-dimensional Liang-Barsky algorithm can be extended to clip three-dimensional lines.
Nicholl-Lee-Nicholl Line Clipping By creating more regions around the clipping window, the Nicholl-Lee-Nicholl (NLN) algorithm avoids multiple line-intersection calculations. In the CohenSutherland method, for example, multiple intersections could be calculated along the path of a line segment before an intersection on the clipping rectangle is located or the line is completely rejected. These extra intersection calculations are eliminated in the NLN algorithm by carrying out more region testing before intersection positions are calculated. Compared to both the Cohen-Sutherland and the Liang-Barsky algorithms, the Nicholl-Lee-Nicholl algorithm performs fewer comparisons and divisions. The trade-off is that the NLN algorithm can be applied only to two-dimensional clipping, whereas both the Liang-Barsky and the Cohen-Sutherland methods are easily extended to three-dimensional scenes. Initial testing to determine whether a line segment is completely inside the clipping window or outside the window limits can be accomplished with regioncode tests, as in the previous two algorithms. If a trivial acceptance or rejection of the line is not possible, the NLN algorithm proceeds to set up additional clipping regions. For a line with endpoints P0 and Pend , we first determine the position of point P0 for the nine possible regions relative to the clipping window. Only the three regions shown in Figure 14 need be considered. If P 0 lies in any one of the other six regions, we can move it to one of the three regions in Figure 14 using a symmetry transformation. For example, the region directly above the clip window can be transformed to the region left of the window using a reflection about the line y = −x, or we could use a 90◦ counterclockwise rotation. Assuming that P0 and Pend are not both inside the clipping window, we next determine the position of Pend relative to P0 . To do this, we create some new regions in the plane, depending on the location of P0 . Boundaries of the new regions are semi-infinite line segments that start at the position of P0 and pass through the clipping-window corners. If P0 is inside the clipping window, we set up the four regions shown in Figure 15. Then, depending on which one of the four regions (L, T, R, or B) contains Pend , we compute the line-intersection position with the corresponding window boundary. If P0 is in the region to the left of the window, we set up the four regions labeled L, LT, LR, and LB in Figure 16. These four regions again determine a unique
P0
P0
P0
P0 inside Clipping Window
P0 in an Edge Region
P0 in a Corner Region
(a)
(b)
(c)
FIGURE 14
Three possible positions for a line endpoint P0 in the NLN line-clipping algorithm.
254
Two-Dimensional Viewing
LT
L P0
T
L
LR
L L
R P0
LB B
FIGURE 15
FIGURE 16
The four regions used in the NLN algorithm when P0 is inside the clipping window and Pend is outside.
The four clipping regions used in the NLN algorithm when P0 is directly to the left of the clip window.
clipping-window edge for the line segment, relative to the position of Pend . For instance, if Pend is in any one of the three regions labeled L, we clip the line at the left window border and save the line segment from this intersection point to Pend . If Pend is in region LT, we save the line segment from the left window boundary to the top boundary. Similar processing is carried out for regions LR and LB. However, if Pend is not in any of the four regions L, LT, LR, or LB, the entire line is clipped. For the third case, when P0 is to the left and above the clipping window, we use the regions in Figure 17. In this case, we have the two possibilities shown, depending on the position of P0 within the top-left corner of the clipping window. When P0 is closer to the left clipping boundary of the window, we use the regions in (a) of this figure. Otherwise, when P0 is closer to the top clipping boundary of the window, we use the regions in (b). If Pend is in one of the regions T, L, TR, TB, LR, or LB, this determines a unique clipping-window border for the intersection calculations. Otherwise, the entire line is rejected. To determine the region in which Pend is located, we compare the slope of the line segment to the slopes of the boundaries of the NLN regions. For example, if P0 is left of the clipping window (Figure 16), then Pend is in region LT if slopeP0 PT R < slopeP0 Pend < slopeP0 PT L
(21)
yT − y0 yend − y0 yT − y0 < < xR − x0 xend − x0 xL − x0
(22)
or
P0 P0 T
T
TR
or T
TR
L
L
L
LR TB
LB (a)
LB (b)
FIGURE 17
The two possible sets of clipping regions used in the NLN algorithm when P0 is above and to the left of the clipping window.
255
Two-Dimensional Viewing
We clip the entire line if (yT − y0 )(xend − x0 ) < (xL − x0 )(yend − y0 )
(23)
The coordinate-difference calculations and product calculations used in the slope tests are saved and also used in the intersection calculations. From the parametric equations x = x0 + (xend − x0 )u y = y0 + (yend − y0 )u we calculate an x-intersection position on the left window boundary as x = xL , with u = (xL − x0 )/(xend − x0 ), so that the y-intersection position is yend − y0 y = y0 + (xL − x0 ) (24) xend − x0 An intersection position on the top boundary has y = yT and u = (yT − y0 )/(yend − y0 ), with xend − x0 x = x0 + (yT − y0 ) (25) yend − y0
Line Clipping Using Nonrectangular Polygon Clip Windows In some applications, it may be desirable to clip lines against arbitrarily shaped polygons. Methods based on parametric line equations, such as either the CyrusBeck algorithm or the Liang-Barsky algorithm, can be readily extended to clip lines against convex polygon windows. We do this by modifying the algorithm to include the parametric equations for the boundaries of the clipping region. Preliminary screening of line segments can be accomplished by processing lines against the coordinate extents of the clipping polygon. For concave-polygon clipping regions, we could still apply these parametric clipping procedures if we first split the concave polygon into a set of convex polygons. Another approach is simply to add one or more edges to the concave clipping area so that it is modified to a convex-polygon shape. Then a series of clipping operations can be applied using the modified convex polygon components, as illustrated in Figure 18. The line segment P1 P2 in (a) of this figure is to be clipped by the concave window with vertices V1 , V2 , V3 , V4 , and V5 . Two convex clipping regions are obtained, in this case, by adding a line segment from V4 to V1 . Then the line is clipped in two passes: (1) Line P1 P2 is clipped by the convex polygon with vertices V1 , V2 , V3 , and V4 to yield the clipped segment P1 P2 [Figure 18(b)]. (2) The internal line segment P1 P2 is clipped off using the convex polygon with vertices V1 , V5 , and V4 [Figure 18(c)] to yield the final clipped line segment P1 P2.
Line Clipping Using Nonlinear Clipping-Window Boundaries Circles or other curved-boundary clipping regions are also possible, but they require more processing because the intersection calculations involve nonlinear equations. At the first step, lines could be clipped against the bounding rectangle (coordinate extents) of the curved clipping region. Lines that are outside the coordinate extents are eliminated. To identify lines that are inside a circle, for instance, we could calculate the distance of the line endpoints from the circle center. If the square of this distance for both endpoints of a line is less than or equal to the radius squared, we can save the entire line. The remaining lines are then processed through the intersection calculations, which must solve simultaneous circle-line equations.
256
Two-Dimensional Viewing Concave Polygon Clipping Window
V3 V4 P1
P2
V5 V2 V1 (a)
V3 V4 P1
P⬘2
P1⬘
P2 V2 Clip Exterior Line Segments
V1
FIGURE 18
A concave- polygon clipping window (a), with vertex list (V1 , V2 , V3 , V4 , V5 ), is modified to the convex polygon (V1 , V2 , V3 , V4 ) in (b). The external
(b) V4 P1⬙
P1⬘
P⬘2 V5 V1
Clip Interior Line Segment (c)
segments of line P1 P2 are then snipped off using this convex clipping window. The resulting line segment, P1 P2 , is next processed against the triangle (V1 , V5 , V4 ) (c) to clip off the internal line segment P1 P1 to produce the final clipped line P1 P2 .
8 Polygon Fill-Area Clipping Graphics packages typically support only fill areas that are polygons, and often only convex polygons. To clip a polygon fill area, we cannot apply a line-clipping method to the individual polygon edges directly because this approach would not, in general, produce a closed polyline. Instead, a line clipper would often produce a disjoint set of lines with no complete information about how we might form a closed boundary around the clipped fill area. Figure 19 illustrates a possible output from a line-clipping procedure applied to the edges of a polygon fill area. What we require is a procedure that will output one or more closed polylines for the boundaries of the clipped fill area, so that the polygons can be scan-converted to fill the interiors with the assigned color or pattern, as in Figure 20. We can process a polygon fill area against the borders of a clipping window using the same general approach as in line clipping. A line segment is defined by its two endpoints, and these endpoints are processed through a line-clipping procedure by constructing a new set of clipped endpoints at each clipping-window boundary. Similarly, we need to maintain a fill area as an entity as it is processed through the clipping stages. Thus, we can clip a polygon fill area by determining the new shape for the polygon as each clipping-window edge is processed, as demonstrated in Figure 21. Of course, the interior fill for the polygon would not be applied until the final clipped border had been determined.
257
Two-Dimensional Viewing
Before Clipping
After Clipping
Before Clipping
After Clipping
(a)
(b)
(a)
(b)
FIGURE 19
FIGURE 20
A line-clipping algorithm applied to the line segments of the polygon boundary in (a) generates the unconnected set of lines in (b).
Display of a correctly clipped polygon fill area.
Original Polygon
Clip Left
Clip Right
Clip Bottom
Clip Top
FIGURE 21
Processing a polygon fill area against successive clipping-window boundaries. Clipping Window
Coordinate Extents of Polygon Fill Area
FIGURE 22
A polygon fill area with coordinate extents outside the right clipping boundary.
Just as we first tested a line segment to determine whether it could be completely saved or completely clipped, we can do the same with a polygon fill area by checking its coordinate extents. If the minimum and maximum coordinate values for the fill area are inside all four clipping boundaries, the fill area is saved for further processing. If these coordinate extents are all outside any of the clipping-window borders, we eliminate the polygon from the scene description (Figure 22). When we cannot identify a fill area as being completely inside or completely outside the clipping window, we then need to locate the polygon intersection positions with the clipping boundaries. One way to implement convex-polygon
258
Two-Dimensional Viewing 1 1⬘
1⬙
1⬘
1⬙ 3⬙
3⬙ 3
Clip 3⬘
3⬘
Clipping Window
FIGURE 23
2⬙
2⬘
2⬘
2⬙
2 (a)
(b)
A convex-polygon fill area (a), defined with the vertex list {1, 2, 3}, is clipped to produce the fill-area shape shown in (b), which is defined with the output vertex list {1 , 2 , 2 , 3 , 3 , 1 }.
clipping is to create a new vertex list at each clipping boundary, and then pass this new vertex list to the next boundary clipper. The output of the final clipping stage is the vertex list for the clipped polygon (Figure 23). For concave-polygon clipping, we would need to modify this basic approach so that multiple vertex lists could be generated.
Sutherland--Hodgman Polygon Clipping An efficient method for clipping a convex-polygon fill area, developed by Sutherland and Hodgman, is to send the polygon vertices through each clipping stage so that a single clipped vertex can be immediately passed to the next stage. This eliminates the need for an output set of vertices at each clipping stage, and it allows the boundary-clipping routines to be implemented in parallel. The final output is a list of vertices that describe the edges of the clipped polygon fill area. Because the Sutherland-Hodgman algorithm produces only one list of output vertices, it cannot correctly generate the two output polygons in Figure 20(b) that are the result of clipping the concave polygon shown in Figure 20(a). However, more processing steps can be added to the algorithm to allow it to produce multiple output vertex lists, so that general concave-polygon clipping could be accomodated. And the basic Sutherland-Hodgman algorithm is able to process concave polygons when the clipped fill area can be described with a single vertex list. The general strategy in this algorithm is to send the pair of endpoints for each successive polygon line segment through the series of clippers (left, right, bottom, and top). As soon as a clipper completes the processing of one pair of vertices, the clipped coordinate values, if any, for that edge are sent to the next clipper. Then the first clipper processes the next pair of endpoints. In this way, the individual boundary clippers can be operating in parallel. There are four possible cases that need to be considered when processing a polygon edge against one of the clipping boundaries. One possibility is that the first edge endpoint is outside the clipping boundary and the second endpoint is inside. Or, both endpoints could be inside this clipping boundary. Another possibility is that the first endpoint is inside the clipping boundary and the second endpoint is outside. And, finally, both endpoints could be outside the clipping boundary. To facilitate the passing of vertices from one clipping stage to the next, the output from each clipper can be formulated as shown in Figure 24. As each successive pair of endpoints is passed to one of the four clippers, an output is generated for the next clipper according to the results of the following tests: 1. If the first input vertex is outside this clipping-window border and the
second vertex is inside, both the intersection point of the polygon edge with the window border and the second vertex are sent to the next clipper.
259
Two-Dimensional Viewing V1⬘
V2
V2
V1⬘
V1
V2
V1
V1
V1
V2
(1)
(2)
(3)
(4)
out in Output: V1⬘, V2
in in Output: V2
in out Output: V1⬘
out out Output: none
FIGURE 24
The four possible outputs generated by the left clipper, depending on the position of a pair of endpoints relative to the left boundary of the clipping window.
2
2⬘
Clipping Window
3 2⬙
1⬘
3⬘ 1
Input Edge:
FIGURE 25
Processing a set of polygon vertices, {1, 2, 3}, through the boundary clippers using the Sutherland-Hodgman algorithm. The final set of clipped vertices is {1 , 2, 2 , 2 }.
Left Clipper
{1, 2}: (in – in)
{2}
{2, 3}: (in – out)
{2⬘}
{3, 1}: (out – in)
{3⬘, 1}
Right Clipper
{2, 2⬘}: (in – in)
Bottom Clipper
Top Clipper
{2⬘}
{2⬘, 3⬘}: (in – in)
{3⬘} {2⬘, 3⬘}: (in – out)
{2⬙}
{3⬘, 1}: (in – in)
{1} {3⬘, 1}: (out – out)
{}
{1, 2}: (in – in)
{2}
{1, 2}: (out – in)
{1⬘, 2} {2⬙, 1⬘}: (in – in)
{1⬘}
{2, 2⬘}: (in – in)
{2⬘}
{1⬘, 2}: (in – in)
{2}
{2, 2⬘}: (in – in)
{2⬘}
{2⬘, 2⬙}: (in – in)
{2⬙}
2. If both input vertices are inside this clipping-window border, only the
second vertex is sent to the next clipper. 3. If the first vertex is inside this clipping-window border and the second
vertex is outside, only the polygon edge-intersection position with the clipping-window border is sent to the next clipper. 4. If both input vertices are outside this clipping-window border, no vertices
are sent to the next clipper. The last clipper in this series generates a vertex list that describes the final clipped fill area. Figure 25 provides an example of the Sutherland-Hodgman polygonclipping algorithm for a fill area defined with the vertex set {1, 2, 3}. As soon as a clipper receives a pair of endpoints, it determines the appropriate output using the tests illustrated in Figure 24. These outputs are passed in succession from the left clipper to the right, bottom, and top clippers. The output from the
260
Two-Dimensional Viewing
top clipper is the set of vertices defining the clipped fill area. For this example, the output vertex list is {1 , 2, 2 , 2 }. A sequential implementation of the Sutherland-Hodgman polygon-clipping algorithm is demonstrated in the following set of procedures. An input set of vertices is converted to an output vertex list by clipping it against the four edges of the axis-aligned rectangular clipping region.
typedef enum { Left, Right, Bottom, Top } Boundary; const GLint nClip = 4; GLint inside (wcPt2D p, Boundary { switch (b) { case Left: if (p.x < wMin.x) case Right: if (p.x > wMax.x) case Bottom: if (p.y < wMin.y) case Top: if (p.y > wMax.y) } return (true); }
b, wcPt2D wMin, wcPt2D wMax)
return return return return
(false); (false); (false); (false);
break; break; break; break;
GLint cross (wcPt2D p1, wcPt2D p2, Boundary winEdge, wcPt2D wMin, wcPt2D wMax) { if (inside (p1, winEdge, wMin, wMax) == inside (p2, winEdge, wMin, wMax)) return (false); else return (true); } wcPt2D intersect (wcPt2D p1, wcPt2D p2, Boundary winEdge, wcPt2D wMin, wcPt2D wMax) { wcPt2D iPt; GLfloat m; if (p1.x != p2.x) m = (p1.y - p2.y) / (p1.x - p2.x); switch (winEdge) { case Left: iPt.x = wMin.x; iPt.y = p2.y + (wMin.x - p2.x) * m; break; case Right: iPt.x = wMax.x; iPt.y = p2.y + (wMax.x - p2.x) * m; break; case Bottom: iPt.y = wMin.y; if (p1.x != p2.x) iPt.x = p2.x + (wMin.y - p2.y) / m; else iPt.x = p2.x; break; case Top: iPt.y = wMax.y; if (p1.x != p2.x) iPt.x = p2.x + (wMax.y - p2.y) / m; else iPt.x = p2.x; break; }
261
Two-Dimensional Viewing
return (iPt); } void clipPoint (wcPt2D p, Boundary winEdge, wcPt2D wMin, wcPt2D wMax, wcPt2D * pOut, int * cnt, wcPt2D * first[], wcPt2D * s) { wcPt2D iPt; /* If no previous point exists for this clipping boundary, * save this point. */ if (!first[winEdge]) first[winEdge] = &p; else /* Previous point exists. If p and previous point cross * this clipping boundary, find intersection. Clip against * next boundary, if any. If no more clip boundaries, add * intersection to output list. */ if (cross (p, s[winEdge], winEdge, wMin, wMax)) { iPt = intersect (p, s[winEdge], winEdge, wMin, wMax); if (winEdge < Top) clipPoint (iPt, b+1, wMin, wMax, pOut, cnt, first, s); else { pOut[*cnt] = iPt; (*cnt)++; } } /* Save p as most recent point for this clip boundary. s[winEdge] = p;
*/
/* For all, if point inside, proceed to next boundary, if any. */ if (inside (p, winEdge, wMin, wMax)) if (winEdge < Top) clipPoint (p, winEdge + 1, wMin, wMax, pOut, cnt, first, s); else { pOut[*cnt] = p; (*cnt)++; } } void closeClip (wcPt2D wMin, wcPt2D wMax, wcPt2D * pOut, GLint * cnt, wcPt2D * first [ ], wcPt2D * s) { wcPt2D pt; Boundary winEdge; for (winEdge = Left; winEdge 0
h ≤ xh ≤ −h,
h ≤ yh ≤ −h,
h ≤ zh ≤ −h
if h < 0
(46)
In most cases h > 0, and we can then assign the bit values in the region code for a coordinate position according to the tests: bit 1 = 1 bit 2 = 1 bit 3 = 1 bit 4 = 1
if h + xh < 0 if h − xh < 0
bit 5 = 1
if h + yh < 0 if h − yh < 0 if h + zh < 0
bit 6 = 1
if h − zh < 0
(left) (right) (bottom) (top)
(47)
(near) (far)
These bit values can be set using the same approach as in two-dimensional clipping. That is, we simply use the sign bit of one of the calculations h ± xh , h ± yh , or h ± zh to set the corresponding region-code bit value. Figure 50 lists the 27 region codes for a view volume. In those cases where h < 0 for some point, we could apply clipping using the second set of inequalities in 46 or we could negate the coordinates and clip using the tests for h > 0.
343
Three-Dimensional Viewing y z
Top
x Far (c)
Near Bottom Left
(b) Right
(a)
FIGURE 50
Values for the three-dimensional, six-bit region code that identifies spatial positions relative to the boundaries of a view volume.
011001
011000
011010
001001
001000
001010
101001
101000
101010
010001
010000
010010
000001
000000
000010
100001
100000
100010
010101
010100
010110
000101
000100
000110
100101
100100
100110
Region Codes In Front of Near Plane (a)
Region Codes Between Near and Far Planes (b)
Region Codes Behind Far Plane (c)
Three-Dimensional Point and Line Clipping For standard point positions and straight-line segments that are defined in a scene that is not behind the projection reference point, all homogeneous parameters are positive and the region codes can be established using the conditions in 47. Then, once we have set up the region code for each position in a scene, we can easily identify a point position as outside the view volume or inside the view volume. For instance, a region code of 101000 tells us that the point is above and directly behind the view volume, while the region code 000000 indicates a point within the volume (Figure 50). Thus, for point clipping, we simply eliminate any individual point whose region code is not 000000. In other words, if any one of the tests in 47 is negative, the point is outside the view volume. Methods for three-dimensional line clipping are essentially the same as for two-dimensional lines. We can first test the line endpoint region codes for trivial acceptance or rejection of the line. If the region code for both endpoints of a line is 000000, the line is completely inside the view volume. Equivalently, we can trivially accept the line if the logical or operation on the two endpoint region codes produces a value of 0. And we can trivially reject the line if the logical and operation on the two endpoint region codes produces a value that is not 0. This nonzero value indicates that both endpoint region codes have a 1 value in the same bit position, and hence the line is completely outside one of the clipping planes. As an example of this, the line from P3 to P4 in Figure 51 has the endpoint regioncode values of 010101 and 100110. So this line is completely below the bottom clipping plane. If a line fails these two tests, we next analyze the line equation to determine whether any part of the line should be saved.
344
Three-Dimensional Viewing P2 (001001)
Normalized View Volume
P1 (000010) P4 (100110)
FIGURE 51
Three-dimensional region codes for two line segments. Line P1 P2 intersects the right and top clipping boundaries of the view volume, while line P3 P4 is completely below the bottom clipping plane.
P3 (010101)
Equations for three-dimensional line segments are conveniently expressed in parametric form, and the clipping methods of Cyrus-Beck or Liang-Barsky can be extended to three-dimensional scenes. For a line segment with endpoints P1 = (xh1, yh1, zh1, h1) and P2 = (xh2, yh2, zh2, h2), we can write the parametric equation describing any point position along the line as
P = P1 + (P2 − P1 )u
0≤u≤1
(48)
When the line parameter has the value u = 0, we are at position P1 . And u = 1 brings us to the other end of the line, P2 . Writing the parametric line equation explicitly, in terms of the homogeneous coordinates, we have xh = xh1 + (xh2 − xh1 )u yh = yh1 + (yh2 − yh1 )u zh = zh1 + (zh2 − zh1 )u
0≤u≤1
(49)
h = h 1 + (h 2 − h 1 )u Using the endpoint region codes for a line segment, we can first determine which clipping planes are intersected. If one of the endpoint region codes has a 0 value in a certain bit position while the other code has a 1 value in the same bit position, then the line crosses that clipping boundary. In other words, one of the tests in 47 generates a negative value, while the same test for the other endpoint of the line produces a nonnegative value. To find the intersection position with this clipping plane, we first use the appropriate equations in 49 to determine the corresponding value of parameter u. Then we calculate the intersection coordinates. As an example of the intersection-calculation procedure, we consider the line segment P1 P2 in Figure 51. This line intersects the right clipping plane, which can be described with the equation xmax = 1. Therefore, we determine the intersection value for parameter u by setting the x-projection coordinate equal to 1: xp =
xh xh1 + (xh2 − xh1 )u = =1 h h 1 + (h 2 − h 1 )u
(50)
Solving for parameter u, we obtain u=
xh1 − h 1 (xh1 − h 1 ) − (xh2 − h 2 )
(51)
345
Three-Dimensional Viewing
Next, we determine the values yp and z p on this clipping plane, using the calculated value for u. In this case, the yp and z p intersection values are within the ±1 boundaries of the view volume and the line does cross into the view-volume interior. So we next proceed to locate the intersection position with the top clipping plane. That completes the processing for this line segment, because the intersection points with the top and right clipping planes identify the part of the line that is inside the view volume and all the line sections that are outside the view volume. When a line intersects a clipping boundary but does not enter the viewvolume interior, we continue the line processing as in two-dimensional clipping. The section of the line outside that clipping boundary is eliminated, and we update the region-code information and the values for parameter u for the part of the line inside that boundary. Then we test the remaining section of the line against the other clipping planes for possible rejection or for further intersection calculations. Line segments in three-dimensional scenes are usually not isolated. They are most often components in the description for the solid objects in the scene, and we need to process the lines as part of the surface-clipping routines.
Three-Dimensional Polygon Clipping Graphics packages typically deal only with scenes that contain “graphics objects.” These are objects whose boundaries are described with linear equations, so that each object is composed of a set of surface polygons. Therefore, to clip objects in a three-dimensional scene, we apply the clipping routines to the polygon surfaces. Figure 52, for example, highlights the surface sections of a pyramid that are to be clipped, and the dashed lines show sections of the polygon surfaces that are inside the view volume. We can first test a polyhedron for trivial acceptance or rejection using its coordinate extents, a bounding sphere, or some other measure of its coordinate limits. If the coordinate limits of the object are inside all clipping boundaries, we save the entire object. If the coordinate limits are all outside any one of the clipping boundaries, we eliminate the entire object. When we cannot save or eliminate the entire object, we can next process the vertex lists for the set of polygons that define the object surfaces. Applying
Normalized View Volume
FIGURE 52
Three-dimensional object clipping. Surface sections that are outside the view-volume clipping planes are eliminated from the object description, and new surface facets may need to be constructed.
346
Three-Dimensional Viewing
methods similar to those in two-dimensional polygon clipping, we can clip edges to obtain new vertex lists for the object surfaces. We may also need to create some new vertex lists for additional surfaces that result from the clipping operations. And the polygon tables are updated to add any new polygon surfaces and to revise the connectivity and shared-edge information about the surfaces. To simplify the clipping of general polyhedra, polygon surfaces are often divided into triangular sections and described with triangle strips. We can then clip the triangle strips using the Sutherland-Hodgman approach. Each triangle strip is processed in turn against the six clipping planes to obtain the final vertex list for the strip. For concave polygons, we can apply splitting methods to obtain a set of triangles, for example, and then clip the triangles. Alternatively, we could clip threedimensional concave polygons using the Weiler-Atherton algorithm.
Three-Dimensional Curve Clipping As in polyhedra clipping, we first check to determine whether the coordinate extents of a curved object, such as a sphere or a spline surface, are completely inside the view volume. Then we can check to determine whether the object is completely outside any one of the six clipping planes. If the trivial rejection-acceptance tests fail, we locate the intersections with the clipping planes. To do this, we solve the simultaneous set of surface equations and the clipping-plane equation. For this reason, most graphics packages do not include clipping routines for curved objects. Instead, curved surfaces are approximated as a set of polygon patches, and the objects are then clipped using polygon-clipping routines. When surface-rendering procedures are applied to polygon patches, they can provide a highly realistic display of a curved surface.
Arbitrary Clipping Planes It is also possible, in some graphics packages, to clip a three-dimensional scene using additional planes that can be specified in any spatial orientation. This option is useful in a variety of applications. For example, we might want to isolate or clip off an irregularly shaped object, eliminate part of a scene at an oblique angle for a special effect, or slice off a section of an object along a selected axis to show a cross-sectional view of its interior. Optional clipping planes can be specified along with the description of a scene, so that the clipping operations can be performed prior to the projection transformation. However, this also means that the clipping routines are implemented in software. A clipping plane can be specified with the plane parameters A, B, C, and D. The plane then divides three-dimensional space into two parts, so that all parts of a scene that lie on one side of the plane are clipped off. Assuming that objects behind the plane are to be clipped, then any spatial position (x, y, z) that satisfies the following inequality is eliminated from the scene. Ax + By + C z + D < 0
(52)
As an example, if the plane-parameter array has the values (A, B, C, D) = (1.0, 0.0, 0.0, 8.0), then any coordinate position satisfying x + 8.0 < 0.0 (or, x < −8.0) is clipped from the scene. To clip a line segment, we can first test its two endpoints to see if the line is completely behind the clipping plane or completely in front of the plane. We can represent inequality 52 in a vector form using the plane normal vector
347
Three-Dimensional Viewing
P1 N ⫽ (A, B, C )
P
P2 FIGURE 53
Clipping a line segment against a plane with normal vector N.
N = (A, B, C). Then, for a line segment with endpoint positions P1 and P2 , we clip the entire line if both endpoints satisfy N · Pk + D < 0,
k = 1, 2
(53)
We save the entire line if both endpoints satisfy N
· Pk + D ≥ 0,
k = 1, 2
(54)
Otherwise, the endpoints are on opposite sides of the clipping plane, as in Figure 53, and we calculate the line intersection point. To calculate the line-intersection point with the clipping plane, we can use the following parametric representation for the line segment: P = P1 + (P2 − P1 )u,
0≤u≤1
(55)
Point P is on the clipping plane if it satisfies the plane equation N
· P+ D=0
(56)
Substituting the expression for P from Equation 55, we have N
· [P1 + (P2 − P1 )u] + D = 0
(57)
Solving this equation for parameter u, we obtain u=
−D − N · P1 N · (P2 − P1 )
(58)
We then substitute this value of u into the vector parametric line representation 55 to obtain values for the x, y, and z intersection coordinates. For the example in Figure 53, the line segment from P1 to P is clipped and we save the section of the line from P to P2 . For polyhedra, such as the pyramid in Figure 54, we apply similar clipping procedures. We first test to see if the object is completely behind or completely in front of the clipping plane. If not, we process the vertex list for each polygon surface. Line-clipping methods are applied to each polygon edge in succession, just as in view-volume clipping, to produce the surface vertex lists. But in this case, we have to deal with only one clipping plane. Clipping a curved object against a single clipping plane is easier than clipping the object against the six planes of a view volume. However, we still need to solve a set of nonlinear equations to locate intersections, unless we approximate the curve boundaries with straight-line sections.
348
Three-Dimensional Viewing
N ⫽ (A, B, C)
FIGURE 54
Clipping the surfaces of a pyramid against a plane with normal vector N. The surfaces in front of the plane are saved, and the surfaces of the pyramid behind the plane are eliminated.
12 OpenGL Optional Clipping Planes In addition to the six clipping planes enclosing the view volume, OpenGL provides for the specification of additional clipping planes in a scene. Unlike the view-volume clipping planes, which are each perpendicular to one of the coordinate axes, these additional planes can have any orientation. We designate an optional clipping plane and activate clipping against that plane with the statements glClipPlane (id, planeParameters); glEnable (id);
Parameter id is used as an identifier for a clipping plane. This parameter is assigned one of the values GL CLIP PLANE0, GL CLIP PLANE1, and so forth, up to a facility-defined maximum. The plane is then defined using the four-element array planeParameters, whose elements are the double-precision, floatingpoint values for the four plane-equation parameters A, B, C, and D. An activated clipping plane that has been assigned the identifier id is turned off with glDisable (id);
The plane parameters A, B, C, and D are transformed to viewing coordinates and used to test viewing-coordinate positions in a scene. Subsequent changes in viewing or geometric-transformation parameters do not affect the stored plane parameters. Therefore, if we set up optional clipping planes before specifying any geometric or viewing transformations, the stored plane parameters are the same as the input parameters. Also, because the clipping routines for these planes are applied in viewing coordinates, and not in the normalized coordinate space, the performance of a program can be degraded when optional clipping planes are activated.
349
Three-Dimensional Viewing
Any points that are “behind” an activated OpenGL clipping plane are eliminated. Thus, a viewing-coordinate position (x, y, z) is clipped if it satisfies condition 52. Six optional clipping planes are available in any OpenGL implementation, but more might be provided. We can find out how many optional clipping planes are possible for a particular OpenGL implementation with the inquiry glGetIntegerv (GL_MAX_CLIP_PLANES, numPlanes);
Parameter numPlanes is the name of an integer array that is to be assigned an integer value equal to the number of optional clipping planes that we can use. The default for the glClipPlane function is that the clipping-plane parameters A, B, C, and D are each assigned a value of 0 for all optional planes. And, initially, all optional clipping planes are disabled.
13 Summary Viewing procedures for three-dimensional scenes follow the general approach used in two-dimensional viewing. We first create a world-coordinate scene, either from the definitions of objects in modeling coordinates or directly in world coordinates. Then we set up a viewing-coordinate reference frame and transfer object descriptions from world coordinates to viewing coordinates. Object descriptions are then processed through various routines to device coordinates. Unlike two-dimensional viewing, however, three-dimensional viewing requires projection routines to transform object descriptions to a viewing plane before the transformation to device coordinates. Also, three-dimensional viewing operations involve more spatial parameters. We can use the camera analogy to describe three-dimensional viewing parameters. A viewing-coordinate reference frame is established with a view reference point (the camera position), a view-plane normal vector N (the camera lens direction), and a view-up vector V (the camera up direction). The view-plane position is then established along the viewing z axis, and object descriptions are projected to this plane. Either parallel-projection or perspective-projection methods can be used to transfer object descriptions to the view plane. Parallel projections are either orthographic or oblique, and they can be specified with a projection vector. Orthographic parallel projections that display more than one face of an object are called axonometric projections. An isometric view of an object is obtained with an axonometric projection that foreshortens each principal axis by the same amount. Commonly used oblique projections are the cavalier projection and the cabinet projection. Perspective projections of objects are obtained with projection lines that meet at the projection reference point. Parallel projections maintain object proportions, but perspective projections decrease the size of distant objects. Perspective projections cause parallel lines to appear to converge to a vanishing point, provided the lines are not parallel to the view plane. Engineering and architectural displays can be generated with one-point, two-point, or three-point perspective projections, depending on the number of principal axes that intersect the view plane. An oblique perspective projection is obtained when the line from the projection reference point to the center of the clipping window is not perpendicular to the view plane. Objects in a three-dimensional scene can be clipped against a view volume to eliminate unwanted sections of the scene. The top, bottom, and sides of the view volume are formed with planes that are parallel to the projection lines and that pass through the clipping-window edges. Near and far planes (also called front and back planes) are used to create a closed view volume. For a parallel
350
Three-Dimensional Viewing
T A B L E
1
Summary of OpenGL Three-Dimensional Viewing Functions Function
Description
gluLookAt
Specifies three-dimensional viewing parameters.
glOrtho
Specifies parameters for a clipping window and the near and far clipping planes for an orthogonal projection.
gluPerspective
Specifies field-of-view angle and other parameters for a symmetric perspective projection.
glFrustum
Specifies parameters for a clipping window and near and far clipping planes for a perspective projection (symmetric or oblique).
glClipPlane
Specifies parameters for an optional clipping plane.
projection, the view volume is a parallelepiped. For a perspective projection, the view volume is a frustum. In either case, we can convert the view volume to a normalized cube with boundaries either at 0 and 1 for each coordinate or at −1 and 1 for each coordinate. Efficient clipping algorithms process objects in a scene against the bounding planes of the normalized view volume. Clipping is generally carried out in graphics packages in four-dimensional homogeneous coordinates following the projection and view-volume normalization transformations. Then, homogeneous coordinates are converted to three-dimensional, Cartesian projection coordinates. Additional clipping planes, with arbitrary orientations, can also be used to eliminate selected parts of a scene or to produce special effects. A function is available in the OpenGL Utility library for specifying threedimensional viewing parameters (see Table 1). This library also includes a function for setting up a symmetric perspective-projection transformation. Three other viewing functions are available in the OpenGL basic library for specifying an orthographic projection, a general perspective projection, and optional clipping planes. Table 1 summarizes the OpenGL viewing functions discussed in this chapter. In addition, the table lists some viewing-related functions.
REFERENCES Discussions of three-dimensional viewing and clipping algorithms can be found in Weiler and Atherton (1977), Weiler (1980), Cyrus and Beck (1978), and Liang and Barsky (1984). Homogeneous-coordinate clipping algorithms are described in Blinn and Newell (1978), Riesenfeld (1981), and Blinn (1993, 1996, and 1998). Various programming techniques for three-dimensional viewing are discussed in Glassner (1990), Arvo (1991), Kirk (1992), Heckbert (1994), and Paeth (1995). A complete listing of three-dimensional OpenGL viewing functions is given in Shreiner (2000). For OpenGL programming examples using threedimensional viewing, see Woo, et al. (1999). Additional programming examples can be found at Nate Robins’s tutorial website: www.xmission.com/∼nate/ opengl.html.
EXERCISES 1
2
3
4
Write a procedure to set up the matrix that transforms world-coordinate positions to threedimensional viewing coordinates, given P0 , N, and V. The view-up vector can be in any direction that is not parallel to N. Write a procedure to transform the vertices of a polyhedron to projection coordinates using a parallel projection with any specified projection vector. Write a program to obtain different parallelprojection views of a polyhedron by allowing the user to rotate the polyhedron via the keyboard. Write a procedure to perform a one-point perspective projection of an object.
351
Three-Dimensional Viewing
5 6 7
8 9
10
11
12
13
14
15 16
17
18
352
Write a procedure to perform a two-point perspective projection of an object. Develop a routine to perform a three-point perspective projection of an object. Write a program that uses the routines in the previous three exercises to display a threedimensional cube using a one-, two-, or threepoint perspective projection according to input taken from the keyboard, which should be used to switch between projections. The program should also allow the user to rotate the cube in the xz plane around its center. Examine the visual differences of the three different types of projections. Write a routine to convert a perspective projection frustum to a regular parallelepiped. Modify the two-dimensional Cohen-Sutherland line-clipping algorithm to clip three-dimensional lines against the normalized symmetric view volume square. Write a program to generate a set of 10 random lines, each of which has one endpoint within a normalized symmetric view volume and one without. Implement the three-dimensional Cohen-Sutherland line-clipping algorithm designed in the previous exercise to clip the set of lines against the viewing volume. Modify the two-dimensional Liang-Barsky lineclipping algorithm to clip three-dimensional lines against a specified regular parallelepiped. Write a program similar to that in Exercise 10 that generates a set of 10 random lines, each partially outside of a specified regular parallelepiped viewing volume. Use the three-dimensional Liang-Barsky line-clipping algorithm developed in the previous exercise to clip the lines against the viewing volume. Modify the two-dimensional Liang-Barsky lineclipping algorithm to clip a given polyhedron against a specified regular parallelepiped. Write a program to display a cube in a regular parallelepiped viewing volume and allow the user to translate the cube along each axis using keyboard input. Implement the algorithm in the previous exercise to clip the cube when it extends over any of the edges of the viewing volume. Write a routine to perform line clipping in homogeneous coordinates. Devise an algorithm to clip a polyhedron against a defined frustum. Compare the operations needed in this algorithm to those needed in an algorithm that clips against a regular parallelepiped. Extend the Sutherland-Hodgman polygonclipping algorithm to clip a convex polyhedron against a normalized symmetric view volume. Write a routine to implement the preceding exercise.
19 Write a program similar to the one in Exercise 14 to display a cube in a normalized symmetric view volume that can be translated around the viewing volume via keyboard input. Use the implementation of the polygon-clipping algorithm developed in the previous exercise to clip the cube when it extends over the edge of the viewing volume. 20 Write a routine to perform polyhedron clipping in homogeneous coordinates. 21 Modify the program example in Section 10 to allow a user to specify a view for either the front or the back of the square. 22 Modify the program example in Section 10 to allow the perspective viewing parameters to be specified as user input. 23 Modify the program example in Section 10 to produce a view of any input polyhedron. 24 Modify the program in the preceding exercise to generate a view of the polyhedron using an orthographic projection. 25 Modify the program in the preceding exercise to generate a view of the polyhedron using an oblique parallel projection.
IN MORE DEPTH 1
2
In this exercise, you will give “depth” to the polygons that represent the objects in your scene and clip them against a normalized view volume. First, choose new z-coordinates and threedimensional orientations for the polygons in your scene that are appropriate to the snapshot of your application. That is, they should be taken out of the xy plane in which they have been constrained so far and given appropriate depth. Once you have done this, implement an extension of the Sutherland-Hodgman polygon-clipping algorithm that allows clipping of convex polygons against a normalized symmetric view volume. You will use this algorithm in the next exercise to produce a view of some portion of your threedimensional scene. Choose a view of the scene from the previous exercise that produces a view volume in which all objects are not fully contained. Apply the algorithm for polygon clipping that you developed in the previous exercise against the view volume. Write routines to display the scene using a parallel projection and a perspective projection. Use the OpenGL three-dimensional viewing functions to do this, choosing appropriate parameters to specify the viewing volume in each case. Allow the user to switch between the two projections via keyboard input and note the differences in the visual appearance of the scene in the two cases.
Three-Dimensional Viewing Color P lates
Color Plate 9
A realistic room display, achieved with a perspective projection, illumination effects, and selected surface properties. (Courtesy of John Snyder, Jed Lengyel, Devendra Kalra, and Al Barr, California Institute of Technology. © 1992 Caltech.)
From Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
353
This page intentionally left blank
Hierarchical Modeling
1
Basic Modeling Concepts
2
Modeling Packages
3
General Hierarchical Modeling Methods
4
Hierarchical Modeling Using OpenGL Display Lists
5
Summary
I
n setting up the definition of a complex object or system, it is usually easiest to specify the subparts first and then describe how the subparts fit together to form the overall object or system. For instance, a bicycle can be described in terms of a frame, wheels, fenders, handlebars, seat, chain, and pedals, along with the rules for positioning these components to form the bicycle. A hierarchical description of this type can be given as a tree structure, consisting of the subparts as the tree nodes and the construction rules as the tree branches. Architectural and engineering systems, such as building layouts, automobile design, electronic circuits, and home appliances, are now routinely developed using computer-aided design (CAD) packages. And graphical design methods are used also for representing economic, financial, organizational, scientific, social, and environmental systems. Simulations are often constructed to study the behavior of a system under various conditions, and the outcome of a simulation can serve as an instructional tool or as a basis for making decisions
From Chapter 11 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
355
Hierarchical Modeling
about the system. Design packages generally provide routines for creating and managing hierarchical models, and some packages also contain predefined shapes, such as wheels, doors, gears, shafts, and electric-circuit components.
1 Basic Modeling Concepts The creation and manipulation of a system representation is termed modeling. Any single representation is called a model of the system, which could be defined graphically or purely descriptively, such as a set of equations that describe the relationships among system parameters. Graphical models are often referred to as geometric models, because the component parts of a system are represented with geometric entities such as straight-line segments, polygons, polyhedra, cylinders, or spheres. Because we are concerned here only with graphics applications, we will use the term model to mean a computer-generated, geometric representation of a system.
System Representations Figure 1 shows a graphical representation for a logic circuit, illustrating the features common to many system models. Component parts of the system are displayed as geometric structures, called symbols, and relationships among the symbols are represented in this example with a network of connecting lines. Three standard symbols are used to represent logic gates for the Boolean operations: and, or, and not. The connecting lines define relationships in terms of input and output flow (from left to right) through the system parts. One symbol, the and gate, is displayed at two different positions within the logic circuit. Repeated positioning of a few basic symbols is a common method for building complex models. Each such occurrence of a symbol within a model is called an instance of that symbol. We have one instance for the or and not symbols in Figure 1 and two instances of the and symbol. In many cases, the particular graphical symbols chosen to represent the parts of a system are dictated by the system description. For circuit models, standard electrical or logic symbols are used. But with models representing abstract concepts, such as political, financial, or economic systems, symbols may be any convenient geometric pattern. Information describing a model is usually provided as a combination of geometric and nongeometric data. Geometric information includes coordinate positions for locating the component parts, output primitives and attribute functions to define the structure of the parts, and data for constructing connections between the parts. Nongeometric information includes text labels, algorithms describing the operating characteristics of the model, and rules for determining the relationships or connections between component parts, if these are not specified as geometric data.
Binary Input
and
Binary Output
not and
FIGURE 1
Model of a logic circuit.
356
or
Hierarchical Modeling
T A B L E
1
Data table defining the structure and position of each gate in the circuit of Figure 1 Symbol Code Gate 1 Gate 2 Gate 3 Gate 4
Geometric Description
Identifying Label
(Coordinates and other parameters) : : :
and or not and
There are two methods for specifying the information needed to construct and manipulate a model. One method is to store the information in a data structure, such as a table or linked list. The other method is to specify the information in procedures. In general, a model specification will contain both data structures and procedures, although some models are defined completely with data structures and others use only procedural specifications. An application to perform solid modeling of objects might use mostly information taken from some data structure to define coordinate positions, with very few procedures. A weather model, on the other hand, may need mostly procedures to calculate plots of temperature and pressure variations. As an example of how combinations of data structures and procedures can be used, we consider some alternative model specifications for the logic circuit of Figure 1. One method is to define the logic components in a data table (Table 1), with processing procedures used to specify how the network connections are to be made and how the circuit operates. Geometric data in this table include coordinates and parameters necessary for drawing and positioning the gates. These symbols could all be drawn as polygons, or they could be formed as combinations of straight-line segments and elliptical arcs. Labels for each of the component parts also have been included in the table, although the labels could be omitted if the symbols are displayed as commonly recognized shapes. Procedures would then be used to display the gates and construct the connecting lines, based on the coordinate positions of the gates and a specified order for connecting them. An additional procedure is used to produce the circuit output (binary values) for any given input. This procedure could be set up to display only the final output, or it could be designed to display intermediate output values to illustrate the internal functioning of the circuit. Alternatively, we might specify graphical information for the circuit model in data structures. The connecting lines, as well as the gates, could then be defined in a data table that explicitly lists endpoints for each of the lines in the circuit. A single procedure might then display the circuit and calculate the output. At the other extreme, we could completely define the model in procedures, using no external data structures.
Symbol Hierarchies Many models can be organized as a hierarchy of symbols. The basic elements for the model are defined as simple geometric shapes appropriate to the type of model under consideration. These basic symbols can be used to form composite objects, sometimes called modules, which themselves can be grouped to form
357
Hierarchical Modeling
Logic Circuit
FIGURE 2
and Gate
A one-level hierarchical description of a circuit formed with logic gates.
not Gate
or Gate
and Gate
higher-level objects, and so on, for the various components of the model. In the simplest case, we can describe a model by a one-level hierarchy of component parts, as in Figure 2. For this circuit example, we assume that the gates are positioned and connected to each other with straight lines according to connection rules that are specified with each gate description. The basic symbols in this hierarchical description are the logic gates. Although the gates themselves could be described as hierarchies—formed from straight lines, elliptical arcs, and text— that description would not be a convenient one for constructing logic circuits, in which the simplest building blocks are gates. For an application in which we were interested in designing different geometric shapes, the basic symbols could be defined as straight-line segments and arcs. An example of a two-level symbol hierarchy appears in Figure 3. Here, a facility layout is planned as an arrangement of work areas. Each work area is outfitted with a collection of furniture. The basic symbols are the furniture items: worktable, chair, shelves, file cabinet, and so forth. Higher-order objects are the work areas, which are put together with different furniture organizations. An instance of a basic symbol is defined by specifying its position, size, and orientation within each work area. Positions are given as coordinate locations in the work areas, and orientations are specified as rotations that determine which way the symbols are facing. At the first level below the root node for the facility tree, each work area is defined by specifying its position, size, and orientation within the facility layout. The boundary for each work area might be defined with a divider that encloses the work area and provides aisles within the facility. More complex symbol hierarchies are formed with repeated groupings of symbol clusters at each higher level. The facility layout of Figure 3 could be extended to include symbol clusters that form different rooms, different floors
Facility
... Work Area 1
Worktable
Chair
Work Area 2
File Cabinet
Worktable
Chair
Work Area 3
Shelves
FIGURE 3
A two-level hierarchical description for a facility layout.
358
Worktable
Chair
Shelves
File Cabinet
...
Hierarchical Modeling
of a building, different buildings within a complex, and different complexes at widely separated geographical locations.
2 Modeling Packages Although system models can be designed and manipulated using a general computer-graphics package, specialized modeling systems are available to facilitate modeling in particular applications. Modeling systems provide a means for defining and rearranging model representations in terms of symbol hierarchies, which are then processed by graphics routines for display. General-purpose graphics systems often do not provide routines to accommodate extensive modeling applications. But some graphics packages, such as GL and PHIGS, do include integrated sets of modeling and graphics functions. If a graphics library contains no modeling functions, we can often use a modeling-package interface to the graphics routines. Alternatively, we could create our own modeling routines using the geometric transformations and other functions available in the graphics library. Specialized modeling packages, such as some CAD systems, are defined and structured according to the type of application the package has been designed to handle. These packages provide menus of symbol shapes and functions for the intended application. And they can be designed for either two-dimensional or three-dimensional modeling.
3 General Hierarchical Modeling Methods We create a hierarchical model of a system by nesting the descriptions of its subparts into one another to form a tree organization. As each node is placed into the hierarchy, it is assigned a set of transformations to position it appropriately into the overall model. For an office-facility design, work areas and offices are formed with arrangements of furniture items. The offices and work areas are then placed into departments, and so forth on up the hierarchy. An example of the use of multiple coordinate systems and hierarchical modeling with threedimensional objects is given in Figure 4. This figure illustrates simulation of tractor movement. As the tractor moves, the tractor coordinate system and frontwheel coordinate system move in the world coordinate system. The front wheels rotate in the wheel system, and the wheel system rotates in the tractor system when the tractor turns. yt Tractor System
yw
xt
zt
zw ytw
ztw
xtw
Front-Wheel System
World System
xw FIGURE 4
Possible coordinate systems used in simulating tractor movement. A rotation of the front-wheel system causes the tractor to turn. Both the wheel and tractor reference frames move in the world coordinate system.
359
Hierarchical Modeling Arrays for Worktable Coordinates
10
x Worktable Arrays for Chair Coordinates x Chair ⫺3 ⫺3 3 3 1 2 1
⫺3 3 3 ⫺3 3 0 ⫺3
0 ⫺8 ⫺8 8 8 0
5
y Chair 5
0
0
⫺5
⫺5 ⫺5
0 Chair
5
(a)
y Worktable 0 0 6 6 ⫺10 ⫺10
⫺10 ⫺10
⫺5
0 Worktable
5
10
(b) FIGURE 5
Objects defined in local coordinates.
Local Coordinates In general design applications, models are constructed with instances (transformed copies) of the geometric shapes that are defined in a basic symbol set. Each instance is positioned, with the proper orientation, in the world-coordinate reference of the overall structure of the model. The various graphical objects to be used in an application are each defined relative to the world-coordinate reference system, which is referred to as the local coordinate system for that object. Local coordinates are also called modeling coordinates, or sometimes master coordinates. Figure 5 illustrates local-coordinate definitions for two symbols that could be used in a two-dimensional facility-layout application.
Modeling Transformations To construct a graphical model, we apply transformations to the local-coordinate definitions of symbols to produce instances of the symbols within the overall structure of the model. Transformations applied to the modeling-coordinate definitions of symbols to give them a particular position and orientation within a model are referred to as modeling transformations. The typical transformations available in a modeling package are translation, rotation, and scaling, but other transformations might also be used in some applications.
Creating Hierarchical Structures A first step in a hierarchical modeling application is to construct modules that are compositions of basic symbols. The modules themselves may then be combined into higher-level modules, and so on. We define each initial module as a list of symbol instances, along with appropriate transformation parameters for each symbol. At the next level, we define each higher-level module as a list of symbol and lower-level module instances along with their transformation parameters. This process is continued up to the root of the tree, which represents the total model in world coordinates.
360
Hierarchical Modeling
In a modeling package, a module is created with a sequence of commands of the form
createModule1 setSymbolTransformation1 insertSymbol1 setSymbolTransformation2 insertSymbol2 . . . closeModule1
Each instance of a basic symbol is assigned a set of transformation parameters for that module. Similarly, modules are combined to form higher-level modules with functions such as
createModule6 setModuleTransformation1 insertModule1 setModuleTransformation2 insertModule2 setSymbolTransformation5 insertSymbol5 . . . closeModule6
The transformation function for each module or symbol specifies how that object is to be fitted into the higher-level module. Often, options are provided so that a specified transformation matrix could premultiply, postmultiply, or replace the current transformation matrix. Although a basic set of symbols could be available in a modeling package, the symbol set might not contain the shapes we need for a particular application. In that case, we can create additional shapes within a modeling program. As an example, the following pseudocode illustrates the specification of a simple model for a bicycle:
createWheelSymbol createFrameSymbol createBicycleModule setFrameTransformation insertFrameSymbol
361
Hierarchical Modeling
setFrontWheelTransformation insertWheelSymbol setBackWheelTransformation insertWheelSymbol closeBicycleModule
A number of other modeling routines are usually available in a system designed for hierarchical modeling. Modules often can be selectively displayed or temporarily taken out of a system representation. This allows a designer to experiment with different shapes and design structures. And selected modules could be highlighted or moved around in the display during the design process.
4 Hierarchical Modeling Using OpenGL Display Lists Complex objects can be described in OpenGL using nested display lists to form a hierarchical model. Each symbol and module for the model is created with a glNewList function. And we insert one display list into another display list using the glCallList function within the definition of the higher-order list. Geometric transformations can be associated with each inserted object to specify a position, orientation, and size within the higher-level module. As an example, the following code could be used to describe a bicycle that is simply composed of a frame and two identical wheels:
glNewList (bicycle, GL_COMPILE); glCallList (frame); glPushMatrix ( ); glTranslatef (tx1, ty1, tz1); glCallList (wheel); glPopMatrix ( ); glPushMatrix ( ); glTranslatef (tx2, ty2, tz2); glCallList (wheel); glPopMatrix ( ); glEndList ( );
This code creates a new display list which, when executed, will invoke glCallList to execute two additional display lists which will draw the frame and the wheels of the bicycle. Because we are positioning the two wheels relative to the location of the frame of the bicycle, when we draw each wheel we use a call to glPushMatrix before applying the translation which positions the wheel, followed by a call to glPopMatrix after drawing the wheel to restore the transformation matrix to its previous state. This isolates the per-wheel translations; without these calls to to glPushMatrix and glPopMatrix, the translations would be cumulative rather than separate—in effect, we would position the
362
Hierarchical Modeling
second wheel relative to the position of the first, rather than relative to the position of the frame. Just as this display list is composed from other lists, so could the frame display list be composed from individual display lists describing the handlebars, chain, pedals, and other components, and the wheel display could be composed from other lists describing the wheel rim, its spokes, and the tire that surrounds the rim.
5 Summary The term “model,” in computer-graphics applications, refers to a graphical representation for some system. Basic components of a system are represented as symbols, defined in local-coordinate reference frames, which are also referred to as modeling, or master, coordinates. We create a model, such as an electrical circuit, by placing instances of the symbols at selected locations with prescribed orientations. Many models are constructed as symbol hierarchies. We can construct a hierarchical model by nesting modules, which are composed of instances of basic symbols and other modules. This nesting process may continue down to symbols that are defined with graphical output primitives and their attributes. As each symbol or module is nested within a higher-level module, an associated modeling transformation is specified for the nested structure. A hierarchical model can be set up in OpenGL using display lists. The glNewList function can be used to define the overall structure of a system and its component modules. Individual symbol structures or other modules are inserted within a module using the glCallList function, preceded by an appropriate set of transformations to specify the position, orientation, and size of the inserted component.
REFERENCE Examples of modeling applications using OpenGL are given in Woo, et al. (1999).
4
Devise a two-dimensional city-planning package that presents a menu of building structure (e.g., residential buildings, commercial buildings, roads, etc.) to a designer. A two-level hierarchy is to be used so that building structures can be placed into various neighborhoods, and the neighborhoods can be arranged within a larger area comprising the land a given city can occupy. Building structures are to be placed into neighborhoods using only translation and rotation instance transformations.
5
Extend the previous exercise so that building structures can also be scaled along each dimension.
6
Write a set of routines for creating and displaying symbols for logic-circuit design. As a minimum, the symbol set should include the and, or, and not gates shown in Figure 1. Develop a modeling package for designing logic circuits that will allow a designer to position
EXERCISES 1
2
3
Discuss model representations that would be appropriate for several distinctly different kinds of systems. Also discuss how graphical representations might be implemented for each system. Devise a two-dimensional neighborhood layout package. A menu of various building structures (e.g., residential buildings, commercial buildings, roads, etc.) is to be provided to a designer, who can use a mouse to select and place an object in any location within a given tract of land that constitutes a neighborhood (a one-level hierarchy). Instance transformations can be limited to translations and rotations. Extend the previous exercise so that building structures can also be scaled along each dimension.
7
363
Hierarchical Modeling
8
9
10 11
12
13
electrical symbols within a circuit network. Use the symbol set from the previous exercise, and use only translations to place an instance of one of the menu shapes into the network. Once a component has been placed in the network, it is to be connected to other specified components with straight-line segments. Suppose you are designing a “creature creator” for a video game. The player should be given access to a list of various forms from a fixed set of body parts: heads, bodies, arms, and legs. A creature must have exactly one head and one body, but may have any number of arms or legs. Write a set of routines for editing creatures (a single creature in this instance would be a module) in the proposed game. Your routines should provide for the ability to replace one body part instance for another, insert additional arms or legs, and delete existing arms or legs. Given the coordinate extents of all displayed objects in a model, write a routine to delete any selected object. Write procedures to display and to delete a specified module in a model. Write a routine that will selectively take modules out of a model display or return them to the display. Write a procedure to highlight a selected module in some way. For example, the selected module could be displayed in a different color or it could be enclosed within a rectangular outline. Write a procedure to highlight a selected module in a model by causing the module’s scale to oscillate slightly.
IN MORE DEPTH 1
364
Recall that in previous exercises, you may have organized a subset of the objects in your scene into a group that behaves in a way that is easier mod-
2
eled in terms of relative positions and orientations within the group. Then, you used transformations from those local modeling coordinates to world coordinates to convert local transformations of the objects in the group into transformations in the world coordinate system. In this exercise, you will take this concept further, or implement it if you haven’t already. Consider the ways in which the objects in your application interact, and identify groups of objects that make sense to model as a single unit. Alternatively, if none of the objects in your scene exhibit this property, consider modeling single objects in terms of several polygons that change position, scale, or orientation relative to other polygons that make up the object model. This will require that you modify the representation of each object so that it is composed of more than a single polygon. You will learn more about three-dimensional object representations in later chapters, but this temporary solution will allow you to experiment with hierarchical modeling for now. If possible, try to build a hierarchy of two or more levels to get the full appreciation of the utility of organizing objects hierarchically. For each group of objects that comprise a group, identify the subcomponents and how they interact with each other. Use the hierarchical organization of the objects that you developed in the previous exercise to replicate the simple animation of your scene. The transformations that you developed earlier may need to be modified because you altered the z coordinates and orientations of the objects in your scene. Set up display lists to define each group and perform the appropriate transformations on each by using the matrix stack appropriately, as described in Section 4. Use the perspective projection viewing scheme that you developed to display the animation in the display window.
Computer Animation
1
Raster Methods for Computer Animation
2
Design of Animation Sequences
3
Traditional Animation Techniques
4
General Computer-Animation Functions
5
Computer-Animation Languages
6
Key-Frame Systems
7
Motion Specifications
8
Character Animation
9
Periodic Motions
10 OpenGL Animation Procedures 11
Summary
C
omputer-graphics methods are now commonly used to produce animations for a variety of applications, including entertainment (motion pictures and cartoons), advertising, scientific and engineering studies, and training and education. Although we tend to think of animation as implying object motion, the term computer animation generally refers to any time sequence of visual changes in a picture. In addition to changing object positions using translations or rotations, a computer-generated animation could display time variations in object size, color, transparency, or surface texture. Advertising animations often transition one object shape into another: for example, transforming a can of motor oil into an automobile engine. We can also generate computer animations by varying camera parameters, such as position, orientation, or focal length, and variations in lighting effects or other parameters and procedures associated with illumination and rendering can be used to produce computer animations. Another consideration in computer-generated animation is realism. Many applications require realistic displays. An accurate
From Chapter 1 2 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
365
Computer Animation
representation of the shape of a thunderstorm or other natural phenomena described with a numerical model is important for evaluating the reliability of the model. Similarly, simulators for training aircraft pilots and heavy-equipment operators must produce reasonably accurate representations of the environment. Entertainment and advertising applications, on the other hand, are sometimes more interested in visual effects. Thus, scenes may be displayed with exaggerated shapes and unrealistic motions and transformations. However, there are many entertainment and advertising applications that do require accurate representations for computer-generated scenes. Also, in some scientific and engineering studies, realism is not a goal. For example, physical quantities are often displayed with pseudo-colors or abstract shapes that change over time to help the researcher understand the nature of the physical process. Two basic methods for constructing a motion sequence are real-time animation and frame-by-frame animation. In a real-time computer-animation, each stage of the sequence is viewed as it is created. Thus the animation must be generated at a rate that is compatible with the constraints of the refresh rate. For a frame-by-frame animation, each frame of the motion is separately generated and stored. Later, the frames can be recorded on film, or they can be displayed consecutively on a video monitor in “real-time playback” mode. Simple animation displays are generally produced in real time, while more complex animations are constructed more slowly, frame by frame. However, some applications require real-time animation, regardless of the complexity of the animation. A flight-simulator animation, for example, is produced in real time because the video displays must be generated in immediate response to changes in the control settings. In such cases, special hardware and software systems are often developed to allow the complex display sequences to be developed quickly.
1 Raster Methods for Computer Animation Most of the time, we can create simple animation sequences in our programs using real-time methods. In general, though, we can produce an animation sequence on a raster-scan system one frame at a time, so that each completed frame could be saved in a file for later viewing. The animation can then be viewed by cycling through the completed frame sequence, or the frames could be transferred to film. If we want to generate an animation in real time, however, we need to produce the motion frames quickly enough so that a continuous motion sequence is displayed. For a complex scene, one frame of the animation could take most of the refresh cycle time to construct. In that case, objects generated first would be displayed for most of the frame refresh time, but objects generated toward the end of the refresh cycle would disappear almost as soon as they were displayed. For very complex animations, the frame construction time could be greater than the time to refresh the screen, which can lead to erratic motion and fractured frame displays. Because the screen display is generated from successively modified pixel values in the refresh buffer, we can take advantage of some of the characteristics of the raster screen-refresh process to produce motion sequences quickly.
Double Buffering One method for producing a real-time animation with a raster system is to employ two refresh buffers. Initially, we create a frame for the animation in one
366
Computer Animation
of the buffers. Then, while the screen is being refreshed from that buffer, we construct the next frame in the other buffer. When that frame is complete, we switch the roles of the two buffers so that the refresh routines use the second buffer during the process of creating the next frame in the first buffer. This alternating buffer process continues throughout the animation. Graphics libraries that permit such operations typically have one function for activating the doublebuffering routines and another function for interchanging the roles of the two buffers. When a call is made to switch two refresh buffers, the interchange could be performed at various times. The most straightforward implementation is to switch the two buffers at the end of the current refresh cycle, during the vertical retrace of the electron beam. If a program can complete the construction of a frame within 1 the time of a refresh cycle, say 60 of a second, each motion sequence is displayed in synchronization with the screen refresh rate. However, if the time to construct a frame is longer than the refresh time, the current frame is displayed for two or more refresh cycles while the next animation frame is being generated. For 1 example, if the screen refresh rate is 60 frames per second and it takes 50 of a second to construct an animation frame, each frame is displayed on the screen twice and the animation rate is only 30 frames each second. Similarly, if the frame 1 construction time is 25 of a second, the animation frame rate is reduced to 20 frames per second because each frame is displayed three times. Irregular animation frame rates can occur with double buffering when the frame construction time is very nearly equal to an integer multiple of the screen refresh time. As an example of this, if the screen refresh rate is 60 frames per second, then an erratic animation frame rate is possible when the frame construction 1 2 3 time is very close to 60 of a second, or 60 of a second, or 60 of a second, and so forth. Because of slight variations in the implementation time for the routines that generate the primitives and their attributes, some frames could take a little more time to construct and some a little less time. Thus, the animation frame rate can change abruptly and erratically. One way to compensate for this effect is to add a small time delay to the program. Another possibility is to alter the motion or scene description to shorten the frame construction time.
Generating Animations Using Raster Operations We can also generate real-time raster animations for limited applications using block transfers of a rectangular array of pixel values. This animation technique is often used in game-playing programs. A simple method for translating an object from one location to another in the xy plane is to transfer the group of pixel values that define the shape of the object to the new location. Two-dimensional rotations in multiples of 90º are also simple to perform, although we can rotate rectangular blocks of pixels through other angles using antialiasing procedures. For a rotation that is not a multiple of 90º, we need to estimate the percentage of area coverage for those pixels that overlap the rotated block. Sequences of raster operations can be executed to produce realtime animation for either two-dimensional or three-dimensional objects, so long as we restrict the animation to motions in the projection plane. Then no viewing or visible-surface algorithms need be invoked. We can also animate objects along two-dimensional motion paths using colortable transformations. Here we predefine the object at successive positions along the motion path and set the successive blocks of pixel values to color-table entries. The pixels at the first position of the object are set to a foreground color, and the pixels at the other object positions are set to the background color. The animation
367
Computer Animation
is then accomplished by changing the color-table values so that the object color at successive positions along the animation path becomes the foreground color as the preceding position is set to the background color (Figure 1).
FIGURE 1
Real-time raster color-table animation.
2 Design of Animation Sequences Constructing an animation sequence can be a complicated task, particularly when it involves a story line and multiple objects, each of which can move in a different way. A basic approach is to design such animation sequences using the following development stages:
• • • •
Storyboard layout Object definitions Key-frame specifications Generation of in-between frames
The storyboard is an outline of the action. It defines the motion sequence as a set of basic events that are to take place. Depending on the type of animation to be produced, the storyboard could consist of a set of rough sketches, along with a brief description of the movements, or it could just be a list of the basic ideas for the action. Originally, the set of motion sketches was attached to a large board that was used to present an overall view of the animation project. Hence, the name “storyboard.” An object definition is given for each participant in the action. Objects can be defined in terms of basic shapes, such as polygons or spline surfaces. In addition, a description is often given of the movements that are to be performed by each character or object in the story. A key frame is a detailed drawing of the scene at a certain time in the animation sequence. Within each key frame, each object (or character) is positioned according to the time for that frame. Some key frames are chosen at extreme positions in the action; others are spaced so that the time interval between key frames is not too great. More key frames are specified for intricate motions than for simple, slowly varying motions. Development of the key frames is generally the responsibility of the senior animators, and often a separate animator is assigned to each character in the animation. In-betweens are the intermediate frames between the key frames. The total number of frames, and hence the total number of in-betweens, needed for an animation is determined by the display media that is to be used. Film requires 24 frames per second, and graphics terminals are refreshed at the rate of 60 or more frames per second. Typically, time intervals for the motion are set up so that there are from three to five in-betweens for each pair of key frames. Depending on the speed specified for the motion, some key frames could be duplicated. As an example, a 1-minute film sequence with no duplication requires a total of 1,440 frames. If five in-betweens are required for each pair of key frames, then 288 key frames would need to be developed. There are several other tasks that may be required, depending on the application. These additional tasks include motion verification, editing, and the production and synchronization of a soundtrack. Many of the functions needed to produce general animations are now computer-generated. Figures 2 and 3 show examples of computer-generated frames for animation sequences.
368
Computer Animation
FIGURE 3
FIGURE 2
One frame from the award-winning computer-animated short film Luxo Jr. The film was designed using a key-frame animation system and cartoon animation techniques to provide lifelike actions of the lamps. Final images were rendered with multiple light sources and c 1986 Pixar.) procedural texturing techniques. (Courtesy of Pixar.
One frame from the short film Tin Toy, the first computer-animated film to win an Oscar. Designed using a key-frame animation system, the film also required extensive facial-expression modeling. Final images were rendered using procedural shading, self-shadowing techniques, motion blur, and c 1988 Pixar.) texture mapping. (Courtesy of Pixar.
3 Traditional Animation Techniques Film animators use a variety of methods for depicting and emphasizing motion sequences. These include object deformations, spacing between animation frames, motion anticipation and follow-through, and action focusing. One of the most important techniques for simulating acceleration effects, particularly for nonrigid objects, is squash and stretch. Figure 4 shows how this technique is used to emphasize the acceleration and deceleration of a bouncing ball. As the ball accelerates, it begins to stretch. When the ball hits the floor and stops, it is first compressed (squashed) and then stretched again as it accelerates and bounces upwards. Another technique used by film animators is timing, which refers to the spacing between motion frames. A slower moving object is represented with more closely spaced frames, and a faster moving object is displayed with fewer frames over the path of the motion. This effect is illustrated in Figure 5, where the position changes between frames increase as a bouncing ball moves faster. Object movements can also be emphasized by creating preliminary actions that indicate an anticipation of a coming motion. For example, a cartoon character
0
Stretch
Squash
FIGURE 4
FIGURE 5
A bouncing-ball illustration of the “squash and stretch” technique for emphasizing object acceleration.
The position changes between motion frames for a bouncing ball increase as the speed of the ball increases.
369
Computer Animation
might lean forward and rotate its body before starting to run; or a character might perform a “windup” before throwing a ball. Similarly, follow-through actions can be used to emphasize a previous motion. After throwing a ball, a character can continue the arm swing back to its body; or a hat can fly off a character that is stopped abruptly. An action also can be emphasized with staging, which refers to any method for focusing on an important part of a scene, such as a character hiding something.
4 General Computer-Animation Functions Many software packages have been developed either for general animation design or for performing specialized animation tasks. Typical animation functions include managing object motions, generating views of objects, producing camera motions, and the generation of in-between frames. Some animation packages, such as Wavefront for example, provide special functions for both the overall animation design and the processing of individual objects. Others are special-purpose packages for particular features of an animation, such as a system for generating in-between frames or a system for figure animation. A set of routines is often provided in a general animation package for storing and managing the object database. Object shapes and associated parameters are stored and updated in the database. Other object functions include those for generating the object motion and those for rendering the object surfaces. Movements can be generated according to specified constraints using twodimensional or three-dimensional transformations. Standard functions can then be applied to identify visible surfaces and apply the rendering algorithms. Another typical function set simulates camera movements. Standard camera motions are zooming, panning, and tilting. Finally, given the specification for the key frames, the in-betweens can be generated automatically.
5 Computer-Animation Languages We can develop routines to design and control animation sequences within a general-purpose programming language, such as C, C++, Lisp, or Fortran, but several specialized animation languages have been developed. These languages typically include a graphics editor, a key-frame generator, an in-between generator, and standard graphics routines. The graphics editor allows an animator to design and modify object shapes, using spline surfaces, constructive solidgeometry methods, or other representation schemes. An important task in an animation specification is scene description. This includes the positioning of objects and light sources, defining the photometric parameters (light-source intensities and surface illumination properties), and setting the camera parameters (position, orientation, and lens characteristics). Another standard function is action specification, which involves the layout of motion paths for the objects and camera. We need the usual graphics routines: viewing and perspective transformations, geometric transformations to generate object movements as a function of accelerations or kinematic path specifications, visible-surface identification, and the surface-rendering operations. Key-frame systems were originally designed as a separate set of animation routines for generating the in-betweens from the user-specified key frames. Now, these routines are often a component in a more general animation package. In the simplest case, each object in a scene is defined as a set of rigid bodies connected at the joints and with a limited number of degrees of freedom. As an example, the
370
Computer Animation Elbow Extension Shoulder Swivel Yaw
Arm Sweep Pitch
Roll
Base
FIGURE 6
Degrees of freedom for a stationary, single-armed robot.
single-armed robot in Figure 6 has 6 degrees of freedom, which are referred to as arm sweep, shoulder swivel, elbow extension, pitch, yaw, and roll. We can extend the number of degrees of freedom for this robot arm to 9 by allowing three-dimensional translations for the base (Figure 7). If we also allow base rotations, the robot arm can have a total of 12 degrees of freedom. The human body, in comparison, has more than 200 degrees of freedom. Parameterized systems allow object motion characteristics to be specified as part of the object definitions. The adjustable parameters control such object characteristics as degrees of freedom, motion limitations, and allowable shape changes. Scripting systems allow object specifications and animation sequences to be defined with a user-input script. From the script, a library of various objects and motions can be constructed.
FIGURE 7
Translational and rotational degrees of freedom for the base of the robot arm.
6 Key-Frame Systems A set of in-betweens can be generated from the specification of two (or more) key frames using a key-frame system. Motion paths can be given with a kinematic description as a set of spline curves, or the motions can be physically based by specifying the forces acting on the objects to be animated. For complex scenes, we can separate the frames into individual components or objects called cels (celluloid transparencies). This term developed from cartoonanimation techniques where the background and each character in a scene were placed on a separate transparency. Then, with the transparencies stacked in the order from background to foreground, they were photographed to obtain the completed frame. The specified animation paths are then used to obtain the next cel for each character, where the positions are interpolated from the key-frame times. With complex object transformations, the shapes of objects may change over time. Examples are clothes, facial features, magnified detail, evolving shapes, and exploding or disintegrating objects. For surfaces described with polygon meshes, these changes can result in significant changes in polygon shape such that the number of edges in a polygon could be different from one frame to the next. These changes are incorporated into the development of the in-between frames by adding or subtracting polygon edges according to the requirements of the defining key frames.
Morphing Transformation of object shapes from one form to another is termed morphing, which is a shortened form of “metamorphosing.” An animator can model morphing by transitioning polygon shapes through the in-betweens from one key frame to the next.
371
Computer Animation
Given two key frames, each with a different number of line segments specifying an object transformation, we can first adjust the object specification in one of the frames so that the number of polygon edges (or the number of polygon vertices) is the same for the two frames. This preprocessing step is illustrated in Figure 8. A straight-line segment in key frame k is transformed into two line segments in key frame k + 1. Because key frame k + 1 has an extra vertex, we add a vertex between vertices 1 and 2 in key frame k to balance the number of vertices (and edges) in the two key frames. Using linear interpolation to generate the in-betweens, we transition the added vertex in key frame k into vertex 3 along the straight-line path shown in Figure 9. An example of a triangle linearly expanding into a quadrilateral is given in Figure 10. We can state general preprocessing rules for equalizing key frames in terms of either the number of edges or the number of vertices to be added to a key frame. We first consider equalizing the edge count, where parameters L k and L k+1 denote the number of line segments in two consecutive frames. The maximum and minimum number of lines to be equalized can be determined as L max = max(L k , L k+1 ),
L min = min(L k , L k+1 )
(1)
Next we compute the following two quantities: Ne = L max mod L min L max Ns = int L min
1
(2)
1
1
1 3
2 2 Key Frame k
Key Frame k 1
3 Added point 2 Key Frame k
Halfway Frame
FIGURE 8
FIGURE 9
An edge with vertex positions 1 and 2 in key frame k evolves into two connected edges in key frame k + 1.
Linear interpolation for transforming a line segment in key frame k into two connected line segments in key frame k + 1.
added point
Key Frame k
Halfway Frame
FIGURE 10
Linear interpolation for transforming a triangle into a quadrilateral.
372
Key Frame k1
2 Key Frame k1
Computer Animation
The preprocessing steps for edge equalization are then accomplished with the following two procedures: 1. Divide Ne edges of keyframemin into Ns + 1 sections. 2. Divide the remaining lines of keyframemin into Ns sections.
As an example, if L k = 15 and L k+1 = 11, we would divide four lines of keyframek+1 into two sections each. The remaining lines of keyframek+1 are left intact. If we equalize the vertex count, we can use parameters Vk and Vk+1 to denote the number of vertices in the two consecutive key frames. In this case, we determine the maximum and minimum number of vertices as Vmax = max(Vk , Vk+1 ),
Vmin = min(Vk , Vk+1 )
(3)
Then we compute the following two values: Nls = (Vmax − 1) mod (Vmin − 1) Vmax − 1 Np = int Vmin − 1
(4)
These two quantities are then used to perform vertex equalization with the following procedures: 1. Add Np points to Nls line sections of keyframemin . 2. Add Np − 1 points to the remaining edges of keyframemin .
For the triangle-to-quadrilateral example, Vk = 3 and Vk+1 = 4. Both Nls and Np are 1, so we would add one point to one edge of keyframek . No points would be added to the remaining lines of keyframek .
Simulating Accelerations Curve-fitting techniques are often used to specify the animation paths between key frames. Given the vertex positions at the key frames, we can fit the positions with linear or nonlinear paths. Figure 11 illustrates a nonlinear fit of keyframe positions. To simulate accelerations, we can adjust the time spacing for the in-betweens. If the motion is to occur at constant speed (zero acceleration), we use equalinterval time spacing for the in-betweens. For instance, with n in-betweens and
Key Frame k
InBetween
Key Frame k1
Key Frame k2
FIGURE 11
Fitting key-frame vertex positions with nonlinear splines.
373
Computer Animation FIGURE 12
In-between positions for motion at constant speed.
t1
t
t2
⌬t
key-frame times of t1 and t2 (Figure 12), the time interval between the key frames is divided into n + 1 equal subintervals, yielding an in-between spacing of t2 − t1 (5) t = n+1 The time for the jth in-between is t B j = t1 + jt,
j = 1, 2, . . . , n
(6)
and this time value is used to calculate coordinate positions, color, and other physical parameters for that frame of the motion. Speed changes (nonzero accelerations) are usually necessary at some point in an animation film or cartoon, particularly at the beginning and end of a motion sequence. The startup and slowdown portions of an animation path are often modeled with spline or trigonometric functions, but parabolic and cubic time functions have been applied to acceleration modeling. Animation packages commonly furnish trigonometric functions for simulating accelerations. To model increasing speed (positive acceleration), we want the time spacing between frames to increase so that greater changes in position occur as the object moves faster. We can obtain an increasing size for the time interval with the function 1 − cos θ , 0 < θ < π/2 For n in-betweens, the time for the jth in-between would then be calculated as jπ (7) , j = 1, 2, . . . , n t B j = t1 + t 1 − cos 2(n + 1) where t is the time difference between the two key frames. Figure 13 gives a plot of the trigonometric acceleration function and the in-between spacing for n = 5. We can model decreasing speed (deceleration) using the function sin θ , with 0 < θ < π/2. The time position of an in-between is then determined as jπ t B j = t1 + t sin , j = 1, 2, . . . , n (8) 2(n + 1) 1.0 1 – cos u
0.5
cos u
0
1
2
3
4
5
tB1 j
tB5
t1
FIGURE 13
A trigonometric acceleration function and the corresponding in-between spacing for n = 5 and θ = j π/12 in Equation 7, producing increased coordinate changes as the object moves through each time interval.
374
t
Computer Animation
1.0 sinu
0.5
tB1 0
1
2
3
4
5
j
tB5
t1
t2
t
FIGURE 14
A trigonometric deceleration function and the corresponding in-between spacing for n = 5 and θ = j π/12 in Equation 8, producing decreased coordinate changes as the object moves through each time interval.
A plot of this function and the decreasing size of the time intervals is shown in Figure 14 for five in-betweens. Often, motions contain both speedups and slowdowns. We can model a combination of increasing–decreasing speed by first increasing the in-between time spacing and then decreasing this spacing. A function to accomplish these time changes is 1 (1 − cos θ ), 0 < θ < π/2 2 The time for the jth in-between is now calculated as 1 − cos[ jπ/(n + 1)] t B j = t1 + t (9) , j = 1, 2, . . . , n 2 with t denoting the time difference between the two key frames. Time intervals for a moving object first increase and then decrease, as shown in Figure 15.
1.0 1 cosu 2 0.5
0
1
2
3
4
5
j
0.5 cos u tB1
tB5
1.0 t1
t2
t
FIGURE 15
The trigonometric accelerate–decelerate function ( 1 − cos θ ) /2 and the corresponding in-between spacing for n = 5 in Equation 9.
375
Computer Animation
Processing the in-betweens is simplified by initially modeling “skeleton” (wire-frame) objects so that motion sequences can be interactively adjusted. After the animation sequence is completely defined, objects can be fully rendered.
7 Motion Specifications General methods for describing an animation sequence range from an explicit specification of the motion paths to a description of the interactions that produce the motions. Thus, we could define how an animation is to take place by giving the transformation parameters, the motion path parameters, the forces that are to act on objects, or the details of how objects interact to produce motion.
Direct Motion Specification The most straightforward method for defining an animation is direct motion specification of the geometric-transformation parameters. Here, we explicitly set the values for the rotation angles and translation vectors. Then the geometric transformation matrices are applied to transform coordinate positions. Alternatively, we could use an approximating equation involving these parameters to specify certain kinds of motions. We can approximate the path of a bouncing ball, for instance, with a damped, rectified, sine curve (Figure 16): y(x) = A| sin(ωx + θ0 )|e −kx
(10)
where A is the initial amplitude (height of the ball above the ground), ω is the angular frequency, θ0 is the phase angle, and k is the damping constant. This method for motion specification is particularly useful for simple userprogrammed animation sequences.
Goal-Directed Systems At the opposite extreme, we can specify the motions that are to take place in general terms that abstractly describe the actions in terms of the final results. In other words, an animation is specified in terms of the final state of the movements. These systems are referred to as goal-directed, since values for the motion parameters are determined from the goals of the animation. For example, we could specify that
y
FIGURE 16
Approximating the motion of a bouncing ball with a damped sine function (Eq. 10).
376
x
Computer Animation
we want an object to “walk” or to “run” to a particular destination; or we could state that we want an object to “pick up” some other specified object. The input directives are then interpreted in terms of component motions that will accomplish the described task. Human motions, for instance, can be defined as a hierarchical structure of submotions for the torso, limbs, and so forth. Thus, when a goal, such as “walk to the door” is given, the movements required of the torso and limbs to accomplish this action are calculated.
Kinematics and Dynamics We can also construct animation sequences using kinematic or dynamic descriptions. With a kinematic description, we specify the animation by giving motion parameters (position, velocity, and acceleration) without reference to causes or goals of the motion. For constant velocity (zero acceleration), we designate the motions of rigid bodies in a scene by giving an initial position and velocity vector for each object. For example, if a velocity is specified as (3, 0, −4) km per sec, then this vector gives the direction for the straight-line motion path and the speed (magnitude of velocity) is calculated as 5 km per sec. If we also specify accelerations (rate of change of velocity), we can generate speedups, slowdowns, and curved motion paths. Kinematic specification of a motion can also be given by simply describing the motion path. This is often accomplished using spline curves. An alternate approach is to use inverse kinematics. Here, we specify the initial and final positions of objects at specified times and the motion parameters are computed by the system. For example, assuming zero acceleration, we can determine the constant velocity that will accomplish the movement of an object from the initial position to the final position. This method is often used with complex objects by giving the positions and orientations of an end node of an object, such as a hand or a foot. The system then determines the motion parameters of other nodes to accomplish the desired motion. Dynamic descriptions, on the other hand, require the specification of the forces that produce the velocities and accelerations. The description of object behavior in terms of the influence of forces is generally referred to as physically based modeling. Examples of forces affecting object motion include electromagnetic, gravitational, frictional, and other mechanical forces. Object motions are obtained from the force equations describing physical laws, such as Newton’s laws of motion for gravitational and frictional processes, Euler or Navier-Stokes equations describing fluid flow, and Maxwell’s equations for electromagnetic forces. For example, the general form of Newton’s second law for a particle of mass m is F=
d (mv) dt
(11)
where F is the force vector and v is the velocity vector. If mass is constant, we solve the equation F = ma, with a representing the acceleration vector. Otherwise, mass is a function of time, as in relativistic motions or the motions of space vehicles that consume measurable amounts of fuel per unit time. We can also use inverse dynamics to obtain the forces, given the initial and final positions of objects and the type of motion required. Applications of physically based modeling include complex rigid-body systems and such nonrigid systems as cloth and plastic materials. Typically, numerical methods are used to obtain the motion parameters incrementally from the dynamical equations using initial conditions or boundary values.
377
Computer Animation
8 Character Animation Animation of simple objects is relatively straightforward. When we consider the animation of more complex figures such as humans or animals, however, it becomes much more difficult to create realistic animation. Consider the animation of walking or running human (or humanoid) characters. Based upon observations in their own lives of walking or running people, viewers will expect to see animated characters move in particular ways. If an animated character’s movement doesn’t match this expectation, the believability of the character may suffer. Thus, much of the work involved in character animation is focused on creating believable movements.
Articulated Figure Animation
FIGURE 17
A simple articulated figure with nine joints and twelve connecting links, not counting the oval head.
A basic technique for animating people, animals, insects, and other critters is to model them as articulated figures, which are hierarchical structures composed of a set of rigid links that are connected at rotary joints (Figure 17). In less formal terms, this just means that we model animate objects as moving stick figures, or simplified skeletons, that can later be wrapped with surfaces representing skin, hair, fur, feathers, clothes, or other outer coverings. The connecting points, or hinges, for an articulated figure are placed at the shoulders, hips, knees, and other skeletal joints, which travel along specified motion paths as the body moves. For example, when a motion is specified for an object, the shoulder automatically moves in a certain way and, as the shoulder moves, the arms move. Different types of movement, such as walking, running, or jumping, are defined and associated with particular motions for the joints and connecting links. A series of walking leg motions, for instance, might be defined as in Figure 18. The hip joint is translated forward along a horizontal line, while the connecting links perform a series of movements about the hip, knee, and angle joints. Starting with a straight leg [Figure 18(a)], the first motion is a knee bend as the hip moves forward [Figure 18(b)]. Then the leg swings forward, returns to the vertical position, and swings back, as shown in Figures 18(c), (d), and (e). The final motions are a wide swing back and a return to the straight
Hip Joint
(a)
(b)
(c)
(d)
(e)
FIGURE 18
Possible motions for a set of connected links representing a walking leg.
378
(f)
(g)
Computer Animation
vertical position, as in Figures 18(f) and (g). This motion cycle is repeated for the duration of the animation as the figure moves over a specified distance or time interval. As a figure moves, other movements are incorporated into the various joints. A sinusoidal motion, often with varying amplitude, can be applied to the hips so that they move about on the torso. Similarly, a rolling or rocking motion can be imparted to the shoulders, and the head can bob up and down. Both kinematic-motion descriptions and inverse kinematics are used in figure animations. Specifying the joint motions is generally an easier task, but inverse kinematics can be useful for producing simple motion over arbitrary terrain. For a complicated figure, inverse kinematics may not produce a unique animation sequence: Many different rotational motions may be possible for a given set of initial and final conditions. In such cases, a unique solution may be possible by adding more constraints, such as conservation of momentum, to the system.
Motion Capture An alternative to determining the motion of a character computationally is to digitally record the movement of a live actor and to base the movement of an animated character on that information. This technique, known as motion capture or mo-cap, can be used when the movement of the character is predetermined (as in a scripted scene). The animated character will perform the same series of movements as the live actor. The classic motion capture technique involves placing a set of markers at strategic positions on the actor’s body, such as the arms, legs, hands, feet, and joints. It is possible to place the markers directly on the actor, but more commonly they are affixed to a special skintight body suit worn by the actor. The actor is them filmed performing the scene. Image processing techniques are then used to identify the positions of the markers in each frame of the film, and their positions are translated to coordinates. These coordinates are used to determine the positioning of the body of the animated character. The movement of each marker from frame to frame in the film is tracked and used to control the corresponding movement of the animated character. To accurately determine the positions of the markers, the scene must be filmed by multiple cameras placed at fixed positions. The digitized marker data from each recording can then be used to triangulate the position of each marker in three dimensions. Typical motion capture systems will use up to two dozen cameras, but systems with several hundred cameras exist. Optical motion capture systems rely on the reflection of light from a marker into the camera. These can be relatively simple passive systems using photoreflective markers that reflect illumination from special lights placed near the cameras, or more advanced active systems in which the markers are powered and emit light. Active systems can be constructed so that the markers illuminate in a pattern or sequence, which allows each marker to be uniquely identified in each frame of the recording, simplifying the tracking process. Non-optical systems rely on the direct transmission of position information from the markers to a recording device. Some non-optical systems use inertial sensors that provide gyroscope-based position and orientation information. Others use magnetic sensors that measure changes in magnetic flux. A series of transmitters placed around the stage generate magnetic fields that induce current in the magnetic sensors; that information is then transmitted to receivers. Some motion capture systems record more than just the gross movements of the parts of the actor’s body. It is possible to record even the actor’s
379
Computer Animation
facial movements. Often called performance capture systems, these typically use a camera trained on the actor’s face and small light-emitting diode (LED) lights that illuminate the face. Small photoreflective markers attached to the face reflect the light from the LEDs and allow the camera to capture the small movements of the muscles of the face, which can then be used to create realistic facial animation on a computer-generated character.
9 Periodic Motions When we construct an animation with repeated motion patterns, such as a rotating object, we need to be sure that the motion is sampled frequently enough to represent the movements correctly. In other words, the motion must be synchronized with the frame-generation rate so that we display enough frames per cycle to show the true motion. Otherwise, the animation may be displayed incorrectly. A typical example of an undersampled periodic-motion display is the wagon wheel in a Western movie that appears to be turning in the wrong direction. Figure 19 illustrates one complete cycle in the rotation of a wagon wheel with one red spoke that makes 18 clockwise revolutions per second. If this motion is recorded on film at the standard motion-picture projection rate of 24 frames per second, then the first five frames depicting this motion would be as shown in 1 Figure 20. Because the wheel completes 34 of a turn every 24 of a second, only one animation frame is generated per cycle, and the wheel thus appears to be rotating in the opposite (counterclockwise) direction. In a computer-generated animation, we can control the sampling rate in a periodic motion by adjusting the motion parameters. For example, we can set the
(a) 0 sec.
(b) 1/ 72 sec.
(c) 1/ 36 sec.
(d) 1/ 24 sec.
(e) 1/ 18 sec.
FIGURE 19
Five positions for a red spoke during one cycle of a wheel motion that is turning at the rate of 18 revolutions per second.
Frame 0 0 sec.
Frame 1 1/ 24 sec.
Frame 2 2/ 24 sec.
Frame 3 3/ 24 sec.
Frame 4 4/ 24 sec.
FIGURE 20
The first five film frames of the rotating wheel in Figure 19 produced at the rate of 24 frames per second.
380
Computer Animation
angular increment for the motion of a rotating object so that multiple frames are generated in each revolution. Thus, a 3◦ increment for a rotation angle produces 120 motion steps during one revolution, and a 4◦ increment generates 90 steps. For faster motion, larger rotational steps could be used, so long as the number of samples per cycle is not too small and the motion is clearly displayed. When complex objects are to be animated, we also must take into account the effect that the frame construction time might have on the refresh rate, as discussed in Section 1. The motion of a complex object can be much slower than we want it to be if it takes too long to construct each frame of the animation. Another factor that we need to consider in the display of a repeated motion is the effect of round-off in the calculations for the motion parameters. We can reset parameter values periodically to prevent the accumulated error from producing erratic motions. For a continuous rotation, we could reset parameter values once every cycle (360º).
10 OpenGL Animation Procedures Raster operations and color-index assignment functions are available in the core library, and routines for changing color-table values are provided in GLUT. Other raster-animation operations are available only as GLUT routines because they depend on the window system in use. In addition, computer-animation features such as double buffering may not be included in some hardware systems. Double-buffering operations, if available, are activated using the following GLUT command: glutInitDisplayMode (GLUT_DOUBLE);
This provides two buffers, called the front buffer and the back buffer, that we can use alternately to refresh the screen display. While one buffer is acting as the refresh buffer for the current display window, the next frame of an animation can be constructed in the other buffer. We specify when the roles of the two buffers are to be interchanged using glutSwapBuffers ( );
To determine whether double-buffer operations are available on a system, we can issue the following query: glGetBooleanv (GL_DOUBLEBUFFER, status);
A value of GL TRUE is returned to array parameter status if both front and back buffers are available on a system. Otherwise, the returned value is GL FALSE. For a continuous animation, we can also use glutIdleFunc (animationFcn);
where parameter animationFcn can be assigned the name of a procedure that is to perform the operations for incrementing the animation parameters. This procedure is continuously executed whenever there are no display-window events that must be processed. To disable the glutIdleFunc, we set its argument to the value NULL or the value 0. An example animation program is given in the following code, which continuously rotates a regular hexagon in the xy plane about the z axis. The origin of
381
Computer Animation
three-dimensional screen coordinates is placed at the center of the display window so that the z axis passes through this center position. In procedure init, we use a display list to set up the description of the regular hexagon, whose center position is originally at the screen-coordinate position (150, 150) with a radius (distance from the polygon center to any vertex) of 100 pixels. In the display function, displayHex, we specify an initial 0◦ rotation about the z axis and invoke the glutSwapBuffers routine. To activate the rotation, we use procedure mouseFcn, which continually increments the rotation angle by 3◦ once we press the middle mouse button. The calculation of the incremented rotation angle is performed in procedure rotateHex, which is called by the glutIdleFunc routine in procedure mouseFcn. We stop the rotation by pressing the right mouse button, which causes the glutIdleFunc to be invoked with a NULL argument.
#include #include #include const double TWO_PI = 6.2831853; GLsizei winWidth = 500, winHeight = 500; GLuint regHex; static GLfloat rotTheta = 0.0;
// Initial display window size. // Define name for display list.
class scrPt { public: GLint x, y; }; static void init (void) { scrPt hexVertex; GLdouble hexTheta; GLint k; glClearColor (1.0, 1.0, 1.0, 0.0); /* Set up a display list for a red regular hexagon. * Vertices for the hexagon are six equally spaced * points around the circumference of a circle. */ regHex = glGenLists (1); glNewList (regHex, GL_COMPILE); glColor3f (1.0, 0.0, 0.0); glBegin (GL_POLYGON); for (k = 0; k < 6; k++) { hexTheta = TWO_PI * k / 6; hexVertex.x = 150 + 100 * cos (hexTheta); hexVertex.y = 150 + 100 * sin (hexTheta); glVertex2i (hexVertex.x, hexVertex.y); } glEnd ( ); glEndList ( ); }
382
Computer Animation
void displayHex (void) { glClear (GL_COLOR_BUFFER_BIT); glPushMatrix ( ); glRotatef (rotTheta, 0.0, 0.0, 1.0); glCallList (regHex); glPopMatrix ( ); glutSwapBuffers ( ); glFlush ( ); } void rotateHex (void) { rotTheta += 3.0; if (rotTheta > 360.0) rotTheta -= 360.0; glutPostRedisplay ( ); } void winReshapeFcn (GLint newWidth, GLint newHeight) { glViewport (0, 0, (GLsizei) newWidth, (GLsizei) newHeight); glMatrixMode (GL_PROJECTION); glLoadIdentity ( ); gluOrtho2D (-320.0, 320.0, -320.0, 320.0); glMatrixMode (GL_MODELVIEW); glLoadIdentity ( ); glClear (GL_COLOR_BUFFER_BIT); } void mouseFcn (GLint button, GLint action, GLint x, GLint y) { switch (button) { case GLUT_MIDDLE_BUTTON: // Start the rotation. if (action == GLUT_DOWN) glutIdleFunc (rotateHex); break; case GLUT_RIGHT_BUTTON: // Stop the rotation. if (action == GLUT_DOWN) glutIdleFunc (NULL); break; default: break; } }
383
Computer Animation
void main (int argc, char** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_DOUBLE | GLUT_RGB); glutInitWindowPosition (150, 150); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Animation Example"); init ( ); glutDisplayFunc (displayHex); glutReshapeFunc (winReshapeFcn); glutMouseFunc (mouseFcn); glutMainLoop ( ); }
11 Summary An animation sequence can be constructed frame by frame, or it can be generated in real time. When separate frames of an animation are constructed and stored, the frames can later be transferred to film or displayed in rapid succession on a video monitor. Animations involving complex scenes and motions are commonly produced one frame at a time, while simpler motion sequences are displayed in real time. On a raster system, double-buffering methods can be used to facilitate motion displays. One buffer is used to refresh the screen, while a second buffer is being loaded with the screen values for the next frame of the motion. Then the roles of the two buffers are interchanged, usually at the end of a refresh cycle. Another raster method for displaying an animation is to perform motion sequences using block transfers of pixel values. Translations are accomplished by a simple move of a rectangular block of pixel colors from one frame-buffer position to another. And rotations in 90◦ increments can be performed with combinations of translations and row-column interchanges within the pixel array. Color-table methods can be used for simple raster animations by storing an image of an object at multiple locations in the frame buffer, using different colortable values. One image is stored in the foreground color, and the copies of the image at the other locations are assigned a background color. By rapidly interchanging the foreground and background color values stored in the color table, we can display the object at various screen positions. Several developmental stages can be used to produce an animation, starting with the storyboard, object definitions, and specification of key frames. The storyboard is an outline of the action, and the key frames define the details of the object motions for selected positions in the animation. Once the key frames have been established, in-between frames are generated to construct a smooth motion from one key frame to the next. A computer animation can involve motion specifications for the “camera,” as well as motion paths for the objects and characters involved in the animation. Various techniques have been developed for simulating and emphasizing motion effects. Squash and stretch effects are standard methods for stressing
384
Computer Animation
accelerations, and the timing between motion frames can be varied to produce speed variations. Other methods include a preliminary windup motion, a followthrough at the end of an action, and staging methods that focus on an important action in a scene. Trigonometric functions are typically used to generate the time spacing for in-between frames when the motions involve accelerations. Animations can be generated with special-purpose software or with a generalpurpose graphics package. Systems that are available for automated computer animation include key-frame systems, parameterized systems, and scripting systems. Many animations include morphing effects, in which one object shape is transformed into another. These effects are accomplished by using the in-between frames to transition the defining points and lines in one object into the points and lines of the other object. Motions in an animation can be described with direct motion specifications or they can be goal-directed. Thus, an animation can be defined in terms of translation and rotation parameters, or motions can be described with equations or with kinematic or dynamic parameters. Kinematic motion descriptions specify positions, velocities, and accelerations; dynamic motion descriptions are given in terms of the forces acting on the objects in a scene. Articulated figures are often used to model the motions of people and animals. Rigid links, connected at rotary joints, are defined in a hierarchical structure. When a motion is imparted to an object, each subpart is programmed to move in a particular way in response to the overall motion. Motion capture techniques provide an alternative to computed character motion. They can be used to produce more realistic movement for articulated characters. The sampling rate for periodic motions should produce enough frames per cycle to display the animation correctly. Otherwise, erratic or misleading motions may result. In addition to the raster ops and color-table methods, a few functions are available in the OpenGL Utility Toolkit (GLUT) for developing animation programs. These provide routines for double-buffering operations and for incrementing motion parameters during idle-processing intervals. In Table 1, we list the GLUT functions for producing animations with OpenGL programs.
T A B L E
1
Summary of OpenGL Animation Functions Function
Description
glutInitDisplayMode (GLUT DOUBLE)
Activates double-buffering operations.
glutSwapBuffers
Interchanges front and back refresh buffers.
glGetBooleanv (GL DOUBLEBUFFER, status)
Queries a system to determine whether double buffering is available.
glutIdleFunc
Specifies a function for incrementing animation parameters.
385
Computer Animation
REFERENCES Computer-animation systems are discussed in Thalmann and Thalmann (1985), Watt and Watt (1992), O’Rourke (1998), Maestri (1999 and 2002), Kerlow (2000), Gooch and Gooch (2001), Parent (2002), Pocock and Rosebush (2002), and Strothotte and Schlechtweg (2002). Traditional animation techniques are explored in Lasseter (1987), Thomas, Johnston, and Johnston (1995), and Thomas and Lefkon (1997). Morphing methods are discussed in Hughes (1992), Kent, Carlson, and Parent (1992), Sederberg and Greenwood (1992), and Gomes, et al. (1999). Facial animation is discussed in Parke and Waters (2008). Motion capture techniques are discussed in Menache (2000). Various algorithms for animation applications are available in Glassner (1990), Arvo (1991), Kirk (1992), Gascuel (1993), Snyder, et al. (1993), and Paeth (1995). And a discussion of animation techniques in OpenGL is given in Woo, et al. (1999).
12
13
EXERCISES 1
Design a storyboard layout and accompanying key frames for an animation of a simple stick figure, as in Figure 17. 2 Write a program to generate the in-betweens for the key frames specified in Exercise 1 using linear interpolation. 3 Expand the animation sequence in Exercise 1 to include two or more moving objects. 4 Write a program to generate the in-betweens for the key frames in Exercise 3 using linear interpolation. 5 Write a morphing program to transform any given polygon into another specified polygon, using five in-betweens. 6 Write a morphing program to transform a sphere into a specified polyhedron, using five in-betweens. 7 Set up an animation specification involving accelerations and implementing Eq. 7. 8 Set up an animation specification involving both accelerations and decelerations, implementing the in-between spacing calculations given in Equations 7 and 8. 9 Set up an animation specification implementing the acceleration–deceleration calculations of Equation 9. 10 Write a program to simulate the linear, twodimensional motion of a filled circle inside a given rectangular area. The circle is to be given an initial position and velocity, and the circle is to rebound from the walls with the angle of reflection equal to the angle of incidence. 11 Convert the program of the previous exercise into a two-player ball and paddle game by replacing
386
14
15 16 17
18
19
20
21
two opposite sides of the rectangle with short line segments that can be moved back and forth along each of the rectangle edges. Interactive movement of each line segment simulates a paddle that can be positioned to prevent the bouncing ball from escaping that side of the rectangle. Each time the circle escapes from the interior of the rectangle the score of the player assigned to the opposite side is increased. Initial input parameters include circle position, direction, and speed. The game scores can be displayed in one corner of the display window. Modify the ball and paddle game in the previous exercise to vary the speed of the bouncing ball. After each successful block by a player, the speed of the ball is incremented by a small amount. Modify the game in the previous exercise to include two balls, each initialized to the same speed but different positions and opposite directions. Modify the two-dimensional bouncing ball inside a rectangle to a three-dimensional motion of a sphere bouncing around inside a parallelepiped. Interactive viewing parameters can be set to view the motion from different directions. Write a program to implement the simulation of a bouncing ball using Eq. 10. Expand the program in the previous exercise to include squash and stretch effects. Write a program to implement the motion of a bouncing ball using dynamics. The motion of the ball is to be governed by a downward gravitational force and a ground-plane friction force. Initially, the ball is to be projected into space with a given velocity vector. Write a program to implement dynamic motion specifications. Specify a scene with two or more objects, initial motion parameters, and specified forces. Then generate the animation from the solution of the force equations. (For example, the objects could be the earth, moon, and sun with attractive gravitational forces that are proportional to mass and inversely proportional to distance squared.) Modify the rotating hexagon program in Section 10 to allow a user to interactively choose a three-dimensional object from a list of menu options to be rotated about the y axis. Use a perspective projection to display the object. Modify the program in the previous exercise so that the rotation about the y axis is an elliptical path in the xz plane. Modify the program in the previous exercise to allow interactive variation of the rotation speed.
Computer Animation
IN MORE DEPTH 1
The goal of this chapter’s exercises is to increase the sophistication of the animation of your developing application. You may do this by modifying or adding to the existing animation that you have developed, or by using the techniques that you have learned so far to design a different animation of some portion of your application. In either case, draw a storyboard of your planned animation and outline the methods that you will use to implement it. Identify key frames that represent critical points in the animation. Consider methods by which you can generate the in-between frames that will move the objects in the scene realistically from one key frame to the next. Sketch a timeline that will help you determine the rate at which each object or group of objects should
2
move in between each key frame. Try to have some objects exhibit nonzero accelerations in your animation. If possible, include an instance of linearly interpolated morphing from one polygon or set of polygons to another. If your application contains objects that behave like a physical dynamical system, try to incorporate dynamics in the form of physical models to produce the animation. Finally, make an attempt to include periodic motion in the animation if appropriate. Given the storyboard and specification that you designed in the previous exercise, implement the animation in an OpenGL program with just a single buffer as previously done, and then with a double buffer. Note any differences in the quality of the animation between the two cases.
387
This page intentionally left blank
Three-Dimensional Object Representations
1
Polyhedra
2
OpenGL Polyhedron Functions
3
Curved Surfaces
4
Quadric Surfaces
5
Superquadrics
6
OpenGL Quadric-Surface and Cubic-Surface Functions
7
Summary
G
raphics scenes can contain many different kinds of objects and material surfaces: trees, flowers, clouds, rocks, water, bricks, wood paneling, rubber, paper, marble, steel, glass, plastic, and cloth, just to mention a few. So it may not be surprising that there is no single method that we can use to describe objects that will include all the characteristics of these different materials. Polygon and quadric surfaces provide precise descriptions for simple Euclidean objects such as polyhedrons and ellipsoids. They are examples of boundary representations (B-reps), which describe a three-dimensional object as a set of surfaces that separate the object interior from the environment. In this chapter, we consider the features of these types of representation schemes and how they are used in computer-graphics applications.
From Chapter 13 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
389
Three-Dimensional Object Representations
1 Polyhedra The most commonly used boundary representation for a three-dimensional graphics object is a set of surface polygons that enclose the object interior. Many graphics systems store all object descriptions as sets of surface polygons. This simplifies and speeds up the surface rendering and display of objects because all surfaces are described with linear equations. For this reason, polygon descriptions are often referred to as standard graphics objects. In some cases, a polygonal representation is the only one available, but many packages also allow object surfaces to be described with other schemes, such as spline surfaces, which are usually converted to polygonal representations for processing through the viewing pipeline. To describe an object as a set of polygon facets, we give the list of vertex coordinates for each polygon section over the object surface. The vertex coordinates and edge information for the surface sections are then stored in tables along with other information, such as the surface normal vector for each polygon. Some graphics packages provide routines for generating a polygon-surface mesh as a set of triangles or quadrilaterals. This allows us to describe a large section of an object’s bounding surface, or even the entire surface, with a single command. And some packages also provide routines for displaying common shapes, such as a cube, sphere, or cylinder, represented with polygon surfaces. Sophisticated graphics systems use fast hardware-implemented polygon renderers that have the capability for displaying a million or more shaded polygons (usually triangles) per second, including the application of surface texture and special lighting effects.
2 OpenGL Polyhedron Functions We have two methods for specifying polygon surfaces in an OpenGL program. Using the polygon primitives we can generate a variety of polyhedron shapes and surface meshes. In addition, we can use GLUT functions to display the five regular polyhedra.
OpenGL Polygon Fill-Area Functions A set of polygon patches for a section of an object surface, or a complete description for a polyhedron, can be given using the OpenGL primitive constants GL POLYGON, GL TRIANGLES, GL TRIANGLE STRIP, GL TRIANGLE FAN, GL QUADS, and GL QUAD STRIP. For example, we could tessellate the lateral (axial) surface of a cylinder using a quadrilateral strip. Similarly, all faces of a parallelogram can be described with a set of rectangles, and all faces of a triangular pyramid could be specified using a set of connected triangular surfaces.
GLUT Regular Polyhedron Functions Some standard shapes—the five regular polyhedra—are predefined by routines in the GLUT library. These polyhedra, also called the Platonic solids, are distinguished by the fact that all the faces of any regular polyhedron are identical regular polygons. Thus, all edges in a regular polyhedron are equal, all edge angles are equal, and all angles between faces are equal. Polyhedra are named according to the number of faces in each of the solids, and the five regular polyhedra are the regular tetrahedron (or triangular pyramid, with 4 faces), the regular hexahedron (or cube, with 6 faces), the regular octahedron (8 faces), the regular dodecahedron (12 faces), and the regular icosahedron (20 faces).
390
Three-Dimensional Object Representations
Ten functions are provided in GLUT for generating these solids: five of the functions produce wire-frame objects, and five display the polyhedra facets as shaded fill areas. The displayed surface characteristics for the fill areas are determined by the material properties and the lighting conditions that we set for a scene. Each regular polyhedron is described in modeling coordinates, so that each is centered at the world-coordinate origin. We obtain the four-sided, regular triangular pyramid using either of these two functions: glutWireTetrahedron ( );
or glutSolidTetrahedron ( );
This polyhedron is generated with its center at the world-coordinate origin and with √ a radius (distance from the center of the tetrahedron to any vertex) equal to 3. The six-sided regular hexahedron (cube) is displayed with glutWireCube (edgeLength);
or glutSolidCube (edgeLength);
Parameter edgeLength can be assigned any positive, double-precision floatingpoint value, and the cube is centered on the coordinate origin. To display the eight-sided regular octahedron, we invoke either of the following commands: glutWireOctahedron ( );
or glutSolidOctahedron ( );
This polyhedron has equilateral triangular faces, and the radius (distance from the center of the octahedron at the coordinate origin to any vertex) is 1.0. The twelve-sided regular dodecahedron, centered at the world-coordinate origin, is generated with glutWireDodecahedron ( );
or glutSolidDodecahedron ( );
Each face of this polyhedron is a pentagon. The following two functions generate the twenty-sided regular icosahedron: glutWireIcosahedron ( );
or glutSolidIcosahedron ( );
Default radius (distance from the polyhedron center at the coordinate origin to any vertex) for the icosahedron is 1.0, and each face is an equilateral triangle.
391
Three-Dimensional Object Representations
FIGURE 1
A perspective view of the five GLUT polyhedra, scaled and positioned within a display window by procedure displayWirePolyhedra.
Example GLUT Polyhedron Program Using the GLUT functions for the Platonic solids, the following program generates a transformed, wire-frame perspective display of these polyhedrons. All five solids are positioned within one display window (shown in Figure 1).
#include GLsizei winWidth = 500, winHeight = 500; void init (void) { glClearColor (1.0, 1.0, 1.0, 0.0); } void displayWirePolyhedra (void) { glClear (GL_COLOR_BUFFER_BIT); glColor3f (0.0, 0.0, 1.0);
// Initial display-window size.
// White display window.
// Clear display window. // Set line color to blue.
/* Set viewing transformation. */ gluLookAt (5.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0); /* Scale cube and display as wire-frame parallelepiped. */ glScalef (1.5, 2.0, 1.0); glutWireCube (1.0);
392
Three-Dimensional Object Representations
/* Scale, translate, and display wire-frame dodecahedron. */ glScalef (0.8, 0.5, 0.8); glTranslatef (-6.0, -5.0, 0.0); glutWireDodecahedron ( ); /* Translate and display wire-frame tetrahedron. */ glTranslatef (8.6, 8.6, 2.0); glutWireTetrahedron ( ); /* Translate and display wire-frame octahedron. */ glTranslatef (-3.0, -1.0, 0.0); glutWireOctahedron ( ); /* Scale, translate, and display wire-frame icosahedron. */ glScalef (0.8, 0.8, 1.0); glTranslatef (4.3, -2.0, 0.5); glutWireIcosahedron ( ); glFlush ( ); } void winReshapeFcn (GLint newWidth, GLint newHeight) { glViewport (0, 0, newWidth, newHeight); glMatrixMode (GL_PROJECTION); glFrustum (-1.0, 1.0, -1.0, 1.0, 2.0, 20.0); glMatrixMode (GL_MODELVIEW); glClear (GL_COLOR_BUFFER_BIT); } void main (int argc, char** argv) { glutInit (&argc, argv); glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (100, 100); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Wire-Frame Polyhedra"); init ( ); glutDisplayFunc (displayWirePolyhedra); glutReshapeFunc (winReshapeFcn); glutMainLoop ( ); }
3 Curved Surfaces Equations for objects with curved boundaries can be expressed in either a parametric or a nonparametric form. The various objects that are often useful in graphics
393
Three-Dimensional Object Representations
applications include quadric surfaces, superquadrics, polynomial and exponential functions, and spline surfaces. These input object descriptions typically are tessellated to produce polygon-mesh approximations for the surfaces.
4 Quadric Surfaces A frequently used class of objects are the quadric surfaces, which are described with second-degree equations (quadratics). They include spheres, ellipsoids, tori, paraboloids, and hyperboloids. Quadric surfaces, particularly spheres and ellipsoids, are common elements of graphics scenes, and routines for generating these surfaces are often available in graphics packages. Also, quadric surfaces can be produced with rational spline representations. z axis
Sphere P = (x, y, z) r f
In Cartesian coordinates, a spherical surface with radius r centered on the coordinate origin is defined as the set of points (x, y, z) that satisfy the equation
y axis
x 2 + y2 + z2 = r 2
u x axis FIGURE 2
Parametric coordinate position ( r , θ , φ) on the surface of a sphere with radius r .
(1)
We can also describe the spherical surface in parametric form, using latitude and longitude angles (Figure 2): x = r cos φ cos θ,
− π/2 ≤ φ ≤ π/2
y = r cos φ sin θ,
−π ≤θ ≤π
(2)
z = r sin φ
z axis
f
r
P
y axis u x axis FIGURE 3
Spherical coordinate parameters ( r , θ , φ) , using colatitude for angle φ.
The parametric representation in Equations 2 provides a symmetric range for the angular parameters θ and φ. Alternatively, we could write the parametric equations using standard spherical coordinates, where angle φ is specified as the colatitude (Figure 3). Then, φ is defined over the range 0 ≤ φ ≤ π , and θ is often taken in the range 0 ≤ θ ≤ 2π . We could also set up the representation using parameters u and v defined over the range from 0 to 1 by substituting φ = πu and θ = 2πv.
Ellipsoid z
An ellipsoidal surface can be described as an extension of a spherical surface where the radii in three mutually perpendicular directions can have different values (Figure 4). The Cartesian representation for points over the surface of an ellipsoid centered on the origin is
rz ry rx
y
x FIGURE 4
An ellipsoid with radii r x , r y , and r z , centered on the coordinate origin.
x rx
2
y ry
2 +
2 z =1 rz
(3)
And a parametric representation for the ellipsoid in terms of the latitude angle φ and the longitude angle θ in Figure 2 is x = r x cos φ cos θ ,
− π/2 ≤ φ ≤ π/2
y = r y cos φ sin θ,
−π ≤θ ≤π
z = r z sin φ
394
+
(4)
Three-Dimensional Object Representations
z axis (0, y, z) r
f
y axis
0
y axis
0 u
raxial
(x, y, z) FIGURE 5
A torus, centered on the coordinate origin, with a circular cross-section and with the torus axis along the z axis.
x axis Side View
Top View
Torus A doughnut-shaped object is called a torus or anchor ring. Most often it is described as the surface generated by rotating a circle or an ellipse about a coplanar axis line that is external to the conic. The defining parameters for a torus are then the distance of the conic center from the rotation axis and the dimensions of the conic. A torus generated by the rotation of a circle with radius r in the yz plane about the z axis is shown in Figure 5. With the circle center on the y axis, the axial radius, raxial , of the resulting torus is equal to the distance along the y axis to the circle center from the z axis (the rotation axis); and the cross-sectional radius of the torus is the radius of the generating circle. The equation for the cross-sectional circle shown in the side view of Figure 5 is (y − raxial )2 + z2 = r 2 Rotating this circle about the z axis produces the torus whose surface positions are described with the Cartesian equation 2 x 2 + y2 − raxial + z2 = r 2 (5) The corresponding parametric equations for the torus with a circular cross-section are x = (raxial + r cos φ) cos θ,
−π ≤φ ≤π
y = (raxial + r cos φ) sin θ ,
−π ≤θ ≤π
(6)
z = r sin φ We could also generate a torus by rotating an ellipse, instead of a circle, about the z axis. For an ellipse in the yz plane with semimajor and semiminor axes denoted as r y and r z , we can write the ellipse equation as
y − raxial ry
2 +
2 z =1 rz
where raxial is the distance along the y axis from the rotation z axis to the ellipse center. This generates a torus that can be described with the Cartesian equation z 2 x 2 + y2 − raxial + =1 (7) ry rz
395
Three-Dimensional Object Representations
The corresponding parametric representation for the torus with an elliptical crosssection is x = (raxial + r y cos φ) cos θ ,
−π ≤φ ≤π
y = (raxial + r y cos φ) sin θ,
−π ≤θ ≤π
(8)
z = r z sin φ Other variations on the preceding torus equations are possible. For example, we could generate a torus surface by rotating either a circle or an ellipse along an elliptical path around the rotation axis.
5 Superquadrics The class of objects called Superquadrics is a generalization of the quadric representations. Superquadrics are formed by incorporating additional parameters into the quadric equations to provide increased flexibility for adjusting object shapes. One additional parameter is added to curve equations, and two additional parameters are used in surface equations.
Superellipse We obtain a Cartesian representation for a superellipse from the corresponding equation for an ellipse by allowing the exponent on the x and y terms to be variable. One way to do this is to write the Cartesian superellipse equation in the form
x rx
2/s
+
y ry
2/s =1
(9)
where parameter s can be assigned any real value. When s = 1, we have an ordinary ellipse. Corresponding parametric equations for the superellipse of Equation 9 can be expressed as x = r x coss θ, −π ≤θ ≤π (10) s y = r y sin θ Figure 6 illustrates superellipse shapes that can be generated using various values for parameter s.
FIGURE 6
Superellipses plotted with values for parameter s ranging from 0.5 to 3.0 and with r x = r y .
396
Three-Dimensional Object Representations
Superellipsoid A Cartesian representation for a superellipsoid is obtained from the equation for an ellipsoid by incorporating two exponent parameters as follows: 2/s2 2/s2 s2 /s1 2/s1 x y z + + =1 (11) rx ry rz For s1 = s2 = 1, we have an ordinary ellipsoid. We can then write the corresponding parametric representation for the superellipsoid of Equation 11 as x = r x coss1 φ coss2 θ, y = r y coss1 φ sins2 θ,
− π/2 ≤ φ ≤ π/2 −π ≤θ ≤π
(12)
z = r z sins1 φ Color Plate 10 illustrates superellipsoid shapes that can be generated using various values for parameters s1 and s2 . These and other superquadric shapes can be combined to create more complex structures, such as depictions of furniture, threaded bolts, and other hardware.
6 OpenGL Quadric-Surface and Cubic-Surface Functions A sphere and a number of other three-dimensional quadric-surface objects can be displayed using functions that are included in the OpenGL Utility Toolkit (GLUT) and in the OpenGL Utility (GLU). In addition, GLUT has one function for displaying a teapot shape that is defined with bicubic surface patches. The GLUT functions, which are easy to incorporate into an application program, have two versions each. One version of each function displays a wire-frame surface, and the other displays the surface as a rendered set of fill-area polygon patches. With the GLUT functions, we can display a sphere, cone, torus, or the teapot. Quadricsurface GLU functions are a little more involved to set up, but they provide a few more options. With the GLU functions, we can display a sphere, cylinder, tapered cylinder, cone, flat circular ring (or hollow disk), and a section of a circular ring (or disk).
GLUT Quadric-Surface Functions We generate a GLUT sphere with either of these two functions: glutWireSphere (r, nLongitudes, nLatitudes);
or glutSolidSphere (r, nLongitudes, nLatitudes);
where the sphere radius is determined by the double-precision floating-point number assigned to parameter r. Parameters nLongitudes and nLatitudes are used to select the integer number of longitude and latitude lines that will be used to approximate the spherical surface as a quadrilateral mesh. Edges of the quadrilateral surface patches are straight-line approximations of the longitude and latitude lines. The sphere is defined in modeling coordinates, centered at the world-coordinate origin with its polar axis along the z axis.
397
Three-Dimensional Object Representations
A GLUT cone is obtained with glutWireCone (rBase, height, nLongitudes, nLatitudes);
or glutSolidCone (rBase, height, nLongitudes, nLatitudes);
We set double-precision, floating-point values for the radius of the cone base and for the cone height using parameters rbase and height, respectively. As with a GLUT sphere, parameters nLongitudes and nLatitudes are assigned integer values that specify the number of orthogonal surface lines for the quadrilateral mesh approximation. A cone longitude line is a straight-line segment along the cone surface from the apex to the base that lies in a plane containing the cone axis. Each latitude line is displayed as a set of straight-line segments around the circumference of a circle on the cone surface that is parallel to the cone base and that lies in a plane perpendicular to the cone axis. The cone is described in modeling coordinates, with the center of the base at the world-coordinate origin and with the cone axis along the world z axis. Wire-frame or surface-shaded displays of a torus with a circular cross-section are produced with glutWireTorus (rCrossSection, rAxial, nConcentrics, nRadialSlices);
or glutSolidTorus (rCrossSection, rAxial, nConcentrics, nRadialSlices);
The torus obtained with these GLUT routines can be described as the surface generated by rotating a circle with radius rCrossSection about the coplanar z axis, where the distance of the circle center from the z axis is rAxial (see Section 4). We select a size for the torus using double-precision, floating-point values for these radii in the GLUT functions. And the size of the quadrilaterals in the approximating surface mesh for the torus is set with integer values for parameters nConcentrics and nRadialSlices. Parameter nConcentrics specifies the number of concentric circles (with center on the z axis) to be used on the torus surface, and parameter nRadialSlices specifies the number of radial slices through the torus surface. These two parameters designate the number of orthogonal grid lines over the torus surface, with the grid lines displayed as straight-line segments (the boundaries of the quadrilaterals) between intersection positions. The displayed torus is centered on the world-coordinate origin, with its axis along the world z axis.
GLUT Cubic-Surface Teapot Function During the early development of computer-graphics methods, sets of polygonmesh data tables were constructed for the description of several three-dimensional objects that could be used to test rendering techniques. These objects included the surfaces of a Volkswagen automobile and a teapot, developed at the University of Utah. The data set for the Utah teapot, as constructed by Martin Newell in 1975, contains 306 vertices, defining 32 bicubic B´ezier surface patches. Since determining the surface coordinates for a complex object is time-consuming,
398
Three-Dimensional Object Representations
these data sets, particularly the teapot surface mesh, became widely used. We can display the teapot, as a mesh of over 1,000 bicubic surface patches, using either of the following two GLUT functions:
glutWireTeapot (size);
or glutSolidTeapot (size);
The teapot surface is generated using OpenGL B´ezier curve functions . Parameter size sets the double-precision floating-point value for the maximum radius of the teapot bowl. The teapot is centered on the world-coordinate origin coordinate origin with its vertical axis along the y axis.
GLU Quadric-Surface Functions To generate a quadric surface using GLU functions, we need to assign a name to the quadric, activate the GLU quadric renderer, and designate values for the surface parameters. In addition, we can set other parameter values to control the appearance of a GLU quadric surface. The following statements illustrate the basic sequence of calls for displaying a wire-frame sphere centered on the world-coordinate origin: GLUquadricObj *sphere1; sphere1 = gluNewQuadric ( ); gluQuadricDrawStyle (sphere1, GLU_LINE); gluSphere (sphere1, r, nLongitudes, nLatitudes);
A name for the quadric object is defined in the first statement, and, for this example, we have chosen the name sphere1. This name is then used in other GLU functions to reference this particular quadric surface. Next, the quadric renderer is activated with the gluNewQuadric function, and then the display mode GLU LINE is selected for sphere1 with the gluQuadricDrawStyle command. Thus, the sphere is displayed in a wire-frame form with a straight-line segment between each pair of surface vertices. Parameter r is assigned a double-precision value for the sphere radius, and the sphere surface is divided into a set of polygon facets by the equally spaced longitude and latitude lines. We specify the integer number of longitude lines and latitude lines as values for parameters nLongitudes and nLatitudes. Three other display modes are available for GLU quadric surfaces. Using the symbolic constant GLU POINT in the gluQuadricDrawStyle, we display a quadric surface as a point plot. For the sphere, a point is displayed at each surface vertex formed by the intersection of a longitude line and a latitude line. Another option is the symbolic constant GLU SILHOUETTE. This produces a wire-frame display without the shared edges between two coplanar polygon facets. And with the symbolic constant GLU FILL, we display the polygon patches as shaded fill areas.
399
Three-Dimensional Object Representations
We generate displays of the other GLU quadric-surface primitives using the same basic sequence of commands. To produce a view of a cone, cylinder, or tapered cylinder, we replace the gluSphere function with gluCylinder (quadricName, rBase, rTop, height, nLongitudes, nLatitudes);
The base of this object is in the xy plane (z = 0), and the axis is the z axis. We assign a double-precision radius value to the base of this quadric surface using parameter rBase, and we assign a radius to the top of the quadric surface using parameter rTop. If rTop = 0.0, we get a cone; if rTop = rBase, we obtain a cylinder. Otherwise, a tapered cylinder is displayed. A double-precision height value is assigned to parameter height, and the surface is divided into a number of equally spaced vertical and horizontal lines as determined by the integer values assigned to parameters nLongitudes and nLatitudes. A flat, circular ring or solid disk is displayed in the xy plane (z = 0) and centered on the world-coordinate origin with gluDisk (ringName, rInner, rOuter, nRadii, nRings);
We set double-precision values for an inner radius and an outer radius with parameters rInner and rOuter. If rInner = 0, the disk is solid. Otherwise, it is displayed with a concentric hole in the center of the disk. The disk surface is divided into a set of facets with integer parameters nRadii and nRings, which specify the number of radial slices to be used in the tessellation and the number of concentric circular rings, respectively. Orientation for the ring is defined with respect to the z axis, with the front of the ring facing in the +z direction and the back of the ring facing in the −z direction. We can specify a section of a circular ring with the following GLU function: gluPartialDisk (ringName, rInner, rOuter, nRadii, nRings, startAngle, sweepAngle);
The double-precision parameter startAngle designates an angular position in degrees in the xy plane measured clockwise from the positive y axis. Similarly, parameter sweepAngle denotes an angular distance in degrees from the startAngle position. Thus, a section of a flat, circular disk is displayed from angular position startAngle to startAngle + sweepAngle. For example, if startAngle = 0.0 and sweepAngle = 90.0, then the section of the disk lying in the first quadrant of the xy plane is displayed. Allocated memory for any GLU quadric surface can be reclaimed and the surface eliminated with gluDeleteQuadric (quadricName);
Also, we can define the front and back directions for any quadric surface with the following orientation function: gluQuadricOrientation (quadricName, normalVectorDirection);
Parameter normalVectorDirection is assigned either GLU OUTSIDE or GLU INSIDE to indicate a direction for the surface normal vectors, where “outside” indicates the front-face direction and “inside” indicates the
400
Three-Dimensional Object Representations
back-face direction. The default value is GLU OUTSIDE. For the flat, circular ring, the default front-face direction is in the direction of the positive z axis (“above” the disk). Another option is the generation of surface-normal vectors, as follows: gluQuadricNormals (quadricName, generationMode);
A symbolic constant is assigned to parameter generationMode to indicate how surface-normal vectors should be generated. The default is GLU NONE, which means that no surface normals are to be generated and no lighting conditions typically are applied to the quadric surface. For flat surface shading (a constant color value for each surface), we use the symbolic constant GLU FLAT. This produces one surface normal for each polygon facet. When other lighting and shading conditions are to be applied, we use the constant GLU SMOOTH, which generates a normal vector for each surface vertex position. Other options for GLU quadric surfaces include setting surface-texture parameters. In addition, we can designate a function that is to be invoked if an error occurs during the generation of a quadric surface: gluQuadricCallback (quadricName, GLU_ERROR, function);
Example Program Using GLUT and GLU Quadric-Surface Functions Three quadric-surface objects (sphere, cone, and cylinder) are displayed in a wireframe representation by the following example program. We set the view-up direction as the positive z axis so that the axis for all displayed objects is vertical. The three objects are positioned at different locations within a single display window, as shown in Figure 7.
FIGURE 7
Display of a GLUT sphere, GLUT cone, and GLU cylinder, positioned within a display window by procedure wireQuadSurfs.
401
Three-Dimensional Object Representations #include GLsizei winWidth = 500, winHeight = 500; void init (void) { glClearColor (1.0, 1.0, 1.0, 0.0); } void wireQuadSurfs (void) { glClear (GL_COLOR_BUFFER_BIT); glColor3f (0.0, 0.0, 1.0);
// Initial display-window size.
// Set display-window color.
// Clear display window. // Set line-color to blue.
/* Set viewing parameters with world z axis as view-up direction. gluLookAt (2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0); /* Position and display GLUT wire-frame sphere. glPushMatrix ( ); glTranslatef (1.0, 1.0, 0.0); glutWireSphere (0.75, 8, 6); glPopMatrix ( ); /* Position and display GLUT wire-frame cone. glPushMatrix ( ); glTranslatef (1.0, -0.5, 0.5); glutWireCone (0.7, 2.0, 7, 6); glPopMatrix ( );
*/
*/
/* Position and display GLU wire-frame cylinder. */ GLUquadricObj *cylinder; // Set name for GLU quadric object. glPushMatrix ( ); glTranslatef (0.0, 1.2, 0.8); cylinder = gluNewQuadric ( ); gluQuadricDrawStyle (cylinder, GLU_LINE); gluCylinder (cylinder, 0.6, 0.6, 1.5, 6, 4); glPopMatrix ( ); glFlush ( ); } void winReshapeFcn (GLint newWidth, GLint newHeight) { glViewport (0, 0, newWidth, newHeight); glMatrixMode (GL_PROJECTION); glOrtho (-2.0, 2.0, -2.0, 2.0, 0.0, 5.0); glMatrixMode (GL_MODELVIEW); glClear (GL_COLOR_BUFFER_BIT); } void main (int argc, char** argv) { glutInit (&argc, argv);
402
*/
Three-Dimensional Object Representations
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); glutInitWindowPosition (100, 100); glutInitWindowSize (winWidth, winHeight); glutCreateWindow ("Wire-Frame Quadric Surfaces"); init ( ); glutDisplayFunc (wireQuadSurfs); glutReshapeFunc (winReshapeFcn); glutMainLoop ( ); }
7 Summary Many representations have been developed for modeling the wide variety of objects and materials that we might want to display in a computer-graphics scene. In most cases, a three-dimensional object representation is rendered by a software package as a standard graphics object, whose surfaces are displayed as a polygon mesh. Functions for displaying some common quadric surfaces, such as spheres and ellipsoids, are often available in graphics packages. Extensions of the quadrics, called superquadrics, provide additional parameters for creating a wider variety of object shapes. Polygon surface facets for a standard graphics object can be specified in OpenGL using the polygon, triangle, or quadrilateral primitive functions. Also, GLUT routines are available for displaying the five regular polyhedra. Spheres, cones, and other quadric-surface objects can be displayed with GLUT and GLU functions, and a GLUT routine is provided for the generation of the cubic-surface Utah teapot. Tables 1 and 2 summarize the OpenGL polyhedron and quadric functions discussed in this chapter.
T A B L E
1
Summary of OpenGL Polyhedron Functions Function
Description
glutWireTetrahedron
Displays a wire-frame tetrahedron.
glutSolidTetrahedron
Displays a surface-shaded tetrahedron.
glutWireCube
Displays a wire-frame cube.
glutSolidCube
Displays a surface-shaded cube.
glutWireOctahedron
Displays a wire-frame octahedron.
glutSolidOctahedron
Displays a surface-shaded octahedron.
glutWireDodecahedron
Displays a wire-frame dodecahedron.
glutSolidDodecahedron
Displays a surface-shaded dodecahedron.
glutWireIcosahedron
Displays a wire-frame icosahedron.
glutSolidIcosahedron
Displays a surface-shaded icosahedron.
403
Three-Dimensional Object Representations
T A B L E
2
Summary of OpenGL Quadric-Surface and Cubic-Surface Functions Function
Description
glutWireSphere
Displays a wire-frame GLUT sphere.
glutSolidSphere
Displays a surface-shaded GLUT sphere.
glutWireCone
Displays a wire-frame GLUT cone.
glutSolidCone
Displays a surface-shaded GLUT cone.
glutWireTorus
Displays a wire-frame GLUT torus with a circular cross-section.
glutSolidTorus
Displays a surface-shaded, circular cross-section GLUT torus.
glutWireTeapot
Displays a wire-frame GLUT teapot.
glutSolidTeapot
Displays a surface-shaded GLUT teapot.
gluNewQuadric
Activates the GLU quadric renderer for an object name that has been defined with the declaration: GLUquadricObj *nameOfObject;
gluQuadricDrawStyle
Selects a display mode for a predefined GLU object name.
gluSphere
Displays a GLU sphere.
gluCylinder
Displays a GLU cone, cylinder, or tapered cylinder.
gluDisk
Displays a GLU flat, circular ring or solid disk.
gluPartialDisk
Displays a section of a GLU flat, circular ring or solid disk.
gluDeleteQuadric
Eliminates a GLU quadric object.
gluQuadricOrientation Defines inside and outside orientations for a GLU quadric object. gluQuadricNormals
Specifies how surface-normal vectors should be generated for a GLU quadric object.
gluQuadricCallback
Specifies a callback error function for a GLU quadric object.
REFERENCES A detailed discussion of superquadrics is contained in Barr (1981). Programming techniques for various representations can be found in Glassner (1990), Arvo (1991), Kirk (1992), Heckbert (1994), and Paeth (1995). Kilgard (1996) discusses the GLUT functions for displaying polyhedrons, quadric surfaces, and the Utah teapot. And a complete listing of the OpenGL functions in the core library and in GLU is presented in Shreiner (2000).
EXERCISES 1
404
Set up an algorithm for converting a given sphere to a polygon-mesh representation.
2 3 4 5
6
Set up an algorithm for converting a given ellipsoid to a polygon-mesh representation. Set up an algorithm for converting a given cylinder to a polygon-mesh representation. Set up an algorithm for converting a given superellipsoid to a polygon-mesh representation. Set up an algorithm for converting a given torus with a circular cross section to a polygon mesh representation. Set up an algorithm for converting a given torus with an ellipsoidal cross section to a polygon mesh representation.
Three-Dimensional Object Representations
7
Write a program that displays a sphere in the display window and allows the user to switch between solid and wire-frame views of the sphere, translate the sphere along any dimension, rotate the sphere around its center in any direction, and change the size of the sphere (i.e., its radius). 8 Write a program that displays a torus in the display window and allows the user to switch between solid and wire-frame views of the torus, translate the torus along any dimension, rotate the torus around its center in any direction, and change the sizes of the torus’ defining properties (i.e., the radius of its cross section ellipse and its axial radius). 9 Write a program that displays a sphere of fixed radius at world coordinate origin and allows the user to adjust the number of longitude and latitude lines used to approximate the sphere’s surface as a quadrilateral mesh. The user should also be able to switch between solid and wireframe views of the sphere. Vary the resolution of the mesh approximation and observe the visual appearance of the sphere in both solid and wire-frame mode. 10 Write a program that displays a cylinder of fixed height and radius at world coordinate origin and allows the user to adjust the number of longitude and latitude lines used to approximate the cylinder’s surface as a quadrilateral mesh. The user should also be able to switch between solid and wire-frame views of the cylinder. Vary the resolution of the mesh approximation and observe the visual appearance of the cylinder in both solid and wire-frame mode.
IN MORE DEPTH 1
The material presented in this chapter will allow you to increase the complexity of the representations of the objects in your application by constructing more complex three-dimensional
2
shapes. Choose the most appropriate threedimensional shapes introduced in this chapter to replace the polygonal approximations of the objects in your application with which you have been working so far. Be sure to include at least a few curved-surface objects, using the GLU and GLUT functions for generating spheres, ellipsoids, and other quadric and cubic surfaces. Use the shaded fill areas to render the objects, not wire-frame views. Choose a reasonable setting for the number of latitude and longitude lines used to generate the polygon mesh approximation to these curved-surface objects. Write routines to call the appropriate functions and display the shapes in the appropriate positions and orientations in the scene. Use techniques in hierarchical modeling to generate objects that are better approximated as a group of these more-primitive shapes if appropriate. In this exercise, you will experiment with varying the resolution of the polygon meshes that serve as the approximations to the curved-surface objects specified in the previous exercise. Choose a minimum number of latitude and longitude lines at which the representation of the objects is minimally acceptable as far as visual appearance goes. Using this as a baseline, render the scene from the previous exercise several times, each time increasing the number of latitude and longitude lines that define the mesh approximations of the objects by some fixed amount. For each setting of resolution, record the amount of time that it takes to render the scene using shaded fill areas to render the objects. Continue doing this until the resolution produces little or no noticeable difference in approximation quality. Then, make a plot of rendering time as a function of resolution parameters (number of latitude and longitude lines) and discuss the properties of the plot. Is there an ideal setting for this scene that balances visual quality with performance?
405
This page intentionally left blank
Three-Dimensional Object Representations Color Plate
Color Plate 10
Superellipsoids plotted with values for parameters s 1 and s 2 ranging from 0.0 to 2.5 and with r x = r y = r z .
From Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
407
This page intentionally left blank
Spline Representations
1
Interpolation and Approximation Splines
2
Parametric Continuity Conditions
3
Geometric Continuity Conditions
4
Spline Specifications
5
Spline Surfaces
6
Trimming Spline Surfaces
7
Cubic-Spline Interpolation Methods
8
Bézier Spline Curves
9
Bézier Surfaces
10 B-Spline Curves 11 B-Spline Surfaces 12 Beta-Splines 13 Rational Splines 14 Conversion Between Spline Representations 15 Displaying Spline Curves and Surfaces 16 OpenGL Approximation-Spline Functions 17 Summary
S
plines are another example of boundary representation modeling techniques. In drafting terminology, a spline is a flexible strip used to produce a smooth curve through a designated set of points. Several small weights are distributed along the length of the strip to hold it in position on the drafting table as the curve is drawn. The term spline curve originally referred to a curve drawn in this manner. We can mathematically describe such a curve with a piecewise cubic polynomial function whose first and second derivatives are continuous across the various curve sections. In computer graphics, the term spline curve now refers to any composite curve formed with polynomial sections satisfying any specified continuity conditions at the boundary of the pieces. A spline surface can be described with two sets of spline curves. There are several different kinds of spline specifications that are used in computer-graphics applications. Each individual specification simply refers to a particular type of polynomial with certain prescribed boundary conditions.
From Chapter 14 of Computer Graphics with OpenGL®, Fourth Edition, Donald Hearn, M. Pauline Baker, Warren R. Carithers. Copyright © 2011 by Pearson Education, Inc. Published by Pearson Prentice Hall. All rights reserved.
409
Spline Representations
Splines are used to design curve and surface shapes, to digitize drawings, and to specify animation paths for the objects or the camera position in a scene. Typical computer-aided design (CAD) applications for splines include the design of automobile bodies, aircraft and spacecraft surfaces, ship hulls, and home appliances.
1 Interpolation and Approximation Splines
FIGURE 1
A set of six control points interpolated with piecewise continuous polynomial sections.
FIGURE 2
A set of six control points approximated with piecewise continuous polynomial sections.
FIGURE 3
An approximation spline surface for a CAD application in automotive design. Surface contours are plotted with polynomial curve sections, and the surface control points are connected with straight-line segments. (Courtesy of Evans & Sutherland.)
410
We specify a spline curve by giving a set of coordinate positions, called control points, which indicate the general shape of the curve. These coordinate positions are then fitted with piecewise-continuous, parametric polynomial functions in one of two ways. When polynomial sections are fitted so that all the control points are connected, as in Figure 1, the resulting curve is said to interpolate the set of control points. On the other hand, when the generated polynomial curve is plotted so that some, or all, of the control points are not on the curve path, the resulting curve is said to approximate the set of control points (Figure 2). Similar methods are used to construct interpolation or approximation spline surfaces. Interpolation methods are commonly used to digitize drawings or to specify animation paths. Approximation methods are used primarily as design tools to create object shapes. Figure 3 shows the screen display of an approximation spline surface for a design application. Straight lines connect the control-point positions above the surface. A spline curve or surface is defined, modified, and manipulated with operations on the control points. By interactively selecting spatial positions for the control points, a designer can set up an initial shape. After the polynomial fit is displayed for a given set of control points, the designer can then reposition some of or all the control points to restructure the shape of the object. Geometric transformations (translation, rotation, and scaling) are applied to the object by transforming the control points. In addition, CAD packages sometimes insert extra control points to aid a designer in adjusting the object shapes. A set of control points forms a boundary for a region of space that is called the convex hull. One way to envision the shape of a convex hull for a two-dimensional curve is to imagine a rubber band stretched around the positions of the control
Spline Representations p2 p2
p3
p0
p3
p0
p1 FIGURE
p1
4
Convex-hull shapes (dashed lines) for two sets of control points in the x y plane.
(b)
(a)
p2
p2 p0
p3
p3
p0
p1 p1 (a)
(b)
FIGURE 5
Control-graph shapes (dashed lines) for two sets of control points in the x y plane.
points so that each control point is either on the perimeter of this boundary or inside it (Figure 4). Thus, the convex hull for a two-dimensional spline curve is a convex polygon. In three-dimensional space, the convex hull for a set of spline control points forms a convex polyhedron. Convex hulls provide a measure for the deviation of a curve or surface from the region of space near the control points. In most cases, a spline is bounded by its convex hull, which ensures that the object shape follows the control points without erratic oscillations. Also, the convex hull provides a measure of the coordinate extents of a designed curve or surface, so it is useful in clipping and viewing routines. A polyline connecting the sequence of control points for an approximation spline curve is usually displayed to remind a designer of the control-point positions and ordering. This set of connected line segments is called the control graph for the curve. Often the control graph is alluded to as the “control polygon” or the “characteristic polygon,” even though the control graph is a polyline and not a polygon. Figure 5 shows the shape of the control graph for the control-point sequences in Figure 4. For a spline surface, two sets of polyline control-point connectors form the edges for the polygon facets in a quadrilateral mesh for the surface control graph, as in Figure 3.
2 Parametric Continuity Conditions To ensure a smooth transition from one section of a piecewise parametric spline to the next, we can impose various continuity conditions at the connection points. If each section of a spline curve is described with a set of parametric coordinate
411
Spline Representations
functions of the form x = x(u),
(a)
(b)
(c) FIGURE 6
Piecewise construction of a curve by joining two curve segments using different orders of continuity: (a) zero-order continuity only, (b) first-order continuity, and (c) second-order continuity.
y = y(u),
z = z(u),
u1 ≤ u ≤ u2
(1)
we set parametric continuity by matching the parametric derivatives of adjoining curve sections at their common boundary. Zero-order parametric continuity, represented as C 0 continuity, means simply that the curves meet. That is, the values of x, y, and z evaluated at u2 for the first curve section are equal, respectively, to the values of x, y, and z evaluated at u1 for the next curve section. First-order parametric continuity, referred to as C 1 continuity, means that the first parametric derivatives (tangent lines) of the coordinate functions in Equation 1 for two successive curve sections are equal at their joining point. Second-order parametric continuity, or C 2 continuity, means that both the first and second parametric derivatives of the two curve sections are the same at the intersection. Higher-order parametric continuity conditions are defined similarly. Figure 6 shows examples of C 0, C 1, and C 2 continuity. With second-order parametric continuity, the rates of change of the tangent vectors of connecting sections are equal at their intersection. Thus, the tangent line transitions smoothly from one section of the curve to the next [Figure 6(c)]. With first-order parametric continuity, however, the rate of change of tangent vectors for the two sections can be quite different [Figure 6(b)], so that the general shapes of the two adjacent sections can change abruptly. First-order parametric continuity is often sufficient for digitizing drawings and some design applications, while second-order parametric continuity is useful for setting up animation paths for camera motion and for many precision CAD requirements. A camera traveling along the curve path in Figure 6(b) with equal steps in parameter u would experience an abrupt change in acceleration at the boundary of the two sections, producing a discontinuity in the motion sequence. But if the camera was traveling along the path in Figure 6(c), the frame sequence for the motion would smoothly transition across the boundary.
3 Geometric Continuity Conditions Another method for joining two successive curve sections is to specify conditions for geometric continuity. In this case, we require only that the parametric derivatives of the two sections are proportional to each other at their common boundary, instead of requiring equality. Zero-order geometric continuity, described as G 0 continuity, is the same as zero-order parametric continuity. That is, two successive curve sections must have the same coordinate position at the boundary point. First-order geometric continuity, or G 1 continuity, means that the parametric first derivatives are proportional at the intersection of two successive sections. If we denote the parametric position on the curve as P(u), the direction of the tangent vector P (u), but not necessarily its magnitude, will be the same for two successive curve sections at their common point under G 1 continuity. Second-order geometric continuity, or G 2 continuity, means that both the first and second parametric derivatives of the two curve sections are proportional at their boundary. Under G 2 continuity, curvatures of two curve sections will match at the joining position. A curve generated with geometric continuity conditions is similar to one generated with parametric continuity, but with slight differences in curve shape. Figure 7 provides a comparison of geometric and parametric continuity. With geometric continuity, the curve is pulled toward the section with the greater magnitude for the tangent vector.
412
Spline Representations C3 C2 p0
p2
p1
C1
p0 C1
(a)
p1
p2
(b)
FIGURE 7
Three control points fitted with two curve sections joined with (a) parametric continuity and (b) geometric continuity, where the tangent vector of curve C 3 at point P1 has a greater magnitude than the tangent vector of curve C 1 at P1 .
4 Spline Specifications There are three equivalent methods for specifying a particular spline representation, given the degree of the polynomial and the control-point positions: (1) We can state the set of boundary conditions that are imposed on the spline; or (2) we can state the matrix that characterizes the spline; or (3) we can state the set of blending functions (or basis functions) that determine how specified constraints on the curve are combined to calculate positions along the curve path. To illustrate these three equivalent specifications, suppose we have the following parametric cubic polynomial representation for the x coordinate along the path of a spline-curve section: x(u) = a x u3 + b x u2 + c x u + dx ,
0≤u≤1
(2)
Boundary conditions for this curve can be set for the endpoint coordinate positions x(0) and x(1) and for the parametric first derivatives at the endpoints: x (0) and x (1). These four boundary conditions are sufficient to determine the values of the four coefficients a x , b x , c x , and dx . From the boundary conditions, we can obtain the matrix that characterizes this spline curve by first rewriting Equation 2 as the following matrix product: ⎡ ⎤ ax ⎢ bx ⎥ 3 2 ⎥ x(u) = [u u u 1] ⎢ ⎣ cx ⎦ dx = U·C (3) where U is the row matrix of powers of parameter u and C is the coefficient column matrix. Using Equation 3, we can write the boundary conditions in matrix form and solve for the coefficient matrix C as C = Mspline · Mgeom
(4)
where Mgeom is a four-element column matrix containing the geometric constraint values (boundary conditions) on the spline, and Mspline is the 4 by 4 matrix that transforms the geometric constraint values to the polynomial coefficients and provides a characterization for the spline curve. Matrix Mgeom contains controlpoint coordinate values and other geometric constraints that have been specified. Thus, we can substitute the matrix representation for C into Equation 3 to obtain x(u) = U · Mspline · Mgeom
(5)
The matrix Mspline , characterizing a spline representation, sometimes called the basis matrix, is particularly useful for transforming from one spline representation to another.
413
Spline Representations
Finally, we can expand Equation 5 to obtain a polynomial representation for coordinate x in terms of the geometric constraint parameters gk , such as the control-point coordinates and slope of the curve at the control points: x(u) =
3
gk · BFk (u)
(6)
k=0
The polynomials BFk (u), for k = 0, 1, 2, 3, are called blending functions or basis functions because they combine (blend) the geometric constraint values to obtain coordinate positions along the curve. In subsequent sections, we explore the features of the various spline curves and surfaces that are useful in computergraphics applications, including the specification of their matrix and blendingfunction representations.
5 Spline Surfaces The usual procedure for defining a spline surface is to specify two sets of spline curves using a mesh of control points over some region of space. If we denote the control-point positions as pku ,kv , then any point position on the spline surface can be computed as the product of the spline-curve blending functions as follows: P(u, v) = pku ,kv BFku (u)BFkv (v) (7) ku ,kv
Surface parameters u and v often vary over the range from 0 to 1, but this range depends on the type of spline curves we use. One method for designating the three-dimensional control-point positions is to select height values above a twodimensional mesh of positions on a ground plane.
6 Trimming Spline Surfaces In CAD applications, a surface design may require some features that are not implemented just by adjusting control-point positions. For instance, a section of a spline surface may need to be snipped off to fit two design pieces together, or a hole may be needed so that a conduit can pass through the surface. For these applications, graphics packages often provide functions to generate trimming curves that can be used to take out sections of a spline surface, as illustrated in Figure 8. Trimming curves are typically defined in parametric uv surface coordinates, and often they must be specified as closed curves. Trimming Curves
FIGURE 8
Modification of a surface section using trimming curves.
414
Spline Representations
7 Cubic-Spline Interpolation Methods This class of splines is most often used to set up paths for object motions or to provide a representation for an existing object or drawing, but interpolation splines are also used sometimes to design object shapes. Cubic polynomials offer a reasonable compromise between flexibility and speed of computation. Compared to higher-order polynomials, cubic splines require less calculations and storage space, and they are more stable. Compared to quadratic polynomials and straight-line segments, cubic splines are more flexible for modeling object shapes. Given a set of control points, cubic interpolation splines are obtained by fitting the input points with a piecewise cubic polynomial curve that passes through every control point. Suppose that we have n + 1 control points specified with coordinates pk = (xk , yk , zk ),
k = 0, 1, 2, . . . , n
A cubic interpolation fit of these points is illustrated in Figure 9. We can describe the parametric cubic polynomial that is to be fitted between each pair of control points with the following set of equations: x(u) = a x u3 + b x u2 + c x u + dx y(u) = a y u3 + b y u2 + c y u + d y ,
(0 ≤ u ≤ 1)
(8)
z(u) = a z u + b z u + c z u + dz 3
2
For each of these three equations, we need to determine the values for the four coefficients a , b, c, and d in the polynomial representation for each of the n curve sections between the n + 1 control points. We do this by setting enough boundary conditions at the control-point positions between curve sections so that we can obtain numerical values for all the coefficients. In the following sections, we discuss common methods for setting the boundary conditions for cubic interpolation splines.
Natural Cubic Splines One of the first spline curves to be developed for graphics applications is the natural cubic spline. This interpolation curve is a mathematical representation of the original drafting spline. We formulate a natural cubic spline by requiring that two adjacent curve sections have the same first and second parametric derivatives at their common boundary. Thus, natural cubic splines have C 2 continuity. If we have n + 1 control points, as in Figure 9, then we have n curve sections with a total of 4n polynomial coefficients to be determined. At each of the n − 1 interior control points, we have four boundary conditions: The two curve sections on either side of a control point must have the same first and second parametric derivatives at that control point, and each curve must pass through that control point. This gives us 4n − 4 equations to be satisfied by the 4n polynomial coefficients. We obtain an additional equation from the first control point p0 , the position of the beginning of the curve, and another condition from control point pn , which must be the last point on the curve. However, we still need p1
pk
p0 p2 …
pk⫹1 …
FIGURE 9
pn
A piecewise continuous cubic-spline interpolation of n + 1 control points.
415
Spline Representations
two more conditions to be able to determine values for all the coefficients. One method for obtaining the two additional conditions is to set the second derivatives at p0 and pn equal to 0. Another approach is to add two extra control points (called dummy points), one at each end of the original control-point sequence. That is, we add a control point labeled p−1 at the beginning of the curve and a control point labeled pn+1 at the end. Then all the original control points are interior points, and we have the necessary 4n boundary conditions. Although natural cubic splines are a mathematical model for the drafting spline, they have a major disadvantage. If the position of any of the control points is altered, the entire curve is affected. Thus, natural cubic splines allow for no “local control,” so that we cannot restructure part of the curve without specifying an entirely new set of control points. For this reason, other representations for a cubic-spline interpolation have been developed.
Hermite Interpolation pk
P(u) (x(u), y(u), z(u))
pk 1 FIGURE 10
Parametric point function P( u ) for a Hermite curve section between control points pk and pk +1 .
A Hermite spline (named after the French mathematician Charles Hermite) is an interpolating piecewise cubic polynomial with a specified tangent at each control point. Unlike the natural cubic splines, Hermite splines can be adjusted locally because each curve section depends only on its endpoint constraints. If P(u) represents a parametric cubic point function for the curve section between control points pk and pk+1 , as shown in Figure 10, then the boundary conditions that define this Hermite curve section are P(0) = pk P(1) = pk+1 P (0) = Dpk P (1) = Dpk+1
(9)
with Dpk and Dpk+1 specifying the values for the parametric derivatives (slope of the curve) at control points pk and pk+1 , respectively. We can write the vector equivalent of Equations 8 for this Hermite curve section as P(u) = a u3 + b u2 + c u + d, 0≤u≤1 (10) where the x component of P(u) is x(u) = a x u3 + b x u2 + c x u + dx , and similarly for the y and z components. The matrix equivalent of Equation 10 is ⎡ ⎤ a ⎢ ⎥ b ⎥ P(u) = [u3 u2 u 1] · ⎢ (11) ⎣c⎦ d and the derivative of the point function can be expressed as ⎡ ⎤ a ⎢b⎥ 2 ⎥ P (u) = [3u 2u 1 0] · ⎢ (12) ⎣c⎦ d Substituting endpoint values 0 and 1 for parameter u into the preceding two equations, we can express the Hermite boundary conditions 9 in the matrix form ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ pk 0 0 0 1 a ⎢ p ⎥ ⎢ ⎢b⎥ ⎢ k+1 ⎥ ⎢ 1 1 1 1 ⎥ ⎥·⎢ ⎥ (13) ⎢ ⎥= ⎣ Dpk ⎦ ⎣ 0 0 1 0 ⎦ ⎣ c ⎦ 3 2 1 0 d Dp k+1
416
Spline Representations
Solving this equation for the polynomial coefficients, we get ⎡ ⎤ ⎡ ⎤−1 ⎡ p ⎤ k a 0 0 0 1 ⎥ ⎢ p ⎢b⎥ ⎢1 1 1 1⎥ k+1 ⎥ ⎢ ⎥=⎢ ⎥ ·⎢ ⎥ ⎢ ⎣c ⎦ ⎣0 0 1 0⎦ ⎣ Dpk ⎦ d 3 2 1 0 Dp k+1
⎡
2 −2 ⎢ −3 3 =⎢ ⎣ 0 0 1 0
1 −2 1 0
⎡
pk ⎢ p ⎢ k+1 = M H ·⎢ ⎣ Dpk
⎤
⎤ ⎡ p k 1 ⎢ p −1 ⎥ k+1 ⎥·⎢ ⎢ 0 ⎦ ⎣ Dpk 0 Dp
⎤ ⎥ ⎥ ⎥ ⎦
k+1
⎥ ⎥ ⎥ ⎦
(14)
Dpk+1 where M H , the Hermite matrix, is the inverse of the boundary constraint matrix. Equation 11 can thus be written in terms of the boundary conditions as ⎤ ⎡ pk ⎥ ⎢ p ⎢ k+1 ⎥ P(u) = [u3 u2 u 1] · M H · ⎢ (15) ⎥ ⎣ Dpk ⎦ Dpk+1 Finally, we can determine expressions for the polynomial Hermite blending functions, Hk (u) for k = 0, 1, 2, 3, by carrying out the matrix multiplications in Equation 15 and collecting coefficients for the boundary constraints to obtain the polynomial form P(u) = pk (2u3 − 3u2 + 1) + pk+1 (−2u3 + 3u2 ) + Dpk (u3 − 2u2 + u) + Dpk+1 (u3 − u2 ) = pk H0 (u) + pk+1 H1 + Dpk H2 + Dpk+1 H3
(16)
Figure 11 shows the shape of the four Hermite blending functions. Hermite polynomials can be useful for some digitizing applications, where it may not be too difficult to specify or approximate the curve slopes. But for most problems in computer graphics, it is more useful to generate spline curves without requiring input values for curve slopes or other geometric information, in addition to control-point coordinates. Cardinal splines and Kochanek-Bartels splines, discussed in the following two sections, are variations on the Hermite splines that do not require input values for the curve derivatives at the control points. Procedures for these splines compute parametric derivatives from the coordinate positions of the control points.
Cardinal Splines As with Hermite splines, the cardinal splines are interpolating piecewise cubic polynomials with specified endpoint tangents at the boundary of each curve section. The difference is that we do not input the values for the endpoint tangents. For a cardinal spline, the slope at a control point is calculated from the coordinates of the two adjacent control points.
417
Spline Representations H0(u) 1
H1(u) 1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
0.2
0.4
0.6
0.8
1
u
0
0.2
0.4
(a)
0.8
1
0.6
0.8
1
u
(b)
H2(u) 1
H3(u) 1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
0.6
0.2
0.4
0.6
0.8
1
u
0
0.2
0.4
u
0.2 (c)
(d)
FIGURE 11
The Hermite blending functions.
pk
p(u) pk 1 pk 2
pk 1 FIGURE 12
Parametric point function P( u ) for a cardinal-spline section between control points pk and pk +1 .
A cardinal spline section is completely specified with four consecutive control-point positions. The middle two control points are the section endpoints, and the other two points are used in the calculation of the endpoint slopes. If we take P(u) as the representation for the parametric cubic point function for the curve section between control points pk and pk+1 , as in Figure 12, then the four control points from pk−1 to pk+1 are used to set the boundary conditions for the cardinal-spline section as P(0) = pk P(1) = pk+1 1 (17) (1 − t)(pk+1 − pk−1 ) 2 1 P (1) = (1 − t)(pk+2 − pk ) 2 Thus, the slopes at control points pk and pk+1 are taken to be proportional, respectively, to the chords pk−1 pk+1 and pk pk+2 (Figure 13). Parameter t is called the tension parameter because it controls how loosely or tightly the cardinal spline fits the input control points. Figure 14 illustrates the shape of a cardinal curve for very small and very large values of tension t. When t = 0, this class of curves is referred to as Catmull-Rom splines, or Overhauser splines. P (0) =
pk
pk 1
pk 1 pk 2
FIGURE 13
Tangent vectors at the endpoints of a cardinal-spline section are parallel to the chords formed with neighboring control points (dashed lines).
418
Spline Representations
Using methods similar to those for Hermite splines, we can convert the boundary conditions 17 into the matrix form ⎤ ⎡ pk−1 ⎢ p ⎥ ⎢ k ⎥ P(u) = [u3 u2 u 1] · MC · ⎢ (18) ⎥ ⎣ pk+1 ⎦
−s ⎢ 2s MC = ⎢ ⎣ −s 0
2−s s−3 0 1
s−2 3 − 2s s 0
t0 (Tighter Curve)
FIGURE 14
Effect of the tension parameter on the shape of a cardinal-spline section.
pk+2 where the cardinal matrix is ⎡
t0 (Looser Curve)
⎤ s −s ⎥ ⎥ 0 ⎦ 0
(19)
with s = (1 − t)/2. Expanding Equation 18 into polynomial form, we have P(u) = pk−1 (−s u3 + 2s u2 − s u) + pk [(2 − s)u3 + (s − 3)u2 + 1] + pk+1 [(s − 2)u3 + (3 − 2s)u2 + s u] + pk+2 (s u3 − s u2 ) = pk−1 CAR0 (u) + pk CAR1 (u) + pk+1 CAR2 (u) + pk+2 CAR3 (u)
(20)
where the polynomials CARk (u) for k = 0, 1, 2, 3 are the cardinal-spline blending (basis) functions. Figure 15 gives a plot of the basis functions for cardinal splines with t = 0. CAR0(u)
CAR1(u)
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
0.2
0.4
0.6
0.8
1
u
0
0.2
0.4
0.6
0.8
1
0.6
0.8
1
u
0.2 (a)
(b)
CAR2(u)
CAR3(u)
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
0.2
0.4
0.6
0.8
(c)
1
u
0 0.2
0.2
0.4
u
(d)
FIGURE 15
The cardinal-spline blending functions for t = 0 (s = 0.5).
419
Spline Representations
Examples of curves produced with the cardinal-spline blending functions are given in Figures 16, 17, and 18. In Figure 16, four cardinal-spline sections are plotted to form a closed curve. The first curve section is generated using the control-point set {p0 , p1 , p2 , p3 }, the second curve is produced with the control-point set {p1 , p2 , p3 , p0 }, the third curve section has control points {p2 , p3 , p0 , p1 }, and the final curve section has control points {p3 , p0 , p1 , p2 }. In Figure 17, a closed curve is obtained with a single cardinal-spline section by setting the position of the third control point to the coordinate position of the
7 p2
p1
6 5 4 3 2 FIGURE 16
1
A closed curve with four cardinalspline sections, obtained with a cyclic permutation of the control points and with tension parameter t = 0.
p3
p0 2
4
6
8
10
1
10
p1 p 2
8 6 4 FIGURE 17
A cardinal- spline loop produced with curve endpoints at the same coordinate position. The tension parameter is set to the value 0.
2 p0 0
p3 2
4
6
8
10
6
8
10 p3
2.50 2.25 p1
2.00
p2
1.75 1.50 FIGURE 18
A self-intersecting cardinal-spline curve section produced with closely spaced curve endpoint positions. The tension parameter is set to the value 0.
420
1.25 1.00 p0
2
4
Spline Representations p2
p2
FIGURE 19
p0
p1
p3
p4
b0
p0
p1
p3
Effect of the bias parameter on the shape of a Kochanek-Bartels spline section.
p4
b0
second control point. In Figure 18, a self-intersecting cardinal-spline section is produced by setting the position of the third control point very near the coordinate position of the second control point. The resulting self-intersection is due to the constraints on the curve slope at the endpoints p1 and p2 .
Kochanek-Bartels Splines These interpolating cubic polynomials are extensions of the cardinal splines. Two additional parameters are introduced into the constraint equations defining Kochanek-Bartels splines to provide further flexibility in adjusting the shapes of curve sections. Given four consecutive control points, labeled pk−1 , pk , pk+1 , and pk+2 , we define the boundary conditions for a Kochanek-Bartels curve section between pk and pk+1 as P(0) = pk P(1) = pk+1 1 P (0)in = (1 − t)[(1 + b)(1 − c)(pk − pk−1 ) 2 + (1 − b)(1 + c)(pk+1 − pk )] 1 P (1)out = (1 − t)[(1 + b)(1 + c)(pk+1 − pk ) 2 + (1 − b)(1 − c)(pk+2 − pk+1 )]
(21)
where t is the tension parameter, b is the bias parameter, and c is the continuity parameter. In the Kochanek-Bartels formulation, parametric derivatives might not be continuous across section boundaries. Tension parameter t has the same interpretation as in the cardinal spline formulation; that is, it controls the looseness or tightness of the curve sections. Bias, b, is used to adjust the curvature at each end of a section so that curve sections can be skewed toward one end or the other (Figure 19). Parameter c controls the continuity of the tangent vector across the boundaries of sections. If c is assigned a nonzero value, there is a discontinuity in the slope of the curve across section boundaries. Kochanek-Bartels splines were designed to model animation paths. In particular, abrupt changes in the motion of an object can be simulated with nonzero values for parameter c. These motion changes are used in cartoon animations, for example, when a cartoon character stops quickly, changes direction, or collides with some other object.
8 Bézier Spline Curves This spline approximation method was developed by the French engineer Pierre Bézier for use in the design of Renault automobile bodies. Bézier splines have a number of properties that make them highly useful and convenient for curve and
421
Spline Representations
surface design. They are also easy to implement. For these reasons, Bézier splines are widely available in various CAD systems, in general graphics packages, and in assorted drawing and painting packages. In general, a Bézier curve section can be fitted to any number of control points, although some graphic packages limit the number of control points to four. The degree of the Bézier polynomial is determined by the number of control points to be approximated and their relative position. As with the interpolation splines, we can specify the Bézier curve path in the vicinity of the control points using blending functions, a characterizing matrix, or boundary conditions. For general Bézier curves, with no restrictions on the number of control points, the blendingfunction specification is the most convenient representation.
Bézier Curve Equations We first consider the general case of n + 1 control-point positions, denoted as pk = (xk , yk , zk ), with k varying from 0 to n. These coordinate points are blended to produce the following position vector P(u), which describes the path of an approximating Bézier polynomial function between p0 and pn : n P(u) = pk BEZk,n (u), 0≤u≤1 (22) k=0
The Bézier blending functions BEZk,n (u) are the Bernstein polynomials BEZk,n (u) = C(n, k)uk (1 − u)n−k
(23)
where parameters C(n, k) are the binomial coefficients n! C(n, k) = (24) k!(n − k)! Equation 22 represents a set of three parametric equations for the individual curve coordinates: n x(u) = xk BEZk,n (u) k=0
y(u) =
n
yk BEZk,n (u)
(25)
k=0
z(u) =
n
zk BEZk,n (u)
k=0
In most cases, a Bézier curve is a polynomial of a degree that is one less than the designated number of control points: Three points generate a parabola, four points a cubic curve, and so forth. Figure 20 demonstrates the appearance of some Bézier curves for various selections of control points in the xy plane (z = 0). With certain control-point placements, however, we obtain degenerate Bézier polynomials. For example, a Bézier curve generated with three collinear control points is a straight-line segment; and a set of control points that are all at the same coordinate position produce a Bézier “curve” that is a single point. Recursive calculations can be used to obtain successive binomial-coefficient values as n−k +1 C(n, k) = C(n, k − 1) (26) k for n ≥ k. Also, the Bézier blending functions satisfy the recursive relationship BEZk,n (u) = (1 − u)BEZk,n−1 (u) + u BEZk−1,n−1 (u), with BEZk,k = u and BEZ0,k = (1 − u) . k
422
k
n>k≥1
(27)
Spline Representations p1
p1
p2
p1
p3
p0
p0 p0 p2
p3
(a)
p2
(b)
(c) p1
p2
p3
p0
p0
p4
p1
FIGURE 20
p3 (d)
Examples of two-dimensional Bézier curves generated with three, four, and five control points. Dashed lines connect the control-point positions.
p2 (e)
Example Bézier Curve-Generating Program An implementation for calculating the Bézier blending functions and generating a two-dimensional, cubic Bézier-spline curve is given in the following program. Four control points are defined in the xy plane, and 1000 pixel positions are plotted along the curve path using a pixel width of 4. Values for the binomial coefficients are calculated in procedure binomialCoeffs, and coordinate positions along the curve path are calculated in procedure computeBezPt. These values are passed to procedure bezier, and pixel positions are plotted using the OpenGL point-plotting routines. Alternatively, we could have approximated the curve path with straight-line sections, using fewer points. More efficient methods for generating coordinate positions along the path of a spline curve are explored in Section 15. For this example, the world-coordinate limits are set so that only the curve points are displayed within the viewport (Figure 21). If we also wanted to plot the control-point positions, the control graph, or the convex hull, we would need to extend the limits of the world-coordinate clipping window.
#include #include #include /* Set initial size of the display window. GLsizei winWidth = 600, winHeight = 600;
*/
/* Set size of world-coordinate clipping window. GLfloat xwcMin = -50.0, xwcMax = 50.0; GLfloat ywcMin = -50.0, ywcMax = 50.0;
*/
423
Spline Representations
FIGURE 21
A Bézier curve displayed by the example program.
class wcPt3D { public: GLfloat x, y, z; }; void init (void) { /* Set color of display window to white. glClearColor (1.0, 1.0, 1.0, 0.0); }
*/
void plotPoint (wcPt3D bezCurvePt) { glBegin (GL_POINTS); glVertex2f (bezCurvePt.x, bezCurvePt.y); glEnd ( ); } /* Compute binomial coefficients C for given value of n. void binomialCoeffs (GLint n, GLint * C) { GLint k, j; for (k = 0; k = k + 1; C [k] *= j; for (j = n - k; j >= 2; C [k] /= j;
j--) j--)
} } void computeBezPt (GLfloat u, wcPt3D * bezPt, GLint nCtrlPts, wcPt3D * ctrlPts, GLint * C) { GLint k, n = nCtrlPts - 1; GLfloat bezBlendFcn; bezPt->x = bezPt->y = bezPt->z = 0.0; /* Compute blending functions and blend control points. */ for (k = 0; k < nCtrlPts; k++) { bezBlendFcn = C [k] * pow (u, k) * pow (1 - u, n - k); bezPt->x += ctrlPts [k].x * bezBlendFcn; bezPt->y += ctrlPts [k].y * bezBlendFcn; bezPt->z += ctrlPts [k].z * bezBlendFcn; } } void bezier (wcPt3D * ctrlPts, GLint nCtrlPts, GLint nBezCurvePts) { wcPt3D bezCurvePt; GLfloat u; GLint *C, k; /* Allocate space for binomial coefficients C = new GLint [nCtrlPts];
*/
binomialCoeffs (nCtrlPts - 1, C); for (k = 0; k n for values of j ranging from 0 to n + d. With this assignment, the first d knots are assigned the value 0, and the last d knots have the value n − d + 2. Open uniform B-splines have characteristics that are very similar to Bézier splines. In fact, when d = n + 1 (degree of the polynomial is n), open B-splines reduce to Bézier splines, and all knot values are either 0 or 1. For example, with a cubic open B-spline (d = 4) and four control points, the knot vector is {0, 0, 0, 0, 1, 1, 1, 1} The polynomial curve for an open B-spline connects the first and last control points. Also, the parametric slope of the curve at the first control point is parallel to the straight line formed by the first two control points, and the parametric slope at the last control point is parallel to the line defined by the last two control points. Thus, the geometric constraints for matching curve sections are the same as for Bézier curves. As with Bézier curves, specifying multiple control points at the same coordinate position pulls any B-spline curve closer to that position. Since open B-splines start at the first control point and end at the last control point, a closed curve can be generated by setting the first and last control points at the same coordinate position. EXAMPLE 2
Open Uniform, Quadratic B-Splines
From conditions 48 with d = 3 and n = 4 (five control points), we obtain the following eight values for the knot vector: {u0 , u1 , u2 , u3 , u4 , u5 , u6 , u7 } = {0, 0, 0, 1, 2, 3, 3, 3} The total range of u is divided into seven subintervals, and each of the five blending functions Bk,3 is defined over three subintervals, starting at knot position uk . Thus B0,3 is defined from u0 = 0 to u3 = 1, B1,3 is defined
437
Spline Representations
from u1 = 0 to u4 = 2, and B4,3 is defined from u4 = 2 to u7 = 3. Explicit polynomial expressions are obtained for the blending functions from recurrence relations 38 as B0,3 (u) = (1 − u)2 ⎧ 1 ⎪ ⎪ ⎨ u(4 − 3u) 2 B1,3 (u) = ⎪ ⎪ ⎩ 1 (2 − u)2 2 ⎧ 1 2 ⎪ ⎪ u ⎪ ⎪ 2 ⎪ ⎪ ⎪ ⎨ 1 1 B2,3 (u) = u(2 − u) + (u − 1)(3 − u) ⎪ 2 2 ⎪ ⎪ ⎪ ⎪ ⎪ 1 ⎪ ⎩ (3 − u)2 2 ⎧ 1 ⎪ 2 ⎪ ⎨ (u − 1) 2 B3,3 (u) = ⎪ ⎪ ⎩ 1 (3 − u)(3u − 5) 2
0≤u