English Pages 206 Year 2007
UNDERSTANDING
IPTV
UNDERSTANDING
IPTV Gilbert Held
Boca Raton New York
Auerbach Publications is an imprint of the Taylor & Francis Group, an informa business
Auerbach Publications Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742
© 2007 by Taylor & Francis Group, LLC Auerbach is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1
International Standard Book Number-10: 0-8493-7415-4 (Hardcover) International Standard Book Number-13: 978-0-8493-7415-9 (Hardcover)
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.
No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Held, Gilbert, 1943- Understanding IPTV / Gilbert Held. p. cm. -- (Informa telecoms and media ; 3) Includes bibliographical references and index. ISBN 0-8493-7415-4 (alk. paper) 1. Digital television. 2. Television broadcasting--Technological innovations. 3. Multicasting (Computer networks) I. Title. II. Title: Understanding IP television. TK6678.H35 2006 621.388’17--dc22 2006043033
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the Auerbach Web site at http://www.auerbach-publications.com
Dedication Over the past two decades I have had the privilege to teach a series of graduate courses focused on various aspects of communications technology for Georgia College and State University. Teaching graduate school has enabled me both to convey information and to learn from the inquisitive minds of students. A long time ago, when I commenced work at my first full-time job, at the IBM Corporation in upstate New York, many desks were notable for a sign that simply stated the word “Think.” For more than 20 years, my graduate school students have made me remember the message that sign at IBM conveyed. In recognition of their inquisitive nature, this book is dedicated to the students of Georgia College and State University.
Contents
Preface
Acknowledgments
About the Author
1 Introduction to IPTV
1.1 The Concept of IPTV
1.2 Applications
1.3 The Potential Impact of IPTV
2 Market Drivers and Developing IPTV Infrastructure
2.1 Telephone Company Landline Erosion
2.2 The Pay-TV Market
2.3 Convergence of Voice, Data, and Video
2.4 Evolution of Video Compression
3 Television Concepts
3.1 Analog Television
3.2 Digital Television
3.3 Lossy Compression
4 The TCP/IP Protocol Suite and IPTV
4.1 The TCP/IP Protocol Suite
4.2 Delivering IPTV
5 Last Mile Solutions
5.1 VDSL
5.2 Distribution into the Home
6 Hardware Components
6.1 Set-Top Boxes
6.2 Media Center and Center Extenders
6.3 Servers
7 Software Solutions
7.1 Microsoft’s Windows Media Player
7.2 Apple Computer’s QuickTime
7.3 Other Media Players
7.4 Summary
8 Internet Television
8.1 Internet Television vs. IPTV
8.2 Internet Television
8.3 Summary
Index
Preface Just when we thought we had mastered modern communications-related acronyms, a new one has appeared. That acronym, IPTV, which is the subject of this book, represents an emerging technology that could change the manner by which we receive home entertainment, obtain training, operate our personal computers, and even use our cell phones. The acronym is an abbreviation for television transmitted over an Internet Protocol (IP) network, but it can also represent a series of technologies that provide television services to screens ranging in size from cell phone displays and personal computer monitors to large plasma and LCD televisions mounted on walls in homes or hung from the ceilings in airports. Although the first character pair of the acronym IPTV includes the word “Internet,” that term merely references the protocol used to transport television and does not mean that content has to be delivered over the Internet. Instead, IPTV requires only that the Internet Protocol be used to deliver television content. That content can include conventional television shows, movies, music videos, and other types of combined audio and video offerings.
In this book the reader will obtain a solid understanding of IP television in the form of IPTV. The text focuses on how IPTV operates; how it may reach the home or office; how it will compete with traditional cable, over-the-air broadcast stations, and satellite television; and the hardware and software that will make it a reality. Because readers can reasonably be expected to have diverse backgrounds, I review both television concepts and the TCP/IP protocol suite in separate chapters in this book. In my review of television concepts, I will also review several popular compression standards, the use of which is integral to enabling the large number of channels we can select for viewing when we subscribe to certain types of television services.
Because IPTV represents a series of technologies that can be used over any type of IP network, including the Internet, we will examine the operation and utilization of hardware and software components required to view television content delivered over different types of IP networks. This examination will include so-called “last mile” solutions that in actuality can represent the manner by which communications organizations connect homes and offices to their infrastructure either directly via fiber or via short spans of a few hundred to a few thousand feet of copper cable.
As an emerging technology, the development of IPTV resulted in a number of industry alliances as well as individual companies focusing their efforts on this technology. Thus, any examination of the technology would not be complete without also discussing industry players and alliances. Although this is an area that can be expected to undergo considerable change over the next few years, this discussion will provide the reader with a firm understanding of companies that are working with the technology and how their efforts can alter the manner by which television content is delivered. Last, but not least, the concluding chapter of this book provides examples of the use of IPTV that will illustrate the potential of this evolving technology.
As a professional author, I truly value reader comments. Please feel free to write me either in care of my publisher, whose address is on the jacket of this book, or via e-mail to [email protected]. Let me know if I dwelt too long on a particular topic or if I should add material on a particular topic, or any other comments you may wish to share with me. Because I frequently travel, I may not be able to immediately answer your letter or e-mail; however, I will attempt to answer all persons within a week or two.
Gilbert Held Macon, GA
Acknowledgments A long time ago, after I completed my first manuscript, I became aware of the many persons involved in the book production process. From the typing of an initial manuscript to the editing process and the production of galley pages, through the cover design process and binding effort, there is a literal army of men and women whose efforts are crucial in producing the book you are now reading. I would be remiss if I did not acknowledge their efforts. First and foremost, every book idea will come to naught unless an author works with an editor who has the foresight and vision to back an effort focused on an emerging technology. Once again, I am indebted to Rich O’Hanley at CRC Press for supporting my writing efforts. The preparation of a manuscript is a lengthy process. Although this author has many laptop and notebook computers, long ago he gave up modern technology for pen and paper. Regardless of where one travels, the differences in electrical outlets and airline policies do not have an effect on pen and paper. Although this method of manuscript generation may appear awkward in today’s era of electronic gadgets, as long as I have paper and pen I do not have to worry about whether or not I can use my computer on an airline or if my outlet converter will mate to the receptacle used in a hotel. Of course, my penmanship leaves a lot to be desired and makes me truly grateful for the efforts of my wife, soul mate, and excellent typist, Beverly. Commencing her effort on a 128kB Macintosh to type my first book almost 30 years ago, Beverly now uses Microsoft Word under Windows XP on both desktop and notebook computers to not only type this author’s manuscripts, but also to type the index for each book.
Once a manuscript reaches the publisher, a number of behind-the-scenes efforts occur. Once again, I am indebted to Claire Miller for guiding the manuscript through the book production process. Concerning that process, I would also like to thank the Taylor & Francis/Auerbach team in Boca Raton, Florida, for their efforts in reviewing and editing my manuscript as well as for guiding the reviewed galley pages into the book you are reading.
About the Author Gilbert Held is an award-winning author and lecturer who specializes in the application of communications technology. Over the past 30 years, Gil has authored approximately 100 books and 300 articles focused on communications technology and personal computing. Although the number of books Gil has authored may appear to be quite high, that number includes second, third, and even fourth editions of several books that were researched and written over a long period of time. In recognition of Gil’s writing talents, he twice received the Karp Award for technical excellence in writing. Gil has also received awards from the American Publishers Institute and Federal Week. After earning a BS in Electrical Engineering from Widener University, Gil earned an MSEE degree from New York University and the MSTM and MBA degrees from The American University. Presently Gil is the director of 4-Degree Consulting, a Macon, Georgia-based organization that specializes in the application of communications technology.
Chapter 1
Introduction to IPTV The purpose of this introductory chapter is to acquaint the reader with the technology the book will focus on, beginning with a definition of the technology. Once that is accomplished, we will compare that definition to other television delivery methods, including standard over-the-air, digital satellite, and cable television. Using the preceding information as a base, we will conclude this chapter by examining the existing and potential utilization of IPTV. This information will enable us to obtain an appreciation for how IPTV can be used as well as allow us to note the advantages and disadvantages associated with its use. In concluding this chapter we will examine the potential impact of IPTV. Because IPTV represents a series of technologies, we will primarily focus on the delivery of television shows, movies, and similar content via private IP-based networks. However, prior to doing so, we will examine the network elements common to different types of IPTV services to obtain an appreciation for the general manner by which video content can be delivered to consumers over both public and private IP-based networks. So, let’s grab a soda and perhaps a few munchies and begin our exploration of IPTV.
1.1 The Concept of IPTV We can define IPTV as representing “digital video content, including television, that is delivered via the use of the Internet Protocol (IP).” This definition of IPTV not only is very simple but also stresses that the Internet does not need to play a role in the delivery of television or any other type of video content. Instead, IPTV refers to the use of IP as a delivery mechanism, whether over the Internet, which represents a public IP-based network, or over a private IP-based network. Because IPTV requires the use of IP only as a delivery mechanism, it can be used to deliver various types of content over both the Internet and private IP-based networks. Examples of IPTV content can range in scope from music videos to television shows, full feature movies, rock concerts, and a variety of special events, such as boxing matches, football games, or even Broadway musicals. This means that our brief definition of IPTV covers a wide range of both existing and potential activities. Some of those activities could include downloading a movie or music video via the Internet for viewing now or at a later date, or subscribing to a television service that is delivered to a homeowner over a private network through the use of IP. Thus, the term IPTV does not restrict content to that provided by broadcast television, nor does it imply that delivery of content has to occur over the Internet. As we will note both later in this chapter and in other chapters throughout this book, IPTV represents a broad term used to reference the delivery of a wide variety of video content using IP as a mechanism for transporting content.
Prior to discussing in more detail a few examples of the delivery of video content via IP-based networks, a few words are in order concerning the acronym “IPTV.” That acronym should not be confused with IP/TV, which is an active, registered U.S. trademark owned by Cisco, the company best known for its routers. Cisco uses IP/TV to reference a series of products developed to transport television content over the Internet or via private IP-based networks.
Public IP-Based Network Utilization In this section we will turn our attention to the delivery of video content via the Internet. As previously mentioned, IPTV can operate over any IP-based network, including the Internet. There are literally hundreds of examples of the use of this technology on the Internet. Some examples, such as the now famous Victoria’s Secret annual fashion show, represent the free broadcast of a video stream. Other examples of the use of IPTV on the Internet range from the downloading of music videos, television shows, and full feature movies to various types of special events, such as the launch of the Space Shuttle to the movement of the Mars lander when it made its first steps on the Red Planet. To obtain a better appreciation for the operation of IPTV over the Internet, let’s turn our attention to two major Internet-based video services.
Currently, two major Internet-based video services provide on-demand digital content to people operating personal computers via a broadband connection. Those Internet-based video services are operated by CinemaNow and MovieLink. Although both CinemaNow and MovieLink offer thousands of movies on an on-demand basis, as of the time this book was prepared they were not providing the typical content associated with a television viewing audience. Thus, this explains why our definition of IPTV included the phrase “digital video content, including television,” because IPTV can be used to deliver movies, live events, TV shows, and in general a wide range of videos that have in common the fact that delivery occurs via an IP network. Although the term “IP video” is probably better suited for the delivery of all types of video via an IP network, we will follow industry practice and use the term IPTV to refer to the delivery of all types of video content, including television. Because CinemaNow and MovieLink are representative of public IP-based content delivery, we will now look at each organization in more detail.
CinemaNow CinemaNow, Inc., represents one of two key players in the delivery of IPTV content to consumers over the Internet. Currently, CinemaNow offers legal content from a library of more than 6500 movies, television programs, music concerts, and music videos from more than 200 licensors via downloading from its Web site (www.cinemanow.com) or as streaming content. CinemaNow was founded in 1999 and is financially backed by several key companies, including Lions Gate Entertainment, Cisco Systems, Blockbuster, and Microsoft. In the latter part of 2005 CinemaNow announced the availability of high-definition content and support for portable media devices. Concerning the latter, CinemaNow announced in September 2005 that it would make its download service available on new portable video players from the French consumer electronics manufacturer Archos as well as bundle new players with two free downloads from the company. Under the agreement, users could download and select from approximately 550 feature-length titles and 200 music videos that are available for compatible portable video players or connect their video players to a television set for viewing.
MovieLink A second major player in the IPTV market where delivery occurs over the Internet is MovieLink, whose Web address is www.movielink.com. MovieLink represents an online video-rental service that was founded by five major movie studios: MGM, Paramount, Sony Pictures, Universal, and Warner Brothers. In September 2005 MovieLink had a library of approximately 900 films. Viewers pay from 99 cents to $4.99 per title to download a film and store it on their computer for up to 30 days. Once they begin viewing the film, they have 24 hours to finish viewing it, at which point the film is automatically deleted. The MovieLink Web site allows customers to search for movies by category, such as action, drama, or Oscar films. Users can also search for movies by actor, director, or title. Renting a movie requires that you first download and install the MovieLink Manager, a program that controls the movie downloading process as well as its playback. MovieLink allows users to purchase an additional 24 hours of movie viewing and supports both RealPlayer and Windows Media formats. According to the company, it takes approximately 80 minutes to download a movie. Currently the MovieLink Web site is accessible only to U.S. residents who have a broadband connection with a minimum data rate of 128 kbps. In addition, the computer must run a version of Windows at or above Windows 98, such as Windows 2000, ME, or XP, with 2 GB of free disk space.
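The rental terms described above (up to 30 days of storage, 24 hours to finish once viewing begins, and the option to purchase an additional 24 hours) can be expressed as a small rule. The following Python sketch is only an illustration and does not represent the actual logic of the MovieLink Manager. The function name `rental_expiry` and its parameters are hypothetical, and the sketch assumes that once viewing has begun, the viewing window alone determines when the film is deleted.

```python
from datetime import datetime, timedelta

# Figures taken from the rental terms quoted in the text.
STORAGE_WINDOW = timedelta(days=30)   # how long an unwatched film may be stored
VIEWING_WINDOW = timedelta(hours=24)  # time allowed once viewing has begun


def rental_expiry(downloaded_at, first_played_at=None, extra_periods=0):
    """Return the moment a rented film stops being playable.

    Hypothetical helper: assumes the viewing window replaces the storage
    window as soon as playback starts, and that each purchased extension
    adds one more 24-hour viewing period.
    """
    if first_played_at is None:
        # Never played: the film simply ages out of storage.
        return downloaded_at + STORAGE_WINDOW
    return first_played_at + VIEWING_WINDOW * (1 + extra_periods)


downloaded = datetime(2005, 9, 1, 12, 0)
started = datetime(2005, 9, 10, 20, 0)
print(rental_expiry(downloaded))              # unwatched rental
print(rental_expiry(downloaded, started))     # watched, no extension
print(rental_expiry(downloaded, started, 1))  # watched, one extension purchased
```

Keeping the two windows as named constants makes it easy to see that the terms involve two separate clocks, one that starts at download and one that starts at first playback.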
Apple Computer’s iPod In concluding our brief examination of IPTV occurring over a public IP-based network, a few words are in order concerning Apple Computer’s newest iPod that became available to the consumer during October 2005. Featuring a 2.5-inch color display, the new iPod, which can be purchased with either a 30-GB or a 60-GB disk, is capable of storing up to 150 hours of video. Although the iPod does not transmit video, its consumer-friendly ability to store and replay video makes it a market driver for the use of the Internet to download video onto PCs and then transfer the downloaded video content onto the new iPod. In fact, in 2005, Apple Computer’s iTunes Web site (www.apple.com/itunes) began offering the latest episodes of ABC and Disney television shows, including Lost and Desperate Housewives, for $1.99 per show. The Web site also allows customers to select from a list of more than 2000 music videos. Although iPod downloading was in its infancy at the time this book was written, the availability of thousands of music videos and television shows should significantly increase the use of the Internet for various types of IPTV operations.
Private IP-Based Network Utilization We have discussed a few IPTV-related applications that are occurring over the Internet, a public IP-based network. In this section we will examine some examples of the use of private IP-based networks for the delivery of video content. At the time this book was prepared both SBC Communications and Verizon, which operate large private IP-based networks, were in the process of installing several billion dollars’ worth of fiber communications in their service areas as a mechanism to provide voice, video, and data communications services to their customers. Through the installation of fiber, either to the neighborhood or directly to the customer’s premises, sufficient bandwidth becomes available to provide television services in competition with cable television and satellite operators. In this section we will briefly discuss the plans of SBC Communications and Verizon concerning the delivery of video services over the fiber networks they are in the process of constructing. However, prior to doing so, a few words are in order concerning the rationale for firms we think of as telephone companies entering the IPTV marketplace.
Rationale Conventional telephone companies are under pressure from a steady loss of their customer base. Over the past five years, cable television providers have introduced Voice-over-IP (VoIP) service on their cable networks, and that service now has approximately two million subscribers. As wireless phone usage has grown, so too has the number of homeowners and apartment renters who have disconnected their landlines. Projections indicate that by 2010 more than 10 million Americans will have either disconnected a second telephone line or dropped their primary landline altogether in favor of VoIP, wireless cell phone service, or a combination of both. Thus, conventional telephone companies are experiencing a significant loss in both customers and revenue that could be replaced by providing an IPTV service in competition with cable television and satellite television. A second reason for providing an IPTV service resembles the response of the famous criminal Willie Sutton, who, when asked why he robbed banks, said he did so because “that’s where the money is.” Similarly, telephone companies are entering the IPTV market because that’s where the money resides. For example, local telephone service typically costs the consumer between $30 and $40 per month, and the addition of unlimited long distance typically increases the phone bill by another $20 per month. In comparison, basic cable TV video service costs more than $40 per month, with digital video adding approximately $20 per month to one’s cable TV bill. When subscribers add high-speed Internet access and VoIP telephone service to their cable TV bill, the bill approaches or exceeds $100 per month. Thus, telephone companies that can develop a competitive IPTV service can look forward to a potential revenue stream in addition to retaining a portion of their customer base that is migrating to cable television. Now that we have an appreciation for the reasons why telephone companies are expanding into providing IPTV, let’s turn our attention to the services being rolled out by SBC Communications and Verizon.
SBC Communications At the time this book was written, SBC Communications had announced that it would adopt the AT&T name once its planned acquisition of that company was approved. Although the Federal Communications Commission (FCC) approved SBC Communications’ takeover of AT&T and Verizon Communications’ purchase of MCI on October 31, 2005, SBC Communications had not renamed itself by the time this book was developed. Thus, this author will refer to the company by both its pre-merger name and its anticipated post-merger name. SBC Communications began testing an IP-based television service built on Microsoft’s TV IPTV Edition platform in June 2004. SBC Communications announced its Project Lightspeed in early 2005 to take advantage of its collaboration with Microsoft, and both companies began field trials of IPTV during mid-2005. SBC’s Project Lightspeed represents the company’s initiative to deploy fiber closer to customer locations to enable provision of a variety of IP-based services, such as IP television, VoIP, and ultra-fast Internet access. Under Project Lightspeed, customers will be able to access all services over a single network connection as well as have the ability to share access to those services from any number of IP-enabled household devices, such as PCs, PDAs, TVs, set-top boxes, and telephones. The IP-based TV service is expected to include instant channel changing, customizable channel lineups, video on demand, digital video recording, multimedia interactive programming guides, event notifications, and other features. Because SBC Communications’ IPTV offering will occur via two-way broadband communications, it is able to transmit alerts and notifications to customers watching television beyond the simple setting of program notifications available from cable television and digital satellite. For example, SBC Communications could configure its system to allow customers to enable the display of caller ID and instant messaging on their TV screens. 
According to SBC, the company expects to connect 18 million homes to its network by the end of 2007, using both fiber-to-the-neighborhood (FTTN) and fiber-to-the-premises (FTTP). The FTTN service will provide 20 to 25 Mbps of capacity to each customer whereas the FTTP service will enable up to 39 Mbps of capacity to the customer. For both services IPTV will allow up to four high-quality TV streams, including high-definition TV (HDTV). In addition to IPTV, the single SBC Communications customer connection will support VoIP and data services that will provide a 6-Mbps downstream and 1-Mbps upstream capability. Later in this book, when we turn our attention to industry players and alliances, we will discuss SBC Communications’ Lightspeed offering in additional detail.
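As a rough illustration of the figures quoted above, the following Python sketch divides the capacity that remains after the data service among the TV streams. It assumes, which the text does not state, that the 6-Mbps downstream data service shares the same 20 to 25 Mbps (FTTN) or 39 Mbps (FTTP) of capacity as the video streams and that VoIP traffic is small enough to ignore. The function name `per_stream_budget_mbps` is hypothetical.

```python
def per_stream_budget_mbps(capacity_mbps, data_mbps=6.0, tv_streams=4):
    """Average downstream rate left for each TV stream, in Mbps.

    Assumption (not from the text): data and video share one capacity pool
    and VoIP traffic is negligible.
    """
    if tv_streams <= 0:
        raise ValueError("tv_streams must be positive")
    remaining = capacity_mbps - data_mbps
    if remaining < 0:
        raise ValueError("data service exceeds the connection capacity")
    return remaining / tv_streams


# Capacities quoted in the text for FTTN (low and high end) and FTTP.
for label, capacity in [("FTTN, 20 Mbps", 20.0),
                        ("FTTN, 25 Mbps", 25.0),
                        ("FTTP, 39 Mbps", 39.0)]:
    budget = per_stream_budget_mbps(capacity)
    print(f"{label}: {budget:.2f} Mbps per stream")
```

Under these assumptions, four simultaneous streams would each receive only a few megabits per second on an FTTN connection, which is one reason the compression standards reviewed later in this book matter.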
Verizon Verizon, like SBC Communications, represents a large former Regional Bell Operating Company (RBOC) that has significantly grown in both service area and offerings since its divestiture by AT&T during the 1980s. During September 2005 Verizon began selling its IPTV service after commencing the construction of fiber connections to homes in half of the 29 states where it offers telephone service. Marketed under the name “FiOS TV,” the service offered customers more than 180 digital video and music channels, including more than 20 in high definition, for $39.95 per month. In addition, Verizon expanded its offerings to 1800 video-on-demand titles by the end of 2005. Verizon’s FiOS service was first launched in Keller, Texas, a city 30 miles west of Dallas. Verizon has been expanding its service to several other cities in Texas as well as to cities in Florida, Virginia, and California. FiOS for the home provides a 15-Mbps connection for $45 per month, and a 5-Mbps connection costs $35 per month. For customers with deep pockets, a 30-Mbps connection is available at a preliminary cost of $200 per month. The speed of the connection governs the number of TV services (different programs) that can be viewed at the same time. Although video compression can significantly reduce bandwidth requirements, a 5-Mbps connection may enable the viewing of only a single channel while using the Internet, whereas the 15-Mbps connection should provide the ability to simultaneously view three or four different programs on different television sets while another person in the home is surfing the Web. Along with its FiOS service, Verizon offers three set-top boxes. A standard-definition set-top box can be rented for $3.95 per month, whereas an HDTV set-top box is rented for $9.95 per month. The third type of set-top box combines high definition with a digital video recorder and can be rented for $12.95 per month.
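The relationship between connection speed and the number of programs that can be viewed at once can be sketched with simple arithmetic. The Python example below is a minimal sketch and not Verizon's method. The per-program rate of 3.5 Mbps and the 1.5 Mbps reserved for Web surfing are hypothetical values chosen only because they reproduce the qualitative statement above; actual rates depend on the compression standard and picture resolution. The function name `simultaneous_programs` is likewise an assumption.

```python
def simultaneous_programs(link_mbps, per_program_mbps=3.5, internet_reserve_mbps=1.5):
    """Number of different programs that fit on a connection at one time.

    Both default rates are illustrative assumptions, not figures from the text.
    """
    usable = link_mbps - internet_reserve_mbps
    if usable <= 0:
        return 0
    # Floor division: a program either fits completely or not at all.
    return int(usable // per_program_mbps)


# Connection speeds quoted in the text.
for link in (5.0, 15.0, 30.0):
    print(f"{link:.0f}-Mbps connection: {simultaneous_programs(link)} program(s)")
```

With these assumed rates the 5-Mbps connection carries a single program and the 15-Mbps connection carries three, in line with the estimate of three or four given above.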
We will discuss Verizon’s offerings in more detail later in this book. However, now that we have a general appreciation for IPTV, we can compare its general capability to a trio of competitive television delivery services — over-the-air broadcast television, cable television, and satellite television.
8 䡲 Understanding IPTV
Comparison to Other TV Delivery Methods We can subdivide our comparison into two areas: financial and technical. Thus, we will compare and contrast IPTV delivered over a private IP-based network to over-the-air broadcast television, cable television, and satellite television by briefly examining the financial and technical aspects of each. Of course, later in this book we will probe much deeper into the technical aspects of television in general and, specifically, the manner by which IPTV is delivered to customers.
Financial

From a broad financial perspective, IPTV delivered over a private IP-based network represents a subscription service that requires customers to pay a monthly fee for service plus the monthly cost for one or more set-top boxes. Although over-the-air broadcast television is a free service to consumers, it is paid for by advertising revenue. In addition, unless a consumer is located in a large metropolitan area, the number of over-the-air broadcast stations that may be viewable through the use of built-in television or rooftop antennas is usually very limited. This explains why a high percentage of homeowners subscribe to cable television and satellite television services: such services provide access to hundreds of television channels.

Financially, IPTV is best compared to cable television and satellite television, to which it is roughly equivalent. Concerning cable television services, although most cable operators offer a basic package of analog stations without requiring the homeowner to rent a set-top box, such boxes are necessary to subscribe to digital and premium services. If a homeowner subscribes to satellite television, because all offerings are in digital format, a set-top box or decoder is required for each television. Table 1.1 provides a general financial comparison between IPTV delivered via a private IP-based network and the troika of existing television delivery methods.
Table 1.1 Financial Comparison of IPTV to Existing TV Delivery Methods

Financial Feature          IPTV   Over-the-Air Broadcast TV   Cable TV   Satellite TV
Monthly subscription fee   Yes    No                          Yes        Yes
Set-top box fee            Yes    N/A                         Yes        Yes
Digital channel fee        No     No*                         Yes        No

* Existing over-the-air analog broadcasting is scheduled to terminate within two years, to be replaced by digital broadcasting. Analog television sets will then require a converter box to receive over-the-air digital broadcast signals.
Introduction to IPTV
䡲 9
Technical

From a technical perspective, over-the-air broadcast television stations, cable television operators, and satellite television providers operate similarly, broadcasting television channels at predefined frequencies, which enables subscribers to tune their television sets or set-top boxes to the channel they wish to view. Figure 1.1 illustrates an example of the manner by which a subscriber would view a channel when subscribing to a cable television or satellite service. In this example, note that both cable television and satellite operators simultaneously broadcast all of their channel offerings over a range of frequencies. Customers control a tuner built into their television or set-top box either directly, by pressing the up and down buttons on the TV or set-top box, or with a remote control unit that tunes the TV or set-top box to a different frequency range when they change the channel.

Because analog television channels require approximately 6 MHz of bandwidth, when you use the remote control to switch from channel 2 to channel 3, in effect your tuner is switched to display a different 6 MHz of bandwidth on the coaxial cable that provides cable television service to your home. If you subscribe to a satellite television service, switching from channel 2 to channel 3 also switches the bandwidth. However, satellite operators provide an all-digital service in which data compression reduces the bandwidth of each channel, which reduces the width of the frequency band that is switched.

Although both cable television and satellite operate very similarly with respect to the use of a tuner to select the frequency for a particular channel, IPTV is a completely different technology with respect to the delivery of video content. IPTV can be considered to represent a software-based “pull–push” technology. Here the term “pull” represents the subscriber transmitting via IP a request for a particular TV channel, movie, music video, or similar product.
The request is received by the IPTV provider, which pushes the video stream from a server to the requestor using the IP address of the requestor as the destination address. Note that because only a single video stream flows in response to a request, this minimizes the bandwidth required for the delivery of a television channel. Whereas cable television and satellite operators broadcast a large selection of channels at the same time, which requires the use of a tuner to select a desired channel, IPTV can be considered to represent an on-demand service, although as we will note later in this book, some types of video may be transmitted as broadcast streams to selected locations within an IPTV network.

Thus, in most cases the 600+ MHz of bandwidth required by cable television and satellite operators can be significantly reduced by an IPTV provider. In fact, according to SBC Communications and Verizon, a typical home with a 15- to 20-Mbps data channel connection to the network can receive between three and four simultaneous television channels via IPTV as well as obtain a VoIP capability and a high-speed Internet connection. Although the amount of bandwidth required to provide an IPTV capability is significantly less than the bandwidth provided by cable television and satellite operators, it is still more than that available from many types of Digital Subscriber Line (DSL) facilities provided by RBOCs. This explains why SBC Communications and Verizon are currently installing extensive fiber-optic-based networks as a mechanism to provide IPTV-based subscription services. The higher bandwidth of fiber-optic cable routed either to the neighborhood or to the customer premises enables each subscriber to obtain sufficient bandwidth to view three or four different television channels while receiving a VoIP capability and a high-speed Internet connection.

Figure 1.1 Viewing a cable television or satellite television channel.
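The pull–push exchange described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name IptvProvider, its methods, and the sample IP address are hypothetical, and no real video is transmitted. The sketch simply records which streams a provider would push to a subscriber and contrasts that with the number of 6-MHz analog channel slots that fit into 600 MHz of broadcast spectrum.

```python
from dataclasses import dataclass, field

ANALOG_CHANNEL_MHZ = 6        # bandwidth of one analog channel (from the text)
BROADCAST_SPECTRUM_MHZ = 600  # approximate cable/satellite spectrum (from the text)


@dataclass
class IptvProvider:
    """Hypothetical provider that sends only the streams that were requested."""
    active: dict = field(default_factory=dict)  # subscriber IP -> set of channels

    def pull(self, subscriber_ip, channel):
        """The subscriber 'pulls' by requesting a channel; the provider 'pushes'
        a stream whose destination address is the subscriber's IP address."""
        self.active.setdefault(subscriber_ip, set()).add(channel)
        return {"destination": subscriber_ip, "channel": channel}

    def streams_to(self, subscriber_ip):
        """Number of streams currently flowing to one subscriber."""
        return len(self.active.get(subscriber_ip, set()))


# Broadcast model: every channel slot is transmitted whether watched or not.
broadcast_channel_slots = BROADCAST_SPECTRUM_MHZ // ANALOG_CHANNEL_MHZ

# Pull-push model: only the requested streams reach the home.
provider = IptvProvider()
stream = provider.pull("192.0.2.10", "channel 2")
provider.pull("192.0.2.10", "channel 3")
print(broadcast_channel_slots, provider.streams_to("192.0.2.10"))
```

In this sketch the home receives two streams rather than all 100 channel slots, which is the reason a connection of a few tens of megabits per second can suffice.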
Potential IPTV Features

Because IPTV represents an all-digital service that can have its video presentation scaled to different types of monitors, it has the ability to provide features beyond the capability of other television distribution mechanisms. For example, IPTV set-top boxes via software could enable the simultaneous placement of four pictures on the screen that represent four customer channel requests. In addition, incoming telephone Short Message Service (SMS) messages, e-mail, and caller ID could be displayed on a customer’s television at a predefined location. Combine this with the ability to enable customers to select the viewing of video rentals and a virtually unlimited amount of high-definition content, and IPTV could represent a quantum leap over existing television delivered via over-the-air broadcast stations and cable and satellite operators. This probably explains why many forecasting organizations predict that by 2008 as many as 20 million homes will subscribe to an IPTV service. If we use a fee of $50 per month for the IPTV service, to include one set-top box, the revenue stream from this emerging service could be approximately $12 billion per year in a few years.

Thus, any way one examines the potential of IPTV, it is hard not to note that it provides the potential to enable the RBOCs to negate the revenue loss associated with what until recently represented their core revenue market, the home telephone. Otherwise, without IPTV, it is possible that the telephone companies we grew up with could go the way of the companies that delivered ice blocks to homes prior to the development of the refrigerator. Now that we have an appreciation for the lower bandwidth requirements of IPTV in comparison to the conventional manner by which we view television, let’s turn our attention to some of the applications that can be supported by this relatively new technology.
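The revenue estimate quoted above is simple arithmetic, and a short Python sketch makes it easy to check or to rerun with other forecasts. The function name annual_revenue is our own; the 20 million subscribers and the $50 monthly fee are the figures used in the text.

```python
def annual_revenue(subscribers, monthly_fee_dollars):
    """Yearly revenue in dollars for a flat monthly subscription fee."""
    return subscribers * monthly_fee_dollars * 12


# Figures from the text: 20 million homes paying $50 per month.
forecast = annual_revenue(20_000_000, 50)
print(f"${forecast / 1e9:.0f} billion per year")
```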
1.2 Applications Although there are many “flavors” of IPTV, we can view the technology as a mechanism for delivering high-quality digital video content over public and private IP-based networks. Because IP-based networks have a bidirectional communications capability, developers can create IPTV technology that enables customers to select what they want to watch as well as when they want to watch it. With the preceding in mind, let’s turn our attention to a few of the potential applications that IPTV can support.
Homeowner Entertainment

First and foremost, IPTV represents a technology that will enable telephone companies to compete with standard over-the-air television, cable television, and satellite operators for the entertainment budget of homeowners. Although homeowner entertainment is expected to represent the largest application of IPTV in terms of both subscribers and revenue, it is just one of a series of applications that can be supported by the technology. Table 1.2 lists eight IPTV applications, including the general category of homeowner entertainment, that can be expected to achieve significant growth over the next few years. In the remainder of this section we will briefly describe and discuss each of the other seven applications.

Table 1.2 Potential IPTV Applications
Homeowner entertainment
Digital television
On-demand video
Business TV to the desktop
Distance learning
Corporate communications
Mobile phone television
Video chat
Digital Television

As previously discussed in this chapter, IPTV can be considered to represent a pull–push technology whereby a subscriber makes a request to a service provider for a particular video stream. Because digitized television is both a very popular form of entertainment and very suitable for being compressed and carried via IPTV, it represents the primary application for the technology. In addition, because a service provider may have to transmit only what is requested, unlike cable and satellite, IPTV could theoretically provide an unlimited number of viewing channels, which would enable the service provider to offer more diverse content than conventional competitors that simultaneously broadcast every channel regardless of whether anyone is watching them. Thus, the architectural difference between IPTV and broadcast television enables the former to offer more diverse content, assuming the service provider can acquire sufficient content to match subscriber requirements.
On-Demand Video Although subscribers to cable and satellite television have been able for many years to obtain pay-per-view movies and sporting events, that capability pales in comparison to on-demand video that can be provided through IPTV technology. The key reason why IPTV on-demand video can be considered far superior to pay-per-view resides in the fact that the former can provide virtually unlimited program content whereas the latter is restricted to a handful of broadcast channels. One recent example of IPTV on-demand video is the Apple Computer iTunes Music Store, which in October 2005 began selling episodes of the hit television series Lost for $1.99 the day after the show aired on broadcast television. In a deal with Walt Disney Co., the parent of ABC Television, Apple also offers past and current episodes of Desperate Housewives, Night Stalker, and That’s So Raven to its customers. By the end of October 2005, Apple Computer Company was offering more than two million songs, 20,000 podcasts, 2000 music videos, and a variety of ABC and Disney television series that customers could download to their Mac or PC and then synchronize the content onto their iPod. Apple’s newest iPod, released in October 2005, could be obtained with either a 30-GB
or 60-GB disk and a 2.5-inch 320 × 240 pixel Thin Film Transistor (TFT) display, enabling customers to store up to 25,000 photos, 15,000 songs, and up to 150 hours of video.
Business TV to Desktop Although the primary market for IPTV is the individual consumer and household, the technology is also well suited for business applications. One such application is streaming business television to the desktop. In a business environment, each LAN workstation can be assigned a distinct IP address. Doing so makes it possible for different video streams to be directed to different employees. For example, some employees might require instant access to CNBC whereas other employees could require access to Bloomberg, Reuters, or another finance-oriented program. Because IPTV can be scaled on a screen, it also becomes possible for employees to view the requested business channel or channels while performing other computer operations using a different portion of their PC screen.
Distance Learning

In an academic environment it is possible to be in two places at the same time through the power of distance learning facilities. In fact, this author has used distance learning to teach a data communications course in Macon, Georgia, that was simultaneously broadcast onto video monitors located in Milledgeville, Georgia, the home of Georgia College & State University, and the learning center at Robins Air Force Base. Although distance learning can be accomplished through the use of conventional teleconferencing equipment, when performed through the use of IPTV the efficiency associated with reaching students at distant locations can significantly increase. This is because conventional distance learning that is based on the use of teleconferencing equipment results in a central monitor at distant locations. Not only do all students have to focus their attention on a single monitor, but in addition, a microphone has to be passed around by a proctor at each distant location to the students who wish to talk to the instructor giving the lecture. In comparison, the use of IPTV can significantly improve distance learning because the image of the distant instructor can be directed onto the PC monitor of each student workstation while a microphone connected to each computer enables students to converse with the instructor without having to wait for a microphone to be passed through the classroom.
Another significant advantage of IPTV within a distance learning environment is the fact that, similar to the previous discussion about business TV to the desktop, it can be scaled on a PC screen. This would allow distance learning courses on programming and other topics to have students both view and hear the instructor while they perform different exercises. Because software can be developed to enable an instructor to view student activities, it’s possible for a student’s work to be viewed by the instructor. Similarly, with appropriate programming, the instructor could display the efforts of one student on a designated portion of each student’s PC screen, which would significantly enhance instructor–student interaction.
Corporate Communications In most organizations, the president or a corporate officer often needs to address employees. In a conventional environment this requirement is commonly satisfied by scheduling the use of one or several auditorium sessions during which the corporate officer explains the reason why earnings went up or down, the effect of a new product line, changes to the employee benefit plan, or another subject that needs to be disseminated to a broad range of employees. The conventional use of an auditorium to announce a new policy or shed light on a recent event can require a significant amount of time and effort. If the auditorium was previously scheduled for another event, then the logistics of moving that event to a different time and venue could be considerable. In addition, there can be a considerable loss of employee productivity because the use of an auditorium requires time for employees to arrive and depart from the site as well as time for employees to move through the facility to a seat. In comparison, the use of IPTV can result in corporate communications being only a mouse-click away from any employee. That is, a corporate officer can tape a message that becomes available for downloading via IPTV. Employees could then be alerted to the availability of the newly created video via an e-mail containing a URL to click. Then, employees could view the video at their leisure, with no need to stop what they are working on to visit the auditorium or a conference room. Thus, the use of IPTV for corporate communications can significantly enhance employee productivity.
Mobile Phone Television Currently, mobile phone television is being developed to allow reception of broadcast television. This means that the first generation of mobile
phones with a television viewing capability will be limited to viewing over-the-air broadcast television offerings. Thus, users with that type of phone will be limited with respect to the content they can watch. As mobile phones with television viewing capability evolve, we can reasonably expect the addition of higher capacity secure digital cards, perhaps miniature disk drives, WiFi, and other communications capabilities to the product. As this action occurs, mobile phones can be expected to be used in hot zones at airports, Starbucks, hotels, motels, and other locations where mobile phone operators can connect to the Internet. This capability will enable mobile phone users to obtain access to significantly increased content. In addition, through the inclusion of either a secure digital card slot or a miniature disk drive, it becomes possible for users to download video content into their phone. Then, they could view the content at their leisure, similar to the manner by which users of the relatively new Apple Computer video iPod can view video.
Video Chat One of the more popular features associated with the use of the Internet is chat rooms, commonly hosted by different Internet Service Providers (ISPs) and Web portals. Chat rooms are used primarily to discuss a variety of topics, ranging in scope from American Idol and current events to science and education. Although people usually enter a chat room anonymously, this is not always true, especially when the chat facility requires individuals to provide identification prior to being able to access the facility. Once a person joins a chat room they can observe the identifiers of the other members currently in the room as well as what they are saying in the form of typed text. Although chat rooms are a popular mechanism for exchanging ideas and political views, the need to type responses significantly delays the interaction between people. Through the use of IPTV it becomes possible to develop a video chat facility; each person could mount a camera with a built-in microphone on their monitor that would transmit audio and video to the video chat room operator. Through applicable software, a user’s screen could be subdivided to display a number of chat room participants, allowing each user to scroll through the screen to view other members of the chat room as well as to click on the image of a person to display that person’s image on the full screen. A variation of video chat can be expected to result in a change in the commonly used “messenger” programs offered by Yahoo and other Web portals. Using a messenger program, a person creates a “buddy” or “friends” list, which allows certain other people to communicate with that person via text messages. Similar to current chat programs, messenger
programs depend on the skill of the people typing queries and responses and could be significantly enhanced through the use of a video capability.
Summary In this section we briefly looked at eight existing and emerging applications that could occur via IPTV. Each of the applications mentioned has significant advantages when performed through the use of IPTV instead of conventional video services. Thus, the use of IPTV technology can be expected to be driven by the advantages it provides in convenience and worker productivity.
1.3 The Potential Impact of IPTV In this introductory chapter we briefly examined the concept behind IPTV and a few of its existing and potential applications. In concluding this chapter we will focus attention on the network elements required to provide an IPTV system that can compete with cable and satellite providers and the impact the technology can be expected to have on both the consumer and the industry.
Network Elements

An IPTV system can be considered to represent four major elements that are both generic and common to any system provider’s infrastructure. Those elements include a video headend, a service provider’s IP network, a service provider’s access network, and the home or residence network. Figure 1.2 illustrates the relationship of the IPTV network elements and the data flow from the content provider to the consumer.

Figure 1.2 Key IPTV network elements.

In examining the relationship of the four key IPTV network elements shown in Figure 1.2, it is important to note that the network elements can be provided by more than a single vendor. For example, if a consumer is using IPTV simply to download a movie or music video via the public Internet, the video headend could represent one company and the service provider’s IP network could consist of a series of IP networks interconnected at a peering point to form the Internet backbone. Then, the service provider’s access network could represent an ISP, and the home network could consist of a router and wireless LAN products obtained from one or more manufacturers. In comparison, if the consumer were accessing a movie or TV show via a private IP network, the video headend, service provider IP network, and service provider access network would be provided by a single company. In fact, it is quite possible that that company would provide an end-to-end service, including any required home networking equipment. Now that we have an appreciation for how different organizations can provide different elements of an IPTV system, let’s focus our attention on each of the elements and their function.
The Video Headend

The video headend represents the point within a network where content is captured and formatted for distribution over the IP network. The video headend for an IP network is similar to the headends used by cable television and digital satellite systems. That is, the IP network video headend could be connected to satellite receivers to receive broadcast television and premium television, such as HBO and Showtime, which are broadcast via satellite. Other programming could be received via a terrestrial fiber-based connection or be supplied from DVD or hard disk servers to provide a content-on-demand service. The headend takes each data stream and encodes it into a digital video format, such as MPEG-2 or MPEG-4. MPEG is an acronym for Moving Picture Experts Group, an organization that develops standards for compressing still and moving images and audio. Later in this book we will examine several MPEG standards in some detail. After encoding, each data stream, which can be thought of as representing a conventional TV channel, is encapsulated into an IP data stream and transmitted to a specific IP destination address in response to a customer request for a particular channel.

As an alternative to the transmission of TV channels to individual destinations, which is technically referred to as unicast transmission, popular IPTV channels are more than likely transmitted as IP multicast data streams. With multicast transmission, each TV channel is sent as a data stream, with only a single copy of that stream flowing over any link of the network. A data stream is copied only where the network branches, so that it can flow onto each branch, which minimizes the amount of data that flows over the network. Customers on each network branch then join a multicast group, which enables multiple customers to view a single data stream that represents a TV channel under an IPTV service. Because that single stream flows over a majority of the IP network, transmission on the backbone is minimized.

Figure 1.3 compares three popular methods of IP addressing: unicast, broadcast, and multicast. With unicast addressing data is sent to a specific destination, whereas with broadcast addressing data is read by every station. Multicast addressing can be viewed as falling between the two, requiring stations to become a member of a multicast group in order to view an IPTV multicast transmission.

Figure 1.3 Comparing addressing methods.

Using multicast transmission, a service provider can transmit one IP data stream per broadcast channel from the video headend through the IP network onto the service provider’s access network. Multicast transmission can significantly reduce the flow of data over the network. For example, consider a heavyweight boxing match that tens of thousands of people may wish to view. Instead of having separate data streams of the match sent to each individual subscriber, the IPTV operator could transmit the match as a multicast transmission. Then, tens of thousands of subscribers could tune into the match by joining the multicast group that carries the match.
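The saving that multicast provides can be illustrated with a small Python sketch that counts how many copies of one channel cross each link of a tree-shaped network. The topology, node names, and the function link_copies are hypothetical and chosen only for illustration; real networks build their distribution trees with multicast routing protocols, which this sketch does not model.

```python
def link_copies(tree, root, viewers, multicast):
    """Count copies of one channel's stream crossing each link.

    tree      -- dict mapping a node to the list of its child nodes
    viewers   -- set of nodes (homes) watching the channel
    multicast -- True: one copy per link that leads to any viewer
                 False: unicast, one copy per viewer behind the link
    """
    loads = {}

    def visit(node):
        # Number of viewers located at or below this node.
        count = 1 if node in viewers else 0
        for child in tree.get(node, []):
            below = visit(child)
            if below:
                loads[(node, child)] = 1 if multicast else below
            count += below
        return count

    visit(root)
    return loads


# Hypothetical network: a headend, a core router, two edge nodes, four homes.
tree = {
    "headend": ["core"],
    "core": ["edge_a", "edge_b"],
    "edge_a": ["home1", "home2", "home3"],
    "edge_b": ["home4"],
}
viewers = {"home1", "home2", "home3", "home4"}

unicast = link_copies(tree, "headend", viewers, multicast=False)
multicast = link_copies(tree, "headend", viewers, multicast=True)
print(sum(unicast.values()), sum(multicast.values()))
```

With four viewers, unicast places four copies on the link leaving the headend, whereas multicast places one; the gap widens as the audience grows, as in the boxing match example.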
The Service Provider Network The service provider network can be considered as a delivery system that enables data to flow from the core of the network that is connected to the video headend to the network edge. Over the service provider network, the channel lineup flows in the form of encoded video streams. Those flows can consist of data transmitted as unicast, multicast, and broadcast transmission. The TV guide that flows to each subscriber could be a broadcast transmission. In comparison, a specially requested movie could be transmitted directly to a single subscriber via unicast transmission,
whereas the popular channel lineup could flow to all subscribers via multicast transmission.
The Access Network The access network provides connectivity from the customer’s premises to the backbone network operated by the service provider. In telephone terminology, the access network is commonly referred to as the “last mile” connection. Because telephone companies, such as SBC Communications (AT&T) and Verizon, are the primary developers of IPTV networks used to transport television content along with movies and other types of video content, the method used for the transport facility on the access network corresponds to the offering of RBOCs. Those offerings include several versions of Asymmetrical Digital Subscriber Lines (ADSL), very-high-bitrate Digital Subscriber Lines (VDSL), and different types of fiber-optic technology, such as passive optical networking (PON). In an IPTV environment, the service provider will use the access network to the subscriber’s premises to provide a single high-bandwidth connection. That connection will enable multiple television channels, VoIP, and high-speed Internet access to be provided over a common connection to the service provider’s network.
The Home Network The last major network element in an IPTV environment is the home network. The home network is responsible for distributing IPTV services throughout the home. Currently, the home network is in an evolutionary stage of development, with a transition occurring from wired Ethernet to wireless Ethernet and HomePlug audio-visual (AV) equipment. Wireless Ethernet can provide data rates up to approximately 100 Mbps, and the HomePlug AV specification enables data rates up to 200 Mbps to be transmitted over the electrical wiring in a home or office. The endpoints in the home network are telephones, the home computer or computers, and the set-top boxes that are required for each television. Now that we have an appreciation for the major network elements associated with an IPTV service, let’s focus our attention on the potential impact of this evolving television service.
Impact of IPTV The impact of IPTV on the delivery of video can be expected to be most pronounced in three areas. Those areas are content, convergence, and interactivity.
Content As previously explained in this chapter, IPTV enables the consumer not only to select television channels but also to choose from a virtually unlimited number of movies, videos, and other types of content that will be delivered on demand. This means that the ability to promote the advantage of IPTV will require the service provider to negotiate agreements with broadcast television and movie studios to offer an expanded content above and beyond that available from conventional television delivery entities, such as cable television and satellite television operators, that provide only a handful of on-demand programming. Assuming IPTV service providers are successful, we can reasonably expect available content to be several orders of magnitude beyond what conventional television now provides.
Convergence

Through the installation of a high-speed access line, it becomes possible to provide customers with the ability to receive video content, access the Internet, and use VoIP via a common connection to the service provider’s network. In addition, other applications, such as meter reading, may eventually occur over the common access line. Thus, the use of an IP network can be expected to allow many applications to run over a single service delivery network, in effect promoting the convergence of applications onto a common transport facility.
Interactivity Interactivity represents the third major impact that IPTV can be expected to have on the existing industry method of delivering television. An IP network provides a bidirectional transmission facility. This makes it possible to use a television remote control, computer keyboard, game console, or other device to select viewing content, initiate a video chat session, answer a telephone call through the speakers of a television, or perform other functions that currently represent items on a drawing board rather than reality. Thus, the bidirectional capability of IP networks can be expected to result in the development of applications that would otherwise be difficult or impossible to perform with conventional television delivery systems.
Chapter 2

Market Drivers and Developing IPTV Infrastructure

In Chapter 1, which introduced the different applications associated with IPTV technology, we briefly examined the rationale for telephone companies beginning to offer television-type services to their customers. In this chapter we will probe that reasoning a bit further as well as expand on our examination of market drivers for different types of IPTV applications. Commencing our effort by revisiting problems that are eroding telephone company core wireline voice communications services, we will also examine several additional drivers. Those drivers include the pay-TV market, the convergence of voice, data, and video services, the evolution of broadband and video compression technologies, and competition. Concerning the latter, we will describe and discuss several new technologies and marketing techniques of competitive companies that serve as market drivers for different types of IPTV services.

As we describe and discuss different market drivers, we will also examine the IPTV infrastructure being developed or used to provide different types of IPTV services. In doing so we will examine the requirements of what this author considers to represent the home of the future, which will have two standard-definition televisions (SDTVs) and two high-definition televisions (HDTVs) as well as a high-data-rate Internet connection. In addition, the home of the future will make use of a Voice-over-IP (VoIP) service, where voice conversations are digitized at a data rate significantly below the 64-kbps data rate used by conventional telephone services.
2.1 Telephone Company Landline Erosion
In Chapter 1 we discussed several reasons why the telephone landline business is contracting. In this section we will review those reasons and focus on the revenue retention measures telephone companies are taking, in which they use technology both to minimize their revenue loss and to enter the IPTV market.
Overview The telephone company, or more accurately the Regional Bell Operating Company (RBOC), faces a series of competitive technologies that are eroding their “bread and butter” in the form of revenues they receive from landlines or wired telephone service. Competition from wireless cell phones and cable television Internet offerings has significantly reduced the use of dial-up Internet access. This in turn has resulted in consumers abandoning their second telephone line and, in some cases, their primary landline because wireless cell phones and Internet access via cable television operators can be used to satisfy their data and voice communications requirements. For other consumers, the ability to obtain telephone service via VoIP offerings from their cable television operator allows them to obtain their required communications services from a vendor other than their local telephone provider. In fact, beginning in January 2006, Cox Communications, which is the local cable television operator in Macon, Georgia, began offering telephone service to its customers at rates as low as $14.70 per month for basic telephone service, but the addition of call waiting, call forwarding, and unlimited long distance within the United States increased the monthly cost to approximately $45. With more than a million telephone customers, Cox Communications has significantly eroded the customer base of RBOCs where it provides cable service.
Revenue Retention Measures Although the traditional telephone company landline business is under attack by wireless cell phone and cable television companies, the RBOCs are not standing still nor are they letting the attack on their revenue stream occur without countermeasures. Some RBOCs, such as Bell South and AT&T (formerly known as SBC Communications until SBC acquired AT&T),
entered into agreements with satellite television operators and their cell phone affiliates to offer bundled discounts for telephone, cell phone, and television delivered via satellite. To sweeten the pot, Bell South was also offering potential customers up to $150 cash back if they signed up for two or more services. Other RBOCs, such as Verizon and AT&T, are installing fiber-to-the-neighborhood (FTTN) or fiber-to-the-premises (FTTP) as the backbone infrastructure necessary to offer IPTV to their customers as well as provide a mechanism to take subscribers from cable television and satellite television operators.
Use of Fiber
When FTTN is employed to route fiber into a neighborhood, the RBOC uses copper-based Asymmetric Digital Subscriber Lines (ADSL) or very-high-bit-rate Digital Subscriber Lines (VDSL) to cover the “last mile” from the termination of the neighborhood fiber into the subscriber’s premises. ADSL enables a maximum data rate of approximately 8 Mbps over twisted copper wire less than 14,000 feet in length. As the length of the copper loop increases, the maximum achievable data rate decreases significantly, reaching approximately 1.5 Mbps at distances of approximately 18,000 feet. Unfortunately, although it enables high-speed Internet access, conventional ADSL service does not provide a sufficient data transmission rate at long distances to transport IPTV. This is because one HDTV show or movie when compressed requires between 8 and 10 Mbps, which exceeds the capacity of ADSL at distances commonly encountered between a fiber hub and a subscriber’s premises. Because most homes have multiple televisions, customers who need to simultaneously view two or more HDTV channels, or a single HDTV channel and an SDTV channel, cannot obtain the necessary bandwidth through the use of ADSL. Thus, telephone companies using FTTN in their rollout of IPTV are using ADSL2+ to provide an enhanced capability into customer premises.
ADSL
The original ADSL standard was referred to as G.dmt by the International Telecommunication Union (ITU) and designated as the G.992.1 standard. Under that standard, operations at data rates up to 8 Mbps downstream to the subscriber and up to 768 kbps upstream to the telephone company are supported at distances up to 18,000 feet. Figure 2.1 illustrates the subdivision of the twisted wire telephone line by frequency to accommodate the transmission and reception of data under ADSL concurrent with voice communications. Note that the voice channel of approximately 4 kHz is not affected by the data channels because the latter use blocks of frequencies well beyond the 0 to 4 kHz used for voice. Also note that because the data rate is proportional to bandwidth, the downstream frequency band is significantly larger than the upstream frequency band.
Figure 2.1 ADSL frequency utilization.
Modulation
There are three modulation methods used by ADSL for encoding data onto the local loop: carrierless amplitude and phase (CAP), discrete multitone (DMT), and a simplified DMT technology used by equipment adhering to the G.lite standard. CAP modulation can be considered to represent a nonstandard version of quadrature amplitude modulation (QAM). Under QAM, a double-sideband suppressed carrier signal is constructed from two multilevel pulse amplitude modulated (PAM) signals applied in phase quadrature to one another. CAP results in the same form of signal as QAM; however, it does not require in-phase and quadrature components of the carrier to first be generated. CAP was the de facto standard for ADSL until the mid-1990s, when DMT usage increased; DMT now represents the preferred modulation method.
Under DMT the upstream and downstream bands are subdivided into a sequence of smaller frequency ranges of approximately 4 kHz that are referred to as subchannels. Data bits are modulated using QAM on each subchannel, with up to 15 bits per subchannel being encoded when transmission occurs over a good-quality line. Because DMT enables the transmission bandwidth to be divided into a sequence of subchannels that may or may not be used depending on the quality of the line, this modulation technique allows the maximum transmission rate to be matched to the distinct characteristics of each line. Both the American National Standards Institute (ANSI) and the ITU have specified DMT as the standard modulation method for full-rate ADSL and a modified version of DMT for G.lite.
Concerning the latter, G.lite provides a maximum transmission rate of approximately 4 Mbps downstream and 512 kbps upstream. G.lite was approved by the ITU as the G.992.2 standard, and a revision to the standard referred to as G.lite.bis was approved as the G.992.4 standard. Because G.lite and G.lite.bis do not have the capacity to transport multiple video channels, RBOCs will have to upgrade customers using that technology, as well as many ADSL lines, if they want their subscribers to participate in the rollout of IPTV services. Fortunately, most modern ADSL modems support CAP and several versions of DMT, which may facilitate the upgrading of customers.
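The way DMT adapts to line quality can be illustrated with a small calculation. The following Python sketch sums the bits loaded onto each subchannel and multiplies by a symbol rate. The 15-bit ceiling comes from the text; the rate of 4000 symbols per second per subchannel and the example bit loading are assumptions made for illustration only, and the function name is hypothetical.

```python
# Sketch: aggregate DMT line rate from per-subchannel bit loading.
# Assumption (not from the text): each subchannel carries 4000 symbols/s.
SYMBOLS_PER_SECOND = 4000
MAX_BITS_PER_SUBCHANNEL = 15  # upper limit quoted in the text


def dmt_rate_bps(bits_per_subchannel):
    """Return the raw line rate in bps for a list of per-subchannel bit loads."""
    for bits in bits_per_subchannel:
        if not 0 <= bits <= MAX_BITS_PER_SUBCHANNEL:
            raise ValueError(f"bit load {bits} outside 0..{MAX_BITS_PER_SUBCHANNEL}")
    return sum(bits_per_subchannel) * SYMBOLS_PER_SECOND


# Hypothetical line: 100 clean subchannels, 50 noisy ones, 20 unusable ones.
example_loading = [15] * 100 + [6] * 50 + [0] * 20
example_rate = dmt_rate_bps(example_loading)
print(f"{example_rate / 1e6:.1f} Mbps")
```

Subchannels that fall on a noisy part of the spectrum simply carry fewer bits, or none, which is why two lines of the same length can reach different rates.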
ADSL2 The two versions of the ADSL2 standard, referred to as G.dmt.bis and G.lite.bis, were standardized by the ITU as G.992.3 and G.992.4, respectively. G.992.3 represents ADSL2 for full-rate ADSL, whereas G.992.4 represents the splitterless version of the more modern standard. Both ADSL2 standards were approved by the ITU in 2002 and supersede previously developed ADSL standards. ADSL2 was developed to improve the data transmission rate and transmission range of ADSL. This improvement is accomplished by an enhanced modulation efficiency, a reduction in framing overhead, higher coding gain, and the use of enhanced signal processing algorithms.
Enhanced Modulation Under ADSL2 a four-dimensional, 16-state trellis-coded and 1-bit QAM constellation is employed. This modulation method enables the achievement of higher data rates for extended distances when the signal-to-noise (S/N) ratio is low.
Framing Overhead Reduction A second feature of ADSL2 that facilitates a data rate and transmission distance improvement is a reduction of framing overhead. This framing reduction is accomplished by providing a frame with a programmable number of overhead bits. In comparison, the original versions of ADSL
used a fixed number of overhead bits per frame that consumed 32 kbps of transmission. Because ADSL2 enables overhead bits to be programmed from 4 to 32 kbps, up to 28 kbps of additional bandwidth can be used for payload data.
Coding Gain ADSL2 specifies the use of Reed–Solomon coding for forward error correction. Under Reed–Solomon coding, which represents a block-based method of error correction, extra or “redundant” bits are added to each block of digital data. At the receiver, a Reed–Solomon decoder processes each block and attempts to correct errors and recover the original data in the block. Under ADSL2, a higher coding gain occurs from the use of Reed–Solomon coding due to improvements in the framing, which in turn improves the flexibility and programmability in the construction of Reed–Solomon codewords.
Codewords
A Reed–Solomon codeword is specified as RS(n,k) with s-bit symbols. This means that a Reed–Solomon encoder takes k data symbols of s bits each and adds parity symbols to make an n-symbol codeword. Thus, there are n – k parity symbols of s bits each. A Reed–Solomon decoder can correct up to t symbols that contain errors in a codeword, where 2t = n – k. Figure 2.2 illustrates a typical Reed–Solomon codeword. In this example the codeword is referred to as a systematic code because the data is left unchanged while parity symbols are added. One popular example of a Reed–Solomon code is RS(255,223) with 8-bit symbols. Here each codeword contains 255 bytes, of which 223 represent data and 32 represent redundant parity symbols.
Figure 2.2 A Reed–Solomon codeword.
For this code:
$n = 255,\ k = 223,\ s = 8$
$2t = 32,\ t = 16$
In this example the Reed–Solomon decoder can correct any 16 symbol errors in the codeword; in effect, up to 16 erroneous bytes located anywhere in the codeword can be corrected. The maximum codeword length ($n$) for a Reed–Solomon code is
$n = 2^{s} - 1$
where $s$ represents a given symbol size. Thus, the maximum length of a code with 8-bit symbols ($s = 8$) becomes:
$n = 2^{8} - 1 = 255$ bytes
Key properties. One of the key properties of Reed–Solomon codes is that they can be conceptually shortened by setting a number of data symbols to 0 at the encoder, skipping their transmission, and then reinserting them at the decoder. For example, a (255,223) Reed–Solomon code can be shortened to (200,168). Here the Reed–Solomon encoder would operate on blocks of 168 data bytes, add 55 zero bytes to create a (255,223) codeword, but transmit only the 168 data bytes and 32 parity bytes. At the decoder, the 55 zero bytes would be added to the received data.
Enhancing channel capacity. In a Reed–Solomon coding environment, the ratio of the probability of an error occurring if Reed–Solomon coding is not used to the probability of an error not detected when Reed–Solomon coding is used is referred to as the coding gain. The bit error rate of a communications system, expressed in terms of a 1-bit error in $10^{x}$ bits transmitted, can be enhanced in two ways. First, the signal strength of the transmitter can be increased, which increases the S/N ratio. From Shannon’s Law, the capacity of a channel ($C$) in bits per second (bps) is a function of the bandwidth ($W$) in hertz and the S/N ratio:
$C = W \log_{2}(1 + S/N)$
Thus, an increase in the S/N ratio will boost the capacity of the channel in bits per second. The second method that can be used to enhance the bit error rate is through the addition of Reed–Solomon coding or, more specifically, its coding gain.
Returning to our discussion of ADSL2, on long lines where data rates are lower, ADSL2 obtains higher coding gain from the use of Reed–Solomon coding. This increased gain occurs due to improvements in the ADSL2 frames that enhance the construction of Reed–Solomon codewords.
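The codeword arithmetic described earlier in this section lends itself to a short worked example. The Python sketch below derives the parity count, the number of correctable symbols, and the maximum codeword length for an RS(n,k) code, and then computes the parameters of a shortened code. It covers only the bookkeeping, not the encoding or decoding itself, and the function names are hypothetical.

```python
def rs_parameters(n, k, s):
    """Return (parity_symbols, correctable_symbols, max_codeword_length)."""
    max_n = 2 ** s - 1  # longest codeword possible with s-bit symbols
    if not 0 < k < n <= max_n:
        raise ValueError("require 0 < k < n <= 2**s - 1")
    parity = n - k      # number of parity symbols, 2t = n - k
    t = parity // 2     # symbol errors the decoder can correct
    return parity, t, max_n


def shorten(n, k, data_symbols):
    """Shorten RS(n,k) to carry data_symbols; return (new_n, new_k, zero_padding)."""
    if not 0 < data_symbols <= k:
        raise ValueError("data_symbols must be between 1 and k")
    padding = k - data_symbols  # zero symbols assumed at both ends, never sent
    return n - padding, data_symbols, padding


print(rs_parameters(255, 223, 8))  # (32, 16, 255)
print(shorten(255, 223, 168))      # (200, 168, 55)
```

The second call reproduces the shortening example in the text: 55 zero bytes are implied rather than transmitted, leaving 168 data bytes plus 32 parity bytes on the line.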
Other Improvements
In addition to the previously mentioned improvements of enhanced modulation efficiency, reduced framing overhead, and higher Reed–Solomon coding gain, ADSL2 provides many additional improvements over ADSL that result in an increased data rate being obtained by ADSL2 systems. Some of those additional improvements include power reduction capabilities at each end of the telephone line, which reduce both near-end echo and cross-talk levels; the determination by the receiver of the DMT carriers used to transmit initialization messages, which avoids channel nulls from bridged taps as well as narrow-band interference from AM radio; and improvements in the determination of training signals. As a result of the additional improvements and features added to ADSL2, such systems can provide a 50-kbps increase in data rate and an approximately 600-foot transmission extension in comparison to ADSL.
ADSL2+ Whereas ADSL2 represents a small improvement over ADSL, ADSL2+ represents a significant improvement with respect to the transmission rate obtainable at distances of 5000 feet or less.
Frequency Utilization To obtain a higher data transmission rate, ADSL2+ approximately doubles the bandwidth used to transport data, with the downstream frequency band extended from 1.1 MHz under ADSL and ADSL2 to 2.2 MHz under ADSL2+. Figure 2.3 illustrates ADSL2+ frequency utilization.
Figure 2.3 ADSL2+ frequency utilization.
Comparison to ADSL2
Previously we noted that under Shannon’s Law the data rate obtainable on a channel is proportional to available bandwidth and depends on the signal-to-noise ratio on the channel. Doubling the available bandwidth while holding the S/N ratio constant approximately doubles the obtainable downstream data rate on local loops up to 3000 feet in length, with the advantage of ADSL2+ decreasing as the length of the local loop increases. Table 2.1 provides an approximate comparison of the maximum obtainable data rate of ADSL2 and ADSL2+ with respect to the length of the local loop.
In addition to increasing the downstream frequency band to 2.2 MHz, ADSL2+ includes an optional operational mode that can be used to double upstream bandwidth. However, because Web surfing and the use of IPTV result in relatively small upstream queries followed by larger deliveries of downstream data, this option may not be important for most ADSL2+ subscribers.
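The effect of doubling bandwidth can be checked directly against Shannon’s Law. The Python sketch below treats each downstream band as a single channel with one S/N ratio, which is a simplification, and the 30-dB figure is a hypothetical value chosen only for illustration.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Channel capacity C = W * log2(1 + S/N) in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def db_to_linear(snr_db):
    """Convert an S/N ratio in decibels to a linear power ratio."""
    return 10 ** (snr_db / 10)


snr = db_to_linear(30)                        # hypothetical 30-dB S/N ratio
adsl2 = shannon_capacity_bps(1.1e6, snr)      # 1.1-MHz downstream band
adsl2plus = shannon_capacity_bps(2.2e6, snr)  # 2.2-MHz downstream band
print(f"capacity ratio: {adsl2plus / adsl2:.2f}")
```

With the S/N ratio held constant the ratio is exactly 2. On a real loop the higher frequencies attenuate more quickly with distance, which is consistent with the ADSL2+ advantage shrinking on longer loops.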
Table 2.1 Comparing ADSL2 and ADSL2+ Maximum Downstream Data Rates (Mbps)
Local Loop Distance (ft)    ADSL2    ADSL2+
1,000                       12.5     26.0
2,000                       12.5     26.0
3,000                       12.5     25.5
4,000                       12.5     24.5
5,000                       12.5     20.0
6,000                       11.0     15.5
7,000                       10.0     12.5
8,000                        9.5      9.5
9,000                        7.5      7.5
10,000                       6.0      6.0
FTTN and ADSL
In concluding our discussion of the different versions of ADSL, we will examine why telephone companies will need to support ADSL2+ if they intend to provide IPTV services via FTTN. Figure 2.4 illustrates the approximate bandwidth capacity required to deliver HDTV, SDTV, and digitized voice into a typical home. In examining Figure 2.4, note that the typical home of the future will probably have four televisions, of which two will be high definition. Although readers may question the number of HDTVs in a typical home of the future due to their current cost, it is important to note that the prices of both plasma and LCD TVs are rapidly dropping. Within a few years the price of a 32-inch high-definition LCD TV may be under $400, which would result in a mass market for this type of flat-panel television.
Because compressed HDTV can currently be transported at a data rate between 8 and 10 Mbps, the simultaneous transmission of two HDTV signals would require between 16 and 20 Mbps of bandwidth. Similarly, the use of two SDTVs would require between 2 and 4 Mbps. Thus, the future home with four televisions, of which two are HD and two are SD, would require between 18 and 24 Mbps of bandwidth, assuming all four televisions were in use simultaneously. In comparison, high-speed data (HSD) between 2 and 4 Mbps should be more than sufficient for Web-based activities, and digitized voice requires significantly less bandwidth than the 64 kbps required by a conventional landline telephone system. Thus, the total bandwidth required to support the home of the future can be expected to be between 20 and 28 Mbps.
Figure 2.4 IPTV bandwidth requirements.
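The bandwidth budget for the home of the future is simple addition, and a short Python sketch makes the ranges easy to recompute for other households. The per-stream figures are taken from the discussion of Figure 2.4, with each SDTV assumed to need 1 to 2 Mbps because two of them need 2 to 4 Mbps. Voice is omitted, as it adds well under 64 kbps. The function name is hypothetical.

```python
# (low, high) bandwidth in Mbps per stream, from the text.
HDTV_MBPS = (8, 10)
SDTV_MBPS = (1, 2)  # two SDTVs need 2 to 4 Mbps in total
HSD_MBPS = (2, 4)   # high-speed data


def household_budget(hdtv_sets, sdtv_sets, include_data=True):
    """Return the (low, high) downstream requirement in Mbps for one home."""
    low = hdtv_sets * HDTV_MBPS[0] + sdtv_sets * SDTV_MBPS[0]
    high = hdtv_sets * HDTV_MBPS[1] + sdtv_sets * SDTV_MBPS[1]
    if include_data:
        low += HSD_MBPS[0]
        high += HSD_MBPS[1]
    return low, high


print(household_budget(2, 2, include_data=False))  # (18, 24) video only
print(household_budget(2, 2))                      # (20, 28) video plus data
```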
From Table 2.1, note that ADSL2+ can support a data rate at or above 20 Mbps at distances up to 5000 feet. Thus, for FTTN to be successful, the fiber terminal point within a neighborhood needs to be positioned such that the maximum local loop into a potential subscriber’s premises is less than or equal to 5000 feet. Because ADSL and ADSL2 support lower data rates, their use is practical only if the subscriber does not require the ability to view HDTV in real time.
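The loop-length conclusion can be reproduced by searching the approximate figures of Table 2.1. The Python sketch below hard-codes those values and returns the longest listed loop on which a technology still meets a required rate. The function name is hypothetical, and distances between table rows are not interpolated.

```python
# Approximate maximum downstream rates in Mbps from Table 2.1,
# keyed by local loop distance in feet: (ADSL2, ADSL2+).
TABLE_2_1 = {
    1000: (12.5, 26.0), 2000: (12.5, 26.0), 3000: (12.5, 25.5),
    4000: (12.5, 24.5), 5000: (12.5, 20.0), 6000: (11.0, 15.5),
    7000: (10.0, 12.5), 8000: (9.5, 9.5), 9000: (7.5, 7.5),
    10000: (6.0, 6.0),
}
COLUMN = {"ADSL2": 0, "ADSL2+": 1}


def max_loop_ft(required_mbps, technology):
    """Longest listed loop meeting required_mbps, or None if none does."""
    column = COLUMN[technology]
    best = None
    for distance in sorted(TABLE_2_1):
        if TABLE_2_1[distance][column] >= required_mbps:
            best = distance
        else:
            break  # rates never increase with distance in this table
    return best


print(max_loop_ft(20, "ADSL2+"))  # 5000
print(max_loop_ft(20, "ADSL2"))   # None
```

Note that the 28-Mbps upper end of the household budget is not met at any distance listed in Table 2.1, so the 5000-foot figure applies to the 20-Mbps lower bound.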
Passive Optical Network Although the passive optical network (PON) was invented at British Telecom laboratories in 1982, it wasn’t until 1987 that early field trials in the use of the technology occurred. In 1993 Deutsche Telekom began the installation of PON architecture on a massive scale in Eastern Germany, and in 1999 Bell South completed its beta testing of PON architecture to 400 homes in the Atlanta area. Based on the success of numerous field trials around the globe, PON technology has rapidly increased in use and provides a reasonable-cost method for RBOCs to create the backbone infrastructure necessary to deploy FTTN or fiber-to-the-curb (FTTC). In this section we turn our attention to PON to obtain an understanding of how the technology operates.
Overview Today most telecommunications networks are constructed using active components. Such components operate by consuming power and normally consist of memory, processors, and other devices that are active and process information, such as routers, multiplexers, and line drivers. In comparison, all active components between a telephone company central office and the customer premises are eliminated when a PON is employed. Employment of a PON results in the installation of passive optical components that guide communications traffic based on splitting the power of optical wavelengths to endpoints along the route. Through the replacement of active components by passive devices, the service provider eliminates the need to power and service active components in the transmission loop. This in turn reduces the service provider’s cost of operations.
Equipment
The backbone of a PON consists of couplers and splitters that passively restrict or pass light. Such devices have neither power nor processing requirements. As a result, the mean time between failures (MTBF) of a PON is virtually unlimited, which further lowers the operating cost of this type of network.
Figure 2.5 illustrates the potential use of a PON to interconnect a telephone company central office and subscriber premises by providing a backbone to neighborhoods where ADSL2, ADSL2+, or VDSL can be used to provide short-distance but high-speed transmission over existing copper into subscriber homes and offices. At the central office an optical line terminator (OLT) is installed, and a set of associated optical network terminations (ONTs) and optical network units (ONUs) are installed at locations where optical fiber terminates at the neighborhood (FTTN) or at a building (FTTB). Between the two fiber endpoints is the optical distribution network (ODN), which consists of fiber, passive splitters, and couplers.
The OLT either generates light signals on its own or receives SONET (synchronous optical network) signals, such as an OC-12, from a colocated SONET crossconnect. The OLT then broadcasts traffic through one or more outbound ports, where the light signal flows until it reaches an applicable ONU or ONT, which then converts the optical signal into an electrical signal. ONTs are used when the fiber extends into the customer premises, whereas ONUs are used when fiber terminates outside the customer facility. Thus, turning our attention to Figure 2.5, note that ONTs are shown when fiber is routed into a building (FTTB), whereas ONUs are employed when fiber is routed to a neighborhood (FTTN).
As illustrated in Figure 2.5, a single fiber can be split several times, enabling fiber’s large data transmission capacity to be routed directly to buildings where multiple offices and apartments are located as well as into neighborhoods. The main fiber run on a PON can operate at 155 Mbps, 622 Mbps, 1.25 Gbps, or 2.5 Gbps. At each fiber termination point, whether next to or inside a building or within a neighborhood, a version of ADSL or VDSL is used to enable high-speed communications over existing metallic wiring into the subscriber’s premises. Thus, the PON backbone is shared among many customers, lowering the overall cost of deploying the optical network. In addition, the ability to use existing metallic wiring instead of providing individual fiber connections to subscribers further reduces the cost associated with establishing a PON.
Figure 2.5 Using a passive optical network as a backbone linking ADSL2 connections. (Diagram labels: national hub, regional hubs, local hubs, to subscribers.)
Operation
The transmission of data from the central office to the customer premises differs from the manner by which data flows from the customer premises to the central office. In the downstream direction, when data is transmitted toward the customer premises, it is broadcast from the OLT to each ONT, with each ONT processing the data destined to it by matching the address located in the protocol header. In comparison, upstream transmission is a bit more complicated due to the shared media functionality of the ODN. This is because transmission from each ONT to the OLT needs to be coordinated to avoid the occurrence of collisions. In the upstream direction of a PON, data is transmitted using a time division multiple access (TDMA) protocol, where dedicated transmission slots are assigned to each ONT. Because the time slots are synchronized, transmissions from different ONTs do not collide.
Due to the need to synchronize upstream transmissions, PONs normally represent an asymmetrical transmission method. For example, a PON that delivers 622 Mbps downstream to users might provide a 155-Mbps upstream capacity. Because movies and Web surfing involve relatively small user upstream requests that are followed by large downstream transmissions, PON represents a strong complement to ADSL and VDSL that can be used within buildings or within a neighborhood.
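The two directions of PON traffic described in this section can be modeled in a few lines. The Python sketch below is a toy model only: downstream frames are broadcast and filtered by address, and upstream slots are handed out one after another so that no two ONTs transmit at once. The slot length, the guard interval, and all names are hypothetical, and a real PON assigns slots through its own control protocol.

```python
def downstream_filter(frames, ont_id):
    """Keep only the payloads whose header address matches this ONT."""
    return [payload for destination, payload in frames if destination == ont_id]


def build_upstream_schedule(ont_ids, slot_us, guard_us=0):
    """Assign each ONT one dedicated (start, end) slot, in microseconds."""
    schedule = []
    start = 0
    for ont_id in ont_ids:
        schedule.append((ont_id, start, start + slot_us))
        start += slot_us + guard_us  # assumed idle gap between slots
    return schedule


def has_collision(schedule):
    """Return True if any two slots overlap in time."""
    ordered = sorted(schedule, key=lambda entry: entry[1])
    return any(a[2] > b[1] for a, b in zip(ordered, ordered[1:]))


broadcast = [("ont-1", "video A"), ("ont-2", "video B"), ("ont-1", "web page")]
print(downstream_filter(broadcast, "ont-1"))

slots = build_upstream_schedule(["ont-1", "ont-2", "ont-3"], slot_us=100, guard_us=10)
print(slots, has_collision(slots))
```

Because every ONT waits for its own slot, upstream capacity is divided among the ONTs sharing the fiber, which fits the asymmetrical rates quoted in the text.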
Types of PONs
Today communications carriers employ several types of PONs, with the key difference among types being the upper layer protocols used. Although in general the technology remains the same, the use of different upper layer protocols results in the use of different physical layers. The end result is that the transmission rates obtainable on different types of PONs can vary considerably. Table 2.2 lists four common types of PONs.
Table 2.2 Common Types of Passive Optical Networks
ATM PON (APON)
Broadband PON (BPON)
Ethernet PON (EPON)
Gigabit PON (GPON)
APON An ATM (Asynchronous Transfer Mode)-based PON operates similarly to an ATM network. That is, subscribers establish virtual circuits (VCs) across the APON to a specific destination, such as an Internet service provider’s premises. A number of VCs are bundled into a virtual path (VP) for faster switching through the carrier’s network.
BPON
The initial PON specification used ATM as its signaling protocol, resulting in the term APON being used to reference this type of network. Because that term could be misleading, suggesting that only ATM services could be provided to end users, the Full Service Access Network (FSAN) group decided to broaden the name to Broadband PON (BPON). This name can reference the use of ATM, Ethernet, and Gigabit Ethernet. Both APON and the ATM mode of BPON use a 53-byte ATM cell to transport data. Data rates up to 622 Mbps symmetrical and 1244/622 Mbps asymmetrical have been standardized by the ITU in the G.983.x series of standards.
EPON
Ethernet-based PONs (EPONs) evolved from the set of extensions developed by the IEEE for its 802.3 Media Access Control (MAC) and MAC Control sublayers, together with a family of physical (PHY) layers. These extensions were developed to enable the use of Ethernet for subscriber access networks, which is referred to by the phrase “Ethernet in the first mile (EFM).” The extensions to Ethernet’s physical layers included optical fiber and unshielded twisted-pair (UTP) copper cable used for point-to-point connections, with EFM also providing support for EPONs, in which a point-to-multipoint network topology is implemented through the use of passive optical splitters and couplers. EPON is based on the use of the Multi-Point Control Protocol (MPCP), which represents a new function defined within the MAC Control sublayer. Under MPCP, messages and timers are used to control access to the point-to-multipoint network. EPON was standardized by the IEEE as the 802.3ah standard.
GPON
The fourth type of PON is based on the use of high-speed Ethernet, referred to as Gigabit Ethernet, resulting in the term GPON being used to reference this network access method. GPON provides support for data rates of 622 Mbps and 1.25 Gbps symmetrically as well as 2.5 Gbps downstream and 1.25 Gbps upstream. GPON provides support for Ethernet at 10 Mbps and 100 Mbps, ATM, TDM, and SONET connectivity. GPON is not backward compatible with BPON, because its goal of operating at higher data rates required a modified physical layer. However, the GPON standards, G.984.1 through G.984.4, build on the work done for BPON; they were ratified during 2003 and 2004.
2.2 The Pay-TV Market A second market driver for IPTV is the pay-TV market. In this section we will discuss both the conventional pay-TV market as well as evolving competition for consumer funds and the effect of this driver on the emerging IPTV infrastructure.
Overview
The pay-TV market has evolved from cable television and satellite TV providers to a variety of businesses that offer television shows and movies over the Internet. Although revenues from cable television and satellite TV providers exceeded $50 billion in 2005 and rentals and sales of movies and television shows over the Internet are currently less than $50 million, a profound market shift is under way, driven by the following four factors:
■ The substitution of broadband access for dial-up service
■ The introduction of the video iPod and similar products
■ The availability of TV shows for sale by major television networks
■ The growth in the number of Internet sites offering the sale and rental of movies, music videos, and television shows
Broadband Access The substitution of broadband for dial-up Internet access resulted in millions of subscribers being able to download large data files within a reasonable period of time. This in turn has created a growing market of consumers who download movies, television shows, music videos, and videos of special events, either for direct viewing on their computer or for transfer to another device, such as directly onto a video iPod or by creating a DVD that can be viewed on their television using a DVD player or while traveling by using a portable DVD player.
Dial-Up Delays
Until broadband transmission became commonly available at a relatively low monthly cost, consumers were restricted to using dial-up, where the maximum data rate was 56 kbps. Downloading a one-hour television show or relatively short movie could require 1.175 GB of data to be received. At a download data rate of 56 kbps, this activity would require:
$\dfrac{1.175 \times 10^{9}\ \text{bytes} \times 8\ \text{bits/byte}}{56{,}000\ \text{bps}} \approx 167{,}857\ \text{seconds}$
The above computed download time is equivalent to approximately 2798 minutes, or approximately 46.6 hours! This relatively long time to download a one-hour video explains why dial-up Internet access makes it almost impossible to download anything but relatively short video clips.
DSL Offerings In comparison to dial-up, DSL subscribers can select from a variety of offerings ranging from a slow service at 256 kbps, which can be used for Web surfing but is still impractical for downloading full-length videos in a timely manner, to faster services that can provide data rates as high as 20 Mbps at distances up to approximately 5000 feet from a central office or fiber termination point.
Cable Modem Offerings Cable modem subscribers more often than not are offered a low-speed data rate similar to those provided to DSL subscribers for competitive purposes. However, a key difference between DSL and cable modem rates is in their high-speed offerings. Most cable modem high-speed offerings begin at data rates where DSL offerings stop, with, for example,
Cablevision announcing during November 2005 that it was increasing the maximum download speed to 15 Mbps from 10 Mbps, which was already higher than most DSL offerings.
Download Time Comparison
The rollout of IPTV services by AT&T (formerly known as SBC Communications), Verizon, and other regional phone companies resulted in an increase in DSL data rates as the copper “last mile” is either shortened by FTTN or replaced by the routing of fiber to the premises. This action has resulted in some locations now being offered DSL download speeds from 5 to 30 Mbps. Table 2.3 compares the time required to download a one-hour video at a 56-kbps dial-up rate to seven popular DSL and cable modem rates. Download times are shown in terms of the approximate number of hours required to download the one-hour video at each data rate.
In examining the entries in Table 2.3, it is obvious that the higher data rates provided by broadband communications significantly lower the time required to download a video. Note that at a data rate of 4096 kbps, which is representative of high-speed DSL and medium-speed cable modem services, the download time is reduced to slightly more than a half hour, whereas at a data rate of 8192 kbps only approximately a third of an hour is required. For many consumers, both are reasonable time periods, especially when compared to the more than 46 hours required when downloading occurs via dial-up. Thus, the growth in the number of people subscribing to broadband Internet access increases the potential of those subscribers to download different types of video for viewing on their PC or on another device.
Table 2.3 Download Time for One-Hour Video
Data Rate        Hours Required
56 kbps          46.64
256 kbps         10.16
1024 kbps         2.54
2048 kbps         1.27
4096 kbps         0.64
8192 kbps         0.32
16,384 kbps       0.16
32,768 kbps       0.08
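The entries in Table 2.3 follow from dividing the size of the video in bits by the data rate. The Python sketch below repeats that calculation for each rate, assuming the 1.175-GB one-hour video used in the dial-up example and treating 1 kbps as 1000 bps. The function name is hypothetical. The results agree with the table to within a few hundredths of an hour; the small differences presumably reflect rounding in the original figures.

```python
VIDEO_BYTES = 1.175e9  # one-hour video size used in the dial-up example


def download_hours(rate_kbps, size_bytes=VIDEO_BYTES):
    """Hours needed to download size_bytes at rate_kbps (1 kbps = 1000 bps)."""
    seconds = size_bytes * 8 / (rate_kbps * 1000)
    return seconds / 3600


for rate in (56, 256, 1024, 2048, 4096, 8192, 16384, 32768):
    print(f"{rate:>6} kbps  {download_hours(rate):6.2f} hours")
```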
Introduction of Video Products Although the introduction of the Apple Computer video iPod in October 2005 received considerable press attention, it is just one of many types of portable video products to reach the market. If you travel by train or airplane, chances are high that you will see several people in the railway car or the airplane cabin watching a video using a portable video player. Currently, the majority of portable video players are DVD devices with 5-, 7-, 10-, or 11-inch displays. Although the majority of people currently using portable DVD players view purchased or rented DVDs, the ability to download many types of video via the Internet as well as to use a digital video recorder (DVR) to create DVDs for later viewing can be expected to alter the use of players. This technique of viewing previously recorded video at a different time is referred to as time-shifting, whereas viewing video at a different location is referred to as video-shifting or view-shifting. Because portable DVD players have a larger screen than the Apple Computer video iPod and a virtually unlimited storage capacity because multiple DVDs are easily packed and do not take up much storage space, for the foreseeable future they will more than likely represent the preferred method for viewing videos in a mobile environment.
Availability of TV Shows A third factor that is facilitating a market shift away from conventional pay TV is the significant increase in the availability of TV shows. First there was Apple Computer’s agreement with ABC in October 2005 to make several television programs available for downloading via the Internet; then NBC and CBS announced deals with cable and satellite providers in November 2005 that would commence operation at the beginning of 2006. Under the deals struck with CBS and NBC, cable and satellite subscribers would be able to watch popular shows anytime after those shows were aired. The announced deals allow viewers to order episodes of such primetime shows as Law & Order: Special Victims Unit and Survivor for 99 cents apiece.
CBS Shows Under the deal between Comcast and CBS, customers of the cable television firm with digital service would be able to purchase episodes of four primetime shows beginning in January 2006 starting a few hours after the shows aired on the network. Although the shows would include commercials, viewers would be able to fast-forward through them.
NBC Shows In a separate agreement that was announced at the same time, NBC agreed with DirecTV to make a series of programs available, including some that air on its cable channels, such as Sci-Fi, Bravo, and USA. Unlike CBS, NBC is making available commercial-free shows, with each show being billed at 99 cents. The DirecTV VOD service will be available to subscribers who obtain a new set-top box and have a TiVo set-top DVR. The recorder will have 160 hours of recording capacity, but only 100 hours will be available for consumers to use. The other disk space in the recorder will be used to store approximately five hours per week of NBC primetime television shows as well as other programming transmitted by DirecTV. Customers can then purchase and view the stored programs. Whereas DirecTV’s offering is not a conventional on-demand offering, Comcast is providing a more flexible strategy by adding TV shows to its library of 3800 on-demand offerings, most of which represent free content, such as real estate listings. Because Comcast’s infrastructure includes many on-demand channels, its subscribers have more flexibility than the subscribers to DirecTV, although both offerings pale in comparison to the potential of virtually unlimited offerings that can be expected to be provided by IPTV systems.
AOL and Warner Brothers In concluding our discussion of the availability of TV shows, I would be remiss if I did not mention an agreement between AOL and Warner Brothers that was announced in mid-November 2005. Under the announced agreement by these two divisions of Time Warner Inc., vintage television shows made by Warner Brothers, such as Welcome Back, Kotter, Wonder Woman, and Kung Fu, will be offered free online by AOL. Although it will be free to view the programs, the AOL–Warner Brothers service will include 15-second commercials that viewers cannot bypass. In addition, video delivery will occur via video feeds, which will prevent viewers from recording shows. Referred to as “In2TV,” this service was scheduled to commence in January 2006 with six channels providing shows ranging from comedy to drama. Although it is difficult to compete with a free product, the fact that the Warner Brothers offerings cannot be time-shifted or place-shifted leaves questions about its viability. That is, how many people will want to watch an entire TV show on their PC screen, and will decades-old fare be a draw? Although such questions may take awhile to be answered, the AOL–Warner Brothers agreement at a minimum will increase the availability of TV shows on the Internet.
Cell Phone Television Another area of pay television that is emerging as a market driver for IPTV is cell phone television. In November 2005 Sprint Nextel Corporation announced a deal with several of the largest cable television companies, including Comcast, Cox Communications, and Time Warner, that lets the cable companies sell Sprint wireless services along with their own TV, phone, and high-speed Internet access. Under this deal, a single voice mailbox would be available for both the cellular and the wired phone, the amount of content available for watching on certain types of cell phones would significantly increase, and customers would be able to watch shows stored on their home DVRs as well as program their home recorders via their cell phones. Although it is still premature to discuss the fees that will be associated with cell phone television, subscribers can expect to pay between $99 and $250 for an applicable cell phone and approximately $15 per month for the ability to view television on the “small screen.”
IP Datacast Initially, the majority of television viewed on cell phones will be broadcast TV. However, in the future we can expect a combination of digital broadcast and the IP to provide a new broadcast technology referred to as IP datacast over DVB-H (digital video broadcast, handheld), which for simplicity is referred to as IP datacast. With IP datacast, the quantity of data transmitted to represent a TV channel is reduced to between 128 and 384 kbps because the screen on which the video will be observed is smaller than a regular television. This reduction in the required data rate permits approximately 30 small-screen TV channels to be broadcast over the bandwidth now used to broadcast a single analog channel. Because analog transmission will be phased out over the next few years, it’s quite possible that hundreds of small-screen TV channels could become available. In addition, because each IP datacast-ready cell phone would have a unique IP address, it becomes possible for cell phone users to select a particular program that they could view on an “on-demand” channel, resulting in the development of a small-screen version of wireless IPTV.
Growth in Internet Video Content Just a few years ago, the availability of videos on the Internet was more than likely from “hacker” Web sites that provided free copies of movies prior to the release of the movie on DVD. In fact, many movies that made
their way to the Web were the result of a person visiting a movie theater with a hand-held digital camcorder, which resulted in some rather interesting movements when the person making the bootleg copy had an itch to scratch. Today the number of pay-video sites offering legitimate first-rate movies has significantly expanded, providing tens of thousands of music videos, movies, and television shows. The growth in Internet video content enables diverse subscriber viewing habits to be accommodated and provides the developers of IPTV with a potential revenue stream once the infrastructure is developed to enable rapid downloads of content as well as negotiate the availability of content from movie studios, TV networks, and other content providers.
Summary The pay-TV market consists of a series of submarkets, including video on demand and TV shows delivered to cell phones. Although the technology used for each submarket differs, most have a similar capability in that they allow a subscriber to time-shift a video to a more convenient time. Because traditional telephone companies are investing billions of dollars in IPTV, to be successful they must be competitive. Thus, one can expect the various types of pay TV mentioned in this section to represent benchmarks for IPTV delivery. Because AT&T (previously known as SBC Communications) and Verizon also operate cell phone service, it is reasonable to expect their IPTV delivery to eventually provide a mechanism by which subscribers can download video into their homes as well as onto their cell phones.
2.3 Convergence of Voice, Data, and Video Another key market driver for IPTV is the convergence of voice, data, and video. Because each can be represented in digital format, it is both possible and easy to transmit voice, data, and video over a common network infrastructure. At their destination, voice, data, and video can be stored on a common device, such as an Apple video iPod, or they can be delivered to a destination defined by an IP address, such as a home computer, a television set-top box, or another addressable device. In this section we will note how each individual data stream can be transported within a common IP data stream.
Voice Until a few years ago the primary buzz when discussing convergence was VoIP, also referred to as IP telephony, with the latter term adding fax to digitized voice as being capable of being transmitted over a public or private IP data network. When voice is transmitted over an IP network, a voice coder converts the analog conversation into a digital data stream. The coder converts voice at a specific data rate based on the type of coder employed. Associated with the use of each coder is a delay time, expressed in milliseconds (ms), and a mean opinion score (MOS), representing a subjective scale from 1 (bad) to 5 (excellent) of the perceived clarity of a conversation. The codec delay time is extremely important because it is one of several delays that cumulatively need to be less than 150 ms to obtain a high MOS. Other delays include the egress and ingress data links and the delay time packets experience flowing through a network. Table 2.4 lists six voice coding standards, their digital data rate, delay, and MOS. In examining the entries in Table 2.4, note that PCM and ADPCM are used primarily by RBOCs when calls flow through their legacy switches. The other voice digitization standards are used primarily in a VoIP environment because they significantly reduce the data transport needed to convey a voice conversation. As you might surmise from examining the entries in Table 2.4, as the data rate required to transport a digitized voice conversation decreases, its delay generally increases, and the quality of the conversation in terms of its MOS degrades.

Table 2.4 Popular Voice Coding Standards

Standard   Description                                              Data Rate (kbps)   Delay (ms)   MOS*
G.711      Pulse code modulation (PCM)                              64                 0.125        4.8
G.726      Adaptive differential PCM (ADPCM)                        16, 24, 32         0.125        4.2
G.723.1    Algebraic codebook excited linear predictor (ACELP)      5.3                37.5         3.98
G.728      Low-delay codebook excited linear predictors (LD-CELP)   16                 2.5          4.2
G.729      Conjugate-structured algebraic CELP (CS-ACELP)           8                  10.0         4.2
G.723.1    Multipulse maximum likelihood quantizer (MP-MLQ)**       5.3, 6.3           30.0         3.5

* Mean opinion score. ** LD-CELP variation.
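The 150-ms cumulative limit mentioned above can be illustrated with a small delay-budget check. This is a minimal sketch: the codec delays are the values read from Table 2.4, whereas the access-link and network delay figures used in the example are hypothetical and chosen only for illustration.

```python
# Codec delays in milliseconds, as read from Table 2.4.
CODEC_DELAY_MS = {
    "G.711": 0.125,
    "G.726": 0.125,
    "G.728": 2.5,
    "G.729": 10.0,
    "G.723.1 (MP-MLQ)": 30.0,
    "G.723.1 (ACELP)": 37.5,
}

BUDGET_MS = 150.0  # cumulative one-way delay limit cited in the text


def remaining_budget_ms(codec, access_links_ms, network_ms):
    """Budget left after codec, access-link, and network delays are summed."""
    used = CODEC_DELAY_MS[codec] + access_links_ms + network_ms
    return BUDGET_MS - used


if __name__ == "__main__":
    # Hypothetical path: 20 ms on the ingress and egress links, 60 ms in the network.
    for codec in CODEC_DELAY_MS:
        left = remaining_budget_ms(codec, access_links_ms=20, network_ms=60)
        status = "within budget" if left >= 0 else "over budget"
        print(f"{codec:<18} {left:7.3f} ms left ({status})")
```

The point of the sketch is that a low-rate coder consumes a noticeably larger share of the budget before a single packet has entered the network.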
The delay listed in Table 2.4 is only a portion of the end-to-end delay that can adversely affect VoIP. To ensure high-quality voice communications requires both sufficient bandwidth to minimize transmission delays as well as routers within the network that can prioritize voice traffic, with the latter requiring such traffic to be tagged to indicate its need for prioritization. This is broadly referred to as quality of service (QoS), which enables routers to reduce delay by giving voice traffic a higher priority than data. Today cable television operators are providing a VoIP service to entice subscribers to obtain a troika of services (voice, Internet access [data], and video) from one vendor. However, although voice, data, and video flow over a common coaxial cable, the troika of services does not flow over a common IP network. In comparison, AT&T and Verizon are rapidly constructing a large fiber-based backbone to provide the capability to deliver voice, video, and data to their customers over a common IP network. Whereas the driving force behind real estate pricing is location, the ability to provide convergence can be summed up as “bandwidth.”
Data Both RBOCs and cable television operators offer subscribers various Internet access offerings, ranging in monthly price and data rate from a minimal monthly cost at a data rate five times dial-up to more costly and higher data rate offerings. Because RBOCs provide this service over their existing last mile line and cable television operators provide a similar capability within their coaxial cable routed into homes, we can say that technically data is physically converged on the same “pipe” into homes. However, true convergence will occur in an IPTV environment, where data, voice, and video will share a common bandwidth, being distinguished from one another by the header in each packet that will indicate the type of data carried.
Video The convergence of video onto the same pipe with voice and data is similar to the delivery of voice under certain situations and data under other conditions. That is, when viewing a real-time video stream, video must be delivered with minimum delay so that frames are not distorted in time. In comparison, when a video is downloaded, delay between frames is not an issue because once the video is downloaded it will then be viewed from disk. This means that real-time video packets must be tagged to enable routers to prioritize the content of such packets.
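The text does not name the tagging mechanism. One standard option is the Differentiated Services Code Point (DSCP), carried in the upper six bits of the former IP type-of-service byte. The sketch below assumes that mechanism; the code points shown (EF for voice, AF41 for real-time video) are common conventions rather than choices made by the book, and whether routers honor the marking depends entirely on network policy. The socket helper relies on `socket.IP_TOS`, which is available on Linux and most Unix-like platforms.

```python
import socket

DSCP_EF = 46    # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34  # Assured Forwarding 41, often used for real-time video


def tos_byte(dscp):
    """Place a 6-bit DSCP value in the upper bits of the 8-bit TOS field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_marked_udp_socket(dscp):
    """Return a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp))
    return sock


if __name__ == "__main__":
    print(hex(tos_byte(DSCP_EF)), hex(tos_byte(DSCP_AF41)))
```

A downloaded video, by contrast, could be sent with the default marking, because delay between frames does not matter once the file is played from disk.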
A second constraint concerning the convergence of video is not technical but concerns content availability. With the ability of IPTV to provide a virtually unlimited number of shows, providers must license applicable material. As their digital library expands, so will the probability of a customer ordering a pay-per-view video or subscribing to the service due to the content matching the subscriber’s preference.
Video Distribution Network The ability to deliver requested content will require the development of a video distribution network. That network will consist of a three-tier series of hubs at the national, regional, and local level.
National Hub At the national level a central bank of servers connected to terrestrial and satellite feeds would provide national video content. The national hub would encode and compress video as well as serve as a central repository for VOD content.
Regional Hubs Regional hubs would receive national content from the national hub as well as connect to and receive local content and insert local advertisements. In addition to facilitating delivery of VOD content, the regional hub would temporarily store popular movies and other content, such as new pay-per-view movies.
Local Hub The lowest level in the video distribution network would be the local hub. This facility would receive national and local content from a regional hub. In addition, the local hub would insert more localized advertisements as well as transmit content directly to the subscriber. Figure 2.6 illustrates the video distribution network hierarchy described.
2.4 Evolution of Video Compression In concluding this chapter we will briefly discuss the key technology that enables IPTV to be a reality. That technology is video compression.
Figure 2.6 Video distribution network hierarchy. (Diagram labels: telephone central office, OLT, ONU with FTTN and ADSL2+, ONT with FTTB.)
Legend: OLT: Optical line termination; ONU: Optical network unit; ONT: Optical network terminal; FTTN: Fiber to the neighborhood; FTTB: Fiber to the building
Overview Prior to the development of digital video, most data compression techniques were developed to operate on text data. One common text-based data compression technique is run-length encoding, in which repeating strings of the same character are replaced by a trio of characters that indicate compression has occurred, the character that was compressed, and the number of characters that were compressed. A second popular text-based data compression technique is actually a series of string-based techniques based on the work of Lempel and Ziv, two professors who worked at the Technion, Israel’s equivalent of MIT. The problem with these techniques is the fact that their typical compression ratio is between 2:1 and 3:1, which is insufficient for video.
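The run-length scheme described above, in which a run is replaced by a trio of characters, can be sketched as follows. This is an illustrative implementation rather than any particular product's format: the flag character, the minimum run length, and the 255-character cap on a run are all assumptions made for the example.

```python
FLAG = "\x1d"   # hypothetical "compression has occurred" indicator character
MIN_RUN = 4     # a trio costs 3 characters, so shorter runs are left alone
MAX_RUN = 255   # assumed cap so the count fits in one character


def rle_encode(text):
    """Replace each long run with the trio: FLAG, character, count."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        run = 1
        while i + run < len(text) and text[i + run] == ch and run < MAX_RUN:
            run += 1
        # A literal FLAG in the input is always written as a trio so the
        # decoder never mistakes it for the start of a compressed run.
        if run >= MIN_RUN or ch == FLAG:
            out.append(FLAG + ch + chr(run))
        else:
            out.append(ch * run)
        i += run
    return "".join(out)


def rle_decode(data):
    """Expand every trio back into its original run."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == FLAG:
            out.append(data[i + 1] * ord(data[i + 2]))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = "AAAAAAAABCCCCCCD"
    packed = rle_encode(sample)
    print(len(sample), len(packed), rle_decode(packed) == sample)
```

Because the decoder reproduces the input exactly, the technique only pays off when the data contains long runs, which is one reason its compression ratio on ordinary text is modest.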
Lossless vs. Lossy The two previously mentioned text-based compression techniques are referred to as lossless compression. That is, compressed data can be decompressed into its original data stream. In comparison, lossy compression was
initially developed to reduce the storage and transmission requirements of pictures. Under lossy compression, blocks of pixels are compared and assumed to be equivalent even if they differ by a few pixels; thus, lossy compression is not fully reversible. Because nearly identical blocks are treated as equal, lossy compression achieves a significantly higher compression ratio than lossless compression, which makes it more suitable for compressing digital video. Examples of lossy compression techniques include Joint Photographic Experts Group (JPEG) for still photos and the Moving Picture Experts Group (MPEG) series of compression techniques used for digital video.
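The difference between the two families can be demonstrated with a toy example. The sketch below is not JPEG or MPEG; it uses simple quantization as a stand-in for "treating nearly equal pixels as equal," and Python's standard `zlib` module (a Lempel-Ziv-based compressor) as the lossless stage. The synthetic pixel data and the step size are assumptions made for illustration.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic scan data: slowly changing brightness plus small random noise.
original = bytes(((i // 64) * 4) % 256 + rng.randrange(4) for i in range(4096))

STEP = 4  # pixel values that differ by less than STEP are treated as equal


def quantize(pixels, step=STEP):
    """Lossy step: snap every value down to the nearest multiple of step."""
    return bytes((p // step) * step for p in pixels)


lossless = zlib.compress(original)      # fully reversible
lossy_pixels = quantize(original)       # information is discarded here
lossy = zlib.compress(lossy_pixels)     # far more compressible afterward

if __name__ == "__main__":
    print("original bytes:      ", len(original))
    print("lossless compressed: ", len(lossless))
    print("lossy compressed:    ", len(lossy))
    print("lossless reversible: ", zlib.decompress(lossless) == original)
    print("lossy reversible:    ", zlib.decompress(lossy) == original)
```

The lossless path returns the exact input, while the quantized path cannot, but it compresses to a small fraction of the size because the discarded noise was the hardest part of the data to encode.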
Compression Requirements We can think of digital video as a sequence of images displayed to convey the illusion of motion. Each image is a two-dimensional frame made up of picture elements (pels), or pixels. Each pixel has a luminance and chrominance value, where luminance represents a pixel’s intensity and chrominance represents the color of the pixel. A high-quality image would use 3 bytes (or 24 bits) per pixel to represent color, a technique referred to as true color. Different frame rates are used for different types of digital video applications. Whereas movies use a frame rate of 24, North American television has a frame rate of approximately 30 (29.97) and European television’s frame rate is 25. The width and height of a digital video image comparable to analog television is 640 × 480 pixels for SDTV, whereas one of several HDTV standards requires 1920 × 1080 pixels. In addition, whereas 2 bytes (or 16 bits) can be used for SDTV color depth, 3 bytes (or 24 bits) are required for HDTV. Thus, at a frame rate of 30 frames per second (fps), the noncompressed data rate for SDTV becomes
30 × 640 × 480 × 16 or 147,456,000 bps
For HDTV, the noncompressed data rate becomes
30 × 1920 × 1080 × 24 or 1,492,992,000 bps
The preceding computations for SDTV and HDTV make it obvious that the ability of an RBOC or another service provider to enable subscribers to access multiple television channels over a common access line is severely limited without the use of a lossy compression method. This is because the use of copper-based ADSL2+ is limited to allowing a data rate of approximately 25 Mbps at distances up to 5000 feet, whereas the use of PON enables a bandwidth of approximately 30 Mbps into the subscriber’s premises.
The purpose of this section was to briefly examine the evolution of compression and the rationale for the use of lossy compression for delivering digital video. In the next chapter we will examine compression in much more detail. For now, we can note that advances in the development of different data compression techniques made it possible for the transmission of video to the desktop and into subscribers’ homes in the form of IPTV.
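The noncompressed data rates computed in this section, and the compression ratio needed to fit them onto an access line, can be checked with a short script. The function names are illustrative, and the three-channel case is a hypothetical example rather than a figure from the text.

```python
def uncompressed_bps(width, height, bits_per_pixel, fps=30):
    """Noncompressed video data rate in bits per second."""
    return fps * width * height * bits_per_pixel


SDTV_BPS = uncompressed_bps(640, 480, 16)     # 147,456,000 bps
HDTV_BPS = uncompressed_bps(1920, 1080, 24)   # 1,492,992,000 bps

ADSL2_PLUS_BPS = 25_000_000  # approximate ADSL2+ rate cited in the text


def min_compression_ratio(source_bps, link_bps, channels=1):
    """Smallest compression ratio that lets `channels` streams fit the link."""
    return source_bps * channels / link_bps


if __name__ == "__main__":
    print(f"SDTV: {SDTV_BPS:,} bps")
    print(f"HDTV: {HDTV_BPS:,} bps")
    print(f"One SDTV channel needs about {min_compression_ratio(SDTV_BPS, ADSL2_PLUS_BPS):.1f}:1")
    print(f"One HDTV channel needs about {min_compression_ratio(HDTV_BPS, ADSL2_PLUS_BPS):.1f}:1")
    # Hypothetical household watching three SDTV channels at once.
    print(f"Three SDTV channels need about {min_compression_ratio(SDTV_BPS, ADSL2_PLUS_BPS, 3):.1f}:1")
```

Even a single HDTV stream requires a ratio of roughly 60:1 before any other traffic is considered, far beyond the 2:1 to 3:1 typical of the lossless text-oriented techniques.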
Chapter 3
Television Concepts An understanding of IPTV depends on knowledge of basic television concepts and the Internet Protocol (IP) used to transport television as a sequence of packetized frames. In this chapter we turn our attention to obtaining an understanding of television concepts; the next chapter will focus on the IP. First we will reacquaint ourselves with analog television concepts and the evolution from analog to digital television. As we examine basic television concepts we will become familiar with a variety of terms associated with the technology, including pixels, frames, and interlacing and how they affect bandwidth. By obtaining an appreciation for the bandwidth requirements of standard- and high-definition television, we will note the need for lossy compression. Thus, in concluding this chapter we will examine several data compression techniques whose use enables IPTV to become a reality.
3.1 Analog Television The original development of television was based on analog technology. Television shows were created using an analog video camera to create a video signal that was then formatted and transmitted via a broadcasting station. In the home, analog television receivers would translate the received formatted signal and present the results on the television’s display. Figure 3.1 illustrates the major components of an analog video television system. In actuality, when we replace an analog video camera with a digital camera and use digital formatting, the components shown in Figure 3.1 are also applicable to a digital television system.
Figure 3.1 Major components of an analog television system.
Signal Formatting The signal formatting shown in Figure 3.1 represents a standardized method of taking analog camera signals represented by red, green, and blue (RGB) primary colors and a luminance signal such that they are transmitted at certain frequencies. In the United States, Japan, Canada, and Mexico analog television is formatted based on standards developed by the National Television Systems Committee (NTSC). Two additional analog color systems are used in many countries: PAL and SECAM. The Phase Alternating Line (PAL) standard is similar to NTSC but uses a subcarrier alteration technique that results in certain types of transmission errors appearing as if they have canceled each other out. PAL is used in most of Europe but not in France and Russia. The Séquentiel Couleur à Mémoire (SECAM) standard was developed in France and is used primarily in that country, Russia, and former French colonies. Under SECAM, two FM subcarriers are used to transmit a color-difference component, instead of using a high-frequency subcarrier according to the NTSC standards. Table 3.1 compares the bandwidth requirements of the three previously mentioned analog color television systems in megahertz. Note that the table contains two entries for PAL, because different systems require slight differences in bandwidth. As we probe further into analog television we will primarily reference the NTSC system, using it as a basis for comparison.

Table 3.1 Analog Color TV System Bandwidth

System   Country                                 Bandwidth (MHz)
NTSC     United States, Japan, Canada, Mexico    4.2
PAL      Great Britain                           5.0
PAL      Austria, Germany, Italy                 5.5
SECAM    France, Russia                          6.0
NTSC Operation The color broadcasting scheme standard developed by the NTSC was proposed to the U.S. Federal Communications Commission (FCC) in July 1953. Five months later this analog broadcasting system was approved for use. Under this system a video camera generates three primary signals referred to as ER, EG, and EB. These primary signals are the electrical analog values of the red, green, and blue visual components of the scene being viewed. Because camera signals at that time were analog, the amplitude of each signal is proportional to the spectral energy in the scene being viewed. In addition to the three primary signals, a luminance signal based on the relative sensitivity of the human eye to the primary colors is generated as well as two chrominance signal components. The latter signal components, referred to as I and Q signals, are derived from the primary color components.
Scanning Scanning represents the process of converting optical images into electrical signals and is used by all types of video systems. Under the NTSC system, scanning occurs in a television camera via the movement of an electronic sensing spot across the image from left to right horizontally to form a scanning line. At the end of the image the sensing spot snaps back to begin a new line; this is referred to as retracing. The sensing spot converts each image point it sees into an electrical voltage. As the sensing spot begins a new line, it does so slightly below the previous line until it reaches the end of the last scanned line. At this point the sensing spot performs both horizontal and vertical retrace operations, which results in the sensing spot returning to its original position in the upper left portion of the screen. Figure 3.2 illustrates the scanning process. A complete scan of the image area is referred to as a frame.

Figure 3.2 The scanning process.

Under the NTSC television system, 480 interlaced lines are used to create a picture. This means that the picture is created by scanning across the screen horizontally from left to right with 480 lines. The term “interlaced” refers to the fact that the screen is first scanned on all even-numbered lines; then a vertical synchronization pulse returns the electron beam to the top of the picture tube, after which all odd-numbered lines are scanned. The vertical synchronization pulse can be viewed as an inter-frame timing gap equal to about the time required to scan 45 lines. During this timing gap the television that will display the transmitted image has time to receive the next frame; however, no picture information occurs. Instead, this time gap is used to transmit control information, such as closed-caption data. Thus, the total number of lines in each video frame can be thought of as being 525, but only 480 contain active video information that is displayed. Sometimes this standard analog TV format is referred to as 525i, which means 525 interlaced lines. However, because only 480 lines are displayed, another common term used is 480i. Each scan of even and odd lines occurs in one-sixtieth of a second, resulting in one full picture being displayed every one-thirtieth of a second. Thus, the frame rate is 30 frames per second (fps), whereas in a movie theater a slightly different rate of 24 fps is used. As previously noted, the 525 lines specified by the NTSC television format actually result in 480 lines of resolution. This is because during the timing gap interval equivalent to 45 lines, 22 lines are used for transmitting test signals, vertical synchronization of the picture, closed captioning, and program guides, and a few additional lines are used for the mask of the picture. The term 480i, which is used to reference 480 lines of interlaced scanning, represents what is referred to as analog standard-definition television (SDTV). Standard analog television has been in use for approximately 55 years, during which time it has provided a reasonably good-quality picture.
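The interlacing order and the resulting line rate can be sketched numerically. The snippet below follows the description above (even-numbered lines first, then odd-numbered lines, using 1-based line numbers) and uses the 29.97 frames-per-second color NTSC rate mentioned in the previous chapter; the function names are illustrative.

```python
TOTAL_LINES = 525       # lines per frame, including the vertical timing gap
ACTIVE_LINES = 480      # lines that carry displayed picture information
FRAME_RATE_HZ = 29.97   # color NTSC frame rate; 30 is the nominal figure


def interlaced_fields(active_lines=ACTIVE_LINES):
    """Return (first_field, second_field) as lists of 1-based line numbers."""
    lines = range(1, active_lines + 1)
    even = [n for n in lines if n % 2 == 0]   # scanned first, per the text
    odd = [n for n in lines if n % 2 == 1]    # scanned after vertical retrace
    return even, odd


def horizontal_frequency_hz(total_lines=TOTAL_LINES, frame_rate=FRAME_RATE_HZ):
    """Number of scan lines traced per second, blanking lines included."""
    return total_lines * frame_rate


if __name__ == "__main__":
    first, second = interlaced_fields()
    print(len(first), "lines in the first field,", len(second), "in the second")
    print("Horizontal scanning frequency:", round(horizontal_frequency_hz()), "Hz")
```

Each field therefore carries 240 visible lines, and the 525 total lines traced 29.97 times per second give a horizontal scanning frequency of roughly 15,734 Hz.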
However, to obtain a good-quality picture, a television set must be relatively small. As television sets reached the market with 30-, 32-, 40-, 42-, and even 50-inch diagonally measured screens, the scan lines used in analog television have become more visible. Although one solution to the image quality problem is to move to a digital television format that provides additional lines of resolution, which will be discussed later in this chapter, many manufacturers of large-screen televisions use double painting to enhance the analog image. With double painting, each of the 480 lines is displayed twice, with each line slightly offset from the prior line, improving image clarity. With double painting it becomes possible to obtain a decent analog image on a large-screen TV.
Overcoming Bandwidth Problems The amplitude of the three primary colors, R (red), G (green), and B (blue), represents a bandwidth hog. To conserve bandwidth, RGB is converted into a more compact format referred to as component video. Component video consists of three signals. The first component video signal is luminance, which indicates the brightness of the original RGB signal. Luminance is referred to as the “Y” component. The second and third signals are color difference signals, which indicate how much blue and red there are relative to luminance. The blue component (B-Y) and red component (R-Y) are mathematical derivatives of the RGB signal. Because green can be determined from the Y, B-Y, and R-Y signals, it does not have to be transmitted as a separate signal. Thus, once video information is converted into a component video format, bandwidth requirements are reduced by a factor of 3 to 2. All of the component information is then broadcast as a single signal that combines amplitude and phase modulation, reducing required bandwidth to 4.2 MHz under the NTSC system. This single signal, known as composite video, is broadcast to the TV’s antenna or onto the coaxial cable routed into the home. The yellow “video out” jacks on the back of the VCR or DVD player represent composite video signal sources.
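The claim that green need not be transmitted can be verified with a short calculation. The text does not list the luminance weights, so the sketch below assumes the standard NTSC values (0.299 for red, 0.587 for green, 0.114 for blue); the function names are illustrative.

```python
# Assumed NTSC luminance weights; the three values sum to 1.0.
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_component(r, g, b):
    """Convert RGB into luminance plus the two color difference signals."""
    y = KR * r + KG * g + KB * b
    return y, b - y, r - y          # Y, B-Y, R-Y


def component_to_rgb(y, b_minus_y, r_minus_y):
    """Recover R, G, and B at the receiver; green is derived, not sent."""
    r = r_minus_y + y
    b = b_minus_y + y
    g = (y - KR * r - KB * b) / KG  # solve the luminance equation for G
    return r, g, b


if __name__ == "__main__":
    y, b_y, r_y = rgb_to_component(200, 120, 40)
    print("Y, B-Y, R-Y:", y, b_y, r_y)
    print("Recovered RGB:", component_to_rgb(y, b_y, r_y))
```

Because luminance is a weighted sum of all three primaries, knowing Y along with the red and blue differences leaves green as the only unknown in a single linear equation.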
Video Image Information Seven types of electronic information are used to define a video image when the NTSC format is used. Together, these seven types of electronic information that form a television composite waveform are commonly referred to as composite video (Table 3.2).
Table 3.2 Electronic Information That Forms Composite Video

Horizontal line sync pulse
Color reference burst
Reference black level
Picture luminance
Color saturation
Color hue
Vertical sync pulse
Horizontal Line Sync Pulse The purpose of the horizontal line sync pulse is to set the electronic beam to a locked position. Doing so ensures that each line of picture information commences at the same position during scanning. Thus, the horizontal line sync pulse is generated before each line is scanned. In addition, it controls a horizontal blanking interval. Within that interval are both the horizontal sync pulse and the color reference burst.
Color Reference Burst The purpose of the color reference burst is to ensure standard hue and color saturation. To accomplish this task, a 3.58-MHz color reference burst in the form of a sine wave is added before the picture information on each scan line.
Reference Black Level The reference black level represents the level corresponding to the specified maximum excursion of the luminance signal in the black direction. Black level is also referred to as “setup” or “pedestal” and is defined as 7.5 IEEE units.
Picture Luminance Picture luminance describes the brightness level and ranges from 7.5 IEEE units for black to 100 IEEE units for peak white.
Color Saturation Color has three distinct properties, referred to as hue, value, and saturation. Hue represents the spectral color name whereas value represents lightness or darkness. Saturation represents brightness or dullness. Under the NTSC format, color information is interleaved with picture luminance information through the use of a 3.58-MHz subcarrier. The saturation of the colors is determined by the amplitude of the subcarrier.
Color Hue As previously noted, hue represents the spectral color name. Color hue is also present in the 3.58-MHz subcarrier signal used to transport color information and luminance.
Vertical Sync Pulse The purpose of the vertical sync pulse is to control the length of time the television screen is blanked between the end of one field and the beginning of the next field. This delay is necessary because the electron beam, which is controlled by a magnetic field, cannot instantly reposition itself to the first line of the television screen. This delay period is referred to as the vertical blanking interval and is sometimes used for the insertion of time code, automatic color tuning, and captioning information into the video signal.
Comparison to PAL and SECAM The major difference between NTSC and PAL and SECAM color television resides in their use of subcarrier frequencies, phases, and formats, which affect the required bandwidth in megahertz per broadcast channel. Table 3.3 provides a comparison of the three analog color television standards.

Table 3.3 Comparing Analog Color Television Standards

System          Aspect Ratio   Interlace   Frames per Second   Total/Active Lines   Bandwidth (MHz)
NTSC            4:3            2:1         30                  525/480              4.2
PAL (England)   4:3            2:1         25                  625/580              5.5
Japan           4:3            2:1         29.97               525/480              4.2
SECAM           4:3            2:1         25                  625/580              6.0
Comparison to Digital Although an analog television system uses a continuous carrier for modulation instead of discrete values, many times it is beneficial to consider the required capacity of analog television in terms of a data rate necessary to convey the signal. In this section we will first review the concept of pixels. Next we will compute the data rate required to convey a 4.2-MHz NTSC color television signal.
Pixels In a digital system an image consists of a series of digital values that represent individual points or picture elements along the path taken to scan the image. Each picture element or pixel has a third dimension referred to as color depth. Thus, the resolution and color capability of a digital system depend on both its pixel count and the number of bits used to define the color representation capability of each pixel. For example, a color monitor that has a VGA (Video Graphics Array) capability will have a resolution of 640 × 480 pixels and its color depth would be 24 bits, sufficient to provide a display of approximately 16 million colors, which is referred to as true color. The 3 bytes used for true color represent red, green, and blue intensities that range between 0 and 255. Table 3.4 provides eight examples of RGB values encoded as 24-bit true color.
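As a quick check of the figures above, the following Python sketch computes the number of colors available at a 24-bit color depth and packs one RGB triple into a single 24-bit value. The function name `pack_rgb` and the red-green-blue byte order are illustrative assumptions; the text does not prescribe a storage layout.

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit intensities (0-255) into one 24-bit integer."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each intensity must be in the range 0-255")
    # Assumed byte order: red in the high byte, blue in the low byte.
    return (red << 16) | (green << 8) | blue


COLOR_DEPTH_BITS = 24
VGA_PIXELS = 640 * 480

print(2 ** COLOR_DEPTH_BITS)            # 16777216 distinct colors
print(VGA_PIXELS)                       # 307200 pixels on a VGA display
print(f"{pack_rgb(255, 255, 0):06X}")   # yellow from Table 3.4 -> FFFF00
```

The result of about 16.7 million values is the "approximately 16 million colors" quoted for true color.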
Table 3.4 24-Bit True Color Coding Examples

| RGB Value | Color |
|---|---|
| (0,0,0) | Black |
| (255,0,0) | Red |
| (0,255,0) | Green |
| (0,0,255) | Blue |
| (255,255,0) | Yellow |
| (0,255,255) | Cyan |
| (255,0,255) | Magenta |
| (255,255,255) | White |

Data Rate Requirement Although an analog video system does not use pixels, it is convenient to consider the concept of pixels when referring to such systems. We can do this by defining the width of a pixel as one-half the cycle of the highest video frequency and the height of a pixel as the height of one scanning line. Thus, the number of pixels per line (PPL) becomes

PPL = (2B/FH) × CH

where
B = bandwidth in Hz
FH = horizontal scanning frequency in Hz
CH = fraction of the horizontal scanning interval used for signal transmission

For the analog color NTSC television system with a bandwidth of 4.2 MHz, the PPL becomes

PPL = (2 × 4,200,000/15,734) × 0.84 = 448

Thus, the pixels per frame (PPF) becomes

PPF = 448 × 480 or 215,040

because 480 represents the number of visible scanning lines. Because NTSC operates at 30 fps, the rate required to transport an analog color television signal as digital data is

215,040 × 30 or 6,451,200 pixels per second

which corresponds to 6,451,200 bps when each pixel is represented by a single bit; a greater color depth raises the rate proportionally. Now that we have an appreciation for the fundamentals of analog television, let's turn our attention to digital television.
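The data rate arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `pixels_per_line` is hypothetical, and truncating the PPL to a whole number of pixels is an assumption chosen to match the value of 448 used in the text.

```python
def pixels_per_line(bandwidth_hz: float, line_rate_hz: float,
                    active_fraction: float) -> int:
    """PPL = (2B / FH) * CH, truncated to a whole number of pixels."""
    return int(2 * bandwidth_hz / line_rate_hz * active_fraction)


# NTSC values taken from the text.
BANDWIDTH_HZ = 4_200_000
LINE_RATE_HZ = 15_734
ACTIVE_FRACTION = 0.84
VISIBLE_LINES = 480
FRAMES_PER_SECOND = 30

ppl = pixels_per_line(BANDWIDTH_HZ, LINE_RATE_HZ, ACTIVE_FRACTION)
ppf = ppl * VISIBLE_LINES
pixels_per_second = ppf * FRAMES_PER_SECOND

print(ppl, ppf, pixels_per_second)  # 448 215040 6451200
```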
3.2 Digital Television On a simplified basis, digital television can be considered to represent a method of transmitting video and audio by turning them into a sequence of 1s and 0s associated with computerized data. Unlike analog television, which uses one UHF or VHF channel to broadcast each television channel, a number of compressed digital television program streams are multiplexed into one transmission stream. This results in a combined transmission stream carrying multiple television channels and greatly increases the transmission efficiency of digital TV over analog television. In December 1996 the FCC mandated the conversion of analog television into a digital broadcast TV standard. Broadcasters were initially given a ten-year transition period, which was recently extended to February 17, 2009.
Overview Digital television refers to the transmission of a television signal and its reception on a digital TV or a set-top box that will convert the picture and sound so it can be displayed and heard on an analog set. Thus, digital television signals result from the direct transmission of digital camera output or from the transformation of analog video camera output through the process of sampling and quantization of the analog signal. Sampling refers to how often an analog-to-digital converter samples the signal and quantization references the number of discrete levels to which an analog signal can be converted.
Advantages The conversion of an analog signal to digital enables data to be easily manipulated. This in turn allows the development of compression techniques that minimize both the transmission bandwidth and the storage of information. Thus, the ability to compress digital television permits a number of compressed program streams to be transmitted while minimizing bandwidth requirements. Today most cable television and satellite operators transmit digital TV to subscribers by flowing content into a set-top box that unbundles and decodes programs for viewing. Some newer digital televisions have an equivalent of the set-top box built in, allowing them to display digital content directly.
Comparison to Analog Previously, we noted several advantages associated with the ability to digitize television. In addition, several key differences exist between conventional analog color television (SDTV) and digital television, resulting in the latter providing both a superior picture and superior sound. Those differences include the resolution, picture scanning, color, and sound transported. Table 3.5 provides a general comparison between analog and digital television. Some of the listed features may require a bit of explanation. Thus, let’s briefly discuss each one.
Table 3.5 Comparing Analog and Digital Television

| Feature | Analog | Digital |
|---|---|---|
| Resolution | 525 lines with 480 active | 720 and 1080 lines for HDTV, 480 for digital standard TV |
| Picture scanning | Interlaced | Interlaced or progressive |
| Aspect ratio | 4:3 | 16:9 (or 4:3) |
| Synchronization | Horizontal and vertical sync pulses, vertical blanking interval | Frame sync signal |
| Color | Color added as a separate carrier | Color included in data |
| Sound | Two-channel FM sound on separate carrier | Six-channel Dolby 5.1 surround sound |

Resolution Resolution plays an important part in the clarity of a picture. In an analog TV system, 480 active lines are used to paint a picture on the television receiver. In comparison, digital television has 18 formats that we will examine later in this chapter; however, for now we can note that 480, 720, and 1080 lines are supported, with the latter two representing high-definition television (HDTV) resolutions.
Picture Scanning Analog television is limited to interlaced scanning. In interlaced scanning a frame is subdivided into two fields: odd lines (1,3,5,…) and even lines (2,4,6,…). Odd-numbered lines are scanned in the first one-sixtieth of a second and even-numbered lines are scanned during the second one-sixtieth of a second. Combining the two fields results in the creation of a frame every one-thirtieth of a second. This results in an NTSC analog frame rate of 30 fps. In comparison, various digital television formats include support for both interlaced and progressive scanning. Under progressive scanning, the odd and even lines of a picture are scanned sequentially (1,2,3,4,…) every one-sixtieth of a second for several popular digital formats. This results in 60 frames produced every second, which creates a smoother, more vivid picture with considerably less flicker than analog television.
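The relationship between fields and frames described above can be sketched in Python. The function names `split_into_fields` and `weave` are hypothetical, and scan lines are represented simply by their 1-based line numbers.

```python
def split_into_fields(frame_lines):
    """Return (odd_field, even_field) using 1-based line numbering."""
    odd_field = frame_lines[0::2]    # lines 1, 3, 5, ...
    even_field = frame_lines[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Rebuild a full frame by interleaving the two fields."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    # If the frame has an odd number of lines, one odd line is left over.
    frame.extend(odd_field[len(even_field):])
    return frame


FIELDS_PER_SECOND = 60
lines = list(range(1, 481))              # 480 active lines
odd, even = split_into_fields(lines)

print(len(odd), len(even))               # 240 240
print(weave(odd, even) == lines)         # True
print(FIELDS_PER_SECOND / 2)             # 30.0 interlaced frames per second
```

Each field carries half of the active lines, so two fields at one-sixtieth of a second each yield one complete frame every one-thirtieth of a second.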
Aspect Ratio The ratio between the width and the height of a picture is the aspect ratio. For analog NTSC, the aspect ratio is 4:3 (1.33:1). This aspect ratio was initially selected to match the ratio used in cinema films that were popular when television was initially developed. During the latter part of the 1950s, movie studios gravitated toward widescreen aspect ratios to distance their products from television as well as to provide a more panoramic vision, which was well received for use in many westerns and biblical-themed movies. Recognizing the advantages of being able to provide a panoramic view, digital television manufacturers began using an aspect ratio of 16:9 (1.78:1). Widescreen viewing has now spread from television and cinema to computing; many modern laptops provide a 16:10 aspect ratio. Although this does not match the 16:9 ratio of digital television, its availability indicates a growing consumer preference for viewing information, including television and data, in a widescreen format.
Synchronization In the area of synchronization, NTSC analog television uses horizontal and vertical sync pulses as well as a vertical blanking interval to keep the output of a camera in step with the television receiver's display. In comparison, digital TV uses frame sync signals to provide a similar capability.
Color In an NTSC analog system, color information is added as a separate carrier that is multiplexed with monochrome. This technique enables color broadcasts to be received on a legacy black-and-white television. In comparison, digital television transports color information in each pixel.
Sound The last major difference between analog and digital television is sound. Analog television uses two-channel FM sound modulated on a separate carrier. In comparison, digital television uses six-channel Dolby 5.1 surround sound.
Digital Television Formats In examining the entries in Table 3.5, we need to note that there are a total of 18 digital television formats, of which six are considered to represent HDTV. Table 3.6 summarizes the 18 flavors of digital television, including their vertical and horizontal resolution, aspect ratio, refresh rate in Hz, and the type of television system the previously mentioned parameters represent. As indicated in Table 3.6, digital television can be classified as SDTV, enhanced-definition television (EDTV), and HDTV.

Table 3.6 Digital Television Formats

| Format Index | Vertical Resolution | Horizontal Resolution | Aspect Ratio | Scan Type | Refresh Rate | System Type |
|---|---|---|---|---|---|---|
| 1 | 480 | 640 | 4:3 | Interlaced | 30 | SDTV |
| 2 | 480 | 640 | 4:3 | Progressive | 24 | EDTV |
| 3 | 480 | 640 | 4:3 | Progressive | 30 | EDTV |
| 4 | 480 | 640 | 4:3 | Progressive | 60 | EDTV |
| 5 | 480 | 704 | 4:3 | Interlaced | 30 | EDTV |
| 6 | 480 | 704 | 4:3 | Progressive | 24 | EDTV |
| 7 | 480 | 704 | 4:3 | Progressive | 30 | EDTV |
| 8 | 480 | 704 | 4:3 | Progressive | 60 | EDTV |
| 9 | 480 | 704 | 16:9 | Interlaced | 30 | EDTV |
| 10 | 480 | 704 | 16:9 | Progressive | 24 | EDTV |
| 11 | 480 | 704 | 16:9 | Progressive | 30 | EDTV |
| 12 | 480 | 704 | 16:9 | Progressive | 60 | EDTV |
| 13 | 720 | 1280 | 16:9 | Progressive | 24 | HDTV |
| 14 | 720 | 1280 | 16:9 | Progressive | 30 | HDTV |
| 15 | 720 | 1280 | 16:9 | Progressive | 60 | HDTV |
| 16 | 1080 | 1920 | 16:9 | Interlaced | 30 | HDTV |
| 17 | 1080 | 1920 | 16:9 | Progressive | 24 | HDTV |
| 18 | 1080 | 1920 | 16:9 | Progressive | 30 | HDTV |
Standard-Definition Television The first digital TV format shown in Table 3.6 represents SDTV. Because it uses 480 lines of vertical resolution that are interlaced, it is referred to as 480i. In addition, because its refresh rate is 30 fps, this digital format is also often referred to as 480i/30. This SDTV digital format is equivalent to the interlaced output of DVD-Video in a 4:3 aspect ratio. It is used when bandwidth is a larger concern than picture quality. Because SDTV uses a data rate between 4 and 7 Mbps, between three and six SDTV channels can be multiplexed into the same bandwidth required to support one HDTV channel.
Enhanced-Definition Television Enhanced-definition television (EDTV) consists of 11 formats, including both progressive and interlaced screen painting. The vertical resolution is limited to 480 lines, with horizontal resolution varying from 640 to 704 pixels. Both 4:3 and 16:9 aspect ratios are supported as well as refresh rates of 24, 30, and 60 fps. EDTV is used when a better picture quality than SDTV is desired but bandwidth constraints preclude the use of true HDTV. If you visit an electronics store you may notice many low-cost plasma televisions available for purchase. Such plasma TVs typically are limited to supporting EDTV and, although they produce a brilliant color picture with very good clarity, they are not as good as a true HDTV. Because an EDTV set lacks the electronics to generate 720 or 1080 lines of resolution, it costs less than a set with an HDTV receiver.
High-Definition Television HDTV has six distinct formats, each providing a picture superior to SDTV and EDTV. All six HDTV formats are in a 16:9 aspect ratio, providing a widescreen view. When a 720-line vertical resolution is employed, scanning occurs progressively and refresh rates of 24, 30, and 60 fps are supported. Although Table 3.6 indicates that both interlaced and progressive scanning are supported in a vertical resolution of 1080 lines, that resolution is used primarily for interlaced scanning due to current limitations of broadcasting equipment and consumer television products. The 1080p format can be viewed as providing a path for future growth as imaging and display technologies evolve.

If you subscribe to cable or satellite television service and enroll in their HDTV service, you will receive an HDTV set-top box that has a monthly rental approximately two to three times that of the provider's regular digital box. Although the box is billed as an HDTV set-top box, in reality only a handful of channels are broadcast in HDTV mode. Typically, your television provider will carry the three major networks, two independent HDTV channels that show movies and special events, and a sports channel in high definition. In addition, the HDTV set-top box will allow you to view one or more high-definition premium channels, such as HBO in high definition, assuming you subscribe to one or more premium channels.

At its highest resolution of 1080 × 1920, HDTV offers an image consisting of 2,073,600 pixels, which represents almost a seven-fold improvement in picture detail over SDTV, which has a resolution of 480 × 640, resulting in a display consisting of 307,200 pixels. In addition to an enhanced resolution, color resolution is also improved over SDTV by a factor of two. Each of the digital television formats listed in Table 3.6 uses MPEG-2 for video compression. Later in this chapter we will discuss the use of compression to minimize the bandwidth required to transport a digital television signal.
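The pixel counts quoted for HDTV and SDTV can be verified with a short Python sketch. The helper name `pixel_count` is hypothetical.

```python
def pixel_count(vertical: int, horizontal: int) -> int:
    """Number of pixels in one frame of the given resolution."""
    return vertical * horizontal


hdtv_pixels = pixel_count(1080, 1920)
sdtv_pixels = pixel_count(480, 640)

print(hdtv_pixels)                  # 2073600
print(sdtv_pixels)                  # 307200
print(hdtv_pixels / sdtv_pixels)    # 6.75
```

The ratio of 6.75 is the "almost seven-fold" improvement in picture detail mentioned above.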
HDTV Reception Although you can receive HDTV through the use of an appropriate set-top box when you subscribe to a cable or satellite television service, it is also possible to receive HDTV via over-the-air broadcasts. To do this you normally will need to connect your antenna to an HDTV terrestrial receiver that tunes and decodes all 18 digital television formats. Most HDTV receivers allow you to specify the output format, such as 480p, 720p, or 1080i, which makes it possible to connect the receiver to any of the three types of digital televisions available for purchase. A second type of HDTV receiver is emerging in popularity due to the FCC mandate that will require all broadcast analog television stations to convert to digital on or before February 17, 2009. Although this mandate will not affect analog television sets connected to cable or satellite television, because the set-top box performs any required conversion, more than 30 million analog television sets in use do not have cable or satellite service. When February 17, 2009 arrives, those television sets will not be able to receive any TV signal unless they use a conversion box that receives digital TV signals and converts them to analog.
Sound One aspect of digital television commonly overlooked is its sound quality. Under what many people refer to as NTSC legacy analog television, sound is provided through two-channel FM that is transported on a separate carrier from the TV signal. In comparison, digital television supports Dolby Digital audio encoding for all 18 digital TV formats. Although Dolby Digital is familiar to readers who have a home theater or visit the movies, it is actually more flexible than just a 5.1-channel (five speakers and a subwoofer) surround-sound format. Dolby Digital represents a scalable digital encoding algorithm that supports 1.0 channel (mono) and 2.0 channels (stereo) when the original programming being transmitted has only a mono or stereo soundtrack. In addition, Dolby Digital can scale upward to 6.1 extended surround sound. Now that we have an appreciation for analog and digital television, we will conclude this chapter by examining the technique that enables digital television to be transported using considerably less bandwidth than its unaltered series of frames requires. That technique is lossy compression, which is the focus of the third section of this chapter.
3.3 Lossy Compression As briefly indicated in Chapter 2, lossy compression provides a mechanism to considerably increase the ratio of noncompressed original data to its compressed equivalent. This ratio, technically referred to as the compression ratio, can be much greater than the ratio obtained when a lossless compression method, which returns data to its exact original form, is used.
Characteristics of Digital Video Digital video can be viewed as a sequence of images, each of which represents a two-dimensional frame of picture elements that are commonly referred to as pixels or pels. Associated with each pixel are two values: luminance and chrominance. The luminance is a value proportional to the pixel's intensity and the chrominance is a value that represents the color of the pixel. Several methods can be used to represent the color of a pixel. One method is by specifying an appropriate mixture of the three primary colors (red, green, and blue). Another method is to specify the luminance (Y) and chrominance (U and V). As we will note later in this chapter, the YUV system allows color to be approximated using only two variables and represents one of several mechanisms by which lossy compression obtains a high compression ratio when operating on a digital video stream. The majority of digital video broadcasting (DVB) uses MPEG-2 compression, where the term MPEG refers to the Moving Picture Experts Group, a working group of the International Organization for Standardization (ISO) that has, since its founding in 1988, developed a series of standards for the compressed representation of audio and video. Thus, in this section, to paraphrase a popular tune from The Sound of Music, "we will begin at the beginning" our examination of MPEG by first turning our attention to MPEG-1.
MPEG-1 MPEG-1 represents the first of a series of MPEG standards developed by the Moving Picture Experts Group. This standard was used for the development of such products as the video CD and MP3 players.
Parts of MPEG-1 MPEG-1 was primarily developed for the coding of moving pictures and associated audio for digital storage media at a data rate of up to approximately 1.5 Mbps. The standard is defined in five parts, which are summarized in Table 3.7.

Table 3.7 The Parts of the MPEG-1 Standard

| Part | Description |
|---|---|
| 1 | Addresses the problem of combining one or more data streams from the video and audio parts of the standard with timing information to form a single data stream |
| 2 | Defines a coded representation that can be used for compressing video sequences of both 525 and 625 lines to bit rates of approximately 1.5 Mbps |
| 3 | Defines a coded representation that can be used for compressing both mono and stereo audio sequences |
| 4 | Defines how tests can be used to verify if bitstreams and decoders meet the requirements of the prior parts of the standard |
| 5 | A technical report that provides a full software implementation of the first three parts of the standard |

In examining the entries in Table 3.7, note that Part 1 of the standard represents an important function because it enables audio and video to be stored or transmitted as a single entity. Although this is important for transmission and storage, we will focus our attention on Parts 2 and 3 of the standard, which define techniques used to compress video and audio, respectively.
Overview The MPEG-1 standard covers three types of coded frames: I, P, and B. I frames are intra-frames, whereas P (predicted) frames are produced from the most recently reconstructed I or P frame. B frames, or bidirectional frames, are predicted from the closest I or P frames. Each frame represents a still image. Although images are in color, they are converted into YUV space (luminance and two chrominance values). Motion is predicted from frame to frame in the temporal direction in the luminance (Y) channel on 16 × 16 blocks, and the discrete cosine transformation (DCT) operates on 8 × 8 pixel blocks to organize redundancy in the spatial direction. Thus, for each 16 × 16 pixel block in the current frame, a search is performed for a close match to that block in a previous or future frame. The DCT coefficients are quantized, resulting in many coefficients being 0, which results in strings of 0s suitable for compression. Then, the DCT coefficients, motion vectors, quantization parameters, and other data are Huffman coded using fixed tables. In this section we will first examine each of the encoding processes specified under the MPEG-1 standard. Once this is accomplished, we will turn our attention to how intra-frame and inter-frame coding are performed under the standard.
Video Compression In the MPEG-1 standard, video is represented as a sequence of pictures, with each picture treated as a two-dimensional array of pixels (picture elements or pels). The color of each pel is defined by three components: Y (luminance), Cb (first chrominance component), and Cr (second chrominance component). To obtain a high compression ratio, MPEG-1 uses coding techniques to reduce both spatial redundancy, where neighboring samples on a scanning line are similar, and temporal redundancy, where neighboring images in a video sequence are similar.
Color Space Conversion Under the MPEG-1 standard, each pel in a picture, which consists of red (R), green (G), and blue (B) components, is converted to Y, Cb, Cr. The Y, Cb, Cr components are less correlated than R, G, B, which means that they can be coded more efficiently.
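The conversion can be illustrated with a short Python sketch. The text does not give the conversion matrix, so the coefficients below are an assumption: they are the widely used ITU-R BT.601 luma weights in their full-range (0 to 255) form, and the function name `rgb_to_ycbcr` is hypothetical.

```python
def clamp_to_byte(value: float) -> int:
    """Round to the nearest integer and limit the result to 0-255."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r: int, g: int, b: int):
    """Convert 8-bit R, G, B to 8-bit Y, Cb, Cr.

    Assumed coefficients: BT.601 luma weights, full-range form.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + (b - y) / 1.772   # blue-difference chrominance
    cr = 128 + (r - y) / 1.402   # red-difference chrominance
    return clamp_to_byte(y), clamp_to_byte(cb), clamp_to_byte(cr)


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128)
print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128)
```

For any gray pel (R = G = B) both chrominance values sit at the midpoint of 128, which shows how the brightness information is concentrated in Y while Cb and Cr carry only the color differences.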
Subsampling After the previously mentioned conversion, each pel is represented as Y, Cb, Cr. Because human sight is more sensitive to Y, that component is encoded with full resolution, whereas the CbCr components are subsampled. This results in a reduction of data without affecting the visual quality of the image. Under MPEG-1, Y's resolution is four times the resolution of the CbCr subsample. Figure 3.3 illustrates the subsample process.

Figure 3.3 An example of the subsample process.

Subsampling represents the most basic of all image compression techniques. It reduces the amount of data by simply discarding some of it. There are two primary methods of subsampling. The first method involves copying the original image but only using a portion of the original pixels. A second method requires first computing the average pixel value for each group of several pixels. Next, the average pixel value is substituted in the appropriate location in the approximated image. This second technique is more complex but normally results in better quality images.

In Figure 3.3, subsampling by 2 in both the x and y directions results in every second line and second column being ignored. While the color component of the image is subsampled by 2 in both directions, the luminance component remains intact. This is done because human vision is much less sensitive to chrominance than it is to luminance. Thus, subsampling color images in this manner reduces the number of bits required to specify the chrominance component by three fourths.

Subsampling is obviously nonreversible and thus lossy. Although the subsampling process relies on the ability of human visual perception to fill in the gaps, the receiver or decoder can attempt to restore pixels that were removed due to subsampling. To do this, adjacent pixels of the subsampled image are compared. Then, the values of the missing in-between pixels can be approximated through interpolation.
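The two subsampling methods described above can be sketched in Python as follows. The function names and the sample chrominance values are hypothetical, and the averaging version assumes a plane with an even number of rows and columns.

```python
def decimate_by_2(plane):
    """Method 1: keep every second row and every second column."""
    return [row[0::2] for row in plane[0::2]]


def average_by_2(plane):
    """Method 2: replace each 2 x 2 group of samples with its mean.

    Assumes the plane has an even number of rows and columns.
    """
    result = []
    for y in range(0, len(plane), 2):
        out_row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            out_row.append(total / 4)
        result.append(out_row)
    return result


# Hypothetical 4 x 4 chrominance plane.
cb_plane = [
    [100, 102, 110, 112],
    [104, 106, 114, 116],
    [120, 122, 130, 132],
    [124, 126, 134, 136],
]

print(decimate_by_2(cb_plane))  # [[100, 110], [120, 130]]
print(average_by_2(cb_plane))   # [[103.0, 113.0], [123.0, 133.0]]
```

Either way, 16 chrominance samples are reduced to 4, which is the three-fourths reduction noted in the text.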
Quantization Once color space conversion and subsampling of chrominance information are completed, the next step is to reduce the resulting data through the process of quantization. As a refresher, quantization refers to the process of approximating the continuous set of values in an image so that they have a finite set of values. Thus, the use of quantization enables fewer bits to represent a continuous set of values. In the MPEG-1 standard, a quantizer matrix (Q[i,j]) is used to define quantization steps. Each time a pel matrix (X[i,j]) with the same size as (Q[i,j]) occurs, each element of the pel matrix is divided by the corresponding element of the quantizer matrix to produce the quantized value matrix (Xq[i,j]). Because we can round a real to an integer, we obtain

Xq[i,j] = Round(X[i,j]/Q[i,j])

The inverse of quantization is referred to as inverse quantization, which is expressed mathematically as follows:

X1[i,j] = Xq[i,j] * Q[i,j]

Because the quantization equation uses the round function to obtain the nearest integer value, the reconstructed or dequantized value will not be the same as the original value. The difference between the actual and reconstructed values is referred to as the quantization error and explains why MPEG-1 is referred to as a lossy compression method. Through the careful design of Q[i,j], the visual quality of the quantized image will appear to the human eye reasonably similar to the original image. Figure 3.4 illustrates an example of the quantization process on an 8 × 8 block based on Q[i,j] equal to 2.

Non-Quantized
487 1 3 2 1 2 3 -3
-5 1 2 1 4 1 0 0
2 1 3 1 1 1 2 5
-2 -1 0 1 1 2 2 2
-1 -2 0 0 -1 0 1 2
2 -1 0 0 1 1 2 1
3 -1 0 0 0 0 1 0
-1 -1 -2 -1 0 0 1 0

Quantization Result
293 0 1 1 0 1 1 -1
-1 0 1 0 1 0 0 0
1 0 1 0 0 0 1 2
-1 0 0 0 0 1 1 1
0 -1 0 0 0 0 0 1
1 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0
0 0 -1 0 0 0 0 0

Figure 3.4 An example of the quantization process when Q[i,j] equals 2.
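The quantization and inverse quantization equations can be exercised with a small Python sketch. The sample values are hypothetical rather than taken from Figure 3.4, and the rounding convention (nearest integer, ties away from zero) is an assumption, because the text specifies only a generic Round function.

```python
import math


def round_half_away(value: float) -> int:
    """Round to the nearest integer, ties away from zero (assumed rule)."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def quantize(x, q):
    """Xq[i,j] = Round(X[i,j] / Q[i,j])."""
    return [[round_half_away(x[i][j] / q[i][j]) for j in range(len(x[0]))]
            for i in range(len(x))]


def dequantize(xq, q):
    """X1[i,j] = Xq[i,j] * Q[i,j]."""
    return [[xq[i][j] * q[i][j] for j in range(len(xq[0]))]
            for i in range(len(xq))]


# Hypothetical 2 x 3 block and a uniform quantizer matrix of 2s.
x = [[52, 7, -3],
     [9, -4, 1]]
q = [[2, 2, 2],
     [2, 2, 2]]

xq = quantize(x, q)
x1 = dequantize(xq, q)
error = [[x[i][j] - x1[i][j] for j in range(3)] for i in range(2)]

print(xq)     # [[26, 4, -2], [5, -2, 1]]
print(x1)     # [[52, 8, -4], [10, -4, 2]]
print(error)  # [[0, -1, 1], [-1, 0, -1]]
```

The nonzero entries in `error` are the quantization error: values that were not exact multiples of the step size cannot be recovered by the decoder.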
Discrete Cosine Transform Under the MPEG-1 standard, 8 × 8 pel blocks are converted to another 8 × 8 block using the discrete cosine transform. The DCT represents a technique for converting a signal into elementary frequency components. A one-dimensional DCT is used in processing one-dimensional signals, such as speech waveforms, and a two-dimensional DCT is required for the analysis of two-dimensional signals, such as images. For an m × n matrix of pel values s(x,y), the two-dimensional DCT is applied to each row of the matrix and then to each column of the result. Thus, the transform S(u,v) is

$$S(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} s(x,y) \cos\frac{(2x+1)u\pi}{2n} \cos\frac{(2y+1)v\pi}{2m} \qquad (3.1)$$

where:
u = 0, …, n − 1
v = 0, …, m − 1
C(u) = 2^(−1/2) for u = 0; C(u) = 1 otherwise, with C(v) defined in the same way

For the two-dimensional 8 × 8 pel DCT transform, we obtain

$$S(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{y=0}^{7} \sum_{x=0}^{7} s(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \qquad (3.2)$$

where:
u, v, x, y = 0, 1, …, 7
C(u) = 2^(−1/2) for u = 0; otherwise C(u) = 1, with C(v) defined in the same way

Through the use of DCT, data in the spatial (or time) domain is converted to data in the frequency domain. If there is little variation of data in the spatial domain, then the low-frequency coefficients are large and the high-frequency coefficients are small. We can view the DCT as changing the representation of data as an array of 64 values (8 × 8) into a varying signal that can be approximated by a set of 64 cosine functions with a series of amplitudes that can be represented as a matrix of coefficients. Although the DCT process by itself does not result in compression, the resulting coefficients, when scanned in an applicable zig-zag order, tend to be good candidates for compression.

The coefficient at location (0,0) in the DCT is referred to as the DC coefficient. The other coefficient values are called AC coefficients. Large quantization steps are used to quantize the AC coefficients, whereas small quantization steps are used to quantize the DC coefficient, which retains a high level of precision for the DC value. After the application of the DCT process and quantization, a majority of the AC values will be 0. For sequences of 0s, run-length encoding (RLE) is used to compress the results of the DCT and quantization process. To do so, the bitstream is encoded as (skip, value) pairs, where "skip" represents the number of 0s and "value" is the next nonzero value encountered. RLE is used only on the AC component. For the DC component, differential pulse code modulation (DPCM) is used. DPCM can be viewed as an extension of pulse code modulation in which the difference between the previous output and a new source is encoded instead of the actual values of the new source. Figure 3.5 illustrates the creation of differences between a previous output sequence and a new source sequence. Note that by encoding the difference instead of the new source values, fewer bits are required.

New Source: 1 3 2 5 7 4 1 0 3
Previous Output: 0 1 2 4 8 3 0 1 2
Differential Values: 1 2 0 1 -1 1 1 -1 1

Figure 3.5 Differential pulse code modulation.

After the application of DCT and quantization, most AC values will be set to 0. At this point a zig-zag scan process is used to gather more consecutive 0s. Figure 3.6 illustrates a portion of the zig-zag scanning process applied to an 8 × 8 block and the resulting data. Once the zig-zag process is applied to the block, the resulting bitstream is encoded as (skip, value) pairs. The value 293 is the DC coefficient and is not run-length coded. Instead, DPCM is used to predict the DC coefficient.
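To make the 8 × 8 transform concrete, the following Python sketch evaluates Equation 3.2 directly. It is an illustration only: the function names are hypothetical, the block is indexed as `block[y][x]`, and the four nested loops are far slower than the fast DCT algorithms a practical encoder would use.

```python
import math

N = 8  # block size used by MPEG-1


def c(k: int) -> float:
    """Scale factor: 2^(-1/2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_8x8(block):
    """Direct evaluation of Equation 3.2; returns coeffs[v][u]."""
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs


# A block with no spatial variation: every pel has the value 100.
flat_block = [[100] * N for _ in range(N)]
flat_coeffs = dct_8x8(flat_block)

print(round(flat_coeffs[0][0], 6))  # 800.0 (the DC coefficient)
largest_ac = max(abs(flat_coeffs[v][u])
                 for v in range(N) for u in range(N) if (u, v) != (0, 0))
print(largest_ac < 1e-9)            # True: all AC coefficients vanish
```

A block with no variation produces only a DC coefficient, which illustrates why smooth image areas yield many zero-valued AC coefficients after quantization.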
Motion Estimation Under the MPEG-1 standard, data is further reduced through motion estimation. Motion estimation is used to predict the values of a block of pixels in the current picture. A frame is selected as a reference and subsequent frames are predicted from the reference. This process of video compression using motion estimation is also referred to as inter-frame coding.
293 0 1 1 0 1 1 -1
-1 0 1 0 1 0 0 0
1 0 1 0 0 0 1 2
-1 0 0 0 0 1 1 1
0 -1 0 0 0 0 0 1
1 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0
0 0 -1 0 0 0 0 0

Zigzag data: 293, 0, -1, 1, 0, 1, 1, 1, 0, -1, 0, 0, 1, 0, 0, ....

Figure 3.6 A portion of a zig-zag scan.
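The zig-zag scan, the (skip, value) run-length pairs, and DPCM of the DC coefficient can be sketched together in Python. The block below is the quantized block shown in Figure 3.4. The function names are hypothetical, and the DC values passed to `dpcm_encode` are invented for illustration; dropping trailing zeros in the run-length step is also a simplification.

```python
def zigzag_order(n: int = 8):
    """Return (row, col) positions in zig-zag order for an n x n block."""
    order = []
    for d in range(2 * n - 1):               # d = row + col of a diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        cells = [(r, d - r) for r in rows]
        if d % 2 == 0:
            cells.reverse()                  # even diagonals run upward
        order.extend(cells)
    return order


def run_length_encode(ac_values):
    """Encode AC values as (skip, value) pairs; trailing zeros are dropped."""
    pairs, skip = [], 0
    for value in ac_values:
        if value == 0:
            skip += 1
        else:
            pairs.append((skip, value))
            skip = 0
    return pairs


def dpcm_encode(dc_values):
    """Encode each DC value as its difference from the previous one."""
    differences, previous = [], 0
    for value in dc_values:
        differences.append(value - previous)
        previous = value
    return differences


block = [
    [293, 0, 1, 1, 0, 1, 1, -1],
    [-1, 0, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 2],
    [-1, 0, 0, 0, 0, 1, 1, 1],
    [0, -1, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, -1, 0, 0, 0, 0, 0],
]

scan = [block[r][c] for r, c in zigzag_order()]
print(scan[:15])  # [293, 0, -1, 1, 0, 1, 1, 1, 0, -1, 0, 0, 1, 0, 0]

ac_pairs = run_length_encode(scan[1:])       # the DC value is excluded
print(ac_pairs[:4])                          # [(1, -1), (0, 1), (1, 1), (0, 1)]

# Hypothetical DC coefficients of four consecutive blocks.
print(dpcm_encode([293, 290, 295, 295]))     # [293, -3, 5, 0]
```

The first 15 scanned values match the zig-zag data listed in Figure 3.6.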
Frame N
76 81 79 80 81
78 82 79 80 81
79 82 83 79 80
80 81 81 80 80

Frame N+1
79 80 79 81 81
79 81 79 80 81
80 81 82 80 80
80 81 82 81 80

Motion vector
3 -1 0 1 0
-1 1 -1 -1 0
-1 0 1 0 0
0 0 1 1 0

Figure 3.7 Motion estimation example.
Inter-frame coding or motion estimation occurs by dividing the current frame into macroblocks, typically 8 × 8 or 16 × 16 pixels in size. Each macroblock is then compared to candidate macroblocks in the reference frame and the best matching macroblock is selected. Next, a vector that denotes the displacement of the macroblock in the reference frame with respect to the macroblock in the current frame is computed. This vector is referred to as the motion vector and the difference between the two blocks is known as the prediction error. When a previous frame is used as a reference, the prediction is referred to as forward prediction. In comparison, when the reference frame represents a future frame, the prediction is referred to as backward prediction. When both forward and backward predictions are used together, this technique is referred to as bidirectional prediction. Figure 3.7 shows an example of the motion estimation process on a 4 × 5 pixel block for ease of illustration. In this example, Frame N serves as a reference for motion estimation on Frame N + 1. Thus, Figure 3.7 can also be considered to represent a forward prediction process.
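The block-matching search described above can be sketched in Python. This is a minimal illustration under stated assumptions: it uses an exhaustive search and the sum of absolute differences (SAD) as the matching criterion, neither of which is mandated by the text, and the function names, frame contents, and 2 × 2 block size are hypothetical.

```python
def sad(reference, current, ref_y, ref_x, cur_y, cur_x, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(current[cur_y + dy][cur_x + dx]
                         - reference[ref_y + dy][ref_x + dx])
    return total


def find_motion_vector(reference, current, cur_y, cur_x, size, search_range):
    """Return ((dy, dx), cost) of the best match in the reference frame.

    The vector is the reference block position minus the current block
    position, matching the definition given in the text.
    """
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ref_y, ref_x = cur_y + dy, cur_x + dx
            if not (0 <= ref_y <= height - size and 0 <= ref_x <= width - size):
                continue  # candidate block would fall outside the frame
            cost = sad(reference, current, ref_y, ref_x, cur_y, cur_x, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


def make_frame(top, left):
    """Hypothetical 6 x 6 frame: a 2 x 2 patch on a black background."""
    frame = [[0] * 6 for _ in range(6)]
    patch = [[10, 20], [30, 40]]
    for dy in range(2):
        for dx in range(2):
            frame[top + dy][left + dx] = patch[dy][dx]
    return frame


reference_frame = make_frame(1, 1)   # patch at row 1, column 1
current_frame = make_frame(2, 3)     # patch has moved to row 2, column 3

vector, cost = find_motion_vector(reference_frame, current_frame, 2, 3, 2, 2)
print(vector, cost)  # (-1, -2) 0
```

A cost of 0 means the prediction error for this block is zero, so only the motion vector would need to be transmitted.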
Motion Compensation During the reconstruction process, the reference frame is used to predict the current frame based on the use of the motion vectors. Under the MPEG-1 standard, the encoder computes motion vectors and their associated prediction errors. During the decoding process the decoder uses this information to reconstruct frames, a process referred to as motion compensation. As you might surmise, motion compensation is the inverse of motion estimation.
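The decoder side of this process can be sketched as follows. The function name, the sample values, and the sign convention (prediction error equals the current block minus the predicted block) are assumptions made for illustration.

```python
def motion_compensate(reference, cur_y, cur_x, vector, prediction_error):
    """Rebuild a block from the reference frame, a motion vector,
    and the prediction error (assumed: current minus predicted)."""
    dy, dx = vector
    size = len(prediction_error)
    block = []
    for row in range(size):
        block.append([
            reference[cur_y + dy + row][cur_x + dx + col]
            + prediction_error[row][col]
            for col in range(size)
        ])
    return block


# Hypothetical 3 x 3 reference frame.
reference = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
vector = (-1, -1)                  # best match lies one row up, one column left
prediction_error = [[1, 0],
                    [0, -1]]

rebuilt = motion_compensate(reference, 1, 1, vector, prediction_error)
print(rebuilt)  # [[2, 2], [4, 4]]
```

The decoder never searches for a match; it simply applies the vector and error values that the encoder computed during motion estimation.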
Variable-Length Coding
As a mechanism to further reduce data redundancy, MPEG-1 specifies the use of variable-length coding (VLC) as the last step of the encoding process. Variable-length coding is a statistical coding technique in which short codewords are used to represent frequently occurring values and longer codewords are used to represent less frequently occurring values. For example, consider a VLC for a source with five symbols, s1 through s5. If the code alphabet is binary we can assign the following codewords to s1 through s5:
s1 = 0
s2 = 10
s3 = 110
s4 = 1110
s5 = 1111
The preceding code is an instantaneous code because no codeword is a prefix of any other codeword, so a decoding tree is easily constructed. To illustrate its instantaneous decoding capability, let’s assume the following binary data occurs at the receiver:
01101110010…
This binary string is instantly decoded as s1, s3, s4, s1, s2. When we assign codewords to symbols based on the probability of symbols occurring in data, giving the shortest codewords to the most probable symbols, we create the well-known Huffman code. Under the MPEG-1 standard, encoding and decoding employ a code table in which each row has two entries. One entry represents the possible symbols or original data, and the second entry stores the corresponding codeword for each data symbol.
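The instantaneous decoding property can be demonstrated with a short Python sketch that uses the five codewords listed above. The names `CODE_TABLE`, `encode`, and `decode` are illustrative only; an actual MPEG-1 decoder uses the code tables defined by the standard rather than this toy table.

```python
# Two-entry code table: codeword -> symbol, as in the example above.
CODE_TABLE = {"0": "s1", "10": "s2", "110": "s3", "1110": "s4", "1111": "s5"}


def decode(bits, table=CODE_TABLE):
    """Decode a bit string one bit at a time.

    Because no codeword is a prefix of another, a symbol can be emitted
    as soon as the accumulated bits match a codeword.
    """
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in table:
            symbols.append(table[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return symbols


def encode(symbols, table=CODE_TABLE):
    """Concatenate the codewords for a sequence of symbols."""
    reverse = {symbol: codeword for codeword, symbol in table.items()}
    return "".join(reverse[s] for s in symbols)
```

Applying `decode` to the 11 bits shown in the example yields the same five symbols given in the text.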
Intra-Frame Coding
An I (or intra-) frame is a frame coded as a still image without any past history. Thus, we can view an I frame as a beginning or reference for predicted (P) frames and bidirectional (B) frames, with the latter predicted from the closest I or P frames. When I frames are coded only the spatial redundancy is reduced because the frame serves as a reference for P and B frames. The basic coding unit of the I frame is a block. That block is an 8 × 8 matrix, whereas a macroblock consists of six blocks: four of luminance, one of Cb chrominance, and one of Cr chrominance. The actual encoding of an I frame follows an eight-step process. Table 3.8 summarizes the steps involved in encoding an I frame.
Table 3.8 Intra-Frame Encoding Steps
1. Decompose image into RGB components
2. Convert RGB to YCbCr
3. Divide image into macroblocks
4. Perform DCT on each block
5. Quantize each coefficient in block
6. Gather AC value
7. Use DPCM to encode DC value; use VLC to encode it
8. Use RLE to encode AC value; use VLC to encode it
Intra-frame coding does not examine the differences between frames. Instead, it focuses on reducing the data required to represent a single frame, similar to the manner by which JPEG operates. Thus, although intra-frame coding can obtain a significant degree of compression, it is not as effective as inter-frame coding, which can take advantage of the fact that motion estimation can significantly reduce the amount of data required to represent P and B frames.
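The last three steps of Table 3.8 can be illustrated with a simplified Python sketch. It assumes the 8 × 8 block has already been transformed and quantized, gathers the coefficients in zig-zag order, applies DPCM to the DC value, and run-length encodes the AC values as (run of zeros, level) pairs. The pair format, the "EOB" end-of-block marker, and the function names are assumptions made for illustration, and the final VLC step is omitted.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals; alternate direction on each one.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_length_encode(ac_values):
    """Encode AC values as (zero_run, level) pairs, ending with 'EOB'."""
    pairs, run = [], 0
    for level in ac_values:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs


def encode_block(block, previous_dc):
    """Return (dc_difference, ac_pairs) for one quantized block."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    dc_difference = scanned[0] - previous_dc  # DPCM on the DC value
    return dc_difference, run_length_encode(scanned[1:])
```

The zig-zag scan tends to place the nonzero low-frequency coefficients first, which is why long runs of zeros accumulate at the end of the scan and compress well.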
Inter-Frame Coding
Inter-frame coding represents the process by which P and B frames are created. As you might expect, the coding of P and B frames is more complex than for I frames because motion-compensated macroblocks may be created. When this occurs, the difference between the motion-compensated macroblock and the current macroblock is transformed through the use of two-dimensional DCT, resulting in the creation of an array of 8 × 8 transform coefficients. The resulting coefficients are quantized and then encoded using a run-length data compression technique. Table 3.9 lists the steps required to perform inter-frame encoding. Note that the encoder has to make a series of decisions when it performs motion compensation. First, it has to decide how to divide the picture into macroblocks. Next, it needs to determine the best motion vectors to use. Once this is accomplished the encoder must decide whether to code each macroblock as an intra- or predicted macroblock. Once the coding method is determined, the encoder must then determine how to set the quantizer scale so that a balance is obtained between the level of compression obtained and the quality of the decompressed frame.
Table 3.9 Inter-Frame Encoding Steps
1. Decompose image into three RGB components
2. Convert RGB to YCbCr
3. Perform motion estimation between encoding frame and reference frame
4. Divide image into macroblocks consisting of 6 blocks (4 for Y, 1 for Cb, 1 for Cr)
5. Perform DCT on each block
6. Quantize each coefficient
7. Gather AC value
8. Reconstruct frame and store it in frame buffer
9. Apply DPCM to encode DC value and use VLC to encode it
10. Use RLE to encode AC value and then use VLC to encode it
B-Frame Coding
The B (or bidirectional) frames are predicted from the closest two I or P frames, one in the past and one in the future. The encoder searches for matching blocks in those frames, trying three different approaches to determine which approach works best. That is, the encoder can predict from the past frame using a forward vector, predict from the future frame using a backward vector, or average the two blocks from the future and past frames; the selected prediction is subtracted from the block being coded. Figure 3.8 illustrates the relationships among I, B, and P frames. Note that P frames use motion to define the dependence between consecutive frames whereas the B frame functions as a mechanism for increasing the frame rate without having to significantly increase the bit rate. One advantage associated with B frames is that they reduce noise at low bit rates. However, the computational complexity, bandwidth, delay, and buffer requirements are negative aspects associated with the use of B frames. Concerning delay, additional delay is introduced in the encoding process because the frame used for backward prediction has to be transmitted to the decoder before the intermediate B frame can be decoded and displayed.
Figure 3.8 Relationships among I, B, and P frames.
Coding decisions. The coding of the B frame involves a series of decisions. First, the MPEG-1 standard defines 12 types of macroblocks for B frames, which provide additional types over P frames due to the introduction of a backward motion vector. Thus, the encoding process requires more types of macroblocks to be considered. In addition, when both forward and backward motion vectors are present, the motion-compensated macroblocks need to be constructed from both future and previous frames. This requires an averaging process to form an interpolated motion-compensated macroblock. Once a motion compensation mode is selected and required computations are performed, a DCT operation occurs on each block. After this, each coefficient is quantized and AC values are gathered via a zig-zag scan process, after which DPCM is applied to encode the DC value. Next, VLC is used to encode the DC value and RLE is used to encode the AC value, after which VLC is used to encode the compressed AC value.
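The choice among forward, backward, and interpolated prediction for a B-frame block can be sketched as follows. The sketch assumes the forward and backward blocks have already been located by motion estimation and are supplied as flat lists of pixel values; the sum of absolute differences as the selection criterion, the rounding used in the average, and the function name `choose_b_prediction` are all assumptions rather than details taken from the MPEG-1 standard.

```python
def choose_b_prediction(current, forward, backward):
    """Pick the prediction mode with the smallest error for one block.

    Returns (mode, residual), where residual is the difference between
    the current block and the selected prediction.
    """
    # Interpolated prediction: rounded average of the two reference blocks.
    interpolated = [(f + b + 1) // 2 for f, b in zip(forward, backward)]
    candidates = {
        "forward": forward,
        "backward": backward,
        "interpolated": interpolated,
    }

    def cost(prediction):
        return sum(abs(c - p) for c, p in zip(current, prediction))

    mode = min(candidates, key=lambda name: cost(candidates[name]))
    residual = [c - p for c, p in zip(current, candidates[mode])]
    return mode, residual
```

When the content of a block changes smoothly between the past and future frames, the interpolated prediction usually leaves the smallest residual, which is one reason B frames compress well.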
Summary
As indicated in this section, MPEG-1 uses intra-frame and inter-frame coding to reduce the quantity of data that needs to be transmitted. Intra-frame coding can be considered to represent a starting point for further reduction of data because this results in compressed I frames, which are used as a reference for the creation of B and P frames. In comparison, inter-frame coding reduces the encoded bit rate by coding frames with respect to the previous encoded frame (P) and sometimes the subsequent (B) frame. This method of coding results in a higher compression ratio than intra-frame coding but requires a large computational effort.
MPEG-1 Audio
Previously, we focused on MPEG-1 video. In this section we turn our attention to the manner by which sound is digitized under the MPEG-1 standard.
Coding Schemes
There are three different coding schemes for digitized sound under the MPEG-1 standard. Those coding schemes are referred to as Layers I, II, and III. The MPEG-1 audio standard specifies the type of information an encoder has to produce, not the encoder itself. The coding is perceptual, which is a form of lossy coding: it eliminates those parts of a sound signal that the human ear cannot perceive. In effect, an MPEG-1 audio encoder transforms sound signals into the frequency domain and eliminates those frequency components that cannot be heard due to their masking by stronger frequency components.
Audio Layers
MPEG-1 audio layers increase in complexity and coding efficiency as they progress from Layer I to Layer II and then to Layer III. The well-known .mp3 file extension was created with the development of MPEG-1 Layer III encoder and decoder software for the Windows operating system. After removal of the portions of audio signals that cannot be heard, the remaining audio is quantized into a bitstream, which is divided into data blocks. In Layer I, blocks consist of 384 audio samples, whereas in Layers II and III there are 1152 samples in a block. Each block is encoded within an MPEG-1 audio frame, with an MPEG-1 audio stream consisting of a series of consecutive audio frames. Similar to video frames, an audio frame consists of a header that contains such information as the MPEG layer, the sampling frequency, the number of channels, and whether the frame is Cyclic Redundancy Check (CRC) protected. The header is then followed by the encoded sound data.
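The block sizes quoted above translate directly into frame durations and average frame sizes. The sketch below is a back-of-the-envelope calculation under stated assumptions: the 48 kHz sampling frequency and 128 kbps bit rate used in the usage note are example values chosen for illustration, the function names are hypothetical, and real frames occupy a whole number of bytes with optional padding, which is ignored here.

```python
# Samples carried by one audio frame, per layer (values from the text).
SAMPLES_PER_FRAME = {"I": 384, "II": 1152, "III": 1152}


def frame_duration_ms(layer, sampling_hz):
    """Playing time of one audio frame, in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME[layer] / sampling_hz


def average_frame_bytes(layer, sampling_hz, bit_rate_bps):
    """Average size of one frame (header included) at a constant bit rate."""
    return SAMPLES_PER_FRAME[layer] * bit_rate_bps / (8 * sampling_hz)
```

At an assumed 48 kHz sampling frequency, a Layer I frame lasts 8 ms and a Layer II or III frame lasts 24 ms; at 128 kbps a Layer III frame then averages 384 bytes.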
Bit Rate Support
MPEG-1 audio supports a wide range of bit rates from 32 kbps to 320 kbps. A low sampling frequency extension included in MPEG-2 extends the data rate downward to 8 kbps. Although Layer III supports a variable bit rate, lower layer support is optional. However, most MPEG-1 audio decoders support variable bit rates for all layers.
Stereo Mode Support
MPEG-1 audio supports one- and two-channel audio signals. Four different audio modes are supported by the standard. Those modes are mono, stereo, joint stereo, and dual channel, with the latter providing two independent channels that can be used to support two languages.
MPEG-2
MPEG-2 is the most commonly used method to compress audio and video signals. This extension to the MPEG-1 standard dates to 1990, when the Moving Picture Experts Group realized it was necessary to develop a requirement for coding video broadcasts at higher data rates than MPEG-1’s support of bit rates up to 1.5 Mbps. The resulting MPEG-2 standard is capable of encoding SDTV at bit rates from approximately 3 to 15 Mbps and HDTV at bit rates from 15 to 30 Mbps. In addition, MPEG-2 extends the stereo audio capabilities of MPEG-1 to multichannel surround-sound coding.
Overview
Similar to MPEG-1, MPEG-2 is a standard with many parts. However, unlike MPEG-1, which has five parts, MPEG-2 has nine parts.
Parts of MPEG-2
Part 1 of MPEG-2 addresses how one or more elementary video and audio streams as well as other data can be combined into one or more data streams suitable for storage or transmission. The primary purpose of Part 1 is to define a syntax for transporting packets of audio and video bitstreams and a syntax for their synchronization. Part 2 of MPEG-2 builds on the video compression capabilities defined by the prior MPEG-1 standard. Under Part 2 the header and bitstreams are defined, as are the algorithms used to process video. Part 2 also defines a series of profiles that offer different functionalities, ranging from coding high-bit-rate data to pictures with different color resolutions. Part 3 of MPEG-2 represents a backward-compatible multichannel extension of the MPEG-1 audio standard. Parts 4 and 5 of MPEG-2 correspond to Parts 4 and 5 of MPEG-1. Part 4 defines the meaning of conformance for system, video, and audio, and Part 5 contains an example C language software encoder and compliant decoder for video and audio. Part 6 of MPEG-2, Digital Storage Media Command and Control (DSM-CC), represents a set of protocols that controls MPEG-1 and MPEG-2 bitstreams. A syntax for controlling VCR-style and random-access disks is defined, including such commands as Still Frame, Fast Forward, Goto, and so on.
Part 7 of MPEG-2 represents a specification for a multichannel audio coding algorithm that is not backward compatible with the MPEG-1 audio specification. Thus, Part 7 removes the constraint of having audio that is backward compatible with MPEG-1. Part 8 introduces a 10-bit video extension whose primary application is for studio video that requires 10 bits of sample precision. Work on Part 8 was discontinued due to a lack of industry interest. Part 9 of MPEG-2 defines the specification of the Real-Time Interface (RTI) for transporting video-on-demand control signals between set-top boxes and headend servers.
Comparison to MPEG-1
As previously mentioned, MPEG-2 video represents an extension to MPEG-1 video. MPEG-2 provides extra algorithms beyond those supported by MPEG-1 that can be used to efficiently code interlaced video at a wide range of data rates. Table 3.10 provides a summary of the key differences between MPEG-2 and MPEG-1.
Table 3.10 MPEG-2 Additions
- Defines nonscalable and scalable profiles
- Defines four levels of coding parameters
- Supports interlaced or progressive video sequences
- Changes meaning of aspect ratio information variable
Profiles
With MPEG-2 a small number of subsets of the complete toolkit are defined. These subsets are described in terms of profiles and levels: a profile represents a subset of algorithmic tools and a level identifies a set of constraints on parameter values, such as the picture size or bit rate.
Under MPEG-2 two nonscalable profiles, simple and main, are defined. The simple profile uses no B frames. Thus, there are no backward or interpolated predictions, which reduces both processing requirements and the delay associated with the computations. This profile is suitable for low-delay applications, such as videoconferencing, where the overall end-to-end delay is approximately 100 ms. The main profile adds support for B frames and is the more widely used profile. Using B frames increases video quality; however, it adds approximately 120 ms to the coding delay to support frame reordering. The main profile is backward compatible with MPEG-1 video; the simple profile is not.
MPEG-2 also supports two scalable profiles, referred to as the SNR profile and the spatial profile. The SNR profile adds support for the enhancement layers of DCT coefficient refinement by using a signal-to-noise (S/N) ratio scalability tool. The spatial profile adds support for enhancement layers that transport coded images at different resolutions. The SNR profile can be used for digital terrestrial television as a method of providing for graceful degradation; the spatial profile provides a way to broadcast HDTV with a main profile-compatible SDTV service. A third profile, referred to as the high profile, adds support for coding a 4:2:2 video signal, where the terms represent the sampling structure of the digital picture such that chrominance is horizontally subsampled by a factor of 2 relative to the luminance.
Table 3.11 MPEG-2 Upper Limit Coding Parameter Constraints
Level      Frame Width  Frame Height  Frame Rate  Bit Rate  Buffer Size
           (Pixels)     (Pixels)      (Hz)        (Mbps)    (Bits)
Low        352          288           30          4         475136
Main       720          576           30          15        1835008
High-1440  1440         1152          60          60        7340032
High       1920         1152          60          80        9781248
Levels
MPEG-2 defines four levels of coding parameter constraints, including frame resolution, frame rate, maximum bit rate, and buffer size required for each level. Table 3.11 lists the upper limits of the constraints for each level. Note that SDTV requires main level, whereas HDTV requires a high-1440 level.
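The limits in Table 3.11 can be applied mechanically to find the lowest level that accommodates a given video format. The sketch below encodes the width, height, frame rate, and bit rate limits from the table and ignores the buffer size; the function name `lowest_level` and the example formats in the usage note are illustrative assumptions.

```python
# (level, max width, max height, max frame rate in Hz, max bit rate in Mbps)
# Values taken from Table 3.11, ordered from least to most demanding.
LEVELS = [
    ("Low", 352, 288, 30, 4),
    ("Main", 720, 576, 30, 15),
    ("High-1440", 1440, 1152, 60, 60),
    ("High", 1920, 1152, 60, 80),
]


def lowest_level(width, height, frame_rate, bit_rate_mbps):
    """Return the first level whose limits cover the format, or None."""
    for name, max_w, max_h, max_rate, max_mbps in LEVELS:
        if (width <= max_w and height <= max_h
                and frame_rate <= max_rate and bit_rate_mbps <= max_mbps):
            return name
    return None
```

For example, a 720 × 576 picture at 25 Hz and 6 Mbps fits the main level, whereas a 1440 × 1080 picture at 30 Hz and 20 Mbps needs the high-1440 level.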
Video Sequencing
Under the MPEG-2 standard, both interlaced and progressive video sequencing are supported. In comparison, MPEG-1 is limited to the support of progressive sequences because the target application was the compact video disk, which required a data rate of 1.2 Mbps.
Aspect Ratio
The MPEG-2 standard changed the meaning behind the variable used to define aspect ratio information and significantly reduced the number of defined aspect ratios. With MPEG-2, aspect ratio information refers to the overall display aspect ratio, such as 4:3 or 16:9. In fact, the MPEG-2 specification uses the video stream header to specify the intended display aspect ratio instead of the pixel aspect ratio. The display aspect ratio can be 4:3, 16:9, or 2.21:1. In comparison, under the MPEG-1 standard the aspect ratio refers to the individual pixel, where a square pixel used in computer graphics has a pixel aspect ratio, also known as the sample aspect ratio (SAR), of 1:1; video pixels, which are rectangular, have an aspect ratio of width:height (W:H), where W and H are not equal. The pixel aspect ratio should not be confused with the display aspect ratio (DAR); the latter describes the shape of the physically displayed image as measured with a ruler. The relationship of the width and height in pixels to the SAR and DAR is as follows:
Width/height = DAR/SAR
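The relationship Width/height = DAR/SAR can be rearranged to recover the sample aspect ratio from a frame size and a display aspect ratio. The short sketch below uses exact fractions; the function name `sample_aspect_ratio` and the example frame sizes are chosen for illustration.

```python
from fractions import Fraction


def sample_aspect_ratio(width, height, dar):
    """Solve width/height = DAR/SAR for SAR, using exact arithmetic."""
    return dar / Fraction(width, height)
```

A 720 × 576 frame intended for a 4:3 display therefore has a sample aspect ratio of 16:15, so its pixels are slightly wider than they are tall, whereas a 640 × 480 frame on the same display has square pixels.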
Frame Rate
Another difference between MPEG-1 and MPEG-2 is the meaning of the variable used to define the frame rate. Under MPEG-2 the variable references the intended display rate. In comparison, under MPEG-1 the variable references the coded frame rate. In addition, under MPEG-2 two new variables were defined that can be used with the frame rate variable to specify a much wider variety of display frame rates.
Other Differences
In addition to the previously mentioned differences between MPEG-2 and MPEG-1, there are a significant number of minute differences between the two standards. Those differences include the concept of the slice, vertical and horizontal range specifiers, and the ability to perform nonlinear macroblock quantization, which were added under the MPEG-2 standard. Another difference concerns the support for scanning quantized coefficients. In addition to the zig-zag scanning process that is supported by both MPEG-2 and MPEG-1, the former added support for an additional scan pattern that is used to scan quantized coefficients resulting from interlaced source frames. Because the purpose of this chapter is to become acquainted with the manner by which the MPEG-1 and MPEG-2 standards reduce the data transmission requirements of digital video signals, we will not probe deeper into the differences. Many books and Internet-accessible white papers specialize in this information.
H.264
In concluding our discussion of lossy compression, we will briefly discuss a video coding standard developed jointly by the ISO/IEC (International Organization for Standardization/International Electrotechnical Commission) and ITU-T (the Telecommunication Standardization Sector of the International Telecommunication Union). Referred to as the H.264 standard, this standard is also known as MPEG-4 AVC (advanced video coding). The H.264 standard is a video compression standard that provides a significantly greater compression capability than its predecessors, including providing DVD-quality video at a data rate under 1 Mbps.
Goal
The primary goal of the H.264 standard is to obtain a high compression ratio while preserving video quality. The result is the achievement of an approximate 50 percent reduction in the bit rate required to transport multimedia in comparison to previously developed standards.
Overview
The H.264 multimedia compression standard is similar to its predecessors in that its design is based on a sequence of actions referred to as building blocks. Where it differs from its predecessors is in additional algorithms, such as quarter-pixel motion accuracy, which increase its computational complexity in order to obtain a greater compression capability.
Layers
The H.264 standard has two distinct layers: the network abstraction layer (NAL) and the video coding layer (VCL). The NAL is responsible for packing coded data based on the characteristics of the network that will be used as the transport facility. In comparison, the VCL is responsible for generating an efficient representation of the data. Thus, the NAL manages transport over the network and the VCL represents a network-independent interface. The H.264 standard supports both IP and non-IP-based operations, including both fixed and wireless operations. In effect, the H.264 standard can be considered a Swiss Army knife of multimedia compression capability because it can be used with PDAs, DVDs, satellite, and content delivery servers streaming data to set-top boxes.
Table 3.12 Key Features of H.264 Video Coding
- Grouping of macroblocks for intra-prediction
- Use of an integer DCT-like transform
- Variable-block motion compensation
- Quantization parameters
Operation
The H.264 standard is similar to other multimedia compression techniques because it uses the spatial and temporal redundancies within and between frames to reduce the quantity of data. However, the H.264 standard differs in the techniques it uses. Table 3.12 lists some of the more prominent features associated with H.264 video coding.
Intra-Prediction
To exploit the spatial redundancies between adjacent blocks of a picture, the H.264 standard predicts the pixel value of a block from adjacent blocks, coding the difference between the actual value and the predicted value. Nine different intra-prediction modes for macroblocks are supported by the H.264 standard.
Use of an Integer DCT-Like Transform
The H.264 standard defines the use of an integer DCT-like transform. This transform is an approximation of the conventional DCT designed to enable its core functions to be computed via adders and shifters, eliminating the transform mismatch between the encoder and decoder that occurs when a conventional fixed-point DCT is used. In addition, the block size of the transform is reduced to 4 × 4, which enables a better quality result to be obtained during decoding.
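To make the idea of an adder-and-shifter transform concrete, the sketch below applies a 4 × 4 integer core transform to a block of integers. The matrix `CORE` is the forward core matrix commonly given for H.264 and is not taken from the text; its entries are limited to ±1 and ±2, so each multiplication reduces to an addition, a subtraction, or a one-bit shift. The scaling that a true DCT would require is assumed to be folded into quantization and is omitted. The helper names are hypothetical.

```python
# Forward core transform matrix (assumed; entries are only +/-1 and +/-2).
CORE = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4 x 4 integer matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose(m):
    """Return the transpose of a 4 x 4 matrix."""
    return [[m[j][i] for j in range(4)] for i in range(4)]


def forward_core_transform(block):
    """Compute CORE * block * CORE^T using integer arithmetic only."""
    return matmul(matmul(CORE, block), transpose(CORE))
```

Because every intermediate value is an integer, an encoder and a decoder that follow the same arithmetic obtain identical results, which is how the mismatch mentioned above is avoided. A uniform block produces a single nonzero coefficient in the top-left (DC) position.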
Variable Block Motion Compensation
Under the H.264 standard, motion compensation occurs using variable block sizes. Up to seven modes of variable block sizes and 16 motion vectors are supported for each macroblock. Although the use of variable block sizes improves prediction capability, it also increases the complexity of its codec computations.
Quantization Parameters
Perhaps the most important factor in a multimedia compression specification is the coding gain obtained due to quantization. Under the H.264 standard, a nonlinear quantization process is supported, with an approximate 12 percent increase in the magnitude of the step size from one quantization parameter to another. In addition, a total of 52 quantization parameters are supported by the standard, which enhances the coding gain obtainable in the quantization process.
Although the H.264 standard supports a wide range of network operations from ADSL to cable television, its greater complexity relative to MPEG-2 has resulted in the vast majority of current set-top boxes supporting the earlier standard. As the processing power of chips increases, we can reasonably expect the H.264 standard to have an increased level of implementation.
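Returning to the quantization figures quoted above, a growth of roughly 12 percent per quantization parameter compounds quickly across 52 parameters. The sketch below models the step size relative to parameter 0 under the assumption that the step size doubles for every increase of 6 in the quantization parameter; that doubling rule is consistent with a per-step growth of about 12 percent but is not stated in the text, and the function name `relative_step_size` is hypothetical.

```python
def relative_step_size(qp):
    """Step size at quantization parameter qp, relative to qp = 0.

    Assumes the step size doubles every 6 parameters, which gives a
    growth of about 12 percent from one parameter to the next.
    """
    if not 0 <= qp <= 51:
        raise ValueError("quantization parameter must be in the range 0..51")
    return 2.0 ** (qp / 6.0)
```

Under this model the step size at the highest parameter is several hundred times the step size at the lowest, which gives the encoder a very wide range over which to trade bit rate against picture quality.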
Chapter 4
The TCP/IP Protocol Suite and IPTV
The purpose of this chapter is twofold. First, we will review the major characteristics of the TCP/IP protocol suite. Once this is accomplished, we will use our knowledge of the protocol suite to obtain an understanding of the manner by which video can be transported in a TCP/IP environment.
4.1 The TCP/IP Protocol Suite
When we discuss the TCP/IP protocol suite and IPTV it is important to note that there are two types of video that can be delivered through the use of the TCP/IP protocol suite. Those types of video can be categorized as real-time and stored for replay. The first type of video, real-time, requires the use of a jitter buffer to smooth out delay variations experienced by packets as they flow through an IP network. In comparison, video that will be stored and later viewed on a PC, video iPod, or other device does not require the use of a jitter buffer. This section will focus on the TCP/IP protocol suite. Commencing with a review of the architecture of the TCP/IP protocol suite and its comparison to the well-known seven-layer reference model, we will then examine the network and common transport layer headers included in the protocol suite.
Overview
The TCP/IP protocol suite represents a layered protocol similar to the International Organization for Standardization (ISO) Open Systems Interconnection (OSI) seven-layer reference model, but it predates that model and consists of five layers. Figure 4.1 illustrates the five layers of the TCP/IP protocol suite during the formation of a LAN frame as well as the relationship between the layers in the ISO reference model and the TCP/IP protocol suite. In examining Figure 4.1, note that the TCP/IP protocol suite does not define a physical layer (layer 1). Instead, the TCP/IP protocol suite defines a series of address resolution protocols (ARPs) that enable the network layer’s addressing to be adapted to operate on the Media Access Control layer (MAC layer) supported by a particular LAN. In addition, layers 5 through 7, which represent the session, presentation, and application layers in the ISO reference model, are a single application layer in the TCP/IP protocol suite.
As a LAN frame is formed in the TCP/IP protocol suite, a transport layer header, typically either a TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) header, is prefixed to application data. Both TCP and UDP headers include source and destination port numbers, which indicate the type of application data being transported. In actuality, the destination port number indicates the application because a receiving device will “listen” on predefined port numbers to support one or more predefined applications associated with certain port numbers. In comparison, the source port is normally set to either a value of 0 or a randomly selected value.
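Prefixing a transport layer header to application data can be shown with a short Python sketch that builds a UDP datagram by hand. The header layout used here (four 16-bit fields: source port, destination port, length, and checksum) comes from the UDP specification rather than from the text, the checksum is left at zero, which in UDP over IPv4 means that no checksum was computed, and the port numbers in the usage note are arbitrary example values.

```python
import struct


def build_udp_datagram(source_port, destination_port, payload):
    """Prefix an 8-byte UDP header to the application data."""
    length = 8 + len(payload)  # header plus data, in bytes
    checksum = 0               # zero: checksum not computed (IPv4 only)
    header = struct.pack("!HHHH", source_port, destination_port, length, checksum)
    return header + payload


def parse_ports(datagram):
    """Return (source_port, destination_port) from a UDP datagram."""
    return struct.unpack("!HH", datagram[:4])
```

For example, `build_udp_datagram(49152, 5004, b"video")` produces a 13-byte datagram, and a receiver listening on port 5004 would use the destination port returned by `parse_ports` to hand the data to the right application.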
Figure 4.1 TCP/IP encapsulated in a LAN header.
Segments and Datagrams
Several terms are used to reference headers prefixed onto application data units. First, prefixing a TCP header to an application data unit forms a TCP segment. In comparison, prefixing a UDP header to an application data unit results in the formation of a UDP datagram. The formation of both TCP segments and UDP datagrams occurs at the transport layer of the TCP/IP protocol suite, which represents layer 4 in the ISO reference model.
When an IP header is prefixed to a TCP segment or UDP datagram, the result is an IP datagram. As indicated in Figure 4.1, this action occurs at layer 3 in the ISO reference model. In comparison to TCP and UDP headers, which identify the application being transported through the use of destination port numbers, the IP header denotes the sending and receiving interfaces through the use of source and destination address fields. Thus, both an IP header as well as a TCP or UDP header are required to identify both the type of data transported by an IP datagram and the originator and receiver of the datagram. As we probe deeper into the relevant fields of the protocol headers that must be considered when transporting video through firewalls and routers, we will note some of the well-known field assignments.
Returning our attention to Figure 4.1, the physical and data link layers are responsible for transporting raw data in the form of binary 1s and 0s. The physical layer can be twisted-pair, fiber, or a wireless link, whereas the data link layer can be a form of Ethernet or another type of network. The network layer is responsible for delivering data to its destination over one or more router “hops” based on the destination IP address in the IP header. Because data flows through routers, this layer is sometimes referred to as the routing layer. Moving up the protocol stack, we come to the transport layer, which is responsible for the delivery of packets. That delivery can be reliable and in sequence when TCP is used, or unreliable and possibly out of sequence when UDP is used. Although many traditional Internet applications use TCP for the transport layer protocol, it is not suitable for digitized voice and video. This is because TCP corrects for lost packets and transmission errors by retransmission, which causes latency that adversely affects real-time applications. Thus, IPTV primarily uses UDP at the transport layer. However, UDP is an unreliable protocol that depends on the upper-layer application for error detection and correction, sequencing of packets, and other actions that developers may elect to add to the application.
Because early trials of IPTV had more than enough bandwidth devoted to the video streams, the probability of packets becoming lost was virtually zero. However, as more subscribers select this service, the probability of packets being dropped by routers can be expected to increase. Thus, although what is referred to as “raw UDP” is sufficient for delivering video today, in the future UDP will more than likely be used in conjunction with the Real-Time Transport Protocol (RTP), which provides time stamping and sequencing. Both raw UDP and RTP will be discussed later in this chapter.
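The time stamping and sequencing that RTP adds on top of UDP can be previewed with a small sketch. The 12-byte fixed header layout and the use of payload type 33 for an MPEG-2 transport stream come from the RTP specifications rather than from the text, and the function names, the SSRC value, and the 90 kHz timestamp in the test values are illustrative assumptions.

```python
import struct


def build_rtp_header(sequence_number, timestamp, ssrc, payload_type=33):
    """Build a 12-byte RTP fixed header (version 2, no extensions)."""
    first_byte = 2 << 6                 # version 2; padding, extension, CSRC count all 0
    second_byte = payload_type & 0x7F   # marker bit clear
    return struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        sequence_number & 0xFFFF,       # 16-bit sequence number
        timestamp & 0xFFFFFFFF,         # 32-bit media timestamp
        ssrc,
    )


def sequence_number(packet):
    """Extract the sequence number from an RTP packet."""
    return struct.unpack("!H", packet[2:4])[0]


def reorder(packets):
    """Sort packets by sequence number (wraparound is ignored here)."""
    return sorted(packets, key=sequence_number)
```

A receiver that sees a gap in the sequence numbers knows a packet was lost, and one that sees them out of order can restore the original order before passing the video to the decoder, which raw UDP alone cannot do.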
ICMP Messages
Although not shown in Figure 4.1, the transport of an Internet Control Message Protocol (ICMP) message warrants mention. ICMP messages convey error and control information such that they represent an integral part of the TCP/IP protocol suite. Both routers and hosts use ICMP to transmit reports concerning received datagrams back to the originator. In addition, ICMP is used to generate the well-known and frequently used echo request and echo reply messages that are collectively better known as ping messages. ICMP messages are transported as an IP datagram. This means that an ICMP message is prefixed with an IP header, resulting in the encapsulation of an ICMP message within an IP datagram, as shown in Figure 4.2. As we probe deeper into the use of the TCP/IP protocol suite to convey video, we will note how encapsulation of data through a sequence of headers is used to control the flow of video.
Figure 4.2 An ICMP message is transported by the prefix of an IP header to the message.
The Network Layer
Because the data link layer represents a transport facility on a LAN or serial communications on a WAN, it is similar to the physical layer in that neither is defined in the TCP/IP protocol suite. Thus, we will commence our investigation of the operation of the TCP/IP protocol suite at the network layer. Through the prefix of an IP header, an IP datagram is formed. The IP header includes a series of fields that controls the delivery of data. Because IPv4 is currently used by more than 95 percent of all TCP/IP users, we will focus our attention on the IPv4 header even though the more modern IPv6 header is considered to represent a replacement for the prior network layer protocol.
Figure 4.3 The IPv4 header.
The IPv4 Header
The IPv4 header is illustrated in Figure 4.3. Note that this header consists of 12 fields plus optional options and padding fields. The first field in the header is a 4-bit version field. This field not only specifies the version of the IP in use, but also enables the originator, recipient, and routers between the source and destination to agree on the format of the datagram. For IPv4, the value of the version field is binary 0100 or decimal 4. Although all of the fields in the IPv4 header are important, we will limit our discussion of the header fields to the delivery of video. Thus, in a video operating environment, we need to concentrate on several IP header fields. Those fields include three that provide fragmentation control (identification, flags, and fragment offset), the time-to-live field, the protocol field, and the source and destination address fields. We will also focus on IP addressing and the subnet mask because several methods based on addressing can be used to deliver video over an IP network.
Fragment Control Fields
Previously we noted that three fields in the IP header control fragmentation. Those are the identification, flags, and fragment offset fields.
The 16-bit identification field contains a unique number that serves to identify the datagram. When a datagram is fragmented, each fragment is prefixed with an IP header that has the same number in the identification field. At the destination, the receiving device reassembles the fragments based on the identification and source IP address field values (together with the destination address and protocol fields) because these values uniquely identify a sequence of fragments generated by a common source. The next field in the IP header that is associated with fragmentation is the flags field. The two low-order bits in this 3-bit field are used to control fragmentation. The first of these bits denotes whether a datagram can be fragmented. Because setting this bit to binary 1 indicates that the datagram should not be fragmented, this bit is referred to as the "do not fragment bit." The low-order bit in the flags field indicates whether more fragments follow (binary 1) or the fragment is the last one (binary 0). Thus, this bit is referred to as the "more fragments bit." The third field associated with fragmentation is the 13-bit fragment offset field. The purpose of this field is to denote the offset of the fragment within the original datagram. The value of this field is specified in units of 8 bytes, with a value of 0 indicating the first fragment of a datagram. Now that we understand the use of the three fields of the IP header associated with fragmentation, let's move on and discuss the TTL field.
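The interaction of the fragment offset field and the more fragments bit can be illustrated with a short sketch. It assumes a 20-byte header without options and a hypothetical helper named `fragment`; it computes only the offset field value, data length, and more fragments setting for each fragment, not the fragments themselves.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (offset_field, data_len, more_fragments) for each fragment."""
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the offset field counts in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len      # more fragments bit
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments

# A 4000-byte payload crossing a link with a 1500-byte MTU.
frags = fragment(4000, 1500)
for offset_field, size, more in frags:
    print(offset_field, size, more)
```

With these numbers the first two fragments each carry 1480 data bytes and the last carries the remaining 1040, with offset field values of 0, 185, and 370.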
Time-to-Live Field
The original intention of the time-to-live (TTL) field was to specify how long, in seconds, a datagram could remain on the Internet. Because it is extremely difficult to time synchronize all routers, the TTL field is instead used to denote the number of router hops a datagram can traverse. That is, the value of the TTL field is decremented by 1 each time the datagram passes through a router. When the value of the TTL field reaches 0, the datagram is discarded, or sent to the great bit bucket in the sky. Thus, the TTL field provides a mechanism to prevent a datagram from continuously flowing over the Internet due to a variety of erroneous conditions.
Protocol Field
The purpose of the 8-bit protocol field is to identify the higher-layer protocol carried in the IP datagram. For example, the protocol field will have a value of 1 when IP transports an ICMP message. Similarly, the protocol field will have a value of 6 when the datagram transports TCP, whereas a value of 17 indicates that it is transporting UDP.
Table 4.1 Examples of IPv4 Protocol Field Assigned Values

Decimal Value   Keyword     Protocol Defined
1               ICMP        Internet Control Message Protocol
6               TCP         Transmission Control Protocol
8               EGP         Exterior Gateway Protocol
17              UDP         User Datagram Protocol
41              IPv6        IPv6
46              RSVP        Resource Reservation Protocol
58              IPv6-ICMP   ICMP for IPv6
Table 4.1 lists seven examples of assigned IP protocol numbers. Because the protocol field in the IP header is 8 bits long, up to 256 protocols (0 through 255) can be assigned.
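As a small illustration, the assignments in Table 4.1 can be held in a lookup table keyed by the protocol field value. The names `PROTOCOLS` and `protocol_keyword` are assumptions made for this sketch, which covers only the seven values listed in the table.

```python
# Values taken from Table 4.1; other assigned values are not covered by this sketch.
PROTOCOLS = {
    1: "ICMP", 6: "TCP", 8: "EGP", 17: "UDP",
    41: "IPv6", 46: "RSVP", 58: "IPv6-ICMP",
}

def protocol_keyword(value: int) -> str:
    """Map an IPv4 protocol field value to its keyword from Table 4.1."""
    if not 0 <= value <= 255:
        raise ValueError("the protocol field is 8 bits wide (0 through 255)")
    return PROTOCOLS.get(value, "not listed in Table 4.1")

print(protocol_keyword(17))
```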
Source and Destination IP Address Fields
Both the source and destination IP address fields are 32 bits in length under IPv4. IPv4 addresses are divided into five classes, referred to as Class A through Class E. Class A, B, and C addresses are subdivided into network and host address portions and are collectively referred to as classful IPv4 addresses. Class D addresses represent multicast addresses, where source traffic is transmitted to multiple receivers as a bandwidth conservation method. The fifth type of IPv4 address is a Class E address, which is reserved for experimental purposes. Figure 4.4 illustrates the three classful IPv4 address formats. Class D addresses used for multicast fall into the address block 224.0.0.0 through 239.255.255.255. Thus, a Class D address begins with the bit sequence 1110, which indicates that the address represents a multicast address. In examining Figure 4.4, note that a Class A address is identified by a binary 0 in its first bit position. Similarly, a Class B address is identified by the bit sequence 10 in its first two bit positions, and a Class C address is identified by the bit sequence 110 in its first three bit positions. As we move from a Class A address to a Class B and then a Class C address, the network portion of the address increases while the host portion of the address decreases. Thus, a Class A address has the smallest number of definable networks but the largest number of definable hosts, whereas a Class C address has the largest number of definable networks but the smallest number of definable hosts. Because the first bit in a Class A address is set to binary 0, only 7 bits of the first byte remain to identify the network. With network 0 reserved, this limits the number of Class A networks to a maximum of 127. However,
Figure 4.4 Classful IPv4 address formats.
the 127.0.0.0 network is reserved for the well-known loopback address, further reducing available Class A network addresses to a maximum of 126.
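The leading-bit patterns described above lend themselves to a short sketch that determines the class of an address from its first byte. The function name `address_class` is an assumption for illustration.

```python
def address_class(address: str) -> str:
    """Return the class (A through E) of a dotted decimal IPv4 address."""
    first = int(address.split(".")[0])
    if first & 0x80 == 0x00:   # 0xxxxxxx
        return "A"
    if first & 0xC0 == 0x80:   # 10xxxxxx
        return "B"
    if first & 0xE0 == 0xC0:   # 110xxxxx
        return "C"
    if first & 0xF0 == 0xE0:   # 1110xxxx, multicast
        return "D"
    return "E"                 # 1111xxxx, experimental

for sample in ("10.1.2.3", "172.16.0.1", "198.78.64.0", "224.0.0.1", "240.0.0.1"):
    print(sample, address_class(sample))
```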
General Classful Address Structure
Using N to represent a network byte and H to represent a host byte, we can note the general structure of a Class A address as follows:
(N) (H) (H) (H)
As previously noted, each successive address class increases the number of unique networks that can be defined while decreasing the number of hosts that can be defined on each network. Thus, the general structure of a Class B address becomes
(N) (N) (H) (H)
Similarly, the general structure of a Class C address becomes
(N) (N) (N) (H)
Due to a significant increase in the number of devices being connected to the Internet, classful addresses have become a rare and valuable resource. Although such techniques as network address translation (NAT)
and the use of private IP addresses behind a NAT device have extended the useful life of IPv4, the process of subnetting has allowed organizations to considerably conserve on the use of IP addresses, further extending the useful life of IPv4. For example, consider an organization that has two LANs: one with 15 workstations used by accountants and one with 20 workstations used by engineers. Without subnetting, the organization would use two Class C addresses, each permitting 256 unique host values (0 through 255) to be identified. However, because a host address of 0 could be confused with the basic network address and a host address of 255 is reserved as a broadcast address (hex FF, or all 1s, is 255), the maximum number of distinct hosts that can be supported on a Class C network address is reduced by 2 to 254. Thus, for the previously mentioned organization with two LANs, a single subnetted Class C address would accommodate the 35 workstations of the two LANs. This in turn would save one Class C address, which could be used to support up to 254 additional workstations. The key to the ability to assign a common IP address to multiple networks is the process of subnetting and the use of the subnet mask. Thus, let's turn our attention to these topics.
Subnetting
Subnetting represents the process of subdividing a classful IPv4 address into two or more separate entities. The subnetting process results in the subdivision of the host portion of a classful IPv4 address into a subnet portion and a host portion, with the network portion of the address remaining as is. Thus, subnetting has no effect on routing on the Internet because the network portion of the address is not modified. Figure 4.5 illustrates the
Figure 4.5 The subnetting process converts a two-level address into a three-level address.
Network   198.78.64.0
Subnet 0  11000000.01001110.01000000.00xxxxxx
Subnet 1  11000000.01001110.01000000.01xxxxxx
Subnet 2  11000000.01001110.01000000.10xxxxxx
Subnet 3  11000000.01001110.01000000.11xxxxxx
Figure 4.6 An example of the relationships among the Class C network address of 198.78.64.0, four subnets, and the host portion of each subnet.
subnetting process, which converts a two-level classful address into a three-level address. Figure 4.5 shows that as the number of subnets increases, the number of hosts that can reside on each subnet decreases. As an example of the use of subnetting, let's assume an organization has four LANs located in a building, with a maximum of 25 hosts on any network. Let's further assume that the organization can obtain only a single Class C address. Thus, let's focus our attention on how that one Class C address can be subnetted, which would eliminate the need for three additional Class C addresses. Because there are four LANs, we need two bits for the subnet, reducing the number of bits used to represent a host on each subnet to 6 (8 - 2). Thus, we can have up to 2^6 - 2, or 62, distinct hosts on each subnet, which is more than sufficient for each LAN. Assuming the Class C IP address provided to the organization is 198.78.64.0, the relationships among the network address, subnet, and host portions of the Class C address would appear as illustrated in Figure 4.6. In examining Figure 4.6, note that the host portion of each subnet is represented by six bit positions indicated by Xs, resulting in 64 distinct values that range from 000000 to 111111. Because a subnet is similar to a classful network address in that it cannot have a host address of all 0s or all 1s, this reduces the number of hosts on each subnet to 2^6 - 2, or 62.
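The arithmetic in this example can be checked with Python's standard `ipaddress` module. This is only a sketch of the calculation; the variable names are assumptions, and the /24 prefix is the modern way of writing a Class C network.

```python
import ipaddress

# The Class C network from the example, written with a 24-bit prefix.
network = ipaddress.ip_network("198.78.64.0/24")

# Borrow two host bits for the subnet, giving four subnets.
subnets = list(network.subnets(prefixlen_diff=2))

for index, subnet in enumerate(subnets):
    # Subtract the all-0s and all-1s host addresses.
    usable_hosts = subnet.num_addresses - 2
    print(f"Subnet {index}: {subnet} ({usable_hosts} usable hosts)")
```

Each of the four subnets holds 64 addresses, of which 62 can be assigned to hosts, matching the figure derived in the text.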
The Subnet Mask
Although the process of subnetting a classful IPv4 address is relatively straightforward, an unanswered question concerns how one determines the subnet within an address. The answer to this question is the use of the subnet mask, which is formed by a sequence of binary 1s to extend
the network portion of a classful address through the subnet series of bits. Because the first few bits in a classful IPv4 address identify the address type, they also indicate the initial subdivision of the address between its network portion and its host portion. Subtracting the number of bits in the network portion of the address from the number of 1s in the subnet mask then yields the number of bit positions in the subnet. For example, consider the previous example in which the network address was 198.78.64.0. The subnet mask required to have a two-position subnet becomes:
11111111.11111111.11111111.11000000
In dotted decimal notation the subnet mask would be entered as 255.255.255.192. The receipt of a datagram destined for network 198.78.64.0 by a router to which the previously mentioned LANs are connected initiates a series of steps to ensure traffic flows to the correct subnet. First, the router examines the first byte of the destination address, noting that its first three bits are 110. This indicates that the address is a Class C address and tells the router that the network portion of the address is contained in the first 24 bits or 3 bytes. Examining the subnet mask, the router notes that it has 26 bits set to 1. By subtracting the number of bits in the network address (24) from the number of set bits in the subnet mask (26), the router determines that the subnet is 2 bits in length. This enables the router to examine the first two bits in the host portion of the address to determine the subnet onto which the datagram should be routed. Thus, by examining the destination address in the IP datagram in conjunction with the subnet mask, the router obtains the ability to transfer the datagram onto the correct subnet.
Now that we understand the use of the IP header, a logical follow-up is to move up the TCP/IP protocol stack to the transport layer.
However, prior to doing so, we need to turn our attention to a special type of IP datagram that we briefly discussed previously in this chapter. That IP datagram consists of an IP header used to transport an ICMP message. Because ICMP messages are used to perform a variety of functions ranging from providing the foundation for the well-known ping test to determining the subnet mask, it is important to obtain an appreciation of the capability of ICMP messages. Thus, let’s turn our attention to this topic.
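Before moving on, the subnet determination described for the router can be sketched in a few lines. The function name `subnet_number` is hypothetical, and the sketch assumes the caller already knows the classful network length (24 bits for a Class C address).

```python
import ipaddress

def subnet_number(address: str, mask: str, network_bits: int = 24) -> int:
    """Return the subnet number encoded in an address, given its subnet mask."""
    addr = int(ipaddress.IPv4Address(address))
    mask_int = int(ipaddress.IPv4Address(mask))
    mask_bits = bin(mask_int).count("1")     # number of 1s in the mask (26 in the example)
    subnet_bits = mask_bits - network_bits   # 26 - 24 = 2
    host_bits = 32 - mask_bits               # 6 bits remain for hosts
    # Shift the host bits away, then keep only the subnet bits.
    return (addr >> host_bits) & ((1 << subnet_bits) - 1)

print(subnet_number("198.78.64.130", "255.255.255.192"))
```

Here the last byte of the address, 130, is binary 10000010, so its first two bits place the datagram on subnet 2 of Figure 4.6.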
Understanding ICMP Messages
Although it is true that some ICMP messages can be used to exploit network defenses, it is also true that preventing all ICMP messages from flowing through routers and firewalls can result in lost productivity. In this
Table 4.2 ICMP Type Field Values

Type Field Value   Defined ICMP Message Type
0                  Echo reply
3                  Destination unreachable
4                  Source quench
5                  Redirect
8                  Echo request
11                 Time exceeded
12                 Parameter problem
13                 Time-stamp request
14                 Time-stamp reply
15                 Information request
16                 Information reply
17                 Address mask request
18                 Address mask reply
section we review the operation of 13 actively used ICMP messages. This review will make us aware of the benefits associated with ICMP messages as well as provide us with a better understanding as to why we may wish to allow certain messages to flow through routers and firewalls. Table 4.2 lists 13 actively used ICMP messages and their type field values. Each ICMP message begins with three common fields, with the remaining fields structured according to the specific message. The three common fields are an 8-bit type field, which defines the ICMP message; an 8-bit code field, which may provide additional information about a particular message type; and a 16-bit checksum field, which is used to provide integrity for the message. In the following paragraphs we will discuss the use of ICMP messages.
Echo Request and Echo Reply
The ICMP echo request (type 8) and echo reply (type 0) messages are used to test if a destination is active and reachable. A host or router will transmit an echo request to a distant device. That device, if both reachable and active, will respond with an echo reply. Both echo request and echo reply messages are used by the well-known ping application.
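The three common fields can be seen in a minimal sketch that builds an echo request. The 16-bit identifier and sequence number fields that follow the common fields come from the ICMP specification rather than from the discussion above, and the function names are assumptions. The sketch only constructs the message bytes; it does not send them, which would require a raw socket and usually administrative privileges.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type 8, code 0, checksum, identifier, sequence."""
    # The checksum is computed with the checksum field itself set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
```

A receiver can verify integrity by running the same checksum routine over the entire message, which yields zero when the message is intact.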
Destination Unreachable
An ICMP message with a type field value of 3 represents a destination unreachable message. The reason the destination was unreachable is
Table 4.3 Destination Unreachable Code Field Values

Code Field Value   Meaning
0                  Network unreachable
1                  Host unreachable
2                  Protocol unreachable
3                  Port unreachable
4                  Fragmentation needed
5                  Source route failed
further defined by a numeric entry in the message's code field. Table 4.3 lists the defined code field values and their meanings for a destination unreachable message. Routers can be configured to transmit a network or host unreachable message when they cannot route or deliver an IP datagram. The type field in the resulting ICMP message identifies the message as a destination unreachable message and the code field value defines why the datagram could not be delivered. By default, many organizations configure their routers and firewalls to block all or most ICMP messages. When this occurs, you lose the ability to use ICMP messages to determine why datagrams could not reach their destination. Sometimes a bit of coordination with security personnel can result in the unblocking of one or more ICMP messages, which will enable you to obtain the results you seek from the use of such messages.
Source Quench
An ICMP message type of 4 represents a source quench message. This message is used by routers and hosts to control the flow of data. To understand the use of source quench, note that when datagrams arrive at a device at a rate higher than its processing rate, the device discards them. This explains how packets can be lost. For example, assume a router connects two large domains on the Internet to a third. At various times throughout the day, the packet arrival rate from the two domains destined to the third may exceed the packet processing rate of the router. The router is then forced to discard or drop packets. When this situation occurs, the device that discards the datagrams transmits an ICMP source quench message, which informs the source to slow down its datagram transmission rate. Typically, routers and hosts will transmit one source quench message for every datagram they discard.
Redirect
A type field value of 5 in an ICMP message denotes a redirect. When a router detects that a host is using a non-optimum route, it will transmit an ICMP redirect message to the host. Because many hackers use this message to play havoc with an organization's network, it is commonly blocked by routers and firewalls.
Time Exceeded
The ICMP time exceeded message (type 11) is generated by a router when it has to discard a datagram. Because the time-to-live (TTL) field in the IP header is decremented by 1 when a datagram flows through a router, and the datagram is discarded when the value reaches 0, a router will both discard the datagram and transmit an ICMP time exceeded message back to the source when this situation occurs. A second reason for the transmission of a time exceeded message is that the fragment reassembly time was exceeded. A code field value of 0 indicates a time-to-live count value was exceeded, whereas a value of 1 denotes that the fragment reassembly time was exceeded.
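The relationship between the TTL field and the time exceeded message can be illustrated with a toy simulation. The functions `forward` and `trace` are hypothetical, and the model ignores everything about a router except its handling of the TTL field.

```python
def forward(ttl: int):
    """Model one router hop: decrement the TTL and decide what happens next."""
    ttl -= 1
    if ttl <= 0:
        # Discard and report ICMP type 11 (time exceeded), code 0 (TTL exceeded).
        return ("discard", (11, 0))
    return ("forward", ttl)

def trace(initial_ttl: int, routers: int) -> str:
    """Follow a datagram across a path containing the given number of routers."""
    ttl = initial_ttl
    for hop in range(1, routers + 1):
        action, result = forward(ttl)
        if action == "discard":
            icmp_type, icmp_code = result
            return f"discarded at router {hop}, ICMP type {icmp_type} code {icmp_code}"
        ttl = result
    return f"delivered with TTL {ttl}"

print(trace(3, 5))
print(trace(64, 5))
```

The first call models a datagram whose TTL is too small for the path, and the second a datagram that arrives with most of its TTL unused.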
Parameter Problem
An ICMP type field value of 12 is used to define a parameter problem. A router or host that encounters a problem in interpreting the fields within an IP header will return an ICMP parameter problem message to the source. This message will include a pointer that identifies the byte in the IPv4 header that caused the problem.
Time-Stamp Request and Reply
From Table 4.2 you will note that ICMP message types 13 and 14 represent time-stamp request and time-stamp reply messages, respectively. Both messages are used to synchronize the clocks of two devices. In addition, the fields within these messages can be used to estimate the transit time of datagrams between two devices.
Information Request and Reply
From Table 4.2, another pair of ICMP messages are types 15 and 16, information request and information reply. The information request message is used to obtain an IP address for a network to which a device is attached. Thus, this ICMP message serves as an alternative to the use of
a reverse ARP. The information reply message functions as a response to the information request.
Address Mask Request and Reply
The last two type field values listed in Table 4.2 represent an address mask request (17) and an address mask reply (18) message. This pair of ICMP messages enables a device to learn its subnet mask. A device first transmits an ICMP address mask request to a router. That transmission can be either a broadcast, if the device was not previously configured with the router's IP address, or a unicast message, if it was configured with the address. In either situation the router will respond with the address mask in an ICMP address mask reply message.
Now that we have an appreciation for ICMP messages as well as the fields within the IP header, let us move up the protocol stack and turn our attention to the transport layer.
The Transport Layer
In our prior examination of the fields within the IP header, we noted that the 8-bit protocol field identifies the transport layer protocol header that follows the IP header. The transport layer permits multiple applications to flow to a common destination, either from the same source IP address or from different source addresses. To accomplish this task, the transport layer protocol includes a destination port field in which a numeric entry identifies the application whose data is being transported. Thus, the transport layer resides above the network layer but below the application layer, receiving application data, which is then encapsulated with a transport header that identifies the application. Once encapsulated, the TCP segment or UDP datagram is passed to the network layer, where the IP header is added to form an IP datagram.
TCP vs. UDP
Although the protocol field within the IPv4 header is capable of identifying 256 protocols, two protocols account for the vast majority of transport layer activity: TCP and UDP. TCP is a reliable, connection-oriented protocol that includes a flow control mechanism. In comparison, UDP is an unreliable, best-effort protocol that depends on the application layer to add such functions as error detection and correction and flow control. As an example of the use of TCP and UDP, we can consider Voice-over-IP (VoIP). For this
application, TCP would be used to transport the dialed number and UDP would be used to transport digitized voice as a sequence of small periods of digitized data. Because TCP is a connection-oriented, reliable protocol, a response from the destination is required and the dialed digits arrive error free. In comparison, UDP, which is an unreliable, connectionless protocol, allows digitized voice to flow to its destination, and small snippets of, say, 20 ms of voice can be lost without adversely affecting the reconstructed conversation at the destination. Now that we have an appreciation for the difference between these two popular transport layer protocols, let's examine the headers of each.
The TCP Header
Figure 4.7 illustrates the fields within the TCP header. Included in the header are source and destination port fields, with the destination port field used to identify the application being transported. The sequence number field enables segments received out of order to be correctly sequenced, and other fields in the header perform flow control (window and acknowledgement) and data integrity (checksum). For the purpose of this book, which is focused on IPTV, we will limit our discussion of TCP and UDP headers primarily to their source and destination port fields.
The UDP Header
Previously we noted that the UDP protocol represents a best-effort transport protocol that depends on the application layer for such functions as
Figure 4.7 The TCP header.
Figure 4.8 The UDP header.
flow control and error detection and correction. Thus, as you might expect, the UDP header is streamlined in comparison to the TCP header. Figure 4.8 illustrates the fields in the UDP header. Similar to the TCP header, the UDP header includes 16-bit source and destination port fields that identify the process or application being transported. Thus, let's turn our attention to those two fields, which are common to both headers.
Source and Destination Port Fields
For both TCP and UDP, the source and destination port fields are each 16 bits in length. The source port field is supposed to denote the application associated with the data generated by the originating station. However, most source port field values are either set to 0 if the source port is not used or represent a random number generated by the originator. In comparison, the destination port field contains a value that identifies a user process or application at the receiving station whose IP address is denoted by the destination IP address field value in the IP header. Because data flows between the same pair of source and destination addresses can use different destination port numbers, the port field enables multiple applications to flow to a common destination. For example, when a station initiates an HTTP session, it would place port number 80 in the destination port field. Later, the HTTP session could be followed by a Telnet session, with the originating station placing port number 23 in the destination port field. Because there are three types of port numbers that can be used in the TCP and UDP port fields, let's examine port numbers in more detail.
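As a brief illustration, the two port fields can be extracted from the 8-byte UDP header as sketched below. The length and checksum fields that complete the header are taken from the UDP specification rather than from the discussion above, and the function name and sample values are assumptions.

```python
import struct

def parse_udp_header(raw: bytes) -> dict:
    """Unpack the four 16-bit fields of a UDP header."""
    source_port, destination_port, length, checksum = struct.unpack("!HHHH", raw[:8])
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "length": length,          # header plus data, in bytes
        "checksum": checksum,
    }

# Hypothetical header: a random high source port sending to TFTP (port 69).
sample_udp = struct.pack("!HHHH", 50000, 69, 8, 0)
udp_fields = parse_udp_header(sample_udp)
print(udp_fields)
```

Because the source and destination port fields occupy the first four bytes of both headers, the same first two unpacked values apply to a TCP header as well.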
Port Numbers
Both TCP and UDP headers, as illustrated in Figures 4.7 and 4.8, contain 16-bit source and destination port fields, enabling port numbers to range
in value from 0 to 65535. This results in a “universe” of 65536 port numbers, which are subdivided into three ranges referred to as well-known ports, registered ports, and dynamic or private ports.
Well-Known Ports
Well-known ports are also referred to as assigned ports because their assignment is controlled by the Internet Assigned Numbers Authority (IANA). Well-known or assigned ports are in the range of 0 to 1023, providing 1024 possible assignments. Such ports are used to indicate the transportation of standardized processes and for the most part have the same assignments for both TCP and UDP. Ports used by TCP typically support relatively long-lived connections that require error detection and correction, such as file transfers (FTP) and remote access (Telnet).
Registered Ports
Port numbers beyond 1023 can be used by any process or application. However, doing so in a haphazard manner could result in incompatibilities between vendor products. To alleviate this potential problem, the IANA allows vendors to register their use of port numbers, with port number values from 1024 to 49151 allocated for registered ports. Although a vendor can register an application or process with the IANA and obtain a port number for its use, the registration does not carry the weight of law. That is, registered ports primarily allow other vendors to develop compatible products and allow end users to configure equipment to use such products. For example, when a new application uses a registered port number, it becomes a relatively easy task both to adjust a router access list or firewall configuration to enable the flow of datagrams used by the new application and to purchase and use other vendor products that perform a similar function through the use of the same registered port.
Dynamic Ports
Dynamic ports begin where registered ports end, resulting in their use of ports 49152 through 65535. Port numbers in this range are commonly used by vendors implementing proprietary network applications. A second common use of dynamic port numbers is for NAT, which we will discuss in the next section because it can adversely affect certain IPTV operations. Table 4.4 provides a few examples of well-known and registered port numbers. Although some services and applications may be familiar to
Table 4.4 Examples of Well-Known and Registered TCP and UDP Services and Port Utilization

Service                          Port Type     Port Number
Well-Known Ports
Remote job entry                 TCP           5
Echo                             TCP and UDP   7
Quote of the day                 TCP           17
File transfer (data)             TCP           20
File transfer (control)          TCP           21
Telnet                           TCP           23
Simple Mail Transfer Protocol    TCP           25
Domain Name Server               TCP and UDP   53
Trivial File Transfer Protocol   UDP           69
Finger                           TCP           79
Hypertext Transfer Protocol      TCP           80
Secure HTTP                      TCP           443
AppleTalk Filing Protocol        TCP and UDP   548
Registered Ports
Kazaa                            TCP and UDP   1214
Lotus Notes                      TCP           1352
Novell GroupWise                 TCP and UDP   1677
H.323 host call                  TCP and UDP   1720
MSN Messenger                    TCP           1863
Yahoo Messenger: voice chat      TCP and UDP   5000–5001
Yahoo Messenger                  TCP           5050
Yahoo Messenger: Web cams        TCP           5100
AOL Instant Messenger            TCP           5190
Bit Torrent                      TCP and UDP   6881–6889, 6969
RTP-QT4 (Apple QuickTime)        UDP           6970–6999
RTP                              UDP           16384–32767
readers, a few deserve a bit of explanation. Bit Torrent represents both an application and a peer-to-peer file transfer protocol that sends portions of files from one client to another. A central server, referred to as a tracker, coordinates the actions of peers. Because Bit Torrent enables uploads and downloads to occur simultaneously, it makes more efficient use of bandwidth. In addition, because large files, such as videos, are broken into smaller pieces, the use of Bit Torrent enhances the availability of popular files; instead of an "all or nothing" approach to downloading,
a file may be split into hundreds of pieces that can be obtained from many sites. A second protocol worth noting is the RTP, which provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio and video, over multicast and unicast network services.
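The three port ranges described earlier in this section can be summarized in a short sketch. The function name `port_range` is an assumption for illustration.

```python
def port_range(port: int) -> str:
    """Classify a TCP or UDP port number into one of the three ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16 bits wide (0 through 65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"

for sample_port in (80, 1214, 16384, 49152):
    print(sample_port, port_range(sample_port))
```

Note that a port such as 1214 falls into the registered range, even though the application that uses it may be very widely known.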
Network Address Translation
Network address translation (NAT) was originally developed as a tool to extend the life of scarce IPv4 addresses. As the use of the Internet expanded, it became more difficult for organizations to obtain IPv4 addresses. Because only a portion of an organization's workstations might require access to the Internet at any particular time, the use of a classful IP address could result in wasting many host addresses on a network. Recognition of the fact that some organizations would not directly connect their workstations to the Internet resulted in three address blocks being reserved for private Internet use. Those address blocks, which are listed in Table 4.5, are defined in RFC 1918.

Table 4.5 RFC 1918 Reserved IPv4 Addresses
Address Blocks
10.0.0.0–10.255.255.255
172.16.0.0–172.31.255.255
192.168.0.0–192.168.255.255

By using an address translator to map or translate unregistered, private IPv4 addresses into a registered address, it became possible to conserve IP addresses. For example, suppose your organization has five LANs, each with 200 workstations. Instead of assigning each workstation a scarce IPv4 public address, you could use five private Class C network addresses from Table 4.5. Then, using an address translator, your organization would translate RFC 1918 addresses to a single public IP address. Obviously, without a technique to differentiate one translation from another, havoc would result. Thus, to ensure each translation is unique, the address translator uses or assigns a high port number to the source port in the TCP or UDP header and enters the RFC 1918 IP address and the port number into a table. Then, when a response occurs, the address translator notes the port number returned in the header and uses that number to perform a table lookup, noting the RFC 1918 address associated with the port number. Next, the address translator replaces the public address in the IP header's destination IP address field with the RFC 1918 address just obtained and forwards the datagram to the private IP network. This type of translation is referred to as port-address translation and is performed by many routers and firewalls. Although NAT can considerably economize on the use of scarce IPv4 addresses, it can create problems when some type of tunneling is employed and the inner IP datagram remains as is, with only the outer IP header operated on by NAT. This means that you could not interconnect two networks that use the same RFC 1918 private network addresses, because doing so would result in routing problems on each network. Instead, you would need to change the network address on one of the interconnected networks.
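The table-driven translation described above can be sketched as a small class. The class name `PortAddressTranslator`, its method names, and the starting port of 49152 are assumptions made for illustration; a real translator also tracks the protocol, ages out idle entries, and handles port exhaustion, none of which is modeled here.

```python
import itertools

class PortAddressTranslator:
    """Toy model of port-address translation for outbound and returning traffic."""

    def __init__(self, public_ip: str, first_port: int = 49152):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)  # high port numbers, assigned in order
        self._by_public_port = {}   # public port -> (private IP, private port)
        self._by_private = {}       # (private IP, private port) -> public port

    def outbound(self, private_ip: str, private_port: int):
        """Return the (public IP, public port) that replaces the private source."""
        key = (private_ip, private_port)
        if key not in self._by_private:
            public_port = next(self._next_port)
            self._by_private[key] = public_port
            self._by_public_port[public_port] = key
        return self.public_ip, self._by_private[key]

    def inbound(self, public_port: int):
        """Look up the private (IP, port) for a response arriving on a public port."""
        return self._by_public_port[public_port]

nat = PortAddressTranslator("198.78.64.1")
first = nat.outbound("192.168.1.10", 3000)
second = nat.outbound("192.168.2.10", 3000)
print(first, second, nat.inbound(second[1]))
```

Two workstations that happen to use the same source port receive different public ports, which is what allows the returning datagrams to be told apart.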
4.2 Delivering IPTV
Now that we have an appreciation for the TCP/IP protocol suite, we will turn our attention to the delivery of video via that protocol suite. In this section we will first discuss the two major delivery methods used by IPTV. This will be followed by a discussion of the different standards used to encode and deliver video. Because the MPEG-2 standard is the most common method used for the delivery of video over an IP network, we will focus our attention primarily on this topic. In addition, because the RTP is essential for the delivery of video, this section will also explore its use.
Delivery Methods
There are three basic methods by which video can be delivered via an IP network: file transfer, which precludes real-time viewing; broadcast; and video on demand (VOD). The latter two methods are used for the real-time viewing of a movie, television show, concert, or other type of visual performance. Because the file transfer of video, although representing a transfer of data over an IP network, is used for playback and not immediate viewing, I will simply state that the transfer can occur via FTP or through the use of the previously described Bit Torrent application. Thus, in the remainder of this section we will focus our attention on the use of broadcast and video-on-demand technologies to provide an IPTV capability.
Broadcast When video is broadcast, each feed is provided a unique channel number to enable a set-top box to select the feed the person controlling the box wishes to view. In actuality, when a person uses the set-top box to select a channel, the box will establish a multicast connection to the broadcast channel, eliminating the need for all digitized channels to flow into the subscriber’s home, as they now do when cable TV is used. The top portion of Figure 4.9 illustrates the delivery of video via the broadcasting of channels. The broadcast source can be movies previously stored on a server as well as a live feed from an on-air television station showing the Summer Olympic basketball finals, a soap opera, or another show. Each source is input into a broadcast encoder, which packetizes the video stream, including setting a channel number and the multicast group address that set-top boxes will join whenever a viewer selects a channel using the box. The broadcast system can be thought of as a series of media servers that host a number of broadcast streams. The media servers support the delivery of both multicast and unicast, with the latter used for VOD operations. Because billing for services represents an important aspect of any IPTV operation, a subscriber management system is used to perform that function. In addition to billing subscribers, the subscriber management system will normally provide such additional functions as broadcasting an electronic programming guide and supporting interactive set-top box features, such as providing selected content to subscribers as a VOD product.
Figure 4.9 IPTV can be delivered via broadcast and unicast video-on-demand (VOD) transmission. (STB, set-top box)
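The channel-selection step described above amounts to the set-top box joining an IP multicast group. The following is a minimal sketch of that idea in Python rather than a description of any particular set-top box. The channel map, the group address 239.1.1.1, and the port number are hypothetical values chosen only for illustration.

```python
import socket

# Hypothetical channel map: channel number -> (multicast group, UDP port).
CHANNEL_MAP = {7: ("239.1.1.1", 5004)}


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the group address followed by the local interface address (8 bytes)."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_channel(channel: int) -> socket.socket:
    """Open a UDP socket and join the multicast group assigned to the channel."""
    group, port = CHANNEL_MAP[channel]
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group causes the host to signal its interest to the network,
    # so only the selected channel needs to flow toward the subscriber.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group))
    return sock
```

Changing channels in this sketch would mean closing the socket (leaving the group) and calling join_channel with the new channel number.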
Video on Demand The lower portion of Figure 4.9 illustrates the integration of VOD into an IPTV communications system. Because VOD responds to a query generated by a subscriber through the set-top box or PC, the response flows as a sequence of unicast datagrams to the IP address of the set-top box or personal computer. Typically, the subscriber management station will display a list of VOD events from which a subscriber can select a program. However, it’s also possible for an IPTV operator to insert a card with the subscriber’s monthly bill, which can list hundreds of events, their viewing cost, and an access code to retrieve selected events. For either method, the flow of IP datagrams will represent a unicast transmission to the subscriber’s set-top box or personal computer.
Video Delivery Standards Several standards can be used for providing an IPTV video delivery system. The two most popular standards are MPEG-1 and MPEG-2, with the latter representing a modification to the earlier standard that is commonly used in cable TV set-top boxes due to its enhanced data compression capability. A newer MPEG standard is MPEG-4, which was finalized in 1998 and became an international standard in 2000. MPEG-4 is designed to deliver DVD-quality video similar to MPEG-2 but at lower data rates. In fact, MPEG-4 scales to transport media at any data rate, from slow-speed dialup to high-bandwidth fiber-to-the-home connections. In 2002 Apple Computer added support for MPEG-4 to its QuickTime technology and worked with such leading vendors as Cisco, IBM, Philips, Sun Microsystems, and 20 other companies to form the Internet Streaming Media Alliance (ISMA) to ensure that an MPEG-4 media stream created with one vendor’s product will run on another vendor’s player. The video coding standard better known as H.264 is equivalent to the MPEG-4 Part 10 standard. Although both MPEG-4 and H.264 provide a considerable enhancement over MPEG-2, they are much more computing intensive. This means that for existing cable providers that have a large installed base of MPEG-2 set-top boxes, the upgrade to MPEG-4- or H.264-compatible devices can be a difficult task. In comparison, newly emerging IPTV providers, whose customers are just receiving new set-top boxes or will use Pentium 4 or dual-core-based PCs to view video, can be more readily served with this newer technology. In fact, some satellite television systems, such as DirecTV, have adopted MPEG-4 for the delivery of digital television due to its picture quality at lower data rates, enabling more channels to be delivered within the frequency spectrum they are authorized to use. Now that we have an appreciation for commonly used video delivery standards, let’s turn our attention to the manner by which video is delivered when MPEG-2 is used as the compression scheme.
Using MPEG-2 One of the most popular methods used to deliver IPTV is through the encapsulation of MPEG-2 using UDP at the transport layer. When this encapsulation occurs, RTP can optionally be carried over UDP to provide application-level framing that identifies the payload being transported and provides a sequence number for each RTP data packet, which allows packet loss to be detected.
UDP/RAW and UDP/RTP Although Figure 4.10 illustrates the use of RTP for transporting MPEG-2-based IPTV, it is important to note that video can also be transported directly in UDP packets without the use of RTP. When this situation occurs, the transport stream is referred to as UDP/RAW. When UDP/RAW is used, several error and informational conditions can be detected, including
■ Sender changed
■ Missing synchronization bytes
■ Incorrect packet size
■ Time-outs
■ Excessive jitter
■ Improper UDP bit rate
Figure 4.10 Delivering MPEG-2 via IP.
When RTP is used with UDP, as shown in Figure 4.10, packets can be time stamped and identified through the use of a sequence number. This allows detection of several additional error conditions beyond those detectable when UDP/RAW is used. These newly detectable error conditions using UDP/RTP include
■ Determining packets received out of order
■ Detecting duplicate packets
■ Determining if a packet is lost
■ Determining packets that have an incorrect size
Although both UDP/RAW and UDP/RTP can be used to transport video, the latter provides the ability to compensate for such error conditions as packets being received out of order, packets of incorrect size, or duplicate packets. In addition, because UDP/RTP enables a receiver to determine if a packet is lost, it also allows a receiver to compensate for the occurrence of lost packets. Depending on the software used by a receiver, it may do nothing, in which case the screen will appear blank, or it may repeat the previously received frame. Concerning the latter, because the next received frame may or may not be considerably different from the repeated frame, the result could be either a smooth or jerky transition. A time stamp enables a receiver to perform synchronization as well as to resolve jitter due to delays a packet experiences as it flows through a network. Figure 4.10 illustrates the encapsulation process that enables MPEG-2 video to be delivered via IP datagrams. Because RTP is an integral part of the delivery process that enables lost packets transporting video to be detected and compensated for, let’s turn our attention to this protocol. First we will examine the overhead associated with the use of this protocol. Then we will investigate the protocol in detail.
RTP Overhead In examining the encapsulation of an MPEG-2 data stream, as shown in Figure 4.10, note the overhead of the protocol at the network layer. The IP, UDP, and RTP headers result in 40 bytes of overhead while 1316 bytes of video are transported via MPEG-2. Thus, the overhead at the network layer is 40/1316, or approximately 3 percent. If the IP datagram flows on an Ethernet network that has a 26-byte header and 4-byte trailer, the resulting overhead becomes 70/1316, or approximately 5.3 percent. When UDP/RAW is used, the RTP header is eliminated. With the 12 bytes of the RTP header eliminated, the overhead at the network layer becomes 28/1316, or approximately 2.1 percent. When UDP/RAW within an IP datagram flows on an Ethernet network, the resulting overhead becomes 58/1316, or approximately 4.4 percent. Table 4.6 compares the overhead associated with the use of UDP/RAW and UDP/RTP. Note that for both the network and data link layers the difference in overhead between UDP/RTP and UDP/RAW is less than 1 percent. As noted in Table 4.6, the addition of RTP to the protocol stack does not result in an excessive amount of overhead. Now that we have an appreciation for the overhead resulting from the use of RTP, let’s focus our attention on the protocol itself.

Table 4.6 Comparing Overhead of UDP/RAW and UDP/RTP
Protocol    Network Layer (%)    Data Link Layer (%)
UDP/RTP     3.0                  5.3
UDP/RAW     2.1                  4.4
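The overhead arithmetic above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the split of the 28 non-RTP bytes into a 20-byte IPv4 header without options and an 8-byte UDP header is the conventional one, and the function and constant names are invented for the illustration.

```python
IP_HEADER = 20              # bytes, IPv4 header without options (assumed)
UDP_HEADER = 8              # bytes
RTP_HEADER = 12             # bytes, fixed RTP header as stated in the text
ETHERNET_FRAMING = 26 + 4   # header plus trailer, as stated in the text
PAYLOAD = 1316              # bytes of MPEG-2 video carried per datagram


def overhead_percent(header_bytes: int, payload_bytes: int = PAYLOAD) -> float:
    """Overhead expressed as a percentage of the payload, rounded to one decimal."""
    return round(100.0 * header_bytes / payload_bytes, 1)


def overhead_table() -> dict:
    """Return (network layer %, data link layer %) for each transport option."""
    raw = IP_HEADER + UDP_HEADER
    rtp = raw + RTP_HEADER
    return {
        "UDP/RTP": (overhead_percent(rtp), overhead_percent(rtp + ETHERNET_FRAMING)),
        "UDP/RAW": (overhead_percent(raw), overhead_percent(raw + ETHERNET_FRAMING)),
    }
```

Calling overhead_table() yields the same four percentages that appear in Table 4.6.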
The Real-Time Transport Protocol (RTP) RTP is commonly identified in the UDP header by a value of 5004 in the port field, although, as noted later in this section, applications can use other port numbers. As previously mentioned, RTP provides end-to-end network transport functions that facilitate the delivery of such real-time data as audio, video, and simulation data via multicast or unicast network services. Although RTP provides sequencing and time stamping of data, it does not address the reservation of resources nor does it guarantee quality of service (QoS) for real-time data. Thus, it is incumbent on network operators to configure router queues to prioritize predefined traffic to enable real-time video to reach its destination with minimal delay.
The RTP Header Figure 4.11 illustrates the RTP version 2 header. RTP version 1 is currently limited to legacy operations, and all modern applications are written for version 2, which is not backward compatible with version 1. The RTP version 2 header consists of ten fields, each of which we will briefly discuss.
Version (Ver) Field The first two bits in the header represent the version field. For RTP version 2, this field is always set to binary 10 or decimal 2.
Figure 4.11 The RTP version 2 header.
Padding (P) Field The padding field is 1 bit in length. This field is set to a binary 1 when one or more additional padding bytes that are not part of the payload are added to the packet. The last byte of padding contains a count of the number of padding bytes. Thus, the receiver uses this value to determine the number of padding bytes to ignore.
Extension (X) Field This 1-bit field is set if the fixed header is followed by exactly one header extension. Otherwise, the value of this field is set to 0.
CSRC (CC) Count Field This 4-bit field indicates the number of contributing source (CSRC) identifiers that follow the fixed header. Up to 15 contributing sources can be defined.
Marker (M) Field This 1-bit field, when set, is interpreted by a profile. For example, setting this bit enables frame boundaries to be marked in a stream of packets.
Payload Type (PT) Field The purpose of this 7-bit field is to identify the format of the RTP payload so that it can be correctly interpreted by the application. For example, H.261 video is identified by a PT field value of 31 whereas H.263 video is identified by a PT field value of 34.
Sequence Number Field The 16-bit sequence number field has an initial randomly selected value. Thereafter, the sequence number increments by one for each transmitted RTP data packet. The primary purpose of the sequence number field is to detect packet loss. For video, a frame may be split across several RTP packets, so those packets may have the same time stamp. Thus, the sequence number can be used to ensure that multi-part video frames are correctly reassembled at the receiver.
Time-Stamp Field The 32-bit time-stamp field is used to place audio and video packets in their correct timing order. The value of the time-stamp field reflects the sampling of the first byte in the RTP data packet, which explains why a frame that needs to be transported by a series of RTP packets will have the same time-stamp value in each RTP packet header.
Synchronization Source (SSRC) Field This 32-bit field identifies the synchronization source. The value for this field is randomly selected so that two synchronization sources within the same RTP session will have a very low probability of having the same value. If they do, all RTP implementations must be capable of detecting and resolving collisions.
Contributing Source (CSRC) Field This field represents an array of 0 to 15 CSRC elements, each 32 bits in length. Because the CSRC count field is 4 bits long, no more than 15 CSRC elements can be included; they identify the contributing sources for the payload contained in the packet. CSRC identifiers are inserted by mixers, using the SSRC identifiers of contributing sources, such as the identifiers of all sources mixed together to create an audio packet. A mixer is an RTP-level relay device that enables a variety of bandwidths to be used for listening to and/or viewing a common channel. A mixer is placed near a low-bandwidth area and resynchronizes packets to maintain the spacing generated by the originator, and it translates, for example, audio coding used on a high-bandwidth connection to an encoding method more suitable for the lower bandwidth connection. In a video environment, a mixer could be designed to scale the images of individuals in separate video streams into a single composite stream to simulate a group scene. Other examples of video mixers include devices that translate video streams from IP/UDP into a different protocol or translate video streams from individual sources without performing resynchronization or mixing.
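To tie the ten header fields together, the following sketch unpacks the fixed 12-byte RTP version 2 header and any CSRC entries that follow it. The function name and the sample packet values are hypothetical; the field widths are those described above, and header extensions and padding removal are deliberately left out.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the RTP version 2 fixed header and the CSRC list that follows it."""
    if len(packet) < 12:
        raise ValueError("the fixed RTP header is 12 bytes long")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = first & 0x0F                 # 4-bit CC field, 0 to 15
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet too short for the CSRC list")
    csrc = list(struct.unpack("!%dI" % csrc_count, packet[12:end]))
    return {
        "version": first >> 6,                # 2 bits
        "padding": (first >> 5) & 1,          # 1 bit
        "extension": (first >> 4) & 1,        # 1 bit
        "csrc_count": csrc_count,
        "marker": second >> 7,                # 1 bit
        "payload_type": second & 0x7F,        # 7 bits
        "sequence": sequence,                 # 16 bits
        "timestamp": timestamp,               # 32 bits
        "ssrc": ssrc,                         # 32 bits
        "csrc": csrc,                         # 0 to 15 entries of 32 bits each
    }


# Hypothetical sample: version 2, marker set, payload type 34 (H.263).
SAMPLE = struct.pack("!BBHII", 0x80, 0x80 | 34, 1000, 3000, 0x12345678)
```

A receiver built on such a parser could compare successive sequence values to detect lost, duplicated, or reordered packets.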
Time-Stamp Computations For video applications, the time stamp depends on the ability of the application to determine the frame number. For example, if the application transmits every frame at a fixed frame rate, the time stamp is governed by the frame rate. That is, assuming the 90-kHz clock rate commonly used for video, at a frame rate of 30 frames per second (fps), the time stamp would increase by 3000 for each frame (90,000/30); whereas at a frame rate of 25 fps, the time stamp would increase by 3600 per frame (90,000/25). As previously noted, when a frame is transmitted as a series of RTP packets, the time-stamp value in each header will be the same. For situations where the frame number cannot be determined, the system clock value will be used to compute the time stamp.
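The per-frame increments quoted above follow from dividing the media clock rate by the frame rate. The sketch below assumes a 90-kHz clock, which is consistent with the 3000 and 3600 figures, and uses hypothetical function names.

```python
VIDEO_CLOCK_HZ = 90_000   # assumed clock rate; 90,000 / 30 = 3000 matches the text


def timestamp_increment(frame_rate: float, clock_hz: int = VIDEO_CLOCK_HZ) -> int:
    """Number of clock ticks that elapse between successive frames."""
    return round(clock_hz / frame_rate)


def frame_timestamp(initial: int, frame_number: int, frame_rate: float) -> int:
    """Time stamp of a given frame; the 32-bit field wraps around at 2**32."""
    return (initial + frame_number * timestamp_increment(frame_rate)) % 2**32
```

Every RTP packet that carries a piece of the same frame would be given the value returned by frame_timestamp for that frame.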
Port Number Utilization Under the RTP specification, RTP data is to be transported on an even UDP port number, whereas Real-Time Transport Control Protocol (RTCP) packets are to be transported on the next-higher odd port number. Although applications can use any UDP port pair, port numbers 5004 and 5005 represent registered ports for applications that elect to use them as the default pair.
RTCP The Real-Time Transport Control Protocol provides information about the participants in an ongoing session as well as a mechanism to monitor the quality of service. Information about the participants in a session can vary from situations where there is no explicit membership control to sessions that require detailed membership control. RTCP operates via the periodic transmission of control packets to all members of a session, following the same distribution method as that used for data packets. Because the primary function of RTCP is to provide feedback on the quality of the
data distribution, it functions as an integral part of RTP’s role as a transport protocol. In fact, information provided by RTCP can be used to determine where bottlenecks reside when multicasting occurs, which facilitates the problem resolution process.
Jitter Considerations One of the major problems associated with the real-time delivery of video over an IP network is the displacement of packets transporting frames from their original position. This displacement of packets is referred to as jitter and is normally compensated for through the use of a jitter buffer at the receiver. A jitter buffer can be thought of as an area of memory used to compensate for delays in the flow of packets through a network. Those delays can be caused by distance or propagation delay as well as router processing delays. Packets first enter the receiver’s jitter buffer, from which they are extracted at applicable times such that their display eliminates movements that would otherwise appear due to the displacement of packets from one another as they flow through a network. For example, assume a frame rate of 30 fps. This means that each frame requires one-thirtieth (or 0.0333) of a second for its display. Assume the packets transporting frames arrive at a receiver one after the other; however, the delay between packets appears random, ranging from 0 to 0.002 seconds. This random delay represents jitter and, if not compensated for, will result in the display of frames appearing awkward. However, with a jitter buffer, the frames would first flow into the buffer and then be extracted in a time sequence that minimizes the displacement of frames from one another. Jitter buffers have been used for many years. Perhaps the most popular use of jitter buffers is in VoIP applications, where such buffers enable jitter to be removed from packets transporting digitized voice conversations. Without the use of jitter buffers, the displacement of packets from their ideal position by time in a sequence of packets would result in some periods of reconstructed voice sounding awkward.
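The 30-fps example can be expressed as a small simulation. This is a simplified sketch of a fixed-delay jitter buffer, not a description of any real receiver: one frame per arrival is assumed, and the function name and the jitter values used to exercise it are hypothetical.

```python
def playout_schedule(arrival_times, frame_interval, buffer_delay):
    """Return the display time of each frame for a fixed-delay jitter buffer.

    The first frame is held for buffer_delay seconds; every later frame is
    scheduled one frame_interval after the previous one. A frame that arrives
    after its scheduled slot is simply shown as soon as it arrives.
    """
    start = arrival_times[0] + buffer_delay
    schedule = []
    for index, arrival in enumerate(arrival_times):
        target = start + index * frame_interval
        schedule.append(max(target, arrival))
    return schedule


FRAME_INTERVAL = 1 / 30                                 # 30 fps, as in the text
JITTER = [0.0, 0.002, 0.001, 0.0015, 0.0005]            # hypothetical delays, 0 to 0.002 s
ARRIVALS = [i * FRAME_INTERVAL + j for i, j in enumerate(JITTER)]
```

With a buffer delay at least as large as the worst-case jitter (0.002 second here), playout_schedule(ARRIVALS, FRAME_INTERVAL, 0.002) spaces the frames evenly, whereas a delay of zero passes the jitter straight through to the display.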
Chapter 5
Last Mile Solutions Previously in this book we discussed several variations of ADSL (Asymmetric Digital Subscriber Line), including ADSL2 and ADSL2+. We noted their use in the literal “last mile” to provide communications connectivity from a home or office to fiber located in the neighborhood or on a direct run to a telephone company central office that was in close proximity to the customer. Although we covered the use of different versions of ADSL in detail earlier in this book, in this chapter we will examine the use of the technology in conjunction with several methods of installing fiber cable to central locations where groups of homes and offices are clustered. In addition, we will discuss a version of digital subscriber lines referred to as VDSL (very-high-bit-rate DSL) that can provide very high bandwidth for relatively short distances and may offer an alternative or supplement to the use of various versions of ADSL. Because understanding VDSL will provide a foundation for discussing alternative last mile solutions, we will commence our examination by focusing on that topic. When we discuss last mile solutions, we normally reference the connection between a central office and a subscriber. However, in an IPTV environment, we need to expand the last mile to include the home network used to distribute IPTV within a residence. Thus, the second portion of this chapter will examine home networking methods that can be used to distribute IPTV within a residence.
5.1 VDSL VDSL represents the most powerful member of the xDSL family of products, providing support for data rates up to approximately 50 Mbps on a single telephone twisted-pair wire for relatively short distances, usually between 1000 and 4500 feet. There are several versions of VDSL, including asymmetric and symmetric, with the latter more suitable for businesses where the connection of servers or transfer of general data between locations requires bidirectional data rates to be balanced.
Data Rate Support Table 5.1 indicates the data rates supported by VDSL technology running over 26-gauge copper wire. As you will note from examining the entries in the table, as the distance increases the obtainable data rate decreases. Although the table provides an indication of transmission support over 26-gauge copper wiring, telephone companies also use 22- and 24-gauge wiring. According to the American Wire Gauge (AWG) specifications, as the gauge number decreases the diameter of the wire increases. A larger diameter reduces the resistance of the wire and enables a greater transmission distance. Thus, although the transmission distances shown in the referenced table are accurate for 26-AWG twisted-pair wire, it is possible for some locations that were installed using a lower gauge twisted-pair wire to achieve slightly longer transmission distances.
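The relationship between gauge number, diameter, and resistance can be made concrete with the standard AWG diameter formula, which is not given in the text and is introduced here only as an illustration. The sketch assumes identical conductor material, so resistance per unit length varies inversely with the cross-sectional area; the function names are hypothetical.

```python
def awg_diameter_mm(gauge: int) -> float:
    """Conductor diameter in millimeters from the standard AWG formula."""
    return 0.127 * 92 ** ((36 - gauge) / 39)


def relative_resistance(gauge: int, reference_gauge: int = 26) -> float:
    """Resistance per unit length relative to the reference gauge.

    Resistance is inversely proportional to cross-sectional area, which in
    turn is proportional to the square of the diameter.
    """
    return (awg_diameter_mm(reference_gauge) / awg_diameter_mm(gauge)) ** 2
```

Under these assumptions a 24-gauge pair has roughly 63 percent of the resistance of a 26-gauge pair of the same length, which is why a lower gauge can support a somewhat longer loop.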
Table 5.1 VDSL Transmission Rates and Range
Data Rate (Mbps) Downstream/Upstream    Transmission Distance Over 26-AWG Copper (ft)
52/30                                   1000
54/13                                   1000
26/26                                   3000
22/13                                   3000
13/13                                   4500
10/10                                   4500
6/6                                     6000
16/1                                    6000
FSAN Since June 1995, an international consortium referred to as the Full Service Access Network (FSAN) consortium has been actively pursuing the standardization and deployment of both narrowband and broadband full-service access networks. Companies ranging from Bell South and US West to Bell Canada, Korea Telecom, and France Telecom are members of FSAN. Although FSAN is not a standards body, it works with many international, regional, and country-based standards committees to promote the development of standards. The consensus of FSAN members resulted in the organization specifying ATM (Asynchronous Transfer Mode) as the primary transport technology, with fiber in the core network and the use of copper for the last-mile access network. The FSAN consensus is based on the fact that ATM can provide a guaranteed bandwidth, enabling a guaranteed quality of service (QoS), which is essential for delivering such real-time services as voice and video.
VDSL Access Technology Currently, VDSL represents the highest rate of all xDSL types of technology. Because it supports both symmetric and asymmetric operations and is approximately ten times faster than ADSL, it is a technology well suited to transport video into residences from several types of backhaul fiber that can be routed into a neighborhood, to the curb, into a building, or into a home or office. Similar to other DSL technologies, VDSL uses the frequencies beyond those used for telephone service on the same twisted wire pair, enabling the telephone company to utilize the existing copper wire infrastructure for the delivery of broadband services.
Frequency Utilization VDSL is based on the use of frequency division multiplexing (FDM), so upstream and downstream data channels are separated from the 0 to 4 kHz frequency used for telephone service. Through the use of separate upstream and downstream channels, transmission can occur simultaneously in both directions. Currently, there are three types of frequency band allocation standards defined for VDSL: 10 Base-S, ETSI Plan 997, and ETSI/ANSI Plan 998.
Figure 5.1 10 Base-S frequency allocation.
10 Base-S 10 Base-S was developed by Infineon, a leading VDSL chip vendor, as a mechanism to extend 10-Mbps full-duplex Ethernet over an existing copper-based infrastructure up to approximately 4000 feet. You can view 10 Base-S as a combination of Ethernet’s simplicity and VDSL technology that results in a symmetric transmission capability. Under 10 Base-S, the 0.9- to 3.75-MHz frequency spectrum is used for downstream transmission and the 3.75- to 8.0-MHz frequency spectrum is used for upstream transmission. Figure 5.1 illustrates the 10 Base-S frequency allocation. Currently, 10 Base-S is the most popular of the three VDSL standards.
ETSI Plan 997 The European Telecommunications Standards Institute (ETSI) Plan 997 specifies the use of four bands for VDSL. Upstream (US1 and US2) and downstream (DS1 and DS2) bands differ in their frequency width and can support both asymmetric and symmetric transmission. Figure 5.2 illustrates the ETSI Plan 997 frequency allocation. As indicated in the figure, frequencies from 0.138 to 3.0 MHz and 5.1 to 7.05 MHz are allocated for downstream transmission and 3.0 to 5.1 MHz and 7.05 to 12 MHz are allocated for upstream use.
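The Plan 997 band edges quoted above lend themselves to a simple lookup. In the sketch below the band edges come from the text, while the assignment of the labels DS1, US1, DS2, and US2 in order of ascending frequency, as well as the function names, are assumptions made for the illustration.

```python
# (label, direction, lower edge in MHz, upper edge in MHz)
PLAN_997_BANDS_MHZ = [
    ("DS1", "downstream", 0.138, 3.0),
    ("US1", "upstream", 3.0, 5.1),
    ("DS2", "downstream", 5.1, 7.05),
    ("US2", "upstream", 7.05, 12.0),
]


def band_for(frequency_mhz: float):
    """Return (label, direction) for a frequency, or None if it is outside the plan."""
    for label, direction, low, high in PLAN_997_BANDS_MHZ:
        if low <= frequency_mhz < high:
            return label, direction
    return None


def total_bandwidth_mhz(direction: str) -> float:
    """Sum the width of all bands allocated to one direction."""
    return sum(high - low for _, d, low, high in PLAN_997_BANDS_MHZ if d == direction)
```

Summing the widths shows that this plan allocates more spectrum to the upstream bands than to the downstream bands, although the upstream spectrum sits at higher frequencies.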
Figure 5.2 ETSI Plan 997 frequency allocation.
ETSI/ANSI Plan 998 The ETSI and American National Standards Institute (ANSI) Plan 998 is similar to the ETSI Plan 997 with respect to the use of two upstream and two downstream channels. However, it differs from the prior draft standard in the use of frequencies that are optimized for asymmetrical transmission. Figure 5.3 illustrates the ETSI/ANSI Plan 998 frequency allocation. Similar to the prior plan, ETSI/ANSI Plan 998 is currently a draft standard.
Ham Band Notching The VDSL frequency spectrum covers a number of ham radio bands. Thus, the use of VDSL could cause interference with amateur radio operators. To prevent such interference, notching capability is included in VDSL frequency spectrums in 10-Hz steps that can be enabled or disabled. Table 5.2 lists ham radio bands defined for VDSL notching.
Figure 5.3 ETSI/ANSI Plan 998 frequency allocation.
Table 5.2 Ham Radio Bands Defined for VDSL Notching
Start Frequency    Stop Frequency
1,810 kHz          2,000 kHz
3,500 kHz          3,800 kHz (ETSI); 4,000 kHz (ANSI)
7,000 kHz          7,100 kHz (ETSI); 7,300 kHz (ANSI)
10,100 kHz         10,150 kHz
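The entries of Table 5.2 can be captured in a small lookup that reports whether a given frequency falls inside a notched band. The band edges are taken from the table; the dictionary layout, the function name, and the choice to treat the band edges as inclusive are assumptions for this sketch.

```python
# Notch bands in kHz, keyed by the standards body whose stop frequencies apply.
HAM_BANDS_KHZ = {
    "ETSI": [(1810, 2000), (3500, 3800), (7000, 7100), (10100, 10150)],
    "ANSI": [(1810, 2000), (3500, 4000), (7000, 7300), (10100, 10150)],
}


def is_notched(frequency_khz: float, region: str = "ANSI") -> bool:
    """True if the frequency lies within a ham band defined for notching."""
    return any(start <= frequency_khz <= stop
               for start, stop in HAM_BANDS_KHZ[region])
```

For example, 3,900 kHz falls inside a notch under the ANSI limits but outside it under the ETSI limits.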
Applications Through the use of VDSL it becomes possible to support digital broadcast television data streams, high-speed Internet access, video on demand (VoD), distance learning, teleconferencing, and other applications via a common wire pair that is already routed into most homes and offices. Although the data rate of VDSL decreases with distance, when used up to a range of approximately 4500 feet, the technology can support one or more high-definition (HD) TV shows as well as high-speed Internet access, and several standard-television-type channels.
ANSI Developments Recently, the ANSI T1/E1.4 Committee defined a series of upstream and downstream data rates for VDSL derived from submultiples of the synchronous optical network (SONET) and its European equivalent, the synchronous digital hierarchy (SDH). Those submultiples of 12.96, 25.92, and 51.84 Mbps represent fractions of the Optical Carrier 3 (OC-3) data rate of 155.52 Mbps, which is supported by both SONET and SDH. Table 5.3 lists the ANSI T1/E1.4 downstream rates for asymmetric VDSL and Table 5.4 lists the upstream data rates for asymmetric VDSL services. Although symmetric VDSL can be expected to find a viable market for telemedicine, Web hosting, and other applications that require high-speed bidirectional transmission, because asymmetric VDSL is better suited and more economical for supporting IPTV, which is the focus of this book, we will concentrate our attention on the asymmetric version of the xDSL technology. In examining the entries in Tables 5.3 and 5.4, I believe several items warrant discussion. First, if you turn your attention to Table 5.3 you will note that at the highest downstream bit rate of 51.84 Mbps, the modulation method packs 4 bits per baud (51.84/12.96). When the bit rate is 38.88 Mbps, then the number of bits transported based on a baud rate of 12.96 becomes 3 (38.88/12.96). Similarly, the bit rate of 29.16 Mbps, which requires a 9.72-MBd signaling rate, also results in 3 bits being conveyed by each signaling change. If we turn our attention to Table 5.4, we will note that at a data rate of 6.48 Mbps, the 0.81-MBd rate results in 8 bits being packed into each symbol. Similarly, for long ranges the 3.24-Mbps operating rate at a 0.405-MBd rate results in 8 bits being packed into every baud change. If you carefully examine Tables 5.3 and 5.4 you will realize that the mixture of data rates and baud rates results in the number of bits per baud, ranging from a low of 2 to a high of 8. We have just determined the number of bits per baud that need to be modulated. An unanswered question concerns the modulation scheme used by VDSL. Thus, let’s turn our attention to this topic.

Table 5.3 Downstream Line Rates for Asymmetric VDSL
Service Range             Bit Rate (Mbps)    Baud Rate (MBd)
Short range (1000 ft)     51.84              12.96
                          38.88              12.96
                          29.16              9.72
                          25.92              12.96
Medium range (3000 ft)    25.92              6.48
                          22.68              5.67
                          19.44              6.48
                          19.44              4.86
                          16.20              4.05
                          14.58              4.86
                          12.96              6.48
Long range (4500 ft)      12.96              3.24
                          9.72               3.24
                          6.48               3.24

Table 5.4 Upstream Line Rates for Asymmetric VDSL
Service Range             Bit Rate (Mbps)    Symbol Rate (MBd)
Short range (1000 ft)     6.48               0.81
                          4.86               0.81
                          3.24               0.81
Medium range (3000 ft)    3.24               0.405
                          2.43               0.405
                          1.68               0.405
Long range (4500 ft)      3.24               0.405
                          2.43               0.405
                          1.62               0.405
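The bits-per-baud figures discussed above are simply the bit rate divided by the signaling rate. The sketch below repeats that division for the rate pairs cited in the text and adds the number of distinct signal states a symbol would need to carry that many bits, which is two raised to the number of bits. The function names are hypothetical.

```python
def bits_per_baud(bit_rate_mbps: float, baud_rate_mbd: float) -> float:
    """Bits conveyed by each signal change."""
    return bit_rate_mbps / baud_rate_mbd


def signal_states(bits: int) -> int:
    """Distinct signal states needed for a symbol to carry the given number of bits."""
    return 2 ** bits


# Rate pairs cited in the text: (bit rate in Mbps, signaling rate in MBd).
CITED_PAIRS = [
    (51.84, 12.96),   # downstream, 4 bits per baud
    (38.88, 12.96),   # downstream, 3 bits per baud
    (29.16, 9.72),    # downstream, 3 bits per baud
    (6.48, 0.81),     # upstream, 8 bits per baud
    (3.24, 0.405),    # upstream, 8 bits per baud
]
```

A rate pair that yields 8 bits per baud therefore implies 256 distinct signal states, whereas one that yields 2 bits per baud needs only 4.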
Modulation Similar to the debate over ADSL modulation that occurred during 1993, a battle occurred during the early turn of the century concerning VDSL modulation. In one camp were proponents of carrierless amplitude and phase modulation combined with quadrature amplitude modulation (CAP/QAM); on the other side were developers of discrete multi-tone (DMT) technology. DMT uses multiple carrier frequencies referred to as subcarriers to modulate data. For VDSL, DMT uses digital signal processing techniques, such as the fast Fourier transform (FFT), to modulate data on as many as 4096 subcarriers. Because VDSL DMT modems maximize data throughput by dynamically adapting the power level and number of bits modulated on each subcarrier to match impairments on the line, they provide a higher throughput than CAP/QAM, especially in the presence of noisy conditions. During the initialization process and periodically thereafter, VDSL DMT modems determine the signal-to-noise (S/N) ratio on each subcarrier, modulating the carrier with a varying number of bits based on the S/N ratio. That is, as the S/N ratio becomes low, a lesser number of bits are modulated, whereas a higher S/N ratio results in more bits being modulated on a subcarrier. In comparison, CAP and QAM are both single-carrier modulation methods. Although they are similar to one another, CAP directly generates a modulated signal whereas QAM requires the generation of a quadrature carrier signal. In a VDSL environment, three variables define the use of QAM: the center frequency, the constellation size (QAM2 to QAM256, which defines the number of different phase and amplitude combinations), and the symbol rate, with the latter defining the bandwidth requirements. 
Although VDSL single-carrier modulation modems can be manipulated by changing their constellation size, carrier frequency, and symbol rate, under adverse conditions they are not as flexible as VDSL DMT modems that can alter the number of bits and power level on each subcarrier. Although the standards bodies are looking at both QAM and DMT, DMT appears to represent a better choice, similar to the ADSL battles a decade ago. One factor that can be expected to influence the standards debate is the action of the VDSL Alliance. The VDSL Alliance — a partnership between Alcatel, Texas Instruments, and other vendors — has announced support for DMT. Under the VDSL Alliance DMT method, signals occur on 247 subchannels, each 4 kHz in width, as illustrated in the top portion
Last Mile Solutions
䡲 123
Figure 5.4 Comparing DMT and CAP.
of Figure 5.4. Periodically, training signals enable the VDSL modems on each end of the copper media to perform synchronization and equalization. In addition, each subchannel is monitored and, if the quality of the channel becomes impaired beyond a threshold, the signal is shifted to another channel. The lower portion of Figure 5.4 illustrates the use of CAP and QAM. When CAP is used, the bandwidth of the copper wire is subdivided into three distinct entities. Voice conversations occur over the 0- to 4-kHz band. The 25- to 160-kHz band is used for the upstream data channel, and the downstream data channel begins at 240 kHz and extends to a maximum value that varies with such conditions as line noise and line length but that can be no greater than approximately 1.5 MHz. Both DMT and CAP use QAM, with the key difference being the manner in which QAM is applied. Under DMT, QAM can occur simultaneously on up to 247 subchannels. In comparison, under CAP, QAM occurs once on one upstream channel and once on one downstream channel. Comparing QAM on DMT versus CAP, it is important to remember Shannon’s Law. In 1948, Shannon’s Law was formulated by Claude Shannon, a mathematician who defined the highest data rate a communications channel can support. Under Shannon’s Law, c = B* log2 (1 + S/N) where c = maximum data rate (in bits per second) obtainable on a communications channel B = bandwidth of a channel (in Hertz) S/N = signal-to-noise ratio on the channel
The function log2 represents the base-2 logarithm: the base-2 logarithm of a number x is the number y such that 2^y = x. Bandwidth plays a vital role in the capacity of a communications channel. Thus, the tradeoff between DMT and CAP becomes one of having up to 247 4-kHz subchannels used for both upstream and downstream operations versus the use of 135 kHz (160 – 25) for upstream and 1.26 MHz (1.5 – 0.24) for downstream operations.
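The comparison can be made concrete with a short calculation. The following is a minimal sketch of Shannon's Law in Python; the function names and the signal-to-noise ratios used in the example are illustrative assumptions, not measurements from any VDSL line.

```python
import math

SUBCHANNEL_HZ = 4_000  # width of one DMT subchannel, per the text


def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Upper bound on the data rate: c = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio in decibels to a plain power ratio."""
    return 10.0 ** (snr_db / 10.0)


def dmt_capacity_bps(subchannel_snrs: list[float]) -> float:
    """Sum the Shannon bound over DMT subchannels, each with its own S/N."""
    return sum(shannon_capacity_bps(SUBCHANNEL_HZ, snr) for snr in subchannel_snrs)


if __name__ == "__main__":
    # One 4-kHz subchannel with an assumed S/N of 1023 (about 30 dB).
    print(shannon_capacity_bps(SUBCHANNEL_HZ, 1023))  # 40000.0 bits per second

    # CAP bands from the text, with an assumed 30 dB S/N on each band.
    snr = db_to_linear(30.0)
    print(shannon_capacity_bps(135_000, snr))    # upstream bound
    print(shannon_capacity_bps(1_260_000, snr))  # downstream bound
```

Because DMT evaluates each subchannel separately, a noisy subchannel lowers only its own contribution to the sum, whereas a single wide CAP band is characterized by one overall signal-to-noise ratio.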
Deployment Options When we discussed ADSL earlier in this book we noted that its maximum range was approximately 18,000 feet. In comparison, VDSL’s maximum range when used to transport standard television is approximately 4500 feet, and its use to transport HDTV and a few SDTV channels is limited to approximately 3000 feet. This means that, compared to ADSL, the deployment of VDSL is more dependent on fiber being routed closer to the customer. Although the best network architecture to support nonconstrained IPTV, which would enable a home viewer to see many HDTV programs at the same time, is fiber-to-the-home (FTTH), it is too costly in most situations. Those situations are primarily existing developments, where running fiber cable from the neighborhood or the curb into the home would require lawns to be dug up, driveways to be tunneled under, or fiber cables to be installed on overhead lines. Instead, several telephone companies are using five fiber deployment alternatives:
■ Fiber-to-the-exchange (FTTEx)
■ Fiber-to-the-cabinet (FTTCb)
■ Fiber-to-the-neighborhood (FTTN)
■ Fiber-to-the-curb (FTTC)
■ Fiber-to-the-building (FTTB)
FTTEx Fiber-to-the-exchange (FTTEx) references the use of fiber to the central office. From the central office, VDSL can be deployed over existing copper wiring at distances up to approximately 4500 feet. In actuality, if no HDTV services are required, VDSL can be used to provide a “last mile” solution for several SDTV channels and high-speed Internet access at distances up to 3500 feet. If HDTV support is required, the transmission distance would be significantly lowered to approximately 1000 feet.
FTTCb Fiber-to-the-cabinet (FTTCb) is a method to serve subscribers more distant from a central office. In this situation fiber is run from the central office to an optical network unit (ONU), from which data flows to the subscriber over the existing copper wiring. Figure 5.5 compares FTTEx and FTTCb. The upper portion of the figure illustrates the use of FTTEx, for which a central office must be located within 4500 feet of a subscriber. In comparison, the lower portion of Figure 5.5 illustrates how FTTCb can be used to extend the distance from the central office to subscribers and still allow service over existing copper wiring. Not shown in Figure 5.5 are the Digital Subscriber Line Access Multiplexers (DSLAMs) that reside at central offices that directly serve many xDSL subscribers. The function of the DSLAM is to connect subscribers communicating via twisted pair to the backbone network. In an IPTV environment, the DSLAM should support multicast transmission. If it does not, the switch or router located at a central office will need to replicate each multicasted TV channel to subscribers requesting to view the channel. This action can result in congestion at the DSLAM input when a popular program is requested. In comparison, if the DSLAM supports multicast transmission, it will have to receive only one data stream for each channel as input, then replicate the data stream for all subscribers who used their set-top box or PC to select the channel.
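The effect of multicast support on the DSLAM input can be estimated with simple arithmetic. The sketch below is illustrative only: the function name, the channel names, the viewer counts, and the 4-Mbps stream rate are assumptions chosen to show the difference between per-viewer and per-channel replication.

```python
from typing import Mapping


def dslam_input_load_mbps(viewers_per_channel: Mapping[str, int],
                          stream_rate_mbps: float,
                          dslam_multicast: bool) -> float:
    """Estimate the traffic arriving at the DSLAM input.

    With multicast support the DSLAM receives one copy of each channel
    that has at least one viewer and replicates it locally. Without it,
    the upstream switch or router sends one copy per viewer.
    """
    if dslam_multicast:
        copies = sum(1 for count in viewers_per_channel.values() if count > 0)
    else:
        copies = sum(viewers_per_channel.values())
    return copies * stream_rate_mbps


if __name__ == "__main__":
    # Hypothetical audience behind one DSLAM during a popular program.
    audience = {"news": 200, "sports": 50, "movies": 0}
    print(dslam_input_load_mbps(audience, 4.0, dslam_multicast=True))   # 8.0
    print(dslam_input_load_mbps(audience, 4.0, dslam_multicast=False))  # 1000.0
```

In this hypothetical case the multicast-capable DSLAM receives two streams regardless of audience size, while the input load without multicast grows with every additional viewer.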
Figure 5.5 FTTEx vs. FTTCb.
FTTN Fiber-to-the-neighborhood (FTTN), which is also known as fiber-to-the-node, resembles FTTCb but differs in one important respect. In an FTTN environment, fiber is routed from a central office to locations within neighborhoods, which minimizes the distances of copper wiring routed into a group of homes and offices. Although this is similar to FTTCb, FTTN can be upgraded in the future to extend optical fiber directly into homes and offices. This will allow subscribers to be migrated to more capable fiber technology as demand for HDTV and other technologies increases.
FTTC In a fiber-to-the-curb (FTTC) deployment, fiber cable is extended to the curbs of homes and offices. Although this minimizes the distance for the use of twisted-pair wiring, it increases the use of fiber, which can result in the need to burrow fiber cable through roads and driveways. Typically, FTTC represents a good solution for home developments where roads and curbs are in the process of being prepared. If a neighborhood is already established, then FTTN may represent a better deployment method.
FTTB Another option that can be considered to facilitate the use of VDSL is to route fiber directly into a building. Referred to as FTTB, this method of fiber deployment represents a practical solution for multi-dwelling units, such as apartment buildings, as well as for businesses. Of course, the density of multi-dwelling units within the area to be served will be a significant factor in determining if economics justifies this type of deployment. For example, if half a dozen medium-rise apartment buildings each consisting of 20 units are located on a 1-acre tract, FTTB would be a more suitable solution than if the buildings were in a rural area and each property was located on a 5- or 10-acre tract.
FTTH The last deployment option that warrants attention is fiber-to-the-home (FTTH). FTTH, as its name implies, results in the use of fiber directly into the home or office. Although FTTH represents the most expensive method used for last mile connectivity, it also can provide the highest level of
bandwidth to subscribers. In several field trials, FTTH has resulted in a data rate of 155 Mbps to the consumer, although in many instances that rate represents a maximum physical capacity, with the average data rate to the endpoint limited to between 10 and 40 Mbps.
Summary Because VDSL can provide a relatively high transmission rate over existing copper-based twisted pair, telephone companies can replace many of their main feeds with fiber-optic cable without having to route the cable directly into homes and businesses. By deploying FTTN or FTTC, telephone companies can avoid the cost of digging up lawns and gardens as well as burrowing through driveways. Instead, a VDSL modem can be placed in the home or office and connected to existing telephone wiring. If FTTN is used, a VDSL gateway is typically located where the fiber terminates and serves a series of twisted-pair wires routed from subscriber homes and offices to the gateway. Although the gateway is sometimes described as providing analog-to-digital and digital-to-analog conversion, in actuality it performs optical-to-electrical and electrical-to-optical conversion. This is because VDSL is similar to other DSL technologies in that it is based on the use of modems that perform analog modulation.
5.2 Distribution into the Home In concluding our discussion of last mile solutions we will note how IPTV can be distributed within a subscriber’s home. We will examine how video and audio can enter a customer’s home on a single fiber or twisted-pair metallic conductor and be routed to television and stereo devices and personal computers located throughout the residence.
Introduction The basic distribution of IPTV can be expected to be bundled with music channels and high-speed Internet access. As a single flow of data transporting this mixture of recreational activities and potential business connectivity reaches the subscriber’s premises, the manner by which the data flow can be distributed will be based on several factors. Those factors include the manner by which the data stream was transported and the distance from the subscriber to the nearest communications carrier office or ONU servicing the subscriber.
Versions of xDSL Currently there are several versions of xDSL that can be used to transport high-speed data streams containing video, audio, and Internet access into a subscriber’s premises. Those versions of DSL include ADSL, ADSL2, ADSL2+, and VDSL. As discussed earlier in this chapter as well as in Chapter 3 when we reviewed the three versions of ADSL, each version of xDSL, including VDSL, has a transmission rate inversely proportional to the transmission distance over the copper media that connects a DSLAM in a central office to a subscriber’s xDSL modem. This means that as the distance increases, the data rate decreases. Thus, some subscribers currently without HDTV or willing to forgo that service may be able to be supported at a greater distance than subscribers who expect HDTV to be part of the package offered to them. In comparison, FTTH, which represents an extension of fiber from the central office to the subscriber’s premises, does not result in a similar limitation. However, because communications carriers are in business to make a profit, they will more than likely use a version of passive optical network (PON) technology to split the capacity of a fiber routed to the neighborhood, which will result in fibers routed into subscriber premises having an overall data delivery rate less than 50 Mbps. Although this rate is considerably less than the theoretical capacity of the fiber, it is sufficient to transport several channels of SDTV and HDTV as well as enable a member of the household to surf the Web. Thus, the use of fiber directly into homes and offices will more than likely remain a small fraction of all last mile solutions because an equivalent bandwidth can be provided by VDSL for most subscribers at a lower implementation cost for the communications carrier.
Inside the Home In addition to outside factors, such as the distance to telephone company equipment and transmission method used, several factors inside the subscriber’s premises will govern the delivery of IPTV. Those factors include the type of set-top box installed by the telephone company, the capability of a “routing gateway” typically installed by the homeowner, and the type of home networking technology used or to be installed by the subscriber. In addition, some modern flat-panel television sets are being manufactured with a USB interface, which facilitates the distribution of IPTV either to home networking equipment connected directly to the TV or to a set-top box, which in turn is cabled to the television via its USB port. Now that we have a general overview of how IPTV can be distributed to a subscriber’s home or office, let’s turn our attention to specifics.
IPTV and the Home Network Because all versions of ADSL and VDSL terminate at the subscriber in a similar manner, we can easily note the manner by which any type of DSL service is terminated in the home or small office. Figure 5.6 illustrates the manner by which a serial data stream flowing over an xDSL connection can be terminated. The top portion of Figure 5.6 indicates the integration of a set-top box into an xDSL modem, with the subscriber providing an optional routing gateway to distribute IPTV to other locations in the home. In the lower portion of Figure 5.6 an optional routing gateway is shown built into the xDSL modem. The key difference between the two illustrations shown in Figure 5.6 is the fact that the integrated set-top box with the xDSL modem provides users with the ability to connect a standard or HD television without requiring a home network or a routing gateway. Thus, this method of termination using an xDSL modem with a built-in set-top box might be appropriate for apartments and other small residences. In comparison, the lower portion of Figure 5.6 enables a routing gateway to be built into the xDSL modem or directly attached to the device at the point of entry of the service into the home. This configuration enables subscribers to locate their computers closer to the point of xDSL service entry while using a home network to distribute SDTV and HDTV
to both set-top boxes and televisions located throughout an apartment, home, or office.
Figure 5.6 Distribution of an IPTV data stream.
Network Options Because the home network represents the key to the delivery of IPTV within homes, apartments, and small offices, any discussion of the delivery of the technology would be incomplete without a discussion of home networking. A variety of home networking solutions are currently available for selection by the consumer. Those networks range from wired and wireless Ethernet networks to networks that operate over existing electrical wires and are referred to as broadband over power lines (BPL). In this section we will briefly examine the advantages and disadvantages associated with each type of home network.
Wired Ethernet Wired Ethernet needs to provide a 100-Mbps transmission capability because 10-Mbps Ethernet is too slow and Gigabit Ethernet has a very short range when transmitted over copper wiring. Also known as 100BASE-T, 100-Mbps Ethernet needs to operate on category 5 (CAT5) or better wiring, which can be expensive to install after a building is constructed. Thus, the majority of home networks using wired Ethernet will more than likely be in new homes that are sold as “pre-wired” for high-speed communications. When a wired Ethernet is used as the home network, the router gateway, which normally includes between three and four switch ports, will be connected to the Ethernet hub that forms the home network. From the hub, data will be broadcast onto each Ethernet endpoint. Thus, this network solution requires set-top boxes or televisions with client software and an Ethernet connector to receive the broadcast stream and select the appropriate portion of the data stream for viewing.
Wireless Ethernet Over the past decade the IEEE has standardized a number of wireless transmission methods for operation in the 2.4-GHz frequency band as well as one method for operation in the 5-GHz frequency band. The first wireless LAN standard, referred to as 802.11, operated at only 1 or 2 Mbps and is not suitable for distributing IPTV within the home. The first two extensions to the initial 802.11 standard are referred to as the 802.11a and 802.11b standards. The 802.11a standard defined data
rates up to 54 Mbps in the 5-GHz frequency band, whereas the 802.11b standard defined a maximum data rate of 11 Mbps in the 2.4-GHz frequency band. Although an 11-Mbps transmission rate is capable of carrying a single standard television broadcast, it does not have the capacity to transport multiple SDTV broadcasts and provide high-speed Internet access or a single HDTV broadcast throughout the home. In comparison, although the maximum data rate of 54 Mbps for 802.11a technology can transport a mixture of SDTV and HDTV, because it operates in the 5-GHz band, its transmission distance is limited compared to equipment compatible with the 802.11b standard. This is because higher frequencies attenuate more rapidly than lower frequencies, and equipment manufactured to be compatible with the 802.11a standard operates at almost twice the frequency of 802.11b equipment. Recognition of the need to improve both the data rate and transmission distance resulted in the development of the 802.11g standard. Under this standard, wireless LANs can communicate at data rates up to 54 Mbps in the 2.4-GHz frequency band. Because 802.11g-compatible equipment operates in the lower frequency band, its transmission range is normally sufficient to cover most areas of a typical-sized home or apartment. The newest member of the IEEE 802.11 series of standards is the 802.11n standard. This standard builds on previous standards by adding a multiple-input, multiple-output (MIMO) capability in which transmitters and multiple antennas allow for increased data throughput. Although a theoretical data rate of 540 Mbps appears possible, most vendors advertise a throughput of 110 Mbps for their 802.11n products. Because 802.11n products operate in the 2.4-GHz frequency band as well as implement spatial diversity through the use of multiple antennas, their transmission range can be expected to cover the area of an average home or apartment. 
Thus, equipment that supports the IEEE 802.11n standard is extremely well suited for providing a home networking capability in an IPTV environment.
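The statement that higher frequencies attenuate more rapidly can be illustrated with the standard free-space path loss formula, FSPL = (4πdf/c)². The following is a minimal sketch under the assumption of free-space propagation; walls and furniture in a real home add further loss that the formula does not capture, and the function name and the 10-meter distance are chosen only for illustration.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in decibels: 20 * log10(4 * pi * d * f / c)."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S
    )


if __name__ == "__main__":
    distance = 10.0  # meters, an assumed room-to-room distance
    loss_24 = free_space_path_loss_db(distance, 2.4e9)  # 802.11b/g band
    loss_50 = free_space_path_loss_db(distance, 5.0e9)  # 802.11a band
    # The difference depends only on the frequency ratio, not on distance:
    # 20 * log10(5.0 / 2.4), roughly 6.4 dB of extra loss at 5 GHz.
    print(round(loss_50 - loss_24, 1))
```

Under this simplified model, an 802.11a signal arrives at any given distance with roughly a quarter of the power of a 2.4-GHz signal sent at the same level, which is consistent with the shorter range attributed to 5-GHz equipment.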
Powerline The HomePlug Powerline Alliance completed the HomePlug 1.0 specification many years ago. This specification defines the transmission of data over existing electrical wiring at data rates up to 12 Mbps. A second emerging HomePlug Powerline Alliance standard is the HomePlug AV specification. The HomePlug AV specification defines a 200-Mbps channel rate at the physical layer that is capable of delivering 150 Mbps at the data link layer. Designed for the home of the future, which will have multiple HDTVs, this standard is well suited for a home network that requires support for a mixture of audio, video, and Internet access
being delivered to different locations within the home. Because most rooms have multiple electrical outlets, it appears that the evolving HomePlug AV standard may provide the most practical method to deliver the high-speed data stream consisting of a mixture of standard- and high-definition television channels as well as high-speed Internet access to locations throughout a home, apartment, or small office. With an expected retail price of under $75 per adapter, which plugs into an electrical outlet and has either a USB or Fast Ethernet connector, a router could be connected to the electrical power line in the home and data distributed to four locations at a cost of under $375 for five power line adapters.
Table 5.5 xDSL Capacities and Constraints
Feature                 ADSL        ADSL2       ADSL2+      VDSL
Downstream data rate    8 Mbps      12 Mbps     24 Mbps     52 Mbps
Upstream data rate      640 kbps    1 Mbps      2 Mbps      6 Mbps
Maximum distance        18,000 ft   12,000 ft   9,000 ft    1,000 ft
Access Technologies vs. Home Networking In concluding this chapter, we turn our attention to comparing the service rates and distance support of the various flavors of xDSL to the operating rates of different home networking technologies.
Table 5.6 Home Networking Technologies That Can Support Different xDSL Access Technologies
Type of DSL   Downstream Data Rate   Home Networking Technologies
ADSL          8 Mbps                 Wired Ethernet   802.11   -         802.11b   802.11g   802.11n
ADSL2         12 Mbps                Wired Ethernet   -        802.11a   -         802.11g   802.11n
ADSL2+        24 Mbps                Wired Ethernet   -        802.11a   -         802.11g   802.11n
VDSL          52 Mbps                Wired Ethernet   -        802.11a   -         802.11g   802.11n
Access Technologies As telephone companies deploy IPTV services, including high-speed Internet access, they need to consider the programming mix they will sell against the distance of subscribers from the nearest central office and the operating rate of the service. As previously noted in this book, the delivery of just one channel of standard-definition television requires between 2 and 6 Mbps under MPEG-2. In comparison, a single channel of HDTV can require up to approximately 20 Mbps under MPEG-2. Comparing those data rates against the various flavors of xDSL shown in Table 5.5, one can see that only ADSL2+ and VDSL have the capacity to support both several SDTV channels and high-speed Internet access. In addition, only VDSL provides the bandwidth necessary to support several SDTV channels plus HDTV channels and high-speed Internet access. Thus, service providers will more than likely have to tailor their product mix to the access technology used to provide subscribers with a connection to their central office.
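The tailoring of a product mix to an access technology can be sketched as a simple screening calculation. The code below uses the downstream rates and maximum distances from Table 5.5; the per-channel rates of 4 Mbps for SDTV and 20 Mbps for HDTV, the Internet allowance, and all function names are assumptions chosen for illustration. It also treats each table rate as available out to the listed distance, which is a simplification because actual rates fall as loop length grows.

```python
# Downstream rates and maximum distances taken from Table 5.5.
XDSL = {
    "ADSL":   {"down_mbps": 8.0,  "max_ft": 18_000},
    "ADSL2":  {"down_mbps": 12.0, "max_ft": 12_000},
    "ADSL2+": {"down_mbps": 24.0, "max_ft": 9_000},
    "VDSL":   {"down_mbps": 52.0, "max_ft": 1_000},
}


def required_mbps(sdtv_channels: int, hdtv_channels: int, internet_mbps: float,
                  sdtv_mbps: float = 4.0, hdtv_mbps: float = 20.0) -> float:
    """Total downstream rate for a product mix (assumed MPEG-2 channel rates)."""
    return sdtv_channels * sdtv_mbps + hdtv_channels * hdtv_mbps + internet_mbps


def candidate_services(distance_ft: float, sdtv_channels: int,
                       hdtv_channels: int, internet_mbps: float) -> list[str]:
    """List the xDSL versions whose rate and reach both cover the request."""
    need = required_mbps(sdtv_channels, hdtv_channels, internet_mbps)
    return [name for name, spec in XDSL.items()
            if spec["down_mbps"] >= need and distance_ft <= spec["max_ft"]]


if __name__ == "__main__":
    # Three SDTV channels plus 6 Mbps of Internet access at 5000 feet.
    print(candidate_services(5_000, 3, 0, 6.0))  # ['ADSL2+']
    # Two SDTV channels, one HDTV channel, and Internet access at 800 feet.
    print(candidate_services(800, 2, 1, 6.0))    # ['VDSL']
```

An empty result, as would occur for the HDTV mix at 5000 feet, indicates that the provider would need either to shorten the copper loop with fiber or to offer a smaller programming mix.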
Home Networking The type of home networking method can be considered to be as important as the access technology because the former provides a mechanism for delivering received data streams throughout a home or apartment. Thus, the bandwidth provided by the home networking equipment should be at least equal to the download speed provided by the xDSL technology used to provide the access line. Table 5.6 provides a comparison of home networking data rates and access technologies, indicating the potential home networking technologies that can be used to support different xDSL access technologies.
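The rule that home network bandwidth should at least equal the access line's download speed can be expressed as a short filter. The sketch below uses the nominal rates quoted in this chapter for each home networking technology; the dictionary and function names are hypothetical. Because nominal rates overstate real-world throughput, the result is a rough screen and need not match Table 5.6 entry for entry.

```python
# Nominal data rates in Mbps as quoted in this chapter. The 802.11n figure
# is the advertised throughput, and the HomePlug AV figure is the data
# link layer rate.
HOME_NETWORK_MBPS = {
    "Wired Ethernet (100BASE-T)": 100.0,
    "802.11": 2.0,
    "802.11a": 54.0,
    "802.11b": 11.0,
    "802.11g": 54.0,
    "802.11n": 110.0,
    "HomePlug 1.0": 12.0,
    "HomePlug AV": 150.0,
}


def suitable_home_networks(access_down_mbps: float) -> list[str]:
    """Return technologies whose nominal rate is at least the access rate."""
    return [name for name, rate in HOME_NETWORK_MBPS.items()
            if rate >= access_down_mbps]


if __name__ == "__main__":
    # VDSL downstream rate from Table 5.5.
    for name in suitable_home_networks(52.0):
        print(name)
```

For a 52-Mbps VDSL line this screen leaves wired Ethernet, 802.11a, 802.11g, 802.11n, and HomePlug AV, while the slower 802.11, 802.11b, and HomePlug 1.0 technologies are excluded.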
Chapter 6
Hardware Components Up to this point in the book, we have only briefly mentioned a number of hardware components, without any significant explanation of their functionality. This is because the basic functions of hardware devices such as servers and set-top boxes are well known. In this chapter we will probe deeper into IPTV hardware components, examining their functionality in an IPTV environment. For example, we will examine the differences between a cable TV set-top box and an IPTV set-top box as well as the functions performed by a media center and media center extender, content managers, broadcast servers, and archive servers.
6.1 Set-Top Boxes If you subscribe to a cable service to obtain the ability to view channels beyond a basic low-cost analog bundle, you are using a set-top box. That set-top box converts analog channels, and digital channels if you subscribe to that offering, from the predefined frequencies on which they are broadcast on the cable to either channel 3 or channel 4, depending on the set-top box. Similarly, if you subscribe to a satellite service you must install a set-top box, which will be connected to your television. This set-top box converts a compressed digital signal into a television format suitable for display on your screen on either channel 3 or channel 4.
Evolution Both analog and digital set-top boxes trace their origins to the late 1960s and early 1970s when cable companies began to offer premium channels. Because premium channels could otherwise be viewed by subscribers who did not want or did not wish to pay for the service, channel scrambler technology was incorporated into the first generation of set-top boxes. This technology at first simply distorted the premium channel picture by broadcasting the channel with modified vertical and horizontal synchronization. Thus, early set-top boxes simply added the missing synchronization signal to premium channels to enable an undistorted series of TV frames to appear on the subscriber’s television. Later versions of these early set-top boxes performed a rudimentary scrambling of both video and audio by shifting portions of audio and video frequencies. It turned out to be quite easy for some hobbyists to develop “descramblers,” and an active market for such products developed. Although descramblers are technically illegal, many stores continue to advertise them, telling purchasers that they should inform their cable company of their use. Since the early rollout of set-top boxes, analog technology has rapidly been replaced by digital technology. Today the vast majority of set-top boxes are digital, with many incorporating such functions as personal digital recorders that store up to hundreds of hours of television and provide a high-definition viewing capability. By 2006, global shipments of set-top boxes were approaching 15 million per year.
Market Leaders Currently two companies dominate the digital set-top box market — Motorola and Scientific Atlanta, the latter acquired by Cisco Systems during 2006. Together, they account for more than 80 percent of worldwide shipments, but this is expected to change as new manufacturers ramp up production of IPTV set-top boxes based on contracts initialed with vendors providing IPTV service. In addition, the world’s potentially largest cable market, China, remains a great unknown with respect to the relationship between installation of cable TV and IPTV services. This relationship probably favors cable due to the fact that the majority of the Chinese mainland telephone system would require a considerable infrastructure upgrade to support IPTV.
Basic Functionality In an IPTV environment a good portion of the functionality of set-top boxes is similar to the set-top boxes of cable and satellite operators.
For example, for all types of services the set-top box represents a dedicated computer that provides an interface between the television set and the service provider. In addition to decoding signals, the set-top box will, on demand, provide a guide listing of shows by time, channel, or other selected methods as well as provide information about a selected show. Some set-top boxes include a hard drive, enabling the subscriber to record programs for later viewing, and more recently developed set-top boxes include one or more USB ports, which enable support of WiFi communications or attachment to a home network. With this added capability it becomes possible for subscribers to transmit recorded programs within their home or office to a PC similarly equipped with a network-compatible device. Then, subscribers can elect to watch TV on their computer screen or burn a DVD and view the show on a television that does not have a built-in hard drive. Perhaps the major difference between IPTV and conventional set-top boxes resides in the added functionality the former provides, which we will shortly discuss in some detail. However, another difference between the two that deserves mention is the elimination of the need for an IPTV set-top box to perform frequency shifting. This is because the input to the IPTV set-top box is a digital data stream that the set-top box will output on either channel 3 or channel 4. In comparison, conventional set-top boxes will shift the frequency of a selected channel residing on a coaxial cable or received via a satellite dish to channel 3 or 4.
IPTV Set-Top Box Added Functionality Although IPTV set-top boxes have similar basic functionality to boxes developed for cable and satellite providers, they also recognize and act on User Datagram Protocol (UDP) datagrams transmitted within IP datagrams. Through a significant amount of software or firmware coding, the set-top box transmits a request to join a multicast group when the subscriber simply changes the channel from one standard channel to another. Another feature built into the software of the set-top box is the ability to transmit unicast requests to the network when the subscriber selects a premium video-on-demand event. The request will first flow to a billing and management server, which will verify that the subscriber is not in arrears on his or her bill and then add the selected event’s cost to the current bill prior to transmitting the event as a series of IP datagrams. In addition, IPTV set-top boxes also support additional features that are neither available nor possible to add to conventional cable and satellite set-top boxes. For example, IPTV set-top boxes can support Web browsing both for the Internet and as a mechanism to quickly cycle through guide data. The browser can also be used to view e-mail and e-mail attachments, interface
with various types of home networks via gateways, as well as provide support for real-time Voice-over-IP (VoIP), videoconferencing, and evolving telephones that combine telephone audio with a camera that enables parties to a conversation to see one another. Although the IP set-top box functions are considerable, readers should recognize that not all of these functions and features will be incorporated into each box. Some manufacturers may produce a series of products that incorporates additional features as subscribers move up their product line. Other manufacturers may decide to incorporate only certain features and functions into a standard product. Now that we have an appreciation for the general features and functionality of the IPTV set-top box, let’s turn our attention to a series of set-top boxes developed over the past six years, commencing with Microsoft’s early efforts at enhancing conventional set-top boxes for cable companies and then discussing Microsoft’s efforts and those of other software and hardware developers in the IPTV set-top box area.
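The multicast join that a set-top box issues on a channel change, described earlier in this section, can be sketched with standard UDP sockets. This is a minimal illustration and not the code of any actual set-top box: the channel lineup, group addresses, port number, and function names are hypothetical. Setting the IP_ADD_MEMBERSHIP socket option causes the operating system to send the group membership report on the program's behalf.

```python
import socket
import struct
from typing import Optional

# Hypothetical channel lineup: channel number -> (multicast group, UDP port).
CHANNEL_MAP = {
    2: ("239.1.1.2", 5000),
    3: ("239.1.1.3", 5000),
}


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the group and local interface addresses for the membership options."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_channel_socket(port: int) -> socket.socket:
    """Create a UDP socket bound to the port on which channel streams arrive."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    return sock


def change_channel(sock: socket.socket, old: Optional[int], new: int) -> None:
    """Leave the old channel's multicast group, then join the new one."""
    if old is not None:
        old_group, _ = CHANNEL_MAP[old]
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                        membership_request(old_group))
    new_group, _ = CHANNEL_MAP[new]
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(new_group))
```

A caller would open one socket with `open_channel_socket(5000)`, invoke `change_channel(sock, None, 2)` to tune the first channel, and then read UDP datagrams from the socket for decoding. A video-on-demand request, by contrast, would be an ordinary unicast exchange with the provider's server.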
IPTV Set-Top Box Developers Worldwide a large number of hardware and software developers are working on IPTV set-top boxes. Although space constraints preclude a full listing of all companies in this field, we will obtain an appreciation of the overall effort by focusing our attention on several vendors. Because Microsoft has been very active in this development area, we will first discuss that company’s efforts in developing set-top boxes.
Microsoft Microsoft’s IPTV efforts date to before the turn of the century, when the company began to explore the display of video with its Windows Media player. In June 2000, the company introduced Microsoft TV Foundation Edition, which represented a new digital television platform. This TV platform enabled network operators to offer video on demand, news, weather, sports, and games to set-top boxes and TV devices, including conventional cable and satellite operators. What was particularly interesting about Microsoft’s announcement was the fact that the company made its software platform scalable to support current and future-generation set-top boxes.
Initial Market Focus The initial target markets of the Microsoft TV Foundation Edition were cable operators with tens of millions of installed low-end or legacy set-top
boxes. Such boxes, which were manufactured during the prior decade, normally contained minimal hardware, such as a 15- or 20-MHz processor and as little as 1 to 4 MB of memory. In addition to targeting the low-end set-top box market, Microsoft announced its “Advanced Solution,” which was targeted toward what were then high-end set-top boxes. Such set-top boxes typically included a 100-MHz processor and between 8 and 32 MB of memory. Microsoft’s Advanced Solution was based on an embedded operating system (Windows CE), a TV graphical user interface (GUI), and middleware developed for delivering content and services to a higher performance set-top box.
IPTV Effort A few months after introducing its Foundation Edition and Advanced Solution, Microsoft unveiled plans during October 2003 to develop an end-to-end IP television delivery solution. This solution would include set-top boxes based on Windows CE, .NET, and an embedded version of the XP operating system. Also included was Windows Media 9 audiovisual technology, which, according to Microsoft, has approximately three times the efficiency of MPEG-2 and twice that of MPEG-4. Microsoft’s announcement of its end-to-end IPTV delivery solution was accompanied by a brochure touting the next-generation features subscribers could expect to receive through its use. Among those features were instant channel changing, multimedia programming guides with integrated video, and multiple picture-within-picture capability on standard television sets that lack that capability. Other features would include digital video recording, video on demand, and, to placate video providers, an enhanced digital rights management capability that would secure video assets and television shows.
Microsoft’s Prototype Set-Top Box In late 2003, at the International Telecommunications Union (ITU) Telecom World exhibit in Geneva, Switzerland, Microsoft demonstrated its IPTV set-top box solution targeted at high-end boxes. Microsoft’s prototype IPTV set-top box assumed the availability of a low-cost, single-chip IPTV set-top box processor as well as a hardware decoder for Windows Media 9. This would enable the company’s cost per box to be initially around $150, with the ability for economies of scale to reduce its costs to below $50 by 2007. Although Microsoft’s efforts in the area of IPTV took a back stage position relative to its Windows and Office suite developments, at SuperComm 2005, the company announced a series of updates to its IPTV
Edition software platform. The updates were developed to facilitate the creation and delivery of video services. In addition, Microsoft announced several partners that planned to use its IPTV software, including both network operators and set-top box manufacturers. In France, T-Online France, the French subsidiary of T-Online International, announced that it had selected Microsoft’s IPTV Edition software platform to run trials of next-generation television, joining T-Online of Germany, which had previously announced its selection of a Windows CE–based set-top box for its IPTV service. Another key announcement by Microsoft at SuperComm 2005 named the set-top box manufacturers that had agreed to embed Microsoft’s IPTV Edition client-side software into their set-top boxes. The firms mentioned included Motorola and Scientific Atlanta, which together control more than 80 percent of the set-top box market, as well as Harmonic and Tandberg. Because SBC Communications (now known as AT&T) previously agreed to use Microsoft’s IPTV Edition platform in its Project Lightspeed, which will provide support to approximately 18 million homes by the end of 2007, it appears that Microsoft will be the major player behind the deployment of IPTV. However, to borrow a well-known phrase, the only certainty about certainty is uncertainty. Thus, in concluding our discussion of set-top boxes, we will look at several additional vendors and their products.
Royal Philips Electronics Royal Philips Electronics dates to 1891, when Gerard Philips established a company in the Netherlands. Today, Royal Philips Electronics is one of the world’s largest electronics companies, whose businesses include consumer electronics, appliances, lighting, semiconductors, and medical products.
IPTV Efforts During 2005 Royal Philips Electronics introduced a new IP set-top box that targets both the IP and broadcast set-top box markets. Referred to as the STB810 IP set-top box, it supports such advanced features as video telephony, timeshift recording, DVD playback, improved picture algorithms, data storage, and personal video recording. The Philips STB810 IP set-top box uses a multi-core system-on-chip (SOC) processor, which includes a MIPS32 central processing unit (CPU) core along with dual TriMedia media processing cores. Although the STB810 can be obtained with Windows CE, it can also be obtained with an embedded Linux operating system, which could result in an attractive price because it would eliminate the need to pay royalties to Microsoft.
Sigma Designs Sigma Designs is a company that specializes in the development of silicon-based MPEG decoding firmware for consumer products. The company currently offers decoding for high-definition MPEG-4 Part 10 (H.264) as well as MPEG-4, MPEG-2, MPEG-1, and other compressed video standards. In addition, Sigma Designs markets complete reference designs for various markets, including networked DVD players, portable media players, and IPTV set-top boxes.
IPTV Efforts In June 2005 Sigma Designs announced that TIS Net, a Tatung subsidiary, would launch an IP set-top box based on Sigma’s media processors that would use Windows CE as its operating system. By January 2006 Sigma Designs introduced a line of set-top boxes based on its SMP8630 family of chipsets that includes a 300-MHz MIPS core and a 200-MHz memory interface. Sigma noted that its new product is the first to integrate decoders for all major digital video formats, including MPEG-2 and H.264. The new family of chipsets initially announced included the SMP8630 and SMP8634. The SMP8630 targets single-stream high-definition or multi-stream standard-definition applications. In comparison, the higher end SMP8634 targets multi-stream high-definition applications. The basic SMP8630 supports up to 256 MB of 32-bit RAM, whereas the higher end SMP8634 will support up to 512 MB of 64-bit RAM and 256 MB of 8/16-bit flash memory. The SMP8630 provides a 7.1 home theater audio output and a single video input along with a shared smartcard bus. The higher end SMP8634 adds 2.0 stereo output in addition to a 7.1 home theater capability, a second video input, a second high-definition decoder, and a second digital signal processor as well as a graphics input port, an Ethernet MAC (Media Access Control) controller, HDMI (High-Definition Multimedia Interface), and a dedicated smartcard bus. Both Sigma Designs chips provide support for video decoding up to 1920 × 1080 pixels at 30 progressive frames per second for MPEG-2, MPEG-4, and H.264. The chips also support picture-in-picture window operations as well as a range of audio decoding compatibility from Dolby Digital through the three audio layers of MPEG-1 and MPEG-2. Similar to the Royal Philips Electronics set-top box, Sigma Designs announced that both Windows CE and Linux will be ported to the SMP chips’ MIPS cores, allowing the set-top box to use either operating system.
In addition, Opera Software announced that it will port its Opera browser as well as its Opera browser software development kit to the new Sigma Designs chips. According to Sigma Designs, its SMP8630
would be available in samples in March 2006 and the SMP8634 would follow in April 2006. Both chips were scheduled to reach production availability by mid-2006.
Talegent In concluding our examination of set-top boxes, we will discuss Talegent’s Evolution 1 series of set-top box platforms.
IPTV Efforts Recognizing the need to satisfy different markets, Talegent announced during late 2005 three models in its Evolution 1 series of set-top boxes. Table 6.1 provides a summary of the features of each of the three initially announced set-top boxes. The Talegent Evolution 1 series is based on the Philips STB810 platform. Basic hardware, which is shared by all three Talegent set-top boxes, includes a 10/100-Mbps RJ45 Ethernet port, four USB 2.0 ports, an asynchronous serial port, a Mini PCI slot for expansion, video out support for PAL or NTSC, S-video, composite video, HDMI, and S/PDIF, and RCA stereo jacks for audio out. In addition, the basic hardware platform supports 250-, 280-, and 310-MHz CPUs, from 32 to 128 MB of RAM, and 16 to 64 MB of flash memory. If you compare the previously mentioned set-top box features and functionality to set-top boxes manufactured during the 1980s and 1990s, it is obvious that a quantum leap in the technology and capability of set-top boxes has occurred. Not resting on its laurels, Talegent has indicated that it plans to extend the capabilities of its set-top boxes. Those plans include the addition of WiFi (802.11) and Bluetooth wireless communications using the Mini PCI card slot, as well as 250-GB hard drives, a photo printer, and a DVD player/recorder, each of which can be attached via a USB port. In addition to offering the Evolution 1 series of set-top box platforms directly, Talegent plans to manufacture them for private labeling for sale or lease by third parties.
Table 6.1 Talegent Evolution 1 Series Comparison
Feature | TG200 | TG400 | TG600
DVB front end | One | Two | Two
Hard drive connection | No | Yes | Integrated 160-GB drive
DVD connection | No | Yes | Integrated
6.2 Media Center and Center Extenders One of the more interesting types of computers to reach the market over the past few years is a device sold as a “media center.” This is the first PC developed to operate via a remote control and reside on a shelf in the den or living room along with other audio-visual equipment.
Overview The media center represents a PC with a television tuner that allows users to view real-time television while performing such traditional computer operations as sending and receiving e-mail, creating documents, and performing other computer-related activities. Depending on the permission associated with the digital content and operating system software, users may be able to record video to disk, burn a DVD, or even forward a previously viewed show to a friend at another location or to other television receivers in the home.
Functionality In addition to managing video, a media center will provide users with the ability to manage audio and photographs. Optional hardware allows users to connect the media center to one or more televisions in the home to display a comprehensive guide of program listings. Once the media center is capable of displaying information on one or more televisions via a home network, it can be used to control the presentation of information. In fact, one could use the media center to present a slide show of photographs from a family vacation, play a movie downloaded from the Internet, or even show a PowerPoint business presentation.
Microsoft’s Media Center Software Currently, Microsoft’s Media Center, which is based on a special version of Windows XP, basically controls the market for media center-based PCs. Microsoft’s software supports a large number of third-party plug-ins that can add a considerable amount of functionality to its media center. Some examples of plug-ins include software for displaying and distributing caller ID on other devices connected to the home network, a program that facilitates the creation and playback of movies, an online TV listing guide that can be customized, and a program that turns the hard drive in the media center computer into a sophisticated video recorder that allows the
user to record specific shows or even a series of shows from any PC with Web access. Through the use of Microsoft and third-party software, the capability of the Media Center PC can be expanded considerably. For example, when watching a video or television show, you could have the caller ID of an inbound telephone call displayed on your PC monitor. If you decide to take the call you could then use a remote control to record the show while activating a microphone and routing the caller’s audio through the television or computer speakers. Once the call is completed you might then press a few keys on the remote to deactivate the microphone, hang up the call, and resume watching the show from the point where you initiated its recording.
Media Center Extender Although the Media Center PC was developed as a mechanism to record, control, and display audio and video content throughout a home, it is the Media Center Extender that makes the distribution of information a reality.
Overview Media Center Extender technology was announced by Microsoft at the International Consumer Electronics Show in Las Vegas during 2004. Based on Windows CE embedded software, the Media Center Extender’s objective is to extend the reach of the core systems to television displays located in various areas and floors in a home.
Product Operation One of the first products to use Microsoft’s CE embedded software is the Linksys Wireless Media Center Extender. Linksys, a subsidiary of Cisco Systems, announced its first product in late 2004. Marketed as a media center extender, the Linksys product connects to home entertainment devices using such standard cable connectors as RCA plugs that are inserted into a device’s jack. The Media Center Extender connects to a home network via wired Ethernet 10/100 BASE-T or wireless Ethernet, supporting either IEEE 802.11a or IEEE 802.11g communications. Use of the IEEE 802.11a standard results in communications occurring in the 5-GHz band, whereas use of the 802.11g standard results in communications occurring in the 2.4-GHz band. Although the 5-GHz frequency band has less interference, because high frequencies attenuate more rapidly than lower frequencies, the transmission range is roughly
half of the 2.4-GHz range. However, if you use a microwave oven and have one or more 2.4-GHz cordless telephones used by teenagers, you will probably want to use the Media Center Extender’s wireless capability as an 802.11a device to minimize interference. The Linksys Media Center Extender includes S-video, component video, and composite video output along with a digital audio port and left and right audio ports that can be connected to speakers. Once the extender is connected to a television, users can operate a remote control to make their way through a series of menus to obtain access to digital movies, TV shows, pictures, or music previously stored on their Windows Media Center PC. Users can also watch, record, and pause live television shows, download and view digital movies, and even select and listen to hundreds of Internet radio stations via a stereo system connected to the extender. Because the media center extender, as its name implies, extends the range of the media center, its use allows the Media Center PC to be located almost anywhere in the home. Thus, if the telephone line comes into the home in the kitchen and bedroom you could locate the Media Center PC on a desk in the kitchen or bedroom, whereas the extender could be located in the den along with your audio-visual equipment.
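The statement that a 5-GHz signal reaches roughly half as far as a 2.4-GHz signal can be checked with the standard free-space path loss formula. The following Python sketch is illustrative only. It assumes ideal free-space propagation with equal transmit power, antenna gain, and receiver sensitivity in both bands, and it uses 5.0 GHz as a representative 802.11a frequency; the function names are our own.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def free_space_path_loss_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB between two isotropic antennas."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def equal_loss_range_ratio(freq_high_hz: float, freq_low_hz: float) -> float:
    """Ratio of ranges (high band / low band) that give the same path loss.

    Path loss depends on the product distance * frequency, so equal loss
    means the range scales inversely with frequency.
    """
    return freq_low_hz / freq_high_hz


FREQ_80211G = 2.4e9  # 2.4-GHz band
FREQ_80211A = 5.0e9  # assumed representative 5-GHz frequency

# Extra loss at 5 GHz over the same distance (the distance cancels out).
extra_loss_db = (free_space_path_loss_db(10.0, FREQ_80211A)
                 - free_space_path_loss_db(10.0, FREQ_80211G))
range_ratio = equal_loss_range_ratio(FREQ_80211A, FREQ_80211G)

print(f"Extra path loss at 5 GHz: {extra_loss_db:.1f} dB")
print(f"Range at 5 GHz relative to 2.4 GHz: {range_ratio:.2f}")
```

Under these assumptions the 5-GHz signal suffers about 6.4 dB more loss over any given distance, and its range for the same loss is 0.48 of the 2.4-GHz range. Walls and furniture, which this model ignores, can change the result in a real home.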
6.3 Servers The set-top box, media center, and media center extender represent devices located in the home or office. At the opposite end of the IPTV network are a series of servers that acquire content, format the content for distribution, and transmit the content onto an IP network for delivery to subscribers. In this section we will focus our attention on the operational characteristics of a series of servers that form an integral portion of an IPTV infrastructure. Commencing with the headend server, we will discuss the role of the broadcast server, digital video server, timeshift broadcast server, and subscriber management system. If you compare the servers just mentioned to the servers discussed in Chapter 4 and previously illustrated in Figure 4.9, you will notice many similarities and a few differences. The differences result from the fact that we are now taking a more in-depth examination of the role of servers in an IPTV environment. Although not all IPTV operations include every server listed in this section, you need to become aware of their functionality. For example, a timeshift broadcast server enables a service provider to broadcast prerecorded video at different times. If the service provider does not offer this feature it would not need this type of equipment. In addition, this function could be accomplished via software on a “media” server, also alleviating
the need for a separate server. However, as the number of IPTV subscribers increases, the service provider will more than likely employ a “division of labor” by moving certain functions to separate servers.
Headend Server One of the more important types of servers used in an IPTV environment is the headend server, which is the focus of this section. The headend server resides at the central facility of the service provider. This server captures Digital Video Broadcasting (DVB) streams transmitted via satellite (DVB-S), terrestrial (DVB-T), and cable (DVB-C). The captured broadcasts are then converted into multicast data streams using preselected television channel associations for transmission over an IP network. The conversion process includes receiving streaming video broadcasts as a series of frames and converting the frames from each broadcast into a digital broadcast format, such as MPEG-2 or MPEG-4. Then, the digital broadcast frames are streamed via multicast addressing using selected protocols, such as RTP over UDP or raw UDP. Returning our attention to Figure 4.9, note that the generic “media server” is shown accepting a variety of broadcast sources and then transferring data to a “broadcast encoder.” Taken together, the media server and the broadcast encoder shown in Figure 4.9 are equivalent to a headend server. The headend software must obviously support the set-top box installed at the subscriber’s premises. Assuming it does, the subscriber could either directly change channels or use the guide facility to select a channel. For either action the set-top box will transmit a request to join the multicast group associated with the selected channel. In a DSL environment, this request will flow from the set-top box to the Digital Subscriber Line Access Multiplexer (DSLAM), which relays the multicast transmission it receives for the requested channel as a unicast transmission over the copper connection to the subscriber.
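To make the channel-change request more concrete, the following Python sketch shows how a software client could join and leave the multicast group associated with a channel by using the standard sockets interface; on an IPv4 network the operating system issues the corresponding IGMP membership messages. This is a minimal sketch and not the software of any particular set-top box. The channel-to-group table, the addresses, the port number, and the function names are all hypothetical.

```python
import socket

# Hypothetical channel-to-multicast-group associations; a real service
# provider assigns its own group addresses and ports.
CHANNEL_GROUPS = {
    2: ("239.1.1.2", 5004),
    3: ("239.1.1.3", 5004),
}


def membership_request(group_ip: str, interface_ip: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure: group address, then interface."""
    return socket.inet_aton(group_ip) + socket.inet_aton(interface_ip)


def tune(channel: int) -> socket.socket:
    """Join the multicast group for a channel and return the receiving socket."""
    group_ip, port = CHANNEL_GROUPS[channel]
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group causes the host to send an IGMP membership report.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group_ip))
    return sock


def leave(sock: socket.socket, channel: int) -> None:
    """Leave the channel's multicast group and release the socket."""
    group_ip, _ = CHANNEL_GROUPS[channel]
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group_ip))
    sock.close()
```

In this sketch a channel change amounts to calling leave() for the old channel followed by tune() for the new one, after which datagrams for the new channel can be read from the returned socket with recv().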
Broadcast Server Another name for a headend server is a broadcast server; however, the two terms are not always interchangeable. Because we previously discussed the main functions of the headend server, this section will focus on the additional functions that may be included in a broadcast server used in a corporate environment. A corporate broadcast server is designed to enable organizations ranging in scope from single to multiple locations to broadcast live video, audio, and different types of presentation data, such as a series of
PowerPoint slides, to selected IP addresses. Those addresses can represent employees in the central office, regional offices, or area offices or even customers and subcontractors. Unlike a headend server, the video broadcast server does not convert broadcast streams into multiple multicast transmissions, where each transmission is assigned to a television channel, nor does this server work in conjunction with one or more types of set-top boxes. Instead, the video broadcast server is commonly designed to generate a single feed at a time, which is viewed through the use of a browser operating on a desktop or laptop computer. Of course, through the use of hardware and software it becomes possible to take the video feed generated on a PC and move it via a wired or wireless network onto a large-screen display for viewing by a group of employees, contractors, or customers.
Digital Video Server A third type of server that warrants a degree of explanation is a digital video server (DVS). Although this server can be located at the headend of a transmission system, it can also be located at any point on a network. This server supports the capture, editing, storage, and transmission of digital video. Designed for use by schools, libraries, hotels, museums, and corporations, the DVS is typically a smaller and less powerful server than a headend device and allows organizations to create content that is displayed within a predefined area, such as on displays within a museum or airport. Unlike a conventional digital video server, whose broadcasts will appear the same on all screens, a DVS that is IP capable can transmit different frames to different IP addresses. Thus, the DVS might transmit information about modern art to a museum room housing Jackson Pollock works and the work of similar modern artists, while a different sequence of frames could be directed to a room where several Rembrandts and Van Goghs are on display. Because a DVS is commonly used in a closed IPTV system, such as a school or museum, it was not shown in Figure 4.9. However, most if not all of its functionality can be included in the generic media servers shown in Figure 4.9.
Video-on-Demand Server The video-on-demand (VOD) server functions as a repository for shows, movies, and other types of video events that subscribers may wish to view at any time and for which a fee is usually associated. Thus, the three main differences between the broadcast or headend server and the VOD server can be categorized by their accessibility, transmission method, and cost.
Whereas programs from the headend or broadcast server are transmitted at predefined times, information stored on the VOD server is transmitted in response to specific subscriber requests. Concerning the transmission method, the headend or broadcast server transmits each video stream as a multicast message that flows as a single sequence of datagrams to a DSLAM or equivalent device that serves many subscribers, at least one of whom has requested that content. Thereafter, the multicast-capable DSLAM will transmit one sequence of datagrams to each subscriber who previously joined the multicast group by selecting the channel on his or her set-top box. In comparison, a VOD server will transmit selected movies, sports, television reruns, and other information in response to a specific subscriber request. Thus, the VOD server transmits such information directly to the requesting subscriber as a sequence of unicast datagrams. A third difference between a headend or broadcast server and the VOD server resides in the fact that a subscriber will normally have to pay a fee to view a VOD performance. This may require the IPTV operator to first have VOD requests transmitted to a billing server that could also check the status of the subscriber’s account. That billing server is shown in Figure 4.9 as a subscriber management system and will be discussed in more detail later in this section. Then, if the account has been in arrears by more than a predefined amount or for longer than a predefined time, the billing server might not only refuse the request, but also transmit a message that flows through the subscriber’s set-top box and is displayed on the computer or television screen informing the subscriber of the reason for the rejection of the VOD request. Assuming the billing server approves the request, after updating its records that server would then forward the request to the VOD server.
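The approval step just described can be sketched in a few lines of Python. The sketch is purely illustrative: the account fields, the two thresholds, and the function name are assumptions made for this example and do not describe any vendor’s billing system.

```python
from dataclasses import dataclass

# Hypothetical operator policy; actual thresholds are set by the provider.
MAX_BALANCE_DUE = 100.00   # largest overdue balance tolerated
MAX_DAYS_OVERDUE = 30      # longest time an account may be in arrears


@dataclass
class Account:
    subscriber_id: str
    balance_due: float
    days_overdue: int


def authorize_vod(account: Account, title: str) -> tuple[bool, str]:
    """Decide whether a VOD request is forwarded to the VOD server.

    Returns (approved, message). On refusal, the message is the text
    that would be displayed to the subscriber via the set-top box.
    """
    if account.balance_due > MAX_BALANCE_DUE:
        return False, "Request refused: overdue balance exceeds the limit."
    if account.days_overdue > MAX_DAYS_OVERDUE:
        return False, "Request refused: account has been overdue too long."
    # A real system would update its billing records here before
    # forwarding the request.
    return True, f"Forwarding request for '{title}' to the VOD server."


print(authorize_vod(Account("A100", 0.00, 0), "Sample Movie"))
print(authorize_vod(Account("B200", 250.00, 12), "Sample Movie"))
```

The first request is approved and forwarded, whereas the second is refused because the overdue balance exceeds the assumed limit.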
Archive Server One of the key differences between data and video is the fact that the latter requires considerably more storage; consider a description of a movie or television show versus its actual stored content. A recognition of the fact that both headend and VOD servers can easily lack the capacity to store thousands of hours of programming resulted in the development of video archive servers. Similar to conventional servers, archive servers are manufactured in a variety of configurations. What sets them apart from headend or broadcast and VOD servers is the fact that the archive server is designed specifically to support an extremely large amount of online storage, typically in the terabyte range. In addition, the archive server will normally store video using a Redundant Array of Independent Disks (RAID), where, depending
Table 6.2 RAID Levels
Level | Description
Level 0 | Striped disk array without fault tolerance. Provides data striping, or the spreading of blocks across multiple disk drives. Although this improves performance, it does not provide any fault tolerance capability.
Level 1 | Mirroring and duplexing. Provides a duplicate copy of data and twice the read transaction rate of a single disk, while the write transaction rate is unchanged.
Level 2 | Error-correcting coding. Stripes data at the bit level. Redundant bits are computed using a Hamming code, which is written along with the data. Upon retrieval, data and the error-correcting code are read, allowing a single-bit error to be corrected on the fly.
Level 3 | Bit-interleaved parity. Results in byte-level striping using a dedicated parity disk, allowing for the failure of a single disk. However, performance is degraded by the need to write to the parity disk when data is striped.
Level 4 | Block-level striping with dedicated parity. Improves performance by striping data across many disks, with fault tolerance provided by a dedicated parity disk. If a data disk fails, the parity disk is used to create a replacement disk. A disadvantage of Level 4 is similar to that of Level 3 in that the parity disk can create write bottlenecks.
Level 5 | Block-level striping with distributed parity. Stripes both data and parity across three or more drives. Although similar to RAID Level 4, Level 5 removes the need for a dedicated parity drive, enhancing performance while providing a good level of fault tolerance. RAID Level 5 is perhaps the most popular version of RAID.
Level 7 | This proprietary method, which is also a trademark of Storage Computer Corporation, uses asynchronous, cached striping with dedicated parity to enhance Levels 3 and 4. Cache is arranged into multiple levels and a processor manages the array asynchronously. This improves performance while maintaining fault tolerance.
Level 10 | A combination of RAID Levels 0 and 1, Level 10 was not one of the original levels. Under Level 10 the array is initially set up as a group of mirrored pairs (Level 1) and then striped (Level 0). Both performance and fault tolerance are high, but Level 10 requires a minimum of four drives and has a high overhead and limited scalability.
on the RAID level employed, performance, fault tolerance, or a combination of both is improved. Table 6.2 provides a brief description of both standardized and proprietary RAID levels. Note that RAID represents a category of disk drives
that use two or more drives together to enable a degree of increased performance, fault tolerance, or both. A RAID can be attached to most types of servers, but it is particularly useful when used with a video archive server in an IPTV environment. Because an archive server can function as a back end to other types of servers it can be used as auxiliary storage for any of the generic media servers shown in Figure 4.9. In addition, the archive server can be connected to a LAN, where it can provide a backup storage capability to other servers connected to the local area network.
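The parity-based fault tolerance used by several RAID levels rests on the exclusive-OR (XOR) operation: the parity block is the XOR of the data blocks in a stripe, so any single missing block equals the XOR of all the blocks that survive. The following Python sketch illustrates only that principle. The three four-byte “disks” and the function name are invented for the example, and a real array controller operates on far larger blocks.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread across three hypothetical data disks.
data_disks = [b"IPTV", b"MPEG", b"RAID"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data_disks)

# Simulate the failure of the second data disk, then rebuild its block
# from the surviving data blocks and the parity block.
surviving = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(surviving)

print(rebuilt)  # b'MPEG'
```

Because XOR is its own inverse, the same function both creates the parity block and rebuilds a lost block, which is why a single parity disk can cover the failure of any one disk in the stripe.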
Timeshift Broadcast Server Timeshifting represents the transmission of a video at a time other than when it was originally scheduled. Timeshifting is most often considered to represent the operation of a subscriber using a DVD recorder or VCR, but it can also represent the operation of an IPTV provider. Although timeshifting can be performed through software on a headend, broadcast, or video server, sometimes the IPTV service provider will acquire and store video on a separate server for broadcasting at a specific time. When this occurs, the server is commonly referred to as a timeshift broadcast server. Because high-definition video requires considerably more bandwidth than standard-definition television, some IPTV providers may find it convenient to acquire programming via satellite or terrestrial communications and store such programming on a separate server for broadcasting at predefined times. Similarly, certain popular premium channel programming is commonly rebroadcast several times after its initial showing. Thus, the timeshift broadcast server can also store standard-definition and high-definition premium channel programming for rebroadcast.
Billing and Management Server In previous discussions, we just briefly mentioned that access to video on demand might require the status of a subscriber’s account to be verified. That verification as well as billing and other management functions can be performed on a separate server. Because a billing and management capability is critical to the operation of the IPTV service, this capability is usually implemented on a dual-processor system connected to a RAID that provides a high degree of fault tolerance. This configuration ensures that the failure of a processor or disk can be compensated for. Although smaller IPTV providers as well as some telephone companies, satellite providers, and cable operators may use dual-processor systems on a single
server, most operators will more than likely elect to maintain their billing and management system on dual servers that are interconnected and operate in tandem. The functions of a billing and management server can include verification of the account status of subscribers to preclude those in arrears from ordering new pay-per-view shows as well as generating applicable messages to remind subscribers when payment is due or overdue. In addition, the data captured by the billing and management system is used to create a monthly subscriber bill that may be either transmitted via e-mail or sent as a postal delivery to each subscriber. The billing and management server is shown as a “Subscriber management system” in Figure 4.9. This system can also include the ability to accept payments via electronic bank transfer or selected credit cards. Thus, the subscriber management system shown in Figure 4.9 may include telecommunications connections to several major credit card organizations as well as a connection to perform transfer of funds via electronic banking.
Chapter 7
Software Solutions In the previous chapter, we focused on the key hardware components that cumulatively provide an IPTV solution. In this chapter we will turn our attention to software in the form of media players that enable both corporate and individual users to view a variety of video data streams retrieved from the Internet. In doing so, we will describe and discuss several of the more popular media players. As this is accomplished, we will illustrate the use of IPTV, examining how a media player can enable users to view and hear a range of media events, including movies and other types of video media, directly from the Internet or previously stored as a file on the computer. At the time this book was written, four media players accounted for the vast majority of PC usage: Microsoft’s Windows Media Player, Apple Computer’s QuickTime, Real Networks’ RealPlayer, and Macromedia’s Flash Player. Although the four products possess many similarities, there are also some differences. Concerning similarities, all four products handle both video and audio and are capable of displaying certain types of stored images. The differences between each of the four products have to do more with their functionality, because they basically perform the same types of operations differently. Because Microsoft’s Windows Media Player and Apple Computer’s QuickTime account for the vast majority of media players currently used, we will primarily discuss these two products in this chapter. However, we will briefly touch on the other two media players to ensure readers are aware of all four media players.
7.1 Microsoft’s Windows Media Player Microsoft’s Windows Media Player has been continuously updated since its initial release over a decade ago. Now in its tenth edition, this software enables users to organize and play multimedia. Users can listen to a variety of audio, including Internet radio; copy music to portable devices; view different types of media, including pictures and movies; and copy and play DVDs, CDs, and eventually high-definition (HD)-DVDs. As Microsoft likes to remind its software users, Windows Media Player can be considered to represent a combination of a radio, television, and photograph viewer combined into a single application. In this section we will first turn our attention to Windows Media Player 9. Once we are familiar with its functionality, we will then focus on the latest version of Windows Media Player, version 10. Both versions 9 and 10 are significant upgrades from prior versions of Windows Media Player and support the transfer of video to portable devices as well as the viewing of video streams.
Windows Media Player 9 Figure 7.1 illustrates Microsoft’s Windows Media Player 9 with its Media Guide button selected. Clicking on the Media Guide button results in the periodic display of options for movie trailers, new DVD releases, and music videos. As can be seen in Figure 7.1, the menus associated with the Media Guide button allow users to select Music, Movies, Entertainment, Radio, Current Events, Site Index, and WindowsMedia.com. Although most of the menus are self-explanatory, a couple deserve mention. Selecting Site Index results in the display of the site index for the WindowsMedia.com Web site. In comparison, selecting WindowsMedia.com results in the display of that Web site’s home page, which, when performed by this author, was the same as selecting Home or Movies.
Buttons On the left side of the viewing window shown in Figure 7.1 are a series of buttons, with the Media Guide button selected. The top button, labeled Now Playing, allows users to use accelerator keys (Tab + Enter) for visualization, a term Microsoft uses to reference splashes of color and geometric shapes that change in tandem with the beat of the audio being played. Thus, users would first select the Radio Tuner button to hear a selected radio station, and then they would select the Now Playing button to display changing shapes and colors as they listened to the selected radio station.
Figure 7.1 Windows Media Player 9, with its Media Guide button selected, displays a periodically changing collection of movie clips and radio listening options.
The third button on the left side of Windows Media Player 9, labeled CD Audio, provides users with the ability to listen to an audio CD. Users can also copy music as well as display artist and album information.
Playlists The fourth button, which is labeled Media Library, allows users to create and manipulate the contents of playlists, a term used to represent a collection of audio and video titles that collectively form a library. The items on or added to the playlist can include the audio or video file currently being played as well as other files either on the local computer or residing on another computer, with the latter referenced through the use of a URL. Users can add and delete items from a playlist and even delete a playlist from the library, including all items previously added to the playlist. Users can use the CD Audio button to create a CD from any playlist they have created. However, the playlist cannot be more than 74 minutes long, and only .mp3, .wav, .asf, .wma, and .wmv files are supported.
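The two burning constraints just described (a 74-minute ceiling and a fixed set of file types) lend themselves to a quick check. The following is a minimal Python sketch, not Windows Media Player's own logic; the `PlaylistItem` type and the `can_burn` function are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePath

# Limits quoted in the text for burning a Windows Media Player 9 playlist.
MAX_PLAYLIST_SECONDS = 74 * 60
BURNABLE_EXTENSIONS = {".mp3", ".wav", ".asf", ".wma", ".wmv"}


@dataclass
class PlaylistItem:
    """Hypothetical playlist entry: a local path or URL plus its running time."""
    location: str
    seconds: int


def can_burn(playlist):
    """Return (ok, reasons) for a list of PlaylistItem objects."""
    reasons = []
    for item in playlist:
        # Compare extensions case-insensitively (.WMA and .wma are the same type).
        ext = PurePath(item.location).suffix.lower()
        if ext not in BURNABLE_EXTENSIONS:
            reasons.append(f"unsupported file type: {item.location}")
    total = sum(item.seconds for item in playlist)
    if total > MAX_PLAYLIST_SECONDS:
        reasons.append(f"playlist runs {total} s, limit is {MAX_PLAYLIST_SECONDS} s")
    return (not reasons, reasons)
```

Returning the list of reasons, rather than a bare true or false, mirrors how a player would need to tell the user which items to remove before the playlist fits on a disc.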
Table 7.1 Windows Media Player 9 Audio and Video File Support
File Format    Description
.aif    Also referred to as .aifc and .aiff, this audio specification came from Apple Computer and is used on Silicon Graphics computers.
.mp3    The MPEG Layer 3 audio format is the most popular format for downloading music. By eliminating portions of the audio file that are not essential, mp3 files are compressed to approximately one-tenth the size of an equivalent pulse code modulation (PCM) file.
.wav    This is the standard audio file format used by Windows PCs. Stores uncompressed (PCM) CD-quality sound files.
.wma    The Windows Media audio format provides the ability to apply copy protection to files.
.wmv    The Windows Media video files represent advanced system format (.asf) files, which include audio, video, or audio and video compressed with Windows Media Audio (WMA) and Windows Media Video (WMV) codecs.
File Format Support Table 7.1 provides a brief description of the audio and video file formats associated with the file extensions supported by Microsoft’s Windows Media Player.
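To put the "approximately one-tenth" figure for mp3 in Table 7.1 into numbers, the short Python sketch below computes the size of CD-quality PCM audio and applies that rule of thumb. The PCM parameters (44.1 kHz, 16 bits, two channels) are the standard audio CD values; the 10:1 ratio is only the approximation quoted in the table, and actual mp3 sizes depend on the encoder's bit rate setting. The function names are illustrative.

```python
# CD-quality PCM parameters (standard values for audio CDs).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bytes(seconds):
    """Size in bytes of uncompressed CD-quality PCM audio."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS * seconds // 8


def mp3_estimate_bytes(seconds, ratio=10):
    """Rough mp3 size using the 'about one-tenth' rule of thumb from Table 7.1."""
    return pcm_bytes(seconds) // ratio


four_minutes = 4 * 60
print(pcm_bytes(four_minutes))           # 42336000 bytes, about 42 MB
print(mp3_estimate_bytes(four_minutes))  # 4233600 bytes, about 4.2 MB
```

One second of CD-quality PCM works out to 1,411,200 bits, so a tenth of that is roughly 141 kbps, close to the 128 kbps setting widely used for downloaded music.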
Radio Tuner The fifth button on the left portion of the Windows Media Player is labeled Radio Tuner. This button can be used in a manner similar to the preselected station, search, and seek buttons on a radio. Figure 7.2 illustrates the initial Windows Media Player 9 main window when the Radio Tuner button is selected. The left portion of the window lists featured stations, and the right portion lists station groups; selecting a group displays the stations that belong to it. When users select a featured station they can add it to a station list, be directed to its Web site if given the option to do so, or hear its broadcast by selecting Play. Selecting the Find More Stations entry on the upper right portion of the window allows users to search by keyword or zip code to locate a list of stations.
Figure 7.2 The Radio Tuner button allows users to listen to and record audio from stations around the world.
Portable Device Support The sixth button, which is labeled Portable Device, provides users with the ability to copy audio and video files to a variety of devices, including products that use flash memory, as well as onto a disk, CD, or DVD.
Windows Media Player 10 Prior to examining the operation of Windows Media Player 10, a few words are in order concerning its privacy options (first shown during the installation process) and its new interface.
Installation Options Figure 7.3 illustrates the default privacy option settings display that appears during the installation of Windows Media Player 10. Note that users can obtain detailed information about the privacy options by clicking on the More Information link. By default, media information from the Internet is enabled, music files are updated, licenses for protected content are automatically
acquired, and file and URL history is saved. However, to protect user privacy, the transmission of a unique Player ID to content providers and of usage information to Microsoft is disabled. Another option that deserves mention concerns the selection of Windows Media Player 10 as the default player for 13 types of audio and video files. Unless unchecked, each of the file types listed in a table during installation will be opened by Windows Media Player 10. The audio and video file types directly supported by this player that are enabled by default are listed in Table 7.2.
Figure 7.3 The Windows Media Player 10 default privacy options preclude it from transmitting the player ID to content providers and usage information to Microsoft.
Screen Display As previously mentioned, Windows Media Player 10 includes a new visual interface. Figure 7.4 illustrates the initial Windows Media Player 10 screen display. Note that the left panel buttons used in Windows Media Player 9 have been replaced by a series of tabs across the top of the screen as well as the use of the file menu entries across the top portion of the new Player.
Table 7.2 Windows Media Player 10 Default File Types
Windows Media Audio file (wma)
Windows Media Video file (wmv)
Windows Media file (asf)
Microsoft Recorded TV Show (dvr-ms)
DVD video
Music CD Playback
MP3 audio file (mp3)
Windows Video file (avi)
Windows Audio file (wav)
Movie file (mpeg)
MIDI file (midi)
AIFF file (aiff)
AU audio file (au)
Figure 7.4 Windows Media Player 10 replaced the buttons of Player 9 with tabs across the top of the screen.
Now Playing Tab As its name implies, clicking on the Now Playing tab results in the display of a selected video (or of a visualization effect if an audio file is selected). In examining the top portion of Figure 7.4, note that the tab labeled Now Playing was selected; however, this resulted in a blank screen, because at the time the screen was captured no entries in the video playlist located on the right portion of the screen were selected.
Library Tab The Library tab can be used to easily access music, television shows, videos, and playlists. An example of the use of the Library tab is shown in Figure 7.5. In this example I used the Library tab to view titles stored in my playlists.
Figure 7.5 The Windows Media Player Library tab provides the ability to select music and videos from a tree-type menu.
Rip, Burn, and Sync Tabs The Rip tab can be considered to represent the opposite of the Burn tab. That is, the Rip tab gives users the ability to copy songs from CDs onto the computer. Once they copy one or more songs onto the computer, they can use the Burn tab to create a CD or the Sync tab to synchronize and download previously stored files onto a portable device. Concerning burning and synchronization operations, only certain types of files can be transferred to a CD or portable media player. Those files typically include .wmv, .asf, .wma, .mp3, and JPEG images. Other types of files may have a protection feature that requires users to purchase a license that “unlocks” the file and allows it to be copied and played. Now that we have an appreciation for the Windows Media Player 10 defaults and new interface, let’s examine its use for viewing videos.
Video Operations Using the latest version of Windows Media Player, which was Windows Media Player 10 at the time this book was prepared, this author was able to both listen to CDs and play DVDs on his computer. Playing DVDs with Windows Media Player requires that both a DVD-ROM drive and a software or hardware DVD decoder be installed on the computer. By default, Windows Media Player (including the most recent version) does not include a DVD decoder. However, when you obtain a PC with a DVD drive or install an internal or external DVD drive, you will more than likely install a DVD decoder that is designed to work with the hardware. Although the DVD decoder will work with Windows Media Player 10, its functionality should be checked to avoid potential problems. To do this, point your browser to the Microsoft Windows Media Player 10 Web page (www.microsoft.com/windows/windowsmedia/mp10), where you can add a variety of plug-ins, including a DVD decoder plug-in, as well as access a utility program to check the compatibility of DVD decoders on your computer.
DVD Decoder Plug-Ins At the time this book was written the Microsoft Windows Media Player 10 Web site offered three DVD decoder plug-ins: the CinePlayer DVD decoder from Sonic Solutions, Inc., the NVIDIA DVD decoder from NVIDIA, and the Power DVD SE for Windows XP from CyberLink Corp. To provide readers with an indication of the capabilities of Windows Media Player 10 bundled with a DVD decoder plug-in, we will briefly discuss the features of two of these products.
The CinePlayer DVD decoder from Sonic Solutions, Inc., is priced at $14.99. This product claims to offer the industry’s “best DVD navigation” as well as provide the highest quality DVD playback on PCs. In addition to playing DVDs, the CinePlayer DVD decoder includes an MPEG-2 decoder that enables the playing of video files, including the numerous movie trailers available for viewing on the Web. A second plug-in that deserves a few words is the Power DVD SE for Windows XP from CyberLink Corp. Offered for a price of $14.95, this software provides support for the playback of DVD movies and MPEG-2 video files. In addition, the Power DVD SE provides support for Dolby Digital audio decoding and provides a downmix of 5.1 soundtracks to two channels for video playing on a computer limited to two speakers. Although each DVD decoder plug-in included a slight variance in features from other products, all of the decoders were limited to supporting MPEG-2 video streams. Thus, it appears that software development is lagging behind hardware, because some IP set-top boxes now provide support for high-definition video.
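The downmix of a 5.1 soundtrack to two channels mentioned above can be illustrated with a short Python sketch. It uses a commonly cited set of downmix weights, in which the centre and surround channels are attenuated by 3 dB before being added to the left and right channels, and it drops the low-frequency effects (LFE) channel. This is an assumption made for illustration; the source does not describe the coefficients Power DVD SE actually applies, and the function name is hypothetical.

```python
import math

# Commonly used stereo downmix weight: centre and surround channels
# are attenuated by 3 dB (a factor of 1/sqrt(2)) before being mixed in.
ATTENUATION = 1 / math.sqrt(2)


def downmix_5_1_to_stereo(left, right, centre, lfe, left_surround, right_surround):
    """Fold one 5.1 sample frame (floats in [-1, 1]) into a stereo pair.

    The LFE channel is accepted but ignored, which is a common simplification.
    """
    lo = left + ATTENUATION * centre + ATTENUATION * left_surround
    ro = right + ATTENUATION * centre + ATTENUATION * right_surround

    def clamp(x):
        # Keep the summed signal within the valid sample range.
        return max(-1.0, min(1.0, x))

    return clamp(lo), clamp(ro)
```

Because the centre channel usually carries dialogue, it is fed equally to both outputs so that speech still appears to come from the middle of a two-speaker setup.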
Video Decoder Checkup Utility In addition to the previously mentioned plug-ins, Microsoft now provides a free DVD and MPEG-2 utility designed for computers operating Windows XP with Media Player 10. Referred to as the Windows XP Video Decoder Checkup Utility, this program determines if an MPEG-2 decoder is installed on your Windows XP computer and whether or not the decoder is compatible with Windows Media Player 10 and Windows XP Media Center Edition. Thus, if a computer user encounters a problem, such as synchronizing (copying) recorded TV shows to a portable media center or another device, this utility could be used to determine whether the computer has a compatible MPEG-2 decoder. The filename for the utility program provided by Microsoft is dvdchecksetup.exe. This file can be downloaded from several Web sites in addition to Microsoft.com. Figure 7.6 illustrates the result I obtained when I executed the utility program on one of my desktop computers. In examining the display resulting from the execution of dvdchecksetup.exe, note that two DVD decoders were located on my computer. Also note that although the CyberLink Video/SP Decoder was shown as compatible with Windows XP Media Center Edition, and as such was indicated as the preferred decoder, it was not compatible with the synchronization features of Windows Media Player 10. Thus, to synchronize MPEG-2-encoded content to a portable device the display warns the user to obtain an updated version of the MPEG-2 decoder. This warning is most useful because it could eliminate many hours of attempting to determine why
the transmission of video from a desktop or laptop to a portable device is failing using the preferred decoder.
Figure 7.6 The Microsoft DVD utility program checks your computer for MPEG-2 decoders and their compatibility with Windows XP Media Center Edition as well as support for file synchronization.
Viewing Video To view a video, DVD, or audio visualization, you select the item from a playlist, whether the target is stored on your computer or referenced by a URL. A DVD will normally begin to play automatically when it is inserted; if it does not, you can click on the Library tab and then select and click on the drive that contains the disk. I once received via e-mail the famous “copper clappers” short video from a friend. After I added it to my playlist, clicked on its entry, and selected the Now Playing tab, the video began playing. Figure 7.7 shows a still picture from that video of Jack Webb beginning the famous skit. (If you have never seen it, this is from an episode of The Tonight Show in which Johnny Carson performed a skit with Jack Webb of Dragnet fame. The skit concerned the theft of copper clappers by Claude Cooper, the kleptomaniac from Cleveland. Watching this video, I was amazed that both Mr. Webb and Mr. Carson were able to perform the entire skit without bursting into laughter. Check it out and you’ll see.)
Manipulating the Video Display Users can manipulate the viewing area using a key pair, clicking on a button, or selecting a menu entry. In Windows Media Player 10 you can
click on a View Full Screen button, press the ALT + Enter keys, or select Full Mode from the View menu to expand the video onto the full screen. The full-screen button (icon) is shown in Figure 7.7 at the top right but to the left of the playlist. Pressing the ESC key restores the player to its original size. A word of caution is in order concerning full-screen viewing. Although most DVDs look fine when viewed in full-screen mode, most video shorts and movie trailers do not have sufficient resolution to be viewed in that manner.
Figure 7.7 A still shot from the “copper clappers” video short viewed using Windows Media Player 10.
Skin Mode In concluding our tour of Windows Media Player, I would be remiss if I did not briefly discuss skin mode. A skin can be considered to represent a new surrounding added to Windows Media Player that changes its appearance. Figure 7.8 illustrates the default Windows Media Player 10
skin, which resembles a labeled spaceship that contains a video display area. Note that the default skin has a series of buttons over which the user can move the cursor to display each button’s meaning. Although most skins incorporate basic player functions such as Play, Previous, Next, Stop, and Volume Control, some skins provide added functionality. Most skins by default use only a small portion of a display and represent a convenient method for viewing a video or listening to audio while working at your PC.
Figure 7.8 The default Windows Media Player 10 skin looks like a spaceship connected to a viewer.
7.2 Apple Computer’s QuickTime Apple Computer has long been a leading producer of hardware and software innovations. Its current QuickTime player, version 7, is bundled with iTunes 6 for computers running Windows 2000 and Windows XP. Although the basic bundle is free, Apple also offers an upgrade to QuickTime 7 Pro for $29.99 from its Web site. This version of QuickTime converts users from video watchers to video makers, because it includes support for creating videos using the H.264 codec, which is defined as part of the MPEG-4 standard (MPEG-4 Part 10, also known as Advanced Video Coding). In addition, QuickTime 7 Pro allows users to record audio for podcasts (a term used to reference radio shows downloaded over the Internet), create movies that can be viewed on an Apple iPod, and convert media into more than a dozen formats.
Overview Both QuickTime 7 and QuickTime 7 Pro include built-in support for H.264, which provides a significant advance in compression technology over MPEG-2. The H.264 compression standard was selected for use in HD-DVDs and enables users to watch video that is crisp, clear, and colorful in a window that is up to four times the size of windows typically used to view video when an MPEG-2 codec is used. Although QuickTime Pro’s support of MPEG-4 represented a method of media player differentiation, as more Internet HD programming becomes available for user access, one can expect that other media players will likely obtain a similar capability.
Installation Although Apple provides users with the ability to download QuickTime as a standalone entity, I selected the bundled version of QuickTime and iTunes. The download process and program installation were straightforward, with some options that were very interesting. First, users can select from 14 languages, with the default being English. A second interesting portion of the installation process is the appearance of a dialog box that provides a summary of information about iTunes and QuickTime, including system requirements and new features incorporated into each program. The user selects the button labeled Next, which is common to a sequence of dialog boxes, and a Setup Type dialog box gives the option of enabling or disabling the installation of desktop shortcuts and the use of iTunes as the default player for the audio files stored on the computer. The user either accepts the default location for installing iTunes and QuickTime or selects another location, after which the installation process commences, followed by an “Installation Successful” message displayed in a new dialog box, informing the user that the installation was completed without error. Once the user clicks on the button labeled Finish at the bottom of the dialog box, a series of iTunes setup boxes will be displayed. The first box can be used to search the My Music folder to find songs already stored and add them to the iTunes music library.
iTunes Setup Assistant iTunes can play both mp3 and aac music files as well as automatically convert unprotected Microsoft wma files to aac files. This conversion is performed by the iTunes Setup Assistant, whose first dialog box by default selects adding mp3 and aac files to the iTunes music library and converting wma files to aac files. The next dialog box generated by the iTunes Setup Assistant concerns organizing the iTunes music folder. Users can choose
to keep the iTunes music folder organized, which will cause the program to rearrange and rename their music files and folders or they can choose to be in charge of changing file and folder names. A third iTunes Setup Assistant dialog box allows users to select whether to allow the program to take them to the iTunes Music Store or the iTunes Library.
File Conversion Figure 7.9 illustrates the iTunes display after I allowed the default iTunes Setup Assistant settings to convert existing wma files to aac files and add those files to the iTunes Music Library along with mp3 and aac files already in the My Music folder on my desktop PC. In the left column of Figure 7.9, which has the heading Source, note the entry labeled Videos. By selecting that entry and using the file menu to input a previously stored video,
you can use iTunes to view videos. However, as we will note, the support of certain file types represents a problem.
Figure 7.9 iTunes can support listening to CDs and radio stations, creating lists of songs, viewing videos, and burning CDs and DVDs.
Figure 7.10 Through the iTunes File menu, users can import video and audio files.
Video Incompatibility When the user selects Import from the iTunes File menu, by default iTunes looks in the My Music folder. In Figure 7.10, which illustrates the use of the Import command in the File menu, note that this author previously stored four videos in the wmv format, as indicated by the icons to the left of each video name. Because a friend previously e-mailed me the copper clappers video, which operated under Windows Media Player, we can note one file type incompatibility by selecting that file. Clicking on the Open button shown in Figure 7.10 will result in an error message: “The file ‘copper clappers.wmv’ cannot be imported because it does not appear to be a valid exported file.” Thus, one file incompatibility is the inability of QuickTime to open Microsoft wmv video files.
File Support Previously we described file types supported by Windows Media Player 9. Apple Computer developed file formats referred to as mov and qt for QuickTime to create, edit, publish, and view multimedia files. Only QuickTime files version 2.0 or earlier can be played in the Microsoft Windows Media Player. Later versions of QuickTime require the proprietary Apple QuickTime player. Similarly, in the opposite direction, Apple Computer’s QuickTime cannot be used to view Windows Media files. Now that we understand why a short video that was viewable in Windows Media Player 10 could not be viewed in QuickTime (or, more accurately, opened in iTunes’ movie source), let’s turn our attention back to QuickTime.
Evolution The first version of QuickTime, referred to as QuickTime 1.x, provided the basic architecture to include multiple movie tracks, extensible media type support, and a range of editing features. The original video codecs included what is referred to as “road pizza” (used for normal, live-action video), an animation codec, and a graphics codec. Apple released QuickTime 1.5 for the Macintosh in 1992 and contracted with an outside company to develop QuickTime 1.0 for Windows. QuickTime 2.0 for the Macintosh was released in February 1994 and represents the only version of this media player that did not have a free version. QuickTime 2.0 added support for music soundtracks, and a version for Windows was released in November 1994. The next release of QuickTime, version 3.0 for the Macintosh, occurred in March 1998. This release included both a free version and a Pro version, which included additional features. This dual-release policy of a free and Pro version of QuickTime has continued through version 7. In June 1999 Apple released QuickTime 4.0 for the Macintosh. This version of QuickTime added a second video codec and support for large movie files. QuickTime 5.0 for the Macintosh was released in April 2001. This release added skins to the QuickTime player and support for multiprocessor image compression. In addition, a full-screen video mode was added to the Pro version of the player. Shortly after its Macintosh release, a version for Windows became available. In July 2002, Apple Computer released QuickTime 6.0, which included support for MPEG-2, MPEG-4, and aac file types. Between July 2002 and October 2004 Apple Computer added nine minor releases to include support for iTunes 4. In May 2005 Apple released iTunes 4.8, which included support for viewing full-screen QuickTime video through iTunes. The latest release of QuickTime, version 7.0, occurred in April 2005, with a Windows version released in July 2005. 
Since then, Apple has used a series of minor releases to fix bugs and improve H.264 performance.
Currently, QuickTime supports mov files, which function as multimedia container files that contain one or more tracks, each of which stores a particular type of data, such as audio, video, effects, or text used for subtitles. Each track holds its media either as a data reference to a separate file or as a digitally encoded media stream created using a specific codec. In addition, QuickTime 7 supports most of the H.264 standard as well as MPEG-4 files.
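The container-and-track arrangement described above can be modeled with a small data structure. The Python sketch below is a conceptual illustration only: it does not parse real mov files, and the `MovieTrack` and `Movie` classes, their field names, and the sample file name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MovieTrack:
    """One track of a container: a single kind of data."""
    kind: str                        # e.g. "video", "audio", "text"
    codec: Optional[str] = None      # set when media is stored as an encoded stream
    data_ref: Optional[str] = None   # set when media lives in an external file

    def is_reference(self):
        """True when the track points at external media instead of embedding it."""
        return self.data_ref is not None


@dataclass
class Movie:
    """Illustrative stand-in for a mov container holding one or more tracks."""
    tracks: List[MovieTrack] = field(default_factory=list)

    def kinds(self):
        return [t.kind for t in self.tracks]


movie = Movie([
    MovieTrack("video", codec="H.264"),
    MovieTrack("audio", codec="AAC"),
    MovieTrack("text", data_ref="subtitles.txt"),
])
print(movie.kinds())  # ['video', 'audio', 'text']
```

Keeping each type of data in its own track is what allows a player to, for example, switch subtitle text on or off without touching the video or audio.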
Operation As previously noted, Apple Computer has integrated some elements of QuickTime and iTunes that enable movies to be viewed in iTunes. In addition, users can operate QuickTime as a separate program, as illustrated in Figure 7.11. Notice the default size of QuickTime in relation to the size of a PC screen. This size is what Apple refers to as “normal.” Through the View menu, users can also select Half and Double Size. However, the Pro version of QuickTime is needed to obtain a full-screen viewing capability, unlike Windows Media Player 10, which includes built-in support for full-screen display of video.
Figure 7.11 QuickTime’s normal size display.
Watching Videos Examine the top right column shown in Figure 7.11 and you will notice a list of movie trailers that were transmitted from the Apple Computer Web site when the QuickTime program was initiated. If you click on a movie trailer or select a video with a compatible file format and have iTunes 4.8 or later, you can browse and watch videos in iTunes. An example of this is shown in Figure 7.12, which illustrates the result obtained by clicking on the Basic Instinct 2 trailer shown in the upper right column of the QuickTime player shown in Figure 7.11. Note that the iTunes display screen is initiated as a response to the QuickTime selection, with the selected trailer appearing as being from the Music Store, a term used to represent the iTunes Web site. Clicking on the Trailer option in Figure 7.12 results in the selected preview being displayed in iTunes. When this occurs the source is shown as the Movie Store. Figure 7.13 illustrates the beginning of the movie preview for Basic Instinct 2. As indicated earlier in this book, the Apple Computer Movie Store offers consumers the ability to purchase a wide
variety of television shows. Because those shows include several ABC hits, including Lost, it will be interesting to observe the effect of an April 2006 ABC announcement that the network intends to make many shows available for free download. Although such shows will include many hit series now sold on the Movie Store, viewers will receive each show free from ABC with commercials.
Figure 7.12 Selecting a trailer in QuickTime results in iTunes being initiated to allow the trailer to be played from the Music Store.
Figure 7.13 Watching a trailer in iTunes whose source is the Movie Store.
Returning to QuickTime As previously noted, QuickTime is highly integrated with iTunes; however, it can also operate as a standalone video player. In fact, if you know the URL of a movie, TV show, or another type of video whose file format is supported by QuickTime, you can open that URL for viewing in a QuickTime player. Although users can open files and URLs, only QuickTime Pro allows them to save or export files. Thus, Apple Computer has limited the basic capability of QuickTime, which encourages users to spend a nominal $29.99 to purchase a license to use QuickTime Pro.
Plug-In Components Like Microsoft’s Web site, Apple Computer’s Web site lists a series of components that can be acquired from third-party vendors to enhance QuickTime. Examples of components, which Microsoft refers to as plug-ins, include a full-screen, full-motion, TV-quality codec from On2 as well as support from IPIX for 360° × 360° images, which are useful for real estate, e-commerce, travel, and hospitality organizations.
7.3 Other Media Players In addition to Microsoft’s Windows Media Player and Apple Computer’s QuickTime and iTunes, two other players deserve mention. Those players are the Real Networks RealPlayer and the Macromedia Flash Player.
RealPlayer Similar to Apple Computer, Real Networks offers a free and a Plus version of RealPlayer, with a slight charge associated with the Plus version of the player. The basic RealPlayer supports all major media formats and supports the transfer of music to more than 100 portable devices. RealPlayer and RealPlayer Plus support RAM file types, which are not compatible with Windows Media Player and QuickTime. Although Real Networks includes support for many popular video file formats, it currently has a small portion of the market for media players.
Flash Player A second media viewer that warrants attention is the Macromedia Flash Player. Many Web sites, including abc.com and Yahoo News, support the use of the Macromedia Flash Player. Figure 7.14 illustrates the playing of ABC’s Commander-in-Chief program within a Macromedia Flash Player, with the settings box displayed in the middle of the player. The four icons on the bottom edge of the settings box determine which settings can be viewed and changed, with the leftmost icon displaying privacy settings. The file folder icon allows a user to control how much information can be stored on the user’s computer. The microphone icon controls the record volume, and the camera icon allows camera controls if the user’s computer is connected to that type of device.
Figure 7.14 Viewing the ABC network hit Commander in Chief using Macromedia’s Flash Player.
7.4 Summary In this chapter we looked at two media players in detail and briefly discussed two additional players. In an IPTV environment, the media player is required to view video only on a PC. When IPTV enters the home on a DSL or fiber connection, it will primarily flow to a set-top box, which will then distribute the selected channel either to a TV directly connected to the box or, via a home network, to a television located anywhere in the home that has a connection to that network. Although a media player is not required to view video on a television, as PCs and TVs converge there exists the possibility that the media player will evolve to not only display video on televisions but also support many other features, including displaying caller ID information, enabling video conference calls, permitting remote gaming, and supporting to-be-developed applications that are limited only by one’s imagination. For example, by combining the capabilities of media players with the display and audio capability of modern television, it becomes possible for IPTV data streams to be manipulated for the convenience of the consumer, such as by recording shows with fewer commercial breaks on a network drive available for viewing on any television connected to the home network. Thus, convergence of PC and TV technology along with a broadband Internet connection offers the potential to alter the manner by which we view video as well as when and where we view video.
Chapter 8
Internet Television In concluding this book, we will examine an evolving industry: television delivered via the Internet. Although television delivered via the Internet can be considered a form of IPTV, there are some significant differences between the two that we will discuss in the first section of this chapter. In the second section we will turn our attention to Internet television itself, examining how we can view television broadcasts on our PC from stations located around the world.
8.1 Internet Television vs. IPTV When we talk about Internet television and IPTV we tend to treat them as synonymous terms because Internet television represents a stream of IP datagrams that delivers MPEG frames generated by a television station. Although the two terms can be treated as synonyms, in reality they should be used to describe two different technologies. Thus, let’s focus on the true meaning of each technology to obtain an appreciation for how they actually differ from one another.
Internet Television Internet television refers to the broadcast of news, weather, and TV shows from television stations that add an Internet interface to their over-the-air broadcasts. The Internet interface either takes selected station videotapes and converts them into a sequence of IP datagrams transporting, most commonly, MPEG-2 frames, or provides a “dawn-to-dusk” broadcast via the Internet of the station’s over-the-air transmission. Viewing of station broadcasts on the Internet is accomplished via a media player. Typically, an Internet television station, which in effect represents a conventional television station that also broadcasts video via an Internet connection, limits its support to one type of media player, such as Microsoft’s Windows Media Player, Apple’s QuickTime Media Player, or Real Networks’ RealPlayer. The media player supported provides the interface required to view the stream of IP datagrams on a desktop or laptop computer. Because media players support buffering of IP datagrams, a broadband connection to the Internet, although desirable, is not mandatory for viewing video. Now that we have a general appreciation for Internet television, let’s turn our attention to IPTV.
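The trade-off behind that last point, that buffering lets a slow connection substitute waiting time for bandwidth, can be sketched with simple arithmetic. The Python function below is a rough model, assuming a constant-bit-rate stream and a steady link, which real networks only approximate; the function name and the sample bit rates are hypothetical and do not come from any particular media player.

```python
def startup_delay_seconds(clip_seconds, stream_kbps, link_kbps):
    """Seconds of pre-buffering needed to play a clip without stalling.

    If the link is at least as fast as the stream, playback can begin
    immediately; otherwise the player must get far enough ahead that the
    download finishes no later than playback does.
    """
    if link_kbps <= 0:
        raise ValueError("link rate must be positive")
    download_seconds = clip_seconds * stream_kbps / link_kbps
    return max(0.0, download_seconds - clip_seconds)


# A 60-second clip encoded at 300 kbps (hypothetical figures):
print(startup_delay_seconds(60, 300, 56))    # dial-up: roughly 261 s of waiting
print(startup_delay_seconds(60, 300, 1500))  # DSL: 0.0, playback can start at once
```

The model shows why buffering suits short clips on slow links but not continuous viewing: the wait grows in proportion to the length of the program, which is why IPTV, as described next, depends on a high-speed connection instead.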
IPTV
Although IPTV can be viewed as Internet television, the term’s intended usage is to describe the transmission of video, including movies, television, and concerts, at a high speed that enables subscribers with an applicable set-top box to view events on a television without buffering. Probably one of the most mentioned IPTV projects is Project Lightspeed, initiated by SBC Communications (which acquired AT&T and assumed its name in late 2005). Project Lightspeed, which was described earlier in this book, represents a private IP network that will deliver television, video on demand (VOD), and high-speed Internet access to millions of homes. Video will be decoded by a set-top box and delivered either directly to a connected television or via a home network to a remote television. Thus, the set-top box is an integral hardware component associated with IPTV, whereas the media player represents an integral software component associated with Internet television viewing. Now that we have an appreciation for the similarities and differences of Internet television and IPTV, we will conclude this chapter by examining the evolving industry represented by Internet television.
8.2 Internet Television
From a handful of television stations viewable via the Internet a few years ago, this industry has exhibited explosive growth to the point where hundreds of stations were available for viewing during 2006. In this section we will look at a few individual Internet television sites as well as a Web site for connecting to and viewing dozens of television stations located around the globe.
Evolution
Internet television dates to the mid-1990s, when Reid Johnson, a 20-year veteran of the television news business, founded the firm Internet Broadcasting. Its first television station Web site, www.Channel4000.com, went online during 1996. The success of Channel 4000 became a model for expansion into additional markets. This expansion caught the attention of such media companies as the Hearst Corporation, the Washington Post, and the McGraw-Hill Companies, which became partners of Internet Broadcasting. By 2000, Internet Broadcasting produced more than 70 television Web sites that cumulatively received more than 12 million unique visitors monthly. Although Internet Broadcasting has achieved significant success, including becoming number one in TV news in 18 of the top 25 markets in the United States, its Web sites use a mixture of video, text, and images to present the news. For example, consider NBC10.com, which is the Web site operated by Internet Broadcasting in Philadelphia, Pennsylvania. Figure 8.1 illustrates the home page of NBC10 from the morning of April 18, 2006. A user can elect to watch a video of the top story, read the story, or view images. In addition, under the “News” column on the left portion of the Web page visitors can select the “Video” entry, which displays a series of videos by predefined category. Currently, NBC10 limits its support of video to Windows Media Player.
Figure 8.1 The home page of NBC10.com enables visitors to view a video or read the top stories.
Webcasting
At approximately the same time Internet Broadcasting was placing television stations on the Web, other organizations began to realize the potential of broadcasting movies and television shows. As other companies developed Web sites to broadcast video, the term “Webcasting” evolved. This term was initially used to reference the broadcasting of television programs, such as soap operas, news, and comedy shows, over the Internet. Later, the term was expanded to reference the electronic transmission of audio and video data over the Internet in real time in the form of streaming audio and video. Thus, this newly expanded definition included music videos, movies, and other forms of audio-visual entertainment.
Advantages
A conventional television station is limited by the FCC as to its broadcast power. Thus, the conventional television station can be considered to be limited to a specific geographic market. This limitation affects advertising, which is the manner by which television stations obtain the majority of funds for their operation. The development of satellites allowed television stations to break their former geographic barrier, because distant cable companies could negotiate deals that enabled television stations located in one area of the country to be carried in a cable territory located in another portion of the country. Among the first television stations to break the geographic barrier was WTBS in Atlanta, which became known as a “super station” due to the large number of cable companies that carry its programming. In an Internet environment it becomes possible for television stations to become “global super stations” because any user connected to the Internet via an applicable high-speed connection becomes capable of viewing the features of the site, including different types of video. Thus, it also becomes possible for television stations to expand their advertising base to national and international companies.
Legal Issues
Although the transmission of television Webcasting may appear to be simple, some legal issues must be considered. Those legal issues are associated with copyrighted material. Currently, most programming has licensing and distribution agreements that may be applicable to a geographic area or a country. When a television station offers such programming to Internet users, a key question is whether the station is now violating its licensing and distribution agreement. Another question that warrants consideration occurs when a foreign television station obtains foreign rights to programming produced in the United States and allows Internet users in the United States to view such programming. When this occurs, is the foreign television Webcaster accountable for copyright infringement under U.S. law? Although some initial U.S. rulings indicate that copyright infringement occurs when U.S. citizens located in the United States view copyrighted television programming originated in a foreign jurisdiction, the appeals process may require several years until this issue is fully resolved. Now that we have an appreciation for the evolution of Internet television and some of its legal issues, let’s turn our attention to television portals.
Internet Television Portals
One of the more recent developments in the world of Internet television is the establishment of portals that provide users with access to hundreds of Internet television stations located around the globe. One such portal is BeelineTV.com, whose home page is shown in Figure 8.2. Looking at Figure 8.2, you will note a number to the left of each television station entry. That number identifies the Internet connection speed (in kilobits per second) required to view streaming video from the station. To the right of the station entry you will see the word “Real,” “Q time,” or “Media,” which identifies the type of media player required for viewing the station’s streaming media. Here, “Real” identifies Real Networks’ RealPlayer, “Q time” identifies Apple Computer’s QuickTime Media Player, and “Media” identifies Microsoft’s Windows Media Player. Through the BeelineTV.com Web site you can view television stations located in more than 20 countries. If you scroll down the site’s home page, you encounter more than 35 stations listed under the “English TV” category, including England’s BBC News, Canada’s CBC, and from the United States, AFTV Movie Classics and NASA TV.
Figure 8.2 From the BeelineTV.com Web site you can access more than 100 Internet television stations.
To view certain stations using Windows Media Player, you will need to run an ActiveX control. The BeelineTV.com Web site will prompt you with an applicable message that, when accepted, will result in Windows Media Player opening in a separate window. Figure 8.3 illustrates the window that opened after this author selected the AFTV sci-fi/horror station. Note that from the new window in which the programming is displayed you have the option of viewing the channel schedule and doubling the screen size. Depending on the media player’s codec, it may or may not be a good idea to increase the screen size or change the view to full-screen mode. If your codec supports MPEG-2 and the station transmits streaming video in a low-resolution format, the picture will look awkward when you switch to a larger screen size. However, if your media player supports MPEG-4, there is an H.264 movie station you can access to view movies on a full-screen basis with very good clarity. As more users begin to view video over the Internet, we can reasonably expect more stations to offer MPEG-4 streaming video and media players to eventually support the technology by default.
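The listing convention described for the portal, a required connection speed in kilobits per second plus a media player label for each station, lends itself to a simple filter. The following minimal sketch shows how a viewer might determine which stations are watchable given a connection speed and a set of installed players. The `Listing` type, the `viewable` function, the station names, and the speeds are all hypothetical illustrations and are not taken from the BeelineTV.com listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One portal entry: station name, required speed, and player label."""
    name: str
    required_kbps: int
    player: str  # "Real", "Q time", or "Media", as labeled on the portal


def viewable(listings, link_kbps, installed_players):
    """Return the entries whose speed requirement and player the viewer can satisfy."""
    return [
        entry for entry in listings
        if entry.required_kbps <= link_kbps and entry.player in installed_players
    ]


if __name__ == "__main__":
    # Hypothetical entries used only to exercise the filter.
    stations = [
        Listing("Station A", 56, "Real"),
        Listing("Station B", 300, "Media"),
        Listing("Station C", 150, "Q time"),
    ]
    for entry in viewable(stations, link_kbps=256, installed_players={"Real", "Media"}):
        print(entry.name, entry.required_kbps, entry.player)
```

A filter of this kind only compares the advertised requirement with the nominal connection speed; actual viewing quality will also depend on the throughput the connection sustains in practice.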
Figure 8.3 Viewing an AFTV sci-fi/horror station broadcast on the Internet.
Other Portals
Although BeelineTV.com and other portals provide access to a large amount of free content, broadband video content from major news stations and other stations can be viewed only via a subscription service. Thus, some portals now charge a monthly or annual fee to view hundreds of stations, including subscription-only stations. As the industry matures it will be interesting to observe the difference in the growth of advertiser-supported Internet television versus subscription-based Internet television.
Individual Internet Stations
In addition to the use of a portal, you can directly access various Internet television sites that may or may not be available for access via a portal. For example, the National Aeronautics and Space Administration (NASA) Web site provides a link to NASA TV, which enables users to view press briefings and various scientific-related clips without cost. A second example of Internet television viewing is Israel National TV (URL: www.israelnationaltv.com). Figure 8.4 illustrates the home page of this Web site. Note that you can choose to view news, interviews, and other types of video as well as purchase programming. This site envelops Windows Media Player with a series of selections and text-based news, which illustrates how stations can tailor a media player to satisfy their operational requirements.
Figure 8.4 The home page of IsraelNationalTV.com.
8.3 Summary
Today we are at the start of a revolution concerning the manner by which we access and view television stations connected to the Internet. Although current Internet connection speeds and media player capabilities make most Internet television viewing feel similar to viewing a modern television show on a TV set produced during the 1960s, evolving technology will change this situation for the better. As more capable codecs are added to media players, higher speed Internet access becomes more economical and available, and Internet television stations convert to H.264-compatible streaming media, we can reasonably expect the use of Internet television to increase significantly. As this occurs, Internet television will join the ranks of other types of entertainment that compete for our attention on a daily basis.