English Pages 310 Year 1999
Table of Contents
Frame Relay for High-Speed Networks
by Walter Goralski
ISBN: 0471312746
John Wiley & Sons © 1999, 410 pages
Everything you need to know about frame relay technology, in plainspoken English. Pete Loshin
Frame Relay for High-Speed Networks
Introduction
Chapter 1: What Frame Relay Can Do
Chapter 2: The Public Data Network
Chapter 3: Frame Relay Networks
Chapter 4: The Frame Relay User-Network Interface
Chapter 5: Frame Relay Signaling and Switched Virtual Circuits
Chapter 6: Congestion Control
Chapter 7: Link Management
Chapter 8: The Network-Network Interface (NNI)
Chapter 9: Voice over Frame Relay
Chapter 10: Systems Network Architecture and Frame Relay
Chapter 11: Internet Protocol and Frame Relay
Chapter 12: Asynchronous Transfer Mode and Frame Relay
Chapter 13: The Future of Frame Relay
Bibliography
Acronym List
Frame Relay for High-Speed Networks Walter Goralski Copyright © 1999 by Walter Goralski. All rights reserved. Published by John Wiley & Sons, Inc. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-, fax (978) 750-4744. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-, (212) 850-, fax (212) 850-, E-Mail: [email protected]. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold with the understanding that the publisher is not engaged in professional services. If professional advice or other expert assistance is required, the services of a competent professional person should be sought. Publisher: Robert Ipsen Editor: Marjorie Spencer Assistant Editor: Margaret Hendrey Managing Editor: Micheline Frederick Text Design & Composition: North Market Street Graphics Designations used by companies to distinguish their products are often claimed as trademarks. In all instances where John Wiley & Sons, Inc. is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.
Acknowledgments
There are a number of people who should be thanked for their assistance in making this book. First I would like to thank Hill Associates, Inc. for creating an intellectual work environment where personal growth is always encouraged and for a climate that makes the writing of books like this possible in the first place. I owe special thanks to the Hill reviewers who painstakingly waded through the individual chapters and found many of my errors. These are Clyde Bales, Rod Halsted, Gary Kessler, Hal Remington, Harry Reynolds, Ed Seager, Tom Thomas, and John Weaver. Any mistakes which remain are my own. On the publishing side, Marjorie Spencer supplied the vision that produced this text. Margaret Hendrey saw the book-writing process through, although my primitive figure-drawing skills must have been a real challenge. Finally, Micheline Frederick has produced this fine volume from my raw material.
My family, Jodi, Alex, and Ari, has also come to grips with the fact that I am now a writer in addition to all the other roles I play from day to day, and that I am now entitled to all the rights and privileges that a writer of my stature deserves (such as quiet). Thank you one and all.
Introduction
What do an accountant sitting in a small home office accessing the Web, a sales representative at a branch office checking the latest home office price list, and a hospital worker ordering medication from a supplier’s remote server all have in common? Whether they are aware of it or not, more and more often the links from place to place in these environments are provided by a frame relay network. In the near future, this list might easily extend to corporate executives holding a videoconference, commercial artists manipulating images, and even college students calling home. In fact, all three of these things are done on frame relay networks now, just not routinely or in all environments.
At a lower level than the user application scenarios above, a frame relay network can support LAN interconnectivity for Intranet router-to-router traffic, carry financial transactions for a corporate SNA network, or carry digital voice overseas, and all faster than before and at a fraction of the price of almost any alternative. How can frame relay do all of this so well? That is what this book is about.
It should come as no surprise that networking needs have changed radically in the past few years. After all, the end systems that perch on almost every worker desktop and in every home, school, and library have changed drastically. Systems that were considered state-of-the-art 2 or 3 years ago struggle mightily even to run the games of today, let alone the new applications. Audio and video support is not a luxury, but a must, if only so that the multimedia tutorial for a new application can be appreciated (and no one reads the manual anyway). A video game requires more power than a supercomputer had 20 years ago. A palmtop assistant draws on more computing power than an IBM mainframe could in 1964. In 1991, a 66 MHz computer with a color monitor, a 500-megabyte hard drive, 16 megabytes of RAM, and a modest 2x CD-ROM drive cost more than $13,000. And so on.
And in computing, as in almost nowhere else outside of the electronics industry in general, prices fall as power rises. As illuminating (or boring) as these examples might be, the sole point is that the typical device that uses a network has changed radically in the past 10 or 20 years, from alphanumeric display terminal to multimedia color computer. Yet how much has the network changed in that same time period? Hardly at all, and most of those changes apply to the local area network (LAN), where the cost of use is minimal after installation, and not to the wide area network (WAN), where the cost of use in the form of monthly recurring charges can be crippling. In a very real sense, frame relay (and other fast packet WAN technologies like ATM) represents the application of modern computing power not to the devices at the endpoints of the network, but inside of the network itself.
Computer networks are so common today that it is hard to imagine a time when they were not. Yet the history of computer networks could be said to begin with the invention of what eventually became the Internet in 1969. This was only four years after Time magazine, in a story about computers (remarkable in itself), boldly predicted that “eventually” computers would be able to talk to each other over “communication links.” In the 1980s, the idea of networking computers was not so new, and by the end of that decade, it was more or less expected. In the 1990s, the rise of the whole client/server phenomenon has enabled a whole raft of new applications requiring both networks and computers, from electronic messaging to remote database access to training videos.
Of course, this new emphasis on clients and servers and the networks that connected them led to the creation of many networks. In fact, there turned out to be perhaps too many networks. It seemed like every application, not only voice and video but the many types of data, required a slightly different network structure in terms of bandwidth (speed), errors, and delay (latency). In the expanding economy of the 1980s, the immediate reaction when faced with a new network application was to build a completely new network specifically groomed for that application. This typically meant employing time division multiplexing (TDM) to share the total bandwidth available among sites by dedicating the maximum amount available or the minimum amount required to adequately support the application. This form of channelized networking came to dominate the WAN scene in the 1980s.
But just as the economy needed to regroup before it could advance to new heights in the 1990s, so did networking. Maintaining all of these essentially parallel networks proved wasteful and enormously costly. Few people used the tie-lines between office PBXs at 3 a.m. Yet the bandwidth these channels represented was still dedicated to the voice network. Many organizations, attempting to perform out-of-hours tasks such as backing up remote servers over the channelized networks, were unable to utilize the bandwidth locked up in other channels. Frame relay solves this problem by flexibly, or dynamically, allocating the total bandwidth based on the instantaneous (well, within a few milliseconds) demand of all active applications. Idle applications consume no bandwidth at all.
Frame relay is not the only way to solve the problems posed by the time-consuming and wasteful task of maintaining parallel networks; it has just proved to be the most popular. And frame relay addresses more than just the need to integrate the total communications infrastructure. Frame relay can also:
1. Reduce costs. A great deal of this cost reduction comes from the elimination of the need for parallel communications networks. But there is more to it than that. Frame relay replaces a complex, incomplete web of dedicated private lines with a “cloud” of general and total connectivity, not only within the United States, but around the world. Organizations that could only dream of leasing a dedicated private line to Europe now enjoy affordable communications with major European capitals thanks to frame relay.
2. Improve network performance. It used to be true that everyone involved in a network was so happy that it worked at all that they had little inclination to care how the network was performing. And even if they did care, there was little in the way of performance tuning methodologies or software to assist them in their task. The problem with creating custom networks for each application is that they were all slightly different in their performance tuning needs. But frame relay is the same network for all its various applications. Not only is frame relay faster than almost anything else, it is also more tunable than a collection of individual, parallel networks.
3. Make the network as a whole more reliable. In a network composed of an interconnected mass of individual leased private lines, the failure of one critical link can be devastating to the network as a whole. Part of the allure of public networks is that they are more resilient and robust than private networks. Everyone knows that the public voice network suffers internal link failures all the time. Yet with a few widely publicized exceptions, notable only due to their rarity, these failures have no impact on overall voice network service. Since frame relay is almost always a public network service, it shares this characteristic with the public voice network.
4. Make the network more future-proof. Many networks are difficult to scale for new and more applications in terms of speed and connectivity. Even simple reconfiguration can be a time-consuming and costly process. Frame relay networks can react to sudden changes quite rapidly, often within 24 hours and as the result of a simple telephone call. And nothing is more painful than watching competing organizations become more successful while one’s own organization is saddled with technology that is outdated and perhaps even considered obsolete. Frame relay is not only a member of the fast-packet family, but is intended to be interoperable with the other major fast-packet network technology, ATM.
5. Make the network more manageable. Network management centers often resemble a hospital emergency room. Nothing much happens until the patient is wheeled in on a cart. Then swarms of personnel swing into action to save the victim. Without adequate network management hardware and software at their disposal, network managers often attempt little more than fixing problems as they are reported. Frame relay network management techniques offer a way of detecting problems like link failures literally as they occur (not when the users finally get around to calling), and network congestion areas can actually be detected before they occur. And most of the management tasks associated with frame relay can be outsourced to the service provider (this is a cost savings benefit as well).
6. Provide a more structured migration path. All networks must be upgraded eventually. But to what? And how? Frame relay offers a certain migration path from old network to frame relay, regardless of whether the old network was SNA or routers linked by leased private lines, or X.25, or almost anything else. In most cases, the changes required are simple and few. And once the transition to frame relay is made, the fast packet aspects of frame relay virtually guarantee a long and useful life for the new network.
If it sometimes seems like frame relay is enjoying the kind of success that ATM and a few other WAN technologies wish they had, it is because this is undoubtedly true. Frame relay is sometimes called “the first public data networking solution that customers actually like and will pay for.” Frame relay has also been called “the first international standard network method that works” and “X.25 on steroids.” All of these statements are quite accurate, and all contribute to the continued success of frame relay networks.
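The contrast drawn earlier between channelized TDM networks and frame relay's dynamic bandwidth allocation can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names, the channel sizes, and the 3 a.m. demand figures are hypothetical, and the proportional scaling used when demand exceeds the link is a simplification rather than a description of how any frame relay switch actually behaves.

```python
def tdm_allocate(channel_kbps, demand_kbps):
    """Channelized TDM: each application can use only its own fixed channel.

    Bandwidth in an idle channel cannot be borrowed by another application.
    """
    return {app: min(demand_kbps.get(app, 0), capacity)
            for app, capacity in channel_kbps.items()}


def dynamic_allocate(link_kbps, demand_kbps):
    """Dynamic allocation: active applications share the whole link.

    Idle applications consume nothing. If total demand exceeds the link,
    every application is scaled back proportionally (a simplification).
    """
    total = sum(demand_kbps.values())
    if total <= link_kbps:
        return dict(demand_kbps)
    scale = link_kbps / total
    return {app: demand * scale for app, demand in demand_kbps.items()}


# Hypothetical 1536 kbps link, split into three fixed TDM channels.
channels = {"voice": 768, "sna": 256, "backup": 512}

# Hypothetical demand at 3 a.m.: voice tie-lines idle, server backup busy.
demand = {"voice": 0, "sna": 64, "backup": 1400}

tdm = tdm_allocate(channels, demand)
dyn = dynamic_allocate(sum(channels.values()), demand)

print("TDM backup throughput (kbps):    ", tdm["backup"])
print("Dynamic backup throughput (kbps):", dyn["backup"])
```

In this made-up case the backup job is held to its own 512 kbps channel under TDM even though the voice channel sits idle, while under dynamic allocation it can use the bandwidth that nobody else is asking for.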
Overview
Usually it is easy to accurately gauge the popularity of any given network technology. One can just walk into a bookstore (online or otherwise) and check out the number of titles or shelf space that books on a particular topic occupy. The shelves are typically overflowing with books on the Internet, Web, TCP/IP, Java, and the like. Only a dedicated search will turn up books on the less popular technologies and topics such as token ring LANs or Fibre Channel. There are exceptions to this general rule, of course. Among LAN technologies, Ethernet is one of these exceptions. In spite of the immense popularity of Ethernet LANs, to the extent that the label “not Ethernet” attached to a new LAN technology is a sentence of death or at least condemnation to a niche market, there are few books about Ethernet or related technologies like 10Base-T. In the WAN technology arena, the same is true of frame relay networks. There are a handful of texts, and that’s about it. Yet not only does frame relay remain one of the most popular public WAN technologies of all time, it also continues to expand its horizons into the voice and video application areas.
This book is intended to address the apparent imbalance between the high level of frame relay network popularity and the lack of formal, book-length sources of information on frame relay networks. A good, current, complete book on frame relay is definitely needed to help networking personnel understand exactly how frame relay works in order to assess both what frame relay does today and what it is capable of in the future. Customers and end users should know the basics of frame relay configuration and have a realistic grasp of frame relay capabilities. Service providers should know the basics of frame relay network operation and have a clear understanding of just what the network is doing at all times.
Everyone should have an appreciation of the enormous power of frame relay networks to adequately address most of the issues involving modern networking needs. Most existing books on frame relay are quite short, especially when compared to the weighty volumes dedicated to even niche technologies such as Java or intranets. These frame relay books tend either to emphasize the bits and bytes of the network operation at the expense of the application of frame relay to various networking areas, or to give checklists for user and customer frame relay service preparation without ever mentioning exactly how the frame relay network provides such services. This book will attempt to give equal time and emphasis to both aspects of frame relay. Those used to assessing and appreciating technology at the network node and data unit level will find plenty to occupy them here. Those used to evaluating and admiring technology at the service and effectiveness level, in the sense of “what can I do with frame relay better or that I cannot do now?”, will also find that this book has much to offer.
So a new book on frame relay is much needed. Two aspects of frame relay are important in this regard. First, there is the fact that frame relay is a public data service. Private frame relay networks can be built, just as private voice telephony networks can, and have been, but the emphasis here and in the networking spotlight is on this public aspect of frame relay. There have been other public data services, notably X.25 packet switching and ISDN. Neither has ever achieved the stature that frame relay has in a short amount of time. This book will explore the reasons for this.
Second, frame relay was standardized and first implemented as a public data service. But frame relay has become a vehicle not only for data traffic in the traditional sense; it now carries all sorts of digital information, such as voice and video and graphics and audio. Since frame relay is widely deployed by telephone companies, the application of voice over frame relay may come as a surprise. In fact, frame relay was conceived as a method of carrying voice packets inside frames, so no one need be shocked that the promise of frame relay is now being fulfilled. Naturally, the reasons and methods behind this use of frame relay for mixed traffic types will be investigated in full in this book.
Who Should Read This Book
This book is intended for readers with at least a passing familiarity with the operation of modems, LANs, and other basic networking principles. But outside of these basics, little is required to fully understand and appreciate this text. There are few equations, and these few require no more than basic high-school algebra to understand completely. It is anticipated that most of the readers of this book will be those who currently manage networks in organizations, need to understand more about frame relay networks in general, or work for service providers who offer frame relay services. This does not rule out readers with a general interest in networks, of course. While not specifically intended for a university or college audience (there are no individual chapter summaries, study guides, or questions, for example), there is nothing that prevents this book from being used at an undergraduate or even graduate level in a comprehensive networking curriculum. In fact, the author compiled much of the material in this book from teaching a graduate course on current telecommunications networks at a large university. In short, if you have a wide area network, use a wide area network, or need to assess frame relay networks, this is the book for you.
How This Book Is Organized
This book consists of thirteen main chapters. The first chapter is a necessarily long look at the factors which have made frame relay the most popular public data networking solution of all time. The length is due to the fact that many of the topics are dealt with at an introductory level and presume no more than an acquaintance with LAN and WAN technologies like Ethernet and the Internet. The chapter begins with a look at the popularity of LANs and the Web, and progresses to the methods most organizations use to link these networks to each other and to the Internet: point-to-point private lines. The limitations of this solution are investigated in light of the major characteristics of current network traffic: mixed audio/video/data traffic and “bursty” traffic patterns. This leads to a need for a type of “bandwidth on demand” for the underlying network, which is hard to provide on private line networks. Finally, the benefits of frame relay are shown to coincide nicely with the limitations of private line networks, setting the stage for the rest of the book.
Chapter 2 looks into key aspects of public data networks. Again there is some effort made to handle these topics at an introductory level, but always the intent is to create a basic level of knowledge for all readers of the rest of the book. Just what makes a network public or private is explored, along with an introduction to the differences between networking with circuits as opposed to packets. Frame relay is positioned as a fast packet network technology. Once the differences between circuit-switching and packet-switching are examined, the chapter introduces the X.25 public packet-switched network as a so-called slow packet network. After a brief examination of X.25 networks, frame relay is discussed as a fast packet technology, one which is capable of both broadband (very high) speeds and flexible, dynamic bandwidth allocation (the more proper term for “bandwidth on demand”).
The relationship between X.25 and frame relay by way of a frame structure called LAPD is also examined.
Chapter 3 explores the overall structure of an entire frame relay network. The chapter begins with a look at the key components for frame relay access devices (FRADs) and network switches. How they combine to offer applications adequate quality of service (QOS) is outlined as well. This leads to a discussion of private routers and public switches as network nodes. In fact, the whole switch versus router “controversy” is addressed here at a very basic level (again for readers who may not be as familiar as they would like with these terms and concepts). The chapter concludes with an exploration of permanent virtual circuits (PVCs) and switched virtual circuits (SVCs) for frame relay networks. In this context, signaling protocols are introduced and connections for various traffic types are discussed. A major point of this chapter is that as data becomes more and more “bursty” in nature, and as new voice techniques make voice (and even video) look more and more like data, it makes little sense to send “data” packets over private line circuits.
Chapter 4 examines all aspects of the way a customer connects to a frame relay network, the frame relay user-network interface (UNI). The chapter begins with a look at a key concept in frame relay networking: the committed information rate (CIR). The relationship of the CIR to the frame relay UNI port speed is also examined. Several “rules” for configuring CIRs on the UNI are discussed. The important service provider concepts of “regular booking” and oversubscription are detailed with respect to the UNI. Finally, the major issues relating to the physical configuration of the UNI are investigated. These include diverse routing, dial backups, ISDN access, multi-homing, inverse multiplexing, and the use of analog modems for remote frame relay access.
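The relationship between CIR and UNI port speed that Chapter 4 takes up can be previewed with a little arithmetic. The sketch below assumes one common reading of the terms, which this outline itself does not define: a port is treated as "regular booking" when the CIRs of all its virtual circuits add up to no more than the port speed, and as oversubscribed when they add up to more. The function names and the example rates are hypothetical.

```python
def booking_ratio(port_kbps, pvc_cirs_kbps):
    """Return the sum of the PVC CIRs divided by the UNI port speed."""
    return sum(pvc_cirs_kbps) / port_kbps


def describe_booking(port_kbps, pvc_cirs_kbps):
    """Classify a UNI port under the assumed definitions given above."""
    ratio = booking_ratio(port_kbps, pvc_cirs_kbps)
    label = "oversubscribed" if ratio > 1.0 else "regular booking"
    return f"{label} ({ratio:.0%} of port speed)"


# Hypothetical 256 kbps UNI port carrying three PVCs.
print(describe_booking(256, [64, 64, 128]))    # CIRs total 256 kbps
print(describe_booking(256, [128, 128, 64]))   # CIRs total 320 kbps
```

Oversubscription in this sense relies on the bursty nature of the traffic: it works only as long as the circuits on a port rarely all need their full CIR at the same moment.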
Chapter 5 investigates the topic of the signaling protocols used on a frame relay network. The standard signaling protocol for frame relay, Q.933, is detailed. But the whole issue revolves around the use of Q.933 for switched virtual circuits in a frame relay network. The chapter includes a complete discussion of frame relay call control using Q.933 to set up, maintain, and release switched virtual circuits. One of the main topics in this chapter is the delay in offering switched virtual circuit services to frame relay network users and customers. All of the reasons are explored, with the emphasis on billing complexities and resource determination. The chapter also looks into the possible merging of frame relay and ISDN services in the future.
Chapter 6 explores congestion control, a key aspect of all networks, in some detail. After defining congestion as a global network property, the chapter examines the relationship between congestion control and its more localized counterpart, flow control. Several mechanisms used in networks for congestion control and flow control are investigated, always pointing out their possible use in frame relay. Then the specific frame relay mechanisms for flow control and congestion control are discussed: the use of the discard eligible (DE) bit and the use of the forward and backward explicit congestion notification bits (FECN/BECN). The recommended frame relay network actions regarding FECN and BECN to maintain service quality are discussed, as well as the current limitations of FECN and BECN usefulness. The chapter closes with a look at frame relay compression techniques, which are commonly seen as a way to prevent network congestion.
Chapter 7 concerns managing a frame relay network. There is much more to networking than the delivery of bits, and this chapter makes this point clearly. Managing the frame relay network includes methods for detecting link failures and responding to these link failures once they are detected.
Key parts of frame relay network management covered in this chapter include the role of the Simple Network Management Protocol (SNMP) and the Management Information Base (MIB) database in the frame relay network elements. The standards for frame relay network management are discussed: the Local Management Interface (LMI) from the Frame Relay Forum (FRF) and Annex D from the ITU-T. Issues of support for each in frame relay equipment, the impact on the network, and user awareness are also mentioned. Finally, Service Level Agreements (SLAs) for frame relay networks between customers and service providers are detailed.
Chapter 8 examines the frame relay Network-to-Network Interface (NNI). Although it might seem that such a topic would be of concern only to frame relay network service providers, it is in fact a topic of immediate interest to users and customers. Because of the current state of telecommunications deregulation in the United States, it is necessary to examine the relationship between local exchange carriers (LECs) and interexchange carriers (IXCs), as well as various competing local and national entities such as competitive LECs (CLECs) and Internet service providers (ISPs). After exploring the current concerns and issues for the LECs, the chapter contains a discussion of local access and transport area (LATA) limitations as they apply to frame relay. Several methods of providing multi-LATA frame relay networks in spite of current restrictions are outlined, from private lines to “gateways” to full NNI interoperability agreements.
Chapter 9 explores the various ways in which a frame relay network can be used to support voice and video as well as more traditional data applications. The whole issue of voice over frame relay is placed in the context of the historical effort to deliver adequate packetized (or more correctly, packet-switched) voice services.
The related G.728, G.729, and H.323 ITU-T standards are introduced, with the intent of providing a means to evaluate voice over frame relay equipment and associated voice quality issues. Finally, the chapter details some of the early efforts to provide adequate video service over a frame relay network. This introduces a brief discussion of MPEG-2 video streams and frame relay equipment designed for this purpose.
Chapter 10 investigates the key relationship between IBM’s Systems Network Architecture (SNA) protocols and frame relay. It is a fact that IBM’s endorsement of frame relay as an SNA transport greatly encouraged the spread of frame relay. This chapter investigates the role of SNA in the world of LANs and routers. The position of X.25 public packet-switched networks with regard to SNA is considered, both in the United States and in the rest of the world, where private line networks were and remain somewhat scarce. The chapter closes with a look at the distinctive way that SNA networks use the frame relay DE bit and the FECN/BECN bits.
Chapter 11 explores the use of frame relay for various Internet-related network interactions. The relationship between the Internet, private intranets and extranets, and the Web is discussed. The chapter also examines the concept of a Virtual Private Network (VPN) and how frame relay can support the mixed traffic types commonly encountered on Web sites. Various roles for frame relay in this Internet-dominated environment are investigated, such as the use of frame relay by ISPs, using frame relay to construct VPNs, and using frame relay for multimedia traffic.
Chapter 12 examines the relationship between the two major “fast packet” networking methods, frame relay and ATM. A brief overview of ATM network features is given, involving a discussion of the frame versus cell controversy. The chapter next explores mixed traffic (voice, video, data, and so on) and mixed networks (all types of traffic on the same physical infrastructure), the environment that ATM was primarily invented for. The chapter considers the use of ATM rather than frame relay for LAN connectivity and introduces the ATM methods for providing such connectivity. Finally, the chapter ends with a consideration of using an ATM network to provide frame relay services to customers. This leads to a discussion of linking frame relay users and networks with ATM and also linking ATM users to frame relay users for interoperability. All of the relevant issues regarding this ATM-frame relay interaction are also investigated.
Chapter 13 closes the book with a look at the future of frame relay. The relationship of frame relay to a number of other technologies is examined closely, especially frame relay and the newer, high-speed LANs such as Gigabit Ethernet. The future of frame relay in a world dominated by IP is discussed as well, and seems like a fairly good match.
An even closer relationship between frame relay and ATM networks and users is explored in more detail, with the conclusion that service providers can continue to “build ATM and sell frame relay” for the foreseeable future.
A bibliography includes a variety of sources of frame relay information, standards and other types of information such as Web sites and white papers. The book also includes a complete list of acronyms used in the text.
Summary
Frame relay is one of the most popular public data network services of all time. Frame relay can address many of the issues and problems that have come to be associated with the private line and channelized networks that are still common today. Frame relay can reduce overall network costs through integration, and yet at the same time improve network performance and reliability, ease network management tasks, provide a more future-proof network infrastructure, and offer a graceful migration path. Frame relay can be used for SNA networks, interconnections for bursty LAN client/server traffic, international networks, and even voice communications. This combination of lower cost and at the same time higher network quality is hard to beat. This is just a brief description of the capabilities of a frame relay network. The first chapter elaborates on each of these frame relay benefits and provides a framework for understanding exactly what it is about frame relay functions that allows frame relay to perform so well in a modern networking environment.
Chapter 1: What Frame Relay Can Do
Overview
This chapter looks at the current networking environment more or less in terms of technological popularity. It is intended mainly to allow readers to appreciate the capabilities of frame relay when it comes to addressing the limitations inherent in some of these popular networking methods. At the same time, this chapter introduces some of the basic terms and concepts that will come up over and over again in the ensuing chapters. Since many of these terms and concepts might be quite familiar to the intended audience, some might be tempted to skip this chapter altogether. However, there are at least three sound reasons why this chapter should be read.
First, some of the terms and concepts might have changed in meaning, often subtly, over the past few years. For instance, the whole idea of what a router does and what a switch does has changed from what these network devices were thought to do a few years ago. Second, although the terms and concepts might have been familiar, some readers might wish to reacquaint themselves with these anew, as a type of review. Finally, and most importantly, different network professionals often have different perspectives on the use of terms and concepts when they speak and write. Now, no author can claim that his or her own use of a term or concept that is not firmly defined by standards is more correct or should be favored over others. But for the purposes of this book, unless the terms and concepts used by the author coincide to a large degree with the terms and concepts used by the reader, confusion could easily result in the later chapters. For example, many sources distinguish between network security devices known as application gateways and proxy servers. In this book, these terms are used almost interchangeably (in the sense that all proxy servers are a form of application gateway). (Lacking firm definitions for each term, everyone is free to use these terms as they wish.)
But certainly for the purposes of this book, the meaning of these terms should be known to the reader. Along these same lines, at least one quirk of technology terminology must be mentioned right away. Most customers and users have Local Area Networks (LANs) that are based on a particular LAN type from the Institute of Electrical and Electronics Engineers (IEEE) 802 committee for international LAN standards. This LAN type, known as 10Base-T, is the most popular LAN type ever invented. Most users refer to this as Ethernet, the parent technology that evolved into 10Base-T as a result of the work of the IEEE 802.3 committee. Sometimes 10Base-T LANs are called Ether-type LANs, but this variation is not common. In this book, the term Ethernet applies not to a proprietary LAN technology involving heavy coaxial cable but to a 10Base-T LAN based on central hubs and Unshielded Twisted-Pair (UTP) copper wire. This use of the term is essentially correct in any case because most 10Base-T LANs do not use the recommended and specified IEEE 802.3 frame structure, but the much simpler and more practical Ethernet frame structure. So most physically 10Base-T LANs use Ethernet frames, which surely justifies the use of the term Ethernet to designate these LANs.
The Network Needed By now most readers should be convinced that the old ways of building private networks out of leased private lines, if not quite impossible yet, will be quite impossible soon. The bandwidth pressure will keep going up, to the point that some organizations have begun to deploy Storage Area Networks (SANs) that link servers at multigigabit speeds. But the faster that servers can get the information, the faster they must give it out in order to keep going themselves. The connectivity pressure will build also as the use of Web-based information becomes more and more a way of life. Any-to-any connectivity is no longer a luxury but a necessity to the extent that many organizations have turned to the Internet itself for public connectivity, with all the hazards for security that this entails. The problem is that the type of network needed today is not the voice-engineered network that the service providers built. It is hardly their fault, however. The national network was built to handle three-minute telephone calls, not 12-hour Internet sessions. This network is often called the Public Switched Telephone Network (PSTN). Private lines are essentially just little pieces of bandwidth sold to customers which then can no longer be used for handling telephone calls or anything else for that matter. There is much debate over the future of the PSTN in a world where more and more fax and voice traffic travels the Internet instead of the PSTN, especially internationally. Some claim there are more bits representing information than voice on the PSTN anyway, at least since 1995. One telephone service provider says that by 2001 or so, more than half of all access lines (also called local loops) will terminate not at a telephone, but at a PC or some advanced PC-like device. So the PSTN is not the network that is needed to solve bandwidth and connectivity problems. But what is? Many think the network needed is the Internet, pure and simple. 
This section will examine some of the needs of modern client/server, LAN-based networking and see if the Internet totally fits the bill.
Bursty Applications A definition of bursty traffic has already been given, but it will take only a moment to repeat. A bursty application is one that over time generates a torrent of traffic (usually consisting of packets) which flows between end-user devices, typically located on LANs. Between bursts, long periods of relative inactivity occur in which little to no traffic at all passes between the LANs. This is why high-speed LANs linked by lower-speed WAN private lines work so well in the first place. If the LAN traffic is bursty enough, packets can be buffered until the burst is over, at which time the packets can be sent out on the much lower speed link to the other LAN. In actual practice, it is more common to place these packets in a frame, and it is the frames that are properly spoken of as being buffered. A packet is, by definition, the content of such a frame. Just how bursty is LAN traffic? Very bursty, as it turns out. A more or less accepted measure of burstiness (the term is widely used, but always seems as if it should be more technical) is to measure the peak number of bits per second between two network end points (say a client and a server) and divide this number by the average number of bits per second seen during some specified time interval. This peak-to-average ratio is a good measure of the degree of burstiness for the application or applications used between the client and server during the observation period.
It would be nice if there were a standard definition for this process of measuring burstiness, but there is none. The peak bit rate is easy to define: Whenever bits flow, they always flow at the peak rate allowed by the link, be it 64 kbps, or 1.5 Mbps, or 45 Mbps. It is the average that causes problems. The average is the total number of bits seen in the observation period divided by the length of that period. The trouble is that there is no standard observation interval established to measure the total bits. One is as good as another. It could be a minute, an hour, or a second. It could be a day. When the interval falls matters too: a Monday, another weekday, and a weekend day would each give a different result. Every choice would have the same peak bit rate, but widely varying averages and therefore widely varying burst ratios. An example of this varying burst ratio is shown in Figure 1.7.
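The dependence of the burst ratio on the observation interval can be made concrete with a few lines of Python. This is only a sketch under stated assumptions: the function name, the one-second sampling, and the traffic trace are all hypothetical, and the peak is taken to be the link rate, following the simplification used above.

```python
def burst_ratio(bits_per_slot, link_rate_bps, window_seconds):
    """Peak-to-average ratio over the first `window_seconds` one-second slots.

    Follows the text's simplification that bits always flow at the link rate,
    so the peak rate is the link rate itself.
    """
    window = bits_per_slot[:window_seconds]
    average_bps = sum(window) / len(window)
    if average_bps == 0:
        return float("inf")  # nothing was sent, so the ratio is unbounded
    return link_rate_bps / average_bps


# Hypothetical trace on a 64 kbps link: a two-second burst, then 58 idle seconds.
LINK_RATE_BPS = 64_000
trace = [LINK_RATE_BPS] * 2 + [0] * 58

for seconds in (2, 10, 60):
    print(f"{seconds:>2} s window: {burst_ratio(trace, LINK_RATE_BPS, seconds):.1f}:1")
```

With this made-up trace the same two-second burst yields ratios of 1:1, 5:1, and 30:1 as the window grows from 2 to 10 to 60 seconds, which is exactly why a burst ratio quoted without its interval says very little.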
Figure 1.7 Different intervals, different peak-to-average ratios.
Consider trying to figure out the average speed of a car. The car starts and stops, sometimes cruises, and sometimes sits in a parking space. Commuting is a bursty activity. The average speed of a car driven 10 miles to work and back over a 10-hour period is very low, but no one complains about the car being slow. But when everyone bursts at once, as during rush hour, congestion results and everyone complains about the slowness of the network, which is only marginally slower than the average speed over the entire 10-hour period. The PSTN is like the highway. It is not made for bursty traffic. The PSTN was made for non-bursty, constant bit rate traffic, not the variable bit rates that bursty applications are noted for. This works fine for digital telephone calls (at least it did in the past), since these full duplex calls even digitized the periods of silence that flowed in the opposite direction when one party was just listening. So the link used all of the 64 kbps of the voice channel bandwidth (it was the voice digitization process that determined the basic speed for digital private lines) all the time, which is the essence of networking with circuits, or just circuit switching. There will be much more on circuit switching in the next chapter. But even the fixed bandwidth, circuit-switched PSTN is due for a change. The latest international digital voice standards from the International Telecommunication Union's Telecommunication Standardization Sector (the ITU-T) feature voice compression (voice that uses less than the normal 64 kbps, usually much less, like 8 or 13 kbps) and silence suppression (no bits are sent when there is no speech). What this does, of course, is make voice start to look pretty much like bursty data applications. So it now starts to make sense to put voice into packets like everything else, since packets were more or less designed primarily for bursty applications.
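A rough calculation shows how compression and silence suppression turn constant bit rate voice into something that looks bursty. The sketch below is illustrative only: the function name is invented here, and the 40 percent speech activity figure is an assumption chosen for the example, not a measured value.

```python
def average_voice_rate_bps(codec_rate_bps, speech_activity, silence_suppression):
    """Average bit rate of one direction of a voice call.

    speech_activity is the fraction of time the talker is actually speaking.
    Without silence suppression the codec sends at its full rate all the time.
    """
    if not 0.0 <= speech_activity <= 1.0:
        raise ValueError("speech_activity must be between 0 and 1")
    if silence_suppression:
        return codec_rate_bps * speech_activity
    return float(codec_rate_bps)


SPEECH_ACTIVITY = 0.4  # assumed for illustration only

# Traditional 64 kbps voice: silence is digitized too, so the rate never varies.
constant = average_voice_rate_bps(64_000, SPEECH_ACTIVITY, silence_suppression=False)

# Compressed 8 kbps voice with silence suppression: bits flow only during speech.
bursty = average_voice_rate_bps(8_000, SPEECH_ACTIVITY, silence_suppression=True)

print(f"64 kbps circuit voice averages {constant / 1000:.1f} kbps")
print(f" 8 kbps packet voice averages  {bursty / 1000:.1f} kbps")
```

Under these assumptions the circuit call has a peak-to-average ratio of exactly 1, while the compressed call peaks at 8 kbps but averages well below that, which is the signature of a bursty source.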
The clearest sign of the transition from constant bit rate voice to bursty, variable bit rate voice is the practice of trying to put voice calls on the Internet or LANs in general, not on the PSTN. Naturally, it is always desirable to have a certain quality to the voice call, which is most often characterized as toll quality voice. It makes no difference to the voice bits whether they are transported over a circuit or in packets, as long as the users will use the service. There are two key aspects of delivering toll quality voice over networks designed for data like the Internet. First, the voice packets must have a low and stable enough delay to allow users to converse normally. Second, the voice packets must all arrive intact and in sequence, since the real-time nature of voice services prevents any retransmission of missing voice conversations. The Internet was and still is designed primarily for data packet delivery. There have been efforts to improve the basic quality of service provided by the Internet for mixed types of traffic, but little work has been completed or implemented yet. In the meantime, the Internet works the same for bursty applications, whether data or packetized voice. The problem is providing adequate quality of service on the Internet for mixed traffic types. And as Web sites, e-mail, and all sorts of networked information combine text and graphics, and sound and moving images, this issue becomes important enough to merit a section all its own when considering the type of network needed for modern telecommunications.
Mixed Traffic Types In the good old days of networking, voice went on the voice network and data went on the data network. The two networks were absolutely distinct, whether public or private. When a private line or access line was ordered, one of the first questions asked was, “Is this line going to be used for voice or data?” Even today, home modem users sometimes are shocked to find that their local telephone company will not even take a report of trouble on a modem line such as, “I can only connect to my Internet service provider at 9.6 kbps this afternoon when it was 33.6 kbps this morning.” This access line was bought as a voice line, not a data line, and only voice troubles such as, “I hear a lot of static on the line,” can be reported as troubles. Data lines sometimes require additional work to be done on installation and can be harder to troubleshoot, so the cost of provisioning a data line has been traditionally higher than a voice line of the same length. Therefore, a voice line works for modem data most of the time, but when it fails or degrades, there is a different process of troubleshooting and repair. One of the reasons for this voice and data separation was regulatory, both within and without the United States. Most voice service providers could process voice calls (e.g., switch them, provide caller identification, etc.), but could not do anything with data bits (or voice bits either, for that matter) except pass them unchanged and unprocessed through the PSTN. This was called raw bit transport or basic transport services and all the service provider could supply in this regulated environment was a leased private line provisioned for the user’s data bits. This was certainly true until the service providers received permission to offer their own public data network services. But this is a topic for the next chapter. In addition to regulatory issues, there were technical reasons as well. 
The PSTN was designed for constant bit rate voice calls, not bursty data packets. Putting bursty data packets on all-the-bandwidth circuits was extremely wasteful of network resources, but it worked. Perhaps it would be better to try to put voice calls on the data networks as a stream of packets. In fact, it was just this quest for packetized voice that culminated in frame relay. But this is also a topic for the next chapter. For now, it is enough to point out that trying to put voice traffic on data networks is the classic example of mixed traffic types. It made less and less sense to have two circuits going everywhere, one for voice and one for data. Even data was no longer just pure text. There were graphical data with engineering drawings, image data with photographs, moving picture data with action content, audio data with the soundtrack, and so forth. It became harder and harder to give the correct quality of service to each of these traffic types on the same network. Some of the differences in the quality of service needed by these different traffic types are listed in Table 1.1. They are all so different in their needs that separate networks have been built just to carry each type of traffic, as shown in parentheses.

Table 1.1 Different Networks, Different Quality of Service

Traffic Type              Error Sensitivity   Delay Sensitivity   Bandwidth Need   Burstiness
64 kbps voice (PSTN)      Low                 High                Low              None
Analog video (cable TV)   Low                 High                High             None
Digital video (HDTV)      Medium              High                Very High        Moderate/None
Packet data (Internet)    High                Low                 Low/Medium       High

Some schemes, such as Asynchronous Transfer Mode (ATM), were invented just to address the issue with a totally new type of network using fixed-length cells instead of variable-length frames to carry packets. The trouble with anything new, especially technologies, is how easy it is to migrate to a new system and integrate newer parts that have already migrated with older parts that have not. The issue of backward compatibility ultimately halted ATM almost in its tracks, except for specific applications for which it was still best-suited. So if some way could be found to easily mix traffic types—be it voice, video, data, or audio—on the same physical network, the potential for such a networking scheme was enormous. There would just be a network, not a voice network, or a data network, or a video network, but just a network. And the way to build such a network is no mystery. All that need be done is to take the most stringent characteristic required for each traffic type, then build that network. This is where the Internet, built as it was for data applications, and older ones at that, struggles to support the quality of service needed (for instance) to deliver toll quality voice. So, as well-suited as the Internet is for bursty traffic, the Internet fails when it comes to providing a transport service that is low in errors, and has low and stable delays. Since more and more traffic appears bursty due to compression and so on, this delay issue is not the limitation it once appeared to be. The issues of errors and delay on the Internet are important ones. But much can be done at the application level to alleviate the effects of missing and incorrect information, and high and varying delays as well. But there is still the issue of adequate bandwidth to support the service traffic type. Here is where the current version of the Internet fails miserably. All packets look exactly the same to the Internet. 
No packet stream can reserve the proper amount of bandwidth it needs to get the minimal error rate it requires (lack of bandwidth leads to packet discards), and low and stable delay (lack of bandwidth leads to a lot of time in buffers). It would be nice if each application could identify its needs to the network. It would also be nice if each application could get the proper amount of bandwidth only when it needed it, like during a burst. This would be flexible bandwidth allocation (circuit networks allocate peak bandwidth all the time), also known as dynamic bandwidth allocation. But at least some requisite bandwidth is allocated, and not just scrambled over as in most packet networks like the Internet. This dynamic bandwidth allocation is often called bandwidth-on-demand.
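The recipe mentioned earlier, taking the most stringent characteristic required by each traffic type and building that network, can be sketched directly from Table 1.1. The ranking of the labels, the function names, and the choice to read a split entry such as "Low/Medium" as its more demanding half are all assumptions made for this illustration.

```python
# Assumed ordering of the qualitative labels used in Table 1.1.
RANK = {"None": 0, "Low": 1, "Medium": 2, "Moderate": 2, "High": 3, "Very High": 4}

TABLE_1_1 = {
    "64 kbps voice (PSTN)": {
        "error sensitivity": "Low", "delay sensitivity": "High",
        "bandwidth need": "Low", "burstiness": "None"},
    "Analog video (cable TV)": {
        "error sensitivity": "Low", "delay sensitivity": "High",
        "bandwidth need": "High", "burstiness": "None"},
    "Digital video (HDTV)": {
        "error sensitivity": "Medium", "delay sensitivity": "High",
        "bandwidth need": "Very High", "burstiness": "Moderate/None"},
    "Packet data (Internet)": {
        "error sensitivity": "High", "delay sensitivity": "Low",
        "bandwidth need": "Low/Medium", "burstiness": "High"},
}


def strictest(entry):
    """Return the more demanding half of an entry such as 'Low/Medium'."""
    return max(entry.split("/"), key=RANK.__getitem__)


def combined_requirements(table):
    """For each characteristic, keep the most demanding level of any traffic type."""
    result = {}
    for needs in table.values():
        for characteristic, level in needs.items():
            candidate = strictest(level)
            current = result.get(characteristic)
            if current is None or RANK[candidate] > RANK[current]:
                result[characteristic] = candidate
    return result


for characteristic, level in combined_requirements(TABLE_1_1).items():
    print(f"{characteristic}: {level}")
```

The single network that results must be highly sensitive to errors and delay, offer very high bandwidth, and cope with highly bursty sources, all at once. That combination is what a data-oriented network such as the Internet struggles to deliver.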
Bandwidth-on-Demand The Internet certainly does provide a form of bandwidth-on-demand, or flexible bandwidth allocation. But the bandwidth that current applications demand during bursts is simply not available to all at the same time on the Internet. The whole concept of bandwidth-on-demand means not only releasing bandwidth ordinarily tied up in the form of dedicated private lines when not needed, but getting enough bandwidth during bursts to allow the most bandwidth-hungry applications like video to function properly. This need for increased bandwidth frequently leads to discussions about the need for broadband networks. Unfortunately, the meaning of the term broadband keeps changing as well. Still officially defined as speeds above 1.5 Mbps in the United States and above 2 Mbps almost everywhere else, this speed is not very useful for even simple applications today. Some 20 percent of all e-mail messages received by certain individuals have attachments such as documents or video clips that are larger than 1 megabyte (8 megabits). So even checking the morning's e-mail can be a time-consuming task. So, for most practical purposes, the term broadband is usually employed today to indicate network bandwidths that are at least 10 Mbps, the common Ethernet speed that all PCs and other network devices can easily operate at. But here is where the flexible bandwidth allocation comes in: If there is a burst, the whole 10 Mbps might be used. If two sources burst at the same time, each can get 5 Mbps, and so on. But if there are many sources bursting, each must get a minimum bandwidth to enable it to function under the load. This is the essence of bandwidth-on-demand: No single application is assigned a fixed amount of bandwidth. The bandwidth allocated is based on the instantaneous demand of the given application.
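The sharing rule just described is simple enough to sketch. The code below assumes an equal split among the sources that are bursting at a given instant, which is only one possible allocation policy, and the 0.5 Mbps minimum is a hypothetical figure picked for the example.

```python
def share_per_source_mbps(link_mbps, bursting_sources):
    """Equal share of the link for each source that is bursting right now."""
    if bursting_sources < 1:
        raise ValueError("at least one source must be bursting")
    return link_mbps / bursting_sources


def meets_minimum(link_mbps, bursting_sources, minimum_mbps):
    """True if every bursting source still gets its minimum bandwidth."""
    return share_per_source_mbps(link_mbps, bursting_sources) >= minimum_mbps


LINK_MBPS = 10.0      # the 10 Mbps figure used in the text
MINIMUM_MBPS = 0.5    # assumed minimum per application, for illustration

for sources in (1, 2, 20, 40):
    share = share_per_source_mbps(LINK_MBPS, sources)
    ok = meets_minimum(LINK_MBPS, sources, MINIMUM_MBPS)
    print(f"{sources:>2} bursting: {share:5.2f} Mbps each, minimum met: {ok}")
```

One source gets the whole 10 Mbps and two get 5 Mbps each, as in the text. With the assumed minimum, the link can serve 20 simultaneous bursts but not 40, which is where some form of reservation becomes necessary.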
While the Internet can easily provide this flexible bandwidth allocation, there are two problems. First, the bandwidth available is nowhere near broadband speeds, especially if the official definition is extended to 10 Mbps. Second, there is no reservation system for assigning a minimal bandwidth (and thus error rate and delay) to an application at all. Steps have been taken to correct this situation on the Internet, but large-scale and widespread deployment will take years. What is an organization to do in the meantime? This book suggests that the answer is frame relay. Frame relay can replace a maze of private lines, each with dedicated bandwidth between two sites, with a collection of logical links over the switched services that frame relay provides. And frame relay can still furnish the links to and from the Internet that all organizations need today to pursue their daily activities.
Point-to-Point Private Lines In the United States, the most common service that organizations expect public network service providers to furnish is simple point-to-point private lines. The term “public network service providers” is used in preference to older terminology such as “carriers” or even “telephone companies” (telcos). Not all providers of frame relay network services are certified as common carriers, although most are, especially the major ones. Certainly, not all providers of network services are, or even were, telephone companies in the traditional sense. It is true that network service providers include many Incumbent Local Exchange Carriers (ILECs, those who held the original franchise for telephone service in a given area), Interexchange Carriers (IXCs), Competitive LECs (CLECs), and Regional Bell Operating Companies (RBOCs, former pieces of the AT&T Bell System that were split off under the Divestiture of 1984). But more and more service providers are now Internet Service Providers (ISPs) and even power company utilities. Even the term “point-to-point private line” has a lot of variations. Essentially, it is a circuit that connects two different customer sites. The bandwidth is private in the sense that all of the bits on the circuit are the customer’s and they pass unchanged through the service provider’s network, not in the sense of ownership of the physical link in any way. The private line philosophy and approach is sometimes called all the bandwidth, all the time, since this is what the customer is paying for. The lines are not purchased, in the sense of ownership, but leased from the service provider for a fixed amount per month. A typical lease runs two or three years and is routinely renewed. Sometimes these private lines are called leased lines or dedicated circuits, but the terms essentially mean the same thing. There are even multipoint private lines that are more or less only used for IBM SNA networks today. 
In a multipoint configuration, a private line connects a central site to several remote locations in a special fashion. Sometimes these multipoint configurations are called multidrop lines, but the idea is the same. The differences between point-to-point and multipoint private line configurations are shown in Figure 1.3.
Figure 1.3 Typical point-to-point and multipoint private line use.
In the United States where bandwidth is plentiful, the leasing of service provider bandwidth for private use is common practice. The service provider basically sells a piece of its network to the customer, agreeing that the only bits sent on that circuit will be the customer's bits. This practice is both common and relatively inexpensive. Outside of the United States, this practice is neither common nor inexpensive. Many other countries do not have the extensive telecommunications infrastructure that the United States has. So selling all the bandwidth, all the time on a particular circuit to an individual user is not always in the service provider's best interest. Typically, the service provider has adequate network capacity for customers sharing these lines for voice purposes and has little to spare for private users. And even if there were enough facilities to allow large-scale sale of these public facilities for private use, the prices are kept high enough by financial and regulatory considerations to discourage all but the most economically well-off organizations from using private lines outside of the United States.
Bandwidth Limits Enough has been said previously in this book and in many other places about the incredible increased demands on network bandwidth that have been placed by the Web, intranets, and newer applications like videoconferencing. This section will not repeat any of the numerous examples that have forced the popular Ethernet LAN technology to move from 10 Mbps first to 100 Mbps and now to 1,000 Mbps with Gigabit Ethernet. The point here is not so much how bandwidth demands have forced LANs to change, but how little effect this pressure has had on the WANs that connect them. Ethernet LANs running at 10 Mbps have been around since the late 1970s, but most organizations did not have enough need for networking on the local, building-wide level until the mid to late 1980s. The change, of course, was brought about by the rise of the PC and the whole movement away from the hierarchical, mainframe or minicomputer central networks to the more peer-oriented, distributed networks that PCs could more easily use. The early LANs of the mid-1980s ran at the 10 Mbps Ethernet speed or the more modest 4 Mbps that IBM used for their token ring LANs. IBM mainframe shops ran token ring because token ring supported SNA sessions, and very well at that; IBM would not endorse SNA on Ethernet (Ethernet struggled with the need for stable, low delays that SNA sessions required). But almost everyone else ran Ethernet, which cost about half as much as token ring to support the same number of users. The general guidelines for building Ethernet LANs in those days was one Ethernet segment (the shared 10 Mbps bandwidth on a length of cable) for every 200 or 300 users. And why not? No PC in existence could consume more than a fraction of the 10 Mbps bandwidth on the cable. Even in those early days of client/server networking, there was a need to link separate LANs over a wide area. 
The natural solution was to use the same private line approach that had been used to build SNA networks in the previous decade when networks first became popular. In the mid-1980s, the most popular and affordable private lines were still analog and ran at speeds that are almost laughable today: 4,800 and 9,600 bits per second. These 4.8 and 9.6 kbps lines were fine for SNA and minicomputer networks, where interactions between terminal and host (the central computer) were measured not in megabytes, but in a few hundred or a few thousand bytes. The most popular terminals of the day displayed only alphanumeric characters (not graphics) in 80 columns and 25 rows. So the whole screen only generated 2,000 characters (16,000 bits) of traffic at a time, and most transactions between terminal and host only exchanged a few hundred bytes because not all of the screen was sent back and forth. But LANs were different. The processing power of PCs was not limited to simple exchanges of text, although many early PCs were used in exactly this way, to supply simple terminal emulation capabilities to access the organization's mainframe or minicomputer. Once users began to use PCs for word processing, spreadsheets, and other more familiar client/server applications, this basic terminal emulation would no longer do. PCs were now loading programs from a remote server, exchanging files almost routinely, and overwhelming the lower speed links that had been adequate for centralized approaches to networking. So it became more common to link LANs with the newer digital private lines, available almost everywhere starting in 1984 and running at 64 kbps. As a result, many LANs were initially linked in the late 1980s with 64 kbps private lines. How could this possibly work if the LANs were running at 10 Mbps and the links between them ran at 64 kbps, apparently about a 150:1 bottleneck in speed? There were two main reasons.
First, LAN traffic, and all data traffic in general, is bursty, as previously mentioned. So not all users needed the bandwidth between the LANs at the same time. Even if they did, there were 200 or 300 users (and other devices like servers and printers) sharing the 10 Mbps on the Ethernet anyway. So each user only got about 50 kbps (10 Mbps divided by 200) or 33.3 kbps under heavy loads in any case. The actual traffic patterns varied, of course, and a more precise analysis would take this statistical nature of the bandwidth usage into account, but the main point is still the same: The restricted WAN private line bandwidth was adequate for early LAN connectivity because the traffic was bursty and there were large numbers of users sharing the LAN bandwidth (so no one needed or got a lot of bandwidth).
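The arithmetic behind this argument is easy to check. The following sketch simply divides the shared segment bandwidth evenly among its users, ignoring the statistical nature of real traffic just as the text does; the function names are invented for the example.

```python
def per_user_share_kbps(segment_kbps, users):
    """Even split of a shared LAN segment among its users."""
    return segment_kbps / users


def speed_ratio(fast_kbps, slow_kbps):
    """How many times faster the first link is than the second."""
    return fast_kbps / slow_kbps


ETHERNET_KBPS = 10_000   # 10 Mbps shared Ethernet segment
WAN_KBPS = 64            # 64 kbps digital private line

for users in (200, 300):
    share = per_user_share_kbps(ETHERNET_KBPS, users)
    print(f"{users} users: {share:.1f} kbps each")

print(f"LAN to WAN speed ratio: {speed_ratio(ETHERNET_KBPS, WAN_KBPS):.2f}:1")
```

A user's even share of the segment (50 or 33.3 kbps) is actually smaller than the 64 kbps private line, so the nominal bottleneck of roughly 150:1 was far less severe in practice than the raw link speeds suggest.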
Things rarely stay the same for long, however, and the pace of change in the computing world can be breathtaking. In a relatively small amount of time, PC capabilities, performance, and applications increased to the point that 200 or 300 users could no longer share a single 10 Mbps Ethernet segment. One solution was to further segment the LAN into more 10 Mbps Ethernets, connected by a network device known as a bridge. The bridge made two or more separate LAN segments look and act as one big LAN. By the late 1980s, it was common to put no more than 20 or 30 PCs on an Ethernet segment. There were still 200 or 300 PCs in the building, but now they were scattered among 10 or so segments connected by bridges. Bridges could be used not only to connect LAN segments in the same building, but to connect remote LAN segments as well, as long as both were Ethernets or token rings. These remote bridges had no accepted common name; IBM called them split bridges, but the purpose was the same. They all made separate LAN segments behave like one big LAN. Only the users needing remote LAN access needed to use the bridge connecting other sites. This was not common in those days, however, as great efforts were made to preserve the administrational and functional lines introduced by departmental computing, a concept first made popular in the minicomputer arena. Suppose only 20 or 30 users needed to access a remote server over the WAN bridge. A 64 kbps link now had to handle 500 kbps (or 333 kbps) because that’s what the users got on their shared 10 Mbps Ethernet segment. But the WAN link was still only 64 kbps in most cases. The result was a world in which users were instantly aware of whether the remote server they were accessing was on the next floor or across the country. Remote servers took forever to respond. It was only due to the bursty nature of the traffic that this scheme worked at all. The applications that client/server users were used to on the LAN kept evolving as well. 
By the early 1990s, Windows and other GUIs were becoming common. It is hard to appreciate today the impact that GUIs had on network demands. Sending the letter "A" from client to server was no longer just a matter of sending 8 bits. Information about the font, the size, the color, and so forth had to be sent as well. And things got even worse with graphics and images. LANs were now flooded with all manner of traffic, not just simple text. There were two main ways that organizations approached this newest crisis. First, organizations could even further segment the LAN and restrict the number of users on each segment. So segments of as few as 2 or 3 users were seen. This was not a very good solution, however, because each segment required a port on a bridge to enable communications with other segments. If 20-user segments needed 10 ports for 200 total users, then 2-user segments needed 100 ports. Most bridges at the time could easily support up to 16 or so ports, but there were few products that went beyond this. This whole movement to segment Ethernet LANs with bridges is shown in Figure 1.4. Now users that had up to 5 Mbps locally on their segment were totally overwhelming the 64 kbps private lines, no matter how bursty their traffic patterns were. At this time many organizations began to upgrade their 64 kbps private lines to 1.5 Mbps (called a full T1) out of sheer desperation. And in fact, this approach worked very well, mainly due to bursts, but also due to some other factors discussed later.
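The trade-off between smaller segments and bridge ports can be tabulated with a short sketch. It assumes one bridge port per segment and an even split of each 10 Mbps segment among its users; the 16-port limit is the rough figure quoted above, and the function names are hypothetical.

```python
import math

BRIDGE_PORT_LIMIT = 16   # "16 or so" ports on a typical bridge of the period
SEGMENT_MBPS = 10.0      # shared Ethernet segment speed


def bridge_ports_needed(total_users, users_per_segment):
    """One bridge port per segment, rounding up for a partly filled segment."""
    return math.ceil(total_users / users_per_segment)


def share_per_user_mbps(users_per_segment):
    """Even split of one segment's bandwidth among its users."""
    return SEGMENT_MBPS / users_per_segment


for users_per_segment in (200, 20, 2):
    ports = bridge_ports_needed(200, users_per_segment)
    share = share_per_user_mbps(users_per_segment)
    fits = ports <= BRIDGE_PORT_LIMIT
    print(f"{users_per_segment:>3} users/segment: {ports:>3} ports, "
          f"{share:.2f} Mbps each, fits one bridge: {fits}")
```

Shrinking segments from 20 users to 2 raises each user's share from 0.5 Mbps to 5 Mbps, but the port count grows from 10 to 100, far past what a single bridge of the time could offer.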
Figure 1.4 LAN segmentation and bridging.
The second way that LAN administrators dealt with the WAN bandwidth limitations in the early 1990s was first to encourage, then quickly adopt, the new 100 Mbps Ethernet standard. Once this was available, a heavily accessed server or even a real power user client device could get 10 times the bandwidth previously available on the old Ethernet segment. So, if a site had two extremely busy servers, these devices could be put on an Ethernet hub port with their own 100 Mbps Ethernet segment. And remote users could access these servers, each fully capable of slamming out information at 100 Mbps, at a leisurely 64 kbps in most cases, and only 1.5 Mbps in the best of circumstances. This was an automatic bottleneck of roughly 65:1 in the 1.5 Mbps case and an amazing bottleneck of more than 1,500:1 in the case of the 64 kbps private line. At the time, any increase in private line speed beyond 1.5 Mbps was often impractical. Either the higher speed circuits were not available from the service provider (bandwidth is not unlimited) or the price was far beyond the budget of even the most prosperous organizations. The next full step was 45 Mbps (T3) and there was little available. In some places, multiples of the T1 bandwidth of 1.5 Mbps were available, most often 6 Mbps, but this fractional T3 (often called FT3 or NxDS1) service was and still is not widely available and remains expensive in any case. The irony of the situation was not lost on either the LAN administrators or WAN service providers. The fact was that there was and is plenty of bandwidth available on the national network infrastructure. It is just that leasing it out as private lines with all the bandwidth, all the time is the least efficient way of using this bandwidth. No server needs 100 Mbps all the time: Traffic is bursty. But if a burst is not carried efficiently, the delay defeats the whole purpose of remote access. So a high-speed WAN is still needed, but not as the dedicated bandwidth represented by private lines.
And now with Gigabit Ethernet, the problem will only get worse. Clearly, there must be a better solution to this bandwidth problem than private lines.
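What such a bottleneck means to a remote user is easiest to see as delay. The sketch below computes how long a single burst takes to cross each kind of link, ignoring protocol overhead, queuing, and propagation delay. The burst size of 1 megabyte (8 megabits) is borrowed from the e-mail attachment example earlier in the chapter purely for illustration, and the T1 rate is taken as the rounded 1.5 Mbps used in the text.

```python
def transfer_seconds(burst_bits, link_bps):
    """Time to push one burst through a link at the link's full rate."""
    return burst_bits / link_bps


BURST_BITS = 8_000_000   # 1 megabyte, used here as a hypothetical burst size

LINKS_BPS = {
    "64 kbps private line": 64_000,
    "1.5 Mbps T1": 1_500_000,
    "100 Mbps Ethernet": 100_000_000,
}

lan_seconds = transfer_seconds(BURST_BITS, LINKS_BPS["100 Mbps Ethernet"])

for name, rate in LINKS_BPS.items():
    seconds = transfer_seconds(BURST_BITS, rate)
    slowdown = seconds / lan_seconds
    print(f"{name:<22} {seconds:8.2f} s  ({slowdown:,.0f} times the LAN delay)")
```

A burst that the server's own segment delivers in a small fraction of a second takes several seconds on a T1 and about two minutes on a 64 kbps line, even though the average load on either WAN link may be tiny.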
Connectivity Limits
The corporate networking environment changed quickly and dramatically in the 1980s. Most private corporate networks had been built to accommodate the IBM SNA environment. Even when other computer vendors’ networking products were used (for example, WANGnet and HP networking products), the resulting networks looked pretty much the way SNA networks did: hierarchical, star-shaped, wide area networks (WANs). These networks typically linked a remote corporate site to a central location where the IBM mainframe or other vendor’s minicomputer essentially ran the whole network. In this sense, the networks were hierarchical, with the central computer forming the top of a pyramid and the remote sites only communicating, if at all, through the central computer.
Building private networks for this environment was easy and relatively inexpensive. Private lines were leased from the telephone companies to link all the remote sites to the central site. Although the private lines were leased by the mile (that is, a 1000-mile link cost more than a 100-mile link), there were various ways to keep this variable expense from imposing too much of a burden on the private network. The situation is shown in Figure 1.5. In the figure, five sites are networked together with point-to-point leased lines. The connectivity needed is simple: Hook all four of the remote sites to the remaining central site where the main corporate computer is located.
Figure 1.5 Central site connectivity.
A quick look at the figure makes it easy to see how many links are needed to create this network. There are four private leased lines in the figure. Fewer point-to-point lines cannot be used to link each remote site directly to the central location. These links are the main expense when building and operating such private corporate networks.
But the links are not the only expense. There is also the expense associated with the communication ports on each computer in the network. This expense is minimal at each remote site. A glance at the figure shows that each remote site needs only one communications port and related hardware. The situation is different at the central computer site, however. Here the central computer needs a communications port and related hardware to link to each of the remote sites. And, of course, the central computer must be able to be configured with the total number of ports needed for remote site connectivity. This last requirement was not usually a problem, especially in the IBM mainframe environment.
But what if the number of remote sites grew to 10? 20? 100? How many links and ports would be needed to deploy such a hierarchical network then? As corporations (and their corporate networks) merged, expanded, and otherwise grew throughout the 1970s and into the 1980s, this became a real issue for corporate data network designers. Fortunately, it is not necessary to draw detailed pictures of these larger networks to figure out exactly how many links and ports are needed. There is a simple mathematical relationship that can be used to figure out how many links and ports would be needed to link any number of sites into a hierarchical, star-shaped network. If the number of sites is designated by the letter “N” (including the central site), then the number of links needed would be N − 1. For instance, in Figure 1.5, N = 5 and the number of links needed is N − 1 = 4.
The number of communication ports needed throughout the network is given by 2(N − 1), read as 2 times N − 1. When N = 5, the number of ports is 2(N − 1) = 8. Fully half of these ports (given by N − 1 = 4) would be needed at the central site. It is now easy to figure out that when the number of sites is 20 (N = 20), the number of links would be 19 (N − 1 = 19) and the number of ports would be 38 (2(N − 1) = 38), with 19 of them (N − 1) at the central site. If N = 100 (100 locations on the corporate network, a figure easily approached in larger corporations and governmental agencies as well), the number of links would be 99 and the number of communications ports would be 198, with 99 at the central site. These networks were large and expensive, but not prohibitively so.
What has all of this to do with frame relay? Simply, the rise of corporate LANs and client/server computing in the 1980s has meant that building private corporate networks as hierarchical stars is no longer adequate. This is not the place to rehash the evolution of LANs and client/server in detail, nor is that necessary. It is enough to understand that LANs and personal computers (PCs) running client/server software (for example, a database client package to a database server or even corporate e-mail applications) are best served by peer-to-peer networks. It can no longer be assumed in a client/server LAN environment that all communications will necessarily be between a remote site and a central location, as the hierarchical networks assumed. In a client/server environment with LANs connected by WANs, literally any client could need to access any server, no matter where in the corporation the client or the server PC happened to be. Client/server LANs at corporate sites that need to be connected are better served by peer, mesh-connected networks. This need for a different type of private corporate network created real problems.
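The star-network arithmetic can be captured in a few lines of Python. This is a minimal sketch of the N − 1 and 2(N − 1) relationships given above; the function name and the dictionary keys are chosen for this example only.

```python
def star_requirements(n_sites: int) -> dict:
    """Links and ports for a hierarchical star of n_sites (central site included)."""
    if n_sites < 2:
        raise ValueError("a star network needs a central site and at least one remote site")
    links = n_sites - 1                 # one private line per remote site
    return {
        "links": links,
        "ports_total": 2 * links,       # one port at each end of every link
        "ports_central": links,         # the central site terminates every link
        "ports_per_remote": 1,
    }


for n in (5, 20, 100):
    print(n, star_requirements(n))
```

For N = 5, 20, and 100 this reproduces the figures in the text: 4, 19, and 99 links, and 8, 38, and 198 ports.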
The number of links and ports needed for peer, mesh-connected private LAN networks was much higher than the modest link and port needs in the older hierarchical, star environment. Figure 1.6 points out the problem. A mesh network consists of direct point-to-point links between every pair of locations (LANs) on the corporate network. This way, no client and server are separated by more than one direct link. But the associated numbers for the required links and ports have exploded.
Figure 1.6 Full mesh connectivity.
The figure shows that the peer network requires 10 links and four communications ports at each location to establish the required mesh connectivity. The total number of ports is now 20. The formulas, again based on well-understood mathematical principles, are now N(N - 1)/2 = (5 * 4)/2 = 10 for the number of point-to-point links needed to mesh connect N = 5 sites and N(N - 1) = 5 * 4 = 20 for the total number of communications ports needed (four at each site). For 20 sites (N = 20), the numbers would be (20 * 19)/2 = 190 for the links and 20 * 19 = 380 for the ports (19 at each site). While it would not be impossible to configure 19 ports for each site, the hardware expense alone would be enormously high, even prohibitive. And most network managers would instantly balk at providing 190 WAN leased lines paid for by the mile to link the sites together. For 100 sites (N = 100), the network would require an astonishing 4,950 links ((100 * 99)/2) and 9,900 communications ports (100 * 99). Each site would need hardware to support 99 links to other sites, an impossible task for any common communications device architecture.
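The full mesh numbers follow the same pattern and can be checked the same way. The sketch below simply evaluates N(N - 1)/2 for the links and N(N - 1) for the ports; as before, the function name and keys are illustrative.

```python
def mesh_requirements(n_sites: int) -> dict:
    """Links and ports for a full mesh of point-to-point lines among n_sites."""
    if n_sites < 2:
        raise ValueError("a mesh needs at least two sites")
    pairs = n_sites * (n_sites - 1) // 2    # one link per pair of sites
    return {
        "links": pairs,
        "ports_total": 2 * pairs,           # equals N(N - 1): two ports per link
        "ports_per_site": n_sites - 1,      # one port toward every other site
    }


for n in (5, 20, 100):
    print(n, mesh_requirements(n))
```

Running it for N = 5, 20, and 100 gives 10, 190, and 4,950 links, which makes the contrast with the 4, 19, and 99 links of the star network plain.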
Again, various strategies were employed by corporations to keep LAN connectivity networking expenses in line. Partial meshes were deployed in varying backbone and access network configurations. These measures were only partially successful, in the sense that any economic gain within the corporate network was offset by the loss of efficiency and productivity on the part of those using the network. The traffic jams and bottlenecks that resulted from a minimum link configuration hindered users immensely.
The situation is only worse when the need for international connectivity in many organizations today is taken into account. In many cases, marketing, sales, and even personnel departments deal with issues not only inside the United States, but also around the world. The cost of international private lines, even of modest speeds, is astronomical, and such lines are not even available in many of the countries where connectivity is most commonly needed. One of the attractions of frame relay is that it actually makes international connections affordable.
Today, one of the most pressing needs in any organization is the need for Internet access. Usually this is provided by having the organization link to an Internet service provider (ISP). Each site that needs Internet connectivity must be linked to an ISP. This access could be through another site. But this creates enormous traffic jams and bottlenecks at the site with the Internet access. It would be better to allow each site to link to the ISP site directly. But consider the problem if one of the sites in the previous figure were an ISP site. A lot of resources would be consumed with all of the dedicated private lines required between each site and the ISP. But frame relay’s logical connectivity allows each site to be linked over one physical link which carries the traffic for many logical links.
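The effect of carrying many logical links over one physical link can be illustrated with the same kind of counting. The Python sketch below rests on two assumptions that are made only for this example: every site has exactly one access link into the frame relay network, and a full mesh of logical connections is wanted. The function names are invented for illustration.

```python
def private_line_mesh_links(n_sites: int) -> int:
    """Physical private lines needed for a full mesh of n_sites."""
    return n_sites * (n_sites - 1) // 2


def frame_relay_full_mesh(n_sites: int) -> dict:
    """Physical and logical requirements when each site has one access link."""
    if n_sites < 2:
        raise ValueError("need at least two sites")
    return {
        "access_links": n_sites,                                # one physical link per site
        "logical_connections": n_sites * (n_sites - 1) // 2,    # one per pair of sites
        "connections_per_access_link": n_sites - 1,             # shared on each access link
    }


for n in (5, 20, 100):
    fr = frame_relay_full_mesh(n)
    print(f"{n} sites: {private_line_mesh_links(n)} private lines "
          f"versus {fr['access_links']} access links "
          f"carrying {fr['logical_connections']} logical connections")
```

The number of logical connections still grows as N(N - 1)/2, but the number of physical links and ports a site must pay for grows only with N.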
Technology Winners In theory, any technology should be totally neutral and independent of users. That is, users should be able to pick and choose the best solution from a whole suite of technologies to fit their situation. In practice, a given technology is always involved in a kind of popularity contest with other methods for the user’s attention. Users always pick something comfortable and familiar over something uncomfortable and alien. When it comes to LANs, the good news is that users know Ethernet. The bad news is that they don’t want to know anything else. There are other examples along the same lines, but the point is the same. Some technologies are obviously helpful and immediately prosper. For example, within 40 years of the invention of the typewriter, no major author submitted handwritten manuscripts. Some technologies appeal to only niche audiences and languish, never quite catching on, but never quite disappearing either. Things like electric cars could easily be put in this category. And other technologies basically become obsolete, some rather quickly. The famous Pony Express mail service ran only 18 months before the telegraph put it out of business. There is nothing obvious about a technology that immediately is labeled as a winner or loser. So this section is not based on anything inherent in the technologies that doomed their competitors. In many cases, competing methods are still around. This section examines three popular technologies that all work together today to form a basic structure for an organization’s network from top to bottom. These are Ethernet LANs, the Internet/intranets/Web (all three have much in common), and frame relay. They are presented in this order, but only because the main topic of this book is frame relay.
Ethernet LANs Whether the LAN is called Ethernet or not, the general structure of the LAN is shown in Figure 1.1. A LAN is defined in this book as a private network where even the cable that the bits flow on is totally owned and operated by the organization using the LAN. LANs span small geographical areas, usually no more than a few miles, but typically much smaller areas. LANs are almost exclusively confined to a single building and usually a single floor, especially those based on 10Base-T. In the figure, desktop devices, either clients or servers, are attached to the central 10Base-T hub using up to 100 meters (about 328 feet) of unshielded twisted-pair (UTP) category 3 (Cat 3) or category 5 (Cat 5) copper wire. Usually, the hubs are located in telecommunications closets, but sometimes they are out in the general office space. Note that several hubs could be joined together with what is known as a backbone LAN, which might be 10Base-T, but could easily be something else.
Figure 1.1 An Ethernet (10Base-T) LAN.
Now, not all LANs are Ethernet or 10Base-T. In fact, not all networks used in a building are LANs. Many are based on Wide Area Network (WAN) technologies and older network technologies which have come to be called legacy networks, meaning “what I had before there were LANs.” Many of these legacy networks are based on proprietary, vendor-controlled network protocols such as IBM’s Systems Network Architecture (SNA), Digital Equipment’s DECnet, or others. There is nothing wrong with basing networks on these protocols. In fact, if they did not work and work well, they would not have thrived enough to become legacy protocols in the first place. But because they did work well, many organizations still retain the legacy SNA network that their financial business runs on, leaving the LAN applications to address more exciting, but routine, user needs.
Ethernet-based LANs caught on because they were less expensive than token ring LANs, the major competition from IBM. Token rings were built in IBM shops, especially those where SNA session support to LAN-attached PCs was required. The reasons for using token ring for SNA are not important here, but what is important is that token ring LANs initially needed entirely new building wiring, known as IBM Shielded Twisted-Pair (STP), to be run. This was much more expensive both to purchase and to install than the cable needed for Ethernet, especially once Ethernet, originally running on a thick coaxial cable, became 10Base-T and ran on quite inexpensive unshielded twisted-pair copper wire. Eventually, IBM allowed token ring to be run on UTP also, but by then the advantage gained by Ethernet could not be overcome. The advantage in wiring eventually grew to include Network Interface Cards (NICs). As more manufacturers turned to Ethernet in preference to token ring, price competition became much fiercer. It is not an exaggeration to say that for every token ring vendor, there were 10 Ethernet vendors.
This brief discussion of the whole Ethernet-token ring controversy could include considerations of a political nature as well. What is most important is that by the early 1990s, when people thought “LANs,” they thought “Ethernet.” This basic Ethernet structure has been around since 1990 or so. Over the years, the 10 Mbps Ethernet speed has jumped to 100 Mbps and has jumped again to 1,000 Mbps (which is 1 Gbps). The new Gigabit Ethernet has excited a lot of equipment vendors and users, but the upgrade to 1 Gbps will not just be a matter of swapping out one network interface card (NIC) for another, as was done when 10 Mbps Ethernet became 100 Mbps Ethernet. In most cases when moving from 10 Mbps to 100 Mbps, the same building wiring was used and the NIC cards had a small toggle switch on them to allow users to easily switch from 10 Mbps to 100 Mbps. But Gigabit Ethernet will probably require not only new copper wire to be run (and shorter runs at that), but also at least some fiber optic cable as well. The success of any LAN scheme bearing the name Ethernet is virtually assured, as previously noted. The good news for LAN equipment vendors is that users and customers understand and appreciate Ethernet. So, as much as people struggle with Gigabit Ethernet, it will be considered definitely worth the effort. If ever there was a technology winner, Ethernet LANs are a prime example. When it comes to WANs, the impact of Ethernet is being seen everywhere. Most sites have LANs, and even a number of residences have begun to appreciate the fast connectivity that LANs provide over a limited distance. The distance limit is the key. Even 10 Mbps Ethernets can hardly be expected to network effectively over low-speed, dialup modems of the type used to surf the Web. The lowest speed seriously considered is known as DS-0, which runs at 64 kbps. 
Even this is barely adequate today; that it works at all is due to the fact that most LAN-based applications are what are known as bursty applications. The term bursty means that over time the application is seen to generate bursts of traffic, bursts consisting of data units usually called packets, which flow between the LANs. Between bursts, long periods of relative silence occur in which little or no traffic at all passes between the LANs. So if the LAN traffic is bursty enough, packets can be buffered (saved in a memory location dedicated to communications) until the burst is over, at which time the packets can be sent out on the much lower speed link to the other LAN.
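The buffering argument can be made concrete with a little arithmetic. The Python sketch below is a simplified model, not anything prescribed by the text: it assumes the burst arrives at the full LAN rate, that the WAN link drains the buffer continuously (even while the burst is still arriving), and that nothing else shares the link. The function name and the sample figures are illustrative.

```python
def burst_over_slow_link(burst_bytes: int, lan_bps: float, wan_bps: float):
    """Return (burst arrival time in s, peak buffer in bytes, total delivery time in s)."""
    if burst_bytes < 0 or lan_bps <= 0 or wan_bps <= 0:
        raise ValueError("burst size must be non-negative and rates positive")
    burst_bits = burst_bytes * 8
    arrival_time = burst_bits / lan_bps            # time for the burst to arrive from the LAN
    sent_while_arriving = wan_bps * arrival_time   # bits already forwarded during the burst
    peak_buffer_bits = max(0.0, burst_bits - sent_while_arriving)
    total_time = burst_bits / wan_bps              # time until the last bit leaves on the WAN
    return arrival_time, peak_buffer_bits / 8, total_time


# Assumed example: a 100,000-byte burst from a 10 Mbps LAN onto a 64 kbps line.
arrival, peak, total = burst_over_slow_link(100_000, 10_000_000, 64_000)
print(f"burst arrives in {arrival:.2f} s, peak buffer about {peak:.0f} bytes, "
      f"delivered after {total:.1f} s")
```

Under these assumptions almost the entire burst has to sit in the buffer, and the link needs the quiet period that follows to catch up. If the next burst arrives before then, the buffer keeps growing.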
More and more applications are being run on LANs, where they are usually built on what is known as the client/server model. In the client/server model of network computing, all processes running on any type of computer at all can be classified as clients or servers. Since all modern computers and operating systems can run multiple programs or processes at the same time, there can even be clients and servers running on the same machine at the same time. The difference between clients and servers is that clients talk and servers listen for initial requests. Server processes have become so complex and performance-conscious that it is much more common to implement servers on a dedicated computer of their own, one machine for each major server type. It is usually easy to tell the clients from the servers in an organization. The clients typically have people sitting in front of them doing work. The servers typically are found in the backoffice or, if located in the general office space, have signs on them saying “do not touch.” It is a truism of client/server computing that no one can actually do work on a server, but no one can do work without it. For the servers listen to client requests for information, which the servers provide over the network for the clients to process. The importance of the client/server model of network computing is not so much in the mere fact that there are clients and servers on LANs. It is in this implication: Any client must be able to access any server it is authorized to reach in order to perform its job. So, for instance, a client PC in the Human Resources department of some organization must be able to reach all of the Human Resources servers, whether in the same building or not. Naturally, if the servers are not in the same building as the clients, a WAN is needed to reach them. In practice, the servers might be literally anywhere in the organization. In fact, they might even be outside of the organization. 
This is especially true when speaking of the latest phenomena in the realm of client/server computing: the Internet, intranets, and the World Wide Web.
Internet/Intranet/Web The Internet is a global, public collection of computer networks that are all interconnected according to the familiar client/server model. E-mail clients send to e-mail servers. File transfer clients fetch software packages from file transfer servers. And so on. All of these clients and servers must be running software that complies with the protocols of the Internet Protocol Suite, which most people know as TCP/IP, for the two major protocols themselves. The key to the position of the Internet in networking today is its current public status. The Internet began as a military network in 1969 and only “went public” in a big way in the early 1990s. The software available for the Internet was standardized, inexpensive (often bundled with other software and so “free”), and well understood. It was not long before the software and protocols used for Internet access were also used for client/server computing on LANs. Of course, the client/server interactions within an organization were inherently private, not public. They were not characteristic of an Inter-net, but rather an intra-net, within the same organization. So the application of Internet client/server applications and protocols among the clients and servers within a single organization became known as an intranet. This preserved the privacy of the organization’s internal network while at the same time taking advantage of the effectiveness and popularity of the Internet Protocol Suite (TCP/IP). The key here was the concept of a client being authorized to access a given server. In a private network, such authorization is implicit in the client’s membership in the network, but not haphazard. For example, few organizations allow all clients to access the server with employee’s salary information since the potential for abuse is much too high. 
There are occasions when it is even appropriate to allow clients from outside the organization to access servers within the organization over the Internet. An example would be a manufacturer and its suppliers. An automobile manufacturer might require client access to a tire maker’s server in order to place orders, check shipments, and perform various other tasks. This permission can be granted to such external clients, as long as the proper safeguards are in place to prevent abuses and further exposure to outside threats. This arrangement is known as an extranet, reflecting the external client relationship to the intranet servers. A lot of the security for these arrangements is provided by Virtual Private Networks (VPNs) based on various more or less standard security methods. A VPN can provide a way to exchange private information with confidence over the public Internet. The relationship between clients and servers across the Internet, within an intranet, and on an extranet is shown in Figure 1.2.
Figure 1.2 Internet, intranet, and extranet. Today, much of the activity and popularity of the Internet, and intranets and extranets as well, is firmly based on the popularity of the World Wide Web, or just the Web for short. In a few short years, the Web has invaded the public consciousness to such an extent that it is hard even now to imagine how people got along without it. The familiar child’s question of “What did people do before there was television?” will soon be replaced with “What did people do before there was the Web?” So grade schools assign Web homework, goods are sold over the Web, and even stocks can be traded on secure Web sites. Naturally, any transaction involving finances requires a high degree of security, such as that provided by a VPN. The Web browser constitutes a kind of universal client that can be run on any type of computer. So instead of needing a separate client program for e-mail, file transfers, and remote login (after all, the servers are all separate), a user really only needs a simple Web browser to perform all of these functions and more. The Web supports a wide range of multimedia content, from streaming audio Web radio stations to Web video movies. All of this information and versatility is rolled into one point-and-click Graphical User Interface (GUI) that is as simple to use as a television remote. Together, the Internet, intranets, and the Web form another technology winner with Ethernet LANs. But there is even a third technology winner that can provide an organization with even better security than almost all intranets and extranets. This is public frame relay network services.
Frame Relay
Frame relay is typically used as a virtual private data service that employs virtual circuits or logical (rather than physical) connections to give frame relay users the look and feel of a private network. The shared nature of logical connections (virtual circuits) is an important aspect of frame relay and is fully discussed later in this chapter. In most cases, the frame relay network is built by a public network service provider and use is contracted to customers on a multiyear basis (2- and 3-year contracts are most common). This relieves the customer of building his or her own private network with purchased network nodes (switches or routers) and leased private lines. This is not to say that frame relay networks cannot be built as totally private networks, just that very few private frame relay networks exist. Most frame relay networks are public, which gives customers a lot of flexibility and economic advantages, in the same way that taking a public taxi is more flexible and economical than buying a car.
Frame relay is the first public network service designed specifically for bursty LAN applications. Frame relay supports all common data applications and protocols typically found in LAN, SNA, and other data network environments. Frame relay support for voice is not uncommon and video support might become common as well. The virtual network nature of frame relay allows for the consolidation of previously separate networks such as LANs and SNA into a one-network solution. Frame relay supports all mission-critical data applications, whether based on distributed or centralized computing. Frame relay supports LAN interconnectivity, high-speed Internet access, and traditional terminal-to-host or SNA connectivity. The bursty nature of these data applications allows users to take advantage of the special features that frame relay was designed with.
Common frame relay network configurations include LAN-to-LAN, terminal-to-host (host being the most common term for an IBM mainframe or other central computer), LAN-to-host, and even host-to-host. The typical types of applications that users run on frame relay include document or file sharing, remote database access, office automation (such as order entry, payroll, and so on), interactive sessions, e-mail, presentation or graphics file sharing, and bulk file transfers. The customers that form the best base for frame relay services have three major characteristics in common: They have five or more dispersed locations that need connectivity; they want to consolidate separate networks into one integrated network; and they need full or nearly full mesh connectivity between the sites.
The popularity of frame relay can be appreciated when compared to other public data network solutions. The first public network service designed specifically for data was arguably the one based on the X.25 standard for public packet-switched data services. X.25 public Packet-Switched Data Networks (PSDNs) were built in almost every country around the world, but in the United States, X.25 use remained rare. People preferred to build totally private networks out of privately owned switches (and then routers) linked by point-to-point leased private lines. Even outside of the United States, X.25 networks were plagued by annoying incompatibilities between national versions, lack of availability in key locations, and the failure of the service providers to market the solution effectively. Outside of a few niche applications where it thrives to some extent even today, X.25 in the United States became a public network without a public.
Then along came Integrated Services Digital Network (ISDN), which was supposed to lead the telephone companies out of the wilderness and into the public consciousness as providers of voice service with unprecedented quality, and all the data service support anyone could ever need. 
Most of the data service support was provided by X.25 packet switching, hardly a winner on its own. And most people in the United States at least were quite happy with the quality of their voice service already. The promised integrated services like video and fax services were either already available in other forms or soon came from other sources such as the cable TV companies. After almost 15 years, ISDN was still not available everywhere. Outside of the United States, with a few notable exceptions like Germany, ISDNs were plagued by annoying incompatibilities between national versions, lack of availability in key locations, and the failure of the service providers to market the solution effectively (one does see a pattern here). In the face of the less than rousing reception given these previous public data network solutions, the sudden success of frame relay took most service providers by surprise (actual shock in some cases). Inside and outside of the United States, frame relay enjoyed smoother international connectivity, great availability, and brilliant marketing tactics. Frame relay’s success, although surprising in its scope, should not have been totally unanticipated. Users had been faced with increasing difficulties in linking their LANs with private lines for some time. Frame relay, unlike X.25 and ISDN, filled an immediate need. This construction of private networks with point-to-point private lines requires some further exploration.
Frame Relay Benefits The time has come to bring this chapter to a close with a look at just what frame relay can do to help resolve all of the issues regarding network limitations surveyed to this point. This section is not intended to replace the list of frame relay benefits given in the Introduction. Think of this more as the foundation for all the benefits listed earlier. There are four main benefits that frame relay networks offer to organizations. These are in terms of bandwidth, connectivity, economics, and applications. The concluding sections look at each one in turn.
Bandwidth
The need for increased bandwidth, and even broadband networks in some instances, is a fact of life in most organizations today. The question is more one of the best way to get the increased bandwidth. Faster private lines, the traditional answer, are wasteful in the United States and simply not an option in many other parts of the world. When the need for bandwidth for international connectivity is factored in, paying for private line bandwidth is even more of a problem.
The attraction of frame relay is that there is no dedicated bandwidth as on private lines. The total amount of bandwidth is divided according to the needs of the currently running and bursting applications. This is no more than bandwidth-on-demand in action. Since a customer is not paying for dedicated bandwidth, but shared bandwidth, frame relay networks typically use relatively modest speeds for linking customer sites (since sharing cuts down on the overall bandwidth need). But where larger, broadband speeds are needed, frame relay can be used as well.
Access from a customer site to a frame relay network can be at 64 kbps, simple multiples of this basic rate (called fractional T1 (FT1) or Nx64), 1.5 Mbps (full T1), 2 Mbps (mostly outside of the United States), and even 45 Mbps (full T3). Higher speeds are being considered, but 45 Mbps should remain the top speed for some time to come. Of course, a 45 Mbps link used to access a frame relay network is no more than a private line leading not to another customer site, but to the frame relay network itself. So the cost is kept manageable due to the relatively short length of the link. When the fact that all the logical connections share the 45 Mbps bandwidth on a dynamic basis is added, it turns out that frame relay has more than enough bandwidth for any application.
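The idea of dividing the access bandwidth among whatever applications happen to be bursting can be pictured with a toy model. The Python sketch below is only an illustration and makes no claim about how a frame relay switch actually schedules frames: it assumes that each active connection states a demand in bits per second, that demands are met in full when they fit, and that they are scaled back proportionally when they do not. The function and connection names are hypothetical.

```python
def share_access_link(link_bps: float, demands_bps: dict) -> dict:
    """Divide link_bps among active connections (toy proportional-sharing model)."""
    if link_bps <= 0:
        raise ValueError("link rate must be positive")
    total_demand = sum(demands_bps.values())
    if total_demand <= link_bps:
        # Everything fits: each connection gets what it asks for.
        return dict(demands_bps)
    # Oversubscribed: scale every demand back by the same factor.
    scale = link_bps / total_demand
    return {name: demand * scale for name, demand in demands_bps.items()}


T1_BPS = 1_500_000  # rounded T1 access rate, as used in the text

# One application bursting alone gets everything it needs...
print(share_access_link(T1_BPS, {"file transfer": 200_000}))
# ...while two simultaneous 1 Mbps bursts must share the access link.
print(share_access_link(T1_BPS, {"file transfer": 1_000_000, "database": 1_000_000}))
```

With a private line mesh, each of those connections would have needed its own dedicated circuit, idle between bursts. Here the idle capacity of one connection is available to the others.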
Connectivity Frame relay connections are logical connections, not the physical connections of a private line network. Logical connections are sometimes called virtual circuits and the terms are more or less interchangeable. This book prefers the terminology of logical connections because the term virtual has become overworked in the networking industry. There are virtual LANs, virtual private networks, and even several forms of virtual reality. However, the later use of the terms Permanent Virtual Circuit (PVC) and Switched Virtual Circuit (SVC) will be unavoidable. Whatever they are called, the connectivity provided by frame relay networks is logical, not physical. These logical connections form the virtual circuits which replace the dedicated or physical circuits that form the basis of the private line network. This logical connectivity is shown in Figure 1.8.
Figure 1.8 Private lines and logical connections. Now all access is to the public network, not using point-to-point private lines (or the multipoint private lines still in use with SNA) for all site connectivity, but with logical connections or virtual circuits established on the access link. This is the essence of public networking in general: All reachable end points are contacted through the same local access link. Think of telephone calls on the PSTN, which uses the same idea. Frame relay sites connect to each other by connecting to the frame relay network. This is a huge advantage over private line networking. Consider an organization with a need to establish an office in Paris, for example. A private line would not only be expensive, but probably not even available at anything near broadband speeds. But frame relay can link the sites together with 1.5 Mbps access in the United States and 2 Mbps access in Europe. This is a pretty neat trick, because the two ends run at different speeds. This is hard to do with physical connections, but easy to do with frame relay virtual circuits. One other connectivity example should suffice. AT&T says that some 40 percent of all telephone calls to Europe from the United States are not voice calls, but fax messages. These are expensive, to say the least, and form a considerable part of the voice network budget. With frame relay, much if not all of this fax traffic can be sent over the frame relay network itself, with the resulting savings to the voice finances. The Internet can be used for all of the connectivity advantages mentioned in this section. After all, the Internet is also a public network which reaches the world through local access links. But the Internet does not offer even the beginnings of the security that frame relay users count on routinely. Internet security must be added by users, often at considerable expense. 
And many studies have shown that commercial Internet security products fall far short of their claims to comply with the most basic security standards. Also, the Internet handles logical connections much differently than frame relay does. There is no connection path set up between the routers on the Internet as there is between the switches on a frame relay network. There is no minimum bandwidth guarantee either, much to the dismay of those who would rely on the Internet for things better done with frame relay. So think of the Internet as an adjunct to frame relay, not a competitor. Indeed, it is a rare frame relay network that does not include at least one virtual circuit to an Internet service provider. But frame relay provides better security and performance than the Internet.
Economics
With enough money, there is no reason to favor one technology over another. This is especially true of standards. After all, they all work. There is no single right way to build a LAN or WAN. But since this is not an ideal world, money does matter. And often one technology is favored over another because a small initial economic advantage gets magnified over time until it becomes much more expensive to do anything else. Certainly the small economic edge that Ethernet enjoyed over token ring at the beginning of LANs (token ring chipsets were just more complex and therefore more expensive) was magnified over and over. Frame relay has held on to an economic edge over ATM for some time now and probably will for some time to come.
Some of the discussion in the preceding sections has touched on the economic benefits of frame relay. It should be enough to list these benefits and add a little more information about each aspect. Following are the “big 5” savings possibilities of frame relay. There are more, but they have less impact than these main ones.
1. Bandwidth savings. Since there is no dedicated bandwidth, applications can share a pool of bandwidth on the access line. This can lead to a significant cost savings.
2. Connection savings. Since there are no physical connections between sites, there is no need to have multiple links running to remote sites. All sites are reached through the same access link.
3. International savings. Private lines to other parts of the world are expensive. Frame relay logical connections can reduce this expense. And if frame relay can be used for fax and/or voice, the savings are even greater.
4. Network savings. Frame relay is a logical network. Connections can usually be added in a week or so to new sites, and logical connections rearranged literally overnight. (In contrast, private lines typically require 30 to 60 days to “rearrange”.)
5. Management savings. Most of the details of managing the day-to-day activities on a network, such as routing around failures and avoiding congestion, are now handled on the public network on behalf of all customers. Some frame relay users rely on the service provider for all of their network management needs.
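The connection savings in item 2 can be made concrete with a little arithmetic. The sketch below is only an illustration under a stated assumption: the private line network is taken to be a full mesh, with a dedicated line between every pair of sites, which is the worst case. The function names are hypothetical and do not come from any frame relay product.

```python
def private_line_links(sites: int) -> int:
    """Dedicated point-to-point lines needed to fully mesh `sites` locations.

    Assumption: every site needs a direct line to every other site.
    """
    return sites * (sites - 1) // 2


def frame_relay_access_links(sites: int) -> int:
    """Each site needs only one access link into the frame relay network."""
    return sites


if __name__ == "__main__":
    # Compare the two link counts as the number of sites grows.
    for n in (3, 5, 10, 20):
        print(n, private_line_links(n), frame_relay_access_links(n))
```

A fully meshed frame relay network still needs one logical connection per pair of sites, but those are configured on the existing access links rather than installed as physical circuits.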
Applications One of the benefits of frame relay that is just beginning to be explored is the wide range of applications that a frame relay network can support. Most people are aware that frame relay will support not only bulk file transfer and delay sensitive transactions, but is perfectly suited for IBM SNA network support as well. Many now are aware that faxing and voice telephony can be done on frame relay also. Only a few are becoming aware that frame relay can support very good video services, such as corporate video conferencing. What took so long? The answer is simply that frame relay was designed to be first and foremost a data service. It was only after the success of frame relay in that role that frame relay equipment manufacturers and service providers began to explore the use of frame relay for voice and video. The whole point is that if an organization needs a network platform that is secure, fast enough for almost any application, and virtually future-proof, then frame relay is the way to go.
Chapter 2: The Public Data Network
Overview
This chapter will further investigate the role of frame relay as a public data network. There are many types of networks, naturally. But most of them can be classified according to whether they were designed and intended to provide public or private service, and whether these networks were designed and intended primarily to deliver voice, data, or some other kind of service on circuits or with packets. Frame relay was primarily designed and intended as a public data network service.
This is an important point because of the current trend to use frame relay not only for data, but also for voice. Voice over frame relay works very well, in most cases much better than voice delivered over the public Internet, for example. There is a reason for this, which will be detailed in this chapter. For now, it is enough to point out that what is known as cell-based frame relay delivers voice that is just about as good as many international telephony calls, but at a fraction of the price of using public telephone company voice services. So frame relay is not just a data network anymore.
The second half of the “frame relay was primarily designed and intended as a public data network service” equation is the public aspect of frame relay. Yet many people speak of their frame relay network as the corporate backbone for what remain essentially private network services delivered only to the authorized users of the network within the corporation. Obviously, frame relay is a public network service that can be used in conjunction with a private, corporate network. In some cases the term virtual private network is applied to a frame relay network configuration, but it is more common to apply this term to companies linked over the Internet. Nevertheless, one of the most common uses of frame relay is to link a corporate network to the Internet in a systematic and cost-effective manner.
This chapter will look into frame relay as a public data network and examine how it came to be characterized in this way in the first place. This chapter will enable readers to understand how frame relay can also be used today as a voice and video network, and as a virtual private network that offers privacy to employees while at the same time opening up the organization to the Internet and World Wide Web. This versatility is hard to match with many other network technologies and, of course, is one of the reasons for frame relay’s popularity today.
Networking with Circuits and Packets
Frame relay, like X.25 before it, is a packet-switching technology. Frame relay is usually classified as a fast packet technology, for the simple reason that frame relay is fast enough to carry packetized voice and video. ATM is another of these fast packet technologies. The relationship between frame relay and ATM will be explored in much more detail in a later chapter (Chapter 12). ATM can and does form the switch-to-switch backbone technology for many frame relay networks. This form of cell-based frame relay makes the voice and video capabilities of a frame relay network that much more attractive to users, since ATM is designed for such mixed traffic environments.
Before X.25 came along, people built data networks out of circuits, just like the old days of the telegraph network. In other words, all the telephone companies provided was the raw bandwidth on wires to send and receive frames with packets inside. If there was any switching or routing of messages, this had to be done by user-provided equipment. In the telegraph network, this was the equivalent of taking a message from one place, then sending it out on the telegraph wire to the next hop down the line. This process was repeated until the message eventually got where it was going. The alternative of a direct wire link everywhere was hardly technically or economically feasible.
Yet a method for networking without circuits was clearly needed for data. Data is not like a telephone call. Data is bursty, but human telephone calls are not. Circuits used for voice are almost constantly in use, at least in one direction when someone talks and another listens. But circuits designed primarily for voice, when used for simple PC dialups, are filled with idle periods between bursts. These idle periods could be used for other data packets, except for the fact that the bandwidth on a leased, private line belongs exclusively to the customer leasing the line.
Circuits also connect only one point to another, in most cases. This is fine for the PSTN because it was intended to connect the device at the end of one access line to the device at the end of the other access line. It matters little whether the devices are telephones, modems, or fax machines, as long as they are compatible. But one major aspect of data networks in general, especially those built on the principles of the OSI RM, is that they need to connect the client device at the end of one access line to everything. As more and more people use PSTN access lines for Internet access and not telephone calls, it is clear that circuit-switched networks like the PSTN, with their “all the bandwidth, all the time” approach, are not the best way to handle this traffic.
On data networks like the Internet, information is sent and received in the form of packets. Packets are variable-length units that have some maximum size and minimum size. The term packetized means that everything sent around the network, from data to voice and video, is in the form of packets. Voice and video have traditionally been handled by circuit-switched networks (such as the PSTN and cable TV networks) rather than packet-switched networks. Once packetized using state-of-the-art digitization and compression techniques, this type of voice and video more closely resembles bursty data than anything else.
Circuits are ill-suited for packets. Packets can go anywhere, but circuits only go to one place at a time. Circuits reserve all of the bandwidth, all of the time to one place at a time. Long circuits are paid for by the mile. It is too expensive to require one circuit for each potential user. A lot of expensive bandwidth on these circuits is tied up for bursty data applications.
To save money and make more efficient use of long and expensive circuits, packet switching was invented. Who invented packet switching is the subject of intense debate. Surely the Internet people were pioneers and IBM certainly popularized the process with its own networking products. The whole concept was standardized with X.25 in the 1970s. With packet switching, individual packets to all destinations could be switched from place to place based not on what circuit the packet represented, but what end application the packet was carrying information for. This is how a single link on a packet-switched network can connect one end device to everything. The packet switch (now called a router on the Internet) could send a packet literally anywhere on the network, based on the information carried in the packet header. Only one link into the network cloud was needed for all of these activities. This user-network interface (UNI) link was still only a circuit in most cases because packets still must flow on something.
The individual address information attached to every packet gives another perspective for distinguishing circuit switching from packet switching. Circuit switches, such as PSTN local exchanges, switch the entire bandwidth of the circuits (all the bandwidth, all the time) from one place to another. Packet switches only switch the individual packets that form the content of the frames. So packets to a whole host of destinations can be mixed on the same link, which may still be a PSTN circuit, dialed or leased. The basic differences between circuit-switched networks (e.g., the PSTN) and packet-switched networks (e.g., frame relay) are shown in Figure 2.7.
Figure 2.7 Circuit switching and packet switching.
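The distinction can also be sketched in a few lines of code. This is a toy model, not a description of any real switch: the class names, port names, and destination labels are all invented for illustration. The circuit switch maps an entire input port to one output port, while the packet switch looks at the destination carried in each packet header.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str  # address information carried in the packet header
    payload: bytes


class CircuitSwitch:
    """Switches the whole bandwidth of an input port to one output port."""

    def __init__(self):
        self.cross_connects = {}  # input port -> output port

    def connect(self, in_port: str, out_port: str) -> None:
        self.cross_connects[in_port] = out_port

    def forward(self, in_port: str) -> str:
        # Everything arriving on in_port goes to the same place.
        return self.cross_connects[in_port]


class PacketSwitch:
    """Chooses an output port per packet, based on the packet header."""

    def __init__(self, table: dict):
        self.table = table  # destination -> output port

    def forward(self, packet: Packet) -> str:
        return self.table[packet.destination]


if __name__ == "__main__":
    switch = PacketSwitch({"paris": "port-1", "chicago": "port-2"})
    # Packets arriving on one access link leave on different ports.
    for dest in ("paris", "chicago", "paris"):
        print(dest, "->", switch.forward(Packet(dest, b"data")))
```

In this sketch a single access link can reach every destination in the table, which is the "one link to everything" property described above.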
X.25: The Slow Packet Network
In the early days of data communications protocols, there were many private protocols designed to address a particular need or specific private network. For example, the airline industry employed IPARS, the International Protocol for Airline Reservation System, which was specific to a particular application and network. No one gave any thought to using IPARS on any other network or in any other context except for the one for which it was developed.
But at the same time, several companies and telephone administrations in North America and Europe implemented a number of public data networks. The idea was to provide a data service that paralleled the voice service’s public connectivity and degree of standardization. These are commonly known today as packet-switched data networks (PSDN), but other names were common in the past. The names all acknowledge that PSDNs function by switching packets through the network. The network node in a PSDN is called a switch, not a router, bridge, or any other data communications device.
X.25 was intended to be an international standard for building a PSDN in which any data user could contact any other data user, literally anywhere in the world served by a PSDN, to exchange information as easily as voice users used the voice network to exchange information by means of speech. The ideas behind X.25 especially appealed to European telecommunications administrations, where the number of relatively small nations made adoption of a single data standard very attractive to users. In the United States, there was less of a need, economically or politically, to consider alternatives to private data networks.
It is important to realize that the X.25 standard for PSDNs specifies one main thing: the user’s connection to the network in a standard fashion. This means that even if different vendors provide the customer premises equipment and the network node switch, they can interoperate as long as they both comply with the X.25 interface standard. How one network node sends X.25 packets to another switch (network node) is beyond the scope of the X.25 standard, which only specifies the User-Network Interface (UNI). Even the important consideration of how one PSDN should send packets to another PSDN (for instance, a PSDN in another country or one run by another service provider) is covered by a separate (but related) international standard known as X.75. X.25 was developed in the early 1970s by the CCITT (now International Telecommunications Union-Telecommunications Standardization Sector: ITU-T) and published in 1976. The assumptions made by the designers reflected the state of networking at the time. There were mainly two. First, end-user devices were of limited intelligence in terms of processing power. Second, the communications network connecting the users to the X.25 networks, and the communications network between the X.25 switches themselves, was extremely error prone and unreliable. Therefore, X.25 contained a lot of provisions that took much of the communications responsibility away from the end-user device and included a lot of error-checking and flow-control services. Since X.25 was intended to mimic the voice network in function, X.25 was intentionally made a connection-oriented service. This meant that, just as in the public telephone network, one user had to establish a connection to another user across the network before data transfer could begin. Because end devices were so limited in capability and the network similarly limited, this decision made perfect sense. 
Why send a packet halfway across the country only to find out that the destination could not accept any traffic, for whatever the reason? X.25 is a connection-oriented protocol, as opposed to other common protocols such as TCP/IP, which is connectionless, at least within the network itself. There had to be connections, node by node, or else packets would not flow. The connections in X.25 were logical, not physical, and the familiar frame relay terms of permanent virtual circuits (PVCs) and switched virtual circuits (SVCs) were fundamental to the architecture of X.25 as well. The idea of a datagram in TCP/IP, which basically means a connectionless packet, does not exist in X.25.
The X.25 protocol is a layered protocol. The three layers of X.25 are shown in Figure 2.8. At the bottom, the Physical Layer describes the type of connector used and the way that data bits in the form of 0s and 1s are sent to and from the X.25 network switch. Several possibilities are listed in the figure. At the Data Link Layer, LAPB (Link Access Procedure-Balanced) defines a particular frame structure that bits are organized into when sent to and from an X.25 PSDN, as well as various functions that the frames can also provide, such as setting up SVCs, sending data, and control functions. The upper layer is the Network Layer, and X.25 defines a packet structure that forms the content of certain kinds of frame. The protocol which determines the rules for the sending and receiving of packets over the X.25 network interface between user and switch is known as the X.25 PLP (Packet Layer Protocol).
Figure 2.8 The layers of the X.25 protocol stack. The X.25 LAPB frame structure is probably the most important part of X.25 for understanding the development of frame relay. This is shown in Figure 2.9.
Figure 2.9 The structure of the X.25 LAPB frame.
This structure has appeared over and over again in data communications protocols. The structure is simple enough, yet possesses all of the capabilities needed to carry the X.25 packet over a series of links between X.25 switches. This is, in fact, a key point. In all layered data communications protocols, the frame is sent on a link-by-link basis only. That is, at every step along the way on an X.25 network, a frame is created by the sender and essentially destroyed (in the act of processing the frame) by the receiver. It might seem impossible to ever send anything intact from a source to a final destination if this is the case. But the key to understanding how layered protocols operate, and how frame relay relates to X.25, is to realize that it is the packet, as the contents of the constantly invented and destroyed frames, that is sent across the network end to end from a source to a destination. Many protocol developers are even in the habit of calling any protocol data unit (PDU) that travels the length of a network from end to end intact, through switches and other network devices, a packet. The packet label seems to apply whether the actual PDU is a TCP/IP Layer 3 data unit (datagram) or an ATM Layer 1 data unit (cell). The term “packet” is used generically in many cases simply to indicate a data unit that leaves one end-user location, flows over a network, and arrives at another end-user location intact. So the first step in understanding the relationship of frame relay to X.25 is to realize that in frame relay it is the frame that flows intact from source to destination on a frame relay network. In X.25 this function is performed by the packet, or frame contents. In an X.25 packet-switching network, packets are switched on the network. In a frame relay network, frames are relayed across the network much more efficiently, since the frames no longer need to be created and destroyed on a link-by-link basis just to get at the packet inside.
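The link-by-link life of a frame can be illustrated with a small sketch. Everything here is a simplification made for illustration: the frame is a plain dictionary, the "check" field is a toy sum rather than the real LAPB frame check sequence, and the function names are hypothetical. The point is only that a new frame is built and torn down on every link while the packet inside arrives unchanged.

```python
def send_over_link(packet: bytes, link_id: int) -> dict:
    """Sender side of one link: wrap the packet in a fresh frame."""
    return {
        "link": link_id,
        "contents": packet,
        "check": sum(packet) % 65536,  # toy stand-in for a real frame check
    }


def receive_from_link(frame: dict) -> bytes:
    """Receiver side of one link: verify the frame, then discard it."""
    if sum(frame["contents"]) % 65536 != frame["check"]:
        raise ValueError("frame check failed")
    return frame["contents"]  # only the packet survives the hop


def x25_style_path(packet: bytes, links: int):
    """Carry a packet over `links` links, one new frame per link."""
    frames_created = 0
    for link_id in range(links):
        frame = send_over_link(packet, link_id)
        frames_created += 1
        packet = receive_from_link(frame)
    return packet, frames_created


if __name__ == "__main__":
    delivered, frames = x25_style_path(b"hello", links=4)
    print(delivered, frames)
```

In the frame relay case described above, the loop body would disappear: one frame would be relayed across all the links instead of being rebuilt at each switch.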
X.25 to Frame Relay
X.25 public packet switching is often cited as the parent protocol of frame relay. X.25 is a slow packet technology while frame relay is a fast packet technology, and so on. But the road from X.25 to frame relay leads through the Integrated Services Digital Network (ISDN). ISDN was a plan to merge access to digital voice and digital switching networks and bring some packet switching capabilities and services to the circuit switching voice network. This is not the place to debate the failure or success of ISDN with regard to this envisioned merging of packets and circuits. What is important for frame relay discussions is that the layers of X.25 were adapted for use as a signaling protocol and packet data protocol for use with an ISDN.
The original LAPB protocol became Link Access Procedure-D channel (LAPD). LAPD was used on ISDN signaling channels to set up, maintain, and terminate ISDN connections. LAPD frames could also carry user data when not otherwise used for this signaling purpose. The LAPD frames now could carry signaling packets to and from the ISDN switch. All user devices sharing the integrated access to an ISDN signaled using LAPD. There is an important aspect of the LAPD frame structure that holds a key to understanding the relationship between X.25 and frame relay, through ISDN. This is the address field. The ISDN LAPD address field has a structure as shown in Figure 2.10.
Figure 2.10 The structure of the ISDN LAPD address field.
It is obvious from the figure that there are two different fields involved in the ISDN LAPD address structure. The first is the SAPI (Service Access Point Identifier, 6 bits in length), and the second is the TEI (Terminal Endpoint Identifier, 7 bits in length). These identifiers are just numbers, from 0 to 63 in the case of the SAPI and from 0 to 127 in the case of the TEI field. But why should a frame, which only flows from a single location to another single location (point-to-point), have such a complicated address structure?
This was one of the innovations of ISDN itself. While it is true that all frames flow over the same point-to-point link on a data network, it is not the case (and cannot be) that all packets must do the same. Otherwise, a separate physical point-to-point link to every possible destination must be configured at the source for these frames to flow on. In fact, this is the essence of a private network. But this is not the essence of ISDN D-channels and X.25. The parent protocol LAPD, like its child protocol frame relay, allows the multiplexing of connections from a single source location over a single physical link to multiple destinations. The greatest benefit of this approach is to make more efficient use of expensive links and cut down on the number needed in the network.
The two fields in the LAPD frame address in ISDN deal with the two possible kinds of multiplexing that a customer site sharing a single physical network link must accommodate. First, there may be a number of user devices at a customer site. Second, there may be a number of different kinds of traffic that each of these devices generates. For instance, even though all information is 0s and 1s, some of these digits may represent user data and some may represent control signaling to the network itself.
The TEI field deals with the first of these multiplexing possibilities. The TEI field addresses a specific logical entity on the user side of the ISDN interface. Typically, each user device is given a unique TEI number. In fact, a user device can have more than one TEI, as would be the case with a concentrator device with multiple user ports. The SAPI field deals with the second multiplexing possibility. The SAPI field addresses a specific protocol understood by the logical entity addressed by a TEI. These are Layer 3 packet protocols and provide a method for all X.25 equipment to determine the structure of the packet contained in the frame.
Taken together, the TEI and SAPI address fields make it possible for all network devices on the network to (1) determine the source or destination device of a particular frame on an interface (the TEI), and (2) determine the source or destination protocol and packet structure on that particular device (the SAPI). What all this has to do with frame relay will become apparent later. For now, it is enough to point out the functions of the TEI and SAPI LAPD frame address fields.
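A short sketch may help show how two small numbers share one address field. The text above gives only the field widths (6-bit SAPI, 7-bit TEI). The exact bit positions used below, including the command/response bit and the two address-extension bits that fill out the two octets, follow the layout in ITU-T Q.921 and are an addition not stated in this chapter. The function names are hypothetical.

```python
def pack_lapd_address(sapi: int, tei: int, command_response: int = 0) -> bytes:
    """Build a two-octet LAPD address field.

    Assumed layout (per ITU-T Q.921, not spelled out in the text):
      octet 1: SAPI (6 bits) | C/R (1 bit) | EA = 0
      octet 2: TEI  (7 bits) | EA = 1
    """
    if not 0 <= sapi <= 63:
        raise ValueError("SAPI must fit in 6 bits (0 to 63)")
    if not 0 <= tei <= 127:
        raise ValueError("TEI must fit in 7 bits (0 to 127)")
    if command_response not in (0, 1):
        raise ValueError("C/R must be 0 or 1")
    octet1 = (sapi << 2) | (command_response << 1)  # EA bit left at 0
    octet2 = (tei << 1) | 1                         # EA bit set to 1
    return bytes([octet1, octet2])


def unpack_lapd_address(address: bytes):
    """Recover (SAPI, TEI) from a two-octet LAPD address field."""
    if len(address) != 2:
        raise ValueError("LAPD address field is two octets")
    return address[0] >> 2, address[1] >> 1


if __name__ == "__main__":
    field = pack_lapd_address(sapi=16, tei=64)
    print(field.hex(), unpack_lapd_address(field))
```

The receiving side uses the TEI to pick the device and the SAPI to pick the protocol on that device, which is the two-level multiplexing described above.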
X.25 Limitations While the X.25 public, packet-switched data network protocol was a workable and international standard way of transferring data from place to place, X.25 was mostly ignored in the United States and was slow to catch on in the rest of the world as well. There were a number of reasons for this, some of which had to do with the way early data networks were built in the United States and some of which had to do with inherent limitations in the way that X.25 functioned. The X.25 protocol was designed to be a public networking standard which would operate well over noisy, error-prone, copper-based networks. The philosophy of the X.25 designers was to make the network intelligent enough to perform the necessary error recovery and flow-control operations on behalf of the end-user equipment, which had only modest capabilities along these lines when X.25 was developed. The problem with X.25 adoption in the United States is that the private corporations that had the greatest need for the kind of service that a public X.25 network could provide were addressing these same issues with other solutions. In the case of public networking, companies were busily building their own private networks out of point-to-point dedicated links. The bandwidth on these links was leased from the public telephone carriers. Thus, these carriers were in the uncomfortable position of cutting into their leased line revenues by setting up a public, packet-switched network service in direct competition with their private line business. Corporations, who were normally in intense competition with other companies in the same lines of business, were much more comfortable with the data security that private lines provided, at least on the surface (all corporations’ data was still mixed together on the telephone carriers’ backbone network, of course). 
And although these private lines were leased by the mile, and so could be very expensive in some instances, corporations in the United States at that time were enjoying record profits and undergoing a heady expansion phase along with the entire U.S. economy. (There were exceptional companies that embraced public data networks, usually in the computer/network industry itself.)
As far as the error-prone links were concerned, companies adapted to that environment more or less successfully as well. The data applications developed for this error-rich situation performed their own error recovery at the end points of the network. Since most, if not all, links were point-to-point connections between sites, there was little delay or overhead added to a network as a result of these error recovery mechanisms. Although this extensive error recovery increased the price of network devices, the same corporate profits which bankrolled the links also paid for the expensive network devices at the ends of the link. Besides, the philosophy went, end devices would still perform their own error-checking and flow control even if connected over a public network. After all, no computing device should be so trusting of a network as to process a piece of data delivered without extensive error checking and flow control in any case.
These perceived limitations that prevented widespread deployment and use of X.25 public data networks were not so pronounced, or even present, in the rest of the world. The capabilities of the United States telecommunications network and the money available to the users of data network services were not common throughout the world. In fact, the United States was exceptional in this regard. Because the limitations of X.25 mentioned to this point were mainly matters of perception, X.25 public data services became more common throughout the world, especially in Europe. However, even in countries where X.25 networks were extensively deployed and relatively well accepted by the data user community, the technical limitations of X.25 quickly became apparent:
A copy of each frame was kept link-by-link on the X.25 network.
Full error checking was done hop-by-hop on each frame.
All switches had to examine the packet (frame contents).
The X.25 network had to do flow control in the network.
Logical connections between sites were at the packet layer (Layer 3 of the OSI RM).
One of the limitations was that a copy of the frame had to be kept in each network node (the X.25 switch) until a message was received from the next node that the frame was received without errors. Of course, if the frame was received with errors, another copy was sent, and perhaps another, and another, until one was received without errors or a retry counter was exceeded. Many X.25 switches had to have large and expensive buffers to hold all of these frame copies within the network.
Next, this full error-checking procedure had to be performed on each hop through the X.25 network. A hop is simply a link between adjacent X.25 switching nodes. In the United States, 10 or even 15 X.25 switches could be found on coast-to-coast links. This elaborate error-checking procedure could slow throughput to a crawl.
Each X.25 switch had to examine the X.25 packet inside the X.25 frame to determine the proper output port to send the packet out on toward the correct destination. This meant that each X.25 switch had to assemble the entire packet (which could be spread across several frames), examine the header, and then disassemble the packet into another potential series of frames for the outbound link. Trying to limit packet size to fit in a single frame was not an elegant solution, since this produced that many more packets to be switched by the X.25 network, but this is exactly what was usually done.
The X.25 network also did flow control, which prevented any sender from overwhelming a receiver. This seemingly innocent feature meant that a sender could pump data into the X.25 network at 9,600 bits per second (bps) that was intended for a destination that could only accept data at 1,200 bps.
The X.25 network was forced to buffer this extra data until it could be sent to the destination, again at an added expense for the additional memory needed in each X.25 network node (although the X.25 network could eventually slow the sender). Finally, the fact that the logical connection was located inside the packet itself meant that the full X.25 Packet Layer Protocol (PLP) had to be implemented in every X.25 network node. This layer also had to do error control, but the additional processing power needed to switch the packet correctly only added delay to the network as a whole.
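The buffering cost of that speed mismatch is easy to estimate. The sketch below uses the 9,600 bps and 1,200 bps figures from the text; the function names are hypothetical, and the model deliberately ignores the point at which the network slows the sender, so it gives an upper bound on the backlog rather than a prediction.

```python
def backlog_bits(ingress_bps: int, egress_bps: int, seconds: float) -> float:
    """Bits the network must hold after `seconds` of sustained sending.

    Assumption: the sender is not throttled during this interval.
    """
    surplus = max(0, ingress_bps - egress_bps)
    return surplus * seconds


def seconds_until_full(buffer_bytes: int, ingress_bps: int, egress_bps: int) -> float:
    """Time until a buffer of `buffer_bytes` overflows at these rates."""
    surplus = ingress_bps - egress_bps
    if surplus <= 0:
        return float("inf")  # the receiver keeps up, so the buffer never fills
    return buffer_bytes * 8 / surplus


if __name__ == "__main__":
    # Sender at 9,600 bps, receiver at 1,200 bps, as in the text.
    print(backlog_bits(9600, 1200, seconds=10))
    print(seconds_until_full(4096, 9600, 1200))
```

At these rates the backlog grows by 8,400 bits (1,050 bytes) for every second the sender runs unchecked, which is why flow control inside the network translated directly into memory in every node.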
All these limitations conspired to make X.25 an adequate network technology for modest data networking needs, but wholly inadequate for voice, video, and even the kinds of connectivity needed in the mid-1980s for the new breed of applications that ran on the faster premises networks that appeared at this time: the bursty LAN applications. The preceding discussion of X.25 limitations might seem harsh. But it must be pointed out that X.25 is still a widely used, international standard, packet-switching technology that works very well in any number of network situations. Most telex (teletype or teletypewriter) applications employ X.25 for short text messages, and sometimes even more. Equipment is very cost-effective and comes in 240 port models, surely a sign of a viable technology. There are many features that make X.25 attractive today, beyond simple standards issues, that are hard to find in newer, less mature, networks. These include things like the creation of closed user groups (for public network security), signaling connectivity based on universally unique telephone numbers (no separate network address needed), call redirect and forwarding (for finding mobile users), and other features. It is hard to send a message to an oil platform or other remote location without using X.25 today. In fact, X.25 will be found today wherever the infrastructure to support frame relay is lacking.
Modern, Packet-Switched Data Networks
As interesting as the history of the telegraph might be, the type of data network most relevant to frame relay is the modern, packet-switched data network, sometimes abbreviated PSDN, because the international standard for PSDNs is recommendation X.25 from the International Telecommunications Union (ITU). And X.25 is really the direct parent technology of frame relay. Frame relay has not been called “X.25 on steroids” and “X.25 for the 1990s” for nothing. This section will introduce not only X.25, but also many of the key concepts of modern data networking, such as the Open Systems Interconnection Reference Model (OSI RM) from the International Organization for Standardization, usually (but incorrectly) abbreviated ISO. At the end of this section, not only will the stage be set for introducing frame relay’s distinctive features, but the incentive for doing so will be absolutely clear. It is necessary to fast-forward into the 20th century to do so, not because networking did not evolve in the interim, but because most of the attention was applied to voice, not data, networking.
First, however, a simplified, illustrated, and painless look at the OSI RM is in order. Even those familiar with the OSI RM might still want to read this section carefully, as there are many misunderstandings about just what the OSI RM does and what it is for. And of course, without a thorough knowledge of the OSI RM, most of the interesting features of frame relay are utterly meaningless.
Simply put, the OSI RM is a standard way of bridging the gap between a software program (application) running in the local memory address space of a computer and the hardware connector on the back of the machine. The OSI RM creates the bridge by breaking down this software-to-hardware task into a series of layers, seven in all, that have strictly defined functions that are to be implemented in a standard fashion. The layers of the OSI RM are illustrated in Figure 2.3.
Figure 2.3 The OSI RM.
All the OSI RM does in a computer is to take bits from an application program and send these bits out to a communications port on the back of the computer, and vice versa. What is so complex about this seemingly simple task that it requires seven layers to perform? Most critics of the OSI RM would indeed maintain that the seven-layer OSI RM needlessly complicates what is basically a trivial computing task. However, this view also trivializes the very act of networking to the point where complex networks become impossible to build. To see why networking can be so complex, this section will gradually build up the reasoning behind the structure of the OSI RM, with the ultimate goal of better understanding the layers of the X.25 and frame relay protocol architecture.
At the bottom of the OSI RM is the Physical Layer. This is the hardware. Actually, it is a full description of the connector on the back of the machine, such as the RS-232-C connector. Technically, this is now the EIA-232-D or EIA-232-E connector. The RS-232-C connector is a D-shaped, 25-pin connector that is normally attached by a cable to a modem for communications purposes. Note that the cable, the modem, and any other sending or receiving telecommunications equipment are not really part of the OSI RM. Such components (including multiplexers and so on) are usually referred to as the subphysical layer network. These components must be there, of course, but with a few modem exceptions, the OSI RM does not standardize their form or function. They are covered by other standards and will not be considered further.
The OSI RM Physical Layer specification is divided into four parts: mechanical, electrical, functional, and procedural. The mechanical part specifies the physical size and shape of the connector itself, the number and thickness of the pins, and so forth, so components will plug into each other easily. The electrical specification spells out what values of voltage or current determine whether a pin is active or what exactly represents a 0 or 1 bit. The functional specification determines the function of each pin or lead on the connector (pin 2 is send; pin 3 is receive, etc.). The procedural specification details the sequence of actions that must take place to send or receive bits on the interface to or from a modem or other network device (first pin 8 is activated, then pin 5 goes high, and so on). The RS-232-C interface is a common implementation of the OSI RM Physical Layer that includes all these elements.
So far, the data communications task does not seem too tough. It would be nice if there were a simple way to make the application program send bits to the connector (or connectors, since there may be more than one, e.g., COM1, COM2, etc.). And there is.
All programming languages, from Basic to C++, allow the application programmer to write to a port. That is, there is a simple program statement to send bits for any purpose out the connector. For example, a statement such as “Write (Port 20$, “A”)” will send the internal bit sequence representing the letter “A” out of the send data pin on the connector. But the data communications task is trickier than this. It might not be such a good idea to just send raw bits from an application program to a communications port all in one statement. This method has several problems, some of which are not obvious. First, the statement mentioned will not check to see if there is really a modem attached to the port at all. It executes and completes when the data is delivered to the port. The application itself must check for error conditions—all possible error conditions—and take action in the application program to correct them. Second, there is no way to determine if the data has actually been delivered across the network to the destination system (which is presumably running a receiving application program). Third, since there are different internal bit configurations for the letter “A” (7-bit ASCII and 8-bit EBCDIC, for example), there is no guarantee that even if the bits get through to the destination system that the other machine will understand the delivered bits as the letter “A.” There are other, subtle problems but these are the main ones. Maybe the decision to write to a port was not a wise one. This is exactly the rationale behind the OSI RM and all layered communications protocol architectures like TCP/IP. The OSI RM offers a way to write application programs with statements like “send e-mail” or “get a file” in a networked environment in a standard fashion. These statements are now standard library functions or subprograms that are linked to the application program at compile (and link) time. 
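The third problem above, differing internal bit configurations for the same letter, can be seen in a few lines of Python. This is only an illustrative sketch: it uses Python's built-in `ascii` and `cp037` codecs, and treating `cp037` as representative of EBCDIC is an assumption made here, since EBCDIC exists in several code pages.

```python
# Illustrative sketch: the letter "A" has different bit patterns in
# ASCII and in EBCDIC. cp037 is one EBCDIC code page, chosen here as
# an assumption for illustration.
letter = "A"

ascii_bits = letter.encode("ascii")    # one byte, value 0x41
ebcdic_bits = letter.encode("cp037")   # one byte, value 0xC1

print(f"ASCII : {ascii_bits[0]:08b}")
print(f"EBCDIC: {ebcdic_bits[0]:08b}")

# A receiver that assumes an ASCII-based code (Latin-1 here) does not
# recover the letter "A" from the EBCDIC bits.
misread = ebcdic_bits.decode("latin-1")
print(repr(misread))
```

The raw bits arrive intact in both cases; what is missing is an agreement between sender and receiver about what the bits mean.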
This saves a lot of time and effort in the network program development cycle. But what are all those other layers for? The answer involves understanding how modern data communications networks were built in the 1990s. Consider a very simple network as shown in Figure 2.4. The figure shows two systems—A and B—connected at their physical ports by a cable. The systems may be two feet, 200 miles, or even 2,000 miles apart. There may be modems and multiplexers (muxes), and all manner of sub-physical layer network devices in between; it makes no difference to the OSI RM. The only important thing is that when bits are sent on the port on System A, they must show up at System B and whatever bits are sent out of the port on System B, end up at System A.
Figure 2.4 A very simple network.
However, bits are just bits. There should be a way for System A to tell System B: “Here come some bits, here they are, and this is the end of the bits.” In other words, the unstructured bit stream is organized into a data unit known as a frame. Of course this task (frame organization and interpretation) is much too difficult for the Physical Layer to handle in addition to its assigned tasks, and it is not part of the Physical Layer specification. So the ISO committee invented a layer above the Physical Layer just to send and receive frames: Layer 2 of the OSI RM, the Data Link Layer. These frames are officially known as Layer 2 Protocol Data Units (L2-PDUs) in OSI RM language.
The frame structures of all Layer 2 protocols (the official OSI RM Layer 2 protocol is known as HDLC [High-level Data Link Control]) have many features in common. The frames all have a header, a body, and a trailer. On LANs, the frame header usually contains at least a source and destination address (known as the physical address or port address since it refers to the physical communication port), although this is absent in HDLC. There is also some control information. The control information is data passed from one Layer 2 to another Layer 2 and not data originating from a user. The body contains the sequence of bits being transferred across the network from System A to System B. The trailer usually contains at least some information used in detecting bit errors (such as a Cyclic Redundancy Check [CRC]). There is always some maximum size associated with the frame that the entire unit must not exceed (because all systems must allocate space for the data in memory). So the task of the Data Link Layer in the OSI RM is to transfer frames from system to system across the network. The network in the original OSI RM must consist of point-to-point links between adjacent systems.
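The header, body, and trailer arrangement just described can be sketched in Python. This is a minimal illustration, not HDLC or any real Layer 2 protocol: the one-byte address and control fields, the `MAX_BODY` limit, and the use of CRC-32 from the standard `zlib` module are all assumptions chosen for brevity.

```python
import struct
import zlib

MAX_BODY = 1024  # hypothetical maximum body size, in bytes


def build_frame(dst: int, src: int, control: int, body: bytes) -> bytes:
    """Return header (dst, src, control) + body + trailer (CRC-32)."""
    if len(body) > MAX_BODY:
        raise ValueError("body exceeds the maximum frame size")
    header = struct.pack("!BBB", dst, src, control)
    trailer = struct.pack("!I", zlib.crc32(header + body))
    return header + body + trailer


def parse_frame(frame: bytes):
    """Return (dst, src, control, body); raise ValueError on a bit error."""
    if len(frame) < 7:  # 3 header bytes + 4 trailer bytes
        raise ValueError("frame too short")
    header, body, trailer = frame[:3], frame[3:-4], frame[-4:]
    (received_crc,) = struct.unpack("!I", trailer)
    if zlib.crc32(header + body) != received_crc:
        raise ValueError("CRC mismatch: frame damaged in transit")
    dst, src, control = struct.unpack("!BBB", header)
    return dst, src, control, body


frame = build_frame(dst=2, src=1, control=0, body=b"hello")
print(parse_frame(frame))
```

Flipping any single bit of `frame` before calling `parse_frame` causes the CRC check to fail, which is the error-detection role the trailer plays.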
Actually, the OSI RM allows what are known as multidrop or multipoint links also, but these are seldom seen today except in older networks and do not change the main points of the discussion. The only impact of these multipoint links is that the Data Link Layer (DLL) allows 256 systems on a single link. Obviously, in the figure whatever bits are sent out the port on System A arrive at System B, and whatever bits show up on the port on System B must have come from System A. The frame source and destination address in this instance are not even needed. The situation changes in the network shown in Figure 2.5. There are now two point-to-point links in the network and System C has been added to the network. System B now has two communications ports. System A and System C are now nonadjacent systems (i.e., they are not connected directly by a single one-hop point-to-point link). The question is: Can System A send a frame to System C? If not, then the network needs a direct link from System A to System C, and a large network will quickly become hopelessly complex. If so, then it is by no means obvious that the Data Link Layer is capable of doing this. In fact, the original definition of the Data Link Layer requires adjacent systems.
Figure 2.5 A more complex network. In the network illustrated in the figure, System B plays a critical role. System B can no longer assume that all frames arriving from System A are destined for System B. Some will obviously be for System C. System B will have to send these frames out the opposite port and on to System C. In the OSI RM, Systems A and C are now End Systems (ES) and System B is now an Intermediate System (IS). In 1979, when the OSI RM was developed, these systems were envisioned to be multiuser systems one and all, with many terminals attached to the host computer. As it turned out, this was asking a lot of System B. In many cases, there was simply not enough computing power to efficiently handle multiple communications links and many user terminals at the same time. The solution was to dedicate System B exclusively to the networking task. That is, there were no users on System B. System B merely took frames from an input port and determined the proper output port to resend the frames on. System B became a network node in modern language. There was another problem with System A sending frames to System C on a network as simple as this one. Recall that frames contain the physical or port address of the source and destination. Could System A really be expected to know the port address of System C? What if it changed? And could System B be expected to know the proper links to send frames out to all possible destination port addresses in a large network? Probably not. The OSI RM addressed this issue as before: The ISO added another layer on top of the Data Link Layer: Layer 3, the Network Layer.
The name Network Layer is a little confusing. The original name for this layer was the Routing Layer, since it addressed the need to route data through the network from a source to a destination. While this describes the layer’s main function, routing is not all that Layer 3 does. So the name was changed to reflect this reality. The Network Layer does not use the frame address to determine the destination for data. This may seem surprising, but the problem was that the physical address gave no indication of location on the network. Physical address “2125551212” could be anywhere in the world. It would be nicer if the address used by the Network Layer was similar to a telephone number: Anything starting with “212” was in Manhattan, for example. System B would route the data addressed to anything starting with “212” to Manhattan and let other systems in New York worry about just where 555-1212 was. System B now becomes a router. But in the tradition of public data network terminology, this network node is called a switch when it is used on a public carrier’s data network. The exception is the Internet, where all public network nodes are called routers.
The ISO addressed this physical address problem by inventing a network address for Layer 3 (actually, the OSI RM calls it a Network Service Access Point [NSAP], but it is used as a network address). Every system in the network has a network address, whether it is an end system or an intermediate system. Systems could have many physical or port addresses, but still needed only one network address in most cases. The routing function of System B means simply this: The Data Link Layer on Port 1 of System B receives a frame from System A (which has System B’s Port 1 address as a destination). Inside the body of that frame is yet another data unit, a Layer 3 Protocol Data Unit (L3-PDU) known as a packet (in OSI RM) or datagram (in TCP/IP). This PDU has a header and body, but no trailer.
In the header is the source and destination network address (System A and System C). System B looks the destination address up in a table (known, not surprisingly, as the routing table), and finds out which output port to forward the PDU out on. System B then puts the packet or datagram inside another frame (with System C’s physical address) and sends it to System C. The situation is now as it appears in Figure 2.6, with the layers of the OSI RM filled in below the network nodes and the sub-physical network indicated by simple links. (If the systems have applications running on them, more layers are needed.)
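The forwarding step performed by System B can be sketched as follows. The table contents, port numbers, and dictionary-based packet and frame layouts are hypothetical, and choosing the longest matching prefix is an assumption borrowed from common routing practice; the text itself only says that the destination address is looked up in a routing table.

```python
# Hypothetical routing table for System B:
# network-address prefix -> output port.
ROUTING_TABLE = {
    "212": 2,      # anything starting with "212" leaves on port 2
    "212555": 3,   # a more specific entry for one exchange
    "609": 1,
}

# Hypothetical physical (port) address of the neighbor on each port.
NEIGHBOR_ON_PORT = {1: "port-addr-A", 2: "port-addr-C", 3: "port-addr-D"}


def lookup_port(destination: str, table: dict) -> int:
    """Return the output port for the longest prefix matching destination."""
    best = None
    for prefix in table:
        if destination.startswith(prefix):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no route to {destination}")
    return table[best]


def forward(packet: dict, table: dict, neighbors: dict) -> dict:
    """Place the unchanged packet inside a new frame for the next hop."""
    port = lookup_port(packet["dst"], table)
    return {"port": port, "frame_dst": neighbors[port], "body": packet}


packet = {"src": "6095550100", "dst": "2127770199", "body": b"data"}
print(forward(packet, ROUTING_TABLE, NEIGHBOR_ON_PORT))
```

Note that the packet's network addresses are untouched from end to end; only the frame around it, with its physical address, changes at each hop.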
Figure 2.6 Layers in a data network. There is one more layer needed to get data across even this simple network. Something on the end systems had to handle the interface between the network and the application software so that long files were broken up to be sent across the network, electronic mail was sent to the proper network address, and so on. This function was not needed on the Intermediate Systems (no one sent e-mail to a router), but only on the End Systems where traffic originated and terminated. Of course, the ISO created another layer to do this: Layer 4, the Transport Layer (originally called the End-to-End Layer). This layer prepares messages (Layer 4 PDUs) for transport across the network. The other three layers, 5 through 7, have little relevance to X.25 and even less so for frame relay, but their functions as envisioned in the OSI RM should be outlined. It is important to realize that these layers are never implemented separately, but are always bundled in a single library function, which is essentially what the Internet protocol suite (TCP/IP) does. The Session Layer (Layer 5) contains what are known as state variables about a connection (session) on the network. For example, the Session Layer would know that 3 of 4 files intended to be transferred across the network have been sent successfully, and the network failed (the session terminated) halfway through the last file. The Session Layer would only send the rest of the last file when the network came back up again. This is essentially a way of keeping track of the history of a connection.
The Presentation Layer (Layer 6) takes care of all differences in internal data representation (e.g., 7-bit ASCII and 8-bit EBCDIC) between different systems. It does so by translating all data into a common representation (known as Abstract Syntax Notation [ASN]) and sending this ASN across the network where the receiving system translates it back to the proper representation for the destination computer. A similar function occurs when a native Englishman and German converse freely in French, which both happen to know. The Application Layer (Layer 7) is really misnamed. It is not really a layer at all and there are no application programs. Rather, there are various compartments containing Application Program Interface (API) verbs that are appropriate for network tasks. For example, “Send e-mail” and “Get a file” are separate Application Layer APIs that may be used in the application program (which runs above the OSI RM) and are linked to the application program before it is run. Of course, there is no need to include the e-mail compartment in a file transfer application program. All modern network protocols are implemented in layers, whether OSI RM layers or not. This simplifies the overall networking task and releases the application programmer from the chore of writing a complete network implementation in each and every application program. This layered approach is followed by X.25 and frame relay. In a layered protocol, there are usually two or more possibilities (or options) that a network designer may choose to implement at several layers. When one of these possibilities is actually chosen at each layer between the application program and the network, this forms a protocol stack, a term which precisely reflects the layered nature of the protocol (the layers are stacked one on top of another).
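The way the lower layers of a protocol stack nest inside one another can be sketched in a few lines of Python. This is a toy model under stated assumptions: the function names, the dictionary layouts, and the CRC-32 check value are illustrative only and do not correspond to X.25, frame relay, or any real implementation.

```python
import zlib


def transport_send(data: bytes, max_len: int) -> list:
    """Layer 4: break a long message into pieces the network can carry."""
    return [data[i:i + max_len] for i in range(0, len(data), max_len)]


def network_send(segment: bytes, src: str, dst: str) -> dict:
    """Layer 3: packet with a header and body, but no trailer."""
    return {"header": {"src": src, "dst": dst}, "body": segment}


def datalink_send(packet: dict, next_hop: str) -> dict:
    """Layer 2: frame with a header, a body (the packet), and a trailer."""
    return {
        "header": {"dst": next_hop},
        "body": packet,
        "trailer": {"crc": zlib.crc32(packet["body"])},
    }


message = b"a long file that will not fit in one packet"
frames = [
    datalink_send(network_send(seg, "A", "C"), next_hop="B")
    for seg in transport_send(message, max_len=16)
]

# The receiving end system unwraps in the opposite order.
reassembled = b"".join(frame["body"]["body"] for frame in frames)
print(len(frames), reassembled == message)
```

Each function knows only about its own layer's header and treats everything handed down from above as an opaque body, which is what lets one layer's option be swapped without rewriting the others.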
Early Public Data Networks
The first public network built in the United States for telecommunications purposes was not a voice network at all. It was a data network: the telegraph network. It was built on the principles of Samuel Finley Breese Morse, but it was not even the first public telecommunications network in the world. The first national telecommunications networks were built in Europe during the late 1700s and early 1800s. These were true data networks, commonly known as optical telegraph networks. These optical telegraphs were sophisticated semaphore systems capable of sending messages across hundreds of kilometers in an hour or so. The message speeds were limited by the need to relay the messages from tower to tower along the message’s route and the complexity of the encoded message. But these systems were much faster than any other form of communications available. The most elaborate systems were built in France and Sweden. These networks could take four minutes to transmit a message such as, “If you succeed, you will bask in glory.” The messages were sent as a series of numbers which had to be looked up in code books and written down. By 1799, the code books had grown to three volumes with 25,392 entries. This clearly pointed out the need for a system that was based not on codes but on alphabetic representations, but this possibility was never explored. Besides, the use of code numbers provided a measure of security for what was essentially a broadcast medium. By 1800, the maximum speed attainable for a message was about 20 characters per minute, or 2.67 bps (bits per second) in modern terms.
Several important concepts and advances came out of these first public data networks. The idea of compressing information (the code books) was proven to be a vital and viable concept. The whole area of error recovery and flow control (a sender must never overwhelm a receiver) was pioneered in these early systems.
And the concept of encrypting sensitive information was first used on a large and systematic scale on these networks. The first practical electrical telegraphs merely translated the codings of the optical systems to a new medium. The semaphore towers were replaced by pole-mounted strands of copper and iron (cheaper and stronger) wire. As soon as electricity was shown to be a predictable physical entity by scientists such as Michael Faraday, engineers began working on schemes to use it to send messages over the wires. One important side effect of this activity was the exposure of people to this new technology. In 1824, a New York University art professor named Samuel Finley Breese Morse attended a lecture on electromagnetism, which set his mind in motion. The limitations of communication over distance were made painfully obvious to Morse in the following year. His wife died when he was out of town and it took days for him to learn of it. By 1837, his ideas had reached the patentable stage. He had strung 1700 feet of wire around his room at NYU. That same year, he staged a public demonstration of his device. Morse had grappled with the code book problem. His associate and assistant, Alfred Vail, soon hit upon an ideal solution. Instead of transferring coded letters and numbers, which had to be looked up in voluminous code books, Vail represented simple text by means of dots and dashes, where a dash was defined as three times the duration of a dot. A spool of paper at the receiver printed out the dots and dashes as they were sent. These dots and dashes can be easily thought of as the 0s and 1s of modern binary codes.
In 1838, Morse demonstrated a working telegraph to a Congressional committee in the Capitol building in Washington, D.C. By this time, the telegraph was working over 10 miles of wire filling the room. After some delay and bickering, including a seriously proposed amendment to fund the study of hypnotism and the possibility of the end of the world in 1844, Congress approved $30,000 to run a telegraph line between Washington, D.C. and Baltimore, Maryland in March of 1843. This 40-mile run would be a true test of the technology’s capabilities. The first official telegram was sent on May 24, 1844 between Vail at the Baltimore and Ohio Mount Clare railroad station and Morse in the Supreme Court chamber of the Capitol. The famous message “What hath God wrought?” was not a Morse inspiration. To prevent possible collusion, the assembled dignitaries in Washington decided to go along with Morse’s suggestion that a spur-of-the-moment message be sent and returned. The expression was selected by Annie Ellsworth, daughter of a government official who was a longtime friend of Morse. Vail immediately echoed it back and the witnesses cheered. By May of 1845, he had extended the line to Philadelphia.
It cost about $50 per mile to build a telegraph line, so expanding service was not an enormous burden. Rates remained based on message size. In England, by contrast, two networks had been set up by September, 1847. The rate structure of the Electric Telegraph Company was based on distance, but this proved too expensive for most potential customers. By 1850, a maximum rate of 10 shillings was imposed; this was dropped to 2 shillings by 1860. This whole distance-sensitive versus flat-rate pricing issue comes up again and again in networking. Frame relay pricing is typically distance-insensitive. Message transfer remained slow, mostly due to the laborious task of interpreting the paper tape dots and dashes into letters and words.
In 1848, a 15-year-old boy in Louisville, Kentucky, became a celebrity of sorts when he demonstrated the odd ability to interpret Morse Code directly by ear alone. Soon this became common and speeds of 25 to 30 words per minute were achievable. Figuring a 5-character word, this rate of almost 20 “bits” per second is impressive for its day. It compares very favorably to the 2.67 bits per second rate of optical telegraphs. By 1858, newer mechanical senders and receivers boosted the rate on the telegraph lines up to 267 bits per second. Data compression was used on the telegraph lines as well. There was no systematic code use, but an ad hoc abbreviated writing taken from the newspaper industry was widely used. It was known as Phillips code after the Associated Press’ Walter P. Phillips. Operators could tap out “Wr u ben?” for “Where have you been?” and even “gx” for “great excitement.” The code was only used internally and customers were still charged by the word. The success of the telegraph spawned a whole new kind of business as well. In 1886, a young telegraph operator named Richard Sears took possession of a shipment of watches refused by a local jeweler. Using his telegraph, he soon sold them all to fellow operators and railroad employees. In six months, he had made $5,000, quit his job, and founded the company that later became Sears, Roebuck, and Company. A killer application for the telegraph had been found. This was the national network in the 1870s; an all-digital, unchannelized, public data network that the public used to sell goods as well as to communicate.
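The telegraph bit rates quoted in this section can be checked with simple arithmetic. The sketch below assumes 8 bits per character; that figure is not stated in the text but is inferred from its own numbers, since 20 characters per minute only equals 2.67 bps at 8 bits per character.

```python
BITS_PER_CHAR = 8    # assumption inferred from the figures in the text
CHARS_PER_WORD = 5   # the 5-character word used in the text


def bits_per_second(chars_per_minute: float) -> float:
    """Convert a character rate per minute into bits per second."""
    return chars_per_minute * BITS_PER_CHAR / 60


optical = bits_per_second(20)                       # optical telegraph
by_ear_low = bits_per_second(25 * CHARS_PER_WORD)   # 25 words per minute
by_ear_high = bits_per_second(30 * CHARS_PER_WORD)  # 30 words per minute

print(f"optical telegraph: {optical:.2f} bps")
print(f"Morse by ear: {by_ear_low:.1f} to {by_ear_high:.1f} bps")
```

Under the same assumption, the upper end of the by-ear range works out to 20 bps, in line with the “almost 20” quoted for operators working by ear.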
The Public Voice Network in the United States
The national AT&T telephone network in the United States, usually called the Bell system, was regulated as a unit by the states and the federal government from 1913 until 1984. Each state regulated telephone service quality and rate structure for calls that were initiated and terminated within the boundaries of the individual state. For interstate telephone calls, where one end of the call was within one state and the other end of the call was in another, regulation was handled by the federal government. Before 1934, this was done by the Interstate Commerce Commission. But after the Communications Act of 1934 was passed, control and regulation passed into the hands of the Federal Communications Commission (FCC).
In 1984, as a result of a decades-long battle between the FCC, Department of Justice, and AT&T, and with the new competitive long-distance companies such as MCI joining in, a federal judge and the United States Department of Justice effectively split up the Bell system into AT&T Long Lines and seven newly organized Regional Bell Operating Companies (RBOCs). The local independents more or less continued as they were, but there were major changes in how the RBOCs and independents handled long-distance calls. The RBOCs could still carry local calls end-to-end on their own facilities. For long-distance calls, however, the RBOCs had to hand off the call to a long-distance carrier, which could not be an RBOC. Furthermore, the RBOCs and independents had to let their subscribers use not only AT&T for long-distance service, but also any of the competitive long-distance carriers such as Sprint and MCI. In fact, any long-distance carrier that was approved by the FCC could offer long-distance services in any local service area if it had a switching office close enough. There was no firm definition at the time of just what a “local call” was, or what “close enough” was, so the court and Department of Justice provided one.
The entire United States was divided into about 240 areas, each with about the same number of calls within it. Calls inside these areas, known as Local Access and Transport Areas (LATAs), could be carried on facilities wholly owned by the RBOCs. All calls that crossed a LATA boundary had to be handed off to the long-distance companies, which were now called Interexchange Carriers (IXCs, or sometimes IECs). The local companies, RBOCs and independents alike, were collectively the Local Exchange Carriers (LECs). This whole structure neatly corresponded to the two-tier, local and long-distance structure already in place. In order to carry long-distance traffic from a LEC, the IXC had to maintain a switching office within the LATA. This switching office was called the IXC Point of Presence (POP). The POPs formed the interface between the LECs at each end of the long-distance call, and the IXC switching and trunking network in between.
For the most part, LATAs were contained within a single state, but there were exceptions. Under a rule called equal access, any subscriber served by a LEC had to be able to route calls through the IXC of their choice, as long as the IXC maintained a POP within the originating LATA. If the chosen IXC did not have a POP in the destination LATA, the IXC could decline to carry the call (rarely), or hand the call off in turn to another IXC with a POP in the destination LATA. Naturally, the second IXC charged the first for this privilege. It soon became apparent that there were just too many LATAs anyway; as late as 1993, only AT&T had a POP in every LATA in the United States. But the system was in place and cynics noted that the LATA structure closely mirrored AT&T Long Lines switching office distribution. With the breakup of the Bell system in 1984, it became common to speak of the entire system of telephones and switches in the United States as the PSTN.
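The hand-off rules described above can be summarized in a short Python sketch. Everything named here is hypothetical: the function, the carrier names, the LATA numbers, and the POP map are invented for illustration, and the sketch ignores details such as an IXC declining a call or how the two IXCs interconnect.

```python
def route_call(orig_lata, dest_lata, chosen_ixc, pops):
    """Return the carriers that handle a call, in order.

    pops maps each IXC name to the set of LATAs where it has a POP.
    """
    if orig_lata == dest_lata:
        return ["LEC"]  # the call stays on the LEC's own facilities
    if orig_lata not in pops.get(chosen_ixc, set()):
        raise ValueError("chosen IXC has no POP in the originating LATA")
    if dest_lata in pops[chosen_ixc]:
        return ["LEC", chosen_ixc, "LEC"]
    # No POP at the far end: hand off to another IXC that has one.
    for other, latas in pops.items():
        if other != chosen_ixc and dest_lata in latas:
            return ["LEC", chosen_ixc, other, "LEC"]
    raise ValueError("no IXC has a POP in the destination LATA")


# Hypothetical POP map: IXC A serves LATAs 1 and 2, IXC B serves 2 and 3.
POPS = {"IXC A": {1, 2}, "IXC B": {2, 3}}

print(route_call(1, 1, "IXC A", POPS))
print(route_call(1, 2, "IXC A", POPS))
print(route_call(1, 3, "IXC A", POPS))
```

A subscriber in LATA 1 could not select IXC B at all in this example, because equal access only applies to IXCs with a POP in the originating LATA.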
One other point should be made here. In many cases, the practice developed of running trunks not directly to other local exchanges (although this practice also continued based on calling patterns), but to a more centrally located local exchange. Usually, this local exchange received a second switch, but one which only switched from trunk-to-trunk, and not from loop-to-loop or loop-to-trunk. These trunk switching offices were called tandems, and the practice of switching trunks without any loops was said to take place at a toll office or tandem office. Usually, a call routed through a toll office was a toll call. The term toll call is exactly analogous to the term toll road. A toll road is just a road, but it costs more to drive on it, above and beyond the road use taxes assessed against drivers. In the same fashion, a toll call is just a telephone call, but it costs more to make it, above and beyond whatever the subscriber pays for local service. The amount of the toll usually depended on distance and duration of the call. Keep in mind that these calls were distinct from long-distance calls, which crossed a LATA boundary. A toll call stayed in the same LATA, but cost more (there were a few odd LATA arrangements, but these need not be of concern in this general discussion). Also, the tandem-toll office arrangement offered a convenient way for IXCs to attach POPs to the LECs’ networks. Instead of running trunks from a POP to each and every local exchange, an IXC could simply link to the area’s tandem or toll office. Since the tandem or toll office existed to tie all of the local loops in the area together, this guaranteed that all subscribers would be able to make interLATA calls through that IXC’s POP, at least on the originating end. From the IXC perspective, this preferred point of trunk connectivity was called the serving wire center, since the POP was served from this switching office. Again, this term was used from the IXC perspective. 
To the LECs, a wire center was just a big cabling rack in the local exchange (called a distribution frame) where trunks and loops connected to the switching office. In other words, a wire center is nothing special to the LEC, but is quite important to the IXC. Many IXCs maintain trunks to several wire centers in a LATA, all in the name of efficiency and to cut down on the number of links needed.
Today, the PSTN has a structure similar to the one shown in Figure 2.2. The local exchanges (LEs, also called central offices [COs]) and toll offices inside the LATA make up the first tier of the PSTN, the LEC portion. Since the Telecommunications Act of 1996, service providers may be any entity approved or certified by the individual states to become a LEC. Newer companies are Competitive LECs (CLECs) and the former service provider in a given area becomes the Incumbent LEC (ILEC). Terms such as Other LEC (OLEC) are sometimes used as well. ILECs, CLECs, OLECs, or some other exotic alphabet combination may still be RBOCs, independents, ISPs, or even cable TV and power companies in various parts of the United States today. There are some 1,300 LECs operating in the United States today, but many of these LECs are quite small, with only a few thousand subscribers, and are located in quite isolated areas.
Figure 2.2 The PSTN today.
The second tier of the PSTN comprises the IXCs’ networks. The IXC POP in the LATA could handle long-distance calls for all subscribers in the LATA. The IXC had to have its own backbone network of switches and links as well. The acknowledged leaders in this arena are AT&T, MCI, and Sprint. Sprint remained an oddity for a while because it is also a LEC in some parts of the country, a rare mix of local and long-distance services. There are some 700 IXCs in operation in the United States today, but most of them have POPs in only a handful of LATAs. Many will still carry calls that originate within a LATA where they have a POP to almost any destination, but they frequently hand the call off to another IXC to do so.
A few points about Figure 2.2 should be emphasized. All lines shown on the figure are trunks, not local loops, since they connect switches rather than user devices to switches. Although shown as a single line, each trunk may carry more than one voice conversation or channel. In fact, many can carry hundreds or even thousands of simultaneous voice conversations. Also, IXC B, since it does not have a POP in the leftmost LATA, cannot carry long-distance calls from anyone in that LATA. The same is true for IXC A in the rightmost LATA. In the center LATA, customers will be able to choose either IXC A or IXC B for long-distance calls, either under equal access arrangements or by presubscription. Presubscription automatically sends all calls outside the LATA to a particular IXC. Recently, the practice of deceptive IXC presubscription switching known as slamming has been universally condemned by regulators and most IXCs alike. As a minor point, note that a POP need not be linked to a toll office switch. The actual trunking of the POP all depends on calling traffic patterns, expense, and other factors.
At the risk of causing confusion, it should be pointed out that LEC B and LEC C, for instance, could both exist within the same LATA, depending on state ruling and certification. In this case, one would be the ILEC and the other the CLEC (or OLEC), and both would compete for the same customer pool within the LATA for local service. Finally, there is no significance at all to the number and location of LEs, POPs, and so on, nor the links between them. The figure is for illustrative purposes only. The PSTN, at both the LEC level and the IXC level, is what is known as a circuit-switched or circuit-switching network. Much more will be said about circuit-switching in the next section in order to compare and contrast this practice with packet-switching. But any discussion of the current PSTN architecture would be incomplete without introducing one of its most distinctive features.
What has all this to do with frame relay? Just this: When considering frame relay services, an organization must be aware of whether the service being evaluated is proposed by a service provider that can carry the frame relay service outside the LATA if required. While there are ways for frame relay service providers that are regulated LECs to cross LATAs with frame relay services, the organization might be better served by having a national frame relay service provider rather than a LEC. It should further be noted that there are always regulatory plans for allowing LECs to offer interLATA services, especially for advanced data services like frame relay.
Public and Private Networks Privacy is a vital issue in all aspects of life. There is private property and public property. Different rules of behavior and different individual rights apply to each. Very often an important rule of law revolves around the key issue of whether an act was performed in private or in public. The concept of ownership is critical in determining if public or private rules of conduct apply. For instance, if the property is a private home, privacy is expected and certain behaviors accepted. However, if the property is a park, then privacy is never assumed and other rules of conduct apply. But who owns a network? The answer is not as straightforward as it first seems. Consider a simple scenario with two PCs in homes linked by dialup modems over the Internet. Is the resulting “network” public or private? Clearly, the PCs are privately owned by individuals and so are the modems. But the local access lines (local loops) remain the property of the local “telephone company.” (The quotes reflect the fact that local service providers are delivering more than simple telephony today, and in a short while the majority of local access lines might terminate at nontelephone devices.) The Internet is a global public network owned by everyone and no one. In this case, two PCs networked together are a mix of private and public property. Yet no one would hesitate to label this scenario as an instance of a public networking environment. But exactly what is it that makes this a public network and not a private one? The answer, quite simply, depends on who owns the network nodes. While simple to state, the answer is actually a little more complex. What exactly, for instance, is a network node?
Common Network Characteristics The nice thing about talking about networks, whether voice or video or data, is that all networks look pretty much the same. So all networks share certain structural and architectural characteristics that make them appear quite similar to each other. This is not to say there are not significant differences in the function and operation of voice, data, and video networks. There are obviously. But all networks share characteristics that make them networks in the first place. Every network discussion eventually represents the network as an ill-defined cloud that user devices access. This “network cloud concept” was first introduced for very good reasons in the 1970s when public X.25 packet-switched networks first challenged private line networks in the United States. X.25 eventually lost, but the failure arguably paved the way for the success of frame relay. In any case, there were no details about the functioning of network components inside the cloud, which was why it appeared as a cloud in the first place. The philosophy was that users did not have to concern themselves with the inner workings of the network. All the users needed to worry about was whether all other users that they cared about were reachable through the cloud. All networks are simple in overall structure, even inside the cloud that hides their inner workings. However, each major network has its own set of terms and acronyms for network components that turn out to do much the same thing. Consider, for example, the general network shown in Figure 2.1. Some of the details of just what is inside the cloud are presented in the figure also.
Figure 2.1 A network The figure shows that inside the cloud there are devices known as network nodes. Outside of the cloud, there are other devices usually called user devices or end devices. This is how the boundaries of the cloud are determined. Network nodes go inside the cloud and user devices belong outside the cloud. Usually, but not always, a user device links to one and only one network node. Network nodes, on the other hand, may have multiple links to other network nodes, but again not universally. So another way to draw the cloud is with devices with only one link outside the cloud and devices with more than one link inside the cloud. The user devices are linked to network nodes by a communications link known as the User-Network Interface (UNI). The UNI link can run at a variety of speeds, is supported on a number of different transmission media from coaxial cable to copper to fiber, and can be used up to some standardized or designed distance. Network nodes link to each other by a link known as a Network-Network Interface (NNI). These links vary by speed, media, and distance also. There is only one exception to this general network diagram. Older local area networks (LANs) did not conform to this generic wide area network (WAN) structure of user device linked to network node. But since the early 1990s, most newer LANs do indeed look exactly like this, although the entire cloud may only encompass a single building or even a single floor. In most LANs, the network nodes are the hubs (which may be linked together) and the user devices are PCs. Whether LAN or WAN, a key aspect of networks is that all users can be reached through the network. Different networks vary in the use of different hardware and software required by users for access, the type of network nodes, and what the UNIs and NNIs are called, but most of the differences would be lost on those not intimately familiar with the various protocols.
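The rule of thumb for drawing the cloud (one link puts a device outside, more than one link puts it inside) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the topology and device names below are hypothetical, and the rule is the "usually, but not always" heuristic from the text, not a formal definition.

```python
from collections import defaultdict

# Hypothetical topology: each pair is one physical link between two devices.
LINKS = [
    ("pc1", "n1"), ("pc2", "n1"), ("pc3", "n3"), ("pc4", "n2"),
    ("n1", "n2"), ("n2", "n3"), ("n1", "n3"),
]

def classify_devices(links):
    """Apply the heuristic: one link = user device, more than one = network node."""
    degree = defaultdict(int)
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    nodes = sorted(d for d, n in degree.items() if n > 1)
    users = sorted(d for d, n in degree.items() if n == 1)
    return nodes, users

def classify_links(links):
    """Label a link NNI if both ends are network nodes, otherwise UNI."""
    nodes, _ = classify_devices(links)
    return {
        (a, b): "NNI" if a in nodes and b in nodes else "UNI"
        for a, b in links
    }

nodes, users = classify_devices(LINKS)
print("inside the cloud:", nodes)
print("outside the cloud:", users)
print(classify_links(LINKS))
```

A user device with two access links would be misclassified by this heuristic, which is exactly the kind of exception the text allows for.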
Note that user devices do not communicate with the network nodes directly at the application level, but rather with other end users. Also, some of the network nodes do not link directly to users, but only to other network nodes. It is common to call network nodes with user interfaces edge or access devices and network nodes with links only to other network nodes backbone devices or nodes, but this is only by convention. Network nodes of all types take traffic from an input port, determine where it goes next through some rule or set of rules, then put the traffic onto an output port queue. Now, the easiest way to determine where the traffic goes is to look up the destination in a table maintained and updated in the network node itself. There are many variations of this theme, but these variations are well beyond the scope of this discussion. Up to this point the terminology presented has mostly been that of data networks. But both UNIs and NNIs are usually leased private telephone lines (there are variations here as well). However, this figure can also be used to represent the public switched telephone network (PSTN) used around the world today for general voice and dialup PC Internet connectivity. The user devices would be telephones, fax machines, or computers with modems in the PSTN, which all fall into the category of Customer Premises Equipment (CPE). In the United States, the form that the CPE might take is completely up to the customer. Any approved device from any manufacturer may be used, as long as it conforms with some basic electrical guidelines. The CPE in the United States is owned and operated by the user, and is beyond the direct control of the network service provider. (This does not mean that a service provider cannot furnish the CPE for free as part of a service, but the customer is also free to reject such offers.)
In other countries, the CPE can be provided and owned by the service provider under a strict set of regulations, although deregulation is helping other users to gain more control over the CPE. In the PSTN itself, the network node is a voice switch. A switch is a type of network node that functions in a particular way in terms of how traffic makes its way from an input port to an output port, the way the table is maintained, and so on. At least that is how most network people see it. There is a continuing controversy over exactly what is meant by the term switch, especially between the voice and data networking communities. The controversy extends to the companion term router, which is another type of network node, and the type usually found in the Internet. To data network people, a switch is more or less a fast router.
This is not the place to discuss the merits of switching or routing. It is enough to point out that a router is a type of network node found on the Internet that performs networking tasks differently than a switch, which is the basic network node type of the PSTN. Because of the use of the term switch applied to the PSTN, the network node on most public networks, both voice and data, is traditionally called a switch. This applies to frame relay, of course. In the voice networks, instead of the edge and backbone structure of data network nodes seen in frame relay networks, the PSTN uses terms such as local exchange and toll exchange or long distance to distinguish voice switches having links to users or not. All users link to a local exchange, usually called a central office (CO) in the United States. The local exchanges link to toll offices or tandems, or long distance switches, with the actual terms varying depending on the detailed structure of that portion of the PSTN. In the PSTN, the UNI is now the access line or local loop. Some people reserve the term line for digitized loops where analog voice is represented as a stream of bits while others reserve the term loop for purely analog user interfaces. Whether it is called a loop or line, the user interface is not normally a leased, point-to-point, private line, as in a frame relay or other data network. Rather the local loop supports a switched, dialup connection that is capable of reaching out and touching almost every other telephone in the world by dialing a simple telephone number. One such destination may be the local Internet service provider if the loop or line is connected to a PC with a modem or other specialized network interface device. Naturally, the ISP needs to be connected to a PSTN local exchange also for this to happen at all. This is how PC users can use the PSTN to access the Internet. In the PSTN, the network node interface becomes the trunk. 
There is little to no physical difference between loops or lines and trunks in the PSTN. The difference is in how the physical facilities are used. Lines and loops are used to connect users to the network. Trunks connect network nodes (voice switches) to each other. Trunks are typically high-speed links for one main reason. The reason is that trunks must aggregate a lot of traffic from thousands of users and ship it around the network to other network nodes efficiently. The same is true for NNIs in general for the same reason and frame relay is no exception. In fact, it is the way in which frame relay performs this traffic aggregation, with dynamic bandwidth allocation, that makes frame relay so attractive in the first place. Before taking a closer look at private networks, it will help to summarize the differences in network terminology not only between the PSTN and frame relay, but among many new technologies in general. These differences are shown in Table 2.1. The table adds a few terms to the commonly used terms for the PSTN and the Internet. X.25, frame relay, and ATM all fall into the category of packet-switching data networks, just like the Internet. Their network nodes are switches, not routers. However, as will be shown shortly, a router can be a user device on frame relay (or even ATM) networks. It should be noted that X.25 networks can use a special X.75′ (X.75 prime) interface as a network node interface, but this is not mandated or universal. Frame relay also defines a specialized device for network access CPE known as the Frame Relay Access Device (FRAD). However, the FRAD is not really a user device in some senses because users do not sit down at a FRAD to do work. Note that of the three packet data networks, only ATM defines a Network Node Interface (NNI). Oddly, frame relay does define an NNI acronym, but as a network-to-network interface that handles the interface from one frame relay network to another. This will be explored in more depth later.
Table 2.1 Network Terminology

| Network | Network Node | User Device (Usually) | User-Network Interface | Network Node Interface |
|---|---|---|---|---|
| PSTN | Local exchange | Telephone | Local loop | Trunk |
| Internet | Router | PC client or server with modem | Dialup modem or leased line on local loop | Leased line |
| X.25 packet switching | Packet switch | Computer | X.25 interface | (undefined) |
| Frame relay | Frame relay switch | Router, FRAD | UNI | (undefined) |
| ATM | ATM switch | Router, PC | UNI | NNI |
| LANs since ca. 1990 | Hub | Router, PC | Horizontal run | Riser, backbone |
As mentioned previously, LANs looked very different from WANs before about 1990. Pre-1990 LANs were mostly shared-media, distributed networks, and some still are. But today most LANs conform to the network node model by way of the hub. There are no firm equivalents for lines and trunks, however, and the “horizontal” and “riser” terminology in the table is used for convenience only.
Private Networks The PSTN is a public network. But this is not the only type of voice service there is, especially in corporate environments. There are also what are considered to be private voice networks, although it will be shown that there are still public elements in them. Since it will be important in understanding how public frame relay fits into a private networking environment, this section will discuss the mixing of public and private network elements a little more closely. It is not uncommon for a larger organization with multiple sites and more than 100 or so employees per site to employ a private voice switch as a CPE called a Private Branch Exchange (PBX) at each site. With only one site, and with smaller numbers of employees, other methods can be more cost-effective, but even these organizations aim to grow into the types of organizations and companies that require PBXs. The PBX itself can save money by allowing extension handsets (telephones) to call each other without needing to access the public network voice switch every time someone on the third floor calls someone on the second floor. So every telephone does not need an access line to the local central office. Instead, all that is needed is the proper number of trunks to the PSTN for the number of people who are talking outside of the organization at one time, typically about 1 in every 5 on the telephone or even more. In addition to saving on outgoing access lines, the PBX can save money for incoming lines as well. Instead of having a separate line for every employee, requiring each person to answer his or her own phone when it rings, a central call attendant takes the incoming calls, says “Corporation X, good morning,” and switches the call to the proper person or department. If there is no answer or the line is busy, the call reverts to the attendant. These three features (internal calling, attendant, and revert) are the essential features of all PBX systems.
Many PBX systems add more features, so many in some cases that no one knows what they all are, let alone how to use them. When there are multiple locations, some way must be found for the PBXs to link to each other. This is usually done by leasing private lines between the locations, from a local exchange carrier (LEC) if the sites are close enough, or from an interexchange carrier (IXC) if the sites are far enough apart. These leased lines are called tie-lines. Note that although the tie-lines form part of a private voice network, the lines remain part of the public network and revert to full carrier ownership again when the lease runs out. Typically, users must dial a special prefix number such as “8” before using the tie-line system. Some more elaborate systems have their own, internal, w-digit numbering schemes. Other PBXs just analyze the dialed number and use the tie-line network automatically if it makes sense. In any case, if the tie-line network is congested, not appropriate, or otherwise unavailable, the PSTN is usually just a “9” away.
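The prefix-dialing behavior just described can be sketched as a small digit-analysis function. This is a hedged illustration, not how any particular PBX is programmed: the function name, the route labels, and the assumption that internal extensions never begin with 8 or 9 are all hypothetical.

```python
def route_call(dialed, tie_lines_available=True):
    """Pick a route for a dialed digit string on a hypothetical PBX.

    Assumptions for illustration only:
      "9" + number -> hand the call to a PSTN trunk
      "8" + number -> use the private tie-line network
      anything else -> an internal extension handled by the PBX itself
    """
    if dialed.startswith("9"):
        return ("PSTN", dialed[1:])
    if dialed.startswith("8"):
        if tie_lines_available:
            return ("TIE_LINE", dialed[1:])
        # The caller hears busy and may redial through the PSTN with "9".
        return ("TIE_LINE_BUSY", dialed[1:])
    return ("INTERNAL", dialed)

print(route_call("4321"))                                  # extension to extension
print(route_call("85551234"))                              # over the tie-lines
print(route_call("85551234", tie_lines_available=False))   # tie-lines congested
print(route_call("912125551234"))                          # out to the PSTN
```

The sketch deliberately does not fall back to the PSTN automatically, because the number dialed after "8" is a private-network number and may not be valid on the public network.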
What’s the point of discussing the public voice network and private voice network? Because here is the key to understanding what distinguishes them. The PSTN switches, all of them, belong to the telephone service provider. The PBX switches, all of them, belong not to the network, but to the end-user organization. Note that this configuration does not rule out connections between the PBX network and the PSTN. The same connection aspect is true of a private data network with dialup links to the Internet; it becomes a public network. To distinguish public networks from private networks, find out who owns the network nodes. If the end-user organization owns the network nodes, this is a private network scenario. If the service provider owns the network nodes, this is a public network situation. It does not matter if the service is voice, data, or a mixture. It does not matter if the end-user organization owns the CPE. What counts is the network nodes. In frame relay, if the service provider owns the network nodes (which is usually true), the frame relay service is a public service.
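The ownership test can be written down as a tiny decision function. The sketch below is only an illustration: the node names and organizations are hypothetical, and the "mixed" result is an assumption added here for node sets with more than one kind of owner, a case the text does not label.

```python
def classify_network(node_owners, end_user_org):
    """Classify a network by who owns its network nodes (CPE is ignored).

    node_owners maps each network node to the organization that owns it.
    """
    if not node_owners:
        raise ValueError("a network needs at least one network node")
    owners = set(node_owners.values())
    if owners == {end_user_org}:
        return "private"
    if end_user_org not in owners:
        return "public"
    return "mixed"  # assumption: not a category defined in the text

# PBXs owned by the organization itself: a private voice network.
print(classify_network({"pbx-site-1": "Corp X", "pbx-site-2": "Corp X"}, "Corp X"))

# Frame relay switches owned by the carrier: a public service,
# even if Corp X owns every router and FRAD attached to it.
print(classify_network({"fr-switch-1": "Carrier", "fr-switch-2": "Carrier"}, "Corp X"))
```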
Fast Packet Technologies Certainly enough has been said to this point about bursty applications and LANs. But as voice becomes compressed and silence suppressed, and packet video techniques such as MPEG-2 become more common, there will be little on a network that is not well suited for packet-switching networks. Therefore, X.25 should be all over the place. Yet it is not. X.25 is not seriously considered the packet technology of choice because it is too slow, for the reasons outlined in the previous section. To do voice and video effectively, end-to-end delays must be held to some stable minimum, and information loss must be minimal as well. A packet network fast enough to handle voice and video is a fast packet network. The delays are low and stable, and the information loss is very low also. Fast packet networks are sometimes called broadband networks, but the equivalence is not always true. Broadband networks have lots of bandwidth and are used for multimedia applications that require these large bandwidths. Fast packet networks can have modest bandwidths and still be used for voice and video. However, as time goes by, all fast packet networks will have to employ broadband bandwidths just to keep up with the types of devices that users are networking together.
Broadband Needs A lot of the previous sections have dealt with theory and abstract concepts. The time has come to be very concrete and see why networks are rapidly evolving toward broadband capabilities. Consider as an example the PC that sits in front of me as I write this. I use this example because I consider myself to be neither a power user nor one of those who cling to a beloved PC long after it has ceased to be useful. I consider myself a typical PC user. Table 2.2 shows the PCs that I have used over the years. The table shows how much the most commonly networked device today, the simple PC, has evolved over the years since IBM first introduced its PC in 1981. The visual traces the evolution in terms of the most common random access memory (RAM) size, CPU speed, hard disk size (if any), and size of the operating system itself for each year listed. Most importantly, the visual then gives the maximum theoretical rate at which a serial port on one of these machines would be able to produce and consume bits. This uses the common IBM guideline of 1 bit per second for every 2 Hz of CPU speed. The first IBM PCs had 64 kilobytes of RAM, an 8 MHz 8088 chip, and only a 5 Megabyte hard drive, if one was present at all. PC-DOS fit on a single 5 1/4-inch 360-kilobyte floppy. One of the reasons that token ring LANs ran at 4 Mbps is the theoretical speed limit of these early PCs. By 1987, a PC had 256 k of RAM, ran a 16 MHz 80286 or even an 80386, had a 40 Meg hard drive, and ran DOS 3.1 from a high-density 5 1/4-inch 1.2 M floppy.

Table 2.2 The Evolution of the User Network Device

| Feature | 1981 | 1987 | 1992 | 1995 | 1998 |
|---|---|---|---|---|---|
| RAM | 64 k | 256 k | 1 M | 8 M | 16 M |
| CPU speed | 8 MHz | 16 MHz | 32 MHz | 133 MHz | 300 MHz |
| Hard disk | (5 Meg) | 40 Meg | 80 Meg | 500 Meg | 2 Gig |
| OS size | 360 K | 1.2 M | 7 M | 70 M | ~200 M |
| Theoretical Peak Bit Rate | 4 Mbps | 8 Mbps | 16 Mbps | 66 Mbps | 116 Mbps |
The typical 1992 PC model had 1 Meg of RAM, ran a 32 MHz 80386, had an 80 Meg hard drive, and ran Windows 3.1 from a 7 Meg directory on the hard drive. By 1995, the PC had 8 Meg of RAM, ran a 133 MHz Pentium, sometimes with MMX, and had a 500 Meg hard drive, which was needed just to hold the 70 Meg or so of Windows 95. Today, most PCs come with 16 or 32 Meg of RAM, run 300 MHz Pentium IIs, and have 2- or 3-gigabyte hard drives. No one knows how big Windows 98 will become. The whole point is that as the systems being networked change, so must the network. Theoretical bit rates have quickly grown from 4 Mbps to 8 Mbps to 16 Mbps. PCs today can pump out and take in anywhere from 66 Mbps to 116 Mbps and beyond for 300 MHz PCs. The network must be able somehow to keep up and scale with the requirements of the end-user devices.
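As a quick check of the guideline of 1 bit per second for every 2 Hz of CPU speed, the arithmetic can be worked through in Python. The function name is hypothetical; the clock speeds and listed peak rates are the ones given in Table 2.2.

```python
def peak_bit_rate_mbps(cpu_mhz, hz_per_bit=2):
    """Guideline from the text: 1 bit per second for every 2 Hz of CPU clock."""
    return cpu_mhz / hz_per_bit

# (CPU speed in MHz, peak bit rate in Mbps as listed in Table 2.2)
TABLE_2_2 = {1981: (8, 4), 1987: (16, 8), 1992: (32, 16), 1995: (133, 66), 1998: (300, 116)}

for year, (mhz, listed) in TABLE_2_2.items():
    computed = peak_bit_rate_mbps(mhz)
    print(f"{year}: {mhz} MHz -> {computed} Mbps by the guideline (table lists {listed} Mbps)")
```

For the first four columns the guideline reproduces the listed figures (66.5 Mbps against a listed 66 Mbps for 1995). For a 300 MHz clock the same guideline gives 150 Mbps, while the listed 116 Mbps corresponds to a clock of roughly 233 MHz, which fits the text's range of "66 Mbps to 116 Mbps and beyond."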
Flexible, Dynamic Bandwidth Allocation One of the key features and benefits of the frame relay protocol is flexible bandwidth allocation. This is often called bandwidth-on-demand, but this term is not nearly as accurate or descriptive as the term flexible bandwidth allocation. The term bandwidth-on-demand implies that bandwidth is created when needed, but this is not what frame relay does; it is impossible. What frame relay does is to dynamically allocate existing bandwidth on an as-needed basis. Suppose that point-to-point links were not needed to each and every site that a given site’s router or bridge needed to connect to. There would be fewer periods of idle patterns sent over the link, since when one LAN-based client-server application was silent, another might be sending data. In a leased-line network, one link would be idle while the other was busy. In this scenario, there is only one physical link with logical connections, so the link has fewer periods of idleness. There are no longer any physical channels to remote sites, but rather a number of logical channels or virtual connections. There is no longer a need to send long periods of idle channel or interframe fill bit patterns across the expensive leased line. As long as the average bandwidth use remains below the peak bandwidth use, this scheme is very effective. This idea of virtual circuits sharing one physical access link to a network is distinctive of packet networks in general and frame relay in particular. The potential gain in networking efficiency should never be underestimated. The X.25 protocol attempted this same kind of efficiency as well, but the other limitations of X.25 prevented this feature of X.25 from being used to its full promise. It remained for frame relay to strip off most of the hop-by-hop error-checking and flow-control processing overhead present in X.25. Following are the key aspects of frame relay flexible bandwidth allocation:
• Often misleadingly called bandwidth-on-demand
• Does not create bandwidth, but dynamically allocates bandwidth on an as-needed basis
• More efficient use of physical connectivity
• No longer a physical channel, but a logical channel (virtual circuit) linking sites
• No need to send special idle channel bit sequences as much
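The efficiency argument can be made concrete with a small calculation. The sketch below compares one leased line per remote site with a single shared access link carrying three virtual circuits. All of the traffic figures are made up for illustration, and the bursts are deliberately chosen not to overlap, which is the favorable case the text describes.

```python
# Hypothetical offered load in kbps, per time slot, from one site to three remote sites.
DEMAND = {
    "site_b": [64, 0, 0, 0, 32, 0],
    "site_c": [0, 64, 0, 0, 0, 32],
    "site_d": [0, 0, 64, 32, 0, 0],
}

def leased_line_capacity(demand):
    """One dedicated line per remote site, each sized for that site's own peak."""
    return sum(max(slots) for slots in demand.values())

def shared_link_capacity(demand):
    """One access link for all virtual circuits, sized for the peak of the total."""
    aggregate = [sum(column) for column in zip(*demand.values())]
    return max(aggregate)

def utilization(demand, capacity):
    """Fraction of the purchased capacity that actually carries user data."""
    slot_count = len(next(iter(demand.values())))
    carried = sum(sum(slots) for slots in demand.values())
    return carried / (capacity * slot_count)

leased = leased_line_capacity(DEMAND)
shared = shared_link_capacity(DEMAND)
print(f"leased lines: {leased} kbps total, {utilization(DEMAND, leased):.0%} used")
print(f"shared link:  {shared} kbps total, {utilization(DEMAND, shared):.0%} used")
```

No bandwidth is created in the shared case; the same 64 kbps is simply handed to whichever virtual circuit has data to send. If every burst arrived in the same time slot, the shared link would need the full sum of the peaks and the advantage would disappear.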
Since frame relay became available at the exact time that corporations were both looking to minimize private line costs and yet increase the efficiency of their remaining network links, frame relay services have flourished in spite of the fact that the vast majority of frame relay networks are public network services. Frame relay has broken the private line way of thinking in many environments today.
Chapter 3: Frame Relay Networks Overview Frame relay networks are a form of connection-oriented, fast packet network. They are based on the older X.25 networks and also intended to be public data networks. Some frame relay networks might also be considered as broadband networks, but few frame relay networks fit the definition perfectly. However, most frame relay networks are still fast enough in terms of network nodal processing delays and stability to deal quite well with compressed voice and video applications along with data. In a very real sense, frame relay is the result of years of effort to enable public packet-switching networks such as X.25 to handle packetized voice (and today also video). This chapter will establish the overall structure of a frame relay network. The chapter will emphasize not only the physical structure of the network, but also the need to maintain an adequate quality of service (QoS) for voice and video services while at the same time handling extremely bursty data applications. This QoS discussion introduces the concepts of routing and switching. There is much talk today about “Layer 3 switching” in routers and “adding routing” to a fast packet switch in ATM or frame relay. This chapter is the place to explore the relationship between routing and switching once and for all. The chapter will also detail exactly how frame relay evolved from X.25 and how it still retains some evidence of this process. The chapter will end with a look at frame relay connections. Both permanent virtual circuits (PVCs) and switched virtual circuits (SVCs) are discussed. The need for both PVCs and SVCs is examined, along with consideration of the frame relay protocols that need to be implemented to allow for SVC service.
Private Routers and Public Switches Frame relay is typically a public network service. The essence of a public network service is that the service provider owns and operates the network nodes. The basic network nodes in the public frame relay network are called switches. A lot of the reasoning behind the use of the term switch for a frame relay network node is historical. Traditionally, the network nodes for services provided by the telephone companies (itself an increasingly historical term in the days of deregulation) have been called switches. So there are central office switches, ISDN switches, and X.25 packet switches. Today, the term switch has come to mean any network node whose primary method of operation involves setting up connections as paths for information to follow from source to destination. So today there are Ethernet switches, Layer 3 switches, and the like. Private networks also have plenty of network nodes. The essence of a private network is that the end user’s organization owns and operates the network nodes. The basic network nodes in private networks today have a variety of names. In a small LAN, the network nodes are called hubs. When a private network is used to connect LANs, the network node used to connect these LANs was most often a bridge in the past, but it is the router today. At first, these characterizations of public switch and private router seem obviously wrong. Is not the network node of the public Internet the router? And is not a Fast Ethernet (100 Mbps) hub called an Ethernet switch? The answer is yes to both questions. But this does not mean that the origins of these terms are not correct, only that their current usage has little to do with their original context. The Internet network node is called a router because a company named cisco decided that this is what the device should properly be called. 
Until then, Internet routers were called gateways; this term can still be seen in various Internet acronyms such as IGP (Interior Gateway Protocol) and BGP (Border Gateway Protocol) that apply exclusively to routers. One of the reasons for the change is that the OSI RM defined a gateway as a network connectivity device that operated at all seven layers of the OSI RM. But Internet gateways (routers) operated only at the bottom three layers of the OSI RM. So the change was made, and successfully, largely due to cisco’s enhanced standing in the field it basically created single-handedly. (One of the reasons that bridges faded as LAN interconnection devices is that many routers could also function as bridges if so configured. Once called brouters, the bridging capabilities of all modern routers is a given today and so the term was mercifully dropped.) The Ethernet switch or switching hub is called a switch because LAN equipment manufacturers were looking for a term to distinguish how these LAN network nodes operated internally from other types of LAN hubs. The term gateway did not apply and the term router was already taken. Ironically, the most descriptive term and accurate term for what a switched Ethernet hub does, a simple bridge, was avoided since by then everyone knew that a router was a more advanced network device than a bridge (and this was true). The only term left that had ever been applied to network nodes at all was switch. So the very private LAN hub that employed bridging between each individual LAN port became known, for better or worse, as a LAN switch. And no matter how much more accurate the term single port bridging hub might be, LAN switch it remains and will remain. The term Layer 3 switch applied to what otherwise appears to be an ordinary router is simply a repetition of this naming crisis. Routers operate at Layer 3. But this device is radically different, so what do we call it? 
Well, Layer 3 switch is not taken, and it certainly points out the router relationship (Layer 3). In this instance, the term router switch, which is basically the same as Layer 3 switch, was avoided as too confusing.
So frame relay network nodes are switches and users’ LANs use routers at the ends of leased lines to connect their LANs. But a private router can be a software FRAD. And frame relay switches can be used to link customers’ routers to public Internet routers. Given the converging terminology previously noted (router switch), does all of this mean there is no difference at all today between switches and routers? Not at all. And because the frame relay switch and customer router as FRAD have such a close relationship, being at either end of the UNI, this relationship is worth exploring in a little more depth.
What Is a Router and What Is a Switch? As has already been established, the network nodes in a frame relay network are called switches. On the Internet, the network nodes are called routers. More accurately, these are IP routers, since IP is the OSI RM Layer 3 protocol used in these routers. Why should any of this matter? The answer to this question is of vital importance for organizations building networks for their users and for the service providers who build the infrastructures that link the users together. If frame relay and other technologies such as ATM are to survive and prosper in the world of the Internet, the position of public switches in relation to IP routers must be considered. Switches and routers can be compared in a number of ways. It is important to realize that even though this section emphasizes the differences between switches and routers, both are still network nodes that can be used in a wide variety of networks and under a wide range of circumstances. Switches usually (there are exceptions) have the following characteristics. They are hardware-based; processing happens very quickly at the chipset level with a minimum of added overhead processing. Switches were created by the telcos for use on a public WAN and standards are governed by the ITU. The tables that are used for routing traffic through the switch are set up by a signaling protocol when the connection between the users is initially made. So switches are typically connection-oriented and no data transfer takes place until this connection between users is set up. Connections might be of the permanent virtual circuit (PVC) type or switched virtual circuit (SVC) type (on-demand connections is a more accurate, but seldom used, term). Both PVCs and SVCs are connections in either case, no matter how they are established.
All data units are distributed from input ports to their proper output ports by a simple, quick lookup in the routing table of a connection identifier, which makes the hardware implementation so attractive. This simplicity is a result of the connection-oriented approach of the switch environment. The connection identifiers often have what is called local significance only, which means they can be used over and over on the network as a whole and thus must be changed as the data units flow from node to node across the network. Examples of networks that use switches with these characteristics as network nodes include X.25, frame relay (naturally), ATM, ISDN, and many other mostly public network services. The behavior of routers can be contrasted with switches almost point by point. Routers usually have the following characteristics, but as with switches, with some exceptions. Routers are mostly software-based (but this is changing) and processing happens more slowly at the CPU level with some added overhead processing. Routers were created for use on the public Internet (and called gateways until cisco popularized the name router), and router standards are governed by Internet organizations. The switching tables (the contrasting terms are used intentionally here) that are used for routing traffic through the router from input port to output port are created by a routing protocol that periodically contacts neighboring network nodes (other routers) for the purpose of exchanging this routing information. Routers are typically connectionless and data transfer between users can take place at any time without establishing connections between the routers. Note that there may be permanent or switched (on-demand) connections that exist end-to-end between users or applications, but there are no logical connections at all between the routers themselves, just physical links.
Because of the connectionless approach used in the router environment, the data units are distributed from input ports to their proper output ports by a set of rules which are applied to a global network address that must be present in each and every data unit. The fact that this global routing information is present in each data unit is a major reason behind the flexibility of routing and makes the software implementation so attractive. The network addresses handed out in a router environment have global significance, so they must be carefully allocated to users on the network. There can be no overlap among network addresses in a router network, a fact which alone adds administrative complexity to the network. Examples of networks that use routers with these characteristics as network nodes include IP, IPX, and a few others. Note that these network protocols started out as proprietary or closed protocols, whereas most protocols based on switches as nodes started out expressly as public and open protocols.
Today, the trend is toward convergence between switches and routers as network nodes. That is, routers have begun to take on the characteristics of a kind of connectionless switch routing data units with global addresses while switches have begun to take on the characteristics of connection-oriented routers switching data units with local addresses (connection identifiers). As previously mentioned, some former LAN connectivity devices displaying the characteristics of both bridges and routers were called brouters. The term has thankfully disappeared, but perhaps the same approach could be taken with regard to network nodes that combine the characteristics of a switch and a router. This swouter device would do a lot of the data unit processing in hardware at the chipset level but would have many tables to look things up in as well. This device could handle connections or flows of IP packets.
The point is clear: If this device is neither a traditional switch nor traditional router, then what exactly is it? Switches and routers have already merged in function to the point where more and more equipment manufacturers are not calling their new products a switch or a router at all. The device might be a packet processor or nodal processor, but not merely a switch or router. Two examples are instructive. Start with a normal, premises-based IP router. Add the hardware module that cisco calls its route switch processor card. Then add some software to handle IP version 6 flows, which are basically a type of on-demand connection between routers. If a frame relay UNI is added, the result is more than a FRAD, but something less than a full-blown public frame relay switch. Or start with a frame relay switch in the public frame relay network. Everyone has a router, but perhaps no potential customers want to buy a FRAD or they are reluctant to change their router configuration. No problem. Just add some software to the frame relay switch to handle IP and other traditionally connectionless protocols. This frame relay switch-based software needs the IP routing tables to perform its task, of course and the IP routing protocols would need to be added to maintain the routing tables properly. Is this frame relay switch now an IP router? Or is it rather (and this seems to be what it really is) a new type of central office FRAD or CO FRAD? It should also be noted that in a router network, each data unit is usually routed independently. But in a switch network, only the call setup message of the signaling protocol is routed independently. In a switch, all subsequent traffic follows the same path. A router can also do this, using a concept called flows, as previously noted.
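The contrast between the two kinds of table lookup can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the port numbers, connection identifiers, and address prefixes are invented, and longest-prefix matching is used here as one common example of the "set of rules" a router applies to a global address; the text itself does not name a specific rule.

```python
import ipaddress

# Switch side: the table is filled in at connection setup time and is keyed
# by (input port, connection identifier). All values are hypothetical.
SWITCH_TABLE = {
    (1, 17): (3, 42),
    (2, 17): (3, 99),  # identifier 17 reused on another port: local significance
}

def switch_forward(in_port, conn_id):
    """Exact-match lookup. Returns (output port, new identifier) or None."""
    return SWITCH_TABLE.get((in_port, conn_id))

# Router side: the table is maintained by a routing protocol and is keyed by
# destination prefix. Prefixes and ports are hypothetical.
ROUTE_TABLE = [
    (ipaddress.ip_network("10.0.0.0/8"), 1),
    (ipaddress.ip_network("10.1.0.0/16"), 2),
    (ipaddress.ip_network("0.0.0.0/0"), 3),  # default route
]

def route_forward(destination):
    """Longest-prefix match on the global address carried in every packet."""
    address = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, port) for net, port in ROUTE_TABLE if address in net]
    if not matches:
        return None
    return max(matches)[1]  # the most specific (longest) prefix wins
```

The switch lookup needs no knowledge of where the data unit is ultimately going, only which connection it belongs to on this hop. The router lookup needs nothing set up in advance for the particular pair of users, but every packet must carry a full destination address.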
Routing and Switching on a Frame Relay Network
All the pieces are now in place to understand both how a frame relay network operates and how this operation is an improvement on how an X.25 packet-switching network operates. X.25 networks operate by taking source X.25 packets from an end-user device and placing them inside of frames based on the Link Access Procedure-Balanced (LAPB) standard. The term balanced means that the same type of messages can flow from either end of the link, making them peers from the networking perspective. The LAPB frames are sent as a stream of bits from source device to network X.25 packet switch.
At the public X.25 switch, each arriving frame is checked for errors and, if none are detected, an acknowledgment is sent back periodically to the source saying, in effect, “the last x frames were good.” The sender must wait after sending “x” frames until this acknowledgment appears. There is always the possibility of a Negative Acknowledgment (NACK) appearing as well. This usually prompts the sender to resend at least one and probably more frames due to the detected error. All this occurs at Layer 2 of the OSI RM. Inside the frames are the packets or, more likely, a piece of a packet. There is a measure of flow control done at this level as well. Flow control simply means that the user premises device can never send frames and packets into the network faster than the network can handle them. One of the simplest ways to perform flow control at this level is to merely delay an acknowledgment so the sender cannot send any more frames or packets.
The public X.25 switch now assembles the entire packet and examines the connection identifier at Layer 3. The connection identifier is looked up in a table (the switching or routing table) and the proper output port determined. There might be packet-level acknowledgments involved as well. Then the public switch repackages the X.25 packet in an appropriate frame for sending to another public X.25 switch. These frame-level procedures are not quite the same as LAPB, and are vendor-specific, but must perform the same error checking and acknowledgment sequence as on the user access link. There might be an arbitrary number of public X.25 switches to traverse until the packet arrives at the switch that has the X.25 access link to the destination. At the destination side of the network, this entire packet-in-frame error detection, acknowledgment, and flow control procedure is repeated. So all three layers of the OSI RM are involved at each hop between nodes along the way.
This link-by-link, or hop-by-hop, error and flow control is characteristic of all older network protocols such as X.25 and is shown in Figure 3.6.
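The "send x frames, then wait" behavior on a single hop can be sketched in a few lines of Python. This is a simplified model, not LAPB itself: the class name, the cumulative acknowledgment, and the go-back style of resending after a NACK are assumptions made for illustration, and real frame formats and timers are ignored.

```python
class WindowedSender:
    """One end of a single link that may have at most `window` unacknowledged frames."""

    def __init__(self, window):
        self.window = window
        self.next_seq = 0  # number of the next frame to send
        self.acked = 0     # every frame numbered below this has been acknowledged

    def can_send(self):
        return self.next_seq - self.acked < self.window

    def send(self):
        """Return the sequence number sent, or None if the sender must wait."""
        if not self.can_send():
            return None    # window is full: a delayed acknowledgment throttles the sender
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def on_ack(self, up_to):
        """Acknowledgment meaning 'all frames numbered below up_to were good'."""
        self.acked = max(self.acked, min(up_to, self.next_seq))

    def on_nack(self, seq):
        """Negative acknowledgment: back up and resend from frame `seq` onward."""
        if self.acked <= seq < self.next_seq:
            self.next_seq = seq
```

Because the receiver controls when acknowledgments are released, simply holding one back is enough to stop the sender, which is the flow control mechanism described above. In X.25 this exchange is repeated on every hop.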
Figure 3.6 Information flow in X.25.
There is nothing at all wrong with doing things the hop-by-hop way on a network. In fact, it saves the end devices the tasks associated with error and flow control, since the network is doing all of this on behalf of the users. With X.25 it was even possible to have a source sending at 9,600 bps (which shows how long X.25 has been around) and a destination receiving at 4,800 bps. The network would buffer and store the excess bits until the receiver was ready for them. Try doing that with a leased line!
However, today’s source and destination PCs and routers are much more capable than they were just a few years ago. The complete Layer 2 and Layer 3 processing required in X.25 now slows the network down more than it decreases the burden on the end systems. All in all, there are some 10 decisions that an X.25 switch must make to process a packet through an X.25 network node. About six of the decisions are at Layer 2 and four of them are at Layer 3. The details are unimportant here. What is important is that most of these decisions have to do with the error and flow control that must be done hop-by-hop throughout the X.25 network.
The philosophy in a frame relay network is radically different. Instead of the network performing error and flow control hop-by-hop, the frame relay network makes these procedures the responsibility of the end-user device. (Some frame relay texts say that the end user is responsible, conjuring up the image of an office worker feverishly working to resend information through the frame relay network.) It is the end-user device, such as the host attached to the router, that performs the error and flow control tasks end-to-end across the frame relay network.
In a frame relay environment, the packet scenario is as follows. Frame relay networks operate by taking source frame relay frames from a site CPE device (e.g., a router or FRAD) and placing them inside of frames based on the Link Access Procedure-Frame Relay (LAPF) standard. There are several levels or types of service a frame relay network can offer; the most basic is based on LAPF core. This is the type of frame relay network described here. The LAPF frames are sent as a stream of bits from source device to a network frame relay switch. At the public frame relay switch, each arriving frame is checked for only two things. First, if the frame contains any errors at all, it is just discarded. No notification of this frame discard is sent to the source. Second, the arriving frame is checked to see if the Data Link Connection Identifier (DLCI) in the frame header has a routing or switching table entry. If the DLCI is not in the table, then the frame is discarded. If no errors are detected and the DLCI has a table entry, the frame is switched to the proper output port. No acknowledgment is sent back periodically to the source. So there is no flow control or error control done within the network. If missing frames are to be resent, it is the task of the end systems to decide if any frames are missing and what to do about the missing frames (voice frames can hardly be resent!). The same logic applies to flow control: If the destination system wants the source system to slow down, then it is the responsibility of the destination to inform the source of the need to slow down the sending process. The frame relay network will convey this information inside the frames to the source, but the frame relay network is never aware of the contents of the frames that the network transports on behalf of the users. There is no need for the public frame relay switch ever to look inside a frame and assemble or process the entire packet. 
The connection identifier is all that is needed to allow the frame relay switch to determine the proper output port from the switching or routing table lookup. There are no packet-level acknowledgments or flow control done in the network at all. The public frame relay switch does no repackaging of packets; it only relays frames from input port to output port. The frame-level procedures used between the frame relay switches are not quite the same as LAPF core, but are vendor-specific, just as in X.25 and most other public networks. There still might be an arbitrary number of public frame relay switches to traverse until the frame arrives at the destination.
At the destination side of the network, the frame content is subjected to error detection, acknowledgment, and flow-control procedures if necessary based on the application. But this is an end-to-end function of the user devices, not a hop-by-hop function of the network itself. So only the bottom two layers of the OSI RM are involved at each hop between nodes along the way. This end-to-end error and flow control is characteristic of all newer fast packet protocols such as frame relay. The frame relay information flow is shown in Figure 3.7.
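The two checks a frame relay switch makes on each arriving frame can be summarized in a short Python sketch. Several things here are assumptions for illustration only: the `Frame` class and the table contents are hypothetical, the header is not encoded at the bit level, and `binascii.crc_hqx` is used as a stand-in 16-bit check rather than the exact frame check sequence computation used on a real UNI.

```python
import binascii
from dataclasses import dataclass

@dataclass
class Frame:
    dlci: int       # Data Link Connection Identifier from the frame header
    payload: bytes  # user data; the switch never interprets it
    fcs: int        # frame check value carried in the trailer

def checksum(dlci, payload):
    # Stand-in 16-bit CRC over header and payload (not the real FCS parameters).
    return binascii.crc_hqx(dlci.to_bytes(2, "big") + payload, 0)

def make_frame(dlci, payload):
    return Frame(dlci, payload, checksum(dlci, payload))

# (input port, incoming DLCI) -> (output port, outgoing DLCI); hypothetical entry.
DLCI_TABLE = {(1, 100): (2, 200)}

def relay(in_port, frame):
    """Return (output port, outgoing frame), or None if the frame is discarded."""
    # Check 1: any error at all means a silent discard, with no notification.
    if checksum(frame.dlci, frame.payload) != frame.fcs:
        return None
    # Check 2: a DLCI with no table entry also means a discard.
    entry = DLCI_TABLE.get((in_port, frame.dlci))
    if entry is None:
        return None
    out_port, out_dlci = entry
    # No acknowledgment, no packet reassembly: just relay with the new DLCI.
    return out_port, make_frame(out_dlci, frame.payload)
```

Compared with the X.25 procedure, there is no state kept per frame and nothing sent back to the source. Any recovery from a discarded frame is left to the end systems.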
Figure 3.7 Information flow in frame relay.
Note that the end-to-end layer in a frame relay network is the Network Layer and not the Transport Layer. This means that IP packets, or any other OSI RM Layer 3 data unit, now become the end-to-end data transfer unit through the frame relay network. So any mechanisms that the TCP/IP protocol has in place to handle error control and flow control work just as before (if the network was previously run on leased lines or the like) and the frame relay network is totally transparent to the IP routers. This simple transparency is both a benefit and a liability to the network and router alike, and will be explored more fully in the next few chapters.
The Frame Relay Access Device
The Frame Relay Access Device (FRAD) is the user’s view of the frame relay network. Users do not see the frame relay switch, nor do they usually even see the UNI. What users see is the FRAD. And even then, the FRAD might very well be the same familiar router down the hall. But in many cases, the FRAD is a special frame relay network device that terminates the UNI at the customer site; it is the CPE of the frame relay network. A lot of times, evaluating a frame relay network is really a two-step process. First, examine and choose the service provider. Then, examine and choose the FRAD vendor(s) (it is much easier to mix and match FRAD vendors than frame relay switch vendors).
A quick definition of a FRAD is easy to provide. A FRAD has at least one port capable of supporting the hardware needed for the UNI link and the software needed for understanding frame relay protocols, and one or more non-frame relay ports, usually LAN ports. So a FRAD has at least one UNI and one or more non-UNI ports. Many FRADs, especially smaller, less expensive models, have exactly one UNI port and perhaps four non-UNI ports, all usually just 10Base-T Ethernet. More expensive FRADs have more sophisticated UNI options and configurations, including dial backups, and a wider range of non-UNI ports and/or more of them as well.
The point of this section is to describe the different types of FRADs that can be found on the premises of a typical frame relay customer. This does not mean that criteria for deciding which FRAD is right for which situation are not covered here; it just means the emphasis is on description, not selection. It is possible to divide all of the FRADs marketed by vendors into roughly the following categories and subcategories:
Software FRADs (Routers)
-FRADs with only basic features
-FRADs with more advanced features
Hardware FRADs
-FRADs (Traditional FRADs)
-M-FRADs (Multiservice FRADs)
-V-FRADs (Voice FRADs)
So FRADs fall into two major categories, software FRADs and hardware FRADs. In each major category, several variations are possible. Each of the major types of FRAD is discussed in a little more detail here. Several of the more advanced features will be explored in more detail in later chapters. This section only indicates support options.
Software FRADs (Routers)
Most frame relay services are used to link organizations’ LANs together. Even when frame relay is used to support SNA, PBX voice, or some even more exotic services, the basic LAN connectivity private line replacement role of frame relay is still present. The device that is most often used to act as the external gateway between LANs is the router. In fact, gateway was the older term used for router until a company called cisco (properly spelled with a lower case “c,” but seldom seen that way) essentially invented the router and, more importantly for the LAN interconnection industry, the market for routers. This is not necessarily an endorsement of cisco routers, but more of an acknowledgment that cisco outsells every other router vendor put together.
Routers are the network nodes of the Internet and have become the common term for the network nodes for LAN interconnections of any type, from leased private lines to virtual private networks. Older LAN connectivity schemes used bridges, but these devices had such limitations when compared to routers that once the pricing was right, people took advantage of the benefits of routing almost immediately. Note that the function of routers on LAN internetworks and the Internet is exactly the same as the role of switches in a public data network. Both are network nodes. This is not a coincidence. Some would even claim that there is basically no difference at all between routers and switches. More on this subject will be said later in this chapter, where the position is developed that there are still some consistent differences between routers and switches, but the differences are becoming less and less over time.
Because both routers and frame relay switches are network nodes, it might seem logical that they should be able to interface directly. And they can. But since the router is not a frame relay switch, but a CPE device, the interface between them must be the UNI.
Since the router is also a device that has one or more non-UNI ports for LAN attachment, it is easily seen that the router fits the one-UNI, one-or-more non-UNIs definition of a FRAD and can perform the same function as a FRAD. When a router performs this function, it is sometimes known as a software FRAD; that is the term used here. Why software FRAD? Because a router typically has at least one serial WAN port that runs an appropriate WAN protocol such as PPP (Point-to-Point Protocol) at the frame level (Data Link Layer 2) that the serial router port on the other end of the link understands. If the serial WAN port is now to be used as a frame relay UNI, PPP will no longer do. The protocol that runs at the frame level now must understand frame relay frames and nothing else. This is not usually a problem. All major router vendors (a phrase that can just mean “cisco,” but in this case means almost everybody) support and bundle the frame relay protocol on their serial WAN ports.
The use of a router as the CPE on a frame relay UNI is quite attractive. This is especially true if the frame relay network is replacing an existing private line network between the routers, as many frame relay networks do. What usually happens is that the customer can terminate service on all of the other links, which are usually 56-64 kbps private lines, and keep one to form the UNI. The router port is reconfigured as a frame relay UNI, restarted when service begins, and the UNI is up and running. What could be simpler? This graceful migration path aspect of frame relay must not be underestimated. It should be noted that with the exception of the frame structure, the UNI link performs exactly as it did before it became a frame relay UNI. That is, the link still carries frames represented as bits. But now the link terminates at a local frame relay switch instead of at another router port hundreds or even thousands of miles away.
Because private lines are paid for by the mile, not only are there fewer links in the frame relay network, but also they are a fraction of their former length (and price). The UNI still requires a standard Digital Service Unit/Channel Service Unit (DSU/CSU) arrangement on the premises if the UNI is a digital link such as a 56-kbps DS-0, which the vast majority of UNIs are. The DSU/CSU forms the boundary between the service provider network and the CPE (the router or FRAD). The position and function of the DSU/CSU on a frame relay UNI is shown in Figure 3.4.
Figure 3.4 The DSU/CSU on a frame relay UNI.
The DSU/CSU takes bits having one type of coding suitable for short distances and some media sent on the serial port, and converts them to and from bits that are suitable for longer distances and other media. The DSU/CSU is strictly a Physical Layer 1 device in the OSI RM. The function and position of the DSU/CSU is worth mentioning because the CPE is customer property and cannot be directly managed or configured by the service provider in most cases. But the DSU/CSU can be directly managed, since it is seldom changed. This makes the DSU/CSU an attractive place to try to manage frame relay UNIs; several service providers have attempted to do just that.
Software FRADs do not even have to be routers, but most of them are. Almost any device that has a serial port can be used as a FRAD, as long as the vendor provides or supports software to generate frame relay frames and understand the switch at the other end of the UNI (there is more to frame relay and networking in general than just data transfer). Bridges, mainframe Front-End Processors (FEPs), minicomputers, and other equipment can be software FRADs, and many are, especially IBM SNA FEPs. But SNA and frame relay are a topic for a later chapter.
Previously, software FRADs were distinguished by the presence or absence of advanced features. In the case of software FRADs, advanced features can translate to many things that hardware FRADs offer routinely and easily. Most routers that function as software FRADs, as convenient as this is when frame relay is used for LAN interconnectivity and private line replacement (which is just about always), are unable to provide more than simply a way to package LAN frames or packets into frame relay frames on one side of the network and haul them out again on the other. However, it has already been pointed out that there is more to networking than simple data transfer.
The basic frame relay functions will transfer data across the network, but how will the routers detect congestion on the frame relay network? How will the routers deal with missing frame relay frames that are dropped because of errors? Most software FRADs do not deal with these issues; in fact, they are not really worried about these issues at all. Routers simply route. Let the end systems worry about what to do about errors and congestion. The problem is that routers are the end systems to the frame relay network. But simply putting frame relay software in a router will not necessarily make the router a particularly good FRAD. Nevertheless, the use of a router as a software FRAD is common and accepted. There are several benefits to this use of routers. First, the frame relay software is usually bundled with the router, at least all but the very low-end routers. So no extra hardware is needed and the cost of a separate FRAD is not incurred. One router is typically the gateway off the premises for network traffic, so making this the gateway to the frame relay network makes sense also. Certainly this is the simplest configuration. Finally, routers have been around for a number of years; there is widespread understanding and support for routers in the networking community. Sometimes, the limitations of software FRADs only become apparent when the frame relay network becomes so successful that frame relay network access must be expanded to more users, perhaps all of them. The biggest limit is that there can only be one frame relay UNI per router, unless definite steps are taken to provide more than one UNI. Many customers, familiar with private line environments, do not think about having more than one UNI per router. Also, many router-based FRAD implementations are not totally compliant with frame relay specifications. The router might be able to handle basic frame transfer, but not much else. 
There are usually few really advanced frame relay features on the router acting as a FRAD, which is only understandable. Routers are built and marketed as routers, not as FRADs, after all. Finally, because routers are basically connectionless, best-effort packet delivery platforms, there is little quality of service (QoS) support in a software FRAD. In this context, QoS means that the application is able to obtain the bandwidth, error rate, and delay characteristics that the application needs to function from the underlying frame relay network. QoS in networks, in general, and frame relay QoS, in particular, will be defined more fully in a later section of this chapter.
In fairness to the router industry, it should be noted that nothing prevents a router from becoming as good a FRAD as anything else, except perhaps in the area of support for other services. That is, voice telephony ports on a router will be rare for the time being. The strength of software FRADs depends on the amount of effort put into them by the individual router vendor.
Hardware FRADs
Many of the basic features of FRADs have been covered in the software FRAD section. A lot can be covered by contrast rather than detailed descriptions. That is not to say that hardware FRADs are not as important a topic as using routers as FRADs; it is simply a reflection of the natural tendency to use the familiar router in a role that can also be served, and sometimes better served, by a separate, dedicated frame relay network access device. Today, the FRAD marketplace is broken up loosely (very loosely) into three major categories: “traditional” FRADs, multiservice FRADs, and voice FRADs. The term “traditional” refers to a FRAD as a stand-alone hardware device which might have very advanced state-of-the-art capabilities. In fact, the perspective employed here on these divisions is not based on any formal definitions at all, only the author’s individual view of the marketplace. Formal definitions may evolve, but for now the dividing line between the various categories of FRADs remains quite fluid.
Traditional FRADs
It is surely a measure of how far hardware FRADs have come that it is now necessary to call the simplest packaging of FRAD capabilities a traditional FRAD. Although the leading edge of the market has progressed far beyond the simple packaging of the basic FRAD, such traditional FRADs remain in heavy use on many frame relay networks. The basic package should remain common in low-end FRAD offerings for many years to come. The term “traditional” applied to hardware FRADs has nothing to do with frame relay features that determine frame relay standards compliance or support for options. All hardware FRADs offer such compliance (at least they should) with frame relay standards; option support is another issue altogether. Rather, the term “traditional” applies to a FRAD that only supports data services and treats all data traffic identically in the FRAD itself (i.e., there are no priorities for individual virtual circuits).
The major features of a traditional hardware FRAD are shown in Figure 3.5. The main difference between a hardware FRAD and a software FRAD, such as a router, is support for more types of non-UNI ports on the FRAD than only LAN ports, although such a FRAD might still have nothing but LAN ports (generally more than one). Even X.25 or telex ports can be accommodated on some models of this type of FRAD. One of the distinguishing characteristics of a traditional FRAD is that traffic from all non-UNI ports is treated exactly the same inside the FRAD itself. In other words, if two frame relay frames containing client/server LAN traffic are already waiting to be sent on the UNI into the frame relay network, and a port connected to an IBM AS/400 generates a delay-sensitive unit of SNA traffic representing a financial transaction, there is no way for the SNA traffic to leapfrog and thus gain preference in the output queue over any other traffic in the FRAD.
Figure 3.5 The stand-alone FRAD with advanced state-of-the-art features.
So from the traditional FRAD perspective of the frame relay network (which is basically all frame relay network levels since the FRAD is the interface to the frame relay network), all traffic is created equal. Under light loads, traffic moves through quickly, transactions and e-mail alike. Under heavy loads, traffic moves through more slowly, perhaps slow enough to result in SNA session restarts. SNA session restarts slow transaction processing to a crawl and make more work for all the components of the network, frame relay and non-frame relay components alike. Perhaps if some way to distinguish bulk e-mail traffic from delay-sensitive SNA transactions within the FRAD were possible, the frame relay network users would be a much happier group. E-mail could easily wait while SNA session traffic was delivered more quickly. Perhaps it would be better if there was a way in the FRAD itself to acknowledge inherent differences in the type of service that an application needs and prioritize the virtual connections on the network. This FRAD would not only prioritize connections, but also dynamically allocate more or less bandwidth to an application as the application uses the frame relay network. This type of FRAD is sometimes called the M-FRAD, or multiservice FRAD.
Multiservice FRADs
The first thing that should be said about M-FRADs is that they are not necessarily the same as multiservice access concentrators. M-FRADs support frame relay UNIs, pure and simple. Multiservice access concentrators generally support both frame relay and ATM UNIs. M-FRADs still support delay-sensitive traffic streams like voice and video only if these services are delivered to and originate from LAN-attached PCs. So multiservice support in an M-FRAD still revolves around data packet delivery. Multiservice access concentrators have voice and video support, but the voice and video support is usually handled by specialized ports with not only non-UNI interfaces but non-LAN ports as well. This is not to say that a multiservice access concentrator with a frame relay UNI and only LAN ports cannot be used in the same fashion as an M-FRAD. It simply acknowledges that the multiservice access concentrator is a more general device while the M-FRAD is a more specialized device.
The key feature and benefit of an M-FRAD is that the device can prioritize the frame relay service given to individual connections based on the needs of the traffic on the connection. Typically, this need is for data traffic priorities, but nothing prevents M-FRADs from supporting voice traffic as well, usually by encapsulating digital voice inside data packets. It is hard to be precise when there are no accepted definitions, and a large measure of common sense is needed when evaluating these types of FRADs.
Before M-FRADs, almost all FRADs had a few common characteristics. Some of these features have been discussed already. First, these traditional FRADs provided a single type of service (first in, first out) for all traffic. Next, the number of UNIs and non-UNI ports serviced by the FRAD was relatively small, so central site concentration was awkward. Finally, these FRADs all relied solely on permanent virtual circuits (PVCs) for connectivity.
Usually, the M-FRAD at least distinguishes between LAN traffic and SNA sessions, often called legacy data traffic by the product vendors. More sophisticated M-FRADs can give priorities to interactive client/server database access over bulk file transfers, or even SNA transactions to one mainframe over SNA transactions to another mainframe. The key is that the M-FRAD is aware not only of frame relay frames, but also the differing connection identifiers (the DLCIs) of the frame relay connections or virtual circuits.
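The difference between the first in, first out service of a traditional FRAD and the per-connection priorities of an M-FRAD can be illustrated with two small Python queues. This is only a sketch: the class names, DLCI values, and priority numbers are hypothetical, and strict priority ordering is an assumption chosen for simplicity. Real M-FRADs may use other scheduling schemes, and the dynamic bandwidth allocation mentioned earlier is not modeled.

```python
import heapq
from collections import deque
from itertools import count

class FifoFrad:
    """Traditional FRAD: one output queue, all traffic treated identically."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, dlci, frame):
        self.queue.append((dlci, frame))

    def dequeue(self):
        return self.queue.popleft() if self.queue else None

class PriorityFrad:
    """M-FRAD sketch: frames are ordered by a priority configured per DLCI."""

    def __init__(self, priority_by_dlci, default_priority=9):
        self.priority_by_dlci = priority_by_dlci  # lower number = sent sooner
        self.default_priority = default_priority
        self.heap = []
        self.arrival = count()  # tie-breaker keeps FIFO order within a priority

    def enqueue(self, dlci, frame):
        priority = self.priority_by_dlci.get(dlci, self.default_priority)
        heapq.heappush(self.heap, (priority, next(self.arrival), dlci, frame))

    def dequeue(self):
        if not self.heap:
            return None
        _, _, dlci, frame = heapq.heappop(self.heap)
        return dlci, frame
```

With two LAN frames already queued on a hypothetical DLCI 100 and an SNA transaction then arriving on DLCI 200, the FIFO queue sends the SNA frame last, while a `PriorityFrad` configured as `{200: 0, 100: 5}` sends it first. This is the leapfrogging that a traditional FRAD cannot do.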
Voice FRADs
The last type of specialized hardware that might be encountered in FRADs is voice capability. These voice FRADs, or just V-FRADs, typically have a harmonica-style interface for hooking up 50-pair twisted-pair copper wire from an organization’s PBX. These cables can carry up to 24 voice channels from the PBX into and across the frame relay network to a similar device on the other end of the PVCs used for voice. Ordinarily, these voice channels would be carried on tie-lines, which are nothing more than leased lines used for voice purposes between PBXs. Most often, the 24 voice channels would be represented by 64 kbps digital voice and the 24 channels would be carried by a 1.5 Mbps DS-1 circuit in the United States.
Naturally, if an entire DS-1 acting as a frame relay UNI were used to carry regular 64 kbps voice conversations, then when 24 people were on the telephone at one site, all data transfer would cease. This is what the V-FRAD is for. The V-FRAD will take the 64 kbps voice and further compress it to anywhere from 4 kbps to about 13 kbps, depending on the V-FRAD vendor and desired voice quality. So 24 compressed voice channels should take up only between 96 kbps (24 channels at 4 kbps) and 312 kbps (24 channels at 13 kbps) on the 1.5 Mbps UNI. The voice compression is done by a special board in the V-FRAD known as a Digital Signal Processor (DSP). Although the term DSP might seem to imply that any digital signals at all could be subjected to this process, in practice only digital voice signals are processed by current DSPs. The compression must be done by hardware because of the delays that might otherwise be introduced by attempting to perform this task in software. For smaller installations using only a 64 kbps UNI, the DSP boards usually have individual modular jacks for handling only a few telephones instead of a whole T-1 interface.
Voice over frame relay is facing a real challenge from Voice over IP (VoIP) proponents and equipment. However, doing VoIP with adequate quality typically means that the organization must use a managed Internet Service Provider’s service. This usually translates to the ISP providing a separate access line and backbone network in the form of routers connected by leased lines. In other words, the VoIP in this case has nothing to do with the Internet, other than the fact that the ISP also happens to provide Internet access. The attraction of doing voice over frame relay is that the voice is more intimately tied in with the basic service. That is, the voice over frame relay is delivered over the same network, from UNI to backbone, as the data service. This is not often true with VoIP today. More will be said about voice over frame relay in Chapter 9.
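The bandwidth arithmetic behind voice compression is easy to check. The sketch below is illustrative only: it assumes a UNI payload of 1536 kbps (24 channels of 64 kbps) and ignores frame relay header overhead, and the function names are made up for this example.

```python
UNI_KBPS = 1536   # assumed payload of a 1.5 Mbps UNI: 24 x 64 kbps
PCM_KBPS = 64     # uncompressed digital voice channel


def voice_load_kbps(channels, per_channel_kbps):
    """Aggregate bandwidth consumed by the voice channels (no frame overhead)."""
    return channels * per_channel_kbps


def data_headroom_kbps(channels, per_channel_kbps, uni_kbps=UNI_KBPS):
    """Bandwidth left over for data once the voice channels are carried."""
    return uni_kbps - voice_load_kbps(channels, per_channel_kbps)


if __name__ == "__main__":
    for rate in (PCM_KBPS, 13, 8, 4):
        load = voice_load_kbps(24, rate)
        left = data_headroom_kbps(24, rate)
        print(f"{rate:>2} kbps/channel: voice {load:>4} kbps, data {left:>4} kbps")
```

At 64 kbps per channel the headroom for data is zero, which is the situation the V-FRAD is meant to avoid. Real figures will be somewhat worse than these, since each voice frame also carries frame relay overhead.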
The discussions in this section should be used for information purposes, not as a blueprint for one FRAD or another. As time goes on, all FRADs will develop support for multiservice priorities and support for non-packet-based telephony. So the distinctions between FRAD, M-FRAD, and V-FRAD will blur over time. In some truly state-of-the-art packages, the M-FRAD and V-FRAD have already begun to merge into the same device, just with different boards for the different functions.
Figure 3.5 shows what passes for a state-of-the-art hardware FRAD today. Some of the features have yet to be discussed in detail, but this is the place to deal with the overall features that a frame relay customer should expect to find, or at least have available, in the CPE device at the customer end of the UNI. The chassis is a standard rack-mountable or standalone unit whose cost will vary widely based on features and functions supported. All include a main and backup power supply, and redundancy is always advisable for installations using voice over frame relay (otherwise voice communication is cut off with loss of building power). Typically, any network management capabilities are also built into this main system board, but there are exceptions. In fact, the network management capabilities are the main distinction between software and hardware FRADs, as will become apparent. If the unit supports multiple queues for giving priority to one traffic stream or physical port over another, there is typically a separate board for that function, although some units combine this capability with the main system board. The rest of the slots are configurable on a mix-and-match basis depending on the number and type of premises connections and the number and type of UNI connections (some FRADs can support multiple UNIs).
For the premises side of the network, the connectors almost universally include one or more 10Base-T LAN connectors, often support one or more token ring connectors, and might include more exotic LAN and device connector types such as Fibre Channel. For the WAN side of the network, the FRAD supports multiple UNI connector types, usually depending on the speed of the UNI itself. The most common is the V.35 connector for a 56-64 kbps UNI, but 1.5 Mbps and 45 Mbps UNI are also supported, of course. Usually there is only one UNI connector, but more are possible. There would also be a voice DSP board for compressed voice at 8 kbps or so, depending on the compression method used. The remaining slots (if any) would be used for expansion. And, of course, there is no requirement for the DSP board (for instance) to be present until there are plans for voice support, and so on for the other optional services.
More details on FRADs are available from the individual vendors, from various trade magazines that periodically review such devices, and from the Frame Relay Forum (FRF). The FRF is a vendor consortium interested in vendor interoperability of frame relay devices; it issues implementation agreements (IAs) covering a wide variety of frame relay topics. More information on the Frame Relay Forum will be found in the bibliography to this book.
Regardless of the future of FRADs, all FRADs share a common purpose. The FRAD exists to allow users to access the frame relay network. The frame relay network exists to give each user the quality of service (QoS) that he or she needs to allow applications to function as designed. Because a lot of time will be spent going over a frame relay network’s techniques for delivering the proper QoS needed, this is a good place to say a few words about precisely what QoS on a network is.
Quality of Service and Networks
Having the proper quality of service (QoS) on a network is a lot like having nice weather on a vacation. Not only does the term “nice weather” mean different things to different people, it means different things depending on what the vacation is all about. Obviously, nice weather for skiing in the Rockies is not the same as nice weather for sunbathing in the Bahamas. So it is with networks. The QoS needed for bulk file transfers like remote server backups is not the same as the QoS needed for packetized voice. Also, people being what they are, no one complains about too much nice weather or excellent QoS. But everyone notices when the weather or QoS fails to live up to their expectations. But the network, like the travel agent, is always held responsible if the weather or QoS disappoints a user expecting one thing and given another.
Analogies are nice tools for comparison, but they can only be pushed so far before they either become inadequate or just annoying. This section will say no more about vacations, especially since no one has ever confused setting up a network with taking a vacation.
There is no official definition of QoS. For the purposes of this book, QoS will be defined as the ability of a network to deliver to each and every user application that specifies a series of QoS parameters the correct amount of network resources needed to deliver that QoS. This definition sounds complicated, but it really is not. All it really means is, as an example, that if a user tells the network that this application needs a delay of 20 ms across the network, plus or minus 1 ms, the network will make sure that this happens. If not, then the user has a legitimate complaint and there might be a rebate on the monthly bill or some other penalty involved. Guaranteeing QoS is not easy on the part of the network.
The network must not only look around (using whatever mechanism employed for this purpose) and see if this 20 ms delay is even possible to deliver, but also make sure that no other applications granted a QoS in the future are allowed to affect the QoS just given out. In other words, the network cannot suddenly stop 20 ms delay because a whole raft of other users are now demanding 20 ms delays also. The delivery of QoS is so difficult to accomplish consistently on large public networks that the Internet as structured today cannot do it at all, and networks such as ATM, designed for QoS delivery from the ground up, can only deliver QoS under certain circumstances. This non-QoS support is the main reason that the Internet, and IP networks in general, are considered to be unreliable. It seems odd that a network like the Internet, characterized by dynamic rerouting around network node (router) failures while switched networks drop connections when switches fail, is considered unreliable while switched networks that drop connections are considered reliable. But this is only because the term “unreliable” when used in an Internet or IP context simply means that the network is unreliable when it comes to delivering user-desired QoS parameters such as stable delays (or guaranteed delivery!). The circuit-switched PSTN, although perhaps failure-prone, is much more reliable at delivering the QoS parameters that voice (especially) requires in terms of stable delays.
There is no real agreement as to exactly what parameters go into QoS. This sounds odd, but it is true. Most would agree that at least bandwidth, delay, delay variation (jitter), and error performance (in terms of cell/packet/frame loss) belong on the list of QoS parameters. Some add at least one or sometimes even two more. Reliability concerns, this time in the sense of network availability, have become more acute with the recent widely publicized outages of portions of the Internet and public frame relay networks. In fact without reliability, the ability of a network to deliver any other QoS parameters becomes pretty much moot. In some routing protocols, reliability is a metric that can be maximized when routing decisions are made.
There is also a good argument for adding security to the list of QoS parameters. The venerable IP protocol has had a bit configuration available in the IP packet header telling routers to “maximize security” when routing the packet for more than 20 years. Router vendors have never implemented this option, but that does not mean it is unimportant. It could even be argued that given today’s dependence on the Internet and other public networks for commerce and finance, if security is not a QoS parameter, it soon must be. So a comprehensive list of QoS parameters will have not four, but six items. These are:
1. Bandwidth (number of bits per second the application needs, e.g., 8 kbps).
2. Delay (maximum amount of time it can take to reach destination, e.g., r0 ms).
3. Delay variation or jitter (amount of time the delay is allowed to vary, e.g., 1 ms).
4. Errors or information loss (percentage of cells/packets/frames the network can lose, e.g., 0%).
5. Reliability (annual percentage of time that the network must be available, e.g., five nines or 99.999%).
6. Security (degree of protection afforded to information on the network, e.g., double encryption).
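The six parameters can be collected into a simple record, which makes it easier to see what an application would have to state and what a network would have to check. The sketch below is only an illustration: the class and field names are invented here, the 20 ms delay and 1 ms jitter values are borrowed from the example earlier in this section, and security is carried as a descriptive label that is not compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosProfile:
    bandwidth_kbps: float        # 1. bits per second the application needs
    max_delay_ms: float          # 2. upper bound on delay to the destination
    max_jitter_ms: float         # 3. allowed delay variation
    max_loss_pct: float          # 4. tolerable cell/packet/frame loss
    min_availability_pct: float  # 5. annual availability
    security: str                # 6. descriptive label only, not compared


def satisfies(offered: QosProfile, required: QosProfile) -> bool:
    """True if the offer is at least as good on the five numeric parameters."""
    return (
        offered.bandwidth_kbps >= required.bandwidth_kbps
        and offered.max_delay_ms <= required.max_delay_ms
        and offered.max_jitter_ms <= required.max_jitter_ms
        and offered.max_loss_pct <= required.max_loss_pct
        and offered.min_availability_pct >= required.min_availability_pct
    )


if __name__ == "__main__":
    voice = QosProfile(8, 20, 1, 0.0, 99.999, "double encryption")
    offer = QosProfile(64, 15, 0.5, 0.0, 99.999, "double encryption")
    print(satisfies(offer, voice))
```

Different applications would fill in very different values, which is the point made in the paragraph that follows.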
The actual values of these parameters will differ from application to application and the ability of a given network architecture to support them will differ from network to network. For instance, delays are so variable on the Internet that it makes absolutely no sense for applications to specify delay variation limits.
What does all of this discussion of QoS have to do with frame relay? Primarily, it makes the point that frame relay occupies a sort of halfway point between network architectures with no QoS delivery mechanisms at all, like the Internet and other IP-based best-effort networks, and network architectures that were invented specifically to deliver precise QoS performance to all applications, such as ATM. This means that while there is no mechanism in frame relay for an application using a frame relay PVC to inform the frame relay network of its QoS needs, there are some basic bandwidth reservation mechanisms built into frame relay. Even a multiservice FRAD can only guarantee that some PVCs will receive priority queuing over other PVCs, not that the network delay will be lower than X at all times. This is a form of relative QoS, not the kind of absolute QoS that ATM can deliver.
But arguments that frame relay has no QoS guarantees, which seem to put frame relay into the same category as the Internet or other IP networks, are just wrong. This line of thought emphasizes the lack of the complete set of explicit QoS parameters that is present (but not always used) in ATM networks. It is true that the best that can be said for frame relay QoS is that the QoS is probabilistic and not deterministic as in ATM. So a frame relay network might be able to deliver frames in under 20 ms (for example) 99.4 percent of the time. While this is very good QoS performance, it is not ironclad. The 0.6 percent of a year that the delay is over 20 ms works out to 52.56 hours.
If the QoS is not met for several 8-hour days when critical business activities are scheduled, this 99.4 percent might be of little consolation to the user. Yet the service provider has met the letter of the Service Level Agreement (SLA).
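The 52.56-hour figure follows directly from the percentage. As a minimal sketch, assuming a 365-day year and a hypothetical helper name, the time a probabilistic guarantee leaves uncovered can be computed as follows.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def hours_outside_guarantee(met_pct, hours=HOURS_PER_YEAR):
    """Hours per year in which a 'met_pct percent of the time' target may be missed."""
    return (100.0 - met_pct) / 100.0 * hours


if __name__ == "__main__":
    missed = hours_outside_guarantee(99.4)
    print(f"{missed:.2f} hours per year")        # the delay target example
    print(f"{missed / 8:.2f} eight-hour days")   # expressed as business days
```

For the 99.4 percent example this comes to a little over six and a half 8-hour business days, which is why the guarantee can be met on paper while still disappointing the user.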
Frame relay service providers routinely use terms and conditions like “delay is less than 40 ms 99.5 percent of the time” and “PVC availability will be 99.99 percent annually” (this works out to less than an hour of downtime a year). Some SLAs are quite explicit: “Delay is 10 ms plus 0.05 ms per 100 km of route miles from source to destination.” All of these conditions require that some mechanism be put in place to measure the QoS level delivered to each and every application, in order to verify compliance and detect violations. This is one of the reasons that frame relay, although a public network service, gives users ways of gathering more performance information about their portion of the network than ever before (how else could a customer ever determine route miles on the network?). The good news is that in most cases all of the network management mechanisms work extremely well. More details on SLAs, frame relay network management, and related topics will be discussed in Chapter 7.
It might be a good idea to close this section with a look at the service guarantees that a frame relay service provider would typically offer as opposed to the service guarantees that a typical Internet service provider would offer. These examples come from no specific source or service provider. However, they are certainly representative of the types of figures one would expect to see in a virtual private network proposal for a network based on frame relay as opposed to one based on the Internet or the IP protocol in general. This comparison is made in Table 3.1. Note that service providers commonly distinguish between network availability and user or application availability. This is an attempt to say that just because an individual user or site has a bad month or year, overall the network is doing just fine. Also, the Internet column has the absolute best guarantee from any number of widely known business Internet service providers.
So when it comes to QoS, frame relay is not the best network architecture available, but neither is it the worst. Frame relay QoS mostly provides only absolute bandwidth guarantees, but bandwidth is probably the most critical of the six QoS parameters when it comes to correct day-to-day user application operation.
Table 3.1 QoS Levels in Frame Relay and on the Internet or with IP
Parameter                       Frame relay      Internet, Typical   Internet, Best
Delay                           60 ms one way    No guarantee        150 ms or less one way
Errors (Loss)                   99.99% (1)       No guarantee        Individual case basis
Reliability, network            99.99%           99%                 100% (2)
Reliability, user/application   99.9%            No guarantee        100% (2)
Penalty?                        Yes, detailed    No                  Almost same as frame relay
(1) This applies only to traffic which conforms to the committed information rate (CIR).
(2) The ability of any service, let alone the Internet or IP, to be 100% reliable is remote. 99.7% or 99.5% are more often the best.
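The SLA terms quoted earlier in this section lend themselves to mechanical checking. The sketch below is a rough illustration under stated assumptions: the function names are invented, the default constants come from the sample clause “10 ms plus 0.05 ms per 100 km,” and the delay samples are made-up numbers rather than measurements.

```python
def sla_delay_bound_ms(route_km, base_ms=10.0, per_100km_ms=0.05):
    """Delay allowed by a distance-based clause: base plus an increment per 100 km."""
    return base_ms + per_100km_ms * (route_km / 100.0)


def meets_percentile_sla(delays_ms, limit_ms, required_pct):
    """True if at least required_pct percent of samples are under limit_ms."""
    within = sum(1 for d in delays_ms if d < limit_ms)
    return 100.0 * within / len(delays_ms) >= required_pct


if __name__ == "__main__":
    print(sla_delay_bound_ms(4000))                      # a 4000 km route
    samples = [30.0] * 199 + [45.0]                      # hypothetical measurements
    print(meets_percentile_sla(samples, 40.0, 99.5))     # "under 40 ms 99.5% of the time"
```

A customer can only run a check like this if the service exposes per-PVC delay measurements, which is the reason the text stresses performance reporting.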
The Frame Relay UNI and NNI
A basic frame relay network is composed of three simple elements. The elements are the access link, the port connection, and the associated virtual connections (which are almost all PVCs today). The access link and UNI arrangement is such an important piece of the frame relay network service that the access link will be discussed more fully in the next chapter. This chapter will emphasize the frame relay switch port connection and virtual connections (or circuits). Although the two elements of port connection and virtual connections will be discussed separately, the port connection and virtual connections have no meaning unless used together in a frame relay network. Think of the port connection as a hardware aspect of frame relay and virtual connections as a software aspect of a frame relay network. It takes both to do anything useful.
The relationship between the frame relay access link (UNI), port connection, and virtual circuits is shown in Figure 3.1. The access link runs between the customer premises frame relay access device, or FRAD, and the frame relay network service provider’s switch. The port connection is the actual physical connection on the switch to which the FRAD device is attached by the access link. Finally, the virtual connections (PVCs) are what allow all user traffic from the CPE to be sent into the network on the same access link, yet be delivered almost anywhere in the world. In frame relay, the virtual connections are identified by a Data Link Connection Identifier (DLCI), which will be discussed further later on.
Figure 3.1 A typical frame relay network user connection.
The port connection forms the user entry point into a frame relay network. The port connection is usually associated with a single customer site, but not always. In other words, it is possible to have two sites linked to the frame relay network through one port, or even one site linked through two ports, but neither of these situations is common. In most cases, one site gets one port, no matter if there are many users, applications, or protocols sharing the network. The key with frame relay is that many logical connections will share a single physical port connection. These logical or virtual connections (the PVCs) will carry traffic to many remote locations. And the nice thing about frame relay is that all of a particular site’s traffic, regardless of originating users, application, or protocol, will use the same PVC to send and receive traffic to a particular site in the vast majority of cases.
In spite of all this connectivity, there is no dedicated bandwidth allocated to these individual users, applications, or protocols on the frame relay network. Dedicated bandwidth (all the bandwidth, all the time) is a characteristic of private line networks, but not of frame relay. Instead, the port connection on the frame relay switch will dynamically allocate the frame relay network capacity to meet the changing needs of all the users on the network, not just this one port. This is simply a way of saying that in frame relay, there is no dedicated bandwidth, but there is dedicated capacity on the network. The idea of dedicated capacity will be more fully explored in Chapter 4.
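The idea of many virtual connections sharing one port can be sketched as a lookup from remote site to DLCI. This is a simplified model, not an implementation of the frame relay protocol: the class, the site names, and the DLCI values 101 and 102 are all hypothetical.

```python
class AccessPort:
    """One physical port connection shared by many PVCs, each identified by a DLCI."""

    def __init__(self, speed_kbps):
        self.speed_kbps = speed_kbps
        self._dlci_by_site = {}

    def add_pvc(self, dlci, remote_site):
        # A DLCI identifies exactly one virtual connection on this port.
        if dlci in self._dlci_by_site.values():
            raise ValueError(f"DLCI {dlci} already in use on this port")
        self._dlci_by_site[remote_site] = dlci

    def send(self, remote_site, protocol, payload):
        """Return the (dlci, protocol, payload) triple that would use the access link."""
        dlci = self._dlci_by_site[remote_site]
        return (dlci, protocol, payload)


if __name__ == "__main__":
    port = AccessPort(speed_kbps=64)
    port.add_pvc(101, "Chicago")
    port.add_pvc(102, "Dallas")
    # Different protocols bound for the same site share the same PVC.
    print(port.send("Chicago", "IP", b"database query"))
    print(port.send("Chicago", "SNA", b"terminal session"))
    print(port.send("Dallas", "IP", b"file transfer"))
```

Everything leaves on the same access link; only the DLCI attached to each frame decides which remote site it reaches.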
Capacity is determined based on the port speed. It determines the total amount of information a user sends into the network in a given unit of time (usually a full second). For example, a port speed of 64 kbps effectively allocates a capacity of 64,000 bits each second to all of the users attached to that port connection. There is no possible way that any user or application could ever send more than this number of bits into the network in a second. Most frame relay service providers allow port connection speeds as a set of multiples of 64 kbps (56 kbps in some cases). These speeds are essentially based on something called fractional T1 (FT1) speeds. While it is not important to know exactly what this means, it is important to know what speeds are represented. These speeds are usually 56/64 kbps, 128 kbps, 256 kbps, 384 kbps, 512 kbps, 768 kbps, 1024 kbps, and 1536 kbps. Some providers do not support all of these speeds and some support other speeds, but this set is very common. Figure 3.1 showed only one site using a frame relay network across the UNI. To be more complete, there would need to be at least two sites and UNIs linked across the network. There must be at least one switch, and usually there are many. The switches link to each other over a network node interface, which is undefined in frame relay. Almost any protocol and hardware can be used, as long as it is supported by the switch vendor(s) of both switches and provides adequate Quality of Service (QoS) to users. Ironically, one of the most common uses of ATM today is to provide such a backbone for frame relay switches to connect. When ATM forms the frame relay backbone, it is known as cell-based frame relay and has many benefits for service providers and users alike. The relationship between ATM and frame relay will be explored more fully later in Chapter 12. For the sake of completeness, an entire (but very small) frame relay network is shown in Figure 3.2. 
Note that ATM is used as the backbone technology, but this is just one of the possibilities. In this case, the frame relay switches themselves become the end devices on the ATM network.
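The port speeds listed above are all multiples of 64 kbps, and the port speed sets a hard ceiling on what can enter the network in a second. A brief sketch, with invented function names and the speed list taken from the text, shows both points.

```python
# Common port speeds from the text, in kbps (64 kbps ports may be 56 kbps in some cases).
COMMON_PORT_SPEEDS_KBPS = (64, 128, 256, 384, 512, 768, 1024, 1536)


def bits_per_interval(port_kbps, seconds=1.0):
    """Maximum number of bits a port can accept in the given interval."""
    return int(port_kbps * 1000 * seconds)


def smallest_port_for(required_kbps):
    """Smallest common port speed that covers the requirement, or None if none does."""
    for speed in COMMON_PORT_SPEEDS_KBPS:
        if speed >= required_kbps:
            return speed
    return None


if __name__ == "__main__":
    print(all(speed % 64 == 0 for speed in COMMON_PORT_SPEEDS_KBPS))
    print(bits_per_interval(64))
    print(smallest_port_for(300))
```

Individual providers may offer a different set of speeds, so the tuple should be treated as an example rather than a definitive list.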
Figure 3.2 A frame relay network.
Public network standards typically spend a lot of time detailing exactly what should happen in terms of the software and what the hardware arrangements from the CPE to the network switch port should be. The CPE to switch interface defines the UNI and is clearly a key part of any network architecture. Without firm and complete UNI specifications, CPE devices could not be as varied as they are in terms of vendors and optional capabilities, and service providers could only support a small subset of all possible CPE configurations. The frame relay UNI allows this interoperability to take place.
But the frame relay network node interface is another story. Normally, the network node interface can be abbreviated NNI, but this is not a wise idea when speaking of frame relay networks. The acronym NNI does in fact exist in frame relay, but means network-to-network interface. So the frame relay NNI means something entirely different than the acronym NNI does in other technologies, such as ATM. Actually, ATM was the first major network architecture to define a standard network node interface. Most other network architectures, especially public network architectures, never bothered to define a standard network node (switch-to-switch) interface. The reason is very simple: Such a standard interface on public networks was not felt to be in the best interests of the network.
Such an attitude sounds quite odd given the current climate and push toward standardization at all costs. But this attitude grew out of the voice network and the philosophy was later applied to X.25 packet-switching networks and frame relay, among other types of networks. The approach to standardization on the public voice network did not emphasize interoperability. Instead, the approach emphasized innovation. The feeling was that if standards are too strictly defined, no one will ever do anything radically different, since obviously this new approach would not fit in with the currently defined standard. If standards are more loosely defined, then innovation can proceed with less concern for interoperability. Consider the PSTN as an example. Once people could buy their own telephones, the interface from telephone to voice switch was fully and strictly defined, right down to the voltages. But there was still no standard way for the voice switches to talk to each other. Each central office or local exchange switch vendor had its own, proprietary way of signaling and trunking between switches, and each felt that its way was the absolute best possible way of performing this task. This situation encouraged vendors to freely innovate and explore other methods of switch interfaces, since the only concern was for backward compatibility with their older products, at least until the older switches could be phased out. But what about interoperability? Proprietary voice switch interfaces meant that a multivendor environment was difficult to achieve. If it had to be done, vendor A’s switch had to translate everything into vendor B’s talk or vice versa before any interswitch communication could take place. And this translation process is exactly what was done. At first, it would seem a chaotic situation, especially to those used to a standards-dominated world. 
What saved the PSTN was the fact that there were only about a half dozen public voice switch vendors, so multivendor translation was not as big a problem as it would be in the LAN world with 60 or more Ethernet hub vendors. Large public networks like the PSTN were seldom multivendor environments anyway. There were few alternatives, as just mentioned, and no one cared to build a network where intervendor fingerpointing between the switch vendors at each end of the link slowed troubleshooting and repair times to a crawl. The customers (and regulators) would not stand for it. So most large public networks standardized on one vendor or another, and that was that. The proprietary approach was extended to X.25 public data networks, then to frame relay as well. So frame relay switch vendors are free to innovate any way they choose on the network node switch-to-switch interface. The only real requirement is that the two ends of the link understand each other. Ironically, in spite of the lack of standards for use on the network node interface between frame relay switches, there is one standard that is forbidden for use between frame relay switches. This is the frame relay UNI. The precise reasons for this are beyond the scope of this discussion, but this prohibition revolves around the fact that two frame relay switches are total peers, and the UNI requires one end of the link to be CPE. The UNI relationship is a peer one with regard to data transfer, but not so with regard to network management and so forth. How did ATM become a common backbone for frame relay networks? One major reason is that the pressure today in the industry is not toward innovation but toward multivendor interoperability. So proprietary interfaces, while tolerable, are not always the first choice. Also, as services grow, there are more network nodes than ever. 
Multivendor environments are more common in the data world, where a huge switch has 256 ports, not 10,000 or even 40,000 as on a large voice switch. Therefore, if no standard network node interface exists, it might still be a good idea to use something else that is standard to tie all of the nodes together. That is one role of ATM in a frame relay network. ATM provides the standard network node interface that frame relay lacks. Each frame relay switch essentially becomes a user on the ATM network. There is much more to the relationship between frame relay and ATM than just an ATM backbone for frame relay switches. But the positioning of ATM as backbone for frame relay is enough for now.
In spite of the previous discussion, there is an NNI acronym in frame relay; NNI means Network-to-Network Interface. The frame relay NNI is used whenever two different service providers’ frame relay networks need to communicate. After all, they might be using different switch vendors’ products and proprietary interfaces will not work. And although multivendor public network environments are not all that common, they are not particularly rare either. For instance, a service provider might change switch vendors at some point. It would hardly be possible or intelligent to discard the previous vendor’s equipment. The vendors might be isolated by area or function, but the switches must still communicate when required. Translation can be used in this situation, but there are potentially more frame relay switch vendors than voice switch vendors. And the more vendors, the more the need for standard interoperability between them. Some standard way must be found in order to allow two different frame relay networks, or portions of networks, to communicate.
The relationship and uses of the frame relay UNI, NNI, and interswitch interfaces are shown in Figure 3.3. Note the presence of a simple router as a FRAD. The possible FRAD configurations are a major theme of this chapter and the next. The other FRADs in the figure have multiple ports that connect other devices, probably routers, but also other things, especially IBM SNA components. The figure shows a private frame relay network as well as public frame relay. Nothing prevents an organization from buying and installing its own internal frame relay network. Only leased private lines are needed to tie the switches together and link the FRADs to the switch ports. But if the private network needs to access users on a public frame relay network, there must be a standard interface between them if the switch does not understand each and every proprietary protocol in use. This is one role of the NNI.
Figure 3.3 Frame relay UNIs and NNIs.
The primary role of the NNI is shown in the figure also. The two public frame relay networks could belong to two frame relay service providers, perhaps a LEC and an IXC. Alternatively, the two public frame relay networks could belong to the same service provider, and could even service the same geographical area. But in this case, one cloud contains all of vendor A’s switches and the second cloud contains all of vendor B’s switches. This is a job for the NNI as well. Both uses are common.
The Frame Relay Protocol Stack
Only a few related topics remain to give a good overall description of how a frame relay network actually works. It has already been pointed out that the vast majority of public frame relay networks (and even private ones) offer only Permanent Virtual Circuits (PVCs) for connectivity. Frame relay service providers that do offer Switched Virtual Circuits (SVCs) are few and far between, and they usually place many restrictions on the number of SVCs that can be established, where the endpoints are located, and so on. Of course, PVCs incur none of the call setup delay that SVCs do while signaling messages are processed by the network to determine routes and network resources, establish switch table entries, and engage billing procedures. All of these issues will be discussed more fully later. All that remains here is to show that SVC support depends on the exact frame relay protocol stack that a frame relay service provider supports.
Mention has already been made of LAPF core, the basic frame protocol run on any frame relay UNI at all. The fact is that LAPF core simply transfers frame relay frames around a PVC-defined network. That is, no SVCs are possible in a frame relay network employing only LAPF core. It is often said, and not incorrectly, that frame relay is defined at the bottom two layers of the OSI RM. This is accurate as long as only the data transfer aspect of frame relay is being discussed. But there is more to networking than data transfer, much more in fact. Networks must be managed with some form of network management techniques. The techniques could be added on, but network management is more efficient and consistent if the techniques are part of the network specification itself. A network must also be controlled so that users and network can inform each other of their intentions in terms of connectivity. This is the task of the signaling protocol.
In any case, there is no room at Layer 2 of the OSI RM for these management and control functions. Yet these functions must be performed in the frame relay network nodes, the switches themselves. With regard to the frame relay protocol stack, these functions can be considered to be at Layer 3, the Network Layer, although most texts are fond of insisting that there is no Layer 3 in frame relay switches at all. Here is how the frame relay protocol stack actually looks. X.25 is a fairly faithful representation of what the OSI RM should do at the Physical, Data Link, and Network layers. In X.25, the layers can be represented by V.35 or X.21 at the Physical Layer, the X.25 LAPB at the Data Link Layer, and the X.25 Packet Layer Protocol (PLP) at the Network Layer. Frame relay can perform all data transfer tasks with a subset of the full OSI RM Data Link Layer. Network management is done in frame relay as a small subset of OSI RM Layer 3 and only on the UNI. This basically means that frame relay network management on the UNI consists of a small set of messages inside special frames sent back and forth on the UNI. LAPF on its own only supports manually configured PVCs. If SVCs are to be supported in frame relay, they can take one of two forms. The Integrated Services Digital Network (ISDN) signaling protocol specification on which frame relay signaling is based is called Q.931. Frame relay adapts and extends the Q.931 ISDN signaling protocol as Q.933. If the frame relay SVCs are established with signaling messages that form a subset of the full Q.931 ISDN call control message set, technically a Q.933 subset, this is known as non-ISDN SVC support. With non-ISDN SVC support, there is no relationship between a service provider’s ISDN signaling network (and billing system) and its frame relay SVC offering. But at least there are frame relay SVCs. 
However, a service provider can also make its frame relay network a part of its ISDN, with frame relay playing the same role as X.25 as a packet-bearer service. This requires the full implementation of Q.933. With ISDN-compliant SVC support, there is a close relationship between a service provider’s ISDN signaling network (and billing system) and its frame relay SVC offering. In this case, the user can use the service provider’s ISDN to establish frame relay SVCs. The relationships between all of these frame relay protocol stack permutations are shown in Figure 3.8.
Figure 3.8 The OSI RM, X.25, and frame relay.
Chapter 4: The Frame Relay User-Network Interface
Overview
The most visible portion of a frame relay network from the user’s perspective is the UNI. The user cannot see the frame relay switch at the other end of the UNI, nor the internal trunking network tying all of the switches together. But the user has direct access to the premises end of the frame relay UNI. In many frame relay network situations, the main concern of the user is: “How are the UNIs?”
In some ways the UNI is also the most visible portion of the frame relay network from the service provider’s perspective. This is where many if not most of the network management efforts are focused. The UNI is also the portion of the network that requires the most care and attention when configuring PVCs for users. And the UNI is the part of the frame relay network that requires the greatest lead time if a physical link that can form the basis of a UNI is not already in place. In many frame relay network situations, the main concern of the service provider is: “How are the UNIs?”
This chapter explores all of the details of the UNI itself and any other issues involved with connecting a user to a frame relay network. This is the first look at the frame relay frame structure and the concept of the Data Link Layer connection identifiers used in frame relay. UNI configuration issues are examined in this chapter, especially the key frame relay concept of Committed Information Rate (CIR) that is often confusing. Ways of handling CIRs are discussed, with a long look at the related concepts of regular booking and oversubscription. Finally, all of the currently supported options for the UNI are investigated, from simple links to multiport UNIs.
Regular Booking and Oversubscription
Every DLCI defined on a UNI must have a CIR. The CIR can be zero in some cases, but this is not the same as a DLCI not having a CIR at all. The CIR can be equal to, but cannot exceed, the access line rate of the UNI. However, the CIR on a DLCI is typically set at anywhere from 33 percent to 50 percent of the UNI speed, or even lower in many cases. CIRs can be adjusted on a monthly or even weekly basis, and often should be as traffic patterns change on the frame relay network.
But suppose a 64 kbps frame relay UNI is configured with four DLCIs representing PVCs. Each must have a CIR associated with it. If the user or customer does not have a preference, the CIR is typically set at 50 percent of the access line rate and adjusted periodically as the network is used. But if each of the four DLCIs has a CIR of 32 kbps, the total of the CIRs defined on the UNI is 128 kbps. This is twice the access line rate of the UNI. How can this scheme possibly work? It does work, and usually quite well, because of the extremely bursty nature of frame relay traffic. As long as all four DLCIs are not bursting at the same time, the 64 kbps UNI can handle all of the traffic. If all four DLCIs do burst at once, the FRAD must either buffer the excess frames or discard them at the source before the frames even enter the frame relay network. Most FRADs typically have anywhere from 1 to 4 megabytes of buffer space just to handle these bursts.
Sometimes, a frame relay service provider will know from experience that some applications or users will regularly burst simultaneously. The multiple bursts will result in lost information if the UNI line rate is consistently exceeded. In this case, the service provider might adopt a policy known as regular booking for the UNI. With regular booking the sum of the CIRs defined on all of the DLCIs configured on the UNI cannot exceed the line rate of the UNI.
So four DLCIs on a 64 kbps UNI with regular booking cannot have CIRs that add up to more than 64 kbps. The CIRs need not all be equal, of course, but the sum of the CIRs cannot exceed 64 kbps. If another DLCI must be added to a UNI with regular booking, the DLCI can only be added by decreasing the CIRs on one or more other DLCIs so that the total of the CIRs still does not exceed the UNI speed, or the new DLCI must be added with a CIR of zero if appropriate or allowed. It is important to note that even with regular booking, applications running on DLCIs can still exceed their CIRs, producing DE = 1 frames that might be discarded by the frame relay network under certain conditions. But with regular booking, all DLCIs can receive their CIRs at the same time, which is usually the whole idea.
If the sum of the CIRs defined on all of the DLCIs configured on the UNI is allowed by the service provider to exceed the line rate of the UNI, this is known as oversubscription. The amount by which the sum of the CIRs can exceed the UNI line rate is known as the oversubscription policy of the frame relay service provider. For example, an oversubscription policy of 200 percent means that the sum of the CIRs cannot exceed twice the UNI line rate, or 128 kbps on a 64 kbps UNI. Oversubscription policies of 300 percent are common and even 400 percent is not all that unusual. An oversubscription policy of 100 percent is essentially the same as regular booking.
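The booking rules above reduce to a simple comparison, sketched here in Python. The function name check_booking and its parameters are hypothetical and chosen only for illustration; the numbers are the 64 kbps UNI and four 32 kbps CIRs from the example.

```python
def check_booking(access_rate_bps, cirs_bps, policy_percent=100):
    """Return (ok, total_cir, limit) for the DLCIs configured on one UNI.

    policy_percent is the provider's oversubscription policy:
    100 corresponds to regular booking, 200 lets the CIR total
    reach twice the access rate, and so on.
    """
    for cir in cirs_bps:
        # A single CIR may equal, but never exceed, the access rate.
        if cir < 0 or cir > access_rate_bps:
            raise ValueError(f"CIR {cir} is outside 0..{access_rate_bps} bps")
    total = sum(cirs_bps)
    limit = access_rate_bps * policy_percent // 100
    return total <= limit, total, limit


uni = 64_000
four_dlcis = [32_000] * 4
print(check_booking(uni, four_dlcis, 100))  # regular booking: (False, 128000, 64000)
print(check_booking(uni, four_dlcis, 200))  # 200 percent policy: (True, 128000, 128000)
```

A CIR of zero passes the per-DLCI check and adds nothing to the total, which is why a zero-CIR DLCI can be added to a UNI that is already fully booked.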
The Frame Relay Committed Information Rate DLCIs are defined on a UNI. As many DLCIs are defined as are needed for connectivity to all reachable remote sites, within limits. The limitations are dictated by a couple of factors. First, there are only so many DLCIs that can be realistically supported on a UNI of a given speed, no matter how much buffering is used. After all, it is a lot to expect of a 56/64 kbps UNI that is replacing 67 private lines (for instance), no matter how bursty or intermittent the traffic. Second, each DLCI defined on a UNI has to include a certain Committed Information Rate (CIR) parameter. The CIR is probably the most difficult concept to grasp immediately when it comes to frame relay. This is due to the fact that in the world of leased private lines there is no equivalent concept. A lot of works on frame relay, from books to course notes to magazine articles, spend so much time describing the parameters and operational details involved in CIRs that people are often left with the impression that CIRs have some deep mathematical significance or are so complex that no one could or should attempt to understand them at a functional level. But none of this is true. The purpose of CIRs is easy to grasp and the function of the CIR is easy to understand. Think of the following discussion as the simplified, illustrated, and painless introduction to frame relay CIRs. Every DLCI must have a CIR. A PVC has a CIR assigned at service provision time. An SVC has a CIR configured when the demand connection is set up (the CIR is requested in the frame relay Q.933 call setup message). Some service providers allow a CIR of zero on a DLCI. This is not the same as saying there is no CIR at all. The CIR just contributes nothing to the CIR total on a given UNI. The CIR is often defined as a dedicated amount of bandwidth on a DLCI. This is not really true. There is no dedicated bandwidth in a frame relay network. 
Dedicated bandwidth (all the bandwidth, all the time) is more a characteristic of the private line networks that frame relay usually replaces. It is more accurate to say that the CIR represents a commitment on the part of the frame relay network to dedicate a given capacity for the transmission of user information at a certain rate determined by the CIR. The emphasized terms in this definition are precisely where the concept of a CIR originally comes from. Every DLCI, PVC or SVC on a frame relay network will have a CIR associated with it. This CIR assignment should reflect the average amount of traffic between the two sites connected by the DLCI. The CIR is a key part of exactly how a frame relay network can replace (for instance) three dedicated private lines with one access link to a public frame relay network running at the same speed as any one of the three private lines. This happens because, on average, each of the private lines may only be in use one-third of the time. The trick is to determine what the appropriate CIR is for each DLCI. This is not always easy; there is always a risk that the CIR chosen may be wrong. Fortunately, CIRs are relatively easy and quick to change and there are some general guidelines. Another way to interpret the CIR is as a statistical measurement of throughput over time. This introduces a number of technical parameters used in computing an actual CIR. All of these parameters are important, but only for understanding the deep theoretical underpinnings of frame relay. This discussion will be based more on example than abstractions. Frame relay specifications establish a number of traffic descriptors that are used by the service provider to determine when the traffic arriving on a UNI is within the configured class of service on the DLCI. This specific frame relay class of service is not to be confused with the general Quality of Service (QoS) concept that applies to all networks from the Internet to ATM. 
Following is a list, with definitions of the class of service parameters:
Access rate: This is the physical speed of the UNI. The most common speed is 56/64 kbps, but speeds as high as 45 Mbps are supported in current frame relay specifications.
Committed Rate Measurement Interval (TC): This is the interval of time over which the network computes the information arrival rates and applies the CIR. Usually, this interval is measured in fractions of a second.
Committed Burst Size (BC): This is the maximum number of bits that the network can absorb in the time interval TC, and still guarantee to deliver on the DLCI, under normal conditions. Usually, normal conditions means that the frame relay network is not congested from a series of bursts occurring in too short a time period.
Excess Burst Size (Be): This is the maximum number of bits that the network can absorb above and beyond the committed burst size BC and attempt to deliver on the DLCI. Note that bursts within the committed size are guaranteed delivery, but delivery of excess bursts is only attempted. There is no penalty for nondelivery of excess bursts.
These four basic parameters are combined in various ways to determine the amount of network resources that must be dedicated to support a particular CIR on a given DLCI. For instance, the frame relay standards specify that the CIR be set equal to the committed burst size BC divided by the committed rate measurement interval TC, or BC/TC. The usual time unit is a fraction of a second, or a full second, but this varies from service provider to service provider. The following examples will use one full second, which makes the math much easier when dealing with CIRs. There is also an Excess Information Rate (EIR), which is defined as the excess burst size Be divided by TC. The CIR and EIR give results expressed in bits per second, usually kbps. Neither the CIR nor the EIR, nor even the sum of the CIR and the EIR, assigned to a DLCI can ever exceed the access rate. This only makes sense.
The network cannot receive more bits, excess or not, per unit time than the UNI can transmit. However, the CIR and EIR together can add up to less than the access rate of the UNI, although this is not common. Why is so much effort applied to each DLCI on a UNI in terms of CIR? Because the CIR is directly, and the EIR indirectly, used to determine the status of the Discard Eligible (DE) bit in the frame relay header. If the DE bit is zero, then the network should deliver the frame to the destination under all but the most dire network conditions, a guarantee including such routine conditions as congestion and backbone link failures. This is what the customer is paying for, of course. On the other hand, if the DE bit is set to 1, the network can discard the frame under congested conditions (and other threatening circumstances). Under particularly drastic conditions, the frame relay network can even discard DE = 0 frames, but of course this action might result in rebates to customers on the affected links if it occurs routinely. Sometimes users familiar with private line networks or other forms of networking recoil at the very thought that a network might discard data, even if it is only under certain conditions. But this is not the point. The fact is that all networks must discard data if the alternative is to crash the node or whole network. Routers must discard packets when the buffers become full, and routinely do so. The Internet would grind to a halt (and sometimes seems to anyway) if all packets sent into the network had to be delivered. Normally, anywhere from 10 percent to even 20 percent packet loss on the Internet is considered acceptable under certain peak periods and traffic loads. The packets must be resent from the endpoint applications if necessary. The problem with haphazard discard mechanisms is that they often discard precisely the wrong traffic. A router seeking to free up buffer space often goes after smaller packets. 
The philosophy seems to be that freeing up a little buffer space as needed to soldier on is preferable to a wholesale wiping out of large packets which might contain large amounts of user data. But what often happens is that the smaller packets contain the network or application-level acknowledgments that destinations send to sources to inform the source that everything up to a given point has been received properly. If an acknowledgment is discarded, the net result is often a barrage of resend traffic that swamps the network worse than the original congestion that triggered the discard process in the first place.
So the DE bit provides a mechanism that frame relay networks can use to establish a loose system of priorities for frames that might be discarded before others. In other words, the rule is to discard frames with DE = 1 before those with DE = 0, but only if such action becomes necessary to avoid severe network congestion. This is what the CIR is for. FRADs that respect the CIR will generate only traffic flows that have DE = 0 frames. Only above the CIR will frames be tagged as DE = 1 at the network end of the UNI.
It is hard enough to understand CIRs when they are described in words alone. A few graphics and examples might make the concept of CIRs more real and meaningful. Always keep in mind that the CIR is designed for bursty traffic and represents a statistical smoothing of traffic over time. A person might drive to work in an hour, at 60 miles per hour, and do the same returning home. But for the eight hours that the person is at work, the car in the employee lot is not moving at all. So the burst speed is 60 miles per hour, but the smoothed or average speed is only 12 miles per hour for all 10 hours. This does not mean that the roads should be built for 12 miles per hour instead of 60 miles per hour, any more than a 64 kbps UNI should operate at only 8 kbps. But it does mean that traffic on highways, as on frame relay networks, is bursty and prone to congestion at certain times.
The CIR for a DLCI may be set at 32 kbps. Of course, the UNI physical access link into the frame relay port connection on the frame relay switch may run at 64 kbps. The important thing to remember about CIRs is that when information is sent on a UNI, this information must always be sent at 64 kbps. It is physically impossible for the link to do otherwise. The CIR of 32 kbps means that frames containing data may flow on this DLCI into the frame relay network at 64 kbps, but only for one-half second in each second. This obviously preserves the CIR of 32 kbps, since one half of 64 kbps is 32 kbps.
This is just another way of saying that, for this CIR on this DLCI, the committed burst size BC is 32,000 bits and the committed rate measurement interval TC is 1 second (BC = 32 kbits and TC = 1 second).
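The arithmetic behind this example can be written out as a worked equation. The symbols B_c, B_e, and T_c stand for the class of service parameters defined earlier; AR for the access rate and t_burst for the time spent sending at the access rate in each interval are labels introduced here only for illustration.

```latex
\begin{align*}
\mathrm{CIR} &= \frac{B_c}{T_c} = \frac{32{,}000\ \text{bits}}{1\ \text{s}} = 32\ \text{kbps} \\
t_{\text{burst}} &= \frac{B_c}{AR} = \frac{32{,}000\ \text{bits}}{64{,}000\ \text{bits/s}} = 0.5\ \text{s} \\
\mathrm{EIR} &= \frac{B_e}{T_c}, \qquad \mathrm{CIR} + \mathrm{EIR} \le AR
\end{align*}
```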
This CIR of 32 kbps should not be seen as much of a limitation. An Ethernet frame has a maximum size of about 1,500 bytes, which would fit comfortably (and transparently) inside a frame relay frame. This is 12,000 bits. So a CIR of 32 kbps established for a particular DLCI means that the frame relay network will guarantee delivery of about 2 2/3 Ethernet frames per second on this DLCI. This is not only quite good in terms of capacity, it is an astonishing amount of traffic between two LANs at different sites. As long as the FRAD sends no more than two 12,000-bit frames on this DLCI per second, the frames will be marked (or tagged; both terms are used) as DE = 0 (do not discard).
It is entirely possible that a user application will generate information to be sent to a remote location at more than 32 kbps. More accurately, the application will send for more than one half second in a given second on a 64 kbps UNI. In frame relay talk, the application will burst above the CIR. In the example using Ethernet frames, perhaps three Ethernet frames (36,000 bits) are sent in the same one-second interval. At the network end of the UNI, the third frame would be marked as DE = 1, since the whole frame is considered to be above the CIR. The frame might still make it through the network under most circumstances, but if the frame is discarded at the ingress switch, or any other switch, the user has no reason for complaint, since no commitment has been made on the part of the service provider to deliver frames above the CIR. If a fourth Ethernet frame arrives, this, too, must be marked DE = 1, since the bit total for the interval is now 48,000 bits.
This example assumes that the sum of the CIR and EIR on the DLCI adds up to 64 kbps, which is normal. It is possible that the EIR might be set to only 16 kbps, however. (This is allowed, but not common.) In this case, the fourth Ethernet frame would be totally ignored, its bits not even buffered, and no response to the user end of the UNI is ever made.
Given the added complexity when the CIR and EIR do not add up to the access rate on the UNI (is this frame within or without the EIR limit?) and the questionable practice of doing bad things on a network without informing the sources, the guideline that CIR + EIR = access rate is almost universally followed.
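The marking rule in the Ethernet example can be sketched in a few lines of Python. This is a simplified illustration, not the behavior of any particular switch: the function name mark_frames is hypothetical, all frames are assumed to arrive within a single measurement interval, and counting a frame that lands exactly on a limit as still within that limit is an assumption of this sketch.

```python
def mark_frames(frame_bits, bc, be):
    """Classify frames arriving within one measurement interval Tc.

    bc: committed burst size in bits (CIR * Tc)
    be: excess burst size in bits (EIR * Tc)
    Returns "DE=0", "DE=1", or "discard" for each frame, in order.
    """
    results = []
    accepted = 0  # bits accepted so far in this interval
    for bits in frame_bits:
        if accepted + bits <= bc:
            results.append("DE=0")      # whole frame is within the CIR
            accepted += bits
        elif accepted + bits <= bc + be:
            results.append("DE=1")      # whole frame counts as above the CIR
            accepted += bits
        else:
            results.append("discard")   # beyond CIR + EIR for this interval
    return results


# CIR = 32 kbps, EIR = 32 kbps, Tc = 1 second, 12,000-bit Ethernet frames
print(mark_frames([12_000] * 4, bc=32_000, be=32_000))
# ['DE=0', 'DE=0', 'DE=1', 'DE=1']
```

Setting bc to zero models a CIR of zero: every frame that is accepted at all is marked DE = 1.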
Figure 4.7 shows how the example relates to the general frame relay class of service parameters. Note that when frames are sent, they are always sent at the UNI access rate, so their slopes are identical. Note that it is in the best interests of all involved if the FRAD respects the CIR for the DLCI and only DE = 0 frames are generated. However, the user applications have no idea of CIR or DLCI. If the CIR is to be respected, usually the FRAD must buffer the traffic. Here is an area where hardware FRADs generally perform better than software FRADs, such as routers. Typically, hardware FRADs respect CIRs and routers do not. So routers on frame relay networks usually have higher resend rates than hardware FRADs, although there is much more involved in resend rates than pure CIR considerations.
Figure 4.7 The CIR and the DE bit.
What happens next to a frame marked DE = 0 or DE = 1 varies according to the frame relay network. The fate of the DE = 1 frame depends heavily on the network switch and the current state of the frame relay network in terms of congestion. In some cases, frames sent in excess of the CIR will be stored in buffers on the network switch until there is capacity on the backbone link to forward them. But the DE = 1 frames are the first ones to be discarded in the case of network congestion. In other cases, these excess frames are immediately discarded, an option known as early packet discard. This is quite effective in avoiding congestion on the frame relay backbone, if somewhat drastic during low traffic periods.
In no circumstances whatsoever can the CIR chosen for a DLCI exceed the port speed at either end of the connection. For example, a site with a UNI and frame relay port speed of 64 kbps connected by a DLCI to a remote site with a UNI and port speed of 256 kbps cannot have a CIR associated with this DLCI in excess of 64 kbps. This only makes sense. The 64 kbps UNI and port could never keep up with a DLCI receiving frames from across the network at up to 256 kbps. So obviously the CIR in this example cannot exceed 64 kbps in either direction on the DLCI.
CIR values are typically available from 0 to 1.536 Mbps, with many values in between. Most frame relay providers offer a few sub-64 kbps CIRs and then CIRs in the same 64 kbps increments in which the frame relay port speeds are available. There is no technical reason to restrict the CIR to fractions and multiples of 64 kbps. It is just easier for the hardware, software, and network administrators to deal with. Some service providers do not offer a CIR of zero or CIRs higher than about 1 Mbps (1,000 kbps).
A CIR of zero basically means that each and every frame sent into the network is automatically marked discard eligible, or DE = 1. A CIR of zero is useful in two situations. First, the CIR of zero might be appropriate if the user has no idea of what the average traffic between two sites actually is. Second, there might be only a very small amount of traffic between two sites, so any CIR at all could be considered overkill.
The Frame Relay Frame
Frame relay is a network service that occupies the lower two layers of the OSI RM. This model breaks down all network communication functions into seven functional layers, of which only a few are needed within the network itself. Each layer provides services to the layer above and obtains service from the layer below. Each layer is defined as a set of functions, which may be provided by many different actual protocols, as long as the protocol provides the services defined at that layer.
Within the context of the OSI RM, frame relay is a Data Link Layer (Layer 2) protocol. Other Layer 2 protocols used in other network technologies include Synchronous Data Link Control (SDLC), which is used in SNA networks, LAPB, which is used in X.25, and High-level Data Link Control (HDLC), which is intended to be used in pure OSI RM environments. In addition, LANs employ a slightly different Layer 2 structure to accommodate such common LAN protocols as token ring or Ethernet. All Layer 2 protocols are distinguished by their use of a common Protocol Data Unit (PDU) structure known as the frame.
All of these different Layer 2 protocols use a distinctive frame structure, and frame relay is no exception. The frame relay frame structure is more similar to WAN protocols such as SDLC or LAPB than it is to the LAN frame structures. This is only understandable, given frame relay’s relationship to X.25 and WAN protocols in general. What is different is the detailed structure of each field within the frame relay frame itself. Frame relay uses an adaptation of Link Access Protocol-D Channel (LAPD), a version of HDLC originally developed for ISDN. This is known as LAPF in frame relay networks. Actually, the LAPF specification for frame relay is a subset of the full LAPD protocol, which is specified in the Q.921 international standard for Digital Subscriber Signaling System, since frame relay networks are not required to do error recovery.
LAPF does define a series of core functions which all frame relay networks must perform. In many texts, the core functions are called the basic operations of frame relay. These functions or operations are basic or core in the sense that a frame relay network must perform these tasks for each and every user data frame transferred through the network. The five LAPF core functions are:
Use a Frame Check Sequence (FCS) to check for errors.
Check for valid frame length (another error condition).
Discard invalid or errored frames.
Provide frame delimiting, alignment, and transparency.
Multiplex and demultiplex frames.
All of these have been discussed to some extent already, except perhaps the last point. Briefly, all frame relay network nodes must check that the frame relay frame is within the allowable frame size parameters. These nodes must also use a specific Frame Check Sequence (FCS) to check frames for transmission errors. These errors are not corrected within the frame relay network. The node must discard invalid or errored frames. The source and destination frame relay equipment is expected to deal with these error conditions, not the frame relay network nodes.
The frame relay network nodes must also be able to delimit (detect the start and end of) frames, align the frame (that is, make sure the frame is formatted correctly), and ensure transparency. Transparency means that the frame relay node can make no assumptions about the content of the frame relay frame and cannot even look at the frame contents when processing the frame. So the frame contents are totally transparent to the frame relay network and literally anything should be able to be sent over the network, even voice and video. The last point about the ability of frame relay networks to multiplex and demultiplex frames simply means that a router or FRAD will send all frames to all reachable destinations on the same physical link (the UNI). All traffic from a particular site is multiplexed onto the UNI at the source and all arriving traffic to a particular site is demultiplexed to the destination application. All multiplexing and demultiplexing in frame relay is based on the connection identifier (DLCI). There are other functions commonly performed by other Data Link WAN protocols. These include such functions as frame sequencing, window sizing, and acknowledgments. These are not performed by the frame relay network nodes at all. This is one of the secrets of making frame relay into a fast packet protocol, since the network nodes are relieved of these functions and operate much faster as a result. Of course, no network protocol can afford to ignore these important functions. The point is that frame relay networks do not handle them within the network at the switches. These functions are performed at the end points of the network, at layers above the Data Link Layer. The combination of higher-layer protocol functions and intelligent end systems controls end-to-end transport of data and makes the end system responsible for error recovery, windowed flow control, acknowledgments, and so on. 
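Multiplexing and demultiplexing on the DLCI can be pictured with a small Python sketch. The names multiplex and demultiplex, the round-robin interleaving order, and the DLCI values 17 and 18 are all assumptions made for illustration; real equipment schedules frames in its own way.

```python
from collections import defaultdict


def multiplex(queues):
    """Interleave per-DLCI payload queues onto one UNI.

    queues maps a DLCI to a list of payloads waiting to be sent.
    Returns the sequence of (dlci, payload) pairs placed on the link.
    """
    pending = {dlci: list(q) for dlci, q in queues.items()}
    link = []
    while any(pending.values()):
        for dlci in sorted(pending):
            if pending[dlci]:
                link.append((dlci, pending[dlci].pop(0)))
    return link


def demultiplex(link):
    """Sort frames arriving on one UNI back out by their DLCI."""
    by_dlci = defaultdict(list)
    for dlci, payload in link:
        by_dlci[dlci].append(payload)
    return dict(by_dlci)


queues = {17: [b"a", b"b"], 18: [b"x"]}
link = multiplex(queues)
print(link)  # [(17, b'a'), (18, b'x'), (17, b'b')]
```

The only information the receiving side needs in order to separate the traffic is the DLCI carried with each frame; the payloads themselves are never inspected.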
Actually, the LAPF specification makes allowances for a frame relay transfer service to provide some of these functions to the Layer 3 (Network Layer) protocol. But since these functions apply to signaling messages, the frame relay network usually only provides the core functions for user frames. TCP/IP itself, or some other higher-layer protocol architecture, takes care of the additional functions at the Network Layer and even above.
The frame relay UNI sends and receives frame relay frames. The frame relay frame format is very simple in structure. First there is a special flag indicating the beginning of a frame. The flag is the bit configuration 01111110, or 7E in hexadecimal notation. Then there is a two-octet frame relay header field. An octet is defined as 8 bits and is preferred to the more common term “byte,” which can have various sizes in some contexts. In some documentation, the frame relay header is called the address field, but it will be called a header here because this is the more common term. The header is followed by the frame payload or information field, which is variable in size up to some maximum, most often 4096 octets. The information field is followed by a trailer two octets in length. This frame check sequence (FCS) contains a 16-bit Cyclical Redundancy Check (CRC-16) that provides a very good error-detection mechanism for finding frames that have bit errors in them. The whole frame ends with another 7E flag octet. Immediately following an ending flag, a new frame relay frame might begin, although such rapid operation is rare. The structure of the frame relay frame is shown in Figure 4.1.
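The frame layout just described, together with the length and FCS checks among the core functions, can be sketched in Python. Several details here are assumptions rather than statements from the text: the FCS is taken to be the 16-bit CRC used by the HDLC family (polynomial x^16 + x^12 + x^5 + 1, sent low-order octet first), the function names are hypothetical, and the bit-level stuffing that HDLC-style links use to keep the flag pattern from appearing inside a frame is not shown.

```python
FLAG = 0x7E
MAX_INFO = 4096  # common maximum information field size, in octets


def fcs16(data: bytes) -> int:
    """16-bit frame check sequence (assumed HDLC-family CRC,
    processed low-order bit first)."""
    crc = 0xFFFF
    for octet in data:
        crc ^= octet
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def build_frame(header: bytes, info: bytes) -> bytes:
    """Flag, two-octet header, information field, two-octet FCS, flag."""
    if len(header) != 2:
        raise ValueError("the basic header is two octets")
    if len(info) > MAX_INFO:
        raise ValueError("information field too long")
    body = header + info
    fcs = fcs16(body).to_bytes(2, "little")  # low-order octet first
    return bytes([FLAG]) + body + fcs + bytes([FLAG])


def accept_frame(frame: bytes) -> bool:
    """Core checks: delimiting flags, valid length, and FCS.
    A frame failing any check is simply discarded (False)."""
    if len(frame) < 6 or frame[0] != FLAG or frame[-1] != FLAG:
        return False
    body, fcs = frame[1:-3], frame[-3:-1]
    if len(body) - 2 > MAX_INFO:
        return False
    return fcs16(body).to_bytes(2, "little") == fcs


frame = build_frame(b"\x04\x01", b"hello")
print(accept_frame(frame))  # True
```

Note that accept_frame only accepts or discards. Nothing in it attempts to correct an error or request a resend, which matches the division of labor between the network nodes and the end systems.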
Figure 4.1 The frame relay frame.
The frame relay header has an interesting structure itself. There are even a few variations allowed on the frame relay header, but all networks must support the basic structure. The basic frame relay header structure is shown in Figure 4.2.
Figure 4.2 The basic frame relay header.
The fields in the figure are labeled by the acronyms used to identify the parts of the frame relay header. These can be confusing at first, so a closer look at these fields and their functions is needed.
DLCI: This is the Data Link Connection Identifier. It is a 10-bit field that contains the logical connection number of the PVC (or SVC, when an on-demand connection is made) that the frame is following across the network. This number ranges from 0 to 1023, but some connection numbers are reserved for special functions. The DLCI is examined by each frame relay switch on the network to determine the correct output port to relay the frame onto. The DLCI number may be changed by the switch as well, and usually is. This is all right, as a particular DLCI has local significance only on a frame relay network. The reason that the DLCI is split between two octets is that the field is a concatenation of the SAPI and TEI fields from the ISDN LAPD frame structure, which are not used in LAPF.
C/R: This is the Command/Response bit. It is not used in the current definition of the frame relay protocol. Again, it is an artifact of the X.25 roots of frame relay.
EA: This is the Extended Address bit. These bits are at the end of each header octet and allow the DLCI field to be extended to more than 10 bits. The last EA bit is coded as a “1” and all previous EA bits are coded as “0s”. In the figure, there are only two EA bits, but other header configurations are allowed in frame relay. These other possibilities are discussed later.
FECN and BECN: These are the Forward Explicit Congestion Notification and Backward Explicit Congestion Notification bits. These bits help with congestion control on the frame relay network. The use of these bits can be quite complex and will be discussed more fully in a later chapter. Some real controversies about the proper and effective use of FECN and BECN have come up in the past few years, so this attention is certainly warranted.
DE: This is the Discard Eligibility bit. It is used to identify frames that may be discarded by the frame relay network under certain conditions. The use of this bit will be discussed further later in this chapter.
The FECN, BECN, and DE bits are distinguishing characteristics of the frame relay protocol. They represent something new in the philosophy of just what a network can and should do in special circumstances.
One of the features of the frame relay header that strikes people as extremely odd is the fact that the DLCI field is split between the two octets of the frame relay header. As it turns out, the reason for this is quite important for the understanding of why frame relay is considered to be an improvement over LAPD. The easiest way to appreciate this is to compare the structure of the ISDN LAPD frame and header to the frame relay frame and header structure just outlined. The structure of the ISDN LAPD frame is shown in Figure 4.3. The similarities with the frame relay frame structure are immediately obvious.
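To make the field descriptions of the basic two-octet header concrete, the following is a minimal Python sketch that packs and unpacks the fields. The exact bit positions are an assumption here, since Figure 4.2 is not reproduced in the text: the sketch follows the commonly documented Q.922 layout, with the upper six DLCI bits, C/R, and EA = 0 in the first octet, and the lower four DLCI bits, FECN, BECN, DE, and EA = 1 in the second. The names `FrameRelayHeader`, `build_header`, and `parse_header` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class FrameRelayHeader:
    dlci: int   # 10-bit connection identifier, 0..1023
    cr: int     # Command/Response bit (unused by frame relay)
    fecn: int   # Forward Explicit Congestion Notification
    becn: int   # Backward Explicit Congestion Notification
    de: int     # Discard Eligibility


def build_header(dlci: int, cr: int = 0, fecn: int = 0,
                 becn: int = 0, de: int = 0) -> bytes:
    """Pack the fields into a basic two-octet header (assumed layout)."""
    if not 0 <= dlci <= 1023:
        raise ValueError("a two-octet header carries a 10-bit DLCI (0..1023)")
    # First octet: upper 6 DLCI bits, C/R, then EA = 0 (another octet follows).
    first = ((dlci >> 4) << 2) | ((cr & 1) << 1)
    # Second octet: lower 4 DLCI bits, FECN, BECN, DE, then EA = 1 (last octet).
    second = (((dlci & 0x0F) << 4) | ((fecn & 1) << 3)
              | ((becn & 1) << 2) | ((de & 1) << 1) | 0x01)
    return bytes([first, second])


def parse_header(octets: bytes) -> FrameRelayHeader:
    """Unpack a basic two-octet header, checking the EA bits first."""
    if len(octets) < 2:
        raise ValueError("need at least two header octets")
    first, second = octets[0], octets[1]
    if (first & 0x01) != 0 or (second & 0x01) != 1:
        raise ValueError("EA bits do not describe a two-octet header")
    return FrameRelayHeader(
        dlci=((first >> 2) << 4) | (second >> 4),
        cr=(first >> 1) & 1,
        fecn=(second >> 3) & 1,
        becn=(second >> 2) & 1,
        de=(second >> 1) & 1,
    )
```

Under this assumed layout, `build_header(17)` yields the octets 0x04 and 0x11, and parsing those octets recovers DLCI 17 with all flag bits clear.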
Figure 4.3 The ISDN LAPD frame structure. The basic frame structure has appeared over and over again in many data communications protocols. The structure is simple enough, yet possesses all of the capabilities needed to carry the packet over a series of links between switches. This is in fact a key point. In all layered data communications protocols, the frame is sent on a link-by-link basis only. That is, at every step along the way on a network, a frame is created by the sender and essentially destroyed (in the act of processing the frame) by the receiver to get at the packet inside the frame. It might seem impossible to ever send anything intact from a source to a final destination if this is the case. But the key to understanding how layered protocols operate, and how frame relay relates to X.25 and LAPD, is to realize that it is the packet, as the contents of the constantly invented and destroyed frames, that is sent across the network end-to-end from a source to a destination.
Many protocol developers are even in the habit of calling any protocol data unit that travels the length of a network from end-to-end intact, through switches and other network devices, a packet. The packet label seems to apply whether the actual PDU is a TCP/IP Layer 3 data unit (formerly called a datagram or connectionless packet) or an ATM data unit (the cell). The term “packet” is used generically in many cases to indicate a data unit that leaves one end-user location, flows over a network, and arrives at another end-user location intact. This is where the term “fast packet switching” comes from. As mentioned in the previous chapter, the first step in understanding the relationship of frame relay to X.25 is to realize that in frame relay it is the frame that flows from source to destination on a frame relay network. In X.25 this function is performed by the packet or frame contents. In an X.25 packet-switching network, packets are switched on the network. In a frame relay network, frames are relayed across the network much more efficiently, since the frames no longer need to be created and destroyed on a link-by-link basis to get at the packet inside. Frame headers are examined, processed, and modified, but all of this happens at the frame level. There is another important aspect of the ISDN LAPD frame structure that holds a key to understanding the relationship between X.25 (through ISDN) and frame relay. This is the ISDN LAPD frame address field; its structure is shown in Figure 4.4. Again, this structure should be compared to the frame relay header structure.
Figure 4.4 The ISDN LAPD frame address field. Note that this is the LAPD frame address field. In frame relay, it is common to call this field the header, although plenty of frame relay documentation retains this address designation for the frame relay header. In either case, header or address field, the function is the same in both LAPD and frame relay: to tell the network devices what to do with the information inside the frame. It is obvious from Figure 4.4 that there are two different fields involved in the ISDN LAPD address structure. The first is the Service Access Point Identifier (SAPI), which is 6 bits in length, and the second is the Terminal Endpoint Identifier (TEI), which is 7 bits in length. These identifiers are just numbers, from 0 to 63 in the case of the SAPI and from 0 to 127 in the case of the TEI field. But why should a frame, which only flows from a single location to another single location (point-to-point), have such a complicated address structure? This was one of the innovations of LAPD itself. While it is true that all frames flow over the same point-to-point link on a data network, it is not the case (and cannot be) that all packets must do the same. Otherwise, a separate physical point-to-point link to every possible destination must be configured at the source for these frames to flow on. And in fact this is the essence of a private network based on leased lines. But this is not the essence of LAPD, which is packet-switched, not circuit-switched. The parent protocol LAPD, like its child protocol frame relay, allows the multiplexing of connections from a single source location over a single physical link. The greatest benefit of this approach is to make more efficient use of expensive links and cut down on the number needed in the network. The two fields in the LAPD frame address deal with the two possible kinds of multiplexing that a customer site sharing a single physical network link must accommodate. 
First, there may be a number of user devices at a customer site. Second, there may be a number of different kinds of traffic that each of these devices generates. For instance, even though all information is 0s and 1s, some of these digits may represent user data and some may represent control signaling to the network itself.
The TEI field deals with the first of these multiplexing possibilities. The TEI field addresses a specific logical entity on the user side of the interface. Typically, each user device is given a unique TEI number. In fact, a user device can have more than one TEI, as would be the case with an ISDN concentrator device with multiple user ports. The SAPI field deals with the second multiplexing possibility. The SAPI field addresses a specific protocol understood by the logical entity addressed by a TEI. These are Layer 3 packet protocols and provide a method for all equipment to determine the structure of the packet contained in the frame. Taken together, the TEI and SAPI address fields make it possible for all network devices on the network to first determine the source or destination device of a particular frame on an interface (the TEI), and then determine the source or destination protocol and packet structure on that particular device (the SAPI). When frame relay was invented, the TEI and SAPI fields were combined, which is the main reason why the DLCI field is split between the two octets of the frame relay header field. The two-level structure of the TEI and SAPI, which in truth was marginally effective in ISDN, was combined into the flat address space of the 10-bit DLCI. The reason that the overall structure of the ISDN LAPD address field had to be preserved was due to a desire to eventually replace LAPD running on the D-channel of ISDN with frame relay. The frame relay field structure allows for networks to handle both frame relay DLCIs and ISDN LAPD SAPIs. Ten bits can count 1024 things, and the DLCI is just a number between 0 and 1023 that is used by the frame relay network to identify a virtual circuit connection.
The Data Link Connection Identifier
The number of Data Link Connection Identifiers (DLCIs) possible on a UNI, 1024, might seem like a lot. But in actual practice the number of DLCIs that can be assigned on a given UNI of a certain speed is strictly limited. One reason is that some DLCIs are reserved for network management and control functions. This is certainly understandable. The second reason that DLCIs are limited on a UNI has to do with bandwidth on the UNI. Frame relay offers flexible bandwidth allocation, to be sure, but that does not mean that the UNI’s bandwidth is unlimited. When many source devices generate bursts of traffic all at once, any frames that cannot be sent in a given time frame across the UNI to the frame relay switch must be buffered or just dropped. Either strategy is allowed and strictly implementation-dependent. Naturally, the more DLCIs defined from sources, the greater the chance that many of them will be bursting at once. If the excess traffic is delayed through buffering, the end-to-end delays will rise through the network. If the excess bursts are discarded, most data applications respond by sending copies, which only makes the problem worse. Other applications, such as voice and video, hardly function at all when traffic losses reach about 10 percent or so (even less for commercial voice and video applications). The number of DLCIs allowed on a UNI depends on the service provider’s policy with regard to the speed of the UNI and the Customer Premises Equipment (CPE) capabilities. In actual practice, the CPE is not much of a limitation, especially if FRADs with adequate buffers are used. Routers routinely buffer excess traffic even in a leased-line environment and hardware FRADs come with a variety of buffer sizes available. So the number of DLCIs that can be supported on a UNI of a particular speed boils down to what the service provider says is the maximum. 
It is in neither the customer’s nor the service provider’s best interest to have high delays or high information loss on the UNI due to over-configured connections. The majority of frame relay UNIs are still 56/64 kbps links. In many cases, the maximum number of DLCIs that can be configured on a UNI of this speed is 50. But in most cases, no more than 10 DLCIs on a 56/64 kbps UNI is a good guideline to work with. If this seems like a limitation, remember that a DLCI identifies a logical connection to a destination. In a leased-line environment, ten 56/64 kbps links would be needed instead of 10 DLCIs on one UNI. DLCI values are to be used conforming to the pattern in Table 4.1, according to ITU-T Recommendation Q.922.
Table 4.1 DLCI Values and Their Uses
DLCI Value    Assigned Use
0             In-channel signaling and management (ANSI/ITU-T)
1–15          Reserved for future use
16–991        User connections (only 512 to 991 when on an ISDN D-channel)
992–1007      Frame relay management at Layer 2
1008–1022     Reserved for future use
1023          In-channel virtual circuit management (Frame Relay Forum)
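The ranges in Table 4.1 can be expressed as a small lookup. The following Python sketch simply encodes the table rows for the basic 10-bit DLCI; the function name `dlci_use` and the data layout are illustrative choices, not part of any standard.

```python
# Each entry is (lowest DLCI, highest DLCI, assigned use), taken from Table 4.1.
DLCI_RANGES = [
    (0, 0, "In-channel signaling and management (ANSI/ITU-T)"),
    (1, 15, "Reserved for future use"),
    (16, 991, "User connections"),
    (992, 1007, "Frame relay management at Layer 2"),
    (1008, 1022, "Reserved for future use"),
    (1023, 1023, "In-channel virtual circuit management (Frame Relay Forum)"),
]


def dlci_use(dlci: int) -> str:
    """Return the assigned use of a 10-bit DLCI according to Table 4.1."""
    for low, high, use in DLCI_RANGES:
        if low <= dlci <= high:
            return use
    raise ValueError("a basic DLCI must be between 0 and 1023")


def is_user_dlci(dlci: int, on_isdn_d_channel: bool = False) -> bool:
    """True if the DLCI may carry a user PVC or SVC.

    On an ISDN D-channel the table narrows the user range to 512..991.
    """
    lowest = 512 if on_isdn_d_channel else 16
    return lowest <= dlci <= 991
```

For example, `dlci_use(17)` reports a user connection, while `is_user_dlci(17, on_isdn_d_channel=True)` is false because 17 falls below 512.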
The table is more complex than it seems at first. There appear to be two DLCIs that are used for link-management purposes. There are. DLCI 0 is used for signaling messages for demand connections (SVCs) and for some link-management functions. The use of this DLCI for this function is defined by the American National Standards Institute (ANSI) and the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). The ITU-T sets international standards for telecommunications and ANSI adapts them for use within the United States. So the use of DLCI 0 for in-channel signaling and management is official. But the table also notes that DLCI 1023 (all 1s in the DLCI field) is used for “in-channel virtual circuit management,” which is almost the same purpose and function, and even wording, as used for DLCI 0. This is because DLCI 1023 is used by the industry consortium formed to bring frame relay equipment and services to market as quickly as possible, the Frame Relay Forum (FRF). It is enough to note here that the FRF issues various Implementation Agreements (IAs) regarding certain aspects of frame relay. FRF IAs are not really standards, but formal agreements among the members to make frame relay equipment and create frame relay networks that do certain things in certain ways. IAs are a good example of a de facto standard, one established not by rule, but by common use. With regard to DLCI 1023, the FRF has said that this connection identifier is to be used for the link management function. Link management provides such key UNI functions as verifying that the UNI and each DLCI are up and running, detecting new PVCs, and so on. Since link management is so important to frame relay, Chapter 7 will be devoted to it. DLCIs 992 through 1007 are reserved for what is known as Layer 2 management. This sounds odd, since all of frame relay is at Layer 2 of the OSI RM, but it is exactly the point. 
Frame relay user DLCIs carry virtual circuit traffic end-to-end across the frame relay network for the users and such frame content is transparent to the frame relay network. DLCIs 0 and 1023 carry information back and forth from the customer premises equipment to the frame relay switch across the UNI. But how can one frame relay Layer 2 function at the premises end of a UNI communicate with its counterpart at the other end of another UNI? This is what DLCIs 992 through 1007 are for: to allow frame relay equipment to frame relay equipment communication over the frame relay network. Frames with these DLCIs contain neither user traffic nor local UNI management and control information. DLCIs 16 through 991, 976 in all, are assigned for user connections. They may be PVCs or SVCs (demand connections). There is one exception. If the frame relay frame is part of an ISDN and uses the ISDN D-channel for frame transport, which is allowed, then the user-assigned DLCIs must be between 512 and 991. This automatically forces the SAPI value to be between 32 and 61 from the ISDN perspective. This is just another way of saying, “Frame relay frames on the ISDN D-channel must use SAPIs 32 through 61.” While this limits the number of connections on an ISDN, these 480 remaining connections should not be a limitation for currently defined ISDN link speeds. Practical considerations regarding link traffic will be a limitation long before logical link connection numbers will.
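The counts quoted above can be checked with a few lines of arithmetic. This Python sketch assumes that the upper six bits of the 10-bit DLCI sit where LAPD places the SAPI, which is consistent with the earlier statement that the DLCI is a concatenation of the SAPI and TEI positions; the helper name `sapi_view` is hypothetical.

```python
def sapi_view(dlci: int) -> int:
    """Value that ISDN equipment would read in the 6-bit SAPI position.

    Assumes the upper 6 of the 10 DLCI bits occupy the SAPI position.
    """
    if not 0 <= dlci <= 1023:
        raise ValueError("expected a 10-bit DLCI")
    return dlci >> 4


# User DLCIs in general: 16 through 991 inclusive.
user_dlcis = range(16, 992)

# User DLCIs allowed on an ISDN D-channel: 512 through 991 inclusive.
d_channel_dlcis = range(512, 992)

if __name__ == "__main__":
    print(len(user_dlcis))          # number of user DLCIs
    print(len(d_channel_dlcis))     # number usable on the D-channel
    print(sapi_view(512), sapi_view(991))
```

Shifting out the low four bits maps DLCI 512 to SAPI 32 and DLCI 991 to SAPI 61, and the two ranges contain 976 and 480 values respectively, matching the figures in the text.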
This brings up another point about DLCIs and frame relay headers in general. What if a UNI or other frame relay network link is not running at a relatively low speed like 56/64 kbps, but a much higher speed like 45 Mbps or even higher? At these higher speeds, might not logical connectivity in terms of assigned DLCIs become an issue before physical traffic considerations? In other words, might not a high-speed link be able to handle 976 bursty DLCIs with ease? How would customers react to having to purchase another 45 Mbps link just because the original had run out of DLCIs if the link were only 25 percent or so loaded? These are all legitimate concerns, of course. Fortunately, frame relay has an answer: to allow the frame relay network to extend the frame relay header beyond two octets. This allows more than 10 bits to be used for the DLCI. There are various methods to do this, some using three octets and some using four. The method recommended by both the ITU-T and ANSI employs a four-octet frame relay header. Normally, this extended frame relay header is found on high-speed links and the frame relay NNIs. The four-octet frame relay header structure is shown in Figure 4.5.
Figure 4.5 The extended frame relay header format. For backward compatibility, the four-octet frame header has the first two octets essentially the same as the two-octet version, except that the EA bit in the second octet is now 0. This EA = 0 lets the receiver know that the four-octet header is in use. All of the bits in the third octet, except for the last EA = 0 bit, are used to extend the DLCI. This would give a 17-bit DLCI capable of enumerating some 128,000 virtual connections. If this is not enough, six more bits in the fourth octet can be used to further extend the DLCI to a total of 23 bits. This gives about 8 million virtual connections, almost certainly severe overkill. So this fourth and final octet can be used for Data Link-Core (DL-Core) control functions, similar in intent and function to the Layer 2 management connections, but of course able to be used on all DLCIs (because this field is present on all extended DLCIs), not just individual ones. The D/C bit indicates to the receiver whether the fourth octet is used for extended DLCI (D) or DL-Core control (C) purposes. No DL-Core control procedures have yet been defined, however, which renders this field and function relatively useless. It should be noted that when the extended DLCI formats are used, the higher reserved DLCI values must move up to take these higher positions. So FRF link management would be on a DLCI of roughly 128,000 or 8 million, as the case may be. DLCIs identify virtual connections on a frame relay network. So each connection, whether PVC or SVC, must have a DLCI in the allowed user range. There is one other major DLCI topic to be discussed, however, and a topic which can be confusing to those used to LAN and connectionless IP environments such as the Internet. LAN Layer 2 addresses and IP Layer 3 addresses have global significance. That is, there can be no two LAN or IP devices on the same network with identical addresses, just as there can be no two telephones in the world with identical telephone numbers. 
How could LAN and IP traffic, or voice connections, be routed if it were otherwise? But DLCIs have local significance only. This topic is important enough to deserve a section of its own.
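As a side calculation for the extended header formats described earlier in this section, the rounded figures of 128,000 and 8 million connections follow directly from the DLCI widths. The Python sketch below works them out. Treating the highest (all 1s) value as the relocated link management DLCI is an assumption drawn from the earlier remark that DLCI 1023 is all 1s in the basic header; the text itself gives only the rounded values.

```python
BASIC_DLCI_BITS = 10       # two-octet header
THIRD_OCTET_BITS = 7       # third octet minus its EA bit
FOURTH_OCTET_BITS = 6      # fourth octet minus its D/C and EA bits


def dlci_space(bits: int) -> int:
    """Number of distinct DLCI values that fit in the given field width."""
    return 2 ** bits


def highest_dlci(bits: int) -> int:
    """All-1s DLCI value for the given width (assumed management DLCI)."""
    return dlci_space(bits) - 1


widths = {
    "basic": BASIC_DLCI_BITS,
    "with third octet": BASIC_DLCI_BITS + THIRD_OCTET_BITS,
    "with fourth octet": BASIC_DLCI_BITS + THIRD_OCTET_BITS + FOURTH_OCTET_BITS,
}

if __name__ == "__main__":
    for name, bits in widths.items():
        print(name, bits, dlci_space(bits), highest_dlci(bits))
```

The widths come out as 10, 17, and 23 bits, giving 1024, 131,072, and 8,388,608 possible values.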
Local Significance
When a packet is sent across the Internet from one router to another, the packet header contains the full, globally unique, source and destination IP addresses. The same is true of all LAN frames, from Ethernet to token ring, and beyond. If there are two sources or destinations with the same LAN or IP address, the system breaks down and there will be unreachable places on the network. This is the essence of global significance. But this is not the case with frame relay frames. The frame relay DLCI is a connection identifier, not a source and/or destination address. DLCIs have local rather than global significance. Only one DLCI is needed to connect any two sites because a connection is defined as a logical relationship between two endpoints. Now, there is a separate DLCI on each NNI on both ends of the network, but these are just opposite ends of the same connection. None of this discussion about local significance implies that there are no network addresses on frame relay networks. When a connection is established on any kind of network, the connection must know just which two network endpoints to connect. This may be done by hand at service provision time (which makes this connection a PVC) or dynamically by means of a signaling protocol (which makes the connection an SVC). Many public frame relay networks employ network addresses that look like telephone numbers, with area codes, office codes, and access line numbers. Since only 1024 DLCIs can exist in most frame relay networks and there will obviously be more than 1,024 users on large public frame relay networks, the DLCIs must be of local significance only. Similarly, the word “mother” is of local significance only as well. Everyone has a mother, but most people mean a different person when they say “mother” (unless they are siblings). On a frame relay network, there may be many DLCI = 17 connections, but as long as there is only one DLCI = 17 connection on a given UNI or NNI, there is no problem. 
Frame relay switches use the DLCI to route the frame on a path from input port to output port. All that needs to be done is to look up the DLCI in the frame header on the incoming frame in a table. The table entry essentially gives two pieces of information. First, it contains the switch output port number which the frame will be sent out from the switch on. Second, it contains the value of the new DLCI that will be inserted into the outgoing frame in place of the original DLCI value. This must be done to preserve the purely local use of DLCIs. Of course, frame relay switches can use whatever protocol they wish to communicate. In the most general case, only the DLCIs on the two UNIs will be different and actually matter, from the user perspective. The result is not chaos for the user, however. At the origination point on the frame relay network, the router will interpret the IP destination address of the packet (for example) and place the DLCI for the PVC to the destination into the frame relay DLCI header field. Although the DLCI might change as the frame flows through the network switches, the switch at the destination UNI will replace the arriving frame header DLCI with the proper DLCI for the frame on the destination UNI and send it to the user. While DLCIs change, they do not do so haphazardly. The use of locally significant DLCIs on a frame relay network is illustrated in Figure 4.6. There is more than one DLCI = 18 connection, but only one DLCI = 18 can exist at any one time on each UNI. All the FRADs need to know is that when they need to send a frame to Site B, for example, they send it on DLCI = 17 and when they need to send a frame to Site C, they send it on DLCI = 18. All that a destination such as Site D needs to know is that when it receives a frame with DLCI = 18, it came from Site C and when it receives a frame with DLCI = 19, it came from Site B. 
If two DLCIs happen to actually match at opposite UNIs, it is more likely the result of a coincidence than planning.
Figure 4.6 DLCIs and local significance.
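The table lookup described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any vendor's switch: the port numbers, the intermediate DLCI 42, and the two-switch path are all hypothetical, and real switches are free to use other mechanisms between themselves.

```python
from typing import Dict, Tuple

# (input port, incoming DLCI) -> (output port, outgoing DLCI)
SwitchTable = Dict[Tuple[int, int], Tuple[int, int]]


def relay(table: SwitchTable, in_port: int, in_dlci: int) -> Tuple[int, int]:
    """Look up the output port and the DLCI to write into the outgoing frame."""
    try:
        return table[(in_port, in_dlci)]
    except KeyError:
        raise LookupError(
            f"no connection defined for DLCI {in_dlci} on port {in_port}"
        ) from None


# Hypothetical path: the source UNI is port 1 of switch 1, port 3 of switch 1
# is wired to port 2 of switch 2, and the destination UNI is port 5 of switch 2.
# Each connection has an entry in both directions because DLCIs are bidirectional.
switch_1: SwitchTable = {(1, 17): (3, 42), (3, 42): (1, 17)}
switch_2: SwitchTable = {(2, 42): (5, 19), (5, 19): (2, 42)}

if __name__ == "__main__":
    port, dlci = relay(switch_1, 1, 17)     # frame enters the network on DLCI 17
    print("switch 1 sends on port", port, "with DLCI", dlci)
    port, dlci = relay(switch_2, 2, dlci)   # the next switch swaps the DLCI again
    print("switch 2 sends on port", port, "with DLCI", dlci)
```

In this sketch the sender uses DLCI 17 and the receiver sees DLCI 19, yet both numbers name the same connection. Keying the table on the port as well as the DLCI is what lets the same DLCI value be reused on every other link.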
It should be noted that Site A and Site D in Figure 4.6 cannot communicate directly over the frame relay network. No DLCI, and so no connection, has been defined between them. The sites might still be able to exchange information, but only through an intermediate site (from the FRAD perspective; to the frame relay network, all sites are endpoints) such as Site B or Site C. Such partial mesh connectivity might be provided for reasons of expense (DLCIs cost little, but they are not free), traffic patterns (Site A rarely communicates with Site D), or even security (all Site A to Site D traffic must pass through Headquarters Site C). On the other hand, full logical mesh connectivity with DLCIs can be provided as well, as long as traffic requirements are respected. DLCIs are, by definition, bidirectional, as is also shown in Figure 4.6. This means that if a frame relay PVC is configured from Site A to Site B, and Site B receives from Site A on DLCI = 20, then the same DLCI number leads back to Site A. Oddly, frame relay service providers are fond of charging customers for a connection from A to B, then also for a connection from B to A, as if DLCIs were like ATM connections, unidirectional. But regardless of how the frame relay connections are bundled and charged for, DLCIs are bidirectional. This does not mean that a customer can actually use a DLCI from B back to A if it has not been paid for. It simply means that special care needs to be taken in the network to prevent this bidirectional capability, not to enable it.
UNI Options
Frame relay networks are WAN technologies. Since frame relay networks may span long distances, it is common for a frame relay customer to obtain services from a long-distance public service provider. The three largest in the United States are AT&T, MCI, and Sprint. These companies handle long-distance voice services and offer frame relay services. If the frame relay network spans a small enough distance, the entire frame relay network may fall within a single Local Access and Transport Area (LATA). Within a LATA, the local exchange carrier (LEC) such as NYNEX, Bell Atlantic, or GTE may offer a full service frame relay network without the need for involving an interexchange carrier (IXC) like AT&T, MCI, or Sprint. However, most frame relay networks easily span LATAs or even states. In these cases, the carrier chosen to provide frame relay network services is often the customer’s long-distance service provider. However, given the new deregulated environment in the United States and around the world, it is no longer a given that a multi-LATA or multistate frame relay network always involves an IXC. And with the rise of NNI agreements between LECs and IXCs when it comes to their frame relay networks, it is always in the customer’s best interest to seek the lowest price for frame relay, whether from LEC or IXC. But even if the LEC or IXC is the ultimate supplier of the frame relay service, the local access portion of the frame relay network, the UNI, is commonly obtained from the LEC. The only issue that arises is how the customer site UNI is linked to the frame relay service. And even this LEC access is not always the only way to go. LEC control of the UNI local access portion of the frame relay network is slowly changing in many parts of the United States. Several companies have achieved co-carrier status with the incumbent LEC (ILEC) in many states. 
This section will consider all the possibilities of providing a local physical connection into the wide area frame relay network.
Leasing Local Access
The simplest way to provide the connectivity is to lease a local loop from the LEC to the frame relay service provider’s Point of Presence (POP) within the LATA. The POP is where the frame relay switch is located. In the simplest case, the LEC is both the supplier of the UNI and the supplier of the frame relay network service. But when an IXC is the provider of the frame relay service, the UNI is still most often a LEC-provided leased line from the customer site to the frame relay POP. The choice of leased-line speeds runs from low-speed access (56/64 kbps) to high speed (a full T1 at 1.5 Mbps, or even a full T3 running at 45 Mbps) and a number of possibilities in between. The high-speed alternative may be un-channelized (just a single 1.5 Mbps or 45 Mbps channel) or channelized (multiple lower-speed channels). Channelized access still allows non-frame relay services to share the same local access line as the frame relay service. There is no need to provide exactly the same arrangement at each and every site linked to the frame relay network. This identical arrangement may be desirable from a network management standpoint merely for the sake of consistency, but it is not a requirement. Depending on the specific needs of each site, a number of local access alternatives may be used on a frame relay network.
The most popular port speed for frame relay networks to date, based on number of ports sold, is 56/64 kbps. Over 50 percent of all frame relay ports still run at this speed. Of course, the easiest way to access a 56/64 kbps frame relay switch port is with a leased 56/64 kbps DS-0 circuit. The “56” in the 56/64 kbps low-speed access refers to the fact that in some parts of the United States, full 64 kbps clear channel digital circuits are not available. In these places, only 56 kbps is used, which is 7/8ths the speed of the full 64 kbps DS-0 circuit. The difference in performance is minor in most applications. Normally, a DS-1 (or T1, as many customers know it) is channelized or broken up into 24 DS-0 channels, each running at 64 kbps (56 kbps in some U.S. locations). Although few sites may need 24 links running at 64 kbps, the price crossover point is usually only three DS-0 circuits. In other words, if four separate DS-0s are running to a site, it is actually cheaper to lease a single DS-1 with 24 DS-0 channels, even though 20 of them sit idle. Think of it as buying a dozen eggs when the recipe only calls for four because a dozen is cheaper. The other eight can keep in the refrigerator for a while until needed. An un-channelized DS-1 runs at 1.544 Mbps and offers a data transfer rate of 1.536 Mbps, the remaining 8 kbps representing overhead. An un-channelized local access loop can only support the frame relay network traffic, however. It is typically attractive to use a channelized DS-1 to provide integrated access to the frame relay network, especially for initial frame relay network implementations. Here is how. Because of the price differential between individual DS-0s and a 24-channel DS-1, many companies have DS-0 channels in DS-1s that are unused. Many of the occupied channels are used for voice tie-lines from PBXs or SNA networks. In most cases, some of the otherwise idle channels can be used for frame relay network access. 
Naturally, if the DS-1 channels run to the AT&T POP, the frame relay service provider must be AT&T. This concept of integrated access is illustrated in Figure 4.8.
Figure 4.8 Integrated access to a frame relay network. A nice feature of this arrangement is that it offers more scalability than single DS-0 circuits alone. Frame relay port speeds are usually available in multiples of 64 kbps. If there are six spare channels on a DS-1, any number up to all six may be used for the frame relay access link, as long as the frame relay port speed is upgraded to match. This may require a separate multiplexer box but would offer access link speeds of 64 kbps, 128 kbps, 192 kbps (rare as a port speed offering), 256 kbps, 320 kbps (also rare), or 384 kbps. One word of caution is in order, however. The channels assigned for the frame relay access link should be contiguously slotted. This means that in the preceding example, channels 4, 5, 6, and 7 on a DS-1 could be used to give an access link speed (and matching port) of 256 kbps (4 × 64 kbps), but not channels 4, 9, 12, and 15. These last four channels are not contiguous (not consecutively numbered). There is usually a severe performance degradation when using noncontiguous channels. Leasing a point-to-point local access link to an IXC from a LEC, whether as a series of DS-0s or a single channelized or un-channelized DS-1, is no longer the only possibility in many areas within the United States. The alternative local service providers, which used to be called Competitive Access Providers (CAPs) but now almost universally prefer the term Competitive LECs (CLECs), also can supply local access to an IXC’s frame relay POP.
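The channel arithmetic for fractional access described above can be sketched as follows. This Python example assumes 64 kbps clear channels numbered 1 through 24 within the DS-1; the function names are illustrative, and the contiguity test simply checks for consecutive channel numbers.

```python
from typing import Iterable

DS0_KBPS = 64            # assumes clear-channel DS-0s
CHANNELS_PER_DS1 = 24


def is_contiguous(channels: Iterable[int]) -> bool:
    """True if the channel numbers form one unbroken consecutive run."""
    ordered = sorted(channels)
    return bool(ordered) and all(
        later - earlier == 1 for earlier, later in zip(ordered, ordered[1:])
    )


def access_speed_kbps(channels: Iterable[int]) -> int:
    """Access link speed for a contiguous group of DS-0 channels."""
    chosen = list(channels)
    if any(not 1 <= ch <= CHANNELS_PER_DS1 for ch in chosen):
        raise ValueError("DS-0 channels are numbered 1 through 24")
    if not is_contiguous(chosen):
        raise ValueError("channels should be contiguously slotted")
    return len(chosen) * DS0_KBPS


if __name__ == "__main__":
    print(access_speed_kbps([4, 5, 6, 7]))       # four adjacent channels
    print(is_contiguous([4, 9, 12, 15]))         # scattered channels
    print([access_speed_kbps(range(1, n + 1)) for n in range(1, 7)])
```

With six spare channels the possible speeds are 64, 128, 192, 256, 320, and 384 kbps, the same list given in the text, and channels 4, 9, 12, and 15 are rejected as noncontiguous.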
Diverse Routing
A frame relay UNI will typically replace a larger number of point-to-point leased lines. But the single UNI might be considered a single point of failure on the frame relay network. That is, if the UNI fails, the entire site is suddenly cut off from the frame relay network. In order to address this issue the local access link that constitutes the UNI might be diversely routed to avoid single points of failure along the path from customer site to frame relay switch.
So, in many cases the local access provider can offer attractive alternatives to point-to-point local access links. Many of these local service providers have installed rings of fiber optic cable in major metropolitan areas. These fiber rings have two advantages over point-to-point access configurations based on copper wire or coaxial cable. In the first place, the fiber rings are much less susceptible to service outages. The rings are configured so that if a link between two adjacent network nodes is broken, service is not disrupted. Information is wrapped back around an alternate path fiber in a matter of milliseconds (60 milliseconds is not uncommon). Repair crews can go about their tasks in an unhurried fashion without the threat of irate customers (or regulatory limits) forcing the crews to cut corners or rush cable splices. Second, the fiber optic cable itself offers much lower error rates than copper media. The higher quality is quite noticeable to users. While typical copper media have bit error rates of about 1 in a million, most fiber networks have bit error rates of about 1 in a billion, which is fully 1000 times better. This translates to 1/1000th of the bit errors encountered on an access link before fiber optic cable is used. The twin advantages of quality and automatic service rerouting have made the use of fiber rings for service access—frame relay or not—very attractive for a wide range of customers. One other point should be made when it comes to leasing the local access portion of the public frame relay network. This concerns the format of the transmission frame used to carry the frame relay frames from the customer premises to the public frame relay switch. Whether high speeds or fiber rings are available, most users choose a DS-or lower speeds for the local access portion of the frame relay network. 
All DS-1s in the United States have two critical parameters that must be matched between the customer’s premises equipment DS-1 port and the public service provider’s DS-1 port on the frame relay switch. The two parameters are frame format and line coding. Many installation dates for frame relay services have been missed because of miscommunication regarding these two parameters. Older DS-1s employ a frame format known as the D4 superframe. Some service providers refer to this as SF, or superframe formatting, but this is the same thing as D4. Newer DS-1s support a frame format known as Extended SuperFrame (ESF). ESF is always preferred because of its superior manageability and problem isolation features. In some areas, ESF may not be an option, while other areas will offer a choice. Either frame format will work with frame relay, as long as both ends are configured to use the same one, D4 or ESF.
Older DS-1s also employ a line coding technique known as Bipolar-AMI (Alternate Mark Inversion). Line coding is used to represent the digital 0s and 1s with electricity to carry the data over long distances. To make Bipolar-AMI function over the distance of up to several miles between a customer site and a serving office, it was necessary to limit the bandwidth of each DS-0 channel of the 24 present in a DS-1 to 56 kbps instead of the 64 kbps possible. This is no problem when the DS-0s are used for voice, but imposes a limit on the speed of data when the DS-0 channel is used for this purpose. The solution was to develop a new form of line coding known as Binary 8-Zero Substitution (B8ZS), which allowed each DS-0 to operate at a full 64 kbps. This clear channel capability is available in many areas, but not all. As with ESF, B8ZS is always preferred to enable the user to get the maximum functionality out of his or her access link. The Bipolar-AMI limitation applies to higher speeds as well.
If a fractional T1 access link is needed at 256 kbps (4 x 64 kbps), with Bipolar-AMI, the total bandwidth available will only be 224 kbps (4 x 56 kbps). While perhaps not critical, the difference may be noticeable. It is important to realize that frame relay will function equally well whether accessed with Bipolar-AMI and D4 framing or B8ZS and ESF, or any combination of the two. These two alternatives have been deployed independently in many cases, so care is needed. The whole point is to make sure there is a match between the way the premises equipment functions and is configured, and the way the service provider’s equipment functions and is configured. With any network, the fewer surprises, the better.
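The arithmetic and the matching rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the dictionary keys (`framing`, `line_coding`) are hypothetical and do not correspond to any real FRAD or switch configuration interface.

```python
# Per-channel rates taken from the text: 64 kbps clear channel with B8ZS,
# 56 kbps per channel when Bipolar-AMI line coding is used.
KBPS_PER_CHANNEL = {"B8ZS": 64, "AMI": 56}


def fractional_t1_kbps(channels: int, line_coding: str) -> int:
    """Usable bandwidth of a fractional T1 built from `channels` channels."""
    if not 1 <= channels <= 24:
        raise ValueError("a DS-1 carries between 1 and 24 channels")
    return channels * KBPS_PER_CHANNEL[line_coding]


def settings_match(premises: dict, provider: dict) -> bool:
    """True when both ends agree on frame format and line coding."""
    return (premises["framing"] == provider["framing"]
            and premises["line_coding"] == provider["line_coding"])


if __name__ == "__main__":
    print(fractional_t1_kbps(4, "B8ZS"))  # four clear channels
    print(fractional_t1_kbps(4, "AMI"))   # same four channels with Bipolar-AMI
    print(settings_match({"framing": "ESF", "line_coding": "B8ZS"},
                         {"framing": "D4", "line_coding": "B8ZS"}))
```

A check like `settings_match` is the kind of thing worth running on paper before an installation date, since a mismatch in either parameter is enough to keep the access link from coming up.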
Dial Backup
Diverse routing on a fiber ring will go a long way in avoiding the single point of failure that a UNI represents. However, rings are not available in all locations from all frame relay service providers. In such cases, it is possible to configure a dial backup port that can be used when the leased line UNI is out of service. Most hardware FRADs will support a dial backup port, especially for 56/64 kbps access lines. Naturally, the frame relay service provider must provision a number of ports at the local switch to allow such access to take place. These dial backup ports are often perceived as a tempting target for hackers or crackers, or others who seek to enter networks without authorization, so they are best used with caution and with added security measures such as password protection and encryption. The actual use of the dial backup might even be totally transparent to the customer.
There are two ways to dial around UNI failures in frame relay. The first method is the most common and simply substitutes a dialed 56/64 kbps link for the leased line UNI in the event of a failure. In the second method, found in some older frame relay networks, the dialed connection extends from one customer site FRAD directly to another customer site FRAD. The drawback of this approach is that no other sites on the customer’s frame relay network are accessible beyond the two sites directly linked. This option is seldom used in newer frame relay networks.
Multihoming and Multiport UNIs
Single UNIs without fiber rings can also have a backup UNI that is used only when the main UNI is unavailable. In fact, this second UNI does not even have to lead to the same frame relay switch site as the main UNI. This practice is known as multihoming. Multihoming not only can protect from UNI failure, but it also can protect from a frame relay switch outage. The major drawback of a multihomed UNI is that the customer might be paying for connectivity that is not used to its fullest. The DLCIs defined on a primary UNI must be duplicated on the other UNI leading to the second switch. So there is added protection against failures, but not added efficiency.
The whole idea behind diverse routing of UNIs, dial backups, and multihoming is to provide a customer site with protection against UNI failures. The latest way to provide such UNI protection is to use inverse multiplexing for frame relay, which is called the multiport UNI by the Frame Relay Forum. The multiport UNI is actually two local access links that behave like one UNI. The two links can be diversely routed, of course, but they are not usually multihomed. Both the customer site FRAD and service provider switch must support the multiport UNI option. There is only one set of DLCIs defined. Under normal operating conditions, both links are used to handle traffic on all defined DLCIs. In its most basic form, the multiport UNI uses two 64 kbps links to provide what seems to be a 128 kbps UNI. In the event of a failure on a single 64 kbps link, the only thing that the user might notice is a decrease in throughput as the UNI operates at 64 kbps until the second link is restored to service. This basic operation of a multiport UNI is shown in Figure 4.9.
Figure 4.9 The multiport UNI.
It is possible to configure four-port multiport UNIs or even other combinations. In the case of four-port UNIs, the throughput is 256 kbps. Multiport UNIs can also be used to provide fractional T1 access speeds where fractional T1 speeds are not ordinarily available. This has become one of the most common uses of multiport UNI equipment. Of course, the protection against failures is still an attraction.
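The way a multiport UNI degrades, rather than fails, when a link goes down can be shown with a very small sketch. The function name and the list-of-booleans representation of link status are assumptions made purely for illustration.

```python
def multiport_uni_kbps(links_up: list[bool], link_kbps: int = 64) -> int:
    """Aggregate speed of a multiport UNI: the sum of the links still in service."""
    return link_kbps * sum(1 for up in links_up if up)


if __name__ == "__main__":
    print(multiport_uni_kbps([True, True]))    # both 64 kbps links working
    print(multiport_uni_kbps([True, False]))   # one link failed, UNI still usable
    print(multiport_uni_kbps([True] * 4))      # four-port multiport UNI
```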
Analog Modems and Switched Access
It sometimes comes as a surprise to those used to private line networks that frame relay works as well as it does with UNI speeds as low as 56/64 kbps. Of course, this is due to the bursty nature of client/server LAN applications, and the efficiency and effectiveness of modern compression techniques applied to voice and video digital data streams. But the bursty nature of frame relay applications extends to UNI speeds even lower than 56/64 kbps. It is even possible for a frame relay UNI to operate at analog modem speeds as low as 33.6 kbps. Many users have home PCs that include 56 kbps modems, whether or not they are compliant with the V.90 standard. Even these modems still operate at 33.6 kbps upstream, out of the PC. It is only in the downstream direction that 56 kbps modems function at the full 56 kbps, and only under certain circumstances at that. Many of these same home users still need to access their employer’s corporate network. In this case the home PC is the client and the server might be based on the organization’s frame relay network. Such small office, home office (SOHO) workers or telecommuters no longer need be left out of the organization’s network. While it is true that home workers could access a corporate network over the Internet, this type of access is considered quite insecure and not to be trusted for many types of transactions, especially those of a financial or confidential nature. The lack of any quality of service guarantees at all on the Internet has already been discussed. So frame relay access over analog modems is desirable from both a security and a QoS perspective. In the case of analog access, the home PC simply runs a frame relay software package and dials in to a special modem-based frame relay switch port. A PVC is defined on this port to run to the home worker’s server site within the organization. The PVC has a DLCI and a CIR, naturally.
It is even possible to configure more than one PVC on the analog UNI, but this is not common due to both traffic load and security considerations. The connectivity is still through the PVC and not through any type of frame relay SVC arrangement, but this is seen as a security feature and not as a limitation. The use of analog modems to provide dial access to PVCs brings up other alternative frame relay UNI arrangements. These alternate arrangements are all distinguished by providing switched access to the PVCs defined on the UNI. The whole suite of possibilities includes both dialup digital access (such as that used for backing up a dedicated UNI channel) and dialup analog access. The term switched access is used to avoid confusion with true SVC service on a frame relay network established through the use of a signaling protocol based on Q.933. The remaining way that a frame relay UNI can be provided in a switched access environment is by way of an Integrated Services Digital Network (ISDN). ISDN as a service offering has suffered from a variety of woes over the years, but has enjoyed renewed popularity for high-speed access to the Internet. In this same fashion, ISDN can be used for access to a frame relay network instead of (or along with) Internet access. Using ISDN to support frame relay is actually a very good fit with the intentions of ISDN. The original data protocol used on an ISDN was X.25, the forerunner of frame relay by way of LAPD. Why not replace the functions of X.25 inside an ISDN with frame relay? The main drawbacks are: (1) Merging ISDN and frame relay service offerings might have revenue repercussions for the service providers who now have separate incomes from both services, and (2) putting extremely bursty and long holding-time frame relay traffic through a network switch designed primarily for voice might not be the smartest thing to do. So tight frame relay integration with ISDN will not happen soon, if at all. 
Usually, ISDN just provides access to a remote frame relay switch. ISDN access to frame relay would involve having a FRAD connected not to a leased private line UNI, but to the site’s ISDN network termination type 1 (NT1) device. The FRAD could share the ISDN access line (typically a 1.5 Mbps Primary Rate Interface or PRI, but not always) with the site’s PBX or other ISDN devices. All the voice calls would still use 64 kbps ISDN B-channels on the PRI, but the FRAD could also use a B-channel for access to an ISDN device that represents the UNI on a frame relay network. In this case the frame relay network replaces the X.25 network cloud at the other side of the ISDN. If SVCs are supported, the SVCs must still be set up by the frame relay network. If only PVCs are supported, the PVCs must still be configured separately. All the ISDN does is provide the access method. ISDN access to a frame relay network is shown in Figure 4.10. This is basically what the ITU-T Recommendation calls “Case A frame mode service.”
Figure 4.10 ISDN access to frame relay.
It is even possible to provide packet data or D-channel frame relay support on an ISDN. In this case, messages on the ISDN D-channel are in the form of frame relay frames and not packets inside LAPD frames.
All of these UNI options are simply ways to gain access to the PVCs, and the DLCIs that represent the PVCs, defined on the frame relay network. Leased lines remain the most common method by far. But there is another way to provide connections and their DLCIs on a frame relay network. This method involves the use of a signaling protocol, Q.933, to provide demand connections on frame relay networks. Although not common, this topic of SVCs and signaling on a frame relay network is deserving of a chapter all its own.
Chapter 5: Frame Relay Signaling and Switched Virtual Circuits
Overview
Frame relay is a form of fast packet switching. This means that frame relay switches, the network nodes of a public frame relay network, are capable of switching packets fast enough to satisfy any application carried inside the packets, including compressed voice and video. Packet switching has been around for a while in the form of X.25; it has its roots as far back as IBM’s SNA and early Internet protocols before TCP/IP. The essence of packet switching is that individually addressed data units called packets all flow on the same shared link, one after another on this virtual circuit or virtual channel, without the packet content application needing any dedicated bandwidth or channel to function correctly. (Networks that rely on channels with dedicated bandwidth to function correctly are known as circuit-switched networks.) In connection-oriented packet protocols like X.25 and frame relay, the individual address is a locally unique connection identifier, the data link connection identifier (DLCI) in the case of frame relay. In connectionless packet protocols like TCP/IP and many LAN-based protocols, the individual address is a globally unique end-system (host in TCP/IP) identifier, the fully qualified IP address in the case of TCP/IP. Packets are routed along connection paths in packet networks like frame relay. Packets are all routed independently in packet networks like the TCP/IP-based Internet.
Packet networks like frame relay need a connection to be set up between source and destination, and typically between each and every network node, before the first packet makes its way from source to destination. The frames in frame relay all say something like, “Send this packet (or piece of packet) inside the frame on connection 22.” Packet networks like the Internet do not need any connection at all between source and destination before packets are sent into the network.
There might be an end-to-end connection at a higher layer than is present in an IP router (that is, there might be a connection at the TCP layer), but the point is that there are no IP connections between routers or users. The packets in IP networks all say something like, “Send this packet from source address A to destination address B.” Note the choice of wording in the previous paragraph. In both cases, the fundamental unit of exchange is the packet. The packets are inside frames in all cases, but the frame is of minimal interest to the application since frames only flow hop-by-hop on each link through the network, whatever type of network it may be. It is the packet that is the fundamental unit that leaves here and arrives there unchanged by the network. In both cases the networks route the packets or frames containing the packets. The Internet is as much a packet-switching network as frame relay. But the Internet network node is called a router and the frame relay network node is called a switch. The reasons for this terminology difference have already been discussed and need not be repeated here. However, there are not fundamental differences between connection-oriented networks like frame relay and connectionless networks like the Internet. The presence or lack of network-level connections is the most fundamental difference of all. Of course, in packet switching the connection is not a circuit or channel, but a virtual or logical channel or circuit. Connection-oriented networks such as frame relay require the presence of virtual circuits to provide the path for the user data transfer frames to follow across the network. The virtual circuit also provides a basis for the bandwidth guarantees for what quality of service a frame relay network provides, but this is a different matter. What is important here is the presence of a connection (frame relay virtual circuit) in a frame relay network to provide the path through the network for user data to follow.
The question is: where do the connections come from? This seemingly simple question actually has profound implications for the future of not only frame relay networks, but also all connection-oriented networks in general. This book has already pointed out that there are really two distinct types of virtual circuits. The permanent virtual circuit (PVC) is the packet-switching equivalent of the leased private line. The PVC is available 24 hours a day, seven days a week, so there is always a path established through the network from A to B. But there are also switched virtual circuits (SVCs). The SVC is the packet-switching equivalent of a dialup modem connection. The SVC is only available after a connection phase or call setup phase, which takes some finite amount of time for the network to complete, so it is not really transparent to the user. After a data transfer phase, in which the information of importance to the user application is exchanged across the network, there is a further disconnect phase or call clearing phase, transparent to the user, after which there is no longer any connectivity between A and B across the network. Different terminology has been applied to different types of connection-oriented networks. But whether connection or setup, disconnect or clearing, the ideas are exactly the same.
The whole connection-transfer-disconnect process is usually known as call control in many networks, including frame relay. There are three distinct phases to the whole call control process. The first is the call establishment phase, which sets up a connection. The second is the information (or data) transfer phase, where the endpoints exchange whatever information (voice, video, or data) the connection was established to exchange in the first place. Finally, there is a call clearing phase, after which the endpoints are disconnected, exactly the same state as they were in before the process began.
The concept of call control for SVCs in a frame relay network is shown in Figure 5.1.
Figure 5.1 Switched virtual circuit call control. There are some implementation quirks to this connection process that should be pointed out. The whole process starts off when a user issues a call setup request across the local UNI into the network. This originator process is not as simple as a user typing in the frame relay network address of the recipient and pressing enter. In frame relay the originator is usually a software process in the FRAD (or, rarely, software in the end-user device) that sends a signaling message across the UNI to the signaling software process in the frame relay switch (the local network node in this example). If the two endpoints are not serviced by the same network node, then the call setup message must be sent through the network to the remote frame relay switch. In Figure 5.1, this signaling message shuttling is indicated by a broken line. The only special action that the frame relay network has to take is to find the remote network node and UNI switch port that the recipient device is located on. All the local network node has to go on is the destination frame relay network address. The recipient of the call setup request message might be in the next town or across the country. The switches’ signaling protocol routing table must be able to find the destination anywhere on the frame relay network. The switches’ signaling protocol routing table also must be able to determine the best path at the moment to set up the connection as well. The local frame relay switch must be able to route new call setup requests around congestion or failed network links, for example. And, if the connection is billed by connection time (no other fair means of charging has ever been implemented successfully), then the billing software must be engaged as well in anticipation of the new connection.
At this point, with the exception of the billing part, it might start to sound as if the routing of a frame relay call setup request (and all networks that support SVCs must do the same) is suspiciously like the routing of an IP packet through the Internet. Both setup message routing and IP routing employ full, globally unique network addresses to route each data unit independently; both involve routing table lookups; both rely on updated network topology information to choose the best path; both dynamically reroute around network failures, and so on. Of course, the reason for this similarity is that routing setup messages and routing IP packets are essentially the very same process! That is, all connection-oriented switching networks that support SVCs must also support connectionless routing. Otherwise there could never be any new SVCs established at all. The main difference between connection-oriented SVC networks like frame relay and connectionless IP networks like the Internet is that IP networks treat every packet like a call setup request. Once this realization is made, the familiarity that people have today with how IP routers function makes it much easier to understand exactly what a frame relay switch does with a call setup request. The call setup message emerging from the network at Step 1a in Figure 5.1 is nothing more than the delivery of a connectionless packet over the frame relay network, with all the features that the process implies, such as a routing protocol running between switches to provide topology updates, and so on. Early Internet documents were often fond of pointing out that while connection-oriented networks like the PSTN had to support both connectionless and connection-oriented networking, the Internet only had to implement connectionless services. So, obviously, the Internet’s structure was much simpler than the PSTN’s.
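To make the comparison concrete, the routing of a call setup request can be sketched as a path search over the switch topology that skips failed links. This is only an illustration of the idea: the frame relay standards do not define how switches route setup messages, breadth-first search stands in here for whatever routing protocol a vendor actually runs, and the switch names and the `route_setup` function are hypothetical.

```python
from collections import deque


def route_setup(topology, source, destination, failed_links=frozenset()):
    """Find a switch-by-switch path for a call setup message.

    topology maps each switch to its neighboring switches; failed_links is a
    set of frozenset({a, b}) pairs to route around. Returns a list of
    switches, or None when the destination cannot be reached.
    """
    previous = {source: None}          # how each switch was first reached
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:    # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in topology[node]:
            link = frozenset((node, neighbor))
            if neighbor not in previous and link not in failed_links:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


if __name__ == "__main__":
    # Hypothetical four-switch network with two paths from A to D.
    net = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
    print(route_setup(net, "A", "D"))
    print(route_setup(net, "A", "D", {frozenset(("B", "D"))}))
```

Once a path has been chosen this way, the switches along it can assign DLCIs hop by hop, and every later frame on the connection is forwarded by a simple DLCI lookup rather than by another search of this kind.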
This line of thinking is seldom seen today, especially with all the talk in IP circles about IP flows (flows are basically IP connections but purists cringe at the very thought of connection-oriented IP), but that does not mean the argument is not valid. Note that these features to enable SVCs are totally independent of the end user, FRAD, or frame relay switch support for the frame relay signaling protocol itself. The presence of a signaling protocol alone does not mean that a frame relay network suddenly can support SVCs. Of course, the absence of such signaling protocol support in any needed component does prevent the network from supporting SVCs. In any case, Step 2 shows the local network node issuing a call proceeding message to the originator. In the PSTN, older voice switches can still generate call proceeding tones which are meant to indicate to the user that the network was working on the request, and in truly old switches represented the actual signaling tones themselves. In most modern voice switches, there is just silence on the line. In frame relay, the call proceeding message tells the originator to hold on because the far end is being contacted but has not responded yet. Although not shown in the simple figure in this text, it is common for the originator to receive the call proceeding message before the call setup message has made its way through the network to the recipient. This does not materially change the procedure. Step 3 shows the recipient accepting the new connection. The recipient can also choose to reject the call setup request for a variety of reasons. Network printers routinely reject call setups when they are currently printing one job already (they can’t reasonably intersperse pages anyway) or when they are simply out of paper. End computers might reject new connections due to current traffic load. The reasons are many and have nothing to do with the network itself, in most cases. 
It is worth noting that the network can also reject a request for a new connection as well, mostly due to traffic congestion considerations (the requested CIR cannot be guaranteed). In Figure 5.1, the recipient accepts the connection and sends a connect message back to the originator. This message now follows the path set up through the network by the routing of the outbound call setup message, so no independent routing is required for the connect message. Upon receipt of the connect message by the originator, the connection is now ready. The messages exchanged to this point also contain all of the information needed to allow the endpoints to determine the DLCIs used in both directions, the CIRs, and so forth.
Once the connection is no longer needed, one party or the other issues a disconnect message which initiates the call clearing phase. This decision to disconnect is usually determined by the end users, but the network can release connections that remain unused for previously defined periods of time. Since disconnects by the originator are more common, this is the interaction shown in the figure. The disconnect message follows the same path as the information through the network and pops out at the recipient. Thereafter, the two network nodes involved (sometimes there is only one, as previously noted) operate independently. Step 2 shows the local node issuing a release to the originator of the disconnect message. If this message is issued before the network knows that the disconnect has been received by the recipient, the process is known as an abrupt release. If the network waits until the recipient issues its own release as shown in Step 1b, then this is known as a graceful release and is somewhat uncommon. If a network relies on SVCs to generate revenue and conserve network resources (typical), then the network naturally wants to release the buffers and bandwidth tied up with a given connection as soon as possible. In either case, the originator issues a release complete in Step 3 while the recipient independently receives a release complete in Step 1c. There is no requirement for the sequencing of the two release-complete messages. Once either end issues a disconnect, that’s enough for the network. The actual implementation of a signaling protocol on a network can be quite complex. This simple example cannot begin to address all of the issues involved at each step of the process. Some but not all of the details regarding these issues and some of the additional information conveyed in the messages themselves will be discussed later. However, all of the details are not necessarily needed to understand the functioning of frame relay SVCs in general. 
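The originator's view of the call control exchange described above can be summarized as a small state machine. This is a sketch only: the state names are labels invented for this example and are not the formal call states of Q.933, and only the message sequence walked through in the text (originator sets up the call and later disconnects it) is modeled.

```python
from enum import Enum, auto


class State(Enum):
    NULL = auto()                  # no connection exists
    CALL_INITIATED = auto()        # setup sent, nothing heard back yet
    CALL_PROCEEDING = auto()       # network is working on the request
    ACTIVE = auto()                # data transfer phase
    DISCONNECT_REQUESTED = auto()  # disconnect sent, waiting for release
    RELEASE_RECEIVED = auto()      # release received, must confirm


# (current state, event) -> next state, from the originator's side of the UNI
TRANSITIONS = {
    (State.NULL, "send SETUP"): State.CALL_INITIATED,
    (State.CALL_INITIATED, "recv CALL PROCEEDING"): State.CALL_PROCEEDING,
    (State.CALL_PROCEEDING, "recv CONNECT"): State.ACTIVE,
    (State.ACTIVE, "send DISCONNECT"): State.DISCONNECT_REQUESTED,
    (State.DISCONNECT_REQUESTED, "recv RELEASE"): State.RELEASE_RECEIVED,
    (State.RELEASE_RECEIVED, "send RELEASE COMPLETE"): State.NULL,
}


def run(events, state=State.NULL):
    """Apply a sequence of events and return the final state."""
    for event in events:
        try:
            state = TRANSITIONS[(state, event)]
        except KeyError:
            raise ValueError(f"{event!r} is not valid in state {state.name}") from None
    return state


if __name__ == "__main__":
    call = ["send SETUP", "recv CALL PROCEEDING", "recv CONNECT"]
    clear = ["send DISCONNECT", "recv RELEASE", "send RELEASE COMPLETE"]
    print(run(call).name)          # connection is ready for data transfer
    print(run(call + clear).name)  # back to the state before the call began
```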
Connections come from some connection setup phase between user (user A, for example) and network. This phase is only needed when there is no PVC or preexisting SVC between A and B established on the network. Most frame relay networks today offer PVC service and that is all. But some frame relay service providers have begun to offer SVCs, at least in limited circumstances. The reasons for these limitations will become apparent once a full discussion of signaling protocol implementation is completed.
Frame Relay Signaling Messages
The SVCs established using the FRF.4 subset of full Q.933 look pretty much like Q.933 messages, but FRF.4 messages are not all Q.933 types in all circumstances. Q.933 messages, in turn, look like the Q.931 messages first established for ISDN signaling on the D-channel, but again Q.933 is only a subset of full Q.931. It should be noted that none of this is actually in the FRF.4 document, which is fond of referring readers to some section of Q.933. But Q.933 is fond of referring readers to sections of Q.931 (and even the related Q.932), so without all three documents in hand, it is hard to figure out exactly what is supposed to be going on. So this section of the book should bring a lot of ideas and information together.
All frame relay signaling messages have a common format. They are all stuck inside a frame relay frame flowing on DLCI = 0, so they are relatively easy for a FRAD and a network to discover (there are other things that flow on DLCI = 0, but these are discussed in a later chapter). When the frame relay frame is used for signaling information (DLCI = 0), the first two octets after the frame relay header are a Control field. Since there are more than just signaling protocol messages that use DLCI = 0, this Control field is used to allow receivers to figure out just what is inside the frame. When used to carry an SVC-related signaling message, the Control field is what is known as an Information frame or I-frame. The I-frame structure is identified by having a “0” bit at the end of the first octet (least significant bit) of the Control field. The second octet ends with what is known as the Poll bit. When set (1), the Poll bit tells the receiver that the sender expects a response frame. When the Poll bit is not set (0), it means that the receiver need not respond to the frame. The other seven bits in each octet form the N(S) and N(R) fields.
All information frames are always numbered and these sequence numbers are used as a means for the receivers to figure out if any frames are missing in a given sequence of signaling messages. The N(S) field is the sequence number of the frame being sent and the N(R) field is the sequence number of the next-expected I-frame (signaling message). The numbers cycle from 0 to 127, then repeat. The overall structure of a frame relay frame carrying an FRF.4 signaling message is shown in Figure 5.2. The similarity with the LAPD frame is notable and just another confirmation of frame relay’s origins.
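The two-octet I-frame Control field described above can be packed and unpacked with a few bit operations. The sketch assumes the LAPD-style ordering, with N(S) in the first octet and N(R) in the second; the function and class names are invented for this example.

```python
from typing import NamedTuple


class IFrameControl(NamedTuple):
    ns: int      # sequence number of this frame, 0..127
    nr: int      # sequence number of the next I-frame expected, 0..127
    poll: bool   # True when the sender expects a response frame


def build_control(ns: int, nr: int, poll: bool) -> bytes:
    """Pack N(S), N(R), and the Poll bit into the two Control field octets."""
    first = (ns % 128) << 1                 # low bit 0 marks an I-frame
    second = ((nr % 128) << 1) | int(poll)  # low bit is the Poll bit
    return bytes([first, second])


def parse_control(control: bytes) -> IFrameControl:
    """Unpack a two-octet Control field, rejecting anything but an I-frame."""
    if len(control) != 2:
        raise ValueError("I-frame Control field is two octets long")
    if control[0] & 0x01:
        raise ValueError("not an I-frame: low bit of the first octet is 1")
    return IFrameControl(control[0] >> 1, control[1] >> 1, bool(control[1] & 0x01))


def next_sequence(n: int) -> int:
    """Sequence numbers cycle from 0 to 127, then repeat."""
    return (n + 1) % 128


if __name__ == "__main__":
    field = build_control(ns=5, nr=3, poll=True)
    print(field.hex(), parse_control(field))
```

A receiver that sees an N(S) different from the value it expected knows that a signaling message has gone missing, which is the whole purpose of numbering the frames.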
Figure 5.2 Frame relay frame carrying an FRF.4 SVC signaling message.
The figure shows not only the two octets of the Control field structure, but also the overall structure of the entire Information field. All frame relay signaling messages, identified by the DLCI = 0, I-frame format, start out with a five-octet signaling message header. The header has three fields. The first is a one-octet Protocol Discriminator field, which is used to identify the exact signaling protocol used. For frame relay, this field is set to 00001000, which is nothing more than the number 8 in hexadecimal (08h).
The next three octets in the required signaling message header form the Call Reference field. The first four bits of this field are always 0000 and the next four bits give the exact length of the call reference value itself. In frame relay, this value is 0010, or 2. This means that the call reference value itself is two octets long. The value is carried in the final two octets (16 bits) of the three-octet call reference field. One of these bits is a flag bit, leaving 15 bits for the actual call reference value itself. The flag bit is set to 0 at the originator side of the frame relay network and set to 1 at the destination side of the frame relay network. This prevents a phenomenon known as glare, which can happen when both endpoints happen to pick the same call reference value for an incoming and outgoing call. The call reference number is essentially how the frame relay network identifies a connection internally, a mechanism that works beyond the DLCI number. Like DLCIs, call reference values are of local significance only and there can be many calls with the same call reference value around the network, but only one of a given value on a given UNI. At first glance, it might not be apparent why a number other than the DLCI would be helpful to the network when SVCs are supported on a frame relay network. After all, PVCs work just fine with only the DLCI to go on. The key here is that SVCs come and go, and each SVC needs a DLCI only when it is established and the connection is being billed. There are less than a thousand DLCIs that can be used for SVCs, which sounds like a lot, but really isn’t. If a frame relay SVC is used as much as a typical Web session (or for a typical Web session), about 30 minutes or so, then 50 users will establish, use, then release, 100 connections per hour across a frame relay UNI. 
Over a 10-hour work day, that works out to 1000 SVC connections, more than could be tracked by expecting the network to give each one a distinct DLCI when the connection is established. Admittedly, this example seems high, but the point is that at some level of SVC activity, billing by DLCI alone could cause confusion on the part of the network. So the call reference system, with more than 32,000 distinct values to use (15 bits), gives the frame relay service provider a greater range to assign and track DLCIs internally to reference SVC calls and bill users properly. An SVC will have different call reference values on each UNI, however, and the standards do not specify exactly how a frame relay network could or should use these call reference values other than to say that call reference values are temporary identifiers of SVCs. The internal use of the call reference values to track SVCs is up to the individual hardware and software vendors, and service providers.
The last octet of the five-octet signaling message header shown in Figure 5.2 is the Message Type field. The first bit of this field is always 0 and the other 7 bits are used to indicate to the receiver whether the signaling message is a call setup, disconnect, or whatever. After the five-octet header, the frame relay signaling message has many possible structures depending on the value of the Message Type field in the signaling message header. The rest of the information field consists of a variable number of Information Elements (IEs), and each IE has a variable length depending on its type. There has to be at least one IE present in all signaling message types. All of the IEs are either one octet long (and of two formats, Type 1 or Type 2 IEs) or more than two octets long (a variable-length IE). The single-octet IEs begin with a 1 bit and the variable-length IEs all begin with a 0 bit.
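The five-octet signaling message header can be unpacked as shown in the following sketch. It assumes that the flag bit is the most significant bit of the two-octet call reference value, as in Q.931; the names used are hypothetical, and the sample message type value is arbitrary.

```python
from typing import NamedTuple

PROTOCOL_DISCRIMINATOR = 0x08   # 00001000, as given in the text


class SignalingHeader(NamedTuple):
    call_ref_flag: int    # 0 at the originator side, 1 at the destination side
    call_ref_value: int   # 15-bit call reference value
    message_type: int     # 7-bit message type code


def parse_signaling_header(data: bytes) -> SignalingHeader:
    """Unpack the five-octet header that starts every signaling message."""
    if len(data) < 5:
        raise ValueError("signaling message header is five octets long")
    if data[0] != PROTOCOL_DISCRIMINATOR:
        raise ValueError("unexpected Protocol Discriminator")
    # Call Reference, first octet: four 0 bits, then the value length (2).
    if data[1] >> 4 != 0 or data[1] & 0x0F != 2:
        raise ValueError("call reference length octet must be 0000 0010")
    call_ref = (data[2] << 8) | data[3]
    if data[4] & 0x80:
        raise ValueError("first bit of the Message Type field must be 0")
    return SignalingHeader(call_ref >> 15, call_ref & 0x7FFF, data[4] & 0x7F)


if __name__ == "__main__":
    # Flag = 1, call reference value = 42, arbitrary message type 0x05.
    print(parse_signaling_header(bytes([0x08, 0x02, 0x80, 0x2A, 0x05])))
    print(2 ** 15)   # number of distinct 15-bit call reference values
```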
A complete signaling message in frame relay is simply a frame on DLCI = 0 with the control field for I-frame that contains the five-octet signaling message header and one or more IEs. IEs themselves can become a bewildering array of the most seemingly arcane information that could be imagined. Many IEs seem to encompass the most minute details of connection behavior. Both of these statements are true. The point is that a frame relay network supporting SVCs must be able to gather all of the same types of information from the user requesting a connection as from a user requesting a PVC. With PVCs, however, the interaction is human to human, and it is relatively easy to see what will work well and what is not such a good idea. The DLCIs must be unique on the UNI, the CIR must not exceed the booking policy, the total DLCIs must not exceed the supported number of connections, and so on. With SVCs, the frame relay network has to figure all of this out on-the-fly, in real time, without the guiding hand of a human anywhere in the process to say “wait a minute! This is a dumb thing to do, and the UNI or switch might fail...”
Frame Relay Information Elements
FRF.4 uses a subset of the full Q.933 signaling message types (call setup, disconnect, etc.) to handle SVCs. Fortunately, FRF.4 also uses a subset of the full range of IEs established for Q.933 SVCs. Some of the IEs are mandatory (M) for a given message type and must appear, while others are optional (O) and can be absent. All of the IEs used in FRF.4 are more than one octet long, so there are no Type 1 or Type 2 single-octet IEs in FRF.4. Each variable-length IE used in FRF.4 has a common format as shown in Figure 5.3.
Figure 5.3 FRF.4 Information Element (IE) format.
All of the FRF.4 IEs start with a 0 bit, indicating a multiple-octet IE. The second octet always contains the length of the contents of the IE itself and not the length of the entire IE, as might be expected. The remaining octets, and there might be many, contain the values of all the fields of the IE itself. All of the IEs have distinct numerical identifiers. Because some are mandatory and some are optional, they might or might not be present in a signaling message. So, to make life easier for receivers, when IEs are loaded into a single signaling message, all of the IEs must be in ascending numerical order. This makes it easy for a receiver to tell if a given IE is present. Table 5.3 shows the IE identifier coding for the FRF.4 IEs used for SVCs, the mandatory (M) or optional (O) fields by message type, and the IE’s maximum length in octets (if applicable).

Table 5.3 FRF.4 Signaling Messages Used for SVCs
FRF.4 SVC Message Types: 1 = SETUP, 2 = CALL PROCEEDING, 3 = CONNECT, 4 = DISCONNECT, 5 = RELEASE, 6 = RELEASE COMPLETE
(A dash means the IE is not used in that message type, or that no maximum length is given.)

Identifier   Information Element               1  2  3  4  5  6   Max. Length
000 0100     Bearer Capability                 M  -  -  -  -  -   5
000 1000     Cause                             -  -  -  M  M  M   32
001 1001     Data Link Connection Identifier   M  M  M  -  -  -   6
100 1000     Link Layer Core Parameters        O  -  M  -  -  -   27
100 1100     Connected Number                  -  -  O  -  -  -   -
100 1101     Connected Subaddress              -  -  O  -  -  -   23
110 1100     Calling Party Number              O  -  -  -  -  -   -
110 1101     Calling Party Subaddress          O  -  -  -  -  -   23
111 0000     Called Party Number               O  -  -  -  -  -   -
111 0001     Called Party Subaddress           O  -  -  -  -  -   23
111 1000     Transit Network Selection         O  -  -  -  -  -   -
111 1100     Low Layer Compatibility           O  -  -  -  -  -   14
111 1110     User-user                         O  -  O  -  -  -   131
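The common IE format and the ascending-order rule lend themselves to a short sketch. The Python below is a hypothetical illustration, not an FRF.4 implementation: the function name `parse_information_elements` and the `IE_NAMES` mapping are assumptions introduced here, the identifiers are copied from Table 5.3, and the contents of each IE are returned as raw octets without further decoding.

```python
# Identifier codes taken from Table 5.3 (7-bit values after the leading 0 bit).
IE_NAMES = {
    0b0000100: "Bearer Capability",
    0b0001000: "Cause",
    0b0011001: "Data Link Connection Identifier",
    0b1001000: "Link Layer Core Parameters",
    0b1001100: "Connected Number",
    0b1001101: "Connected Subaddress",
    0b1101100: "Calling Party Number",
    0b1101101: "Calling Party Subaddress",
    0b1110000: "Called Party Number",
    0b1110001: "Called Party Subaddress",
    0b1111000: "Transit Network Selection",
    0b1111100: "Low Layer Compatibility",
    0b1111110: "User-user",
}


def parse_information_elements(body: bytes) -> list[tuple[int, bytes]]:
    """Split the octets after the five-octet header into (identifier, contents) pairs."""
    elements = []
    offset = 0
    previous_id = -1
    while offset < len(body):
        first = body[offset]
        if first & 0x80:
            # A leading 1 bit marks a single-octet IE, which FRF.4 does not use.
            raise ValueError("single-octet IE found; only variable-length IEs expected")
        if offset + 2 > len(body):
            raise ValueError("truncated IE: missing length octet")
        ie_id = first & 0x7F
        length = body[offset + 1]  # length of the contents only, not the whole IE
        start = offset + 2
        end = start + length
        if end > len(body):
            raise ValueError("truncated IE contents")
        if ie_id < previous_id:
            raise ValueError("IEs must appear in ascending identifier order")
        elements.append((ie_id, body[start:end]))
        previous_id = ie_id
        offset = end
    return elements
```

Because the identifiers arrive in ascending order, a receiver looking for a given IE can stop scanning as soon as it sees a larger identifier. A body such as `bytes([0x04, 0x03, 0x88, 0xA0, 0xCF, 0x19, 0x02, 0x00, 0x10])` (the content octets are placeholders) splits into a Bearer Capability IE with three content octets followed by a DLCI IE with two.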
Anyone familiar with the tables presented in Q.933 or even Q.931 can appreciate the compactness of the FRF.4 IE list. There are a minimal number of IEs, and most are optional. Only a few are needed for all message types and, of these, only the SETUP message has a complex set of options to deal with. A few words about the function of each IE are definitely in order.

Some of the IEs are relatively self-explanatory in nature and function. For instance, an SVC has no DLCI assigned initially, so it only makes sense that the Setup, Call Proceeding, and Connect signaling messages must contain a DLCI information element. In general, the DLCI IE value can be requested by the user or assigned by the network. The network will try to grant the user’s DLCI request, but always reserves the right to allocate another DLCI value for the SVC. FRF.4 only uses the DLCI IE in the network-to-user direction, however. Likewise, the Disconnect, Release, and Release Complete messages must have a Cause associated with them to inform the endpoints why the connection is being dropped.

Frame relay network addresses can consist of what basically amounts to a site identifier (this UNI) and an additional subaddress (this port on the FRAD or this software process on an end-user device). The presence of these IEs as options in Setup and Connect messages is therefore neither exciting nor remarkable. In an equal-access environment where users have the right to choose an IXC (transit network) regardless of the LEC used on each end of the frame relay network, the presence of a Transit Network Selection IE is only to be expected. In fact, this IE was included in FRF.4 for future use only, but can be present nonetheless.

The remaining four IEs (Bearer Capability, Link Layer Core Parameters, Low Layer Compatibility, and User-user) are a little more complex. The Bearer Capability is the basic IE and must be present in the Setup message. This IE is used by the network to identify a network service.
There are lots of services that could be supported by a fast packet network such as frame relay. For now, the Bearer Capability IE indicates frame mode as the transfer mode, that is the means by which information travels across the network. The Bearer Capability IE also says that these frames are Layer 2 protocol frames, and that the information transfer capability of the SVC will be unrestricted digital information. Taken together, the Bearer Capability IE defined in FRF.4 is just another way of saying that the network does not need to look inside the information frames for any reason whatsoever. The Link Layer Core Parameters IE is the most complicated and must be present in the Connect message in the network to user direction. This IE is optional in the user-to-network direction (few users would know these parameters anyway). There are four main network parameters that must be set for each and every DLCI on the frame relay network. For PVCs, these parameters can be established through human contact, a service agreement, or some other mechanism. For SVCs, these parameters must be established on-the-fly, in real time. The four main parameters are the Frame Relay Information Field (FRIF) maximum size, the throughput needed for the connection (call), the committed burst size, and the excess burst size. The committed and excess burst size are used to determine if and when frames may be tagged as discard-eligible or ignored, as described in the previous chapter. The throughput parameter is the equivalent of the CIR and allows the network to determine the proper CIR for the new connection. All four parameters are specified (and can differ) in both directions, outbound and inbound. This provides further evidence of the inherently bidirectional nature of frame relay connections, no matter how they might be billed by the service provider.
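The four parameters, each specified separately per direction, can be pictured as a small record. The Python sketch below is illustrative only. The class names, field names, and units are assumptions, and the three-way classification rule is the usual committed/excess burst interpretation (traffic within the committed burst is forwarded, traffic within the excess burst is tagged discard-eligible, anything beyond may be discarded); it does not reproduce the FRF.4 octet encoding of this IE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectionParameters:
    max_frif_octets: int       # maximum Frame Relay Information Field size
    throughput_bps: int        # requested throughput (the CIR equivalent)
    committed_burst_bits: int  # committed burst size
    excess_burst_bits: int     # excess burst size


@dataclass(frozen=True)
class LinkLayerCoreParameters:
    outbound: DirectionParameters  # values may differ by direction
    inbound: DirectionParameters


def classify_traffic(bits_in_interval: int, params: DirectionParameters) -> str:
    """Classify the traffic seen in one measurement interval (assumed rule)."""
    if bits_in_interval <= params.committed_burst_bits:
        return "forward"
    if bits_in_interval <= params.committed_burst_bits + params.excess_burst_bits:
        return "forward, discard-eligible"
    return "discard"
```

An asymmetric connection might carry, say, `DirectionParameters(1600, 64000, 64000, 32000)` outbound and smaller values inbound; these numbers are invented for illustration.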
The Low Layer Compatibility IE is a number of fields that in some ways resemble the fields of the Bearer Capability IE. That is, this IE gives the network and the other end of the SVC further information about the Layer 2 and Layer 3 protocols that will be used on the new connection. The Setup message can include this information and additional details such as the user’s data rate or what flow-control mechanisms the end users intend to employ. The whole intent is to allow an intended destination on an SVC to decide if it makes any sense to accept the connection at all if there are concerns that the two end processes cannot communicate due to lower layer incompatibilities. Higher layer incompatibilities might still be a problem, of course, but this is not the concern of the frame relay network itself under FRF.4.

Finally, the User-user IE provides a means for the users at the ends of a not-yet-established SVC to transfer up to 131 octets worth of information in order to provide some miscellaneous information from one user to another. For example, the User-user IE can be used to convey a password to an endpoint that is needed before the destination will accept the SVC from an originator. The User-user IE can also be used to fill the pipe in the most efficient fashion. This IE is optional in the Setup and Connect messages. Remember that to a user, the call setup delay is added to the overall end-to-end delay through the network. It might take only 1 second to transfer the user information, but if the call setup delay is five seconds long, the user perceives the delay for the SVC service to be six seconds. Therefore, if some user data can be transferred as the connection is being set up, it reduces the perceived delay on the part of the user. In fact, for short interactions, the use of the User-user IE can mimic a kind of connectionless service of the frame relay network, since all call setups are routed independently through the frame relay network.
Those interested in more details about the actual bit structures of frame relay signaling messages for SVCs and all of the IEs are referred to the relevant ITU-T, ANSI, and FRF documentation listed in the bibliography. For the purposes of this chapter, it will be enough to show a frame relay SVC call setup message with all its IEs, mandatory and optional. This call setup message is shown in Figure 5.4.
Figure 5.4 Frame relay SVC call setup message.
Some texts tend to become excited about signaling protocols and messages for SVCs. But the real excitement of signaling protocols and messages for SVCs is not in their structure, but in their use. This topic will occupy the rest of this chapter.
The Q.933 Signaling Protocol and Frame Relay Forum’s FRF.4
It has already been mentioned that frame relay LAPF core provides a basic, PVC-based data transfer service from FRAD to FRAD across the frame relay network. In order to offer SVC-based data transfer services, it is necessary for the frame relay network and FRAD to support some form of signaling protocol. At this point there are two main possibilities for adding this SVC support. The two main methods are to use ISDN to access a frame relay network and set up frame relay SVCs (non-ISDN SVCs), or to make frame relay the data service part of the ISDN network and set up frame relay SVCs the same way that any other connections are made on the ISDN (ISDN SVCs). If ISDN is used to access a frame relay switch, or even point-to-point leased lines as a UNI, then it is possible to use the Frame Relay Forum’s User-to-Network SVC Implementation Agreement (FRF.4). The FRAD and network understand the same signaling protocol; it is based on the signaling protocol used in the ISDN scenarios. This chapter will outline the use of non-ISDN and ISDN SVCs, but will mostly emphasize the use of the Frame Relay Forum’s FRF.4 as the way that frame relay service providers and FRAD vendors currently implement SVCs in frame relay.

The ITU-T tends to see frame relay as another thing that ISDN can do. There is nothing wrong with this, but service providers have tended to deploy, market, and sell ISDN and frame relay in an entirely separate fashion. The ITU-T recommendation that addresses frame relay signaling issues and establishes the signaling protocol for frame relay networks is Q.933, which has the mind-blowing title of Integrated Services Digital Network (ISDN) Digital Subscriber Signalling System No. 1 (DSS 1) – Signalling Specifications for Frame Mode Switched and Permanent Virtual Connection Control and Status Monitoring. As the title promises, there is much in Q.933 that concerns PVCs and status monitoring.
This is in line with the ITU-T philosophy of considering anything that is not information transfer on a network to be signalling (note the double “l”). The related topics in Q.933 are further considered in later chapters. For now, it is what Q.933 has to say about SVCs that is of interest.

Q.933 says that there are two ways that a Frame Relay Bearer Service or FRBS can use ISDN to establish demand connections (SVCs). The use of the term FRBS refers to the information transfer aspects of a frame relay network as part of an overall ISDN. Q.933 calls these two ways Case A and Case B. Case A uses ISDN to access a Remote Frame Handler (RFH), which is the frame relay switch. Once this initial ISDN connection is established using ISDN Q.931 messages, then the signaling endpoint can generate the proper frame relay signaling Q.933 messages to establish an SVC. Case B considers the frame relay switch (frame handler to Q.933) to be local to the ISDN switch and therefore essentially integrated with the ISDN switch. So Q.933 messages can be used directly and immediately, without the need for an ISDN Q.931 connection first. This still works because Q.933 is a subset of the full Q.931 signaling protocol.

The problem is that neither Case A nor Case B is often encountered in the real world of ISDN and frame relay. More typically, there are ISDN lines and there are frame relay UNIs. Signaling messages sent on one just don’t find their way to the other. So real-world frame relay networks usually follow the Frame Relay Forum’s User-to-Network SVC Implementation Agreement (FRF.4), which requires no ISDN relationship in the frame relay network or FRADs at all. The FRF.4 document is basically a subset of the full Q.933 signaling, since not everything is needed for SVCs when there is no ISDN around. FRF.4 basically says:
• If ISDN is used, it is used only in Case A scenarios (no frame relay-ISDN integration).
• There are a few exceptions to the full Q.933 Case A signaling message scenarios.
• Q.933 signaling messages are always sent inside LAPF Core frames on DLCI 0.
• A UNI can have both PVCs and SVCs established at the same time.
• End systems will have either E.164 network addresses (telephone numbers) or X.121 network addresses (the same as used in X.25), which look like 10-digit telephone numbers anyway.

There are a lot of other things addressed by FRF.4, but most of the document concerns how to pare down the full Q.933 signaling messages to get a useful subset that does not require or even rely on ISDN to function. For example, the full Q.933 message set employs 11 message types in three major categories. FRF.4 keeps the three categories, but cuts the number of message types down to eight. The message types are shown in Table 5.2. As evident in the table, the message types dropped by FRF.4 are concerned with connection establishment and mostly geared toward ISDN. For instance, Progress messages allow an originator to tell if an attempted connection is blocked due to lack of network resources. The Alerting message is the Q.933 equivalent of the telephone ringing, and so on. The Frame Relay Forum decided that frame relay SVCs could be supported (and established more quickly) without this additional messaging overhead.

Table 5.2 Q.933 and FRF.4 Message Types

Message category     Q.933                    FRF.4
Call establishment   Setup                    Setup
                     Call Proceeding          Call Proceeding
                     Progress                 -
                     Alerting                 -
                     Connect                  Connect
                     Connect Acknowledgment   -
Call clearing        Disconnect               Disconnect
                     Release                  Release
                     Release Complete         Release Complete
Miscellaneous        Status                   Status
                     Status Enquiry           Status Enquiry
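The counts quoted for Table 5.2 are easy to check mechanically. The short Python sketch below simply transcribes the table into two dictionaries (the variable and function names are hypothetical) and derives which message types FRF.4 drops.

```python
# Message types by category, transcribed from Table 5.2.
Q933_MESSAGES = {
    "Call establishment": [
        "Setup", "Call Proceeding", "Progress",
        "Alerting", "Connect", "Connect Acknowledgment",
    ],
    "Call clearing": ["Disconnect", "Release", "Release Complete"],
    "Miscellaneous": ["Status", "Status Enquiry"],
}

FRF4_MESSAGES = {
    "Call establishment": ["Setup", "Call Proceeding", "Connect"],
    "Call clearing": ["Disconnect", "Release", "Release Complete"],
    "Miscellaneous": ["Status", "Status Enquiry"],
}


def count_messages(table: dict[str, list[str]]) -> int:
    """Total number of message types across all categories."""
    return sum(len(messages) for messages in table.values())


def dropped_messages() -> dict[str, list[str]]:
    """Q.933 message types that have no FRF.4 counterpart, by category."""
    return {
        category: [m for m in messages if m not in FRF4_MESSAGES[category]]
        for category, messages in Q933_MESSAGES.items()
    }
```

Here `count_messages` gives 11 for Q.933 and 8 for FRF.4, and every dropped message type falls in the call establishment category, which matches the discussion above.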
Who Needs SVCs Anyway?
Frame relay PVCs are logical (or virtual) connections on a frame relay network that are available at any time to send information to a remote site located at the other end of the PVC. In this sense, PVCs are the equivalent of dedicated, point-to-point leased lines on a private data network. But the use of leased lines comes at a price. A leased line will only ever lead to one other network location. Sending information somewhere else with private lines requires another private line and the associated expense. The alternative is to employ SVCs or switched services to reach other locations on the network on an intermittent basis. In this context, the term “switched services” applies to what are loosely called dialup services employing modems and the public switched telephone network (PSTN) to send data over the voice network.

The use of switched circuits instead of point-to-point links is best known from the public voice network. To place a voice telephone call, the user picks up the handset, dials a number, and waits for a connection. The number represents the network address of the remote location (people seldom think of telephone calls in this way, but this is exactly what a telephone number is). The PSTN represents this number in a signaling protocol like Signaling System 7 (SS7) understood by public voice networks and uses this information to indicate to the remote location that a request for a voice connection has been made (the telephone rings). If the connection is successful (“Hello?”), the users may then transfer information (which is why people call in the first place, but again few people think of telephone calls this way). When the transfer, which is usually two-way, is completed, the connection is terminated by hanging up the handset at either end. What is not obvious about this scenario is that this is exactly what happens on a frame relay network using SVCs instead of PVCs.
With SVCs, there is no need to establish a PVC with an associated and dedicated DLCI number at service provision time at all. Instead, the locations only need to establish an access link to a frame relay switch port connection at each end of the network. Connections and paths can then be established dynamically as needed between the sites using a special frame relay signaling protocol. In fact, there is no need to restrict this SVC process to a particular set of remote locations. Literally any location on the frame relay network can receive a connection request, even if the frame relay network address is not known to the originator! This may seem hard to figure out at first, but a similar thing frequently happens on the public voice network when telemarketers just call everyone attached to a given telephone company central office switch. The network address telephone numbers are generated and dialed one after another. The signaling protocol attempts to make the connection (rings the phone) regardless of where or who is making the connection request. This is a security threat in frame relay, not merely an annoyance. So SVCs are connections that are dynamically negotiated (in terms of CIR [bandwidth] and other parameters) and established between locations attached to the frame relay network. SVCs cannot be established unless the remote location has a frame relay access link and port connection of its own, of course.
This does not mean that PVCs will or should go away. PVCs can still be used to establish virtual private networks (VPNs) between corporate offices on the frame relay network, although VPNs are typically thought of as private Internet or intranet entities today. SVCs would handle the less frequent need to establish connections outside of this corporate network. For example, SVCs could easily handle supplier-to-customer traffic needs or support to users as well. The use of SVCs in frame relay networks could cut down on users’ PVC costs as well, since PVCs must be paid for on an ongoing basis (although this is generally a very small cost compared to the cost of the frame relay service itself).

Although SVCs can function similarly to dialup modem PSTN services, there are important differences. SVCs do not allow dialup access into the frame relay network over the frame relay UNI, which is often implied in SVC descriptions. The sites still must be connected to the frame relay network with dedicated access links, as with PVCs. The use and deployment of dialup access services is a separate issue and development that is totally independent of the concept and use of SVCs.

This book has emphasized the use of PVCs for frame relay network services instead of SVCs. There are several good reasons for this emphasis. First, PVCs are available now, and very inexpensively, usually a few dollars a month for a PVC in each direction between two sites. Second, the use of SVCs is sometimes seen as a security risk to corporate sites (although security can be added in a number of ways). Finally, the full standard frame relay Q.933 implementation of the SVC signaling protocol is not easy to make work effectively in large frame relay networks. The pros and cons of frame relay SVCs are listed in Table 5.1. There is a lot of merit to the pro-SVC position in the table. Each PVC established must have a table entry in each frame relay switch.
These table entries must be held in memory for speed of access. Too many PVCs can slow the network by making the lookup process slower and can increase service costs by requiring memory upgrades. SVCs keep table sizes to a minimum. There is no other way to effectively reach sites that were unplanned at service provision time without the use of SVCs. The PVC process can take days to implement, although 24 hours is a more typical timeframe. But during this period of time a site may literally be unreachable from some places on the frame relay network, even though the connectivity is there.

Table 5.1 Pros and Cons of Frame Relay SVCs
Pros
• Needed to keep the size of PVC tables to a minimum.
• Needed to reach sites not planned for at service provision time.
• Needed to make frame relay as flexible as possible.
Cons
• PVCs are inexpensive and table sizes are immaterial.
• SVCs are another unnecessary security risk.
• Adding SVC signaling protocol support to a frame relay network is not easy.

Finally, SVCs are needed to make frame relay as flexible as possible and to ensure long-term customer acceptance. Imagine voice services or Internet access networking today without telephone numbers or dialup modems! As voice comes to frame relay, SVCs will become even more necessary. In spite of these very good arguments, it seems unlikely that frame relay SVCs will become common anytime soon, if at all. The fact remains that since PVCs are so inexpensive in most cases, and memory priced so reasonably, there is little need for SVCs in the foreseeable future, at least not merely because a lot of PVCs are required.
The security risk is real enough, also. The use of PVCs does not pose the possible risks that switched services entail (as in many businesses with dial-in network ports). Of course, frame relay is positioned as a public substitute for a private leased-line network. Frame relay has been so successfully marketed as a private line replacement that it might be difficult if not impossible to reposition the frame relay service as a switched service also. Finally, adding support for the SVC signaling protocol to all frame relay networks will be neither simple nor inexpensive. And SVC support on one frame relay network does not ensure universal connectivity unless the NNI with SVC support to all other frame relay networks is implemented. It appears that none of these things will happen soon, and maybe not at all. But this does not mean that SVCs may not be desirable in some frame relay network configurations, especially very large ones with huge PVC needs and very intermittent (and unforeseen) site interactions. The potential customer should be clear about the frame relay service provider’s position when it comes to SVCs.
Whither Frame Relay SVCs?
At this point in the chapter it seems clear that the frame relay signaling protocol needed by equipment vendors and service providers in order to offer SVCs on a frame relay network is ready to go. Yet, with the major exception of MCI in 1997, the major frame relay service providers have not offered SVCs. Even MCI, whose HyperStream SVC frame relay service was first offered in late 1997 with CIRs from 16 kbps to 6 Mbps, had no plans to charge SVC customers on a usage basis until mid-. Prior to this the MCI SVC service was strictly on a fixed-rate basis regardless of traffic load and connection time.

An informal survey has shown that some 75 percent of all frame relay UNIs have five or fewer PVCs configured on them. About 90 percent have fewer than 20 PVCs defined to remote locations. CIR limits and traffic load have much to do with this, of course, but there are many networks that routinely connect to more than 20 remote sites, although not often directly. But that is one of the attractions of frame relay: the ability to logically mesh connect sites directly. SVCs can certainly be useful to overcome PVC limits and provide greater (and more flexible) connectivity to sites with low traffic volumes. Therefore, lack of SVC services is not due to user indifference.

It is true that all the details of the frame relay signaling protocols have yet to be worked out. But this is not a big stumbling block. There is nothing to prevent a vendor from developing a proprietary signaling standard between its own switches. And since few (if any) multivendor ATM switch networks exist, incomplete standards would become a nonissue. In the frame relay world, the network node interface is beyond the scope of the frame relay standards (NNI is the Network-Network Interface in frame relay), so vendors would have to develop or adapt their own signaling protocols in any case.
Perhaps the problem with the absent SVC offerings is a lack of signaling protocol implementations. This is certainly true of end-user devices. But in most cases the end user would hardly be expected to set up his or her own SVCs with user device-generated signaling messages. And hardware FRADs can easily be built to comply with FRF.4, if not all of Q.933. Certainly MCI had no trouble finding FRF.4 and Q.933 software and hardware for its pioneering SVC offerings. So, there must be some other reason why SVCs are only offered by the rare frame relay service provider, and only then with limits such as flat-rate billing (then why bother with SVCs at all?). In fact, there is a very good reason why SVCs have not yet appeared in force in frame relay networks. The problem is not the lack of full standards, nor the implementation of these standards. One of the problems is the issue of routing SVCs on these networks. This will be called the “signaling protocol routing problem” in this chapter. Just what is the signaling protocol routing problem? Connection-oriented networks like frame relay (and ATM) do not independently route packets from source to destination as TCP/IP routers and the Internet do. Rather, connection-oriented networks use a signaling protocol call setup message to establish a path through a network from source to destination. It is only this call setup message that needs to be independently routed. The path, once set up, is used by all packets sent from source to destination for the duration of the “call.” The question is: What is the best way to route a call request from source to destination? This is an unanswered question in frame relay and ATM networks and defines one of the signaling protocol routing problems.
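The distinction between routing the call setup once and then reusing the path can be made concrete with a toy model. The Python sketch below is a deliberately simplified illustration and makes several assumptions that are not in the standards: the four-switch topology is invented, the setup is routed with a fewest-hops breadth-first search (the text is clear that the best routing method is an open question), and a single connection identifier is used end to end, whereas a real frame relay network uses locally significant DLCIs on each link.

```python
from collections import deque


def route_setup(topology, source, destination):
    """Route a call setup request once, using a fewest-hops search (assumed policy)."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in topology[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # no path: the call request is blocked


def install_connection(tables, path, conn_id):
    """Record the next hop for this connection in each switch along the path."""
    for here, next_hop in zip(path, path[1:]):
        tables.setdefault(here, {})[conn_id] = next_hop


def forward_frame(tables, source, destination, conn_id):
    """Follow the installed table entries; no routing decision is made per frame."""
    hops = [source]
    node = source
    while node != destination:
        node = tables[node][conn_id]
        hops.append(node)
    return hops


# Hypothetical four-switch network arranged in a ring.
TOPOLOGY = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
```

Routing a setup from A to C with `route_setup(TOPOLOGY, "A", "C")`, installing the result with `install_connection`, and then calling `forward_frame` shows every later frame following the same hops by table lookup alone. Only the setup step involves a routing decision, which is why the choice of routing method for that one message matters so much.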
Another problem is the fact that SVCs cannot realistically be charged for in the same fashion as PVCs. PVCs are available 24 hours a day, 7 days a week. So it only makes sense for service providers to bill for PVCs at a recurring, flat monthly rate. But SVCs can be established at the customer’s whim and used as long as the customer likes. It follows that some other billing methods must be used for SVCs. What this alternate billing method should be is open to some debate, as will be shown later. There are good reasons why simple call-time duration is not a good method of determining frame relay (or ATM for that matter) SVC charges. This represents another unsolved problem in frame relay networks: the billing and administration problem.

Therefore, there are two main problems which must be solved before SVCs become common on frame relay networks: the signaling protocol routing or Resource Allocation problem and the Billing problem. Both are so important that they deserve the capital letters. However, it should be pointed out that these are only terms used in this book, not industry standard terms. Others may call them by other names and some may even prefer to think of them as issues, since frame relay and ATM have no problems at all. The goal here is to promote understanding of what these issues or problems are. Until these two problems are resolved, and resolved in a standard and common fashion, there will be no widespread deployment of SVCs in either frame relay or ATM networks, especially for data SVCs.

What is the big deal about resource allocation and billing? The telephone companies have allocated voice and data resources for years. While it is true that congestion and busy trunks (fast busy) do occur, this has hardly prevented telephone signaling deployment.
Also, the telephone companies have automated the billing process with computers for more than 30 years and, in fact, along with the power utilities were the first major corporate entities to use computers in this fashion. It is also true that billing errors occur, but again, this has not stopped either the deployment of signaling protocols or the sending of (incorrect) bills. Surely there must be something fundamentally different between the public switched telephone network (PSTN) and frame relay networks if resource allocation and billing are such problems. It turns out that there is a difference. In fact, there is such an extreme difference between resource allocation and billing in the PSTN, and in frame relay networks that few even want to think about offering SVCs until these twin problems are solved. The problems will be explored one at a time. That way, each one can be better understood and used to see what the trouble with SVCs seems to be.
The Resource Allocation Problem
Resource allocation in the PSTN has already been mentioned. This section will offer more details on how resource allocation is performed on the public voice network that will help in understanding the problem with regard to frame relay networks later. A very simple telephone network is shown in Figure 5.5.
Figure 5.5 A very simple telephone network.
The network in the figure is simple enough, but it has all the elements needed to illustrate how the PSTN signaling protocol interacts with the physical network of trunks and switches to perform resource allocation. There are four central office (CO) switches with both user local loops (lines) and links between the central office switches (trunks). The figure could add a few wire centers, toll offices, tandems, and IXC POPs, but the principles are the same no matter what the configuration.
Each central office switch has only so many voice channels on the trunks between them, of course. There may be as many as 10,000 local loops with telephones on each central office (maybe even more today), but since the average business or residential phone is only in use a few hours a day (and many residential phones are in use only 60 minutes or so), it makes no sense at all to have one trunk for every local loop. Besides, even the simple voice network in the figure would need not 10,000, but 20,000 trunks: 10,000 to each central office it was attached to. After all, it would be just as likely that someone attached to (or served by) Central Office A would call someone attached to Central Office B as Central Office D. So a lot fewer trunks are needed between central offices than the lines they each serve. But how many trunks are needed? The science of traffic engineering answers this question. (Actually, traffic engineering sometimes seems like a mystic art requiring advanced enlightenment.) Traffic engineering is used in voice networks to say things like, “With 10,000 phones on Central Office A, 600 trunks are needed to Central Office B to make sure that 99.6 percent of the calls get through.” The phrase “get through” is the key. This is exactly the point. If calls are blocked, the resulting busy signals generate no revenue, even though a lot of network resources are used to switch the call. These resources include things like switch digit registers (otherwise there is no dial tone), screening software (need to dial a “1” first?), initial billing record software (toll free, collect, or bill originator?), trunks, and so on. So maybe 0.4 percent calls blocked for lack of trunks is okay, maybe not. If not, the traffic engineer can recommend raising the trunk count to 650, 700, or whatever by installing more facilities between Central Office A and Central Office B (for instance). 
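Statements of the form "N trunks give a blocking rate of at most P" are classically derived from the Erlang B formula. The text does not name the model behind its 600-trunk example or state the offered load, so the Python sketch below is only an assumption about the kind of calculation involved; the function names are hypothetical and the load figures in the usage note are invented.

```python
def erlang_b(offered_erlangs: float, trunks: int) -> float:
    """Blocking probability from the Erlang B formula, using the standard recursion."""
    blocking = 1.0  # with zero trunks every call is blocked
    for n in range(1, trunks + 1):
        blocking = (offered_erlangs * blocking) / (n + offered_erlangs * blocking)
    return blocking


def trunks_needed(offered_erlangs: float, max_blocking: float) -> int:
    """Smallest trunk count whose blocking probability does not exceed max_blocking."""
    if max_blocking <= 0:
        raise ValueError("max_blocking must be greater than zero")
    trunks = 0
    blocking = 1.0
    while blocking > max_blocking:
        trunks += 1
        blocking = (offered_erlangs * blocking) / (trunks + offered_erlangs * blocking)
    return trunks
```

As a small check, one erlang of offered traffic on two trunks gives `erlang_b(1.0, 2)` equal to 0.2. For a busy-hour load between two offices, a call such as `trunks_needed(500.0, 0.004)` returns the smallest trunk count that keeps blocking at or below the 0.4 percent mentioned above; the 500-erlang load is a made-up input, not a figure from the text.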
It really does not matter what technology is used to add trunks—-carrier, Sonet fiber, or microwave— since all of the trunks will be broken up into 64 kbps digital voice channels anyhow. But installing new trunk capacity is often an expensive and time-consuming task, whatever the media chosen. This being the case, the number of voice channels within the trunks between Central Office A and Central Office B remains very stable over time. A T3 will have 672 voice channels, two T3s will have 1344 voice channels, and so on. Each voice channel will have 64 kbps bandwidth, minimal delays, be fairly tolerant of bit errors, and so forth. In other words, the voice channels are built for the voice QoS parameters. New facilities and provisioning can change this trunk channel number, but not day-by-day or hour-by-hour. There is one major exception to this stability of trunks voice channels: outages. If a T3 between Central Office A and Central Office B is lost, then there are 672 fewer voice channels right away (and 672 more right away when the T3 is repaired). Calls may be blocked in the interim, which has an enormous and immediate impact on revenues and service. Service outages may have tariff impacts, leading, in turn, to fines and/or other financial penalties such as customer rebates. This outage effect is so important to resource allocation that network control centers have been created to deal with the effects of these outages and inages on phone calls. The point is that varying the resources available on the network causes problems, whether through addition or subtraction. This is important to remember. But why make trouble for ourselves? These trunks are all Sonet fiber rings today which provide automatic protection switching, right? There are still trunk outages, but not as many service outages. Okay, consider this aspect for a moment. Assume no trunks flip-flop in and out of service at all. 
Then the resource allocation on the network when a user makes a phone call will go as follows. Suppose a user on Central Office A (user A-) wants to make a call to another user on Central Office A (A-). This is just an SVC, of course. It is a type of virtual circuit known as an equivalent circuit in the telephony world, but it is an SVC nonetheless. The DTMF (touch-tone) signaling protocol is usually used to initiate the establishment of this SVC. In this case, resource allocation is very simple. Resource allocation affects only Central Office A resources, since there are no other Central Offices, and therefore no trunks, involved at all. The switch software in Central Office A looks around and says: “Hey! I can give A-dialtone! I can ring A-! I can supervise the call until someone hangs up! No problem!” (Signaling protocols always go “Hey!” They are really quite rude.) And it is not really any problem at all. This is because Central Office A is dealing only with a local knowledge of the network, not global knowledge of resource states elsewhere. Even if Central Office A cannot give dialtone, or ring A-, or whatever, the decision-making process is still the same and just as easy.
Now consider what must take place when a user of Central Office A (user A-, just to be different) wants to make a call (establish an SVC) to another user serviced by Central Office B. Now Central Office A’s resource allocation job is much harder. Why? Because Central Office A must now allocate resources based on a global knowledge of network resources, not only a knowledge of local resources as before. This is what makes the whole SVC routing process difficult. Consider the sample network with the following added trunk availability information, as shown in Figure 5.6.
Figure 5.6 The example network with trunk availability information.
The resource allocation process at Central Office A could now go something like: “Well, there are no trunks available to Central Office B, so I’ll give them the fast busy and they’ll try again later.” The assumption that the users would try again later used to be a good one. What else were the users going to do? But maybe the “they’ll call later” assumption is not such a good one today. There are other things that a user can do besides make a phone call with the incumbent carrier. People can use their cell phones instead or make a call over the cable TV network. They can send e-mail. Maybe the problem will be the same, but maybe not. Whether or not the call is a flat-rate call, no service provider is happy to deny service, for revenue or tariff reasons, or both. But service providers need not fret. There is a better way. The resource allocation software in Central Office A could have a table that says: “If you can’t route a call through the A-B trunks, give it to the A-D trunks.” Central Office A will surely know that there are plenty of A-D trunks available, since it can “see” one end of the trunk group directly. The switch at Central Office D will have a table to route the call to Central Office C, and Central Office C will pass the call to Central Office B, which will complete the call. No busy trunks along the way. No lost revenue. Not a bad plan. And all that had to be done was to build a routing table with a topology database in each central office so that each switch had knowledge of other paths over which to route the call. But this does not entirely solve the problem. More smarts need to be built into each central office switch. Here is why. Consider the trunk availability situation shown in Figure 5.7. Now when someone on Central Office A (user A-) dials the number of someone on Central Office C (C1), the task of resource allocation becomes very complex and difficult indeed. 
The Central Office A switch will attempt to set up the call through Central Office D or Central Office B. But it is easy to see that a call routed (or switched) through Central Office D will not and cannot be completed. The correct way for Central Office A to route the call is through Central Office B. The challenge for the implementation of the signaling protocol with regard to resource allocation is as follows: How is Central Office A to know the proper way to route the call globally?
Figure 5.7 A more realistic resource allocation scenario.
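What global knowledge buys the originating switch can be illustrated with a short sketch. The trunk counts below are hypothetical stand-ins for the figure, chosen only so that the C-D trunk group is full. The function names and the breadth-first search are illustrative assumptions, not a description of any real switch software.

```python
from collections import deque

# Hypothetical free-channel counts per trunk group (not the figure's actual values).
FREE_TRUNKS = {
    ("A", "B"): 12,
    ("B", "C"): 7,
    ("C", "D"): 0,   # no free channels: Central Office D is a dead end for calls to C
    ("A", "D"): 30,
}


def free_trunks(x, y, table):
    """Free channels on the trunk group between x and y, in either direction."""
    return table.get((x, y), table.get((y, x), 0))


def find_route(source, destination, table):
    """Breadth-first search that only crosses trunk groups with a free channel."""
    switches = sorted({office for pair in table for office in pair})
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == destination:
            return path
        for nxt in switches:
            if nxt not in visited and free_trunks(here, nxt, table) > 0:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no usable path: the call is blocked


if __name__ == "__main__":
    print(find_route("A", "C", FREE_TRUNKS))
```

With only local knowledge, Central Office A sees free trunks toward both B and D and cannot tell them apart. The search succeeds only because the whole table is visible in one place, which is exactly the knowledge a real switch has to obtain somehow.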
Central Office A must know about the global conditions of the trunks on the network. For instance, it makes no sense for a central office switch in New York to route a call to San Francisco through Kansas City if all the trunks to Phoenix are busy. The central office switch should route the call through Chicago, where plenty of trunks to Seattle are available. This situation comes up all the time in the voice world. As it turns out, there are several ways to deal with this resource allocation problem in the voice world. The best way would be for the call setup packet that is used to route the call to just reroute itself as it threads its way through the network. That is, the call setup packet, even if sent to Central Office D, would look at the trunk situation and say: “Hey! Wait a minute! I can’t go anywhere from here. Better go back to Central Office A and start over.” But in the real world, this would not work. What if there are three sets of trunks out of Central Office D that the call setup packet could try, not just one? How long would it take to try all possible routes? As it turns out, this method takes much too long to set up calls within the international guidelines used by the PSTN. Of course, there are other ways. A database could be set up in each central office which is updated periodically. The database could be used exactly like the routing table in a router to determine the proper path a call should take. This is a very robust and efficient approach. It is basically what Signaling System No. 7 (SS7) does today, with Service Switching Points (SSP) as clients, Service Control Points (SCPs) as the database servers, and Signaling Transfer Points (STPs) as the routers. Perhaps the central offices could be arranged in a hierarchy, with more and more trunks at higher levels. The central offices in the figures could be called level 5 central offices. 
If a call gets to Central Office D and there is no trunk available to Central Office C, the call could be routed to a higher-level switch at level 4 of the hierarchy. Higher levels would be developed and used as needed. This was essentially the structure used in the Bell System prior to Divestiture in 1984, and it was very successful. However, it required that all of the trunks and level switches be controlled by the same organization, and completely out of the user’s control. Of course, this was true of the AT&T Long Lines network, but this was no longer possible after 1984 with Equal Access and Carrier Selection. In fact, resource allocation can be handled quite well in the PSTN with a combination of these approaches. Since the voice network is engineered for peak loads (Mother’s Day, Thanksgiving, and New Year’s are always neck-and-neck), most of the time there are plenty of trunks to go around. Resource allocation decisions can be made locally without too much trouble. But when congestion occurs (or, ideally, right before it occurs), a network management center could see the trend and distribute traffic more efficiently. This could be as simple as adjusting a few routing table entries and parameters in the central office switch to say: “Hey! Send more stuff to Central Office B and less to Central Office D.” This would also require both a Network Management Center (NMC) and communications with the central offices. But most telcos have NMCs and central office links for other reasons already. And they could put a big traffic map on the wall to impress tour groups. In spite of this slightly tongue-in-cheek approach to routing alternatives in the PSTN, all of the methods suggested are instructive. This is how routing works in some real-world portions of the global PSTN. All real-world routing algorithms in the global PSTN use a concept known as trunk reservation (TR). Each link between switches has a TR value. 
If a direct route is not available to route a call, a TR-permissible alternative is sought, based on TR values, which constantly fluctuate with traffic conditions. If no current TR values are suitable, the call is blocked. The PSTN routing algorithms in use today differ in their way of choosing from the set of TR-permissible alternative routes. But this is the only way they differ. For example, from the 1980s until the early 1990s, AT&T used a routing algorithm called Dynamic Non-Hierarchical Routing (DNHR), which replaced the older hierarchical Long Lines level switch structure previously described. In DNHR, the day is divided into 10 time periods. The TR parameters vary from time period to time period, based on updates from a central location (the NMC) which reflect weightings due to current traffic load on the network, traffic forecasting rules, and manual intervention.
All DNHR switches use special signaling messages to propagate TR parameters among the switches, which number in the hundreds. Since DNHR switches tend to be highly mesh-connected in terms of trunking, DNHR switches will pare down the full alternate route set to a more manageable subset. There is no rethreading of calls, but a special crankback message is used when a call is blocked at another switch to prevent this from happening repeatedly. In Great Britain, British Telecom (BT) uses a routing algorithm known as Dynamic Alternate Routing (DAR) that depends more on actual current traffic loads than forecasting and minimizes the number of messages sent between the switches. DAR picks one alternate route all the time and uses it until a call is blocked on it. DAR then selects another route at random and the process repeats. In the early 1990s, AT&T implemented a Real-Time Network Routing (RTNR) algorithm. RTNR increases the number of messages exchanged between NMC and switches, and also among switches themselves, but is much better for completing calls than either DNHR or DAR. Both DNHR and DAR tend to pick the same alternative route over and over. RTNR, which included another routing algorithm called Least Load Routing (LLR), distributes traffic more evenly. Two new routing algorithms are claimed to be even better than RTNR. Bell Northern Research has developed Dynamically Controlled Routing (DCR) for the Trans-Canada Network. In DCR, a central computer tracks link status and gathers update messages every 15 seconds. Bellcore has developed State-Dependent Routing (SDR) which assigns TR values based on cost. Costs are determined from information gathered every five minutes and calculated by a large nonlinear program running on another processor. Because of this time lag, real-time rerouting operation is not possible. This is the whole point of this section about PSTN routing algorithms. Where is the DNHR, DAR, RTNR, DCR, or SDR for frame relay (or ATM) networks? 
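The DAR behavior described above (stay on one alternate route until a call is blocked on it, then pick another at random) is simple enough to sketch. Everything here is illustrative. The class names are hypothetical, and the rule that an alternate route is TR-permissible only when each of its links has more free circuits than its TR value is an assumption about how trunk reservation is applied, since the text does not spell out the test.

```python
import random


class Link:
    """One trunk group with a trunk reservation (TR) value."""

    def __init__(self, capacity, reservation):
        self.capacity = capacity
        self.reservation = reservation  # circuits held back from alternate-routed calls (assumed meaning)
        self.in_use = 0

    def free(self):
        return self.capacity - self.in_use


class DarRouter:
    """Sticky random alternate routing for a single origin-destination pair."""

    def __init__(self, direct, alternates, rng=None):
        self.direct = direct          # Link used by the direct route
        self.alternates = alternates  # tandem name -> (first Link, second Link)
        self.rng = rng or random.Random()
        self.current = self.rng.choice(sorted(alternates))

    def place_call(self):
        """Return the route used ('direct' or a tandem name), or None if blocked."""
        if self.direct.free() > 0:
            self.direct.in_use += 1
            return "direct"
        first, second = self.alternates[self.current]
        if first.free() > first.reservation and second.free() > second.reservation:
            first.in_use += 1
            second.in_use += 1
            return self.current
        # Blocked on the current alternate: choose a different one at random for later calls.
        others = [name for name in sorted(self.alternates) if name != self.current]
        if others:
            self.current = self.rng.choice(others)
        return None
```

Note that the router exchanges no messages with other switches. It learns only from its own blocked calls, which matches the description of DAR as minimizing inter-switch messaging.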
These routing algorithms will not just port over into the frame relay and ATM worlds. This is because resource allocation, even in the small voice network example with only four switches, is enormously complex. What makes it possible at all is the fact that the resources being allocated are (in the vast majority of cases) fixed increments of 64 kbps. Imagine how much more complicated the task would be if the trunks were not channelized into 64 kbps circuits. In fact, this is exactly the case when we replace the voice central office switches with frame relay or ATM switches. There are no more channels on the trunks, Sonet or otherwise, connecting the switches. There are just virtual circuits representing channels as a stream of frames or cells. But not all connections on a frame relay network work best by emulating circuits designed to deliver Constant Bit Rate (CBR) services. Frame relay is designed for a whole array of services, especially data services, that are Variable Bit Rate (VBR) services. Most are extremely bursty data applications. How do bursty VBR data connections make resource allocation so different on a frame relay or ATM network? Here’s how. In the Q.931 signaling protocol, used with narrowband ISDN, the call setup message is only required to specify how many 64 kbps units of bandwidth the connection needs. This is one of the main bearer capability parameters. The digital switches only need to compute the effect of this parameter on the TR number to successfully route the call setup message. In the Q.933 signaling protocol used in frame relay networks, users are allowed to specify much more than bandwidth bearer capability, and indeed they must if they are to take advantage of frame relay’s VBR dynamic bandwidth allocation (also misleadingly known as bandwidth on-demand) capabilities. 
Here are the fields that must be specified in a Q.933 call setup message in order for the frame relay network to provide efficient VBR services:
Maximum frame-mode information field size
Throughput
Minimum acceptable throughput
Committed burst size
Excess burst size
Transmit window value
Retransmission timer value
What has all this to do with resource allocation? What was once a quick look at a small number of fields in the ISDN Q.931 call setup process is now a long and involved process of examination and analysis with Q.933. This must be done to determine the effect of the VBR connections on the current available trunk bandwidth. It is fine to say that frame relay network connections allow for dynamic bandwidth allocation, but the network only has a fixed amount of bandwidth to play around with. The problem is that the VBR flow of frames or cells may vary drastically over short periods of time. The question is no longer one of how much bandwidth per unit time a connection will consume, as in channelized, CBR circuit connections. Some VBR connections may tolerate more delay in exchange for more capacity (just buffer these frames). Some VBR connections send fewer frames but these information units must be delivered quickly within a bounded delay (and so cannot be buffered for long, if at all). Since ATM from the start, and frame relay more recently, have had standard services defined for mixing CBR (uncompressed voice/video) and VBR connections on the same unchannelized trunking system, these connections may have vastly different resource requirements. The challenge of resource allocation in frame relay networks is to determine, based on the call setup message field parameter values, the total drain on network resources in terms of buffers and bandwidth that the VBR connection will consume. Only then can the call be routed properly, whatever the routing algorithm used. This must be done in an acceptable amount of time, of course. The holdup on SVC implementation in frame relay and ATM networks is directly due to this resource allocation problem. There is currently no accepted or efficient way to determine the equivalent capacity of a frame relay or ATM VBR connection in terms of fixed time division multiplexed trunks. 
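A short sketch shows why these fields make the switch's job harder than counting 64 kbps units. The class and field names are hypothetical, and the code does not parse real Q.933 messages. Two conventions are assumed rather than taken from the text: the throughput field is treated as the committed rate, and the measurement interval is taken as committed burst size divided by throughput, as is usual in frame relay descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupParameters:
    """Hypothetical container for the call setup fields listed in the text."""
    max_info_field_octets: int
    throughput_bps: int                  # assumed to be the committed rate; must be positive
    min_acceptable_throughput_bps: int
    committed_burst_bits: int
    excess_burst_bits: int
    transmit_window: int
    retransmission_timer_ms: int

    def measurement_interval_s(self) -> float:
        # Assumed convention: interval = committed burst / throughput.
        return self.committed_burst_bits / self.throughput_bps

    def peak_rate_bps(self) -> float:
        # Most the user could send in one interval, committed plus excess burst.
        return (self.committed_burst_bits + self.excess_burst_bits) / self.measurement_interval_s()


def demand_bounds(requests):
    """Return (sum of committed rates, sum of peak rates) for a set of connections."""
    committed = sum(r.throughput_bps for r in requests)
    peak = sum(r.peak_rate_bps() for r in requests)
    return committed, peak


if __name__ == "__main__":
    request = SetupParameters(1600, 64_000, 32_000, 64_000, 64_000, 7, 1500)
    print(demand_bounds([request] * 10))
```

The true load of a bursty connection lies somewhere between the two bounds and moves from moment to moment. Reserving the peak wastes the trunk, and reserving only the committed rate risks congestion, which is why no single equivalent-bandwidth number falls out of the setup message.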
If the connection could be expressed as equivalent bandwidth on a fixed bandwidth trunk network, the existing routing TR mechanisms could be used. And this must be done quickly enough, based on global network capacity, current load, and congestion potential, to satisfy all types of service connections, including voice and video-on-demand, among others. PVCs make this task a little easier, but not trivial. Resource allocation for frame relay network PVCs is done at service provision time, which usually means between the contract signing and next Monday. In the interval, network engineers feverishly try to figure out the load each PVC will add to the network. But it is easier for humans, especially those who have designed the network in the first place, to obtain and use the global knowledge needed to make PVC routing decisions. More facilities may eventually have to be added, but this should be predictable in a PVC world. Consider the following simple frame relay network in Figure 5.8. Notice that this network is as simple in structure as the previous voice network. But now the central office switches are frame relay (or ATM) switches. The figure could also show the current state of network resources, not in terms of trunk channels available, but in terms of frame capacities, queue depths, buffer maximums, service times, and so on. The question for SVCs is: Can the SVC request be granted based on the current global state of the network resources and, if so, how should it be routed?
Figure 5.8 Frame relay and resource allocation. Right now, this is an unsolved problem in frame relay for general cases. Notice that the problem is totally independent of the presence or absence of a standard signaling protocol.
The Billing Problem
Suppose for the sake of argument that the SVC resource allocation problem has been solved to everyone’s satisfaction. Further suppose that these calls take no longer to set up than regular PSTN voice calls. After all, if software can be written to beat chess grand masters, surely the resource allocation problem is not unsolvable. Indeed it is not. But SVCs may still not be right around the corner for all frame relay networks. This section will discuss the reasons why. The nice thing about PVCs is that they are available for customers to use all the time. Therefore, a service provider can bill the customer for each PVC at a fixed monthly rate and not fear an angry customer or face a lawsuit. But SVCs are very different. SVCs are not available for constant use, by definition. SVCs should be billed by some criteria other than fixed monthly rates. In the voice network, two criteria are commonly used: time and distance. Users pay more based on how long they talk (talk longer, pay more) and how far apart the endpoints are (generally: boundaries are arbitrary). Perhaps these criteria can also be used for SVCs in frame relay (and even ATM) networks. Consider time first. In a voice network, it is not a bad assumption to make that if one end is not sending information (talking), the other is. So for the total duration of the call, one end or another is basically active at all points in time. So duration is a valid and fair way to bill. But frame relay connections carry more than voice. Bursty data are the rule rather than the exception. Long periods of time may pass before either end of the connection is active. Is it fair to bill the customer for nothing? After all, the voice network does. If people forget to hang up, the tab just runs higher and higher. Customers may not be too happy about this, but it only happens rarely. So what’s the big deal about frame relay SVCs being billed based on time? Who cares if the customers don’t like it. Too bad. 
Take it or leave it. But it isn’t that simple. This is mainly because of the call setup/holding time ratio, an important consideration for all switched services. There should be a shorter term to describe it, or even another dreaded acronym (CS/HTR?). But it seems that all are stuck with call setup/holding time ratio. In a PSTN, processing and routing a call setup message takes a lot of effort, as shown in the previous section. But no revenue is earned by the service provider until the call is completed, that is, when the called party picks up the phone and says “Hello?” All blocked or abandoned calls (abandoned because they took too long to route and the caller thought something was wrong) consume resources but generate no income at all. Disconnects also consume resources, but not nearly as many as connections and will be ignored in the following discussion. The cost of routing a call setup request must be balanced by the revenue generated during the duration of the call. Flat-rate local service is an apparent exception, but these calls tend to be easy to set up and long in duration (so there are fewer of them per unit time), so they are still profitable. For example, if it takes 10 seconds to route a call (ring the destination) and the average holding time (conversation) is 10 minutes, this gives a call setup/holding time ratio of 10:600 or 1:60, which is a good number. The cost to the customer must offset the cost of the call setup time as well as the cost of the call itself to the service provider. But what if the holding time shrunk to 5 seconds instead of 10 minutes? The call setup/holding time ratio would then be 10:5 or 2:1. This is not a good ratio whether calls are charged by flat rate or duration. In both cases, the revenue generated by holding time might not be adequate to offset the call setup costs and leave any profit at all.
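The two ratios worked out above can be reproduced with a few lines of arithmetic. The function names are illustrative only.

```python
from fractions import Fraction


def setup_holding_ratio(setup_seconds: int, holding_seconds: int) -> Fraction:
    """Call setup time divided by holding time, reduced to lowest terms."""
    return Fraction(setup_seconds, holding_seconds)


def as_ratio_text(ratio: Fraction) -> str:
    """Format a ratio in the a:b style used in the text."""
    return f"{ratio.numerator}:{ratio.denominator}"


if __name__ == "__main__":
    # 10 seconds of setup against a 10-minute conversation.
    print(as_ratio_text(setup_holding_ratio(10, 10 * 60)))
    # The same setup effort against a 5-second holding time.
    print(as_ratio_text(setup_holding_ratio(10, 5)))
```

A ratio well below 1 means the revenue-earning part of the call dominates. Once the ratio climbs past 1, the network spends more time setting the call up than carrying it.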
What’s the point? Consider an SVC on a frame relay or ATM network with the following pattern of activity as shown in Figure 5.9. Notice that the bursts of traffic at each end of the call are separated by a long period of idleness. No work is done by the network on behalf of the users for the entire 10 minutes of inactivity on the SVC. There are no idle patterns in frame relay. But the customer must pay for the SVC for the entire duration of the call.
Figure 5.9 Bursty traffic and SVCs. How long will it take before the users do what is shown in Figure 5.10? If the cost of the two calls is low enough, users will do it. And the lower the cost, the more tolerable the second call setup delay becomes. But the call setup/holding time ratio may no longer be adequate to cover the costs and the network is doing a lot more work than it did before. Since the users can establish SVCs with almost as much bandwidth as they like, the users will compensate by increasing the bandwidth on the SVC if the call setup delay is too high.
Figure 5.10 Short holding time SVCs for bursty traffic.
So maybe duration is not the best method to bill for frame relay or ATM SVCs. What about distance? Without duration as an adjunct to distance, this makes little sense by itself. The distance between endpoints must be determined as each SVC is established in order to bill by miles alone. But would a 10-Mbps, 2-hour, 10-mile SVC cost more than a 10-Mbps, 10-hour, 2-mile SVC? If not, why not? Without duration or something else as an adjunct second parameter, this makes little sense. Users may respond by establishing their own short-hop relay points to offset the costs of long connections, defeating the whole public network purpose. Also, with few exceptions, existing frame relay network PVC prices are distance-insensitive. Should there be any compelling reason why the SVCs on these networks would not be distance-insensitive also? What user would migrate from PVCs to such an SVC system? This price structure is likely to remain in place even after SVCs become common. But what other parameter could be used as a fair and yet profitable criterion for SVC billing? How about traffic load? What could be fairer? Send more frame relay frames, pay more. A potential problem with frame relay is the presence of the discard eligibility (DE) bit. If a frame is counted for billing purposes at the User-Network Interface (UNI) on the sending side, but is tagged as DE, the frame may never make it to the destination. The frame relay network may discard the DE frame under certain conditions like congestion. Also, a discarded DE frame will probably have to be resent, adding to the sender’s billing cost. But if the frame is counted at the destination UNI, billing information must be gathered for each connection terminating there. This information must then be correlated with the proper ingress UNI in order to send the proper sender the bill. This is not an impossible task, but it is certainly complicated. 
In fact, most SVCs on a frame relay network that are billed by frame counts will probably be billed at the sending UNI, and that is that. So what is the answer to the SVC billing problem? There is no generally accepted answer, most prominently in the case of public frame relay (and ATM) networks of arbitrary size.
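The gap between counting frames at the sending UNI and counting them at the destination UNI can be made concrete with a toy model. The names are hypothetical, and the model makes a deliberately crude assumption: under congestion the network discards every DE-tagged frame. Real networks discard selectively.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    dlci: int
    octets: int
    discard_eligible: bool = False


def billable_counts(frames, congested):
    """Return (frames counted at the sending UNI, frames counted at the destination UNI)."""
    ingress = len(frames)
    # Simplifying assumption: congestion drops every DE frame and nothing else.
    delivered = [f for f in frames if not (congested and f.discard_eligible)]
    return ingress, len(delivered)


if __name__ == "__main__":
    burst = [Frame(100, 512), Frame(100, 512, True), Frame(100, 256, True)]
    print(billable_counts(burst, congested=False))
    print(billable_counts(burst, congested=True))
```

When the two counts differ, a sender billed at the ingress UNI pays for frames that never arrived, and pays again for any retransmissions.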
Conclusion
Until the resource allocation and billing problems are solved for the general case in both large and small frame relay networks, SVCs will not be implemented widely, in spite of service provider claims and “me too” announcements. SVCs will appear in some situations, most notably single-switch environments or in cases where the service offered is still in fixed bandwidth increments (such as voice, 10 Mbps Ethernet, and so on). Routing and billing for frame relay networks will remain a topic of intense research for the next few years. The fact that frame relay networks still employ frames instead of numerous small cells will make usage-based billing easier for frame relay service providers. This might be the place to simply list all the issues outlined in this chapter that make frame relay SVC offerings difficult to implement and use:
Security on the inbound connections
DLCI and CIR limits on the UNI
Call setup/holding time ratios for bursty traffic
Network-to-network SVCs
Signaling messages with no priority and subject to DE rules
Billing system issues
Competition from inexpensive PVCs
Chapter 6: Congestion Control
Overview
The issue of congestion control in networks is not limited to frame relay networks, of course. All networks from private line networks to public X.25 networks to brand new ATM networks must deal with the problem of congestion. Typically, the standards and related documents that define the network service itself will also outline or even detail the mechanisms to be used for dealing with congestion. This chapter will describe the mechanisms established in frame relay to handle congestion. Congestion means that there is too much traffic in the network for the network to deal with effectively. Some texts are fond of dividing periods of congestion into those of mild congestion and severe congestion. This is a little like dividing drinkers into those that appear mildly drunk and those that appear to be severely drunk. There could be debates about whether lampshade dancing or falling off the barstool belong in the mild or severe category, but the important thing is that neither a mild drunk nor a severe drunk should ever get behind the wheel of a car. The trouble with the mild and severe congestion approach is that someone might perceive mild congestion as a less drastic condition of the network or even somehow okay. In truth, all congestion is harmful to network and users alike and should be avoided at all costs. And in fact it is much easier to avoid congestion than it is to alleviate congestion once it has occurred, even mild congestion. When a packet-switching network is congested, it slows down. This means that it takes longer for traffic to find its way through the network nodes. Since packets are generally presented one after another to the network, this slowing down is seen by the users as a reduction in effective bandwidth and a lengthening of the delay through the network. 
The change in the characteristic throughput of the network might be gradual or abrupt, but more than this is needed to distinguish mild from severe congestion. Normally, the relationship between the traffic load offered to a packet network and the network throughput is what is known as linear. This means that a doubling of offered load results in a doubling of throughput between senders and receivers. There is more traffic in the network at the doubled load, but if the network is designed correctly, more traffic is not necessarily a bad thing. Triple the offered load or input, triple the throughput, and so on. But what if the offered load continues to increase to its maximum possible value? This should not happen often, if at all, in a packet network designed for bursty applications. But what if, for one reason or another, all senders are bursting all the time? Obviously, no one expected this when the packet network was designed. If they did expect this all-the-bandwidth-all-the-time situation, the result would be the same as with circuit-switching. There would be no sense in using packet networks to recreate circuit networks. At some point, under heavy loads, the linear relationship between offered load and throughput breaks down. The relationship becomes nonlinear. A doubling of offered load at this level of traffic activity does not result in a doubling of throughput. Usually, the increase in throughput will now be much less than a doubling. In fact, under extreme conditions, doubling the offered traffic load may actually decrease the throughput from what it was before the traffic increase occurred! Some texts refer to the point where the load-to-throughput relationship goes nonlinear as the onset of mild congestion and refer to the point where the load-to-throughput relationship goes negative as the onset of severe congestion. This is shown in Figure 6.1. 
To be mathematically correct, the figure should technically have curves instead of straight line segments in the nonlinear sections of the figure.
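The three regions described above (linear, nonlinear, and negative) can be sketched as a piecewise function made of straight line segments. Every number in it is hypothetical: the onset points and slopes are chosen only to give the shape described and are not read from any figure or measurement.

```python
def throughput(offered_load: float,
               mild_onset: float = 0.6,
               severe_onset: float = 0.9,
               nonlinear_slope: float = 0.3,
               collapse_slope: float = -1.5) -> float:
    """Toy load-to-throughput relationship; all parameters are illustrative assumptions."""
    if offered_load <= mild_onset:
        # Linear region: throughput tracks offered load one for one.
        return offered_load
    if offered_load <= severe_onset:
        # Mild congestion: more load still adds throughput, but far less than before.
        return mild_onset + nonlinear_slope * (offered_load - mild_onset)
    # Severe congestion: additional load reduces throughput.
    peak = mild_onset + nonlinear_slope * (severe_onset - mild_onset)
    return max(0.0, peak + collapse_slope * (offered_load - severe_onset))


if __name__ == "__main__":
    for load in (0.3, 0.6, 0.8, 0.9, 1.0):
        print(load, round(throughput(load), 3))
```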
Figure 6.1 The relationship between network load, throughput, and congestion. Again, the approach taken in this chapter is that mild congestion in a network is too much to allow the network to function as designed. It cannot be stressed enough that even regions of mild congestion are to be avoided if at all possible on modern packet-switched networks. Flirting with mild congestion at today’s higher network speeds is asking for a network disaster to strike. Congestion control is related to a network concept known as flow control. The two are really distinct phenomena, but it is common for flow-control problems to cause congestion and customary for a network to try to address problems of congestion control with flow-control remedies, as will become apparent. Flow control is a local property of a packet network. The principle of flow control means that no sender should ever be able to send faster than a receiver can receive. This only makes sense. Why send packets or frames at 1.5 Mbps if a receiver can only handle 128 kbps? If a receiver is being overwhelmed, flow-control mechanisms provide the receiver with a way of telling the sender to “slow down!” until more packets or frames can be digested. But it takes time for receivers to issue slow down messages and for senders to react to them. In the meantime, bits keep flowing into the network. The extra bits either build up in the network or must be discarded by the network, which usually means that the sender must resend them. Flow control never became much of a network issue until packet-switched networks like X.25 and the Internet came along. This is because when two end devices are connected with a leased private line, they “see” each other directly across the network. The end devices are always connected at the exact same speeds: 64 kbps to 64 kbps or 1.5 Mbps to 1.5 Mbps. The end devices could send all day without the network caring, since the bit pipe always matched at both ends. 
Flow control was still a user issue in this environment. If the link were between a print server and a remote printer, for example, the link could be operational when the printer was out of paper. The printer needed a way to tell the print server “slow down to nothing!” until there was paper in the printer again. There were mechanisms to do this, of course, but none of this concerned the circuit-switched network in the least. Private lines cost the same whether they deliver data bits or idle patterns.
Packet-switched networks are different. The user devices do not “see” each other directly across the network cloud. The user devices see only the cloud at the end of the UNI. But one UNI could run at 64 kbps and another could run at 256 kbps. If a sender is sending across the UNI at 256 kbps for an extended period (packet-wise) to a destination serviced by a 64 kbps UNI, the extra bits can easily build up in the network. And it’s not really the sender’s fault. The UNI across the network remains essentially invisible to the sending device, which can only see the cloud at the end of the UNI.
The X.25 packet-switching standard allowed the network to absorb some extra bits from a sender and parcel them out to a receiver as best it could. The bits simply stayed in an X.25 switch buffer until they could be sent to the destination. But this type of network flow control usually occurred at 4800 bits per second on one end and 2400 bits per second on the other. Since packet applications are inherently bursty, this approach worked, and still works, in X.25 packet-switched networks. The only trick is to make sure that the number of packets a sender can send without hearing from the receiver does not exceed the buffer capacity of the network and the time it takes to react to the slow down message.
But at fast packet speeds, the buffering approach to flow control becomes almost impossible. Buffering at 1.5 Mbps or 45 Mbps is a problem. The time it takes for a destination to tell an originator to “slow down” and for the originator to actually do so may result in plenty of lost information due to buffer overflows. In the interest of network efficiency, frame relay and ATM discard extra traffic above the level that the connection, PVC or SVC, was established to handle. In frame relay, the acceptable traffic level is set by the committed information rate (CIR) on the connection. This approach turns the issue of flow control back into a user issue, as it was in circuit switching, instead of a network issue, as it was in older packet-switching networks.
Note that flow control concerns this sender and that receiver only. Flow control is a strictly local phenomenon. All of the other users on a network can be experiencing adequate service while this pair of end devices is hopelessly bogged down. But congestion affects all users, regardless of who or what is causing the congestion. Congestion is a more global, but not necessarily universal, phenomenon.
Some of the related concepts of flow control and congestion are illustrated in Figure 6.2. Flow control is a local property of a network, while congestion is a more global property of a network. Congestion can exist even when no sender is sending faster than its receiver can receive; there is simply too much traffic in the network. This is why most networks handle congestion by using flow control. Flow control makes senders slow down. In the case of flow control used for congestion control, the receiver is not the actual traffic destination; the network itself is the receiver of the sender’s traffic. It is true, however, that congestion might be restricted to a single network node or group of nodes.
In that case, the relief method must inform the senders that are contributing to the congestion to slow down while allowing other senders that are not contributing to the congestion to continue functioning as before. Frame relay employs such a relief method that targets only the specific senders that contribute to the congested node or nodes (it is hoped that congestion in a frame relay node is alleviated before the congestion spreads to other nodes).
Figure 6.2 Flow control and congestion control in frame relay. The only other way to handle congestion is to speed up the output. Since most networks output at the maximum value at all times anyway (there is little incentive not to), the only real way to speed up output is to discard traffic. Of course, receivers detecting missing traffic that they need will respond by asking the senders to resend all of the missing traffic, and usually much more traffic besides, even though the traffic was actually delivered intact. Fragments of IP packets inside frame relay frames, for example, cannot be resent individually; all the fragments must be resent. If only one fragment out of 10 was discarded due to network congestion, the net result will be a load of 20 packets on the network instead of only 10, even though only one was discarded. This is one of the main reasons that congestion is better to avoid than attempt to alleviate. It should be noted that user-to-user flow control mechanisms must continue to function regardless of the flow control mechanisms used by the network. Printers do still run out of paper.
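The arithmetic behind the fragment example above can be sketched in a few lines of Python. This is only an illustration, not a model of any real IP implementation: the helper name `total_load` is hypothetical, and it assumes that the single retransmission of the whole set succeeds.

```python
def total_load(fragments: int, discarded: int) -> int:
    """Count fragments carried by the network when any loss forces a full resend.

    Assumption (illustrative only): the resend itself arrives intact,
    so the whole set crosses the network at most twice.
    """
    if not 0 <= discarded <= fragments:
        raise ValueError("discarded must be between 0 and fragments")
    first_pass = fragments
    # One missing fragment is enough to make the sender repeat all of them.
    resend = fragments if discarded > 0 else 0
    return first_pass + resend


# The example from the text: 10 fragments, 1 discarded by the network.
print(total_load(10, 1))
```

Discarding one fragment to relieve congestion doubles the offered load in this sketch, which is the point the text makes about avoiding congestion rather than trying to cure it.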
Flow Control Mechanisms
The most common form of flow control in use today is the windowing flow control method. Most network protocols are called windowing protocols for this reason. The members of this group include such popular network protocols as TCP/IP, SNA (although IBM calls this flow control mechanism pacing), and even frame relay in some cases. Frame relay is an oddity on this list because for data transfer, frame relay networks never bother with flow control within the frame relay network. The protocols employed at the endpoints of the frame relay network, such as TCP/IP or SNA, must handle this crucial function.
Windowing is the process where a receiver grants permission to the sender to transmit a given number of data units (typically frames or packets) without waiting for the receiver to respond. The sender also establishes a send window tuned to the size of the receiver window. If the receiver’s window size is 4, just to give a simple and not terribly precise example, the sender can send no more than 4 frames or packets or whatever to the receiver across the network without having to stop.
The efficiency of this process is clear, especially when compared to older stop-and-wait flow control protocols that forced a sender to stop and wait for an individual acknowledgment from the receiver for each and every data unit sent across the network. This acknowledgment performed double duty. Not only did the acknowledgment inform the sender that the data unit had arrived intact and so could be safely deleted from a sender’s buffer, but the acknowledgment also informed the sender that it was okay to now send another data unit without fear of overwhelming the receiver. The stop-and-wait method provides admirable flow control: Since every data unit is individually approved by the receiver, it is next to impossible to overwhelm a receiver with data units at all.
But obviously, if the network delay is measured in hundreds of milliseconds or even whole seconds, the stop-and-wait flow control method is not very efficient at all. Stop-and-wait protocols spend a huge amount of time that could otherwise be spent sending just waiting around for acknowledgments to slowly make their way back across the network. Windowing protocols make more efficient use of network resources by allowing one acknowledgment to convey more information to the sender. When windows are just the right size, an acknowledgment should arrive just as the sender has filled the send window, allowing the process to continue and fill the pipe between sender and receiver across the network.
Windowing protocols have been around since at least X.25 and the roots of windowing go back in the noninternational standard arena beyond X.25. While usually seen as a huge improvement over stop-and-wait approaches, the X.25 window sizes were quite modest in most cases. A typical value was 2; this was still twice as efficient as stop-and-wait and X.25 was universally applauded for that. Wild and crazy X.25 networks used window sizes of 3 or even 4, but the effectiveness of larger window sizes was limited by the high error rates on these networks. This was because the error recovery mechanism in X.25 was called go-back-N. With go-back-N, if a window size of 4 was used (another simple example), and the second frame was received with errors, the third and fourth frames were ignored by the receiver, even if received without errors. The receiver in the meantime sent a message to the sender to go back and resend everything from frame two on in the window. So high bit errors on a network set a practical limit on just how big a window could and should be.
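Two of the claims above lend themselves to a quick numerical check: that a window of 2 is about twice as efficient as stop-and-wait on a long-delay path, and that go-back-N resends good frames along with the bad one. The following Python sketch is an idealized illustration only. The function names, the frame time, and the round-trip delay are assumptions chosen for the example, and acknowledgment transmission time is ignored.

```python
from typing import List


def link_utilization(window: int, frame_time: float, round_trip: float) -> float:
    """Fraction of time the sender spends transmitting.

    Idealized model (an assumption): the sender emits `window` frames,
    then idles until the acknowledgment for the first frame returns.
    """
    cycle = frame_time + round_trip
    return min(1.0, window * frame_time / cycle)


def go_back_n_resend(window: int, first_bad: int) -> List[int]:
    """Frames resent under go-back-N when frame `first_bad` arrives in error.

    Frames are numbered 1..window; everything from the bad frame onward
    is repeated, even frames that arrived intact.
    """
    if not 1 <= first_bad <= window:
        raise ValueError("first_bad must lie inside the window")
    return list(range(first_bad, window + 1))


# Hypothetical path: 10 ms to send a frame, 500 ms round-trip delay.
stop_and_wait = link_utilization(1, 0.010, 0.500)
window_of_two = link_utilization(2, 0.010, 0.500)
print(round(window_of_two / stop_and_wait, 6))

# The text's example: window of 4, second frame received with errors.
print(go_back_n_resend(4, 2))
```

In this model the gain from a larger window is linear until the pipe is full, while the cost of a single error grows with the window, which is the trade-off the text describes for error-prone X.25 links.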
Whether stop-and-wait or go-back-N, the emphasis in these flow control methods is on the receivers. This also makes sense because the whole goal of flow control is to prevent senders from overwhelming receivers. But not all flow control is under the control of the receiver. Receiver control is characteristic of all windowing protocols, but not all protocols, especially older ones, are windowing protocols. Also, informing the receiver alone of flow control issues is not always the best strategy. What if the receiver cannot or does not communicate with the sender? Examples of such interactions are not as far-fetched as they might seem. Remote weather stations dump huge amounts of information into a central site, which can easily become overwhelmed. Yet few simple weather stations are able to receive from the other site. The same is true of burglar alarms, remote sensors, and even television sets. This point will be important later in the discussion of frame relay congestion control.
Senders always want to send as fast as they possibly can. In windowing protocols, slowing down is an unnatural act that must be imposed upon a sender by a receiver. The receiver can do a few things to slow down a sender, and one or more of these methods is routinely built into all data communications protocols. First, the receiver can withhold an acknowledgment for some time period after the acknowledgment could have been sent. Since the sender cannot send beyond the window without an acknowledgment, this effectively slows the sender down. However, the acknowledgment delay cannot be set too high or else the sender will assume the acknowledgment itself has been lost and resend the whole window in a kind of “maybe you didn’t hear me” mode. Second, the receiver can shrink the window size. This permits the sender to send a smaller amount of information than before and has the desired result. Of course, once the window is set to a minimum, this mechanism no longer functions at all.
Finally, the receiver can send a message to the sender explicitly saying, “slow down.” This is quite effective. The problem is that if there is a specific message to “speed up” which is lost on the network, the application can remain sluggish. It should be noted that there are actually two levels of acknowledgments, and therefore potential flow control, in most windowing protocols. There is a window established hop-by-hop in the network, between each network node and end system, usually at the frame or packet level. There is also a window established end-to-end through the network between end systems only, usually at the packet level or above. The details are not important. What is important is that the decreased error rates in modern networks have allowed fast packet networks like frame relay to speed themselves up by relaying data units without hop-by-hop flow control or error correction within the network between network nodes (frame relay switches in the case of frame relay). But this does not mean that there is no end-to-end flow control or error correction at protocol layers above frame relay. In fact, it is because these upper layers still must perform flow control and error correction at some layer that the network itself can dispense with this function. The network can now concentrate on congestion control rather than internal flow control. This approach is shown in Figure 6.3. Note that frame relay does no internal flow control between frame relay switches. If a sending frame relay switch sends faster than another frame relay switch can receive, the only recourse available is for the receiving switch to discard traffic. The only question is which traffic should be discarded first.
Figure 6.3 Hop-by-hop and end-to-end flow control.
Many of these points about frame relay networks have been made earlier. But the points about flow control are important enough to repeat here in this chapter about congestion control. Frame relay uses flow control concepts to address congestion control issues in order to prevent haphazard discarding of user traffic at a congested frame relay switch. This only makes sense because discarding user information needed to complete an action at the receiver will only result in the end-to-end error recovery mechanisms telling the sender to repeat at least the missing data units and at most everything in the window from that point on (go-back-N).
One more point about flow control should be made. As the limitations of stop-and-wait flow control led to the development and deployment of go-back-N methods, the limitations of go-back-N flow control led to the development and deployment of selective retransmission methods. The major difference is that while go-back-N requires otherwise perfectly good data units to be resent under most circumstances of data unit loss (whether due to error or network discards), selective retransmission does not. With selective retransmission, only the missing data units in a window need to be resent. However, due mainly to the complexities and processing overhead of the selective retransmission approach, implementation has been rare outside of specialized networks such as wireless networks, where resending a good but missing data unit is quite counterproductive. The higher bit error rates and frequently exorbitant air time costs on wireless networks combine to make go-back-N very expensive and inefficient.
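The difference between the two recovery methods comes down to how many data units are repeated for a given loss. A small Python sketch makes the comparison concrete. The function name `resend_counts` and the 1-based frame numbering are assumptions made for the example, and the sketch counts a single recovery round only.

```python
def resend_counts(window: int, lost: set) -> dict:
    """Compare data units resent by go-back-N and selective retransmission.

    `lost` holds the 1-based positions within the window that were lost
    (through bit errors or network discards).
    """
    if any(not 1 <= n <= window for n in lost):
        raise ValueError("lost positions must lie inside the window")
    if not lost:
        return {"go_back_n": 0, "selective": 0}
    first_lost = min(lost)
    return {
        # Go-back-N repeats everything from the first loss to the window's end.
        "go_back_n": window - first_lost + 1,
        # Selective retransmission repeats only what is actually missing.
        "selective": len(lost),
    }


# Window of 4 with only the second data unit lost.
print(resend_counts(4, {2}))
```

The gap widens as windows grow and as losses fall early in the window, which is consistent with the text's remark that go-back-N is costly on error-prone, expensive wireless links.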
Flow Control in Frame Relay
The first and foremost flow control mechanism in frame relay is the CIR. It has already been noted that frame relay networks do not have hop-by-hop flow control between the frame relay switches. Yet the CIR is a flow control mechanism of sorts. There is no contradiction here because the CIR is enforced in frame relay at only one point in the entire network (at the switch port side of the sending UNI) and nowhere else. The role of the CIR in frame relay has already been discussed in some detail in Chapter 4. The CIR represents the amount of usable bandwidth an end system can employ on a frame relay network without fear of discards under normal circumstances. In most cases, normal circumstances mean without congestion. Since congestion control is the main topic of the whole chapter, the role of the CIR under congested conditions on a frame relay network is deferred until later. For now, assume that the frame relay network is indeed operating under normal circumstances (i.e., without congestion).
How does the CIR perform flow control? The goal of flow control is to prevent senders from overwhelming receivers. This is one thing when all senders and receivers are linked by the same speed as on a private line network. But frame relay allows UNIs to operate at a wide range of speeds, from as low as 56 kbps (and lower if analog modem access is considered) to as high as 45 Mbps (and beyond if SONET access is considered) with numerous speeds in between. There is no requirement that the UNI speeds match at each end of a virtual circuit on frame relay.
The way that the CIR is used for flow control is shown in Figure 6.4. Here a sender (Sender A) on a 256 kbps (fractional T1 or FT1) UNI has a DLCI that leads to the receiver (Receiver B). Since each DLCI needs a CIR, the CIR is established between the sender and the frame relay switch port at the end of the UNI.
There is a local DLCI at the receiving UNI as well, but it is important to remember that there is no CIR between network and user on the remote UNI. The CIR is purely a user-to-network concept on the sending side (but it is true that any traffic from a virtual circuit on the 64 kbps UNI will have its own CIR enforced at that UNI).
Figure 6.4 The CIR as a form of flow control. What would happen if the CIR from Sender A to Receiver B was set at 128 kbps? It is easy to see from the figure that data units could easily enter the network at 128 kbps, since the sending UNI is running at 256 kbps. Of course, 128 kbps could never leave the network at this rate since the receiving UNI operates at only 64 kbps. The extra data units will accumulate until discarded at the egress UNI, unless there is some mechanism for the receiver to tell the sender to slow down. This is the role of the CIR in frame relay flow control. Note that there is no easy way for a sending frame relay PVC to know exactly what the UNI physical speed is on the other side of the network. Care in configuration is definitely required. Note further that this use of the CIR has nothing whatsoever to do with the concept of oversubscription on the UNI.
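The mismatch in Figure 6.4 can be put into numbers. The Python sketch below assumes the sender transmits at the full 128 kbps CIR without pause and that nothing slows it down. The function names and the 512-kbit egress buffer are hypothetical values chosen for illustration; the text does not give a buffer size.

```python
import math


def backlog_kbits(ingress_kbps: float, egress_kbps: float, seconds: float) -> float:
    """Kilobits queued in the network after `seconds` of sustained sending."""
    return max(0.0, (ingress_kbps - egress_kbps) * seconds)


def seconds_until_overflow(ingress_kbps: float, egress_kbps: float,
                           buffer_kbits: float) -> float:
    """Time until an egress buffer of `buffer_kbits` starts discarding."""
    surplus = ingress_kbps - egress_kbps
    if surplus <= 0:
        return math.inf  # the egress UNI keeps up; the buffer never fills
    return buffer_kbits / surplus


# CIR of 128 kbps feeding a 64 kbps receiving UNI, as in the text.
print(backlog_kbits(128, 64, 10))

# Hypothetical 512-kbit egress buffer (an assumed figure).
print(seconds_until_overflow(128, 64, 512))
```

Setting the CIR at or below the receiving UNI's physical speed makes the surplus zero in this model, which is the sense in which a carefully configured CIR acts as flow control.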
Of course, the situation in Figure 6.4 could easily be interpreted as just another form of congestion control since the frame relay network is really the one being protected (the receiving UNI cannot operate faster than its physical speed anyway). The fact that the CIR is enforced at the ingress port makes this point less emphatic, however. Nevertheless, the CIR is a form of flow control and should be considered a flow control issue, if only because the CIR is a strictly local parameter on the network, between this sender and that receiver. This local characteristic is the essence of flow control. There is one other situation where a frame relay network can and should employ flow control. This is when LAPF reliable mode is used to provide reliable information transfer across a frame relay UNI. The only information transferred across the UNI in reliable mode is the Q.933 call setup signaling message. In this case, full data link layer frame transmission techniques, virtually identical to those employed in ISDN LAPD, are used to establish windows, withhold acknowledgments, and so on to perform flow control on the UNI. Reliable mode uses go-back-N recovery for lost data units. LAPF frames used for reliable mode contain a control field used for these purposes, while normal LAPF frames do not have a control field at all. Theoretically, a frame relay network service provider is allowed to implement reliable mode on any virtual circuit DLCI on the UNI, even those carrying user traffic. But, up to this point in practice, reliable mode has only been used on the UNI for DLCI = 0 Q.933 call setup signaling messages. (Further use of reliable mode for some actions on the frame relay NNI will be discussed in Chapter 8.)
Congestion Control Mechanisms
Congestion control mechanisms are closely related to and interact with flow control mechanisms. But this does not mean the mechanisms are the same. Network protocols can employ one of two main methods of congestion control: implicit congestion control and explicit congestion control. With implicit congestion control, the end devices on the network use some indirect source of information to determine if congestion exists anywhere along the path through the network from source to destination. Explicit congestion control also involves the end devices on the network, but here the end devices use some direct source of information to determine if congestion exists anywhere along the path through the network from source to destination.
Consider implicit congestion control first. Since networks like frame relay simply relay frames through the network, the higher layer protocols at the ends of the network are not notified of network congestion directly. End systems are all concerned with flow control, not so much with network congestion. End systems routinely monitor the delay through the network, but for purely selfish reasons. Nevertheless, there are two main ways that higher layer protocols can infer that congestion within the underlying network is taking place. First, the end systems can monitor how the round trip delay changes over time. The round trip delay might rise or fall, of course, but if the delay is slowly but surely rising over time without corresponding drops, this is a pretty good sign that network congestion is occurring. Second, the end systems can monitor the number of lost data units they send across the network. If the lost data units rise above the number expected due to the long-term error rate on the network, this is also a sign of network congestion. Both methods rely on the fact that congested networks slow down and ultimately discard data units.
Because neither mechanism can be said to rely on explicit information about network congestion, this is definitely implicit congestion control. There might be other causes for the same delay rise and data loss effects. Even in the case of implicit congestion control, however, response is usually swift and often effective. Senders realizing that delays are rising and/or lost data units are above the expected level will typically perform a quick backoff followed by a slow increase back up to their former sending rates. The quick backoff is usually to 50 percent of the former rate, and if the delay continues to rise and/or the data unit loss is high, another 50 percent backoff can be assessed (now the send rate is 25 percent of the uncongested rate). The slow increase means that senders can slowly increase their sending rate back up to their former levels if the delays begin to fall and/or data unit loss reverts to expected levels due to pure error rates. The nice feature of implicit congestion control is that the higher layer protocol employing such methods is totally independent of the type of network used to deliver the higher layer data units. Implicit congestion control is employed in many network protocols, including the most popular of all: TCP/IP. Now consider explicit congestion control. The whole idea is to avoid the network having to rely on implicit methods and extreme measures such as discarding information to inform the senders to slow down. There is no universal mechanism for explicit congestion control and the sender notification that accompanies the method. The message could be conveyed in the form of a special message or packet or frame. The only requirement is that the congestion notification be received and acted on by the senders and receivers that are in a position to actually do something about the congestion. 
For instance, it makes no sense to send congestion notifications to senders on the East coast if none of them are contributing to congestion in Idaho.
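The quick backoff and slow increase described above for implicit congestion control can be sketched as a simple rate-update rule. This Python fragment is a generic illustration, not the algorithm of TCP or any other specific protocol. The function name `next_rate` and the size of the recovery step are assumptions; the text specifies only the 50 percent backoff.

```python
def next_rate(current: float, full_rate: float, congested: bool,
              recovery_step: float) -> float:
    """Return the sending rate for the next interval.

    congested     -- True if delay is still rising and/or loss is above
                     the level expected from the error rate alone
    recovery_step -- hypothetical amount added per interval while the
                     network looks healthy (the "slow increase")
    """
    if congested:
        return current * 0.5                      # quick backoff: halve the rate
    return min(full_rate, current + recovery_step)  # creep back toward the old rate


# Two congested intervals followed by two healthy ones, starting at 100 percent.
rate = 100.0
history = []
for congested in (True, True, False, False):
    rate = next_rate(rate, 100.0, congested, recovery_step=10.0)
    history.append(rate)
print(history)
```

After two backoffs the sender is at 25 percent of its uncongested rate, matching the figure in the text, and it then recovers only gradually.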
Because there is no standard for explicit congestion control and the congestion control mechanism in frame relay is explicit congestion control and notification, this discussion is best pursued by leaving the realm of the abstract for the surer footing of frame relay itself.
Congestion Control in Frame Relay
Frame relay networks use an explicit congestion control method to prevent the frame relay network from filling switch buffers with frames that cannot be delivered before the frames must be discarded to make room for more arriving frames. The frame relay network uses a mechanism of explicit congestion notification (sometimes abbreviated ECN) to convey congestion information to both senders and receivers at the end of a particular DLCI (label for a PVC or SVC).
The first point to be made about frame relay network congestion is that it has both a location and a direction. This might seem odd at first. But determining if a frame relay network is congested is mainly a process of determining the status of buffers on the output ports of frame relay switches. Why output ports? Because the whole idea is to get traffic out of the network. Frame relay frames sitting in an output buffer represent work already done by the network node. If the frame is discarded, the switch might have to perform the work all over again. The most important output buffer is the one at the receiving UNI. A frame that has reached the receiving UNI’s output buffer means that the frame relay network has done all that it needs to do to deliver the frame to the user (actually, the FRAD) at the other end of the UNI and collect the fee for delivering the frame successfully. Congestion here means that this work might have to be repeated.
The location of congestion in a frame relay network is very specific. Congestion occurs at this output port. A frame relay switch might have eight output ports, only one of which is considered congested. Naturally, all eight could be congested, or only one or two. The point remains that congestion occurs in frame relay networks on an output port by output port basis. All DLCIs that have a path mapped through the frame relay network using that port will be affected.
It does not matter which DLCI might be flooding the switch with traffic, congestion affects all user connections serviced on the port. Note that in the opposite direction on a given DLCI, there might be no congestion at all. The output port buffers leading back in the opposite direction might be totally empty. This is why frame relay congestion has both a location and a direction. It is true that if one output port in one direction is congested at some point in time, the effects will ripple back through the frame relay network and congest other output ports. That is the nature of congestion. The aim of congestion control and the notification process is to prevent this ripple effect from occurring. In keeping with the whole mild and severe congestion philosophy, most frame relay switches have two levels of congestion. The levels are based on the percentage of output port buffers in use at any particular point in time. There is no magic number attached to these percentages. Most frame relay switch vendors set the default at 50 percent and 80 percent of output port buffer capacity respectively. Few frame relay service providers ever see fit to change the defaults. The concepts of normal, mildly congested, and severely congested frame relay output buffers are shown in Figure 6.5.
Figure 6.5 Mild and severe congestion in frame relay.
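The two-level scheme can be expressed as a simple classification of output buffer occupancy. The Python sketch below uses the 50 percent and 80 percent defaults mentioned above. The function name and the decision to treat both thresholds as inclusive are assumptions made for illustration; actual switch implementations may differ.

```python
def congestion_level(frames_queued: int, capacity: int,
                     mild_pct: int = 50, severe_pct: int = 80) -> str:
    """Classify one output port buffer as normal, mild, or severe.

    Thresholds are percentages of buffer capacity. Integer arithmetic
    avoids rounding surprises exactly at a threshold.
    """
    if capacity <= 0 or not 0 <= frames_queued <= capacity:
        raise ValueError("need 0 <= frames_queued <= capacity and capacity > 0")
    used = frames_queued * 100
    if used >= severe_pct * capacity:
        return "severe"
    if used >= mild_pct * capacity:
        return "mild"
    return "normal"


# Four output ports, each with room for 10 frames (hypothetical occupancy).
for port, queued in enumerate([6, 2, 9, 1], start=1):
    print(port, congestion_level(queued, 10))
```

Because the test is made per output port, a switch can report one port as severely congested while its neighbors remain normal, which is the per-port view the figure illustrates.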
It is clear from the figure that only two of the four output ports are congested. This means that only those DLCIs that actually have frames routed through those two output ports (Ports #1 and #3) are currently experiencing congestion. Unless steps are taken, all output ports might join the congested category in the near future. But not all frames arriving on the input ports are switched through the congested output ports. The two DLCIs defined on input port #4, for instance, are routed with one to the severely congested output port #3 and the other to the uncongested output port #4. Perhaps both DLCIs originate at the same customer site. But traffic on one DLCI is experiencing congestion and maybe increasing delays (output port #3) while the other DLCI is not (output port #4). Ideally, congestion control in this situation should have a way to inform the sender and/or receiver on the paths experiencing congestion that the congested condition exists. That way, only the connections affected will have to throttle back their sending until the congestion is alleviated. Frame relay uses the mechanism of explicit congestion notification to accomplish this task. The frame relay explicit congestion notification is sent to both senders and receivers. This might sound odd since the whole goal is to get the senders to slow down. But remember that most modern network protocols are windowing protocols in which the receiver, not the sender, is the primary party for the flow control mechanisms that ultimately slow senders. However, not all protocols are windowing protocols and, more importantly, not all applications involve frequent receiver to sender interactions. The examples of burglar alarms and weather stations were used earlier. Only the senders in these applications can do anything about their sending rate. 
So the senders and receivers must first be notified of the congestion and, if it continues, the frame relay network will invoke congestion control to alleviate the congested condition. Explicit congestion notification in frame relay involves the use of two bits in the frame relay frame header. Congestion control in frame relay involves the use of one other bit as well. The two bits directly involved in explicit congestion notification are the Forward Explicit Congestion Notification (FECN) and Backward Explicit Congestion Notification (BECN) bits. Congestion control involves the Discard Eligible (DE) bit, also in the frame relay frame header. Figure 6.6 shows the location of these bits in the second octet of the frame relay header.
Figure 6.6 The FECN, BECN, and DE bits. The FECN bit is used to inform the receiver of the presence of congestion anywhere on the path from sender to receiver. The BECN bit is used to inform the sender of the presence of congestion anywhere on the path from sender to receiver. Both are necessary just to make sure that the sender always gets the message and has an opportunity to slow down before severe congestion occurs. User FRADs always generate frame relay frames with the FECN and BECN bits set to a 0 bit. Any frame relay switch can set the two bits to a 1 bit on all frames traveling in both directions that pass through the congested port, both input and output. In other words, if DLCI 19 from User A to User B is mapped to a congested output port, then not only will all frames traveling on this output port have the FECN and BECN bits set, but so will all frames that travel on DLCI 19 from User B to User A, even though there is no congestion at all in that direction. Uncongested frame relay switches will ignore the FECN and BECN bits and never change them back to a 0 again under any circumstances. In most frame relay networks, the setting of the FECN and BECN bits occurs when the output port buffers are 50 percent full. As a simple example, consider a frame relay switch with four output ports that has output buffers for 10 frames on each output port. If there are five frames in the buffer, then the switch will begin to set the FECN bits on all frames switched to that output port to a 1 bit. All frames that arrive on the corresponding input port will have their BECN bits set to a 1 bit. Frames switched to and from other ports on the switch will not be affected by this activity.
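For readers who want to see the bits themselves, the following Python sketch extracts the DLCI, FECN, BECN, and DE fields from a two-octet frame relay header and shows a switch-style marking operation. The bit positions are those of the default two-octet address format as commonly documented and should be checked against Figure 6.6. The names `FrHeader`, `parse_header`, and `set_congestion_bit` are hypothetical.

```python
from typing import NamedTuple

# Masks for the second header octet (two-octet address format assumed).
FECN_MASK = 0x08
BECN_MASK = 0x04
DE_MASK = 0x02


class FrHeader(NamedTuple):
    dlci: int
    fecn: bool
    becn: bool
    de: bool


def parse_header(octet1: int, octet2: int) -> FrHeader:
    """Decode the 10-bit DLCI and the three congestion-related bits."""
    dlci_high = (octet1 >> 2) & 0x3F   # upper 6 bits of the DLCI
    dlci_low = (octet2 >> 4) & 0x0F    # lower 4 bits of the DLCI
    return FrHeader(
        dlci=(dlci_high << 4) | dlci_low,
        fecn=bool(octet2 & FECN_MASK),
        becn=bool(octet2 & BECN_MASK),
        de=bool(octet2 & DE_MASK),
    )


def set_congestion_bit(octet2: int, mask: int) -> int:
    """What a congested switch does: turn a bit on, never off."""
    return octet2 | mask


# A frame on DLCI 19 as a FRAD would generate it: FECN = BECN = DE = 0.
print(parse_header(0x04, 0x31))

# The same header after a congested output port marks it with FECN.
marked = set_congestion_bit(0x31, FECN_MASK)
print(parse_header(0x04, marked).fecn)
```

Marking is a single OR operation on one octet, which is part of why explicit notification costs the switch so little compared with discarding and retransmission.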
FRADs that receive frames on a given DLCI with the FECN bit set know that somewhere on the path between themselves and the other FRAD there is an output buffer handling arriving frames that is more than 50 percent full. FRADs that receive frames on a given DLCI with the BECN bit set know that somewhere on the path between themselves and the other FRAD there is an output buffer handling departing frames that is more than 50 percent full. The two meanings are somewhat different and affect just which sender should slow down in the case of FRADs that exchange information in both directions. Obviously, there is no benefit derived from having senders slow down if this will not help to alleviate the congestion in the network. A FRAD could receive frames with both the FECN and BECN bits set to a 1 bit. This means that one or more output buffers in both directions are more than 50 percent full. A FRAD could receive frames with the FECN and/or BECN bits set to a 1 bit on one or more DLCIs. Since DLCIs are used to label all connections, this information can be used to slow senders that are directly contributing to the congestion and no other connections. This interplay of buffers, FRADs, and FECN/BECN bits is shown in Figure 6.7.
Figure 6.7 Buffers, FRADs, and BECN/FECN bits.

Note that the FECN bit tells a receiver that there is congestion and that is all. It is up to the higher layer protocol to first detect the bit, then invoke some delayed acknowledgment or reduced window strategy to force the sender to slow down. Unfortunately, there are few higher layer protocols that are capable of detecting whether frame relay frames arrive with the FECN and/or BECN bit set to a 1 bit in the first place. And TCP/IP, the most common higher layer protocol, definitely does not and will not detect the status of these bits. This is because TCP/IP highly prizes its “runs the same on anything” reputation. Adding features and functions only for frame relay networks would violate this principle.

Without the cooperation of the receiver and sender higher layer protocols, there is only so much a FRAD can do to respond to the FECN and BECN bits. If user applications continue to send in an unrestricted manner, the FRAD can either attempt to buffer the extra traffic, perform its own version of CIR enforcement (no frames that would be tagged DE = 1 by the switch are sent), or use a combination of the two approaches. Modern FRADs do some amount of buffering of frames, usually quite a bit, in an effort to smooth bursts in a process known as traffic shaping. Many FRADs also respect the CIR, and even set the DE bit to 1 when the switch would in any case, although this is not technically the job of the FRAD.

In most cases where a FRAD, router or hardware device, responds at all to the FECN and/or BECN bits, it is by slowing down to the CIR level established for each DLCI affected by the congestion. Bursts above the CIR on those DLCIs are buffered, if possible, or else discarded. After all, frames above the CIR are not guaranteed delivery anyway. Who cares if it is the FRAD or the switch that actually performs the discard?

Which Traffic to Discard?

The role of the FRAD in congestion control has introduced the function of the DE bit in the whole congestion control process. The DE bit comes into play because frame relay applications routinely burst above the CIR. As previously mentioned, this is okay, but there is no guarantee that frames sent in excess of the CIR on a given DLCI will actually get there. The CIR essentially forms a bandwidth quality of service guarantee on the frame relay network, although the CIR mechanism is not perfect and even traffic conforming to the CIR might be discarded under congested conditions. Here is why.
The FECN and BECN provide a mechanism that effectively kicks in at 50 percent buffer capacity, which should be early enough to avoid severe congestion and a need to discard any user traffic. However, it is clear that if any traffic is to be discarded, it should be the frames tagged upon entry to the network as DE = 1 (may be discarded under certain conditions). After all, if the rates that users pay for DLCIs are higher for higher CIRs, then users can attempt to cut corners and pay for a lower rate CIR and simply burst above this rate constantly. If all DE = 1 frames go through anyway, what is the point of a CIR at all? Maybe all CIRs should just be zero (all frames are tagged DE = 1). In fact, several major frame relay networks only support CIR = 0 service. So a discard of user traffic should focus on DE = 1 frames. These are frames that the network has not “promised” to deliver, so there should be no repercussions in terms of service level agreements (SLAs) or tariffs. If the DE = 1 frames do not arrive, the missing frame contents (but not necessarily the frame relay frames themselves) will be detected by the receiving higher layer protocol (IP for instance). Then normal error-recovery procedures are taken by the higher layer protocol to either compensate for the missing frames’ information (freeze a video, silence on a voice call) or ask for a retransmission from the sender (routine in data applications). This is why customer premises equipment and applications that respect the CIR have such value in a frame relay network. Notice that 50 percent buffer capacity on an output port does not trigger the frame relay switch to do anything at all with the frame relay frame contents in the buffer. The switch just begins setting FECN and BECN bits on traffic in both directions, and continues to accept traffic for the overloaded port and send the contents of the buffer out onto the network. 
Ideally, if all FRADs, protocols, and applications properly interpret and react as recommended to the FECN and BECN bits, this action should be enough. But what if it isn’t? What if the congested buffer continues to fill, from 50 percent to 60 percent to 70 percent and beyond? At 100 percent buffer capacity, which in our example has been set at 10 frames queued to go out the output port, bad things will assuredly happen. Since buffers are memory areas dedicated for communications purposes, the symptoms of full buffers closely resemble the symptoms a simple PC suffers when it runs out of main memory. The system can hang, crash, or reboot itself. So can frame relay switches. Naturally, this does not mean the frame relay switch will always do one thing or another. The switch could even just switch frames incorrectly. But why take chances? The trick is to never allow the output buffers to reach 100 percent capacity and see what happens in a production environment.

Most frame relay switches will begin to set the FECN and/or BECN at 50 percent buffer capacity, or five frames in the example switch. The next major mechanism takes over at 80 percent of buffer capacity. At 80 percent capacity, or when eight frames are now in the buffer, the buffered frames are scanned for the status of the DE bit. If the bit is set (DE = 1), then the frame is discarded. Actually, the term “discarded” is much too concrete to describe what really happens in most cases. Typically, the buffer space is freed, usually by returning the pointer to that buffer space to the pool for free buffers on the port in the switch. Either way, though, the result is the same: The frame is gone.

It is instructive to consider some of the possible outcomes of this DE = 1 scan procedure. After all buffer contents are scanned, the buffer could actually be totally empty! This would be the case if all the frames in the buffer were tagged as DE = 1. This often happens when CIR = 0 is used a lot on networks.
The process does not halt when buffer occupancy is less than 80 percent, or 50 percent, or some other number. The process completes when all frames in the buffer are examined for possible discard. Not only is severe congestion alleviated, but the buffer is given a new lease on life. And if the applications have reacted to the FECN and/or BECN bits, it should be a while before the port is again congested.
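The two thresholds described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Frame and OutputPort classes are hypothetical, the 10-frame buffer and the 50 and 80 percent thresholds come from the running example, and a frame is marked FECN when the occupancy including that frame reaches the notification threshold. Real switches free buffer pointers rather than rebuild a list.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    dlci: int
    de: bool = False    # discard eligibility bit
    fecn: bool = False  # forward congestion notification bit


@dataclass
class OutputPort:
    capacity: int = 10        # frames, as in the example switch
    notify_at: float = 0.5    # start setting FECN at 50 percent
    discard_at: float = 0.8   # scan for DE = 1 at 80 percent
    queue: list = field(default_factory=list)

    def congested(self) -> bool:
        return len(self.queue) >= self.notify_at * self.capacity

    def enqueue(self, frame: Frame) -> None:
        self.queue.append(frame)
        if self.congested():
            frame.fecn = True
        if len(self.queue) >= self.discard_at * self.capacity:
            # The scan covers the whole buffer and does not stop once
            # occupancy drops below a threshold.
            self.queue = [f for f in self.queue if not f.de]


port = OutputPort()
for i in range(8):
    # Alternate DE = 0 and DE = 1 frames on DLCI 19.
    port.enqueue(Frame(dlci=19, de=(i % 2 == 1)))
print(len(port.queue), port.congested())
```

In this run the eighth frame triggers the scan, the four DE = 1 frames are freed, and the port drops back below the notification threshold. If every queued frame is DE = 0, the same scan frees nothing.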
Alternatively, the scan on the buffer could result in a buffer that is as full as it was before the DE = 1 scan commenced! This is not as odd as it sounds and actually has a message for those concerned with frame relay configuration and traffic management.
The output buffers that are of concern here are the buffers located on the network side of the UNI. The destination FRAD is at the other end of the UNI, and the frame is considered to be delivered once it has arrived intact at the FRAD. There could be many other output buffers inside the frame relay network, and often are. But since there is no requirement to use frame relay to link frame relay switches together inside the cloud (this would not be true UNI nor NNI anyway), the output buffer at the destination UNI is the only output buffer definitely under the control of frame relay specifications.
Recall that only frame relay frames bursting above the CIR are tagged as DE = 1. What if everyone everywhere only sends frames into the frame relay network at the CIR rate? There are many ways this could be accomplished. FRADs could buffer frames before they enter the network, applications could pace themselves, the CIR could be vastly overconfigured, and so forth. In any case, if many DLCIs all lead to a central site, as is often the case, and all sending FRADs respect the CIR, then there will be no DE = 1 frames in the buffer at all, even if the buffer occupancy exceeds 80 percent. Therefore, there will be no frames to throw away. What can be done about congestion in these all DE = 0 circumstances? Here is where the “under normal conditions” clause of the CIR promise of assured delivery and guaranteed bandwidth takes effect. Buffers that are congested and contain no DE = 1 frames at all are not normal in frame relay terminology. If senders can burst, they will. If they do not, that is not normal. And under non-normal conditions, the switch is basically free to do whatever it likes to alleviate the congestion, especially if the alternative is to hang, crash, or reboot the switch.
Some switches will just proceed to skim off the excess DE = 0 frames above the 50 percent level. This makes sense because that is the last traffic to enter the switch and such a procedure does not penalize user traffic that has been sitting in the buffer patiently waiting to be sent. Other switches will simply flush the entire buffer and start from scratch.
Switch vendors often become creative when dealing with DE = 1 frames on the frame relay network. One major frame relay switch vendor made a feature, known as early packet discard, the default action on the sending UNI. The thinking behind this feature seems to have been: “Why wait until output buffers congest at the egress switch to look for DE = 1 frames? Let’s find them at the input port at the ingress switch when we screen for CIR compliance. Then any frame that would have been tagged as DE = 1 can be discarded earlier rather than later.” The feature worked very well. In fact, it worked so well that no one with a CIR = 0 ever was able to send a single frame through the network. All CIR = 0 traffic was duly tagged as DE = 1 in one process and tossed in the bit bucket by the next process. Since the service provider had nearly every customer configured for CIR = 0 at the time of the switch cutover, users were less than pleased. The only action that could be taken was to either disable the early packet discard or give everyone a non-zero CIR. The service provider quickly handed out CIRs, usually at 50 percent of the UNI physical line rate, since the benefits of early packet discard were still attractive to the service provider.
Users faced with frame relay networks that routinely discard frames tagged as DE = 1 quickly learn a few things. First, it is always a good idea to provide the network with a few frame relay frames to discard if the network starts hunting down DE = 1 frames. Some FRADs allow users to set the DE bit to a 1 bit intentionally, even if the frame is sent below the CIR. Of course, the FRAD cannot set a DE bit to a 0 bit if the frame would be above the CIR and have a DE bit set to a 1 bit ordinarily. Second, DE bits can be used as a very raw form of priorities when set by the FRAD. If a single DLCI carries (for example) IP packets with voice samples in some packets and bulk file transfer data in others, it only makes sense to tag the file transfer frames as DE = 1 and the voice frames as DE = 0, even if all are sent below the CIR. Congestion can occur anywhere in the network and has no respect for real-time requirements.

Discarding DE = 0 traffic at a frame relay switch is a short-term solution to the problem of congested buffers with no discardable traffic. But there is a long-term solution, one that requires the cooperation of those responsible for network management, planning, and configuration. They must cooperate, whether they work for the customer or the service provider, to treat the repeated situation of 80 percent buffer capacity with no DE = 1 frames to discard as a clear message and a mandate for change on the frame relay network.

This is the message that the consistent 80 percent or more congested output buffer with all DE = 0 frames is sending to the responsible personnel: The receiving UNI’s port speed and physical line rate are underconfigured. In plain terms, all senders are respecting their CIRs. The network has allocated sufficient resources to deliver these frames to the destination UNI efficiently. The destination UNI just cannot get rid of them fast enough to prevent congestion. The only long-term solution is to increase the port speed and physical line rate of the destination UNI. This is usually not a trivial task. The proper new speed must be determined (256 kbps instead of 128 kbps? A full T1?). Facilities must be provided and paid for. The receiving FRAD must be configured properly. The cutover must be coordinated and managed. All of these issues are more properly design issues and are not discussed further.
What FRADs Can Do about Congestion

So far, the discussion about frame relay congestion might give the impression that FRADs are completely at the mercy of the frame relay network and dependent on the higher layer protocols when it comes to congestion control. Therefore, it seems ironic that FRADs are the frame relay network components that are the target of the FECN and BECN bits. The whole concept of layered protocols and layer independence makes it difficult for the FRAD operating at lower layers to inform the applications operating at the higher layers on the frame relay network to slow down and speed up when they should. The issue is more than just getting senders to pay attention. Just what constitutes an adequate slowdown? When should senders be allowed to speed up again? These issues have also been mentioned, but so far the answers presented have been more along the lines of general terminology (“usually 50% slow down”) and not much else.

Obviously it would be in everyone’s best interest if some standard mechanism were established to allow for vendor independence and customer evaluation of devices on a common ground. The standards issue is always important, especially for public networks. As it turns out, such guidelines do exist. Annex A of ANSI’s T1.618 specification on frame relay actually specifies how user devices (routers and/or FRADs) and networks should use and act on the FECN and BECN bits. Since more and more frame relay equipment manufacturers and vendors have pledged to react to the FECN and BECN bits, routinely ignored until relatively recently, this is a good place to discuss what frame relay equipment should do about congestion and information loss.

Forward Explicit Congestion Notification (FECN) Use by User and Network

According to the specification, the user device (FRAD) compares the number of frames in which the FECN bit is set to a 1 bit (congestion) to the number of frames in which the FECN bit is set to a 0 bit (no congestion) over a defined measurement period. During this measurement period, if the number of frames with the FECN bit set to a 1 bit equals or exceeds the number of frames with the FECN bit set to a 0 bit, the user device should reduce its sending rate to 0.875 of its previous value. By the same token, if the number of frames with the FECN bit set to a 0 bit equals or exceeds the number of frames with the FECN bit set to a 1 bit, the user device is allowed to increase its sending rate by 1/16 (0.0625) of its current value (reflecting the slow start used to restore the sending rate). The measurement interval is to be equal to approximately four times the end-to-end network delay. So if the end-to-end delay is 40 ms, the interval should be about 160 ms or so.

As for the frame relay network use of the FECN bit, the frame relay switch constantly monitors the size of each queue in the buffers based on what is called the regeneration cycle. A regeneration cycle starts when an output buffer goes from being idle (empty) to busy (one or more frames). A measurement period is also defined, between the start of the previous regeneration cycle and the present time within the current measuring cycle. During this measurement period the average size of the output buffer queue is computed according to a formula. When this average size exceeds a predetermined threshold value, this particular output link is considered to be in a state of incipient congestion. At this time, the FECN bit on outgoing frames is set to 1 and remains set to 1 until the average queue size falls below this preestablished threshold.
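The user-side FECN rule can be expressed as a short Python sketch. The 0.875 and 1/16 factors and the four-times-delay interval come from the text; the function names and the max_rate cap are hypothetical additions for illustration, and a tie between the two counts is treated here as congestion, which is an assumption since the wording allows either reading.

```python
def measurement_interval_ms(end_to_end_delay_ms: float) -> float:
    """Measurement period: roughly four times the end-to-end delay."""
    return 4 * end_to_end_delay_ms


def adjust_rate_fecn(rate: float, fecn_set: int, fecn_clear: int,
                     max_rate: float) -> float:
    """Apply one FECN adjustment at the end of a measurement period.

    rate       -- current sending rate (any consistent unit, e.g. bps)
    fecn_set   -- frames received with FECN = 1 in the period
    fecn_clear -- frames received with FECN = 0 in the period
    max_rate   -- hypothetical upper bound, e.g. the access rate
    """
    if fecn_set >= fecn_clear:
        # Congestion indicated (ties count as congestion here).
        return rate * 0.875
    # No congestion: creep back up by 1/16 of the current rate.
    return min(rate * (1 + 1 / 16), max_rate)


print(measurement_interval_ms(40))
print(adjust_rate_fecn(64000, fecn_set=6, fecn_clear=4, max_rate=128000))
print(adjust_rate_fecn(64000, fecn_set=2, fecn_clear=8, max_rate=128000))
```

The decrease is larger than the increase, so a sender backs off faster than it recovers.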
The ANSI T1.618 specification defines an algorithm to be used to compute the average queue length. The algorithm involves making a series of calculations such as the queue length update, the queue area update, and the average queue length update. This process makes use of the following variables:

$t$ = Current time
$t_i$ = Time of the $i$th arrival or departure event
$q_i$ = Number of frames in the system after the event
$T_0$ = Time at the beginning of the previous cycle
$T_1$ = Time at the beginning of the current cycle

For the sake of completeness, the actual algorithm consists of the following three components for the calculations:

1. The queue length update:

Beginning with $q_0 = 0$,
If the $i$th event is an arrival event, $q_i = q_{i-1} + 1$
If the $i$th event is a departure event, $q_i = q_{i-1} - 1$

2. The queue area (integral) update:

Area of the previous cycle = $\sum_{t_i \in [T_0, T_1)} q_{i-1}(t_i - t_{i-1})$
Area of the current cycle = $\sum_{t_i \in [T_1, t)} q_{i-1}(t_i - t_{i-1})$

3. The average queue length update:

Average queue length over the two cycles = (Area of the previous cycle + Area of the current cycle) / $(t - T_0)$
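The three updates can be traced with a small Python sketch. It assumes the average is the combined area of the previous and current regeneration cycles divided by the time elapsed since the start of the previous cycle, which follows from the definition of the measurement period given earlier. The function name and the event representation are hypothetical.

```python
def average_queue_length(events, now):
    """Average queue length over the previous and current cycles.

    events -- list of (time, kind) tuples sorted by time, where kind is
              "arrival" or "departure"
    now    -- current time t
    """
    if not events:
        return 0.0
    q = 0                # queue length after the latest event
    prev_time = None     # time of the latest event
    t0 = None            # start of the previous cycle
    t1 = None            # start of the current cycle
    area_prev = 0.0
    area_curr = 0.0
    for time, kind in events:
        if prev_time is not None:
            # Queue area update: length before the event times elapsed time.
            area_curr += q * (time - prev_time)
        if kind == "arrival":
            if q == 0:
                # Idle to busy: a new regeneration cycle begins.
                t0, area_prev = t1, area_curr
                t1, area_curr = time, 0.0
            q += 1
        else:
            q -= 1
        prev_time = time
    # Account for the time since the last event.
    area_curr += q * (now - prev_time)
    start = t0 if t0 is not None else t1
    if start is None or now <= start:
        return 0.0
    return (area_prev + area_curr) / (now - start)


events = [(0, "arrival"), (1, "arrival"), (2, "departure"),
          (3, "departure"), (5, "arrival")]
print(average_queue_length(events, now=7))
```

A switch would compare the returned value against its configured threshold and set FECN on outgoing frames while the average stays above it.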
Backward Explicit Congestion Notification (BECN) Use by User and Network

According to the specification, if a user receives “n” consecutive frames with the BECN bit set to a 1 bit, the traffic should be reduced from the user by a step below the current sending or offered rate. The step count (S) is defined in the following order:

0.675 times throughput
0.5 times throughput
0.25 times throughput

In the same fashion, traffic can be built up after receiving “n/2” consecutive frames with the BECN bit set to a 0 bit. The rate is increased by a factor of 0.125 times the sending rate. The value of S is calculated according to the following formulas:

where

IRf = Information rate in the forward direction
IRb = Information rate in the backward direction
S = Step function count
Thf = Throughput in the forward direction agreed during call establishment
Thb = Throughput in the backward direction agreed during call establishment
EETD = End-to-end transit delay
N202f = Maximum information field length in the forward direction
N202b = Maximum information field length in the backward direction
ARf = Access rate forward
ARb = Access rate backward
Bef = Excess burst size forward
Beb = Excess burst size backward
Bcf = Committed burst size forward
Bcb = Committed burst size backward
Fb/Ff = Ratio (either expected or measured over some implementation-dependent period of time) of frames received to frames sent

The same document recommends that for network use of the BECN bit the frame relay network begin setting the BECN bit to a 1 bit prior to experiencing serious congestion and having to discard frames. Of course, if congestion ever reaches the point of creating severe problems, the network will start to discard frames, and frames with the DE bit set to 1 should be the first to go.
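The user-side BECN behavior described above can be sketched as a small state machine in Python. The step fractions (0.675, 0.5, 0.25) and the 0.125 increase come from the text. The class name, the choice to pass n in directly rather than derive it from S, and the interpretation of a step below the current rate as the largest step value strictly below it are assumptions made for illustration.

```python
REDUCTION_STEPS = (0.675, 0.5, 0.25)  # fractions of agreed throughput
INCREASE_FACTOR = 0.125               # fraction of the current rate


class BecnRateController:
    """Hypothetical sender-side reaction to BECN bits on one DLCI."""

    def __init__(self, throughput: float, n: int) -> None:
        self.throughput = throughput  # agreed throughput
        self.n = n                    # consecutive BECN = 1 frames per step
        self.rate = throughput
        self.set_run = 0              # consecutive frames with BECN = 1
        self.clear_run = 0            # consecutive frames with BECN = 0

    def on_frame(self, becn: bool) -> float:
        """Update the sending rate for one received frame and return it."""
        if becn:
            self.clear_run = 0
            self.set_run += 1
            if self.set_run >= self.n:
                self.set_run = 0
                # Drop to the next step value below the current rate, if any.
                lower = [f * self.throughput for f in REDUCTION_STEPS
                         if f * self.throughput < self.rate]
                if lower:
                    self.rate = lower[0]
        else:
            self.set_run = 0
            self.clear_run += 1
            if 2 * self.clear_run >= self.n:  # n/2 consecutive clear frames
                self.clear_run = 0
                self.rate = min(self.rate * (1 + INCREASE_FACTOR),
                                self.throughput)
        return self.rate


ctl = BecnRateController(throughput=64000, n=4)
for _ in range(12):
    ctl.on_frame(becn=True)   # three full runs of BECN = 1
print(ctl.rate)
for _ in range(2):
    ctl.on_frame(becn=False)  # n/2 clear frames
print(ctl.rate)
```

Once the rate has reached 0.25 of throughput, further BECN runs leave it there. Recovery is gradual because each increase is only one-eighth of the current rate.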
Windowing and FECN/BECN

There are actually four situations that need to be considered in the interplay between windowing protocols and the FECN and BECN bits. These four are:

FECN with no traffic loss.
FECN when traffic loss is detected.
BECN with no traffic loss.
BECN when traffic loss is detected.

Consider each case in order. Most end-user protocols employ some form of windowing protocol for end-to-end flow control. In this environment, when a FRAD or user device employs FECN with no information loss detected, it compares the number of frames received with the FECN bit set to a 1 bit and those frames received with the FECN bit set to a 0 bit during a measurement interval. The measurement interval should be equal to two window turns. A window turn is the maximum number of frames that can be sent before an acknowledgment is required. If the number of frames with the FECN bit set to a 1 bit is greater than or equal to the number of frames with the FECN bit set to a 0 bit, then the device reduces the window size to 0.875 of its current value. But if the number of frames received with the FECN bit set to a 1 bit is less than the number of frames with the FECN bit set to a 0, the device increases the window size by one frame, as long as this action would not exceed the maximum window size for the connection. After each adjustment, the whole process repeats itself.

Next, consider the case where the user device has detected not only FECN bits but also some lost traffic and missing information. Not all user devices can do this, but if a frame is missing and the device realizes it, the device should reduce the window size to 0.25 of its current value. But if the device realizes that the frame relay network is providing congestion notification (early frame relay switches did not even use the FECN/BECN bits) and no frames with the FECN bit set to a 1 bit were received during the measurement interval (as previously defined), the device should conclude that the information loss is not due to network congestion. This conclusion is based on the fact that the network would normally send frames with the FECN bit set to a 1 if congestion was occurring. So frame loss without any indication of FECN bits set to a 1 bit is assumed to be due to errors on the network and not congestion. If there is further indication of congestion (FECN bits set to 1), then the window size is reduced by a factor of 0.625 instead of 0.25.

The third possibility is when BECN bits indicate congestion but no frames are detected as missing. In this situation, the step count S (as previously defined) is used to adjust the sending rate. The step count S can have several values, but in this example S is assumed to be one window turn. If a frame is received with the BECN bit set to a 1 bit, the device reduces the window size by 0.625. The device will continue to reduce the window size if S consecutive frames with the BECN bit set to a 1 bit are received. Naturally, the window cannot be reduced to less than one frame, so the process eventually halts.
But when frames are received with the BECN bit set to a 0 bit, the device increases the window size by one frame after receiving a total of S/2 frames.

The final case is where there are BECN bits for congestion and frame loss is detected. Again, this assumes that the user device is capable of detecting the lost traffic. In this case, the device reduces the sending rate to 0.25 of the current rate. This occurs whether the sending rate is being reduced due to congestion notification (BECN bits set to a 1 bit), or the frame relay network does not support frame operations that can set the BECN bit.

Most frame relay networks rely on the user device (for example, a router or desktop PC attached to a FRAD) to perform flow control operations on the UNI. In many cases, the transport layer (end-to-end layer) in the user device performs the flow control function. But if the FECN/BECN bits are to be interpreted and acted on by this transport layer, some mechanism must be put in place for the frames with the FECN and/or BECN bits set to a 1 bit to notify the transport layer of their status. This plan, as simple as it seems, is not so easy to implement. The plan requires changes to higher layer functions and the new coding to go along with it. Also, many transport layer protocols will time out at the sending end if acknowledgments are delayed or missing. The result is a lot of retransmissions of discarded traffic. This only makes network congestion worse, as the identical traffic pattern that caused the congestion is reintroduced into the network by the unsuspecting transport layer. So it is not only the FRAD or the router that needs to adjust sending rates in response to frame relay network congestion.
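The FECN windowing rules can be illustrated with a brief Python sketch. It covers only the two FECN cases as stated (reduce to 0.875 on congestion with no loss, reduce to 0.25 when loss is detected, otherwise grow by one frame). The function name is hypothetical, rounding down to a whole number of frames is an assumption, and the refinement for loss that is not caused by congestion is deliberately not modeled.

```python
import math


def adjust_window_fecn(window: int, fecn_set: int, fecn_clear: int,
                       max_window: int, loss_detected: bool = False) -> int:
    """Return the new window size after one measurement interval.

    window        -- current window size in frames
    fecn_set      -- frames received with FECN = 1 in the interval
    fecn_clear    -- frames received with FECN = 0 in the interval
    max_window    -- maximum window size for the connection
    loss_detected -- True if the device noticed a missing frame
    """
    if loss_detected:
        new_window = math.floor(window * 0.25)
    elif fecn_set >= fecn_clear:
        new_window = math.floor(window * 0.875)
    else:
        new_window = window + 1
    # The window never drops below one frame or exceeds the maximum.
    return max(1, min(new_window, max_window))


print(adjust_window_fecn(8, fecn_set=5, fecn_clear=3, max_window=8))
print(adjust_window_fecn(8, fecn_set=5, fecn_clear=3, max_window=8,
                         loss_detected=True))
print(adjust_window_fecn(7, fecn_set=1, fecn_clear=9, max_window=8))
```

Calling the function once per measurement interval (two window turns in the text) reproduces the repeat-after-each-adjustment behavior.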
Consolidated Link Layer Management (CLLM)

This chapter has been concerned with many aspects of frame relay congestion notification and control. Notification is handled in frame relay by conveying to the endpoint devices on a connection the FECN and BECN bits in the frame relay frame header. The congested DLCI is indicated by the simple fact that the frame relay header also contains the DLCI of the affected connection. It has been shown that a given sender might receive FECN and/or BECN indications on only one DLCI defined on a UNI, or more than one DLCI, or all of them. The use of FECN and BECN is a simple yet powerful notification mechanism that can be used to avoid congestion on the frame relay network.

But there is one circumstance where this simple FECN/BECN mechanism will just not work. Mention has already been made of one-way user traffic applications and several examples have been given. Other user applications such as the famous network printer (more correctly, networked print server) can pose a special challenge for FECN/BECN situations as well. The problem is this: How can the status of the FECN and BECN bits be sent to both ends of a frame relay connection efficiently if there is little or no traffic in one direction? Both ends of a frame relay connection must be notified of congestion, since the frame relay network has no idea (and should not have any idea) whether flow control is performed by the sender or receiver. But if one end never or only rarely receives frames from its counterpart across the network, how can the FECN/BECN information be sent at all? The situation is shown in Figure 6.8.
Figure 6.8 The trouble with FECN/BECN and one-way user traffic.

If there are no or few user frames flowing in one direction on the network, then the FECN/BECN congestion notification method will not work. Some sources say that frame relay networks must provide out-of-band network management messages. But the term out-of-band is typically used by service providers to refer to bandwidth and is not useful for user or bearer traffic. There is no such thing as out-of-band in frame relay in the sense of bandwidth dedicated for control functions. In frame relay, out-of-band effectively means not on a user DLCI. So the problem of no or insufficient user frames on user DLCIs to carry timely information such as FECN and BECN status has to be solved by using a nonuser DLCI to convey these types of information. The only question left is exactly which DLCI should be used.

ANSI has established that DLCI 1023 (all 1 bits in the DLCI field) is to be used not only to convey FECN and BECN types of information, but a whole range of network situations that the end devices should be aware of. CLLM frames and messages are sent periodically on all UNIs. Response to CLLM congestion notifications is to be the same as that specified by ANSI and detailed in the previous section. A full discussion of the CLLM protocol is not needed in this chapter. It is enough to note that the CLLM frames on DLCI 1023 can solve the one-way traffic problem. Use of CLLM to address this issue is shown in Figure 6.9.

Figure 6.9 The one-way problem solved with CLLM.

As it turns out, there is a whole family of messages used for out-of-band frame relay network control and management. Certainly the Q.933 signaling messages discussed in the previous chapter belong to this category, and the family of link management protocols also falls into this category. Link management is the topic of the next chapter in this book.
Frame Relay Compression Although this chapter deals with congestion control in frame relay networks, this section is about compression in frame relay networks. What has one to do with the other? The answer is easy enough. Congestion occurs when there are too many user traffic bits sent into the network. Compression algorithms remove redundancies in user traffic and thus reduce the number of bits sent into the network. So, the effective and standard use of compression applied to user traffic bits can reduce the risk of congestion in a frame relay network. The Frame Relay Forum is the source of the standard way to do compression on the contents of frame relay frames. FRF.9 is the Data Compression Over Frame Relay Implementation Agreement (IA). Now, there is nothing to stop FRAD vendors from using whatever data compression techniques they wish in their equipment. The problem is that a remote FRAD from another vendor might not be able to decompress the frame contents correctly in all circumstances. This multivendor interoperability is what IAs like FRF.9 are all about. It should also be noted that nothing prevents the higher layer applications at the endpoints of the frame relay network from using whatever form of data compression they wish. In fact, applications often use compression techniques such as the .zip file format to carry information across a network, not just a frame relay network. In many cases, an IP packet inside a frame relay frame carries a portion of the zipped file. But FRF.9 applies not to the content of the packet inside the frame relay frame. FRF.9 applies to the entire content of the frame relay frame, including the packet if present, and that is the difference. FRF.9 defines a compression technique called the Data Compression Protocol (DCP). The same document also specifies how to encapsulate the DCP content inside a frame relay frame. 
FRF.9 also defines a default Data Compression Function Definition (DCFD) to cut down on some of the options that are available to implementers of FRF.9. FRF.9 applies only to frames having a Control field (a Q.933 Annex E frame) and does not apply to signaling or other types of control messages. Use of the Control field enables receivers to determine which DLCIs have FRF.9 frames as their basic traffic unit. FRF.9 works with both PVCs and SVCs, and also works when used in conjunction with another network such as an ATM network.

The DCP itself is divided into two sublayers, the DCP Function sublayer and the DCP Control sublayer. From the user perspective, the Function sublayer is the most important. The DCP Function sublayer performs the actual encoding and decoding (compression/decompression) of the frame contents and can use a wide variety of public and proprietary compression algorithms to do so. The DCP Control sublayer manages and coordinates the whole process. These control services include:

Identification of the various DCP contexts and exact format of the frame contents.
A form of anti-expansion protection so that messages sent on a DLCI that are not compressed will not be subjected to the decompression process at the receiver.
Negotiation of the exact form for DCP to be used on a connection, including the options to be used and the precise DCP Functions supported.
What is called synchronization of the sender and receiver so that missing frames with compressed content can be detected and resynchronization performed.
The default DCFD is described in Annex A of FRF.9. Unless the two end devices agree to do otherwise, frame relay data compression will consist of a 1 to 3 octet DCP Header without extensions. The values of the fields in the DCP Header are all given default values, and the data that follows the DCP Header can be compressed or uncompressed, as the contents dictate. The default compression protocol used is LZS-DCP (the popular Lempel-Ziv compression method from Stac, Inc. adapted for DCP use) with a few user-settable parameters.

Full FRF.9 implementation is quite complex. Those interested in the details of frame relay compression in terms of format, codings, and procedures are referred to the relevant sections of FRF.9 itself. Many frame relay networks rely more on user applications to generate compressed information, and many applications do. Web sites use zip-file formats, voice applications can use 8 kbps or even lower rate voice, and digital video has a number of more or less built-in compression techniques. But for those who want frame relay networks to be able to address compression issues directly, FRF.9 is always available.
Chapter 7: Link Management
Overview
Frame relay link management performs a crucial function on the standard interfaces defined in frame relay networks. The two standard interfaces concerned with link management are the user-network interface (UNI) and network-network interface (NNI). It is always good to remember that the inter-switch interfaces between frame relay network nodes are undefined by frame relay standards and vendors are free to explore and improvise as they wish for this network node interface. So link management remains strictly a UNI and NNI concern. The Local Management Interface (LMI) is a specific form of frame relay network link management procedures, and LMI is sometimes used as a generic term for frame relay link management. The fact is that frame relay, amidst an acronym-laden world of networks, has no convenient acronym for link management procedures on the UNI and NNI.
Introduction
There is more to networking than delivering bits representing user information from source to destination. It is easy to think only of data delivery when talking about networks, since this data delivery function is the one most dear to users’ hearts and minds. But this delivery of information must be both controlled and managed. Some of these control functions were discussed in the previous chapters on frame relay signaling (connection control, flow control, and congestion control) protocols. This chapter focuses on the management nature of network architectures, in particular the management of the UNI and NNI. Each standard interface, UNI or NNI, has its own unique set of requirements in terms of link management operation. The link management operations and functions for the UNI are fully treated in this chapter. Most of the link management operations and functions for the NNI are discussed in this chapter as well, but some of the details are left until the next chapter, which focuses on the NNI itself. The scope of the link management procedures is shown in Figure 7.1.
Figure 7.1 The scope of the link management procedures. The figure shows that link management procedures run on the frame relay UNI and NNI. Link management does not, however, run between ports of frame relay switches within the network cloud. Vendors are more or less free to use their own vendor-specific procedures between frame relay switches. This lack of a frame relay standard between switches has given rise to what the Frame Relay Forum calls cell-based frame relay where the frame relay switches are linked inside the cloud by an ATM network based on ATM standards and specifications. Because ATM has a fuller and more complete set of service classes for delivering QoS to applications than frame relay standards define, there is some advantage to linking frame relay switches over an ATM network. So the lack of a standard link management procedure between frame relay switches does not imply that a frame relay network is not manageable internally. Far from it. It is just that the network internally uses more equipment-oriented procedures that are linked either directly or indirectly to the service provider’s operations, administration, maintenance, and provisioning (OAM&P) hardware and software for the purposes of managing the frame relay network. What exactly is link management for? Some books and articles make it seem that link management is simply used as a way for FRADs to discover that the UNI into the frame relay network has failed, even if the link is not currently active. But just thinking about the UNI for a few minutes will lead to the realization that there must be more to link management than just that. And indeed there is. After all, the frame relay UNI is a synchronous bidirectional communications link like an ISDN digital subscriber line (DSL) or leased private line running SNA. All synchronous links constantly stream idle patterns (technically, “interframe fill”) of bits back and forth over the link when they are not sending live traffic frames. 
If the patterns disappear, even when there is no live traffic, then both ends of the link immediately know that the link has failed. If this works in other networks, why does frame relay need more?
The simple answer is that frame relay needs more than simple idle or fill pattern disappearance to indicate UNI link failure because frame relay does more than just provide a passive end-to-end bit pipe for users. Users on a frame relay network are at the ends of two separate local UNIs that have no direct access to how things are inside the frame relay network cloud. Perhaps a simple analogy will make this increased need for more information in frame relay networks evident.
After a hard day’s work and a satisfying dinner, some people like to sit in front of their television set and watch their favorite cable TV channel (others, strange as it seems, prefer to read and/or write). But sometimes the picture on the channel just disappears, replaced by a blank screen. Now the cable TV user is faced with a number of questions that need answers before an alternative form of relaxation is considered, selected, and pursued. Some of these questions, but probably not all, are mentioned here.
Is the effect local to the channel connection itself? That is, if the channel is changed, are there pictures on any other channels? If there are many channels, this might take a while to determine with absolute confidence.
Is the effect local to the cable TV connection? That is, are neighbors’ cable TVs affected or not? This might involve making telephone calls to several neighbors (assuming the telephone service provider is not the cable TV company!), which might prove nothing if the neighbor says “I don’t have (or I’m not watching) cable TV.”
How long will the channel/system be unavailable? That is, did a lightning strike temporarily cause the outage on a small branch of the network or did a tornado take out the central head-end site entirely for what will be an indeterminate period?
This example list could easily be extended, but the point has been made.
Just realizing that a network link is down is not the same as knowing the extent, cause, severity, and probable duration of the outage. The proper response might vary based on the actual value of one or more of these variables. One set of values leads to reading a book for a change, while another leads to the immediate cutover to a satellite dish and system. A FRAD at the end of a frame relay UNI always knows exactly when the link has failed. The interframe fill pattern present before is now gone. But what if a remote UNI has failed? Then some PVCs will continue to carry traffic while others cannot. What if it is the frame relay switch that has suddenly failed? Then all UNIs might not have their expected bit patterns in one direction, but how is the FRAD to determine this? How widespread is the failure? And so on. Again, the point is that whether the FRAD should employ a dialup ISDN link to the frame relay switch or perform some total disaster recovery procedure depends entirely on the answer to these questions. But usually only the network has access to the information that the FRAD needs to determine the proper course of action. There must be some standard mechanism defined in frame relay to allow the FRAD to query the frame relay network for the types of management information needed by the FRAD to allow the FRAD and the users supported by the FRAD to make informed decisions when considering and choosing alternative courses of action. This is what the link management procedures do.
Managing the Links
In spite of its name and usage, link management is not really used for frame relay network management. This means that no one would ever think that it should or could be used to manage a frame relay network. Full network management protocols include ways and means for network personnel to perform activities such as initialization, configuration, troubleshooting, and so on for the network in question. This is not to say that the link management procedures are not helpful to network management personnel when they are about their tasks. But link management in and of itself is not the frame relay version of something like the Internet protocol suite’s Simple Network Management Protocol (SNMP) used for devices that run TCP/IP. Instead, link management is more of a basis for information that can be made available to network managers by way of SNMP. This theme will be discussed more fully at the end of the chapter. First it is necessary to explore the link management procedures themselves in more detail.
Link management is so important in frame relay networks that there are not one or two, but three, organizations that have established specifications as to how frame relay user devices (FRADs) and the frame relay network should exchange link management information. In actual practice, there are only two link management protocols, however, and there is never any concern about needing to run all three at the same time. Why should there be three in the first place? It is mainly because the three organizations worked on their versions of link management independently and at different, but overlapping, times. All of this might seem somewhat mysterious, but it happens all the time in standards groups and vendor consortiums. Frame relay link management procedures have had an interesting history.
The initial experiments with frame relay networks included no link management procedures at all or proposed performing link management in a variety of proprietary and mutually exclusive ways. At first glance, it might seem strange that a standard network scheme based on ITU-T standards would not include link management procedures. There was a good reason for that apparent oversight, however. The ITU-T saw frame relay as a part of an overall ISDN, as has been previously mentioned. Frame relay was basically the new and improved version of the X.25 packet switching included as part of an ISDN. So a lot of the link management types of information needed for network management could be bundled with the overall ISDN procedures. All non-information data units to the ITU-T are all used for “signaling,” so link management procedures were simply other types of signaling messages. The close relationship envisioned between frame relay and ISDN posed somewhat of a problem for early implementers interested in frame relay. X.25 networks could be built and used apart from ISDN. Why could not frame relay networks also be built and used apart from ISDN? But then how could adequate link management information be provided specifically for a frame relay network without the benefit of having an ISDN to rely on? X.25 procedures would not just port over to the frame relay environment because all of Layer 3 was essentially absent on frame relay devices. Therefore, it was clear that straight ISDN procedures could not be used on frame relay networks. As a result, a group of equipment vendors got together and decided to create their own link management procedures to use until the standards organizations such as ANSI (in the United States) and the ITU-T (then the CCITT) came up with their own ways to perform link management procedures without the presence of an ISDN. 
The two groups cooperated about as well as could be expected, given the scope of their task, but on one point neither was willing to give an inch. This sticking point involved the choice of DLCI used to convey the link management information.
The ITU-T, in line with the philosophy of “all non-user stuff is signaling,” wanted link management to use DLCI 0. Traditionally, signaling always used the lowest connection or device number available, and there was nothing lower than DLCI 0 for this task. On the other hand, ANSI saw the link management procedures as more closely aligned with network management and troubleshooting procedures. Traditionally, network management and exception reporting is done on the highest connection or device number available. To ANSI, signaling in the form of call control belonged on DLCI 0, but link management belonged on some other, higher-numbered DLCI. In most implementations of frame relay, there is no connection number higher than ten 1 bits, or DLCI 1023. The debate over which DLCI to use for link management was spirited, to say the least. But this debate was hardly a matter of intense importance to the equipment vendors interested in manufacturing frame relay equipment before the new millennium.
Four of these early frame relay equipment manufacturers got together and formed the Frame Relay Implementers’ Forum (FRIF). The companies were router vendor Cisco Systems (sometimes, and more properly, seen as lowercase “cisco”), computer maker Digital Equipment Corporation (DEC, part of Compaq), switch vendor Northern Telecom (now Nortel), and packet voice pioneer StrataCom (now part of cisco). All in all, they were a mix of vendors that could build a complete frame relay network for many types of traffic among themselves. Once the group began to grow and nonimplementers such as service providers and even end users joined, the organization became known as the Frame Relay Forum (FRF). The FRF worked in terms of Implementation Agreements, or specifications, not standards or official recommendations with the force of law.
The FRF quickly decided to endorse the use of DLCI 1023 for the Local Management Interface (LMI) specification (call control signaling would still use DLCI 0, but initially the FRF just used PVCs, of course). The FRF published its version of the LMI link management procedures in September 1990. Due to the prominence of cisco in the FRF and the router industry, and the fact that 50 percent of the original Gang of Four as they were known is now only cisco, sometimes this LMI method that uses DLCI 1023 is known as cisco LMI. Although the equipment vendors decided to create the consortium LMI (yet another name for DLCI 1023 LMI procedures), this did not stop ANSI and the ITU-T from proceeding on their own. By October of 1991, ANSI had produced a link management Annex D to its frame relay specification, T1.617. Either as a result of subtle pressure from the ITU-T or because ANSI realized that the ITU-T was right all along, Annex D specified the use of DLCI 0 for link management. To complete the triumvirate, the ITU-T completed work on link management Annex A in June of 1992, naturally using DLCI 0 as well. The ITU-T recommendation for link management has much in common with the ANSI version.
So, although there are three organizations that were involved with the link management specifications, there were only two DLCIs involved: DLCI 0 (ITU-T and ANSI) and DLCI 1023 (consortium LMI or cisco LMI). This was okay because no frame relay network should ever have to try to run both the ANSI and ITU-T versions on any single UNI or NNI. The rules regarding which to use when are quite clear. If both ends of the UNI or NNI are within the borders of the United States, then ANSI Annex D is used on the UNI or NNI. If both ends are not within the borders of the United States, then ITU-T Annex A must be used. Since any individual UNI or NNI cannot belong to both classes at the same time, a UNI or NNI is said to be running either Annex D from ANSI or Annex A from ITU-T.
One could hardly wish to use DLCI 0 for both types at the same time anyway (although it is technically possible). But the same logic does not apply to the LMI from the founders of the Frame Relay Forum (hereafter just FRF LMI). This FRF LMI could be run anywhere at anytime on any UNI or NNI since the FRF LMI was only an implementation agreement and not really a standard. The FRF LMI came first, and by the time ANSI and ITU-T were ready, the FRF LMI had a huge embedded base of FRADs and switches all merrily using DLCI 1023 for link management purposes. Most vendors soon decided that in the best interests of service providers and customers alike, and to better conform to national and international standards, FRF LMI should be treated as a type of interim solution on the road to ANSI Annex D or ITU-T Annex A. Converting from FRF LMI to ANSI or ITU-T versions was usually a software upgrade, but since the FRAD was CPE and controlled by the customer, there was no easy way to coordinate or enforce the upgrade process. Older FRADs usually supported only FRF LMI. Today, most FRADs allow configuration for using either FRF LMI or ANSI/ITU-T annexes.
But while FRADs typically use only FRF LMI or ANSI/ITU-T annexes, but not both at the same time, frame relay switches must be able to detect whether a FRAD on a switch port is using FRF LMI on DLCI 1023 or the ANSI/ITU-T annexes on DLCI 0 and respond accordingly. Not so long ago, this coordination was a monumental task and had to be done by manual configuration at customer premises and at the switch. Now all frame relay switches have an auto-detection feature to configure themselves for the correct version in use on a UNI. The user can even upgrade from a FRAD using FRF LMI to a FRAD using ANSI Annex D and the switch will go right along with the change from DLCI 1023 to DLCI 0. The days of FRF LMI appear to be numbered. The Frame Relay Forum itself notes that FRF LMI is not an implementation agreement and does not even maintain a version of the LMI documentation on its Web site. So the future of the FRF LMI is uncertain, except perhaps in the original vendors’ equipment packages (e.g., cisco products), and even then mostly due to embedded base considerations. In view of this, the rest of this chapter will focus on the ANSI and ITU-T specifications, noting a few of the basics regarding FRF LMI. Table 7.1 lists some of the major differences between FRF LMI and ANSI Annex D/ITU-T Annex A. The only entry not discussed in detail so far is the fact that FRF LMI is a unidirectional or asymmetric protocol only. Both ANSI and ITU-T link management procedures are unidirectional on the UNI, but can be bidirectional or symmetric on the NNI. This means that with the FRF LMI (asymmetric), a different type of message is always sent from FRAD to network than from network to FRAD. The ANSI/ITU-T annexes, on the other hand, can be either unidirectional (asymmetric on the UNI with different messages depending on direction) or bidirectional (symmetric on the NNI with the same messages in both directions). 
In this case, symmetric means that the same type of messages go back and forth on the NNI in either direction. With asymmetric operation, the message depends on which device originates the message.
Table 7.1 Link Management Specifications for Frame Relay Networks
COMMON NAME   ORIGIN         USAGE                   DLCI   OPERATION
LMI           Gang of Four   Any UNI                 1023   Unidirectional
Annex D       ANSI T1.617    U.S. UNI/NNI            0      Uni/Bidirectional
Annex A       ITU-T Q.933    International UNI/NNI   0      Uni/Bidirectional
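The DLCI column of Table 7.1 is what lets a switch port tell the two families of link management apart. The following Python sketch shows one way that check could look. The function names are hypothetical, and the two-octet address layout (six high-order DLCI bits in the first octet, four low-order DLCI bits in the second) is an assumption based on the common default frame relay address format, not something defined in this chapter.

```python
# Illustrative sketch only; names are hypothetical, not from any standard API.

LMI_DLCI = 1023    # consortium (FRF) LMI
ANNEX_DLCI = 0     # ANSI Annex D and ITU-T Annex A


def dlci_from_address(address: bytes) -> int:
    """Extract the 10-bit DLCI from an assumed 2-octet address field.

    Assumed layout: octet 1 = 6 high DLCI bits, C/R, EA;
                    octet 2 = 4 low DLCI bits, FECN, BECN, DE, EA.
    """
    if len(address) != 2:
        raise ValueError("expected a 2-octet address field")
    high = address[0] >> 2   # upper 6 bits of the DLCI
    low = address[1] >> 4    # lower 4 bits of the DLCI
    return (high << 4) | low


def link_management_family(dlci: int) -> str:
    """Classify a DLCI according to the DLCI column of Table 7.1."""
    if dlci == LMI_DLCI:
        return "FRF LMI"
    if dlci == ANNEX_DLCI:
        return "Annex A/D"
    return "other"


if __name__ == "__main__":
    for addr in (bytes([0x00, 0x01]), bytes([0xFC, 0xF1])):
        dlci = dlci_from_address(addr)
        print(dlci, link_management_family(dlci))
```

Note that the DLCI alone cannot separate Annex A from Annex D, since both use DLCI 0.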
It is somewhat confusing that the FRF LMI specification is used in this chapter as if it were an acronym for link management interface, but there is no easy alternative acronym to use. When dealing with LMI specifically, however, especially in contrast to Annex D or Annex A link management procedures, every effort will be made to avoid confusion.
Link Management Messages
In frame relay, the link management messages are carried by LAPF unnumbered information (UI) frames, meaning these frames contain a control field in addition to and immediately following the frame relay header (address field). The link management messages always flow on either DLCI 0 or DLCI 1023, depending on whether LMI or Annex A/Annex D link management is used. It is the reception of frames on either DLCI 0 or DLCI 1023 that indicates to the receiver that the control field is present in the first place. Recall that link management messages are considered by ANSI and the ITU-T to be simply another type of signaling message that contains this control field. There are two main types of link management messages that are present in the information field of a DLCI 0 or DLCI 1023 frame. These are the Status Enquiry and Status messages, which were included as part of the full signaling message suite. There is also an Update Status message that exists in LMI, but this section will emphasize the link management messages used with ANSI Annex D or ITU-T Annex A. Figure 7.2 shows the general structure of a frame relay frame containing a link management message.
Figure 7.2 Link management message frame format. The main purpose of the link management messages is to allow the customer device, the FRAD, at the end of a UNI to determine first of all if a UNI link to the frame relay switch is up and running, and second what DLCIs are currently defined on the UNI. Note the emphasis on the customer premises device’s role in the link management process. After all, if a UNI fails at 3:00 a.m. on a Sunday morning, there is not much chance of the customer finding out that user data will not pass across the UNI until 8:00 a.m. on Monday. This is probably the worst possible time for anyone to realize that a link to a network is not available. Therefore, the link management messages allow customer equipment to detect UNI failures at any time of day or night. Variations of the basic UNI forms of link management messages are defined for use on the NNI also, but the UNI usage is presented first. The user device always sends the Status Enquiry message to the frame relay network on the UNI. The purpose of this message is fairly self-explanatory: What is the status of the UNI link and the DLCIs defined upon it? The network always responds with a Status message. This explains the asymmetric nature of the link management messages used on a UNI: Status Enquiry messages are always from users; Status messages are always from the network. Some of the Status Enquiry messages are simple keep alive messages (and are called such in LMI) that detect basic UNI connectivity. These Status Enquiry messages are called Link Integrity Verification (LIV) messages in Annex A and Annex D. There is no information about DLCI status provided by these keep alive messages, only sequence numbers to allow both ends of the UNI to determine if any Status Enquiry or Status messages have been missed. Status messages with DLCI information are called full Status messages. 
How often does a FRAD on a UNI send a Status Enquiry message to the frame relay switch and when is the response a simple LIV Status or a full Status message? This depends on the value of a timer and a counter. Both are configuration parameters that must be coordinated in both the FRAD and at the frame relay switch port, although both timer and counter have default values which are seldom changed. The timer is the Polling Interval Timer (T391) and the counter is the Polling Interval Counter (N391). The numbers do not imply that there are 390 other timers and counters in frame relay; that is simply what they are called. LMI calls them the nT1 timer and the nN1 counter, but their purpose and use is identical to their ANSI/ITU-T counterparts.
Most link management message exchanges are simple LIV exchanges to make sure the UNI is up and running. Every 10 seconds, which is the T391 default value, the user premises device (FRAD) sends a Status Enquiry message on the UNI which contains a sequence number from the FRAD and a sequence number representing the last sequence number received from the network side of the UNI. The network always responds as quickly as it can with a matching Status message which contains a sequence number from the network and a sequence number representing the last sequence number received from the FRAD. The sequence numbers increment with each exchange and eventually roll over and start again. Each end of the UNI should see each pair of numbers increase without gaps in the sequence under normal circumstances, of course. Gaps represent missing LIV messages. The T391 timer can be set in the FRAD between 5 seconds and 30 seconds, but the default of 10 seconds is seldom modified. After some number of N391 LIV messages, the default value being 6, the Status Enquiry message requests a full Status message response from the network. Once per minute with the default values, the network will send a full Status message to the FRAD on the UNI. The full Status message contains information which should reflect the status of every DLCI established on the UNI. Although there is a limit on the number of DLCIs that can be reported in a full Status message, this limit is seldom of concern in real, working frame relay networks. The N391 counter is set in the FRAD and can vary from 1 to 255, but the default of 6 is seldom modified. Taken together, the Status Enquiry and Status message pairs allow a FRAD to detect a UNI failure quickly and determine minute by minute whether a particular DLCI is available. There is a related timer in the frame relay switch, the T392 timer (how long should a switch wait for Status Enquiry message?), that must exceed the T391 timer value in the FRAD. 
The default value of the T392 timer is 15 seconds. This way the network always sees a Status Enquiry message before the T392 timer expires and the switch records a missing Status Enquiry message. If T392 did not exceed T391, the switch could record missing messages that were never due, and this could result in the UNI being declared down by the network when it is in fact still available. The T392 timer, called nT2 in LMI, can be set between 5 and 30 seconds. The frame relay switch also maintains two other counters, the N392 and N393 counters. These counters count the number of missing Status Enquiry messages (N392) and the total number of expected Status Enquiry messages (N393). In LMI, the N392 counter is the nN2 counter and the N393 counter is the nN3 counter. The default value of the N392 counter is 3, but it can be set between 1 and 10. The default value of the N393 counter is 4, but it can be set between 1 and 10 as well. An alarm is sent to the frame relay network operations center (NOC) if there are three Status Enquiry messages in error (N392 errors) out of four Status Enquiry events (N393 events). Usually, this boils down to an alarm when three consecutive expected Status Enquiry messages are missing (30 seconds with default timers).
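The interplay of these timers and counters can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation of any standard: the names `poll_schedule` and `EnquiryMonitor` are hypothetical, every N391th poll is assumed to be the one that requests a full Status, and the N392/N393 rule is modeled as a sliding window over the most recent N393 events.

```python
from collections import deque

# Default values as given in the text.
T391_DEFAULT = 10  # seconds between Status Enquiry messages (FRAD)
N391_DEFAULT = 6   # every N391th poll requests a full Status
N392_DEFAULT = 3   # error threshold (switch)
N393_DEFAULT = 4   # monitored events window (switch)


def poll_schedule(polls, t391=T391_DEFAULT, n391=N391_DEFAULT):
    """Return (time_in_seconds, kind) for the first `polls` Status Enquiries.

    Assumption: the N391th, 2*N391th, ... polls request a full Status.
    """
    schedule = []
    for i in range(1, polls + 1):
        kind = "full" if i % n391 == 0 else "LIV"
        schedule.append((i * t391, kind))
    return schedule


class EnquiryMonitor:
    """Hypothetical switch-side check: alarm on N392 errors in N393 events."""

    def __init__(self, n392=N392_DEFAULT, n393=N393_DEFAULT):
        if n392 > n393:
            raise ValueError("N392 must not exceed N393")
        self.n392 = n392
        self.events = deque(maxlen=n393)  # keeps only the last N393 events

    def record(self, received_ok):
        """Record one expected Status Enquiry; return True if alarm is due."""
        self.events.append(bool(received_ok))
        errors = sum(1 for ok in self.events if not ok)
        return errors >= self.n392


if __name__ == "__main__":
    print(poll_schedule(6))
    monitor = EnquiryMonitor()
    print([monitor.record(ok) for ok in (False, True, False, False)])
```

With the defaults, the sixth poll falls at 60 seconds, which matches the once-per-minute full Status exchange described above.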
Link Management Message Formats
The structure of the frame relay link management messages looks very much like the structure of the frame relay signaling messages described in Chapter 5. This is no accident, of course, because to the ITU-T and ANSI, link management messages are only other kinds of signaling (i.e., nonuser information-carrying) frames. This section will detail the structure of the ANSI and ITU-T Status Enquiry and Status messages. Some mention will be made of the LMI message formats, but only in passing. The overall structure of the ITU-T Q.933 Annex A and ANSI T1.617 Annex D Status Enquiry message is shown in Figure 7.3. Note the similarities between the two. Consider the ITU-T Annex A format first. The DLCI is 0 and the Control field value of 3 in hexadecimal (03h) indicates that this is an unnumbered information (UI) frame (regular signaling messages are sent as Information [I] frames that have a send and receive sequence field in the Control field). The Protocol Discriminator field which follows is still present and used to identify the exact signaling protocol used. As in all frame relay signaling messages based on Q.931, this field is set to 00001000, which is nothing more than the number 8 in hexadecimal (08h). The next octet in the Status Enquiry message header forms the Call Reference field. The first four bits of this field are always 0000 and the next four bits give the exact length of the call reference value itself. For link management messages, the Call Reference value is always exactly zero in all 8-bit positions. The all-zero Call Reference not only means that the length of the Call Reference field is just one octet and this is it; but it also serves as a Dummy (the term is used consistently in the standards) Call Reference value. The Dummy Call Reference makes sense in link management messages because the Call Reference value is used to track demand connections internally on the frame relay network.
However, there is no connection to track in a link management message exchange of Status Enquiry and Status messages, so the Dummy Call Reference value keeps the field intact but lets the network effectively ignore it.
Figure 7.3 ITU-T Annex A and ANSI Annex D Status Enquiry message. The next octet is the Message Type. For a Status Enquiry message, this field is set to 0111 0101 or 75 in hexadecimal (75h). The Status message sent in response has the message type field set to 0111 1101 (7Dh). Note that these three fields—protocol discriminator, dummy call reference, and message type—make up the signaling message header as in all Q.933 signaling messages, although the presence of the dummy call reference field makes the signaling message header for link management messages only three octets total instead of the usual five octets when used for SVC signaling.
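As a small illustration of the three-octet signaling message header just described, the Python sketch below builds and checks the header for the ITU-T and ANSI messages, using only the octet values given in the text (08h, 00h, and 75h or 7Dh). The function and constant names are hypothetical and chosen for readability.

```python
# Octet values taken from the text; names are illustrative only.
PROTOCOL_DISCRIMINATOR = 0x08  # 0000 1000, Q.931-based signaling
DUMMY_CALL_REFERENCE = 0x00    # all-zero Call Reference octet
STATUS_ENQUIRY = 0x75          # 0111 0101
STATUS = 0x7D                  # 0111 1101

MESSAGE_NAMES = {STATUS_ENQUIRY: "Status Enquiry", STATUS: "Status"}


def build_header(message_type):
    """Return the three-octet link management message header."""
    if message_type not in MESSAGE_NAMES:
        raise ValueError("not a link management message type")
    return bytes([PROTOCOL_DISCRIMINATOR, DUMMY_CALL_REFERENCE, message_type])


def parse_header(data):
    """Validate a three-octet header and return the message name."""
    if len(data) < 3:
        raise ValueError("header is three octets long")
    discriminator, call_reference, message_type = data[0], data[1], data[2]
    if discriminator != PROTOCOL_DISCRIMINATOR:
        raise ValueError("unexpected protocol discriminator")
    if call_reference != DUMMY_CALL_REFERENCE:
        raise ValueError("link management uses the dummy call reference")
    if message_type not in MESSAGE_NAMES:
        raise ValueError("unknown message type")
    return MESSAGE_NAMES[message_type]


if __name__ == "__main__":
    header = build_header(STATUS_ENQUIRY)
    print(header.hex(), parse_header(header))
```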
After the three-octet header, the Status Enquiry message consists of two Information Elements, in keeping with the whole “link management is signaling” philosophy. Both of the IEs have the familiar IE structure, which consists of a 0 bit followed by a 7-bit IE identifier in the first octet, a second octet with the length of the IE contents in octets, then the octets of the IE contents themselves. Both of the Status Enquiry IEs must always be present. The first IE in the Status Enquiry message is the report type IE, which is always three octets in length. The first octet is the IE identifier (as in all IEs) and is set to 0101 0001 (51h). The second octet is the length in octets of the report type contents (also as in all IEs), which is only one octet for this IE. The third octet is the type of report field and indicates mainly whether this is a link integrity verification (LIV) Status Enquiry message (0000 0001) or a request for a full Status message (0000 0000). There is also a type of report known as a single PVC asynchronous status type of report which can be used to request status information on a particular DLCI; however, use of this type of report is for further study and not used at this time. The second and final IE in the Status Enquiry message is the LIV IE; it is used mainly to exchange the send and receive sequence numbers used for LIV purposes. This IE is four octets long. The first octet is set to 0101 0011 (53h). The second (length) octet is set to 2, the number of content octets that follow. The final two octets carry the send sequence number (this Status Enquiry’s number) and the receive sequence number (the send number from the last Status message received from the network). Both numbers sequence from 0 to 255, then repeat. They roll over in about 40 minutes (256 values at 10 seconds each is 2,560 seconds) with the default value of one Status Enquiry every 10 seconds.
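The send and receive sequence numbers can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the class name `LivEndpoint` and its methods are hypothetical, the counters wrap from 255 back to 0 as the text describes, and a gap is detected simply by checking that the peer echoes the last number this end sent.

```python
SEQUENCE_MODULUS = 256  # numbers run 0 to 255, then repeat (per the text)


def next_sequence(n):
    """Advance an 8-bit sequence number with rollover."""
    return (n + 1) % SEQUENCE_MODULUS


def rollover_minutes(t391_seconds=10):
    """Time for the send sequence number to cycle through all values."""
    return SEQUENCE_MODULUS * t391_seconds / 60


class LivEndpoint:
    """Hypothetical model of one end of the UNI (FRAD or switch)."""

    def __init__(self):
        self.send_seq = 0        # last send sequence number used
        self.last_received = 0   # last send number seen from the peer

    def make_message(self):
        """Return the (send, receive) pair carried in the LIV IE."""
        self.send_seq = next_sequence(self.send_seq)
        return (self.send_seq, self.last_received)

    def receive(self, peer_send, peer_receive):
        """Store the peer's number; return True if no message went missing."""
        self.last_received = peer_send
        return peer_receive == self.send_seq


if __name__ == "__main__":
    frad, switch = LivEndpoint(), LivEndpoint()
    print(switch.receive(*frad.make_message()))  # Status Enquiry arrives
    print(frad.receive(*switch.make_message()))  # Status arrives
    print(round(rollover_minutes(), 1), "minutes to roll over")
```

If a Status message from the switch is lost, the next Status Enquiry from the FRAD carries a stale receive sequence number, and the switch-side `receive` call returns False.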
The only major difference between the structure of the ITU-T Annex A Status Enquiry message and the ANSI Annex D Status Enquiry message is the presence in the ANSI format of the locking shift field (considered to be an IE all its own). The locking shift field is a single octet that appears between the message type field and the report type IE, and so forms a type of header extension to the signaling message header in ANSI implementations. The locking shift field is used to indicate that the IEs that follow are coded according to the ANSI formats (called codeset 5) instead of the ITU-T Q.931 formats previously described. The locking shift octet has a structure all its own. The first bit is a 1 bit, followed by 3 bits set to 001 which is the shift identifier. The 4th bit is set to a 0 bit, which triggers the locking shift action itself. Finally, the last 3 bits indicate the codeset invoked by the shift in coding. In ANSI Annex D, this field is set to 101, which is the number 5 in decimal. So the whole locking shift field, or IE, is 1001 0101 or 95 in hexadecimal (95h). This locking shift action basically adds a 50h to (or pastes a “5” in front of) the codes in the IEs that follow. So, for example, in ANSI Annex D, the report type field is 01h instead of 51h for an LIV type of report. Also, the LIV IE is coded as 03h in ANSI instead of 53h as in ITU-T messages.
The LMI Status Enquiry message is not illustrated but is a kind of blend of the ITU-T and ANSI coding and formats. That is, the locking shift octet is not present, but the coding follows ANSI values anyway. So the LMI uses 01h in the report type field for an LIV message instead of the ITU-T coding of 51h, and so on. Of course, LMI messages flow on DLCI 1023 and not DLCI 0. Finally, the LMI sets the protocol discriminator field to 0000 1001 (09h).
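The bit layout of the locking shift octet can be checked with a few lines of Python. This sketch assembles the octet from the fields named in the text and applies the “50h” shorthand to the two IE identifiers. The function names are hypothetical, and the subtraction of 50h is only the text's rule of thumb for these two identifiers, not a general statement about codesets.

```python
SHIFT_IDENTIFIER = 0b001   # three bits following the leading 1 bit
ANSI_CODESET = 5           # codeset 5, binary 101
IE_OFFSET = 0x50           # the "50h" shorthand used in the text


def locking_shift_octet(codeset):
    """Assemble 1 | shift identifier (3 bits) | 0 (locking) | codeset (3 bits)."""
    if not 0 <= codeset <= 7:
        raise ValueError("codeset is a 3-bit value")
    return (1 << 7) | (SHIFT_IDENTIFIER << 4) | (0 << 3) | codeset


def ansi_ie_identifier(itu_identifier):
    """Map an ITU-T IE identifier to its ANSI coding per the 50h shorthand."""
    return itu_identifier - IE_OFFSET


if __name__ == "__main__":
    print(format(locking_shift_octet(ANSI_CODESET), "08b"))
    print(hex(ansi_ie_identifier(0x51)), hex(ansi_ie_identifier(0x53)))
```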
These LMI differences (i.e., no locking shift, but ANSI coding, protocol discriminator field to 0000 1001 or 09 in hexadecimal [09h]) are consistent, so no further discussion of LMI details is necessary. Those still interested in LMI details are referred to cisco’s own documentation and implementation. (Be warned, however, that there are quirks in cisco’s LMI implementation that are vendor-specific to cisco.) What response does the Status Enquiry sent from the FRAD evoke from the switch? This depends on whether the Status Enquiry is an LIV or keep alive message sent every 10 seconds (the default) or a full Status request sent every minute (the default). In both cases the response is a Status message, but the content of an LIV Status message and a full Status message are radically different in size and use. Consider the LIV Status message first.
The LIV Status message is basically the mirror image of the Status Enquiry message shown in Figure 7.3. That is, the message flows on DLCI 0, it is a UI frame, and has a signaling message header three octets in length. The biggest difference is that the message type field in the signaling message header is coded as 0111 1101 (7Dh), instead of the 0111 0101 (75h) as in the Status Enquiry message. Both the report type and LIV IEs are present, in their familiar shapes and sizes. Naturally, the send and receive sequence numbers reflect the network perspective and not the user perspective. In the ANSI version, the locking shift field (IE) is present, just as in the Status Enquiry; in the LMI version, it is not. In LMI, the message flows on DLCI 1023. LMI also sets the protocol discriminator field to 0000 1001 (09h).
The full Status message is more interesting, but more complex. It contains information about the status of each and every PVC, by DLCI, established on the UNI. (This also works on the NNI, as will be shown later.) Consider the case where a FRAD knows that the UNI is up and running, based on the success of the LIV exchanges, but frames sent on a particular DLCI receive no response. Clearly, something is wrong, but what? Before the customer initiates various troubleshooting activities, the full Status message at least provides a basis for where such a troubleshooting process could and should begin. Perhaps the situation is more of a “never mind” condition at the endpoints of the PVC rather than a network trouble or failure.
The Status Enquiry message that requests the full Status report from the network is unremarkable. The only notable difference from the formats illustrated in Figure 7.3 is that the type of report field in the report type IE is set to 0000 0000 (00h). The default counter sends this message after every 6 LIV messages, which have the type of report field in the report type IE set to 0000 0001.
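The interplay of the two defaults (one Status Enquiry every 10 seconds, a full Status request about once a minute) can be sketched as a simple polling schedule. The sketch assumes that every sixth poll is the full Status request, which is one reading of the default counter described above; the function name is hypothetical.

```python
# Sketch of the default polling cycle: LIV polls every 10 s, a full Status poll each minute.
# Assumption: every sixth Status Enquiry is the full Status request.

def poll_schedule(polls, interval_s=10, full_every=6):
    """Return (seconds_since_start, kind) for each Status Enquiry sent."""
    schedule = []
    for n in range(1, polls + 1):
        kind = "full" if n % full_every == 0 else "liv"
        schedule.append((n * interval_s, kind))
    return schedule

for when, kind in poll_schedule(12):
    print(when, kind)
```

With the defaults, the full Status requests fall at 60 and 120 seconds, and the other ten polls are LIV-only.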
The LMI format is the same and uses the ANSI coding for the IEs, as previously mentioned. The real action in the full Status message is in the response from the frame relay network to the full Status Enquiry message from the customer premises device. The overall format of the full Status message is shown in Figure 7.4, again for both the ITU-T and ANSI formats. The LMI format is mentioned later.
Figure 7.4 ITU-T Annex A and ANSI Annex D full Status message.
The familiar signaling message header is present in the network full Status message and coded with 08h (0000 1000) as protocol discriminator, 00h (0000 0000) as dummy call reference, and 7Dh (0111 1101) as the message type. The ANSI form has the locking shift. The report type field is either 51h (0101 0001) for ITU-T Annex A or 01h (0000 0001) for ANSI Annex D. In both cases, the type of report field indicates full Status or 00h (0000 0000). The LIV IE, 53h (0101 0011) for ITU-T or 03h (0000 0011) for ANSI, is also present, and still carries the send and receive sequence numbers.
The most interesting part of the full Status message follows the initial IEs. For the rest of the full Status message, up to the maximum frame relay frame size allowed on the UNI, there are simply repeated PVC Status IEs. These are usually five-octet IEs, but some forms can extend to seven or even eight octets. The purpose of these IEs is to give the FRAD at the customer end of the UNI a minute-by-minute (using the default timers and counters) update on the state of all the DLCIs the network knows about on the UNI. DLCIs are always reported in numerical order, making it easier for receivers to tell whether the status of a particular DLCI is being reported.
However, there is an apparent problem here. Most frame relay networks employ a DLCI numbering scheme from 0 to 1023, allowing about 975 user DLCIs after reserved values are subtracted. But most frame relay networks also allow a maximum frame relay information field size of 4096 octets, and some allow much smaller sizes (e.g., 1600 octets). If each full Status message uses 10 octets for signaling headers and initial IEs, then the full Status message can only report on the status of some 817 DLCIs (4086 octets at 5 octets per DLCI). So it might seem that more than 150 DLCIs might exist on a UNI that the network could never report on, since there is no provision for extending the reporting fields into a second full Status frame sent in response to a Status Enquiry.
Fortunately, this is only a problem on paper and not in the real world. No one has ever seen a UNI with anywhere near 800 DLCIs defined. And even if this were possible without worrying about congestion, it would be unlikely that all of the DLCIs would be PVCs. And SVCs, or demand connections, need not be reported in the full Status messages.
The LMI format of the full Status message again mixes some aspects of the ITU-T form (no locking shift) and the ANSI form (ANSI coding of IEs), and sets the protocol discriminator field to 0000 1001 (09h).
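The capacity arithmetic above is easy to reproduce. The following Python sketch uses only the figures quoted in the text (10 octets of header and initial IEs, 5 octets per PVC Status IE, roughly 975 user DLCIs); the function name is hypothetical.

```python
# Sketch of the "how many PVCs fit in one full Status message" arithmetic from the text.

def max_reported_pvcs(max_info_octets=4096, overhead_octets=10, pvc_ie_octets=5):
    """Whole PVC Status IEs that fit after the signaling header and initial IEs."""
    return (max_info_octets - overhead_octets) // pvc_ie_octets

USER_DLCIS = 975  # approximate figure quoted in the text

fit = max_reported_pvcs()
print(fit)                                   # 817
print(USER_DLCIS - fit)                      # 158 DLCIs that could not be reported
print(max_reported_pvcs(1600))               # 318 with a 1600-octet information field
print(max_reported_pvcs(pvc_ie_octets=8))    # 510 if every IE took the eight-octet form
```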
Oddly, the LMI has an option to add three additional octets to the five octets of the Status IE. The three octets are used to indicate the PVC bandwidth used on the DLCI reported. The PVC bandwidth is defined as the minimum bandwidth that the frame relay network has assigned to the DLCI, expressed in bits per second. This field is also an option in ANSI Annex D, but has pretty much disappeared today.
The Status IE has a structure, too, as shown in Figure 7.5. Thankfully, there are only two differences between the ITU-T Annex A IE and the ANSI Annex D version, so a separate figure for each is not used. The first difference is in the identifier code, which is 57h (0101 0111) in the ITU-T version and 07h (0000 0111) in ANSI, thanks to the locking shift. The second difference is in the use of the final bit of the last octet of the IE, which will be discussed shortly.
Figure 7.5 The Status IE.
Consider the ITU-T format of the last three octets first. The last bit in each of the three octets is the extension bit, which should be familiar from the concept of address extension (EA) bits in the frame relay header. Here, as there, the function of this bit is to indicate to the receiver whether the DLCI numbering is extended into the next octet (0) or not (1). In these three octets, the extension bit pattern is 0 1 1, meaning that the DLCI is extended into the second octet (0), but that’s all (1), and the third octet is the last of this IE (1). As with the frame relay header, the first (high-order) six bits of the DLCI are in the first octet and the last (low-order) 4 bits of the DLCI are in the second octet when 10-bit DLCIs are used. Naturally, this system allows for the Status message to report the larger DLCIs used in larger than 10-bit numbering schemes. The other four bits in the first two octets are just spare bits and must be set to 0 bits, although why they are spare and not reserved is anyone’s guess. The use of the term “spare” makes it seem as if the bits could be used if some other bits are busy doing other things at the time, but this is never the case, of course.
The third and final octet is what the Status IE is all about. The last 4 bits are more or less ignored for link management purposes. These last 4 bits are the extension bit and 3 spare 0 bits. The first 4 bits, after essentially five octets worth of labels and overhead, are the status bits themselves. The New (N) bit is set to a 1 bit if the DLCI just enumerated has not been reported on by the network before. The Delete (D) bit is set to a 1 bit if the network has deleted the DLCI just enumerated (the ITU-T says that this bit is only meaningful for the optional [and unused] single PVC report). If the D bit is set to a 1 bit, the status of the other bits in this octet is ignored. The Active (A) bit is set to a 1 bit if the DLCI just enumerated is active. This really means that the DLCI is available for normal use. If the A bit is set to a 0 bit, then the user is supposed to stop sending information on the DLCI, so effectively this is the enable/disable bit.
The only real difference between ITU-T and ANSI (and LMI) PVC Status messages is the use in ANSI Annex D (and LMI) of the last bit in the message itself, which is reserved in the ITU-T version. In ANSI (and LMI) this is the Receiver not ready (R) bit. The R bit is set when all is otherwise well on the DLCI, but senders should not send for reasons of congestion. Now, the frame relay header has the FECN and BECN bits that perform pretty much the same function, so it is easy to see why the ITU-T decided that there was little value added by use of the R bit. In fact, it could be argued that equipment might be confused if not receiving FECNs and/or BECNs, and at the same time getting R bit status indicating congestion on DLCIs every minute through link management messages.
Taken as a whole, the full suite of link management messages forms an efficient mechanism that allows FRADs to track the status of the UNI on a minute-by-minute basis.
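The way a FRAD might act on the N, D, A, and R bits can be summarized in a short Python sketch. It deliberately works on already-decoded flags rather than raw octets, so it makes no assumption about bit positions; the class, function, and result strings are hypothetical and only follow the rules stated above.

```python
# Sketch of how the status bits described in the text might be interpreted.
from dataclasses import dataclass

@dataclass
class PvcStatus:
    dlci: int
    new: bool = False                 # N bit
    delete: bool = False              # D bit
    active: bool = False              # A bit
    receiver_not_ready: bool = False  # R bit (ANSI Annex D and LMI only)

def interpret(status):
    """Apply the precedence given in the text: D overrides everything, then A, then R."""
    if status.delete:
        return "deleted: ignore the other bits"
    if not status.active:
        return "inactive: stop sending on this DLCI"
    if status.receiver_not_ready:
        return "active but congested: hold off sending"
    return "new and active" if status.new else "active"

print(interpret(PvcStatus(dlci=16, new=True, active=True)))
print(interpret(PvcStatus(dlci=17, delete=True, active=True)))
print(interpret(PvcStatus(dlci=18)))
```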
It should be noted that in all modern frame relay CPE packages, use of either ITU-T Annex A or ANSI Annex D is a configurable parameter. Some devices also support LMI, although as time goes by this is more restricted to a single vendor (cisco) and then the vendor-specific implementation of LMI. However, since link management is strictly local, the two UNIs at the ends of a PVC might have differing link management procedures configured and still function properly. In other words, a cisco LMI UNI can interoperate easily with a UNI using ANSI Annex D on the other side of the PVC. All current frame relay switches detect which link management procedure mechanism is in use and configure themselves to respond properly to the Status Enquiry message type the switch receives. The local nature of the Status messages has caused some problems in frame relay, but not with interoperability, since the various types can easily be used on different UNIs. Of course, different types cannot be used on the same UNI at the same time. Just because a new DLCI has been configured on a UNI at one side of the network does not mean the configuration process has been completed at the other UNI, nor everywhere else in between. So service providers must be careful not to set the A bit to a 1 bit until the PVC labeled by the DLCI is ready to go end-to-end through the network. Otherwise, frame flowing on the DLCI will be discarded somewhere in the network without any direct indication to the sender that this action is taking place! Use of the link management procedures described in this chapter on the frame relay NNI is deferred until the next chapter, which contains a full discussion of the frame relay NNI.
What about Switched Virtual Circuits?
A lot of discussion in earlier chapters focused on the whole PVC versus SVC issue in frame relay. Some service providers have begun supporting frame relay SVCs, so the issue is an important one. But for the purposes of this chapter, the important point about frame relay SVCs is that all of the previously described link management procedures are concerned with PVC status. However, once an SVC has been set up and given a DLCI, SVCs become indistinguishable from PVCs from both the user and network perspective. So how can PVC link management procedures be extended to SVCs?
As it turns out, the extension can be made fairly easily. The only tricky part is that frame relay SVCs are tracked internally to the network by call reference, a kind of connection identifier that supersedes the local DLCI values on each UNI and is unique in the network. PVCs, on the other hand, do not require call reference tracking because they are set up by hand and therefore employ the dummy (all-zero) call reference value. Does this SVC call reference value need to be conveyed to the end equipment along with the status of the SVC? If so, then modifications to the existing link management message formats are needed. If not, then PVC and SVC DLCIs can be mixed in a single full Status message with no problem. It will be interesting to see how SVC Status messages are handled by service providers once SVC service offerings become more common.
Customer Network Management (CNM) Networks involve more than just the transport of user data frames. The flow of data must be controlled and managed. This is what the last few chapters have mainly been about. Networks fail, and even worse, sometimes do not fail, causing more problems than a simple service outage. This not fail problem sounds odd at first, but it is a characteristic of all large, modern major network architectures, public or private. In the old days, networks failed all the time, but the problems were easy to find and correct. Today, networks hardly ever fail outright; but when they do, the problem is not always obvious and the cure might be complex. Network management must be able to manage networks of all shapes and sizes. Frame relay networks are complicated by the fact that users cannot see and control the entire link end-to-end, as users usually could with private, leased-line networks. So early frame relay network service providers had to deal with the perception that if a customer bought and used frame relay, the customer’s network management center would still be able to control and troubleshoot the frame relay network in spite of the network being public and appearing to the user as a collection of local UNIs. The control issue was handled by making the PVC process simple and efficient. Usually, service providers can configure new PVCs in 24 hours, or even less. The troubleshooting issue was more touchy. After all, an argument could be made that if the public network service provider is doing its job correctly, why would customers even have to concern themselves with troubleshooting in the first place? Troubleshooting is the job of the service provider’s network operations center, not the customer’s. Of course, if the troubleshooting process were that good, then customers migrating from private line networks to frame relay could just disband the NOC and go home. Naturally, nothing is that simple. 
In most cases, frame relay networks are a combination of public and private components. The FRAD is typically a router owned and operated by the customer. The frame relay switch is firmly under the control of the service provider. In many frame relay networks, the UNI might be provided by another service provider altogether. Yet all must work together not only when the network is running well, but when the network is not doing what it is supposed to. This is the whole idea behind Customer Network Management (CNM) features offered by most frame relay service providers: The service provider will make it easier for the customer’s network operations personnel to manage the public portions of the frame relay network.
Troubleshooting and Frame Relay
CNM services can even extend to troubleshooting. A lot of network management activity in frame relay involves finding out where the trouble lies: in the network cloud, on the UNI itself, or in the FRAD. Usually, this is a three-step process that follows these steps:
1. Is the FRAD (router) connected to the DSU/CSU? This is the network equivalent of the “is it plugged in to the electrical socket?” troubleshooting step in the PC and LAN world. But it is still a good place to start.
2. Is the FRAD (router) exchanging Status Enquiry and Status messages with the frame relay switch? If not, this is the sign that the UNI itself has failed. Note that both UNIs must be checked in this regard.
3. Are all of the DLCIs active? A UNI outage affects all DLCIs, of course. But if a complaint is about a particular DLCI, the full Status message should reveal any problems. Again, both UNIs need to be checked.
Assuming all is well, or appears well, up to this point, a couple of further steps can be taken to verify end-to-end connectivity through the frame relay network. Usually, the steps are:
1. Try to ping the remote router (FRAD) if the router and network support TCP/IP, which they usually do. Ping is just a simple TCP/IP control message that is sent back by the target device. Typically, a whole stream of pings is sent, and the disappearance of a large number of responses is a sign of network congestion. A word of caution about using ping today: Many routers will filter out pings because of well-documented denial-of-service threats to TCP/IP routers (just the right number and type of pings can bring the router to its knees quickly). So just not being able to ping a router may not always be a sign that something is wrong: Perhaps everything is right.
2. Try to ping end-to-end, from user device to user device. In a perfect world, this step would be tried first.
Unfortunately, users will either not bother before calling network operations or claim they did anyway, but it failed. But that does not mean network operations, following a simple try this, try that approach, should not finish up a no trouble found process with this test. The problem could be in the end device itself. Or the trouble could be just intermittent enough to disappear while the previous steps are being followed. (No one ever claimed network management was easy.) Each step involves a number of subactivities that focus on one aspect or another of the overall problem. Of course, if the FRAD is not a router, other methods must be used. Most CNM offerings from frame relay service providers do not totally outsource troubleshooting activities to the service provider, except for small customers. Generally, if the customer already has a network operations group, that group will continue its role as troubleshooting focal point for users. In that case, the CNM service offered will emphasize frame relay performance statistics.
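The ordered checks described in this section amount to a simple “stop at the first failure” procedure, which the following Python sketch illustrates. The check names and the function are hypothetical; in practice each result would come from the router, the DSU/CSU, or the link management messages.

```python
# Sketch of the troubleshooting order described in this section.
# The names below are illustrative labels, not commands or product features.
CHECKS = [
    "FRAD connected to DSU/CSU",
    "Status Enquiry/Status exchange with the switch",
    "all DLCIs active in the full Status message",
    "ping to the remote router",
    "ping end-to-end between user devices",
]

def first_failure(results):
    """Return the first check (in order) that did not pass, or None if all passed.
    A failed ping is only a hint: routers may filter pings on purpose."""
    for name in CHECKS:
        if not results.get(name, False):
            return name
    return None

results = {name: True for name in CHECKS}
results["all DLCIs active in the full Status message"] = False
print(first_failure(results))
```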
Network Performance As important as troubleshooting is, there is more to network management than just find-it-and-fix-it. The overall health of the whole network needs to be assessed periodically as well. Is a given CIR adequate on a particular DLCI? Is there significant congestion on the frame relay network? What are the busiest periods on the network? All of these statistics should be gathered by the network operations center also, perhaps not by the same people fielding calls from irate users, but by some group at the customer’s network operations center (NOC). Customers can use their own methods to gather statistics about the frame relay network performance on their own, of course. Several packages exist, priced from about $5,000 to $20,000, excluding hardware. All will provide customers with service-level verification (is the network available when no one is using it?) and performance statistics (delay, delay variation, etc.). CNM marketing efforts focus on a couple of facts no one can deny. First of all, the frame relay service provider has experience with frame relay that a new customer does not have, by definition. Why train personnel in frame relay when the service provider has plenty? Second, a new customer has little to no hardware or software in their NOC that is specifically for frame relay network operation, again by definition. Why spend $20,000 when a small, monthly expenditure can provide the same information? These two simple but powerful facts combine to have the effect that few frame relay network services are sold without any CNM providing some form of enhanced (beyond basic uptime/downtime ratios) performance statistics. Generally, frame relay customers can get information in any one of three main ways from frame relay service providers. In all cases, these reports apply only to the customer’s PVCs and SVCs (if supported) on the public frame relay network. In fact, more than one method can be used in combination. 
The three main methods are:
1. The customer receives periodic written reports from the frame relay service provider. This is the original way, but has the severe drawback of delay. On Thursday, who cares why the network was slow on Monday? The information is needed faster. Yet, several major frame relay service providers only offer this type of information to smaller customers.
2. The customer is given online access to the service provider’s report engine. This used to be an option offered only to larger customers, but no longer. A favorite method was to install a simple PC with dialup access to the service provider’s NOC. When the customer’s network operation staff took a call or needed to generate written reports of their own, they could simply dial in and access the information they needed. Even troubleshooting could be done this way in some cases. The information available to the customer was typically updated daily.
3. The customer uses an ordinary Web browser to access a secure Web site with the service provider’s reports. This option has caused a revolution in CNM circles. Implementation is simple and inexpensive for the service provider (who cannot afford a new Web site?) and trivial for the customer (who does not have a Web browser?). No separate link or port is needed, just Internet access. Security is provided by standard TCP/IP methods, which while not state-of-the-art, can be made effective.
Naturally, the most popular option today is the Web-based method. Information is usually updated every 15 minutes, practically real-time compared to other methods. Information on busy day, busy hour, traffic peaks, and traffic variations is typically gathered. The overall architecture of the Web method is shown in Figure 7.6. Note that access to the Web site need not necessarily be through the frame relay network itself, although this is of course possible. Also note that if FRAD information is included, the customer must allow access to this CPE device.
Figure 7.6 Frame relay statistics on the Web.
Service Level Agreements
How often can a network fail before the network is unusable? How much delay is too long? How long should it take to return a failed UNI to service after it is down? Managing a network involves formulating answers for these and many more questions just like them. None of these questions have right or wrong answers, naturally. If a network carrying mostly e-mail is out of service for an hour, some might not even notice. But a system performing credit card validations out of service for an hour at a retail store could cause howls of protest. The same is true with end-to-end delay through the network. All networks, and frame relay is no exception, have varying service qualities from time to time as traffic comes and goes, links fail and are restored, and so forth. So it is important for the service provider to guarantee that the frame relay network supports all of the user applications run across the frame relay network adequately.
This is where the idea of a Service Level Agreement (SLA) enters. SLAs are essentially promises on the part of the overall frame relay service customer to each major user group that the level of service provided by the network will be within the bounds established by the SLA. The customer gets its own quality of service (QoS) from the frame relay service provider, of course. Typically, a frame relay tariff or contract will specify such QoS parameters as overall network availability (99.995%, or about 26.28 minutes per year of downtime, links restored in less than 4 hours), bit error rate (BER) (10⁻¹¹, or 1 bit in every 100 billion in error), block error rate (error rate on the frame, usually figured as the maximum size of the frame in octets times 8 bits per octet times the BER), the burst capacity (usually up to the UNI rate), delay (usually a flat upper bound of less than 40 milliseconds), and information loss (99.999% delivery at CIR, or only 1 in 100,000 frames with DE = 0 discarded).
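The QoS figures quoted above are simple arithmetic and can be reproduced with a short Python sketch. The function names are hypothetical, and the 4096-octet frame size in the example is an illustrative choice rather than a value taken from any tariff.

```python
# Sketch of the SLA arithmetic quoted in the text.
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes in a non-leap year

def downtime_minutes_per_year(availability_percent):
    """Yearly downtime implied by an availability percentage."""
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR

def block_error_rate(max_frame_octets, bit_error_rate):
    """Block (frame) error rate as the text figures it: octets x 8 bits x BER."""
    return max_frame_octets * 8 * bit_error_rate

def expected_discards(frames_offered, delivery_percent):
    """Frames expected to be lost for a given delivery percentage."""
    return frames_offered * (100.0 - delivery_percent) / 100.0

print(downtime_minutes_per_year(99.995))   # about 26.28 minutes
print(block_error_rate(4096, 1e-11))       # roughly 3.3 frames in every 10 million
print(expected_discards(100_000, 99.999))  # about 1 frame
```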
There are variations within all of these categories, of course, and some major frame relay service providers compute delay along the lines of 1 millisecond per 100 route (cable) miles plus 0.5 milliseconds delay per frame relay switch. While nice in concept, few potential customers have any idea what their route actually is or how many switches there are between source and destination UNIs. Other service providers distinguish between user availability and overall network availability, which acknowledges that although the frame relay network as a whole might experience less than half an hour downtime in a year, individual users might have more extended outages, especially if the UNI is from another service provider altogether. The QoS parameters offered by frame relay service providers generally fall into the following categories:
Bandwidth (the CIR, burst capacity)
Delay (upper bound, but no limit on delay variation)
Information loss (bit errors, block/frame errors, delivery rate at CIR)
Reliability (availability, restoral times)
These four items are fairly comprehensive as far as network QoS goes. Only delay variation (jitter) and security are not really addressed. However, PVCs form a kind of closed user group that affords some measure of security for virtual private networks (VPNs) and the like. As in all cases like these, if the network does not provide adequate QoS for the application, then each and every application must address the shortcoming if the user is to make the application work on the network. For example, users must add jitter buffers to the end equipment on frame relay networks if stable delays are needed for the application (e.g., voice). Usually, the SLA takes these overall performance statistics, and parcels them out to the individual users and departments using the network. However, the SLA concept has also been extended to the relationship between customer and service provider. There are some issues associated with SLAs before they become common as part of frame relay services, however. First, service guarantees above and beyond what the network gives users are only available at a premium cost. Delays less than 40 milliseconds may be had, for instance, but only by routing PVCs carefully. Second, there are no standard terms and metrics for SLAs at all. Is it bit errors or block errors? Network availability or PVC availability? And so on. Third, given the lack of standard definitions, direct comparisons between service providers is difficult. Is 99 percent of frames with less than 40 milliseconds delay better than an absolute upper delay bound of 45 milliseconds? For all traffic types? Finally, it is hard to determine exactly when an SLA has been violated, especially in terms of individual frame delay and disappearing traffic. The Frame Relay Forum has developed standard SLA performance requirements for frame relay networks. The ITU-T has done some work in this area already. 
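Two of the delay questions raised in this section can be made concrete with a small Python sketch: the route-based delay rule (1 millisecond per 100 route miles plus 0.5 milliseconds per switch) and the comparison between a percentile-style guarantee and an absolute bound. The function names and the sample delay values are hypothetical.

```python
# Sketch of the delay rule and the two styles of delay guarantee discussed in this section.

def estimated_delay_ms(route_miles, switches, ms_per_100_miles=1.0, ms_per_switch=0.5):
    """Delay estimate using the per-mile and per-switch rule quoted in the text."""
    return route_miles / 100.0 * ms_per_100_miles + switches * ms_per_switch

def meets_percentile(delays_ms, bound_ms, percent):
    """True if at least `percent` of the frames were delivered within bound_ms."""
    within = sum(1 for d in delays_ms if d < bound_ms)
    return within * 100 >= percent * len(delays_ms)

def meets_absolute(delays_ms, bound_ms):
    """True if no frame exceeded bound_ms."""
    return max(delays_ms) <= bound_ms

print(estimated_delay_ms(1000, 4))        # 12.0 ms for 1000 route miles and 4 switches

sample = [30] * 99 + [60]                 # made-up measurements, in milliseconds
print(meets_percentile(sample, 40, 99))   # True: 99 of 100 frames under 40 ms
print(meets_absolute(sample, 45))         # False: one frame took 60 ms
```

The same set of measurements passes one form of guarantee and fails the other, which is exactly why direct comparisons between service providers are hard without standard definitions.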
Eventually, SLAs will become the basis for a method for creating traffic priorities and service classes on frame relay, much like in ATM networks. For now, SLAs remain more or less a contract bargaining chip in pricing negotiations. They usually define penalties in the form of rebates on monthly bills. A full day’s credit for a one-hour outage or a full week’s credit for a four-hour outage are not uncommon terms. Some SLAs involve a form of disaster recovery for the network, but this usually only helps to avoid failures of a few UNIs or trunks between frame relay switches. One major frame relay service provider has three levels of disaster recovery service available as follows:
UNI protection. For a one-time fee of several hundred dollars, plus a charge of several thousand dollars if the other circuit is activated, the customer gets a second backup UNI to use if the primary UNI fails. But the second UNI still leads to the service provider’s frame relay switch.
Backup PVCs. For a one-time fee of less than one hundred dollars, plus a charge of several thousand dollars if the secondary is activated, each customer PVC and the CIR associated with it is mapped to a secondary route through the network. There is a small monthly maintenance fee for this service.
Dynamic PVCs. This option is the same as the second and the costs are essentially the same, although the monthly charges can rise quickly. The difference is that the backup PVC, which presumably kicks in when severe congestion occurs, has a larger CIR than the primary one. The philosophy is that the larger CIR will handle the bursts that caused the problems on the first circuit.
The whole subject of frame relay SLAs, and disaster recovery options, will continue to be a topic of considerable interest to frame relay service providers, customers, and users.
The Service Level Definitions IA (FRF.13)
The popularity of frame relay service means that frame relay is available from telephone companies (both local and long distance), ISPs, and even other types of companies. The diversity of service providers has made it difficult to assess the quality of the frame relay delivered, whether contracts are honored at all times, and even how one frame relay service compares with another. So the Frame Relay Forum has established FRF.13, the service level definitions IA, to define a series of “transfer parameters” that can be used to “plan, describe, and evaluate frame relay products and offerings.”
FRF.13 makes use of many of the parameters that are established in the ITU-T’s X.144 recommendation (User Information Transfer Performance Parameters for Data Networks Providing International Frame Relay PVC Service). But of course the FRF.13 guidelines apply to any frame relay PVCs, not just international ones. FRF.13 is mainly concerned with how the parameters should be defined, not so much with how they are used to compare or measure frame relay service, although there are elements of the latter.
The parameters in FRF.13 are used to measure four main service elements: frame transfer delay, frame delivery ratio, data delivery ratio (not quite the same thing), and service availability. In other words, they cover the frame relay network delay, the effects of errors, and the reliability of the network. A frame relay service provider or equipment vendor basically has to agree to define these parameters in the way that FRF.13 suggests and agree to use them according to the definition whenever the parameters are mentioned.
There are 10 parameters defined in FRF.13. Each is assigned to one of the four service elements as follows:
1. Frame transfer delay:
Frame Transfer Delay (FTD). The time between a defined frame entry event and a defined frame exit event.
2. Frame delivery ratio:
Frame Delivery Ratio (FDR). The ratio between the total frames delivered by the network and the total frames offered to the network.
Frame Delivery Ratio within CIR (FDRC). The ratio between the total frames that conform to the CIR delivered by the network and the total frames that conform to the CIR offered to the network.
Frame Delivery Ratio above CIR (FDRE). The ratio between the total frames in excess of the CIR delivered by the network and the total frames in excess of the CIR offered to the network.
3. Data delivery ratio:
Data Delivery Ratio (DDR). The ratio between the total payload octets delivered by the network and the total payload octets offered to the network.
Data Delivery Ratio within CIR (DDRC). The ratio between the total payload octets in frames that conform to the CIR delivered by the network and the total payload octets in frames that conform to the CIR offered to the network.
Data Delivery Ratio above CIR (DDRE). The ratio between the total payload octets in frames in excess of the CIR delivered by the network and the total payload octets in frames in excess of the CIR offered to the network.
(Note that the difference between frame ratios and data ratios involves whether the frame contains user octets as payload or not. Missing SE frames count as missing frames, but not as missing data.)
4. Service availability:
Frame Relay Virtual Connection Availability (FRVCA). A formula that allows excluded outage time (time the network is not available due to scheduled maintenance or failures beyond the control of the frame relay network) to be subtracted from the calculation. Only outage time directly due to faults in the network is tracked by the service availability parameters.
Frame Relay Mean Time to Repair (FRMTTR). A simple ratio between the outage time, as defined above, and the number of outages in the measurement period. If there are no outages, FRMTTR = 0 for the interval.
Frame Relay Mean Time Between Service Outages (FRMTBSO). The ratio between the measurement interval, less excluded outage time and outage time as defined above, and the number of outages in the measurement period. If there are no outages, FRMTBSO = 0 for the interval.
For the purposes of calculation, time intervals are measured in milliseconds for delay and in minutes otherwise. Delay is measured using a standard 128-octet frame payload, unless customer and service provider agree otherwise.
FRF.13 establishes a standard, basic frame relay network model of two access circuit sections (UNIs) and the access network section (cloud) in between. More elaborate NNI situations are also defined. Several standard measurement reference points are established in FRF.13 (e.g., Egress Queue Input Reference Point (EqiRP)). There is a section on how hybrid private and public frame relay networks are to be treated (the private network looks like a UNI to the public network, which is now measuring public “edge-to-edge” service levels as well as public-private “end-to-end” levels). The actual mechanism for implementing FRF.13 is not discussed in the document.
While some things are open to negotiation in FRF.13, others are not. When it comes to delay, for example, customer and service provider are free to have an SLA that determines the “object dimension” (delay per PVC?, per UNI?, both?, more?, etc.) and “time dimension” (day?, week?, business day week?, 7 day week?, etc.) over which delay is calculated. On the other hand, the SLA must describe at least:
The measurement domain (objects and times)
Applicable reference points (UNIs and NNIs, and so on)
Delay measurement mechanism
Identification of connections subject to the delay measurement
Measurement frequency
Frame size used
Information about all four of the characteristic service level areas can be aggregated in the form of a report, although FRF.13 itself only says that the information is “reported.” Some possible formats are given in an Annex to FRF.13. For example, a delay report might measure each connection once every 15 minutes, 24 hours a day, for 30 days. There is a lot of flexibility here.
Only a few more points are needed about FRF.13. The intent is to take much of the guesswork out of SLAs and out of comparing frame relay services and service providers wherever they are in the world. However, FRF.13 is just a step in this direction, although a giant one. For example, although FRF.13 references X.144, the FRF.13 definition of an Outage Count is not the same as X.144’s. But it will surely matter whether a frame relay service provider complies with FRF.13 or not.
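The ratio and availability parameters described in this section reduce to simple arithmetic, and a small sketch may make the definitions easier to compare. What follows is only an illustration under stated assumptions: the class and function names are hypothetical, the FRVCA calculation assumes that excluded outage time is removed from the measurement interval before the percentage is taken, and returning no value when nothing was offered is a choice made here rather than something taken from FRF.13.

```python
from dataclasses import dataclass
from typing import Optional


def delivery_ratio(delivered: int, offered: int) -> Optional[float]:
    """FDR- or DDR-style ratio: units delivered / units offered.

    Pass frame counts for a frame ratio, payload octet counts for a
    data ratio. Returns None when nothing was offered (our choice).
    """
    if offered == 0:
        return None
    return delivered / offered


@dataclass
class AvailabilityInterval:
    """Outage bookkeeping for one measurement interval, in minutes."""
    interval_min: float
    excluded_outage_min: float  # scheduled maintenance, outside failures
    outage_min: float           # outages directly due to network faults
    outage_count: int

    def frvca_percent(self) -> float:
        # Assumed form: excluded time is subtracted from the interval first.
        measured = self.interval_min - self.excluded_outage_min
        return 100.0 * (measured - self.outage_min) / measured

    def frmttr_min(self) -> float:
        # Outage time per outage; defined as 0 when there were no outages.
        if self.outage_count == 0:
            return 0.0
        return self.outage_min / self.outage_count

    def frmtbso_min(self) -> float:
        # Remaining in-service time per outage; 0 when there were no outages.
        if self.outage_count == 0:
            return 0.0
        in_service = (self.interval_min - self.excluded_outage_min
                      - self.outage_min)
        return in_service / self.outage_count


def samples_per_report(period_min: int, days: int) -> int:
    """Delay samples per connection for an around-the-clock report."""
    return (24 * 60 // period_min) * days


if __name__ == "__main__":
    # A 30-day month with 120 excluded minutes and two outages of 60
    # minutes in total (made-up figures for illustration).
    month = AvailabilityInterval(30 * 24 * 60, 120, 60, 2)
    print(month.frvca_percent(), month.frmttr_min(), month.frmtbso_min())
    print(samples_per_report(15, 30))
```

With the made-up figures in the demo, the two outages average 30 minutes to repair, and the 15-minute delay report mentioned in the text works out to 2,880 samples per connection over 30 days.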
Simple Network Management Protocol, the Management Information Base, and Frame Relay
Network management is a crucial part of any network. But where do the network management hardware and software get the raw material (the information) needed to present a coherent picture of “how the network is doing” to the network operations personnel? In frame relay networks, as in most networks today from the Internet to SNA, the answer is with the Simple Network Management Protocol (SNMP), the related Management Information Base (MIB), and the network management software running in the operations workstations.
SNMP has its roots in the TCP/IP protocol suite and the Internet. Introduced in 1989 to manage routers (then called gateways) on the Internet, SNMP is now the industry standard for network management, not only for routers, but for LAN hubs, modems, multiplexers, FRADs, ATM switches, and even end-user computers. One reason for this is SNMP’s profound simplicity. The original version is now known as SNMPv1. Today, whenever the term SNMP is used alone, it is understood to indicate SNMPv1. The newer version, SNMPv2, was standardized in 1993 but has only slowly appeared. SNMPv2 is much more complex than SNMPv1, but much better suited to managing complex and large networks with many LANs, routers, hubs, modems, and so on. Other enhancements such as better authentication and encryption were part of SNMPv2, but are now an option in SNMPv3.
SNMP is built on the client/server model. In SNMP, the client process is the central network management software running on a system in a network operations center (usually abbreviated NOC). The server process runs in each and every SNMP-manageable device on the network, which need not be a TCP/IP network. The network can support and/or use any network protocol at all. TCP/IP is only needed for SNMP communications with the managed device.
Many network devices today, such as frame relay switches, employ TCP/IP only as a vehicle for SNMP network management. The SNMP network management application is run on one or more central network management workstations. Typically, the network management software asks the SNMP server running in the network device, known as the agent, to supply some piece of information about itself. The agent replies with the current value of that piece of information. There is only one exception to this “ask me and I’ll tell you” mode of operation. SNMP defines a set of alarm conditions known as traps on the managed device. The alarm is a special message sent to the network management client software without the managed device waiting to be polled.
There is a standard database of information kept in each SNMP-managed device. This network management information database is technically known in SNMP as a set of objects. But most people refer to this database of information about a managed device as a Management Information Base, although strictly speaking the MIB is only a description of this database’s structure. The MIB is really a piece of paper that says things like: “The first field is an integer and represents the number of frames processed, the second field is 20 characters long and represents the manufacturer of the device,” and so on.
However, once a MIB is implemented (written, compiled, and linked like any other program) and installed in a managed device, the MIB fields (objects) take on current, valid values (926, acme.com, etc.). Note that the agent is only able to access current values in the MIB. Any historical network management information must be kept on the network management workstation. This is a way of keeping the size of the MIB to a minimum in the managed device. The whole idea of SNMP and MIBs is shown in Figure 7.7.
Figure 7.7 SNMP and the MIB.
There are four main steps shown in Figure 7.7 before any network management software package can know anything for certain about the state of the network. First, the network management software package must send an SNMP message to the managed network device, based on its IP network address. Some network devices have IP addresses solely for SNMP purposes. The SNMP poll can be generated automatically and periodically by the network management software, or the message can be generated by a point-and-click on the part of the network operations staff. Second, the agent software in the device accesses the database of managed objects, defined by the MIB, and returns the current value of the database field in another SNMP message. Third, the network management software receives the value, but at this point it knows nothing more than a number. Finally, only by comparing the received value with some historical information kept in a database, usually on the network management station itself, can the raw information be made meaningful. In this case, the fact that the SNMP poll an hour earlier recorded the value “6” is used to realize that an additional two bad frames have been logged by the network device in the past hour. This example, although quite simple, is basically the way that SNMP operates most of the time. Occasionally, alarms generated by SNMP traps are sent to the network management station without waiting for a poll cycle.
There are two main types of MIB defined in frame relay. The first is the MIB found in the frame relay FRAD, or frame relay Data Terminal Equipment (DTE) to the MIB writers. The second MIB type is the one found in the frame relay network itself. Both are discussed in some detail in the following paragraphs. The customer’s network operations staff always has access to the local FRAD’s MIB and might also have access to the MIBs in FRADs all over the network.
The service provider’s network operations center has access to the MIB in the frame relay network itself, and sometimes to the FRAD MIBs as well (depending on the degree of network management provided). Seldom, if ever, does a customer have access to the frame relay network’s MIB, although this is slowly becoming a feature in premium frame relay service in the network management area. The general idea of the use of the frame relay MIB by both customer and service provider network management software is shown in Figure 7.8.
Figure 7.8 The frame relay MIBs.
Figure 7.8 shows three UNIs attached to FRADs, all using the frame relay MIB. The LANs attached to each FRAD (the FRADs are often routers anyway) have at least one, and maybe more, stations running some form of SNMP-based network management software. The network management software is often based on HP’s OpenView graphical user interface. OpenView is not really a network management application in and of itself, but is often used as the basis for many software vendors’ network management packages. In any case, the network management station(s) sends SNMP messages to the agents, which access the MIBs in the managed devices on the network.
Both MIB types are illustrated in the figure. The DTE MIB in each FRAD can be accessed by the customer owning the FRAD over a normal DLCI configured specifically for network management. In CNM situations, the DTE MIB information can be gathered by the service provider. However, the service MIB within the network is exclusively accessed by the service provider’s network management software, except in rare circumstances. Note that access to the service MIB, which is not necessarily located in each and every frame relay switch, is through a regular LAN and FRAD arrangement (and the service provider FRAD also has a DTE MIB which must be managed!). Most network equipment vendors define their own, private extensions to the MIB defined in SNMP for a specific network device. The private MIB fields in the database are usually low-level, hardware-specific extensions with information such as whether the device is on battery backup, has experienced a fan failure, and so on. Most frame relay devices sold today include the standard frame relay MIBs that are accessible by most SNMP manager software products. The MIB forms the source of raw material for the frame relay network management software.
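The poll-and-compare cycle described with Figure 7.7 can be sketched in a few lines. The sketch below does not use a real SNMP library or the real frame relay MIB: the agent is simulated in memory, and every name in it (SimulatedAgent, ManagementStation, the object name badFrames, the device name frad-1, the condition fanFailure) is a hypothetical stand-in chosen for illustration. It shows only the division of labor: the agent holds current values, the management station keeps the history, and a trap arrives without a poll.

```python
class SimulatedAgent:
    """Stand-in for the SNMP agent in a managed device (no real SNMP)."""

    def __init__(self, name, objects):
        self.name = name
        self._objects = dict(objects)  # current values only, no history
        self._trap_listeners = []

    def get(self, object_name):
        """Answer a poll with the current value of one managed object."""
        return self._objects[object_name]

    def set_value(self, object_name, value):
        """Device-side update, e.g. the hardware logged more bad frames."""
        self._objects[object_name] = value

    def add_trap_listener(self, callback):
        self._trap_listeners.append(callback)

    def send_trap(self, condition):
        """Unsolicited alarm: sent without waiting to be polled."""
        for callback in self._trap_listeners:
            callback(self.name, condition)


class ManagementStation:
    """Stand-in for the network management software in the NOC."""

    def __init__(self):
        self.history = {}  # (device, object) -> value seen at last poll
        self.alarms = []

    def poll(self, agent, object_name):
        """Poll one object; return the change since the previous poll.

        Returns None on the first poll, when there is no history yet.
        """
        current = agent.get(object_name)
        key = (agent.name, object_name)
        previous = self.history.get(key)
        self.history[key] = current
        return None if previous is None else current - previous

    def on_trap(self, device, condition):
        self.alarms.append((device, condition))


if __name__ == "__main__":
    station = ManagementStation()
    frad = SimulatedAgent("frad-1", {"badFrames": 6})
    frad.add_trap_listener(station.on_trap)

    station.poll(frad, "badFrames")         # first poll records 6
    frad.set_value("badFrames", 8)          # an hour passes
    print(station.poll(frad, "badFrames"))  # change since the last poll
    frad.send_trap("fanFailure")
    print(station.alarms)
```

Keeping the history dictionary on the station side mirrors the point made in the text that the agent can only report current values, so any trend has to be computed at the management workstation.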
Chapter 8: The Network-Network Interface (NNI)
Overview
Frame relay networks are quite simple. They consist of a series of devices supporting a number of standard interfaces. The standard interfaces are only two in number: the User-Network Interface (UNI) from customer premises to frame relay switch and the Network-Network Interface (NNI) between frame relay switches in two different frame relay networks. Much of frame relay focuses on the UNI, for the obvious reason that this is where the users and customers are. But there is a lot happening at the NNI that affects the users and customers as well. This chapter looks at the NNI in more detail.
Introduction
The NNI connects two different frame relay network clouds. Sometimes it is claimed that the different networks must be from two different service providers, and they often are, but this is not always true. Many frame relay service providers employ frame relay switches from two different manufacturers. This usually happens when the equipment from a former vendor of choice is still present while equipment from a new vendor of choice is moving in. It could also happen that one service provider has purchased another, one with a different vendor for frame relay switches. In any case, the problem is that interswitch interfaces in frame relay networks are not covered by frame relay standards. So multivendor interoperability is not a feature of the frame relay network as a whole. In this case, the NNI forms the standard interface needed on the links between the two vendors’ equipment on the overall network. Without the NNI, customers with the same service provider might not be able to connect to each other, which is hardly a good idea. So a single NNI can connect two frame relay networks, whether or not they are owned and operated by the same service provider.
There are two other configurations where the NNI is used; in these situations there are usually two NNIs. In regulated environments, where the line between local service and long-distance service (the term is almost meaningless today, but entrenched) is firmly drawn and closely watched, one NNI would connect frame relay customers on one local frame relay network with a long-distance service provider’s frame relay backbone and another NNI would connect to the other local frame relay network (which might be the same local service provider or even a third frame relay service provider).
In international environments, a local (national) frame relay service provider would employ an NNI to reach an international frame relay service provider’s backbone, which would attach by another NNI to a second local (national) frame relay service provider in the second country. All three of these NNI arrangements are shown in Figure 8.1. It should be noted that there could be many variations in these configurations, especially the last two. There might be only two service providers instead of three, for example. But the figure is representative of just where the frame relay NNI is used.
Figure 8.1 The three major uses of the NNI.
As an aside, it should be pointed out that the acronym NNI applied to frame relay is not the same as the acronym NNI applied to ATM networks. In frame relay, NNI stands for network-network interface and defines the standard interface between distinct frame relay networks, not switches per se. In ATM networks, NNI stands for network node interface and defines the standard interface between each and every ATM switch within an ATM network cloud. The acronym B-ICI (Broadband InterCarrier Interface) is used in ATM to define the role that the NNI (network-network interface) plays in frame relay. This is unfortunate and confusing, but a fact of networking life.
NNI Implementation
There are two main concerns regarding the interconnecting of two frame relay clouds with the NNI. First and foremost, the transfer of user information must take place transparently so that the users are not aware that a PVC connecting two UNIs actually consists of separate segments, as frame relay documentation calls them. The multisegment PVC has one segment with associated CIRs, excess bursts, and so on, defined within each and every frame relay cloud between source and destination UNI. Each NNI configuration contains at least two PVC segments, by definition, and many NNI configurations actually have three segments for each PVC, as can be seen in Figure 8.1.
Second, the multisegment PVCs must be managed just like any other PVCs, which includes both link management and network management considerations. Not only do the separate UNIs not see the frame relay inner workings, the users are never aware of the presence of the NNI between frame relay clouds. Yet all of the PVCs that are mapped onto the NNI(s) must be activated, maintained, and so forth like any other PVCs. And supporting SVCs on the NNI is especially challenging, mainly due to the need to coordinate billing and resource allocation between not only individual switches, but also entire networks.
Initial frame relay documentation said little about the NNI. In the packet-switching network protocol, X.25, the X.25 protocol ran on the UNI, and a separate protocol, X.75, ran on the links between public packet data networks (the packet-switching NNI). While standards like the LAPF core portions of Q.922 addressed UNI issues, little was done to develop the X.75 equivalent for the frame relay NNI until well into the 1990s. As with many aspects of frame relay, the Frame Relay Forum (FRF) addressed the issue head on. ANSI had provided a phase 1 NNI document which allowed data transfer without any protocol changes and with a slight modification to link management procedures.
But the FRF implementation agreement (IA) mechanism was a better fit for providing transparent PVC service to users, given the structure and procedures of the FRF group. Currently, the FRF NNI is defined in FRF IA 2.1, the Frame Relay Network-to-Network (NNI) IA. This document is intended as a blueprint for the ITU-T recommendation on the frame relay NNI, and FRF IA 2.1 is sometimes called Q.frnni 1. Care must be taken when using the FRF IA 2.1 documentation. The document is usually separated into a body and an annex, although the pages run sequentially between the two (1 through 50). Both body and annex are equally important when it comes to understanding the features and functions of the frame relay NNI.
Multi-network PVCs
A key concept in any study of the frame relay NNI is the multi-network PVC (the term “multinetwork” is always hyphenated in FRF documentation). Any frame relay network configuration that has an active NNI must include multi-network PVCs. Every multi-network PVC consists of two or more PVC segments, one in each frame relay network cloud. The concatenation of these separate PVC segments forms the multi-network PVC. A multi-network PVC always starts and ends at a UNI.
The idea of a multi-network PVC is shown in Figure 8.2. Two typical local provider-to-interexchange (labeled “regional”) carrier NNIs are illustrated. The local service provider could be an incumbent LEC, regional Bell operating company (RBOC), or any other entity prohibited by regulation, law, or both from carrying user bits out of a defined area, usually a local access and transport area (LATA) in the United States. All such local entities must employ the facilities of an interexchange carrier (IXC) to carry bits across a LATA boundary, even if the same local entity receives the bits again in the destination LATA. This handoff to the IXC must occur whether the LATAs are across the country or adjacent and within the same state.
Figure 8.2 Multi-network PVC segments.
The PVC that runs between the two UNIs is now a multi-network PVC consisting of three distinct segments that are under the control of three distinct frame relay service providers. Note that there could be only two service providers involved if the local frame relay service provider were the same, but just in two different LATAs. In any case, two PVC segments run from UNI to NNI, while the third in the middle runs from NNI to NNI. Each segment is set up by the appropriate service provider, yet the multi-network PVC must appear to be one construct to the end users. This requires coordination of all the parameters that go into PVC configuration, from CIRs to burst levels to DE policies. Even something as simple as the PVC leaving one network as DLCI 47 and entering the next as DLCI 89 requires careful bookkeeping and configuration.
At first glance, the task might seem to be impossible, given how little real communication and coordination exists when trying to provision even a multi-LATA private line, which essentially boils down to, “We need four wires to and from the same place by the end of the month.” In many cases, dates become targets rather than actual deadlines. But at least frame relay benefits from the years of experience the service providers (some of them, anyway) have had with X.25 multinetwork circuits (logical connections). In X.25, the role of the UNI is played by X.25 itself, while X.75 assumes the role of the NNI. X.25 and X.75, like any public packet network service, had to deal with multiple service providers and multiple national networks years before frame relay was even imagined. The success of X.75, although after many years of hard work, shows that multi-network PVCs can function, and function well, in frame relay environments.
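The DLCI bookkeeping mentioned above can be pictured as a short table, one row per PVC segment, with a handoff recorded at every NNI. The following sketch is only one hypothetical way to keep such records: the Segment class, the network names, and all DLCI values other than the 47 and 89 taken from the text are invented for illustration, and nothing here reflects how any particular service provider actually provisions segments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    """One PVC segment inside a single frame relay network cloud."""
    network: str
    entry_dlci: int  # DLCI the PVC carries where it enters this network
    exit_dlci: int   # DLCI the PVC carries where it leaves this network


def nni_handoffs(segments: List[Segment]) -> List[Tuple[str, int, str, int]]:
    """One record per NNI: (from network, DLCI out, to network, DLCI in)."""
    return [
        (a.network, a.exit_dlci, b.network, b.entry_dlci)
        for a, b in zip(segments, segments[1:])
    ]


def describe(segments: List[Segment]) -> List[str]:
    """Human-readable listing of the multi-network PVC, UNI to UNI."""
    return [
        f"{s.network}: DLCI {s.entry_dlci} -> DLCI {s.exit_dlci}"
        for s in segments
    ]


if __name__ == "__main__":
    # Three segments, as in the local / interexchange / local arrangement.
    pvc = [
        Segment("Local A", entry_dlci=16, exit_dlci=47),
        Segment("IXC", entry_dlci=89, exit_dlci=120),
        Segment("Local B", entry_dlci=33, exit_dlci=17),
    ]
    for line in describe(pvc):
        print(line)
    for handoff in nni_handoffs(pvc):
        print(handoff)
```

A PVC with three segments yields two handoff records, one for each NNI, which matches the two-NNI arrangements described in this chapter.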
Network-Network Interface Coordination
The success of X.75 does not mean that configuring multi-network PVCs in frame relay is somehow easy or routine, however. Far from it. X.25 and X.75 do not offer anything like the flexible bandwidth allocation (bandwidth-on-demand) capabilities that are characteristic of frame relay networks. This means that frame relay networks have to coordinate not only DLCI numbers, but also bandwidth commitments, congestion procedures, and everything else related to the PVC. Here is a list of the minimum settable frame relay connection parameters that must be configured in a coordinated fashion at the NNI:
The DLCI numbers at each end of the PVC segments that meet at the NNI
The CIR, committed and excess burst levels (Bc, Be), and time-measurement interval (Tc), in each direction on each PVC segment
The maximum frame size in each PVC segment (usually either 1600 or 4096 octets)
All signaling timers and counters used in link management (N391, T391, etc.)
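The traffic parameters in the list above lend themselves to a simple side-by-side comparison of the two PVC segments that meet at an NNI. The sketch below is an illustration only, covering one direction of one PVC: the SegmentParams class and its field names are hypothetical, the time interval is derived with the usual relation Tc = Bc / CIR (assumed here rather than quoted from this chapter), and a difference between segments is simply flagged for review, since what counts as properly coordinated is a matter for the service providers involved.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class SegmentParams:
    """Traffic parameters for one direction of one PVC segment."""
    cir_bps: int           # committed information rate, bits per second
    bc_bits: int           # committed burst size
    be_bits: int           # excess burst size
    max_frame_octets: int  # e.g. 1600 or 4096

    @property
    def tc_seconds(self) -> Optional[float]:
        """Measurement interval, assuming Tc = Bc / CIR.

        Returns None for a zero CIR, where this relation does not apply.
        """
        if self.cir_bps == 0:
            return None
        return self.bc_bits / self.cir_bps


def mismatched_fields(a: SegmentParams, b: SegmentParams) -> List[str]:
    """Names of the parameters that differ between two segments."""
    return [
        f.name for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    ]


if __name__ == "__main__":
    local_side = SegmentParams(64000, 64000, 32000, 1600)
    backbone_side = SegmentParams(64000, 32000, 32000, 4096)
    print(local_side.tc_seconds, backbone_side.tc_seconds)
    print(mismatched_fields(local_side, backbone_side))
```

In the demo values, the two segments share a CIR but use different committed bursts, so their measurement intervals differ as well, which is the kind of discrepancy that would need to be settled between the two networks.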
In addition, the networks should also support the standard implementations of FECN and BECN bits. All actions of the networks regarding the setting and use of the DE bit to perform traffic shaping and congestion control must be spelled out and coordinated as well.
This is a good place to list the default values of the counters and timers used on the NNI, along with their ranges. All relate to the same status message use described in Chapter 7. Their use is essentially the same as on the UNI (T392 should be set greater than T391, etc.). These values are established by the Frame Relay Forum in FRF.2.1, the Frame Relay Network-to-Network (NNI) Implementation Agreement. The names and default values are listed in Table 8.1. Service providers rarely tinker with these values. Both of the frame relay networks at the ends of the NNI will generate Status Enquiry messages based on the T391 value but, of course, there is no need to coordinate the network clocks at this level to any degree. There is not even any requirement for the N391 counter to have the same value in both networks. But both sides of the NNI must have the same values for N392, N393, T391, and T392.
The NNI discussion so far has only addressed PVC issues. What about SVCs? With SVCs, all of these parameters must be agreed upon and set up not in the time interval between service ordering and service provisioning, which can take several business days, but during processing of the call setup message, which should take no more than about 10 seconds (the usual call setup target for switched services of all types). And SVCs require additional coordination tasks, such as engaging billing systems and consideration of reciprocal billing arrangements between different service providers.
Table 8.1 NNI Parameters for Link Management
INFORMAL NAME
PARAMETER
RANGE
DEFAULT VALUE
COMMENT
Polling cycle
N391
1–255
6
Determines full status interval
Errors
N392
1–10
3
Missing cycles (