Data Center Networking: Enterprise Distributed Data Centers Solutions Reference Network Design March, 2003
Corporate Headquarters Cisco Systems, Inc. 170 West Tasman Drive San Jose, CA 95134-1706 USA http://www.cisco.com Tel: 408 526-4000 800 553-NETS (6387) Fax: 408 526-4100
Customer Order Number: 956599
THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS MANUAL ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL STATEMENTS, INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE ACCURATE BUT ARE PRESENTED WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY PRODUCTS. THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT ARE SET FORTH IN THE INFORMATION PACKET THAT SHIPPED WITH THE PRODUCT AND ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE SOFTWARE LICENSE OR LIMITED WARRANTY, CONTACT YOUR CISCO REPRESENTATIVE FOR A COPY. The Cisco implementation of TCP header compression is an adaptation of a program developed by the University of California, Berkeley (UCB) as part of UCB’s public domain version of the UNIX operating system. All rights reserved. Copyright © 1981, Regents of the University of California. NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND SOFTWARE OF THESE SUPPLIERS ARE PROVIDED “AS IS” WITH ALL FAULTS. CISCO AND THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING, WITHOUT LIMITATION, THOSE OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE. IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THIS MANUAL, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
CCIP, the Cisco Arrow logo, the Cisco Powered Network mark, the Cisco Systems Verified logo, Cisco Unity, Follow Me Browsing, FormShare, iQ Breakthrough, iQ Expertise, iQ FastTrack, the iQ Logo, iQ Net Readiness Scorecard, Networking Academy, ScriptShare, SMARTnet, TransPath, and Voice LAN are trademarks of Cisco Systems, Inc.; Changing the Way We Work, Live, Play, and Learn, Discover All That’s Possible, The Fastest Way to Increase Your Internet Quotient, and iQuick Study are service marks of Cisco Systems, Inc.; and Aironet, ASIST, BPX, Catalyst, CCDA, CCDP, CCIE, CCNA, CCNP, Cisco, the Cisco Certified Internetwork Expert logo, Cisco IOS, the Cisco IOS logo, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo, Empowering the Internet Generation, Enterprise/Solver, EtherChannel, EtherSwitch, Fast Step, GigaStack, Internet Quotient, IOS, IP/TV, LightStream, MGX, MICA, the Networkers logo, Network Registrar, Packet, PIX, Post-Routing, Pre-Routing, RateMUX, Registrar, SlideCast, StrataView Plus, Stratm, SwitchProbe, TeleRouter, and VCO are registered trademarks of Cisco Systems, Inc. and/or its affiliates in the U.S. and certain other countries. All other trademarks mentioned in this document or Web site are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (0208R)
Data Center Networking: Enterprise Distributed Data Centers Copyright © 2003, Cisco Systems, Inc. All rights reserved.
CONTENTS

Preface
    Intended Audience
    Document Organization
    Document Conventions
    Obtaining Documentation
    World Wide Web
    Documentation CD-ROM
    Ordering Documentation
    Documentation Feedback
    Obtaining Technical Assistance
    Cisco.com
    Technical Assistance Center
    Cisco TAC Web Site
    Cisco TAC Escalation Center

CHAPTER 1  Enterprise Distributed Data Centers Overview
    Business Goals and Requirements
    The Problem
    The Solution
    Single Site Architecture
    Multi Site Architecture
    Application Overview
    Legacy Applications
    Non-Legacy Applications
    Application Requirements
    Benefits of Distributed Data Centers
    Site-to-Site Recovery
    Multi-Site Load Distribution
    Solution Topologies
    Site-to-Site Recovery
    User to Application Recovery
    Database-to-Database Recovery
    Storage-to-Storage Recovery
    Multi-Site Topology
    Summary

CHAPTER 2  Inter Data Center Transport Technologies
    Interconnecting Data Centers
    Server Peer-to-Peer Communication
    Data Replication and Networked Storage
    Synchronous Replication
    Asynchronous Replication
    Transport Technologies
    LAN
    MAN
    WAN

CHAPTER 3  Site Selection Technologies
    Request Routing
    DNS Based Request Routing
    HTTP Redirection
    Route Health Injection
    Supporting Platforms
    Global Site Selector
    Distributed Director
    Distributed Director, IOS-SLB, and Dynamic Feedback Protocol
    WebNS and Global Server Load Balancing
    Content Switching Module for Catalyst 6000

CHAPTER 4  Site to Site Recovery
    Site-to-Site Recovery Topology
    Hardware and Software Requirements
    Design Details
    Design Goals
    Redundancy
    High Availability
    Application Requirements
    Additional Design Goals
    Design Recommendations
    Topology Using Distributed Director
    Topology Using Content Services Switch
    Integrated Architecture
    Decoupled Architecture
    Topology Using the Content Switching Module
    Recommendations
    Implementation Details
    Topology Using CSS
    Authoritative DNS
    CAPP
    Warm Standby Deployment
    Hot Standby Deployment
    High Availability
    Configuring CSS in Integrated Architecture
    Caveats
    Configuring CSS in Decoupled Architecture
    Caveats
    Topology Using DistributedDirector
    High Availability
    Configuration Details
    Caveats
    Topology using CSM
    High Availability
    Configuration Details
    Caveats

CHAPTER 5  Multi-Site Load Distribution
    Hardware and Software Requirements
    Design Details
    Design Goals
    High Availability
    Scalability
    Security
    Other Requirements
    Design Topologies
    Topology Using CSS
    Integrated Architecture
    Decoupled Architecture
    Topology Using Distributed Director
    Implementation Details
    Working with the CSS
    Site Selection Methods
    Modes of Operation
    Deployment Methods
    Caveats
    Configuration
    Decoupled Architecture
    Caveats
    Configuration
    Working with Distributed Directors
    Director Response Protocol (DRP)
    DRP Access Limiting and Authentication
    Dynamic Feedback Protocol
    Site Selection Methods
    Deployment Options
    Configuration
    Recommendations

CHAPTER 6  Multi Site Multi Homing
    Overview
    Multi-Site Multi-Homing Design Principles
    High Availability
    Scalability
    Intelligent Network Services
    HSRP
    Routing Protocol Technologies
    Edge Routing - BGP
    Design Caveats
    Work Arounds
    Multi-Site Multi-Homing Design Recommendations
    Border Router Layer
    Internet Data Center Core Switching Layer
    Firewall Layer
    Data Center Core Switching Layer
    Implementation Details
    Multi-Site Multi-Homing Topology
    Internet Cloud Router Configurations
    Internet Edge Configurations
    Edge Switching Layer Configurations
    Core Switching Layer Configurations
    BGP Attribute Tuning
    Security Considerations

CHAPTER 7  Deploying Site-to-Site Recovery and Multi-Site Load Distribution
    Overview
    Hardware and Software Requirements
    Design Details
    Design Goals
    Redundancy
    High Availability
    Scalability
    Security
    Other Requirements
    Design Topologies
    Site-to-Site Recovery
    Multi-Site Load Distribution
    Implementation Details
    Redundancy
    High Availability
    Scalability
    Basic Configuration
    Site-to-Site Recovery
    Site Selection Method
    Configuration
    Multi-Site Load Distribution
    Site Selection Methods
    Configuration
    Summary

INDEX
Preface
For small, medium, and large businesses, it is critical to provide high availability of data for both customers and employees. The objective behind disaster recovery and business continuance plans is accessibility to data anywhere and at any time. Meeting these objectives is all but impossible with a single data center. The single data center is a single point of failure if a catastrophic event occurs. The business comes to a standstill until the data center is rebuilt and the applications and data are restored. Enterprises can realize application scalability, high availability, and increased redundancy by deploying multiple data centers, also known as distributed data centers (DDC). This Solutions Reference Network Design (SRND) guide discusses the benefits, technologies, and platforms related to designing distributed data centers. More importantly, this SRND discusses disaster recovery and business continuance, which are two key problems addressed by deploying a DDC.
Intended Audience
This document is intended for network design architects and support engineers who are responsible for planning, designing, implementing, and operating networks.
Document Organization
This document contains the following chapters:

Chapter 1, “Enterprise Distributed Data Centers Overview”
    Provides an overview of distributed data centers.
Chapter 2, “Inter Data Center Transport Technologies”
    Provides an overview of the transport technologies available for inter data center communications.
Chapter 3, “Site Selection Technologies”
    Provides an overview of the various technologies available for site selection between multiple data centers.
Chapter 4, “Site to Site Recovery”
    Provides design guidance and implementation recommendations for site-to-site recovery in distributed data center environments.
Chapter 5, “Multi-Site Load Distribution”
    Provides design guidance for implementing global server load balancing (GSLB) in distributed data center environments.
Chapter 6, “Multi Site Multi Homing”
    Provides design recommendations on how to deploy a multi-site multi-homed network.
Chapter 7, “Deploying Site-to-Site Recovery and Multi-Site Load Distribution”
    Provides design recommendations for deploying site-to-site recovery and multi-site load distribution.
Document Conventions
This guide uses the following conventions to convey instructions and information:

Table 1   Document Conventions

Convention              Description
boldface font           Commands and keywords.
italic font             Variables for which you supply values.
[ ]                     Keywords or arguments that appear within square brackets are optional.
{x | y | z}             A choice of required keywords appears in braces separated by vertical bars. You must select one.
screen font             Examples of information displayed on the screen.
boldface screen font    Examples of information you must enter.
< >                     Nonprinting characters, for example passwords, appear in angle brackets.
[ ]                     Default responses to system prompts appear in square brackets.

Note        Means reader take note. Notes contain helpful suggestions or references to material not covered in the manual.
Timesaver   Means the described action saves time. You can save time by performing the action described in the paragraph.
Tips        Means the following information will help you solve a problem. The tips information might not be troubleshooting or even an action, but could be useful information, similar to a Timesaver.
Caution     Means reader be careful. In this situation, you might do something that could result in equipment damage or loss of data.
Obtaining Documentation
These sections explain how to obtain documentation from Cisco Systems.

World Wide Web
You can access the most current Cisco documentation on the World Wide Web at this URL:
http://www.cisco.com
Translated documentation is available at this URL:
http://www.cisco.com/public/countries_languages.shtml

Documentation CD-ROM
Cisco documentation and additional literature are available in a Cisco Documentation CD-ROM package, which is shipped with your product. The Documentation CD-ROM is updated monthly and may be more current than printed documentation. The CD-ROM package is available as a single unit or through an annual subscription.

Ordering Documentation
You can order Cisco documentation in these ways:
• Registered Cisco.com users (Cisco direct customers) can order Cisco product documentation from the Networking Products MarketPlace:
  http://www.cisco.com/cgi-bin/order/order_root.pl
• Registered Cisco.com users can order the Documentation CD-ROM through the online Subscription Store:
  http://www.cisco.com/go/subscription
• Nonregistered Cisco.com users can order documentation through a local account representative by calling Cisco Systems Corporate Headquarters (California, U.S.A.) at 408 526-7208 or, elsewhere in North America, by calling 800 553-NETS (6387).

Documentation Feedback
You can submit comments electronically on Cisco.com. In the Cisco Documentation home page, click the Fax or Email option in the “Leave Feedback” section at the bottom of the page. You can e-mail your comments to [email protected]. You can submit your comments by mail by using the response card behind the front cover of your document or by writing to the following address:
Cisco Systems
Attn: Document Resource Connection
170 West Tasman Drive
San Jose, CA 95134-9883
We appreciate your comments.
Obtaining Technical Assistance
Cisco provides Cisco.com as a starting point for all technical assistance. Customers and partners can obtain online documentation, troubleshooting tips, and sample configurations from online tools by using the Cisco Technical Assistance Center (TAC) Web Site. Cisco.com registered users have complete access to the technical support resources on the Cisco TAC Web Site.

Cisco.com
Cisco.com is the foundation of a suite of interactive, networked services that provides immediate, open access to Cisco information, networking solutions, services, programs, and resources at any time, from anywhere in the world. Cisco.com is a highly integrated Internet application and a powerful, easy-to-use tool that provides a broad range of features and services to help you with these tasks:
• Streamline business processes and improve productivity
• Resolve technical issues with online support
• Download and test software packages
• Order Cisco learning materials and merchandise
• Register for online skill assessment, training, and certification programs
If you want to obtain customized information and service, you can self-register on Cisco.com. To access Cisco.com, go to this URL:
http://www.cisco.com

Technical Assistance Center
The Cisco Technical Assistance Center (TAC) is available to all customers who need technical assistance with a Cisco product, technology, or solution. Two levels of support are available: the Cisco TAC Web Site and the Cisco TAC Escalation Center. Cisco TAC inquiries are categorized according to the urgency of the issue:
• Priority level 4 (P4): You need information or assistance concerning Cisco product capabilities, product installation, or basic product configuration.
• Priority level 3 (P3): Your network performance is degraded. Network functionality is noticeably impaired, but most business operations continue.
• Priority level 2 (P2): Your production network is severely degraded, affecting significant aspects of business operations. No workaround is available.
• Priority level 1 (P1): Your production network is down, and a critical impact to business operations will occur if service is not restored quickly. No workaround is available.
The Cisco TAC resource that you choose is based on the priority of the problem and the conditions of service contracts, when applicable.
Cisco TAC Web Site
You can use the Cisco TAC Web Site to resolve P3 and P4 issues yourself, saving both cost and time. The site provides around-the-clock access to online tools, knowledge bases, and software. To access the Cisco TAC Web Site, go to this URL:
http://www.cisco.com/tac
All customers, partners, and resellers who have a valid Cisco service contract have complete access to the technical support resources on the Cisco TAC Web Site. The Cisco TAC Web Site requires a Cisco.com login ID and password. If you have a valid service contract but do not have a login ID or password, go to this URL to register:
http://www.cisco.com/register/
If you are a Cisco.com registered user, and you cannot resolve your technical issues by using the Cisco TAC Web Site, you can open a case online by using the TAC Case Open tool at this URL:
http://www.cisco.com/tac/caseopen
If you have Internet access, we recommend that you open P3 and P4 cases through the Cisco TAC Web Site.

Cisco TAC Escalation Center
The Cisco TAC Escalation Center addresses priority level 1 or priority level 2 issues. These classifications are assigned when severe network degradation significantly impacts business operations. When you contact the TAC Escalation Center with a P1 or P2 problem, a Cisco TAC engineer automatically opens a case. To obtain a directory of toll-free Cisco TAC telephone numbers for your country, go to this URL:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtml
Before calling, please check with your network operations center to determine the level of Cisco support services to which your company is entitled: for example, SMARTnet, SMARTnet Onsite, or Network Supported Accounts (NSA). When you call the center, please have available your service agreement number and your product serial number.
CHAPTER 1

Enterprise Distributed Data Centers Overview
Centralized data centers have helped many enterprises achieve substantial productivity gains and cost savings. These data centers house mission-critical applications, which must be highly available. The demand on data centers is therefore higher than ever before. Data center design must focus on scaling methodology and achieving high availability. A disaster in a single data center that houses enterprise applications and data has a crippling effect on that enterprise’s ability to conduct business. How does an enterprise survive a natural or man-made disaster that affects the data center? Enterprises can realize application scalability, high availability, and redundancy by deploying distributed data centers (DDC). This document discusses the benefits, technologies, and platforms related to designing distributed data centers. More importantly, it discusses disaster recovery and business continuance, which are two key problems addressed by deploying a DDC.
Business Goals and Requirements
Before getting further into the details, it is important to keep in mind the goals and requirements of businesses. Technology allows businesses to be productive and to react quickly to changes in the business environment. Data centers are one of the most important business assets, with data being a key element. That data must be protected, preserved, and highly available. For a business to access data from anywhere and at any time, the data center must be operational around the clock, under any circumstance. In addition to high availability, businesses should be able to scale as they grow while protecting their current capital investments. In summary, data is an important aspect of business, and from this perspective the business goal is to achieve redundancy, high availability, and scalability. Securing this data gets the highest priority.
The Problem
Data center unavailability causes both direct and indirect losses. Direct losses translate into money. Indirect losses, however, are much harder to quantify. Generally, indirect losses translate to the costs related to legal, contractual, and regulatory obligations, and customer satisfaction issues due to not meeting service level agreements. Gartner Group Dataquest estimates that two out of five companies that experience a disaster go out of business within five years. As companies realize the productivity gains the network brings to their businesses, more and more of them, for example hospitals, credit card companies, and financial institutions, move towards an infrastructure that is available 24x7, 365 days a year. Examples of applications that have to be available around the clock are email, file services, enterprise resource planning (ERP), supply-chain, eCommerce, and customer relationship management (CRM).
According to a report published by Strategic Research Corporation, the financial impact of a major system outage is enormous:
• US$6.5 million per hour in the case of a brokerage operation
• US$2.6 million per hour for a credit-card sales authorization system
• US$14,500 per hour in automated teller machine (ATM) fees if an ATM system is offline
According to a research report from META Group, the average cost of an hour of downtime is estimated at $330,000. Another consulting firm puts the figure closer to $1 million. Even planned application downtime causes some disruption to businesses. Deploying multiple data centers is a key step to solving downtime problems. Table 1-1 provides an overview of the impact of catastrophic failures based on industry.
Table 1-1   Impact of Catastrophic Failures

Industry Sector           Revenue/Hour
Energy                    $2,817,846
Telecommunications        $2,066,245
Manufacturing             $1,610,654
Financial Institutions    $1,495,134
Information Technology    $1,344,461
Insurance                 $1,202,444
Retail                    $1,107,274
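The hourly figures in Table 1-1 can be turned into a rough outage estimate by multiplying them by the outage duration. The following is a minimal sketch of that arithmetic, assuming losses accrue linearly with time; the function name `outage_cost` is hypothetical, and the Telecommunications figure is read as $2,066,245.

```python
# Revenue at risk per hour, by industry sector, as listed in Table 1-1.
REVENUE_PER_HOUR = {
    "Energy": 2_817_846,
    "Telecommunications": 2_066_245,
    "Manufacturing": 1_610_654,
    "Financial Institutions": 1_495_134,
    "Information Technology": 1_344_461,
    "Insurance": 1_202_444,
    "Retail": 1_107_274,
}


def outage_cost(sector: str, outage_hours: float) -> float:
    """Estimate lost revenue for an outage (assumption: linear in duration)."""
    if outage_hours < 0:
        raise ValueError("outage_hours must be non-negative")
    return REVENUE_PER_HOUR[sector] * outage_hours


if __name__ == "__main__":
    # Print the estimated cost of a four-hour outage for every sector.
    for sector in REVENUE_PER_HOUR:
        print(f"{sector}: ${outage_cost(sector, 4):,.0f}")
```

An estimate like this covers only the direct losses described earlier; indirect losses such as contractual penalties are not captured.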
Business continuance and disaster recovery are important goals for businesses. According to the Yankee Group, business continuity is a strategy that outlines plans and procedures to keep business operations, such as sales, manufacturing, and inventory applications, 100% available. Business continuity is defined as providing continuous business support services in the face of disaster. Companies embracing e-business applications must adopt strategies that keep application services up and running 24x7 and ensure that business-critical information is secure and protected from corruption and loss. In addition to high availability, the ability to scale as the business grows is also important.
The Solution
Resilient networks provide business resilience. A business continuance strategy for application data that addresses these issues involves two steps:
1. Replicating data, either synchronously or asynchronously
2. Directing users to the recovered data
Data needs to be replicated synchronously or at regular intervals (asynchronously), and then retrieved and restored when needed. The interval at which data is backed up forms the critical component of a company’s business continuance strategy. The company’s business and the application requirements dictate the interval at which the data is replicated. In the event of a failure, the backed-up data is restored, and applications are enabled with the restored data. The second part of the solution is to provide a path to the recovered data and direct the end users to it. The important goal of business continuity is to reduce the gap between the time when data needs to be recovered and the time when data is recovered and used, with little or no loss of data. This helps minimize business losses. For example, if data from a sales order is lost, it is a loss for the business unless that order information is recovered and processed in a timely fashion.
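The effect of the replication interval on data loss can be illustrated with a small model. The sketch below is not taken from any product; the names `Write`, `replicated_at_failure`, and `lost_writes` are hypothetical, and it assumes asynchronous copies are shipped at fixed multiples of the interval, while synchronous replication commits every write at the remote site before the failure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Write:
    timestamp: float  # seconds since the start of the observation window
    payload: str


def replicated_at_failure(writes: List[Write], failure_time: float,
                          interval: Optional[float]) -> List[Write]:
    """Return the writes present at the secondary site when the primary fails.

    interval=None models synchronous replication; otherwise data is shipped
    asynchronously every `interval` seconds (illustrative assumption).
    """
    if interval is None:
        cutoff = failure_time
    else:
        # Time of the last replication cycle completed before the failure.
        cutoff = (failure_time // interval) * interval
    return [w for w in writes if w.timestamp <= cutoff]


def lost_writes(writes: List[Write], failure_time: float,
                interval: Optional[float]) -> List[Write]:
    """Writes accepted by the primary before the failure but never replicated."""
    safe = replicated_at_failure(writes, failure_time, interval)
    return [w for w in writes if w.timestamp <= failure_time and w not in safe]


if __name__ == "__main__":
    orders = [Write(100, "order-1"), Write(450, "order-2"), Write(950, "order-3")]
    print("sync loss:", lost_writes(orders, 1000, None))
    print("async loss (300 s interval):", lost_writes(orders, 1000, 300))
```

In this model a shorter interval narrows the window of unreplicated writes, which is the gap the business continuance strategy aims to reduce.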
Single Site Architecture
Building a single data center is potentially risky when one considers the aforementioned business continuance requirements. A highly available design protects access to critical information if hardware or software breaks down at the data center. Protecting data accessibility cannot be achieved if the entire data center is not accessible. Achieving high availability aimed at preventing the single site failure problem in the event of a catastrophic failure requires that applications and information be replicated at a different location. This requires the build-out of more than one data center.
Multi Site Architecture
With multiple data centers, where the application data is duplicated, clients have the ability to go to the available data center in the event of catastrophic failures. You can also use the data centers concurrently to increase scalability. Scalability is a direct result of having more application resources distributed across multiple data centers. Building multiple data centers is analogous to building a global server farm, which increases the number of requests and number of clients that can be handled.
Note   Application information, otherwise known as data or content, is kept on servers. Content includes critical application information, static data (web pages), and dynamically generated data. To learn more about scaling and deploying static content, refer to the Caching in the Enterprise: Overview document.

Once content is distributed across multiple data centers, the need to distribute requests for content arises. A control mechanism must be deployed to manage the load placed on each data center. This control mechanism translates into the ability to direct user requests for content and route them to the appropriate data center. The selection of the appropriate data center is normally based on server availability, content availability, or network distance from client to data center, among other parameters. Directing requests for content is also known as content routing or, more appropriately, request routing.
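The selection criteria named above (server availability, content availability, and network distance) can be sketched as a filter followed by a ranking. This is a minimal illustration rather than the behavior of any specific platform; the `DataCenter` fields and the `select_data_center` function are hypothetical names, and distance is assumed to be expressed as a measured round-trip time in milliseconds.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DataCenter:
    name: str
    servers_up: bool     # at least one server can answer requests
    has_content: bool    # the requested content is present at this site
    distance_ms: float   # assumed metric: round-trip time from the client


def select_data_center(sites: Iterable[DataCenter]) -> Optional[DataCenter]:
    """Pick the nearest site that is both available and holds the content."""
    candidates = [s for s in sites if s.servers_up and s.has_content]
    if not candidates:
        return None  # no site can serve the request
    return min(candidates, key=lambda s: s.distance_ms)


if __name__ == "__main__":
    sites = [
        DataCenter("east", servers_up=False, has_content=True, distance_ms=12.0),
        DataCenter("central", servers_up=True, has_content=True, distance_ms=35.0),
        DataCenter("west", servers_up=True, has_content=True, distance_ms=80.0),
    ]
    chosen = select_data_center(sites)
    print(chosen.name if chosen else "no site available")
```

Filtering before ranking means that proximity never overrides availability: the nearest site is skipped when its servers are down.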
Application Overview
Applications are at the heart of the data center. Applications can be broadly classified into two categories: legacy and non-legacy.
Legacy Applications
Legacy applications are those that have been inherited from languages, platforms, and techniques earlier than current technology. A significant number of enterprises have legacy applications and databases that serve critical business needs. Typically, the challenge is to keep the legacy application running during the conversion to newer, more efficient code that makes use of new technology and software programming developments. In the past, applications were tailored for a specific operating system and even hardware platform. Currently, many companies are migrating their legacy applications to new platforms and systems that follow open or standard programming interfaces, which makes it easier to upgrade them without having to rewrite them. This strategy also allows for server consolidation. In addition to moving to new languages, enterprises are redistributing the locations of applications and data. In general, legacy applications must continue to run on the platforms for which they were developed. Typically, new development environments account for the need to continue to support legacy applications and data. With many new tools, newer programs can access legacy databases. Legacy applications also follow the model of client-server computing. In an IP environment, the legacy applications normally have a hard-coded IP address to communicate with servers through the DNS process.
Non-Legacy Applications
The current trend is to provide user-friendly front ends to applications. The web browser is one such front-end application that is very popular. Most enterprise applications in development today have such a user interface. The newer applications follow open standards, which makes it easier to interoperate with other vendors’ applications and data. Web-based interfaces are now very common among different vendors. Migrating or upgrading applications also becomes easier by deploying these newer open-standards-based applications. The current trend is for enterprises to build three-tier server farm architectures to support non-legacy and present-day applications. In addition to using DNS for domain name resolution, the newer application programs often interoperate with the HTTP protocol and its ability to use redirection.
Application Requirements
Applications store, retrieve, and modify data based on client input. Typically, application requirements align with the business requirements, namely high availability, security, and scalability. Applications must be capable of supporting a specific number of users and be able to provide redundancy within the data center to protect against hardware and software failures. To scale the number of users, deploy applications at multiple data centers. This eliminates the single point of failure. Applications deployed in multiple data centers provide high availability. Table 1-2 provides an idea of different types of application requirements.
Table 1-2   Application Requirements

Application      HA        Security   Scalability
ERP/Mfg          High      High       High
E-Commerce       High      High       High
Financial        High      High       –
CRM              High      High       High
Hospital Apps    High      High       –
E-mail           Medium    High       Medium
Benefits of Distributed Data Centers
The objective of deploying distributed data centers is to provide redundancy, scalability, and high availability. Redundancy is the first line of defense against any failure. Redundancy within a data center protects against link failure, equipment failure, and application failure, and protects businesses from both direct and indirect losses. A business continuance strategy for application data backup that addresses these issues includes backup, restore, and disaster recovery. Backup and restore are critical components of a company’s business continuance strategy. These strategies often include:
• Data archiving for protection against data loss and corruption, or to meet regulatory requirements
• Remote replication of data for distribution of content, application testing, disaster protection, and data center migration
• Non-intrusive replication technologies that do not impact production systems and still meet shrinking backup window requirements
• Critical e-business applications that require a robust disaster recovery infrastructure
Real-time disaster recovery solutions, such as synchronous mirroring, allow companies to safeguard their data operations by:
– Ensuring uninterrupted mission-critical services to employees, customers, and partners
– Guaranteeing that mission-critical data is securely and remotely mirrored to avoid any data loss in the event of a disaster
The following sections cover the benefits of deploying distributed data centers.
Site-to-Site Recovery DDCs provide redundancy through site-to-site recovery mechanisms. Site-to-site recovery is the ability to recover from a site failure by failing over to a secondary or backup site. As companies realize the productivity gains the network brings to their businesses, more and more companies are moving towards a networked infrastructure. Achieve application high availability by deploying distributed data centers and business continuance solutions.
Multi-Site Load Distribution
Distributing applications among multiple sites allows you to take advantage of multi-site load distribution to create a more efficient, cost-effective use of global resources, ensure scalable content, and provide end users with a better application experience. Routing clients to a site based on load conditions and on the health of the site results in scalability and high availability, respectively. You can load balance any application that uses standard HTTP, TCP or UDP, including mail, news, chat, and Lightweight Directory Access Protocol (LDAP). This provides enhanced scalability for a variety of mission-critical e-business applications.
These benefits come with some hurdles. Some of the challenges include mirroring database state information and mirroring data and session information across multiple data centers. The application vendors are dealing with these issues. The underlying infrastructure helps simplify the problem by providing high-bandwidth, high-speed connections between the data centers for the mirroring mechanisms.
Improve site (data center) availability and site load by routing end users to the appropriate data centers. You can use different criteria to route end users to different data centers. Routing users to a data center closer to them statistically improves the application’s response time. This is referred to as proximity-based request routing. In addition, you can route users to different data centers based on the load at the data center and the application availability.
Load distribution based on proximity plays an important role when delivering multimedia to end users. You can distribute applications like video on demand (VoD) or media on demand (MoD) across different data centers. In this instance, clients are redirected to the closest data center. This improves the end user's experience and helps reduce congestion on the network.
Access to applications is limited by a number of factors related to hardware, software and the network architecture. To accommodate anticipated demand, estimate traffic loads on the system to derive the number of nodes required to handle peak workloads. DDCs allow you to distribute applications across multiple sites, increasing scalability and the redundancy of the application environments both of which are key goals in supporting mission critical applications.
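As a rough illustration of the sizing step described above, the following sketch derives a node count from an estimated peak load. The request rates, the per-node capacity, the headroom fraction, and the spare-node count are hypothetical values chosen for illustration; they are not figures from this document.

```python
import math

def nodes_required(peak_requests_per_sec, node_capacity_per_sec,
                   headroom=0.25, spare_nodes=1):
    """Estimate how many nodes are needed to carry a peak workload.

    headroom    -- fraction of each node's capacity held in reserve (assumed)
    spare_nodes -- extra nodes kept for redundancy (assumed)
    """
    usable = node_capacity_per_sec * (1.0 - headroom)
    return math.ceil(peak_requests_per_sec / usable) + spare_nodes

# Hypothetical numbers: 4,000 requests/s at peak, 500 requests/s per node.
total = nodes_required(4000, 500)
per_site = math.ceil(total / 2)   # split evenly across two data centers
print(total, per_site)
```

Splitting the total across sites is only a starting point; if one site must be able to carry the full load after a failure, each site would need the full node count.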
Solution Topologies
Site-to-Site Recovery
Typically, in a data center, the web servers, application servers, databases, and storage devices are organized in a multi-tier environment. This is referred to as an instance of the multi-tier architecture or N-Tier architecture. This document describes the most common N-Tier model, the 3-tier. A 3-tier architecture has the following components:
• Front-end layer
• Application layer
• Back-end layer
The front-end layer or presentation tier provides the client-facing interface and serves information to client requests. The servers in this tier assemble the information and present it to the client. This tier includes DNS, FTP, SMTP and other servers whose purpose is rather generic. The application tier, also known as middleware or business logic, contains the applications that process the requests for information and provide the logic that generates or fulfills dynamic content. This tier literally runs the tasks needed to assemble the dynamic content and plays the key role of communicating to both the front-end and back-end tier. The database forms the back end layer. Typically, a disaster recovery or a business continuance solution involves two data centers as depicted in Figure 1-1.
Figure 1-1   Distributed Data Center Model
[Figure: a primary and a secondary data center. Each site has an Internet edge, an internal network, core switches, and server farms organized into front-end, application, and back-end layers with storage. The sites reach the Internet through service providers A and B and are linked by ONS 15xxx DWDM platforms on a metro optical DWDM ring carrying GE, FC, and ESCON.]
There are two main topologies from a solutions perspective:
• Hot standby
• Warm standby
In a hot standby solution, the secondary data center has some applications running actively and has some traffic processing responsibilities. This is so that the resources are not kept idle in the secondary data center, which improves overall application scalability and equipment utilization. In a warm standby solution, the applications at the secondary data center are active at all times but the traffic is only processed by the secondary data center when the primary data center goes out of service. Note that the multi-tier architecture is replicated at both the primary and secondary data centers.
The following elements are involved in a disaster recovery or a business continuance solution:
• User to application recovery
• Database to database recovery
• Storage to storage recovery
User to Application Recovery
When a catastrophic failure occurs at a data center, you lose connectivity with the application. Depending on the kind of session involved, the application at your end might try to reconnect to the cached IP address of the server. Ultimately, you have to restart the application on the desktop because the primary data center is not available. When the client’s application connects to the remote server, it resolves the domain to an IP address. The new IP address belongs to the secondary data center. You are unaware that the secondary data center is active and that your request has been routed to it. Behind the scenes, the request routing devices constantly monitor the applications at both data centers. All new requests are routed to the secondary data center in the event that the primary data center becomes unavailable.
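The behavior described above can be sketched as a toy request router that answers name lookups with the address of the first healthy site. This is a minimal illustration, not the behavior of any particular product; the class names, the domain, and the addresses (taken from documentation address ranges) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    vip: str              # address handed back to clients (example value)
    healthy: bool = True

class RequestRouter:
    """Answers lookups with the VIP of the first healthy site, in preference order."""

    def __init__(self, sites):
        self.sites = list(sites)   # ordered: primary first, then secondary

    def record_probe(self, name, healthy):
        """Store the result of a health probe for one site."""
        for site in self.sites:
            if site.name == name:
                site.healthy = healthy

    def resolve(self, domain):
        for site in self.sites:
            if site.healthy:
                return site.vip
        raise LookupError(f"no data center available for {domain}")

router = RequestRouter([
    Site("primary", "192.0.2.10"),
    Site("secondary", "198.51.100.10"),
])
print(router.resolve("app.example.com"))    # primary VIP while both sites are up
router.record_probe("primary", healthy=False)
print(router.resolve("app.example.com"))    # new lookups now get the secondary VIP
```

Clients that cached the old address still have to look the name up again, which is why the text notes that the application may need to be restarted.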
Database-to-Database Recovery Databases also maintain keep-alive traffic and session state information between the primary and secondary data centers. Like the application tier, the database tier has to update the state information to the secondary data center. Database state information updates tend to be chattier than application tier state information updates. Database updates consume more bandwidth and have a drastic impact on the corporate network if done frequently during regular business hours. The database synchronization benefits from the backend network infrastructure introduced to support the application tier. During a catastrophic failure at the primary data center, the secondary data center becomes active and the database rolls back to the previous update.
Storage-to-Storage Recovery
The destination for all application transactions is the storage media, like the disk arrays, which are part of the data center. These disks are backed up using tapes and can be backed up either synchronously or asynchronously to the secondary data center. If the data is backed up using only tapes, then after a catastrophic failure the data is recovered from the tapes at an alternate data center, which requires a great deal of time and effort. Refer back to the previous section, The Problem, for an illustration of the impact this might have in dollars. In asynchronous backup, data is written to the secondary data center at a specified time. All the data saved on the local disk arrays for a specific window of operation is transferred to the secondary data center at that specified time. When a disaster occurs, the data is retrieved from the previous update and your business resumes, starting at the last update. With this mechanism, your business is rolled back to
the previous update. This method has less recovery overhead than the tape backup mechanism and recovery is quick, although some data loss is still likely. If that loss is acceptable, you have essentially recovered from the catastrophic failure. Many businesses have a low tolerance for downtime and lost data. An hour of downtime might mean losing substantial amounts of money, possibly several million dollars. Such businesses use synchronous data backup. With such a mechanism, the data is written to the remote or secondary data center every time the data is written at the primary data center. A transaction is acknowledged as successful only when the data is written at both the primary and the secondary data center. If there is a catastrophic failure, the secondary data center takes over immediately and no data is lost. The end user, after going through user to application recovery, is able to access the secondary data center and has access to most of the data. Essentially, close to 100% of data is retrieved and there is virtually no business impact. Business continues normally.
Multi-Site Topology Multi-site topology is an extension to site-to-site recovery topology. As shown in Figure 1-2, data center 2 typically is not part of the DWDM ring. Data Center 2 is used to back up data over longer distance. This topology typically supports business continuity as well as load sharing between data centers. It is possible to interconnect these data centers without the DWDM ring. In that case, the data replication between the data centers takes place at longer intervals. This interval is defined by business models and the tolerance for downtime. Typically if a high speed, high density medium is not used to connect data centers at the back end, the down time after a catastrophic event is several hours or days depending on the type of backup and recovery mechanisms used.
Figure 1-2   Multi-Site Architecture
[Figure: data centers 1, 2, and 3, each with an Internet edge and an internal network, connected to the Internet through service providers A and B. ONS 15xxx DWDM platforms carry GE, FC, and ESCON between sites over a DWDM ring.]
In a local server load balancing environment, scalability is achieved by deploying a server farm and front-ending that server farm with a content switch. Analogously, request routers view the data centers as geographically distributed servers. The applications are distributed across different data centers. Clients requesting connections to these applications are directed to different data centers based on certain criteria. This is referred to as a site selection method. Site selection methods include least loaded, round robin, preferred sites, and source IP hash. The data centers in this topology support an N-tier architecture. A metro ring helps with quick backup procedures, which include application-to-application recovery, database-to-database recovery, and storage-to-storage recovery. Due to the complexities involved in storage technologies, it is common to provide synchronous replication between two data centers and replicate the data to the third data center at a later time asynchronously.
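The four site selection methods named above can be sketched in a few lines each. This is an illustrative sketch only: the site names, load values, and client address are hypothetical, and the CRC-based hash is one possible choice rather than the method used by any specific request router.

```python
import itertools
import zlib

SITES = ["dc1", "dc2", "dc3"]   # hypothetical site names

def least_loaded(load_by_site):
    """Pick the site reporting the lowest load."""
    return min(load_by_site, key=load_by_site.get)

def round_robin(sites):
    """Return an iterator that hands out sites in turn."""
    return itertools.cycle(sites)

def preferred_site(preference_order, available):
    """Pick the first available site in a configured order of preference."""
    for site in preference_order:
        if site in available:
            return site
    return None

def source_ip_hash(client_ip, sites):
    """Map a client address to a site so the same client keeps reaching the same site."""
    return sites[zlib.crc32(client_ip.encode()) % len(sites)]

print(least_loaded({"dc1": 0.7, "dc2": 0.3, "dc3": 0.5}))
rr = round_robin(SITES)
print([next(rr) for _ in range(4)])
print(preferred_site(["dc1", "dc2"], available={"dc2", "dc3"}))
print(source_ip_hash("203.0.113.7", SITES))
```

Least loaded and round robin spread load, preferred sites gives a deterministic failover order, and source IP hash keeps a given client on one site, which helps when session state is not mirrored between data centers.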
Summary
Data has become a valuable corporate asset in the information age. Accessibility to this data around the clock enables businesses to compete effectively in today's rapidly changing business conditions. Building redundancy into the application environments helps keep the company's information available around the clock. The time spent recovering from disaster has a significant impact on business, which makes business continuance a critical solution. There exists statistical proof of the direct relationship between a successful business continuance plan and the general health of a business in the face of disaster. Likewise, ROI can be assessed based on the direct and indirect losses incurred by a critical application outage. With consideration given to the compelling events mentioned here, it is imperative that enterprises construct and maintain viable business continuance strategies that include distributed data centers.
CHAPTER 2
Inter Data Center Transport Technologies Companies embracing e-business applications must adopt strategies that keep application services up and running 24 x 7. They must also ensure that business critical information is protected from corruption and loss. Applications use databases and storage devices to store and retrieve data. This data has to be replicated to distributed data centers. This chapter explains the requirements for interconnecting distributed data centers and the different technologies available to interconnect data centers. Deploying distributed data centers helps achieve redundancy and high availability. Distributed data centers are the heart and soul of businesses which need to have application availability at all times. Latency limitations are dictated by the enterprise applications and data loss tolerance levels depend on business models. Application availability also implies access to data at all times. Most enterprise data is stored captive in servers or within a storage subsystem directly attached to the server. As the number of storage resources proliferate and the volume of stored data increases, this server-centric architecture is proving expensive to scale, complex to manage, and difficult to deliver with 24 x 7 availability. Centralizing the storage requirements has various advantages from a data management and a low cost of ownership perspective. Data can be backed up to tapes and replicated to remote storage easily. In addition to storage, the servers in the distributed data centers have a need to communicate with each other. Examples of these requirements are peer to peer communication between the N tier server farms or communication between the main frame computers located in different data centers to achieve redundancy or high availability. Once again, based on the hosts and the business continuance and disaster recovery strategies, the bandwidth requirements at the transport layer may be different.
Interconnecting Data Centers
Server Peer-to-Peer Communication
N-tier server farm architectures are becoming more popular because they provide enhanced scalability and eliminate the single point of failure. N-tier architecture also improves manageability by reducing hardware and software overhead. As part of the business continuance or disaster recovery strategy, these N tiers are deployed at both the primary and secondary data centers. In support of business continuance or disaster recovery, the N-tier server farms in the data centers need to replicate data between them. The amount of data that needs to be replicated depends on the applications and also on the number of transactions. The application's sensitivity to latency dictates the need for a high-speed, high-bandwidth transport layer between data centers. The transport technologies provide the necessary path for communication between the tiers. An example of a disaster recovery application is Data Guard from Oracle, which relies on the connectivity between the data centers to send messages, or database logs in
the form of messages, to the servers in the secondary data center. These logs can then be used for data recovery in the event of a catastrophic failure. In addition, many vendors support messaging or communication between the data centers. Mainframe applications also drive the need for a high-speed, high-bandwidth transport layer between data centers. A typical example is IBM’s Geographically Dispersed Parallel Sysplex (GDPS). IBM's Parallel Sysplex technology, hardware, and automation software provide customers with a high-availability solution in which multiple mainframe processors are tightly coupled, thus providing seamless failover at the application level. GDPS is almost always accompanied by multiple Enterprise Systems Connection (ESCON) or Fiber Connection (FICON) links.
• ESCON
– ESCON is a 200-Mbps unidirectional serial bit transmission protocol used to dynamically connect IBM or IBM-compatible mainframes with their various control units.
– ESCON provides nonblocking access through either point-to-point connections or high-speed switches, called ESCON Directors.
• FICON
– FICON is the next-generation bidirectional channel protocol used to connect mainframes directly with control units or ESCON aggregation switches (ESCON Directors with a bridge card).
– FICON runs over Fibre Channel at a data rate of 1.062 Gbps.
Data Replication and Networked Storage
In a disaster recovery or business continuance solution, data and state information at different server farm tiers are replicated to the secondary data center. When businesses adopt disaster recovery or business continuance solutions, typically two data centers (active/standby) are deployed. Data is replicated from the active data center to the standby data center. In other words, the storage is networked together to provide data backup and recovery. Networked storage holds the promise of reducing the cost and complexity associated with delivering highly available and scalable storage services. This networked model, termed storage networking, is best described as the software and hardware that enable storage consolidation, sharing, accessibility, and management over a networked infrastructure. Storage networks are deployed as both storage area networks (SANs), which provide block-based access to shared disk, and network attached storage (NAS), which provides file-based access to shared storage. File access builds on the block access framework to provide an abstraction layer whereby data blocks are logically arranged and manipulated as objects called files.
Figure 2-1   Storage Area Networking (NAS and SAN)
[Figure: servers connected to NAS, tape, and disk arrays or network storage through a storage network and an FC switch.]
Figure 2-1 shows both NAS and SAN networks. When it comes to data replication or mirroring, SAN and NAS have different requirements. NAS supports file access. Servers in a NAS environment connect to file servers, normally referred to as filers, through a gigabit switch. NAS is easy to deploy. In a SAN environment, by contrast, the servers connect to the SAN through Fibre Channel switches. Both tapes and storage are connected to the SAN. There are primarily two data replication techniques: synchronous and asynchronous replication. The choice of one over the other depends on the tolerance limits for data loss. Several vendors provide applications for data replication. Some of these are listed in Table 2-1.

Table 2-1   Storage Application Vendors

Storage Vendor   Synchronous Replication Application   Asynchronous Replication Application
EMC              SRDF                                  SRDF
IBM              PPRC                                  XRC
Hitachi          TrueCopy                              TrueCopy
Compaq/HP        DRM                                   DRM
Synchronous Replication Enterprises use synchronous replication when business continuity requirements dictate multiple synchronized copies of the data at multiple sites or data centers. Every write to disk is synchronously replicated across the network to the storage array in the standby data center. The synchronous application which resides on the intelligent controller waits for both disk drives to complete writing data before it returns an acknowledgement to the I/O requestor or initiator. This is shown in Figure 2-2.
Figure 2-2   Synchronous Replication Between Data Centers
[Figure: synchronous disk mirroring. Server disk I/O goes to the local storage system in the production data center, which mirrors each write to the backup storage system in the backup data center. Required in low latency network infrastructures.]
1. Write to disk, wait for acknowledgement
2. Copy write I/O to remote storage system
3. Remote acknowledges copy complete
4. Local acknowledges write to host
Synchronous replication is optimized for local high-speed (low-latency) connections or for a metro optical network, providing minimal risk to data integrity. Because the replication is essentially real-time, data integrity here means having accurate, up-to-the-second data at the standby data center in the event of a fault in the primary data center. As a rule, synchronous replication is suitable when the distance between data centers is no more than 100 km, with the optimal distance about 50 km. Longer distance results in longer latency. Latency is the key determinant of throughput, and it also affects the transaction rate of the host or server.
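The link between distance, latency, and transaction rate can be made concrete with a rough calculation. The sketch below assumes a one-way propagation delay of about 5 microseconds per kilometer of optical fiber, at least one round trip per synchronous write, and a hypothetical local write service time of 0.5 ms. It ignores equipment and protocol delays, so real figures would be lower.

```python
FIBER_DELAY_US_PER_KM = 5.0   # approximate one-way propagation delay in optical fiber

def round_trip_ms(distance_km):
    """Propagation delay to the remote site and back, in milliseconds."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0

def max_serial_writes_per_sec(distance_km, local_write_ms=0.5):
    """Upper bound for one stream of back-to-back synchronous writes.

    local_write_ms is an assumed storage service time, not a measured value.
    """
    return 1000.0 / (local_write_ms + round_trip_ms(distance_km))

for km in (10, 50, 100, 500):
    print(km, "km:", round_trip_ms(km), "ms round trip,",
          round(max_serial_writes_per_sec(km)), "writes/s at most")
```

Under these assumptions the added delay doubles between 50 km and 100 km, and at several hundred kilometers the propagation delay dominates the write time, which is consistent with the distance guidance above.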
Asynchronous Replication
Asynchronous replication is used when disaster recovery or business continuance requirements are not stringent and the business can tolerate a certain amount of data loss. The tolerance levels dictate the frequency at which the data is replicated to the remote data center. With asynchronous replication, data is replicated at regular intervals that are based on the business tolerance levels. In this case, the intelligent controller responds to the client with an acknowledgement as soon as the data is written to the local disk. The standby site’s disk is updated at a later time, and this update happens at regular intervals.
Figure 2-3   Asynchronous Replication
[Figure: asynchronous disk mirroring. Server disk I/O goes to the local storage system in the production data center, which copies writes to the backup storage system in the backup data center. Required in high latency network infrastructures.]
1. Write to disk, wait for acknowledgement
2. Local acknowledges write to host
3. Copy write I/O to remote storage system
4. Remote acknowledges copy complete
Asynchronous replication is typically used for lower speed remote connections. For example, a data center in California may use asynchronous replication to a data center in Kansas City. The asynchronous replication application sends I/O requests to the remote site as they are received without first waiting for acknowledgements for the previous I/O operation. Asynchronous replication is not as delay sensitive as synchronous and therefore is more suited for relatively lower speed remote connections. Protocols such as Fibre Channel over IP (FCIP) may be used for asynchronous replication. There is a risk of compromising data integrity with asynchronous replication. However, it is a viable option when a customer is using a Disaster Recovery site to back up the primary data center.
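The difference between the two replication modes comes down to when the host receives its acknowledgement. The following sketch replays the step order from Figures 2-2 and 2-3 with plain Python lists standing in for storage systems. The function names are hypothetical, and this is a simplified model rather than any vendor's replication logic.

```python
def synchronous_write(block, local, remote, log):
    """The host is acknowledged only after the remote copy is complete."""
    local.append(block)
    log.append("write to local disk")
    remote.append(block)
    log.append("copy write I/O to remote storage system")
    log.append("remote acknowledges copy complete")
    log.append("local acknowledges write to host")

def asynchronous_write(block, local, pending, log):
    """The host is acknowledged right after the local write; the copy is deferred."""
    local.append(block)
    log.append("write to local disk")
    log.append("local acknowledges write to host")
    pending.append(block)

def drain(pending, remote, log):
    """Ship queued writes to the remote site at the next replication interval."""
    while pending:
        remote.append(pending.pop(0))
        log.append("copy write I/O to remote storage system")
        log.append("remote acknowledges copy complete")

local, remote, pending, log = [], [], [], []
for block in ("b1", "b2", "b3"):
    asynchronous_write(block, local, pending, log)
print("exposed if the primary fails now:", pending)
drain(pending, remote, log)
print("remote copy after the interval:", remote)
```

Anything still in the pending queue when the primary site fails is the data that the business must be prepared to lose with asynchronous replication.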
Transport Technologies
A business's tolerance for data loss and an application's sensitivity to latency and bandwidth dictate whether distributed data centers are deployed in a LAN, MAN, or WAN environment. The deployment of active and standby data centers falls into one of these three transport environments. The transport technologies provide the connectivity between the data centers. In LAN and MAN environments, the back-end connectivity is provided by high-speed, high-bandwidth transport technologies. Back-end connectivity refers to the interconnection between the data centers and is used primarily for application server and database server peer-to-peer communication and for storage applications. Table 2-2 provides an overview of different transport technologies and the applications that benefit from them. For more details on transport technologies, refer to the Cisco Storage Networking Solutions area of Cisco.com (http://www.cisco.com/en/US/netsol/ns110/ns258/ns259/ns261/networking_solutions_package.html).
Table 2-2   Applications of Transport Technology
GigEthernet (Campus/LAN)
CWDM (Campus/LAN)
DWDM (Metro/MAN)
SONET (WAN)
PoS (WAN)
DPT (WAN)
Disaster Recovery
X
X
X
X
X
X
X
X
X
X
X
Storage Distributed Data Centers
X
X
X
Storage Consolidation
X
X
X
Main Frame Resilience
X
LAN
In a LAN or campus environment, gigabit links and gigabit switches can provide the required back-end connectivity for the servers. ATM is another alternative for back-end connectivity, at speeds such as OC-3 and OC-12. The need for ATM has diminished with the availability of Gigabit technologies in LAN environments. Fibre Channel carries the storage protocols, such as block access protocols, between data centers connected by Fibre Channel switches. Block access protocols typically provide high data throughput. For back-end connectivity between data centers in campus environments, Coarse Wavelength Division Multiplexing (CWDM) is a useful technology. CWDM is capable of multiplexing Fibre Channel and GE links.
Note: There currently are no firm availability dates for FC over CWDM.
MAN
The business continuance applications and associated storage protocols such as Fibre Channel make use of high-speed I/O operations and require a fault-tolerant, high-bandwidth, low-latency network. The transport layers can have a significant impact on the application's performance. To support business continuance in mainframe environments, technologies such as FICON and ESCON are used for high-speed I/O. In addition, MAN transport technologies can carry the server-to-server messaging (asynchronous messaging) and asynchronous data replication traffic. Multiple wavelengths carrying disparate traffic types can be multiplexed together onto a single fiber pair to provide multi-service transport. In general, the technique of multiplexing different wavelengths over a single fiber is known as wavelength division multiplexing (WDM). If there are insufficient fiber runs between sites, or if optimizing the fiber runs is an objective, DWDM and CWDM are the transport technologies to use. Currently, the optimal transport technology for synchronous replication between data centers is DWDM. FICON and ESCON, which are used in mainframe environments for high-speed I/O, can also be multiplexed onto DWDM, which allows greater distances between data centers without compromising application requirements. In addition, as mentioned above, DWDM can support mainframe resiliency (IBM’s GDPS) between data centers.
Figure 2-4   Data Centers Connected by Metro DWDM
[Figure: two data centers, each with a data center front end, servers, storage devices with block access over a Fibre Channel network, and a Cisco optical switch. The optical switches extend GE, FC, and ESCON across the metro area over metro DWDM.]
In a MAN environment, in addition to DWDM and SONET, Dynamic Packet Transport (DPT)/ Resilient Packet Ring (RPR) can also provide the transport layer for back end connectivity between data centers.
WAN
When businesses can tolerate some data loss, the data centers can be located at geographically distant locations connected by a wide area network (WAN). Typically, asynchronous replication of data is done over WAN connections. Disaster recovery applications make use of the existing WAN infrastructure for asynchronous messaging and asynchronous data replication between data centers. Storage replication can be done over WAN infrastructures using FCIP. A WAN is a data communications network that covers a relatively broad geographic area and often uses transmission facilities provided by common carriers, such as telephone companies. A point-to-point link provides a single, preestablished WAN communications path from the customer premises through a carrier network, such as a telephone company, to a remote network. Point-to-point links are also known as leased lines. The carrier company reserves point-to-point links for the private use of the customer. These links accommodate two types of transmissions: datagram transmissions, which are composed of individually addressed frames, and data-stream transmissions, which are composed of a stream of data for which address checking occurs only once. ATM, Frame Relay, Switched Multimegabit Data Service (SMDS), and X.25 are all examples of packet-switched WAN technologies and also represent a conventional view of WAN.
With the new generation of WAN infrastructure, the communication between data centers can make use of higher-bandwidth, low-latency long-haul networks to support disaster recovery and server-to-server communications between data centers. In a WAN environment, businesses typically buy high-bandwidth services from their service providers and connect to those providers using MAN transport technologies. The options available for WAN are leased lines using SONET/SDH channels, and WAN services like ATM. Frequently used SONET transmission levels are OC-3 (155 Mbps), OC-12 (622 Mbps), and OC-48 (2488 Mbps or 2.4 Gbps). SONET/SDH has been most successfully used for high-speed IP transport in wide area networking (WAN) applications. In most WAN applications today, routers with PoS interfaces are connected to carrier SONET rings through add/drop multiplexers (ADMs), terminating point-to-point SONET/SDH links.
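The SONET transmission levels quoted above are all multiples of the OC-1 base rate of 51.84 Mbps, so the line rate of any OC-n level can be derived directly. The short sketch below shows the arithmetic; the function name is illustrative only.

```python
OC1_MBPS = 51.84   # SONET base line rate (OC-1)

def oc_rate_mbps(n):
    """Line rate of an OC-n level in Mbps."""
    return n * OC1_MBPS

for n in (3, 12, 48):
    print(f"OC-{n}: {oc_rate_mbps(n):.2f} Mbps")
```

These are line rates. The bandwidth available to replication traffic is somewhat lower once SONET overhead is subtracted.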
CHAPTER 3
Site Selection Technologies Several technologies make up a complete site-to-site recovery and multi-site load distribution solution. In a client to server communication, the client looks for the IP address of the server before communicating with the server. When the server is found, the client communicates with the server and completes a transaction. This transaction data is stored in the data center. The technology that deals with routing the client to the appropriate server is at the front end of the data centers. In a distributed data center environment, the end users have to be routed to the data center where the applications are active. The technology that is at the front end of distributed data centers is called request routing.
Request Routing
Most applications use some form of address resolution to get the IP address of the servers with which they communicate. Examples of applications that use address resolution to communicate with servers or hosts are Web browsers, telnet, and thin clients on users' desktops. Once an IP address is obtained, these applications connect to the servers in a secure or non-secure way, based on the application requirements, to carry out the transaction. Address resolution can be extended to include server health tracking. Tracking server health allows the address resolution mechanism to select the best server to handle client requests and adds high availability to the solution. In a distributed data center environment, where redundant servers that serve the same purpose are deployed at geographically distant data centers, the clients can be directed to the appropriate data center during the address resolution process. This method of directing clients to the appropriate server by keeping track of server health is called request routing. There are three methods of request routing to connect clients to the appropriate data center:
• DNS-based request routing
• HTTP redirection
• Route Health Injection (RHI)
DNS Based Request Routing
The first solution, depicted in Figure 3-1, is based on DNS. Normally, the first step when connecting to a server is resolving the domain name to an IP address. The client’s resolution process becomes a DNS query to the local DNS server, which then actively iterates over the DNS server hierarchy on the Internet/Intranet until it reaches the target DNS server. The target DNS server finally issues the IP address.
Figure 3-1 Basic DNS Operation
[Diagram: a client, a DNS proxy, the root DNS, the root DNS for .com, the authoritative DNS for foo.com, the authoritative DNS for www.foo.com ("www.foo.com = 208.10.4.17"), and the web server (IP = 208.10.4.17). Numbered arrows 1 through 7 correspond to the steps below.]
1. Client requests to resolve www.foo.com
2. DNS proxy sends a request to the root DNS; root DNS responds with the address of the root DNS for .com
3. DNS proxy queries the root DNS for .com; response comes back with the IP address of the authoritative DNS for foo.com
4. DNS proxy queries the authoritative DNS for foo.com; response comes back with the IP address of the authoritative DNS for www.foo.com
5. DNS proxy queries the authoritative DNS for www.foo.com; response comes back with the IP address of the web server
6. DNS proxy responds to the client with the IP address of the web server
7. Client establishes a connection with the web server
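The referral chain in steps 2 through 5 can be sketched as a small in-memory model. This is only an illustration of the DNS proxy following referrals, not a DNS implementation: the server labels (root, com-dns, and so on) and the table layout are assumptions made for the example, while the name and address come from Figure 3-1.

```python
# Toy model of the name-server hierarchy in Figure 3-1.
# Each server maps a query name either to a referral ("NS", next server)
# or to a final answer ("A", IP address). Server labels are hypothetical.
SERVERS = {
    "root": {"www.foo.com": ("NS", "com-dns")},
    "com-dns": {"www.foo.com": ("NS", "foo-dns")},
    "foo-dns": {"www.foo.com": ("NS", "www-foo-dns")},
    "www-foo-dns": {"www.foo.com": ("A", "208.10.4.17")},
}

def resolve_iteratively(name, start="root", max_hops=10):
    """Follow referrals as the DNS proxy does; return (ip, servers_queried)."""
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        record = SERVERS[server].get(name)
        if record is None:
            raise LookupError(f"{server} has no data for {name}")
        kind, value = record
        if kind == "A":
            return value, path
        server = value  # referral: ask the next name server
    raise LookupError("too many referrals")

ip, path = resolve_iteratively("www.foo.com")
print(ip, path)
```

Each loop iteration corresponds to one query from the DNS proxy; the client itself sees only the final address.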
At its most basic level, the DNS provides a distributed database of name-to-address mappings spread across a hierarchy of domains and sub-domains, with each domain administered independently by an authoritative name server. Name servers store the mapping of names to addresses in resource records. Each record keeps an associated time to live (TTL) field that determines how long the entry is cached by other name servers. Name servers implement iterative or recursive queries.
• Iterative queries return either an answer to the query from the name server's local database (A-record), or a referral to another name server that is able to answer the query (NS-record).
• Recursive queries return a final answer (A-record), querying all other name servers necessary to resolve the name.
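The TTL behavior described above can be illustrated with a minimal cache sketch. The class and the injected clock are assumptions made so the example can be run without waiting; they do not represent any particular name server implementation.

```python
import time

class TtlCache:
    """Cache name-to-address answers until their TTL expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh query
            return None
        return address

# Demonstration with a fake clock so that time can be advanced by hand.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("www.foo.com", "208.10.4.17", ttl_seconds=20)
first = cache.get("www.foo.com")   # still cached
now[0] = 21.0
second = cache.get("www.foo.com")  # TTL elapsed, entry dropped
print(first, second)
```

A cached answer is served until its TTL elapses, so a shorter TTL means a change of data center becomes visible to clients sooner, at the cost of more queries.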
Most name servers within the hierarchy send and accept only iterative queries. Local name servers, however, typically accept recursive queries from clients. Recursive queries place most of the burden of resolution on a single name server. In recursion, a client resolver sends a recursive query to a name server for information about a particular domain name. The queried name server is then obliged to respond with the requested data, or with an error indicating that data of the requested type or the domain name does not exist. Because the query was recursive, the name server cannot refer the querier to a different name server. If the queried name server is not authoritative for the data requested, it must query other name servers for the answer. It could send recursive queries to those name servers, thereby obliging them to find the answer and return it (and passing the buck). Alternatively, the DNS proxy could send iterative queries and be referred to other name servers for the name it is trying to locate. Current implementations tend to be polite and do the latter, following the referrals until an answer is found.
Iterative resolution, on the other hand, does not require nearly as much of the queried name server. In iterative resolution, a name server simply gives the best answer it already knows back to the querier. No additional querying is required. The queried name server consults its local data, including its cache, looking for the requested data. If it does not find the data, it makes its best attempt to give the querier data that helps it continue the resolution process. Usually these are the names and addresses of other name servers. In iterative resolution, a client’s resolver queries a local name server, which then queries a number of other name servers in pursuit of an answer for the resolver. Each name server it queries refers it to another name server further down the DNS name space and closer to the data sought. Finally, the local name server queries the name server authoritative for the data requested, which returns an answer.
HTTP Redirection
Many applications available today have a browser front end. Browsers have built-in support for HTTP redirection, so they can communicate with a secondary server if the primary servers are out of service. In HTTP redirection, the client goes through the address resolution process once. If the primary server is not accessible, the client is redirected to a secondary server without having to repeat the address resolution process. Typically, HTTP redirection works as follows. HTTP has a mechanism for redirecting a user to a new location. This document refers to it as HTTP-Redirection or HTTP-307, after status code 307 (Temporary Redirect), one of the HTTP return codes for redirection. The client, after resolving the IP address of the server, establishes a TCP session with the server. The server parses the first HTTP GET request. The server now has visibility of the actual content being requested and of the client’s IP address. If redirection is required, the server issues an HTTP redirect (307) to the client and sends the client to the site that has the exact content requested. The client then establishes a TCP session with the new host and requests the actual content. The HTTP redirection mechanism is depicted in Figure 3-2.
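The server-side decision described above can be sketched with Python's standard http.server module. This is a minimal illustration under stated assumptions: the content map and the choice of /video/ as redirected content are hypothetical, and the host names are the ones that appear in Figure 3-2.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical content map: which site holds which path prefix.
CONTENT_SITES = {"/video/": "www1.cisco.com"}
LOCAL_SITE = "www.cisco.com"

def choose_response(path):
    """Return (status, headers) for a GET of `path` received at LOCAL_SITE."""
    for prefix, site in CONTENT_SITES.items():
        if path.startswith(prefix) and site != LOCAL_SITE:
            # 307 Temporary Redirect: the client repeats the GET at the new site.
            return 307, {"Location": f"http://{site}{path}"}
    return 200, {}

class RedirectingHandler(BaseHTTPRequestHandler):
    """Handler that applies choose_response to each GET request."""

    def do_GET(self):
        status, headers = choose_response(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", "0")
        self.end_headers()

print(choose_response("/video/demo.mpg"))
print(choose_response("/index.html"))
```

The handler can be served with http.server.HTTPServer; the decision logic is kept in a plain function so that it can be checked without opening a socket.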
Figure 3-2 Basic Operation of HTTP Redirect
[Diagram: message exchange between a client, www.cisco.com, and www1.cisco.com.]
1. The client's request to DNS resolves www.cisco.com to the IP address of the server.
2. The client sends "GET / HTTP/1.1, Host: www.cisco.com" to www.cisco.com, which answers "HTTP/1.1 307 Temporary Redirect, Location: www1.cisco.com".
3. The client sends "GET / HTTP/1.1, Host: www1.cisco.com" to www1.cisco.com, which answers "HTTP/1.1 200 OK". The client talks to www1.cisco.com for the remainder of the session.
The advantages of HTTP-Redirection are:
• Visibility into the content being requested.
• Visibility of the client’s IP address, which helps in choosing the best site for the client in multi-site load distribution.
The disadvantages of HTTP-Redirection are:
• For redirection to work, the client always has to go to the main site first and then be redirected to an alternate site.
• Bookmarking issues arise because a user can bookmark a particular site rather than the global http://www.foo.com site, thus bypassing the request routing system.
• HTTP redirects work only for HTTP traffic. Applications that do not have browser front ends do not support HTTP redirection.
Route Health Injection
RHI is a mechanism that allows the same IP address to be used at two different data centers. The same IP address (host route) can be advertised from each data center with a different metric. The upstream routers see both routes and insert the route with the better metric into their routing tables. When RHI is enabled on a device, the device injects a static route into its routing table when a virtual IP address (VIP) becomes available, and withdraws the static route when the VIP is no longer active. If the device fails, the upstream routers use the alternate route to reach the servers, thereby providing high availability. Note that the device advertises the host routes only if the server is healthy.
Note
Most routers do not propagate host-route information to the Internet. Therefore, RHI, because it advertises host routes, is normally restricted to intranets.
The same IP address can also be advertised from a different location, called the secondary location, but with a different metric. The mechanism is exactly the same as in the previous case; the only difference is the metric that is used.
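The effect of RHI on an upstream router can be sketched as a route selection over the advertised host routes. The data structure, the VIP address, and the metric values below are illustrative assumptions; the sketch models only the behavior described above (advertise while healthy, prefer the better metric), not a routing protocol.

```python
from dataclasses import dataclass

@dataclass
class HostRoute:
    prefix: str        # host route for the VIP (hypothetical address)
    next_hop: str      # data center advertising the route
    metric: int        # lower is better
    vip_healthy: bool  # the route is advertised only while the VIP is healthy

def best_route(routes, prefix):
    """Pick the lowest-metric route among those still being advertised."""
    advertised = [r for r in routes if r.prefix == prefix and r.vip_healthy]
    return min(advertised, key=lambda r: r.metric, default=None)

routes = [
    HostRoute("10.1.1.10/32", "primary-dc", metric=10, vip_healthy=True),
    HostRoute("10.1.1.10/32", "secondary-dc", metric=100, vip_healthy=True),
]
before = best_route(routes, "10.1.1.10/32").next_hop
routes[0].vip_healthy = False  # primary VIP fails: its host route is withdrawn
after = best_route(routes, "10.1.1.10/32").next_hop
print(before, "->", after)
```

Because only one route is installed at a time, traffic moves as a whole from one data center to the other, which matches the active/standby use of RHI.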
The advantages of RHI are:
• Quick convergence (IGP convergence)
• Self regulated, no dependency on external content routing devices
• Ideal for business continuance and disaster recovery solutions
• Single IP address
The disadvantages of RHI are:
• Number of host routes grows with number of critical applications
• Cannot do route summarization
• Used for intranet applications only
• Cannot be used for site-to-site load balancing because the routing table has only one entry. Typically it is used only for active/standby configurations.
Supporting Platforms
Cisco has various products that support request routing for distributed data centers. Each product has different capabilities. All the supporting products are described below.
• Global Site Selector (GSS 4480)
• Distributed Director
• Distributed Director and IOS SLB/DFP
• WebNS & Global Server Load Balancing
• Content Switching Module for Cat6K platforms
Global Site Selector
The Cisco GSS 4480 load balances distributed data centers. GSS interoperates with server load balancing products like the Cisco CSS 11000 and CSS 11500 Content Services Switch and the Content Switching Module (CSM) for the Cisco Catalyst® 6500 Series switches. The Cisco GSS 4480 product delivers the following key capabilities:
• Provides a scalable, dedicated hardware platform for Cisco's content switches to ensure applications are always available, by detecting site outages or site congestion
• Improves global data center or site selection process by using different site selection algorithms
• Complements existing DNS infrastructure by providing centralized sub-domain management
The Cisco GSS 4480 allows businesses to deploy internet and intranet applications by directing clients to a standby data center if a primary data-center outage occurs. The Cisco GSS 4480 continuously monitors the load and health of the server load balancing devices at multiple data centers and can redirect clients to a data center with least load. The load conditions are user defined at each data center.
Key Features and Benefits
• Offers site persistence for e-commerce applications
• Provides architecture critical for disaster recovery and multi-site deployments
• Provides centralized command and control of DNS resolution process
• Provides dedicated processing of DNS requests for greater performance and scalability
• Offers DNS race feature. The Cisco GSS 4480 can in real time direct clients to the closest data center based on round trip time (RTT) between the local DNS and the multiple sites.
• Supports a web-based graphical user interface (GUI) and wizard to simplify the configuration
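The idea behind RTT-based selection can be sketched as picking the site with the lowest measured round trip time to the local DNS. The site names and RTT values are invented for the example, and the sketch does not model how the GSS actually collects the measurements.

```python
def closest_site(rtt_ms_by_site):
    """Return the site with the lowest measured RTT; ignore sites with no reply."""
    measured = {site: rtt for site, rtt in rtt_ms_by_site.items() if rtt is not None}
    if not measured:
        raise RuntimeError("no site answered the probe")
    return min(measured, key=measured.get)

# Hypothetical RTTs (in milliseconds) between the local DNS and each site.
rtts = {"dc-east": 42.0, "dc-west": 18.5, "dc-eu": None}
print(closest_site(rtts))
```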
Distributed Director
The Cisco Distributed Director is a DNS-based technology that load balances traffic between multiple geographically or topologically separate data centers. Its goal is to get a client to the best data center. It accomplishes this by communicating with other Cisco routers on the network through a protocol called Director Response Protocol (DRP).
Without Distributed Director, end users connect to different servers in a cyclical pattern, using round robin DNS. Because DNS knows nothing about network topology or server availability, end users can be connected to geographically distant or unavailable servers, resulting in poor access performance and increased transmission costs. In addition, because round robin DNS distributes services to servers in a cyclical pattern, it treats all servers as equal. As a result, less powerful servers become oversubscribed, while larger (and more expensive) servers remain underutilized.
You can use Distributed Director to distribute any IP service, whether TCP-based or UDP-based. Using intelligence such as the capability to monitor server health, Distributed Director transparently directs end-user service requests to the best server as determined by client-to-server topological proximity and/or client-to-server link latency (round-trip times). The result is better access performance as seen by the end user and reduced transmission costs in dial-on-demand routing environments. The Cisco Distributed Director performs load distribution in a manner that takes server availability, relative client-to-server topological proximities (“distances”), and client-to-server link latency into account to determine the “best” server. This means that users need only a single sub-domain name or Uniform Resource Locator (URL)-embedded hostname to access a distributed set of servers, which eliminates the need for end users to choose a server from a list of possible sites.
The Cisco Distributed Director leverages the intelligence in the network to automatically, dynamically, and efficiently pick the “best” server for the user. The Distributed Director uses the Director Response Protocol (DRP), a simple UDP-based application developed by Cisco Systems, to perform the following two tasks:
• Query DRP server agents in the field for BGP and IGP routing table metrics between the distributed servers and clients to determine client-to-server topological proximities. These DRP metrics are discussed in detail later in this document.
• Query DRP server agents in the field for client-to-server link latency metrics. Use of the round-trip time DRP metric is discussed in detail later in this document.
Figure 3-3 provides an explanation of basic operation of a Distributed Director.
Figure 3-3 Distributed Director Basic Operation
[Diagram: two web servers, a Distributed Director, DRP agents, a local DNS, and a client. Numbered arrows 1 through 4 correspond to the steps below.]
1. The Distributed Director (DD) probes for server health and is aware of the state of the servers
2. Client requests to resolve www.foo.com
3. Local DNS server performs the iterative DNS query; in the process, the DD consults the DRP agents, based on configuration, and responds with the IP address
4. The client connects to the server to complete the transaction
Distributed Director, IOS-SLB, and Dynamic Feedback Protocol
Distributed Director, as mentioned in the previous section, uses various metrics to steer clients to the appropriate global server farm or distributed data center. IOS-SLB running Dynamic Feedback Protocol (DFP) provides additional attributes that the Distributed Director environment uses to decide which site can best handle the client’s request. The Cisco IOS server load balancing (SLB) feature is a Cisco IOS software solution that provides the server load-balancing function. DFP is a mechanism by which servers and load balancing devices provide feedback to Distributed Director on the load conditions of the device. The Distributed Director uses this feedback to intelligently forward clients to the least loaded data centers.
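The use of load feedback can be sketched as choosing the least loaded site among those that are available. The report format below (an availability flag and a load percentage per site) is a hypothetical simplification for illustration and is not the DFP message format.

```python
def least_loaded_site(reports):
    """reports maps site -> (is_up, load_percent); return the least loaded site that is up."""
    available = {site: load for site, (is_up, load) in reports.items() if is_up}
    if not available:
        return None  # no site can take the request
    return min(available, key=available.get)

# Hypothetical feedback from three sites; the lab site is down and is ignored.
reports = {
    "primary-dc": (True, 80),
    "secondary-dc": (True, 35),
    "lab-dc": (False, 5),
}
chosen = least_loaded_site(reports)
print(chosen)
```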
WebNS and Global Server Load Balancing
The Cisco 11000 series Content Services Switch (CSS) provides both global server load balancing (GSLB) and network proximity methods for content request distribution across multiple sites.
The Cisco 11000 series CSS is capable of GSLB of content requests across multiple sites, using content intelligence to distribute the requests according to what is being requested and where the content is available. Network proximity is an enhanced version of GSLB that selects the closest or most proximate web site based on measurements of round-trip time to the content consumer’s location. Network proximity naturally provides a high degree of global persistence, because the proximity calculation is typically identical for all requests from a given location (local DNS) as long as the network topology remains constant. WebNS also provides a scalable solution for sticky site selection without sacrificing proximity or GSLB. In this enhanced version, the sticky database allows the network administrator to configure how long a D-proxy remains sticky. The TTL value ranges from minutes to days. Figure 3-4 explains the basic operation of GSLB using the Content Services Switch.
Figure 3-4 Basic Operation of GSLB Using Content Services Switch
[Diagram: two Content Services Switches, each in front of a web server and connected to each other by a TCP session, a local DNS, and a client. Numbered arrows 1 through 4 correspond to the steps below.]
1. The CSSs probe for server health, are aware of the state of the servers, and exchange server availability information over the TCP session
2. Client requests to resolve www.foo.com
3. Local DNS server performs the iterative DNS query; in the process, the CSS responds with the IP address based on configuration
4. The client connects to the server to complete the transaction
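The sticky behavior described for the sticky database can be sketched as a table keyed by D-proxy address with a configurable lifetime. The class, the fallback policy, and the addresses are assumptions for illustration; this is not the WebNS implementation.

```python
import itertools

class StickySelector:
    """Return the same site for a D-proxy until its sticky entry expires."""

    def __init__(self, choose_site, sticky_seconds):
        self._choose_site = choose_site       # fallback policy (proximity, GSLB, ...)
        self._sticky_seconds = sticky_seconds
        self._entries = {}                    # d_proxy_ip -> (site, expiry time)

    def select(self, d_proxy_ip, now):
        entry = self._entries.get(d_proxy_ip)
        if entry and now < entry[1]:
            return entry[0]                   # still sticky: reuse the earlier answer
        site = self._choose_site(d_proxy_ip)
        self._entries[d_proxy_ip] = (site, now + self._sticky_seconds)
        return site

# Hypothetical fallback policy that alternates between two sites.
sites = itertools.cycle(["dc-a", "dc-b"])
selector = StickySelector(lambda d_proxy_ip: next(sites), sticky_seconds=300)
first = selector.select("192.0.2.53", now=0)
second = selector.select("192.0.2.53", now=100)  # inside the sticky window
third = selector.select("192.0.2.53", now=400)   # window expired, policy is consulted again
print(first, second, third)
```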
Content Switching Module for Catalyst 6000
The Cisco CSM integrates advanced Layer 4-7 content switching into the Cisco Catalyst 6500 Series or Cisco 7600 Series Internet Router. The Cisco CSM provides high-performance, high-availability load balancing, while taking advantage of the complete set of Layer 2, Layer 3, and QoS features inherent to the platform. The CSM communicates directly with Cisco Distributed Director for use in GSLB, supports DFP, and also supports the RHI feature. Figure 3-5 provides an overview of how route health injection works using the CSM. When RHI is enabled on the CSM, the CSM injects a static route into the MSFC’s routing table. This route, in turn, is redistributed by the MSFC.
Figure 3-5 RHI with the CSM
[Diagram: two Catalyst 6000 switches with Content Switching Modules, each in front of a web server, a local DNS, and a client. Numbered arrows 1 through 5 correspond to the steps below.]
1. The CSMs probe for server health and, if the servers are available, install a static route in the MSFC routing table. The route is advertised with different metrics from the two Cat6ks (the same IP address is advertised with different metrics from two locations)
2. The host routes are propagated to the upstream routers, which use the route with the best metric
3. Client requests to resolve www.foo.com
4. Local DNS server performs the iterative DNS query and responds with an IP address
5. The client connects to the server on the right because its route is advertised with a better metric
C H A P T E R
4
Site to Site Recovery
The productivity gain, cost savings, and business revenue realized by deploying a data center that is available 24x7 are substantial. As enterprises continue to move towards web-enabled applications, it has become increasingly important to have highly available data centers that support mission critical applications. You can achieve redundancy and application high availability by deploying multiple data centers and distributing applications across those data centers. This design document focuses on the design and deployment of distributed data centers for disaster recovery and business continuity. There are two main benefits to deploying distributed data centers:
• Data center recovery (site recovery) that typically sustains disaster recovery and business continuance plans
• Increased application availability and scalability resulting from load distribution between multiple data centers
The downtime experienced by a data center is closely related to the revenue and productivity loss of an enterprise. You can minimize downtime and assure business continuance by deploying distributed data centers and distributing the business applications and databases. When one of the primary data centers goes out of service, the standby data center supports business critical applications thereby providing “Business Resilience” to enterprises. Business resilience is one benefit of distributed data centers. Other benefits include application scalability, high availability and an improved end user experience. This design guide addresses the issue of how to route the end users to the available secondary data center when a catastrophic event occurs at the primary data center. This process of getting access to the application after a catastrophic failure is referred to as regaining application access.
Site-to-Site Recovery Topology
Disaster recovery solutions involve at least two data centers: a primary and a secondary. The disaster recovery solution topology is shown in Figure 4-1.
Figure 4-1 Disaster Recovery Solution Topology
[Diagram: a primary data center and a secondary data center, each reached from the Internet through service providers A and B and from the internal network. Each data center shows ACLs, firewalls, a DNS proxy, a cache, an L2-L7 switch, and IDSs in front of server farms organized into a front-end layer, an application layer, and a back-end layer.]
Deploy the primary and secondary data centers in one of the following two modes:
• Warm standby
• Hot standby
In warm standby solutions, the primary data center is the active data center receiving all client traffic. The secondary data center is the standby data center and receives no client traffic although it does host applications.
In hot standby solutions, the primary and secondary data centers share the load, with the secondary data center processing some traffic. For example, applications A, B, and C are active at the primary data center and applications X and Y are active at the secondary data center. Deploying distributed data centers in hot standby mode makes good use of the bandwidth and resources available at the secondary data center. Regardless of the deployment mode chosen, each data center must have enough spare capacity that it is not overloaded if the other suffers a catastrophic failure. Mission critical applications, which the business needs to operate normally, must be supported at all times, so the secondary data center must be capable of supporting all of them in the event of a disaster.
Several products in Cisco’s product portfolio support disaster recovery deployments. This document reviews the different topologies using the various platforms. The first disaster recovery topology is depicted in Figure 4-2. The topologies that follow are similar except for the request routing device and how it is deployed, so a generic explanation is provided first. In a distributed data center environment, the aggregation switches connect to the core routers. They are called aggregation switches because all the services within the data center, such as content switching, content routing, intrusion detection, and firewalls, are supported on them. The core routers connect to the Internet edge routers through the firewalls. The private WAN connects to the core routers. The core routers also connect to other core routers within the campus to provide campus access. A DNS proxy always resolves the DNS requests from the clients; the clients use the default DNS address when they log in.
Returning to the specific topology depicted in Figure 4-2, a request routing device that serves as the authoritative DNS for the sub-domains is deployed inside the firewall in the data center. The request routing device monitors application health at both data centers and directs end users to the preferred data center, the primary data center. Typically, applications hosted at the data center are reached through a virtual IP address and its associated port number. The content switch distributes connection requests made to these virtual addresses among the available servers in the server farm.
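Figure 4-2 Disaster Recovery Solution Using Distributed Director
[Diagram: a user community of clients with a DNS proxy reaches two data centers over the Internet (ISP1, ISP2) and the internal network. Each data center shows a data center edge, a Distributed Director, an aggregation layer with a content switch (CSM at one site, CSS at the other), and an access layer.]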
In the event of a catastrophic failure at the primary data center, clients are directed to the standby data center. In this topology, and in all the topologies that follow, a name server entry must be added to the local DNS that points to the request routing device as the authoritative DNS.
Hardware and Software Requirements
Product                          Release                           Platforms
Distributed Director             12.2(8)T                          2600, 3620, 3640, 3660, 3725, 3745, 7200
Content Services Switch (CSS)    WebNS 5.02                        CSS 11150, CSS 11050
Content Switching Module (CSM)   12.1(8a)EX, CSM 3.1(2) software   Module for Cat 6k
Design Details
The multi-layer network model is a fundamental principle associated with network design. This model provides a modular building block approach to network design, which increases the ease of deployment, stability, and scalability. A modular approach is very important to build a robust and scalable network. A well-designed Layer 2 and Layer 3 network serves as a foundation for deploying application services, which includes voice and video. The data center and campus design documents detail how to design a multi-layer network and deploy application services. There are three main design aspects of distributed data center design for disaster recovery and business continuance.
• Designing the primary and standby data centers
• Providing the back end connectivity between the data centers for data replication
• Providing the front-end intelligence to direct traffic to active data center
Front-end intelligence for site-to-site recovery is detailed in this design document. Back-end connectivity between data centers is outside the scope of this paper.
Design Goals
Cisco supports a modular approach to deploying distributed data centers. Building a data center involves careful planning for capacity, redundancy, high availability, security and manageability. This data center design guide touches each aspect of data center design.
Redundancy
The key feature of deploying distributed data centers is that they provide redundancy to support mission critical applications. Typically, there are three layers in the server farm architecture.
• Front-end layer
• Application layer
• Back-end layer
Deploying redundant application servers in the server farm and front ending that server farm with a content switch achieves application redundancy within the data center. For more information on the data center architecture, review the reference provided at the end of this document. You can achieve similar redundancy across multiple data centers by duplicating the data centers and creating intelligence at the front end to divert traffic to the active applications. You must duplicate the three layers in the secondary data centers and replicate the state information and data across the two data centers. This ensures that your data centers are in sync at all times to achieve business continuity.
High Availability
High availability is achieved when the application downtime is at or near zero. Eliminate the single point of failure by deploying redundant networks and data centers. After you replicate data to the standby data center and deploy redundant applications at the standby center, achieve application high availability during a data center failure by:
• Detecting data center or application failure
• Diverting end users to the standby data center
Use request routing mechanisms to direct clients or end users to available data centers. (See Chapter 1, “Enterprise Distributed Data Centers Overview” for more information on request routing.) Request routing devices use various algorithms to direct traffic to the appropriate application. In a distributed data center environment, more than one data center hosts those applications. The end users are routed to different data centers based on the following criteria:
• Round robin or weighted round robin request routing
• Routing based on proximity (static proximity)
• Routing based on round trip time (RTT)
• Routing end users to preferred data centers
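Of the criteria above, weighted round robin is the simplest to illustrate. The sketch below is one straightforward way to realize it, assuming integer weights; the site names and weights are invented, and the code does not represent the algorithm of any particular request routing device.

```python
import itertools

def weighted_round_robin(weights):
    """Yield data centers in a repeating cycle, each appearing `weight` times per cycle."""
    cycle = [site for site, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(cycle)

# Hypothetical weights: three requests go to the primary for each one to the secondary.
picker = weighted_round_robin({"primary": 3, "secondary": 1})
first_eight = [next(picker) for _ in range(8)]
print(first_eight)
```

Setting all weights to 1 reduces this to plain round robin.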
Application availability means that the application is up and running at a data center. Application availability, which must be ensured before the client is directed to a specific data center, is characterized by the following:
• The real servers in the server farm are up and running at a data center
• The Web tier is able to communicate with the application tier
• Application tier is able to communicate with the database tier
• The data center is reachable by the request routing device
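The four conditions above combine as a logical AND: an application counts as available at a site only if every condition holds. A minimal sketch follows; the field names are chosen for this example, and how each condition is actually probed is outside its scope.

```python
from dataclasses import dataclass

@dataclass
class SiteStatus:
    real_servers_up: bool
    web_to_app_ok: bool
    app_to_db_ok: bool
    reachable_from_request_router: bool

def application_available(status: SiteStatus) -> bool:
    """An application is available only when all four conditions hold."""
    return all((
        status.real_servers_up,
        status.web_to_app_ok,
        status.app_to_db_ok,
        status.reachable_from_request_router,
    ))

healthy = SiteStatus(True, True, True, True)
db_unreachable = SiteStatus(True, True, False, True)
print(application_available(healthy), application_available(db_unreachable))
```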
Application Requirements
A successful disaster recovery solution deployment depends on your understanding of application requirements and of end user connection mechanisms. In an enterprise environment, legacy and non-legacy applications often co-exist, and end users connect to them either through DNS or through statically configured IP addresses. Legacy application end users use hard coded server IP addresses. There is only one available request routing solution for these environments: RHI. Non-legacy applications use DNS and are supported by the other request routing topologies discussed in this paper. Typically, a combination of request routing mechanisms must co-exist to support all enterprise applications.
Security
Security is built into data center architecture by deploying ACLs, firewalls and intrusion detection devices. Refer to Enterprise Data Center SRND (http://www.cisco.com/en/US/netsol/ns110/ns53/ns224/ns304/networking_solutions_design_guidances_list.html) for any security related design issues. When deploying request routing devices in a data center, some configuration is required on the firewalls to allow certain sessions to pass through the firewalls.
Scalability and Performance
The number of applications supported in a data center should be able to scale before additional data centers are deployed. For a distributed data center, this translates to scaling the “A” records. An “A” record is the record that the authoritative DNS for a sub-domain sends in response to a DNS query for a specific host. Request routing device performance must also be at acceptable levels. Disaster recovery fails when the request routing devices themselves become the bottleneck for end users.
Additional Design Goals
In addition to the goals mentioned above, a disaster recovery solution must eliminate false alarms, also called false triggers. False alarms are caused by either a device failure or a network failure. When request routing devices cannot reach the application servers, they wrongly assume that the application servers are down and trigger a failover to a standby data center. False alarms create confusion and, depending on how the data is replicated, lead to data synchronization issues.
Each enterprise has its own tolerance limit for data center downtime. Financial institutions have the most stringent requirements. Other institutions and enterprises have tolerance limits of up to 4 hours. For enterprises with high tolerance limits, the alternative to automatic failover is manual failover to the standby data center. Manual failover gives the network personnel involved enough time to inspect the situation and take appropriate action.
Design Recommendations
The applications used in the enterprise data centers drive solution topologies. Understanding application requirements is critical. When there is a combination of legacy and non-legacy applications, the current solution is a combination of topologies. The choice of a topology also depends on the content switching device used at the data center. Cisco offers several content switching platforms, and several of these platforms also provide features that support disaster recovery. Choose a content switch solely for the purpose of content switching. Once that decision is made, you can turn on the appropriate features to deploy a disaster recovery solution. Several topologies are available. The distinction between the available topologies is related to performance and scalability. When all else is equal, the choice is often made based on the manageability of the devices. Keep the design goals in mind when deploying a disaster recovery solution and avoid false failures that trigger an unnecessary failover.
Topology Using Distributed Director
Figure 4-3 displays the topology that includes Distributed Directors (DD). The DD software runs on various platforms. For small to medium enterprises, deploy DD software on the data center core 7200 router. If 7200 routers are not used as core routers, use dedicated platforms as shown in Figure 4-3. DD can also be deployed on an aggregation layer that supports DD. The DD at each data center probes the applications at both the local and remote data centers. The probes available are TCP and HTTP. This topology does not depend on which content switching device you use. The DDs do not communicate with each other at any time unless a director response protocol (DRP) agent is created on the remote DD.

Figure 4-3 Cisco 2600 Routers Running DD
[Figure labels: DNS proxy; Internet (ISP1, ISP2); internal network; clients (user community); data center edge; Distributed Director; aggregation; content switch CSM; content switch CSS; access.]

Note
DDs can also be deployed on the core routers connected to the aggregation switches. In that case, the dedicated DDs shown in the figure are not needed.
Topology Using Content Services Switch
Two topologies support the CSS with global server load balancing (GSLB) enabled:
• Integrated architecture
• Decoupled architecture
Integrated Architecture
As depicted in Figure 4-4, the two content switches are connected in box-to-box redundant mode. Virtual Router Redundancy Protocol virtual IP (VRRP/VIP) redundancy mode cannot be used because the content application peering protocol (CAPP) cannot establish sessions with the virtual interfaces. VRRP/VIP and CAPP are discussed in more detail in the Implementation Details section of this document. In this mode, the CSS acts as both a content switch and a request router.

Figure 4-4 Integrated Architecture
[Figure labels: Internet; campus Internet edge at each site; core; aggregation; content services switch (WebNS) at the primary data center; content services switch (WebNS) at the secondary data center; access; CAPP session between the sites; crossover connection for box-to-box redundancy.]
Decoupled Architecture
In the second mode, a dedicated pair of CSSs is used for request routing. The dedicated CSSs connect to the aggregate switch and are deployed in box-to-box redundant mode. As shown in Figure 4-5, the decoupled architecture provides flexibility in choosing the content switches for the data center. Unlike in the integrated architecture, content switching and request routing are decoupled, so the content switches can use different redundancy modes.

Figure 4-5 Decoupled Architecture
[Figure labels: to core (each site); request routers; TCP session; crossover connection for box-to-box redundancy; application health probes; aggregation; content switch (CSS) at the primary data center; content switch (CSM) at the secondary data center; access.]
Topology Using the Content Switching Module
Use CSMs to activate RHI, as shown in Figure 4-6. The CSMs are part of the aggregate switches. They are deployed in redundant mode within the data center. Unlike CSSs and DDs, CSMs use routing to solve disaster recovery problems. CSMs advertise host routes into the network from both data centers. Based on the metrics, the upstream router picks the best route.
Figure 4-6 Using CSMs with RHI
[Figure labels: ISP and Internet; local DNS; clients (user community); intranet; data center edge; aggregation; content switch CSM at each data center.]
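The route selection that RHI relies on can be illustrated with a minimal Python sketch. It assumes that both data centers inject the same VIP as a host route and that the upstream router prefers the lowest metric. The HostRoute type, the metric values, and the best_route helper are hypothetical names used for illustration only; they are not part of any Cisco software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HostRoute:
    prefix: str   # /32 host route for the VIP
    site: str     # data center whose CSM injected the route
    metric: int   # cost seen by the upstream router (assumed: lower is better)


def best_route(routes: List[HostRoute], prefix: str) -> Optional[HostRoute]:
    """Return the lowest-metric route for prefix, or None if no site advertises it."""
    candidates = [r for r in routes if r.prefix == prefix]
    return min(candidates, key=lambda r: r.metric, default=None)


# Hypothetical routing table: both sites advertise the same VIP.
routes = [
    HostRoute("20.17.99.100/32", "primary", 10),
    HostRoute("20.17.99.100/32", "secondary", 50),
]
print(best_route(routes, "20.17.99.100/32").site)

# If the primary CSM withdraws its route, only the secondary route remains.
after_failure = [r for r in routes if r.site != "primary"]
print(best_route(after_failure, "20.17.99.100/32").site)
```

Because the decision is made entirely by the routing protocol, no DNS answer has to change when the primary site withdraws its route.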
Recommendations
Because of application requirements, you must use a combination of at least two technologies when both legacy and non-legacy applications are deployed in the data center. RHI supports disaster recovery for legacy applications, but not all platforms support this feature. No current disaster recovery solution supports both legacy and non-legacy applications on a single topology. For non-legacy applications, Cisco recommends using a topology that consists of a CSS. Use the following factors to determine the other topology required for your network deployment:
• Content switches deployed in the data center
• Enterprise size and core or distribution routers used
• Application and application health probe requirements
• Health probe frequency requirements
The advantages of deploying a topology using a CSS for disaster recovery are:
• Application server health monitoring functionality is an integral part of content switches
• Authoritative DNS becomes an integral part of the data center
• Security and manageability considerations
The disadvantages of this topology are:
• Scalability and performance considerations, because the content switch performs dual functions (content switching and disaster recovery) if the integrated architecture is used
• Limited flexibility in using different metrics for multi-site load distribution
If you use a content switch that does not support the GSLB feature, you must use a dedicated CSS for disaster recovery at each site. Configure these devices in box-to-box redundant mode.
The advantages of a topology using Distributed Director are:
• It is part of Cisco IOS, so no additional devices are required.
• It interoperates with different content switching devices.
• It is simple to configure and manage.
The disadvantages of this topology are:
• Dependency on the campus design and the platforms used in the data center.
• Health monitoring limitations, such as the types of health probes supported and the amount of traffic Distributed Director generates across the network for each active application.
The topology using the CSM is recommended only when there are legacy applications that do not use DNS resolution to resolve the server IP addresses.
The advantages of using RHI are:
• Quick convergence times
• No external health probes
• No exchange of information between sites about application availability
• Ideal for active standby or warm standby disaster recovery solutions
The disadvantages of using RHI are:
• Number of host routes grows with the number of applications
• Inability to summarize routes
• Cannot be used for site-to-site load distribution
• Can be used for intranet applications only
• Requirement of CSS to run routing protocols
For a solution involving applications that use DNS and applications that use hard coded IP addresses, the recommendation is to use CSS in a decoupled architecture for request routing and use CSM for RHI. The decoupled architecture supports the use of different content switches in a data center.
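As a rough illustration of this recommendation, the following Python sketch assigns a request routing mechanism to each application according to how its clients locate the server. The Application type, the application names, and the mechanism labels are hypothetical and exist only to make the decision rule explicit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Application:
    name: str
    uses_dns: bool  # False means clients use hard-coded server IP addresses


def mechanism_for(app: Application) -> str:
    """DNS-based applications use the request routers; legacy ones rely on RHI."""
    return "CSS request routing (decoupled)" if app.uses_dns else "CSM with RHI"


def plan(apps: List[Application]) -> Dict[str, str]:
    """Map each application name to the mechanism that can fail it over."""
    return {app.name: mechanism_for(app) for app in apps}


# Hypothetical application inventory.
apps = [
    Application("web-portal", uses_dns=True),
    Application("legacy-terminal", uses_dns=False),
]
for name, mechanism in plan(apps).items():
    print(f"{name}: {mechanism}")
```

A mixed inventory such as this one ends up needing both mechanisms, which is why the two topologies are combined.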
Implementation Details
The concept behind disaster recovery is to provide intelligent IP address resolution for the applications. The address resolution process identifies the closest site that has the requested information. This process, formally known as content routing, is also known as request routing. Request routing devices are intelligent DNS servers that keep track of the health of the application servers and direct end users to the appropriate data center. Cisco offers three main request routing products that can be used for site-to-site recovery:
• Distributed Director
• Content Services Switch (CSS)
• Content Switching Module (CSM)
The different topologies are discussed in the following sections.
Topology Using CSS
Use this topology to implement a disaster recovery center. For a more detailed description of the data center infrastructure, refer to the Enterprise Data Center Design SRND. The two CSSs provide redundant behavior. The connection between the CSSs in the figure indicates that a crossover cable connects the two CSSs. There are two modes of redundancy operation for CSSs:
• VRRP/VIP Redundancy
• Box-to-Box Redundancy
Virtual Router Redundancy Protocol virtual IP (VRRP/VIP) redundancy connects the redundant CSSs in the data center and exchanges keepalive information for the virtual interfaces configured on both devices. No dedicated link is necessary to exchange keepalives; they use the VLAN on the links connected to the aggregation switches. The keepalives monitor the status of the CSSs over the virtual interface. One virtual interface is active and the second virtual interface is standby. If the active CSS fails, the standby CSS becomes active, thus providing redundancy.
With the box-to-box redundancy protocol, the CSSs are directly connected using a crossover cable. The two devices exchange keepalive information over that cable. One CSS is active while the other is standby. There are no virtual interfaces in this configuration. The standby device becomes active when the primary device fails.
As discussed in the previous sections, there are two modes of deployment: the integrated architecture and the decoupled architecture. Both modes of deployment use box-to-box redundancy for request routing. In the integrated architecture, because the CSSs perform the dual functions of content switching and request routing, only box-to-box redundancy is recommended for content switching. In the decoupled architecture, content switches are free to use different redundancy modes.
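The box-to-box behavior described above, where the standby takes over when keepalives from the active peer stop, can be sketched as a small state machine. This is a simplified illustration only: the StandbyCss class, the dead_interval parameter, and the timing values are assumptions, and the sketch does not reproduce the actual CSS redundancy protocol.

```python
class StandbyCss:
    """Standby peer that promotes itself when keepalives stop arriving."""

    def __init__(self, dead_interval: float) -> None:
        self.dead_interval = dead_interval  # hypothetical timeout in seconds
        self.last_keepalive = 0.0
        self.role = "standby"

    def on_keepalive(self, now: float) -> None:
        """Record a keepalive received over the crossover link."""
        self.last_keepalive = now

    def tick(self, now: float) -> str:
        """Promote to active once the peer has been silent for too long."""
        if self.role == "standby" and now - self.last_keepalive > self.dead_interval:
            self.role = "active"
        return self.role


css = StandbyCss(dead_interval=3.0)
css.on_keepalive(now=1.0)
print(css.tick(now=2.0))   # peer heard recently
print(css.tick(now=4.5))   # peer silent for longer than the dead interval
```

Timestamps are passed in explicitly so the logic can be exercised without waiting on a real clock.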
Authoritative DNS
The CSS is configured as the authoritative DNS. “A” records are returned in response to DNS queries by the active CSS at either the primary or the secondary data center.
CAPP
CAPP is the protocol used between the active CSSs at the primary and secondary data centers to exchange health and server resource information. The active CSS at the primary data center establishes a CAPP session with the active CSS at the secondary data center. More details are provided in the Implementation Details section of this document.
Warm Standby Deployment
In a disaster recovery solution, similar applications are hosted at two geographical locations. These similar applications have different VIPs. The content switch returns only the VIP of the active data center in response to DNS requests. When DNS requests reach the primary data center, the primary data center returns the active VIP addresses corresponding to the requested sub-domain in the form of “A” records. If the DNS requests reach the secondary data center, the content switch at the secondary data center responds with an “A” record containing the VIP for the application at the primary data center. This is true for all applications in a warm standby deployment and ensures that all clients are connected to the primary data center under normal circumstances.
If a catastrophic event occurs at the primary data center, all existing sessions to the primary data center time out. The thin clients must be restarted. Reissued DNS requests are now routed to the secondary data center. The content switch at the secondary data center responds with an “A” record that now contains the VIP address of the application hosted at the secondary data center. End users are then connected to the secondary data center without ever realizing that a catastrophic event occurred at the primary data center.
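The warm standby answer logic can be summarized in a short Python sketch. It assumes the two VIPs used in the configuration examples in this chapter (20.18.99.100 at the primary data center and 20.17.99.100 at the secondary) and a hypothetical answer_a_record function; it illustrates the decision only and is not how a CSS is programmed.

```python
from typing import Optional

PRIMARY_VIP = "20.18.99.100"    # VIP of the application at the primary data center
SECONDARY_VIP = "20.17.99.100"  # VIP of the same application at the secondary data center


def answer_a_record(queried_site: str, primary_alive: bool,
                    secondary_alive: bool) -> Optional[str]:
    """Return the VIP for the A record, regardless of which site received the query.

    In warm standby, the primary VIP is always returned while it is alive;
    the secondary VIP is returned only after the primary has failed.
    """
    if primary_alive:
        return PRIMARY_VIP
    if secondary_alive:
        return SECONDARY_VIP
    return None  # no healthy site, so no A record can be handed out


# Normal operation: both sites answer with the primary VIP.
print(answer_a_record("primary", True, True))
print(answer_a_record("secondary", True, True))
# After a failure at the primary, reissued queries get the secondary VIP.
print(answer_a_record("secondary", False, True))
```

The queried_site argument is deliberately unused in the decision, which mirrors the point that the answer does not depend on where the DNS request lands.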
Hot Standby Deployment
In a hot standby scenario, both the primary and secondary data centers are active, with the primary data center supporting a majority of the active applications. There may be a few active applications at the secondary data center. This deployment mode makes better use of the resources at the secondary data center. The data center hosting the active application returns the “A” record in response to DNS. If the active application is at the primary data center, the end users get connected to the primary data center. As with warm standby deployments, if there is a catastrophic failure at the primary data center, the end user session times out and must be restarted. When the client reattempts communication, the request is directed to the secondary data center.
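In hot standby, the active site is chosen per application rather than per data center. The sketch below shows one way to express that; the application names, the ACTIVE_SITE table, and the VIP addresses (taken from documentation address ranges) are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping of each application to the site where it is normally active.
ACTIVE_SITE: Dict[str, str] = {"orders": "primary", "reports": "secondary"}

# Hypothetical VIPs per (application, site) pair.
VIPS: Dict[Tuple[str, str], str] = {
    ("orders", "primary"): "192.0.2.10",
    ("orders", "secondary"): "198.51.100.10",
    ("reports", "primary"): "192.0.2.20",
    ("reports", "secondary"): "198.51.100.20",
}


def resolve(app: str, site_up: Dict[str, bool]) -> Optional[str]:
    """Return the VIP at the application's active site, or at the other site on failure."""
    preferred = ACTIVE_SITE[app]
    fallback = "secondary" if preferred == "primary" else "primary"
    for site in (preferred, fallback):
        if site_up.get(site, False):
            return VIPS[(app, site)]
    return None


print(resolve("orders", {"primary": True, "secondary": True}))
print(resolve("reports", {"primary": True, "secondary": True}))
print(resolve("orders", {"primary": False, "secondary": True}))
```

Only applications whose active site has failed move to the other data center; the rest keep resolving to their usual VIP.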
High Availability
High availability requires database replication between the primary and secondary data centers. Achieve high availability by routing users to a secondary data center in the event of a catastrophic failure at the primary data center. Provide redundancy and high availability for the request routing devices by using box-to-box redundancy within the data center and by establishing a CAPP session between the data centers.
Configuring CSS in Integrated Architecture
CSS supports two modes of operation for GSLB. When fully qualified domain names are configured as part of content rules, it is called rule-based GSLB. The A records are not configured explicitly. In zone-based GSLB, the fully qualified domain names are explicitly configured as A records and these records are associated with VIPs. The A records are handed out as long as the VIP is alive. Use rule-based GSLB configurations for disaster recovery solutions. Use zone-based GSLB in multi-site load distribution solutions.

Figure 4-7 Topology Using CSS (Integrated Architecture)
[Figure labels: Internet; campus Internet edge at each site; core; aggregation; content services switch (WebNS) at the primary data center; content services switch (WebNS) at the secondary data center; access; CAPP session between the sites; crossover connection for box-to-box redundancy.]
CAPP exchanges information about which VIPs are available at the primary and secondary data center locations. CAPP allows you to build a mesh of Internet Data Centers so that each location knows about all other locations. In this example, for the disaster recovery solution, there are two data centers and one CAPP session between them.
The following is a list of configuration requirements at the primary data center.
1. Global configuration – Enable the DNS server, set up a CAPP session to the remote or secondary data center, and enable the CAPP session.
2. Owner configuration – Enable DNS exchange methods.
3. Configure the content rule with the sub-domain and VIP, and then activate the content rule.
The following configurations are required at the secondary data center.
1. Same configurations used at the primary data center.
2. Configure the ACL at the secondary data center to decide which VIP to resolve to for all DNS queries based on the source IP of the DNS proxy.
The relevant configurations are below.
Primary Data Center Configuration

Global Configuration
! Enable DNS server, configure app session with 20.17.99.9 and enable app session
dns-server
app session 20.17.99.9
app
! Please note that the service configuration is not shown here. For every service that is added in the content rule there has
! to be a service configured here.

Owner Level Configuration
! Enable DNS exchange between the data centers and select preferred DNS balance methods only if you are using active/standby
! topology for all the content rules. If there are some content rules that might use round robin method of site selection,
! use dnsbalance command in the content rule to select site selection methods.
owner idc-max
  dns both
  dnsbalance preferlocal

Content Rule Configuration
! Configure the content rule, add services, add fully qualified domain name and enable content rule
content l4-server-30
  redundancy-l4-stateless
  add service server-30-1
  add service server-40-1
  vip address 20.18.99.100
  add dns www.ese-cdn.com
  active
Secondary Data Center Configuration

Global
! Enable DNS server, configure app session with 20.18.99.9 and enable app session
dns-server
app session 20.18.99.9
app
! Please note that the service configuration is not shown here. For every service that is added in the content rule there has
! to be a service configured here.

Owner
! Enable DNS exchange between the data centers and select preferred DNS balance methods only if you are using active/standby
! topology for all the content rules. If there are some content rules that might use round robin method of site selection,
! use dnsbalance command in the content rule to select site selection methods.
owner idc-max
  dns both
  dnsbalance preferlocal

Content Rule
! Configure the content rule, add services, add fully qualified domain name and enable content rule
content l4-server-30
  redundancy-l4-stateless
  add service server-30-1
  add service server-40-1
  vip address 20.17.99.100
  add dns www.ese-cdn.com
  active

ACL Configuration
! Configure the ACLs to choose the primary as the active data center
acl 10
  clause 20 permit any any destination any
  clause 10 permit any any destination content idc-max/l4-server-30 prefer [email protected]
  apply dns
acl 20
  clause 10 permit any any destination any
  apply circuit-(VLAN16)
  apply circuit-(VLAN40)
  apply circuit-(VLAN30)
  apply circuit-(VLAN20)
  apply circuit-(VLAN1)
Warning
The content rules on both the content switches have to have the same name. Without the same name, the content switch does not learn the remote services.
Warning
There is some configuration required on the firewalls to let the CAPP session pass through and any additional configuration for NAT requirements. This information can be found in the configuration document of firewalls.
Caveats
The main caveats to this design are:
• Performance issues, because the CSSs are used for both content switching and the disaster recovery solution.
• The use of the CAPP protocol and its dependency on content rules limits the use of this topology to CSSs and other devices that support similar protocols and rules.
• If CSSs are not used for content switching in the data center, additional devices must be deployed to support the disaster recovery solution.
Configuring CSS in Decoupled Architecture
In the decoupled architecture, an extra pair of CSSs is deployed in each data center. Each CSS connects to the aggregate switch, and the two CSSs are connected to each other with a crossover cable for redundancy. The NS records point to the circuit IP address of the CSSs. For traffic originating from the CSSs, the CSSs use a default route pointing to the active HSRP address on the aggregate switches.

Figure 4-8 Topology Using CSS (Decoupled Architecture)
[Figure labels: to core (each site); request routers; TCP session; crossover connection for box-to-box redundancy; application health probes; aggregation; content switch (CSS) at the primary data center; content switch (CSM) at the secondary data center; access.]
The following is a list of configuration requirements at the primary data center.
1. Global configuration – Enable the DNS server, set up a CAPP session to the remote or secondary data center, and enable the CAPP session.
2. Owner configuration – Enable DNS exchange methods and select the preferred DNS balance method.
3. Configure the content rule with the sub-domain and VIP, and then activate the content rule.
The following configurations are required at the secondary data center.
1. Same configurations used at the primary data center.
2. Configure the ACL at the secondary data center to decide which VIP to resolve to for all DNS queries based on the source IP of the DNS proxy.
In addition, some configuration is required on the content switches. The relevant configurations are provided below.
Primary Data Center Configuration

Global Configuration
! Enable DNS server, configure app session with 20.17.100.9 and enable app session.
! Please note that the IP address in the app session is the circuit IP address of the request routing device at the
! secondary data center.
dns-server
app session 20.17.100.9
app
! The IP address in this service is that of a VIP on the content switch. The services under this VIP on the content switching
! device are exactly the same as those of the content rule that has the VIP 20.18.99.100. In other words, the content rule for the VIP
! 20.14.30.100 is exactly the same as the content rule for the VIP 20.18.99.100. The idea is that if the VIP 20.18.99.100 goes down
! for any reason, the VIP 20.14.30.100 also goes down and the request routing device stops handing out this A record.
! This strategy is adopted because the same VIP cannot be used on both the content switch and the request router
! with one VIP probing the other VIP.
service probe-VIP-20.18.99.100
  ip address 20.14.30.100
  active

Owner Level Configuration
! Enable DNS exchange between the data centers and select preferred DNS balance methods only if you are using active/standby
! topology for all the content rules. If there are some content rules that might use round robin method of site selection,
! use dnsbalance command in the content rule to select site selection methods.
owner idc-max
  dns both
  dnsbalance preferlocal

Content Rule Configuration
! Configure the content rule, add services, add fully qualified domain name and enable content rule
content l4-server-30
  add service probe-VIP-20.18.99.100
  vip address 20.18.99.100
  add dns www.ese-cdn.com
  active
Secondary Data Center Configuration

Global
! Enable DNS server, configure app session with 20.18.100.9 and enable app session
dns-server
app session 20.18.100.9
app
! Service probing the VIP 20.17.99.100
service probe-VIP-20.17.99.100
  ip address 20.13.30.100
  active

Owner
! Enable DNS exchange between the data centers and select preferred DNS balance methods only if you are using active/standby
! topology for all the content rules. If there are some content rules that might use round robin method of site selection,
! use dnsbalance command in the content rule to select site selection methods.
owner idc-max
  dns both
  dnsbalance preferlocal

Content Rule
! Configure the content rule, add services, add fully qualified domain name and enable content rule
content l4-server-30
  redundancy-l4-stateless
  add service probe-VIP-20.17.99.100
  vip address 20.17.99.100
  add dns www.ese-cdn.com
  active

ACL Configuration
! Configure the ACLs to choose the primary as the active data center
acl 10
  clause 20 permit any any destination any
  clause 10 permit any any destination content idc-max/l4-server-30 prefer [email protected]
  apply dns
acl 20
  clause 10 permit any any destination any
  apply circuit-(VLAN16)
  apply circuit-(VLAN40)
  apply circuit-(VLAN30)
  apply circuit-(VLAN20)
  apply circuit-(VLAN1)
Content Rules on the Content Switch

Global
! Configure the services here. These are the probes for all the real servers in the data center. Only one
! service is shown here.
service server-30
  ip address 20.13.99.30
  active
! Please note that only the secondary data center’s content rule is shown below. The content rule for the primary
! data center’s content switch is configured similarly but with the IP address of the real servers applicable to that
! data center.

Content Rule
! Configure the content rule, add services, add fully qualified domain name and enable content rule
! The real content rule which is used for local server load balancing on the content switch. Both these content rules
! are under the same VIP
content VIP-20.17.99.100
  add service server-30
  add service server-40
  add service server-50
  vip address 20.17.99.100
  active
! This content rule is configured for the sake of the request router. The IP address of the VIP used is internal and
! this content rule has the same services as the previous content rule. This content rule serves the purpose of
! providing the request router with the availability of the VIP 20.17.99.100
content dummy-VIP-20.17.99.100
  add service server-30
  add service server-40
  add service server-50
  vip address 20.13.99.100
  active
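The reasoning behind the second, internal content rule can be sketched in a few lines of Python. Because both content rules list the same services, whatever takes the real VIP down also takes the internal VIP down, and the request router, which probes only the internal VIP, stops handing out the A record. The ContentRule class and a_record function are hypothetical, and the sketch assumes that a VIP is considered alive while at least one of its services is up.

```python
from typing import Dict, List, Optional


class ContentRule:
    """Minimal model of a content rule: one VIP backed by a list of services."""

    def __init__(self, vip: str, services: List[str]) -> None:
        self.vip = vip
        self.services = services

    def vip_alive(self, service_up: Dict[str, bool]) -> bool:
        # Assumption: the VIP stays up while at least one service is up.
        return any(service_up.get(name, False) for name in self.services)


SERVICES = ["server-30", "server-40", "server-50"]
real_rule = ContentRule("20.17.99.100", SERVICES)    # used for server load balancing
dummy_rule = ContentRule("20.13.99.100", SERVICES)   # internal VIP probed by the request router


def a_record(service_up: Dict[str, bool]) -> Optional[str]:
    """Request router view: return the real VIP only while the probed internal VIP is alive."""
    return real_rule.vip if dummy_rule.vip_alive(service_up) else None


print(a_record({"server-30": True, "server-40": False, "server-50": False}))
print(a_record({"server-30": False, "server-40": False, "server-50": False}))
```

Keeping the two service lists identical is what makes the internal VIP a faithful stand-in for the real one.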
Caveats
The main caveat to this design is:
• The use of the CAPP protocol and its dependency on content rules limits the use of this topology to CSSs and other devices that support similar protocols and rules.
Topology Using DistributedDirector
The topology using Distributed Directors is displayed in Figure 4-9. To emphasize the irrelevance of which content switch you deploy, CSMs are shown in the primary data center and CSSs are shown in the secondary data center.
Figure 4-9 Topology using DistributedDirector
[Figure labels: DNS proxy; Internet (ISP1, ISP2); internal network; clients (user community); data center edge; Distributed Director; aggregation; content switch CSM; content switch CSS; access.]
DistributedDirector interoperates with all content switches, including IOS SLB. DistributedDirector is part of Cisco IOS and tracks the health of applications by probing the VIPs on the content switches. You can enable DistributedDirector at the data center edge or on the aggregation switches, depending on which platforms you deploy. Typically, edge or aggregation switches are deployed in HSRP mode in the data center. When this is the case, enable DistributedDirector in HSRP mode for redundancy. Deploy DistributedDirector close to the data center to monitor the health of the applications and for better manageability. DistributedDirector checks the health of the applications either by establishing and tearing down a TCP connection to the application at configured intervals or by using an HTTP GET at specified intervals.
In a disaster recovery solution, DistributedDirector acts as an authoritative DNS for configured sub-domains. You can specify multiple VIPs and associate each VIP with a preference level. DistributedDirector monitors the health of the VIPs and responds to DNS queries with the VIP of higher preference. You can configure several rules on a DistributedDirector to answer DNS queries, with a priority level attached to each rule. However, disaster recovery solutions have no requirement for multiple rules. Most of the rules are configured to select the best location based on RTT, IGP, or BGP metrics. Only the health of the applications and the preference levels are required to deliver a disaster recovery solution.
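The selection that a disaster recovery deployment needs from DistributedDirector, health plus preference, can be sketched as follows. The select_vip function is hypothetical, and the sketch assumes that a lower numeric preference value means a more preferred VIP; verify that ordering against the DistributedDirector documentation for your release before relying on it.

```python
from typing import Dict, Optional


def select_vip(preferences: Dict[str, int], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the most preferred VIP among those that pass their health check.

    Assumption: the lowest numeric preference value is the most preferred.
    """
    candidates = [vip for vip in preferences if healthy.get(vip, False)]
    return min(candidates, key=lambda vip: preferences[vip], default=None)


# Preference values as used in the configuration example in this chapter.
preferences = {"20.17.99.100": 1, "20.18.99.100": 2}

print(select_vip(preferences, {"20.17.99.100": True, "20.18.99.100": True}))
print(select_vip(preferences, {"20.17.99.100": False, "20.18.99.100": True}))
```

No RTT, IGP, or BGP metric appears in the function, which reflects the point that those rules are not needed for disaster recovery.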
High Availability
Again, to achieve high availability, the database must be replicated to the secondary data center if the application is active at the primary data center, and vice versa. You can achieve high availability by routing users to a secondary data center in the event of a catastrophic failure at the primary data center. Achieve redundancy and high availability on the request routing devices by running DistributedDirectors in HSRP mode and duplicating the DistributedDirector configuration at the secondary data center.
Configuration Details
Use the following steps to configure this topology:
Step 1 Configure hosts or sub-domains with redundant VIPs.
Step 2 Configure Start of Authority (SOA) records. DistributedDirector does not respond to DNS queries without SOA records.
Step 3 Specify host preference indicating the preferred VIP.
Step 4 Configure health checks for the VIPs using either TCP or HTTP GET based on the application requirement.
Step 5 Specify the priority. This allows for flexibility in setting up different rules, which are used to keep up with the dynamic conditions to choose the best server available. Because of the predetermined traffic pattern, do not configure multiple rules for a disaster recovery solution.
The relevant commands and the configuration are shown below.
Configure Hosts
ip host www.ese-cdn.com 20.18.99.100 20.17.99.100
ip host rtp.ese-cdn.com 20.17.99.100
ip host sj.ese-cdn.com 20.18.99.100

Configure SOA records
ip dns primary www.ese-cdn.com soa DDIR.ese-cdn.com admin.ese-cdn.com
ip dns primary sj.ese-cdn.com soa DDIR.ese-cdn.com admin.ese-cdn.com
ip dns primary rtp.ese-cdn.com soa DDIR.ese-cdn.com admin.ese-cdn.com

Configure Host Preference
ip director server 20.17.99.100 preference 1
ip director server 20.18.99.100 preference 2
Configure Health Check
TCP
ip director hosts rtp.ese-cdn.com connect 23 interval 10
ip director hosts sj.ese-cdn.com connect 23 interval 10
HTTP
ip director hosts rtp.ese-cdn.com port-service 80
ip director hosts rtp.ese-cdn.com verify-url http://rtp.ese-cdn.com/ connection-interval 15
ip director hosts sj.ese-cdn.com port-service 80
ip director hosts sj.ese-cdn.com verify-url http://sj.ese-cdn.com/ connection-interval 15

Warning
Configure either a TCP or an HTTP health check.

Configure Priority
ip director hosts www.ese-cdn.com priority adm 1
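For readers who want to see what a TCP health check of this kind amounts to, the following Python sketch opens a TCP connection to a host and port and immediately tears it down, reporting whether the connection succeeded. It illustrates the probing technique only and is not DistributedDirector code; the tcp_probe name and the timeout value are assumptions.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be set up, then tear it down."""
    try:
        # create_connection completes the TCP handshake; leaving the
        # with-block closes the socket, which tears the connection down.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the application as down.
        return False
```

A monitoring loop would call tcp_probe at the configured interval, for example every 10 seconds against port 23 as in the TCP example above, and stop handing out the VIP once the probe fails. Each call generates one connection setup and teardown per monitored application.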
Caveats
• DistributedDirectors are not supported on products like Catalyst 6000, which are used at the aggregation layer.
• Health monitoring is an important aspect of a disaster recovery solution. The decision to route end users to an available data center is based solely on application availability. Distributed Director generates a lot of traffic across the network for health monitoring. It creates one TCP setup and teardown for each active application.
• Limitation on the different health probes available to monitor application health.
Topology using CSM
This topology uses RHI. Use this topology to provide disaster recovery solutions for legacy applications. A complete solution for an enterprise that has both legacy and non-legacy applications would require both the CSS and the CSM topologies. However, since CSMs and CSSs are both content switching devices, deploying both is not practical. The alternatives are to use a product that provides both RHI and GSLB functionality, or to use a dedicated request routing device in each data center together with the CSM. RHI is the only solution for a legacy environment. This topology is depicted in Figure 4-10.
Figure 4-10 RHI using CSM
[Diagram: ISP/Internet and intranet clients with local DNS, the data center edge, and a CSM content switch at the aggregation layer of each data center]
Configure RHI on the CSMs. As stated earlier, this topology is the only available option that provides redundancy and high availability in environments containing legacy applications that are accessed via static, hard-coded IP addresses.
Note
The CSMs are deployed on a service switch connected to the aggregation switches. For enterprise data centers, CSMs are deployed on the aggregation switches, subject to the availability of native images on the aggregation switches and the comfort level with using them.
In this method, two different data centers use the same VIP address. The VIP addresses are advertised as host routes from both data centers. The upstream router picks the best route and routes clients to the corresponding data center. The downside of this method is that routes cannot be summarized, because it relies on host routes. However, this method requires no other changes and converges quickly. Under normal conditions, end users or clients are routed along the best route listed in the routing tables of the routers. When a catastrophic failure occurs, IP routing updates the routing table with the alternate route. In this topology, end-user sessions time out during a catastrophic failure at the active data center. The clients must restart the application to connect to the alternate data center.
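The route selection described above can be illustrated with a small Python model of an upstream router. This is a toy sketch, not router code. The class and method names are hypothetical, the site labels "primary" and "secondary" stand in for real next hops, and the metrics 10 and 20 are example values only.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network
    next_hop: str  # label of the advertising data center (hypothetical)
    metric: int


class UpstreamRouter:
    """Toy routing table keyed by (prefix, next_hop)."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[ipaddress.IPv4Network, str], Route] = {}

    def advertise(self, prefix: str, next_hop: str, metric: int) -> None:
        net = ipaddress.ip_network(prefix)
        self._routes[(net, next_hop)] = Route(net, next_hop, metric)

    def withdraw(self, prefix: str, next_hop: str) -> None:
        self._routes.pop((ipaddress.ip_network(prefix), next_hop), None)

    def best_route(self, destination: str) -> Optional[Route]:
        addr = ipaddress.ip_address(destination)
        candidates = [r for r in self._routes.values() if addr in r.prefix]
        if not candidates:
            return None
        # Longest prefix wins; among equal prefixes, the lowest metric wins.
        return min(candidates, key=lambda r: (-r.prefix.prefixlen, r.metric))


VIP = "20.19.30.200"
router = UpstreamRouter()
# Both data centers advertise the same VIP as a host route.
router.advertise(VIP + "/32", "primary", 10)
router.advertise(VIP + "/32", "secondary", 20)
print(router.best_route(VIP).next_hop)  # primary

# The VIP fails at the primary site, so its host route is withdrawn.
router.withdraw(VIP + "/32", "primary")
print(router.best_route(VIP).next_hop)  # secondary
```

The model does not carry existing sessions over to the alternate site, which matches the behavior described above: clients reconnect after the routing change.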
High Availability
CSMs can be deployed in the data center in a redundant configuration. This works similarly to HSRP, with one active CSM and one standby CSM. Refer to the Enterprise Data Center SRND for more details on configuring redundant CSMs. When primary and backup data centers are deployed, high availability is achieved as a result of routing changes.
Configuration Details
There are three modes of operation for the content switch in a data center. RHI works in all three modes. For more information about these modes, refer to the Scaling Server Farms chapter of the Enterprise Data Center SRND.
Step 1
Configure a working VIP by configuring the client-side VLAN, server-side VLAN, VLAN database, server farm, and virtual server. Refer to the chapter cited above for details on these configuration steps.
Step 2
Inject the VIP as a static route into the MSFC routing table by advertising the active VIP to the MSFC.
Step 3
Redistribute routes into OSPF.
Step 4
Make necessary routing changes by tuning the route metrics injected in step 2.
Step 5
Repeat at least steps 1-3 at the secondary data center.
Since there are different modes of operation, configure the mode that suits your deployment by referring to the above document. Step 1 brings the VIP online. In addition, tune the keepalive frequency for the real servers using the instructions provided in that document. Once the VIP is online, the route can be injected into the MSFC’s routing table using Step 2. Perform Step 4 at one of the data centers, not both. For the sake of completeness, both steps are shown below. For detailed configuration information, refer to the link provided.
Configure the VLAN interface which connects to the core routers.
Router(config)#interface vlan 100
Router(config-if)# ip address 20.18.31.2 255.255.255.0
Router(config-if)# no ip redirects
Router(config-if)# no ip unreachables
Router(config-if)# no ip proxy-arp
Warning
If you have a fault tolerant configuration, you must configure “no ip proxy-arp” on the interface, in this sequence, before you perform any subsequent steps. Turning off proxy ARP prevents a server that ARPs for the VIP from receiving a response from the secondary CSM before it receives a response from the primary CSM.
Configure the Server Farm
Router(config)#mod csm 5
Router(config-module-csm)#serverfarm REAL_SERVERS
Router(config-slb-sfarm)#real 20.40.30.100
Router(config-slb-real)#inservice
Configure the Client side VLAN
Router(config)#mod csm 5
Router(config-module-csm)#vlan 100 client
Router(config-slb-vlan-client)#ip address 20.18.31.150 255.255.255.0
Router(config-slb-vlan-client)#gateway 20.18.31.2
Router(config-slb-vlan-client)#alias 20.18.31.6 255.255.255.0
The IP address of the upstream VLAN interface on the aggregation switch is 20.18.31.2. This is the interface through which the client connections come in. It is important to configure the alias. Typically, the alias is used when there is a redundant CSM. For RHI configurations, use it even if there is no redundant CSM (workaround for bug CSCdz28212).
Configure the Server side VLAN.
Router(config)#mod csm 5
Router(config-module-csm)#vlan 30 server
Router(config-slb-vlan-server)#ip address 20.40.30.1 255.255.255.0
The default gateway address on the real servers is 20.40.30.1.
Configure the virtual server.
Router(config)#mod csm 5
Router(config-module-csm)#vserver VIP1
Router(config-slb-vserver)# vlan 100
Router(config-slb-vserver)#serverfarm REAL_SERVERS
Router(config-slb-vserver)#inservice
Inject the route into the MSFC’s routing table. The “advertise active” command makes the MSFC aware that a VIP is available and reachable via the client VLAN. The MSFC then injects this static route into the routing table.
Router(config)#mod csm 5
Router(config-module-csm)#vserver VIP1
Router(config-slb-vserver)#advertise active
Redistribute routes into OSPF.
Router(config)#router ospf 1
Router(config-router)#redistribute static subnets
Change route metrics. This step is needed only at one of the data centers. This step changes the metric so that the upstream routers carry only the best routes. Metric 10 is used as an example.
Router(config)#router ospf 1
Router(config-router)#redistribute static metric 10 subnets
The following command shows the routing table after completing the above configuration steps. The route is injected into the routing table as a static route. After you complete the configuration at the primary data center, the static route is redistributed into OSPF with the configured metric.
Router#sh ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       * - candidate default, U - per-user static route, o - ODR
       P - periodic downloaded static route
Gateway of last resort is not set
     20.0.0.0/8 is variably subnetted, 17 subnets, 3 masks
O       20.19.0.0/16 is a summary, 1d19h, Null0
O       20.18.30.0/24 [110/2] via 20.18.31.1, 2d08h, Vlan100
C       20.18.31.0/24 is directly connected, Vlan100
O IA    20.17.31.0/24 [110/15] via 20.18.31.1, 2d08h, Vlan100
O IA    20.17.30.0/24 [110/14] via 20.18.31.1, 2d08h, Vlan100
C       20.40.30.0/24 is directly connected, Vlan30
O       20.18.100.0/24 [110/3] via 20.18.31.1, 2d08h, Vlan100
O       20.18.99.0/24 [110/3] via 20.18.31.1, 2d08h, Vlan100
O IA    20.17.99.0/24 [110/15] via 20.18.31.1, 2d08h, Vlan100
S       20.19.30.200/32 [1/0] via 20.18.31.6, Vlan100
These static routes are redistributed. The following command shows the routing table in one of the upstream routers.
Router#sh ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       * - candidate default, U - per-user static route, o - ODR
       P - periodic downloaded static route
Gateway of last resort is 20.17.40.2 to network 0.0.0.0
     20.0.0.0/8 is variably subnetted, 16 subnets, 2 masks
O E2    20.19.0.0/16 [110/10] via 20.17.50.2, 00:13:23, Serial1/0
O IA    20.18.30.0/24 [110/10001] via 20.17.50.2, 00:13:23, Serial1/0
O IA    20.18.31.0/24 [110/10001] via 20.17.50.2, 00:13:23, Serial1/0
O IA    20.17.31.0/24 [110/12] via 20.17.40.2, 00:13:23, BVI10
O IA    20.17.30.0/24 [110/11] via 20.17.40.2, 00:13:23, BVI10
O IA    20.40.30.0/24 [110/10002] via 20.17.50.2, 00:13:23, Serial1/0
O IA    20.17.33.0/24 [110/11] via 20.17.40.2, 00:13:24, BVI10
O       20.17.35.0/24 [110/11] via 20.17.40.2, 00:13:24, BVI10
C       20.17.41.0/24 is directly connected, BVI20
Repeat steps 1 to 4 at the standby or secondary data center.
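One way to confirm that the VIP host route was injected is to scan captured "show ip route" output for static /32 entries. The following Python sketch does this under stated assumptions: the function and type names are hypothetical, the regular expression covers only the static-route line format seen in the output above, and the sample text is a three-line excerpt of that output.

```python
import re
from typing import List, NamedTuple


class HostRoute(NamedTuple):
    prefix: str
    next_hop: str
    interface: str


# Matches lines such as: "S 20.19.30.200/32 [1/0] via 20.18.31.6, Vlan100"
STATIC_HOST_ROUTE = re.compile(
    r"^S\*?\s+(?P<prefix>\d+\.\d+\.\d+\.\d+/32)\s+\[\d+/\d+\]\s+"
    r"via\s+(?P<next_hop>\d+\.\d+\.\d+\.\d+),\s+(?P<interface>\S+)\s*$"
)


def static_host_routes(show_ip_route: str) -> List[HostRoute]:
    """Return every static /32 route found in the captured output."""
    routes = []
    for line in show_ip_route.splitlines():
        match = STATIC_HOST_ROUTE.match(line.strip())
        if match:
            routes.append(HostRoute(match["prefix"],
                                    match["next_hop"],
                                    match["interface"]))
    return routes


# Excerpt of the aggregation switch routing table shown earlier.
SAMPLE = """\
C 20.40.30.0/24 is directly connected, Vlan30
O 20.18.99.0/24 [110/3] via 20.18.31.1, 2d08h, Vlan100
S 20.19.30.200/32 [1/0] via 20.18.31.6, Vlan100
"""

for route in static_host_routes(SAMPLE):
    print(route.prefix, "via", route.next_hop, "on", route.interface)
```

In the excerpt, the only match is the VIP 20.19.30.200/32, whose next hop is the CSM alias address 20.18.31.6 configured on the client VLAN.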
Route Advertisements in RHI
RHI works by injecting a host route into the routing table based on the availability of the VIP. Typically, the VIPs in a data center are in the same subnet as the client VLAN. This avoids having to configure and advertise static routes on the aggregation switch. With RHI, however, the VIP in the secondary data center has to be in a different subnet from its client VLAN (the same subnet as the VIP in the primary data center), and it has to be advertised with a metric different from that of the primary data center’s VIP. Notice the highlighted routes in the above routing table. What is the best solution when it comes to advertising routes?
Case 1
Figure 4-11 Using NSSA in the Data Center
[Diagram: primary and secondary data centers, each with a data center edge and a content switch module at the aggregation layer, placed in NSSA Area 1 and NSSA Area 2 and joined through the core in OSPF Area 0]
Place all the VIPs in a data center in a subnet other than the client VLAN’s subnet. For example, all the network addresses for Layer 3 interfaces start with 20.17.x.x in the primary data center and with 20.18.x.x in the secondary data center. Use a network starting with 20.19.x.x for all VIPs. The advantage of doing this is that it allows you to configure summarization on the aggregation switch for the VIPs. Also configure the data center aggregation switches in a not-so-stubby area (NSSA), which helps when you are summarizing these routes. Note that when the VIP routes are summarized, the failure of any single application means that all the applications have to be failed over to the secondary data center. The alternative is to put each application in a different subnet and summarize the routes per subnet, so that the failure of any one VIP does not affect other VIPs.
Case 2
Put both data centers in a single OSPF area (area 0) and use host routes. For example, all the network addresses for Layer 3 interfaces start with 20.17.x.x in the primary data center and with 20.18.x.x in the secondary data center. Use a network starting with 20.17.x.x for all VIPs. The advantage of doing this is that it is simple to configure and works well if host routes are advertised. The disadvantage is that it loses the flexibility of Case 1. The choice of how to deploy RHI is left up to you, but Case 1 is recommended because it supports route summarization when connecting to multiple service providers.
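The trade-off between one summary for all VIPs and one subnet per application can be illustrated with a short Python sketch. The application names and the individual VIP addresses are hypothetical. Only the 20.19.x.x VIP range comes from the text.

```python
import ipaddress
from typing import Dict, List


def vips_behind(summary: str, vips: Dict[str, str]) -> List[str]:
    """Return the applications whose VIP falls inside the summary route."""
    net = ipaddress.ip_network(summary)
    return sorted(name for name, vip in vips.items()
                  if ipaddress.ip_address(vip) in net)


# Hypothetical applications, each with a VIP in the 20.19.x.x range.
VIPS = {
    "app-a": "20.19.30.200",
    "app-b": "20.19.31.200",
    "app-c": "20.19.32.200",
}

# One summary for every VIP: moving the summary moves all applications.
print(vips_behind("20.19.0.0/16", VIPS))   # ['app-a', 'app-b', 'app-c']

# One subnet per application: moving it affects only that application.
print(vips_behind("20.19.30.0/24", VIPS))  # ['app-a']
```

With a single summary, a failure of app-a cannot be handled without also moving app-b and app-c. With per-application subnets, each summary can be moved on its own, at the cost of one advertised route per application.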
Caveats
The following caveats exist for RHI regardless of whether you follow Case 1 or Case 2.
• This solution is better suited to intranets, since either host routes have to be advertised or each application has to be put in a different subnet. Each subnet can be summarized to overcome the summarization issues at the ISP.
• The second caveat concerns solution scalability. As the number of supported applications grows, the number of host routes in the enterprise network grows linearly.
CHAPTER 5
Multi-Site Load Distribution
Redundancy and high availability are achieved by deploying multiple data centers and distributing applications across those data centers. This chapter focuses on the design and deployment of distributed data centers to achieve multi-site load distribution. As an enterprise looks to expand operations, the creation of multiple data centers becomes an attractive option. The challenge of geographic load balancing is to ensure that transaction requests from clients are directed to the most appropriate data center. Geographic load distribution requires control points for all transaction requests destined to any data center. The point of control for a geographic load-distribution function resides within DNS. All clients must contact a DNS server at some point prior to requesting service from a server. Because geographically replicated content and applications reside on servers with unique IP addresses, unique DNS responses can be provided to queries for the same URLs or applications based on a series of metrics. Metrics are dynamically calculated and updated at distributed sites, and clients are directed to the best site based on these metrics and the availability of services. These metrics include proximity between the client and the data center, the load at different data centers, weighted round robin load distribution, and site persistence methods.
Hardware and Software Requirements
The table below lists the hardware and software required to support multi-site load distribution. Not all platforms meet all the requirements, and each has its own advantages and disadvantages. In subsequent sections of this document, the different strengths and weaknesses are discussed and specific products and topologies are recommended.
Table 5-1 Hardware and Software Requirements
Product                         Release      Platforms
Distributed Director            12.2(8)T     2600, 3620, 3640, 3660, 3725, 3745, 7200
Content Services Switch (CSS)   WebNS 5.02   CSS 11150, CSS 11050
Design Details
Design Goals
The basic design goal is to direct clients to the appropriate data center based on the configured rules and the availability of the servers or services at the data center. The major design goals are:
• High availability
• Scalability
• Security
• Other requirements as necessary
High Availability
Within a data center, high availability is achieved by deploying redundant devices and addressing both device failure and link failure using Layer 2 and Layer 3 protocols. The design should take into account the possibility of device and link failures and recovery within a data center. The design should also take into account the possibility of link and device failures affecting the request routing device within a data center. For a more detailed description of how high availability is achieved within a data center, refer to the Enterprise Data Center SRND (http://www.cisco.com/en/US/netsol/ns110/ns53/ns224/ns304/networking_solutions_design_guidances_list.html). By deploying distributed data centers, the single point of failure, whether a data center failure or an application failure, is eliminated. Due to the number of applications hosted in a data center, the different health check mechanisms used, and the frequency of health checks, it is desirable to keep the health checks local to the data center and exchange application availability over a TCP connection.
Scalability
Scalability is an inherent element of the distributed data center environment. Clients can be directed to the most appropriate data center, thereby distributing the load across multiple data centers. Applications are not overloaded when they are hosted at multiple data centers, which leads to a manageable and somewhat less complex server farm. Scalability comes from application scalability and site scalability. The design should support growth in the number of sites without an overhaul, and should allow more applications (and therefore more DNS records) to be added without performance degradation. There are some limitations to scaling the number of sites. The details are covered in the implementation section of this design document.
Security
Security is deployed as a service at the aggregation layer in a data center. Deploying request routing devices should not compromise security in the data center. For instance, the placement of the authoritative DNS should ensure that security requirements are met, because of the amount and type of overhead traffic needed to ensure application availability at each data center. Monitoring application availability might include determining both the health of and the load on the applications.
Other Requirements
Other design requirements include meeting client and application requirements. A business-to-business client has to stay with the same site, as long as it is available, for the length of the transaction period (site persistence). A client accessing a streaming application should be directed to the topologically closest data center (proximity). Other client and application requirements include directing clients to a data center based on round trip time, IGP, and BGP metrics. Ideally, the design should be able to meet all these requirements.
Design Topologies
Cisco offers several products for multi-site load distribution solutions. There are different topologies based on the products, and each has its own limitations. All the topologies are covered in the following sections, and recommendations are made based on the solution requirements. All the topologies adhere to Cisco’s design recommendations and follow a layered approach. The layers common to each topology are the data center edge and the core layer. Typically, the primary data center is connected to two service providers. The layer that connects to dual service providers is known as the Internet edge. Multi-homing to dual ISPs is covered in more detail in the Internet Edge SRND. At each layer, redundant devices are deployed for high availability. The core layer provides connectivity to branch offices, remote users, and campus users. Different data centers are also connected together at the core layer. For more information on data centers, refer to the Enterprise Data Center SRND; for campus design, refer to the CANI SRND.
Topology Using CSS
There are two topologies that can be used with CSS. These are classified based on the functionality of request routing. In an integrated architecture, a single device performs both the request routing and content switching functions. In a decoupled architecture, separate devices perform these functions.
Integrated Architecture
Figure 5-1 depicts the integrated architecture. The CSSs act as the authoritative DNS for the domains in the data center, in addition to performing server load balancing. They are connected to the aggregation switch in the data center. Box-to-box redundancy provides redundancy in the data center.
Figure 5-1 Integrated Architecture Using CSS
[Diagram: Sites 1 to 3 (Zones 0 to 2) with Internet (SP 1, SP 2) and internal network connectivity at the data center edge, distribution and access layers, CAPP sessions between sites, and crossover connections for box-to-box redundancy]
In this topology, the request routing functionality is deployed on top of the existing data center infrastructure. This requires that CSSs be deployed in a box-to-box redundant architecture in all the data centers. The CSSs in the different data centers exchange information about VIP availability using a mesh of TCP sessions called CAPP sessions. You can deploy the content switches in either a one-arm or a two-arm configuration. Figure 5-1 shows the one-arm configuration. In addition, the data center architecture might follow different modes based on where the default gateway for the server farm resides: on the content switch or on the aggregation switch. Refer to the Enterprise Data Center SRND for more details.
Decoupled Architecture
In the decoupled architecture, request routing and content switching are decoupled for several reasons. The first and foremost reason is to provide flexibility in choosing content switching products in the data center. This also eliminates some of the restrictions on the type of redundancy used within a data center for content switching, as discussed further in the implementation section. The other reasons are that decoupling provides a stable authoritative DNS server and a good migration path, if needed, for both request routing and content switching.
Figure 5-2 Decoupled Architecture Using CSS
[Diagram: Sites 1 to 3 (Zones 0 to 2) with Internet edge, internal network, core, aggregation, and access layers; a content services switch (WebNS) at Site 1 and Site 2; CAPP sessions between sites]
In a decoupled architecture, the request routing device is connected to the aggregation switches. The content switches in the data center can use either a one-arm or a two-arm configuration, and either VRRP/VIP or box-to-box redundancy if CSSs are deployed. There is also the additional flexibility of choosing different content switches. The configurations at the distributed data centers are similar to that of the first data center. All the data centers are connected in a mesh of TCP or CAPP sessions to exchange VIP availability. The VIP availability within the data center is obtained by probing VIPs on the content switches using ICMP or CAPP-UDP. The remainder of the configuration is similar to the integrated architecture. In summary, from a topology point of view, the main difference between the decoupled and integrated architectures is the deployment of an additional content services switch at the aggregation layer.
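The effect of the session mesh on site scalability can be illustrated with a short Python sketch. It assumes a full mesh with one session per pair of sites, which the text does not state explicitly. The function name and site labels are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def mesh_sessions(sites: List[str]) -> List[Tuple[str, str]]:
    """List one session per unordered pair of sites (full mesh assumed)."""
    return list(combinations(sites, 2))


for count in (2, 3, 4):
    sites = [f"site{i}" for i in range(1, count + 1)]
    print(count, "sites need", len(mesh_sessions(sites)), "sessions")
# 2 sites need 1 sessions
# 3 sites need 3 sessions
# 4 sites need 6 sessions
```

Under this assumption the session count is n(n-1)/2 for n sites, so every added site requires a new session to each existing site.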
Topology Using Distributed Director
DistributedDirector and Director Response Protocol (DRP) agents are software features in IOS. They are deployed in the data center as a service to provide multi-site load distribution. The physical topology matches the generic data center topology, and the DistributedDirector is placed at the core layer on the assumption that Cisco 7200 platforms are used there. If a core layer does not exist or does not use 7200 platforms, then the DistributedDirectors must be deployed as dedicated devices for request
routing at the aggregation layer. The platforms that can be deployed for this purpose are the 26xx or 36xx. The application health checks must traverse layers of firewalls, and the firewall configuration must allow this health check traffic through. For redundancy purposes, the DistributedDirectors are deployed in HSRP mode. DistributedDirectors should be deployed at a minimum of two sites. The DRP agents reside on the aggregation switches of the data centers, because data center availability is provided as a service and the aggregation layer is where most services are provided. Deploy the DRP agent at all the data centers.
Figure 5-3 Topology Using Distributed Director
[Diagram: Sites 1 to 3 with Internet edge (IE), internal network, core, and aggregation layers; DistributedDirector sending application health checks to the sites]
Implementation Details
Working with the CSS
There are two main deployment modes supported by CSS, as mentioned in the previous section. Before detailing the different architectures, it is useful to describe the criteria used in selecting a specific site. These criteria, or site selection methods, are referred to as DNS balance methods in CSSs. The DNS balance methods are:
• Source IP hash
• Round robin
• Load distribution based on site load
• Proximity
  - Static proximity
  - Round trip time
Note
Some other proximity categories, like IGP and BGP metrics, that could be used to send clients to the closest site are not supported in CSS and are not applicable to enterprises.
Both the integrated and decoupled architectures support these balance methods. Using different balance methods yields different results, and different DNS balance methods suit different applications. For instance, for static content, like web pages, which is distributed among different web sites, a simple DNS balance method like round robin or weighted round robin is normally sufficient. For an e-Commerce application, directing a client to the same site over an extended period of time is an important requirement. This is achieved using the Source IP hash DNS balance method.
Site Selection Methods
When a CSS is acting as an authoritative DNS, it monitors the availability of different VIPs on different sites. Upon receiving a DNS request, the CSS responds with an A record of the active VIPs based on one of the following criteria:
• Round Robin: Clients are directed to the sites in a circular fashion. For instance, suppose the VIPs in the ordered list for a specific domain www.xyz.com are 1.1.1.1, 2.2.2.2, and 3.3.3.3. In response to the first DNS request, one of the three VIPs is sent as a response. If the first response was 1.1.1.1, then 2.2.2.2 is sent in response to the second DNS request, and so on.
• Source IP Hash: The source IP address of the DNS request determines which site the client is directed to. Again considering the ordered list in the previous example, one of the three VIPs is selected by applying a hashing algorithm to the source IP address of the DNS request. If, for instance, the hashing algorithm selects VIP 2.2.2.2, then as long as the request comes from that same client, the response contains the same VIP, 2.2.2.2.
• Proximity: Clients matching a list of IP addresses in the access list are directed to specific sites. This is called Static Proximity. The second category of proximity is to direct clients to the site with the least round trip time between the requesting client (the client’s DNS proxy) and the site.
• Least Loaded: Clients are directed to the site with the least load. The definition of load is based on the load balancing device used at the data center. If the load balancing device is a Content Switching Module (CSM) in a Cat6k, then load is measured based on the number of TCP connections set up through the device. However, if a CSS is the load balancing device, then the load is computed based on the response time from the servers. The CSS in a data center assigns load to each server based on the response times. The default load assigned is 2 for the fastest server. The formula used to compute the load is as follows:
  Load = {(Response from a specific server - Fastest Response) / Load Step} + 2
  where Fastest Response = 2 and Load Step is configurable, starting at 10 msecs.
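The load formula can be worked through with a short Python sketch. It treats the fastest response time as an input and rounds the quotient down. Both choices are assumptions of this sketch, and the function name and the sample response times are hypothetical.

```python
def relative_load(response_ms: float, fastest_ms: float,
                  load_step_ms: float = 10.0) -> int:
    """Apply Load = ((response - fastest) / load step) + 2.

    Assumption for this sketch: the quotient is rounded down.
    """
    if load_step_ms <= 0:
        raise ValueError("load step must be positive")
    if response_ms < fastest_ms:
        raise ValueError("response cannot be faster than the fastest response")
    return int((response_ms - fastest_ms) // load_step_ms) + 2


# The fastest server keeps the default load of 2.
print(relative_load(response_ms=20, fastest_ms=20))  # 2

# A server answering 35 ms slower, with a 10 ms load step.
print(relative_load(response_ms=55, fastest_ms=20))  # 5
```

Because the result depends on the fastest response seen, one very fast server raises the computed load of every other server.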
Load for a specific site is computed relative to the fastest server. There are some shortcomings to this method of computation. For instance, while computing the load, the switch takes into account the fastest response time from servers irrespective of whether the fastest server is being used in the pool of servers serving a specific domain or application. This results in directing clients to a site that might be busy. Further, all the clients might be directed to a specific site until the load on the site increases beyond a certain point. In other words, sometimes the load is not distributed equally across multiple sites. Because of this, the Least Loaded site selection mechanism is not recommended.
The DNS balance method used depends on the application. For instance, web pages or static pages can be distributed across multiple data centers, and the clients accessing these web pages can be evenly distributed across different data centers. Applications that are sensitive to latency might require that the clients be distributed based on the latency between the clients and the different sites. E-Commerce applications require that the clients be distributed between different data centers using a round robin approach, but once clients are directed to a specific site, they should remain on that site until the transaction is complete. The physical topology is similar for all the DNS balance methods.
In addition to the above balance methods, a combination of different methods can be used. In general, however, the applications determine which DNS balance mechanism is deployed. Some of the balance method combinations are shown in Table 1. Source IP hash for site persistence cannot be used with other DNS balance methods today, which eliminates some of the combinations. The Least Loaded method of load distribution has some caveats, as mentioned above. Static proximity can be viewed as a method of providing site persistence.
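The round robin and source IP hash methods described in this section can be contrasted with a short Python sketch. It is an illustration only. The hashing algorithm used by the CSS is not given in this document, so SHA-256 stands in for it here, and the function names and client address are hypothetical. The VIP list is the ordered list from the www.xyz.com example.

```python
import hashlib
import ipaddress
from itertools import cycle
from typing import Iterator, Sequence

# Ordered VIP list from the www.xyz.com example.
VIPS = ["1.1.1.1", "2.2.2.2", "3.3.3.3"]


def round_robin(vips: Sequence[str]) -> Iterator[str]:
    """Yield the VIPs in circular order, one per DNS request."""
    return cycle(vips)


def source_ip_hash(client_ip: str, vips: Sequence[str]) -> str:
    """Map a DNS source address to one VIP, stable across requests.

    SHA-256 is a stand-in; the CSS hashing algorithm is not specified here.
    """
    digest = hashlib.sha256(ipaddress.ip_address(client_ip).packed).digest()
    return vips[int.from_bytes(digest[:4], "big") % len(vips)]


answers = round_robin(VIPS)
print([next(answers) for _ in range(4)])
# ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1']

# The same (hypothetical) client address always receives the same VIP.
first = source_ip_hash("10.0.0.1", VIPS)
print(first == source_ip_hash("10.0.0.1", VIPS))  # True
```

Round robin spreads requests evenly but gives no persistence. The hash gives persistence as long as the VIP list and the source address of the client's DNS proxy do not change.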
Modes of Operation
With so many variations, what are the rules that dictate what can and cannot be done? To start with, multi-site load distribution can be achieved using two different modes of operation on the content services switch:
• Rule based DNS
• Zone based DNS
In Rule based DNS, ACLs can be used to influence the site selection method. Rule based DNS does not scale well beyond two sites, mainly because the configuration size increases linearly and becomes difficult to troubleshoot and maintain. Therefore, using Rule based DNS for more than two sites is discouraged. Zone based DNS is the second mode. It is simple to configure and scales well beyond two sites, but it does not work with ACLs, so it cannot be used with DNS balance methods that require ACLs. Some more details about zone and rule based DNS:
• Zone based DNS does not depend on the owner and content rule names. Therefore, you can only configure one A record for the same fully qualified domain name (FQDN) on each CSS. The A record for the remote sites is learned through the CAPP session. In rule based request routing, most of the configuration is under the content rule.
• When configuring zone based request routing, content rules with different names in different locations are allowed. However, when configuring rule based GSLB, you have to configure the same owner name and the same content rule on the peering CSSs.
You can use either an integrated or a decoupled architecture, based on the number of sites and the combinations of DNS balance methods that the applications require. The following table covers the DNS balance methods that can be deployed with both architectures.
Table 1: Site Selection Methods (DNS Balance Methods)

Architecture  DNS Mode    Round Robin  Proximity  Source IP Hash  Least Loaded  Combinations
------------  ----------  -----------  ---------  --------------  ------------  -------------------------------------------------------
Integrated    Rule Based  Yes          Yes (ACL)  Yes             Yes           [Round robin or Preferlocal or leastloaded] + Proximity
Integrated    Zone Based  Yes          RTT (PDB)  Yes             Yes           None
Decoupled     Rule Based  Yes          Yes        Yes             Yes           [Round robin or Preferlocal or leastloaded] + Proximity
Decoupled     Zone Based  Yes          RTT (PDB)  Yes             Yes           None

Note
It is recommended that static proximity be deployed using integrated architecture and that rule based DNS be used with two sites up to a maximum of four sites.
Deployment Methods
Integrated Architecture
If CSSs are being used for content switching in a data center, you can deploy multi-site load distribution on the content services switches. Figure 5-4 provides a detailed topology of a data center starting with the aggregation layer. Services such as content switching, firewalls, and SSL termination are deployed at the distribution layer of the data center. Because all these services are deployed there, the distribution layer is referred to as the aggregation layer. The other data centers are assumed to be similar in topology and are not shown in Figure 5-4.
Figure 5-4 Multi-Site Load Distribution using Integrated Architecture
[Figure labels: Edge, Core, Aggregation; Zone 0, Site 1; To Site 2; To Site 3. Legend: CAPP session; crossover connection for box-to-box redundancy.]
The aggregation switches are connected by an EtherChannel. Typically, the aggregation switches function in active/standby mode. The firewalls connect to the aggregation switches and also function in active/standby mode. The firewalls exchange information via a Layer 2 path provided by the aggregation switches and the trunked EtherChannel. The access switches connected to the aggregation switches are trunked and provide a Layer 2 path to the real/origin servers. The access and aggregation switches run spanning tree protocol and provide the required redundancy at Layer 2. One of the aggregation switches is configured as the root bridge.
The CSSs shown in Figure 5-4 are connected to the aggregation switches in a one arm configuration. The alternative, called a two arm configuration, connects two interfaces on the content services switch to the aggregation switch. In addition, based on where the default gateway for the real/origin servers is, there are three modes of operation. These modes are covered in detail in another design document; refer to the Enterprise Data Center SRND for more information. Multi-site load distribution is agnostic to the three modes of operation, so only the configurations and topologies for multi-site load distribution are discussed from this point on.
The advantage of this architecture is that no additional device needs to be deployed or monitored. This reduces some overhead but at the same time loses flexibility. The caveats are covered in more detail in the next section. Device redundancy is provided by box-to-box redundancy. If the CSS fails, the redundant CSS running box-to-box redundancy takes over and the TCP sessions now terminate on the backup CSS. CAPP sessions cannot be established with a virtual interface. Hence VRRP/VIP redundancy cannot be used within the data center.
It is recommended that Zone based request routing be used in this architecture. Rule based request routing can also be used, but only when there is a maximum of four sites. Zone based DNS request routing is recommended simply because it scales well and is easier to configure, troubleshoot, and maintain. With Zone based DNS request routing, you can deploy a maximum of 16 sites. Zone based request routing can be used with most DNS balance methods except those that involve static proximity. If the number of sites is limited to two, then this architecture also supports rule based request routing.
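The sizing guidance above can be summarized as a small decision helper. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and it encodes just the limits stated in this section (rule based up to four sites, zone based up to 16 sites, static proximity requiring rule based DNS).

```python
def choose_dns_mode(num_sites, needs_static_proximity):
    """Suggest a request routing mode from the limits stated in the text.

    num_sites: number of data centers taking part in load distribution.
    needs_static_proximity: True if ACL-driven static proximity is required,
        which zone based DNS cannot provide.
    """
    if num_sites < 2:
        raise ValueError("multi-site load distribution needs at least two sites")
    if needs_static_proximity:
        if num_sites > 4:
            raise ValueError("rule based DNS is not recommended beyond four sites")
        return "rule-based"
    if num_sites > 16:
        raise ValueError("zone based DNS supports a maximum of 16 sites")
    return "zone-based"


print(choose_dns_mode(2, needs_static_proximity=True))
print(choose_dns_mode(8, needs_static_proximity=False))
```

The helper prefers zone based DNS whenever ACLs are not needed, mirroring the recommendation in the text.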
Caveats
Performance is a concern when the CSS is used for both content switching and request routing. Failure of one of the content switches results in a much longer time for request routing to resume normal operation. The probability of device failure also depends on content switching software stability. In an integrated architecture, multi-site load distribution cannot be achieved if the content switch used does not support request routing. Deploying least loaded multi-site load distribution is not recommended because the mechanisms used to compute the load are not effective, as discussed in a previous section. Finally, this architecture prohibits the use of VRRP/VIP redundancy in the data center.
Configuration
There are two methods of configuration: rule based and zone based. Rule based configuration is not recommended beyond four sites because of the configuration scalability and maintenance issues. More information about rule based configuration is provided in Chapter 4, “Site to Site Recovery.” Zone based is recommended for multi-site load distribution. The steps involved in the configuration are shown in the following table.

Step 1  Configure the zone along with the site selection method.
            dns-server zone 3 tier1 "San Jose Data Center" leastloaded
Step 2  Configure a DNS record specifying a fully qualified domain name and an IP address to go with it.
            dns-record a www.ese-datacenter.com 20.17.99.101 0
Step 3  Configure a CAPP session to each one of the remote data centers. This command also provides options to change the keepalive frequency and session encryption. Refer to the configuration guide for the different options.
            app session 20.18.100.2
            app session 20.16.100.2
Step 4  Enable DNS server on the content switching device.
            dns-server
Step 5  Enable CAPP sessions.
            app
Step 6  Repeat these steps for all the sites.

Note
Refer to the configuration guide for all the different options of the command.
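When the same steps must be repeated at every site, it can help to generate the command list from a few parameters. The following Python sketch only assembles the command strings shown in the steps above; the function name and its parameters are hypothetical, and the trailing 0 on the dns-record line is copied from the example as-is.

```python
def zone_based_config(zone, description, method, fqdn, vip, remote_peers):
    """Return the zone based request routing commands for one site.

    Only the commands from the steps above are emitted; any further
    options would have to come from the configuration guide.
    """
    lines = [
        f'dns-server zone {zone} tier1 "{description}" {method}',  # Step 1
        f"dns-record a {fqdn} {vip} 0",                            # Step 2
    ]
    lines += [f"app session {peer}" for peer in remote_peers]      # Step 3
    lines += ["dns-server", "app"]                                 # Steps 4 and 5
    return lines


for line in zone_based_config(
    zone=3,
    description="San Jose Data Center",
    method="roundrobin",
    fqdn="www.ese-datacenter.com",
    vip="20.17.99.101",
    remote_peers=["20.18.100.2", "20.16.100.2"],
):
    print(line)
```

Calling the function once per site with that site's zone, VIP, and remote peers covers Step 6.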
Use the first command to specify the site selection mechanism. In this example, leastloaded is the site selection mechanism. You could also use round robin, source ip or preferlocal. In step 2, the IP address specified is that of a virtual IP address configured on the device. For more information on configuring a VIP and content rules, refer to the Enterprise Data Center SRND. Once all the steps are completed, the CSS establishes a CAPP session with the active CSS at the remote data centers and, after the session goes into active state, the CSS learns A records from the remote data centers. Use the following commands to monitor the status of DNS records and get the statistics.

* show dns-record keepalive

    CSS-B(config)# sh dns-record keepalive
    Name: www.ese-datacenter.com
    Type: Ap     IP: 20.17.99.101
    State: UP    Transitions: 21
    Load: 2      Threshold: 254

* show dns-server stats

    CssA(config)# sh dns-server stats
    DNS Server SCM database Statistics:
    DNS Name:               Content Name:  Location:     Resolve Local:  Remote:
    ----------------------  -------------  ------------  --------------  -------
    www.ese-datacenter.com  l4-server-40   20.17.99.101  10              17
Decoupled Architecture
The decoupled architecture is more flexible than the integrated architecture. In this architecture, the request routing functionality is deployed on a dedicated CSS. Two variations are depicted in Figure 5-5. In the first scenario, the content switch is connected to both aggregation switches and can use the same VLAN to connect to both. The content switch points to the active HSRP address at the aggregation layer, so redundancy is provided at the aggregation layer: if the aggregation switch with the active HSRP address fails, the second switch takes over. In the second scenario, a redundant CSS is deployed. When a CAPP session is used on the content switch, only box-to-box redundancy can be used.
Figure 5-5 Decoupled Architecture Using CSS
[Figure labels, two panels: Core Layer, Aggregation, Trunk, To Servers; Site 1 (Zone 0); To Site 2 (Zone 1); To Site 3 (Zone 2). Legend: CAPP session; crossover connection for box-to-box redundancy.]
The illustration on the left shows a deployment with redundant links and the illustration on the right shows a deployment with redundant devices. The separation of request routing and content switching has many advantages. There is no dependency on the content switches being used, and the dependency on box-to-box redundancy and content switches is eliminated. The content switch still probes the different VIPs to determine which FQDNs are configured locally, communicates the domains' health information, and learns the domains configured on remote sites using the CAPP sessions with other sites. VIP availability can be probed using UDP sessions between the proximity domain name server (PDNS) and the content switches, as long as the content switches can also use the same protocol. These are referred to as APP-UDP sessions. In addition to availability information, the PDNS and the content switch can also exchange load information over the same UDP sessions. As previously mentioned, deploying least loaded multi-site load distribution is not recommended. ICMP is also an option to probe for VIP availability. The dedicated content switch is referred to as the PDNS in this architecture. With this architecture, site-to-site recovery or disaster recovery is also supported. For a list of site selection methods supported with this architecture, refer to Table 1.
This architecture mimics the proximity database (PDB) and PDNS deployment. The PDB is not shown in the architecture, but by deploying a PDB, this architecture provides multi-site load distribution based on RTT between the clients and the sites. Alternatively, it is possible to migrate from deploying a PDNS to an appliance deployment which supports multi-site load distribution based on RTT without having to deploy PDBs. Further, with this architecture, a variation of least loaded connections can be deployed. This variation takes maximum connections into account. The PDNS and CSSs are deployed on the inside interface of the firewalls. This means the firewalls must allow the CAPP sessions to pass through them.
Caveats
This architecture primarily uses zone based DNS, in which ACLs cannot be used to influence DNS decisions. When a domain is embedded in the content rule (rule based DNS), ACLs can influence DNS decisions. With rule based DNS, however, it is advised that the number of sites be limited to two, although this can be stretched to a maximum of four. This implies that multiple A records cannot be handed out if required. This translates to: use static proximity to distribute load between two sites and use zone based DNS for other site selection methods.
Note
PDNS + PDB deployment, which is used to provide site stickiness, has some limitations and is not tested because there are newer products that provide a better solution without deploying additional devices.
Configuration
In this architecture, configure the PDNS and the content switch. The PDNS uses Zone based configuration and the configuration steps are similar to those for the integrated architecture. They are repeated in the following table for convenience. For the content switch, the configuration is minimal and requires the addition of one command to the existing content rule.

Step 1  Configure the zone along with the site selection method.
            dns-server zone 3 tier1 "San Jose Data Center" roundrobin
Step 2  Configure a DNS record specifying a fully qualified domain name and an IP address to go with it. In addition, the circuit address of the content switch needs to be specified to keep track of the VIP availability.
            dns-record a www.ese-datacenter.com 20.17.99.101 0 0 single kal-ap 20.17.99.9
Step 3  Configure a CAPP session to each one of the remote data centers. This command also provides options to change the keepalive frequency and session encryption. Refer to the configuration guide for the different options.
            app session 20.18.100.2
            app session 20.16.100.2
Step 4  Enable DNS server on the content switching device.
            dns-server
Step 5  Enable CAPP sessions.
            app
Step 6  Repeat these steps for all the sites.
Step 7  On the CSS, enable the UDP session to monitor the health of the VIP.
            app-udp
Step 8  On the same CSS as in step 7, configure the fully qualified domain name in the content rule of the corresponding VIP. This step is essential.
            add dns www.ese-datacenter.com
Step 9  Repeat steps 7 and 8 on the redundant content switch.
In addition to the above steps, there are some commands available to control the exchange of information via CAPP sessions. These commands are optional; the default values are used if they are not configured.

    app framesz             Set the maximum frame size
    app port                Change the TCP port. Default is 5001
    dns-peer receive-slots  Set the maximum DNS names to be received. Default is 128
    dns-peer send-slots     Set the maximum DNS names to be sent. Default is 128
Working with Distributed Directors
Two modules work together to provide a multi-site load distribution solution. The module that acts as the authoritative DNS is the DistributedDirector (DD), and the module that helps the DD make its decision is the DRP agent. Typically there are at least two DDs, or two sets of DDs, in two sites and as many DRP agents as there are sites. The DDs send health probes to the VIPs at each of the data centers to determine VIP availability and, based on the configuration, query the DRP agents to decide where to direct the clients.
Director Response Protocol (DRP)
The DRP is a simple UDP-based proprietary protocol. It enables Cisco's DistributedDirector software to query routers (DRP Server Agents) in the field for BGP and IGP routing table metrics between distributed servers and clients. DistributedDirector uses DRP to transparently direct end-user service requests to the topologically closest site. DRP enables DistributedDirector to provide dynamic, scalable, and “network intelligent” Internet traffic load distribution between multiple geographically dispersed data centers. DRP Server Agents are border routers (or peers to border routers) that support the geographically distributed servers for which DistributedDirector service distribution is desired.
Note
Because DistributedDirector makes decisions based on BGP and IGP information, all DRP Server Agents must have access to full BGP and IGP routing tables.
DRP Access Limiting and Authentication
It is possible to apply an ACL to the interface on the router where the DRP agent resides to limit the responses of the DRP agent. Only the IP addresses in the ACL are allowed to query the DRP agent. This might be done in order to qualify the queries. Another available security measure is to configure the DRP Server Agent to authenticate DRP queries and responses. Message Digest 5 (MD5) can be set up to authenticate DRP queries from DistributedDirector and DRP agents. This involves defining a key chain, identifying the keys that belong to the key chain, and specifying how long each key is valid.
Dynamic Feedback Protocol
Like the CSS, DistributedDirector performs multi-site load distribution based on site load. DistributedDirector can obtain load information from content switches like Cisco Local Director and the CSM using DFP. DFP support enables DistributedDirector to make server and site selection on the basis of site load, in addition to other DistributedDirector metrics. This protocol allows you to configure the DistributedDirector to communicate with various DFP agents (similar to DRP agents) residing on content switches. The DistributedDirector tells the DFP agents how often they should report load information; then the DFP agent informs the DistributedDirector which content switch to decommission from service. The metric used is

    metric = 65535 * RF * BF

where

    RF = real server factor = (# reals accepting connections / # reals inservice)
                            = num of servers active / num of servers defined
    BF = box factor         = minimum of (SF, CF)
    SF = storage factor     = (# active connections / max connections)
    CF = CPU factor         = ((100 - % CPU utilization) / 100)
Note
The Real Server Factor (RF) takes into consideration DFP protocol running on hosts or real servers. Host DFP informs the content switch of the status of the real servers. If some of these servers stop accepting the connections, the content switch is informed. If host DFP is not implemented, then RF = 1. By configuring the maximum connections allowed on a content switching device, the acceptable load on a content switch can be limited. This allows the Distributed Director to avoid directing clients to a busy site.
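As a worked example of the metric, the following Python sketch evaluates the formula with the factors exactly as they are defined in this section, including the storage factor as the ratio of active to maximum connections. The function name, parameter names, and sample numbers are hypothetical, and the result is left as a floating-point value because the text does not say how it is rounded.

```python
def dfp_metric(reals_accepting, reals_inservice, active_conns, max_conns,
               cpu_percent, host_dfp=True):
    """Compute 65535 * RF * BF from the definitions given in the text."""
    # RF is 1 when host DFP is not implemented, as the note states.
    rf = reals_accepting / reals_inservice if host_dfp else 1.0
    sf = active_conns / max_conns      # storage factor, as written in the text
    cf = (100 - cpu_percent) / 100     # CPU factor
    bf = min(sf, cf)                   # box factor
    return 65535 * rf * bf


# Hypothetical sample: all 4 reals accepting, 500 of 1000 connections, 20% CPU.
# RF = 1.0, SF = 0.5, CF = 0.8, BF = 0.5, so the metric is 65535 * 0.5.
print(dfp_metric(4, 4, 500, 1000, 20))   # 32767.5
```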
Site Selection Methods
In addition to the site selection methods supported in the CSS, DD supports IGP and BGP metrics to select a specific site. One mechanism that is not supported in DD is source IP hash (site stickiness based on the source IP address), but DD provides a fallback mechanism when selecting a site, and it can select a site based on RTT without having to deploy additional devices. DD provides more flexibility in choosing a specific site based on different metrics. To choose a site, different rules can be configured and priorities assigned to those rules. While making a site selection, the DD evaluates the rules in priority order. If a decision cannot be made with the first rule, DD consults the second rule, and so on. DD also supports static proximity.
Site persistence is based on cached values. Typically, based on a request from a DNS proxy, DD consults the various rules and sites to resolve to a specific IP address. This IP address is cached for a specific time period, one minute by default. It is possible to change this default value such that the address is cached forever (subject to availability of the IP address). The different site selection methods are summarized in Table 2.
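The priority fallback and the cached answer described above can be modelled in a few lines. This Python sketch is a conceptual illustration, not DistributedDirector code: the class, the rule functions, and the injectable clock are all hypothetical, and only the one-minute default comes from the text.

```python
import time


def select_site(rules, sites):
    """Try each (priority, rule) pair in ascending priority order.

    A rule returns a site, or None when it cannot decide, in which
    case the next rule is consulted.
    """
    for _, rule in sorted(rules, key=lambda item: item[0]):
        choice = rule(sites)
        if choice is not None:
            return choice
    return None


class CachedSelector:
    """Remember the answer given to each requester for ttl_seconds."""

    def __init__(self, rules, sites, ttl_seconds=60, clock=time.monotonic):
        self.rules, self.sites = rules, sites
        self.ttl, self.clock = ttl_seconds, clock
        self.cache = {}  # requester -> (site, time the answer was cached)

    def resolve(self, requester):
        now = self.clock()
        hit = self.cache.get(requester)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]                      # cached answer gives persistence
        site = select_site(self.rules, self.sites)
        self.cache[requester] = (site, now)
        return site


# Rule 1 cannot decide, so rule 2 (always pick the first site) is used.
rules = [(1, lambda sites: None), (2, lambda sites: sites[0])]
selector = CachedSelector(rules, ["20.16.1.5", "20.17.1.5", "20.18.1.5"])
print(selector.resolve("dns-proxy-a"))
```

A longer `ttl_seconds` trades responsiveness to site changes for stronger persistence, which is the same trade-off as raising the cache time on the DD.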
Table 2: Site Selection Methods Supported in DistributedDirector

Site Selection Method  Distributed Director
---------------------  ------------------------------------------------------------
Round Robin            Yes
Proximity              RTT, ACL, IGP & BGP
Least Loaded           DFP
Source IP Hash         Not Supported. Substitute is to use cached DNS entry
Combinations           [Least Loaded] + [Proximity [RTT, ACL, IGP]] + [Round robin]
Deployment Options
Distributed Director (DD) is now integrated into IOS and is supported on different platforms. You can deploy DD at different layers, as shown in Figure 5-6, based on the availability of DD on different platforms. DDs can be deployed as an integrated part of a switch at the aggregation layer, the core, or the edge, or as a dedicated device. Ideally, DDs should be deployed in the aggregation switch, similar to the CSM deployment. Because DD is not available on Cat6Ks, it cannot be deployed in the aggregation switch if a Cat6K is being used at the aggregation layer. The next best option is to deploy DD at the core layer: there is a good chance that a 7200 platform is deployed at the core, and DD can be deployed on it. Another option is to use a dedicated device that supports DD, such as a 2600, and deploy it at the aggregation layer. The DRP agent is supported on Cat6K platforms if the native image is used, so the DRP agent can be deployed on the aggregation switch.
DD sends health probes across the network to monitor application availability. These health probes have to traverse the firewalls, so the firewalls need to be configured to allow the health probes to pass through. The types of probe traffic include TCP and HTTP. The TCP ports used for health probes depend on the type of applications being used. When a dedicated DD is used, it is typically deployed at the aggregation layer and physically connected to both aggregation switches to protect against link failure. To provide enhanced redundancy, two DDs can be connected to the aggregation switches, with HSRP running between the DDs.
Figure 5-6 Distributed Directors at Different Layers
[Figure labels, two panels: To Internet, Edge, Core Layer, Core, E1, E2, Aggregation, Trunk, Site 1.]
Figure 5-6 displays DD deployed at the core on the left and on the aggregation switches on the right.
Caveats
DD is not available on all platforms. This makes it difficult to recommend DD, since flexibility is lost when deploying at different layers in the data center. As shown in the second part of Figure 5-6, a dedicated device has to be deployed to achieve multi-site load distribution. DDs do not exchange messages with each other over a TCP session to learn about VIP availability. Instead, each DD probes VIPs across data centers to check availability, which can create a large volume of health probe traffic depending on the number of VIPs and the probe frequency. In addition, the type of applications being deployed in the data centers might require more robust health check mechanisms. With the increasing number of applications being hosted in a data center, the health probes can pile up quickly and can also cause higher CPU utilization in the DDs. Moreover, the types of health probes available in DD are limited to TCP and verify-URL. Another limitation related to health probes is the lack of commands to tune the health check. It is recommended to keep the maximum number of SOA records configured on DDs to around 100.
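A rough estimate shows how quickly the probe traffic grows. The Python sketch below assumes, for illustration only, that every DD probes every VIP once per probe interval; the function name and the sample numbers (two DDs, 100 VIPs, a 10 second interval) are hypothetical and not taken from any product default.

```python
def probes_per_second(num_directors, num_vips, interval_seconds):
    """Estimate aggregate health probe rate across all data centers.

    Assumes each director probes each VIP once per interval.
    """
    return num_directors * num_vips / interval_seconds


# Two DDs probing 100 VIPs every 10 seconds.
print(probes_per_second(2, 100, 10))   # 20.0

# Doubling the VIP count or halving the interval doubles the probe rate.
print(probes_per_second(2, 200, 10))   # 40.0
```

Because the rate scales with the product of directors and VIPs, each added application increases the load on every DD, which is the scaling concern raised above.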
Configuration
DD is essentially an intelligent DNS. DD has a list of IP addresses that are handed out in response to a DNS query for a sub-domain. The configuration steps are as follows.
Step 1  Configure the fully qualified domain name and the different IP addresses associated with the domain.
Step 2  Configure the SOA records.
Step 3  Configure the DRP agents and enable DRP agents.
Step 4  Create an association between the DRP agents and the DD.
Step 5  Configure the site selection mechanism.
The commands associated with the steps above are as follows. The following command is used to specify the IP addresses of the multiple sites.

    ip host clients.ese-datacenter.com 20.16.1.5 20.17.1.5 20.18.1.5

The Distributed Director is also “authoritative” for the subdomain www.ese-datacenter.com.

Note
The name server for clients.ese-datacenter.com must have an NS record entry delegating the subdomain www.ese-datacenter.com to the Distributed Director.

The SOA record is configured with the following command.

    ip dns primary clients.ese-datacenter.com soa ddir.ese-datacenter.com admin.ese-datacenter.com
Configuring a DRP agent
The following example enables the DRP Agent. Sources of DRP queries are limited by access list 1, which permits only queries from the host at 33.45.12.4. Authentication is also configured for the DRP queries and responses.

    ip drp server
    access-list 1 permit 33.45.12.4
    ip drp access-group 1

Configuring authentication:

    ip drp authentication key-chain mktg
    key chain mktg
    key 1
    key-string internal

To monitor and maintain the DRP Agent, use the following commands in EXEC mode:

    clear ip drp  Clear statistics being collected on DRP requests and responses
    show ip drp   Display information about the DRP Agent
Associating DD and DRP Agent
The association between a DD and a DRP agent is made using the following command:

    ip director server drp-association

Each time a DNS query is made to the Distributed Director, the Distributed Director makes a DRP query to each of the DRP agents, and compares the metrics returned to select the server that is closest to your client.

    ip director server 192.168.1.5 drp-association 192.168.1.1
    ip director server 172.16.1.5 drp-association 172.16.1.1
    ip director server 10.3.4.5 drp-association 10.3.4.1
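The selection that follows from these associations can be sketched as a lookup and a comparison. In the Python sketch below, the server and agent addresses are the ones from the example, but the `query_agent` callback, the metric values, and the assumption that a lower metric means a closer server are hypothetical stand-ins for the real DRP exchange.

```python
# Server address -> address of its associated DRP agent (from the example).
ASSOCIATIONS = {
    "192.168.1.5": "192.168.1.1",
    "172.16.1.5": "172.16.1.1",
    "10.3.4.5": "10.3.4.1",
}


def closest_server(associations, query_agent, client_ip):
    """Ask every DRP agent for its metric to the client and pick the best.

    query_agent(agent_ip, client_ip) stands in for a DRP query; a lower
    returned value is assumed to mean the server is closer to the client.
    """
    metrics = {
        server: query_agent(agent, client_ip)
        for server, agent in associations.items()
    }
    return min(metrics, key=metrics.get)


# Made-up metrics that each agent might report for one client.
sample_metrics = {"192.168.1.1": 5, "172.16.1.1": 2, "10.3.4.1": 9}
print(closest_server(ASSOCIATIONS, lambda agent, client: sample_metrics[agent], "33.45.12.4"))
```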
Configuring Site Selection on DD
Use the following two steps to perform specific site selection:
• Associate a metric to a specific server. It is possible to configure different metrics for the same server. For a more detailed explanation of the different metrics, consult the configuration guide. The metrics that are allowed are access-group, portion, availability, preference, and route-map.
    ip director server
• Configure a priority for the fully qualified domain name. The metrics allowed are admin, availability, portion, drp-ext, drp-int, drp-rtt, drp-ser, routemap, and random. Based on the priority associated with each metric, the DD directs the clients to a specific site.
    ip director hosts clients.ese-datacenter.com priority admin 1 portion 15
Recommendations
The DD and CSS are the only two products that provide features for multi-site load distribution. Although DD provides most of the features, CSS is recommended for multi-site load distribution. CSS has various deployment methods that can be used. It is recommended that CSS be used in a decoupled architecture for the following reasons:
• Provides a modular design approach for multi-site load distribution
• Being a content switch, provides a variety of health probes and more flexibility in how these health checks are done
• Helps keep the health probes local to the data center
• Can exchange application availability information between different sites using encrypted sessions
• Can provide site stickiness based on the source IP address of the DNS request
• Provides a good migration path towards future product enhancements and new products
CHAPTER 6
Multi Site Multi Homing
In almost all Enterprise infrastructures today, Internet connectivity is universal, although the topology designs may be unique. This chapter introduces a reference topology for Internet edge network deployments. It covers the basic design principles and introduces common deployment barriers related to Internet edge topologies.
Overview
This chapter identifies and clarifies multi-site Internet edge designs. Multi-site Internet edge design refers to having more than one data center connected to the Internet backbone. Each data center can either be multi-homed or have a single connection. This architecture includes the core design principles associated with all network infrastructure designs while paying special attention to the unique requirements relevant to Internet edge multi-site topologies. Like any infrastructure design, these designs must be highly scalable while maintaining the key aspects of security and redundancy. The key security functions include:
• Element security
• Identity services
• IP anti-spoofing
• Demilitarized zones (DMZ)
• Basic filtering and application definition
• Intrusion detection
The key redundancy functions associated with multi-site topologies are:
• Multiple data centers that act as Internet gateways for internal users.
• Distributed data centers that provide Internet/intranet server farm resiliency.
This chapter also discusses multi-homing. Multi-homing provides ISP resiliency by connecting each data center to two or more ISPs depending on the bandwidth requirements, the server farm architecture, or other Internet services. These Internet connections can be a transit point for traffic both inbound to the architecture and outbound to the Internet backbone for the Internet/intranet server farms, as depicted in Figure 6-1. This chapter also describes some common deployment problems when introducing distributed data centers into any network topology. Deploying distributed data centers introduces additional complexities for network administrators who want to fully utilize both Internet gateway locations. These challenges include:
• Application distribution
• DNS propagation
• Replication timeouts
These design issues are covered in the Data Center Networking: Distributed Data Center SRND located at http://www.cisco.com/en/US/netsol/ns110/ns53/ns224/ns304/networking_solutions_design_guidances_list.html.
Figure 6-1 Data Center Topology
[Figure labels: Internet, PSTN, Partners, WAN, SP1, SP2, VPN, Remote Office, Internet Edge, Internet Gateway, DMZ, Private WAN, Internet Server Farm, Campus Core, Corporate Infrastructure, Extranet Data Center, Intranet Data Center.]
Multi-Site Multi-Homing Design Principles
As mentioned above, Internet edge solutions touch many different types of enterprise networks and therefore can have many different topologies, ranging from a remote office connection to a major ISP peering point. Using the common design principles associated with all network architectures allows you to carry these recommendations into almost all Internet edge topologies, from a single-site ISP connection to a multi-site multi-homing environment.
High Availability
For an enterprise connected to a single ISP, the differences in redundancy reside at the ISP peering point. If you have a single ISP connection at the edge of your network topology, redundancy is a moot issue because you have only a single exit point. If the primary edge router fails, the Internet connection itself is down. Therefore, defining redundancy at the edge of the network has no beneficial effect unless the provider supplies two terrestrial circuits as depicted below. Multi-homing implementations offer redundancy in these instances, as well as in the instances where there are multiple data centers for a single enterprise. You can leverage each data center for redundancy and scalability if you partition applications across multiple data centers.
Figure 6-2 Multi-Site Multi-Homing Design
[Figure labels: Internet, SP 1, SP 2, SP 3, Corporate WAN, West Coast Data Center, East Coast Data Center, West Coast Remote offices, East Coast Remote offices.]
Multi-site Internet edge topologies are also composed of multiple layers. There must be no single point of failure within the network architecture, so complete Internet edge device redundancy is a necessity. The infrastructure devices, such as routers and switches, coupled with specific Layer 2 and Layer 3 technologies, help achieve this device redundancy. To meet this requirement, Internet edge topologies use some of the key functions of the IOS software. The Layer 3 features used for high availability offer redundant default gateways for networked hosts and provide a predictable traffic flow both in normal operating conditions and under the adverse conditions surrounding a network link or device failure. These Layer 3 features include:
• Hot Standby Router Protocol (HSRP)
• Multigroup Hot Standby Router Protocol (MHSRP)
• Routing protocol metric tuning (EIGRP and OSPF)
These Layer 3 functions also contribute to redundancy by offering multiple default gateways in the network topologies. HSRP and Multigroup HSRP offer Layer 3 gateway redundancy, whereas the dynamic routing protocols offer a higher-level view of network availability.
For instance, you could deploy HSRP between the edge routers to present a single default gateway instance to the internal networks. In this case, if the primary router fails, the HSRP address is still active on the secondary router; therefore, the defined static route is still valid.
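Multigroup HSRP, listed among the Layer 3 features above, lets two routers share the forwarding load while backing each other up. The fragment below is a minimal sketch, not a tested configuration: the group numbers, the 10.10.10.0/24 addressing, the interface names, and the priority values are all assumptions chosen for illustration.

```
! Router A: active for group 1, standby for group 2
interface FastEthernet0/0
 ip address 10.10.10.2 255.255.255.0
 standby 1 ip 10.10.10.1
 standby 1 priority 110
 standby 1 preempt
 standby 2 ip 10.10.10.254
 standby 2 priority 90
 standby 2 preempt
!
! Router B: standby for group 1, active for group 2
interface FastEthernet0/0
 ip address 10.10.10.3 255.255.255.0
 standby 1 ip 10.10.10.1
 standby 1 priority 90
 standby 1 preempt
 standby 2 ip 10.10.10.254
 standby 2 priority 110
 standby 2 preempt
```

In this sketch, half of the downstream hosts would use 10.10.10.1 as their gateway and the other half 10.10.10.254, so both routers forward traffic in normal operation and either one can carry both virtual addresses after a failure.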
Scalability
The network architecture must be scalable to accommodate increasing user support, as well as unforeseen bursts in network traffic. While feature availability and the processing power of network devices are important design considerations, physical capacity attributes, such as port density, can limit architecture scalability. The termination of circuits can become a burden on device scalability within the border layer of this topology. The same burden applies to the firewall device provisioning layer and the Layer 3 switching layer. Port density is also important at the Layer 3 switching layer because it provides additional connections for host devices, in this case, servers.
Intelligent Network Services
In all network topologies, the intelligent network services present in the IOS software, such as QoS functions and high availability technologies like HSRP, are used to ensure network availability. HSRP is documented and detailed below for typical deployment scenarios.
HSRP
HSRP enables a set of routers to work together, giving the appearance of a single virtual router or default gateway to the hosts on a LAN. HSRP is particularly useful in environments where critical applications are running and fault-tolerant networks have been designed. By sharing an IP address and a MAC address, two or more routers acting as a single virtual router can seamlessly assume the routing responsibility in a defined event or an unexpected failure. Hosts on the LAN continue to forward IP packets to a consistent IP and MAC address, which makes the changeover of routing devices transparent.
HSRP allows you to configure hot standby groups that share responsibility for an IP address. You can give each router a priority, which weights the routers for active router selection. One router in each group is elected as the active forwarder and one as the standby router, according to the routers' configured priorities. The router with the highest priority wins and, in the event of a tie in priority, the higher configured IP address breaks the tie. Other routers in the group monitor the status of the active and standby routers to enable further fault tolerance. All HSRP routers participating in a standby group watch for hello packets from the active and standby routers. If the hello and dead timers and the shared standby IP address are not explicitly configured on an individual router, that router learns them from the active router in the group. Although this is a dynamic process, Cisco recommends that you define the HSRP dead timers in the topology.
If the active router becomes unavailable due to scheduled maintenance, power failure, or other reasons, the standby router assumes this functionality transparently within a few seconds. Failover occurs when three successive hello packets are missed and the dead timer expires. The standby router promptly takes over the virtual addresses, identity, and responsibility. When the secondary interface assumes mastership, the new master sends a gratuitous ARP, which updates the Layer 2 switch's content addressable memory (CAM). The new master then becomes the primary route for the devices accessing this gateway. HSRP timers can be configured per HSRP instance.
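The election and timer behavior described above can be sketched as a configuration fragment. This is an illustration under stated assumptions rather than a tested configuration: the group number 1, the 10.10.10.0/24 addressing, the interface names, the priority of 110, and the 1-second hello and 3-second hold timers are all hypothetical values.

```
! Router A: intended active router (highest priority wins the election)
interface FastEthernet0/0
 ip address 10.10.10.2 255.255.255.0
 standby 1 ip 10.10.10.1
 standby 1 priority 110
 standby 1 preempt
 standby 1 timers 1 3
!
! Router B: standby router (left at the default priority of 100)
interface FastEthernet0/0
 ip address 10.10.10.3 255.255.255.0
 standby 1 ip 10.10.10.1
 standby 1 preempt
 standby 1 timers 1 3
```

Hosts, or a downstream device that uses a static default route, would point at the virtual address 10.10.10.1 rather than at either physical address. With the timers shown, Router B would take over roughly three seconds after Router A stops sending hellos, and the preempt keyword lets Router A reclaim the active role once it recovers.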
Routing Protocol Technologies
Before introducing and examining the basic ways in which autonomous systems can be connected to ISPs, some basic terminology and concepts of routing must be established. There are three basic routing approaches:
• Static
• Default
• Dynamic
Static routing refers to routes to destinations manually listed in the router. Network reachability, in this case, is not dependent on the existence and state of the network itself. Whether a destination is up or down, the static routes remain in the routing table, and traffic is still sent toward that destination.
Default routing refers to a “last resort” outlet. Traffic to destinations that are unknown to the router is sent to the default outlet. Default routes are also manually listed in the router. Default routing is the easiest form of routing for a domain connected to a single exit point.
Dynamic routing refers to the router learning routes via an internal or external routing protocol. Network reachability is dependent on the existence and state of the network. If a destination is down, the route disappears from the routing table and traffic is no longer sent toward that destination.
These three routing approaches are possibilities for all the configurations considered in forthcoming sections, but usually there is an optimal approach. Thus, in illustrating different autonomous systems, this chapter considers whether static, dynamic, default, or some combination of these routing approaches is optimal. This chapter also considers whether interior or exterior routing protocols are appropriate.
Interior gateway protocols (IGPs) can be used to advertise the customer's networks. An IGP can run between the enterprise and the provider so that the enterprise can advertise its routes. This has all the benefits of dynamic routing because network information and changes are dynamically sent to the provider. The IGPs also distribute the network routes upstream to the BGP function.
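The three routing approaches described above can be contrasted in a short configuration sketch. The prefixes, next-hop address, and OSPF process number below are assumptions made for illustration and do not come from the tested topology.

```
! Static route: remains configured whether or not the destination is reachable
ip route 192.168.50.0 255.255.255.0 10.10.10.254
!
! Default route: last-resort outlet for destinations unknown to the router
ip route 0.0.0.0 0.0.0.0 10.10.10.254
!
! Dynamic routing: routes are learned from, and withdrawn by, the protocol
router ospf 1
 network 10.10.10.0 0.0.0.255 area 0
```

The first two entries are maintained by hand, which is why they cannot react to a failure beyond the next hop. Only the OSPF process reflects the state of the network.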
Edge Routing - BGP
The Border Gateway Protocol (BGP) performs interdomain routing in TCP/IP networks. BGP is an exterior gateway protocol (EGP), which means that it performs routing between multiple autonomous systems or domains and exchanges routing and reachability information with other BGP systems.
BGP devices exchange routing information upon initial data exchange and during incremental updates. When a router first connects to the network, BGP routers exchange their entire BGP routing tables and, when the routing table changes, those same routers send only the changed portion of their routing tables. BGP routers do not send regularly scheduled routing updates, and BGP routing updates advertise only the optimal path to a network.
BGP uses a single routing metric to determine the best path to a given network. This metric consists of an arbitrary unit number that specifies the degree of preference of a particular link. The BGP metric typically is assigned to each link by the network administrator. The value assigned to a link can be based on any number of criteria, including the number of autonomous systems through which the path passes, stability, speed, delay, or cost.
BGP performs three types of routing: interautonomous system routing, intra-autonomous system routing, and pass-through autonomous system routing.
• Interautonomous system routing occurs between two or more BGP routers in different autonomous systems. Peer routers in these systems use BGP to maintain a consistent view of the internetwork topology. BGP neighbors communicating between autonomous systems must reside on the same physical network. The Internet serves as an example of an entity that uses this type of routing because it comprises autonomous systems or administrative domains. Many of these domains represent the various institutions, corporations, and entities that make up the Internet. BGP is frequently used for path determination to provide optimal routing within the Internet.
• Intra-autonomous system routing occurs between two or more BGP routers located within the same autonomous system. Peer routers within the same autonomous system use BGP to maintain a consistent view of the system topology. BGP is also used to determine which router serves as the connection point for specific external autonomous systems. Once again, the Internet provides an example, this time of intra-autonomous system routing: an organization, such as a university, could make use of BGP to provide optimal routing within its own administrative domain or autonomous system. The BGP protocol can provide both inter- and intra-autonomous system routing services.
• Pass-through autonomous system routing occurs between two or more BGP peer routers that exchange traffic across an autonomous system that does not run BGP. In a pass-through autonomous system environment, the BGP traffic did not originate within the autonomous system in question and is not destined for a node in the autonomous system. BGP must interact with whatever intra-autonomous system routing protocol is being used to successfully transport BGP traffic through that autonomous system.
BGP Attributes
BGP attributes support the control of both inbound and outbound network routes. These attributes can be adjusted to control the decision-making process of BGP itself. The BGP attributes are a set of parameters that describe the characteristics of a prefix (route). The BGP decision process uses these attributes to select its best routes. Specific attributes associated with larger topologies like this one, namely the MED attribute and the use of route reflectors, are addressed later in this chapter. Figure 6-3 displays a multi-site multi-homed topology.
Figure 6-3
Multi Site Internet Edge Topology
[Diagram labels: Internet; SP 1; SP 2; SP 3; Corporate WAN; West Coast Data Center; East Coast Data Center; West Coast remote offices; East Coast remote offices]
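As a sketch of how BGP attributes can steer path selection, as discussed under BGP Attributes above, the fragment below sets local preference on routes received from a provider and a MED on routes advertised to it. AS 100 and the neighbor 172.16.11.1 in AS 1 follow the testbed addressing used later in this chapter. The route-map names and attribute values are hypothetical, and this policy is not part of the tested configuration.

```
! Prefer routes learned from this provider for outbound traffic
route-map PREFER_SP1 permit 10
 set local-preference 200
!
! Advertise a MED so the neighboring AS can weigh this entry point
route-map SET_MED permit 10
 set metric 50
!
router bgp 100
 neighbor 172.16.11.1 remote-as 1
 neighbor 172.16.11.1 route-map PREFER_SP1 in
 neighbor 172.16.11.1 route-map SET_MED out
```

Local preference influences which exit the local autonomous system uses, while the MED is only a hint to the neighboring autonomous system about which entry point to prefer.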
Design Caveats
In certain multi-site deployments, device placement becomes a caveat to the overall design. In particular, the placement of the firewall and how it is introduced into the architecture from a routing standpoint are of major concern. There are two main caveats to be concerned with when designing your network:
• Inability to terminate the IGP on the firewall device
• Lack of upstream route health or interface uptime information
In a design where the PIX firewall is placed at the edge of the network, between the Internet border routers and the Internet data center core switches, the PIX can become a black hole for the end users who are geographically adjacent to that data center. The most common PIX deployment configures the device with static routes upstream to the Internet border routers and a static route downstream to the internal Layer 3 switching platform. Because static routing is the configuration of choice, you can assume that the firewall cannot participate in the IGP. If the external routes from the Internet border routers disappear from the routing table, the internal routing process has no indication that this is no longer a valid route to the Internet. Because the PIX does not participate in an IGP, the firewall has no knowledge of the routing updates that take place above the firewall layer, so it still accepts packets destined for the Internet. This is usually the case because the Layer 3 switching layer below the PIX propagates or redistributes a static route of 0.0.0.0 into the IGP downstream.
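The black-hole condition described above can be sketched as follows. All addresses and the OSPF process number are hypothetical, and default-information originate is used here as one common way of propagating a static default into OSPF; the tested topology may have used a different mechanism.

```
! PIX firewall: static routes only, no IGP participation
route outside 0.0.0.0 0.0.0.0 10.1.1.254 1
route inside 10.2.0.0 255.255.0.0 10.1.2.1 1
!
! Layer 3 switch below the PIX: static default toward the PIX inside
! interface, originated into OSPF for the internal network
ip route 0.0.0.0 0.0.0.0 10.1.2.100
router ospf 1
 network 10.2.0.0 0.0.255.255 area 0
 default-information originate
```

Because the static default on the Layer 3 switch depends only on the local link toward the PIX, it stays in the routing table and continues to be advertised internally even when the border routers above the firewall have lost their ISP routes.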
Work Arounds
The aforementioned problem is common when deploying distributed data centers and has the following three work arounds:
• The first work around uses BGP to inform the Layer 3 switching platform of the route change by tunneling the I-BGP traffic through the firewall to the peer on the inside interface, or to the Layer 3 switching platform that houses the IGP routing process. This design is documented in the Data Center Networking: Internet Edge Design Architectures SRND.
• With a future release of HSRP, you could use HSRP tracking to track the HSRP interface of the Internet border routers. This assumes that the border routers also implement a tracking instance of the upstream ISP interfaces. This has not been tested or documented.
• Finally, you could use the Firewall Service Module (FWSM) in the edge Layer 3 switching platform. This deployment allows you to process OSPF routes internal to the IGP by having the firewall device participate in the OSPF process. This deployment has been tested and is documented in this chapter.
Multi-Site Multi-Homing Design Recommendations
As mentioned above, multi-site Internet edge topologies differ from single-site topologies in various ways, and their scale may also differ. These topologies are increasingly important to enterprise business functions, so their scalability cannot be overlooked. Complete redundancy is also imperative in this type of architecture. The functional layers of the Internet edge topologies and how they interact with one another are detailed below.
When deploying a distributed data center environment, you must adhere to certain characteristics. For example, these topologies still use a similar ISP multi-homing relationship, but the attributes are slightly different. Also, because this architecture is distributed, it becomes a network that has multiple Internet gateways in different data centers. This network is usually partitioned in such a way that locally adjacent users traverse their respective local data centers. This design recommendation assumes that you have configured the internal IGP to route locally adjacent end users through their respective data centers while still offering redundancy to the other data center in the event of failure.
This distributed data center design is deployed using OSPF not-so-stubby areas (NSSAs). This allows you to define multiple Internet data center topologies without changing the integrity of the core infrastructure. The geographically dispersed areas and autonomous systems are represented below in Figure 6-4.
Figure 6-4
Internet Edge AS/Area Topology
[Diagram labels: West Coast; East Coast; Internet ISP Cloud ISP AS 1/2; BGP ISP AS1; BGP ISP AS2; BGP AS 100; OSPF NSSA 251; OSPF NSSA 252; OSPF Area 0; Corporate LAN Connectivity]
Border Router Layer
Border routers, typically deployed in pairs, are the edge-facing devices of the network. The number of border routers deployed is a provisioning decision based on memory requirements and physical circuit termination. The border router layer is where you provision ISP termination and initial security parameters. It serves as the gateway of the network and uses an externally facing Layer 3 routing protocol, such as BGP, integrated with an internally facing OSPF process to intelligently route traffic throughout the external and internal networks, respectively. This layer starts the OSPF process internally into the network. The Internet edge in an enterprise environment may provide Internet connectivity to an ISP through the use of multi-homed Internet border routers. This layer also injects the gateway of last resort route into the IGP through specific BGP parameters defined below.
Internet Data Center Core Switching Layer
The Layer 3 switching layer is the layer in the multi-site Internet edge topology that serves as the gateway to the core of the network. It is also a functional layer of the Internet server farm design. This layer may act as either a core layer or an aggregation layer in some design topologies. Its primary function, from the Internet edge design standpoint, is to advertise the IGP routing protocol internally to the infrastructure. The OSPF process for each data center interfaces with Area 0 at this layer, as shown in Figure 6-4.
Firewall Layer
The firewall layer is a security layer that provides stateful packet inspection for traffic entering the network infrastructure and reaching the services and applications offered in the server farm and database layers. In this topology, the firewall layer is represented by the FWSM in the Catalyst 6500 series switching platform. This layer also acts as the network address translation (NAT) device in most design topologies. NAT at the Internet edge is common because of the ever-depleting IPv4 address pool available to ISPs. Many ISPs therefore assign only a limited address range, which in turn requires NAT pools at the egress point of the topology.
Data Center Core Switching Layer
The data center core layer in this topology is the transport layer between data centers. This assumes that the layers are represented in the same Area 0 in the OSPF routing process. This layer is also the termination point for the geographically adjacent WAN routers or the geographically adjacent LANs in the architecture. This layer allows you to control the administrative distances or actual costs associated with the gigabit links to the upstream edge Layer 3 switches. Figure 6-5 displays how the network topology is partitioned into two different geographic areas.
Figure 6-5
Physical Layer Topology
[Diagram labels: Internet ISP Cloud ISP AS 1/2; East Coast internet edge, East Coast edge, East Coast core; West Coast internet edge, West Coast edge, West Coast core; FWSM Outside, FWSM Inside; Corporate LAN Connectivity; Networks 1.x.x.x,...5.x.x.x; Networks 6.x.x.x,...12.x.x.x; Networks 172.16.10.x, 172.16.11.x, 172.16.250.x, 172.16.251.x, 172.16.252.x, 172.16.253.x, 172.16.254.x; host addresses .1, .2, .100, .254]
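The cost control mentioned for the data center core layer can be sketched with per-interface OSPF costs. The interface names and cost values below are assumptions chosen for illustration; they are not taken from the tested core switch configuration.

```
! Core switch: uplink to the local edge Layer 3 switch (preferred path)
interface GigabitEthernet1/1
 ip ospf cost 10
!
! Core switch: link toward the remote data center (backup path)
interface GigabitEthernet1/2
 ip ospf cost 100
```

With a lower cost on the local uplink, locally adjacent users would normally exit through their own data center, and the higher-cost path through the other data center would be used only when the local path is withdrawn.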
Implementation Details
Below are the implementation details associated with defining and configuring the multi-site Internet edge topology, including the specific configurations at each layer that allow for the route control and failover of the topology described above.
Multi-Site Multi-Homing Topology
In this section, the router configurations were taken from each of the East Coast routers depicted in Figure 6-5. These configurations were defined solely for this testbed and are not representative of the normal ISP recommended configurations.
Internet Cloud Router Configurations
The Internet cloud routers were configured with loopback interfaces for testing purposes. These interfaces allow ping traffic to traverse the internal network outbound to the Internet backbone. Each configuration below was defined with its respective network segments, which also made it easier to determine the routes locally adjacent to each of the Internet gateway routers.
Internet Cloud Router ISP AS1
hostname InternetCloud1
interface Loopback0
 ip address 2.0.0.1 255.255.255.0 secondary
 ip address 3.0.0.1 255.255.255.0 secondary
 ip address 4.0.0.1 255.255.255.0 secondary
 ip address 5.0.0.1 255.255.255.0 secondary
 ip address 1.0.0.1 255.255.255.0
In the BGP process below, only specific networks were defined for advertisement. Because this solution is not performance focused, the decision was made to propagate only those routes. This allows BGP redistribution to the lower layers so that you can verify that the internal OSPF redistribution is working correctly.
router bgp 1
 network 1.0.0.0
 network 2.0.0.0
 network 3.0.0.0
 network 4.0.0.0
 network 5.0.0.0
 network 20.10.5.0
 network 172.16.11.0
 redistribute connected
 neighbor 20.10.5.254 remote-as 2
 neighbor 172.16.11.254 remote-as 100
Internet Cloud Router ISP AS2
The configuration of the second Internet cloud router is the same as the first except that different IP addresses were used.
hostname InternetCloud2
interface Loopback0
 ip address 7.0.0.1 255.255.255.0 secondary
 ip address 8.0.0.1 255.255.255.0 secondary
 ip address 9.0.0.1 255.255.255.0 secondary
 ip address 11.0.0.1 255.255.255.0 secondary
 ip address 12.0.0.1 255.255.255.0 secondary
 ip address 6.0.0.1 255.255.255.0
 no ip directed-broadcast
!
router bgp 2
 network 6.0.0.0
 network 7.0.0.0
 network 8.0.0.0
 network 9.0.0.0
 network 11.0.0.0
 network 12.0.0.0
 network 20.10.5.0
 network 172.16.10.0
 redistribute connected
 neighbor 20.10.5.1 remote-as 1
 neighbor 172.16.10.254 remote-as 100
Internet Edge Configurations
The first layer in the topology is the Internet border router layer. At this layer, the BGP peering relationship with the ISP routers takes place, and the first instance of the OSPF process begins. The BGP process propagates a default route into the OSPF routing instance. Below are the Internet edge router configurations.
East Coast Internet Edge Configurations
EdgeRouter1#wr t
Building configuration...
!
hostname EdgeRouter1
!
The interface configuration below represents the downstream link to the outside interface, or segment, of the FWSM:
Note
The OSPF hello and dead-interval timers must be the same across all links and interfaces:
!
interface FastEthernet0/0
 ip address 172.16.253.254 255.255.255.0
 no ip route-cache
 ip ospf hello-interval 1
 ip ospf dead-interval 3
 no ip mroute-cache
 duplex full
!
The following configuration is associated with the upstream link to the ISP clouds:
interface FastEthernet3/0
 ip address 172.16.11.254 255.255.255.0
 no ip redirects
 no ip route-cache
 no ip mroute-cache
 duplex half
The following OSPF and BGP edge configurations allow the edge router to redistribute BGP routes into the internal network. The redistribute bgp command within the OSPF process causes this redistribution. This assumes that the router can propagate those routes internally to the other network segments. Injecting full BGP routes into an IGP is not recommended because doing so adds excessive routing overhead to any IGP. Interior routing protocols were never designed to handle more than the networks inside your autonomous system, plus some exterior routes from other IGPs. This does not mean that BGP routes should never be injected into IGPs. Depending on the number of BGP routes and how critical it is for them to be in the IGP, injecting partial BGP routes into the IGP may well be appropriate. Below are the OSPF and BGP configurations, respectively:
Note
Router OSPF is defined as a not-so-stubby area (NSSA). This is needed to redistribute the external routes from the upstream routing instance. For the sake of the testbed topology, and to verify that routes were updated properly, specific BGP routes were redistributed into the architecture:
router ospf 500
 log-adjacency-changes
 area 251 nssa
 redistribute bgp 100
 network 172.16.251.0 0.0.0.255 area 251
 network 172.16.253.0 0.0.0.255 area 251
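The configuration above redistributes every BGP route that the edge router holds. Where only partial BGP routes should enter the IGP, the redistribution can be filtered. The fragment below is a sketch under stated assumptions: the prefix-list and route-map names are hypothetical, the two prefixes are simply examples drawn from the testbed ISP networks, and this filter was not part of the tested configuration.

```
! Permit only the BGP prefixes that must be visible in the IGP
ip prefix-list PARTIAL_BGP seq 5 permit 1.0.0.0/8
ip prefix-list PARTIAL_BGP seq 10 permit 6.0.0.0/8
!
route-map BGP_TO_OSPF permit 10
 match ip address prefix-list PARTIAL_BGP
!
router ospf 500
 area 251 nssa
 redistribute bgp 100 subnets route-map BGP_TO_OSPF
```

Any BGP route that does not match the prefix list falls through to the implicit deny at the end of the route map and is not injected into OSPF.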
Note
In typical Internet edge deployments, the edge routing instance does not redistribute the BGP process into the OSPF process. Instead, it uses the default-information originate command to originate a default route at the edge routing instance. That default route is then advertised via the OSPF process to the internal network only if the edge routing instance has a default route itself:
router ospf 500
 log-adjacency-changes
 area 251 nssa
 network 172.16.251.0 0.0.0.255 area 251
 network 172.16.253.0 0.0.0.255 area 251
 default-information originate route-map SEND_DEFAULT_IF
The ACLs below state that the router sends the default route into the internal network only if its routing table holds a default route (0.0.0.0) learned from the next-hop ISP router. This configuration must be deployed on both edge routing devices.
access-list 1 permit 0.0.0.0
access-list 2 permit 172.16.11.1
route-map SEND_DEFAULT_IF permit 10
 match ip address 1
 match ip next-hop 2
Note
The route map SEND_DEFAULT_IF is associated with the default-information originate command. This route map matches on the condition that the 0/0 default (access-list 1) has a next hop of 172.16.11.1 (access-list 2). This satisfies the condition that the 0/0 is learned via EBGP rather than I-BGP.
Below is the BGP routing instance that defines the upstream BGP neighbor that is necessary for the above route-map to work.
router bgp 100
 bgp log-neighbor-changes
 network 172.16.11.0
 redistribute connected
 neighbor 172.16.11.1 remote-as 1
Note
Setting the route propagation via OSPF on the FWSM requires defining route-maps that only allow specific traffic to the edge layer. Therefore, the only internal route propagated is 172.16.251.x. This can be controlled by supernetting the segment to allow only specific addresses.
Below are the routes available to the East Coast edge routers:
EdgeRouter1#sho ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       * - candidate default, U - per-user static route, o - ODR
       P - periodic downloaded static route

Gateway of last resort is not set

B    1.0.0.0/8 [20/0] via 172.16.11.1, 00:09:13
B    2.0.0.0/8 [20/0] via 172.16.11.1, 00:09:13
B    3.0.0.0/8 [20/0] via 172.16.11.1, 00:09:13
B    4.0.0.0/8 [20/0] via 172.16.11.1, 00:09:13
B    20.0.0.0/8 [20/0] via 172.16.11.1, 00:09:13
B    5.0.0.0/8 [20/0] via 172.16.11.1, 00:09:13
B    6.0.0.0/8 [20/0] via 172.16.11.1, 00:08:13
     172.16.0.0/24 is subnetted, 6 subnets
C       172.16.253.0 is directly connected, FastEthernet0/0
O IA    172.16.251.0 [110/11] via 172.16.253.1, 00:44:14, FastEthernet0/0
C       172.16.11.0 is directly connected, FastEthernet3/0
B    7.0.0.0/8 [20/0] via 172.16.11.1, 00:08:14
B    8.0.0.0/8 [20/0] via 172.16.11.1, 00:08:14
B    9.0.0.0/8 [20/0] via 172.16.11.1, 00:08:14
B    11.0.0.0/8 [20/0] via 172.16.11.1, 00:08:14
B    12.0.0.0/8 [20/0] via 172.16.11.1, 00:08:14
Edge Switching Layer Configurations
The edge switching layer houses the FWSM as well as the NSSA instance of the OSPF process. The upstream Internet edge routing layer binds its OSPF instance to the outside interface of the FWSM. The inside interface of the FWSM is an OSPF neighbor to the process running locally on the MSFC in the switch itself. These switches run in the same NSSA while running two different OSPF processes. This allows you to tune which routes can be propagated to the upstream network, and it ensures that restricted routes are not externally advertised. With areas defined this way, you can tune the routes appropriately by defining route maps to allow a specific network segment outbound. For instance, if you only wanted to advertise the VIP addresses of the server farm outbound, you could create a route-map that only allows those specific addresses outbound.
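A route-map that advertises only a server farm VIP outbound, as suggested above, might look like the following sketch. It mirrors the ACL 500 and route-map 500 pattern that appears in the FWSM configuration later in this chapter. The number 501 and the choice of 172.16.250.10 as the VIP address are assumptions made for illustration, and the exact syntax has not been verified against a specific FWSM release.

```
! Match only the hypothetical VIP host address
access-list 501 permit 172.16.250.10 255.255.255.255
!
route-map 501 permit 10
 match ip address 501
!
! Outside-facing OSPF process: redistribute only the matched address
router ospf 100
 redistribute ospf 500 subnets route-map 501
```

Routes from the inside OSPF process that do not match the access list would not be redistributed toward the edge routing layer.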
East Coast Edge Switching Layer Configurations
EASTEDGE1#wr t
Building configuration...
hostname EASTEDGE1
!
Below is the configuration that associates specific VLANs with the FWSM. VLAN 200 is the internal VLAN and VLAN 300 is the external VLAN:
firewall module 2 vlan-group 1
firewall vlan-group 1 200,300
!
vlan dot1q tag native
Gigabit 1/1 is the downstream link to the OSPF core Layer 3 switching layer. Note that it has been configured to be in VLAN 200:
!
!
interface GigabitEthernet1/1
 no ip address
 switchport
 switchport access vlan 200
FastEthernet 3/1 is the upstream link to the edge routing layer. Note that it has been configured to be in VLAN 300:
!
interface FastEthernet3/1
 no ip address
 duplex full
 speed 100
 switchport
 switchport access vlan 300
!
Note
The OSPF timer configuration on interface VLAN 200 is the same as on all other OSPF interfaces:
!
interface Vlan200
 ip address 172.16.251.2 255.255.255.0
 ip ospf hello-interval 1
 ip ospf dead-interval 3
Note
The OSPF routing process below is the internal OSPF neighbor to the core switching layer:
!
router ospf 500
 log-adjacency-changes
 area 251 nssa
 network 172.16.251.0 0.0.0.255 area 251
!
ip classless
no ip http server
!
!
arp 127.0.0.12 0000.2100.0000 ARPA
!
!
line con 0
line vty 0 4
 login
 transport input lat pad mop telnet rlogin udptn nasi
!
end
Below are the configurations associated with the FWSM. Notice the OSPF configuration in the FWSM itself and how it binds itself to the OSPF process.
EDGE1# sess slot 2 proc 1
The default escape character is Ctrl-^, then x.
You can also type 'exit' at the remote prompt to end the session
Trying 127.0.0.21 ... Open

FWSM passwd:
Welcome to the FWSM firewall
Type help or '?' for a list of available commands.
EASTFWSM> en
Password:
EASTFWSM# wr t
Building configuration...
: Saved
:
FWSM Version 1.1(1)
no gdb enable
nameif vlan200 inside security100
nameif vlan300 outside security0
enable password 8Ry2YjIyt7RRXU24 encrypted
passwd 2KFQnbNIdI.2KYOU encrypted
hostname EASTFWSM
fixup protocol ftp 21
fixup protocol h323 H225 1720
fixup protocol h323 ras 1718-1719
fixup protocol ils 389
fixup protocol rsh 514
fixup protocol smtp 25
fixup protocol sqlnet 1521
fixup protocol sip 5060
fixup protocol skinny 2000
fixup protocol http 80
names
access-list outside permit tcp any any
access-list outside permit udp any any
access-list outside permit icmp host 6.0.0.1 any echo-reply
access-list inside permit tcp any any
access-list inside permit udp any any
access-list inside permit icmp host 172.16.250.10 any echo
ACL 500 defines which addresses must be matched for OSPF route advertisement:
access-list 500 permit 172.16.251.0 255.255.255.0
pager lines 24
icmp permit any inside
icmp permit any outside
mtu inside 1500
mtu outside 1500
ip address inside 172.16.251.100 255.255.255.0
ip address outside 172.16.253.1 255.255.255.0
no failover
failover lan unit secondary
failover timeout 0:00:00
failover poll 15
failover ip address inside 0.0.0.0
failover ip address outside 0.0.0.0
pdm history enable
arp timeout 14400
static (inside,outside) 172.16.250.10 172.16.250.10 netmask 255.255.255.255 0 0
access-group inside in interface inside
access-group outside in interface outside
The OSPF interface timer configurations below are common across the architecture:
interface inside
 ospf hello-interval 1
 ospf dead-interval 3
!
!
interface outside
 ospf hello-interval 1
 ospf dead-interval 3
!
The route-map below states that only routes matching the addresses in access-list 500 are permitted to be advertised. This route-map is then bound to the OSPF process that redistributes the routes, as seen below in router ospf 100:
route-map 500 permit 10
 match ip address 500
!
The OSPF configurations below are representative of the recommended security configuration. Within the configuration, two different OSPF routing processes were defined to control inbound and outbound route propagation:
router ospf 500
 network 172.16.251.0 255.255.255.0 area 251
 area 251 nssa
 log-adj-changes
 redistribute ospf 100
router ospf 100
 network 172.16.253.0 255.255.255.0 area 251
 area 251 nssa
 log-adj-changes
 redistribute ospf 500 subnets route-map 500
!
timeout xlate 3:00:00
timeout conn 1:00:00 half-closed 0:10:00 udp 0:02:00 rpc 0:10:00 h323 0:05:00 sip 0:30:00 sip_media 0:02:00
timeout uauth 0:05:00 absolute
aaa-server TACACS+ protocol tacacs+
aaa-server RADIUS protocol radius
aaa-server LOCAL protocol local
no snmp-server location
no snmp-server contact
snmp-server community public
no snmp-server enable traps
floodguard enable
no sysopt route dnat
telnet timeout 5
ssh timeout 5
terminal width 80
Cryptochecksum:03e78100e37fef97b96c15d54be90956
: end
[OK]
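The effect of binding route-map 500 to the redistribution in the FWSM configuration above can be illustrated with a small Python model. This is a simplified sketch, not the FWSM implementation: it assumes a route is permitted when its prefix falls inside an ACL 500 entry and that anything else is dropped by the implicit deny at the end of the route-map. The function names and the candidate route list are illustrative.

```python
import ipaddress

# ACL 500 from the FWSM configuration (address and netmask).
ACL_500 = [ipaddress.ip_network("172.16.251.0/255.255.255.0")]


def route_map_500_permits(prefix: str) -> bool:
    """True if the prefix falls inside any ACL 500 entry (simplified match)."""
    network = ipaddress.ip_network(prefix)
    return any(network.subnet_of(entry) for entry in ACL_500)


def redistribute(candidates):
    """Keep only the routes that route-map 500 would allow to be advertised."""
    return [prefix for prefix in candidates if route_map_500_permits(prefix)]


# Illustrative set of inside routes known to OSPF process 500.
inside_routes = [
    "172.16.250.0/24",
    "172.16.251.0/24",
    "172.16.252.0/24",
    "172.16.254.0/24",
]

print(redistribute(inside_routes))
```

In this model only 172.16.251.0/24 is passed to the outside process, which reflects the intent of limiting what the outside OSPF neighbors learn about the internal network.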
Displaying the routes available to the FWSM confirms that the proper outside and inside routes were propagated:
EASTFWSM# sho route
C    127.0.0.0 255.255.255.0 is directly connected, eobc
O N2 1.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:46:35, outside
O N2 2.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:46:35, outside
O N2 3.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:46:35, outside
O N2 4.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:46:35, outside
O N2 20.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:46:35, outside
O N2 5.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:46:35, outside
O N2 6.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:45:35, outside
     172.16.0.0 255.255.255.0 is subnetted, 5 subnets
O IA    172.16.252.0 [110/12] via 172.16.251.1, 1:16:14, inside
C       172.16.253.0 is directly connected, outside
O IA    172.16.254.0 [110/22] via 172.16.251.1, 1:10:14, inside
O IA    172.16.250.0 [110/11] via 172.16.251.1, 1:21:35, inside
C       172.16.251.0 is directly connected, inside
O N2 7.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:45:36, outside
O N2 8.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:45:36, outside
O N2 9.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:45:36, outside
O N2 11.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:45:36, outside
O N2 12.0.0.0 255.0.0.0 [110/1] via 172.16.253.254, 0:45:36, outside
EASTFWSM# exit
Logoff
[Connection to 127.0.0.21 closed by foreign host]
Below are the routes associated with the edge switching layers. Notice that the edge layer has two routes to the Internet backbone: a primary route via the OSPF process running locally on the switch and a redundant route running through Area 0 to the secondary switch. The same applies to each switch.
EASTEDGE1#sho ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       * - candidate default, U - per-user static route, o - ODR
       P - periodic downloaded static route
Gateway of last resort is not set
O N2 1.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O N2 2.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O N2 3.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O N2 4.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O N2 20.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O N2 5.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O N2 6.0.0.0/8 [110/1] via 172.16.251.100, 00:44:56, Vlan200
     172.16.0.0/24 is subnetted, 5 subnets
O IA    172.16.252.0 [110/3] via 172.16.251.1, 01:15:39, Vlan200
O N2    172.16.253.0 [110/11] via 172.16.251.100, 01:20:56, Vlan200
O IA    172.16.254.0 [110/13] via 172.16.251.1, 01:09:35, Vlan200
O       172.16.250.0 [110/2] via 172.16.251.1, 01:20:56, Vlan200
C       172.16.251.0 is directly connected, Vlan200
O N2 7.0.0.0/8 [110/1] via 172.16.251.100, 00:44:57, Vlan200
O N2 8.0.0.0/8 [110/1] via 172.16.251.100, 00:44:57, Vlan200
O N2 9.0.0.0/8 [110/1] via 172.16.251.100, 00:44:57, Vlan200
O N2 11.0.0.0/8 [110/1] via 172.16.251.100, 00:44:57, Vlan200
O N2 12.0.0.0/8 [110/1] via 172.16.251.100, 00:44:57, Vlan200
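When checking route tables such as the one above, it can help to extract the next hop for each prefix and confirm that external routes point at the FWSM inside address (172.16.251.100) while internal routes point at the core (172.16.251.1). The following Python sketch is one hypothetical way to do that from captured output. The regular expression, function name, and sample text are illustrative and cover only the line shapes shown here.

```python
import re

# Matches lines such as:
#   O N2 1.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
# Directly connected routes have no "[distance/metric] via" part and are skipped.
ROUTE_RE = re.compile(
    r"(?P<prefix>\d+\.\d+\.\d+\.\d+(?:/\d+)?)\s+"
    r"\[(?P<distance>\d+)/(?P<metric>\d+)\]\s+via\s+"
    r"(?P<next_hop>\d+\.\d+\.\d+\.\d+)"
)


def next_hops(output: str) -> dict:
    """Map each learned prefix in the captured output to its next hop."""
    return {m["prefix"]: m["next_hop"] for m in ROUTE_RE.finditer(output)}


SAMPLE = """
O N2 1.0.0.0/8 [110/1] via 172.16.251.100, 00:45:56, Vlan200
O IA    172.16.252.0 [110/3] via 172.16.251.1, 01:15:39, Vlan200
C       172.16.251.0 is directly connected, Vlan200
"""

for prefix, hop in next_hops(SAMPLE).items():
    print(prefix, "->", hop)
```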
Core Switching Layer Configurations
The core switching layers house the OSPF Area 0 process. This layer becomes the transport for Internet-destined traffic for each of the respective data centers in the event of a failure. This is also the layer where the configurations are controlled to ensure that traffic is sent to the right geographical area. This is accomplished using the ip ospf cost configurations on interfaces where the OSPF neighbor areas are present.
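The way interface cost steers each core toward its local edge can be sketched as follows. External routes of the same type and metric are compared on the internal cost to reach the advertising router, so the lower-cost interface wins. The Python below is a minimal model of that comparison. The path names and the cost of 1 assumed for the Area 0 link are illustrative; only the cost of 5 on the edge-facing interface comes from the configurations in this chapter.

```python
from typing import List, NamedTuple


class ExternalPath(NamedTuple):
    exit_name: str      # which edge the path leaves through
    type2_metric: int   # external (N2) metric carried in the advertisement
    forward_cost: int   # internal OSPF cost to reach the advertising router


def best_exit(paths: List[ExternalPath]) -> ExternalPath:
    """Prefer the lowest type 2 metric, then the lowest internal cost."""
    return min(paths, key=lambda p: (p.type2_metric, p.forward_cost))


# East Coast core's view of an Internet prefix.
paths = [
    # Local edge: cost 5 on the edge-facing interface.
    ExternalPath("east-edge", type2_metric=1, forward_cost=5),
    # Remote edge: assumed cost 1 across Area 0 plus cost 5 on the far side.
    ExternalPath("west-edge", type2_metric=1, forward_cost=1 + 5),
]

print("preferred:", best_exit(paths).exit_name)

# If the local edge fails, its path is withdrawn and the remote exit is used.
surviving = [p for p in paths if p.exit_name != "east-edge"]
print("after local failure:", best_exit(surviving).exit_name)
```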
East Coast Core Switching Layer Configurations
EASTCOASTCORE#wr t
Building configuration...
!
hostname EASTCOASTCORE
!
!
interface Port-channel1
 no ip address
 switchport
 switchport trunk encapsulation dot1q
The interface configurations below state that Gigabit 1/1 is the OSPF interface that neighbors to the East Coast edge layer. This is where you can tune the OSPF cost so that any users locally adjacent to the East Coast core choose this upstream link for Internet traffic. The same configuration is tuned on the West Coast to ensure that West Coast traffic traverses the West Coast routes:
!
interface GigabitEthernet1/1
 ip address 172.16.251.1 255.255.255.0
 ip ospf hello-interval 1
 ip ospf dead-interval 3
 ip ospf cost 5
!
interface GigabitEthernet1/2
 no ip address
 switchport
 switchport trunk encapsulation dot1q
 channel-group 1 mode on
!
interface GigabitEthernet2/1
 no ip address
 switchport
 switchport trunk encapsulation dot1q
 channel-group 1 mode on
!
interface Vlan1
 no ip address
 shutdown
!
interface Vlan200
 ip address 172.16.250.1 255.255.255.0
 ip ospf hello-interval 1
 ip ospf dead-interval 3
!
router ospf 500
 log-adjacency-changes
 area 251 nssa
 network 172.16.250.0 0.0.0.255 area 0
 network 172.16.251.0 0.0.0.255 area 251
!
ip classless
no ip http server
!
!
!
line con 0
line vty 0 4
 login
 transport input lat pad mop telnet rlogin udptn nasi
!
end
The following displays the routes associated with the East Coast core. Note that the preferred path is through the respective edge switching layer.
EASTCOASTCORE#sho ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       * - candidate default, U - per-user static route, o - ODR
       P - periodic downloaded static route
Gateway of last resort is not set
O N2 1.0.0.0/8 [110/1] via 172.16.251.100, 00:06:46, GigabitEthernet1/1
O N2 2.0.0.0/8 [110/1] via 172.16.251.100, 00:06:46, GigabitEthernet1/1
O N2 3.0.0.0/8 [110/1] via 172.16.251.100, 00:06:46, GigabitEthernet1/1
O N2 4.0.0.0/8 [110/1] via 172.16.251.100, 00:06:46, GigabitEthernet1/1
O N2 20.0.0.0/8 [110/1] via 172.16.251.100, 00:06:46, GigabitEthernet1/1
O N2 5.0.0.0/8 [110/1] via 172.16.251.100, 00:06:46, GigabitEthernet1/1
O N2 6.0.0.0/8 [110/1] via 172.16.251.100, 00:05:53, GigabitEthernet1/1
     172.16.0.0/24 is subnetted, 5 subnets
O IA    172.16.252.0 [110/2] via 172.16.250.254, 00:36:37, Vlan200
O       172.16.253.0 [110/11] via 172.16.251.100, 00:41:53, GigabitEthernet1/1
O IA    172.16.254.0 [110/12] via 172.16.250.254, 00:30:32, Vlan200
C       172.16.250.0 is directly connected, Vlan200
C       172.16.251.0 is directly connected, GigabitEthernet1/1
O N2 7.0.0.0/8 [110/1] via 172.16.251.100, 00:05:54, GigabitEthernet1/1
O N2 8.0.0.0/8 [110/1] via 172.16.251.100, 00:05:54, GigabitEthernet1/1
C    127.0.0.0/8 is directly connected, EOBC0/0
O N2 9.0.0.0/8 [110/1] via 172.16.251.100, 00:05:54, GigabitEthernet1/1
O N2 11.0.0.0/8 [110/1] via 172.16.251.100, 00:05:54, GigabitEthernet1/1
O N2 12.0.0.0/8 [110/1] via 172.16.251.100, 00:05:54, GigabitEthernet1/1
West Coast Core Switching Layer Configurations
The West Coast core is configured in the same way and prefers the West Coast edge for its primary routes:
WESTCOASTCORE#wr t
Building configuration...
!
hostname WESTCOASTCORE
!
!
!
!
interface Port-channel1
 no ip address
 switchport
 switchport trunk encapsulation dot1q
!
interface GigabitEthernet1/1
 ip address 172.16.252.1 255.255.255.0
 ip ospf hello-interval 1
 ip ospf dead-interval 3
 ip ospf cost 5
!
interface GigabitEthernet1/2
 no ip address
 switchport
 switchport trunk encapsulation dot1q
 channel-group 1 mode on
!
interface GigabitEthernet2/1
 no ip address
 switchport
 switchport trunk encapsulation dot1q
 channel-group 1 mode on
!
!
interface Vlan1
 no ip address
 shutdown
!
interface Vlan200
 ip address 172.16.250.254 255.255.255.0
 ip ospf hello-interval 1
 ip ospf dead-interval 3
!
router ospf 500
 log-adjacency-changes
 area 252 nssa
 network 172.16.250.0 0.0.0.255 area 0
 network 172.16.252.0 0.0.0.255 area 252
!
The following displays the routes associated with the West Coast core. Note that the preferred path is through the West Coast edge switching layer.
WESTCOASTCORE#sho ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
       * - candidate default, U - per-user static route, o - ODR
       P - periodic downloaded static route
Gateway of last resort is not set
O N2 1.0.0.0/8 [110/1] via 172.16.252.100, 00:02:37, GigabitEthernet1/1
O N2 2.0.0.0/8 [110/1] via 172.16.252.100, 00:02:37, GigabitEthernet1/1
O N2 3.0.0.0/8 [110/1] via 172.16.252.100, 00:02:37, GigabitEthernet1/1
O N2 4.0.0.0/8 [110/1] via 172.16.252.100, 00:02:37, GigabitEthernet1/1
O N2 20.0.0.0/8 [110/1] via 172.16.252.100, 00:02:37, GigabitEthernet1/1
O N2 5.0.0.0/8 [110/1] via 172.16.252.100, 00:02:37, GigabitEthernet1/1
O N2 6.0.0.0/8 [110/1] via 172.16.252.100, 00:01:44, GigabitEthernet1/1
     172.16.0.0/24 is subnetted, 5 subnets
C       172.16.252.0 is directly connected, GigabitEthernet1/1
O IA    172.16.253.0 [110/12] via 172.16.250.1, 00:26:19, Vlan200
O       172.16.254.0 [110/11] via 172.16.252.100, 00:26:19, GigabitEthernet1/1
C       172.16.250.0 is directly connected, Vlan200
O IA    172.16.251.0 [110/2] via 172.16.250.1, 00:26:19, Vlan200
O N2 7.0.0.0/8 [110/1] via 172.16.252.100, 00:01:45, GigabitEthernet1/1
O N2 8.0.0.0/8 [110/1] via 172.16.252.100, 00:01:45, GigabitEthernet1/1
C    127.0.0.0/8 is directly connected, EOBC0/0
O N2 9.0.0.0/8 [110/1] via 172.16.252.100, 00:01:45, GigabitEthernet1/1
O N2 11.0.0.0/8 [110/1] via 172.16.252.100, 00:01:45, GigabitEthernet1/1
O N2 12.0.0.0/8 [110/1] via 172.16.252.100, 00:01:45, GigabitEthernet1/1
BGP Attribute Tuning
In Internet edge topologies, controlling the outbound routes is first and foremost. These routes determine how the world sees your network topology and, by default, how traffic returns to your site. Controlling the outbound routes of the topology allows you to manipulate the amount of traffic that comes in from one ISP or another. For example, if you want all traffic to leave your topology through one ISP link, yet all traffic destined to the topology to come inbound on another ISP
link, you must implement autonomous system prepending. This is most commonly deployed in instances where you do not want to leave a link idle. For more information regarding route control via BGP, refer to the Data Center Networking: Internet Edge Design Architectures SRND.
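The effect of autonomous system prepending can be illustrated with a short Python sketch. It models only the AS-path length comparison, which is one step in BGP best-path selection and is reached only when earlier criteria such as local preference are equal. The AS number 64500 comes from the range reserved for documentation, and the ISP names and function names are illustrative.

```python
from typing import Dict, List


def prepend(as_path: List[int], local_as: int, count: int) -> List[int]:
    """Return the AS path with the local AS repeated `count` extra times in front."""
    return [local_as] * count + list(as_path)


def preferred_entry(advertisements: Dict[str, List[int]]) -> str:
    """Pick the advertisement with the shortest AS path (all else being equal)."""
    return min(advertisements, key=lambda isp: len(advertisements[isp]))


LOCAL_AS = 64500  # documentation AS number, used here as a placeholder

advertisements = {
    "isp-a": [LOCAL_AS],                       # advertised normally
    "isp-b": prepend([LOCAL_AS], LOCAL_AS, 2), # made less attractive for inbound traffic
}

print(advertisements["isp-b"])
print("inbound traffic prefers:", preferred_entry(advertisements))
```

In this model, remote networks see a longer path through the second ISP and return traffic through the first, while outbound traffic can still be sent through the second ISP so that neither link sits idle.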
Security Considerations
Security is a necessity in all network architectures today, regardless of your Internet connectivity. Proper measures must be taken to ensure that the network architecture and the network devices are securely provisioned and managed. Internet edge security is discussed in the Data Center Networking: Internet Edge Design Architectures SRND. This section provides a brief summary from those guides of the security functions supported within Internet edge designs. These functions include:
• Element Security - The secure configuration and management of the devices that collectively define the Internet edge.
• Identity Services - The inspection of IP traffic across the Internet edge requires the ability to identify the communicating endpoints. Although this can be accomplished with explicit user/host session authentication mechanisms, IP identity across the Internet edge is usually based on header information carried within the IP packet itself. Therefore, IP addressing schemas, address translation mechanisms, and application definition (IP protocol/port identity) play key roles in identity services.
• IP Anti-Spoofing - This includes support for the requirements of RFC 2827, which requires enterprises to protect their assigned public IP address space, and RFC 1918, which allows the use of private IP address spaces within enterprise networks.
• Demilitarized Zones (DMZ) - A basic security policy for enterprise networks is that internal network hosts should not be directly accessible from hosts on the Internet (as opposed to replies from Internet hosts for internally initiated sessions, which are statefully permitted). For those hosts, such as web servers, mail servers, and VPN devices, which are required to be directly accessible from the Internet, it is necessary to establish quasi-trusted network areas between, or adjacent to both, the Internet and the internal enterprise network. Such DMZs allow internal hosts and Internet hosts to communicate with DMZ hosts, but the separate security policies between each area prevent direct communication originating from Internet hosts from reaching internal hosts.
• Basic Filtering and Application Definition - Derived from enterprise security policies, ACLs are implemented to explicitly permit or deny IP traffic that may traverse between the areas (Inside, Outside, DMZ, etc.) defined within the Internet edge.
• Stateful Inspection - Provides the ability to establish and monitor session states of traffic permitted to flow across the Internet edge, and to deny traffic that fails to match the expected state of an existing or allowed session.
• Intrusion Detection - The ability to promiscuously monitor network traffic across discrete points within the Internet edge, and to alarm and/or take action upon detecting suspect behavior that may threaten the enterprise network.
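The IP anti-spoofing function in the list above can be sketched as a pair of source-address checks at the Internet edge. The Python below is a simplified illustration, not a device configuration. The enterprise public prefix 192.0.2.0/24 is a documentation range used as a placeholder, and the function names are illustrative.

```python
import ipaddress

# Private address space defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Assumption: placeholder for the enterprise's assigned public address space.
ENTERPRISE_PUBLIC = [ipaddress.ip_network("192.0.2.0/24")]


def _in_any(address: str, networks) -> bool:
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


def permit_inbound(source: str) -> bool:
    """Traffic arriving from the Internet must not claim a private source
    address or a source inside the enterprise's own public space."""
    return not (_in_any(source, RFC1918) or _in_any(source, ENTERPRISE_PUBLIC))


def permit_outbound(source: str) -> bool:
    """Traffic leaving toward the Internet must carry a source address from
    the enterprise's assigned public space."""
    return _in_any(source, ENTERPRISE_PUBLIC)


for src in ("10.1.1.1", "192.0.2.10", "198.51.100.7"):
    print(src, "inbound permitted:", permit_inbound(src))
```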
Chapter 7
Deploying Site-to-Site Recovery and Multi-Site Load Distribution
You can achieve redundancy and high availability by deploying multiple data centers and distributing applications across those data centers. This design document focuses on the design and deployment of distributed data centers using the Global Site Selector (GSS) and the Content Switching Module (CSM).
Overview
The challenge of geographic load balancing is to ensure that transaction requests from clients are directed to the most appropriate server load balancing device at the geographically distant data center. Geographic load distribution requires control points for all transaction requests destined to any data center. The point of control for a geographic load-distribution function resides within DNS. Most clients must contact a DNS server to get an IP address to request service from a server. Because geographically replicated content and applications reside on servers with unique IP addresses, unique DNS responses can be provided to queries for the same URLs or applications based on a series of criteria. These criteria are based on the availability of applications at different data centers and on different metrics. The metrics include proximity, weighted round robin, preferred data centers, and load at the data center. They are dynamically calculated and updated at distributed sites. Based on these metrics and the availability of services, clients are directed to the best site.
Benefits
Redundancy, scalability, and high availability are the key benefits of multi-site load distribution. Site-to-site recovery enables businesses to provide redundancy in case of disasters at the primary data centers. Multi-site load distribution provides application high availability and scalability. It provides these benefits by making individual sites look like a single server and getting application availability and load information from these servers. This makes it possible to deploy multiple inexpensive devices rather than one large expensive system, providing for incremental scalability and higher availability.
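As an illustration of how DNS responses can vary with availability and a metric, the following Python sketch models weighted round robin across the sites that are currently available. It is a simplified model and not the GSS implementation. The VIP addresses are documentation addresses, and the weights, availability flags, and function name are illustrative assumptions.

```python
import itertools
from typing import List, Tuple

Site = Tuple[str, int, bool]  # (VIP returned in the DNS answer, weight, available)


def dns_answers(sites: List[Site], count: int) -> List[str]:
    """Return the next `count` answers, cycling through available sites
    in proportion to their weights."""
    # Each available site contributes `weight` slots to the rotation.
    slots = [vip for vip, weight, available in sites if available
             for _ in range(weight)]
    if not slots:
        raise LookupError("no site is currently available")
    return list(itertools.islice(itertools.cycle(slots), count))


SITES = [
    ("192.0.2.10", 2, True),     # larger site, weight 2
    ("198.51.100.10", 1, True),  # smaller site, weight 1
    ("203.0.113.10", 1, False),  # failed health check, never returned
]

for answer in dns_answers(SITES, 6):
    print(answer)
```

Because unavailable sites are filtered out before the rotation is built, a site that fails its health check stops receiving new clients without any change to the weights of the remaining sites.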
Hardware and Software Requirements
The table below lists the hardware and software required to support site-to-site recovery and multi-site load distribution. The GSS interoperates with the CSM and CSS. It also works with other server load balancing products, but some features, such as the least loaded connections and shared keepalive features, cannot be used with other server load balancers. Subsequent sections of this chapter describe the interoperability of the GSS and the CSM.
Product                          Release        Platforms
Global Site Selector (GSS)       1.0(0.24.3)2   GSS-4480
Content Switching Module (CSM)   3.1.1a         SLB complex for Cat6k platforms
Warning
The CSM version used for testing was 3.1.1a. If you are using RHI, you must use 3.1(2).
Design Details
Design Goals
The basic design goal is to direct clients to the appropriate data center based on the configured rules and the availability of the servers or services at the data center. The major design issues are:
• Redundancy
• High availability
• Scalability
• Security
• Other requirements as necessary
Redundancy
Within a data center, redundancy is achieved at different layers. It could be link redundancy or device redundancy. Redundancy provides a way of maintaining connectivity if there is a failure in the primary path. This is achieved by deploying devices that support stateful failover. There are times when the entire data center might go out of service due to an unforeseen reason. In such cases, clients can be directed to a redundant data center. In the event of a data center failover, it is difficult to provide stateful failover. Although there is some impact due to the failure of the primary data center, the impact is very small compared to not having a redundant design.
High Availability
High availability, from a global server load balancing point of view, is the ability to distribute the load among available data centers. If it is determined that a data center cannot handle new service requests for any reason, such as a hardware or software failure or an overloaded data center, the requests are directed to a data center that can handle them. It is the constant monitoring of application availability and the load at a data center that helps in achieving high availability. High availability is also prevalent at each layer of the network, including Layer 2 and Layer 3. For a more detailed description of how high availability is achieved within a data center, refer to the Data Center Networking: Infrastructure Architecture SRND (http://www.cisco.com/en/US/netsol/ns110/ns53/ns224/ns304/networking_solutions_design_guidances_list.html).
Scalability
Scalability is an inherent element of a distributed data center environment. Applications can be hosted at multiple data centers to distribute the load across them and make the applications scalable and highly available. The design should be able to support growth in both the number of sites and the number of DNS records without performance degradation and without overhauling the design. There are scaling limitations. The details are covered in the Implementation Details section of this chapter.
Security
Security is deployed as a service at the aggregation layer in a data center. Deploying request routing devices should not compromise security in the data center. For instance, the placement of authoritative DNS should ensure that security requirements are met because of the amount and type of overhead traffic needed to ensure application availability at each data center. Monitoring application availability might include determining both the health and load on the applications.
Other Requirements
Other design requirements include meeting client and application requirements. For a business-to-business client, the client has to stay with the same site, as long as it is available, for the length of the transaction period (site persistence). In the case of a client accessing a streaming application, the client should be directed to the topologically closest data center (proximity). Other client and application requirements include directing clients to the data center based on round trip time, IGP, and BGP metrics. Ideally, the design should be able to meet all these requirements.
Design Topologies
Cisco offers several products for multi-site load distribution solutions. There are different topologies based on the products. All the topologies adhere to Cisco's design recommendations and contain a layered approach. The layer that connects to dual service providers is known as the Internet edge. At each layer, redundant devices are deployed for high availability. The core layer provides connectivity to branch offices, remote users, and campus users. This chapter focuses on the GSS interoperating with the CSM. The specific topologies covered are site-to-site recovery and multi-site load distribution. The two topologies look similar except for some minor differences. Cisco recommends the GSS for both site-to-site recovery and multi-site load distribution. Although the GSS has its limitations, it provides most required features in a single product.
Note
The GSS does not support MX records or IGP and BGP metrics. MX records can be supported by forwarding requests to devices that handle MX records.
Site-to-Site Recovery
Site-to-site recovery provides a way of recovering data center applications and data in case of an unexpected outage. Sometimes, building redundancy into each layer of networking is not enough. This leads to building standby data centers. Standby data centers host similar applications and databases. You can replicate your data to the standby data center to minimize downtime in the event of an unexpected failure at the primary data center.
Figure 7-1 depicts the site-to-site recovery topology using the GSS as the request routing device. Typically, the request router is connected to the aggregate switch and the GSS is the authoritative DNS for the domains in the data center. In this example, the GSS is connected to the access switch instead. This is due to the link redundancy limitation. Connecting the request router to the access switch provides some level of redundancy. If the aggregate switch fails, the request router is still reachable. More details are provided in the Implementation Details section.
Typically, the primary data center connects to two ISPs through the Internet edge to achieve redundancy, and the primary and secondary data centers are connected by either WAN or metro optical links to replicate data for recovery in case of disasters at the primary data center. If disaster hits the primary data center, the end users or clients are directed to the secondary data center, where the same applications and data are available.
Figure 7-1 Site-to-Site Recovery
[Diagram labels: Internet, Internet edge, Internal network, Core, Aggregation, GSS, Primary, Standby]
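The recovery behavior described above, where clients are directed to the standby data center only when the primary is unavailable, can be sketched as an ordered selection. The Python below is a minimal illustration and not the GSS implementation. The site names, VIP addresses (documentation ranges), and the health-check callback are illustrative assumptions.

```python
from typing import Callable, Optional, Sequence, Tuple


def select_vip(ordered_sites: Sequence[Tuple[str, str]],
               is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the VIP of the first available site in preference order,
    or None if no site is available."""
    for name, vip in ordered_sites:
        if is_available(name):
            return vip
    return None


# Preference order: primary first, standby second.
SITES = [("primary", "192.0.2.10"), ("standby", "198.51.100.10")]

# Normal operation: the primary answers health probes.
print(select_vip(SITES, lambda name: True))

# Disaster at the primary: only the standby answers health probes.
print(select_vip(SITES, lambda name: name != "primary"))
```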
Multi-Site Load Distribution
Figure 7-2 depicts the topology for multi-site load distribution. The GSSs are overlaid on top of the existing data center infrastructures. More details about the data center topologies and the Internet edge can be found in the Data Center Networking: Internet Edge SRND.
Figure 7-2 Multi-site Load Distribution using GSS
[Diagram labels: Internet, Internet edge, Internal network, Core, Aggregation, GSS, Site 1, Site 2, Site 3]
There is no difference between the site-to-site recovery and multi-site load distribution topologies except that a GSS per site is not required to support multi-site load distribution. The GSS in Site 1 is the primary GSS and the GSS in Site 3 is the secondary GSS. There is no permanent session between the GSSs. However, after a configuration change on the primary GSS, the secondary GSS synchronizes with the primary GSS. Further, configuration changes can only be made on the primary as long as the primary is up and running. If the primary GSS goes down for some reason, configuration changes can be made on the secondary. As in site-to-site recovery, the GSSs are connected to the access switch because the GSS lacks good link redundancy mechanisms. GSSs can be configured with different DNS rules. Each DNS rule can select a different predictor algorithm. Based on the DNS rules and the predictors used for the DNS rules, the GSSs, both primary and secondary, respond to DNS queries from the end users or clients. The responses lead the client to the appropriate data center.
Implementation Details
Before getting into the details, this is an appropriate point to discuss the features common to both site-to-site recovery and multi-site load distribution. The first to consider is the GSS health probe. The GSS has to send health probes to all sites, data centers, or server load balancing devices to learn about application availability. The GSS also offers a shared keepalive. With the shared keepalive, the GSS sends out one request for all the VIPs in a specific data center and receives the status of all the VIPs in a single response. This is depicted in Figure 7-3.
Note
These health probes have to traverse the firewalls. The GSS configuration guide provides more information for deploying the GSS behind firewalls.
Moreover, in the case of multi-site load distribution, as the number of sites increases, each GSS has to send health probes to N data centers, N being the number of data centers. The amount of traffic is somewhat alleviated by the use of KAL-AP health probes.
Figure 7-3 Health Probes from GSS
[Diagram labels: GSS Application Health Check, Internet, Internet edge, Internal network, Core, Aggregation, GSS, Primary, Standby]
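The difference in probe traffic between per-VIP health probes and the shared keepalive can be worked through with simple arithmetic. The Python sketch below assumes one probe per VIP per interval without the shared keepalive and one request per data center per interval with it, as described above. The GSS count and VIP counts are illustrative values, not figures from this guide.

```python
from typing import List


def probes_per_interval(gss_count: int, vips_per_site: List[int],
                        shared_keepalive: bool) -> int:
    """Total probes sent by all GSSs during one probe interval."""
    if shared_keepalive:
        # One request per data center covers every VIP at that site.
        per_gss = len(vips_per_site)
    else:
        # One probe per VIP at every data center.
        per_gss = sum(vips_per_site)
    return gss_count * per_gss


# Illustrative deployment: 2 GSSs probing 3 data centers with 10 VIPs each.
GSS_COUNT = 2
VIPS_PER_SITE = [10, 10, 10]

print("per-VIP probes:  ", probes_per_interval(GSS_COUNT, VIPS_PER_SITE, False))
print("shared keepalive:", probes_per_interval(GSS_COUNT, VIPS_PER_SITE, True))
```

With the shared keepalive, the probe count grows with the number of data centers rather than with the number of VIPs, which is why it eases the load as sites and applications are added.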
Keep in mind that the key design goals, when deploying both site-to-site recovery and multi-site load distribution, are the following:
• Redundancy
• High availability
• Scalability
Redundancy
Typically, redundancy in a single site is provided by deploying a redundant device in active/standby mode. A request router, such as a DistributedDirector or a CSS, is typically connected to the aggregate switch; these devices can support both link and device redundancy. The GSS cannot be deployed in active/standby mode. Due to this limitation, the GSS is deployed at the access switch. To configure the IP address on the GSS interface, use the following commands:
gss1.ese-cdn.com#conf t
gss1.ese-cdn.com(config)#interface ethernet 0
gss1.ese-cdn.com(config-eth0)#ip address 172.25.99.100 255.255.255.0
Note
While configuring the interface IP addresses, the global site selector services should be stopped using the “gss stop” command in enable mode.
Now the default gateway on the GSS has to be configured. The default gateway points to the active HSRP address on the aggregate switches.
gss1.ese-cdn.com#conf t
gss1.ese-cdn.com(config)#ip default-gateway 172.25.99.1
Figure 7-4 Link and Device Redundancy
[Diagram labels: To Core Layer, Default Gateway (active HSRP), Trunk, Aggregation, Access, VLAN x, To Servers]
Figure 7-4 depicts the implementation details. The CSMs are deployed in active/standby configuration with the fault tolerant VLAN carried across the port channel between the two aggregate switches. The PIX firewalls are also deployed in active/standby configuration. With the aggregate switches and the access switches running spanning tree, one of the paths is blocked, as shown in Figure 7-4 with the red
VLAN. Typically, the aggregate switch on the left is configured as the root for the spanning tree and the aggregate switch on the right is the secondary root for the spanning tree. With this topology, the GSS is deployed at the access switch and is part of the red VLAN. A Layer 3 interface is created on both aggregate switches for the red VLAN and is configured as part of the HSRP group. The aggregate switch on the right-hand side is in standby mode. The default gateway on the GSS points to the active HSRP address. This topology minimizes the impact of an aggregate switch failure. However, if the access switch fails, even though the spanning tree converges to provide redundancy to the server path, the GSS is taken out of the picture.
Note
There might be more than one client VLAN on the CSM. It is a good idea to put the GSS on a different VLAN. Alternatively, the GSS can be connected directly to the aggregate switch, but in this case, if the link to the GSS fails, the GSS is out of the picture. With the GSS at the access layer, the GSS is protected from failures of the aggregate switch and failures of the links between the aggregate and the access switches.
High Availability
The secondary GSS deployed at the secondary data center also answers DNS queries. Typically, the upstream DNS round-robins between the primary and secondary GSSs. As long as the primary is active, it responds to DNS queries and directs the end users to the appropriate data center. If the primary GSS goes down for any reason, the secondary GSS continues to answer DNS queries.
Scalability
The GSS can scale up to 2000 authoritative domains, and up to 8 GSSs can work together in a network. If there are more than 2 GSSs in the network, one of them is primary, the second one is standby, and the remaining GSSs are configured as gss.
Basic Configuration
Before getting into implementation details, a few basic setup steps must be completed on the GSSs. These steps enable the GUI on the GSS. Only the essential steps are described in this section; more details are found in the configuration document. All the content routing information is stored in a SQL database. The database files must be created on the GSSMs before the GUI can be accessed. These are the initial steps to configure a GSS.
Step 1
Configure the initial settings on the GSS: the IP addresses for the interfaces, the default gateway, the hostname, and the name server. The name server has to be configured on the GSS for the GSS to work.
Step 2
Create the database with the gssm database create command to enable the graphical user interface on the GSS. This command is executed in enable mode. Note that the database is enabled only on the primary and the standby GSS.
Step 3
Configure the node type on the GSS. The node type must be chosen for every GSS in the network. The different node types are primary, standby, or gss.
Step 4
Enable the GSS with the gss enable gssm-primary command. Again, this is done from enable mode. To follow the activity on the GSS, use the show log follow and gss status commands.
Step 5
Follow steps 1-4 for the standby GSS. In step 4, instead of gssm-primary, use the gssm-standby command to enable the GSS and specify the IP address of the primary GSS.
Step 6
Open a browser window and type https:// as the URL to access the GUI.
Step 7
The default username is admin and the password is default. The next step is to configure the health probes, answers, answer groups, domain lists, and balance methods. This is explained in more detail in both the site-to-site recovery and multi-site load distribution sections.
The information below explains the relationship between the different configuration objects, such as DNS rules and domain lists. The DNS rules consist of the following objects. You can get to each object by clicking on DNS rules and then using the drop-down menu on the left-hand side.
• Source Address List: A list of addresses of local DNS servers. This represents the source that is requesting the IP address of the domain. For site-to-site recovery, this can be set to accept all IP addresses.
• Domain List: A list of domains, one of which matches the domain name requested.
• Answer Group: A group of resources from which the answers are provided.
• Balance Method: The global load balancing algorithm that is used to balance responses among the answer groups.
• Answers: Configure different VIPs here along with the type of keepalive method used.
• Shared Keepalives: Specifies the IP address on the load balancer to which the KAL-AP health probes are sent.
Both site-to-site recovery and multi-site load distribution use health probes. The types of health probes used are shared keepalive and ICMP. Shared keepalive is also called KAL-AP. There are two types of KAL-AP: KAL-AP by VIP and KAL-AP by tag, and shared keepalives can be set up using either one. KAL-AP by VIP uses the VIP, and KAL-AP by tag uses a domain string. For KAL-AP by tag, some additional configuration is required on the CSM, which is the load balancer. The tag specifies the sub-domain, and the length of the tag has to be less than 64 characters because the KAL-AP query limits the length of the tag to 64 characters. The idea behind using the domain as a tag is to probe the health of the VIP by domain name instead of by IP address (as KAL-AP by VIP does). This comes in handy if the addresses are being translated between the GSS and the load balancer.
Configuration on the CSM:
AggSwitch#conf t
Enter configuration commands, one per line. End with CNTL/Z.
AggSwitch(config)#mod csm 5
AggSwitch(config-module-csm)#vserver VIP2
AggSwitch(config-slb-vserver)#virtual 20.17.30.201 any
AggSwitch(config-slb-vserver)#vlan 100
AggSwitch(config-slb-vserver)#serverfarm REAL_SERVERS
AggSwitch(config-slb-vserver)#domain www.ese-cdn.com
AggSwitch(config-slb-vserver)#inservice
AggSwitch(config-slb-vserver)#
Note
When setting up shared keepalives for the CSM, the primary IP address used can be either the IP address of the client VLAN or the alias IP address of the client VLAN. If the Content Services Switch (CSS) is used instead of the CSM, use the circuit IP addresses on the CSS in the primary and secondary boxes of the shared keepalive configuration.
Site-to-Site Recovery
In a site-to-site recovery solution, the primary site is typically the active site, and all the end users are directed to the primary site as long as the applications are alive and well. The secondary site, or recovery site, also hosts the applications, but these are in standby mode. The data is replicated to the standby data center to be used in the event of unexpected downtime at the primary data center. The site-to-site recovery topology was introduced in Figure 7-1. In this section, the implementation details for a single site are provided. This design also applies to the secondary site. The only difference is the configuration of the GSS itself: one GSS is configured as the primary and the second GSS is configured as the standby.
Note
The GSS is deployed as the authoritative DNS for the sub-domains of critical applications. This implies that the IP addresses of the authoritative DNS have to be configured in the upstream DNS as name servers. Typically, there is more than one name server in the upstream DNS. During the DNS resolution process, the upstream DNS uses the round trip time as a measure to decide which name server to query. Site-to-site recovery is always based on active/standby configurations, and regardless of which GSS is queried, the result should be the same.
Site Selection Method
The site selection method or balance method, also known as the predictor, is the algorithm that is followed while answering DNS queries from the clients. For site-to-site recovery, where the data centers are in active/standby mode, a balance method called the ordered list is used. With the ordered list balance method, each resource within an answer group (for example, an SLB VIP or a name server) is assigned a number that corresponds to the rank of that answer within the group. Devices with lower numbers rank above those with higher numbers. Using the rankings, the GSS tries each resource in the prescribed order and selects the first available (“live”) answer to serve a user request. List members are given precedence and tried in order. A member is not used unless all previous members fail to provide a suitable result.
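The ordered list behavior described above can be sketched in a few lines of Python. This is an illustration of the selection logic only, not the GSS implementation; the Answer type, its field names, and the second VIP (20.18.30.201) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Answer:
    vip: str     # virtual IP returned in the A record
    order: int   # rank within the answer group; lower numbers are tried first
    alive: bool  # health state as last reported by a probe


def select_ordered(answers: Sequence[Answer]) -> Optional[Answer]:
    """Return the first live answer in rank order, or None if all are down."""
    for answer in sorted(answers, key=lambda a: a.order):
        if answer.alive:
            return answer
    return None


# Primary site VIP ranked 1, hypothetical standby site VIP ranked 2.
group = [
    Answer(vip="20.18.30.201", order=2, alive=True),
    Answer(vip="20.17.30.201", order=1, alive=True),
]
print(select_ordered(group).vip)  # primary VIP while it is alive
```

If the rank 1 answer is marked down, the same call returns the rank 2 answer, which is the failover behavior that site-to-site recovery relies on.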
Configuration
GSSs are configured using the GUI. It is difficult to show all the screens in this section to discuss the configurations. Refer to the site below for a detailed description of the configuration.
http://www.cisco.com/en/US/products/sw/conntsw/ps4038/products_configuration_guide_chapter09186a00800ca80d.html
Note
The TTL (time to live) values configured on the GSS determine how long the A records are cached. For site-to-site recovery, setting the TTL to a low value ensures that a new request is made after the TTL expires. Note also that the lowest health probe interval that can be set on the GSS is 45 seconds. The recommended TTL value on the GSS is between 5 and 45 seconds.
This section provides a configuration outline.
Step 1
Perform initial configurations as described above.
Step 2
In the web browser, use HTTPS to connect to the IP address of your primary GSS, and log in.
Step 3
Once you are on the GSS, the domain list, answer group, balance methods, answers, and shared keepalives have to be configured either by selecting each of the items individually or by using the wizard.
Step 4
Configure the VIPs for which health probes have to be sent by selecting the “answers” option in the drop-down menu. The type of health probe used is also configured here. The available options are:
a. No health probe
b. ICMP
c. KAL-AP by VIP and Tag
d. HTTP-HEAD
For more information on health probes, refer to the Basic Configuration section.
Step 5
Select the answer groups and configure the members of the answer group. The members of the answer group are the VIPs.
Step 6
Select the DNS rules from the drop-down menu and tie all the information together. Translated into plain English, a DNS rule reads: “For all the clients that belong to this source list and are looking for a sub-domain in the domain list, if the status is active, select one answer from this answer group based on this balance method.”
Step 7
• For site-to-site recovery, always select ordered list for the balance method. As long as the VIP is alive at the primary site, all clients are directed towards the primary site.
• The first active address is the primary site's VIP.
• The second active address is the secondary or standby data center's VIP.
Once the configuration is complete, click on Monitoring to view the health information of all the different VIPs. The following sections describe the configurations for different balance methods.
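To make the relationship between the DNS rule objects concrete, the following Python sketch models a rule as a source address list, a domain list, an answer group, and a balance method. It is a simplified model under stated assumptions, not GSS code: the class and field names are hypothetical, "status" is interpreted as the status of the rule, and the second VIP (20.18.30.201) and the client DNS address (192.0.2.53) are made up for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Answer:
    vip: str     # VIP returned in the A record
    alive: bool  # health state learned from probes


# A balance method picks one answer from the live members of a group.
BalanceMethod = Callable[[Sequence[Answer]], Optional[Answer]]


def ordered_list(live: Sequence[Answer]) -> Optional[Answer]:
    """Ordered list: the first live answer in configured order wins."""
    return live[0] if live else None


@dataclass
class DnsRule:
    source_networks: List[str]  # source address list, as CIDR strings
    domains: List[str]          # domain list
    answer_group: List[Answer]  # answers in configured order
    balance: BalanceMethod
    active: bool = True         # rule status (assumed meaning of "status")

    def resolve(self, source_ip: str, domain: str) -> Optional[str]:
        """Apply the rule to one query; None means the rule does not answer."""
        if not self.active:
            return None
        src = ipaddress.ip_address(source_ip)
        if not any(src in ipaddress.ip_network(n) for n in self.source_networks):
            return None
        wanted = domain.lower().rstrip(".")
        if wanted not in (d.lower().rstrip(".") for d in self.domains):
            return None
        live = [a for a in self.answer_group if a.alive]
        chosen = self.balance(live)
        return chosen.vip if chosen else None


rule = DnsRule(
    source_networks=["0.0.0.0/0"],  # accept all sources, as for site-to-site recovery
    domains=["www.ese-cdn.com"],
    answer_group=[Answer("20.17.30.201", alive=False), Answer("20.18.30.201", alive=True)],
    balance=ordered_list,
)
print(rule.resolve("192.0.2.53", "www.ese-cdn.com"))  # prints 20.18.30.201
```

Passing the balance method as a function keeps the rule itself unchanged when a different selection algorithm is chosen, which mirrors how the GUI ties independent objects together in one rule.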
Figure 7-5
GSS GUI
Figure 7-5 depicts the GSS screen. The drop down menu is on the left hand side of the screen. You can also see the rule wizard, monitoring, etc.
Multi-Site Load Distribution
Figure 7-2 depicts a multi-site load distribution deployment with a primary and a standby GSS. The advantage of using GSS for multi-site load distribution is that it provides an integrated feature set compared to deploying global site load balancing using the CSS. However, on the flip side, the health probes are sent across the data centers. Multi-site load distribution provides redundancy, high availability, and scalability. Deploying multi-site load distribution has similarities to site-to-site recovery deployments. Providing redundancy for GSSs in a multi-site load distribution is identical to deploying redundancy in site-to-site recovery.
Note
Providing redundancy in a single site is identical to what was described earlier. Refer to the Redundancy section for detailed information.
From a high availability and scalability perspective, multi-site load distribution offers more site selection methods and scales well due to the number of sites and the number of GSSs that can be deployed. When there are two or more sites to share the load, similar to a server farm, there are multiple predictor algorithms, balance methods, or site selection methods that can be used.
Site Selection Methods
The GSS, acting as an authoritative DNS, monitors the availability of different VIPs at different sites. Upon receiving a DNS request, the GSS responds with an A record of the active VIPs based on one of the following criteria.
• Round robin
• Weighted round robin
• Ordered list
• Least loaded
• Hashed
• Proximity
• Static
• Based on round trip time (Boomerang)
Least loaded, hashed, and proximity are covered in this section. Other balance methods are prevalent or have been discussed earlier in this chapter. Further, it was noted, in an earlier section, that the authoritative DNS is queried based on the round trip time. This is true in case of multi-site load distribution as well. The response for the query is really based on the different site selection methods. For example, if least loaded site selection method is configured on all the GSSs, regardless of which authoritative DNS is queried, the response is based on the load at different sites. If static proximity is used instead, the response is based on the IP address of the querying device.
Least Loaded
Clients are directed to a site with the least load. The definition of load is based on the load balancing device used at the data center.
• Calculate the maximum capacity for a given virtual server: max_capacity = For each inservice real add 10,000 (i.e., 1 inservice real = 10000, 2 inservice reals = 2 x 10000 = 20000, etc.)
• Calculate the factor: factor = ((max_capacity - CurrentConnectionCount) > 10
This returns a value in the range of 1 to 64K-1, with 65535 meaning MOST available. This weight has to be mapped to a range between 2-254 called capp_weight, with 2 being MOST available, as follows:
capp_weight = weight >> 8
if capp_weight is less than 2, assign capp_weight = 2
As an example, consider that there is one server in a server farm. This implies that the maximum number of connections is equal to 10000, i.e., max_capacity = 10000. If there are 5000 connections going through the switch, the factor is calculated as follows.
Factor = ((10000 - 5000) > 10 = 32767
capp_weight = 32767 >> 8; right shift 32767 by 8 bits, i.e., right shift 111111111111111 by 8 bits = 127 in decimal or 1111111 in binary
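The worked example can be reproduced with a short Python sketch. Because the factor expression as printed is incomplete, the sketch assumes a linear scaling of the free capacity onto the range 0 to 65535; this assumption is chosen only because it yields the 32767 and 127 of the example, and it may differ from the actual CSM calculation. The upper clamp to 254 is likewise an assumption, added to keep the result inside the stated 2-254 range.

```python
CAPACITY_PER_REAL = 10_000  # each inservice real contributes 10,000 connections


def capp_weight(inservice_reals: int, current_connections: int) -> int:
    """Map free capacity to a value in 2..254, following the worked example."""
    if inservice_reals < 1:
        raise ValueError("at least one inservice real is required")
    max_capacity = inservice_reals * CAPACITY_PER_REAL
    # Keep free capacity within 0..max_capacity even if the count overshoots.
    free = max(0, min(max_capacity, max_capacity - current_connections))
    # ASSUMPTION: linear scaling to 0..65535; the source expression is truncated.
    factor = free * 65535 // max_capacity
    # Right shift by 8 bits, then clamp (lower bound from the text,
    # upper bound assumed from the stated 2-254 range).
    return min(254, max(2, factor >> 8))


print(capp_weight(inservice_reals=1, current_connections=5000))  # prints 127
```

With one real and 5000 connections the intermediate factor is 32767 and the shifted result is 127, matching the figures in the example.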
This provides a measure of availability for different VIPs at different sites. When responding to the DNS query, the GSS looks at the available load information. The GSS responds with an A record with the IP address of the VIP, which has the lowest value. If there is more than one VIP with the same load, the GSS performs round robin between the VIPs. It is important to note that the maximum number of connections allowed when using a CSM is about a million. To limit the number of connections to a maximum per server based on the server capacity, use maximum connections command option on the CSM for each real server. More details are provided in the configuration section.
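The selection step described above (lowest reported value wins, round robin among equals) can be sketched as follows. This is a minimal illustration that takes the reported load values as given; the class name and the VIP addresses other than 20.17.30.201 are hypothetical, and the tie-breaking counter is a simplification of whatever the GSS does internally.

```python
from itertools import count
from typing import Dict


class LeastLoadedSelector:
    """Pick the VIP with the lowest reported load; rotate among ties."""

    def __init__(self) -> None:
        self._turn = count()  # shared counter used to rotate among tied VIPs

    def select(self, loads: Dict[str, int]) -> str:
        if not loads:
            raise ValueError("no VIPs with load information")
        lowest = min(loads.values())
        # Sort the tied VIPs so the rotation order is stable between calls.
        tied = sorted(vip for vip, load in loads.items() if load == lowest)
        return tied[next(self._turn) % len(tied)]


selector = LeastLoadedSelector()
loads = {"20.17.30.201": 40, "20.18.30.201": 40, "20.19.30.201": 127}
print([selector.select(loads) for _ in range(4)])
```

The two VIPs reporting 40 are returned alternately, and the VIP reporting 127 is not selected while a less loaded VIP is available.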
Hashed
With the source address and domain hash balance method, the IP address of the client's DNS proxy and the domain requested by the client are extracted and used to create a unique value, referred to as a hash value. The hash value is attached to, and used to identify, the VIP that is chosen to serve the DNS query. The use of hash values makes it possible to “stick” traffic from a particular requesting client to a specific VIP, ensuring that future requests from that client are routed to the same VIP. This feature lets the client stick to a specific site. If there are two or more VIPs for the specified domain and the sticky site goes out of service, the GSS picks an available VIP based on the hash value and sticks to it.
Note
The health probes are sent at regular intervals. If the sticky VIP goes down, the GSS learns the status of the sticky VIP on the next health probe. During this window, the GSS directs the clients to the sticky VIP.
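The stickiness idea can be illustrated with a small Python sketch. The hash function used by the GSS is not described here, so SHA-256 is an assumption chosen only to give a stable, well-distributed value; the function name and the second VIP are also hypothetical.

```python
import hashlib
from typing import Sequence


def pick_sticky_vip(dns_proxy_ip: str, domain: str, live_vips: Sequence[str]) -> str:
    """Map (DNS proxy address, requested domain) to one of the live VIPs."""
    if not live_vips:
        raise ValueError("no live VIPs to choose from")
    # ASSUMPTION: SHA-256 stands in for the undocumented GSS hash.
    key = f"{dns_proxy_ip}|{domain.lower()}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(live_vips)
    # Sorting makes the result independent of the order the VIPs are listed in.
    return sorted(live_vips)[index]


vips = ["20.17.30.201", "20.18.30.201"]
first = pick_sticky_vip("192.0.2.53", "www.ese-cdn.com", vips)
second = pick_sticky_vip("192.0.2.53", "www.ese-cdn.com", vips)
print(first == second)  # the same proxy and domain always map to the same VIP
```

When the sticky VIP is removed from live_vips after a failed probe, the same inputs hash onto the remaining VIPs, so the client is moved to another site and then sticks to it.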
Proximity
Clients matching a list of IP addresses in the source address list are directed to specific sites. This is called static proximity. The second category of proximity, called boomerang, directs clients to the site with the least round trip time between the requesting client (the client's DNS proxy) and the site.
Note
GSS does not provide site selection methods based on IGP and BGP metrics.
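Static proximity can be sketched as a lookup from source networks to a preferred list of VIPs. The sketch below is an illustration under assumptions, not the GSS implementation: the rule layout, the use of an ordered preference within each rule, the source networks (taken from documentation address ranges), and the second VIP are all hypothetical.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

# Each rule pairs a source address list with VIPs in preferred order.
Rule = Tuple[List[str], List[str]]


def static_proximity(source_ip: str, rules: List[Rule], alive: Dict[str, bool]) -> Optional[str]:
    """Return the first live VIP of the first rule whose source list matches."""
    src = ipaddress.ip_address(source_ip)
    for networks, ordered_vips in rules:
        if any(src in ipaddress.ip_network(net) for net in networks):
            for vip in ordered_vips:
                if alive.get(vip, False):
                    return vip
            return None  # the rule matched but every VIP in it is down
    return None  # no rule covers this source address


rules = [
    (["192.0.2.0/24"], ["20.17.30.201", "20.18.30.201"]),     # clients near site 1
    (["198.51.100.0/24"], ["20.18.30.201", "20.17.30.201"]),  # clients near site 2
]
alive = {"20.17.30.201": True, "20.18.30.201": True}
print(static_proximity("198.51.100.7", rules, alive))  # prints 20.18.30.201
```

Both rules use the same VIPs in a different order, so each group of clients has its own preferred site and still falls back to the other site when the preferred VIP is down.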
Configuration
You configure the GSS with a GUI. It is difficult to show all the screens in this section to discuss the configurations. Refer to the site below for a detailed description of the configuration.
http://www.cisco.com/en/US/products/sw/conntsw/ps4038/products_configuration_guide_chapter09186a00800ca80d.html
As mentioned in the Basic Configuration section, the various objects of DNS rules have to be configured and tied together. You can get to each object by clicking on DNS rules and then using the drop-down menu on the left-hand side. In this section, a configuration outline is provided for the different site selection or balance methods.
Step 1
Perform initial configurations as described above.
Step 2
In the web browser, use HTTPS to connect to the IP address of your primary GSS (gssm primary), and log in. Once you are on the GSS, the different parameters can be configured individually or by using the wizard.
Step 3
Configure the VIPs for which health probes have to be sent by selecting the “answers” option in the drop-down menu. The type of health probe used is also configured here. The available options are:
a. No health probe
b. ICMP
c. KAL-AP by VIP and Tag
d. HTTP-HEAD
Step 4
Select the answer groups and configure the members of the answer group. The members of the answer group are the VIPs.
Step 5
Select the domain list and configure the sub-domains.
Step 6
Optionally, the source list can also be configured if needed. If this option is not used, by default it applies to all requests.
Step 7
Select the DNS rules from the drop-down menu and tie all the information together. Translated into plain English, a DNS rule reads: “For all the clients that belong to this source list and are looking for a sub-domain in the domain list, if the status is active, select one answer from this answer group based on this balance method.”
Step 8
Once the configuration is complete, click on monitoring to view the health information of all the different VIPs. The following sections describe the configurations for different balance methods.
Least Loaded Configuration
The GSS relies on the load balancing device to learn the load at a specific site. The load information is obtained by using the UDP-based KAL-AP health probes. Since the load information is obtained from the load balancer, some configuration is required on the load balancers as well. This document uses the CSM to test the GSS. As a result, the configuration discussed here applies only to the CSM. Configuration on the CSS might not look the same.
Configuring the CSM:
Step 1
Configure CSM for SLB either in bridged or routed mode.
Step 2
Configure the maximum number of connections on the real servers based on how many connections they can safely handle.
Agg1#conf t
Enter configuration commands, one per line. End with CNTL/Z.
Agg1(config)#mod csm 5
Agg1(config-module-csm)#serverfarm REAL_SERVERS
Agg1(config-slb-sfarm)#real 20.30.30.100
Agg1(config-slb-real)#maxconns 1000
Step 3
Enable capp udp.
Note
The maximum number of connections option for the virtual servers on the CSM is currently unavailable.
Configuring the GSS:
Step 1
Follow the basic configuration steps for the initial configuration.
Step 2
Configure the shared keepalives and use KAL-AP by VIP. For the CSM, only the primary address can be used if an alias is set up for the client-side VLAN. (An alias IP is the virtual IP for an active/standby setup.) The CSM does respond to the KAL-AP requests to the alias IP.
Step 3
Configure the VIPs by selecting the Answers option in the drop down menu for the configured health probes.
Step 4
Select the answer groups and configure the members of the answer group. The members of the answer group are the VIPs.
Step 5
Select the domain list and configure the sub-domains.
Step 6
Optionally, the source list can also be configured if needed. If this option is not used, by default it applies to all requests from any source.
Step 7
Select the DNS rules from the drop-down menu and tie all the information together. Translated into plain English, a DNS rule reads: “For all the clients that belong to this source list and are looking for a sub-domain in the domain list, if the status is active, select one answer from this answer group based on this balance method.”
Step 8
Click on Rule Builder and enter all the relevant information (source address list, domain list and balance clause). For balance clause, choose least loaded balance method.
Note
The CSM can handle up to a million connections. If the maximum connections are not set up on the real servers, the CSM does not report a higher load. This might lead to the servers being overloaded. Refer to the least loaded calculation procedure described above.
Hashed Configuration
Hashing is used for site stickiness based on the source address and/or destination domain. Note that the source address used here is that of the local DNS or DNS proxy for the client. Although either one or both of the options can be used for hashing, Cisco recommends that both the source address and the destination domain be used for site stickiness. Based on the hash value, the client is handed an A record with the IP address of a specific VIP. As long as the VIP is alive and well, the client's requests to the same destination domain take the client to the same VIP every time. If the VIP goes down or is not reachable, the client is directed to a different VIP.
The configuration steps are similar to those described in the previous section, with a few changes when it comes to selecting the balance method. To choose the hashed balance method, select “Open Rule Builder” and enter all the information. Then choose Hashed and click on the by domain name and by source address boxes.
Proximity Configuration
Of the two proximity solutions supported by the GSS, only static proximity configurations are provided in this document. Static proximity involves identifying the source and directing the client to a specific VIP or a group of VIPs based on a balance method. It is therefore a combination of a source list and a balance method, and static proximity can be used with each of the balance methods.
The steps themselves are exactly the same as the ones described earlier, except that Step 6 is not optional for proximity. The source address list has to be configured to classify the clients into different groups. Different groups of clients can adopt different balance methods. For static proximity, using the ordered list is preferred. With the ordered list, requests from a specific client for a specific sub-domain can be directed towards a specific VIP as long as it is active. If the VIP goes down, clients can be directed to the second active VIP in the ordered list, and so on. More than one such rule can be configured with the same VIPs as members of the answer groups. The source address list and the order for the new rules can be different. Figure 7-6 depicts the use of static proximity. Refer to the last two rows of the screen: StaticProximityRuleA and StaticProximityRuleB use different source address lists but the same answer groups in different orders. Figure 7-6 also depicts the least loaded rule in the first row.
Figure 7-6
DNS Rules for Static Proximity
Summary
Unlike other global server load balancing devices, the GSS provides all the features in the same chassis. The GSS also decouples the server load balancer from the global load balancing devices and interoperates well with the CSM. It provides most of the required features and is easier to configure. The GSS supports application high availability, load distribution, and business resilience. Except for a few features, the GSS meets most of today's requirements.
GLOSSARY
B Boomerang Protocol
The Boomerang Control Protocol (BCP) allows the Content Router 4400 and boomerang agents to perform global server selection. This advanced selection technology reacts within seconds to network congestion, down links, malfunctioning, or overloaded servers. The boomerang system scales to very large numbers of nodes and is very simple to configure and monitor.
Boomerang Agent
The boomerang agent resides on the content switches and content engines and participates in the Boomerang Control Protocol to determine the best site during site selection.
C Content and Application Peering Protocol (CAPP)
Enables distributed CSSs to exchange global proximity information in real time. The use of CAPP ensures that all content requests draw on complete and accurate proximity information. A CAPP session runs between two Content Services Switches that perform global load balancing; the switches exchange information about their sites and use it to redirect clients to the appropriate site.
Content Routing
The ability of the network to take a client request and redirect it to an appropriate resource for servicing. There are a number of ways to accomplish this, some proprietary and some not. The three most common are DNS-based redirection, HTTP-based redirection, and Route Health Injection.
Content Rule
A hierarchical rule set containing individual rules that describe which content (for example, .html files) is accessible by visitors to the Web site, how the content is mirrored, on which server the content resides, and how the CSS processes the request for content. Each rule set must have an owner.
D Distributed Data Center
Consists of more than one data center, connected by a WAN/LAN or a high-speed transport layer to provide redundant data center services. These distributed data centers also share load between them.
G Global Server Load Balancing (GSLB)
Load balancing of servers across multiple sites, allowing local servers to respond not only to incoming requests but to remote servers as well. The Cisco CSS 11000 Content Services Switch supports GSLB through inter-switch exchanges or via a proximity database option.
H Health Checks or Health Probes
Used by the server load balancing and global load balancing devices to check server state and availability based on standard application and network protocols and (depending on the server load balancing product) sometimes customized health check information.
HTTP Redirection
The process by which Hypertext Transfer Protocol (HTTP) requests made by the Cisco Content Distribution Manager are redirected to a client “local” content engine. The request is then served from the content engine.
N NAS (Network Attached Storage)
A central data storage system that is attached to the network that it serves. A File Server with internal or external storage is a simple example.
NAT Peering
Cisco CSS11000 Series Switches use NAT Peering to direct requests to the best site with the requested content based on URL or file type, geographic proximity and server/network loads, avoiding the limitations of Domain Name System (DNS)-based site selection and the overhead of HTTP redirect. NAT peering acts as a “triangulation protocol” allowing the response to be directly delivered to the user over the shortest Internet path.
O Origin Web Server
Core of content networking. The base from which web services are sourced.
R Real Server
Physical server providing the services behind the virtual server to the clients.
S SAN (Storage Area Network)
A dedicated, centrally managed, secure information infrastructure that enables any-to-any interconnection of servers and storage systems. SANs are typically built using the SCSI and Fibre Channel (SCSI-FCP) protocols.
Secure Content Accelerator (SCA)
The Cisco 11000 series Secure Content Accelerator (SCA 11000) is an appliance-based solution that increases the number of secure connections supported by a Web site by offloading the processor-intensive tasks related to securing traffic with SSL. Moving the SSL security processing to the SCA simplifies security management and allows Web servers to process more requests for content and handle more e-transactions.
Source of Authority (SOA)
The primary DNS server for a particular domain.
SRDF (Symmetrix Remote Data Facility)
EMC Symmetrix Remote Data Facility enables real-time data replication between processing environments. The environments can be in the same data center or separated by longer distances.
Stateful Failover
Ensures that connection “state” information is maintained upon failover from one device to another. Session transaction information is also maintained and copied between devices to avoid downtime for websites and services.
Stateless Failover
Maintains both device and link failure status and provides failover notifications if one of these fails. However, unlike stateful failover, stateless failover does not copy session state information from one device to another upon failure. Therefore, any “state” information between the client and server must be retransmitted.
Storage Array (SA)
Cisco storage arrays provide storage expansion to Cisco’s Content Delivery Network products. Two models are offered: Cisco Storage Array 6 (108 GB) and Cisco Storage Array 12 (216 GB).
T Time-to-Live (TTL)
The lifetime a packet has to traverse the network. Each hop that a packet takes through the network decrements the TTL value until the packet is eventually dropped. This keeps the packet from bouncing around the network indefinitely. For multicast, the TTL should never be greater than 7; for routing, the TTL should never be greater than 15.
U Universal Resource Locator (URL)
Standardized addressing scheme for accessing hypertext documents and other services using a browser. URLs are contained within the User Data field and point to specific Web pages and content.
URL Hashing
This feature is an additional predictor for Layer 7 connections in which the real server is chosen using a hash value based on the URL. This hash value is computed on the entire URL or on a portion of it.
V Virtual Server
Logical server in a content switch used to map a service offered by multiple real servers to a single IP address, protocol, and port number used by clients to access the specific service.
W Weighted Round Robin
When weights are assigned to different sites and clients are directed to sites in a round robin fashion based on those weights, the sites with the highest weights end up with more clients.
Wavelength
The distance between points of corresponding phase of two consecutive cycles of a wave. In DWDM systems, wavelength is also called lambda.