

VPNs Illustrated


VPNs Illustrated
Tunnels, VPNs, and IPsec

Jon C. Snader

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco New York • Toronto • Montreal • London • Munich • Paris • Madrid Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales
(800) 382-3419
[email protected]

For sales outside the U.S., please contact:

International Sales
[email protected]

Visit us on the Web: www.awprofessional.com

Library of Congress Cataloging-in-Publication Data

Snader, Jon C., 1944–
VPNs illustrated : tunnels, VPNs, and IPsec / Jon C. Snader.
p. cm.
Includes bibliographical references and index.
ISBN 0-321-24544-X (pbk. : alk. paper)
1. Extranets (Computer networks) 2. IPSec (Computer network protocol) I. Title: Tunnels, VPNs, and IPsec. II. Title.
TK5105.875.E87S53 2005
004.67'8—dc22
2005023223

Copyright © 2006 Pearson Education, Inc.

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.
Rights and Contracts Department
One Lake Street
Upper Saddle River, NJ 07458

ISBN 0-321-24544-X

Text printed in the United States on recycled paper at R.R. Donnelley in Crawfordsville, Indiana.

First printing, October 2005

To Rich Stevens, who showed me the way, and to Maria, who makes it all possible.


Contents

Preface

Part 1. Background

Chapter 1. Introduction
1.1 Purpose
1.2 Readers
1.3 Typographical Conventions
1.4 Source Code and Third-Party Programs
1.5 Testbed
1.6 Road Map
1.7 Summary

Chapter 2. TCP/IP Overview
2.1 Introduction
2.2 Layering
2.3 Encapsulation
2.4 Addressing
2.5 IP
2.6 UDP
2.7 TCP
2.8 ICMP
2.9 NAT and Private IP Addresses
2.10 PPP
2.11 IPv6
2.12 Routing
2.13 Summary
Exercises

Chapter 3. Cryptography Overview
3.1 Introduction
3.2 Symmetric Ciphers
3.3 Asymmetric Ciphers
3.4 Cryptographic Hash Functions, MACs, and HMACs
3.5 Digital Signatures
3.6 Certificates
3.7 Summary
Exercises

Chapter 4. Tunnels
4.1 Introduction
4.2 IP-in-IP Tunnels
4.3 PPPoE
4.4 GRE
4.5 PPTP
4.6 L2TP
4.7 MPLS
4.8 gtunnel
4.9 Summary
Exercises

Part 2. Tunnels and VPNs

Chapter 5. Virtual Private Networks
5.1 Introduction
5.2 PPTP
5.3 L2TP
5.4 Other VPNs
5.5 Summary
Exercises

Chapter 6. Secure Sockets Layer
6.1 Introduction
6.2 Cipher Suites
6.3 The SSL Protocol
6.4 SSL on the Wire
6.5 OpenSSL
6.6 The stunnel Program
6.7 SSL Security
6.8 Summary
Exercises

Chapter 7. SSH
7.1 Introduction
7.2 The SSHv1 Protocol
7.3 The SSHv2 Protocol
7.4 Building VPNs with SSH
7.5 Summary
Exercises

Chapter 8. Lightweight VPNs
8.1 Introduction
8.2 VTun
8.3 CIPE
8.4 Tinc
8.5 OpenVPN
8.6 Summary
Exercises

Part 3. IPsec

Chapter 9. IPsec
9.1 Introduction
9.2 An Overview of IPsec
9.3 Road Map for Part 3
9.4 Summary
Exercises

Chapter 10. IPsec Architecture
10.1 Introduction
10.2 Protocols
10.3 IPsec Modes
10.4 Security Associations
10.5 Combining Security Associations
10.6 Policies
10.7 IPsec Processing
10.8 Summary
Exercises

Chapter 11. AH
11.1 Introduction
11.2 The AH Header
11.3 Sequence Numbers
11.4 AH Processing
11.5 Transport Mode
11.6 Tunnel Mode
11.7 AH with IPv6
11.8 Summary
Exercises

Chapter 12. ESP
12.1 Introduction
12.2 The ESP Header
12.3 ESP Processing
12.4 Transport Mode
12.5 Tunnel Mode
12.6 ESP with IPv6
12.7 Summary
Exercises

Chapter 13. IKE
13.1 Introduction
13.2 ISAKMP
13.3 IKE
13.4 An Example Negotiation
13.5 Summary
Exercises

Chapter 14. IPsec Futures
14.1 Introduction
14.2 IPsec Architecture
14.3 AH
14.4 ESP
14.5 IKE
14.6 NAT Traversal
14.7 Summary
Exercises

Appendix A. Source Code
A.1 Introduction
A.2 Cryptographic Routines
A.3 Library Code

Appendix B. Miscellaneous Software
B.1 Netcat
B.2 tcpdump and Other Packet Sniffers
B.3 ssldump
B.4 PPP

Bibliography

Index


Preface

Introduction

There is a revolution going on in enterprise networking. Until very recently, enterprises that needed to link computers in geographically dispersed locations had to build their own wide area networks (WANs). Usually this meant renting expensive and, by today’s Internet standard, relatively slow frame relay circuits. A typical 56 Kb/s circuit could cost several hundred, or even over a thousand, dollars a month.

Today, the typical home computer user enjoys a broadband Internet connection having a 5 Mb/s download speed for a cost of about $40 per month. Commercial variants of this service, offering higher speeds and other amenities, are available for between $100 and $200 per month. Obviously, this significant increase in speed and decrease in cost represent a tremendous opportunity for enterprises, but they do introduce new problems. The Internet is an open environment and, compared to leased lines, dreadfully insecure. Increases in bandwidth and decreases in cost are worthless if they mean that an enterprise’s vital data can be intercepted by competitors, or that financial transactions are subject to manipulation by outsiders.

This book discusses ways to overcome these problems by recreating the security of leased lines in a public medium such as the Internet. The fundamental mechanism that allows us to have secure communications in the Internet is the notion of a tunnel. As we’ll see, tunnels are a way of overlaying a logical or virtual network on top of a physical network. Once we have such a tunnel, we can secure it by encrypting and authenticating the network traffic that flows through it, thus recreating the security of private leased lines. Of course, this simple description hides a substantial set of details and problems. We’ll see that it’s actually quite difficult to endow such tunnels with robust security.


Much of the book is concerned with exploring solutions to these problems, and seeing why the successful solutions work and where the unsuccessful ones fail.

Source Code and Errata Availability

Source code discussed in the text and other supporting material are available on my Web site at <http://home.netcom.com/~jsnader>. The networking libraries and skeletons from Effective TCP/IP Programming, which I mention and use occasionally in the text, are also available on the Web site.

My readers, it turns out, are much better at finding mistakes than I am. Although I go over the text carefully, checking that every i is dotted and every t crossed, errors still manage to evade me. Fortunately, most of these are caught by the careful and fastidious professionals at Addison-Wesley. Still, some errors will no doubt escape into the final published text. As these are discovered—usually by careful readers—I add them to an errata list for the book. This list is always available at my Web site.

Colophon

As with my previous book, I produced camera-ready copy for this text using James Clark’s splendid Groff typesetting suite (now maintained by Ted Harding and Werner Lemberg) and Rich Stevens’ modified ms macros. I used the gpic, gtbl, and geqn preprocessors for the figures, tables, and mathematical notation, respectively. Some of the figures use gpic macros from Rich Stevens and Gary Wright. Indexing tools from Jon Bentley and Brian Kernighan were a huge help in the production of the index. I included the source code for the programming examples directly from their source files with Dave Hanson’s loom utility. The text is set in the Palatino typeface.

Acknowledgments

Although writing a book is primarily a solitary endeavor, it is nevertheless true that many people make many kinds of contributions to the effort, and that without those contributions, most books, including this one, would not see the light of day.

Once again, I have to thank my wife, Maria, who inexplicably agreed to sign on for another book. Considering the extra work and lonely hours that this entails, it’s a considerable sacrifice, and one for which there are no words adequate to thank her.

Several reviewers suffered through drafts at various stages in the writing cycle. Ronan McLaughlin, Thomas D. Nadeau, and Radia Perlman reviewed some early material from the proposal stage and offered much good advice. Radia Perlman also reviewed some of the earlier chapters. Peter Gutmann and Sandra Henry-Stocker read and reviewed the entire manuscript. Their advice and technical insights were a tremendous help to me in writing the book. Finally, Robin Snader was the first to read much of the material, and he provided valuable and detailed feedback that helped shape the


form and content of the book. All these reviewers provided invaluable assistance and helped make the book much better than it otherwise would have been. I offer them my heartfelt thanks.

No acknowledgments would be complete without mentioning the wonderful staff at Addison-Wesley. Mary Franz helped get the project started and championed the book to Addison-Wesley. My editor, Catherine Nolan, brought her prodigious editing and library science skills to bear and helped shape the book into its final form. She also taught me a lot of things I didn’t know about bibliographies; who knew there was so much to learn? Evelyn Pyle copyedited the book in astounding detail. Her careful checking of every cross reference and citation caught many errors, and left me wondering whether she knew more about the subject matter than I do. Linda Begley, whose sharp eyes would make an eagle weep with envy, proofread the final pages. My production editor, Tyrrell Albaugh, once again helped one of my books through the difficult birthing process. Her cheerful prodding to “do the right thing” helped make the book better in every way.

In view of Rich Stevens’ extraordinary TCP/IP Illustrated series [Stevens 1994, Wright and Stevens 1995, Stevens 1996], naming this book VPNs Illustrated might be considered an act of hubris, but my intent is merely to pay homage to Rich and his books. Those books are at once the benchmark and goal towards which I strive in my own writing.

As always, I welcome readers’ comments, suggestions, and corrections. Please feel free to email me at the address below.

Tampa, Florida
October 2005

Jon C. Snader
[email protected]
http://home.netcom.com/~jsnader


Part 1

Background


1

Introduction

1.1 Purpose

This book focuses on the technology behind tunneling and virtual private networks (VPNs). The explosive growth of the Internet and the buildout of the underlying infrastructure have led many enterprises to replace their private networks based on leased lines with far cheaper solutions based on the public Internet. Although the cost savings are substantial, using the Internet to carry sensitive information presents serious privacy and security problems. One way of addressing these problems is to create private, virtual networks within the Internet structure. These virtual networks are created by using tunneling, authentication, and encryption to provide a virtual leased line between enterprise networks. Because the traffic flow is encrypted and authenticated, it cannot be read or tampered with by third parties, and thus the virtual network recreates the privacy and security of a leased line.

The book is intended for software engineers, systems/sales engineers, system administrators, and others who want an in-depth understanding of tunneling and VPN technology. The text provides the background necessary for readers to understand existing VPN implementations, to create their own implementations, and to read the field’s advanced literature in an informed way. The text also teaches readers how to read and interpret various network traces, such as those produced by tcpdump, as a way of understanding and troubleshooting VPN and network behavior. Finally, the text can be used as a handbook for those seeking information about the functioning of the protocols that we discuss or the message formats that they use.

Our intent is not to restate the relevant RFCs (Request for Comments) or provide an abstract discussion of tunnels and VPNs, but rather to explore how tunnels and VPNs actually function, by observing their behavior “on the wire.” This is accomplished by examining network traces that expose the behavior and packet content of the protocols


used in building tunnels and VPNs. This is, of course, the same approach used in Rich Stevens’ wonderful TCP/IP Illustrated, Volume 1 [Stevens 1994].

1.2 Readers

This book is aimed primarily at developers, system engineers, and IT personnel who need to understand tunneling and VPN technology. Readers should have a basic understanding of networking concepts and feel comfortable with such chores as configuring interfaces and routing on their workstations. Chapter 2 provides a brief review of the TCP/IP suite, including packet and header formats.

Cryptography plays a central role in VPN technology. Chapter 3 covers the basics of cryptography at a level more than adequate to understand its use in the text. Modern cryptography is based on mathematical ideas, some of which are quite sophisticated. Nonetheless, only very modest mathematical skills, mostly some facts about modular arithmetic, are needed to understand the cryptographic ideas in the text. Even so, readers can skip over the mathematics without losing anything critical.

In a few places, the text presents some example code. Readers with a basic understanding of C will have no problems with the C code or even the examples in Python. The code serves only to illustrate various points in the book, and no real programming skills are needed. Readers unfamiliar with C or Python but with experience in another language should be able to follow most of the code from the detailed discussions that accompany it. Readers unfamiliar with any programming language can still follow the important ideas by reading the commentary that accompanies the code.

1.3 Typographical Conventions

During our explorations we will run many small examples designed to illustrate some aspect of the protocol we are examining. When we show interactive input and output, we will use the following conventions:

• User input is set in boldface monospaced type.
• System output is set in plain monospaced type.
• Comments that are not part of the input or output are set in italics:

laptop# ifconfig gif0 create
laptop# ifconfig gif0 192.168.0.2 192.168.0.1      logical addresses
laptop# gifconfig gif0 172.30.0.12 172.30.0.1      physical addresses

• The name of the system is included in the shell prompt.
• Some material is set off from the surrounding text.

Footnote-type material and parenthetical remarks are set in smaller type and are indented like this paragraph. Often, this material can be skipped on a first reading.

• URLs are set off with angle brackets.

1.4 Source Code and Third-Party Programs

The limited amount of programming code in the book is, for the most part, self-contained. It is written in either standard C or Python. Readers familiar with C but not Python should have no problem understanding the Python examples. All the code from the text is available from the author’s Web site.

Some of the programming examples make use of a networking library developed for the author’s previous book Effective TCP/IP Programming [Snader 2000]. The functions in the library are described in Appendix A; the code is available from the author’s Web site.

In some instances we will use publicly available programs to help illustrate a point, build a tunnel, or analyze traffic flow. Appendix B describes these programs and provides a pointer to where they can be obtained. Occasionally we’ll mention Effective TCP/IP Programming in the text. When we do, we’ll refer to it as ETCP.

1.5 Testbed

Throughout the text we will use the test network shown in Figure 1.1 to perform tests and to demonstrate the operation of the various tunnels and VPNs that we study.

[Figure 1.1 The Test Network: the hosts bsd, solaris, linux, laptop, and linuxlt and a router, attached to the 192.168.123.0/24, 192.168.122.0/24, and 172.30.0.0/24 networks, with a connection to the Internet.]

The bsd and laptop hosts run FreeBSD 4.10, the linux and linuxlt hosts run Linux 2.4, and the solaris host runs Solaris 8 on Intel. The laptop and linuxlt hosts are shown as dashed because they’re both laptops that are plugged into various places on the network as required by the situation. When


we discuss a particular example, we will include only the parts of the network that are pertinent to that example. Although each of these hosts runs some version of UNIX, this book is not about UNIX networking. We are interested in the protocols themselves, not in any particular implementation, so the hosts on which the examples and tests are run are immaterial.

1.6 Road Map

The book comprises three parts: background material (Part 1), a discussion of tunneling and VPN technology (Part 2), and a discussion of IPsec (Part 3). Each part depends to some extent on the material in previous parts, but readers with the appropriate background can read them independently. In particular, readers who are interested only in IPsec can skip directly to Part 3, and use the other two parts to fill in any gaps in their backgrounds as needed.

Background Material

Part 1 provides background material that allows the text to be mostly self-contained. Chapter 2 is a review of the core TCP/IP protocols. It pays particular attention to the notion of encapsulation, an idea that is a central concept in the text. The chapter also has brief discussions of NAT, PPP, and routing.

Chapter 3 is a primer on basic modern cryptography. This chapter covers block and stream symmetric ciphers, such as DES, 3DES, AES, and RC4, and the asymmetric ciphers, such as RSA and ElGamal, which form the basis for public key cryptography. Next, the chapter discusses message authentication codes, a sort of cryptographic checksum, that can detect tampering with a message by a third party. Particular attention is paid to the class of authentication codes called HMACs. Finally, the chapter briefly discusses digital signatures and certificates. These ideas play an important role in many of the authentication protocols that we will study.

Chapter 4 focuses on tunnels and tunneling technology. After defining a tunnel as a way of providing a virtual network on top of a real network through encapsulation, the chapter discusses several examples of tunnels and how they are used to solve various networking problems. These tunnels are explored in depth; the chapter examines both their message formats and their on-the-wire behavior. The chapter concludes with a discussion of gtunnel, a generalized mechanism that allows users to build tunnels with a user-space program. Several VPN technologies use the ideas in gtunnel to build their tunnels.

Tunnels and VPNs

Part 2 focuses on using the tunneling technology from Chapter 4 to build VPNs. Chapter 5 reexamines two of the tunnels from Chapter 4 and observes how encryption is added to the tunnels to make rudimentary VPNs.


Chapter 6 studies the SSL protocol and how it can provide a VPN—or at least VPN-like functionality—at the application layer. Although some authorities object to calling SSL a tunneling or VPN technology, we’ll see that it can, in fact, be used to build real network-to-network VPNs. Later chapters discuss how the SSL protocol is used in some VPN technologies to provide end-node authentication and key-management functions.

Chapter 7 discusses the two SSH protocols and how they are used as drop-in replacements for telnet, ftp, and the BSD r-commands. Like SSL, SSH operates at the application layer, but can nonetheless be used to build true network-to-network VPNs.

Chapter 8 concludes Part 2 with a discussion of several lightweight VPNs—VTun, CIPE, tinc, and OpenVPN—and examines their strengths and weaknesses. In all but one case, these VPNs are implemented in user space, and so make use of the gtunnel-like capabilities that are examined in Chapter 4. This chapter is particularly revealing because it exposes many of the security problems that a VPN designer must consider. The four VPNs vary greatly in their security and robustness, and these differences highlight common mistakes in implementing secure software.

IPsec

Part 3 is a detailed examination of IPsec, the IETF standard VPN technology. IPsec operates at the network layer and is thus largely indifferent to the type of traffic it is carrying. IPsec is transparent to applications; they are not aware of its existence. Chapter 9 provides a brief introduction and road map. Chapter 10 discusses the IPsec architecture: the protocols, modes, and databases that make up the IPsec suite.

Each of the following three chapters discusses one of the three protocols that make up IPsec. Chapter 11 discusses the Authentication Header (AH) protocol and how it provides data origin authentication, message integrity, and protection from replay attacks. The algorithms used to provide replay protection receive detailed attention, as do the two modes of AH encapsulation.

Chapter 12 examines the Encapsulating Security Payload (ESP) protocol, which can provide the essentially identical services that AH does, as well as privacy through encryption. Thus although third parties can read but not tamper with messages protected by AH, they can neither read nor tamper with messages protected by ESP. As with AH, the two encapsulation modes receive careful examination.

The Internet Key Exchange (IKE) protocol is the third IPsec protocol. This key-management protocol is examined in Chapter 13. By far the most complicated of the three protocols, IKE is largely responsible for IPsec’s reputation as a complicated and difficult protocol. The chapter examines how IKE negotiates security associations between peers, and how it derives the keys used by AH, ESP, and IKE to secure their communications.

Part 3 ends with Chapter 14, which discusses the near-term future of IPsec. First, the chapter examines the new versions of AH, ESP, and IKE that are currently under


IETF development. Then it discusses the recently standardized NAT traversal mechanisms that provide IPsec and NAT with some degree of interoperability.

1.7 Summary

This introduction laid out the purpose and prerequisites of the text. The chapter discussed typographical conventions, source code, and third-party software that the book will use. Finally, it provided a detailed road map for the rest of the book.

2

TCP/IP Overview

2.1 Introduction

The flow and routing of data on the Internet is controlled by a set of protocols called the Transmission Control Protocol/Internet Protocol (TCP/IP) suite. These protocols provide many sorts of services. Some protocols provide a connectionless, best-effort, datagram delivery service. Others provide a connection-based reliable data delivery service. Still others provide routing, name resolution, and network control messaging. Together, the TCP/IP protocols form an infrastructure that applications can use to communicate with peer applications on machines across the room or across the world.

Before beginning our examination of the role of tunnels and VPNs in networking, let’s review some of the basic facts about TCP/IP and its use in the Internet. This chapter looks at the major protocols that we will build on later in the text. We will recall their normal operating modes, their data formats, and their operation on the wire. The subject is a large one, of course, so our coverage will be a précis rather than an exhaustive account. A detailed treatment of TCP/IP along the lines of our review is given in [Stevens 1994]. Another excellent account of TCP/IP [Comer 2000] takes a slightly different view.

2.2 Layering

Layering is an important conceptual tool that helps us to organize, understand, and deal with the complexity of network architecture. The idea is to divide the network’s functions into layers. Each layer makes use of the services of the layer below it to provide a set of specific services for the layer above it. Adjacent layers communicate with


each other through a well-defined interface, so that in principle, we could change one layer or even completely rewrite it without affecting the others. Together, the layers are said to constitute a stack. The TCP/IP stack comprises four layers, as shown in Figure 2.1.

[Figure 2.1 The TCP/IP Stack: the Application layer (ftp, ssh, email, etc.), the Transport layer (TCP, UDP), the Network layer (IP, ICMP, IGMP), and the Interface layer (Ethernet, Token Ring, etc.).]

Listed to the right of the stack are some of the protocols that operate in each layer. The four layers and their functions are:

1. The interface layer comprises the operating system device driver and associated network interface hardware. This layer handles the details of getting the bits from the network layer’s datagrams onto the wire. Typical examples of protocols at this layer are Ethernet, Token Ring, Fiber Distributed Data Interface (FDDI), and the Point-to-Point Protocol (PPP). The interface layer is sometimes called the link or datalink layer.

2. The network layer is primarily concerned with routing IP datagrams through the Internet. It also carries Internet Control Message Protocol (ICMP) messages, which contain error and control information about the network, and Internet Group Management Protocol (IGMP) messages, which help with the management of multicast messaging. Because the network layer’s most important task is to handle IP datagrams, it is sometimes called the Internet layer.

3. The transport layer is concerned with end-to-end communications. Whereas the network layer is concerned with moving an IP datagram from one host or router to the next, the transport layer is concerned with communication between the two ultimate destinations, as illustrated in Figure 2.2. As shown in that figure, the transport layer views itself as logically directly connected to its peer transport layer on the destination host. The two most important transport-layer protocols are TCP and the User Datagram Protocol (UDP).

4. The application layer is where the applications reside. Typical examples are such standard TCP/IP applications as ftp, ssh, and email, as well as user-written applications. As with the transport layer, the application layer is logically directly connected to its peer layer on the destination host.

[Figure 2.2 Communication Between Corresponding Layers: the application and transport layers on the source host converse logically with their peer layers on the destination host, while the network and interface layers converse with their peers at the next hop, the intervening routers.]

It is common to find references to the seven-layer Open Systems Interconnection (OSI) reference model [International Standards Organization 1984] in networking literature. In that model, the network, transport, and application layers are numbered 3, 4, and 7, respectively. To avoid confusion between the two models, we will always refer to layers by their names rather than their numbers. When numbers are used in networking literature, they almost always refer to the OSI model. Thus, for example, layer 4 refers to the transport layer. See Tip 14 in ETCP for more on the OSI model and how it relates to the TCP/IP stack.

Beginners are often confused about the difference between the network and transport layers. Figure 2.2 makes the distinction clear. The network layer carries on a conversation with its peer network layer on the next hop, whereas the transport layer carries on a conversation with its peer layer on the final destination. To put it another way, the transport layer behaves as if it were directly connected to its peer layer and is unaware of the actual path that its data takes through the network. The network layer, on the other hand, is concerned with choosing the path that the data takes and, as such, is involved with processing at each hop. Note from the figure that routers do not necessarily even have transport layers.

2.3 Encapsulation

As we shall see later in the text, the notion of encapsulation is fundamental to tunneling and VPNs. We needn’t wait until we discuss tunnels to see encapsulation in action, however. As data travels down the stack on its way to the network cable or other media, each layer adds a header and, possibly, a trailer to the data. We say that each layer encapsulates the data from the previous layer.

This notion is illustrated in Figure 2.3, which shows data for a TCP session moving through the stack. The data that, say, the user types in at the console is encapsulated by


the application layer, which adds an application header. When the encapsulated application data enters the transport layer, it is encapsulated into a TCP segment by the addition of a TCP header. Similarly, when the TCP segment arrives at the network layer, IP adds its own header, encapsulating the TCP segment into an IP datagram. Finally, when the IP datagram gets to the interface layer, the Ethernet driver encapsulates it in an Ethernet frame by adding a header and trailer.

[Figure 2.3 Data Encapsulation in the TCP/IP Stack: the application data picks up an application header at the application layer, a TCP header at the transport layer, an IP header at the network layer, and an Ethernet header and trailer at the interface layer.]

As it moves up the stack at the destination, data is decapsulated at each layer so that that layer sees exactly the same data as its peer. For example, when a TCP segment arrives at the destination host and moves up the stack to the transport layer, TCP will see exactly the same data as that sent by TCP on the source host. Similarly, the application layer will see the same data that the application on the source host sent. It is in this sense that the peer transport and application layers are logically directly connected. Whatever the lower layers on the source host do to the data, the lower layers on the destination host undo. It is important to be aware that, at the network and interface layers, the peer is the next hop, not the ultimate destination, as shown in Figure 2.2. The network layer at the source host is carrying on a conversation with the router at the next hop, not the destination host. As the IP datagrams move through each router, the router changes some of the fields in the IP header, so that the IP layer at the destination host will not, in general, see the same data that the source host sent. Each hop’s network layer will see the exact data that the previous hop sent. Figure 2.4 shows a tcpdump of data, such as that depicted in Figure 2.3, as it enters the interface layer. That is, the dump of the data is just before the Ethernet framing is added.

1    05:58:10.846770 172.30.0.12.1027 > 172.30.0.1.5000: P 1146985371:1146985392(21) ack 1409259751 win 57920 (DF)
1.1  4500 0049 007c 4000 4006 e1e9 ac1e 000c        E..I.|@.@.......
1.2  ac1e 0001 0403 1388 445d 9b9b 53ff 98e7        ........D]..S...
1.3  8018 e240 7b3a 0000 0101 080a 0000 c860        ...@{:.........`
1.4  3503 72f1 0000 0011 4441 5441 4441 5441        5.r.....DATADATA
1.5  4441 5441 4441 5441 0a                         DATADATA.

Figure 2.4 Encapsulated Data

We’ll look at each stage of the encapsulation as we cover the relevant protocols. For now, note the data in boldface on lines 1.4 and 1.5. This is the encapsulated application data. As shown in Figure 2.5, the first 4 bytes are the length of the user data (0x11 = 17), followed by 17 bytes of user data.

[Figure 2.5 Encapsulated Application Data: a 4-byte length field (17) followed by the 17 bytes of user data, DATADATADATADATA\n.]

Not every application encapsulates its data in this way, of course. As we shall see, some add larger headers and trailers, and some add no headers at all.
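To make the framing in Figure 2.5 concrete, here is a small Python sketch (my illustration, not code from the book; the function names are invented) that builds and parses this kind of length-prefixed application data. The 4-byte length is in network byte order, as the hex dump in Figure 2.4 shows.

import struct

def frame(payload: bytes) -> bytes:
    # Prefix the payload with its length as a 4-byte, network-byte-order integer.
    return struct.pack(">I", len(payload)) + payload

def unframe(message: bytes) -> bytes:
    # The first 4 bytes carry the payload length; the payload itself follows.
    (length,) = struct.unpack(">I", message[:4])
    return message[4:4 + length]

data = b"DATADATADATADATA\n"                     # the 17 bytes from lines 1.4 and 1.5
assert frame(data)[:4] == b"\x00\x00\x00\x11"    # 0x11 = 17
assert unframe(frame(data)) == data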

2.4 Addressing

In this section we discuss IP addresses, but before we begin, we should be clear on what that means. IP uses IP addresses to route packets from one host to another. These hosts may or may not be on the same physical network—on the same Ethernet cable, say—so an IP address is not the same thing as a physical address, such as an Ethernet address. In terms of Figure 2.3, IP addresses are used by the network layer, whereas physical addresses are used by the interface layer.

A version 4 IP address is a 32-bit integer. IP addresses are usually written as four decimal numbers connected by dots; we call this dotted decimal notation. Each decimal number is the value of 1 byte of the 32-bit address. Thus, we would write 1.2.3.4 rather than 0x01020304 or, worse yet, 16,909,060.

Traditionally, IP addresses were divided into five classes, as shown in Figure 2.6. The division of the address space into classes was intended to make address allocation more flexible. Class D addresses are used in multicasting—sending an IP datagram to a group of hosts rather than to a specific host on the network. Class E addresses are used for experimental purposes, and we won’t consider them further. Each interface on a host has at least one Class A, B, or C address assigned to it. These are the addresses that are used to send IP traffic to a specific machine. More precisely, they are the addresses used to send IP traffic to a specific interface on a specific machine.
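Before looking at the classes in detail, here is a quick illustration of dotted decimal notation (this snippet is mine, not the book's). The standard Python library converts between the two forms and reproduces the 1.2.3.4 example above.

import socket
import struct

def dotted_to_int(addr: str) -> int:
    # Dotted decimal to a 32-bit integer, e.g. "1.2.3.4" -> 0x01020304.
    return struct.unpack(">I", socket.inet_aton(addr))[0]

def int_to_dotted(value: int) -> str:
    # The reverse conversion.
    return socket.inet_ntoa(struct.pack(">I", value))

assert dotted_to_int("1.2.3.4") == 0x01020304     # that is, 16,909,060
assert int_to_dotted(0x01020304) == "1.2.3.4"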


[Figure 2.6 IP Classful Addressing: Class A addresses begin with the bit 0, followed by a 7-bit network ID and a 24-bit host ID; Class B addresses begin with 10, followed by a 14-bit network ID and a 16-bit host ID; Class C addresses begin with 110, followed by a 21-bit network ID and an 8-bit host ID; Class D addresses begin with 1110 and carry a 28-bit multicast group; Class E addresses begin with 1111 and are reserved.]

Although part of the address, we can think of the first few bits of an IP address as identifying the network class, as shown in Figure 2.6. The network ID field identifies the network that this address belongs to. Routers use this field to route IP datagrams through the Internet or other wide area network (WAN). The size of the network ID field depends on the address class, which is determined by the leading bits. The host ID field identifies a particular host on the network specified in the network ID field. The host ID is arbitrary; the system administrator in charge of the network assigns it in any way that is convenient. This field is not used in routing datagrams outside its home network. Figure 2.7 shows the number of networks and hosts and the ranges for Class A, B, and C addresses.

Class   Networks    Hosts        Address Range
A       128         16,777,214   0.0.0.1 to 127.255.255.255
B       16,384      65,534       128.0.0.0 to 191.255.255.255
C       2,097,152   254          192.0.0.0 to 223.255.255.255

Figure 2.7 Networks, Hosts, and Ranges for Class A, B, and C Addresses

The figure clarifies the distinction between Class A, B, and C addresses: Class A addresses are for the few networks with a huge number of hosts. Class C addresses are


for the many networks with just a few hosts. Class B addresses are for networks with a moderate number of hosts. The host IDs consisting of all 1-bits or all 0-bits are reserved. That’s why, for instance, Class C networks have only 254 hosts—host 0 and host 255 are reserved. Network 127 is also reserved; it is used for the internal loopback address. Datagrams addressed to this network are looped back up the stack and never leave the machine.
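The class of an address follows directly from its leading bits. The Python sketch below is only an illustration of the rule in Figure 2.6 (the function name is mine), not code from the book.

def address_class(addr: int) -> str:
    # Classify a 32-bit IPv4 address by its leading bits, per Figure 2.6.
    if addr >> 31 == 0b0:
        return "A"
    if addr >> 30 == 0b10:
        return "B"
    if addr >> 29 == 0b110:
        return "C"
    if addr >> 28 == 0b1110:
        return "D"        # multicast
    return "E"            # reserved

assert address_class(0x0A000001) == "A"    # 10.0.0.1
assert address_class(0x80000001) == "B"    # 128.0.0.1
assert address_class(0xC0A80001) == "C"    # 192.168.0.1
assert address_class(0xE0000001) == "D"    # 224.0.0.1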

Although classful addressing is still prescribed in the standards, it is too inflexible to meet the needs of modern networks. There are several problems with classful addressing, so let’s consider it first from the point of view of the holder of a Class A or B address block. Conceptually, each network corresponds to a group of machines that are connected by some networking medium, such as an Ethernet cable. But no one puts 65 thousand, let alone 16 million, hosts on a single cable. Instead, the network is organized into several smaller segments, each with its own physical network medium, and these segments are connected with routers. Figure 2.8 shows an example of this with just two segments.

[Figure 2.8 Two Segments Connected by a Router: hosts 1, 2, and 3 on segment 1 and hosts 4, 5, and 6 on segment 2; router 2 joins the two segments, and router 1 connects segment 1 to the Internet.]

If host 1 wants to send an IP datagram to host 2, it need merely map host 2’s IP address to its physical address and put the datagram on the wire. Now consider what happens if host 1 wants to send a datagram to host 4: Even if it knows or is able to determine host 4’s physical address, host 1 can’t send the datagram directly to host 4, because it’s on a different physical network. Thus, host 1 must send the datagram to host 4 indirectly through router 2. This raises the question of how host 1 knows that it must send the datagram to router 2. One possibility is for router 2 to masquerade as the hosts on segment 2 for any host on segment 1, and as the hosts on segment 1 for any host on segment 2. For Ethernet segments, this is called proxy ARP. ARP (Address Resolution Protocol) is used to find a host’s Ethernet media access control (MAC) address from its IP address by broadcasting a message to all hosts on the Ethernet segment,


asking the host with the target IP address to send the requesting host its MAC address. ARP is defined in [Plummer 1982].

When host 1 tries to map the IP address of host 4 to a physical address, router 2 replies with its physical address, and then forwards the datagram onto host 4 when it arrives. The problem with this solution is that it does not work with every type of physical network, and that even when it does, it doesn’t scale well to complicated network topologies. Another possible solution is to add entries to host 1’s routing table for each host on segment 2 listing router 2 as the next hop. This solution will work for any type of physical network but clearly doesn’t scale to more than a few hosts. For a large network, the routing tables would be unworkably large and their administration daunting.

The problem is that both segments have the same network ID. If we imagine for a moment that they had different network ID’s, the difficulty largely disappears. Suppose that all the hosts on segment 1 have Class C addresses of the form 192.168.1.n and that all the hosts on segment 2 have addresses of the form 192.168.2.n. In this case, segment 1 hosts have a network ID of 192.168.1, and segment 2 hosts have a network ID of 192.168.2. We now need add only a single entry to the routing table of each host on segment 1 specifying router 2 as the next hop for network 192.168.2. We add a similar entry to the routing table of each host on segment 2. We can easily see that this method would scale well to large networks.

This solution brings its own problems, however. When both segments had the same network ID, hosts outside the network needed only a single routing table entry to route to any host in the network. If each segment has its own network ID, outside hosts will need a routing table entry for each segment. That doesn’t matter much for a single organization with two segments, of course, but imagine thousands of organizations with hundreds of segments, and the problem becomes clear. Routers, especially the high-speed routers in the Internet’s core, often put their routing tables in special high-speed memory on the link interface cards, and because the amount of this memory is limited, the large routing tables envisioned by our solution wouldn’t work.

This problem is not merely theoretical. [Huitema 2000] recounts some real-world examples of router failure caused by the growth of routing tables.

What we need is a solution that lets hosts inside an organization see each segment as a separate network but that lets hosts outside the organization see a single network. A simple solution of this sort, called subnetting, does exist. The basic idea is that part of the host ID is used to specify the segment, or subnetwork, and the internal routers know about this. External routers don’t concern themselves with the host ID portion of the address, so they need only a single routing table entry for the entire organization’s network. The complete details are given in RFC 950 [Mogul and Postel 1985].

CIDR

Subnetting is subsumed in a more general solution, so rather than discuss it in detail, let’s consider the problems of classful addressing from the point of view of an organization that needs several hundred or thousand IP addresses. Ideally, such an organization


would like a Class B address block. A single Class C block doesn’t have enough addresses, and using several Class C blocks reintroduces the problem of routing table growth. As we see from Figure 2.7, only 16 thousand Class B address blocks are available, and most of those have already been assigned. That leaves multiple Class C address blocks as the only solution. To avoid the problem of routing table growth, we use a scheme called classless interdomain routing (CIDR). An example will make clear how CIDR works. Suppose that an organization needs 1,000 IP addresses. Under CIDR, we would allocate the organization four Class C blocks that share the same most significant bits—200.10.4.0 through 200.10.7.0, say. Note that the upper 22 bits of each address in these blocks are the same. We use these 22 bits as the network ID for the organization, and we write the network ID as 200.10.4.0/22. The ‘‘22’’ is called the prefix, or network mask. The prefix specifies how many of the most significant bits of the IP address make up the network ID. The prefix is really just a compact way of specifying the network mask, which is a 32-bit integer with the prefix most significant bits set to 1. Thus, a prefix of 22 corresponds to a network mask of 0xfffffc00. In practice, the two terms are used interchangeably.

Notice that 200.10.4.0/22 and 200.10.4.0/24 are not the same: The first has a network ID of 0x320281, whereas the second has a network ID of 0xc80a04. Obviously, routers must know the subnet mask so that they can extract the network ID to route the datagram. Thus, under CIDR, the network mask becomes part of the routing table entry for all routers. Recall that with classful addressing, external routers didn’t need this information, because they used the first few bits of the IP address to determine the portion of the address making up the network ID. CIDR solves the opposite problem too. Suppose that instead of an organization needing 1,000 addresses, we have a home network that needs only 5. In that case, we can allocate part of a Class C block—200.10.4.0, say—by using a 29-bit prefix. Note that we can suballocate the block into 32 such smaller networks. The classless part of CIDR comes from the fact that we completely ignore the original division into Class A, B, and C addresses, and use the network mask to determine the network ID instead. Thus, even though we often hear people say that 200.10.4.0/22 combines four Class C blocks or that 200.10.4.0/29 is part of a Class C block, in reality, Class C blocks no longer exist. CIDR generalizes the Class A, B, C division by allowing us to fix the network/host ID boundary at any bit. Now let’s review how CIDR solves the problems with classful addressing. We’ve just seen how it remedies the problem of Class B address block depletion by combining Class C blocks, and how it helps with IP address depletion in general by allowing networks with fewer than 254 hosts. Our organization with the 200.10.4.0/22 network will probably partition its network into several segments as we discussed above. This reintroduces the internal routing problem that using more than one segment entails. However, because the network mask is part of the routing table entry, the internal routers can use a different prefix. For example, suppose that the organization divided the 200.10.4.0/22 network into four segments. The segments would have the network IDs 200.10.4.0/24, 200.10.5.0/24,


200.10.6.0/24, and 200.10.7.0/24. The internal routers would have a routing table entry for each of these segments with a prefix of 24 and a next hop of the appropriate router. Although we divided the 200.10.4.0/22 network back into its constituent Class C blocks, that isn’t necessary. We could just as well suballocate it into two subnetworks with a 23-bit prefix and 510 hosts or eight subnetworks with a 25-bit prefix and 126 hosts, and so forth.
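A short Python sketch (an illustration of the arithmetic only, not the book's code) shows how a prefix length expands into a network mask and how the mask extracts the network ID; it reproduces the 200.10.4.0/22 figures quoted earlier in this section.

import socket
import struct

def to_int(addr: str) -> int:
    return struct.unpack(">I", socket.inet_aton(addr))[0]

def prefix_to_mask(prefix: int) -> int:
    # A /prefix mask is a 32-bit integer with the top 'prefix' bits set to 1.
    return (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF

def network_id(addr: str, prefix: int) -> int:
    # The network ID is the top 'prefix' bits of the address.
    return to_int(addr) >> (32 - prefix)

assert prefix_to_mask(22) == 0xFFFFFC00            # the mask for a 22-bit prefix
assert network_id("200.10.4.0", 22) == 0x320281    # as noted in the text
assert network_id("200.10.4.0", 24) == 0xC80A04    # a /24 gives a different network ID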

To help make these ideas clear, let’s assume that our example organization with the 200.10.4.0/22 network decides to split the network into two segments, as in Figure 2.8. The system administrator might decide to assign the hosts on segment 1 to the 200.10.4.0/23 network and the hosts on segment 2 to the 200.10.6.0/23 network. Figure 2.9 shows the two segments of Figure 2.8 redrawn and labeled with their network addresses. Each router interface has been labeled with the network address that it reaches, and each host has been labeled with the last 2 bytes of its IP address.

[Figure 2.9 Two Segments with Suballocated Network Addresses: segment 1 (200.10.4.0/23) holds hosts .4.1, .4.2, and .4.3; segment 2 (200.10.6.0/23) holds hosts .6.1, .6.2, and .6.3; router 2 joins the two segments, and router 1 connects the 200.10.4.0/22 network to the Internet.]

What would the routing table for host 1 look like? Because it is directly connected to the 200.10.4.0/23 network, it can send datagrams to hosts on that network directly and doesn’t need a next hop. We’ll indicate that by specifying the next hop as “local.” The only way to reach the 200.10.6.0/23 network is through router 2, so we will need an entry for that. All other datagrams will have destinations outside the 200.10.4.0/22 network, so we will need a default route listing router 1 as the next hop. Thus, our routing table would look like Figure 2.10.

Route         Prefix   Next Hop
200.10.4.0    23       local
200.10.6.0    23       router 2
default       —        router 1

Figure 2.10 The Routing Table for Host 1
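As an illustration only (the data structure and names are mine, not the book's), here is how host 1 might represent this table and look up a next hop; the matching rule (AND the destination with each entry's mask and prefer the longest matching prefix) is the one described in the next paragraph.

import socket
import struct

def to_int(addr: str) -> int:
    return struct.unpack(">I", socket.inet_aton(addr))[0]

def mask(prefix: int) -> int:
    return (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF

# (route, prefix, next hop); the default route uses a prefix of 0, so it matches everything.
routing_table = [
    ("200.10.4.0", 23, "local"),
    ("200.10.6.0", 23, "router 2"),
    ("0.0.0.0",     0, "router 1"),     # default
]

def next_hop(destination: str) -> str:
    best_prefix, best_hop = -1, None
    for route, prefix, hop in routing_table:
        if to_int(destination) & mask(prefix) == to_int(route):
            if prefix > best_prefix:                 # longest match wins
                best_prefix, best_hop = prefix, hop
    return best_hop

assert next_hop("200.10.4.2") == "local"       # host 2 is on the directly connected segment
assert next_hop("200.10.6.1") == "router 2"    # host 4
assert next_hop("10.1.2.3") == "router 1"      # everything else takes the default route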


Suppose again that host 1 wants to send a datagram to host 4. As shown in Figure 2.9, host 4 has the IP address 200.10.6.1. Host 1 will consult each entry in its routing table and will AND the network mask of the entry with 200.10.6.1. If the result matches the route, then this is a candidate entry to use for routing. If more than one entry matches, the one with the longest match is used. In this case, the 200.10.6.0/23 entry is the best (and only) match, so host 1 will send the datagram to router 2.

CIDR is discussed in detail in RFCs 1517, 1518, and 1519 [Hinden 1993, Rekhter and Li 1993, Fuller, Li, Yu, and Varadhan 1993]. Tip 2 of ETCP discusses subnetting and CIDR in a little more detail.

Broadcast Addresses

Before leaving the subject of IP addressing, let’s take a quick look at broadcast addresses, which are used to send a datagram to all hosts on a segment or a network. With CIDR, there are two types of broadcast address:

1. Limited broadcast
2. Network-directed broadcast

A datagram cannot be broadcast outside the network of the host sending it. If we think about the mischief that a malefactor could cause by being able to broadcast a datagram to every host on the Internet, we see that this restriction makes sense. For example, a broadcast with a forged source address would cause every receiving host to reply, resulting in a very efficient denial-of-service attack.

The limited broadcast address is 255.255.255.255. It is called limited because routers will never forward datagrams with that destination address. This means that datagrams with the limited broadcast address are confined to the particular network segment on which they originate. A typical use for the limited broadcast address is for a host to map an IP address to a physical address. With Ethernet, for example, a host wishing to determine the physical address of some other host from its IP address broadcasts a request asking the holder of that IP address to send the original host a message with its physical address. All hosts on the segment will receive the request, but only the holder of the target IP address will respond. Other uses are with the Dynamic Host Configuration Protocol (DHCP), used by hosts to obtain an IP address when they boot, and the BOOTP protocol, used by diskless workstations to get their boot images.

The network-directed broadcast address has the normal network ID of the specified network, and the host ID set to all ones. Thus, if host 1 of Figure 2.9 wanted to send a broadcast message to every host on the 200.10.4.0/22 network, it would address the datagram to 200.10.7.255. Because of security problems with this type of address, routers are often configured to not forward them.

Smurf attacks are an example of how network-directed broadcasts can be misused. The Smurf attack is another denial-of-service attack; a datagram with a forged source address is sent to a network-directed broadcast address, causing each host on the network to reply to the forged address.
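A network-directed broadcast address is easy to compute: keep the network ID and set every host ID bit to 1. The Python sketch below is illustrative only (not from the book) and reproduces the 200.10.7.255 example.

import socket
import struct

def directed_broadcast(network: str, prefix: int) -> str:
    # OR the network address with a host ID of all ones.
    addr = struct.unpack(">I", socket.inet_aton(network))[0]
    host_bits = 32 - prefix
    return socket.inet_ntoa(struct.pack(">I", addr | ((1 << host_bits) - 1)))

assert directed_broadcast("200.10.4.0", 22) == "200.10.7.255"   # the example from the text
assert directed_broadcast("200.10.4.0", 23) == "200.10.5.255"   # segment 1 of Figure 2.9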

2.5 IP

IP provides a connectionless, unreliable datagram delivery service. Connectionless means that the transmission of each datagram is a distinct event, and no state is maintained by the host about previously sent datagrams. Unreliable means that IP makes no guarantees about whether the datagram will be delivered and, if it is, whether it will be delivered in order with respect to other datagrams.

The standard analogy is that IP is like sending a letter or postcard. We separately address each letter that we send, even if it is going to the same correspondent as our last letter. The post office handles our letters without regard to any other letters we may have sent to the same correspondent; that is, it does not maintain state between letters. Nor does the post office guarantee that our letters will not be lost, delayed, or delivered out of order.

Note that we are talking about IP here. It may be that the application or transport layer is maintaining state and providing reliability. The point is that IP isn’t doing so.

Given that IP provides such a bare-bones service, we might ask why we bother with it at all. Why not merely define a reliable service in the first place? The answer is that IP provides a simple building block upon which other, more robust, protocols can be built. When viewed from above—from the upper layers in the stack—IP provides a simple packet-transport vehicle that makes no assumptions about the packets it will be carrying. The upper layers simply load the data and tell IP where to deliver it. IP does not care about or look at the data it is carrying. This means that IP is capable of serving as a delivery mechanism for a wide variety of upper-level protocols.

The view from below—from the physical medium on which IP runs—is of a very undemanding protocol. Because it assumes nothing about the underlying medium except that it is capable of routing packets, IP can run on any type of physical network that can carry packets. Thus, IP can, and does, run on a wide variety of physical networks. A few of the (widely different) physical networks that IP runs on are serial lines, Ethernet, Token Ring, asynchronous transfer mode (ATM) WANs, X.25, FDDI, cellular digital packet data (CDPD), and 802.11 WiFi. There are many others. The fact that IP runs on all these physical networks means that the upper-layer protocols, such as UDP and TCP, do too. We see, therefore, that the simplicity of IP is its chief strength.

The IP specification is given in RFC 791 [Postel 1981a].

The IP Header

Figure 2.11 shows the format of the IP header. This is the header for version 4 of IP. See Figure 2.37 for the version 6 header format. The length of the header in 32-bit words is given in the hdr. len. field. For the normal case, in which the IP header contains no options, this will be a 5. The type of service field contains either a precedence value and bit field specifying how routers should handle the datagram, or it contains explicit congestion notification (ECN) information.

[Figure 2.11 The IPv4 Header: version, hdr. len., type of service, total length (bytes), identification, DF and MF flags, fragment offset, time to live, protocol, IP header checksum, source address, destination address, IP options (if any), data.]

The total length field is the length in bytes of the complete datagram, including the headers. This field is 16 bits, so the maximum sized IP datagram is 65,535 bytes.

Even though an IP datagram can be up to 65,535 bytes, most physical networks can't handle packets that large. Ethernet, for example, has a maximum payload, called the maximum transmission unit (MTU), of 1,500 bytes. If an application tries to send a datagram larger than the MTU of the interface, IP breaks the datagram into smaller fragments. Even if the datagram is smaller than the MTU of the sending host's interface, it may encounter a smaller MTU at one of the intermediate hops, causing the router to fragment the datagram. If a datagram is fragmented, the total length field will contain the size of the fragment, and the fragment offset field will contain the offset of the fragment in the original datagram. Every fragment except the last will have the more fragments (MF) bit set. The fragments of an IP datagram are tied together with the identification field. Every IP datagram that a host sends will have a unique (modulo 2^16) identification number, so even if the datagram is fragmented by an intermediate router, the destination host will have the required information to reassemble the datagram.

Hosts can signal routers not to fragment a datagram by setting the don't fragment (DF) bit. This bit is useful for discovering the minimum MTU on the path from source to destination in a process called path MTU (PMTU) discovery. The PMTU discovery process is described in RFC 1191 [Mogul and Deering 1990].

To prevent IP datagrams from circulating in the Internet indefinitely—because of a router loop, say—each datagram carries a time to live (TTL) value. The sending host initializes the TTL to some value (typically 32 or 64), and the router at every hop decrements it by 1. If the TTL reaches 0 before it arrives at its destination, the datagram is dropped, and an error message is sent back to the source host, using ICMP.
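Returning to fragmentation for a moment, here is a rough C sketch of the arithmetic involved (the 4,000-byte datagram and 1,500-byte MTU are made-up numbers chosen only for illustration). The fragment offset field is expressed in units of 8 bytes, so every fragment except the last must carry a multiple of 8 data bytes.

#include <stdio.h>

/* Sketch of how a datagram would be carved into fragments. */
int main(void)
{
    int total_len = 4000;               /* hypothetical datagram, bytes   */
    int mtu       = 1500;               /* hypothetical outgoing MTU      */
    int hdr_len   = 20;                 /* IP header with no options      */
    int data_len  = total_len - hdr_len;
    int max_frag  = (mtu - hdr_len) & ~7;  /* round down to multiple of 8 */
    int offset    = 0;

    while (data_len > 0) {
        int chunk = data_len > max_frag ? max_frag : data_len;
        int mf    = data_len > chunk;   /* more fragments follow?         */

        printf("fragment: total length %4d, offset field %3d, MF %d\n",
               hdr_len + chunk, offset / 8, mf);
        offset   += chunk;
        data_len -= chunk;
    }
    return 0;
}

For these numbers the sketch prints three fragments of 1,500, 1,500, and 1,040 bytes with offset fields 0, 185, and 370, and only the last fragment has the MF bit clear.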


IP datagrams are used to carry upper-layer protocols and, sometimes, network-layer protocols, such as ICMP. The type of protocol that the datagram is carrying is in the protocol field. Some important protocols and their values are shown in Figure 2.12.

Protocol   Value   Description
ICMP         1     Internet Control Message Protocol
IGMP         2     Internet Group Management Protocol
IP/IP        4     IP-in-IP
TCP          6     Transmission Control Protocol
UDP         17     User Datagram Protocol
IPv6        41     IP version 6
GRE         47     Generic Routing Encapsulation
ESP         50     Encapsulating Security Payload
AH          51     Authentication Header
OSPF        89     Open Shortest Path First Routing Protocol
L2TP       115     Layer Two Tunneling Protocol

              Figure 2.12 Internet Protocols

The fields in the IP header are protected by the standard Internet checksum—see RFC 1071 [Braden, Borman, and Partridge 1988]. After the header is completely filled in, IP calculates the checksum and places it in the IP header checksum field. Note that the IP header checksum covers only the header itself, not the data that the datagram carries. This means that the upper-layer protocols must provide their own checksums if they require one.

The source address and destination address fields hold the 32-bit source and destination IP addresses. We discussed IP addresses in Section 2.4.

Looking at lines 1.1 and 1.2 from Figure 2.4, we see the IP header encapsulating the upper-layer protocols:

1.1  4500 0049 007c 4000 4006 e1e9 ac1e 000c    E..I.|@.@.......
1.2  ac1e 0001 0403 1388 445d 9b9b 53ff 98e7    ........D]..S...

The 45 in the first byte tells us that this is an IPv4 datagram (first 4 bits) with a header size of 20 bytes (second 4 bits). The third and fourth bytes tell us that the entire datagram is 73 (0x49) bytes. We see from the ninth byte that the operating system (FreeBSD, in this case) has set the initial TTL to 64 (0x40). The next byte tells us that the datagram is encapsulating protocol 6. From Figure 2.12, we see that this is TCP. Finally, we see that the sending host’s IP address is 172.30.0.12 (0xac1e000c), and that it is sending the datagram to 172.30.0.1 (0xac1e0001).
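As a check on this decoding, here is a small illustrative C sketch (not the book's own code) that walks the 20 header bytes from lines 1.1 and 1.2 above, prints the fields discussed in this section, and verifies the header checksum with the RFC 1071 algorithm.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* The 20 IP header bytes from lines 1.1 and 1.2. */
static const uint8_t hdr[20] = {
    0x45, 0x00, 0x00, 0x49, 0x00, 0x7c, 0x40, 0x00,
    0x40, 0x06, 0xe1, 0xe9, 0xac, 0x1e, 0x00, 0x0c,
    0xac, 0x1e, 0x00, 0x01
};

/* Standard Internet checksum (RFC 1071): one's-complement sum of 16-bit words. */
static uint16_t cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;   /* pad an odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16); /* fold carries back in     */
    return (uint16_t)~sum;
}

int main(void)
{
    printf("version      %d\n", hdr[0] >> 4);
    printf("hdr. len.    %d bytes\n", (hdr[0] & 0x0f) * 4);
    printf("total length %d bytes\n", hdr[2] << 8 | hdr[3]);
    printf("TTL          %d\n", hdr[8]);
    printf("protocol     %d\n", hdr[9]);
    printf("source       %d.%d.%d.%d\n", hdr[12], hdr[13], hdr[14], hdr[15]);
    printf("destination  %d.%d.%d.%d\n", hdr[16], hdr[17], hdr[18], hdr[19]);

    /* With the checksum field included, a correct header sums to zero. */
    printf("checksum %s\n", cksum(hdr, sizeof hdr) == 0 ? "verifies" : "is wrong");
    return 0;
}

Run against these bytes, the sketch reports version 4, a 20-byte header, a 73-byte datagram, TTL 64, protocol 6, the addresses 172.30.0.12 and 172.30.0.1, and a checksum that verifies.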

2.6 UDP

The User Datagram Protocol (UDP) enables user applications to send and receive connectionless, unreliable datagrams. As a transport-layer protocol, UDP is carried in IP datagrams, as shown in Figure 2.13.

[Figure 2.13 UDP Datagram Encapsulation: a 20-byte IP header followed by an 8-byte UDP header and data; the UDP header and data form the UDP datagram, the data is the UDP payload, and the whole is the IP datagram.]

UDP adds very little to the basic IP service on which it depends. First, UDP provides the ability to associate a UDP datagram with the sending and receiving processes through the source and destination port numbers. Recalling our earlier analogy of IP being like sending a letter, we could say that the IP address corresponds to the street address of an apartment building, whereas the port number corresponds to individual apartments. In any event, the receiving UDP uses the destination port number to demultiplex the UDP datagram to the proper application. If the application sends a reply, it will address the reply to the host given in the source address field of the IP header and the application given by the source port number.

The other addition that UDP brings to the basic IP service is an optional checksum. We call the checksum optional because a host can disable its validation by setting it to 0. When present, the checksum covers the entire UDP datagram—header and data. Actually, the checksum includes a little more. Before calculating the checksum, a pseudoheader that includes the source and destination addresses, the protocol, and the length of the IP payload is prepended to the UDP header. This header is not transmitted; it is merely used for calculating the checksum:

[Pseudoheader layout: source IP address, destination IP address, a zero byte, the protocol, and the IP payload length.]

Notice that the length of the UDP datagram is checksummed twice. Although this appears redundant for UDP, it makes more sense for TCP, which also uses the pseudoheader for calculating the checksum, because TCP doesn’t have an explicit length field.

The UDP header is shown in Figure 2.14. The source port and destination port hold the port numbers. The checksum field is either the UDP checksum or 0. The length field holds the total length of the datagram, including the header. Note that this field isn’t really necessary, because the UDP datagram’s length can be inferred from the length field of the encapsulating IP datagram. We shall see, in fact, that TCP does not have an explicit length field. UDP is specified in RFC 768 [Postel 1980].


[Figure 2.14 The UDP Header: source port, destination port, length, checksum, data (if any).]
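To see the pseudoheader in action, here is an illustrative C sketch (not from the book) that recomputes the UDP checksum of the hello datagram that appears later in the port-unreachable example of Section 2.8, sent from 172.30.0.12 port 1031 to 172.30.0.1 port 6666; the sum16() helper is just the RFC 1071 sum without the final complement.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* RFC 1071 sum over a buffer, without the final complement. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    return sum;
}

int main(void)
{
    uint8_t src[4]  = { 172, 30, 0, 12 };
    uint8_t dst[4]  = { 172, 30, 0, 1 };
    uint8_t udp[14] = { 0x04, 0x07,    /* source port 1031      */
                        0x1a, 0x0a,    /* destination port 6666 */
                        0x00, 0x0e,    /* length 14             */
                        0x00, 0x00,    /* checksum zeroed       */
                        'h', 'e', 'l', 'l', 'o', '\n' };

    /* Build the pseudoheader: addresses, zero, protocol 17, UDP length. */
    uint8_t pseudo[12];
    memcpy(pseudo, src, 4);
    memcpy(pseudo + 4, dst, 4);
    pseudo[8]  = 0;
    pseudo[9]  = 17;
    pseudo[10] = 0x00;
    pseudo[11] = 0x0e;

    uint32_t sum = sum16(pseudo, sizeof pseudo, 0);
    sum = sum16(udp, sizeof udp, sum);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    printf("UDP checksum: 0x%04x\n", (uint16_t)~sum);
    return 0;
}

For this datagram the sketch prints 0x459b, the value carried in the trace in Section 2.8.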

2.7 TCP

The Transmission Control Protocol (TCP) provides a reliable, connection-based byte stream delivery service. The last sentence is ripe with meaning, so let's spend a little time exploring what it means. We have seen that IP and UDP are best-effort protocols. They make no guarantees that they will deliver a datagram or if they do, that they will deliver it in order. Because IP does not checksum its payload, and because the UDP checksum is optional, these protocols don't even guarantee that any data they do deliver will arrive uncorrupted. TCP, on the other hand, is prepared to make some guarantees. It guarantees that any data that arrives at the destination will be in order and uncorrupted.

We should be more precise here. When we say that TCP guarantees that the data will be uncorrupted, we mean that it guarantees it to the extent that the 16-bit Internet checksum is able to detect corruption.

Reliability also means that TCP agrees to try really hard to deliver all the data that the sender commits to it for transmission to the destination. It does this by demanding acknowledgments from its peer TCP that data has arrived and by retransmitting the data after a suitable time if it does not receive an acknowledgment. Reliability most emphatically does not mean that TCP unconditionally promises to deliver any data that the sender writes. Even a moment’s thought will convince us that TCP couldn’t possibly keep that promise under all circumstances. For more on what reliability means and doesn’t mean, see Tip 9 of ETCP. In order to implement its retransmission strategy, TCP must maintain state between the blocks of data, called segments, that it sends to its peer. It does this by establishing a logical connection with its peer. The usual analogy is that TCP is like a phone call: A connection is established, words are delivered in the order that they are spoken, and it is not necessary to address each sentence or word by, say, continually redialing the peer’s number. When the parties finish their conversation, they say goodbye and hang up, and the connection is torn down. As we shall see, TCP goes through similar stages: A connection is established, the peers exchange data without needing to specify their peers’ addresses with each write, and when they are finished, the connection is torn down.


The telephone analogy is a useful one for understanding the difference between connectionless and connection-based protocols, but it can be misleading. With a phone call, a physical connection is established, but the TCP connection is entirely notional, consisting only of shared state maintained by the two peers. The data itself, as shown in Figure 2.15, is carried in an IP datagram, just as it is for IP, ICMP, and UDP.

[Figure 2.15 TCP Encapsulation: a 20-byte IP header followed by a 20-byte TCP header and data; the TCP header and data form the TCP segment, and the whole is the IP datagram.]

One of the most common misunderstandings about TCP involves its data delivery model. TCP delivers a byte stream to the receiving application. This means that TCP has no notion of records or packets that are visible at the user level. Suppose that an application writes a series of 500-byte messages. On any given read, the receiver may read part of one of those messages, all of one of the messages, or more than one of the messages. As Varghese [Varghese 2005] puts it, TCP simulates a shared data queue into which the sender puts bytes and the receiver removes bytes. There is no way for the receiving TCP to tell whether or not 2 bytes were put into the queue at the same time. This is not to say that the application cannot impose its own record structure on the byte stream, only that TCP doesn't do so. Figure 2.5, for example, shows one way for an application to do this. Explicit record markers, such as newlines in textual data, are another. Later in the text, we will see several examples of applications or other protocols running over TCP imposing their own record structure on the byte stream.

From the preceding discussion, the meaning of our description of TCP as a reliable, connection-based, byte stream delivery service should now be clear. Notice how this service is the antithesis of UDP, which is an unreliable, connectionless, datagram delivery service. The TCP specification is given in RFC 793 [Postel 1981b].

The TCP Header

As we mentioned above, TCP sends its data in blocks called segments. The format of these segments is shown in Figure 2.16. The source port and destination port serve to identify the sending and receiving applications, just as they do in UDP datagrams. In TCP, every byte has a sequence number. We do not, of course, attach a sequence number to every byte. Instead, the sequence number of the first byte in a segment is placed in the sequence number field. The sequence numbers of the remaining bytes in the segment are then known by implication.


[Figure 2.16 The TCP Header: source port, destination port, sequence number, acknowledgment number, data offset, reserved bits, the URG, ACK, PSH, RST, SYN, and FIN flags, window size, TCP checksum, urgent pointer, options (if any), data (if any).]

As we shall see when we discuss the TCP flag bits, the SYN and FIN bits also take up a sequence number.

When TCP receives a segment from its peer, it returns the sequence number of the next byte it is expecting—that is, the sequence number after that of the largest-numbered byte it has received—in the acknowledgment number field. This serves as an acknowledgment to its peer that TCP has received all bytes up to but not including the byte numbered in the acknowledgment number field. If the receiving peer has data of its own to send, it piggybacks the acknowledgment number in the data segment. Otherwise it sends the acknowledgment in a segment without any data. Every segment after the first (SYN) segment must specify the next byte it is expecting from its peer in the acknowledgment field. TCP uses sequence numbers to ensure that applications receive the data in order. Most TCPs will queue any out-of-order data they receive until the missing bytes arrive. It is legal, however, for TCP to merely drop the data and send an acknowledgment indicating the data it is expecting. The TCP header is nominally 20 bytes but commonly contains optional data, such as timestamps or announcements concerning the maximum segment size the peer is willing to accept. TCP puts the size of the header, including options, in 32-bit words in the data offset field.


There are six flags that TCP can set in the header:

SYN

This flag is set in the initial segment when a new connection is being set up. When the SYN flag is set, it has the sequence number given in the sequence number field, and the first byte of data has the next sequence number.

FIN

This flag serves as an EOF indicating that the TCP that set it is through sending data, although it may be willing to accept more data. When this flag is set, it has the sequence number after that of the last byte of data in the segment.

ACK

This flag is set when the value in the acknowledgment number field is valid. The ACK flag must be set in every segment except the first (the initial SYN), which has nothing to acknowledge.

RST

This flag is used to reset the connection. Its most common use is during connection setup to indicate that no application is listening for a connection on the destination port. The flag is also used to immediately abort a connection under certain error conditions.

PSH

This flag is intended to indicate that the receiver should deliver any data in its receive buffer to the application but is virtually always ignored by the receiver. Senders often set it when the current segment empties the send buffer.

URG

This flag indicates that urgent data is available in the byte stream at the offset—from this segment—shown in the urgent pointer field. The meaning and use of this field are widely misunderstood. See [Stevens 1998] for some guidance on its use.

TCP provides flow control by telling its peer how much data it is currently willing to accept. It does this by advertising a receive window in the window size field. The receive window is the number of bytes, starting with the one numbered in the acknowledgment field, that TCP has space to buffer. When TCP has exhausted the buffer space for a connection, it will advertise a zero-sized window, and its peer will not send it any more data until it advertises a positive window size again.

TCP has a mandatory checksum, which is placed in the checksum field. As with IP and UDP, the checksum is the standard Internet checksum. Also as with UDP, the checksum is calculated over the pseudoheader, the TCP header and options, and the TCP data.

Looking at lines 1.2–1.4 from Figure 2.4, we see the TCP header of the segment encapsulated in the IP datagram:

1.2  ac1e 0001 0403 1388 445d 9b9b 53ff 98e7    ........D]..S...
1.3  8018 e240 7b3a 0000 0101 080a 0000 c860    ...@{:.........`
1.4  3503 72f1 0000 0011 4441 5441 4441 5441    5.r.....DATADATA


The first 12 bytes are the source and destination ports, the sequence number of the first byte of data, and the acknowledgment number. The destination port number, for example, is 5000 (0x1388). The first 2 bytes in line 1.3 tell us that the TCP header is 32 bytes (8 words) long and that the PSH and ACK flags (0x18) are set. A good exercise is to verify the rest of the fields with line 1 of Figure 2.4. We had tcpdump print that line in a way that makes it easy to see the exact values. Normally, tcpdump prints sequence and acknowledgment numbers relative to the sequence number of the SYN bytes. This makes the numbers smaller and easier to follow. For Figure 2.4, we inhibited that behavior. We also asked tcpdump to print numeric rather than symbolic addresses.
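To tie the fields to the bytes, here is a small illustrative C sketch (not from the book) that unpacks the 32 TCP header bytes of lines 1.2–1.4 above, with the 20-byte IP header already stripped off.

#include <stdio.h>
#include <stdint.h>

/* The 32 TCP header bytes from lines 1.2-1.4 (IP header removed). */
static const uint8_t tcp[32] = {
    0x04, 0x03, 0x13, 0x88,                   /* source and destination ports */
    0x44, 0x5d, 0x9b, 0x9b,                   /* sequence number              */
    0x53, 0xff, 0x98, 0xe7,                   /* acknowledgment number        */
    0x80, 0x18, 0xe2, 0x40,                   /* data offset, flags, window   */
    0x7b, 0x3a, 0x00, 0x00,                   /* checksum, urgent pointer     */
    0x01, 0x01, 0x08, 0x0a,                   /* options: NOP, NOP, timestamp */
    0x00, 0x00, 0xc8, 0x60, 0x35, 0x03, 0x72, 0xf1
};

int main(void)
{
    unsigned flags = tcp[13];

    printf("source port      %u\n", tcp[0] << 8 | tcp[1]);
    printf("destination port %u\n", tcp[2] << 8 | tcp[3]);
    printf("sequence number  %u\n",
           (uint32_t)tcp[4] << 24 | tcp[5] << 16 | tcp[6] << 8 | tcp[7]);
    printf("ack number       %u\n",
           (uint32_t)tcp[8] << 24 | tcp[9] << 16 | tcp[10] << 8 | tcp[11]);
    printf("header length    %u bytes\n", (tcp[12] >> 4) * 4);
    printf("flags           %s%s%s%s%s%s\n",
           flags & 0x20 ? " URG" : "", flags & 0x10 ? " ACK" : "",
           flags & 0x08 ? " PSH" : "", flags & 0x04 ? " RST" : "",
           flags & 0x02 ? " SYN" : "", flags & 0x01 ? " FIN" : "");
    printf("window size      %u\n", tcp[14] << 8 | tcp[15]);
    return 0;
}

For these bytes the sketch reports source port 1027, destination port 5000, a 32-byte header, the ACK and PSH flags, and a window of 57,920 bytes, matching the decoding above.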

Connection Setup: The Three-Way Handshake

The normal connection sequence is shown in Figure 2.17.

[Figure 2.17 TCP Three-Way Handshake: the client sends a SYN, the server replies with a SYN-ACK, and the client completes the exchange with an ACK.]

One of the peers, usually called the client, initiates the connection by sending its peer, usually called the server, a SYN segment that has an initial sequence number and perhaps some other connection parameters. The server responds by acknowledging the client’s SYN—we say that the server ACKs the SYN—and sending its own SYN segment with an initial sequence number and optional connection parameters. The client ACKs the server’s SYN, and the connection is established. Sometimes, we say that the connection is synchronized, meaning that the two peers have synchronized their connection state. The term SYN is often used as shorthand for synchronization segment—a segment with the SYN flag set.
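From an application's point of view, the entire three-way handshake happens inside a single connect() call. The following minimal client is a sketch only; the 172.30.0.3 address is made up, and error handling is reduced to perror().

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    struct sockaddr_in peer;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0) {
        perror("socket");
        return 1;
    }

    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(7);                        /* echo service      */
    inet_pton(AF_INET, "172.30.0.3", &peer.sin_addr);  /* hypothetical host */

    /* The SYN, SYN-ACK, ACK exchange happens inside connect(). */
    if (connect(s, (struct sockaddr *)&peer, sizeof peer) < 0) {
        perror("connect");      /* "Connection refused" if an RST comes back */
        close(s);
        return 1;
    }
    printf("connection established\n");
    close(s);
    return 0;
}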

We can see the handshake in action by using tcpdump to capture a TCP session. On laptop, we use netcat (nc) to connect to the echo server on solaris:


laptop:~ $ nc solaris echo
hello
hello
^C punt!

We run tcpdump on laptop to capture the connection setup. In this case, we didn't ask tcpdump to print a hex dump of the datagrams:

1  05:57:05.603882 laptop.1033 > solaris.echo: S 544197796:544197796(0) win 57344 (DF)
2  05:57:05.631720 solaris.echo > laptop.1033: S 3156319241:3156319241(0) ack 544197797 win 24616 (DF)
3  05:57:05.631927 laptop.1033 > solaris.echo: . ack 1 win 57920 (DF)

In line 1, laptop sends a SYN segment to solaris asking to establish a connection with the application listening on port 7 (the echo port). The segment establishes an initial sequence number of 544197796 and announces two connection parameters. First, it tells its peer that its MSS (maximum segment size) is 1,460 bytes. Second, it turns off window scaling (wscale 0). The timestamp option contains some timing information that TCP uses to calculate the round-trip time (RTT) of segments. TCP uses this information in its retransmission strategy. See [Stevens 1994] for the details of the MSS and window scale parameters and the timestamp option. The (DF) at the end of the line indicates that the DF flag in the IP header is set. This is the path MTU discovery mechanism that we discussed earlier.

Next, solaris ACKs the SYN and sends its own SYN and connection parameters. Notice that the acknowledgment number is 544197797, reflecting the fact that solaris is expecting byte 544197797 next. Finally, in line 3, laptop ACKs the SYN from solaris, and the connection is established.

It is also possible, but rare, for both peers to initiate a connection. This happens when they both send a SYN at roughly the same time. The SYNs cross in the network, as illustrated in Figure 2.18, and both peers respond with an ACK. This four-way handshake results in a single connection rather than the two that we might expect.

[Figure 2.18 The Four-Way Handshake: the two SYNs cross in the network, and each peer then ACKs the other's SYN.]

If a host sends a SYN to a host for a port at which no application is listening, the receiving host will respond with an RST (a segment with the RST bit set). This tells the sending host that the connection cannot be established and that it should abandon the attempt.


That is, the RST indicates a hard error. If, for example, the sending host does not receive a SYN-ACK in response to its SYN after a given time, it will continue the attempt to connect by resending the SYN.

To illustrate this, we attempt to connect to a port where no application is listening:

laptop:~ $ nc -v linux 8000
linux [172.30.0.4] 8000 (?) : Connection refused

Then tcpdump shows linux responding to the SYN with an RST:

1  11:21:19.974218 laptop.1070 > linux.8000: S 1025154961:1025154961(0) win 57344 (DF)
2  11:21:19.980602 linux.8000 > laptop.1070: R 0:0(0) ack 1025154962 win 0 (DF)

As we see in line 2, the response from linux has the RST bit set.

Connection Shutdown

After the two TCP peers have finished exchanging data, they enter the final phase of the session: connection teardown. When one side is finished transmitting data, it sends a segment with the FIN bit set. This acts as an EOF, telling the other side that TCP will send no more data. As illustrated in Figure 2.19, the other side will normally also send a FIN, completing the shutdown.

[Figure 2.19 Connection Shutdown: the client application closes and its TCP sends a FIN, which the server ACKs; when the server application closes, the server sends its own FIN, which the client ACKs.]

Here is the end of the connection that we initiated to the echo server on solaris:

1  05:57:26.297609 laptop.1033 > solaris.echo: F 7:7(0) ack 7 win 57920 (DF)
2  05:57:26.300517 solaris.echo > laptop.1033: . ack 8 win 24616 (DF)
3  05:57:26.322748 solaris.echo > laptop.1033: F 7:7(0) ack 8 win 24616 (DF)
4  05:57:26.322930 laptop.1033 > solaris.echo: . ack 8 win 57920 (DF)

The FIN segments are in lines 1 and 3; the ACKs for them, in lines 2 and 4. It’s also possible for one side to close and the other to continue to send data. For example, the client could make a request of the server and then close its half of the connection to indicate that it’s through making requests. The server would not close its side of the connection until it had finished responding to the client. This is illustrated in Figure 2.20. See Tip 16 of ETCP for more information on the halfclose operation and how it can be used in the so-called orderly release operation.

[Figure 2.20 A Halfclose: the client application closes and its TCP sends a FIN, which the server ACKs; the server continues to send data, which the client ACKs, and only later closes and sends its own FIN, which the client ACKs.]
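In code, a halfclose is just a shutdown() of the write side of the socket. The sketch below is illustrative only; it assumes an already-connected socket s (the connection setup is omitted) and the request string is hypothetical.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

/*
 * Send a final request, halfclose the connection with SHUT_WR (our FIN
 * goes out here), then keep reading until the peer sends its own FIN,
 * which read() reports as end of file.
 */
void halfclose_and_drain(int s, const char *request)
{
    char buf[4096];
    ssize_t n;

    if (write(s, request, strlen(request)) < 0) {
        perror("write");
        return;
    }
    if (shutdown(s, SHUT_WR) < 0) {
        perror("shutdown");
        return;
    }

    while ((n = read(s, buf, sizeof buf)) > 0)   /* collect remaining data */
        fwrite(buf, 1, (size_t)n, stdout);

    close(s);
}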

2.8 ICMP

The Internet Control Message Protocol (ICMP) is used to carry error and control messages. For example, if a router decrements an IP datagram's TTL to 0, it will send an ICMP error message back to the sending host, informing it that the datagram was dropped due to an exceeded lifetime. Another common example is the ICMP echo request and echo reply control messages, which are used by the ping utility to test connectivity between hosts.


Although ICMP is usually considered to be a network-layer protocol, ICMP messages are carried in IP datagrams and have their own encapsulation, as shown in Figure 2.21. The specification for ICMP is RFC 792 [Postel 1981].

[Figure 2.21 ICMP Message Encapsulation in an IP Datagram: an IP header followed by the ICMP message.]

ICMP Message Formats

Figure 2.22 shows the format of an ICMP message.

[Figure 2.22 The General ICMP Header: an 8-bit type, an 8-bit code, a 16-bit checksum, and message-specific data.]

The type and code fields indicate the ICMP message type and subtype. Figure 2.24 lists the values for these fields. The checksum field contains a normal IP checksum of the entire ICMP message. Recall that the IP checksum covers only the IP header, so ICMP must provide its own. We will use the ping utility in many of our examples, so it is worthwhile taking a look at the ICMP echo request and reply messages in greater detail.

[Figure 2.23 A ping Packet: type (8 or 0), code 0, checksum, identification, sequence number, data.]

Figure 2.23 shows an ICMP echo request or reply message. Under UNIX, the identification field is usually the pid of the ping process sending the echo requests. This identification field is used to tie the echo replies back to the process that sent the echo request. As we have seen, the upper-layer protocols use ports to route data to the appropriate process, but because ICMP runs in the network layer and doesn't have ports, it must use the identification field.


Type  Code  Description
  0     0   echo reply
  3         destination unreachable
        0     network unreachable
        1     host unreachable
        2     protocol unreachable
        3     port unreachable
        4     fragmentation required but DF bit set
        5     source route failed
        6     destination network unknown
        7     destination host unknown
        8     source host isolated
        9     communication with destination network administratively prohibited
       10     communication with destination host administratively prohibited
       11     network unreachable for TOS
       12     host unreachable for TOS
       13     communication administratively prohibited
       14     host precedence violation
       15     precedence cutoff in effect
  4     0   source quench
  5         redirect
        0     redirect for network
        1     redirect for host
        2     redirect for TOS and network
        3     redirect for TOS and host
  6     0   alternate host address
  8     0   echo request
  9     0   router advertisement
 10     0   router solicitation
 11         time exceeded
        0     TTL is 0 during transit
        1     TTL is 0 during reassembly
 12         parameter problem
        0     IP header bad
        1     required option missing
 13     0   timestamp request
 14     0   timestamp reply
 15     0   information request
 16     0   information reply
 17     0   address mask request
 18     0   address mask reply

              Figure 2.24 ICMP Message Types
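Putting Figure 2.23 into code, here is an illustrative C sketch that builds an echo request the way a typical UNIX ping might: the identification field is taken from the pid, the payload is arbitrary, and the checksum is computed over the whole ICMP message. Actually transmitting the packet would require a raw socket, which is omitted here.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stddef.h>
#include <unistd.h>

/* RFC 1071 Internet checksum. */
static uint16_t cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

int main(void)
{
    uint8_t pkt[16];
    uint16_t id  = (uint16_t)getpid();   /* typical UNIX ping behavior */
    uint16_t seq = 0;
    uint16_t sum;

    memset(pkt, 0, sizeof pkt);
    pkt[0] = 8;                          /* type: echo request */
    pkt[1] = 0;                          /* code               */
    pkt[4] = id >> 8;  pkt[5] = id & 0xff;
    pkt[6] = seq >> 8; pkt[7] = seq & 0xff;
    memcpy(pkt + 8, "datadata", 8);      /* arbitrary payload  */

    /* The checksum is computed over the whole ICMP message with the
     * checksum field itself set to zero. */
    sum = cksum(pkt, sizeof pkt);
    pkt[2] = sum >> 8; pkt[3] = sum & 0xff;

    printf("echo request, id %u, seq %u, checksum 0x%04x\n", id, seq, sum);
    return 0;
}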

Here is a tcpdump of a single ping to host linux:

1    17:07:49.959527 laptop > linux: icmp: echo request
1.1  4500 0054 0f60 0000 4001 12fd ac1e 000c    E..T.`..@.......
1.2  ac1e 0004 0800 c237 2308 0000 a504 1b3f    .......7#......?
1.3  5979 0e00 0809 0a0b 0c0d 0e0f 1011 1213    Yy..............
1.4  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223    .............!"#
1.5  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233    $%&'()*+,-./0123
1.6  3435 3637                                  4567
2    17:07:49.995696 linux > laptop: icmp: echo reply
2.1  4500 0054 2778 0000 4001 fae4 ac1e 0004    E..T'x..@.......
2.2  ac1e 000c 0000 ca37 2308 0000 a504 1b3f    .......7#......?
2.3  5979 0e00 0809 0a0b 0c0d 0e0f 1011 1213    Yy..............
2.4  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223    .............!"#
2.5  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233    $%&'()*+,-./0123
2.6  3435 3637                                  4567

The first 20 bytes in each datagram are the IP header. The ICMP messages follow the IP headers and are set in boldface. Notice that the echo request has a type of 8 and that the echo reply has a type of 0, as expected. The data portion of the message depends on what operating system the ping was run under and what options were specified for the ping invocation. In this case, the first 8 bytes are a timestamp, and each byte of the remaining data is the offset from the beginning of the data.

ICMP Error Messages

As we saw in Figure 2.24, ICMP messages can either be informational, such as the echo request/reply messages, or they can be error messages. The error messages follow the format shown in Figure 2.25: The type and code fields indicate the error, and the message-specific data is a copy of the IP header, including options, and at least the first 8 bytes of the upper-layer protocol header.

[Figure 2.25 An ICMP Error Message: a 20-byte IP header, an 8-byte ICMP header, the original 20-byte IP header, and the first 8 bytes of the transport header.]

We can see an example of an ICMP error message by sending a UDP datagram to a port with no application listening. The destination host will respond with a port-unreachable message as shown next. On laptop, we use netcat to send a datagram to port 6666 on bsd. Once the datagram is sent, netcat terminates:

laptop:~ $ nc -u bsd 6666
hello
laptop:~ $

We capture the result by running tcpdump on laptop:

1    06:28:53.089078 laptop.iad2 > bsd.6666: udp 6
1.1  4500 0022 003f 0000 4011 2243 ac1e 000c    E..".?..@."C....
1.2  ac1e 0001 0407 1a0a 000e 459b 6865 6c6c    ..........E.hell
1.3  6f0a                                       o.
2    06:28:53.091639 bsd > laptop: icmp: bsd udp port 6666 unreachable

2.1  4500 0038 0538 0000 4001 1d44 ac1e 0001    E..8.8..@..D....
2.2  ac1e 000c 0303 dedd 0000 0000 4500 0022    ............E.."
2.3  003f 0000 4011 2243 ac1e 000c ac1e 0001    .?..@."C........
2.4  0407 1a0a 000e 0000                        ........

We see the UDP datagram on lines 1.2 and 1.3 with its payload of hello\n. The ICMP error message is on line 2, with the ICMP header on line 2.2 set in boldface. Notice that the type and code are both 3. From Figure 2.24 we see that this is a port unreachable message. As shown on line 2, this message informs laptop that it sent a message to bsd on port 6666 but that no application was listening on that port—hence the port was ‘‘unreachable.’’ In lines 2.2 and 2.3, we see a copy of the IP header from lines 1.1 and 1.2:

2.2  ac1e 000c 0303 dedd 0000 0000 4500 0022    ............E.."
2.3  003f 0000 4011 2243 ac1e 000c ac1e 0001    .?..@."C........

Notice that this header is an exact copy of the IP header from the UDP datagram. Finally, on line 2.4, we see the 8 bytes of the UDP header:

2.4  0407 1a0a 000e 0000    ........

These bytes are how laptop knows that it was port 6666 (0x1a0a) that was unreachable, and that the source port of the sending application was 1031 (0x407). Notice that bsd has set the UDP checksum field to 0 in the ICMP error message. This behavior is incorrect. If we run this experiment again but send the UDP datagram to solaris or linux instead of to bsd, the checksum field is correct. Indeed, both solaris and linux return the entire UDP datagram, showing that exactly what gets returned in an ICMP error message depends very much on the implementation.

2.9 NAT and Private IP Addresses

RFC 1918 [Rekhter, Moskowitz, Karrenberg et al. 1996] specifies that certain IP addresses are private. That is, routers should not forward them outside the organization that is using them. The original idea was that enterprises that had no need to connect to the Internet or that needed only limited connectivity to the Internet through application gateways could use the private address space rather than obtain, and waste, globally routable addresses. Figure 2.26 lists the private address blocks and their address ranges.

If an organization is never going to connect its network to the Internet, it doesn't matter what addresses it uses, of course. Before RFC 1918, it was common for such organizations to arbitrarily choose an address block and use it. Suppose, however, that such an organization decides to add a mail gateway that is connected to both the private network and the Internet, so as to provide email services to the hosts on the private network. Figure 2.27 shows an example of this for an organization that arbitrarily chose to number the hosts in its private network with addresses from the 18.0.0.0/8 block.


Address Block     Range
10.0.0.0/8        10.0.0.0–10.255.255.255
172.16.0.0/12     172.16.0.0–172.31.255.255
192.168.0.0/16    192.168.0.0–192.168.255.255

              Figure 2.26 RFC 1918 Private Addresses

As shown in the figure, the mail gateway has its internal—to the private network—address set to 18.0.0.254 and is connected to the Internet through the globally routable address 96.29.5.15. Hosts inside the private network send their outgoing email to and receive their incoming email from the mail gateway.

[Figure 2.27 A Private Network with a Mail Gateway: the private network (18.0.0.0/8) connects to the mail gateway at 18.0.0.254, which reaches the Internet through 96.29.5.15.]

Now suppose that someone in the organization wants to send email to a user at MIT. This creates a problem because MIT is the owner of the 18.0.0.0/8 address block, and the mail gateway will try to send the mail back into the private network rather than on to the mit.edu domain. Note how this problem disappears if the organization uses one of the private address blocks instead.

If that were all there was to private addresses, they would have remained a little-known backwater, especially today when almost every network that doesn't need to be completely isolated from the Internet enjoys full connectivity. When combined with network address translation (NAT), private addresses become a useful tool for conserving Internet addresses and for providing an amount of independence from any particular Internet service provider (ISP). Before we can understand how this works, we have to know a little about network address translation.

NAT is typically deployed on routers at the edge of an organization's network. If we replace the mail gateway in Figure 2.27 with a NAT-enabled router and renumber the private network with one of the private address blocks—10.0.0.0/8, say—we get Figure 2.28.

To understand why we need NAT, let's look again at what happens when a host in the private network—10.0.0.1, say—sends an email to someone at MIT. The MIT mail server is at 18.7.7.76, so the IP datagrams from our user on the private network to MIT will have a source address of 10.0.0.1 and a destination address of 18.7.7.76. This datagram could be delivered as it stands, but MIT's mail server will not be able to respond, because the 10.0.0.1 address is not routable.


[Figure 2.28 A Private Network with a NAT-Enabled Router: the private network (10.0.0.0/8) connects to a NAT-enabled router at 10.0.0.254, which reaches the Internet through 96.29.5.15.]

To be more precise, the datagram could be delivered in principle. Many routers and mail servers are configured to ignore traffic from private addresses as a way of avoiding denial-of-service (DOS) attacks and unsolicited email (Spam).

Therefore, before leaving the private network, the datagram must have its source address changed so that return datagrams can find their way back. NAT can operate in three modes:

1. In static mode, every host in the private network that has access to the Internet has a corresponding public address. Thus, NAT need merely perform a one-to-one mapping from the private address to the public address and replace the source address with the public address.

2. In pooled mode, hosts in the private network can use a pool of public addresses. For example, a pool of 32 public addresses might be available for 300 hosts in the private network. When a host in the private network begins a conversation with a remote peer, one of the public addresses from the pool is temporarily assigned to the host until the conversation is finished. If all the pool addresses are in use, no other hosts will be able to communicate outside the private network until one of the public addresses becomes free.

3. In port address translation (PAT), the most common mode for NAT, there is usually only a single public address that all the hosts in the private network share, but the source port of the outgoing datagram is changed to a unique value that is used to associate return datagrams with the originating private address.

To get a better view of how PAT operates, let's drill down into the private network of Figure 2.28, as shown in Figure 2.29.

[Figure 2.29 Part of a Private Network Using NAT: hosts 1, 2, 3, and so on at 10.0.0.1, 10.0.0.2, 10.0.0.3, ... on the 10.0.0.0/8 network sit behind the NAT router at 10.0.0.254, which connects to the Internet through 96.29.5.15.]


We once again suppose that host 1 wants to establish a connection with the MIT email server at 18.7.7.76 and that it uses the source port 2443. Figure 2.30 shows the progress of the IP datagram—containing the TCP SYN segment for the connection to MIT's mail server—as it leaves host 1, prior to having NAT applied, and after it leaves the router, which applied NAT.

[Figure 2.30 An Outbound Datagram Before and After NAT. Before NAT: Src 10.0.0.1.2443, Dst 18.7.7.76.25. After NAT: Src 96.29.5.15.5420, Dst 18.7.7.76.25.]

As expected, the datagram has a nonroutable source address of 10.0.0.1 and a source port of 2443 as it leaves host 1. After having had the router apply NAT, the datagram has a source address of 96.29.5.15 and a source port of 5420. MIT’s mail server will see a connection request from a host at 96.29.5.15.5420, and will reply to the same address and port. When there is a need to specify the port as well as the address in a TCP or UDP setting, it is common and convenient to write the port as a fifth number at the end of the address. Thus, 96.29.5.15.5420 means that the datagram has a source address of 96.29.5.15 and a source port of 5420.

We show the return datagram in Figure 2.31. Notice that as expected, the datagram arrives at the router with a destination of 96.29.5.15.5420 and a source address and port of MIT's mail server. The router looks up the 96.29.5.15.5420 in its PAT state table and discovers that the datagram should be sent to 10.0.0.1.2443. The left side of Figure 2.31 shows the datagram after the router has applied NAT.

[Figure 2.31 An Inbound Datagram Before and After NAT. Before NAT: Src 18.7.7.76.25, Dst 96.29.5.15.5420. After NAT: Src 18.7.7.76.25, Dst 10.0.0.1.2443.]

If another host on the private network, say host 2 at 10.0.0.2, decided to send email to MIT at the same time as host 1, NAT would also map host 2's source address to 96.29.5.15 but would use a different source port, say 7322. Then when a return packet arrives addressed to 96.29.5.15.7322, NAT would know from the destination port number that the packet should be forwarded to 10.0.0.2.


The advantages of NAT are immediately clear. First, it saves IP address space by requiring only a single globally routable IP address for the entire private network. Second, it helps with router table growth by requiring only a single, or at most a few, router entries for the network. Third, it allows the organization to assign private addresses to all the computers in its network and to translate those addresses to public addresses at the edge of the network, where it connects to the outside world. Thus, the organization has a measure of independence from its ISP. If the ISP institutes policies that the organization finds unacceptable or if the organization is able to negotiate a more favorable contract with another provider, it need merely change its NAT rules, and won't have to worry about renumbering all the hosts within the network. Finally, NAT can result in cost savings by allowing the organization to obtain only a single globally routable IP address from its ISP. This last advantage makes NAT especially popular with home and small-business networks.

Despite these advantages, NAT is almost universally reviled among networking experts. Although we discussed NAT as if it involved a simple remapping of the internal host's address, a closer look will reveal that much more is involved. Let's consider what happens after NAT performs its translation. First, when the host's IP address is changed in the IP header, the IP header checksum is invalidated and will have to be recalculated. Next, NAT will have to look in the IP datagram to see what type of data it's carrying; if it's a TCP or UDP packet, that checksum is also invalidated, because of the pseudoheader.

Requiring a network-layer function, such as NAT, to look inside the IP datagram is called a layer violation. Recall that the ideal is for IP to consider the data that it's carrying as opaque. Once one layer must look at and change the data of another layer, we lose much of the benefit of layering that we discussed earlier.
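The following C sketch shows just the first of these chores: rewriting the source address and repairing the IP header checksum. It is illustrative only; a real NAT would also patch the TCP or UDP checksum (because of the pseudoheader), remap ports, and keep the state table discussed above.

#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* RFC 1071 Internet checksum. */
static uint16_t cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Rewrite the source address of an IPv4 header in place and recompute
 * the header checksum over the whole header.
 */
void nat_rewrite_source(uint8_t *ip_hdr, const uint8_t new_src[4])
{
    size_t hdr_len = (ip_hdr[0] & 0x0f) * 4;
    uint16_t sum;

    memcpy(ip_hdr + 12, new_src, 4);      /* source address at offset 12 */
    ip_hdr[10] = ip_hdr[11] = 0;          /* zero the checksum field...  */
    sum = cksum(ip_hdr, hdr_len);
    ip_hdr[10] = sum >> 8;                /* ...and refill it            */
    ip_hdr[11] = sum & 0xff;
}

Production implementations usually avoid recomputing the whole sum and instead update the checksum incrementally from the difference between the old and new addresses; the full recomputation shown here is simply easier to follow.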

This is only the beginning, though. Many application protocols, such as ftp [Reynolds and Postel 1985], require the use of more than one connection. In the case of ftp, for example, the client sends the server a PORT command asking the server to open a connection to the client at a given address and port. The address and port number are transmitted as ASCII data. Note that NAT must be aware of this for two reasons: first, it must remap the private IP address that the client sends the server; second, it must remember, and possibly remap, the port number so that it won't interfere with a previously NAT-assigned port and so that NAT will know how to map the IP/port pair back to the private network when the server responds.

This means that NAT must look inside the TCP packet to see whether it's carrying an ftp PORT command. If it is, NAT must change the address and port in that command. This invalidates the TCP checksum, of course, but we already have to recalculate that because of the pseudoheader. Note, though, that the size of the segment may change—mapping 10.0.0.1 to 96.29.5.15 adds 2 bytes, for example. If the client believes that the last byte it sent has sequence number n, the server—assuming, as in the example, that 2 bytes were added—will believe that the last byte has sequence number n + 2, and will send an ACK specifying n + 3 as the next expected byte.


Thus, NAT will have to change the ACK back to n + 1. Furthermore, NAT will have to remember and correct this discrepancy in sequence numbers for the remainder of the connection.

FTP passive mode was introduced to deal with these problems. In passive mode, the client initiates both the control and data connections, so there is no need for NAT to look inside the data packet to find the address and port of the connection. When the client wants to open a data connection, it sends the server a PASV command, which requests that the server start listening for the data connection on a port of its choice. The server responds on the control connection with the port that it will listen on.

NAT can and does deal with ftp and a few other well-known protocols but will certainly break any user-written protocols that depend on passing address or port information in the data stream. For our purposes, NAT, and especially PAT, have fatal flaws. As we shall see, certain IPsec modes will detect any changes in the source or destination IP addresses and drop the datagram if a change is detected. The most common IPsec protocol, the Encapsulating Security Payload protocol (ESP), encrypts the entire TCP or UDP header, so that the port numbers will not even be visible. Thus, PAT cannot be used with IPsec. We will come back to these problems in Part 3 when we discuss IPsec and NAT traversal.

The operation of NAT, its strengths, and its weaknesses are discussed in RFC 1631 [Egevang and Francis 1994]. NAT terminology is defined and discussed in RFC 2663 [Srisuresh and Holdrege 1999].

2.10 PPP

The Point-to-Point Protocol (PPP) is an interface-layer protocol. We usually think of it as the protocol used to carry IP datagrams over serial lines in dial-up connections to an ISP, but as we shall see, PPP can also carry IP over other media. For example, PPP is commonly used to carry data over Ethernet, T1/E1 connections, and ATM. In Chapter 4 and Part 2, we will see that PPP is frequently used in building tunnels. Because of its use in tunneling and because we will use it ourselves in some of the tunnels we build, it is worthwhile spending a little time to understand the rudiments of the protocol.

PPP evolved from two other networking technologies: serial line IP (SLIP) and the High-Level Data Link Control (HDLC) protocol, as shown in Figure 2.32. SLIP is an extremely simple protocol used to frame IP datagrams on serial lines. It is defined in RFC 1055 [Romkey 1988], which also provides sample C code for sending and receiving IP packets assuming the primitives send_char() and recv_char(), which send and receive a single character over a serial line. The design is straightforward: A special END character (0xc0) is used to begin and end each IP datagram. Any END characters in the data are replaced with the sequence 0xdb 0xdc, where the 0xdb is called the SLIP ESC character. Any SLIP ESC characters are replaced by the sequence 0xdb 0xdd.

Note that the SLIP ESC character, 0xdb, is different from the ASCII ESC character, 0x1b.

SLIP provides no addressing or error detection and can carry only IP datagrams. It also has no built-in compression capabilities, but Van Jacobson's VJ header compression is often used in conjunction with it to compress the TCP and IP headers in TCP segments. VJ compression, which is defined in RFC 1144 [Jacobson 1990], can also be used in PPP connections.

[Figure 2.32 The PPP Family Tree: PPP evolved from SLIP and HDLC.]

The HDLC protocol was originally developed for use on synchronous leased lines. It specifies both packet-framing formats and a method of transmitting a bit stream on the synchronous digital link that provides the physical medium. The basic HDLC packet is shown in Figure 2.33.

[Figure 2.33 HDLC Packet Format: address, control, information, FCS.]

The address field carries the station ID in multidrop lines. Because PPP is a point-to-point protocol, this field is not necessary and is set to 0xff. The control field specifies the type of data in the information field. For PPP, the control field is always set to 0x03. IP datagrams and other data are carried in the information field. The frame check sequence (FCS) field is a standard 16- or 32-bit cyclic redundancy check (CRC).

On synchronous lines, each HDLC frame is preceded by the 8-bit sequence 01111110. This so-called flag is used to synchronize the receiver and tell it that another frame is coming. The synchronous wire protocol ensures that no more than five consecutive 1-bits will appear in the data, so the flag is unique. PPP on asynchronous media uses the same idea: Each PPP frame is terminated with a flag byte of 0x7e. Although not strictly required, most PPP implementations will also start the frame with a flag byte to help prevent interframe line noise, particularly on asynchronous serial lines, from corrupting the following frame.

As with SLIP, a PPP ESC byte (0x7d) is used to escape flag bytes, PPP ESC bytes, and other bytes, such as the ASCII control bytes, that the PPP peers may negotiate. The effect of a PPP ESC is to exclusive-OR 0x20 with the byte being escaped. Thus, a 0x7e (flag) byte in the data will be transmitted as the sequence 0x7d, 0x5e.


Similarly, a PPP ESC byte in the data will be transmitted as the sequence 0x7d, 0x5d. This process is illustrated in Figure 2.34, which shows a PPP packet before and after the data is framed and escaped.

[Figure 2.34 A PPP Packet Before and After Framing and Escaping: before escaping, the packet contains the address (ff), control (03), data bytes including 7d and 7e, and the FCS; after escaping, the frame is delimited by 7e flag bytes and the data bytes become 7d 5d and 7d 5e.]
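Here is an illustrative C sketch of just the byte-stuffing step (framing, FCS computation, and any control-byte map negotiated during LCP are omitted); feeding it the bytes from Figure 2.34 reproduces the escaped sequence shown there.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define PPP_FLAG 0x7e
#define PPP_ESC  0x7d

/*
 * Escape a buffer the way PPP does on asynchronous links: flag and ESC
 * bytes are replaced by ESC followed by the original byte XORed with
 * 0x20.  Returns the escaped length; out must hold up to 2 * len bytes.
 */
size_t ppp_escape(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (in[i] == PPP_FLAG || in[i] == PPP_ESC) {
            out[n++] = PPP_ESC;
            out[n++] = in[i] ^ 0x20;
        } else {
            out[n++] = in[i];
        }
    }
    return n;
}

int main(void)
{
    uint8_t data[] = { 0xff, 0x03, 0x7d, 0x7e };  /* as in Figure 2.34 */
    uint8_t buf[2 * sizeof data];
    size_t n = ppp_escape(data, sizeof data, buf);

    for (size_t i = 0; i < n; i++)
        printf("%02x ", buf[i]);
    printf("\n");                        /* prints: ff 03 7d 5d 7d 5e */
    return 0;
}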

The negotiation of a PPP link has three main phases.

1. The Link Control Protocol (LCP) establishes the basic parameters for the transmission of data over the physical link. Parameters, such as the maximum receive unit (MRU), bytes needing escaping, header compression options, type of CRC to use, authentication protocols, and padding requirements are negotiated in this phase.

2. The authentication phase identifies the peers to each other, using the protocol or protocols negotiated in the LCP phase. The authentication phase uses protocols such as the Password Authentication Protocol (PAP), the Challenge-Handshake Authentication Protocol (CHAP), and the Extensible Authentication Protocol (EAP) to authenticate the peers.

3. The Network Control Protocol (NCP) phase negotiates parameters for one or more network protocols, such as IP, that the PPP link will carry. We are interested in using PPP to carry IP datagrams only, but it can also carry other protocols, such as AppleTalk and IPX. When the NCP phase is operating on behalf of IP, it negotiates parameters such as the IP addresses of the peers, domain name system (DNS) servers, and whether to perform VJ compression. Although not really network-layer protocols, the Compression Control Protocol (CCP) and Encryption Control Protocol (ECP) are also considered NCPs. As their names suggest, they are used to negotiate compression and encryption algorithms and their parameters.

The information field can contain IP datagrams or control information used during LCP, authentication, and NCP negotiation. The first 2 bytes of the information field, called the protocol field, indicate the type of data the information field is carrying, as shown in Figure 2.35.

[Figure 2.35 PPP Frame Formats: address (ff), control (03), protocol, information, FCS; protocol 0021 carries an IP datagram, c021 carries LCP data, and 8021 carries NCP data.]

The format of the LCP and NCP data is defined in a series of RFCs, but [Carlson 2000] has an excellent discussion of this data and a thorough explanation of the working of PPP.

Because PPP normally makes no use of the address and control fields, the peers can agree during LCP negotiations to omit them in non-LCP frames. Similarly, the peers can agree that they will compress the protocol field to a single byte whenever possible. If the most significant byte of the protocol field is odd, it is compressed; otherwise, the full protocol field is present. Being aware of these optimizations is important because we will see examples of them in our network traces later in the text.

2.11 IPv6

We have already seen how CIDR and NAT are being used as short-term solutions to the depletion of IPv4 addresses. The longer-term solution is a new version of IP: IP version 6 (IPv6). IPv6 improves on IPv4 in the following areas:

• A larger address space. IPv6 increases the 32-bit addresses of IPv4 to 128 bits. The new addresses are explicitly hierarchical, which helps prevent router table growth. We'll discuss the IPv6 address structure shortly.

• A simplified, fixed-sized header. The IPv6 header is considerably simplified. The number of fields has been reduced from 13 in IPv4 to 8 in IPv6. This simplified header allows routers to process IP datagrams more efficiently and therefore to obtain higher routing speeds.

• Improved handling of IP options. Rather than include options in the IP header itself as IPv4 does, IPv6 has separate option headers. With one exception, intermediate routers need not examine these headers, again adding to routing speed.

• Flow labels. The IPv6 header includes a 20-bit flow label that allows IP packets to be identified as belonging to a particular flow. This capability enables differentiated handling of a series of IP datagrams belonging to the same flow.


The IPv6 Address Model

IPv6 has three types of addresses:

1. Unicast addresses. These addresses refer to a unique interface on a unique node. They serve the same function and are similar to the unicast addresses of IPv4.

2. Anycast addresses. These addresses refer to a set of interfaces, usually on different nodes. A datagram sent to an anycast address is delivered to the nearest interface having that address, where ‘‘nearest’’ is determined by the routing protocols.

3. Multicast addresses. Datagrams sent to a multicast address will be delivered to all interfaces having that address.

Be aware that under IPv6, there are no broadcast addresses. The functions of the IPv4 broadcast address types are assumed by multicast addresses in IPv6.

Because IPv6 addresses are so long, we use a different notation to write them. The nominal textual representation is x1:x2:x3:x4:x5:x6:x7:x8, where each xi represents one of the eight 16-bit pieces of the address. Typical examples are

fe9c:ba78:0:0:0:400:33ac:20
2:0:0:0:204:faff:fe41:d630
0:0:0:0:0:0:0:1

As we see in the examples, we can simplify the addresses by not writing the leading 0s in each field. We can further simplify them by replacing the longest run of 0s with ::. Thus, we could write the three preceding examples as

fe9c:ba78::400:33ac:20
2::204:faff:fe41:d630
::1
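These compression rules are exactly what the standard inet_pton() and inet_ntop() functions apply, as this small illustrative sketch shows for the three addresses above.

#include <stdio.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    const char *examples[] = {
        "fe9c:ba78:0:0:0:400:33ac:20",
        "2:0:0:0:204:faff:fe41:d630",
        "0:0:0:0:0:0:0:1",
    };
    struct in6_addr addr;
    char buf[INET6_ADDRSTRLEN];

    for (int i = 0; i < 3; i++) {
        /* Parse the long form, then print the canonical compressed form. */
        if (inet_pton(AF_INET6, examples[i], &addr) != 1) {
            fprintf(stderr, "bad address: %s\n", examples[i]);
            continue;
        }
        inet_ntop(AF_INET6, &addr, buf, sizeof buf);
        printf("%-30s -> %s\n", examples[i], buf);
    }
    return 0;
}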

The last address in the examples is the loopback address. It serves the same function as 127.0.0.1 does in IPv4. The structure of a typical unicast address is shown in Figure 2.36.

[Figure 2.36 An IPv6 Unicast Address: a routing prefix of n bits, a subnet ID of m bits, and an interface ID of 128 − n − m bits.]

The routing prefix corresponds to the IPv4 network ID. It is used to route datagrams to a particular network, or site in IPv6 terminology. The subnet ID is used to identify individual segments, or links in IPv6 terminology. A site’s internal routers can use this field to route datagrams to the proper link. The interface ID is unique for each interface on the link and corresponds to the host ID in IPv4. This field is required to be 64 bits unless the address begins with the binary sequence 000.


The IPv6 addressing architecture is specified in RFC 3513 [Hinden and Deering 2003]. Specific aspects of addresses, such as the formation of the interface ID, are specified in a series of other RFCs referenced in RFC 3513.

The IPv6 Header

The IPv6 header, shown in Figure 2.37, is simpler than the IPv4 header. Because the header has a fixed size, there is no header-length field. Because IPv6 headers are not checksummed, there is no checksum field. Because fragmentation is handled with a separate extension header, there are no identification, fragment-offset, DF, or MF fields.

[Figure 2.37 The IPv6 Header: version, traffic class, flow label, payload length, next header, hop limit, 128-bit source address, 128-bit destination address.]

The version field identifies the datagram as a version 6 datagram; that is, the version field is set to 6. The traffic class and flow label fields are used to provide special handling for IP datagrams. The use of these fields is still experimental, and they will not play any role in this text.

The payload length field is the length of the datagram exclusive of the header. Because the header size is fixed at 40 bytes, the size of the entire datagram is implicitly known from this value. The payload length includes any extension headers present in the datagram.

The next header field is analogous to the protocol field in the IPv4 header. If the datagram has no extension headers, this field will specify an upper-layer protocol, such as TCP or UDP, or a network-layer protocol, such as ICMP.


If the datagram does have extension headers, this field will specify the extension header immediately following the IP header.

The hop limit field is the same as the TTL field in IPv4. It is used to prevent IP datagrams from circulating in the network indefinitely.

The source and destination address fields contain the 128-bit addresses of the node originating the datagram and the node to which it is being sent. If a routing extension header is present, the destination address may not be that of the ultimate destination.

Because the IPv6 header is not checksummed, the use of a pseudoheader is even more important than in IPv4. The IPv6 pseudoheader is shown in Figure 2.38. As in IPv4, this pseudoheader is prepended to TCP and UDP packets for purposes of calculating the checksum only.

[Figure 2.38 The IPv6 Pseudoheader: source address, destination address, upper-layer packet length, zero, next header.]

IPv6 Extension Headers

In IPv6, optional network-layer information is encoded into separate data structures called extension headers. The IPv6 specification, RFC 2460 [Deering and Hinden 1998], lists six types of extension headers. Two of these—the authentication header (AH) and the encapsulating security payload (ESP) header—are part of IPsec; we will discuss them in Part 3. The other four deal with hop-by-hop options, destination options, routing, and fragmentation.

Except for the hop-by-hop option header, no node except that specified in the destination address field need examine or deal with these headers in any way.


Because the hop-by-hop extension header must immediately follow the IPv6 header, a router need merely look at the next header field in the IPv6 header to know whether or not it need concern itself with the extension headers. Currently, no hop-by-hop options are defined except for the padding options used by all the headers to align data on the appropriate boundary.

The routing extension header plays a role similar to that of source routing in IPv4. This header allows the sender to specify a series of intermediate nodes through which the datagram must pass.

Fragmentation in IPv6 differs from that in IPv4. First, IPv6 datagrams are never fragmented by intermediate nodes; hence the lack of a DF bit. In IPv6, only the source host can fragment a datagram. This implies that path MTU discovery is mandatory in IPv6. Second, when a datagram is fragmented, the information about each fragment is in the fragmentation extension header. This header contains fragment-offset, MF bit, and identification fields, just as in IPv4.

The destination extension header is examined by the destination node or nodes. Like the hop-by-hop extension header, no nontrivial destination options are currently defined.

In addition to the fields necessary to perform their functions, each extension header has a next-header field. Nodes use this field to determine the meaning of the data following the extension header: another extension header, a network-layer protocol header, or a transport-layer protocol header. Figure 2.39 shows a typical IPv6 datagram with extension headers and a TCP segment.

[Figure 2.39 An IPv6 Datagram: IPv6 hdr. | ext. hdr. 1 | ext. hdr. 2 | ... | ext. hdr. n | TCP hdr. | data]

The complete specification of IPv6 is spread among several RFCs. For further information, see the Web site of the IETF (Internet Engineering Task Force) Working Group on IPv6.

2.12 Routing

Later in the text, we will need some rudimentary information about routing protocols, so let's briefly review a few basics here. Our goal is not to examine any particular routing protocol in depth but merely to understand the various types of routing protocols and how they interoperate to provide routers with the information they need to choose the next hop for an incoming IP datagram. See [Comer 2000] for an excellent summary of routing and routing protocols. For detailed discussions of the algorithms, their strengths and weaknesses, and how they work, see [Huitema 2000] and [Perlman 2000]. As we've seen, hosts usually have routing (forwarding) tables that contain at most a few routes to specific networks or hosts, and a default route for all other traffic. Indeed, most hosts probably have only a default route, so that all traffic that is not bound to a destination on the same network is sent to a single router that forwards it on to its
destination. Typically, the default route and any others a host might need are static; that is, they are set manually by the system administrator or user of the host. In a small organization, the routers could be configured statically too, but it's obvious that this solution doesn't scale beyond a very small number of routers. That realization brings us to the central question of this section: How do the routers learn the routes they use to populate their routing tables? The obvious answer is that the routers exchange information about which networks they can reach, but that answer brings its own questions. For example, if an organization has a router sitting on one of its networks, how many other routers should it exchange routes with? It's obviously impractical for the router to exchange routes with every other router on the Internet, so where do we draw the line? Let's consider an organization, such as an ISP, that manages a large number of networks. Figure 2.40 represents a small part of the total network topology of the organization.

[Figure 2.40 Part of the Network Topology for a Large Organization: routers R1–R5 interconnecting networks 1–5, with R1 also connected to external networks]

Suppose that a host on network 4 wants to send an IP datagram to a host on network 3. Notice that the datagram could take two possible paths: R4 →R2 →R3 and R4 →R5 . R4 must have enough information to choose the best next hop—presumably, R5 in this case. In order to do this, R4 has to have at least a partial understanding of the topology of the network. Similarly, the other routers must understand the network topology well enough to choose the best path for datagrams. Router R1 , for example, must have an entry in its routing table for each of the internal networks so that it can deliver datagrams coming in from the outside. To obtain this information, the internal routers exchange messages telling each other what networks they can reach and perhaps how many hops away each network is.


On the other hand, outside routers don’t need to have any understanding of the organization’s internal network topology. They need know only that datagrams destined for any of the organization’s networks should be sent to R1 . Thus, R1 is the only router that needs to exchange routes with outside routers. This leads us to the observation that routing information exchanged within an organization’s networks is likely to be different from the routing information exchanged with outside routers. The information exchanged internally must reflect the network topology, whereas the information exchanged with outside routers need merely contain reachability data. Autonomous Systems As we’ve seen, internal and external routers need different types of routing information. The term autonomous system (AS) formalizes the notion of internal. An autonomous system is a collection of networks and routers under the control of a single authority. Each autonomous system is assigned a number by the Internet Assigned Numbers Authority (IANA). Examples of autonomous systems in the United States are Harvard (11), Earthlink (3703), and Novell (3680). For our purposes, the importance of autonomous systems is that the authority in charge of each system is totally responsible for its internal topology and routing. Each autonomous system chooses one or more routing protocols and any policies that apply to those protocols. To an outsider, an autonomous system appears, in the words of RFC 1772 [Rekhter and Gross 1995], ‘‘to have a single coherent interior routing plan and presents a consistent picture of which destinations are reachable through it.’’ As suggested in Figure 2.40, each autonomous system has one or more gateways, called border routers, that exchange traffic with other autonomous systems. Autonomous systems exchange traffic at network access points (NAPs), which are facilities where several autonomous systems maintain border routers on a common network so that they can exchange traffic with one another. Examples of NAPs are MAE-East (Washington DC), Chicago NAP, LINX (London), MAE-Paris, CATNIX (Barcelona), and several others throughout the United States and the world. In addition, some ISPs have private peering agreements that take place outside of NAPs.

A routing protocol that runs within an autonomous system is called an interior gateway protocol (IGP); one that exchanges routes between autonomous systems is called an exterior gateway protocol (EGP). The next two subsections briefly examine each type of protocol.

Interior Gateway Protocols

Autonomous systems use interior gateway protocols to enable internal routers to optimize their forwarding decisions by learning the best next hop for IP datagrams. In order to make optimum decisions, each router must understand the network topology at least to the extent that it can assign a cost to each next hop. For example, let us assume that the routers in Figure 2.40 measure the cost of a route by the number of hops a datagram must take to reach its final destination. If router R4 needs to forward a
datagram to network 3, it would choose the path through R5 with a cost of 2 rather than the path through routers R2 and R3 with a cost of 3. Note that R4 needn’t understand the exact route the datagram will take after the next hop, merely that one next hop will yield a lower-cost route than the other. There are two main classes of IGPs: distance-vector protocols and link-state protocols. In distance-vector protocols, as exemplified by the Routing Information Protocol (RIP) [Hedrick 1988] and RIP2 [Malkin 1994], each router sends its neighboring routers a copy of its routing table, which includes the cost to reach each destination network. More precisely, each router sends its neighbors certain columns from its routing table. There is no need, for example, to send the link that a certain network is reachable through; the network and cost to reach it are sufficient.

In this way, each router builds up a routing table that contains the next hop and cost to reach each network. In Figure 2.40, for example, R5 would inform R3 and R4 that it can reach network 5 in one hop. R3 would enter a route to network 5 in its routing table with a next hop of R5 and a cost of 2 and then inform R1 of its routes. Router R1 would make an entry in its routing table for network 5 with a next hop of R2 and a cost of 3. These routers may learn better routes to network 5 as they hear from other routers, so these initial entries may be replaced as the routing protocol converges to a final state. Each time a router updates its forwarding table, it must send the updated table to its neighbors. Several problems with distance-vector protocols limit their use to relatively small networks. First, because each router sends its neighbors information proportional to the size of its routing table, the traffic generated by the routing protocol itself can become significant. Second, distance-vector protocols can be slow to converge after a network change— suppose that the direct line between R4 and R5 of Figure 2.40 goes down, for instance— and it’s easy for routing loops to develop unless special care is taken to prevent them. Finally, each router depends on its neighbors to calculate part of its routing table. In this sense, the calculation of each router ’s forwarding table is a distributed process, and if one router makes an error in calculating its routes, it can affect the routing tables of all the routers. More formally, each router takes part in a distributed Bellman-Ford algorithm calculation [Ford and Fulkerson 1962] to arrive at a final common routing table.
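A toy distance-vector update illustrates the idea. The router and network names, link costs, and advertised routes below are invented to mirror the Figure 2.40 example; real protocols such as RIP add timers and loop-avoidance machinery on top of this.

    INF = float("inf")

    def dv_update(my_table, advertised, link_cost):
        # One Bellman-Ford relaxation step.
        # my_table:   dest -> (cost, next hop) currently known by this router
        # advertised: neighbor -> {dest: cost} as received from that neighbor
        # link_cost:  neighbor -> cost of the direct link to that neighbor
        new_table = dict(my_table)
        for neighbor, routes in advertised.items():
            for dest, cost in routes.items():
                candidate = link_cost[neighbor] + cost
                if candidate < new_table.get(dest, (INF, None))[0]:
                    new_table[dest] = (candidate, neighbor)
        return new_table

    # R4 hears advertisements from R2 and R5 (costs invented to match the text's example).
    table = dv_update(
        {"net4": (0, None)},
        {"R2": {"net1": 1, "net3": 2}, "R5": {"net5": 0, "net3": 1}},
        {"R2": 1, "R5": 1},
    )
    print(table["net3"])    # (2, 'R5') -- the cheaper of the two candidate routes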

Link-state, or shortest path first, protocols take a different approach. Each router broadcasts a list of currently reachable directly connected routers and networks and the cost of reaching them; this process is called flooding. Note that the amount of information that each router transmits is proportional to the number of interfaces on the router, not the size of its routing table. Each router then uses this information to build a directed graph representing the network. For example, with the network of Figure 2.40, such a graph would be similar to Figure 2.41. Each router can now independently calculate the shortest path to each of the other nodes in the graph, using Dijkstra’s algorithm [Dijkstra 1959] or a similar method.
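The per-router computation is essentially Dijkstra's algorithm over the flooded topology. The sketch below uses a unit-cost graph that is our guess at the router-to-router links implied by Figures 2.40 and 2.41, so the exact adjacencies are assumptions.

    import heapq

    def dijkstra(graph, source):
        # Least-cost distance from source to every node; graph maps node -> {neighbor: cost}.
        dist = {source: 0}
        queue = [(0, source)]
        while queue:
            d, node = heapq.heappop(queue)
            if d > dist.get(node, float("inf")):
                continue                      # stale queue entry
            for neighbor, cost in graph[node].items():
                nd = d + cost
                if nd < dist.get(neighbor, float("inf")):
                    dist[neighbor] = nd
                    heapq.heappush(queue, (nd, neighbor))
        return dist

    graph = {
        "R1": {"R2": 1},
        "R2": {"R1": 1, "R3": 1, "R4": 1},
        "R3": {"R2": 1, "R5": 1},
        "R4": {"R2": 1, "R5": 1},
        "R5": {"R3": 1, "R4": 1},
    }
    print(dijkstra(graph, "R4"))    # R3 comes out at cost 2, matching the text's example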

[Figure 2.41 A Graph of the Network in Figure 2.40: a graph whose nodes are the routers R1–R5, the networks N1–N5, and the external networks, connected as in Figure 2.40]

By "shortest path" we mean the path of least cost. Several metrics are possible for measuring cost. We could use hops, as with RIP, or some other measure such as delay, link speed, monetary cost, or reliability.

The most common link-state routing protocol is the Open Shortest Path First (OSPF) protocol [Moy 1998a, Moy 1998b]. Because it scales better than distance-vector protocols and can divide an autonomous system into semi-independent areas that operate within the autonomous system, much like the autonomous systems operate within the greater Internet, OSPF is the IGP of choice for all but the smallest networks. OSPF has other optimizations and capabilities as well. See the preceding references for complete details. The IS-IS (intermediate system to intermediate system) protocol, another link-state routing protocol, is very similar to OSPF but for historical reasons is used primarily by ISPs, whereas OSPF is used primarily on customer networks. For more information on IS-IS and how it differs from OSPF, see [Perlman 2000]. A third choice for routing within autonomous systems is the Enhanced Interior Gateway Routing Protocol (EIGRP). EIGRP is a Cisco proprietary protocol in the distance-vector family. Although similar to RIP, EIGRP has many improvements that help prevent loops from forming and avoids some of the other problems with distance-vector protocols while maintaining much of their simplicity.

Exterior Gateway Protocols

In the history of the Internet, several exterior gateway protocols have been used. Given the architecture of the Internet today, however, virtually all border routers use the Border Gateway Protocol (BGP) [Rekhter and Li 1995]. In the early days of the Internet, there was a backbone network to which the other networks connected in a manner similar to the way hosts connect to an Ethernet backbone today. With that topology, simpler exterior routing protocols, such as the Exterior Gateway Protocol (EGP), were
used. Today’s topology of autonomous systems exchanging traffic at a series of NAPs or through private peerings requires a more complicated protocol.

BGP views the Internet as a graph where the nodes are the autonomous systems and the edges are the links between them. The internal topology of an external AS plays no role in BGP. Indeed, an AS is generally ignorant of a neighboring autonomous system’s internal structure except for the networks that it contains. As a first approximation, border routers use BGP to tell a peer border router in a neighboring AS what networks it can reach. These networks may be interior networks—those belonging to the router ’s AS—or they may be networks belonging to another AS that the border router knows how to reach. When a border router receives routes from a peer, it redistributes them to the interior routers, using whatever IGP the AS is running. Our first-approximation description covers the simplest case of an autonomous system having a single border router. In this case, the border router has no real choices to make. It collects reachability information from its external peers and distributes it to the internal routers. Because it is participating in the IGP of the AS, the border router knows what internal networks are reachable, so it informs its external peers of these networks so that external hosts can reach them. It may be that the manager of the AS wants to keep some or all of the internal networks private. In this case, the border router can be configured not to advertise them to its external peers. Likewise, it is conceivable that the manager may wish to prohibit internal access to certain external networks. Again, the border router can be configured to ignore these routes and not distribute them to the internal routers. Other than these exceptional cases, the single border router merely exchanges reachability information with its external peers. Now let’s consider a slightly more complicated case. Figure 2.42 shows parts of three autonomous systems, each having two border routers. Notice that from the point of view of a router within AS1 , network N 3 in AS3 is reachable through both routers B1 and B2 . From the figure, it appears that, all links being equal, the route through B2 is preferable. Thus, only router B2 ’s route to N 3 should be advertised to the internal routers. We might think that both B1 and B2 could advertise their routes but with different metrics. The problem is that BGP doesn’t have metrics in the sense that RIP and OSPF do. It’s not difficult to see why: Different autonomous systems can use different metrics within their networks, and it may not make sense to compare them. For example, suppose that AS2 uses hop count as a metric but that AS3 uses link speed. What metric should B3 report to B1 for network N 3 ? Similarly, how could an internal router, such as I 1 , compare the metrics it got from B1 and B2 ?

[Figure 2.42 Parts of Three Autonomous Systems: border routers B1–B6 and interior routers I1–I3 spread across AS1, AS2, and AS3; network N3 lies in AS3]

This raises the question of how a border router knows whether to advertise a route it has learned from one of its peers. In our example from Figure 2.42, B1 should not advertise a route to network N3, even though N3 is reachable through B1. The answer is that the internal border routers talk to each other. Ideally, the BGP instance on each border router establishes a TCP connection with every other border router in the autonomous system so that the connections between the routers form a full mesh. This may not be practical for very large autonomous systems, so sometimes the AS manager will configure the routers to form only a partial mesh. These connections are logical TCP connections, not necessarily direct physical connections. Indeed, the border routers are likely to be on separate networks in different, widely dispersed geographical locations. BGP is an example of a path-vector protocol. Each BGP route advertisement includes additional information about the route, the most important of which is a list of the autonomous systems that must be traversed to reach the destination. This list of autonomous systems is referred to as a path. When BGP receives path information to a network in an external AS, it chooses the best path to that destination among all the paths that it knows about. This best path is distributed to the other internal BGP routers. The factors considered in choosing a best path are determined by BGP configuration. RFC 1772 [Rekhter and Gross 1995] lists such possible considerations as


• AS count—the number of autonomous systems in the path
• Policy considerations—a decision by the AS manager not to use routes through a certain AS or a preference for one AS over another
• Path origin—where the route was learned from: BGP, an IGP router, or another EGP
• AS path subsets—generally, preference for a path that is a subset of another
• Link dynamics—preference for stable paths over unstable paths

The other internal BGP routers do the same thing, of course, so that at the end of this process, each BGP router will have a list of candidate paths to the destination. Each router then calculates the most-preferred path, which is distributed to the IGP routers and the external BGP routers. All internal BGP routers perform the same best-path calculation, so they will agree on the preferred path to the destination, and will advertise the same path.
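As a toy illustration only (real BGP weighs all of the factors above plus local policy), the following sketch picks among candidate advertisements by AS-path length; the router names and paths are hypothetical.

    # Candidate AS paths to a destination, keyed by the advertising border router;
    # both the router names and the paths are made up.
    candidates = {
        "via B1": ["AS2", "AS3"],
        "via B2": ["AS3"],
    }
    best = min(candidates, key=lambda route: len(candidates[route]))
    print(best, candidates[best])    # via B2 ['AS3']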

2.13

Summary In this chapter, we have made a quick tour of the TCP/IP protocols and their data formats. We began by discussing layering and encapsulation and their roles in networking architecture. These conceptual tools allow us to divide the network’s functions into discrete sets, or layers, that can be more easily understood in isolation. Communication between layers follows a well-defined API (application programming interface) that allows changes to one layer without affecting the others. In our examination of IP addressing, we saw that the classical division of addresses into five classes leads to the growth of routing tables and the depletion of globally routable addresses. CIDR helps solve these problems by generalizing classful addressing to allow fine-grained control over the boundary between the network and host IDs in the address. Although classful addressing is still mandated by the IETF as the standard, virtually all addressing uses CIDR. Next we studied the IP, TCP, UDP, and ICMP protocols. These protocols make up the core of the TCP/IP suite and are the foundation on which applications are built. While discussing these protocols, we studied their operation on the wire by examining tcpdump traces. Although examining the protocols at this level may seem like overkill in an overview, it helps us to develop the skill of reading tcpdump output and dissecting network packets. We will use these skills often as we discuss tunneling and VPNs in the rest of the text. We continued our study of addressing by considering RFC 1918 private addresses and NAT. Although NAT has several undesirable effects, it is a useful and popular tool to conserve IP global addresses, foster independence from any particular ISP, and reduce costs by allowing several hosts to share the same global IP address. The combination of NAT and CIDR is providing a short-term solution to the IP address depletion problem until the long-term solution, IPv6, can be deployed. PPP is, of course, just another interface layer protocol like Ethernet or Token Ring, but because it is often used as a vehicle for IP datagrams in tunnels, we made a cursory examination of it. We usually think of PPP as a protocol running on serial links, but it

can run over a variety of media. PPP combines the simple point-to-point serial-link protocol of SLIP with the framing and some of the functionality of HDLC. Next, we examined IPv6. Although we will not dwell much on IPv6, it is important to understand some of its features. This is particularly true because IPsec, which we discuss in Part 3, is part of IPv6. Finally, we briefly discussed routing. We introduced the notion of autonomous systems and noted that the interior gateway protocols used within an AS differ from the exterior gateway protocols used among autonomous systems.

Exercises

2.1  Consider a host that supports IP and other networking protocols, such as IPX or AppleTalk. Each of these protocols will have its own stack. Suppose that the protocols all use the same Ethernet interface. Investigate the Ethernet frame format and discover how the operating system is able to tell which stack it should deliver the data to.

2.2  What is the identification number of the IP datagram in Figure 2.4?

2.3  How can IP differentiate the first fragment of a fragmented IP datagram, which will have a fragment offset of 0, from an unfragmented datagram?

2.4  IP does not reassemble a fragmented datagram until it has received all the fragments. How does IP know when it has received all the fragments? Remember that the fragments can arrive out of order.

2.5  We said that UDP checksums are optional and that a sender signals the receiver that it is not using a checksum by setting the checksum field in the UDP header to 0. How can a receiver tell the difference between "no checksum" and a UDP datagram that happens to checksum to 0? Hint: Internet checksums are calculated using ones-complement arithmetic.

2.6  Ignoring any implementation-specific or MTU restrictions, what is the maximum amount of data that can be transmitted in a UDP datagram?

2.7  Suppose that hosts A and B have established a TCP connection but are not exchanging data, and that host A crashes and reboots. Is there still a connection? Do hosts A and B agree with your answer?

2.8  What will the routing table for host 4 of Figure 2.9 look like?

2.9  We said that only the hop-by-hop extension header in IPv6 needs to be examined by each router. Why doesn't the routing header need to be examined by each router also?

2.10  How do the IGP distance-vector protocols differ from the BGP path-vector protocol?

2.11  BGP views the Internet abstractly as a graph, with the nodes being autonomous systems. This suggests that a link-state protocol might be an appropriate way of introducing more precise metrics into EGP routing. What problems do you think a link-state protocol might encounter?

2.12  As with any other routing protocol, BGP must take steps to ensure that routing loops do not develop. How can BGP use the path information associated with an advertised network to avoid such loops?


3

Cryptography Overview

3.1 Introduction

Cryptography is a vast and difficult field that we cannot hope to cover in this book. Nevertheless, an understanding of the basics and an appreciation of some of the subtleties of cryptography are necessary for an understanding of what is to follow. Fortunately, just as with TCP/IP, several excellent texts are available for those who wish to delve deeper into its mysteries—see [Schneier 1996], [Ferguson and Schneier 2003], and [Menezes, Oorschot, and Vanstone 1996], for example. In this chapter, we are concerned mainly with three major subjects:

1. Encryption/decryption
2. Message authentication codes
3. Digital signatures

Along the way, we shall also examine Diffie-Hellman key exchange and the use of certificates for authentication and key exchange. As we shall see, there are two main types of ciphers used for encryption: symmetric and asymmetric. These two types generally serve different purposes but work together to provide a total solution. The symmetric ciphers are again divided into two main classes: block and stream. Both of these classes have their strengths and weaknesses, and we shall examine examples of each as we go along. This chapter is more mathematical than the others, so some readers may want to skip the details and take in just the major points. On the other hand, the chapter covers the bare minimum needed for a reasonable understanding of modern cryptography, so interested readers may want to consult the references cited in the chapter.

3.2 Symmetric Ciphers

Symmetric ciphers get their name from the fact that the same key is used for both encryption and decryption. A trivial—and horribly insecure—example is a cipher that merely performs an exclusive-OR (⊕) of each byte of a message with a corresponding byte of the key (see Figure A.1). For example, if our key is 0123, we would encrypt and decrypt the message trivial as shown in Figure 3.1.

                                  ASCII     HEX
    Plaintext                     trivial   74 72 69 76 69 61 6c
    Key                           0123012   30 31 32 33 30 31 32
    Plaintext ⊕ key = ciphertext  DC[EYP^   44 43 5b 45 59 50 5e
    Key                           0123012   30 31 32 33 30 31 32
    Ciphertext ⊕ key = plaintext  trivial   74 72 69 76 69 61 6c

Figure 3.1 A Trivial Symmetric Cipher
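The figure's arithmetic can be reproduced in a few lines of Python; this is purely a demonstration of the XOR operation, not something to use for real encryption.

    def xor_cipher(data, key):
        # Repeating-key XOR: the same call both encrypts and decrypts.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    ciphertext = xor_cipher(b"trivial", b"0123")
    print(ciphertext)                       # b'DC[EYP^'
    print(xor_cipher(ciphertext, b"0123"))  # b'trivial'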

In this case, the decryption operation (exclusive-OR) as well as the key are the same as for encryption, but in many symmetric ciphers, only the key is the same. Interestingly, even though this cipher is trivial to break (see [Dawson and Nielsen 1996] and Section 1.4 of [Schneier 1996]), it is almost identical to the only known provably unbreakable cipher: the one-time pad. The crucial difference is that with the one-time pad, the key is never repeated. This implies, of course, that the key must be at least as long as the message and that each message must have its own key.

Stream Ciphers Our trivial cipher is an example of a stream cipher. The distinguishing characteristic of stream ciphers is that they operate on the plaintext one character at a time, and that the same character does not necessarily encrypt to the same cipher text each time it is encountered. We can see an example of this last phenomenon in Figure 3.1, where the first i of trivial encrypts to [ and the second i encrypts to Y. When we say character in the preceding definition, we are deliberately speaking loosely. Stream ciphers can operate on individual bits, on bytes, or even on 32-bit ‘‘characters.’’ The point is that stream ciphers operate on relatively small units of information and may encrypt a given character differently at different places in the text stream.

Probably the most common stream cipher—owing to its use in SSL (Secure Sockets Layer), discussed in Chapter 6—is RC4. The idea behind RC4 is to generate a sequence of pseudorandom bytes, called the key stream, that can be exclusive-ORed into the plaintext, as we did in Figure 3.1. Because the sequence of bytes in the key stream does not repeat for a very long time, it is not vulnerable to the easy attacks that a cipher based on the repeated exclusive-ORing of a key into the plaintext is. Because that sequence is not truly random, it does not offer the perfect security of a one-time pad. RC4 was invented in 1987 by Ron Rivest at RSA Data Security, Inc. The RC4 algorithm is proprietary to RSA Security, and the details of its operation have never been publicly revealed.


What we are about to describe is a reverse engineered version sometimes called alleged RC4. Independent observers with access to the licensed RC4 code have confirmed that alleged RC4 is, in fact, the same algorithm.

RC4 is simple to describe and implement. Indeed, there is a three-line Perl implementation of it. The algorithm has an initialization, or key-scheduling, phase, in which the initial state of the pseudorandom byte generator is set up. In the second phase, the key stream is generated and exclusive-ORed into the plaintext to produce the ciphertext. RC4's internal state consists of a 256-byte array, S, of unique 8-bit values, and two counters, i and j. The key-scheduling algorithm takes as input the sequence 〈Ki〉, which is the key concatenated with itself repeatedly to make a sequence of length 256 bytes, and performs the following three operations:

    1. m = 0
       For n = 0 . . . 255
           Sn = n
    2. For n = 0 . . . 255
           m = (m + Sn + Kn) mod 256
           swap Sn and Sm
    3. i = j = 0

Once the key-scheduling algorithm initializes the internal state, a random byte, R, is generated by:

    i = (i + 1) mod 256
    j = (j + Si) mod 256
    k = (Si + Sj) mod 256
    swap Si and Sj
    R = Sk

These steps are iterated to produce a sequence of pseudorandom bytes 〈Ri〉. Encryption proceeds by exclusive-ORing the i-th byte of plaintext with the corresponding random byte. That is, Ci = Pi ⊕ Ri, where the 〈Pi〉 are the plaintext bytes and the 〈Ci〉 are the ciphertext bytes. Decryption works in exactly the same way. We show a python implementation of RC4 in Figure A.2. Although RC4 is simple and may even seem ad hoc, it is an excellent algorithm with many desirable cryptographic properties. It is also very fast. RSA claims speeds of 1 MB/second even on a 33 MHz machine [Robshaw 1995], and the OpenSSL speed benchmark reports speeds on the order of 60 MB/second on a 1.6 GHz Pentium 4. In 2004, Marc Bevand achieved speeds of 319 MB/second on the AMD64 processor. See [Mister and Tavares 1999] for an analysis of RC4's cryptographic properties. An important vulnerability in RC4's use with wired equivalent privacy (WEP) is discussed in [Fluhrer, Mantin, and Shamir 2001]. [Roos 1995] identifies a class of weak keys and recommends the first few bytes of RC4 output be discarded. A cautionary tale involving the misuse of RC4—in this case, reusing the key stream—can be found in [Stevenson 1995].
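The steps above translate almost directly into Python. The sketch below is ours (it is not the book's Figure A.2) and simply mirrors the key-scheduling and byte-generation operations just listed; the test strings are arbitrary.

    def rc4(key, data):
        # Key-scheduling algorithm (steps 1-3 above).
        S = list(range(256))
        m = 0
        for n in range(256):
            m = (m + S[n] + key[n % len(key)]) % 256
            S[n], S[m] = S[m], S[n]
        # Key-stream generation, XORed into the data; decryption is the same call.
        i = j = 0
        out = bytearray()
        for byte in data:
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            out.append(byte ^ S[(S[i] + S[j]) % 256])
        return bytes(out)

    ct = rc4(b"Key", b"Plaintext")
    print(ct.hex())              # bbf316e8d940af0ad3 (a widely published test vector)
    print(rc4(b"Key", ct))       # b'Plaintext'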


Block Ciphers

As their name implies, block ciphers operate on fixed-sized blocks of data. The traditional size was usually 64 bits, but more recent block ciphers use block sizes of 128, 192, or even 256 bits. A block cipher takes an N-bit block of plaintext as input and outputs an N-bit block of ciphertext. At first glance, it's difficult to see the difference between stream and block ciphers except for the larger unit of data operated on by block ciphers, and, indeed, each can take on the characteristics of the other. The salient difference is that whereas stream ciphers might encrypt two occurrences of the same character differently, depending on their position in the text stream, block ciphers will always encrypt the same block of bits the same way. In this sense, block ciphers are really just a "code book" that specifies an encrypted block for each plaintext block, and we could, in principle, implement them as a table lookup. As a practical matter, this isn't feasible, because such a code book would require two arrays of 2^N entries of N bits each. For the reasons discussed in the previous paragraph, block ciphers are said to operate in electronic code book (ECB) mode. This mode has some security problems because many messages have data in common, and it is sometimes possible to infer information about a message that has a block in common with another message. That is, if two messages have the same block of plaintext, they will have the same block of ciphertext. To avoid these problems, we can add an extra step to the encryption process. We choose an arbitrary block, called an initialization vector (IV), and exclusive-OR it with the first plaintext block before we encrypt it. Before encrypting the second plaintext block, we exclusive-OR it with the first cipher block, and so on. That is, if EK is the block encryption algorithm with key K, we encrypt the message as

    CB0 = EK(PB0 ⊕ IV)
    CBi = EK(PBi ⊕ CBi−1),  i > 0

where IV is the initialization vector, 〈PBi〉 is the sequence of plaintext blocks, and 〈CBi〉 is the sequence of cipher blocks. This is called cipher block chaining (CBC) mode. The IV need not be kept secret, although it often is, but it must be different for every message. Most experts recommend using a random value for the IV. Note that even if two plaintext blocks are the same, they will almost certainly encrypt to different blocks because they will be exclusive-ORed with different blocks first. The CBC mode process is illustrated in Figure 3.2. Decryption is similar, except that the IV and cipher blocks are exclusive-ORed after the decryption step.

[Figure 3.2 Cipher Block Chaining: the IV is exclusive-ORed with PB0 before encryption; each subsequent plaintext block PBi is exclusive-ORed with the previous ciphertext block CBi−1 before being encrypted to produce CBi]
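To make the chaining concrete, here is a minimal sketch of CBC encryption around a stand-in block cipher; toy_encrypt is a placeholder (a real implementation would use AES or another block cipher), and padding of short final blocks is omitted.

    import os

    BLOCK = 16

    def toy_encrypt(block, key):
        # Stand-in for a real block cipher E_K; XOR only keeps the sketch runnable.
        return bytes(b ^ k for b, k in zip(block, key))

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def cbc_encrypt(plaintext, key, iv):
        assert len(plaintext) % BLOCK == 0          # padding is not shown here
        prev, out = iv, b""
        for i in range(0, len(plaintext), BLOCK):
            block = toy_encrypt(xor(plaintext[i:i + BLOCK], prev), key)  # CB_i = E_K(PB_i xor CB_{i-1})
            out += block
            prev = block
        return out

    key, iv = os.urandom(BLOCK), os.urandom(BLOCK)  # fresh random IV for every message
    print(cbc_encrypt(b"sixteen byte blk" * 2, key, iv).hex())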

Another effective way of using block ciphers is called counter (CTR) mode [Dworkin 2001]. With this mode, a counter is encrypted with the block cipher, the resulting n-bit block is exclusive-ORed into the plaintext block, and the counter is incremented. That is,

    CBi = PBi ⊕ EK(CTRi)

where 〈PBi〉 is the sequence of plaintext blocks, 〈CBi〉 is the sequence of corresponding ciphertext blocks, CTRi is the i-th counter value, and EK(X) represents encryption of X with the block cipher, using key K. Notice that CTR mode effectively turns a block cipher into a stream cipher. As we'll see later, CTR mode can have certain advantages over CBC mode, but we must take care to ensure that the key stream is not reused. This means that we can never reuse a counter/key pair. It also means that we cannot allow the counter to wrap, or the key stream will be repeated. Typically, the counter is chosen to have a component that depends on the message, such as a message number, and another component that serves as the actual counter. Another suggested implementation merely takes a random IV as the counter and increments it after every encryption. [Ferguson and Schneier 2003] regard counter mode as the preferred way of using block ciphers. Although they emphasize the dangers involved with its misuse, they believe that it leaks less information than any of the other modes (see Exercise 3.4). The advantages of CTR mode for block ciphers are also discussed in [Lipmaa, Rogaway, and Wagner 2000].
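Counter mode is short enough to sketch as well. The stand-in cipher below is not a real one, and the nonce-plus-counter layout is just one common way to build CTRi, not something the text or [Dworkin 2001] mandates.

    import os
    import struct

    BLOCK = 16

    def toy_encrypt(block, key):
        # Placeholder for a real block cipher E_K.
        return bytes(b ^ k for b, k in zip(block, key))

    def ctr_crypt(data, key, nonce):
        # CTR mode: CB_i = PB_i xor E_K(CTR_i); the same call decrypts.
        out = bytearray()
        for i in range(0, len(data), BLOCK):
            counter_block = nonce + struct.pack("!Q", i // BLOCK)   # per-message nonce + counter
            keystream = toy_encrypt(counter_block, key)
            out += bytes(p ^ k for p, k in zip(data[i:i + BLOCK], keystream))
        return bytes(out)

    key, nonce = os.urandom(BLOCK), os.urandom(BLOCK - 8)
    ct = ctr_crypt(b"any length works, no padding needed", key, nonce)
    print(ctr_crypt(ct, key, nonce))    # round-trips to the plaintext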


DES

The best-known and most-studied block cipher is, of course, the Data Encryption Standard (DES). It has been the worldwide standard for more than 25 years but is now at the end of its life. The problem is not with the algorithm itself—that's held up remarkably well—but with its small key size of 56 bits and, to a lesser extent, its small, by modern standards, block size of 64 bits. It is worthwhile to briefly examine the operation of DES because it illustrates many of the features of block ciphers. The complete specification is given in [NIST 1999], so we will content ourselves with covering the concepts without looking at every detail. [Schneier 1996] gives a fascinating history of DES, and discusses its design and cryptographic properties. At first glance, the DES algorithm seems complex, especially in comparison to the simplicity of RC4. The basic idea, however, is quite simple. First, the 56-bit key is used to generate 16 48-bit round keys. Each round key is obtained from the original by shifting the original key by an amount that depends on the round and then extracting a 48-bit subset. After an initial permutation, an input block of 64 bits is split into two halves, L and R. The right half, R, is combined with one of the round keys by a function traditionally called f. The result is then exclusive-ORed into the left half, L, and L and R are swapped. This is repeated 16 times, after which the inverse of the original permutation is applied to the result. The traditional way of presenting this is shown in Figure 3.3, but it's easier to think of it as the recursion

    Li = Ri−1                                          (1)
    Ri = Li−1 ⊕ f(Ri−1, Ki)                            (2)

where Ki is the i-th round key, and we have ignored the initial and final permutations. Ciphers that divide the input block into two halves and use a recursion such as this are called Feistel ciphers, or Feistel networks. Notice that from (1), we can obtain Ri−1 from Li, and that if we exclusive-OR both sides of (2) with f(Ri−1, Ki), we recover Li−1 from Ri. This means that we can use the same algorithm for encryption and decryption by merely reversing the order of the round keys. That is, encryption uses the sequence K1, K2, . . . , K16, whereas decryption uses the sequence K16, K15, . . . , K1. This fact also explains an apparent anomaly in Figure 3.3. Observe that in the last round, L16 and R16 are not swapped. Thus, an encrypted block can be fed back into the algorithm as is for decryption. This was an important consideration for DES because the original specification required that it be implemented in hardware, and the ability to decrypt by feeding the ciphertext as is into the same algorithm meant that the same circuitry could be used for both encryption and decryption.

The ability to use the same algorithm for encryption and decryption is a characteristic of Feistel ciphers. As long as f (R i−1 , K i ) can be reproduced, it is always possible to invert the encryption using the same algorithm, even if f itself is not invertible. Putting aside the internals of the f function for the moment, we see that each round of DES is reminiscent of RC4. The f function produces some pseudorandom data that is exclusive-ORed into the plaintext stream. There are obvious differences, of course. For one thing, the bits that get exclusive-ORed into Li depend on the plaintext as well as the key. This is an important point; one of the goals of the DES algorithm is to ensure that every bit of the encrypted block depends on every bit of the plaintext block and every bit of the key. Feeding the plaintext into f is a vital part of guaranteeing this.
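The key-reversal trick is easy to see in code. The following sketch is a generic Feistel network, not DES itself: toy_f and the round keys are placeholders, but running the ciphertext back through with the keys reversed recovers the plaintext, just as described above.

    def feistel(block64, round_keys, f):
        # Generic Feistel network over a 64-bit integer block.
        L, R = block64 >> 32, block64 & 0xFFFFFFFF
        for k in round_keys:
            L, R = R, L ^ f(R, k)          # L_i = R_{i-1};  R_i = L_{i-1} xor f(R_{i-1}, K_i)
        return (R << 32) | L               # undo the final swap, as DES does

    def toy_f(r, k):
        # Placeholder round function; any deterministic 32-bit mixing works here.
        return ((r * 2654435761) ^ k) & 0xFFFFFFFF

    keys = [0x1111 * (i + 1) for i in range(16)]
    ct = feistel(0x0123456789ABCDEF, keys, toy_f)
    pt = feistel(ct, list(reversed(keys)), toy_f)
    print(hex(ct), hex(pt))                # pt is 0x123456789abcdef again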

[Figure 3.3 DES Encryption Rounds: after the initial permutation, the block is split into L0 and R0; round i computes Li = Ri−1 and Ri = Li−1 ⊕ f(Ri−1, Ki); after round 16, L16 and R16 are not swapped, and the inverse initial permutation produces the ciphertext block]

The operation of the f function is illustrated in Figure 3.4. First, the expansion permutation rearranges the 32 bits of the R half-block and expands the result to 48 bits by repeating some of the bits. The expansion to 48 bits makes the result match the size of the round key, but more important, it means that half the bits in any Ri will affect multiple bits in the output of f. Next, the round key is exclusive-ORed with the output of the expansion permutation, and the resulting 48 bits are split into eight groups of 6 bits. Each 6-bit group is fed into one of eight substitution boxes, or S-boxes. Each S-box maps its input into 4 output bits that are combined with the output of the other S-boxes to create a 32-bit result. These 32 bits are permuted by the P permutation to form the final output of f.

[Figure 3.4 The DES f Function: Ri−1 (32 bits) passes through the expansion permutation to 48 bits, is exclusive-ORed with the 48-bit round key Ki, and is split into eight 6-bit groups that feed S-boxes S1–S8; their combined 32-bit output is permuted by the P permutation to form f(Ri−1, Ki)]

The S-boxes are the heart of the DES algorithm and are what gives it most of its strength. The structure of the S-boxes is simple; they merely perform a table lookup on the 6-bit input to obtain a 4-bit entry. The values used for those tables are anything but simple, however. They are carefully chosen to have certain cryptographic properties that make DES resistant to cryptanalytic attack. The ever-expanding power of modern computers has rendered DES obsolete. As far back as the late 1970s, Diffie and Hellman speculated that a DES-cracking machine that could break DES by brute force—that is, by trying every possible key—could be built for $20 million. They estimated that the machine would take a day to recover a key. And, indeed, in 1998 the Electronic Frontier Foundation built a DES-cracking machine, called Deep Crack, for less than $250,000. Deep Crack was able to recover the key for RSA's DES Challenge II in less than three days. The design of the machine and the story of its development are recounted in [Electronic Frontier Foundation 1998]. More recently, distributed networks of small general-purpose computers have been used in conjunction with Deep Crack to break DES in less than a day.

Triple DES

As we noted above, the major problem with DES is that the key has only 56 bits. That means that a brute-force attack takes at most 2^56 attempts, a number well within practical limits for today's machines. Triple DES (3DES) deals with this problem by encrypting the plaintext three times with three different keys. Although this yields an effective key size of only 112 bits, rather than the 168 we might expect, it does extend the life of DES for a few more years. Triple DES is usually implemented in encrypt-decrypt-encrypt (EDE) mode, where the first key is used to encrypt a block, the second key is used to decrypt the result of step 1, and the third key is used to encrypt the result of step 2. If EKi is the encryption function for key Ki, and DKi is the decryption function for key Ki, we have

    CB = EK3(DK2(EK1(PB)))
    PB = DK1(EK2(DK3(CB)))

where PB is a plaintext block, and CB is the corresponding encrypted block. EDE mode does not have any cryptographic effect, but it does allow 3DES to be backward compatible with DES by setting K1 = K2 = K3. Note that when the three keys are the same, the result is the same as a single encryption or decryption with DES. Despite partially solving the problem of the small key size of DES, 3DES still has problems. First, it is three times slower than DES, which is already slow when compared to most modern block ciphers. More important, 3DES still uses the 64-bit block size, which yields less security than the larger block size used in recent block ciphers. [Ferguson and Schneier 2003] recommend against using 3DES except for legacy applications that demand it.

AES

In 1997, the National Institute of Standards and Technology (NIST) solicited proposals for a new cipher to replace the aging DES. This new cipher was to be called the Advanced Encryption Standard (AES). NIST specified that AES must support a block size of at least 128 bits and key sizes of at least 128, 192, and 256 bits. Fifteen candidates were submitted, and in August 1999, NIST announced the five finalists.


The criteria for selection of the finalists and for AES itself included the security of the cipher, its performance, and its ease of implementation in hardware and software. On October 2, 2000, NIST revealed its selection of the Rijndael cipher for AES.

Rijndael (AES)

Rijndael supports block and key sizes of 128, 192, and 256 bits. There are 10 to 14 rounds, depending on the key size. The key-scheduling algorithm expands the initial key to N × (R + 1) bits, where N is the block size in bits, and R is the number of rounds. The first N bits are used as the key for round 1, the second N bits for round 2, and so on. The last N bits are used in a final key mixing after the last round. Although both AES and DES use some of the same basic operations, such as S-boxes and permutations, AES is different from DES in several ways. First, AES is byte oriented: All the basic operations are performed on bytes rather than on bits as in DES. Second, AES is not a Feistel cipher. Indeed, decryption uses a different algorithm from encryption. Finally, the number of rounds is variable, depending on the key and block sizes. Figure 3.5 shows a typical AES round for a block size of 128. The rounds for the other block sizes are similar. The input block is delivered to the round as 16 bytes labeled b0, . . . , b15. The output of the round is also 16 bytes, labeled c0, . . . , c15.

[Figure 3.5 One Round of AES: the input bytes b0–b15 are exclusive-ORed with the bytes of the round key K(i), fed through 16 identical S-boxes, rearranged by a byte permutation, and then passed in groups of four through linear mixing functions that produce the output bytes c0–c15]

As the bytes enter at the top of Figure 3.5, they are exclusive-ORed with the 16 bytes of the round key K (i) . The result is fed into 16 identical S-boxes that do a table lookup on the input byte to produce an output byte. Next, the outputs from the 16 S-boxes are permuted by the permutation

    ( 0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 )
    ( 0   5  10  15   4   9  14   3   8  13   2   7  12   1   6  11 )        (3)

where the meaning of (3) is that b0 → b0, b1 → b5, b2 → b10, and so on. Notice that the bytes from each group of four are dispersed uniformly to each of the other three groups. Finally, the output of the byte permutation is divided into groups of 4 bytes, and each group is subjected to a linear mixing function that produces 4 output bytes. Technically, the linear mixing function is the linear transformation

    [ c0 ]   [ 0x02 0x03 0x01 0x01 ] [ b0 ]
    [ c1 ] = [ 0x01 0x02 0x03 0x01 ] [ b1 ]
    [ c2 ]   [ 0x01 0x01 0x02 0x03 ] [ b2 ]
    [ c3 ]   [ 0x03 0x01 0x01 0x02 ] [ b3 ]

over the finite Galois field GF(2^8), but as a practical matter, we can think of the linear mixing function as exclusive-ORing input bytes to produce an output byte.
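For readers who want to see the arithmetic, here is a small sketch of that mixing step for a single 4-byte group; xtime multiplies by 0x02 in GF(2^8) using the AES reduction polynomial, and the function names are ours.

    def xtime(b):
        # Multiply by 0x02 in GF(2^8) with the AES reduction polynomial 0x11B.
        b <<= 1
        return (b ^ 0x1B) & 0xFF if b & 0x100 else b

    def mix_column(col):
        # One 4-byte group multiplied by the matrix above: c0 = 02*b0 ^ 03*b1 ^ b2 ^ b3, etc.
        b0, b1, b2, b3 = col
        return [
            xtime(b0) ^ (xtime(b1) ^ b1) ^ b2 ^ b3,
            b0 ^ xtime(b1) ^ (xtime(b2) ^ b2) ^ b3,
            b0 ^ b1 ^ xtime(b2) ^ (xtime(b3) ^ b3),
            (xtime(b0) ^ b0) ^ b1 ^ b2 ^ xtime(b3),
        ]

    print([hex(v) for v in mix_column([0xDB, 0x13, 0x53, 0x45])])   # ['0x8e', '0x4d', '0xa1', '0xbc']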

The last round differs from the others in that the linear mixing function is skipped, and the output bytes from the permutation are exclusive-ORed with the final 128 bits of the expanded key. The decryption rounds follow the same pattern except that the data flow is from bottom to top, the inverse of each of the operations is used, and the round keys are used in reverse order, as they were with DES. The specification for Rijndael is given in [NIST 2002a]. NIST maintains an AES home page with pointers to the Rijndael home page, implementations, and information about the other AES candidates.

Blowfish

Blowfish is a block cipher developed by Bruce Schneier. Because the algorithm is fast and unencumbered, it's a popular encryption method that is used in several VPNs. Although its 64-bit block size is considered too small by modern standards, many of the applications that we are concerned with are still using Blowfish, so it is worth our while to take a quick look at it. Blowfish uses a variable-length key of up to 448 bits, although 128 bits are more usual. The large key size adds considerable strength to the algorithm compared to DES, which Blowfish was designed to replace. Like DES, Blowfish is a Feistel cipher, and the Blowfish f function uses S-boxes, but the S-boxes depend on the key, which makes cryptanalysis of the algorithm more difficult. The key-scheduling algorithm generates 18 subkeys, P1, . . . , P18, and four S-boxes, S1, . . . , S4, from the key. Each S-box maps an 8-bit input to a 32-bit output. Figure 3.6 shows the operation of the algorithm and also makes its structural similarity to DES clear. Each of the 16 rounds consists of simply exclusive-ORing its left input with one of the 18 subkeys, and its right input with the output of the f function. The f function is also simple. It uses only the S-boxes, exclusive-ORs, and addition modulo 2^32.

[Figure 3.6 Blowfish Encryption Rounds: a 16-round Feistel structure in which each round exclusive-ORs one half of the block with a subkey Pi and the other half with the output of the f function; subkeys P17 and P18 enter after the last round]

This simplicity is what gives Blowfish its speed, especially on modern 32-bit processors. The operation of the f function is shown in Figure 3.7. The Blowfish key-scheduling algorithm requires considerable computation. It is roughly equivalent to encrypting 4 kilobytes of data.

[Figure 3.7 The Blowfish f Function: the 32-bit input is split into four 8-bit pieces that index S-boxes S1–S4; the four 32-bit S-box outputs are combined with exclusive-ORs and addition modulo 2^32 to form the 32-bit output]

The algorithm works by initializing the 18 subkeys and the 256 32-bit entries of each S-box to the hexadecimal digits of the fractional part of π. Next, the user's key is exclusive-ORed into the subkeys, cycling through the key repeatedly, if necessary, to fill out the 18 subkeys. Then, a block of 0s is encrypted, and the result replaces P1 and P2. The modified algorithm is now used to encrypt the results of the previous encryption, and the output replaces P3 and P4. This operation is performed 521 times until each of the subkeys and all the S-box entries are initialized. Because of the time it takes to initialize the algorithm's state, Blowfish is not suitable for situations in which the key is changed frequently. Blowfish is ideal for applications such as VPNs, in which a key is used for some time before it is replaced. Additional information about Blowfish is available on Schneier's Web site, where he gives a complete discussion of the algorithm, an explanation of some of the design decisions, and pointers to several free implementations. As of this writing, there is no known practical attack against Blowfish. Certain weak keys can be detected, but not identified or exploited, in a reduced 14-round version of Blowfish, and a 4-round version is vulnerable to a differential analysis attack.

3.3 Asymmetric Ciphers

One of the problems with symmetric ciphers is that they require both parties to the exchange of encrypted messages to have access to a shared secret: the key. If we think in terms of a typical application, encrypted email, we see immediately why this is a problem: We must give every possible correspondent a key to use when corresponding with us. Furthermore, this key must be different for each correspondent so that one can't read the messages of another. Because for n correspondents, each able to communicate privately with any other, this requires (n² − n)/2 keys, the number of keys grows quadratically in the number of correspondents. Sometimes, sharing a secret isn't possible. Consider the typical application of securing a Web transaction (see Chapter 6). Because an online merchant doesn't know a priori who its customers will be, it is not possible to use a shared secret to secure the transaction. There are many other situations in which it is impractical or impossible for both parties to use a preassigned key. In this section, we consider algorithms that enable two parties to communicate securely without such problems. For reasons that will become clear shortly, this is usually called public key cryptography. Asymmetric ciphers are those that have different keys for encryption and decryption. The idea goes back to [Diffie and Hellman 1976] and, independently, [Merkle 1978]. The GCHQ (Government Communications Headquarters—the British equivalent of the American National Security Agency) also has some claim to the idea of asymmetric ciphers, but its work was classified and unpublished until after the Diffie-Hellman and Merkle papers.

The notion was that keys would come in pairs—one each for encryption and decryption—and that one key could not be derived from the other. These algorithms are important for public key cryptography, in which one key is public and used by anyone to encrypt a message that can be decrypted only by the holder of the other, secret, key. Because only the recipient has the secret key, the messages cannot be read by anyone else. Asymmetric ciphers are based on one-way trapdoor functions, which are easy to compute but difficult to invert. It is the trapdoor function that makes it difficult to obtain one key from the other. See [Schneier 1996] for a discussion of many of these algorithms and the trapdoor functions on which they are based. Here, we consider only two of the asymmetric algorithms: RSA and ElGamal.

RSA

The RSA algorithm was invented by Ron Rivest, Adi Shamir, and Leonard Adleman and described in their 1978 paper [Rivest, Shamir, and Adleman 1978]. RSA is based on the difficulty of factoring large numbers. Given two large prime numbers, p and q, the trapdoor function is to multiply p and q to obtain the product n. Obviously, this is trivial to compute, but the difficulty of factoring n into p and q makes it very hard to invert. The mathematics required to understand RSA is modest—mostly some facts about modular arithmetic—and, except for the proof that it works, is probably familiar to most of us. For an excellent review of the necessary mathematics, see [Ferguson and Schneier 2003]; a slightly more comprehensive review is given in [Menezes, Oorschot, and Vanstone 1996]. The description of RSA is simple. Pick two large random primes, p and q, of about the same size. They should be at least 1,024 bits long, with larger numbers being more secure. Then choose a number e for the encryption key such that e has no factors in common with (p − 1)(q − 1). Popular choices for e are 3, 5, 17, and 65,537.


These values are chosen because they make it easier to perform the calculations discussed below. In practice, e is often chosen first, and p and q are chosen to ensure that (p − 1)(q − 1) has no factors in common with it.

Next, choose the decryption key d such that ed = 1 mod (p − 1)(q − 1). The condition on e and (p − 1)(q − 1) ensures that this is possible. We say that d is the multiplicative inverse of e mod (p − 1)(q − 1) and sometimes write d = 1/e mod (p − 1)(q − 1). The public key is the pair (n, e), and we can make it known to anyone who might wish to send us a message. The numbers p, q, and d are secret and should not be revealed. Only d is necessary to decrypt messages, so p and q could be discarded, but retaining them allows us to build more efficient implementations of the decryption algorithm. Given a message m, thought of as a single large integer, such that m < n, we encrypt it as

    c = m^e mod n                                      (4)

where c is the encrypted message. We decrypt it as

    m = c^d mod n                                      (5)

We should take note of two things here. First, given that we choose p and q to be on the order of 1,024 bits, n is about 2,048 bits long. Thus, we can't encrypt a message longer than about 256 characters. Second, making these calculations requires working with huge numbers, so in general, we will need to use an arbitrary-precision math library for the calculations. It's not difficult to see why (4) and (5) work:

    c^d mod n = (m^e)^d mod n
              = m^(ed) mod n
              = m^(k(p−1)(q−1)+1) mod n
              = m · m^(k(p−1)(q−1)) mod n
              = m · 1^k mod n                          (6)
              = m

The only step that's not completely obvious is (6). This follows from Fermat's little theorem—the details are in [Menezes, Oorschot, and Vanstone 1996], but we don't need to understand any of that to use the algorithm. A simple example will make these ideas clear. Let us choose two primes of 4 bits each—say, 5 and 11—for our p and q, and let's choose 3 for e. Next, we need to find a number d such that 3d = 1 mod (p − 1)(q − 1) = 1 mod 40. It is easy to check that 27 satisfies this condition. In this case, the numbers are small enough that we can just search for d directly. In general, the easiest way to find d is to use the extended Euclidean Algorithm [Knuth 1998], but we needn't worry about this unless we are trying to implement the RSA algorithm. See Appendix A for bc and python scripts that calculate these inverses.

We now have everything we need to encrypt a message. The public key is (55, 3), and the secret key is 27. Let us encrypt the message 1101₂ (decimal 13):

    c = 13^3 mod 55 = 2197 mod 55 = 52


To decrypt this, we use the secret key as in equation (5): m = 52^27 mod 55. Even with these small numbers, the calculations can get out of hand. Although we could calculate 52^27 by hand (the result is 47 digits), it's easier to use an arbitrary-precision calculator, such as the UNIX utility bc, for this:

    $ bc -q
    52^27%55
    13
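The same toy calculation can be reproduced with Python's built-in three-argument pow (the modular-inverse form needs Python 3.8 or later); the values are the example's, not realistic key sizes, and no PKCS #1 padding is involved.

    p, q, e = 5, 11, 3
    n = p * q                           # 55, part of the public key (n, e)
    d = pow(e, -1, (p - 1) * (q - 1))   # 27, the multiplicative inverse of e mod 40
    m = 13
    c = pow(m, e, n)                    # encrypt: c = m^e mod n  -> 52
    print(c, pow(c, d, n))              # decrypt: m = c^d mod n  -> 52 13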

As expected, we get our original message, 13, back. An algorithm that encrypts messages that are limited to 5 bits is not particularly useful, of course, but even with the more typical implementation, we are still limited to about 256 bytes. We could solve this problem by partitioning our messages into blocks of 5 bits and encrypting each block independently, as we do for block ciphers, but RSA is very slow—the typical software implementation of RSA runs at about 1 percent of the speed of DES [Schneier 1996]—so RSA is almost never used this way. Instead, we combine RSA with one of our symmetric ciphers. Recall that a problem with symmetric ciphers is that they require both parties to an exchange of encrypted messages to have access to the key. Public key systems, such as RSA, solve this problem as follows. Suppose that Alice wants to send an encrypted message to Bob. The names Alice and Bob are traditionally used in cryptographic literature to refer to two people who wish to communicate with each other.

Alice picks a random number as a one-time session key for a symmetric cipher—AES, say—and encrypts it, using Bob's public key. Alice then encrypts the message itself with AES, using the random session key, and sends both the RSA encrypted key and the AES encrypted message to Bob. The actual process is a bit more complicated. An AES key, K, of 256 bits is too small to encrypt securely with RSA because for a public key of, say, (n, 3), where n has about 2,048 bits, we have K^e < n. Therefore, K^e mod n = K^e, and an attacker can recover K by merely taking the e-th root. To avoid this, Alice can choose a large random number to encrypt with RSA. Both Alice and Bob then use this random number to generate the actual key, K. Another method is for Alice to pad K with random bits so that the total message has about the same number of bits as n. Other methods of encoding K and, more generally, any message m, are given in PKCS #1 [RSA Laboratories 2002]. The encoding methods in PKCS #1 add security because they use hash functions (Section 3.4) and pseudorandom masking functions that ensure that two similar or identical messages will have dissimilar encodings. At the same time, the encodings have a structure that the receiver can check to avoid certain chosen ciphertext attacks. See [Bleichenbacher, Kalisky, and Staddon 1998] and [Kaliski and Robshaw 1995] for details on how PKCS #1 encoding prevents many attacks on RSA-encoded messages.

When Bob receives these, he first uses his secret RSA key to recover the AES session key, and then uses that to decrypt the message. If several messages will be exchanged in a session, the same session key is used for each message, thereby avoiding the relatively slow RSA step for all messages but the first. We will see this same pattern—using an asymmetric cipher to exchange session keys for a symmetric cipher—many times as we study VPNs and related security protocols.
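The shape of this exchange is easy to sketch in Python. The toy below reuses the tiny RSA key from our example and stands in for AES with a hash-derived XOR keystream; the key sizes and the "cipher" are purely illustrative and offer no real security.

import hashlib, secrets

n, e, d = 55, 3, 27                     # toy RSA key pair from the example above

def toy_cipher(key, data):
    # Stand-in for AES: XOR with a SHA-256-derived keystream (NOT a real cipher).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

# Alice: pick a session key, wrap it with Bob's public RSA key, encrypt the message.
session_key = secrets.randbelow(n)
wrapped_key = pow(session_key, e, n)                       # RSA-encrypted session key
ciphertext = toy_cipher(str(session_key).encode(), b"hello, Bob")

# Bob: unwrap the session key with his private RSA key, then decrypt the message.
recovered = pow(wrapped_key, d, n)
print(toy_cipher(str(recovered).encode(), ciphertext))     # b'hello, Bob'

A real implementation would wrap a 128- or 256-bit AES key, encoded per PKCS #1 as described above, and would reuse the session key for the remaining messages of the session.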


ElGamal

If p is prime, and g, y ∈ {1, 2, . . . , p − 1}, the discrete logarithm problem is to find an integer x such that y = g^x mod p. In the real numbers, R, this is merely the normal logarithm, and x is easily determined, although it may not be an integer, of course. In the finite field Z_p, which is what we are discussing, this is a problem on roughly the same order of difficulty as factoring a composite integer the size of p. Therefore, for large p, modular exponentiation is a suitable trapdoor function for an asymmetric cipher algorithm: It is trivial to perform modular exponentiation, but inverting it requires finding a discrete logarithm. The ElGamal encryption algorithm [ElGamal 1985] is based on the difficulty of computing a discrete logarithm.

We must use some care in picking p and g. Generally, we want p = 2q + 1, where q is also prime, and we want p to be on the order of 2,048 bits or more. [Ferguson and Schneier 2003] suggests the following method for picking g: Choose a random number α ∈ {2, 3, . . . , p − 2}, and compute g = α^2 mod p. If g = 1 or g = p − 1, pick another α and try again. Otherwise, g, which is called the group generator, is suitable.

Next, pick a random number x, and compute y = g^x mod p. The public key is the triple (y, g, p). The private key is x. To encrypt a message m, choose a random number k that has no factors in common with p − 1, and compute the two numbers

a = g^k mod p
b = y^k * m mod p

The encrypted message is the pair (a, b). To decrypt the message, compute

m = b / a^x mod p

Note that

a^x mod p = (g^k)^x mod p = g^(kx) mod p = g^(xk) mod p = (g^x)^k mod p = y^k mod p

Thus, dividing b by a^x (modulo p) yields the original message m.

To see how this works in practice, let q be 3 so that p = 7. Choose g = 2. Because 2 = 3^2 mod 7, g is a suitable choice for the generator. Let's use 4 for our secret key, so that y = 2^4 mod 7 = 2. Suppose that we want to encrypt the message 110_2 (decimal 6). First, we choose a random k with no factors of 2 or 3; let's say 5. We compute

a = 2^5 mod 7 = 4
b = 2^5 * 6 mod 7 = 3

To recover the message, we compute

a^x mod 7 = 4^4 mod 7 = 4
b / a^x mod 7 = 3 * 2 mod 7 = 6

where 1/a^x = 2 mod 7, because 4 * 2 = 8 = 1 mod 7.
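The bookkeeping is easier to follow in code. This Python sketch simply replays the toy numbers above (p = 7, g = 2, x = 4, k = 5); a real implementation would use a prime of 2,048 bits or more and fresh random values.

p, g = 7, 2                        # toy prime and generator
x = 4                              # private key
y = pow(g, x, p)                   # public key component: 2^4 mod 7 = 2

m, k = 6, 5                        # message and per-message random k
a = pow(g, k, p)                   # a = g^k mod p = 4
b = (pow(y, k, p) * m) % p         # b = y^k * m mod p = 3

# Decryption: m = b / a^x mod p, using the modular inverse of a^x (Python 3.8+).
ax_inverse = pow(pow(a, x, p), -1, p)
print((b * ax_inverse) % p)        # prints 6, the original message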

Diffie-Hellman Key Exchange

ElGamal encryption is closely related to the Diffie-Hellman key-exchange algorithm [Diffie and Hellman 1976]. Diffie-Hellman enables two parties to dynamically agree on a shared secret over an insecure transmission medium without any previously shared information. Even if their entire session is captured, an attacker will not be able to discover the shared secret. This seemingly impossible task is in fact quite simple. The process is illustrated in Figure 3.8.

Figure 3.8 Diffie-Hellman Key Exchange (Alice sends (p, g, y_A) to Bob; Bob replies with y_B)

To produce the shared secret, Alice chooses a prime p and generator g, just as in ElGamal. Then Alice chooses a random private key x_A, calculates y_A = g^x_A mod p, and sends the triple (p, g, y_A) to Bob. Bob chooses his own random private key x_B, calculates y_B = g^x_B mod p, and sends it to Alice. Note that it is impractical for an attacker to recover the two private keys x_A and x_B, because of the difficulty of the discrete logarithm problem.

Next, Alice generates her secret key, K_A, as K_A = y_B^x_A mod p. Likewise, Bob calculates his own secret key as K_B = y_A^x_B mod p. But now we have

K_A = y_B^x_A mod p = g^(x_B x_A) mod p = g^(x_A x_B) mod p = y_A^x_B mod p = K_B

so Alice and Bob have, in fact, calculated the same key independently. In many situations, Alice and Bob will have agreed on p and g in advance, so Alice need send only y_A to Bob instead of the triple (p, g, y_A). In some applications, such as IPsec, there are a small number of primes and generators that most implementations choose from, and Alice need send only an indication of which set she wishes to use, rather than the entire prime and generator.
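In code, the whole exchange is four modular exponentiations. The sketch below uses deliberately tiny, made-up numbers (p = 23, g = 5):

p, g = 23, 5                       # public parameters (toy values)
x_a, x_b = 6, 15                   # Alice's and Bob's private keys

y_a = pow(g, x_a, p)               # Alice sends y_A = g^x_A mod p = 8
y_b = pow(g, x_b, p)               # Bob sends y_B = g^x_B mod p = 19

k_a = pow(y_b, x_a, p)             # Alice: K_A = y_B^x_A mod p
k_b = pow(y_a, x_b, p)             # Bob:   K_B = y_A^x_B mod p
print(k_a, k_b, k_a == k_b)        # 2 2 True -- the shared secret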


ElGamal and Diffie-Hellman over Elliptic Curves

The ElGamal and Diffie-Hellman algorithms that we've discussed involve calculations over the finite field Z_p, using the normal field operations, such as addition, multiplication, and exponentiation. It is possible to define an elliptic curve over Z_p or GF(2^n), and then implement the ElGamal and Diffie-Hellman algorithms as calculations on the points of the curve. The mathematics are considerably more demanding than what we have been using, so we will omit the details. A reasonably accessible introduction to elliptic curve cryptography and its application to Diffie-Hellman is in [Davis 2001].

The advantage of elliptic curve cryptography is that the discrete logarithm problem is considerably more difficult over an elliptic curve than it is in Z_p, so smaller values of p can be used, with a consequent increase in speed and decrease in the amount of data that needs to be exchanged between peers during key negotiation. For example, Diffie-Hellman over Z_p with a 1,024-bit prime has about the same security as Diffie-Hellman over an elliptic curve on the finite field GF(2^n) with a prime of 185 bits [Doraswamy and Harkins 1999].

3.4 Cryptographic Hash Functions, MACs, and HMACs

One problem in implementing VPNs is message authentication. That is, how can we be sure that the message is from whom it says it is, and that it hasn't been tampered with in transmission? An everyday example of this is an email message signed with PGP, GPG, or a similar utility. The recipient of such a message can be sure that the sender is legitimate and that the message has not been altered by a third party. In this section, we discuss some of the tools that provide these capabilities. We are not interested in utilities, such as GPG, but rather in the building blocks that are used to construct them. As we shall see, these tools have applications far beyond signing email messages.

One of the basic building blocks that we will need is a cryptographic hash function. These functions take an input of arbitrary length and produce a fixed-sized digest, or "fingerprint," of the input. A trivial example of a hash function on a message m is h(m) = m mod n for some n. Although such a hash function is useful for table-lookup applications, it is not a cryptographic hash function, because it lacks the essential properties required of a cryptographic hash function, h:

• Given x, it is computationally infeasible to compute m such that x = h(m).
• Given m1, it is computationally infeasible to compute a second message m2 ≠ m1 such that h(m1) = h(m2).
• It is computationally infeasible to find two distinct messages m1 and m2 such that h(m1) = h(m2).

This last property is called strong collision resistance. It may seem superfluous, but hash functions lacking this property are vulnerable to various attacks based on the birthday paradox. See Section 18.1 of [Schneier 1996] for a description of one such attack.


The birthday paradox is the surprising result that if 23 people are in a room, the chances are better than even that 2 of them will have the same birthday.

An examination of the properties that hash functions should have and the place of strong collision resistance in those properties is presented in [Anderson 1993]. Although there are several cryptographic hash functions, we examine only two: MD5 and SHA. Both of these functions take a message of any length and produce a fixed-sized result: 128 bits for MD5 and 160 bits for the standard SHA algorithm, called SHA-1.

MD5

The MD5 algorithm was developed by Ron Rivest in 1992 and released to the public domain. Its specification and a reference implementation are given in RFC 1321 [Rivest 1992b]. Before describing the algorithm, let's see it in action. We use the FreeBSD md5 utility to compute the MD5 digest of the two strings 1 and 3:

$ md5 -s 1
MD5 ("1") = c4ca4238a0b923820dcc509a6f75849b
$ md5 -s 3
MD5 ("3") = eccbc87e4b5ce2fe28308fd9f2a7baf3
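The same digests can be reproduced on any system with Python's hashlib, which is a handy check if the FreeBSD md5 utility isn't available:

import hashlib
print(hashlib.md5(b"1").hexdigest())    # c4ca4238a0b923820dcc509a6f75849b
print(hashlib.md5(b"3").hexdigest())    # eccbc87e4b5ce2fe28308fd9f2a7baf3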

Two points are worth noticing here. First, even though the input strings are a single byte, they produce a 128-bit digest. Second, although the strings 1 and 3 vary in only a single bit, they produce two radically different digests. This is an example of the avalanche effect, where a change in one bit affects several bits in the result.

Given a message m of length l_m, MD5 first pads the message with a 1-bit and then as many 0-bits as required to make l_m ≡ 448 mod 512. Then the least significant 64 bits of the message length are appended to make the padded message a multiple of 512 bits. Because the 1-bit is always appended, the message is always padded, even if it is already a multiple of 512 bits.

The MD5 algorithm operates on one 512-bit block at a time until the entire message is processed. During this process, MD5 maintains 16 bytes of state in four 32-bit registers (A, B, C, D). This state is modified by mixing it with data from the input block in a nonlinear way. After the last block has been processed, the 16 bytes of state are concatenated to form the digest. Before processing the first block, the state is initialized to (0x01234567, 0x89abcdef, 0xfedcba98, 0x76543210). Figure 3.9 shows an overview of the MD5 process for one block of data. After the four rounds of mixing, the modified state (A′, B′, C′, D′) is added to the original input state (A, B, C, D) and used as the input state for the next block. The addition of the original state with the modified state is done one register at a time with 32-bit modular addition.

The real work of MD5 is done by the four mixing rounds. The four rounds are similar but not identical. Each mixing round has a nonlinear function at its heart.

Figure 3.9 Process One Block with MD5 (block B_i and the state (A, B, C, D) feed four rounds of mixing; the resulting (A′, B′, C′, D′) is added to (A, B, C, D) to form the output state)

These four functions, F, G, H, and I, are defined as

F(X, Y, Z) = (X ∧ Y) ∨ ((¬X) ∧ Z)
G(X, Y, Z) = (X ∧ Z) ∨ (Y ∧ (¬Z))
H(X, Y, Z) = X ⊕ Y ⊕ Z
I(X, Y, Z) = Y ⊕ (X ∨ (¬Z))

where ∧ is the bitwise AND operator, ∨ is the bitwise OR operator, and ¬ is the bit complement operation. F, G, H, and I are used to define the four round operations F̂, Ĝ, Ĥ, and Î:

F̂(a, b, c, d, M, s, t) ≡ a = b + ((a + F(b, c, d) + M + t) <<< s)

 7  15:18:08.260005 172.30.0.15.1701 > 172.30.0.4.1701: l2tp:[TLS](52379/20575)Ns=3,Nr=2
    *MSGTYPE(ICCN) *TX_CONN_SPEED(10000000) *FRAMING_TYPE(S) *RX_CONN_SPEED(10000000) (DF)
 8  15:18:08.266317 172.30.0.15.1701 > 172.30.0.4.1701: l2tp:[](52379/20575)
    {Conf-Req(1), ACCM=00000000, Magic-Num=c49a6f1e, PFC, ACFC} (DF)
 9  15:18:08.294282 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[TLS](48224/0)Ns=2,Nr=3 ZLB (DF)
10  15:18:08.321691 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[TLS](48224/40607)Ns=2,Nr=4 ZLB (DF)
11  15:18:08.332097 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[L](48224/40607)
    {Conf-Req(1), ACCM=00000000, Magic-Num=c49a6f1e, PFC, ACFC} (DF)
            additional PPP negotiation deleted
12  15:18:47.603327 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[TLS](48224/0)Ns=2,Nr=5
    *MSGTYPE(HELLO) (DF)
13  15:18:47.609330 172.30.0.15.1701 > 172.30.0.4.1701: l2tp:[TLS](52379/0)Ns=5,Nr=3 ZLB (DF)
14  15:19:07.189282 172.30.0.15.1701 > 172.30.0.4.1701: l2tp:[](52379/20575)
    {192.168.122.2 > 192.168.122.1: icmp: echo request (DF)} (DF)
15  15:19:07.193070 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[L](48224/40607)
    {192.168.122.1 > 192.168.122.2: icmp: echo reply} (DF)
            L2TP and PPP hellos deleted


16  15:19:55.713922 172.30.0.15.1701 > 172.30.0.4.1701: l2tp:[TLS](52379/20575)Ns=6,Nr=4
    *MSGTYPE(CDN) *RESULT_CODE(1/0 Bad file descriptor) *ASSND_SESS_ID(40607) (DF)
17  15:19:55.717628 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[TLS](48224/40607)Ns=4,Nr=7 ZLB (DF)
18  15:20:16.441387 172.30.0.15.1701 > 172.30.0.4.1701: l2tp:[TLS](52379/0)Ns=7,Nr=4
    *MSGTYPE(StopCCN) *ASSND_TUN_ID(48224) *RESULT_CODE(1/0 Goodbye!) (DF)
19  15:20:16.444907 172.30.0.4.1701 > 172.30.0.15.1701: l2tp:[TLS](48224/0)Ns=4,Nr=8 ZLB (DF)

The session begins on lines 1–4 with linuxlt, acting as the LAC, establishing a tunnel to linux. After tunnel establishment, linuxlt initiates a data session on lines 5–7, followed by the PPP negotiation starting on line 8. In lines 16–19, the data session and tunnel are torn down. We look at each of these phases in greater detail shortly.

Tunnel Establishment

As with PPTP, either the LAC or the LNS can initiate a tunnel. If both sides attempt to initiate a tunnel at the same time, a tie-breaking mechanism lets one side proceed as the initiator and the other side take the role of the responder. Figure 4.33 shows the typical case of a LAC initiating a tunnel to the LNS.

Figure 4.33 Tunnel-Initiation Handshake (LAC and LNS exchange SCCRQ, SCCRP, and SCCCN)

The three-message process is reminiscent of TCP's three-way connection-establishment handshake. In the first message, the initiator sends a Start-Control-Connection-Request (SCCRQ) message. The SCCRQ message contains several AVPs, some of which are optional. The AVPs are shown in Figure 4.34.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    protocol version              2             8       1        •
    framing capabilities          3            10       1        •
    bearer capabilities           4            10       1
    tie breaker                   5            14       0
    firmware revision             6             8       0
    host name                     7           var       1        •
    vendor name                   8           var       0
    assigned tunnel ID            9             8       1        •
    receive window size          10             8       1
    challenge                    11           var       1
    random vector                36           var       1

                       Figure 4.34 SCCRQ AVPs

Many of the AVPs in Figure 4.34 are self-explanatory. Recall that the length of the value field of an AVP is l − 6, where l is the length given in the table. Note that the lengths are before any hiding.

The meaning of the M bit is special for the message type AVP. Rather than indicating that the AVP itself must be recognized, the M bit indicates that the message must be recognized. If the M bit of the message type AVP is set and the receiver does not recognize the message, the tunnel must be torn down. If the M bit is cleared and the receiver does not recognize the message, the receiver can ignore the message. Note that all the messages in Figure 4.30 will have the M bit of the message type AVP set.

The protocol version value field comprises 2 bytes. The most significant byte is the version, which is set to 1. The low-order byte is the revision, which is set to 0. Note that this field is not the same as the version field in the common header.

The framing capabilities AVP is a 32-bit mask. The lowest-order bit is set if the sender is capable of using synchronous framing. The next-lowest-order bit is set if the sender is capable of asynchronous framing. The bearer capabilities field is also a 32-bit mask. The lowest-order bit is set if the sender supports digital access. The second-lowest-order bit is set if the sender supports analog access.

The tie breaker is a 64-bit random value used to resolve the conflict that arises when both the LAC and LNS try to initiate a tunnel at the same time. When one side receives an SCCRQ, that side must check whether it has also initiated a tunnel. If so, the receiver compares its tie-breaker value with that of its peer. The side with the lower value proceeds with its tunnel, and the other side abandons its tunnel.

The assigned tunnel ID is used to demultiplex messages for multiple tunnels. The sender expects its peer to set the tunnel ID field of the common header to this value for all messages that the peer sends to it. The tunnel ID field in the common header will be 0 for the SCCRQ message.

The receive window size AVP indicates how many unacknowledged control messages may be outstanding to the sender. If this AVP is missing, the peer must assume a value of 4.

If the sender of the SCCRQ wishes to authenticate its peer with a CHAP-like mechanism (RFC 1994 [Simpson 1996]), it can include the challenge AVP. The value field of this AVP is an arbitrarily long sequence of random bytes.


As we discussed previously, whenever a hidden AVP is included in a message, that AVP must be preceded by a random vector AVP that contains an arbitrarily long sequence of random bytes.

The receiver of the SCCRQ message will respond with a Start-Control-Connection-Reply (SCCRP) message, indicating that it will accept the connection and specifying its connection parameters. If the peer refuses to accept the connection, it will respond with a StopCCN message. This message contains a result code specifying the reason for rejecting the connection.

The AVPs in the SCCRP message are the same as those in the SCCRQ message (Figure 4.34), with the addition of a challenge response, as shown in Figure 4.35.

    AVP                    Attribute Type   Length   M Bit
    Same as those for SCCRQ
    challenge response           13            22       1

                       Figure 4.35 SCCRP AVPs

The value of the challenge response is computed by taking an MD5 hash of the message type (2 for SCCRP), the shared secret, and the challenge value: CR = MD5(T||S||C) where T is the message type, S is the shared secret, C is the challenge value, and CR is the challenge response. If the parameters in the SCCRP message are acceptable to the tunnel initiator, it will complete the tunnel establishment by sending its peer a Start-Control-Connection-Connected (SCCCN) message. If the parameters are not acceptable, it will respond with a StopCCN message.
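In Python, that computation looks like the sketch below; the shared secret and challenge bytes are made-up placeholders, and the message type is encoded here as a single byte, as in CHAP:

import hashlib

def challenge_response(msg_type, secret, challenge):
    # CR = MD5(T || S || C), as described above.
    return hashlib.md5(bytes([msg_type]) + secret + challenge).hexdigest()

print(challenge_response(2, b"tunnel-secret", bytes.fromhex("0102030405060708")))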

The SCCCN message has one required and two optional AVPs, as shown in Figure 4.36.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    challenge response           13            22       1
    random vector                36           var       1

                       Figure 4.36 SCCCN AVPs

Once the responder receives the SCCCN message, the tunnel is established. At that point, either side can request that a PPP session between a remote host and the LNS be started.


Either side can reject the connection by sending a StopCCN message. This message is also used to tear down the tunnel when one or both of the peers are finished with it. The StopCCN message has three required and one optional AVP, as shown in Figure 4.37.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    result code                   1           var       1        •
    assigned tunnel ID            9             8       1        •
    random vector                36           var       1

                       Figure 4.37 StopCCN AVPs

The result code AVP is formatted as shown in Figure 4.38.

    result code (2 bytes) | error code (2 bytes, optional) | error message (optional)

                       Figure 4.38 The Result Code AVP

The length field indicates whether the optional fields are present. The 2-byte result code is always present. The values defined for the StopCCN message are shown in Figure 4.39.

    Code   Meaning
      0    reserved
      1    general request to clear control connection
      2    general error—error code indicates problem
      3    control channel already exists
      4    requester not authorized
      5    bad protocol version—error code indicates highest version supported
      6    requester is being shut down
      7    finite state machine error

                       Figure 4.39 StopCCN Result Codes

The error code field contains additional information about the result code values. When the result code indicates a general error (2), the error code tells us what the particular error is. Its values are given in Figure 4.40. The error message field contains a human-readable string that provides further information about the error code. This field is not present unless the error code field is. Thus the receiver can unambiguously determine which fields are present by examining the AVP length field.


    Code   Meaning
      0    no error
      1    no control connection exists for the LAC/LNS pair
      2    incorrect length field
      3    value out of range or reserved field nonzero
      4    insufficient resources
      5    invalid session ID
      6    vendor-specific error (see error message field)
      7    try another LNS
      8    unknown AVP with M bit set (see error message field for AVP type)

                       Figure 4.40 Error Codes

Let's look again at our sample L2TP session to see these messages in action. Here, in greater detail, are the first four messages from the session:

 1  15:17:47.532030 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](0/0)Ns=0,Nr=0
    *MSGTYPE(SCCRQ) *PROTO_VER(1.0) *FRAMING_CAP(AS) *BEARER_CAP() FIRM_VER(1680)
    *HOST_NAME(eriwan) VENDOR_NAME(l2tpd.org) *ASSND_TUN_ID(48224) *RECV_WIN_SIZE(4) (DF)
 1.1  4500 007f 17f8 4000 4011 ca26 ac1e 000f   E.....@.@..&....
 1.2  ac1e 0004 06a5 06a5 006b 74c4 c802 0063   .........kt....c
 1.3  0000 0000 0000 0000 8008 0000 0000 0001   ................
 1.4  8008 0000 0002 0100 800a 0000 0003 0000   ................
 1.5  0003 800a 0000 0004 0000 0000 0008 0000   ................
 1.6  0006 0690 800c 0000 0007 6572 6977 616e   ..........eriwan
 1.7  000f 0000 0008 6c32 7470 642e 6f72 6780   ......l2tpd.org.
 1.8  0800 0000 09bc 6080 0800 0000 0a00 04     ......`........
 2  15:17:47.576801 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[TLS](48224/0)Ns=0,Nr=1
    *MSGTYPE(SCCRP) *RANDOM_VECTOR(d3a7909125f5c53e58f869c079868059) *PROTO_VER(1.0)
    *FRAMING_CAP(AS) *BEARER_CAP() FIRM_VER(1680) *HOST_NAME(eriwan) VENDOR_NAME(l2tpd.org)
    *ASSND_TUN_ID(52379) *RECV_WIN_SIZE(4) (DF)
 2.1  4500 0095 aa5f 4000 4011 37a9 ac1e 0004   E...._@[email protected].....
 2.2  ac1e 000f 06a5 06a5 0081 f0ce c802 0079   ...............y
 2.3  bc60 0000 0000 0001 8008 0000 0000 0002   .`..............
 2.4  8016 0000 0024 d3a7 9091 25f5 c53e 58f8   .....$....%..>X.
 2.5  69c0 7986 8059 8008 0000 0002 0100 800a   i.y..Y..........
 2.6  0000 0003 0000 0003 800a 0000 0004 0000   ................
 2.7  0000 0008 0000 0006 0690 800c 0000 0007   ................
 2.8  6572 6977 616e 000f 0000 0008 6c32 7470   eriwan......l2tp
 2.9  642e 6f72 6780 0800 0000 09cc 9b80 0800   d.org...........
 2.10 0000 0a00 04                              .....
 3  15:17:47.578062 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](52379/0)Ns=1,Nr=1
    *MSGTYPE(SCCCN) (DF)
 3.1  4500 0030 17fb 4000 4011 ca72 ac1e 000f   E..0..@[email protected]....
 3.2  ac1e 0004 06a5 06a5 001c 855c c802 0014   ...........\....
 3.3  cc9b 0000 0001 0001 8008 0000 0000 0003   ................


 4  15:17:47.583876 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[TLS](48224/0)Ns=1,Nr=2 ZLB (DF)
 4.1  4500 0028 aa60 4000 4011 3815 ac1e 0004   E..(.`@[email protected].....
 4.2  ac1e 000f 06a5 06a5 0014 15ba c802 000c   ................
 4.3  bc60 0000 0001 0002 0000 0000 0000        .`............

As we see, linuxlt initiates tunnel establishment with an SCCRQ message. The first 20 bytes are the IP header, which is followed by the 8 bytes of the UDP header. We see from these headers that the message is from 172.30.0.15 (0xac1e000f) to 172.30.0.4 (0xac1e0004) and that the source and destination ports are both 1701 (0x06a5).

Next comes the L2TP header (Figure 4.26), shown in boldface on lines 1.2 and 1.3. The first 8 bits tell us that the T, L, and S flags are set. The L2TP version is 2, as expected. The next 2 bytes tell us that the L2TP message is 99 (0x63) bytes long. No tunnel or session ID has been established yet, so these two fields are zeroed. Finally, we see that Ns and Nr are both 0 as well.

The first AVP, the message type, immediately follows the header in the last 8 bytes of line 1.3. We see that the M bit is set, indicating that our peer must recognize this message type, and that the AVP is 8 bytes long. Because this is a standard AVP, the vendor ID field is 0. The attribute type is 0, indicating that this is the message type AVP (Figure 4.34), and the message type is 1, indicating an SCCRQ message (Figure 4.30). The next AVP starts 8 bytes after the first byte of the message type AVP. We see that its attribute type is 2, which, from Figure 4.34, specifies the protocol version. We can step through the remaining AVPs in a similar fashion to verify the information that tcpdump printed out in line 1.

The tcpdump output is particularly informative for L2TP messages. The bracketed letters (TLS in the SCCRQ message) indicate the bits set in the L2TP header. Because the O and P bits are not set, the offset size and offset pad fields are not present. Next come the local tunnel and session IDs in parentheses, followed by the Ns and Nr values (when the S bit is set). Finally, the AVPs and their values are listed. The asterisk in front of an AVP indicates that the M bit is set.
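The same walk-through is easy to automate. The Python sketch below parses the L2TP control header and the first AVP of message 1, using the hex copied from lines 1.2 and 1.3 of the dump (everything after the IP and UDP headers); it is an illustration, not a complete L2TP parser.

import struct

l2tp = bytes.fromhex(
    "c802 0063 0000 0000 0000 0000"     # L2TP control header (12 bytes)
    "8008 0000 0000 0001"               # first AVP: message type, value 1 (SCCRQ)
)

flags = struct.unpack(">H", l2tp[0:2])[0]
print("T:", bool(flags & 0x8000), "L:", bool(flags & 0x4000),
      "S:", bool(flags & 0x0800), "version:", flags & 0x000F)
length, tunnel_id, session_id, ns, nr = struct.unpack(">5H", l2tp[2:12])
print("length:", length, "tunnel:", tunnel_id, "session:", session_id,
      "Ns:", ns, "Nr:", nr)

# AVP layout: 2-byte flags/length, 2-byte vendor ID, 2-byte attribute type, value.
avp_flags_len, vendor_id, attr_type, value = struct.unpack(">4H", l2tp[12:20])
print("M bit:", bool(avp_flags_len & 0x8000),
      "AVP length:", avp_flags_len & 0x03FF,
      "vendor:", vendor_id, "attribute type:", attr_type,
      "value:", value)                  # value 1 = SCCRQ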

Because tcpdump provides comprehensive information about L2TP messages, we needn’t bother analyzing the remaining hex dumps in detail. It is an excellent exercise, however, to verify the details that tcpdump prints out for one or two of the other messages. Doing so results in a deeper understanding of how the protocol works and how its messages are formed. Before leaving the SCCRQ message, we should note an anomaly with the host name AVP. As shown on line 1, linuxlt sent a host name of eriwan. This is because of a bug in the L2TP software used for the session. Instead of using the configured name, it always uses the hard-coded name eriwan. The second message shows linux responding to the SCCRQ with an SCCRP. The tunnel ID field is filled in with 48224 because its value is known from the assigned tunnel ID AVP in the SCCRQ message. The session ID is still 0, of course, because no sessions are active. We note that linux sets its tunnel ID to 52379. This is the value that linuxlt will put in the tunnel ID field, as we’ll see in the next message.


We also see that Nr is set to 1. This indicates that linux expects the Ns value of the next message from linuxlt to be set to 1. Although it is not used, linux provides a random vector AVP. Sending this AVP when it's not needed is a peculiarity of this implementation. The rest of the AVPs are similar to those in the SCCRQ message.

The tunnel establishment is completed with the next message, in which linuxlt sends an SCCCN message. Because it doesn't have a message pending for linuxlt, linux responds to the SCCCN with a ZLB. This corresponds to TCP's pure ACK message and is sent as part of the tunnel-reliability mechanism. The ZLB plays no part in tunnel establishment.

Lines 18 and 19 from the sample session show what happens when the peers are finished with the tunnel:

18  15:20:16.441387 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](52379/0)Ns=7,Nr=4
    *MSGTYPE(StopCCN) *ASSND_TUN_ID(48224) *RESULT_CODE(1/0 Goodbye!) (DF)
18.1  4500 004a 1856 4000 4011 c9fd ac1e 000f   E..J.V@.@.......
18.2  ac1e 0004 06a5 06a5 0036 4a0f c802 002e   .........6J.....
18.3  cc9b 0000 0007 0004 8008 0000 0000 0004   ................
18.4  8008 0000 0009 bc60 8012 0000 0001 0001   .......`........
18.5  0000 476f 6f64 6279 6521                  ..Goodbye!
19  15:20:16.444907 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[TLS](48224/40607)Ns=4,Nr=8 ZLB (DF)
19.1  4500 0028 aabb 4000 4011 37ba ac1e 0004   E..(..@[email protected].....
19.2  ac1e 000f 06a5 06a5 0014 15b1 c802 000c   ................
19.3  bc60 0000 0004 0008 0000 0000 0000        .`............

Tunnel teardown is initiated when linuxlt sends a StopCCN message. Notice that the tunnel ID in the L2TP header (52379) is the ID assigned by linux. As required by the protocol, linuxlt also sends its value of the tunnel ID (48224) as an AVP, making it unambiguous as to which tunnel is being torn down. The result code AVP (Figure 4.38) is shown in boldface on lines 18.4 and 18.5. It shows a result/error code (Figure 4.39 and Figure 4.40) of 1/0 (general request to clear control connection/no error) and the optional human-readable message (Goodbye!). The tunnel teardown is complete when linux responds with a ZLB, acknowledging that it received the StopCCN message.

The Control Connection

The control connection is a reliable connection that largely duplicates the semantics of TCP reliability. As we've seen, each control connection has sequence number and ACK fields that help provide reliability in the same manner that the corresponding TCP fields do. Once a message is sent, a retransmission timer is started. If the message is not acknowledged by the other side before the timer expires, the message is retransmitted, and the timer is restarted with a larger timeout value. After several unsuccessful attempts, the tunnel is considered down and is abandoned.

If the receiver of a control message has a response, it can acknowledge the control message in its response. If no response is necessary, the peer will respond with a


common message header that contains no AVPs. This is the ZLB message. Its sole purpose is to acknowledge a previous message by indicating the next expected sequence number in its Nr field.

Like TCP, the control channel guarantees that control messages will be delivered in order and that it will make a sustained effort to deliver each message. When IP is used as the transport, the UDP checksum provides the same protection from corruption that TCP enjoys. Other transports have their own checksum schemes for protection against corruption. RFC 2661 recommends that TCP-like congestion control also be implemented for the control channel, but [Carlson 2000] reports that most implementations fail to do this.

If one of the tunnel endpoints has not received any messages from its peer for a certain amount of time—RFC 2661 recommends 60 seconds—the endpoint can send its peer a hello (HELLO) message. As we see from Figure 4.30, the HELLO is a control message, so after transmitting the HELLO, the sender can depend on the reliable-delivery mechanism of the control channel: either the HELLO will be delivered and acknowledged by the other side or will time out and cause the tunnel to be abandoned. Because HELLO messages are global to the tunnel, the session ID field is 0. An unacknowledged HELLO will cause all sessions in the tunnel to be terminated. The HELLO mechanism allows both peers to discover independently that the tunnel is no longer functioning. The HELLO message contains a single message type AVP, as shown in Figure 4.41.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •

                       Figure 4.41 HELLO AVP

Lines 12 and 13 of our sample session show an example of a HELLO sequence:

12  15:18:47.603327 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[TLS](48224/0)Ns=2,Nr=5
    *MSGTYPE(HELLO) (DF)
12.1  4500 0030 aa77 4000 4011 37f6 ac1e 0004   E..0.w@[email protected].....
12.2  ac1e 000f 06a5 06a5 001c 958f c802 0014   ................
12.3  bc60 0000 0002 0005 8008 0000 0000 0006   .`..............
13  15:18:47.609330 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](52379/0)Ns=5,Nr=3 ZLB (DF)
13.1  4500 0028 1812 4000 4011 ca63 ac1e 000f   E..(..@[email protected]....
13.2  ac1e 0004 06a5 06a5 0014 057a c802 000c   ...........z....
13.3  cc9b 0000 0005 0003                       ........

Incoming Calls from a Remote Host

When a remote host connects to a LAC, the LAC will initiate a session with the LNS. If a tunnel between the LAC and LNS does not already exist, the LAC will negotiate one before proceeding with the session establishment. The protocol that the LAC uses to establish a session is similar to the one used to establish the tunnel. As with tunnel establishment, there is a three-step handshake, as shown in Figure 4.42.


Figure 4.42 Incoming Session Establishment (LAC and LNS exchange ICRQ, ICRP, and ICCN)

After receiving a call indication from the remote host—either from the remote host connecting to the LAC in mandatory mode or internally in voluntary mode—the LAC sends an ICRQ to the LNS with information about the call. The LAC can defer answering the call (in mandatory mode) until the LNS agrees to establish the session, or it can immediately answer the call and perform LCP negotiation and authentication before sending the ICRQ. In either case, the LAC sends the ICRQ message to begin the session-establishment handshake. The ICRQ has three required and six optional AVPs, as shown in Figure 4.43.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    assigned session ID          14             8       1        •
    call serial number           15            10       1        •
    bearer type                  18            10       1
    called number                21           var       1
    calling number               22           var       1
    subaddress                   23           var       1
    physical channel ID          25            10       0
    random vector                36           var       1

                       Figure 4.43 ICRQ AVPs

The assigned session ID AVP is this LAC’s identification number for the session. The LNS will place this number in the session ID field of the common header. Because the LAC does not yet have the LNS’s session ID, it puts a 0 in the session ID field of the ICRQ’s common header. The call serial number AVP is a 32-bit counter that identifies a particular call. This AVP is used by system administrators at the LAC and LNS for troubleshooting. The bearer type AVP is similar to the bearer capabilities AVP; it consists of a 32-bit mask with the lowest-order bit indicating that the call is on a digital channel and the


next-lowest-order bit indicating that the call is on an analog channel. In the case of a voluntary-mode connection, where the LAC is embedded in the remote host, both bits may be left unset.

The called number, calling number, and subaddress AVPs are variable-length ASCII strings that contain information about the phone numbers of the called and calling parties. The subaddress contains additional dialing information. RFC 2661 remarks that the LAC and LNS may have to agree on the interpretation of these fields. The physical channel ID contains vendor-specific information about the physical channel of the call. This AVP is used only for logging purposes.

If the LNS agrees to accept the call, it will reply to the ICRQ with an Incoming-Call-Reply (ICRP). If the LNS does not want to accept the call, it will reply with a Call-Disconnect-Notify (CDN) message. We'll look at the CDN message shortly. The ICRP has two required and one optional AVP, as shown in Figure 4.44.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    assigned session ID          14             8       1        •
    random vector                36           var       1

                       Figure 4.44 ICRP AVPs

We've already seen these AVPs in the ICRQ message. After receiving the ICRP, the LAC answers the call, if it hasn't already done so, and then completes the session establishment by sending an Incoming-Call-Connected (ICCN) message to the LNS. The ICCN informs the LNS that the remote host is connected and passes the LNS some parameters concerning the physical-connection characteristics. If the LAC has already performed the LCP and authentication phases of PPP, the ICCN message will also contain information about the PPP parameters negotiated and the results of the authentication. The ICCN message has 3 required and 12 optional AVPs, as shown in Figure 4.45.


    AVP                              Attribute Type   Length   M Bit   Required
    message type                            0             8       1        •
    framing type                           19            10       1        •
    Tx connect speed                       24            10       1        •
    initial received LCP CONFREQ           26           var       0
    last sent LCP CONFREQ                  27           var       0
    last received LCP CONFREQ              28           var       0
    proxy authen. type                     29             8       0
    proxy authen. name                     30           var       0
    proxy authen. challenge                31           var       0
    proxy authen. ID                       32             8       0
    proxy authen. response                 33           var       0
    random vector                          36           var       1
    private group ID                       37           var       0
    Rx connect speed                       38            10       0
    sequencing required                    39             6       1

                       Figure 4.45 ICCN AVPs

The framing type AVP is a 32-bit mask that describes the type of framing the LAC is using. If the lowest-order bit is set, the connection is using synchronous framing. If the next-lowest-order bit is set, the connection is using asynchronous framing.

The Tx connect speed AVP is a 4-byte value indicating the transmit speed—from LAC to remote host—in bits per second. Similarly, the Rx connect speed is the speed of the connection from the remote host to the LAC. When the Rx connect speed AVP is missing, the receive speed is assumed to be the same as the transmit speed.

If the LAC performs PPP LCP negotiation with the remote host, the initial received LCP CONFREQ, last received LCP CONFREQ, and last sent LCP CONFREQ AVPs help the LNS to understand what PPP parameters the remote host requested and what values the LAC and remote host finally agreed on. The LNS uses these values to set the corresponding parameters in its PPP implementation just as if it had negotiated them itself.

If the LAC performs authentication of the remote host for the LNS, the proxy authen. type, proxy authen. name, proxy authen. challenge, proxy authen. ID, and proxy authen.

response may be present. The proxy authen. type indicates the type of authentication that the LAC and remote host performed. The authentication types are shown in Figure 4.46.

    Type   Authentication Method
      0    reserved
      1    textual username/password exchange
      2    PPP CHAP
      3    PPP PAP
      4    no authentication
      5    Microsoft CHAP

                       Figure 4.46 L2TP Proxy Authentication Type

If the LAC performs proxy authentication, the proxy authen. type AVP must be present. The other parameters may or may not be present, depending on the authentication method the LAC and remote host used.

The private group ID AVP indicates that the LNS should treat this call as belonging to a particular customer group. The attribute value field is an arbitrary sequence of bytes. The sequencing required AVP indicates that sequence numbers must be used on data messages for this call. Recall that the LAC and LNS can use these sequence numbers to detect dropped or out-of-order messages.

We see these messages in action in lines 5–7 of our sample session:

 5  15:18:08.255780 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](52379/0)Ns=2,Nr=1
    *MSGTYPE(ICRQ) *ASSND_SESS_ID(40607) *CALL_SER_NUM(3) *BEARER_TYPE() (DF)


 5.1  4500 004c 17fd 4000 4011 ca54 ac1e 000f   E..L..@[email protected]....
 5.2  ac1e 0004 06a5 06a5 0038 6611 c802 0030   .........8f....0
 5.3  cc9b 0000 0002 0001 8008 0000 0000 000a   ................
 5.4  8008 0000 000e 9e9f 800a 0000 000f 0000   ................
 5.5  0003 800a 0000 0012 0000 0000             ............
 6  15:18:08.259860 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[TLS](48224/40607)Ns=1,Nr=3
    *MSGTYPE(ICRP) *RANDOM_VECTOR(4890def0d66a88ed4db0473642debca1) *ASSND_SESS_ID(20575) (DF)
 6.1  4500 004e aa62 4000 4011 37ed ac1e 0004   E..N.b@[email protected].....
 6.2  ac1e 000f 06a5 06a5 003a 8aa3 c802 0032   .........:.....2
 6.3  bc60 9e9f 0001 0003 8008 0000 0000 000b   .`..............
 6.4  8016 0000 0024 4890 def0 d66a 88ed 4db0   .....$H....j..M.
 6.5  4736 42de bca1 8008 0000 000e 505f        G6B.........P_
 7  15:18:08.260005 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](52379/20575)Ns=3,Nr=2
    *MSGTYPE(ICCN) *TX_CONN_SPEED(10000000) *FRAMING_TYPE(S) *RX_CONN_SPEED(10000000) (DF)
 7.1  4500 004e 17fe 4000 4011 ca51 ac1e 000f   E..N..@[email protected]....
 7.2  ac1e 0004 06a5 06a5 003a 85f4 c802 0032   .........:.....2
 7.3  cc9b 505f 0003 0002 8008 0000 0000 000c   ..P_............
 7.4  800a 0000 0018 0098 9680 800a 0000 0013   ................
 7.5  0000 0001 800a 0000 0026 0098 9680        .........&....

On line 5, linuxlt initiates a session by sending linux an ICRQ message. As expected, the session ID field in the header is 0 because linux has not yet assigned an ID for this session. We see, however, that linuxlt has assigned session ID 40607.

In line 6, linux responds with the ICRP message as expected. This time, the header's session ID field is filled in with the value that linuxlt sent in its ICRQ message. Just as with the tunnel ID, the sender places its peer's session ID in the header.

The session is established when linuxlt sends the ICCN message. At this point, PPP can begin its negotiation. We can see this starting in line 8 of the sample session:

 8  15:18:08.266317 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[](52379/20575)
    {Conf-Req(1), ACCM=00000000, Magic-Num=c49a6f1e, PFC, ACFC} (DF)
 8.1  4500 003a 17ff 4000 4011 ca64 ac1e 000f   E..:..@[email protected]....
 8.2  ac1e 0004 06a5 06a5 0026 7307 0002 cc9b   .........&s.....
 8.3  505f ff03 c021 0101 0014 0206 0000 0000   P_...!..........
 8.4  0506 c49a 6f1e 0702 0802                  ....o.....

As always, the L2TP message is encapsulated in a UDP datagram. The L2TP header, shown in boldface, has no flags set, so no optional fields are present (see Figure 4.26). After the header, we see the PPP packet starting with the address field. This is expected, as we are using synchronous framing as indicated in line 7.

Once the session is established and PPP negotiation is finished, we can send data over the session connection. Here is the result of a single ping and reply:

14  15:19:07.189282 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[](52379/20575)
    {192.168.122.2 > 192.168.122.1: icmp: echo request (DF)} (DF)
14.1  4500 0077 1815 4000 4011 ca11 ac1e 000f   E..w..@.@.......
14.2  ac1e 0004 06a5 06a5 0063 5b91 0002 cc9b   .........c[.....


14.3  505f 2145 0000 5418 1440 0040 01ad 40c0   P_!E..T..@.@..@.
14.4  a87a 02c0 a87a 0108 0085 b8b6 0b00 013b   .z...z.........;
14.5  16c1 3fd2 e102 0008 090a 0b0c 0d0e 0f10   ..?.............
14.6  1112 1314 1516 1718 191a 1b1c 1d1e 1f20   ................
14.7  2122 2324 2526 2728 292a 2b2c 2d2e 2f30   !"#$%&'()*+,-./0
14.8  3132 3334 3536 37                         1234567
15  15:19:07.193070 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[L](48224/40607)
    {192.168.122.1 > 192.168.122.2: icmp: echo reply} (DF)
15.1  4500 0079 aa7a 4000 4011 37aa ac1e 0004   E..y.z@[email protected].....
15.2  ac1e 000f 06a5 06a5 0065 dd2a 4002 005d   .........e.*@..]
15.3  bc60 9e9f 2145 0000 54aa 7900 0040 015a   .`[email protected]
15.4  dbc0 a87a 01c0 a87a 0200 008d b8b6 0b00   ...z...z........
15.5  013b 16c1 3fd2 e102 0008 090a 0b0c 0d0e   .;..?...........
15.6  0f10 1112 1314 1516 1718 191a 1b1c 1d1e   ................
15.7  1f20 2122 2324 2526 2728 292a 2b2c 2d2e   ..!"#$%&'()*+,-.
15.8  2f30 3132 3334 3536 37                    /01234567

The data portion of the hex dump does not appear to make sense, because PPP negotiated payload compression. We deleted the lines containing this negotiation from our sample session.

When the peers are finished with the data session, they can release its resources by tearing the connection down. In our sample session, linuxlt initiates the teardown by sending a CDN message, and linux ACKs the message with a ZLB, as shown on lines 16 and 17:

16  15:19:55.713922 172.30.0.15.l2f > 172.30.0.4.l2f: l2tp:[TLS](52379/20575)Ns=6,Nr=4
    *MSGTYPE(CDN) *RESULT_CODE(1/0 Bad file descriptor) *ASSND_SESS_ID(40607) (DF)
16.1  4500 0055 184f 4000 4011 c9f9 ac1e 000f   E..U.O@.@.......
16.2  ac1e 0004 06a5 06a5 0041 4bc2 c802 0039   .........AK....9
16.3  cc9b 505f 0006 0004 8008 0000 0000 000e   ..P_............
16.4  801d 0000 0001 0001 0000 4261 6420 6669   ..........Bad.fi
16.5  6c65 2064 6573 6372 6970 746f 7280 0800   le.descriptor...
16.6  0000 0e9e 9f                              .....
17  15:19:55.717628 172.30.0.4.l2f > 172.30.0.15.l2f: l2tp:[TLS](48224/40607)Ns=4,Nr=7 ZLB (DF)
17.1  4500 0028 aab4 4000 4011 37c1 ac1e 0004   E..(..@[email protected].....
17.2  ac1e 000f 06a5 06a5 0014 7712 c802 000c   ..........w.....
17.3  bc60 9e9f 0004 0007 0000 0000 0000        .`............

Outgoing Calls to a Remote Host

Just as the remote host can initiate a call through the LAC to the LNS, the LNS can initiate a call through the LAC to the remote host. As with incoming call requests, the call is established with a three-step handshake, as shown in Figure 4.47.

Figure 4.47 Outgoing Session Establishment (LNS and LAC exchange OCRQ, OCRP, and OCCN)

The first step in establishing an outgoing call is for the LNS to send an Outgoing-Call-Request (OCRQ) to the LAC. As we see from Figure 4.48, we have already encountered most of the AVPs in this message.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    assigned session ID          14             8       1        •
    call serial number           15            10       1        •
    minimum bps                  16            10       1        •
    maximum bps                  17            10       1        •
    bearer type                  18            10       1        •
    framing type                 19            10       1        •
    called number                21           var       1        •
    subaddress                   23           var       1
    random vector                36           var       1

                       Figure 4.48 OCRQ AVPs

The minimum bps and maximum bps AVPs specify the minimum and maximum speed in bits per second required for the outgoing call. Both attribute value fields are 32-bit integers. If the LAC is able to honor the call request, it responds with an Outgoing-Call-Reply (OCRP) informing the LNS of the session ID and physical channel, as shown in Figure 4.49.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    assigned session ID          14             8       1        •
    physical channel ID          25            10       0
    random vector                36           var       1

                       Figure 4.49 OCRP AVPs

After it has connected to the remote host, the LAC sends the LNS an Outgoing-Call-Connected (OCCN) message that notifies the LNS that the session is ready to proceed


and informs it of the connection parameters. The OCCN has three required and three optional AVPs, as shown in Figure 4.50.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    framing type                 19            10       1        •
    Tx connect speed             24            10       1        •
    random vector                36           var       1
    Rx connect speed             38            10       0
    sequencing required          39             6       1

                       Figure 4.50 OCCN AVPs

If the LAC can't complete the call, it will send a Call-Disconnect-Notify (CDN) message to the LNS. Similarly, if the remote host disconnects or otherwise terminates the session, the LAC will send a CDN to the LNS. The LNS can also terminate the session by sending a CDN to the LAC. The CDN has three required and two optional AVPs, as shown in Figure 4.51.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    result code                   1           var       1        •
    Q.931 cause code             12           var       1
    assigned session ID          14             8       1        •
    random vector                36           var       1

                       Figure 4.51 CDN AVPs

Except for the Q.931 cause code, we've seen all these AVPs before. The Q.931 cause code is used to report the cause of an unsolicited Integrated Services Digital Network (ISDN) disconnection. Details about the format of this AVP are in RFC 2661.

Other Control Messages

There are two other control messages that we haven't yet explored. The first is the WAN-Error-Notify (WEN) message, used by the LAC to report WAN errors to the LNS. RFC 2661 specifies that this message be sent only when an error occurs and not more than once every 60 seconds. The WEN message has two required and one optional AVP, as shown in Figure 4.52.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    call errors                  34            32       1        •
    random vector                36           var       1

                       Figure 4.52 WEN AVPs

The attribute value field of the call errors AVP consists of six cumulative counters for various types of errors, as shown in Figure 4.53.

    reserved (16 bits)
    CRC errors (32 bits)
    framing errors (32 bits)
    hardware overruns (32 bits)
    buffer overruns (32 bits)
    timeout errors (32 bits)
    alignment errors (32 bits)

                       Figure 4.53 Call Errors Attribute Value Field

Our final control message is the Set-Link-Info (SLI) message. RFC 2661 describes the SLI message as a mechanism for the LNS to inform the LAC of negotiated PPP parameters, but its only use is for the LNS to inform the LAC of the asynchronous control

character map (ACCM). Recall from Chapter 2 that each PPP peer can negotiate which bytes it requires its peer to escape. When it performs LCP negotiation, the LNS must inform the LAC of the ACCM, as it is the LAC that will perform the escaping. The SLI message has two required and one optional AVP, as shown in Figure 4.54.

    AVP                    Attribute Type   Length   M Bit   Required
    message type                  0             8       1        •
    ACCM                         35            16       1        •
    random vector                36           var       1

                       Figure 4.54 SLI AVPs

The attribute value field for the ACCM AVP comprises two 32-bit character masks and a reserved field, as shown in Figure 4.55.

    reserved (2 bytes) | send ACCM (4 bytes) | receive ACCM (4 bytes)

                       Figure 4.55 ACCM Attribute Value Field


The L2TP specification is given in RFC 2661 [Townsley, Valencia et al. 1999]. [Shea 2000] gives an excellent and thorough exposition of the operation of L2TP. This is the place to look for an explanation of the various state machines and other internal L2TP mechanisms.

L2TP Futures

At the time of this writing, the IETF L2TP Extensions Working Group (see ) is developing the specifications for the next version (L2TPv3) of L2TP. Whereas L2TP is designed to carry PPP packets over various media, L2TPv3 generalizes this facility to the notion of a pseudowire that can transport many types of interface-layer packets, such as PPP, Ethernet, frame relay, and ATM. In the context of TCP/IP, L2TPv3 will provide the pseudowire by tunneling the link-layer protocols over either UDP, as L2TP does, or directly over IP. Although tunneling over IP seems cleaner and imposes less overhead, there are advantages to using UDP. Tunneling directly over IP does not interoperate well with NAT or, more precisely, PAT, because no port numbers are available for NAT to remap. Thus, tunneling L2TPv3 packets over UDP can help with NAT traversal.

In addition to separating the PPP functionality from L2TP proper and adding the capability to tunnel interface-layer protocols other than PPP, L2TPv3 adds additional AVPs and enhances some of the data structures. For example, the session and tunnel ID fields are 32 bits in L2TPv3. Finally, L2TPv3 optionally extends the CHAP-like tunnel authentication mechanism to cover all packets in the control channel. To enable authentication, the peers exchange nonces during the tunnel establishment protocol.

A nonce is a number that is used only once. Typically, nonces are random numbers, but in some applications, counters or timestamps can be used as nonces. There is an excellent discussion of the various types of nonces and their uses in [Kaufman, Perlman, and Speciner 2002].

Each control message is authenticated by taking a cryptographic hash—currently, HMAC-MD5 and HMAC-SHA-1 are defined—of the concatenation of the two nonces, a shared key, and the message. That is,

D = H(LN || RN || K || m)

where D is the authentication digest, H is the hash function, LN is the local nonce, RN is the remote nonce, K is the shared key, and m is the message. The authentication digest is carried in an AVP attached to the message. As we saw in Chapter 3, these digests assure both sides that the message is from its peer and has not been altered. The addition of the two nonces ensures that the message is from this incarnation of the tunnel and prevents replay attacks using packets from previous tunnels.
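A sketch of that digest in Python, following the formula above literally (hashlib's MD5 stands in for H, and the nonces, key, and message are made-up values; a real L2TPv3 implementation uses the HMAC constructions named above):

import hashlib

def control_digest(local_nonce, remote_nonce, key, message):
    # D = H(LN || RN || K || m), as in the text.
    return hashlib.md5(local_nonce + remote_nonce + key + message).hexdigest()

print(control_digest(b"\x11" * 8, b"\x22" * 8, b"secret", b"control message body"))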


4.7 MPLS

Multiprotocol Label Switching (MPLS) is a way of tunneling IP datagrams, usually within and among autonomous systems. Actually, MPLS can be used to tunnel any type of network-layer packet—thus, multiprotocol. Because we are concerned mostly with tunnels in the Internet, we concentrate on MPLS's use with IP.

The idea is that a small label is inserted between the interface- and network-layer protocol headers, as shown in Figure 4.56, by the router at the entrance to the tunnel. Downstream routers use the label to make routing decisions, and do not need to consult the network-layer header at all. Thus, like all tunnels, MPLS treats the encapsulated IP datagram as opaque data and does not access it in any way while it's in the tunnel.

    link-layer header | MPLS label | IP header | data

                       Figure 4.56 MPLS Label Encapsulation

According to the MPLS Resource Center's FAQ (frequently asked questions) at , MPLS was originally envisioned as a way to perform network-layer routing at the interface layer—that is, at ATM or frame relay—speeds. MPLS is now perceived as a means of providing traffic engineering, including quality of service; ATM-like virtual circuits at the network layer; interface-layer tunnels, such as Ethernet over IP/MPLS; a special type of VPN for enterprises that provides security comparable to that obtained with frame relay or ATM circuits; and a generally simplified network architecture where, for example, many ATM control plane functions are migrated to the network layer.

After looking briefly at how MPLS tunnels work, we then see how they can be used to provide a type of VPN. Because MPLS is usually implemented at the network service provider—that is, the autonomous system—level, most of us will not come into direct contact with it, but it does provide an interesting example of tunneling and is well worth our study.

The MPLS Architecture

As we've seen, routing in the Internet is a distributed process. Each router makes an individual decision as to the next hop for an IP datagram, based on network topology information learned from its neighboring routers. Except for the immediately preceding and next hops, routers don't usually know where a packet has been or the path it will follow to its destination. This is a powerful architecture because it is very resilient to changes in the network. If a downstream router fails after a packet has started its


journey, the network is usually able to route around the failure by finding an alternative path for the packet.

The Internet routing model has shortcomings, however. As line speeds increase into the 10 and 100 gigabit per second range, it becomes more and more difficult for routers to make the forwarding decisions quickly enough to keep the links running at capacity. One of the difficulties is the time it takes to examine the IP header at each hop.

Another problem is traffic engineering and quality of service (QoS). Service providers might want to use different paths for different types of traffic. As a trivial example, interactive traffic, such as a telnet or ssh session, would be better routed over low-delay links, whereas ftp or Web page downloads might be better routed over high-speed links even if the average delay is greater. A more complex consideration is QoS: A provider might decide to sell specialized service at a premium. For example, a provider might guarantee a customer a minimum or average dedicated bandwidth through its network at a cost depending on the bandwidth the customer requires. To deliver these types of services, network providers must segregate the IP traffic into distinct classes and treat those classes differently. This exacerbates the problem of examining the IP headers quickly enough to make forwarding decisions at high line speeds, of course, but it also requires information, such as customer identification, that intermediate routers won't have.

MPLS addresses these problems by assigning each IP datagram to a forwarding equivalence class (FEC). Class membership can be based on a simple criterion, such as destination network, or a more complex set of criteria involving service-level contracts with a customer, destination, protocol, TCP or UDP port numbers, time of day, and other policy constraints. Each packet in a FEC will follow a predetermined path, called a label switched path (LSP), through a series of MPLS-aware routers. An MPLS-aware router is called a label switching router (LSR). The assignment of an IP datagram to a FEC takes place at the ingress router—the first LSR in the path that the packet will follow. Subsequent LSRs in the LSP do not need to reclassify the packet; they merely forward it based on the label, as described next.

Routing with MPLS

A packet is assigned to a particular FEC by the ingress router. Each router has a label associated with each FEC it handles, but these labels are not unique among routers. That is, each LSR will have its own label for a given FEC. This label is purely a local matter except that a router's immediate upstream neighbors—routers that deliver packets directly to the LSR in question—will also know it.

Before describing the routing process, we need one more fact: Packets can carry multiple labels arranged in a stack. Only the label at the top of the stack is used to route the packet. Other labels play no part until the top label is removed (popped) by one of the routers. We'll see how this mechanism is used later, when we discuss MPLS tunnels and VPNs.

The label is shown in Figure 4.57. The label comprises a 20-bit label value, an 8-bit TTL that's used to prevent routing loops, just as the TTL value in IP datagrams is, an S bit that, when set, indicates the bottom of the stack, and a 3-bit experimental field.

    label value (20 bits) | Exp (3 bits) | S (1 bit) | TTL (8 bits)

                       Figure 4.57 The MPLS Label

The Exp field will probably be used to provide differentiated services for MPLS. RFC 3270 [Faucheur, Wu, Davie, Davari, Vaananen et al. 2002] discusses this in detail.
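A label is easy to pack and unpack with shift-and-mask arithmetic. The Python sketch below uses the field positions from Figure 4.57; the sample values are arbitrary:

def pack_label(value, exp, s, ttl):
    # 20-bit label value, 3-bit Exp, 1-bit S (bottom of stack), 8-bit TTL.
    return (value << 12) | (exp << 9) | (s << 8) | ttl

def unpack_label(word):
    return {
        "value": (word >> 12) & 0xFFFFF,
        "exp":   (word >> 9) & 0x7,
        "s":     (word >> 8) & 0x1,
        "ttl":   word & 0xFF,
    }

word = pack_label(value=12, exp=0, s=1, ttl=64)
print(hex(word), unpack_label(word))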

The encoding of the label and its use in routing is discussed in RFC 3032 [Rosen, Tappan, Fedorkow, Rekhter et al. 2001]. The use and processing of the label's TTL field is discussed in RFC 3443 [Agarwal and Akyol 2003].

When a packet arrives at an LSR, the router retrieves the label and uses it as an index into a local database. An entry in the database is called a next hop label forwarding entry (NHLFE). As indicated in the diagram below, labeled packets are mapped to one or more NHLFEs by using the label as an index into the incoming label map (ILM), which in turn points to the appropriate NHLFE or NHLFEs. Unlabeled packets are mapped to one or more NHLFEs by analyzing the packet to determine its FEC and then using the FEC and the FEC-to-NHLFE (FTN) map.

    labeled packet   --> ILM --> NHLFE
    unlabeled packet --> FTN --> NHLFE

Each entry of the database contains at least a next hop and a label-forwarding action. Possible label actions are:

• Swap—replace the label at the top of the stack with another label
• Pop—pop the label stack, removing the top label and exposing the next label, if any
• Replace—replace the label at the top of the stack with another label, and then push one or more additional labels onto the top of the stack

As an example of a typical routing event, let’s suppose that LSR Ri receives a labeled packet. Ri first uses the label at the top of the stack to retrieve the appropriate entry in the label forwarding database. If we further suppose that the label action is to swap labels, Ri will replace the label with a new label specified by Ri+1, the router at the next hop. Ri then forwards the packet to the next hop indicated in the database entry—Ri+1 in this case. Note that Ri did not have to consult any network-layer headers to make this routing decision; the label determined both its actions and the next hop.

These actions are repeated at each hop along the label switched path until the last router in the LSP is reached. The entire process is illustrated in Figure 4.58, where an IP datagram enters the LSP at ingress router LSR1 and is label switched through the LSP until it leaves the tunnel at router LSRn. While the datagram is in the MPLS tunnel, only the label is used for routing; the IP datagram itself is treated as opaque data.

Figure 4.58 An IP Datagram Traversing an LSP: the datagram enters the label switched path at ingress router LSR1, is carried with a label (12, then 7, ..., then 55) that is swapped at each LSR along the path, and leaves LSRn as a plain IP datagram.

When the final router in the LSP—the egress router—receives the packet, it pops the label from the stack. If there are further labels in the stack, the top label is used to make a routing decision as described in the label-swapping case; if the label stack is now empty and this is not the final destination for the packet, the router examines the network-layer header and makes its forwarding decision in the normal (non-MPLS) way. Sometimes, the router just before the egress router will pop the label off the stack; this is called penultimate hop popping. This makes sense because the only use the egress router will make of the label is to notice that it should be popped. By having the previous router pop the label, the egress router is saved the label lookup overhead. This doesn’t shift the cost of label processing to the previous router, because it has to perform one of the label actions in any case.
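The per-packet work at a transit LSR thus reduces to an exact-match lookup and a label rewrite. The following fragment is our own simplified sketch, not code from any real router or from this book; it models the ILM as a plain array indexed by the incoming label value and handles only the swap action:

    #include <stdint.h>
    #include <stddef.h>

    enum label_action { ACT_SWAP, ACT_POP, ACT_PUSH };

    /* A minimal next hop label forwarding entry (NHLFE). */
    struct nhlfe {
        int               next_hop_if;  /* outgoing interface          */
        enum label_action action;       /* what to do with the stack   */
        uint32_t          out_label;    /* label to swap or push, if any */
    };

    #define MAX_LABEL 1048576               /* 2^20 possible label values */
    static struct nhlfe *ilm[ MAX_LABEL ];  /* incoming label map; a real
                                               router would use a denser
                                               structure                 */

    /* Forward one labeled packet whose top-of-stack label value is
     * *top_label: look up the label, rewrite it, and return the
     * outgoing interface (-1 if no binding exists). */
    int forward_labeled( uint32_t *top_label )
    {
        struct nhlfe *e = ilm[ *top_label & 0xfffff ];

        if ( e == NULL )
            return -1;                  /* no binding; drop the packet */
        if ( e->action == ACT_SWAP )
            *top_label = e->out_label;  /* rewrite the top label       */
        /* ACT_POP and ACT_PUSH handling omitted for brevity. */
        return e->next_hop_if;
    }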

Label Distribution Protocols

In order for MPLS to operate, LSRs must inform their neighbors of the label/FEC bindings that they are using. That is, before router Ri can forward a packet belonging to FEC F to router Ri+1, it must know what label Ri+1 has assigned to F. The routers exchange label information by means of a label distribution protocol (LDP). Several label distribution protocols have been proposed, including, confusingly enough, one called LDP. Some of these protocols are extensions to existing protocols, such as BGP [Rekhter and Rosen 2001]; others, such as LDP [Andersson, Doolan, Feldman et al. 2001], are new protocols designed specifically for MPLS.

There are two methods for distributing labels: unsolicited downstream mode and downstream on demand mode. In unsolicited downstream mode, a router informs its immediate upstream neighbors of its label bindings without being asked. In this mode, the downstream router will send its neighbors label/FEC bindings when a label changes or when it first comes online. In downstream on demand mode, the downstream router does not distribute label information unless it’s asked. In this mode, the upstream router will request a label binding for a particular FEC whose packets it wants to forward to the downstream router as the next hop. A router can use either or both modes, depending on circumstances. The MPLS architecture document (RFC 3031 [Rosen, Viswanathan, and Callon 2001]) discusses these modes in detail.

If the upstream router has no use for a label—it doesn’t use the downstream router as the next hop for the FEC that the label is bound to—it may either discard it (conservative label retention mode) or keep it for possible later use (liberal retention mode). Generally, routers that have the necessary memory will use liberal retention mode because it saves requesting the label later if it’s needed. Some routers, such as ATM-based LSRs, do not have the necessary memory and use conservative retention mode.

MPLS Tunnels

We have been discussing MPLS as another example of tunneling technology, but not all authorities treat MPLS this way. The exercises ask you to justify our treatment of MPLS in terms of tunneling by considering our definition of tunnels and verifying that it applies to MPLS. Aside from the question of whether MPLS itself is an example of tunneling, there is a notion of tunnels in MPLS. As we shall see, these MPLS tunnels certainly satisfy our definition of a tunnel.

To motivate the idea of this new type of tunnel, let’s consider an example of their use from RFC 3031. Suppose that we have an autonomous system that functions purely as a transit AS. That is, it carries traffic for other autonomous systems, but traffic does not originate or terminate within the AS. Figure 4.59 shows this AS with four border routers B1, ..., B4 and four internal routers I1, ..., I4.

Figure 4.59 An MPLS Tunnel: a transit AS with border routers B1-B4 and internal routers I1-I4. The label switched path from B1 to B3 runs through I1, I4, and I3; the label stacks along the path are (7, 3), (4, 3), (9, 3), and finally just (3) on the last hop to B3.

Let’s suppose that border router B3 is advertising a route for network N and that an IP datagram with N as its destination enters the AS at router B1. As we saw in Chapter 2, the border routers will be running an exterior routing protocol—BGP, say—and will be advertising to each other the external routes that they learn from their external BGP peers. The internal routers will be running an interior routing protocol, such as OSPF, and will learn the topology of the autonomous system’s network from each other, and the external networks that they can reach from the border routers.

In the normal situation as described in Chapter 2, the internal routers will be aware of external as well as internal routing information. Thus, for example, I3 will know that any packets destined for network N should be sent to B3. This means that the forwarding tables of the internal routers must contain all the external routes, even though they will never originate traffic to any of those external networks.

Let’s see how MPLS can solve this problem. First, it’s clear that if the border routers were directly connected, there would be no problem: When the IP datagram destined to network N entered the AS at B1, the router would simply send it to its next hop, B3. Recall that one way of thinking about a tunnel is that it simulates a direct wire between two nodes. This suggests that we set up tunnels between the border routers. If such tunnels were in place, the internal routers wouldn’t need to know anything about external networks or routes. They would merely need to know how to carry tunneled traffic among the border routers; the actual traffic, including the fact that the payload is a packet destined for network N, would be opaque to them.

The MPLS label stack mechanism makes it easy to establish tunnels between two LSRs. For the rest of this example, let’s assume that all the routers in the AS are LSRs. When it learns reachability information for network N from an external peer, router B3 assigns an MPLS label to the route and attaches it as a path attribute (see Chapter 2) to the rest of the routing information about N that it distributes to the other border routers. This process and the attribute formats are specified in RFC 3107 [Rekhter and Rosen 2001].

When B1 receives the IP datagram, it looks up the route for network N and discovers that the BGP next hop is router B3 and that B3 has assigned the label 3, say, to the route. Thus, B1 pushes the label 3 onto the MPLS label stack. Next, B1 again consults its routing table and learns that the next hop for router B3 is internal router I1 with a label of, say, 7. If we assume that the LSP between B1 and B3 is B1→I1→I4→I3→B3, we get the LSP and label stacks shown in Figure 4.59. I1 receives the packet from B1, swaps the label 7 at the top of the stack with a label of 4, and sends the packet to its next hop, I4. This process continues until the packet gets to I3, where the label action is to pop the label stack—because of penultimate hop popping—and the next hop is B3. The packet travels from I3 to B3 with the single label of 3. B3 uses this label to look up the proper forwarding information for the packet and sends it on its way. B3 might pop the final label from the stack, or it might swap it for another label if its neighboring AS is also using MPLS and the two autonomous systems have agreed to exchange labeling information.

Note how the internal routers don’t need to carry any routing information about external networks; they are concerned only with routing packets within the AS. Also note how easy it is to apply traffic engineering or QoS with this topology. Suppose, for example, that the AS manager wants to give traffic from a certain external autonomous system priority. If we again assume that the traffic enters at B1 and is destined for network N, B1 can assign the packet to a FEC different from the normal one for the tunnel and thus assign it a different label. This other tunnel may take a different—faster or better in some other way—path to B3. The only thing that changes from our example is that the labels at the top of the stack will be different. The bottom label will still be a 3, indicating to B3 that this packet is ultimately destined for network N.

MPLS VPNs

If we generalize the previous tunnel example slightly, we will see how MPLS can provide a special type of VPN that simulates a leased line—a frame relay or ATM leased circuit—that has many of the isolation and security features that leased lines do. The idea is to isolate traffic among a customer’s various sites by carrying it in MPLS tunnels and to keep routing information for a customer’s sites completely separate from any other customer’s routing data.

Let’s assume that we have a service provider with two fictional customers: Acme Widgets and Ajax Gadgets. Acme has offices in New York, Dallas, and Los Angeles; Ajax has offices in New York, Chicago, and Dallas. Both Acme and Ajax wish to connect their various offices with private networks. The traditional way of doing this is with leased lines, but this can be very expensive. Instead, Acme and Ajax arrange to use their provider’s MPLS network to obtain an equivalent, but cheaper, solution.

Our example assumes that only a single provider is involved, but two or more providers can cooperate in offering MPLS VPN services to their customers as long as they are all MPLS networks and are willing to share label information. For further information, see RFC 2547 [Rosen and Rekhter 1999], which discusses MPLS VPNs in detail.

As shown in Figure 4.60, each customer site accesses the provider’s network with a customer edge (CE) router that connects to a provider edge (PE) router. The interior topology of the provider’s network consists of provider (P) routers. P routers do not connect directly to customer routers. The terminology we are using (CE, PE, and P routers) is from RFC 2547.

The PE routers act as border routers, and as we shall see, run BGP just as the border routers B1, ..., B4 in Figure 4.59 did. One of the nice features of MPLS VPNs is that customers may number their networks any way they like and may even use the same addresses as another of the provider’s customers. For specificity, let’s assume that Acme and Ajax have assigned their networks the addresses shown in Figure 4.61.

Figure 4.60 MPLS VPNs: CE1 (Acme) in Los Angeles attaches to PE1, CE2 (Ajax) in Chicago to PE2, CE3a (Acme) and CE3b (Ajax) in New York to PE3, and CE4a (Acme) and CE4b (Ajax) in Dallas to PE4; the provider core consists of P routers P1-P5.

                 Acme             Ajax
New York         10.1.0.0/16      10.0.0.0/16
Dallas           10.2.0.0/16      192.168.1.0/24
Chicago          -                10.1.0.0/16
Los Angeles      10.3.0.0/16      -

Figure 4.61 MPLS VPN Example Customer Site Addressing

This example has some features that we should notice. First, the Acme New York site and the Ajax Chicago site have the same network address (10.1.0.0/16), so the provider’s network will have to have some way of distinguishing them. Second, all the customers’ sites are numbered with nonroutable addresses. Ordinarily, this would require NAT, but just as with our previous IP-in-IP example, the fact that the traffic to and from the sites will be tunneled makes this unnecessary. Finally, although Acme and Ajax do not share a common VPN in this example, it is possible that two or more customers that have a business relationship would require a secure way to exchange information. In that case, the MPLS VPN architecture allows a VPN to be set up among some or all of the separate companies’ sites. In order to set up an intercompany VPN, none of the sites involved can have the same network address, of course. Sites that are not involved in the common VPN can still have the same address as a site in another company’s intracompany VPN.

In order to understand how an MPLS VPN operates, let’s focus on Acme. The goal is to make the three offices appear to be on a common network. The illusion is that customer edge routers CE1, CE3a, and CE4a are connected with a mesh of leased lines, as shown in Figure 4.62.

Figure 4.62 The MPLS VPN as Seen by Acme’s Sites: CE1 in Los Angeles (10.3.0.0/16), CE3a in New York (10.1.0.0/16), and CE4a in Dallas (10.2.0.0/16) appear to be connected by a full mesh of leased lines.

One significant aspect of Figure 4.62 is that the Acme network is completely isolated from any other network. Although the traffic in our example is neither encrypted nor authenticated, the isolation ensures that traffic cannot be intercepted and that outsiders cannot inject forged packets into the network. When leased lines are used, the difficulty in breaching the physical security of the provider’s network is what provides the security. As we shall see, in the case of MPLS VPNs, it is the isolation of routing data that provides the security. Because attackers do not know how to route to the customer’s sites, they cannot interfere with the customer’s traffic. This assumes the security of the MPLS routers, of course. See [Dreyfus 1997] for an example of what happens when routers are compromised.

MPLS VPNs are implemented almost entirely in the provider edge routers. To see how this works, let’s further narrow our focus to PE3 , the New York provider edge router. PE3 is a routing peer with both Acme’s CE3a and Ajax’s CE3b customer edge routers, so in order to maintain isolation, PE3 has separate routing tables for Acme and Ajax. In our simple example, this routing information consists of a single network, but the network topology behind a customer edge router could be arbitrarily complex, and a provider edge router could learn routing information for several networks from each of its customer edge routing peers. Similarly, a customer may have more than one VPN in service, which would generally result in separate routing tables for each VPN. For clarity, we are ignoring these possibilities in our example.


As we indicated earlier, the provider edge routers are BGP speakers that exchange routing and label information among themselves, just as in the MPLS tunnel example of Figure 4.59. As PE3 learns routes from the customer edge routers, it advertises them to its BGP peers: PE1, PE2, and PE4. PE3 augments each route with a 64-bit route distinguisher that is used to distinguish routes from two customers with identically numbered networks. PE3 also adds a VPN target, or route target, that identifies the particular VPN that the route applies to. The VPN target enables PE3’s BGP routing peers to assign the route to the correct routing table. PE3 will, of course, also assign a label to the route, just as in the case of MPLS tunnels.

Let us assume that PE3 assigns the label 18 for the route to the Acme 10.1.0.0/16 network. The ‘‘10.1.0.0’’ will be augmented to 96 bits by the route distinguisher so that the other provider edge routers can distinguish it from the Ajax Chicago network of the same number. Finally, a VPN target will be assigned to the route to indicate that this route is for the Acme VPN. PE3 will advertise this information to the other provider edge routers in the usual BGP way. When the other provider edge routers receive this information, they will use the VPN target attribute to assign it to one of their routing tables.

Now suppose that the Dallas Acme office wants to send an IP datagram over the VPN to the New York office—10.1.0.15, say. When the packet arrives from the Acme CE router, the Dallas PE router (PE4) looks up the route for 10.1.0.15 in the Acme routing table and determines that the BGP next hop is PE router PE3 and that this packet should be sent through the VPN associated with the label 18. The 18 is pushed onto the label stack, and the next hop for PE3 is determined from a global routing table. The label for this next hop is pushed onto the label stack, just as it was in the case of the MPLS tunnel example. The packet travels through the provider’s network until it reaches PE router PE3, which pops the upper label if the previous router has not already done so and uses the label 18 to determine that the packet should be forwarded to CE3a, the Acme CE at New York. In most cases, PE3 can make the final routing decision based on the label alone and need not consult the IP datagram itself.

Just as with MPLS tunnels, the internal (P) routers need not know anything about the CE routers, the VPNs, or the routing information associated with them; they need merely know how to route traffic among the PE routers. This is an important advantage for service providers because it helps the network scale to a large number of VPNs. No single router—neither PE nor P router—need be aware of all the routes used within the system. The P routers need know only the internal network topology, and a PE router need be aware only of the routing information used for VPNs that terminate at that PE router.

We have examined only the rudiments of MPLS VPNs. Many more complicated configurations are possible. For example, a company could have one or more VPNs that all its sites take part in and other VPNs that reach only some of its sites. Similarly, two or more companies can arrange to have intercompany VPNs. [Pepelnjak and Guichard 2001] covers all these and other configurations in detail.
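Conceptually, then, each route a PE router advertises carries a route distinguisher, the customer prefix, an MPLS label, and a route target. The following structure is our own illustration of that tuple, not the actual BGP encoding, which the RFCs cited above define:

    #include <stdint.h>

    /* One VPN-IPv4 route as a PE router might advertise it. The 64-bit
     * route distinguisher plus the 32-bit prefix gives the 96-bit value
     * that keeps identically numbered customer networks distinct. */
    struct vpn_route {
        uint64_t rd;            /* route distinguisher                     */
        uint32_t prefix;        /* customer network, e.g., 10.1.0.0        */
        uint8_t  prefix_len;    /* e.g., 16                                */
        uint32_t label;         /* MPLS label assigned by the PE, e.g., 18 */
        uint64_t route_target;  /* identifies the VPN the route belongs to */
        uint32_t bgp_next_hop;  /* the advertising PE router               */
    };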

4.8 gtunnel

Sometimes, we want to build a type of tunnel that is not supported by the operating system, or we need to have finer control of the tunnel parameters than the operating system provides. In these cases, it is useful to be able to do our own encapsulation of packets as they leave the TCP/IP stack. In principle, we could do this by modifying the stack itself, but doing so has several disadvantages. First, it requires an intimate knowledge of the operating system’s kernel, its data structures, and its support routines. More serious, unless the operating system’s vendor can be persuaded to incorporate the changes into the base system, the modifications become a maintenance problem because they have to be reapplied with every new release of the operating system.

Fortunately, many operating systems—including FreeBSD, Linux, and Solaris—provide a facility that allows us to do the encapsulation easily in a user-space program. Our examples will use the FreeBSD tunnel driver, but Linux and Solaris offer essentially identical services through their TUN/TAP drivers. We refer to all these facilities as ‘‘tunnel drivers.’’

The tunnel drivers appear to the TCP/IP stack to be device drivers for a network interface device, such as an Ethernet card. Instead of encapsulating packets from the TCP/IP stack in, say, an Ethernet frame and passing the result to a physical device, the tunnel drivers deliver them to a user-space program, which performs further processing on the packet and then delivers it to the appropriate output device. Similarly, the user-space program can pass packets to the tunnel driver for delivery back up the TCP/IP stack. A typical use of a tunnel driver is illustrated in Figure 4.63, which shows a user application talking to the TCP/IP stack in the normal way. The packets that result move down the stack to the tunnel driver, which delivers them to a user-space program, labeled gtunnel, that performs further processing and delivers them to the output device.

A typical use of the tunnel driver is illustrated by the FreeBSD PPP implementation. Rather than implement PPP in the kernel, as most other systems do, FreeBSD provides PPP functionality with a normal user-space application program called pppd. The pppd program communicates with the TCP/IP stack through the tunnel driver and with the outside world, typically, through a serial port. Thus, pppd encapsulates IP packets from the stack in PPP frames and delivers them to a serial port for transmission to the remote system. PPP frames from the remote system arrive at the serial port and are read by pppd, which strips off the PPP framing and delivers the resulting IP packet to the TCP/IP stack through the tunnel driver.
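As an aside, on Linux the corresponding facility is the universal TUN/TAP driver. The following sketch is ours, not part of the book’s FreeBSD-based examples; it shows roughly how a TUN interface is opened there, assuming the standard /dev/net/tun device and an interface named tun0:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <net/if.h>
    #include <linux/if_tun.h>

    /* Open the Linux TUN driver and attach to (or create) interface tun0.
     * The returned descriptor carries raw IP packets, much like the
     * FreeBSD /dev/tun0 device used in the text. */
    int open_tun( void )
    {
        struct ifreq ifr;
        int fd = open( "/dev/net/tun", O_RDWR );

        if ( fd < 0 )
            return -1;
        memset( &ifr, 0, sizeof( ifr ) );
        ifr.ifr_flags = IFF_TUN | IFF_NO_PI;   /* IP packets, no extra header */
        strncpy( ifr.ifr_name, "tun0", IFNAMSIZ );
        if ( ioctl( fd, TUNSETIFF, &ifr ) < 0 )
        {
            close( fd );
            return -1;
        }
        return fd;
    }

The descriptor returned by open_tun could be used in place of the tun descriptor opened in the skeleton that follows.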

Building a gtunnel.c Skeleton

We can write our own programs that provide whatever processing and encapsulation are needed to implement a particular type of tunnel. To make this easier, we use the gtunnel.c skeleton shown in Figure 4.64. By providing the startup, inbound, and outbound functions, we can flesh gtunnel out to a complete tunnel implementation.


Figure 4.63 gtunnel and the Tunnel Driver: a user application sends data through the TCP/IP stack to the tunnel driver; the tunnel driver passes the resulting packets across the user/kernel boundary to the user-space gtunnel program, which processes them and delivers them to the output device.

1 #include "etcp.h" 2 int main( int argc, char **argv ) 3 { 4 fd_set allmasks; 5 fd_set rdmask; 6 int tun; 7 int infd; 8 int outfd; 9 int maxfd; 10 int rc; 11 12 13 14 15

INIT(); startup( argc, argv, &infd, &outfd ); tun = open( "/dev/tun0", O_RDWR ); if ( tun < 0 ) error( 1, errno, "couldn’t open tunnel driver" );

16 17 18 19

FD_ZERO( &allmasks ); FD_SET( tun, &allmasks ); FD_SET( infd, &allmasks ); maxfd = ( tun > infd ? tun : infd ) + 1;

20 21 22 23 24 25 26 27 28

for( ;; ) { rdmask = allmasks; rc = select( maxfd, &rdmask, NULL, NULL, NULL ); if ( rc h_addr; } *infd = *outfd = s;

23 }

Figure 4.67 The startup Function

The startup Function

10–12  We begin by allocating a socket. Because we want our outer IP packets to have a protocol type of IP-in-IP (4), we set the socket type to SOCK_RAW and the protocol type to IPPROTO_IPIP.

13–21  Next, we fill in remote with our peer’s IP address. If the address is given as a numerical address, the inet_aton function will convert it to a 32-bit integer and store it in remote. Otherwise, we use gethostbyname to look up our peer’s address.

22  Finally, we set both the file descriptors to the socket.

When the select call returns with a read event on the tun device, we call outbound (Figure 4.68) to process the packet.

    24  void outbound( int tun, int s )
    25  {
    26      int rc;
    27      char buf[ 1500 ];
    28      rc = read( tun, buf, sizeof( buf ) );
    29      if ( rc < 0 )
    30          error( 1, errno, "read from tun returned %d", rc );
    31      rc = sendto( s, buf, rc, 0, SOCKADDR( remote ),
    32          sizeof( remote ) );
    33      if ( rc < 0 )
    34          error( 1, errno, "write to socket returned %d", rc );
    35  }

Figure 4.68 The outbound Function

The outbound Function

28–30  We begin by reading the packet from the tunnel driver. Assuming that we’re running on bsd, this will be an IP packet with a source address of 192.168.1.1 and a destination address of 192.168.2.1.

31–34  We process this packet by simply sending it to our peer. As a result of the call to sendto, the TCP/IP stack will add an outer IP header with a source address of 172.30.0.1 and a destination address of 172.30.0.6, again assuming that we are running on bsd.

The inbound function (Figure 4.69) is only slightly more complicated. Because we specified SOCK_RAW for our socket, the packet will come to us with the outer IP header still in place. We must remove this header before sending the inner IP packet up the stack to the user application.


    36  void inbound( int tun, int s )
    37  {
    38      int rc;
    39      struct ip *ip;
    40      char buf[ 1500 ];
    41      rc = read( s, buf, sizeof( buf ) );
    42      if ( rc < 0 )
    43          error( 1, errno, "read from socket returned %d", rc );
    44      if ( rc == 0 )
    45          error( 1, 0, "EOF from peer\n" );
    46      ip = ( struct ip * )buf;
    47      ip = ( struct ip * )( buf + ( ip->ip_hl << 2 ) );
    48
    49      rc = write( tun, ip, ntohs( ip->ip_len ) );
    50      if ( rc != ntohs( ip->ip_len ) )
    51          error( 1, errno, "write to tun returned %d instead of %d", rc, ntohs( ip->ip_len ) );
    52  }

Figure 4.69 The inbound Function

The inbound Function

41–45  First, we read the packet from our peer as we normally would in any TCP/IP program. This packet will have the format shown in Figure 4.4.

46–47  We strip off the outer IP header by first setting ip to point at the outer packet and then incrementing the pointer by the number of bytes in the outer header. After line 47, ip points to the inner IP header.

49–51  We send the entire inner IP packet up the stack by writing it to the tun device. Notice that we get the size of the packet from the ip_len field of the IP header.

We can test our IP-in-IP tunnel by configuring the tun interfaces and starting ipip on bsd and laptop. For example, on bsd, we configure the tunnel interface using the normal ifconfig command and then start ipip:

    bsd# ifconfig tun0 192.168.1.1 192.168.2.1 up
    bsd# ./ipip laptop

In another window on bsd, we test our tunnel by pinging the remote interface:

    $ ping 192.168.2.1
    PING 192.168.2.1 (192.168.2.1): 56 data bytes
    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.466 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=1.464 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.454 ms
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=0.523 ms
    64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=0.405 ms
    ^C
    --- 192.168.2.1 ping statistics ---
    5 packets transmitted, 5 packets received, 0% packet loss
    round-trip min/avg/max/stddev = 0.405/0.662/1.464/0.403 ms


If we examine the tcpdump capture of one of these pings, we see the expected encapsulation. Except for the addresses, it is identical to that from Section 4.2 (the outer IP header is the first 20 bytes, 4500 0068 through ac1e 0006):

    1   16:47:06.482086 172.30.0.1 > 172.30.0.6:
            192.168.1.1 > 192.168.2.1: icmp: echo request (ipip-proto-4)
    1.1     4500 0068 17db 0000 4004 0a74 ac1e 0001    E..h....@..t....
    1.2     ac1e 0006 4500 0054 17da 0000 4001 de7c    ....E..T....@..|
    1.3     c0a8 0101 c0a8 0201 0800 b033 463e 0000    ...........3F>..
    1.4     4aef 3041 945a 0700 0809 0a0b 0c0d 0e0f    J.0A.Z..........
    1.5     1011 1213 1415 1617 1819 1a1b 1c1d 1e1f    ................
    1.6     2021 2223 2425 2627 2829 2a2b 2c2d 2e2f    .!"#$%&'()*+,-./
    1.7     3031 3233 3435 3637                        01234567

We’ve seen that by adding as little as 52 lines of code to our gtunnel skeleton, we can build a functional tunnel. Obviously, we could improve our tunnel by including code to track the tunnel health, to worry about MTUs, to provide diagnostics and other features, and so on (see Exercises 4.6 and 4.7, for example). The point is that gtunnel provides an infrastructure on which we can build arbitrarily complex tunnels.

4.9 Summary

In this chapter, we’ve studied tunneling technology in depth. We began with a tentative definition of tunneling; after looking at an example, we refined it to be the encapsulation of a protocol’s data in the payload of another protocol at the same or higher layer. We looked at two common examples: IP-in-IP and PPPoE tunnels. We saw how they encapsulate the tunneled data and, in the case of PPPoE, how the tunnel endpoints can negotiate tunnel parameters.

Next, we looked at GRE tunnels as a way of decreasing the complexity of implementing tunnels. Instead of the X × Y implementations required to tunnel X protocols in Y protocols, GRE serves as a generalized mechanism allowing one protocol to be tunneled in another.

We examined the PPTP and L2TP protocols and observed how they serve as a generalization of the traditional telco-/modem-based RAS system. Although users think of PPTP and L2TP as VPN technologies, we examined only their tunnel aspects and left their security features for later investigation. We noted that PPTP, essentially a Microsoft product, is being replaced by L2TP. Both PPTP and L2TP depend on PPP to frame the tunneled packets.

We discussed MPLS, first as an efficient routing mechanism and then as a way of providing a kind of VPN. MPLS uses a small label—or stack of labels—prepended to packets in order to make efficient routing decisions. We observed that MPLS is useful mainly within an autonomous system but that it can be used across autonomous systems if their administrators agree and coordinate labels.

Finally, we saw how we can use the gtunnel skeleton to build our own tunnels, whether or not they are supported natively by the operating system. We used gtunnel to build another version of the IP-in-IP tunnel.


Exercises

4.1  Suppose that host A is multihomed, with one interface accessible to the Internet and the other interface on an enterprise’s home network. Suppose further that host A is configured to forward IP packets from one interface to the other. In terms of a remote user wishing to access the home network, what advantages do PPTP and L2TP have over the remote user merely connecting to host A over the Internet using SSH, Telnet, or a similar protocol?

4.2  Why does L2TP need a random vector when encrypting hidden AVPs?

4.3  We said that MPLS tunnels IP datagrams or some other network-layer protocol. Explain how MPLS meets the definition of tunnel that we gave at the beginning of this chapter.

4.4  How do the interface protocols, such as Ethernet, know that an MPLS label is present? This is obviously important, as an Ethernet frame, say, will be demultiplexed differently if an MPLS label is present.

4.5  Use gtunnel to build a GRE tunnel between two hosts.

4.6  Suppose that we have two gtunnel-based IP-in-IP tunnels to a host H. Our ipip program uses raw sockets, so each instance of the program will receive a copy of every IP datagram sent to either tunnel. Modify ipip to account for this.

4.7  Add soft-state functionality, as discussed in Section 4.2, to the ipip program.

Part 2

Tunnels and VPNs


5

Virtual Private Networks

5.1 Introduction

In Part 2, we discuss virtual private networks (VPNs). Chapter 4 had some examples of VPNs, but we were interested mainly in their tunneling aspects and didn’t dwell on their security and authentication features. In this chapter, we define VPN and briefly revisit the VPNs from Chapter 4. In the rest of Part 2, we study several types of VPNs, see how they are used, and take note of their strengths and weaknesses. As we shall see, these VPNs can operate at any layer in the TCP/IP stack. As usual, we will be less concerned with the administrative details of configuring the VPNs than with developing an appreciation for the protocols themselves and the manifestation of those protocols on the wire.

Before beginning our discussion of VPNs, we should agree on a definition for them. We already have an implicit definition from our study of MPLS VPNs in Chapter 4. We might say that according to that definition, a VPN is a method of using tunneling to build a private overlay network on top of a public network in such a way that the security of the private network is equivalent to that provided by leased lines. But this definition suffers from a lack of precision as to the meaning of ‘‘security equivalent to that provided by leased lines’’ and is a bit too general for our purposes. Instead, let us say that a virtual private network is an overlay network built with tunnels in which the tunnel payloads are encrypted and authenticated. Given that we use robust encryption and authentication, such a VPN would certainly provide security as good as or better than that provided by leased lines, so this definition is consistent with, if more restrictive than, that for MPLS VPNs. The underlying notion of both definitions is that we are trying to create the illusion of a private network while using a public network, such as the Internet.


It’s worth dwelling, for a moment, on the differences between a ‘‘real’’ private network and a virtual private network. As a first approximation, we could say that real private networks provide security by physical separation of the underlying communication media. Separate leased lines are dedicated to the network, and these leased lines carry traffic only for that network. This means that short of a physical wiretap, an attacker does not have access to network traffic.

As we saw in our study of MPLS VPNs, the security of real private networks does not necessarily depend on the actual separation of physical media. We can also achieve segregation of network traffic through routing, or by multiplexing several data channels onto a single physical cable. MPLS VPNs are an example of providing a private network by using routing to ensure that a private network’s traffic is delivered only to intended recipients. Because the assignment of the MPLS label, and thus the route, takes place within the MPLS cloud, an attacker on the edge of the MPLS network has no way to capture traffic from another private network or to inject packets into it.

In a typical transcontinental leased-line deployment, the customer is provided with a partial or whole T1 line. Even if the entire T1 line is dedicated to a single private network, the traffic from the user’s T1 line is multiplexed onto a higher-bandwidth backbone, such as a T3 or OC4 line, for transport across the continent to the other endpoint’s T1 line. Thus, traffic from the private network is carried on the same physical media as traffic from other networks. Nevertheless, from the point of view of a user of the private network, this data is inaccessible and as a practical matter does not exist.

Despite the realities of the previous paragraphs, our normal conceptual model for a leased-line connection is a dedicated wire from one site to another. This model includes the notion of a physical connection, and when we’re told that the network is down, we imagine that a physical event, perhaps involving a backhoe, has taken place. A virtual private network, on the other hand, is just that: virtual. As with a TCP connection, a VPN’s tunnel is a purely notional construct consisting of shared state at the tunnel endpoints. When told that the VPN is down, our first thought is not that a cable has been cut but that the shared state has become desynchronized. Once one of the VPN’s packets enters the Internet, it is like any other IP datagram in the Internet. A malefactor can use a flooding attack to cause a router to drop it or can inject phony packets into the VPN by forging some of the packet’s header fields. To protect itself from these and other attacks, a VPN relies on encryption and authentication to secure its data.

The advantages of a VPN over an actual private network should be clear. Instead of expensive leased lines or other infrastructure, we can make use of the relatively inexpensive, high-bandwidth Internet. More important in many instances is the ubiquity of the Internet. In most developed areas, access to the Internet is readily available without special provisioning or long waiting times. Given a VPN with robust cryptographic primitives and protocols, we could argue that a VPN is, in fact, more secure than a dedicated leased line, even if we accept our conceptual model of such a line as real.

In our definition of VPN, we said that the tunnel payload is protected by encryption and authentication. As we study the various types of VPNs, we will see that the meaning of payload depends on the class of VPN. In Chapter 6, for example, we study SSL tunnels, which operate primarily at the application layer. Thus, the payloads that they encrypt and authenticate are usually application data. At the other end of the spectrum, tunnel-mode ESP in IPsec (Chapter 12) operates at the network layer, so its payloads are entire IP datagrams.

5.2 PPTP

We studied the tunneling aspects of PPTP in Chapter 4, where we viewed it as a type of remote access server that doesn’t require expensive capital expenditures on modem banks and telco lines. As we mentioned, users view PPTP primarily as a VPN technology—they perceive its main benefit as secure communications. In this section, we take a brief look at PPTP’s VPN properties.

If we reexamine the PPTP message types, we see that none of them deals with encryption or authentication. That’s because PPTP really is a tunneling protocol, not a VPN protocol. PPTP relies on the underlying PPP protocol for its encryption and authentication services: In practice, that means Microsoft’s Microsoft Point-to-Point Encryption (MPPE) [Pall and Zorn 2001] and Microsoft Challenge Handshake Authentication Protocol (MS-CHAP) [Zorn and Cobb 1998]. At first glance, these seem like reasonable cryptographic protocols. MPPE uses RC4, which we saw and remarked on in Chapter 3. MS-CHAP is Microsoft’s version of PPP CHAP, a typical challenge/response protocol.

Unfortunately, Microsoft’s implementation of these protocols has several problems. The MS-CHAPv1 protocol has several weaknesses that make recovery of the user’s password comparatively easy using a dictionary attack. MS-CHAPv2 fixes the worst of these weaknesses but is still susceptible to an attack in which a dictionary of N trial passwords can be checked at a cost of about N/2^16 attempts and some precomputation.

The version of MPPE used with MS-CHAPv1 has a fatal flaw: It uses the same RC4 key for both the PAC and PNS, resulting in reuse of the key stream. As we saw in Chapter 3, this leads directly to the recovery of both plaintext streams. With MS-CHAPv2, Microsoft changed MPPE to avoid the key stream reuse, but serious problems remain. The main problem is that the RC4 session keys are derived in a deterministic way from the user’s password and information passed in the clear. Thus, the keys have the same entropy as the password instead of the 128 bits that a randomly generated session key would have. Because user passwords are generally low entropy and, with MS-CHAP, susceptible to a particularly effective dictionary attack, PPTP cannot be considered to have robust encryption.

Finally, there is no per-packet authentication. This means that various bit-flipping attacks are possible on the encrypted data. We can ‘‘flip,’’ or change, any bit without detection when using RC4 and similar stream ciphers. If we know that a certain bit in a message is important for some reason, we can easily change it. Suppose, for example, that we know that the most significant bit of byte X in a message enables certain features that make it more difficult for an attacker to compromise the security of the system. If we know that the bit is turned on, we can turn it off by merely exclusive-ORing 0x80 with byte X of the ciphertext. Note that we don’t have to know the encryption key or even the plaintext value of byte X to do this: only that we want to change the most significant bit. It’s easy to see how we can generalize this to alter larger units of data if we know their plaintext values and positions in the data stream.
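To make the bit-flipping argument concrete, here is a toy program of our own (it has nothing to do with MPPE itself); it models a stream cipher as an XOR with a fixed keystream and shows that flipping one ciphertext bit flips exactly the corresponding plaintext bit:

    #include <stdio.h>
    #include <string.h>

    /* Toy "stream cipher": XOR the data with a keystream. Any real
     * stream cipher, RC4 included, reduces to this once the keystream
     * is fixed. */
    static void stream_xor( unsigned char *data, size_t len,
                            const unsigned char *keystream )
    {
        for ( size_t i = 0; i < len; i++ )
            data[ i ] ^= keystream[ i ];
    }

    int main( void )
    {
        unsigned char keystream[] = { 0x3a, 0x91, 0x57, 0xc4,
                                      0x08, 0x6e, 0xd2, 0x4b };
        unsigned char msg[ 8 ];

        memcpy( msg, "SECURE:1", 8 );
        stream_xor( msg, 8, keystream );  /* encrypt                     */
        msg[ 7 ] ^= 0x01;                 /* attacker flips one ciphertext bit */
        stream_xor( msg, 8, keystream );  /* receiver decrypts           */
        printf( "%.8s\n", msg );          /* prints SECURE:0; the same bit
                                             was flipped in the plaintext */
        return 0;
    }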

The PPP control protocols, such as LCP, are particularly vulnerable, and it may be possible to convince the client and server to use the older MS-CHAPv1 protocol, resulting in fairly easy user password recovery and compromise of the RC4 cipher stream. The fact that MS-CHAPv2 considerably strengthened the Microsoft version of PPTP notwithstanding, almost all experts recommend against using PPTP. The specifics of the MS-CHAP and MPPE weaknesses are detailed in [Schneier and Mudge 1998] and [Schneier, Mudge, and Wagner 1999]. The weaknesses we have been discussing are specific to the Microsoft implementation, of course, but this implementation is the preponderant one. Virtually all other PPTP implementations are written to interoperate with Microsoft’s, so they are likely to share the same weaknesses.

5.3 L2TP

L2TP has minimal built-in security. The LAC and LNS can authenticate each other during tunnel setup, and most AVPs can be encrypted, but L2TP, like PPTP, depends on PPP to protect the user data in the tunnel. This default security has several problems. Let’s put aside the problems with MPPE and MS-CHAP for the moment and assume that we have robust versions of CHAP and an encryption protocol installed in our PPP implementation. Unfortunately, we are still far from secure. Let’s take a look at some of the problems that remain.

First, the control channel, once set up, is unprotected except for any AVPs that are encrypted. This means that an attacker can easily disrupt the tunnel by, for example, sending a forged StopCCN message to the LAC or LNS. The attacker will have to know the tunnel ID and proper sequence numbers, of course, but they are easily obtained by snooping the tunnel or even by informed trial and error.

A parallel weakness exists in PPP. Although packets carrying user data are encrypted, control packets, such as LCP, CHAP, and IPCP packets, are not. This means that PPP can leak information, such as the internal IP addresses of the enterprise network. It also means that active attacks, such as sending a forged packet pointing to a compromised DNS server, are possible. Because PPP packets are not authenticated—whether or not encryption is in use—these types of attacks are particularly easy.

This last point is an important one. Because neither PPP nor L2TP messages are authenticated, they are subject to various manipulations. We must distinguish here between the endpoint authentication, which happens at L2TP tunnel setup and PPP session establishment, and message authentication. Endpoint authentication typically involves a CHAP-like mechanism to convince each side of the proposed connection that its peer is who he says he is. Message authentication, on the other hand, refers to providing each message with a message authentication code, such as one of the HMACs we discussed in Chapter 3, to guarantee that the message is from the peer, not a forged message injected into the message stream by an attacker, and that it has not been modified since the MAC was calculated. Message authentication is sometimes called message integrity to distinguish it from endpoint authentication. Similarly, MACs are sometimes called message integrity codes (MICs).

Those who are not well versed in security and cryptographic protocols often believe that encryption alone provides message authentication. After all, if the message is not from the peer, how did the attacker encrypt it? Similarly, how could an attacker alter a message’s plaintext without knowing the encryption key? Unfortunately, even encrypted messages are subject to manipulation by an attacker. Data that is encrypted with a stream cipher is subject to bit-flipping attacks, as we noted in our discussion of PPTP. Data encrypted with block ciphers is subject to cut-and-paste attacks, as we’ll see in Chapter 12. Thus, the lack of message authentication in L2TP is a serious security shortcoming. As we mentioned in Chapter 4, the forthcoming L2TPv3 will provide optional authentication for all messages in the control channel.

A consequence of L2TP’s lack of message authentication is that there is no replay protection. That is, an attacker can replay previous messages in order to confuse the protocol or end application. We would not, for example, want to allow a previous request for a bank fund transfer to be replayed. Replay attacks can easily be prevented by adding a sequence number to each message and then authenticating the message. Notice that the sequence number alone is insufficient, as the attacker could merely supply the expected sequence number of an unauthenticated message. It is the authentication that prevents such trivial attacks.

Another problem with L2TP and PPP encryption and endpoint authentication is that they are based on a single shared secret. There are two issues with this: First, this shared secret could be compromised in some manner outside of the protocol, and once compromised, all previous and future traffic can be read, and the enterprise network that it is protecting is subject to immediate attack. Second, the shared secret is long lived, allowing an attacker to accumulate a large amount of data encrypted with it. The more data that a cryptanalyst has, the easier it is to break the encryption and discover the key. VPNs generally have more robust key-management protocols that change keys often and prevent the accumulation of data that a cryptanalyst can work with. MPPE does this to a limited extent—the session key can change with every packet or after 256 packets—but because these keys are derived from a single shared secret and information sent in the clear, their security is no better than that of the shared secret. If the shared secret is compromised, all messages—past and future—can be read.

The ideal key-management protocol has a property called perfect forward secrecy (PFS). This means that each session key is independent and that the compromise of one such key does not compromise any of the others. Thus, our complaint about MPPE in the previous paragraph is that although it does change keys frequently, thereby making brute-force attacks more difficult, it does not enjoy the PFS property. This is a problem with PPP in general; none of the encryption protocols defined for use with it enjoy PFS or have robust key-management protocols. Indeed, as of this writing, they all depend on a shared secret.
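Returning to replay protection for a moment, here is a sketch of our own that authenticates a sequence number together with the message body, using OpenSSL’s HMAC function; it is an illustration only and not part of L2TP or PPP:

    #include <stdint.h>
    #include <string.h>
    #include <openssl/evp.h>
    #include <openssl/hmac.h>

    /* Compute an HMAC-SHA256 tag over (sequence number || message).
     * The receiver recomputes the tag, checks it, and accepts the
     * message only if its sequence number is larger than any it has
     * already seen. Assumes msglen is at most 1500 bytes. */
    static unsigned int tag_message( const unsigned char *key, int keylen,
                                     uint32_t seq, const unsigned char *msg,
                                     size_t msglen, unsigned char *tag )
    {
        unsigned char buf[ 4 + 1500 ];
        unsigned int taglen = 0;

        buf[ 0 ] = seq >> 24; buf[ 1 ] = seq >> 16;
        buf[ 2 ] = seq >> 8;  buf[ 3 ] = seq;
        memcpy( buf + 4, msg, msglen );
        HMAC( EVP_sha256(), key, keylen, buf, msglen + 4, tag, &taglen );
        return taglen;      /* 32 bytes for SHA-256 */
    }

Because the sequence number is covered by the MAC, an attacker can neither replay an old message nor renumber one without the forgery being detected.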


L2TP and IPsec

Because L2TP has such weak native security, many experts consider it a remote access technology rather than a VPN. On the other hand, users tend to think of it as a VPN technology and are interested primarily in protecting their communications. One way of reconciling these two views is to combine L2TP with an external security protocol. The most popular way of doing this is to run L2TP over IPsec (Part 3).

IPsec provides encryption, authentication, and other security services at the network layer. IPsec can run in several modes and provide differing security services, but for now, let us merely stipulate that IPsec can provide encryption and authentication for IP packets. In particular, we shall be interested in ESP transport mode (Chapter 12), in which the payload of IP datagrams is encrypted and authenticated while in transit between, in this case, the LAC and LNS. Figure 5.1 shows the encapsulation of the L2TP header and message within IPsec, UDP, and IP.

Figure 5.1 Encapsulation of L2TP Within IPsec: the packet layout is IP header, IPsec header, UDP header, L2TP header, L2TP message, IPsec trailer.

With this encapsulation, the UDP header, the L2TP header, the L2TP message, and parts of the IPsec header and trailers are encrypted and authenticated, giving complete protection for everything in the L2TP tunnel. That is, the control channel and all the data channels, including the PPP control protocols, such as LCP and IPCP, are protected from snooping and alteration. This solves the problem that we have with plain L2TP of leaving these protocols unprotected. Finally, ESP provides replay protection, so L2TP/IPsec is also safe from replay attacks.

Given a connection over an L2TP/IPsec tunnel from a remote host to a host on the enterprise network, it is important to understand what parts of the path between the two hosts are protected. In the most common case of the road warrior with a voluntary connection to the enterprise network, we have the situation shown in Figure 5.2. If we assume that the remote host is communicating with one of the servers on the enterprise network, we almost have an end-to-end VPN. The L2TP/IPsec tunnel ends at the LNS on the enterprise network, so the final hop to the server is unprotected, but this won’t matter for most applications, because the LNS and the server are both on the enterprise network, which is presumably protected by firewalls and other means from outside interference. Notice, though, that the connection is still subject to snooping or manipulation by a host inside the enterprise. If this is a concern, another VPN could be established between the LNS and the server.

Figure 5.2 An L2TP/IPsec VPN Between a Road Warrior and the Enterprise Network: the remote host (acting as the LAC) runs an L2TP/IPsec tunnel to the LNS on the enterprise network, behind which sit the servers.

Another typical application of an L2TP/IPsec VPN is shown in Figure 5.3. Here, a remote network, such as a branch office, is connected to the enterprise network through an L2TP/IPsec tunnel. We assume that a host on the remote network has a connection to a server on the enterprise network.

Figure 5.3 An L2TP/IPsec VPN Between Networks: hosts on the remote network reach their LAC, which runs an L2TP/IPsec tunnel to the LNS on the enterprise network, behind which sit the servers.

The security situation at the enterprise network is exactly like that in Figure 5.2: The connection is unprotected from the LNS to the server. The situation at the remote network is a little more complicated. If we assume that the host is talking to the LAC over a PPP connection, we can arrange that the data between the host and LNS is encrypted by PPP. That means, of course, that the data is doubly encrypted over most of the path between the host and the server, but more important, there is limited protection for the conversation on the remote network. Thus, such a connection would be resistant to snooping by a host on the remote network.

Other configurations are possible, of course, so it is important to evaluate the security of each leg in any proposed L2TP/IPsec VPN topology. If we trust the hosts and servers on the private networks, we needn’t be too concerned about the lack of security there. If we don’t trust them, we must take steps to ensure that any sensitive data is protected from snooping or alteration from within the private networks.

Microsoft uses the combination of L2TP/IPsec as its VPN solution (replacing PPTP), so we can expect to see this topology frequently. Nonetheless, not everyone agrees that this is a good solution. First, it reintroduces the NAT problem. Recall that L2TP uses UDP encapsulation rather than GRE, as PPTP does, and that the rationale for this was that it allowed L2TP to interoperate with NAT. When L2TP is combined with IPsec, the UDP header is encrypted and thus unavailable to NAT. This means that the most common NAT mode, PAT, cannot be used. Fortunately, this problem is being solved. The IETF is developing a standard for a technology, called NAT traversal (NAT-T), that allows IPsec to interoperate with NAT. The Microsoft implementation of L2TP/IPsec includes a version of NAT-T. We discuss NAT-T further in Chapter 14.

The other common complaint about L2TP/IPsec is that it’s a solution in search of a problem. Critics complain that L2TP not only adds overhead and does not scale well, but also fails to solve any problems that IPsec alone can’t solve. Microsoft and other L2TP adherents counter that L2TP enables the use of existing session authentication protocols, such as MS-CHAP, and makes use of PPP’s ability to assign IP addresses and DNS servers to the remote host. [Messmer 2000] discusses the pros and cons of L2TP/IPsec deployment. Chapter 10 of [Shea 2000] has an excellent summary of L2TP security concerns and a discussion of L2TP/IPsec. RFC 3193 [Patel, Aboba, Dixon et al. 2001] discusses securing L2TP with IPsec in detail.

5.4 Other VPNs

In the remainder of Part 2, we examine several other VPN technologies. We begin with examinations of the SSL/TLS (Chapter 6) and SSH (Chapter 7) protocols. Because these protocols operate at the application layer, some might consider them merely secure applications and not real VPNs. We will see, however, that they meet our definition for a VPN, and that we can, in fact, use them to build traditional network-to-network VPNs. Regardless of whether SSL/TLS and SSH are ‘‘real’’ VPNs, they solve the central problems of privacy, authentication, and key management that every VPN must address. By studying their solutions to these problems and by noting where they succeed and where they fail, we will gain a deeper appreciation for both the problems and their solutions. Indeed, we will see that the design sets we introduce in our examination of SSL/TLS and SSH are used again and again in other types of VPNs. For these reasons alone, our study of these protocols will pay handsome dividends.

Next, we introduce and study some lightweight VPN technologies. When we say that they are lightweight, we mean that they are simpler and easier to deploy than, for example, the more comprehensive IPsec protocols (Part 3). In some applications—especially ad hoc applications—using one of these protocols might make sense.

We begin with an examination of VTun, a very simple VPN that illustrates the difficulty of engineering robust security protocols. At the same time, VTun also illustrates the use of a common framework for building lightweight VPNs. Because of its simplicity, VTun exposes this framework in a way that makes it easy to see and understand. After VTun, we take a quick look at CIPE, a VPN running only on Linux and Windows NT. Because CIPE depends on a kernel module, porting it to other platforms is difficult. As we’ll see, CIPE solves some of the security problems in VTun but still has flaws.


Next, we examine tinc, a VPN using the same framework as VTun. We’ll see that it solves most of the problems that CIPE did not resolve. Tinc is interesting because it is designed as a network of VPNs, where a set of tinc gateway nodes securely connect a series of networks by maintaining encrypted tunnels between the nodes. Within this network, tinc manages routing and the decryption and reencryption of IP datagrams as they pass through intermediate nodes to their destination node.

Finally, we study OpenVPN, an excellent VPN that appears to offer security comparable to that of IPsec. OpenVPN achieves this by reusing the TLS/SSL protocol (Chapter 6) for endpoint authentication and key exchange, and by closely mimicking ESP (Chapter 12) for its data channel. Although it uses the same simple framework as VTun and tinc, OpenVPN provides robust security by leveraging the proven SSL and ESP protocols.

Again, the study of these VPNs will deepen our appreciation for the problems that all VPNs must solve. It will also help us to understand their limitations and enable us to make informed decisions as to whether they are appropriate for any given application.

5.5 Summary

In this chapter, we agreed on a definition of VPN and acknowledged that even with our definition, there can still be disagreement about whether a particular technology is a VPN or merely a secure application. We saw how this definition applies to PPTP and L2TP and examined the extent to which those protocols provide a robust VPN. We ended the chapter by providing a road map for the rest of Part 2. We indicated that we will study SSL, SSH, and four lightweight VPNs.

Exercises

5.1  Verify the statements concerning bit flipping in stream ciphers. Specifically, show that exclusive-ORing a 1-bit into the ciphertext changes the corresponding plaintext bit.

5.2  Would it make sense to reverse the L2TP/IPsec encapsulation? That is, what would be the advantages and disadvantages of establishing an IPsec connection between a host on a remote network and a server on the enterprise network, and then running that through an L2TP tunnel between the LAC and LNS? The topology would be similar to that in Figure 5.3, but IPsec would be encapsulated in L2TP messages instead of the other way around.

5.3  Make the argument that a VPN is more secure than a leased line.

5.4  Given a VPN with robust cryptographic primitives and protocols, or a dedicated leased line, what are the most vulnerable points in the network? That is, if a malefactor were tasked with compromising the network’s data, how should the attacker proceed?


6

Secure Sockets Layer

6.1

Introduction The most ubiquitous transport-layer tunneling protocol, by far, is the Secure Sockets Layer (SSL)—the protocol used to, among other things, secure HTML (Hypertext Markup Language) transactions on the Web. As we shall see, SSL has many applications and can easily be used to build general-purpose transport-layer tunnels. In this chapter, we examine the SSL protocol, watch its operation on the wire by means of the tcpdump and ssldump utilities, see how we can use it to build a tunnel between two programs—one or both of which need not be SSL-aware—and, finally, see how we can use it to build a VPN between two networks. The first SSL specification originated in 1994 at Netscape, which was interested in a way to secure certain transactions made with its Netscape Navigator Web browser. The first version was not released outside Netscape. Later that same year, the specification for version 2 of SSL (SSL 2) was released [Hickman 1995], and an implementation appeared in Netscape Navigator 1.1 early in 1995. Unfortunately, the SSL 2 protocol had security problems, and the Netscape implementation, in particular, had serious security flaws due to the way the pseudorandom number generator was seeded. The seeding method and the exploit that took advantage of it are described in [Goldberg and Wagner 1996]. Because the seed was calculated from the time of day, the UNIX process ID (pid), and parent’s process ID (ppid), Goldberg and Wagner were able to significantly reduce the number of possible keys to try. An attacker who had an account on the same machine and thus knew the user’s pid and ppid, could recover the key in about 25 seconds. According to [Rescorla 2001], even without direct access to the pid and ppid, Goldberg and Wagner were able to discover the key in about an hour.


At the same time, other vendors were producing their own implementations. One of these, Microsoft’s Private Communications Technology (PCT), addressed several shortcomings of SSL 2, had better security, and was backward compatible with SSL 2. In late 1995, Netscape released the version 3 (SSL 3) specification. SSL 3 was a complete rewrite of SSL, which included many of the features of PCT, added new cipher suites, and introduced a closure notification that prevented truncation attacks. Before SSL 3, it was possible for an attacker to send a forged TCP FIN to one or both sides, making it appear that less data was transmitted than actually was. The closure notification prevents this by sending what amounts to an EOF as part of the (authenticated) protocol. We’ll see the close notification in action shortly.

Most browsers and servers currently use SSL 3, although as we'll see, many browsers first try to negotiate an SSL 2 connection. The latest edition of the SSL 3 specification is [Freier, Karlton, and Kocher 1996]. This document was published as an Internet Draft but never became an RFC. Netscape continues to make it available as the best written specification of SSL 3.

We’ll discuss the security properties of SSL 3 shortly. In 1996, the IETF began an effort to standardize the SSL protocol. For political reasons, the new protocol was named the Transport Layer Security (TLS) protocol. TLS is based mostly on version 3 of SSL but with enough ‘‘minor ’’ changes to make it incompatible with SSL 3. The TLS specification was finished in 1999 and published as RFC 2246 [Dierks and Allen 1999]. At the time of this writing, SSL 3 remains the dominant protocol. Deployment of TLS has been slow, but its use is increasing. We use SSL to stand for both TLS and the various versions of SSL. We use the specific names SSL 2, SSL 3, and TLS when it’s necessary to distinguish which version we are speaking about.

6.2

Cipher Suites SSL makes use of three cryptographic functions. First, the two sides need a way of exchanging keying material with each other. Part of this key exchange can also provide authentication of the server. Second, there must be a method of encrypting the application data and other secured messages in the protocol. SSL supports several ciphers, both stream and block, for this purpose. Finally, each record transmitted must be authenticated. This is accomplished by adding a cryptographically secure message digest—or HMAC, see Section 3.4—to each record. Because one of the inputs to the HMAC will be a sequence number, replay attacks as well as data alterations can be detected. SSL supports many different combinations of these three functions, so the peers need a way of specifying which particular set of functions they support and will use. A cipher suite is a triple consisting of the key-exchange method, a cipher, and an HMAC. Figure 6.1 lists the standard cipher suites.


Cipher Suite                             Suite Code   Key Exchange     Cipher         HMAC

SSL_NULL_WITH_NULL_NULL                    0x0000     NULL             NULL           NULL
SSL_RSA_WITH_NULL_MD5                      0x0001     RSA              NULL           MD5
SSL_RSA_WITH_NULL_SHA                      0x0002     RSA              NULL           SHA
SSL_RSA_EXPORT_WITH_RC4_40_MD5             0x0003     RSA_EXPORT       RC4_40         MD5
SSL_RSA_WITH_RC4_128_MD5                   0x0004     RSA              RC4_128        MD5
SSL_RSA_WITH_RC4_128_SHA                   0x0005     RSA              RC4_128        SHA
SSL_RSA_EXPORT_WITH_RC2_CBC_40_MD5         0x0006     RSA_EXPORT       RC2_CBC_40     MD5
SSL_RSA_WITH_IDEA_CBC_SHA                  0x0007     RSA              IDEA_CBC       SHA
SSL_RSA_EXPORT_WITH_DES40_CBC_SHA          0x0008     RSA_EXPORT       DES40_CBC      SHA
SSL_RSA_WITH_DES_CBC_SHA                   0x0009     RSA              DES_CBC        SHA
SSL_RSA_WITH_3DES_EDE_CBC_SHA              0x000A     RSA              3DES_EDE_CBC   SHA
SSL_DH_DSS_EXPORT_WITH_DES40_CBC_SHA       0x000B     DH_DSS_EXPORT    DES40_CBC      SHA
SSL_DH_DSS_WITH_DES_CBC_SHA                0x000C     DH_DSS           DES_CBC        SHA
SSL_DH_DSS_WITH_3DES_EDE_CBC_SHA           0x000D     DH_DSS           3DES_EDE_CBC   SHA
SSL_DH_RSA_EXPORT_WITH_DES40_CBC_SHA       0x000E     DH_RSA_EXPORT    DES40_CBC      SHA
SSL_DH_RSA_WITH_DES_CBC_SHA                0x000F     DH_RSA           DES_CBC        SHA
SSL_DH_RSA_WITH_3DES_EDE_CBC_SHA           0x0010     DH_RSA           3DES_EDE_CBC   SHA
SSL_DHE_DSS_EXPORT_WITH_DES40_CBC_SHA      0x0011     DHE_DSS_EXPORT   DES40_CBC      SHA
SSL_DHE_DSS_WITH_DES_CBC_SHA               0x0012     DHE_DSS          DES_CBC        SHA
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA          0x0013     DHE_DSS          3DES_EDE_CBC   SHA
SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA      0x0014     DHE_RSA_EXPORT   DES40_CBC      SHA
SSL_DHE_RSA_WITH_DES_CBC_SHA               0x0015     DHE_RSA          DES_CBC        SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA          0x0016     DHE_RSA          3DES_EDE_CBC   SHA
SSL_DH_anon_EXPORT_WITH_RC4_40_MD5         0x0017     DH_anon_EXPORT   RC4_40         MD5
SSL_DH_anon_WITH_RC4_128_MD5               0x0018     DH_anon          RC4_128        MD5
SSL_DH_anon_EXPORT_WITH_DES40_CBC_SHA      0x0019     DH_anon          DES40_CBC      SHA
SSL_DH_anon_WITH_DES_CBC_SHA               0x001A     DH_anon          DES_CBC        SHA
SSL_DH_anon_WITH_3DES_EDE_CBC_SHA          0x001B     DH_anon          3DES_EDE_CBC   SHA
SSL_FORTEZZA_KEA_WITH_NULL_SHA             0x001C     FORTEZZA_KEA     NULL           SHA
SSL_FORTEZZA_KEA_WITH_FORTEZZA_CBC_SHA     0x001D     FORTEZZA_KEA     FORTEZZA_CBC   SHA
SSL_FORTEZZA_KEA_WITH_RC4_128_SHA          0x001E     FORTEZZA_KEA     RC4_128        SHA

Figure 6.1 Cipher Suites

The three suites using FORTEZZA_KEA are not supported by TLS and are virtually never used outside of government settings, as they require additional hardware.
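To make the table concrete, an implementation typically keeps a lookup table keyed by the 2-byte suite code it receives in the hello messages. The following sketch is ours, not taken from any SSL implementation, and shows only a few of the suites from Figure 6.1:

#include <stdint.h>
#include <stddef.h>

/* A few entries from Figure 6.1; a real table has one row per suite. */
struct suite {
    uint16_t    code;           /* 2-byte suite code from the hello messages */
    const char *key_exchange;
    const char *cipher;
    const char *hmac;
};

static const struct suite suites[] = {
    { 0x0004, "RSA", "RC4_128",      "MD5" },  /* SSL_RSA_WITH_RC4_128_MD5 */
    { 0x0005, "RSA", "RC4_128",      "SHA" },  /* SSL_RSA_WITH_RC4_128_SHA */
    { 0x000A, "RSA", "3DES_EDE_CBC", "SHA" },  /* SSL_RSA_WITH_3DES_EDE_CBC_SHA */
};

static const struct suite *find_suite( uint16_t code )
{
    size_t i;

    for ( i = 0; i < sizeof( suites ) / sizeof( suites[ 0 ] ); i++ )
        if ( suites[ i ].code == code )
            return &suites[ i ];
    return NULL;        /* unknown or unsupported suite */
}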

6.3

The SSL Protocol Many of the SSL messages and mechanisms deal with crypto export restrictions that no longer exist. Although now not required, these mechanisms continue to be supported in the name of backward compatibility. Because they all involve weakening the encryption in some way, they are of little interest to us, and we will not cover them. [Rescorla 2001] has a nice discussion of these mechanisms for those interested.


We begin by examining the exchange of packets in a typical SSL session. As a transport-layer protocol, SSL depends on a transport-layer protocol to carry its packets. In principle, there is no reason that SSL couldn’t run over UDP, but reliability would have to be built into the SSL protocol itself. To avoid these complications, SSL requires a reliable transport protocol to run over. In practice, this means TCP. In response to the need for a general protocol that protects UDP traffic, Rescorla and Modadugu have proposed the Datagram Transport Layer Security (DTLS) protocol. DTLS provides a retransmission timer for the SSL handshake but otherwise does not attempt to add reliability to the underlying transport. In particular, DTLS is insensitive to lost or reordered data messages. DTLS offers optional replay protection using the same method that IPsec implements in AH (Chapter 11) and ESP (Chapter 12). In early 2005, the DTLS proposal was still at the Internet Draft stage. The latest draft of the proposal is available from the Transport Layer Security Working Group’s Web site at .

As with TCP, we can think of an SSL session as having three stages: connection setup, data transfer, and connection teardown. In the first stage, the encryption, authentication, and compression algorithms are negotiated; the identity of the server and, optionally, the client is verified, and a key exchange takes place. In the second stage, the client and server exchange application data. These exchanges are encrypted and authenticated to ensure that the data cannot be read by third parties (encryption) and that third parties cannot alter the data without detection (authentication). When the applications have finished exchanging data, one or both of them send a closure notification as an EOF. Because the closure notification is authenticated, it can’t be forged by a third party. This prevents malevolent parties from forging a TCP FIN and truncating the data prematurely. The SSL 3 and TLS specifications require that both sides send closure notifications, but in practice, this is often ignored, and only one side sends it.

Basic SSL Packet Flow Figure 6.2 shows a typical SSL session. The first nine messages comprise connection establishment. In this phase, the client sends the server a ClientHello message that specifies which version of SSL it supports and lists the cipher suites and compression algorithms that it is willing to use. The server responds with three messages. As we’ll see later, the peers generally coalesce the messages in each step of the handshake, so these three messages will be sent as a single TCP segment.

The first, the ServerHello message, informs the client which cipher suite and compression algorithm the server has chosen. The second message is the server’s certificate. The certificate serves two purposes. First, it allows the client to verify the identity of the server. Second, it contains the


In the figure, the client first sends a ClientHello. The server answers with three Handshake messages: ServerHello, Certificate, and ServerHelloDone. The client then sends ClientKeyExchange, ChangeCipherSpec, and Finished; the server responds with its own ChangeCipherSpec and Finished. Application data then flows in both directions, and the session ends with an Alert: CloseNotify from each side.

Figure 6.2 Basic SSL Packet Flow

server ’s public key, which the client uses to encrypt a secret that will be used by both sides to generate the cryptographic keys needed for the session. Finally, the server sends a ServerHelloDone message. Because some SSL modes require additional server handshake messages, the ServerHelloDone packet serves to mark the end of the server’s hello sequence. At this point, the server has positively identified itself to the client, the client and server have agreed on cipher suites and


compression algorithms, and the client has the necessary key to securely send the server some key-generating material. In the next three messages, the client sends the server some key-generating material (ClientKeyExchange), tells the server that it will henceforth use the newly generated keys to encrypt and authenticate its messages (ChangeCipherSpec), and informs the server that it has completed its part of the handshake (Finished). The server responds with its own ChangeCipherSpec and Finished messages. Now the applications are ready for the data exchange phase. The data transfer looks like any other TCP data transfer except that the data in the TCP segments is encrypted and authenticated. When we watch this transfer with tcpdump, we’ll see that it is indistinguishable from any other TCP data transfer except that the application data appears to be random bits. The last messages are the closure notifications. At this point, data can no longer be transmitted, and the connection is shut down. The SSL Record Layer Before taking a detailed look at SSL in action, we need to understand the SSL record layer. All SSL messages are carried in SSL records that identify what type of messages are in the records, the length of the messages, and the version of SSL being used. Figure 6.3 shows the general format of these records. 0

type (1 byte) | major version (1 byte) | minor version (1 byte) | length (2 bytes)
data (length bytes)
SHA1 HMAC (160 bits)
padding, pad length

Figure 6.3 The SSL Record Format

The 8-bit type field identifies the type of message contained in this record. Figure 6.4 lists the four message types. The major and minor version numbers indicate what version of SSL is being used. SSL 3 has a major version of 3 and a minor version of 0. TLS has a major version of 3 and a minor version of 1. The length field gives the length in bytes of the data field. The length field does not include the HMAC, padding, or pad length fields.
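As a sketch of how a receiver might pull these fields out of the 5-byte record header, consider the following C fragment. The structure and function names are ours, not part of any SSL implementation:

#include <stdint.h>
#include <stddef.h>

struct ssl_record_hdr {
    uint8_t  type;      /* 20-23; see Figure 6.4 */
    uint8_t  major;     /* 3 for both SSL 3 and TLS */
    uint8_t  minor;     /* 0 for SSL 3, 1 for TLS */
    uint16_t length;    /* length of the data field only */
};

/* Parse the record header; return 0 if fewer than 5 bytes are available. */
static int parse_record_hdr( const uint8_t *buf, size_t n,
                             struct ssl_record_hdr *h )
{
    if ( n < 5 )
        return 0;
    h->type   = buf[ 0 ];
    h->major  = buf[ 1 ];
    h->minor  = buf[ 2 ];
    h->length = ( uint16_t )( ( buf[ 3 ] << 8 ) | buf[ 4 ] );  /* big endian */
    return 1;
}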


Record Type            Value   Description

CHANGE_CIPHER_SPEC       20    switch to the last negotiated cipher suite
ALERT                    21    error or CloseNotify messages
HANDSHAKE                22    hello and other connection-initiation messages
APPLICATION_DATA         23    data messages

Figure 6.4 SSL Record-Layer Message Types

The data field carries the message data. This field can be empty, as in the case of a ChangeCipherSpec message, or it can carry handshake data, alert messages, or application data. HMAC is a cryptographically secure message authentication code. This field is used to provide authentication. If any of the previous fields are altered, the receiver will calculate a different HMAC and detect the alteration. Because one of the keys generated during the key exchange is used as an input to the HMAC calculation, an attacker cannot adjust the HMAC in the message to escape detection. As we mentioned earlier, the input to the HMAC includes a sequence number that is used to prevent replay attacks. Figure 6.3 depicts the HMAC as being a 160-bit SHA1 digest. It is also possible to use an MD5 128-bit digest. The SSL 3 HMAC is slightly different from that described in RFC 2104 [Krawczyk, Bellare, and Canetti 1997], being based on an earlier draft of that RFC. TLS uses the standard HMAC as defined in RFC 2104.

The padding and pad length fields are used to pad the message to the block size of the encryption algorithm when a block encryption cipher, such as 3DES, is used. When a stream cipher, such as RC4, is used, these fields are not present. During the first part of the handshake, before the ChangeCipherSpec message, the NULL cipher suite is used, in which case there is no HMAC or padding. This is because the keys are not yet available to generate the HMACs or encrypt the message.
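As an illustration of the record MAC just described, here is a sketch of the TLS flavor of the computation (the SSL 3 variant differs slightly, as noted above). It uses OpenSSL's pre-1.1 HMAC interface; the buffer layout follows RFC 2246's description of the MAC input, and all of the names are ours:

#include <stdint.h>
#include <openssl/hmac.h>
#include <openssl/evp.h>

/* Compute the record MAC over seq_num || type || version || length || data. */
static unsigned int record_mac( const unsigned char *mac_key, int keylen,
                                uint64_t seq, unsigned char type,
                                const unsigned char *data, uint16_t len,
                                unsigned char *out )
{
    unsigned char hdr[ 13 ];
    unsigned int  outlen;
    HMAC_CTX      ctx;
    int           i;

    for ( i = 0; i < 8; i++ )               /* 64-bit sequence number */
        hdr[ i ] = ( unsigned char )( seq >> ( 56 - 8 * i ) );
    hdr[ 8 ]  = type;                       /* record type */
    hdr[ 9 ]  = 3;                          /* version 3.1 = TLS */
    hdr[ 10 ] = 1;
    hdr[ 11 ] = ( unsigned char )( len >> 8 );
    hdr[ 12 ] = ( unsigned char )len;

    HMAC_CTX_init( &ctx );
    HMAC_Init_ex( &ctx, mac_key, keylen, EVP_sha1(), NULL );
    HMAC_Update( &ctx, hdr, sizeof( hdr ) );
    HMAC_Update( &ctx, data, len );
    HMAC_Final( &ctx, out, &outlen );
    HMAC_CTX_cleanup( &ctx );
    return outlen;                          /* 20 bytes for SHA1 */
}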

6.4

SSL on the Wire Let’s look at some SSL exchanges. First, we look at a ‘‘normal’’ SSL session that corresponds to the packet flow depicted in Figure 6.2. Many of the SSL messages contain a great deal of information, so we examine them one at a time.

Handshake Messages SSL uses handshake messages to negotiate cipher suites and to exchange keying material. All handshake messages begin with a simple 4-byte header (Figure 6.5) that gives the handshake type and the length of the message, exclusive of the header. The values for the handshake types are given in Figure 6.6.


handshake type (1 byte)   message length (3 bytes)

Figure 6.5 The Handshake Header

Type    Handshake Type

 0      HELLO_REQUEST
 1      CLIENT_HELLO
 2      SERVER_HELLO
11      CERTIFICATE
12      SERVER_KEY_EXCHANGE
13      CERTIFICATE_REQUEST
14      SERVER_HELLO_DONE
15      CERTIFICATE_VERIFY
16      CLIENT_KEY_EXCHANGE
20      FINISHED

Figure 6.6 Handshake Types
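A minimal sketch (ours) of reading the handshake header's 24-bit, big-endian length from the 4-byte header of Figure 6.5:

#include <stdint.h>

/* hdr[0] is the handshake type; hdr[1..3] hold the 3-byte message length. */
static uint32_t handshake_length( const uint8_t *hdr )
{
    return ( ( uint32_t )hdr[ 1 ] << 16 ) |
           ( ( uint32_t )hdr[ 2 ] << 8 )  |
             ( uint32_t )hdr[ 3 ];
}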

Because several layers of encapsulation are involved with the handshake messages, it’s beneficial to review what that encapsulation looks like. Figure 6.7 shows the layout of an IP datagram that is carrying an SSL handshake message. The sizes given for the IP and TCP headers assume that neither has any options. Recall that the HMAC and padding will not be present until after a ChangeCipherSpec message is sent.

IP header (20 bytes) | TCP header (20 bytes) | SSL record header (5 bytes) | handshake header (4 bytes) | handshake data | HMAC (16 or 20 bytes) | padding

The handshake header and handshake data make up the handshake message; adding the record header, HMAC, and padding gives the SSL record; the TCP and IP headers complete the TCP segment and the IP packet.

Figure 6.7 SSL Handshake Message Encapsulation


The Client Hello

The message format for the ClientHello is shown in Figure 6.8. Because the SSL message formats are highly irregular and are best thought of as a byte stream, we will alter their graphical presentation a bit in order to make their structure clearer.

handshake type = 1 (1 byte)
message length (3 bytes)
major version (1 byte), minor version (1 byte)
random data (32 bytes)
session ID length (1 byte), session ID (0-32 bytes)
cipher suite length (2 bytes), cipher suite list (2 to 2^16 - 1 bytes)
compression list length (1 byte), compression list (1-255 bytes)

Figure 6.8 The ClientHello Message

The major and minor version are the latest version of SSL that the client understands. This allows the server to determine which protocol version to use for the rest of the session. Both the client and server will contribute some random data that will be used for key generation. The random data changes with each ClientHello or ServerHello, helping prevent replay attacks. The random data field comprises a standard UNIX 4-byte time value—seconds since midnight, January 1, 1970—and 28 random bytes. The session ID is used to resume a previous session, thereby sidestepping the keygeneration phase, which requires considerable resources. We’ll see an example of its use later. In the current exchange, it is not used, so the session ID length is set to 0. Each cipher suite is represented by a 2-byte number (see Figure 6.1). The client sends a list of the cipher suites it is prepared to support, sorted in order of preference. The server, however, is under no obligation to honor the client’s preference and may choose any suite in the client’s list.
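For illustration, the 32-byte random field can be built exactly as described: a 4-byte UNIX time followed by 28 cryptographically strong random bytes. The sketch below uses OpenSSL's random number generator; the function name is ours:

#include <time.h>
#include <stdint.h>
#include <openssl/rand.h>

/* Fill in the 32-byte ClientHello/ServerHello random field. */
static int make_hello_random( unsigned char random[ 32 ] )
{
    uint32_t now = ( uint32_t )time( NULL );

    random[ 0 ] = ( unsigned char )( now >> 24 );   /* seconds since the epoch */
    random[ 1 ] = ( unsigned char )( now >> 16 );
    random[ 2 ] = ( unsigned char )( now >> 8 );
    random[ 3 ] = ( unsigned char )now;
    return RAND_bytes( random + 4, 28 );            /* returns 1 on success */
}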


None of the SSL versions support compression; RFC 3749 [Hollenbeck 2004] describes compression for TLS. Some proprietary implementations and OpenSSL also support compression, but we hardly ever see this capability used. Thus, the compression list contains a single byte of 0, representing NULL. Now let’s look at a ClientHello message. To capture the data, we use ssldump, a tool developed by Eric Rescorla for SSL development, debugging, and his book [Rescorla 2001]. The ssldump program is similar to tcpdump but is specialized to decode SSL packets. Here is the first message, a ClientHello, in the SSL handshake: 1 1

0.0021 (0.0021)  C>SV3.0(47)  Handshake
      ClientHello
        Version 3.0
        random[32]=
          3e 5e 6f d3 6b a3 1e ca 45 3f 3a 87 50 92 8e 6b
          c4 9f 74 1e b4 45 b8 44 e7 41 72 31 36 fd e8 63
        cipher suites
          SSL_RSA_WITH_3DES_EDE_CBC_SHA
          SSL_RSA_WITH_IDEA_CBC_SHA
        compression methods
          NULL

The first line tells us that this is a handshake message from the client to the server (C>S), that it is a version 3.0 message (3.0) of 47 bytes. The first 1 tells us that this is the first connection in the capture, and the second 1 tells us that this is the first record in this connection. The second two fields are the time since the beginning of the capture of this record, and the time since the last record. The next two lines tells us that this is a ClientHello and that the client understands version 3.0 and below. We also see the client’s random number and that the client is willing to use the two cipher suites SSL_RSA_WITH_3DES_EDE_CBC_SHA and SSL_RSA_WITH_IDEA_CBC_SHA. As expected with SSL, the client is unwilling to use any compression methods and lists NULL as the only method. The Server Hello As shown in Figure 6.9, the format of ServerHello is similar to that of the ClientHello, except that instead of lists for supported cipher suites and compression methods, only the single chosen value for each is specified by the server. The server’s response to our previous ClientHello is 1 2

0.0033 (0.0012)  S>CV3.0(74)  Handshake
      ServerHello
        Version 3.0
        random[32]=
          3e 5e 6f d3 f8 e3 4d 89 e7 9d 2d 5f 0e b9 8f f8
          77 44 45 95 cb 69 ae 4e 4a 9b c8 29 29 76 da b4
        session_id[32]=
          14 17 3a 80 5e 1d 50 9f 46 1c 38 12 2f e6 d8 3d
          b6 6c 18 59 8b 00 f4 3d a1 1c 2f 22 72 80 37 50
        cipherSuite          SSL_RSA_WITH_3DES_EDE_CBC_SHA
        compressionMethod    NULL


handshake type = 2 (1 byte)
message length (3 bytes)
major version (1 byte), minor version (1 byte)
random data (32 bytes)
session ID length (1 byte), session ID (0-32 bytes)
cipher suite (2 bytes)
compression method (1 byte)

Figure 6.9 The ServerHello Message

The server has agreed to the version 3.0 protocol and has chosen to use the SSL_RSA_WITH_3DES_EDE_CBC_SHA cipher suite and NULL compression method. The server has contributed its own random data and has specified a session ID. The client can use this session ID to resume the session later. We’ll see an example of this shortly. Certificate The next message is the server’s certificate. This message is usually a chain of certificates, with the first being the server’s own certificate, followed by any authenticating certificates from certificate authorities. This works as follows: The certificate authority (CA) signs the server’s certificate with its private key. The CA’s certificate contains the CA’s public key, which the client uses to verify that the CA was, in fact, the signer of the server’s certificate. The client typically has its own copy of the CA’s certificate, making the transaction secure.
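Conceptually, the client's check is just a signature verification with the CA's public key. A bare-bones sketch with OpenSSL follows; it is ours, and a real client uses a full chain-verification routine that also checks validity dates, names, and revocation:

#include <openssl/x509.h>
#include <openssl/evp.h>

/* Return 1 if ca's public key verifies the signature on cert. */
static int signed_by( X509 *cert, X509 *ca )
{
    EVP_PKEY *ca_key = X509_get_pubkey( ca );
    int       ok;

    if ( ca_key == NULL )
        return 0;
    ok = X509_verify( cert, ca_key ) == 1;
    EVP_PKEY_free( ca_key );
    return ok;
}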

The Certificate message is shown in Figure 6.10. The output from ssldump merely reflects the fact that the server has sent its certificate to the client: 1 3

0.0033 (0.0000)  S>CV3.0(571)  Handshake
      Certificate


handshake type = 11 (1 byte)
message length (3 bytes)
certificate list length (3 bytes)
certificate list (1 to 2^24 - 1 bytes)

Figure 6.10 The SSL Certificate Message

The Server Hello Done

The next message in our typical SSL session is the ServerHelloDone. Recall that it is used to tell the client that the server has finished its hello sequence and that the client may begin the key-transfer process. The ServerHelloDone message consists merely of a handshake header with a type of 14, as shown in Figure 6.11.

handshake type = 14 (1 byte)   message length = 0 (3 bytes)

Figure 6.11 The ServerHelloDone Message

As with the Certificate message, ssldump merely notes that the message was sent by the server: 1 4

0.0033 (0.0000)  S>CV3.0(4)  Handshake
      ServerHelloDone

The Client Key Exchange After the server sends its ServerHelloDone message, the client responds with a ClientKeyExchange message. This message contains what SSL calls the PreMasterSecret encrypted with the server’s public key that was in its certificate. In this exchange, we are using RSA for the key exchange. When Diffie-Hellman or Fortezza is used, the details are slightly different.

Figure 6.12 shows the format of the ClientKeyExchange message. The PreMasterSecret is composed of the client’s version and 46 cryptographically secure random bytes. Both sides use the PreMasterSecret along with the random bytes from the client and server hellos to generate the MasterSecret, which in turn is used to generate the various keys used by the encryption and digest algorithms.


Cryptographically secure means generated by a cryptographically strong pseudorandom number generator (see RFC 1750 [Eastlake, Crocker, and Schiller 1994]). As the problems with version 2 illustrate, failure to address this issue leads directly to exploits. This is especially important here, because the PreMasterSecret is used as a seed for all the keying material generated during the session.

The message length field specifies 128 bytes. The enlargement from 48 to 128 bytes is the result of the RSA encryption. One other important point is that the TLS implementation is slightly different. In TLS, the EncryptedPreMasterSecret is defined as a variable array of between 0 and 216 − 1 bytes and therefore has a 2-byte length prefix.
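A sketch (ours) of building the 48-byte PreMasterSecret before it is encrypted with the server's RSA public key:

#include <openssl/rand.h>

/* PreMasterSecret: 2 bytes of client version + 46 strong random bytes.
 * The result is then encrypted with the server's RSA key (for example,
 * with RSA_public_encrypt and PKCS #1 padding), which expands it to the
 * modulus size: 128 bytes for the 1024-bit key used in this trace. */
static int make_premaster( unsigned char pms[ 48 ],
                           unsigned char major, unsigned char minor )
{
    pms[ 0 ] = major;       /* highest version the client offered */
    pms[ 1 ] = minor;
    return RAND_bytes( pms + 2, 46 );
}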

handshake type = 16 (1 byte)
message length = 128 (3 bytes)
encrypted PreMasterSecret: major version (1 byte), minor version (1 byte), random data (46 bytes)

Figure 6.12 The Client Key Exchange Message

We see the expansion of the EncryptedPreMasterSecret to 128 bytes in the ssldump output for the ClientKeyExchange message: 1 5

0.0052 (0.0018) C>SV3.0(132) Handshake ClientKeyExchange EncryptedPreMasterSecret[128]= cb d0 b0 29 2a 7d 0b 52 52 2b 9a 63 cf 22 cb 65 c3 0b 2b f6 9b a4 d0 50 c8 73 34 19 c9 cd b7 87 2b 9f 4b 28 da 67 8e d4 49 9a 7e b0 41 60 29 46 13 47 1d 3d cb 39 e7 5b f2 dc 11 19 9b 43 36 fe 69 70 41 25 a8 ff 69 50 c6 a5 a4 45 ae 1f 8a ee 25 e0 82 ae ea 3b 2b a1 d1 a3 45 19 f0 55 b6 a3

49 20 b0 c1 dc f9 a4 76

62 a6 86 d8 45 68 21 22

28 a3 64 b6 96 f3 f1 36

ce cd 4a 6c a6 e3 ff 20

The ChangeCipherSpec Message After the ClientKeyExchange message, both sides can generate keys for the rest of the session. The next message in our typical session is a ChangeCipherSpec, by which the client indicates that it will henceforth use the new cipher suite: 1 6

0.0052 (0.0000)  C>SV3.0(1)  ChangeCipherSpec

Recall that the ChangeCipherSpec message is one of the four basic record types (Figure 6.4) and is therefore not a Handshake message. The format of the message is


just an SSL record (Figure 6.3) with a type of CHANGE_CIPHER_SPEC (20) and a data field of the single byte of value 1. The Finished Message Finally, the client indicates that it is finished with the handshake by sending a Finished message. This message is encrypted as promised by the previous ChangeCipherSpec message. Figure 6.13 shows the format of the Finished message.

handshake type = 20 (1 byte)
message length = 36 (3 bytes)
MD5 hash (16 bytes)
SHA1 hash (20 bytes)

Figure 6.13 The Finished Message

The two HMACs are computed over all the previous handshake messages, verifying that the unauthenticated messages were not tampered with. Because the Finished message is encrypted, we normally wouldn’t be able to see its contents, but we used our own certificate for the session and therefore know the private key. With access to the private key, ssldump is able to decrypt the PreMasterSecret and generate its own set of keys. Thus, we can see the values of the HMACs in the client’s Finished message: 1 7

0.0052 (0.0000)  C>SV3.0(64)  Handshake
      Finished
        md5_hash[16]=
          09 92 f4 f1 16 c6 17 a8 39 a2 c8 c8 44 89 7f df
        sha_hash[20]=
          ae 15 cc e0 8e 5f 13 c3 e2 1e 9c 10 5f 30 8c 24
          cf 6a 83 5e

Note the discrepancy between the message length from the SSL record header (64) and decoded contents (40). The difference is the (SSL record) HMAC and the padding added to fill out the 3DES block size. The server next sends its own ChangeCipherSpec and Finished messages, and the session is ready for the data transfer stage: 1 8 1 9

0.0141 (0.0089)  S>CV3.0(1)   ChangeCipherSpec
0.0141 (0.0000)  S>CV3.0(64)  Handshake
      Finished
        md5_hash[16]=
          62 57 8c f7 1e 4b 88 81 8b 92 2b 73 dd ce 2e d3
        sha_hash[20]=
          a8 6a b6 04 f1 ed 67 ce d1 c9 aa 1a 82 60 8c f3
          ef 00 61 c4

The client and server exchange application data: 1 10 2.7108 (2.6966) C>SV3.0(24) application_data --------------------------------------------------------------------------------------------------------------------1 11 2.7108 (0.0000) C>SV3.0(32) application_data ----------------------------------------------------------hello ----------------------------------------------------------1 12 2.7125 (0.0016) S>CV3.0(24) application_data --------------------------------------------------------------------------------------------------------------------1 13 2.7125 (0.0000) S>CV3.0(32) application_data ----------------------------------------------------------hello -----------------------------------------------------------

Record 10 from the client and record 12 from the server do not appear to have any data. These records are called empty fragments and are inserted by OpenSSL as a countermeasure against a vulnerability in version 3 and TLS involving CBC ciphers. Recall that the SSL data transfer is indistinguishable from a normal TCP data transfer except that the TCP payload appears to be random data. The following tcpdump output is the trace of the preceding four application data records. Notice that there is no obvious indication that this is an SSL session. In lines 1.4, 1.6, 2.4, and 2.6, we can see the SSL record headers (in boldface). In line 1.6, for example, we see that the SSL record type is 0x17 (2310 ), indicating that this is application data (see Figure 6.4). The next 2 bytes are the version (3.0), followed by the length, which is 0x0020 (3210 ). The rest of the SSL record is encrypted, so we can’t see the data. Observe that SSL records 10 and 11 were sent in a single TCP segment, as were records 12 and 13. 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 2 2.1 2.2 2.3 2.4

15:06:45.828412 127.0.0.1.1526 > 127.0.0.1.5001: P 265:331(66) ack 740 win 57344 (DF) 4500 0076 5b26 4000 4006 e159 7f00 0001 E..v[&@[email protected].... 7f00 0001 05f6 1389 d87a 3458 d386 d023 .........z4X...# 8018 e000 c246 0000 0101 080a 002b ee2f .....F.......+./ 002b ed21 1703 0000 1848 2ce9 04b5 4071 .+.!.....H,...@q f2df faf1 2ab4 eb6f e012 3e68 cf5b db2f ....*..o..>h.[./ 5217 0300 0020 7821 6be6 0b08 3567 559c R.....x!k...5gU. 8d62 74e8 3af8 7f1d 6557 3e82 52f8 7fc1 .bt.:...eW>.R... fff8 2b0f 94e6 ..+... 15:06:45.830097 127.0.0.1.5001 > 127.0.0.1.1526: P 740:806(66) ack 331 win 57344 (DF) 4500 0076 5b29 4000 4006 e156 7f00 0001 E..v[)@[email protected].... 7f00 0001 1389 05f6 d386 d023 d87a 349a ...........#.z4. 8018 e000 d64b 0000 0101 080a 002b ee2f .....K.......+./ 002b ee2f 1703 0000 18de 1c26 9d57 55f4 .+./.......&.WU.


2.5 2.6 2.7 2.8

c411 0517 cd48 52c5


2363 0300 6bff e1b1

93af e9fc c4b4 1fd8 7956 c557 0020 7493 732e b754 89c4 40d1 8db9 46dd 6b52 4605 1988 c942 0b24

..#c........yV.W ......t.s..T..@. .Hk...F.kRF....B R....$

Alert Messages and Closure Notification The final type of record is the Alert message. It consists of 2 bytes carried in the SSL record, as shown in Figure 6.14.

level (1 byte)   description (1 byte)

Figure 6.14 The Alert Message

The level field indicates the severity of the alert. It can be either WARNING (1) or FATAL (2). Description is an alert code that indicates what type of alert this is. The values and the protocol versions that support them are given in Figure 6.15.

1 14 4.9688 (2.2563)  C>SV3.0(24)  Alert
      level            warning
      value            close_notify
1 15 4.9690 (0.0001)  S>CV3.0(24)  Alert
      level            warning
      value            close_notify
1    4.9723 (0.0032)  C>S  TCP FIN
1    4.9726 (0.0002)  S>C  TCP FIN

Resumed Sessions We saw earlier that the server sent the client a session ID in its ServerHello message. The client can use this ID to bypass much of the handshake negotiation in subsequent sessions. Although this may seem to have little utility, it arises in a natural way. After a Web server replies to a browser ’s request for a Web page, it typically closes the TCP connection to indicate an EOF (see Tip 16 of ETCP). If, as is typical of the financial transactions protected by SSL, the client makes another request, being able to resume the session saves time and resources. Although this may not amount to much for the client, it can be significant for busy Web servers. A resumed session differs from a normal session only in the handshake, as shown in Figure 6.16. The client asks to resume the session by including the session ID from the last session. The server, which has cached the state of that session, agrees by returning the same session ID. We see that the server does not send its certificate and that the client does not send the PreMasterSecret. Both sides will use the existing MasterSecret to generate new keys. Because the new random data from both sides will be used to

Description                Value    TLS    SSL 3

CLOSE_NOTIFY                 0       •       •
UNEXPECTED_MESSAGE          10       •       •
BAD_RECORD_MAC              20       •       •
DECRYPTION_FAILED           21       •
RECORD_OVERFLOW             22       •
DECOMPRESSION_FAILURE       30       •       •
HANDSHAKE_FAILURE           40       •       •
NO_CERTIFICATE              41               •
BAD_CERTIFICATE             42       •       •
UNSUPPORTED_CERTIFICATE     43       •       •
CERTIFICATE_REVOKED         44       •       •
CERTIFICATE_EXPIRED         45       •       •
CERTIFICATE_UNKNOWN         46       •       •
ILLEGAL_PARAMETER           47       •       •
UNKNOWN_CA                  48       •
ACCESS_DENIED               49       •
DECODE_ERROR                50       •
DECRYPT_ERROR               51       •
EXPORT_RESTRICTION          60       •
PROTOCOL_VERSION            70       •
INSUFFICIENT_SECURITY       71       •
INTERNAL_ERROR              80       •
USER_CANCELED               90       •
NO_RENEGOTIATION           100       •

Figure 6.15 Alert Descriptions

generate these keys, they will be different from those in the previous incarnation of the session. The following exchange is a resumption of the previous one: 2 1

0.0011 (0.0011) C>SV3.0(79) Handshake ClientHello Version 3.0 random[32]= 3e 5e 6f db 15 04 57 00 03 5a a8 ae e9 21 e6 0e 08 23 18 cc 5a 9c 4e bb resume [32]= 14 17 3a 80 5e 1d 50 9f 46 1c 38 12 b6 6c 18 59 8b 00 f4 3d a1 1c 2f 22 cipher suites SSL_RSA_WITH_3DES_EDE_CBC_SHA SSL_RSA_WITH_IDEA_CBC_SHA compression methods NULL

30 89 8b 6c 20 36 aa f9 2f e6 d8 3d 72 80 37 50


The client opens with a ClientHello carrying the session ID from the earlier connection. The server answers with a ServerHello containing the same session ID, followed immediately by its ChangeCipherSpec and Finished messages; the client completes the handshake with its own ChangeCipherSpec and Finished.

Figure 6.16 Data Flow for a Resumed Session Handshake

2 2

2 3 2 4

2 5 2 6

0.0016 (0.0004) S>CV3.0(74) Handshake ServerHello Version 3.0 random[32]= 3e 5e 6f db 3c 45 6f 38 d4 1e 83 79 91 fe 84 6a f3 db 09 6e 23 aa 10 96 e0 63 cd c0 81 15 5c 55 session_id[32]= 14 17 3a 80 5e 1d 50 9f 46 1c 38 12 2f e6 d8 3d b6 6c 18 59 8b 00 f4 3d a1 1c 2f 22 72 80 37 50 cipherSuite SSL_RSA_WITH_3DES_EDE_CBC_SHA compressionMethod NULL 0.0016 (0.0000) S>CV3.0(1) ChangeCipherSpec 0.0016 (0.0000) S>CV3.0(64) Handshake Finished md5_hash[16]= c1 c7 66 99 cd 32 9a 20 7d d6 e2 82 e1 67 1f a0 sha_hash[20]= 5b 04 2f df a1 bb 47 75 44 35 20 62 4d 56 e1 66 54 54 38 b9 0.0018 (0.0002) C>SV3.0(1) ChangeCipherSpec 0.0018 (0.0000) C>SV3.0(64) Handshake Finished md5_hash[16]= 5c 5b 69 d3 d1 59 86 7f 2a d3 40 3e a1 6e 0f b5 sha_hash[20]= 48 ac 60 8e 45 39 b1 c0 3e b1 86 1f eb ec 9e 89 5d fa d7 f7


2 7

2

2

2

2

2

2 2

4.5853 (4.5834) C>SV3.0(24) application_data --------------------------------------------------------------------------------------------------------------------8 4.5853 (0.0000) C>SV3.0(40) application_data ----------------------------------------------------------hello again ----------------------------------------------------------9 4.5870 (0.0016) S>CV3.0(24) application_data --------------------------------------------------------------------------------------------------------------------10 4.5870 (0.0000) S>CV3.0(40) application_data ----------------------------------------------------------hello again ----------------------------------------------------------11 6.2698 (1.6828) C>SV3.0(24) Alert level warning value close_notify 12 6.2700 (0.0001) S>CV3.0(24) Alert level warning value close_notify 6.2730 (0.0029) C>S TCP FIN 6.2732 (0.0002) S>C TCP FIN
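On the client side, OpenSSL exposes resumption directly: the application saves the SSL_SESSION from the first connection and offers it on the next one. The following sketch is ours and omits all error handling; ssl1 and ssl2 stand for SSL objects attached to two successive TCP connections to the same server:

#include <openssl/ssl.h>

/* First connection: complete the full handshake, then save the session. */
SSL_SESSION *save_session( SSL *ssl1 )
{
    SSL_connect( ssl1 );                /* full handshake */
    return SSL_get1_session( ssl1 );    /* bumps the reference count */
}

/* Second connection: offer the saved session before the handshake. */
void resume_session( SSL *ssl2, SSL_SESSION *sess )
{
    SSL_set_session( ssl2, sess );      /* must precede SSL_connect */
    SSL_connect( ssl2 );                /* abbreviated handshake if the server agrees */
    SSL_SESSION_free( sess );           /* drop our reference */
}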

Version 2 Client Hello In order to interoperate with version 2 servers, many browsers will start the handshake with a version 2 ClientHello message that indicates their ability to support version 3. If the server also supports version 3, it will respond with a version 3 ServerHello, and the rest of the session will take place using the version 3 protocol. If the server responds with a version 2 ServerHello, the browser will continue to use the version 2 protocol. Some version 2 browsers won’t accept a ClientHello with a version number other than 2. In this case, the handshake fails, and a secure connection cannot be established.

The version 2 SSL Record and ClientHello format is shown in Figure 6.17. The following session is between a Mozilla 1.2 browser and a bank’s secure Web site. We can’t tell for sure which Web server the bank is using, because the traffic is encrypted, but the rest of the bank’s Web site is hosted by Netscape-Enterprise Server. 1 1 0.0512 (0.0512) C>S SSLv2 compatible client hello Version 3.1 cipher suites SSL2_CK_RC4 SSL2_CK_RC2 SSL2_CK_3DES SSL2_CK_DES SSL2_CK_RC4_EXPORT40 SSL2_CK_RC2_EXPORT40 TLS_RSA_WITH_RC4_128_MD5 Unknown value 0xfeff TLS_RSA_WITH_3DES_EDE_CBC_SHA Unknown value 0xfefe

record header: record length (2 or 3 bytes)
version 2 ClientHello (record length bytes):
    type = 1 (1 byte)
    major version (1 byte), minor version (1 byte)
    cipher spec length (2 bytes)
    session ID length (0 or 16) (2 bytes)
    challenge length (32) (2 bytes)
    cipher spec list (3 to 2^16 - 1 bytes)
    session ID (0 or 16 bytes)
    challenge (32 bytes)

Figure 6.17 The Version 2 Record and ClientHello Format

TLS_RSA_WITH_DES_CBC_SHA TLS_RSA_EXPORT1024_WITH_RC4_56_SHA TLS_RSA_EXPORT1024_WITH_DES_CBC_SHA TLS_RSA_EXPORT_WITH_RC4_40_MD5 TLS_RSA_EXPORT_WITH_RC2_CBC_40_MD5 1 2 0.1077 (0.0564) S>CV3.0(1286) Handshake ServerHello Version 3.0 random[32]= 00 00 07 98 0d 95 99 80 1f 6c d2 f9 20 9a ed cf f6 5c 95 f8 b1 d5 3f cb b8 49 4d a5 c8 89 63 99


session_id[32]= 36 79 c6 ba c7 40 1b dc f3 0b 95 a7 e5 ed b9 60 65 f0 48 85 25 80 b2 f9 e4 33 18 b7 cf 64 cipherSuite SSL_RSA_WITH_RC4_128_MD5 compressionMethod NULL Certificate ServerHelloDone 1 3 0.1229 (0.0152) C>SV3.0(132) Handshake ClientKeyExchange EncryptedPreMasterSecret[128]= c2 61 ad 31 a3 3e 2e 8b 6f 77 81 e1 8a 76 b1 76 54 fa 11 57 14 e7 4e f8 85 c1 2e a9 99 eb 7b d5 54 04 7c 04 70 0b b7 41 39 0c c6 78 05 ce 83 40 31 15 95 9d 7f 03 c7 06 3d a8 8b 13 4a 82 8a 53 1a 06 e6 25 f3 29 14 21 04 1b a0 bb bd 00 b3 ca 60 11 07 5e 36 ba 20 1b f4 05 4e e7 b2 f2 91 14 c0 68 78 af 23 83 8f 9e 30 63 a6 2b 20 13 3c ca 76 ab 85 7e dd 09 64 7e 1 4 0.1229 (0.0000) C>SV3.0(1) ChangeCipherSpec 1 5 0.1229 (0.0000) C>SV3.0(56) Handshake 1 6 0.1828 (0.0599) S>CV3.0(1) ChangeCipherSpec 1 7 0.1828 (0.0000) S>CV3.0(56) Handshake 1 8 0.1842 (0.0013) C>SV3.0(473) application_data 1 9 0.2371 (0.0529) S>CV3.0(253) application_data 1 10 0.2399 (0.0027) S>CV3.0(18) Alert 1 0.2409 (0.0009) S>C TCP FIN 1 11 0.2459 (0.0049) C>SV3.0(18) Alert 1 0.2460 (0.0000) C>S TCP FIN


e5 7d

20 5c 39 5d f4 b6 3a 8a

Mozilla sends a version 2 ClientHello but offers to negotiate at version 3.1 (TLS) or lower. The cipher suites beginning with SSL2_ are those that are defined in version 2. The others are the normal version 3.0/3.1 cipher suites. The server responds with a version 3 ServerHello, saying that it will negotiate with version 3 using RSA, 128-bit RC4, and MD5. The rest of the session proceeds in a normal version 3 manner. SSL record 2 contains three handshake messages: ServerHello, Certificate, and ServerHelloDone. This behavior is normal. An SSL record can contain multiple messages as long as they are all the same record type: Handshake, in this case. It is also legal for a single message to be split across more than one SSL record, but this does not happen in practice [Rescorla 2001]. Client Authentication It is possible for the server to require that the client authenticate itself just as the server must. To do this, the server issues a CertificateRequest, and the client replies with its certificate and a CertificateVerify message. Figure 6.18 shows the handshake portion of the data flow. The handshake begins with the normal exchange of hello messages, followed by the server’s certificate. Then, instead of sending the ServerHelloDone, the server sends a CertificateRequest, asking the client to send its own certificate to the


The client sends a ClientHello. The server answers with ServerHello, Certificate, CertificateRequest, and ServerHelloDone. The client then sends its own Certificate, ClientKeyExchange, and CertificateVerify messages, followed by ChangeCipherSpec and Finished; the server completes the handshake with ChangeCipherSpec and Finished.

Figure 6.18 Client Authentication Handshake

server. This is our first example of the necessity for the ServerHelloDone message; the server has an extra message to send before it is done with its hello sequence. The client responds with Certificate and ClientKeyExchange messages. At this point, both sides can generate their keying material. Before sending the ChangeCipherRequest, however, the client sends one additional message—the CertificateVerify message—a digest of the previous handshake messages, encrypted with the client’s private key. The server then decrypts the message with the client’s public key from its certificate and verifies that the digest is correct. In this way, the client verifies that it is the owner of the certificate, as only the owner will know the private key. This is important, because the certificate is sent unencrypted and could easily be captured and reused by an attacker. As before, the random values in the ClientHello


and ServerHello messages prevent a replay attack by ensuring that the digest will differ from any other handshake between the peers, even if everything else is the same. We can learn a little more about these messages by watching them on the wire. We’ve shortened the ssldump output a bit by inhibiting the printing of the random values and EncryptedPreMasterSecret. These values were, of course, still sent.

1 1

1 2

1 3 1 4

1 5 1 6

0.0050 (0.0050) C>S Handshake ClientHello Version 3.0 cipher suites SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA SSL_RSA_WITH_3DES_EDE_CBC_SHA SSL_DHE_DSS_WITH_RC4_128_SHA SSL_RSA_WITH_RC4_128_SHA SSL_RSA_WITH_RC4_128_MD5 SSL_DHE_DSS_WITH_RC2_56_CBC_SHA SSL_RSA_EXPORT1024_WITH_RC4_56_SHA SSL_DHE_DSS_EXPORT1024_WITH_DES_CBC_SHA SSL_RSA_EXPORT1024_WITH_DES_CBC_SHA SSL_RSA_EXPORT1024_WITH_RC2_CBC_56_MD5 SSL_RSA_EXPORT1024_WITH_RC4_56_MD5 SSL_DHE_RSA_WITH_DES_CBC_SHA SSL_DHE_DSS_WITH_DES_CBC_SHA SSL_RSA_WITH_DES_CBC_SHA SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA SSL_DHE_DSS_EXPORT_WITH_DES40_CBC_SHA SSL_RSA_EXPORT_WITH_DES40_CBC_SHA SSL_RSA_EXPORT_WITH_RC2_CBC_40_MD5 SSL_RSA_EXPORT_WITH_RC4_40_MD5 compression methods NULL 0.0056 (0.0005) S>C Handshake ServerHello Version 3.0 session_id[32]= 1c f7 13 97 e5 fb e4 7f d8 43 4c 78 dc 7b 68 22 86 a2 e1 f2 04 d5 bd c6 7b 45 3d 91 11 64 36 4d cipherSuite SSL_RSA_WITH_3DES_EDE_CBC_SHA compressionMethod NULL 0.0056 (0.0000) S>C Handshake Certificate 0.0056 (0.0000) S>C Handshake CertificateRequest certificate_types rsa_sign certificate_types dss_sign ServerHelloDone 0.0179 (0.0122) C>S Handshake Certificate 0.0179 (0.0000) C>S Handshake ClientKeyExchange


1 7

0.0179 (0.0000) C>S Handshake CertificateVerify Signature[128]= 87 d7 11 5e fa fa ed e1 87 d5 25 5e c3 c8 5f c6 ed de ee b8 c1 b6 9c 50 5f c7 6f 91 d6 db 7c a4 ed 05 b0 1d 4c c1 50 92 61 6b fd 3d 1c 71 ea e5 8f d4 b5 c1 d2 d7 ed 33 37 82 b4 93 a9 dd d5 20 82 2e b0 b5 68 20 c6 e3 14 74 dc 79 a7 73 13 88 af a0 75 1d c9 67 3a e2 61 50 b8 f6 7e 62 64 7b 1 8 0.0179 (0.0000) C>S ChangeCipherSpec 1 9 0.0179 (0.0000) C>S Handshake 1 10 0.0242 (0.0063) S>C ChangeCipherSpec 1 11 0.0242 (0.0000) S>C Handshake

19 97 16 d1 9c 2d 5c 6f

71 45 8a a5 e1 d5 81 55

35 33 d4 d5 07 b4 9b 09

e6 b2 37 89 3a 59 59 d0

In record 4, the CertificateRequest includes a list of certificate types that the server is willing to accept. In this case, the server will accept either an RSA or a DSS signed certificate. Figure 6.19 shows the format of the CertificateRequest.

handshake type = 13 (1 byte)
message length (3 bytes)
certificate type length (1 byte), certificate type list
CA list length (2 bytes), certificate authority list

Figure 6.19 The CertificateRequest

As shown in the figure, the server can also specify which certificate authorities it is willing to accept. This capability was not used in the previous exchange.

Diffie-Hellman Key Exchange

Until now, all our example sessions have used RSA key exchange. With RSA key exchange, the client generates a random PreMasterSecret and encrypts it with the server's public key, which it obtains from the server's certificate. It is also possible to use Diffie-Hellman to exchange keys. With Diffie-Hellman, the process is different. The usual method is for the server to generate a Diffie-Hellman key, sign it with its DSS key, and send it to the client. The client also generates a Diffie-Hellman key and sends it to the server. Both sides then compute the Diffie-Hellman shared secret from these keys and use it as the PreMasterSecret. The server can also use a permanent Diffie-Hellman key, in which case the key is included with the server's


The client sends a ClientHello. The server answers with ServerHello, Certificate, ServerKeyExchange, and ServerHelloDone. The client sends ClientKeyExchange, ChangeCipherSpec, and Finished; the server completes the handshake with ChangeCipherSpec and Finished.

Figure 6.20 Diffie-Hellman Key Exchange

certificate. We see this exchange of messages in Figure 6.20. Note that the server sends its Diffie-Hellman key in the ServerKeyExchange message right after sending its certificate. The format of the ServerKeyExchange is given in Figure 6.21. In addition to the Diffie-Hellman key, the message contains the prime modulus and group generator to be used with the Diffie-Hellman exchange. We can see the messages by examining the output of ssldump: 1 1

0.0041 (0.0041) C>SV3.0(83) Handshake ClientHello Version 3.0 random[32]= 3e 7e 03 ef c9 db 73 e5 3f 85 6d 79 0d c0 d4 81 d6 db ea 8f 92 14 68 ac 6f db 2e 83 a9 02 1a 3b cipher suites SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA

handshake type = 12 (1 byte)
message length (3 bytes)
prime modulus length (2 bytes), prime modulus (2 to 2^16 - 1 bytes)
group generator length (2 bytes), group generator (2 to 2^16 - 1 bytes)
DH key length (2 bytes), Diffie-Hellman key (2 to 2^16 - 1 bytes)
DSS signature (20 bytes)

Figure 6.21 ServerKeyExchange (Diffie-Hellman)

1 2

SSL_RSA_WITH_3DES_EDE_CBC_SHA SSL_DHE_DSS_WITH_RC4_128_SHA SSL_RSA_WITH_RC4_128_SHA SSL_RSA_WITH_RC4_128_MD5 SSL_DHE_DSS_WITH_RC2_56_CBC_SHA SSL_RSA_EXPORT1024_WITH_RC4_56_SHA SSL_DHE_DSS_EXPORT1024_WITH_DES_CBC_SHA SSL_RSA_EXPORT1024_WITH_DES_CBC_SHA SSL_RSA_EXPORT1024_WITH_RC2_CBC_56_MD5 SSL_RSA_EXPORT1024_WITH_RC4_56_MD5 SSL_DHE_RSA_WITH_DES_CBC_SHA SSL_DHE_DSS_WITH_DES_CBC_SHA SSL_RSA_WITH_DES_CBC_SHA SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA SSL_DHE_DSS_EXPORT_WITH_DES40_CBC_SHA SSL_RSA_EXPORT_WITH_DES40_CBC_SHA SSL_RSA_EXPORT_WITH_RC2_CBC_40_MD5 SSL_RSA_EXPORT_WITH_RC4_40_MD5 compression methods NULL 0.0069 (0.0028) S>CV3.0(74) Handshake ServerHello Version 3.0 random[32]= 3e 7e 03 ef bb 3b 30 7d d5 02 ea f4 d4 61 9f 93 ca 0c 4c d1 98 3d 67 9f e3 7a session_id[32]= b8 9b 8d 85 19 56 8a 28 c8 16 fc c6 cc ba 24 4b 73 63 1d d7 b6 7c fd 5d 51 1e

e7 c5 6a 28 e2 a1 55 a0 1e 90 8c 4e


1 1

1 1

1 1 1 1


cipherSuite SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA compressionMethod NULL 3 0.0069 (0.0000) S>CV3.0(767) Handshake Certificate 4 0.1035 (0.0965) S>CV3.0(189) Handshake ServerKeyExchange params DH_p[64]= da 58 3c 16 d9 85 22 89 d0 e4 af 75 6f 4c ca 92 dd 4b e5 33 b8 04 fb 0f ed 94 ef 9c 8a 44 03 ed 57 46 50 d3 69 99 db 29 d7 76 27 6b a2 d3 d4 12 e2 18 f4 dd 1e 08 4c f6 d8 00 3e 7c 47 74 e8 33 DH_g[1]= 02 DH_Ys[64]= 9d 5d 2e ac b2 95 20 83 a0 87 37 68 f6 be 25 19 d6 eb e0 21 fd 38 88 83 d6 cb 84 6f 10 d2 98 e5 72 5b ef a4 52 23 65 9d b1 c4 22 21 1a 90 d2 e7 53 f0 8b 63 43 12 43 22 ef 7d 45 77 1e 67 79 9b signature[48]= 30 2e 02 15 00 ad b4 23 61 61 1e 44 35 da 8e f4 9b 5b d7 c0 cd e0 c6 35 07 02 15 00 98 d3 df d7 04 48 d5 d3 ed bb fe 5d e4 30 e5 31 fb 5f 0b c1 5 0.1035 (0.0000) S>CV3.0(4) Handshake ServerHelloDone 6 0.1106 (0.0071) C>SV3.0(70) Handshake ClientKeyExchange DiffieHellmanClientPublicValue[64]= 7e 80 ba 59 2e 63 d2 e6 ca f4 03 c9 8a c8 16 02 e1 3c 39 61 97 a0 61 d0 f3 30 57 a7 a9 5c a4 83 40 35 83 d6 7a 9d 1f 72 d7 d6 35 96 27 52 a6 60 5c 75 92 ec 04 c8 6f cd d2 10 1d b3 1c ca 0a 90 7 0.1106 (0.0000) C>SV3.0(1) ChangeCipherSpec 8 0.1106 (0.0000) C>SV3.0(64) Handshake 9 0.1131 (0.0024) S>CV3.0(1) ChangeCipherSpec 10 0.1131 (0.0000) S>CV3.0(64) Handshake

The ServerKeyExchange is in record 4. As indicated in Figure 6.21, the server sends the prime modulus (DH_p) and group generator (DH_g) as well as its key (DH_Ys) and the signature. Because the client must use the same prime modulus and group generator, it merely responds with its key in record 6. The rest of the session proceeds as usual.
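For illustration, the client-side computation conceptually corresponds to the following sketch. It is our own code, not from the book's examples, and it uses the pre-1.1 OpenSSL DH interface contemporary with this material, in which the DH structure's members are filled in directly:

#include <stdlib.h>
#include <openssl/dh.h>
#include <openssl/bn.h>

/* Derive the PreMasterSecret the way a DHE exchange does.  The prime p,
 * generator g, and the peer's public value all come from the
 * ServerKeyExchange message (DH_p, DH_g, and DH_Ys in the trace above). */
unsigned char *dhe_premaster( const unsigned char *p, int plen,
                              const unsigned char *g, int glen,
                              const unsigned char *peer_pub, int publen,
                              int *secretlen )
{
    DH            *dh = DH_new();
    BIGNUM        *pub;
    unsigned char *secret;

    dh->p = BN_bin2bn( p, plen, NULL );         /* prime modulus */
    dh->g = BN_bin2bn( g, glen, NULL );         /* group generator */
    if ( !DH_generate_key( dh ) )               /* our private/public pair */
        return NULL;
    pub = BN_bin2bn( peer_pub, publen, NULL );  /* peer's public value */
    secret = malloc( DH_size( dh ) );
    *secretlen = DH_compute_key( secret, pub, dh );  /* shared secret */
    BN_free( pub );
    DH_free( dh );
    return secret;      /* used as the PreMasterSecret */
}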

6.5

OpenSSL The most common open source implementation of the SSL protocol is OpenSSL, a library that implements the SSL protocol and other cryptographic functions. OpenSSL also has a command line front end that allows the user to perform various administrative tasks, such as making and signing certificates, making RSA and DSA keys, generating Diffie-Hellman parameters, calculating various message digests, encrypting and decrypting with several ciphers, and handling S/MIME signed or encrypted mail. Versions are available for virtually all UNIX systems, Windows, and VMS.
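For example, a couple of typical invocations look like this (the file names here are placeholders, not files used elsewhere in the text):

# create an RSA key and a self-signed certificate in one step
openssl req -new -x509 -newkey rsa:1024 -nodes -days 365 \
        -keyout server.pem -out server.pem

# compute a SHA1 digest of a file
openssl sha1 somefile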


The OpenSSL front end also has s_client and s_server functions that implement configurable SSL-enabled client and server programs. These applications are convenient for testing and debugging SSL applications. Indeed, our example traces of client authentication and Diffie-Hellman key exchange were generated with these programs. The OpenSSL library and command line front end are too large to cover here. Fortunately, [Viega, Messier, and Chandra 2002] covers the topic in detail. One can also find documentation, source code, and further information about OpenSSL at the OpenSSL Web site. In the next section, we'll find it convenient to have an SSL-aware echo server to explore stunnel, so let's take the opportunity here to briefly explore programming with the OpenSSL library. As we'll see, it follows the familiar TCP server paradigm but with calls specific to SSL. Thus, for example, we'll call SSL_read instead of read. Our short example exercises only the most basic aspects of the API and is in no way representative of the full functionality. [Rescorla 2001] and [Viega, Messier, and Chandra 2002] cover the material in detail and should be consulted for more information about programming with SSL. We begin with the includes, defines, and main function in Figure 6.22. This section initializes SSL, sets up our SSL context, and arranges for the application to listen for and accept connections.

sslecho.c
 1  #include <openssl/ssl.h>
 2  #include <openssl/bio.h>
 3  #include <openssl/err.h>
 4  #include "etcp.h"

 5  #define PORT    "4134"
 6  #define SCERT   "sslecho.pem"
 7  #define ERROR(m) error( 1, 0, m ": %s\n", \
 8              ERR_error_string( ERR_get_error(), NULL ) )

 9  void echo( SSL * );

10  int main( int argc, char **argv )
11  {
12      SSL_CTX *ctx;
13      BIO *b;
14      SSL *ssl;
15      int s0;
16      int s;

17      INIT();
18      if ( !SSL_library_init() )
19          error( 1, 0, "SSL_init_library failed\n" );
20      SSL_load_error_strings();
21      ctx = SSL_CTX_new( SSLv3_server_method() );
22      if ( !ctx )
23          ERROR( "SSL_CTX_new failed" );
24      if ( SSL_CTX_use_certificate_chain_file( ctx, SCERT ) != 1 )
25          ERROR( "Couldn't load certificate" );
26      if ( SSL_CTX_use_PrivateKey_file( ctx, SCERT,
27              SSL_FILETYPE_PEM ) != 1 )
28          ERROR( "Couldn't load private key" );

29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 }

s0 = tcp_server( NULL, PORT ); for ( ;; ) { s = accept( s0, NULL, NULL ); if ( s < 0 ) error( 1, errno, "accept failed" ); b = BIO_new_socket( s, BIO_NOCLOSE ); if ( !b ) ERROR( "Couldn’t create BIO for socket" ); ssl = SSL_new( ctx ); if ( !ssl ) ERROR( "Could not create SSL context" ); SSL_set_bio( ssl, b, b ); if ( SSL_accept( ssl ) CV3.0(112) application_data --------------------------------------------------------------7e 21 45 00 00 54 72 7b 40 00 40 01 82 80 0a 00 ˜!E..Tr{@.@..... 00 04 c0 a8 7b 01 08 00 56 21 63 11 01 00 3c 5a ....{...V!c...z1............ 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 .............. ! 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 "#$%&’()*+,-./01 32 33 34 35 36 37 92 51 7e 234567.Q˜ --------------------------------------------------------------1 41 72.1209 (0.0013) C>SV3.0(112) application_data --------------------------------------------------------------7e 21 45 00 00 54 36 fa 40 00 40 01 be 01 c0 a8 ˜!E..T6.@.@..... 7b 01 0a 00 00 04 00 00 5e 21 63 11 01 00 3c 5a {.......ˆ!c...z1............ 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 .............. ! 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 "#$%&’()*+,-./01 32 33 34 35 36 37 94 7a 7e 234567.z˜ ---------------------------------------------------------------

As shown in Figure 6.32, these packets are 169 bytes, excluding the Ethernet framing, compared to 84 bytes for a normal ping packet. All VPNs have some overhead, of course, but here we have doubled the packet size. Even in the best case of a full-size Ethernet frame, we have about 85 bytes out of 1,500 devoted to VPN overhead. Ethernet is not the only link-layer protocol, of course, and some of the others have larger MTUs. For these protocols, the size overhead is not as onerous. Nevertheless, Ethernet is, at the time of this writing, by far the most common.

The real overhead of our tunnel is not size, however, but time. Notice that the ping times are about 2.3–2.4 milliseconds. Recall that these times were measured on a LAN,


not on the Internet. For comparison, here is a ping between bsd and linux that does not go through the tunnel:

bsd# ping linux
PING linux.jcs.local (172.30.0.4): 56 data bytes
64 bytes from 172.30.0.4: icmp_seq=0 ttl=64 time=0.186 ms
64 bytes from 172.30.0.4: icmp_seq=1 ttl=64 time=0.214 ms
64 bytes from 172.30.0.4: icmp_seq=2 ttl=64 time=0.214 ms
64 bytes from 172.30.0.4: icmp_seq=3 ttl=64 time=0.212 ms
...

We see that the ping times in the tunnel are an order of magnitude greater than those that don’t go through the tunnel. It’s not difficult to understand the difference. In addition to the normal IP/ICMP processing, we have the SSL/stunnel overhead, including the expensive encryption, and the overhead of the PPP processes. In one sense, we tested our VPN in a way that is most favorable to it. That’s because the tunnel was running on a LAN (see Tip 12 of ETCP), and we weren’t running TCP over it. If we had established our tunnel between two machines on the Internet and made a TCP connection between them over the tunnel, we would have seen even poorer performance: We would have been tunneling one TCP connection over another, and the retransmission strategies of the two TCPs would have interfered with each other. In general, it’s not a good idea to tunnel one reliable protocol inside another, because both are trying to provide reliability, and they can get in each other’s way.

6.7

SSL Security Ironically, most experts believe that although SSL provides an excellent secure channel with strong encryption and authentication, it does a poor job at its original purpose of protecting Web transactions. This is not a weakness of the SSL protocol itself but rather a result of the way browsers implement it. The problem is that although SSL can verify the identity of a server, the browsers do a poor job of informing their users of that identity in a way that allows them to make an informed decision about whether the connection is with the server that they intended. Bruce Schneier’s description of the problem [Schneier 2003] is typical: ‘‘Imagine you are sitting in a lightless room with a stranger. You know that your conversation cannot be eavesdropped on. What secrets are you going to tell the stranger? Nothing, because you have no idea who he is. SSL is kind of like that.’’ Of course, we do tell that stranger our secrets. The current epidemic of so-called phishing attacks, where users are duped into connecting to a dummy site, is an example of this. An excellent analysis of the security aspects of the SSL 3 protocol is presented in [Wagner and Schneier 1996]. Although the authors identify a few small problems, they conclude that, by and large, the protocol provides excellent security against passive attacks and, except for a couple of protocol issues that could lead to implementation errors, against active attacks as well. Canvel, Hiltgen, Vaudenay, and Vuagnoux [Canvel, Hiltgen, Vaudenay, and Vuagnoux 2003] combined Vaudenay’s attack on CBC padding [Vaudenay 2002] with a

Section 6.8

Summary

205

timing attack, and used it to recover passwords for an SSL-secured IMAP server. The timing attack takes advantage of the difference in time it takes the server to check for correct padding versus a correct MAC. Because most SSL servers at the time reported the error as soon as they detected it, the attacker was able to discern whether the padding was correct and thus apply Vaudenay’s attack. This attack is made possible by an implementation error, of course, and SSL servers now take pains to make sure that the timing attack won’t work—by always performing both tests before reporting an error, for example. Because most SSL servers default to RC4 encryption, which is not vulnerable to Vaudenay’s attack, this exploit had little practical importance except to illustrate, once again, the dangers of leaking even seemingly innocuous information.

6.8

Summary In this chapter, we’ve taken a fairly detailed look at the SSL protocol. Although it is most often thought of as a way of securing Web transactions, SSL is, in fact, a versatile protocol with many uses. We examined the protocol, its message formats, and many of its negotiation mechanisms. The OpenSSL library and support front end is, by far, the most common open source implementation of SSL. As we saw, one can use the library to write simple SSL-aware applications in a fairly straightforward way. Complex applications are more complicated, but the library is flexible and able to handle any reasonable demands placed on it. The OpenSSL front-end program, openssl, is very useful for performing administrative tasks, such as creating and signing certificates, as well as for handling routine encryption/decryption tasks and S/MIME mail chores. Finally, we explored the stunnel program and saw how we can use it to connect two applications with SSL even if one or both of them are SSL-unaware. We also saw how we can use stunnel and PPP to build an SSL VPN between two networks. Although this type of VPN suffers from performance problems that limit its practicality, it’s easy to set up and can be useful in certain settings.

Exercises 6.1

Can we use RC4 with DTLS? Why or why not?

6.2

We said that during a client certificate exchange, the client verifies that it is the certificate owner by sending the server a CertificateVerify message. How does the server prove to the client that it is the owner of the certificate that it sends?

6.3

Set up a PPP/SSL VPN between a Linux and FreeBSD machine where the Linux machine is the client and the FreeBSD machine is the server.

6.4

Increase the security of our PPP/SSL tunnel by adding PPP-level authentication to both sides.

6.5

Build an SSL VPN without using PPP. Hint: Use gtunnel and stunnel.

206

Secure Sockets Layer

6.6

Chapter 6

Write an SSL client that provides an interactive session to the user at the terminal. Test your client with our sslecho server.

7

SSH

7.1

Introduction Historically, telnet, ftp, and the BSD r-commands (rcp, rsh, rexec, and rlogin) have been used to handle interactive sessions and file transfers between a local and a remote host. Although these utilities are still popular and in widespread use, their severe security problems make them unsuitable for use in settings where security is a concern. For example, telnet and ftp provide no encryption or authentication services, so any data transferred using them is vulnerable to eavesdroppers using simple passive attacks. More seriously, these utilities send the user’s password as plaintext, allowing the attacker to recover it and subsequently log on to the remote system as the user. The r-commands are even worse. They share the same problems as telnet and ftp and are often configured to use a convenience mode that does not require the user to present any credentials. This mode is easily subverted to allow an attacker on any machine to log on to or run commands on the target machine as any user authorized to use the r-commands. The Secure Shell (SSH) suite is a set of programs that serve as drop-in replacements for telnet, ftp, and the r-commands. More accurately, SSH is a set of protocols. Because their most popular implementations are the UNIX programs ssh and sshd, most users think of SSH as its implementation rather than the underlying protocols.

Despite its name, SSH has nothing to do with a shell, such as sh, csh, or bash. Rather, SSH provides a secure connection over which a user may, among other things, run a remote shell session.

207

208

SSH

Chapter 7

When we examine this connection, we will see that it meets our requirements for a VPN. Data sent, for example, over the public Internet is encrypted and authenticated, ensuring that it is safe from snooping and alteration. From the user’s perspective, these VPN functions are transparent. The user need merely call ssh rather than, say, rsh to enjoy the benefits of VPN-like security. Two versions of SSH are in use today. These are not program versions; they are protocol versions. That is, the SSH protocol has two independent versions. Fortunately, most implementations support both versions and will negotiate which version to use at session start-up time. .. In 1995, Helsinki University of Technology researcher Tatu Ylonen developed the first version of SSH. As often happens, he designed it for his own use, in this case, as a response to a password-sniffing attack on his university’s network. As also often hap.. pens, Ylonen released his code for others to benefit from, and its use exploded all over .. the world. To deal with the increasing support issues, Ylonen formed SSH Communications Security (SCS, ) that same year. This version of the software is now known as SSH version 1 (SSHv1). Actually version 1 of the protocol underwent steady refinement. What is now known as SSHv1 is really version 1.5 of the protocol.

As with SSL, there were no formal design documents for the first version of SSH, .. but Ylonen did document the protocol after the fact as an Internet Draft (draft-ylonenssh-protocol-00.txt). This draft has long since expired, of course, but is still distributed with the SSH source code and is available in various repositories on the Web (see, for example, ). Because of security problems with SSHv1, SCS released version 2 of the protocol in 1996. SSHv2 is a complete rewrite of the SSH protocol and is incompatible with SSHv1. The IETF became involved by forming the Secure Shell working group (SECSH). Their Web site is at . In late 1999, in response to increasingly restrictive licenses from SCS, the OpenSSH project () released an SSHv1 implementation based on SCS’s 1.2.12 release. This version supported protocol versions 1.3 and 1.5. In June 2000, OpenSSH released support for SSHv2, and support for Secure FTP (SFTP) followed soon afterward in November of that year. At this time, the OpenSSH suite is the most common implementation of the SSH protocols.

7.2

The SSHv1 Protocol Like SSL, SSH is a transport-layer protocol and uses TCP to carry its packets. This has the usual advantages of providing an underlying reliable transport, freeing SSH from having to worry about retransmissions, packet ordering, and flow control. Unlike SSL, SSH does not require that either the local or remote application be SSH-aware. The situation is more analogous to an stunnel environment, such as that in Figure 6.24. That is, SSH provides a secure tunnel through which local and remote applications may communicate.

Section 7.2

The SSHv1 Protocol

209

The most common case is shown in Figure 7.1: A local user is communicating with a remote shell. In this case, the ssh client is providing the local user with a terminal interface, but this is merely a convenience. This use of ssh is as a secure replacement for rsh and is virtually identical from the user’s perspective.

ssh

SSH tunnel

sshd

shell (sh, csh, etc.)

Figure 7.1 SSH as a Remote Shell

Because the use of ssh as a replacement for rsh and telnet is so common, we first examine the SSH protocol from the point of view of the remote shell application. Later, we consider other applications and capabilities of the SSH protocols. Let’s start with a simple interactive session and watch the protocol in action as SSH connects; authenticates the server, client, and user; transfers user data securely; and finally disconnects. In order to see the unencrypted packets, we specify null encryption (-c none). As we see, ssh warns us that there will be no encryption and that the password will be passed in the clear, just as it is for, say, telnet. Although the SSH protocol recommends that null encryption should be available for debugging purposes and although OpenSSH does provide support for it, there is no way to request it from the command line. We are using a patched version that recognizes the -c none option. Our patched server is listening on port 2022 instead of the normal port 22; that is why we specify -p 2022 on the call to ssh.

The -1 and -4 specify the version 1 protocol and IPv4, respectively. $ ./ssh -1 -4 -c none -p 2022 guest@localhost WARNING: Encryption is disabled! Password will be transmitted in clear text. guest@localhost’s password: Last login: Sat May 15 14:55:16 2004 from localhost Have a lot of fun... guest@linuxlt:˜> ls Documents public_html guest@linuxlt:˜> exit logout Connection to localhost closed.

After we supply our password, sshd starts a shell for us, and we list the home directory of user guest. Finally, we exit from the shell, and the connection is torn down. Before studying the protocol messages for this session in detail, we must examine the SSHv1 binary protocol packet. Figure 7.2 shows the format of these packets. As with the SSL packets from Chapter 6, the SSH packet does not necessarily align its data on word boundaries, so we display them as we did for SSL.

The length field is the size of the packet, not including the length field itself or the variable-length random padding field that follows it. The padding field is intended to

210

SSH

Chapter 7

length (4 bytes)

length bytes

type (1 byte)

data (length−5 bytes)

encrypted

random padding (1–8 bytes)

CRC (4 bytes)

Figure 7.2 The SSHv1 Binary Packet

make known text attacks more difficult. Its size is chosen to make the size of the encrypted part of the packet a multiple of 8 bytes. This is presumably because all the block ciphers originally supported by SSHv1 have a 64-bit block size. The OpenSSH version 1 protocol supports only DES, 3DES, and Blowfish, all of which use a 64-bit block size.

Following the padding is a 1-byte type field that identifies the type of message that the packet contains. The type values are shown in Figure 7.3. The type field is followed by the message data. The CRC field, which serves as a MAC, ends the packet. When encryption is enabled, everything except the length field is encrypted. The protocol allows for optional compression of the data. This can be useful when SSH is used in low-bandwidth situations such as dial-up lines. If the client and server negotiate compression, only the type and data fields are compressed. Many of these messages either carry no arguments—they consist of only the length, padding, type, and CRC fields—or have a single argument consisting of a string, integer, or extended integer. In these cases, we won’t bother showing the message layout but will merely indicate what type of argument, if any, the message carries. Server Authentication The server authentication phase of the session, as shown in Figure 7.4, begins with the exchange of identification strings. When the SSH client, ssh, connects to the SSH

Section 7.2

The SSHv1 Protocol

211

No.

Message Name

Message

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37

SSH_MSG_NONE SSH_MSG_DISCONNECT SSH_SMSG_PUBLIC_KEY SSH_CMSG_SESSION_KEY SSH_CMSG_USER SSH_CMSG_AUTH_RHOSTS SSH_CMG_AUTH_RSA SSH_SMSG_AUTH_RSA_CHALLENGE SSH_CMSG_AUTH_RESPONSE SSH_CMSG_AUTH_PASSWORD SSH_CMSG_REQUEST_PTY SSH_CMSG_WINDOW_SIZE SSH_CMSG_EXEC_SHELL SSH_CMSG_EXEC_CMD SSH_SMSG_SUCCESS SSH_SMSG_FAILURE SSH_CMSG_STDIN_DATA SSH_SMSG_STDOUT_DATA SSH_SMSG_STDERR_DATA SSH_CMSG_EOF SSH_SMSG_EXITSTATUS SSH_MSG_CHANNEL_OPEN_CONFIRMATION SSH_MSG_CHANNEL_OPEN_FAILURE SSH_MSG_CHANNEL_DATA SSH_MSG_CHANNEL_CLOSE SSH_MSG_CHANNEL_CLOSE_CONFIRMATION

never sent causes immediate connection teardown server’s public key client choice of cipher and session key user logon name request for user rhosts type authentication request for user RSA authentication server challenge for RSA authentication client response to RSA challenge request for password authentication request for a server pseudoterminal client’s window size request to start a user shell request to run a command server accepts last request server does not accept last request client input data for shell/command output data from shell/command STDERR output from shell/command client is finished sending data exit status from shell/command indicates channel opened indicates channel could not be opened data transmitted over channel sender is closing channel sender acknowledges channel close obsolete client is connected to proxy X-server client requests server port be forwarded connection made on forwarded port requests authentication agent forwarding requests channel to authentication agent no op response to SSH_SMSG_EXITSTATUS requests a proxy X-server requests rhosts/RSA authentication debugging information for peer client requests compression

SSH_SMSG_X11_OPEN SSH_CMSG_PORT_FORWARD_REQUEST SSH_MSG_PORT_OPEN SSH_CMSG_AGENT_REQUEST_FORWARDING SSH_SMSG_AGENT_OPEN SSH_MSG_IGNORE SSH_CMSG_EXIT_CONFIRMATION SSH_CMSG_X11_REQUEST_FORWARDING SSH_CMSG_AUTH_RHOSTS_RSA SSH_MSG_DEBUG SSH_CMG_REQUEST_COMPRESSION

Figure 7.3 SSHv1 Message Types

212

SSH

Chapter 7

client

server IDENT

IDENT

EY

_PUBLIC_K

SSH_SMSG SSH_CMSG

_SESSION_K

EY

_SUCCESS

SSH_SMSG

Figure 7.4 SSHv1 Server Authentication

server, sshd, the server sends an identification string indicating which of the protocols it supports and, perhaps, additional program version information. For example, if we connect to the SSH server with netcat, the server responds with its identification string: $ nc linux 22 SSH-1.99-OpenSSH_3.5p1

The SSH-1.99 is a special version number that tells the client that the server supports protocol versions 1 and 2. The -OpenSSH_3.5p1 is meant for human consumption and specifies the version of the OpenSSH SSH server. The client will respond with its own identification string, so that the peers will know which protocol to use. After the server receives the client’s identification string, the peers switch to the binary protocol, and the server sends the SSH_SMSG_PUBLIC_KEY message shown in Figure 7.5. The cookie is 8 random bytes that are intended to make IP spoofing more difficult. The client must return these bytes to the server unchanged. The host key is a permanent RSA public key that the client uses to verify the identity of the server. Unlike with SSL, this key is not signed by a third party. Rather, the client is expected to have a database of known host keys. In practice, this database is built by accepting the key as valid the first time a user connects to a host. Thereafter, the host must present the known key to the client for the session to proceed. SSH also supports a key-fingerprinting mechanism that allows a user to manually verify a site’s key with the site’s system administrator. There is also a proposal to make these fingerprints available through DNS by means of SSHFP (SSH key fingerprint) records. The server also sends a second key, the server key. This key is regenerated periodically, once every hour by default, to help improve security. As we shall see, the client uses both of these keys to form its response to the server.

Section 7.2

The SSHv1 Protocol

213

cookie (8 bytes)

server key bits (32 bits)

server public key exponent

server public key modulus

host key bits (32 bits)

host public key exponent

host public key modulus

protocol flags (32 bits)

supported ciphers mask (32 bits)

supported authentication methods mask (32 bits)

Figure 7.5 SSH_SMSG_PUBLIC_KEY Message

Finally, the message includes three 32-bit bit masks. The protocol flags bit mask is intended for protocol extension. The supported ciphers mask indicates which ciphers the server can use. The client will choose one of these for the session’s cipher. The supported authentications mask indicates which user authentications the server supports. Again, the client will try one or more of these methods to authenticate the user. Both sides use the information in this message to calculate a session ID by taking the MD5 hash of the concatenation of the moduli of the server and host keys and the cookie. As we’ll see, the session ID is used in generating the session key and thus ensures that both the client and server contribute to the session key. From Figure 7.4, we see that the client responds to the SSH_SMSG_PUBLIC_KEY message with an SSH_CMSG_SESSION_KEY message, shown in Figure 7.6. cipher type (1 byte) cookie (8 bytes)

encrypted session key

protocol flags (32 bits)

Figure 7.6 SSH_CMSG_SESSION_KEY Message

214

SSH

Chapter 7

The cipher type field contains the number of the cipher that the client has chosen for the session. The cookie and protocol flags fields are just as they were for the SSH_SMSG_PUBLIC_KEY message. In particular, the cookie must be returned to the server exactly as it was received by the client. The encrypted session key field is a random 32-byte value chosen by the client. The session ID, calculated from the values in the SSH_SMSG_PUBLIC_KEY message, is exclusive-ORed into the first 16 bytes of the random value. This result is then encrypted twice: first by the smaller (usually the server) RSA key and then by the other (usually host) key. Although this operation seems complicated, it accomplishes three separate tasks. First, by exclusive-ORing the session ID into the random value used for the session key, the client ensures that both the server and the client contribute to the final session key. Then, by encrypting with the host key, the client verifies the identity of the server, because the server must have the corresponding private key in order to recover the session key. Finally, by encrypting with the server key, the client ensures perfect forward secrecy by using the periodically changing server key. Both sides now begin encrypting their packets. The server completes the server authentication phase by sending an SSH_SMSG_SUCCESS message. At this point, the peers have established a secure channel, and the server has authenticated itself to the client. User Authentication The next step is for the user to authenticate himself to the server. This can be done in several ways. SSH allows, but discourages, the insecure rhosts trusted-host model. Because it is easily spoofed, this model should never be used when security is important. SSH also supports a variation of the rhosts model in which the identity of the client machine is verified with an RSA key. This is an improvement but still relies on the client to certify the identity of the user. Once again, this method should not be used when security or user identity is a concern. The rhosts and rhosts/RSA methods are discussed in detail in [Barrett and Silverman 2002], so we will not belabor them further here. A third authentication method is to use Kerberos. With this method the user obtains a ‘‘ticket’’ from the Kerberos server and sends it to the SSH server as authentication. Although Kerberos is a complicated system and requires a separate server, it may make sense when there is a large user base, especially if Kerberos is already in place. See [Garman 2003] for more information on Kerberos. Next, there is a class of methods known as password authentication. In the simplest of these, which we’ll examine shortly, the user merely supplies a password, which the server checks against its password file. Recall that after server authentication, the peers have established a secure channel, so this password is not passed in the clear as it is in, say, the telnet protocol. The other password methods are variations of a one-time-password scheme. One example is the popular RSA SecurID system, which is described at RSA’s Web site (). With SecurID, the user

Section 7.2

The SSHv1 Protocol

215

carries a hardware device, such as a key fob, that generates a pseudorandom number every 60 seconds. During authentication, the user enters the pseudorandom number and a private PIN. If the correct values are entered, the user is authenticated, and the session is started. Another one-time-password scheme is the S/Key system [Haller 1994, Haller 1995], a challenge/response system. With S/Key, the user is prompted with a challenge and responds with a set of short words. This response can be obtained either programmatically—from a PDA applet, say—or from a preprinted list. As with SecureID, passwords are not reused, thereby increasing security. The S/Key system is supported by OpenSSH. Each of these password methods has its own protocol, but they all depend on the SSH secure channel for their security. The enhanced methods, such as SecureID and S/Key, derive their increased security from the fact that they use one-time passwords and are thus resistant to local passive attacks, such as keyboard loggers and other attempts to capture the password before it enters the encrypted channel; even if a password is captured, it is useless because it’s not reused. Figure 7.7 shows the protocol for a simple password-based user authentication.

client

server SSH_CMSG

_USER

_FAILURE

SSH_SMSG SSH_CMSG

_AUTH_PA

SSWORD

_SUCCESS

SSH_SMSG

Figure 7.7 User Authentication with a Password

First, the client sends the server the user’s name as a string in the SSH_CMSG_USER message. If no further authentication is required—rhosts authentication is being used, for example—the server will respond with an SSH_SMSG_SUCCESS message, and authentication will be completed. If further credentials, such as a password, are required, the server will respond with an SSH_SMSG_FAILURE message, indicating that the user name alone is not sufficient. SSH_SMSG_SUCCESS and SSH_SMSG_FAILURE are both simple messages with no arguments. At this point, the client will try various authentication methods until it finds one acceptable to the server. In our example session, the client first tries a standard password authentication by sending the server an SSH_CMSG_AUTH_PASSWORD message containing the user’s password as a string. In our case, this is acceptable, and the

216

SSH

Chapter 7

server returns an SSH_SMSG_SUCCESS message indicating that the authentication is complete. Because SSHv1 normally encrypts everything in its messages except the length, there is not much sense in dwelling on packet captures as a debugging aid. Nevertheless, it is instructive to see what the messages look like, to aid us in understanding the protocol. Therefore, let’s briefly examine the messages from Figure 7.7 in detail. We begin with the SSH_CMSG_USER message from the client. As we see from bytes 5–8 in line 1.4, the message is 14 bytes long. From line 1, we see that the TCP segment is 20 bytes long, so there are 20 − 14 − 4 = 2 bytes of padding. The padding is easy to spot because OpenSSH pads with zero bytes. Next, we see the message type (0x04) in byte 11 of line 1.4. From Figure 7.3, we see that this is indeed the SSH_CMSG_USER message. Following the message type is the user ’s logon name, guest in this case. In SSH, string data is preceded by 4 bytes of length followed by the string data. There is no trailing NULL byte. Finally, the last 4 bytes are the CRC, which serves as a MAC. In line 2, the server responds with its SSH_SMSG_FAILURE message. The message type (0x0f) is on line 2.4. Line 3 is the TCP ACK for the segment in line 2. This occurs because of the time it takes to type in the password. 1 1.1 1.2 1.3 1.4 1.5 2 2.1 2.2 2.3 2.4 3

14:56:52.381285 127.0.0.1.32802 > 127.0.0.1.2022: P 184:204(20) ack 316 win 3276 7 (DF) 4500 0048 0daa 4000 4006 2f04 7f00 0001 E..H..@.@./..... 7f00 0001 8022 07e6 2b99 26b8 2b24 4d08 ....."..+.&.+$M. 8018 7fff d4cb 0000 0101 080a 0007 272e ..............’. 0007 272e 0000 000e 0000 0400 0000 0567 ..’............g 7565 7374 33b3 5ce1 uest3.. 14:56:52.381867 127.0.0.1.2022 > 127.0.0.1.32802: P 316:328(12) ack 204 win 3276 7 (DF) 4500 0040 0dab 4000 4006 2f0b 7f00 0001 E..@..@.@./..... 7f00 0001 07e6 8022 2b24 4d08 2b99 26cc ......."+$M.+.&. 8018 7fff a93e 0000 0101 080a 0007 272e .....>........’. 0007 272e 0000 0005 0000 000f 90bf 1d91 ..’............. 14:56:52.413821 127.0.0.1.32802 > 127.0.0.1.2022: . ack 328 win 32767 (DF)

Next, we see the SSH_CMSG_AUTH_PASSWORD message (type 0x09) carrying the password of knockknock. Because this password is acceptable to the server, it responds with a SSH_SMSG_SUCCESS message in line 5. As we see in line 5.4, the SSH_SMG_SUCCESS message carries no data other than its message type (0x0e). 4 4.1 4.2 4.3 4.4 4.5 4.6 4.7 5

14:56:59.534978 127.0.0.1.32802 > 127.0.0.1.2022: P 204:256(52) ack 328 win 3276 7 (DF) 4500 0068 0dba 4000 4006 2ed4 7f00 0001 E..h..@.@....... 7f00 0001 8022 07e6 2b99 26cc 2b24 4d14 ....."..+.&.+$M. 8018 7fff ecd7 0000 0101 080a 0007 29fa ..............). 0007 272e 0000 0029 0000 0000 0000 0009 ..’....)........ 0000 0020 6b6e 6f63 6b6b 6e6f 636b 0000 ....knockknock.. 0000 0000 0000 0000 0000 0000 0000 0000 ................ 0000 0000 7ba8 d3b8 ....{... 14:56:59.536726 127.0.0.1.2022 > 127.0.0.1.32802: P 328:340(12) ack 256 win 3276 7 (DF)

Section 7.2

The SSHv1 Protocol

5.1 5.2 5.3 5.4

4500 7f00 8018 0007

0040 0001 7fff 29fa

0dbc 07e6 3cf8 0000

4000 8022 0000 0005

4006 2b24 0101 0000

2efa 4d14 080a 000e

7f00 2b99 0007 e7b8

0001 2700 29fa 2d07

217

E..@..@.@....... ......."+$M.+.’. .... localhost.ssh: P 23:567(544) ack 24 win 32767 (DF) 1.1 4500 0254 001b 4000 4006 3a87 7f00 0001 E..T..@.@.:..... 1.2 7f00 0001 8007 0016 fded 9b30 fdd3 3e71 ...........0..>q 1.3 8018 7fff 2456 0000 0101 080a 001f 140a ....$V.......... 1.4 001f 140a 0000 021c 0914 752b cb0f 1554 ..........u+...T 1.5 933c a4d0 c8d9 f222 cbbb 0000 003d 6469 . 192.168.123.1: AH(spi=0x0e9ec45c,seq=0x1): IP 192.168.123.5 > 172.30.0.4: icmp 64: echo request seq 512 (ipip-proto-4) 4500 0080 0038 0000 4033 02bc c0a8 7b05 E....8..@3....{. c0a8 7b01 0404 0000 0e9e c45c 0000 0001 ..{............ 2040 b6f2 2283 92f6 39f8 8941 4500 0054 .@.."...9..AE..T 0037 0000 4001 92a2 c0a8 7b05 ac1e 0004 .7..@.....{..... 0800 cd58 3d01 0200 5ed1 3042 6e8f 0300 ...X=...ˆ.0Bn... 0809 0a0b 0c0d 0e0f 1011 1213 1415 1617 ................ 1819 1a1b 1c1d 1e1f 2021 2223 2425 2627 .........!"#$%&’ 2829 2a2b 2c2d 2e2f 3031 3233 3435 3637 ()*+,-./01234567

336

AH

Chapter 11

The outer IP header is on lines 1.1 and 1.2. It has a source address of 192.168.123.5 (0xc0a87b05) and a destination address of 192.168.123.1 (0xc0a87b01). The AH header is on lines 1.2 and 1.3 in boldface. Its next header field is set to IP-in-IP (4). This indicates that an IP header follows the AH header. As mentioned in Chapter 10, we can think of AH tunnel mode as AH transport mode applied to an IP-in-IP tunnel.

The part of the AH header on line 1.3 is the authentication data, which, like the transport example, is HMAC-SHA1-96. We see the inner IP header immediately following the AH header on lines 1.3 and 1.4. The source address is still 192.168.123.5 (0xc0a87b05), but the destination address is 172.30.0.4 (0xac1e0004), which is linux. The ICMP packet (the ping) starts on line 1.5 and has a type of echo request (0x08) as expected. In AH tunnel mode, the situation with NAT is slightly different from what it was with transport mode. It is common for a security gateway to include router and NAT functions. In this case, the gateways can apply NAT before they calculate the ICV. When AH is used alone—that is, with no ESP encryption—it can see the transport-layer port numbers, so it can even apply PAT. On the other hand, NAT is often not necessary: In Figure 11.10, a datagram is carried between two networks with private IP address ranges without the use of NAT. In this respect, the AH tunnel is similar to the IP-in-IP tunnel that we saw in Chapter 4.

11.7

AH with IPv6 The operation of AH with IPv6 is virtually the same as it is with IPv4. The only additional considerations are the mutable fields in the IPv6 header and the placement of the AH header in the IPv6 datagram. Figure 11.11 shows the IPv6 header with the mutable fields shaded. As in the IPv4 case, the mutable fields are zeroed before calculating the ICV. Similar to the IPv4 case and for the same reason, the destination address is mutable but predictable. The address of the final destination is used when calculating the ICV. Because IPv6 considers AH an end-to-end protocol, it comes after the hop-by-hop extension headers instead of immediately after the IP header. The destination extension headers can come before or after the AH header, but unless there is a compelling reason to do otherwise, it makes sense to protect those headers by placing them after the AH header. Thus, the AH encapsulation is a little more complicated with IPv6, because the AH header must be inserted among the other extension headers. Figure 11.12 shows a typical encapsulation for AH transport mode with IPv6. The authentication extends to the middle of the IP header, as usual. The extension headers have been split into two parts: those that come before the AH header and those that come after. The next header field of the AH header will carry the number of the first

Section 11.7

AH with IPv6

0

34 version

11 12

15 16

23 24

traffic class

337

31

flow label

payload length

next header

hop limit

128-bit source address

128-bit destination address

Figure 11.11 The IPv6 Header Mutable Fields

IPv6 header

IPv6 header

extension headers

TCP header

outer ext. headers

AH header

inner ext. headers

data

TCP header

data

authenticated

Figure 11.12 IPv6 AH Transport-Mode Encapsulation

inner extension header or of the upper-layer protocol—TCP, in this case. Immutable fields in the outer extension headers are also included in the ICV calculation. The encapsulation for tunnel mode is easier, just as it was for the IPv4 case. The entire IP datagram is encapsulated by prepending an outer IP header, possible additional extension headers, and an AH header, as shown in Figure 11.13.

338

AH

Chapter 11

outer IPv6 header

new ext. headers

AH header

IPv6 header

extension headers

TCP header

data

inner IPv6 header

extension headers

TCP header

data

authenticated

Figure 11.13 IPv6 AH Tunnel Mode Encapsulation

11.8

Summary In this chapter, we examined the AH protocol. AH can use transport mode to protect data between two hosts or tunnel mode to protect data between two networks or a host and a network. The protection afforded by AH is data-origin authentication and data integrity. Because it does not use encryption, AH does not provide privacy. An attacker can read AH-protected data but not tamper with it. We looked at the input and output processing that IPsec must perform to implement AH. In particular, we studied how AH uses sequence numbers, even in the face of unreliable delivery, to prevent replay attacks. Finally, we briefly looked at how the encapsulation of AH differs in IPv6. We saw that the major difference is that in IPv6, AH must deal with the extension headers.

Exercises 11.1

We rejected as impractical the idea of checking sequence numbers by remembering each sequence number received. Describe an algorithm that checks sequence numbers by remembering sequence numbers that should have been received but weren’t. Critique the practicality of this idea.

11.2

Appendix C of RFC 2401 has a C code reference implementation for AH/ESP sequence number checking, but it uses a window that is 32 sequence numbers wide rather than the recommended 64. Modify the code in RFC 2401 to use a window of 64 sequence numbers.

11.3

Draw a network diagram, similar to Figure 11.8, showing an AH tunnel between a host and a network protected by a security gateway.

11.4

What is the trust model for Figure 11.8? That is, what assumptions is the network designer making about the security of the various parts of the network?

Section 11.8

Summary

339

11.5

With respect to Figure 11.8, how does the NAT situation change if NAT is applied by devices between GW1 and GW2 instead of by the security gateways themselves?

11.6

Use gtunnel to build an AH-like authentication mechanism. For simplicity, use static keying and a single, fixed, authentication algorithm. Use either transport- or tunnel-mode encapsulation.

This page intentionally left blank

12

ESP

12.1

Introduction The Encapsulating Security Payload (ESP) protocol provides the same authentication, data integrity, and antireplay protection that AH provides but adds the IPsec confidentiality function. In tunnel mode, ESP also provides limited protection from traffic analysis. The ESP specification is RFC 2406 [Kent and Atkinson 1998b]. Except for the data authenticated and the placement of the authentication data in the packet, the ESP authentication function is identical to that in AH. Given this, we might wonder why ESP has its own authentication function or even why, given that the data is encrypted, we need authentication at all. It happens that unauthenticated ESP is vulnerable to certain remarkably simple cut-and-paste attacks—see [Bellovin 1996] for details. Because of these attacks, ESP should always be authenticated, and therefore it makes sense to include the authentication function in ESP itself rather than require another set of SAs and another protocol header. As we shall see, ESP, unlike AH, does not authenticate the IP header—the outer IP header in tunnel mode—so it is sometimes useful to use AH in conjunction with ESP where the security model demands that the source address of an IP datagram be authenticated. In [Ferguson and Schneier 1999], the Ferguson and Schneier argue that there is no reason why the IP header needs to be authenticated at all. The receiver knows that the packet was sent by someone who knows the authentication key, so authenticating the IP header, which is merely used to route the packet, does not appear to add any security. In any event, an attacker who knows the authentication key can just as easily forge the IP header and authenticate the forgery. Another point concerning using AH and ESP in tunnel mode is that RFC 2401 [Kent and Atkinson 1998c] does not require that implementations support nested AH and ESP in tunnel mode.

341

342

ESP

Chapter 12

Both authentication and encryption are optional in ESP, but at least one must be used. The particular encryption and authentication algorithms used are specified in the SA. Either of the two functions may be disabled by specifying the NULL algorithm.

12.2

The ESP Header The format of the ESP packet is shown in Figure 12.1. 0

78

15 16

23 24

31

security parameter index (SPI) sequence number

IV and payload data

padding

pad length

next header

authentication data

Figure 12.1 The ESP Header and Trailer

As with AH, the SPI, the destination address, and the IPsec protocol are used to uniquely identify the SA that applies to this packet. Also as with AH, the sequence number is used to provide the antireplay function. When the SA is established, the sequence number is initialized to 0. Before each packet is sent, the sequence number is incremented by 1 and placed in the ESP header. To ensure that no packet will be accepted more than once, the sequence number is not allowed to wrap to 0. Once the sequence number 232 − 1 is used, a new SA and, except in the case of manual keying, a new authentication key are established. As we saw in Chapter 3, some encryption algorithms require an initialization vector, especially for block ciphers used in CBC mode. When an explicit IV is required, it is included in the payload data. In principle, we could send the IV in the first datagram and let the receiver cache the most recent encrypted block for use with the next block’s CBC operation. In practice, the unreliable delivery of IP datagrams makes this impracti-

Section 12.2

The ESP Header

343

cal, and IPsec requires that an IV be sent with each IP datagram [Madson and Doraswamy 1998, Pereira and Adams 1998]. This makes sense because it makes the protocol simpler and allows the receiver to decrypt packets even if they arrive out of order or are lost, a common occurrence. The IV, if needed, and the payload data are placed in the IV and payload data field. As we have seen, block ciphers require that plaintext be padded to a multiple of the block size. Such padding, if needed, is placed immediately after the payload data in the padding field. Even if a stream cipher or NULL encryption is used, we may require padding for alignment or data-hiding purposes. For example, the next header field must be right aligned on a 4-byte boundary, as shown in Figure 12.1, so that the authentication data will start on a 4-byte boundary. It is also possible to add a random number of padding bytes to hide the length of the payload data. In any event, 0 to 255 bytes of padding are added to the payload data. Unless the encryption algorithm specifies otherwise, the first padding byte must be 0x01, the second 0x02, and so on. RFC 2406 says that the receiver should inspect the padding bytes to verify that they meet the prescribed values. This check serves to verify that the decryption was successful and provides a small amount of protection against cut-andpaste attacks when authentication is not used [Doraswamy and Harkins 1999]. The length of the padding is in the pad length field. It can take on any value between 0 and 255 inclusive. The pad length field is always present, even if there is no padding. The next header field indicates what type of data is in the IV and payload data field. We shall see how the next header field is used and some of its common values shortly. The authentication data field contains an integrity check value for the ESP packet. The ICV is calculated over the entire ESP packet except for the authentication data field itself. The ICV must start on a 4-byte boundary and must be a multiple of 32-bit words. The two most common authentication methods are HMAC-MD5-96 [Madson and Glenn 1998a] and HMAC-SHA1-96 [Madson and Glenn 1998b]. Each method takes the first 96 bits from the HMAC-MD5 or HMAC-SHA-1 algorithm (described in Chapter 3) as the ICV. Although it is counterintuitive, restricting the output of the HMACs to the first 96 bits increases the security of the HMAC because it gives an attacker less information to work with. This is discussed in RFC 2104 ([Krawczyk, Bellare, and Canetti 1997]) and [Bellare, Canetti, and Krawczyk 1996]. For what follows, it is convenient to think of the ESP packet as consisting of four parts: 1. The ESP header, which contains the SPI and sequence number fields 2. The payload, which contains the IV and payload data fields 3. The ESP trailer, which contains the padding, pad length, and next header fields 4. The ESP authentication data, which contains the ICV. We will refer to these fields in the rest of this chapter.

344

12.3

ESP

Chapter 12

ESP Processing Before looking at the details of the ESP transport and tunnel modes, we should understand how the TCP/IP stack processes ESP packets. The rules are slightly different for input and output, so we treat them separately.

ESP Output Processing When it is ready to be placed on the output queue, an IP datagram is checked for possible IPsec processing. If ESP encapsulation is required, its exact form depends on whether the SA mandates transport or tunnel mode. We examine this in detail in the next two sections. Output processing involves the following steps. 1. The SPD is searched for an SA that matches the appropriate selectors—source address, destination address, ports, protocol, etc.—in the packet. If an SA doesn’t already exist, a pair of SAs is negotiated (see Chapter 13). 2. The sequence number from the SA is incremented and placed in the ESP header. If the peer has not disabled the antireplay function, the sequence number is checked to make sure that it hasn’t wrapped to 0. 3. Padding is added, if necessary, and the pad length and next header fields are filled in. If the encryption algorithm requires it, an IV is added to the payload data. The IV and data payload and the ESP trailer fields are encrypted, using the algorithm and key specified in the SA. 4. The ICV is calculated over the ESP header, the IV and data payload, and the ESP trailer fields and placed in the authentication data field. The ICV is calculated, using the algorithm and key specified in the SA. 5. If the resulting packet requires fragmentation, it is performed at this point. In transport mode, ESP is applied only to entire IP datagrams. In tunnel mode, ESP may be applied to an IP datagram fragment. For example, a VPN gateway may apply ESP to an IP datagram that was fragmented by the sending host. The order in which the encryption and authentication functions are performed is important. Because authentication is performed last, the ICV is computed over the encrypted data. This means that the receiver can perform the relatively speedy authentication verification before performing the slower decryption process. This prevents an attacker from overloading the receiver by sending a flood of randomly encrypted packets. See [Ferguson and Schneier 1999] for a contrarian view. The authors argue that ‘‘the meaning’’ and not ‘‘what was said’’ should be authenticated, and thus ESP should first authenticate and then encrypt. They point out that if concerns about DOS attacks require the current order, the encryption key should, at the very least, be part of the data authenticated. The principle here, which they call out as Lesson 3, is that not just the message but everything used to determine the meaning of the message should be authenticated.

Section 12.4

Transport Mode

345

It turns out that there is a ‘‘right’’ answer to this question. Krawczyk [Krawczyk 2001] shows that under fairly general assumptions about the encryption and authentication algorithms, encrypting and then authenticating is secure, but authenticating and then encrypting is not. In the context of IPsec, these results are less dispositive than we might hope, because he also shows that the order of encryption/authentication does not effect security when using a block cipher in CBC mode or a stream cipher.

ESP Input Processing Because an IP datagram carrying an ESP packet may have been fragmented by intervening routers, the stack reassembles the IP datagram before performing the ESP processing. After any reassembly, the stack performs the following steps. 1. The SA is retrieved by matching the destination address, protocol (ESP), and SPI of the packet. If no SA exists for the packet, it is dropped. 2. If the antireplay service is enabled, the sequence number of the packet is checked to make sure that it is new and falls within the antireplay window. 3. The packet is authenticated by computing the ICV over the ESP header, payload, and ESP trailer fields, using the algorithm and key specified in the SA. If the authentication fails, the packet is dropped; otherwise, the antireplay window is updated. 4. The payload and ESP trailer fields are decrypted, using the algorithm and key in the SA. If padding was added, it should be checked to make sure it has the values appropriate for the decryption algorithm. The original IP datagram is reconstructed from the ESP packet. The details of this reconstruction depend on whether the SA specifies transport or tunnel mode.

12.4

Transport Mode The IPsec protocols can operate in one of two modes: transport or tunnel. In this section, we investigate the operation of ESP in transport mode, which is used to secure the communication between two fixed hosts. That is, the ESP tunnel connects two specific hosts rather than several hosts on two networks. Transport mode is illustrated in Figure 12.2. host A

ESP transport tunnel

host B

Figure 12.2 ESP in Transport Mode

In transport mode, ESP is used to secure the upper-layer protocols of the IP datagram. This usually means a TCP segment or a UDP datagram, but it could also be an ICMP packet or other legal IP protocol. Figure 12.3 shows the ESP encapsulation for a

346

ESP

Chapter 12

TCP segment. Notice that the ESP header is inserted after the IP header and its options but before the TCP header. As we see in the figure, the payload, consisting of the TCP header and data, and the ESP trailer are encrypted. The ESP header is not encrypted—otherwise, the receiver couldn’t find the SPI and wouldn’t know how to decrypt the packet—but it is authenticated. This means that an attacker can’t, for example, substitute another SPI or sequence number.

IP header

IP header

ESP header

TCP header

TCP header

data

data

ESP trailer

ESP auth. data

encrypted authenticated

Figure 12.3 ESP Transport Encapsulation

Figure 12.3 also shows that the IP header is not protected. This means that an attacker can change any of the IP header fields without detection. Even though Figure 12.2 might make it appear that host A and host B are directly connected, the tunnel will, in general, traverse the Internet or other WAN, so at the very least, an attacker can see how much traffic is flowing between the two hosts and can forge IP headers (but not message content). On the other hand, the receiver of a transport-mode ESP packet can still be confident that it is from its peer, because only the receiver and its peer have the key to the authentication algorithm. This means that if an attacker attempts to forge an IP datagram, the ICV will be incorrect, and the authentication step will fail. Let’s look at transport-mode ESP in action. We set up an ESP transport mode tunnel between bsd (172.30.0.1) and linux (172.30.0.4). The policy for this tunnel, as seen on bsd, is 172.30.0.4[any] 172.30.0.1[any] any in ipsec esp/transport/172.30.0.4-172.30.0.1/require spid=9 seq=1 pid=9886 refcnt=1 172.30.0.1[any] 172.30.0.4[any] any out ipsec esp/transport/172.30.0.1-172.30.0.4/require spid=10 seq=0 pid=9886 refcnt=1

Section 12.4

Transport Mode

347

As usual, the policy comes in pairs. The second policy specifies that any IP datagram going to 172.30.0.4 from 172.30.0.1 should be encapsulated in an ESP transport-mode tunnel. The require keyword at the end of the third line of the policy indicates that the use of this SA is mandatory for any matching IP datagram. The first policy, which covers inbound traffic from 172.30.0.4, is similar. The exact meaning of the values in the printout are specific to the FreeBSD (KAME) implementation and needn’t concern us. Once the tunnel is established, we can display the SAs on bsd: 172.30.0.1 172.30.0.4 esp mode=transport spi=2899419086(0xacd19fce) reqid=0(0x00000000) E: 3des-cbc 4679d9e8 76242719 575f9733 79f3f250 8eef4767 a40ca67f A: hmac-md5 c0646bc1 6421f527 0e873138 81ad53f1 seq=0x00000042 replay=4 flags=0x00000000 state=mature created: Jun 1 13:12:37 2003current: Jun 1 13:29:00 2003 diff: 983(s)hard: 28800(s)soft: 23040(s) last: Jun 1 13:26:54 2003hard: 0(s)soft: 0(s) current: 7984(bytes)hard: 0(bytes)soft: 0(bytes) allocated: 66hard: 0soft: 0 sadb_seq=1 pid=9883 refcnt=2 172.30.0.4 172.30.0.1 esp mode=transport spi=58938842(0x038355da) reqid=0(0x00000000) E: 3des-cbc e773a247 88d38f6d bc6058bb 86fa7212 2f4ecebd 1f5274cf A: hmac-md5 0cd852b6 47084ac3 33f3bb80 c331b54f seq=0x00000042 replay=4 flags=0x00000000 state=mature created: Jun 1 13:12:37 2003current: Jun 1 13:29:00 2003 diff: 983(s)hard: 28800(s)soft: 23040(s) last: Jun 1 13:26:54 2003hard: 0(s)soft: 0(s) current: 5519(bytes)hard: 0(bytes)soft: 0(bytes) allocated: 66hard: 0soft: 0 sadb_seq=0 pid=9883 refcnt=1

As with the display of the policy information, most of the information is implementation specific, but we do see that the tunnel is using Triple DES in CBC mode for encryption and HMAC-MD5 for authentication. With the KAME IPsec implementation, the SPD and SAD are displayed via the setkey command. The meanings of some of the SPD and SAD fields displayed are documented in the setkey man page. The others, unfortunately, must be dug out of the source code.

To see what ESP transport-mode traffic looks like, we ping linux from bsd: $ ping linux PING linux.jcs.local (172.30.0.4): 56 data bytes 64 bytes from 172.30.0.4: icmp_seq=0 ttl=64 time=0.745 ms 64 bytes from 172.30.0.4: icmp_seq=1 ttl=64 time=0.672 ms

. . . The first ping results in: 1

13:26:50.374973 172.30.0.1 > 172.30.0.4: ESP(spi=0xacd19fce, seq=0x3e) 1.1 4500 0078 a7b1 0000 4032 7a61 ac1e 0001 E..x....@2za.... 1.2 ac1e 0004 acd1 9fce 0000 003e 2e10 4a32 ...........>..J2 1.3 e488 a012 f20b 1053 f265 43f2 257e 33e9 .......S.eC.%˜3.

348

ESP

Chapter 12

1.4 1.5 1.6 1.7 1.8

0355 e14f 9455 1280 1689

b1de 2e1e 0e0e b4d3 a104

e40c 93bc bfaf 84df 747e

e1a3 d607 59ea 4a45 3a4b

6164 4daf cec7 348e

021a f014 3111 5d7a

95ff 2fc9 5dd5 8183

5595 be94 51a8 439e

.U......ad....U. .O......M.../... .U....Y...1.].Q. ......JE4.]z..C. ....t˜:K

The first 20 bytes are the IP header (Figure 2.11), which is typeset in boldface. The tenth byte of the IP header is the protocol, a 50 (0x32), indicating that the IP datagram contains an ESP packet. Notice that the source and destination addresses are those of bsd, 172.30.0.1 (0xac1e0001), and linux, 172.30.0.4 (0xac1e0004), as expected. The next 8 bytes are the ESP header: 1.2

ac1e 0004 acd1 9fce 0000 003e 2e10 4a32

...........>..J2

The SPI is 0xacd119fce as in line 1. Similarly, the sequence number is 62 (0x3e). The payload and ESP trailer (Figure 12.3) are encrypted, so we can’t see them explicitly. We know that the payload is an ICMP echo request and therefore that the ESP trailer specifies that the next header is type 1 (ICMP). When we talk about tunnel mode in the next section, we will specify NULL encryption so that we can see the actual data and ESP trailer. The last 12 bytes of the datagram are the authentication data, HMAC-MD5-96 in this case: 1.7 1.8

12.5

1280 b4d3 84df 4a45 348e 5d7a 8183 439e 1689 a104 747e 3a4b

......JE4.]z..C. ....t˜:K

Tunnel Mode ESP tunnel mode is used to provide a VPN between two networks or between a host and a network. A typical configuration is shown in Figure 12.4, in which the two security gateways, GW A and GWB , connect networks A and B with a tunnel-mode ESP VPN. (Contrast this diagram with Figure 12.2.) With this VPN in place, any host on network A can communicate securely with any host on network B. Before looking at how this works, let’s examine the ESP tunnelmode encapsulation. Figure 12.5 shows an IP datagram carrying a TCP segment before and after ESP encapsulation. As we see, the entire IP datagram is swallowed and encrypted by the ESP packet. This means that the ultimate recipient of the packet can be sure that the original IP header has not been tampered with during its transit of the WAN, because it is both encrypted and authenticated. Although the outer IP header is still vulnerable, the final datagram that gets delivered to the recipient is protected. Another aspect of the ESP tunnel-mode encapsulation is that it provides some protection from traffic analysis. That is, an attacker who is capturing the packets as they transit the network is unable to determine the source or destination hosts. The only visible information is that some undetermined host on network A is communicating with some other undetermined host on network B.

Section 12.5

Tunnel Mode

host A1

host A2

1

...

host A3

2

3

host An n

Network A: 10.0.1.0/24

250 GW A 1.1.1.1

WAN

2.2.2.2 GWB 250 1

Network B: 10.0.2.0/24

2

host B1

3

host B2

host B3

m

...

host Bm

Figure 12.4 An ESP Tunnel-Mode VPN

outer IP header

ESP header

IP header

TCP header

data

inner IP header

TCP header

data

encrypted authenticated

Figure 12.5 ESP Tunnel-Mode Encapsulation

ESP trailer

ESP auth. data

349

350

ESP

Chapter 12

The construction of the outer IP header follows simple and mostly obvious rules, which are specified in RFC 2401 [Kent and Atkinson 1998c]. Most of the fields in the outer IP header are constructed from scratch, using the expected values. These fields include the header length, total length, ID, fragment offset, checksum, TTL, protocol, and the source and destination addresses. The TOS field is copied from the inner header. The DF flag may or may not be copied, depending on implementation and configuration. Looking again at Figure 12.4, we see that network A has an RFC 1918 private address, as does network B. Although this is not necessary, it is a common practice, and it will also illustrate how the nonroutable RFC 1918 addresses can be sent through the Internet or other WAN. We also see that GW A has a routable address of 1.1.1.1 on its WAN interface. Similarly, GWB has the routable address 2.2.2.2 on its WAN interface. Assume that host A2 sends a TCP segment to host B3 . Let’s follow the IP datagram as it leaves host A2 until it reaches host B3 . As shown in Figure 12.6, when the datagram leaves host A2 , it is a ‘‘normal’’ IP datagram with a source address of 10.0.1.2 and a destination address of 10.0.2.3. The protocol field in the IP header is set to 6, indicating that the upper-layer protocol is TCP. Because host A2 either has its default route set to GW A or has a route to the 10.0.2.0/24 network with GWA as the next hop, the datagram is routed to GW A . When the datagram reaches GW A , the gateway checks its SPD and notices that it has a policy specifying that any datagram from the 10.0.1.0/24 network to the 10.0.2.0/24 network should be encapsulated with tunnel-mode ESP and sent to GWB at 2.2.2.2. After GWA encapsulates the IP datagram, the outer IP header has a source address of 1.1.1.1 (GW A ) and a destination address of 2.2.2.2 (GWB ). The protocol field of the outer IP header is 50, indicating that the upper-layer protocol is ESP. The next header field of the ESP packet is 4, indicating that the ESP packet is encapsulating an IP datagram. The inner IP header is unchanged. When the encapsulated IP datagram arrives at GWB , the gateway sees that it contains an ESP packet and retrieves the authentication and encryption keys from the appropriate SA, performs the authentication checks, and decrypts the ESP payload. The outer IP header, the ESP header and trailer, and the ICV are stripped off, and the inner IP datagram is forwarded to its destination, which is 10.0.2.3 (host B3 ). One more thing is worth noting in this example. Although neither of the gateways performs any NAT functions on the datagram as it passes through, we have connected two networks with private, nonroutable addresses. Unlike NAT, neither the inner IP header nor the TCP segment is modified. Now let’s look at an example of ESP tunnel mode on the wire. For this example, we set up an ESP tunnel-mode VPN between the 192.168.123.0/24 and 172.30.0.0/24 networks in our testbed. We will plug host laptop into the 192.168.123.0/24 network and use it to telnet into host solaris over the VPN. The relevant portions of our testbed are shown in Figure 12.7. Here is a copy of the IPsec policy on laptop: 172.30.0.0/24[any] 192.168.123.5[any] any in ipsec esp/tunnel/192.168.123.1-192.168.123.5/require

Section 12.5

Tunnel Mode

351

Host A2 10.0.1.2 IP Proto: 6 10.0.1.250

TCP segment

S: 10.0.1.2 D: 10.0.2.3

GW A 1.1.1.1 IP Proto: 50 2.2.2.2

ESP Nxt Hdr: 4

S: 1.1.1.1 D: 2.2.2.2

IP Proto: 6

TCP segment

ESP Trailer

ICV

S: 10.0.1.2 D: 10.0.2.3

GWB 10.0.2.250 IP Proto: 6 10.0.2.3

TCP segment

S: 10.0.1.2 D: 10.0.2.3

Host B3

Figure 12.6 Packet Flow from Host A2 to Host B3

spid=2 seq=1 pid=372 refcnt=1 192.168.123.5[any] 172.30.0.0/24[any] any out ipsec esp/tunnel/192.168.123.5-192.168.123.1/require spid=1 seq=0 pid=372 refcnt=1

The policy on bsd is the same except that the roles of the in and out directions are reversed. We have set the encryption algorithm to NULL so that we can see the contents of the ESP packets. From laptop we telnet into solaris: laptop:˜ $ telnet solaris

352

ESP

Chapter 12

laptop 5 1

192.168.123.0/24

bsd

...

1

solaris 3 172.30.0.0/24

Figure 12.7 A Portion of the Network Testbed

Trying 172.30.0.3... Connected to solaris.jcs.local. Escape character is ’ˆ]’. SunOS 5.8 login: guest Password: Sun Microsystems Inc. $

SunOS 5.8

Generic February 2000

Here is the tcpdump output of the SYN segment from laptop to solaris:

1     18:50:51.262663 192.168.123.5 > 192.168.123.1: ESP(spi=0x09e95635, seq=0x34) [tos 0x10]
1.1       4510 0068 0788 0000 4032 fb74 c0a8 7b05    E..h....@2.t..{.
1.2       c0a8 7b01 09e9 5635 0000 0034 4510 003c    ..{...V5...4E..<
1.3       0787 4000 4006 4b56 c0a8 7b05 ac1e 0003    ..@.@.KV..{.....
1.4       040d 0017 1939 f0e5 0000 0000 a002 e000    .....9..........
1.5       c660 0000 0204 05b4 0103 0300 0101 080a    .`..............
1.6       0004 ae90 0000 0000 0102 0204 4d60 d80b    ............M`..
1.7       56a6 7e32 b9ac 51fc                        V.~2..Q.

The outer IP header is typeset in boldface. The tenth byte of the IP header (Figure 2.11) is the protocol number of the upper-layer protocol. As expected, it is set to ESP (50 = 0x32). The following two 32-bit words are the source and destination addresses—192.168.123.5 (0xc0a87b05) and 192.168.123.1 (0xc0a87b01)—as shown in line 1. The next 8 bytes are the ESP header:

1.2       c0a8 7b01 09e9 5635 0000 0034 4510 003c    ..{...V5...4E..<
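To check such a dump programmatically, here is a small Python sketch of our own (not from the book) that pulls the protocol number, the addresses, and the ESP SPI and sequence number out of the raw bytes of the outer IP header and ESP header:

    import socket
    import struct

    def parse_esp(packet):
        # packet: the raw IPv4 datagram bytes; assumes a 20-byte IP header.
        proto = packet[9]                               # tenth byte: upper-layer protocol
        src = socket.inet_ntoa(packet[12:16])
        dst = socket.inet_ntoa(packet[16:20])
        spi, seq = struct.unpack('!II', packet[20:28])  # first 8 bytes of the ESP header
        return proto, src, dst, spi, seq

    # The first 28 bytes from lines 1.1 and 1.2 of the dump:
    raw = bytes.fromhex('4510 0068 0788 0000 4032 fb74 c0a8 7b05'
                        ' c0a8 7b01 09e9 5635 0000 0034')
    print(parse_esp(raw))   # (50, '192.168.123.5', '192.168.123.1', 0x9e95635, 0x34)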
1     ... 192.168.123.1.isakmp: isakmp 1.0 msgid : phase 1 I agg:
          (sa: doi=ipsec situation=identity
            (p: #1 protoid=isakmp transform=1
              (t: #1 id=ike (type=lifetype value=sec) (type=lifeduration value=0e10)
                (type=enc value=3des) (type=auth value=preshared) (type=hash value=sha1)
                (type=group desc value=modp1024))))
          (ke: key len=128) (nonce: n len=16)
          (id: idtype=IPv4 protoid=udp port=500 len=4 192.168.123.5)
1.1       4500 0110 00a3 0000 4011 01e3 c0a8 7b05    E.......@.....{.
1.2       c0a8 7b01 01f4 01f4 00fc 439e 4598 0d8e    ..{.......C.E...
1.3       0b43 b113 0000 0000 0000 0000 0110 0400    .C..............
1.4       0000 0000 0000 00f4 0400 0034 0000 0001    ...........4....
1.5       0000 0001 0000 0028 0101 0001 0000 0020    .......(........
1.6       0101 0000 800b 0001 800c 0e10 8001 0005    ................
1.7       8003 0001 8002 0002 8004 0002 0a00 0084    ................
1.8       2aa6 27ab a407 e593 f013 5bbe c25f 1be4    *.'.......[.._..
1.9       fcab 63a3 348c 4fa9 4a1f 4328 155b 4ca1    ..c.4.O.J.C(.[L.
1.10      62dd d762 6808 e226 c442 adc5 545d c336    b..bh..&.B..T].6
1.11      077a e70a 16d4 f583 7b3f e014 753b ccb2    .z......{?..u;..
1.12      5357 e025 3ce8 8ea3 9572 db05 3593 5d3e    SW.%<....r..5.]>
1.13      cca8 7987 305f 016e 52c4 e70c 1dc3 82c9    ..y.0_.nR.......
1.14      0bab 06f0 ea3a 9c49 71e4 3aa7 edba 1af5    .....:.Iq.:.....
1.15      1a8e d5bd a8d6 896f 1390 5c83 360d d9fe    .......o..\.6...
1.16      0500 0014 c7e2 d68a 8e8f 1b55 3936 2ca0    ...........U96,.
1.17      e458 6e4b 0000 000c 0111 01f4 c0a8 7b05    .XnK..........{.
2     18:00:00.429855 IP 192.168.123.1.isakmp > 192.168.123.5.isakmp: isakmp 1.0 msgid : phase 1 R agg:
          (sa: doi=ipsec situation=identity
            (p: #1 protoid=isakmp transform=1
              (t: #1 id=ike (type=lifetype value=sec) (type=lifeduration value=0e10)
                (type=enc value=3des) (type=auth value=preshared) (type=hash value=sha1)
                (type=group desc value=modp1024))))
          (ke: key len=128) (nonce: n len=16)


          (id: idtype=IPv4 protoid=udp port=500 len=4 192.168.123.1)
          (hash: len=20) (vid: len=16)
2.1       4500 013c f287 0000 4011 0fd2 c0a8 7b01    E..<....@.....{.
          ...
2.13      06e5 2676 252b 8c12 061f 6f1d fc53 90a2    ..&v%+....o..S..
2.14      e59d 8be1 aaeb 8b1b 7a5c e6e1 33c2 ae18    ........z\..3...
2.15      da28 087f ec03 ac50 5188 6063 ac22 d956    .(.....PQ.`c.".V
2.16      0500 0014 b0f7 d512 2fb2 e41b 9fd1 1227    ......../......'
2.17      d795 a479 0800 000c 0111 01f4 c0a8 7b01    ...y..........{.
2.18      0d00 0018 6d70 6043 8779 6e0d 4963 01cf    ....mp`C.yn.Ic..
2.19      e803 cac4 aa13 ccb5 0000 0014 7003 cbc1    ............p...
2.20      097d be9c 2600 ba69 83bc 8b35              .}..&..i...5
3     18:00:00.492011 IP 192.168.123.5.isakmp > 192.168.123.1.isakmp: isakmp 1.0 msgid : phase 1 I agg: (hash: len=20)
3.1       4500 0050 00a4 0000 4011 02a2 c0a8 7b05    E..P....@.....{.
3.2       c0a8 7b01 01f4 01f4 003c 6346 4598 0d8e    ..{......<cFE...
          ...
4     ... 192.168.123.5.isakmp: isakmp 1.0 msgid : phase 2/others R inf[E]: [encrypted hash]
5     18:00:01.526554 IP 192.168.123.5.isakmp > 192.168.123.1.isakmp: isakmp 1.0 msgid : phase 2/others I oakley-quick[E]: [encrypted hash]
6     18:00:01.541429 IP 192.168.123.1.isakmp > 192.168.123.5.isakmp: isakmp 1.0 msgid : phase 2/others R oakley-quick[E]: [encrypted hash]
7     18:00:01.548966 IP 192.168.123.5.isakmp > 192.168.123.1.isakmp: isakmp 1.0 msgid : phase 2/others I oakley-quick[E]: [encrypted hash]
8     18:00:02.334647 IP 192.168.123.5 > 192.168.123.1: AH(spi=0x0504cb4c,sumlen=16,seq=0x1): icmp 64: echo request seq 512

In line 1, we see a breakout of the first message in the phase 1 negotiation. Notice that this is an Aggressive mode (agg) negotiation. The first payload is the SA offer (sa:) with a single Proposal payload (p:), which in turn has a single Transform payload (t:). The Transform payload has six attributes that describe the SA's lifetime (type and duration), encryption algorithm, authentication method, integrity method, and Diffie-Hellman group. After the SA payload come the Key Exchange, Nonce, and Identification payloads.
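As an aside, the fixed ISAKMP header that opens this message can be unpacked with a few lines of Python. This sketch is ours, not the book's; it extracts the same fields that the byte-by-byte walkthrough below identifies:

    import struct

    def parse_isakmp_header(data):
        # data: the 28-byte ISAKMP header (it spans the end of line 1.2
        # through the first half of line 1.4 in the dump).
        icookie, rcookie, next_payload, version, exch_type, flags, msg_id, length = \
            struct.unpack('!8s8sBBBBII', data[:28])
        return {
            'initiator_cookie': icookie.hex(),
            'responder_cookie': rcookie.hex(),
            'next_payload': next_payload,   # 1 = SA payload
            'version': (version >> 4, version & 0x0f),
            'exchange_type': exch_type,     # 4 = Aggressive mode
            'flags': flags,
            'message_id': msg_id,
            'length': length,
        }

    hdr = bytes.fromhex('4598 0d8e 0b43 b113 0000 0000 0000 0000'
                        ' 0110 0400 0000 0000 0000 00f4')
    print(parse_isakmp_header(hdr))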


Let's examine the packet itself to see how this data is carried in the message. Lines 1.1 and 1.2 contain the IP and UDP headers. On lines 1.2–1.4, we see the ISAKMP header set in boldface. As we see from Figure 13.1, the first 16 bytes are the initiator and responder cookies. Notice that the responder cookie is 0 in the first message, as expected. The next byte (0x01) tells us that the next payload is an SA payload (see Figure 13.2), followed by the major and minor version (0x10) and the exchange type (0x04). From Figure 13.3, we see that this is an Aggressive mode exchange. The next byte is the flags; as we see, no flags are set. The next 8 bytes are the message ID (0) and the length of 244 (0xf4). On lines 1.4 and 1.5, we see the SA payload:

1.4       0000 0000 0000 00f4 0400 0034 0000 0001    ...........4....
1.5       0000 0001 0000 0028 0101 0001 0000 0020    .......(........

Notice from Figure 13.6 that the next header field (0x04) is a Key Exchange payload, not the Proposal payload, which comes next but is considered part of the SA payload. The second 4 bytes are the DOI, which is IPSEC (1). The last 4 bytes of the SA payload proper are the situation, which in this case is SIT_IDENTITY_ONLY. Following the SA payload is the Proposal payload on line 1.5. The format of this payload was shown in Figure 13.7.

1.5       0000 0001 0000 0028 0101 0001 0000 0020    .......(........

The length of 40 (0x28) includes the following Transform payload, just as the SA payload length field included the Proposal and Transform payloads' lengths. Also, the next payload field is NONE (0), even though there is a following Transform payload. If there were another Proposal in the SA payload, this field would be set to Proposal (2). The proposal ID in the fifth byte is 1, as we saw in line 1. The protocol ID (in the sixth byte) is a 1, indicating ISAKMP (see Figure 13.8). The seventh byte is the SPI size. Because this is 0, the SPI field is not present in the payload. Recall that the SPI for phase 1 SAs is the two cookies. Finally, the last byte tells us that there is one Transform payload for this proposal, shown on lines 1.5–1.7:

1.5       0000 0001 0000 0028 0101 0001 0000 0020    .......(........
1.6       0101 0000 800b 0001 800c 0e10 8001 0005    ................
1.7       8003 0001 8002 0002 8004 0002 0a00 0084    ................

As we see from Figure 13.9, most of the information in a Transform payload is carried in the attributes. We see from the first 2 bytes on line 1.6 that this is transform 1 and that the transform ID is IKE (1)—see Figure 13.10. The remaining bytes in the payload are the attributes. For example, the first attribute (on line 1.6) is 800b 0001. Because the AF bit is set (see Figure 13.5), this is a basic attribute, and its value is carried in the second 2 bytes. From Figure 13.11, we see that attribute 11 (0x000b) is the lifetime type for this SA. From the IKE specification (RFC 2409), we see that a value of 1 is seconds. The next attribute, 800c 0e10, is the lifetime duration attribute, which tells us how many seconds the SA should live before being replaced. Note from Figure 13.11 that this is a variable attribute but that because the value fits in 16 bits, it was encoded as a basic attribute instead. The lifetime is 3,600 (0xe10) seconds, or 1 hour. The other attributes are similar, and we can see what they are from line 1.

The Key Exchange payload is on lines 1.7–1.15. Its generic header is the last 4 bytes on line 1.7. The rest of the payload is the key-exchange data. Its length is 132 (0x84), but tcpdump shows only the length of the key-exchange data (128) rather than the total payload length. The Nonce payload on lines 1.16 and 1.17 is similar. It consists of a generic header and 16 bytes of nonce. Its next payload is set to Identification payload (5). The Identification payload appears on line 1.17:

1.17      e458 6e4b 0000 000c 0111 01f4 c0a8 7b05    .XnK..........{.

Its next payload field is set to NONE (0), indicating that this is the last payload. The fifth byte (0x01) is the identification type. From the DOI (RFC 2407), this specifies that the identification is an IPv4 address. The protocol ID is 17 (0x11), indicating UDP (Figure 2.12), and the port is the default value of 500 (0x1f4). The initiator's IP address is 192.168.123.5 (0xc0a87b05), as expected.

The other two phase 1 messages are similar. Exercise 13.11 asks us to perform a similar analysis on them. The phase 2 Quick mode messages are encrypted, so we can't see their contents. On line 8, we see the AH-protected ping that initiated the negotiation we have been studying.
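The attribute encoding walked through above is mechanical enough to decode in a few lines of Python. The following sketch is ours, not the book's; the attribute numbers in the comment are the phase 1 attribute classes from RFC 2409:

    import struct

    # Attribute classes from RFC 2409 (phase 1): 1 = encryption algorithm,
    # 2 = hash algorithm, 3 = authentication method, 4 = group description,
    # 11 = life type, 12 = life duration.
    def decode_attributes(data):
        # If the AF bit (0x8000) is set, the attribute is basic and its value
        # is carried in the next 2 bytes; otherwise those 2 bytes give the
        # length of a variable-length value that follows.
        attrs, i = [], 0
        while i < len(data):
            t, v = struct.unpack('!HH', data[i:i + 4])
            if t & 0x8000:
                attrs.append((t & 0x7fff, v))
                i += 4
            else:
                attrs.append((t, data[i + 4:i + 4 + v]))
                i += 4 + v
        return attrs

    # The six attributes from lines 1.6 and 1.7 of the dump:
    data = bytes.fromhex('800b 0001 800c 0e10 8001 0005'
                         ' 8003 0001 8002 0002 8004 0002')
    print(decode_attributes(data))
    # [(11, 1), (12, 3600), (1, 5), (3, 1), (2, 2), (4, 2)]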

13.5 Summary

In this chapter, we explored ISAKMP and IKE, the mechanism by which IPsec negotiates security associations and exchanges keying material. Although these SAs can be configured by hand, manual keying does not scale well and is subject to the security shortcomings that long-lived keys always suffer from.

We began by observing that IKE, the Internet Key Exchange protocol, is a hybrid of three other protocols: ISAKMP, OAKLEY, and SKEME. ISAKMP provides the infrastructure on which we can build a variety of key-exchange protocols. OAKLEY and SKEME contribute exchange modes and authentication methods to IKE. In addition, the original four Diffie-Hellman groups that IKE uses, the OAKLEY groups, come from OAKLEY.

Before we could study and understand IKE, we had to examine the ISAKMP message formats, payload types, and exchange modes. ISAKMP works in two phases. In the first phase, an ISAKMP SA is negotiated. The SA provides an encrypted and authenticated channel over which the second phase negotiates the IPsec SAs that are used by AH and ESP.

In addition to negotiating the SAs and handling the key exchange, IKE authenticates each peer to the other. This ensures that each node can be sure of the identity of its peer. There are four ways to do this authentication: shared secrets, digital signatures, public key encryption of nonces, and revised public key encryption of nonces. As we saw, these methods all have their advantages and disadvantages. In the event that the peers are acting as proxies and negotiating SAs for client hosts, they can, in Main mode, hide the identity of those hosts. This ability is missing in Aggressive mode, but this mode has the advantage of using only half the messages required by Main mode.

The fundamental method that ISAKMP and IKE use to establish a secure channel is to exchange Diffie-Hellman public values from which they derive a shared secret. This shared secret, in turn, is combined with a nonce from each peer, and other parameters from the exchange, and run through an HMAC calculation to generate keying material for the cryptographic algorithms. Each of the algorithms uses this material in an algorithm-specific manner to generate its keys.

After phase 1 is completed, either peer can initiate a New Group exchange to negotiate a different Diffie-Hellman group for future SAs. The new group can be specified either by its identifier—in the case of the predefined groups—or by the group attributes for new groups.

Quick mode, the phase 2 exchange mode, can generate keys very quickly by combining the Diffie-Hellman shared secret from phase 1 with nonces exchanged in phase 2. This method has the advantage of not requiring expensive big-number exponentiations, but it can't provide perfect forward secrecy. If Key Exchange payloads are included in the Quick mode exchange, perfect forward secrecy is provided at the cost of the Diffie-Hellman exponentiations.

We concluded the chapter with an example of an Aggressive mode IKE negotiation using shared secret authentication. We performed a fairly complete analysis of one of the messages, showing how the various payloads are combined to form the initiator's message. We were not able to do a similar analysis on the phase 2 messages, because they are encrypted, and only the ISAKMP header is visible to us.

Exercises

13.1  Why do the ISAKMP payloads need a next payload field?

13.2  Is the CONNECTED notification message an ISAKMP or an IPsec message?

13.3  Consider the case of a mobile host with a nonfixed IP address using Main mode with shared-secret authentication to negotiate an SA with a security gateway. Why doesn't the mobile host have the same problem with Main mode that the security gateway does?

13.4  What is the quantity g^(xi xr) in the calculation of SKEYID for the authentication with digital signatures method and the calculations of the quantities SKEYID_d, SKEYID_a, and SKEYID_e?

13.5  In IKE Aggressive mode, the Authentication payload is optionally encrypted. How does the responder know whether it's encrypted?

13.6  Does Main mode with signature authentication have the same problem with mobile hosts and dynamic IP addresses that Main mode with shared-secret authentication has? Why or why not?

13.7  How does authentication with signatures guarantee that the state variables were not tampered with in transport? How does it authenticate each node to its peer?

13.8  Does authentication with a preshared key offer the same repudiation as authentication with public key encryption?

13.9  In the final analysis, both authentication with public key encryption and authentication with digital signatures use public key cryptography to authenticate the peers. Why doesn't authentication with digital signatures offer repudiation?

13.10 What is the responder's cookie in the sample negotiation of Section 13.4?

13.11 Perform an analysis of the second and third phase 1 messages from the sample negotiation of Section 13.4 similar to that we performed of the first message.

13.12 In Section 13.4, we showed the results of pinging through an AH transport mode tunnel. We terminated ping after six requests had been sent, but we received only four replies. Given that this experiment was performed on a LAN, why did we lose the two packets?


14 IPsec Futures

14.1 Introduction

At the time of this writing (early 2005), the IPsec Working Group is developing specifications for new versions of AH, ESP, and IKE. Although these specifications are still in the Internet Draft stage, they are nearing completion and will soon become RFCs. We can expect that the final versions of these protocols will be essentially as described in this chapter.

Because the specifications are not yet in final form, and because it will likely take some time for implementations to appear and be deployed, we do not describe them in the same detail as we have the current versions. Rather, we discuss how they differ from today's versions and what additional problems they solve. For reference, our discussion is based on the following drafts:

    Architecture    draft-ietf-ipsec-rfc2401bis-05.txt
    AH              draft-ietf-ipsec-rfc2402bis-11.txt
    ESP             draft-ietf-ipsec-esp-v3-10.txt
    IKE             draft-ietf-ipsec-ikev2-17.txt

We also discuss NAT Traversal (NAT-T), a method of easing the interoperability problems between NAT and IPsec. The NAT-T specifications, RFC 3947 [Kivinen, Swander, Huttunen, and Volpe 2005] and RFC 3948 [Huttunen, Swander et al. 2005], were released in January 2005, so implementations should start appearing soon. We discuss NAT-T in detail and see how it can overcome most of the problems that IPsec has when running in an environment that includes NAT.

14.2 IPsec Architecture

The current IPsec architecture is specified in RFC 2401 [Kent and Atkinson 1998c]; we refer to the replacement described in the RFC 2401bis draft as the new architecture. Although many of the details in the new architecture differ from the current version, the fundamental ideas and underlying principles remain the same.

The IPsec Processing Model

In the new architecture, the SPD selection and routing functions are separated; consequently, SPD entries are no longer associated with an interface. This means that for unicast packets, the meaning of the security parameter index (SPI) no longer depends on the destination address. Whether the SPI's meaning depends on the IPsec protocol is a local matter; an implementation can use a single number space for both AH and ESP, or it can use separate number spaces for each.

Implementations are no longer required to support nesting or SA bundles but are still free to do so. The new specification describes a method of achieving the same results through entries in the SPD and forwarding tables.

IPsec is now allowed to process fragmented IPv4—but not IPv6—tunnel-mode packets. The new architecture specifies several methods for dealing with fragments, and fields supporting this are now required in the SPD.

Finally, AH support is no longer required, but implementations are free to provide it. This recognizes the fact that although in almost all cases, the ESP-provided integrity is sufficient, in a small number of contexts, the extra protection that AH provides is useful.

The Security Policy Database

In the new architecture, the function and form of the SPD are specified much more precisely than in RFC 2401. Recall from Chapter 10 that the SPD describes the processing to be applied to each packet by matching selectors in the packet with selectors in the SPD. Also recall that the SPD is ordered so that the more specific of overlapping policies will be applied before the more general. In the new architecture, the SPD conceptually consists of three disjoint but interleaved parts:

SPD-I   The entries that describe the inbound traffic that should be discarded or bypass IPsec
SPD-O   The entries that describe the outbound traffic that should be discarded or bypass IPsec
SPD-S   The entries that describe traffic to which IPsec should be applied

The disposition of inbound traffic is specified in either the SPD-I or the SPD-S. Similarly, the disposition of outbound traffic is specified in either the SPD-O or the SPD-S. We could, for example, describe outbound processing as checking the SPD-O to see whether a packet should be discarded or bypass IPsec and then checking the SPD-S if no matching entry is found in the SPD-O.

The new architecture envisions that caches will be associated with each of the three parts of the SPD. For example, when a packet first matches an SPD-O entry, the entry will be placed in the SPD-O cache. The rationale is that using the caches would improve IPsec's performance.

Unfortunately, the ordering of the SPD can cause problems with using caches this way. Suppose, for example, that a more general overlapping entry is cached, but the more specific is not. If a packet matching the more specific policy is presented for processing, IPsec will find the incorrect—that is, more general—cached entry and perform the wrong action on the packet. To solve this problem, the new architecture includes a method for decorrelating the SPD entries—that is, for removing overlapping policies by breaking them into multiple disjoint policies. An example will make this clear. Suppose we have the following two policies:

1.  Traffic to 192.168.1.10 should have ESP applied.
2.  Traffic to 192.168.1.1 through 192.168.1.200 should be discarded.

Notice that the order is important because if policy 2 were checked first, traffic to 192.168.1.10 would be discarded instead of having ESP applied. To decorrelate these entries, we transform the second policy into two policies, neither of which overlaps the first policy:

1.  Traffic to 192.168.1.10 should have ESP applied.
2a. Traffic to 192.168.1.1 through 192.168.1.9 should be discarded.
2b. Traffic to 192.168.1.11 through 192.168.1.200 should be discarded.

Notice that in this case, we can check the policies in any order without ambiguity. In particular, we can add these decorrelated policies to the caches without fear of mishandling a packet.
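As an illustration, here is a sketch of our own (not part of the architecture document) that decorrelates a single-selector policy list by subtracting already-claimed ranges from each later policy:

    def decorrelate(policies):
        # policies: list of ((low, high), action) pairs over a single selector,
        # ordered most specific first. Returns an equivalent list whose ranges
        # are pairwise disjoint, so they can be checked (or cached) in any order.
        result = []
        for (low, high), action in policies:
            pieces = [(low, high)]
            for (rlow, rhigh), _ in result:
                next_pieces = []
                for plow, phigh in pieces:
                    if phigh < rlow or plow > rhigh:     # no overlap with the claimed range
                        next_pieces.append((plow, phigh))
                        continue
                    if plow < rlow:                      # part below the claimed range
                        next_pieces.append((plow, rlow - 1))
                    if phigh > rhigh:                    # part above the claimed range
                        next_pieces.append((rhigh + 1, phigh))
                pieces = next_pieces
            result.extend((p, action) for p in pieces)
        return result

    # The example from the text, with addresses reduced to their last octet:
    policies = [((10, 10), 'ESP'), ((1, 200), 'DISCARD')]
    print(decorrelate(policies))
    # [((10, 10), 'ESP'), ((1, 9), 'DISCARD'), ((11, 200), 'DISCARD')]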

Traffic Selectors

The new architecture SPD is enhanced to allow more flexible packet handling. Selectors are now specified as lists of ranges. For example, we can specify that a policy should apply to destination addresses 192.168.1.1–192.168.1.13, 192.168.1.25–192.168.1.100, and 10.1.0.1–10.1.0.50. Single addresses are expressed as a trivial range. Ports can be specified in the same way, so in the new architecture, we can specify port ranges. For ICMP packets, we can specify the type and code fields (Figure 2.24) instead of a port, so we have increased granularity for these packets. Similarly, we can specify the IPv6 mobility header (MH) type as a policy selector. See RFC 3775 [Johnson, Perkins, and Arkko 2004] for a discussion of the IPv6 mobility header.

In addition to the selectors, the SPD indicates the mode (transport or tunnel), the local and remote tunnel addresses for tunnel mode, whether the SA should use extended 64-bit sequence numbers, the IPsec protocol (AH or ESP), cryptographic algorithms, some flags specifying how to handle fragments and the differentiated service bits, and the populate from packet (PFP) flags that specify whether the SA should use the selector from the packet or from the SPD.

The PFP flags are a generalization of the RFC 2401 ability to generate multiple SAs from the same policy entry. For example, if one of the policy selectors specified a destination address of 192.168.1.1–192.168.1.50 and if the PFP flag were set for the destination address, a separate SA would be negotiated for any of the 50 destination addresses for which IPsec sees a packet. If the PFP flag is not set, the same SA would be used for all 50 destination addresses.

There are PFP flags for

• The local address
• The remote address
• The next-layer protocol
• The local port, ICMP type/code field, or MH type
• The remote port, ICMP type/code field, or MH type

Support for Multicast

The new architecture supports multicast and anycast as well as unicast packets. As discussed above, the SPI or perhaps the SPI and the IPsec protocol are sufficient to locate the SA for an incoming unicast packet. Because a third party, rather than the destination host, usually supplies the SPI for multicast packets, a collision with a host-generated SPI is possible, and further steps are required to avoid ambiguity.

An IPsec-protected single-source multicast group packet is specified by the source address, destination address, and SPI, so these three quantities are consulted to determine the appropriate SA. In the case of a multisource multicast group, only the destination address and SPI are used to specify the SA. Thus, there are three ways of mapping an IPsec packet to its SA, and we must have some way of ensuring that we do this in an unambiguous way.

The new architecture accomplishes this by using a ‘‘longest match'' rule. When processing an incoming packet, IPsec first searches the SAD for a match on the SPI and source and destination addresses. If there is no match, the SAD is searched again, looking for a match on destination address and SPI. If there is still no match, the SAD is searched for a matching SPI. If this last search also fails, there is no SA for this packet, and the packet is dropped.

The Peer Authorization Database

The new architecture adds a third database, the peer authorization database (PAD), to the SPD and SAD. The PAD is used to tie together IKE and the SPD. It contains information on which peers are allowed to negotiate AH or ESP SAs with the host.

As an example, consider a mobile host that establishes a VPN with a security gateway on its home network from many different IP addresses. In this case, it is obviously not possible to simply use the mobile host's source address to locate the proper policy or check whether it's authorized. Instead, the mobile host sends identifying information in the Identity payload, which the security gateway uses to check authorization and to locate an appropriate policy.

The PAD also includes information on the method of identity authorization—shared secret, digital signature, and so on—that the responder should use to verify the identity of the initiator. If certificates are involved, the PAD contains information on acceptable root certificates and revocation lists. In short, the PAD contains the information that the responder needs to execute the authorization protocol with an initiator.

The PAD may be notional in the sense that its functions are part of the SA management software—IKE or the equivalent—and that no distinct database exists. Whether or not the PAD exists as a distinct database is a local matter. The new architecture uses it as the (possibly notional) locus of information needed to perform initiator authorization and to locate the appropriate SPD entries.

14.3 AH

As noted in our discussion of the new IPsec architecture, AH is no longer a mandatory protocol. Other than that, the main changes from RFC 2402 [Kent and Atkinson 1998a] are the new SPI processing, discussed in the previous section, and the new processing required to deal with the 64-bit extended sequence numbers. The AH header (Figure 11.2) remains unchanged from RFC 2402, so the on-the-wire behavior of AH is essentially the same as before.

As we discussed in Chapter 11, it is critical that the sequence number not be allowed to wrap when antireplay protection is enabled. When sequence number wrapping is imminent, IPsec invokes IKE to negotiate a new SA. Because the new SA would have a new SPI and new keys, an attacker could not replay old packets with the same sequence number. In very high speed connections, the 32-bit sequence number can wrap frequently and therefore cause frequent renegotiation of SAs. To alleviate this problem, IPsec can use a 64-bit extended sequence number (ESN) for AH and ESP. As we'll see later, IKEv2 negotiates whether an SA will use normal 32-bit sequence numbers or the extended 64-bit sequence numbers.

To save bandwidth and keep the AH header unchanged, only the lower 32 bits of an ESN are transmitted in the AH packet. The two peers maintain separate versions of the entire 64-bit sequence number counter and keep them synchronized as described next. In order to guarantee that a packet is not a replay, the receiver must ensure that its value of the ESN is the same as the sender's—that is, that the untransmitted upper 32 bits of the ESN are the same as the sender's. In order to keep the peers' ESN counters synchronized, the sender includes the upper 32 bits in the ICV for the packet. That is, the data used to compute the MAC includes the upper 32 bits. This does not add to the length of the AH packet, of course.

Let's briefly examine how to perform the antireplay check. We use the same antireplay window technique that we discussed in Chapter 11. As before, the specification mandates that this window be at least 32 sequence numbers wide and recommends that it be 64 sequence numbers wide by default. The specification also recommends that the window size be configurable so that it can be increased for high-speed connections.

First, let us assume that a connection will drop or lose no more than 2^32 − 1 consecutive packets. In the very unlikely event that this happens, we must take special action to resynchronize the counters. We discuss that procedure shortly. There are two cases to consider. In the first case, every sequence number in the window has the same upper 32 bits. In the second case, the value of the upper 32 bits increases by 1 at some packet in the window.

Case 1 is straightforward. If the packet's sequence number is to the right of the window, we assume that the sequence number check succeeded, and we update the window in the usual way. If, in fact, the upper 32 bits differ, the packet will fail the ICV check and be rejected anyway. Similarly, if the packet's sequence number falls within the window, we accept or reject it, depending on whether it is marked as already received. Again, if the upper 32 bits differ, the packet will be rejected by the ICV check. Finally, if the packet lies to the left of the window, we tentatively assume that the counter's lower 32 bits have wrapped and that the upper 32 bits have increased by 1. If the packet passes the ICV check, these assumptions are confirmed, and we update the antireplay window and ESN counter appropriately. If the ICV check fails, we reject the packet and leave the sequence number state as it was.

In the second case, the window straddles a point in the ESN number space where the lower 32 bits wrapped, so packets within the window can have differing values for the upper 32 bits. The antireplay check in this case is similar to the preceding one, but the details are a little fussier because we have to track which value of the upper 32 bits to use (see Exercise 14.1). The complete details are in Appendix B of the AH and ESP drafts.

If 2^32 or more consecutive packets are lost, the peers lose synchronization of the upper 32 bits. This presents two problems: First, the peers must detect that their ESN counters have become desynchronized; second, they must resynchronize them. The recommended solutions for these problems are as follows. A host counts the number of consecutive packets that fail the ICV check. When this number exceeds some configured threshold, the receiver assumes that its and its peer's ESN have become desynchronized. To resynchronize, the receiver selects one of the packets that failed the ICV check and retries it by successively incrementing the value of the upper 32 bits. If the packet authenticates with one of these trial values, that value is assumed to be the correct value for the upper 32 bits, and the sequence number state is adjusted to reflect that. To prevent an old or forged packet from causing this process to run away, the receiver should limit the number of trial values that it checks.

Recall that antireplay checking is optional and that the receiver can choose to disable it. Note, however, that when the ESN option is enabled, antireplay should be too, because the antireplay algorithms are what keep the ESN counters synchronized. In any event, specifying ESN without antireplay makes little sense, because the point of sequence numbers is the antireplay check.
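A minimal sketch of the case 1 check, our own illustration rather than code from the drafts, makes the three window positions concrete. Here top is the highest 64-bit sequence number accepted so far, window is the set of sequence numbers already seen within the antireplay window, and icv_ok is a callback that verifies the ICV with a candidate 64-bit sequence number folded into the MAC input:

    WINDOW_SIZE = 64

    def case1_check(top, window, low32, icv_ok):
        # low32: the 32-bit sequence number carried in the packet.
        upper = top >> 32
        candidate = (upper << 32) | low32

        if candidate > top:                      # to the right of the window
            return icv_ok(candidate), candidate
        if candidate > top - WINDOW_SIZE:        # inside the window
            if candidate in window:
                return False, candidate          # marked as already received: replay
            return icv_ok(candidate), candidate
        # To the left of the window: tentatively assume the low 32 bits wrapped
        # and the upper 32 bits are one larger; the ICV check confirms or refutes it.
        candidate = ((upper + 1) << 32) | low32
        return icv_ok(candidate), candidate

The caller updates the window and ESN counter only when the returned flag is true, which matches the "leave the sequence number state as it was" rule for a failed ICV check.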

14.4 ESP

Like AH, the new version of ESP is very similar to the old (ESPv2) as specified in RFC 2406 [Kent and Atkinson 1998b]. In particular, if one of the new combined-mode cryptographic algorithms (see the next subsection) is not in use, the packet format is the same as it was for ESPv2 (Figure 12.1). On the other hand, there are enough differences, including in some cases the on-the-wire packet format, that the new version is called ESPv3.

Just as with AH, ESPv3 supports the new SPI semantics and optional extended 64-bit sequence numbers (ESN). When ESN is enabled, ESP uses the same antireplay mechanisms that AH does.

Combined-Mode Cryptographic Algorithms

Combined encryption/authentication algorithms can increase performance by providing confidentiality and authentication with a single pass over the data. Such algorithms may or may not require that an explicit ICV be appended, so it is possible that the authentication data field of Figure 12.1 is not present in a packet protected by a combined-mode algorithm.

Combined-mode algorithms may also have other problems. For example, if ESN is enabled, the upper 32 bits are not present in the packet, so it may not be possible to include them in the authentication calculations. In that case, the high-order bits of the ESN counter may have to be added to the packet explicitly. Similarly, the SPI and sequence number fields from Figure 12.1 are included in the authentication calculation, so they too may have to be replicated in the encrypted part of the packet.

These considerations dictate that combined-mode algorithm packets must be processed differently from packets that use separate algorithms. Because the details will vary with the particular combined-mode algorithm, these details must be defined in the specification for that algorithm, and each algorithm must provide facilities to implement those details.

Optional Padding and Dummy Packets

As we discussed in Chapter 12, tunnel-mode ESP can provide a limited amount of protection from traffic analysis. Although up to 255 bytes of padding can be added to a packet, this padding is intended to fill out encryption blocks and provide alignment. The amount of padding is not adequate to effectively hide the payload size. To remedy this, ESPv3 provides two additional mechanisms that make traffic analysis more difficult.

The first mechanism is called traffic flow confidentiality (TFC). TFC allows an arbitrary number of bytes to be added just before the encryption/alignment padding, as shown in Figure 14.1.

Figure 14.1 The ESPv3 Packet (security parameter index (SPI), sequence number, IV and payload data, TFC padding, encryption/alignment padding, pad length, next header, authentication data)

Notice that there is no field that specifies the length of the TFC padding. This means that the receiver must have some way of determining this length in order to discard it. With tunnel mode, the total length field from the IP header (Figure 2.11) can be used to determine the TFC length. For transport mode, the length can sometimes be determined from the upper-layer protocol length field. This is the case with UDP (Figure 2.14) and ICMP (Figure 2.22) but not with TCP (Figure 2.16), which does not carry the total length of the TCP segment in its header.

The second mechanism that ESPv3 provides to fight traffic analysis is the dummy packet. ESP can generate and transmit dummy packets, which the receiver discards without further processing. Dummy packets are identified by a protocol ID of 59 in the ESP next header field. The SPI, sequence number, next header, padding, pad length, and ICV fields must be present so that IPsec can authenticate and decrypt the packet before discovering that it's a dummy packet. The data portion of the packet contains arbitrary data. The intention is that these packets will be transmitted at random intervals to obscure the traffic flow. The optimal strategy for generating the dummy packets depends on the traffic they are intended to protect and is a local matter.
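For tunnel mode, stripping TFC padding comes down to trusting the inner datagram's own total length field. A small sketch of our own, assuming the ESP payload has already been decrypted and the encryption/alignment padding removed, looks like this:

    import struct

    def strip_tfc(inner):
        # inner: the decrypted tunnel-mode ESP payload, i.e. the inner IP
        # datagram possibly followed by TFC padding.
        total_len = struct.unpack('!H', inner[2:4])[0]   # inner IP total length field
        return inner[:total_len]                         # anything beyond it is TFC padding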

14.5 IKE

The most significant changes in the new protocols are in IKE. IKEv2 is considerably different from the first version, which we discussed in Chapter 13. As we'll see, the SA establishment protocol is considerably simplified. At the same time, IKEv2 has significantly enhanced capabilities to negotiate fine-grained SAs in a reliable manner.

IKEv2 exchanges consist of two messages: a request and a response. The sender of the request provides reliability by implementing a TCP-like retransmission strategy. The default behavior is for a node to send a request and wait for the response, but messages are sequenced, so the peers can agree to multiplex several exchanges onto a single SA at the same time.

IKE Messages

IKEv2 messages are formed almost exactly as they were in IKEv1. Each message begins with an IKE header that has the same format (see Figure 13.1) that it did in IKEv1, except that the initiator and responder cookies are renamed the initiator and responder SPIs. Following the header are one or more IKE payloads, each of which begins with a generic header, just as in IKEv1. The IKEv2 generic header is the same as in IKEv1 (see Figure 13.4), except that a critical bit has been added at bit 8 to indicate that the responder must recognize this payload. Although we won't discuss the payload formats in detail, most of them have, and are similar to, an IKEv1 equivalent.

IKE Exchanges

Recall from Chapter 13 that IKEv1 had eight possible exchange sequences that it could use to establish an IKE SA. After the IKE SA was in place, a three-message quick exchange was needed to establish a phase 2 SA. In IKEv2, the process of establishing these SAs is considerably simplified. IKEv2 uses three exchange types to negotiate SAs:

IKE_SA_INIT        This is the initial exchange, in which the peers establish a secure channel. After the IKE_SA_INIT exchange, all further exchanges are encrypted.

IKE_AUTH           This is the second exchange, in which the peers authenticate themselves to each other and create the first child SA, a phase 2 SA in IKEv1 terminology.

CREATE_CHILD_SA    This exchange is used to create additional child SAs. It serves the same function that the Quick mode exchange does in IKEv1.

Each of these exchanges is only two messages, so the process of establishing an IKE SA and its first child SA involves only four messages. Each additional child SA takes two messages. Note that IKEv2 no longer speaks of phase 1 and 2 SAs; instead, it uses the terms IKE SA and child SA.

Let's see how IKE uses these exchanges to negotiate SAs. Figure 14.2 shows the normal IKE_SA_INIT/IKE_AUTH exchange sequence that establishes an IKE SA and its first child SA. In the typical case of a single ESP tunnel between two nodes, this is all that is required to establish the VPN. The first two messages are the IKE_SA_INIT exchange. The peers exchange keying material and nonces in the KE and NONCE payloads and negotiate IKE SA encryption and authentication parameters in the SA1 payloads. The responder can request a certificate from the initiator by including the optional CERTREQ payload. If it wants a certificate from the responder, the initiator will request it in the IKE_AUTH exchange.

Figure 14.2 The IKE_SA_INIT and IKE_AUTH Exchanges (initiator -> responder: HDR, SA1i, KEi, NONCEi; responder -> initiator: HDR, SA1r, KEr, NONCEr [, CERTREQ]; initiator -> responder: HDR, IDi [, CERT] [, CERTREQ] [, IDr], AUTH, SA2i, TSi, TSr; responder -> initiator: HDR, IDr [, CERT], AUTH, SA2r, TSi, TSr)

The second two messages are the IKE_AUTH exchange. The two peers exchange their identities in the Identification (ID) payloads and authenticate them with data carried in the AUTH payloads. The exact data in the AUTH payload depends on the authentication method. If the responder has more than one identity, the initiator can indicate which identity it wishes to use, by including the IDr payload in its IKE_AUTH message. Except for the header, everything is encrypted in the IKE_AUTH message. We are using the same convention as in Chapter 13: Payloads that are encrypted are set in italics.

The SA2, TSi, and TSr payloads are used to negotiate the child SA. The SA2 payloads negotiate the IPsec protocol and the cryptographic primitives that it will use. Unlike IKEv1, the traffic selectors are not specified in the SA payload but are specified in their own TSi and TSr payloads. Also unlike IKEv1, the generation of keying material does not depend on the type of peer authentication that is used.

One of the cryptographic primitives negotiated in the SA1 payloads is a pseudorandom function prf(K, M). This, in turn, is used to define the iterative function prf*(K, M) as

    prf*(K, M) = T1 || T2 || ... || Tn

where

    T1 = prf(K, M || 0x01)
    T2 = prf(K, T1 || M || 0x02)
    ...
    Tn = prf(K, Tn-1 || M || n)

Keying material for the IKE SA is generated by first calculating the quantity KEYSEED as

    KEYSEED = prf(NONCEi || NONCEr, g^(xi xr))

where g^(xi xr) is the Diffie-Hellman shared secret calculated from the Key Exchange payloads. Next, a long string of bits, KEYMAT, is generated by

    KEYMAT = prf*(KEYSEED, NONCEi || NONCEr || SPIi || SPIr)

and keying material for the authentication, encryption, and integrity algorithms is extracted from this string. A seed for further keying material, SKd, is also extracted from KEYMAT. Keying material for the first child SA is generated in a similar way, using SKd instead of KEYSEED:

    KEYMAT = prf*(SKd, NONCEi || NONCEr)

where the nonces are from the IKE_SA_INIT exchange.

IKEv2 has three methods for the peers to authenticate their identities. Two of these methods involve having the peers sign selected data from the exchanges. The peers may do this using the prf and a shared secret, or they can use one of the common signature methods, such as DSA or RSA. The third method is to use the Extensible Authentication Protocol (EAP). In general, EAP is more complicated and will involve the exchange of additional messages, but it provides a general framework into which new authentication methods can be added without changing the IKEv2 protocol. EAP is specified in RFC 3748 [Aboba, Blunk, Vollbrecht et al. 2004].
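The iterative construction is easy to express in code. The following sketch is ours, with HMAC-SHA1 assumed as the negotiated prf and purely illustrative nonce, SPI, and Diffie-Hellman values; it illustrates the construction above rather than serving as production key-derivation code:

    import hmac
    import hashlib

    def prf(key, msg):
        # An assumed prf: HMAC-SHA1. The real prf is whatever the peers negotiate.
        return hmac.new(key, msg, hashlib.sha1).digest()

    def prf_star(key, msg, nbytes):
        # prf*(K, M) = T1 || T2 || ..., where T1 = prf(K, M || 0x01) and
        # Tn = prf(K, Tn-1 || M || n); keep going until nbytes are available.
        out, t, n = b'', b'', 1
        while len(out) < nbytes:
            t = prf(key, t + msg + bytes([n]))
            out += t
            n += 1
        return out[:nbytes]

    # Illustrative values only:
    nonce_i, nonce_r = b'0' * 16, b'1' * 16
    spi_i, spi_r = b'\x11' * 8, b'\x22' * 8
    dh_secret = b'\x42' * 128          # stands in for g^(xi xr) from the KE payloads

    keyseed = prf(nonce_i + nonce_r, dh_secret)
    keymat = prf_star(keyseed, nonce_i + nonce_r + spi_i + spi_r, 64)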

If additional child SAs are required or if the IKE SA or one of the child SAs needs to be rekeyed, the peers execute a CREATE_CHILD_SA exchange. Figure 14.3 shows the CREATE_CHILD_SA exchange.

Figure 14.3 The CREATE_CHILD_SA Exchange (initiator -> responder: HDR, N, SA, NONCEi [, KE] [, TSi, TSr]; responder -> initiator: HDR, SA, NONCEr [, KEr] [, TSi, TSr])


If the exchange is to rekey an existing SA, the first payload must be a Notify payload, N, that specifies which SA is being rekeyed. If a new child SA is being negotiated, the N payload must not be present. For perfect forward secrecy, the peers can include the optional Key Exchange payloads in the exchange. In this case, the new keying material will be generated by

    KEYMAT = prf*(SKd, g^(xi xr) || NONCEi || NONCEr)

where the g^(xi xr) is the new Diffie-Hellman shared secret from the exchange. The nonces are from the current exchange. If perfect forward secrecy is not required, the new keying material is generated the same way except that the Diffie-Hellman shared secret is not concatenated to the nonces.

There are two other types of exchanges. The first is used to help prevent denial-of-service attacks. The rationale for this exchange is much like the cookie exchange from IKEv1. An attacker can send dummy IKE_SA_INIT requests with a forged source address, forcing the responder to perform expensive Diffie-Hellman calculations for its response. If the responder notices that it has several half-completed IKE SAs, it can insist that its peer prove that it is reachable at its source address by sending a Notify payload containing a cookie instead of the response to the IKE_SA_INIT request. The requester then repeats its request, including the Notify payload with the cookie. Figure 14.4 shows the exchanges.

Figure 14.4 The Initial Exchange with a Cookie (initiator -> responder: HDR(I,0), SA1i, KEi, NONCEi; responder -> initiator: HDR(I,0), N; initiator -> responder: HDR(I,0), N, SA1i, KEi, NONCEi; responder -> initiator: HDR(I,R), SA1r, KEr, NONCEr [, CERTREQ]; followed by the usual IKE_AUTH messages HDR(I,R), IDi [, CERT] [, CERTREQ] [, IDr], AUTH, SA2i, TSi, TSr and HDR(I,R), IDr [, CERT], AUTH, SA2r, TSi, TSr)

The HDR(x,y) notation shows the contents of the SPIs in the header: I indicates the initiator's SPI and R the responder's SPI; 0 indicates the SPI has not been determined yet.


The other exchange is the INFORMATIONAL exchange. Like all IKEv2 exchanges, each INFORMATIONAL exchange request expects a response. Three types of payloads can be included in an INFORMATIONAL exchange. Any number of any combination of payloads can be included, as shown in Figure 14.5.

Figure 14.5 The INFORMATIONAL Exchange (initiator -> responder: HDR [, N] [, D] [, CP] ...; responder -> initiator: HDR [, N] [, D] [, CP] ...)

We’ve already seen the Notify payload (N) in conjunction with cookies. There are several other types as well. They carry error and status information, much as they do in IKEv1. The Delete payload (D) informs the peer that the sender has deleted one or more of its incoming SAs. The responder is expected to delete those SAs as well and will usually include Delete payloads for the corresponding SAs in the other direction in its response message. The Configuration payload (CP) is used to negotiate configuration data between the peers. One important use of the CP payload is to request (request) and assign (response) an address on a network protected by a security gateway. In the typical case, a mobile host will establish a VPN with a security gateway on its home network and will request that it be given an IP address on the home network. Notice that this eliminates one of the problems that the combined use of L2TP and IPsec (Chapter 5) is intended to solve. IKEv2 is a very rich protocol, and we have looked only at some of its more important features here. IKEv2 is intended to simplify the key-management function in IPsec, to add new flexibility in negotiating SAs, and to solve some of the problems in IKEv1.

14.6 NAT Traversal

We have already seen several instances of interactions between IPsec and NAT/PAT that result in operability problems. In this section, we use ‘‘NAT'' to mean both NAT and PAT. We use ‘‘PAT'' when we want to make it clear that we are concerned with the port-remapping aspects of NAT/PAT.

RFC 3715 [Aboba and Dixon 2004] discusses these problems at length and remarks that attempts to solve them with NAT alone have not been completely successful and in some cases have made the problems worse. In this section, we discuss a relatively new solution, NAT-Traversal (NAT-T), that addresses many of these issues. Most of the interoperability difficulties between IPsec and NAT involve one or more of the following facts.


• AH authenticates the source IP address and ports and thus prevents NAT from changing them.
• With ESP, TCP and UDP ports are authenticated and protected, and therefore PAT cannot remap the source port.
• TCP/UDP checksums depend, through the pseudoheader, on the source and destination addresses. When NAT changes these addresses, the TCP/UDP checksums are invalidated. Because the checksums are authenticated and/or encrypted, NAT cannot recalculate them.

Because not much can be done about the first problem, NAT-T concerns itself only with ESP and IKE. Even with ESP, some common situations don't have a good solution. We look at a couple of these problem areas after we discuss the operation of NAT-T. As we'll see shortly, IPsec peers negotiate the use of NAT-T through IKE.

UDP Encapsulation of ESP and IKE

When NAT-T is used, ESP packets, whether in transport or tunnel mode, are encapsulated in UDP packets, as shown in Figure 14.6. The utility of this encapsulation is immediately clear when we consider that PAT now has an unprotected source port that it can remap. RFC 3948 [Huttunen, Swander et al. 2005] specifies that the UDP checksum should be disabled by setting the checksum field to 0 and that this traffic should be sent on the same source and destination port that is used for IKE traffic. The source and destination ports are normally set to 4500, but the source port is subject to remapping by PAT. That means that the peer of a node behind a NAT must send its UDP packets to the remapped port instead of to port 4500.

NAT Keep-Alives

One of the problems that can occur when IPsec and NAT are used together is that the NAT mapping may time out if there is no traffic for a while. To prevent this, an IPsec node behind a NAT must ensure that the connection is not idle for longer than the NAT timeout threshold. The IPsec node does this by periodically sending keep-alive packets to its peer, using the same source and destination ports as the ESP and IKE traffic use. The keep-alive packets are a normal UDP packet with a single byte of 0xff as the payload. A peer receiving a keep-alive packet ignores it, because its only purpose is to prevent the sender's NAT from timing out. As with the encapsulated ESP packet, RFC 3948 specifies that the checksum should be disabled.
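Sending a keep-alive is trivial. Here is a sketch of our own of the sender side; it assumes the same UDP socket already carries the IKE and UDP-encapsulated ESP traffic, and it leaves the UDP checksum handling to the operating system's stack:

    import socket
    import time

    NAT_T_PORT = 4500
    KEEPALIVE = b'\xff'      # a single 0xff byte

    def send_keepalives(sock, peer, interval=20):
        # sock: the UDP socket used for IKE and UDP-encapsulated ESP, so the
        # keep-alives refresh the same NAT mapping.
        # peer: (address, port) of the other IPsec endpoint.
        while True:
            sock.sendto(KEEPALIVE, peer)
            time.sleep(interval)

    # Example setup (addresses are illustrative):
    # sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # sock.bind(('0.0.0.0', NAT_T_PORT))
    # send_keepalives(sock, ('2.2.2.2', NAT_T_PORT))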

Figure 14.6 ESP UDP Encapsulation (transport mode: IP hdr., UDP hdr., ESP hdr., TCP hdr., data, ESP trailer, ESP auth.; tunnel mode: outer IP hdr., UDP hdr., ESP hdr., inner IP hdr., TCP hdr., data, ESP trailer, ESP auth.; in both modes the TCP or inner IP header through the ESP trailer is encrypted, and the ESP header through the ESP trailer is authenticated)

IKE and NAT-T

As mentioned earlier, the use of NAT-T is negotiated by IKE. Both peers must support NAT-T for this to happen, of course, so they inform each other of their willingness to engage in NAT-T negotiations, by means of Vendor ID payloads, which they send in the first two messages of a phase 1 negotiation. The content of the Vendor ID payload is the MD5 hash of ‘‘RFC 3947.'' RFC 3947 [Kivinen, Swander, Huttunen, and Volpe 2005] specifies the NAT-T negotiation protocol within IKE. Recall from Chapter 13 that the Vendor ID payload contains a hash of some vendor-defined string.

Once the peers have agreed to negotiate NAT-T, they perform NAT discovery, which determines which, if any, of the peers is behind a NAT. They do this by sending NAT Discovery payloads as part of the phase 1 negotiation—specifically, in the third and fourth messages of Main mode or the second and third messages of Aggressive mode. The NAT Discovery (NAT-D) payload, shown in Figure 14.7, is a hash of an IP address and port. Each node sends at least two NAT-D payloads: The first is the hash of its peer's address and port, and the second is the address and port from which it is sending. If a node is multihomed and cannot determine which interface it will be using, it sends a NAT-D payload for each of its interfaces, in addition to the one for its peer's IP address and port. The hash is calculated as

    HASH(CKYi || CKYr || IP || port)

where HASH is the negotiated hash. The payload type for the NAT-D payload is 20.
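A sketch of the NAT-D hash computation (ours; SHA-1 is assumed here as the negotiated hash, and the cookie values are illustrative) makes the inputs explicit:

    import hashlib
    import socket
    import struct

    def nat_d_hash(cky_i, cky_r, addr, port, hashfn=hashlib.sha1):
        # cky_i, cky_r: the 8-byte initiator and responder cookies from the
        # ISAKMP header; addr: dotted-quad IPv4 address; port: UDP port.
        data = cky_i + cky_r + socket.inet_aton(addr) + struct.pack('!H', port)
        return hashfn(data).digest()

    # Illustrative cookies only:
    cky_i = bytes.fromhex('4598 0d8e 0b43 b113')
    cky_r = bytes.fromhex('0011 2233 4455 6677')
    print(nat_d_hash(cky_i, cky_r, '192.168.123.5', 500).hex())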

Figure 14.7 The NAT-D Payload (next payload, reserved, payload length, followed by the hash of an IP address and port)

When a node receives the NAT-D payloads, it checks that

• The first received NAT-D payload matches one of the NAT-D payloads for its local interfaces
• One of the other received NAT-D payloads matches the payload for its peer's address, that is, the first NAT-D payload that it sends to its peer

If both of these tests succeed, the addresses have not changed in transit, so there is no NAT between the peers. If the first test fails, this peer is behind a NAT, so it must start sending NAT keep-alives, as discussed in the previous subsection. Even if the other peer is also behind a NAT, that NAT won't tamper with the destination address. When the first test fails, it means that the destination address has changed and therefore that the receiving node is behind a NAT.

We mentioned previously that IKE and ESP will use port 4500 when NAT-T is in use. The reason for this is that many NATs try to interoperate with IKE by not remapping port 500, even if there are multiple hosts behind the NAT, or by using some other method to demultiplex IKE traffic. By switching to port 4500, IPsec avoids this behavior and ensures that NAT-T can operate as intended. In Main mode, the initiator changes ports when it sends its Identification payload. Figure 14.8 shows a typical Main mode exchange when NAT-T is in use. The UDP header is included in the flow diagram in order to show which ports IKE is using. In the figure, VID is the Vendor ID payload, and NATD is the NAT-D payload.

Notice that in the second message, the responder replies to port X, the remapped port 500 from the initiator (see Exercise 14.7). After the initiator changes ports in the fifth message, the responder begins replying to port Y, the remapped port 4500. In Aggressive mode, the responder sends its NAT-D payloads in the second message, as shown in Figure 14.9. When the initiator receives this message, it will know whether a NAT is between the peers and, if so, will switch to port 4500. In the third message, the initiator sends its NAT-D payloads, so after receiving this message, the responder can determine whether a NAT is between the peers and also change to port 4500 if there is. Again, note how the responder replies to the remapped ports rather than to port 500 or 4500. RFC 3947 says that the peers may begin their negotiation on port 4500 and thus avoid having to switch ports.

Figure 14.8 Main-Mode Exchange with NAT-T (the six Main mode messages with their UDP ports: UDP(500, 500) HDR, SA, VID; UDP(500, X) HDR, SA, VID; UDP(500, 500) HDR, KE, NONCEi, NATD, NATD; UDP(500, X) HDR, KE, NONCEr, NATD, NATD; UDP(4500, 4500) HDR, IDi, AUTH; UDP(4500, Y) HDR, IDr, AUTH)

Figure 14.9 Aggressive-Mode Exchange with NAT-T (UDP(500, 500) HDR, SA, KE, NONCEi, IDi, VID; UDP(500, X) HDR, SA, KE, NONCEr, IDr, AUTH, VID, NATD, NATD; UDP(4500, 4500) HDR, NATD, NATD, AUTH; UDP(4500, Y) HDR, ...)

Although not shown in Figure 14.8 or Figure 14.9, there is an additional change to the IKE packets when the peers change to port 4500. Because the encapsulated ESP packets will be carried in UDP datagrams using the same ports as IKE, IPsec must have a way of distinguishing between encapsulated ESP packets and IKE packets. IPsec does this by adding a non-ESP marker of four 0-bytes to IKE packets, as shown in Figure 14.10. Notice that the marker is in the same position as the SPI of an ESP packet.
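The receiver's demultiplexing on port 4500 therefore comes down to inspecting the first bytes of each UDP payload; a sketch of our own of that decision:

    def classify_nat_t(payload):
        # payload: the UDP payload received on port 4500.
        if payload == b'\xff':
            return 'nat-keepalive'
        if payload[:4] == b'\x00\x00\x00\x00':
            return 'ike'              # non-ESP marker; an IKE packet follows
        return 'esp'                  # anything else starts with a nonzero SPI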

Figure 14.10 An IKE Packet with NAT-T (UDP header: source port, destination port, length, checksum; followed by the non-ESP marker (0x00000000) and then the IKE packet)

This means that ESP must never send an SPI of 0, but as we saw in Chapter 13, this is illegal anyway. After one of the exchanges shown in Figure 14.8 or Figure 14.9, IKE has negotiated the use of NAT-T for itself; all future IKE packets will use the new ports and will include the non-ESP marker.

The use of NAT-T by ESP is negotiated in the Quick mode exchange by adding two new encapsulation modes to the SA payload: UDP-Encapsulated-Tunnel (3) and UDP-Encapsulated-Transport (4). These two modes replace the normal Tunnel and Transport modes when one or both of the peers is behind a NAT.

Although it provides a port for PAT to remap, UDP encapsulation does not address the problem of TCP/UDP checksums. As mentioned earlier, these checksums include a pseudoheader and therefore depend on the source and destination addresses. NAT itself cannot correct these checksums, because they are encrypted and/or authenticated. Therefore, they must be adjusted after ESP has processed the packet. On the other hand, the receiver may not know the original addresses that were used to calculate the checksum. Therefore, the NAT-T peers must send each other these original addresses. The peers do this by exchanging their addresses in a NAT-OA payload, shown in Figure 14.11.

Figure 14.11 The NAT-OA Payload (next payload, reserved, payload length; ID type, reserved; followed by the IPv4 or IPv6 address)

The NAT-OA payload has a payload type of 21. It is carried in the second and third message of the Quick exchange, as shown in Figure 14.12.


Figure 14.12 The Quick Exchange with NAT-T (initiator -> responder: HDR, HASH1, SA, NONCEi [, KE] [, IDi, IDr] [, NATOAi, NATOAr]; responder -> initiator: HDR, HASH2, SA, NONCEr [, KE] [, IDi, IDr] [, NATOAi, NATOAr]; initiator -> responder: HDR, HASH3)

Unsolved Problems

There are interoperability problems between IPsec and NAT that NAT-T doesn't solve. A thorough discussion of these problems and the requirements to solve them are in RFC 3715 [Aboba and Dixon 2004]. We mention two examples from RFC 3948 that illustrate the types of problems that can occur.

For the first example, imagine two mobile hosts in different locations connecting to their home network through a security gateway via a tunnel-mode VPN. Let's suppose that each of the mobile hosts is behind a NAT and that each has an assigned private address of 192.168.0.1, as shown in Figure 14.13. When only one of the tunnels is active—say, from host A—NAT-T handles everything just as we discussed earlier. The security gateway has an SA that says that traffic to 192.168.0.1 should be protected with ESP using UDP-Encapsulated-Tunnel mode encapsulation and that the traffic should be sent to 1.1.1.1. Now let's consider what happens if both tunnels are up at the same time. From the point of view of host C or D on the home network, both host A and host B are at 192.168.0.1, with the next hop being the security gateway. The security gateway has two SAs that specify UDP-Encapsulated-Tunnel mode ESP for 192.168.0.1, but they go to different endpoints, and the gateway has no way of determining which SA to use.

The second example shows that transport mode can also be a problem. Suppose that host A and host B are behind the same NAT and that they both want to establish an ESP transport-mode connection to a remote server, host S, and send all their TCP traffic through it. Neither host A nor host B is aware of the other, and from their point of view, they each have a UDP-encapsulated-transport mode SA specifying that all TCP traffic to and from host S should be protected with ESP. Host S, however, has a problem. From its point of view, both host A and host B have the same address—their common NAT's external address—and both SAs say to protect all TCP traffic with ESP. The problem is, host S has no way of knowing which SA, and therefore which keys, to use, because both host A and host B look the same to it. In general, having the same traffic descriptor for multiple hosts behind the same NAT is not possible.

[Figure 14.13 Two Mobile Hosts with the Same Private Address: host A, at 192.168.0.1 on its own 192.168.0.0/24 network, sits behind a NAT with external address 1.1.1.1; host B, also at 192.168.0.1 on a separate 192.168.0.0/24 network, sits behind a NAT with external address 2.2.2.2. Both connect across the Internet to the security gateway GW, behind which hosts C and D sit on the home network.]

14.7 Summary

In this chapter, we took a quick look at the near-term future of IPsec. As we saw, the changes to AH and ESP are minimal, but IKE has been enhanced to make it both simpler and more flexible. IKEv2 is able to negotiate lists of address and port ranges, as well as type/code values for ICMP and IPv6 mobility header types. The IKEv2 protocol is reliable and uses two-message (request/response) exchanges, making the entire negotiation shorter. IKEv2 can negotiate an IKE SA and the first child SA in only four messages. Because the most common situation requires only these two SAs, we can normally establish an ESP VPN with only four IKE messages.

We also looked at NAT traversal. We saw how NAT-T can solve many, but not all, of the problems caused by the interaction of IPsec and NAT. Part of NAT-T involves NAT discovery, by which the peers determine whether a NAT is between them, so NAT-T can always be enabled and will adjust to whether a NAT is present when the peers negotiate a new IKE SA.

Exercises

14.1  Describe a method for checking an extended sequence number when the replay window contains sequence numbers that differ in their upper 32 bits.

14.2  Sketch an algorithm for decorrelating a policy database that has only two selectors.

14.3  The longest-match rule for source address, destination address, and SPI requires that the SAD be searched three times. Describe an algorithm for searching the SAD

14.4  Why is IKEv2 able to create a child SA (with the CREATE_CHILD_SA exchange) in only two messages, whereas the Quick mode exchange from IKEv1 took three messages?

14.5  Why does it make sense for IKE and the UDP encapsulation of ESP to share ports under NAT-T?

14.6  Why does the NAT-D hash include the IKE cookies CKYi and CKYr?

14.7  In Figure 14.8, which peer is behind the NAT? How can we tell?

14.8  Explain why ESP must never send an SPI of 0 when using NAT-T.


Appendix A

Source Code

A.1 Introduction

This appendix contains some miscellaneous source code that was referenced, but did not appear, in the main text. Short descriptions of the few library routines from ETCP used in the text are also included. The source code for these library routines is available on the author's Web site.

A.2 Cryptographic Routines

This section shows implementations of some simple cryptographic algorithms mentioned in Chapter 3. Two implementations of the Extended Euclidean Algorithm, which is used to find inverses in Z_p, are also shown. Be aware that these implementations are intended to illustrate the principles involved and are definitely not meant to be used in real cryptographic software. As we've seen throughout the text, writing robust security software is difficult and requires special diligence to avoid potential exploits. Unfortunately, the details that such robustness requires would obscure the underlying principles the examples are intended to illustrate, so the examples favor simplicity over robustness.

A Trivial Cipher

This subsection shows a python implementation (Figure A.1) of the trivial cipher that we saw in Chapter 3. As mentioned there, this cipher is useful only as an example of what not to do.


 1 import sys
 2 def tohex( s ):
 3     return ':'.join( map( lambda x: '%02x' % ord( x ), s ) )
 4 key = sys.argv[ 1 ]
 5 msg = sys.argv[ 2 ]
 6 ct = ''
 7 class keystream:
 8     "Trivial cipher keystream generator"
 9     def __init__( self, key ):
10         self.key = key
11         self.keylen = len( self.key ) - 1
12         self.i = -1
13     def next( self ):
14         if self.i < self.keylen:
15             self.i += 1
16         else:
17             self.i = 0
18         return self.key[ self.i ]
19 ks = keystream( key )
20 for c in msg:
21     ct += chr( ord( c ) ^ ord( ks.next() ) )
22 print tohex( ct )

Figure A.1 An Implementation of the Trivial Cipher

 2–3   The tohex function uses the python join/map idiom to return a string of bytes as colon-delimited hex digits.
 4–5   These lines retrieve the key and message from the command line.
 6     We empty ct, the string that will hold the ciphertext.
 9–12  We initialize the key stream generator by capturing the length of the key and setting the index into the key to −1.
13–18  The next function returns the next byte of the key, that is, of the key stream, by incrementing the index and resetting it to 0 if it exceeds the key length.
19     This line constructs an instance, ks, of the key stream generator.
20–21  These lines take each byte of the message, exclusive-OR it with the next key stream byte, and append the result to the ciphertext string.
22     This line prints the encrypted message.
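Although the text does not show a sample run, the listing is easy to try. If it is saved as, say, trivial.py (the file name is an assumption; the book labels the listing simply trivial) and run under python 2, then

    python trivial.py mykey 'hello'

prints the five ciphertext bytes as colon-delimited hex. Because the cipher is nothing more than an XOR with a repeating key, identical plaintext characters that line up with identical key characters produce identical ciphertext bytes, which is one reason it is an example of what not to do.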

An RC4 Implementation

Figure A.2 shows a python implementation of RC4 written to act as a filter. Notice that it is only barely more complicated than the trivial cipher.

 1 import sys
 2 if len( sys.argv[ 1: ] ) != 1:
 3     print "Usage: rc4 key"
 4     sys.exit( 1 )
 5 key = sys.argv[ 1 ]
 6 class keystream:
 7     "RC4 key stream generator"
 8     def __init__( self, key ):
 9         self.lk = len( key )
10         self.j = 0
11         self.s = range( 256 )
12         for self.i in range( 256 ):
13             self.k = ord( key[ self.i % self.lk ] )
14             self.j = ( self.j + self.s[ self.i ] + self.k ) % 256
15             self.swap( self.i, self.j )
16         self.i = 0
17         self.j = 0
18     def swap( self, i, j ):
19         t = self.s[ i ]
20         self.s[ i ] = self.s[ j ]
21         self.s[ j ] = t
22     def next( self ):
23         self.i = ( self.i + 1 ) % 256
24         self.j = ( self.j + self.s[ self.i ] ) % 256
25         self.k = ( self.s[ self.i ] + self.s[ self.j ] ) % 256
26         self.swap( self.i, self.j )
27         return self.s[ self.k ]
28 ks = keystream( key )
29 for c in sys.stdin.read():
30     sys.stdout.write( chr( ord( c ) ^ ks.next() ) )

Figure A.2 An Implementation of RC4

 5     We obtain the key from the command line. Because this application functions as a filter, it will read the plaintext from STDIN and write the ciphertext to STDOUT.
 8–17  These lines initialize the RC4 state exactly as described in Chapter 3.
18–21  The swap function merely swaps Si and Sj.
22–27  The next function returns the next byte of the key stream. It follows the pseudocode from Chapter 3 very closely.
28     We instantiate an instance, ks, of the key stream generator.
29–30  These lines continuously read bytes from STDIN, exclusive-OR them with the next byte from the key stream generator, and write the result to STDOUT.
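Because RC4 encryption and decryption are the same XOR operation, the filter undoes itself when run a second time with the same key. If the listing is saved as rc4.py (again, the file name is an assumption; the book labels it simply rc4), a round trip looks like

    python rc4.py mykey < plaintext > ciphertext
    python rc4.py mykey < ciphertext > recovered

after which recovered is identical to plaintext.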

Finding Inverses in Z_p

Recall from Chapter 3 that the RSA algorithm requires us to find an inverse for x ∈ Z_p. That is, given x ∈ {0, 1, ..., p − 1}, where p is prime, find y such that xy mod p = 1. Because x and p are obviously relatively prime for prime p, their greatest common divisor is 1. That is, 1 is the largest integer that divides them both. We can use the Extended Euclidean Algorithm to find m and n such that mx + np = gcd(x, p) = 1. Note that this implies that mx mod p = 1.
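As a small worked example, take x = 3 and p = 7. The algorithm yields m = −2 and n = 1, since (−2) · 3 + 1 · 7 = 1; adding p to the negative m gives the inverse 5, and indeed 3 · 5 = 15 ≡ 1 (mod 7).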


We present two implementations of this algorithm: one using the UNIX bc utility and the other using python. Both implementations are based on Algorithm X from Section 4.5.2 of [Knuth 1998]. Our first implementation (Figure A.3) uses bc. Normally, bc is used interactively, but it can also accept input from a script. See Knuth for an explanation of this algorithm and a proof that it is correct.

print "Enter modulus:"
p = read()
print "Enter number to invert:"
a = read()
m = a       /* remember for output text */
n = p       /* remember for output text */
u = 1       /* u*n + v*m = p */
v = 0
w = 0       /* w*n + x*m = a */
x = 1
while ( p != 0 ) {
    q = a / p
    t = p
    p = a - q * p       /* a mod p */
    a = t
    t = w - q * u
    s = x - q * v
    w = u
    x = v
    u = t
    v = s
}
/* w*n + x*m = 1, so x is the mod n inverse of m */
if ( x < 0 ) x += n
print "Inverse of ", m, " mod ", n, " = ", x, "\n"
quit

Figure A.3 A bc-Based Implementation of the Extended Euclidean Algorithm

Figure A.4 gives a slightly more readable python-based implementation.

# Euclid's extended algorithm from Knuth Vol. 2 (Algorithm X in
# Section 4.5.2).  Finds u[ 0 ], u[ 1 ], and u[ 2 ] such that
# u[ 0 ] * a + u[ 1 ] * p = gcd( a, p ) = u[ 2 ].  Since a and p
# are relatively prime, gcd( a, p ) = 1, and thus u[ 0 ] is the
# mod p inverse of a.
#
import sys
if len( sys.argv[ 1: ] ) != 2:
    print "Usage:", sys.argv[ 0 ], "a p"
    sys.exit( 1 )
a = int( sys.argv[ 1 ] )
p = int( sys.argv[ 2 ] )
u = ( 1, 0, a )
v = ( 0, 1, p )
while v[ 2 ] != 0:
    q = u[ 2 ] / v[ 2 ]
    t = map( lambda x, y: x - q * y, u, v )
    u = v
    v = t
#
# Since u[ 0 ] * a + u[ 1 ] * p = u[ 2 ] = gcd( a, p ) = 1, u[ 0 ]
# is the inverse of a mod p
#
if u[ 0 ] < 0:
    inv = u[ 0 ] + p    # Make sure it's in the range [0...p-1]
else:
    inv = u[ 0 ]
#
# Make sure there *is* an inverse (if a and p are not relatively
# prime, there will not be).  Of course, Zp is not a field unless
# p is prime, so there will be an inverse in all cases of interest.
#
if u[ 2 ] != 1:
    print a, "and", p, "are not relatively prime"
else:
    print "The mod", p, "inverse of", a, "is", inv

Figure A.4 A python-Based Implementation of the Extended Euclidean Algorithm
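As a quick check of the listing (the invocation is hypothetical; the book labels the script simply euclid, so we assume here that it is saved as euclid.py), running

    python euclid.py 3 7

prints that the mod 7 inverse of 3 is 5, matching the worked example given earlier.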

A.3 Library Code

Some of the examples in the text used library functions from ETCP. The library code is publicly available, so rather than complete listings, short descriptions of the functions are given.

The error Function

The first function, error, is a generalized diagnostic routine. It can report errors or diagnostics and continue processing, or it can report a fatal error and terminate the process. The function writes its output to STDERR. The error prototype is:

    #include "etcp.h"

    void error( int status, int err, char *format, ... );

If status is 0, error will output the diagnostic and return. When status is nonzero, error will output the diagnostic and exit with a status of status. If err is nonzero, error treats it as an errno value and will append the error string associated with that value by calling strerror. The format parameter is a standard printf formatting string that error uses along with any following arguments to format the diagnostic.


The tcp_server and tcp_client Functions

The tcp_server and tcp_client functions establish a TCP connection. In the case of tcp_server, the function accepts a host and a port and returns a listening socket. In the case of tcp_client, the function also accepts a host and a port, and returns a socket connected with the specified peer. The prototype for tcp_server is:

    #include "etcp.h"

    SOCKET tcp_server( char *host, char *port );

    Returns: a listening socket (terminates on error)

The host parameter points to a string that tells tcp_server which interface it should listen on. The string can be an ASCII IP address, a host name, or NULL. If it is NULL, tcp_server will listen on all the host's interfaces. Similarly, the port parameter points to a string that tells tcp_server which port to listen on. It can be either an ASCII port number or a symbolic service name. The function returns a socket that is listening for connections on the specified interface and port. The caller is expected to use this socket as an input to the accept function. The prototype for tcp_client is:

    #include "etcp.h"

    SOCKET tcp_client( char *host, char *port );

    Returns: a connected socket (terminates on error)

The host parameter points to a string that contains the name or address of the host to connect to. It can be either an ASCII IP address or the name of the host. The port parameter points to a string that contains the port number or name that tcp_client should connect to. It can be either an ASCII port number or a symbolic service name. The function returns a connected socket or terminates.

Appendix B

Miscellaneous Software

B.1 Netcat

Some of the examples in the text use the netcat program (nc) as a client application. Netcat describes itself as a ‘‘TCP/IP Swiss army knife’’ because it is designed to handle a variety of common chores. Its name comes from the fact that it tries to act as much as possible like the UNIX cat utility but over a network. Thus, its basic purpose is to read and write data across a network using TCP or UDP. Although used in this text primarily as a telnet-like utility to connect to servers on the same or other machines, it is much more flexible. We can gain an appreciation for some of that flexibility by asking netcat to print its help information, as shown in Figure B.1.

As an example, we use netcat to listen on a port and send a file of data to the first user that connects:

    nc -l -p 6666 -q 3 < some_file

The -q 3 tells netcat to terminate 3 seconds after it reads an EOF on STDIN. The 3-second delay is to allow time for the last buffer of data to be sent. More generally, we can set up pipelines on each side of the connection to process data before and after it is sent. Here is an example, adapted from the netcat documentation, that transfers a directory from one machine to another, compressing it for transmission. On the receiving machine, we set up a pipeline that listens for the connection from the sending machine, receives the data, decompresses it, and stores it in a parallel directory:

    nc -l -p 6666 | gzip -dc | tar -xpf -


nc -h
[v1.10]
connect to somewhere:   nc [-options] hostname port[s] [ports] ...
listen for inbound:     nc -l -p port [-options] [hostname] [port]
options:
    -4            Use IPv4 (default)
    -6            Use IPv6
    -A algorithm  cast256, mars, saferp, twofish, or rijndael
    -k password   AES encrypt and ascii armor session
    -b            allow broadcasts
    -g gateway    source-routing hop point[s], up to 8
    -G num        source-routing pointer: 4, 8, 12, ...
    -h            this cruft
    -i secs       delay interval for lines sent, ports scanned
    -l            listen mode, for inbound connects
    -n            numeric-only IP addresses, no DNS
    -o file       hex dump of traffic
    -p port       local port number
    -r            randomize local and remote ports
    -q secs       quit after EOF on stdin and delay of secs
    -s addr       local source address
    -t            answer TELNET negotiation
    -u            UDP mode
    -v            verbose [use twice to be more verbose]
    -w secs       timeout for connects and final net reads
    -z            zero-I/O mode [used for scanning]
port numbers can be individual or ranges: lo-hi [inclusive]

Figure B.1 Netcat Command Line Options

On the sending side, we execute the pipeline:

    tar -cpf - directory_to_send | gzip -c | nc -w 3 remote_host 6666

The -w 3 allows time for the data to drain from the kernel buffers before terminating the connection. This method works whether or not the particular version of tar that we're using supports compression. More generally, we could pre- and postprocess the data in any way that we desire. The best discussion of netcat's capabilities and how to use them is in the README file that accompanies the distribution. Netcat can be obtained from the Freshmeat archive.

B.2 tcpdump and Other Packet Sniffers

Throughout the text, we use tcpdump to watch the on-the-wire behavior of the tunnels and VPNs that we study. It's worth taking a moment to specify how we use tcpdump to capture this data and how we reformat the data for better presentation in the text. To show the effect of the various tcpdump options, let's ping from linux to bsd and capture the results. If we run tcpdump without any options, we get


    16:17:37.684842 IP linux.jcs.local > bsd.jcs.local: icmp 64: echo request seq 1
    16:17:37.684948 IP bsd.jcs.local > linux.jcs.local: icmp 64: echo reply seq 1

This output is set in smaller type so that it will fit on one line and appear exactly as it does on a terminal screen. Often, it is more useful to see the IP addresses than the host names. Indeed, that is the way most of the captures in the text are run. To do this, we use the -n option with tcpdump. That gives us the result

    16:17:37.684842 IP 172.30.0.4 > 172.30.0.1: icmp 64: echo request seq 1
    16:17:37.684948 IP 172.30.0.1 > 172.30.0.4: icmp 64: echo reply seq 1

If we want to see exactly what’s in the packet instead of relying on tcpdump to decode it for us, we can specify the -x option, which gives us a hex dump of the packet, or the -X option, which gives us a hex and ASCII dump. Be aware that tcpdump captures only the first few bytes by default, so if we want to examine the entire packet contents, we should tell tcpdump to capture as much data as we’re interested in. The amount of data that tcpdump captures by default depends on the platform it’s running on. It’s usually 68 bytes, but some packet-capture interfaces support other lengths.

We can set the amount of data to capture with the -s option. Almost all the examples in the text invoke tcpdump as

    tcpdump -nXs 1500 filter_expression

where filter_expression specifies which packets tcpdump should capture. If we don't specify a filter, tcpdump will capture all packets. If we were interested only in packets from linux to bsd, we would specify a filter of src linux and dst bsd. The tcpdump filter language is surprisingly rich and therefore complicated. It is described in detail in the tcpdump man page. For our ping example, we are interested only in ICMP packets, so we specify a filter of icmp. We capture the results of our ping with

    tcpdump -nXs 1500 icmp

and get

    16:17:37.684842 IP 172.30.0.4 > 172.30.0.1: icmp 64: echo request seq 1
        0x0000:  4500 0054 0000 4000 4001 e267 ac1e 0004  E..T..@.@..g....
        0x0010:  ac1e 0001 0800 aab1 0e0b 0001 e189 5942  ..............YB
        0x0020:  0f73 0a00 0809 0a0b 0c0d 0e0f 1011 1213  .s..............
        0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
        0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
        0x0050:  3435 3637                                4567
    16:17:37.684948 IP 172.30.0.1 > 172.30.0.4: icmp 64: echo reply seq 1
        0x0000:  4500 0054 8c8b 4000 4001 55dc ac1e 0001  E..T..@.@.U.....
        0x0010:  ac1e 0004 0000 b2b1 0e0b 0001 e189 5942  ..............YB
        0x0020:  0f73 0a00 0809 0a0b 0c0d 0e0f 1011 1213  .s..............
        0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
        0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
        0x0050:  3435 3637                                4567

Notice that tcpdump does not number the lines the way we do in the text. Instead, it gives the offset (from the start of the packet) of the first byte of each line. Such output is usually the most useful when debugging a VPN by looking at its behavior on the wire. When using tcpdump to diagnose VPN and tunnel behavior, it is imperative to understand how and where it works. Figure B.2 shows a typical invocation of tcpdump.

[Figure B.2 Capturing Packets with tcpdump: tcpdump and an ordinary user program both run in user space; tcpdump captures packets through libpcap, which reads them from the kernel's BPF filter and buffer at the interface layer, below the IP and TCP/UDP processing, right next to the network.]

Notice that tcpdump uses the pcap library (libpcap) to filter and capture the packets. The pcap library provides general packet capture and filtering facilities and can be used by other programs that need to examine packets as seen by the interface layer. We’ll see an example of this when we discuss ssldump. Figure B.2 shows tcpdump using the Berkeley Packet Filter (BPF) mechanism to capture the packets. BPF is the kernel interface to the pcap library used in BSD-derived UNIX systems. Other operating systems have other kernel interfaces, but the concepts are the same.

The important thing to note in Figure B.2 is where the packet is captured. Pcap grabs a copy just after it's received from the network hardware (NIC) or just before it's sent to the NIC. Thus, we will see the packet before any processing by the TCP/IP stack, in the case of inbound packets, or after all such processing is complete, in the case of outbound packets. We are not counting the I/O processing that the interface layer uses to read from or write to the NIC, of course.

This can be a little confusing when pseudointerfaces, such as the tun or tun/tap interfaces, come into play, because the packets may have been partially processed before they get to such interfaces. If we remember that the tun interface, say, looks like a normal NIC interface driver to the rest of the stack, we can usually understand what tcpdump is showing us when we capture data at such interfaces.

Although we use tcpdump exclusively in the text, it's not the only possibility. Some UNIX systems have their own packet sniffers: Solaris has snoop, and IBM AIX has iptrace, for example. Nonetheless, tcpdump runs on Solaris, AIX, Windows, and virtually all other UNIX systems. Because it is so generally available, it pays to be familiar with it and its operation. For those who prefer a graphical user interface, the ethereal sniffer is an ideal solution. It too runs on virtually all UNIX and Windows systems, and it can read packet captures from several other sniffers, including tcpdump, etherpeek (Windows), snoop (Solaris), LANalyzer (Novell), iptrace (AIX), nettl (HP-UX), and other, less common formats as well.

Tip 34 of ETCP covers the use of tcpdump in greater detail. The latest versions of tcpdump and ethereal are available from their respective Web sites.

B.3 ssldump

Chapter 6 makes extensive use of Eric Rescorla's ssldump program. It's similar in concept and operation to tcpdump but is specialized to decode SSL packets. One of ssldump's most useful features is its ability to decrypt the SSL payloads when we have access to the SSL key file. This capability is useful only when we are working with our own SSL servers, of course, but that is usually exactly when we need it. Architecturally, ssldump is very similar to tcpdump, using the pcap library to capture packets from the interface, filtering them according to the filter expression, and using the OpenSSL library to decode them. See Chapter 6 for examples of the SSL output.

In our examples, we use three of ssldump's options extensively.

-A       Print all fields in the SSL record instead of just the most interesting ones.

-d       Display application data. This is also useful for seeing TCP traffic before the SSL session initiation.

-k file  Specify the location of the key file that can be used to decrypt the SSL payload data.


Here is a fragment of an ssldump output from Chapter 6:

    2 1  0.0011 (0.0011)  C>SV3.0(79)  Handshake
          ClientHello
            Version 3.0
            random[32]=
              3e 5e 6f db 15 04 57 00 03 5a a8 ae e9 21 e6 0e
              08 23 18 cc 5a 9c 4e bb 30 89 8b 6c 20 36 aa f9
            resume [32]=
              14 17 3a 80 5e 1d 50 9f 46 1c 38 12 b6 6c 18 59
              8b 00 f4 3d a1 1c 2f 22 2f e6 d8 3d 72 80 37 50
            cipher suites
              SSL_RSA_WITH_3DES_EDE_CBC_SHA
              SSL_RSA_WITH_IDEA_CBC_SHA
            compression methods
              NULL

The leading two numbers on the first line provide the connection and record numbers. Thus, this is SSL record 1 from connection 2. The next two numbers are the time since the beginning of the connection and the time since the last record. In this case, they are the same because this is the first record in the connection. The next field tells us that this packet is from the client to the server (C>S), that we are using SSL version 3.0, and that the record is 79 bytes long. The last field indicates the record type (see Figure 6.4). In this case, it's a Handshake message. Once the session enters encrypted mode (that is, after the CHANGE_CIPHER_SPEC message), this first line is all that will be visible unless we have the server key.

The rest of the output is a decode of the fields within the record. We see that this is a ClientHello message for a session resumption and that two cipher suites are being offered by the client.

It is sometimes convenient to be able to see the application data, and for this, ssldump offers the -x and -X options, just as tcpdump does. Here's an example, again from Chapter 6, of ssldump's output of application data:

    1 40  72.1196 (11.9504)  S>CV3.0(112)  application_data
        ---------------------------------------------------------------
        7e 21 45 00 00 54 72 7b 40 00 40 01 82 80 0a 00   ~!E..Tr{@.@.....
        00 04 c0 a8 7b 01 08 00 56 21 63 11 01 00 3c 5a   ....{...V!c...<Z
        12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21   .............. !
        22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31   "#$%&'()*+,-./01
        32 33 34 35 36 37 92 51 7e                        234567.Q~
        ---------------------------------------------------------------

Notice that this is data from the server to the client and that the record type is APPLICATION_DATA. Further information about ssldump and the SSL/TLS protocol is available in Rescorla's excellent text SSL and TLS [Rescorla 2001]. The ssldump source code distribution and information about the book are available from Rescorla's Web site.

B.4 PPP

Most PPP implementations fall into one of two categories: They are implemented either in the kernel just below the IP layer in the TCP/IP stack or in user space, using a tunnel driver, such as the FreeBSD tun driver, to communicate with the TCP/IP stack. Our test systems include examples of both methods.

The Linux and Solaris operating systems use an in-kernel PPP implementation. If we think of PPP as another layer in the TCP/IP stack, then the top of the PPP layer talks to the IP layer, and the bottom of the PPP layer talks, by default, to the serial port device driver. Actually, Linux uses a hybrid method, in which the encapsulation is done in the kernel, but the LCP/NCP processing is done by the user-space pppd program. PPP packets containing IP datagrams are handled in the kernel and do not go through pppd.

It is often convenient to have the bottom of the PPP layer communicate with the STDIN and STDOUT interfaces of a user-space process. This is usually accomplished by having PPP connect to a pseudo-tty device [Stevens 1992] rather than an actual serial device. Because we are interested primarily in using PPP to form tunnels, this is the mode that we usually use, and we see examples of this use in the text. The help screen from pppd, shown in Figure B.3, lists the most important options and gives us an idea of how to use it.

pppd -h
pppd version 2.4.1
Usage: pppd [ options ], where options are:
    <device>        Communicate over the named device
    <speed>         Set the baud rate to <speed>
    <loc>:<rem>     Set the local and/or remote interface IP
                    addresses.  Either one may be omitted.
    asyncmap <n>    Set the desired async map to hex <n>
    auth            Require authentication from peer
    connect <p>     Invoke shell command <p>
                    to set up the serial line
    crtscts         Use hardware RTS/CTS flow control
    defaultroute    Add default route through interface
    file <f>        Take options from file <f>
    modem           Use modem control lines
    mru <n>         Set MRU value to <n> for negotiation
See pppd(8) for more options.

Figure B.3 Common pppd Options

In addition to the options listed in the help screen, we use the notty option, which tells pppd to open a pseudo-tty so that we can intercept the PPP frames and encapsulate them in our tunnels. These are only a few of the options; the typeset man page for pppd is more than 20 pages.

Several additional scripts can help control the PPP link. After authentication succeeds, pppd will check for the existence of the /etc/ppp/auth-up script. If it exists, pppd will execute it. If the auth-up script is executed, /etc/ppp/auth-down will be executed, if it exists, when the link is torn down. Similar scripts are run when an NCP starts and stops. For example, /etc/ppp/ip-up is run when the link is ready to send or receive IP datagrams, and /etc/ppp/ip-down is run when the link is no longer able to send or receive IP datagrams. An excellent discussion of the internals of pppd and how the kernel and user-space portions of the PPP implementation communicate is given in PPP Design and Debugging [Carlson 2000]. As mentioned in Chapter 2, Carlson's book also covers the specifics of PPP and explains the functioning of its state machines in detail.

FreeBSD's PPP implementation is an example of the second method, in which all the PPP processing takes place in a user-space program. In FreeBSD, this program is called ppp. Figure B.4 shows how ppp communicates with the TCP/IP stack. Note the similarity to our gtunnel implementation. The figure shows ppp communicating with a serial driver for its external I/O, but we most often set it to communicate either through the TCP/IP stack (when we encapsulate PPP frames in TCP segments or UDP datagrams) or through its STDIN and STDOUT, as we do in Chapter 6, where we want it to communicate through another program.

[Figure B.4 The FreeBSD PPP Architecture: the user-space ppp program sits alongside the user application; in the kernel, it talks to the serial driver for its external I/O and exchanges packets with the TCP/IP stack through the tunnel driver.]

Because ppp is so flexible and has so many options (the typeset man page is almost 50 pages), almost all the configuration is done by a configuration file. The command line interface merely selects a mode, a ‘‘system,’’ and a few options, as we see from the help screen:

    bsd# ppp -h
    usage: ppp [-auto | -foreground | -background | -direct | -dedicated |
                -ddial | -interactive] [-nat] [-quiet] [-unit N] [system ...]

The first set of choices is the mode, which selects the manner in which ppp will operate.


The modes are:

auto         ppp opens the connection to the TCP/IP stack and becomes a daemon but takes no further action until it sees an IP packet, at which point it will attempt to contact the remote system.

foreground   ppp attempts to establish a connection with the remote system immediately but does not become a daemon.

background   ppp attempts to establish a connection with the remote system immediately. If ppp succeeds, it becomes a daemon; otherwise, it exits with a failure status.

direct       This mode causes ppp to listen for connections rather than trying to establish a connection to the remote system.

dedicated    This mode indicates that ppp has a dedicated connection to the remote system, and no chat script is necessary to establish communications.

ddial        This mode is like the auto mode, but will try to reestablish the connection if it is dropped.

interactive  This mode allows a user to control its operation interactively.

The -nat option enables ppp's internal NAT. The -quiet option inhibits ppp's output to the console when it starts. The -unit N option allows the user to specify which tun interface ppp should use. Without this option, ppp will use the first available tun device.

The configuration of PPP connections is specified in the /etc/ppp/ppp.conf file. The ppp.conf file can contain configurations for several PPP connections. These configurations are separated into named sections in the file, and it is one or more of these that the user specifies for ‘‘system ...’’ when calling ppp. Chapter 6 has an example of a ppp.conf file when ppp is used in an SSL tunnel. Figure B.5 is another example, showing the configuration for a normal ppp session over a serial line.

# Default setup.  Always executed when PPP is invoked.
#
default:
 allow user jru
 set device /dev/cuaa0
 set speed 38400
 disable lqr
 deny lqr
 set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 5 \"\" \
   ATE1Q0 OK-AT-OK \\dATDT\\T TIMEOUT 40 CONNECT"
#
# ISP Account
#
myisp:
 set phone 5551234
 set login
 set timeout 3600
 deny pap
 enable chap
 set authname jrandomuser
 set authkey gd%#B2aK.9
 set ifaddr 0 0
 add default HISADDR
#
# Work RAS
#
work:
 set speed 9600
 set phone 18005558765
 set login "TIMEOUT 5 login:-\\r-login: pppuser word: pppuser"
 set timeout 3600
 deny pred1
 disable pred1
 set authname juser
 set authkey dFx2%
 set ifaddr 2.2.2.2 0
 delete ALL
 add default HISADDR
 set openmode active 3

Figure B.5 A Typical ppp.conf File

Most of the options in the configuration file are self-explanatory. Because the options are discussed in detail in the ppp man page and because there are so many of them, we won’t discuss them further. In addition to reading the ppp.conf file, ppp checks for the existence of a /etc/ppp/ppp.linkup file when a link is established. If the file exists, its contents are executed. It can set ppp parameters or run arbitrary scripts. A common use is to set a route, as we do in Chapter 6. Similarly, the /etc/ppp/ppp.linkdown script is checked when the link is torn down.

Bibliography

Aboba, B., Blunk, L. J., Vollbrecht, J. R., Carlson, J., and Levkowetz, H. 2004. ‘‘Extensible Authentication Protocol (EAP),’’ RFC 3748 (June). Aboba, B. and Dixon, W. 2004. ‘‘IPsec-NAT Compatibility Requirements,’’ RFC 3715 (Mar.). Agarwal, P. and Akyol, B. 2003. ‘‘Time to Live (TTL) Processing in Multi-Protocol Label Switching (MPLS),’’ RFC 3443 (Jan.). Alvestrand, H. T. 2001. ‘‘Tags for the Identification of Languages,’’ RFC 3066 (Jan.). Anderson, R. 1993. ‘‘The Classification of Hash Functions,’’ Proceedings of the Fourth IMA Conference on Cryptography and Coding, pp. 83–93. http://www.ftp.cl.cam.ac.uk/ftp/users/rja14/hash.ps.Z

Andersson, L., Doolan, P., Feldman, N., Fredette, A., and Thomas, B. 2001. ‘‘LDP Specification,’’ RFC 3036 (Jan.). Barrett, D. J. and Silverman, R. E. 2002. SSH, The Secure Shell: The Definitive Guide. O’Reilly & Associates, Sebastopol, Calf. Baugher, M., Weis, B., Hardjono, T., and Harney, H. 2003. ‘‘The Group Domain of Interpretation,’’ RFC 3547 (July). Bellare, M., Canetti, R., and Krawczyk, H. 1996. ‘‘Keyed Hash Functions and Message Authentication,’’ Advances in Cryptology–CRYPTO ’96, Lecture Notes in Computer Science, no. 1109, pp. 1–15, Springer-Verlag. http://www.research.ibm.com/security/keyed-md5.html

Bellare, M., Kohno, T., and Namprempre, C. 2002. ‘‘Authenticated Encryption in SSH: Provably Fixing the SSH Binary Packet Protocol,’’ Proceedings of the 9th ACM Conference on Computer and Communications Security, pp. 1–11 (Nov.). http://www.cs.ucsd.edu/users/tkohno/papers/SSH/

Bellovin, S. M. 1996. ‘‘Problem Areas for the IP Security Protocols,’’ Proceedings of the 6th USENIX Security Symposium, pp. 1–16 (July). http://www.research.att.com/˜smb/papers/badesp.ps


Bellovin, S. M. 1997. ‘‘Probable Plaintext Cryptanalysis of the IP Security Protocols,’’ Proceedings of the Symposium on Network and Distributed System Security, pp. 155–160 (Feb.). http://www.research.att.com/˜smb/papers/probtxt.ps

Bleichenbacher, D. 1998. ‘‘Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS#1,’’ Advances in Cryptology–CRYPTO ’98, Lecture Notes in Computer Science, no. 1462, pp. 1–12, Springer-Verlag. http://www.bell-labs.com/user/bleichen/papers/pkcs.ps.gz

Bleichenbacher, D., Kalisky, B., and Staddon, J. 1998. ‘‘Recent Results on PKCS #1: RSA Encryption Standard,’’ Bulletin 7, RSA Laboratories (June). ftp://ftp.rsasecurity.com/pub/pdfs/bulletn7.pdf

Braden, R. T., Borman, D., and Partridge, C. 1988. ‘‘Computing the Internet Checksum,’’ RFC 1071 (Sept.). Brumley, D. and Boneh, D. 2003. ‘‘Remote Timing Attacks Are Practical,’’ Proceedings of the 12th USENIX Security Symposium, pp. 1–14 (Aug.). http://crypto.stanford.edu/˜dabo/abstracts/ssl-timing.html

Canvel, B., Hiltgen, A., Vaudenay, S., and Vuagnoux, M. 2003. ‘‘Password Interception in a SSL/TLS Channel,’’ Advances in Cryptology–CRYPTO ’03, Lecture Notes in Computer Science, no. 2729, pp. 583–599, Springer-Verlag. http://lasecwww.epfl.ch/pub/lasec/doc/CHVV03.ps

Carlson, J. 2000. PPP Design, Implementation, and Debugging, Second Edition. Addison-Wesley, Boston, Mass. Comer, D. E. 2000. Internetworking with TCP/IP Volume I: Principles, Protocols, and Architecture, Fourth Edition. Prentice Hall, Englewood Cliffs, N.J. Dai, W. 2002. ‘‘An Attack Against SSH2 Protocol,’’ Message-ID , IETF Secsh Working Group Email List (Feb.). ftp://ftp.ietf.org/ietf-mail-archive/secsh/2002-02.mail

Davis, C. R. 2001. IPSec: Securing VPNs. McGraw-Hill, Berkeley, Calif. Dawson, E. and Nielsen, L. 1996. ‘‘Automated Cryptanalysis of XOR Plaintext Strings,’’ Cryptologia, vol. 20, no. 2, pp. 165–181 (Apr.). Deering, S. E. and Hinden, R. M. 1998. ‘‘Internet Protocol, Version 6 (IPv6) Specification,’’ RFC 2460 (Dec.). Dierks, T. and Allen, C. 1999. ‘‘The TLS Protocol: Version 1.0,’’ RFC 2246 (Jan.). Diffie, W. and Hellman, M. 1976. ‘‘New Directions in Cryptography,’’ IEEE Transactions on Information Theory, vol. IT-22, no. 6, pp. 644–654 (Nov.). Dijkstra, E. W. 1959. ‘‘A Note on Two Problems in Connection with Graphs,’’ Numerische Mathematic, vol. 1, pp. 269–271. Dommety, G. 2000. ‘‘Key and Sequence Number Extensions to GRE,’’ RFC 2890 (Sept.). Doraswamy, N. and Harkins, D. 1999. IPSec, The New Security Standard of the Internet, Intranets, and Virtual Private Networks. Prentice Hall PTR, Upper Saddle River, N.J. Dreyfus, S. 1997. Underground: Tales of Hacking, Madness, and Obsession on the Electronic Frontier. Mandarin, Kew, Australia. http://onlinebooks.library.upenn.edu/webbin/gutbook/lookup?num=4686


Dworkin, M. 2001. ‘‘Recommendation for Block Cipher Modes of Operation—Methods and Techniques,’’ NIST Special Publication 800-38a, National Institute of Standards and Technology (Dec.). http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf

Eastlake, D. E., 3rd, Crocker, S. D., and Schiller, J. I. 1994. ‘‘Randomness Recommendations for Security,’’ RFC 1750 (Dec.). Egevang, K. B. and Francis, P. 1994. ‘‘The IP Network Address Translator (NAT),’’ RFC 1631 (May). Electronic Frontier Foundation 1998. Cracking DES. O’Reilly & Associates, Sebastopol, Calif. ElGamal, T. 1985. ‘‘A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms,’’ IEEE Transactions on Information Theory, vol. IT-31, no. 4, pp. 469–472 (July). Etienne, J. 2001. ‘‘Security Analysis of VTun,’’ white paper (Dec.). http://www.netsys.com/library/papers/vtun_secu.pdf

Farinacci, D., Li, T., Hanks, S., Meyer, D., and Traina, P. 2000. ‘‘Generic Routing Encapsulation (GRE),’’ RFC 2784 (Mar.). Faucheur, F. L., Wu, L., Davie, B., Davari, S., Vaananen, P., Krishnan, R., Cheval, P., and Heinanen, J. 2002. ‘‘Multi-Protocol Label Switching (MPLS) Support of Differentiated Services,’’ RFC 3270 (May). Ferguson, N. and Schneier, B. 1999. ‘‘A Cryptographic Evaluation of IPsec,’’ white paper, Counterpane Internet Security. http://www.counterpane.com/ipsec.pdf

Ferguson, N. and Schneier, B. 2003. Practical Cryptography. John Wiley & Sons, N.Y. Fluhrer, S., Mantin, I., and Shamir, A. 2001. ‘‘Weaknesses in the Key Scheduling Algorithm of RC4,’’ Lecture Notes in Computer Science, vol. 2259, pp. 1–24, Springer-Verlag. http://www.crypto.com/papers/others/rc4_ksaproc.ps

Ford, L. R. and Fulkerson, D. R. 1962. Flows in Networks. Princeton University Press, Princeton, N.J. Freier, A. O., Karlton, P., and Kocker, P. C. 1996. ‘‘The SSL Protocol: Version 3.0,’’ RFC draftfreier-ssl-version3-02 (Nov.). http://wp.netscape.com/eng/ssl3/draft302.txt

Fuller, V., Li, T., Yu, J., and Varadhan, K. 1993. ‘‘Classless Inter-Domain Routing (CIDR): An Address Assignment,’’ RFC 1519 (Sept.). Futoransky, A., Kargieman, E., and Pacetti, A. M. 1998. ‘‘An Attack on CRC-32 Integrity Checks of Encrypted Channels Using CBC and CFB Modes,’’ white paper, Core Security Technologies (Oct.). http://tinyurl.com/27ddk

Garman, J. 2003. Kerberos: The Definitive Guide. O’Reilly & Associates, Sebastopol, Calf. Goldberg, I. and Wagner, D. 1996. ‘‘Randomness and the Netscape Browser,’’ Dr. Dobb’s Journal (Jan.). http://www.ddj.com/documents/s=965/ddj9601h/9601h.htm

Gutmann, P. 2000. ‘‘X.509 Style Guide,’’ white paper (Oct.). http://www.cs.auckland.ac.nz/˜pgut001/pubs/x509guide.txt


Gutmann, P. 2003a. ‘‘Everything You Never Wanted to Know About PKI but Have Been Forced to Find Out,’’ white paper. http://www.cs.auckland.ac.nz/˜pgut001/pubs/pkitutorial.pdf

Gutmann, P. 2003b. ‘‘Linux’s Answer to MS-PPTP,’’ white paper (Sept.). http://www.cs.auckland.ac.nz/˜pgut001/pubs/linux_vpn.txt

Gutmann, P. 2005. Private communication. Haller, N. M. 1994. ‘‘The S/KEY One-time Password System,’’ Proceedings of the ISOC Symposium on Network and Distributed System Security (Feb.), San Diego, Calif. http://tinyurl.com/38vur

Haller, N. M. 1995. ‘‘The S/KEY One-Time Password System,’’ RFC 1760 (Feb.). Hamzeh, K., Pall, G. S., Verthein, W., Taarud, J., Little, W. A., and Zorn, G. 1999. ‘‘Point-to-Point Tunneling Protocol (PPTP),’’ RFC 2637 (July). Hanks, S., Li, T., Farinacci, D., and Traina, P. 1994. ‘‘Generic Routing Encapsulation (GRE),’’ RFC 1701 (Oct.). Harkins, D. and Carrel, D. 1998. ‘‘The Internet Key Exchange (IKE),’’ RFC 2409 (Nov.). Hedrick, C. 1988. ‘‘Routing Information Protocol,’’ RFC 1058 (June). Herzog, J. 1999. ‘‘A Suggested Improvement to SSHv2,’’ WN99B000041, The MITRE Corporation. http://www.mitre.org/work/tech_papers/tech_papers_00/herzog_improvement/herzog_improvement.pdf

Hickman, K. E.B. 1995. ‘‘The SSL Protocol,’’ Unpublished RFC draft (Feb.). http://wp.netscape.com/eng/security/SSL_2.html

Hinden, R. and Deering, S. 2003. ‘‘Internet Protocol Version 6 (IPv6) Addressing Architecture,’’ RFC 3513 (Apr.). Hinden, R. M. 1993. ‘‘Applicability Statement for the Implementation of Classless Inter-Domain Routing (CIDR),’’ RFC 1517 (Sept.). Hollenbeck, S. 2004. ‘‘Transport Layer Security Protocol Compression Methods,’’ RFC 3749 (May). Housley, R., Polk, T., Ford, W., and Solo, D. 2002. ‘‘Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile,’’ RFC 3280 (Apr.). Huitema, C. 2000. Routing in the Internet, Second Edition. Prentice Hall, Upper Saddle River, N.J. Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and Stenberg, M. 2005. ‘‘UDP Encapsulation of IPsec Packets,’’ RFC 3948 (Jan.). International Standards Organization 1984. ‘‘OSI—Basic Reference Model,’’ ISO 7498, International Standards Organization, Geneva. International Telecommunication Union 2000. Recommendation X.509-The Directory: Public-key and Attribute Certificate Frameworks. International Telecommunication Union, Geneva http://www.itu.int/rec/recommendation.asp?type=items&lang=e&parent=TREC-X.509-200003-I

Ioannidis, J. and Blaze, M. 1993a. ‘‘Architecture and Implementation of Network-Layer Security under Unix,’’ Proceedings of the USENIX Security Workshop (Oct.). http://www.crypto.com/papers/swipeusenix.ps

Ioannidis, J. and Blaze, M. 1993b. ‘‘The swIPe IP Security Protocol,’’ Internet Draft (Dec.). http://www.crypto.com/papers/swipe.id


ISO 1992. ‘‘Network Layer Security Protocol,’’ ISO/IEC DIS-11577, International Standards Organization (Nov.). Jacobson, V. 1990. ‘‘Compressing TCP/IP Headers for Low-Speed Serial Links,’’ RFC 1144 (Feb.). Johnson, D. B., Perkins, C. E., and Arkko, J. 2004. ‘‘Mobility Support for IPv6,’’ RFC 3775 (June). Kaliski, B. and Robshaw, M. 1995. ‘‘The Secure Use of RSA,’’ CryptoBytes, vol. 1, no. 3, pp. 7–13 (Autumn), RSA Laboratories. ftp://ftp.rsasecurity.com/pub/cryptobytes/crypto1n3.pdf

Kaufman, C., Perlman, R., and Speciner, M. 2002. Network Security: Private Communications in a Public World, Second Edition. Prentice Hall PTR, Upper Saddle River, N.J. Kent, S. and Atkinson, R. 1998a. ‘‘IP Authentication Header,’’ RFC 2402 (Nov.). Kent, S. and Atkinson, R. 1998b. ‘‘IP Encapsulating Security Payload,’’ RFC 2406 (Nov.). Kent, S. and Atkinson, R. 1998c. ‘‘Security Architecture for the Internet Protocol,’’ RFC 2401 (Nov.). Kivinen, T., Swander, B., Huttunen, A., and Volpe, V. 2005. ‘‘Negotiation of NAT-Traversal in the IKE,’’ RFC 3947 (Jan.). Klima, V. 2005. ‘‘Finding MD5 Collisions on a Notebook Using Multi-Message Modifications,’’ Preprint (Mar.). http://eprint.iacr.org/2005/102

Knuth, D. E. 1998. The Art of Computer Programming, Volume 2, Seminumerical Algorithms, Third Edition. Addison-Wesley, Reading, Mass. Kolesnikov, O. and Hatch, B. 2002. Building Linux Virtual Private Networks (VPNs). New Riders Publishing, Indianapolis, Ind. Krawczyk, H. 1996. ‘‘SKEME: A Versatile Secure Key Exchange Mechanism for Internet,’’ in Proceedings of the 1996 Symposium on Network and Distributed System Security (SNDSS ’96), pp. 114. IEEE Computer Society http://www.research.ibm.com/security/skeme.ps

Krawczyk, H. 2001. ‘‘The Order of Encryption and Authentication for Protecting Communications (Or: How Secure Is SSL?),’’ Proceedings on Crypto ’01, Lecture Notes in Computer Science, no. 2139, pp. 310–331 (Aug.), Springer-Verlag. http://eprint.iacr.org/2001/045.ps

Krawczyk, H., Bellare, M., and Canetti, R. 1997. ‘‘HMAC: Keyed-Hashing for Message Authentication,’’ RFC 2104 (Feb.). Lenstra, A., Wang, X., and Wegner, B. de 2005. ‘‘Colliding X.509 Certificates,’’ Preprint (Mar.). http://eprint.iacr.org/2005/067

Lipmaa, H., Rogaway, P., and Wagner, D. 2000. ‘‘Comments to NIST Concerning AES Modes of Operations: CTR-Mode Encryption,’’ First NIST Workshop on Modes of Operation for Symmetric Key Block Ciphers (Oct.). http://csrc.nist.gov/CryptoToolkit/modes/workshop1/papers/lipmaactr.pdf

Madson, C. and Doraswamy, N. 1998. ‘‘The ESP DES-CBC Cipher Algorithm with Explicit IV,’’ RFC 2405 (Nov.).

Madson, C. and Glenn, R. 1998a. ‘‘The Use of HMAC-MD5-96 Within ESP and AH,’’ RFC 2403 (Nov.).

Madson, C. and Glenn, R. 1998b. ‘‘The Use of HMAC-SHA-1-96 Within ESP and AH,’’ RFC 2404 (Nov.).


Malkin, G. S. 1994. ‘‘RIP Version 2: Carrying Additional Information,’’ RFC 1723 (Nov.). Mamakos, L., Lidl, K., Evarts, J., Carrel, D., Simone, D., and Wheeler, R. 1999. ‘‘A Method for Transmitting PPP over Ethernet (PPPoE),’’ RFC 2516 (Feb.). Maughan, D., Schertler, M., Schneider, M., and Turner, J. 1998. ‘‘Internet Security Association and Key Management Protocol (ISAKMP),’’ RFC 2408 (Nov.). Menezes, A. J., Oorschot, P. C. van, and Vanstone, S. A. 1996. Handbook of Applied Cryptography. CRC Press, Boca Raton, Fla. Merkle, R. 1978. ‘‘Secure Communications Over Insecure Channels,’’ Communications of the ACM, vol. 21, no. 4, pp. 294–299. Messmer, E. 2000. ‘‘Win 2000 VPN Technology Causes Stir,’’ Network World (Jan. 10). http://www.nwfusion.com/news/2000/0110vpn.html

Mister, S. and Tavares, S. E. 1999. ‘‘Cryptanalysis of RC4-like Ciphers,’’ Lecture Notes in Computer Science, vol. 1556, pp. 131–143, Springer-Verlag. http://www.cs.columbia.edu/˜dcook/candexam/Y_23_rc4_cryptana.pdf

Mogul, J. and Deering, S. 1990. ‘‘Path MTU Discovery,’’ RFC 1191 (Nov.). Mogul, J. and Postel, J. B. 1985. ‘‘Internet Standard Subnetting Procedure,’’ RFC 950 (Aug.). Moy, J. T. 1998a. ‘‘OSPF Version 2,’’ RFC 2328 (Apr.). Moy, J. T. 1998b. OSPF: Anatomy of an Internet Routing Protocol. Addison-Wesley, Reading, Mass. Narten, T., Nordmark, E., and Simpson, W. A. 1998. ‘‘Neighbor Discovery for IP Version 6 (IPv6),’’ RFC 2461 (Dec.). NIST 1990. ‘‘Secure Data Network System (SDNS) Network, Transport, and Message Security Protocols,’’ NISTIR 90-4250, National Institute of Standards and Technology (Feb.). NIST 1999. ‘‘Data Encryption Standard (DES),’’ FIPS PUB 46-3, National Institutes of Standards and Technology (Oct.). http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf

NIST 2001. ‘‘Digital Signature Standard (DSS),’’ FIPS PUB 186-2 (+Change Notice), National Institute of Standards and Technology (Oct.). http://csrc.nist.gov/publications/fips/fips186-2/fips186-2-change1.pdf

NIST 2002a. ‘‘Advanced Encryption Standard (AES),’’ FIPS PUB 197, National Institute of Standards and Technology (Nov.). http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf

NIST 2002b. ‘‘Secure Hash Standard,’’ FIPS PUB 180-2, National Institute of Standards and Technology (Aug.). http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf

Orman, H. K. 1998. ‘‘The OAKLEY Key Determination Protocol,’’ RFC 2412 (Nov.). Pall, G. S. and Zorn, G. 2001. ‘‘Microsoft Point-to-Point Encryption (MPPE) Protocol,’’ RFC 2118 (Mar.). Patel, B. V., Aboba, B., Dixon, W., Zorn, G., and Booth, S. 2001. ‘‘Securing L2TP using IPsec,’’ RFC 3193 (Nov.). Pepelnjak, I. and Guichard, J. 2001. MPLS and VPN Architectures. Cisco Press, Indianapolis, Ind. Pereira, R. and Adams, R. 1998. ‘‘The ESP CBC-Mode Cipher Algorithms,’’ RFC 2451 (Nov.). Perkins, C. 1996. ‘‘IP Encapsulation within IP,’’ RFC 2003 (Oct.). Perlman, R. 2000. Interconnecions: Bridges, Routers, Switches, and Internetworking Protocols, Second Edition. Addison-Wesley, Reading, Mass.


Piper, D. 1998. ‘‘The Internet IP Security Domain of Interpretation for ISAKMP,’’ RFC 2407 (Nov.). Plummer, D. C. 1982. ‘‘An Ethernet Address Resolution Protocol,’’ RFC 826 (Nov.). Postel, J. B. 1980. ‘‘User Datagram Protocol,’’ RFC 768 (Aug.). Postel, J. B. 1981. ‘‘Internet Control Message Protocol,’’ RFC 777 (Apr.). Postel, J. B., ed. 1981a. ‘‘Internet Protocol,’’ RFC 791 (Sept.). Postel, J. B., ed. 1981b. ‘‘Transmission Control Protocol,’’ RFC 793 (Sept.). Preneel, B. and Oorschot, P. C. van 1995. ‘‘MDx-MAC and Building Fast MACs from Hash Functions,’’ Advances in Cryptology–CRYPTO ’95, Lecture Notes in Computer Science, no. 963, pp. 1–14, Springer-Verlag. http://www.scs.carleton.ca/˜paulv/papers/Crypto95.ps

Preneel, B. and Oorschot, P. C. van 1996. ‘‘On the Security of Two MAC Algorithms,’’ Advances in Cryptology–EUROCRYPT ’96, Lecture Notes in Computer Science, no. 1070, pp. 19–32, Springer-Verlag. http://www.scs.carleton.ca/˜paulv/papers/Euro96-2MACs.ps

Provan, D. 1991. ‘‘Tunneling IPX Traffic through IP Networks,’’ RFC 1234 (June). Rekhter, Y. and Gross, P. 1995. ‘‘Application of the Border Gateway Protocol in the Internet,’’ RFC 1772 (Mar.). Rekhter, Y. and Li, T. 1993. ‘‘An Architecture for IP Address Allocation with CIDR,’’ RFC 1518 (Sept.). Rekhter, Y. and Li, T. 1995. ‘‘A Border Gateway Protocol 4 (BGP-4),’’ RFC 1771 (Mar.). Rekhter, Y., Moskowitz, R. G., Karrenberg, D., Groot, G. J. de, and Lear, E. 1996. ‘‘Address Allocation of Private Internets,’’ RFC 1918 (Feb.). Rekhter, Y. and Rosen, E. 2001. ‘‘Carrying Label Information in BGP-4,’’ RFC 3107 (May). Rescorla, E. 2001. SSL and TLS: Designing and Building Secure Systems. Addison-Wesley, Boston, Mass. Reynolds, J. K. and Postel, J. B. 1985. ‘‘File Transfer Protocol (FTP),’’ RFC 959 (Oct.). Rivest, R. 1992a. ‘‘The MD4 Message-Digest Algorithm,’’ RFC 1320 (Apr.). Rivest, R. 1992b. ‘‘The MD5 Message-Digest Algoritm,’’ RFC 1321 (Apr.). Rivest, R. L., Shamir, A., and Adleman, L. M. 1978. ‘‘A Method for Obtaining Digital Signatures and Public-Key Cryptosystems,’’ Communications of the ACM, vol. 21, no. 2, pp. 120–126 (Feb.). Robshaw, M. J. B. 1995. ‘‘Stream Ciphers,’’ Technical Report TR-701, Version 2.0, RSA Laboratories (July). ftp://ftp.rsasecurity.com/pub/pdfs/tr701.pdf

Romkey, J. L. 1988. ‘‘A Nonstandard for Transmission of IP Datagrams Over Serial Lines: SLIP,’’ RFC 1055 (June). Roos, A. 1995. ‘‘A Class of Weak Keys in the RC4 Stream Cipher (Preliminary Draft),’’ MessageID: , Usenet, Sci.Crypt.Research (Sept.). http://groups.google.com/groups?selm=43vf2e%24sr8%40net.auckland.ac.nz&oe=UTF-8&output=gplain

Rosen, E. C. and Rekhter, Y. 1999. ‘‘BGP/MPLS VPNs,’’ RFC 2547 (Mar.). Rosen, E. C., Tappan, D., Fedorkow, G., Rekhter, Y., Farinacci, D., Li, T., and Conta, A. 2001. ‘‘MPLS Label Stack Encoding,’’ RFC 3032 (Jan.). Rosen, E. C., Viswanathan, A., and Callon, R. 2001. ‘‘Multiprotocol Label Switching Architecture,’’ RFC 3031 (Jan.).


RSA Laboratories 2002. PKCS #1 v2.1: RSA Cryptography Standard. RSA Laboratories ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf

Schneier, B. 1996. Applied Cryptography Second Edition: Protocols, Algorithms, and Source Code in C. John Wiley & Sons, N.Y. Schneier, B. 2003. ‘‘SSL Flaw,’’ Crypto-Gram Newsletter, Counterpane Internet Security, Inc. (Mar.). http://www.schneier.com/crypto-gram-0303.html

Schneier, B. and Mudge 1998. ‘‘Cryptanalysis of Microsoft’s Point-to-Point Tunneling Protocol (PPTP),’’ Proceeding of the 5th ACM Conference on Communications and Computer Security, pp. 132–141, ACM Press. http://www.counterpane.com/pptp-paper.html

Schneier, B., Mudge, and Wagner, D. 1999. ‘‘Cryptanalysis of Microsoft’s PPTP Authentication Extensions (MS-CHAPv2),’’ white paper. http://www.counterpane.com/pptpv2-paper.html

Shea, R. 2000. L2TP Implementation and Operation. Addison-Wesley, Reading, Mass. Simpson, W. A. 1996. ‘‘PPP Challenge Handshake Authentication Protocol (CHAP),’’ RFC 1994 (Aug.). Snader, J. C. 2000. Effective TCP/IP Programming. Addison-Wesley, Boston, Mass. Song, D. X., Wagner, D., and Tian, X. 2001. ‘‘Timing Analysis of Keystrokes and Timing Attacks on SSH,’’ 10th USENIX Security Symposium (Aug.). http://www.usenix.org/events/sec01/full_papers/song/song.pdf

Srisuresh, P. and Holdrege, M. 1999. ‘‘IP Network Address Translator (NAT) Terminology and Consideration,’’ RFC 2663 (Aug.). Stevens, W. R. 1992. Advanced Programming in the UNIX Environment. Addison-Wesley Pub. Co., Reading, Mass. Stevens, W. R. 1994. TCP/IP Illustrated, Volume 1: The Protocols. Addison-Wesley Pub. Co., Reading, Mass. Stevens, W. R. 1996. TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols. Addison-Wesley Pub. Co., Reading, Mass. Stevens, W. R. 1998. UNIX Network Programming, Volume 1, Second Edition, Networking APIs: Sockets and XTI. Prentice Hall PTR, Upper Saddle River, N.J. Stevenson, F. A. 1995. ‘‘Cracked: WINDOWS.PWL,’’ Message-ID: , Cypherpunks Mail List (Dec.). http://groups.google.com/groups?selm=Pine.SGI.3.91.951204174641.2847A100000%40odin&oe=UTF-8&output=gplain

Townsley, W. M., Valencia, A. J., Rubens, A., Pall, G. S., Zorn, G., and Palter, B. 1999. ‘‘Layer Two Tunneling Protocol ‘‘L2TP’’,’’ RFC 2661 (Aug.). Varghese, G. 2005. Network Algorithmics : An Interdisciplinary Approach to Designing Fast Networked Devices. Morgan Kaufmann, San Francisco Vaudenay, S. 2002. ‘‘Security Flaws Induced by CBC Padding-Applications to SSL, IPSEC, WTLS. . .,’’ Advances in Cryptology–EUROCRYPT ’02, Lecture Notes in Computer Science, no. 2332, pp. 534–545, Springer-Verlag. http://lasecwww.epfl.ch/pub/lasec/doc/Vau02a.ps


Viega, J., Messier, M., and Chandra, P. 2002. Network Security with OpenSSL. O’Reilly & Associates, Sebastopol, Calif. Voydock, V. L. and Kent, S. T. 1983. ‘‘Security Mechanisms in High-Level Network Protocols,’’ ACM Computing Surveys, vol. 15, no. 2, pp. 135–171 (June). Wagner, D. and Schneier, B. 1996. ‘‘Analysis of the SSL 3.0 protocol,’’ The Second USENIX Workshop on Electronic Commerce Proceedings, pp. 29–40 (Nov.). http://www.counterpane.com/ssl.html

Waissbein, A. and Friedman, A. A. 2001. ‘‘SSH Protocol 1.5 Session Key Recovery Vulnerability,’’ Advisory CORE-20010116, Core Security Technologies (Feb.). http://tinyurl.com/yrzub

Wang, X., Feng, D., Lai, X., and Yu, H. 2004. ‘‘Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD,’’ white paper. http://eprint.iacr.org/2004/199.pdf

Wang, X., Yin, Y. L., and Yu, H. 2005. ‘‘Collision Search Attacks on SHA1,’’ white paper. http://s17.yousendit.com/d.aspx?id=0MZULY5IBDAU130DK0RKV3GTIB

Wang, X. and Yu, H. 2005. ‘‘How to Break MD5 and Other Hash Functions,’’ Preprint (Mar.). http://www.infosec.sdu.edu.cn/paper/md5-attack.pdf

Wilson, S. 2003. ‘‘Rethinking PKI,’’ SC Magazine (June). http://www.scmagazine.com/scmagazine/2003_06/feature_2/index.html

Wright, G. R. and Stevens, W. R. 1995. TCP/IP Illustrated, Volume 2: The Implementation. AddisonWesley Pub. Co., Reading, Mass. Zheng, P. 2003. ‘‘Tradeoffs in Certificate Revocation Schemes,’’ Computer Communication Review, vol. 33, no. No. 2, pp. 103–112 (Apr.). Zorn, G. and Cobb, S. 1998. ‘‘Microsoft PPP CHAP Extensions,’’ RFC 2433 (Oct.).


Index

3DES (triple DES), 6, 65, 87, 171, 178, 210, 347, 366 definition of, 65

Aboba, B., 162, 407, 409, 415 accept function, 424 access concentrator, definition of, 99 ACCM (asynchronous control character map) attribute value field, 134 definition of, 133 ACK flag, definition of, 27 Adams, R., 343 address broadcast, 19 IPv6, 44–45 limited broadcast, 19 network-directed broadcast, 19 private, 35–40, 54, 93, 104, 350 Address Resolution Protocol, see ARP addressing classful, 13–19 IP, 13–19 Adleman, L.M., 70, 81, 84 Advanced Encryption Standard, see AES AES (Rijndael), 6, 65–67, 72, 87, 271, 282, 291, 365–366 Agarwal, P., 137

aggressive mode, definition of, 376 AH (Authentication Header protocol), 46, 168, 308–339, 341–342, 355, 358, 365–366, 373, 376, 389, 393, 395, 397–398, 400–403, 410, 416 definition of, 309, 325 header, 326–328 header, definition of, 327 input processing, 331 IPsec protocol, 7–8 IPv6, 336–337 output processing, 330–331 processing, 330–331 transport mode, 331–333 tunnel mode, 333–336 Akyol, B., 137 Albaugh, T., xv alert message types, SSL, 181 SSL, 180 algorithm Bellman-Ford, 50 Dijkstra’s, 50 extended Euclidean, 71 alleged RC4, 59 Allen, C., 166 Alvestrand, H.T., 237 Anderson, R., 76

Andersson, L., 138 AppleTalk, 42, 55 application layer, 11–12, 20, 157 definition of, 10 arithmetic, ones-compliment, 55 Arkko, J., 399 ARP (Address Resolution Protocol), definition of, 15 assigned session ID, definition of, 126 tunnel AVP, definition of, 119 asymmetric cipher, 6, 57, 69–75, 87 asynchronous control character map, see ACCM Atkinson, R., 311, 325, 341, 350, 398, 401, 403 attack DOS, 37 Smurf, 19 attribute value pair, see AVP attribute value pair, 113–115 authentication, 3–8, 42, 57, 155–162, 168, 171, 204, 207, 220, 271, 292–295, 308–313, 315–318, 321, 323, 325–326, 331–333, 336, 339, 341–345, 347, 350, 354–355, 357–358, 361, 365, 379, 403, 406 Authentication Header protocol, see AH authentication, VTun, 269–271 auth-up file, 431 autonomous system, see AS autonomous system, 49, 51–55, 135, 139–140, 151 avalanche effect, definition of, 76 AVP (attribute value pair) definition of, 111 HELLO, 125 L2TP message type, 116 Result code, 121 AVPs CDN, 132 ICCN, 128 ICRP, 127 ICRQ, 126 OCCN, 132 OCRP, 131 OCRQ, 131 SCCCN, 120 SCCRP, 120 SCCRQ, 119 SLI, 133

SopCCN, 121 WEN, 133

Barrett, D.J., 214, 221, 228 bash program, 207 Baugher, M., 363 Baze, M., 307 bc program, 71–72, 422 bearer capabilities, definition of, 119 type, definition of, 126 Begley, L., xv Bellare, M., 80, 171, 242, 259–260, 292, 343 Bellman-Ford algorithm, 50 Bellovin, S.M., 341, 355–356 Bentley, J., xiv BGP (Border Gateway Protocol), 51–55, 138–141, 144 birthday paradox, 75–76, 83 bit-flipping, definition of, 157 BITS (bump-in-the-stack), definition of, 311 BITW (bump-in-the-wire), definition of, 312 Bleichenbacher, D., 72, 230–231 block cipher, 57, 60–69, 87–88, 159, 166, 210, 233, 259–260, 282, 285, 342–343, 345, 355–356 Blowfish, 67–69, 87, 210, 271, 274, 281, 286, 291, 366 Blunk, L.J., 407 Boneh, D., 291 Booth, S., 162 BOOTP, 19 Border Gateway Protocol, see BGP border router, definition of, 49 Borman, D., 22 Braden, R.T., 22 broadcast address, 19 Brumley, D., 291 BSD r-commands, 7, 207–208, 227, 266 bsd work station, 5, 34–35 bump-in-the-stack, see BITS bump-in-the-wire, see BITW

CA (certificate authority), 84–87, 175, 188, 198, 293–294, 370 definition of, 84 call serial number, definition of, 126

called number, definition of, 127 calling number, definition of, 127 Callon, R., 138 Canetti, R., 80, 171, 292, 343 Canvel, B., 204 Carlson, J., 43, 100, 125, 407, 432 Carrel, D., 96, 357 cat program, 425 CBC mode, definition of, 60 CCP (Compression Control Protocol), definition of, 42 CDN AVPs, 132 certificate authority, see CA certificate, 6, 57, 83–88, 168, 217, 246, 293–294, 299, 369, 382–384, 401, 406 chain, 86–87 revocation, 87 SSL, 175–176, 178, 180, 185–186, 188–189, 193, 195–198, 205 types, IKE, 371 X.509, 84–86 CFB (cipher feedback mode), definition of, 295 challenge, definition of, 119 Challenge-Handshake Authentication Protocol, see CHAP Chandra, P., 192 change cipher spec, SSL, 177–178 channel messages, SSHv2, 250 CHAP (Challenge-Handshake Authentication Protocol), definition of, 42 Cheval, P., 137 CIDR (classless interdomain routing), 16–19, 43, 54 CIPE, 7, 162–163, 267, 272–283, 302 binary packet, 273–274 binary packet, definition of, 274 configuration data, definition of, 281 control message types, 281 control messages, 281–282 key exchange packet, definition of, 275 key exchange types, 275 key negotiation, 274–277 P byte, definition of, 274 security, 282–283 ciped program, 273, 280–281 cipher block chaining mode, see CBC cipher feedback mode, see CFB

cipher asymmetric, 6, 57, 69–75, 87 block, 57, 60–69, 87–88, 159, 166, 210, 233, 259–260, 282, 285, 342–343, 345, 355–356 Feistel, 62, 66–67 stream, 57–60, 87, 157, 163, 166, 171, 241, 324, 343, 345 suite, SSL, 166–167 symmetric, 6, 57–69, 72–73, 80 Clark, J., xiv classful address ranges, 14 addressing, 13–19 classless interdomain routing, see CIDR client authentication, SSL, 185–188 hello, SSL, 173–174 key exchange, SSL, 176–177 closure notification, SSL, 180 Cobb, S., 157 combined mode cryptographic algorithms, 403 security associations, 318–320 Comer, D.E., 9, 47 Compression Control Protocol, see CCP computationally infeasible, definition of, 83 connection set up, 28–30 shutdown, 30–31 connection-based, 9, 24–25 connectionless, 9, 22, 25 definition of, 20 conservative label retention mode, definition of, 138 Conta, A., 137 control messages L2TP, 116 PPTP, 107 counter mode, see CTR CRC (cyclic redundancy check), 41–42, 91, 109, 203, 210, 216, 231, 233, 259, 266, 273, 276–277, 282 CRC-32 compensation attack detector, 231 Crocker, S.D., 177 cryptographic, 4 hash function, 75–87 implementations, 419–423

cryptographically secure random number, 177 cryptography, 4–8, 57–88 elliptic curve, 75 csh program, 207 CTR mode, definition of, 61 cyclic redundancy check, see CRC

Dai, W., 259 Data Encryption Standard, see DES Datagram Transport Layer Security, see DTLS datalink layer, definition of, 10 Davari, S., 137 Davie, B., 137 Davis, C.R., 75 Dawson, E., 58 dc1 device, 98–99 de Groot, J.G., 35 de Wegner, B., 83 decorrelation, definition of, 399 Deep Crack, 65 Deering, S., 21 Deering, S.E., 45–46, 327 denial of service, see DOS DES (Data Encryption Standard), 6, 61–67, 72, 87, 210 /dev/urandom device, 275 DHCP (Dynamic Host Configuration Protocol), 19, 381 DiBurro, L., 397, 410 Dierks, T., 166 Diffie, W., 70, 74, 84 Diffie-Hellman key exchange, 57, 74–75, 192 exchange, SSL, 188–191 digital signature, 6, 80–83, 85, 87, 371, 376, 393–395, 401 Digital Signature Algorithm, see DSA Digital Signature Standard, see DSS Digital Subscriber Line, see DSL Dijkstra, E.W., 50 Dijkstra’s algorithm, 50 discrete logarithm, 73–75, 240 DISPLAY variable, 220–221 distance vector protocol, 50–51, 55 Dixon, W., 162, 409, 415 DNS (Domain Name System), 42, 158, 162, 212, 320–321

DOI (Domain of Interpretation), definition of, 362 Domain Name System, see DNS Domain of Interpretation, see DOI Dommety, G., 101 Doolan, P., 138 Doraswamy, N., 75, 343 DOS (denial of service), 344, 358, 377 attack, 37 downstream on demand mode, definition of, 138 Dreyfus, S., 143 DSA (Digital Signature Algorithm), 81–83, 407 DSL (Digital Subscriber Line), 95–100 definition of, 95 DSS (Digital Signature Standard), 78, 81–83, 244, 381 definition of, 81 DTLS (Datagram Transport Layer Security), definition of, 168 Dynamic Host Configuration Protocol, see DHCP

EAP (Extensible Authenication Protocol), definition of, 42 Eastlake, D.E., 3rd, 177 EBC mode, definition of, 60 echo function, 194 program, 28–30 echoit program, 197–199, 226–227 ECP, definition of, 42 EDE mode, definition of, 65 Effective TCP/IP Programming, see ETCP Egevang, K.B., 40 EGP (Exterior Gateway Protocol), 51–55 definition of, 49 egress router, definition of, 138 EIGRP (Enhanced Interior Gateway Routing Protocol), 51 electronic code book mode, see ECB Electronic Frontier Foundation, 65 ElGamal, 6, 70, 73–75, 82, 87 ElGamal, T., 73 elliptic curve cryptography, 75 Encapsulating Security Payload, see ESP encapsulation, 6–7, 11–13, 32, 54, 89–90, 92,

99, 102–104, 110, 145, 151, 160, 172, 271, 284, 288, 290, 294, 302, 307–308, 312–315, 318–320, 330, 332–333, 335–337, 339, 344, 348, 353, 410, 415, 431 encrypt-decrypt-encrypt mode, see EDE encryption, 3–8, 42, 57–88, 113–114, 155–162, 167–168, 176, 204, 207, 209–210, 232–235, 237–238, 260, 267–268, 270–272, 285, 292–293, 295, 308–311, 315, 318, 323, 325, 336, 338, 342, 344–345, 354, 361, 365–366, 406 symmetric, 383–385 endpoint authentication, definition of, 158 Enhanced Interior Gateway Routing Protocol, see EIGRP errno variable, 423 error codes, L2TP, 122 error function, 147, 193, 423 ERROR macro, 193 ESN (extended sequence number), definition of, 401 ESP (Encapsulating Security Payload), 40, 46, 157, 160, 163, 168, 296, 308–310, 312–315, 317–319, 321–326, 336, 338, 341–356, 358, 364–366, 373, 376, 393, 397–405, 410, 412–417 definition of, 309, 341 header, 342–343 header/trailer, definition of, 342 input processing, 345 IPsec protocol, 7–8 IPv6, 353–354 output processing, 344–345 processing, 344–345 transport mode, 345–348 tunnel mode, 348–353 ESPv3, 403–404 header, definition of, 404 ETCP ({Effective TCP/IP Programming}), 5, 24, 31, 180, 193, 204, 294, 419, 423, 429 /etc/ppp/auth-down file, 431 /etc/ppp/auth-up file, 431 /etc/ppp/ip-down file, 432 /etc/ppp/ip-up file, 432 /etc/ppp/ppp.conf file, 433 /etc/ppp/ppp.linkdown file, 434 /etc/ppp/ppp.linkup file, 434 eth1 device, 99 ethereal program, 429

etherpeek program, 429 Etienne, J., 272 Evarts, J., 96 exchange types, IKE, 361 exec function, 228 execl function, 263 explicit congestion notification, see ECN extended sequence number, see ESN extended Euclidean algorithm, 71, 421–423 sequence numbers, 401–403 Extensible Authenication Protocol, see EAP extension headers, IPv6, 46–47, 55 Exterior Gateway Protocol, see EGP Exterior Gateway Protocol, see EGP

Farinacci, D., 101, 137 FCS (frame check sequence), 41, 100 FEC to NHLFE map, see FTN FEC (forwarding equivalence class), definition of, 136 Fedorkow, G., 137 Feistel network, see Feistel cipher Feistel cipher (Feistel network), 62, 66–67 definition of, 62 Feldman, N., 138 Feng, D., 83 Ferguson, N., 57, 61, 65, 70, 73, 81, 84, 87, 309, 313, 316–317, 324–325, 341, 344, 355 Fermat’s little theorem, 71 FIN flag, definition of, 27 finished message, SSL, 178–180 flooding, definition of, 50 Fluhrer, S., 60, 324 Ford, L.R., 50 Ford, W., 85 fork function, 228, 263 forwarding equivalence class, see FEC four-way handshake, 29 frame check sequence, see FCS frame formats, PPP, 43 framing capabilities, definition of, 119 type, definition of, 127 Francis, P., 40 Franz, M., xv Fredette, A., 138 Freier, A.O., 166

Friedman, A.A., 231 FTN (FEC to NHLFE map), definition of, 137 FTP, definition of, 40 ftp program, 7, 39–40, 136, 207, 266 Fulkerson, D.R., 50 Fuller, V., 19 function, cryptographic hash, 75–87 Futoransky, A., 231, 282

Garman, J., 214 GCHQ (Government Communications Headquarters), 70 general authentication messages, SSH, 243 Generic Routing Encapsulation, see GRE geqn program, xiv gethostbyname function, 149 gif device, 92, 101 Glenn, R., 327, 343 Goldberg, I., 165 Government Communications Headquarters, see GCHQ gpic program, xiv GRE (Generic Routing Encapsulation), 100–104, 106–107, 110, 151–152, 161 definition of, 100 gre device, 101 GRE header, definition of, 101 gretun device, 102–103 Gross, P., 49 group generator, definition of, 73 gtbl program, xiv gtunnel, 145–151 program, 6–7, 145, 147–148, 151–152, 201, 205, 261–262, 266, 311, 339, 432 gtunnel.c file, 145, 148 Guichard, J., 144 Gutmann, P., 3, 84, 87, 242, 272, 283, 292

half close, 31 Haller, N.M., 215 Hamzeh, K., 109 handshake four-way, 29 messages, SSL, 171–172 three-way, 28–29 types, SSL, 172 Hanks, S., 101

Hanson, D., xiv Harding, T., xiv Hardjono, T., 363 Harkins, D., 75, 343, 357 Harney, H., 363 hash function, cryptographic, 75–87 Hatch, B., 261 HDLC (High-Level Data Link Control Protocol), 40–42, 55, 100 definition of, 41 header AH, 326–328 IP, 20–22 IPv6, 45–46 TCP, 25–28 UDP, 23 Hedrick, C., 50 Heinanen, J., 137 Hellman, M., 70, 74, 84 HELLO AVP, 125 definition of, 125 hello done, SSL, 176 Henry-Stocker, S., xiv Herzog, J., 240 Hickman, K.E.B., 165 hidden AVP, definition of, 114 High-Level Data Link Control Protocol, see HDLC Hiltgen, A., 204 Hinden, R.M., 19, 45–46, 327 HMAC, 6, 80–83, 134, 158, 166, 170–172, 178, 233, 284–286, 292–293, 295, 298, 300–301, 303, 327–328, 336, 343, 347–348, 353, 370, 379, 388, 394 Holdrege, M., 40 Hollenbeck, S., 174 host ID, definition of, 14 Housley, R., 85 Huitema, C., 16, 47 Huttunen, A., 397, 410–411

IANA (Internet Assigned Numbers Authority), definition of, 102 ICCN AVPs, 128 ICMP (Internet Control Message Protocol), 10, 21–22, 25, 45, 54, 91, 93, 95, 103–104, 203–204, 297, 314, 323, 333, 336, 345,

348, 372, 399–400, 404, 416, 427 definition of, 32 echo reply, 32–33 echo request, 32–33 error messages, 34–35 message types, 33 protocol, 31–35 ICRP AVPs, 127 ICRQ AVPs, 126 identification payload, IPsec, 369 types, IPsec, 369 IETF (Internet Engineering Task Force), 7, 47, 54, 84, 96, 109, 113, 134, 162, 166, 208, 232, 238, 259, 307, 365, 373 ifconfig program, 92, 102–103, 150 IGMP (Internet Group Management Protocol), 10 IGP (Interior Gateway Protocol), 49–51, 54–55 definition of, 49 IKE (Internet Key Exchange), 308–310, 312, 317, 321–323, 330, 357–395, 397, 400–401, 404–414, 416–417 authentication with signatures, 381–383 certificate types, 371 definition of, 7, 309, 357 exchange types, 361 IPsec protocol, 7–8 key generation, 378–379 new group exchange, 386–387 notification message types, 374 payload types, 360 phase 1, 376–378 phase 1 attributes, 367 phase 2 attributes, 367 phase 2 quick mode, 387–388 public key authentication, 383 revised public key authentication, 383–386 shared secret authentication, 379–381 IKEv2, 401, 404–409, 416–417 exchanges, 405–409 messages, 405 ILM (incoming label map), definition of, 137 inbound function, 145, 147, 149 incoming label map, see ILM inet_aton function, 149 inetd program, 258–259

inetd. program, 259 ingress router, definition of, 136 INIT macro, 147, 193 initial received LCP CONFREQ, definition of, 127 initialization vector, see IV Integrated Services Digital Network, see ISDN integrity check value, see ICV interface layer, 12, 40, 92, 134–135, 202 definition of, 10 Interior Gateway Protocol, see IGP Intermediate System to Intermediate System Protocol, see IS-IS International Telecommunication Union, 84 Internet Assigned Numbers Authority, see IANA Internet Control Message Protocol, see ICMP Internet Engineering Task Force, see IETF Internet Group Management Protocol, see IGMP Internet Key Exchange, see IKE Internet Protocol, see IP Internet layer, definition of, 10 internet protocol numbers, 22 Internet Security Association and Key Management Protocol, see ISAKMP internet service provider, see ISP Ioannidis, J., 307 IP (Internet Protocol) addressing, 13–19 header, 20–22 layer, 283 ip program, 102 IP security, see IPsec IP protocol, 20–22 ip variable, 150 ip_len member, 150 ip_p member, 103 IP-in-IP tunnel, 92–95, 100–101, 141, 147–152, 262, 311, 315, 324, 336 ipip program, 150, 152 ipip.c file, 148 IPPROTO_IPIP socket option, 149 IPsec (IP security), 6–8, 40, 46, 55, 74, 80, 157, 160–163, 168, 224, 267, 283, 296, 301–302, 307–318, 320–325, 328, 330, 333, 338, 341–345, 347, 350, 354, 357–358, 362–365, 368–369, 372–373,

376, 378, 387–389, 393–394, 397–401, 404, 406, 409–410, 412–413, 415–416 and multicast, 400 architecture, 311–324, 398–401 definition of, 307 futures, 397–417 ICMP processing, 323 identification payload, 369 identification types, 369 inbound processing, 322–323 modes, 313–316 outbound processing, 322 overview, 308 policies, 320–321 policy, definition of, 320 processing, 321–323 protocol, AH, 7–8 protocol, ESP, 7–8 protocol IDs, 364 protocol, IKE, 7–8 protocols, 312–313 selectors, definition of, 321 sequence numbers, 328–330 transform IDs, 366 iptrace program, 429 ip-up file, 201 IPv4, definition of, 21 IPv6, 43–47, 54–55, 327, 330, 338, 359, 398–399, 416 address, 44–45 AH, 336–337 anycast address, definition of, 44 definition of, 45 ESP, 353–354 extension headers, 46–47, 55 header, 45–46 muticast address, definition of, 44 pseudoheader, definition of, 46 unicast address, definition of, 44 IPX, 42, 55, 89–90 ISAKMP (Internet Security Association and Key Management Protocol), 357–376, 378–380, 386–388, 392–394 attribute, definition of, 362 certificate payload, 369–370 certificate payload, definition of, 370 certificate request, definition of, 370 cookies, 358–359 delete payload, 373

delete payload, definition of, 375 generic header, definition of, 362 hash payload, 370 hash payload, definition of, 371 header, 359–361 header, definition of, 360 identification payload, 368–369 key exchange payload, 367–368 key exchange payload, definition of, 368 message processing, 373–375 nonce payload, 372 nonce payload, definition of, 372 notification payload, 372–373 notification payload, definition of, 372 payloads, 361–362 proposal payload, definition of, 364, 368 SA payload, definition of, 363 signature payload, 370–371 signature payload, definition of, 371 transform payload, definition of, 365 vendor ID payload, definition of, 375 vendor payload, 373 ISDN (Integrated Services Digital Network), definition of, 132 IS-IS (Intermediate System to Intermediate System Protocol), 51 iterated tunneling, definition of, 319 IV (initialization vector), definition of, 60

Jacobson, V., 41 Johnson, D.B., 399

Kalisky, B., 72, 231 Kargieman, E., 231, 282 Karlton, P., 166 Karrenberg, D., 35 Kaufman, C., 134 Kent, S., 309, 311, 325, 341, 350, 398, 401, 403 kermit program, 259 Kernighan, B., xiv key management, 7 Kivinen, T., 397, 411 Klima, V., 83 Knuth, D.E., 71, 422 Kocker, P.C., 166 Kohno, T., 242, 259–260 Kolesnikov, O., 261

Krawczyk, H., 80, 171, 292, 343, 345, 358 Krishnan, R., 137

L2F (Layer Two Forwarding), definition of, 109 L2TP Access Concentrator, see LAC L2TP Network Server, see LNS L2TP (Layer Two Tunneling Protocol), 109–134, 151–152, 158–163, 222, 224, 409 attribute value pair, definition of, 113 AVP, definition of, 113 common header, definition of, 111 control messages, 116 definition of, 109 error codes, 122 message type AVP, 116 proxy authentication types, 128 L2TPv3, 134 label distribution protocol, see LDP label switched path, see LSP label switching router, see LSR label distribution protocol, 138–139 MPLS, 137 LAC (L2TP Access Concentrator), definition of, 109 Lai, X., 83 LANalyzer program, 429 laptop work station, 5, 28–29, 34–35 last received LCP CONFREQ, definition of, 127 sent LCP CONFREQ, definition of, 127 Layer Two Forwarding, see L2F Layer Two Tunneling Protocol, see L2TP layering, 9–11, 39, 54 LCP (Link Control Protocol), definition of, 42 LDP (label distribution protocol), definition of, 138 Le Faucheur, F., 137 Lear, E., 35 leased line, 2–3, 141, 143, 155–156, 163, 311 Lemberg, W., xiv Lenstra, A., 83 Levkowetz, H., 407 Li, T., 19, 51, 101, 137 liberal retention mode, definition of, 138 Lidl, K., 96

lightweight VPN, 162–163, 267–303, 307, 309 limited broadcast address, 19 Link Control Protocol, see LCP link layer, 109, 134, 283 definition of, 10 link-state protocol, 50–51, 55 linux work station, 5, 30, 33, 35 linuxlt work station, 5 Little, W.A., 109 LNS (L2TP Network Server), definition of, 109 logarithm, discrete, 73–75, 240 loom program, xiv LSP (label switched path), definition of, 136 LSR (label switching router), definition of, 136

MAC (message authentication code), 57, 79–83, 87–88, 159, 205, 210, 216, 231, 233, 238, 242, 259, 266, 272, 282, 292, 312, 325, 327, 401 address, 16 address, definition of, 15 definition of, 79 Madson, C., 327, 343 Main mode, definition of, 376 Malkin, G.S., 50 Mamakos, L., 96 mandatory mode, definition of, 107 Mantin, I., 60, 324 manual keying, definition of, 317 Maughan, D., 357 maximum receive unit, see MRU maximum transmission unit, see MTU maximum bps, definition of, 131 McLaughlin, R., xiv MD5, 76–83, 88, 115, 134, 213, 217–218, 231, 233, 271, 295, 301, 327, 343, 347–348, 366, 411 md5 program, 76 Menezes, A.J., 57, 70–71, 295 Merkle, R., 70 message authentication code, see MAC message integrity code, see MIC message authentication, definition of, 158 types, SSHv1, 211 Messier, M., 192

Messmer, E., 162 Meyer, D., 101 MH (mobility header), definition of, 399 MIC (message integrity code), definition of, 159 Microsoft Challenge Handshake Authentication Protocol, see MS-CHAP Microsoft Point-to-Point Encryption, see MPPE minimum bps, definition of, 131 Mister, S., 60 mobility header, see MH Modadugu, N., 168 mode transport, 314–315 tunnel, 315–316 modes, IPsec, 313–316 Mogul, J., 16, 21 Moskowitz, R.G., 35 Moy, J., 51 MPLS (Multiprotocol Label Switching), 135–144, 151–152, 156 definition of, 135 label, 137 tunnel, 139–144 VPN, 141–144, 155–156 MPPE (Microsoft Point-to-Point Encryption), 157–159 MRU (maximum receive unit), 42 ms macro, xiv MS-CHAP (Microsoft Challenge Handshake Authentication Protocol), 157–158, 162 MTU (maximum transmission unit), 21, 29, 55, 94–95, 151, 203, 288, 323 Mudge, 158 multicast, 13 Multiprotocol Label Switching, see MPLS mutable but predictable, 326 IP header fields (AH), definition of, 326 IPv6 header fields (AH), definition of, 337

Nadeau, T.D., xiv Namprempre, C., 242, 259–260 NAP (network access point), 52 definition of, 49 Narten, T., 313 NAT transversal, see NAT-T

NAT (network address translation), 6, 8, 35–40, 43, 54, 93, 110, 134, 141, 161–162, 201, 271, 273, 309, 333, 336, 339, 350, 397, 409–417, 433 keep-alives, 410 NAT-D payload, definition of, 412 National Security Agency, see NSA NAT-OA payload, definition of, 414 NAT-T (NAT transversal), 8, 40, 134, 162, 309, 397, 409–417 definition of, 162, 397 nc program, 28, 425 NCP (Network Control Protocol), definition of, 42 netcat, 28, 34, 196–199, 212, 425–426 command line options, 426 nettl program, 429 network access point, see NAP network address translation, see NAT Network Control Protocol, see NCP network ID, definition of, 14 layer, 10–13, 22, 32–33, 39, 42, 45–47, 89–90, 92, 95, 106, 135, 137–138, 152, 157, 268, 272, 307, 328 layer, definition of, 10 Network Layer Security Protocol, see NLSP network traces, 3 network-directed broadcast address, 19 new IPsec processing model, 398 newmail function, 230 next hop label forwarding entry, see NHLFE NHLFE (next hop label forwarding entry), definition of, 137 Nielsen, L., 58 NIST, 61–62, 65–67, 78, 81–83 NLSP (Network Layer Security Protocol), definition of, 307 Nolan, C., xv nonce, definition of, 134 Nordmark, E., 313 notification message types, IKE, 374 payload, ISAKMP, 372–373 notify function, 229–230 NSA (National Security Agency), 70, 78 Oakley Key Determination Protocol, see OAKLEY

OCCN AVPs, 132 OCRP AVPs, 131 OCRQ AVPs, 131 OFB (output feedback mode), definition of, 295 one time pad, 58 ones-compliment arithmetic, 55 open failure reason codes, SSHv2, 253 Open Shortest Path First Protocol, see OSPF Open Systems Interconnection, see OSI OpenSSH, 208–210, 212, 215–216, 228, 231, 237–238, 244 OpenSSL, 174, 179, 191–196, 205, 271, 277, 286, 291–292, 302, 429 openssl program, 205 OpenVPN, 7, 163, 267, 283, 292–303 control channel, 297–301 control channel packet, definition of, 298 data channel, 294–296 data packet, definition of, 296 key exchange message-1, definition of, 299 key exchange message-2, definition of, 300 OCC message, definition of, 297 OCC op codes, 297 op codes, 295 packet header, definition of, 295 ping and OCC protocols, 296–297 security, 301–302 security models, 293–294 optional ESP padding, 403–404 orderly release, 31 Orman, H.K., 358 OSI reference model, 11 OSPF (Open Shortest Path First Protocol), 51–52, 140, 283 outbound function, 145, 147, 149 output feedback mode, see OFB PAC (PPTP Access Concentrator), definition of, 105 Pacetti, A.M., 231, 282 packet PPP, 42 sniffers, 426–429 types, PPPoE, 98 PAD (peer authentication database), 400–401 definition of, 400

PADI (PPPoE Active Discovery Initiation), 95–100 definition of, 96 PADO (PPPoE Active Discovery Offer), 95–100 definition of, 96 PADR (PPPoE Active Discovery Request), 95–100 definition of, 96 PADS (PPPoE Active Discovery Session-confirmation), 95–100 definition of, 96 PADT (PPPoE Active Discovery Terminate), 95–100 definition of, 96 Pall, G.S., 109, 113, 134, 157 Palter, B., 113, 134 PAP (Password Authentication Protocol), definition of, 42 Partridge, C., 22 Password Authentication Protocol, see PAP Patel, B.V., 162 path MTU, see PMTU path-vector protocol, 53 payload types, IKE, 360 PCT (Private Communications Technology), definition of, 166 peer authentication database, see PAD penultimate hop popping, definition of, 138 Pepelnjak, I., 144 Pereira, R., 343 perfect forward secrecy, see PFS Perkins, C., 94, 323 Perkins, C.E., 399 Perlman, R., 3, 47, 51, 134 PFP (populate from packet flags), definition of, 400 PFS (perfect forward secrecy), definition of, 159 phase 1 attributes, IKE, 367 2 attributes, IKE, 367 physical channel ID, definition of, 127 pid variable, 32 ping program, 31–34, 91, 93, 99–100, 103–104, 129, 150–151, 202–204, 264–265, 332, 334, 336, 347, 389–390, 393, 395, 426–427 Piper, D., 362–363

pkcipe, 277–281 PKCIPE message types, 278 packet, definition of, 277 pkcipe program, 273–274, 277 PKCS #1, 72, 81 PKI (public key infrastructure), 84–87 definition of, 84 Plummer, D.C., 16 PMTU (path MTU), 21, 47, 288 PNS (PPTP Network Server), definition of, 105 Point to Point Protocol, see PPP Point to Point Tunneling Protocol, see PPTP policies, IPsec, 320–321 Polk, T., 85 pooled mode, definition of, 37 popen function, 229 populate from packet flags, see PFP port address translation, see PAT port address translation, definition of, 37 forwarding, SSH, 223–226 Postel, J.B., 16, 20, 23, 25, 32, 39 PPP over Ethernet, see PPPoE PPP (Point to Point Protocol), 6, 10, 40–43, 54–55, 90–92, 95–96, 98–100, 104–134, 145, 151, 157–162, 199–205, 261, 266, 268–269, 431–434 frame formats, 43 packet, 42 ppp program, 201–202, 432–434 ppp.conf file, 98, 433–434 pppd command line options, 431 program, 99, 145, 200, 202, 431–432 ppp.linkup file, 201–202 PPPoE Active Discovery Initiation, see PADI PPPoE Active Discovery Offer, see PADO PPPoE Active Discovery Request, see PADR PPPoE Active Discovery Session-confirmation, see PADS PPPoE Active Discovery Terminate, see PADT PPPoE (PPP over Ethernet), 95–100, 151, 201 PPPoe, definition of, 95 PPPoE (PPP over Ethernet) header, definition of, 98 packet types, 98

pppoe program, 99 PPPoE tag, 97 PPTP Access Concentrator, see PAC PPTP Network Server, see PNS PPTP (Point to Point Tunneling Protocol), 101–102, 104–110, 118, 151–152, 157–159, 161, 163 control message header, definition of, 106 control messages, 107 definition of, 105 extended GRE header, definition of, 108 Preneel, B., 79 printf function, 423 private address, 35–40, 54, 93, 104, 350 ranges, 36 Private Communications Technology, see PCT private group, definition of, 128 peering, 52 protocol distance vector, 50–51, 55 ICMP, 31–35 IDs, IPsec, 364 IP, 20–22 label distribution, 138–139 link-state, 50–51, 55 path-vector, 53 TCP, 24–31 UDP, 22–23 version, definition of, 119 protocols, IPsec, 312–313 Provan, D., 89 proxy ARP, definition of, 15 authen challenge, definition of, 127 authen ID, definition of, 127 authen name, definition of, 127 authen response, definition of, 128 authen type, definition of, 127 pseudo-header, 23, 39 pseudo-randomness, definition of, 231 pseudowire, 134 PSH flag, definition of, 27 public key infrastructure, see PKI public key signature data, SSHv2, 245 Pyle, E., xv python program, 59, 71, 229, 419–420, 422

Quick mode, definition of, 387

rand function, 271, 274 random device, definition of, 275 vector, definition of, 120 random device, 275 random/urandom device, 275 RAS (remote access server), definition of, 104 rbiff program, 229 rbiffd program, 229–230, 259 RC4, 6, 58–60, 62, 87–88, 157–158, 171, 205, 241, 259–260, 279, 324, 420–421 alleged, 59 rcp program, 207, 227–228 read function, 192, 194 receive window, definition of, 119 record layer message types, SSL, 171 SSL, 170–171 recv_char() program, 40 reference model, OSI, 11 Rekhter, Y., 19, 35, 49, 51, 137–138, 140–141 reliable, 9, 20, 24, 124–125, 204, 208, 232, 283, 292–293, 302, 374, 416 remote access server, see RAS remote variable, 148–149 Rescorla, E., 165, 167–168, 174, 185, 192, 430 Result code AVP, 121 result codes, StopCCN, 121 resumed session, SSL, 180–183 rexec program, 207 Reynold, J.K., 39 Rijndael, see AES RIP, 50–51 Rivest, R., 70, 76, 78, 81, 84 rlogin program, 207 Robshaw, M., 59, 72 Romkey, J.L., 40 Roos, A., 60 rootcert.pem file, 198 Rosen, E.C., 137–138, 140–141 round trip time, see RTT round keys, 62 route program, 103 routing, 4, 6, 9–10, 14, 16–19, 43–44, 46–55, 93, 135–144, 151, 156, 163, 283, 288, 313, 333, 353, 398

RSA, 6, 70–73, 81, 87–88, 176–177, 188, 212, 214, 217, 237, 242, 244–245, 266, 277, 280, 286–287, 291, 381, 383, 407, 421 Laboratories, 72, 81 rsh program, 207–209, 227 RST flag, definition of, 27 RTT (round trip time), 29 Rubens, A., 113, 134 Rx connect speed, definition of, 127 s_client program, 192, 194–195 s_server program, 192 SA (security association), 309, 316–324, 327, 330–331, 341–342, 344–345, 347, 350, 355, 357–395, 398–401, 404–409, 414–417 definition of, 309, 316–317 proposal and transform payloads, 362–367 SAD (security association database), definition of, 317 S-box (substitution box), definition of, 64 SCCCN AVPs, 120 SCCRP AVPs, 120 SCCRQ AVPs, 119 Schertler, M., 357 Schiller, J.I., 177 Schneider, M., 357 Schneier, B., 57–58, 61–62, 65, 70, 72–73, 75, 81–82, 84, 87, 158, 204, 309, 313, 316–317, 324–325, 341, 344, 355–356 SCP (Secure Copy Program), 227–230 scp program, 227–228, 248, 258, 266 Secure Copy Program, see SCP Secure Data Network System, see SP3 Secure Hash Algorithm, see SHA Secure Shell, see SSH Secure Sockets Layer, see SSL security association, see SA security association database, see SAD security parameter index, see SPI security policy database, see SPD security association, 7, 309, 316–321, 324, 357, 363, 393 select function, 147, 149 send_char() program, 40 sendto function, 149 SEQ_LT, definition of, 112 SEQ_LT macro, 112 sequencing required, definition of, 128

Serial Line Protocol, see SLIP server hello, SSL, 174–175 key, SSH, 212 session key, definition of, 72 key generation, SSHv2, 241 set link info message, see SLI setkey program, 347, 389 sftp program, 258 sh program, 207 SHA (Secure Hash Algorithm), 76–83, 88, 134, 171, 231, 233, 235, 240–241, 278, 285–286, 291, 295, 301, 327, 336, 343, 353, 365, 370, 379, 381 Shamir, A., 60, 70, 81, 84, 324 Shea, R., 134, 162 SHELL variable, 227 shortest path, definition of, 51 signature, digital, 6, 80–83, 85, 87, 371, 376, 393–395, 401 Silverman, R.E., 214 Simone, D., 96 Simpson, W.A., 119, 313 Siverman, R.E., 221, 228 sleep function, 229 SLI (set link info message) AVPs, 133 definition of, 132 SLIP (Serial Line Protocol), 40–41, 55 Smurf attack, 19 Snader, J.C., 5 Snader, M, xiv Snader, R., xiv snoop program, 429 SOCK_RAW socket option, 149 SOCKADDR macro, 148 sockaddr structure, 148 sockaddr_in structure, 148 soft state, 95, 323 solaris work station, 5, 28–30, 35 Solo, D., 85 Song, D.X., 260 SopCCN AVPs, 121 SP3 (Secure Data Network System), definition of, 307 Spam, 37 SPD (security policy database), 398–399 definition of, 321

Speciner, M., 134 SPI (security parameter index), definition of, 317 Srisuresh, P., 40 SSH (Secure Shell), 7, 152, 162–163, 207–266, 268, 273, 286, 293, 297, 324, 328 general authentication messages, 243 port forwarding, 223–226 ssh program, 136, 207–210, 220, 224, 228, 230, 258, 262–263 SSH (Secure Shell) server key, 212 VPN, 260–267 SSH_CMSG_PORT_FORWARD_REQUEST, definition of, 227 SSH_CMSG_REQUEST_PTY, definition of, 219 SSH_CMSG_SESSION_KEY, definition of, 214 SSH_CMSG_X11_REQUEST_FORWARDING, definition of, 221 SSH_MSG_CHANNEL_DATA, definition of, 222 SSH_MSG_CHANNEL_EXTENDED_DATA, definition of, 254 SSH_MSG_CHANNEL_OPEN, definition of, 251 SSH_MSG_CHANNEL_OPEN_CONFIRMATION, definition of, 222, 252 SSH_MSG_CHANNEL_OPEN_FAILURE, definition of, 253 SSH_MSG_CHANNEL_REQUEST, definition of, 254 SSH_MSG_GLOBAL_REQUEST, definition of, 251 SSH_MSG_KEXDH_GEX_REPLY, definition of, 239 SSH_MSG_KEXDH_REQUEST, definition of, 239 SSH_MSG_KEXINIT, definition of, 236 SSH_MSG_PORT_OPEN, definition of, 226 SSH_MSG_USERAUTH_FAILURE, definition of, 244 SSH_MSG_USERAUTH_INFO_REQUEST, definition of, 249 SSH_MSG_USERAUTH_INFO_RESPONSE, definition of, 250 SSH_MSG_USERAUTH_PASSWD_CHANGEREQ, definition of, 247

SSH_SMSG_PUBLIC_KEY, definition of, 213 sshd program, 207, 209, 212, 228, 259, 262–264 SSHv1, 208–231 authentication, 210–220 binary packet, definition of, 210 message types, 211 remote commands, 226–227 security, 230–231 user authentication, 214–220 SSHv2, 232–260 authentication, 242–248 binary packet, definition of, 234 channel messages, 250 connection protocol, 248–252 data transfer, 252–253 Diffie-Hellman key exchange, 238–240 exchange hash, 240–241 key generation, 241–242 keyboard interactive authentication, 247–248 none authentication, 244 open failure reason codes, 253 parameter negotiation, 234–238 password authentication, 246–247 port forwarding, 257–258 public key authentication, 244–246 public key signature data, 245 remote commands, 253–256 security, 259–260 services, 242 session key generation, 241 subsystems, 258–259 transport message types, 234 transport protocol, 232–233 user authentication request, definition of, 243 sshvpn, 262–265 program, 261–264 SSL (Secure Sockets Layer), 7, 58, 80, 86–87, 156, 162–163, 165–205, 208–209, 212, 260–261, 271, 273, 277, 286, 291–294, 297–302, 324, 429–430, 433 alert, 180 alert message, definition of, 180 alert message types, 181 certificate, 175–176, 178, 180, 185–186, 188–189, 193, 195–198, 205 certificate message, definition of, 176

change cipher spec, 177–178 cipher suite, 166–167 cipher suite, definition of, 166 cipher suites, 167 client authentication, 185–188 client cert. request, definition of, 188 client hello, 173–174 client hello, definition of, 173 client key exchange, 176–177 client key exchange message, definition of, 177 closure notification, 180 definition of, 165 Diffie-Hellman key exchange, 188–191 finished message, 178–180 finished message, definition of, 178 handshake header, definition of, 172 handshake messages, 171–172 handshake types, 172 hello done, 176 MasterSecret, definition of, 176 PreMasterSecret, definition of, 176 protocol, 167–171 record format, definition of, 170 record layer, 170–171 record layer message types, 171 resumed session, 180–183 security, 204–205 server hello, 174–175 server hello, definition of, 175 server hello done, definition of, 176 server key exchange message, definition of, 190 v2 client hello, 183–185 v2 record-client hello, definition of, 184 VPN, 265–266 SSL_accept function, 194 SSL_read function, 192, 194 SSL_set_bio function, 194 SSL_write function, 194 ssldump, 429–430 ssldump program, 165, 174–178, 187, 189, 199, 202, 428–430 sslecho program, 194–195, 198–199, 206 sslecho.pem file, 193 stack definition of, 10 TCP/IP, 10 Staddon, J., 72, 231

startup function, 145, 147–148 static mode, definition of, 37 Stenberg, M., 397, 410 Stevens, W.R., xiv–v, 9, 27, 29, 200, 431 Stevenson, F.A., 60 StopCCN result codes, 121 stream cipher, 57–60, 87, 157, 163, 166, 171, 241, 324, 343, 345 strerror function, 423 strong collision resistance, definition of, 75 stunnel, 196–204 program, 192, 196–202, 204–205, 208 stunnel. program, 223 stunnel.client file, 199 stunnel.server file, 198, 200 subaddress, definition of, 127 subnetting, 16 substitution box, see S-box subsystems, SSHv2, 258–259 Swander, B., 397, 410–411 symmetric cipher, 6, 57–69, 72–73, 80 encryption, 383–385 SYN flag, definition of, 27 synchronization segment, see SYN synchronization segment, definition of, 28 synchronous line, 41–42 Taarud, J., 109 tag, PPPoE, 97 Tappan, D., 137 tar program, 426 Tavares, S.E., 60 TCP (Transmission Control Protocol) data delivery, 25 definition of, 26 header, 25–28 protocol, 24–31 segment, definition of, 24 tcp_client function, 424 tcp_server function, 193, 424 tcpdump program, 3, 12, 28–30, 33–34, 54, 91, 99, 103, 115–116, 123, 151, 165, 170, 174, 179, 244, 308, 332, 335, 352, 355, 390, 393, 426–430 TCP/IP, 9–55 stack, 10 telnet program, 7, 136, 207, 209, 266, 350–351, 425

testbed, 5–6 TFC (traffic flow confidentiality), definition of, 403 Thomas, B., 138 three-way handshake, 28–29 Tian, X., 260 tie breaker, definition of, 119 time to live, see TTL tinc, 7, 163, 267, 283–292, 296, 302 binary packet, definition of, 285 binary protocol, 284–286 metaprotocol, 286–291 metaprotocol message types, 287 security, 291–292 TLS (Transport Layer Security), 165–206, 292–294, 298 definition of, 166 tohex function, 420 Townsley, W.M., 113, 134 traces, network, 3 traffic flow confidentiality, see TFC traffic analysis, definition of, 309 selectors, 399–400 Traina, P., 101 transform IDs, IPsec, 366 Transmission Control Protocol, see TCP transport adjacency, definition of, 318 layer, 10–12, 20, 22, 47, 90, 92, 165, 168, 202, 208, 292, 314, 336 layer, definition of, 10 Transport Layer Security, see TLS transport message types, SSHv2, 234 mode, 314–315 triple DES, see 3DES TTL (time to live), 21–22, 31, 46, 95, 136–137, 350 tun device, 147, 149–150, 267–268, 429, 431, 433 tun0 device, 261–262, 264 tunnel, 2, 9, 11, 40, 54, 89–152, 155–163, 165, 196–204, 208, 224, 228, 260–262, 264–265, 267–271, 273, 281–283, 286, 296–297, 302, 311, 315–316, 332–333, 346–347, 376, 389, 395, 399, 415, 426, 428, 431, 433 definition of, 90

IP-in-IP, 92–95, 100–101, 141, 147–152, 262, 311, 315, 324, 336 mode, 315–316 MPLS, 139–144 tunneling, 3–8, 11, 40, 54 definition of, 90 TUN/TAP device, 145 tun/tap device, 267–268, 283, 429 Turner, J., 357 Tx connect speed, definition of, 127

UDP (User Datagram Protocol) definition of, 10, 24 header, 23 protocol, 22–23 UNIX, 6, 32, 72, 165, 173, 191, 201, 207, 228, 422, 425, 428–429 unreliable, definition of, 20 unsolicited email, see SPAM unsolicited downstream mode, definition of, 138 urandom device, definition of, 275 urandom device, 275 URG flag, definition of, 27 User Datagram Protocol, see UDP Vaananen, P., 137 Valencia, A.J., 113, 134 van Oorschot, P.C., 57, 70–71, 79, 295 Vanstone, S.A., 57, 70–71, 295 Varadhan, K., 19 Varghese, G., 25 Vaudenay, S., 204 Verthein, W., 109 Viega, J., 192 virtual private network, see VPN Viswanathan, A., 138 Vollbrecht, J.R., 407 Volpe, V., 397, 410–411 voluntary mode, definition of, 108 Voydock, V.L., 309 VPN (virtual private network), 3–9, 11, 54, 67, 73, 75, 84, 104, 109–110, 135–136, 141–144, 151, 155–163, 165, 199, 201–205, 208, 225, 260–266, 307–324, 328, 334, 348, 350, 355, 357–358, 383, 400, 406, 409, 415–416, 426, 428

definition of, 3 lightweight, 162–163, 267–303, 307, 309 MPLS, 141–144, 155–156 SSH, 260–267 SSL, 265–266 VTun, 7, 162–163, 267–272, 277, 283, 292, 302 authentication, 269–271 security, 271–272 tunnel parameter options, 270 vtund program, 267–269 Vuagnoux, M., 204

Wagner, D., 158, 165, 204, 260 Waissbein, A., 231 WAN error notify message, see WEN Wang, X., 83 Weis, B., 363 WEN (WAN error notify message) AVPs, 133 definition of, 132 WEP (wired equivalent privacy), definition of, 60 Wheeler, R., 96 Wilson, S., 84 wired equivalent privacy, see WEP Wright, G., xiv write function, 194 Wu, L., 137

X11 forwarding, 220–223, 226, 256–257, 266 X.509, 83 certificate, 84–86 certificate, definition of, 85

Yin, Y.L., 83 Ylönen, T., 227 Yu, H., 83 Yu, J., 19

zero length body message, see ZLB Zheng, P., 87 ZLB message, definition of, 115 Zorn, G., 109, 113, 134, 157, 162

Also available from Jon C. Snader and Addison-Wesley

0-201-61589-4 · © 2000 · 320 pages

Programming in TCP/IP can seem deceptively simple. Nonetheless, many network programmers recognize that their applications could be much more robust. Effective TCP/IP Programming is designed to boost programmers to a higher level of competence by focusing on the protocol suite’s more subtle features and techniques. It gives you the know-how you need to produce highly effective TCP/IP programs.

In forty-four concise, self-contained lessons, this book offers experience-based tips, practices, and rules of thumb for learning high-performance TCP/IP programming techniques. Moreover, it shows you how to avoid many of TCP/IP’s most common trouble spots. Effective TCP/IP Programming offers valuable advice on such topics as:

• Exploring IP addressing, subnets, and CIDR
• Preferring the sockets interface over XTI/TLI
• Using two TCP connections
• Making your applications event-driven
• Using one large write instead of multiple small writes
• Avoiding data copying
• Understanding what TCP reliability really means
• Recognizing the effects of buffer sizes
• Using tcpdump, traceroute, netstat, and ping effectively

Numerous examples demonstrate essential ideas and concepts. Skeleton code and a library of common functions allow you to write applications without having to worry about routine chores. Through individual tips and explanations, you will acquire an overall understanding of the inner workings of TCP/IP and the practical knowledge needed to put it to work. Using Effective TCP/IP Programming, you’ll speed through the learning process and quickly achieve the programming capabilities of a seasoned pro. Visit us online at www.awprofessional.com for more information about this book and to read sample chapters.
