Squid: The Definitive Guide
By Duane Wessels

Publisher: O'Reilly
Pub Date: January 2004
ISBN: 0-596-00162-2
Pages: 496

Squid is the most popular Web caching software in use today, and it works on a variety of platforms including Linux, FreeBSD, and Windows. Written by Duane Wessels, the creator of Squid, Squid: The Definitive Guide will help you configure and tune Squid for your particular situation. Newcomers to Squid will learn how to download, compile, and install code. Seasoned users of Squid will be interested in the later chapters, which tackle advanced topics such as high-performance storage options, rewriting requests, HTTP server acceleration, monitoring, debugging, and troubleshooting Squid.
Copyright
Dedication
Preface
    About This Book
    Recommended Reading
    Conventions Used in This Book
    Comments and Questions
    Acknowledgments
Chapter 1. Introduction
    Section 1.1. Web Caching
    Section 1.2. A Brief History of Squid
    Section 1.3. Hardware and Operating System Requirements
    Section 1.4. Squid Is Open Source
    Section 1.5. Squid's Home on the Web
    Section 1.6. Getting Help
    Section 1.7. Getting Started with Squid
    Section 1.8. Exercises
Chapter 2. Getting Squid
    Section 2.1. Versions and Releases
    Section 2.2. Use the Source, Luke
    Section 2.3. Precompiled Binaries
    Section 2.4. Anonymous CVS
    Section 2.5. devel.squid-cache.org
    Section 2.6. Exercises
Chapter 3. Compiling and Installing
    Section 3.1. Before You Start
    Section 3.2. Unpacking the Source
    Section 3.3. Pretuning Your Kernel
    Section 3.4. The configure Script
    Section 3.5. make
    Section 3.6. make Install
    Section 3.7. Applying a Patch
    Section 3.8. Running configure Later
    Section 3.9. Exercises
Chapter 4. Configuration Guide for the Eager
    Section 4.1. The squid.conf Syntax
    Section 4.2. User IDs
    Section 4.3. Port Numbers
    Section 4.4. Log File Pathnames
    Section 4.5. Access Controls
    Section 4.6. Visible Hostname
    Section 4.7. Administrative Contact Information
    Section 4.8. Next Steps
    Section 4.9. Exercises
Chapter 5. Running Squid
    Section 5.1. Squid Command-Line Options
    Section 5.2. Check Your Configuration File for Errors
    Section 5.3. Initializing Cache Directories
    Section 5.4. Testing Squid in a Terminal Window
    Section 5.5. Running Squid as a Daemon Process
    Section 5.6. Boot Scripts
    Section 5.7. A chroot Environment
    Section 5.8. Stopping Squid
    Section 5.9. Reconfiguring a Running Squid Process
    Section 5.10. Rotating the Log Files
    Section 5.11. Exercises
Chapter 6. All About Access Controls
    Section 6.1. Access Control Elements
    Section 6.2. Access Control Rules
    Section 6.3. Common Scenarios
    Section 6.4. Testing Access Controls
    Section 6.5. Exercises
Chapter 7. Disk Cache Basics
    Section 7.1. The cache_dir Directive
    Section 7.2. Disk Space Watermarks
    Section 7.3. Object Size Limits
    Section 7.4. Allocating Objects to Cache Directories
    Section 7.5. Replacement Policies
    Section 7.6. Removing Cached Objects
    Section 7.7. refresh_pattern
    Section 7.8. Exercises
Chapter 8. Advanced Disk Cache Topics
    Section 8.1. Do I Have a Disk I/O Bottleneck?
    Section 8.2. Filesystem Tuning Options
    Section 8.3. Alternative Filesystems
    Section 8.4. The aufs Storage Scheme
    Section 8.5. The diskd Storage Scheme
    Section 8.6. The coss Storage Scheme
    Section 8.7. The null Storage Scheme
    Section 8.8. Which Is Best for Me?
    Section 8.9. Exercises
Chapter 9. Interception Caching
    Section 9.1. How It Works
    Section 9.2. Why (Not) Intercept?
    Section 9.3. The Network Device
    Section 9.4. Operating System Tweaks
    Section 9.5. Configure Squid
    Section 9.6. Debugging Problems
    Section 9.7. Exercises
Chapter 10. Talking to Other Squids
    Section 10.1. Some Terminology
    Section 10.2. Why (Not) Use a Hierarchy?
    Section 10.3. Telling Squid About Your Neighbors
    Section 10.4. Restricting Requests to Neighbors
    Section 10.5. The Network Measurement Database
    Section 10.6. Internet Cache Protocol
    Section 10.7. Cache Digests
    Section 10.8. Hypertext Caching Protocol
    Section 10.9. Cache Array Routing Protocol
    Section 10.10. Putting It All Together
    Section 10.11. How Do I ...
    Section 10.12. Exercises
Chapter 11. Redirectors
    Section 11.1. The Redirector Interface
    Section 11.2. Some Sample Redirectors
    Section 11.3. The Redirector Pool
    Section 11.4. Configuring Squid
    Section 11.5. Popular Redirectors
    Section 11.6. Exercises
Chapter 12. Authentication Helpers
    Section 12.1. Configuring Squid
    Section 12.2. HTTP Basic Authentication
    Section 12.3. HTTP Digest Authentication
    Section 12.4. Microsoft NTLM Authentication
    Section 12.5. External ACLs
    Section 12.6. Exercises
Chapter 13. Log Files
    Section 13.1. cache.log
    Section 13.2. access.log
    Section 13.3. store.log
    Section 13.4. referer.log
    Section 13.5. useragent.log
    Section 13.6. swap.state
    Section 13.7. Rotating the Log Files
    Section 13.8. Privacy and Security
    Section 13.9. Exercises
Chapter 14. Monitoring Squid
    Section 14.1. cache.log Warnings
    Section 14.2. The Cache Manager
    Section 14.3. Using SNMP
    Section 14.4. Exercises
Chapter 15. Server Accelerator Mode
    Section 15.1. Overview
    Section 15.2. Configuring Squid
    Section 15.3. Gee, That Was Confusing!
    Section 15.4. Access Controls
    Section 15.5. Content Negotiation
    Section 15.6. Gotchas
    Section 15.7. Exercises
Chapter 16. Debugging and Troubleshooting
    Section 16.1. Some Common Problems
    Section 16.2. Debugging via cache.log
    Section 16.3. Core Dumps, Assertions, and Stack Traces
    Section 16.4. Replicating Problems
    Section 16.5. Reporting a Bug
    Section 16.6. Exercises
Appendix A. Config File Reference
    http_port
    https_port
    ssl_unclean_shutdown
    icp_port
    htcp_port
    mcast_groups
    udp_incoming_address
    udp_outgoing_address
    cache_peer
    cache_peer_domain
    neighbor_type_domain
    icp_query_timeout
    maximum_icp_query_timeout
    mcast_icp_query_timeout
    dead_peer_timeout
    hierarchy_stoplist
    no_cache
    cache_access_log
    cache_log
    cache_store_log
    cache_swap_log
    emulate_httpd_log
    log_ip_on_direct
    cache_dir
    cache_mem
    cache_swap_low
    cache_swap_high
    maximum_object_size
    minimum_object_size
    maximum_object_size_in_memory
    cache_replacement_policy
    memory_replacement_policy
    store_dir_select_algorithm
    mime_table
    ipcache_size
    ipcache_low
    ipcache_high
    fqdncache_size
    log_mime_hdrs
    useragent_log
    referer_log
    pid_filename
    debug_options
    log_fqdn
    client_netmask
    ftp_user
    ftp_list_width
    ftp_passive
    ftp_sanitycheck
    cache_dns_program
    dns_children
    dns_retransmit_interval
    dns_timeout
    dns_defnames
    dns_nameservers
    hosts_file
    diskd_program
    unlinkd_program
    pinger_program
    redirect_program
    redirect_children
    redirect_rewrites_host_header
    redirector_access
    redirector_bypass
    auth_param
    authenticate_ttl
    authenticate_cache_garbage_interval
    authenticate_ip_ttl
    external_acl_type
    wais_relay_host
    wais_relay_port
    request_header_max_size
    request_body_max_size
    refresh_pattern
    quick_abort_min
    quick_abort_max
    quick_abort_pct
    negative_ttl
    positive_dns_ttl
    negative_dns_ttl
    range_offset_limit
    connect_timeout
    peer_connect_timeout
    read_timeout
    request_timeout
    persistent_request_timeout
    client_lifetime
    half_closed_clients
    pconn_timeout
    ident_timeout
    shutdown_lifetime
    acl
    http_access
    http_reply_access
    icp_access
    miss_access
    cache_peer_access
    ident_lookup_access
    tcp_outgoing_tos
    tcp_outgoing_address
    reply_body_max_size
    cache_mgr
    cache_effective_user
    cache_effective_group
    visible_hostname
    unique_hostname
    hostname_aliases
    announce_period
    announce_host
    announce_file
    announce_port
    httpd_accel_host
    httpd_accel_port
    httpd_accel_single_host
    httpd_accel_with_proxy
    httpd_accel_uses_host_header
    dns_testnames
    logfile_rotate
    append_domain
    tcp_recv_bufsize
    err_html_text
    deny_info
    memory_pools
    memory_pools_limit
    forwarded_for
    log_icp_queries
    icp_hit_stale
    minimum_direct_hops
    minimum_direct_rtt
    cachemgr_passwd
    store_avg_object_size
    store_objects_per_bucket
    client_db
    netdb_low
    netdb_high
    netdb_ping_period
    query_icmp
    test_reachability
    buffered_logs
    reload_into_ims
    always_direct
    never_direct
    header_access
    header_replace
    icon_directory
    error_directory
    maximum_single_addr_tries
    snmp_port
    snmp_access
    snmp_incoming_address
    snmp_outgoing_address
    as_whois_server
    wccp_router
    wccp_version
    wccp_incoming_address
    wccp_outgoing_address
    delay_pools
    delay_class
    delay_access
    delay_parameters
    delay_initial_bucket_level
    incoming_icp_average
    incoming_http_average
    incoming_dns_average
    min_icp_poll_cnt
    min_dns_poll_cnt
    min_http_poll_cnt
    max_open_disk_fds
    offline_mode
    uri_whitespace
    broken_posts
    mcast_miss_addr
    mcast_miss_ttl
    mcast_miss_port
    mcast_miss_encode_key
    nonhierarchical_direct
    prefer_direct
    strip_query_terms
    coredump_dir
    ignore_unknown_nameservers
    digest_generation
    digest_bits_per_entry
    digest_rebuild_period
    digest_rewrite_period
    digest_swapout_chunk_size
    digest_rebuild_chunk_percentage
    chroot
    client_persistent_connections
    server_persistent_connections
    pipeline_prefetch
    extension_methods
    request_entities
    high_response_time_warning
    high_page_fault_warning
    high_memory_warning
    ie_refresh
    vary_ignore_expire
    sleep_after_fork
Appendix B. The Memory Cache
Appendix C. Delay Pools
    Section C.1. Overview
    Section C.2. Configuring Squid
    Section C.3. Examples
    Section C.4. Issues
    Section C.5. Monitoring Delay Pools
Appendix D. Filesystem Performance Benchmarks
    Section D.1. The Benchmark Environment
    Section D.2. General Comments
    Section D.3. Linux
    Section D.4. FreeBSD
    Section D.5. OpenBSD
    Section D.6. NetBSD
    Section D.7. Solaris
    Section D.8. Number of Disk Spindles
Appendix E. Squid on Windows
    Section E.1. Cygwin
    Section E.2. SquidNT
Appendix F. Configuring Squid Clients
    Section F.1. Manually
    Section F.2. Proxy Auto-Configuration
    Section F.3. WPAD
    Section F.4. Summary
Colophon
Index
Copyright © 2004 O'Reilly Media, Inc.

Printed in the United States of America.

Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly & Associates books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporat [email protected].

Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of O'Reilly Media, Inc. Squid: The Definitive Guide, the image of a giant squid and related trade dress are trademarks of O'Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly & Associates was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
Dedication

To my darling Anne. You have no idea.
Preface

    About This Book
    Recommended Reading
    Conventions Used in This Book
    Comments and Questions
    Acknowledgments
About This Book

I started the Squid project eight years ago while working at the National Laboratory for Applied Network Research and the University of California. Back then I certainly enjoyed writing code and fixing bugs but always felt bad about the lack of decent documentation. This book is my attempt to rectify that situation. It's been a long time coming and almost didn't happen. Like they say, "better late than never!"

This book is written for those who are tasked with setting up and maintaining one or more Squid caches. If you're new to Squid, I'll show you how to download, compile, and install the code. Those of you who have been using Squid for a while will be more interested in the later chapters, where I talk about disk cache performance, modifying requests, surrogate mode, caching hierarchies, monitoring Squid, and more.

In order to use this book, you should have a basic knowledge of Unix systems. Many of the book's examples are based on free operating systems, such as Linux, FreeBSD, NetBSD, and OpenBSD. I also have some tips for Solaris users. If you're more comfortable with Windows systems, you can use Squid under a Unix emulator or give the native NT port a try.

Here's an overview of the book's contents:

Chapter 1, Introduction
    This chapter introduces you to Squid and web caching. I give a brief history of the project, and a few notes on our future work. I explain how you can find additional support and information, including a FAQ, on the Squid web site.

Chapter 2, Getting Squid
    In this chapter, I explain how and why you should download Squid's source code. You may prefer to install a precompiled binary or use a preconfigured package. I also talk about staying up to date with Squid using the anonymous CVS server.

Chapter 3, Compiling and Installing
    Assuming you've downloaded the source code, this chapter explains how to configure and compile Squid. In some cases you may need to tune your system before compiling Squid. For example, your kernel may have relatively low file-descriptor limits that affect Squid's performance.

Chapter 4, Configuration Guide for the Eager
    Here, I give a brief introduction to Squid's configuration file. If you are the impatient type and can't wait to start using Squid, this chapter will leave you with a minimal configuration file you can start playing with.

Chapter 5, Running Squid
    In this chapter, I explain how to run Squid for the first time and how to test Squid in a terminal window. Following that, I suggest a number of ways to configure your system so that Squid starts each time it boots. I also explain how to reconfigure Squid while it is running and how to safely shut it down.

Chapter 6, All About Access Controls
    I talk extensively about access controls in this chapter. Squid has a powerful collection of access control features and a number of different rule sets that determine how requests and responses are treated. This is an important chapter because a mistake in your access controls may leave your cache, or even internal systems, vulnerable to abuse from outsiders.

Chapter 7, Disk Cache Basics
    This chapter is about Squid's primary function: storing cached responses on disk. I explain how to configure the disk cache, including replacement policies and freshness controls. I also show you how to manually remove unwanted objects from the cache.

Chapter 8, Advanced Disk Cache Topics
    In this chapter, I explain how to improve the performance of Squid's disk cache. I'll talk about Squid's different storage schemes and a number of filesystem tuning options that may help. If your Squid cache handles a relatively light load, you probably don't need to worry about disk performance.

Chapter 9, Interception Caching
    Here, I explain how to configure Squid for HTTP interception, sometimes also called transparent caching. Actually, configuring Squid is the easy part. The difficulty comes from setting up a router or switch on your network and the host from which Squid is running. I explain how to configure networking equipment from Cisco, Alteon, Foundry, and Extreme. I'll also show you how to configure your operating system (Linux, FreeBSD, NetBSD, OpenBSD, and Solaris) for HTTP interception. Finally, I talk about WCCP.

Chapter 10, Talking to Other Squids
    In this chapter, I cover the ins and outs of cache cooperation, including meshes, arrays, and hierarchies. You may also find it useful if you simply need to forward requests from Squid to another proxy or intermediary. I'll talk about the various intercache protocols supported by Squid (ICP, HTCP, Cache Digests, and CARP) and how Squid chooses the next-hop location for a given cache miss.

Chapter 11, Redirectors
    Redirectors are the best way to make Squid rewrite HTTP requests before forwarding them. I describe the interface between Squid and a redirector program so that you can write your own. I also present a few of the more popular third-party redirectors available.

Chapter 12, Authentication Helpers
    In this chapter, I explain how Squid interfaces with external authentication databases such as LDAP, NT domain controllers, and password files. Squid comes with a number of authentication helpers and understands Basic, Digest, and NTLM authentication credentials. I also document the API for each, in case you want to develop your own helper.

Chapter 13, Log Files
    I cover Squid's various log files in this chapter, including access.log, store.log, cache.log, and others. I explain what each log file contains and how you should periodically maintain them.

Chapter 14, Monitoring Squid
    This chapter provides a lot of information on monitoring Squid's operation. I cover both SNMP and Squid's own cache manager interface. You'll find it useful for both long-term monitoring and short-term problem diagnosis.

Chapter 15, Server Accelerator Mode
    Squid's server accelerator mode is useful in a number of situations. You can use it to boost your origin server's poor performance, as a firewall to protect the server, or even to build your own content delivery network. I show how to set up Squid and make sure that outsiders can't abuse your service.

Chapter 16, Debugging and Troubleshooting
    The book's final chapter explains how to debug and troubleshoot problems with Squid. You may find that some sites, or some user agents, don't work properly with Squid. I show how to isolate and reproduce the problem and how to present the information to Squid developers for assistance.

Appendix A, Config File Reference
    This appendix is a reference guide for each of Squid's 200 configuration file directives. Each has a description, syntax, defaults, and examples.

Appendix B, The Memory Cache
    This brief appendix explains a little about Squid's memory cache.

Appendix C, Delay Pools
    You can use Squid's delay pools feature to limit bandwidth consumed by web surfers. I explain how the delay pools work and provide a number of example configurations.

Appendix D, Filesystem Performance Benchmarks
    In this appendix, I present the results of numerous filesystem benchmarks. These may help you make informed decisions regarding particular operating systems, filesystem features, and Squid's storage techniques.

Appendix E, Squid on Windows
    Have a look at this appendix if you'd like to run Squid on your Windows box. I talk about using Cygwin and about a native port of Squid, called SquidNT.

Appendix F, Configuring Squid Clients
    This appendix contains information on how to configure various user agents to use Squid. I talk about manual configuration, environment variables, Proxy Auto-Configuration functions, and the Web Proxy Auto Discovery protocol.

As I'm finishing up this book, the latest stable version is Squid-2.5.STABLE4, and the development version is Squid-3.0. Perhaps the most important difference between the two is that Squid-3 is being rewritten in C++. You should find that most things are backward-compatible, although a few new configuration directives have been created. Please read the release notes carefully if you use Squid-3.0 or later.

I have created a web site for the book, located at http://squidbook.org/. There, you will find errata, supplemental information, and links to online resources.

Topics Not Covered

Due to a lack of time and space, there are some topics I was unable to cover in this book; they include:

Non-HTTP protocols
    You'll find that I mostly talk about HTTP, even though Squid also supports FTP, Gopher, and some other relatively obscure protocols.

Customizing error messages
    Squid's error messages can be customized and the source distribution includes versions of the error messages in a number of different languages. You can probably figure out how to customize the error messages by modifying the default pages or by reading Squid's source code.

Load balancing Squids
    Load balancing is a popular way to increase the capacity of a caching service. Refer to one of the load balancing books mentioned in the following section if necessary.

What is cachable
    HTTP has a number of somewhat complicated rules for determining what may, or may not be, cached, and for how long. Refer to Web Caching, or HTTP: The Definitive Guide (for more information, see the next section).

Copyright
    A number of nontechnical issues surround web caching. These include copyrights and privacy.

Modifying the source
    I don't go into detail about Squid's source code in this book. The Squid project hosts a programmers' guide, which is generally incomplete and out of date. If you have questions about the source code, please join the squid-dev mailing list.

SOCKS
    Squid doesn't support the SOCKS protocol at this time.
Recommended Reading

While reading this book, you may want to consult some of these other resources for more information (I'll refer to them throughout this book):

● The Design and Implementation of the 4.4 BSD Operating System by Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman (Addison-Wesley Longman)
● DNS and BIND by Paul Albitz and Cricket Liu (O'Reilly & Associates)
● HTTP: The Definitive Guide by David Gourley and Brian Totty (O'Reilly)
● Load Balancing Servers, Firewalls, and Caches by Chandra Kopparapu (John Wiley & Sons)
● Mastering Regular Expressions by Jeffrey E. F. Friedl (O'Reilly)
● Server Load Balancing by Tony Bourke (O'Reilly)
● Unix System Administration Handbook and Linux System Administration Handbook by Evi Nemeth, Garth Snyder, Scott Seebass, and Trent R. Hein (Prentice Hall)
● My book, Web Caching (O'Reilly)
● RFC 1413: Identification Protocol
● RFC 1738: Uniform Resource Locators (URL)
● RFC 2186: Internet Cache Protocol (ICP), Version 2
● RFC 2187: Application of Internet Cache Protocol (ICP), Version 2
● RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax
● RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1
● RFC 2617: HTTP Authentication: Basic and Digest Access Authentication
● RFC 2756: Hypertext Caching Protocol
● RFC 2817: Upgrading to TLS Within HTTP/1.1
● RFC 3040: Internet Web Replication and Caching Taxonomy
● RFC 3143: Known HTTP Proxy/Caching Problems
● Caching-related web sites, such as http://www.caching.com/ and http://www.webcache.com/
Conventions Used in This Book

I use the following typesetting conventions in this book:

Italic
    Used for new terms where they are defined, buttons, pages, configuration file directives, filenames, modules, ACLs, directories, and URI/URLs

Constant width
    Used for configuration file examples, program output, HTTP header names and directives, scripts, options, environment variables, functions, methods, rules, keywords, libraries, and command names

Constant width italic
    Used for replaceable text within examples and code pieces

Constant width bold
    Used to indicate commands to be typed verbatim

When displaying a Unix command, I'll include a shell prompt, like this:

    % ls -l

If the command is specific to the Bourne shell (sh) or C shell (csh), the prompt will indicate which you should use:

    sh$ ulimit -a
    csh% limits

If the command requires super-user privileges, the shell prompt is a hash mark:

    # make install

Occasionally, I provide configuration file examples with long lines. If the line is too wide to fit on the page, it's wrapped around and indented. Squid doesn't accept this sort of syntax, so you must make sure to place everything on one line.

This icon signifies a tip, suggestion, or general note.

This icon indicates a warning or caution.
Comments and Questions

Please address comments and questions concerning this book to the publisher:

    O'Reilly & Associates, Inc.
    1005 Gravenstein Highway North
    Sebastopol, CA 95472
    (800) 998-9938 (in the United States or Canada)
    (707) 829-0515 (international or local)
    (707) 829-0104 (fax)

There is a web page for this book, which lists errata, examples, and any additional information. You can access this page at:

    http://www.oreilly.com/catalog/squid

To comment or ask technical questions about this book, send email to:

    bookquest [email protected]

For more information about books, conferences, Resource Centers, and the O'Reilly Network, check the O'Reilly web site at:

    http://www.oreilly.com

You can contact the author at wessels@packet-pushers.com.
Acknowledgments

Looking back at the events and people that allowed me to write this book makes me feel extremely humble and grateful. I'm so happy to have been a part of the Harvest project with Mike Schwartz, Peter Danzig, and the others. That led directly to my work with kc claffy and Hans-Werner Braun at NLANR/UCSD. The Squid project would have never been at all without their support, and the grant from the National Science Foundation.

I'm also very thankful for all the hard work put in by the small crew of Squid developers: Henrik Nordström, Robert Collins, Adrian Chadd, and everyone else who has contributed time and code to the project. And I'm sorry that you ever had to read and/or fix any ugly code I wrote.

To all the reviewers who read the drafts (Joe Cooper, Scott Pepple, Robert Collins, and Adrian Chadd): thanks for finding my mistakes and suggesting ways to make the book better.

I also owe so much to the people at O'Reilly for making the book possible, and for making it all come together. My editors Tatiana Diaz and Nat Torkington, the production editor Mary Anne Mayo, the graphic designer Melanie Wang, the illustrator, Rob Romano, the XML mungers Andrew Savikas and Joe Wizda, and the countless other folks working behind the scenes for me.

To my good friend, and business partner, Alex Rousskov: thanks for giving me the time and freedom to see this little project through.

Finally, to the members of my new family, Annie and Blooey, thanks for putting up with the late nights. Can I make it up to you with extra back scratches?
Chapter 1. Introduction

This long-overdue book is about Squid: a popular open source caching proxy for the Web. With Squid you can:

● Use less bandwidth on your Internet connection when surfing the Web
● Reduce the amount of time web pages take to load
● Protect the hosts on your internal network by proxying their web traffic
● Collect statistics about web traffic on your network
● Prevent users from visiting inappropriate web sites at work or school
● Ensure that only authorized users can surf the Internet
● Enhance your users' privacy by filtering sensitive information from web requests
● Reduce the load on your own web server(s)
● Convert encrypted (HTTPS) requests on one side, to unencrypted (HTTP) requests on the other

Squid's job is to be both a proxy and a cache. As a proxy, Squid is an intermediary in a web transaction. It accepts a request from a client, processes that request, and then forwards the request to the origin server. The request may be logged, rejected, and even modified before forwarding. As a cache, Squid stores recently retrieved web content for possible reuse later. Subsequent requests for the same content may be served from the cache, rather than contacting the origin server again. You can disable the caching part of Squid if you like, but the proxying part is essential.
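
The two roles described above can be pictured with a few lines of Python. This is only a rough sketch and has nothing to do with Squid's actual source code: the names handle_request and fetch_from_origin, the blocked-host rule, and the choice to strip the Referer header are all assumptions invented for the illustration.

```python
from urllib.parse import urlsplit

# Toy model of a caching proxy's request path (not Squid code).
# Every name and policy below is hypothetical.
BLOCKED_HOSTS = {"blocked.example.com"}   # assumed access policy
cache = {}                                # URL -> response body
access_log = []                           # URLs seen by the proxy
origin_calls = []                         # (url, headers) forwarded to origins


def fetch_from_origin(url, headers):
    """Stand-in for contacting the origin server."""
    origin_calls.append((url, headers))
    return f"body of {url}"


def handle_request(url, headers, caching_enabled=True):
    access_log.append(url)                        # the request may be logged
    if urlsplit(url).hostname in BLOCKED_HOSTS:
        return "403 Forbidden"                    # ... rejected
    # ... or modified before forwarding (here: drop the Referer header)
    forwarded = {k: v for k, v in headers.items() if k.lower() != "referer"}
    if caching_enabled and url in cache:
        return cache[url]                         # reuse stored content
    body = fetch_from_origin(url, forwarded)      # proxy role: forward
    if caching_enabled:
        cache[url] = body                         # cache role: store for later
    return body
```

Calling handle_request twice for the same URL contacts the stand-in origin only once. With caching_enabled=False, every request is forwarded, which mirrors the point that caching is optional but proxying is not.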

Figure 1-1. Squid sits between clients and servers

As Figure 1-1 shows, Squid accepts HTTP (and HTTPS) requests from clients, and speaks a number of protocols to servers. In particular, Squid knows how to talk to HTTP, FTP, and Gopher servers.[1] Conceptually, Squid has two "sides." The client-side talks to web clients (e.g., browsers and user-agents); the server-side talks to HTTP, FTP, and Gopher servers. These are called origin servers, because they are the origin location for the data they serve.

[1] Gopher servers are quite rare these days. Squid also knows about WAIS and whois, but these are even more obscure.

Note that Squid's client-side understands only HTTP (and HTTP encrypted with SSL/TLS). This means, for example, that you can't make an FTP client talk to Squid (unless the FTP client is also an HTTP client). Furthermore, Squid can't proxy protocols for email (SMTP), instant messaging, or Internet Relay Chat.
1.1 Web Caching

Web caching refers to the act of storing certain web resources (i.e., pages and other data files) for possible future reuse. For example, Matilda is the first person in the office each morning, and she likes to read the local newspaper online with her wake-up coffee. As she visits the various sections, the Squid cache on their office network stores the HTML pages and JPEG images. Harry comes in a short while later and also reads the newspaper online. For him, the site loads much faster because much of the content is served from Squid. Additionally, Harry's browsing doesn't waste the bandwidth of the company's DSL line by transferring the exact same data as when Matilda viewed the site.

A cache hit occurs each time Squid satisfies an HTTP request from its cache. The cache hit ratio, or cache hit rate, is the percentage of all requests satisfied as hits. Web caches typically achieve hit ratios between 30% and 60%. A similar metric, the byte hit ratio, is the percentage of the total volume of data (i.e., number of bytes) that is served from the cache.

A cache miss occurs when Squid can't satisfy a request from the cache. A miss can happen for any number of reasons. Obviously, the first time Squid receives a request for a particular resource, it is a cache miss. Similarly, Squid may have purged the cached copy to make room for new objects. Another possibility is that the resource is uncachable. Origin servers can instruct caches on how to treat the response. For example, they can say that the data must never be cached, can be reused only within a certain amount of time, and so on. Squid also uses a few internal heuristics to determine what should, or should not, be saved for future use.

Cache validation is a process that ensures Squid doesn't serve stale data to the user. Before reusing a cached response, Squid often validates it with the origin server. If the server indicates that Squid's copy is still valid, the data is sent from Squid. Otherwise, Squid updates its cached copy as it relays the response to the client. Squid generally performs validation using timestamps. The origin server's response usually contains a last-modified timestamp. Squid sends the timestamp back to the origin server to find out whether the original resource has changed.

For a detailed treatment of web caching, have a look at my book Web Caching, also by O'Reilly.
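To make the two hit-ratio definitions in this section concrete, here is a minimal sketch in Python. The request list is invented for illustration, and the function name hit_ratios is hypothetical; Squid itself reports these figures through its own statistics, not through code like this.

```python
# Hypothetical request log: (was_cache_hit, response_size_in_bytes).
# The numbers are made up purely to illustrate the two metrics.
requests = [
    (True, 12_000),
    (False, 48_000),
    (True, 3_500),
    (False, 250_000),
    (True, 36_500),
]


def hit_ratios(entries):
    """Return (cache hit ratio, byte hit ratio) as percentages.

    Assumes a non-empty list with a non-zero total byte count.
    """
    total_requests = len(entries)
    total_bytes = sum(size for _, size in entries)
    hit_requests = sum(1 for hit, _ in entries if hit)
    hit_bytes = sum(size for hit, size in entries if hit)
    return (100.0 * hit_requests / total_requests,
            100.0 * hit_bytes / total_bytes)


hit_ratio, byte_hit_ratio = hit_ratios(requests)
print(f"hit ratio: {hit_ratio:.1f}%  byte hit ratio: {byte_hit_ratio:.1f}%")
```

In this made-up sample the byte hit ratio is much lower than the hit ratio because the two misses happen to be the largest responses, which is why the two metrics are worth tracking separately.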
1.2 A Brief History of Squid

In the beginning was the CERN HTTP server. In addition to functioning as an HTTP server, it was also the first caching proxy. The caching module was written by Ari Luotonen in 1994.

That same year, the Internet Research Task Force Group on Resource Discovery (IRTF-RD) started the Harvest project. It was "an integrated set of tools to gather, extract, organize, search, cache, and replicate" Internet information. I joined the Harvest project near the end of 1994. While most people used Harvest as a local (or distributed) search engine, the Object Cache component was quite popular as well. The Harvest cache boasted three major improvements over the CERN cache: faster use of the filesystem, a single process design, and caching hierarchies via the Internet Cache Protocol.

Towards the end of 1995, many Harvest team members made the move to the exciting world of Internet-based startup companies. The original authors of the Harvest cache code, Peter Danzig and Anawat Chankhunthod, turned it into a commercial product. Their company was later acquired by Network Appliance.

In early 1996, I joined the National Laboratory for Applied Network Research (NLANR) to work on the Information Resource Caching (IRCache) project, funded by the National Science Foundation. Under this project, we took the Harvest cache code, renamed it Squid, and released it under the GNU General Public License.

Since that time Squid has grown in size and features. It now supports a number of cool things such as URL redirection, traffic shaping, sophisticated access controls, numerous authentication modules, advanced disk storage options, HTTP interception, and surrogate mode (a.k.a. HTTP server acceleration).

Funding for the IRCache project ended in July 2000. Today, a number of volunteers continue to develop and support Squid. We occasionally receive financial or other types of support from companies that benefit from Squid.

Looking towards the future, we are rewriting Squid in C++ and, at the same time, fixing a number of design issues in the older code that stand in the way of new features. We are adding support for protocols such as Edge Side Includes (ESI) and Internet Content Adaptation Protocol (ICAP). We also plan to make Squid support IPv6. A few developers are constantly making Squid run better on Microsoft Windows platforms. Finally, we will add more and more HTTP/1.1 features and work towards full compliance with the latest protocol specification.
1.3 Hardware and Operating System Requirements

Squid runs on all popular Unix systems, as well as Microsoft Windows. Although Squid's Windows support is improving all the time, you may have an easier time with Unix. If you have a favorite operating system, I'd suggest using that one. Otherwise, if you're looking for a recommendation, I really like FreeBSD.

Squid's hardware requirements are generally modest. Memory is often the most important resource. A memory shortage causes a drastic degradation in performance. Disk space is, naturally, another important factor. More disk space means more cached objects and higher hit ratios. Fast disks and interfaces are also beneficial. SCSI performs better than ATA, if you can justify the higher costs. While fast CPUs are nice, they aren't critical to good performance.

Because Squid uses a small amount of memory for every cached response, there is a relationship between disk space and memory requirements. As a rule of thumb, you need 32 MB of memory for each GB of disk space. Thus, a system with 512 MB of RAM can support a 16-GB disk cache. Your mileage may vary, of course. Memory requirements depend on factors such as the mean object size, CPU architecture (32- or 64-bit), the number of concurrent users, and particular features that you use.

People often ask such questions as, "I have a network with X users. What kind of hardware do I need for Squid?" These questions are difficult to answer for a number of reasons. In particular, it's hard to say how much traffic X users will generate. I usually find it easier to look at bandwidth usage, and go from there. I tell people to build a system with enough disk space to hold 3-7 days' worth of web traffic. For example, if your users consume 1 Mbps (HTTP and FTP traffic only) for 8 hours per day, that's about 3.5 GB per day. So, I'd say you want between 10 and 25 GB of disk space for each Mbps of web traffic.
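The sizing arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope calculator under stated assumptions: decimal units (1 Mbps = 1,000,000 bits per second, 1 GB = 10^9 bytes), and hypothetical function names chosen for this example. With those units, 1 Mbps for 8 hours works out to 3.6 GB, in line with the "about 3.5 GB" figure in the text.

```python
def daily_traffic_gb(mbps, hours_per_day):
    """Approximate traffic volume per day, in decimal gigabytes."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * hours_per_day * 3600 / 1e9


def disk_cache_range_gb(mbps, hours_per_day, min_days=3, max_days=7):
    """Disk space needed to hold 3-7 days' worth of traffic."""
    per_day = daily_traffic_gb(mbps, hours_per_day)
    return per_day * min_days, per_day * max_days


def memory_needed_mb(disk_gb, mb_per_gb=32):
    """Rule of thumb from the text: 32 MB of memory per GB of disk cache."""
    return disk_gb * mb_per_gb


per_day = daily_traffic_gb(1, 8)
low, high = disk_cache_range_gb(1, 8)
print(f"traffic per day: {per_day:.1f} GB")
print(f"disk cache: {low:.1f} to {high:.1f} GB")
print(f"memory for a 16 GB cache: {memory_needed_mb(16)} MB")
```

Treat the results as a starting point only; as noted above, actual memory needs depend on mean object size, CPU architecture, concurrency, and the features you enable.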
1.4 Squid Is Open Source

Squid is free software and a collaborative project. If you find Squid useful, please consider contributing back to the project in one or more of the following ways:

● Participate on the squid-users discussion list. Answer questions and help out new users.
● Try out new versions and report bugs or other problems.
● Contribute to the online documentation and Frequently Asked Questions (FAQ). If you notice an inconsistency, report it to the maintainers.
● Submit your local modifications back to the developers for inclusion into the code base.
● Provide financial support to one or more developers through small development contracts.
● Tell the developers about features you would like to have.
● Tell your friends and colleagues that Squid is cool.

Squid is released as free software under the GNU General Public License. This means, for example, that anyone who distributes Squid must make the source code available to you. See http://www.gnu.org/licenses/gpl-faq.html for more information about the GPL.
1.5 Squid's Home on the Web

The main source for up-to-date information about Squid is http://www.squid-cache.org. There you can:

● Download the source code.
● Read the FAQ and other documentation.
● Subscribe to the mailing list, or read the archives.
● Contact the developers.
● Find links to third-party applications.
● And more!
1.6 Getting Help

Given that Squid is free software, you may need to rely on the kindness of strangers for occasional assistance. The best place to do this is the squid-users mailing list. Before posting a message to the mailing list, however, you should check Squid's FAQ document to see if your question has already been asked and answered. If neither resource provides the help you need, you can contact one of the many services offering professional support for Squid.

1.6.1 Frequently Asked Questions

Squid's FAQ document, located at http://www.squid-cache.org/Doc/FAQ/FAQ.html, is a good source of information for new users. The FAQ evolves over time, so it will contain entries written after this book. The FAQ also contains some historical information that may be irrelevant today.

Even so, the FAQ is one of the first places you should look for answers to your questions. This is especially true if you are a new user. While it is certainly less effort for you to simply write to the mailing list for help, veteran mailing list members grow tired of reading and answering the same questions. If your question is frequently asked, it may simply be ignored.

The FAQ is quite large. The HTML version exists as approximately 25 different chapters, each in a separate file. These can be difficult to search for keywords and awkward to print. You can also download PostScript, PDF, and text versions by following links at the top of the HTML version.

1.6.2 Mailing Lists

Squid has three mailing lists you might find useful. I explain how to become a subscriber below, but you may want to check Squid's mailing list page, http://www.squid-cache.org/mailing-lists.html, for possibly more up-to-date information.
1.6.2.1 squid-users

The squid-users mailing list is an excellent place to find answers for such questions as:

● How do I ... ?
● Is this a bug ... ?
● Does this feature/program work on my platform?
● What does this error message mean?

Note that you must subscribe before you can post a message. To subscribe to the squid-users list, send a message to squid-users-subscribe@squid-cache.org. If you prefer, you can receive the digest version of the list. In this case, you'll receive multiple postings in a single email message. To sign up this way, send a message to squid-users-digest-subscribe@squid-cache.org.

Once you subscribe, you can post a message to the list by writing to squid-users@squid-cache.org. If you have a question, consider checking the FAQ and/or mailing list archives first. You can browse the list archive by visiting http://www.squid-cache.org/mail-archive/squid-users/. However, if you are looking for something specific, you'll probably have more luck with the search interface at http://www.squid-cache.org/search/.
1.6.2.2 squid-announce

The moderated squid-announce list is used to announce new Squid versions and important security updates. The volume is quite low, usually less than one message per month. Write to squid-announce-subscribe@squid-cache.org if you'd like to subscribe.

1.6.2.3 squid-dev

The squid-dev list is a place where Squid hackers and developers can exchange ideas and information. Anyone can post a message to squid-dev, but subscriptions are moderated. If you'd like to join the discussion, please send a message about yourself and your interests in Squid. One of the list members should subscribe you within a few days. The squid-dev messages are archived at http://www.squid-cache.org/mail-archive/squid-dev/, where anyone may browse them.

1.6.3 Professional Support

A number of companies now offer professional assistance for Squid. They may be able to help you get started with Squid for the first time, recommend a configuration for your network environment, and even fix some bugs.

Some of the consulting companies are associated with core Squid developers. By giving them your business, you ensure that fixes and features will be committed to future Squid software releases. If necessary, you can also arrange for development of private features.

Visit http://www.squid-cache.org/Support/services.html for the list of professional support services.
1.7 Getting Started with Squid

If you are new to Squid, the next few chapters will help you get started. First, I'll show you how to get the code, either the original source or precompiled binaries. In Chapter 3, I go through the steps necessary to compile and install Squid on your Unix system; this chapter is important because you'll probably need to tune your system before compiling the source code. Chapter 4 provides a very brief introduction to Squid's configuration file. Finally, Chapter 5 explains how to run Squid.

If you've already had a little experience installing and running Squid, you may want to skip ahead to Chapter 6.
1.8 Exercises

● Visit the Squid site and locate the squid-users mailing list archive. Browse the messages for the past few weeks.
● Search the Squid FAQ for information about file descriptors.
● Check one of the Squid mirror sites. Is it up to date with the primary site?
Chapter 2. Getting Squid

Squid is normally distributed as source code. This means you'll probably need to compile it, as described in Chapter 3. The installation process should be relatively painless. The developers put a lot of effort into making sure Squid compiles easily on all the popular operating systems.

You can also find precompiled binaries for some operating systems. Linux users can get Squid in one of the various package formats (e.g., RPM or Debian). The FreeBSD, NetBSD, and OpenBSD projects offer Squid ports. The BSD ports aren't binary distributions but rather a small set of files that know how to download, compile, and install the Squid source. While these precompiled or preconfigured packages may be easier to install, I recommend that you download and compile the source yourself.

Anonymous CVS is a great way for developers and users to stay current with the official source tree. Instead of downloading entire new releases, you run a command to retrieve only the parts that have changed since your last update.
2.1 Versions and Releases

The Squid developers make periodic releases of the source code. Each release has a version number, such as 2.5.STABLE4. The third component starts either with STABLE or DEVEL (short for development).

As you can probably guess, the DEVEL releases tend to have newer, experimental features. They are also more likely to have bugs. Inexperienced users should not run DEVEL releases. If you choose to try a DEVEL release, and you encounter problems, please report them to the Squid maintainers.

After spending some time in the development state, the version number changes to STABLE. These releases are suitable for all users. Of course, even the stable releases may have some bugs. The higher-numbered stable versions (e.g., STABLE3, STABLE4) are likely to have fewer bugs. If you are really concerned about stability, you may want to wait for one of these later releases.
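The version-number scheme described above is regular enough to parse mechanically. The following Python sketch is an illustration only: the regular expression and the function names are assumptions based on the 2.5.STABLE4 example in the text, and the DEVEL version string used in the usage lines is made up.

```python
import re

# Pattern assumed from the example "2.5.STABLE4": major.minor.STATEn,
# where STATE is either STABLE or DEVEL.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(STABLE|DEVEL)(\d+)$")


def parse_squid_version(version):
    """Split a version string into (major, minor, state, number)."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, state, number = match.groups()
    return int(major), int(minor), state, int(number)


def is_stable(version):
    """True if the third component starts with STABLE."""
    return parse_squid_version(version)[2] == "STABLE"


print(parse_squid_version("2.5.STABLE4"))
print(is_stable("2.5.STABLE4"))
print(is_stable("2.6.DEVEL1"))  # hypothetical DEVEL version string
```

A script like this could, for example, refuse to deploy a DEVEL build onto a production cache.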
2.2 Use the Source, Luke

So why can't you just copy a precompiled binary to your system and expect it to work perfectly? The primary reason is that the code needs to know about certain operating system parameters. In particular, the most important parameter is the maximum number of open file descriptors. Squid's ./configure script (see Section 3.4) probes for these values before compiling. If you take a Squid binary built for one value and run it on a system with a different value, you may encounter problems.

Another reason is that many of Squid's features must be enabled at compile time. If you take a binary that somebody else compiled, and it doesn't include the code for the features that you want, you'll need to compile your own version anyway.

Finally, note that shared libraries sometimes make it difficult to share executable files between systems. Shared libraries are loaded at runtime. This is also known as dynamic linking. Squid's ./configure script probes your system to find out certain things about your C library functions (if they are present, if they work, etc.). Although library functions don't usually change, it is possible that two different systems have slightly different shared C libraries. This may become a problem for Squid if the two systems are different enough.

Getting the Squid source code is really quite easy. To get it, visit the Squid home page, http://www.squid-cache.org/. The home page has links to the current stable and development releases. If you aren't located in the United States, you can select one of the many mirror sites. The mirror sites are usually named "wwwN.CC.squid-cache.org," where N is a number and CC is a two-letter country code. For example, www1.au.squid-cache.org is an Australian mirror site. The home page has links to the current mirror sites.

Each Squid release branch (e.g., Squid-2.5) has its own HTML page. This page has links to the source code releases and "diffs" between releases. If you are upgrading from one release to the next, you may want to download the diff file and apply the patch as described in Section 3.7. The release pages describe the new features and important changes in each version, and also have links to bugs that have been fixed.

When web access isn't an option, you can get the source release from the ftp://ftp.squid-cache.org FTP server or one of the FTP mirror sites. For the current versions, look in the pub/squid-2/DEVEL or pub/squid-2/STABLE directories. The Squid FTP site is mirrored at many locations as well. You can use the same country-code trick to guess some mirror sites, such as ftp1.uk.squid-cache.org.

The current Squid release distributions are about 1 MB in size. After downloading the compressed tar file, you can proceed to Chapter 3.
2.3 Precompiled Binaries

Some Unix distributions include, or make available, precompiled Squid packages. For Linux, you can easily find Squid RPMs. Often the Squid RPM is included on Linux CD-ROMs you can buy. The FreeBSD/NetBSD/OpenBSD distributions also contain Squid in their ports and/or packages collections.

While RPMs and precompiled packages may initially save you some time, they also have some drawbacks. As I already mentioned, certain features must be enabled or disabled before you start compiling Squid. The precompiled package that you install may not have the particular feature you want. Furthermore, Squid's ./configure script probes your operating system for certain parameters. These parameters may be configured differently on your machine than on the machine on which Squid was compiled. Finally, if you want to apply a patch to Squid, you'll either have to wait for someone to build a new RPM/package or get the source and do it yourself.

I strongly encourage you to compile Squid from the source, but the decision is yours to make.
2.4 Anonymous CVS

The Concurrent Versions System (CVS) is a nifty package that allows multiple people to simultaneously edit and manage source code and other files. Almost every open source software project uses CVS. You can anonymously access Squid's CVS files (read-only) to keep your source code up to date.

The nice thing about CVS is that you can easily retrieve only the changes (diffs) relative to your current version. Thus, it is easy to see what has changed recently. Applying the changes to your current files efficiently synchronizes your source code with the official version.

CVS uses a tree-like indexing system. The trunk of the tree is called the head branch. For Squid's repository, this is where all new changes and features are placed. The head branch usually contains experimental and possibly unstable code. The stable code is typically found on other branches.

To effectively use Squid's anonymous CVS server, you first need to understand how different versions and branches are tagged. For example, the Version 2.5 branch is named SQUID_2_5. Particular releases, which represent a snapshot in time, have longer names, such as SQUID_2_5_STABLE4. To get exactly Squid Version 2.5.STABLE4, use the SQUID_2_5_STABLE4 tag; to get the latest code on the 2.5 branch, use SQUID_2_5.

To use the Squid anonymous CVS server, you first need to set the CVSROOT environment variable:

    csh% setenv CVSROOT :pserver:[email protected]:/squid

Or, for Bourne shell users:

    sh$ CVSROOT=:pserver:[email protected]:/squid
    sh$ export CVSROOT

You then log in to the server:

    % cvs login
    (Logging in to [email protected])
    CVS password:

At the prompt, enter anoncvs for the password. Now you can check out the source tree with this command:

    % cvs checkout -r SQUID_2_5 -d squid-2.5 squid

The -r option specifies the revision tag to retrieve. Omitting the -r option gets you the head branch. The -d option changes the top-level directory name in which files are placed. If you omit the -d option, the top-level directory is the same as the module name. The final command-line argument (squid) is the name of the module to check out.

Once you have the Squid source tree checked out, you can run the cvs update command to update your files and synchronize with the master repository. Additional interesting commands are cvs diff, cvs log, and cvs annotate.

To learn more about CVS, visit http://www.cvshome.org/.
2.5 devel.squid-cache.org

The Squid developers maintain a separate site, currently hosted at SourceForge, for experimental Squid features. Check it out at http://devel.squid-cache.org/. There you'll find a number of cutting-edge development projects that haven't yet been integrated into the official Squid code base. You can access these projects through SourceForge's anonymous CVS server or download diff files based on the standard releases.
2.6 Exercises

● Visit the Squid web site or FTP server and look at the recent stable and development releases. How often are new releases made?
● Download the most recent stable code.
● Use Squid's anonymous CVS server to check out the recent stable branch. Change one of the source files by inserting a blank line, then run cvs diff.
Chapter 3. Compiling and Installing

Squid is designed to be portable and should compile on all major Unix systems, including Linux, BSD/OS, FreeBSD, NetBSD, OpenBSD, Solaris, HP-UX, OSF/DUNIX/TRU-64, Mac OS X, IRIX, and AIX. Squid also runs on Microsoft Windows. Please see Appendix E for instructions on compiling and running Squid on Windows.

Compiling Squid is relatively straightforward. If you've installed more than a few open source packages, you're probably already familiar with the procedure. You first use a program called ./configure to probe your system and then a program called make to do the actual compiling.

Before getting to that step, however, let's talk about tuning your system in preparation for Squid. Your operating system may have default resource limits that are too low for Squid to run correctly. Most importantly, you need to worry about the number of available file descriptors.
3.1 Before You Start

If you've been using Unix for a while, chances are that you've already compiled a number of other software packages. If so, you can probably quickly scan this chapter. The procedure for compiling and installing Squid is similar to many other software distributions.

To compile Squid, you need an ANSI C compiler. Don't be too alarmed by the "ANSI" part. Chances are that if you already have a C compiler, it is compliant with the ANSI specification. The GNU C compiler (gcc) is an excellent choice and widely available. Most operating systems come with a C compiler as a part of the standard installation. The common exceptions are Solaris and HP-UX. If you're using one of those operating systems, you might not have a compiler installed.

Ideally you should compile Squid on the same system on which it will run. Part of the installation process probes your system for certain parameters, such as the number of available file descriptors. However, if your system doesn't have a C compiler, you may be able to compile Squid elsewhere and then copy the binaries back. If the operating systems are different, Squid may encounter some problems. Also, Squid may become confused if the two systems have different kernel configurations.

In addition to a C compiler, you'll also need Perl and awk. awk is a standard program on all Unix systems, so you shouldn't need to worry about it. Perl is quite common, but it may not be installed on your system by default. You may need the gzip program to uncompress the source distribution file.

Solaris users, make sure that /usr/ccs/bin is in your PATH, even if you're using gcc. To compile Squid, you may need the make and ar programs found in that directory.
3.2 Unpacking the Source

After downloading the source distribution, you need to unpack it somewhere. The particular location doesn't really matter. You can unpack Squid in your home directory or anywhere; you'll need about 20 MB of free disk space. Personally, I like to use /tmp. Use the tar command to extract the source directory:

    % cd /tmp
    % tar xzvf /some/where/squid-2.5.STABLE4-src.tar.gz
    squid-2.5.STABLE4/
    squid-2.5.STABLE4/CONTRIBUTORS
    squid-2.5.STABLE4/COPYING
    squid-2.5.STABLE4/COPYRIGHT
    squid-2.5.STABLE4/CREDITS
    squid-2.5.STABLE4/ChangeLog
    squid-2.5.STABLE4/INSTALL
    squid-2.5.STABLE4/QUICKSTART
    squid-2.5.STABLE4/README
    ...

Some tar programs don't have the z option, which automatically uncompresses gzip files. In that case, you'll need to use this command, in which the trailing - tells tar to read the archive from standard input:

    % gzip -dc /some/where/squid-2.5.STABLE4-src.tar.gz | tar xvf -

Once the source code has been unpacked, the next step is usually to configure the source tree. However, if this is the first time you're compiling Squid, you should make sure certain kernel resource limits are high enough; to find out how, read on.
3.3 Pretuning Your Kernel

Squid requires a fair amount of kernel resources under moderate and high loads. In particular, you may need to configure your system with a higher-than-normal number of file descriptors and mbuf clusters. The file-descriptor limit can be especially annoying. You'd be better off increasing the limit before compiling Squid.

At this point, you might be tempted to get the precompiled binaries to avoid the hassle of building a new kernel. [1] Unfortunately, you need to make a new kernel, regardless. Squid and the kernel exchange information through data structures that must not exceed the set file-descriptor limits. Squid checks these limits at runtime and uses the safest (smallest) value. Thus, even if a precompiled binary was built with a higher file-descriptor limit than the kernel allows, the kernel value takes precedence.

[1] Not all operating systems require building a new kernel. Some may be tunable at runtime.

To change some settings, you must build and install a new kernel. This procedure varies among different operating systems. Consult Unix System Administration Handbook (Prentice Hall) or your operating-system documentation if necessary. If you're using Linux, you probably don't need to recompile your kernel.
3.3.1 File Descriptors

File descriptors are simply integers that identify each file and socket that a process has opened. The first opened file is 0, the second is 1, and so on. Unix operating systems usually impose a limit on the number of file descriptors that each process can open. Furthermore, Unix also normally has a systemwide limit.

Because of the way Squid works, the file-descriptor limits may adversely affect performance. When Squid uses up all the available file descriptors, it is unable to accept new connections from users. In other words, running out of file descriptors causes denial of service. Squid can't accept new requests until some of the current requests complete, and the corresponding files and sockets are closed. Squid issues a warning when it detects a file-descriptor shortage.

You can save yourself some trouble by making sure the file-descriptor limits are appropriate before running ./configure. In most cases, 1024 file descriptors will be sufficient. Very busy caches may require 4096 or more. When configuring file-descriptor limits, I recommend setting the systemwide limit to twice the per-process limit.

You can usually discover your system's file-descriptor limit from your Unix shell. All C shells and similar have the built-in limit command. Newer Bourne shells and similar have a command called ulimit. To find your file-descriptor limits, try running these commands:

    csh% limit descriptors unlimited
    csh% limit descriptors
    descriptors     4096

or:

    sh$ ulimit -n unlimited
    sh$ ulimit -n
    4096

On FreeBSD, you can also use the sysctl command:

    % sysctl -a | grep maxfiles
    kern.maxfiles: 8192
    kern.maxfilesperproc: 4096

If you can't figure out the file-descriptor limit, Squid's ./configure script can do it for you. When you run ./configure, as described in Section 3.4, watch for output like this near the end:

    checking Maximum number of file descriptors we can open... 4096

If limit, ulimit, or ./configure reports a value less than 1024, you should invest the time to increase the limit before compiling Squid. Otherwise, Squid's performance will be poor under a moderate load.

The procedure for increasing the file-descriptor limit varies from system to system. The following sections offer some tips to help get you started.
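If you would rather check the per-process limit from a script than from an interactive shell, the following Python sketch shows one way to do it on Unix-like systems using the standard resource module. It is not part of Squid; the function names are hypothetical, and the 1024 threshold is simply the minimum suggested in the text.

```python
import resource  # standard library, available on Unix-like systems only

RECOMMENDED_MINIMUM = 1024  # minimum suggested in the text


def fd_limits():
    """Return the (soft, hard) per-process file-descriptor limits."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit the way the shell does, as a number or 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = fd_limits()
print(f"soft limit: {describe(soft)}, hard limit: {describe(hard)}")

if soft != resource.RLIM_INFINITY and soft < RECOMMENDED_MINIMUM:
    print("soft limit is below 1024; consider raising it before compiling")
```

The soft limit is what a process actually gets by default; the hard limit is the ceiling up to which an unprivileged process may raise it. This reports only the per-process limits, not the systemwide one.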
3.3.1.1 FreeBSD, NetBSD, OpenBSD

Edit your kernel configuration file, and add a line like this:

    options         MAXFILES=8192

On OpenBSD, use option instead of options. Then, configure, compile, and install the new kernel. Reboot your system so the change takes effect.
3.3.1.2 Linux

Configuring file descriptors on Linux is a little complicated. You must edit one of the system include files, and execute some shell commands before compiling and running Squid. Start off by editing the file /usr/include/bits/types.h. Change the value for __FD_SETSIZE as follows:

    #define __FD_SETSIZE    8192

Next, increase the kernel file-descriptor limit with this command:

    # echo 8192 > /proc/sys/fs/file-max

Finally, increase the process file-descriptor limit in the same shell in which you will configure and compile Squid:

    sh# ulimit -Hn 8192

This command must be executed as root and only works from the bash shell. There is no need to reboot on Linux.

With this technique, you must execute the echo and ulimit commands each time your system boots, or at least before starting Squid. If you use an rc.d script to start Squid (see Section 5.6.2), that is a good place to stick these commands.
3.3.1.3 Solaris

Add this line to your /etc/system file:

    set rlim_fd_max = 4096

Then, reboot the system for the change to take effect.
3.3.2 Mbuf Clusters The BSD- based net working code uses a dat a st ruct ure known as an m buf ( see W.R.St evens' book, TCP/ I P I llust rat ed, Vol 2) . Mbufs are t ypically sm all ( e.g., 128 oct et s) chunks of m em ory. The dat a for larger net work packet s are st ored in m buf clust ers. The kernel m ay enforce an upper lim it on t he t ot al num ber of m buf clust ers available in t he syst em . You can find t his lim it wit h t he net st at com m and: % netstat -m 196/6368/32768 mbufs in use (current/peak/max): 146 mbufs allocated to data 50 mbufs allocated to packet headers 103/6182/8192 mbuf clusters in use (current/peak/max) 13956 Kbytes allocated to network (56% of mb_map in use) 0 requests for memory denied 0 requests for memory delayed
0 calls to protocol drain routines I n t his exam ple, t here are 8,192 m buf clust ers available, but t here are never m ore t han 6,182 used at once. When t he syst em runs out of m buf clust ers, I / O rout ines such as read( ) and writ e ( ) ret urn t he " No buffer space available" error m essage. Net BSD and OpenBSD don't display m buf usage in net st at - m out put . I nst ead, t hey report " WARNI NG: m clpool lim it reached" via syslog. To increase t he num ber of m buf clust ers, you need t o add an opt ion t o your kernel configurat ion file: options
NMBCLUSTERS=16384
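To judge how close a system comes to the cluster limit, you can turn the current/peak/max triple into a percentage. This is only a sketch: the function name mbuf_cluster_peak_pct is hypothetical, and it assumes the "mbuf clusters in use" line has exactly the format shown in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read "netstat -m" output on stdin and print the
# peak mbuf cluster usage as a whole-number percentage of the maximum.
# Assumes the "current/peak/max mbuf clusters in use" line format.
mbuf_cluster_peak_pct() {
    awk '/mbuf clusters in use/ {
        split($1, n, "/")              # n[1]=current, n[2]=peak, n[3]=max
        printf "%d\n", n[2] * 100 / n[3]
    }'
}

# Example using the line from the sample output; prints 75.
echo "103/6182/8192 mbuf clusters in use (current/peak/max)" | mbuf_cluster_peak_pct
```

On a system whose netstat prints this line, the equivalent live check would be netstat -m | mbuf_cluster_peak_pct. A peak that approaches 100 suggests raising NMBCLUSTERS.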
3.3.3 Ephemeral Port Range

Ephemeral ports are the local port numbers the TCP/IP stack assigns to outgoing connections. In other words, when Squid makes a connection to an origin server, the kernel assigns a port number to the local socket. These local port numbers fall within a certain range. On FreeBSD, for example, the default ephemeral port range is 1024-5000.

A shortage of ephemeral ports may adversely affect performance for very busy proxies (i.e., hundreds of requests per second). This is because some TCP connections enter a TIME_WAIT state when they are closed. An ephemeral port number can't be reused while the connection is in the TIME_WAIT state.

You can see how many connections are in this state with the netstat command:

    % netstat -n | grep TIME_WAIT
    Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
    tcp4       0      0  192.43.244.42.19583    212.67.202.80.80       TIME_WAIT
    tcp4       0      0  192.43.244.42.19597    202.158.66.190.80      TIME_WAIT
    tcp4       0      0  192.43.244.42.19600    207.99.19.230.80       TIME_WAIT
    tcp4       0      0  192.43.244.42.19601    216.131.72.121.80      TIME_WAIT
    tcp4       0      0  192.43.244.42.19602    209.61.183.115.80      TIME_WAIT
    tcp4       0      0  192.43.244.42.3128     128.109.131.47.25666   TIME_WAIT
    tcp4       0      0  192.43.244.42.3128     128.109.131.47.25795   TIME_WAIT
    tcp4       0      0  192.43.244.42.3128     128.182.72.190.1488    TIME_WAIT
    tcp4       0      0  192.43.244.42.3128     128.182.72.190.2194    TIME_WAIT
Note that this example has both client- and server-side connections. Client-side connections have 3128 as the local port number; server-side connections have 80 as the remote (foreign) port number. The ephemeral port numbers appear under the Local Address heading. In this example, they are in the 19,000s.

Unless you see thousands of ephemeral ports in the TIME_WAIT state, you probably don't need to increase the range. On FreeBSD, you can increase the range with this command:

    # sysctl -w net.inet.ip.portrange.last=30000

On OpenBSD, the command is almost the same, but the sysctl variable has a different name:

    # sysctl -w net.inet.ip.portlast=49151

On NetBSD, things work a little differently. The default range is 49,152-65,535. To increase the range, change the lower limit:

    # sysctl -w net.inet.ip.anonportmin=10000

On Linux, simply write a pair of numbers to the following special file:

    # echo "1024 40000" > /proc/sys/net/ipv4/ip_local_port_range

Don't forget to add these commands to your system startup scripts so that they take effect each time your machine reboots.
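Rather than counting TIME_WAIT lines by eye, you can split them into client-side and server-side connections. The sketch below is an illustration only: the function name count_time_wait is hypothetical, it assumes the local address is the fourth column as in the listing above, and it assumes Squid listens on port 3128 unless told otherwise.

```shell
#!/bin/sh
# Hypothetical helper: read "netstat -n" output on stdin and count
# TIME_WAIT connections by side. $1 is Squid's listening port
# (assumed to be 3128 by default). The local address is taken from
# the fourth column; the port is the part after the last "." or ":".
count_time_wait() {
    awk -v port="${1:-3128}" '
        $NF == "TIME_WAIT" {
            n = split($4, part, "[.:]")
            if (part[n] == port) client++; else server++
        }
        END { printf "client=%d server=%d\n", client, server }'
}

# Small demonstration with two lines in the format shown above.
printf '%s\n' \
    'tcp4 0 0 192.43.244.42.19583 212.67.202.80.80 TIME_WAIT' \
    'tcp4 0 0 192.43.244.42.3128 128.109.131.47.25666 TIME_WAIT' |
    count_time_wait
```

On a live proxy you would run netstat -n | count_time_wait. Only the server-side count consumes ephemeral ports on the Squid host, so that is the number to watch when deciding whether to widen the range.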
3.4 The configure Script

Like many other Unix software packages, Squid uses a ./configure script to learn about an operating system before compiling. The ./configure script is generated by the popular GNU autoconf program. When the script runs, it probes the system in various ways to find out about libraries, functions, types, parameters, and features that may or may not be present. One of the first things that ./configure does is look for a working C compiler. If the compiler can't be found or fails to compile a simple test program, the ./configure script can't proceed.

The ./configure script has a number of different options. The most important is the installation prefix. Before running ./configure, you need to decide where Squid should live. The installation prefix determines the default locations for the Squid logs, binaries, and configuration files. You can change the location for those files after installing, but it's easier if you decide now.

The default installation prefix is /usr/local/squid. Squid puts files in seven different subdirectories under the prefix:

    % ls -l /usr/local/squid
    total 5
    drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 bin
    drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 etc
    drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 libexec
    drwxr-x---  3 wessels  wheel  512 Apr 28 20:43 man
    drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 sbin
    drwxr-x---  4 wessels  wheel  512 Apr 28 20:42 share
    drwxr-x---  4 wessels  wheel  512 Apr 28 20:43 var

Squid uses the bin, etc, libexec, man, sbin, and share directories for a few, relatively small files (or other directories) that don't change very often. The files under the var directory, however, are a different story. This is where you'll find Squid's log files, which may grow quite large (tens or hundreds of megabytes). var is also the default location for the actual disk cache. You may want to put var on a different partition with plenty of space. One easy way to do this is with the --localstatedir option:

    % ./configure --localstatedir=/bigdisk/var

You don't need to worry too much about pathnames when configuring Squid. You can always change the pathnames later, in the squid.conf file.
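To see how the prefix and --localstatedir choices combine, the following sketch lists the directories that would result from a given pair of values. It is illustrative only: squid_dirs is a hypothetical helper, not something the Squid build provides, and /bigdisk/var is the example path from the text.

```shell
#!/bin/sh
# Hypothetical helper: list the directories Squid would use for a given
# installation prefix ($1) and an optional --localstatedir value ($2).
squid_dirs() {
    prefix=${1:-/usr/local/squid}
    localstatedir=${2:-$prefix/var}      # default is $prefix/var
    for d in bin etc libexec man sbin share; do
        echo "$prefix/$d"
    done
    echo "$localstatedir"
}

# Default prefix, with var moved to a larger partition.
squid_dirs /usr/local/squid /bigdisk/var
```

Running df -k on the last directory printed is a quick way to check that the partition holding the logs and disk cache has enough free space.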
3.4.1 configure Options

The ./configure script has a number of different options that all start with --. You can see the full list of options by typing ./configure --help. Some of these options are common to all configure scripts, and some are unique to Squid. Here are the standard options that you might find useful:

--prefix=PREFIX
    This sets the installation prefix directory, as described earlier. The installation prefix is the default directory for all executables, logs, and configuration files. Throughout this book, $prefix refers to your choice for the installation prefix.

--localstatedir=DIR
    This option allows you to change the location for the var directory. The default is $prefix/var, but you might want to change it so that Squid's disk cache and log files are stored elsewhere.

--sysconfdir=DIR
    This option allows you to change the location for the etc directory. The default is $prefix/etc. If you like to use /usr as the installation prefix, you might want to set --sysconfdir to /etc.

Here are the Squid-specific ./configure options:
--enable-dlmalloc[=LIB]
    On some systems, the built-in memory allocation (malloc) functions have poor performance characteristics when used with Squid. Using the --enable-dlmalloc option builds and links with the dlmalloc package included in the Squid source code. If you already have dlmalloc built on your system, you can specify the library's pathname as the =LIB argument. See http://g.oswego.edu/dl/html/malloc.html for more information on dlmalloc.

--enable-gnuregex
    Squid uses regular expressions for pattern matching in access control lists and other configuration directives. The GNU regular expression library comes with the Squid source code; it can be used on operating systems that don't have built-in regular expression functions. The ./configure script probes your system for a regular expression library and enables the use of GNU regex if necessary. If, for some reason, you want to force the usage of GNU regex, you can add this option to the ./configure command.

--enable-carp
    The Cache Array Routing Protocol (CARP) is useful for forwarding cache misses to an array, or cluster, of parent caches. There's more about CARP in Section 10.9.

--enable-async-io[=N_THREADS]
    Async I/O refers to one of Squid's techniques for improved storage performance. The aufs storage module uses a number of thread processes to perform disk I/O operations. This code works only on Linux and Solaris systems. The =N_THREADS argument changes the number of thread processes Squid uses. aufs and Async I/O are discussed in Section 8.4.

    Note that the --enable-async-io option is a shortcut that turns on three other ./configure options. It is equivalent to specifying:

        --with-aufs-threads=N_THREADS
        --with-pthreads
        --enable-storeio=ufs,aufs

--with-pthreads
    The --with-pthreads option causes the compilation procedure to link with your system's Pthreads library. The aufs storage module is the only part of Squid that uses threads. Normally, you don't specify this option on the ./configure command line because it's enabled automatically when you use --enable-async-io.

--enable-storeio=LIST
    Squid supports a number of different storage modules. With this option, you tell ./configure which modules to compile. The ufs, aufs, diskd, coss, and null modules are supported in Squid-2.5. You can also get a list by looking at the directories under src/fs.

    LIST is a comma-separated list of module names. For example:

        % ./configure --enable-storeio=aufs,diskd,ufs

    The ufs module is the default and least likely to cause problems. Unfortunately, it also has limited performance characteristics. The other modules may not necessarily compile on your particular operating system. For a complete description of Squid's storage modules, see Chapter 8.
--with-aufs-threads=N_THREADS
    Specifies the number of threads to use for the aufs storage scheme (see Section 8.4). By default, Squid automatically calculates how many threads to use, based on the number of cache directories.

--enable-heap-replacement
    This option has been deprecated but remains for backward compatibility. You should always use the --enable-removal-policies option instead.

--enable-removal-policies=LIST
    Removal policies are the algorithms Squid uses to eject cached objects when making room for new ones. Squid-2.5 supports three removal policies: least recently used (LRU), greedy dual size (GDS), and least frequently used (LFU).

    However, for some reason, the ./configure options blur the distinction between a particular replacement policy and the underlying data structures required to implement it. LRU, which is the default, is implemented with a doubly linked list. The GDS and LFU implementations use a data structure known as a heap.

    To use the GDS or LFU policies, you specify:

        % ./configure --enable-removal-policies=heap

    You then select between GDS and LFU in the Squid configuration file. If you want to retain the option of using LRU, specify:

        % ./configure --enable-removal-policies=heap,lru

    There's more about replacement policies in Section 7.5.

--enable-icmp
    As you'll see in Section 10.5, Squid can make round-trip time measurements with ICMP messages, much like the ping program. You can use this option to enable these features.

--enable-delay-pools
    Delay pools are Squid's technique for traffic shaping or bandwidth limiting. The pools consist of groups of client IP addresses. When requests from these clients are cache misses, their responses may be artificially delayed. See more about delay pools in Appendix C.
--enable-useragent-log
    This option enables logging of the HTTP User-Agent header from client requests. See more about this in Section 13.5.

--enable-referer-log
    This option enables logging of the HTTP Referer header from client requests. See more about this in Section 13.4.

--disable-wccp
    The Web Cache Coordination Protocol (WCCP) is Cisco's once-proprietary protocol for intercepting and distributing HTTP requests to one or more caches. WCCP is enabled by default, but you can use this option to prevent compilation of the WCCP code if you like.

--enable-snmp
    The Simple Network Management Protocol (SNMP) is a popular way to monitor network devices and servers. This option causes the build procedure to compile all of the SNMP-related code, including a cut-down version of the CMU SNMP library.

--enable-cachemgr-hostname[=hostname]
    cachemgr is a CGI program you can use to administratively query Squid. By default, cachemgr's hostname field is blank, but you can create a default value with this option. For example:

        % ./configure --enable-cachemgr-hostname=mycache.myorg.net
--enable-arp-acl
    Squid supports ARP, or Ethernet address, access control lists on some operating systems. The code to implement ARP ACLs uses nonstandard function interfaces, so it is disabled by default. If you run Squid on Linux or Solaris, you may be able to use this feature.

--enable-htcp
    HTCP is the Hypertext Caching Protocol, an intercache protocol similar to ICP. See Section 10.8 for more information.

--enable-ssl
    Use this option to give Squid the ability to terminate SSL/TLS connections. Note this only works for accelerated requests in surrogate mode. See Section 15.2.2 for more information.

--with-openssl[=DIR]
    This option exists so that you can tell the compiler where to find the OpenSSL libraries and header files, if necessary. If they aren't in the default location, enter the parent directory after this option. For example:

        % ./configure --enable-ssl --with-openssl=/opt/foo/openssl

    Given this example, your compiler looks for the OpenSSL header files in /opt/foo/openssl/include, and for libraries in /opt/foo/openssl/lib.

--enable-cache-digests
    Cache Digests are another alternative to ICP, but with significantly different characteristics. See Section 10.7.

--enable-err-languages="lang1 lang2 ..."
    Squid supports customizable error messages and comes with error messages in many different languages. This option determines the languages that are copied to the installation directory ($prefix/share/errors). If you don't use this option, all available languages are installed. To see which languages are available, look at a directory listing of the errors directory in the source distribution. Here's how to enable more than one language:

        % ./configure --enable-err-languages="Dutch German French" ...

--enable-default-err-language=lang
    This option sets the default value for the error_directory directive. For example, if you want to use Dutch error messages, you can use this ./configure option:

        % ./configure --enable-default-err-language=Dutch

    You can also set the error_directory directive in squid.conf, as described in Appendix A. English is the default error language if you omit this option.
--with-coss-membuf-size=N
    The Cyclic Object Storage System (coss) is an experimental storage scheme for Squid. This option sets the memory buffer size for coss cache directories. Note that in order to use coss, you must specify it as a storage type in the --enable-storeio option. The argument is given in bytes. The default is 1,048,576 bytes or 1 MB. You can specify a 2-MB buffer like this:

        % ./configure --with-coss-membuf-size=2097152

--enable-poll
    Unix provides two similar functions that scan open file descriptors for I/O events: select() and poll(). The ./configure script usually does a very good job of figuring out when to use poll() over select(). Use this option if you want to override the ./configure script and force it to use poll().

--disable-poll
    Similarly, Unix gurus may want to force ./configure to not use poll().

--disable-http-violations
    By default, Squid can be configured to violate the HTTP protocol specifications. You can use this option to completely remove the code that would violate HTTP.

--enable-ipf-transparent
    In Chapter 9, I'll describe how to configure Squid for interception caching. Some operating systems use the IP Filter package to assist with the interception. In these cases you should use this ./configure option. If you enable this option and get compiler errors on the src/client_side.c file, chances are that the IP Filter package isn't actually (or correctly) installed on your system.

--enable-pf-transparent
    You may need this option to use HTTP interception on systems that use the PF packet filter. PF is the standard packet filter for OpenBSD and may have been ported to other systems as well. If you enable this option and get compiler errors on the src/client_side.c file, chances are that PF isn't actually installed on your system.

--enable-linux-netfilter
    Netfilter is the name of the Linux packet filter for the 2.4 kernel series. Enable this option if you want to use HTTP interception with Linux 2.4 or later.
--disable-ident-lookups
    ident is a simple protocol that allows a server to find the username associated with a client's particular TCP connection. If you use this option, the compiler completely excludes the code that performs such lookups. Even if you leave the code enabled at compile time, Squid doesn't make ident lookups unless you configure them in squid.conf.

--disable-internal-dns
    The Squid source code includes two different DNS resolution implementations, called internal and external. Internal lookups are the default, but some people prefer the external technique. This option disables the internal functionality and reverts to the older method.

    Internal lookups use Squid's own implementation of the DNS protocol. That is, Squid generates raw DNS queries and sends them to a resolver. It retransmits queries that time out, and you can specify any number of resolvers. One of the benefits to this implementation is that Squid gets accurate TTLs for DNS replies.

    External lookups use the C library's gethostbyname() and gethostbyaddr() functions. Since these routines block the process until the answer comes back, they must be called from external helper processes. Squid uses a pool of external processes to make queries in parallel. The primary drawback to external DNS resolution is that you need more helper processes as Squid's load increases. Another annoyance is that the C library functions don't convey TTLs with the answers, in which case Squid uses a constant value supplied by the positive_dns_ttl directive.

--enable-truncate
    The truncate() system call is an alternative to using unlink(). While unlink() removes a cache file altogether, truncate() sets the file size to zero. This frees the disk space associated with the file but leaves the directory entry in place. This option exists because some people believed (or hoped) that truncate() would produce better performance than unlink(). However, benchmarks have shown little or no real difference.

--disable-hostname-checks
    By default, Squid requires that URL hostnames conform to the somewhat archaic specifications in RFC 1034:

        The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen.

    Here, "letter" means the ASCII characters A through Z. Since internationalized domain names are becoming increasingly popular, you may want to use this option to remove the restriction.

--enable-underscores
    This option controls Squid's behavior regarding underscore characters in hostnames. General consensus is that hostnames must not include underscore characters, although some people disagree. Squid, by default, generates an error message for requests that have an underscore in a URL hostname. You can use this option to make Squid treat them as valid. However, your DNS resolver may also enforce the no-underscore requirement and fail to resolve such hostnames.
--enable-auth[=LIST]
    This option controls which HTTP authentication schemes to support in the Squid binary. You can select any combination of the following schemes: basic, digest, and ntlm. If you omit the option, Squid supports only basic authentication. If you give the --enable-auth option without any arguments, the build process adds support for all schemes. Otherwise, you can give a comma-separated list of schemes to support:

        % ./configure --enable-auth=digest,ntlm

    I talk more about authentication in Chapters 6 and 12.

--enable-auth-helpers=LIST
    This old option is now deprecated, but still remains for backward compatibility. You should use --enable-basic-auth-helpers=LIST instead.

--enable-basic-auth-helpers=LIST
    With this option, you can build one or more of the HTTP Basic authentication helper programs found in helpers/basic_auth. See Section 12.2 for their names and descriptions.

--enable-ntlm-auth-helpers=LIST
    With this option, you can build one or more of the HTTP NTLM authentication helper programs found in helpers/ntlm_auth. See Section 12.4 for their names and descriptions.

--enable-ntlm-fail-open
    When you enable this option, Squid's NTLM authentication module defaults to allow access in the event of an error or problem.

--enable-digest-auth-modules=LIST
    With this option, you can build one or more of the HTTP Digest authentication helper programs found in helpers/digest_auth. See Section 12.3 for their names and descriptions.

--enable-external-acl-helpers=LIST
    With this option, you can build one or more of the external ACL helper programs that I discuss in Section 12.5. For example:

        % ./configure --enable-external-acl-helpers=ip_user,ldap_group

--disable-unlinkd
    Unlinkd is another one of Squid's external helper processes. Its sole job is to execute the unlink() (or truncate()) system call on cache files. Squid realizes a significant performance gain by implementing file deletion in an external process. Use this option to disable the external unlink daemon feature.

--enable-stacktrace
    Some operating systems support automatic generation of stack trace data in the event of a program crash. When you enable this feature and Squid crashes, the stack trace information is written to the cache.log file. This information is often helpful to developers in tracking down programming bugs.

--enable-x-accelerator-vary
    This advanced feature may be used when Squid is configured as a surrogate. It instructs Squid to look for X-Accelerator-Vary headers in responses from backend origin servers. See Section 15.5.
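Because several of these options take comma-separated lists, a misspelled module name is an easy mistake to make. The sketch below checks a --enable-storeio list against the five Squid-2.5 module names given above before building the command line. The function name check_storeio is hypothetical, the option values are only examples, and the final command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Squid): validate a
# comma-separated --enable-storeio list against the Squid-2.5 storage
# modules named in the text before passing it to ./configure.
check_storeio() {
    for m in $(echo "$1" | tr ',' ' '); do
        case "$m" in
            ufs|aufs|diskd|coss|null) ;;
            *) echo "unknown storage module: $m" >&2; return 1 ;;
        esac
    done
}

modules="aufs,diskd,ufs"      # example selection
if check_storeio "$modules"; then
    # Dry run: print the command instead of executing it.
    echo ./configure --enable-storeio="$modules" --enable-removal-policies=heap,lru
fi
```

Remove the echo to run the command for real from the top-level source directory. The same pattern could be applied to other list-valued options, such as --enable-auth.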
3.4.2 Running configure

Now we're ready to run the ./configure script. Go to the top-level source directory and type ./configure, followed by any of the options mentioned previously. For example:

    % cd squid-2.5.STABLE4
    % ./configure --enable-icmp --enable-htcp

./configure's job is to probe your operating system and find out which things are available, and which are not. One of the first things it does is make sure your C compiler is working. If ./configure detects a problem with your C compiler, the script exits with this error message:

    configure: error: installation or configuration problem: C compiler
    cannot create executables.

Most likely, you'll never see that message. If you do, it means either your system doesn't have a C compiler at all or that the compiler isn't installed correctly. Look at the config.log file for hints as to the exact problem. If your system has more than one C compiler, you can tell ./configure which to use by setting the CC environment variable before running ./configure:

    % setenv CC /usr/local/bin/gcc
    % ./configure ...

After ./configure checks out the compiler, it looks for a long list of header files, libraries, and functions. Normally you won't have to worry about this part. In some cases, ./configure pauses to get your attention about something that may be a problem (such as not enough file descriptors). It may also stop if you specify incompatible or unreasonable command-line options. If something does go wrong, check the config.log output.

./configure's final task is to create Makefiles and other files based on the things it learned about your system. At this point, you're ready to begin compiling.
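The setenv command shown above is csh syntax. In a Bourne-style shell, the CC variable is set with an assignment and export instead. The following sketch picks the first compiler found on the PATH and shows the resulting command; the helper name first_available and the candidate names gcc and cc are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first command from the
# argument list that exists on PATH, or fail if none does.
first_available() {
    for c in "$@"; do
        if command -v "$c" >/dev/null 2>&1; then
            command -v "$c"
            return 0
        fi
    done
    return 1
}

# Candidate compiler names are assumptions; adjust for your system.
if CC=$(first_available gcc cc); then
    export CC
    echo "would run: CC=$CC ./configure"
else
    echo "no C compiler found on PATH" >&2
fi
```

For a one-off build you can also set the variable on the command line itself, as in CC=/usr/local/bin/gcc ./configure, which leaves the rest of your shell session unchanged.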
3.5 make

Once ./configure has done its job, you can simply type make to begin compiling the source code:

    % make

Normally, this part goes smoothly. You'll see a lot of lines that look like this:[2]

    source='cbdata.c' object='cbdata.o' libtool=no
    depfile='.deps/cbdata.Po' tmpdepfile='.deps/cbdata.TPo'
    depmode=gcc /bin/sh ../cfgaux/depcomp
    gcc -DHAVE_CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\"
    -I. -I. -I../include -I. -I. -I../include -I../include
    -g -O2 -Wall -c `test -f cbdata.c || echo './'`cbdata.c
    source='client_db.c' object='client_db.o' libtool=no
    depfile='.deps/client_db.Po' tmpdepfile='.deps/client_db.TPo'
    depmode=gcc /bin/sh ../cfgaux/depcomp
    gcc -DHAVE_CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\"
    -I. -I. -I../include -I. -I. -I../include -I../include
    -g -O2 -Wall -c `test -f client_db.c || echo './'`client_db.c
    source='client_side.c' object='client_side.o' libtool=no
    depfile='.deps/client_side.Po' tmpdepfile='.deps/client_side.TPo'
    depmode=gcc /bin/sh ../cfgaux/depcomp
    gcc -DHAVE_CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\"
    -I. -I. -I../include -I. -I. -I../include -I../include
    -g -O2 -Wall -c `test -f client_side.c || echo './'`client_side.c
    source='comm.c' object='comm.o' libtool=no
    depfile='.deps/comm.Po' tmpdepfile='.deps/comm.TPo'
    depmode=gcc /bin/sh ../cfgaux/depcomp
    gcc -DHAVE_CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\"
    -I. -I. -I../include -I. -I. -I../include -I../include
    -g -O2 -Wall -c `test -f comm.c || echo './'`comm.c

[2] The make output used to be much prettier, but such is the price we pay for advanced compiling tools such as automake.

You may see some compiler warnings. In most cases, it is safe to ignore these. If you see a lot of them or something that looks really serious, report it to the developers as described in Section 16.5.

If the compilation gets all the way to the end without any errors, you can move to the next section, which describes how to install the programs you just built.
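With output this dense, warnings are easy to miss. One approach is to save the build output to a file and count the diagnostics afterward. The sketch below assumes gcc-style messages that contain the text "warning:" or "error:"; the function name summarize_build_log and the log filename make.log are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarize a saved build log ($1). Assumes
# gcc-style diagnostics that contain "warning:" or "error:".
summarize_build_log() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the count without failing the function.
    warnings=$(grep -c 'warning:' "$1" || true)
    errors=$(grep -c 'error:' "$1" || true)
    echo "warnings=$warnings errors=$errors"
}

# Typical use (commented out so the sketch runs without a source tree):
#   make 2>&1 | tee make.log
#   summarize_build_log make.log
```

A handful of warnings is normally nothing to worry about, as noted above. The saved log is also useful to attach if you end up reporting a problem to the developers.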
To verify that compilation was successful, you can run make again. You should see this output:[3]

    % make
    Making all in lib...
    Making all in scripts...
    Making all in src...
    Making all in fs...
    Making all in repl...
    'squid' is up to date.
    'client' is up to date.
    'unlinkd' is up to date.
    'cachemgr.cgi' is up to date.
    Making all in icons...
    Making all in errors...
    Making all in auth_modules...

[3] If make recompiles the source every time you run it, and there are no errors, your system clock may be set wrong.

The compilation step may fail for a number of reasons, including:
Source code bugs
    Usually the Squid source code is thoroughly debugged. However, you may encounter some bugs or problems that prevent Squid from compiling. You're more likely to find these sorts of bugs in the newer development versions. Report these to the developers.

Compiler installation problems
    An improperly installed C compiler probably won't be able to compile Squid or any other moderately sized software package. Usually, compilers come pre-installed with the operating system, so you don't have to worry about that. However, if you attempt to upgrade your compiler after installing the operating system, you might make a mistake. Never copy a compiler installation from one machine to another, unless you are absolutely sure about what you are doing. I feel it is always better to install the compiler on each machine separately.

    Always make sure that your compiler's header files are synchronized with the library files. The header files normally reside in /usr/include, while libraries are found in /usr/lib. Linux's popular RPM system makes it possible to upgrade one, but not the other. If the libraries are based on different header files, Squid may not compile.

    If you want to upgrade the compiler on one of the open-source BSD variants, be sure to run make world from the /usr/src directory, rather than from the /usr/src/lib or /usr/src/include directories.

Here are some common compilation problems and error messages:

Solaris: make[1]: *** [libmiscutil.a] Error 255
    This means that ./configure didn't find the ar program. Make sure /usr/ccs/bin is listed in your PATH environment variable. If you don't have the Sun compiler installed, you'll need the GNU binutils (http://www.gnu.org/directory/binutils.html).

Linux: storage size of 'rl' isn't known
    This happens when the header and library files don't match, as described earlier. Be sure to upgrade both packages at the same time.

Digital Unix: Don't know how to make EXTRA_libmiscutil_a_SOURCES. Stop.
    Digital Unix's make program isn't tolerant of the Makefile produced by the automake package. For example, lib/Makefile.in contains these lines:

        noinst_LIBRARIES = \
                @LIBDLMALLOC@ \
                libmiscutil.a \
                libntlmauth.a \
                @LIBREGEX@

    After substitution, when lib/Makefile is created, it looks like this:

        noinst_LIBRARIES = \
                \
                libmiscutil.a \
                libntlmauth.a \
                
    As shown above, the last line contains an (invisible) TAB character, which confuses make. You can get past this problem by installing and using GNU make, or by manually editing lib/Makefile (and any others exhibiting this problem) to make it look like this:

        noinst_LIBRARIES = \
                \
                libmiscutil.a \
                libntlmauth.a

If you have problems compiling Squid, check the FAQ first. You may also want to search the Squid web site (use the search box on the home page). Finally, if you're still stuck, send email to the squid-users@squid-cache.org list.
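For the stray-TAB problem described in the Digital Unix entry, the offending lines are hard to spot by eye. The sketch below prints the line numbers of whitespace-only lines that directly follow a backslash continuation, which is the pattern that confuses that make program. The function name find_dangling_continuations is hypothetical, and the sketch is a way to locate candidates for manual editing, not a fix in itself.

```shell
#!/bin/sh
# Hypothetical helper: print the line numbers in a Makefile ($1) where a
# line holding only spaces or TABs directly follows a line that ends in
# a backslash continuation.
find_dangling_continuations() {
    awk '
        prev_continues && /^[ \t]+$/ { print NR }
        { prev_continues = ($0 ~ /\\$/) }
    ' "$1"
}

# Typical use (commented out so the sketch runs without a source tree):
#   find_dangling_continuations lib/Makefile
```

Each reported line can then be deleted, along with the trailing backslash on the line before it, so that the result matches the corrected fragment shown above.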
3.6 make Install

After compiling, you need to install the programs into their permanent directories. This might require superuser privileges to put files in the installation directories. If so, become root first:

    % su
    Password:
    # make install

If you enable Squid's ICMP measurement features with the --enable-icmp option, you must install the pinger program. The pinger program must be installed with superuser privileges because only root is allowed to send and receive ICMP messages. The following command installs pinger with the appropriate permissions:

    # make install-pinger

After installing Squid, you should see the following directories and files listed under the installation prefix directory (/usr/local/squid by default):
sbin
The sbin directory contains programs normally started by root.

sbin/squid
This is the main Squid program.

bin
The bin directory contains programs for all users.

bin/RunCache
RunCache is a shell script you can use to start Squid. If Squid dies, this script automatically starts it again, unless it detects frequent restarts. The RunCache script is a relic from the time when Squid was not a daemon process. With the current versions, RunCache is less useful because Squid automatically restarts itself when you don't use the -N option.

bin/RunAccel
The RunAccel script is nearly identical to RunCache, except that it adds a command-line argument that tells Squid where to listen for HTTP requests.

bin/squidclient
squidclient is a simple HTTP client you can use to test Squid. It also has some special features for making management requests to a running Squid process.

libexec
The libexec directory traditionally contains helper programs. These are commands that you wouldn't normally run yourself. Rather, these programs are normally started by other programs.

libexec/unlinkd
unlinkd is a helper program that removes files from the cache directories. As you'll see later, file deletion can be a significant bottleneck. By implementing the delete operation in an external process, Squid achieves some performance gain.

libexec/cachemgr.cgi
cachemgr.cgi is a CGI interface to Squid's management functions. To use it, you'll probably need to copy this program to your HTTP server's cgi-bin directory. You'll see more about this in Section 14.2.

libexec/diskd (optional)
You get this only if you specify --enable-storeio=diskd.

libexec/pinger (optional)
You get this only if you specify --enable-icmp.

etc
The etc directory contains Squid's configuration files.

etc/squid.conf
This is the primary configuration file for Squid. Initially, this file contains a lot of comments to explain what each option does. After you understand the configuration directives, it's a good idea to remove the comments to make the configuration file smaller and easier to read. Note that the installation procedure doesn't overwrite this file if it already exists.

etc/squid.conf.default
This is a copy of the default configuration file from the source distribution. You may find it useful to have a copy of the current default configuration file after upgrading your Squid installation. New configuration directives may be added, and some of the existing directives may have changed.

etc/mime.conf
The mime.conf file tells Squid which MIME types to use for data retrieved from FTP and Gopher servers. The file is a table that correlates filename extensions to MIME types. Normally, you won't need to edit this file. However, you may need to add entries for special file types used within your organization.

etc/mime.conf.default
This is the default mime.conf file from the source distribution.

share
The share directory normally contains read-only data files used by Squid.

share/mib.txt
This is the SNMP Management Information Base (MIB) file for Squid. Squid doesn't use this file itself. Rather, your SNMP client software (such as snmpget and the Multi-Router Traffic Grapher, MRTG) needs this file to understand the SNMP objects available from Squid.

share/icons
The share/icons directory contains a number of small icon files Squid uses in FTP and Gopher directory listings. Normally, you won't need to worry about these files, but you can change them if you want.

share/errors
The share/errors directory contains templates for the error messages Squid shows to users. These files are copied from the source directory when you install Squid. You can edit them if you like. However, the installation procedure overwrites these files every time you run make install. So if you want to have customized error messages, it's a good idea to put them in a different directory.

var
The var directory contains files that aren't critical and that change frequently. These are the sort of files you don't normally back up.

var/logs
The var/logs directory is the default location for Squid's various log files. It is empty when you first install Squid. Once Squid gets running, you can expect to find files here named access.log, cache.log, and store.log.

var/cache
This is the default cache directory (cache_dir) if you don't specify one in squid.conf. See Chapter 7 for all the details about cache directories.
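The layout above can be checked with a few lines of shell after make install. This is only a sketch: missing_squid_files is a hypothetical helper name, the list of paths is a subset chosen for illustration, and the optional programs (diskd, pinger) are deliberately left out because they depend on your ./configure options.

```shell
# Print each expected entry that is missing under the installation prefix.
# The prefix defaults to /usr/local/squid, the default named in the text.
missing_squid_files() {
    prefix=${1:-/usr/local/squid}
    for path in sbin/squid bin/squidclient libexec/unlinkd \
                etc/squid.conf etc/mime.conf share/mib.txt \
                share/icons share/errors var/logs; do
        [ -e "$prefix/$path" ] || echo "$path"
    done
}
```

No output means every listed entry was found. If you installed with a different --prefix, pass that directory as the first argument.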
3.7 Applying a Patch

After you've been running Squid for a while, you may find that you need to patch the source code to fix a bug or add an experimental feature. Patches are posted for important bug fixes on the squid-cache.org web site. If you don't want to wait for the next official release, you can download and apply the patch to your source code. You will then need to recompile Squid.

To apply a patch (also sometimes called a diff), you need a program called patch. Chances are that your operating system already has the patch program. If not, you can download it from the GNU collection (http://www.gnu.org/directory/patch.html). Note that if you're using anonymous CVS (see Section 2.4), you don't need to worry about patching files. The CVS system does it for you automatically when you update your tree.

To apply a patch, you need to save the patch file somewhere on your system. Then cd to the Squid source directory and run the command like this:

    % cd squid-2.5.STABLE4
    % patch < /tmp/patch_file

By default, the patch program tells you what it's doing as it runs. Usually this output scrolls by very quickly, unless there is a problem. You can safely ignore the warnings that say offset NNN lines. If you don't want to see all this output, use the -s option to make patch silent.

When patch updates the source files, it creates a backup copy of the original file. For example, if you're applying a patch to src/http.c, patch names the backup file src/http.c.orig. Thus, if you want to undo the patch after applying it, you can simply rename all the .orig files back to their former names. To use this technique successfully, it's a good idea to remove all .orig files before applying a patch.

If patch encounters a problem, it stops and prompts you for advice. Common problems are as follows:

● Running patch from the wrong directory. To fix this problem, you may need to cd to a different directory or use patch's -p option.

● Patch is already applied. patch can usually tell if the patch file has already been applied. In this case, it asks if you want to unpatch the file.

● The patch program doesn't understand the file you are giving it. Patch files come in three flavors: normal, context, and unified. Old versions of patch may not understand context or unified diff output. Getting the latest version from the GNU FTP site will solve this problem.

● Corrupted patch file. If you aren't careful when downloading and saving the patch file, it may become corrupted. Sometimes people send patch files in email messages, and it is tempting to simply cut-and-paste them into a new window. On some systems, cut-and-paste can change Tab characters into spaces, or incorrectly wrap long lines. Both changes confuse patch. The -l option may be helpful, but it's best to make sure you copy and save the patch file correctly.

Sometimes patch can't apply part or all of the diff. In these cases, you'll see such messages as Hunk 3 of 4 failed. The failed sections are saved to files with a .rej suffix. For example, if a failure occurs while processing src/http.c, patch saves that piece of the diff to src/http.c.rej. In some cases, you may be able to fix these by hand, but it's usually not worth the trouble. If you have a lot of "failed hunks" or .rej files, it's a good idea to download a whole new copy of the latest source code.

After you apply a patch, you need to recompile Squid. One of the great things about make is that it only recompiles the files that have changed. But sometimes make doesn't comprehend all the intricate dependencies, and it doesn't rebuild enough of the files. To be safe, it's usually a good idea to recompile everything. The best way to do this is to clean the source tree before recompiling:

    % make clean
    % make
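The undo technique described in this section (renaming the .orig backups back to their former names) can be sketched as a small shell function. The name restore_orig_files is invented for this example. It assumes that every .orig file under the directory is a backup from the patch you want to undo, which is why the text recommends removing stale .orig files first. Note also that some versions of patch only keep backups when asked to with the -b option.

```shell
# Move every FILE.orig back over FILE, starting at the given directory
# (the current directory if none is given).
restore_orig_files() {
    find "${1:-.}" -type f -name '*.orig' | while IFS= read -r backup; do
        mv -f "$backup" "${backup%.orig}"
    done
}
```

Run it from the top of the Squid source tree, then recompile as described above.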
3.8 Running configure Later

Sometimes you may find it necessary to rerun ./configure. For example, if you tune your kernel parameters, you must run ./configure again so it picks up the new settings. As you read this book, you may also find that you want to use features that must be enabled with ./configure options.

To rerun ./configure with the same options, use this command:

    % ./config.status --recheck

Another technique is to "touch" the config.status file, which updates its timestamp. This causes make to rerun the ./configure script before compiling the source code:

    % touch config.status
    % make

To add or remove ./configure options, you need to type in the whole command again. If you can't remember the previous options, just look at the top of the config.status file. For example:

    % head config.status
    #! /bin/sh
    # Generated automatically by configure.
    # Run this file to recreate the current configuration.
    # This directory was configured as follows,
    # on host foo.life-gone-hazy.com:
    #
    # ./configure  --enable-storeio=ufs,diskd --enable-carp \
    #     --enable-auth-modules=NCSA
    # Compiler output produced by configure, useful for debugging
    # configure, is in ./config.log if it exists.

After rerunning ./configure, you must compile and install Squid again. To be safe, it's a good idea to run make clean first:

    % make clean
    % make

Recall that ./configure caches the things it discovers about your system. In some situations, you'll want to clear this cache and start the compilation process from the very beginning. You can simply remove the config.cache file if you like. Then, the next time ./configure runs, it won't use the previous values. You can also restore the Squid source tree to its pre-configure state with the following command:

    % make distclean

This removes all object files and other files created by the ./configure and make commands.
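Rather than reading the top of config.status by eye, you can pull out the recorded command with sed. This is a sketch under one assumption: that the ./configure command sits on a single comment line beginning with "# ./configure". The function name show_configure_command is hypothetical.

```shell
# Print the ./configure command line recorded in a config.status header.
# The file defaults to config.status in the current directory.
show_configure_command() {
    sed -n 's/^# *\(\.\/configure.*\)$/\1/p' "${1:-config.status}"
}
```

The output can be copied, edited to add or remove options, and run again from the top of the source tree.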
3.9 Exercises

● After compiling Squid, remove one or more of the .o files and run make again.

● Use the ulimit or limits command to change the file descriptor limit to some small value before compiling Squid. Does ./configure obey or ignore your new limit?

● Compile Squid with a high file-descriptor limit, then try to run it on a system with a lower limit. Does Squid use the lower or higher limit?

● What happens if you mistype one of the --enable options? What if you specify an invalid storage scheme with the --enable-storeio option?

● After compiling Squid, remove src/Makefile and try to compile it again. What's the easiest way to restore the file?
Chapter 4. Configuration Guide for the Eager

After compiling and installing Squid, your next task is to delve into the configuration file. If you're new to Squid, you're likely to find it a bit overwhelming. The most recent version has approximately 200 configuration file directives and 2700 lines of comments. I certainly don't expect you to read about, and configure, every directive before starting Squid. This chapter can help you get Squid running quickly.

All the squid.conf directives have default values. You might be able to get Squid going without even touching the configuration file. However, I don't recommend trying that. You'll be much happier if you read the following sections first.

If you are really turned off by Squid's configuration file syntax, you might want to try the Webmin graphical user interface. It allows you to configure Squid (and numerous other programs) from your web browser. See http://www.webmin.com and The Book of Webmin by Joe Cooper (No Starch Press) for more information.
4.1 The squid.conf Syntax

Squid's configuration file is relatively straightforward. It is similar in style to many other Unix programs. Each line begins with a configuration directive, followed by some number of values and/or keywords. Squid ignores empty lines and comment lines (beginning with #) when reading the configuration file. Here are some sample configuration lines:

    cache_log /squid/var/cache.log

    # define the localhost ACL
    acl Localhost src 127.0.0.1/32

    connect_timeout 2 minutes

    log_fqdn on

Some directives take a single value. For these, repeating the directive with a different value overwrites the previous value. For example, there is only one connect_timeout value. The first line in the following example has no effect because the second line overwrites it:

    connect_timeout 2 minutes
    connect_timeout 1 hour

On the other hand, some directives are actually lists of values. For these, each occurrence of the directive adds a new value to the list. The extension_methods directive works this way:

    extension_methods UNGET
    extension_methods UNPUT
    extension_methods UNPOST

For these list-based directives, you can also usually put multiple values on the same line:

    extension_methods UNGET UNPUT UNPOST

Many of the directives have common types. For example, connect_timeout is a time specification that has a number followed by a unit of time. For example:

    connect_timeout 3 hours
    client_lifetime 4 days
    negative_ttl 27 minutes

Similarly, a number of directives refer to the size of a file or chunk of memory. For these, you can write a size specification as a decimal number, followed by bytes, KB, MB, or GB. For example:

    minimum_object_size 12 bytes
    request_header_max_size 10 KB
    maximum_object_size 187 MB

Another type worth mentioning is the toggle, which can be either on or off. Many directives use this type. For example:

    server_persistent_connections on
    strip_query_terms off
    prefer_direct on

In general, the configuration file directives may appear in any order. However, the order is important when one directive makes reference to something defined by another. Access controls are a good example. An acl must be defined before it can be used in an http_access rule:

    acl Foo src 1.2.3.4
    http_access deny Foo

Many things in squid.conf are case-sensitive, such as directive names. You can't write HTTP_port instead of http_port.

The default squid.conf file contains comments describing each directive, as well as the default values. For example:

    #  TAG: persistent_request_timeout
    #       How long to wait for the next HTTP request on a persistent
    #       connection after the previous request completes.
    #
    #Default:
    # persistent_request_timeout 1 minute

Each time you install Squid, the current default configuration file is saved as squid.conf.default in the $prefix/etc directory. Since directives change from time to time, you can refer to this file for the most up-to-date documentation on squid.conf.

The rest of this chapter is about the handful of directives you need to know before running Squid for the very first time.
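The rule for single-valued directives in this section (a later occurrence overwrites an earlier one) can be modeled in a few lines of awk. This is a sketch for exploring your own squid.conf, not how Squid itself parses the file; the function name effective_value is made up, and it should only be used with single-valued directives such as connect_timeout.

```shell
# Print the value of the last occurrence of a directive in a config file.
# Comment lines never match because their first field starts with "#".
effective_value() {
    awk -v name="$1" '
        $1 == name { $1 = ""; sub(/^[ \t]+/, ""); value = $0 }
        END { if (value != "") print value }
    ' "$2"
}
```

For a file containing the two connect_timeout lines from the example in this section, effective_value connect_timeout squid.conf reports the second one.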
4.2 User IDs

As you probably know, Unix processes and files have user and group ownership attributes. You need to select a user and group for Squid. This user and group combination must have read and write access to most of the Squid-related files and directories.

I highly recommend creating a dedicated squid user and group. This minimizes the chance that someone can exploit Squid to read other files on the system. If more than one person has administrative authority over Squid, you can add them to the squid group.

Unix processes inherit their parent process's ownership attributes. That is, if you start Squid as user joe, Squid also runs as user joe. If you don't want Squid to run as joe, you need to change your user ID beforehand. This is typically accomplished with the su command. For example:

    joe% su - squid
    squid% /usr/local/squid/sbin/squid

Unfortunately, running Squid isn't always so simple. In some cases, you may need to start Squid as root, depending on your configuration. For example, only root can bind a TCP socket to privileged ports like port 80. If you need to start Squid as root, you must set the cache_effective_user directive. It tells Squid which user to become after performing the tasks that require special privileges. For example:

    cache_effective_user squid

The name that you provide must be a valid user (i.e., in the /etc/passwd file). Furthermore, note that this directive is used only when you start Squid as root. Only root has the ability to become another user. If you start Squid as joe, it can't switch to user squid.

You might be tempted to just run Squid as root without setting cache_effective_user. If you try, you'll find that Squid refuses to run. This, again, is due to security concerns. If an outsider were somehow able to compromise or exploit Squid, he could gain full access to your system. Although we strive to make Squid secure and bug-free, this requirement provides some extra insurance, just in case.

If you start Squid as root without setting cache_effective_user, Squid uses nobody as the default value. Whatever user ID you choose for Squid, make sure it has read access to the files installed in $prefix/etc, $prefix/libexec, and $prefix/share. The user ID must also have write access to the log files and cache directory.

Squid also has a cache_effective_group directive, but you probably don't need to set it. By default, Squid uses the cache_effective_user's default group (from the password file).
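Before putting a name in cache_effective_user, you can confirm that the account exists. The sketch below checks a passwd-format file directly, as the text describes; user_in_passwd is a hypothetical helper name. Systems that keep accounts in NIS or LDAP may not list them in /etc/passwd, so treat a failure as a hint rather than proof.

```shell
# Succeed if the named account has an entry in a passwd-format file.
# The second argument defaults to /etc/passwd.
user_in_passwd() {
    awk -F: -v user="$1" '$1 == user { found = 1 } END { exit !found }' \
        "${2:-/etc/passwd}"
}
```

A typical use is user_in_passwd squid || echo "create the squid user first".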
4.3 Port Numbers

The http_port directive tells Squid which port number to listen on for HTTP requests. The default is port 3128:

    http_port 3128

If you are running Squid as a surrogate (see Chapter 15), you should probably set this to 80.

You can instruct Squid to listen on multiple ports with additional http_port lines. This is often useful if you must support groups of clients that have been configured differently. For example, the browsers from one department may be sending requests to port 3128, while another department uses port 8080. Simply list both port numbers as follows:

    http_port 3128
    http_port 8080

You can also use the http_port directive to make Squid listen on specific interface addresses. When Squid is used on a firewall, it should have two network interfaces: one internal and one external. You probably don't want to accept HTTP requests coming from the external side. To make Squid listen on only the internal interface, simply put the IP address in front of the port number:

    http_port 192.168.1.1:3128
4.4 Log File Pathnames

I'll discuss all the details of Squid's log files in Chapter 13. For now the only thing you may need to worry about is where you want Squid to put its log files. The default location is the var/logs directory under the installation prefix. For example, if you don't use the --prefix= option with ./configure, the default log file directory is /usr/local/squid/var/logs.

You need to make sure that log files are stored on a disk partition with enough space. When Squid receives a write error for a log file, it exits and restarts. The primary reason for this behavior is to grab your attention. Squid wants to make sure you don't miss any important logging information, especially if your system is being abused or attacked.

Squid has three main log files: cache.log, access.log, and store.log. The first of these, cache.log, contains informational and debugging messages. When you start Squid the first few times, you should closely watch this file. If Squid refuses to run, the reason is probably at the end of cache.log. Under normal conditions, this log file doesn't become large enough to warrant any special attention. Also note that if you start Squid with the -s option, the important cache.log messages are also sent to your syslog daemon. You can change the location for this log file with the cache_log directive:

    cache_log /squid/logs/cache.log

The access.log file contains a single line for each client request made to Squid. On average, each line is about 150 bytes. In other words, it takes about 150 MB to log one million client requests. Use the cache_access_log directive to change the location of this log file:

    cache_access_log /squid/logs/access.log

If, for some reason, you don't want Squid to log client requests, you can specify the log file pathname as /dev/null.

The store.log file is probably not very useful to most cache administrators. It contains a record for each object that enters and leaves the cache. The average record size is typically 175-200 bytes. However, Squid doesn't create an entry in store.log for cache hits, so it contains fewer records than access.log. Use the cache_store_log directive to change the location:

    cache_store_log /squid/logs/store.log

You can easily disable store.log altogether by specifying the location as none:

    cache_store_log none

If you're not careful, Squid's log files increase in size without limit. Some operating systems enforce a 2-GB file size limit, even if you have plenty of free disk space. Exceeding this limit results in a write error, which then causes Squid to exit. To keep log file sizes reasonable, you should create a cron job that regularly renames and archives the log files. Squid has a built-in feature to make this easy. See Section 13.7 for an explanation of log file rotation.
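The sizing arithmetic in this section is easy to repeat for your own traffic. The sketch below uses the 150-byte average line size quoted for access.log and treats the 2-GB limit as 2^31 bytes. Both are approximations, and the function names are invented for this example.

```shell
# Rough access.log sizing, assuming the 150-byte average line quoted above.
AVG_LINE_BYTES=150

# Megabytes (10^6 bytes) needed to log a given number of requests.
access_log_mb() {
    echo $(( $1 * AVG_LINE_BYTES / 1000000 ))
}

# Requests that fit before a 2-GB (2^31-byte) file size limit is reached.
requests_until_2gb() {
    echo $(( 2147483648 / AVG_LINE_BYTES ))
}
```

For example, access_log_mb 1000000 prints 150, matching the figure in the text, and requests_until_2gb prints 14316557. A cache handling a million requests a day would therefore hit a 2-GB limit in about two weeks without log rotation.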
4.5 Access Controls

I'll have a lot to say about access controls in Chapter 6. For now, I'll cover a few controls so that more enthusiastic readers can quickly start using Squid.

Squid's default configuration file denies every client request. You must place additional access control rules in squid.conf before anyone can use the proxy. The simplest approach is to define an ACL that corresponds to your users' IP addresses and an access rule that tells Squid to allow HTTP requests from those addresses. Squid has many different ACL types. The src type matches client IP addresses, and the http_access rules are checked for client HTTP requests. Thus, you need to add only two lines:

    acl MyNetwork src 192.168.0.0/16
    http_access allow MyNetwork

The tricky part is putting these lines in the right place. The order of http_access lines is very important, but the order of acl lines doesn't matter, as long as each ACL is defined before the rule that uses it. You should also be aware that the default configuration file contains some important access controls. You shouldn't change or disrupt these until you fully comprehend their significance. When you edit squid.conf for the first time, look for this comment:

    #
    # INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
    #

Insert your new rules below this comment, and before the http_access deny All line.

For the sake of completeness, here is a suitable initial access control configuration, including the recommended default controls and the example earlier:

    acl All src 0/0
    acl Manager proto cache_object
    acl Localhost src 127.0.0.1/32
    acl Safe_ports port 80 21 443 563 70 210 280 488 591 777 1025-65535
    acl SSL_ports port 443 563
    acl CONNECT method CONNECT
    acl MyNetwork src 192.168.0.0/16

    http_access allow Manager Localhost
    http_access deny Manager
    http_access deny !Safe_ports
    http_access deny CONNECT !SSL_ports
    http_access allow MyNetwork
    http_access deny All
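Since the placement of the allow rule relative to http_access deny All is the part most easily gotten wrong, a quick mechanical check can help. This is a rough sketch, not a squid.conf parser: allow_precedes_deny_all is a hypothetical name, it inspects only the first ACL name on each http_access line, and it matches the ACL names exactly as written in the example (including the capitalized All).

```shell
# Succeed if "http_access allow <acl>" appears before the first
# "http_access deny All" line (or if there is no such deny line).
allow_precedes_deny_all() {
    awk -v acl="$1" '
        $1 == "http_access" && $2 == "allow" && $3 == acl && !allow { allow = NR }
        $1 == "http_access" && $2 == "deny" && $3 == "All" && !deny { deny = NR }
        END { exit !(allow && (!deny || allow < deny)) }
    ' "$2"
}
```

For the configuration above, allow_precedes_deny_all MyNetwork squid.conf succeeds. It fails if the allow line is missing or was added after the final deny rule.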
4.6 Visible Hostname

Hopefully, you won't need to worry about the visible_hostname directive. However, you'll need to set it if Squid can't figure out the hostname of the machine on which it is running. When this happens, Squid complains and refuses to run:

    % squid -Nd1
    FATAL: Could not determine fully qualified hostname.  Please set 'visible_hostname'

Squid wants to be sure about its hostname for a number of reasons:

● The hostname appears in Squid's error messages. This helps users identify the source of potential problems.

● The hostname appears in the HTTP Via header of cache misses that Squid forwards. When the request arrives at the origin server, the Via header contains a list of all proxies involved in the transaction. Squid also uses the Via header to detect forwarding loops. I'll talk about forwarding loops in Chapter 10.

● Squid uses internal URLs for certain things, such as the icons for FTP directory listings. When Squid generates an HTML page for an FTP directory, it inserts embedded images for little icons that indicate the type of each file in the directory. The icon URLs contain the cache's hostname so that web browsers request them directly from Squid.

● Each HTTP reply from Squid includes an X-Cache header. This isn't an official HTTP header. Rather, it is an extension header that indicates if the response was a cache hit or a cache miss. Since requests and responses may flow through more than one cache, each X-Cache header includes the name of the cache reporting hit or miss. Here's a sample response that passed through two caches:

    HTTP/1.0 200 OK
    Date: Mon, 29 Sep 2003 22:57:23 GMT
    Content-type: text/html
    Content-length: 733
    X-Cache: HIT from bo2.us.ircache.net
    X-Cache: MISS from bo1.us.ircache.net

Squid tries to figure out the hostname automatically at startup. First it calls the gethostname() function, which usually returns the correct hostname. Next, Squid attempts a DNS lookup on the hostname with gethostbyname(). This function typically returns both IP addresses and the canonical name for the system. If gethostbyname() succeeds, Squid uses the canonical name in error messages, Via headers, etc.

Squid may be unable to determine its fully qualified hostname for a number of reasons, including:

● The hostname may not be set.

● The hostname may be missing from the DNS zone or /etc/hosts file.

● The Squid system's DNS client configuration may be incorrect or missing. On Unix, you should check the /etc/resolv.conf and /etc/host.conf files.

If you see the fatal message mentioned previously, you need either to fix the hostname and DNS information or explicitly configure the hostname for Squid. In most cases, it is sufficient to ensure the hostname command returns a fully qualified hostname and add an entry to /etc/hosts. If that doesn't work, just set the visible hostname in squid.conf:

    visible_hostname squid.packet-pushers.net
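A quick way to see whether the hostname command returns something that looks fully qualified is to test it for an interior dot. This is only a heuristic sketch: it does no DNS lookup, so it can't tell you whether gethostbyname() would succeed, and is_fully_qualified is a name made up for this example.

```shell
# Succeed if the argument looks like a fully qualified hostname,
# judged only by the presence of a dot between two labels.
is_fully_qualified() {
    case "$1" in
        ""|.*|*..*) return 1 ;;   # empty, leading dot, or doubled dot
        *.*)        return 0 ;;   # at least one dot separating labels
        *)          return 1 ;;   # bare host name such as "squid"
    esac
}
```

You might run is_fully_qualified "$(hostname)" || echo "consider setting visible_hostname" before starting Squid for the first time.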
4.7 Administrative Contact Information

You should set the cache_mgr directive as a favor to your users. The value is an email address users can write to in case a problem surfaces. The cache_mgr address appears in Squid's error messages by default. For example:

    cache_mgr [email protected]
4.8 Next Steps

After creating the minimal configuration file, you're more or less ready to run Squid for the first time. To do that, just follow the instructions in the next chapter.

When you've mastered starting and stopping Squid, you can spend some time beefing up the configuration file. You may want to add more sophisticated access controls, which you'll find documented in Chapter 6. Since I didn't say anything about the disk cache yet, you should also spend a fair amount of time in Chapter 7 and Chapter 8.
4.9 Exercises

● Parse Squid's configuration file with squid -k parse and check the process exit status.

● Intentionally introduce some errors into the configuration file and run squid -k parse again. Notice how Squid reports different errors.

● Insert comments into the configuration file. Can you start a comment anywhere, even after a valid directive?

● Why do you think some configuration file errors are fatal, but others are not?
Chapter 5. Running Squid

Now that you have Squid installed, and maybe even configured, you need to learn the ins and outs of running Squid. Although most of the configuration occurs in squid.conf, you may find some of Squid's command-line options useful. For example, one of the first things you must do is use the -z option to initialize the cache directories. You may also find the -d option useful for debugging.

Squid normally runs as a daemon process. If you are new to Squid, however, I recommend running Squid in the foreground from a terminal window until you are confident that it is working properly. Following that, you can run Squid as a daemon, in the background.

Most likely, you'll want to start Squid each time your system boots. Different operating systems have different approaches to startup scripts. I'll show you how to make it happen in three different ways.

You can send signals to the running Squid process to execute various tasks, such as halting and reconfiguring Squid, and rotating the log files. Although you can use the kill command to send signals, it is easier to use the squid -k commands.
5.1 Squid Command-Line Options Before get t ing t oo far int o ot her t hings, let 's look at Squid's com m and- line opt ions. Many of t hese you will never use and som e are useful only when debugging problem s:
-a port Specifies a new ht t p_port value. This opt ion always overrides t he value from squid.conf. Not e, however, t hat you can specify m ult iple values in squid.conf. The - a opt ion overrides only t he first value from t he config file. ( This opt ion uses t he let t er " a" because in t he Harvest cache, t he HTTP port was called t he ASCI I port .)
-d level Makes Squid writ e it s debugging m essages t o st derr ( as well as cache.log and syslog, if configured) . The level argum ent specifies t he m axim um level for m essages t hat should be shown on st derr. I n m ost cases - d1 works well. See Sect ion 16.2 for a descript ion of debugging levels.
-f file Specifies an alt ernat e configurat ion file.
-h Displays t he usage inform at ion.
- k function Signals Squid t o perform various adm inist rat ive funct ions. The function argum ent m ay be one of t he following: reconfigure, rotate, shutdown, interrupt, kill, debug, check, or parse. reconfigure causes t he running Squid process t o reread it s configurat ion file. rotate causes Squid t o rot at e it s log files, which involves closing t hem , possibly renam ing t hem , and opening t hem again. shutdown sends t he signal t o shut down t he Squid process. interrupt also shut s down Squid but does so im m ediat ely, wit hout wait ing for act ive t ransact ions t o finish. kill sends t he unst oppable KI LL signal t o Squid, which should only be used as a last resort . debug put s Squid int o full debugging m ode. I t can quickly fill up your disk space if your cache is busy. check sim ply checks for a running Squid process. The process ret urn value indicat es whet her Squid is running or not . Finally, parse sim ply parses t he squid.conf file. The process ret urn value is non- zero if t he configurat ion file cont ains errors.

-s
    Enables logging to the syslog daemon. Squid uses the LOCAL4 syslog facility. Level 0 debug messages are logged with priority LOG_WARNING, and level 1 messages are logged with LOG_NOTICE. Higher level debugging messages aren't sent to syslogd. You might use an entry like this in /etc/syslog.conf:

        local4.warning          /var/log/squid.log

-u port
    Specifies an alternate ICP port number, overriding icp_port in squid.conf.

-v
    Prints the version string.

-z
    Initializes cache, or swap, directories. You must use this option when running Squid for the first time or whenever you add a new cache directory.

-C
    Prevents the installation of signal handlers that trap certain fatal signals such as SIGBUS and SIGSEGV. Normally, the signals are trapped by Squid so that it can attempt a clean shutdown. However, trapping the signal may make it harder to debug the problem afterwards. With this option, the fatal signals cause their default actions, which is usually to dump core.

-D
    Disables initial DNS tests. Normally, Squid won't start until it verifies that its DNS server is working. This option prevents that check. You can also alter or remove the dns_testnames option in squid.conf.

-F
    Makes Squid refuse all requests until it rebuilds the storage metadata. If your cache is busy, this option may shorten the time required to rebuild the metadata. If your cache is large, however, the rebuild procedure may take a long time anyway.

-N
    Prevents Squid from becoming a background daemon process.

-R
    Prevents Squid from using the SO_REUSEADDR option before binding to the HTTP port.

-V
    Enables virtual host surrogate mode. Similar to entering httpd_accel_host virtual in squid.conf.

-X
    Forces full debugging, as though you had specified debug_options ALL,9 in squid.conf.

-Y
    Returns ICP_MISS_NOFETCH instead of ICP_MISS when rebuilding store metadata. For busy parent caches, this option may result in less load while the cache is rebuilding. See Section 10.6.1.2.

5.2 Check Your Configuration File for Errors

Before trying to start Squid, you should verify that your squid.conf file makes sense. This is easy to do. Just run the following command:

    % squid -k parse

If you see no output, the configuration file is valid, and you can proceed to the next step. However, if your configuration file contains an error, Squid tells you about it:

    squid.conf line 62: http_access allow okay2
    aclParseAccessLine: ACL name 'okay2' not found.

Here you can see that the http_access directive on line 62 references an ACL that doesn't exist. Sometimes the error messages are less informative:

    FATAL: Bungled squid.conf line 76: memory_pools

In this case, we forgot to put either on or off after the memory_pools directive on line 76.

It's a good idea to develop the habit of using squid -k parse every time you modify your configuration file. If you don't bother, and your file has some errors, Squid tells you about them and refuses to start anyway. If you end up managing a number of caches, it is likely that you'll develop some scripts to automate starting, stopping, and reconfiguring Squid. You can use this feature in your scripts to ensure that the configuration files are always valid.
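
One way to build the parse check into a script is to wrap reconfiguration in a function that refuses to proceed when the check fails. The following is a minimal sketch rather than anything shipped with Squid: the function name safe_reconfigure, the SQUID variable, and its default path are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: validate squid.conf before asking Squid to reread it.
# SQUID is an assumed variable naming the squid binary; adjust it for your
# installation prefix.
SQUID=${SQUID:-/usr/local/squid/sbin/squid}

safe_reconfigure() {
    # squid -k parse exits with a non-zero status when squid.conf has errors.
    if "$SQUID" -k parse; then
        "$SQUID" -k reconfigure
    else
        echo "squid.conf contains errors; not reconfiguring" >&2
        return 1
    fi
}
```

Calling safe_reconfigure after editing squid.conf leaves the running process untouched when the file is broken, which is usually preferable to discovering the error during the reconfigure itself.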

5.3 Initializing Cache Directories

Before running Squid for the first time, and whenever you add a new cache_dir, you must initialize the cache directories. The command is simply:

    % squid -z

For the UFS-related storage schemes (ufs, aufs, and diskd; see Chapter 8), this command creates the subdirectories needed under each cache_dir. You don't need to worry that Squid will wipe out your current cache directories (if any).

Ownership and permissions are a common problem at this stage. Squid runs under a certain user ID, specified with cache_effective_user in squid.conf. This user ID must have read and write permission under each cache_dir directory. If not, you'll see a message like this:

    Creating Swap Directories
    FATAL: Failed to make swap directory /usr/local/squid/var/cache/00: (13) Permission denied

In this case, you should make sure that all components of /usr/local/squid/var/cache are accessible to the user ID given in squid.conf. The final component (the cache directory) must be writable by this user ID as well.

Cache directory initialization may take a couple of minutes, depending on the size and number of cache directories, and the speed of your disk drives. If you want to watch the progress, use the -X option:

    % squid -zX

5.4 Testing Squid in a Terminal Window

Once you've initialized the cache directories, you should run Squid in a terminal window with logging to stderr. This way, you can easily spot any errors or problems and make sure that Squid successfully starts. Use the -N option to keep Squid in the foreground and the -d1 option to display level 1 debugging on stderr:

    % squid -N -d1

You should see output like this:

    2003/09/29 12:57:52| Starting Squid Cache version 2.5.STABLE4 for i386-unknown-freebsd4.8...
    2003/09/29 12:57:52| Process ID 294
    2003/09/29 12:57:52| With 1064 file descriptors available
    2003/09/29 12:57:52| DNS Socket created on FD 4
    2003/09/29 12:57:52| Adding nameserver 206.107.176.2 from /etc/resolv.conf
    2003/09/29 12:57:52| Adding nameserver 205.162.184.2 from /etc/resolv.conf
    2003/09/29 12:57:52| Unlinkd pipe opened on FD 9
    2003/09/29 12:57:52| Swap maxSize 102400 KB, estimated 7876 objects
    2003/09/29 12:57:52| Target number of buckets: 393
    2003/09/29 12:57:52| Using 8192 Store buckets
    2003/09/29 12:57:52| Max Mem  size: 8192 KB
    2003/09/29 12:57:52| Max Swap size: 102400 KB
    2003/09/29 12:57:52| Rebuilding storage in /usr/local/squid/var/cache (DIRTY)
    2003/09/29 12:57:52| Using Least Load store dir selection
    2003/09/29 12:57:52| Set Current Directory to /usr/local/squid/var/cache
    2003/09/29 12:57:52| Loaded Icons.
    2003/09/29 12:57:52| Accepting HTTP connections at 0.0.0.0, port 3128, FD 11.
    2003/09/29 12:57:52| Accepting ICP messages at 0.0.0.0, port 3130, FD 12.
    2003/09/29 12:57:52| WCCP Disabled.
    2003/09/29 12:57:52| Ready to serve requests.

If you see an error message, you need to fix it before proceeding. Be sure to check the first few lines of output for warning messages. The most common errors are file/directory permissions and configuration file syntax errors. If you see an error message that doesn't make sense, have a look at Chapter 16 for advice and information on troubleshooting Squid. If that doesn't help, check the Squid FAQ, or search the mailing list archives for an explanation.

Once you see the Ready to serve requests message, test Squid with a few HTTP requests. You can do this by configuring your browser to use Squid as a proxy and then opening a web page. If Squid is working correctly, the page should load as quickly as it would without using Squid. Alternatively, you can use the squidclient program that comes with Squid:

    % squidclient http://www.squid-cache.org/

If this works, Squid's home page HTML file will scroll across your terminal window. Once you're confident that Squid works okay, you can interrupt the Squid process (i.e., with Ctrl-C) and run Squid as a daemon.

5.5 Running Squid as a Daemon Process

Normally you'll want to run Squid as a daemon process (i.e., not attached to your terminal window). The easiest way to do this is simply to execute Squid as follows:

    % squid -s

The -s option causes Squid to write important status and warning messages to syslogd. Squid uses the LOCAL4 facility and the LOG_WARNING and LOG_NOTICE priorities. Your syslog daemon may or may not actually log Squid's messages, depending on how it is configured. These same messages are written to the cache.log file, so it is safe to omit the -s option if you prefer.

When you start Squid without the -N option (as shown earlier), Squid automatically backgrounds itself and creates a parent/child process pair. The child process is the one that does all the real work. The parent process makes sure that a child process is always running. Thus, if the child process dies unexpectedly, the parent starts another so that Squid remains in operation. You can see this parent/child process interaction by looking at your syslog messages:

    Jul 31 14:58:35 zapp squid[294]: Squid Parent: child process 296 started

Here you can see that the parent is process ID 294, and the child is 296. When you look at ps output, you'll see that the child process is listed as (squid):

    % ps ax | grep squid
      294  ??  Is     0:00.01 squid -sD
      296  ??  S      0:00.27 (squid) -sD (squid)

If the child Squid process dies unexpectedly, the parent starts another. For example:

    Jul 31 15:02:53 zapp squid[294]: Squid Parent: child process 296 exited due to signal 6
    Jul 31 15:02:56 zapp squid[294]: Squid Parent: child process 359 started

In some situations, the child Squid process may die immediately. Rather than constantly spawning new Squid processes, the parent process gives up if the child processes won't stay running for at least 10 seconds five times in a row:

    Jul 31 15:13:48 zapp squid[455]: Squid Parent: child process 474 exited with status 1
    Jul 31 15:13:48 zapp squid[455]: Exiting due to repeated, frequent failures

If this happens to you, check syslog and Squid's cache.log for error messages.

5.5.1 The squid_start Script

When Squid runs as a daemon process, it looks for a file named squid_start in the same directory as the squid binary. If found, this program is executed before the parent process forks to run the child process. You can use this script for certain administrative tasks, such as notifying someone that Squid is starting, managing log files, etc. Squid doesn't start the child process until the squid_start program exits.

The squid_start script only works when you start Squid by its absolute or relative pathname. In other words, Squid doesn't use the PATH environment variable to locate squid_start. Thus, you may want to develop the habit of starting Squid like this:

    % /usr/local/squid/sbin/squid -sD

rather than starting Squid like this:

    % squid -sD

5.6 Boot Scripts

Most likely, you'll want Squid to start automatically every time your computer boots. Different operating systems vary widely in how their boot-up scripts work. I'll describe some common environments here, but you may need to refer to your particular operating system for specific information.

5.6.1 /etc/rc.local

One of the easiest schemes is the /etc/rc.local script. This is simply a shell script that runs as root each time the system boots. Using this script to start Squid is as easy as adding the following line:

    /usr/local/squid/sbin/squid -s

Of course your installation prefix may be different, and you may like to use some other command-line options. Don't use the -N option here.

If, for some reason, you're not using the cache_effective_user directive, you can try using su to start Squid as a non-root user:

    /usr/bin/su nobody -c '/usr/local/squid/sbin/squid -s'

5.6.2 init.d and rc.d

The init.d and rc.d schemes use a separate shell script to start different services. These scripts are often located in one of the following directories: /sbin/init.d, /etc/init.d, and /usr/local/etc/rc.d. The scripts usually take a single command-line argument, which is either start or stop. Some systems only use the start argument. Here's a basic script for starting Squid:

    #!/bin/sh
    #
    # this script starts and stops Squid

    case "$1" in
    start)
        /usr/local/squid/sbin/squid -s
        echo -n ' Squid'
        ;;
    stop)
        /usr/local/squid/sbin/squid -k shutdown
        ;;
    esac

Linux users may want to add commands that set the file-descriptor limits before running Squid. For example:

    echo 8192 > /proc/sys/fs/file-max
    ulimit -HSn 8192

To use this script, find the appropriate directory in which such scripts are stored. Give it a meaningful name, similar to the others, perhaps S98squid or simply squid.sh. Be sure to test the script by rebooting your computer rather than assuming it will work.

5.6.3 /etc/inittab

Another scheme supported on some operating systems is the /etc/inittab file. On these systems, the init process starts and stops services based on the run level. A typical inittab entry looks like this:

    sq:2345:once:/usr/local/squid/sbin/squid -s

With this entry, the init process starts Squid just once and then forgets about it. Squid makes sure it stays running as described previously. Alternatively, you can do it like this:

    sq:2345:respawn:/usr/local/squid/sbin/squid -Ns

Here, since we use the respawn option, init restarts Squid if the process exits. If you use respawn, be sure to use the -N option.

After editing the inittab file, use this command to make init reread its configuration file and start Squid:

    # init q

5.7 A chroot Environment

Some people like to run Squid in a chroot environment. This is a Unix feature that gives a process a new root filesystem directory. It provides an extra level of security in the event that Squid is compromised. If an attacker somehow gains access to the operating system through Squid, she can only access files under the chroot filesystem. The other system files, outside of the chroot tree, remain inaccessible.

The easiest way to run Squid in a chroot environment is by specifying the new root directory in the squid.conf file with this directive:

    chroot /new/root/directory

The chroot() system call requires superuser privileges, so you must start Squid as root to use this feature.

The chroot environment isn't for first-time Unix users. It is a little tricky because you must replicate a number of files underneath the new root directory. For example, if the default configuration file is normally /usr/local/squid/etc/squid.conf, and you use the chroot directive, the file must be located at /new/root/directory/usr/local/squid/etc/squid.conf. You must copy all of the files under $prefix/etc, $prefix/share, and $prefix/libexec to the chroot directory. Make sure that $prefix/var and the cache directories exist and are writable under the chroot directory as well.

Chances are that your operating system requires a number of files in the chroot directory, such as /etc/resolv.conf and /dev/null. If you use an external helper program, such as a redirector (see Chapter 11) or an authenticator (see Chapter 12), you'll also need some shared libraries from /usr/lib. You can use the ldd utility to find out which shared libraries are required for a given program:

    % ldd /usr/local/squid/libexec/ncsa_auth
    /usr/local/squid/libexec/ncsa_auth:
        libcrypt.so.2 => /usr/lib/libcrypt.so.2 (0x28067000)
        libm.so.2 => /usr/lib/libm.so.2 (0x28080000)
        libc.so.4 => /usr/lib/libc.so.4 (0x28098000)

You can also use the chroot command to test helpers:

    # chroot /new/root/directory /usr/local/squid/libexec/ncsa_auth
    /usr/libexec/ld-elf.so.1: Shared object "libcrypt.so.2" not found

For more information on chroot, see the chroot() manpage on your system.

5.8 Stopping Squid

The safest way to shut down Squid is with the squid -k shutdown command:

    % squid -k shutdown

This command sends the TERM signal to the running Squid process. Upon receipt of the TERM signal, Squid closes its incoming sockets so that new requests aren't accepted. It then waits some amount of time for outstanding requests to complete. The default is 30 seconds, which you can change with the shutdown_lifetime directive.

If, for some reason, the squid.pid file is missing or unreadable, the squid -k commands don't work. In this case, you can manually kill Squid by finding the process ID with ps. For example:

    % ps ax | grep squid

If you see more than one Squid process, be sure to kill the one that shows up as (squid). For example:

    % ps ax | grep squid
      294  ??  Is     0:00.01 squid -sD
      296  ??  S      0:00.27 (squid) -sD (squid)
    % kill -TERM 296

After sending the TERM signal, you may want to watch the log file to double-check that Squid is shutting down:

    % tail -f logs/cache.log
    2003/09/29 21:49:30| Preparing for shutdown after 9316 requests
    2003/09/29 21:49:30| Waiting 10 seconds for active connections to finish
    2003/09/29 21:49:30| FD 11 Closing HTTP connection
    2003/09/29 21:49:31| Shutting down...
    2003/09/29 21:49:31| FD 12 Closing ICP connection
    2003/09/29 21:49:31| Closing unlinkd pipe on FD 9
    2003/09/29 21:49:31| storeDirWriteCleanLogs: Starting...
    2003/09/29 21:49:32|   Finished.  Wrote 253 entries.
    2003/09/29 21:49:32|   Took 0.1 seconds (1957.6 entries/sec).
    2003/09/29 21:49:32| Squid Cache (Version 2.5.STABLE4): Exiting normally.

If you use squid -k interrupt, Squid shuts down immediately, without waiting for active requests to complete. This is equivalent to sending the INT signal with kill.
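
Because squid -k shutdown returns before Squid has actually exited, a script that must not continue until Squid is gone needs to poll. The following is a minimal sketch under stated assumptions: the function name stop_squid, the SQUID variable and its default path, and the default 60-second limit are all hypothetical, and the loop assumes that squid -k check exits with status 0 while Squid is still running.

```shell
#!/bin/sh
# Hypothetical helper: request a clean shutdown, then wait for Squid to exit.
# SQUID is an assumed variable naming the squid binary.
SQUID=${SQUID:-/usr/local/squid/sbin/squid}

# stop_squid [MAX_SECONDS]
# Returns 0 once Squid is no longer running, 1 on failure or timeout.
stop_squid() {
    remaining_wait=${1:-60}
    "$SQUID" -k shutdown || return 1
    # Assumption: "squid -k check" exits 0 for as long as Squid is running.
    while "$SQUID" -k check 2>/dev/null; do
        if [ "$remaining_wait" -le 0 ]; then
            echo "Squid is still running; giving up" >&2
            return 1
        fi
        sleep 1
        remaining_wait=$((remaining_wait - 1))
    done
    return 0
}
```

The time limit should be somewhat longer than your shutdown_lifetime value, since Squid may legitimately wait that long for outstanding requests before it exits.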

5.9 Reconfiguring a Running Squid Process

As you learn more about Squid, you'll probably find yourself making many changes to the squid.conf file. To have the new settings take effect, you can either shut down and restart Squid, or you can reconfigure Squid while it is running.

The best way to reconfigure a running Squid process is with the squid -k reconfigure command:

    % squid -k reconfigure

When you run this command, a HUP signal is sent to the running Squid process. Squid then reads and parses the squid.conf file. If the operation is successful, you'll see this in cache.log:

    2003/09/29 22:02:25| Restarting Squid Cache (version 2.5.STABLE4)...
    2003/09/29 22:02:25| FD 12 Closing HTTP connection
    2003/09/29 22:02:25| FD 13 Closing ICP connection
    2003/09/29 22:02:25| Cache dir '/usr/local/squid/var/cache' size remains unchanged at 102400 KB
    2003/09/29 22:02:25| DNS Socket created on FD 5
    2003/09/29 22:02:25| Adding nameserver 10.0.0.1 from /etc/resolv.conf
    2003/09/29 22:02:25| Accepting HTTP connections at 0.0.0.0, port 3128, FD 9.
    2003/09/29 22:02:25| Accepting ICP messages at 0.0.0.0, port 3130, FD 11.
    2003/09/29 22:02:25| WCCP Disabled.
    2003/09/29 22:02:25| Loaded Icons.
    2003/09/29 22:02:25| Ready to serve requests.

You need to be a little careful with the reconfigure option because it's possible to make changes that cause a fatal error. For example, note that Squid closes and reopens the incoming HTTP and ICP sockets. If you change the http_port to a port number that Squid can't open, it exits with a fatal error message.

Certain options and directives can't be changed while Squid is running. This includes:

● Removal of cache directories (cache_dir directive).
● Changes to the store_log directive.
● Changing the block-size value for coss cache_dirs. In fact, whenever you change this value, you must reinitialize the coss cache_dir.
● The coredump_dir directive isn't examined during the reconfigure procedure. Thus, you can't make Squid change its current directory after it has started.

Solaris users may experience a subtle problem when reconfiguring Squid. The fopen() call in the Solaris stdio implementation requires an unused file descriptor less than 256. The FILE structure stores the file descriptor as an 8-bit value. Normally this isn't a problem because Squid uses raw I/O (e.g., open()) to open cache files. However, certain tasks that occur during the reconfigure procedure use fopen(). These may fail if the first 256 file descriptors are already allocated.

5.10 Rotating the Log Files

Squid writes to a number of log files unless you disable them in squid.conf. You must periodically rotate the log files to prevent them from consuming too much disk space. Squid places a lot of importance on log files and exits with an error message when it can't write to them. To keep disk space consumption under control, use the following command in a cron job:

    % squid -k rotate

For example, this crontab entry rotates the logs every 24 hours, at 4 A.M.:

    0 4 * * * /usr/local/squid/sbin/squid -k rotate

This command does two things. First, it closes the currently open log files. Then, it renames the cache.log, store.log, and access.log files by appending a numeric extension. For example, cache.log becomes cache.log.0, cache.log.0 becomes cache.log.1, and so on, up to the value of the logfile_rotate option.

Squid keeps only the last logfile_rotate versions of each log file. The older versions are simply removed during the renaming process. If you want to keep more copies, you need to increase the logfile_rotate limit or write some custom scripts that move the log files to a different location. See Section 13.7 for additional information about rotating log files.
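
The renaming scheme is easier to follow when written out. The function below is only an illustration of the order of operations described above, not Squid's own code: the name rotate_log and its arguments are hypothetical, and Squid performs the equivalent steps internally when you run squid -k rotate.

```shell
#!/bin/sh
# Illustration only: mimic the numbered-extension renaming described above.
# rotate_log FILE COUNT keeps at most COUNT old copies, FILE.0 .. FILE.(COUNT-1).
rotate_log() {
    file=$1
    count=$2
    # With no copies to keep, there is nothing to rename.
    [ "$count" -gt 0 ] || return 0

    i=$((count - 1))
    # The oldest copy falls off the end.
    rm -f "$file.$i"
    # Shift the remaining copies up by one, oldest first.
    while [ "$i" -gt 0 ]; do
        prev=$((i - 1))
        if [ -f "$file.$prev" ]; then
            mv "$file.$prev" "$file.$i"
        fi
        i=$prev
    done
    # Finally, the current log becomes copy number 0.
    if [ -f "$file" ]; then
        mv "$file" "$file.0"
    fi
}
```

Working from the highest number downward matters: renaming in the other direction would overwrite each older copy before it had been moved out of the way.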

5.11 Exercises

● Use Squid's -s option and verify that its messages are saved by your syslog daemon.
● Run squid -X -d9, and examine some of the debugging messages.
● Write a shell script that stops Squid but doesn't exit until all Squid processes exit.
● Play with squid -k rotate. What happens if you have tail -f cache.log running when you rotate the log files?

Chapter 6. All About Access Controls

Access controls are the most important part of your Squid configuration file. You'll use them to grant access to your authorized users and to keep out the bad guys. You can use them to restrict, or prevent access to, certain material; to control request rewriting; to route requests through a hierarchy; and to support different qualities of service.

Access controls are built from two different components. First, you define a number of access control list (ACL) elements. These elements refer to specific aspects of client requests, such as IP addresses, URL hostnames, request methods, and origin server port numbers. After defining the necessary elements, you combine them into a number of access list rules. The rules apply to particular services or operations within Squid. For example, the http_access rules are applied to incoming HTTP requests. I cover the access control elements first, and then the rules later in this chapter.

6.1 Access Control Elements

ACL elements are the building blocks of Squid's access control implementation. These are how you specify things such as IP addresses, port numbers, hostnames, and URL patterns. Each ACL element has a name, which you refer to when writing the access list rules. The basic syntax of an ACL element is as follows:

    acl name type value1 value2 ...

For example:

    acl Workstations src 10.0.0.0/16

In most cases, you can list multiple values for one ACL element. You can also have multiple acl lines with the same name. For example, the following two configurations are equivalent:

    acl Http_ports port 80 8000 8080

    acl Http_ports port 80
    acl Http_ports port 8000
    acl Http_ports port 8080

6.1.1 A Few Base ACL Types

Squid has approximately 25 different ACL types, some of which have a common base type. For example, both src and dst ACLs use IP addresses as their base type. To avoid being redundant, I'll cover the base types first and then describe each type of ACL in the following sections.

6.1.1.1 IP addresses

Used by: src, dst, myip

Squid has a powerful syntax for specifying IP addresses in ACLs. You can write addresses as subnets, address ranges, and domain names. Squid supports both "dotted quad" and CIDR [1] prefix subnet specifications. In addition, if you omit a netmask, Squid calculates the appropriate netmask for you. For example, the lines in each of the following groups are equivalent:

[1] CIDR stands for Classless Inter-Domain Routing. It is from an Internet-wide effort to support routing by any prefix length, instead of the old class A, B, and C subnet lengths.

    acl Foo src 172.16.44.21/255.255.255.255
    acl Foo src 172.16.44.21/32
    acl Foo src 172.16.44.21

    acl Xyz src 172.16.55.32/255.255.255.248
    acl Xyz src 172.16.55.32/29

    acl Bar src 172.16.66.0/255.255.255.0
    acl Bar src 172.16.66.0/24
    acl Bar src 172.16.66.0

When you specify a netmask, Squid checks your work. If your netmask masks out non-zero bits of the IP address, Squid issues a warning. For example, the following line results in the subsequent warning:

    acl Foo src 127.0.0.1/8

    aclParseIpData: WARNING: Netmask masks away part of the specified IP in 'Foo'

The problem here is that the /8 netmask (255.0.0.0) has all zeros in the last three octets, but the IP address 127.0.0.1 doesn't. Squid warns you about the problem so you can eliminate the ambiguity. To be correct, you should write:

    acl Foo src 127.0.0.1/32

or:

    acl Foo src 127.0.0.0/8

Sometimes you may need to list multiple, contiguous subnets. In these cases, it may be easier to specify an address range. For example:

    acl Foo src 172.16.10.0-172.16.19.0/24

This is equivalent to, and more efficient than, this approach:

    acl Foo src 172.16.10.0/24
    acl Foo src 172.16.11.0/24
    acl Foo src 172.16.12.0/24
    acl Foo src 172.16.13.0/24
    acl Foo src 172.16.14.0/24
    acl Foo src 172.16.15.0/24
    acl Foo src 172.16.16.0/24
    acl Foo src 172.16.17.0/24
    acl Foo src 172.16.18.0/24
    acl Foo src 172.16.19.0/24

Note that with IP address ranges, the netmask goes only at the very end. You can't specify different netmasks for the beginning and ending range values.
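
The netmask check amounts to a bitwise AND between the address and the inverted mask. The sketch below reproduces that arithmetic so you can test an address and prefix length before putting them in squid.conf. It is an illustration only, not Squid's implementation: the function names ip_to_int and masks_away_bits are hypothetical, and it handles only dotted-quad addresses with a numeric prefix length, on a shell with at least 64-bit arithmetic.

```shell
#!/bin/sh
# Illustration only: detect a prefix length that masks away non-zero bits.

# ip_to_int A.B.C.D prints the address as a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into four positional parameters
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# masks_away_bits ADDRESS PREFIXLEN
# Succeeds (status 0) when host bits are set that the netmask would discard.
masks_away_bits() {
    addr=$(ip_to_int "$1")
    bits=$2
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    # Any bit set outside the mask is one the netmask "masks away".
    [ $(( addr & ~mask & 0xFFFFFFFF )) -ne 0 ]
}
```

For example, masks_away_bits 127.0.0.1 8 succeeds, which corresponds to the case that triggers the warning, while masks_away_bits 127.0.0.0 8 and masks_away_bits 127.0.0.1 32 both fail, matching the two corrected forms.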

You can also specify hostnames in IP ACLs. For example:

    acl Squid dst www.squid-cache.org

Squid converts hostnames to IP addresses at startup. Once started, Squid never makes another DNS lookup for the hostname's address. Thus, Squid never notices if the address changes while it's running.

If the hostname resolves to multiple addresses, Squid adds each to the ACL. Also note that you can't use netmasks with hostnames.

Using hostnames in address-based ACLs is usually a bad idea. Squid parses the configuration file before initializing other components, so these DNS lookups don't use Squid's nonblocking IP cache interface. Instead, they use the blocking gethostbyname() function. Thus, the need to convert ACL hostnames to addresses can delay Squid's startup procedure. Avoid using hostnames in src, dst, and myip ACLs unless absolutely necessary.

Squid stores IP address ACLs in memory with a data structure known as a splay tree (see http://www.link.cs.cmu.edu/splay/). The splay tree has some interesting self-organizing properties, one of which is that the tree automatically adjusts itself as lookups occur. When a matching element is found, that element becomes the new root of the tree. In this way frequently referenced items migrate to the top of the tree, which reduces the time for future lookups.

All subnets and ranges belonging to a single ACL element must not overlap. Squid warns you if you make a mistake. For example, this isn't allowed:

    acl Foo src 1.2.3.0/24
    acl Foo src 1.2.3.4/32

It causes Squid to print a warning in cache.log:

    WARNING: '1.2.3.4' is a subnetwork of '1.2.3.0/255.255.255.0'
    WARNING: because of this '1.2.3.4' is ignored to keep splay tree searching predictable
    WARNING: You should probably remove '1.2.3.4' from the ACL named 'Foo'

In this case, you need to fix the problem, either by removing one of the ACL values or by placing them into different ACL lists.

6.1.1.2 Domain names

Used by: srcdomain, dstdomain, and the cache_host_domain directive

A domain name is simply a DNS name or zone. For example, the following are all valid domain names:

    www.squid-cache.org
    squid-cache.org
    org

Domain name ACLs are tricky because of a subtle difference relating to matching domain names and subdomains. When the ACL domain name begins with a period, Squid treats it as a wildcard, and it matches any hostname in that domain, even the domain name itself. If, on the other hand, the ACL domain name doesn't begin with a period, Squid uses exact string comparison, and the hostname must be exactly the same for a match.

Table 6-1 shows Squid's rules for matching domain and hostnames. The first column shows hostnames taken from requested URLs (or client hostnames for srcdomain ACLs). The second column indicates whether or not the hostname matches lrrr.org. The third column shows whether the hostname matches an .lrrr.org ACL. As you can see, the only difference is in the second case.

Table 6-1. Domain name matching

    URL hostname     Matches ACL lrrr.org?    Matches ACL .lrrr.org?
    lrrr.org         Yes                      Yes
    i.am.lrrr.org    No                       Yes
    iamlrrr.org      No                       No

Domain name matching can be confusing, so let's look at another example so that you really understand it. Here are two slightly different ACLs:

    acl A dstdomain foo.com
    acl B dstdomain .foo.com

A user's request to get http://www.foo.com/ matches ACL B, but not A. ACL A requires an exact string match, but the leading dot in ACL B is like a wildcard. On the other hand, a user's request to get http://foo.com/ matches both ACLs A and B. Even though there is no word before foo.com in the URL hostname, the leading dot in ACL B still causes a match.

Squid uses splay trees to store domain name ACLs, just as it does for IP addresses. However, Squid's domain name matching algorithm presents an interesting problem for splay trees. The splay tree technique requires that only one key can match any particular search term. For example, let's say the search term (from a URL) is i.am.lrrr.org. This hostname would be a match for both .lrrr.org and .am.lrrr.org. The fact that two ACL values match one hostname confuses the splay algorithm. In other words, it is a mistake to put something like this in your configuration file:

    acl Foo dstdomain .lrrr.org .am.lrrr.org

If you do, Squid generates the following warning message:

    WARNING: '.am.lrrr.org' is a subdomain of '.lrrr.org'
    WARNING: because of this '.am.lrrr.org' is ignored to keep splay tree searching predictable
    WARNING: You should probably remove '.am.lrrr.org' from the ACL named 'Foo'

You should follow Squid's advice in this case. Remove one of the related domains so that Squid does exactly what you intend. Note that you can use both domain names as long as you put them in different ACLs:

    acl Foo dstdomain .lrrr.org
    acl Bar dstdomain .am.lrrr.org

This is allowed because each named ACL uses its own splay tree.
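
The leading-dot rule can be expressed in a few lines of shell, which may help when you want to reason about which hostnames a dstdomain value will match. This is a sketch of the rule as described in this section, not Squid's matching code: the function name domain_matches is hypothetical, and for simplicity it compares case-sensitively.

```shell
#!/bin/sh
# Illustration only: the domain matching rule from this section.
# domain_matches HOSTNAME ACL_VALUE succeeds (status 0) on a match.
domain_matches() {
    host=$1
    acl=$2
    case $acl in
        .*)
            # Leading dot: match the domain itself or any name beneath it.
            bare=${acl#.}
            case $host in
                "$bare"|*"$acl") return 0 ;;
            esac
            return 1
            ;;
        *)
            # No leading dot: exact string comparison only.
            [ "$host" = "$acl" ]
            ;;
    esac
}
```

Running the three hostnames from the domain matching table through domain_matches with lrrr.org and then .lrrr.org as the second argument gives the same yes/no answers as the table. Note that iamlrrr.org fails against .lrrr.org because the pattern requires the dot to be present in the hostname.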

6.1.1.3 Usernames

Used by: ident, proxy_auth

ACLs of this type are designed to match usernames. Squid may learn a username through the RFC 1413 ident protocol or via HTTP authentication headers. Usernames must be matched exactly. For example, bob doesn't match bobby. Squid also has related ACLs (ident_regex and proxy_auth_regex) that use regular expression pattern matching on usernames.

You can use the word REQUIRED as a special value to match any username. If Squid can't determine the username, the ACL isn't matched. This is how Squid is usually configured when using username-based access controls.
6.1.1.4 Regular expressions

Used by: srcdom_regex, dstdom_regex, url_regex, urlpath_regex, browser, referer_regex, ident_regex, proxy_auth_regex, req_mime_type, rep_mime_type

A number of ACLs use regular expressions (regex) to match character strings. (For a complete regular expression reference, see O'Reilly's Mastering Regular Expressions.) For Squid, the most commonly used regex features match the beginning and/or end of a string. For example, the ^ character is special because it matches the beginning of a line or string:

    ^http://

This regex matches any URL that begins with http://. The $ character is also special because it matches the end of a line or string:

    .jpg$

Actually, the previous example is slightly wrong because the . character is special too. It is a wildcard that matches any character. What we really want is this:

    \.jpg$

The backslash escapes the . so that its specialness is taken away. This regex matches any string that ends with .jpg. If you don't use the ^ or $ characters, regular expressions behave like standard substring searches. They match an occurrence of the word (or words) anywhere in the string.

With all of Squid's regex types, you have the option to use case-insensitive comparison. Matching is case-sensitive by default. To make it case-insensitive, use the -i option after the ACL type. For example:

    acl Foo url_regex -i ^http://www
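To see the anchors, the escaped dot, and the -i option side by side, the following Python sketch runs the patterns from this section against a few strings. Python's re module stands in for the regex library Squid uses, so take it as an approximation; the helper name acl_regex_matches and the sample strings are invented for illustration.

```python
import re


def acl_regex_matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """Unanchored search, like the substring behavior described above.

    ignore_case plays the role of the -i option on an acl line.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None


# The unescaped dot is a wildcard, so ".jpg$" accepts more than intended.
print(acl_regex_matches(r".jpg$", "/images/notajpg"))
print(acl_regex_matches(r"\.jpg$", "/images/notajpg"))

# Matching is case-sensitive unless the -i equivalent is requested.
print(acl_regex_matches(r"^http://www", "HTTP://WWW.EXAMPLE.COM/"))
print(acl_regex_matches(r"^http://www", "HTTP://WWW.EXAMPLE.COM/", ignore_case=True))
```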
6.1.1.5 TCP port numbers

Used by: port, myport

This type is relatively straightforward. The values are individual port numbers or port number ranges. Recall that TCP port numbers are 16-bit values and, therefore, must be greater than 0 and less than 65,536. Here are some examples:

    acl Foo port 123
    acl Bar port 1-1024
6.1.1.6 Autonomous system numbers

Used by: src_as, dst_as

Internet routers use Autonomous System (AS) numbers to construct routing tables. Essentially, an AS number refers to a collection of IP networks managed by a single organization. For example, my ISP has been assigned the following network blocks: 134.116.0.0/16, 137.41.0.0/16, 206.168.0.0/16, and many more. In the Internet routing tables, these networks are advertised as belonging to AS 3404. When routers forward packets, they typically select the path that traverses the fewest autonomous systems. If none of this makes sense to you, don't worry. AS-based ACLs should only be used by networking gurus.

Here's how the AS-based types work: when Squid first starts up, it sends a special query to a whois server. The query essentially says, "Tell me which IP networks belong to this AS number." This information is collected and managed by the Routing Arbiter Database (RADB). Once Squid receives the list of IP networks, it treats them similarly to the IP address-based ACLs.

AS-based types only work well when ISPs keep their RADB information up to date. Some ISPs are better than others about updating their RADB entries; many don't bother with it at all. Also note that Squid converts AS numbers to networks only at startup or when you signal it to reconfigure. If the ISP updates its RADB entry, your cache won't know about the changes until you restart or reconfigure Squid.

Another problem is that the RADB server may be unreachable when your Squid process starts. If Squid can't contact the RADB server, it removes the AS entries from the access control configuration. The default server, whois.ra.net, may be too far away from many users to be reliable.
6.1.2 ACL Types

Now we can focus on the ACL types themselves. I present them here roughly in order of decreasing importance.
6.1.2.1 src

IP addresses are the most commonly used access control elements. Most sites use IP address controls to specify clients that are allowed to access Squid and those that aren't. The src type refers to client (source) IP addresses. That is, when an src ACL appears in an access list, Squid compares it to the IP address of the client issuing the request.

Normally you want to allow requests from hosts inside your network and block all others. For example, if your organization is using the 192.168.0.0 subnet, you can use an ACL like this:

    acl MyNetwork src 192.168.0.0

If you have many subnets, you can list them all on the same acl line:

    acl MyNetwork src 192.168.0.0 10.0.1.0/24 10.0.5.0/24 172.16.0.0/12

Squid has a number of other ACL types that check the client's address. The srcdomain type compares the client's fully qualified domain name. It requires a reverse DNS lookup, which may add some delay to processing the request. The srcdom_regex ACL is similar, but it allows you to use a regular expression to compare domain names. Finally, the src_as type compares the client's AS number.
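As a rough illustration of what an src check amounts to, this Python sketch tests a client address against a list of networks with the standard ipaddress module. It is only a model: Squid keeps these values in a splay tree rather than scanning a list, the name src_matches is hypothetical, and the sketch spells out a /16 mask for 192.168.0.0 as an assumption instead of relying on Squid to infer one.

```python
import ipaddress

# Networks modeled on the MyNetwork example; every mask is written out
# explicitly (the /16 is an assumption made for this sketch).
MY_NETWORK = ["192.168.0.0/16", "10.0.1.0/24", "10.0.5.0/24", "172.16.0.0/12"]


def src_matches(client_ip: str, networks: list) -> bool:
    """Return True if client_ip falls inside any listed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in networks)


print(src_matches("10.0.1.7", MY_NETWORK))  # inside 10.0.1.0/24
print(src_matches("10.0.2.7", MY_NETWORK))  # not in any listed subnet
```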
6.1.2.2 dst

The dst type refers to origin server (destination) IP addresses. Among other things, you can use this to prevent some or all of your users from visiting certain web sites. However, you need to be a little careful with the dst ACL. Most of the requests received by Squid have origin server hostnames. For example:

    GET http://www.web-cache.com/ HTTP/1.0

Here, www.web-cache.com is the hostname. When an access list rule includes a dst element, Squid must find the IP addresses for the hostname. If Squid's IP cache contains a valid entry for the hostname, the ACL is checked immediately. Otherwise, Squid postpones request processing while the DNS lookup is in progress. This can add significant delay to some requests. To avoid those delays, you should use the dstdomain ACL type (instead of dst) whenever possible.[2]

[2] Apart from access controls, Squid only needs an origin server's IP address when establishing a connection to that server. DNS lookups normally occur much later in request processing. If the HTTP request results in a cache hit, Squid doesn't need to know the server's address. Additionally, Squid doesn't need IP addresses for cache misses that are forwarded to a neighbor cache.

Here is a simple dst ACL example:

    acl AdServers dst 1.2.3.0/24

Note that one problem with dst ACLs is that the origin server you are trying to allow or deny may change its IP address. If you don't notice the change, you won't bother to update squid.conf. You can put a hostname on the acl line, but that adds some delay at startup. If you need many hostnames in ACLs, you may want to preprocess the configuration file and turn the hostnames into IP addresses.
6.1.2.3 myip

The myip type refers to the IP address where clients connect to Squid. This is what you see under the Local Address column when you run netstat -n on the Squid box. Most Squid installations don't use this type. Usually, all clients connect to the same IP address, so this ACL element is useful only on systems that have more than one IP address.

To understand how myip may be useful, consider a simple company local area network with two subnets. All users on subnet-1 are programmers and engineers. Subnet-2 consists of accounting, marketing, and other administrative departments. The system on which Squid runs has three network interfaces: one on subnet-1, one on subnet-2, and the third connecting to the outbound Internet connection (see Figure 6-1).

Figure 6-1. An application of the myip ACL

When properly configured, all users on subnet-1 connect to Squid's IP address on that subnet, and similarly, all subnet-2 users connect to Squid's second IP address. You can use this to give the technical staff on subnet-1 full access, while limiting the administrative staff to only work-related web sites. The ACLs might look like this:

    acl Eng myip 172.16.1.5
    acl Admin myip 172.16.2.5

Note, however, that with this scheme you must take special measures to prevent users on one subnet from connecting to Squid's address on the other subnet. Otherwise, clever users on the accounting and marketing subnet can connect through the programming and engineering subnet and bypass your restrictions.
6.1.2.4 dstdomain

In some cases, you're likely to find that name-based access controls make a lot of sense. You can use them to block access to certain sites, to control how Squid forwards requests, and to make some responses uncachable. The dstdomain type is very useful because it checks the hostname in requested URLs. First, however, I want to clarify the difference between the following two lines:

    acl A dst www.squid-cache.org
    acl B dstdomain www.squid-cache.org

A is really an IP address ACL. When Squid parses the configuration file, it looks up the IP address for www.squid-cache.org and stores the address in memory. It doesn't store the name. If the IP address for www.squid-cache.org changes while Squid is running, Squid continues using the old address.

The dstdomain ACL, on the other hand, is stored as a domain name (i.e., a string), not as an IP address. When Squid checks ACL B, it uses string comparison functions on the hostname part of the URL. In this case, it doesn't really matter if the www.squid-cache.org IP changes while Squid is running.

The primary problem with dstdomain ACLs is that some URLs have IP addresses instead of hostnames. If your goal is to block access to certain sites with dstdomain ACLs, savvy users can simply look up the site's IP address manually and insert it into the URL. For example, these two URLs bring up the same page:

    http://www.squid-cache.org/docs/FAQ/
    http://206.168.0.9/docs/FAQ/

The first can be easily matched with dstdomain ACLs, but the second can't. Thus, if you elect to rely on dstdomain ACLs, you may want to also block all requests that use an IP address instead of a hostname. See Section 6.3.8 for an example.
6.1.2.5 srcdomain

The srcdomain ACL is somewhat tricky as well. It requires a so-called reverse DNS lookup on each client's IP address. Technically, Squid requests a DNS PTR record for the address. The answer, a fully qualified domain name (FQDN), is what Squid compares to the ACL value. (Refer to O'Reilly's DNS and BIND for more information about DNS PTR records.)

As with dst ACLs, FQDN lookups are a potential source of significant delay. The request is postponed until the FQDN answer comes back. FQDN answers are cached, so the srcdomain lookup delay usually occurs only for the client's first request.

Unfortunately, srcdomain lookups sometimes don't work. Many organizations fail to keep their reverse lookup databases current. If an address doesn't have a PTR record, the ACL check fails. In some cases, requests may be postponed for a very long time (e.g., two minutes) until the DNS lookup times out. If you choose to use the srcdomain ACL, make sure that your own DNS in-addr.arpa zones are properly configured and working. Assuming that they are, you can use an ACL like this:

    acl LocalHosts srcdomain .users.example.com
6.1.2.6 port

Most likely, you'll want to use the port ACL to limit access to certain origin server port numbers. As I'll explain shortly, Squid really shouldn't connect to certain services, such as email and IRC servers. The port ACL allows you to define individual ports and port ranges. Here is an example:

    acl HTTPports port 80 8000-8010 8080

HTTP is similar in design to other protocols, such as SMTP. This means that clever users can trick Squid into relaying email messages to an SMTP server. Email relays are one of the primary reasons we must deal with a daily deluge of spam. Historically, spam relays have been actual mail servers. Recently, however, more and more spammers are using open HTTP proxies to hide their tracks. You definitely don't want your Squid cache to be used as a spam relay. If it is, your IP address is likely to end up on one of the many mail-relay blacklists (MAPS, ORDB, spamhaus, etc.). In addition to email, there are a number of other TCP/IP services that Squid shouldn't normally communicate with. These include IRC, Telnet, DNS, POP, and NNTP. Your policy regarding port numbers should be either to deny the known-to-be-dangerous ports and allow the rest, or to allow the known-to-be-safe ports and deny the rest.

My preference is to be conservative and allow only the safe ports. The default squid.conf includes the following Safe_ports ACL:

    acl Safe_ports port 80          # http
    acl Safe_ports port 21          # ftp
    acl Safe_ports port 443 563     # https, snews
    acl Safe_ports port 70          # gopher
    acl Safe_ports port 210         # wais
    acl Safe_ports port 1025-65535  # unregistered ports
    acl Safe_ports port 280         # http-mgmt
    acl Safe_ports port 488         # gss-http
    acl Safe_ports port 591         # filemaker
    acl Safe_ports port 777         # multiling http
    http_access deny !Safe_ports

This is a sensible approach. It allows users to connect to any nonprivileged port (1025-65535), but only specific ports in the privileged range. If one of your users tries to request a URL such as http://www.lrrr.org:123/, Squid returns an access denied error message. In some cases, you may need to add additional port numbers to the Safe_ports ACL to keep your users happy.

A more liberal approach is to deny access to certain ports that are known to be particularly dangerous. The Squid FAQ includes an example of this:

    acl Dangerous_ports port 7 9 19 22 23 25 53 109 110 119
    http_access deny Dangerous_ports

One drawback to the Dangerous_ports approach is that Squid ends up searching the entire list for almost every request. This places a little extra burden on your CPU. Most likely, 99% of the requests reaching Squid are for port 80, which doesn't appear in the Dangerous_ports list. The list is searched for all of these requests without resulting in a match. However, integer comparison is a fast operation and should not significantly impact performance.
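The allow-the-safe-ports policy can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Squid's parser: the names parse_port_values and port_matches are hypothetical, and the list of values is copied from the Safe_ports example in this section.

```python
def parse_port_values(values: str) -> list:
    """Turn 'N' and 'N-M' tokens into a list of inclusive ranges."""
    ranges = []
    for token in values.split():
        low, _, high = token.partition("-")
        first = int(low)
        last = int(high) if high else first
        # TCP ports are 16-bit: greater than 0 and less than 65,536.
        if not 0 < first <= last < 65536:
            raise ValueError(f"bad port value: {token}")
        ranges.append(range(first, last + 1))
    return ranges


def port_matches(port: int, ranges: list) -> bool:
    """Return True if port falls inside any of the parsed ranges."""
    return any(port in r for r in ranges)


SAFE_PORTS = parse_port_values("80 21 443 563 70 210 1025-65535 280 488 591 777")

# "http_access deny !Safe_ports" denies a request when this is False.
print(port_matches(8080, SAFE_PORTS))  # unregistered range
print(port_matches(123, SAFE_PORTS))   # privileged port not on the list
```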
6.1.2.7 myport

Squid also has a myport ACL. Whereas the port ACL refers to the origin server port number, myport refers to the port where Squid receives client requests. Squid listens on different port numbers if you specify more than one with the http_port directive.

The myport ACL is particularly useful if you use Squid as an HTTP accelerator for your web site and as a proxy for your users. You can accept the accelerator requests on port 80 and the proxy requests on port 3128. You probably want the world to access the accelerator, but only your users should access Squid as a proxy. Your ACLs may look something like this:

    acl AccelPort myport 80
    acl ProxyPort myport 3128
    acl MyNet src 172.16.0.0/22

    http_access allow AccelPort         # anyone
    http_access allow ProxyPort MyNet   # only my users
    http_access deny ProxyPort          # deny others
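The three http_access lines in the myport example depend on two rule semantics: ACL names on one line must all match, and the first matching line decides the outcome. Here is a small Python sketch of that evaluation. The function check_http_access and the set-of-matched-names representation are inventions for this illustration, and the sketch deliberately returns None when no rule matches rather than modeling Squid's default.

```python
# Each rule is (action, [ACL names that must all match]), in file order.
RULES = [
    ("allow", ["AccelPort"]),           # anyone
    ("allow", ["ProxyPort", "MyNet"]),  # only my users
    ("deny", ["ProxyPort"]),            # deny others
]


def check_http_access(rules: list, matched: set):
    """Return the action of the first rule whose ACLs all matched.

    matched is the set of ACL names that are true for one request.
    """
    for action, acl_names in rules:
        if all(name in matched for name in acl_names):
            return action
    return None  # no rule applied; left undecided in this sketch


# A proxy-port request from outside MyNet falls through to the deny rule.
print(check_http_access(RULES, {"ProxyPort"}))
print(check_http_access(RULES, {"ProxyPort", "MyNet"}))
```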
6.1.2.8 method

The method ACL refers to the HTTP request method. GET is typically the most common method, followed by POST, PUT, and others. This example demonstrates how to use the method ACL:

    acl Uploads method PUT POST

Squid knows about the following standard HTTP methods: GET, POST, PUT, HEAD, CONNECT, TRACE, OPTIONS, and DELETE. In addition, Squid knows about the following methods from the WEBDAV specification, RFC 2518:[3] PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK. Certain Microsoft products use nonstandard WEBDAV methods, so Squid knows about them as well: BMOVE, BDELETE, BPROPFIND. Finally, you can configure Squid to understand additional request methods with the extension_methods directive. See Appendix A.

[3] For the RFC database, visit http://www.rfc-editor.org/rfc.html.
Note that the CONNECT method is special in a number of ways. It is the method used for tunneling certain requests through HTTP proxies (see also RFC 2817: Upgrading to TLS Within HTTP/1.1). Be especially careful with the CONNECT method and remote server port numbers. As I talked about in the previous section, you don't want Squid to connect to certain remote services. You should limit the CONNECT method to only the HTTPS/SSL and perhaps NNTPS ports (443 and 563, respectively). The default squid.conf does this:

    acl CONNECT method CONNECT
    acl SSL_ports port 443 563

    http_access allow CONNECT SSL_ports
    http_access deny CONNECT

With this configuration, Squid only allows tunneled requests to ports 443 (HTTPS/SSL) and 563 (NNTPS). CONNECT method requests to all other ports are denied.

PURGE is another special request method. It is specific to Squid and not defined in any of the RFCs. It provides a way for the administrator to forcibly remove cached objects. Since this method is somewhat dangerous, Squid denies PURGE requests by default, unless you define an ACL that references the method. Otherwise, anyone with access to the cache may be able to remove any cached object. I recommend allowing PURGE from localhost only:

    acl Purge method PURGE
    acl Localhost src 127.0.0.1

    http_access allow Purge Localhost
    http_access deny Purge

See Section 7.6 for more information on removing objects from Squid's cache.
6.1.2.9 proto

This type refers to a URI's access (or transfer) protocol. Valid values are the following: http, https (same as HTTP/TLS), ftp, gopher, urn, whois, and cache_object. In other words, these are the URL scheme names (RFC 1738 terminology) supported by Squid. For example, suppose that you want to deny all FTP requests. You can use the following directives:

    acl FTP proto FTP
    http_access deny FTP

The cache_object scheme is a feature specific to Squid. It is used to access Squid's cache management interface, which I'll talk about in Section 14.2. Unfortunately, it's not a very good name, and it should probably be changed.

The default squid.conf file has a couple of lines that restrict cache manager access:

    acl Manager proto cache_object
    acl Localhost src 127.0.0.1

    http_access allow Manager Localhost
    http_access deny Manager

These configuration lines allow cache-manager requests only when they come from the localhost address. All other cache-manager requests are denied. This means that any user with an account on the Squid machine can access the potentially sensitive cache-manager information. You may want to modify the cache-manager access controls or protect certain pages with passwords. I'll talk about that in Section 14.2.2.
6.1.2.10 time

The time ACL allows you to control access based on the time of day and the day of the week. The syntax is somewhat cryptic:

    acl name time [days] [h1:m1-h2:m2]

You can specify days of the week, starting and stopping times, or both. Days are specified by the single-letter codes shown in Table 6-2. Times are specified in 24-hour format. The starting time must be less than the ending time, which makes it awkward to write time ACLs that span midnight.

Table 6-2. Day codes for the time ACL

    Code  Day
    S     Sunday
    M     Monday
    T     Tuesday
    W     Wednesday
    H     Thursday
    F     Friday
    A     Saturday
    D     All weekdays (M-F)

Days and times are interpreted with the localtime() function, which takes into account your local time zone and daylight savings time settings. Make sure that your computer knows what time zone it is in! You'll also want to make sure that your clock is synchronized to the correct time.
To specify a time ACL that matches your weekday working hours, you can write:

    acl Working_hours time MTWHF 08:00-17:00

or:

    acl Working_hours time D 08:00-17:00

Let's look at a trickier example. Perhaps you're an ISP that relaxes access during off-peak hours, say 8 P.M. to 4 A.M. Since this time spans midnight, you can't write "20:00-04:00." Instead you'll need either to split this into two ACLs or define the peak hours and use negation. For example:

    acl Offpeak1 time 20:00-23:59
    acl Offpeak2 time 00:00-04:00
    http_access allow Offpeak1 ...
    http_access allow Offpeak2 ...

Alternatively, you can do it like this:

    acl Peak time 04:00-20:00
    http_access allow !Peak ...

Although Squid allows it, you probably shouldn't put more than one day list and time range on a single time ACL line. The parser isn't always smart enough to figure out what you want. For example, if you enter this:

    acl Blah time M 08:00-10:00 W 09:00-11:00

what you really end up with is this:

    acl Blah time MW 09:00-11:00

The parser ORs weekdays together and uses only the last time range. It does work, however, if you write it like this, on two separate lines:

    acl Blah time M 08:00-10:00
    acl Blah time W 09:00-11:00
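The day codes and the start-before-end restriction are easier to reason about with a small model. The Python sketch below checks one day list and one time range against a timestamp. It is an illustration only: the function name time_acl_matches is hypothetical, both endpoints of the range are treated as inclusive (an assumption), and the caller supplies a local datetime rather than the sketch calling localtime() itself.

```python
from datetime import datetime

# Day codes from Table 6-2, mapped to datetime.weekday() (Monday is 0).
DAY_CODES = {"M": 0, "T": 1, "W": 2, "H": 3, "F": 4, "A": 5, "S": 6}


def _minutes(hhmm: str) -> int:
    """Convert 'hh:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def time_acl_matches(now: datetime, days: str = "", span: str = "") -> bool:
    """Model of 'acl name time [days] [h1:m1-h2:m2]' for one timestamp."""
    if days:
        codes = days.replace("D", "MTWHF")  # D means all weekdays
        if now.weekday() not in {DAY_CODES[c] for c in codes}:
            return False
    if span:
        start_text, stop_text = span.split("-")
        start, stop = _minutes(start_text), _minutes(stop_text)
        if start >= stop:
            raise ValueError("start time must be less than end time")
        return start <= now.hour * 60 + now.minute <= stop
    return True


late_evening = datetime(2004, 1, 5, 22, 30)  # a Monday
# Off-peak, expressed either as two ranges or as "not Peak".
two_ranges = (time_acl_matches(late_evening, span="20:00-23:59")
              or time_acl_matches(late_evening, span="00:00-04:00"))
negated_peak = not time_acl_matches(late_evening, span="04:00-20:00")
print(two_ranges, negated_peak)
```

Passing span="20:00-04:00" raises ValueError in this sketch, which mirrors the reason the text splits the off-peak period into two ACLs.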
6.1.2.11 ident

The ident ACL matches usernames returned by the ident protocol. This is a simple protocol that's documented in RFC 1413. It works something like this:

1. A user-agent (client) establishes a TCP connection to Squid.
2. Squid connects to the ident port (113) on the client's system.
3. Squid writes a line containing the two TCP port numbers of the client's first connection. The Squid-side port number is probably 3128 (or whatever you configured in squid.conf). The client-side port is more or less random.
4. The client's ident server writes back the username belonging to the process that opened the first connection.
5. Squid records the username for access control purposes and for logging in access.log.

When Squid encounters an ident ACL for a particular request, that request is postponed until the ident lookup is complete. Thus, the ident ACL may add some significant delays to your users' requests.

We recommend using the ident ACL only on local area networks and only if all or most of the client workstations run the ident server. If Squid and the client workstations are connected to a LAN with low latency, the ident ACL can work well. Using ident for clients connecting over WAN links is likely to frustrate both you and your users.

The ident protocol isn't very secure. Savvy users will be able to replace their normal ident server with a fake server that returns any username they select. For example, if I know that connections from the user administrator are always allowed, I can write a simple program that answers every ident request with that username.

You can't use ident ACLs with interception caching (see Chapter 9). When Squid is configured for interception caching, the operating system pretends that it is the origin server. This means that the local socket address for intercepted TCP connections has the origin server's IP address. If you run netstat -n on Squid, you'll see a lot of foreign IP addresses in the Local Address column. When Squid makes an ident query, it creates a new TCP socket and binds the local endpoint to the same IP address as the local end of the client's TCP connection. Since the local address isn't really local (it's some far away origin server's IP address), the bind() system call fails. Squid handles this as a failed ident query.

Note that Squid also has a feature to perform "lazy" ident lookups on clients. In this case, requests aren't delayed while waiting for the ident query. Squid logs the ident information if it is available by the time the HTTP request is complete. You can enable this feature with the ident_lookup_access directive, which I'll discuss later in this chapter.
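The query and reply in the ident exchange described in this section are single text lines, so they can be sketched without any networking. The Python below follows the line formats in RFC 1413; it is not taken from Squid's source, the function names are hypothetical, and stripping whitespace around the username is a simplification of mine.

```python
def build_ident_query(client_port: int, squid_port: int) -> bytes:
    """Build the line sent to the client's ident server.

    RFC 1413 puts the port on the queried host first, so the
    client-side port precedes the Squid-side port.
    """
    return f"{client_port}, {squid_port}\r\n".encode("ascii")


def parse_ident_reply(line: str):
    """Return the username from a USERID reply, or None otherwise."""
    parts = [field.strip() for field in line.strip().split(":", 3)]
    if len(parts) == 4 and parts[1].upper() == "USERID":
        return parts[3]
    return None  # ERROR replies and malformed lines


print(build_ident_query(6193, 3128))
print(parse_ident_reply("6193, 3128 : USERID : UNIX : bob"))
print(parse_ident_reply("6193, 3128 : ERROR : NO-USER"))
```

Because the reply is whatever the client's ident server chooses to send, nothing in this exchange lets the receiving side verify the username, which is the weakness noted in the text.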
6.1.2.12 proxy_auth

Squid has a powerful, and somewhat confusing, set of features to support HTTP proxy authentication. With proxy authentication, the client's HTTP request includes a header containing authentication credentials. Usually, this is simply a username and password. Squid decodes the credential information and then queries an external authentication process to find out if the credentials are valid.

Squid currently supports three techniques for receiving user credentials: the HTTP Basic protocol, Digest authentication protocol, and NTLM. Basic authentication has been around for a long time. By today's standards, it is a very insecure technique. Usernames and passwords are sent together, essentially in cleartext. Digest authentication is more secure, but also more complicated. Both Basic and Digest authentication are documented in RFC 2617. NTLM also has better security than Basic authentication. However, it is a proprietary protocol developed by Microsoft. A handful of Squid developers have essentially reverse-engineered it.

In order to use proxy authentication, you must also configure Squid to spawn a number of external helper processes. The Squid source code includes some programs that authenticate against a number of standard databases, including LDAP, NTLM, NCSA-style password files, and the standard Unix password database. The auth_param directive controls the configuration of all helper programs. I'll go through it in detail in Chapter 12.

The auth_param directive and the proxy_auth ACL are one of the few cases where order in the configuration file is important. You must define at least one authentication helper (with auth_param) before any proxy_auth ACLs. If you don't, Squid prints an error message and ignores the proxy_auth ACLs. This isn't a fatal error, so Squid may start anyway, and all your users' requests may be denied.

The proxy_auth ACL takes usernames as values. However, most installations simply use the special value REQUIRED:

    auth_param ...
    acl Auth1 proxy_auth REQUIRED

In this case, any request with valid credentials matches the ACL. If you need fine-grained control, you can specify individual usernames:

    auth_param ...
    acl Auth1 proxy_auth allan bob charlie
    acl Auth2 proxy_auth dave eric frank

Proxy authentication doesn't work with HTTP interception because the user-agent doesn't realize it's talking to a proxy rather than the origin server. The user-agent doesn't know that it should send a Proxy-Authorization header in its requests. See Section 9.2 for additional details.
6.1.2.13 src_as

This type checks that the client (source) IP address belongs to a specific AS number. (See Section 6.1.1.6 for information on how Squid maps AS numbers to IP addresses.) As an example, consider the fictitious ISP that uses AS 64222 and advertises the 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 networks. You can write an ACL like this, which allows requests from any host in the ISP's address space:

    acl TheISP src 10.0.0.0/8
    acl TheISP src 172.16.0.0/12
    acl TheISP src 192.168.0.0/16
    http_access allow TheISP

Alternatively, you can write it like this:

    acl TheISP src_as 64222
    http_access allow TheISP

Not only is the second form shorter, it also means that if the ISP adds more networks, you won't have to update your ACL configuration.
6.1.2.14 dst_as

The dst_as ACL is often used with the cache_peer_access directive. In this way, Squid can forward cache misses in a manner consistent with IP routing. Consider an ISP that exchanges routes with a few other ISPs. Each ISP operates its own caching proxy, and these proxies can forward requests to each other. Ideally, ISP A forwards cache misses for servers on ISP B's network to ISP B's caching proxy. An easy way to do this is with AS ACLs and the cache_peer_access directive:

    acl ISP-B-AS dst_as 64222
    acl ISP-C-AS dst_as 64333
    cache_peer proxy.isp-b.net parent 3128 3130
    cache_peer proxy.isp-c.net parent 3128 3130
    cache_peer_access proxy.isp-b.net allow ISP-B-AS
    cache_peer_access proxy.isp-c.net allow ISP-C-AS

These access controls make sure that the only requests sent to the two ISPs are for their own origin servers. I'll talk further about cache cooperation in Chapter 10.
6.1.2.15 snmp_community

The snmp_community ACL is meaningful only for SNMP queries, which are controlled by the snmp_access directive. For example, you might write:

    acl OurCommunityName snmp_community hIgHsEcUrItY
    acl All src 0/0
    snmp_access allow OurCommunityName
    snmp_access deny All

In this case, an SNMP query is allowed only if the community name is set to hIgHsEcUrItY.
6.1.2.16 maxconn

The maxconn ACL refers to the number of simultaneous connections from a client's IP address. Some Squid administrators find this a useful way to prevent users from abusing the proxy or consuming too many resources.

The maxconn ACL matches a request when that request exceeds the number you specify. For this reason, you should use maxconn ACLs only in deny rules. Consider this example:

    acl OverConnLimit maxconn 4
    http_access deny OverConnLimit

In this case, Squid allows up to four connections at once from each IP address. When a client makes the fifth connection, the OverConnLimit ACL is matched, and the http_access rule denies the request.

The maxconn ACL feature relies on Squid's client database. This database keeps a small data structure in memory for each client IP address. If you have a lot of clients, this database may consume a significant amount of memory. You can disable the client database in the configuration file with the client_db directive. However, if you disable the client database, the maxconn ACL will no longer work.
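The "matches when exceeded" behavior is the part of maxconn that tends to surprise people, so here is a toy Python model of a per-address connection counter. It only illustrates the comparison; the ClientDb class and its method names are invented for this sketch and say nothing about how Squid's client database is actually implemented.

```python
from collections import Counter


class ClientDb:
    """Toy per-client connection counter (hypothetical, for illustration)."""

    def __init__(self):
        self.open_connections = Counter()

    def connection_opened(self, client_ip: str) -> None:
        self.open_connections[client_ip] += 1

    def connection_closed(self, client_ip: str) -> None:
        if self.open_connections[client_ip] > 0:
            self.open_connections[client_ip] -= 1

    def maxconn_matches(self, client_ip: str, limit: int) -> bool:
        # The ACL matches only when the count exceeds the limit,
        # which is why it belongs in a deny rule.
        return self.open_connections[client_ip] > limit


db = ClientDb()
for _ in range(4):
    db.connection_opened("172.16.1.20")
print(db.maxconn_matches("172.16.1.20", 4))  # four connections: no match
db.connection_opened("172.16.1.20")
print(db.maxconn_matches("172.16.1.20", 4))  # fifth connection: match
```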
6.1.2.17 arp

The arp ACL is used to check the Media Access Control (MAC) address (typically Ethernet) of cache clients. The Address Resolution Protocol (ARP) is the way that hosts find the MAC address corresponding to an IP address. This feature came about when some university students discovered that, under Microsoft Windows, they could set a system's IP address to any value. Thus, they were able to circumvent Squid's address-based controls. To escalate this arms race, a savvy system administrator gave Squid the ability to check the client's Ethernet addresses.

Unfortunately, this feature uses nonportable code. If you use Solaris or Linux, you should be able to use arp ACLs. If not, you're out of luck. The best way to find out is to add the --enable-arp-acl option when you run ./configure.

The arp ACL feature contains another important limitation. ARP is a datalink layer protocol. It works only for hosts on the same subnet as Squid. You can't easily discover the MAC address of a host on a different subnet. If you have routers between Squid and your users, you probably can't use arp ACLs.

Now that you know when not to use them, let's see how arp ACLs actually look. The values are Ethernet addresses, as you would see in ifconfig and arp output. For example:

    acl WinBoxes arp 00:00:21:55:ed:22
    acl WinBoxes arp 00:00:21:ff:55:38
6.1.2.18 srcdom_regex

The srcdom_regex ACL allows you to use regular expression matching on client domain names. This is similar to the srcdomain ACL, which uses modified substring matching. The same caveats apply here: some client addresses don't resolve back to domain names. As an example, the following ACL matches hostnames that begin with dhcp:

    acl DHCPUser srcdom_regex -i ^dhcp

Because of the leading ^ symbol, this ACL matches the hostname dhcp12.example.com, but not host12.dhcp.example.com.
6.1.2.19 dstdom_regex The dst dom _regex ACL is obviously sim ilar, except t hat it applies t o origin server nam es. The issues wit h dst dom ain are relevant here, t oo. The following exam ple m at ches host nam es t hat begin wit h www: acl WebSite dstdom_regex -i ^www\. Here is anot her useful regular expression t hat m at ches I P addresses given in URL host nam es: acl IPaddr dstdom_regex [0-9]$ This works because Squid requires URL host nam es t o be fully qualified. Since none of t he global t op- level dom ains end wit h a digit , t his ACL m at ches only I P addresses, which do end wit h a num ber.
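If you want to try these patterns before putting them in squid.conf, a quick experiment along the following lines may help. This sketch uses Python's re module, which is an assumption of convenience: Squid uses its own regular expression library, so only simple patterns like these can be expected to behave the same way. The hostnames are made up for the example.

```python
import re

# Patterns copied from the ACL examples; -i maps to re.IGNORECASE.
dhcp_user = re.compile(r"^dhcp", re.IGNORECASE)
web_site = re.compile(r"^www\.", re.IGNORECASE)
ip_addr = re.compile(r"[0-9]$")

hosts = [
    "dhcp12.example.com",
    "host12.dhcp.example.com",
    "WWW.example.org",
    "192.0.2.10",
]

for host in hosts:
    print(
        host,
        "DHCPUser" if dhcp_user.search(host) else "-",
        "WebSite" if web_site.search(host) else "-",
        "IPaddr" if ip_addr.search(host) else "-",
    )
```

Note that the patterns are applied with a search rather than a full match, so an unanchored pattern such as [0-9]$ only constrains the end of the hostname.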
6.1.2.20 url_regex

You can use the url_regex ACL to match any part of a requested URL, including the transfer protocol and origin server hostname. For example, this ACL matches MP3 files requested from FTP servers:

    acl FTPMP3 url_regex -i ^ftp://.*\.mp3$

6.1.2.21 urlpath_regex

The urlpath_regex ACL is very similar to url_regex, except that the transfer protocol and hostname aren't included in the comparison. This makes certain types of checks much easier. For example, let's say you need to deny requests with sex in the URL, but still possibly allow requests that have sex in their hostname:

    acl Sex urlpath_regex sex

As another example, let's say you want to provide special treatment for cgi-bin requests. You can catch some of them with this ACL:

    acl CGI1 urlpath_regex ^/cgi-bin

Of course, CGI programs aren't necessarily kept under /cgi-bin/, so you'd probably want to write additional ACLs to catch the others.
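The difference between the two ACL types is easiest to see side by side. The following Python sketch is an approximation only: the function names are hypothetical, Python's re module stands in for Squid's regular expression library, and I am assuming that the "URL path" is everything after the hostname, query string included.

```python
import re
from urllib.parse import urlsplit


def url_regex(pattern, url, ignore_case=False):
    """Search the whole URL, including scheme and hostname."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, url, flags) is not None


def urlpath_regex(pattern, url, ignore_case=False):
    """Search only the part of the URL that follows the hostname."""
    parts = urlsplit(url)
    # Assumption: the query string counts as part of the URL path.
    path = parts.path + ("?" + parts.query if parts.query else "")
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, path, flags) is not None


url = "http://www.essex.example/cgi-bin/search?q=squid"
print(url_regex("sex", url))              # hostname contains "sex"
print(urlpath_regex("sex", url))          # path does not
print(urlpath_regex("^/cgi-bin", url))    # path starts with /cgi-bin
```

Here the hostname happens to contain the string, so only the url_regex check matches, which is exactly the case the urlpath_regex type is meant to avoid.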
6.1.2.22 browser

Most HTTP requests include a User-Agent header. The value of this header is typically something strange like:

    Mozilla/4.51 [en] (X11; I; Linux 2.2.5-15 i686)

The browser ACL performs regular expression matching on the value of the User-Agent header. For example, to deny requests that don't come from a Mozilla browser, you can use:

    acl Mozilla browser Mozilla
    http_access deny !Mozilla

Before using the browser ACL, be sure that you fully understand the User-Agent strings your cache receives. Some user-agents lie about their identity. Even Squid has a feature to rewrite User-Agent headers in requests that it forwards. With browsers such as Opera and KDE's Konqueror, users can send different user-agent strings to different origin servers or omit them altogether.
6.1.2.23 req_mime_type

The req_mime_type ACL refers to the Content-Type header of the client's HTTP request. Content-Type headers usually appear only in requests with message bodies. POST and PUT requests might include the header, but GET requests don't. You might be able to use the req_mime_type ACL to detect certain file uploads and some types of HTTP tunneling requests.

The req_mime_type ACL values are regular expressions. To catch audio file types, you can use an ACL like this:

    acl AudioFileUploads req_mime_type -i ^audio/
6.1.2.24 rep_mime_type

The rep_mime_type ACL refers to the Content-Type header of the origin server's HTTP response. It is really only meaningful when used in an http_reply_access rule. All other access control forms are based on aspects of the client's request. This one is based on the response.

If you want to try blocking Java code with Squid, you might use some access rules like this:

    acl JavaDownload rep_mime_type application/x-java
    http_reply_access deny JavaDownload
6.1.2.25 ident_regex

You saw the ident ACL earlier in this section. The ident_regex ACL simply allows you to use regular expressions, instead of exact string matching, on usernames returned by the ident protocol. For example, this ACL matches usernames that contain a digit:

    acl NumberInName ident_regex [0-9]

6.1.2.26 proxy_auth_regex

As with ident, the proxy_auth_regex ACL allows you to use regular expressions on proxy authentication usernames. For example, this ACL matches admin, administrator, and administrators:

    acl Admins proxy_auth_regex -i ^admin
6.1.3 External ACLs

Squid Version 2.5 introduces a new feature: external ACLs. You instruct Squid to send certain pieces of information to an external process. This helper process then tells Squid whether the given data is a match or not.

Squid comes with a number of external ACL helper programs; most determine whether or not the named user is a member of a particular group. See Section 12.5 for descriptions of those programs and for information on how to write your own. For now, I'll explain how to define and utilize an external ACL type.

The external_acl_type directive defines a new external ACL type. Here's the general syntax:

    external_acl_type type-name [options] format helper-command

type-name is a user-defined string. You'll also use it in an acl line to reference this particular helper.

Squid currently supports the following options:

ttl=n
    The amount of time, in seconds, to cache the result for values that are a match. The default is 3600 seconds, or 1 hour.

negative_ttl=n
    The amount of time, in seconds, to cache the result for values that aren't a match. The default is 3600 seconds, or 1 hour.

concurrency=n
    The number of helper processes to spawn. The default is 5.

cache=n
    The maximum number of results to cache. The default is 0, which doesn't limit the cache size.

format is one or more keywords that begin with the % character. Squid currently supports the following format tokens:
%LOGIN
    The username, taken from proxy authentication credentials.

%IDENT
    The username, taken from an RFC 1413 ident query.

%SRC
    The IP address of the client.

%DST
    The IP address of the origin server.

%PROTO
    The transfer protocol (e.g., HTTP, FTP, etc.).

%PORT
    The origin server TCP port number.

%METHOD
    The HTTP request method.

%{Header}
    The value of an HTTP request header; for example, %{User-Agent} causes Squid to send strings like this to the authenticator:

    "Mozilla/4.0 (compatible; MSIE 6.0; Win32)"

%{Hdr:member}
    Selects certain members of list-based HTTP headers, such as Cache-Control; for example, given this HTTP header:

    X-Some-Header: foo=xyzzy, bar=plugh, foo=zoinks

    and the token %{X-Some-Header:foo}, Squid sends this string to the external ACL process:

    foo=xyzzy, foo=zoinks

%{Hdr:;member}
    The same as %{Hdr:member}, except that the ; character is the list separator. You can use any nonalphanumeric character as the separator.

helper-command is the command that Squid spawns for the helper. You may include command arguments here as well. For example, the entire command may be something like:

    /usr/local/squid/libexec/my-acl-prog.pl -X -5 /usr/local/squid/etc/datafile

Putting all these together results in a long line. Squid's configuration file doesn't support the backslash line-continuation technique shown here, so remember that all these must go on a single line:

    external_acl_type MyAclType cache=100 %LOGIN %{User-Agent} \
        /usr/local/squid/libexec/my-acl-prog.pl -X -5 \
        /usr/local/squid/share/usernames \
        /usr/local/squid/share/useragents

Now that you know how to define an external ACL, the next step is to write an acl line that references it. This is relatively straightforward. The syntax is as follows:

    acl acl-name external type-name [args ...]

Here is a simple example:

    acl MyAcl external MyAclType

Squid accepts any number of optional arguments following the type-name. These are sent to the helper program for each request, after the expanded tokens. See my description of the unix_group helper in Section 12.5.3 for an example of this feature.
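To give a feel for what a helper does, here is a minimal Python sketch. It rests on assumptions that this section doesn't spell out: that Squid writes one request per line with the expanded format tokens separated by whitespace, and that the helper answers each line with OK (match) or ERR (no match). Section 12.5 is the place to confirm the details. The ALLOWED_USERS set and the function names are hypothetical.

```python
import io

# Hypothetical data; a real helper would more likely load this from a file
# named on its command line.
ALLOWED_USERS = {"alice", "bob"}


def answer(line):
    """Return 'OK' if the first token (the expanded %LOGIN) is allowed, else 'ERR'."""
    fields = line.split()
    if fields and fields[0].lower() in ALLOWED_USERS:
        return "OK"
    return "ERR"


def serve(stream_in, stream_out):
    """Answer one request per input line, flushing after every reply."""
    for line in stream_in:
        stream_out.write(answer(line) + "\n")
        stream_out.flush()


# Demonstration with in-memory streams in place of the pipes Squid would use.
demo_out = io.StringIO()
serve(io.StringIO("alice\nmallory\n"), demo_out)
print(demo_out.getvalue(), end="")
```

In a real helper script you would call serve(sys.stdin, sys.stdout) at the bottom of the file. Flushing after each reply matters because the helper stays running and answers many lookups over the same pipe.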
6.1.4 Dealing with Long ACL Lists

ACL lists can sometimes be very long. Such lists are awkward to maintain inside the squid.conf file. Also, you may need to generate Squid ACL lists automatically from other sources. In these cases, you'll be happy to know that you can include ACL lists from external files. The syntax is as follows:

    acl name type "filename"

The double quotes here instruct Squid to open filename and assign its contents to the ACL. For example, instead of this:

    acl BadClients src 1.2.3.4 1.2.3.5 1.2.3.6 1.2.3.7 1.2.3.9 ...

you can do this:

    acl BadClients src "/usr/local/squid/etc/BadClients"

and put the IP addresses into the BadClients file:

    1.2.3.4
    1.2.3.5
    1.2.3.6
    1.2.3.7
    1.2.3.9
    ...

Your file may include comments that begin with a # character. Note that each entry in the file must be on a separate line. Whereas a space character delimits values on an acl line, newlines are the delimiter for files containing ACL values.
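If you generate such files automatically, a small script is usually enough. The following Python sketch writes and re-reads a list in the one-entry-per-line format described above. The function names are hypothetical, and the parser only treats lines that begin with # as comments, which is all this section promises.

```python
def render_acl_file(values, header="generated list, do not edit by hand"):
    """Return file contents: a comment line, then one unique value per line."""
    lines = ["# " + header] + sorted(set(values))
    return "\n".join(lines) + "\n"


def parse_acl_file(text):
    """Return the ACL values, skipping blank lines and # comment lines."""
    values = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        values.append(line)
    return values


text = render_acl_file(["1.2.3.5", "1.2.3.4", "1.2.3.4"])
print(text, end="")
print(parse_acl_file(text))
```

The sort is a plain string sort, used here only to keep the generated file stable between runs so that differences are easy to review.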
6.1.5 How Squid Matches Access Control Elements

It is important to understand how Squid searches ACL elements for a match. When an ACL element has more than one value, any single value can cause a match. In other words, Squid uses OR logic when checking ACL element values. Squid stops searching when it finds the first value that causes a match. This means that you can reduce delays by placing likely matches at the beginning of a list.

Let's look at a specific example. Consider this ACL definition:

    acl Simpsons ident Maggie Lisa Bart Marge Homer

When Squid encounters the Simpsons ACL in an access list, it performs the ident lookup. Let's see what happens when the user's ident server returns Marge. Squid's ACL code compares this value to Maggie, Lisa, and Bart before finding a match with Marge. At this point, the search terminates, and we say that the Simpsons ACL matches the request.

Actually, that's a bit of a lie. The ident ACL values aren't stored as an unordered list. Rather, they are stored as a splay tree. This means that Squid doesn't end up searching all the names in the event of a nonmatch. Searching a splay tree with N items requires log(N) comparisons. Many other ACL types use splay trees as well. The regular expression-based types, however, don't.

Since regular expressions can't be sorted, they are stored as linked lists. This makes them inefficient for large lists, especially for requests that don't match any of the regular expressions in the list. In an attempt to improve this situation, Squid moves a regular expression to the top of the list when a match occurs. In fact, due to the nature of the ACL matching code, Squid moves matched entries to the second position in the list. Thus, commonly matched values naturally migrate to the top of the ACL list, which should reduce the number of comparisons.

Let's look at another simple example:

    acl Schmever port 80-90 101 103 107 1 2 3 9999

This ACL is a match for a request to an origin server port between 80 and 90, and all the other individual listed port numbers. For a request to port 80, Squid matches the ACL by looking at the first value. For port 9999, all the other values are checked first. For a port number not listed, Squid checks every value before declaring the ACL isn't a match. As I've said before, you can optimize the ACL matching by placing the more common values first.
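The effect of promoting matched regular expressions can be illustrated with a short Python sketch. This is a simplification rather than Squid's code: the RegexList class is hypothetical, a Python list stands in for the linked list, and a matched pattern is moved all the way to the front instead of to the second position.

```python
import re


class RegexList:
    """Linear list of patterns with first-match-wins (OR) semantics."""

    def __init__(self, patterns):
        self.patterns = [re.compile(p) for p in patterns]
        self.comparisons = 0

    def matches(self, value):
        for i, pattern in enumerate(self.patterns):
            self.comparisons += 1
            if pattern.search(value):
                # Promote the hit so the next lookup reaches it sooner.
                self.patterns.insert(0, self.patterns.pop(i))
                return True
        return False


acl = RegexList([r"\.gif$", r"\.jpg$", r"\.mp3$"])

acl.matches("song.mp3")
first = acl.comparisons            # had to walk past .gif and .jpg

acl.matches("tune.mp3")
second = acl.comparisons - first   # .mp3 is now at the front

print(first, second)
```

A request that matches nothing still costs one comparison per pattern, which is the case the text warns about for long regular expression lists.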
6.2 Access Control Rules

As I mentioned earlier, ACL elements are the first step in building access controls. The second step is the access control rules, where you combine elements to allow or deny certain actions. You've already seen some http_access rules in the preceding examples. Squid has a number of other access control lists:

http_access
    This is your most important access list. It determines which client HTTP requests are allowed, and which are denied. If you get the http_access configuration wrong, your Squid cache may be vulnerable to attacks and abuse from people who shouldn't have access to it.

http_reply_access
    The http_reply_access list is similar to http_access. The difference is that the former list is checked when Squid receives a reply from an origin server or upstream proxy. Most access controls are based on aspects of the client's request, in which case the http_access list is sufficient. However, some people prefer also to allow or deny requests based on the reply content type. Because Squid doesn't know the content type value until it receives the server's reply, this additional access list is necessary. See Section 6.3.9 for more information.

icp_access
    If your Squid cache is configured to serve ICP replies (see Section 10.6), you should use the icp_access list. In most cases, you'll want to allow ICP requests only from your neighbor caches.

no_cache
    You can use the no_cache access list to tell Squid it must never store certain responses (on disk or in memory). This list is typically used in conjunction with dst, dstdomain, and url_regex ACLs. The "no" in no_cache causes some confusion because of double negatives. A request that is denied by the no_cache list isn't cached. In other words, no_cache deny ... is the way to make something uncachable. See Section 6.3.10 for an example.

miss_access
    The miss_access list is primarily useful for a Squid cache with sibling neighbors. It determines how Squid handles requests that are cache misses. This feature is necessary for Squid to enforce sibling relationships with its neighbors. See Section 6.3.7 for an example.

redirector_access
    This access list determines which requests are sent to one of the redirector processes (see Chapter 11). By default, all requests go through a redirector if you are using one. You can use the redirector_access list to prevent certain requests from being rewritten. This is particularly useful because a redirector receives less information about a particular request than does the access control system.

ident_lookup_access
    The ident_lookup_access list is similar to redirector_access. It enables you to make "lazy" ident lookups for certain requests. Squid doesn't issue ident queries by default. It does so only for requests that are allowed by the ident_lookup_access rules (or by an ident ACL).

always_direct
    This access list affects how a Squid cache with neighbors forwards cache misses. Usually Squid tries to forward cache misses to a parent cache, and/or Squid uses ICP to locate cached responses in neighbors. However, when a request matches an always_direct rule, Squid forwards the request directly to the origin server. With this list, matching an allow rule causes Squid to forward the request directly. See Section 10.4.4 for more information and an example.

never_direct
    Not surprisingly, never_direct is the opposite of always_direct. Cache miss requests that match this list must be sent to a neighbor cache. This is particularly useful for proxies behind firewalls. With this list, matching an allow rule causes Squid to forward the request to a neighbor. See Section 10.4.3 for more information and an example.

snmp_access
    This access list applies to queries sent to Squid's SNMP port. The ACLs that you can use with this list are snmp_community and src. You can also use srcdomain, srcdom_regex, and src_as if you really want to. See Section 14.3 for an example.

broken_posts
    This access list affects the way that Squid handles certain POST requests. Some older user-agents are known to send an extra CRLF (carriage return and linefeed) at the end of the request body. That is, the message body is two bytes longer than indicated by the Content-Length header. Even worse, some older HTTP servers actually rely on this incorrect behavior. When a request matches this access list, Squid emulates the buggy client and sends the extra CRLF characters.

Squid has a number of additional configuration directives that use ACL elements. Some of these used to be global settings that were modified to use ACLs to provide more flexibility.
cache_peer_access
    This access list controls the HTTP requests and ICP/HTCP queries that are sent to a neighbor cache. See Section 10.4.1 for more information and examples.

reply_body_max_size
    This access list restricts the maximum acceptable size of an HTTP reply body. See Appendix A for more information.

delay_access
    This access rule list controls whether or not the delay pools are applied to the (cache miss) response for this request. See Appendix C.

tcp_outgoing_address
    This access list binds server-side TCP connections to specific local IP addresses. See Appendix A.

tcp_outgoing_tos
    This access list can set different TOS/Diffserv values in TCP connections to origin servers and neighbors. See Appendix A.

header_access
    With this directive, you can configure Squid to remove certain HTTP headers from the requests that it forwards. For example, you might want to automatically filter out Cookie headers in requests sent to certain origin servers, such as doubleclick.net. See Appendix A.

header_replace
    This directive allows you to replace, rather than just remove, the contents of HTTP headers. For example, you can set the User-Agent header to a bogus value to keep certain origin servers happy while still protecting your privacy. See Appendix A.
6.2.1 Access Rule Syntax

The syntax for an access control rule is as follows:

    access_list allow|deny [!]ACLname ...

For example:

    http_access allow MyClients
    http_access deny !Safe_Ports
    http_access allow GameSites AfterHours

When reading the configuration file, Squid makes only one pass through the access control lines. Thus, you must define the ACL elements (with an acl line) before referencing them in an access list. Furthermore, the order of the access list rules is very important. Incoming requests are checked in the same order that you write them. Placing the most common ACLs early in the list may reduce Squid's CPU usage.

For most of the access lists, the meanings of deny and allow are obvious. Some of them, however, aren't so intuitive. In particular, pay close attention when writing always_direct, never_direct, and no_cache rules. In the case of always_direct, an allow rule means that matching requests are forwarded directly to origin servers. An always_direct deny rule means that matching requests aren't forced to go directly to origin servers, but may still do so if, for example, all neighbor caches are unreachable. The no_cache rules are tricky as well. Here, you must use deny for requests that must not be cached.
6.2.2 How Squid Matches Access Rules

Recall that Squid uses OR logic when searching ACL elements. Any single value in an acl can cause a match. It's the opposite for access rules, however. For http_access and the other rule sets, Squid uses AND logic. Consider this generic example:

    access_list allow ACL1 ACL2 ACL3

For this rule to be a match, the request must match each of ACL1, ACL2, and ACL3. If any of those ACLs don't match the request, Squid stops searching this rule and proceeds to the next. Within a single rule, you can optimize rule searching by putting least-likely-to-match ACLs first. Consider this simple example:

    acl A proto HTTP
    acl B port 8080
    http_access deny A B

This http_access rule is somewhat inefficient because the A ACL is more likely to be matched than B. It is better to reverse the order so that, in most cases, Squid only makes one ACL check, instead of two:

    http_access deny B A

One mistake people commonly make is to write a rule that can never be true. For example:

    acl A src 1.2.3.4
    acl B src 5.6.7.8
    http_access allow A B

This rule is never going to be true because a source IP address can't be equal to both 1.2.3.4 and 5.6.7.8 at the same time. Most likely, someone who writes a rule like that really means this:

    acl A src 1.2.3.4 5.6.7.8
    http_access allow A

As with the algorithm for matching the values of an ACL, when Squid finds a matching rule in an access list, the search terminates. If none of the access rules result in a match, the default action is the opposite of the last rule in the list. For example, consider this simple access configuration:

    acl Bob ident bob
    http_access allow Bob

Now if the user Mary makes a request, she is denied. The last (and only) rule in the list is an allow rule, and it doesn't match the username Mary. Thus, the default action is the opposite of allow, so the request is denied. Similarly, if the last entry is a deny rule, the default action is to allow the request. It is good practice always to end your access lists with explicit rules that either allow or deny all requests. To be perfectly clear, the previous example should be written this way:

    acl All src 0/0
    acl Bob ident bob
    http_access allow Bob
    http_access deny All

The src 0/0 ACL is an easy way to match each and every type of request.
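The three ideas in this section (AND logic within a rule, first matching rule wins, and a default that is the opposite of the last rule) fit in a few lines of Python. This is a sketch of the described behavior, not Squid's implementation; the function names, the dictionary of ACL predicates, and the request format are all hypothetical.

```python
def acl_matches(name, request, acls):
    """Evaluate one ACL reference; a leading '!' negates the result."""
    negate = name.startswith("!")
    result = acls[name.lstrip("!")](request)
    return (not result) if negate else result


def evaluate(rules, request, acls):
    """Return 'allow' or 'deny' for a list of (action, [acl names]) rules."""
    if not rules:
        raise ValueError("this sketch does not model an empty access list")
    for action, names in rules:
        # all() stops at the first ACL that fails, like the AND logic above.
        if all(acl_matches(name, request, acls) for name in names):
            return action
    # No rule matched: the default is the opposite of the last rule's action.
    return "deny" if rules[-1][0] == "allow" else "allow"


acls = {
    "Bob": lambda req: req["user"] == "bob",
    "All": lambda req: True,
}

implicit = [("allow", ["Bob"])]
explicit = [("allow", ["Bob"]), ("deny", ["All"])]

print(evaluate(implicit, {"user": "mary"}, acls))
print(evaluate(explicit, {"user": "mary"}, acls))
print(evaluate(explicit, {"user": "bob"}, acls))
```

Both rule lists give Mary the same answer, but the explicit version does so because a rule matched rather than because of the implied default, which is the point of ending every list with an allow or deny All rule.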
6.2.3 Access List Style

Squid's access control syntax is very powerful. In most cases, you can probably think of two or more ways to accomplish the same thing. In general, you should put the more specific and restrictive access controls first. For example, rather than:

    acl All src 0/0
    acl Net1 src 1.2.3.0/24
    acl Net2 src 1.2.4.0/24
    acl Net3 src 1.2.5.0/24
    acl Net4 src 1.2.6.0/24
    acl WorkingHours time 08:00-17:00

    http_access allow Net1 WorkingHours
    http_access allow Net2 WorkingHours
    http_access allow Net3 WorkingHours
    http_access allow Net4
    http_access deny All

you might find it easier to maintain and understand the access control configuration if you write it like this:

    http_access allow Net4
    http_access deny !WorkingHours
    http_access allow Net1
    http_access allow Net2
    http_access allow Net3
    http_access deny All

Whenever you have a rule with two or more ACL elements, it's always a good idea to follow it up with an opposite, more general rule. For example, the default Squid configuration denies cache manager requests that don't come from the localhost IP address. You might be tempted to write it like this:

    acl CacheManager proto cache_object
    acl Localhost src 127.0.0.1
    http_access deny CacheManager !Localhost

However, the problem here is that you haven't yet allowed the cache manager requests that do come from localhost. Subsequent rules may cause the request to be denied anyway. These rules have this undesirable behavior:

    acl CacheManager proto cache_object
    acl Localhost src 127.0.0.1
    acl MyNet src 10.0.0.0/24
    acl All src 0/0
    http_access deny CacheManager !Localhost
    http_access allow MyNet
    http_access deny All

Since a request from localhost doesn't match MyNet, it gets denied. A better way to write the rules is like this:

    http_access allow CacheManager Localhost
    http_access deny CacheManager
    http_access allow MyNet
    http_access deny All
6.2.4 Delayed Checks

Some ACLs can't be checked in one pass because the necessary information is unavailable. The ident, dst, srcdomain, and proxy_auth types fall into this category. When Squid encounters an ACL that can't be checked, it postpones the decision and issues a query for the necessary information (IP address, domain name, username, etc.). When the information is available, Squid checks the rules all over again, starting at the beginning of the list. It doesn't continue where the previous check left off. If possible, you may want to move these likely-to-be-delayed ACLs near the top of your rules to avoid unnecessary, repeated checks.

Because these delays are costly (in terms of time), Squid caches the information whenever possible. Ident lookups occur for each connection, rather than each request. This means that persistent HTTP connections can really benefit you in situations where you use ident queries. Hostnames and IP addresses are cached as specified by the DNS replies, unless you're using the older external dnsserver processes. Proxy Authentication information is cached as I described previously in Section 6.1.2.12.
6.2.5 Slow and Fast Rule Checks

Internally, Squid considers some access rule checks fast, and others slow. The difference is whether or not Squid postpones its decision to wait for additional information. In other words, a slow check may be deferred while Squid asks for additional data, such as:

● A reverse DNS lookup: the hostname for a client's IP address
● An RFC 1413 ident query: the username associated with a client's TCP connection
● An authenticator: validating the user's credentials
● A forward DNS lookup: the origin server's IP address
● An external, user-defined ACL

Some access rules use fast checks out of necessity. For example, the icp_access rule is a fast check. It must be fast, to serve ICP queries quickly. Furthermore, certain ACL types, such as proxy_auth, are meaningless for ICP queries. The following access rules are fast checks:

● header_access
● reply_body_max_size
● http_reply_access
● ident_lookup_access
● delay_access
● miss_access
● broken_posts
● icp_access
● cache_peer_access
● redirector_access
● snmp_access

The following ACL types may require information from external sources (DNS, authenticators, etc.) and are thus incompatible with fast access rules:

● srcdomain, dstdomain, srcdom_regex, dstdom_regex
● dst, dst_as
● proxy_auth
● ident
● external_acl_type

This means, for example, that you can't reliably use an ident ACL in a header_access rule.
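Because this kind of mistake is easy to make and hard to notice, you might want a small script that scans a configuration for slow ACL types used in fast access lists. The following Python sketch is one hypothetical way to do it. The two sets simply restate the lists above, using the directive spellings from Section 6.2 and assuming external ACLs appear with the type keyword external on acl lines. The parsing is deliberately naive, and the function name is invented for this example.

```python
FAST_LISTS = {
    "header_access", "reply_body_max_size", "http_reply_access",
    "ident_lookup_access", "delay_access", "miss_access", "broken_posts",
    "icp_access", "cache_peer_access", "redirector_access", "snmp_access",
}

SLOW_TYPES = {
    "srcdomain", "dstdomain", "srcdom_regex", "dstdom_regex",
    "dst", "dst_as", "proxy_auth", "ident", "external",
}


def find_slow_acl_uses(config_text):
    """Return (line number, directive, ACL name) for each suspicious use."""
    acl_types = {}
    problems = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "acl" and len(words) >= 3:
            acl_types.setdefault(words[1], words[2])
        elif words[0] in FAST_LISTS:
            # Some directives take extra arguments before allow/deny.
            actions = [i for i, w in enumerate(words) if w in ("allow", "deny")]
            if not actions:
                continue
            for name in words[actions[0] + 1:]:
                name = name.lstrip("!")
                if acl_types.get(name) in SLOW_TYPES:
                    problems.append((lineno, words[0], name))
    return problems


sample = """\
acl Staff ident alice bob
acl Ads dstdomain .doubleclick.net
acl MyNet src 172.16.5.0/24
header_access Cookie deny Ads
icp_access allow MyNet
http_access allow Staff
"""

print(find_slow_acl_uses(sample))
```

The http_access line is not reported because http_access is a slow check and may wait for the ident lookup, whereas the header_access line uses a dstdomain ACL that may need DNS.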
6.3 Common Scenarios

Because access controls can be complicated, this section contains a few examples. They demonstrate some of the common uses for access controls. You should be able to adapt them to your particular needs.

6.3.1 Allowing Local Clients Only

Almost every Squid installation should restrict access based on client IP addresses. This is one of the best ways to protect your system from abuses. The easiest way to do this is to write an ACL that contains your IP address space and then allow HTTP requests for that ACL and deny all others:

    acl All src 0/0
    acl MyNetwork src 172.16.5.0/24 172.16.6.0/24

    http_access allow MyNetwork
    http_access deny All

Most likely, this access control configuration will be too simple, so you'll need to add more lines. Remember that the order of the http_access lines is important. Don't add anything after deny All. Instead, add the new rules before or after allow MyNetwork as necessary.
6.3.2 Blocking a Few Misbehaving Clients

For one reason or another, you may find it necessary to deny access for a particular client IP address. This can happen, for example, if an employee or student launches an aggressive web crawling agent that consumes too much bandwidth or other resources. Until you can stop the problem at the source, you can block the requests coming to Squid with this configuration:

    acl All src 0/0
    acl MyNetwork src 172.16.5.0/24 172.16.6.0/24
    acl ProblemHost src 172.16.5.9

    http_access deny ProblemHost
    http_access allow MyNetwork
    http_access deny All
6.3.3 Denying Pornography

Blocking access to certain content is a touchy subject. Often, the hardest part about using Squid to deny pornography is coming up with the list of sites that should be blocked. You may want to maintain such a list yourself, or get one from somewhere else. The "Access Controls" section of the Squid FAQ has links to freely available lists.

The ACL syntax for using such a list depends on its contents. If the list contains regular expressions, you probably want something like this:

    acl PornSites url_regex "/usr/local/squid/etc/pornlist"
    http_access deny PornSites

On the other hand, if the list contains origin server hostnames, simply change url_regex to dstdomain in this example.
6.3.4 Restricting Usage During Working Hours

Some corporations like to restrict web usage during working hours, either to save bandwidth, or because policy forbids employees from doing certain things while working. The hardest part about this is differentiating between appropriate and inappropriate use of the Internet during these times. Unfortunately, I can't help you with that. For this example, I'm assuming that you've somehow collected or acquired a list of web site domain names that are known to be inappropriate. The easy part is configuring Squid:

    acl NotWorkRelated dstdomain "/usr/local/squid/etc/not-work-related-sites"
    acl WorkingHours time D 08:00-17:30
    http_access deny WorkingHours NotWorkRelated

Notice that I've placed the WorkingHours ACL first in the rule. The dstdomain ACL is expensive (comparing strings and traversing lists), but the time ACL is a simple inequality check.

Let's take this a step further and understand how to combine something like this with the source address controls described previously. Here's one way to do it:

    acl All src 0/0
    acl MyNetwork src 172.16.5.0/24 172.16.6.0/24
    acl NotWorkRelated dstdomain "/usr/local/squid/etc/not-work-related-sites"
    acl WorkingHours time D 08:00-17:30

    http_access deny WorkingHours NotWorkRelated
    http_access allow MyNetwork
    http_access deny All

This scheme works because it accomplishes our goal of denying certain requests during working hours and allowing requests only from your own network. However, it might be somewhat inefficient. Note that the NotWorkRelated ACL is searched for all requests made during working hours, regardless of the source IP address. If that list is long, you'll waste CPU resources by searching it for requests from outside your network. Thus, you may want to change the rules around somewhat:

    http_access deny !MyNetwork
    http_access deny WorkingHours NotWorkRelated
    http_access allow All

Here we've delayed the most expensive check until the very end. Outsiders that may be trying to abuse Squid will not be wasting your CPU cycles.
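The stated goal of this section, denying not-work-related sites during working hours with the cheap time check done first, can be sketched in Python as follows. This is only an illustration: the site list is hypothetical, hostnames are compared exactly rather than with dstdomain's matching rules, and treating the 08:00-17:30 window as inclusive at both ends is my assumption.

```python
from datetime import datetime, time

# Hypothetical list contents, standing in for the not-work-related-sites file.
NOT_WORK_RELATED = {"games.example.com", "videos.example.net"}


def in_working_hours(now, start=time(8, 0), end=time(17, 30)):
    """Approximate 'time D 08:00-17:30': Monday through Friday in the window."""
    return now.weekday() < 5 and start <= now.time() <= end


def decide(host, now):
    """Deny listed hosts during working hours; allow everything else."""
    # The cheap time comparison comes first; 'and' skips the list lookup
    # entirely outside working hours.
    if in_working_hours(now) and host in NOT_WORK_RELATED:
        return "deny"
    return "allow"


print(decide("games.example.com", datetime(2004, 1, 5, 9, 0)))    # Monday morning
print(decide("games.example.com", datetime(2004, 1, 5, 18, 0)))   # Monday evening
print(decide("games.example.com", datetime(2004, 1, 3, 9, 0)))    # Saturday morning
```

Only the Monday-morning request is denied; the other two fall outside the working-hours window and never reach the site list.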
6.3.5 Preventing Squid from Talking to Non-HTTP Servers

You need to minimize the chance that Squid can communicate with certain types of TCP/IP servers. For example, people should never be able to use your Squid cache to relay SMTP (email) traffic. I covered this previously when introducing the port ACL. However, it is such an important part of your access controls that I'm presenting it here as well.

First of all, you have to worry about the CONNECT request method. User agents use this method to tunnel TCP connections through an HTTP proxy. It was invented for HTTP/TLS (a.k.a. SSL) requests, and this remains the primary use for the CONNECT method. Some user-agents may also tunnel NNTP/TLS traffic through firewall proxies. All other uses should be rejected. Thus, you'll need an access list that allows CONNECT requests to HTTP/TLS and NNTP/TLS ports only.

Secondly, you should prevent Squid from connecting to certain services such as SMTP. You can either allow safe ports or deny dangerous ports. I'll give examples for both techniques.

Let's start with the rules present in the default squid.conf file:

    acl Safe_ports port 80          # http
    acl Safe_ports port 21          # ftp
    acl Safe_ports port 443 563     # https, snews
    acl Safe_ports port 70          # gopher
    acl Safe_ports port 210         # wais
    acl Safe_ports port 280         # http-mgmt
    acl Safe_ports port 488         # gss-http
    acl Safe_ports port 591         # filemaker
    acl Safe_ports port 777         # multiling http
    acl Safe_ports port 1025-65535  # unregistered ports
    acl SSL_ports port 443 563
    acl CONNECT method CONNECT

    http_access deny !Safe_ports
    http_access deny CONNECT !SSL_ports

Our Safe_ports ACL lists all privileged ports (less than 1024) to which Squid may have valid reasons for connecting. It also lists the entire nonprivileged port range. Notice that the Safe_ports ACL includes the secure HTTP and NNTP ports (443 and 563) even though they also appear in the SSL_ports ACL. This is because the Safe_ports ACL is checked first in the rules. If you swap the order of the first two http_access lines, you could probably remove 443 and 563 from the Safe_ports list, but it's hardly worth the trouble.

The other way to approach this is to list the privileged ports that are known to be unsafe:

    acl Dangerous_ports port 7 9 19 22 23 25 53 109 110 119
    acl SSL_ports port 443 563
    acl CONNECT method CONNECT

    http_access deny Dangerous_ports
    http_access deny CONNECT !SSL_ports

Don't worry if you're not familiar with all these strange port numbers. You can find out what each one is for by reading the /etc/services file on a Unix system or by reading IANA's list of registered TCP/UDP port numbers at http://www.iana.org/assignments/port-numbers.
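
The safe-ports approach can be modeled in a few lines of Python. This is only a sketch of the two deny rules from the default squid.conf listing, not Squid's own code; the function name `denied_by_port_rules` is made up for illustration.

```python
# Ports listed in the Safe_ports ACL, plus the whole nonprivileged range.
SAFE_PORTS = {80, 21, 443, 563, 70, 210, 280, 488, 591, 777}
SAFE_RANGE = range(1025, 65536)
# Ports listed in the SSL_ports ACL.
SSL_PORTS = {443, 563}


def denied_by_port_rules(method, port):
    """Return True if one of the two port-based deny rules matches.

    Hypothetical helper: it models only these two rules and says
    nothing about what later http_access rules would decide.
    """
    safe = port in SAFE_PORTS or port in SAFE_RANGE
    if not safe:
        return True   # http_access deny !Safe_ports
    if method == "CONNECT" and port not in SSL_PORTS:
        return True   # http_access deny CONNECT !SSL_ports
    return False


if __name__ == "__main__":
    for method, port in [("GET", 80), ("GET", 25), ("CONNECT", 443), ("CONNECT", 80)]:
        print(method, port, denied_by_port_rules(method, port))
```

Note that a CONNECT request to a high port such as 8080 passes the first rule but is still caught by the second, which is the point of having both.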

6.3.6 Giving Certain Users Special Access

Organizations that employ username-based access controls often need to give certain users special privileges. In this simple example, there are three elements: all authenticated users, the usernames of the administrators, and a list of pornographic web sites. Normal users aren't allowed to view pornography, but the admins have the dubious job of maintaining the list. They need to connect to all servers to verify whether or not a particular site should be placed in the pornography list. Here's how to accomplish the task:

    auth_param basic program /usr/local/squid/libexec/ncsa_auth /usr/local/squid/etc/passwd

    acl Authenticated proxy_auth REQUIRED
    acl Admins proxy_auth Pat Jean Chris
    acl Porn dstdomain "/usr/local/squid/etc/porn.domains"
    acl All src 0/0

    http_access allow Admins
    http_access deny Porn
    http_access allow Authenticated
    http_access deny All

Let's examine how this all works. First, there are three ACL definitions (in addition to All). The Authenticated ACL matches any valid proxy authentication credentials. The Admins ACL matches valid credentials from users Pat, Jean, and Chris. The Porn ACL matches certain origin server hostnames found in the porn.domains file.

This example has four access control rules. The first checks only the Admins ACL and allows all requests from Pat, Jean, and Chris. For other users, Squid moves on to the next rule. According to the second rule, a request is denied if its origin server hostname is in the porn.domains file. For requests that don't match the Porn ACL, Squid moves on to the third rule. Here, the request is allowed if it contains valid authentication credentials. The external authenticator (ncsa_auth in this case) is responsible for deciding whether or not the credentials are valid. If they aren't, the final rule applies, and the request is denied.

Note that the ncsa_auth authenticator isn't a requirement. You can use any of the numerous authentication helpers described in Chapter 12.
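
The first-match behavior of these four rules can be walked through with a small Python sketch. It is a simplified model under stated assumptions: `decide` and the sample domain in `PORN_DOMAINS` are hypothetical, hostnames are compared exactly (real dstdomain matching also handles subdomains), and the 407 challenge that Squid sends when credentials are missing is ignored.

```python
ADMINS = {"Pat", "Jean", "Chris"}
PORN_DOMAINS = {"porn.example"}   # stand-in for the porn.domains file


def decide(user, host):
    """Apply the four http_access rules in order; the first match wins.

    `user` is the authenticated username, or None when the request
    carries no valid credentials.
    """
    if user in ADMINS:
        return "allow"    # http_access allow Admins
    if host in PORN_DOMAINS:
        return "deny"     # http_access deny Porn
    if user is not None:
        return "allow"    # http_access allow Authenticated
    return "deny"         # http_access deny All


if __name__ == "__main__":
    print(decide("Pat", "porn.example"))          # admin reaches a listed site
    print(decide("Lee", "porn.example"))          # normal user is blocked
    print(decide("Lee", "www.squid-cache.org"))   # normal user, normal site
    print(decide(None, "www.squid-cache.org"))    # no credentials
```

Swapping the first two rules in this model would block the admins as well, which shows why the order of the http_access lines matters.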

6.3.7 Preventing Abuse from Siblings

If you open up your cache to peer with other caches, you need to take additional precautions. Caches often use ICP to discover which objects are stored in their neighbors. You should accept ICP queries only from known and approved neighbors.

Furthermore, you can configure Squid to enforce a sibling relationship by using the miss_access rule list. Squid checks these rules only when forwarding cache misses, never cache hits. Thus, all requests must first pass the http_access rules before the miss_access list comes into play.

In this example, there are three separate ACLs. One is for the local users that connect directly to this cache. Another is for a child cache, which is allowed to forward requests that are cache misses. The third is a sibling cache, which must never forward a request that results in a cache miss. Here's how it all works:

    acl All src 0/0
    acl OurUsers src 172.16.5.0/24
    acl ChildCache src 192.168.1.1
    acl SiblingCache src 192.168.3.3

    http_access allow OurUsers
    http_access allow ChildCache
    http_access allow SiblingCache
    http_access deny All

    miss_access deny SiblingCache

    icp_access allow ChildCache
    icp_access allow SiblingCache
    icp_access deny All

6.3.8 Denying Requests with IP Addresses

As I mentioned in Section 6.1.2.4, the dstdomain type is good for blocking access to specific origin servers. However, clever users might be able to get around the rule by replacing URL hostnames with their IP addresses. If you are desperate to stop such requests, you may want to block all requests that contain an IP address. You can do so with a redirector (see Chapter 11) or with a semicomplicated dstdom_regex ACL like this:

    acl IPForHostname dstdom_regex ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$
    http_access deny IPForHostname
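
Before putting a regular expression into squid.conf, it can help to try it against a few hostnames. The sketch below uses Python's `re` module, which treats this particular pattern the same way a POSIX regex library would; the helper name `is_ip_hostname` is an illustration only.

```python
import re

# Same pattern as the IPForHostname ACL.
IP_FOR_HOSTNAME = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$")


def is_ip_hostname(host):
    """Return True when the hostname is four dot-separated runs of digits."""
    return IP_FOR_HOSTNAME.search(host) is not None


if __name__ == "__main__":
    for host in ["192.168.1.1", "www.example.com", "1.2.3.4.example.com"]:
        print(host, is_ip_hostname(host))
```

The pattern checks only the shape of the name, so a string such as 999.999.999.999 also matches even though it is not a valid address. For a deny rule, that is harmless.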

6.3.9 An http_reply_access Example

Recall that the response's content type is the only new information available when Squid checks the http_reply_access rules. Thus, you can keep the http_reply_access rules very simple. You need only check the rep_mime_type ACLs. For example, here's how you can deny responses with certain content types:

    acl All src 0/0
    acl Movies rep_mime_type video/mpeg
    acl MP3s rep_mime_type audio/mpeg
    http_reply_access deny Movies
    http_reply_access deny MP3s
    http_reply_access allow All

You don't need to repeat your http_access rules in the http_reply_access list. The allow All rule shown here doesn't mean that all requests to Squid are allowed. Any request that is denied by http_access never makes it to the stage where Squid checks the http_reply_access rules.

6.3.10 Preventing Cache Hits for Local Sites

If you have a number of origin servers on your network, you may want to configure Squid so that their responses are never cached. Because the servers are nearby, they don't benefit too much from cache hits. Additionally, it frees up storage space for other (far away) origin servers.

The first step is to define an ACL for the local servers. You might want to use an address-based ACL, such as dst:

    acl LocalServers dst 172.17.1.0/24

If the servers don't live on a single subnet, you might find it easier to create a dstdomain ACL:

    acl LocalServers dstdomain .example.com

Next, you simply deny caching of those servers with a no_cache access rule:

    no_cache deny LocalServers

The no_cache rules don't prevent your clients from sending these requests to Squid. There is nothing you can configure in Squid to stop such requests from coming. Instead, you must configure the user agents themselves.

If you add a no_cache rule after Squid has been running for a while, the cache may contain some objects that match the new rule. Prior to Squid Version 2.5, these previously cached objects might be returned as cache hits. Now, however, Squid purges any cached response for a request that matches a no_cache rule.

6.4 Testing Access Controls

As your access control configuration becomes longer, it also becomes more complicated. I strongly encourage you to test your access controls before turning them loose on a production server. Of course, the first thing you should do is make sure that Squid can correctly parse your configuration file. Use the -k parse feature for this:

    % squid -k parse

To further test your access controls, you may need to set up a fake Squid installation. One easy way to do that is to compile another copy of the Squid source code with a different $prefix location. For example:

    % tar xzvf squid-2.5.STABLE4.tar.gz
    % cd squid-2.5.STABLE4
    % ./configure --prefix=/tmp/squid ...
    % make && make install

After installing, you need to edit the new squid.conf file and change a few directives. Change http_port if Squid is already running on the default port. For simple testing, create a single, small cache directory like this:

    cache_dir ufs /tmp/squid/cache 100 4 4

If you don't want to recompile Squid again, you can also just create a new configuration file. The drawback to this approach is that you'll need to set all the log-file pathnames to the temporary location so that you don't overwrite the real files.

You can easily test some access controls with the squidclient program. For example, if you have a rule that depends on the origin server hostname (dstdomain ACL), or some part of the URL (url_regex or urlpath_regex), simply enter a URI that you would expect to be allowed or denied:

    % squidclient -p 4128 http://blocked.host.name/blah/blah

or:

    % squidclient -p 4128 http://some.host.name/blocked.ext

Certain aspects of the request are harder to control. If you have src ACLs that block requests from outside your network, you may need to actually test them from an external host. Testing time ACLs may be difficult unless you can change the clock on your system or stay awake long enough.

You can use squidclient's -H option to set arbitrary request headers. For example, use the following if you need to test a browser ACL:

    % squidclient -p 4128 http://www.host.name/blah \
        -H 'User-Agent: Mozilla/5.0 (compatible; Konqueror/3)\r\n'

For more complicated requests, with many headers, you may want to use the technique described in Section 16.4.

You might also consider developing a routine cron job that checks your ACLs for expected behavior and reports any anomalies. Here is a sample shell script to get you started. Note that the URLs are in double quotes so that the shell expands $TESTHOST:

    #!/bin/sh
    set -e

    TESTHOST="www.squid-cache.org"

    # make sure Squid is not proxying dangerous ports
    #
    ST=`squidclient "http://$TESTHOST:25/" | head -1 | awk '{print $2}'`
    if test "$ST" != 403 ; then
        echo "Squid did not block HTTP request to port 25"
    fi

    # make sure Squid requires user authentication
    #
    ST=`squidclient "http://$TESTHOST/" | head -1 | awk '{print $2}'`
    if test "$ST" != 407 ; then
        echo "Squid allowed request without proxy authentication"
    fi

    # make sure Squid denies requests from foreign IP addresses
    # elsewhere we already created an alias 192.168.1.1 on one of
    # the system interfaces
    #
    EXT_ADDR=192.168.1.1
    ST=`squidclient -l $EXT_ADDR "http://$TESTHOST/" | head -1 | awk '{print $2}'`
    if test "$ST" != 403 ; then
        echo "Squid allowed request from external address $EXT_ADDR"
    fi

    exit 0

6.5 Exercises

● Define an ACL for each known type (src, dst, ident, etc.) and write a rule that uses all of them.
● Intentionally mistype the name of an ACL in one of your rules. Does squid -k parse catch the error? Does Squid start anyway?
● Write an http_access rule that uses slow ACLs, like srcdomain or ident. Time how long Squid takes to serve a request with and without the slow ACL checks.

Chapter 7. Disk Cache Basics

I'm going to talk a lot about disk storage and filesystems in this chapter. It is important to make sure you understand the difference between two related things: disk filesystems and Squid's storage schemes.

Filesystems are features of particular operating systems. Almost every Unix variant has an implementation of the Unix File System (UFS). It is also sometimes known as the Berkeley Fast File System (FFS). Linux's default filesystem is called ext2fs. Many operating systems also support newer filesystem technologies. These include names and acronyms such as advfs, xfs, and reiserfs.

Programs (such as Squid) interact with filesystems via a handful of system calls. These are functions such as open(), close(), read(), write(), stat(), and unlink(). The arguments to these system calls are either pathnames (strings) or file descriptors (integers). Filesystem implementation details are hidden from programs. They typically use internal data structures such as inodes, but Squid doesn't know about that.

Squid has a number of different storage schemes. The schemes have different properties and techniques for organizing and accessing cache data on the disk. Most of them use the filesystem interface system calls (e.g., open(), write(), etc.).

Squid has five different storage schemes: ufs, aufs, diskd, coss, and null. The first three use the same directory layout, and they are thus interchangeable. coss is an attempt to implement a new filesystem specifically optimized for Squid. null is a minimal implementation of the API: it doesn't actually read or write data to/from the disk.

Due to a poor choice of names, "UFS" might refer to either the Unix filesystem or the Squid storage scheme. To be clear here, I'll write the filesystem as UFS and the storage scheme as ufs.

The remainder of this chapter focuses on the squid.conf directives that control the disk cache. This includes replacement policies, object removal, and freshness controls. For the most part, I'll only talk about the default storage scheme: ufs. We'll get to the alternative schemes and other tricks in the next chapter.

7.1 The cache_dir Directive

The cache_dir directive is one of the most important in squid.conf. It tells Squid where and how to store cache files on disk. The cache_dir directive takes the following arguments:

    cache_dir scheme directory size L1 L2 [options]

7.1.1 Scheme

Squid supports a number of different storage schemes. The default (and original) is ufs. Depending on your operating system, you may be able to select other schemes. You must use the --enable-storeio=LIST option with ./configure to compile the optional code for other storage schemes. I'll discuss aufs, diskd, coss, and null in Section 8.7. For now, I'll only talk about the ufs scheme, which is compatible with aufs and diskd.

7.1.2 Directory

The directory argument is a filesystem directory, under which Squid stores cached objects. Normally, a cache_dir corresponds to a whole filesystem or disk partition. It usually doesn't make sense to put more than one cache directory on a single filesystem partition. Furthermore, I also recommend putting only one cache directory on each physical disk drive. For example, if you have two unused hard drives, you might do something like this:

    # newfs /dev/da1d
    # newfs /dev/da2d
    # mount /dev/da1d /cache0
    # mount /dev/da2d /cache1

And then add these lines to squid.conf:

    cache_dir ufs /cache0 7000 16 256
    cache_dir ufs /cache1 7000 16 256

If you don't have any spare hard drives, you can, of course, use an existing filesystem partition. Select one with plenty of free space, perhaps /usr or /var, and create a new directory there. For example:

    # mkdir /var/squidcache

Then add a line like this to squid.conf:

    cache_dir ufs /var/squidcache 7000 16 256

7.1.3 Size

The third cache_dir argument specifies the size of the cache directory. This is an upper limit on the amount of disk space that Squid can use for the cache_dir. Calculating an appropriate value can be tricky. You lose some space to filesystem overheads, and you must leave enough free space for temporary files and swap.state logs (see Section 13.6). I recommend mounting the empty filesystem and running df:

    % df -k
    Filesystem  1K-blocks  Used    Avail  Capacity  Mounted on
    /dev/da1d     3037766     8  2794737        0%  /cache0
    /dev/da2d     3037766     8  2794737        0%  /cache1

Here you can see that the filesystem has about 2790 MB of available space. Remember that UFS reserves some "minfree" space, 8% in this case, which is why Squid can't use the full 3040 MB in the filesystem.

You might be tempted just to put 2790 on the cache_dir line. You might even get away with it if your cache isn't very busy and if you rotate the log files often. To be safe, however, I recommend taking off another 10% or so. This extra space will be used by Squid's swap.state file and temporary files.

Note that the cache_swap_low directive also affects how much space Squid uses. I'll talk about the low and high watermarks in Section 7.2.

The bottom line is that you should initially be conservative about the size of your cache_dir. Start off with a low estimate and allow the cache to fill up. After Squid runs for a week or so with full cache directories, you'll be in a good position to re-evaluate the size settings. If you have plenty of free space, feel free to increase the cache directory size in increments of a few percent.
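
The sizing advice can be turned into a small calculation. This is a sketch only: `suggested_cache_dir_mb` is a hypothetical helper, it takes the Avail column of df -k as input, and it treats 1 MB as 1024 KB, so it yields a slightly smaller figure than the round 2790 MB used in the text.

```python
def suggested_cache_dir_mb(avail_kb, margin_pct=10):
    """Estimate a conservative cache_dir size in megabytes.

    avail_kb   -- the Avail column from `df -k` for the empty filesystem
    margin_pct -- extra space to leave for swap.state and temporary files
    """
    usable_kb = avail_kb * (100 - margin_pct) // 100
    return usable_kb // 1024


if __name__ == "__main__":
    # Avail value for /cache0 in the df output above.
    print(suggested_cache_dir_mb(2794737))
```

For the 2794737 KB shown for /cache0, this suggests a cache_dir size of roughly 2450 MB as a starting point, to be raised later in small steps if free space allows.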

7.1.3.1 Inodes

Inodes are fundamental building blocks of Unix filesystems. They contain information about disk files, such as permissions, ownership, size, and timestamps. If your filesystem runs out of inodes, you can't create new files, even if it has space available. Running out of inodes is bad, so you may want to make sure you have enough before running Squid.

The programs that create new filesystems (e.g., newfs or mkfs) reserve some number of inodes based on the total size. These programs usually allow you to set the ratio of inodes to disk space. For example, see the -i option in the newfs and mkfs manpages. The ratio of disk space to inodes determines the mean file size the filesystem can support. Most Unix systems create one inode for each 4 KB, which is usually sufficient for Squid. Research shows that, for most caching proxies, the mean file size is about 10 KB. You may be able to get away with 8 KB per inode, but it is risky.

You can monitor your system's inode usage with df -i. For example:

    % df -ik
    Filesystem   1K-blocks     Used    Avail  Capacity   iused    ifree  %iused  Mounted on
    /dev/ad0s1a     197951    57114   125001       31%    1413    52345      3%  /
    /dev/ad0s1f    5004533  2352120  2252051       51%  129175  1084263     11%  /usr
    /dev/ad0s1e     396895     6786   358358        2%     205    99633      0%  /var
    /dev/da0d      8533292  7222148   628481       92%  430894   539184     44%  /cache1
    /dev/da1d      8533292  7181645   668984       91%  430272   539806     44%  /cache2
    /dev/da2d      8533292  7198600   652029       92%  434726   535352     45%  /cache3
    /dev/da3d      8533292  7208948   641681       92%  427866   542212     44%  /cache4

As long as the inode usage (%iused) is less than the space usage (Capacity), you're in good shape. Unfortunately, you can't add more inodes to an existing filesystem. If you find that you are running out of inodes, you need to stop Squid and recreate your filesystems. If you're not willing to do that, decrease the cache_dir size instead.
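
A rough way to reason about the inode ratio is to compare how many inodes a filesystem gets with how many files it would hold when full of average-sized cache objects. The helper below is hypothetical and deliberately crude: it ignores filesystem overhead and assumes the 10 KB mean file size quoted in the text.

```python
def inode_budget(fs_kb, kb_per_inode, mean_object_kb=10):
    """Return (inodes created, files needed when the filesystem is full)."""
    inodes = fs_kb // kb_per_inode
    files_when_full = fs_kb // mean_object_kb
    return inodes, files_when_full


if __name__ == "__main__":
    fs_kb = 8533292   # 1K-blocks for one of the cache filesystems above
    for ratio in (4, 8, 16):
        inodes, files = inode_budget(fs_kb, ratio)
        verdict = "ok" if inodes >= files else "too few inodes"
        print(ratio, inodes, files, verdict)
```

With one inode per 8 KB the margin over a 10 KB mean is thin, which is why that setting is risky if the mean object size on your cache turns out to be smaller.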

7.1.3.2 The relationship between disk space and process size

Squid's disk space usage directly affects its memory usage as well. Every object that exists on disk requires a small amount of memory. Squid uses the memory as an index to the on-disk data. If you add a new cache directory or otherwise increase the disk cache size, make sure that you also have enough free memory. Squid's performance degrades very quickly if its process size reaches or exceeds your system's physical memory capacity.

Every object in Squid's cache directories takes either 76 or 112 bytes of memory, depending on your system. The memory is allocated as StoreEntry, MD5 Digest, and LRU policy node structures. Small-pointer (i.e., 32-bit) systems, like those based on the Intel Pentium, take 76 bytes. On systems with CPUs that support 64-bit pointers, each object takes 112 bytes. You can find out how much memory these structures use on your system by viewing the Memory Utilization page of the cache manager (see Section 14.2.1.2).

Unfortunately, it is difficult to predict precisely how much additional memory is required for a given amount of disk space. It depends on the mean reply size, which typically fluctuates over time. Additionally, Squid uses memory for many other data structures and purposes. Don't assume that your estimates are, or will remain, correct. You should constantly monitor Squid's process size and consider shrinking the cache size if necessary.
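
As a back-of-the-envelope illustration, the per-object figures above can be combined with an assumed mean object size to estimate index memory. The function name `index_memory_bytes` and the 10 KB default are assumptions for this sketch; the result covers only the per-object index, not Squid's other memory use.

```python
def index_memory_bytes(cache_mb, bytes_per_object, mean_object_kb=10):
    """Estimate the memory Squid needs to index a full cache directory."""
    objects = cache_mb * 1024 // mean_object_kb
    return objects * bytes_per_object


if __name__ == "__main__":
    for label, per_object in (("32-bit", 76), ("64-bit", 112)):
        estimate = index_memory_bytes(7000, per_object)
        print(label, estimate, "bytes, about", estimate // (1024 * 1024), "MB")
```

For a 7000 MB cache_dir this comes to roughly 52 MB on a 32-bit system and 77 MB on a 64-bit system. Treat these as order-of-magnitude figures and keep watching the real process size.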

7.1.4 L1 and L2

For the ufs, aufs, and diskd schemes, Squid creates a two-level directory tree underneath the cache directory. The L1 and L2 arguments specify the number of first- and second-level directories. The defaults are 16 and 256, respectively. Figure 7-1 shows the filesystem structure.

Figure 7-1. The cache directory structure for ufs-based storage schemes

Some people think that Squid performs better, or worse, depending on the particular values for L1 and L2. It seems to make sense, intuitively, that small directories can be searched faster than large ones. Thus, L1 and L2 should probably be large enough so that each L2 directory has no more than a few hundred files.

For example, let's say you have a cache directory that stores about 7000 MB. Given a mean file size of 10 KB, you can store about 700,000 files in this cache_dir. With 16 L1 and 256 L2 directories, there are 4096 total second-level directories. 700,000 ÷ 4096 leaves about 170 files in each second-level directory.

The process of creating swap directories with squid -z goes faster for smaller values of L1 and L2. Thus, if your cache size is really small, you may want to reduce the number of L1 and L2 directories.

Squid assigns each cache object a unique file number. This is a 32-bit integer that uniquely identifies files on disk. Squid uses a relatively simple algorithm for turning file numbers into pathnames. The algorithm uses L1 and L2 as parameters. Thus, if you change L1 and L2, you change the mapping from file number to pathname. Changing these parameters for a nonempty cache_dir makes the existing files inaccessible. You should never change L1 and L2 after the cache directory has become active.

Squid allocates file numbers within a cache directory sequentially. The file number-to-pathname algorithm (e.g., storeUfsDirFullPath()) is written so that each group of L2 files goes into the same second-level directory. Squid does this to take advantage of locality of reference. This algorithm increases the probability that an HTML file and its embedded images are stored in the same second-level directory. Some people expect Squid to spread cache files evenly among the second-level directories. However, when the cache is initially filling, you'll find that only the first few directories contain any files. For example:

    % cd /cache0; du -k
    2164    ./00/00
    2146    ./00/01
    2689    ./00/02
    1974    ./00/03
    2201    ./00/04
    2463    ./00/05
    2724    ./00/06
    3174    ./00/07
    1144    ./00/08
    1       ./00/09
    1       ./00/0A
    1       ./00/0B
    ...

This is perfectly normal and nothing to worry about.
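
The file number-to-pathname mapping can be sketched in Python. This is modeled on my recollection of storeUfsDirFullPath() in the Squid 2.5 source, so treat the exact format string and arithmetic as an assumption; `ufs_path` is a name chosen for this example.

```python
def ufs_path(cache_dir, filn, l1=16, l2=256):
    """Map a file number to a pathname under a ufs-style cache directory.

    Consecutive file numbers share a second-level directory until
    l2 of them have been used, then the next directory is chosen.
    """
    first = ((filn // l2) // l2) % l1
    second = (filn // l2) % l2
    return "%s/%02X/%02X/%08X" % (cache_dir, first, second, filn)


if __name__ == "__main__":
    for filn in (0, 255, 256, 65536):
        print(filn, ufs_path("/cache0", filn))
```

File numbers 0 through 255 all land in 00/00, number 256 starts 00/01, and so on. This matches the du output above, where only the first few second-level directories hold data while the cache is filling. It also shows why changing L1 or L2 on a nonempty cache_dir strands the existing files: the same file number then maps to a different path.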

7.1.5 Options

Squid has two scheme-independent cache_dir options: a read-only flag and a max-size value.

7.1.5.1 read-only

The read-only option instructs Squid to continue reading from the cache_dir, but to stop storing new objects there. It looks like this in squid.conf:

    cache_dir ufs /cache0 7000 16 256 read-only

You might use this option if you want to migrate your cache storage from one disk to another. If you simply add one cache_dir and remove another, Squid's hit ratio decreases sharply. You can still get cache hits from the old location when it is read-only. After some time, you can remove the read-only cache directory from the configuration.

7.1.5.2 max-size

With this option, you can specify the maximum object size to be stored in the cache directory. For example:

    cache_dir ufs /cache0 7000 16 256 max-size=1048576

Note that the value is in bytes. In most situations, you shouldn't need to add this option. If you do, try to put the cache_dir lines in order of increasing max-size.

7.2 Disk Space Watermarks

The cache_swap_low and cache_swap_high directives control the replacement of objects stored on disk. Their values are a percentage of the maximum cache size, which comes from the sum of all cache_dir sizes. For example:

    cache_swap_low 90
    cache_swap_high 95

As long as the total disk usage is below cache_swap_low, Squid doesn't remove cached objects. As the cache size increases, Squid becomes more aggressive about removing objects. Under steady-state conditions, you should find that disk usage stays relatively close to the cache_swap_low value. You can see the current disk usage by requesting the storedir page from the cache manager (see Section 14.2.1.39).

Note that changing cache_swap_high probably won't have a big impact on Squid's disk usage. In earlier versions of Squid, this parameter played a more important role; now, however, it doesn't.
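
Because the watermarks are percentages of the combined cache_dir sizes, it can be useful to translate them into megabytes. The helper `watermarks_mb` below is a hypothetical illustration of that arithmetic, not something Squid provides.

```python
def watermarks_mb(cache_dir_sizes_mb, low_pct=90, high_pct=95):
    """Convert cache_swap_low and cache_swap_high into megabytes."""
    total = sum(cache_dir_sizes_mb)
    return total * low_pct // 100, total * high_pct // 100


if __name__ == "__main__":
    # Two cache_dir lines of 7000 MB each.
    low, high = watermarks_mb([7000, 7000])
    print("low watermark:", low, "MB; high watermark:", high, "MB")
```

With two 7000 MB cache directories and the values shown above, steady-state disk usage should hover near 12,600 MB rather than the full 14,000 MB. This is worth remembering when you size the filesystems.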

7.3 Object Size Limits

You can control both the maximum and minimum size of cached objects. Responses larger than maximum_object_size aren't stored on disk. They are still proxied, however. The logic behind this directive is that you don't want a really big response to take up space better utilized by many small responses. The syntax is as follows:

    maximum_object_size size-specification

Here are some examples:

    maximum_object_size 100 KB
    maximum_object_size 1 MB
    maximum_object_size 12382 bytes
    maximum_object_size 2 GB

Squid checks the response size in two different ways. If the reply includes a Content-Length header, Squid compares its value to the maximum_object_size value. If the content length is the larger of the two numbers, the object becomes immediately uncachable and never consumes any disk space.

Unfortunately, not every response has a Content-Length header. In this case, Squid writes the response to disk as data comes in from the origin server. Squid checks the object size again only when the response is complete. Thus, if the object's size reaches the maximum_object_size limit, it continues consuming disk space. Squid increments the total cache size only when it is done reading a response.

In other words, the active, or in-transit, objects don't contribute to the cache size value Squid maintains internally. This is good because it means Squid won't remove other objects in the cache, unless the object remains cachable and then contributes to the total cache size. However, it is also bad because Squid may run out of free disk space if the reply is very large. To reduce the chance of this happening, you should also use the reply_body_max_size directive. A response that reaches the reply_body_max_size limit is cut off immediately.

Squid also has a minimum_object_size directive. It allows you to place a lower limit on the size of cached objects. Responses smaller than this size aren't stored on disk or in memory. Note that this size is compared to the response's content length (i.e., the size of the reply body), which excludes the HTTP headers.
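
The two-stage size check can be summarized in a short sketch. It models only the behavior described in this section; the function names are made up, and Squid's actual code paths are more involved.

```python
def early_size_check(content_length, max_size):
    """Check made when the reply headers arrive.

    Returns False when the response can be ruled out before any disk
    write. A missing Content-Length (None) cannot be ruled out yet.
    """
    if content_length is None:
        return True
    return content_length <= max_size


def final_size_check(body_size, min_size, max_size):
    """Check made once the whole response body has been read."""
    return min_size <= body_size <= max_size


if __name__ == "__main__":
    one_mb = 1048576
    print(early_size_check(5 * one_mb, one_mb))      # ruled out immediately
    print(early_size_check(None, one_mb))            # size unknown, keep going
    print(final_size_check(5 * one_mb, 0, one_mb))   # rejected only at the end
```

The second and third calls together describe the awkward case: a large reply without a Content-Length header passes the early check, uses disk space while in transit, and is rejected only after it completes.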

7.4 Allocating Objects to Cache Directories

When Squid wants to store a cachable response on disk, it calls a function that selects one of the cache directories. It then opens a disk file for writing on the selected directory. If, for some reason, the open() call fails, the response isn't stored. In this case, Squid doesn't try opening a disk file on one of the other cache directories.

Squid has two of these cache_dir selection algorithms. The default algorithm is called least-load; the alternative is round-robin.

The least-load algorithm, as the name implies, selects the cache directory that currently has the smallest workload. The notion of load depends on the underlying storage scheme. For the aufs, coss, and diskd schemes, the load is related to the number of pending operations. For ufs, the load is constant. For cases in which all cache_dirs have equal load, the algorithm uses free space and maximum object sizes as tie-breakers.

The selection algorithm also takes into account the max-size and read-only options. Squid skips a cache directory if it knows the object size is larger than the limit. It also always skips any read-only directories.

The round-robin algorithm also uses load measurements. It always selects the next cache directory in the list (subject to max-size and read-only), as long as its load is less than 100%.

Under some circumstances, Squid may fail to select a cache directory. This can happen if all cache_dirs are overloaded or if all have max-size limits less than the size of the object. In this case, Squid simply doesn't write the object to disk. You can use the cache manager to track the number of times Squid fails to select a cache directory. View the store_io page (see Section 14.2.1.41), and find the create.select_fail line.
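
The two selection algorithms can be sketched as follows. This is a simplified model built only from the description in this section, not a transcription of Squid's code. The `CacheDir` class, the percentage scale for load, and the use of free space as the only tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheDir:
    path: str
    load: int                       # assumed scale: percent, 100 = overloaded
    free_mb: int
    max_size: Optional[int] = None  # max-size option in bytes, if any
    read_only: bool = False


def eligible(d: CacheDir, object_size: Optional[int]) -> bool:
    """Apply the read-only, max-size, and overload checks."""
    if d.read_only:
        return False
    if d.max_size is not None and object_size is not None and object_size > d.max_size:
        return False
    return d.load < 100


def least_load(dirs: List[CacheDir], object_size: Optional[int] = None) -> Optional[CacheDir]:
    """Pick the eligible directory with the lowest load; more free space wins ties."""
    candidates = [d for d in dirs if eligible(d, object_size)]
    if not candidates:
        return None                 # counted as a selection failure
    return min(candidates, key=lambda d: (d.load, -d.free_mb))


def round_robin(dirs: List[CacheDir], start: int, object_size: Optional[int] = None) -> Optional[int]:
    """Return the index of the next eligible directory at or after `start`."""
    for offset in range(len(dirs)):
        index = (start + offset) % len(dirs)
        if eligible(dirs[index], object_size):
            return index
    return None
```

In both functions a result of None corresponds to the case where Squid doesn't write the object to disk at all.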
7.5 Replacement Policies The cache_replacem ent _policy direct ive cont rols t he replacem ent policy for Squid's disk cache. Version 2.5 offers t hree different replacem ent policies: least recent ly used ( LRU) , greedy dualsize frequency ( GDSF) , and least frequent ly used wit h dynam ic aging ( LFUDA) . LRU is t he default policy, not only for Squid, but for m ost ot her caching product s as well. LRU is a popular choice because it is alm ost t rivial t o im plem ent and provides very good perform ance. On 32- bit syst em s, LRU uses slight ly less m em ory t han t he ot hers ( 12 versus 16 byt es per obj ect ) . On 64- bit syst em s, all policies use 24 byt es per obj ect . Over t he years, m any researchers have proposed alt ernat ives t o LRU. These ot her policies are t ypically designed t o opt im ize a specific charact erist ic of t he cache, such as response t im e, hit rat io, or byt e hit rat io. While t he research alm ost always shows an im provem ent , t he result s can be m isleading. Som e of t he st udies use unrealist ically sm all cache sizes. Ot her st udies show t hat as cache size increases, t he choice of replacem ent policy becom es less im port ant . I f you want t o use t he GDSF or LFUDA policies, you m ust pass t he —enable- rem oval- policies opt ion t o t he ./ configure script ( see Sect ion 3.4.1) . Mart in Arlit t and John Dilley of HP Labs wrot e t he GDSF and LFUDA im plem ent at ion for Squid. You can read t heir paper online at ht t p: / / www.hpl.hp.com / t echreport s/ 1999/ HPL- 1999- 69.ht m l. My O'Reilly book, Web Caching, also t alks about t hese algorit hm s. The cache_replacem ent _policy direct ive is unique in an im port ant way. Unlike m ost of t he ot her squid.conf direct ives, t he locat ion of t his one is significant . The cache_replacm ent _policy value is act ually used when Squid parses a cache_dir direct ive. 
You can change the replacement policy for a cache_dir by setting the replacement policy beforehand. For example:

    cache_replacement_policy lru
    cache_dir ufs /cache0 2000 16 32
    cache_dir ufs /cache1 2000 16 32
    cache_replacement_policy heap GDSF
    cache_dir ufs /cache2 2000 16 32
    cache_dir ufs /cache3 2000 16 32

In this case, the first two cache directories use LRU replacement, and the second two use GDSF. This characteristic of the cache_replacement_policy directive is important to keep in mind if you ever decide to use the config option of the cache manager (see Section 14.2.1.7). The cache manager outputs only one (the last) replacement policy value, and places it before all of the cache directories. For example, you may have these lines in squid.conf:

    cache_replacement_policy heap GDSF
    cache_dir ufs /tmp/cache1 10 4 4
    cache_replacement_policy lru
    cache_dir ufs /tmp/cache2 10 4 4

but when you select config from the cache manager, you get:

    cache_replacement_policy lru
    cache_dir ufs /tmp/cache1 10 4 4
    cache_dir ufs /tmp/cache2 10 4 4

As you can see, the heap GDSF setting for the first cache directory has been lost.
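The position-dependent behavior described above can be illustrated with a small sketch. This is not Squid's parser; it is a hypothetical Python function, assign_policies, that walks squid.conf lines in order and records which policy is in effect when each cache_dir line is reached. The default of lru follows the text.

```python
def assign_policies(lines, default="lru"):
    """Return (cache_dir path, policy) pairs in configuration order."""
    current = default
    result = []
    for raw in lines:
        # Drop trailing comments and split the directive into fields.
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue
        if fields[0] == "cache_replacement_policy":
            current = " ".join(fields[1:])
        elif fields[0] == "cache_dir":
            # Fields: cache_dir <scheme> <path> <size> <L1> <L2> ...
            result.append((fields[2], current))
    return result


CONF = """\
cache_replacement_policy lru
cache_dir ufs /cache0 2000 16 32
cache_dir ufs /cache1 2000 16 32
cache_replacement_policy heap GDSF
cache_dir ufs /cache2 2000 16 32
cache_dir ufs /cache3 2000 16 32
""".splitlines()

if __name__ == "__main__":
    for path, policy in assign_policies(CONF):
        print(path, policy)
```

Running a check like this against your own squid.conf is one way to confirm which policy each cache directory uses, since the cache manager's config output can't tell you.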
7.6 Removing Cached Objects

At some point you may find it necessary to manually remove one or more objects from Squid's cache. This might happen if:

● One of your users complains about always receiving stale data.
● Your cache becomes "poisoned" with a forged response.
● Squid's cache index becomes corrupted after experiencing disk I/O errors or frequent crashes and restarts.
● You want to remove some large objects to free up room for new data.
● Squid was caching responses from local servers, and now you don't want it to.
Some of these problems can be solved by forcing a reload in a web browser. However, this doesn't always work. For example, some browsers display certain content types externally by launching another program; that program probably doesn't have a reload button or even know about caches.

You can always use the squidclient program to reload a cached object if necessary. Simply insert the -r option before the URI:

    % squidclient -r http://www.lrrr.org/junk >/tmp/foo

If you happen to have a refresh_pattern directive with the ignore-reload option set, you and your users may be unable to force a validation of the cached response. In that case, you'll be better off purging the offending object or objects.
7.6.1 Removing Individual Objects

Squid accepts a custom request method for removing cached objects. The PURGE method isn't one of the official HTTP request methods. It is different from DELETE, which Squid forwards to an origin server. A PURGE request asks Squid to remove the object given in the URI. Squid returns either 200 (Ok) or 404 (Not Found).

The PURGE method is somewhat dangerous because it removes cached objects. Squid disables the PURGE method unless you define an ACL for it. Normally you should allow PURGE requests only from localhost and perhaps a small number of trusted hosts. The configuration may look like this:

    acl AdminBoxes src 127.0.0.1 172.16.0.1 192.168.0.1
    acl Purge method PURGE
    http_access allow AdminBoxes Purge
    http_access deny Purge

The squidclient program provides an easy way to generate PURGE requests. For example:

    % squidclient -m PURGE http://www.lrrr.org/junk

Alternatively, you could use something else (such as a Perl script) to generate your own HTTP request. It can be very simple:

    PURGE http://www.lrrr.org/junk HTTP/1.0
    Accept: */*

Note that a URI alone doesn't uniquely identify a cached response. Squid also uses the original request method in the cache key. It may also use other request headers if the response contains a Vary header. When you issue a PURGE request, Squid looks for cached objects originally requested with the GET and HEAD methods. Furthermore, Squid also removes all variants of a response, unless you remove a specific variant by including the appropriate headers in the PURGE request. Squid removes only variants for GET and HEAD requests.
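As one illustration of generating your own request, here is a minimal Python sketch (rather than Perl). The function names build_purge_request and send_purge are hypothetical, and the proxy address 127.0.0.1:3128 is an assumption; substitute the address and port your Squid listens on. The request must come from a host that your PURGE ACL allows.

```python
import socket
from urllib.parse import urlsplit


def build_purge_request(uri):
    """Return the bytes of the simple HTTP/1.0 PURGE request shown above."""
    parts = urlsplit(uri)
    if not parts.scheme or not parts.netloc:
        raise ValueError("expected an absolute URI: %r" % (uri,))
    return ("PURGE %s HTTP/1.0\r\nAccept: */*\r\n\r\n" % uri).encode("ascii")


def send_purge(uri, proxy_host="127.0.0.1", proxy_port=3128, timeout=5.0):
    """Send a PURGE request to the proxy and return the numeric status code.

    The host and port defaults are assumptions, not values from the text.
    """
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_purge_request(uri))
        status_line = sock.makefile("rb").readline().decode("latin-1")
    # A status line looks like "HTTP/1.0 200 OK"; the code is the second field.
    return int(status_line.split()[1])
```

Per the text, a return value of 200 means the object was removed and 404 means it wasn't in the cache.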
7.6.2 Removing a Group of Objects

Unfortunately, Squid doesn't provide a good mechanism for removing a bunch of objects at once. This often comes up when someone wants to remove all objects belonging to a certain origin server.

Squid lacks this feature for a couple of reasons. First, Squid would have to perform a linear search through all cached objects. This is CPU-intensive and takes a long time. While Squid is searching, your users can experience a performance degradation. Second, Squid keeps MD5s, rather than URIs, in memory. MD5s are one-way hashes, which means, for example, that you can't tell if a given MD5 hash was generated from a URI that contains the string "www.example.com." The only way to know is to recalculate the MD5 from the original URI and see if they match. Because Squid doesn't have the URI, it can't perform the calculation.

So what can you do? You can use the data in access.log to get a list of URIs that might be in the cache. Then, feed them to squidclient or another utility to generate PURGE requests. For example:

    % awk '{print $7}' /usr/local/squid/var/logs/access.log \
        | grep www.example.com \
        | xargs -n 1 squidclient -m PURGE
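The same log-mining step can be sketched in Python. The hypothetical function uris_for_host below assumes the log format implied by the awk command, where the URI is the seventh whitespace-separated field. Unlike grep, which matches the string anywhere in the URI, this sketch compares the hostname exactly and drops duplicates so each URI is purged only once. The sample log lines are made up for illustration.

```python
from urllib.parse import urlsplit


def uris_for_host(log_lines, host):
    """Return unique URIs (in first-seen order) whose hostname equals host."""
    seen = set()
    ordered = []
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip short or malformed lines
        uri = fields[6]  # seventh field, as in awk's $7
        if urlsplit(uri).hostname != host:
            continue
        if uri not in seen:
            seen.add(uri)
            ordered.append(uri)
    return ordered


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    "1066036250.603 310 192.0.2.7 TCP_MISS/200 1503 GET http://www.example.com/a.html - DIRECT/192.0.2.1 text/html",
    "1066036251.101 12 192.0.2.7 TCP_HIT/200 1503 GET http://www.example.com/a.html - NONE/- text/html",
    "1066036252.900 95 192.0.2.8 TCP_MISS/200 800 GET http://other.example.net/b.gif - DIRECT/192.0.2.2 image/gif",
]

if __name__ == "__main__":
    for uri in uris_for_host(SAMPLE_LOG, "www.example.com"):
        print(uri)
```

Each URI printed could then be passed to squidclient -m PURGE, as in the shell pipeline.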
7.6.3 Removing All Objects

In extreme circumstances you may need to wipe out the entire cache, or at least one of the cache directories. First, you must make sure that Squid isn't running.

One of the easiest ways to make Squid forget about all cached objects is to overwrite the swap.state files. Note that you can't simply remove the swap.state files because Squid then scans the cache directories and opens all the object files. You also can't simply truncate swap.state to a zero-sized file. Instead, you should put a single byte there, like this:

    # echo '' > /usr/local/squid/var/cache/swap.state

When Squid reads the swap.state file, it gets an error because the record that should be there is too short. The next read results in an end-of-file condition, and Squid completes the rebuild procedure without loading any object metadata.

Note that this technique doesn't remove the cache files from your disk. You've only tricked Squid into thinking that the cache is empty. As Squid runs, it adds new files to the cache and may overwrite the old files. In some cases, this might cause your disk to run out of free space. If that happens to you, you need to remove the old files before restarting Squid again.

One way to remove cache files is with rm. However, it often takes a very long time to remove all the files that Squid has created. To get Squid running faster, you can rename the cache directory, create a new one, start Squid, and remove the old one at the same time. For example:

    # squid -k shutdown
    # cd /usr/local/squid/var
    # mv cache oldcache
    # mkdir cache
    # chown nobody:nobody cache
    # squid -z
    # squid -s
    # rm -rf oldcache &

Another technique is to simply run newfs (or mkfs) on the cache filesystem. This works only if you have the cache_dir on its own disk partition.
7.7 refresh_pattern

The refresh_pattern directive controls the disk cache only indirectly. It helps Squid decide whether a given request can be a cache hit or must be treated as a miss. Liberal settings increase your cache hit ratio but also increase the chance that users receive a stale response. Conservative settings, on the other hand, decrease both hit ratios and stale responses.

The refresh_pattern rules apply only to responses without an explicit expiration time. Origin servers can specify an expiration time with either the Expires header, or the Cache-Control: max-age directive.

You can put any number of refresh_pattern lines in the configuration file. Squid searches them in order for a regular expression match. When Squid finds a match, it uses the corresponding values to determine whether a cached response is fresh or stale. The refresh_pattern syntax is as follows:

    refresh_pattern [-i] regexp min percent max [options]

For example:

    refresh_pattern -i \.jpg$ 30 50% 4320 reload-into-ims
    refresh_pattern -i \.png$ 30 50% 4320 reload-into-ims
    refresh_pattern -i \.htm$ 0 20% 1440
    refresh_pattern -i \.html$ 0 20% 1440
    refresh_pattern -i . 5 25% 2880

The regexp parameter is a regular expression that is normally case-sensitive. You can make it case-insensitive with the -i option. Squid checks the refresh_pattern lines in order; it stops searching when one of the regular expression patterns matches the URI.

The min parameter is some number of minutes. It is, essentially, a lower bound on stale responses. A response can't be stale unless its time in the cache exceeds the minimum value. Similarly, max is an upper limit on fresh responses. A response can't be fresh unless its time in the cache is less than the maximum time.

Responses that fall between the minimum and maximum are subject to Squid's last-modified factor (LM-factor) algorithm. For such responses, Squid calculates the response age and the LM-factor and compares the LM-factor to the percent value. The response age is simply the amount of time passed since the origin server generated, or last validated, the response. The resource age is the difference between the Last-Modified and Date headers. The LM-factor is the ratio of the response age to the resource age.
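The first-match search can be sketched as follows, using the five example rules above. This is only an illustration: the function find_rule and the tuple layout are hypothetical, and Python's re module stands in for the regular expression library Squid actually uses, so unusual patterns may behave differently.

```python
import re

# (regexp, case_insensitive, min_minutes, percent, max_minutes),
# transcribed from the example refresh_pattern lines.
RULES = [
    (r"\.jpg$", True, 30, 50, 4320),
    (r"\.png$", True, 30, 50, 4320),
    (r"\.htm$", True, 0, 20, 1440),
    (r"\.html$", True, 0, 20, 1440),
    (r".", True, 5, 25, 2880),
]


def find_rule(uri, rules=RULES):
    """Return (min, percent, max) from the first rule whose regexp matches."""
    for pattern, icase, min_minutes, percent, max_minutes in rules:
        flags = re.IGNORECASE if icase else 0
        if re.search(pattern, uri, flags):
            return (min_minutes, percent, max_minutes)
    return None  # no rule matched


if __name__ == "__main__":
    for uri in ("http://example.com/LOGO.JPG",
                "http://example.com/index.html",
                "http://example.com/"):
        print(uri, find_rule(uri))
```

Because the search stops at the first match, the catch-all pattern (a single dot) belongs last; placed first, it would shadow every other rule.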
Figure 7-2 demonstrates the LM-factor algorithm. Squid caches an object that is 3 hours old (based on the Date and Last-Modified headers). With an LM-factor value of 50%, the response will be fresh for the next 1.5 hours, after which the object expires and is considered stale. If a user requests the cached object during the fresh period, Squid returns an unvalidated cache hit. For a request that occurs during the stale period, Squid forwards a validation request to the origin server.

Figure 7-2. Calculating expiration times based on LM-factor
It's important to understand the order in which Squid checks the various values. Here is a simplified description of Squid's refresh_pattern algorithm:

● The response is stale if the response age is greater than the refresh_pattern max value.
● The response is fresh if the LM-factor is less than the refresh_pattern percent value.
● The response is fresh if the response age is less than the refresh_pattern min value.
● Otherwise, the response is stale.
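The four steps above translate almost directly into code. The sketch below is a simplified model, not Squid's implementation. The function name is_fresh is hypothetical, ages are in seconds, and skipping the LM-factor test when the resource age is unknown (no Last-Modified header) is an assumption of this sketch.

```python
def is_fresh(response_age, resource_age, min_minutes, percent, max_minutes):
    """Apply the simplified refresh_pattern checks in the order listed.

    response_age: seconds since the response was generated or last validated.
    resource_age: Date minus Last-Modified in seconds, or None if unknown.
    """
    # 1. Stale if older than max.
    if response_age > max_minutes * 60:
        return False
    # 2. Fresh if the LM-factor is below the percent value.
    if resource_age is not None and resource_age > 0:
        lm_factor = response_age / resource_age
        if lm_factor < percent / 100.0:
            return True
    # 3. Fresh if younger than min.
    if response_age < min_minutes * 60:
        return True
    # 4. Otherwise stale.
    return False


HOUR = 3600

if __name__ == "__main__":
    # The Figure 7-2 scenario: a 3-hour-old resource with percent = 50.
    for hours_in_cache in (1.0, 2.0):
        fresh = is_fresh(hours_in_cache * HOUR, 3 * HOUR, 0, 50, 4320)
        print(hours_in_cache, "hours in cache ->", "fresh" if fresh else "stale")
```

With a 3-hour resource age and a 50% percent value, the response stays fresh until its age reaches 1.5 hours, which matches the figure.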
The refresh_pattern directive also has a handful of options that cause Squid to disobey the HTTP protocol specification. They are as follows:

override-expire
    When set, this option causes Squid to check the min value before checking the Expires header. Thus, a non-zero min time makes Squid return an unvalidated cache hit even if the response is preexpired.

override-lastmod
    When set, this option causes Squid to check the min value before the LM-factor percentage.

reload-into-ims
    When set, this option makes Squid transform a request with a no-cache directive into a validation (If-Modified-Since) request. In other words, Squid adds an If-Modified-Since header to the request before forwarding it on. Note that this only works for objects that have a Last-Modified timestamp. The outbound request retains the no-cache directive, so that it reaches the origin server.

ignore-reload
    When set, this option causes Squid to ignore the no-cache directive, if any, in the request.
7.8 Exercises

● Run df on your existing filesystems and calculate the ratio of inodes to disk space. If any of those partitions are used for Squid's disk cache, do you think you'll run out of space, or inodes first?
● Try to intentionally make Squid run out of disk space on a cache directory. How does Squid deal with this situation?
● Write a shell script to search the cache for given URIs and optionally remove them.
● Examine Squid's store.log and estimate the percentage of requests that are subject to the refresh_pattern rules.
● Can you think of any negative side effects of the ignore-reload, override-expire, and related options?
Chapter 8. Advanced Disk Cache Topics

Performance is one of the biggest concerns for Squid administrators. As the load placed on Squid increases, disk I/O is typically the primary bottleneck. This performance limitation stems from the importance that Unix filesystems place on consistency after a system crash.

By default, Squid uses a relatively simple storage scheme (ufs). All disk I/O is performed by the main Squid process. With traditional Unix filesystems, certain operations always block the calling process. For example, calling open() on the Unix Fast Filesystem (UFS) causes the operating system to allocate and initialize certain on-disk data structures. The system call doesn't return until these I/O operations complete, which may take longer than you'd like if the disks are already busy with other tasks. Under heavy load, these filesystem operations can block the Squid process for small, but significant, amounts of time.

The point at which the filesystem becomes a bottleneck depends on many different factors, including:

● The number of disk drives
● The rotational speed and seek time of your hard drives
● The type of disk drive interface (ATA, SCSI)
● Filesystem tuning options
● The number of files and percentage of free space
8.1 Do I Have a Disk I/O Bottleneck?

Web caches such as Squid don't usually come right out and tell you when disk I/O is becoming a bottleneck. Instead, response time and/or hit ratio degrade as load increases. The tricky thing is that response time and hit ratio may be changing for other reasons, such as increased network latency and changes in client request patterns.

Perhaps the best way to explore the performance limits of your cache is with a benchmark, such as Web Polygraph. The good thing about a benchmark is that you can fully control the environment and eliminate many unknowns. You can also repeat the same experiment with different cache configurations. Unfortunately, benchmarking often takes a lot of time and requires spare systems that aren't already being used.

If you have the resources to benchmark Squid, begin with a standard caching workload. As you increase the load, at some point you should see a significant increase in response time and/or a decrease in hit ratio. Once you observe this performance degradation, run the experiment again but with disk caching disabled. You can configure Squid never to cache any response (with the null storage scheme, see Section 8.7). Alternatively, you can configure the workload to have 100% uncachable responses. If the average response time is significantly better without caching, you can be relatively certain that disk I/O is a bottleneck at that level of throughput.

If you're like most people, you have neither the time nor resources to benchmark Squid. In this case, you can examine Squid's runtime statistics to look for disk I/O bottlenecks. The cache manager General Runtime Information page (see Chapter 14) gives you median response times for both cache hits and misses:

    Median Service Times (seconds)  5 min    60 min:
            HTTP Requests (All):   0.39928  0.35832
            Cache Misses:          0.42149  0.39928
            Cache Hits:            0.12783  0.11465
            Near Hits:             0.37825  0.39928
            Not-Modified Replies:  0.07825  0.07409
For a healthy Squid cache, hits are significantly faster than misses. Your median hit response time should usually be 0.5 seconds or less. I strongly recommend that you use SNMP or another network monitoring tool to collect periodic measurements from your Squid caches (see Chapter 14). A significant (factor of two) increase in median hit response time is a good indication that you have a disk I/O bottleneck.

If you believe your production cache is suffering in this manner, you can test your theory with the same technique mentioned previously. Configure Squid not to cache any responses, thus avoiding all disk I/O. Then closely observe the cache miss response time. If it goes down, your theory is probably correct.
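If you collect these numbers periodically, the factor-of-two rule of thumb is easy to automate. The sketch below is only an illustration: the function names are hypothetical, and the parser assumes the "Cache Hits:" line has the two-column layout shown in the sample output above.

```python
def median_hit_times(info_text):
    """Return (5-minute, 60-minute) median hit times from the 'Cache Hits:' line."""
    for line in info_text.splitlines():
        if line.strip().startswith("Cache Hits:"):
            five_min, sixty_min = line.split(":", 1)[1].split()[:2]
            return float(five_min), float(sixty_min)
    raise ValueError("no 'Cache Hits:' line found")


def hit_time_suggests_bottleneck(baseline_seconds, current_seconds, factor=2.0):
    """True if the median hit time has grown by at least the given factor."""
    return current_seconds >= factor * baseline_seconds


SAMPLE = """\
Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.39928  0.35832
        Cache Misses:          0.42149  0.39928
        Cache Hits:            0.12783  0.11465
        Near Hits:             0.37825  0.39928
        Not-Modified Replies:  0.07825  0.07409
"""

if __name__ == "__main__":
    five_min, sixty_min = median_hit_times(SAMPLE)
    print(hit_time_suggests_bottleneck(sixty_min, five_min))
```

Comparing the 5-minute median against a longer-term baseline, as the demo does, is one possible choice; a baseline recorded during a known-healthy period may be more meaningful.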
Once you've convinced yourself that disk throughput is limiting Squid's performance, you can try a number of things to improve it. Some of these require recompiling Squid, while others are relatively simple steps you can take to tune the Unix filesystems.
8.2 Filesystem Tuning Options

First of all, you should never use RAID for Squid cache directories. In my experience, RAID always degrades filesystem performance for Squid. It is much better to have a number of separate filesystems, each dedicated to a single disk drive.

I have found four simple ways to improve UFS performance for Squid. Some of these are specific to certain operating systems, such as BSD and Linux, and may not be available on your platform:

● Some UFS implementations support a noatime mount option. Filesystems mounted with noatime don't update the inode access time value for reads. The easiest way to use this option is to add it to /etc/fstab like this:

    # Device      Mountpoint  FStype  Options     Dump  Pass#
    /dev/ad1s1c   /cache0     ufs     rw,noatime  0     0
● Check your mount(8) manpage for the async option. With this option set, certain I/O operations (such as directory updates) may be performed asynchronously. The documentation for some systems notes that it is a dangerous flag. Should your system crash, you may lose the entire filesystem. For many installations, the performance improvement is worth the risk. You should use this option only if you don't mind losing the contents of your entire cache. If the cached data is very valuable, the async option is probably not for you.

● BSD has a feature called soft updates. Soft updates are BSD's alternative to journaling filesystems.[1] On FreeBSD, you can enable this option on an unmounted filesystem with the tunefs command:

    # umount /cache0
    # tunefs -n enable /cache0
    # mount /cache0

[1] For further information, please see "Soft Updates: A Technique for Eliminating Most Synchronous Writes in the Fast File System" by Marshall Kirk McKusick and Gregory R. Ganger. Proceedings of the 1999 USENIX Annual Technical Conference, June 6-11, 1999, Monterey, California.
You only have to run tunefs once for each filesystem. Soft updates are automatically enabled on the filesystem again when your system reboots. On OpenBSD and NetBSD, you can use the softdep mount option:

    # Device      Mountpoint  FStype  Options     Dump  Pass#
    /dev/sd0f     /usr        ffs     rw,softdep  1     2
If you're like me, you're probably wondering what the difference is between the async option and soft updates. One important difference is that soft update code has been designed to maintain filesystem consistency in the event of a system crash, while the async option has not. This might lead you to conclude that async performs better than soft updates. However, as I show in Appendix D, the opposite is true.

Previously, I mentioned that UFS performance, especially writing, depends on the amount of free space. Disk writes for empty filesystems are much faster than for full ones. This is one reason behind UFS's minfree parameter and space/time optimization tradeoffs. If your cache disks are full and Squid's performance seems bad, try reducing the cache_dir capacity values so that more free space is available. Of course, this reduction in cache size also decreases your hit ratio, but the response time improvement may be worth it. If you're buying the components for a new Squid cache, consider getting much larger disks than you need and using only half the space.
8.3 Alternative Filesystems

Some operating systems support filesystems other than UFS (or ext2fs). Journaling filesystems are a common alternative. The primary difference between UFS and journaling filesystems is in the way that they handle updates. With UFS, updates are made in-place. For example, when you change a file and save it to disk, the new data replaces the old data. When you remove a file, UFS updates the directory directly.

A journaling filesystem, on the other hand, writes updates to a separate journal, or log file. You can typically select whether to journal file changes, metadata changes, or both. A background process reads the journal during idle moments and applies the actual changes. Journaling filesystems typically recover much faster from crashes than UFS. After a crash, the filesystem simply reads the journal and commits all the outstanding changes.

The primary drawback of journaling filesystems is that they require additional disk writes. Changes are first written to the log and later to the actual files and/or directories. This is particularly relevant for web caches because they tend to have more disk writes than reads in the first place.

Journaling filesystems are available for a number of operating systems. On Linux, you can choose from ext3fs, reiserfs, XFS, and others. XFS is also available for SGI/IRIX, where it was originally developed. Solaris users can use the Veritas filesystem product. The TRU64 (formerly Digital Unix) Advanced Filesystem (advfs) supports journaling.

You can use a journaling filesystem without making any changes to Squid's configuration. Simply create and mount the filesystem as described in your operating system documentation. You don't need to change the cache_dir line in squid.conf.

Use a command like this to make a reiserfs filesystem on Linux:

    # /sbin/mkreiserfs /dev/sda2

For XFS, use:

    # mkfs -t xfs -f /dev/sda2

Note that ext3fs is simply ext2fs with journaling enabled. Use the -j option to mke2fs when creating the filesystem:

    # /sbin/mke2fs -j /dev/sda2

Refer to your documentation (e.g., manpages) for other operating systems.
8.4 The aufs Storage Scheme

The aufs storage scheme has evolved out of the very first attempt to improve Squid's disk I/O response time. The "a" stands for asynchronous I/O. The only difference between the default ufs scheme and aufs is that I/Os aren't executed by the main Squid process. The data layout and format is the same, so you can easily switch between the two schemes without losing any cache data.

aufs uses a number of thread processes for disk I/O operations. Each time Squid needs to read, write, open, close, or remove a cache file, the I/O request is dispatched to one of the thread processes. When the thread completes the I/O, it signals the main Squid process and returns a status code. Actually, in Squid 2.5, certain file operations aren't executed asynchronously by default. Most notably, disk writes are always performed synchronously. You can change this by setting ASYNC_WRITE to 1 in src/fs/aufs/store_asyncufs.h and recompiling.

The aufs code requires a pthreads library. This is the standard threads interface, defined by POSIX. Even though pthreads is available on many Unix systems, I often encounter compatibility problems and differences. The aufs storage system seems to run well only on Linux and Solaris. Even though the code compiles, you may encounter serious problems on other operating systems.

To use aufs, you must add a special ./configure option:

    % ./configure --enable-storeio=aufs,ufs

Strictly speaking, you don't really need to specify ufs in the list of storeio modules. However, you might as well because if you try aufs and don't like it, you'll be able to fall back to the plain ufs storage scheme.

You can also use the --with-aio-threads=N option if you like. If you omit it, Squid automatically calculates the number of threads to use based on the number of aufs cache_dirs.
Table 8-1 shows the default number of threads for up to six cache directories.

Table 8-1. Default number of threads for up to six cache directories

    cache_dirs    Threads
    1             16
    2             26
    3             32
    4             36
    5             40
    6             44
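The values in Table 8-1 follow a simple pattern: the first cache directory adds 16 threads, and each additional directory adds fewer, down to a floor of 4. The sketch below reproduces the table under that assumption. The text doesn't state the formula, so the two-thirds shrink rule here is a guess that happens to fit the six published values; check the Squid source before relying on it for larger counts.

```python
def default_aufs_threads(n_cache_dirs):
    """Estimate the default aufs thread count for a number of cache_dirs.

    Assumed rule (not stated in the text): the first directory contributes
    16 threads, and each later contribution is two-thirds of the previous
    one (integer division), never dropping below 4.
    """
    total = 0
    step = 16
    for _ in range(n_cache_dirs):
        total += step
        step = max(4, step * 2 // 3)
    return total


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, default_aufs_threads(n))
```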
After you compile aufs support into Squid, you can specify it on a cache_dir line in squid.conf:

    cache_dir aufs /cache0 4096 16 256

After starting Squid with aufs enabled, make sure everything still works correctly. You may want to run tail -f store.log for a while to make sure that objects are being swapped out to disk. You should also run tail -f cache.log and look for any new errors or warnings.
8.4.1 How aufs Works

Squid creates a number of thread processes by calling pthread_create(). All threads are created upon the first disk activity. Thus, you'll see all the thread processes even if Squid is idle.

Whenever Squid wants to perform some disk I/O operation (e.g., to open a file for reading), it allocates a couple of data structures and places the I/O request into a queue. The thread processes have a loop that takes I/O requests from the queue and executes them. Because the request queue is shared by all threads, Squid uses mutex locks to ensure that only one thread updates the queue at a given time.

The I/O operations block the thread process until they are complete. Then, the status of the operation is placed on a done queue. The main Squid process periodically checks the done queue for completed operations. The module that requested the disk I/O is notified that the operation is complete, and the request or response processing proceeds.

As you may have guessed, aufs can take advantage of systems with multiple CPUs. The only locking that occurs is on the request and result queues. Otherwise, all other functions execute independently. While the main process executes on one CPU, another CPU handles the actual I/O system calls.
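The request-queue and done-queue design can be modeled in a few lines. The sketch below is a toy illustration in Python, not Squid's C code. The class AsyncIO and its method names are hypothetical (check_callback borrows the name of the counter discussed in Section 8.4.3). Python's queue.Queue does its own locking, which plays the role of the mutex described above.

```python
import os
import queue
import tempfile
import threading


class AsyncIO:
    """Toy model of aufs: worker threads run blocking I/O for a main loop."""

    def __init__(self, n_threads=4):
        self.requests = queue.Queue()  # main thread -> workers
        self.done = queue.Queue()      # workers -> main thread
        self.threads = [threading.Thread(target=self._worker, daemon=True)
                        for _ in range(n_threads)]
        for thread in self.threads:
            thread.start()

    def _worker(self):
        while True:
            item = self.requests.get()
            if item is None:  # sentinel: no more work
                break
            tag, func, args = item
            try:
                # The blocking call happens here, off the main thread.
                self.done.put((tag, func(*args), None))
            except OSError as exc:
                self.done.put((tag, None, exc))

    def submit(self, tag, func, *args):
        """Queue an I/O operation; returns immediately."""
        self.requests.put((tag, func, args))

    def check_callback(self):
        """Collect whatever has completed so far, without blocking."""
        completed = []
        while True:
            try:
                completed.append(self.done.get_nowait())
            except queue.Empty:
                return completed

    def shutdown(self):
        """Let workers finish queued requests, then stop them."""
        for _ in self.threads:
            self.requests.put(None)
        for thread in self.threads:
            thread.join()


def read_file(path):
    with open(path, "rb") as handle:
        return handle.read()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "object")
        with open(path, "wb") as handle:
            handle.write(b"cached body")
        aio = AsyncIO(n_threads=2)
        aio.submit("read-1", read_file, path)
        aio.submit("read-missing", read_file, os.path.join(directory, "nope"))
        aio.shutdown()
        return sorted(aio.check_callback(), key=lambda result: result[0])
```

In a real event loop, the main thread would call check_callback between other tasks rather than waiting for shutdown as the demo does.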
8.4.2 aufs Issues

An interesting property of threads is that all processes share the same resources, including memory and file descriptors. For example, when a thread process opens a file as descriptor 27, all other threads can then access that file with the same descriptor number. As you probably know, file-descriptor shortage is a common problem with first-time Squid administrators. Unix kernels typically have two file-descriptor limits: per process and systemwide. While you might think that 256 file descriptors per process is plenty (because of all the thread processes), it doesn't work that way. In this case, all threads share that small number of descriptors. Be sure to increase your system's per-process file descriptor limit to 4096 or higher, especially when using aufs.

Tuning the number of threads can be tricky. In some cases, you might see this warning in cache.log:

    2003/09/29 13:42:47| squidaio_queue_request: WARNING - Disk I/O overloading

It means that Squid has a large number of I/O operations queued up, waiting for an available thread. Your first instinct may be to increase the number of threads. I would suggest, however, that you decrease the number instead.

Increasing the number of threads also increases the queue size. Past a certain point, it doesn't increase aufs's load capacity. It only means that more operations become queued. Longer queues result in higher response times, which is probably something you'd like to avoid.

Decreasing the number of threads, and the queue size, means that Squid can detect the overload condition faster. When a cache_dir is overloaded, it is removed from the selection algorithm (see Section 7.4). Then, Squid either chooses a different cache_dir or simply doesn't store the response on disk. This may be a better situation for your users. Even though the hit ratio goes down, response time remains relatively low.
8.4.3 Monitoring aufs Operation

The Async IO Counters option in the cache manager menu displays a few statistics relating to aufs. It shows counters for the number of open, close, read, write, stat, and unlink requests received. For example:

    % squidclient mgr:squidaio_counts
    ...
    ASYNC IO Counters:
    Operation       # Requests
    open             15318822
    close            15318813
    cancel           15318813
    write                   0
    read             19237139
    stat                    0
    unlink            2484325
    check_callback  311678364
    queue                   0
The cancel counter is normally equal to the close counter. This is because the close function always calls the cancel function to ensure that any pending I/O operations are ignored.

The write counter is zero because this version of Squid performs writes synchronously, even for aufs.

The check_callback counter shows how many times the main Squid process has checked the done queue for completed operations.

The queue value indicates the current length of the request queue. Normally, the queue length should be less than five times the number of threads. If you repeatedly observe a queue length larger than this, you may be pushing Squid too hard. Adding more threads may help but only to a certain point.
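A monitoring script could apply the five-times rule automatically. The sketch below is a hypothetical helper, assuming the counter output has the two-column "name value" layout shown in the example above. The thread count is something you supply yourself (for instance, from Table 8-1 or your --with-aio-threads setting).

```python
def parse_aio_counters(text):
    """Return {counter_name: value} from lines of the form 'name number'."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def queue_is_healthy(queue_length, n_threads):
    """Apply the rule of thumb: queue length below five times the threads."""
    return queue_length < n_threads * 5


SAMPLE_COUNTERS = """\
ASYNC IO Counters:
Operation       # Requests
open             15318822
close            15318813
cancel           15318813
write                   0
read             19237139
stat                    0
unlink            2484325
check_callback  311678364
queue                   0
"""

if __name__ == "__main__":
    counters = parse_aio_counters(SAMPLE_COUNTERS)
    print(queue_is_healthy(counters["queue"], n_threads=16))
```

Because the text says to worry only when you repeatedly observe a long queue, a real check would sample the counter several times before raising an alarm.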
8.5 The diskd Storage Scheme

diskd (short for disk daemons) is similar to aufs in that disk I/Os are executed by external processes. Unlike aufs, however, diskd doesn't use threads. Instead, inter-process communication occurs via message queues and shared memory.

Message queues are a standard feature of modern Unix operating systems. They were invented many years ago in AT&T's Unix System V, Release 1. The messages passed between processes on these queues are relatively small: 32-40 bytes. Each diskd process uses one queue for receiving requests from Squid and another queue for transmitting results back.
8.5.1 How diskd Works Squid creat es one diskd process for each cache_dir. This is different from aufs, which uses a large pool of t hreads for all cache_dirs. Squid sends a m essage t o t he corresponding diskd process for each I / O operat ion. When t hat operat ion is com plet e, t he diskd process sends a st at us m essage back t o Squid. Squid and t he diskd processes preserve t he order of m essages in t he queues. Thus, t here is no concern t hat I / Os m ight be execut ed out of sequence. For reads and writ es, Squid and t he diskd processes use a shared m em ory area. Bot h processes can read from , and writ e t o, t his area of m em ory. For exam ple, when Squid issues a read request , it t ells t he diskd process where t o place t he dat a in m em ory. diskd passes t his m em ory locat ion t o t he read( ) syst em call and not ifies Squid t hat t he read is com plet e by sending a m essage on t he ret urn queue. Squid t hen accesses t he recent ly read dat a from t he shared m em ory area. diskd ( as wit h aufs) essent ially gives Squid nonblocking disk I / Os. While t he diskd processes are blocked on I / O operat ions, Squid is free t o work on ot her t asks. This works really well as long as t he diskd processes can keep up wit h t he load. Because t he m ain Squid process is now able t o do m ore work, it 's possible t hat it m ay overload t he diskd helpers. The diskd im plem ent at ion has t wo feat ures t o help out in t his sit uat ion. First , Squid wait s for t he diskd processes t o cat ch up if one of t he queues exceeds a cert ain t hreshold. The default value is 64 out st anding m essages. I f a diskd process get s t his far behind, Squid " sleeps" a sm all am ount of t im e and wait s for it t o com plet e som e of t he pending operat ions. This essent ially put s Squid int o a blocking I / O m ode. I t also m akes m ore CPU t im e available t o t he diskd processes. 
You can configure this threshold by specifying a value for the Q2 parameter on a cache_dir line:

    cache_dir diskd /cache0 7000 16 256 Q2=50

Second, Squid stops asking the diskd process to open files if the number of outstanding operations reaches another threshold. Here, the default value is 72 messages. If Squid would like to open a disk file for reading or writing, but the selected cache_dir has too many pending operations, the open request fails internally. When trying to open a file for reading, this causes a cache miss instead of a cache hit. When opening files for writing, it prevents Squid from storing a cachable response. In both cases the user still receives a valid response. The only real effect is that Squid's hit ratio decreases. This threshold is configurable with the Q1 parameter:

    cache_dir diskd /cache0 7000 16 256 Q1=60 Q2=50

Note that in some versions of Squid, the Q1 and Q2 parameters are mixed up in the default configuration file. For optimal performance, Q1 should be greater than Q2.
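The two thresholds can be pictured with a small model. The following Python sketch is not Squid's code; it only restates the behavior described above, and the function name and return format are made up for illustration. It assumes "exceeds" means strictly greater than Q2 and "reaches" means greater than or equal to Q1.

```python
def diskd_queue_state(outstanding, q1=72, q2=64):
    """Model what Squid does for one diskd cache_dir, per the text above.

    outstanding: messages sent to diskd that have not been answered yet.
    q1: open-refusal threshold (default 72, as stated in the text).
    q2: blocking threshold (default 64, as stated in the text).
    """
    return {
        # Squid sleeps and waits for diskd once the queue exceeds Q2.
        "block": outstanding > q2,
        # Squid fails open requests internally once the queue reaches Q1.
        "refuse_open": outstanding >= q1,
    }


if __name__ == "__main__":
    for away in (14, 65, 72):
        print(away, diskd_queue_state(away))
```

With the defaults, a queue of 14 outstanding messages triggers neither feature, 65 puts Squid into blocking mode, and 72 also makes open requests fail.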
8.5.2 Compiling and Configuring diskd

To use diskd, you must add it to the --enable-storeio list when running ./configure:

    % ./configure --enable-storeio=ufs,diskd

diskd seems to be portable since shared memory and message queues are widely supported on modern Unix systems. However, you'll probably need to adjust a few kernel limits relating to both. Kernels typically have the following variables or parameters:
MSGMNB
    This is the maximum characters (octets) per message queue. With diskd, the practical limit is about 100 outstanding messages per queue. The messages that Squid passes are 32-40 octets, depending on your CPU architecture. Thus, MSGMNB should be 4000 or more. To be safe, I recommend setting this to 8192.

MSGMNI
    This is the maximum number of message queues for the whole system. Squid uses two queues for each diskd cache_dir. If you have 10 disks, that's 20 queues. You should probably add even more in case other applications also use message queues. I recommend a value of 40.

MSGSSZ
    This is the size of a message segment, in octets. Messages larger than this size are split into multiple segments. I usually set this to 64 so that the diskd message isn't split into multiple segments.

MSGSEG
    This is the maximum number of message segments that can exist in a single queue. Squid normally limits the queues to 100 outstanding messages. Remember that if you don't increase MSGSSZ to 64 on 64-bit architectures, each message requires more than one segment. To be safe, I recommend setting this to 512.

MSGTQL
    This is the maximum number of messages that can exist in the whole system. It should be at least 100 multiplied by the number of cache_dirs. I recommend setting it to 2048, which should be more than enough for as many as 10 cache directories.

MSGMAX
    This is the maximum size of a single message. For Squid, 64 bytes should be sufficient. However, your system may have other applications that use larger messages. On some operating systems such as BSD, you don't need to set this. BSD automatically sets it to MSGSSZ × MSGSEG. On other systems you may need to increase the value from its default. In this case, you can set it to the same as MSGMNB.

SHMSEG
    This is the maximum number of shared memory segments allowed per process. Squid uses one shared memory identifier for each cache_dir. I recommend a setting of 16 or higher.

SHMMNI
    This is the systemwide limit on the number of shared memory segments. A value of 40 is probably enough in most cases.

SHMMAX
    This is the maximum size of a single shared memory segment. By default, Squid uses about 409,600 bytes for each segment. Just to be safe, I recommend setting this to 2 MB, or 2,097,152.

SHMALL
    This is the systemwide limit on the amount of shared memory that can be allocated. On some systems, SHMALL may be expressed as a number of pages, rather than bytes. Setting this to 16 MB (4096 pages) is enough for 10 cache_dirs with plenty remaining for other applications.
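The arithmetic behind these recommendations is easy to redo for a different number of disks. The following Python sketch computes only the lower bounds implied by the descriptions above (100 outstanding messages per queue, up to 40 octets per message, two queues and one shared memory segment of about 409,600 bytes per cache_dir). The function name and dictionary keys are hypothetical, and the results are minimums for Squid alone; the recommended settings above are deliberately larger to leave room for other applications.

```python
def diskd_ipc_minimums(cache_dirs, msg_octets=40, queue_depth=100,
                       shm_bytes=409600):
    """Lower bounds for the IPC limits, derived from the text above."""
    return {
        # Octets needed to hold a full queue of diskd messages.
        "MSGMNB": queue_depth * msg_octets,
        # Two message queues per diskd cache_dir.
        "MSGMNI": 2 * cache_dirs,
        # At least 100 messages per cache_dir, systemwide.
        "MSGTQL": queue_depth * cache_dirs,
        # One shared memory segment per cache_dir.
        "SHMSEG": cache_dirs,
        "SHMMNI": cache_dirs,
        # Size of one segment, and of all segments together, in bytes.
        "SHMMAX": shm_bytes,
        "SHMALL_bytes": cache_dirs * shm_bytes,
    }


if __name__ == "__main__":
    for name, value in diskd_ipc_minimums(10).items():
        print(f"{name:13s} >= {value}")
```

For 10 cache_dirs this gives 4000 for MSGMNB, 20 for MSGMNI, and 1000 for MSGTQL, all comfortably below the recommended 8192, 40, and 2048.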
To configure message queues on BSD, add these options to your kernel configuration file: [2]

    # System V message queues and tunable parameters
    options         SYSVMSG         # include support for message queues
    options         MSGMNB=8192     # max characters per message queue
    options         MSGMNI=40       # max number of message queue identifiers
    options         MSGSEG=512      # max number of message segments per queue
    options         MSGSSZ=64       # size of a message segment MUST be power of 2
    options         MSGTQL=2048     # max number of messages in the system
    options         SYSVSHM
    options         SHMSEG=16       # max shared mem segments per process
    options         SHMMNI=32       # max shared mem segments in the system
    options         SHMMAX=2097152  # max size of a shared mem segment
    options         SHMALL=4096     # max size of all shared memory (pages)

[2] OpenBSD is a little different. Use option instead of options, and specify the SHMMAX value in pages, rather than bytes.
To configure message queues on Linux, add these lines to /etc/sysctl.conf:

    kernel.msgmnb=8192
    kernel.msgmni=40
    kernel.msgmax=8192
    kernel.shmall=2097152
    kernel.shmmni=32
    kernel.shmmax=16777216

Alternatively, or if you find that you need more control, you can manually edit include/linux/msg.h and include/linux/shm.h in your kernel sources.

For Solaris, add these lines to /etc/system and then reboot:

    set msgsys:msginfo_msgmax=8192
    set msgsys:msginfo_msgmnb=8192
    set msgsys:msginfo_msgmni=40
    set msgsys:msginfo_msgssz=64
    set msgsys:msginfo_msgtql=2048
    set shmsys:shminfo_shmmax=2097152
    set shmsys:shminfo_shmmni=32
    set shmsys:shminfo_shmseg=16

For Digital Unix (TRU64), you can probably add lines to the kernel configuration in the style of BSD, seen previously. Alternatively, you can use the sysconfig command. First, create a file called ipc.stanza like this:

    ipc:
        msg-max = 2048
        msg-mni = 40
        msg-tql = 2048
        msg-mnb = 8192
        shm-seg = 16
        shm-mni = 32
        shm-max = 2097152
        shm-max = 4096

Now, run this command and reboot:

    # sysconfigdb -a -f ipc.stanza

After you have message queues and shared memory configured in your operating system, you can add the cache_dir lines to squid.conf:

    cache_dir diskd /cache0 7000 16 256 Q1=72 Q2=64
    cache_dir diskd /cache1 7000 16 256 Q1=72 Q2=64
    ...

If you forget to increase the message queue limits, or if you don't set them high enough, you'll see messages like this in cache.log:

    2003/09/29 01:30:11| storeDiskdSend: msgsnd: (35) Resource temporarily unavailable
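If you want to check an existing cache.log for this symptom, a few lines of scripting are enough. The following Python sketch is only an illustration: the function name is hypothetical, and the pattern is based solely on the single sample line shown above, so other Squid versions may word the message differently.

```python
import re

# Matches lines such as:
# 2003/09/29 01:30:11| storeDiskdSend: msgsnd: (35) Resource temporarily unavailable
MSGSND_ERROR = re.compile(r"storeDiskdSend: msgsnd: \((\d+)\) (.+)")


def find_msgsnd_errors(lines):
    """Return (errno, message) pairs for each diskd msgsnd failure found."""
    hits = []
    for line in lines:
        match = MSGSND_ERROR.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2).strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "2003/09/29 01:30:11| storeDiskdSend: msgsnd: (35) "
        "Resource temporarily unavailable",
    ]
    print(find_msgsnd_errors(sample))
```

The function accepts any iterable of lines, so an open cache.log file object can be passed directly. A nonempty result suggests that the message queue limits are still too low.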
8.5.3 Monitoring diskd

The best way to monitor diskd performance is with the cache manager. Request the diskd page; for example:

    % squidclient mgr:diskd
    ...
    sent_count: 755627
    recv_count: 755627
    max_away: 14
    max_shmuse: 14
    open_fail_queue_len: 0
    block_queue_len: 0

                OPS  SUCCESS  FAIL
    open      51534    51530     4
    create    67232    67232     0
    close    118762   118762     0
    unlink    56527    56526     1
    read      98157    98153     0
    write    363415   363415     0

See Section 14.2.1.6 for a description of this output.
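If you collect this page regularly, the per-operation counters are easy to pull out with a script. The following Python sketch assumes each row of the table consists of an operation name followed by three integers (OPS, SUCCESS, FAIL), as in the sample above. The function names are hypothetical, and the layout may differ between Squid versions.

```python
def parse_diskd_ops(text):
    """Map operation name -> (ops, success, fail) from mgr:diskd output."""
    ops = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: open 51534 51530 4
        if len(fields) == 4 and all(f.isdigit() for f in fields[1:]):
            ops[fields[0]] = tuple(int(f) for f in fields[1:])
    return ops


def failure_rate(ops, name):
    """Fraction of operations of the given type that failed."""
    total, _success, fail = ops[name]
    return fail / total if total else 0.0


if __name__ == "__main__":
    sample = """
                OPS  SUCCESS  FAIL
    open      51534    51530     4
    unlink    56527    56526     1
    """
    table = parse_diskd_ops(sample)
    print(table)
    print(f"open failure rate: {failure_rate(table, 'open'):.6f}")
```

The header row and the counter lines such as sent_count are skipped because they don't have three numeric columns.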
8.6 The coss Storage Scheme

The Cyclic Object Storage Scheme (coss) is an attempt to develop a custom filesystem for Squid. With the ufs-based schemes, the primary performance bottleneck comes from the need to execute so many open() and unlink() system calls. Because each cached response is stored in a separate disk file, Squid is always opening, closing, and removing files.

coss, on the other hand, uses one big file to store all responses. In this sense, it is a small, custom filesystem specifically for Squid. coss implements many of the functions normally handled by the underlying filesystem, such as allocating space for new data and remembering where there is free space.

Unfortunately, coss is still a little rough around the edges. Development of coss has been proceeding slowly over the last couple of years. Nonetheless, I'll describe it here in case you feel adventurous.
8.6.1 How coss Works

On the disk, each coss cache_dir is just one big file. The file grows in size until it reaches its maximum size. At this point, Squid starts over at the beginning of the file, overwriting any data already stored there. [3] Thus, new objects are always stored at the "end" of this cyclic file.

[3] The beginning is the location where data was first written; the end is the location where data was most recently written.

Squid actually doesn't write new object data to disk immediately. Instead, the data is copied into a 1-MB memory buffer, called a stripe. A stripe is written to disk when it becomes full. coss uses asynchronous writes so that the main Squid process doesn't become blocked on disk I/O.

As with other filesystems, coss also uses the blocksize concept. Back in Section 7.1.4, I talked about file numbers. Each cached object has a file number that Squid uses to locate the data on disk. For coss, the file number is the same as the block number. For example, a cached object with a swap file number equal to 112 starts at the 112th block in a coss filesystem. File numbers aren't allocated sequentially with coss. Some file numbers are unavailable because cached objects generally occupy more than one block in the coss file.

The coss block size is configurable with a cache_dir option. Because Squid's file numbers are only 24 bits, the block size determines the maximum size of a coss cache directory: size = block_size × 2^24. For example, with a 512-byte block size, you can store up to 8 GB in a coss cache_dir.

coss doesn't implement any of Squid's normal cache replacement algorithms (see Section 7.5). Instead, cache hits are "moved" to the end of the cyclic file. This is, essentially, the LRU algorithm. It does, unfortunately, mean that cache hits cause disk writes, albeit indirectly.

With coss, there is no need to unlink or remove cached objects. Squid simply forgets about the space allocated to objects that are removed. The space will be reused eventually when the end of the cyclic file reaches that place again.
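The relationship between file numbers, block numbers, and the maximum cache_dir size comes down to a few lines of arithmetic. The following Python sketch works through it. The function names are made up for illustration, and the byte-offset calculation assumes that block numbers count from zero, which the text above doesn't state.

```python
FILE_NUMBER_BITS = 24  # Squid's file numbers are only 24 bits


def coss_max_bytes(block_size):
    """Largest coss cache_dir addressable with the given block size."""
    return block_size * 2 ** FILE_NUMBER_BITS


def coss_offset(file_number, block_size):
    """Byte offset at which an object starts (assumes zero-based blocks)."""
    return file_number * block_size


def coss_blocks_needed(object_bytes, block_size):
    """Number of blocks, and thus file numbers, one object occupies."""
    return -(-object_bytes // block_size)  # ceiling division


if __name__ == "__main__":
    print(coss_max_bytes(512) // 2 ** 30, "GB with 512-byte blocks")
    print(coss_offset(112, 512), "is the byte offset of file number 112")
    print(coss_blocks_needed(1300, 512), "blocks for a 1300-byte object")
```

The last function shows why file numbers aren't allocated sequentially: an object that spans three blocks makes the next two file numbers unavailable.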
8.6.2 Compiling and Configuring coss

To use coss, you must add it to the --enable-storeio list when running ./configure:

    % ./configure --enable-storeio=ufs,coss ...

coss cache directories require a max-size option. Its value must be less than the stripe size (1 MB by default, but configurable with the --enable-coss-membuf-size option). Also note that you must omit the L1 and L2 values that are normally present for ufs-based schemes. Here is an example:

    cache_dir coss /cache0/coss 7000 max-size=1000000
    cache_dir coss /cache1/coss 7000 max-size=1000000
    cache_dir coss /cache2/coss 7000 max-size=1000000
    cache_dir coss /cache3/coss 7000 max-size=1000000
    cache_dir coss /cache4/coss 7000 max-size=1000000

Furthermore, you can change the default coss block size with the block-size option:

    cache_dir coss /cache0/coss 30000 max-size=1000000 block-size=2048

One tricky thing about coss is that the cache_dir directory argument (e.g., /cache0/coss) isn't actually a directory. Instead, it is a regular file that Squid opens, and creates if necessary. This is so you can use raw partitions as coss files. If you mistakenly create the coss file as a directory, you'll see an error like this when starting Squid:

    2003/09/29 18:51:42| /usr/local/squid/var/cache: (21) Is a directory
    FATAL: storeCossDirInit: Failed to open a coss file.

Because the cache_dir argument isn't a directory, you must use the cache_swap_log directive (see Section 13.6). Otherwise Squid attempts to create a swap.state file in the cache_dir directory. In that case, you'll see an error like this:

    2003/09/29 18:53:38| /usr/local/squid/var/cache/coss/swap.state: (2) No such file or directory
    FATAL: storeCossDirOpenSwapLog: Failed to open swap log.

coss uses asynchronous I/Os for better performance. In particular, it uses the aio_read() and aio_write() system calls. These may not be available on all operating systems. At this time, they are available on FreeBSD, Solaris, and Linux. If the coss code seems to compile okay, but you get a "Function not implemented" error message, you need to enable these system calls in your kernel. On FreeBSD, your kernel must have this option:

    options         VFS_AIO
8.6.3 coss Issues

coss is still an experimental feature. The code has not yet proven stable enough for everyday use. If you want to play with and help improve it, be prepared to lose any data stored in a coss cache_dir. On the plus side, coss's preliminary performance tests are very good. For an example, see Appendix D.

coss doesn't support rebuilding cached data from disk very well. When you restart Squid, you might find that it fails to read the coss swap.state files, thus losing any cached data. Furthermore, Squid doesn't remember its place in the cyclic file after a restart. It always starts back at the beginning.

coss takes a nonstandard approach to object replacement. This may cause a lower hit ratio than you might get with one of the other storage schemes.

Some operating systems have problems with files larger than 2 GB. If this happens to you, you can always create more, smaller coss areas. For example:

    cache_dir coss /cache0/coss0 1900 max-size=1000000 block-size=128
    cache_dir coss /cache0/coss1 1900 max-size=1000000 block-size=128
    cache_dir coss /cache0/coss2 1900 max-size=1000000 block-size=128
    cache_dir coss /cache0/coss3 1900 max-size=1000000 block-size=128

Using a raw disk device (e.g., /dev/da0s1c) doesn't work very well yet. One reason is that disk devices usually require that I/Os take place on 512-byte block boundaries. Another concern is that direct disk access bypasses the system's buffer cache and may degrade performance. Many disk drives, however, have built-in caches these days.
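When splitting a disk into several small coss areas, it is worth checking that each area still fits within the block_size × 2^24 limit from Section 8.6.1. The following Python sketch generates cache_dir lines like the ones shown for the 2 GB workaround and performs that check. The function name is hypothetical, and it assumes that the cache_dir size is given in megabytes of 2^20 bytes.

```python
def coss_area_lines(base, count, area_mb=1900, max_size=1000000,
                    block_size=128):
    """Build cache_dir lines for `count` coss files named base0, base1, ..."""
    limit_bytes = block_size * 2 ** 24   # 24-bit file numbers
    area_bytes = area_mb * 2 ** 20       # assumes 1 MB = 2^20 bytes
    if area_bytes > limit_bytes:
        raise ValueError(
            f"{area_mb} MB exceeds what block-size={block_size} can address"
        )
    return [
        f"cache_dir coss {base}{i} {area_mb} "
        f"max-size={max_size} block-size={block_size}"
        for i in range(count)
    ]


if __name__ == "__main__":
    for line in coss_area_lines("/cache0/coss", 4):
        print(line)
```

With a 128-byte block size the limit is 2048 MB, so the 1900-MB areas in the example fit, while a 3000-MB area would be rejected.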
8.7 The null Storage Scheme

Squid has a fifth storage scheme called null. As the name implies, this is more of a nonstorage scheme. Files that are "written" to a null cache_dir aren't actually written to disk.

Most people won't have any reason to use the null storage system. It's primarily useful if you want to entirely disable Squid's disk cache. [4] You can't simply remove all cache_dir lines from squid.conf because then Squid adds a default ufs cache_dir. The null storage system is also sometimes useful for testing and benchmarking Squid. Since the filesystem is typically the performance bottleneck, using the null storage scheme gives you an upper limit of Squid's performance on your hardware.

[4] Some responses may still be cached in memory, however.
To use this scheme you must first specify it on the --enable-storeio list when running ./configure:

    % ./configure --enable-storeio=ufs,null ...

You can then create a cache_dir of type null in squid.conf:

    cache_dir null /tmp

It may seem odd that you need to specify a directory for the null storage scheme. However, Squid uses the directory name as a cache_dir identifier. For example, you'll see it in the cache manager output (see Section 14.2.1.39).
8.8 Which Is Best for Me?

Squid's storage scheme choices may seem a little overwhelming and confusing. Is aufs better than diskd? Does my system support aufs or coss? Will I lose my data if I use one of these fancy schemes? Is it okay to mix-and-match storage schemes?

First of all, if your Squid is lightly used (say, less than five requests per second), the default ufs storage scheme should be sufficient. You probably won't see a noticeable performance improvement from the other schemes at this low request rate.

If you are trying to decide which scheme to try, your operating system may be a determining factor. For example, aufs runs well on Linux and Solaris but seems to have problems on other systems. The coss code uses functions that aren't available on certain operating systems (e.g., NetBSD) at this time.

It seems to me that higher-performing storage schemes are also more susceptible to data loss in the event of a system crash. This is the tradeoff for better performance. For many people, however, cached data is of relatively low value. If Squid's cache becomes corrupted due to a crash, you may find it easier to simply newfs the disk partition and let the cache fill back up from scratch. If you find it difficult or expensive to replace the contents of Squid's cache, you probably want to use one of the slow, but reliable, filesystems and storage schemes.

Squid certainly allows you to use different filesystems and storage schemes for each cache_dir. In practice, however, this is uncommon. You'll probably have fewer hassles if all cache directories are approximately the same size and use the same storage scheme.
8.9 Exercises

● Try to compile all possible storage schemes on your system.
● Run Squid with a separate cache_dir for each storage scheme you can get to compile.
● Run Squid with one or more diskd cache_dirs. Then run the ipcs -o command.
Chapter 9. Interception Caching

Interception caching is a popular technique for getting traffic to Squid without configuring any clients. Instead, you configure a router or switch to divert HTTP connections to the machine on which Squid is running. Squid's operating system is configured to accept the foreign packets and deliver them to the Squid process. To make HTTP interception work, you need to configure three separate components: a network device, Squid's operating system, and Squid itself.

This chapter begins with an overview of HTTP interception. I'll explain how it all works and define some terms so that the remaining sections make sense. I also explain the tradeoffs involved with HTTP interception.

Following that, I'll discuss your options for devices and configurations that can intercept client traffic. In particular, I cover Cisco policy routing, Cisco's WCCP, layer four switches, and running Squid on a host that also functions as a router or bridge.

Next, I'll show how to configure the operating system to handle the intercepted connections. This functionality is a feature of the IP packet filtering software, which varies from system to system. It is called iptables (Netfilter) on Linux; ipfw on FreeBSD; pf on OpenBSD; and IPFilter on NetBSD, Solaris, and other BSD variants.

Squid is the final component you need to configure. Fortunately, this is relatively straightforward because it doesn't depend on your operating system or network device.

I finish the chapter with a little checklist that may help you debug HTTP interception problems.
9.1 How It Works

Interception caching involves some network trickery, so it is helpful for you to understand what happens between the client and Squid. I'll use Figure 9-1 and the following sample tcpdump output to explain how the packets are intercepted as they flow through your network.

Figure 9-1. How HTTP interception works
1. The user-agent wants to request a resource, say /index.html from an origin server, say www.oreilly.com. It needs the origin server's IP address, so it makes a DNS request:

    Packet 1
    TIME:   19:54:41.317310
    UDP:    206.168.0.3.2459 -> 206.168.0.2.53
    DATA:   .d...........www.oreilly.com.....
    ---------------------------------------------------------------------------
    Packet 2
    TIME:   19:54:41.317707 (0.000397)
    UDP:    206.168.0.2.53 -> 206.168.0.3.2459
    DATA:   .d...........www.oreilly.com.............PR.....%........PR.
            ....$........PR...ns1.sonic.net.........PR...ns2.Q........PR
            ...ns...M...............h.............!.z.......b......
2. Now that it has the IP address, the user-agent initiates a TCP connection to the origin server on port 80:

    Packet 3
    TIME:   19:54:41.320652 (0.002945)
    TCP:    206.168.0.3.3897 -> 208.201.239.37.80 Syn
    DATA:

3. The switch/router notices a TCP SYN packet with destination port 80. What happens next depends on the particular interception technology. In the case of layer four switches and policy routing, the device simply forwards the TCP packet to Squid's datalink layer (Ethernet) address. This works only when Squid is directly attached to the network device. For WCCP, the router encapsulates the TCP packet into a GRE packet. Because the GRE packet has its own IP address, it can be routed through multiple subnets. In other words, WCCP doesn't require Squid to be directly attached to the router.

4. The Squid host's operating system receives the intercepted packet. For layer four switches, the TCP/IP packet is unchanged from the earlier explanation. If the packet is encapsulated with GRE, the host removes the outer IP and GRE headers and places the original TCP/IP packet on the input queue. Note that the Squid host receives an IP packet for a foreign address (the origin server's). Normally this packet is dropped because its destination address doesn't match any of the local interface addresses. To make the host accept the foreign packet, you must enable IP forwarding on most operating systems.

5. The client's TCP/IP packet is processed by the packet filtering code. The packet matches a rule that instructs the kernel to forward or divert this packet to Squid. Without this rule, the kernel simply forwards this packet on its way to the origin server, which isn't what you want. Note that the SYN packet's destination port is 80, but Squid may be listening on a different port, such as 3128. The packet filtering rules allow you to change the port number. You don't need to make Squid listen on port 80. You can't see this step with tcpdump because the diverted packet doesn't flow through the network interface code again.
    The packet filter's redirection rule is still necessary even if you have Squid listen on port 80. Simply making the port numbers match doesn't allow Squid to receive the intercepted packets. The redirection rule is the magic that delivers foreign packets to Squid.

6. Squid receives notification of the new connection, which it accepts. The kernel sends a SYN/ACK packet back to the client:

    Packet 4
    TIME:   19:54:41.320735 (0.000083)
    TCP:    208.201.239.37.80 -> 206.168.0.3.3897 SynAck
    DATA:
    As you can see, the source address is the origin server's, even though this packet didn't reach the origin. The operating system simply copies and swaps the source and destination IP addresses from the SYN packet into the reply.

7. The user-agent receives the SYN/ACK packet, fully establishing the TCP connection. The user-agent now believes it is connected to the origin server, so it writes the HTTP request:

    Packet 5
    TIME:   19:54:41.323080 (0.002345)
    TCP:    206.168.0.3.3897 -> 208.201.239.37.80 Ack
    DATA:
    ---------------------------------------------------------------------------
    Packet 6
    TIME:   19:54:41.323482 (0.000402)
    TCP:    206.168.0.3.3897 -> 208.201.239.37.80 AckPsh
    DATA:
            GET / HTTP/1.0
            User-Agent: Wget/1.8.2
            Host: www.oreilly.com
            Accept: */*
            Connection: Keep-Alive
8. Squid receives the HTTP request. It uses the HTTP Host header to convert the partial URL into a full URL. In this case, you'll see http://www.oreilly.com/ in the access.log file.

9. From this point on, Squid treats the request normally. As usual, cache hits are returned immediately. Cache misses are forwarded to the origin server.

10. Lastly, here is the response that Squid receives from the origin server:

    Packet 8
    TIME:   19:54:41.448391 (0.030030)
    TCP:    208.201.239.37.80 -> 206.168.0.3.3897 AckPsh
    DATA:
            HTTP/1.0 200 OK
            Date: Mon, 29 Sep 2003 01:54:41 GMT
            Server: Apache/1.3.26 (Unix) PHP/4.2.1 mod_gzip/1.3.19.1a mod_perl/1.27
            P3P: policyref="http://www.oreillynet.com/w3c/p3p.xml",CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa IVAa IVDa CONo OUR DELa PUBi OTRa IND PHY ONL UNI PUR COM NAV INT DEM CNT STA PRE"
            Last-Modified: Sun, 28 Sep 2003 23:54:44 GMT
            ETag: "1b76bf-b910-3ede86c4"
            Accept-Ranges: bytes
            Content-Length: 47376
            Content-Type: text/html
            X-Cache: MISS from www.oreilly.com
            X-Cache: MISS from 10.0.0.1
            Connection: keep-alive
You don't want your switch/router to intercept the connections that Squid makes to origin servers. If that happens, Squid ends up talking to itself and can't satisfy any cache misses. The best way to avoid forwarding loops like this is to make sure that your users and Squid connect to separate interfaces on the switch/router. Whenever feasible, you should apply the interception rules to specific interfaces. Obviously, you should not enable interception on the interface that Squid uses.
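Step 8 in the walkthrough above, where the partial URL from an intercepted request is combined with the Host header, can be illustrated in a few lines. The following Python sketch is not Squid's implementation; the function name is hypothetical, and it assumes a well-formed request line and a plain HTTP connection.

```python
def rebuild_url(request_line, headers, scheme="http"):
    """Combine the request path with the Host header into a full URL."""
    _method, path, _version = request_line.split()
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    host = lowered["host"].strip()
    return f"{scheme}://{host}{path}"


if __name__ == "__main__":
    # Values taken from Packet 6 in the trace above.
    headers = {
        "User-Agent": "Wget/1.8.2",
        "Host": "www.oreilly.com",
        "Accept": "*/*",
    }
    print(rebuild_url("GET / HTTP/1.0", headers))
```

A request without a Host header raises KeyError in this sketch, which reflects the underlying problem: without that header there is no hostname from which to build the URL.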
9.2 Why (Not) Intercept?

Many organizations find interception caching attractive because they can't, or would rather not, configure all their users' web browsers. It's probably easier to perform a little network trickery on a single switch or router than it is to configure hundreds or thousands of workstations. As with many choices we face, interception caching is really a tradeoff. It brings both benefits and drawbacks. It may make your life easier, or more difficult.

The obvious benefit of interception caching is that all HTTP requests leaving your network automatically go through Squid. You don't need to worry about configuring any browsers or that users might disable their proxy settings. Interception caching puts you, the network administrator, in control of the HTTP traffic. You can change, add, or remove Squid caches from service without significantly interrupting your users' web surfing.

Most of the disadvantages surrounding HTTP interception arise because this technique violates the TCP/IP standards. These protocols mandate that routers (and switches) forward TCP/IP packets to the host specified by the destination IP address. Diverting the packets to a caching proxy breaks the rules. The proxy accepts diverted connections under false pretense.

User agents are tricked into believing they have established a TCP connection with the origin server. This confusion causes a serious problem with older versions of Microsoft's Internet Explorer. The browser's Reload button is the easiest way to refresh an HTML page. When Explorer is configured to use a caching proxy, a reload request includes a Cache-Control: no-cache header to force a cache miss (or validation) and ensure that the response is up to date. Explorer omits this header when not explicitly configured for proxying.
With interception caching, Explorer thinks it is connecting to the origin server anyway, and there is no need to send this header. Squid can't tell that the user pressed the Reload button in this case and may not validate the cached response. Squid's ie_refresh directive provides a partial workaround for this bug (see Appendix A). According to Microsoft, this problem has been corrected in Explorer Version 5.5, Service Pack 1. [1]

[1] See Microsoft support knowledge base article Q266121 for more (or less) information: http://support.microsoft.com/support/kb/articles/Q266/1/21.ASP.

For similar reasons, you can't use HTTP proxy authentication in combination with interception caching. Because the client is unaware of the proxy, it doesn't send the necessary Proxy-Authorization header. Additionally, the 407 (Proxy Authentication Required) response code is inappropriate because the response should look like it came from the origin server, which would never send such a reply.

You also can't use RFC 1413 ident lookups (see Section 6.1.2.11) with interception. Squid can't bind a new TCP socket to the necessary IP address. The operating system cheats when forwarding the intercepted connection to Squid. However, it can't cheat when Squid wants to bind a new TCP socket to the foreign IP address. The address that it wants to bind to isn't really local, so the bind system call fails.

Interception caching is also incompatible with IP filtering designed to prevent address spoofing (see also RFC 2267: Network Ingress Filtering: Defeating Denial of Service Attacks Which Employ IP Source Address Spoofing). Consider the network shown in Figure 9-2. The router has two LAN interfaces: lan0 and lan1. The network administrator uses packet filters on the router to make sure that the internal hosts don't transmit packets with spoofed source addresses. The router forwards only packets with source addresses corresponding to the connected networks. The packet filter rules might look something like this:

    # lan0
    allow ip from 172.16.1.0/24 to any via lan0
    deny ip from any to any via lan0
    # lan1
    allow ip from 10.0.0.0/16 to any via lan1
    deny ip from any to any via lan1
Figure 9-2. Interception caching breaks address spoofing filters
Now consider what happens when the router and Squid box on lan1 are configured to intercept HTTP connections coming from lan0. Squid pretends to be the origin server, which means that the TCP packets carrying response data from Squid back to the users have spoofed source addresses. These lan0 filter rules cause the router to deny these packets. To make interception caching work, the network administrator must remove the lan0 rules. This, in turn, leaves the network vulnerable to being the source of denial-of-service attacks.

As I explained in the previous section, clients must make DNS queries before opening a connection. This may be undesirable or difficult in certain firewall environments. A host whose HTTP traffic you want to intercept must be able to query the DNS. Clients that know they are using a proxy (due to manual configuration or proxy auto-configuration, for example) don't usually try to resolve hostnames. Instead, they simply forward full URLs to Squid, and it becomes Squid's job to look up origin server IP addresses.
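The conflict between the anti-spoofing rules and interception described earlier in this section can be checked with a small model of the filter. The following Python sketch is a simplification, not a real packet filter: it assumes first-match semantics, it treats "via" as matching any packet passing through the named interface, and the function name is made up. The origin server address is the one from the trace in Section 9.1.

```python
import ipaddress

# (action, source network, interface), in the order given in the text.
RULES = [
    ("allow", "172.16.1.0/24", "lan0"),
    ("deny", "0.0.0.0/0", "lan0"),
    ("allow", "10.0.0.0/16", "lan1"),
    ("deny", "0.0.0.0/0", "lan1"),
]


def filter_decision(src, interface, rules=RULES):
    """Return the action of the first matching rule, or None if none match."""
    addr = ipaddress.ip_address(src)
    for action, network, via in rules:
        if via == interface and addr in ipaddress.ip_network(network):
            return action
    return None


if __name__ == "__main__":
    # An ordinary client packet passing through lan0.
    print(filter_decision("172.16.1.20", "lan0"))
    # Squid's reply to that client, carrying the origin server's address.
    print(filter_decision("208.201.239.37", "lan0"))
```

The client's own packets are allowed, but Squid's reply passes through lan0 with the origin server's address as its source, so it falls through to the deny rule.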
Another little problem is that Squid accepts connections for any destination IP address. Consider, for example, a web site that still has a DNS entry even though the site and server have been taken down. Squid accepts the TCP connection for this bogus site. The client believes the site is up and running, because its connection is established. When Squid fails to connect to the origin server, it is forced to return an error message.

In case it's not clear, HTTP interception can be tricky and difficult to get working the first time. A number of different components must all work together and be correctly configured. Furthermore, it can be difficult to recreate the entire configuration from memory. I strongly encourage you to set up a test environment before attempting this on a production system. Once you get it all working, be sure to document every little step.
< Day Day Up >
9.3 The Network Device

Now that you know all the ins and outs of interception caching, let's see how to actually make it work. We'll start by configuring the network devices that will be intercepting your HTTP connections.

9.3.1 Inline Squid

In this configuration, you don't need a switch or network router to intercept HTTP connections. Instead, Squid runs on a Unix system that is also your router (or perhaps bridge), as shown in Figure 9-3.

Figure 9-3. A system that combines routing and caching can easily intercept HTTP traffic

This configuration essentially skips the first three steps shown in Section 9.1. The Squid host already receives the HTTP connection packets because it is the router for your network. If you are taking this approach, feel free to skip ahead to Section 9.4.
9.3.2 Layer Four Switches

Many organizations use layer four switches specifically for their HTTP interception support. These products offer additional features as well, such as health checks and load balancing. I'll only cover interception here. For information on health checks and load balancing, see O'Reilly's Server Load Balancing and Load Balancing Servers, Firewalls, and Caches (John Wiley & Sons). The following subsections contain working-example configurations for a number of products and techniques.

9.3.2.1 Alteon/Nortel

The following configuration is from an ACEswitch 180 and Alteon's WebOS 8.0.21. The network setup is shown in Figure 9-4.

Figure 9-4. Sample network for layer four switch interception, for Alteon and Foundry examples
Clients are connected to port 1, the connection to the Internet is via port 2, and Squid is on port 3. The following lines are the relevant output of a /cfg/dump command on the switch. You don't necessarily need to type all of these lines. Furthermore, some of the commands may have changed for newer versions of Alteon's software. Note that Alteon calls this feature Web Cache Redirection (WCR). Here's the process, step by step:

1. First, you must give the Alteon switch an IP address. This seems necessary so that the switch can perform health checks with Squid:

    /cfg/ip/if 1
            ena
            addr 172.16.102.1
            mask 255.255.255.0
            broad 172.16.102.255

2. Alteon's WCR is a feature of its Server Load Balancing (SLB) configuration. Thus, you need to enable SLB features on the switch with this command:

    /cfg/slb
            on

3. Next, you define a real server with Squid's IP address:

    /cfg/slb/real 1
            ena
            rip 172.16.102.66

4. You must also define a group and make the real server a member:

    /cfg/slb/group 1
            health tcp
            add 1

5. The next step is to define two filters. The first filter matches HTTP connections (TCP packets with destination port 80) and redirects them to a server in group 1. The second filter matches all other packets and forwards them normally:

    /cfg/slb/filt 1
            ena
            action redir
            sip any
            smask 0.0.0.0
            dip any
            dmask 0.0.0.0
            proto tcp
            sport any
            dport http
            group 1
            rport 0
    /cfg/slb/filt 224
            ena
            action allow
            sip any
            smask 0.0.0.0
            dip any
            dmask 0.0.0.0
            proto any

6. The final step is to configure specific switch ports for SLB. On port 1, you enable client processing (this is where the clients connect), and add the two filters. On the second port you need only configure it for servers (i.e., the upstream Internet connection):

    /cfg/slb/port 1
            client ena
            filt ena
            add 1
            add 224
    /cfg/slb/port 2
            server ena

To verify that HTTP interception is configured and working correctly, you can use the commands under the /stats/slb and /info/slb menus. The /info/slb/dump command is a quick and easy way to see the entire SLB configuration:

    >> Main# /info/slb/dump
    Real server state:
      1: 172.16.102.66, 00:c0:4f:23:d7:05, vlan 1, port 3, health 3, up

    Virtual server state:

    Redirect filter state:
      1: dport http, rport 0, group 1, health tcp, backup none
         real servers:
           1: 172.16.102.66, backup none, up

    Port state:
      1: 0.0.0.0, client
         filt enabled, filters: 1 224
      2: 0.0.0.0, server
         filt disabled, filters: empty
      3: 0.0.0.0
         filt disabled, filters: empty

In this output, notice that the switch says Squid is reachable via port 3 and that the health checks show Squid is up. You can also see that filter 1 has been applied to port 1, where the clients connect. In the Port state section, port 1 is designated as a place where clients connect, and port 2 is similarly marked as a server port.
The /stats/slb/real command shows a handful of statistics for the real server (i.e., Squid):

    >> Main# /stats/slb/real 1
    ------------------------------------------------------------------
    Real server 1 stats:
    Health check failures:            0
    Current sessions:                41
    Total sessions:                 760
    Highest sessions:                55
    Octets:                           0

Most of the statistics relate to the number of sessions (i.e., TCP connections). The Total sessions counter should increase if you execute the command again. Lastly, the /stats/slb/group command shows almost the same information:

    >> Main# /stats/slb/group 1
    ------------------------------------------------------------------
    Real server group 1 stats:
                          Current    Total      Highest
    Real IP address       Sessions   Sessions   Sessions   Octets
    ---- ---------------  --------   --------   --------   ---------------
       1 172.16.102.66          65       2004         90                 0
    ---- ---------------  --------   --------   --------   ---------------
                                65       2004         90                 0

This output would be more interesting if there were more than one real server in the group.
9.3.2.2 Foundry

The configuration in the following example comes from a ServerIron XL, running software version 07.0.07T12. As before, clients are on port 1, the Internet link is on port 2, and Squid is on port 3. However, that matters less for this particular configuration because you can enable HTTP interception globally. Foundry's name for interception caching is Transparent Cache Switching (TCS). Refer back to Figure 9-4 for this example.

The first step is to give the switch an IP address so it can perform health checks:

    ip address 172.16.102.1 255.255.255.0

Foundry allows you to enable or disable TCS on particular ports. However, for the sake of simplicity, let's enable it globally:

    ip policy 1 cache tcp http global

In this line, cache is a keyword that corresponds to the TCS feature. The next line defines a web cache. I've given it the name squid1 and told the switch its IP address:

    server cache-name squid1 172.16.102.66

The final step is to add the web cache to a cache group:

    server cache-group 1
     cache-name squid1

If you're having problems getting the Foundry switch to divert connections, have a look at the show cache-group output:

    ServerIron#show cache-group
    Cache-group 1 has 1 members Admin-status = Enabled Active = 0
    Hash_info: Dest_mask = 255.255.255.0 Src_mask = 0.0.0.0

    Cache Server Name      Admin-status   Hash-distribution
    squid1                 6              3

    HTTP Traffic  From to  Web-Caches

    Name: squid1   IP: 172.16.102.66   State: 6   Groups = 1

                          Host->Web-cache                 Web-cache->Host
               State   CurConn TotConn Packets  Octets      Packets  Octets
    Client     active  441     12390   188871   15976623    156962   154750098
    Web-Server active  193     11664   150722   151828731   175796   15853612
    Total              634     24054   339593   167805354   332758   170603710
Some of this output is cryptic, but you can tell interception is working by repeating the command and watching the counters increase. The show server real command provides almost the same information:

    ServerIron#show server real squid1
    Real Servers Info

    Name : squid1                    Mac-addr: 00c0.4f23.d705
    IP:172.16.102.66   Range:1   State:Active   Wt:1   Max-conn:1000000
    Src-nat (cfg:op):(off:off)       Dest-nat (cfg:op):(off:off)
    squid1 is a TRANSPARENT CACHE in groups 1
    Remote server : No    Dynamic : No    Server-resets:0
    Mem:server: 02009eae Mem:mac: 045a3714

    Port    State  Ms CurConn TotConn Rx-pkts Tx-pkts Rx-octet  Tx-octet  Reas
    ----    -----  -- ------- ------- ------- ------- --------  --------  ----
    http    active 0  855     29557   379793  471713  373508204 39425322  0
    default active 0  627     28335   425106  366016  38408994  368496301 0

    Server  Total     1482    57892   804899  837729  411917198 407921623 0

Finally, you can use the show logging command to see if the switch believes Squid is up or down:

    ServerIron#show logging
    ...
    00d00h11m51s:N:L4 server 172.16.102.66 squid1 port 80 is up
    00d00h11m49s:N:L4 server 172.16.102.66 squid1 port 80 is down
    00d00h10m21s:N:L4 server 172.16.102.66 squid1 port 80 is up
    00d00h10m21s:N:L4 server 172.16.102.66 squid1 is up

Note that the ServerIron thinks the server is running on port 80. As you'll see later, my examples have Squid running on port 3128. The packet filtering rules actually change the packet's destination port from 80 to 3128. This has some interesting consequences for health checks, which I address later in Section 9.3.2.5.
9.3.2.3 Extreme Networks

In this example, the hardware is a Summit1i, and the software is Version 6.1.3b11. Once again, the clients are on port 1, the Internet link is on port 2, and Squid is on port 3. The network configuration is shown in Figure 9-5.

Figure 9-5. Sample network for intercepting with a router, for the Extreme and Cisco policy routing examples

The Extreme switch can intercept HTTP connections only for packets that it routes between subnets. In other words, if you use the Extreme switch in layer two mode (with a single VLAN), you can't divert traffic to Squid. To make HTTP interception work, you must configure separate VLANs for users, Squid, and the Internet:

    configure Default delete port 1-8

    create vlan Users
    configure Users ip 172.16.102.1 255.255.255.192
    configure Users add port 1

    create vlan Internet
    configure Internet ip 172.16.102.129 255.255.255.192
    configure Internet add port 2

    create vlan Squid
    configure Squid ip 172.16.102.65 255.255.255.192
    configure Squid add port 3

The next step is to enable and configure routing in the switch:

    enable ipforwarding
    configure iproute add default 172.16.102.130

Lastly, you configure the switch to redirect HTTP connections to Squid:

    create flow-redirect http tcp destination any ip-port 80 source any
    configure http add next-hop 172.16.102.66
9.3.2.4 Cisco Arrowpoint

The following configuration is based on notes from an old test I ran. However, I don't have access to an Arrowpoint switch now and can't verify that these lines are correct.

    circuit VLAN1
      ip address 172.16.102.1 255.255.255.0

    service pxy1
      type transparent-cache
      ip address 172.16.102.66
      port 80
      protocol tcp
      active

    owner foo
      content bar
        add service pxy1
        protocol tcp
        port 80
        active
9.3.2.5 A comment on HTTP servers and health checks

I've set up these examples so that the router/switch forwards packets without changing the destination TCP port. The packet filtering rules that I'll cover in Section 9.4 change the destination port. An interesting problem arises when you also run an HTTP server on the Squid box.

To run an HTTP server on port 80 while running Squid on port 3128, your packet filter configuration must have a special rule that accepts TCP connections for the HTTP server. Otherwise, the connection gets diverted to Squid. The special rule is simple to construct. If the destination port is 80, and the destination address is the server's, accept the packet normally. All the intercepted packets have foreign destination addresses, so they won't match the special rule.

However, when the router/switch makes an HTTP health check, it connects to the server's IP address. Thus, the health-check packet matches the special rule and isn't diverted to Squid. The router/switch is checking the health of the wrong server. If the HTTP server is down, but Squid is up (or vice versa), the health check will be wrong.

If you find yourself in this situation, you have a few options:

● Don't run an HTTP server on the Squid host.
● Add a specific packet filtering rule that diverts TCP health check connections from the router/switch to Squid.
● Configure your router/switch to change the destination port to 3128.
● Disable layer four health checks.
9.3.3 Cisco Policy Routing

Policy routing isn't that different from what I've talked about with layer four switches. It is implemented in routing products made by Cisco and others. The primary difference is that policy routing doesn't include any health checking. Thus, if Squid becomes overloaded or fails entirely, the router continues to forward packets to Squid, rather than route them directly to origin servers. Policy routing requires that Squid be on one of the router's directly connected subnets.

In this example, I'm using a Cisco 7204 router running IOS Version 12.0(5)T. The network configuration is the same as the previous example, shown in Figure 9-5.

The first configuration step is to define an access list that matches port 80 packets coming from clients. You must make sure that port 80 packets coming from Squid aren't reintercepted. One way to do this is with a specific rule that denies packets coming from Squid, followed by a rule that allows all others:

    access-list 110 deny tcp host 172.16.102.66 any eq www
    access-list 110 permit tcp any any eq www

Alternatively, if Squid and your users are on different subnets, you can permit only those packets that originate from the client network:

    access-list 110 permit tcp 10.102.0.0 0.0.255.255 any eq www

The next step is to define a route map. This is where you tell the router where to forward the intercepted packets:

    route-map proxy-redirect permit 10
     match ip address 110
     set ip next-hop 172.16.102.66

Those commands say, "If the IP address matches access-list 110, forward the packet to 172.16.102.66." The 10 on the route-map line is a sequence number in case you have multiple route maps. The final step is to apply the route map to interfaces where your clients connect:

    interface Ethernet0/0
     ip policy route-map proxy-redirect

IOS doesn't provide much in the way of debugging for policy routing. However, the show route-map command may be sufficient:

    router#show route-map proxy-redirect
    route-map proxy-redirect, permit, sequence 10
      Match clauses:
        ip address (access-lists): 110
      Set clauses:
        ip next-hop 172.16.102.66
      Policy routing matches: 730 packets, 64649 bytes
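Since the match counter is about the only feedback the router gives, one way to confirm that policy routing is active is to capture the show route-map output twice and compare the packet counts. The following is only a sketch: it assumes you have saved each capture to a text file by some means of your own, and the function names are made up for illustration.

```shell
# Read saved "show route-map" output on stdin and print the packet counter.
policy_match_packets() {
    sed -n 's/.*Policy routing matches: \([0-9][0-9]*\) packets.*/\1/p'
}

# Compare two saved captures (earlier file first).
# Succeeds only if both counters were found and the later one is larger.
counter_increased() {
    before=$(policy_match_packets < "$1")
    after=$(policy_match_packets < "$2")
    [ -n "$before" ] && [ -n "$after" ] && [ "$after" -gt "$before" ]
}
```

For example, `counter_increased before.txt after.txt && echo "policy routing is matching packets"`, where before.txt and after.txt are hypothetical filenames for your two captures.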
9.3.4 Web Cache Coordination Protocol

Cisco's answer to layer four switching technology (before they acquired Arrowpoint) is the Web Cache Coordination Protocol (WCCP).[2] WCCP is different from the typical layer four interception in a couple of ways.

[2] At various times it has also been called Web Cache Control Protocol.

First, intercepted packets are encapsulated with GRE (Generic Routing Encapsulation). This simply allows them to traverse subnets, which means Squid doesn't need to be directly connected to the router. Because they are encapsulated, the Squid host must unencapsulate them. Not all Unix systems have the code for unwrapping GRE packets.

The second difference is in how the router decides to spread the load among multiple caches. In fact, the router doesn't make this decision; the cache does. When a router has a group of WCCP-enabled caches, one nominates itself to be the leader. The leader decides how to spread the load and informs the router. This is an extra step that must occur before the router can redirect any connections.

Because WCCP uses GRE, the router may be forced to fragment large TCP packets from HTTP requests. Fortunately, this shouldn't occur very often because most HTTP requests are smaller than the Ethernet MTU size (1500 octets). The default TCP and IP packet headers are 20 octets each, which means an Ethernet frame can carry 1460 octets of actual data. GRE encapsulation adds 20 octets for the GRE header, plus another 20 for the second IP header. Thus a normal 1500-octet TCP/IP packet from the client becomes 1540 octets after encapsulation. This is too large to transmit in a single Ethernet frame, so the router fragments the original packet into two packets.
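The fragmentation arithmetic above can be checked with a few lines of shell. This is a sketch that takes the header sizes exactly as stated in the text; the variable and function names are invented for illustration, and you should adjust the overhead figures if your encapsulation differs.

```shell
MTU=1500        # Ethernet MTU in octets
GRE_HEADER=20   # GRE header size, as stated in the text
OUTER_IP=20     # second (outer) IP header

# Size of a packet after GRE encapsulation.
encapsulated_size() {
    echo $(( $1 + GRE_HEADER + OUTER_IP ))
}

# Largest original packet that still fits in one frame after encapsulation.
max_unfragmented() {
    echo $(( MTU - GRE_HEADER - OUTER_IP ))
}

# Print "yes" if a packet of the given size must be fragmented, else "no".
needs_fragmentation() {
    if [ "$(encapsulated_size "$1")" -gt "$MTU" ]; then
        echo yes
    else
        echo no
    fi
}
```

With these figures, `encapsulated_size 1500` prints 1540, and any client packet larger than the value printed by `max_unfragmented` is split in two by the router.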
9.3.4.1 WCCPv1

The configuration examples in this section were tested on a Cisco 7204 running IOS Version 12.0(5)T. The network configuration is the same as shown in Figure 9-5.

First, enter these two lines in the IOS configuration to enable WCCP for the router:

    ip wccp version 1
    ip wccp web-cache

Second, you must enable WCCP on individual router interfaces. You should do this only on interfaces where HTTP packets leave the router. In other words, select interfaces that connect to origin servers or your Internet gateway:

    interface Ethernet0/1
     ip address 172.16.102.129 255.255.255.192
     ip wccp web-cache redirect out

Be sure to save your configuration changes.

You may need to use an access list to prevent interception for certain web sites. You can also use the access list to prevent forwarding loops. All entries must belong to the same list that the redirect-list line names. For example:

    ! don't re-intercept connections coming from Squid:
    access-list 112 deny tcp host 172.16.102.66 any eq www
    ! don't intercept this broken web site
    access-list 112 deny tcp any host 192.16.8.7 eq www
    ! allow other HTTP traffic
    access-list 112 permit tcp any any eq www
    ip wccp web-cache redirect-list 112

The router doesn't send any traffic to Squid until Squid announces itself to the router. I explain how to configure Squid for WCCP in Section 9.5.1.
9.3.4.2 WCCPv2

The standard Squid distribution currently only supports WCCPv1. However, you can find a patch for WCCPv2 on the http://devel.squid-cache.org/ site. This code is still experimental.

Note that the GRE packets sent from the router to Squid contain an additional four octets. WCCPv2 inserts a redirect header between the GRE header and the encapsulated IP packet. You may need to modify your kernel code to account for this additional header.
9.3.4.3 Debugging

IOS provides a couple of commands to monitor and debug WCCP. The show ip wccp web-cache command provides some basic information:

    router#show ip wccp web-cache
    Global WCCP information:
        Router information:
            Router Identifier:                   172.16.102.129
            Protocol Version:                    1.0

        Service Identifier: web-cache
            Number of Cache Engines:             1
            Number of routers:                   1
            Total Packets Redirected:            1424
            Redirect access-list:                -none-
            Total Packets Denied Redirect:       0
            Total Packets Unassigned:            0
            Group access-list:                   -none-
            Total Messages Denied to Group:      0
            Total Authentication failures:       0

For a few more details, add the word detail to the end of the previous command:

    router#show ip wccp web-cache detail
    WCCP Cache-Engine information:
            IP Address:            172.16.102.66
            Protocol Version:      0.4
            State:                 Usable
            Initial Hash Info:     00000000000000000000000000000000
                                   00000000000000000000000000000000
            Assigned Hash Info:    FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
                                   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
            Hash Allotment:        256 (100.00%)
            Packets Redirected:    1424
            Connect Time:          00:17:40

Here you can see Squid's IP address and state. If more than one cache speaks WCCP to the router, the hash assignment information should look different. Most likely, each cache receives an equal proportion of the hash buckets.

Note that the detailed output has a Protocol Version line with a different value than the first command. Unfortunately, the word "version" is overloaded. The show ip wccp web-cache command appears to report the WCCP protocol major version number (i.e., 1 or 2), while the detail version seems to be a different (perhaps internal, or minor version) number that matches the value of Squid's wccp_version directive.
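To get a feel for what the Hash Allotment line might show with several caches, the sketch below divides the 256 buckets evenly. It assumes an equal split, which the text describes only as the most likely outcome; the actual assignment is whatever the lead cache decides. The function names are hypothetical.

```shell
BUCKETS=256   # total hash buckets, per the "Hash Allotment: 256 (100.00%)" line

# Buckets each cache would receive under an even split.
# Integer division: leftover buckets, if any, are not accounted for here.
buckets_per_cache() {
    echo $(( BUCKETS / $1 ))
}

# Percentage of all buckets that a given bucket count represents,
# printed with two decimals like the IOS output.
allotment_percent() {
    awk -v b="$1" -v t="$BUCKETS" 'BEGIN { printf "%.2f\n", 100 * b / t }'
}
```

For four caches, `buckets_per_cache 4` gives 64, and `allotment_percent 64` gives 25.00.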
9.4 Operating System Tweaks

You must enable certain networking features in your operating system to make interception caching work. First, you need to enable IP packet forwarding. This allows the operating system to receive packets with foreign destination addresses. Second, you must enable and configure optional code in the kernel that redirects the foreign packets to Squid.
9.4.1 Linux

The instructions in this section should work for the 2.4 series of Linux kernels. I used RedHat Linux 7.2 (kernel 2.4.7-10). If you are using an older or newer version, these may not work. I recommend searching the Squid FAQ and other places for updated or historical information.

In my tests with iptables, it wasn't necessary to enable IP forwarding. However, you may want to enable it initially and see if you can disable it after everything else is working. The best way to enable packet forwarding is to add this line to /etc/sysctl.conf:

    net.ipv4.ip_forward = 1

Most likely you'll need to make a new kernel before HTTP interception will work. See O'Reilly's Running Linux by Matt Welsh, Matthias Kalle Dalheimer, and Lar Kaufman, if you don't know how to configure and create a Linux kernel. When you configure the kernel, make sure these options are enabled:

o General setup
    Networking support (CONFIG_NET=y)
    Sysctl support (CONFIG_SYSCTL=y)

o Networking options
    Network packet filtering (CONFIG_NETFILTER=y)
    TCP/IP networking (CONFIG_INET=y)
    Netfilter Configuration
      Connection tracking (CONFIG_IP_NF_CONNTRACK=y)
      IP tables support (CONFIG_IP_NF_IPTABLES=y)
      Full NAT (CONFIG_IP_NF_NAT=y)
      REDIRECT target support (CONFIG_IP_NF_TARGET_REDIRECT=y)

o File systems
    /proc filesystem support (CONFIG_PROC_FS=y)
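Checking nine options by eye is error-prone, so a small script can do it instead. The sketch below looks for each option listed above as NAME=y in a kernel configuration file. The function name is made up, and the location of your .config file is an assumption you will need to supply. Options built as modules (=m) are reported as missing because the list above asks for =y.

```shell
# Options the list above asks for; each must appear as NAME=y.
REQUIRED="CONFIG_NET CONFIG_SYSCTL CONFIG_NETFILTER CONFIG_INET
CONFIG_IP_NF_CONNTRACK CONFIG_IP_NF_IPTABLES CONFIG_IP_NF_NAT
CONFIG_IP_NF_TARGET_REDIRECT CONFIG_PROC_FS"

# Print every required option that is not set to "y" in the given file.
# The exit status is 0 only when nothing is missing.
missing_options() {
    config_file=$1
    status=0
    for opt in $REQUIRED; do
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt"
            status=1
        fi
    done
    return $status
}
```

A possible invocation is `missing_options /usr/src/linux-2.4.7-10/.config`, assuming that is where your kernel source tree keeps its configuration.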
Additionally, make sure this option isn't enabled:

o Networking options
    Fast switching (CONFIG_NET_FASTROUTE=n)

The code that redirects foreign packets to Squid is part of the Netfilter software. Here is a rule that sends the intercepted HTTP connections to Squid:

    iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3128

The Linux kernel maintains a number of different tables. The -t nat option indicates that we are modifying the Network Address Translation (NAT) table. In essence, we're using iptables to translate origin server TCP/IP addresses to Squid's local TCP/IP address.

Each iptables table has a number of chains. The -A PREROUTING option indicates that we are appending a rule to the built-in chain named PREROUTING. The PREROUTING chain applies only to packets entering the system from the outside network.

The next three options determine which packets match this rule. The -i eth0 option restricts the rule to packets received on the eth0 interface. The -p tcp option specifies TCP packets, and --dport 80 specifies packets with destination port equal to 80. If all three conditions are true, the packet matches the rule.

The -j REDIRECT option indicates the target, or action to take, for packets that match the rule. REDIRECT is a built-in target name that causes iptables to change the packet's destination address to the local machine itself (the primary address of the interface on which the packet arrived). The --to-port 3128 option instructs iptables also to change the destination TCP port number to 3128.

If you are also running an HTTP server (such as Apache) on the Squid host, you must add another iptables rule. The additional rule is necessary to allow connections to your HTTP server. Otherwise, the REDIRECT rule causes iptables to send those connections to Squid on port 3128. You can use the -I option to insert a new rule at the top of the list:

    iptables -t nat -I PREROUTING -i eth0 -p tcp -d 172.16.102.66 --dport 80 -j ACCEPT

Once you have all your iptables rules working correctly, be sure to save them with this command:

    /sbin/service iptables save

This saves the current rules to /etc/sysconfig/iptables so they get automatically loaded when you reboot.
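Because the order of the two rules matters, it can help to generate them together and review them before anything touches the kernel. The following sketch only prints the commands; it doesn't run iptables. The function name and argument order are invented for illustration. It appends both rules with -A, emitting the ACCEPT rule first so that it is consulted before REDIRECT, which has the same effect as inserting it with -I afterward.

```shell
# Print (do not run) the NAT rules for interception.
#   $1 = interface where client traffic arrives
#   $2 = port Squid listens on
#   $3 = optional address of a local HTTP server to exempt
intercept_rules() {
    iface=$1
    squid_port=$2
    local_http=${3:-}
    if [ -n "$local_http" ]; then
        # Appended first, so it is checked before the REDIRECT rule.
        echo "iptables -t nat -A PREROUTING -i $iface -p tcp -d $local_http --dport 80 -j ACCEPT"
    fi
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport 80 -j REDIRECT --to-port $squid_port"
}
```

Running `intercept_rules eth0 3128 172.16.102.66` prints both rules for the example network. Once they look right, you could pipe the output to sh as root.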
9.4.1.1 Linux and WCCP

Version 2.4 of the Linux kernel comes with a GRE pseudo-interface. However, it doesn't work for decoding GRE-encapsulated packets from a WCCP session. The problem seems to be that the router sets the Protocol Type field to 0x883E for WCCP/GRE packets. Linux's GRE driver doesn't know what to do with these packets because it doesn't know about protocol type 0x883E.

You can try patching Linux's GRE module so that it works with WCCP. The Squid FAQ contains a link to such a patch. However, you'll probably find it easier to use the WCCP-specific module for Linux. You can find it at http://www.squid-cache.org/WCCP-support/Linux/ip_wccp.c.

You need to compile the ip_wccp.c file as a loadable kernel module. This can be a little tricky because the specific compiler options may change depending on your kernel version. One thing you can do is go to your kernel source directory, type make modules, and watch the compiler commands scroll by. Then copy one of those commands and change the last argument to ip_wccp.c. Here are the commands that I used with the 2.4.7-10 Linux kernel:

    % gcc -Wall -D__KERNEL__ -I/usr/src/linux-2.4.7-10/include \
        -DMODULE -DMODVERSIONS -DEXPORT_SYMTAB \
        -include /usr/src/linux-2.4.7-10/include/linux/modversions.h \
        -O2 -c ip_wccp.c

The gcc command should leave you with an ip_wccp.o file in the current directory. The next step is to load that file into the kernel with the insmod command:

    # insmod ip_wccp.o

Note that the ip_wccp module accepts GRE/WCCP packets from any source address. In other words, a malicious person might be able to send traffic to your Squid cache. If you use this module, you should also install iptables rules that accept GRE packets only from your router and drop all others. For example:

    # iptables -A INPUT -p gre -s 172.16.102.65 -j ACCEPT
    # iptables -A INPUT -p gre -j DROP

Again, don't forget to save your working rules with the /sbin/service iptables save command.
9.4.2 FreeBSD

The examples in this section are based on FreeBSD-4.8 and should work for any later version of FreeBSD-4 and FreeBSD-5. To enable IP packet forwarding, add this line to /etc/sysctl.conf:

    net.inet.ip.forwarding=1

You'll need a kernel with two special options enabled. If you don't know how to make a kernel, refer to Section 9 of the FreeBSD Handbook (http://www.freebsd.org/handbook/index.html). Edit your kernel config file and make sure these lines are present:

    options         IPFIREWALL
    options         IPFIREWALL_FORWARD

If the Squid box is in an unattended machine room, I also recommend using the IPFIREWALL_DEFAULT_TO_ACCEPT option. In case you mess up the firewall rules, you'll still be able to log in.

These ipfw commands tell the kernel to redirect intercepted connections to Squid:

    /sbin/ipfw add allow tcp from 172.16.102.66 to any out
    /sbin/ipfw add allow tcp from any 80 to any out
    /sbin/ipfw add fwd 127.0.0.1,3128 tcp from any to any 80 in
    /sbin/ipfw add allow tcp from any 80 to 172.16.102.66 in

The first rule matches packets originating from the Squid host. It ensures that outgoing TCP connections won't be redirected back to Squid.[3] The second rule matches TCP packets sent from Squid back to the clients. I've added it here in case you have additional ipfw rules later that would deny these packets. The third rule is the one that actually redirects incoming connections to Squid. The fourth rule matches packets coming back from origin servers to Squid. Again, this is in case you have subsequent deny rules.

[3] Although a misconfiguration on the switch/router may still reintercept these packets.

If you're also running an HTTP server on the Squid host, you must add another rule that passes, rather than redirects, TCP packets destined for the local HTTP server. The following rule goes before the fwd rule:

    /sbin/ipfw add allow tcp from any to 172.16.102.66 80 in

FreeBSD typically stores ipfw rules in /etc/rc.firewall. Once you get your rule set working properly, be sure to save it there. Add this line to /etc/rc.conf to make FreeBSD automatically run the /etc/rc.firewall script when it boots:

    firewall_enable="YES"
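One way to make sure the local HTTP server rule really does come before the fwd rule is to give every rule an explicit number, since ipfw evaluates rules in numeric order. The sketch below prints such a rule set without running it. The rule numbers (1000 through 1400) and the function name are arbitrary choices for illustration, not values from any FreeBSD default.

```shell
SQUID_IP=172.16.102.66

# Print (do not run) numbered ipfw rules for interception.
#   $1 = "yes" to include the rule for a local HTTP server (default: no)
ipfw_rules() {
    with_local_http=${1:-no}
    echo "/sbin/ipfw add 1000 allow tcp from $SQUID_IP to any out"
    echo "/sbin/ipfw add 1100 allow tcp from any 80 to any out"
    if [ "$with_local_http" = yes ]; then
        # Numbered lower than the fwd rule, so it is checked first.
        echo "/sbin/ipfw add 1200 allow tcp from any to $SQUID_IP 80 in"
    fi
    echo "/sbin/ipfw add 1300 fwd 127.0.0.1,3128 tcp from any to any 80 in"
    echo "/sbin/ipfw add 1400 allow tcp from any 80 to $SQUID_IP in"
}
```

Running `ipfw_rules yes` prints five rules, with the local HTTP server rule at 1200, ahead of the fwd rule at 1300. Leaving gaps between numbers makes it easy to slot in further rules later.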
9.4.2.1 FreeBSD and WCCP

FreeBSD Version 4.8 and later have built-in support for GRE and WCCP. Earlier versions require patches, which you can still find at http://www.squid-cache.org/WCCP-support/FreeBSD/. The built-in implementation is much better, however, as it is written by real kernel gurus. You'll probably need to make a new kernel that supports GRE. Add this line to your kernel configuration:

    pseudo-device   gre

For FreeBSD-5, use device instead of pseudo-device. Of course, you also need the IPFIREWALL options mentioned in the preceding section.

After installing and booting from the new kernel, you must configure a GRE tunnel to accept GRE packets from the router. For example:

    # ifconfig gre0 create
    # ifconfig gre0 172.16.102.66 172.16.102.65 netmask 255.255.255.255 up
    # ifconfig gre0 tunnel 172.16.102.66 172.16.102.65
    # route delete 172.16.102.65

The ifconfig command adds a routing table entry for the router (172.16.102.65) over the gre0 interface. I found it necessary to delete that route so that Squid can talk to the router.

You may want or need to add an ipfw rule for the GRE packets coming from the router:

    /sbin/ipfw add allow gre from 172.16.102.65 to 172.16.102.66
9.4.3 OpenBSD

The examples in this section are based on OpenBSD 3.3. To enable packet forwarding, uncomment or add this line in /etc/sysctl.conf:

    net.inet.ip.forwarding=1

Now, configure the packet filter rules for interception by adding lines like these to /etc/pf.conf:

    rdr inet proto tcp from any to any port = www -> 127.0.0.1 port 3128
    pass out proto tcp from 172.16.102.66 to any
    pass out proto tcp from any port = 80 to any
    pass in proto tcp from any port = 80 to 172.16.102.66

If you aren't already using OpenBSD's packet filter, you need to enable it with this line in /etc/rc.conf.local:

    pf=YES

9.4.3.1 OpenBSD and WCCP

First, tell the system to accept and process GRE and WCCP packets by adding these lines to /etc/sysctl.conf:

    net.inet.gre.allow=1
    net.inet.gre.wccp=1

Then, configure a GRE interface with commands like these:

    # ifconfig gre0 172.16.102.66 172.16.102.65 netmask 255.255.255.255 up
    # ifconfig gre0 tunnel 172.16.102.66 172.16.102.65
    # route delete 172.16.102.65

As with FreeBSD, I found it necessary to delete the route that is automatically added by ifconfig. Finally, depending on your packet filter configuration, you may need to add a rule that allows the GRE packets:

    pass in proto gre from 172.16.102.65 to 172.16.102.66
9.4.4 IPFilter on NetBSD and Others

The examples in this section are based on NetBSD 1.6.1. They might also work on Solaris, HP-UX, IRIX, and Tru64 since IPFilter runs on those systems as well. To enable packet forwarding (on NetBSD), add this line to /etc/sysctl.conf:

    net.inet.ip.forwarding=1

Then, insert a line like this into the NAT (network address translation) configuration file, /etc/ipnat.conf:

    rdr fxp0 0/0 port 80 -> 172.16.102.66 port 3128 tcp

Your interface name may be different from fxp0 in this example.

9.4.4.1 NetBSD and WCCP

I was not able to make WCCP work with NetBSD, even after patching the GRE code to accept WCCP packets. The problem seems to arise because the IPFilter rdr rule is bound to a specific interface. Packets coming from the router go through NetBSD's gre0 interface (where they are unencapsulated). However, packets going the other way, back to the router, aren't encapsulated and don't go through the same network interface. Therefore, the IPFilter code doesn't translate Squid's local IP address back to the origin server's address.
9.5 Configure Squid

If you are using Linux 2.4 and iptables, you should probably use the --enable-linux-netfilter option when you run (or re-run) ./configure. It enables some Linux-specific code so that Squid can find the IP address of the origin server to which the request was originally sent. Squid normally gets the origin server name (and/or address) from the Host header. The --enable-linux-netfilter feature is necessary only for requests that don't have a Host header. Statistics show that almost all requests have the Host header, so you may actually be able to get by without the --enable-linux-netfilter option.

If you are using the IPFilter package (with NetBSD, Solaris, and others), you should use the --enable-ipf-transparent option for the same reason. On OpenBSD, you should use the --enable-pf-transparent option. Each time you run ./configure you must recompile Squid, as described in Section 3.8.

After you get the ./configure options figured out, and Squid recompiled, you can edit squid.conf. As a starting point, make sure the following directives are defined with the given values:

    httpd_accel_host virtual
    httpd_accel_port 80
    httpd_accel_uses_host_header on
    httpd_accel_with_proxy on
    httpd_accel_single_host off

The httpd_accel_host directive is the key. It instructs Squid to accept HTTP requests with partial URIs. The httpd_accel_uses_host_header directive is enabled so that Squid uses the Host header to reconstruct full URIs. The virtual keyword instructs Squid to put the origin server's IP address in the URL when the Host header is absent.

The httpd_accel_with_proxy directive controls whether or not Squid accepts both HTTP server (partial URI) requests and proxy (full URI) requests. It should probably be enabled for interception caching. Squid may still work if httpd_accel_with_proxy is disabled as long as none of your clients are explicitly configured for Squid as a proxy.

The httpd_accel_single_host directive is normally disabled, but it was enabled by default in some earlier versions of Squid. I've listed it here to make sure that it is disabled for interception caching.

If you are intercepting more than just port 80, you may want to set httpd_accel_port to 0. See Appendix A for more information.

If you're not using WCCP, you should be ready to start sending intercepted traffic to Squid. Give it a try by surfing the Web with your browser or by making some test requests with squidclient. If you are using WCCP, there is just one more step that you must complete.
9.5.1 Configuring WCCPv1

The router doesn't send any traffic to Squid until Squid announces itself to the router. To make Squid do that, add these lines to your squid.conf:

    wccp_router 172.16.102.65
    wccp_version 4

Your router has many interfaces. Be sure to use the IP address of the interface closest to Squid. This is necessary because the WCCP messages coming from the router have the source IP address set to the address of the outgoing interface. Squid rejects WCCP messages if the source address doesn't match the wccp_router value.

The WCCPv1 document specifies 4 as the protocol version number. However, some users report that Cisco IOS 11.2 supports only Version 3. If you are using this version of IOS, change the version in squid.conf:

    wccp_version 3
9.6 Debugging Problems

HTTP interception is complicated because many different devices must all work correctly together. To help you track down problems, here's a troubleshooting checklist:

Are client packets going through the router/switch?
This should be obvious for simple networks. You can trace the cables and watch the activity lights blink. In a large, complex network, however, packets may be taking an alternate path. If your organization is large enough to have a network sniffer, you may want to observe the traffic on the link that should carry requests from web clients. A low-tech approach is to disconnect the link in question and see if it affects the client's web browsing.

Is the router/switch configured properly?
You may want to double-check your router/switch configuration. If you configured specific interfaces, did you get the right ones? Is your new configuration actually running on the device? Perhaps the router/switch was rebooted before you could save the configuration. You may need to reboot before the changes take effect.

Can the switch/router talk to the Squid host?
Can you ping Squid from the router/switch? Most layer four interception configurations require that the device and Squid be on the same subnet. Log into the router/switch, and make sure you can ping Squid's IP address.

Does the switch/router believe that Squid is up?
Many traffic interception devices don't send traffic to Squid unless they know it's healthy. Use the debugging commands to view Squid's health status. You may find that a layer three health check (e.g., ICMP ping) is simpler than a layer four check (e.g., HTTP), and more likely to make the network device mark Squid as up.

Is Squid actually running?
Double-check that Squid is really running, especially if the system has recently been rebooted.

Are packets arriving at the Squid host?
You should be able to see intercepted TCP connections with tcpdump. Here's an example:

    # tcpdump -n -i eth0 port 80

If you use WCCP, check for GRE packets coming from the router:

    # tcpdump -n -i eth0 ip proto gre

If you don't see any output from tcpdump, the router/switch is probably not sending anything. In that case, return to the previous suggestions.

Note that if the device is using layer four health checks, you should see those in the tcpdump output. Health checks come from the router/switch IP address, so they should be easy to spot. If you see health checks, but no other traffic, it probably means the router/switch is interpreting Squid's reply as unhealthy. For example, the device may want to see a 200 (OK) response, but Squid returns an error, such as 401 (Unauthorized) or 404 (Not Found). You may want to run tail -f on the access.log.

Did you enable IP forwarding?
Double-check that Squid's operating system is configured to forward IP packets. If not, the host may drop intercepted packets because the destination IP address isn't local.

Did you configure the packet filter?
Make sure that the packet filter (i.e., ipfw, iptables, pf, etc.) is configured correctly. When everything is working well, you should be able to periodically run the command that displays the filtering rules and see the counters increase. For example:

    # ipfw show 300 ; sleep 3; ipfw show 300
    00300 86216 8480458 fwd 127.0.0.1,3128 tcp from any to any 80 in
    00300 86241 8482240 fwd 127.0.0.1,3128 tcp from any to any 80 in

Note that in this example on FreeBSD, the packet and byte counters (second and third columns) are being incremented.
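
If you would rather not compare the counters by eye, a few lines of Python can do the subtraction. This is only a sketch: it assumes the three-column layout shown in the FreeBSD output above (rule number, packets, bytes), and the function names are made up for illustration.

```python
def parse_ipfw_counters(line):
    """Return (rule_number, packets, bytes) from one line of `ipfw show` output.

    Assumes the layout shown above: rule number, packet count, byte count,
    followed by the rule text.
    """
    fields = line.split()
    return fields[0], int(fields[1]), int(fields[2])


def counter_delta(before, after):
    """Return (packet_delta, byte_delta) between two samples of the same rule."""
    rule_a, pkts_a, bytes_a = parse_ipfw_counters(before)
    rule_b, pkts_b, bytes_b = parse_ipfw_counters(after)
    if rule_a != rule_b:
        raise ValueError("the two lines describe different rules")
    return pkts_b - pkts_a, bytes_b - bytes_a


before = "00300 86216 8480458 fwd 127.0.0.1,3128 tcp from any to any 80 in"
after = "00300 86241 8482240 fwd 127.0.0.1,3128 tcp from any to any 80 in"
print(counter_delta(before, after))  # prints (25, 1782)
```

A delta of zero over several samples suggests that no packets are matching the forwarding rule.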

Is the loopback interface up and configured?
If you have a rule to forward/redirect packets to 127.0.0.1, make sure that the loopback (e.g., lo0, lo) interface is up and configured. If not, the kernel may simply skip the forward/redirect rule.

Are WCCP/GRE packets being unencapsulated correctly?
If you use WCCP, make sure that the GRE packets are being unencapsulated. If, for some reason, your system doesn't know what to do with GRE packets, it probably increments the "unknown/unsupported protocol" counter in netstat -s output:

    # netstat -s | grep unknown
    46 packets for unknown/unsupported protocol

If your OS has a GRE interface, run netstat -i every so often and look for increasing packet counts:

    # netstat -in | grep ^gre0
    Name  Mtu   Network  Address   Ipkts Ierrs  Opkts Oerrs  Coll
    gre0  1476                    304452     0      0     4     0

Also, try running tcpdump on the GRE interface:

    # tcpdump -n -i gre0

Can Squid talk back to the clients?
You may have a situation in which the router/switch is able to send packets to Squid, but Squid can't send packets back to the clients. This can happen if your firewall filter rules reject those outgoing packets or if Squid just doesn't have a route to the client addresses. To check for this condition, run netstat -n and look for a lot of sockets in the SYN_RCVD state:

    % netstat -n
    Active Internet connections
    Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
    tcp4       0      0  10.102.129.246.80      10.102.0.1.36260       SYN_RCVD
    tcp4       0      0  10.102.129.226.80      10.102.0.1.36259       SYN_RCVD
    tcp4       0      0  10.102.128.147.80      10.102.0.1.36258       SYN_RCVD
    tcp4       0      0  10.102.129.26.80       10.102.0.2.36257       SYN_RCVD
    tcp4       0      0  10.102.129.29.80       10.102.0.2.36255       SYN_RCVD
    tcp4       0      0  10.102.129.226.80      10.102.0.1.36254       SYN_RCVD
    tcp4       0      0  10.102.128.117.80      10.102.0.1.36253       SYN_RCVD
    tcp4       0      0  10.102.128.149.80      10.102.0.1.36252       SYN_RCVD

If you see this, use ping and traceroute to make sure that Squid has bidirectional communication with the clients.

Can Squid talk to origin servers?
Intercepted HTTP connections get stuck if Squid can't connect to origin servers. When this happens, netstat should show you a lot of connections in the SYN_SENT state:

    % netstat -n
    Active Internet connections
    Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
    tcp4       0      0  172.16.102.66.5217     10.102.129.145.80      SYN_SENT
    tcp4       0      0  172.16.102.66.5216     10.102.129.224.80      SYN_SENT
    tcp4       0      0  172.16.102.66.5215     10.102.128.71.80       SYN_SENT
    tcp4       0      0  172.16.102.66.5214     10.102.129.209.80      SYN_SENT
    tcp4       0      0  172.16.102.66.5213     10.102.129.62.80       SYN_SENT
    tcp4       0      0  172.16.102.66.5212     10.102.129.160.80      SYN_SENT
    tcp4       0      0  172.16.102.66.5211     10.102.128.129.80      SYN_SENT
    tcp4       0      0  172.16.102.66.5210     10.102.129.44.80       SYN_SENT
    tcp4       0      0  172.16.102.66.5209     10.102.128.73.80       SYN_SENT
    tcp4       0      0  172.16.102.66.5208     10.102.128.43.80       SYN_SENT

Again, use ping and traceroute to make sure that Squid can talk to origin servers.
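
On a busy cache, the netstat listing can run to thousands of lines, so it may help to tally the socket states instead of reading them. The following Python sketch assumes the BSD-style six-column layout shown in the netstat output above. The function name and the sample text are illustrative only.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Tally TCP socket states from the text printed by `netstat -n`.

    Only lines with six fields whose first field starts with "tcp" are
    counted, which skips the banner and the column headings.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


sample = """\
Active Internet connections
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0      0  10.102.129.246.80      10.102.0.1.36260       SYN_RCVD
tcp4       0      0  10.102.129.226.80      10.102.0.1.36259       SYN_RCVD
tcp4       0      0  172.16.102.66.5217     10.102.129.145.80      SYN_SENT
"""

counts = count_tcp_states(sample)
print(counts["SYN_RCVD"], counts["SYN_SENT"])  # prints 2 1
```

A large SYN_RCVD count points toward the client side of the checklist, while a large SYN_SENT count points toward the origin servers.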

Are outgoing connections being intercepted?
If Squid can ping origin servers, and you still see a lot of connections in the SYN_SENT state, the router/switch may be intercepting Squid's outgoing TCP connections. In some cases, Squid can detect such forwarding loops, and it writes a warning message to cache.log. Such a forwarding loop can quickly exhaust all of Squid's file descriptors, which also generates a warning in cache.log.

If you suspect this problem, use the squidclient program to make some simple HTTP requests. For example, this command makes an HTTP request directly to the origin server:

    % /usr/local/squid/bin/squidclient -p 80 -h slashdot.org /

If this command succeeds, you should see a bunch of ugly HTML from the Slashdot site on your screen. You can then try the same request through Squid:

    % /usr/local/squid/bin/squidclient -r -p 3128 -h 127.0.0.1 http://slashdot.org/

Again, you should see some HTML on your screen. If not, check for error messages in cache.log. If you see forwarding loop errors, you need to reconfigure your router/switch so that it allows Squid's outgoing connections to pass without being intercepted.
9.7 Exercises

● Try running Squid with a bogus httpd_accel_host value. For example:

    httpd_accel_host blah.blah.blah

  Does it still work, or do you get error messages?
● Disconnect Squid's network connection while your router/switch is diverting traffic to it. Does the network device bypass Squid? How long does it take to notice the problem?
● Repeat the same experiment, but this time kill the Squid process instead of unplugging the network cable.
● Enable Squid's user-agent log and see if you are intercepting any nonbrowser web traffic.
Chapter 10. Talking to Other Squids

For one reason or another, you may find that you want Squid to forward its cache misses to another cache or HTTP proxy. This is necessary, for example, if you are using Squid inside a large corporate network that has one or more firewalls protecting you from the outside world. If your caching service is actually a cluster of Squid caches, you probably want them to cooperate with each other to minimize duplication of cached responses. You can also use Squid as a content router, sending web traffic in different directions based on some aspect of the request. Or, perhaps you'd like to participate in an informal collection of caches to further improve response time and reduce wide-area network traffic.

Intercache communication is a complex undertaking, and Squid has numerous features and protocols to accomplish the task. After explaining some of the terminology and discussing the issues, I'll introduce the configuration file directives that control request routing. Following that I describe the nifty network measurement database.

Most likely, you'll use one or more of Squid's intercache protocols to assist in communicating with the other caches or proxies. The Internet Cache Protocol (ICP) is the oldest but not necessarily the best. It is widely implemented in non-Squid products, so you may need to use it for that reason alone. The newer protocols are Cache Digests, the Hypertext Caching Protocol (HTCP), and the Cache Array Routing Protocol (CARP). There are many choices here, so I'll spend a bit of time explaining how everything works inside Squid.
10.1 Some Terminology

Caching hierarchy is the name generally given to a collection of caches (or proxies) that forward requests to one another. We say that the members of the hierarchy are neighbors or peers. Neighbor caches have either a parent or sibling relationship. Topologically, parent caches are one level up in the hierarchy, while siblings are on the same level. The real difference is that parents can forward cache misses for their children. Siblings, on the other hand, aren't allowed to forward cache misses. This means that, before sending a request to a sibling, the originator should know that it will be a cache hit. Intercache protocols like ICP, HTCP, and Cache Digests can predict cache hits in neighbors. CARP, however, can't.

Sometimes, cache hierarchies aren't really hierarchical. Consider, for example, a group of five sibling caches. Because there are no parents or children, there is no sense of up or down. In this case, you could call it a cache mesh, or even an array, instead of a hierarchy.
10.2 Why (Not) Use a Hierarchy?

A neighbor cache improves performance by providing some extra fraction of requests as cache hits. In other words, some of the requests that are misses in your cache may be hits in the neighbor cache. If your cache can download these neighbor hits faster than from the origin server, the hierarchy should improve performance overall. The downside is that neighbor caches usually provide only a small percentage of requests as hits. About 5%, or maybe 10% if you're lucky, of your requests that are cache misses will be hits in a neighbor. In some cases, this small benefit doesn't justify the hassle of joining a hierarchy. In other cases, such as networks with poor or overutilized connectivity, hierarchies definitely improve performance for end users.

If you use Squid inside a firewalled network, you may need to configure the firewall proxy as a parent. In this case, Squid forwards every request to the firewall because it can't connect directly to outside origin servers. If you have some origin servers inside the firewall, you can instruct Squid to connect to them directly.

You can also use a hierarchy to send web traffic in different directions. This is sometimes called application-layer routing, or more recently, content routing. Consider, for example, a large organization with two Internet connections. Perhaps the second connection costs less, or has higher latency, than the other. This organization may want to use the second connection for low-priority traffic, such as downloading binaries, audio and video files, or other kinds of large transfers. Or, perhaps they want to send all HTTP traffic over one link, and non-HTTP traffic over the other. Or, perhaps certain users' traffic should go through the low-priority connection, while premium customers get to use the more expensive link. You can accomplish any of these scenarios with a hierarchy of caching proxies.

Trust is one of the most important issues for the members of a cache hierarchy. You must trust your neighbors to serve correct, unmodified responses. You must trust them with sensitive information, such as the URIs requested by your users. You must trust that they maintain secure and up-to-date systems to minimize the chances of unauthorized access and denials of service.

Another problem with hierarchies is the way that they normally propagate errors. When a neighbor cache experiences an error, such as an unreachable server, it generates an HTML page that explains the error and its origin. Your users may become confused if they get errors from neighbor caches outside the immediate organization. If the problem persists, they'll have a hard time finding an administrator who can help them.

Sibling relationships are subject to a special problem, known as false hits. This occurs when Squid sends a request to a sibling, believing it will be a cache hit, but the sibling is unable to satisfy the request without contacting the origin server. False hits happen in a number of circumstances, but usually with a low probability. Furthermore, Squid and other HTTP proxies have features for automatically retrying such requests so that the user isn't even aware of the problem.

A forwarding loop is another problem sometimes seen in cache hierarchies. It occurs when Squid forwards a request somewhere, but that request comes back to Squid again, as shown in Figure 10-1.
Figure 10-1. A forwarding loop
Forwarding loops typically happen when two caches consider each other parents. If you have such an arrangement, make sure that you use the cache_peer_access directive to prevent loops. For example, if the neighbor's IP address is 192.168.1.1, the following lines ensure Squid won't cause a forwarding loop:

    acl FromNeighbor src 192.168.1.1
    cache_peer_access the.neighbor.name deny FromNeighbor

Forwarding loops can also occur with HTTP interception, especially if the interception device is on the path between Squid and an origin server.

Squid detects forwarding loops by looking for its own hostname in the Via header. You may actually get false forwarding loops if two cooperating caches have the same hostname. The unique_hostname directive is useful in this situation. Note that if the Via header is filtered out (e.g., with headers_access), Squid can't detect forwarding loops.
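
To make the Via check concrete, here is a simplified Python sketch of the idea. It is not Squid's actual code. It assumes each comma-separated Via entry has the form "protocol-version host[:port] (comment)", that comments contain no commas, and that comparing hostnames case-insensitively is good enough. The hostnames and function names are hypothetical.

```python
def via_hosts(via_header):
    """Return the host named in each comma-separated entry of a Via header.

    Simplification: entries are assumed to look like
    "1.0 cache.example.com:3128 (squid/2.5.STABLE4)" with no commas
    inside the parenthesized comment.
    """
    hosts = []
    for entry in via_header.split(","):
        parts = entry.split()
        if len(parts) >= 2:
            # parts[0] is the protocol version, parts[1] is host[:port]
            hosts.append(parts[1].split(":")[0].lower())
    return hosts


def is_forwarding_loop(via_header, my_hostname):
    """True if this cache's own hostname already appears in the Via header."""
    return my_hostname.lower() in via_hosts(via_header)


via = ("1.0 cache-a.example.com:3128 (squid/2.5.STABLE4), "
       "1.0 cache-b.example.com:3128 (squid/2.5.STABLE4)")
print(is_forwarding_loop(via, "cache-a.example.com"))  # prints True
print(is_forwarding_loop(via, "cache-c.example.com"))  # prints False
```

The sketch also shows why two caches sharing one hostname produce false loops: the second cache finds "its" name in a header that the first cache wrote.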
10.3 Telling Squid About Your Neighbors

The cache_peer directive defines your neighbor caches and tells Squid how to communicate with them:

    cache_peer hostname type http-port icp-port [options]

The first argument is the neighbor's hostname, or IP address. You can safely use hostnames here because Squid doesn't block while resolving them. In fact, Squid periodically resolves the hostname in case the IP address changes while Squid is running. Neighbor hostnames must be unique: you can't have two neighbors with the same name, even if they have different ports.

The second argument specifies the type of neighbor cache. The choices are: parent, sibling, or multicast. Parent and sibling are straightforward. I'll talk about multicast in Section 10.6.3.

The third argument is the neighbor's HTTP port number. It should correspond to the neighbor's http_port (or equivalent) setting. You must always specify a nonzero HTTP port number.

The fourth argument specifies either the ICP or HTCP port number. By default, Squid uses ICP to query other caches. That is, Squid sends ICP queries to the neighbor on the port given here. If you add the htcp option, Squid sends HTCP queries to this port instead. The default ICP port is 3130, and the default HTCP port is 4827. Make sure that you change the port number if you add the htcp option. Setting this port number to zero disables both ICP and HTCP. However, you should instead (or also) use the no-query option to disable these protocols.
10.3.1 cache_peer Options

The cache_peer directive has quite a few options. I'll describe some of them here, and the others in the sections relating to specific protocols.

proxy-only
This option instructs Squid not to store any responses it receives from the neighbor. This is often useful when you have a cluster and don't want a resource to be stored on more than one cache.

weight=n
This option is specific to ICP/HTCP. See Section 10.6.2.1.

ttl=n
This option is specific to multicast ICP. See Section 10.6.3.

no-query
This option is specific to ICP/HTCP. See Section 10.6.2.1.

default
This option specifies the neighbor as a suitable choice in the absence of other hints. Squid normally prefers to forward a cache miss to a parent that is likely to have a cached copy of the particular resource. Sometimes Squid won't have any clues (e.g., if you disable ICP/HTCP with no-query). In these cases, Squid looks for a parent that has been marked as a default choice.

round-robin
This option is a simple load-sharing technique. It makes sense only when you mark two or more parent caches as round-robin. Squid keeps a counter for each parent. When it needs to forward a cache miss, Squid selects the parent with the lowest counter.

multicast-responder
This option is specific to multicast ICP. See Section 10.6.3.

closest-only
This option is specific to ICP/HTCP. See Section 10.6.2.1.

no-digest
This option is specific to Cache Digests. See Section 10.7.

no-netdb-exchange
This option tells Squid not to request the neighbor's netdb database (see Section 10.5). Note that this refers to the bulk transfer of the RTT measurements, not the inclusion of these measurements in ICP miss replies.

no-delay
This option tells Squid to ignore any delay pools settings for requests to the neighbor. See Appendix C for more information on delay pools.

login=credentials
This option instructs Squid to send HTTP authentication credentials to the neighbor. It has three different formats:

login=user:password
This is the most commonly used form. It causes Squid to add the same username and password in every request going to the neighbor. Your users don't need to enter any authentication information.

login=PASS
Setting the value to PASS causes Squid to pass the user's authentication credentials to the neighbor cache. It works only for HTTP basic authentication. Squid doesn't add or modify any authentication information. If your Squid is configured to require proxy authentication (i.e., with a proxy_auth ACL), the neighbor cache must use the same username and password database. In other words, you should use the PASS form only for a group of caches owned and operated by a single organization. This feature is dangerous because Squid doesn't remove the authentication credentials from forwarded requests.

login=*:password
With this form, Squid changes the password, but not the username, in requests that it forwards. It allows the neighbor cache to identify individual users, but doesn't expose their passwords. This form is less dangerous than using PASS, but does have some privacy implications.

Use this feature with extreme caution. Even if you ignore the privacy issues, this feature may cause undesirable side effects with upstream proxies. For example, I know of at least one other caching product that only looks at the credentials of the first request on a persistent connection. It apparently assumes (incorrectly) that all requests on a single connection come from the same user.

connect-timeout=n
This option specifies how long Squid should wait when establishing a TCP connection to the neighbor. Without this option, the timeout is taken from the global connect_timeout directive, which has a default value of 120 seconds. By using a lower timeout, Squid gives up on the neighbor quickly and may try to send the request to another neighbor or directly to the origin server.

digest-url=url
This option is specific to Cache Digests. See Section 10.7.

allow-miss
This option instructs Squid to omit the Cache-Control: only-if-cached directive for requests sent to a sibling. You should use this only if the neighbor has enabled the icp_hit_stale directive and isn't using a miss_access list.

max-conn=n
This option places a limit on the number of simultaneous connections that Squid can open to the neighbor. When this limit is reached, Squid excludes the neighbor from its selection algorithm.

htcp
This option designates the neighbor as an HTCP server. In other words, Squid sends HTCP queries, instead of ICP, to the neighbor. Note that Squid doesn't accept ICP and HTCP queries on the same port. When you add this option, don't forget to change the icp-port value as well. See Section 10.8.1. HTCP support requires the --enable-htcp option when running ./configure.

carp-load-factor=f
This option makes the neighbor, which must be a parent, a member of a CARP array. The sum of all f values, for all parents, must equal 1. I cover CARP in Section 10.9. CARP support requires the --enable-carp option when running ./configure.
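
The round-robin option above is described as a counter per parent, with the lowest counter winning. The following Python sketch illustrates that description and is not taken from Squid's source. The class name, the tie-breaking rule (first parent listed wins), and the optional alive set are assumptions made for the example.

```python
class RoundRobinParents:
    """Toy model of round-robin parent selection with per-parent counters."""

    def __init__(self, names):
        # One counter per parent, all starting at zero.
        self.counters = {name: 0 for name in names}

    def select(self, alive=None):
        """Pick the parent with the lowest counter and increment it.

        `alive` optionally restricts the choice to parents currently
        considered up. Ties go to the parent listed first (an assumption).
        Returns None when no parent is eligible.
        """
        candidates = [n for n in self.counters if alive is None or n in alive]
        if not candidates:
            return None
        chosen = min(candidates, key=lambda n: self.counters[n])
        self.counters[chosen] += 1
        return chosen


parents = RoundRobinParents(["p1.my.org", "p2.my.org", "p3.my.org"])
print([parents.select() for _ in range(4)])
# prints ['p1.my.org', 'p2.my.org', 'p3.my.org', 'p1.my.org']
```

Because the choice depends only on the counters, the load evens out across the parents over time, regardless of which URIs are requested.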
10.3.2 Neighbor State

Squid keeps a variety of statistics and state information about each of its neighbors. One of the most important is whether Squid thinks the neighbor is alive (up) or dead (down). The neighbor's alive/dead state affects many aspects of Squid's selection procedures. The algorithm for determining the alive/dead state is a little bit complicated, so I'll go through it here. If you want to follow along in the source code, look at the neighborUp() function.

Squid uses both TCP (HTTP) and UDP (ICP/HTCP) communication to determine the state. The TCP state defaults to alive, but changes to dead if 10 consecutive TCP connections fail. When this happens, Squid initiates probe connections, no more than once every connect_timeout time period (the global directive, not the cache_peer option). The state remains dead until one of the probe connections succeeds.

If the no-query option isn't set (meaning Squid is sending ICP/HTCP queries to the neighbor), the UDP layer communication also factors into the alive/dead algorithm. The UDP state defaults to alive, but changes to dead if Squid doesn't get any ICP/HTCP replies for a certain amount of time (the value of the dead_peer_timeout directive).

Squid also marks a neighbor dead if its hostname doesn't resolve to any IP addresses. When Squid determines a neighbor is dead, it writes an entry in cache.log. Here's an example:

    2003/09/29 01:13:46| Detected DEAD Sibling: bo2.us.ircache.net/3128/3130

When communication with the neighbor is reestablished, Squid logs a message like this:

    2003/09/29 01:13:49| Detected REVIVED Sibling: bo2.us.ircache.net/3128/3130

A neighbor's state affects neighbor-selection algorithms in the following ways:

● Squid doesn't expect to receive ICP/HTCP replies from dead neighbors. Squid sends ICP queries to dead neighbors no more than once each dead_peer_timeout interval. See Appendix A.
● A dead parent is excluded from the following algorithms: Cache Digests, round-robin parent, first up parent, default parent, and closest parent.
● CARP is special: any failed TCP connection (not the 10 required to become dead) excludes the parent from the CARP algorithm.

There is no way to force Squid to send HTTP requests to a dead neighbor. If all neighbors are dead, Squid will try connecting to the origin server. If you don't allow Squid to talk to the origin server (with never_direct, for example), Squid returns a cannot forward error message:

    This request could not be forwarded to the origin server or to any parent caches.

    The most likely cause for this error is that:

    * The cache administrator does not allow this cache to make direct connections to origin servers, and
    * All configured parent caches are currently unreachable.
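
The alive/dead rules described in this section can be summarized in a small Python model. It is a sketch of the prose, not a port of neighborUp(). The class and method names are hypothetical, probe connections and hostname resolution are left out, and treating the neighbor's creation time as its first "reply" is an assumption.

```python
class NeighborState:
    """Toy model of the TCP and UDP alive/dead rules described in the text."""

    TCP_FAILURE_LIMIT = 10  # consecutive failed connections before "dead"

    def __init__(self, dead_peer_timeout, uses_queries=True, now=0.0):
        self.dead_peer_timeout = dead_peer_timeout  # seconds
        self.uses_queries = uses_queries            # False models no-query
        self.consecutive_tcp_failures = 0
        self.last_reply = now                       # assumption: starts "fresh"

    def tcp_connect_result(self, ok):
        """Record the outcome of a TCP connection attempt (or probe)."""
        if ok:
            self.consecutive_tcp_failures = 0
        else:
            self.consecutive_tcp_failures += 1

    def udp_reply_received(self, now):
        """Record the arrival time of an ICP/HTCP reply."""
        self.last_reply = now

    def is_alive(self, now):
        """Alive only if both the TCP and (when used) UDP states are alive."""
        if self.consecutive_tcp_failures >= self.TCP_FAILURE_LIMIT:
            return False
        if self.uses_queries and now - self.last_reply > self.dead_peer_timeout:
            return False
        return True


peer = NeighborState(dead_peer_timeout=10)
for _ in range(10):
    peer.tcp_connect_result(ok=False)
print(peer.is_alive(now=1))   # prints False
peer.tcp_connect_result(ok=True)
print(peer.is_alive(now=1))   # prints True
```

With uses_queries set to False, only the TCP rule applies, which mirrors the text's point that the UDP state matters only when Squid is actually sending queries.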
10.3.3 Altering the Relationship

The neighbor_type_domain directive allows you to change the relationship with your neighbor based on the origin server's hostname. This is useful, for example, if your neighbor is willing to serve cache hits for any request but misses only for certain nearby domains. The syntax is:

    neighbor_type_domain neighbor.host.name relationship [!]domain ...

For example:

    cache_peer squid.uk.web-cache.net sibling 3128 3130
    neighbor_type_domain squid.uk.web-cache.net parent .uk

Of course, the squid.uk.web-cache.net cache in this example should utilize appropriate miss_access rules to enforce the sibling relationship for non-UK requests. Note that domain names are matched to hostnames as described in Section 6.1.1.2.
10.4 Restricting Requests to Neighbors

Many people who use hierarchical caching need to control or limit requests that Squid sends to its neighbors. Squid has seven different directives that affect request routing: cache_peer_access, cache_peer_domain, never_direct, always_direct, hierarchy_stoplist, nonhierarchical_direct, and prefer_direct.
10.4.1 cache_peer_access

The cache_peer_access directive defines an access list for a neighbor cache. That is, it determines which requests may, or may not, be sent to the neighbor. You can use this, for example, to split the flow of FTP and HTTP requests. You can send all FTP URIs to one parent and all HTTP URIs to another:

    cache_peer A-parent.my.org parent 3128 3130
    cache_peer B-parent.my.org parent 3128 3130
    acl FTP proto FTP
    acl HTTP proto HTTP
    cache_peer_access A-parent.my.org allow FTP
    cache_peer_access B-parent.my.org allow HTTP

This configuration ensures that A-parent receives only requests for FTP URIs, while B-parent receives only requests for HTTP URIs. This includes ICP/HTCP queries as well.

You might also use cache_peer_access to enable or disable a neighbor cache during certain times of the day:

    cache_peer A-parent.my.org parent 3128 3130
    acl DayTime time 07:00-18:00
    cache_peer_access A-parent.my.org deny DayTime

10.4.2 cache_peer_domain

The cache_peer_domain directive is an earlier form of cache_peer_access. Rather than using the full access control feature set, it only uses domain names in URIs. It is often used to partition a group of parent caches by domain name. For example, if you have a global intranet, you may want to send requests to caches located on each continent:

    cache_peer europe-cache.my.org parent 3128 3130
    cache_peer asia-cache.my.org   parent 3128 3130
    cache_peer aust-cache.my.org   parent 3128 3130
    cache_peer africa-cache.my.org parent 3128 3130
    cache_peer na-cache.my.org     parent 3128 3130
    cache_peer sa-cache.my.org     parent 3128 3130

    cache_peer_domain europe-cache.my.org .ch .dk .fr .uk .nl .de .fi ...
    cache_peer_domain asia-cache.my.org   .jp .kr .cn .sg .tw .vn .hk ...
    cache_peer_domain aust-cache.my.org   .nz .au .aq ...
    cache_peer_domain africa-cache.my.org .dz .ly .ke .mz .ma .mg ...
    cache_peer_domain na-cache.my.org     .mx .ca .us ...
    cache_peer_domain sa-cache.my.org     .br .cl .ar .co .ve ...

Of course, this scheme doesn't address the popular global top-level domains, such as .com.
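To make the partitioning concrete, here is a minimal Python sketch of how a domain list might map a request hostname to a parent. It is not Squid's code: the PEER_DOMAINS table is abbreviated from the listing above, peers_for_host is a hypothetical helper, and the plain suffix test is a simplification of the matching rules in Section 6.1.1.2.

```python
# Abbreviated, hypothetical copy of the cache_peer_domain lists above.
PEER_DOMAINS = {
    "europe-cache.my.org": [".ch", ".dk", ".fr", ".uk", ".nl", ".de", ".fi"],
    "asia-cache.my.org": [".jp", ".kr", ".cn", ".sg", ".tw", ".vn", ".hk"],
    "na-cache.my.org": [".mx", ".ca", ".us"],
}


def peers_for_host(hostname, peer_domains=PEER_DOMAINS):
    """Return the peers whose domain list matches hostname.

    Simplified model: a domain such as ".uk" matches any hostname
    that ends with ".uk".
    """
    host = hostname.lower().rstrip(".")
    matches = []
    for peer, domains in peer_domains.items():
        if any(host.endswith(domain) for domain in domains):
            matches.append(peer)
    return matches
```

A hostname such as www.bbc.co.uk maps to the European parent, while www.example.com matches no list at all, which is the gap noted above for global top-level domains.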

10.4.3 never_direct

The never_direct directive is an access list for requests that must never be sent directly to an origin server. When a request matches this access list, it must be sent to a neighbor (usually parent) cache.

For example, if Squid is behind a firewall, it may be able to talk to your "internal" servers directly but must send all requests for external servers via the firewall proxy (a parent). You can tell Squid "never connect directly to sites outside the firewall." To do so, tell Squid what is inside the firewall:

    acl InternalSites dstdomain .my.org
    never_direct allow !InternalSites

The syntax is a little strange. never_direct allow foo means Squid will not go directly for requests that match "foo." Since the set of internal sites is easy to specify, I used the negation operator (!) to match external sites, which Squid must never directly contact.

Note that this example doesn't force Squid to connect directly to sites that match the InternalSites ACL. The never_direct access rule can only force Squid not to contact certain origin servers. You must use the always_direct rule to force direct connections to origin servers.

You must take care when using never_direct in combination with the other directives that control request routing. You can easily create an impossible situation. Here's an example:

    cache_peer A-parent.my.org parent 3128 3130
    acl COM dstdomain .com
    cache_peer_access A-parent.my.org deny COM
    never_direct allow COM

This configuration creates a contradiction because any request whose domain name ends with .com must go through a neighbor cache. However, I defined only one neighbor cache, and don't allow the .com requests to go there. When this happens, Squid emits the "cannot forward" error message mentioned earlier in this chapter.
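The contradiction is easy to see if you model forwarding as building a list of candidate next hops. The following Python sketch is only an illustration under that assumption; forwarding_candidates and is_com are hypothetical names, and real Squid evaluates full ACLs rather than simple predicates.

```python
def is_com(host):
    """Stand-in for the COM ACL: true for hostnames ending in .com."""
    return host.lower().endswith(".com")


def forwarding_candidates(host, peers, peer_denied, never_direct):
    """List the places a request for host could be sent.

    peers        -- neighbor names, in configuration order
    peer_denied  -- dict: peer name -> predicate(host) that is true
                    when the request must NOT be sent to that peer
    never_direct -- predicate(host) that is true when the request
                    must not go straight to the origin server
    """
    candidates = []
    for peer in peers:
        deny = peer_denied.get(peer)
        if deny is None or not deny(host):
            candidates.append(peer)
    if not never_direct(host):
        candidates.append("DIRECT")
    return candidates
```

With the configuration above, a .com request yields an empty list: the only parent is denied and a direct connection is forbidden, so there is nowhere to forward the request.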

10.4.4 always_direct

As you can probably guess, the list of always_direct rules tells Squid that some requests must be forwarded directly to the origin server. For example, many organizations want to keep their local traffic local. An easy way to do this is to define an IP address-based ACL for the local servers and put it in the always_direct rule list:

    acl OurNetwork dst 172.16.3.0/24
    always_direct allow OurNetwork

10.4.5 hierarchy_stoplist

Internally, Squid flags each client request as either hierarchical or nonhierarchical. A nonhierarchical request is one that is unlikely to result in a cache hit. For example, responses to POST requests are almost never cachable. Forwarding requests for uncachable objects to neighbors is a waste of resources when Squid can simply connect to the origin server.

Some of the rules for differentiating hierarchical and nonhierarchical requests are hardcoded in Squid. For example, the POST and PUT methods are always nonhierarchical. However, the hierarchy_stoplist directive allows you to customize the algorithm. It contains a list of strings that, when found in a URI, make the request nonhierarchical. The default list is:

    hierarchy_stoplist ? cgi-bin

Thus, any request that contains a question mark or the cgi-bin string matches the stoplist and becomes nonhierarchical.

By default, Squid prefers to send nonhierarchical requests directly to origin servers. Because they are unlikely to result in cache hits, they are generally an extra burden on neighbor caches. However, the never_direct access control rules override hierarchy_stoplist. In particular, Squid:

● Never sends ICP/HTCP queries for nonhierarchical requests unless the request matches a never_direct rule
● Never sends ICP/HTCP queries to sibling caches for nonhierarchical requests
● Never looks in neighbor cache digests for nonhierarchical requests
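The classification just described can be sketched in a few lines of Python. This models only the two rules named above (the POST and PUT methods, and stoplist strings found in the URI). Squid has other hardcoded rules that are not shown, and is_hierarchical is a hypothetical name.

```python
# Methods the text names as always nonhierarchical.
NONHIERARCHICAL_METHODS = {"POST", "PUT"}


def is_hierarchical(method, uri, stoplist=("?", "cgi-bin")):
    """Return False when the request should be treated as nonhierarchical.

    The default stoplist mirrors "hierarchy_stoplist ? cgi-bin":
    a plain substring found anywhere in the URI is enough to match.
    """
    if method.upper() in NONHIERARCHICAL_METHODS:
        return False
    return not any(entry in uri for entry in stoplist)
```

For example, a GET for http://www.example.com/index.html stays hierarchical, while a GET for http://www.example.com/search?q=squid does not.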

10.4.6 nonhierarchical_direct

This directive controls the way that Squid forwards nonhierarchical (i.e., probably uncachable) requests. By default, Squid prefers to send nonhierarchical requests directly to origin servers. This is because such requests are unlikely to result in cache hits. I feel it is always better to get them directly from the origin server, rather than waste time looking for them in neighbor caches. If, for some reason, you want to route such requests through the hierarchy, disable this directive:

    nonhierarchical_direct off

10.4.7 prefer_direct

This directive controls the way that Squid forwards hierarchical (i.e., probably cachable) requests. By default, Squid prefers to send such requests to a neighbor cache first and then directly to the origin server. You can reverse this behavior by enabling the directive:

    prefer_direct on

In this way, your neighbor caches become a backup if communication with the origin server fails.
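Taken together, nonhierarchical_direct and prefer_direct decide where Squid tries first. The Python sketch below captures only that first preference as described in the last two sections. The function name is hypothetical, and the real selection code also consults never_direct, always_direct, and the state of each peer.

```python
def first_choice(hierarchical, prefer_direct=False, nonhierarchical_direct=True):
    """Return "neighbor" or "direct": where the request is tried first.

    Keyword defaults mirror the default behavior described in the text.
    """
    if hierarchical:
        # Probably cachable: neighbors first unless prefer_direct is on.
        return "direct" if prefer_direct else "neighbor"
    # Probably uncachable: direct first unless nonhierarchical_direct is off.
    return "direct" if nonhierarchical_direct else "neighbor"
```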

10.5 The Network Measurement Database

Squid's network measurement database (netdb) is designed to measure the proximity of origin servers. In other words, by querying this database, Squid knows how close it is to the origin server. The database includes ICMP round-trip time (RTT) measurements and hop counts. Squid normally uses only the RTT measurements but can also use the hop counts in some situations.

To enable netdb, you must configure Squid with the --enable-icmp option. You must also install the pinger program with superuser permissions, as described in Section 3.6. When everything is working correctly, you should see a message like this in cache.log:

    2003/09/29 00:01:03| Pinger socket opened on FD 28

When netdb is enabled, Squid sends ICMP "pings" to origin servers. The ICMP messages are actually sent and received by the pinger program, which runs as root. Squid is careful not to send pings too frequently, which may annoy web site administrators. By default, Squid waits at least five minutes before sending another ping to the same host, or to any other host on the same /24 subnet. You can adjust the interval with the netdb_ping_period directive.

The ICMP pings are generally small in size (less than 100 bytes). Squid includes the origin server hostname in the payload of the ICMP message, along with a timestamp.

To reduce memory requirements, Squid aggregates the netdb data by /24 subnets. Squid assumes that all hosts within the subnet have similar RTT and hop-count measurements. This scheme also allows Squid to estimate the proximity of a new origin server when other servers in the subnet have already been measured.

Along with the RTT and hop-count measurements, Squid also stores a list of hostnames associated with the subnet. A typical record may look something like this:

    Subnet  140.98.193.0
    RTT     76.5
    Hops    20.0
    Hosts   services1.ieee.org
            www.spectrum.ieee.org
            www.ieee.org
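The /24 aggregation and the ping rate limit can be illustrated with a short Python sketch. It assumes IPv4 addresses and a 300-second period (the five-minute default mentioned above). netdb_key and PingLimiter are hypothetical names, not part of Squid.

```python
import ipaddress


def netdb_key(ip):
    """Return the /24 network address under which a host is filed."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


class PingLimiter:
    """Allow at most one ping per /24 subnet per period (in seconds)."""

    def __init__(self, period=300):
        self.period = period
        self.last_ping = {}  # subnet key -> time of the last ping

    def may_ping(self, ip, now):
        key = netdb_key(ip)
        last = self.last_ping.get(key)
        if last is not None and now - last < self.period:
            return False  # this subnet was pinged too recently
        self.last_ping[key] = now
        return True
```

With this model, a ping to 140.98.193.27 suppresses pings to every other address in 140.98.193.0/24 for the next five minutes.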

The netdb measurements are primarily used by ICP and HTCP. When you enable the query_icmp directive in squid.conf, Squid sets a flag in the ICP/HTCP queries that it sends to neighbors. This flag is a request to include proximity measurements in the ICP/HTCP reply. If your neighbors also enabled netdb, their replies should include RTT and hop-count measurements if available. Note that Squid always sends ICP replies immediately. It doesn't wait for an ICMP measurement before replying to the query. See Section 10.6.2.2 for details on how ICP uses netdb.

Squid remembers the RTT values it learns from ICP/HTCP replies. These values may be used later to optimize forwarding decisions. Squid also supports a "bulk transfer" of netdb measurements via what is called netdb exchange. Squid periodically makes an HTTP request to a neighbor for its netdb data. You can disable these requests with the no-netdb-exchange option on the cache_peer line.

The netdb_low and netdb_high directives control the size of the measurement database. When the number of stored subnets reaches netdb_high, Squid deletes the least recently used entries until the count is less than netdb_low.

The minimum_direct_hops and minimum_direct_rtt directives instruct Squid to connect directly to origin servers that are no more than some number of hops, or milliseconds, away. Requests that meet these criteria are logged with CLOSEST_DIRECT in access.log.

The cache manager's netdb page displays the entire network measurement database, including values from neighbor caches. For example:

    Network DB Statistics:
    Network                 recv/sent     RTT  Hops Hostnames
    63.241.84.0                1/   1    25.0   9.0 www.xyzzy.com
        sd.us.ircache.net                21.5  15.0
        bo1.us.ircache.net               27.0  13.0
        pb.us.ircache.net                70.0  11.0
    206.100.24.0               5/   5    25.0   3.0 wcarchive.cdrom.com ftp.cdrom.com
        uc.us.ircache.net                23.5  11.0
        bo1.us.ircache.net               27.7   7.0
        pb.us.ircache.net                35.7  10.0
        sd.us.ircache.net                72.9  10.0
    146.6.135.0                1/   1    25.0  13.0 www.cm.utexas.edu
        bo1.us.ircache.net               32.0  11.0
        sd.us.ircache.net                55.0   8.0
    216.234.248.0              2/   2    25.0   8.0 postfuture.com www1.123india.com
        pb.us.ircache.net                44.0  14.0
    216.148.242.0              1/   1    25.0   9.0 images.worldres.com
        sd.us.ircache.net                25.2  15.0
        bo1.us.ircache.net               27.0  13.0
        pb.us.ircache.net                69.5  11.0

Here you can see that the server www.xyzzy.com has an IP address in the 63.241.84.0/24 block. The RTT from this cache to the origin server is 25 milliseconds. The neighbor cache sd.us.ircache.net is a little closer, at 21.5 milliseconds.

10.6 Internet Cache Protocol

ICP is a lightweight object location protocol invented as a part of the Harvest project.[1] An ICP client sends a query message to one or more ICP servers, asking if they have a particular URI cached. Each server replies with an ICP_HIT, ICP_MISS, or other type of ICP message. The ICP client uses the information in the ICP replies to make a forwarding decision.

[1] For more information, see the following papers: "A Hierarchical Internet Object Cache," by Danzig, Chankhunthod, et al, USENIX Annual Technical Conference, 1995, and "The Harvest information discovery and access system," by C. Mic Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manber, and Michael F. Schwartz, Proceedings of the Second International World Wide Web Conference.

In addition to predicting cache hits, ICP is also useful for providing hints about network conditions between Squid and the neighbor. ICP messages are similar to ICMP pings in this regard. By measuring the query/response round-trip time, Squid can estimate network congestion. In the extreme case, ICP messages may be lost, indicating that the path between the two is down or congested. From this, Squid decides to avoid the neighbor for that particular request.

Increased latency is perhaps the most significant drawback to using ICP. The query/response exchange takes some time. Caching proxies are supposed to decrease response time, not add more latency. If ICP helps us discover cache hits in neighbors, then it may lead to an overall reduction in response time. See Section 10.10 for a description of the query algorithm implemented in Squid.

ICP also suffers from a number of design deficiencies: security, scalability, false hits, and the lack of a request method.

The protocol doesn't include any security features. In general, Squid can't verify that an ICP message is authentic; it relies on address-based access controls to filter out unwanted ICP messages.

ICP has poor scaling properties. The number of ICP messages (and bandwidth) grows in proportion to the number of neighbors. Unless you use some kind of partitioning scheme, this places a practical limit on the number of neighbors you can have. I don't recommend having more than five or six neighbors.

ICP queries contain only URIs, with no additional request headers. This makes it difficult to predict cache hits with perfect accuracy. An HTTP request may include additional headers (such as Cache-Control: max-stale=N) that turn a cache hit into a cache miss. These false hits are particularly awkward for sibling relationships.

Also missing from the ICP query message is the request method. ICP assumes that all queries are for GET requests. A caching proxy can't use ICP to locate cached objects for non-GET request methods.

You can find additional information about ICP by reading:

● My book Web Caching (O'Reilly)
● RFCs 2186 and 2187
● My article with kc claffy: "ICP and the Squid Web Cache" in the IEEE Journal on Selected Areas in Communication, April 1998
● http://icp.ircache.net/

10.6.1 Being an ICP Server

When you use the icp_port directive, Squid automatically becomes an ICP server. That is, it listens for ICP messages on the port you've specified, or port 3130 by default. Be sure to tell your sibling and/or child caches if you decide to use a nonstandard port.

By default, Squid denies all ICP queries. You must use the icp_access rule list to allow queries from your neighbors. It's usually easiest to do this with src ACLs. For example:

    acl N1 src 192.168.0.1
    acl N2 src 172.16.0.2
    acl All src 0/0
    icp_access allow N1
    icp_access allow N2
    icp_access deny All

Note that only ICP_QUERY messages are subject to the icp_access rules. ICP client functions, such as sending queries and receiving replies, don't require any special access controls. I also recommend that you take advantage of your operating system's packet filtering features (e.g., ipfw, iptables, and pf) if possible. Allow UDP messages on the ICP port from your trusted neighbors and deny them from all other hosts.

When Squid denies an ICP query due to the icp_access rules, it sends back an ICP_DENIED message. However, if Squid detects that more than 95% of the recent queries have been denied, it stops responding for an hour. When this happens, Squid writes a message in cache.log:

    WARNING: Probable misconfigured neighbor at foo.web-cache.com
    WARNING: 150 of the last 150 ICP replies are DENIED
    WARNING: No replies will be sent for the next 3600 seconds

If you see this message, you should contact the administrator responsible for the misconfigured cache.

Squid was designed to answer ICP queries immediately. That is, Squid can tell whether or not it has a fresh, cached response by checking the in-memory index. This is also why Squid is a bit of a memory hog. When an ICP query comes in, Squid calculates the MD5 hash of the URI and looks for it in the index. If not found, Squid sends back an ICP_MISS message. If found, Squid checks the expiration time. If the object isn't fresh, Squid returns ICP_MISS. For fresh objects, Squid returns ICP_HIT.

By default, Squid logs all ICP queries (but not responses) to access.log. If you have a lot of busy neighbors, your log file may become too large to manage. Use the log_icp_queries directive to prevent logging of these queries. Although you'll lose the detailed logging for ICP, you can still get some aggregate stats via the cache manager (see Section 14.2.1.24).

If you have sibling neighbors, you'll probably want to use the miss_access directive to enforce the relationship. It specifies an access rule for cache misses. It is similar to http_access but is checked only for requests that must be forwarded. The default rule is to allow all cache misses. Unless you add some miss_access rules, any sibling cache can become a child cache and forward cache misses through your network connection, thus stealing your bandwidth.

Your miss_access rules can be relatively simple. Don't forget to include your local clients (i.e., web browsers) as well. Here's a simple example:

    acl Browsers src 10.9.0.0/16
    acl Child1 src 172.16.3.4
    acl Child2 src 192.168.2.0/24
    acl All src 0/0
    miss_access allow Browsers
    miss_access allow Child1
    miss_access allow Child2
    miss_access deny All

Note that I haven't listed any siblings here. The child caches are allowed to request misses through us, but the siblings are not. Their cache miss requests are denied by the deny All rule.
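The immediate-reply logic described in this section (an MD5 lookup in the in-memory index followed by a freshness check) can be sketched in Python as follows. This is a simplified model: the index is a plain dictionary from key to expiration time, and cache_key and icp_reply are hypothetical helpers rather than Squid functions.

```python
import hashlib


def cache_key(uri):
    """Return the MD5 digest of the URI, used here as the index key."""
    return hashlib.md5(uri.encode("utf-8")).digest()


def icp_reply(uri, index, now):
    """Choose the ICP reply opcode for a query.

    index -- dict mapping cache_key(uri) to an expiration time
    now   -- current time, in the same units as the expiration times
    """
    expires = index.get(cache_key(uri))
    if expires is None:
        return "ICP_MISS"  # not in the cache at all
    if expires <= now:
        return "ICP_MISS"  # cached, but no longer fresh
    return "ICP_HIT"
```

Because the decision needs nothing beyond the in-memory index, no disk or network access stands between the query and the reply.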

10.6.1.1 The icp_hit_stale directive

One of the problems with ICP is that it returns ICP_MISS for cached but stale responses. This is true even if the response is stale, but valid (such that a validation request returns "not modified"). Consider a simple hierarchy with a child and two parent caches. An object is cached by one parent but not the other. The cached response is stale, but unchanged, and needs validation. The child's ICP query results in two ICP_MISS replies. Not knowing that the stale response exists in the first parent, the child forwards its request to the second parent. Now the object is stored in both parents, wasting resources.

You might find the icp_hit_stale directive useful in this situation. It tells Squid to return an ICP_HIT for any cached object, even if it is stale. This is perfectly safe for parent relationships but can create problems for siblings.

Recall that in a sibling relationship, the client cache is only allowed to make requests that are cache hits. Enabling the icp_hit_stale directive increases the number of false hits because Squid must validate the stale responses. Squid normally handles false hits by adding the Cache-Control: only-if-cached directive to HTTP requests sent to siblings. If the sibling can't satisfy the HTTP request as a cache hit, it returns an HTTP 504 (Gateway Timeout) message instead. When Squid receives the 504 response, it forwards the request again, but only to a parent or the origin server.

It makes little sense to enable icp_hit_stale for sibling relationships if all the false hits must be reforwarded. This is where the ICP client's allow-miss option to cache_peer becomes useful. When the allow-miss option is set, Squid omits the only-if-cached directive in HTTP requests it sends to siblings.

If you enable icp_hit_stale, you also need to make sure that miss_access doesn't deny cache miss requests from siblings. Unfortunately, there is no way to make Squid allow only cache misses for cached, stale objects. Allowing cache misses for siblings also leaves your cache open to potential abuse. The administrator of the sibling cache may change it to a parent relationship without your knowledge or permission.

10.6.1.2 The ICP_MISS_NOFETCH feature

The command-line -Y option to Squid causes it to return ICP_MISS_NOFETCH, instead of ICP_MISS, while rebuilding the in-memory indexes. ICP clients that receive ICP_MISS_NOFETCH responses should not send HTTP requests for those objects. This reduces the load placed on Squid and allows the rebuild process to complete sooner.

10.6.1.3 The test_reachability directive

If you enable the netdb feature (see Section 10.5), you might also be interested in enabling the test_reachability directive. The goal behind it is to accept only requests for origin servers Squid can reach. Enabling test_reachability causes Squid to return ICP_MISS_NOFETCH, instead of ICP_MISS, for origin server sites that don't respond to ICMP pings. This can help reduce the number of failed HTTP requests and increase the chance that the end user receives the data promptly. However, a significant percentage of origin server sites intentionally filter out ICMP traffic. For these, Squid returns ICP_MISS_NOFETCH even though an HTTP connection would succeed.

Enabling test_reachability also causes Squid to make netdb measurements in response to ICP queries. If Squid doesn't have any RTT measurements for the origin server in question, it sends out an ICMP ping (subject to the rate limiting mentioned previously).

10.6.2 Being an ICP Client

First, you must use the cache_peer directive to define your neighbor caches. See Section 10.3.

Second, you must also use the icp_port directive, even if your Squid is only an ICP client. This is because Squid uses the same socket for sending and receiving ICP messages. It is perhaps a bad design decision in retrospect. If you are a client only, use icp_access to block queries. For example:

    acl All src 0/0
    icp_access deny All

Squid sends ICP queries to its neighbors for most requests by default. See Section 10.10 for a complete description of the way that Squid decides when, and when not, to query its neighbors.

After sending one or more queries, Squid waits some amount of time for ICP replies to arrive. If Squid receives an ICP_HIT from one of its neighbors, it forwards the request there immediately. Otherwise, Squid waits until all replies arrive or until a timeout occurs. The timeout is calculated dynamically, based on the following algorithm.

Squid knows the average round-trip time between itself and each neighbor, taken from recent ICP transactions. When querying a group of neighbors, Squid calculates the mean of all the neighbor ICP RTTs, and then doubles it. In other words, the query timeout is twice the mean of RTTs for each neighbor queried. Squid ignores neighbors that appear to be down when calculating the timeout.

In some cases, the algorithm doesn't work well, especially if you have neighbors with widely varying RTTs. You can change the upper limit on the timeout using the maximum_icp_query_timeout directive. Alternatively, you can make Squid always use a constant timeout value with the icp_query_timeout directive.
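As a rough illustration of the timeout calculation, the Python sketch below doubles the mean RTT of the live neighbors and applies an upper limit. The parameter names are hypothetical stand-ins for maximum_icp_query_timeout and icp_query_timeout. Falling back to the maximum when every neighbor looks down is an assumption of this sketch, not documented behavior.

```python
def icp_query_timeout(peer_rtts_ms, maximum_ms, fixed_ms=None):
    """Estimate how long to wait for ICP replies, in milliseconds.

    peer_rtts_ms -- average ICP RTT for each queried neighbor;
                    None marks a neighbor that appears to be down
    maximum_ms   -- upper limit on the dynamic timeout
    fixed_ms     -- constant timeout that replaces the dynamic one
    """
    if fixed_ms is not None:
        return fixed_ms
    alive = [rtt for rtt in peer_rtts_ms if rtt is not None]
    if not alive:
        return maximum_ms  # assumption: nothing to average
    return min(2 * sum(alive) / len(alive), maximum_ms)
```

The weakness with widely varying RTTs shows up quickly: for neighbors at 10, 10, and 400 milliseconds the result is 280 milliseconds, so the reply from the slowest neighbor would typically arrive after the timeout.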

10.6.2.1 cache_peer options for ICP clients

weight=n
    allows you to weight parent caches artificially when using ICP/HTCP. It comes into play only when all parents report a cache miss. Normally, Squid selects the parent whose reply arrives first. In fact, it remembers which parent has the best RTT for the query. Squid actually divides the RTT by the weight, so that a parent with weight=2 is treated as if it's closer to Squid than it really is.

no-query
    disables ICP/HTCP for the neighbor. That is, your cache won't send any queries to the neighbor for cache misses. It is often used with the default option.

closest-only
    refers to one of Squid's netdb features. It instructs Squid to select the parent based only on netdb RTT measurements and not the order in which replies arrive. This option requires netdb at both ends.
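The effect of weight=n can be shown with a small Python sketch that divides each parent's measured RTT by its weight and picks the smallest result. The function and the tuple layout are hypothetical, and the sketch covers only the all-miss case described above.

```python
def pick_weighted_parent(replies):
    """Pick the parent with the smallest weighted RTT.

    replies -- list of (parent_name, rtt_ms, weight) tuples for
               parents that all answered with a cache miss
    """
    best = min(replies, key=lambda reply: reply[1] / reply[2])
    return best[0]
```

A parent at 50 milliseconds with weight=2 counts as 25 milliseconds, so it beats an unweighted parent at 30 milliseconds.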

10.6.2.2 ICP and netdb

As mentioned in Section 10.5, netdb is mostly used with ICP queries. In this section, we'll follow all the steps involved in this process.

1. A Squid cache, acting as an ICP client, prepares to send a query to one or more neighbors. If query_icmp is set, Squid sets the SRC_RTT flag in the ICP query. This informs the ICP server that Squid would like to receive an RTT measurement in the ICP reply.

2. The neighbor receives the query with the SRC_RTT flag set. If the neighbor is configured to make netdb measurements, it searches the database for the origin server hostname. Note that the neighbor doesn't query the DNS for the origin server's IP address. Thus, it finds a netdb entry only if that particular host has already been measured.

3. If the host exists in the netdb database, the neighbor includes the RTT and hop count in the ICP reply. The SRC_RTT flag is set in the reply to indicate the measurement is present.

4. When Squid receives the ICP reply with the SRC_RTT flag set, it extracts the RTT and hop count. These are added to the local netdb so that, in the future, Squid knows the approximate RTT from the neighbor to the origin server.

5. An ICP_HIT reply causes Squid to forward the HTTP request immediately. If, on the other hand, Squid receives only ICP_MISS replies, it selects the parent with the smallest (nonzero) measured RTT to the origin server. The request is logged to access.log with CLOSEST_PARENT_MISS.

6. If none of the parent ICP_MISS replies contain RTT values, Squid selects the parent whose ICP reply arrived first. In this case, the request is logged with FIRST_PARENT_MISS. However, if the closest-only option is set for a parent cache, Squid never selects it as a "first parent." In other words, the parent is selected only if it is the closest parent to the origin server.
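Steps 5 and 6 amount to a small selection routine for the case in which every parent replied with ICP_MISS. The Python sketch below follows those steps. The dictionary layout and the function name are hypothetical, and an RTT of 0 stands for "no measurement in the reply."

```python
def select_parent_on_miss(misses):
    """Choose a parent when all ICP replies were misses.

    misses -- list of dicts in arrival order, each with:
              "peer"         parent name
              "rtt"          parent-to-origin RTT in ms, 0 if absent
              "closest_only" optional bool, the closest-only option
    Returns (peer, log_tag), or (None, None) if nothing qualifies.
    """
    measured = [m for m in misses if m["rtt"] > 0]
    if measured:
        best = min(measured, key=lambda m: m["rtt"])
        return best["peer"], "CLOSEST_PARENT_MISS"
    for m in misses:  # no RTTs at all: fall back to arrival order
        if not m.get("closest_only", False):
            return m["peer"], "FIRST_PARENT_MISS"
    return None, None
```

A parent marked closest-only can still win through the first branch, but it is skipped in the arrival-order fallback.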

10.6.3 Multicast ICP

As you already know, ICP has poor scaling properties. The number of messages is proportional to the number of neighbors. Because Squid sends identical ICP_QUERY messages to each neighbor, you can use multicast to reduce the number of messages transmitted on the network. Rather than send N messages to N neighbors, Squid sends one message to a multicast address. The multicast routing infrastructure makes sure each neighbor receives a copy of the message. See the book Interdomain Multicast Routing: Practical Juniper Networks and Cisco Systems Solutions by Brian M. Edwards, Leonard A. Giuliano, and Brian R. Wright (Addison Wesley) for more information on the inner workings of multicast.

Note that ICP replies are always sent via unicast. This is because ICP replies may be different (e.g., hit versus miss) and because the unicast and multicast routing topologies may differ. Because ICP is also used to indicate network conditions, an ICP reply should follow the same path an HTTP reply takes. The bottom line is that multicast only reduces message counts for queries.

Historically, I've found multicast infrastructure unstable and unreliable. It seems to be a low priority for many ISPs. Even though it works one day, something may break a few days or weeks later. You're probably safe using multicast entirely within your own network, but I don't recommend using it for ICP on the public Internet.

10.6.3.1 Multicast ICP server

A multicast ICP server joins one or more multicast group addresses to receive messages. The mcast_groups directive specifies these group addresses. The values must be multicast IP addresses or hostnames that resolve to multicast addresses. The IPv4 multicast address range is 224.0.0.0-239.255.255.255. For example:

    mcast_groups 224.11.22.45

An interesting thing about multicast is that hosts, rather than applications, join a group. When a host joins a multicast group, it receives all packets that are transmitted to that group. This means that you need to be a little bit careful when selecting a multicast group to use for ICP. You don't want to select an address that's already being used by another application. When this kind of group overlap occurs, the two groups become joined and receive each other's traffic.

10.6.3.2 Multicast ICP client

Multicast ICP clients transmit queries to one or more multicast group addresses. Thus, the hostname argument of the cache_peer line must be, or resolve to, a multicast address. For example:

    cache_peer 224.0.14.1 multicast 3128 3130 ttl=32

The HTTP port number (e.g., 3128) is irrelevant in this case because Squid never makes HTTP connections to a multicast neighbor.

Realize that multicast groups don't have any access controls. Any host can join any multicast group address. This means that, unless you're careful, others may be able to receive the multicast ICP queries sent by your Squid. IP multicast has two ways to prevent packets from traveling too far: TTLs and administrative scoping. Because ICP queries may carry sensitive information (i.e., URIs that your users access), I recommend using an administratively scoped address and properly configured routers. See RFC 2365 for more information.

The ttl=n option is for multicast neighbors only. It is the multicast TTL value to use for ICP queries. It controls how far away the ICP queries can travel. The valid range is 0-128. A larger value allows the multicast queries to travel farther, and possibly be intercepted by outsiders. Use a lower number to keep the queries close to the source and within your network.

Multicast ICP clients must also tell Squid about the neighbors that will be responding to queries. Squid doesn't blindly trust any cache that happens to send an ICP reply. You must tell Squid about legitimate, trusted neighbors. The multicast-responder option to cache_peer identifies such neighbors. For example, if you know that 172.16.2.3 is a trusted neighbor on the multicast group, you should add this line to squid.conf:

    cache_peer 172.16.2.3 parent 3128 3130 multicast-responder

You can, of course, use a hostname instead of an IP address. ICP replies from foreign (unlisted) neighbors are ignored, but logged in cache.log.

Squid normally expects to receive an ICP reply for each query that it sends. This changes, however, with multicast because one query may result in multiple replies. To account for this, Squid periodically sends out "probes" on the multicast group address. These probes tell Squid how many servers are out there listening. Squid counts the number of replies that arrive within a certain amount of time. That amount of time is given by the mcast_icp_query_timeout directive. Then, when Squid sends a real ICP query to the group, it adds this count to the number of ICP replies to expect.
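Before putting a group address into mcast_groups or a multicast cache_peer line, it can help to check which range the address falls into. The following Python sketch uses the standard ipaddress module. The check_group name is hypothetical, and 239.0.0.0/8 is the IPv4 administratively scoped block defined by RFC 2365.

```python
import ipaddress

# IPv4 administratively scoped multicast block (RFC 2365).
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")


def check_group(addr):
    """Classify an IPv4 address for use as an ICP multicast group."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 4 or not ip.is_multicast:
        return "not-multicast"
    if ip in ADMIN_SCOPED:
        return "admin-scoped"
    return "other-multicast"
```

An address in the administratively scoped block is the safer choice for ICP, since properly configured routers keep its traffic inside your own network.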

10.6.3.3 Multicast ICP example

Since multicast ICP is tricky, here's another example. Let's say your ISP has three parent caches that listen on a multicast address for ICP queries. The ISP needs only one line in its configuration file:

    mcast_groups 224.0.14.255

The configuration for you (the child cache) is a little more complicated. First, you must list the multicast neighbor to which Squid should send queries. You must also list the three parent caches with their unicast addresses so that Squid accepts their replies:

    cache_peer 224.0.14.255 multicast 3128 3130 ttl=16
    cache_peer parent1.yourisp.net parent 3128 3130 multicast-responder
    cache_peer parent2.yourisp.net parent 3128 3130 multicast-responder
    cache_peer parent3.yourisp.net parent 3128 3130 multicast-responder
    mcast_icp_query_timeout 2 sec

Keep in mind that Squid never makes HTTP requests to multicast neighbors, and it never sends ICP queries to multicast-responder neighbors.
10.7 Cache Digests

One of the most common complaints about ICP is the additional delay added for each request. In many cases, Squid waits for all ICP replies to arrive before making a forwarding decision. Squid's Cache Digest feature offers similar functionality but without per-request network delays.

Cache Digests are based on a technique first published by Pei Cao, called Summary Cache. The fundamental idea is to use a Bloom filter to represent the cache contents. Neighboring caches download each other's Bloom filters, or digests in this terminology. Then, they can query the digest to determine whether or not a particular URI is in the neighbor's cache.

Compared to ICP, Cache Digests trade time for space. Whereas ICP queries incur time penalties (latency), digests incur space (memory, disk) penalties. In Squid, a neighbor's digest is stored entirely in memory. A typical digest requires about 625 KB of memory for every million objects.

The Bloom filter is an interesting data structure that provides lossy encoding of a collection of items. The filter itself is simply a large array of bits. Given a Bloom filter (and the parameters used to generate it), you can find, with some uncertainty, if a particular item is in the collection. In Squid, items are URIs, and the digest is sized at 5 bits per cached object. For example, you can represent the collection of 1,000,000 cached objects with a filter of 5,000,000 bits, or 625,000 bytes.

Due to their nature, Bloom filters aren't a perfect representation of the collection. They sometimes incorrectly indicate that a particular item is present in the collection because two or more items may turn on the same bit. In other words, the filter can indicate that object X is in the cache, even though X was never cached or requested. These false positives occur with a certain probability you can control by adjusting the parameters of the filter. For example, increasing the number of bits per object decreases the false positive probability. See my O'Reilly book, Web Caching, for many more details about Cache Digests.
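To make the Bloom filter idea concrete, here is a minimal Python sketch. It is not Squid's Cache Digest code: the class name BloomDigest, the choice of four hash functions, and the trick of slicing one MD5 value into four 32-bit integers are all assumptions made for illustration. Only the sizing rule (5 bits per cached object) comes from the text above.

```python
import hashlib


class BloomDigest:
    """Toy Bloom filter sized at a fixed number of bits per expected item."""

    def __init__(self, capacity, bits_per_entry=5, num_hashes=4):
        # 1,000,000 objects * 5 bits = 5,000,000 bits = 625,000 bytes.
        self.num_bits = capacity * bits_per_entry
        self.num_hashes = num_hashes  # assumption: at most 4 with this scheme
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, uri):
        # Slice one 16-byte MD5 value into 32-bit integers, one per hash function.
        digest = hashlib.md5(uri.encode("utf-8")).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, uri):
        for pos in self._positions(uri):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, uri):
        # True means "probably cached"; False means "definitely not cached".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(uri))


digest = BloomDigest(capacity=1_000_000)
digest.add("http://www.example.com/page1.html")
print(len(digest.bits), "http://www.example.com/page1.html" in digest)
```

A lookup never misses an item that was added, but it can report an item that was never added once enough bits are turned on. That one-sided error is the false positive described above.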
10.7.1 Configuring Squid for Cache Digests

First of all, you must compile Squid with the Cache Digest code enabled. Simply add the --enable-cache-digests option when running ./configure. Taking this step causes two things to happen when you run Squid:

● Your Squid cache generates a digest of its own contents. Your neighbors may request this digest if they are also configured to use Cache Digests.
● Your Squid requests a Cache Digest from each of its neighbors.
If you don't want to request digests for a particular neighbor, use the no-digest option on the cache_peer line. For example:

    cache_peer neighbor.host.name parent 3128 3130 no-digest

Squid stores its own digest under the following URL: http://my.host.name:port/squid-internal-periodic/store_digest. When Squid requests a neighbor's digest, it simply requests http://neighbor.host.name:port/squid-internal-periodic/store_digest. Obviously, this naming scheme is specific to Squid. If you have a non-Squid neighbor that supports Cache Digests, you may need to tell your Squid that the neighbor's digest has a different address. The digest-url=url option to cache_peer allows you to configure the URL for the neighbor's Cache Digest. For example:

    cache_peer neighbor.host.name parent 3128 3130 digest-url=http://blah/digest

squid.conf has a number of directives that control the way in which Squid generates its own Cache Digest. First, the digest_generation directive controls whether or not Squid generates a digest of its cache. You might want to disable digest generation if your cache is a child to a parent, but not a parent or sibling to any other caches. The remaining directives control low-level details of digest generation. You should change them only if you fully understand the Cache Digest implementation.

The digest_bits_per_entry directive determines the size of the digest. The default value is 5. Increasing the value results in larger digests (consuming more memory and bandwidth) and lower false-hit probabilities. A lower setting results in smaller digests and more false hits. I feel that the default setting is a very nice tradeoff. A setting of 3 or lower has too many false hits to be useful, and a setting of 8 or higher simply wastes bandwidth.

Squid uses a two-step process to create a cache digest. First, it builds the cache digest data structure. This is basically a large Bloom filter and a small header that contains the digest parameters. Once the data structure is filled, Squid creates a cached HTTP response for the digest. This simply involves prepending some HTTP headers and storing the response on disk with the other cached responses.

A Cache Digest represents a snapshot in time of the cache's contents.
The digest_rebuild_period directive controls how frequently Squid rebuilds the digest data structure (but not the HTTP response). The default is once per hour. More frequent rebuilds mean Squid's digest is more up to date, at the expense of higher CPU utilization. The rebuild procedure is relatively CPU-intensive. Your users may experience a slowdown while Squid rebuilds its digest.

The digest_rebuild_chunk_percentage directive controls how much of the cache to add to the digest each time the rebuild procedure is called. The default is 10%. During each invocation of the rebuild function, Squid adds some percentage of the cache to the digest. Squid doesn't process user requests while this function runs. After adding the specified percentage, the function reschedules itself and then exits so that Squid can process normal HTTP requests. After processing pending requests, Squid returns to the rebuild function and adds another chunk of the cache to the digest. Decreasing this value should give better response time to your users, while increasing the total time needed to rebuild the digest.

The digest_rewrite_period directive controls how often Squid creates an HTTP response from the digest data structure. In most cases, this should match the digest_rebuild_period value. The default is one hour. The rewrite procedure consists of numerous calls to a function that simply appends some amount of the digest data structure to the cache entry (as though Squid were reading an origin server response from the network). Each time this function is called, Squid appends digest_swapout_chunk_size bytes of the digest.
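The chunked rebuild can be pictured with a short Python sketch. This is only an illustration of the "do a slice of work, then yield control" pattern that digest_rebuild_chunk_percentage describes. The function name rebuild_in_chunks and its arguments are hypothetical, and a generator stands in for Squid's event scheduling.

```python
def rebuild_in_chunks(cache_keys, add_to_digest, chunk_percentage=10):
    """Add cache_keys to a digest a slice at a time.

    Yields after every slice so the caller can service pending requests
    before asking for the next one. Each yielded value is the number of
    keys processed so far.
    """
    # Assumption: at least one key per slice, so tiny caches still progress.
    chunk_size = max(1, len(cache_keys) * chunk_percentage // 100)
    for start in range(0, len(cache_keys), chunk_size):
        for key in cache_keys[start:start + chunk_size]:
            add_to_digest(key)
        yield min(start + chunk_size, len(cache_keys))


keys = ["http://www.example.com/page%d.html" % n for n in range(25)]
digest_members = []
for done in rebuild_in_chunks(keys, digest_members.append, chunk_percentage=10):
    pass  # a real server would handle client requests here
print(len(digest_members))
```

A smaller percentage means more, shorter pauses for users and a longer total rebuild, which is the tradeoff the directive controls.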
10.8 Hypertext Caching Protocol

HTCP and ICP have many common characteristics, although HTCP is broader in scope and generally more complex. Both use UDP for transport, and both are per-request protocols. However, HTCP addresses a number of problems with ICP, namely:

● An ICP query contains only a URI, without even a request method. HTCP queries contain full HTTP request headers.
● ICP provides no security. HTCP has optional message authentication via shared secret keys, although it isn't yet implemented in Squid. Neither protocol supports encrypted messages.
● ICP uses a simple, fixed-sized binary message format that is difficult to extend. HTCP uses a complex, variable-sized binary message format.
HTCP supports four basic opcodes:

TST
    Tests for the presence of a cached response

SET
    Tells a neighbor to update cached object headers

CLR
    Tells a neighbor to remove an object from its cache

MON
    Monitors a neighbor cache's activity

In Squid, only the TST opcode is currently implemented. This book won't cover the others.

The primary advantage of using HTCP over ICP is fewer false hits. HTCP has fewer false hits because the query messages include full HTTP request headers, including any Cache-Control requirements from the client. The primary disadvantages are that HTCP queries are larger, and they require additional CPU processing to generate and parse. Measurements indicate that HTCP queries are about six times larger than ICP queries, due to the presence of HTTP request headers. However, Squid's HTCP replies are typically smaller than ICP replies.

HTCP is documented as an experimental protocol in RFC 2756. For more information about the message format, see the RFC at http://www.htcp.org or my O'Reilly book, Web Caching.
10.8.1 Configuring Squid for HTCP

To use HTCP, you must configure Squid with the --enable-htcp option. With this option enabled, Squid becomes an HTCP server by default. The htcp_port directive specifies the HTCP port number, which defaults to 4827. Setting the port to 0 disables the HTCP server mode.

To become an HTCP client, you need to add the htcp option to a cache_peer line. When you add this option, Squid always sends HTCP messages, instead of ICP, to the neighbor. You can't use both HTCP and ICP with a single neighbor. The ICP port number field actually becomes an HTCP port number, so you need to change that as well. For example, let's say you want to convert an ICP neighbor to HTCP. Here's the neighbor configured for ICP:

    cache_peer neighbor.host.name parent 3128 3130

To switch over to HTCP, the line becomes:

    cache_peer neighbor.host.name parent 3128 4827 htcp

Sometimes people forget to change the port number, and they end up sending HTCP messages to the ICP port. When this happens, Squid writes warnings to cache.log:

    2003/09/29 02:28:55| WARNING: Unused ICP version 23 received from 64.216.111.20:4827

Squid doesn't currently log HTCP queries as it does for ICP queries. HTCP queries aren't tracked in the client_list page either. However, when you enable HTCP for a peer, the cache manager server_list page (see Section 14.2.1.50) shows the count and percentage of HTCP replies that were hits and misses:

    Histogram of PINGS ACKED:
            Misses    5085  98%
            Hits        92   2%

Note that none of the current Squid versions support HTCP authentication yet.
10.9 Cache Array Routing Protocol

CARP is an algorithm that partitions URI-space among a group of caching proxies. In other words, each URI is assigned to one of the caches. CARP maximizes hit ratios and minimizes duplication of objects among the group of caches. The protocol consists of two major components: a Routing Function and a Proxy Array Membership Table. Unlike ICP, HTCP, and Cache Digests, CARP can't predict whether a particular request will be a cache hit. Thus, you can't use CARP with siblings, only with parents.

The basic idea behind CARP is that you have a group, or array, of parent caches to handle all the load from users or child caches. A cache array is one way to handle ever-increasing loads. You can add more array members whenever you need more capacity. CARP is a deterministic algorithm. That is, the same request always goes to the same array member (as long as the array size doesn't change). Unlike ICP and HTCP, CARP doesn't use query messages.

Another interesting thing about CARP is that you have the choice to deploy it in a number of different places. For example, one approach is to make all user-agents execute the CARP algorithm. You could probably accomplish this with a Proxy Auto-Configuration (PAC) function, written in JavaScript (see Appendix F). However, you're likely to have certain web agents on your network that don't implement or support PAC files. Another option is to use a two-level cache hierarchy. The lower level (child caches) accept requests from all user-agents, and they execute the CARP algorithm to select the parent cache for each request. However, unless your network is very large, many caches can be more of a burden than a benefit. Finally, you can also implement CARP within the array itself. That is, user-agents connect to a random member of the cache array, but each member forwards cache misses to another member of the array based on the CARP algorithm.

CARP was designed to be better than a simple hashing algorithm, which typically works by applying a hash function, such as MD5, to URIs. The algorithm then calculates the modulus for the number of array members. It might be as simple as this pseudocode:

    N = MD5(URI) % num_caches;
    next_hop = Caches[N];

This technique uniformly spreads the URIs among all the caches. It also provides a consistent mapping (maximizing cache hits), as long as the number of caches remains constant. When caches are added or removed, however, this algorithm changes the mapping for most of the URIs.

CARP's Routing Function improves on this technique in two ways. First, it allows for unequal sharing of the load. For example, you can configure one parent to receive twice as many requests as another. Second, adding or removing array members minimizes the fraction of URIs that get reassigned.

The downside to CARP is that it is relatively CPU-intensive. For each request, Squid calculates a "score" for each parent. The request is sent to the parent cache with the highest score. The complexity of the algorithm is proportional to the number of parents. In other words, CPU load increases in proportion to the number of CARP parents. However, the calculations in CARP have been designed to be faster than, say, MD5, and other cryptographic hash functions.
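The difference between modulus hashing and a highest-score scheme is easy to see in a small experiment. The Python sketch below is not the CARP Routing Function: it scores each (parent, URI) pair with MD5 purely for convenience, ignores load factors, and uses made-up parent names. It only shows why "pick the highest score" disturbs fewer URIs than "hash modulo N" when the array grows from three members to four.

```python
import hashlib


def md5_int(text):
    """Return the MD5 hash of text as a large integer."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


def modulus_choice(uri, caches):
    # The simple technique: hash the URI, then take the modulus.
    return caches[md5_int(uri) % len(caches)]


def highest_score_choice(uri, caches):
    # Score every (cache, URI) pair and pick the cache with the highest score.
    return max(caches, key=lambda cache: md5_int(cache + "|" + uri))


def fraction_remapped(choose, uris, before, after):
    """Fraction of URIs whose chosen cache differs between two arrays."""
    moved = sum(1 for uri in uris if choose(uri, before) != choose(uri, after))
    return moved / len(uris)


uris = ["http://www.example.com/page%d.html" % n for n in range(2000)]
three = ["parent1", "parent2", "parent3"]
four = three + ["parent4"]
print("modulus:      ", fraction_remapped(modulus_choice, uris, three, four))
print("highest score:", fraction_remapped(highest_score_choice, uris, three, four))
```

With the modulus approach, roughly three quarters of the URIs should land on a different cache after the fourth member is added. With the highest-score approach, only the URIs that the new member wins (about one quarter) move, and they all move to the new member.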
In addition to the load-sharing algorithm, CARP also has a protocol component. The Membership Table has a well-defined structure and syntax so that all clients of a single array can have the same configuration. If some clients are configured differently, CARP becomes less useful because not all clients send the same request to the same parent. Note that Squid doesn't currently implement the Membership Table feature.

Squid's CARP implementation is lacking in another way. The protocol says that if a request can't be forwarded to the highest-scoring parent cache, it should be sent to the second-highest scoring member. If that also fails, the application should give up. Squid currently uses only the highest-scoring parent cache.

CARP was originally documented as an Internet Draft in 1998, which is now expired. It was developed by Vinod Valloppillil of Microsoft and Keith W. Ross of the University of Pennsylvania. With a little searching, you can still find the old document out there on the Internet. You may even be able to find some documentation on the Microsoft sites. You can also find more information on CARP in my O'Reilly book Web Caching.
10.9.1 Configuring Squid for CARP

To use CARP in Squid, you must first run the ./configure script with the --enable-carp option. Next, you must add carp-load-factor options to the cache_peer lines for parents that are members of the array. The following is an example:

    cache_peer neighbor1.host.name parent 3128 0 carp-load-factor=0.3
    cache_peer neighbor2.host.name parent 3128 0 carp-load-factor=0.3
    cache_peer neighbor3.host.name parent 3128 0 carp-load-factor=0.4

Note that all carp-load-factor values must add up to 1.0. Squid checks for this condition and complains if it finds a discrepancy. Additionally, the cache_peer lines must be listed in order of increasing load factor values. Only recent versions of Squid check that this condition is true.

Remember that CARP is treated somewhat specially with regard to a neighbor's alive/dead state. Squid normally declares a neighbor dead (and ceases sending requests to it) after 10 failed connections. In the case of CARP, however, Squid skips a parent that has one or more failed connections. Once Squid is working with CARP, you can monitor it with the carp cache manager page. See Section 14.2.1.49 for more information.
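If you generate cache_peer lines from a script, it may help to check the two carp-load-factor rules before handing the file to Squid. The following Python sketch does that. The function name, the messages, and the rounding tolerance are assumptions for illustration and don't reflect how Squid itself performs the check.

```python
def check_carp_load_factors(factors, tolerance=0.001):
    """Return a list of problems with carp-load-factor values.

    factors must be in the same order as the cache_peer lines.
    """
    problems = []
    total = sum(factors)
    # Floating-point sums rarely equal 1.0 exactly, so allow a small tolerance.
    if abs(total - 1.0) > tolerance:
        problems.append("load factors sum to %.3f, not 1.0" % total)
    # Equal neighbors are fine (0.3, 0.3, 0.4); a decrease is not.
    if any(a > b for a, b in zip(factors, factors[1:])):
        problems.append("load factors are not listed in increasing order")
    return problems


print(check_carp_load_factors([0.3, 0.3, 0.4]))
print(check_carp_load_factors([0.4, 0.3, 0.3]))
```

An empty list means both conditions hold for the given ordering.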
10.10 Putting It All Together

As you probably realize by now, Squid has many different ways to decide how and where requests are forwarded. In many cases, you can employ more than one protocol or technique at a time. Just by looking at the configuration file, however, you'd probably have a hard time figuring out how Squid uses the different techniques in combination. In this section I'll explain how Squid actually makes the forwarding decision.

Obviously, it all starts with a cache miss. Any request that is satisfied as an unvalidated cache hit doesn't go through the following sequence of events.

The goal of the selection procedure is to create a list of appropriate next-hop locations. A next-hop location may be a neighbor cache or the origin server. Depending on your configuration, Squid may select up to three possible next-hops. If the request can't be satisfied by the first, Squid tries the second, and so on.
10.10.1 Step 1: Determine Direct Options

The first step is to determine if the request may, must, or must not be sent directly to the origin server. Squid evaluates the never_direct and always_direct access rule lists for the request. The goal is to set a flag to one of three values: DIRECT_YES, DIRECT_MAYBE, or DIRECT_NO. This flag later determines whether Squid should, or should not, try to select a neighbor cache for the request. Squid checks the following conditions in order. If any condition is true, it sets the direct flag and proceeds to the next step. If you're following along in the source code, this step corresponds to the beginning of the peerSelectFoo() function:

1. Squid looks at the always_direct list first. If the request matches this list, the direct flag is set to DIRECT_YES.
2. Squid looks at the never_direct list next. If the request matches this list, the direct flag is set to DIRECT_NO.
3. Squid has a special check for requests that appear to be looping. When Squid detects a forwarding loop, it sets the direct flag to DIRECT_YES to break the loop.
4. Squid checks the minimum_direct_hops and minimum_direct_rtt settings, but only if you've enabled netdb. If the measured hop count or round-trip time is lower than the configured values, Squid sets the direct flag to DIRECT_YES.
5. If none of the previous conditions are true, Squid sets the direct flag to DIRECT_MAYBE.

If the direct flag is set to DIRECT_YES, the selection process is complete. Squid forwards the request directly to the origin server and skips the remaining steps in this section.
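The five checks in Step 1 can be summarized as a short decision function. This Python sketch mirrors the order given above and is not a translation of peerSelectFoo(). The function and parameter names are hypothetical, and the threshold defaults are placeholders that stand in for whatever you set with minimum_direct_hops and minimum_direct_rtt.

```python
from enum import Enum


class Direct(Enum):
    YES = "DIRECT_YES"
    MAYBE = "DIRECT_MAYBE"
    NO = "DIRECT_NO"


def determine_direct(always_direct, never_direct, looping,
                     netdb_enabled=False, hops=None, rtt_msec=None,
                     minimum_direct_hops=4, minimum_direct_rtt=400):
    """Return the direct flag for one request, checking conditions in order."""
    if always_direct:          # 1. request matches the always_direct list
        return Direct.YES
    if never_direct:           # 2. request matches the never_direct list
        return Direct.NO
    if looping:                # 3. go direct to break a forwarding loop
        return Direct.YES
    if netdb_enabled:          # 4. netdb measurements, if available
        if hops is not None and hops < minimum_direct_hops:
            return Direct.YES
        if rtt_msec is not None and rtt_msec < minimum_direct_rtt:
            return Direct.YES
    return Direct.MAYBE        # 5. nothing decisive


print(determine_direct(always_direct=False, never_direct=True, looping=False))
```

Note that the order matters: a request that matches both lists ends up as DIRECT_YES because always_direct is checked first.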
10.10.2 Step 2: Neighbor Selection Protocols

Here Squid uses one of the hierarchical protocols to select a neighbor cache. As before, once Squid selects a neighbor in this step, it exits the routine and proceeds to Step 3. This step roughly corresponds to the peerGetSomeNeighbor() function:

1. Squid examines its neighbors' Cache Digests. If a digest indicates a hit, that neighbor is placed on the next-hop list.
2. Squid tries CARP if enabled. CARP always succeeds (i.e., selects a parent), unless the cache_peer_access or cache_peer_domain rules forbid communication with any of the parent caches for a particular request.
3. Squid checks netdb measurements (if enabled) for a "closest parent." If Squid knows that the round-trip time from one or more parents to the origin server is less than its own RTT to the origin server, Squid selects the parent with the least RTT. For this to happen, the following conditions must be met:
    ❍ Both your Squid and the parent cache(s) must have enabled netdb measurements.
    ❍ query_icmp must be enabled in your configuration file.
    ❍ The origin server must respond to ICMP pings.
    ❍ The parent(s) must have previously measured the RTT to the origin server and returned those measurements in ICP/HTCP replies, or through a netdb exchange.
4. Squid sends ICP/HTCP queries as the last resort. Squid loops through all neighbors and checks a number of conditions. Squid doesn't query a neighbor if:
    ❍ The direct flag is DIRECT_MAYBE and the request is nonhierarchical (see Section 10.4.5). Because Squid is allowed to go directly to the origin server, it doesn't bother the neighbor with this request, which is likely to be uncachable.
    ❍ The direct flag is DIRECT_NO, the neighbor is a sibling, and the request is nonhierarchical. Because Squid is forced to use a neighbor, it only queries parents, which can always handle a cache miss.
    ❍ The cache_peer_access or cache_peer_domain rules forbid sending this request to the neighbor.
    ❍ The neighbor's no-query flag is set, or its ICP/HTCP port number is zero.
    ❍ The neighbor is a multicast responder.
5. Squid counts how many queries it sends and calculates how many replies to expect. If it expects at least one reply, the rest of the next-hop selection procedure is postponed until the replies arrive, or a timeout occurs. Squid expects to receive replies from neighbors that are alive, but not neighbors that are dead (see Section 10.3.2).
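The conditions under which Squid skips a neighbor in item 4 can be written as one predicate. The Python sketch below follows the list above and is not taken from the Squid source. The Neighbor class and its field names are hypothetical, and access_allowed stands for the combined result of the cache_peer_access and cache_peer_domain rules.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    name: str
    is_sibling: bool = False
    no_query: bool = False
    query_port: int = 3130          # ICP/HTCP port from the cache_peer line
    multicast_responder: bool = False
    access_allowed: bool = True     # cache_peer_access / cache_peer_domain result


def should_query(neighbor, direct_flag, hierarchical):
    """Return True if an ICP/HTCP query should be sent to this neighbor."""
    if direct_flag == "DIRECT_MAYBE" and not hierarchical:
        return False    # Squid may go direct; don't bother the neighbor
    if direct_flag == "DIRECT_NO" and neighbor.is_sibling and not hierarchical:
        return False    # forced to use a neighbor, so only parents are useful
    if not neighbor.access_allowed:
        return False
    if neighbor.no_query or neighbor.query_port == 0:
        return False
    if neighbor.multicast_responder:
        return False
    return True


peers = [
    Neighbor("parent1"),
    Neighbor("sibling1", is_sibling=True),
    Neighbor("parent2", no_query=True),
]
print([p.name for p in peers if should_query(p, "DIRECT_NO", hierarchical=False)])
```

The number of neighbors that pass this test (and are alive) is what determines how many replies Squid expects in item 5.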
10.10.3 Step 2a: ICP/HTCP Reply Processing

If Squid sends out any ICP or HTCP queries, it waits for some number of replies. Just after transmitting the queries, Squid knows how many replies to expect and the maximum amount of time to wait for them. Squid expects a reply from every alive neighbor queried. If you're using multicast, Squid adds the current group size estimate to the expected reply count. While waiting for replies, Squid schedules a timeout, in case one or more of the replies don't arrive.

When Squid receives an ICP/HTCP reply from a neighbor, it takes the following actions:

1. If the reply is a hit, Squid forwards the request to that neighbor immediately. Any replies arriving after this point are ignored.
2. If the reply is a miss, and it is from a sibling, it is ignored.
3. Squid doesn't immediately act on ICP/HTCP misses from parents. Instead, it remembers which parents meet the following criteria:

    The closest-parent miss
        If the reply includes a netdb RTT measurement, Squid remembers the parent that has the least RTT to the origin server.

    The first-parent miss
        Squid remembers the parent that had the first reply. In other words, the parent with least RTT to your cache. Two cache_peer options affect this part of the algorithm: weight=N and closest-only. The weight=N option makes a parent closer than it really is. When calculating RTTs, Squid divides the actual RTT by this artificial weight. Thus you can give higher preference to certain parents by increasing their weight value. The closest-only option disables the first-parent miss feature for a neighbor cache. In other words, Squid selects a parent (based on ICP/HTCP miss replies) only if that parent is the closest to the origin server.

4. If Squid receives the expected number of replies (all misses), or if the timeout occurs, it selects the closest-parent miss neighbor if set. Otherwise, it selects the first-parent miss neighbor if set.

Squid may not receive any ICP/HTCP replies from parent caches, either because they weren't queried or because the network dropped some packets. In this case, Squid relies on the secondary parent (or direct) selection algorithm described in the next section.

If the ICP/HTCP query timeout occurs before receiving the expected number of replies, Squid prepends the string TIMEOUT_ to the result code in access.log.
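Here is how the reply-handling rules might look as code. This Python sketch processes a list of replies in arrival order and returns the selected neighbor. It is a simplified model, not Squid's implementation: the Reply class, its field names, and the returned labels are hypothetical, timeouts are not modeled, and applying weight only to the first-parent RTT is one reading of the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    peer: str
    is_parent: bool
    hit: bool
    rtt_to_peer: float                  # RTT between this cache and the peer
    origin_rtt: Optional[float] = None  # netdb RTT from peer to origin, if reported
    weight: int = 1                     # cache_peer weight=N
    closest_only: bool = False          # cache_peer closest-only


def select_from_replies(replies):
    """Return (peer, reason) for a list of replies given in arrival order."""
    closest = None  # (origin RTT, peer) for the closest-parent miss
    first = None    # (weighted RTT, peer) for the first-parent miss
    for reply in replies:
        if reply.hit:
            return reply.peer, "hit"    # later replies are ignored
        if not reply.is_parent:
            continue                    # sibling misses are ignored
        if reply.origin_rtt is not None and (
                closest is None or reply.origin_rtt < closest[0]):
            closest = (reply.origin_rtt, reply.peer)
        if not reply.closest_only:
            weighted = reply.rtt_to_peer / reply.weight
            if first is None or weighted < first[0]:
                first = (weighted, reply.peer)
    if closest is not None:
        return closest[1], "closest-parent miss"
    if first is not None:
        return first[1], "first-parent miss"
    return None, None                   # fall through to Step 3


print(select_from_replies([
    Reply("sibling1", is_parent=False, hit=False, rtt_to_peer=5.0),
    Reply("parent1", is_parent=True, hit=False, rtt_to_peer=20.0),
    Reply("parent2", is_parent=True, hit=False, rtt_to_peer=30.0, weight=2),
]))
```

In the usage example, parent2 answers later than parent1 but wins the first-parent miss because its weight of 2 halves its effective RTT.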
10.10.4 Step 3: Secondary Parent Selection

This step is a little tricky. Remember that if the direct flag is DIRECT_YES, Squid never executes this step. If the flag is DIRECT_NO, Squid calls the getSomeParent() function (described subsequently) to select a backup parent, in case Step 2 failed to select one. Following that, Squid adds to the list all parents it believes are alive. Thus, it tries all possible parent caches before returning an error message to the user.

In the case of DIRECT_MAYBE, Squid adds both a parent cache, and the origin server. The order, however, depends on the prefer_direct setting. If prefer_direct is enabled, Squid inserts the origin server into the list first. Next, Squid calls getSomeParent() if the request is hierarchical or if the nonhierarchical_direct directive is disabled. Finally, Squid adds the origin server last if prefer_direct is disabled.

The getSomeParent() function selects one of the parents based on the following criteria. In each case, the parent must be alive and allowed to handle the request according to the cache_peer_access and cache_peer_domain rules:

● The first parent with the default cache_peer option
● The parent with the round-robin cache_peer option that has the lowest request count
● The first parent that is known to be alive
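The three getSomeParent() criteria translate into a short function. The Python sketch below is a model of the selection order described above rather than Squid's code. The Parent class and its field names are hypothetical, and allowed stands for the outcome of the cache_peer_access and cache_peer_domain rules for the current request.

```python
from dataclasses import dataclass


@dataclass
class Parent:
    name: str
    alive: bool = True
    allowed: bool = True        # cache_peer_access / cache_peer_domain result
    default: bool = False       # cache_peer "default" option
    round_robin: bool = False   # cache_peer "round-robin" option
    request_count: int = 0


def get_some_parent(parents):
    """Pick a backup parent, or return None if no parent is usable."""
    usable = [p for p in parents if p.alive and p.allowed]
    # 1. The first parent with the default option.
    for parent in usable:
        if parent.default:
            return parent
    # 2. The round-robin parent with the lowest request count.
    round_robin = [p for p in usable if p.round_robin]
    if round_robin:
        return min(round_robin, key=lambda p: p.request_count)
    # 3. The first parent that is known to be alive.
    return usable[0] if usable else None


parents = [
    Parent("parent1", alive=False, default=True),
    Parent("parent2", round_robin=True, request_count=12),
    Parent("parent3", round_robin=True, request_count=7),
]
print(get_some_parent(parents).name)
```

Because parent1 is dead, its default option is ignored and the round-robin parent with the fewest requests is chosen.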
10.10.5 Retrying

Occasionally, Squid's attempt to forward a request to an origin server or neighbor may fail for one reason or another. This is why Squid creates a list of appropriate next-hop locations during the neighbor selection procedure. When one of the following types of errors occurs, Squid can retry the request at the next server in the list:

● Network congestion or other errors can cause a "connection timeout."
● The origin server or neighbor cache may be temporarily unavailable, causing a "connection refused" error.
● A sibling may return a 504 (Gateway Timeout) error if the request would cause a cache miss.
● A neighbor may return an "access denied" error message if the two caches have a mismatch in access control policies.
● A read error may occur on an established connection before Squid reads the HTTP message body.
● There may be race conditions with persistent connections.

Squid's algorithm for retrying failed requests is relatively aggressive. It is better for Squid to keep trying (causing some extra delay), rather than return an error to the user.
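The retry behavior amounts to walking the next-hop list until one attempt succeeds. The following Python sketch shows only that idea. The function names are hypothetical, Python's OSError family stands in for the connection timeout and connection refused cases, and HTTP-level failures such as a 504 from a sibling are not modeled.

```python
def forward_with_retries(next_hops, try_forward):
    """Try each next hop in order; return (hop, response) for the first success."""
    errors = []
    for hop in next_hops:
        try:
            return hop, try_forward(hop)
        except OSError as exc:   # includes ConnectionRefusedError, TimeoutError
            errors.append((hop, str(exc)))
    raise OSError("all next hops failed: %r" % errors)


def fake_forward(hop):
    # Stand-in for a real connection attempt: only the origin server answers.
    if hop != "DIRECT":
        raise ConnectionRefusedError("connection refused by " + hop)
    return "200 OK"


print(forward_with_retries(["parent1", "parent2", "DIRECT"], fake_forward))
```

The user sees an error only after every entry in the list has failed, at the cost of some extra delay.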
10.11 How Do I...

New Squid users often ask the same, or similar, questions about getting Squid to forward requests in the right way. Here I'll show you how to configure Squid for some common scenarios.

10.11.1 Send All Requests Through Another Proxy?

You simply need to define a parent and tell Squid it isn't allowed to connect directly to origin servers. For example:

    cache_peer parent.host.name parent 3128 0
    acl All src 0/0
    never_direct allow All

The drawback to this configuration is that Squid can't forward cache misses if the parent goes down. If that happens, your users receive the "cannot forward" error message.

10.11.2 Send All Requests Through Another Proxy Unless It's Down?

Try this configuration:

    nonhierarchical_direct off
    prefer_direct off
    cache_peer parent.host.name parent 3128 0 default no-query

Or, if you'd like to use ICP with the other proxy:

    nonhierarchical_direct off
    prefer_direct off
    cache_peer parent.host.name parent 3128 3130 default

With this configuration, Squid forwards all cache misses to the parent as long as it is alive. Using ICP should cause Squid to detect a dead parent quickly, but at the same time may incorrectly declare the parent dead on occasion.

10.11.3 Make Sure Squid Doesn't Use Neighbors for Some Requests?

Define an ACL to match the special request:

    cache_peer parent.host.name parent 3128 0
    acl Special dstdomain special.server.name
    always_direct allow Special

In this case, cache misses for requests in the special.server.name domain are always sent to the origin server. Other requests may, or may not, go through the parent cache.

10.11.4 Send Some Requests Through a Parent to Bypass Local Filters?

Some ISPs (and other organizations) have upstream providers that force HTTP traffic through a filtering proxy (perhaps with HTTP interception). You might be able to get around their filters if you can use a different proxy beyond their network. Here's how you can send only special requests to the far-away proxy:

    cache_peer far-away-parent.host.name parent 3128 0
    acl BlockedSites dstdomain www.censored.com
    cache_peer_access far-away-parent.host.name allow BlockedSites
    never_direct allow BlockedSites
10.12 Exercises

● Toggle your prefer_direct and/or nonhierarchical_direct settings and look for any changes in the access.log.
● Enable netdb and view the netdb cache manager page after Squid has been running for a while.
● If using ICP or HTCP, count the percentage of requests that experienced a timeout waiting for replies to arrive.
● If you used --enable-cache-digests and have a reasonably full cache, disable the digest_generation directive and note any change in memory usage.
● Use your operating system's packet filters to block ICP or HTCP messages to your neighbors. How quickly does Squid change their state from alive to dead, and back again?
Chapter 11. Redirectors

A redirector is an external process that rewrites URIs from client requests. For example, although a user requests the page http://www.example.com/page1.html, a redirector can change the request to something else, such as http://www.example.com/page2.html. Squid fetches the new URI automatically, as though the client originally requested it. If the response is cachable, Squid stores it under the new URI.

The redirector feature allows you to implement a number of interesting things with Squid. Many sites use them for access controls, removing advertisements, local mirrors, or even working around browser bugs.

One of the nice things about using a redirector for access control is that you can send the user to a page that explains exactly why her request is denied. You may also find that a redirector offers more flexibility than Squid's built-in access controls. As you'll see shortly, however, a redirector doesn't have access to the full spectrum of information contained in a client's request.

Many people use a redirector to filter out web page advertisements. In most cases, this involves changing a request for a GIF or JPEG advertisement image into a request for a small, blank image, located on a local server. Thus, the advertisement just "disappears" and doesn't interfere with the page layout.

So in essence, a redirector is really just a program that reads a URI and other information from its input and writes a new URI on its output. Perl and Python are popular languages for redirectors, although some authors use compiled languages such as C for better performance.

The Squid source code doesn't come with any redirector programs. As an administrator, you are responsible for writing your own or downloading one written by someone else. The first part of this chapter describes the interface between Squid and a redirector process. I also provide a couple of simple redirector examples in Perl. If you're interested in using someone else's redirector, rather than programming your own, skip ahead to Section 11.3.
11.1 The Redirector Interface

A redirector receives data from Squid on stdin one line at a time. Each line contains the following four tokens separated by whitespace:

● Request-URI
● Client IP address and fully qualified domain name
● User's name, via either RFC 1413 ident or proxy authentication
● HTTP request method

For example:

    http://www.example.com/page1.html 192.168.2.3/user.host.name jabroni GET

The Request-URI is taken from the client's request, including query terms, if any. Fragment identifier components (e.g., the # character and subsequent text) are removed, however.

The second token contains the client IP address and, optionally, its fully qualified domain name (FQDN). The FQDN is set only if you enable the log_fqdn directive or use a srcdomain ACL element. Even then, the FQDN may be unknown because the client's network administrators didn't properly set up the reverse pointer zones in their DNS. If Squid doesn't know the client's FQDN, it places a hyphen (-) in the field. For example:

    http://www.example.com/page1.html 192.168.2.3/- jabroni GET

The client ident field is set if Squid knows the name of the user behind the request. This happens if you use proxy authentication, ident ACL elements, or enable ident_lookup_access. Remember, however, that the ident_lookup_access directive doesn't cause Squid to delay request processing. In other words, if you enable that directive, but don't use the access controls, Squid may not yet know the username when writing to the redirector process. If Squid doesn't know the username, it writes a hyphen (-) in the field. For example:

    http://www.example.com/page1.html 192.168.2.3/- - GET

Squid reads back one token from the redirector process: a URI. If Squid reads a blank line, the original URI remains unchanged.

A redirector program should never exit until end-of-file occurs on stdin. If the process does exit prematurely, Squid writes a warning to cache.log:

    WARNING: redirector #2 (FD 18) exited

If 50% of the redirector processes exit prematurely, Squid aborts with a fatal error message.
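To show the shape of the interface, here is a minimal redirector sketch in Python. It assumes the four-token input format described above and a Request-URI without embedded whitespace. The rewriting policy (page1.html becomes page2.html) and the function names are made up for illustration. The main() function is defined but not called here, so call it yourself when you install the script as a redirector.

```python
import sys


def parse_request(line):
    """Split one line from Squid into its four tokens."""
    uri, client, ident, method = line.split()
    ip, _, fqdn = client.partition("/")
    return {
        "uri": uri,
        "ip": ip,
        "fqdn": None if fqdn in ("", "-") else fqdn,
        "ident": None if ident == "-" else ident,
        "method": method,
    }


def rewrite(request):
    """Hypothetical policy: send page1.html to page2.html, leave the rest alone."""
    if request["uri"] == "http://www.example.com/page1.html":
        return "http://www.example.com/page2.html"
    return ""  # a blank line tells Squid to keep the original URI


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Run until end-of-file on stdin; never exit early.
    for line in stdin:
        stdout.write(rewrite(parse_request(line)) + "\n")
        stdout.flush()  # Squid waits for each answer, so don't buffer output


example = "http://www.example.com/page1.html 192.168.2.3/- jabroni GET\n"
print(rewrite(parse_request(example)))
```

The explicit flush after every line matters: if the output sits in a buffer, Squid never sees the answer and the request stalls.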
11.1.1 Handling URIs That Contain Whitespace
If the Request-URI contains whitespace, and the uri_whitespace directive is set to allow, any whitespace in the URI is passed to the redirector. A redirector with a simple parser may become confused in this case. You have two options for handling whitespace in URIs when using a redirector.

One option is to set the uri_whitespace directive to anything except allow. The default setting, strip, is probably a good choice in most situations because Squid simply removes the whitespace from the URI when it parses the HTTP request. See Appendix A for information on the other values for this directive.

If that isn't an option, you need to make sure the redirector's parser is smart enough to detect the extra tokens. For example, if it finds more than four tokens in the line received from Squid, it can assume that the last three are the IP address, ident, and request method. Everything before the third-to-last token comprises the Request-URI.
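The token-counting approach described above can be sketched as follows. This is only an illustration in Python rather than Perl; the function name parse_redirector_line is hypothetical, and the sketch assumes that runs of whitespace inside the URI may be collapsed to a single space.

```python
def parse_redirector_line(line):
    """Split one line from Squid into (uri, client_ip, fqdn, ident, method).

    Returns None when fewer than four tokens are present.
    """
    tokens = line.split()
    if len(tokens) < 4:
        return None
    # The last three tokens are always client/fqdn, ident, and method.
    client, ident, method = tokens[-3:]
    # Everything before them is the Request-URI, possibly containing spaces.
    uri = " ".join(tokens[:-3])
    ip, _, fqdn = client.partition("/")
    return uri, ip, fqdn, ident, method
```

A line with an embedded space, such as "http://www.example.com/a b.html 192.168.2.3/- - GET", still yields the client, ident, and method from the end of the line, with the remainder treated as the URI.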
11.1.2 Generating HTTP Redirect Messages

When a redirector changes the client's URI, the client normally doesn't know that Squid decided to fetch a different resource. This is, in all likelihood, a gross violation of the HTTP RFC. If you want to be nicer, and remain compliant, there is a little trick that makes Squid return an HTTP redirect message. Simply have the redirector insert 301:, 302:, 303:, or 307: before the new URI. For example, if a redirector writes this line on its stdout:

    301:http://www.example.com/page2.html

Squid sends a response like this back to the client:

    HTTP/1.0 301 Moved Permanently
    Server: squid/2.5.STABLE4
    Date: Mon, 29 Sep 2003 04:06:23 GMT
    Content-Length: 0
    Location: http://www.example.com/page2.html
    X-Cache: MISS from zoidberg
    Proxy-Connection: close
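As a rough sketch of the status-prefix trick, a redirector might build its reply line with a small helper like the one below. It is written in Python for illustration; redirect_reply and REDIRECT_CODES are hypothetical names, not part of Squid.

```python
# Status codes the text says Squid accepts as a prefix.
REDIRECT_CODES = {301, 302, 303, 307}

def redirect_reply(new_uri, status=None):
    """Build the line a redirector writes back to Squid.

    With no status, Squid quietly fetches new_uri on the client's behalf.
    With a status, Squid returns an HTTP redirect to the client instead.
    """
    if status is None:
        return new_uri
    if status not in REDIRECT_CODES:
        raise ValueError("unsupported redirect status: %r" % status)
    return "%d:%s" % (status, new_uri)
```

Calling redirect_reply("http://www.example.com/page2.html", 301) produces the same line shown in the example above.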
11.2 Some Sample Redirectors

Example 11-1 is a very simple redirector written in Perl. Its purpose is to send HTTP requests for the squid-cache.org site to a local mirror site in Australia. If the requested URI looks like it is for www.squid-cache.org or one of its mirror sites, this script outputs a new URI with the hostname set to www1.au.squid-cache.org.

A common problem first-time redirector writers encounter is buffered I/O. Note that here I make sure stdout is unbuffered.
Example 11-1. A simple redirector in Perl

    #!/usr/bin/perl -wl

    $|=1;   # don't buffer the output

    while (<>) {
        ($uri,$client,$ident,$method) = ( );
        ($uri,$client,$ident,$method) = split;
        next unless ($uri =~ m,^http://.*\.squid-cache\.org(\S*),);
        $uri = "http://www1.au.squid-cache.org$1";
    } continue {
        print "$uri";
    }

Example 11-2 is another, somewhat more complicated, example. Here I make a feeble attempt to deny requests when the URI contains "bad words." This script demonstrates an alternative way to parse the input fields. If I don't get all five required fields, the redirector returns a blank line, leaving the request unchanged.

This example also gives preferential treatment to some users. If the ident string is equal to "TheBoss," or the request comes from the 192.168.4.0 subnet, the request is passed through. Finally, I use the 301: trick to make Squid return an HTTP redirect to the client. Note, this program is neither efficient nor smart enough to correctly deny so-called bad requests.
Example 11-2. A slightly less simple redirector in Perl

    #!/usr/bin/perl -wl

    $|=1;   # don't buffer the output

    $DENIED = "http://www.example.com/denied.html";

    &load_word_list( );

    while (<>) {
        unless (m,(\S+) (\S+)/(\S+) (\S+) (\S+),) {
            $uri = '';
            next;
        }
        $uri = $1;
        $ipaddr = $2;
        #$fqdn = $3;
        $ident = $4;
        #$method = $5;
        next if ($ident eq 'TheBoss');
        next if ($ipaddr =~ /^192\.168\.4\./);
        $uri = "301:$DENIED" if &word_match($uri);
    } continue {
        print "$uri";
    }

    sub load_word_list {
        @words = qw(sex drugs rock roll);
    }

    sub word_match {
        my $uri = shift;
        foreach $w (@words) { return 1 if ($uri =~ /$w/); }
        return 0;
    }

For more ideas about writing your own redirector, I recommend reading the source code for the redirectors mentioned in Section 11.5.
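The same two rules, flushing every reply and running until end-of-file, apply in any language. Below is a minimal sketch of a redirector loop in Python, given only as an illustration; the rewrite rule and the function names are hypothetical, and the streams are passed as parameters so the loop can be exercised without Squid.

```python
import sys

def rewrite(uri):
    """Hypothetical rule: return a new URI, or '' to leave the request unchanged."""
    prefix = "http://www.squid-cache.org"
    if uri.startswith(prefix):
        return "http://www1.au.squid-cache.org" + uri[len(prefix):]
    return ""

def run(stdin=sys.stdin, stdout=sys.stdout):
    # Loop until end-of-file on stdin; a redirector should not exit earlier.
    for line in stdin:
        fields = line.split()
        reply = rewrite(fields[0]) if fields else ""
        stdout.write(reply + "\n")
        stdout.flush()  # deliver each reply immediately, one per request
```

A real script would end with a call to run(). Without the flush, replies can sit in the output buffer while Squid waits for them.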
11.3 The Redirector Pool

A redirector can take an arbitrarily long time to return its answer. For example, it may need to make a database query, search through long lists of regular expressions, or make some complex computations. Squid uses a pool of redirector processes so that they can all work in parallel. While one is busy, Squid hands a new request off to another.

For each new request, Squid examines the pool of redirector processes in order. It submits the request to the first idle process. If your request rate is very low, the first redirector may be able to handle all requests itself.

You can control the size of the redirector pool with the redirect_children directive. The default value is five processes. Note that Squid doesn't dynamically increase or decrease the size of the pool depending on the load. Thus, it is a good idea to be a little liberal. If all redirectors are busy, Squid queues pending requests. If the queue becomes too large (bigger than twice the pool size), Squid exits with a fatal error message:

    FATAL: Too many queued redirector requests

In this case, you need to increase the size of the redirector pool or change something so that the redirectors can process requests faster. You can use the cache manager's redirector page to find out if you have too few, or too many, redirectors running. For example:

    % squidclient mgr:redirector
    ...
    Redirector Statistics:
    program: /usr/local/squid/bin/myredir
    number running: 5 of 5
    requests sent: 147
    replies received: 142
    queue length: 2
    avg service time: 953.83 msec

    #   FD   PID     # Requests   Flags   Time    Offset   Request
    1   10   35200   46           AB      0.902   0        http://...
    2   11   35201   29           AB      0.401   0        http://...
    3   12   35202   25           AB      1.009   1        cache_o...
    4   14   35203   25           AB      0.555   0        http://...
    5   15   35204   21           AB      0.222   0        http://...
If, as in this example, you see that the last redirector has almost as many requests as the second to last, you should probably increase the size of the redirector pool. If, on the other hand, you see many redirectors with no requests, you can probably decrease the pool size.
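To see why the per-process request counts reveal whether the pool is the right size, it can help to model the first-idle rule. The following Python sketch is a simplified illustration, not Squid's implementation: dispatch_counts is a hypothetical name, every request is assumed to take the same service time, and requests that find all helpers busy are only counted rather than queued and replayed.

```python
def dispatch_counts(arrivals, service_time, pool_size):
    """Count requests per helper when each request goes to the first idle one.

    arrivals:     sorted request arrival times, in seconds
    service_time: seconds a helper stays busy per request
    Returns (per-helper request counts, requests that found every helper busy).
    """
    busy_until = [0.0] * pool_size
    counts = [0] * pool_size
    all_busy = 0
    for t in arrivals:
        for i in range(pool_size):
            if busy_until[i] <= t:      # first idle helper, in pool order
                busy_until[i] = t + service_time
                counts[i] += 1
                break
        else:
            all_busy += 1               # Squid would queue this request
    return counts, all_busy
```

With widely spaced arrivals, only the first helper accumulates requests. As the load rises, work spills over to later helpers, and once even the last one is kept busy, some requests find no idle helper at all.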
11.4 Configuring Squid

The following five squid.conf directives control the behavior of redirectors in Squid.

11.4.1 redirect_program

The redirect_program directive specifies the command line for the redirector program. For example:

    redirect_program /usr/local/squid/bin/my_redirector -xyz

Note, the redirector program must be executable by the Squid user ID. If, for some reason, Squid can't execute the redirector, you should see an error message in cache.log. [1] For example:

    ipcCreate: /usr/local/squid/bin/my_redirector: (13) Permission denied

[1] This message appears only in cache.log, and not on stdout, if you use the -d option, or in syslog, if you use the -s option.

Due to the way Squid works, the main Squid process may be unaware of problems executing the redirector program. Squid doesn't detect the error until it tries to write a request and read a response. It then prints:

    WARNING: redirector #1 (FD 6) exited

Thus, if you see such a message for the first request sent to Squid, check cache.log closely for other errors, and make sure the program is executable by Squid.
11.4.2 redirect_children

The redirect_children directive specifies how many redirector processes Squid should start. For example:

    redirect_children 20

Squid warns you (via cache.log) when all redirectors are simultaneously busy:

    WARNING: All redirector processes are busy.
    WARNING: 1 pending requests queued.

If you see this warning, you should increase the number of child processes and restart (or reconfigure) Squid. If the queue size becomes twice the number of redirectors, Squid aborts with a fatal message.

Don't attempt to disable Squid's use of the redirectors by setting redirect_children to 0. Instead, simply remove the redirect_program line from squid.conf.
11.4.3 redirect_rewrites_host_header

Squid normally updates a request's Host header when using a redirector. That is, if the redirector returns a new URI with a different hostname, Squid puts the new hostname in the Host header. If you use Squid as a surrogate (see Chapter 15), you might want to disable this behavior by setting the redirect_rewrites_host_header directive to off:

    redirect_rewrites_host_header off
11.4.4 redirector_access

Squid normally sends every request through a redirector. However, you can use the redirector_access rules to send certain requests through selectively. The syntax is identical to http_access:

    redirector_access allow|deny [!]ACLname ...

For example:

    acl Foo src 192.168.1.0/24
    acl All src 0/0
    redirector_access deny Foo
    redirector_access allow All

In this case, Squid skips the redirector for any request that matches the Foo ACL.
11.4.5 redirector_bypass

If you enable the redirector_bypass directive, Squid bypasses the redirectors when all of them are busy. Normally, Squid queues pending requests until a redirector process becomes available. If this queue grows too large, Squid exits with a fatal error message. Enabling this directive ensures that Squid never reaches that state.

The tradeoff, of course, is that some user requests may not be redirected when the load is high. If that's all right with you, simply enable the directive with this line:

    redirector_bypass on
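The choice Squid faces when every redirector is busy can be summarized in a few lines. This Python sketch is only a model of the behavior described in this chapter: on_all_busy is a hypothetical name, and the exact threshold (a queue longer than twice the pool size) is an assumption taken from the wording in Section 11.3.

```python
def on_all_busy(queue_length, pool_size, bypass):
    """Decide what happens to a request that arrives when every helper is busy.

    Returns "bypass", "queue", or "fatal".
    """
    if bypass:
        return "bypass"            # request proceeds with its original URI
    if queue_length + 1 > 2 * pool_size:
        return "fatal"             # FATAL: Too many queued redirector requests
    return "queue"                 # wait for a redirector to become idle
```

With redirector_bypass on, the "fatal" branch is never reached, at the cost of some requests skipping redirection.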
11.5 Popular Redirectors

As I already mentioned, the Squid source code doesn't include any redirectors. However, you can find a number of useful third-party redirectors linked from the Related Software page on http://www.squid-cache.org. Here are some of the more popular offerings:
11.5.1 Squirm

http://squirm.foote.com.au/

Squirm comes from Chris Foote. It is written in C and distributed as source code under the GNU General Public License (GPL). Squirm's features include:

● Being very fast with minimal memory usage
● Full regular expression pattern matching and replacement
● Ability to apply different redirection lists to different client groups
● Interactive mode for testing on the command line
● Fail-safe mode passes requests through unchanged in the event that configuration files contain errors
● Writing debugging, errors, and more to various log files
11.5.2 Jesred

http://www.linofee.org/~elkner/webtools/jesred/

Jesred comes from Jens Elkner. It is written in C, based on Squirm, and also released under the GNU GPL. Its features include:

● Being faster than Squirm, with slightly more memory usage
● Ability to reread its configuration files while running
● Full regular expression pattern matching and replacement
● Fail-safe mode passes requests through unchanged in the event that configuration files contain errors
● Optionally logging rewritten requests to a log file
11.5.3 squidGuard

http://www.squidguard.org/

squidGuard comes from Pål Baltzersen and Lars Erik Håland at Tele Danmark InterNordia. It is released under the GNU GPL. The authors also make sure squidGuard compiles easily on modern Unix systems. Their site contains a lot of good documentation. Here are some of squidGuard's features:

● Highly configurable; you can apply different rules to different groups of clients or users and at different times or days
● URI substitution, not just replacement, à la sed
● printf-like substitutions allow passing parameters to CGI scripts for customized messages
● Supportive of the 301/302/303/307 HTTP redirect status code feature for redirectors
● Selective logging for rewrite rule sets

At the squidGuard site, you can also find a blacklist of more than 100,000 sites categorized as porn, aggressive, drugs, hacking, ads, and more.
11.5.4 AdZapper

http://www.adzapper.sourceforge.net

AdZapper is a popular redirector because it specifically targets removal of advertisements from HTML pages. It is a Perl script written by Cameron Simpson. AdZapper can block banners (images), pop-up windows, flash animations, page counters, and web bugs. The script includes a list of regular expressions that match URIs known to contain ads, pop-ups, etc. Cameron updates the script periodically with new patterns. You can also maintain your own list of patterns.
11.6 Exercises

● Write a redirector that never changes the requested URI and configure Squid to use it.
● While running tail -f cache.log, kill Squid's redirector processes one by one until something interesting happens.
● Download and install one of the redirectors mentioned in the previous section.
Chapter 12. Authentication Helpers

I originally talked about proxy authentication in Section 6.1.2.12. However, I only explained how to write access control rules that use proxy authentication. Here, I'll show you how to select and configure the particular authentication helpers.

Recall that Squid supports three methods for gathering authentication credentials from users: Basic, Digest, and NTLM. These methods specify how Squid receives the username and password from a client. From a security standpoint, Basic authentication is extremely weak. Digest and NTLM are significantly stronger. For each method, Squid provides some authentication modules, or helper processes, which actually validate the credentials.

All of the authentication helpers that I mention here are included in the Squid source code distribution. You can compile them with ./configure options that match their directory names. For example:

    % ls helpers/basic_auth
    LDAP            NCSA            getpwnam
    MSNT            PAM             multi-domain-NTLM
    Makefile        SASL            winbind
    Makefile.am     SMB
    Makefile.in     YP

    % ./configure --enable-basic-auth-helpers=LDAP,NCSA ...

Helper programs are normally installed in the $prefix/libexec directory.

As with redirectors, Squid uses a pool of authentication helper processes. A request for authentication is sent to the first idle helper. When all authenticator processes are busy, Squid queues pending requests. If the queue becomes too large, Squid exits with a fatal error message. In most cases, Squid caches authentication results. This reduces the load on the helper processes and improves response time.
12.1 Configuring Squid

The auth_param directive controls every aspect of configuring Squid's authentication helpers. The different methods (Basic, Digest, NTLM) have some things in common, and some unique parameters. The first argument following auth_param must be one of basic, digest, or ntlm. I'll cover this directive in detail for each authentication scheme later in the chapter.

In addition to auth_param, Squid has two more directives that affect proxy authentication. You can use the max_user_ip ACL to prevent users from sharing their username and password with others. If Squid detects the same username coming from too many different IP addresses, the ACL is a match and you can deny the request. For example:

    acl FOO max_user_ip 2
    acl BAR proxy_auth REQUIRED
    http_access deny FOO
    http_access allow BAR

In this case, if a user submits requests from three or more different IP addresses, Squid denies the request. The authenticate_ip_ttl directive controls how long Squid remembers the source IP addresses for each user. A smaller TTL makes it easier for users with frequently changing IP addresses. You can use larger TTLs in an environment where users have the same IP address for long periods of time.
12.2 HTTP Basic Authentication

Basic authentication is the simplest and least secure that HTTP has to offer. It essentially transmits user passwords as cleartext, although they are encoded into printable characters. For example, if the user types her name as Fannie and her password as FuRpAnTsClUb, the user-agent first combines the two into a single string, with name and password separated by a colon:

    Fannie:FuRpAnTsClUb

Then it encodes this string with base64 encoding, as defined in RFC 2045. It looks like this in the HTTP headers:

    Authorization: Basic RmFubmllOkZ1UnBBblRzQ2xVYg==

Anyone who happens to capture your users' HTTP requests can easily get both the username and password:

    % echo RmFubmllOkZ1UnBBblRzQ2xVYg== | /usr/local/lib/python1.5/base64.py -d
    Fannie:FuRpAnTsClUb

As required by the HTTP/1.1 RFC, Squid doesn't forward "consumed" authorization credentials to other servers. In other words, if the credentials are for access to Squid, the Authorization header is removed from outgoing requests. [1]

[1] Unless you configure a peer with the login=PASS option.
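To make the weakness of Basic credentials concrete, here is a small Python sketch of the encoding and decoding steps. It uses only the standard base64 module; the function names encode_basic and decode_basic are hypothetical, and the sketch assumes the credentials are UTF-8 text.

```python
import base64

def encode_basic(username, password):
    """Return the value of an Authorization header for Basic authentication."""
    token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")

def decode_basic(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # Split on the first colon only; the password itself may contain colons.
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password
```

Decoding needs no key or secret, which is why a captured header is as good as the password itself.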
You'll notice that some of the Basic authenticators can be configured to check the system password file. Because Basic credentials aren't encrypted, it is a bad idea to combine login passwords with cache access passwords. If you choose to use the getpwnam authenticator, make sure you fully understand the implications of having your users' passwords transmitted in the clear across your network.

HTTP Basic authentication supports the following auth_param parameters:

● auth_param basic program command
● auth_param basic children number
● auth_param basic realm string
● auth_param basic credentialsttl time-specification
The program parameter specifies the command, including arguments, for the helper program. In most cases, this will be the pathname to one of the authentication helper programs that you compiled. By default, they live in /usr/local/squid/libexec.

The children parameter tells Squid how many helper processes to use. The default value is 5, which is a good starting point if you don't know how many Squid needs to handle the load. If you specify too few, Squid warns you with messages in cache.log.

The realm parameter is the authentication realm string that the user-agent should present to the user when prompting for a username and password. You can use something simple, such as "access to the Squid caching proxy."

The credentialsttl parameter specifies the amount of time that Squid internally caches authentication results. A larger value reduces the load on the external authenticator processes, but increases the amount of time until Squid detects changes to the authentication database. Note, this only affects positive results (i.e., successful validations). Negative results aren't cached inside Squid. The default TTL value is two hours.

Here is a complete example:

    auth_param basic program /usr/local/squid/libexec/pam_auth
    auth_param basic children 10
    auth_param basic realm My Awesome Squid Cache
    auth_param basic credentialsttl 1 hour

    acl KnownUsers proxy_auth REQUIRED
    http_access allow KnownUsers

Next I will discuss the Basic authentication helper programs that come with Squid.
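The effect of credentialsttl, where successful validations are remembered for a while and failures are never remembered, can be modeled with a few lines of Python. This is a conceptual sketch only, not how Squid stores credentials internally; the CredentialCache class, its injected clock, and the validate callback standing in for the helper process are all assumptions made for illustration.

```python
import time

class CredentialCache:
    """Remember successful validations for ttl seconds; never cache failures."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.expires = {}                     # (user, password) -> expiry time

    def check(self, user, password, validate):
        key = (user, password)
        if self.expires.get(key, float("-inf")) > self.clock():
            return True                       # cached positive result
        if validate(user, password):          # ask the external helper
            self.expires[key] = self.clock() + self.ttl
            return True
        return False                          # negative result: not cached
```

A longer TTL means fewer calls to validate, but a password changed in the backing database keeps working under its old value until the cached entry expires.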
12.2.1 NCSA

    ./configure --enable-basic-auth-helpers=NCSA

The NCSA authentication helper is relatively popular due to its simplicity and history. It stores usernames and passwords in a single text file, similar to the Unix /etc/passwd file. This password file format was originally developed as a part of the NCSA HTTP server project.

You pass the path to the password file as the program's single command-line argument in squid.conf:

    auth_param basic program /usr/local/squid/libexec/ncsa_auth /usr/local/squid/etc/passwd

You can use the htpasswd program that comes with Apache to create and update the password file. Also, you can download it from http://www.squid-cache.org/htpasswd/. From that page, you can also download the chpasswd CGI script, which allows users to change their own passwords if necessary.
12.2.2 LDAP

    ./configure --enable-basic-auth-helpers=LDAP

The LDAP helper interfaces to a Lightweight Directory Access Protocol server. The OpenLDAP libraries and header files must be installed before you can compile the squid_ldap_auth helper. You can find OpenLDAP at http://www.openldap.org/.

The squid_ldap_auth program requires at least two arguments: the base distinguished name (DN) and the LDAP server hostname. For example:

    auth_param basic program /usr/local/squid/libexec/squid_ldap_auth -b "ou=people,dc=example,dc=com" ldap.example.com

The LDAP helper has a Unix manual page that describes all of its options and parameters. However, Squid's manual pages aren't normally installed when you run make install. You can read the manual page by locating it in the source tree and manually running nroff. For example:

    % cd helpers/basic_auth/LDAP
    % nroff -man squid_ldap_auth.8 | less
12.2.3 MSNT

    ./configure --enable-basic-auth-helpers=MSNT

The MSNT authenticator interfaces to a Microsoft NT domain database via the Server Message Block (SMB) protocol. It uses a small configuration file, named msntauth.conf, which must be placed in the $prefix/etc or --sysconfdir directory. You can specify up to five NT domain controllers in the configuration file. For example:

    server pdc1_host bdc1_host my_nt_domain
    server pdc2_host bdc2_host another_nt_domain

By default, the MSNT authenticator allows any user validated by the server. However, it also has the ability to allow or deny specific usernames. If you create an allowusers file, only the users listed there are allowed access to Squid. You might want to use this feature if you have a large number of users on the NT server, but only a small number who are allowed to use the cache. You can also create a denyusers file. Any user listed in that file is automatically denied access, even before checking the allowusers file.

Alternatively, you can allow or deny specific usernames by placing them in the proxy_auth ACL as described in Section 6.1.2.12.

For additional documentation, see the README.html file in the helpers/basic_auth/MSNT directory.
12.2.4 Multi-domain-NTLM

    ./configure --enable-basic-auth-helpers=multi-domain-NTLM

The multi-domain-NTLM authenticator is similar to MSNT. Both send queries to a Windows NT domain database. Whereas MSNT queries up to five domain controllers, the multi-domain-NTLM authenticator requires users to insert the NT domain name before their username, like this:

    ntdomain\username

The multi-domain-NTLM helper program is a relatively short Perl script. It relies on the Authen::SMB package from CPAN (http://www.cpan.org). If you don't hardcode the domain controller hostnames in the Perl script, it utilizes the nmblookup program from the Samba package (www.samba.org) to discover them automatically.

The Perl script is named smb_auth.pl. It might look like this in squid.conf:

    auth_param basic program /usr/local/squid/libexec/smb_auth.pl

Documentation for multi-domain-NTLM is thin, but if you understand Perl, you should be able to figure it out by reading the code.
12.2.5 PAM

    ./configure --enable-basic-auth-helpers=PAM

In a sense, Pluggable Authentication Modules (PAM) are the glue between authentication methods (e.g., one-time passwords, kerberos, smart cards) and applications requiring authentication services (e.g., ssh, ftp, imap). Your system's /etc/pam.conf file describes which methods to use for each application.

To use Squid's PAM authentication helper, you need to add "squid" as a service in the /etc/pam.conf file and specify which PAM modules to use. For example, to use the Unix password file on FreeBSD, you might put this in pam.conf:

    squid   auth   required   pam_unix.so   try_first_pass

To check the Unix password database, the pam_auth process must run as root. This is a security risk and you must manually make the executable setuid root. If pam_auth doesn't run as root, and it is configured to check the Unix password database, every request for authentication fails.

The PAM authenticator is documented with a manual page that you can find in the helpers/basic_auth/PAM directory.
12.2.6 SASL

    ./configure --enable-basic-auth-helpers=SASL

The Simple Authentication and Security Layer (SASL) is an IETF proposed standard, documented in RFC 2222. It is a protocol for negotiating security parameters for connection-based protocols (e.g., FTP, SMTP, HTTP). However, the SASL authenticator is similar to the PAM authenticator. It interfaces with a third-party library to query a number of different authentication databases.

Specifically, Squid's SASL authenticator requires the Cyrus SASL library developed by Carnegie Mellon University. You can find it at http://asg.web.cmu.edu/sasl/.

You can configure the SASL authenticator to check the traditional password file, the PAM system, or any of the other databases supported by CMU's library. For further information, see the README file in the helpers/basic_auth/SASL directory.
12.2.7 SMB

    ./configure --enable-basic-auth-helpers=SMB

SMB is another authenticator for Microsoft Windows databases. The authenticator itself is a C program. That program executes a shell script each time it talks to the Windows domain controller. The shell script contains commands from the Samba package. Thus, you'll need to install Samba before using the SMB authenticator.

The SMB authenticator program, smb_auth, takes the Windows domain name as an argument. For example:

    auth_param basic program /usr/local/squid/libexec/smb_auth -W MYNTDOMAIN

You can list multiple domains by repeating the -W option. For full documentation, see http://www.hacom.nl/~richard/software/smb_auth.html.
12.2.8 YP

    ./configure --enable-basic-auth-helpers=YP

The YP authenticator checks a system's "Yellow Pages" (a.k.a. NIS) directory. To use it with Squid, you need to provide the NIS domain name and the name of the password database, usually passwd.byname, on the authenticator command line:

    auth_param basic program /usr/local/squid/libexec/yp_auth my.nis.domain passwd.byname

The yp_auth program is relatively simple, but doesn't have any documentation.
12.2.9 getpwnam

    ./configure --enable-basic-auth-helpers=getpwnam

This authenticator is simply an interface to the getpwnam() function found in the C library on Unix systems. The getpwnam() function looks in the system password file for a given username. If you use YP/NIS, getpwnam() checks those databases as well. On some operating systems, it may also utilize the PAM system. You can use this authenticator if your cache users have login accounts on the system where Squid is running. Alternatively, you could set up "nologin" accounts in the password file for your cache users.
12.2.10 winbind

    ./configure --enable-basic-auth-helpers=winbind

Winbind is a feature of the Samba suite of software. It allows Unix systems to utilize Windows NT user account information. The winbind authenticator is a client for the Samba winbindd daemon. You must have Samba installed and the winbindd daemon running before you can use this authenticator.

The name of the winbind Basic authenticator is wb_basic_auth. It typically looks like this in squid.conf:

    auth_param basic program /usr/local/squid/libexec/wb_basic_auth
12.2.11 The Basic Auth API

The interface between Squid and a Basic authenticator is quite simple. Squid sends usernames and passwords to the authenticator process, separated by a space and terminated by a newline. The authenticator reads the username and password pairs on stdin. After checking the credentials, the authenticator writes either OK or ERR to stdout.

Any "URL-unsafe" characters are encoded according to the RFC 1738 rules. Thus, the name "jack+jill" becomes "jack%2bjill". Squid accepts usernames and passwords that contain whitespace characters. For example "a password" becomes "a%20password". The authenticator program should be prepared to handle whitespace and other special characters after decoding the name and password.
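The order of operations matters here: split the line on the separating space first, then decode each token, so that an encoded space inside a password is not mistaken for the field separator. A minimal Python sketch of that step follows; parse_basic_request is a hypothetical name, and urllib.parse.unquote from the standard library does the percent-decoding.

```python
from urllib.parse import unquote

def parse_basic_request(line):
    """Split a request line from Squid into decoded (username, password).

    Returns None if the line doesn't hold exactly two tokens.
    """
    tokens = line.split()
    if len(tokens) != 2:
        return None
    # Decode only after splitting, so encoded spaces survive as data.
    return unquote(tokens[0]), unquote(tokens[1])
```

An authenticator would reply ERR when this returns None, and otherwise check the decoded pair against its database.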
You can easily test a Basic authenticator on the command line. Simply run the authenticator program in a terminal window and enter usernames and passwords. Or, you can do it like this:

    % echo "bueller pencil" | ./ncsa_auth /tmp/passwd
    OK

Here is a simple template authenticator written in Perl:

    #!/usr/bin/perl -wl

    use URI::Escape;

    $|=1;   # don't buffer stdout

    while (<>) {
        ($u,$p) = split;
        $u = uri_unescape($u);
        $p = uri_unescape($p);
        if (&valid($u,$p)) {
            print "OK";
        } else {
            print "ERR";
        }
    }

    sub valid {
        my $user = shift;
        my $pass = shift;
        ...
    }
12.3 HTTP Digest Authentication

Digest authentication is designed to be significantly more secure than Basic. It makes extensive use of cryptographic hash functions and other tricks. Essentially, instead of sending a cleartext password, the user-agent sends a "message digest" of the password, username, and other information. (See RFC 2617 and O'Reilly's HTTP: The Definitive Guide for more information.)

HTTP Digest authentication supports the following auth_param parameters:

● auth_param digest program command
● auth_param digest children number
● auth_param digest realm string
● auth_param digest nonce_garbage_interval time-specification
● auth_param digest nonce_max_duration time-specification
● auth_param digest nonce_max_count number
● auth_param digest nonce_strictness on|off
The program, children, and realm parameters are the same as for Basic authentication. All of the unique parameters relate to Digest authentication's use of something called a nonce. A nonce is a special string of data, which changes occasionally. During the authentication process, the server (Squid in this case) provides a nonce value to the client. The client uses the nonce value when generating the digest. Without the nonce data, an attacker could simply intercept and replay the digest values to gain access to Squid.

The nonce_garbage_interval parameter tells Squid how often to clean up the nonce cache. The default value is every 5 minutes. A very busy cache with many Digest authentication clients may benefit from more frequent nonce garbage collection.

The nonce_max_duration parameter specifies how long each nonce value remains valid. When a client attempts to use a nonce value older than the specified time, Squid generates a 401 (Unauthorized) response and sends along a fresh nonce value so the client can re-authenticate. The default value is 30 minutes. Note that any captured Authorization headers can be used in a replay attack until the nonce value expires. Setting the nonce_max_duration too low, however, causes Squid to generate 401 responses more often. Each 401 response essentially wastes the user's time as the client and server renegotiate their authentication credentials.

The nonce_max_count parameter places an upper limit on how many times a nonce value may be used. After the specified number of requests, Squid returns a 401 (Unauthorized) response and a new nonce value. The default is 50 requests.

Nonce counts are another feature designed to prevent replay attacks. Squid sends qop=auth in its 401 responses. This causes user-agents to include a nonce count in their requests, and to use the nonce count when generating the digest itself. Nonce count values must always increase over time. A decreasing nonce count indicates a replay attack. However, the counts may increase, but skip some values, for example: 5,6,8,9. The nonce_strictness parameter determines what Squid does in this case. If set to on, Squid returns a 401 response if a nonce count doesn't equal the previous nonce count plus one. If set to off, Squid allows gaps in the nonce count values.

Here is a complete example:

    auth_param digest program /usr/local/squid/libexec/digest_pw
    auth_param digest children 8
    auth_param digest realm Access to Squid
    auth_param digest nonce_garbage_interval 10 minutes
    auth_param digest nonce_max_duration 45 minutes
    auth_param digest nonce_max_count 100
    auth_param digest nonce_strictness on

    acl KnownUsers proxy_auth REQUIRED
    http_access allow KnownUsers

Next I will discuss the Digest authentication helper programs that come with Squid.
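Before moving on to the helpers, the nonce-count rule described above can be made concrete with a few lines of Python. This is only a sketch of the rule as stated in the text, not Squid's actual code. The function name nonce_count_ok is hypothetical, and treating a repeated count as a replay is an assumption drawn from the statement that counts must always increase.

```python
def nonce_count_ok(previous, current, strict):
    """Decide whether a request's nonce count is acceptable.

    previous -- the last nonce count seen for this nonce
    current  -- the nonce count in the new request
    strict   -- True models nonce_strictness on, False models off
    """
    if current <= previous:
        # Counts must always increase; anything else looks like a replay.
        return False
    if strict:
        # With strictness on, no gaps are allowed.
        return current == previous + 1
    # With strictness off, gaps such as 6 -> 8 are tolerated.
    return True
```

With the sequence 5,6,8,9 from the text, the step from 6 to 8 is rejected when strict is True and accepted when it is False. A rejected count corresponds to Squid answering with a 401 response and a new nonce value.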
12.3.1 password

    ./configure --enable-auth=digest --enable-digest-auth-helpers=password

This is a simple, reference implementation of Digest authentication for Squid. It demonstrates how to write a Digest-based authentication helper. This code simply reads usernames and passwords from a plaintext file. The format of this file is as follows:

    username:password

The password file pathname is the single argument to the digest_pw_auth program. For example:

    auth_param digest program /usr/local/squid/libexec/digest_pw_auth /usr/local/squid/etc/digest_passwd
    auth_param digest realm Some Nifty Realm

Squid doesn't provide any tools to maintain a password file in this format. If you choose to use Digest authentication, you must manage the file on your own, perhaps with a text editor or Perl scripts.
12.3.2 Digest Authentication API

If you'd like to write your own Digest authentication helper, you need to understand the communication between Squid and the helper process. The exchange is similar to that for Basic authentication, albeit a little more complicated.

The first difference is that Squid writes the username and realm string, rather than username and password, to the helper process. These strings are quoted and separated by a colon. For example:

    "bobby":"Tom Landry Middle School"

The second difference is that the helper process returns an MD5 digest string, rather than OK, if the username is valid. As with Basic authentication, the helper process writes ERR if the user doesn't exist or if the input from Squid is unparseable for some reason.

The helper returns an MD5 digest of the username, realm, and password. The three strings are concatenated together and separated by colons:

    username:realm:password

Remember that the password isn't sent in the HTTP request. Rather, the helper retrieves the user's password from a database (like the plaintext file used by the password helper). For example, let's say that Bobby's password is CapeRs. The helper process receives the username and realm from Squid, gets the password from its database, and calculates an MD5 checksum of this string:

    bobby:Tom Landry Middle School:CapeRs

The Squid source code includes a library function, DigestCalcHA1(), which implements this calculation. We can test all this in a terminal window to see what the helper returns:

    % echo 'bobby:CapeRs' > /tmp/pw
    % echo bogus_input | digest_pw_auth /tmp/pw
    ERR
    % echo '"nouser":"some realm"' | digest_pw_auth /tmp/pw
    ERR
    % echo '"bobby":"Tom Landry Middle School"' | digest_pw_auth /tmp/pw
    c7ca3efda238c65b2d48684a51baa90e

Squid stores this MD5 checksum and uses it in other parts of the Digest authentication algorithm. Note that the checksum only changes when the user changes his password. In Squid's current Digest implementation, these checksums are kept in memory as long as the user remains active. If the user is inactive for authenticate_ttl seconds, the MD5 checksum may be removed from Squid's memory. Upon the next request from that user, Squid asks the external helper process to calculate it again.
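The exchange described in this section is small enough to sketch in Python. The following is an illustration under stated assumptions, not the digest_pw_auth source. The function names and the in-memory passwords dictionary are hypothetical, and the parsing accepts only the quoted "username":"realm" form shown above.

```python
import hashlib


def parse_request(line):
    """Split a '"username":"realm"' request line into (username, realm)."""
    user, sep, realm = line.strip().partition('":"')
    if not sep or not user.startswith('"') or not realm.endswith('"'):
        raise ValueError("unparseable request")
    return user[1:], realm[:-1]


def calc_ha1(username, realm, password):
    """MD5 checksum of username:realm:password, as a hex string."""
    joined = "{}:{}:{}".format(username, realm, password)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def answer(line, passwords):
    """Return the helper's reply for one request line.

    passwords is a hypothetical dict mapping usernames to plaintext
    passwords, standing in for the helper's password database.
    """
    try:
        username, realm = parse_request(line)
    except ValueError:
        return "ERR"  # input from Squid was unparseable
    password = passwords.get(username)
    if password is None:
        return "ERR"  # unknown user
    return calc_ha1(username, realm, password)
```

Calling answer('"bobby":"Tom Landry Middle School"', {"bobby": "CapeRs"}) yields a 32-character hexadecimal string, while a bogus line or an unknown username yields ERR, mirroring the terminal session above.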
12.4 Microsoft NTLM Authentication

NTLM[2] is a proprietary connection authentication protocol from Microsoft. A number of groups, including the Squid developers, have reverse-engineered the protocol from what little information is available and by examining network traffic. You can find some technical details at http://www.innovation.ch/java/ntlm.html.

[2] NTLM apparently stands for "NT LanMan" or perhaps "NT Lan Manager."
NTLM uses a three-way handshake to authenticate a connection. First, the client sends its request with a couple of identifiers. Second, the server sends back a challenge message. Third, the client sends its request again with a response to the challenge. At this point, the connection is authenticated and any further requests on the same connection don't require any challenge/response information. If the connection is closed, the client and server must repeat the entire three-way handshake. Persistent connections help reduce this overhead for NTLM.

NTLM uses cryptographic hash functions and nonce values, similar to Digest authentication, although experts believe NTLM is weaker.

NTLM authentication supports the following auth_param parameters:

● auth_param ntlm program command
● auth_param ntlm children number
● auth_param ntlm max_challenge_reuses number
● auth_param ntlm max_challenge_lifetime time-specification
The program and children parameters are the same as for Basic and Digest authentication. The remaining parameters determine how often Squid may reuse a single challenge token.

The max_challenge_reuses parameter specifies how many times a challenge token may be reused. The default value is 0, so that challenges are never reused. Increasing this value may reduce the computational load on Squid and the NTLM helper processes, at the risk of weakening the protocol's security.

Similarly, the max_challenge_lifetime parameter places a time limit on challenge reuses, even if the max_challenge_reuses count has not been reached. The default value is 60 seconds.

Here is a complete example:

    auth_param ntlm program /usr/local/squid/libexec/ntlm_auth foo\bar
    auth_param ntlm children 12
    auth_param ntlm max_challenge_reuses 5
    auth_param ntlm max_challenge_lifetime 2 minutes

    acl KnownUsers proxy_auth REQUIRED
    http_access allow KnownUsers

Squid comes with the following NTLM authentication helper programs:
12.4.1 SMB

    ./configure --enable-auth=ntlm --enable-ntlm-auth-helpers=SMB

The Server Message Block (SMB) authenticator for NTLM is similar to those for Basic authentication. Your users can simply supply their Windows NT domain, username, and password. This authenticator can load balance between multiple domain controllers. The domain and controller names go on the command line:

    auth_param ntlm program /usr/local/squid/libexec/ntlm_auth domain\controller [domain\controller ...]
12.4.2 winbind

    ./configure --enable-auth=ntlm --enable-ntlm-auth-helpers=winbind

This authenticator is similar to winbind for Basic authentication. Both require that you have the Samba winbindd daemon installed and running. The name of the winbind NTLM authenticator is wb_ntlm_auth. It typically looks like this in squid.conf:

    auth_param ntlm program /usr/local/squid/libexec/wb_ntlm_auth
12.4.3 NTLM Authentication API

The communication between Squid and an NTLM authenticator is much more complicated than for Basic and Digest. One reason is that each helper process actually creates its own challenge. Thus, helpers become "stateful" and Squid must remember which connections belong to which helpers.

Squid and the helper processes use a handful of two-character codes to indicate what they are sending. Those codes are as follows:
YR
Squid sends this to a helper when it needs a new challenge token. This is always the first communication between the two processes. It may also occur at any time that Squid needs a new challenge, due to the auth_param max_challenge_lifetime and max_challenge_reuses parameters. The helper should respond with a TT message.
TT challenge
A helper sends this message back to Squid and includes a challenge token. It is sent in response to a YR request. The challenge is base64-encoded, as defined by RFC 2045.

KK credentials
Squid sends this to a helper when it wants to authenticate a user's credentials. The helper responds with either AF, NA, BH, or LD.

AF username
The helper sends this message back to Squid when the user's authentication credentials are valid. The helper sends the username with this message because Squid doesn't try to decode the NTLM Authorization header.

NA reason
The helper sends this message back to Squid when the user's credentials are invalid. It also includes a "reason" string that Squid can display on an error page.

BH reason
The helper sends this message back to Squid when the validation procedure fails. This might happen, for example, when the helper process is unable to communicate with a Windows NT domain controller. Squid rejects the user's request.
LD username
This helper-to-Squid response is similar to BH, except that Squid allows the user's request. Like AF, it returns the username. To use this feature, you must compile Squid with the --enable-ntlm-fail-open option.

Since this protocol is relatively complicated, you'll probably be better off to start with one of the two skeleton authenticators included in the Squid source distribution. The no_check helper is written in Perl, and fakeauth is written in C. You can find them in the helpers/ntlm_auth directory.
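To show how the two-character codes fit together, here is a rough Python sketch of a helper's request dispatcher. It is not derived from the no_check or fakeauth sources. The function names are hypothetical, the challenge is a random placeholder rather than a real NTLM challenge message, and answering BH to an unrecognized request is an assumption.

```python
import base64
import os


def new_challenge():
    """Return a base64-encoded placeholder challenge token.

    A real helper would build a proper NTLM challenge message here;
    eight random bytes merely stand in for it.
    """
    return base64.b64encode(os.urandom(8)).decode("ascii")


def handle(line, validate):
    """Map one request line from Squid to the helper's reply.

    validate is a hypothetical callable that takes the credentials
    string and returns (True, username) or (False, reason). It may
    raise an exception if validation cannot be carried out at all.
    """
    code, _, argument = line.strip().partition(" ")
    if code == "YR":
        return "TT " + new_challenge()
    if code == "KK":
        try:
            ok, detail = validate(argument)
        except Exception as exc:
            # The validation procedure itself failed.
            return "BH " + str(exc)
        return ("AF " if ok else "NA ") + detail
    # Assumption: treat anything else as a failed procedure.
    return "BH unknown request"
```

A complete helper would call handle() for each line read from stdin, write the reply to stdout, and flush after every line. It would also need to remember the challenge it issued so that it can check the client's response against it, which is the stateful part this sketch leaves out.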
12.5 External ACLs

As of Version 2.5, Squid includes a new feature known as external ACLs. These are ACL elements that are implemented in external helper processes. You instruct Squid to write certain information to the helper, which then responds with either OK or ERR. Refer to Section 6.1.3 for a description of the external_acl_type syntax. Here, I'll only discuss the particular external ACL helper programs that come with the Squid source code.
12.5.1 ip_user

    ./configure --enable-external-acl-helpers=ip_user

This helper reads usernames and client IP addresses as input. It checks the two values against a configuration file to decide whether or not the combination is valid. To use this ACL helper, you would add lines like this to squid.conf:

    external_acl_type ip_user_helper %SRC %LOGIN /usr/local/squid/libexec/ip_user -f /usr/local/squid/etc/ip_user.conf
    acl AclName external ip_user_helper

%SRC is replaced with the client's IP address and %LOGIN is replaced with the username for each request. The ip_user.conf configuration file has the following format:

    ip_addr[/mask]      user|@group|ALL|NONE

For example:

    127.0.0.1           ALL
    192.168.1.0/24      bob
    10.8.1.0/24         @lusers
    172.16.0.0/16       NONE
This configuration file causes ip_user to return OK for any request coming from 127.0.0.1, for Bob's requests coming from the 192.168.1.0/24 network, and for any name in the lusers group when the request comes from the 10.8.1.0/24 network. It returns ERR for any request from the 172.16.0.0/16 network. It also returns ERR for any address and username pair that doesn't appear in the list.
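The decision described above can be modeled in a short Python sketch. This is not the ip_user source code. The names RULES and check are hypothetical, group membership comes from a plain dictionary instead of the system group database, and evaluating the rows in order is an assumption that the text doesn't confirm.

```python
import ipaddress

# The example ip_user.conf rows from the text, as (network, spec) pairs.
RULES = [
    ("127.0.0.1", "ALL"),
    ("192.168.1.0/24", "bob"),
    ("10.8.1.0/24", "@lusers"),
    ("172.16.0.0/16", "NONE"),
]


def check(ip, user, rules, groups):
    """Return "OK" or "ERR" for a client address and username.

    groups is a hypothetical dict mapping a group name to a set of
    member usernames.
    """
    address = ipaddress.ip_address(ip)
    for network, spec in rules:
        if address not in ipaddress.ip_network(network):
            continue
        if spec == "NONE":
            return "ERR"
        if spec == "ALL":
            return "OK"
        if spec.startswith("@"):
            if user in groups.get(spec[1:], ()):
                return "OK"
        elif spec == user:
            return "OK"
    # No row accepted this address and username pair.
    return "ERR"
```

For instance, check("192.168.1.7", "bob", RULES, {}) returns OK, while the same address with a different username falls through to ERR.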
12.5.2 ldap_group

    ./configure --enable-external-acl-helpers=ldap_group

This helper determines whether or not a user belongs to a particular LDAP group. You specify the LDAP group names on the acl line. It might look like this in your configuration file:

    external_acl_type ldap_group_helper %LOGIN /usr/local/squid/libexec/squid_ldap_group -b "ou=people,dc=example,dc=com" ldap.example.com
    acl AclName external ldap_group_helper GroupRDN ...

Note that you must have the OpenLDAP (http://www.openldap.org) libraries installed on your system to compile the squid_ldap_group helper program.
12.5.3 unix_group

    ./configure --enable-external-acl-helpers=unix_group

This helper looks for usernames in the Unix group database (e.g., /etc/group file). You specify the groups to check on the helper command line as follows:

    external_acl_type unix_group_helper %LOGIN /usr/local/squid/libexec/check_group -g group1 -g group2 ...
    acl AclName external unix_group_helper

Alternatively, you can specify groups on the acl line. This allows you to use the same helper for different groups:

    external_acl_type unix_group_helper %LOGIN /usr/local/squid/libexec/check_group
    acl AclName1 external unix_group_helper group1 ...
    acl AclName2 external unix_group_helper group2 ...
12.5.4 wbinfo_group

    ./configure --enable-external-acl-helpers=wbinfo_group

This helper is a short Perl script that utilizes the wbinfo program from the Samba package. wbinfo is a client for the winbindd daemon. The script expects a single Unix group name following the username on each request. Thus, you must put a group name on the acl line:

    external_acl_type wbinfo_group_helper %LOGIN /usr/local/squid/libexec/wbinfo_group.pl
    acl AclName external wbinfo_group_helper group
12.5.5 winbind_group

    ./configure --enable-external-acl-helpers=winbind_group

This helper, written in C, also queries a winbindd server about group membership of Windows NT usernames. It is based on the winbind helpers for Basic and NTLM authentication. You can specify multiple group names on the acl command line:

    external_acl_type winbind_group_helper %LOGIN /usr/local/squid/libexec/wb_check_group
    acl AclName external winbind_group_helper group1 group2 ...
12.5.6 Write Your Own

The external ACL interface offers a lot of flexibility. Chances are you can use it to implement almost any access control check not supported by the built-in methods. Writing an external ACL is a two-step process. First, you must decide what request information the helper program needs to make a decision. Place the appropriate keywords on an external_acl_type line, along with the pathname to the helper program. For example, if you want to write an external ACL helper that uses the client's IP address, the user's name, and the value of the Host header, you would write something like:

    external_acl_type MyAclHelper %SRC %LOGIN %{Host} /usr/local/squid/libexec/myaclhelper

The second step is to write the myaclhelper program. It must read the request tokens on stdin, make its decision, then write either OK or ERR to stdout. Continuing with the previous example, this Perl script illustrates how to do it:

    #!/usr/bin/perl -wl
    require 'shellwords.pl';
    # Disable output buffering so Squid sees each reply immediately
    $|=1;
    # Read one request per line from stdin
    while (<>) {
        ($ip,$name,$host) = &shellwords;
        if (&valid($ip,$name,$host)) {
            print "OK";
        } else {
            print "ERR";
        }
    }

    sub valid {
        my $ip = shift;
        my $name = shift;
        my $host = shift;
        ...
    }

Refer to Section 6.1.3 for the list of tokens (%SRC, %LOGIN, etc.) that you can pass from Squid to the helper. Note that when a token contains whitespace, Squid wraps it in double quotes. As the example shows, you can use Perl's shellwords library to parse quoted tokens easily.

Of course, to utilize the external ACL, you must reference it in an acl line. The ACL element is a match whenever the external helper returns OK.

The external ACL helper interface allows you to pass additional information from the helper to Squid (on the OK/ERR line). These take the form of keyword=value pairs. For example:

    OK user=hank

Currently, the only keywords that Squid knows about are error and user. If the user value is set, Squid uses it in the access.log. The error value isn't currently used by Squid.
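For readers who prefer Python, the same two-step helper can be sketched with the standard shlex module, which plays the role of shellwords. This is an illustrative equivalent rather than anything shipped with Squid. The policy inside valid() is a made-up placeholder, and www.example.com is a hypothetical hostname.

```python
import shlex
import sys


def valid(ip, name, host):
    """Hypothetical policy: allow requests for one particular host."""
    return host == "www.example.com"


def answer(line):
    """Return "OK" or "ERR" for one request line from Squid."""
    try:
        # shlex.split() honors the double quotes that Squid places
        # around tokens containing whitespace.
        ip, name, host = shlex.split(line)
    except ValueError:
        # Unbalanced quotes or the wrong number of tokens.
        return "ERR"
    return "OK" if valid(ip, name, host) else "ERR"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read requests line by line and answer each one immediately."""
    for line in stdin:
        stdout.write(answer(line) + "\n")
        stdout.flush()  # don't leave Squid waiting on a buffered reply
```

To run it as a helper, call main() at the bottom of the script and name the script on the external_acl_type line. Flushing after every reply matters for the same reason the Perl version sets $|=1.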
12.6 Exercises

● Write a fake helper for Basic authentication that always returns either OK or ERR.
● Use tcpdump or ethereal to capture some HTTP requests. Decode the authorization credentials.
● If you're using NTLM, capture some HTTP requests and attempt a replay attack.
● Kill Squid's authentication helper processes one-by-one while running tail -f cache.log.
● Find out what happens to your favorite NTLM-based authenticator when it can't communicate with the NT domain controller.
Chapter 13. Log Files

Log files are the primary sources of persistent information about Squid's operation. In other words, they provide a record of what Squid has been doing. This includes URIs requested by users, objects that have been saved to disk, and various warnings and errors. When Squid appears to be malfunctioning, you'll want to check the log files first. By the end of this chapter, you'll know how to interpret and manage all of Squid's various log files.

Depending on your configuration, Squid maintains, at most, seven log files. The three primary files are: cache.log, access.log, and store.log. Two optional log files, useragent.log and referer.log, are similar to access.log but contain additional information. I'll also talk about the swap.state and netdb_state files. These are databases, used by Squid when it restarts.

Note that the filenames, such as access.log, are the default values. You can change most of the log file names with various squid.conf directives. The following list contains a brief description of each log file:
cache.log
This log file contains human-oriented, informational messages about Squid's operation. The filename is defined by the cache_log directive. Under normal conditions, the file grows by about 10-100 KB per day.

access.log
This log file contains an entry for every HTTP and (optionally) ICP transaction made by Squid's clients. The filename is defined by the cache_access_log directive. It grows at a rate of 100-200 bytes per transaction.

store.log
This log file contains low-level information about objects that enter and leave the cache. The filename is defined by the cache_store_log directive. It grows at a rate of about 150 bytes per transaction.
referer.log[1]
This optional log file contains HTTP Referer headers for each client request. You must enable referer logging with the --enable-referer-log option when running ./configure. The filename is defined by the referer_log directive. It grows at a rate of about 80 bytes per transaction.

[1] No, this isn't a typo. "Referer" has been historically misspelled by HTTP developers.

useragent.log
This optional log file contains HTTP User-Agent headers for each client request. You must enable user-agent logging with the --enable-useragent-log option when running ./configure. The filename is defined by the useragent_log directive. It grows at a rate of about 75 bytes per transaction.
swap.state
These files contain internal metadata about the objects stored on disk. Squid uses them to reconstruct the cache upon startup. By default, they are located in the cache_dir directories. However, you can change the location with the cache_swap_log directive. They grow at a rate of 100 bytes per cache miss.

netdb_state
This file holds the contents of the Network Measurement Database (see Section 10.5). It is always located in the first cache_dir directory. Its size is determined by the netdb_high value.

If Squid receives an error while writing a log file, it doesn't silently continue. Instead, it exits with a fatal error message to get your attention. Make sure that you periodically rotate the log files, as described in Section 13.7, to reduce the possibility of filling your disks. For the same reason, I also recommend placing your log files on a partition separate from any of your cache directories.
13.1 cache.log

cache.log contains various messages such as information about Squid's configuration, warnings about possible performance problems, and serious errors. Here is some sample cache.log output:

    2003/09/29 12:09:45| Starting Squid Cache version 2.5.STABLE4 for i386-unknown-freebsd4.8...
    2003/09/29 12:09:45| Process ID 18990
    2003/09/29 12:09:45| With 1064 file descriptors available
    2003/09/29 12:09:45| Performing DNS Tests...
    2003/09/29 12:09:45| Successful DNS name lookup tests...
    2003/09/29 12:09:45| DNS Socket created at 0.0.0.0, port 1154, FD 5
    2003/09/29 12:09:45| Adding nameserver 24.221.192.5 from /etc/resolv.conf
    2003/09/29 12:09:45| Adding nameserver 24.221.208.5 from /etc/resolv.conf
    2003/09/29 12:09:45| helperOpenServers: Starting 5 'redirector.pl' processes
    2003/09/29 12:09:45| Unlinkd pipe opened on FD 15
    2003/09/29 12:09:45| Swap maxSize 10240 KB, estimated 787 objects
    2003/09/29 12:09:45| Target number of buckets: 39
    2003/09/29 12:09:45| Using 8192 Store buckets
    2003/09/29 12:09:45| Max Mem  size: 8192 KB
    2003/09/29 12:09:45| Max Swap size: 10240 KB
    2003/09/29 12:09:45| Rebuilding storage in /usr/local/squid/var/cache (CLEAN)
    2003/09/29 12:09:45| Using Least Load store dir selection
    2003/09/29 12:09:45| Set Current Directory to /usr/local/squid/var/cache
    2003/09/29 12:09:45| Loaded Icons.
    2003/09/29 12:09:45| Accepting HTTP connections at 0.0.0.0, port 3128, FD 16.
    2003/09/29 12:09:45| Accepting ICP messages at 0.0.0.0, port 3130, FD 17.
    2003/09/29 12:09:45| WCCP Disabled.
    2003/09/29 12:09:45| Ready to serve requests.
Each cache.log entry starts with a timestamp showing when the message was generated. The very first entry in this sample reports the Squid version (2.5.STABLE4) and a string identifying the operating system for which Squid was configured (i386-unknown-freebsd4.8). The process ID (18990) follows. Many cache.log entries may look cryptic (Target number of buckets: 39). In most cases, under normal conditions, you can ignore entries you don't understand. On the other hand, you may want to look over essential configuration details such as name-server addresses or HTTP server address. This sample output ends with a statement that Squid is ready to serve requests. At this point, Squid can accept HTTP connections from clients.

Usually, the cache.log file grows slowly. However, an unusual HTTP transaction or similar event may cause Squid to emit a debugging message. If such an event happens often (e.g., a DoS attack, a new virus, or sudden disk failure), the log file may grow quickly. Rotating log files reduces the chance that you'll run out of disk space.

Major errors and abnormal conditions are likely to be reported in cache.log. I recommend archiving these logs so that it is possible to go back and find the first occurrence of an unusual event. When describing a particular Squid problem on the mailing list or a similar forum, the relevant lines from cache.log may be very useful. You may also want to increase debugging levels for some sections so that others can better understand and fix your problem.
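If you archive cache.log files as suggested, a small script can help locate the first occurrence of a message. The Python sketch below assumes the timestamp layout shown in the sample output (date, time, then a vertical bar). The function names are hypothetical, and lines that don't begin with a timestamp are simply skipped.

```python
from datetime import datetime


def parse_entry(line):
    """Split a cache.log line into (timestamp, message).

    Returns (None, text) for lines that don't start with a timestamp.
    """
    text = line.rstrip("\n")
    stamp, sep, message = text.partition("|")
    if sep:
        try:
            when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
            return when, message.strip()
        except ValueError:
            pass
    return None, text


def first_occurrence(lines, needle):
    """Return the timestamp of the first entry containing needle."""
    for line in lines:
        when, message = parse_entry(line)
        if when is not None and needle in message:
            return when
    return None
```

Passing an open file object as lines works because files iterate line by line. For example, first_occurrence(open("cache.log"), "WARNING") would return the time of the first entry containing that word, or None if there is none.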
13.1.1 Debugging Levels

The debug_options directive controls the level of detail for cache.log messages. The default value (ALL,1) is usually the best choice. At higher levels, the unimportant messages make it harder to find the important ones. Refer to Section 16.2 for a thorough description of the debug_options directive.

Note that debugging at the highest levels (9 or 10) may add thousands of lines for each request, quickly consuming disk space and significantly degrading Squid's performance.

You can use Squid's -X command-line option to enable full debugging for all sections. This mode is particularly useful if Squid refuses to start, and the debugging levels in squid.conf are insufficient to diagnose the problem. This is also a good way to enable full debugging of the configuration file parser, before it gets to the debug_options directive. You should never use the -X option when Squid is operating properly.

You can use Squid's -k debug command-line option to enable full debugging immediately on a running Squid process. This command operates as a toggle: the first invocation turns on full debugging, and the second invocation turns it off. See Chapter 5 for a general discussion about the -k option.

As I already mentioned, full debugging generates an overwhelming amount of data. This can make Squid, and the operating system, very slow. In extreme cases, you may find your terminal session becomes unresponsive after executing the first squid -k debug command. Locking yourself out while Squid is spitting megabytes of cache.log entries per second is an unpleasant experience. I find the following trick useful to get a compact, five-second debugging snapshot with less risk:

    % squid -k debug; sleep 5; squid -k debug
13.1.2 Forwarding cache.log Messages to the System Log

To have Squid send copies of cache.log messages to the system log, use the -s command-line option. Only messages with debugging levels 0 and 1 are forwarded. Level 0 messages are logged with syslog level LOG_WARNING. Level 1 messages use syslog level LOG_NOTICE. All messages use the LOCAL4 syslog facility. Here is one way to configure syslogd so that these messages are saved:

    local4.warning          /var/log/squid.log

Using syslog in addition to cache.log is especially handy when you maintain several Squid boxes. You can configure each local syslog daemon to forward these messages to a central host and enjoy a unified view of all caches in one location. For example, you might use this entry in /etc/syslogd.conf:

    local4.notice           @192.168.45.1
13.1.3 Dumping cache.log Messages to Your Terminal

The -d level command-line option instructs Squid to dump cache.log messages to your terminal (i.e., stderr). The level argument specifies the maximum level for messages that are dumped. Note that you'll see only messages that would appear in cache.log, subject to the debug_options setting. For example, if you have debug_options ALL,1, and run squid -d2, you won't see any level 2 debugging messages.

The -d level and -N options are most useful for debugging Squid problems or quickly testing a change to the configuration file. They allow you to start Squid easily and see the cache.log messages. This option may also be useful when Squid starts from cron or a similar facility that automatically captures a program's standard error output and reports it back to the user. For example, you may have a cron job that automatically reconfigures the running Squid process:

    15 */4 * * * /usr/local/squid/sbin/squid -d1 -k reconfigure
13.2 access.log

Squid saves key information about HTTP transactions in access.log. This file is line-based, such that each line corresponds to one client request. Squid records the client IP address (or hostname), requested URI, response size, and other information.

Squid records all HTTP accesses in access.log, except for those that disconnect before sending any data. Squid also records all ICP (but not HTCP) transactions unless you disable them with the log_icp_queries directive. Section 13.2.4 describes the other squid.conf directives that affect the access log.

The default access.log format contains 10 fields. Here are some examples, with long lines split and indented:

    1066037222.011 126389 9.121.105.207 TCP_MISS/503 1055
        GET http://home.gigigaga.com/n8342133/Miho.DAT.019
        DIRECT/203.187.1.180
    1066037222.011  19120 12.83.179.11 TCP_MISS/200 359
        GET http://ads.x10.com/720x300/Z2FtZ3JlZXRpbmcxLmRhd/7/AMG
        DIRECT/63.211.210.20 text/html
    1066037222.011  34173 166.181.33.71 TCP_MISS/200 559
        GET http://coursesites.blackboard.com:8081/service/collab/../1010706448190/
        DIRECT/216.200.107.101 application/octet-stream
    1066037222.011  19287 41.51.105.27 TCP_REFRESH_MISS/200 500
        GET http://fn.yam.com/include/tsemark/show.js
        DIRECT/210.59.224.59 application/x-javascript
    1066037222.011  19395 41.51.105.27 TCP_MISS/304 274
        GET http://fnasp.yam.com/image/coin3.gif
        DIRECT/211.72.254.133
    1066037222.011  19074 30.208.85.76 TCP_CLIENT_REFRESH_MISS/304 197
        GET http://ads.icq.com/content/B0/0/..bC6GygEYNeHGjBUin5Azfe68m5hD1jLk$/aol
        DIRECT/64.12.184.121
    1066037222.011  19048 12.83.179.11 TCP_MISS/200 261
        GET http://ads.adsag.com/js.ng/...ne&cat=friendship&subcat=girltalk
        DIRECT/209.225.54.119 application/x-javascript
    1066037222.118    106 41.51.105.27 TCP_HIT/200 536
        GET http://rcm-images.amazon.com/images/G/01/rcm/privacy.gif
        NONE/- image/gif
    1066037222.352  19475 27.34.49.248 TCP_MISS/200 12387
        GET http://espanol.geocities.com/lebastias/divulgacion/budismo-tarot.html
        DIRECT/209.1.225.139 text/html
    1066037222.352    132 144.157.100.17 TCP_MISS/504 1293
        GET http://ar.atwola.com/image/93101912/aol
        NONE/-

Here are the definitions for all fields:
1: timestamp
The completion time of the request, expressed as the number of seconds since the Unix epoch (Thu Jan 1 00:00:00 UTC 1970), with millisecond resolution. Squid uses this format, instead of something more human-friendly, to simplify the work of various log file processing programs.

You can use a simple Perl command to convert the Unix timestamps into local time. For example:

    perl -pe 's/^\d+\.\d+/localtime($&)/e;' access.log

2: response time
For HTTP transactions, this field indicates how much time it took to process the request. The timer starts when Squid receives the HTTP request and stops when the response has been fully delivered. The response time is given in milliseconds.

The response time is usually 0 for ICP queries. This is because Squid answers ICP queries very quickly. Furthermore, Squid doesn't update the process clock between receiving an ICP query and sending the reply.

While time values are reported with millisecond resolution, the precision of those entries is probably about 10 milliseconds. Timing becomes even less precise when Squid is heavily loaded.

3: client address
This field contains the client's IP address, or hostname if you enable log_fqdn. For security or privacy reasons, you may want to mask a part of the client's address out using the client_netmask directive. However, that also makes it impossible to group requests coming from the same client.
4: result/status codes
This field consists of two tokens separated by a slash. The first token, result code, classifies the protocol and the result of a transaction (e.g., TCP_HIT or UDP_DENIED). These are Squid-specific codes, defined in Section 13.2.1. The codes that begin with TCP_ refer to HTTP requests, while UDP_ refers to ICP queries.

The second token is the HTTP response status code (e.g., 200, 304, 404, etc.). The status code normally comes from the origin server. In some cases, however, Squid may be responsible for selecting the status code. These codes, defined by the HTTP RFC, are summarized later in Table 13-1.

5: transfer size
This field indicates the number of bytes transferred to the client. Strictly speaking, it is the number of bytes that Squid told the TCP/IP stack to send to the client. Thus, it doesn't include overheads from TCP/IP headers. Also note that the transfer size is normally larger than the response's Content-Length. This value includes the HTTP response headers, while Content-Length does not.

These properties make the transfer size field useful for approximate bandwidth usage analysis but not for exact HTTP entity size calculations. If you need to know a response's Content-Length, you can find it in the store.log file.
6: request method
This field contains the request method. Because Squid clients may use ICP or HTTP, the request method is either HTTP- or ICP-specific. The most common HTTP request method is GET. ICP queries are always logged with ICP_QUERY. See Section 6.1.2.8 for a list of HTTP methods Squid knows about.
7: URI
This field contains the URI from the client's request. The vast majority of logged URIs are actually URLs (i.e., they have hostnames). Squid uses a special format for certain failures. These are cases when Squid can't parse the HTTP request or otherwise determine the URI. Instead of a URI/URL, you'll see a string such as "error:invalid-request." For example:
    1066036250.603 310 192.0.34.70 NONE/400 1203 GET error:invalid-request - NONE/-
Also in this field, look out for whitespace characters in the URI. Depending on your uri_whitespace setting, Squid may print the URI in the log file with whitespace characters. When this happens, the tools that read access.log files may become confused by the extra fields. When logging, Squid strips all URI characters after the first question mark unless the strip_query_terms directive is disabled.
8: client identity
Squid can determine a user's identity in two different ways. One is with the RFC 1413 ident protocol; the other is from HTTP authentication headers. Squid attempts ident lookups based on the ident_lookup_access rules, if any (see Section 6.2). Alternatively, if you use proxy authentication (or regular server authentication in surrogate mode), Squid places the given username in this field. If both methods provide Squid with a username, and you're using the native access.log format, the HTTP authentication name is logged, and the RFC 1413 name is ignored. The common log file format has separate fields for both names.
9: peering code/peerhost
The peering information consists of two tokens, separated by a slash. It is relevant only for requests that are cache misses. The first token indicates how the next hop was chosen. The second token is the address of that next hop. The peering codes are listed in Section 13.2.3. When Squid sends a request to a neighbor cache, the peerhost address is the neighbor's hostname. If the request is sent directly to the origin server, however, Squid writes the origin server's IP address, or its hostname if log_ip_on_direct is disabled. The value NONE/- indicates that Squid didn't forward this request to any other servers.
10: content type
The final field of the default, native access.log is the content type of the HTTP response. Squid obtains the content type value from the response's Content-Type header. If that header is missing, Squid uses a hyphen (-).
If you enable the log_mime_hdrs directive, Squid appends two additional fields to each line:
11: HTTP request headers
Squid encodes the HTTP request headers and prints them between a pair of square brackets. The brackets are necessary because Squid doesn't encode space characters. The encoding scheme is a little strange. Carriage return (ASCII 13) and newline (ASCII 10) are printed as \r and \n, respectively. Other non-printable characters are encoded with the RFC 1738 style, such that Tab (ASCII 9) becomes %09.
12: HTTP response headers
Squid encodes the HTTP response headers and prints them between a pair of square brackets. Note that these are the headers sent to the client, which may be different from headers received from the origin server.
Squid writes to access.log only after the entire response has been sent to the client. This allows Squid to include both request and response information in the log file. However, transactions that take minutes, or even hours, to complete aren't visible in access.log at the time of the request. When these types of transactions present a performance or policy concern, the access.log may be unable to help you. Instead, use the cache manager to view a list of pending transactions (see Section 14.2.1.37).
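The ten native-format fields described above can be pulled apart with a few lines of Python. This is a minimal sketch, not part of Squid: the AccessEntry type and the parse_access_line function are hypothetical names chosen for illustration. It assumes the default native format without logged MIME headers and a URI that contains no whitespace; any line with a different field count is rejected rather than guessed at.

```python
from typing import NamedTuple, Optional


class AccessEntry(NamedTuple):
    """One native-format access.log line (hypothetical helper type)."""
    timestamp: float      # field 1: seconds since the Unix epoch
    elapsed_ms: int       # field 2: response time in milliseconds
    client: str           # field 3: client address
    result_code: str      # field 4, first token (e.g. TCP_MISS)
    status: int           # field 4, second token (HTTP status code)
    size: int             # field 5: bytes handed to the TCP/IP stack
    method: str           # field 6: request method
    uri: str              # field 7: URI
    ident: str            # field 8: client identity, or "-"
    peer_code: str        # field 9, first token
    peer_host: str        # field 9, second token
    content_type: str     # field 10: content type, or "-"


def parse_access_line(line: str) -> Optional[AccessEntry]:
    """Split a native access.log line; return None on an unexpected field count."""
    parts = line.split()
    if len(parts) != 10:
        # Extra fields usually mean whitespace in the URI or logged headers.
        return None
    result_code, _, status = parts[3].partition("/")
    peer_code, _, peer_host = parts[8].partition("/")
    return AccessEntry(
        float(parts[0]), int(parts[1]), parts[2],
        result_code, int(status), int(parts[4]),
        parts[5], parts[6], parts[7],
        peer_code, peer_host, parts[9],
    )
```

Given a made-up line such as `1066036250.603 310 192.0.2.70 TCP_MISS/200 1203 GET http://www.example.com/ - DIRECT/192.0.2.10 text/html`, the function returns an entry whose result_code is TCP_MISS and whose peer_host is 192.0.2.10.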
13.2.1 access.log Result Codes
The following labels may appear in the fourth field of the access.log file in response to HTTP requests:
TCP_HIT
Squid found a likely fresh copy of the requested resource and sent it immediately to the client.
TCP_MISS
Squid didn't have a cached copy of the requested resource.
TCP_REFRESH_HIT
Squid found a likely stale copy of the requested resource and sent a validation request to the origin server. The origin server sent a 304 (Not Modified) response, indicating that Squid's copy is still fresh.
TCP_REF_FAIL_HIT
Squid found a likely stale copy of the requested resource and sent a validation request to the origin server. However, the origin server failed to respond or sent a response that Squid didn't understand. In any case, Squid sent the cached (and likely stale) copy to the client.
TCP_REFRESH_MISS
Squid found a likely stale copy of the requested resource and sent a validation request to the origin server. The server responded with new content, indicating the cached response was indeed stale.
TCP_CLIENT_REFRESH_MISS
Squid found a copy of the requested resource, but the client's request included a Cache-Control: no-cache directive. Squid forwarded the client's request to the origin server, forcing a cache validation.
TCP_IMS_HIT
The client sent a validation request, and Squid found a more recent, and likely fresh, copy of the requested resource. Squid sent the newer content to the client, without contacting the origin server.
TCP_SWAPFAIL_MISS
Squid found a valid copy of the requested resource but failed to load it from disk. Squid then sent the request to the origin server as though it were a cache miss.
TCP_NEGATIVE_HIT
When a request to an origin server results in an HTTP error, Squid may cache the response anyway. Repeated requests for these resources, within a short amount of time, result in negative hits. The negative_ttl directive controls the amount of time these errors may be cached. Also note that errors are cached only in memory and never written to disk. The following HTTP status codes may be negatively cached, subject to additional constraints: 204, 305, 400, 403, 404, 405, 414, 500, 501, 502, 503, 504.
TCP_MEM_HIT
Squid found a valid copy of the requested resource in the memory cache and sent it immediately to the client. Note that this doesn't accurately represent all responses served from memory. For example, responses that are cached in memory, but require validation, are logged with TCP_REFRESH_HIT, TCP_REFRESH_MISS, etc.
TCP_DENIED
The client's request was denied, due to either the http_access or http_reply_access rules. Note that requests denied by http_access have NONE/- in the ninth field, whereas those denied by http_reply_access have a valid entry.
TCP_OFFLINE_HIT
When offline_mode is enabled, Squid returns cache hits for almost any cached response, without considering its freshness.
TCP_REDIRECT
A redirector program told Squid to generate an HTTP redirect to a new URI (see Section 11.1). Normally, Squid doesn't log these redirects. To do so, you must manually define the LOG_TCP_REDIRECTS preprocessor directive before compiling Squid.
NONE
Unclassified result used for certain errors, such as invalid hostnames.
The following labels may appear in the fourth field of the access.log file in response to ICP queries:
UDP_HIT
Squid found a likely fresh copy of the requested resource in the cache.
UDP_MISS
Squid didn't find a likely fresh copy of the requested resource in the cache. If the same object is requested via HTTP, it would probably be a cache miss. Compare with UDP_MISS_NOFETCH.
UDP_MISS_NOFETCH
Like UDP_MISS, except that this also indicates Squid's reluctance to handle the corresponding HTTP request. If you use the -Y command-line option, Squid returns this, instead of UDP_MISS, while rebuilding its in-memory indexes at startup.
UDP_DENIED
The ICP query is denied due to the icp_access rules. If more than 95% of the ICP replies to a client are UDP_DENIED, and the client database is enabled (see Appendix A), Squid stops sending any ICP replies to the client for an hour. When this happens you'll also see a warning in cache.log.
UDP_INVALID
Squid received an invalid query (e.g., truncated message, invalid protocol version, whitespace in the URI, etc.). Squid sent an ICP_INVALID reply back to the client.
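A common use of the result codes is a rough HTTP hit ratio. The following sketch is one possible counting convention, not Squid's own accounting: the function name tcp_hit_ratio is hypothetical, every TCP_ label ending in _HIT is assumed to count as a hit, and every other TCP_ label (including TCP_DENIED) is assumed to count as a non-hit. UDP_ labels and NONE are skipped because they don't describe HTTP cache lookups.

```python
from typing import Iterable


def tcp_hit_ratio(result_codes: Iterable[str]) -> float:
    """Fraction of TCP_ result codes that end in _HIT (0.0 if there are none)."""
    total = 0
    hits = 0
    for code in result_codes:
        if not code.startswith("TCP_"):
            continue  # skip ICP (UDP_) labels and NONE
        total += 1
        if code.endswith("_HIT"):
            hits += 1
    return hits / total if total else 0.0
```

Feeding it the first token of the fourth field from each log line gives a request-based ratio. Whether labels such as TCP_REF_FAIL_HIT or TCP_NEGATIVE_HIT should count as hits depends on what you want to measure.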
13.2.2 HTTP Response Status Codes
Table 13-1 lists the numerical HTTP response codes and reason phrases. Note that Squid and other HTTP agents care only about the numeric value. The reason phrase is purely informational and doesn't affect the meaning of the response. For each status code, I also provide a reference to the particular section in RFC 2616 that describes it. Note that status codes 0 and 600 are nonstandard values used by Squid, and aren't mentioned in the RFC.
Table 13-1. HTTP response status codes

Code  Reason phrase                                  RFC 2616 section
0     No Response Received (Squid-specific)          N/A
1xx   Informational                                  10.1
100   Continue                                       10.1.1
101   Switching Protocols                            10.1.2
2xx   Successful                                     10.2
200   OK                                             10.2.1
201   Created                                        10.2.2
202   Accepted                                       10.2.3
203   Non-Authoritative Information                  10.2.4
204   No Content                                     10.2.5
205   Reset Content                                  10.2.6
206   Partial Content                                10.2.7
3xx   Redirection                                    10.3
300   Multiple Choices                               10.3.1
301   Moved Permanently                              10.3.2
302   Found                                          10.3.3
303   See Other                                      10.3.4
304   Not Modified                                   10.3.5
305   Use Proxy                                      10.3.6
306   (Unused)                                       10.3.7
307   Temporary Redirect                             10.3.8
4xx   Client Error                                   10.4
400   Bad Request                                    10.4.1
401   Unauthorized                                   10.4.2
402   Payment Required                               10.4.3
403   Forbidden                                      10.4.4
404   Not Found                                      10.4.5
405   Method Not Allowed                             10.4.6
406   Not Acceptable                                 10.4.7
407   Proxy Authentication Required                  10.4.8
408   Request Timeout                                10.4.9
409   Conflict                                       10.4.10
410   Gone                                           10.4.11
411   Length Required                                10.4.12
412   Precondition Failed                            10.4.13
413   Request Entity Too Large                       10.4.14
414   Request-URI Too Long                           10.4.15
415   Unsupported Media Type                         10.4.16
416   Requested Range Not Satisfiable                10.4.17
417   Expectation Failed                             10.4.18
5xx   Server Error                                   10.5
500   Internal Server Error                          10.5.1
501   Not Implemented                                10.5.2
502   Bad Gateway                                    10.5.3
503   Service Unavailable                            10.5.4
504   Gateway Timeout                                10.5.5
505   HTTP Version Not Supported                     10.5.6
6xx   Proxy Error                                    N/A
600   Unparseable Response Headers (Squid-specific)  N/A
You'll see status code 0 in the access.log if Squid doesn't receive any response from the origin server. You'll see status code 600 if Squid received a response but couldn't find any HTTP headers. In a small fraction of cases, certain origin servers send only the response body and omit any headers.
13.2.3 access.log Peering Codes
The following codes may appear in the ninth field of the access.log. Refer to Section 10.10 for a description of how Squid selects the next-hop for cache misses.
NONE
This indicates that Squid didn't communicate with any other servers (neighbors, origin) for this request. You'll see it in association with various types of cache hits, denied requests, cache manager requests, errors, and all ICP queries.
DIRECT
Squid forwarded the request directly to the origin server. The second half of the field shows the origin server's IP address, or hostname if you've disabled log_ip_on_direct.
SIBLING_HIT
Squid sent the request to this sibling cache after the sibling returned an ICP or HTCP hit.
PARENT_HIT
Squid sent the request to this parent cache after the parent returned an ICP or HTCP hit.
DEFAULT_PARENT
Squid selected this parent because it was marked as default on the cache_peer line in squid.conf.
FIRST_UP_PARENT
Squid forwarded the request to this parent because it is the first parent in the list known to be alive.
FIRST_PARENT_MISS
Squid forwarded the request to the parent cache that was first to respond with an ICP/HTCP miss message. In other words, for this particular ICP/HTCP query, at this particular time, the selected parent had the best round-trip time. Note that measured RTTs may be artificially adjusted by the weight option to the cache_peer directive.
CLOSEST_PARENT_MISS
Squid selected this parent because it reports the lowest RTT to the origin server. This occurs only if both caches have netdb enabled (see Section 10.5), and the origin server (or other servers on its subnet) returns ICMP pings.
CLOSEST_PARENT
This is similar to CLOSEST_PARENT_MISS, except that the RTT measurements don't come from the ICP/HTCP reply messages. Instead, they come from older measurements saved by Squid, such as the netdb exchange feature.
CLOSEST_DIRECT
Squid forwarded the request to the origin server based on netdb measurements. This happens if any of these conditions occur:
❍ The RTT between Squid and the origin server is less than the configured minimum_direct_rtt value.
❍ The measured number of router hops between Squid and the origin server is less than the configured minimum_direct_hops value.
❍ The RTT values returned in ICP/HTCP replies indicate that Squid is closer to the origin server than any of its neighbors.
ROUNDROBIN_PARENT
Squid forwarded the request to this parent because the round-robin option was set, and it had the lowest usage counter.
CD_PARENT_HIT
Squid forwarded the request to this parent based on the Cache Digest algorithm (see Section 10.7).
CD_SIBLING_HIT
Squid forwarded the request to this sibling based on the Cache Digest algorithm.
CARP
Squid selected this parent based on the Cache Array Routing Protocol algorithm (see Section 10.9).
ANY_PARENT
Squid selected this parent as a last resort because none of the other methods resulted in a viable next-hop.
Note that most of these codes may be preceded by TIMEOUT_ to indicate that a timeout occurred while waiting for ICP/HTCP replies. For example:
    1066038165.382 345 193.233.46.21 TCP_MISS/200 2836 GET http://www.caida.org/home/images/home.jpg TIMEOUT_CLOSEST_DIRECT/213.219.122.19 image/jpeg
You can adjust the timeout with the icp_query_timeout directive.
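When tallying peering codes, it helps to separate the optional TIMEOUT_ prefix from the code itself so that, for example, TIMEOUT_CLOSEST_DIRECT and CLOSEST_DIRECT are counted together. A minimal sketch follows; the function name split_peering is hypothetical, and it assumes the ninth field always has the code/peerhost form described above.

```python
from typing import Tuple

TIMEOUT_PREFIX = "TIMEOUT_"


def split_peering(field: str) -> Tuple[bool, str, str]:
    """Return (timed_out, peering_code, peerhost) for an access.log peering field."""
    code, _, host = field.partition("/")
    timed_out = code.startswith(TIMEOUT_PREFIX)
    if timed_out:
        code = code[len(TIMEOUT_PREFIX):]  # drop the prefix, keep the base code
    return timed_out, code, host
```

For the field in the example above, this yields a timed-out CLOSEST_DIRECT selection with peerhost 213.219.122.19.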
13.2.4 Configuration Directives That Affect access.log
Following are the configuration file directives that affect the access.log in one way or another.
13.2.4.1 log_icp_queries
This directive, enabled by default, causes Squid to log all ICP queries. If you're running a busy parent cache, this may make your access.log files huge. To save space, disable this directive:
    log_icp_queries off
If you disable ICP query logging, I suggest that you monitor the number of queries, either through the cache manager or with SNMP.
13.2.4.2 emulate_httpd_log
The access.log file has two formats: common and native. The common format is the same one that most HTTP servers (e.g., Apache) use. It contains less information than Squid's native format. However, you might want to use the common log-file format if you use Squid as a surrogate (see Chapter 15). The common format may also be useful if you have log-file analysis tools that know how to parse it. Use this directive to enable the common format:
    emulate_httpd_log on
See http://www.w3.org/Daemon/User/Config/Logging.html#common-logfile-format for a description of this format.
13.2.4.3 log_mime_hdrs
Use the log_mime_hdrs directive to make Squid log the HTTP request and response headers:
    log_mime_hdrs on
When enabled, Squid appends the request and response headers to access.log. This adds two fields to each line. Each field is surrounded by square brackets to make parsing easier. Certain characters are encoded to keep the log file readable. Table 13-2 shows the encoding scheme.
Table 13-2. Character encoding rules for HTTP headers in access.log

Character        Encoding
Newline          \n
Carriage return  \r
Backslash        \\
[                %5b
]                %5d
%                %25
ASCII 0-31       %xx (hexadecimal value)
ASCII 127-255    %xx (hexadecimal value)
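To read logged headers back, the rules in Table 13-2 have to be reversed in a single pass, so that an encoded percent sign (%25) isn't decoded a second time. The following is a sketch under that reading of the table, not code taken from Squid; the function name decode_logged_headers is hypothetical, and characters are mapped one-to-one from their byte values.

```python
import re

# Either a backslash escape (\n, \r, \\) or a %xx hexadecimal escape.
_TOKEN = re.compile(r"\\([nr\\])|%([0-9A-Fa-f]{2})")
_ESCAPES = {"n": "\n", "r": "\r", "\\": "\\"}


def decode_logged_headers(field: str) -> str:
    """Undo the Table 13-2 encoding for one bracketed header field."""
    if field.startswith("[") and field.endswith("]"):
        field = field[1:-1]  # drop the surrounding square brackets

    def replace(match: "re.Match[str]") -> str:
        if match.group(1) is not None:
            return _ESCAPES[match.group(1)]
        return chr(int(match.group(2), 16))

    # re.sub never rescans replaced text, so "%255b" decodes to "%5b", not "[".
    return _TOKEN.sub(replace, field)
```

Splitting the decoded string on the carriage return and newline pair then yields the individual header lines.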
13.2.4.4 log_fqdn
By default, Squid puts client IP addresses in the access.log. You can record hostnames, when available, by enabling this directive:
    log_fqdn on
This causes Squid to make reverse DNS lookups for the client's address when it receives a request. If an answer is available by the time the request is complete, Squid places it in the third field.
13.2.4.5 ident_lookup_access
This access rule list determines whether or not Squid makes an RFC 1413 ident query for the client's TCP connection. By default, Squid doesn't issue ident queries. To enable this feature, simply add one or more rules:
    acl All src 0/0
    ident_lookup_access allow All
If an answer is available by the time the request is complete, Squid places it in the eighth field. If you are also using HTTP authentication, that username is written instead of the ident answer.
13.2.4.6 log_ip_on_direct
When Squid forwards a cache miss to an origin server, it records the origin server's IP address in the ninth field. You can disable this directive so that Squid writes the hostname instead:
    log_ip_on_direct off
In this case, the hostname comes from the URI. If the URI contains an IP address, Squid doesn't convert it to a hostname.
13.2.4.7 client_netmask
This directive exists to provide some level of privacy for your users. Rather than logging the entire client IP address, you can mask off some bits. For example:
    client_netmask 255.255.255.0
With this setting, all client IP addresses in access.log have 0 as the last octet:
    1066036246.918     35 163.11.255.0 TCP_IMS_HIT/304 266 GET http://...
    1066036246.932     16 163.11.255.0 TCP_IMS_HIT/304 266 GET http://...
    1066036247.616    313 140.132.252.0 TCP_MISS/200 1079 GET http://...
    1066036248.598  44459 140.132.252.0 TCP_MISS/500 1531 GET http://...
    1066036249.230     17 170.210.173.0 TCP_IMS_HIT/304 265 GET http://...
    1066036249.752   2135 140.132.252.0 TCP_MISS/200 50230 GET http://...
    1066036250.467      4 170.210.173.0 TCP_IMS_HIT/304 265 GET http://...
    1066036250.762    102 163.11.255.0 TCP_IMS_HIT/304 265 GET http://...
    1066036250.832     20 163.11.255.0 TCP_IMS_HIT/304 266 GET http://...
    1066036251.026     74 203.91.150.0 TCP_CLIENT_REFRESH_MISS/304 267 GET http://...
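The masking itself is a bitwise AND of the client address and the netmask. The sketch below reproduces that arithmetic with Python's standard ipaddress module so you can apply the same anonymization to logs collected without client_netmask. The function name mask_client is hypothetical, and only IPv4 is handled.

```python
import ipaddress


def mask_client(addr: str, netmask: str) -> str:
    """Apply a client_netmask-style mask to a dotted-quad IPv4 address."""
    ip = int(ipaddress.IPv4Address(addr))
    mask = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(ip & mask))  # keep only the masked bits
```

With the 255.255.255.0 mask shown above, a made-up client such as 163.11.255.42 becomes 163.11.255.0, which is why all clients on the same /24 become indistinguishable in the log.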
13.2.4.8 strip_query_terms
This directive is another privacy feature. It removes query terms from URIs before logging them. If your log files somehow fall into the wrong hands, whoever reads them won't be able to find any usernames and passwords in the query terms. When this directive is enabled, all characters after a question mark (?) are removed. For example, a URI like this:
    http://auto.search.msn.com/response.asp?MT=www.kimo.com.yw&srch=3&prov=&utf8
is logged like this:
    http://auto.search.msn.com/response.asp?
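If you have older logs that were written with strip_query_terms disabled, the same truncation can be applied after the fact. This is a small sketch of the behavior described above, in which the question mark itself is kept; the function name strip_query is hypothetical.

```python
def strip_query(uri: str) -> str:
    """Remove everything after the first question mark, keeping the '?' itself."""
    head, question_mark, _ = uri.partition("?")
    return head + question_mark  # question_mark is "" when the URI has no query
```

URIs without a question mark pass through unchanged.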
13.2.4.9 uri_whitespace
Earlier, I mentioned the problem with whitespace appearing in some URIs. The RFCs state that URIs must not contain whitespace, but in reality it happens all too often. The uri_whitespace directive dictates how Squid should handle such cases. The allowed settings are: strip (default), deny, allow, encode, and chop. Of these, strip, encode, and chop ensure that the URI field doesn't contain any whitespace (which would otherwise add extra fields to access.log).
The allow setting allows the request to pass through Squid unmodified. It is likely to cause trouble for redirectors and log file parsers. The deny setting, on the other hand, causes Squid to deny the request. The user receives an error message, but the request is still written to access.log with the whitespace characters.
If you set it to encode, Squid changes the whitespace characters to their RFC 1738 equivalents. This is probably what the user-agent should have done in the first place. The chop setting causes Squid to cut off the URI at the first whitespace character.
The default setting is strip, which makes Squid remove the whitespace characters from the URI. It ensures that your log-file parsers and redirectors will be happy, but it might break certain things, such as improperly encoded search engine queries.
13.2.4.10 buffered_logs
By default, Squid disables buffering for the cache.log file, which allows you to run tail -f and watch log file entries appear in real time. If you think this causes unnecessary overhead, you can enable buffering:
    buffered_logs on
However, it probably doesn't matter unless you are running Squid with full debugging. Note that this option affects only cache.log. The others always use unbuffered writes.
13.2.5 access.log Analysis Tools
The access.log file contains a wealth of information, much more than you can see by just browsing through it. In order to get the big picture view, you'll need to use a third-party log-file analysis package. You can find a long list of them linked from the Squid web page, or by going directly to http://www.squid-cache.org/Scripts/.
One of the most popular tools is Calamaris, a Perl script that parses the log file and generates either text or HTML-based reports. It provides a breakdown of traffic by request method, client IP address, origin server domain name, content types, filename extensions, reply size, and more. Calamaris also reports on ICP query traffic and even understands log files from other caching products. Check it out by visiting http://calamaris.cord.de/.
Squeezer, and its derivative, Squeezer2, are Squid-specific analysis tools. They provide many statistics that can help you understand Squid's performance, especially when you have neighbors. Both generate HTML pages as output. Visit the Logfile Analysis page on the squid-cache.org site for links to these programs.
Webalyzer is another good utility. It is designed to be fast and produces HTML pages with tables and bar charts. It was originally designed for origin server access logs. Although it can parse Squid's logs, it doesn't report on such things as hit ratios and response times. It also uses some terms differently than I do. For example, Webalyzer calls any request a "hit," which isn't the same as a cache hit. It also makes a distinction between "pages" and "files." For more information, visit the Webalyzer home page at http://www.mrunix.net/webalyzer/.
13.3 store.log
The store.log is a record of Squid's decisions to store and remove objects from the cache. Squid creates an entry for each object it stores in the cache, each uncachable object, and each object that is removed by the replacement policy. The log file covers both in-memory and on-disk caches. The store.log provides the following information, which you can't get from access.log:
● Whether or not a particular response was cached.
● The file number for cached objects. For UFS-based storage schemes, you can convert this to a pathname and examine the contents of the cache file.
● The response's content length: the Content-Length value, and the actual body length.
● Values for the Date, Last-Modified, and Expires headers.
● The response's cache key (i.e., MD5 hash value).
As you can see, this is mostly low-level information you won't need on a daily basis. Unless you do sophisticated analyses, or wish to debug a problem, you can probably get by without the store.log. You can disable it with a special setting:
    cache_store_log none
As with other log files, Squid appends new store.log entries to the end of the file. A given URI may appear in the file more than once. For example, an object may be cached, then released, then cached again. Only the most recent entry reflects the object's current status. The store.log is text-based and looks something like this:
    1067299212.411 RELEASE -1 FFFFFFFF A5964B32245AC98592D83F9B6EA10B8D 206 1067299212 1064287906 -1 application/octet-stream 6840/6840 GET http://download.windowsupdate.com/msdownload/update/v3-19990518/cab...
    1067299212.422 SWAPOUT 02 0005FD5F 6F34570785CACABC8DD01ABA5D73B392 200 1067299210 1057899600 -1 image/gif 1125/1125 GET http://forum.topsportsnet.com/shfimages/nav_members1.gif
    1067299212.641 RELEASE -1 FFFFFFFF B0616CB4B7280F67672A40647DD08474 200 1067299212 -1 -1 text/html -1/67191 GET http://www.tlava.com/
    1067299212.671 RELEASE -1 FFFFFFFF 5ECD93934257594825659B596D9444BC 200 1067299023 1034873897 1067299023 image/jpeg 3386/3386 GET http://ebiz0.ipixmedia.com/abc/ebiz/_EBIZ_3922eabf57d44e2a4c3e7cd234a...
    1067299212.786 RELEASE -1 FFFFFFFF B388F7B766B307ADEC044A4099946A21 200 1067297755 -1 -1 text/html -1/566 GET http://www.evenflowrocks.com/pages/100303pic15.cfm
    1067299212.837 RELEASE -1 FFFFFFFF ABC862C7107F3B7E9FC2D7CA01C8E6A1 304 1067299212 -1 1067299212 unknown -1/0 GET http://ebiz0.ipixmedia.com/abc/ebiz/_EBIZ_3922eabf57d44e2a4c3e7cd234a...
    1067299212.859 RELEASE -1 FFFFFFFF 5ED2726D4A3AD83CACC8A01CFDD6082B 304 1066940882 1065063803 -1 application/x-javascript -1/0 GET http://www.bellsouth.com/scripts/header_footer.js
Each entry contains the following 13 fields:
1: timestamp
The timestamp when the event took place, expressed as seconds since the Unix epoch with millisecond resolution.
2: action
The action taken on the object. This field has three possible values: SWAPOUT, RELEASE, and SO_FAIL.
❍ A SWAPOUT occurs when Squid successfully completes saving the object data to disk. Some objects, such as those that are negatively cached, are kept in memory, but not on disk. Squid doesn't make a store.log entry for them.
❍ A SO_FAIL entry indicates that Squid could not completely save the object to disk. Most likely it means that the storage scheme implementation refused to open a new disk file for writing.
❍ A RELEASE occurs when Squid removes an object from the cache, or decides that the response isn't cachable in the first place.
3: directory number
The directory number is a 7-bit index to the list of cache directories that's written as a decimal number. For objects that aren't saved to disk, this field contains the value -1.
4: file number
The file number is a 25-bit identifier used internally by Squid. It is written as an 8-character hexadecimal number. The UFS-based storage schemes have an algorithm for mapping file numbers to pathnames (see Section 13.3.1). Objects that aren't saved to disk don't have a valid file number. For these, the file number field contains FFFFFFFF. This value appears only for RELEASE and SO_FAIL entries.
5: cache key
Squid uses MD5 hash values for the primary index to locate cached objects. The key is based on the request method, URI, and possibly other information. You might be able to use the cache key to match up store.log entries. Note, however, that an object's cache key can change. This happens, for example, whenever Squid logs a TCP_REFRESH_MISS request in access.log. It looks like this:
    1065837334.045 SWAPOUT ... 554BACBD2CB2A0C38FF9BF4B2239A9E5 ... http://blah
    1066031047.925 RELEASE ... 92AE17121926106EB12FA8054064CABA ... http://blah
    1066031048.074 SWAPOUT ... 554BACBD2CB2A0C38FF9BF4B2239A9E5 ... http://blah
So what's going on? The object is originally cached under one key (554B...). Some time later, Squid receives another request for the object and forwards a validation request to the origin server. When the response comes back with new content, Squid changes the cache key of the old object (to 92AE...) so that it can give the new object the correct key (554B...). The old object is then removed, and the new object is saved to disk.
6: status code
This field shows the HTTP status code of the response, just like access.log. See Table 13-1 for a list of status codes.
7: date
The value of the Date header in the HTTP response, expressed as seconds since the Unix epoch. The value -1 indicates an unparseable Date header, and -2 means the header was entirely absent.
8: last-modified
The value of the Last-Modified header in the HTTP response, expressed as seconds since the Unix epoch. The value -1 indicates an unparseable Last-Modified header, and -2 means the header was entirely absent.
9: expires
The value of the Expires header in the HTTP response, expressed as seconds since the Unix epoch. The value -1 indicates an unparseable Expires header, and -2 means the header was entirely absent.
10: content-type
The value of the Content-Type header in the HTTP response, excluding any media-type parameters. Squid inserts the value unknown if the Content-Type is missing.
11: content-length/size
This field contains two numbers, separated by a slash. The first is the value of the Content-Length header. A -1 indicates the Content-Length header is absent. The second is the actual size of the HTTP message body. You can use these two numbers to identify partially received responses and origin servers that incorrectly calculate the content length. In most cases, the two numbers are the same.
12: method
The HTTP request method for the object, as in access.log.
13: URI
The final field is the requested URI, as in access.log. This field also has the whitespace problem mentioned in the previous section. However, it is less worrisome here because you can safely ignore any extra fields.
For many of the RELEASE entries, you'll see question marks (?) for the last eight fields. This is because most of those field values come from what Squid calls the MemObject structure. This structure is present only for objects that have just been received, or are being stored entirely in memory. Most of the objects in Squid's cache don't have a MemObject because they exist only on disk. For these, Squid puts question marks in the fields with missing information.
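The 13 fields can be split with a parser along the following lines. This is a minimal sketch rather than an official tool: StoreEntry and parse_store_line are hypothetical names. It keeps the URI as the remainder of the line, so extra whitespace-separated tokens stay attached to it, and it maps any numeric field that doesn't parse (such as the question marks in some RELEASE entries) to None.

```python
from typing import NamedTuple, Optional


def _int_or_none(text: str) -> Optional[int]:
    """Convert a numeric field, returning None for placeholders such as '?'."""
    try:
        return int(text)
    except ValueError:
        return None


class StoreEntry(NamedTuple):
    timestamp: float
    action: str                     # SWAPOUT, RELEASE, or SO_FAIL
    dir_number: int                 # -1 when the object isn't on disk
    file_number: int                # parsed from 8 hexadecimal digits
    cache_key: str                  # MD5 hash as logged
    status: Optional[int]
    date: Optional[int]
    last_modified: Optional[int]
    expires: Optional[int]
    content_type: str
    content_length: Optional[int]   # Content-Length header, -1 if absent
    body_size: Optional[int]        # actual size of the message body
    method: str
    uri: str


def parse_store_line(line: str) -> Optional[StoreEntry]:
    parts = line.split(None, 12)    # at most 13 fields; the URI keeps the rest
    if len(parts) != 13:
        return None
    declared, _, actual = parts[10].partition("/")
    return StoreEntry(
        float(parts[0]), parts[1], int(parts[2]), int(parts[3], 16), parts[4],
        _int_or_none(parts[5]), _int_or_none(parts[6]),
        _int_or_none(parts[7]), _int_or_none(parts[8]),
        parts[9], _int_or_none(declared), _int_or_none(actual),
        parts[11], parts[12].rstrip(),
    )
```

Comparing content_length with body_size on each entry is a quick way to spot the partially received responses mentioned under field 11.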
13.3.1 Mapping File Numbers to Pathnames
If you find you need to examine a particular cache file, you can, with some effort, turn a file number into a pathname. You'll also need the directory number, and L1 and L2 values. In the Squid source code, the storeUfsDirFullPath() function does this. You can find it in the src/fs/ufs/store_dir_ufs.c file. This short Perl program mimics the current algorithm:
    #!/usr/bin/perl
    # L1 and L2 are the first- and second-level directory counts.
    $L1 = 16;
    $L2 = 256;
    while (<>) {
        chomp;                  # drop the trailing newline
        $filn = hex($_);        # file number, as logged in hexadecimal
        printf("%02X/%02X/%08X\n",
            (($filn / $L2) / $L2) % $L1,
            ($filn / $L2) % $L2,
            $filn);
    }
And here's how you can use it:
    % echo 000DCD06 | ./fileno-to-pathname.pl
    0D/CD/000DCD06
To find this file in the Nth cache_dir, simply go to the corresponding directory and list or view the file:
    % cd /cache2
    % ls -l 0D/CD/000DCD06
    -rw-------  1 squid  squid  391 Jun  3 12:40 0D/CD/000DCD06
    % less 0D/CD/000DCD06
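If the rest of your log analysis is written in Python, the same mapping can be expressed as a small function. This is a straightforward port of the Perl program above, not a copy of Squid's C code. The name fileno_to_path is hypothetical, and the default L1 and L2 values are simply the ones used in the Perl program.

```python
def fileno_to_path(fileno_hex: str, l1: int = 16, l2: int = 256) -> str:
    """Map a hexadecimal store.log file number to a relative UFS pathname."""
    filn = int(fileno_hex, 16)
    first_level = (filn // l2 // l2) % l1   # top-level directory
    second_level = (filn // l2) % l2        # second-level directory
    return "%02X/%02X/%08X" % (first_level, second_level, filn)
```

The result is relative to the cache directory identified by the entry's directory number.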
13.4 referer.log
The optional referer.log contains Referer header values from client requests. To use this feature, you must run ./configure with the --enable-referer-log option. You must also enter a pathname for the referer_log directive. For example:
    referer_log /usr/local/squid/var/logs/referer.log
Set the filename to none if you want to disable referer logging.
The Referer header normally contains the URI from which the request was obtained (see Section 14.36 of RFC 2616). For example, when a web browser issues a request for an embedded image, the Referer header is set to the URI of the (HTML) page containing the image. It is also set when you click on an HTML link. Some web site operators use Referer values to find so-called dead links. You may find referer.log particularly useful if you use Squid as a surrogate.
The referer.log has a simple format, with only four fields. Here are a few examples:
    1068047502.377 3.0.168.206 http://www.amazon.com/exec/obidos/search-handle-form/002-7230223-8205634 http://www.amazon.com/exec/obidos/ASIN/0596001622/qid=1068047396/sr=2-1/...
    1068047503.109 3.0.168.206 http://www.amazon.com/exec/obidos/ASIN/0596001622/qid=1068047396/sr=2-1/... http://g-images.amazon.com/images/G/01/gourmet/gourmet-segway.gif
    1068047503.196 3.0.168.206 http://www.amazon.com/exec/obidos/ASIN/0596001622/qid=1068047396/sr=2-1/... http://g-images.amazon.com/images/G/01/marketing/cross-shop/arnold/appar...
    1068047503.198 3.0.168.206 http://www.amazon.com/exec/obidos/ASIN/0596001622/qid=1068047396/sr=2-1/... http://g-images.amazon.com/images/G/01/marketing/cross-shop/arnold/appar...
    1068047503.825 3.0.168.206 http://www.amazon.com/exec/obidos/ASIN/0596001622/qid=1068047396/sr=2-1/... http://images.amazon.com/images/P/B00005R8BC.01.TZZZZZZZ.jpg
    1068047503.842 3.0.168.206 http://www.amazon.com/exec/obidos/ASIN/0596001622/qid=1068047396/sr=2-1/... http://images.amazon.com/images/P/0596001622.01._PE_PI_SCMZZZZZZZ_.jpg
Note that requests that lack a Referer header aren't logged. The four fields are as follows:
1: timestamp
The time of the request, expressed as the number of seconds since the Unix epoch with millisecond resolution. Note that, unlike access.log, a referer.log entry is made as soon as Squid receives the complete request. Thus, the referer.log entry occurs before the access.log entry, which waits for the end of the response.

2: client address
The same as the client address in access.log. The log_fqdn and client_netmask directives affect this log file as well.

3: referer
The value of the Referer header from the client's request. Note that the referer value might contain whitespace (or any other) characters. Squid doesn't encode the value before writing to referer.log.

4: URI
The URI that the client is requesting. It matches the URI in access.log.
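Because the referer value isn't encoded, a naive whitespace split can yield more than four fields. The following Python sketch shows one way to cope. It assumes that fields are separated by single spaces and that the requested URI itself contains no spaces; the function name parse_referer_line is hypothetical, not part of Squid.

```python
def parse_referer_line(line):
    """Split one referer.log line into (timestamp, client, referer, uri).

    Assumption: the first two fields and the final URI contain no spaces,
    so any extra whitespace must belong to the unencoded referer value.
    """
    timestamp, client, rest = line.rstrip("\n").split(" ", 2)
    # Peel the requested URI off the right-hand side; whatever remains
    # in the middle is the referer, spaces and all.
    referer, uri = rest.rsplit(" ", 1)
    return float(timestamp), client, referer, uri


if __name__ == "__main__":
    sample = ("1068047503.109 3.0.168.206 "
              "http://www.example.com/some page "
              "http://images.example.com/logo.gif")
    print(parse_referer_line(sample))
```

Splitting from both ends keeps the parser independent of how many spaces the referer happens to contain.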
13.5 useragent.log

The optional useragent.log contains User-Agent header values from client requests. To use this feature, you must supply the --enable-useragent-log option when running ./configure. You also must enter a pathname for the useragent_log directive. For example:

useragent_log /usr/local/squid/var/logs/useragent.log

The User-Agent header normally contains a description of the agent that made the request. In most cases, the description is simply a list of product names with optional version information. You should be aware that applications can easily provide false user-agent information. Modern user-agents provide a way to customize the description. Even Squid can alter the User-Agent header in forwarded requests.

The useragent.log format is relatively simple. It looks like this:

3.0.168.206 [05/Nov/2003:08:51:43 -0700] "Mozilla/5.0 (compatible; Konqueror/3; FreeBSD)"
3.0.168.207 [05/Nov/2003:08:52:18 -0700] "Opera/7.21 (X11; FreeBSD i386; U) [en]"
4.241.144.204 [05/Nov/2003:08:55:11 -0700] "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/103u (KHTM..."
3.0.168.206 [05/Nov/2003:08:51:43 -0700] "Java1.3.1_01"
64.68.82.28 [05/Nov/2003:08:52:50 -0700] "Googlebot/2.1 (http://www.googlebot.com/bot.html)"
3.0.168.205 [05/Nov/2003:08:52:50 -0700] "WebZIP/4.1 (http://www.spidersoft.com)"
4.241.144.201 [05/Nov/2003:08:52:50 -0700] "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; Hotbar 3.0)"
3.0.168.206 [05/Nov/2003:08:54:40 -0700] "Bookmark Renewal Check Agent [http://www.bookmark.ne.jp/] (Version 2.0..."

Unlike the other log files, it has just three fields:
1: client address
The same as the client address in access.log. The log_fqdn and client_netmask directives affect this log file as well.

2: timestamp
Unlike the other log files, which represent the time as seconds since the Unix epoch, this one uses a human-readable format. It is the HTTP common log-file format timestamp, which looks like this:

[10/Jun/2003:22:38:36 -0600]

Note that the square brackets delimit the timestamp, which includes a space character. Also note that, like referer.log, these entries are created as soon as Squid receives the complete request.

3: user-agent
The value of the User-Agent header. These strings almost always contain whitespace. Squid doesn't encode User-Agent values before writing them in this log file.
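Since both the timestamp and the user-agent string contain spaces, it is easier to parse this file with a pattern than with a whitespace split. Here is a minimal Python sketch, assuming each entry fits on one line and the user-agent is wrapped in double quotes as in the samples above. The names LINE_RE and parse_useragent_line are hypothetical.

```python
import re
from datetime import datetime

# client address, [bracketed timestamp], "quoted user-agent"
LINE_RE = re.compile(r'^(\S+) \[([^\]]+)\] "(.*)"$')


def parse_useragent_line(line):
    """Return (client, timestamp, user_agent), or None if the line doesn't match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    client, stamp, agent = match.groups()
    # Common log-file format timestamp, e.g. 10/Jun/2003:22:38:36 -0600
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return client, when, agent


if __name__ == "__main__":
    sample = ('3.0.168.206 [05/Nov/2003:08:51:43 -0700] '
              '"Mozilla/5.0 (compatible; Konqueror/3; FreeBSD)"')
    print(parse_useragent_line(sample))
```

The greedy match on the quoted part means a user-agent that itself contains a double quote is still captured up to the last quote on the line.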
13.6 swap.state

A swap.state file is a journal of objects that have been added to, and removed from, a cache directory. Each cache_dir has its own swap.state file. When Squid starts up, it reads the swap.state files to rebuild its in-memory indexes of cached objects. These files are a relatively critical part of Squid's operation.

By default, each swap.state file is located in its corresponding cache directory. Thus, each state file automatically stays with each cache_dir. This is important if you ever decide to reorder your cache_dir lines or if you remove one or more from the list. If you prefer to put them in a different location, you can use the cache_swap_log directive:

cache_swap_log /usr/local/squid/var/logs/swap.state

In this case, Squid creates a swap.state file for each directory by appending a numeric suffix. For example, if you have four cache directories, Squid creates the following:

/usr/local/squid/var/logs/swap.state.00
/usr/local/squid/var/logs/swap.state.01
/usr/local/squid/var/logs/swap.state.02
/usr/local/squid/var/logs/swap.state.03

In this situation, if you add, remove, or rearrange cache_dir lines, you may need to rename the swap.state files manually to keep everything consistent.

Technically, the swap.state format is storage scheme-dependent. However, all storage schemes use the same format in the current versions of Squid. The swap.state file uses a fixed-size (48-byte) binary format. Fields are written in host-byte order and are thus not necessarily portable between different operating systems. Table 13-3 describes the fields of a swap.state entry.
Table 13-3. swap.state entry fields

Name            Size, in bytes  Description
op              1               Operation on the entry: added (1) or deleted (2).
file number     4               Same as the fourth field of store.log, except it is stored in binary.
timestamp       4               A timestamp corresponding to the time when the response was generated or last validated. Taken from the Date header for responses that have one. Stored as the number of seconds since the Unix epoch.
lastref         4               A timestamp corresponding to the most recent access to the object.
expires         4               The object's expiration time, taken from an Expires header or Cache-Control max-age directive.
last-modified   4               The object's Last-Modified value.
swap file size  4               The amount of space the object occupies on disk. This includes HTTP headers and other Squid-specific metainformation.
refcount        2               The number of times this object has been requested.
flags           2               Various internal flags used by Squid.
key             16              The MD5 hash of the corresponding URI. Same as the key in store.log, except this one is stored in binary.
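The field sizes in Table 13-3 add up to 45 bytes, while each entry occupies 48. The Python sketch below reads entries under the assumption that the remaining 3 bytes are alignment padding after the 1-byte op field, as a C compiler would typically insert before a 4-byte integer. That padding position, the signedness of each field, and the names ENTRY, FIELDS, parse_entry, and read_swap_state are all assumptions for illustration; verify them against your own Squid build before relying on the output.

```python
import struct

# "=" selects native byte order with standard sizes and no implicit alignment,
# so the assumed 3 padding bytes are spelled out explicitly as "3x".
ENTRY = struct.Struct("=b3xiiiiiiHH16s")
assert ENTRY.size == 48

FIELDS = ("op", "file_number", "timestamp", "lastref", "expires",
          "last_modified", "swap_file_size", "refcount", "flags", "key")


def parse_entry(buf):
    """Decode one 48-byte swap.state entry into a dict keyed by field name."""
    return dict(zip(FIELDS, ENTRY.unpack(buf)))


def read_swap_state(path):
    """Yield decoded entries from a swap.state file, stopping at a short read."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(ENTRY.size)
            if len(chunk) < ENTRY.size:
                break
            yield parse_entry(chunk)


if __name__ == "__main__":
    # Build a synthetic "added" entry rather than touching a live cache.
    raw = ENTRY.pack(1, 0x000DCD06, 1068047502, 1068047503, -1,
                     1068000000, 391, 1, 0, bytes(16))
    print(parse_entry(raw))
```

Because the fields are written in host-byte order, a sketch like this is only meaningful on the same kind of machine that wrote the file.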
13.7 Rotating the Log Files

Squid always appends new entries to its log files. If your cache is busy, some of these files can become very large after a few days. Some operating systems even place limits on the size of a file (e.g., 2 GB) and return an error for writes beyond that size. To keep your log files manageable, and Squid happy, you must regularly rotate them.

Squid has a built-in feature for rotating log files. You can invoke it with the squid -k rotate command. You then tell Squid how many old copies of each file to keep with the logfile_rotate directive. For example, if you set it to 7, you'll have eight versions of each log file: the current file and seven old ones.

Old log files are renamed with numeric extensions. For example, when you execute a rotation, Squid renames log.6 to log.7, then log.5 to log.6, and so on. The current log becomes log.0, and Squid creates a new, empty file named log.

Each time you execute squid -k rotate, Squid rotates the following files: cache.log, access.log, store.log, useragent.log (if enabled), and referer.log (if enabled). Squid also creates up-to-date versions of the swap.state files. Note, however, that swap.state isn't archived with numeric extensions.

Squid doesn't rotate the log files automatically. The best way to make it happen is with a daily cron job. For example:

0 0 * * * /usr/local/squid/sbin/squid -k rotate

If you'd rather write your own scripts to manage the log files, Squid has a special mode that you'll find useful. Simply set the logfile_rotate directive to 0. Then, when you run squid -k rotate, Squid simply closes the current log files and opens new ones. This is very useful when the operating system allows you to rename files opened by another process. The following shell script illustrates the idea:

#!/bin/sh
set -e

yesterday_secs=`perl -e 'print time -43200'`
yesterday_date=`date -r $yesterday_secs +%Y%m%d`

cd /usr/local/squid/var/logs

# rename the current log file without interrupting the logging process
mv access.log access.log.$yesterday_date

# tell Squid to close the current logs and open new ones
/usr/local/squid/sbin/squid -k rotate

# give Squid some time to finish writing swap.state files
sleep 60

mv access.log.$yesterday_date /archive/location/
gzip -9 /archive/location/access.log.$yesterday_date
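The numeric-extension renaming described earlier in this section can be sketched in a few lines of Python. This is an illustration of the idea, not Squid's actual code: it assumes the old copies carry extensions 0 through keep-1, and the function name rotate is hypothetical. With keep set to 0 it does no renaming at all, mirroring the special mode used by the shell script.

```python
import os


def rotate(path, keep):
    """Shift path.N to path.N+1, move path to path.0, and start a new empty file.

    Assumption: `keep` old copies are retained, numbered 0 .. keep-1.
    """
    if keep > 0:
        # Work from the oldest copy down so nothing is overwritten too early.
        for index in range(keep - 1, 0, -1):
            newer = f"{path}.{index - 1}"
            older = f"{path}.{index}"
            if os.path.exists(newer):
                os.replace(newer, older)
        if os.path.exists(path):
            os.replace(path, f"{path}.0")
    # Make sure a current log file exists; with keep == 0 this leaves
    # any existing file untouched.
    open(path, "a").close()


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "access.log")
        with open(log, "w") as handle:
            handle.write("current\n")
        rotate(log, 7)
        print(sorted(os.listdir(tmp)))
```

Renaming from the highest number downward is the important design choice: going the other way would overwrite each older copy before it had been moved.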
13.8 Privacy and Security

Squid's log files, especially access.log, contain a record of users' activities and, hence, are subject to privacy concerns. As the Squid administrator, you should take every precaution to keep the log files safe and secure. One of the best ways to do that is to limit the number of people who have access to the system on which Squid runs. If that isn't possible, carefully examine the file and directory permissions to make sure the log files can't be viewed by untrusted or unauthorized users.

You can also help protect your users' privacy by taking advantage of the client_netmask and strip_query_terms directives. The former makes it harder to identify individual users in the access.log; the latter removes URI query terms that may contain personal information. See Section 13.2.4 for more information.

You may also want to develop a policy for keeping old log files. Obviously access.log helps keep users accountable for their activities, but how far back would you ever need to go searching for something? A week? A year? What would you do if presented with a court order to hand over your log files for the last three months?

If you like to keep historical data for a long time, perhaps you can make the log files anonymous or somehow reduce the dataset. If you are interested only in which URIs were accessed, but not by whom, you can extract only that field from access.log. This not only makes the file smaller, it also reduces the risk of a privacy violation. Another technique is to randomize the client IP addresses. In other words, create a filter that maps real IP addresses to fake ones, such that the same real address is always changed to the same fake address. If you are using RFC 1413 identification or HTTP authentication, consider making those fields anonymous as well.
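One way to build such a filter is with a keyed hash, so that the mapping is repeatable but hard to reverse without the key. The Python sketch below is only an illustration under stated assumptions: it handles IPv4 addresses given as text, the secret key is something you generate and guard yourself, and the names pseudonymize_ip and rewrite_field are hypothetical. Note that two real addresses can occasionally map to the same fake one, since only 32 bits of the digest are kept.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ip(real_ip, secret):
    """Map a real IPv4 address to a fake one; the same input always gives the same output."""
    digest = hmac.new(secret, real_ip.encode("ascii"), hashlib.sha256).digest()
    # Use the first four digest bytes as the packed form of the fake address.
    return str(ipaddress.IPv4Address(digest[:4]))


def rewrite_field(line, field_index, secret):
    """Replace one whitespace-separated field (0-based) of a log line with its pseudonym."""
    fields = line.rstrip("\n").split(" ")
    fields[field_index] = pseudonymize_ip(fields[field_index], secret)
    return " ".join(fields)


if __name__ == "__main__":
    key = b"replace-with-a-long-random-secret"
    # In referer.log the client address is the second field (index 1).
    print(rewrite_field("1068047502.377 3.0.168.206 http://a/ http://b/", 1, key))
```

Keeping the key secret matters: with an unkeyed hash, anyone could hash all four billion IPv4 addresses and reverse the mapping.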
13.9 Exercises

● Configure Squid so that it doesn't create any log files, except for the swap.state file(s).
● Write a simple Perl or awk script to calculate your cache hit ratio from access.log.
● How does an "access denied" response appear in the access.log?
● Does store.log have the same number of, more, or fewer, entries than access.log?
● Take a file number from store.log and find the corresponding file in the disk cache. Examine the file and make sure you've found the correct response.
● Develop and implement a policy for archiving old cache log files. Consider where and how they will be stored, for how long, and who has permission to access them.
Chapter 14. Monitoring Squid

How can you tell if Squid is performing well? Does Squid have enough memory, bandwidth, and disk space? When the Internet seems slow, is it Squid's fault or a problem somewhere else? Is the operating system giving enough resources to Squid? Is someone trying to abuse or hack into my proxy? You can find the answers to these, and many more, questions in this chapter.

Squid provides information about itself in three different ways: cache.log messages, the cache manager, and an SNMP MIB. Squid writes various messages to cache.log as it runs. Most of these are abnormal events of one sort or another. Unfortunately, Squid isn't always smart enough to differentiate serious problems from those that can be safely ignored. Even so, cache.log is a good place to start when investigating a Squid problem.

The cache manager and SNMP interfaces allow you to query Squid for a variety of data. The cache manager, which has its own shortcomings, probably provides the most information in current versions of Squid. It has a TCP socket-based interface and tries to generate output suitable for both human and computer processing. The bulk of this chapter is devoted to explaining all the information available from the cache manager.

Squid supports SNMP as well. Unfortunately, the data available through SNMP is only a subset of the cache-manager information. Additionally, the Squid MIB has not evolved much over the years; it's essentially unchanged since its first incarnation. I'll explain how to make Squid process SNMP queries and describe all objects in the current MIB.
14.1 cache.log Warnings

This is one of the first places you should look whenever you perceive a problem with Squid. During normal operation, you'll find various warnings and informational messages that may or may not indicate a problem. I covered the mechanics of cache.log back in Section 13.1. Here, I'll go over a few of the warning messages you might see in your log file.

The high_response_time_warning directive makes Squid print a warning whenever the median response time exceeds a threshold. The value is in milliseconds, and the warning is disabled by default. If you add this line to squid.conf:

high_response_time_warning 1500

Squid will print the following warning if the median response time, measured over a 1-minute interval, exceeds 1.5 seconds:

2003/09/29 03:17:31| WARNING: Median response time is 2309 milliseconds

Before setting this directive, you should have a good idea of Squid's normal response time levels. If you set it too low, you'll get false alarms. In this particular example, it means that half of your users' requests take more than 2.3 seconds to complete. High response times may be caused by local problems, such as running out of file descriptors, or by remote problems, such as a severely congested Internet link.

The high_page_fault_warning directive is similar. It causes Squid to emit a warning if the number of page faults per minute exceeds a given value. A high page-fault rate usually indicates that the Squid process can't fit entirely in memory and must be swapped out to disk. This swapping severely impacts Squid's performance, so you should remedy the situation as soon as possible, as I'll discuss in Section 16.1.8.

Squid uses the Unix getrusage() function to get page fault counts. On some operating systems (e.g., Solaris), the page fault counter represents something besides swapping. Therefore, the high_page_fault_warning may cause false alarms on those systems.

The high_memory_warning directive is also similar to the previously mentioned warnings. In this case, it checks the size of the Squid process; if it exceeds the threshold, you get the warning in cache.log. On some operating systems, the process size can only grow and never shrink. Therefore, you'll constantly get this warning until Squid shuts down.

Process size information comes from either the mallinfo(), mstats(), or sbrk() functions. If these are unavailable on your operating system, the high_memory_warning directive won't work.

Squid has a number of other hardcoded warnings you may see in cache.log:
DNS lookup for 'neighbor.host.name' failed!
This occurs whenever Squid fails to look up the IP address for a cache neighbor. Squid refreshes the neighbor addresses every hour or so. As long as the neighbor's address is unknown, Squid doesn't send any traffic there.

Detected DEAD Sibling: neighbor.host.name/3128/3130
Squid logs this message when it believes it can't communicate with a neighbor cache. This happens, for example, when too many consecutive ICP queries go unacknowledged. See Section 10.3.2 for more information.

95% of replies from 'neighbor.host.name' are UDP_DENIED
This message indicates that a neighbor cache is refusing to answer Squid's queries. It probably means that you are sending queries to the neighbor without their permission. If they are using address-based access controls, and you have recently changed your address, they won't know about the change. Squid refuses to send any more queries to the neighbor after detecting this condition.

Probable misconfigured neighbor at 192.168.121.5
This occurs when you have an unauthorized cache client sending you ICP or HTCP queries. The best thing to do in this case is try to find out the person or organization responsible for the given address. Ask why they are querying your cache.

Forwarding loop detected for:
Recall that a forwarding loop occurs when a single request flows through Squid a second time. The request's Via header contains a list of all proxies that have seen the request. If Squid detects its own name in the Via list, it emits the forwarding loop warning and sends the request directly to the origin server. See Section 10.2 for an explanation of forwarding loops.

Closing client 192.168.121.5 connection due to lifetime timeout
The client_lifetime directive places an upper limit on the duration for a single HTTP request. Squid warns you when such a request is terminated because it may indicate someone is abusing your cache with very long-lived connections, for example, by downloading infinitely long objects.

As you can see, cache.log provides only notification of abnormal events. For periodic monitoring, you need something else. The cache manager is perhaps the best choice, even though its interface is less than perfect.
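If you do enable high_response_time_warning, you may want to pull the resulting warnings out of cache.log to see how often and how badly the threshold is exceeded. The following Python sketch assumes the message format is exactly as in the sample shown earlier in this section; the names MEDIAN_RE and median_warnings are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2003/09/29 03:17:31| WARNING: Median response time is 2309 milliseconds
MEDIAN_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\| "
    r"WARNING: Median response time is (\d+) milliseconds"
)


def median_warnings(lines):
    """Yield (timestamp, milliseconds) for each median response time warning."""
    for line in lines:
        match = MEDIAN_RE.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            yield when, int(match.group(2))


if __name__ == "__main__":
    sample = [
        "2003/09/29 03:17:31| WARNING: Median response time is 2309 milliseconds",
        "2003/09/29 03:18:02| Detected DEAD Sibling: neighbor.host.name/3128/3130",
    ]
    for when, millis in median_warnings(sample):
        print(when.isoformat(), millis)
```

Pass it an open file object for cache.log, since iterating over a file yields its lines one at a time.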
14.2 The Cache Manager

The Cache Manager is an interface to Squid for receiving information about various components. It is accessed via normal HTTP requests with a special protocol name: cache_object. A full cache manager URL looks like cache_object://cache.host.name/info. Squid provides two easy ways to access the cache manager information: the command-line squidclient program [1] or the cachemgr.cgi CGI program.

[1] In older versions of Squid, it was called just client.

The squidclient utility is a simple HTTP client, with a few special features for use with Squid. For example, you can use a shortcut to request the cache manager pages. Rather than typing a long URL like this:

% squidclient cache_object://cache.host.name/info

you can use this shorter version:

% squidclient mgr:info

squidclient is a convenient way to quickly see some of the cache manager pages. It's also useful when you need to save the cache manager output to disk for later analysis. However, some pages, such as the memory utilization table, are difficult to read in a terminal window. They are really designed to be formatted as an HTML page and viewed with your web browser. In that case, you may want to use cachemgr.cgi.

To use cachemgr.cgi, you must have an HTTP server that can execute the program. You can use an existing server or install one alongside Squid if you prefer. Keep in mind that the cache manager has only weak security (cleartext passwords). If the HTTP server is on a different host, you need to add its IP address to a cache manager access list (see Section 14.2.2). You may also want to add access controls to the HTTP server so that others can't access cachemgr.cgi.

If you use Apache, I recommend making a special cgi-bin directory so you can protect cachemgr.cgi with access controls. For example, create a new directory, and copy the binary to it:

# mkdir /usr/local/apache/squid-cgi
# cp /usr/local/squid/libexec/cachemgr.cgi /usr/local/apache/squid-cgi
# chmod 755 /usr/local/apache/squid-cgi/cachemgr.cgi

Now, add a ScriptAlias line to Apache's httpd.conf:

ScriptAlias /squid-cgi/ "/usr/local/apache/squid-cgi/"

Finally, create an .htaccess file in the squid-cgi directory that contains access controls. To allow requests from only one IP address, use something like this:

Allow from 192.168.4.2
Deny from all

Once cachemgr.cgi is installed, simply enter the appropriate URL into your web browser. For example:

http://www.server.name/squid-cgi/cachemgr.cgi

If the CGI program is working, you should see a page with four fields. See Figure 14-1 for an example. The Cache Host field contains the name of the host on which Squid is running, which is localhost by default. You can set it with the --enable-cachemgr-hostname option when running ./configure. Similarly, Cache Port contains the TCP port number on which Squid listens for requests. It's 3128 by default and can be changed with the --enable-cachemgr-port option. The Manager name and Password fields are for access to protected pages, which I'll talk about shortly.
Figure 14-1. The cachemgr.cgi login screen

After clicking on the Continue... button, you should see a list of all cache manager pages currently available. The following section describes the various pages, some of which are available only when you enable certain features at compile time.
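If neither squidclient nor cachemgr.cgi is convenient, for instance inside a monitoring script, you can request a cache manager page yourself. The Python sketch below assumes that Squid accepts a cache_object URL in an ordinary HTTP/1.0 request line on its HTTP port, as the description of "normal HTTP requests with a special protocol name" suggests. It doesn't handle password-protected pages, and the function names are hypothetical.

```python
import socket


def build_mgr_request(host, page):
    """Build the raw HTTP/1.0 request for one cache manager page."""
    return (f"GET cache_object://{host}/{page} HTTP/1.0\r\n"
            "Accept: */*\r\n"
            "\r\n").encode("ascii")


def fetch_mgr_page(host, page, port=3128, timeout=10.0):
    """Send the request to Squid and return the whole reply (headers and body) as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_mgr_request(host, page))
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")


if __name__ == "__main__":
    # Only shows the request; call fetch_mgr_page("localhost", "info")
    # on a host where Squid is running and permits manager access.
    print(build_mgr_request("cache.host.name", "info").decode("ascii"))
```

The reply includes the HTTP response headers; split on the first blank line if you want only the page body.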
14.2.1 Cache Manager Pages

This section describes the cache manager pages, in the same order in which they appear in the menu. Each section title has both the page name (for use with squidclient), followed by its description. Descriptions that contain an asterisk indicate pages that are disabled by default, unless you configure a password for them. Table 14-1 shows the table of contents and the section number for each page.
Table 14-1. Cache manager pages

Short name                  Description
leaks                       Memory Leak Tracking
mem                         Memory Utilization
cbdata                      Callback Data Registry Contents
events                      Event Queue
squidaio_counts             Async IO Function Counters
diskd                       DISKD Stats
config                      Current Squid Configuration*
comm_incoming               comm_incoming() Stats
ipcache                     IP Cache Stats and Contents
fqdncache                   FQDN Cache Stats and Contents
idns                        Internal DNS Statistics
dns                         Dnsserver Statistics
redirector                  URL Redirector Stats
basicauthenticator          Basic User Authenticator Stats
digestauthenticator         Digest User Authenticator Stats
ntlmauthenticator           NTLM User Authenticator Stats
external_acl                External ACL Stats
http_headers                HTTP Header Statistics
via_headers                 Via Request Headers
forw_headers                X-Forwarded-For Request Headers
menu                        This Cache Manager Menu
shutdown                    Shut Down the Squid Process*
offline_toggle              Toggle offline_mode Setting*
info                        General Runtime Information
filedescriptors             Process File Descriptor Allocation
objects                     All Cache Objects
vm_objects                  In-Memory and In-Transit Objects
openfd_objects              Objects with Swapout Files Open
io                          Server-Side Network read() Size Histograms
counters                    Traffic and Resource Counters
peer_select                 Peer Selection Algorithms
digest_stats                Cache Digest and ICP Blob
5min                        5 Minute Average of Counters
60min                       60 Minute Average of Counters
utilization                 Cache Utilization
histograms                  Full Histogram Counts
active_requests             Client-Side Active Requests
store_digest                Store Digest
storedir                    Store Directory Stats
store_check_cachable_stats  storeCheckCachable() Stats
store_io                    Store IO Interface Stats
pconn                       Persistent Connection Utilization Histograms
refresh                     Refresh Algorithm Statistics
delay                       Delay Pool Levels
forward                     Request Forwarding Statistics
client_list                 Cache Client List
netdb                       Network Measurement Database
asndb                       AS Number Database
carp                        CARP Information
server_list                 Peer Cache Statistics
non_peers                   List of Unknown Sites Sending ICP Messages
14.2.1.1 leaks: Memory Leak Tracking

This page is available only with the ./configure --enable-leakfinder option and is intended for developers trying to track down memory leaks. The page shows each memory pointer being tracked and where and when it was most recently referenced. See the Squid Programmer's Guide (http://www.squid-cache.org/Doc/Prog-Guide/) for more information about Squid's leak-finder feature.
14.2.1.2 mem: Memory Utilization

The memory utilization page shows a large table of numbers. Each row corresponds to a different pool of memory. The pools have names like acl_list and MemObject. Much of this information is of interest to developers only. However, a few columns are worth mentioning here.

It is important to keep in mind that this table doesn't represent all the memory allocated by Squid. Some memory allocations aren't tracked and don't appear in the table. Thus, the Total row may be much less than Squid's actual memory usage.

The impact column shows each pool's contribution to the total amount of memory allocated. Usually, the StoreEntry, MD5 digest, and LRU policy node pools take up most of the memory.

If you are a developer, you can use this page to look for memory leaks. The column labeled high (hrs) shows the amount of time elapsed since the pool reached its maximum size. A small value in this column may indicate that memory for that pool isn't being freed correctly.

You can also use this page to find out if certain features, such as netdb, the ipcache, and client_db consume too much memory. For example, the ClientInfo pool is associated with the client_db feature. The memory utilization page shows you how much memory you can save if you disable client_db in squid.conf.
14.2.1.3 cbdata: Callback Data Registry Contents

The Callback Data Registry is an internal Squid programming feature for managing memory pointers. Currently, this cache manager page doesn't provide much useful information, apart from the number of active cbdata pointers being tracked. In earlier Squid versions, the cbdata feature was implemented differently and this page provided some information to developers debugging their code.

14.2.1.4 events: Event Queue

Squid maintains an event queue for a number of tasks that must occur separately from user requests. Perhaps the most important of these is the periodic task that maintains the disk cache size. Every second or so, this task runs and looks for cache files to remove. On this page, you can see all tasks currently scheduled for execution. Most likely, you'll not find this very interesting unless you are hacking the source code.
14.2.1.5 squidaio_counts: Async IO Function Counters

This page is available only with the ./configure --enable-storeio=aufs option. It shows counters for the number of open, close, read, write, stat, and unlink requests received. For example:

ASYNC IO Counters:
Operation       # Requests
open            15318822
close           15318813
cancel          15318813
write           0
read            19237139
stat            0
unlink          2484325
check_callback  311678364
queue           0
The cancel counter is normally equal to the close counter. This is because the close function always calls the cancel function to ensure that any pending I/O operations are ignored.

The write counter is zero because this version of Squid performs writes synchronously, even for aufs.

The check_callback counter shows how many times the main Squid process has checked the done queue for completed operations.

The queue value indicates the current length of the request queue. Normally, the queue length should be less than the number of threads x 5. If you repeatedly observe a queue length larger than this, you may be pushing Squid too hard. Adding more threads may help, but only to a certain point.
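The queue-length rule of thumb is easy to check from a script. The Python sketch below assumes that each counter appears on its own line as a name followed by a number, as in the example above, and that you already know how many aufs threads your Squid uses; the thread count is supplied by the caller, not read from the page. Both function names are hypothetical.

```python
def parse_aio_counters(text):
    """Collect 'name value' lines from the squidaio_counts page into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the title and header lines, which don't end in a plain number.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def queue_too_long(counters, threads):
    """Return True if the request queue is longer than the number of threads x 5."""
    return counters.get("queue", 0) > threads * 5


if __name__ == "__main__":
    page = """ASYNC IO Counters:
Operation       # Requests
open            15318822
close           15318813
queue           0
"""
    counters = parse_aio_counters(page)
    print(counters, queue_too_long(counters, threads=16))
```

A single high reading isn't cause for alarm; as noted above, it is repeated observations of a long queue that suggest Squid is being pushed too hard.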
14.2.1.6 diskd: DISKD Stats

This page is available only with the ./configure --enable-storeio=diskd option. It provides various statistics relating to the diskd storage scheme.

The sent_count and recv_count lines are counters for the number of I/O requests sent between Squid and the group of diskd processes. The two numbers should be very close to each other and could possibly be equal. The difference indicates how many requests are currently outstanding.

The max_away value indicates the largest number of outstanding requests. Similarly, the max_shmuse counter indicates the maximum number of shared memory blocks in use at once. These two values are reset (to zero) each time you request this page. Thus, if you wait longer between requests for this page, these maximum counters are likely to be larger.

The open_fail_queue_len counter indicates the number of times that the diskd code decided to return failure in response to a request to open a file because the message queue exceeded its configured limit. In other words, this is the number of times a diskd queue reached the Q1 limit. Similarly, block_queue_len shows how many times the Q2 limit has been reached. See the descriptions of Q1 and Q2 in Section 8.5.1.

The diskd page also shows how many requests Squid sent to the diskd processes for each of the six I/O operations: open, create, close, unlink, read, and write. It also shows how many times each operation succeeded or failed. Note that these counters are incremented only for requests sent. The open_fail_queue_len check occurs earlier, and in that case, Squid doesn't send a request to a diskd process.
14.2.1.7 config: Current Squid Configuration*

This option dumps Squid's current configuration in the squid.conf format. Thus, if you ever accidentally remove the configuration file, you can recover it from the running Squid process. By saving the output to a file, you can also compare (e.g., with the diff command) the running configuration to the saved configuration. Note, however, that comments and blank lines aren't preserved.

This option reveals potentially sensitive information, so it's available only with a password. You must add a cache manager password for the config option with the cachemgr_passwd directive. See Section 14.2.2 for specifics. Additionally, these cache manager passwords aren't displayed in this output.
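Because the dump drops comments and blank lines, a plain diff against your squid.conf tends to be noisy. The Python sketch below strips comments and blank lines and collapses whitespace on both sides before comparing. It is a rough aid under stated assumptions: it treats only whole-line comments as comments, and it makes no attempt to account for default values or reordered directives that may appear in the running dump. The function names are hypothetical.

```python
import difflib


def normalize(conf_text):
    """Drop blank lines and whole-line comments; collapse runs of whitespace."""
    lines = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        lines.append(" ".join(line.split()))
    return lines


def config_diff(saved_text, running_text):
    """Return unified-diff lines between the saved file and the running dump."""
    return list(difflib.unified_diff(
        normalize(saved_text), normalize(running_text),
        fromfile="squid.conf", tofile="mgr:config", lineterm=""))


if __name__ == "__main__":
    saved = "# my cache\nhttp_port 3128\n\nlogfile_rotate   7\n"
    running = "http_port 3128\nlogfile_rotate 10\n"
    print("\n".join(config_diff(saved, running)))
```

An empty result means the two configurations agree once comments and spacing are ignored.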
14.2.1.8 comm_incoming: comm_incoming() Stats

This page provides low-level network I/O information to developers and Squid wizards. The loop that checks for activity on file descriptors is called comm_poll(). Over the years, this function has become increasingly complicated in order to improve Squid's performance. One of those performance improvements relates to how often Squid checks certain network sockets relative to the others.

For example, the incoming HTTP socket is where Squid accepts new client connections. This socket tends to be busier than a normal data socket because each new connection comes through the incoming socket. To provide good performance, Squid makes an extra effort to check the incoming socket more frequently than the others.

At the top of the comm_incoming page, you'll see three incoming interval numbers: one each for ICP, DNS, and HTTP. The interval is the number of normal I/O events that Squid handles before checking the incoming socket again. For example, if incoming_dns_interval is set to 140, Squid checks the incoming DNS socket after 140 I/Os on normal connections. Unless your Squid is very busy, you'll probably see 256 for all incoming intervals.

The page also contains three histograms that show how many events occur for each incoming function call. Normally, the majority of the histogram counts occur in the low values. In other words, functions such as comm_select_http_incoming() usually handle between one and four events.
14.2.1.9 ipcache: IP Cache Stats and Contents

The IP cache contains cached results of hostname-to-address lookups. This cache manager page displays quite a lot of information. At the top of this page you'll see a handful of statistics like these:

    IPcache Entries: 10034
    IPcache Requests: 1066445
    IPcache Hits: 817880
    IPcache Negative Hits: 6846
    IPcache Misses: 200497

In this example, you can see that the IP cache contains slightly more than 10,000 entries (hostnames). Since Squid was started, there have been 1,066,445 name-to-address requests, 817,880 of which were cache hits. This is a cache hit ratio of 77%. An IP cache negative hit occurs when Squid receives a subsequent request for a hostname that it recently failed to resolve. Rather than retry the DNS lookup immediately, Squid assumes it will fail again and returns an error message to the user.

Following these brief statistics, you'll see a long list of the IP cache contents. For each hostname in the cache, Squid prints six fields:

● The hostname itself
● Flags: N for negatively cached entries and H if the addresses came from the local hosts file, rather than the DNS
● The number of seconds since the hostname was last requested or used
● The number of seconds until the cached entry expires
● The number of IP addresses known for the host, and, in parentheses, the number of BAD addresses
● A list of IP addresses and whether each is OK or BAD

Here is a short sample (formatted to fit the page):

    Hostname             Flg  lstref     TTL N
    ads.x10.com                    9     110 1( 0) 63.211.210.20-OK
    us.rd.yahoo.com              640    -340 4( 0) 216.136.232.150-OK
                                                   216.136.232.147-OK
                                                   216.136.232.149-OK
                                                   216.136.232.148-OK
    www.movielodge.com          7143   -2161 1( 0) 66.250.223.36-OK
    shell.windows.com          10865   -7447 2( 1) 207.46.226.48-BAD
                                                   207.46.248.237-OK
    www.surf3.net             126810  -40415 1( 0) 212.74.112.95-OK
The list is sorted by the time since last reference. Recently referenced names are at the top of the list, and unused (about to be removed) names are at the bottom.

IP addresses are marked OK by default. An address is marked BAD when Squid receives an error or timeout during a TCP connection attempt. Subsequent IP cache requests don't return BAD addresses. If all the host's addresses become BAD, Squid resets them all back to OK.
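The 77% figure quoted for the sample can be reproduced from the counters at the top of the page. This is a small sketch, assuming the summary lines have been captured as text; parse_counters is an illustrative helper, not something Squid provides.

```python
def parse_counters(text):
    """Collect 'Name: integer' lines into a dictionary."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


# The summary lines from the sample output in this section.
sample = """IPcache Entries: 10034
IPcache Requests: 1066445
IPcache Hits: 817880
IPcache Negative Hits: 6846
IPcache Misses: 200497"""

c = parse_counters(sample)
hit_ratio = 100.0 * c["IPcache Hits"] / c["IPcache Requests"]
print(f"IP cache hit ratio: {hit_ratio:.0f}%")
```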
14.2.1.10 fqdncache: FQDN Cache Stats and Contents

The FQDN cache is similar to the IP cache, except that it stores address-to-hostname lookups. Another difference is that the FQDN cache doesn't mark hostnames as OK or BAD.

Your FQDN cache may be empty, unless you enable the log_fqdn directive, use domain-based ACLs (such as srcdomain, dstdomain, srcdom_regex, and dstdom_regex), or use a redirector.
14.2.1.11 idns: Internal DNS Statistics

Squid contains an internal DNS client implementation, which is enabled by default. Disabling internal DNS with the --disable-internal-dns option also disables this page. Here is some sample output:

    Internal DNS Statistics:

    The Queue:
                           DELAY SINCE
      ID   SIZE SENDS FIRST SEND LAST SEND
    ------ ---- ----- ---------- ---------
    001876   44     1      0.010     0.010
    001875   44     1      0.010     0.010

    Nameservers:
    IP ADDRESS      # QUERIES # REPLIES
    --------------- --------- ---------
    192.168.19.124       4889      4844
    192.168.19.190         91        51
    192.168.10.2           73        39

    Rcode Matrix:
    RCODE ATTEMPT1 ATTEMPT2 ATTEMPT3
        0     6149        4        2
        1        0        0        0
        2       38       34       32
        3        0        0        0
        4        0        0        0
        5        0        0        0
The Internal DNS page contains three tables. First, you'll see the queue of unanswered queries. Unfortunately, you can't see the contents of the query (the hostname or IP address). Instead, Squid prints the ID, size, number of transmissions, and elapsed times for each query. You should see relatively few queries in the queue. If you see a lot relative to your total traffic rate, make sure your DNS servers are functioning properly.

The second table (Nameservers) shows how many queries have been sent to, and replies received from, each DNS server. Squid always queries the first server in the list first. Second (and third, etc.) servers are queried only when the previous server times out for a given query. If you see zero replies from the first address, make sure a server is actually running at that address.

Finally, you'll see a table of DNS response codes versus number of attempts. The cell for response code 0 and ATTEMPT1 should have the highest count. Response code 0 indicates success, while others are different types of errors (see RFC 1035 for their descriptions). You may see some smaller numbers for response code 0 in the columns for ATTEMPT2 and ATTEMPT3. This shows the cases when retransmitting a query, after initially receiving an error, resulted in a successful reply. Note that Squid retries only response code 2 (server failure) errors.
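If you collect this page regularly, the Nameservers check described above is easy to automate. The sketch below assumes the table has been captured as text in the layout of the sample output; the function names are illustrative only. It reports the fraction of queries answered by each server and flags the case the text warns about, a first server that has been queried but has never replied.

```python
def parse_nameservers(text):
    """Return (ip, queries, replies) tuples from the Nameservers table."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields, the last two numeric.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            rows.append((parts[0], int(parts[1]), int(parts[2])))
    return rows


def first_server_silent(rows):
    """True if the first server was queried but never replied."""
    return bool(rows) and rows[0][1] > 0 and rows[0][2] == 0


sample = """IP ADDRESS      # QUERIES # REPLIES
--------------- --------- ---------
192.168.19.124       4889      4844
192.168.19.190         91        51
192.168.10.2           73        39"""

rows = parse_nameservers(sample)
for ip, queries, replies in rows:
    ratio = replies / queries if queries else 0.0
    print(f"{ip}: {ratio:.2f} of queries answered")
print("first server silent:", first_server_silent(rows))
```

Lower reply ratios for the second and third servers aren't necessarily a problem, since they are queried only after an earlier server has timed out.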
14.2.1.12 dns: Dnsserver Statistics

This cache manager page is available only when you use the --disable-internal-dns option. In this case, Squid uses a number of external dnsserver processes to perform DNS lookups. The dnsserver program is one of a number of helper processes Squid can use. The other types of helpers are redirectors, authenticators, and external ACLs. All of Squid's helpers have cache manager pages that display the same statistics. For example:

    Dnsserver Statistics:
    number running: 5 of 5
    requests sent: 3001
    replies received: 3001
    queue length: 0
    avg service time: 23.10 msec

       #      FD     PID  # Requests   Flags    Time  Offset Request
       1       6   20110         128      AB   0.293       0 www.nlanr.net
       2       7   20111          45       A   0.000       0 (none)
       3       8   20112           4       A   0.000       0 (none)
       4       9   20113           0       A   0.000       0 (none)
       5      10   20114           0       A   0.000       0 (none)
The number running line shows how many helper processes are running and how many should be running. The dns_children directive specifies how many dnsserver processes to use. The two numbers should match, but they may not if a helper process dies unexpectedly or if some processes could not be started. Recall that when you reconfigure a running Squid instance, all the helpers are killed and restarted. See the discussion in Appendix A.

The requests sent and replies received values display the number of requests sent to (and responses received from) the helpers since Squid started. The difference between these two, if any, should correspond to the number of outstanding requests.

The queue length line shows how many requests are queued, waiting for one of the helpers to become free. The queue length should usually be zero. If not, you should add more helpers to reduce delays for your users.

The avg service time line shows the running average service time for all helpers. Your particular value may depend on numerous factors, such as your network bandwidth and processing power.

The next section displays a table of statistics for the running dnsserver processes. The FD column shows the file descriptor for the socket between Squid and each dnsserver process. Similarly, the PID column shows each helper's process ID number.

The # Requests column shows how many requests have been sent to each helper. These numbers are zeroed each time you reconfigure Squid, so they may not add up to the total number of requests sent, as shown earlier. Note that Squid always chooses the first idle helper in the list, so the first process should receive the largest number of requests. The last few processes may not receive any requests at all.

The Flags column shows a few flags describing the state of the helper process. You should normally see A (for Alive) for each helper. Occasionally, when the helper process is handling a request, you'll see B (for Busy).

The Time column displays the amount of time elapsed (in seconds) for the current, or last, request. Offset shows how many bytes of the response message Squid has read on the socket. This is almost always zero. Finally, the Request column shows the request that was sent to the helper process. In this case, it is either a hostname or an IP address.
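The summary lines at the top of a helper page lend themselves to a simple health check. The following sketch assumes the page text has been captured in the layout of the Dnsserver sample; helper_health and read_int are hypothetical names used only for this illustration. It derives the three quantities discussed above: helpers that should be running but aren't, outstanding requests, and queued requests.

```python
import re


def read_int(text, label):
    """Return the integer following 'label:' or raise ValueError if absent."""
    match = re.search(re.escape(label) + r":\s*(\d+)", text)
    if match is None:
        raise ValueError(f"missing line: {label}")
    return int(match.group(1))


def helper_health(text):
    """Summarize the header lines of a helper statistics page."""
    running = re.search(r"number running:\s*(\d+) of (\d+)", text)
    if running is None:
        raise ValueError("missing line: number running")
    return {
        "missing_helpers": int(running.group(2)) - int(running.group(1)),
        "outstanding": read_int(text, "requests sent") - read_int(text, "replies received"),
        "queued": read_int(text, "queue length"),
    }


sample = """Dnsserver Statistics:
number running: 5 of 5
requests sent: 3001
replies received: 3001
queue length: 0
avg service time: 23.10 msec"""

print(helper_health(sample))
```

A nonzero queued value is the signal to consider adding more helpers. Because the other helper pages use the same layout, the same check should apply to them as well.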
14.2.1.13 redirector: URL Redirector Stats

The Redirector Stats page is available only if you are using a redirector (see Chapter 11). The format of this page is identical to Dnsserver Statistics, described earlier.
14.2.1.14 basicauthenticator: Basic User Authenticator Stats

This page is available only with the ./configure --enable-auth=basic option and when you define a Basic authenticator with the auth_param basic program directive. The format of this page is identical to Dnsserver Statistics, described earlier.
14.2.1.15 digestauthenticator: Digest User Authenticator Stats

This page is available only with the ./configure --enable-auth=digest option and when you define a Digest authenticator with the auth_param digest program directive. The format of this page is identical to Dnsserver Statistics, described earlier.
14.2.1.16 ntlmauthenticator: NTLM User Authenticator Stats

This page is available only with the ./configure --enable-auth=ntlm option and when you define an NTLM authenticator with the auth_param ntlm program directive. The format of this page is similar to Dnsserver Statistics, described earlier, with a few additions.

The table of helper processes includes an extra column: # Deferred Requests. NTLM requires "stateful" helpers because the helper processes themselves generate the challenges. Squid receives a challenge from a helper, sends that challenge to a user, and receives a response. Squid must send the user's challenge response back to the same helper for validation. For this protocol to work, Squid must defer some messages to be sent to a helper until the helper is ready to accept them.

These helpers also have two new flags: R (reserved or deferred) and P (placeholder). The R flag is set when the helper has at least one deferred request waiting. The P flag is set when Squid is waiting for the NTLM helper to generate a new challenge token.
14.2.1.17 external_acl: External ACL Stats

This page displays helper statistics for your external ACLs. If you don't have any external_acl_type lines in squid.conf, this page will be empty. Otherwise, Squid displays the statistics for each external ACL. The format is the same as for the Dnsserver Statistics.
14.2.1.18 http_headers: HTTP Header Statistics

This page displays a number of tables containing statistics about HTTP headers. It contains up to four sections: HTCP reply stats (if HTCP is enabled), HTTP request stats, HTTP reply stats, and a final section called HTTP Fields Stats. The HTCP reply statistics refer to HTCP replies received by your cache. The HTTP request section refers to HTTP requests either sent or received by your cache. Similarly, the HTTP reply section refers to replies either sent or received by Squid.

The first three sections have the same format. Each section contains three tables: Field type distribution, Cache-control directives distribution, and Number of fields per header distribution.

The Field type distribution table shows the number of times that each header value occurs and the percentage of cases in which it occurs. For example, in Table 14-2 you can see that the Accept header occurs in 98% of HTTP requests.

Table 14-2. Sample Field type distribution values for HTTP requests

    ID   Name               Count   #/header
    0    Accept           1416268   0.98
    1    Accept-Charset    322077   0.22
    2    Accept-Encoding   709715   0.49
    3    Accept-Language  1334736   0.92
    ...  ...                  ...   ...
Unfortunately, these (and the following) statistics are tricky because they don't correspond one-to-one with client requests. For example, Squid may report 1,416,268 Accept headers in requests but only 800,542 client requests. This happens because Squid creates more than one HTTP header data structure for each request. In the case of HTTP replies, it seems that Squid may create up to four separate header structures, depending on the circumstances.

The Cache-Control directives distribution is similar, but applies only to the values of the Cache-Control header. Table 14-3 shows some of the possible field values.

Table 14-3. Sample Cache-Control directives distribution values for HTTP requests

    ID   Name                Count   #/cc_field
    0    public               6866   0.02
    1    private             69783   0.24
    2    no-cache            78252   0.27
    3    no-store             9878   0.03
    4    no-transform          168   0.00
    5    must-revalidate     10983   0.04
    6    proxy-revalidate     2480   0.01
    7    max-age            165034   0.56
    8    s-maxage             4995   0.02
    9    max-stale               0   0.00
    10   only-if-cached          0   0.00
    11   Other                9149   0.03
The Number of fields per header distribution table shows how many headers occur in each request or reply. Usually, you should see something like a normal distribution with a peak around 10-13 headers per request or response.

Finally, this page ends with a table labeled Http Fields Stats (replies and requests). For each header, this table shows three values: #alive, %err, and %repeat.

The #alive column shows how many instances of this header are currently stored in memory. HTTP headers are kept in memory for both active requests/responses and for completed objects stored in the memory cache.

The %err column shows the percentage of times Squid encountered an error while parsing this header. Common errors include incorrect date formats for Date, Expires, Last-Modified, and similar headers. The value -1 indicates no errors.

The %repeat column indicates the number of times that a particular header is repeated in a single request or response. These aren't errors because HTTP allows headers to be repeated.
14.2.1.19 via_headers: Via Request Headers

This page is available only with the ./configure --enable-forw-via-db option. The information in this page is intended to help cache administrators understand where client requests come from. When enabled, Squid counts the number of times each unique Via header occurs in client requests.

The Via header contains a list of downstream proxies that have forwarded the request so far. When a proxy forwards a request, it should append its hostname and other identifying information to the Via header. With the information in this database, you can, in theory, reconstruct the hierarchy of proxies forwarding requests through yours.
Squid prints the Via database entries in a random order. The output may look something like this:

    4 1.0 proxy.firekitten.org:3128 (squid/2.5.STABLE1)
    1 1.0 xnsproxy.dyndns.org:3128 (squid/2.5.PRE3-20020125)
    1751 1.0 nt04.rmtcc.cc.oh.us:3128 (Squid/2.4.STABLE6), 1.0 tasksmart.rmtcc.cc.oh.us:3128 (Squid/2.4.STABLE7)
    137 1.0 reg3.bdg.telco.co.id:8080 (Squid/2.2.STABLE5), 1.0 c1.telco.co.id:8080 (Squid/2.4.STABLE6), 1.0 cache2.telco.co.id:8080 (Squid/2.4.STABLE1)
    53 1.0 IS_GW_312:3128 (Squid/2.4.STABLE6)
    60 1.0 proxy.kiltron.net:3128 (Squid/2.4.STABLE7)
    815 1.1 DORM

In this example, Squid received 1751 requests that previously passed through two other proxies (nt04 and tasksmart). Note that only proxies add a Via header. Requests from user-agents usually don't have the header and, therefore, aren't counted in this database.

As you can see, the Via headers reveal some semiprivate information, such as hostnames, port numbers, and software versions. Please take care to respect the privacy of your users if you enable this feature.

The Via database is stored entirely in memory and is lost if Squid restarts. The database is cleared whenever you rotate the log files (see Section 13.7).
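Reconstructing the proxy hierarchy from this output starts with splitting each line into its count and its chain of hops. The sketch below is one simple way to do that, assuming each entry occupies a single line as in the sample. parse_via_line is an illustrative name, and the splitting is deliberately naive: a parenthesized comment that itself contains a comma would confuse it.

```python
def parse_via_line(line):
    """Split one via_headers line into (count, [hop, ...]) in forwarding order."""
    count, _, header = line.strip().partition(" ")
    hops = []
    for element in header.split(","):
        fields = element.split()
        # fields[0] is the protocol version, fields[1] the host[:port] or
        # pseudonym of the proxy; anything after that is an optional comment.
        hops.append(fields[1])
    return int(count), hops


line = ("1751 1.0 nt04.rmtcc.cc.oh.us:3128 (Squid/2.4.STABLE6), "
        "1.0 tasksmart.rmtcc.cc.oh.us:3128 (Squid/2.4.STABLE7)")
count, hops = parse_via_line(line)
print(count, " -> ".join(hops))
print(parse_via_line("815 1.1 DORM"))
```

Summing the counts for each distinct first hop gives a rough picture of which downstream proxies send you the most traffic.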
14.2.1.20 forw_headers: X-Forwarded-For Request Headers

This page is available only with the ./configure --enable-forw-via-db option. It is similar to the via_headers page, except that it displays the accumulation of X-Forwarded-For headers.

X-Forwarded-For is a nonstandard HTTP header that originated with the Squid project. Its value is a list of client IP addresses. In other words, when Squid receives and forwards a request, it appends the client's IP address to this header. It is similar to Via because the header grows each time a proxy passes the request on towards the origin server.

The forw_headers output is similar to via_headers. Each line begins with an integer, followed by a header value. The integer indicates how many times that particular X-Forwarded-For value was received. For example:

    1 10.37.1.56, 10.1.83.8
    3 10.3.33.77, 10.1.83.8
    569 116.120.203.54
    21 10.65.18.200, 10.1.83.120
    31 116.120.204.6
    5 10.1.92.7, 10.1.83.120
    1 10.3.65.122, 10.3.1.201, 10.1.83.8
    2 10.73.73.51, 10.1.83.120
    1 10.1.68.141, 10.1.83.8
    3 10.1.92.7, 10.1.83.122

As with via_headers, this database is also stored in memory and is lost if Squid exits. The database is cleared each time you rotate Squid's log files.
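Because each proxy appends to the header, the first address in a value is the one farthest from your cache. As a rough sketch, you can total the counts by that first address to see which originating clients account for the most requests. The function name is made up for this example, and the result is only as trustworthy as the header itself, which a client can set to anything.

```python
from collections import Counter


def requests_by_origin(lines):
    """Total the per-line counts by the first address in each header value."""
    totals = Counter()
    for line in lines:
        count, _, header = line.strip().partition(" ")
        first_address = header.split(",")[0].strip()
        totals[first_address] += int(count)
    return totals


# A few lines taken from the sample output in this section.
sample = [
    "569 116.120.203.54",
    "5 10.1.92.7, 10.1.83.120",
    "1 10.3.65.122, 10.3.1.201, 10.1.83.8",
    "3 10.1.92.7, 10.1.83.122",
]

totals = requests_by_origin(sample)
for address, count in totals.most_common():
    print(address, count)
```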
14.2.1.21 menu: This Cache Manager Menu

This page simply displays a listing of the other cache manager pages. You can use it if you forget the name of a page or if you want to know if certain optional pages are available. When using cachemgr.cgi, each item in the menu is a clickable link.
14.2.1.22 shutdown: Shut Down the Squid Process*

This is one of the few cache manager functions that doesn't simply display some information. Rather, this "page" allows you to shut down Squid remotely. To allow shutdown via the cache manager, you must assign it a password with the cachemgr_passwd directive (see Section 14.2.2) in squid.conf. Without a password, the shutdown operation is disabled (but you can still use squid -k shutdown). Because the cache manager has very weak security (passwords are sent in cleartext), I don't recommend enabling this operation.
14.2.1.23 offline_toggle: Toggle offline_mode Setting*

This is another function that allows you to control Squid, rather than simply receive information. It also requires a password (see Section 14.2.2) in order to become active. Each time you request this page, Squid toggles the offline_mode setting. Squid reports the new setting on your screen and in cache.log.
14.2.1.24 info: General Runtime Information

This page provides a lot of basic information about the way that Squid is operating. It is a good starting point for using the cache manager and for tracking down performance problems.

At the top, you'll see the release version (e.g., Version 2.5.STABLE4) and two timestamps: the starting and current times. For example:

    Squid Object Cache: Version 2.5.STABLE4

    Start Time:     Mon, 22 Sep 2003 03:10:37 GMT
    Current Time:   Mon, 13 Oct 2003 10:25:16 GMT

Following that, you'll see seven different sections. The first section, Connection information, displays a few statistics about the number and rate of connections, and the number of cache clients:

    Connection information for squid:
        Number of clients accessing cache:      386
        Number of HTTP requests received:       12997469
        Number of ICP messages received:        16302149
        Number of ICP messages sent:            16310714
        Number of queued ICP replies:           0
        Request failure ratio:                  0.00
        Average HTTP requests per minute since start:   423.7
        Average ICP messages per minute since start:    1063.2
        Select loop called: 400027445 times, 4.601 ms avg
Number of clients accessing cache
Here, "client" actually means IP address. Squid assumes that each client has a unique IP address.

Number of HTTP requests received
The total number of HTTP requests since Squid was started.

Number of ICP messages received
The total number of ICP messages received since Squid was started. Note that received messages include both queries and responses. These values don't include HTCP messages, however.
Number of ICP messages sent
The total number of ICP messages sent since Squid was started. Note that sent messages include both queries and responses. This count doesn't include HTCP messages. Most likely, your sent and received counts will be about the same.
Number of queued ICP replies
ICP messages are sent over UDP. The sendto() system call rarely fails, but if it does, Squid queues the ICP message for retransmission. This counter shows how many times an ICP message was queued for retransmission. Most likely, you'll see 0 here.
Request failure ratio
The failure ratio is a moving average ratio between the number of failed and successful requests. In this context, a failed request is caused by either a DNS error, TCP connection error, or network read error. When this ratio exceeds 1.0 (meaning Squid returns more errors than successful responses), Squid goes into hit-only mode. In this mode, Squid returns ICP_MISS_NOFETCH instead of ICP_MISS. Thus, your neighbor caches that use ICP won't forward cache misses to you until the problem goes away.
Average HTTP requests per minute since start
This value is simply the number of HTTP requests divided by the amount of time Squid has been running. This average doesn't reflect short-term variations in load. To get a better instantaneous load measurement, use the 5min or 60min page.
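Average ICP messages per minute since start
The number of ICP messages handled by Squid divided by the amount of time that it has been running. In the sample output, this figure corresponds to the sum of the sent and received counts, not to received queries alone.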
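The two per-minute averages can be checked against the other numbers on the page. This short sketch uses the counts from the sample Connection information output and the UP Time value from the sample Resource usage output. The variable names are illustrative. Note that the ICP figure in the sample is reproduced only when the sent and received counts are added together.

```python
uptime_seconds = 1840478.681   # "UP Time" in the sample Resource usage output
http_requests = 12997469       # Number of HTTP requests received
icp_received = 16302149        # Number of ICP messages received
icp_sent = 16310714            # Number of ICP messages sent

minutes = uptime_seconds / 60.0
http_per_minute = http_requests / minutes
icp_per_minute = (icp_sent + icp_received) / minutes

print(f"HTTP requests per minute: {http_per_minute:.1f}")
print(f"ICP messages per minute:  {icp_per_minute:.1f}")
```

Both results agree with the values shown in the sample (423.7 and 1063.2).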
Select loop called
This number is probably meaningful only to Squid developers. It represents the number of times the select() (or poll()) function has been called and the average time between calls. During normal operation, the time between calls should be in the 1-100 millisecond range.

The Cache information section displays hit ratio and cache size statistics:

    Cache information for squid:
        Request Hit Ratios:         5min: 22.6%, 60min: 25.8%
        Byte Hit Ratios:            5min: 24.6%, 60min: 38.7%
        Request Memory Hit Ratios:  5min: 0.7%, 60min: 1.4%
        Request Disk Hit Ratios:    5min: 6.0%, 60min: 12.4%
        Storage Swap size:          41457489 KB
        Storage Mem size:           10180 KB
        Mean Object Size:           14.43 KB
        Requests given to unlinkd:  0
Request Hit Ratios
Here, and on subsequent lines, you'll see two hit ratio numbers: one for the last five minutes, and one for the last hour. These values are simply the percentage of HTTP requests that result in a cache hit. Here, hits include cases in which Squid validates a cached response and receives a 304 (Not Modified) reply.
Byte Hit Ratios
Squid calculates byte hit ratio by comparing the number of bytes received from origin servers (or neighbors) to the number of bytes sent to clients. When received bytes are less than sent bytes, the byte hit ratio is positive. However, it is possible to see a negative byte hit ratio. This might occur, for example, if you have a lot of clients that abort their request before receiving the entire response.
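One way to express this comparison is as the share of bytes sent to clients that didn't have to be fetched from servers. The sketch below follows that reading of the description. It isn't taken from the Squid source, and the byte counts are made-up values chosen to show both a positive and a negative result.

```python
def byte_hit_ratio(bytes_from_servers, bytes_to_clients):
    """Percentage of client-bound bytes that were not fetched from servers."""
    if bytes_to_clients == 0:
        return 0.0
    return 100.0 * (bytes_to_clients - bytes_from_servers) / bytes_to_clients


# Fewer bytes fetched than delivered: a positive ratio.
print(byte_hit_ratio(750, 1000))
# More bytes fetched than delivered (e.g., many aborted requests): negative.
print(byte_hit_ratio(1200, 1000))
```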
Request Memory Hit Ratios
These values represent the percentage of all cache hits that were served from memory. Or, more accurately, the percentage of all hits (not requests!) logged as TCP_MEM_HIT.

Request Disk Hit Ratios
Similarly, these values represent the percentage of "plain" cache hits served from disk. In particular, these values are the percentage of all hits logged as TCP_HIT. You'll see that the memory and disk hit percentages don't add up to 100%. This is because the other cases (such as TCP_IMS_HIT, etc.) aren't included in either disk or memory hits.

Storage Swap size
The amount of data currently cached on disk. It is always expressed in kilobytes. To compensate for space wasted in partial blocks at the end of files, Squid rounds up file sizes to the nearest filesystem block size.

Storage Mem size
The amount of data currently cached in memory. It is always expressed in kilobytes and is always a multiple of Squid's memory page size: 4 KB.

Mean Object Size
Simply the storage swap size divided by the number of cached objects. You should set the configuration directive store_avg_object_size close to the actual value reported here. Squid uses the configured value for a number of internal estimates.
Requests given to unlinkd
The unlinkd process handles file deletion external to Squid (depending on your configuration). This value simply shows how many files Squid has asked unlinkd to remove. It is zero when unlinkd isn't used.

The Median Service Times section displays the median of various service time (or response time) distributions. You'll see a value for the last five minutes and for the last hour. All values are in seconds. Squid uses the median, rather than the mean, because these distributions often have heavy tails that can significantly skew the mean value. The output looks like this:

    Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.19742  0.15048
        Cache Misses:          0.22004  0.17711
        Cache Hits:            0.05951  0.04047
        Near Hits:             0.37825  0.14252
        Not-Modified Replies:  0.01309  0.01387
        DNS Lookups:           0.05078  0.03223
        ICP Queries:           0.00000  0.07487
HTTP Requests (All)
These are the median response times for all HTTP requests taken together. For an HTTP request, the timer starts as soon as Squid receives the request and ends when Squid writes the last byte of the response. Thus, this time also includes DNS lookups (if any), and ICP queries to upstream neighbors (if you have them) for cache misses.

Cache Misses
This line shows the response time for cache misses only. Unless your cache hit ratio is close to 50%, the cache miss response time is close to (but a little larger than) the overall response time.

Cache Hits
The cache hit response time includes only requests logged as TCP_HIT, TCP_MEM_HIT, and TCP_OFFLINE_HIT. These are unvalidated cache hits served directly from Squid, without any communication to the origin server. Thus, your cache hit response time should be significantly less than the miss time. You should keep track of this value over time; if it climbs too high, your disk filesystem may be a performance bottleneck.
Near Hits
A near hit is a validated cache hit. It corresponds to TCP_REFRESH_HIT in access.log. For these, Squid contacts the origin server (or parent cache), which adds some latency to the response time. The server's response is a small 304 (Not Modified) message. Thus, the near hit response time is typically in between cache hits and cache misses.

Not-Modified Replies
This line shows the response times for requests logged as TCP_IMS_HIT. This occurs when the client sends a conditional (a.k.a. validation) request, and Squid serves a response without contacting the origin server. The name "not-modified" is somewhat misleading for this category because the status code received by the client isn't necessarily 304. For example, the client may send an If-Modified-Since request, and Squid has a fresh, cached response with a more recent modification time. Squid knows that its response is fresh and that the client's copy is stale. In this case, the client receives a 200 (OK) reply with the new object data.

DNS Lookups
The DNS service time shows how long it takes, on average, to query the DNS. This includes both name-to-address and address-to-name lookups. It doesn't include IP- and FQDN-cache hits, however. DNS queries can be a significant source of latency. If you experience performance problems with Squid, be sure to check this value. If you see a high median service time (i.e., around five seconds), make sure your primary DNS server (usually listed in /etc/resolv.conf) is up and running.
ICP Queries
The ICP query time represents the elapsed time between an ICP query and response that causes Squid to select the corresponding neighbor as the next hop. Thus, it includes only requests logged as PARENT_HIT, SIBLING_HIT, FIRST_PARENT_MISS, and CLOSEST_PARENT_MISS. This value may not be a good estimate of the overall ICP response time because ICP query/response transactions that don't result in Squid selecting a neighbor are ignored. Due to a bug in Squid Versions 2.5.STABLE1 and earlier, ICP response time statistics aren't collected, and these values always appear as 0.

The Resource usage section includes a few statistics relating to CPU and memory usage:

    Resource usage for squid:
        UP Time:        1840478.681 seconds
        CPU Time:       70571.874 seconds
        CPU Usage:      3.83%
        CPU Usage, 5 minute avg:        1.33%
        CPU Usage, 60 minute avg:       4.41%
        Process Data Segment Size via sbrk(): 342739 KB
        Maximum Resident Size: 345612 KB
        Page faults with physical i/o: 65375
UP Time
This line simply shows the amount of time this Squid process has been running. It is expressed in seconds.

CPU Time
The amount of CPU time used by Squid, also in seconds. This value comes from the getrusage() system call, which might not be available on all operating systems.

CPU Usage
This section has three CPU Usage lines. The first is the CPU Time value divided by the UP Time value. It is a long-term average CPU usage measurement. The next two lines show the CPU usage for the last five minutes and the last hour.

Process Data Segment Size via sbrk()
This line offers an estimate of Squid's process size. sbrk() is a low-level system call used by the memory allocation library (malloc()). The sbrk() technique provides only an estimate, which usually differs from values reported by programs such as ps and top. When the sbrk() value is greater than the Maximum Resident Size (discussed next), the Squid process is probably page faulting, and performance may be degrading.

Maximum Resident Size
This is another estimate of memory usage and process size. The maximum resident set size (RSS) value comes from the getrusage() system call. Although the definition of RSS may vary between operating systems, you can think of it as the maximum amount of physical memory used by the process at any one time. Squid's process size may be larger than the RSS, in which case some parts of the process are actually swapped to disk.
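The long-term CPU Usage figure and the sbrk()-versus-RSS comparison are both simple arithmetic on the sample values. The following sketch uses hypothetical helper names and treats the comparison as the rough indicator the text describes, not as a precise test.

```python
def cpu_usage_percent(cpu_seconds, uptime_seconds):
    """Long-term average CPU usage: CPU Time divided by UP Time."""
    return 100.0 * cpu_seconds / uptime_seconds


def probably_paging(sbrk_kb, max_rss_kb):
    """True when the data segment estimate exceeds the maximum resident size."""
    return sbrk_kb > max_rss_kb


# Values from the sample Resource usage output.
print(f"{cpu_usage_percent(70571.874, 1840478.681):.2f}%")
print(probably_paging(342739, 345612))
```

For the sample, the first line reproduces the 3.83% shown on the page, and the second is False because the sbrk() estimate is below the maximum resident size.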
Page faults with physical i/o
This value also comes from getrusage(). A page fault occurs when the operating system must read a page of the process's memory from disk. This usually happens when the Squid process becomes too large to fit entirely in memory, or when the system has other programs competing for memory. Squid's performance suffers significantly when page faults occur. You probably won't notice any problems as long as the page-faults rate is an order of magnitude lower than the HTTP request rate.

You'll see a section called Memory usage for squid via mstats() if your system has the mstats() function. In particular, you'll have this function if the GNU malloc library (libgnumalloc.a) is installed. Squid reports two statistics from mstats():

    Memory usage for squid via mstats():
        Total space in arena:  415116 KB
        Total free:            129649 KB 31%
Total space in arena
This represents the total amount of memory allocated to the process. It may be similar to the value reported by sbrk(). Note that this value only increases over time.

Total free
This represents the amount of memory allocated to the process but not currently in use by Squid. For example, if Squid frees up some memory, it goes into this category. Squid can later reuse that memory, perhaps for a different data structure, without increasing the process size. This value fluctuates up and down over time.

The Memory accounted for section contains a few tidbits about Squid's internal memory management techniques:

    Memory accounted for:
        Total accounted:       228155 KB
        memPoolAlloc calls: 2282058666
        memPoolFree calls: 2273301305
Total accounted
Squid keeps track of some, but not nearly all, of the memory allocated to it. This value represents the total size of all data structures accounted for. Unfortunately, it is typically only about two-thirds of the actual memory usage. Squid uses a significant amount of memory in ways that make it difficult to track properly.
memPoolAlloc calls
memPoolAlloc() is the function through which Squid allocates many fixed-size data structures. This line shows how many times that function has been called.
memPoolFree calls
memPoolFree() is the companion function through which Squid frees memory allocated with memPoolAlloc(). In a steady-state condition, the two values should increase at the same rate and their difference should be roughly constant over time. If not, the code may contain a bug that frees pooled memory back to the malloc library.
The File descriptor usage section shows how many file descriptors are available to Squid and how many are in use:
    File descriptor usage for squid:
            Maximum number of file descriptors:   7372
            Largest file desc currently in use:    151
            Number of file desc currently in use:  105
            Files queued for open:                   0
            Available number of file descriptors: 7267
            Reserved number of file descriptors:   100
            Store Disk files open:                   0
Maximum number of file descriptors
This is the limit on open file descriptors for the squid process. This should be the same value reported by ./configure when you compiled Squid. If you don't see at least 1024 here, you should probably go back and recompile Squid after reading Section 3.3.1.
Largest file desc currently in use
This is the highest file descriptor currently open. Its value isn't particularly important but should be within 15-20% of the next line (number currently in use). This value is more useful for developers because it corresponds to the first argument of the select() system call.
Number of file desc currently in use
The number of currently open descriptors is an important performance metric. In general, Squid's performance decreases as the number of open descriptors increases. The kernel must work harder to scan the larger set of descriptors for activity. Meanwhile, each file descriptor waits longer (on average) to be serviced.
Files queued for open
This value will always be zero, unless you are using the aufs storage scheme (see Section 8.4). It shows how many file-open requests have been dispatched to the thread processes but have not yet returned. aufs is the only storage scheme in which disk file descriptors are opened asynchronously.[2]
[2] diskd also opens files asynchronously, but those file descriptors belong to the diskd processes, not the squid process.
Available number of file descriptors
The number of available descriptors is the maximum, minus the number currently open and the number queued for open. It represents the amount of breathing room Squid has to handle more load. When the available number gets close to the reserved number (next line), Squid stops accepting new connections so that existing transactions continue receiving service.
Reserved number of file descriptors
The number of reserved file descriptors starts out at the lesser of 100 or 25% of the maximum. Squid refuses new client connections if the number of available (free) descriptors reaches this limit. It is increased if Squid encounters an error while trying to create a new TCP socket. In this case, you'll see a message in cache.log:
    Reserved FD adjusted from 100 to 150 due to failures
Store Disk files open
This counter shows the number of disk files currently open for reading or writing. It is always zero if you are using the diskd storage scheme because disk files are opened by the diskd processes, rather than Squid itself. If you use the max_open_disk_fds directive in squid.conf, Squid stops opening more cache files for reading or writing when it reaches that limit. If your filesystem is a bottleneck, this is a simple way to sacrifice a few cache hits for stable performance.
The Internal Data Structures section gives a quick overview of how many objects are in the cache and how many are on disk or in memory. You can find more detail about Squid's data structure allocations in the mem page (see Section 14.2.1.2). This section has a few stats:
    Internal Data Structures:
            2873586 StoreEntries
               1336 StoreEntries with MemObjects
               1302 Hot Object Cache Items
            2873375 on-disk objects
StoreEntries
This represents the number of objects cached by Squid. Each object in the cache uses one StoreEntry structure.
StoreEntries with MemObjects
MemObject is the data structure used for objects currently being requested and for objects stored in the memory cache.
Hot Object Cache Items
The Hot Object Cache is another name for the memory cache (see Appendix B). These objects are stored entirely in memory (as well as on disk). This number should always be less than the number of entries with MemObjects.
on-disk objects
This counter shows how many objects are currently stored on disk. The counter is incremented when the entire object has been successfully written. Thus, this number isn't necessarily equal to the number of StoreEntries minus the number of Hot Objects.
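The file descriptor arithmetic described earlier in this section is easy to check by hand. The following Python sketch recomputes the Available line from the sample output and the starting value of the reserved number. The function names are hypothetical (they are not Squid functions), and the integer division used for the 25% figure is an assumption.

```python
def available_fds(maximum, in_use, queued_for_open):
    """Maximum, minus the number currently open and the number queued for open."""
    return maximum - in_use - queued_for_open


def initial_reserved_fds(maximum):
    """Lesser of 100 or 25% of the maximum (integer division is an assumption)."""
    return min(100, maximum // 4)


def accepts_new_connections(available, reserved):
    """New client connections are refused once available reaches the reserved limit."""
    return available > reserved


if __name__ == "__main__":
    # Values taken from the sample "File descriptor usage" output.
    available = available_fds(maximum=7372, in_use=105, queued_for_open=0)
    reserved = initial_reserved_fds(7372)
    print("available:", available)
    print("reserved:", reserved)
    print("accepting:", accepts_new_connections(available, reserved))
```

With the sample values, the computed available count matches the 7267 reported by Squid, which leaves plenty of room above the reserved limit of 100.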
14.2.1.25 filedescriptors: Process File Descriptor Allocation
This page displays a table of all file descriptors currently opened by Squid. It looks like this:
    File Type   Tout Nread  * Nwrite * Remote Address    Description
    ---- ------ ---- -------- -------- ----------------- ------------------------------
       3 File      0        0        0                   /usr/local/squid/logs/cache.log
       6 File      0        0  2083739                   /usr/local/squid/logs/access.log
      12 Pipe      0        0        0                   unlinkd -> squid
      13 File      0        0  2485913                   /usr/local/squid/logs/store.log
      15 Pipe      0        0        0                   squid -> unlinkd
      16 Socket   24   220853*    1924 65.200.216.110.80 http://downloads.mp3.com/
      18 Pipe      0        0        0                   squid -> diskd
      19 Socket  179      476*    1747 202.59.16.30.4171 http://ads.vesperexchange.com/
      21 Pipe      0        0        0                   squid -> diskd
      22 Socket   20   158783*     998 210.222.20.8.80   http://home.hanmir.com/a
      24 Pipe      0        0        0                   squid -> diskd
      25 Socket    1        0        0* 210.222.20.8.80   http://home.hanmir.com/b
      26 Socket    0  9048307* 1578290 .0                DNS Socket
      27 Pipe      0        0        0                   squid -> diskd
      28 Socket    0        0        0* 66.28.234.77.80   http://updates.hotbar.com/
      29 Socket    0        0*       0 .0                HTTP Socket
      30 Pipe      0        0        0                   squid -> diskd
      31 Socket    0       93     1126 127.0.0.1.3434    ncsa_auth #1
      32 Socket    0        3       31 127.0.0.1.3438    ncsa_auth #3
      33 Socket    0        0        0 127.0.0.1.3440    ncsa_auth #4
      34 Socket  164     8835* 1070222* 212.47.19.52.2201 http://www.eyyubyaqubov.com/
      35 Socket  177     6137*  249899* 212.47.19.25.3044 http://files10.rarlab.com/
      36 Socket    0        0      774 127.0.0.1.3442    ncsa_auth #5
      37 Socket    7   158783*       0 210.222.20.8.80   http://home.hanmir.com/c
      38 Socket  166     1000*  148415* 202.17.13.8.5787  http://home.hanmir.com/d
The table has seven columns:
File
This is simply the file descriptor number. The list always starts with 3 because descriptors 0, 1, and 2 are reserved for stdin, stdout, and stderr. Any other gaps in the list represent closed descriptors.
Type
The type field contains one of the following values: File, Pipe, or Socket. The File type is used both for files storing cached responses and for log files, such as cache.log and access.log. The Pipe type represents kernel pipes used for interprocess communication. The Socket type is also occasionally used for interprocess communication, but it's mostly used for HTTP (and FTP) connections to clients and servers.
Tout
This is the general-purpose timeout value for the descriptor. It is expressed in minutes. Files and Pipes usually don't have a timeout, so this value is zero. For Sockets, however, if this number of minutes goes by without any activity on the descriptor, Squid calls a timeout function.
Nread
This is where Squid reports the number of bytes read from the descriptor. An asterisk (*) after the number means Squid has a function (a read handler) registered to read additional data, if there is some available.
Nwrite
This column shows the number of bytes written to the descriptor. Again, the asterisk (*) indicates that a write handler is present for the descriptor. You can usually tell if a given socket is connected to a client or to a server by comparing the number of bytes read and written. Because requests are normally smaller than responses, a server connection has a higher Nread count than Nwrite. The opposite is true for client connections.
Remote Address
For Sockets, this field shows the remote TCP address of the connection. The format is similar to what you would find in netstat -n output: an IP address followed by the TCP port number.
Description
The description field indicates the descriptor's use. For Files, you'll see a pathname; for Pipes, a description of what the pipe is connected to; and for Sockets, a URI, or at least the first part of it. A description such as web.icq.com idle connection indicates an idle persistent connection to an origin server. Similarly, Waiting for next request is an idle client-side persistent connection.
By default, the File Descriptor page isn't password-protected. However, you may want to give it a password because it contains some sensitive and, perhaps, personally identifiable information.
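The Nread/Nwrite comparison described for this table can be expressed as a small heuristic. The Python sketch below is only an illustration of that rule of thumb: the helper names are hypothetical, and the result is a guess that can be wrong (for example, for uploads or for idle connections with no traffic yet).

```python
def parse_count(field):
    """Split a byte count such as '220853*' into (count, handler_registered)."""
    has_handler = field.endswith("*")
    return int(field.rstrip("*")), has_handler


def guess_peer_role(nread, nwrite):
    """Guess whether a socket faces a server or a client.

    Requests are normally smaller than responses, so Squid reads more
    than it writes on a server connection, and the opposite for a client.
    """
    if nread > nwrite:
        return "server"
    if nwrite > nread:
        return "client"
    return "unknown"


if __name__ == "__main__":
    # Two rows from the sample table: FD 16 and FD 34.
    for fd, nread_field, nwrite_field in [(16, "220853*", "1924"),
                                          (34, "8835*", "1070222*")]:
        nread, _ = parse_count(nread_field)
        nwrite, _ = parse_count(nwrite_field)
        print(fd, guess_peer_role(nread, nwrite))
```

For the two sample rows, the guess agrees with the Remote Address column: FD 16 talks to port 80 on a remote host, while FD 34 has a high-numbered remote port typical of a client.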
14.2.1.26 objects: All Cache Objects
Requesting this page results in a list of all objects in the cache. Be careful with this page because it can be extremely long. Furthermore, it contains low-level information that is probably useful only to developers. For each cached object, Squid prints a sequence of lines, most of which look like this:
    KEY FF1F6736BCC167A4C3F93275A126C5F5
            STORE_OK NOT_IN_MEMORY SWAPOUT_DONE PING_NONE
            CACHABLE,DISPATCHED,VALIDATED
            LV:1020824321 LU:1020824671 LM:1020821288 EX:-1
            0 locks, 0 clients, 1 refs
            Swap Dir 0, File 0X010AEE
The first line shows the cache key, a 128-bit MD5 checksum of the URI. The same MD5 checksum appears in store.log and in the metadata at the beginning of each response cached on disk.
The second line shows four state variables of the StoreEntry data structure: store_status, mem_status, swap_status, and ping_status. Refer to the Squid source code if you'd like more information about them.
The third line is a list of the StoreEntry flags that are set. Search the source code for e->flags for more information.
The fourth line shows the values of four timestamps: last-validation, last-use, last-modification, and expiration. The last-modification and expiration timestamps are taken from the origin server's HTTP response. The others are maintained by Squid.
The fifth line shows a few counters: locks, clients, and references. An entry with locks can't be removed. The clients counter shows how many clients are currently receiving data for this object. The refs counter shows how many times the object has been requested.
The sixth line shows the object's index to the on-disk storage. Each object has a 7-bit swap directory index and a 25-bit file number. Each storage scheme has a function to map these numbers into pathnames.
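The four timestamps on the fourth line are easier to read once converted to dates. The following Python sketch assumes the values are Unix epoch seconds and treats a negative value, such as EX:-1 in the sample, as "not set"; both points are assumptions rather than something stated on the page itself, and parse_timestamps is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_timestamps(line):
    """Parse 'LV:... LU:... LM:... EX:...' into a dict of UTC datetimes.

    Negative values are mapped to None (assumed to mean "not set").
    """
    result = {}
    for field in line.split():
        name, _, value = field.partition(":")
        seconds = int(value)
        if seconds < 0:
            result[name] = None
        else:
            result[name] = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return result


if __name__ == "__main__":
    stamps = parse_timestamps("LV:1020824321 LU:1020824671 LM:1020821288 EX:-1")
    for name, when in stamps.items():
        print(name, when.isoformat() if when else "not set")
    # Seconds between the last validation and the last use of the object.
    print((stamps["LU"] - stamps["LV"]).total_seconds())
```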
14.2.1.27 vm_objects: In-Memory and In-Transit Objects
This page is similar to All Cache Objects, except that it displays only objects that have a MemObject data structure. In other words, objects that are currently being requested or are stored in the memory cache. These objects are displayed like this:
    KEY 5107D49BA7F9C6BA9559E006D6DDC4B2
            GET http://www.rpgplanet.com/ac2hq/cartography/dynamic/LinvakMassif.jpg
            STORE_PENDING NOT_IN_MEMORY SWAPOUT_WRITING PING_DONE
            CACHABLE,DISPATCHED,VALIDATED
            LV:1043286120 LU:1043286122 LM:1036015230 EX:-1
            4 locks, 1 clients, 1 refs
            Swap Dir 1, File 00X31BD9
            inmem_lo: 184784
            inmem_hi: 229840
            swapout: 229376 bytes queued
            swapout: 229509 bytes written
            Client #0, 1533a1018
                    copy_offset: 217552
                    seen_offset: 217552
                    copy_size: 4096
                    flags:
As you can see, many of the lines are the same. However, the in-memory objects have a few additional lines. Directly following the cache key (MD5 checksum), Squid prints the request method and URI.
The inmem_lo and inmem_hi lines are byte offsets of the HTTP reply. They indicate the section of object data currently in memory. In most cases, the difference between these two should be less than the value of the maximum_object_size_in_memory directive.
The swapout: bytes queued line shows the offset for how many bytes have been given to the storage layer for writing. For objects in the SWAPOUT_DONE state, this value is the same as the object size. If the state is SWAPOUT_WRITING, Squid also shows the bytes written line, which indicates how many bytes have been successfully stored on disk.
If one or more clients are currently receiving the response, you'll see a section for each of them (Client #0 in this example). For each client, Squid reports another pair of offset values. The first, copy_offset, is the starting point for the last time the client side asked for data from the storage system. The second, seen_offset, is the point up to which the response data has been sent to the client. Note that copy_offset is always greater than or equal to seen_offset. The copy_size indicates the maximum amount of data the client can receive from the storage system.
14.2.1.28 openfd_objects: Objects with Swapout Files Open
The format of this page is the same as for In-Memory and In-Transit Objects. The objects reported on this page should all be in the SWAPOUT_WRITING state. The page is primarily useful to developers when trying to track down file-descriptor leaks.
14.2.1.29 io: Server-Side Network read() Size Histograms
This page displays a histogram for each of the following four server-side protocols: HTTP, FTP, Gopher, and WAIS. The histograms show how many bytes each read() call received. The information is primarily useful to developers for tuning buffer sizes and other aspects of the source code.
The bins of the histogram are logarithmic to accommodate the large scale of read sizes. Here is an example:
    HTTP I/O
    number of reads: 9016088
    Read Histogram:
        1-    1:    3082  0%
        2-    2:     583  0%
        3-    4:     905  0%
        5-    8:    2666  0%
        9-   16:   16690  0%
       17-   32:   88046  1%
       33-   64:   19712  0%
       65-  128:  116655  1%
      129-  256:  749259  8%
      257-  512:  633075  7%
      513- 1024:  903145 10%
     1025- 2048: 3664862 41%
     2049- 4096: 1643747 18%
     4097- 8192:  789796  9%
     8193-16384:   99476  1%
    16385-32768:   30059  0%
In this case, you can see that the bin for 1025-2048 bytes is the most popular. When reading from an HTTP server, Squid got between 1025 and 2048 bytes per read 41% of the time.
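To make the logarithmic bins concrete, here is a small Python sketch that maps a read size to the bin it would fall into and recomputes the percentage for the most popular bin. The bin boundaries are inferred from the sample output above (each bin ends at a power of two); Squid's own binning code may differ, and both function names are hypothetical.

```python
def power_of_two_bin(nbytes):
    """Return the (low, high) bounds of the bin that contains nbytes."""
    if nbytes < 1:
        raise ValueError("read size must be at least 1 byte")
    high = 1
    while high < nbytes:
        high *= 2
    # Each bin starts just above the previous power of two.
    low = high // 2 + 1 if high > 1 else 1
    return low, high


def bin_share(count, total_reads):
    """Percentage of all reads that landed in one bin, rounded to an integer."""
    return round(100 * count / total_reads)


if __name__ == "__main__":
    print(power_of_two_bin(1500))
    print(bin_share(3664862, 9016088))
```

A 1500-byte read, roughly one full Ethernet-sized TCP segment, lands in the 1025-2048 bin, which is consistent with that bin being the most popular one in the sample.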
14.2.1.30 counters: Traffic and Resource Counters
Squid maintains a data structure of counters. Actually, it is an array of counters. Squid shifts the array every 60 seconds and calculates 1-, 5-, and 60-minute averages from this array. This page is simply a dump of the current counter values in a format more suitable for computer processing than for reading by humans. The counters are as follows:
sample_time
The sample time is actually the time of the last shift, rather than the current time. The sample time is always within 60 seconds of the current time.
client_http.requests
The number of HTTP requests received from clients.
client_http.hits
The number of cache hits in response to client requests. A hit is any transaction logged with one of the TCP_HIT codes in access.log.
client_http.errors
The number of client transactions that resulted in an error.
client_http.kbytes_in
The amount of traffic (in kilobytes) received from clients in their requests. This is measured at the HTTP layer and doesn't include TCP, IP, and other packet headers.
client_http.kbytes_out
The amount of traffic (in kilobytes) sent to clients in responses. Also measured at the HTTP layer.
client_http.hit_kbytes_out
The amount of traffic sent to clients in responses that are cache hits. Keep in mind that some cache hits are 304 (Not Modified) responses.
server.all.requests
The number of requests forwarded to origin servers (or neighbor caches) for all server-side protocols (HTTP, FTP, Gopher, etc.).
server.all.errors
The number of server-side requests (all protocols) that resulted in some kind of error.
server.all.kbytes_in
The amount of traffic (in kilobytes) read from the server side for all protocols.
server.all.kbytes_out
The amount of traffic written to origin servers and/or neighbor caches for server-side requests.
server.http.requests
The number of server-side requests to HTTP servers, including neighbor caches.
server.http.errors
The number of server-side HTTP requests that resulted in an error.
server.http.kbytes_in
The amount of traffic read from HTTP origin servers and neighbor caches.
server.http.kbytes_out
The amount of traffic written to HTTP origin servers and neighbor caches.
server.ftp.requests
The number of requests sent to FTP servers.
server.ftp.errors
The number of requests sent to FTP servers that resulted in an error.
server.ftp.kbytes_in
The amount of traffic read from FTP servers, including control channel traffic.
server.ftp.kbytes_out
The amount of traffic written to FTP servers, including control channel traffic.
server.other.requests
The number of "other" server-side requests. Currently, the other protocols are Gopher, WAIS, and SSL.
server.other.errors
The number of Gopher, WAIS, and SSL requests that resulted in an error.
server.other.kbytes_in
The amount of traffic read from Gopher, WAIS, and SSL servers.
server.other.kbytes_out
The amount of traffic written to Gopher, WAIS, and SSL servers.
icp.pkts_sent
The number of ICP messages sent to neighbors. This includes both queries and replies but doesn't include HTCP messages.
icp.pkts_recv
The number of ICP messages received from neighbors, including both queries and replies.
icp.queries_sent
The number of ICP queries sent to neighbors.
icp.replies_sent
The number of ICP replies sent to neighbors.
icp.queries_recv
The number of ICP queries received from neighbors.
icp.replies_recv
The number of ICP replies received from neighbors.
icp.query_timeouts
The number of times that Squid timed out waiting for ICP replies to arrive.
icp.replies_queued
The number of times Squid queued an ICP message after the initial attempt to send failed. See Section 14.2.1.24.
icp.kbytes_sent
The amount of traffic sent in all ICP messages, including both queries and replies.
icp.kbytes_recv
The amount of traffic received in all ICP messages, including both queries and replies.
icp.q_kbytes_sent
The amount of traffic sent to neighbors in ICP queries.
icp.r_kbytes_sent
The amount of traffic sent to neighbors in ICP replies.
icp.q_kbytes_recv
The amount of traffic received from neighbors in ICP queries.
icp.r_kbytes_recv
The amount of traffic received from neighbors in ICP replies.
icp.times_used
The number of times ICP resulted in the selection of a neighbor as the next hop for a cache miss.
cd.times_used
The number of times Cache Digests resulted in the selection of a neighbor as the next hop for a cache miss.
cd.msgs_sent
The number of Cache Digest messages sent to neighbors.
cd.msgs_recv
The number of Cache Digest messages received from neighbors.
cd.memory
The amount of memory (in kilobytes) used by the Cache Digests feature.
cd.local_memory
The amount of memory (in kilobytes) used to store Squid's own Cache Digest.
cd.kbytes_sent
The amount of traffic sent to neighbors in Cache Digest messages.
cd.kbytes_recv
The amount of traffic received from neighbors in Cache Digest messages.
unlink.requests
The number of unlink requests given to the (optional) unlinkd process.
page_faults
The number of (major) page faults as reported by getrusage().
select_loops
The number of times Squid called select() or poll() in the main I/O loop.
cpu_time
The amount of CPU time (in seconds) accumulated, as reported by getrusage().
wall_time
The amount of wall-clock (human) time, in seconds, elapsed since Squid was started.
swap.outs
The number of objects (swap files) saved to disk.
swap.ins
The number of objects (swap files) read from disk.
swap.files_cleaned
The number of orphaned cache files removed by the periodic cleanup procedure.
aborted_requests
The number of server-side HTTP requests aborted due to client-side aborts.
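Because these counters only accumulate, they become most useful when you compare two snapshots. The Python sketch below computes request and byte hit ratios for the interval between two dumps. It assumes each line of the page has the form name = value, which the text above does not show, so treat the parsing as an assumption; the function names and the numbers in the example are invented for illustration.

```python
def parse_counters(text):
    """Parse lines of the (assumed) form 'name = value' into a dict of floats."""
    counters = {}
    for line in text.splitlines():
        name, separator, value = line.partition("=")
        tokens = value.split()
        if not separator or not tokens:
            continue
        try:
            # Keep only the first token; anything after it is ignored.
            counters[name.strip()] = float(tokens[0])
        except ValueError:
            continue
    return counters


def interval_hit_ratios(older, newer):
    """Return (request hit ratio, byte hit ratio) between two snapshots."""
    def delta(key):
        return newer[key] - older[key]

    requests = delta("client_http.requests")
    kbytes_out = delta("client_http.kbytes_out")
    request_ratio = delta("client_http.hits") / requests if requests else 0.0
    byte_ratio = delta("client_http.hit_kbytes_out") / kbytes_out if kbytes_out else 0.0
    return request_ratio, byte_ratio


if __name__ == "__main__":
    # Invented sample data, not real Squid output.
    older = parse_counters("client_http.requests = 100\nclient_http.hits = 40\n"
                           "client_http.kbytes_out = 1000\nclient_http.hit_kbytes_out = 200\n")
    newer = parse_counters("client_http.requests = 200\nclient_http.hits = 90\n"
                           "client_http.kbytes_out = 3000\nclient_http.hit_kbytes_out = 700\n")
    print(interval_hit_ratios(older, newer))
```

Working with differences rather than raw totals keeps a long-running Squid's history from masking a recent change in hit ratio.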
14.2.1.31 peer_select: Peer Selection Algorithms
This page contains a lot of low-level detail about cache digests that I won't discuss. Most of the numbers are meaningful only to the developers who originally wrote the Cache Digest implementation. However, at the end of this page is a little table that compares Algorithm usage:
    Algorithm usage:
            Cache Digest:  27 ( 24%)
            Icp:           84 ( 76%)
            Total:        111 (100%)
In this example, you can see that Squid sent 111 requests to one of its neighbors: 27 are due to Cache Digests and 84 are due to ICP. In this context, ICP also includes HTCP.
14.2.1.32 digest_stats: Cache Digest and ICP Blob
This page is actually just a concatenation of the following other cache manager pages:
● Traffic and Resource Counters
● 5 Minute Average of Counters
● Full Histogram Counts
● Peer Selection Algorithms
● Store Digest
Its only purpose is to enable developers to take a snapshot of a number of statistics with a single request.
14.2.1.33 5min: 5 Minute Average of Counters
This page shows a five-minute average of the data in the Traffic and Resource Counters page. In addition to the counters mentioned in Section 14.2.1.30, this page also contains the following values:
client_http.all_median_svc_time
The median service (response) time for all client requests from the last five minutes.
client_http.miss_median_svc_time
The median service time for cache misses from the last five minutes.
client_http.nm_median_svc_time
The five-minute median service time for requests logged as TCP_IMS_HIT. See "Not-Modified Replies" in Section 14.2.1.24.
client_http.nh_median_svc_time
The five-minute median service time for Near Hits (TCP_REFRESH_HIT requests).
client_http.hit_median_svc_time
The five-minute median service time for unvalidated cache hits.
icp.query_median_svc_time
The five-minute median service time for ICP queries sent by Squid (how long it takes for the neighbors to reply to our queries).
icp.reply_median_svc_time
The five-minute median service time for ICP queries received by Squid (how long it takes Squid to reply to its neighbors' queries). ICP processing normally occurs faster than the process clock is updated, so this value is always zero.
dns.median_svc_time
The five-minute median service time for DNS queries.
select_fds
The mean rate at which the main I/O loop scans file descriptors with select() or poll(). Note: a low number doesn't necessarily indicate poor performance. It may just be that Squid often has no work to do.
average_select_fd_period
The mean number of seconds required to scan a file descriptor in the main I/O loop.
median_select_fds
The five-minute median number of ready file descriptors each time Squid calls select() or poll() (the median of the select()/poll() return value). Unfortunately, this value is almost always zero because Squid's functions for calculating the median don't work very well with the select_fds histogram, in which 0 and 1 are the most common values.
syscalls.selects
The five-minute mean rate of calls to select()/poll(). If Squid is using poll() on your system, the variable is called syscalls.polls. This value may be a little larger than select_loops, because the latter only includes calls in the main I/O loop.
syscalls.disk.opens
The five-minute mean rate of open() calls for disk files.
syscalls.disk.closes
The five-minute mean rate of close() calls for disk files.
syscalls.disk.reads
The five-minute mean rate of read() calls for disk files.
syscalls.disk.writes
The five-minute mean rate of write() calls for disk files.
syscalls.disk.seeks
The five-minute mean rate of seek() calls for disk files. Probably zero unless you are using aufs, which always calls seek() before reading.
syscalls.disk.unlinks
The five-minute mean rate of unlink() (or, in some cases, truncate()) calls for disk files.
syscalls.sock.accepts
The five-minute mean rate of accept() calls for network sockets.
syscalls.sock.sockets
The five-minute mean rate of socket() calls for network sockets.
syscalls.sock.connects
The five-minute mean rate of connect() calls for network sockets.
syscalls.sock.binds
The five-minute mean rate of bind() calls for network sockets.
syscalls.sock.closes
The five-minute mean rate of close() calls for network sockets.
syscalls.sock.reads
The five-minute mean rate of read() calls for network sockets.
syscalls.sock.writes
The five-minute mean rate of write() calls for network sockets.
syscalls.sock.recvfroms
The five-minute mean rate of recvfrom() calls for network sockets. Used for UDP-based protocols, such as DNS, ICP, HTCP, and some interprocess communication.
syscalls.sock.sendtos
The five-minute mean rate of sendto() calls for network sockets. Used for UDP-based protocols, such as DNS, ICP, HTCP, and some interprocess communication.
14.2.1.34 60min: 60 Minute Average of Counters
This page shows a 60-minute average of the data in the Traffic and Resource Counters page. The descriptions are identical to those for the 5 Minute Average of Counters page, except the measurements are taken over one hour.
14.2.1.35 utilization: Cache Utilization
This page displays averages of the counters (see Traffic and Resource Counters and 5 Minute Average of Counters) over various time spans. The same values are reported for 5-minute, 15-minute, 1-hour, 8-hour, 1-day, and 3-day intervals. This page, with a poorly chosen name, exists primarily so that developers can take a quick snapshot of statistics for testing purposes.
14.2.1.36 histograms: Full Histogram Counts
This page displays the current histogram values (since Squid was started) for a number of measurements:
● client_http.all_svc_time
● client_http.miss_svc_time
● client_http.nm_svc_time
● client_http.nh_svc_time
● client_http.hit_svc_time
● icp.query_svc_time
● icp.reply_svc_time
● dns.svc_time
● select_fds_hist
These are the same measurements described in Section 14.2.1.33, except that here Squid gives the full histogram, instead of the mean or median.
Depending on the type of histogram, you may see two or three columns. The first column is the bin number and lower bound on the bin value. The second column is the number of counts for that bin. The optional third column is the number of counts divided by the "size" of the bin. The last column is probably only interesting for log-based histograms, in which the bin size isn't constant.
14.2.1.37 active_requests: Client-Side Active Requests
This page shows a list of currently active client-side requests. The list is sorted starting with the most recent, and ending with the oldest requests. The information given here is primarily useful to developers. A typical entry looks like this:
    Connection: 0x84ecd10
            FD 132, read 1273, wrote 12182
            FD desc: http://www.squid-cache.org/Doc/FAQ/FAQ.html
            in: buf 0xa063000, offset 0, size 4096
            peer: 206.168.0.9:1058
            me: 192.43.244.42:3128
            nrequests: 3
            defer: n 0, until 0
    uri http://www.squid-cache.org/Doc/FAQ/FAQ.html
    log_type TCP_MISS
    out.offset 0, out.size 0
    req_sz 392
    entry 0x960c680/3B49762ABF444D80B6465552F6CFAD4C
    old_entry 0x0/N/A
    start 1066036250.669955 (2.240814 seconds ago)
Connection
The internal memory address of the connection structure.
FD
The file descriptor for the TCP connection, followed by the number of bytes read and written.
FD desc
A short description of the socket, usually a URI. This is the same as in Section 14.2.1.25.
in
The internal memory location of the input buffer, the offset at which Squid will place data after the next read() call, and the size of the input buffer.
peer
The remote socket address of the TCP connection. You can correlate this value with what you see in netstat -n output.
me
The local socket address of the TCP connection.
nrequests
The number of HTTP requests received on this connection. A value greater than 1 indicates persistent connection reuse.
defer
Indicates whether Squid is postponing reads on the socket.
uri
The URI from the client's request. Unlike FD desc, this one isn't truncated.
log_type
The cache status code that appears in access.log when this transaction is complete.
out.offset
The offset, relative to the start of the HTTP reply message, at which the client side has requested data from the storage system.
out.size
The number of response bytes written to the client.
req_sz
The size of the client's HTTP request. Note that, for persistent connections, this refers only to the current request.
entry
The memory address and MD5 hash of the corresponding StoreEntry structure.
old_entry
For validation requests, this is the memory address and MD5 hash of the cached response StoreEntry.
start
The time at which Squid began processing this request.
14.2.1.38 store_digest: Store Digest This page is available only wit h t he ./ configure —enable- cache- digest s opt ion. I t displays a few st at ist ics about Squid's own cache digest . I t looks like t his: store digest: size: 620307 bytes entries: count: 324806 capacity: 992490 util: 33% deletion attempts: 0 bits: per entry: 5 on: 1141065 capacity: 4962456 util: 23%
bit-seq: count: 1757902 avg.len: 2.82 added: 324806 rejected: 611203 ( 65.30 %) del-ed: 0 collisions: on add: 0.08 % on rej: 0.07 %
size
The number of bytes that the digest occupies in memory.

entries count
The number of cached objects entered into the digest.

entries capacity
The target capacity for the digest. Note that this isn't a hard limit, but rather an estimate for optimally sizing the digest.

entries util
The percentage of entries added compared to the capacity.

deletion attempts
Squid doesn't currently support deletion of cache digest entries, so this is always zero.

bits per entry
The number of bits that each item turns on. The same as the digest_bits_per_entry value from squid.conf.

bits on
The number of bits that have been turned on so far.

bits capacity
The total number of bits in the digest. Equal to the digest size multiplied by eight.

bit-seq count
The number of same-bit sequences in the digest. For example, the pattern 110100011111 has 5 sequences of 1s and 0s.

bit-seq avg.len
The mean length of same-bit sequences.

added
The number of entries added to the digest since it was created.

rejected
The number of entries not added to the digest. An entry may not be added because it isn't cachable, is too large, stale, or about to become stale, etc.

del-ed
Squid doesn't currently support deletion of cache digest entries, so this is always zero.

collisions on add
This is the percentage of additions that didn't turn on any new bits. Recall that Bloom filters have the property that two or more entries may turn on the same bit.

collisions on rej
This is the percentage of rejected additions that wouldn't have turned on any new bits.
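
The arithmetic behind several of these fields can be checked by hand. The following is a minimal sketch, not Squid's own code: the function names are hypothetical, and rounding to the nearest percent is an assumption (Squid may truncate instead). It counts same-bit sequences in the example pattern and recomputes the capacity and utilization figures from the sample output.

```python
from itertools import groupby


def bit_sequences(bits: str) -> list[str]:
    """Split a bit string into runs of identical bits."""
    return ["".join(run) for _, run in groupby(bits)]


def digest_summary(size_bytes: int, bits_on: int, count: int, capacity: int) -> dict:
    """Recompute derived digest figures from the raw counters.

    Rounding to the nearest percent is an assumption made for this sketch.
    """
    bits_capacity = size_bytes * 8  # eight bits per byte of digest
    return {
        "bits_capacity": bits_capacity,
        "bits_util_pct": round(100 * bits_on / bits_capacity),
        "entries_util_pct": round(100 * count / capacity),
    }


runs = bit_sequences("110100011111")
print(runs)                                # the individual same-bit sequences
print(len(runs))                           # bit-seq count for this pattern
print(sum(map(len, runs)) / len(runs))     # bit-seq avg.len for this pattern
print(digest_summary(620307, 1141065, 324806, 992490))
```

Applied to the sample output, the same average-length calculation is simply the bit capacity divided by the bit-seq count.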

14.2.1.39 storedir: Store Directory Stats

This page displays some statistics from the storage system. First, you'll see a few global values. For example:

    Store Directory Statistics:
    Store Entries          : 2873564
    Maximum Swap Size      : 46080000 KB
    Current Store Swap Size: 41461672 KB
    Current Capacity       : 90% used, 10% free

Store Entries
The number of StoreEntry objects. Most, but not necessarily all, of these are for on-disk objects.

Maximum Swap Size
The sum of all cache_dir sizes.

Current Store Swap Size
The total amount of cached data currently stored on disk. Note that Squid rounds response sizes (e.g., 1722 bytes) up to the nearest multiple of the filesystem block size (e.g., 2048 bytes) when incrementing and decrementing this value.

Current Capacity
The percentage of the maximum disk space currently in use. The percentage in use should normally stay below the cache_swap_high value.

Next, you'll see a section for each cache_dir. It looks something like this:

    Store Directory #1 (diskd): /cache1
    FS Block Size 1024 Bytes
    First level subdirectories: 16
    Second level subdirectories: 64
    Maximum Size: 15360000 KB
    Current Size: 13823996 KB
    Percent Used: 90.00%
    Filemap bits in use: 958439 of 2097152 (46%)
    Filesystem Space in use: 14030485/17370434 KB (81%)
    Filesystem Inodes in use: 959440/4340990 (22%)
    Flags: SELECTED
    Pending operations: 0
    Removal policy: lru
    LRU reference age: 23.63 days

Store Directory #
The directory number, type, and pathname.

FS Block Size
The filesystem block size, determined by the statfs() or statvfs() system calls. If these functions aren't available or return an error, the block size defaults to 2048 bytes.

The next few lines are actually storage scheme-dependent. For the most part, ufs, aufs, and diskd are very similar and all report the same statistics.

First level subdirectories
The number of first-level subdirectories you told Squid to use on the cache_dir line.

Second level subdirectories
The number of second-level subdirectories you told Squid to use on the cache_dir line.

Maximum Size
The maximum allowed size for this cache directory.

Current Size
The amount of disk space currently in use.

Percent Used
The percentage of cache_dir space currently in use.

Filemap bits in use
Squid uses a bitmap to keep track of file numbers that are allocated and free. This line shows the number and percentage of bits in use. The filemap grows automatically as needed, so don't worry if it shows up as 99% full.

Filesystem Space in use
These numbers come from the statfs()/statvfs() system calls. These should be the same values as you'd see from the df command. Squid doesn't use these numbers, other than to report them here for your information. Note that these values may be larger than Current Size, especially if the partition is used for more than Squid's cache.

Filesystem Inodes in use
These numbers also come from statfs()/statvfs(). They are present to remind you that running out of inodes is just as bad as running out of free space. Unfortunately, if you run out of inodes, you'll probably be forced to newfs the partition.

Flags
Possible values include SELECTED and READ-ONLY. The SELECTED flag means that this particular cache_dir was most recently selected by the cache directory selection algorithm (see Section 7.4). The READ-ONLY flag means that the cache directory has been marked read-only in the configuration file (see Section 7.1.5).

Pending operations
This line appears only for diskd cache directories. It shows the number of I/O requests dispatched to the diskd process that have not yet been acknowledged.

That's the end of the scheme-specific data. The remaining lines are specific to the cache_dir replacement algorithm:

Removal policy
Possible values include lru (the default) or heap. Note that for heap, you won't see the algorithm name (LFU, GDSF, or LRU).

LRU reference age
If the removal policy is lru, you'll also see this line. It shows the age of the oldest object in the LRU list.
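
The block-size rounding described for Current Store Swap Size, and the Percent Used figure, are both simple arithmetic. Here is a minimal sketch under stated assumptions: the function names are hypothetical, and the 2048-byte default mirrors the fallback block size mentioned above rather than any particular filesystem.

```python
def round_up_to_block(size: int, block_size: int = 2048) -> int:
    """Round a response size up to the nearest multiple of the block size."""
    # Ceiling division without floating point: -(-a // b) == ceil(a / b)
    return -(-size // block_size) * block_size


def percent_used(current_kb: int, maximum_kb: int) -> float:
    """Percentage of a cache_dir's maximum size currently in use."""
    return 100.0 * current_kb / maximum_kb


# The example from the text: a 1722-byte response on 2048-byte blocks.
print(round_up_to_block(1722))
# The sample cache_dir above: Current Size versus Maximum Size.
print(round(percent_used(13823996, 15360000), 2))
```

Because of this rounding, the swap size Squid reports can be somewhat larger than the sum of the actual response sizes, particularly for caches holding many small objects.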

14.2.1.40 store_check_cachable_stats: storeCheckCachable() Stats

This page displays a table of counters from the storeCheckCachable() function. This function is called for most responses, just before Squid attempts to open a disk file for writing.

Squid knows that some responses can't be cached, based entirely on the request. These responses aren't included in the storeCheckCachable() statistics.

The table includes the following lines:

no.not_entry_cachable
The ENTRY_CACHABLE flag was cleared for some reason.

no.release_request
The RELEASE_REQUEST flag was set while reading the response. This may be due to an error (such as receiving a partial response) or to the rules of the transfer protocol. In some versions of Squid, this counter is always zero because the storeReleaseRequest() function always clears the ENTRY_CACHABLE bit, causing such objects to be counted as no.not_entry_cachable instead.

no.wrong_content_length
The actual content length doesn't match the Content-Length header value. In some versions of Squid, this counter is always zero because storeReleaseRequest() is always called if the response size doesn't match the expected content length.

no.negative_cached
The ENTRY_NEGCACHED flag was set. See the description for TCP_NEGATIVE_HIT in Section 13.2.1.

no.too_big
The response body was larger than the maximum_object_size value.

no.too_small
The response body was smaller than the minimum_object_size value.

no.private_key
The response has a private cache key, indicating that it can't be shared with other users.

no.too_many_open_files
The Squid process was low on free file descriptors.

no.too_many_open_fds
Squid had more than max_open_disk_fds opened at one time.

yes.default
The response was cachable because it did not meet any of the preceding criteria.

14.2.1.41 store_io: Store IO Interface Stats

This short table contains four lines related to allocating disk storage for a new response. For example:

    Store IO Interface Stats
    create.calls 2825670
    create.select_fail 0
    create.create_fail 0
    create.success 2825670

create.calls
The number of calls to the function that creates a new disk file.

create.select_fail
The number of times that the create operation failed because the cache_dir selection algorithm did not select a cache directory. The default selection algorithm, least-load, fails if it thinks all cache directories are too busy.

create.create_fail
The number of times that the create operation failed at the storage layer. This may happen if the open() call returns an error or if the storage system (e.g., diskd) elects to not open a disk file for some reason (e.g., overload condition).

create.success
The number of times the create operation succeeded.

14.2.1.42 pconn: Persistent Connection Utilization Histograms

This page displays two histograms. The first is for client-side persistent connection usage. For example:

    Client-side persistent connection counts:

        req/
        conn      count
        ----  ---------
           0      74292
           1   14362705
           2    3545955
           3    2068486
           4    1411423
           5    1030023
           6     778722
           7     603649
           8     474592
           9     376154
          10     301396

On the left is the number of requests per connection. On the right is the number of times a client connection had that many requests. Most likely, you'll see that one request/connection has the highest count and that the counts decrease as the number of requests/connection increases.

The second table has the same information, but for server-side HTTP connections. You should see the same sort of pattern here, with one request/connection having the highest count.
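
To check the pattern described above against your own cache, the histogram can be parsed and summarized. This is a minimal sketch, assuming the two-column layout of the sample; the function names are hypothetical, and the sample rows shown here stop at 10 requests per connection, so the share is computed only over those rows.

```python
import re

SAMPLE = """\
   0      74292
   1   14362705
   2    3545955
   3    2068486
   4    1411423
   5    1030023
   6     778722
   7     603649
   8     474592
   9     376154
  10     301396
"""


def parse_histogram(text: str) -> dict[int, int]:
    """Map requests-per-connection to the number of connections."""
    hist = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(\d+)\s+(\d+)\s*", line)
        if match:  # header and separator lines do not match and are skipped
            hist[int(match.group(1))] = int(match.group(2))
    return hist


def share(hist: dict[int, int], requests: int) -> float:
    """Fraction of the listed connections that carried this many requests."""
    return hist.get(requests, 0) / sum(hist.values())


hist = parse_histogram(SAMPLE)
print(max(hist, key=hist.get))        # the most common requests/connection
print(f"{share(hist, 1):.1%}")        # share of single-request connections
```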

14.2.1.43 refresh: Refresh Algorithm Statistics

The refresh page shows a few tables relating to the freshness of cached objects. Internally, Squid keeps track of the way different modules use the refresh functions. The first table shows how many calls each module has made. The really interesting data is contained in the remaining tables, however.

The HTTP histogram shows the breakdown of freshness checks for client HTTP requests. For example:

    HTTP histogram:
    Count    %Total  Category
          0    0.00  Fresh: request max-stale wildcard
          0    0.00  Fresh: request max-stale value
     173984    9.76  Fresh: expires time not reached
     462757   25.97  Fresh: refresh_pattern last-mod factor percentage
         42    0.00  Fresh: refresh_pattern min value
          0    0.00  Fresh: refresh_pattern override expires
          0    0.00  Fresh: refresh_pattern override lastmod
       5521    0.31  Stale: response has must-revalidate
          0    0.00  Stale: changed reload into IMS
          0    0.00  Stale: request has no-cache directive
     470912   26.43  Stale: age exceeds request max-age value
     455073   25.54  Stale: expires time reached
      65612    3.68  Stale: refresh_pattern max age rule
     144706    8.12  Stale: refresh_pattern last-mod factor percentage
       3274    0.18  Stale: by default
    1781881  100.00  TOTAL

Note that the rules aren't necessarily evaluated in the order in which they appear in the table. Here's what each line means:

Fresh: request max-stale wildcard
Squid considers the cached response fresh because the request includes a max-stale directive without any value. For example:

    GET /blah... HTTP/1.1
    Cache-control: max-stale

According to RFC 2616: "If no value is assigned to max-stale, then the client is willing to accept a stale response of any age."

Fresh: request max-stale value
Squid considers the cached response fresh because the request includes a max-stale directive with a particular value, which is larger than the amount of time since the object expired.

Fresh: expires time not reached
Squid considers the cached response fresh because its expiration time has not yet been reached.

Fresh: refresh_pattern last-mod factor percentage
Squid considers the cached response fresh because it matches one of the refresh_pattern rules and has a last-modified factor (LM-factor) value that's less than that specified by the rule. See Section 7.7.

Fresh: refresh_pattern min value
Squid considers the cached response fresh because it matches one of the refresh_pattern rules and its age is less than the min value specified by the rule. See Section 7.7.

Fresh: refresh_pattern override expires
Squid considers the cached response fresh because it matched one of the refresh_pattern rules with the override-expire option. This option causes Squid to give precedence to the refresh_pattern minimum value over the object's expiration time. Note: using the override-expire option is a violation of RFC 2616.

Fresh: refresh_pattern override lastmod
Squid considers the cached response fresh because it matched one of the refresh_pattern rules with the override-lastmod option. This option causes Squid to give precedence to the refresh_pattern minimum value over the LM-factor value. Note: using the override-lastmod option is a violation of RFC 2616.

Stale: response has must-revalidate
Squid considers the cached response stale because it contains a Cache-Control: must-revalidate directive.

Stale: changed reload into IMS
Squid considers the cached response stale because it matches one of the refresh_pattern rules with the reload-into-ims option. With this option, Squid turns a request with Cache-Control: no-cache (or similar) into a cache validation. Note: using the reload-into-ims option is a violation of RFC 2616.

Stale: request has no-cache directive
Squid considers the cached response stale because the request contains a Cache-Control: no-cache directive.

Stale: age exceeds request max-age value
Squid considers the cached response stale because the request has a max-age directive, which is less than the response's age.

Stale: expires time reached
Squid considers the cached response stale because its expiration time has been reached.

Stale: refresh_pattern max age rule
Squid considers the cached response stale because it matches one of the refresh_pattern rules, and its age is greater than the max value specified by the rule.

Stale: refresh_pattern last-mod factor percentage
Squid considers the cached response stale because it matches one of the refresh_pattern rules, and its LM-factor value is greater than the factor specified by the rule.

Stale: by default
Squid considers the cached response stale by default, because it didn't meet any of the other criteria.

Following the HTTP histogram, you'll see the same data for ICP, HTCP, Cache Digests, and On Store. The On Store table represents freshness checks for responses that are coming into Squid's cache (i.e., cachable misses). Note, however, that Squid does store stale responses (as long as they have a cache validator). Don't be alarmed if you see some stale responses in the On Store histogram.
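
A quick way to read one of these histograms is to total the Fresh and Stale categories. The following is a minimal sketch rather than anything Squid provides: the function name is hypothetical, and the parser assumes each row has the form "count, percentage, category" as in the sample HTTP histogram, whose nonzero rows are repeated here.

```python
import re

HISTOGRAM = """\
173984 9.76 Fresh: expires time not reached
462757 25.97 Fresh: refresh_pattern last-mod factor percentage
42 0.00 Fresh: refresh_pattern min value
5521 0.31 Stale: response has must-revalidate
470912 26.43 Stale: age exceeds request max-age value
455073 25.54 Stale: expires time reached
65612 3.68 Stale: refresh_pattern max age rule
144706 8.12 Stale: refresh_pattern last-mod factor percentage
3274 0.18 Stale: by default
1781881 100.00 TOTAL
"""


def fresh_stale_totals(text: str) -> dict[str, int]:
    """Sum the counts of all Fresh rows and all Stale rows."""
    totals = {"Fresh": 0, "Stale": 0}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+[\d.]+\s+(Fresh|Stale):", line)
        if match:  # the TOTAL row and any headers are ignored
            totals[match.group(2)] += int(match.group(1))
    return totals


totals = fresh_stale_totals(HISTOGRAM)
checked = totals["Fresh"] + totals["Stale"]
print(totals)
print(f"fresh share: {totals['Fresh'] / checked:.1%}")
```

In the sample, the two totals add up to the TOTAL row, which is a useful sanity check when parsing output from a different Squid version.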

14.2.1.44 delay: Delay Pool Levels

This page displays the Delay Pool statistics. Squid has three classes of pools (1, 2, 3) and three types of buckets (aggregate, individual, and network). A class 1 pool has only an aggregate bucket, a class 2 pool has both aggregate and individual, and a class 3 pool has all three.

An aggregate bucket looks like this:

    Aggregate:
        Max: 16384
        Restore: 4096
        Current: 6144

The values are all in bytes. Max is the size of the bucket, which is the number of bytes the bucket can hold. Restore is the number of bytes added to the bucket each second. Current is the number of bytes currently in the bucket. If nobody uses the bytes, the bucket fills until it reaches the maximum size.

An individual bucket is almost the same:

    Individual:
        Max: 20000
        Restore: 5000
        Current: 1:18760 9:4475 14:20000

The only difference is that the Current line displays a number of different values, one for each host number. The host number is defined as the last octet of an IPv4 address. In this example, the host numbers are 1, 9, and 14.

In a class 2 delay pool, the host numbers from different networks share the same bucket. For example, 192.168.0.1 and 192.168.44.1 both share the bucket for host number 1. In a class 3 pool, however, each network number (third octet) has its own array of individual buckets. Thus, for a class 3 pool, the individual buckets appear this way:

    Individual:
        Max: 20000
        Rate: 5000
        Current [Network 0]: 1:12000
        Current [Network 44]: 1:17000

A network bucket (for class 3 pools only) is similar as well:

    Network:
        Max: 30000
        Rate: 15000
        Current: 0:3912 7:30000

In this case, the Current line shows the current level for each network number (third octet). See Appendix C for more information about Delay Pools.
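
The bucket behavior and the host/network numbering described above can be modeled in a few lines. This is a simplified sketch, not Squid's implementation: the class and function names are hypothetical, and the take() method in particular is an assumption about how bytes are drained, which this section doesn't describe.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    max_bytes: int   # "Max": the most bytes the bucket can hold
    restore: int     # "Restore": bytes added to the bucket each second
    current: int     # "Current": bytes in the bucket right now

    def tick(self, seconds: int = 1) -> None:
        """Refill the bucket, never exceeding its maximum size."""
        self.current = min(self.max_bytes, self.current + self.restore * seconds)

    def take(self, wanted: int) -> int:
        """Hand out up to `wanted` bytes (assumed draining behavior)."""
        granted = max(0, min(wanted, self.current))
        self.current -= granted
        return granted


def host_number(ip: str) -> int:
    """Last octet of an IPv4 address: the individual bucket index."""
    return int(ip.split(".")[3])


def network_number(ip: str) -> int:
    """Third octet of an IPv4 address: the class 3 network bucket index."""
    return int(ip.split(".")[2])


aggregate = Bucket(max_bytes=16384, restore=4096, current=6144)
for _ in range(3):
    aggregate.tick()
    print(aggregate.current)  # climbs by 4096 per second until capped at Max

# Both addresses map to host number 1 but to different network numbers.
for ip in ("192.168.0.1", "192.168.44.1"):
    print(ip, host_number(ip), network_number(ip))
```

Starting from the sample aggregate bucket, an idle pool reaches its maximum within three seconds and then stays there.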

14.2.1.45 forward: Request Forwarding Statistics

The table on this page shows how many attempts were made to forward each request, with their results. Upon receiving some status codes, Squid gives up immediately. For others, however, Squid keeps trying. Each row of the table is a different HTTP status code (200, 401, 404, etc.). Each column is the number of forwarding attempts. The value in each cell is how many requests were forwarded that many times, resulting in the corresponding status code. This information helps developers understand whether or not it makes sense to retry a request after receiving certain types of responses. Here is an example:

    Status   try#1   try#2   try#3   try#4   try#5   try#6   try#7   try#8   try#9  try#10
         0       1       0       0       0       0       0       0       0       0       0
       200 3970083  111015   51185   29002   18242   12097    8191    6080    4490    6140
       201      57       0       0       0       0       0       0       0       0       0
       202     162       0       0       0       0       0       0       0       0       0
       204    1321      11       0       0       0       0       0       0       0       0
       206  624288     453      25       9       4       3       0       1       0       0
       207     147       0       0       0       0       0       0       0       0       0
       300      23       0       0       0       0       0       0       0       0       0
       301   23500      25       2       0       0       0       1       0       0       0
       302  339332    3806     153      26       6       4       2       3       0       1
       303     101       1       0       0       0       0       0       0       0       0
       304  772831    3510     125      21       7       8       8       5       3       2
       307       7       0       0       0       0       0       0       0       0       0
       400     529       1       0       0       0       0       0       0       0       0
       401    1559       0       0       0       0       0       0       0       0       0
       403    5098      30       1       1       0       0       0       0       0       0
       404  100800     216      25       6       7       1       2       4       1       5
       405       1       0       0       0       0       0       0       0       0       0
       ...

A value of 29,002 in the cell under try#4 and in the row for status 200 means that there were 29,002 times when Squid finally got a successful response after 4 forwarding attempts.

If you look at the table, you may see some unknown status codes. Squid keeps track of all status codes up to 600, even those it doesn't know about. See Table 13-1 for the list of codes that Squid does know about.
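
To judge how often retries actually happen, the table can be parsed into one list of counts per status code. This is a minimal sketch assuming whitespace-separated columns as in the sample; the function names are hypothetical, and only two sample rows are repeated here.

```python
SAMPLE = """\
Status try#1 try#2 try#3 try#4 try#5 try#6 try#7 try#8 try#9 try#10
200 3970083 111015 51185 29002 18242 12097 8191 6080 4490 6140
404 100800 216 25 6 7 1 2 4 1 5
"""


def parse_forward_table(text: str) -> dict[int, list[int]]:
    """Map each status code to its list of per-attempt request counts."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip the header row, blank lines, and "..." markers
        rows[int(fields[0])] = [int(field) for field in fields[1:]]
    return rows


def retry_share(tries: list[int]) -> float:
    """Fraction of requests that needed more than one forwarding attempt."""
    total = sum(tries)
    return 0.0 if total == 0 else 1 - tries[0] / total


rows = parse_forward_table(SAMPLE)
print(rows[200][3])  # the try#4 cell for status 200
for status, tries in rows.items():
    print(status, f"{retry_share(tries):.2%}")
```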

14.2.1.46 client_list: Cache Client List

The cache client list shows a handful of statistics for each client IP address accessing Squid, which looks like this:

    Address: 206.168.0.9
    Name: 206.168.0.9
    Currently established connections: 0
        ICP Requests 59000
            UDP_HIT                   1609   3%
            UDP_MISS                 57388  97%
            UDP_INVALID                  3   0%
        HTTP Requests 11281
            TCP_HIT                    656   6%
            TCP_MISS                  3464  31%
            TCP_REFRESH_HIT           4477  40%
            TCP_REFRESH_MISS           767   7%
            TCP_CLIENT_REFRESH_M       397   4%
            TCP_IMS_HIT               1082  10%
            TCP_SWAPFAIL_MISS            7   0%
            TCP_NEGATIVE_HIT            13   0%
            TCP_MEM_HIT                418   4%

The Address line, obviously, shows the client's IP address. Name is the same, unless you have log_fqdn enabled and the DNS reports a name for the address. The Currently established connections line shows how many HTTP connections are currently open between the client and Squid. If the client has sent any ICP queries, you'll see a breakdown of the results here. In this example, only 3% of this client's ICP queries were hits. Note that this page doesn't currently include HTCP result statistics. Finally, you'll see a breakdown of HTTP request result codes.

The client database consumes a fair amount of memory, especially if you have a large number of client IP addresses accessing Squid. You can disable the database entirely, thus conserving memory, with the client_db directive. Also note that there is no way to clear the counters or to remove entries while Squid is running.

14.2.1.47 netdb: Network Measurement Database

This page is available only with the ./configure --enable-icmp option (see Section 10.5). On this page you'll find quite a lot of IP addresses, hostnames, packet counters, and RTT values. It looks something like this:

    Network DB Statistics:
    Network             recv/sent     RTT  Hops Hostnames
    165.123.34.0           7/   7    12.7   8.6 onlinebooks.library.upenn.edu www.library.upenn.edu digital.library.upenn.edu
        rtp.us.ircache.net           17.0  11.0
        sj.us.ircache.net            71.0  17.3
    209.202.204.0          4/   4    12.8  10.0 adbuyer3.lycos.com
        rtp.us.ircache.net           20.6  15.0
        sj.us.ircache.net            77.6  15.0
    63.151.139.0          17/  17    12.8   9.0 www.originlab.com
        sj.us.ircache.net            80.0  12.0
    209.68.20.0           23/  23    12.8  11.7 www6.tomshardware.com www.guestbook.nu
        rtp.us.ircache.net           34.9  15.1
        sj.us.ircache.net            73.9  14.7

Each /24 network is listed, in order of increasing round-trip time. You can see how many ICMP pings have been sent and received, the average RTT, and the estimated router hop-count. The Hostnames field shows the hostnames that resolve to addresses within the /24 network. If Squid has ICMP measurements from its neighbors for the network, those are printed as well. In this example, the local cache is closer to all the networks than its neighbors (rtp.us.ircache.net and sj.us.ircache.net).

14.2.1.48 asndb: AS Number Database

Although this page is always available, it contains interesting data only if you are using one of the Autonomous System (AS) ACLs, such as src_as or dst_as. When you use an AS-based ACL, Squid queries the Routing Arbiter database (whois.ra.net) to discover the IP networks associated with the AS number. The results of those queries are displayed on this page. The output looks like this:

    Address            AS Numbers
    128.98.0.0/16      7
    146.80.0.0/16      7
    192.5.28.0/24      7
    192.5.29.0/24      7
    192.5.30.0/24      7
    192.107.178.0/24   7
    192.135.183.0/24   5637
    194.61.177.0/24    7
    194.61.180.0/24    7
    194.61.183.0/24    7
    194.83.162.0/24    7

14.2.1.49 carp: CARP Information

This page is available only with the ./configure --enable-carp option and if you have some CARP parents configured. Squid displays a table of all CARP parents, which looks like this:

    Hostname            Hash      Multiplier  Factor    Actual
    bo1.us.ircache.net  f142425b  0.894427    0.400000  0.527950
    bo2.us.ircache.net  12180f04  1.118034    0.600000  0.472050

Hash is the neighbor's hash value from the CARP algorithm. Multiplier is another value used by the algorithm. Factor is taken from the carp-load-factor option on the cache_peer line in squid.conf. Actual is the actual distribution of requests among the CARP parents. Ideally, it should match the Factor value.

14.2.1.50 server_list: Peer Cache Statistics

This page displays various counters and statistics for your neighbor caches. For example:

    Sibling    : pa.us.ircache.net/3128/4827
    Flags      : htcp
    Address[0] : 192.6.19.203
    Status     : Up
    AVG RTT    : 14 msec
    OPEN CONNS : 19
    LAST QUERY :        4 seconds ago
    LAST REPLY :        4 seconds ago
    PINGS SENT :     9119
    PINGS ACKED:     9115 100%
    FETCHES    :      109   1%
    IGNORED    :     9114 100%
    Histogram of PINGS ACKED:
        Misses       9114 100%
        Hits            1   0%
    keep-alive ratio: 100%

Type
The first line shows the neighbor type (parent, sibling, or multicast group), followed by the hostname and port numbers. The first port number is for HTTP requests, while the second is for ICP or HTCP.

Flags
Here you'll see any of the cache_peer options that you may have specified, such as no-query, closest-only, and more. See Section 10.3.1 for the complete list.

Address[ ]
This line displays the IP address(es) associated with the hostname, one line per address. The number in brackets is the index of the address, starting from zero. Squid stores up to 10 addresses for each neighbor.

Status
The status line indicates whether Squid thinks the neighbor is Up or Down. See Section 10.3.2.

AVG RTT
This is the running average RTT for ICP/HTCP queries to the neighbor.

OPEN CONNS
This is the number of HTTP connections currently open to the neighbor.

LAST QUERY
This indicates the amount of time since Squid last sent an ICP/HTCP query to the neighbor.

LAST REPLY
This indicates the amount of time since Squid last received an ICP/HTCP reply from the neighbor.

PINGS SENT
The number of ICP/HTCP queries sent to the neighbor.

PINGS ACKED
The number of ICP/HTCP replies received back from the neighbor.

FETCHES
The number of HTTP requests sent to the neighbor. The percentage is based on the PINGS ACKED number. Unfortunately, the FETCHES number counts requests forwarded for any reason (ICP, HTCP, Cache Digests, default parent, etc.). Thus, the percentage doesn't always make sense and may be higher than 100%.

IGNORED
The number of ICP/HTCP replies ignored. The most common reason that Squid ignores an ICP/HTCP reply is that it is too late.

Histogram of PINGS ACKED
Here you'll see a breakdown of ICP/HTCP results. For ICP neighbors, Squid prints the ICP status codes (ICP_HIT, ICP_MISS, etc.). For HTCP neighbors, the only categories are Hits and Misses.

keep-alive ratio
This shows the percentage of times that Squid wanted an HTTP connection to be persistent, and the neighbor agreed. Note that this doesn't indicate anything about whether the connection was actually reused, only that both sides agreed that it could be.

14.2.1.51 non_peers: List of Unknown Sites Sending ICP messages

This page shows a list of clients that send unauthorized ICP (but not HTCP) queries. The list is the same format as the Cache Client List page.

14.2.2 Cache Manager Access Controls

The cache manager interface provides a lot of information. Much of it is sensitive and should be kept private. For example, the Cache Client List reveals the IP addresses of users, the Process Filedescriptor Allocation page shows URIs currently being requested, and the Current Squid Configuration displays the values from squid.conf, including passwords and access control rules. To keep unwanted visitors from browsing the cache manager pages, you must carefully configure access to it.

14.2.2.1 http_access

All cache manager requests use the pseudo-protocol scheme cache_object. The best way to protect the cache manager is to restrict the IP addresses allowed to make cache_object requests. The default squid.conf contains these lines:

    acl Manager proto cache_object
    acl Localhost src 127.0.0.1/255.255.255.255
    http_access allow Manager Localhost
    http_access deny Manager

Thus, cache manager requests from the local host (127.0.0.1) are allowed, but all others are denied. If you have additional trusted hosts, you may want to add them to the access rules also. Make sure these lines are at the top of your http_access rules.

14.2.2.2 cachemgr_passwd

You may also want to modify the default cachemgr_passwd settings. Some of the cache manager pages require a password, so you won't be able to view those until you add one. For example, if you want to use the Current Squid Configuration page, you must assign it a password:

    cachemgr_passwd JeckCy config

You can have a number of different passwords, but each action may have only one password. You may want to use a different password for less sensitive pages:

    cachemgr_passwd byDroth filedescriptors client_list netdb

To disable a cache manager action, use disable as the password:

    cachemgr_passwd disable netdb

To enable the sensitive actions without requiring a password, use none:

    cachemgr_passwd none offline_toggle

If you want to give the same password to all actions, use the keyword all:

    cachemgr_passwd Knoujush all

When using the command-line cache manager interface (e.g., squidclient), put an @ sign and the password after the action name. For example:

    squidclient mgr:objects@byDroth | less

Note that cache manager passwords aren't printed when you request the Current Squid Configuration page (see Section 14.2.1.7).
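
If you'd rather script cache manager requests than run squidclient, the request can be built by hand. The following is a minimal sketch under several assumptions, none of which come from the text above: that a manager request is an ordinary HTTP GET for a cache_object:// URL with the password appended after an @ sign (mirroring the squidclient syntax), that Squid listens on port 3128, and that "info" is an available action. The function names are hypothetical.

```python
import socket
from typing import Optional


def manager_request(action: str, password: Optional[str] = None,
                    host: str = "localhost") -> bytes:
    """Build the raw HTTP request for one cache manager action.

    The URL layout (cache_object://host/action@password) is an assumption
    modeled on the squidclient mgr:action@password syntax.
    """
    target = action if password is None else f"{action}@{password}"
    request = f"GET cache_object://{host}/{target} HTTP/1.0\r\n\r\n"
    return request.encode("ascii")


def fetch(action: str, password: Optional[str] = None,
          host: str = "localhost", port: int = 3128,
          timeout: float = 10.0) -> str:
    """Send the request to Squid and return the whole reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(manager_request(action, password, host))
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:  # Squid closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")


# Building a request needs no running Squid; fetch() does.
print(manager_request("objects", "byDroth"))
```

Because the password travels in the request itself, a script like this is best run only on the Squid host, in keeping with the localhost-only access rules.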

14.2.2.3 cachemgr.cgi

If you use cachemgr.cgi, the IP address of your HTTP server must be able to make cache manager requests to Squid. This opens up a back-door security hole. Anyone who can execute the CGI program on your server will be able to view the cache manager pages. The passwords described earlier can help, but you may also want to install access controls on your HTTP server so that only certain people can execute cachemgr.cgi.

The main cachemgr.cgi page has a form with Username and Password fields. The username is purely informational. If you have multiple administrators in your organization, each person can enter their own name for auditing purposes.

If you leave the password field blank, the password-protected pages are disabled. Entering a password activates links for those pages. cachemgr.cgi is stateless, so the password must be included as a URI parameter in links. Furthermore, the password encoding scheme isn't very sophisticated and is trivial to break. Because many applications (such as Squid!) log the URIs of HTTP requests, your cache manager password may be logged or even observed by an untrusted third party. If you really want to keep your cache manager passwords secret, never use them with cachemgr.cgi or from any remote system.

14.2.3 Reasons to Dislike the Cache Manager

The cache manager interface leaves much to be desired. It has a very unpolished feel. Novice administrators will probably find it difficult to use and understand.

One of the first problems you might notice is that the menu (or table of contents) is unorganized. There is no logical order or grouping. The first items in the list provide low-level information primarily meant for developers. Currently, the order is determined by the initialization sequence in the source code.

The output is often ugly. The cachemgr.cgi program renders very bland-looking HTML pages. There are no icons or graphics of any kind. Furthermore, many of the pages are simply presented as unformatted text. cachemgr.cgi doesn't do much more than format tab-delimited lines as HTML tables and put tags around some URIs. Some of the cache manager pages are structured so that the output can be easily parsed by computer programs, rather than humans.

By today's standards, the cache manager has very weak security. You are essentially forced to use address-based controls and cleartext passwords. If you allow cache manager requests only from localhost, and your system security is good, you'll be relatively safe.

14.2.4 Squid-RRD

I personally use the cache manager to populate a number of RRDTool databases (http://www.rrdtool.com/). RRDTool is a nice package for storing and displaying time-series data. It allows you to archive data at different time scales (e.g., days, weeks, months, years) in a database that doesn't grow in size over time.

I use a Perl script that runs every five minutes from cron. It issues cache manager requests for a number of pages and extracts the values that I am interested in. These values are stored in the RRD files.

RRDTool also generates nice-looking graphs, from either a CGI script or standalone program. I use the CGI program and check the graphs at least daily. See Figure 14-2 for some samples from one of my own Squid boxes.

Figure 14-2. Some sample RRD graphs from RRDTool and cache manager data

You can find my scripts and instructions for integrating the cache manager and RRDTool at http://www.squid-cache.org/~wessels/squid-rrd/.
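
The extraction step of such a collector is mostly pattern matching on cache manager output. The sketch below is written in Python rather than Perl and is not the script mentioned above: the function name is hypothetical, and it only shows how labeled values could be pulled out of a page such as storedir before being handed to whatever tool stores them.

```python
import re
from typing import Optional

STOREDIR_PAGE = """\
Store Directory Statistics:
Store Entries          : 2873564
Maximum Swap Size      : 46080000 KB
Current Store Swap Size: 41461672 KB
Current Capacity       : 90% used, 10% free
"""


def extract_value(page: str, label: str) -> Optional[int]:
    """Return the integer that follows 'label:' on its own line, if any."""
    pattern = rf"^\s*{re.escape(label)}\s*:\s*(\d+)"
    match = re.search(pattern, page, re.MULTILINE)
    return int(match.group(1)) if match else None


# Collect the values that a periodic job might record.
samples = {
    label: extract_value(STOREDIR_PAGE, label)
    for label in ("Store Entries", "Maximum Swap Size", "Current Store Swap Size")
}
print(samples)
```

Returning None for a missing label, instead of raising an exception, lets a periodic job skip a value when the page layout differs between Squid versions.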

14.3 Using SNMP

Squid has a built-in SNMP agent that you can query with various SNMP client tools. It allows you to collect a few basic statistics from Squid. Unfortunately, the Squid MIB has not evolved much since its initial implementation. Many of the parameters that you'd like to monitor aren't available through the SNMP MIB. Perhaps this will be rectified in a future version.

To enable SNMP in Squid, use the --enable-snmp option when running ./configure and recompile if necessary. Squid uses UDP port 3401 for SNMP by default. You can use a different port by setting the snmp_port directive.

Use the snmp_access access list and snmp_community ACL type to define an access policy for the SNMP agent. For example:

    acl Snmppublic snmp_community public
    acl Adminhost src 192.168.1.1
    snmp_access allow Adminhost Snmppublic

In this case, Squid accepts SNMP requests from 192.168.1.1 with the community name set to public.
14.3.1 Using snmpwalk and snmpget

The NET-SNMP package (http://net-snmp.sourceforge.net/) provides a good implementation of the snmpwalk and snmpget command-line tools for Unix. The former walks through an SNMP MIB tree, displaying every value, while the latter prints the value for a single MIB object.

After installing NET-SNMP, copy the Squid MIB file to the directory where the utilities can find it. By default, this is the /usr/local/share/snmp/mibs directory:

    # cp squid-2.5.STABLE4/src/mib.txt /usr/local/share/snmp/mibs/SQUID-MIB.txt
    # chmod 644 /usr/local/share/snmp/mibs/SQUID-MIB.txt

You should then be able to use the snmpget command. Note that Squid is an SNMPv1 agent:

    % snmpget -v 1 -c public -m SQUID-MIB localhost:3401 cacheDnsSvcTime.5
    SQUID-MIB::cacheDnsSvcTime.5 = INTEGER: 44

If you want to see the entire Squid MIB tree, use snmpwalk. The -Cc option tells snmpwalk to ignore nonincreasing OIDs:

    % snmpwalk -v 1 -c public -m SQUID-MIB -Cc localhost:3401 squid | less

If you can't get the Squid MIB installed so that snmpwalk sees it, you can use the numeric OID value instead:

    % snmpwalk -v 1 -c public -m SQUID-MIB -Cc localhost:3401 .1.3.6.1.4.1.3495.1 | less
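If you plan to feed snmpget or snmpwalk output into a script, the lines are regular enough to parse. The following is a minimal Python sketch, assuming the output looks exactly like the examples in this chapter (a SQUID-MIB:: prefix, an optional numeric instance suffix, a type label, and a value). The SnmpValue class and parse_line function are hypothetical names for illustration; they aren't part of NET-SNMP or Squid.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SnmpValue:
    """One parsed line of snmpget/snmpwalk output (hypothetical helper type)."""
    name: str        # object name, e.g. "cacheDnsSvcTime"
    instance: str    # suffix after the name, e.g. "5" or "192.203.230.19"; "" if none
    type_label: str  # type as printed by the tool, e.g. "INTEGER" or "Counter32"
    value: str       # raw value text, left unconverted


# Assumed line shape: SQUID-MIB::<name>[.<instance>] = <type>: <value>
_LINE = re.compile(
    r"^SQUID-MIB::(?P<name>[A-Za-z]\w*)"
    r"(?:\.(?P<instance>[\d.]+))?"
    r"\s*=\s*(?P<type>[\w-]+):\s*(?P<value>.*)$"
)


def parse_line(line: str) -> Optional[SnmpValue]:
    """Return the parsed fields, or None if the line doesn't match the assumed shape."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    return SnmpValue(
        name=match["name"],
        instance=match["instance"] or "",
        type_label=match["type"],
        value=match["value"].strip(),
    )
```

The value is kept as text on purpose: Timeticks and STRING values aren't plain integers, so any conversion is left to the caller.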
14.3.2 The Squid MIB

In this section, I provide a brief description for each OID in the Squid MIB, which lives in the global MIB tree under iso.org.dod.internet.private.enterprises.nlanr.squid, or .1.3.6.1.4.1.3495.1. The full MIB names, such as cachePerf.cacheProtoStats.cacheMedianSvcTable.cacheMedianSvcEntry.cacheHttpMissSvcTime.60, take up too much space on the page. Instead, I'll just use the last nonnumeric component of the OID name, which is unique.
cacheSysVMsize
    The amount of memory (in kilobytes) currently used to store in-memory objects. For example:
        SQUID-MIB::cacheSysVMsize = INTEGER: 10224
cacheSysStorage
    The amount of disk space (in kilobytes) currently used to store on-disk objects. For example:
        SQUID-MIB::cacheSysStorage = INTEGER: 19347723
cacheUptime
    The amount of time since Squid was started, reported as SNMP Timeticks (hundredths of a second). For example:
        SQUID-MIB::cacheUptime = Timeticks: (33239630) 3 days, 20:19:56.30
cacheAdmin
    The email address, or name, of the cache administrator. For example:
        SQUID-MIB::cacheAdmin = STRING: [email protected]
cacheSoftware
    The name of the application. For example:
        SQUID-MIB::cacheSoftware = STRING: squid
cacheVersionId
    The application's version identification. For example:
        SQUID-MIB::cacheVersionId = STRING: "2.5.STABLE4"
cacheLoggingFacility
    The current debugging levels, from the debug_options directive. For example:
        SQUID-MIB::cacheLoggingFacility = STRING: ALL,1
cacheMemMaxSize
    The value of the cache_mem directive, in megabytes. For example:
        SQUID-MIB::cacheMemMaxSize = INTEGER: 10
cacheSwapMaxSize
    The total amount of disk storage, in megabytes, taken from the sum of all cache_dir lines. For example:
        SQUID-MIB::cacheSwapMaxSize = INTEGER: 21000
cacheSwapHighWM
    The high watermark percentage for disk storage, taken from the cache_swap_high directive. For example:
        SQUID-MIB::cacheSwapHighWM = INTEGER: 95
cacheSwapLowWM
    The low watermark percentage for disk storage, taken from the cache_swap_low directive. For example:
        SQUID-MIB::cacheSwapLowWM = INTEGER: 90
cacheSysPageFaults
    The number of page faults for the Squid process since it was started. (See "Page faults with physical i/o" in Section 14.2.1.24.) For example:
        SQUID-MIB::cacheSysPageFaults = Counter32: 9
cacheSysNumReads
    The number of times this process called read() on HTTP sockets connected to origin servers and neighbor caches. For example:
        SQUID-MIB::cacheSysNumReads = Counter32: 15941979
cacheMemUsage
    The amount of memory allocated by the memory pooling routines. Not the same as the total memory used by Squid. (See "Total accounted" in Section 14.2.1.24.) For example:
        SQUID-MIB::cacheMemUsage = INTEGER: 143709
cacheCpuTime
    The amount of CPU time, in seconds, accumulated by the Squid process. For example:
        SQUID-MIB::cacheCpuTime = INTEGER: 79313
cacheCpuUsage
    The mean CPU utilization, as a percentage, since Squid was started. Unfortunately, since this value is an integer, any graphs that you make will be "quantized." For example:
        SQUID-MIB::cacheCpuUsage = INTEGER: 23
cacheMaxResSize
    The maximum resident set size, in kilobytes, for the Squid process. (See "Maximum Resident Size" in Section 14.2.1.24.) For example:
        SQUID-MIB::cacheMaxResSize = INTEGER: 219128
cacheNumObjCount
    The total number of objects currently in the cache. For example:
        SQUID-MIB::cacheNumObjCount = Counter32: 1717181
cacheCurrentLRUExpiration
    Current versions of Squid don't have a global LRU expiration age value, so this is always reported as zero. For example:
        SQUID-MIB::cacheCurrentLRUExpiration = Timeticks: (0) 0:00:00.00
cacheCurrentUnlinkRequests
    The number of files given to the external unlinkd process for removal. Note that Squid doesn't use unlinkd with the diskd and aufs storage schemes. For example:
        SQUID-MIB::cacheCurrentUnlinkRequests = Counter32: 0
cacheCurrentUnusedFDescrCnt
    The current number of available (unused) file descriptors. For example:
        SQUID-MIB::cacheCurrentUnusedFDescrCnt = Gauge32: 7253
cacheCurrentResFileDescrCnt
    The number of reserved file descriptors. (See "Reserved number of file descriptors" in Section 14.2.1.24.) For example:
        SQUID-MIB::cacheCurrentResFileDescrCnt = Gauge32: 100
cacheProtoClientHttpRequests
    The total number of HTTP requests received from cache clients. For example:
        SQUID-MIB::cacheProtoClientHttpRequests = Counter32: 7277019
cacheHttpHits
    The number of client requests that were cache hits. For example:
        SQUID-MIB::cacheHttpHits = Counter32: 2526484
cacheHttpErrors
    The number of client requests that resulted in an error. For example:
        SQUID-MIB::cacheHttpErrors = Counter32: 0
cacheHttpInKb
    The amount of network traffic, in kilobytes, read from cache clients. For example:
        SQUID-MIB::cacheHttpInKb = Counter32: 4231883
cacheHttpOutKb
    The amount of network traffic, in kilobytes, written to cache clients. For example:
        SQUID-MIB::cacheHttpOutKb = Counter32: 56894945
cacheIcpPktsSent
    The number of ICP messages (both queries and replies) sent to neighbors. For example:
        SQUID-MIB::cacheIcpPktsSent = Counter32: 5296120
cacheIcpPktsRecv
    The number of ICP messages (both queries and replies) received from neighbors. For example:
        SQUID-MIB::cacheIcpPktsRecv = Counter32: 5271238
cacheIcpKbSent
    The amount of network traffic, in kilobytes, used for ICP messages sent to neighbors, not including UDP and IP headers. For example:
        SQUID-MIB::cacheIcpKbSent = Counter32: 428112
cacheIcpKbRecv
    The amount of network traffic, in kilobytes, used for ICP messages received from neighbors, not including UDP and IP headers. For example:
        SQUID-MIB::cacheIcpKbRecv = Counter32: 447762
cacheServerRequests
    The number of requests forwarded to origin servers and neighbor caches. For example:
        SQUID-MIB::cacheServerRequests = INTEGER: 5338305
cacheServerErrors
    The number of errors received from origin servers and neighbor caches. Currently unimplemented and always reported as zero. For example:
        SQUID-MIB::cacheServerErrors = INTEGER: 0
cacheServerInKb
    The amount of network traffic, in kilobytes, read from origin servers and neighbor caches. For example:
        SQUID-MIB::cacheServerInKb = Counter32: 49196559
cacheServerOutKb
    The amount of network traffic, in kilobytes, written to origin servers and neighbor caches. For example:
        SQUID-MIB::cacheServerOutKb = Counter32: 3404717
cacheCurrentSwapSize
    The amount of disk space, in kilobytes, currently in use by Squid. Compare to cacheSysStorage. For example:
        SQUID-MIB::cacheCurrentSwapSize = Counter32: 19347723
cacheClients
    The number of clients that sent HTTP requests to Squid since it was started. For example:
        SQUID-MIB::cacheClients = Counter32: 498
cacheMedianTime.X
    These OIDs report the time intervals, in minutes, over which median values are computed for subsequent OIDs. The value is the same as the last number of the OID. For example:
        SQUID-MIB::cacheMedianTime.1 = INTEGER: 1
cacheHttpAllSvcTime.X
    The 1-, 5-, and 60-minute median service time values, in milliseconds, for all client HTTP requests. For example:
        SQUID-MIB::cacheHttpAllSvcTime.1 = INTEGER: 78
cacheHttpMissSvcTime.X
    The 1-, 5-, and 60-minute median service time values for cache misses. For example:
        SQUID-MIB::cacheHttpMissSvcTime.1 = INTEGER: 114
        SQUID-MIB::cacheHttpMissSvcTime.5 = INTEGER: 87
        SQUID-MIB::cacheHttpMissSvcTime.60 = INTEGER: 74
cacheHttpNmSvcTime.X
    The 1-, 5-, and 60-minute median service time values for requests logged as TCP_IMS_HIT. (See "Not-Modified Replies" in Section 14.2.1.24.) For example:
        SQUID-MIB::cacheHttpNmSvcTime.1 = INTEGER: 12
        SQUID-MIB::cacheHttpNmSvcTime.5 = INTEGER: 34
        SQUID-MIB::cacheHttpNmSvcTime.60 = INTEGER: 32
cacheHttpHitSvcTime.X
    The 1-, 5-, and 60-minute median service time values for cache hits, logged as TCP_HIT. For example:
        SQUID-MIB::cacheHttpHitSvcTime.1 = INTEGER: 45
        SQUID-MIB::cacheHttpHitSvcTime.5 = INTEGER: 45
        SQUID-MIB::cacheHttpHitSvcTime.60 = INTEGER: 40
cacheIcpQuerySvcTime.X
    The 1-, 5-, and 60-minute median service time values for ICP queries sent by Squid (the time elapsed between sending your query and receiving a neighbor's reply). For example:
        SQUID-MIB::cacheIcpQuerySvcTime.1 = INTEGER: 0
        SQUID-MIB::cacheIcpQuerySvcTime.5 = INTEGER: 0
        SQUID-MIB::cacheIcpQuerySvcTime.60 = INTEGER: 3563
cacheIcpReplySvcTime.X
    The 1-, 5-, and 60-minute median service time values for ICP queries received by Squid. In current implementations, these are always zero because processing occurs faster than the process clock is updated. For example:
        SQUID-MIB::cacheIcpReplySvcTime.1 = INTEGER: 0
        SQUID-MIB::cacheIcpReplySvcTime.5 = INTEGER: 0
        SQUID-MIB::cacheIcpReplySvcTime.60 = INTEGER: 0
cacheDnsSvcTime.X
    The 1-, 5-, and 60-minute median service time values for Squid's DNS queries. For example:
        SQUID-MIB::cacheDnsSvcTime.1 = INTEGER: 40
        SQUID-MIB::cacheDnsSvcTime.5 = INTEGER: 42
        SQUID-MIB::cacheDnsSvcTime.60 = INTEGER: 42
cacheRequestHitRatio.X
    Squid's cache hit ratio (percentage) over the last 1, 5, and 60 minutes. For example:
        SQUID-MIB::cacheRequestHitRatio.1 = INTEGER: 16
        SQUID-MIB::cacheRequestHitRatio.5 = INTEGER: 18
        SQUID-MIB::cacheRequestHitRatio.60 = INTEGER: 22
cacheRequestByteRatio.X
    Squid's byte hit ratio (percentage) over the last 1, 5, and 60 minutes. For example:
        SQUID-MIB::cacheRequestByteRatio.1 = INTEGER: 73
        SQUID-MIB::cacheRequestByteRatio.5 = INTEGER: 43
        SQUID-MIB::cacheRequestByteRatio.60 = INTEGER: 34
cacheIpEntries
    The number of entries in Squid's IP (name-to-address) cache. For example:
        SQUID-MIB::cacheIpEntries = Gauge32: 10033
cacheIpRequests
    The number of requests received by Squid's IP cache. For example:
        SQUID-MIB::cacheIpRequests = Counter32: 8195627
cacheIpHits
    The number of lookups that were hits in the IP cache. For example:
        SQUID-MIB::cacheIpHits = Counter32: 6040658
    If the ratio of hits to requests is less than 60-75%, you may want to increase the size of your IP cache.
cacheIpPendingHits
    Always zero in the current implementation. For example:
        SQUID-MIB::cacheIpPendingHits = Gauge32: 0
    Older versions of Squid had the notion of IP cache hits for outstanding queries.
cacheIpNegativeHits
    The number of lookups that were negative hits in the IP cache. Certain failed queries may be negatively cached for an amount of time determined by the negative_dns_ttl directive. For example:
        SQUID-MIB::cacheIpNegativeHits = Counter32: 49433
cacheIpMisses
    The number of IP cache misses. For example:
        SQUID-MIB::cacheIpMisses = Counter32: 1807438
cacheBlockingGetHostByName
    Always zero in the current implementation. For example:
        SQUID-MIB::cacheBlockingGetHostByName = Counter32: 0
    Older versions occasionally called the gethostbyname() function if the IP cache couldn't provide an answer.
cacheAttemptReleaseLckEntries
    Always zero in the current implementation. Older versions would, in some cases, want to release locked IP cache entries. For example:
        SQUID-MIB::cacheAttemptReleaseLckEntries = Counter32: 0
cacheFqdnEntries
    The number of entries in the FQDN (address-to-name) cache. For example:
        SQUID-MIB::cacheFqdnEntries = Gauge32: 1
cacheFqdnRequests
    The number of requests to the FQDN cache. For example:
        SQUID-MIB::cacheFqdnRequests = Counter32: 0
cacheFqdnHits
    The number of FQDN cache requests satisfied as hits. For example:
        SQUID-MIB::cacheFqdnHits = Counter32: 0
cacheFqdnPendingHits
    Always zero in the current implementation. For example:
        SQUID-MIB::cacheFqdnPendingHits = Gauge32: 0
cacheFqdnNegativeHits
    The number of FQDN requests satisfied as negative cache hits. For example:
        SQUID-MIB::cacheFqdnNegativeHits = Counter32: 0
cacheFqdnMisses
    The number of FQDN cache misses. For example:
        SQUID-MIB::cacheFqdnMisses = Counter32: 0
cacheBlockingGetHostByAddr
    Always zero in the current implementation. For example:
        SQUID-MIB::cacheBlockingGetHostByAddr = Counter32: 0
cacheDnsRequests
    The number of DNS queries made by Squid. This counter is reset each time you reconfigure the running Squid process. For example:
        SQUID-MIB::cacheDnsRequests = Counter32: 3262
cacheDnsReplies
    The number of DNS replies received by Squid. This counter is reset each time you reconfigure the running Squid process. For example:
        SQUID-MIB::cacheDnsReplies = Counter32: 2440
cacheDnsNumberServers
    When using internal DNS (the default), this OID reports the number of nameservers that Squid knows about. For external DNS, it reports the number of (running) dnsserver helper processes. For example:
        SQUID-MIB::cacheDnsNumberServers = Counter32: 2
cachePeerName.A.B.C.D
    This, and the next group of OIDs, come from the list of neighbor caches. (See Section 14.2.1.50.) These OIDs are indexed by the IPv4 address of the peer. This particular OID returns the neighbor cache's hostname. For example:
        SQUID-MIB::cachePeerName.192.203.230.19 = STRING: sv.us.ircache.net
cachePeerAddr.A.B.C.D
    This is the IP address of the peer, which, of course, you already know from the OID itself. For example:
        SQUID-MIB::cachePeerAddr.192.203.230.19 = IpAddress: 192.203.230.19
cachePeerPortHttp.A.B.C.D
    This is the neighbor cache's HTTP port number. For example:
        SQUID-MIB::cachePeerPortHttp.192.203.230.19 = INTEGER: 3128
cachePeerPortIcp.A.B.C.D
    This is the neighbor cache's ICP or HTCP port number. For example:
        SQUID-MIB::cachePeerPortIcp.192.203.230.19 = INTEGER: 3130
cachePeerType.A.B.C.D
    The type of the neighbor: 1 for sibling, 2 for parent, and 3 for multicast. For example:
        SQUID-MIB::cachePeerType.192.203.230.19 = INTEGER: 1
cachePeerState.A.B.C.D
    The state of the peer: 1 for up, 0 for down. (See Section 10.3.2.) For example:
        SQUID-MIB::cachePeerState.192.203.230.19 = INTEGER: 1
cachePeerPingsSent.A.B.C.D
    The number of ICP/HTCP queries sent to the neighbor. For example:
        SQUID-MIB::cachePeerPingsSent.192.203.230.19 = Counter32: 924
cachePeerPingsAcked.A.B.C.D
    The number of ICP/HTCP replies received from the neighbor. For example:
        SQUID-MIB::cachePeerPingsAcked.192.203.230.19 = Counter32: 901
cachePeerFetches.A.B.C.D
    The number of HTTP requests sent to the neighbor. (See the discussion about FETCHES in Section 14.2.1.50.) For example:
        SQUID-MIB::cachePeerFetches.192.203.230.19 = Counter32: 34
cachePeerRtt.A.B.C.D
    The average round-trip time for ICP/HTCP queries to this peer. For example:
        SQUID-MIB::cachePeerRtt.192.203.230.19 = INTEGER: 26
cachePeerIgnored.A.B.C.D
    The number of ICP/HTCP replies that Squid ignored. (See the discussion about IGNORED in Section 14.2.1.50.) For example:
        SQUID-MIB::cachePeerIgnored.192.203.230.19 = Counter32: 201
cachePeerKeepAlSent.A.B.C.D
    The number of HTTP requests sent to the neighbor with a request to keep the connection open. For example:
        SQUID-MIB::cachePeerKeepAlSent.192.203.230.19 = Counter32: 34
cachePeerKeepAlRecv.A.B.C.D
    The number of HTTP replies received from the neighbor with a request to keep the connection open. For example:
        SQUID-MIB::cachePeerKeepAlRecv.192.203.230.19 = Counter32: 34
cacheClientAddr.A.B.C.D
    The cacheClientAddr OIDs come from the same database as the Cache Client List (see Section 14.2.1.46). This particular OID's value is the IPv4 address, just like the last four octets of the OID itself. For example:
        SQUID-MIB::cacheClientAddr.206.168.0.9 = IpAddress: 206.168.0.9
cacheClientHttpRequests.A.B.C.D
    The number of HTTP requests received from this client. For example:
        SQUID-MIB::cacheClientHttpRequests.206.168.0.9 = Counter32: 108281
cacheClientHttpKb.A.B.C.D
    The amount of traffic, in kilobytes, sent to this client. For example:
        SQUID-MIB::cacheClientHttpKb.206.168.0.9 = Counter32: 921447
cacheClientHttpHits.A.B.C.D
    The number of cache hits sent to this client. For example:
        SQUID-MIB::cacheClientHttpHits.206.168.0.9 = Counter32: 32365
cacheClientHTTPHitKb.A.B.C.D
    The amount of traffic, in kilobytes, sent to this client for cache hits. For example:
        SQUID-MIB::cacheClientHTTPHitKb.206.168.0.9 = Counter32: 141638
cacheClientIcpRequests.A.B.C.D
    The number of ICP (but not HTCP) queries received from this client. For example:
        SQUID-MIB::cacheClientIcpRequests.206.168.0.9 = Counter32: 79120
cacheClientIcpKb.A.B.C.D
    The amount of traffic, in kilobytes, received from this client in ICP queries. For example:
        SQUID-MIB::cacheClientIcpKb.206.168.0.9 = Counter32: 5986
cacheClientIcpHits.A.B.C.D
    The number of ICP_HIT replies sent to this client. For example:
        SQUID-MIB::cacheClientIcpHits.206.168.0.9 = Counter32: 21897
cacheClientIcpHitKb.A.B.C.D
    The amount of traffic, in kilobytes, sent to this client for ICP_HIT messages. A somewhat silly measurement because ICP_HIT and ICP_MISS messages have the same size. However, old versions of Squid used the now-obsolete ICP_HIT_OBJ opcode, which included the object content. For example:
        SQUID-MIB::cacheClientIcpHitKb.206.168.0.9 = Counter32: 1679
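Several of the values listed in this section are more useful when combined or sampled over time. The following Python sketch shows three such calculations, under stated assumptions: Counter32 values wrap at 2^32 (standard SNMP behavior), cacheUptime is read as Timeticks in hundredths of a second, and cacheCpuTime is in whole seconds. The function names are hypothetical, and the code doesn't talk to Squid; you supply the sampled numbers yourself.

```python
COUNTER32_MODULUS = 2 ** 32  # SNMP Counter32 values wrap back to zero at 2^32


def counter32_delta(previous: int, current: int) -> int:
    """Increase between two Counter32 samples, allowing for at most one wraparound.

    Assumption: Squid was not restarted between samples. A restart resets the
    counters and would show up here as a large bogus increase.
    """
    return (current - previous) % COUNTER32_MODULUS


def ip_cache_hit_ratio(hits: int, requests: int) -> float:
    """Percentage of IP cache requests that were hits (cacheIpHits / cacheIpRequests)."""
    if requests == 0:
        return 0.0
    return 100.0 * hits / requests


def cpu_utilization(cpu_prev: int, cpu_curr: int,
                    ticks_prev: int, ticks_curr: int) -> float:
    """CPU utilization (percent) between two samples.

    cpu_* are cacheCpuTime readings in seconds; ticks_* are cacheUptime readings
    in Timeticks (hundredths of a second). This gives a finer-grained figure than
    the integer, since-startup cacheCpuUsage value.
    """
    elapsed_seconds = (ticks_curr - ticks_prev) / 100.0
    if elapsed_seconds <= 0:
        return 0.0
    return 100.0 * (cpu_curr - cpu_prev) / elapsed_seconds
```

With the sample values shown for cacheIpHits and cacheIpRequests, ip_cache_hit_ratio(6040658, 8195627) works out to roughly 73.7%, which falls inside the 60-75% range mentioned under cacheIpHits.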
14.4 Exercises

● Write a shell script that uses squidclient to collect and save the total number of HTTP requests and the five-minute median overall response time.
● Write a shell script to periodically retrieve and archive the running configuration. It should also compare the current and most recent configurations and email you the changes, if any.
● Download, compile, and install the NET-SNMP package. Use snmpwalk to view Squid's entire MIB tree.
● Create and deploy a simple redirector (Chapter 11) that sleeps for 250 milliseconds on each request. Watch the cache manager's redirector page as Squid runs.
Chapter 15. Server Accelerator Mode

Throughout most of this book, I've been talking about Squid as a client-side caching proxy. However, with just a few special squid.conf settings, Squid is able to function as an origin server accelerator as well. In this mode, it accepts normal HTTP requests and forwards cache misses to the real origin server (or backend server). In the parlance of RFC 3040, Squid is operating as a surrogate.

This configuration is similar to what I talked about in Chapter 9. The primary difference is that, as a surrogate, Squid accepts requests for one, or maybe a few, origin server(s), rather than any and all origins. HTTP interception isn't required for server acceleration.

As the name implies, server acceleration is generally used as a technique to improve the performance of slow, or heavily loaded, backend servers. It works well because origin servers tend to have a relatively small hot set. Most likely, the objects responsible for 90% of origin server traffic can fit entirely in memory. Depending on your particular backend server software and configuration, Squid may be able to serve requests much faster.

Security is another good reason to consider Squid as a surrogate. Think of Squid as a dedicated firewall in front of your origin server. The Squid source code is too large to be trusted as completely secure. However, you may sleep better with Squid protecting your backend server. It is simply a cache, so it doesn't permanently store the source of your data. If the Squid box is attacked or compromised, you won't lose any data. You may find it easier to secure a system running Squid than the system running your backend server application(s).

You might also be interested in server acceleration to implement load balancing. If your origin server runs on expensive boxes, you can save money by deploying Squid on a number of cheaper boxes. By placing Squid at a number of different locations, you can even build your own content delivery network (CDN).
15.1 Overview

Assuming that you already have an origin server in place, you need to move it to a different IP address or TCP port. For example, you can (1) install Squid on a separate machine, (2) give the origin server a new IP address, and (3) give Squid the origin server's old IP address. In the interest of security, you can use non-globally routable addresses (i.e., from RFC 1918) on the link between Squid and the backend server. See Figure 15-1.

Figure 15-1. How to replace your origin server with Squid
Another option is to configure Squid for HTTP interception, as described in Chapter 9. For example, you can configure the origin server's nearest router or switch to intercept HTTP requests and divert them to Squid.

If you don't have the resources to put Squid on a dedicated system, you can run it alongside the HTTP server. However, both applications can't share the same IP address and port number. You need to make the backend server bind to a different address (e.g., 127.0.0.1) or move it to another port number. It might seem easiest to change the port number, but I recommend changing the IP address instead.

Changing the port number can be problematic. For example, when the backend server generates an error message, it may expose the "wrong" port. Even worse, if the server generates an HTTP redirect, it typically appends the nonstandard port number to the Location URI:

    HTTP/1.1 301 Moved Permanently
    Date: Mon, 29 Sep 2003 03:36:13 GMT
    Server: Apache/1.3.26 (Unix)
    Location: http://www.squid-cache.org:81/Doc/

If a client receives this response, it makes a connection to the nonstandard port (81), thus bypassing the server accelerator. If you must run Squid on the same host as your backend server, it is better to tell the backend server to listen on the loopback address (127.0.0.1). With Apache, you'd do it like this:

    BindAddress 127.0.0.1
    ServerName www.squid-cache.org

Once you've decided how to relocate your origin server, the next step is to configure Squid.
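If you do end up moving the backend server to another port, it is worth checking its redirects for the leak described in this section. Here is a small Python sketch of such a check, assuming clients reach the accelerator on port 80. The leaks_backend_port function is a hypothetical helper for testing, not something Squid or Apache provides.

```python
from urllib.parse import urlsplit


def leaks_backend_port(location: str, public_port: int = 80) -> bool:
    """Return True if a Location header value names a port other than the public one.

    A Location without an explicit port (or a relative one) is treated as safe,
    because the client then keeps using the scheme's default port.
    """
    port = urlsplit(location).port  # None when no port is given
    if port is None:
        return False
    return port != public_port
```

For the redirect shown above, leaks_backend_port("http://www.squid-cache.org:81/Doc/") returns True, which is the situation that lets clients bypass the accelerator.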
15.2 Configuring Squid

Technically, a single configuration file directive is all it takes to change Squid from a caching proxy into a surrogate. Unfortunately, life is never quite that simple. Due to the myriad of ways that different organizations design their web services, Squid has a number of directives to worry about.

15.2.1 http_port

Most likely, Squid is acting as a surrogate for your HTTP server on port 80. Use the http_port directive to make Squid listen on that port:

    http_port 80

If you want Squid to act as surrogate and a caching proxy at the same time, list both port numbers:

    http_port 80
    http_port 3128

You can configure your clients to send their proxy requests to port 80 as well, but I strongly discourage that. By using separate ports, you'll find it easier to migrate the two services to separate boxes later if it becomes necessary.
15.2.2 https_port

You can configure Squid to terminate encrypted HTTP (SSL and TLS) connections. This feature requires the --enable-ssl option when running ./configure. In this mode, Squid decrypts SSL/TLS connections from clients and forwards unencrypted requests to your backend server. The https_port directive has the following format:

    https_port [host:]port cert=certificate.pem [key=key.pem] [version=1-4] [cipher=list] [options=list]

The cert and key arguments are pathnames to OpenSSL-compatible certificate and private key files. If you omit the key argument, the OpenSSL library looks for the private key in the certificate file.

The (optional) version argument specifies your requirements for various SSL and TLS protocols to support: 1=automatic, 2=SSLv2 only, 3=SSLv3 only, 4=TLSv1 only.

The (optional) cipher argument is a colon-separated list of ciphers. Squid simply passes it to the SSL_CTX_set_cipher_list() function. For more information, read the ciphers(1) manpage on your system or try running: openssl ciphers.

The (optional) options argument is a colon-separated list of OpenSSL options. Squid simply passes these to the SSL_CTX_set_options() function. For more information, read the SSL_CTX_set_options(3) manpage on your system.

Here are a few example https_port lines:

    https_port 443 cert=/usr/local/etc/certs/squid.cert
    https_port 443 cert=/usr/local/etc/certs/squid.cert version=2
    https_port 443 cert=/usr/local/etc/certs/squid.cert cipher=SHA1
    https_port 443 cert=/usr/local/etc/certs/squid.cert options=MICROSOFT_SESS_ID_BUG
15.2.3 httpd_accel_host

This is where you tell Squid the IP address, or hostname, of the backend server. If you use the loopback trick described previously, you write:

    httpd_accel_host 127.0.0.1

Squid then prepends this value to partial URIs that get accelerated. It also changes the value of the Host header.[1] For example, if the client makes this request:

    GET /index.html HTTP/1.1
    Host: squidbook.org

Squid turns it into this request:

    GET http://127.0.0.1/index.html HTTP/1.1
    Host: 127.0.0.1

[1] Technically, the Host header is changed only in requests Squid forwards to the backend server (cache misses).

As you can see, the request no longer contains any information that indicates the request is for squidbook.org. This shouldn't be a problem as long as the backend server isn't configured for virtual hosting of multiple domains.

If you want Squid to use the origin server's hostname, you can put it in the httpd_accel_host directive:

    httpd_accel_host squidbook.org

Then the request is as follows:

    GET http://squidbook.org/index.html HTTP/1.1
    Host: squidbook.org

Another option is to enable the httpd_accel_uses_host_header directive. Squid then inserts the Host header value into the URI for most requests, and the httpd_accel_host value is used only for requests that lack a Host header.

When you use a hostname, Squid goes through the normal steps to look up its IP address. Because you want the hostname to resolve to two different addresses (one for clients connecting to Squid and another for Squid connecting to the backend server), you should also add a static DNS entry to your system's /etc/hosts file. For example:

    127.0.0.1       squidbook.org

You might want to use a redirector instead. For example, you can write a simple Perl program that changes http://squidbook.org/... to http://127.0.0.1/.... See Chapter 11 for the nuts and bolts of redirecting client requests.

The httpd_accel_host directive has a special value. If you set it to virtual, Squid inserts the origin server's IP address into the URI when the Host header is missing. This feature is useful only when using HTTP interception, however.
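To make the redirector alternative mentioned in this section more concrete, here is a minimal sketch. The text suggests Perl; this version uses Python purely for illustration. It assumes the redirector interface described in Chapter 11, where Squid writes one request per line with the URL as the first whitespace-separated field and reads back either a replacement URL or a blank line meaning "no change." The prefixes come from the example in the text; rewrite and main are hypothetical names.

```python
import sys

PUBLIC_PREFIX = "http://squidbook.org/"   # what clients ask for (from the text's example)
BACKEND_PREFIX = "http://127.0.0.1/"      # where the backend server really listens


def rewrite(request_line: str) -> str:
    """Return the rewritten URL for one redirector input line, or "" for no change."""
    fields = request_line.split()
    if not fields:
        return ""
    url = fields[0]  # assumed: the URL is the first field on the line
    if url.startswith(PUBLIC_PREFIX):
        return BACKEND_PREFIX + url[len(PUBLIC_PREFIX):]
    return ""


def main() -> None:
    """Read requests from stdin and answer each one on stdout, unbuffered."""
    for line in sys.stdin:
        sys.stdout.write(rewrite(line) + "\n")
        sys.stdout.flush()  # Squid waits for each answer, so don't buffer replies
```

To run it as a helper, you would call main() at the bottom of the script. It is left uncalled here so that rewrite can be tried on its own.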
15.2.4 httpd_accel_port

This directive tells Squid the port number of the backend server. It is 80 by default. You won't need to change this unless the backend server is running on a different port. Here is an example:

    httpd_accel_port 8080

If you are accelerating origin servers on multiple ports, you can use the value 0. In this case, Squid takes the port number from the Host header.
15.2.5 httpd_accel_uses_host_header

This directive controls how Squid determines the hostname it inserts into accelerated URIs. If enabled, the request's Host header value takes precedence over httpd_accel_host.

The httpd_accel_uses_host_header directive goes hand in hand with virtual domain hosting on the backend server. You can leave it disabled if the backend server is handling only one domain. If, on the other hand, you are accelerating multiple origin server names, turn it on:

    httpd_accel_uses_host_header on

If you enable httpd_accel_uses_host_header, be sure to install some access controls as described later in this chapter. To understand why, consider this configuration:

    httpd_accel_host does.not.exist
    httpd_accel_uses_host_header on

Because most requests have a Host header, Squid ignores the httpd_accel_host setting and rarely inserts the bogus does.not.exist name into URIs. This essentially turns your surrogate into a caching proxy for anyone smart enough to fake an HTTP request. If I know that you are using Squid as a surrogate without proper access controls, I can send a request like this:

    GET /index.html HTTP/1.1
    Host: www.mrcranky.com

If you've enabled httpd_accel_uses_host_header and don't have any destination-based access controls, Squid should forward my request to www.mrcranky.com. Read Section 15.4 and install access controls to ensure that Squid doesn't talk to foreign origin servers.
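The way these three directives combine can be hard to keep straight. The following Python sketch is a simplified model of the behavior as described in Sections 15.2.3 through 15.2.5; it is not Squid's actual code. The build_accel_uri function and its parameters are hypothetical, named after the directives they stand for, and IPv6 literals in the Host header are ignored.

```python
from typing import Optional


def build_accel_uri(path: str,
                    host_header: Optional[str],
                    accel_host: str,
                    accel_port: int = 80,
                    uses_host_header: bool = False) -> str:
    """Model how a partial request (path only) becomes a full URI in surrogate mode."""
    # Split "name[:port]" from the Host header, if the client sent one.
    header_name, _, header_port = (host_header or "").partition(":")

    # Hostname: the Host header wins only when httpd_accel_uses_host_header is on
    # and the header is present; otherwise httpd_accel_host is used.
    if uses_host_header and header_name:
        host = header_name
    else:
        host = accel_host

    # Port: httpd_accel_port, unless it is 0, in which case the port comes from
    # the Host header (assumed here to default to 80 when the header has none).
    if accel_port == 0:
        port = int(header_port) if header_port.isdigit() else 80
    else:
        port = accel_port

    authority = host if port == 80 else f"{host}:{port}"
    return f"http://{authority}{path}"
```

With the does.not.exist configuration above, build_accel_uri("/index.html", "www.mrcranky.com", "does.not.exist", uses_host_header=True) yields a URI for www.mrcranky.com, which is why destination-based access controls matter.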
15.2.6 httpd_accel_single_host

Whereas the httpd_accel_uses_host_header directive determines the hostname Squid puts into a URI, this one determines where Squid forwards its cache misses. By default (i.e., with httpd_accel_single_host disabled), Squid forwards surrogate cache misses to the host in the URI. If the URI contains a hostname, Squid performs a DNS lookup to get the backend server's IP address.

When you enable httpd_accel_single_host, Squid always forwards surrogate cache misses to the host defined by httpd_accel_host. In other words, the contents of the URI and the Host header don't affect the forwarding decision. Perhaps the best reason to enable this directive is to avoid DNS lookups. Simply set httpd_accel_host to the backend server's IP address. Another reason to enable it is if you have another device (load balancer, virus scanner, etc.) between Squid and the backend server. You can make Squid forward the request to this other device without changing any aspect of the HTTP request.

Note that enabling both httpd_accel_single_host and httpd_accel_uses_host_header is a dangerous combination that might allow an attacker to poison your cache. Consider this configuration:

    httpd_accel_single_host on
    httpd_accel_host 172.16.1.1
    httpd_accel_uses_host_header on

and this HTTP request:

    GET /index.html HTTP/1.0
    Host: www.othersite.com

Squid forwards the request to your backend server at 172.16.1.1 but stores the response under the URI http://www.othersite.com/index.html. Since 172.16.1.1 isn't actually www.othersite.com, Squid now contains a bogus response for that URI. If you enable httpd_accel_with_proxy (next section) or your cache participates in a hierarchy or mesh, it may give out the bad response to unsuspecting users. To prevent such abuse, be sure to read Section 15.4.

Server-side persistent connections may not work if you use the httpd_accel_single_host directive. This is because Squid saves idle connections under the origin server hostname, but the connection-establishment code looks for an idle connection named by the httpd_accel_host value. If the two values are different, Squid fails to locate an appropriate idle connection. The idle connections are closed after the timeout, without being reused. You can avoid this little problem by disabling server-side persistent connections with the server_persistent_connections directive (see Appendix A).
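The cache-poisoning scenario in this section comes down to two separate decisions: which URI the response is stored under, and which host the miss is sent to. Here is a short Python sketch that models both, as a simplification of the behavior described in the text and not Squid's implementation. MissPlan, plan_miss, and key_differs_from_target are hypothetical names; the sketch assumes the default port 80 and omits port handling.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class MissPlan(NamedTuple):
    cache_key: str    # URI under which the response would be stored
    forward_to: str   # host that actually receives the cache miss


def plan_miss(path: str,
              host_header: Optional[str],
              accel_host: str,
              single_host: bool,
              uses_host_header: bool) -> MissPlan:
    """Model the storage key and the forwarding target for a surrogate cache miss."""
    # httpd_accel_uses_host_header decides the hostname placed in the URI.
    uri_host = host_header if (uses_host_header and host_header) else accel_host
    # httpd_accel_single_host decides where the miss goes, regardless of the URI.
    forward_to = accel_host if single_host else uri_host
    return MissPlan(f"http://{uri_host}{path}", forward_to)


def key_differs_from_target(plan: MissPlan) -> bool:
    """True when the response is stored under one hostname but fetched from another."""
    return urlsplit(plan.cache_key).hostname != plan.forward_to
```

For the configuration and request above, plan_miss("/index.html", "www.othersite.com", "172.16.1.1", True, True) stores the response under http://www.othersite.com/index.html while fetching it from 172.16.1.1. A mismatch isn't wrong in itself, since virtual hosting on a single backend produces the same pattern for legitimate names; it simply marks the cases where access controls have to decide which hostnames are acceptable.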
15.2.7 httpd_accel_with_proxy

By default, whenever you enable the httpd_accel_host directive, Squid goes into strict surrogate mode. That is, it refuses proxy HTTP requests and accepts only surrogate requests, as though it were truly an origin server. Squid also disables the ICP port (although not HTCP, if you have it enabled). If you want Squid to accept both surrogate and proxy requests, enable this directive:

    httpd_accel_with_proxy on
15.3 Gee, That Was Confusing!

Yeah, it was for me too. Let's look at it another way. The settings that you need to use depend on how many backend boxes you have and how many origin server names you are accelerating. Let's consider the four separate cases in the following sections.

15.3.1 One Box, One Server Name

This is the simplest sort of configuration. Because you have only one box and one hostname, the Host header values don't matter much. You should probably use:

    httpd_accel_host www.example.com
    httpd_accel_single_host on
    httpd_accel_uses_host_header off

If you like, you can use an IP address for httpd_accel_host, although it will appear in URIs in your access.log.

15.3.2 One Box, Many Server Names

Because you have many origin server names being virtually hosted on a single box, the Host header becomes important. We want Squid to insert it into the URIs it generates from partial requests. Your configuration should be:

    httpd_accel_host www.example.com
    httpd_accel_single_host on
    httpd_accel_uses_host_header on

In this case, Squid generates the URI based on the Host header. If the header is absent, Squid inserts www.example.com. You can disable httpd_accel_single_host if you prefer. As before, you can use an IP address in httpd_accel_host to avoid DNS lookups.
15.3.3 Many Boxes, One Server Name

This sounds like a load-balancing configuration. One way to accomplish it is to create a DNS name for the backend servers with multiple IP addresses. Squid iterates between all addresses (a.k.a. round-robin) for each cache miss. In this situation, the configuration is the same as for the one box/one name case:

    httpd_accel_host roundrobin.example.com
    httpd_accel_single_host on
    httpd_accel_uses_host_header off

The only difference is that the httpd_accel_host name resolves to multiple addresses. It might look like this in a Berkeley Internet Name Daemon (BIND) zone file:

    $ORIGIN example.com.
    roundrobin      IN      A       192.168.1.2
                    IN      A       192.168.1.3
                    IN      A       192.168.1.4

With this DNS configuration, Squid uses the next address in the list each time it opens a new connection to roundrobin.example.com. When it gets to the end of the list, it starts over at the top. Note that Squid caches these DNS answers internally according to their TTLs. You aren't relying on the name server to return the address list in a different order for each DNS query.

Another option is to use a redirector (see Chapter 11) to select the backend server. You can write a simple script to replace the URI hostname (e.g., roundrobin.example.com) with a different hostname or an IP address. You might even make the redirector smart enough to make its selection based on the current state of the backend servers. Use the following configuration with this approach:

    httpd_accel_host roundrobin.example.com
    httpd_accel_single_host off
    httpd_accel_uses_host_header off
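A redirector of the kind just described might look like the following Python sketch. It assumes the redirector interface covered in Chapter 11, in which Squid writes one request per line with the URI as the first whitespace-separated field and reads back either a replacement URI or a blank line meaning "no change". The backend addresses, the VIRTUAL_HOST constant, and the function names are hypothetical, and the selection is plain rotation with no health checking.

```python
import itertools
import sys
from urllib.parse import urlsplit, urlunsplit

# Hypothetical backend pool; replace with your own addresses.
BACKENDS = ["192.168.1.2", "192.168.1.3", "192.168.1.4"]
VIRTUAL_HOST = "roundrobin.example.com"
_next_backend = itertools.cycle(BACKENDS)


def rewrite(line):
    """Rewrite one input line; return "" to leave the URI unchanged."""
    fields = line.split()
    if not fields:
        return ""
    parts = urlsplit(fields[0])
    if parts.hostname != VIRTUAL_HOST:
        return ""
    netloc = next(_next_backend)
    if parts.port is not None:
        # Preserve an explicit port from the original URI.
        netloc = "%s:%d" % (netloc, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


def main():
    # One reply line per request line, flushed immediately so that
    # Squid is never left waiting on a buffered answer.
    for line in sys.stdin:
        sys.stdout.write(rewrite(line) + "\n")
        sys.stdout.flush()
```

To run this as a redirector, add a call to main() at the bottom of the script. A smarter version could skip backends that are known to be down before choosing the next address.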
15.3.4 Many Boxes, Many Server Names

In this case, you want to use the Host header. You also want Squid to select the backend server based on the origin server's name (i.e., a DNS lookup). The configuration is as follows:

    httpd_accel_host www.example.com
    httpd_accel_single_host off
    httpd_accel_uses_host_header on

You might be tempted to set httpd_accel_host to virtual. However, that would be a mistake unless you are using HTTP interception.
15.4 Access Controls

A typically configured surrogate accepts HTTP requests from the whole Internet. This doesn't mean, however, that you can forget about access controls. In particular, you'll want to make sure Squid doesn't accept requests belonging to foreign origin servers. The exception is when you have httpd_accel_with_proxy enabled.

For a surrogate-only configuration, use one of the destination-based access controls. For example, the dst type accomplishes the task:

    acl All src 0/0
    acl TheOriginServer dst 192.168.3.2
    http_access allow TheOriginServer
    http_access deny All

Alternatively, you can use a dstdomain ACL if you prefer:

    acl All src 0/0
    acl TheOriginServer dstdomain www.squidbook.org
    http_access allow TheOriginServer
    http_access deny All

Note that enabling httpd_accel_single_host somewhat bypasses the access control rules. This is because the origin server location (i.e., the httpd_accel_host value) is then set after Squid performs the access control checks.

Access controls become really tricky when you combine surrogate and proxy modes in a single instance of Squid. You can no longer simply deny all requests to foreign origin servers. You can, however, make sure that outsiders aren't allowed to make proxy requests to random origin servers. For example:

    acl All src 0/0
    acl ProxyUsers src 10.2.0.0/16
    acl TheOriginServer dst 192.168.3.2
    http_access allow ProxyUsers
    http_access allow TheOriginServer
    http_access deny All
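To see how the combined surrogate and proxy rules sort requests, it can help to model them. The Python sketch below is an illustration only: it assumes the first-match rule evaluation described in Chapter 6, and the names RULES and check_access are invented here, not part of Squid.

```python
import ipaddress

PROXY_USERS = ipaddress.ip_network("10.2.0.0/16")
ORIGIN_SERVER = ipaddress.ip_address("192.168.3.2")

# Each rule is (action, predicate); rules are checked in squid.conf order.
RULES = [
    ("allow", lambda src, dst: src in PROXY_USERS),    # allow ProxyUsers
    ("allow", lambda src, dst: dst == ORIGIN_SERVER),  # allow TheOriginServer
    ("deny", lambda src, dst: True),                   # deny All
]


def check_access(src, dst):
    """Return "allow" or "deny" for a client address and origin address."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    for action, matches in RULES:
        if matches(src, dst):
            return action  # the first matching rule decides
    return "deny"
```

An inside client (say 10.2.5.9) is allowed to reach any origin server, an outside client is allowed only when the destination is 192.168.3.2, and everything else falls through to the final deny rule.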
You can also use the local port number in your access control rules. It doesn't really protect you from malicious activity, but does ensure, for example, that user-agents send their proxy requests to the appropriate port. This also makes it easier for you to split the service into separate proxy- and surrogate-only systems later. Assuming you configure Squid to listen on ports 80 and 3128, you might use:

    acl All src 0/0
    acl ProxyPort myport 3128
    acl ProxyUsers src 10.2.0.0/16
    acl SurrogatePort myport 80
    acl TheOriginServer dst 192.168.3.2
    http_access allow ProxyUsers ProxyPort
    http_access allow TheOriginServer SurrogatePort
    http_access deny All

Unfortunately, these access control rules don't prevent attempts to poison your cache when you enable httpd_accel_single_host, httpd_accel_uses_host_header, and httpd_accel_with_proxy simultaneously. This is because the valid proxy request:

    GET http://www.bad.site/ HTTP/1.1
    Host: www.bad.site

and the bogus surrogate request:

    GET / HTTP/1.1
    Host: www.bad.site

have the same access control result but are forwarded to different servers. They have the same access control result because, after Squid rewrites the surrogate request, it has the same URI as the proxy request. However, they don't get sent to the same place. The surrogate request goes to the server defined by httpd_accel_host because httpd_accel_single_host is enabled.

You can take steps towards solving this problem. Make sure your backend server generates an error for unknown server names (e.g., when the Host header refers to a nonlocal server). Better yet, don't run Squid as a surrogate and proxy at the same time.
15.5 Content Negotiation

Recent versions of Squid support the HTTP/1.1 Vary header. This is good news if your backend server uses content negotiation. It might, for example, send different responses depending on which web browser makes the request (e.g., the User-Agent header), or based on the user's language preferences (e.g., the Accept-Language header).

When the response for a URI varies on some aspect of the request, the origin (backend) server includes a Vary header. This header contains the list of request headers used to select the variant. These are the selecting headers. When Squid receives a response with a Vary header, it includes the selecting header values when it generates the internal cache key. Thus, a subsequent request with the same values for the selecting headers may generate a cache hit.

If you use the --enable-x-accelerator-vary option when running ./configure, Squid looks for a response header named X-Accelerator-Vary. Squid treats this header exactly like the Vary header. Because this is an extension header, however, it is ignored by downstream agents. It essentially provides a means for private content negotiation between Squid and your backend server. In order to use it you must also modify your server application to send the header in its responses. I don't know of any situation in which this header would be useful. If you serve negotiated responses, you probably want to use the standard Vary header so that all agents know what's going on.
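The effect of the selecting headers on the cache key can be sketched in a few lines of Python. This is an illustration of the idea only: Squid's real keys are computed internally in a different form, and the cache_key function and its key layout are invented for this example.

```python
def cache_key(uri, vary_header, request_headers):
    """Build an illustrative cache key from a URI and its selecting headers."""
    # Header names are case-insensitive, so normalize them to lowercase.
    headers = {name.lower(): value for name, value in request_headers.items()}
    selecting = [name.strip().lower()
                 for name in vary_header.split(",") if name.strip()]
    parts = [uri]
    for name in sorted(selecting):
        # A missing selecting header contributes an empty value.
        parts.append("%s=%s" % (name, headers.get(name, "")))
    return "|".join(parts)
```

Two requests for the same URI that agree on every header named in Vary map to the same key and can share a cached response. Requests that differ only in headers not named in Vary also share a key, while a different Accept-Language value (when Vary names it) produces a separate entry.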
15.6 Gotchas

Using Squid as a surrogate may improve your origin server's security and performance. However, there are some potentially negative side effects as well. Here are a few things to keep in mind.

15.6.1 Logging

When using a surrogate, the origin server's access log contains only the cache misses from Squid. Furthermore, those log-file entries have Squid's IP address, rather than the client's. In other words, Squid's access.log is where all the good information is now stored.

Recall that, by default, Squid doesn't use the common log-file format. You should use the emulate_httpd_log directive to make Squid's access.log look just like Apache's default log-file format.

15.6.2 Ignoring Reloads

The Reload button found on most browsers generates HTTP requests with the Cache-Control: no-cache directive set. While this is usually desirable for client-side caching proxies, it may ruin the performance of a surrogate. This is especially true if the backend server is heavily loaded. A reload request forces Squid to purge the currently cached response while retrieving the new response from the origin server. If those origin server responses arrive slowly, Squid consumes a larger than normal number of file descriptors and network resources.

To help in this situation, you may want to use one of the refresh_pattern options. When the ignore-reload option is set, Squid pretends that the request doesn't contain the no-cache directive. The ignore-reload option is generally safe for surrogates, although it does, technically, violate the HTTP protocol. To make Squid ignore reloads for all requests, use a line like this in squid.conf:

    refresh_pattern . 0 20% 4320 ignore-reload

For a somewhat safer alternative, you can use the reload-into-ims option. It causes Squid to validate its cached response when the request contains no-cache. Note, however, that this works only for responses that have cache validators (such as Last-Modified timestamps).
15.6.3 Uncachable Content

As a surrogate, Squid obeys the standard HTTP headers for caching responses from your backend server. This means, for example, that certain dynamic responses might not be cached. You might want to use the refresh_pattern directive to force caching of these objects. For example:

    refresh_pattern \.dhtml$ 60 80% 180

This trick only works for certain types of responses, namely, those without a Last-Modified or Expires header. By default, Squid doesn't cache such responses. However, using a nonzero minimum time in a refresh_pattern rule instructs Squid to cache the response, and serve it as a cache hit for that amount of time anyway. See Section 7.7 for the details. If your backend server generates other types of uncachable responses, you may not be able to trick Squid into storing them.
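The role of the minimum time can be illustrated with a simplified model. The Python sketch below is one reading of the refresh rules referenced in Section 7.7, restricted to responses that carry neither Expires nor Last-Modified. The rule list, the catch-all second rule, and the function name are assumptions made for the example, and the exact boundary comparisons in Squid may differ.

```python
import re

# (regex, min_minutes, percent, max_minutes), in squid.conf order.
RULES = [
    (re.compile(r"\.dhtml$"), 60, 80, 180),
    (re.compile(r"."), 0, 20, 4320),  # hypothetical catch-all rule
]


def is_fresh_without_validators(uri, age_minutes):
    """Freshness for a response lacking Expires and Last-Modified."""
    for pattern, min_age, _percent, max_age in RULES:
        if pattern.search(uri):  # the first matching rule is used
            if age_minutes > max_age:
                return False
            # Without Last-Modified the percent (LM-factor) test cannot
            # apply, so only the minimum age can make the response fresh.
            return age_minutes <= min_age
    return False
```

With these rules a .dhtml response that is 30 minutes old counts as fresh, the same response at 90 minutes is stale, and a response matched only by the catch-all rule (minimum 0) is stale almost immediately.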
15.6.4 Errors

With Squid as a surrogate in front of your origin server, you should be aware that visitors to your site may see an error message from Squid, rather than the origin server itself. In other words, your use of Squid may be "exposed" through certain error messages. For example, Squid returns its own error message when it fails to parse the client's HTTP request, which could happen if the request is incomplete or is malformed in some way. Squid also returns an error message if it can't connect to the backend server for some reason.

If your site is consistent and functioning properly, you probably don't need to worry about Squid's error messages. Nonetheless, you may want to take a close look at the access.log from time to time and see what sort of errors, if any, your users might be seeing.

15.6.5 Purging Objects

You may find the PURGE method particularly useful when operating a surrogate. Because you have a good understanding of the content being served, you are more likely to know when a cached object must be purged. The technique for purging an object is the same as I mentioned previously. See Section 7.6 for a refresher.

15.6.6 Neighbors

Although I don't recommend it, you can configure Squid as a surrogate and as part of a mesh or hierarchy. If you choose to take on such an arrangement, note that, by default, Squid forwards cache misses to parents (rather than the backend server). Assuming that isn't what you really want, be sure to use the cache_peer_access directives so that requests for your backend server don't go to your neighbors instead.
15.7 Exercises

● Install and configure Squid as a surrogate on the same system where you run an HTTP server.
● Make a few test requests with squidclient. Pay particular attention to the reply headers and notice how the requests appear in both access logs.
● Try to poison your own surrogate with fake HTTP requests. It is probably easier with httpd_accel_single_host enabled.
● Estimate the size of your origin server's document set. What percentage of the data can fit into 1 GB of memory or disk space?
Chapter 16. Debugging and Troubleshooting

No matter how hard the Squid developers try to be perfect, you may encounter some problems with Squid. These problems range from misbehaving clients and servers to fatal bugs in the Squid code. In this chapter, I'll talk about various ways you can debug these problems.

Some Squid problems may require you to turn on debugging. In most cases, you'll want to increase the debugging levels for specific parts of the code. I'll describe how to find out what those debugging sections are and how to change the settings. Also, I'll talk about the importance of providing detailed debugging when reporting bugs.

Finally, you may experience fatal bugs in the Squid code. These can result in segmentation violations, aborts, assertions, and core dumps. The core dump is a useful debugging aid. With a debugger, such as gdb, you can generate a process stack trace and send it to the developers for assistance.

If you suspect you have a Squid bug, but aren't sure, check with the squid-users mailing list or one of the other resources described in Section 1.6.
16.1 Some Common Problems

Before discussing debugging in general, I'll mention a few specific problems that commonly arise.

16.1.1 "Failed to make swap directory"

    Failed to make swap directory /var/spool/cache: (13) Permission denied

This happens when you run squid -z, and the Squid user ID doesn't have write permission to the /var/spool directory. Remember that if you start Squid as root and don't add a cache_effective_user line, Squid runs as the user nobody by default. Thus, your solution may be to simply run:

    # chown nobody:nobody /var/spool

16.1.2 "Address already in use"

    commBind: Cannot bind socket FD 10 to *:3128: Address already in use

This message appears when the bind() system call fails because the requested port is already opened by another application. Usually, this happens when you try to start a second instance of Squid when the first one is still running. If you see this error message, use ps to see if Squid is already running.

Squid uses the SO_REUSEADDR socket option, so that the bind() call should succeed even if there are some leftover sockets in the TIME_WAIT state. If you get the message, even though Squid isn't already running, your operating system may be buggy or especially finicky. Rebooting your system is one way to get around this problem.

Another possibility to consider is that the port (e.g., 3128) is currently being used by a different application. If you suspect this, you can use the lsof program (ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof) to find which application is listening on the port. FreeBSD users can use sockstat instead.
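If lsof or sockstat isn't handy, a quick probe can at least confirm that something already holds the port. The following Python sketch simply attempts the same bind() call and reports whether it fails with EADDRINUSE. The function name port_in_use is made up for this example, and unlike Squid the probe deliberately doesn't set SO_REUSEADDR.

```python
import errno
import socket


def port_in_use(host, port):
    """Return True if binding a TCP socket to (host, port) hits EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # some other failure, e.g. insufficient privileges
    finally:
        probe.close()
    return False
```

For example, port_in_use("0.0.0.0", 3128) returns True while another process is listening on port 3128. The probe says nothing about which process owns the port, so you still need ps, lsof, or sockstat to identify it.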
16.1.3 "Could not determine fully qualified hostname"

    FATAL: Could not determine fully qualified hostname.
    Please set 'visible_hostname'

You'll see this if Squid can't figure out its own fully qualified domain name. Here is the algorithm Squid uses:

● If you told Squid to bind the HTTP port to a specific interface address, Squid attempts a reverse DNS lookup of that address. If successful, the answer is used.
● Squid calls the gethostname() function, and then attempts to resolve its IP address with gethostbyname(). If successful, Squid uses the official hostname string returned by the latter function.

If neither technique works, Squid exits with the fatal message shown earlier. In this case, you must tell Squid its hostname with the visible_hostname directive. For example:

    visible_hostname my.host.name
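The two lookup steps above can be reproduced outside of Squid to see what they return on a given system. This Python sketch is an approximation rather than Squid's implementation: the function name guess_visible_hostname is invented, and the resolver functions are passed in as parameters only so that the logic can be exercised without real DNS.

```python
import socket


def guess_visible_hostname(bind_address=None,
                           reverse_lookup=socket.gethostbyaddr,
                           get_hostname=socket.gethostname,
                           forward_lookup=socket.gethostbyname_ex):
    """Return a hostname using the two steps above, or None if both fail."""
    if bind_address is not None:
        try:
            # gethostbyaddr returns (hostname, aliases, addresses).
            return reverse_lookup(bind_address)[0]
        except OSError:
            pass  # fall through to the second technique
    try:
        # gethostbyname_ex returns (official_name, aliases, addresses).
        return forward_lookup(get_hostname())[0]
    except OSError:
        return None
```

Calling guess_visible_hostname() with no arguments uses the system resolver. A result of None corresponds to the situation in which you need to set visible_hostname yourself.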
16.1.4 "DNS name lookup tests failed"

By default, Squid makes a few DNS queries before starting. This ensures that your DNS servers are reachable and functioning properly. If these tests fail, you'll see the following message in cache.log and/or syslog:

    FATAL: ipcache_init: DNS name lookup tests failed

If you use Squid on an intranet, Squid may be unable to query its standard list of hostnames. You can specify your own hostnames with the dns_testnames directive. Squid considers the DNS test successful as soon as it receives any reply.

If you want to skip the DNS tests altogether, simply use the -D command-line option when starting Squid:

    % squid -D ...
16.1.5 "Illegal character in hostname"

    urlParse: Illegal character in hostname 'super_bikes.tripod.com'

By default, Squid checks the characters in the hostname part of URLs and complains if it finds nonstandard characters. According to RFCs 1034 and 1035, names must consist of the letters A-Z, the digits 0-9, and a hyphen (-). The underscore (_) is one of the most problematic characters.

Squid validates hostnames because, in some cases, DNS resolvers behave differently with respect to illegal characters. For example:

    % host super_bikes.tripod.com
    super_bikes.tripod.com has address 209.202.196.70

    % ping super_bikes.tripod.com
    ping: cannot resolve super_bikes.tripod.com: Unknown server error

Rather than return the Unknown server error message, Squid checks the hostname first. It can then tell the user when the hostname contains illegal characters.

Some DNS resolvers do work with underscores and other nonstandard characters. If you'd prefer that Squid not check hostnames, use the --disable-hostname-checks option when running ./configure. If you want to allow underscores as the only exception, use the --enable-underscores option.
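A character check of this kind is easy to reproduce when you want to scan a list of hostnames for trouble before your users find it. The sketch below is a stand-alone Python illustration, not Squid's code. The function name is invented, the dot is accepted as the label separator, and the allow_underscore flag merely mirrors the idea behind --enable-underscores.

```python
import string

# Letters, digits, and hyphen, plus the dot that separates labels.
_ALLOWED = set(string.ascii_letters + string.digits + "-.")


def illegal_hostname_chars(hostname, allow_underscore=False):
    """Return the sorted list of characters a strict check would reject."""
    allowed = _ALLOWED | {"_"} if allow_underscore else _ALLOWED
    return sorted(set(hostname) - allowed)
```

For the hostname in the error message above, illegal_hostname_chars("super_bikes.tripod.com") reports the underscore, while passing allow_underscore=True yields an empty list.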
16.1.6 "Running out of filedescriptors"

    WARNING! Your cache is running out of filedescriptors

The above message appears when Squid uses up all available file descriptors. If this happens under normal conditions, you need to increase the kernel's file-descriptor limits and recompile Squid. See Section 3.3.1.

You might also see this message if Squid is the target of a denial-of-service attack. Someone may be intentionally, or unintentionally, sending Squid hundreds or thousands of requests at once. If this is the case, you can probably add a packet-filtering rule to block incoming TCP connections from the offending address(es). If the attack is distributed or using a spoofed source address, you'll have a harder time stopping it.

Forwarding loops (see Section 10.2) might also consume all of Squid's file descriptors, but only if Squid can't detect the loop. The Via header contains the hostname of all proxies that have seen a particular request. Squid looks for its own hostname in the header, and, if found, reports the loop. If, for some reason, the Via header is filtered from outgoing or incoming HTTP requests, Squid can't detect the loop. In this case, all file descriptors are quickly consumed by the same request going through Squid over and over.
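The loop check described above amounts to looking for your own name among the hosts listed in the Via header. The Python sketch below shows the idea for the common "version host[:port] (comment)" element form. It is a simplified illustration with an invented function name. Squid's own comparison may differ in detail, and the sketch doesn't handle comments that contain commas.

```python
def via_contains_host(via_header, my_hostname):
    """Return True if my_hostname is a received-by host in a Via header."""
    for element in via_header.split(","):
        fields = element.split()
        if len(fields) < 2:
            continue
        # fields[0] is the protocol version; fields[1] is host[:port].
        received_by = fields[1].split(":")[0]
        if received_by.lower() == my_hostname.lower():
            return True
    return False
```

If a proxy along the path strips the Via header, the function never sees the earlier hops, which is the situation in which a loop goes undetected and file descriptors run out.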
16.1.7 "icmpRecv: Connection refused"

You'll see the following message if the pinger program isn't correctly installed:

    icmpRecv: recv: (61) Connection refused

Most likely, the pinger program exits immediately because it doesn't have permission to open a raw ICMP socket. Because the process isn't running, Squid receives an I/O error when trying to talk to it. To alleviate the problem, go to the source directory and, as root, type:

    # make install-pinger

If successful, you should find that the pinger program has the following file ownership and permission settings:

    # ls -l /usr/local/squid/libexec/pinger
    -rws--x--x  1 root  squid  140728 Sep 16 19:58 /usr/local/squid/libexec/pinger
16.1.8 Squid Becomes Slow After Running for Some Time

Most likely, Squid is competing with other processes, or with itself, for memory on your system. When the Squid process no longer fits entirely in memory, the operating system is forced to read and write areas of memory to and from the swap space. This has a drastic effect on Squid's performance.

To validate this theory, check the Squid process size with utilities such as top and ps. Also check Squid's own page fault counter, as described in Section 14.2.1.24. Once you've identified memory consumption as the problem, try the following steps to reduce Squid's memory usage:

1. Reduce the value of cache_mem and read Appendix B.
2. Turn off memory pooling with this option:

       memory_pools off

3. Reduce the size of the disk cache by lowering the size of one or more cache directories. See Section 7.1.

16.1.9 Debugging Access Controls

If you're having no luck getting your access controls to work properly, here's a little tip that might help. Edit your squid.conf file and set the debug_options line to this:

    debug_options ALL,1 33,2

Then, reconfigure Squid:

    % squid -k reconfigure

Now, Squid writes a message to cache.log for each client request and another for each reply. The messages contain the request method, URI, whether the request/reply is allowed or denied, and the name of the last ACL that matched it. For example:

    2003/09/29 20:22:05| The request GET http://images.slashdot.org:80/topics/topicprivacy.gif is ALLOWED, because it matched 'localhost'
    2003/09/29 20:22:05| The reply for GET http://images.slashdot.org/topics/topicprivacy.gif is ALLOWED, because it matched 'all'

Knowing the name of the ACL doesn't always tell you the corresponding http_access line, but it gets you pretty close. If necessary, you can replicate your acl lines and give them unique names so that a given ACL name appears on only one http_access rule.
16.2 Debugging via cache.log

You already know from Section 13.1 that cache.log contains various operational messages Squid thinks are important enough to tell you about. We also refer to these as debugging messages. You can use the debug_options directive to control the verbosity of messages that appear in cache.log. By increasing the debugging levels, you'll see more detailed messages that may help you understand what Squid is doing. For example:

    debug_options ALL,1 11,3 20,3

Every debugging message in the Squid source code has two numeric attributes: a section and a level. Sections range from 0 to 100, and levels range from 0 to 10. In general, section numbers correspond to particular components of the source code. In other words, all the messages within a single source file have the same section number. In some cases, multiple files use the same debugging section. This tends to happen when a source file becomes too large and is split into smaller chunks.

The top of each source file has a line that mentions the debugging section. It looks like this:

    * DEBUG: section 9     File Transfer Protocol (FTP)

I don't expect you to look at the source files to find the section numbers. The same information appears here in Table 16-1.
Table 16-1. Debugging section numbers for the debug_options directive

Number  Description                                        Source file(s)
0       Client Database                                    client_db.c
1       Startup and Main Loop                              main.c
2       Unlink Daemon                                      unlinkd.c
3       Configuration File Parsing                         cache_cf.c
4       Error Generation                                   errorpage.c
5       Socket Functions                                   comm.c
5       Socket Functions                                   comm_select.c
6       Disk I/O Routines                                  disk.c
7       Multicast                                          multicast.c
8       Swap File Bitmap                                   filemap.c
9       File Transfer Protocol (FTP)                       ftp.c
10      Gopher                                             gopher.c
11      Hypertext Transfer Protocol (HTTP)                 http.c
12      Internet Cache Protocol                            icp_v2.c
12      Internet Cache Protocol                            icp_v3.c
13      High Level Memory Pool Management                  mem.c
14      IP Cache                                           ipcache.c
15      Neighbor Routines                                  neighbors.c
16      Cache Manager Objects                              cache_manager.c
17      Request Forwarding                                 forward.c
18      Cache Manager Statistics                           stat.c
19      Store Memory Primitives                            stmem.c
20      Storage Manager                                    store.c
20      Storage Manager Client-Side Interface              store_client.c
20      Storage Manager Heap-Based Replacement             repl/heap/store_heap_replacement.c
20      Storage Manager Logging Functions                  store_log.c
20      Storage Manager MD5 Cache Keys                     store_key_md5.c
20      Storage Manager Swapfile Metadata                  store_swapmeta.c
20      Storage Manager Swapin Functions                   store_swapin.c
20      Storage Manager Swapout Functions                  store_swapout.c
20      Store Rebuild Routines                             store_rebuild.c
21      Misc Functions                                     tools.c
22      Refresh Calculation                                refresh.c
23      URL Parsing                                        url.c
24      WAIS Relay                                         wais.c
25      MIME Parsing                                       mime.c
26      Secure Sockets Layer Proxy                         ssl.c
27      Cache Announcer                                    send-announce.c
28      Access Control                                     acl.c
29      Authenticator                                      auth/basic/auth_basic.c
29      Authenticator                                      auth/digest/auth_digest.c
29      Authenticator                                      authenticate.c
29      NTLM Authenticator                                 auth/ntlm/auth_ntlm.c
30      Ident (RFC 1413)                                   ident.c
31      Hypertext Caching Protocol                         htcp.c
32      Asynchronous Disk I/O                              fs/aufs/async_io.c
33      Client-Side Routines                               client_side.c
34      Dnsserver Interface                                dns.c
35      FQDN Cache                                         fqdncache.c
37      ICMP Routines                                      icmp.c
38      Network Measurement Database                       net_db.c
39      Cache Array Routing Protocol                       carp.c
40      Referer Logging                                    referer.c
40      User-Agent Logging                                 useragent.c
41      Event Processing                                   event.c
42      ICMP Pinger Program                                pinger.c
43      AIOPS                                              fs/aufs/aiops.c
44      Peer Selection Algorithm                           peer_select.c
45      Callback Data Registry                             cbdata.c
45      Callback Data Registry                             leakfinder.c
46      Access Log                                         access_log.c
47      Store COSS Directory Routines                      fs/coss/store_dir_coss.c
47      Store Directory Routines                           fs/aufs/store_dir_aufs.c
47      Store Directory Routines                           fs/diskd/store_dir_diskd.c
47      Store Directory Routines                           fs/null/store_null.c
47      Store Directory Routines                           fs/ufs/store_dir_ufs.c
47      Store Directory Routines                           store_dir.c
48      Persistent Connections                             pconn.c
49      SNMP Interface                                     snmp_agent.c
49      SNMP Support                                       snmp_core.c
50      Log File Handling                                  logfile.c
51      File Descriptor Functions                          fd.c
52      URN Parsing                                        urn.c
53      AS Number Handling                                 asn.c
54      Interprocess Communication                         ipc.c
55      HTTP Header                                        HttpHeader.c
56      HTTP Message Body                                  HttpBody.c
57      HTTP Status-Line                                   HttpStatusLine.c
58      HTTP Reply (Response)                              HttpReply.c
59      Auto-Growing Memory Buffer with printf             MemBuf.c
60      Packer: A Uniform Interface to Store Like Modules  Packer.c
61      Redirector                                         redirect.c
62      Generic Histogram                                  StatHist.c
63      Low Level Memory Pool Management                   MemPool.c
64      HTTP Range Header                                  HttpHdrRange.c
65      HTTP Cache Control Header                          HttpHdrCc.c
66      HTTP Header Tools                                  HttpHeaderTools.c
67      String                                             String.c
68      HTTP Content-Range Header                          HttpHdrContRange.c
69      HTTP Header: Extension Field                       HttpHdrExtField.c
70      Cache Digest                                       CacheDigest.c
71      Store Digest Manager                               store_digest.c
72      Peer Digest Routines                               peer_digest.c
73      HTTP Request                                       HttpRequest.c
74      HTTP Message                                       HttpMsg.c
75      WHOIS Protocol                                     whois.c
76      Internal Squid Object handling                     internal.c
77      Delay Pools                                        delay_pools.c
78      DNS Lookups; interacts with lib/rfc1035.c          dns_internal.c
79      Squid-Side DISKD I/O Functions                     fs/diskd/store_io_diskd.c
79      Storage Manager COSS Interface                     fs/coss/store_io_coss.c
79      Storage Manager UFS Interface                      fs/ufs/store_io_ufs.c
80      WCCP Support                                       wccp.c
82      External ACL                                       external_acl.c
83      SSL Accelerator Support                            ssl_support.c
84      Helper Process Maintenance                         helper.c
Debugging levels are assigned such that more important messages have smaller values and less important messages have higher values. Level 0 is for very important messages, while level 10 is for those that are relatively unimportant. Other than that, there are no strict guidelines or requirements. Developers are generally free to choose which debugging levels are appropriate.

The debug_options directive determines which messages appear in cache.log. Its syntax is:

    debug_options section,level section,level ...

The default setting is ALL,1 such that Squid prints any debugging message with level 0 or 1. If you want to make even less output appear in cache.log, you can set debug_options to ALL,0.

If you want to see additional debugging for a particular component, simply add the appropriate section and level to the end of the debug_options list. For example, this line adds level 5 debugging for the FTP server-side code:

    debug_options ALL,1 9,5

As with other configuration directives, you can change debug_options, then send Squid the reconfigure signal:

    % squid -k reconfigure

Note that the debug_options parameters are processed sequentially, and a later value can override an earlier one. This is of particular concern if you use the ALL keyword. Consider this example:

    debug_options 9,5 20,9 4,2 ALL,1

In this case, the final value overwrites all of the preceding settings because ALL,1 sets the debugging level to 1 for all sections.

Selecting appropriate debugging sections and levels is sometimes quite difficult, especially for novice Squid users. Many of the more detailed debugging messages are meaningful only to developers and those familiar with the source code. Inexperienced Squid users are likely to find many of the debugging messages meaningless and overwhelming. Furthermore, you may have difficulty isolating the debugging for a particular request or event if Squid is relatively busy. The higher debugging levels are often more useful if you can test Squid with one request at a time.

You must also be particularly careful about running Squid with high debugging levels for a long amount of time. If Squid is busy, the cache.log file grows very quickly and may eventually consume all free space on its partition. If this happens, Squid exits with a fatal message. Another concern is that performance may degrade significantly. Due to the high number of debugging messages, Squid devotes a lot of CPU resources to formatting and printing strings. It also consumes a lot of disk bandwidth writing them all to cache.log.
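The left-to-right processing of debug_options, and the way a trailing ALL wipes out earlier settings, can be demonstrated with a short model. The Python sketch below is an illustration rather than Squid's parser: the function name parse_debug_options is invented, and the section range simply follows the 0 to 100 range given in this section.

```python
def parse_debug_options(value, max_section=100):
    """Return {section: level} after applying the pairs left to right."""
    levels = {section: 0 for section in range(max_section + 1)}
    for token in value.split():
        section, level = token.split(",")
        if section.upper() == "ALL":
            # ALL resets every section, discarding anything set earlier.
            for key in levels:
                levels[key] = int(level)
        else:
            levels[int(section)] = int(level)
    return levels
```

With "ALL,1 9,5" the result has section 9 at level 5 and everything else at level 1. With "9,5 20,9 4,2 ALL,1" every section ends up at level 1, so the three specific settings have no effect.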
< Day Day Up >
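The left-to-right processing of debug_options can be illustrated with a short Python sketch. This is not Squid's parser: the function name apply_debug_options and the section count of 100 are assumptions made only for the illustration.

```python
# Hypothetical model of sequential debug_options processing.
# NUM_SECTIONS is an assumed value for illustration, not Squid's real limit.
NUM_SECTIONS = 100


def apply_debug_options(spec, num_sections=NUM_SECTIONS):
    """Return a list of debug levels, indexed by section number."""
    levels = [0] * num_sections
    for token in spec.split():
        section, level = token.split(",")
        if section.upper() == "ALL":
            # ALL overwrites every section, including those set earlier.
            levels = [int(level)] * num_sections
        else:
            levels[int(section)] = int(level)
    return levels


good = apply_debug_options("ALL,1 9,5")
bad = apply_debug_options("9,5 20,9 4,2 ALL,1")
# Section 9 keeps level 5 only when ALL appears before it.
print(good[9], bad[9])
```

In this model, placing ALL first and the specific sections after it is the only ordering that preserves the per-section levels.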
16.3 Core Dumps, Assertions, and Stack Traces

If you are unlucky, Squid may experience a fatal error while running. These sorts of errors come in three flavors: assertions, bus errors, and segmentation violations.

An assertion is a sanity check in the source code. It is a tool, used by developers, to make sure that some condition is always true before proceeding. If the condition is false, the program exits and creates a core file so that the developer can analyze the situation. Here is a typical example:

    int some_array[100];

    void
    some_func(int idx)
    {
        ...
        assert(idx < 100);
        some_array[idx]++;
        ...
    }

Here, the assertion makes sure that the value of the array index is within the bounds of the array. It would be an error to access array elements greater than (or equal to) 100. If, somehow, the value of idx isn't less than 100, the program prints a message like this when it runs:

    assertion failed: filename.c:123: "idx < 100"

If this happens with Squid, you'll see an "assertion failed" message in cache.log. In addition, your operating system should create a core file, which is helpful in the post-mortem analysis. I'll explain what to do with a core file at the end of this section.

A bus error is "a fatal failure in the execution of a machine language instruction resulting from the processor detecting an anomalous condition on its bus."[1] They typically occur when the processor attempts an operation on a nonaligned memory address. You are, perhaps, more likely to see a bus error on a 64-bit processor system, such as the Alpha and some SPARC CPUs. Fortunately, they are easy to fix.

[1] From the Free On-line Dictionary of Computing (FOLDOC), http://wombat.doc.ic.ac.uk/foldoc/.
Segmentation violation errors are, unfortunately, more common and sometimes harder to fix. A "SEGV" usually occurs when the process tries to access an invalid memory area. It might be a NULL pointer or a memory address outside the scope of the process. They are particularly difficult to track down when the cause (the bug) and effect (the SEGV) are separated in time.

By default, Squid traps bus errors and segmentation violations, and attempts a clean shutdown when they occur. You'll see something like this in cache.log:

    FATAL: Received Bus Error...dying.
    2003/09/29 23:18:01| storeDirWriteCleanLogs: Starting...

In most cases, Squid is able to write clean versions of the swap.state files. Just before exiting, Squid calls abort() to create a core file. The core file may help you, or other developers, track down and fix the bug.

A core file is generally more useful when it is created immediately following the error, rather than after the clean shutdown procedure has run. You can tell Squid not to trap bus errors and segmentation violations with the -C command line option:

    % squid -C ...

Note that some operating systems use the filename core, while others prepend the process name (i.e., squid.core). Once you have the core file, use a debugger to get a stack trace. gdb is the GNU debugger, a companion to the GNU C compiler. If you don't have gdb, try running dbx or adb instead. Here's how you can use gdb to get a stack trace:

    % gdb /usr/local/squid/sbin/squid /path/to/squid.core
    ...
    Core was generated by 'squid'.
    Program terminated with signal 6, Abort trap.
    ...

Then, type where to print the stack trace:

    (gdb) where
    #0  0x28168b54 in kill () from /usr/lib/libc.so.4
    #1  0x281aa0ce in abort () from /usr/lib/libc.so.4
    #2  0x80a2316 in death (sig=10) at tools.c:301
    #3  0xbfbfffac in ?? ()
    #4  0x80abe0a in storeDiskdSend (mtype=4, sd=0x82101e0, id=1214000,
        sio=0x9e90a10, size=4096, offset=-1, shm_offset=0)
        at diskd/store_io_diskd.c:485
    #5  0x80ab726 in storeDiskdWrite (SD=0x82101e0, sio=0x9e90a10,
        buf=0x13e94000 "...", size=4096, offset=-1, free_func=0)
        at diskd/store_io_diskd.c:251
    #6  0x809d2fb in storeWrite (sio=0x9e90a10, buf=0x13e94000 "...",
        size=4096, offset=-1, free_func=0) at store_io.c:89
    #7  0x80a1c2d in storeSwapOut (e=0xc5a7800) at store_swapout.c:259
    #8  0x809b667 in storeAppend (e=0xc5a7800, buf=0x810f9a0 "...",
        len=57344) at store.c:533
    #9  0x807873b in httpReadReply (fd=134, data=0xc343590) at http.c:642
    #10 0x806492f in comm_poll (msec=10) at comm_select.c:445
    #11 0x8084404 in main (argc=2, argv=0xbfbffa8c) at main.c:742
    #12 0x804a465 in _start ()

As you can see, the stack trace prints the name of each function, its arguments, and the source code filenames and line numbers. This information is extremely useful when tracking down bugs. In some cases, however, it isn't sufficient. You might be asked to execute additional commands in the debugger, such as printing the value of a variable from within a certain function:

    (gdb) frame 4
    #4  0x80abe0a in storeDiskdSend (mtype=4, sd=0x82101e0, id=1214000,
        sio=0x9e90a10, size=4096, offset=-1, shm_offset=0)
        at diskd/store_io_diskd.c:485
    485         x = msgsnd(diskdinfo->smsgid, &M, msg_snd_rcv_sz, IPC_NOWAIT);
    (gdb) set print pretty
    (gdb) print M
    $2 = {
      mtype = 4,
      id = 1214000,
      seq_no = 7203103,
      callback_data = 0x9e90a10,
      size = 4096,
      offset = -1,
      status = -1,
      shm_offset = 0
    }

After you've reported a bug, try to keep the core file around for a few days, in case you need additional information from it.
16.3.1 Can't Find the Core File?

Core files are written in the process's current directory. By default, Squid doesn't change its current directory at startup. Thus, your core file, if any, should be written in the directory in which Squid was started. You won't find a core file if the filesystem doesn't have enough free space or if the process owner doesn't have write permission in the directory. You can use the coredump_dir directive to make Squid use a specific location, somewhere with plenty of space and sufficient permissions.

Process resource limits may also prevent the creation of a core file. One of the process limit parameters is the size of the core dump file. Usually, most systems set this to "unlimited" by default. You can check the current limit from your shell with the limits or ulimit commands. Note, however, that your shell's limit might be different than the Squid process limit, especially when Squid is started automatically at boot time. If you suspect process limits prevent generation of a core file, try this:

    csh% limit coredumpsize unlimited
    csh% squid -NCd1

On FreeBSD, a sysctl parameter controls whether or not the operating system generates a core file for processes that call setuid() and/or setgid(). Squid uses those functions if you start it as root. To get a core dump, then, you must tell the kernel to create the core file with this command:

    # sysctl kern.sugid_coredump=1

See the sysctl.conf manpage for information on how to set the variable automatically when your system boots.
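If you prefer to check the core file size limit from a script rather than from the shell, the following Python sketch may help. It uses the standard resource module on Unix-like systems. The helper name core_dumps_allowed is hypothetical, and the script reports the limit of the Python process itself (inherited from whatever started it), not the limit of a running Squid process.

```python
import resource


def core_dumps_allowed(soft_limit):
    """Return True if a soft RLIMIT_CORE value permits writing a core file."""
    # RLIM_INFINITY means "unlimited". Any positive byte count also permits
    # a core file, although it may be truncated. Zero disables core files.
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("soft limit:", soft, "hard limit:", hard)
print("core dumps allowed:", core_dumps_allowed(soft))
```

Running the same check from the boot script environment that starts Squid would give a closer picture of the limits Squid actually inherits.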
16.4 Replicating Problems

Occasionally you may encounter a certain request or origin server that seems not to work with Squid. You can use the following technique to determine if the problem lies with Squid, the client, or the origin server. The trick is to capture the HTTP request, then replay it in different ways until you identify the problem.

Capturing the HTTP request means getting more than just the URL. You also need the request method, HTTP version number, and all of the request headers. One way to capture the request is by enabling full debugging in Squid for a short time. On the Squid box, type:

    % squid -kdebug

Then, go to the web browser and issue the request. Squid should receive the request almost immediately. After a few seconds, go back to the Squid box and issue the same command:

    % squid -kdebug

Now your cache.log file should contain the client's request. If your Squid is busy, the cache.log will contain a lot of requests, so you'll have to search for it. It looks something like this:

    2003/09/29 10:37:40| parseHttpRequest: Method is 'GET'
    2003/09/29 10:37:40| parseHttpRequest: URI is 'http://squidbook.org/'
    2003/09/29 10:37:40| parseHttpRequest: Client HTTP version 1.1.
    2003/09/29 10:37:40| parseHttpRequest: req_hdr = {
    User-Agent: Mozilla/5.0 (compatible; Konqueror/3)
    Pragma: no-cache
    Cache-control: no-cache
    Accept: text/*, image/jpeg, image/png, image/*, */*
    Accept-Encoding: x-gzip, gzip, identity
    Accept-Charset: iso-8859-1, utf-8;q=0.5, *;q=0.5
    Accept-Language: en
    Host: squidbook.org

Note that Squid prints the components of the first line separately. You'll have to manually reassemble them like this:

    GET http://squidbook.org/ HTTP/1.1
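If you do this often, the reassembly step can be scripted. Here is a minimal Python sketch that rebuilds the request line from the three parseHttpRequest messages. The function name request_line_from_log is hypothetical, and the sketch assumes you pass it a log excerpt covering a single request; on a busy cache you would first have to isolate the relevant lines.

```python
import re


def request_line_from_log(log_text):
    """Rebuild 'METHOD URI HTTP/x.y' from parseHttpRequest debug lines."""
    method = re.search(r"parseHttpRequest: Method is '([^']*)'", log_text)
    uri = re.search(r"parseHttpRequest: URI is '([^']*)'", log_text)
    version = re.search(
        r"parseHttpRequest: Client HTTP version (\d+\.\d+)", log_text
    )
    if not (method and uri and version):
        raise ValueError("incomplete parseHttpRequest output")
    return f"{method.group(1)} {uri.group(1)} HTTP/{version.group(1)}"


sample = """\
2003/09/29 10:37:40| parseHttpRequest: Method is 'GET'
2003/09/29 10:37:40| parseHttpRequest: URI is 'http://squidbook.org/'
2003/09/29 10:37:40| parseHttpRequest: Client HTTP version 1.1.
"""
print(request_line_from_log(sample))
```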
Another way to capture the full request is with a utility such as netcat or socket (http://www.jnickelsen.de/socket/). Start the socket program listening on some port, then configure the browser to use that port as the proxy address. When you make the request again, socket prints the HTTP request:

    % socket -s 8080
    GET http://squidbook.org/ HTTP/1.1
    User-Agent: Mozilla/5.0 (compatible; Konqueror/3)
    Pragma: no-cache
    Cache-control: no-cache
    Accept: text/*, image/jpeg, image/png, image/*, */*
    Accept-Encoding: x-gzip, gzip, identity
    Accept-Charset: iso-8859-1, utf-8;q=0.5, *;q=0.5
    Accept-Language: en
    Host: squidbook.org

Finally, you can also use a network packet capture utility, such as tcpdump or ethereal. After capturing a few packets with tcpdump, you can then use tcpshow to view them:

    # tcpdump -w tcpdump.log -c 10 -s 1500 port 80
    # tcpshow -noHostNames -noPortNames < tcpdump.log | less
    ...
    Packet 4
    TIME:   08:39:29.593051 (0.000627)
    LINK:   00:90:27:16:AA:75 -> 00:00:24:C0:0D:25 type=IP
    IP:     10.0.0.21 -> 206.168.0.6 hlen=20 TOS=00 dgramlen=304 id=4B29
            MF/DF=0/1 frag=0 TTL=64 proto=TCP cksum=15DC
    TCP:    port 2074 -> 80 seq=0481728885 ack=4107144217
            hlen=32 (data=252) UAPRSF=011000 wnd=57920 cksum=EB38 urg=0
    DATA:   GET / HTTP/1.0.
            Host: www.ircache.net.
            Accept: text/html, text/plain, application/pdf, application/
            postscript, text/sgml, */*;q=0.01.
            Accept-Encoding: gzip, compress.
            Accept-Language: en.
            Negotiate: trans.
            User-Agent: Lynx/2.8.1rel.2 libwww-FM/2.14.
            .

Note that tcpshow prints a period where the data contains a newline character.

Once you've captured a request, save it to a file. Then you can replay it through Squid with netcat or socket:

    % socket squidhost 3128 < request | less

If the response looks normal, the problem might be with the user-agent. Otherwise, you can change various things to isolate the problem. For example, if you see some funny-looking HTTP headers, delete them from the request and try it again. You may also find it useful to try the request directly with the origin server, instead of going through Squid. To do that, remove the http://host.name prefix from the request line, leaving only the path, and send the request to the origin server:

    % cat request
    GET / HTTP/1.1
    User-Agent: Mozilla/5.0 (compatible; Konqueror/3)
    Pragma: no-cache
    Cache-control: no-cache
    Accept: text/*, image/jpeg, image/png, image/*, */*
    Accept-Encoding: x-gzip, gzip, identity
    Accept-Charset: iso-8859-1, utf-8;q=0.5, *;q=0.5
    Accept-Language: en
    Host: squidbook.org

    % socket squidbook.org 80 < request | less

When working with HTTP in this manner, you might find it useful to refer to RFC 2616 and O'Reilly's HTTP: The Definitive Guide.
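Editing the request line by hand is easy to get wrong, so here is a small Python sketch of the conversion from the proxy form of a request to the form an origin server expects. The function name to_origin_form is hypothetical, and the sketch assumes the saved request uses plain newline characters and has a well-formed first line.

```python
from urllib.parse import urlsplit


def to_origin_form(request_text):
    """Rewrite a proxy-style request so its first line carries only the path."""
    first, sep, rest = request_text.partition("\n")
    method, target, version = first.split()
    parts = urlsplit(target)
    # An absolute URL with no path still needs "/" as the request target.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"{method} {path} {version}{sep}{rest}"


proxy_request = "GET http://squidbook.org/ HTTP/1.1\nHost: squidbook.org\n\n"
print(to_origin_form(proxy_request), end="")
```

The headers are passed through untouched, so the Host header from the captured request still tells the origin server which site you want.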
16.5 Reporting a Bug

If your Squid version is more than a few months old, you should probably update it before reporting any bugs. Chances are that others noticed the same bug, and it may already be fixed.

If you discover a legitimate bug in Squid, please enter it into the Squid bug tracking database: http://www.squid-cache.org/bugs/. This is currently a "bugzilla" database, which requires you to create an account. You will receive updates as the bug is processed by Squid developers.

If you are new at reporting bugs, please take the time to read "How to Report Bugs Effectively," by Simon Tatham (http://www.chiark.greenend.org.uk/~sgtatham/bugs.html).

When reporting a bug, be sure to include the following information:

● Squid version number. If the bug happens with more than one version, include the other versions as well.
● Your operating system name and version.
● Whether the bug happens every time or occasionally.
● A good description of exactly what happens. Phrases such as "it doesn't work," and "the request fails" are essentially useless to bug fixers. Be very specific.
● A stack trace in the case of an assertion, bus error, or segmentation violation.

Remember that Squid developers are generally unpaid volunteers, so be patient. Critical bugs take priority over minor annoyances.
16.6 Exercises

● Use tcpdump or ethereal to capture some real HTTP requests. Save them to a file and replay the requests through Squid. Feel free to modify or delete some of the HTTP headers.
● Try to make Squid run out of file descriptors.
● Run tail -f cache.log and start Squid with debug_options set to ALL,3. If that is too overwhelming, try ALL,2.
● Force Squid to generate a core file by sending each of the following signals: SIGBUS, SIGSEGV, and SIGABRT. Find the core file and use gdb or another debugger to get a stack trace.
Appendix A. Config File Reference

This appendix contains descriptions and examples for every squid.conf directive. I present them here in the same order they appear in the default squid.conf. That means certain related directives are grouped together, and some recently added directives are at the end. You may want to use the book's index to locate a particular directive by name.

In the following sections, the descriptive text is followed by a table that contains the directive's syntax, a default value, an example, and related directives.
http_port

This is the port, or ports, Squid uses to listen for HTTP requests from cache clients. If your system has more than one network interface, you can use the optional hostname prefix to make Squid bind the socket to a specific IP address. The hostname must correspond to one of your interface addresses. I recommend using an IP address here, instead of a hostname, to avoid DNS lookup delays at startup. If you run Squid as a surrogate (accelerator), you probably want to accept HTTP connections on port 80. Binding to privileged ports requires root permissions.

Syntax
http_port [hostname:]port [[hostname:]port] ...

Default
http_port 3128

Example
http_port 8080
http_port 3128 3129 3130 3131
http_port 192.168.1.1:3128

Related
https_port, icp_port, htcp_port, snmp_port, httpd_accel_port, http_access
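To make the [hostname:]port syntax concrete, here is a small Python sketch that splits an http_port value into its listening addresses. The function name parse_http_port is hypothetical and this is not how Squid itself parses the directive. It only illustrates that each whitespace-separated token has an optional host part and a mandatory port.

```python
def parse_http_port(value):
    """Split an http_port value into (host, port) pairs; host is None if absent."""
    listeners = []
    for token in value.split():
        # rpartition leaves sep empty when the token has no "host:" prefix.
        host, sep, port = token.rpartition(":")
        listeners.append((host if sep else None, int(port)))
    return listeners


print(parse_http_port("3128 3129 3130 3131"))
print(parse_http_port("192.168.1.1:3128"))
```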
https_port

This directive allows Squid to accept encrypted (SSL or TLS) connections. It is available only when you use the ./configure --enable-ssl option.

The mandatory cert= argument specifies the pathname to an SSL certificate file in PEM format. This is the format commonly used by OpenSSL and other security software for portable representation of encryption keys. The optional key= argument is the path to a private key file. If you omit this option, Squid assumes the certificate file also contains the private key.

You can use the version= argument to tell Squid which protocol versions are allowed: 1=automatic, 2=SSLv2 only, 3=SSLv3 only, 4=TLSv1 only. The cipher= argument is an optional colon-separated list of allowed ciphers. Squid simply passes this list to the SSL_CTX_set_cipher_list() function. Lastly, the options= argument allows you to pass additional configuration parameters to the OpenSSL library. For example, NO_SSLv2, NO_SSLv3, and NO_TLSv1 disable the use of those particular protocols. Additional option keywords are defined in Squid's src/ssl_support.c file.

Syntax
https_port [hostname:]port cert=certificate.pem [key=key.pem] [version=N] [cipher=list] [options=SSL_Options]

Default
No default

Example
https_port 443 cert=/etc/squid-cert.pem key=/etc/squid-privkey.pem

Related
http_port, http_access
ssl_unclean_shutdown

This is a hack borrowed from mod_ssl for Apache. Certain user-agents, notably Microsoft Internet Explorer, may not execute the SSL shutdown procedure correctly, especially when persistent connections are involved. Enabling this directive violates the SSL/TLS standard but may eliminate error messages from broken clients.

Syntax
ssl_unclean_shutdown on|off

Default
ssl_unclean_shutdown off

Example
ssl_unclean_shutdown on

Related
https_port
icp_port

This is the UDP port Squid uses for ICP messages. In particular, it is used both for sending and receiving queries and replies. Your Squid receives ICP queries from other caches on this port. It also receives ICP replies from other caches, in response to its own queries, on this port.

Unlike http_port, you can't specify a list of ICP port numbers. Furthermore, you must use the udp_incoming_address and udp_outgoing_address directives if you want to restrict ICP traffic to a specific interface address. Setting icp_port to 0 disables ICP.

Syntax
icp_port port

Default
icp_port 3130

Example
icp_port 4130

Related
icp_query_timeout, icp_access, log_icp_queries, icp_hit_stale, udp_incoming_address, htcp_port, http_port, cache_peer
htcp_port

The Hypertext Caching Protocol is an alternative to ICP. It provides better security and better cache hit predictions. However, HTCP messages are larger and more complicated. HTCP must be enabled at compile-time with the --enable-htcp option.

This directive specifies the UDP port Squid uses to send and receive HTCP queries and replies. You may only specify one HTCP port number. As with ICP, the udp_incoming_address and udp_outgoing_address directives also control HTCP packets. You may configure Squid to receive both ICP and HTCP queries at the same time. Setting htcp_port to 0 disables HTCP.

Syntax
htcp_port port

Default
htcp_port 4827

Example
htcp_port 9999

Related
icp_port, http_port, udp_incoming_address, udp_outgoing_address, cache_peer
mcast_groups

As discussed in Section 10.6.3, Squid supports receiving ICP queries via multicast. This option specifies a list of multicast addresses Squid should join to receive these ICP queries.

IP multicast is a very tricky and often fragile feature of the Internet. I strongly recommend you avoid using multicast for ICP unless you are already familiar with it. Don't try to guess appropriate values for these directives, and don't expect it to work the first time.

Syntax
mcast_groups multicast-address [multicast-address] ...

Default
No default

Example
mcast_groups 239.128.16.128

Related
cache_peer, mcast_icp_query_timeout
udp_incoming_address

This directive causes Squid to bind all UDP sockets to a specific interface address. The IP address must correspond to one of the system's network interfaces. This directive affects the DNS (when using the internal implementation), ICP, HTCP, and SNMP sockets.

If your system has just one IP address, you probably shouldn't use this directive. If you set udp_outgoing_address to one of your other network interface addresses, Squid can receive UDP datagrams on that interface as well.

Syntax
udp_incoming_address ip-address

Default
udp_incoming_address 0.0.0.0

Example
udp_incoming_address 192.168.4.5

Related
udp_outgoing_address, icp_port, htcp_port, snmp_port
udp_outgoing_address

This directive specifies the source address for UDP messages that Squid sends. It affects DNS (when using the internal implementation), ICP, HTCP, and SNMP messages. The specified address must correspond to one of the system's network interfaces. You should use this directive only if your system has multiple IP addresses.

The default value of 255.255.255.255 causes Squid to use the incoming address for sending, as well as receiving. In other words, rather than creating a separate UDP socket for sending, Squid sends and receives messages through a single socket.

If you use this directive, it must have a different value than udp_incoming_address. Squid can't create two UDP sockets bound to the same IP address and port number.

Syntax
udp_outgoing_address ip-address

Default
udp_outgoing_address 255.255.255.255

Example
udp_outgoing_address 192.168.5.6

Related
udp_incoming_address, icp_port, htcp_port, snmp_port
cache_peer

Okay, this one's long, so hang on... This directive defines your neighbor caches and tells Squid how to communicate with them. See Chapter 10 for the lowdown on neighbor caches.

The first argument is the neighbor cache's hostname, or IP address. You can safely use hostnames here because Squid doesn't block while resolving them. In fact, Squid periodically reresolves the hostname so that if the address changes, you won't need to restart. Neighbor hostnames must be unique; you can't have two neighbors with the same name, even if they have different ports.

The second argument specifies the type of neighbor cache. The choices are parent, sibling, or multicast. Recall from Section 10.6.3 that for a multicast neighbor, Squid sends ICP queries only to the neighbor's IP address, which must be a valid multicast address. Squid makes HTTP requests to parents and siblings but never to a multicast neighbor.

The third and fourth arguments are HTTP and ICP/HTCP port numbers. The HTTP port number corresponds to the neighbor cache's http_port (or equivalent) setting. A value of 0 for the ICP/HTCP port disables those protocols for the neighbor. If you add the htcp option (described in the subsequent paragraphs), Squid sends HTCP queries to the neighbor. Otherwise, Squid sends ICP queries. If you choose not to use ICP or HTCP, you must specify the neighbor as a parent cache.

This brings us to the options field. The cache_peer directive has numerous options, which can be very confusing:

proxy-only
Instructs Squid to not store any responses received from the neighbor. This is often useful when you have a cluster and don't want a resource to be stored on more than one cache.

weight=n
Allows you to weight parent caches artificially when using ICP/HTCP and all parents report a cache miss. Normally Squid selects the parent whose reply arrived first. In fact, it remembers which parent has the best round-trip time for the query. Squid actually divides the RTT by the weight, so that a parent with weight=2 has lower (better) round-trip times and should be selected more often.

ttl=n
An option for multicast neighbors only. It is the multicast TTL value to use for ICP queries and it controls how far away the ICP queries can travel. The valid range is 0-128. A larger value allows the multicast queries to travel farther and possibly be intercepted by outsiders. Use a lower number to keep the queries close to the source and within your network.

no-query
Disables ICP/HTCP for the neighbor. That is, your cache won't send any queries to the neighbor for cache misses. It is often used with the default option.

default
Specifies the neighbor as a suitable choice in the absence of other hints. Squid would prefer to forward a cache miss to a parent that is likely to have a cached copy of the particular resource. Sometimes Squid won't have any clues (e.g., if you disable ICP/HTCP with no-query). In these cases, Squid looks for a parent that has been marked as a default choice.

round-robin
A simple load-sharing technique. It only makes sense when you mark two or more parent caches as round-robin. Squid keeps a counter for each parent. When it needs to forward a cache miss, Squid selects the parent with the lowest counter.

multicast-responder
Tells Squid to expect ICP replies from the neighbor in response to multicast queries.

closest-only
Refers to Squid's netdb features. When your neighbor has enabled the network database, it may return ICMP RTT measurements in ICP miss replies. This option instructs Squid to select a parent based on the RTT between the parent and the origin server, rather than the RTT between your cache and the parent.

no-digest
Tells Squid not to request a Cache Digest from the neighbor. See Section 10.7.

no-netdb-exchange
Tells Squid not to request the neighbor's netdb database. Note, this refers to the bulk transfer of the RTT measurements, not the inclusion of these measurements in ICP miss replies.

no-delay
Tells Squid to ignore any delay pools settings for requests to the neighbor. See Appendix C.

login=credentials
Instructs Squid to send authentication credentials to the neighbor. This option has three different formats, which I've fully described in Section 10.3.1.

connect-timeout=n
Specifies how long Squid should wait when establishing a TCP connection to the neighbor. Without this option, the timeout is taken from the global connect_timeout directive. By using a lower timeout, Squid gives up on the neighbor quickly and tries forwarding the request elsewhere.

digest-url=url
Specifies the URL for the neighbor's Cache Digest. Without this option, Squid assumes the digest URL is http://neighbor.host.name:port/squid-internal-periodic/store_digest.

allow-miss
Instructs Squid to omit the Cache-control: only-if-cached directive for requests sent to a sibling. You should use this only if the neighbor is using the icp_hit_stale directive and isn't using a miss_access list.

max-conn
Places a limit on the number of simultaneous connections that Squid can open to the neighbor. When this limit is reached, Squid excludes the neighbor from its selection algorithm.

htcp
Tells Squid to send HTCP, instead of ICP, queries to this neighbor. If you add this option, don't forget to also change the port number. Squid uses 4827 as the default HTCP port. See Chapter 10.

carp-load-factor=f
Tells Squid that this neighbor is a member of a CARP array. The load factor value specifies the fraction of requests that this neighbor will receive. The load factor values for all neighbors must add up to 1.0. See Chapter 10.

Syntax
cache_peer hostname type http-port icp-port [options]

Default
No default

Example
cache_peer bigcache.isp.net parent 3128 3130
cache_peer medcache.isp.net sibling 3128 4827 htcp
cache_peer 172.16.45.111 parent 3128 0 no-query default

Related
cache_peer_access, http_port, icp_port, htcp_port, icp_query_timeout, dead_peer_timeout, peer_connect_timeout, cache_peer_domain, neighbor_type_domain
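The weight= and round-robin options are easier to reason about with a small model. The Python sketch below is only an illustration of the two selection rules as described in the text. The function names, the dictionaries of peer names, and the sample RTT values are all hypothetical, and Squid's real selection code considers more factors than this.

```python
def pick_by_weighted_rtt(rtts_ms, weights):
    """Pick the parent with the lowest RTT after dividing by its weight."""
    # Parents without an explicit weight are assumed to have weight 1.
    return min(rtts_ms, key=lambda name: rtts_ms[name] / weights.get(name, 1))


def pick_round_robin(counters):
    """Pick the parent with the lowest use counter, then bump that counter."""
    choice = min(counters, key=counters.get)
    counters[choice] += 1
    return choice


rtts = {"parent-a": 30, "parent-b": 50}
print(pick_by_weighted_rtt(rtts, {}))                # unweighted choice
print(pick_by_weighted_rtt(rtts, {"parent-b": 2}))   # parent-b counts as 25 ms

counters = {"parent-a": 0, "parent-b": 0}
print([pick_round_robin(counters) for _ in range(4)])
```

With weight=2, the slower parent in this example wins because its effective round-trip time drops below the other parent's.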
cache_peer_domain

This directive allows you to restrict forwarded requests by their domain names. For example, you can make sure that URIs in a certain domain never go to your parent cache. Similarly, you can make sure that requests for only a few specific domain names are sent to a neighbor. The cache_peer_domain directive has been largely superseded by cache_peer_access, which is much more flexible.

Following the neighbor's hostname, you can specify a list of domain names. These are searched in order, until Squid finds a match. A match means that the request can be sent to the neighbor, unless you prefix the domain name with ! ("not"). For example, .foo.com means "allow .foo.com," while !.bar.net means "disallow .bar.net." If none of the listed domains match the URL, the default action (allow or deny) is the opposite of the last one in the list.

Note, the domain name matching algorithm is somewhat tricky. See the description in Section 6.1.1.2.

Syntax
cache_peer_domain hostname domain ...

Default
No default

Example
cache_peer_domain bigcache.isp.net .net .org
cache_peer_domain aol.web-cache.net !.ads.aol.com .aol.com

Related
cache_peer, cache_peer_access, neighbor_type_domain
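The search order and the "opposite of the last entry" default are easy to misread, so here is a rough Python model of the rule as described in the text. The function name peer_allowed is hypothetical, and the matching is deliberately simplified (a leading dot matches the domain and its subdomains). It does not reproduce the actual algorithm from Section 6.1.1.2.

```python
def peer_allowed(host, patterns):
    """Apply a cache_peer_domain-style list to a request hostname."""
    if not patterns:
        return True  # assumption: no list means no restriction
    host = host.lower()
    last_allow = True
    for pattern in patterns:
        allow = not pattern.startswith("!")
        domain = pattern.lstrip("!").lower()
        last_allow = allow
        # Simplified matching, not Squid's real comparison function.
        if domain.startswith("."):
            hit = host == domain[1:] or host.endswith(domain)
        else:
            hit = host == domain
        if hit:
            return allow
    # No entry matched: the default is the opposite of the last entry's action.
    return not last_allow


aol = ["!.ads.aol.com", ".aol.com"]
for name in ("www.aol.com", "banner.ads.aol.com", "squidbook.org"):
    print(name, peer_allowed(name, aol))
```

Note how the order matters in the second example from the table: the more specific !.ads.aol.com entry has to come before .aol.com, or it would never be reached.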
neighbor_type_domain

You can use this directive to modify the relationship for a neighbor cache selectively. For example, you may have a sibling neighbor that allows you to fetch misses for certain, nearby domains. The neighbor_type_domain option overrides the type given in the cache_peer line for requests that match the listed domains. The syntax and algorithms for matching domain names are identical to the cache_peer_domain directive.

Syntax
neighbor_type_domain hostname parent|sibling domain ...

Default
No default

Example
neighbor_type_domain bigcache.isp.net parent .customer.isp.net

Related
cache_peer, cache_peer_domain
icp_query_timeout

When Squid sends an ICP/HTCP query to one or more neighbors, it waits some amount of time for the replies to arrive. Because the messages are unreliable UDP datagrams, the queries and/or replies may never arrive. Squid automatically figures out how long to wait for ICP/HTCP replies. For a particular query, the timeout is twice the mean of how long it took for recent replies to arrive. In other words, Squid averages the query RTT values from previous requests, doubles the result, and waits that amount of time. This algorithm works best when all your neighbors have about the same RTT, and when network conditions are consistent.

You can override this algorithm with the icp_query_timeout directive. Instead of dynamically calculating the timeout, Squid waits a fixed amount of time for every ICP/HTCP query.

Syntax
icp_query_timeout milliseconds

Default
No default

Example
icp_query_timeout 1500

Related
icp_port, htcp_port, maximum_icp_query_timeout, mcast_icp_query_timeout, dead_peer_timeout
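As a worked example of the arithmetic, the following Python sketch computes the timeout both ways. The function name icp_timeout_ms and the sample RTT values are hypothetical. How Squid actually maintains its running average is not covered here, so the sketch simply takes a non-empty list of recent RTTs as given.

```python
def icp_timeout_ms(recent_rtts_ms, fixed_ms=None):
    """Return the wait time for an ICP/HTCP query, in milliseconds."""
    if fixed_ms is not None:
        # Models icp_query_timeout: a fixed value replaces the dynamic one.
        return fixed_ms
    # Dynamic rule from the text: twice the mean of recent reply times.
    return 2 * sum(recent_rtts_ms) / len(recent_rtts_ms)


print(icp_timeout_ms([100, 200, 300]))                 # dynamic: 2 * 200
print(icp_timeout_ms([100, 200, 300], fixed_ms=1500))  # fixed override
```

The example also hints at why mixed RTTs are a problem: one distant neighbor at 300 msec pulls the timeout up for queries to the nearby ones.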
maximum_icp_query_timeout

I described Squid's dynamic ICP/HTCP timeout algorithm under icp_query_timeout. If you'd like to use that algorithm, but wish to place an upper limit on the timeout, use the maximum_icp_query_timeout directive instead. Rather than a fixed timeout, Squid uses the dynamic timeout but makes sure it doesn't exceed the limit that you specify.

Syntax
maximum_icp_query_timeout milliseconds

Default
No default

Example
maximum_icp_query_timeout 3000

Related
icp_port, htcp_port, icp_query_timeout
mcast_icp_query_timeout
When you use multicast ICP, Squid doesn't know in advance how many multicast-capable neighbors are listening for its messages. Squid determines this by sending periodic probes to the multicast group and counting the number of replies. Squid uses this count when waiting for replies to real multicast queries. The mcast_icp_query_timeout directive specifies how long Squid should wait when counting replies to its fake probe queries.
Why not just use this timeout when sending real multicast ICP queries? The reason is that Squid might be sending queries to both multicast and unicast neighbors. The mcast_icp_query_timeout directive essentially controls how long Squid waits for replies to real multicast queries.
Let's say you have an ICP multicast group with 10 neighbor caches, and that it typically takes 3000 msec for all 10 replies to arrive but only takes 1000 msec to receive 5 replies. If you set mcast_icp_query_timeout to 1000 msec, Squid's periodic probes will count 5 neighbors. Then, for a real multicast ICP query, Squid waits for only 5 replies from multicast responders. On average, this should take only 1000 milliseconds.
Another nice feature of this algorithm is that Squid does the right thing if, for some reason, all your multicast neighbors stop responding. In that case, Squid counts zero neighbors and doesn't wait for any replies from multicast responders.
Squid doesn't send multicast HTCP queries.
Syntax
mcast_icp_query_timeout milliseconds
Default
mcast_icp_query_timeout 2000
Example
mcast_icp_query_timeout 750
Related
icp_port, icp_query_timeout
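The probe-counting idea described for mcast_icp_query_timeout can be sketched in a few lines of Python. This is only an illustration, not Squid's code: the function name and the list of reply delays are hypothetical, chosen to mirror the 10-neighbor example in the text.

```python
def count_probe_replies(reply_delays_ms, mcast_icp_query_timeout_ms):
    """Count probe replies that arrive before the timeout expires.

    reply_delays_ms is a hypothetical list of per-neighbor reply delays;
    the result models how many replies Squid would later wait for.
    """
    return sum(1 for delay in reply_delays_ms
               if delay <= mcast_icp_query_timeout_ms)


# Hypothetical delays: 5 neighbors answer within 1000 ms, all 10 within 3000 ms.
delays = [200, 400, 600, 800, 1000, 1500, 2000, 2500, 2800, 3000]
neighbors_at_1000 = count_probe_replies(delays, 1000)
neighbors_at_3000 = count_probe_replies(delays, 3000)
```

With an empty list of replies the count is zero, which corresponds to the case where all multicast neighbors have stopped responding.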
dead_peer_timeout
This is another directive that controls the way Squid waits for ICP/HTCP replies. Squid marks each of its peers as either dead (down) or alive (up). Squid uses ICP/HTCP replies (and other techniques) to determine a peer's state. If Squid doesn't receive any replies for the time specified by dead_peer_timeout, the peer is declared dead. When a peer is declared dead, Squid continues to send it ICP/HTCP queries. However, it doesn't expect to receive replies. That is, a dead peer isn't included in the algorithm that decides when all ICP replies have been received. As soon as Squid receives an ICP/HTCP reply from a dead peer, its state is changed to alive.
Squid tends to be paranoid about the state of its peers. Additionally, Squid doesn't proactively monitor the peers when there are no client requests. When Squid has no occasion to send ICP/HTCP queries, the state of the peer is unknown. If Squid doesn't send any ICP/HTCP queries for an amount of time longer than dead_peer_timeout, Squid treats the peer as dead.
Syntax
dead_peer_timeout time-specification
Default
dead_peer_timeout 10 seconds
Example
dead_peer_timeout 30 seconds
Related
icp_port, htcp_port, icp_query_timeout
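A minimal sketch of the dead/alive bookkeeping for dead_peer_timeout, assuming the only input is the time of the last ICP/HTCP reply from each peer. The Peer class and function names are hypothetical; Squid also uses other techniques that are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    last_reply_time: float  # seconds; when the last ICP/HTCP reply arrived


def peer_is_alive(peer, now, dead_peer_timeout=10.0):
    """A peer counts as alive only if it replied within the timeout window."""
    return (now - peer.last_reply_time) <= dead_peer_timeout


def replies_expected(peers, now, dead_peer_timeout=10.0):
    """Dead peers are still queried, but their replies aren't waited for."""
    return sum(1 for p in peers if peer_is_alive(p, now, dead_peer_timeout))


peers = [Peer("parent-a", last_reply_time=95.0),
         Peer("sibling-b", last_reply_time=80.0)]
expected_now = replies_expected(peers, now=100.0)
```

Because the state depends only on the last reply time, a peer that hasn't been queried for longer than the timeout is also treated as dead, which matches the behavior described above.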
hierarchy_stoplist
Every HTTP request that Squid receives is marked as either hierarchical or nonhierarchical. This terminology is somewhat confusing. A request is hierarchical when there is a possibility it could be a cache hit in one of the neighbors. In other words, if the information in the request indicates that the response may be cachable, the request is hierarchical. A request is marked nonhierarchical when Squid thinks there is no chance of getting a hit from a neighbor.
Squid uses the hierarchical flag to decide whether or not it should query neighbors for the request. If the request is hierarchical, Squid may perform ICP/HTCP queries, or use Cache Digests, to locate cache hits in neighbors. Otherwise, Squid may forward the request directly to the origin server or select a parent based on some other technique.
Squid has a few hardcoded rules that determine if a request is hierarchical. For example, only GET requests are hierarchical. Squid never expects cache hits on non-GET requests. Another rule is that requests including authentication information are nonhierarchical.
The hierarchy_stoplist directive allows you to customize the algorithm further. The stoplist is simply a list of strings. Squid searches the requested URL for these strings. The string comparison is case-sensitive. In the case of a match, the request becomes nonhierarchical. The default configuration is to search for cgi-bin and ? so that queries and other CGI responses aren't hierarchical.
Note that the hierarchical flag determines only whether or not Squid queries its neighbor caches. It doesn't determine which requests must, or must not, be sent to parent caches. The always_direct and never_direct access lists have that responsibility.
Syntax
hierarchy_stoplist string ...
Default
hierarchy_stoplist cgi-bin ?
Example
hierarchy_stoplist .cgi
hierarchy_stoplist http://www.mysite.org
Related
always_direct, never_direct
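The hierarchical test for hierarchy_stoplist can be sketched as follows. This is a simplification under stated assumptions: only the two hardcoded rules mentioned in the text are modeled, and the function name and parameters are hypothetical.

```python
def is_hierarchical(method, url, has_auth, stoplist=("cgi-bin", "?")):
    """Return True if neighbors might hold a cache hit for this request."""
    if method != "GET":       # hardcoded rule: only GET is hierarchical
        return False
    if has_auth:              # hardcoded rule: authenticated requests are not
        return False
    # Stoplist rule: a case-sensitive substring search of the URL.
    return not any(s in url for s in stoplist)


plain = is_hierarchical("GET", "http://www.example.com/index.html", False)
query = is_hierarchical("GET", "http://www.example.com/search?q=squid", False)
```

Note that the substring test is case-sensitive, so a URL containing CGI-BIN would not match the default cgi-bin entry.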
no_cache
no_cache is a sequence of access control rules (see Section 6.2) that specify responses that must not be cached by Squid. Of course, Squid has some hardcoded rules for responses that must not be cached according to the HTTP RFC. The no_cache rules are in addition to those.
The no_cache syntax is a little tricky. You must use deny for rules where the response must not be cached. Consider this example:
acl GoodStuff url_regex /foo/bar/
acl BadStuff url_regex /bar/
no_cache allow GoodStuff
no_cache deny BadStuff
Here, a URL containing /foo/bar/ may be cached, but any other URL containing only /bar/ isn't cached. The meaning of the allow and deny might be the opposite of what you expect. Just remember that deny carries the same negative connotation as "not caching" something.
Syntax
no_cache allow|deny [!]ACLname ...
Default
No default
Example
acl LocalServers dst 192.168.8.0/24
no_cache deny LocalServers
Related
always_direct, never_direct, http_access
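The GoodStuff/BadStuff example for no_cache can be traced with a small first-match evaluator. This is a sketch, not Squid's implementation: the function is hypothetical, each rule is reduced to a single url_regex pattern, and the fallback when no rule matches (the opposite of the last rule's action) is an assumption about how Squid's access rules generally behave.

```python
import re


def may_cache(url, rules):
    """rules is an ordered list of (action, url_regex) pairs; first match wins."""
    for action, pattern in rules:
        if re.search(pattern, url):
            return action == "allow"
    if rules:
        # Assumption: with no match, use the opposite of the last rule.
        return rules[-1][0] != "allow"
    return True  # no no_cache rules at all: nothing is excluded here


rules = [("allow", r"/foo/bar/"),   # no_cache allow GoodStuff
         ("deny", r"/bar/")]        # no_cache deny BadStuff

good = may_cache("http://example.com/foo/bar/page.html", rules)
bad = may_cache("http://example.com/bar/page.html", rules)
```

The order matters: if the deny rule came first, URLs containing /foo/bar/ would also match /bar/ and would never be cached.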
cache_access_log
This is the location of Squid's access.log, which contains one entry for each client request. See Section 13.2 for the details. If you want to disable the access log, set this to /dev/null.
Syntax
cache_access_log pathname
Default
cache_access_log $prefix/var/logs/access.log
Example
cache_access_log /var/log/squid-access.log
Related
emulate_httpd_log, cache_log, cache_store_log, log_ip_on_direct, logfile_rotate
cache_log
This log file contains various operational and debugging messages from Squid. See Section 13.1 for more information. If you want to disable cache.log, set this directive to /dev/null.
Syntax
cache_log pathname
Default
cache_log $prefix/var/logs/cache.log
Example
cache_log /var/log/squid.log
Related
debug_options, cache_access_log, cache_store_log, logfile_rotate
cache_store_log
The store.log contains details about Squid's interaction with the disk cache. You'll see entries as objects are stored to disk, read from disk, and removed from the cache. See Section 13.3 for the details. You can disable this log by setting it to none.
Syntax
cache_store_log pathname
Default
cache_store_log $prefix/var/logs/store.log
Example
cache_store_log /var/log/squid-store.log
Related
cache_access_log, cache_log, logfile_rotate
cache_swap_log
Each cache directory has its own swap log file. These are binary-format journal files Squid uses to rebuild the in-memory indexes when it starts up. Each swap log file is located in the corresponding cache directory by default. If you use this option, Squid puts all swap log files in one directory. See Section 13.6 for more information.
Syntax
cache_swap_log pathname
Default
swap.state in each cache_dir
Example
cache_swap_log /var/log/squid-swap-state
Related
cache_store_log, logfile_rotate, cache_dir
emulate_httpd_log
Squid uses its own native format for the access.log by default. If you enable this directive, the access log is written in the HTTPD common log file format. This is often useful when Squid is accelerating an origin server site.
Syntax
emulate_httpd_log on|off
Default
emulate_httpd_log off
Example
emulate_httpd_log on
Related
cache_access_log, httpd_accel_host
log_ip_on_direct
By default, Squid puts origin server IP addresses into the ninth field of the access.log. If you disable this directive, Squid puts the origin server hostname there instead.
Syntax
log_ip_on_direct on|off
Default
log_ip_on_direct on
Example
log_ip_on_direct off
Related
cache_access_log
cache_dir
This directive instructs Squid where, and how, to store cached objects on disk. See Section 7.1 for the details on cache directories.
The first parameter selects the storage scheme. Your choices are ufs, aufs, diskd, coss, and null. To use any scheme other than ufs, you must use the --enable-storeio option with ./configure. See Section 3.4. The third parameter is the amount of disk space to use for the cache. The units are in megabytes. The fourth and fifth parameters are the number of L1 and L2 directories. Don't change these values for directories that already contain cached objects.
Some cache_dir schemes have additional, optional parameters. Refer to the scheme-specific sections in Chapter 8.
Syntax
cache_dir scheme directory size-MB L1 L2 [options...]
Default
cache_dir ufs $prefix/var/cache 100 16 256
Example
cache_dir ufs /cache0 3072 16 128
Related
cache_replacement_policy, cache_mem
cache_mem
Squid uses memory to store recently received objects and to buffer active responses. This directive specifies the amount of memory to use for storing these objects.
This directive doesn't entirely control the size of the Squid process. See Appendix B for additional information.
Syntax
cache_mem bytes-specification
Default
cache_mem 8 MB
Example
cache_mem 16 MB
Related
cache_dir, maximum_object_size_in_memory, memory_replacement_policy
cache_swap_low
This directive, along with cache_swap_high, controls the replacement of objects stored on disk. It is a percentage of the maximum cache size, which comes from the sum of all cache_dir sizes. See Section 7.2 for additional information.
Syntax
cache_swap_low percent
Default
cache_swap_low 90
Example
cache_swap_low 85
Related
cache_swap_high, cache_dir
cache_swap_high
See the description for cache_swap_low. Note that changing cache_swap_high probably won't have a big impact on Squid's disk usage. See Section 7.2 for additional information.
Syntax
cache_swap_high percent
Default
cache_swap_high 95
Example
cache_swap_high 99
Related
cache_swap_low, cache_dir
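Since cache_swap_low and cache_swap_high are percentages of the summed cache_dir sizes, the corresponding absolute sizes are simple to work out. The helper below is a hypothetical illustration of that arithmetic only; it says nothing about how Squid acts between the two marks.

```python
def swap_watermarks_mb(cache_dir_sizes_mb, cache_swap_low=90, cache_swap_high=95):
    """Convert the two percentages into megabytes of total cache space."""
    total = sum(cache_dir_sizes_mb)
    return total * cache_swap_low / 100, total * cache_swap_high / 100


# Two hypothetical cache_dir lines of 1000 MB and 3000 MB.
low_mb, high_mb = swap_watermarks_mb([1000, 3000])
```

For this 4000 MB cache and the default percentages, the low and high marks fall 200 MB apart.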
maximum_object_size
This directive places a limit on the largest object that Squid can store on disk. Responses larger than this size aren't cached. See Section 7.3 for additional information.
Syntax
maximum_object_size bytes-specification
Default
maximum_object_size 4096 KB
Example
maximum_object_size 250 MB
Related
minimum_object_size, maximum_object_size_in_memory, reply_body_max_size
minimum_object_size
With this directive, you can also place lower limits on the size of cached objects. Responses smaller than this size aren't stored on disk or in memory. See Section 7.3 for additional information.
Syntax
minimum_object_size bytes-specification
Default
minimum_object_size 0 bytes
Example
minimum_object_size 300 bytes
Related
maximum_object_size
maximum_object_size_in_memory
This directive allows you to control the size of objects stored in memory. Objects that are larger than this value aren't kept in memory. See Section 7.3 for additional information.
Syntax
maximum_object_size_in_memory bytes-specification
Default
maximum_object_size_in_memory 8 KB
Example
maximum_object_size_in_memory 12 KB
Related
cache_mem, maximum_object_size
cache_replacement_policy
This directive controls the replacement policy for Squid's disk cache. Version 2.5 offers three different replacement policies: least recently used (LRU), greedy dual-size frequency (GDSF), and least frequently used with dynamic aging (LFUDA). Note that the keywords (lru, GDSF, etc.) are case-sensitive! See Section 7.5 for additional information.
Syntax
cache_replacement_policy lru
cache_replacement_policy heap GDSF|LFUDA|LRU
Default
cache_replacement_policy lru
Example
cache_replacement_policy heap GDSF
Related
memory_replacement_policy, cache_dir
memory_replacement_policy
This directive controls the replacement policy for objects cached in memory. See Section 7.5 for additional information.
Syntax
memory_replacement_policy lru
memory_replacement_policy heap GDSF|LFUDA|LRU
Default
memory_replacement_policy lru
Example
memory_replacement_policy heap LFUDA
Related
cache_replacement_policy, cache_mem
store_dir_select_algorithm
This directive controls the algorithm Squid uses when selecting a cache_dir for a new cache file. The possible choices are: least-load and round-robin. See Section 7.4 for additional information.
Syntax
store_dir_select_algorithm round-robin|least-load
Default
store_dir_select_algorithm least-load
Example
store_dir_select_algorithm round-robin
Related
cache_dir
mime_table
Squid uses the information in this file for FTP and Gopher requests. Unlike HTTP, these protocols don't inform clients about the type of data they transfer. When Squid gateways the response from an FTP server to an HTTP client, it must insert Content-Type and other headers. Squid uses the MIME table file to convert filename extensions into:
● Values for the Content-Type header
● Icons that are displayed for directory listings
● Content-Encoding header values for compressed data
● Transfer type options for FTP servers, either image or ascii; this corresponds to the TYPE command in the FTP protocol
Please refer to the sample mime.conf for an explanation of the format of this file.
Syntax
mime_table pathname
Default
mime_table $prefix/etc/mime.conf
Example
mime_table /usr/local/squid/etc/my-mime-types.txt
ipcache_size
Squid's IP cache holds recent DNS name-to-address lookups. This directive limits the number of names in the cache. Each IP cache entry uses a relatively small amount of memory, so you can safely increase this limit to 10,000 or more.
Syntax
ipcache_size count
Default
ipcache_size 1024
Example
ipcache_size 5000
Related
ipcache_low, ipcache_high, fqdncache_size
ipcache_low
This directive controls the IP cache LRU replacement algorithm. The replacement function runs periodically and removes the least recently used IP cache entries until reaching this low watermark. You should have almost no reason to change this value. You'd be better off changing ipcache_size instead.
Syntax
ipcache_low percent
Default
ipcache_low 90
Example
ipcache_low 95
Related
ipcache_size, ipcache_high
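The low-watermark purge for ipcache_low can be sketched with an ordered dictionary that keeps the least recently used entry first. This is an assumption-laden illustration: the function name is hypothetical, and the text doesn't say how Squid stores entries or how often the purge runs.

```python
from collections import OrderedDict


def purge_lru(cache, ipcache_size=1024, ipcache_low=90):
    """Remove least recently used entries until the low watermark is reached.

    cache is an OrderedDict ordered from least to most recently used.
    """
    target = ipcache_size * ipcache_low // 100
    removed = []
    while len(cache) > target:
        name, _addresses = cache.popitem(last=False)  # drop the oldest entry
        removed.append(name)
    return removed


# Hypothetical cache of 10 names with ipcache_size 10 and ipcache_low 90.
cache = OrderedDict((f"host{i}.example.com", ["192.0.2.%d" % i]) for i in range(10))
removed = purge_lru(cache, ipcache_size=10, ipcache_low=90)
```

A lookup that hits an entry would call cache.move_to_end(name) to mark it as recently used.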
ipcache_high
This directive is essentially unused in current versions of Squid. The LRU replacement routine uses only ipcache_low. The only time that Squid uses ipcache_high is when calculating the hash table size for the IP cache at startup.
Syntax
ipcache_high percent
Default
ipcache_high 95
Example
ipcache_high 99
Related
ipcache_size, ipcache_low
fqdncache_size
Squid's FQDN cache holds recent DNS address-to-name lookups. However, Squid makes these reverse DNS lookups only when you enable the log_fqdn directive or use a dstdomain ACL. This directive limits the number of names in the cache. Each FQDN cache entry uses a relatively small amount of memory, so you can safely increase this limit to 10,000 or more.
Syntax
fqdncache_size count
Default
fqdncache_size 1024
Example
fqdncache_size 6000
Related
ipcache_size, log_fqdn
log_mime_hdrs
When you enable this directive, Squid writes the HTTP request and response headers to the access.log file. The headers appear as two additional fields on each line. All whitespace and other special characters are encoded with URL-style escape codes. Enabling this option may assist in tracking down certain problems. Note that HTTP headers are relatively large (a few hundred bytes each). Logging them dramatically increases the size of your access.log file.
Syntax
log_mime_hdrs on|off
Default
log_mime_hdrs off
Example
log_mime_hdrs on
Related
cache_access_log
useragent_log
This directive causes Squid to create a log file of User-Agent strings. The file contains three fields: client identifier, timestamp, and user-agent string. The client identifier is an IP address, unless you enable the log_fqdn directive, in which case it is a hostname if one is available. Squid writes an entry for every HTTP request that has a User-Agent header. Unlike access.log, entries are written to this file when the request is received.
Syntax
useragent_log pathname
Default
No default
Example
useragent_log /usr/local/squid/var/logs/useragent.log
Related
log_fqdn, cache_access_log, referer_log
referer_log
This directive causes Squid to create a log file of Referer values from client requests. The file contains four fields: time, client identifier, Referer value, and the requested URI. For example, when a client requests the image foo.png embedded in an index.html, the referer log contains:
1068047502.377 192.168.1.2 /index.html /foo.png
Squid writes an entry for every HTTP request that has a Referer header. Unlike access.log, entries are written to this file when the request is received.
Syntax
referer_log pathname
Default
No default
Example
referer_log /usr/local/squid/var/logs/referer.log
Related
log_fqdn, cache_access_log, useragent_log
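If you want to post-process the referer log, the four-field layout described above is easy to split. The following is a minimal sketch, assuming whitespace-separated fields with no embedded spaces; the RefererEntry type and function name are hypothetical.

```python
from typing import NamedTuple


class RefererEntry(NamedTuple):
    timestamp: float   # seconds since the Unix epoch
    client: str        # IP address, or hostname when log_fqdn is enabled
    referer: str
    uri: str


def parse_referer_line(line):
    """Split one referer log line into its four fields."""
    timestamp, client, referer, uri = line.split()
    return RefererEntry(float(timestamp), client, referer, uri)


entry = parse_referer_line("1068047502.377 192.168.1.2 /index.html /foo.png")
```

A line with a different number of fields raises ValueError, which is a reasonable signal that the assumption about the layout doesn't hold for your log.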
pid_filename
This is the file in which Squid writes its process ID (PID) number. Squid uses the PID file in a couple of ways. First, it looks for and reads this file when starting. If the file exists and contains a valid PID, Squid reports it is already running under that PID so that you don't accidentally start Squid twice. The PID file is also read when you use one of the -k commands such as squid -k rotate.
You probably don't need to worry about this directive unless you actually do want to run two (or more) Squid processes on the same machine. Each instance of Squid requires a unique PID filename.
Syntax
pid_filename pathname
Default
pid_filename $prefix/var/logs/squid.pid
Example
pid_filename /var/run/squid.pid
debug_options
This directive controls the amount of debugging information written to cache.log. Each source code module has a section number. Individual debugging statements in the code have a level. Higher debugging levels correspond to more verbose debugging. For a list of section numbers, refer to Table 16-1 or the doc/debug-sections.txt file in the source distribution.
Syntax
debug_options section,level ...
Default
debug_options ALL,1
Example
debug_options ALL,1 42,5
Related
cache_log
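The section,level pairs of debug_options can be read as a default level plus per-section overrides. The sketch below shows one way to interpret a value such as ALL,1 42,5. It assumes that pairs are applied left to right and that ALL resets every section, which the text above doesn't state; the function names are hypothetical.

```python
def parse_debug_options(value):
    """Return (default_level, {section: level}) for a debug_options value."""
    default, overrides = 0, {}
    for token in value.split():
        section, level = token.split(",")
        if section == "ALL":
            # Assumption: ALL applies to every section and clears overrides.
            default, overrides = int(level), {}
        else:
            overrides[int(section)] = int(level)
    return default, overrides


def level_for(parsed, section):
    """Look up the effective debugging level for one section number."""
    default, overrides = parsed
    return overrides.get(section, default)


parsed = parse_debug_options("ALL,1 42,5")
```

Here section 42 is logged at level 5 while every other section stays at level 1.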
log_fqdn
This directive controls whether or not Squid places client IP addresses or hostnames in the log files. By default Squid writes the IP address. If you enable this feature, Squid queries the DNS for client hostnames or fully qualified domain names (FQDN). These address-to-name lookups sometimes take a long time. Squid never postpones logging to wait for an answer. If the FQDN isn't available when Squid is ready to write the log entry, it uses the IP address.
Syntax
log_fqdn on|off
Default
log_fqdn off
Example
log_fqdn on
Related
cache_access_log, useragent_log, referer_log, fqdncache_size, client_netmask
client_netmask
This directive is available to provide privacy for users. When Squid writes access.log and other log files, it applies this mask to the client's IP address. For example, if you set the netmask to 255.255.255.0, Squid logs a request from 1.2.3.0 instead of 1.2.3.4. Thus, if someone manages to read the log file, they know only approximately, not exactly, which host (or user) made each request.
If you use log_fqdn, Squid applies the client_netmask before issuing the DNS lookup. For example, Squid will try to find a hostname record for 1.2.3.0 instead of 1.2.3.4.
Syntax
client_netmask IPv4-netmask
Default
client_netmask 255.255.255.255
Example
client_netmask 255.255.255.0
Related
cache_access_log, useragent_log, referer_log, log_fqdn
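The masking that client_netmask performs is a bitwise AND of the client address and the netmask. A short sketch using Python's standard ipaddress module reproduces the 1.2.3.4 example; the function name is hypothetical.

```python
import ipaddress


def mask_client_address(address, client_netmask="255.255.255.255"):
    """Apply an IPv4 netmask to a client address, as done before logging."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(client_netmask))
    return str(ipaddress.IPv4Address(addr & mask))


logged = mask_client_address("1.2.3.4", "255.255.255.0")
unmasked = mask_client_address("1.2.3.4")
```

With the default mask of 255.255.255.255 every bit is kept, so the logged address is unchanged.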
ftp_user
This directive contains the password Squid sends when logging in to anonymous FTP servers. Convention dictates that anonymous FTP clients send the user's email address as the login password. Most anonymous FTP servers accept an abbreviated form with only a username followed by @ (e.g., joe_blow@). You probably won't need to change this directive unless you encounter a very picky FTP server.
Syntax
ftp_user email-address
Default
ftp_user Squid@
Example
ftp_user [email protected]
Related
ftp_list_width, ftp_passive
ftp_list_width
This directive controls the width of the filename column in FTP directory listings that Squid generates. The default value is chosen so that the listings fit inside a typical browser window. This also means that long filenames may be truncated. If you'd like to see more characters in long filenames, increase this value.
Syntax
ftp_list_width character-count
Default
ftp_list_width 32
Example
ftp_list_width 64
Related
ftp_user
ftp_passive
Squid normally uses FTP's so-called passive mode for file transfers. This means that the FTP server creates a TCP socket for data transfer and waits for the client to connect. Passive mode works much better through most Internet firewalls. The alternative is to have the FTP client (Squid in this case) create a TCP socket and wait for a connection from the server. Most likely, you'll never have problems with FTP passive mode. However, you can force nonpassive operation by turning off this directive.
Syntax
ftp_passive on|off
Default
ftp_passive on
Example
ftp_passive off
Related
ftp_user, ftp_list_width, ftp_sanitycheck
ftp_sanitycheck
When using FTP passive mode (the default), the FTP server tells Squid the IP address and port number for each data connection. Squid normally checks the given values to make sure they match the server's IP address. In other words, an FTP server should always use its own IP address in the PASV reply message. If it doesn't, Squid complains to cache.log and attempts a data connection with the PORT command. Disable the ftp_sanitycheck directive if you want Squid to skip the IP address sanity check.
Syntax
ftp_sanitycheck on|off
Default
ftp_sanitycheck on
Example
ftp_sanitycheck off
Related
ftp_passive
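The address check behind ftp_sanitycheck can be illustrated by parsing a PASV reply and comparing its address with the server's. This is a sketch rather than Squid's code: it assumes the usual 227 reply layout of six comma-separated numbers (four address bytes, two port bytes), and the function names and sample reply are hypothetical.

```python
import re


def parse_pasv_reply(reply):
    """Extract (ip, port) from a reply like '227 ... (h1,h2,h3,h4,p1,p2)'."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError("no address found in PASV reply")
    h1, h2, h3, h4, p1, p2 = (int(x) for x in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def pasv_address_ok(reply, server_ip, ftp_sanitycheck=True):
    """With the check enabled, the PASV address must equal the server's."""
    ip, _port = parse_pasv_reply(reply)
    return (not ftp_sanitycheck) or ip == server_ip


reply = "227 Entering Passive Mode (192,168,1,10,19,137)"
address = parse_pasv_reply(reply)
```

When the check fails, the text above says Squid logs a complaint and falls back to the PORT command; that fallback is not modeled here.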
cache_dns_program
Recall that, by default, Squid uses an internal DNS client implementation. However, you also have the choice of using an external helper program to perform DNS lookups. This choice must be made when you run ./configure, with the --disable-internal-dns option.
If you elect to use the external DNS, this directive specifies the pathname to the dnsserver program. This is a misleading name in that the program isn't really a DNS server. It is more like a DNS proxy. The program reads hostnames (or IP addresses) from Squid, executes the necessary lookup, and writes IP addresses (or hostnames) back.
You probably won't need to use this directive, unless you move the Squid binaries after running make install or you're inclined to experiment with the external DNS program.
Syntax
cache_dns_program pathname
Default
cache_dns_program $prefix/libexec/dnsserver
Example
cache_dns_program /usr/local/squid/libexec/better_dnsserver
Related
dns_children
dns_children
This directive is meaningful only with the --disable-internal-dns option. The interface between Squid and the external DNS program is built around the gethostbyname() function. Squid writes a request to a dnsserver process, which performs the query. The gethostbyname() call blocks the process until the reply arrives. This is why Squid can't use the function internally.
Each dnsserver handles only one request at a time, so you need enough of them to handle the load from your cache. Unfortunately, you may need to experiment with different values to discover the appropriate setting for your particular situation. In theory, you can calculate the number of child processes if you know the rate of DNS lookups and how long lookups take on average. Unfortunately, both values can vary significantly over time.
Squid writes a warning into cache.log if you have too few dnsserver child processes. If all helper processes are busy, Squid queues up new lookups. If the queue grows too large, Squid emits an error message and exits. Thus, too many child processes are better than too few.
You can use the dns entry in the cache manager menu to see dnsserver utilization information. Requests are always sent to the first idle process, so you can see if some processes never receive any DNS lookup requests. In that case you may want to lower the dns_children value.
Why doesn't Squid just create and destroy child processes as necessary? The primary reason is that the creation of a child process, via fork(), is a relatively "heavy" operation. It may introduce significant delays for active HTTP requests. A Squid process typically consumes a lot of memory. In some cases, fork() may fail due to lack of available memory or swap space. Rather than try to fix all these issues with the external DNS implementation, Squid can read and write DNS messages internally.
Syntax
dns_children number
Default
dns_children 5
Example
dns_children 16
Related
cache_dns_program
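The "in theory" calculation for dns_children is the lookup rate multiplied by the average lookup time, which gives the number of helpers that are busy on average. The sketch below adds a headroom factor because both inputs vary over time; the factor of 2 and the function name are assumptions for illustration, not a recommendation from Squid.

```python
import math


def estimate_dns_children(lookups_per_second, avg_lookup_seconds, headroom=2.0):
    """Estimate helpers needed: average busy helpers times a safety factor."""
    busy_on_average = lookups_per_second * avg_lookup_seconds
    return max(1, math.ceil(busy_on_average * headroom))


# Hypothetical load: 20 lookups per second, each taking 0.25 seconds.
suggested = estimate_dns_children(20, 0.25)
```

Treat the result as a starting point, then check the dns page of the cache manager to see whether the last helpers in the list ever receive requests.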
dns_retransmit_interval
This directive is meaningful only when you use the internal DNS implementation (the default). This directive is the initial retransmission interval for unacknowledged DNS queries. Each time Squid retransmits a DNS query, it's sent to the next DNS server in the list. If none of the servers answer, Squid starts at the top of the list again and doubles the retransmit interval.
Syntax
dns_retransmit_interval time-specification
Default
dns_retransmit_interval 5 seconds
Example
dns_retransmit_interval 10 seconds
Related
dns_timeout
dns_timeout
This directive is meaningful only when you use the internal DNS implementation (the default). This directive is the total amount of time that Squid waits for a DNS answer. If the timeout occurs, Squid returns an error message to the user.
Syntax
dns_timeout time-specification
Default
dns_timeout 5 minutes
Example
dns_timeout 2 minutes
Related
dns_retransmit_interval
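To see how dns_retransmit_interval and dns_timeout interact, the following sketch lists when each query would be sent if no server ever answers. It is a simplified model under stated assumptions: the interval doubles only after a full pass through the server list, and no query is sent once the total timeout has elapsed. The function name and server names are hypothetical.

```python
def retransmit_schedule(servers, dns_retransmit_interval=5.0, dns_timeout=300.0):
    """Return [(seconds_elapsed, server), ...] for an unanswered query."""
    schedule = []
    if not servers:
        return schedule
    elapsed = 0.0
    interval = dns_retransmit_interval
    while True:
        for server in servers:          # one pass through the server list
            if elapsed >= dns_timeout:  # total wait exhausted: give up
                return schedule
            schedule.append((elapsed, server))
            elapsed += interval
        interval *= 2                   # nobody answered: double the interval


# Two hypothetical name servers with a short 40-second timeout.
schedule = retransmit_schedule(["ns1", "ns2"], 5.0, 40.0)
```

With these values the queries go out at 0, 5, 10, 20, and 30 seconds, alternating between the two servers.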
dns_defnames
This directive is meaningful only with the --disable-internal-dns option. By default, Squid's dnsserver program doesn't attempt to expand single-word hostnames (such as www) into fully qualified domain names. If your users are accustomed to using single-word hostnames, you may want to enable this directive.
Syntax
dns_defnames on|off
Default
dns_defnames off
Example
dns_defnames on
Related
append_domain
dns_nameservers
By default, Squid sends DNS queries to the name servers listed in the /etc/resolv.conf file. If you want Squid to use a different set of name servers, you can specify them with this directive. Of course, you can also just change your resolv.conf file.
Syntax
dns_nameservers ip-address ...
Default
No default
Example
dns_nameservers 127.0.0.1 192.168.0.1
hosts_file
When you use the internal DNS implementation (the default), Squid always uses the DNS name servers to resolve names and addresses. The external dnsserver program, on the other hand, may check a local database (the hosts file) before querying the DNS. With this directive, you can make Squid preload the contents of a hosts file into its IP and FQDN caches. Squid rereads the hosts file when you send it the reconfigure signal (squid -k reconfigure). If you configure the append_domain directive, it's appended to any single-component names in the hosts file.
Syntax
hosts_file pathname
Default
No default
Example
hosts_file /usr/local/squid/etc/hosts
Related
dns_defnames, append_domain
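The preloading described for hosts_file amounts to reading address and name pairs and expanding single-component names. Below is a minimal sketch of the name-to-address half only. It assumes the conventional hosts file layout (an address followed by one or more names, with # comments) and an append_domain value that begins with a period; the function name and sample data are hypothetical.

```python
def parse_hosts(text, append_domain=None):
    """Build a name-to-address table from hosts file text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            if append_domain and "." not in name:
                name += append_domain          # expand single-component names
            table[name.lower()] = address
    return table


sample = "127.0.0.1 localhost\n192.168.0.5 www www.example.org  # web server\n"
table = parse_hosts(sample, append_domain=".example.com")
```

A real preload would also record the reverse mapping, from address to name, for the FQDN cache.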
diskd_program
This is the pathname to the diskd helper program. It gets executed for each cache_dir of type diskd.
Syntax
diskd_program pathname
Default
diskd_program $prefix/libexec/diskd
Example
diskd_program /usr/local/squid-2.4/libexec/squid/diskd
Related
cache_dir
unlinkd_program
This is the pathname to the unlinkd program. By executing the unlink operations in this external process, Squid's performance improves significantly. You can disable the external unlinker with the --disable-unlinkd option to ./configure.
Syntax
unlinkd_program pathname
Default
unlinkd_program $prefix/libexec/unlinkd
Example
unlinkd_program /usr/local/squid-2.4/libexec/unlinkd
pinger_program
Squid uses the pinger program to send ICMP pings to origin server sites. Squid uses these ICMP measurements to estimate network proximity. Note that the pinger program must be installed as setuid root because it opens a raw ICMP socket. To enable the ICMP measurement features, use the ./configure --enable-icmp option.
Syntax
pinger_program pathname
Default
pinger_program $prefix/libexec/pinger
Example
pinger_program /usr/local/squid-2.4/libexec/pinger
Related
netdb_low, netdb_high, netdb_ping_period
redirect_program
This directive specifies the pathname of a redirector program. It must be executable by the Squid user ID. See Chapter 11.
Syntax
redirect_program pathname
Default
No default
Example
redirect_program /usr/local/squid/libexec/my_redirector
Related
redirect_children, redirect_rewrites_host_header, redirector_access, redirector_bypass
redirect_children
This directive specifies how many redirector processes Squid should start. Client requests are written to the first idle redirector process. Squid warns you (via cache.log) when all processes are simultaneously busy. If you see this warning, you should increase the number of child processes and restart Squid.
Syntax
redirect_children number
Default
redirect_children 5
Example
redirect_children 20
Related
redirect_program, sleep_after_fork, redirector_bypass
redirect_rewrites_host_header
Squid normally updates a request's Host header when using a redirector. If you use Squid as a surrogate (HTTP accelerator), you might want to disable this behavior by setting this directive to off.
Syntax
redirect_rewrites_host_header on|off
Default
redirect_rewrites_host_header on
Example
redirect_rewrites_host_header off
Related
httpd_accel_single_host
redirector_access
If you use this directive, only the requests that match the access list rules are sent to the redirector processes. Without any redirector_access rules, all requests are sent to the redirector processes.
Syntax
redirector_access allow|deny [!]ACLname ...
Default
No default
Example
acl Foo src 192.168.1.0/24
acl All src 0/0
redirector_access deny Foo
redirector_access allow All
Related
acl, http_access
redirector_bypass
Squid uses a pool of redirectors to service client requests. This directive determines Squid's behavior when all redirectors in the pool are busy. Normally, Squid queues subsequent requests, waiting for one of the redirectors to become free. If the queue becomes too large, Squid exits with a fatal message. If you enable this directive, however, Squid simply skips the redirection step if all redirectors are busy.
Syntax
redirector_bypass on|off
Default
redirector_bypass off
Example
redirector_bypass on
Related
redirect_program, redirector_access
auth_param
The auth_param directive controls almost every aspect of Squid's external user authentication interface. Squid currently supports three authentication schemes: Basic, Digest, and NTLM. Basic authentication support is compiled by default. For the others, you must use the --enable-auth option with ./configure. Since the auth_param directive is very complex, I'm presenting it here as a separate directive for each combination of parameters.
Syntax
See the following subsections
Default
See the following subsections
Example
See the following subsections
Related
authenticate_cache_garbage_interval, authenticate_ttl, authenticate_ip_ttl
auth_param basic program
The command for the HTTP Basic authentication helper. You need to specify the full pathname to the program, plus any command-line options.
Syntax
auth_param basic program command ...
Default
No default
Example
auth_param basic program /usr/local/squid/libexec/ncsa_auth /usr/local/squid/etc/ncsa_passwd
Related
auth_param basic children, auth_param basic realm, auth_param basic credentialsttl
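To make the helper's role concrete, here is a minimal sketch of a Basic authentication helper, assuming the usual line-based interface: Squid writes a username and password separated by a space, and the helper answers OK or ERR. The hardcoded user table is purely hypothetical; a real helper such as ncsa_auth consults a password file, and this sketch does not handle any escaping Squid may apply to special characters.

```python
import io

# Hypothetical credentials for illustration only; a real helper such as
# ncsa_auth checks a password file instead of a hardcoded table.
USERS = {"alice": "wonderland"}


def check(line):
    """Return "OK" if the line holds a known username/password pair."""
    parts = line.rstrip("\r\n").split(" ", 1)
    if len(parts) != 2:
        return "ERR"
    user, password = parts
    return "OK" if USERS.get(user) == password else "ERR"


def serve(inp, out):
    """Answer each credential line with OK or ERR until input ends."""
    for line in inp:
        out.write(check(line) + "\n")
        out.flush()  # one unbuffered answer per request


# Demonstration with in-memory streams instead of a real pipe.
demo_out = io.StringIO()
serve(io.StringIO("alice wonderland\nalice wrong\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Squid starts as many copies of the helper as auth_param basic children specifies, so the helper itself can stay single-threaded and handle one lookup at a time.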
auth_param basic children
This is the number of Basic authentication helper processes Squid uses.
Syntax
auth_param basic children count
Default
auth_param basic children 5
Example
auth_param basic children 10
Related
auth_param basic program, auth_param basic realm, auth_param basic credentialsttl
auth_param basic realm
This is the Basic authentication realm Squid sends in 407 (Proxy Authentication Required) responses. User agents typically display the realm string to the user when requesting a username and password. Refer to RFC 2617, Section 2.
Syntax
auth_param basic realm string
Default
No default
Example
auth_param basic realm Squid proxy-caching web server
Related
auth_param basic program, auth_param basic children, auth_param basic credentialsttl
auth_param basic credentialsttl
To reduce load on the external authentication processes, Squid caches successful answers for this amount of time. In other words, once a user is authenticated, Squid doesn't query the helper program again until this TTL expires. If you change the external database (e.g., password file), Squid may not notice the change until the cached credentials time out.
Syntax
auth_param basic credentialsttl time-specification
Default
auth_param basic credentialsttl 5 minutes
Example
auth_param basic credentialsttl 15 minutes
Related
auth_param basic program, auth_param basic children, auth_param basic realm
auth_param digest program
As with Basic authentication, this specifies the command to execute for the external Digest authentication program.
Syntax
auth_param digest program command ...
Default
No default
Example
auth_param digest program /usr/local/squid/libexec/digest_auth /usr/local/squid/etc/digest_passwd
Related
auth_param digest children, auth_param digest realm, auth_param digest nonce_garbage_interval, auth_param digest nonce_max_duration, auth_param digest nonce_max_count
auth_param digest children
This is the number of Digest authentication helper processes that Squid uses.
Syntax
auth_param digest children count
Default
auth_param digest children 5
Example
auth_param digest children 11
Related
auth_param digest program, auth_param digest realm, auth_param digest nonce_garbage_interval, auth_param digest nonce_max_duration, auth_param digest nonce_max_count
auth_param digest realm
This is the Digest authentication realm that Squid sends in 407 (Proxy Authentication Required) responses. User agents typically display the realm string to the user when requesting a username and password. Refer to RFC 2617, Section 3.2.1.
Syntax
auth_param digest realm string
Default
No default
Example
auth_param digest realm Squid proxy-caching web server
Related
auth_param digest program, auth_param digest children, auth_param digest nonce_garbage_interval, auth_param digest nonce_max_duration, auth_param digest nonce_max_count
auth_param digest nonce_garbage_interval
As I explained in Section 12.3, a nonce is a special string of data that changes from time to time. Its purpose is to prevent replay attacks with captured digest authentication data. Squid maintains a cache of nonce values it has sent to clients requiring authentication. This cache must be pruned occasionally because nonce strings expire. This directive specifies how often Squid executes the garbage collection procedure for the nonce cache. If Squid is very busy, you may want to clean the nonce cache more frequently to reduce the amount of time spent in the garbage collection function each time it runs.
Syntax
auth_param digest nonce_garbage_interval time-specification
Default
auth_param digest nonce_garbage_interval 5 minutes
Example
auth_param digest nonce_garbage_interval 5 minutes
Related
auth_param digest program, auth_param digest children, auth_param digest realm, auth_param digest nonce_max_duration, auth_param digest nonce_max_count
auth_param digest nonce_max_duration
This directive specifies how long a Digest nonce value remains valid. It is similar to the credentialsttl directive for Basic authentication. If an attacker captures the client's digest authentication headers from an HTTP request, a simple replay attack provides authenticated access to Squid until the nonce value times out or until the maximum usage count is reached. Decrease this value to reduce that risk.
Syntax
auth_param digest nonce_max_duration time-specification
Default
auth_param digest nonce_max_duration 5 minutes
Example
auth_param digest nonce_max_duration 30 minutes
Related
auth_param digest program, auth_param digest children, auth_param digest realm, auth_param digest nonce_garbage_interval, auth_param digest nonce_max_count, auth_param basic credentialsttl
auth_param digest nonce_max_count
This directive specifies a limit on the number of requests for a Digest nonce value. If a client issues this many requests with the same nonce value, Squid invalidates it and causes a new one to be generated. See Section 4.3 of RFC 2617.
Syntax
auth_param digest nonce_max_count count
Default
auth_param digest nonce_max_count 50
Example
auth_param digest nonce_max_count 50
Related
auth_param digest program, auth_param digest children, auth_param digest realm, auth_param digest nonce_garbage_interval, auth_param digest nonce_max_duration
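The duration and count limits on a nonce can be read together as a single validity test. The sketch below illustrates that reading only; it is not Squid's implementation. The function name and the choice to treat a nonce as expired as soon as either limit is reached are assumptions.

```python
def nonce_is_valid(age_seconds, use_count, max_duration=5 * 60, max_count=50):
    """Return True while a Digest nonce may still be accepted.

    age_seconds: time since the nonce was issued.
    use_count:   requests already made with this nonce.
    The defaults mirror the values listed above (5 minutes, 50 requests).
    """
    if age_seconds >= max_duration:
        return False  # too old: nonce_max_duration reached
    if use_count >= max_count:
        return False  # used too often: nonce_max_count reached
    return True


print(nonce_is_valid(age_seconds=120, use_count=10))
print(nonce_is_valid(age_seconds=120, use_count=50))
```

Lowering either limit shortens the window in which captured headers can be replayed, at the cost of more frequent challenges to legitimate clients.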
auth_param ntlm program
This directive specifies the command, including options, to execute for the external NTLM authentication program.
Syntax
auth_param ntlm program command
Default
No default
Example
auth_param ntlm program /usr/local/squid/libexec/ntlm_auth /usr/local/squid/etc/ntlm_db
Related
auth_param ntlm children, auth_param ntlm max_challenge_reuses, auth_param ntlm max_challenge_lifetime
auth_param ntlm children
Specifies the number of NTLM authentication helper processes that Squid uses.
Syntax
auth_param ntlm children count
Default
auth_param ntlm children 5
Example
auth_param ntlm children 14
Related
auth_param ntlm program, auth_param ntlm max_challenge_reuses, auth_param ntlm max_challenge_lifetime
auth_param ntlm max_challenge_reuses
In Squid's NTLM implementation, the NTLM challenge token comes from the external helper process, rather than Squid itself. Each helper process generates its own challenge token. This directive specifies how many times each token may be reused. By default, the tokens are never reused. Challenge reuse is also subject to the max_challenge_lifetime restriction.
Syntax
auth_param ntlm max_challenge_reuses count
Default
auth_param ntlm max_challenge_reuses 0
Example
auth_param ntlm max_challenge_reuses 5
Related
auth_param ntlm program, auth_param ntlm children, auth_param ntlm max_challenge_lifetime
auth_param ntlm max_challenge_lifetime
This directive also controls whether the external NTLM helper processes can reuse their challenge tokens. It specifies the maximum amount of time a single challenge can be used.
Syntax
auth_param ntlm max_challenge_lifetime time-specification
Default
auth_param ntlm max_challenge_lifetime 1 minute
Example
auth_param ntlm max_challenge_lifetime 2 minutes
Related
auth_param ntlm program, auth_param ntlm children, auth_param ntlm max_challenge_reuses
authenticate_ttl
Squid maintains a cache of proxy authentication usernames and credentials. Squid periodically removes unused entries to keep memory usage down. This directive specifies how long Squid keeps entries in the proxy authentication username cache. A user's TTL is extended each time Squid receives a request from that user.
This directive doesn't determine how long credentials remain valid. It only affects whether or not an entry is removed from the username cache. Squid may decide to revalidate the credentials of a user that is in the cache. Each authentication scheme has its own way of determining when to revalidate credentials with the external helper.
Syntax
authenticate_ttl time-specification
Default
authenticate_ttl 1 hour
Example
authenticate_ttl 30 minutes
Related
authenticate_cache_garbage_interval, auth_param
authenticate_cache_garbage_interval
This directive specifies how often Squid executes the function to clean up the proxy authentication username cache. During this process, usernames that have been inactive for some amount of time (defined by authenticate_ttl) are purged.
Syntax
authenticate_cache_garbage_interval time-specification
Default
authenticate_cache_garbage_interval 1 hour
Example
authenticate_cache_garbage_interval 8 hours
Related
authenticate_ttl, auth_param
authenticate_ip_ttl
This directive causes Squid to deny requests if the same proxy authentication username comes from more than one IP address within a given amount of time. It's designed to discourage users from sharing their username and password with others. When Squid detects the same username from multiple IP addresses, it forces the user to reauthenticate by denying the request. This feature is disabled by default (0 seconds).
If your users normally have the same IP address (e.g., static addressing or DHCP with long leases), you can set authenticate_ip_ttl to a large value such as 1 hour. However, if your users are on dial-up connections, they may be more likely to change IP addresses within a short period of time. To make their lives easier, use a small authenticate_ip_ttl value, such as 1 minute.
Syntax
authenticate_ip_ttl time-specification
Default
authenticate_ip_ttl 0 seconds
Example
authenticate_ip_ttl 1 minute
Related
auth_param
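The effect of authenticate_ip_ttl can be pictured as a small per-user table of the last address seen. The following sketch is an illustration of the described behavior only; the class name, its bookkeeping, and the exact moment at which the old address is forgotten are assumptions rather than Squid's actual data structures.

```python
class IpTracker:
    """Deny a username that shows up from a second IP within ip_ttl seconds."""

    def __init__(self, ip_ttl):
        self.ip_ttl = ip_ttl  # seconds; 0 disables the check
        self.seen = {}        # username -> (ip, time last seen)

    def allow(self, user, ip, now):
        if self.ip_ttl <= 0:
            return True       # feature disabled (the default)
        previous = self.seen.get(user)
        if previous is not None:
            prev_ip, last_seen = previous
            if prev_ip != ip and now - last_seen < self.ip_ttl:
                return False  # same user, different address, too soon
        self.seen[user] = (ip, now)
        return True


tracker = IpTracker(ip_ttl=60)
print(tracker.allow("alice", "10.0.0.1", now=0))
print(tracker.allow("alice", "10.0.0.2", now=30))
print(tracker.allow("alice", "10.0.0.2", now=100))
```

With a 60-second TTL, the second address is refused while the first is still recent and accepted once the first has been idle for longer than the TTL, which is why a small value suits users whose addresses change often.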
external_acl_type
This directive defines new ACL types implemented as external programs. See Section 6.1.3.
Syntax
external_acl_type type-name [options] format helper-command
Default
No default
Example
external_acl_type MyAcltype %LOGIN /usr/local/squid/libexec/my-acl-prog.pl
Related
acl, http_access
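An external ACL helper follows the same pattern as the other helpers: it reads one line per lookup and answers OK or ERR. The sketch below assumes a format of %LOGIN, as in the example above, so each input line carries a username. The set of permitted users is hypothetical, and Python is used here only for illustration (the example above names a Perl script).

```python
import io

# Hypothetical policy for illustration: only these users match the ACL.
ALLOWED_USERS = {"alice", "bob"}


def decide(line):
    """Return "OK" if the first field (the %LOGIN value) is permitted."""
    fields = line.split()
    return "OK" if fields and fields[0] in ALLOWED_USERS else "ERR"


def serve(inp, out):
    """Answer each lookup line with OK or ERR until input ends."""
    for line in inp:
        out.write(decide(line) + "\n")
        out.flush()  # answer each lookup immediately


# Demonstration with in-memory streams instead of a real pipe.
demo_out = io.StringIO()
serve(io.StringIO("alice\nmallory\n"), demo_out)
print(demo_out.getvalue(), end="")
```

If the format string names several tokens, the helper receives them as additional space-separated fields on the same line, so the decision function would inspect more than the first field.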
wais_relay_host
The Wide Area Information Service (WAIS) is an obsolete protocol that predates the Web. This directive is largely historical. Its purpose is to make Squid forward all WAIS requests to another proxy, perhaps a dedicated WAIS gateway. You can accomplish the same effect with ACLs and cache_peer_access.
Syntax
wais_relay_host hostname
Default
No default
Example
wais_relay_host some.host.name
Related
wais_relay_port
wais_relay_port
If, for some reason, you use wais_relay_host, you must set the WAIS relay port number with this directive. Arguably you should be able to specify both with a single directive. However, they were split some time ago to simplify Squid's parsing code.
Syntax
wais_relay_port port-number
Default
No default
Example
wais_relay_port 8001
Related
wais_relay_host
request_header_max_size
This directive places an upper limit on the size of headers in an HTTP request. When Squid receives an HTTP request with headers that exceed this value, it returns a 413 (Request Entity Too Large) error response. In most cases, request headers are smaller than 512 bytes. This directive exists to catch certain abnormal conditions, such as persistent connection bugs, buffer overflow attempts, and denial-of-service attacks.
Syntax
request_header_max_size size-specification
Default
request_header_max_size 10 KB
Example
request_header_max_size 35 KB
Related
request_body_max_size, reply_body_max_size
request_body_max_size
This directive, if nonzero, places an upper limit on the size of a client's HTTP request body. Most requests (i.e., GET requests) don't have request bodies. This directive applies to PUT and POST requests. A request that exceeds this limit generates a 413 (Request Entity Too Large) error response.
Syntax
request_body_max_size size-specification
Default
No limit
Example
request_body_max_size 100 KB
Related
request_header_max_size, reply_body_max_size
refresh_pattern
This directive provides a way to customize Squid's algorithm for validating cached responses. HTTP has a relatively complex procedure for determining whether or not a cached response is fresh or stale. In some cases, origin servers provide an explicit expiration time. However, the majority of responses don't have this information. For these, Squid applies some heuristics to the response. See Section 7.7 for more information.
Syntax
refresh_pattern regex mintime percent maxtime [options]
Default
refresh_pattern . 0 20% 4320
Example
refresh_pattern \.jpg$ 0 75 7200
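As a rough illustration of the heuristic, the sketch below applies one rule's mintime, percent, and maxtime values to a response that has no explicit expiration time. The function name, its arguments, and the order of the checks are assumptions based on the commonly described algorithm (stale past maxtime, fresh while the LM-factor is under percent, fresh while younger than mintime); Section 7.7 remains the authority on what Squid actually does.

```python
def is_fresh(age, lm_age, min_minutes, percent, max_minutes):
    """Guess whether a cached response is still fresh.

    age:    seconds since the response was received.
    lm_age: seconds between the Last-Modified time and the time the
            response was received, or None if there is no Last-Modified.
    The three rule values use refresh_pattern units: minutes, percent, minutes.
    """
    if age > max_minutes * 60:
        return False                      # older than maxtime: stale
    if lm_age is not None and lm_age > 0:
        lm_factor = age / lm_age          # response age relative to object age
        if lm_factor < percent / 100.0:
            return True                   # under the percent threshold: fresh
    if age < min_minutes * 60:
        return True                       # younger than mintime: fresh
    return False                          # nothing says fresh: stale


# One hour old, object was 10 hours old when fetched, default rule 0 20% 4320.
print(is_fresh(age=3600, lm_age=36000, min_minutes=0, percent=20, max_minutes=4320))
```

In this call the LM-factor is 0.1, which is below the 20 percent threshold, so the response counts as fresh.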
quick_abort_min
This directive controls Squid's behavior for requests aborted by the user. In some cases, Squid continues reading data from the origin server so that future requests may be satisfied as cache hits. If Squid knows that the transfer (between itself and the origin server) has no more than this many bytes remaining, it continues receiving the object. Otherwise, Squid checks the quick_abort_max setting next.
Syntax
quick_abort_min size-specification
Default
quick_abort_min 16 KB
Example
quick_abort_min 50 KB
Related
quick_abort_max, quick_abort_pct
quick_abort_max
After checking quick_abort_min, Squid checks the value of this directive. If an aborted request has more than this many bytes remaining in the transfer, Squid terminates the connection to the origin server. Otherwise, it checks the quick_abort_pct setting.
Syntax
quick_abort_max size-specification
Default
quick_abort_max 16 KB
Example
quick_abort_max 1 MB
Related
quick_abort_min, quick_abort_pct
quick_abort_pct
Squid checks this value last, after checking quick_abort_max, for a transfer aborted by the user. If Squid has already received at least this percentage of the response, it continues reading the data from the origin server so the entire response is cached.
Syntax
quick_abort_pct percentage
Default
quick_abort_pct 95%
Example
quick_abort_pct 75%
Related
quick_abort_min, quick_abort_max
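The three quick_abort directives form a single decision that is checked in order: min, then max, then pct. The sketch below restates that order as a function. It assumes the total response size is known (for example, from a Content-Length header) and that a transfer failing all three tests is abandoned; the function and parameter names are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def keep_fetching(received, expected, qa_min=16 * KB, qa_max=16 * KB, qa_pct=95):
    """Decide whether to finish a transfer after the client aborts.

    received: bytes already read from the origin server.
    expected: total size of the response in bytes.
    Defaults mirror the listed defaults (16 KB, 16 KB, 95%).
    """
    remaining = expected - received
    if remaining <= qa_min:
        return True    # nearly done: finish so the object can be cached
    if remaining > qa_max:
        return False   # too much left: drop the server connection
    # Between min and max: continue only if most of the response is in hand.
    return received * 100 >= expected * qa_pct


# Example settings from the text: min 50 KB, max 1 MB, pct 75%.
print(keep_fetching(800_000, 1_000_000, qa_min=50 * KB, qa_max=1 * MB, qa_pct=75))
```

In the call shown, 200,000 bytes remain, which is more than the 50 KB minimum but less than the 1 MB maximum, so the percentage test decides; with 80 percent already received, the transfer continues.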
negative_ttl
Squid takes the liberty of caching certain error responses, such as "connection refused" and 404 (Not Found) messages. In most cases, repeating the request immediately is likely to result in the same error. This directive specifies how long Squid caches these errors. Cache hits for negatively cached responses are logged with TCP_NEGATIVE_HIT in access.log.
Syntax
negative_ttl time-specification
Default
negative_ttl 5 min
Example
negative_ttl 1 minute
Related
refresh_pattern
positive_dns_ttl
Each and every DNS resource record carries an explicit TTL that specifies how long the information may be cached. In most situations, Squid has access to the TTL values and doesn't store DNS answers longer than allowed. This is certainly true when you use Squid's internal DNS implementation, which is enabled by default. However, if you elect to use the (external) dnsserver processes, Squid may not receive TTL values for DNS answers. In this case, successful DNS answers are cached for the amount of time specified by this directive.
Syntax
positive_dns_ttl time-specification
Default
positive_dns_ttl 6 hours
Example
positive_dns_ttl 1 hour
Related
negative_dns_ttl
negative_dns_ttl
This is similar to positive_dns_ttl, except that it applies only to failed DNS queries. That is, when Squid receives an error for a DNS lookup, it negatively caches the error for this amount of time. It doesn't retry the query until the negative TTL expires. This applies to both internal and external DNS implementation choices.
Syntax
negative_dns_ttl time-specification
Default
negative_dns_ttl 5 minutes
Example
negative_dns_ttl 1 minute
Related
positive_dns_ttl
range_offset_limit
A range request comes from a client that wants only some subset of an HTTP response. Range requests are sometimes used to resume a failed transfer of a large file. Squid isn't yet able to cache partial responses and thus must make a decision when forwarding a range request: either remove the Range header or leave it in.
If Squid leaves the Range header in, the origin server sends only the subset that the client wants, and the client receives the response immediately. However, this partial response isn't cached.
On the other hand, if Squid removes the header before forwarding, it receives the entire response, which may be cached. Squid is then responsible for ensuring that the client receives only the subset it needs. The origin server may send a lot of data the client doesn't want. Depending on the speed of your connection, the client may be forced to wait a long time until its range is available.
If the beginning of the requested range is larger than the range_offset_limit value, Squid forwards the Range header and doesn't cache the response. Setting range_offset_limit to 0 causes Squid to always forward the Range header (the default). Setting it to -1 causes Squid to never forward the header.
Syntax
range_offset_limit size-specification
Default
range_offset_limit 0 KB
Example
range_offset_limit 100 KB
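The forwarding rule for Range headers reduces to a comparison between the start of the requested range and the configured limit, with 0 and -1 as special cases. The function below is a sketch of that rule as described here; its name and arguments are hypothetical.

```python
def forward_range_header(range_start, limit):
    """Return True if the Range header should be passed to the origin server.

    range_start: first byte offset the client asked for.
    limit:       range_offset_limit in bytes; 0 means always forward,
                 -1 means never forward.
    """
    if limit == -1:
        return False          # always fetch the whole object
    if limit == 0:
        return True           # default: never strip the header
    return range_start > limit


# With the example setting of 100 KB:
print(forward_range_header(range_start=500_000, limit=100 * 1024))
print(forward_range_header(range_start=1_000, limit=100 * 1024))
```

A request starting 500,000 bytes into the object is forwarded with its Range header and not cached, while one starting near the beginning causes Squid to fetch, and possibly cache, the whole response.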
connect_timeout
This directive tells Squid how long to wait when trying to connect to an origin server. After this amount of time, Squid gives up and tries another location or returns an error to the user. Your operating system's TCP implementation has its own connection timeout. If the TCP timeout occurs before connect_timeout, Squid creates a new TCP connection and tries again.
Syntax
connect_timeout time-specification
Default
connect_timeout 2 minutes
Example
connect_timeout 30 seconds
Related
peer_connect_timeout, read_timeout, write_timeout, request_timeout, pconn_timeout, minimum_retry_timeout
peer_connect_timeout
This is similar to connect_timeout, except that it applies to connections to your neighbors. Most likely, you'll want a smaller timeout for neighbor connections because they should be closer to you than most origin servers. If a neighbor is down, you want the connection to time out quickly so that you can try another source. Note that you can also specify individual neighbor timeouts with the connect-timeout option of the cache_peer directive.
Syntax
peer_connect_timeout time-specification
Default
peer_connect_timeout 30 seconds
Example
peer_connect_timeout 15 seconds
Related
connect_timeout
read_timeout
This timeout applies to server connections (between Squid and origin servers or neighbor caches). If Squid doesn't receive any data for this amount of time, it closes the connection. If the user hasn't yet received any part of the response, Squid generates a "read timeout" error message.
Syntax
read_timeout time-specification
Default
read_timeout 15 minutes
Example
read_timeout 1 hour
Related
connect_timeout, write_timeout, request_timeout, client_lifetime
request_timeout
This timeout applies to client connections. Once a client establishes a connection, Squid waits this long to receive the client's HTTP request. If the client fails to send a complete request, Squid simply closes the connection without sending any error message.
Syntax
request_timeout time-specification
Default
request_timeout 5 minutes
Example
request_timeout 30 seconds
Related
read_timeout, connect_timeout
persistent_request_timeout
This timeout is similar to request_timeout, except that it applies only to idle, persistent connections.
Syntax
persistent_request_timeout time-specification
Default
persistent_request_timeout 1 minute
Example
persistent_request_timeout 30 seconds
Related
request_timeout
client_lifetime
This timeout specifies the maximum amount of time for a client connection. In most cases, client connections should never last longer than a few hours. Long-lived client connections may be the result of a network outage, user-agent bugs, or mischievous activity.
Syntax
client_lifetime time-specification
Default
client_lifetime 1 day
Example
client_lifetime 3 hours
Related
read_timeout
half_closed_clients
TCP allows applications to close connections in one direction. That is, a client may close its connection for writing but keep it open for reading. These half-closed connections are confusing because Squid can't easily tell the difference between a client that intentionally closed half the connection and a client that simply aborted the entire connection. The only way Squid knows for sure is when its attempt to write some data returns an error.
Most user-agents don't use the TCP half-close, but some may. When the half_closed_clients directive is enabled (the default), Squid keeps these connections open until a write error (or some other error) occurs. When disabled, Squid fully closes the connection. Thus, if you disable this directive and have clients that use the TCP half-close, they can't receive any data from Squid.
Syntax
half_closed_clients on|off
Default
half_closed_clients on
Example
half_closed_clients off
Related
client_lifetime, read_timeout
pconn_timeout
This timeout applies to idle server persistent connections (i.e., connections between Squid and origin servers or neighbors). If the idle connection isn't reused within this amount of time, Squid closes it to conserve resources.
Syntax
pconn_timeout time-specification
Default
pconn_timeout 2 minutes
Example
pconn_timeout 45 seconds
Related
persistent_request_timeout, connect_timeout, read_timeout
ident_timeout
This timeout applies to ident (RFC 1413) requests made to client hosts. Squid makes ident lookups for one of two reasons: to satisfy an ACL check or for logging in access.log. In the ACL case, Squid blocks the request until the ident lookup returns, or this timeout occurs. When only logging, Squid doesn't block on the ident lookup.
Syntax
ident_timeout time-specification
Default
ident_timeout 10 seconds
Example
ident_timeout 1 minute
Related
ident_lookup_access, acl ident
shutdown_lifetime
When you shut down the Squid process, some user requests will still be active. This directive specifies how long to wait until all client requests are complete. Squid finally exits when all client connections have been closed or when this timeout occurs.
Syntax
shutdown_lifetime time-specification
Default
shutdown_lifetime 30 seconds
Example
shutdown_lifetime 60 seconds
acl
The acl directive defines an access control element, such as a client IP address, origin server hostname, or server port number. The syntax depends on the particular ACL type you wish to define. See Section 6.1 for the full-blown explanation.
Syntax
acl name type data...
Default
No default
Example
acl MyClients src 172.16.1.0/24
Related
http_access, icp_access, miss_access, no_cache, redirector_access, http_reply_access, ident_lookup_access, always_direct, never_direct, snmp_access, broken_posts
http_access
The http_access directive is one of the most important aspects of your configuration. It determines whether Squid allows or denies a client's request. If you don't get your access-control rules just right, savvy Internet users can abuse your resources (e.g., bandwidth, disk storage, address space). Some people find the access control rule syntax confusing. Be sure to read Section 6.2 closely.
Syntax
http_access allow|deny [!]ACLname ...
Default
http_access deny all
Example
http_access allow MyClients
Related
acl, http_reply_access, miss_access, icp_access
http_reply_access
The http_reply_access rules are similar to http_access, except that they are checked after Squid receives the HTTP response headers for a cache miss. You might want to use this access list to deny requests based on some characteristic of the response, such as the content type.
Syntax
http_reply_access allow|deny [!]ACLname ...
Default
http_reply_access allow all
Example
http_reply_access deny MP3Files
Related
acl, http_access
icp_access
This access list applies to ICP queries. If a particular ICP query is denied by the icp_access rules, Squid returns an ICP_DENIED message to the neighbor.
Syntax
icp_access allow|deny [!]ACLname ...
Default
icp_access deny all
Example
icp_access allow Neighbor1
Related
acl, http_access
miss_access
The miss_access rules are similar to http_access. However, they are applied to cache misses only. This allows you to enforce sibling relationships with your neighbor caches. See Section 6.3.7.
Syntax
miss_access allow|deny [!]ACLname ...
Default
miss_access allow all
Example
miss_access deny MySiblings
Related
acl, http_access
cache_peer_access
The cache_peer_access rules determine which requests Squid will forward to a particular neighbor. If a particular request is denied by a cache_peer_access list, Squid doesn't forward the request to that neighbor. See Section 10.4.1.
Syntax
cache_peer_access peername allow|deny [!]ACLname ...
Default
No default
Example
cache_peer_access neighbor.host.name allow SomeOriginDomains
Related
acl, cache_peer, cache_peer_domain, http_access
ident_lookup_access
The ident_lookup_access rules determine whether or not Squid performs an RFC 1413 username lookup for a client's TCP connection. These rules are checked before Squid reads any part of the HTTP request. Thus, only TCP/IP-based ACL elements (e.g., client address, port number) should be used in these rules.
Syntax
ident_lookup_access allow|deny [!]ACLname ...
Default
ident_lookup_access deny all
Example
ident_lookup_access allow TheseClients
Related
acl, ident_timeout
tcp_outgoing_tos
This directive allows you to set specific DSCP (differential services code point) values for outgoing TCP connections, that is, those made to origin servers and neighbors. The differential services protocol is quite complex. Simply using the example below will get you nowhere. Make sure that you understand what you are doing before using this directive. See RFCs 2474, 2475, and 3140 for additional information on differential services.
Syntax
tcp_outgoing_tos byte-value [!]ACLname ...
Default
No default
Example
acl NormalService src 10.0.0.0/255.255.255.0
acl BetterService src 10.0.1.0/255.255.255.0
tcp_outgoing_tos 0x00 NormalService
tcp_outgoing_tos 0x20 BetterService
tcp_outgoing_address
You can use this access list-based directive to bind outgoing TCP connections to specific local addresses. It might be useful if your system has multiple network interfaces, and you want to make sure all of Squid's traffic leaves through one and not the other. Another possibility is that you have two or more interfaces with different costs or characteristics. You may want to send privileged users' traffic through the expensive, uncongested link, while other users go out the cheap, low-quality connection. Don't use this directive if your system has only one network interface.
If you have a tcp_outgoing_address rule with no ACLs, that address is used for requests that don't match any of the other rules.
Syntax
tcp_outgoing_address ipaddr [[!]ACLname] ...
Default
No default
Example
acl SomeUsers src 10.0.0.0/24
acl OtherUsers src 10.0.1.0/24
tcp_outgoing_address 172.16.0.1 SomeUsers
tcp_outgoing_address 192.168.0.1 OtherUsers
tcp_outgoing_address 172.16.5.1
Related
udp_incoming_address, udp_outgoing_address
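The way the example rules select a local address can be sketched as a first-match search in which a rule without ACLs acts as a catch-all. The code below is an illustration only: it models each ACL as a single source network, which is a simplification, and the function and variable names are hypothetical.

```python
import ipaddress

# The three example rules: (local address, client network or None).
# None stands for a rule with no ACLs, which matches any request.
RULES = [
    ("172.16.0.1", ipaddress.ip_network("10.0.0.0/24")),   # SomeUsers
    ("192.168.0.1", ipaddress.ip_network("10.0.1.0/24")),  # OtherUsers
    ("172.16.5.1", None),                                  # everyone else
]


def outgoing_address(client_ip):
    """Return the local address of the first rule matching the client."""
    addr = ipaddress.ip_address(client_ip)
    for local_address, network in RULES:
        if network is None or addr in network:
            return local_address
    return None  # no rule matched: the system picks the address


print(outgoing_address("10.0.0.7"))
print(outgoing_address("10.0.1.9"))
print(outgoing_address("203.0.113.5"))
```

Because the ACL-less rule matches everything, it belongs at the end of the list, after the more specific rules.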
reply_body_max_size
This directive allows you to limit the size of HTTP reply bodies based on ACL elements. When a request matches one of the reply_body_max_size rules, Squid places a limit on the size of the HTTP response. A value of 0 indicates no limit.
Squid first checks the reply size when all HTTP headers have been received. If the headers contain a Content-Length value that exceeds the specified limit, the user receives a message that states "the request or reply is too large." If the content length is unavailable, Squid continues checking the limit as data comes in from the server. If the reply size exceeds the limit, Squid closes the client's connection, which causes the client to receive a partial reply.
Downstream caches often can't detect partial replies. Because the headers lack a content length value, the downstream cache (or user-agent) doesn't know that additional data is missing. Thus, you shouldn't use reply_body_max_size if you have child or sibling caches.
The code that checks the reply_body_max_size list ignores deny rules. In other words, it is pointless to include deny rules in this list.
Make sure that the maximum reply size is large enough for a Squid error message (typically 1K-2K bytes). An error message that is larger than the maximum reply body size causes Squid to crash.
Syntax
reply_body_max_size bytes allow [!]ACLname ...
Default
reply_body_max_size 0 allow all
Example
acl WorkingHours time 08:00-17:00
reply_body_max_size 10485760 allow WorkingHours
Related
maximum_object_size, request_body_max_size, request_header_max_size
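The two-stage size check described for reply_body_max_size can be sketched as follows. This is a minimal model under stated assumptions, not Squid's implementation: the function name check_reply_size and the returned labels are hypothetical, and ACL matching is left out.

```python
def check_reply_size(limit, content_length=None, bytes_received=0):
    """Classify a reply against a reply_body_max_size-style limit.

    limit == 0 means "no limit"; content_length is None when the reply
    headers carry no Content-Length. The returned labels are hypothetical.
    """
    if limit == 0:
        return "allow"
    if content_length is not None:
        # Size known up front: the user gets an error page instead of the body.
        return "error_page" if content_length > limit else "allow"
    # Size unknown: keep counting body bytes as they arrive from the server.
    return "close_connection" if bytes_received > limit else "allow"
```

The "close_connection" case is the one that produces partial replies, which is why the limit is risky when child or sibling caches sit downstream.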
cache_mgr
This email address is printed in error messages generated by Squid. Set this as an address to which your users should send support messages and problem reports. This address also receives a notification message if Squid dies unexpectedly.
Syntax
cache_mgr email@address
Default
cache_mgr webmaster
Example
cache_mgr [email protected]
cache_effective_user
In the interest of security, Squid doesn't allow itself to run as root. If you start the process as root, Squid changes its effective userid to a nonprivileged user. This user ID must have write permission to the cache directories and log file directory. You need to set this directive only if you're starting Squid as root. If you start Squid as a nonroot user, this directive is ignored.
Syntax
cache_effective_user username
Default
cache_effective_user nobody
Example
cache_effective_user squid
Related
cache_effective_group
cache_effective_group
If you start Squid as root, it changes the process' user ID to the username specified by cache_effective_user. By default, Squid sets the process' group ID to the group associated with the cache_effective_user. You can set the cache_effective_group directive if you want Squid to use some other group ID. You only need to set this directive if you're starting Squid as root. If you start Squid as a nonroot user, this directive is ignored.
Syntax
cache_effective_group groupname
Default
No default
Example
cache_effective_group squid
Related
cache_effective_user
visible_hostname
Use this directive when Squid can't determine the fully qualified domain name on its own or if you want to present a special, external name to the world. Squid uses this name in error messages, FTP directory listings, X-Cache header values, cache announcements, and for internal URLs. Squid also puts the visible hostname into HTTP Via headers, unless you also define the unique_hostname directive. Note that you must use unique_hostname if you have a cluster of caches that have the same visible hostname.
Syntax
visible_hostname hostname
Default
No default
Example
visible_hostname my.host.name
Related
unique_hostname, hostname_aliases, announce_period
unique_hostname
If you have a cluster of caches talking to each other and sharing a single visible_hostname value, you must use this directive to give each a unique name. Squid uses the unique name in HTTP Via headers to detect forwarding loops (see Section 10.2).
Syntax
unique_hostname hostname
Default
No default
Example
unique_hostname cache1.host.name
Related
visible_hostname, hostname_aliases
hostname_aliases
You may find yourself in a situation where more than one hostname resolves to Squid's IP address. For example, both sv.us.ircache.net and sv.cache.nlanr.net resolve to 192.203.230.19. If you have neighbors, they may send requests for certain Squid-specific internal URLs, as in the case of Cache Digests. These URLs might contain either hostname. You must use this directive to tell Squid that it is known by names other than its visible_hostname.
Syntax
hostname_aliases hostname ...
Default
No default
Example
hostname_aliases this.host.name that.host.name
Related
visible_hostname, unique_hostname
announce_period
Squid's announcement feature allows Squid administrators to find nearby caches that might be interested in joining a cache hierarchy. When you enable this directive, Squid periodically sends a small announcement message to a central server. By default, the announcement message contains five fields:
● The IP address and hostname that sent the announcement
● The Squid version
● The hostname Squid uses internally, either your hostname if Squid can figure it out or the value of the visible_hostname directive
● The value of the cache_mgr directive
● The date and time of the announcement
Setting announce_period to 0 disables the announcement feature.
Syntax
announce_period time-specification
Default
announce_period 0
Example
announce_period 4 hours
Related
announce_host, announce_file, announce_port
announce_host
This is the host set up to receive Squid's announcement messages. The default value, tracker.ircache.net, is the only server I know about. You can search the tracker.ircache.net database by visiting http://www.ircache.net/Tracker/. Note that if you set cache_mgr, your email address may be available to random people. On more than one occasion I have seen commercial caching vendors target Squid users by collecting their email addresses from this database.
Syntax
announce_host hostname
Default
announce_host tracker.ircache.net
Example
announce_host some.host.name
Related
announce_period, announce_file, announce_port
announce_file
You can customize your cache announcement message by setting this directive to a file containing additional information. For example, you can include information about your upstream service provider, telephone number, other caches that you peer with, etc. Announcement messages are sent via UDP, so this file shouldn't be too large. Some systems can't send or receive UDP messages larger than 9 KB. Furthermore, larger messages are more likely to be dropped before reaching their destination.
Syntax
announce_file pathname
Default
No default
Example
announce_file /usr/local/squid/etc/announce.txt
Related
announce_period, announce_host, announce_port
announce_port
This is the UDP port number to which the announcement messages are sent.
Syntax
announce_port port-number
Default
announce_port 3131
Example
announce_port 1234
Related
announce_period, announce_host, announce_file
httpd_accel_host
This directive enables HTTP server acceleration (see Chapter 15) and HTTP interception (see Chapter 9). When Squid is configured for server acceleration, this directive specifies the hostname or IP address of the backend server. When used in an interception configuration, you should probably use the keyword virtual here. When this directive is set, Squid disables ICP and rejects proxy-HTTP requests unless you also enable httpd_accel_with_proxy.
Syntax
httpd_accel_host hostname|virtual
Default
No default
Example
httpd_accel_host virtual
Related
httpd_accel_port, httpd_accel_single_host, httpd_accel_with_proxy, httpd_accel_uses_host_header, emulate_httpd_log
httpd_accel_port
This is the TCP port number to which accelerated/intercepted requests are sent. In most cases, you should leave it set to port 80. If you are accelerating/intercepting more than one port, set it to 0. That is similar to the virtual setting for httpd_accel_host.
Syntax
httpd_accel_port port-number
Default
httpd_accel_port 80
Example
httpd_accel_port 0
Related
httpd_accel_host, httpd_accel_single_host, httpd_accel_with_proxy, httpd_accel_uses_host_header
httpd_accel_single_host
When enabled, this directive makes Squid forward all accelerated/intercepted requests to the httpd_accel_host address. See Section 15.2.6.
If you enable this directive and httpd_accel_with_proxy, Squid may become susceptible to cache poisoning. Please read Chapter 15 thoroughly before running such a configuration.
Syntax
httpd_accel_single_host on|off
Default
httpd_accel_single_host off
Example
httpd_accel_single_host on
Related
httpd_accel_host, httpd_accel_port, httpd_accel_with_proxy, httpd_accel_uses_host_header
httpd_accel_with_proxy
Enabling HTTP acceleration/interception normally disables proxy-HTTP caching. That is, Squid refuses to handle proxy requests (with a full URI) when in HTTP server accelerator mode. Although I don't recommend it, you can force Squid to accept both types of requests by enabling this directive.
Syntax
httpd_accel_with_proxy on|off
Default
httpd_accel_with_proxy off
Example
httpd_accel_with_proxy on
Related
httpd_accel_host, httpd_accel_port, httpd_accel_single_host, httpd_accel_uses_host_header
httpd_accel_uses_host_header
When this directive is enabled, Squid uses a request's Host header when rewriting accelerated/intercepted requests. When disabled, Squid uses either the origin server's IP address or the httpd_accel_host value. You should probably enable httpd_accel_uses_host_header when running Squid as an HTTP-intercepting proxy. If Squid is a surrogate (accelerator), you only need to enable this directive if the backend server is configured for virtual hosting.
Syntax
httpd_accel_uses_host_header on|off
Default
httpd_accel_uses_host_header off
Example
httpd_accel_uses_host_header on
Related
httpd_accel_host, httpd_accel_port, httpd_accel_single_host, httpd_accel_with_proxy
dns_testnames
Squid uses these hostnames to test the DNS before starting. If Squid can't resolve any of these names, it prints an error and refuses to run. If the default list doesn't seem to work on your network, try listing some local hostnames instead.
Syntax
dns_testnames hostname ...
Default
dns_testnames netscape.com internic.net nlanr.net microsoft.com
Example
dns_testnames yahoo.com example.com squid-cache.org
logfile_rotate
You must periodically signal Squid to rotate its log files. If you don't, they will increase in size and eventually fill up the disk partition. This directive specifies how many old copies of each log file to keep around. See Section 13.7 for more information.
Syntax
logfile_rotate N
Default
logfile_rotate 10
Example
logfile_rotate 5
Related
cache_access_log, cache_log, cache_store_log, cache_swap_log, useragent_log, referer_log
append_domain
This directive helps Squid turn single-component hostnames into fully qualified domain names. For example, http://www/ becomes http://www.example.com/. This is especially important if you are participating in a cache hierarchy.
Syntax
append_domain .domain.name
Default
No default
Example
append_domain .example.com
Related
dns_defnames, hosts_file
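The hostname rewriting performed by append_domain can be sketched in a few lines of Python. This is a simplified illustration that assumes "single-component" means "contains no dot"; the function name apply_append_domain is hypothetical.

```python
def apply_append_domain(hostname, append_domain=".example.com"):
    """Append the configured domain to hostnames that contain no dot."""
    if "." in hostname:
        return hostname  # already has more than one component
    return hostname + append_domain
```

With the example setting, www becomes www.example.com, while a name such as www.squid-cache.org is left alone.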
tcp_recv_bufsize
If you use this directive, Squid sets the receive buffer size for each TCP socket that it creates. This value refers to the amount of data that the TCP/IP stack will buffer on behalf of the application. You can see how much data is being buffered at any given time by looking at the Recv-Q column of netstat -n output. Larger TCP buffers lead to increased memory usage and better performance. In general, you shouldn't need to use this directive. Most operating systems in use today have default TCP buffer sizes greater than 32 KB. Empirical evidence suggests that fewer than 5% of typical web objects are larger than 32 KB. When tcp_recv_bufsize is set to 0, Squid doesn't change the TCP buffer size from its default value.
Syntax
tcp_recv_bufsize size-specification
Default
tcp_recv_bufsize 0
Example
tcp_recv_bufsize 8 kb
err_html_text
This directive is one way to customize Squid's error messages. The error message files contain printf-like tokens. Squid dynamically replaces the tokens with appropriate values for each error. If Squid encounters the token %L, it inserts the contents of this directive. Note that none of the default error messages contain a %L. Thus, to use this feature, you must modify the default error files.
Syntax
err_html_text character string
Default
No default
Example
err_html_text Call 555-1234 to report problems with Squid.
Related
error_directory
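The %L substitution described for err_html_text can be sketched as a plain string replacement. This is a toy model that handles only the %L token and ignores every other token Squid expands; the function name expand_error_template is hypothetical.

```python
def expand_error_template(template, err_html_text=""):
    """Replace each %L token in an error template with the err_html_text value."""
    return template.replace("%L", err_html_text)
```

A template without %L comes back unchanged, which mirrors the point above that the default error files must be edited before this directive has any visible effect.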
deny_info
This directive allows you to show specific error messages to users when a request matches certain ACL elements. This is more informative than sending a generic "access denied" error message, as happens by default. When Squid checks its access control rules to see whether or not a particular request is allowed or denied, it remembers the ACL element that causes the search to terminate. You can use these ACL element names in a deny_info line to correlate error messages with a specific request characteristic. Consider, for example, this configuration:
acl Unsafe_Ports port 7 9 19 22 23 25 53 109 110 119
...
http_access deny Unsafe_Ports
...
deny_info ERR_PORT_IS_UNSAFE Unsafe_Ports
When a user makes a request to an origin server on one of the ports listed in the Unsafe_Ports ACL, Squid denies the request. Furthermore, Squid generates an error message from the ERR_PORT_IS_UNSAFE file, found in the error_directory directory.
Alternatively, you can specify a URI instead of an error message template. In this case, Squid sends an HTTP 302 (Moved Temporarily) redirect to the given URI. Finally, if you specify TCP_RESET as the error message template, Squid closes the client's connection in a way that generates a TCP reset.
Syntax
deny_info error-page-name|URI acl-name
Default
No default
Example
deny_info ERR_PORT_IS_UNSAFE Unsafe_Ports
Related
error_directory, acl
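The three deny_info outcomes (custom error page, redirect, TCP reset) can be sketched as a lookup keyed on the ACL that ended the search. This is an illustrative model only: the function name deny_response, the returned labels, and the "://" test for recognizing a URI are assumptions, not Squid's actual logic.

```python
def deny_response(matched_acl, deny_info):
    """Describe the reply for a denied request.

    matched_acl is the ACL element that terminated the access check;
    deny_info maps ACL names to an error page name, a URI, or TCP_RESET.
    """
    target = deny_info.get(matched_acl)
    if target is None:
        return ("error_page", "ERR_ACCESS_DENIED")  # generic message
    if target == "TCP_RESET":
        return ("tcp_reset", None)
    if "://" in target:  # assumed way to tell a URI from a page name
        return ("redirect_302", target)
    return ("error_page", target)
```

For the configuration above, deny_response("Unsafe_Ports", {"Unsafe_Ports": "ERR_PORT_IS_UNSAFE"}) selects the ERR_PORT_IS_UNSAFE page.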
memory_pools
Squid's memory pools are an attempt to optimize the way Squid allocates and frees memory. Certain data structures inside Squid are pooled. This means that rather than freeing unused memory, Squid holds onto it for future use. It also means that a particular chunk of memory is normally used for the same type of data structure. Memory pools may improve Squid's performance by avoiding frequent calls to malloc() and free(). The downside, however, is that the overall memory usage may be higher. If memory is a precious resource on your system, you might want to disable memory pools.
Syntax
memory_pools on|off
Default
memory_pools on
Example
memory_pools off
Related
cache_mem, memory_pools_limit
memory_pools_limit
This directive specifies an upper limit on the amount of unused memory to hold onto. If the total size of all unused, pooled memory exceeds this value, Squid begins returning unused memory to the malloc library by calling free(). If set to 0 (the default), Squid doesn't place any limit on the amount of unused memory to keep in the pools.
Syntax
memory_pools_limit size-specification
Default
memory_pools_limit 0
Example
memory_pools_limit 100 MB
Related
memory_pools
forwarded_for
Squid appends an item to the X-Forwarded-For header in requests sent to origin servers and neighbors. When this directive is enabled, Squid places the client's IP address there. When it is disabled, Squid prints the word unknown instead. Thus, disabling forwarded_for increases your users' privacy.
Syntax
forwarded_for on|off
Default
forwarded_for on
Example
forwarded_for off
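The effect of forwarded_for on the X-Forwarded-For header can be sketched as follows. This is a minimal model that treats headers as a Python dict and joins items with a comma; the function name add_forwarded_for is hypothetical.

```python
def add_forwarded_for(headers, client_ip, forwarded_for=True):
    """Return a copy of headers with one item appended to X-Forwarded-For."""
    item = client_ip if forwarded_for else "unknown"
    updated = dict(headers)
    existing = updated.get("X-Forwarded-For")
    # Append to any value set by an earlier proxy rather than overwriting it.
    updated["X-Forwarded-For"] = f"{existing}, {item}" if existing else item
    return updated
```

With the directive off, a request from 10.0.0.5 carries "X-Forwarded-For: unknown" instead of the client's address.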
log_icp_queries
By default, ICP queries appear in Squid's access.log. If Squid receives a large number of ICP queries from neighbors, your access.log file may become too large to manage effectively. If you disable this directive, ICP queries are never logged.
Syntax
log_icp_queries on|off
Default
log_icp_queries on
Example
log_icp_queries off
Related
access_log, icp_port
icp_hit_stale
Squid normally returns ICP_MISS for queries to stale objects. This causes an annoying problem described in Chapter 10. If you enable this directive, Squid returns ICP_HIT messages instead.
Syntax
icp_hit_stale on|off
Default
icp_hit_stale off
Example
icp_hit_stale on
Related
cache_peer, miss_access
minimum_direct_hops
If you're using netdb (see Section 10.5) and a cache hierarchy, Squid forwards requests directly to origin servers that are within this many router hops. Such requests are marked with CLOSEST_DIRECT in access.log.
Syntax
minimum_direct_hops N
Default
minimum_direct_hops 4
Example
minimum_direct_hops 6
Related
minimum_direct_rtt, always_direct
minimum_direct_rtt
This directive is similar to minimum_direct_hops. If Squid is within minimum_direct_rtt milliseconds (as measured by ICMP pings) of the origin server, the request is sent there directly. These requests are marked with CLOSEST_DIRECT in access.log.
Syntax
minimum_direct_rtt milliseconds
Default
minimum_direct_rtt 400
Example
minimum_direct_rtt 100
Related
minimum_direct_hops, always_direct
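The way minimum_direct_hops and minimum_direct_rtt feed a go-direct decision can be sketched as follows. This is a simplified model under stated assumptions: the function name go_direct is hypothetical, a missing netdb measurement is represented by None, and the inclusive comparison ("within" meaning less than or equal) is an assumption rather than something confirmed here.

```python
def go_direct(hops, rtt_ms, minimum_direct_hops=4, minimum_direct_rtt=400):
    """Return True if netdb measurements say the origin server is close enough.

    hops and rtt_ms are None when netdb has no measurement for the server.
    """
    if hops is not None and hops <= minimum_direct_hops:
        return True
    if rtt_ms is not None and rtt_ms <= minimum_direct_rtt:
        return True
    return False
```

In this model either measurement alone is enough: a server 12 hops away but only 150 milliseconds distant still qualifies with the default settings.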
cachemgr_passwd
This directive allows you to protect cache manager pages with a password. Unfortunately, this is an extremely weak authorization scheme, because passwords are sent as cleartext in the cache manager HTTP request. See Section 14.2.2.2 for a discussion of cache manager passwords.
Syntax
cachemgr_passwd password cachemgr-page ...
Default
No default
Example
cachemgr_passwd SekrIt config objects vm_objects
Related
http_access
store_avg_object_size
Squid uses this value as a hint for estimating the size of certain data structures. In particular, Squid calculates an estimate for the total number of objects in the cache, based on this value and the sum of all cache_dir sizes. This estimate is, in turn, used to calculate the number of hash buckets for the primary index to cached objects. Additionally, it can estimate the cache digest size, if that feature is enabled. In most cases the default should be sufficient. You can find the actual value for your cache by querying the cache manager. Look for "Mean Object Size" on the info page (see Section 14.2.1.24).
Syntax
store_avg_object_size size-specification
Default
store_avg_object_size 13 KB
Example
store_avg_object_size 10 KB
Related
cache_dir, digest_bits_per_entry, store_objects_per_bucket
store_objects_per_bucket
This directive allows you to tune the tradeoff between increased memory usage and longer searching times. Squid calculates the number of hash table buckets, depending on this directive, the average object size, and the total cache size. Squid's goal is to have this many objects in each bucket of the hash table. A larger value here leads to reduced memory usage but longer search times. Conversely, a smaller value leads to faster search times, at the expense of increased memory usage.
Syntax
store_objects_per_bucket N
Default
store_objects_per_bucket 20
Example
store_objects_per_bucket 15
Related
store_avg_object_size
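The arithmetic that ties store_avg_object_size and store_objects_per_bucket together can be sketched as follows. This is a rough estimate under stated assumptions: it counts only cache_dir space and ignores any rounding or adjustment Squid applies to the final bucket count. The function name estimate_hash_buckets is hypothetical.

```python
def estimate_hash_buckets(cache_dir_mb, avg_object_kb=13, objects_per_bucket=20):
    """Estimate (object count, hash bucket count) for a list of cache_dir sizes in MB."""
    total_kb = sum(cache_dir_mb) * 1024
    estimated_objects = total_kb // avg_object_kb        # total size / mean object size
    buckets = estimated_objects // objects_per_bucket    # target objects per bucket
    return estimated_objects, buckets
```

For a single 1300-MB cache_dir and the default settings, the estimate is 102,400 objects spread over 5,120 buckets. Halving store_objects_per_bucket to 10 doubles the bucket count to 10,240, which is the memory-for-speed tradeoff described above.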
client_db
Squid keeps a number of statistics for each cache client (IP address). You can view them by visiting the cache manager client_list page. The ClientInfo data structure is about 240 bytes on 32-bit systems and 300 bytes on 64-bit systems. If you have thousands of clients, this database can consume a significant amount of memory. You can disable this directive and free up that memory for other uses.
Syntax
client_db on|off
Default
client_db on
Example
client_db off
netdb_low
The netdb database contains round-trip time and hop-count measurements derived from ICMP pings. This directive specifies the lower limit for the netdb replacement policy. In other words, when Squid is removing netdb entries, it stops when the total number reaches netdb_low.
Syntax
netdb_low N
Default
netdb_low 900
Example
netdb_low 9900
Related
netdb_high, query_icmp
netdb_high
The netdb database contains round-trip time and hop-count measurements derived from ICMP pings. This directive specifies an upper limit on the number of entries in the database. When Squid finds more than netdb_high entries, it removes least-recently used networks until the size reaches netdb_low.
Syntax
netdb_high N
Default
netdb_high 1000
Example
netdb_high 10000
Related
netdb_low, query_icmp
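The high/low watermark behavior of netdb_high and netdb_low can be sketched with an ordered dictionary standing in for the netdb table. This is only a model of the replacement policy as described above; the function name netdb_touch and the choice of OrderedDict are assumptions, not Squid's data structures.

```python
from collections import OrderedDict

def netdb_touch(db, network, netdb_low=900, netdb_high=1000):
    """Record a use of network, then prune if the table grew past netdb_high."""
    db[network] = True
    db.move_to_end(network)          # most recently used entries live at the end
    if len(db) > netdb_high:
        while len(db) > netdb_low:
            db.popitem(last=False)   # drop the least recently used network
```

Pruning down to netdb_low rather than back to netdb_high means a full table is trimmed in one batch instead of on every new entry.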
netdb_ping_period
This directive specifies how long Squid must wait between sending consecutive ICMP pings to the same /24 network. The interval is relatively long so that Squid's ICMP traffic doesn't upset server administrators.
Syntax
netdb_ping_period time-specification
Default
netdb_ping_period 5 min
Example
netdb_ping_period 3 min
Related
pinger_program, query_icmp
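The per-network rate limit that netdb_ping_period describes can be sketched as follows. This is a minimal illustration, assuming time is supplied in seconds by the caller; the function name may_ping and the last_ping dictionary are hypothetical.

```python
import ipaddress

def may_ping(last_ping, address, now, period_seconds=300):
    """Return True (and record the ping) if the /24 of address may be pinged at time now."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    previous = last_ping.get(network)
    if previous is not None and now - previous < period_seconds:
        return False  # this /24 was pinged too recently
    last_ping[network] = now
    return True
```

Two servers in the same /24 share one timer, so a ping to 192.0.2.10 suppresses a ping to 192.0.2.99 until the period has elapsed.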
query_icmp
Enabling this directive instructs Squid to ask its neighbors for their ICMP measurements, which are included in ICP/HTCP replies. This, essentially, populates your netdb database with your neighbors' ICMP measurements. The bulk "netdb exchange" is another way to receive those measurements (see Section 10.5). Squid uses the neighbors' netdb measurements when making forwarding decisions. If one of the parents is closer to the origin server, Squid forwards the request there and marks it with CLOSEST_PARENT_MISS in access.log.
Syntax
query_icmp on|off
Default
query_icmp off
Example
query_icmp on
Related
pinger_program, netdb_ping_period
test_reachability
When you enable this directive, Squid looks at its netdb database while processing ICP queries. If Squid normally returns ICP_MISS, but the origin server isn't in the database or doesn't respond to ICMP pings, it returns ICP_MISS_NOFETCH instead. The ICP_MISS_NOFETCH reply signals the neighbor cache that Squid might not be able to communicate with the origin server.
Syntax
test_reachability on|off
Default
test_reachability off
Example
test_reachability on
Related
pinger_program, query_icmp, netdb_ping_period
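The reply selection described for test_reachability can be sketched as follows. This is a simplified model: netdb is reduced to a dictionary that maps an origin network to whether it answers pings, and the function name icp_reply_for_miss is hypothetical.

```python
def icp_reply_for_miss(origin_network, netdb, test_reachability=True):
    """Choose the ICP opcode for a query that would normally get ICP_MISS.

    netdb maps a network to True (answers pings) or False (doesn't answer).
    """
    if not test_reachability:
        return "ICP_MISS"
    if not netdb.get(origin_network, False):
        return "ICP_MISS_NOFETCH"  # unknown or unresponsive origin server
    return "ICP_MISS"
```

Note that a network missing from the database is treated the same as one that doesn't respond, matching the description above.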
buffered_logs
While this directive used to affect multiple log files, it now only applies to cache.log. Squid uses the stdio library for cache.log. If this directive is disabled (the default), Squid calls fflush() after every write. This allows you to see log file entries as they are written. You might want to enable buffered_logs if you are debugging Squid in a way that creates a large number of cache.log entries.
Syntax
buffered_logs on|off
Default
buffered_logs off
Example
buffered_logs on
Related
cache_log
reload_into_ims
If you enable this directive, Squid adds an If-Modified-Since header to requests that contain a no-cache directive. This is a global version of the reload-into-ims option for the refresh_pattern directive (see Section 7.7).
Altering the client's request in this manner is a violation of HTTP.
Syntax
reload_into_ims on|off
Default
reload_into_ims off
Example
reload_into_ims on
Related
refresh_pattern
always_direct
The always_direct access rules define a class of requests that must always be forwarded directly to the origin server. For these, Squid doesn't query or otherwise consider any neighbor caches. See Section 10.4.4.
Syntax
always_direct allow|deny [!]ACLname ...
Default
No default
Example
acl LocalServers dst 172.17.0.0/24
always_direct allow LocalServers
Related
acl, never_direct, prefer_direct, nonhierarchical_direct, minimum_direct_hops, minimum_direct_rtt, cache_peer_access
never_direct
The never_direct access rules define a class of requests that must never be forwarded to the origin server. For these, Squid must select an appropriate neighbor cache to handle the request. See Section 10.4.3.
Syntax
never_direct allow|deny [!]ACLname ...
Default
No default
Example
acl SpecialServers dstdomain .example.com
never_direct allow SpecialServers
Related
acl, always_direct, prefer_direct, nonhierarchical_direct, minimum_direct_hops, minimum_direct_rtt, cache_peer_access
header_access
This directive defines a set of access rules for filtering HTTP headers from both requests and responses. You can use it to remove headers that may violate your privacy, or that cause interoperation issues. For example, this configuration removes Cookie headers sent to a well-known web advertising company:
acl DC dstdomain .doubleclick.net
header_access Cookie deny DC
The header-name field must be one of the HTTP headers Squid knows about or one of the keywords Other or All. Squid currently knows the following HTTP headers:
Accept
Accept-Charset
Accept-Encoding
Accept-Language
Accept-Ranges
Age
Allow
Authentication-Info
Authorization
Cache-Control
Connection
Content-Base
Content-Encoding
Content-Language
Content-Length
Content-Location
Content-MD5
Content-Range
Content-Type
Cookie
Date
ETag
Expires
From
Host
If-Match
If-Modified-Since
If-None-Match
If-Range
Last-Modified
Link
Location
Max-Forwards
Mime-Version
Negotiate
Pragma
Proxy-Authenticate
Proxy-Authentication-Info
Proxy-Authorization
Proxy-Connection
Public
Range
Referer
Request-Range
Retry-After
Server
Set-Cookie
Title
Transfer-Encoding
Upgrade
User-Agent
Vary
Via
WWW-Authenticate
Warning
X-Accelerator-Vary
X-Cache
X-Cache-Lookup
X-Forwarded-For
X-Request-URI
X-Squid-Error
Unfortunately, you can't refer to an unknown header individually. The best you can do is use the keyword Other to refer to all unknown HTTP headers. The keyword All refers to all (known and unknown) HTTP headers. Note that if you deny the Via header, Squid can't detect forwarding loops (see Section 10.2).
Removing headers from requests and responses is a violation of HTTP.
Syntax
header_access header-name allow|deny [!]ACLname ...
Default
No default
Example
header_access From deny All
Related
acl, header_replace
header_replace
This directive works in conjunction with header_access. If you use header_replace, Squid replaces HTTP headers that are denied (removed) by a header_access rule. In other words, an HTTP header must be filtered out by header_access before it can be replaced by header_replace.
header_replace isn't especially flexible. You can only define one replacement value for each header. You can't, for example, use one value for some requests and a different value for others.
Changing HTTP request and response headers is a violation of HTTP.
Syntax
header_replace header-name string
Default
No default
Example
header_replace User-Agent Nutscrape/1.0 (CP/M; 8-bit)
Related
header_access
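The interaction between header_access and header_replace can be sketched as follows. This is an illustrative model only: headers are a plain dict, the set of denied names stands in for the outcome of the header_access rules, and the function name filter_headers is hypothetical.

```python
def filter_headers(headers, denied, replacements):
    """Drop denied headers, substituting a replacement value when one is defined."""
    result = {}
    for name, value in headers.items():
        if name not in denied:
            result[name] = value               # allowed headers pass through untouched
        elif name in replacements:
            result[name] = replacements[name]  # only denied headers are ever replaced
        # denied headers without a replacement are simply removed
    return result
```

A replacement defined for a header that isn't denied has no effect in this model, which reflects the rule that a header must first be filtered out by header_access.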
icon_directory
This directive specifies the location of the icons Squid uses in FTP and Gopher directory listings. The icon filenames are defined in mime.conf (see Appendix A). If you don't like Squid's icons, you can use your own, as long as the filenames found in mime.conf exist in the icon_directory directory.
Syntax
icon_directory directory
Default
icon_directory $prefix/share/icons
Example
icon_directory /usr/local/squid/share/myicons
Related
error_directory, mime_table
error_directory
This directive specifies the location of Squid's error message files. If you want to customize the error messages, you should put them into a nondefault directory. Otherwise, they may be overwritten if you run make install in the future.
Syntax
error_directory directory
Default
error_directory $prefix/share/errors/$language
Example
error_directory /usr/local/squid/share/my_errors
Related
icon_directory, err_html_text, deny_info
maximum_single_addr_tries
This directive places a limit on the number of times Squid attempts to connect to a single IP address when forwarding a request. It can't be set higher than 10.
Syntax
maximum_single_addr_tries N
Default
maximum_single_addr_tries 3
Example
maximum_single_addr_tries 5
Related
connect_timeout
snmp_port
This is the UDP port on which Squid listens for SNMP queries. SNMP support requires the --enable-snmp option to ./configure. Set the SNMP port to 0 if Squid shouldn't accept any SNMP messages.
Syntax
snmp_port port-number
Default
snmp_port 3401
Example
snmp_port 3161
Related
snmp_access, snmp_incoming_address, snmp_outgoing_address
snmp_access
The snmp_access rules apply to SNMP queries. Although this is a standard Squid access list rule, many ACL elements are undefined for SNMP. In fact, you can only use src and snmp_community ACLs.
Syntax
snmp_access allow|deny [!]ACLname ...
Default
No default (all queries denied by default)
Example
acl SNMPPasswd snmp_community sekrit
acl SNMPClients src 172.16.1.2 10.0.5.1
acl All src 0/0
snmp_access allow SNMPClients SNMPPasswd
snmp_access deny All
Related
acl, snmp_port
snmp_incoming_address
By default, Squid opens the SNMP socket to receive packets on all local interfaces. You can use this directive to bind the SNMP socket to a particular interface.
Syntax
snmp_incoming_address ip-address
Default
snmp_incoming_address 0.0.0.0
Example
snmp_incoming_address 172.16.0.1
Related
snmp_port, snmp_access, udp_incoming_address
snmp_outgoing_address
Squid uses a single SNMP socket by default. If you set this directive, however, Squid opens a separate socket for SNMP replies only. In most cases, you shouldn't use this directive because SNMP replies should come from the same address to which the queries are sent.
Syntax
snmp_outgoing_address ip-address
Default
No default
Example
snmp_outgoing_address 192.168.5.5
Related
snmp_port, snmp_access, udp_outgoing_address
as_whois_server
This is the hostname of the whois server Squid uses to resolve Autonomous System numbers into IP networks. You only need to worry about this if you use AS-based ACLs (src_as, dst_as). The default server, whois.ra.net, seems to work relatively well. It may be too far away (and unreliable) for non-U.S. users. If you know of a local whois server that answers AS queries, feel free to use it instead.
Syntax
as_whois_server hostname
Default
as_whois_server whois.ra.net
Example
as_whois_server whois.host.name
Relat ed
acl
wccp_router
This directive defines Squid's home router for WCCP. When you enter an IP address (or hostname) here, Squid sends WCCP "Here I Am" messages to the router. See Section 9.3.4 for more information. Routers, by definition, have multiple network interfaces. You should probably use the address of the interface that is connected, or has the route, to Squid. Squid ignores WCCP messages that don't have the wccp_router value as their source address.
Syntax
wccp_router ip-address
Default
No default
Example
wccp_router 172.16.5.1
Related
wccp_version, wccp_incoming_address, wccp_outgoing_address
wccp_version
This particular version number refers to the second field of the WCCP "Here I Am" message. It isn't the same as WCCPv1 versus WCCPv2. Some users report that older installations of Cisco IOS only work when this directive is set to 3.
Syntax
wccp_version N
Default
wccp_version 4
Example
wccp_version 3
Related
wccp_router
wccp_incoming_address
Squid listens for WCCP messages on all local interfaces by default. If you set this directive, Squid listens on only the specified address.
Syntax
wccp_incoming_address ip-address
Default
wccp_incoming_address 0.0.0.0
Example
wccp_incoming_address 10.1.2.3
Related
wccp_router, wccp_outgoing_address, udp_incoming_address
wccp_outgoing_address
If, for some reason, you want Squid to send and receive WCCP messages on different interfaces, set this directive to the address of the outgoing interface. If this directive isn't set, as is the default, Squid uses the same socket for incoming and outgoing messages.
Syntax
wccp_outgoing_address ip-address
Default
No default
Example
wccp_outgoing_address 172.16.1.1
Related
wccp_router, wccp_incoming_address, udp_outgoing_address
delay_pools
This directive specifies the number of delay pools that you will later define with the delay_class and delay_parameters directives. It tells Squid the size of certain arrays used in the delay pools implementation. It must appear in the configuration file before the other delay pools directives. Note that in order to use delay pools, you must give the --enable-delay-pools option to ./configure.
Syntax
delay_pools N
Default
delay_pools 0
Example
delay_pools 4
Related
delay_class, delay_access, delay_parameters, delay_initial_bucket_level
delay_class
This directive defines the class of each delay pool. The first argument is the delay pool index. Index values start at 1 and must be less than or equal to the delay_pools value. The second argument is the delay class, which has three possible values:
● A class 1 pool uses a single, aggregate bucket for all traffic that applies to the pool.
● A class 2 pool uses a single, aggregate bucket, as well as 256 individual buckets. The individual bucket is chosen by the last octet of the client's IPv4 address.
● A class 3 pool uses a single, aggregate bucket, 256 network buckets, and 65,536 individual buckets. The network bucket is chosen based on the third octet of the client's IPv4 address. The individual bucket is chosen by the third and fourth octets.
Note that the class 2 and class 3 pools have multiple types of buckets (aggregate, network, individual). A client receives a traffic allocation from all relevant buckets, not just one of them. In other words, if any of the relevant buckets are empty, the client doesn't receive any traffic allocation.
Syntax
delay_class pool-number class
Default
No default
Example
delay_class 1 2
delay_class 2 3
Related
delay_pools, delay_access, delay_parameters, delay_initial_bucket_level
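The bucket selection described for each class can be sketched in a few lines of Python. This is only an illustration of the octet rules given in this entry, not Squid's source code. The function names, the dictionary layout, and the way the third and fourth octets are combined into a single individual-bucket index are assumptions made for the example.

```python
import ipaddress


def bucket_indexes(client_ip: str, pool_class: int) -> dict:
    """Map a client IPv4 address to the buckets it would draw from.

    Hypothetical helper: the index numbering is an assumption chosen
    to match the bucket counts (256 and 65,536) given in the text.
    """
    octets = ipaddress.IPv4Address(client_ip).packed  # four raw bytes
    buckets = {"aggregate": 0}  # every class has one aggregate bucket
    if pool_class == 2:
        # 256 individual buckets, chosen by the last octet
        buckets["individual"] = octets[3]
    elif pool_class == 3:
        # 256 network buckets, chosen by the third octet
        buckets["network"] = octets[2]
        # 65,536 individual buckets, chosen by the third and fourth octets
        buckets["individual"] = octets[2] * 256 + octets[3]
    elif pool_class != 1:
        raise ValueError("delay class must be 1, 2, or 3")
    return buckets


def allocation(bucket_levels):
    """A client is limited by the emptiest of its relevant buckets."""
    return min(bucket_levels)


print(bucket_indexes("172.17.5.9", 2))
print(bucket_indexes("172.17.5.9", 3))
print(allocation([32000, 0, 8000]))  # an empty bucket means no allocation
```

Note that with a class 2 pool, two clients whose addresses share the same last octet (say, 172.17.5.9 and 172.17.8.9) land in the same individual bucket.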
delay_access
This directive maps a client request to a particular delay pool. A client's cache miss is delayed only if it is "allowed" by one of the delay_access rules. Squid checks the access rules for all pools in order. If a particular request is denied by all delay_access rules, it isn't delayed. You must define at least one rule to use delay pools.
Syntax
delay_access pool-number allow|deny [!]ACLname ...
Default
No default
Example
acl Dorms src 172.17.0.0/16
delay_access 1 allow Dorms
Related
delay_pools, delay_class, delay_parameters, delay_initial_bucket_level
delay_parameters
The delay_parameters directive determines the fill rate and capacity for each delay pool bucket. Following the pool number, you must write one, two, or three pairs of numbers. The number of pairs is the same as the pool's class. A class 1 pool takes one pair, a class 2 pool takes two pairs, and a class 3 pool takes three pairs. Each pair of numbers specifies the fill rate and maximum bucket size. The fill rate should not be larger than the maximum size. The units are bytes: bytes per second for the fill rate, and bytes for the maximum size. Thus, if you are thinking in terms of bits per second, you must divide by 8 to get bytes per second. For example, if you want to define a bucket that refills at a rate of 100 Kbits/sec, and holds no more than 300 Kbits (3 seconds) of traffic, you would write 12500/37500.
Syntax
delay_parameters pool-number aggr-rate/aggr-max [ind-rate/ind-max [net-rate/net-max]]
Default
No default
Example
delay_parameters 2 16000/32000 4000/8000
Related
delay_pools, delay_class, delay_access, delay_initial_bucket_level
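Because the bits-to-bytes conversion is easy to get wrong, here is a small Python helper that reproduces the arithmetic from the description above. The function name and its arguments are made up for this sketch. It assumes that 1 Kbit means 1,000 bits, which is what the 12500/37500 example implies.

```python
def delay_pair(rate_kbits_per_sec: float, burst_seconds: float) -> str:
    """Build one rate/max pair for delay_parameters.

    rate_kbits_per_sec: desired refill rate in Kbits/sec (1 Kbit = 1000 bits)
    burst_seconds: how many seconds of traffic the bucket may hold
    """
    fill_rate = int(rate_kbits_per_sec * 1000 / 8)  # bytes per second
    max_size = int(fill_rate * burst_seconds)       # bytes
    return f"{fill_rate}/{max_size}"


# 100 Kbits/sec with room for 3 seconds of traffic
print(delay_pair(100, 3))  # 12500/37500
```

You would then paste the resulting pair into a line such as delay_parameters 1 12500/37500 for a class 1 pool.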
delay_initial_bucket_level
This directive determines the amount of traffic that Squid puts into newly created buckets. A bucket is created when Squid starts up or is reconfigured. For class 2 and 3 pools, individual and network buckets are created upon the first client request that uses the bucket. The delay_initial_bucket_level value is a percentage of the bucket's maximum size.
Syntax
delay_initial_bucket_level percent
Default
delay_initial_bucket_level 50
Example
delay_initial_bucket_level 100
Related
delay_pools, delay_class, delay_access, delay_parameters
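To get a feel for how the initial level interacts with the fill rate and maximum size, consider the following toy simulation of a single bucket. It is a simplified model, not Squid's implementation: it assumes the bucket is refilled once per second before the client takes its share, and that the client always wants more than the bucket can give. All names are invented for the example.

```python
def simulate_bucket(fill_rate, max_size, initial_percent, demand, seconds):
    """Return the number of bytes a greedy client receives each second.

    fill_rate, max_size: one pair from delay_parameters (bytes/sec, bytes)
    initial_percent: the delay_initial_bucket_level value
    demand: bytes the client tries to take each second
    """
    level = max_size * initial_percent // 100  # newly created bucket
    received = []
    for _ in range(seconds):
        level = min(max_size, level + fill_rate)  # refill, capped at the maximum
        take = min(level, demand)                 # can't take more than is there
        level -= take
        received.append(take)
    return received


# A 12500/37500 bucket, with the bucket starting full and half full
print(simulate_bucket(12500, 37500, 100, 50000, 4))  # [37500, 12500, 12500, 12500]
print(simulate_bucket(12500, 37500, 50, 50000, 4))   # [31250, 12500, 12500, 12500]
```

In both runs the client gets an initial burst and then settles at the fill rate. The initial level only changes the size of that first burst.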
incoming_icp_average
This directive controls the low-level routines that periodically check the ICP socket for incoming queries and replies. The algorithm is too complex to fully describe here. The idea is to make sure Squid checks the ICP socket frequently enough to handle the ICP load but not so often that it is a waste of time. This directive specifies the number of normal I/O events that should occur between checks to the ICP socket. A normal I/O event refers to reading from, and writing to, client- and server-side TCP sockets. Unless you have a thorough understanding of the polling algorithms in the source code, I strongly recommend that you leave this directive set to its default value.
Syntax
incoming_icp_average number
Default
incoming_icp_average 6
Example
incoming_icp_average 20
Related
incoming_http_average, incoming_dns_average
incoming_http_average
This directive is similar to incoming_icp_average, except that it refers to the HTTP socket with which Squid accepts new client requests. Unless you have a thorough understanding of the polling algorithms in the source code, I strongly recommend that you leave this directive set to its default value.
Syntax
incoming_http_average number
Default
incoming_http_average 4
Example
incoming_http_average 15
Related
incoming_icp_average, incoming_dns_average
incoming_dns_average
This directive is similar to incoming_icp_average, except that it refers to the UDP socket with which Squid receives DNS responses. Unless you have a thorough understanding of the polling algorithms in the source code, I strongly recommend that you leave this directive set to its default value.
Syntax
incoming_dns_average number
Default
incoming_dns_average 4
Example
incoming_dns_average 8
Related
incoming_icp_average, incoming_http_average
min_icp_poll_cnt
This directive controls the low-level routines that periodically check the ICP socket for incoming queries and replies. It specifies a lower limit on the number of normal I/O events that must occur between checks to the ICP socket. Unless you have a thorough understanding of the polling algorithms in the source code, I strongly recommend that you leave this directive set to its default value.
Syntax
min_icp_poll_cnt number
Default
min_icp_poll_cnt 8
Example
min_icp_poll_cnt 10
Related
incoming_icp_average
min_dns_poll_cnt
This directive is similar to min_icp_poll_cnt, except that it applies to the UDP socket with which Squid receives DNS replies. Unless you have a thorough understanding of the polling algorithms in the source code, I strongly recommend that you leave this directive set to its default value.
Syntax
min_dns_poll_cnt number
Default
min_dns_poll_cnt 8
Example
min_dns_poll_cnt 10
Related
incoming_dns_average
min_http_poll_cnt
This directive is similar to min_icp_poll_cnt, except that it applies to the TCP socket with which Squid accepts new client requests. Unless you have a thorough understanding of the polling algorithms in the source code, I strongly recommend that you leave this directive set to its default value.
Syntax
min_http_poll_cnt number
Default
min_http_poll_cnt 8
Example
min_http_poll_cnt 12
Related
incoming_http_average
max_open_disk_fds
This directive defines an upper limit on the number of file descriptors that Squid should open for reading and writing cache files on disk. It is relevant for only the ufs and aufs storage schemes. It is a relatively simple hack for measuring the level of Squid's disk activity. Experience shows that performance degrades significantly when Squid hits a filesystem bottleneck. If Squid reaches this limit, it doesn't attempt to store subsequent cachable responses. Each time that happens, Squid increments the no.too_many_open_files counter (see Section 14.2.1.40). Note that hitting this limit has a negative impact on your hit ratio. You can monitor the number of open disk files by requesting the info page from the cache manager (see Section 14.2.1.24). If you set this directive to 0, Squid doesn't place any limits on the number of open disk file descriptors.
Syntax
max_open_disk_fds N
Default
max_open_disk_fds 0
Example
max_open_disk_fds 100
offline_mode
When you enable offline_mode, Squid returns every cached response as an unvalidated cache hit. These are tagged with TCP_OFFLINE_HIT in access.log. When in this mode, Squid still attempts to forward cache misses. If your system truly is offline, some requests may hang while waiting for the DNS or HTTP transaction to time out.
Syntax
offline_mode on|off
Default
offline_mode off
Example
offline_mode on
uri_whitespace
This directive tells Squid what to do about URIs that contain whitespace characters (i.e., space and tab). The default action is to strip out the whitespace and shift the valid characters down as necessary. This is the behavior recommended by RFC 2396. If you set this directive to allow, Squid doesn't change the URI. It is passed through to the origin server as is. This setting may cause some problems with redirectors and log file parsers. Both use whitespace as a field delimiter, and a URI with whitespace adds an additional field (or fields) to the redirector input line and the access.log entry. The deny setting instructs Squid to deny such a request, as though it were blocked by the access control rules. Note, however, that the URI is still written to access.log with the whitespace characters. With the encode setting, Squid changes whitespace characters into their RFC 1738 equivalents. When some origin servers generate URIs that contain whitespace, this is what they should be doing in the first place. Finally, the chop setting instructs Squid to simply cut off the URI at the first whitespace character.
Syntax
uri_whitespace allow|deny|strip|encode|chop
Default
uri_whitespace strip
Example
uri_whitespace deny
Related
access_log, redirector_program
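The five settings are easier to compare side by side. The following Python sketch mimics each one on a sample URI. It is a model of the behavior described above rather than Squid's actual code. The function name is invented, a return value of None stands in for a denied request, and the encode branch assumes the usual percent-encoding (%20 for space, %09 for tab).

```python
WHITESPACE = " \t"  # the characters this entry is concerned with


def apply_uri_whitespace(uri: str, setting: str):
    """Return the URI as it would be handled, or None if the request is denied."""
    if not any(ch in WHITESPACE for ch in uri):
        return uri  # nothing to do for well-formed URIs
    if setting == "allow":
        return uri  # passed through unchanged
    if setting == "deny":
        return None  # treated like an access control denial
    if setting == "strip":
        return "".join(ch for ch in uri if ch not in WHITESPACE)
    if setting == "encode":
        return "".join(f"%{ord(ch):02X}" if ch in WHITESPACE else ch for ch in uri)
    if setting == "chop":
        first = min(i for i, ch in enumerate(uri) if ch in WHITESPACE)
        return uri[:first]
    raise ValueError(f"unknown uri_whitespace setting: {setting}")


sample = "http://www.example.com/my file.html"
for setting in ("allow", "deny", "strip", "encode", "chop"):
    print(setting, apply_uri_whitespace(sample, setting))
```

Notice that chop and strip both produce a URI that differs from the one the client intended, which may or may not exist on the origin server.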
broken_posts
Certain buggy HTTP servers expect two extra bytes, CR and LF characters, following an HTTP POST message body. It seems unlikely that such uncompliant servers are still in use today. Nonetheless, this access rule list exists to accommodate them. When a request matches a broken_posts rule, Squid appends the extra CRLF characters.
Syntax
broken_posts allow|deny [!]ACLname ...
Default
No default
Example
acl NeedsExtraCRLF dstdomain broken.server.com
broken_posts allow NeedsExtraCRLF
Related
http_access, acl
mcast_miss_addr
The multicast miss stream is a largely undocumented and unsupported Squid feature. The basic idea is to send a multicast message, containing a URI, for each cache miss. The messages are encrypted with a modest algorithm to prevent casual eavesdropping. To use this feature, you must manually define the MULTICAST_MISS_STREAM preprocessor symbol before compiling Squid. To learn more about this feature, read the source code surrounded by #if MULTICAST_MISS_STREAM in src/access_log.c.
Syntax
mcast_miss_addr multicast-address
Default
No default
Example
mcast_miss_addr 224.0.1.1
Related
mcast_miss_ttl, mcast_miss_port, mcast_miss_encode_key
mcast_miss_ttl
This is the multicast TTL assigned to outgoing miss stream messages. See the discussion of multicast TTLs in Section 10.6.3.2.
Syntax
mcast_miss_ttl N
Default
mcast_miss_ttl 16
Example
mcast_miss_ttl 32
Related
mcast_miss_addr, mcast_miss_port, mcast_miss_encode_key
mcast_miss_port
This is the UDP port number to which multicast miss stream messages are sent.
Syntax
mcast_miss_port port-number
Default
mcast_miss_port 3135
Example
mcast_miss_port 999
Related
mcast_miss_addr, mcast_miss_ttl, mcast_miss_encode_key
mcast_miss_encode_key
Squid uses the Tiny Encryption Algorithm (TEA) to encrypt multicast miss messages. This directive specifies the encryption key, which should be 128 bits long.
Syntax
mcast_miss_encode_key string
Default
mcast_miss_encode_key XXXXXXXXXXXXXXXX
Example
mcast_miss_encode_key MySekRitPassWord
Related
mcast_miss_addr, mcast_miss_ttl, mcast_miss_port
nonhierarchical_direct
A hierarchical request is one that looks like it might result in a cachable response, and therefore might be cached by one of Squid's neighbors. If your Squid doesn't have any neighbors, you don't need to worry about this directive. By default, Squid prefers to skip the neighbor selection step for nonhierarchical requests (uncachable responses) because the request probably won't result in a cache hit. You can reverse this behavior by disabling the nonhierarchical_direct directive. See Section 10.10.
Syntax
nonhierarchical_direct on|off
Default
nonhierarchical_direct on
Example
nonhierarchical_direct off
Related
prefer_direct, never_direct, always_direct
prefer_direct
This directive affects Squid's neighbor selection algorithm for hierarchical requests (cachable responses). It is only relevant if you have one or more neighbor caches. When Squid builds a list of next-hop locations for cache misses, it puts neighbor caches before the origin server by default. If you would rather have Squid put the origin server before neighbors, enable the prefer_direct directive. See Section 10.10.
Syntax
prefer_direct on|off
Default
prefer_direct off
Example
prefer_direct on
Related
nonhierarchical_direct, never_direct, always_direct
strip_query_terms
When this directive is enabled, Squid doesn't log URI query terms in access.log. This feature is intended to give your users some privacy. It is enabled by default.
Syntax
strip_query_terms on|off
Default
strip_query_terms on
Example
strip_query_terms off
Related
access_log, client_netmask
coredump_dir
Normally Squid doesn't change its current directory at startup. While this isn't usually a problem, it can be if Squid wants to leave a core-dump file. If the core file is very large, it might fill up a disk partition. Additionally, the core won't be created at all if Squid doesn't have permission to write in the current directory. This directive changes Squid's current directory. You should set it to a location that has sufficient space, and appropriate permissions, for a large core file. Note that the coredump_dir directive is used only when Squid starts up. If you change the value while Squid is running and then reconfigure, Squid doesn't change the current directory.
Syntax
coredump_dir pathname
Default
No default
Example
coredump_dir /squid/var
ignore_unknown_nameservers
Squid normally checks that DNS replies come from the same IP address to which the query was sent. If the addresses don't match, Squid writes a warning to cache.log and ignores the reply. Some installations use an /etc/resolv.conf trick to query any local name server. If the name server IP address is 0.0.0.0, DNS queries are broadcast on the local area network. The replies, however, come from specific addresses. If you want to use this trick, you must disable the ignore_unknown_nameservers directive.
Syntax
ignore_unknown_nameservers on|off
Default
ignore_unknown_nameservers on
Example
ignore_unknown_nameservers off
Related
dns_nameservers
digest_generation
This directive controls whether or not Squid generates a Cache Digest for its own contents. It is enabled by default, when you give the --enable-cache-digests option to ./configure. You may want to disable it if you know that you don't have any neighbors who request your digest.
Syntax
digest_generation on|off
Default
digest_generation on
Example
digest_generation off
Related
cache_peer, digest_bits_per_entry, digest_rebuild_period, digest_rewrite_period, digest_swapout_chunk_size, digest_rebuild_chunk_percentage
digest_bits_per_entry
This directive affects the size of Squid's Cache Digest, based on the estimate for the total number of cache entries. Reducing the size of the digest results in lower memory usage but a higher false hit probability.
Syntax
digest_bits_per_entry number
Default
digest_bits_per_entry 5
Example
digest_bits_per_entry 4
Related
digest_generation, store_avg_object_size, cache_dir
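The trade-off between digest size and false hits can be estimated with the textbook Bloom filter formula. The sketch below rests on two assumptions that this entry doesn't state: that the digest behaves like a standard Bloom filter with four hash functions, and that it holds exactly as many entries as it was sized for. Treat the numbers as rough guidance only. Both function names are invented for the example.

```python
import math


def false_hit_probability(bits_per_entry: float, hash_functions: int = 4) -> float:
    """Approximate Bloom filter false-positive rate: (1 - e^(-k/b))^k.

    k is the number of hash functions (assumed to be 4 here) and
    b is the number of digest bits allotted per cache entry.
    """
    return (1 - math.exp(-hash_functions / bits_per_entry)) ** hash_functions


def digest_size_bytes(cache_entries: int, bits_per_entry: int) -> int:
    """Digest size for a given entry count, rounded up to whole bytes."""
    return math.ceil(cache_entries * bits_per_entry / 8)


for bits in (4, 5, 8):
    size_kb = digest_size_bytes(1_000_000, bits) / 1024
    print(f"{bits} bits/entry: {size_kb:.0f} KB, "
          f"false hit probability about {false_hit_probability(bits):.1%}")
```

Under these assumptions, dropping from 5 bits to 4 bits per entry saves 20% of the digest's memory but raises the estimated false hit probability from roughly 9% to roughly 16%.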
digest_rebuild_period
The digest rebuild period is how often Squid generates the digest of its own cache. This is a fairly CPU-intensive procedure, so you don't want to run it too often. On the other hand, the digest becomes less representative of Squid's contents as more time passes.
Syntax
digest_rebuild_period time-specification
Default
digest_rebuild_period 1 hour
Example
digest_rebuild_period 4 hours
Related
digest_generation, digest_rewrite_period, digest_swapout_chunk_size, digest_rebuild_chunk_percentage
digest_rewrite_period
The digest rewrite period is how often Squid generates an on-disk cached HTTP response for its Cache Digest. This is the response sent to neighbors that request Squid's digest. In most cases digest_rewrite_period should be the same as digest_rebuild_period.
Syntax
digest_rewrite_period time-specification
Default
digest_rewrite_period 1 hour
Example
digest_rewrite_period 4 hours
Related
digest_generation, digest_rebuild_period, digest_swapout_chunk_size, digest_rebuild_chunk_percentage
digest_swapout_chunk_size
This directive controls the amount of data written to disk for each call to the digest swapout function. Squid services normal cache traffic (client requests, server responses, etc.) in between digest swapout calls. If the value is too large, Squid blocks on the disk I/O and delays normal cache traffic.
Syntax
digest_swapout_chunk_size size-specification
Default
digest_swapout_chunk_size 4 KB
Example
digest_swapout_chunk_size 16 KB
Related
digest_generation, digest_rewrite_period, digest_rebuild_chunk_percentage
digest_rebuild_chunk_percentage
This directive specifies the percentage of hash-table buckets Squid scans during each call to the digest rebuild procedure. Squid services normal cache traffic in between these calls. Since this scanning is CPU-intensive, user requests may be delayed for a small, but noticeable, amount of time. If you suspect a performance problem during the rebuild phase, decrease the digest_rebuild_chunk_percentage value.
Syntax
digest_rebuild_chunk_percentage percentage
Default
digest_rebuild_chunk_percentage 10
Example
digest_rebuild_chunk_percentage 3
Related
digest_generation, digest_rebuild_period, store_objects_per_bucket
chroot
When you specify a value for this directive, Squid passes it to the chroot() system call. This provides an extra level of security by isolating the Squid process(es) from the rest of your filesystem. See Section 5.7 for more information.
Syntax
chroot pathname
Default
No default
Example
chroot /squid
client_persistent_connections
This directive controls whether or not Squid uses persistent HTTP connections to cache clients. When disabled, Squid sends Connection: close headers in its responses to clients. If you suspect problems caused by client-side persistent connections, disable this directive.
Syntax
client_persistent_connections on|off
Default
client_persistent_connections on
Example
client_persistent_connections off
Related
server_persistent_connections, pipeline_prefetch
server_persistent_connections
This directive controls whether or not Squid uses persistent HTTP connections to origin servers and neighbors. When disabled, Squid sends Connection: close headers in forwarded requests. If you suspect problems caused by server-side persistent connections, disable this directive.
Syntax
server_persistent_connections on|off
Default
server_persistent_connections on
Example
server_persistent_connections off
Related
client_persistent_connections
pipeline_prefetch
This directive controls whether or not Squid prefetches pipelined requests. It is disabled by default, so Squid acts only on one request at a time (per connection). If you enable this directive, Squid processes up to two client requests at once. Note that the order of responses must match the order of requests. Thus, if the prefetched (second) request completes before the first, it is delayed until the first response is sent. Squid doesn't implement pipelining on the server-side. It always opens a new connection to an origin server (or neighbor) if there are no idle persistent connections.
Syntax
pipeline_prefetch on|off
Default
pipeline_prefetch off
Example
pipeline_prefetch on
Related
client_persistent_connections
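The ordering constraint is simple to model. In the following Python sketch, each pipelined request has a time at which Squid finishes fetching it, and a response can't be sent before the one ahead of it on the same connection. The function name and the use of plain seconds are choices made for the illustration. The sketch ignores the time needed to transmit each response.

```python
def response_send_times(completion_times):
    """Given per-request completion times (in request order), return the
    earliest time each response can be sent without reordering."""
    send_times = []
    previous = 0.0
    for done in completion_times:
        previous = max(done, previous)  # wait for the response ahead of us
        send_times.append(previous)
    return send_times


# The prefetched (second) request finishes first but must wait
print(response_send_times([0.8, 0.3]))  # [0.8, 0.8]
# No waiting when requests complete in order
print(response_send_times([0.2, 0.5]))  # [0.2, 0.5]
```

In the first case the client still benefits, since the second response is ready the moment the first one has been sent.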
extension_methods
HTTP (RFC 2616) allows clients and servers to use their own extension methods. If requests with nonstandard HTTP methods go through Squid, the client receives an "Invalid Request" error message. Squid also writes a cache.log entry, such as this:
2003/09/29 13:40:24| parseHttpRequest: Unsupported method 'XGET'
If you want Squid to accept such requests, you must tell it about the nonstandard methods by listing them after the extension_methods directive.
Syntax
extension_methods HTTP-method ...
Default
No default
Example
extension_methods XGET XPOST
request_entities
This directive determines how Squid handles GET and HEAD requests that have message bodies (entities). Such requests normally don't contain bodies. There is some confusion about whether or not RFC 2616 allows entities in GET/HEAD requests. Squid denies such requests by default. If you would rather have Squid accept them, enable the request_entities directive.
Syntax
request_entities on|off
Default
request_entities off
Example
request_entities on
high_response_time_warning
If you provide a nonzero value for this directive, Squid periodically checks the client-side median response time. If it's above this threshold, Squid prints a warning message in cache.log. The value is given in milliseconds.
Syntax
high_response_time_warning milliseconds
Default
high_response_time_warning 0
Example
high_response_time_warning 2000
Related
high_page_fault_warning, high_memory_warning
high_page_fault_warning
If you provide a nonzero value for this directive, Squid periodically checks the process page fault rate. Page faults generally occur when the Squid process doesn't fit entirely in memory. A moderate number of page faults can significantly degrade performance. If the one-minute average rate (page faults per second) exceeds this threshold, Squid prints a warning message in cache.log.
Syntax
high_page_fault_warning number
Default
high_page_fault_warning 0
Example
high_page_fault_warning 5
Related
high_response_time_warning, high_memory_warning
high_memory_warning
If you provide a nonzero value for this directive, Squid periodically checks process size. A large process size can lead to page faults and a significant performance degradation. Squid uses either mstats(), mallinfo(), or sbrk() to get the process size. If it exceeds the given threshold, Squid prints a warning message in cache.log.
Syntax
high_memory_warning size-specification
Default
high_memory_warning 0
Example
high_memory_warning 400 MB
Related
high_response_time_warning, high_page_fault_warning
ie_refresh
In Section 9.2, I explained that Internet Explorer versions prior to 5.5 SP1 have a bug that makes them unable to force a validation of cached responses when using HTTP interception. This directive provides a partial workaround for the bug. When enabled, Squid pretends that the request contains a no-cache directive. Thus, Squid always forwards these requests on to the origin server or a neighbor. Note this affects only requests that meet the following requirements:
● The User-Agent header indicates Internet Explorer Version 3, 4, 5.0, or 5.01.
● The If-Modified-Since header is present.
● The request contains a partial URI because it was intercepted (see Chapter 9) or Squid is a surrogate (see Chapter 15).
Squid versions prior to 2.5.STABLE3 contain a bug related to this feature. Although Squid behaves as though the client's request contains a no-cache directive, it doesn't add that directive to the outgoing request. This is a problem if you have one or more neighbor caches. Because the request received by the neighbor doesn't contain a no-cache directive, it may decide to return a cache hit, rather than forward it on to the origin server. Later versions include the no-cache directive so that such requests should always reach the origin server.
Syntax
ie_refresh on|off
Default
ie_refresh off
Example
ie_refresh on
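The three requirements listed above combine into a single test, which the following Python sketch spells out. It is an approximation for illustration only. The regular expression for spotting the browser version, the plain dictionary of headers, and the rule that a partial URI is one starting with a slash are all assumptions of this example and may differ from what Squid's source code actually does.

```python
import re

# Matches version tokens such as "MSIE 5.01" in a User-Agent string
MSIE_VERSION = re.compile(r"MSIE (\d+)\.(\d+)")


def ie_refresh_applies(headers: dict, request_uri: str) -> bool:
    """Return True if a request meets all three ie_refresh requirements."""
    match = MSIE_VERSION.search(headers.get("User-Agent", ""))
    if not match:
        return False
    major, minor = match.group(1), match.group(2)
    # Internet Explorer Version 3, 4, 5.0, or 5.01
    old_ie = major in ("3", "4") or (major == "5" and minor in ("0", "01"))
    has_ims = "If-Modified-Since" in headers
    partial_uri = request_uri.startswith("/")  # intercepted or surrogate request
    return old_ie and has_ims and partial_uri


request_headers = {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)",
    "If-Modified-Since": "Mon, 29 Sep 2003 13:40:24 GMT",
}
print(ie_refresh_applies(request_headers, "/index.html"))                     # True
print(ie_refresh_applies(request_headers, "http://example.com/index.html"))   # False
```

The second call returns False because a full URI means the browser was explicitly configured to use a proxy, in which case the bug doesn't come into play.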
vary_ignore_expire
When certain HTTP/1.1 origin servers receive an HTTP/1.0 request (e.g., from Squid), and the response contains a Vary header, they also add an Expires header set to the current time. This is to prevent HTTP/1.0 caches, which may not understand the Vary header, from incorrectly reusing a cached response. Squid understands and implements the Vary header but still sends the string "HTTP/1.0" in its requests. You'll need to enable this directive if you want to get cache hits from responses with Vary and with Expires equal to Date. This directive is somewhat dangerous because the origin server may have its own reasons (other than maintaining backward compatibility) for setting the Expires header.
Syntax
vary_ignore_expire on|off
Default
vary_ignore_expire off
Example
vary_ignore_expire on
sleep_after_fork
Squid uses the fork() system call to spawn helper processes, such as redirectors, authenticators, and DNS resolvers. On some systems, a rapid sequence of fork() calls consumes all available real and virtual memory. Thus, a fork() call may fail with an "out of memory" error. Note that this isn't necessarily a fatal error. Squid continues running as long as at least 50% of helper processes are successfully started. To alleviate this problem, you can instruct Squid to sleep for a small amount of time after each fork() call. This gives the recently forked process time to complete its exec() call and free up the memory. Don't set this value too high, especially if you have a large number of helper processes. Squid doesn't service any client requests until all helpers have been started.
Syntax
sleep_after_fork microseconds
Default
sleep_after_fork 0
Example
sleep_after_fork 10000
Appendix B. The Memory Cache Squid st ores som e of it s recent ly ret rieved obj ect s fully in m em ory. As you m ight expect , serving obj ect s from m em ory is generally fast er t han reading t he dat a from t he disk. I n som e places, Squid calls t his t he hot obj ect cache. The cache_m em direct ive specifies how m uch m em ory Squid should use for in- m em ory obj ect s. I usually recom m end set t ing cache_m em t o a sm all size, such as som et hing bet ween 8 and 32 MB. I f you happen t o have t ons of ext ra m em ory, you can set it higher. I n m ost cases, however, your ext ra m em ory is bet t er used by increasing your disk cache size ( see Sect ion 7.1.3.2) . Many people m isunderst and t he cache_m em direct ive. They expect it t o lim it t he t ot al am ount of m em ory t hat Squid uses. Unfort unat ely, for t hem , t his assum pt ion is incorrect . Squid doesn't have a direct ive t hat lim it s t ot al m em ory consum pt ion. See Sect ion 7.1.3.2 and Sect ion 16.1.8. The current version of Squid ( 2.5) st ores obj ect s in m em ory only if t hey com e from t he net work ( origin server or neighbor cache) . I f Squid reads an obj ect from disk, it doesn't also st ore it in m em ory. Older versions of Squid had t hat funct ionalit y. However, it was rem oved during a m aj or rewrit e t o sim plify t he source code. Only obj ect s sm aller t han a cert ain size are held in m em ory. The m axim um _obj ect _size_in_m em ory direct ive cont rols t his set t ing. I t s default value is 8 KB, which is t ypically large enough t o fit m ore t han half of all responses Squid receives. This direct ive also lim it s t he am ount of m em ory used for each cache m iss as t he response is being received. I f you have a high request rat e but are low on m em ory, you m ay want t o lower t his value t o 4 KB. Squid allocat es m em ory for obj ect dat a in 4- KB chunks. Thus, it m akes sense t o assign t his direct ive a m ult iple of 4 KB. 
Ot her values end up wast ing m em ory. I n- m em ory obj ect s fall int o one of t wo groups: in- t ransit or com plet e. Squid uses t he m em ory cache for bot h t ypes. Com plet e obj ect s are held in m em ory only if t here is som e free space. They have lower priorit y t han in- t ransit obj ect s. I f your cache is busy, t he m em ory cache m ay cont ain not hing but in- t ransit obj ect s ( or, m axim um _obj ect _size_in_m em ory chunks of int ransit obj ect s, act ually) . Furt herm ore, Squid always allocat es m em ory for in- t ransit obj ect s, even if it m ust exceed t he cache_m em lim it . When an in- t ransit obj ect becom es a com plet e obj ect , it is kept in m em ory only if t he m em ory cache size is below t he lim it . The m em ory_replacem ent _policy direct ive is analogous t o replacem ent _policy. I t cont rols t he replacem ent policy for obj ect s cached in m em ory. Because t he m em ory cache is t ypically m uch sm aller t han t he disk cache, your choice of replacem ent policy m ay have a bigger im pact . See Sect ion 7.5 for a descript ion of available replacem ent policies. < Day Day Up >
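To make the 4-KB allocation granularity concrete, here is a minimal Python sketch. It assumes only what the text states (object data is allocated in 4-KB chunks). The function name is hypothetical, and this is not Squid's actual allocator.

```python
import math

CHUNK = 4096  # object data is allocated in 4-KB chunks, per the text


def memory_for_object(size_bytes: int) -> int:
    """Bytes of chunk memory needed to hold size_bytes of object data."""
    return math.ceil(size_bytes / CHUNK) * CHUNK


# Show how much of the last chunk goes unused for a few object sizes.
for size in (1000, 4096, 6000, 8192):
    used = memory_for_object(size)
    print(f"{size:5d} bytes -> {used:5d} bytes allocated, {used - size} unused")
```

Under this model, an object that is not an exact multiple of 4 KB always leaves part of its last chunk unused.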
Appendix C. Delay Pools

Delay pools are Squid's answer to rate limiting and traffic shaping. They work by limiting the rate at which Squid returns data for cache misses. Cache hits are sent as quickly as possible, under the assumption that local bandwidth is plentiful.

Delay pools were written by David Luyer while at the University of Western Australia. The feature was designed for a LAN environment in which different groups of users (for example, students, instructors, and staff) are on different subnets. You'll see some evidence of this in the following descriptions.
C.1 Overview

The delay pools are essentially "bandwidth buckets." A response is delayed until some amount of bandwidth is available from an appropriate bucket. The buckets don't actually store bandwidth (e.g., 100 Kbit/s), but rather some amount of traffic (e.g., 384 KB). Squid adds some amount of traffic to the buckets each second. Cache clients take some amount of traffic out when they receive data from an upstream source (origin server or neighbor).

The size of a bucket determines how much burst bandwidth is available to a client. If a bucket starts out full, a client can take as much traffic as it needs until the bucket becomes empty. The client then receives traffic allotments at the fill rate.

The mapping between Squid clients and actual buckets is a bit complicated. Squid uses three different constructs to do it: access rules, delay pool classes, and types of buckets. First, Squid checks a client request against the delay_access list. If the request is a match, it points to a particular delay pool. Each delay pool has a class: 1, 2, or 3. The classes determine which types of buckets are in use. Squid has three types of buckets: aggregate, individual, and network:

● A class 1 pool has a single aggregate bucket.
● A class 2 pool has an aggregate bucket and 256 individual buckets.
● A class 3 pool has an aggregate bucket, 256 network buckets, and 65,536 individual buckets.

As you can probably guess, the individual and network buckets correspond to IP address octets. In a class 2 pool, the individual bucket is determined by the last octet of the client's IPv4 address. In a class 3 pool, the network bucket is determined by the third octet, and the individual bucket by the third and fourth octets.

For the class 2 and 3 delay pools, you can disable buckets you don't want to use. For example, you can define a class 2 pool with only individual buckets by disabling the aggregate bucket.

When a request goes through a pool with more than one bucket type, it takes bandwidth from all buckets. For example, consider a class 3 pool with aggregate, network, and individual buckets. If the individual bucket has 20 KB, the network bucket 30 KB, but the aggregate bucket only 2 KB, the client receives only a 2-KB allotment. Even though some buckets have plenty of traffic, the client is limited by the bucket with the smallest amount.
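The octet-to-bucket mapping and the "smallest bucket wins" rule can be sketched in a few lines of Python. This is only an illustration of the description above, not Squid's implementation. The function names and the dictionary layout are hypothetical.

```python
import ipaddress

KB = 1024


def bucket_keys(client_ip: str, pool_class: int) -> dict:
    """Map an IPv4 client address to the bucket indexes used by a pool class."""
    if pool_class not in (1, 2, 3):
        raise ValueError("delay pool class must be 1, 2, or 3")
    octets = ipaddress.IPv4Address(client_ip).packed  # the four octets as bytes
    keys = {"aggregate": 0}                            # one shared bucket per pool
    if pool_class == 2:
        keys["individual"] = octets[3]                 # last octet: 256 buckets
    elif pool_class == 3:
        keys["network"] = octets[2]                    # third octet: 256 buckets
        keys["individual"] = octets[2] * 256 + octets[3]  # 65,536 buckets
    return keys


def allotment(levels: dict) -> int:
    """A request is limited by the emptiest bucket it passes through."""
    return min(levels.values())


print(bucket_keys("192.168.8.17", 2))
print(bucket_keys("192.168.8.17", 3))
print(allotment({"individual": 20 * KB, "network": 30 * KB, "aggregate": 2 * KB}))
```

The last call reproduces the example from the text: with 20 KB, 30 KB, and 2 KB available, the client gets a 2-KB allotment.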
C.2 Configuring Squid

Before you can use delay pools, you must enable the feature when compiling. Use the --enable-delay-pools option when running ./configure. You can then use the following directives to set up the delay pools.
C.2.1 delay_pools

The delay_pools directive tells Squid how many pools you want to define. It should go before any other delay pool configuration directives in squid.conf. For example, if you want to have five delay pools:

    delay_pools 5

The next two directives actually define each pool's class and other characteristics.
C.2.2 delay_class

You must use this directive to define the class for each pool. For example, if the first pool is class 3:

    delay_class 1 3

Similarly, if the fourth pool is class 2:

    delay_class 4 2

In theory, you should have one delay_class line for each pool. However, if you skip or omit a particular pool, Squid doesn't complain.
C.2.3 delay_parameters

Finally, this is where you define the interesting delay pool parameters. For each pool, you must tell Squid the fill rate and maximum size for each type of bucket. The syntax is:

    delay_parameters N rate/size [rate/size [rate/size]]

The rate value is given in bytes per second, and size in total bytes. If you think of rate in terms of bits per second, you must remember to divide by 8.

Note that if you divide the size by the rate, you'll know how long (in seconds) it takes the bucket to go from empty to full when there are no clients using it.

A class 1 pool has just one bucket and might look like this:

    delay_class 2 1
    delay_parameters 2 2000/8000

For a class 2 pool, the first bucket is the aggregate, and the second is the group of individual buckets. For example:

    delay_class 4 2
    delay_parameters 4 7000/15000 3000/4000

Similarly, for a class 3 pool, the aggregate bucket is first, the network buckets are second, and the individual buckets are third:

    delay_class 1 3
    delay_parameters 1 7000/15000 3000/4000 1000/2000
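The two pieces of arithmetic mentioned here (dividing by 8, and dividing size by rate) are easy to get wrong by hand, so a small Python sketch may help. The helper names are hypothetical. The conversion takes 1 Kbit as 1024 bits, which matches the way this appendix writes 512 Kbit/s as 65536 bytes per second.

```python
def kbit_to_bytes_per_sec(kbit_per_sec: float) -> int:
    """Convert Kbit/s to bytes/s, assuming 1 Kbit = 1024 bits."""
    return int(kbit_per_sec * 1024 / 8)


def seconds_to_fill(rate: int, size: int) -> float:
    """Time for an unused bucket to go from empty to full."""
    return size / rate


def delay_parameters_line(pool: int, *buckets: tuple) -> str:
    """Format a delay_parameters line from (rate, size) pairs."""
    pairs = " ".join(f"{rate}/{size}" for rate, size in buckets)
    return f"delay_parameters {pool} {pairs}"


print(kbit_to_bytes_per_sec(512))
print(seconds_to_fill(2000, 8000))
print(delay_parameters_line(4, (7000, 15000), (3000, 4000)))
```

For the class 1 example above (2000/8000), the bucket refills from empty in 4 seconds.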
C.2.4 delay_initial_bucket_level

This directive sets the initial level for all buckets when Squid first starts or is reconfigured. It also applies to individual and network buckets, which aren't created until first referenced. The value is a percentage. For example:

    delay_initial_bucket_level 75%

In this case, each newly created bucket is initially filled to 75% of its maximum size.
C.2.5 delay_access

This list of access rules determines which requests go through which delay pools. Requests that are allowed go through the delay pools, while those that are denied aren't delayed at all. If you don't have any delay_access rules, Squid doesn't delay any requests.

The syntax for delay_access is similar to the other access rule lists (see Section 6.2), except that you must put a pool number before the allow or deny keyword. For example:

    delay_access 1 allow TheseUsers
    delay_access 2 allow OtherUsers

Internally, Squid stores a separate access rule list for each delay pool. If a request is allowed by a pool's rules, Squid uses that pool and stops searching. If a request is denied, however, Squid continues examining the rules for remaining pools. In other words, a deny rule causes Squid to stop searching the rules for a single pool but not for all pools.
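The search order described above can be modeled with a short Python sketch. It is a simplified model, not Squid's code: ACLs are stood in for by plain predicate functions, pools are assumed to be examined in ascending pool number, and a pool in which no rule matches is assumed to be skipped. All names are hypothetical.

```python
from typing import Callable, Optional

Rule = tuple[str, Callable[[dict], bool]]  # ("allow" or "deny", ACL predicate)


def select_pool(request: dict, pools: dict[int, list[Rule]]) -> Optional[int]:
    """Return the first pool whose rules allow the request, or None."""
    for pool_id in sorted(pools):            # assumption: ascending pool order
        for action, acl_matches in pools[pool_id]:
            if acl_matches(request):
                if action == "allow":
                    return pool_id           # use this pool and stop searching
                break                        # denied: give up on this pool only
    return None                              # no pool matched: not delayed


# Hypothetical ACLs, modeled as predicates on the client address.
these_users = lambda req: req["src"].startswith("10.1.")
other_users = lambda req: req["src"].startswith("10.2.")

pools = {
    1: [("deny", other_users), ("allow", these_users)],
    2: [("allow", other_users)],
}

for src in ("10.1.0.5", "10.2.0.5", "10.3.0.5"):
    print(src, "->", select_pool({"src": src}, pools))
```

The second client is denied by pool 1 but still lands in pool 2, which is the behavior the last paragraph describes.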
C.2.6 cache_peer no-delay Option

The cache_peer directive has a no-delay option. If set, it makes Squid bypass the delay pools for any requests sent to that neighbor.
C.3 Examples

Let's start off with a simple example. Suppose that you have a saturated Internet connection, shared by many users. You can use delay pools to limit the amount of bandwidth that Squid consumes on the link, thus leaving the remaining bandwidth for other applications. Use a class 1 delay pool to limit the bandwidth for all users. For example, this limits everyone to 512 Kbit/s and keeps 1 MB in reserve if Squid is idle:

    delay_pools 1
    delay_class 1 1
    delay_parameters 1 65536/1048576
    acl All src 0/0
    delay_access 1 allow All

One of the problems with this simple approach is that some users may receive more than their fair share of the bandwidth. If you want to try something more balanced, use a class 2 delay pool that has individual buckets. Recall that the individual bucket is determined by the fourth octet of the client's IPv4 address. Thus, if you have more than a /24 subnet, you might want to use a class 3 pool instead, which gives you 65,536 individual buckets. In this example, I won't use the network buckets. While the overall bandwidth is still 512 Kbit/s, each individual is limited to 128 Kbit/s:

    delay_pools 1
    delay_class 1 3
    delay_parameters 1 65536/1048576 -1/-1 16384/262144
    acl All src 0/0
    delay_access 1 allow All

You can also use delay pools to provide different classes of service. For example, you might have important users and unimportant users. In this case, you could use two class 1 delay pools. Give the important users a higher bandwidth limit than everyone else:

    delay_pools 2
    delay_class 1 1
    delay_class 2 1
    delay_parameters 1 65536/1048576
    delay_parameters 2 10000/50000
    acl ImportantUsers src 192.168.8.0/22
    acl All src 0/0
    delay_access 1 allow ImportantUsers
    delay_access 2 allow All
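To get a feel for how the first example behaves, here is a toy Python simulation of a single aggregate bucket with the 65536/1048576 parameters. It is a sketch under stated assumptions: time advances in one-second ticks, the client takes traffic before the bucket is refilled, and a single greedy client wants far more than the bucket holds. The class and function names are hypothetical and do not come from Squid.

```python
class Bucket:
    """A traffic bucket with a fill rate (bytes/sec) and a maximum size (bytes)."""

    def __init__(self, rate: int, size: int, initial_pct: int = 100):
        self.rate = rate
        self.size = size
        self.level = size * initial_pct // 100  # like delay_initial_bucket_level

    def refill(self) -> None:
        """Add one second of traffic, never exceeding the maximum size."""
        self.level = min(self.size, self.level + self.rate)

    def take(self, wanted: int) -> int:
        """Remove up to `wanted` bytes and return how many were granted."""
        granted = min(wanted, self.level)
        self.level -= granted
        return granted


def simulate(bucket: Bucket, seconds: int, demand: int) -> list:
    """Bytes delivered in each one-second tick to a client wanting `demand`."""
    delivered = []
    for _ in range(seconds):
        delivered.append(bucket.take(demand))
        bucket.refill()
    return delivered


bucket = Bucket(rate=65536, size=1048576)
print(simulate(bucket, 4, demand=10**9))
```

In this model the client bursts through the full 1 MB in the first tick and is then held to the 65536-byte fill rate.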
C.4 Issues

Squid's delay pools are often useful, but not perfect. You need to be aware of a few drawbacks and limitations before you use them.
C.4.1 Fairness

One of the most important things to realize about the current delay pools implementation is that it does nothing to guarantee fairness among all users of a single bucket. This is especially important for aggregate buckets (where sharing is high), but less so for individual buckets (where sharing is low).

Squid generally services requests in order of increasing file descriptors. Thus, a request whose server-side TCP connection has a lower file descriptor may receive more bandwidth from a shared bucket than it should.
C.4.2 Application Versus Transport Layer

Bandwidth shaping and rate limiting usually operate at the network transport layer. There, the flow of packets can be controlled very precisely. Delay pools, however, are implemented in the application layer. Because Squid doesn't actually send and receive TCP packets (the kernel does), it has less control over the flow of individual packets. Rather than controlling the transmission and receipt of packets on the wire, Squid controls only how many bytes to read from the kernel.

This means, for example, that incoming response data is queued up in the kernel. The TCP/IP stack can buffer some number of bytes that haven't yet been read by Squid. On most systems, the default TCP receive buffer size is between 32 KB and 64 KB. In other words, this much data can arrive over the network very quickly, regardless of anything Squid can do. On the one hand, it seems silly to read this data slowly even though it is already on your system. On the other hand, because the client doesn't receive the whole response right away, it is likely to postpone any future requests until the delayed responses are complete.

If you are concerned that the kernel buffers too much server-side data, you can decrease the TCP receive buffer size with the tcp_recv_bufsize directive. Even better, your operating system probably has a way to set this parameter for the whole system. On NetBSD/FreeBSD/OpenBSD, you can use the sysctl variable named net.inet.tcp.recvspace. For Linux, read about /proc/sys/net/ipv4/tcp_rmem in Documentation/networking/ip-sysctl.txt.
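If you want to see what receive buffer size your operating system hands to a new TCP socket, a few lines of Python are enough. This is just a way to inspect the kernel default from user space. It has nothing to do with Squid's own code, and the function name is hypothetical.

```python
import socket


def default_tcp_recv_buffer() -> int:
    """Return the receive buffer size (bytes) the OS assigns to a new TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


print(f"default TCP receive buffer: {default_tcp_recv_buffer()} bytes")
```

The number reported is the amount of server-side data that can pile up in the kernel per connection before Squid's delay pools have any say in the matter.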
C.4.3 Fixed Subnetting Scheme

The current delay pools implementation assumes that your LAN uses /24 (class C) subnets, and that all users are in the same /16 (class B) subnet. This might not be so bad, depending on how your network is configured. However, it would be nice if the delay pools subnetting scheme were fully customizable.

If your address space is larger than a /24 and smaller than a /16, you can always create a class 3 pool and treat it as a class 2 pool (that is one of the examples given earlier).

If you use just one class 2 pool with more than 256 users, some users will share the individual buckets. That might not be so bad, unless you happen to have a bunch of heavy users fighting over one measly bucket.

You might also create multiple class 2 pools and use delay_access rules to divide them up among all users. The problem with this approach is that you can't have all users share a single aggregate bucket. Instead, each subgroup has its own aggregate bucket. You can't make a single client go through more than one delay pool.
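The bucket sharing that happens when a class 2 pool serves more than a /24 can be shown with a small Python sketch. It relies only on the rule that a class 2 individual bucket is chosen by the last octet. The client addresses are made up for illustration and the function name is hypothetical.

```python
import ipaddress
from collections import defaultdict


def shared_buckets(clients: list) -> dict:
    """Group clients by class 2 individual bucket; keep buckets with 2+ clients."""
    by_bucket = defaultdict(list)
    for ip in clients:
        last_octet = ipaddress.IPv4Address(ip).packed[3]
        by_bucket[last_octet].append(ip)
    return {bucket: ips for bucket, ips in by_bucket.items() if len(ips) > 1}


# Illustrative clients spread over three /24 subnets.
clients = ["10.0.1.5", "10.0.2.5", "10.0.1.6", "10.0.3.5"]
print(shared_buckets(clients))
```

Here three clients on different subnets end up competing for individual bucket 5, while the fourth has bucket 6 to itself.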
C.5 Monitoring Delay Pools

You can monitor the delay pool levels with the cache manager interface. Request the delay page from the CGI interface or with the squidclient utility:

    % squidclient mgr:delay | less

See Section 14.2.1.44 for a description of the output.
Appendix D. Filesystem Performance Benchmarks

You have a myriad of choices to make when installing and configuring Squid, especially when it comes to the way Squid stores files on disk. Back in Chapter 8, I talked about the various filesystems and storage schemes. Here, I'll provide some hard data on their relative performance.

These tests were done with Web Polygraph, a freely available, high-performance tool for benchmarking HTTP intermediaries (http://www.web-polygraph.org/). Over the course of many months, I ran approximately 40 different tests on 5 different operating systems.
D.1 The Benchmark Environment

The primary purpose of these benchmarks is to provide a number of measurements that allow you to compare different Squid configurations and features. In order to produce comparable results, I've taken care to minimize any differences between systems being tested.
D.1.1 Hardware for Squid

I used five identical computer systems, one for each of the following operating systems: FreeBSD, Linux, NetBSD, OpenBSD, and Solaris. The boxes are IBM Netfinity servers with one 500-MHz PIII CPU, 1 GB of RAM, an Intel fast-Ethernet NIC, and three 8-GB SCSI disk drives. I realize that these aren't particularly powerful machines by today's standards, but they are good enough for these tests. Anyway, it is more important that they be identical than powerful.

The requirement to use identical hardware means that I can't generate comparable results for other hardware platforms, such as Sun, Digital/Compaq/HP, and others.
D.1.2 Squid Version and Configuration

Except for the coss tests, all results are from Squid Version 2.5.STABLE2. The coss results are from a patched version of 2.5.STABLE3. Those patches have been committed to the source tree for inclusion into 2.5.STABLE4.

Unless otherwise specified, I used only the --enable-storeio option when running ./configure before compiling Squid. For example:

    % ./configure --enable-storeio=diskd,ufs,null,coss

In all cases, Squid is configured to use 7500 MB of each 8.2-GB disk. This is a total cache size of 21.5 GB. Additionally, access.log and store.log have been disabled in the configuration file. Here is a sample squid.conf file:

    visible_hostname linux-squid.bench.tst
    acl All src 0/0
    http_access allow All
    cache_dir aufs /cache0 7500 16 256
    cache_dir aufs /cache1 7500 16 256
    cache_dir aufs /cache2 7500 16 256
    cache_effective_user nobody
    cache_effective_group nobody
    cache_access_log /dev/null
    cache_store_log none
    logfile_rotate 0
D.1.3 Web Polygraph Workload

All the tests in this appendix use the same Polygraph workload file.[1] Meeting this requirement was, perhaps, the hardest part of running these tests. Normally, the desired throughput is a configuration parameter in a Polygraph workload. However, because the sustainable throughput is different for each configuration, my colleague Alex Rousskov and I developed a workload that can be used for all tests.[2] We call this the "peak finder" workload because it finds the peak throughput for a device under test.

[1] Except for the number-of-spindles tests, in which the cache size depends on the number of disks in use.

[2] You can download this workload at http://squidbook.org/extras/pf2-pm4.pg.txt.

The name "peak finder" is somewhat misleading because, at least in Squid's case, sustainable throughput decreases over time. The workload is designed to periodically adjust the offered load (throughput) subject to response time requirements. If the measured response time is below a given threshold, Polygraph increases the load. If response time is above the threshold, it decreases the load. Thus, at any point in time during the test, we know the maximum throughput that still satisfies the response time requirements.

In order to reach a steady-state condition, the test runs until the cache has been filled twice. Polygraph knows the total cache size (21.5 GB) and keeps track of the amount of fill traffic pulled into the cache. These are responses that are cachable but not cache hits. The test duration, then, depends on the sustainable throughput. When the throughput is low, the test takes longer to complete. Some of these tests took more than 10 days to run.
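The load adjustment rule can be written down as a tiny feedback function. The Python sketch below only illustrates the idea described above. It is not taken from the Polygraph workload file, and the 10% step size, the floor value, and the function name are all assumptions.

```python
def adjust_load(load: float, response_time: float, threshold: float,
                step: float = 0.1, floor: float = 1.0) -> float:
    """Return the next offered load (requests/sec) given the measured response time.

    step and floor are illustrative values, not Polygraph's actual parameters.
    """
    if response_time < threshold:
        return load * (1 + step)              # fast enough: push harder
    if response_time > threshold:
        return max(floor, load * (1 - step))  # too slow: back off
    return load                               # exactly on target: hold steady


# A made-up series of measured response times (seconds) against a 1.5 s threshold.
load = 100.0
for measured in (1.2, 1.3, 1.8, 1.4):
    load = adjust_load(load, measured, threshold=1.5)
    print(f"response time {measured:.1f}s -> offered load {load:.1f}/sec")
```

Repeating this adjustment throughout the test is what lets the workload track the highest throughput that still meets the response time requirement.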
D.2 General Comments

I show, for each test, how the sustainable throughput varies over time. The y-axis shows the throughput (responses per second). The x-axis is the ratio of fill-traffic volume to cache size. Because each test takes a different amount of time, this is a nice way to normalize all the results. The test is over when the cache has been filled twice.

In most traces, you'll see that sustainable throughput decreases over time. At the beginning of the test, the throughput is very high. Here, the disks are empty, and Squid doesn't need to replace old objects. The throughput for a full cache is usually worse than for an empty cache. This is a common characteristic of proxy benchmarks and emphasizes the importance of reaching steady-state conditions. Don't be fooled by impressive results from short tests.

The Throughput, Response Time, and Hit Ratio values given in the summary tables are taken from the last 25% of the test. Here, between 1.5 and 2.0 on the x-axis, the throughput is more or less stable and flat. I report the mean of the throughput, response time, and hit ratio values in this range from the trace data.

Throughput is the most interesting metric in these tests. It is given in responses per second. The rows in each summary table are sorted by throughput.

The response time numbers are less interesting because they are all about the same. I decided to report them to show that, indeed, the results stay within the response time window defined by the workload. The target response time is around 1.5 seconds, but the actual response time varies depending on the particular test.

The response hit ratio values are also not particularly interesting. The ideal hit ratio for this workload is about 58%. Due to an as-yet unresolved Polygraph bug, however, the hit ratio decreases slightly as the test progresses.

Keep in mind that these results are meant to demonstrate the relative performance of different options, rather than the absolute values. You'll get different numbers if you repeat the tests on different hardware.
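The way the summary values are derived (the mean over the last quarter of the test, between 1.5 and 2.0 on the x-axis) can be sketched in Python as follows. The trace format, the function name, and the sample numbers are invented for illustration. They are not real measurements from these tests.

```python
from statistics import mean


def steady_state_mean(trace, lo: float = 1.5, hi: float = 2.0) -> float:
    """Mean of the values whose fill ratio lies in [lo, hi].

    trace is an iterable of (fill_ratio, value) pairs, where fill_ratio is
    fill-traffic volume divided by cache size (the x-axis of the figures).
    """
    window = [value for fill_ratio, value in trace if lo <= fill_ratio <= hi]
    return mean(window)


# Invented sample trace: throughput starts high and flattens out.
trace = [(0.25, 300.0), (1.0, 150.0), (1.5, 112.0), (1.75, 110.0), (2.0, 108.0)]
print(steady_state_mean(trace))
```

Only the last three samples fall inside the window, so the early, misleadingly high throughput does not affect the reported figure.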
D.3 Linux

Linux is obviously a popular choice for Squid. It supports a wide variety of filesystems and storage schemes. These results come from Linux kernel Version 2.4.19 (released August 2, 2002) with SGI's XFS patches Version 1.2.0 (released Feb 11, 2003) and ReiserFS Version 3.6.25.

The kernel's file descriptor limit is set to 8192. I used this command to configure Squid before compiling:

    % ./configure --enable-storeio=diskd,ufs,aufs,null,coss --with-aufs-threads=32

The Linux results are summarized in Table D-1, and Figure D-1 shows the traces. You can see that coss is the best performer, with aufs coming in second and diskd third. As I'm writing this, coss is an experimental feature and not necessarily suitable for a production system. In the long run, you'll probably be better off with aufs.
Table D-1. Linux benchmarking results

Storage scheme  Filesystem  Mount options    Throughput (xact/sec)  Response time (sec)  Hit ratio (%)
coss            -           -                326.3                  1.59                 53.9
aufs(1)         ext2fs      noatime          168.5                  1.45                 56.3
diskd(1)        ext2fs      noatime          149.4                  1.53                 56.1
aufs(2)         ext2fs      -                110.0                  1.46                 55.6
ufs(1)          ext2fs      -                54.9                   1.52                 55.6
ufs(2)          ext3fs      -                48.4                   1.49                 56.8
ufs(3)          xfs         -                40.7                   1.54                 55.3
ufs(4)          reiserfs    notail, noatime  29.7                   1.55                 55.0
ufs(5)          reiserfs    -                21.4                   1.55                 55.1

Figure D-1. Linux filesystem benchmarking traces
Note that the noatime option gives a significant boost in performance to aufs. The throughput jumps from 110 to 168 transactions per second with the addition of this mount option. Linux also has an async option, but it is enabled by default. I did not run any tests with async disabled.

Of the many filesystem choices, ext2fs seems to give the best performance. ext3fs (ext2 plus journaling) is only slightly lower, followed by xfs and reiserfs.
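To put the noatime improvement in relative terms, the short Python sketch below computes the percentage gain from the two aufs throughput figures in Table D-1 (110.0 and 168.5 transactions per second). The helper name is hypothetical.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, as a percentage."""
    return (after - before) / before * 100


# aufs on ext2fs, without and with the noatime mount option (Table D-1).
print(f"{percent_gain(110.0, 168.5):.1f}% more throughput with noatime")
```

The same helper can be applied to any pair of rows in the summary tables to compare mount options or storage schemes.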
D.4 FreeBSD

FreeBSD is another popular Squid platform, and my personal favorite. Table D-2 and Figure D-2 summarize the results for FreeBSD. Again, coss exhibits the highest throughput, followed by diskd. The aufs storage scheme doesn't currently run on FreeBSD.

These results come from FreeBSD Version 4.8-STABLE (released April 3, 2003). I built a kernel with the following noteworthy options:

    options MSGMNB=16384
    options MSGMNI=41
    options MSGSEG=2049
    options MSGSSZ=64
    options MSGTQL=512
    options SHMSEG=16
    options SHMMNI=32
    options SHMMAX=2097152
    options SHMALL=4096
    options MAXFILES=8192
    options NMBCLUSTERS=32768
    options VFS_AIO
Table D-2. FreeBSD benchmarking results

Storage scheme  Filesystem  Mount options               Throughput (xact/sec)  Response time (sec)  Hit ratio (%)
coss            -           -                           330.7                  1.58                 54.5
diskd(1)        UFS         async, noatime, softupdate  129.0                  1.58                 54.1
diskd(2)        UFS         -                           77.4                   1.47                 56.2
ufs(1)          UFS         async, noatime, softupdate  38.0                   1.49                 56.8
ufs(2)          UFS         noatime                     31.1                   1.54                 55.0
ufs(3)          UFS         async                       30.2                   1.51                 55.9
ufs(4)          UFS         softupdate                  29.9                   1.51                 55.7
ufs(5)          UFS         -                           24.4                   1.50                 56.4

Figure D-2. FreeBSD filesystem benchmarking traces
Enabling the async, noatime, and softupdate[3] options boosts the standard ufs performance from 24 to 38 transactions per second. However, using one of the other storage schemes increases the sustainable throughput even more.

[3] On FreeBSD, softupdates aren't a mount option, but must be set with the tunefs command.

FreeBSD's diskd performance (129/sec) isn't quite as good as on Linux (169/sec), perhaps because Linux's underlying filesystem (ext2fs) is better.

Note that the trace for coss is relatively flat. Its performance doesn't change much over time. Furthermore, both FreeBSD and Linux report similar throughput numbers: 331/sec and 326/sec, respectively. This leads me to believe that the disk system isn't a bottleneck in these tests. In fact, the test with no disk cache (see Section D.8) achieves essentially the same throughput (332/sec).
D.5 OpenBSD

The results in this section are from OpenBSD Version 3.3 (released May 1, 2003). I built a kernel with the following notable configuration options:

    option MSGMNB=8192
    option MSGMNI=40
    option MSGSEG=512
    option MSGSSZ=64
    option MSGTQL=2048
    option SHMSEG=16
    option SHMMNI=32
    option SHMMAX=2048
    option SHMALL=4096
    option NMBCLUSTERS=32768
    option MAXFILES=8192
Table D-3 and Figure D-3 summarize the OpenBSD results. The choices for OpenBSD are similar to those for FreeBSD. Unfortunately, however, coss doesn't run on OpenBSD, which lacks the aio_read() and aio_write() functions.
Table D-3. OpenBSD benchmarking results

Storage scheme  Filesystem  Mount options               Throughput (xact/sec)  Response time (sec)  Hit ratio (%)
diskd(1)        UFS         async, noatime, softupdate  91.1                   1.45                 56.3
diskd(2)        UFS         -                           63.7                   1.44                 56.2
ufs(1)          UFS         softupdate                  27.6                   1.51                 56.3
ufs(2)          UFS         noatime                     25.1                   1.52                 56.3
ufs(3)          UFS         -                           22.7                   1.52                 56.1
ufs(4)          UFS         async                       22.1                   1.51                 56.6

Figure D-3. OpenBSD filesystem benchmarking traces
In general, the OpenBSD results are slightly worse than FreeBSD. This isn't too surprising, given that the OpenBSD project emphasizes security and perhaps spends less time on filesystem performance. One odd result is that using the async option (alone) caused a slight decrease in performance for the ufs storage scheme.
D.6 NetBSD

These results come from NetBSD Version 1.6.1 (released April 21, 2003). Table D-4 and Figure D-4 summarize the NetBSD results. NetBSD actually performs almost the same as OpenBSD. The best configuration yields about 90 transactions per second. Unfortunately, NetBSD doesn't support coss or aufs. I built a custom kernel with these options:

    options NMBCLUSTERS=32768
    options MAXFILES=8192
    options MSGSSZ=64
    options MSGSEG=512
    options MSGMNB=8192
    options MSGMNI=40
    options MSGTQL=2048
Table D-4. NetBSD benchmarking results

Storage scheme  Filesystem  Mount options               Throughput (xact/sec)  Response time (sec)  Hit ratio (%)
diskd(1)        UFS         softupdate, noatime, async  90.3                   1.49                 57.2
diskd(2)        UFS         softupdate                  73.5                   1.51                 55.8
diskd(3)        UFS         -                           60.1                   1.48                 55.9
ufs(1)          UFS         softupdate, noatime, async  34.9                   1.51                 56.2
ufs(2)          UFS         softupdate                  31.7                   1.52                 55.5
ufs(3)          UFS         -                           23.6                   1.53                 55.4

Figure D-4. NetBSD filesystem benchmarking traces
D.7 Solaris

These results come from Solaris Version 8 for Intel (released February 2002). Solaris 9 was available when I started these tests, but Sun no longer makes it freely available. I tweaked the kernel by adding these lines to /etc/system:

    set rlim_fd_max = 8192
    set msgsys:msginfo_msgmax=8192
    set msgsys:msginfo_msgmnb=8192
    set msgsys:msginfo_msgmni=40
    set msgsys:msginfo_msgssz=64
    set msgsys:msginfo_msgtql=2048
    set shmsys:shminfo_shmmax=2097152
    set shmsys:shminfo_shmmni=32
    set shmsys:shminfo_shmseg=16

Table D-5 and Figure D-5 summarize the Solaris results. This is the only other operating system, in addition to Linux, in which the aufs storage scheme works well. Interestingly, both aufs and diskd have about the same performance on Solaris, although the actual numbers are much lower than on Linux.
Table D-5. Solaris benchmarking results

Storage scheme  Filesystem  Mount options  Throughput (xact/sec)  Response time (sec)  Hit ratio (%)
diskd(1)        UFS         noatime        56.3                   1.53                 55.7
aufs(1)         UFS         noatime        53.6                   1.49                 56.6
diskd(2)        UFS         -              37.9                   1.53                 55.5
aufs(2)         UFS         -              37.4                   1.49                 56.4
coss            -           -              32.4                   1.47                 54.6
ufs(1)          UFS         noatime        24.0                   1.53                 55.6
ufs(2)          UFS         -              19.0                   1.50                 56.3

Figure D-5. Solaris filesystem benchmarking traces
Solaris also supports coss, but at nowhere near the rates for Linux and FreeBSD. For some unknown reason, coss on Solaris is limited to 32 transactions per second.
D.8 Number of Disk Spindles

In this section, I compare Squid's performance for different numbers of disk drives (spindles). These tests are from the Linux system with the aufs storage scheme and ext2fs filesystems.

Table D-6 and Figure D-6 summarize the results. The test with no disk drives has the best throughput, but the worst response time and hit ratio. Note that Squid does serve a few cache hits from memory, so the hit ratio isn't zero.
Table D-6. Comparison of 0-3 disk spindles on Linux with aufs

# Disks  Throughput (xact/sec)  Response time (sec)  Hit ratio (%)
0        332.1                  2.99                 0.4
3        109.6                  1.44                 56.2
2        85.3                   1.49                 53.9
1        66.0                   1.50                 53.5

Figure D-6. Benchmarking results for 0, 1, 2, and 3 disk drives on Linux with aufs
The primary purpose of these tests is to show that Squid's performance doesn't increase in proportion to the number of disk drives. Excluding other factors, you may be able to get better performance from three systems with one disk drive each, rather than a single system with three drives.
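The arithmetic behind that conclusion is simple enough to check with a few lines of Python. The throughput figures come from Table D-6. The assumption that three independent single-disk systems would simply add up (ignoring load balancing, hit ratio effects, and other factors) is mine, made only for illustration.

```python
# Throughput in transactions per second by number of disks (Table D-6).
throughput = {1: 66.0, 2: 85.3, 3: 109.6}

for disks, xact in throughput.items():
    print(f"{disks} disk(s): {xact:6.1f} xact/sec total, {xact / disks:5.1f} per disk")

# Naive estimate: three separate one-disk systems, assumed to scale linearly.
three_single_disk_boxes = 3 * throughput[1]
print(f"three one-disk systems: {three_single_disk_boxes:.1f} xact/sec "
      f"vs one three-disk system: {throughput[3]:.1f} xact/sec")
```

The per-disk figure drops with every drive added to a single system, which is why the naive three-system estimate comes out ahead.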
Appendix E. Squid on Windows

Squid has been designed to run on Unix, but you can also get it to run on Microsoft Windows. Perhaps the easiest way is to use Red Hat's Cygwin emulation layer. It gives a Windows box everything it needs to run a variety of Unix applications.

Another option is to use SquidNT. This is a version of the source code that has been modified to compile under a native Windows C compiler.
E.1 Cygwin

Cygwin is a Unix emulation package for Microsoft Windows. It provides an environment that allows you to build and run software primarily designed for Unix. You can also download and install a number of precompiled binary packages, including Squid.

Cygwin runs on Windows 95, 98, ME, NT, 2000, and XP. The Cygwin FAQ, however, makes this disclaimer:

    Keep in mind that Cygwin can only do as much as the underlying OS supports. Because of this, Cygwin will behave differently, and exhibit different limitations, on the various versions of Windows.

When writing this appendix, I installed Cygwin Version 1.3.21 on Windows 2000.
E.1.1 Installing Cygwin

The first step is to install Cygwin on your Windows system. Visit the http://www.cygwin.com/ site and click on the Install Cygwin link. After running Cygwin Setup, you'll have the base environment with a number of standard Unix tools. You might want to spend a little time playing with it to see how it works.

Once you're comfortable with the Cygwin environment, decide if you'd like to use the precompiled package or compile Squid from its source.
E.1.2 The Squid Package

The Cygwin project provides a precompiled Squid binary. To download and install it, run the Cygwin Setup program again. When you see the Select Packages window, find the Web group and select squid for installation. Continue with the setup procedure as before. When Setup completes, you should find the Squid binary at /usr/bin/squid and the configuration file at /etc/squid.conf.
E.1.3 Compiling Squid

You can also compile the Squid source code under Cygwin. This might be necessary if you want to run a more recent version than the precompiled binary available from the Cygwin site. To compile on Cygwin, you need to install at least the following packages:

● Archive/sharutils
● Devel/make
● Devel/gcc
● Interpreters/Perl

After installing those tools, you should be able to configure and compile Squid as described in Chapter 3.
E.1.4 Configuring and Running
Since Cygwin is essentially a Unix environment, you can run Squid as described throughout this book. Some special features may or may not work. For example, you won't be able to build certain authentication helpers without additional libraries and header files. Here are a few things to watch out for:
● The cache_effective_user directive is set to nobody by default. When you run Squid under Cygwin, you may get an error that the nobody user doesn't exist. You can either create that user or set cache_effective_user to a username that does exist.
● Cygwin doesn't have an /etc/resolv.conf by default, and Squid won't pick up your DNS server settings from the Windows registry. You can either create a fake /etc/resolv.conf or list your name server addresses in squid.conf with a dns_nameservers directive.
E.2 SquidNT
Guido Serassio is maintaining a project called SquidNT. It is a branch of Squid's development tree that contains changes necessary for a native port of Squid to Windows NT, 2000, XP, and 2003. In other words, you can compile and run this version of Squid on Windows without any Unix emulation libraries. The code is known to compile with Microsoft's Visual C++ 6.0 compiler and under the MSYS+MinGW environment. Guido also provides some precompiled SquidNT binaries. You can find his work and more information on SquidNT by visiting http://www.serassio.it/SquidNT.htm.
Appendix F. Configuring Squid Clients
This appendix contains information on setting up various browsers and user-agents to use Squid. Although it is more extensively covered in my O'Reilly book Web Caching, I'll include some brief instructions here. I have instructions for the following HTTP user-agents: Internet Explorer v6, Konqueror v3, Lynx v2.8, Netscape v7 a.k.a. Mozilla v5, Opera v7, libwww-perl v5, Python's urllib/urllib2, and Wget v1.8. If you think this is all a huge hassle, consider using HTTP interception, as described in Chapter 9.
F.1 Manually
Web browsers and other HTTP-based user-agents have methods for explicitly setting a proxy address. For large organizations, this is a real hassle. You may simply have too many desktops to visit one at a time. Additionally, this approach isn't as flexible as the others. For example, you can't temporarily stop the flow of requests to the proxy or easily bypass the cache for certain troublesome sites.

Browsers usually give you the option to send HTTPS URLs to a proxy. Squid can handle HTTPS requests, although it can't cache the responses. Squid simply tunnels the encrypted traffic. Thus, you should configure the browser to proxy HTTPS requests only if your firewall prevents direct connections to secure sites.
F.1.1 Netscape/Mozilla
To manually configure proxies with Netscape and Mozilla, follow this sequence of menus:
● Edit
● Preferences
● Advanced
● Proxies
● Manual proxy configuration
● Fill in the HTTP Proxy address and Port fields. Enter the same values for FTP Proxy if you like.
F.1.2 Explorer
To manually configure proxies in Internet Explorer, select the following sequence of menus:
● View from the main window menu
● Internet Options
● Connections tab
● LAN Settings
● Enable Use a proxy server and enter its address in the Address and Port fields
The Advanced button opens a new window in which you can enter different proxy addresses for different protocols (HTTP, FTP, etc.).
F.1.3 Konqueror
You can manually configure proxies in Konqueror by clicking on the following sequence of menus:
● Settings
● Configure Konqueror
● Proxies & Cache
● Use Proxy
● Fill in the address for HTTP Proxy, and Port. Use the same values for other protocols if you like.
F.1.4 Opera
Here's how to find the proxy configuration screen in Opera browsers:
● File
● Preferences
● Network
● Proxy Servers
● Enter an IP address (or hostname) and port number for HTTP, FTP, and other protocols as necessary.
F.1.5 Lynx
The Lynx browser uses a configuration file, typically /usr/local/etc/lynx.cfg. There you'll find a number of settings for proxies. For example:

    http_proxy:http://proxy.example.com:3128/
    https_proxy:http://proxy.example.com:3128/
    ftp_proxy:http://proxy.example.com:3128/

Lynx also accepts proxy configuration via environment variables, as described in the next section.
F.1.6 Environment Variables
Some browsers and other user-agents look for proxy settings in environment variables. Note that the variable names are lowercase, unlike most environment variable names:

    csh% setenv http_proxy http://proxy.example.com:3128/
    csh% setenv ftp_proxy http://proxy.example.com:3128/

    sh$ http_proxy=http://proxy.example.com:3128/
    sh$ ftp_proxy=http://proxy.example.com:3128/
    sh$ export http_proxy ftp_proxy

I've convinced myself that the following products and packages check for these environment variables:
● Opera
● Lynx
● Wget
● Python's urllib and urllib2
● libwww-perl
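As a quick way to see how a user-agent library picks these variables up, here is a minimal sketch. It assumes Python 3, where the functionality of urllib and urllib2 lives in urllib.request; the proxy address is the same placeholder used in the shell examples, and setting the variables from inside the script is only for demonstration.

```python
import os
import urllib.request

# Same placeholder values as the shell examples; normally these would be
# exported by the user's shell rather than set inside the program.
os.environ["http_proxy"] = "http://proxy.example.com:3128/"
os.environ["ftp_proxy"] = "http://proxy.example.com:3128/"

# getproxies_environment() scans the environment for <scheme>_proxy
# variables and returns a mapping from scheme name to proxy URL.
proxies = urllib.request.getproxies_environment()

for scheme in ("http", "ftp"):
    print(scheme, "->", proxies[scheme])
```

If the mapping comes back without the scheme you expect, the variable name is the first thing to check, since the lowercase spelling is easy to get wrong.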
F.2 Proxy Auto-Configuration
Proxy Auto-Configuration is a technique that allows more control over the way user-agents select a proxy. The configuration file is simply a text file containing a JavaScript function. Browsers download the configuration file when they start up and then evaluate the function before each request. The function's return value determines where the request is sent.

Proxy Auto-Configuration is attractive because it gives the network administrator more control. For example, you can temporarily disable your caching service, implement load balancing, or migrate the service to new systems. Additionally, the function can return a list of proxy addresses, which the browser tries in sequence. If the first is unavailable, it tries the second, and so on.

The following browsers support Proxy Auto-Configuration:
● Internet Explorer
● Opera
● Netscape
● Konqueror
● Mozilla
All these browsers have a place in which you can type in the Proxy Auto-Configuration URL. You'll find it in the same place as the manual proxy settings, described earlier in Section F.1. Configuring hundreds or thousands of workstations is a real hassle, which is why a handful of companies came up with WPAD, described in the next section.

Writing a Proxy Auto-Configuration function is relatively straightforward. The function, named FindProxyForURL, takes two arguments and returns a list of proxy addresses, separated by semicolons. The word DIRECT instructs the browser to forward the request directly to the origin server, rather than to a proxy. Here is a simple example:

    function FindProxyForURL(url, host)
    {
        if (isPlainHostName(host))
            return "DIRECT";
        if (!isResolvable(host))
            return "DIRECT";
        if (url.substring(0, 5) == "http:")
            return "PROXY 172.16.5.1:3128; DIRECT";
        if (url.substring(0, 4) == "ftp:")
            return "PROXY 172.16.5.1:3128; DIRECT";
        return "DIRECT";
    }

The first if statement makes the browser connect directly to the origin server if the user types a single-component hostname, such as www. This is generally a good idea because the browser's interpretation of the hostname might be different from the proxy's. The second if statement ensures that the hostname exists in the DNS. If not, the user sees an error message from the browser itself, rather than from Squid. The next two if statements return a proxy address, followed by DIRECT, for HTTP and FTP URLs. If the proxy doesn't respond, the browser attempts to make a direct connection to the origin server.
If you have a firewall in place, the browser probably won't be able to make a direct connection.
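Because browsers evaluate the function silently, it can help to check the decision logic outside a browser before deploying it. The following is a rough Python mirror of the example function, not something browsers can load. The names find_proxy_for_url, is_plain_host_name, and is_resolvable are hypothetical stand-ins for the PAC built-ins, and the resolver is passed in as a parameter so the rules can be exercised without touching the DNS.

```python
import socket

# Return value used by the example function for HTTP and FTP URLs.
PROXY_THEN_DIRECT = "PROXY 172.16.5.1:3128; DIRECT"


def is_plain_host_name(host):
    """Stand-in for isPlainHostName(): true when the name has no dots."""
    return "." not in host


def is_resolvable(host):
    """Stand-in for isResolvable(): true when the name resolves in the DNS."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:
        return False


def find_proxy_for_url(url, host, resolvable=is_resolvable):
    """Mirror of the example FindProxyForURL, rule for rule."""
    if is_plain_host_name(host):
        return "DIRECT"
    if not resolvable(host):
        return "DIRECT"
    if url.startswith("http:"):
        return PROXY_THEN_DIRECT
    if url.startswith("ftp:"):
        return PROXY_THEN_DIRECT
    return "DIRECT"


# Exercise the rules with a fake resolver that says every name exists.
always = lambda host: True
print(find_proxy_for_url("http://www.example.com/", "www.example.com", always))
print(find_proxy_for_url("https://www.example.com/", "www.example.com", always))
print(find_proxy_for_url("http://www/", "www", always))
```

Note that the https: URL falls through to DIRECT, which matches the advice in Section F.1 to proxy HTTPS only when a firewall requires it.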
After writing the function, save it somewhere in your web server's data directory. Next, you need to configure the server to return a specific content type for the file. The convention is to give the file a .pac extension, such as proxy.pac. Then, ensure that the HTTP server returns the content type application/x-ns-proxy-autoconfig. With Apache, you can add this line to your server config file:

    AddType application/x-ns-proxy-autoconfig .pac

Refer to Section 4.3 of Web Caching (O'Reilly) for more information on Proxy Auto-Configuration files, including more complicated FindProxyForURL ideas and examples.
F.3 WPAD
The Web Proxy Auto Discovery (WPAD) protocol is a technique for user-agents to find a nearby caching proxy automatically. The idea is relatively simple. The protocol provides a number of methods for generating a URL that refers to a Proxy Auto-Configuration file. Those methods include DHCP, DNS lookups, and SLP (the Service Location Protocol).

DHCP is the first method the user-agent should try. It sends a query for "option 252" to a local DHCP server. The response is a string: the URL. Here's how to configure ISC's DHCP server for WPAD:

    option wpad code 252 = text;
    option wpad "http://172.16.1.1/proxy.pac";

The second method is SLP. However, its implementation is optional. I do not know if any user-agents actually support WPAD via SLP.

DNS is the last resort. The protocol specification outlines a number of DNS techniques a user-agent might use to find a wpad.dat URL. The most straightforward technique is to perform an address lookup for the hostname wpad in the local domain. For example, if the system's hostname is orion.example.com, the agent requests the IP address of wpad.example.com. If the lookup is successful, the agent makes a TCP connection to that address on port 80 and requests /wpad.dat.

To make this work in Apache, you need to set the content type for the wpad.dat file like this:

    AddType application/x-ns-proxy-autoconfig .dat

This may have negative side effects if your server has other files that end with .dat. One trick some people use is to redirect requests for wpad.dat to proxy.pac, with commands like this in httpd.conf:

    Redirect /wpad.dat http://wpad.example.com/proxy.pac

Note that you probably won't be able to set up a separate virtual host for the wpad name in your domain. This is because some user-agents set the Host header to the IP address, rather than the hostname. The following is an example:

    GET /wpad.dat HTTP/1.1
    Accept: */*
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Win32)
    Host: 206.168.0.13
WPAD is enabled by default in Microsoft Internet Explorer. Konqueror also supports WPAD but disables it by default. You can enable WPAD in Konqueror by visiting the proxy configuration page (described in Section F.1) and selecting Auto Configure Proxy. Although the current stable versions of Netscape (v7.02) and Mozilla (v5.0) don't implement WPAD, future versions will.
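The simplest DNS technique described in this section amounts to a small string manipulation, which the following sketch illustrates. It covers only that one case (replace the first label of the system's hostname with wpad), not the other techniques in the protocol specification, and the function name wpad_candidate is made up for this illustration.

```python
def wpad_candidate(fqdn):
    """Return the (hostname, URL) pair an agent would try via DNS.

    Implements only the most straightforward technique: look up the
    name "wpad" in the system's own domain, then request /wpad.dat
    from that host on port 80.
    """
    labels = fqdn.rstrip(".").split(".")
    if len(labels) < 2:
        # A single-component name has no domain to search in.
        raise ValueError("need a fully qualified hostname: %r" % fqdn)
    domain = ".".join(labels[1:])
    wpad_host = "wpad." + domain
    return wpad_host, "http://%s/wpad.dat" % wpad_host


host, url = wpad_candidate("orion.example.com")
print(host)
print(url)
```

A helper like this is handy for working out which DNS name to publish for a given group of workstations before touching the zone file.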
F.4 Summary
Table F-1 summarizes the various proxy configuration options for the user-agents mentioned in this appendix.

Table F-1. Proxy configuration techniques for popular user-agents

    User agent         Manual   Environment   PAC   WPAD
    Explorer           Yes      No            Yes   Yes
    Konqueror          Yes      No            Yes   Yes
    libwww-perl        N/A      Yes           No    No
    Lynx               Yes      Yes           No    No
    Netscape/Mozilla   Yes      No            Yes   No
    Opera              Yes      Yes           Yes   No
    Wget               N/A      Yes           No    No
Colophon
Our look is the result of reader comments, our own experimentation, and feedback from distribution channels. Distinctive covers complement our distinctive approach to technical topics, breathing personality and life into potentially dry subjects.

The animal on the cover of Squid: The Definitive Guide is a giant squid (Architeuthis dux). Of the class Cephalopoda, which means "head foot," the giant squid holds much fascination for humans, part of which has to do with the fact that it has never been observed alive in its natural habitat. Scientists have only been able to study specimens that have been caught or found washed up on beaches. This invertebrate can grow to 60 feet in length and weigh as much as a ton. It's a deep-sea dweller (660-2,300 feet) that is found throughout the world's oceans.

A giant squid consists of seven parts. Its head houses a complex brain. Its eyes are the largest in the animal kingdom, up to 10 inches in diameter. (Most deep-sea animals have very large eyes so they can gather the small amounts of light available in the depths of the ocean.) Its fins are relatively small and help it to balance and maneuver as it swims. Its main body is called a mantle: it's a muscular sac that contains most of the organ systems. Its eight arms are studded with two rows of suckers; it also has two much longer feeding tentacles, the ends of which also have suckers and are called clubs. Finally, its funnel is a multipurpose tube used to breathe, squirt ink, lay eggs, expel waste, and propel itself.

To eat, a giant squid captures its prey with its two long feeding tentacles. Holding the intended dinner with its shorter arms, its sharp horny beak cuts the food up, and a file-like radula sends it down the throat and esophagus; the food then passes directly through the brain to the stomach. Scientists believe giant squid may be solitary hunters because no more than one has ever been caught in the same fishing net.

Mary Anne Weeks Mayo was the production editor and copyeditor for Squid: The Definitive Guide. Sada Preisch proofread the book, and Marlowe Shaeffer and Claire Cloutier provided quality control. Jamie Peppard and Mary Agner provided production assistance. Johnna Dinse wrote the index.

Ellie Volckhausen designed the cover of this book, based on a series design by Edie Freedman. The cover image is a 19th-century engraving from the Dover Pictorial Archive. Emma Colby produced the cover layout with QuarkXPress 4.1 using Adobe's ITC Garamond font.

Melanie Wang designed the interior layout, based on a series design by David Futato. This book was converted by Joe Wizda to FrameMaker 5.5.6 with a format conversion tool created by Erik Ray, Jason McIntosh, Neil Walls, and Mike Sierra that uses Perl and XML technologies. The text font is Linotype Birka; the heading font is Adobe Myriad Condensed; and the code font is LucasFont's TheSans Mono Condensed. The illustrations that appear in the book were produced by Robert Romano and Jessamyn Read using Macromedia FreeHand 9 and Adobe Photoshop 6. The tip and warning icons were drawn by Christopher Bing. This colophon was compiled by Mary Anne Weeks Mayo.

The online edition of this book was created by the Safari production group (John Chodacki, Becki Maisch, and Ellie Cutler) using a set of Frame-to-XML conversion and cleanup tools written and maintained by Erik Ray, Benn Salter, John Chodacki, and Jeff Liggett.
[ SYM BOL] [ A] [ B] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] 5m in, cache m anager page 60m in, cache m anager page < Day Day Up >
[ SYMBOL] [ A ] [ B] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] - a opt ion, com m and- line access cont rols cache m anager client blocking debugging delays due t o I CP queries I P addresses, request denial and local client s non- HTTP servers and pornography denials redirect ors and rules checks m at ching synt ax server accelerat ion squid.conf file surrogat e m ode and synt ax t est ing usage rest rict ion user access access.log 2nd 3rd Calam aris configurat ion direct ives fields HTTP response field, st at us codes peering codes result codes Squeezer st ore.log com parison Webalyzer acl direct ive 2nd
ACL elem ent s arp t ype AS num bers browser t ype dom ain nam es dst t ype dst _as t ype dst dom _regex t ype dst dom ain t ype ext ernal aut hent icat ion helpers and writ ing ident t ype ident _regex t ype I P addresses long list s m at ching m axconn t ype m et hod t ype m yip t ype m yport t ype port t ype prot o t ype proxy_aut h t ype proxy_aut h_regex t ype regular expressions reply_body_m ax_size req_m im e_t ype snm p_com m unit y t ype src t ype src_as t ype srcdom _regex t ype srcdom ain t ype synt ax TCP port num bers usernam es values ACL rules
ACLs ( access cont rol list s) always_direct broken_post s cache_peer_access cachem gr.cgi rule cachem gr_passwd rule delay_access elem ent s header_access header_replace ht t p_access ht t p_access rule, cache m anager and ht t p_reply_access rule exam ple icp_access ident _lookup_access m iss_access never_direct no_cache redirect or_access rep_m im e_t ype t ype snm p_access t cp_out going_address t cp_out going_t os t im e t ype url_regex t ype urlpat h_regex t ype act ion field, st ore.log act ive_request s, cache m anager page Address already in use m essage addresses, ACLs and adm inist rat or cont act inform at ion AdZapper redirect or Alt eon/ Nort el, int ercept ion caching and always_direct ACL always_direct direct ive neighbor caches announce_file direct ive
announce_host direct ive announce_period direct ive announce_port direct ive API s Basic Aut h API Digest aut hent icat ion NTLM aut hent icat ion append_dom ain direct ive applicat ion layer, t ransport layer and applicat ion- layer rout ing arp ACL t ype arrowpoint ( Cisco) , int ercept ion caching and AS ( Aut onom ous Syst em ) num bers, ACLs as_whois_server direct ive asndb, cache m anager page assert ions, debugging and aufs st orage schem e 2nd issues wit h m onit oring aut h_param direct ive 2nd argum ent s Basic aut hent icat ion, param et ers support ed Digest aut hent icat ion, param et ers support ed NTLM aut hent icat ion support , param et ers support ed aut hent icat e_cache_garbage_int erval direct ive aut hent icat e_ip_t t l direct ive 2nd aut hent icat e_t t l direct ive aut hent icat ion Basic 2nd Basic Aut h API Digest Digest API helpers configurat ion ext ernal ACLs and get pwnam ( Basic aut hent icat ion) 2nd LDAP ( Basic aut hent icat ion) MSNT ( Basic aut hent icat ion)
m ult i- dom ain- NTLM ( Basic aut hent icat ion) NCSA ( Basic aut hent icat ion) PAM ( Basic aut hent icat ion) SASL ( Basic aut hent icat ion) SMB ( Basic aut hent icat ion) SMB ( NTLM aut hent icat ion) winbind ( Basic aut hent icat ion) winbind ( NTLM aut hent icat ion) YP ( Basic aut hent icat ion) HTTP Digest NTLM 2nd API proxy aut hent icat ion, direct ives < Day Day Up >
[ SYMBOL] [ A] [ B ] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] backend servers boxes, surrogat e configurat ion and cont ent negot iat ion I P addresses and server accelerat ion and bandwidt h applicat ion layer t ransport layer bandwidt h bucket s Basic aut hent icat ion 2nd aut h_param direct ive, param et ers support ed Basic Aut h API get pwnam helper LDAP helper MSNT helper m ult i- dom ain NTLM helper NCSA helper PAM helper SASL helper SMB helper winbind helper YP helper basicaut hent icat or, cache m anager page benchm arks filesyst em disk spindles FreeBSD Linux Net BSD Solaris filesyst em perform ance configurat ion hardware Squid versions
I / O bot t leneck and blocking client s, access cont rols and Bloom filt ers, Cache Digest s boot script s / et c/ init t ab schem e / et c/ rc.local init .d schem e rc.d schem e broken_post s ACL broken_post s direct ive browser ACL t ype BSD- based code, m bufs and buffered I / O, redirect ors buffered_logs direct ive access.log and bus errors, debugging byt e hit rat io < Day Day Up >
[ SYMBOL] [ A] [ B] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] - C opt ion, com m and- line cache hit s on local sit es, access cont rols and Squid as Cache Digest s 2nd Bloom filt er configurat ion for cache direct ories, init ializat ion cache hierarchies forwarding loops neighbor caches, cache_peer direct ive cache hit rat io cache hit s cache key field, st ore.log cache m anager 2nd access cont rols disadvant ages ht t p_access ACL rule Squid- RRD and squidclient ut ilit y and cache m anager pages 5m in 60m in act ive_request s asndb basicaut hent icat or carp cbdat a client _list com m _incom ing config count ers delay digest _st at s 2nd
digest aut hent icat or diskd dns event s ext ernal_acl filedescript ors forw_headers forward fqdncache hist ogram s ht t p_headers info io ipcache leaks m em m enu net db non_peers nt lm aut hent icat or obj ect s offline_t oggle openfd_obj ect s pconn peer_select redirect or refresh server_list shut down squidaio_count s st ore_check_cachable_st at s st ore_digest st ore_io st oredir ut ilit izat ion via_headers vm _obj ect s cache m isses
cache validat ion cache.log 2nd 3rd 4t h debugging and 2nd m essages syst em log t erm inal warnings hardcoded page fault excesses process size response t im e cache_access_log direct ive cache_dir direct ive 2nd opt ions cache_dns_program direct ive cache_effect ive_group direct ive cache_effect ive_user direct ive cache_log direct ive cache_m em direct ive 2nd cache_m gr direct ive squid.conf ( adm inist rat or cont act ) cache_peer direct ive neighbor caches and no- delay opt ion opt ions cache_peer_access direct ive 2nd loops and neighbor cache access list cache_peer_dom ain direct ive neighbor cache access cache_replacem ent _policy direct ive 2nd cache_st ore_log direct ive cache_swap_high direct ive 2nd cache_swap_log direct ive cache_swap_low direct ive disk cache cachem gr.cgi ACL rule cachem gr_passwd direct ive
caches m em ory cache ret rying request s web caching Calam aris Cannot det erm ine fully qualified host nam e, debugging and CARP ( Cache Array Rout ing Prot ocol) 2nd configurat ion for carp, cache m anager page case sensit ivit y, squid.conf file cbdat a page, cache m anager pages CDN ( cont ent delivery net work) children param et er, aut h_param direct ive chroot direct ive chroot environm ent Cisco arrowpoint , int ercept ion caching and policy rout ing, int ercept ion caching and WCCP, int ercept ion caching and client address field access.log referer.log useragent .log client ident it y field, access.log client - side of Squid client _db direct ive client _lifet im e direct ive client _list , cache m anager page client _net m ask direct ive access.log and client _persist ent _connect ions direct ive client s blocking, access cont rols and configurat ion, m anual local, access cont rols com m _incom ing, cache m anager page com m and- line opt ions -a
-C -d -D -f -F -h -k -N -R -s -u -v -V -X -Y -z com m ands m ake, ./ configure script squid - k shut down com piling ./ configure script opt ions coss diskd file descript ors inst alling program s kernel and preparat ions unpacking source com plet e obj ect s, m em ory config, cache m anager page configurat ion [ See also squid.conf] aut hent icat ion helpers aut om at ic, Proxy Aut o- Configurat ion Cache Digest s and CARP and coss st orage schem e delay pools
devices, int ercept ion caching and direct ives, access.log and diskd filesyst em benchm arks HTCP and int ercept ion caching and WCCP redirect ors running processes, reconfiguring surrogat e m ode and ./ configure script m ake com m and opt ions rerunning running connect _t im eout direct ive cont act info for adm inist rat or cont ent rout ers Squid as cont ent t ype field, access.log cont ent , uncachable cont ent - lengt h/ size field, st ore.log cont ent - t ype field, st ore.log cont rols [ See access cont rols] core dum ps, debugging and coredum p_dir direct ive coss ( Cyclic Obj ect St orage Schem e) count ers, cache m anager page credent ialst t l param et er, aut h_param direct ive CVS ( Concurrent Versioning Syst em ) Cygwin < Day Day Up >
[ SYMBOL] [ A] [ B] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] - d opt ion, com m and- line - D opt ion, com m and- line daem on processes, Squid as squid_st art script dat e field, st ore.log dead_peer_t im eout direct ive debug_opt ions direct ive cache.log debugging access cont rols Address already in use assert ions and bus errors and cache.log cache.log and core dum ps and DNS nam e lookup and filedescript ors and fully qualified host nam e m essage host nam es and HTTP int ercept ion icm pRecv and replicat ing problem s and report ing bugs segm ent at ion violat ions and st ack t races and swap direct ory error syst em speed delay pools configurat ion m onit oring overview subnet t ing schem e and delay, cache m anager page
delay_access ACL delay_access direct ive 2nd delay_class direct ive 2nd delay_init ial_bucket _level direct ive 2nd delay_param et ers direct ive 2nd delay_pools direct ive 2nd deny_info direct ive DEVEL releases developers, devel.squid- cache.org sit e devices, int ercept ion caching and diffs, applying Digest aut hent icat ion API aut h_param direct ive, param et ers support ed digest _bit s_per_ent ry direct ive digest _generat ion direct ive digest _rebuild_chunk_percent age direct ive digest _rebuild_period direct ive digest _rewrit e_period direct ive digest _st at s, cache m anager page digest _swapout _chunk_size direct ive digest aut hent icat or, cache m anager page direct opt ions direct ories cache direct ories, init ializat ion disk cache, obj ect allocat ion direct ory argum ent , cache_dir direct ive direct ory num ber field, st ore.log disk cache cache_dir direct ive cache_replacem ent _policy direct ive cache_swap_high direct ive cache_swap_low direct ive direct ories, obj ect allocat ion I / O bot t leneck obj ect rem oval obj ect size refresh_pat t ern direct ive
replacem ent policy usage disk space, process size and disk spindles, benchm arks and diskd st orage schem e diskd, cache m anager page diskd_program direct ive DNS nam e lookup t est s failed m essage, debugging and dns, cache m anager page dns_children direct ive dns_defnam es direct ive dns_nam eservers direct ive dns_ret ransm it _int erval direct ive dns_t est nam es direct ive dns_t im eout direct ive dom ain nam es, ACLs dst ACL t ype dst _as ACL t ype dst dom _regex ACL t ype dst dom ain ACL t ype < Day Day Up >
[ SYMBOL] [ A] [ B] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] em ulat e_ht t pd_log direct ive access.log and environm ent variables, proxy m anual configurat ion environm ent s, chroot ephem eral port s err_ht m l_t ext direct ive error checking, squid.conf file error m essages, surrogat e m ode and error_direct ory direct ive ESI ( Edge Side I ncludes) / et c/ init t ab schem e / et c/ rc.local script event s page, cache m anger pages expires field, st ore.log Explorer, m anual configurat ion ext 2fs ext ension_m et hods direct ive ext ernal ACLs aut hent icat ion helpers and writ ing ext ernal_acl, cache m anager page ext ernal_acl_t ype direct ive Ext rem e Net works, int ercept ion caching and < Day Day Up >
[ SYMBOL] [ A] [ B] [ C] [ D ] [ E] [ F ] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] - f opt ion, com m and- line - F opt ion, com m and- line false hit s, sibling caches FAQ docum ent FFS ( Fast File Syst em ) fields, access.log file descript ors com piling and FreeBSD and Linux and Net BSD and OpenBSD and Solaris and file num ber field, st ore.log file num bers, m apping t o pat hnam es filedescript ors cache m anager page debugging and filesyst em s alt ernat ive aufs st orage schem e benchm arks configurat ion disk spindles FreeBSD hardware Linux Net BSD OpenBSD Solaris Squid versions coss ( Cyclic Obj ect St orage Schem e) disk cache, obj ect size disk space, process size and
disk usage diskd st orage schem e ext 2fs FFS I / O bot t leneck inodes j ournaling syst em s null st orage schem e soft updat es st orage schem es syst em calls t uning opt ions UFS filt ers, redirect ors and forw_headers, cache m anager page forward, cache m anager pages forwarded_for direct ive forwarding loops, cache hierarchies Foundry, int ercept ion caching and FQDN ( fully qualified dom ain nam e) fqdncache, cache m anager page fqdncache_size direct ive FreeBSD file descript ors and filesyst em benchm arks int ercept ion caching and FTP ( File Transfer Prot ocol) servers ft p_list _widt h direct ive ft p_passive direct ive ft p_sanit ycheck direct ive ft p_user direct ive fully qualified host nam e, debugging < Day Day Up >
[ SYMBOL] [ A] [ B] [ C] [ D ] [ E] [ F] [ G ] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] GDSF ( greedy dual- size frequency) replacem ent policy get pwnam aut hent icat ion helper ( Basic aut hent icat ion) get rusage( ) funct ion GNU ( General Public License) Gopher servers GRE ( Generic Rout ing Encapsulat ion) , int ercept ion caching and groups < Day Day Up >
[ SYMBOL] [ A] [ B] [ C] [ D ] [ E] [ F] [ G] [ H ] [ I ] [ J] [ K] [ L] [ M] [ N] [ O] [ P] [ Q] [ R] [ S] [ T] [ U] [ V] [ W ] [ X] [ Y] [ Z] - h opt ion, com m and- line half_closed_client s direct ive hardcoded warnings, cache.log hardware requirem ent s hardware, filesyst em benchm arks header_access ACL 2nd header_replace ACL 2nd healt h checks, int ercept ion caching and hierarchies cache hierarchies surrogat e m ode and hierarchy_st oplist direct ive neighbor caches high_m em ory_warning direct ive 2nd high_page_fault _warning direct ive 2nd high_response_t im e_warning direct ive 2nd hist ogram s, cache m anager page host nam e, debugging and host nam e_aliases direct ive host s_file direct ive hot obj ect cache HTCP ( Hypert ext Caching Prot ocol) 2nd configuring for I CP and reply processing ht cp_port direct ive HTTP ( Hypert ext Transfer Prot ocol) Basic aut hent icat ion aut h_param direct ive param et ers Digest aut hent icat ion int ercept ion, debugging proxy aut hent icat ion, int ercept ion caching and redirect m essages servers
servers, int ercept ion caching and HTTP int ercept ion 2nd layer four swit ches and HTTP request headers field, access.log HTTP response field, access.log, st at us codes HTTP response headers field, access.log HTTP servers, access cont rols and ht t p_access ACL rule, cache m anager and ht t p_access ACL t ype ht t p_access direct ive ht t p_headers, cache m anager page ht t p_port direct ive 2nd 3rd ht t p_reply_access ACL ht t p_reply_access ACL rule, exam ple ht t p_reply_access direct ive ht t pd_accel_host direct ive 2nd ht t pd_accel_port direct ive 2nd ht t pd_accel_single_host direct ive 2nd ht t pd_accel_uses_host _header direct ive 2nd ht t pd_accel_wit h_proxy direct ive 2nd ht t ps_port direct ive 2nd < Day Day Up >
I
I/O
    aufs storage system and
    bottleneck
    buffered I/O, redirectors and
ICAP (Internet Content Adaptation Protocol)
icmpRecv, debugging and
icon_directory directive
ICP (Internet Cache Protocol)
    clients
        cache_peer directive
        icp_port directive
        multicast ICP
    HTCP and
    multicast
        clients
        example
        servers
    neighbor caches and
    netdb and
    reply processing
    requests
    server, Squid as
    stale responses, icp_hit_stale directive
ICP queries, access controls and
icp_access ACL
icp_access directive
icp_hit_stale directive
ICP_MISS_NOFETCH
icp_port directive
    Squid as ICP server
icp_query_timeout directive
ident ACL elements
ident ACL type
ident_lookup_access directive
    access.log and
ident_regex ACL type
ident_timeout directive
ie_refresh directive
ignore_unknown_nameservers directive
in-transit objects, memory
incoming_dns_average directive
incoming_http_average directive
incoming_icp_average directive
info, cache manager page
init.d scheme, boot scripts
initializing cache directories
inodes
installation
    compiled programs
    Cygwin
    pinger program
intercache communication
interception caching
    benefits and disadvantages
    configuration and
    device configuration
    FreeBSD and
    HTTP
        debugging
        proxy authentication and
    IPFilter, NetBSD
    layer four switches and
    Linux systems
    OpenBSD
    operating systems and
io, cache manager page
IP addresses
    access controls and, request denial and
    ACLs
IP packet filtering software
ipcache, cache manager page
ipcache_high directive
ipcache_low directive
ipcache_size directive
IPFilter
    interception caching, NetBSD
ipfw (filtering software)
iptables (filtering software)
IRCache (Information Resource Caching)

J
Jesred redirector
journaling filesystems

K
-k option, command-line
kernel
    compiling and
    precompiled binaries and
Konqueror, manual configuration

L
L1 argument, cache_dir directive
L2 argument, cache_dir directive
last-modified field, store.log
last-modified timestamps
layer four switches, HTTP interception and
LDAP authentication helper (Basic authentication)
leaks, cache manager page
libraries, shared
Linux
    file descriptors and
    filesystem benchmarks
    interception caching and
local clients, access controls
local site cache hits, preventing with access controls
log files
    access.log
    cache.log
    netdb_state file
    pathnames
    privacy issues
    referer.log
    rotating
    security
    storage space
    store.log
    surrogate mode and
    swap.state file
    useragent.log
log_fqdn directive
    access.log and
log_icp_queries directive
    access.log and
log_ip_on_direct directive
    access.log and
log_mime_hdrs directive
    access.log and
logfile_rotate directive
loops, cache_peer_access directive and
LRU (least recently used) replacement policy
Lynx proxies, manual configuration

M
mailing lists
    squid-announce
    squid-dev
    squid-users
make command, ./configure script
max-size option, cache_dir directive
max_challenge_lifetime parameter, auth_param directive
max_challenge_reuses parameter, auth_param directive
max_open_disk_fds directive
max_user_ip ACL, proxy authentication and
maxconn ACL type
maximum_object_size directive
maximum_object_size_in_memory directive
maximum_single_addr_tries directive
mbufs (BSD-based code)
mcast_groups directive
    multicast groups
mcast_icp_query_timeout directive
mcast_miss_addr directive
mcast_miss_encode_key directive
mcast_miss_port directive
mcast_miss_ttl directive
mem, cache manager page
memory
    netdb and
    requirements
memory cache
memory_pools directive
memory_pools_limit directive
memory_replacement_policy directive
menu, cache manager page
method ACL type
method field, store.log
Microsoft NTLM authentication [See NTLM authentication]
mime_table directive
min_dns_poll_cnt directive
min_http_poll_cnt directive
min_icp_poll_cnt directive
minimum_direct_hops directive
minimum_direct_rtt directive
minimum_object_size directive
miss_access ACL
miss_access directive
monitoring
    Cache Manager
    cache manager pages
    cache.log and
        warnings
    delay pools
    SNMP and
        snmpget and
        snmpwalk and
    SNMP MIB
    Squid MIB
Mozilla proxies, manual configuration
MSNT authentication helper (Basic authentication)
multicast ICP
    clients
    example
    servers
myip ACL type
myport ACL type

N
-N option, command-line
NCSA authentication helper (Basic authentication)
negative_dns_ttl directive
negative_ttl directive
neighbor caches
    always_direct directive
    cache_peer directive and
    cache_peer_access directive
    cache_peer_domain directive
    hierarchy_stoplist directive
    ICP and
    limiting requests
    never_direct directive and
    nonhierarchical_direct directive and
    prefer_direct directive
    selection algorithm
    up/down state
neighbor_type_domain directive
NetBSD
    file descriptors and
    filesystem benchmarks
    interception caching, IPFilter
netdb (network measurement database)
    enabling
    ICP and
    memory requirements
    test_reachability directive
netdb, cache manager page
netdb_high directive
netdb_low directive
netdb_ping_period directive
netdb_state file
netmasks
Netscape proxies, manual configuration
network hardware requirements
never_direct ACL
never_direct directive
    neighbor cache access list
NLANR (National Laboratory for Applied Network Research)
no_cache ACL
no_cache directive
non_peers, cache manager page
nonce_garbage_interval parameter, auth_param directive
nonce_max_count parameter, auth_param directive
nonce_max_duration parameter, auth_param directive
nonhierarchical_direct directive
    neighbor caches and
NTLM authentication
    API
    auth_param directive, parameters supported
    SMB helper
NTLM authentication helper (Basic authentication), multi-domain
ntlmauthenticator, cache manager page
null storage scheme

O
object size, disk cache
objects
    allocating to cache directories
    disk cache, removing from
    memory
        complete
        in-transit
objects, cache manager page
offline_mode directive
offline_toggle, cache manager page
open source code
    precompiled binaries
OpenBSD
    file descriptors and
    filesystem benchmarks
    interception caching and
openfd_objects, cache manager page
Opera, manual configuration
operating system
    interception caching and
    requirements
origin servers

P
page faults
    excessive
    getrusage() function
PAM authentication helper (Basic authentication)
parent caches
    secondary
    selection
patches, applying
pathnames
    log files
    mapping file numbers to
pconn, cache manager page
pconn_timeout directive
peer caches
peer_connect_timeout directive
peer_select, cache manager page
peering code/peerhost field, access.log
peering codes, access.log
persistent connections
    client_persistent_connections directive
    pconn_timeout directive
    persistent_request_timeout directive
    pipeline_prefetch directive
    server_persistent_connections directive
persistent_request_timeout directive
pf (filtering software)
pid_filename directive
pinger program, installation
pinger_program directive
pipeline_prefetch directive
policy routing (Cisco), interception caching and
pornography, access controls and
port ACL type
port number, changing
ports
    ephemeral ports
    squid.conf directives
    TCP port numbers, ACLs
positive_dns_ttl directive
precompiled binaries
    kernel and
prefer_direct directive
    neighbor caches
privacy issues, log files
processes, reconfiguring running
program parameter, auth_param directive
proto ACL type
proxies
    all requests
    request through different
    Squid as
proxy authentication
    directives
    HTTP
    interception caching and
Proxy Auto-Configuration
proxy_auth ACL type
proxy_auth_regex ACL type
pthreads library, aufs storage scheme
purging, surrogate mode and

Q
query_icmp directive
quick_abort_max directive
quick_abort_min directive
quick_abort_pct directive

R
-R option, command-line
range requests
    range_offset_limit directive
range_offset_limit directive
rc.d scheme
read-only option, cache_dir directive
read_timeout directive
Ready to serve requests message
realm parameter, auth_param directive
redirect_children directive
redirect_program directive
redirect_rewrites_host_header directive
redirector pool
redirector, cache manager page
redirector_access ACL
redirector_access directive
redirector_bypass directive
redirectors
    access controls and
    AdZapper
    buffered I/O
    configuration for
    definition
    filters and
    interface
    Jesred
    samples
    squidGuard
    Squirm
referer field, referer.log
referer.log
referer.log file
referer_log directive
refresh, cache manager page
refresh_pattern directive
refresh_pattern directive, disk cache
regular expressions, ACLs type
releases of Squid
reload_into_ims directive
reloads, surrogate mode and
removing objects
    entire cache directories
    groups of
    individually
rep_mime_type ACL type
replacement policy, disk cache
replicating problems, debugging and
reply_body_max_size ACL
reply_body_max_size directive
reporting bugs
req_mime_type ACL type
request method field, access.log
Request-URI
    FQDN
    HTTP redirect messages
    ident_lookup_access directive
    whitespace
request_body_max_size directive
request_entities directive
request_header_max_size directive
request_timeout directive
requests
    denying, access controls and
    different proxy
    single proxy
resources for support
response time field, access.log
responses, median time
restricting usage, access controls and
result code, access.log
result/status codes field, access.log
root, starting as
rotating log files
routers
    application-layer content routers
RTT (round-trip time), netdb and
rules
    access controls
        checks
        matching
        syntax
    ACLs
running processes, reconfiguring

S
-s option, command-line
samples, redirectors
SASL (Simple Authentication and Security Layer), authentication and
scaling, ICP
scheme argument, cache_dir directive
security
    log files
    surrogate mode and
segmentation violations, debugging
server acceleration [See also surrogate mode]
    content negotiation
    overview
server-side of Squid
server_list, cache manager page
server_persistent_connections directive
servers
    FTP servers
    Gopher
    HTTP
    multicast ICP
    origin servers
server acceleration, access controls
shells, file-descriptor limits
shutdown
shutdown, cache manager page
shutdown_lifetime directive
sibling caches
    false hits
size argument
    cache_dir directive
sleep_after_fork directive
slow speed, debugging and
SMB (authentication helper)
    Basic authentication
    NTLM authentication
SNMP
    monitoring and
        snmpget and
        snmpwalk and
    SNMP MIB
snmp_access ACL
snmp_access directive
snmp_community ACL type
snmp_incoming_address directive
snmp_outgoing_address directive
snmp_port directive
snmpget
snmpwalk
soft updates, filesystem tuning and
Solaris
    file descriptors and
    filesystem benchmarks
source code
    CVS and
    patches, applying
    precompiled binaries
    shared libraries and
    Squid ports
    unpacking
speed, debugging and
Squeezer
Squid
    as daemon process
        squid_start script
    history of
squid -k shutdown command
Squid MIB
Squid ports
Squid RPMs
squid-announce mailing list
squid-cache Web site
squid-dev mailing list
Squid-RRD, cache manager and
squid-users mailing list
squid.conf
    access controls
    cache_mgr directive
    case sensitivity
    directives
    error checking
    http_port directive
    syntax
    visible_hostname directive
squid.pid file, shutdown and
squid_start script, Squid as daemon process
squidaio_counts, cache manager page
squidclient utility, Cache Manager and
squidGuard redirector
SquidNT
Squirm redirector
src ACL type
src_as ACL type
srcdom_regex ACL type
srcdomain ACL type
SSL connections
    https_port directive
    surrogate mode
ssl_unclean_shutdown directive
STABLE releases
stack traces
    debugging
status code field, store.log
stderr, terminal window and
storage
    aufs storage scheme
    coss storage scheme
    diskd storage scheme
    filesystems
    I/O bottleneck
    null storage scheme
store.log
    access.log comparison
    file numbers, mapping to pathnames
store.log file
store_avg_object_size directive
store_check_cachable_stats, cache manager page
store_digest, cache manager page
store_dir_select_algorithm directive
store_io, cache manager page
store_objects_per_bucket directive
storedir, cache manager page
strip_query_terms directive
    access.log and
subnets, delay pools and
surrogate mode
    access controls
    configuration
    content negotiation
    directives
    error messages and
    hierarchies and
    log files and
    purge operations and
    reloads and
    uncacheable content
swap directory error, debugging and
swap.state file
syntax
    access control rules
    access controls
    ACL elements
    squid.conf
system calls, filesystems and

T
tar command, unpacking source
TCP (Transmission Control Protocol), port numbers
tcp_outgoing_address directive
tcp_outgoing_tos directive
tcp_recv_bufsize directive
technical support
terminal window
    cache.log messages
    testing and
test_reachability directive
    netdb and
testing
    access controls
    terminal window and
time ACL type
timestamp field
    access.log
    referer.log
    store.log
timestamps
traffic shaping [See Delay Pools]
transfer size field, access.log
transparent caching [See HTTP interception]
transport layer, application layer and

U
-u option, command-line
udp_incoming_address directive
udp_outgoing_address directive
UFS (Unix File System)
    performance tuning
uncacheable content, surrogate mode and
unique_hostname directive
Unix, Cygwin
unlinkd_program directive
unpacking source
URI field
    access.log
    referer.log
    store.log
uri_whitespace directive
    access.log and
url_regex ACL type
urlpath_regex ACL type
usage restriction, access controls and
user-agent field, useragent.log
useragent.log
useragent_log directive
usernames, ACLs type
users
    access controls and
utilization, cache manager page

V
-v option, command-line
-V option, command-line
vary_ignore_expire directive
versions of Squid
via_headers, cache manager page
visible_hostname directive
vm_objects, cache manager page

W
wais_relay_host directive
wais_relay_port directive
WCCP (Web Cache Coordination Protocol)
    interception caching
        configuration
        FreeBSD
        Linux systems
        NetBSD
        OpenBSD
wccp_incoming_address directive
wccp_outgoing_address directive
wccp_router directive
wccp_version directive
Web caching
Web Polygraph, workload file
    filesystem benchmarks
Webalyzer
whitespace, Request-URI
winbind authentication helper
    Basic authentication
    NTLM authentication
Windows, Cygwin
workload files, Polygraph
WPAD (Web Proxy Auto Discovery)
    proxy configuration

X
-X option, command-line

Y
-Y option, command-line
YP authentication helper (Basic authentication)

Z
-z option, command-line