The Practice of System and Network Administration, Second Edition (Part 1)

Thomas A. Limoncelli
Christina J. Hogan
Strata R. Chalup

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact: U.S. Corporate and Government Sales, (800) 382-3419, corpsales@pearsontechgroup.com

For sales outside the United States, please contact: International Sales, international@pearsoned.com

Visit us on the Web: www.awprofessional.com

Library of Congress Cataloging-in-Publication Data
Limoncelli, Tom.
The practice of system and network administration / Thomas A. Limoncelli, Christina J. Hogan, Strata R. Chalup. 2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-321-49266-1 (pbk. : alk. paper)
1. Computer networks--Management. 2. Computer systems. I. Hogan, Christine. II. Chalup, Strata R. III. Title.
TK5105.5.L53 2007
004.6068–dc22 2007014507

Copyright ©
2007 Christine Hogan, Thomas A. Limoncelli, Virtual.NET Inc., and Lumeta Corporation

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.
Rights and Contracts Department
75 Arlington Street, Suite 300
Boston, MA 02116
Fax: (617) 848-7047

ISBN 13: 978-0-321-49266-1
ISBN 10: 0-321-49266-8

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, Indiana.
First printing, June 2007

Contents at a Glance

Part I Getting Started
Chapter 1 What to Do When
Chapter 2 Climb Out of the Hole

Part II Foundation Elements
Chapter 3 Workstations
Chapter 4 Servers
Chapter 5 Services
Chapter 6 Data Centers
Chapter 7 Networks
Chapter 8 Namespaces
Chapter 9 Documentation
Chapter 10 Disaster Recovery and Data Integrity
Chapter 11 Security Policy
Chapter 12 Ethics
Chapter 13 Helpdesks
Chapter 14 Customer Care

Part III Change Processes
Chapter 15 Debugging
Chapter 16 Fixing Things Once
Chapter 17 Change Management
Chapter 18 Server Upgrades
Chapter 19 Service Conversions
Chapter 20 Maintenance Windows
Chapter 21 Centralization and Decentralization

Part IV Providing Services
Chapter 22 Service Monitoring
Chapter 23 Email Service
Chapter 24 Print Service
Chapter 25 Data Storage
Chapter 26 Backup and Restore
Chapter 27 Remote Access Service
Chapter 28 Software Depot Service
Chapter 29 Web Services

Part V Management Practices
Chapter 30 Organizational Structures
Chapter 31 Perception and Visibility
Chapter 32 Being Happy
Chapter 33 A Guide for Technical Managers
Chapter 34 A Guide for Nontechnical
Managers
Chapter 35 Hiring System Administrators
Chapter 36 Firing System Administrators
Epilogue

Appendixes
Appendix A The Many Roles of a System Administrator
Appendix B Acronyms
Bibliography
Index

Contents

Preface
Acknowledgments
About the Authors

Part I Getting Started

1 What to Do When
1.1 Building a Site from Scratch
1.2 Growing a Small Site
1.3 Going Global
1.4 Replacing Services
1.5 Moving a Data Center
1.6 Moving to/Opening a New Building
1.7 Handling a High Rate of Office Moves
1.8 Assessing a Site (Due Diligence)
1.9 Dealing with Mergers and Acquisitions
1.10 Coping with Machine Crashes
1.11 Surviving a Major Outage or Work Stoppage
1.12 What Tools Should Every Team Member Have?
1.13 Ensuring the Return of Tools
1.14 Why Document Systems and Procedures?
1.15 Why Document Policies?
1.16 Identifying the Fundamental Problems in the Environment
1.17 Getting More Money for Projects
1.18 Getting Projects Done
1.19 Keeping Customers Happy
1.20 Keeping Management Happy
1.21 Keeping SAs Happy
1.22 Keeping Systems from Being Too Slow
1.23 Coping with a Big Influx of Computers
1.24 Coping with a Big Influx of New Users
1.25 Coping with a Big Influx of New SAs
1.26 Handling a High SA Team Attrition Rate
1.27 Handling a High User-Base Attrition Rate
1.28 Being New to a Group
1.29 Being the New Manager of a Group
1.30 Looking for a New Job
1.31 Hiring Many New SAs Quickly
1.32 Increasing Total System Reliability
1.33 Decreasing Costs
1.34 Adding Features
1.35 Stopping the Hurt When Doing “This”
1.36 Building Customer Confidence
1.37 Building the Team’s Self-Confidence
1.38 Improving the Team’s Follow-Through
1.39 Handling Ethics Issues
1.40 My Dishwasher Leaves Spots on My Glasses
1.41 Protecting Your Job
1.42 Getting More Training
1.43 Setting Your Priorities
1.44 Getting All the Work Done
1.45 Avoiding Stress
1.46 What Should SAs
Expect from Their Managers?
1.47 What Should SA Managers Expect from Their SAs?
1.48 What Should SA Managers Provide to Their Boss?

2 Climb Out of the Hole
2.1 Tips for Improving System Administration
2.1.1 Use a Trouble-Ticket System
2.1.2 Manage Quick Requests Right
2.1.3 Adopt Three Time-Saving Policies
2.1.4 Start Every New Host in a Known State
2.1.5 Follow Our Other Tips
2.2 Conclusion

Part II Foundation Elements

3 Workstations
3.1 The Basics
3.1.1 Loading the OS
3.1.2 Updating the System Software and Applications
3.1.3 Network Configuration
3.1.4 Avoid Using Dynamic DNS with DHCP
3.2 The Icing
3.2.1 High Confidence in Completion
3.2.2 Involve Customers in the Standardization Process
3.2.3 A Variety of Standard Configurations
3.3 Conclusion

4 Servers
4.1 The Basics
4.1.1 Buy Server Hardware for Servers
4.1.2 Choose Vendors Known for Reliable Products
4.1.3 Understand the Cost of Server Hardware
4.1.4 Consider Maintenance Contracts and Spare Parts
4.1.5 Maintaining Data Integrity
4.1.6 Put Servers in the Data Center
4.1.7 Client Server OS Configuration
4.1.8 Provide Remote Console Access
4.1.9 Mirror Boot Disks
4.2 The Icing
4.2.1 Enhancing Reliability and Service Ability
4.2.2 An Alternative: Many Inexpensive Servers
4.3 Conclusion

5 Services
5.1 The Basics
5.1.1 Customer Requirements
5.1.2 Operational Requirements
5.1.3 Open Architecture
5.1.4 Simplicity
5.1.5 Vendor Relations

3.1 The Basics

Some OS vendors won’t support cloned disks, because their installation process makes decisions at load time based on factors such as what hardware is detected. Windows NT generates a unique security ID (SID) for each machine during the install process. Initial cloning software for Windows NT wasn’t able to duplicate this functionality, causing many problems. This
issue was eventually solved.

You can strike a balance here by leveraging both automation and cloning. Some sites clone disks to establish a minimal OS install and then use an automated software-distribution system to layer all applications and patches on top. Other sites use a generic OS installation script and then “clone” applications or system modifications onto the machine.

Finally, some OS vendors don’t provide ways to automate installation. However, home-grown options are available. SunOS 4.x didn’t include anything like Solaris’s JumpStart, so many sites loaded the OS from a CD-ROM and then ran a script that completed the process. The CD-ROM gave the machine a known state, and the script did the rest.

PARIS: Automated SunOS 4.x Installation

Given enough time and money, anything is possible. You can even build your own install system. Everyone knows that SunOS 4.x installations can’t be automated. Everyone except Viktor Dukhovni, who created Programmable Automatic Remote Installation Service (PARIS) in 1992 while working for Lehman Brothers. PARIS automated the process of loading SunOS 4.x on many hosts in parallel over the network long before SunOS 5.x introduced JumpStart.

At the time, the state of the art required walking a CD-ROM drive to each host in order to load the OS. PARIS allowed an SA in New York to remotely initiate an OS upgrade of all the machines at a branch office. The SA would then go home or out to dinner and some time later find that all the machines had installed successfully. The ability to schedule unattended installs of groups of machines is a PARIS feature still not found in most vendor-supplied installation systems.

Until Sun created JumpStart, many sites created their own home-grown solutions.

3.1.1.4 Should You Trust the Vendor’s Installation?
Computers usually come with the OS preloaded. Knowing this, you might think that you don’t need to bother with reloading an OS that someone has already loaded for you. We disagree. In fact, we think that reloading the OS makes your life easier in the long run.

Reloading the OS from scratch is better for several reasons. First, you probably would have to deal with loading other applications and localizations on top of a vendor-loaded OS before the machine would work at your site. Automating the entire loading process from scratch is often easier than layering applications and configurations on top of the vendor’s OS install. Second, vendors will change their preloaded OS configurations for their own purposes, with no notice to anyone; loading from scratch gives you a known state on every machine. Using the preinstalled OS leads to deviation from your standard configuration. Eventually, such deviation can lead to problems.

Another reason to avoid using a preloaded OS is that every host eventually needs an OS reload. For example, the hard disk might crash and be replaced by a blank one, or you might have a policy of reloading a workstation’s OS whenever it moves from one customer to another. When some of your machines are running preloaded OSs and others are running locally installed OSs, you have two platforms to support. They will have differences. You don’t want to discover, smack in the middle of an emergency, that you can’t load and install a host without the vendor’s help.

The Tale of an OS That Had to Be Vendor Loaded

Once upon a time, Tom was experimenting with a UNIX system from a Japanese company that was just getting into the workstation business. The vendor shipped the unit preloaded with a customized version of UNIX. Unfortunately, the machine got irrecoverably mangled while the SAs were porting applications to it. Tom contacted the vendor, whose response was to send a new hard disk preloaded with the OS, all the way from Japan!
Even though the old hard disk was fine and could be reformatted and reused, the vendor hadn’t established a method for users to reload the OS, even from backup tapes.

Luckily for Tom, this workstation wasn’t used for critical services. Imagine if it had been, though, and Tom suddenly found his network unusable, or, worse yet, payroll couldn’t be processed until the machine was working! Those grumpy customers would not have been amused if they’d had to live without their paychecks until a hard drive arrived from Japan.

If this machine had been a critical one, keeping a preloaded replacement hard disk on hand would have been prudent. A set of written directions on how to physically install it and bring the system back to a usable state would also have been a good idea.

The moral of this story is that if you must use a vendor-loaded OS, it’s better to find out right after it arrives, rather than during a disaster, whether you can restore it from scratch.

The previous anecdote describes an OS from long ago. However, history repeats itself. PC vendors preload the OS and often include special applications, add-ons, and drivers. Always verify that add-ons are included in the OS reload disks provided with the system. Sometimes, the applications won’t be missed, because they are free tools that aren’t worth what is paid for them. However, they may be critical device drivers. This is particularly important for laptops, which often require drivers that do not come with the basic version of the OS. Tom ran into this problem while writing this book. After reloading Windows NT on his laptop, he had to add drivers to enable his PCMCIA slots. The drivers couldn’t be brought to the laptop via modem or Ethernet, because those were PCMCIA devices. Instead, they had to be downloaded to floppies, using a different computer. Without a second computer, there would have been a difficult catch-22 situation.

This issue has become less severe over time as custom, laptop-specific hardware has
transitioned to common, standardized components. Microsoft has also responded to pressure to make its operating systems less dependent on the hardware on which they are installed. Although the situation has improved over time from the low-level driver perspective, vendors have tried to differentiate themselves by including application software unique to particular models. But doing that defeats attempts to make one image that can work on all platforms.

Some vendors will preload a specific disk image that you provide. This service not only saves you from having to load the systems yourself but also lets you know exactly what is being loaded. However, you still have the burden of updating the master image as hardware and models change.

3.1.1.5 Installation Checklists

Whether your OS installation is completely manual or fully automated, you can improve consistency by using a written checklist to make sure that technicians don’t skip any steps. The usefulness of such a checklist is obvious if installation is completely manual. Even a solo system administrator who feels that “all OS loads are consistent because I do them myself” will find benefits to using a written checklist. If anything, your checklists can be the basis of training a new system administrator or freeing up your time by training a trustworthy clerk to follow your checklists. (See Section 9.1.4 for more on checklists.)
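
As a rough illustration of the written-checklist idea, the following Python sketch keeps the steps as data and reports any that have not been signed off for a host. The class name, host name, and step names are hypothetical, invented for this example; a paper form or a spreadsheet serves the same purpose.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class InstallChecklist:
    """Tracks which steps of an OS load have been signed off for one host."""
    hostname: str
    steps: List[str]
    done: Set[str] = field(default_factory=set)

    def sign_off(self, step: str) -> None:
        # Reject steps that are not on the list, so typos are caught early.
        if step not in self.steps:
            raise ValueError(f"unknown step: {step!r}")
        self.done.add(step)

    def missing(self) -> List[str]:
        # Report skipped steps in checklist order.
        return [s for s in self.steps if s not in self.done]


# Hypothetical steps; a real list would come from the site's own procedure.
STEPS = [
    "start automated install",
    "verify mouse and keyboard",
    "clean screen",
    "update inventory",
]

checklist = InstallChecklist("ws-example", STEPS)
checklist.sign_off("start automated install")
checklist.sign_off("update inventory")
print(checklist.missing())
```

Keeping the steps in one ordered list means the same list can be printed for a technician or reviewed when training a new SA, which is the consistency benefit the text describes.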
Even if OS installation is completely automated, a good checklist is still useful. Certain things can’t be automated, because they are physical acts, such as starting the installation, making sure that the mouse works, cleaning the screen before it is delivered, or giving the user a choice of mousepads. Other related tasks may be on your checklist: updating inventory lists, reordering network cables if you are below a certain limit, and a week later checking whether the customer has any problems or questions.

3.1.2 Updating the System Software and Applications

Wouldn’t it be nice if an SA’s job was finished once the OS and applications were loaded? Sadly, as time goes by, people identify new bugs and new security holes, all of which need to be fixed. Also, people find cool new applications that need to be deployed. All these tasks are software updates. Someone has to take care of them, and that someone is you. Don’t worry, though; you don’t have to spend all your time doing updates. As with installation, updates can be automated, saving time and effort.

Every vendor has a different name for its system for automating software updates: Solaris, AutoPatch; Microsoft Windows, SMS; and various people have written layers on top of Red Hat Linux’s RPMs, SGI IRIX’s RoboInst, and HP-UX’s Software Distributor (SD-UX). Other systems are multiplatform solutions (Ressman and Valdés 2000).

Software-update systems should be general enough to be able to deploy new applications, to update applications, and to patch the OS. If a system can only distribute patches, new applications can be packaged as if they were patches. These systems can also be used for small changes that must be made to many hosts. A small configuration change, such as a new /etc/ntp.conf, can be packaged into a patch and deployed automatically. Most systems have the ability to include postinstall scripts: programs that are run to complete any changes required to install the package. One can even create
a package that contains only a postinstall script as a way of deploying a complicated change.

Case Study: Installing New Printing System

An SA was hired by a site that needed a new print system. The new system was specified, designed, and tested very quickly. However, the consultant spent weeks on the menial task of installing the new client software on each workstation, because the site had no automated method for rolling out software updates. Later, the consultant was hired to install a similar system at another site. This site had an excellent (and documented!) software-update system. En masse changes could be made easily. The client software was packaged and distributed quickly. At the first site, the cost of building a new print system was mostly deploying to desktops. At the second site, the main cost was the same as the main focus: the new print service. The first site thought they were saving money by not implementing a method to automate software rollouts. Instead, they spent large amounts of money every time new software needed to be deployed. This site didn’t have the foresight to realize that in the future, it would have other software to roll out. The second site saved money by investing some money up front.

3.1.2.1 Updates Are Different from Installations

Automating software updates is similar to automating the initial installation but is also different in many important ways.

• The host is in usable state. Updates are done to machines that are in good running condition, whereas the initial-load process has extra work to do, such as partitioning disks and deducing network parameters. In fact, initial loading must work on a host that is in a disabled state, such as with a completely blank hard drive.
• The host is in an office. Update systems must be able to perform the job on the native network of the host. They cannot flood the network or disturb the other hosts on the network. An initial load process may be done in a laboratory where special equipment
may be available. For example, large sites commonly have a special install room, with a high-capacity network, where machines are prepared before delivery to the new owner’s office.
• No physical access is required. Updates shouldn’t require physical visits, which are disruptive to customers; also, coordinating them is expensive. Missed appointments, customers on vacation, and machines in locked offices all lead to the nightmare of rescheduling appointments. Physical visits can’t be automated.
• The host is already in use. Updates involve a machine that has been in use for a while; therefore, the customer assumes that it will be usable when the update is done. You can’t mess up the machine! By contrast, when an initial OS load fails, you can wipe the disk and start from scratch.
• The host may not be in a “known state.” As a result, the automation must be more careful, because the OS may have decayed since its initial installation. During the initial load, the state of the machine is more controlled.
• The host may have “live” users. Some updates can’t be installed while a machine is in use. Microsoft’s Systems Management Server (SMS) solves this problem by installing packages after a user has entered his or her user name and password to log in but before he or she gets access to the machine. The AutoPatch system used at Bell Labs sends email to a customer two days before and lets the customer postpone the update a few days by creating a file with a particular name in /tmp.
• The host may be gone. In this age of laptops, it is increasingly likely that a host may not always be on the network when the update system is running. Update systems can no longer assume that hosts are alive but must either chase after them until they reappear or be initiated by the host itself on a schedule, as well as any time it discovers that it has rejoined its home network.
• The host may be dual-boot. In this age of dual-boot hosts, update systems that reach out to desktops must be
careful to verify that they have reached the expected OS. A dual-boot PC with Windows on one partition and Linux on another may run for months in Linux, missing out on updates for the Windows partition. Update systems for both the Linux and Windows systems must be smart enough to handle this situation.

3.1.2.2 One, Some, Many

The ramifications of a failed patch process are different from those of a failed OS load. A user probably won’t even know whether an OS failed to load, because the host usually hasn’t been delivered yet. However, a host that is being patched is usually at the person’s desk; a patch that fails and leaves the machine in an unusable condition is much more visible and frustrating.

You can reduce the risk of a failed patch by using the one, some, many technique.

• One. First, patch one machine. This machine may belong to you, so there is incentive to get it right. If the patch fails, improve the process until it works for a single machine without fail.
• Some. Next, try the patch on a few other machines. If possible, you should test your automated patch process on all the other SAs’ workstations before you inflict it on users. SAs are a little more understanding. Then test it on a few friendly customers outside the SA group.
• Many. As you test your system and gain confidence that it won’t melt someone’s hard drive, slowly, slowly, move to larger and larger groups of risk-averse customers.

An automated update system has potential to cause massive damage. You must have a well-documented process around it to make sure that risk is managed. The process needs to be well defined and repeatable, and you must attempt to improve it after each use. You can avoid disasters if you follow this system. Every time you distribute something, you’re taking a risk. Don’t take unnecessary risks.

An automated patch system is like a clinical trial of an experimental new anti-influenza drug. You wouldn’t give an untested drug to thousands of people before you’d tested it on
small groups of informed volunteers; likewise, you shouldn’t implement an automated patch system until you’re sure that it won’t do serious damage. Think about how grumpy your customers would get if your patch killed their machines and they hadn’t even noticed the problem the patch was meant to fix!

Here are a few tips for your first steps in the update process.

• Create a well-defined update that will be distributed to all hosts. Nominate it for distribution. The nomination begins a buy-in phase to get it approved by all stakeholders. This practice prevents overly enthusiastic SAs from distributing trivial, non-business-critical software packages.
• Establish a communication plan so that those affected don’t feel surprised by updates. Execute the plan the same way every time, because customers find comfort in consistency.
• When you’re ready to implement your Some phase, define (and use!) a success metric, such as: If there are no failures, each succeeding group is about 50 percent larger than the previous group. If there is a single failure, the group size returns to a single host and starts growing again.
• Finally, establish a way for customers to stop the deployment process if things go disastrously wrong. The process document should indicate who has the authority to request a halt, how to request it, who has the authority to approve the request, and what happens next.

3.1.3 Network Configuration

The third component you need for a large workstation environment is an automated way to update network parameters, those tiny bits of information that are often related to booting a computer and getting it onto the network. The information in them is highly customized for a particular subnet or even for a particular host. This characteristic is in contrast to a system such as application deployment, in which the same application is deployed to all hosts in the same configuration. As a result, your automated system for updating network parameters is usually separate from the
other systems.

The most common system for automating this process is DHCP. Some vendors have DHCP servers that can be set up in seconds; other servers take considerably longer. Creating a global DNS/DHCP architecture with dozens or hundreds of sites requires a lot of planning and special knowledge. Some DHCP vendors have professional service organizations that will help you through the process, which can be particularly valuable for a global enterprise.

A small company may not see the value in letting you spend a day or more learning something that will, apparently, save you from what seems like only a minute or two of work whenever you set up a machine. Entering an IP address manually is no big deal, and, for that matter, neither is manually entering a netmask and a couple of other parameters. Right?

Wrong. Sure, you’ll save a day or two by not setting up a DHCP server. But there’s a problem: Remember those hidden costs we mentioned at the beginning of this chapter? If you don’t use DHCP, they’ll rear their ugly heads sooner or later. Eventually, you’ll have to renumber the IP subnet, change the subnet netmask or the Domain Name System (DNS) server IP address, or modify some other network parameter. If you don’t have DHCP, you’ll spend weeks or months making a single change, because you’ll have to orchestrate teams of people to touch every host in the network. The small investment of using DHCP makes all future changes down the line nearly free.

Anything worth doing is worth doing well. DHCP has its own best and worst practices. The following section discusses what we’ve learned.

3.1.3.1 Use Templates Rather Than Per-Host Configuration

DHCP systems should provide a templating system. Some DHCP systems store the particular parameters given to each individual host. Other DHCP systems store templates that describe what parameters are given to various classes of hosts. The benefit of templates is that if you have to make the same change to many hosts, you simply change the template, which is
much better than scrolling through a long list of hosts, trying to find which ones require the change. Another benefit is that it is much more difficult to introduce a syntax error into a configuration file if a program is generating the file. Assuming that templates are syntactically correct, the configuration will be too.

Such a system does not need to be complicated. Many SAs write small programs to create their own template systems. A list of hosts is stored in a database (or even a simple text file), and the program uses this data to program the DHCP server’s configuration. Rather than putting the individual host information in a new file or creating a complicated database, the information can be embedded into your current inventory database or file. For example, UNIX sites can simply embed it into the /etc/ethers file that is already being maintained. This file is then used by a program that automatically generates the DHCP configuration. Sample lines from such a file are as follows:

    8:0:20:1d:36:3a     adagio       #DHCP=sun
    0:a0:c9:e1:af:2f    talpc        #DHCP=nt
    0:60:b0:97:3d:77    sec4         #DHCP=hp4
    0:a0:cc:55:5d:a2    bloop        #DHCP=any
    0:0:a7:14:99:24     ostenato     #DHCP=ncd-barney
    0:10:4b:52:de:c9    tallt        #DHCP=nt
    0:10:4b:52:de:c9    tallt-home   #DHCP=nt
    0:10:4b:52:de:c9    tallt-lab4   #DHCP=nt
    0:10:4b:52:de:c9    tallt-lab5   #DHCP=nt

The token #DHCP= would be treated as a comment by any legacy program that looks at this file. However, the program that generates the DHCP server’s configuration uses those codes to determine what to generate for that host. Hosts adagio, talpc, and sec4 receive the proper configuration for a Sun workstation, a Windows NT host, and an HP LaserJet printer, respectively. Host ostenato is an NCD X-Terminal that boots off a Trivial File Transfer Protocol (TFTP) server called barney. The NCD template takes a parameter, thus making it general enough for all the hosts that need to read a configuration file from a TFTP server. The last four lines indicate that Tom’s laptop should
get a different IP address, based on the four subnets to which it may be connected: his office, his home, or the fourth- or fifth-floor labs. Note that even though we are using static assignments, it is still possible for a host to hop networks.[4]

[4] SAs should note that this method relies on an IP address specified elsewhere or assigned by DHCP via a pool of addresses.

By embedding this information into an /etc/ethers file, we reduced the potential for typos. If the information were in a separate file, the data could become inconsistent.

Other parameters can be included this way. One site put this information in the comments of its UNIX /etc/hosts file, along with other tokens that indicated JumpStart and other parameters. The script extracts this information for use in JumpStart configuration files, DHCP configuration files, and other systems. By editing a single file, an SA was able to perform huge amounts of work! The open source project HostDB[5] expands on this idea: you edit one file to generate DHCP and DNS configuration files, as well as to distribute them to appropriate servers.

3.1.3.2 Know When to Use Dynamic Leases

Normally, DHCP assigns a particular IP address to a particular host. DHCP’s dynamic leases feature lets one specify a range of IP addresses to be handed out to hosts. These hosts may get a different IP address every time they connect to the network. The benefit is that it is less work for the system administrators and more convenient for the customers.

Because this feature is used so commonly, many people think that DHCP has to assign addresses in this way. In fact, it doesn’t. It is often better to lock a particular host to a particular IP address; this is particularly true for servers whose IP address is in other configuration files, such as DNS servers and firewalls. This technique is termed static assignment by the RFCs or permanent lease by Microsoft DHCP servers.

The right time to use a dynamic pool is when you have many hosts
chasing a small number of IP addresses. For example, you may have a remote access server (RAS) with 200 modems for thousands of hosts that might dial into it. In that situation, it would be reasonable to have a dynamic pool of 220 addresses.[6] Another example might be a network with a high turnover of temporary hosts, such as a laboratory testbed, a computer installation room, or a network for visitor laptops. In these cases, there may be enough physical room or ports for only a certain number of computers. The IP address pool can be sized slightly larger than this maximum.

Typical office LANs are better suited to dynamically assigned leases. However, there are benefits to allocating static leases for particular machines. For example, by ensuring that certain machines always receive the same IP address, you prevent those machines from not being able to get IP addresses when the pool is exhausted. Imagine a pool being exhausted by a large influx of guests visiting an office and then your boss being unable to access anything because the PC can’t get an IP address.

[5] http://everythingsysadmin.com/hostdb/
[6] Although in this scenario you need a pool of only 200 IP addresses, a slightly larger pool has benefits. For example, if a host disconnects without releasing the lease, the IP address will be tied up until its lease period has ended. Allocating 10 percent additional IP addresses to alleviate this situation is reasonable.

Another reason for statically assigning IP addresses is that it improves the usability of logs. If people’s workstations always are assigned the same IP address, logs will consistently show them at a particular IP address. Finally, some software packages deal poorly with a host changing its IP address. Although this situation is increasingly rare, static assignments avoid such problems.

The exclusive use of statically assigned IP addresses is not a valid security measure. Some sites disable any dynamic assignment, feeling that this will prevent
uninvited guests from using their network. The truth is that someone can still manually configure network settings. Software that permits one to snoop network packets quickly reveals enough information to permit someone to guess which IP addresses are unused, what the netmask is, what DNS settings should be, the default gateway, and so on.

IEEE 802.1x is a better way to do this. This standard for network access control determines whether a new host should be permitted on a network. Used primarily on WiFi networks, network access control is being used more and more on wired networks. An Ethernet switch that supports 802.1x keeps a newly connected host disconnected from the network while performing some kind of authentication. Depending on whether the authentication succeeds or fails, traffic is permitted, or the host is denied access to the network.

3.1.3.3 Using DHCP on Public Networks

Before 802.1x was invented, many people crafted similar solutions. You may have been in a hotel or a public space where the network was configured such that it was easy to get on the network but you had access only to an authorization web page. Once the authorization went through, either by providing some acceptable identification or by paying with a credit card, you gained access. In these situations, SAs would like the plug-in-and-go ease of an address pool while being able to authenticate that users have permission to use corporate, university, or hotel resources. For more on early tools and techniques, see Beck (1999) and Valian and Watson (1999). Their systems permit unregistered hosts to be registered to a person who then assumes responsibility for any harm these unknown hosts create.

3.1.4 Avoid Using Dynamic DNS with DHCP

We’re unimpressed by DHCP systems that update dynamic DNS servers. This flashy feature adds unnecessary complexity and security risk.

In systems with dynamic DNS, a client host tells the DHCP server what its hostname should be, and the DHCP server sends
updates to the DNS server. (The client host can also send updates directly to the DNS server.) No matter what network the machine is plugged in to, the DNS information for that host is consistent with the name of the host.

Hosts with static leases will always have the same name in DNS because they always receive the same IP address. When using dynamic leases, the host’s IP address is from a pool of addresses, each of which usually has a formulaic name in DNS, such as dhcp-pool-10, dhcp-pool-11, dhcp-pool-12. No matter which host receives the tenth address in the pool, its name in DNS will be dhcp-pool-10. This will most certainly be inconsistent with the hostname stored in its local configuration.

This inconsistency is unimportant unless the machine is a server. That is, if a host isn’t running any services, nobody needs to refer to it by name, and it doesn’t matter what name is listed for it in DNS. If the host is running services, the machine should receive a permanent DHCP lease and always have the same fixed name.

Services that are designed to talk directly to clients don’t use DNS to find the hosts. One such example is peer-to-peer services, which permit hosts to share files or communicate via voice or video. When joining the peer-to-peer service, each host registers its IP address with a central registry that uses a fixed name and/or IP address. H.323 communication tools, such as Microsoft Netmeeting, use this technique.

Letting a host determine its own hostname is a security risk. Hostnames should be controlled by a centralized authority, not the user of the host. What if someone configures a host to have the same name as a critical server? Which should the DNS/DHCP system believe is the real server? Most dynamic DNS/DHCP systems let you lock down names of critical servers, which means that the list of critical servers is a new namespace that must be maintained and audited (see Chapter 8, Namespaces).
If you accidentally omit a new server, you have a disaster waiting to occur.

Avoid situations in which customers are put in a position that allows their simple mistakes to disrupt others. LAN architects learned this a long time ago with respect to letting customers configure their own IP addresses. We should not repeat this mistake by letting customers set their own hostnames. Before DHCP, customers would often take down a LAN by accidentally setting their host’s IP address to that of the router. Customers were handed a list of IP addresses to use to configure their PCs. “Was the first one for ‘default gateway,’ or was it the second one? Aw, heck, I’ve got a 50/50 chance of getting it right.” If the customer guessed wrong, communication with the router essentially stopped.

The use of DHCP greatly reduces the chance of this happening. Permitting customers to pick their own hostnames sounds like a variation on this theme that is destined to have similar results. We fear a rash of new problems related to customers setting their host’s name to the name that was given to them to use as their email server or their domain name or another common string.

Another issue relates to how these DNS updates are authenticated. The secure protocols for doing these updates ensure that the host that inserted records into DNS is the same host that requests that they be deleted or replaced. The protocols do little to prevent the initial insertion of data and have little control over the format or lexicon of permitted names. We foresee situations in which people configure their PCs with misleading names in an attempt to confuse or defraud others (a scam that commonly happens on the Internet [7]) coming soon to an intranet near you.

So many risks to gain one flashy feature!
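The burden of maintaining and auditing a list of protected server names can be made concrete with a small sketch. The Python below is only an illustration and is not taken from any DHCP or DNS product: the function names, the sample hostnames, and the decision to compare only the leftmost label of each name are all assumptions made for this example. It flags client-requested hostnames that would collide with a protected server.

```python
def normalize(name: str) -> str:
    """Lowercase a DNS name and drop a trailing dot so names compare equal."""
    return name.strip().rstrip(".").lower()


def short_label(name: str) -> str:
    """Return the leftmost label, e.g. 'mail' for 'mail.corp.example.com'."""
    return normalize(name).split(".", 1)[0]


def find_collisions(requested, protected):
    """Return requested hostnames whose leftmost label matches a protected server.

    Both arguments are iterables of hostnames (hypothetical inputs for this
    sketch). The result is a sorted list of normalized names.
    """
    protected_labels = {short_label(p) for p in protected}
    return sorted(
        {normalize(r) for r in requested if short_label(r) in protected_labels}
    )


if __name__ == "__main__":
    # Hypothetical data: two critical servers and three client requests.
    protected = ["mail.corp.example.com", "www.corp.example.com"]
    requested = ["Mail.dhcp.corp.example.com", "toms-laptop", "WWW."]
    for name in find_collisions(requested, protected):
        print("collision:", name)
```

A check like this helps only while the protected list is current. A server that is missing from the list is never flagged, which is the omission risk described above.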
Advocates of such systems argue that all these risks can be managed or mitigated, often through additional features and controls that can be configured. We reply that adding layers of complicated databases to manage risk sounds like a lot of work that can be avoided by simply not using the feature.

Some would argue that this feature increases accountability, because logs will always reflect the same hostname. We, on the other hand, argue that there are other ways to gain better accountability. If you need to be able to trace illegal behavior of a host to a particular person, it is best to use a registration and tracking system (Section 3.1.3.3).

Dynamic DNS with DHCP creates a system that is more complicated, more difficult to manage, more prone to failure, and less secure in exchange for a small amount of aesthetic pleasantness. It’s not worth it.

Despite these drawbacks, OS vendors have started building systems that do not work as well unless dynamic DNS updates are enabled. Companies are put in the difficult position of having to choose between adopting new technology or reducing their security standards. Luckily, the security industry has a useful concept: containment. Containment means limiting a security risk so that it can affect only a well-defined area. We recommend that dynamic DNS should be contained to particular network subdomains that will be treated with less trust. For example, all hosts that use dynamic DNS might have such names as myhost.dhcp.corp.example.com. Hostnames in the dhcp.corp.example.com zone might have collisions and other problems, but those problems are isolated in that one zone.

[7] For many years, www.whitehouse.com was a porn site. This was quite a surprise to people who were looking for www.whitehouse.gov.

This technique can be extended to the entire range of dynamic DNS updates that are required by domain controllers in Microsoft ActiveDirectory. One creates many contained areas for DNS zones with funny-looking names, such as
_tcp.corp.example.com and _udp.corp.example.com (Liu 2001).

3.1.4.1 Managing DHCP Lease Times

Lease times can be managed to aid in propagating updates. DHCP client hosts are given a set of parameters to use for a certain amount of time, after which they must renew their leases. Changes to the parameters are seen at renewal time.

Suppose that the lease time for a particular subnet is 2 weeks. Suppose that you are going to change the netmask for that subnet. Normally, one can expect a 2-week wait before all the hosts have this new netmask. On the other hand, if you know that the change is coming, you can set the lease time to be short during the time leading up to the change. Once you change the netmask in the DHCP server’s configuration, the update will propagate quickly. When you have verified that the change has created no ill effects, you can increase the lease time to the original value (2 weeks). With this technique, you can roll out a change much more quickly.

DHCP for Moving Clients Away from Resources

At Bell Labs, Tom needed to change the IP address of the primary DNS server. Such a change would take only a moment but would take weeks to propagate to all clients via DHCP. Clients wouldn’t function properly until they had received their update. It could have been a major outage. He temporarily configured the DHCP server to direct all clients to use a completely different DNS server. It wasn’t the optimal DNS server for those clients to use, but it was one that worked. Once the original DNS server had stopped receiving requests, he could renumber it and test it without worry. Later, he changed the DHCP server to direct clients to the new IP address of the primary DNS server. Although hosts were using a slower DNS server for a while, they never felt the pain of a complete outage.

The optimal length for a default lease is a philosophical battle that is beyond the scope of this book. For discussions on the topic, we recommend The DHCP Handbook (Lemon and Droms 1999) and
DHCP: A Guide to Dynamic TCP/IP Network Configuration (Kercheval 1999).

Case Study: Using the Bell Labs Laptop Net

The Computer Science Research group at Bell Labs has a subnet with a 5-minute lease in its famous UNIX Room. Laptops can plug in to the subnet in this room for short periods. The lease is only 5 minutes because the SAs observed that users require about 5 minutes to walk their laptops back to their offices from the UNIX Room. By that time, the lease has expired. This technique is less important now that DHCP client implementations are better at dealing with rapid change.

3.2 The Icing

Up to this point, this chapter has dealt with technical details that are basic to getting workstation deployment right. These issues are so fundamental that doing them well will affect nearly every other possible task. This section helps you fine-tune things a bit.

Once you have the basics in place, keep an eye open for new technologies that help to automate other aspects of workstation support (Miller and Donnini 2000a). Workstations are usually the most numerous machines in the company. Every small gain in reducing workstation support overhead has a massive impact.

3.2.1 High Confidence in Completion

There are automated processes, and then there is process automation. When we have exceptionally high confidence in a process, our minds are liberated from worry of failure, and we start to see new ways to use the process.

Christophe Kalt had extremely high confidence that a Solaris JumpStart at Bell Labs would run to completion without fail or without the system unexpectedly stopping to ask for user input. He would use the UNIX at command to schedule hosts to be JumpStarted [8] at times when neither he nor the customer would be awake, thereby changing the way he could offer service to customers. This change was possible only because he had high confidence that the installation would complete without error.

[8] The Solaris command reboot -- "net - install" eliminates the need for a human to type on the console to
start the process. The command can be done remotely, if necessary.


