O'Reilly Unix Backup and Recovery, Part 9

• The media is inexpensive and easily available from a number of vendors.
• Per-disk capacities have grown from 100 MB to 6.4 GB.
• The drives themselves are inexpensive.
• The media retains data longer than many competing formats.

M-O drives use the M-O recording method and are readily available from a number of vendors. There is also a large line of automated libraries that support M-O drives and media. This level of automation, combined with its low cost, makes M-O an excellent choice for nearline environments.

The format isn't perfect, though. Overwriting an M-O cartridge requires multiple passes. However, there is a proposed technology, called Advanced Storage Magneto-Optical (ASMO), that promises to solve this problem. ASMO promises a high-speed, direct overwrite-rewritable optical system capable of reading both CD-ROM and DVD-ROM disks. It is supposed to have faster transfer rates than any of the DVD technologies, a capacity of 6 GB, and an infinite number of rewrites. Compare this to DVD-RW's 4.7 GB and 1,000 rewrites, DVD-RAM's 2.6 GB and 100,000 rewrites, and DVD+RW's 3 GB and 100,000 rewrites.

The number of rewrites is important because one of the intended markets is as a permanent secondary storage device for desktop users. If it can achieve a transfer rate of 2 MB/s, a user could create a full backup of a 6-GB hard drive in under an hour (6 GB at 2 MB/s is roughly 3,000 seconds, or about 50 minutes). Making this backup would be as easy as a drag and drop, and the resulting disk could be removed to a safe location. The best part, though, is that the restore is also a simple drag and drop, and accessing the file would take seconds, not minutes.

For More Information
This entire optical section could not have been written without the folks at http://www.cdpage.com, especially Dana Parker. They were the only available source for a lot of this information. They are keeping close tabs on this highly volatile industry, especially the CD and DVD part of it.
Make sure you check their web site for updated information.

Automated Backup Hardware
So far this chapter covers only the tape and optical drives themselves. However, today's environments are demanding more and more automation as databases, filesystems, and servers become larger and more complex. Spending a few thousand dollars on some type of automated volume management system can reduce the need for manual intervention, drastically increasing the integrity of a backup system. It reduces administrator frustration by handling the most common (and most boring) task associated with backups: swapping a volume.

There are essentially three types of automated backup hardware. Some people may use these three terms interchangeably. For the purposes of this chapter, these terms are used as they are defined here:

Stacker
This is how many people enter the automation market. A stacker gets its name from the way stackers were originally designed. Tapes appeared to be "stacked" on top of one another in early models, although many of today's stackers have the tapes sitting side by side. A stacker is traditionally a sequential-access device, meaning that when you eject tape 1, it automatically puts in tape 2. If it contains 10 tapes, and you eject tape 10, it puts in tape 1. You cannot tell a true stacker to "put in tape 5." (This capability is referred to as random access.) It is up to you to know which tape is currently in the drive and to calculate the number of ejects required to get to tape 5. Stackers typically have between 4 and 12 slots and one or two drives. Many products that are advertised as stackers support random access, so the line is slightly blurred. However, in order to be classified as a stacker, a product must support sequential-access operation. This allows an administrator to easily use shell scripts to control the stacker.
Once you purchase a commercial backup product, you have the option of putting the stacker into random-access mode and allowing the backup product to control it. (Control of automated backup hardware is almost always an extra-cost option.)

Library
This category of automated backup hardware is called many things, but the most common terms are "library," "autoloader," and "jukebox." Each of these terms connotes an addressable group of volumes that can be automatically loaded via unique volume addresses. This means that each slot and drive within a library is given a location address. For example, the first slot may be location 0000, and the first drive may be location 1000. When the backup software controlling the library tells it to put the tape from slot 1 into drive 1, it actually is saying "move the volume in location 0000 to location 1000." The primary difference between a library and a stacker is that a library can operate only in random-access mode. Today's libraries are starting to borrow advanced features that used to be found only in silos, such as import/export ports, bar code readers, visual displays, and Ethernet ports for SNMP monitoring. Libraries may range from 12 slots to 500 or more slots. The largest libraries have even started to offer pass-through ports, which allow one library to pass tapes to another library. (This is usually a standard feature in silos.)

Silo
Since many libraries now offer features that used to be found only in silos, the distinction between the two is getting very blurred. The main distinction between a silo and a library today is whether or not it allows multiple hosts to connect to it. If multiple hosts can connect to a silo, they can all share the same drives and volumes. However, with the advent of Storage Area Networks and SCSI switches, libraries now offer this feature too. Silos typically contain at least 500 volumes.
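
The sequential-access and location-address behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the idea that slot and drive addresses run consecutively from 0000 and 1000 goes beyond the single example given in the text. Real stackers and libraries are driven through vendor-specific interfaces that this sketch does not model.

```python
def ejects_needed(current_tape: int, target_tape: int, slots: int) -> int:
    """Ejects required to bring target_tape into the drive of a sequential
    stacker that wraps from the last tape back to tape 1."""
    if not (1 <= current_tape <= slots and 1 <= target_tape <= slots):
        raise ValueError("tape numbers must be between 1 and the slot count")
    # Each eject advances one position; modulo handles the wrap-around.
    return (target_tape - current_tape) % slots


def slot_address(slot: int) -> str:
    """Assumed addressing: slot 1 is 0000, slot 2 is 0001, and so on."""
    return f"{slot - 1:04d}"


def drive_address(drive: int) -> str:
    """Assumed addressing: drive 1 is 1000, drive 2 is 1001, and so on."""
    return f"{1000 + drive - 1:04d}"


def move_command(slot: int, drive: int) -> str:
    """Phrase a random-access request the way the text describes it."""
    return (f"move the volume in location {slot_address(slot)} "
            f"to location {drive_address(drive)}")


if __name__ == "__main__":
    print(ejects_needed(1, 5, 10))   # tape 1 loaded, tape 5 wanted
    print(ejects_needed(10, 1, 10))  # ejecting tape 10 loads tape 1
    print(move_command(1, 1))
```

The modulo arithmetic is the bookkeeping a shell script would otherwise have to do by hand for a true stacker, while a library only needs the two addresses.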

Vendors
I would like to make a distinction between Independent Hardware Vendors (IHVs) and Value Added Resellers (VARs). The IHVs actually manufacture the stackers, libraries, and silos. A VAR may bundle that piece of hardware with additional software and support as a complete package, often relabeling the original hardware with their logo and/or color scheme. There is a definite need for VARs. They can provide you with a single point of contact for all issues related to your backup system. You can call them for support on your library, RAID system, and the software that controls it all, even if they came from three different vendors. VARs sometimes offer added functionality to a product.

The following should not be considered an exhaustive list of IHVs. These are simply the ones that I know about. Inclusion in this list should not be considered an endorsement, and exclusion from this list means nothing. I am sure that there are other IHVs that offer their own unique set of features. Since there are far too many VARs to list here, only IHVs are listed.

Stackers and Autoloaders
ADIC (http://www.adic.com)
ADIC makes stackers and libraries of all shapes and sizes for all budgets. After establishing a solid position in this market, they decided to expand. They recently acquired EMASS, one of the premier silo vendors, and now sell the largest silos in the world.
ATL (http://www.atlp.com)
ATL makes some of the best-known DLT stackers and libraries on the market. Many VARs relabel and resell ATL's libraries.
Breece Hill (http://www.breecehill.com)
Breece Hill is another well-known DLT stacker and library manufacturer. Their new Saguaro line expands their capacity to more than 200 volumes per library.
Exabyte (http://www.exabyte.com)
At one time, all 8-mm stackers and libraries came from Exabyte. Although this is no longer the case, they still have a very large line of stackers and libraries of all shapes and sizes.
Mountain Gate (http://www.mountaingate.com)
Mountain Gate has been making large-scale storage systems for a while and has now applied their technology to DLT and 3590 libraries. These libraries offer capacities of up to 134 TB.
Overland Data (http://www.overlanddata.com)
Overland Data offers small DLT libraries with a unique feature: scalability. They sell an enclosure that can fit several of the small libraries, allowing them to exchange volumes between them. This allows those on a budget to start small while accommodating growth as it occurs.
Qualstar (http://www.qualstar.com)
Qualstar's product line offers some interesting features not typically found in 8-mm libraries. (They also now make DLT libraries.) Their design reduced the number of moving parts and added redundant, hot-swappable power supplies. Another interesting feature is an infrared beam that detects when a hand is inserted into the library.
Quantum
Quantum, the maker of DLT tape drives, has a line of small stackers and libraries. They are now sold exclusively through ATL.
Seagate (http://www.seagate.com)
Seagate has a small line of DDS stackers.
Sony (http://www.sony.com)
Sony also has a small line of DDS stackers.
Spectralogic (http://www.spectralogic.com)
Spectralogic's libraries have a very easy-to-use LCD touch screen, and almost all parts are Field Replaceable Units (FRUs). The power supplies, tape drives, motherboards, and slot system can all be replaced with a simple turn of a thumbscrew.

Large Libraries and Silos
ADIC (http://www.adic.com)
ADIC/Emass is the only library manufacturer that allows you to mix drive and media types within a single library. This allows you to upgrade your drives to the current technology while keeping the library. ADIC has the largest silos available, expandable up to 60,000 pieces of media for a total storage capacity of over 4 petabytes.
IBM (http://www.ibm.com)
IBM makes a line of expandable libraries for their 3490E and 3590 tape drives that can fit up to 6240 cartridges for a total storage capacity of 187 terabytes.
Storagetek (http://www.storagetek.com)
Storagetek also offers a line of very large, expandable tape libraries. Most of their libraries can accept any of the Storagetek drives or DLT drives. The libraries have a capacity of up to 6000 tapes and 300 TB per library storage module. Almost all of their libraries can be interconnected to provide an infinite storage capacity.

Optical jukeboxes
HP Optical (http://www.hp-optical.com)
Hewlett-Packard is the leader in the optical jukebox field, providing magneto-optical jukeboxes of sizes up to 1.3 TB in capacity.
Maxoptix Optical (http://www.maxoptix.com)
Maxoptix specializes in M-O jukeboxes and also offers them in a number of sizes ranging up to 1.3 TB.
Plasmon Optical (http://www.plasmon.com)
Plasmon makes the biggest M-O jukebox currently available, with a capacity of 500 slots and 2.6 TB. They also have a line of CD jukeboxes available.

Hardware Comparison
Table 18-4 summarizes the information in this chapter. It contains an exhaustive list of the types of Unix-compatible storage devices available to you. Some drives, like the 4-mm, 8-mm, CD-R, and M-O drives, are made by a number of manufacturers. The specifications listed for these drives therefore should be considered approximations. Check the actual manufacturer of the drive you intend to purchase for specific information.

Table 18-4. Backup Hardware Comparison

| Model Name or Generic Name | Vendor | Media Type, Comments, (Expected release date) | H/L | Capacity (Gigabytes) | MB/s | Avg Load Time (sec) | Avg Seek Time (sec) |
|---|---|---|---|---|---|---|---|
| DST 312 | Ampex | DST | H | 50-330 | 15/20 | | |
| 8205XL | Exabyte et al. | 8-mm | H | 3.5 | 275 KB/s | | |
| 8505XL | Exabyte et al. | 8-mm | H | 7 | 500 KB/s | | |
| 8900-Mammoth | Exabyte et al. | AME 8-mm | H | 20 | 3 | 20 | |
| DDS-1 | Various | Very rare | H | 1.3 | 250 KB/s | | |
| DDS-2 | Various | DDS | H | 4 | 510 KB/s | 20 | |
| DDS-3 | Various | DDS | H | 12 | 1.6 | | |
| DDS-DC | Various | DDS | H | 2 | 250 KB/s | | |
| 3570-Magstar MP | IBM | Midpoint load | | 5 | 2.2 | | |
| 3590 | IBM | 3590 | | 10 | 9 | | |
| 3480 | Various | 3480 | L | 200 MB | 3 | 13 | |
| 3490 | Various | 3490 | L | 600 MB | 3 | 13 | |
| 3490E | Various | 3490E | L | 800 MB | 3 | 13 | |
| DTR-48 | Metrum | M-II | | 36 | 48 | | |
| Model 64 | Metrum | Super VHS | | 27.5 | 64 | | |
| LMS NCTP | Plasmon | 3480/3490E | | 18 | 10 | 30 | 81 |
| DLT 2000XT | Quantum | Very rare | H | 15 | 1.25 | 45 | 45 |
| DLT4000 | Quantum | DLT | L | 20 | 1.5 | 45 | 45 |
| DLT7000 | Quantum | DLT | L | 35 | 5 | 40 | 60 |
| Super DLT | Quantum | (Release date unknown) | | 100-500 | 10-40 | | |
| DTF | Sony | DTF | | 42 | 12 | | |
| DTF-2 | Sony | DTF | H | 100 | 24 | 7 | |
| DTF-2 | Sony | DTF, will have fiber channel interface (1999) | H | 100 | 24 | 7 | |
| DTF-3 | Sony | DTF (2000) | H | 200 | 24 | | |
| DTF-4 | Sony | DTF (200x) | H | 400 | 48 | | |
| AIT-1 | Sony/Seagate | AIT/AME | H | 25 | 3 | 7 | 27 |
| AIT-2 | Sony/Seagate | AIT/AME | H | 50 | 6 | 7 | 27 |
| AIT-3 | Sony/Seagate | AIT/AME (1999) | H | 100 | 12 | | |
| AIT-4 | Sony/Seagate | AIT/AME (2000) | H | 200 | 24 | | |
| 4490 | Storagetek | | | | | | |
| 9490-Timberline | Storagetek | 3480 or EETape | | 6 | 4.3-5.0 | 10.4 | |
| 9840 | Storagetek | 3840/3490E, media stays in cartridge | L | 20 | 10 | 4 | 11 |
| SD-3-Redwood | Storagetek | 3490E | H | 10-50 | 11.1 | 17 | 21-53 |
| MLR-1 | Tandberg | QIC MLR-1 | L | 16 | 1.5 | 30 | 55 |
| MLR-3 | Tandberg | QIC MLR-3 | L | 25 | 2 | 30 | 55 |
| MO 640 | Fujitsu | M-O | | 128 MB-643 MB | 1-4 | 7 | 28 ms |
| MO 540 | Sony | M-O | | 2.6 | | | |
| MO 551 | Sony | M-O | | 5.2 | 5 | | 25 ms |
| MO 5200ex | HP | M-O | | 5.2 | Write 2.3, Read 4.6 | 5.5 | 35 ms |
| CD-R Spressa 9488 | Sony | CD-R | | 680 MB | Write 600 KB/s; 600 KB/s | | 220 ms |
| CD-RW 8100i | HP, JVC, Mitsumi, NEC, Philips, Panasonic, Ricoh, Sony, Teac, Plextor, Yamaha | CD-RW, CD-R | | 680 MB | CD-R and CD-RW read 3.6 MB/s; CD-R write 600 KB/s; CD-RW write 300 KB/s | 7 | 200 ms |
| DVD-R | Pioneer | DVD-R | | 4 | 1.3 | | |
| DVD-RAM | Matsushita, Toshiba, Hitachi | DVD-RAM | | 2.6 | 1.35 | | 120 ms |
| DVD-RW | | | | 4.7 | | | |
| DVD+RW | Sony, Philips, HP, Ricoh, Yamaha, Mitsubishi | DVD+RW | | 3 | | | |

19. Miscellanea
No matter how we organized this book, there would be subjects that wouldn't fit anywhere else. This chapter covers these subjects, including such important information as backing up volatile filesystems and handling the difficulties inherent in gigabit Ethernet.

Volatile Filesystems
A volatile filesystem is one that changes heavily while it is being backed up. Backing up a very volatile filesystem could result in a number of negative side effects. The degree to which a backup will be affected is directly proportional to the volatility of the filesystem and highly dependent on the backup utility that you are using. Some files could be missing or corrupted within the backup, or the wrong versions of files may be found within the backup. The worst possible problem, of course, is that the backup itself could become corrupted, although this could happen only under the most extreme circumstances. (See "Demystifying dump" for details on what can happen when performing a dump backup of a volatile filesystem.)

Missing or Corrupted Files
Files that are changing during the backup do not always make it to the backup correctly. This is especially true if the filename or inode changes during the backup. The extent to which your backup is affected by this problem depends on what type of utility you're using and how volatile the filesystem is.

For example, suppose that the utility performs the equivalent of a find command at the beginning of the backup, based solely on the names of the files.
This utility then begins backing up those files based on the list that it created at the beginning of the backup. If a filename changes during a backup, the backup utility will receive an error when it attempts to back up the old filename. The file, with its new name, will simply be overlooked.

Another scenario would be if the filename does not change, but the file's contents do change. The backup utility begins backing up the file, and the file changes while being backed up. This is probably most common with a large database file. The backup of this file would be essentially worthless, since different parts of it were created at different times. (This is actually what happens when backing up Oracle database files in hot-backup mode. Without Oracle's ability to rebuild the file, the backup of these files would be worthless.)

Referential Integrity Problems
This is similar to the corrupted files problem but on a filesystem level. Backing up a particular filesystem may take several hours. This means that different files within the backup will be backed up at different times. If these files are unrelated, this creates no problem. However, suppose that two different files are related in such a way that if one is changed, the other is changed. An application needs these two files to be related to each other. This means that if you restore one, you must restore the other. It also means that if you restore one file to 11:00 P.M. yesterday, you should restore the other file to 11:00 P.M. yesterday. (This scenario is most commonly found in databases but can be found in other applications that use multiple, interrelated files.)

Suppose that last night's backup began at 10:00 P.M. Because of the name or inode order of the files, one is backed up at 10:15 P.M. and the other at 11:05 P.M. However, the two files were changed together at 11:00 P.M., between their separate backup times.
Under this scenario, you would be unable to restore the two files to the way they looked at any single point in time. You could restore the first file to how it looked at 10:15, and the second file to how it looked at 11:05. However, they need to be restored together. If you think of files within a filesystem as records within a database, this would be referred to as a referential integrity problem.

Corrupted or Unreadable Backup
If the filesystem changes significantly while it is being backed up, some utilities may actually create a backup that they cannot read. This is obviously one of the most dangerous things that can happen to a backup, and it would happen only under the most extreme circumstances.

Torture-Testing Backup Programs
In 1991, Elizabeth Zwicky did a paper for the LISA* conference called "Torture-testing Backup and Archive Programs: Things You Ought to Know But Probably Would Rather Not." Although this paper and its information are somewhat dated now, people still refer to this paper when talking about this subject. Elizabeth graciously consented to allow us to include some excerpts in this book:

Many people use tar, cpio, or some variant to back up their filesystems. There are a certain number of problems with these programs documented in the manual pages, and there are others that people hear of on the street, or find out the hard way. Rumors abound as to what does and does not work, and what programs are best. I have gotten fed up, and set out to find Truth with only Perl (and a number of helpers with different machines) to help me.

As everyone expects, there are many more problems than are discussed in the manual pages. The rest of the results are startling. For instance, on Suns running SunOS 4.1, the manual pages for both tar and cpio claim bugs that the programs don't actually have any more. Other "known" bugs in these programs are also mysteriously missing.
On the other hand, new and exciting bugs (bugs with symptoms like confusions between file contents and their names) appear in interesting places.

Elizabeth performed two different types of tests. The first type were static tests that tried to see which types of programs could handle strangely named files, files with extra long names, named pipes, and so on. Since at this point we are talking only about volatile filesystems, I will not include her static tests here. Her active tests included:

• A file that becomes a directory
• A directory that becomes a file
• A file that is deleted
• A file that is created
• A file that shrinks
• Two files that grow at different rates

Elizabeth explains how the degree to which a utility would be affected by these problems depends on how that utility works:

Programs that do not go through the filesystem, like dump, write out the directory structure of a filesystem and the contents of files separately. A file that becomes a directory or a directory that becomes a file will create nasty problems, since the content of the inode is not what it is supposed to be. Restoring the backup will create a file with the original type and the new contents.

Similarly, if the directory information is written out and then the contents of the files, a file that is deleted during the run will still appear on the volume, with indeterminate contents, depending on whether or not the blocks were also reused during the run.

* Large Installation System Administration Conference, sponsored by Usenix and Sage (http://www.usenix.org).

All of the above cases are particular problems for dump and its relatives; programs that go through the filesystem are less sensitive to them. On the other hand, files that shrink or grow while a backup is running are more severe problems for tar, and other filesystem based programs. dump will write the blocks it intends to, regardless of what happens to the file.
If the file has been shortened by a block or more, this will add garbage to the end of it. If it has lengthened, it will truncate it. These are annoying but nonfatal occurrences. Programs that go through the filesystem write a file header, which includes the length, and then the data. Unless the programmer has thought to compare the original length with the amount of data written, these may disagree. Reading the resulting archive, particularly attempting to read individual files, may have unfortunate results.

Theoretically, programs in this situation will either truncate or pad the data to the correct length. Many of them will notify you that the length has changed, as well. Unfortunately, many programs do not actually do truncation or padding; some programs even provide the notification anyway. (The "cpio out of phase: get help!" message springs to mind.) In many cases, the side reading the archive will compensate, making this hard to catch. SunOS 4.1 tar, for instance, will warn you that a file has changed size, and will read an archive with a changed size in it without complaints. Only the fact that the test program, which runs until the archiver exits, got ahead of tar, which was reading until the file ended, demonstrated the problem. (Eventually the disk filled up, breaking the deadlock.)

Other warnings
Most of the things that people told me were problems with specific programs weren't; on the other hand, several people (including me) confidently predicted correct behavior in cases where it didn't happen. Most of this was due to people assuming that all versions of a program were identical, but the name of a program isn't a very good predictor of its behavior. Beware of statements about what tar does, since most of them are either statements about what it ought to do, or what some particular version of it once did. Don't trust programs to tell you when they get things wrong either.
Many of the cases in which things disappeared, got renamed, or ended up linked to fascinating places involved no error messages at all.

Conclusions
These results are in most cases stunningly appalling. dump comes out ahead, which is no great surprise. The fact that it fails the name length tests is a nasty surprise, since theoretically it doesn't care what the full name of a file is; on the other hand, it fails late enough that it does not seem to be an immediate problem. Everything else fails in [...]

... user or application simply views the filesystem via the snapshot, and where the blocks come from is managed by the snapshot software.

Available snapshot software
There are two software products that allow you to perform snapshots on Unix filesystem data and a hardware platform that supports snapshots:

CrosStor Snapshot (http://www.crosstor.com)
CrosStor, formerly Programmed Logic, has several storage...

... you feel stupid. Get to know your fellow newsgroup posters. You never know when they'll be gone for good. Understand that friends come and go, but with a precious few you should hold on. Post in r.a.sf.w.r-j, but leave before it makes you hard. Post in a.f.e but leave before it makes you soft. Browse. Accept certain inalienable truths: Spam will rise. Newsgroups will flamewar. You too will become an oldbie. And...

Maybe you'll meet F2F, maybe you won't. Whatever you do, don't congratulate yourself too much, or berate yourself either. Your choices are half chance. So are everybody else's. Enjoy your Internet access. Use it every way you can. Don't be afraid of it or of what other people think of it. It's a privilege, not a right. Read the readme.txt, even if you don't follow it. Do not read Unix manpages. They will only...
... freedom and innocence of newbieness until they have been overtaken by weary cynicism. But trust me, in three months, you'll look back on www.deja.com at posts you wrote and recall in a way you can't grasp now how much possibility lay before you and how witty you really were. You are not as bitter as you imagine. Write one thing every day that is on topic. Chat. Don't be trollish in other people's newsgroups...

multiple, on single tape (Informix backups), 408
not making, reasons for, 481
periodic backups (AMANDA), 149
periodic full dumps, 175
redologs, 461 (see also archived redologs)
tar command, creating, 115
Unix sixth edition, reading, 113
ARCS PROM (IRIX system), 318
arguments
hostdump.sh command, 88
infback.sh, calling with and without, 414
restore command optional, 95
tape device, specifying, 100
ASCII converting...

... newsgroups. Don't put up with people who are trollish in yours. Update your virus software. Sometimes you're ahead, sometimes you're behind. The race is long and, in the end, it's only with yourself. Remember the praise you receive. Forget the flames. If you succeed in doing this, tell me how. Get a good monitor. Be kind to your eyesight. You'll miss it when it's gone. Maybe you'll lurk, maybe you won't. Maybe...

... inodes are compared to DUMP_SINCE. Modification times of files greater than or equal to DUMP_SINCE are candidates for backup; the rest are skipped. While looking at the inodes, dump builds:

• A list of file inodes to back up
• A list of directory inodes seen
• A list of used (allocated) inodes

Pass IIa
dump rescans all the inodes and specifically looks at directory inodes that were found in Pass I to...
and recovery, 69-140
comparison of, 127-129
testing, 651-663 (see also native backup utilities)
Oracle, 455
SIDF, using, 218
special-use, 231
versions, problems with, 652
views, backup script, 610, 612
VOB snapshot backups, 600
backup volumes, 225
bar coding, 227
block size
dd command, determining with, 126
dd utility, 123
restore command, specifying, 95
table of contents and, 92
blocking factor (tar...

... some crucial area. For copying portions of filesystems, afio appears to be about as good as it gets, if you have long filenames. If you know that all of the files will fit within the path limitations, GNU tar is probably better, since it handles large numbers of links and permission problems better.

There is one comforting statement in Elizabeth's paper: "It's worth remembering that most people who use... programs don't encounter these problems." Thank goodness!

Using Snapshots to Back Up a Volatile Filesystem
What if you could back up a very large filesystem in such a way that its volatility was irrelevant? A recovery of that filesystem would restore all files to the way they looked when the entire backup began, right? A new technology called the snapshot allows you to do just that. A snapshot provides ...
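
The case described under Missing or Corrupted Files, where a file is renamed or modified while it is being copied, can at least be detected coarsely by comparing the file's metadata before and after the copy. The following Python sketch illustrates that idea only. The helper names fingerprint and copy_with_check are hypothetical, it is not how dump, tar, or cpio work, and it does not provide the point-in-time consistency that a snapshot is meant to give.

```python
import os
import shutil


def fingerprint(path: str) -> tuple:
    """Return (inode, size, mtime in ns) for path (hypothetical helper)."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def copy_with_check(src: str, dst: str) -> bool:
    """Copy src to dst and report whether src looked stable during the copy.

    Returns True if the inode, size, and modification time were identical
    before and after the copy, and False if they differ or if src vanished
    (for example, because it was renamed) while it was being copied.
    """
    before = fingerprint(src)
    shutil.copyfile(src, dst)
    try:
        after = fingerprint(src)
    except FileNotFoundError:
        return False
    return before == after
```

A backup script built on this idea could retry or flag any file for which copy_with_check returns False. The check is deliberately conservative and imperfect: a change that leaves the size unchanged and falls within the filesystem's timestamp granularity would go unnoticed, and it says nothing about whether related files are consistent with each other.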

Date posted: 14/08/2014, 02:22
