Taking the Mystery out of Computer Technology
  • Disaster Recovery

    Posted on May 1st, 2009 Rich Schierer 2 comments

    I found this very informative article at:
    http://www.dscorp.net/infocenter.php?sub=2

    Dissecting disaster recovery solutions

    Cables become unplugged, electronics fail, disks stop spinning, batteries run down, viruses propagate, and regardless of defined policies or procedures, electronic records will continue to be discarded and overwritten. Even with the introduction of specialized hardware and fault-tolerant solutions for clustering and replication, data can and will continue to be lost.

    Whether an organization that has suffered a significant system or data loss continues to succeed does not rely strictly on its ability to replace hardware or rebuild infrastructure. In most cases, continued success relies heavily on the ability to quickly and successfully recover business-critical data. Considering that this is one of the key deciding factors in whether your company will remain in business, shouldn’t you be prepared to make the needed data protection decisions up front?

    Backup concepts, while firmly rooted, continue to grow with technology and adopt new meanings and innovative approaches in the process. Although sometimes delivered as a standalone solution, Disaster Recovery (DR) is often a component of a backup and recovery solution. Put simply, DR can be defined as the ability to quickly and gracefully recover from total data loss. However, this definition has become somewhat blurred by the many manufacturers who are developing and promoting various hardware and software products as DR solutions.
    Backup and recovery: the last line of defense

    Server clustering and RAID disk arrays are often mistakenly thought of as DR solutions, when in reality they are High Availability (HA) solutions. Since hardware failures are fairly common, these technologies were designed to offer increased performance, provide high availability of data, and create an additional level of fault tolerance by reducing the possibility of data loss caused by hardware failure.

    Business Continuance (BC) solutions such as data replication, persistent image technology and volume snapshot solutions offer quick point-in-time recovery of data lost due to corruption or user error, but they are sometimes also mislabeled as DR solutions.

    All of these offer good first- and second-line defenses that help prevent data loss. They are designed as disaster prevention tools rather than for DR: they promote fault tolerance, high availability and quick recovery but, with a few exceptions, are still susceptible to data loss due to hardware failure, corruption, viruses or user error. If not caught quickly, a virus, data corruption or accidental deletion could easily be cloned, replicated or mirrored throughout an organization.
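
    The difference between mirroring and a point-in-time backup can be illustrated with a toy model. This is only a sketch: the file names, the dictionaries standing in for volumes, and the `mirror` helper are all hypothetical and do not represent any real replication product.

```python
import copy

# Toy model: each "volume" is just a dict mapping file names to contents.
primary = {"payroll.xls": "Q1 figures", "drawing.dwg": "floor plan"}

# A point-in-time backup is an independent copy taken before the incident.
backup = copy.deepcopy(primary)


def mirror(source):
    """Hypothetical replication pass: the replica reflects the source's current state."""
    return copy.deepcopy(source)


# An accidental deletion happens on the primary...
del primary["payroll.xls"]

# ...and the next replication pass faithfully copies the loss.
replica = mirror(primary)

print("payroll.xls" in replica)  # False: the mirror replicated the deletion
print("payroll.xls" in backup)   # True: the earlier backup still holds the file
```

    In this simplified picture the replica is always as current as the primary, which is exactly why it cannot undo a mistake, while the older backup copy can.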

    With each step taken toward the goal of achieving 100% data availability, the technology grows more costly to implement and manage. It becomes a matter of determining what level of data protection is required and affordable to maintain. Regardless of what other technology is in place, once data has been lost, backup and recovery software is the last line of defense before giving up on recovering your data. The purpose of DR is to aid in quickly getting that system up and running and the data restored, so that you and your users can access that data much more easily and quickly than is possible through manual recovery methods.
    Manual recovery

    When compared to the reality of data loss, even a manual recovery process is better than no recovery. While at first it may not seem like such a large task to manually recover a failed system, it can prove a rather cumbersome and time-consuming task for even the most seasoned IT professionals.

    The system failure and resulting data loss could have been on your e-commerce server, an accounting system devoted to payroll, or perhaps a file server full of architectural drawings. In any case, the first task would be to isolate the problem and take steps to correct it. This could be as simple as identifying and replacing a defective SCSI controller and hard drive or as complex as finding a replacement for an obsolete motherboard. You must configure your partitions or any special RAID sets that are needed. Then, with basic hardware problems resolved, it is time to locate your installation CDs and license activation keys and reinstall the OS. You may also need previous system information such as network addresses, directory structures, volume sizes or cluster information to finish the installation. Hopefully you will have Internet access, because your OS and any additional hardware may require device drivers, patches and several megabytes of service packs before you can see all of your peripherals.

    Once the base OS is up and running, you will still need to locate, install and configure your applications and backup software. Finally, after an average of 2 to 4 hours of manual processes, you can load a tape and start rebuilding the catalog so you can begin selecting files to restore. The actual rebuild and restore process could then add 30 minutes to 4 hours (or more) to the total recovery time.
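
    Adding up the figures quoted above gives a rough sense of the total. The sketch below simply sums the two ranges from the text; the constant and function names are made up for illustration, and the upper bound is not a hard limit, since the article notes the restore can take longer.

```python
# Time ranges quoted in the text, in hours, as (low, high).
MANUAL_STEPS_HOURS = (2.0, 4.0)         # manual work before the restore can begin
REBUILD_AND_RESTORE_HOURS = (0.5, 4.0)  # catalog rebuild plus file restore


def total_range(*ranges):
    """Add up (low, high) ranges element-wise."""
    low = sum(r[0] for r in ranges)
    high = sum(r[1] for r in ranges)
    return low, high


low, high = total_range(MANUAL_STEPS_HOURS, REBUILD_AND_RESTORE_HOURS)
print(f"Estimated manual recovery: {low} to {high} hours")
# Estimated manual recovery: 2.5 to 8.0 hours
```

    On these numbers, a manual recovery takes roughly 2.5 to 8 hours, and possibly more.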
    Automated recovery

    The benefit of DR is its ability to automatically recreate hard drive partitions and perform a full system recovery of the operating system, applications and data. This alone could shave 2 to 4 hours off a typical manual recovery process. There are two parts to preparing for DR: first, make a full backup of your system exactly as you would like it to be restored in the event of a disaster; second, create the appropriate boot media. This full backup and the boot media (bootable disks, a bootable CD-ROM image or a bootable tape device) are used together to perform a complete restoration. DR is designed to be as automatic as possible during both the preparation and recovery phases, so that once installed, DR will perform its tasks without user intervention.

    To be protected, you must perform full backups, either as part of a regularly scheduled backup plan or as a snapshot that is performed off-schedule, or both. Additionally, a full backup should be performed each time there is a significant change in data on the system. A new bootable disk set or CD-ROM should also be created any time there is a hardware change or a change in operating system.
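
    These rules can be written down as a small checklist function. This is a minimal sketch, assuming the four triggers named above are tracked as simple yes/no flags; the `SystemChanges` class, its field names and the task strings are hypothetical and not part of any DR product.

```python
from dataclasses import dataclass


@dataclass
class SystemChanges:
    """Hypothetical flags describing what has changed since DR was last prepared."""
    scheduled_backup_due: bool = False
    significant_data_change: bool = False
    hardware_changed: bool = False
    os_changed: bool = False


def dr_tasks(changes: SystemChanges) -> list[str]:
    """Return the preparation tasks that the rules in the text call for."""
    tasks = []
    # A full backup is needed on schedule and after any significant data change.
    if changes.scheduled_backup_due or changes.significant_data_change:
        tasks.append("run full backup")
    # New boot media is needed after a hardware or operating system change.
    if changes.hardware_changed or changes.os_changed:
        tasks.append("create new boot media")
    return tasks


print(dr_tasks(SystemChanges(hardware_changed=True)))
# ['create new boot media']
```

    Keeping the two checks separate mirrors the text: data changes drive the backup, while hardware and OS changes drive the boot media.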

    DR solutions are only as effective as the media rotation schedule that is put in place. If tapes are not being rotated regularly and stored in secure locations then your data is still at risk and no DR solution will be effective.
    Setting expectations

    DR solutions take data restoration to the next level by providing a comprehensive, easy-to-use solution that works independently across multiple platforms and operating systems. The restore component of DR, as well as the creation of DR media and boot disks, needs to be simple and automated; otherwise this critical task could easily be forgotten or overlooked. By saving IT administrators the hassles and complexities of learning different recovery strategies for each platform and OS deployed throughout the network, an easy-to-use, multi-platform solution allows them to be more productive and better focused on data management.

    Since hardware and device support varies by platform and OS, DR solutions must be robust enough to offer multiple recovery methods that may include bootable tape devices, bootable CD-R/CD-RW or bootable floppy diskettes. Support for all leading tape device manufacturers should also be maintained independently of the OS to offer maximum flexibility during recovery.

    Disaster recovery should not be limited to servers; it must also extend to protect desktop and workgroup environments. It is highly likely that some of the most critical data within your company does not reside on a file or application server. Rather, it is distributed across the hard drives of the desktop and laptop computers used daily by employees and executives. While practically all server-based backup solutions on the market can back up desktop clients remotely, very few offer the combination of affordable DR, local tape device support, a common user interface (UI), intelligent wizards and robust features to the desktop and workstation environment.
    Limitations

    Some DR solutions have very specialized functionality and, for performance or security reasons, may require that each protected system have a tape device attached to it. Additional licenses for the DR product may also be required. Other solutions may allow network-based recovery of a remote system with backup data archived on disk instead of tape; however, complexities may exist when enabling network connectivity on a bare system. Compared to tape, disk solutions do not offer the same levels of reliability, portability or scalability.

    Since not all operating systems support plug and play, DR operations should always be performed on the same computer after replacing the faulty hardware that caused the system failure. Most DR solutions assume that major changes to your hardware have not occurred. With a few exceptions, the hardware to which you are restoring data must be nearly identical to the source system.

    Be cautious of so-called DR solutions that do not fully restore the base OS. These DR solutions attempt a scripted reinstall of the OS and then attempt to restore just the critical data. These types of solutions offer slower restore times, may require manual intervention, and have a tendency to break down when using advanced hardware that needs additional drivers or service packs, or that was not supported by the OS out of the box.

    Several cloning solutions exist and are targeted toward the desktop OS market. These products effectively allow the creation of a point-in-time image or snapshot of a system that can be stored on a hard drive or network volume. They are traditionally used to clone an operating system as a method of deploying a standard desktop image onto multiple systems. While this does allow for quick recovery of a stock system, it is not feasible for daily data protection tasks, nor does this type of solution work well for large application servers.

    Disaster Recovery is an effective solution for restoring a computer system after catastrophic loss. When a computer fails, recovery time is crucial. Loading and configuring an operating system and re-installing software can be very time-consuming. When DR is properly executed, a full restoration of the OS, hard drive partitions, applications and data can be achieved quickly and easily. The key to painless disaster recovery is truly in the preparation.

    A reliable backup is truly something that every organization hopes to have in place but secretly wishes it will never need to use. The same can be said for disaster recovery, although many organizations do not place enough importance on verifying the validity of their data or their ability to recover it from a large-scale data loss.