[adr-devel] Amanda Disaster Recovery Scripts.


From: Alex Owen
Subject: [adr-devel] Amanda Disaster Recovery Scripts.
Date: Wed, 1 Sep 2004 11:43:07 +0100 (BST)

(PS: is anyone on this list interested in taking this idea forward?)

Hello,
I have set up an Amanda backup and disaster recovery system.
It works as follows.

BACKUP
======
Use OS features to take a snapshot of each filesystem at five minutes to
midnight every day via a cron job, and mount the snapshots read-only under
/snapshot/. (Solaris: fssnap; Debian/Linux: LVM)
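
A minimal sketch of what the LVM variant of this snapshot job could look
like. The volume group vg00 and volume home are taken from the usage
example at the end of this mail; the 1G snapshot size, the "_snap" suffix,
the run helper and the DRY_RUN variable are illustrative assumptions, not
the actual cron job.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_fs VG LV MOUNTPOINT [SIZE]
# Create an LVM snapshot of /dev/VG/LV and mount it read-only under
# /snapshot. SIZE is the copy-on-write space for the snapshot (a guess).
snapshot_fs() {
    vg=$1 lv=$2 mnt=$3 size=${4:-1G}
    run lvcreate -s -L "$size" -n "${lv}_snap" "/dev/$vg/$lv" || return 1
    run mkdir -p "/snapshot$mnt" || return 1
    run mount -o ro "/dev/$vg/${lv}_snap" "/snapshot$mnt"
}

# drop_snapshot VG LV MOUNTPOINT
# Unmount and remove yesterday's snapshot before taking a new one.
drop_snapshot() {
    run umount "/snapshot$3"
    run lvremove -f "/dev/$1/${2}_snap"
}
```

With DRY_RUN=1 set, "snapshot_fs vg00 home /home" only prints the three
commands it would run, which is a cheap way to check the paths before
putting the job into cron.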

I also have a cron job that runs at, say, ten minutes to midnight. It
collects important metadata (i.e. partition layout etc.) and saves it under
/etc/diskinfo/
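
A rough sketch of such a metadata collector for the Linux side, assuming
sfdisk is available for dumping partition tables (on Solaris something like
prtvtoc would take its place). The function name, the file names and the
choice of what to record are hypothetical.

```shell
# collect_diskinfo [DEST] [DISK...]
# Save filesystem and partition layout under DEST (default /etc/diskinfo).
collect_diskinfo() {
    dest=${1:-/etc/diskinfo}
    [ $# -gt 0 ] && shift
    mkdir -p "$dest" || return 1

    # Mounted filesystems and their sizes, in portable POSIX format.
    df -P > "$dest/df.txt" 2>/dev/null

    # One partition table dump per disk, in a form that
    # "sfdisk DISK < file" can replay during recovery.
    for disk in "$@"; do
        if command -v sfdisk >/dev/null 2>&1 && [ -r "$disk" ]; then
            sfdisk -d "$disk" > "$dest/$(basename "$disk").sfdisk"
        fi
    done
    return 0
}

# Example (as root, from cron):
#   collect_diskinfo /etc/diskinfo /dev/sda
```

Because the files land under /etc, they are picked up by the root
filesystem backup without any extra Amanda configuration.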

Have an Amanda server back up these /snapshot directories using GNU tar.

RECOVERY
========
As I leverage the network for backup, I also use it for disaster recovery.
The Amanda server is also a DHCP, TFTP, NFS and rshd server, configured to
communicate only with its clients (i.e. the servers I want to back
up/restore).

I net-boot the failed server from the Amanda server and restore from tape
directly onto disk. There is a Sun "blueprint" which shows you how to
modify a Solaris installation netboot image to be a generic read-only NFS
root image. I have also adapted this idea for Debian/Linux.

So I netboot the failed server into the disaster recovery environment, which
has the normal Amanda tools and any custom ones (i.e. ADR). I can then
amrestore some metadata from the root partition backup on tape to
/tmp/etc/diskinfo and /tmp/etc/fstab, and use this meta-info
manually (or in future automatically) to repartition the drives.

Guided by the meta-info I have already restored, I can then communicate
with the Amanda server via rsh to do:
"amadmin Daily info host ^/snapshot/<partition>$"
to find the tapes to restore.

I can then rsh to the Amanda server to get the tape robot to find the tape
and restore the multiple levels of dump (tar). Then I can mount the next
partition and do the same... and the next and the next.

Then we have to do something about making the machine bootable: on Linux
install LILO or GRUB; on Solaris fix up the boot sector. These utilities can
be system-specific and stored on the target system (i.e. on the backup
tape),
e.g. for Linux: "chroot /mnt/ /sbin/lilo"

At this stage a reboot from disk brings the server back to a working
state.

Amanda server recovery:
=======================

I have done most if not all of the preceding steps. This step I have
planned but have not yet tested.

My Amanda server is a Debian Linux server. I can netboot Debian, Solaris
and IRIX (miniroot) from this server. There is no reason why any Unix-like
operating system that can boot via DHCP/bootp->tftp->NFSroot could not be
persuaded to boot/run from this Debian server.

Debian has a package, bootcd, which allows me to make a copy of my Debian
server onto a bootable live CD. I can therefore boot this CD and use it
to restore the latest Amanda server image from tape.

The Amanda database/config will be about one run out of date, so I expect
that we would need to "amadmin import" an "amadmin export" file. Currently
I e-mail these off the Amanda server at the end of the run. I guess I
should append this info to the end of that day's Amanda tape.
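
The export/import round trip could be wrapped roughly as below. This is a
sketch only: the function names and the AMADMIN override variable are
hypothetical, and it relies on "amadmin <config> export" writing to
standard output and "amadmin <config> import" reading from standard input.

```shell
# Path to amadmin; overridable so the functions can be tried out safely.
AMADMIN=${AMADMIN:-/usr/sbin/amadmin}

# export_amanda_db CONFIG FILE
# Dump the Amanda database for CONFIG (e.g. Daily) into FILE.
export_amanda_db() {
    "$AMADMIN" "$1" export > "$2"
}

# import_amanda_db CONFIG FILE
# Load a previously exported database back into CONFIG.
import_amanda_db() {
    "$AMADMIN" "$1" import < "$2"
}

# Example, at the end of a run and after a server rebuild:
#   export_amanda_db Daily /var/tmp/Daily.export
#   import_amanda_db Daily /var/tmp/Daily.export
```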

----------------------------
Here are a couple of bash functions that I am developing to help speed up
the manual process... Perhaps they could grow to form the start of
ADR-0.0.1 ?

do_snap_restore should be extended to take the tape and file number info
found using find_snap_tape. It should then ensure the tape is loaded and
"mt asf" to the correct file on tape.

find_snap_tape should be changed to use "amadmin info" rather than
"amadmin find" as at present!

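find_snap_tape(){ # $1=fs, e.g. /home
    # List every dump of /snapshot<fs> for failed.server, sorted by date
    # then level. The \\\$ reaches amadmin as a literal $ regex anchor.
    rsh amanda.server -l backup /usr/sbin/amadmin Daily find --sort dl \
        failed.server "/snapshot${1}"\\\$
}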

do_snap_restore(){ # $1=fs_to_restore $2=raw_device
    mount "${2}" "/mnt${1}" || return 1
    # tar extracts into the current directory, so stop if cd fails
    cd "/mnt${1}" || return 1
    rmdir lost+found
    echo "restoring $2 from backup of $1 at $(date)"
    fs="$1"
    # for the root filesystem the pattern becomes /snapshot$
    [ "$1" = "/" ] && fs=""
    #restore it from tape...
    #first make sure correct tape is in drive and rewound
    echo "starting restore at $(date)"
    rsh amanda.server -l backup /usr/sbin/amrestore -p /dev/tape \
        failed.server "/snapshot${fs}"\$ | tar -xf -
    echo "rewinding tape at $(date)"
    rsh amanda.server -l backup /bin/mt rewind
}
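
The "mt asf" positioning step mentioned above might be added along these
lines. This is a sketch under assumptions: the function name, the run helper
and DRY_RUN are hypothetical, and /dev/tape on the Amanda server is assumed
to be a non-rewinding device (otherwise the position is lost as soon as mt
closes the device).

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# seek_tape_file FILENO
# Position the tape on the Amanda server at absolute file FILENO,
# i.e. the file number reported by find_snap_tape.
seek_tape_file() {
    case $1 in
        ''|*[!0-9]*)
            echo "seek_tape_file: bad file number: $1" >&2
            return 1
            ;;
    esac
    run rsh amanda.server -l backup /bin/mt -f /dev/tape asf "$1"
}
```

Calling "seek_tape_file 9" before the amrestore line would save amrestore
from reading through the earlier files on the tape to find the dump.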


Usage example:
===============

find_snap_tape /home | tail
#2004-08-08 failed.server /snapshot/home  1 UXD069         33 OK
#2004-08-09 failed.server /snapshot/home  1 UXD070         37 OK
#2004-08-10 failed.server /snapshot/home  0 UXD071         18 OK
#2004-08-11 failed.server /snapshot/home  1 UXD072         18 OK
#2004-08-12 failed.server /snapshot/home  1 UXD073         33 OK
#2004-08-13 failed.server /snapshot/home  1 UXD074         36 OK
#2004-08-14 failed.server /snapshot/home  1 UXD075         31 OK
#2004-08-15 failed.server /snapshot/home  1 UXD076         29 OK
#2004-08-16 failed.server /snapshot/home  1 UXD077         38 OK
#2004-08-17 failed.server /snapshot/home  0 UXD078          9 OK
### So /home is file 9 on tape  UXD078 at level 0

do_snap_restore /home /dev/vg00/home
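
Picking the dumps to restore from the find_snap_tape listing can be done
mechanically: start from the newest dump and work backwards, each time
taking the most recent earlier dump with a lower level, until a level 0 is
reached. A minimal awk sketch, assuming the seven-column layout shown above
(date, host, disk, level, tape, file, status) sorted oldest first; the
function name restore_chain is hypothetical.

```shell
# restore_chain
# Read "amadmin find" style lines on stdin and print, oldest first,
# "date level tape file" for each dump needed for a full restore.
restore_chain() {
    awk '
        # Remember only successful dumps; this also skips any header line.
        $7 == "OK" {
            n++
            date[n] = $1; lvl[n] = $4 + 0; tape[n] = $5; file[n] = $6
        }
        END {
            want = -1      # level of the dump chosen most recently
            k = 0
            for (i = n; i >= 1; i--) {
                if (want < 0 || lvl[i] < want) {
                    pick[++k] = i
                    want = lvl[i]
                    if (want == 0) break
                }
            }
            # Picked newest first, so print in reverse to get restore order.
            for (j = k; j >= 1; j--) {
                i = pick[j]
                print date[i], lvl[i], tape[i], file[i]
            }
        }'
}

# Example:
#   find_snap_tape /home | restore_chain
```

For the listing above this prints only the level 0 on UXD078, file 9. Had
the failure happened a day earlier, it would print the level 0 from UXD071
followed by the level 1 from UXD077.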






