On 18 March 2013 19:25, seth vidal <skvidal(a)fedoraproject.org> wrote:
> After a super-fun-time debacle restoring a single file today I'd like
> to talk about our backups a bit.
>
> Right now our backups are:
> - bacula to a few central servers and then off to tape.
>
> That seems like it is not scaling super-duper well for our size of disk
> storage. It also seems like it is a wee bit cumbersome to use. :)
>
> In the best of all possible infinite-money worlds I'd love to have
> enough disk space to offer multiple snapshots of every filesystem
> and/or a complete disk-to-disk copy with deduping (obnam) or with
> reverse diffs (rdiff-backup). But let's assume that world is not likely
> to exist and figure a few things out:
>
> 1. where are we backing up that we don't need to?
We currently back up the following systems:
ask01
bastion01
bastion02
collab02
db-fas01
db01
db04
db05
fas01
hosted-lists01
hosted02
lockbox01
log02
noc01
people03
pkgs01
proxy01
proxy02
releng03
releng04
relepel01
Most of those are quick backups, but a couple of them are long, slow
ones. Looking at that list, it seems more likely that there are things
we need to back up that we aren't than that we are backing up stuff we
shouldn't.
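One way to check that would be to compare the backed-up list against a full host inventory. A minimal sketch, assuming both are available as plain lists of hostnames; the inventory source is hypothetical, and "db02" below is an invented example host:

```python
def compare_backup_coverage(inventory, backed_up):
    """Return (hosts in inventory with no backup, backed-up hosts not in inventory)."""
    inv = set(inventory)
    bak = set(backed_up)
    missing = sorted(inv - bak)   # candidates we may need to start backing up
    stale = sorted(bak - inv)     # candidates we may be backing up needlessly
    return missing, stale


if __name__ == "__main__":
    # Hypothetical inventory; only some names come from the list above.
    inventory = ["db01", "db02", "log02", "noc01"]
    backed_up = ["db01", "log02", "noc01", "people03"]
    missing, stale = compare_backup_coverage(inventory, backed_up)
    print("not backed up:", missing)
    print("backed up but not in inventory:", stale)
```

Anything in the first list still needs a human decision; a host can legitimately have nothing worth backing up.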
> 2. are there places that we can backup that really would benefit from
> being a warmer-backup always available in a filesystem somewhere
I would say that it would be quite useful for lots of things. If
anything, I would love to have a backup system that backs stuff up to a
disk tree per box and then backs that set of disks up to tape, versus
just going to disks. It would be easier for us to do disaster recovery
by dumping those disks to multiple sites (though it means more dealing
with encrypted disks and such).
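A rough sketch of what a disk tree per box could look like, using rsync with --link-dest rather than bacula, obnam or rdiff-backup. The backup root, the source path and the dated directory layout are all assumptions for illustration, and the function only builds the command without running it:

```python
import os


def rsync_snapshot_cmd(host, backup_root, stamp, previous=None, src="/srv/"):
    """Build an rsync command that copies `src` on `host` into a dated tree.

    Layout (hypothetical): <backup_root>/<host>/<stamp>/
    """
    dest = os.path.join(backup_root, host, stamp)
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Files unchanged since the previous snapshot become hard links,
        # so each dated tree looks complete but costs little extra disk.
        cmd.append("--link-dest=" + os.path.join(backup_root, host, previous))
    cmd += ["{}:{}".format(host, src), dest + "/"]
    return cmd


if __name__ == "__main__":
    print(" ".join(rsync_snapshot_cmd(
        "db01", "/backups", "2013-03-19", previous="2013-03-18")))
```

The tape job would then only need to back up the one tree under the backup root, and restoring a single file becomes a plain copy out of the right dated directory.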
> 3. Is there any good way to couple snapshots with our tape system to
> make our backups a little simpler to deal with?
I have seen a couple of methods, but they aren't snapshots like LVM and
such. (I found LVM snapshots were more painful to deal with on backups,
but I think that was mainly because of slow, slow disks.)
> 4. What level of bare metal-disaster-recovery do we actually HAVE with
> our existing system and have we ever tested any of those cases?
We have tested bare metal a couple of times. Tested in the sense that
one of the boxes we had backed up died and we needed to restore
stuff that was on it.
> I do not know when I will get the time to put into fixing any of these
> things up - but after today it is clearly on my list of things to think
> about.
>
> -sv
>
> _______________________________________________
> infrastructure mailing list
> infrastructure(a)lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/infrastructure
--
Stephen J Smoogen.
"Don't derail a useful feature for the 99% because you're not in it."
Linus Torvalds
"Years ago my mother used to say to me,... Elwood, you must be oh
so smart or oh so pleasant. Well, for years I was smart. I
recommend pleasant. You may quote me." —James Stewart as Elwood P. Dowd