Calamity backups are targeted at restoring a complete filesystem in
the event of major failure. These backups need to be able to restore
a complete filesystem to a recent state, quite possibly on different
hardware. We currently take a complete CPIO archive of each
filesystem every 8 days or so, and rely on incremental backups to fill
in the gaps.
The main problems with this are that we repeatedly (every 8 days)
back up lots of data that we have already backed up, and that
traversing the filesystem in logical order (by directory, and then by
file) is unlikely to access the disk in physical order, so
there will be lots of seeking to gather the various parts of
directories together.
In a past life we used to take calamity backups by simply dumping the
underlying block device to tape. This meant better disc throughput
(which, given the speed of disc drives at the time, was a good thing),
but it meant that we needed a spare partition at least the size of the
largest active partition for restoring from these backups (sometimes
we do need to restore selected files from calamity backups). This
also meant that we were dumping a lot of blocks that did not contain
live data.
What would be nice is something in between the two: to be able to dump
largely in the order that data is on disc, but still to use
information from the filesystem to avoid dumping too much dead data,
and to arrange the dumped data so that piecemeal restoration is possible.
The structure of an LFS makes such an in-between position possible by
focusing on segments.
- If we dump a whole segment at a time, then we will be accessing
the device largely in device order, so throughput should
be high.
- If we access the segment usage information and only back up the
segments that have live data, then we will avoid backing up a lot of
dead data.
- If we examine the segment age information and back up the
segments in reverse chronological order, then we will always have
indexing information on tape before the data it indexes, so a single
pass through the tape will be enough to extract a particular subset
of the stored files. As the age information does not have complete
granularity (old segments are grouped together with a single age, so
that age can be stored in 16 bits), we will need to read the segment
headers to get the final ordering, but this should not increase the
overall access time too much.
- Finally, given the age information stored about segments, it is very
easy to determine which old segments have been dumped in a previous
Calamity dump and so do not need to be dumped every week thereafter.
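
The four points above can be put together as a rough sketch of how a
dump plan might be chosen. This is an illustration only, not the
filesystem's actual code: the Segment record, its field names, the
convention that a larger coarse_age means an older segment, and the
write_seq value (standing in for whatever ordering information the
segment header holds) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """Hypothetical view of one segment; every field name is an assumption."""
    number: int       # position of the segment on the device
    live_blocks: int  # live-data count from the segment usage information
    coarse_age: int   # coarse 16-bit age; larger is assumed to mean older
    write_seq: int    # precise write order from the segment header; larger = newer


def plan_dump(segments: List[Segment],
              last_dump_seq: Optional[int] = None) -> List[Segment]:
    """Return the segments to dump, newest first.

    Segments with no live data are skipped. If last_dump_seq is given,
    segments written at or before that point are assumed to be on an
    earlier Calamity dump already and are skipped as well.
    """
    candidates = [
        s for s in segments
        if s.live_blocks > 0
        and (last_dump_seq is None or s.write_seq > last_dump_seq)
    ]
    # The coarse age gives the rough order; the header value breaks ties
    # between segments that share an age.
    return sorted(candidates, key=lambda s: (s.coarse_age, -s.write_seq))


def read_runs(plan: List[Segment]) -> List[List[int]]:
    """Group the plan into runs of consecutive segment numbers.

    Each run can be read from the device without seeking, which is one
    way to see how close the plan stays to device order.
    """
    runs: List[List[int]] = []
    for seg in plan:
        if runs and abs(seg.number - runs[-1][-1]) == 1:
            runs[-1].append(seg.number)
        else:
            runs.append([seg.number])
    return runs
```

Calling plan_dump(segments) plans a complete dump, while passing the
write_seq of the newest segment on the previous dump as last_dump_seq
plans one that skips segments already on tape. A real implementation
would also need to decide how often old segments are re-dumped, which
this sketch leaves out.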
Combining these ideas, we can produce a Calamity backup scheme that is
reasonably efficient at creating backups, but exposes some complexity
when it comes to restoring them. This is probably a reasonable
trade-off, as restorations happen much less often than backups are
made.
Neil Brown
2003-02-06