IFS Backups Proposal
April 2006
Current Backups for IFS
For Access to Backed-Up Files
Self-Serve Via Oldfiles
The most recent snapshot is available to users online via Oldfiles. This gives users access to a copy of their home (or group) directory that was made within roughly the last 2 days. Exactly when a snapshot is made depends on which volume it is on and how much data is on that volume at the time (volumes with more data take longer to back up).
Operator Restores
At any one time, an IFS operator has access to
- 6 weeks of incremental backups. As soon as one incremental backup of a particular volume is finished, a new one begins.
- 2-6 months of full backups. A new full backup of each IFS volume is done every 2 weeks. At the beginning of a new month (defined as 4 weeks for the purpose of backups), the full backups from the previous two months are kept, and the ones from the month before that are considered available for overwriting.
- 4 semesters worth of semester snapshots. One snapshot per semester is kept for up to 4 semesters.
Backup Details
Full backups
- Are done once every two weeks for each volume of IFS individually.
- Are kept for 2-6 months. Plus, at any given time, 4 semester snapshots are on file (based on 3 semesters per year).
- It takes about 3 days to complete a full backup of all the volumes on a single server.
Incremental backups
- Are done whenever full backups are not being done.
- As soon as one incremental backup of a volume is completed, another is begun.
- The snapshots/clones are kept for 6 weeks. Once the full backup on which an incremental is based is no longer available, the snapshot is no longer usable.
- It takes 1-2 days to complete an incremental backup of a volume.
Note: A user can access the last snapshot/clone of his/her home and group directories through the Oldfiles mechanism. Restoring from previous incremental backups requires an IFS operator.
Note: Restoration of a volume can only be done by an IFS operator.
Proposed Backups for IFS: Disaster Recovery
For disaster recovery, we intend to use a new feature of AFS called shadows. A shadow is a copy of a volume that is kept on a separate machine and is not visible to users. The first time a shadow is made, it is a complete copy of a volume. When changes are made to the original, the changes do not show up in the shadow. Periodically, we will refresh the shadows so that changes made to the original are copied to the shadow.
If we should ever have a disastrous failure of one or more IFS servers, the shadows would be converted to production volumes, and campus access to IFS could resume in a matter of hours. The only data lost would be that which was changed in the original volumes but had not yet been refreshed onto the shadows.
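The refresh-and-fail-over behavior described above can be illustrated with a toy model. This is only a sketch: the class and method names are hypothetical, a volume is reduced to an in-memory dictionary, and the refresh copies everything rather than only the changes. It does not use or represent any real AFS command.

```python
from typing import Dict, Set


class ShadowedVolume:
    """Toy model of a production volume and its shadow (hypothetical names)."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.live = dict(files)
        self.shadow = dict(files)  # the first shadow is a complete copy

    def write(self, path: str, data: str) -> None:
        # Changes to the original do not show up in the shadow.
        self.live[path] = data

    def refresh(self) -> None:
        # Simplification: copy the whole volume instead of only the changes.
        self.shadow = dict(self.live)

    def unrefreshed_changes(self) -> Set[str]:
        # Paths whose content differs between the original and the shadow.
        paths = self.live.keys() | self.shadow.keys()
        return {p for p in paths if self.live.get(p) != self.shadow.get(p)}

    def fail_over(self) -> Set[str]:
        # Disaster: the shadow becomes the production volume.
        lost = self.unrefreshed_changes()
        self.live = dict(self.shadow)
        return lost


vol = ShadowedVolume({"a.txt": "v1"})
vol.write("a.txt", "v2")
vol.refresh()               # "a.txt" is now safe in the shadow
vol.write("b.txt", "new")   # made after the last refresh
print(sorted(vol.fail_over()))  # ['b.txt']
```

In this model the only data lost is what was written after the most recent refresh, so the refresh interval bounds the loss window.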
Proposed Backups for IFS: Data Recovery (Option 1)
Self-Serve Via Oldfiles
The most recent snapshot would be available to users online via Oldfiles. Note that every backup is a two-step process. First, a snapshot is made. This snapshot is made available to users of that volume through Oldfiles. Second, a backup is run of the snapshot.
The goal is to provide snapshots once every 24 hours, so users would have access to a copy of their data from within the last 24 hours. The time of day that a snapshot is made depends on the volume of which the snapshot is being made. Snapshots of all volumes would be completed in every 24-hour period.
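The two-step process can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the volume name "user.jdoe" are hypothetical, and a volume is modeled as a dictionary of file contents rather than anything IFS-specific.

```python
from types import MappingProxyType
from typing import Dict, Mapping


def take_snapshot(live: Dict[str, str]) -> Mapping[str, str]:
    # Step 1: a read-only copy of the volume as it is right now.
    return MappingProxyType(dict(live))


def run_backup(snapshot: Mapping[str, str]) -> Dict[str, str]:
    # Step 2: the backup reads from the snapshot, never from the live volume.
    return dict(snapshot)


def backup_volume(live: Dict[str, str],
                  oldfiles: Dict[str, Mapping[str, str]],
                  volume_name: str) -> Dict[str, str]:
    snapshot = take_snapshot(live)
    oldfiles[volume_name] = snapshot  # users see this copy through Oldfiles
    return run_backup(snapshot)


live = {"notes.txt": "monday"}
oldfiles: Dict[str, Mapping[str, str]] = {}
backup = backup_volume(live, oldfiles, "user.jdoe")
live["notes.txt"] = "tuesday"  # later edits do not alter the snapshot
print(oldfiles["user.jdoe"]["notes.txt"])  # monday
```

Because the backup is taken from the frozen snapshot, users can keep changing the live volume while the backup runs.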
Data Restoration Via Operator Restores
At any one time, an IFS operator would have access to
- 2 full backups (sometimes 3, depending on where we are in the backup cycles)
- 8 weeks of weekly incremental backups
- 14 days of daily incremental backups
Backup Cycles for a Single IFS Volume
Below is a walk-through of how the backup cycles would proceed for a single volume. Note that IFS is too large for us to back up all volumes on the same day. The backup cycles for different volumes therefore begin on different days. That means, for example, that Day 1 for one volume might be on a Monday, while Day 1 for a different volume might be on a Tuesday. At any one time, multiple backup cycles are happening simultaneously across multiple volumes.
On Day 1 (of Week 1), the first day of the backup cycles, a full backup is done.
On Day 2 (of Week 1), an incremental backup is made of all changes since the full backup was done. This is a first-level incremental backup.
On Day 3 (of Week 1), a new incremental backup is made, again referring back to the full backup. This incremental backup covers all changes since the full backup was made on Day 1.
On Day 4 (of Week 1), a new incremental backup is made of all the changes since the full backup, and so on through Day 7.
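Day 8 (of Week 2) is the first day of the second week. A new incremental backup is made of all the changes since the full backup. This is the Week 2 backup; like the daily backups of Week 1, it is a first-level incremental backup.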
On Days 9-14 (of Week 2), the daily incremental backups are made off of the Week 2 backup. These are second-level incremental backups. For these backups to be readable, both the Week 2 backup, as well as the full backup (on which that weekly backup is based), must remain available.
That completes the first 14-day cycle of daily backups.
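As an illustration of why the backups that an incremental refers to must remain available, the sketch below lists which backups an operator would need to read to restore the volume as of a given day in this first 14-day cycle. The function name is hypothetical, and the sketch covers only Days 1-14 as described above.

```python
from typing import List


def restore_chain(day: int) -> List[int]:
    """Days whose backups must be read, oldest first, to restore to `day`."""
    if not 1 <= day <= 14:
        raise ValueError("this sketch covers only the first 14-day cycle")
    chain = [1]            # Day 1: the full backup
    if day >= 8:
        chain.append(8)    # Day 8: the Week 2 backup (first-level incremental)
    if day not in (1, 8):
        chain.append(day)  # the daily incremental for the requested day
    return chain


print(restore_chain(5))   # [1, 5]
print(restore_chain(11))  # [1, 8, 11]
```

The level of the last backup in the chain is the chain length minus one: Day 5 is a first-level incremental, while Day 11 is a second-level incremental that needs both the Week 2 backup and the full backup.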
The next day begins both a new 14-day cycle of daily backups and Week 3, so it is Day 1 (of Week 3). A new incremental backup, the Week 3 backup, is made of all the changes since the full backup.
On Day 2 (of Week 3), an incremental backup is made of changes since the Week 3 backup. This backup overwrites the previous Day 2 backup (from Week 1).
On Days 3-7 (of Week 3), the daily incremental backups are made off of the Week 3 backup. The Day 3 backup of Week 3 overwrites the disc for the Day 3 backup of Week 1, and so on, until the Day 7 (of Week 3) backup overwrites the Day 7 (of Week 1) backup.
For Days 9-14 (of Week 4), this pattern continues. Daily backups are made based on changes since the Week 4 backup, just as the Days 9-14 backups of Week 2 were based on the Week 2 backup. Each daily backup overwrites the previous daily backup for the same day of the 14-day cycle. This graphic shows the backups that are available for data recovery on Day 14 (of Week 4).
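The overwrite rule for daily backups can be sketched as a slot calculation. This is an illustrative model, not the backup system's actual bookkeeping: the function names are hypothetical, days are counted from the first full backup of the volume, and the first day of every week is assumed to hold a full or weekly backup rather than a daily one.

```python
from typing import Dict


def daily_slot(day: int) -> int:
    """Position (1-14) in the 14-day cycle for an absolute day number."""
    return (day - 1) % 14 + 1


def live_dailies(today: int) -> Dict[int, int]:
    """Map each daily slot to the absolute day whose backup it holds."""
    held: Dict[int, int] = {}
    for day in range(1, today + 1):
        if (day - 1) % 7 == 0:
            continue  # first day of a week: full or weekly backup
        held[daily_slot(day)] = day  # overwrites the older backup in the slot
    return held


# Day 14 (of Week 4) is absolute day 28.
for slot, day in sorted(live_dailies(28).items()):
    print(f"Day {slot} slot holds the backup from absolute day {day}")
```

On absolute day 28, the slots for Days 2-7 hold the Week 3 dailies (absolute days 16-21) and the slots for Days 9-14 hold the Week 4 dailies (absolute days 23-28); the Week 1 and Week 2 dailies have been overwritten.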
At this point, it has been four weeks since the last full backup, so it is time for another full backup. On Day 1 (of Week 5), Full backup 2 is made. Daily backups for Week 5 are incremental backups of changes made since this new full backup. Here are the backups available for data recovery as of the last daily backup for Week 5, that is, Day 7 (of Week 5).
The pattern continues: the weekly backup for Week 6 is an incremental backup based on Full backup 2, the weekly backup for Week 7 is also an incremental backup based on Full backup 2, and so on.
That completes the eight-week cycle of weekly backups. The next week is therefore again referred to as Week 1.
A third full backup is made at the beginning of this new Week 1, four weeks after Full backup 2. After that, weekly backups begin to be overwritten. For example, the new Week 2 weekly backup overwrites the previous Week 2 backup. (Note that the new Week 1 backup does not overwrite the previous Week 1 backup because the previous Week 1 backup was a full backup. Full backup 1 must be retained if the Weeks 2-4 weekly incremental backups are to remain readable.)
When all the incremental backups that refer to Full backup 1 have been overwritten, the disc for that backup can be overwritten.
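The rule for when a full backup's disc can be overwritten can be checked with a small simulation. This is a sketch under assumptions, not the actual scheduler: it assumes a full backup on the first day of every fourth week, weekly incrementals stored in slots numbered by week within the eight-week cycle, daily incrementals based on the backup made on the first day of their week, and the newest full backup never being released. All names are hypothetical.

```python
from typing import Dict, List, Tuple

WEEKS_PER_FULL = 4   # assumed: a full backup every four weeks
WEEKLY_SLOTS = 8     # eight-week cycle of weekly backups
DAILY_SLOTS = 14     # 14-day cycle of daily backups


def full_release_days(last_day: int) -> Dict[int, int]:
    """Simulate days 1..last_day for one volume.

    Returns {day a full backup was made: first day its disc is free}.
    """
    slots: Dict[Tuple[str, int], Tuple[int, int]] = {}  # slot -> (made, base)
    fulls: List[int] = []          # days of full backups still retained
    released: Dict[int, int] = {}
    for day in range(1, last_day + 1):
        week = (day - 1) // 7 + 1
        week_start = (week - 1) * 7 + 1
        if day == week_start and (week - 1) % WEEKS_PER_FULL == 0:
            fulls.append(day)  # a full backup goes on its own disc
        elif day == week_start:
            slot = ("weekly", (week - 1) % WEEKLY_SLOTS + 1)
            slots[slot] = (day, fulls[-1])   # based on the latest full
        else:
            slot = ("daily", (day - 1) % DAILY_SLOTS + 1)
            slots[slot] = (day, week_start)  # based on this week's first backup
        # Weeklies outlive the dailies based on them, so direct bases suffice.
        needed = {base for _made, base in slots.values()}
        for full_day in fulls[:-1]:          # never release the newest full
            if full_day not in needed:
                released[full_day] = day
        fulls = [f for f in fulls if f not in released]
    return released


print(full_release_days(120))  # {1: 78, 29: 106}
```

Under these assumptions, Full backup 1 (made on day 1) becomes free on day 78, the day the new Week 4 weekly backup overwrites the last incremental that referred to it. Until then three full backups are held, which is consistent with the "2 full backups (sometimes 3)" figure given for Option 1.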
Proposed Backups for IFS: Data Recovery (Option 2)
Option 2 is the same as Option 1, except that we would guarantee access to three full backups at all times (instead of two).
Definitions
- Snapshot. Also called a clone. A read-only copy. When an incremental backup is done, the first step is to make a read-only snapshot/clone of the data. The incremental backup is then made from the snapshot. (Otherwise, you'd be trying to back up a moving target.) The snapshot itself is used for Oldfiles.
- Keeping a backup. When we say "keep a backup," we mean keep the tape or disc it is stored on "live and spinning" so that the data on it can be accessed and restored if need be.
- Full backups. A complete backup of everything on IFS (all volumes).
- Volume. IFS breaks its file space into collections called volumes. When we do backups, restores, and so forth, we always operate on an entire volume.
- Incremental backups. A backup of everything on IFS that has changed since some earlier backup was done. The earlier backup may be a full or an incremental. An incremental backup made from a full backup is a first-level incremental. An incremental backup made from a first-level backup is a second-level incremental.
This page last verified April 11, 2006