ZFS – Backup

All my data currently lives on a fairly trustworthy raidz pool, but that does not protect it against theft, for example. For increased safety, I chose to keep an external backup of the pool on another hard drive.

Disk preparation

First, connect the backup disk to the computer, and make sure the device node points to the correct drive:

# smartctl -i /dev/ad2
 smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.3-RELEASE-p3 amd64] (local build)
 Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
 Model Family: Seagate Barracuda Green (Adv. Format)
 Device Model: ST2000DL003-9VT166
 Serial Number: xxxxxxx
 LU WWN Device Id: 5 000c50 044b0747a
 Firmware Version: CC3C
 User Capacity: 2,000,398,934,016 bytes [2.00 TB]
 Sector Sizes: 512 bytes logical, 4096 bytes physical
 Device is: In smartctl database [for details use: -P show]
 ATA Version is: 8
 ATA Standard is: ATA-8-ACS revision 4
 Local Time is: Wed Jun 20 19:04:15 2012 CEST
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

Then create a new pool on this drive, with no particular options:

# zpool create backup /dev/ad2
# zpool status backup
  pool: backup
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        backup      ONLINE       0     0     0
          ad2       ONLINE       0     0     0

errors: No known data errors

Next, the datasets are created. The copies=2 option [1] is used for the “important” datasets in order to protect them against hypothetical bit rot [2] (at the cost of doubling the disk space they use).

# zfs create backup/fichiers
# zfs create -o checksum=sha256 -o copies=2 backup/fichiers/Photos

Less critical datasets can be created in the usual way:

# zfs create -o checksum=sha256 backup/fichiers/Videos

Backup

For my backups, I chose the zfs send command [3] built into ZFS. This feature is mainly used for replicating data between hosts, and it is not recommended for backups stored as files on external media, for a good reason: if a single block of the stream is corrupted, the whole backup becomes unusable. This is why the copies=2 option mentioned above must not be forgotten.

These backups are made from existing snapshots and encrypted with openssl as follows:

# zfs send data/fichiers/Photos@062012 | openssl enc -aes-256-cbc -salt > /backup/fichiers/Photos/Photos_062012.ssl

And for “compressible” data:

# zfs send data/fichiers/Documents@062012 | compress | openssl enc -aes-256-cbc -salt > /backup/fichiers/Documents/Documents_062012.z.ssl
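
The two commands above differ only in the compression step and in the file name, so they could be wrapped in a small script. The following is a minimal sketch, not the method actually used here: the function names archive_path and backup_snapshot and the SRC_ROOT and DST_ROOT variables are hypothetical, and the path layout simply mirrors the two examples above.

```shell
#!/bin/sh
# Hypothetical helper: SRC_ROOT and DST_ROOT are assumptions that match
# the pool name and mountpoint used in the commands above.
SRC_ROOT="data"
DST_ROOT="/backup"

# archive_path <dataset@snapshot> [z]
# Print the archive file name for a snapshot; pass "z" for compressed data.
archive_path() {
    dataset=${1%@*}               # data/fichiers/Photos
    snap=${1#*@}                  # 062012
    rel=${dataset#"$SRC_ROOT"/}   # fichiers/Photos
    name=${dataset##*/}           # Photos
    suffix="ssl"
    if [ "$2" = "z" ]; then
        suffix="z.ssl"
    fi
    printf '%s/%s/%s_%s.%s\n' "$DST_ROOT" "$rel" "$name" "$snap" "$suffix"
}

# backup_snapshot <dataset@snapshot> [z]
# Stream the snapshot, optionally compress it, encrypt it, and store it.
# openssl asks for the passphrase interactively, as in the commands above.
backup_snapshot() {
    out=$(archive_path "$@")
    if [ "$2" = "z" ]; then
        zfs send "$1" | compress | openssl enc -aes-256-cbc -salt > "$out"
    else
        zfs send "$1" | openssl enc -aes-256-cbc -salt > "$out"
    fi
}
```

For example, backup_snapshot data/fichiers/Documents@062012 z would write to /backup/fichiers/Documents/Documents_062012.z.ssl. Since plain sh only reports the exit status of the last command of a pipeline, a failing zfs send would go unnoticed by this sketch.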

Once the backup is finished, the pool can be exported:

# zpool export backup

Backup validation

To make sure the backup remains usable, the drive must be checked from time to time, for example from the live CD mode built into FreeBSD 9.0 (in this mode, the keymap can be changed with the kbdmap command).

A global check (zpool scrub) is possible once the pool has been imported. All pools available for import can be listed with the zpool import command, and an alternate root for the mountpoints can be set with the -R option, since / is read-only when the live CD is used:

# zpool import
# mkdir /tmp/zfsroot
# zpool import -R /tmp/zfsroot <pool>
# zpool scrub <pool>
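
zpool scrub returns immediately and the scrub itself runs in the background, so its result has to be read from zpool status once it has finished. A minimal sketch of a wait loop follows. The function names are hypothetical, and it assumes that zpool status reports a running scrub with the words "scrub in progress", which may differ between ZFS versions.

```shell
#!/bin/sh
# scrub_in_progress: read `zpool status` output on stdin and succeed
# if it mentions a running scrub (the wording is an assumption).
scrub_in_progress() {
    grep -q 'scrub in progress'
}

# wait_for_scrub <pool>
# Poll the pool once a minute until the scrub is over, then print the
# detailed status, including any files affected by errors.
wait_for_scrub() {
    pool=$1
    while zpool status "$pool" | scrub_in_progress; do
        sleep 60
    done
    zpool status -v "$pool"
}
```

It would be called right after the scrub is started, for example with wait_for_scrub backup.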

To check the integrity of the backups at the file level, the pool can be imported read-only and each stream checked with zstreamdump [4]:

# zpool import -o readonly=on -R /tmp/zfsroot <pool>
# openssl enc -d -aes-256-cbc -in /tmp/zfsroot/path/to/file.ssl | zstreamdump

If zstreamdump reports the checksum without complaint, the backup data is valid. Otherwise, the command prints the following error:

Expected checksum differs from checksum in stream.
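
With several archives on the drive, the check can be repeated for each file. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, archives ending in .z.ssl are assumed to need an uncompress step between openssl and zstreamdump, and a failure is detected solely by looking for the error message quoted above, so other problems (wrong passphrase, truncated file) would need separate handling.

```shell
#!/bin/sh
# decode <file>
# Print the decrypted stream; *.z.ssl archives are also uncompressed.
decode() {
    case "$1" in
        *.z.ssl) openssl enc -d -aes-256-cbc -in "$1" | uncompress -c ;;
        *)       openssl enc -d -aes-256-cbc -in "$1" ;;
    esac
}

# no_mismatch: read zstreamdump output on stdin and fail if it contains
# the checksum error message.
no_mismatch() {
    ! grep -q 'Expected checksum differs from checksum in stream'
}

# check_archives <file>...
# Run every archive through zstreamdump and print one line per file.
check_archives() {
    for f in "$@"; do
        if decode "$f" | zstreamdump 2>&1 | no_mismatch; then
            echo "OK   $f"
        else
            echo "FAIL $f"
        fi
    done
}
```

A call such as check_archives /tmp/zfsroot/path/to/file.ssl prompts for the passphrase once per file and prints OK or FAIL for each archive.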


[1] : http://docs.oracle.com/cd/E19253-01/820-2315/gevpg/index.html
[2] : http://www.linux-mag.com/id/8794/
[3] : http://docs.oracle.com/cd/E19963-01/html/821-1448/gbchx.html
[4] : http://blog.richardelling.com/2009/10/check-integrity-of-zfs-send-streams.html