The Linux Btrfs filesystem is frequently compared to Sun’s ZFS: both allow users to pool storage devices together without relying on expensive hardware RAID cards, both offer snapshots and cloning, and both calculate and store checksums of all data by default.
Thanks to the checksum feature in ZFS, all data stored on the filesystem can be verified with one command: zpool scrub <poolname>. This reads all data on the filesystem and compares it to the stored checksums. If a problem is found and a redundant copy of the data is available, the filesystem will heal itself. If it can’t do that, it will still notify the user that a problem was found.
Though it is maturing rapidly, Btrfs does not yet have a similar scrub command. However, there are ways to accomplish much the same thing using simple shell scripting. I expect a true scrub command for Btrfs will come along in the near future; until then, this approach works adequately.
Just stick the following text in a file somewhere in your system path, like /usr/local/bin/btrfs-scrub, then make it executable with chmod +x /usr/local/bin/btrfs-scrub.
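#!/bin/bash
# Exit code used when the script is called with the wrong arguments
E_BADARGS=65
# Read every regular file under the given path and discard the output,
# which forces the filesystem to verify each file's checksums.
# -mount keeps find from descending into other mounted filesystems.
function scrub {
    find "$1" -mount -type f -exec cat '{}' + > /dev/null
}
if [ $# -ne 1 ]
then
    echo "Please give a filesystem path" >&2
    exit $E_BADARGS
fi
echo "Btrfs-scrub script"
echo "Checking $1, this may take a while"
scrub "$1"
echo "Scrub complete! Check dmesg for any possible errors"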
To use the script, execute it on a path like this:
btrfs-scrub /pool
The script takes a single argument, a filesystem path. It reads every file inside that path recursively and streams the contents to /dev/null. The end result is nearly identical to the ZFS scrub operation: every file is checked when read, and if an invalid checksum is found, the kernel will print a message in the kernel log, which can be viewed with the dmesg command.
Because we’re just reading files from a path, there isn’t anything Btrfs-specific about this script, but as far as scrubbing a disk is concerned, it is only useful on filesystems that can detect when data has been corrupted.
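Rather than reading the whole kernel log by eye, the dmesg check can be narrowed with grep. The following is only a sketch: the function name count_csum_errors is made up for this example, and it assumes that checksum failures show up in the log as lines containing "csum failed". The exact wording differs between kernel versions, so adjust the pattern to whatever your kernel prints.

```shell
#!/bin/bash
# Count log lines on stdin that look like Btrfs checksum failures.
# Assumption: such lines contain the text "csum failed"; change the
# pattern if your kernel words the message differently.
count_csum_errors() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the function's status at 0.
    grep -c -i 'csum failed' || true
}

# Usage (reading the kernel log may require root):
#   dmesg | count_csum_errors
```

A result of 0 only means that no line matched the pattern, so it is still worth glancing at dmesg directly the first time you run a scrub.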
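If you want the script to refuse to run on a filesystem where reading files proves nothing, a guard along these lines could be added. This is a hedged sketch, not part of the original script: the function names are invented for the example, the list of checksumming filesystems is an assumption you may want to extend, and fs_type_of relies on GNU stat's -f -c %T output.

```shell
#!/bin/bash
# Succeed if the given filesystem type name is one that checksums data.
# Assumption: only btrfs and zfs are listed; extend the list as needed.
is_checksumming_fs() {
    case "$1" in
        btrfs|zfs) return 0 ;;
        *)         return 1 ;;
    esac
}

# Print the type of the filesystem that holds the given path.
# Assumption: GNU coreutils stat, which reports the type via -f -c %T.
fs_type_of() {
    stat -f -c %T -- "$1"
}

# Usage:
#   if is_checksumming_fs "$(fs_type_of /pool)"; then
#       btrfs-scrub /pool
#   fi
```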
You may want to add a -mount flag in there, or something like that. (There must be a way to tell Linux’s find command to only search within the starting filesystem and not follow symbolic links out of the filesystem, or ignore filesystems that are mounted beneath the starting filesystem. If it’s not -mount, I’m sure there’s another option for it).
Indeed, the -mount flag does what you suggest. I’ve updated the post.
Thanks Mark!
You could also do this with a backup rsync or something like that, and kill two birds with one stone.
Horribly and sneakingly dangerous!
This does not scrub the directories and metadata at all, which means it’s not much better than doing nothing, since you can still just as easily lose everything.
BAReFOOt, I’m not sure what you mean. It does step into each directory and stream the contents of each file to /dev/null, which should cause the kernel to read the data and verify it against the checksums.
What were you expecting it to do?
What he means is that while the contents of the files themselves will be compared to their checksum, the filesystem structure itself – the directory structure and metadata – will not be, and thus you could still lose part or all of the filesystem map. Which would be very bad. Whether his assertion is true or not depends on details that I do not know. If he knew such details, he should have sketched out the technical implementation issues, at least briefly.