The Linux Btrfs filesystem is frequently compared to Sun’s ZFS: both allow users to pool storage devices together without relying on expensive hardware RAID cards, both offer snapshots and cloning, and both calculate and store checksums of all data by default.

Thanks to the checksum feature in ZFS, all data stored on the filesystem can be verified with one command: zpool scrub <poolname>. This reads all data on the filesystem and compares it to the stored checksums. If a problem is found and a redundant copy of the data is available, the filesystem will heal itself. If it can’t do that, it will still notify the user that a problem was found.

Though it is maturing rapidly, Btrfs does not yet have a similar scrub command. However, there are ways to accomplish much the same thing using simple shell scripting. I expect a true scrub command for Btrfs will come along in the near future; until then, this approach works adequately.

Just stick the following text in a file somewhere in your system path, like /usr/local/bin/btrfs-scrub, then make it executable with chmod +x /usr/local/bin/btrfs-scrub.


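#!/bin/bash

# Exit status used when the script is called with the wrong arguments.
E_BADARGS=65

# Read every regular file under the given path and discard the contents.
# -mount keeps find on the starting filesystem; the quotes protect paths
# that contain spaces.
function scrub {
  find "$1" -mount -type f -exec cat '{}' + > /dev/null
}

if [ $# -ne 1 ]
then
  echo "Please give a filesystem path"
  exit $E_BADARGS
fi

echo "Btrfs-scrub script"
echo "Checking $1, this may take a while"
scrub "$1"
echo "Scrub complete! Check dmesg for any possible errors"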

To use the script, execute it on a path like this:

btrfs-scrub /pool

The script takes a single argument, a filesystem path. It reads every file inside that path recursively and streams the contents to /dev/null. For file data, the end result is close to the ZFS scrub operation: every file is checked as it is read, and if an invalid checksum is found, the kernel prints a message in the kernel log, which can be viewed with the dmesg command.
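
Scanning the whole dmesg output by eye gets tedious after a long run, so a small filter can help. The following is only a sketch: filter_csum_errors is a name made up for this example, and the pattern assumes the kernel reports Btrfs checksum problems with the words "csum failed", which may vary between kernel versions.

```shell
#!/bin/bash

# Hypothetical helper: read kernel log text on stdin and keep only the
# lines that look like Btrfs checksum failures.
# Assumption: such lines contain "btrfs" followed by "csum failed"
# (matched case-insensitively); adjust the pattern for your kernel.
filter_csum_errors() {
  grep -i 'btrfs.*csum failed'
}

# Typical use after a scrub run (dmesg may require root on some systems):
#   dmesg | filter_csum_errors
```

If the filter prints nothing, no lines in the log matched the pattern. That is a good sign, but not proof, since the assumed wording may not match what your kernel prints.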

Because we’re just reading files from a path, there isn’t anything Btrfs-specific about this script. As far as scrubbing a disk is concerned, though, it is only useful on filesystems that checksum their data and can therefore tell when it has been corrupted.
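
The same walk can also report which files failed to read, rather than leaving that to the kernel log alone. This is a minimal sketch, assuming the filesystem returns a read error when it cannot supply verified data (as Btrfs generally does when a checksum fails and no good copy is available). report_unreadable is a hypothetical name, not part of the script above.

```shell
#!/bin/bash

# Hypothetical helper: read every regular file under a path and print the
# name of each file that could not be read completely.
report_unreadable() {
  local root="$1"
  # -mount keeps find on the starting filesystem; -print0 together with
  # read -d '' copes with spaces and newlines in file names.
  find "$root" -mount -type f -print0 |
    while IFS= read -r -d '' file; do
      # cat exits non-zero if any read fails (for example on an I/O error).
      if ! cat -- "$file" > /dev/null 2>&1; then
        printf 'read failed: %s\n' "$file"
      fi
    done
}

# Example (the path is a placeholder):
#   report_unreadable /pool
```

A file listed here failed for some reason, not necessarily a checksum mismatch; permission problems look the same, so the kernel log is still the place to confirm the cause.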