Back in time - or: zpool import -T

A disclaimer first: I found an undocumented option of a Solaris 11.1 command. Again: this is an undocumented option. So it can break everything. It will probably break everything. Because it’s undocumented, it’s officially not there. And it stays undocumented, as my blog isn’t the documentation of any Oracle product; it’s just a report about a discovery I made. The option that’s officially not there can disappear at any time and without further notice. The world may end when you use it. A laser could rasterize you and suck you into the computer. At the very least, when you use it wrong, you can create a lot of havoc with it.

Sometimes you find interesting stuff just by googling and trying things out. Yesterday I was searching for some information about ZFS on FreeBSD (one of the OSes I keep in my OS zoo), and in the course of it I found a mail on a FreeBSD list talking about the command zpool import -T. -T? There is no -T option on zpool import? Or is it just in FreeBSD? I couldn’t believe that a feature that interesting would exist in ZFS on FreeBSD but not in Solaris, so I booted up my Solaris 11.1 in VirtualBox. In the end I found out how to import a zpool at a state from the past, based on the transaction group, without using snapshots.
Normally you get an “invalid option” message when you try an unknown option:

# zpool import -P
invalid option 'P'
For more info, run: zpool help import

However trying out -T yielded a different result:

# zpool import -T
missing argument for 'T' option
For more info, run: zpool help import

OOOOOKAAAAAY, there is a parameter -T. But hey, it’s not in the man page, it’s not in the usage message of zpool, it’s not documented. You had my curiosity. But now you have my attention. There is indeed a -T option for the import subcommand of zpool.

# zpool import -T fsadflsajkdf
invalid txg value
For more info, run: zpool help import

Like the command I found on the mailing list, it wants a transaction group number as its parameter. Okay, let’s play with it. First I needed a test pool, as I didn’t want to try this on a live filesystem.

# zpool create -f testpool c7t2d0

Now I’m creating a number of test files:

# dd if=/dev/urandom of=/testpool/1 bs=1024 count=1024
1024+0 records in
1024+0 records out
# dd if=/dev/urandom of=/testpool/2 bs=1024 count=1024
1024+0 records in
1024+0 records out
# dd if=/dev/urandom of=/testpool/3 bs=1024 count=1024
1024+0 records in
1024+0 records out
# dd if=/dev/urandom of=/testpool/4 bs=1024 count=1024
1024+0 records in
1024+0 records out

Just to visibly check that the files contain the same data and are still accessible, I’m creating some checksums. I don’t need them to prove correctness: the checksums in ZFS ensure that a file is either correct or not accessible at all.

# md5sum /testpool/*
9b4bc067fffe04096122f959945a43b2  /testpool/1
55d978522708b4a124c120748b98ecfd  /testpool/2
b9d7c75f64031cb02ea8df684ae8e400  /testpool/3
45c1645cdf1bbb6b916400020ff638fc  /testpool/4

The wildcard in the command delivers 4 files. Okay, the -T option wants a transaction group number. Where do I get it? For my test I just dump the current transaction group number to the screen:

# zdb testpool | grep "txg = "
        txg = 21
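
Grepping through the whole zdb dump works, but it’s a bit of a shotgun approach. Two leaner ways to find a txg number - under the assumption that these options behave on Solaris 11.1 the way I know them from other ZFS platforms - are to dump just the uberblock, which carries the currently active txg, or to look at the internal pool history, whose entries are annotated with txg numbers:

# zdb -u testpool | grep "txg = "
# zpool history -il testpool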

Now I will delete all the data in the filesystem.

# rm /testpool/*

And now I will create a new file and print its checksum to the screen.

# dd if=/dev/urandom of=/testpool/5 bs=1024 count=1024
1024+0 records in
1024+0 records out
# md5sum /testpool/*
b668c3565c5cbc01fa4be6b143fcc3d1  /testpool/5

As you have surely recognised, the wildcard got us just 1 file this time. Okay, now I will try this strange option: I’m importing the pool read-only at the state of transaction group 21.

# zpool import -T 21 -o readonly=on testpool

Let’s check the content of the filesystem.

# md5sum /testpool/*
md5sum: /testpool/1: I/O error
55d978522708b4a124c120748b98ecfd  /testpool/2
b9d7c75f64031cb02ea8df684ae8e400  /testpool/3
45c1645cdf1bbb6b916400020ff638fc  /testpool/4

That’s interesting. You now have 4 files in the filesystem, but the first one isn’t readable anymore. I assume parts of it have been overwritten by /testpool/5: as soon as you delete a file and it isn’t part of a snapshot, the space it occupied is freed for reuse. When something has been overwritten, the checksums in the tree don’t match any longer, so the system denies access. But the other 3 files are still readable. This matches the output of zpool status -xv:

root@template:~# zpool status -xv
  pool: testpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://support.oracle.com/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        testpool  ONLINE       0     0     1
          c7t2d0  ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

        /testpool/1

Let’s now try some other txg numbers:

# zpool export testpool; zpool import -T 20 -o readonly=on testpool; md5sum /testpool/*
md5sum: /testpool/1: I/O error
55d978522708b4a124c120748b98ecfd  /testpool/2
b9d7c75f64031cb02ea8df684ae8e400  /testpool/3
# zpool export testpool; zpool import -T 19 -o readonly=on testpool; md5sum /testpool/*
md5sum: /testpool/1: I/O error
55d978522708b4a124c120748b98ecfd  /testpool/2
# zpool export testpool; zpool import -T 18 -o readonly=on testpool; md5sum /testpool/*
md5sum: /testpool/1: I/O error

Ah … I can scroll back through time … perhaps a really nice way to show customers the transactional behaviour of ZFS with a simple demonstration (I’ve sketched a small script for such a demo at the end of this post). As we’ve imported the pool read-only, nothing has changed, so /testpool/5 is still readable when we go back to the newest state.

# zpool export testpool
# zpool import  testpool
# md5sum /testpool/*
b668c3565c5cbc01fa4be6b143fcc3d1  /testpool/5
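
And since I mentioned showing this to customers: here is the minimal sketch of a demo script I had in mind. It just walks backwards through the last few transaction groups of a test pool and lists what each state contained. The pool name, the number of steps, and the assumption that the last few txgs are still importable (older ones may already be gone) are all mine - treat it as a sketch, not a polished tool, and certainly don’t point it at a production pool.

#!/bin/sh
# Sketch: walk backwards through the last few transaction groups of a
# disposable test pool and show what each state contained.
POOL=testpool
STEPS=5

# current transaction group, taken from the uberblock
CUR=$(zdb -u $POOL | awk '/txg = / { print $3 }')

TXG=$CUR
while [ $TXG -gt $((CUR - STEPS)) ]; do
        # the export may complain if the previous import failed; ignore it
        zpool export $POOL 2>/dev/null
        # import read-only at the given transaction group; this may simply
        # fail for txgs that are no longer reachable
        if zpool import -o readonly=on -T $TXG $POOL 2>/dev/null; then
                echo "=== txg $TXG ==="
                ls /$POOL
        fi
        TXG=$((TXG - 1))
done

# back to the present
zpool export $POOL 2>/dev/null
zpool import $POOL

The readonly=on in the import is the important part: it’s what keeps the whole trip through time from changing anything in the pool.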