You don't need zfs resize ... and a workaround when you need one ;)
Okay, the title is provocative, and of course ZFS needs a tool to resize a filesystem, but for other reasons. Somehow some people think that you can’t resize a ZFS filesystem. You can’t shrink a zpool, but you can increase its size during operation without much hassle just by doing the obvious.
I’ve heard a few comments at customer sites that this feature is somehow missing in Solaris. But whenever you ask a little bit more about the need, you find out that only a few requests are about reducing the size of the disks; most are about increasing the size of an existing RAIDZ pool. And often there is a bit of a lack of knowledge.
In the light of a missing resize command many people think that the only way is a RAID0 of RAIDZ1/2/3 vdevs. But you have to think about this problem in a more Mac-like way. Free your mind, think about what you are really doing, and most of the time that is the correct way with ZFS.
While writing this article I even had an idea how to fake a pool shrink/restructuring feature, because this is the feature that is really missing: reducing the number of vdevs or making them smaller. This is a hack, but it looks like it works reasonably well. Nevertheless it has some caveats.
A warning
As usual, a warning when you work on the way your data is stored on disks: Don’t try this with your production data (or production-equivalent data like the favorite TV show recordings of your significant other made by the digital video recorder). Having a working backup of your data is a best practice whenever you are making major changes to the structure in which data is stored on rotating rust.
Increasing the zpool size by doing the obvious
What do you do when you want to increase the size of a filesystem? You swap one disk after the other for a bigger one. Sounds obvious. Let’s do this with ZFS. I will create some files to use as demo disks. At first we create our set of small disks:
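A minimal sketch of this step, assuming four demo files of 128 MB each in a hypothetical directory `/testdisks` (the number, size and location of the files are my assumptions). The small `run` helper only prints the commands unless you set `DRY_RUN=0`, which fits the warning above.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

DISKDIR=/testdisks   # hypothetical directory holding the demo files

create_small_disks() {
    run mkdir -p "$DISKDIR"
    for i in 1 2 3 4; do
        # mkfile creates a file of the given size; ZFS wants at least 64 MB per vdev
        run mkfile 128m "$DISKDIR/smalldisk$i"
    done
}

create_small_disks
```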
Now we create a RAIDZ from them:
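This could look roughly like the following. The pool name `testpool` and the file paths are assumptions for illustration, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

POOL=testpool        # hypothetical pool name
DISKDIR=/testdisks   # hypothetical directory holding the demo files

create_raidz_pool() {
    # File vdevs have to be given as absolute paths
    run zpool create "$POOL" raidz \
        "$DISKDIR/smalldisk1" "$DISKDIR/smalldisk2" \
        "$DISKDIR/smalldisk3" "$DISKDIR/smalldisk4"
    # Show the size of the freshly created pool
    run zpool list "$POOL"
}

create_raidz_pool
```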
In preparation for our task we set a property on our pool, in case we didn’t do that earlier:
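The property in question is `autoexpand`. A short sketch, again with the assumed pool name `testpool` and printing the commands only unless `DRY_RUN=0` is set:

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

POOL=testpool   # hypothetical pool name

enable_autoexpand() {
    # Let the pool grow on its own once all devices of a vdev are bigger
    run zpool set autoexpand=on "$POOL"
    # Check the current value of the property
    run zpool get autoexpand "$POOL"
}

enable_autoexpand
```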
Now we replace all disks in the pool with the bigger ones, one after the other. It’s important that you wait until each disk has completed its resilvering before you swap the next one. You can check this via zpool status:
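A sketch of the replacement loop for the first three disks, assuming bigger demo files of 256 MB and the hypothetical names used so far. The wait loop simply polls `zpool status` for the string "resilver in progress"; it is skipped in the default dry-run mode.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

POOL=testpool        # hypothetical pool name
DISKDIR=/testdisks   # hypothetical directory holding the demo files

wait_for_resilver() {
    [ "${DRY_RUN:-1}" = 1 ] && return 0
    # Poll until zpool status no longer reports a running resilver
    while zpool status "$POOL" | grep -q 'resilver in progress'; do
        sleep 5
    done
}

replace_first_disks() {
    for i in 1 2 3; do
        run mkfile 256m "$DISKDIR/bigdisk$i"
        run zpool replace "$POOL" "$DISKDIR/smalldisk$i" "$DISKDIR/bigdisk$i"
        wait_for_resilver
    done
    # The size should still be the old one: one small disk is left
    run zpool list "$POOL"
}

replace_first_disks
```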
Now let’s swap the last disk:
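For the last disk it is the same command once more, followed by a look at the pool size. Names and sizes are the same assumptions as before, and nothing is executed unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

POOL=testpool        # hypothetical pool name
DISKDIR=/testdisks   # hypothetical directory holding the demo files

replace_last_disk() {
    run mkfile 256m "$DISKDIR/bigdisk4"
    run zpool replace "$POOL" "$DISKDIR/smalldisk4" "$DISKDIR/bigdisk4"
    # After the resilvering has finished, the pool should show the new size
    run zpool status "$POOL"
    run zpool list "$POOL"
}

replace_last_disk
```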
Tada … the pool jumped to the new size. It’s really that easy.
Coexistence
Of course there are use cases for resizing, for example when you want to make the pool smaller or when you want to reduce the number of vdevs. Or you just want to get the data off the expensive disks, but aren’t allowed to simply delete it because you need it again later. However, I see a number of workarounds.
First of all, file-based vdevs and physical vdevs can coexist in the same pool. Thus it’s feasible to migrate your pool into files in another pool or even on a UFS filesystem. It’s similar to the stuff above, so just a short demonstration without much explanation. I’m just substituting the virtual devices on ramdisks with virtual devices in files:
Okay, that works too. If you just want to get your data off some expensive disks in a RAIDZ (for example six 73 GB 15,000 rpm disks), you would use a filesystem on some cheap 1 TB disks in a RAID1 configuration and stop now. But there are cases where you want to change the structure or the size of your zpool.
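A sketch of such a substitution for the pool migrationtest. I assume here that it is a RAIDZ on five ramdisks named `rd1` to `rd5` and that the files live in a hypothetical directory `/migration` on some other filesystem; the device names, sizes and the directory are all assumptions. As before, commands are only printed unless `DRY_RUN=0` is set, and you have to wait for each resilvering to complete.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

POOL=migrationtest
FILEDIR=/migration   # hypothetical directory on another pool or a UFS filesystem

wait_for_resilver() {
    [ "${DRY_RUN:-1}" = 1 ] && return 0
    while zpool status "$POOL" | grep -q 'resilver in progress'; do
        sleep 5
    done
}

move_to_files() {
    for i in 1 2 3 4 5; do
        # The file has to be at least as big as the ramdisk it replaces
        run mkfile 128m "$FILEDIR/disk$i"
        run zpool replace "$POOL" "/dev/ramdisk/rd$i" "$FILEDIR/disk$i"
        wait_for_resilver
    done
    run zpool status "$POOL"
}

move_to_files
```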
A step further: Resizing/restructuring
While writing this article I wondered whether it’s possible to take this concept a little bit further, but in an obvious way. At first I’ve made the problem a little bit more complex by adding a filesystem to the pool.
To have some data in it, I’ve copied some files into it. As we freed the ramdisks a few moments ago, I will use them for something different. But I need a sixth device for it.
Now I use the six ramdisks to create a stripe out of three mirrors:
Okay, now I’m snapshotting all the datasets in the pool migrationtest recursively and sending them to the pool migrationtest_target I’ve just created.
Of course with real storage this would take quite some time. But you could take a number of snapshots, do incremental ZFS sends, and only start the interruption once a send/receive takes just a few seconds. Either way it’s important to know that we need a short application downtime, starting just before taking the last snapshot (or in our example the only snapshot) and ending with the successful renaming of the pools, to ensure that our data is consistent from the application’s perspective.
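These preparations could look like this. The dataset name `data`, the copied directory and the ramdisk name `rd6` with its size are assumptions for illustration; the commands are only printed unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

POOL=migrationtest
SAMPLE=/usr/share/man   # hypothetical source of some demo data

add_data_and_ramdisk() {
    # A filesystem in the pool, mounted at /migrationtest/data by default
    run zfs create "$POOL/data"
    run cp -r "$SAMPLE" "/$POOL/data/"
    # The sixth ramdisk; it shows up as /dev/ramdisk/rd6
    run ramdiskadm -a rd6 128m
}

add_data_and_ramdisk
```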
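A sketch of the pool creation, assuming the six ramdisks are named `rd1` to `rd6` (the names are my assumption). Nothing is executed unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

TARGET=migrationtest_target
RD=/dev/ramdisk   # ramdiskadm creates its devices here

create_target_pool() {
    # Three mirror vdevs; ZFS stripes the data across them
    run zpool create "$TARGET" \
        mirror "$RD/rd1" "$RD/rd2" \
        mirror "$RD/rd3" "$RD/rd4" \
        mirror "$RD/rd5" "$RD/rd6"
    run zpool status "$TARGET"
}

create_target_pool
```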
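One way to do this is a recursive snapshot followed by a replication stream. The snapshot name `migration1` is an assumption, and the sketch only prints the commands unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

SRC=migrationtest
DST=migrationtest_target
SNAP=migration1   # hypothetical snapshot name

initial_transfer() {
    # -r: snapshot the pool's root dataset and all datasets below it
    run zfs snapshot -r "$SRC@$SNAP"
    # -R: replication stream with all descendant datasets and their properties
    # -d: drop the source pool name from the dataset paths on the receiving side
    # -F: roll the target back if needed before receiving
    echo "+ zfs send -R $SRC@$SNAP | zfs receive -F -d $DST"
    [ "${DRY_RUN:-1}" = 1 ] || zfs send -R "$SRC@$SNAP" | zfs receive -F -d "$DST"
}

initial_transfer
```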
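The incremental variant could be sketched as follows, assuming an earlier snapshot `migration1` has already been transferred and `migration2` is the last one, taken after stopping the application (both names are hypothetical). Commands are only printed unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

SRC=migrationtest
DST=migrationtest_target
PREV=migration1   # hypothetical: snapshot that was already transferred
LAST=migration2   # hypothetical: final snapshot, taken during the downtime

final_incremental() {
    run zfs snapshot -r "$SRC@$LAST"
    # -i: send only the changes between the previous and the last snapshot
    echo "+ zfs send -R -i $SRC@$PREV $SRC@$LAST | zfs receive -F -d $DST"
    [ "${DRY_RUN:-1}" = 1 ] || zfs send -R -i "$SRC@$PREV" "$SRC@$LAST" | zfs receive -F -d "$DST"
}

final_incremental
```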
Okay, let’s have a short look into the datasets of the pool migrationtest_target:
Okay … looks similar to the stuff we’ve copied into the respective filesystems in the pool migrationtest.
Of course this work isn’t complete until we rename the pools. At first we export both pools:
Now we have to reimport them. As we used some strange devices, we have to give ZFS a hint where it should search for our devices. At first we reimport our pool migrationtest as migrationtest_source:
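Exporting is a single command per pool. A sketch that only prints the commands unless `DRY_RUN=0` is set:

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

export_pools() {
    # The application has to be stopped by now, the filesystems get unmounted
    run zpool export migrationtest
    run zpool export migrationtest_target
}

export_pools
```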
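The hint is the `-d` option of `zpool import`, which names a directory to search for devices or files. A sketch, assuming the file vdevs of the old pool live in the hypothetical directory `/migration`; nothing is executed unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

FILEDIR=/migration   # hypothetical directory with the file vdevs of the old pool

import_source() {
    # zpool import -d <dir> <old name> <new name> renames the pool while importing
    run zpool import -d "$FILEDIR" migrationtest migrationtest_source
}

import_source
```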
Now we reimport our pool migrationtest_target as our new pool migrationtest:
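The same again for the target pool, this time searching the ramdisk devices in `/dev/ramdisk`. The sketch only prints the commands unless `DRY_RUN=0` is set.

```shell
# Dry run by default: commands are only printed. Set DRY_RUN=0 to execute them.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = 1 ] || "$@"; }

import_target() {
    # The former target pool takes over the original pool name
    run zpool import -d /dev/ramdisk migrationtest_target migrationtest
    run zpool status migrationtest
}

import_target
```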
We just migrated our pool migrationtest from a RAIDZ to a RAID1+0 (a stripe of mirrors).
And when we look into the filesystem we see all our files:
Looks good. Same directory structure, same content. And due to the checksumming capabilities of ZFS all these steps are protected against bit rot while transmitting the data.
Caveats
Of course this idea has two caveats: First, you need the storage for the interim files, but you could use any filesystem that’s available on the system. Second, you need a short downtime for transmitting the last incremental snapshot and swapping the names of the pools.
Conclusion
Resizing and restructuring a zpool is possible with minimal service interruption. Even the holy grail of reducing the number of vdevs or making vdevs smaller is possible with a fairly minimal downtime. However, I should point out that most customers want the stuff I’ve described in the first part about the autoexpand feature and not the hack I’ve described in the second part. I don’t know where this idea was born that you can’t increase the size of a vdev … :)