How to remove a top-level vdev from a ZFS pool
Over the years I have given many, many presentations. Whenever I talked with customers afterwards about what they would like to see in ZFS, one feature was always mentioned: removing devices. While it was no problem to remove, for example, a member disk of a mirror, you couldn’t remove a top-level vdev; you weren’t able to remove a mirror out of a stripe of mirrors. With Solaris 11.4 we finally have a feature that allows you to do exactly this. It’s really easy to use, so if I only wanted to show this feature, this would be a rather short entry. However, I would like to shed some light on the mechanism behind it.
Preparing an example
Let’s assume we have three devices and have created a striped pool out of them.
We create some files in it:
Let’s now check the structure of the pool. For this I’m using the zdb -L command. The output is much longer than represented here.
We have 6 gigabytes worth of data on three devices, thus 2 gigabytes per device. Before you ask, I honestly don’t know why zdb -L shows no writes. I will check this. Now let’s remove one of the top-level vdevs.
Removing the device
The removal process is really simple to trigger via the remove subcommand of zpool:
The device you want to remove then goes into the REMOVING state.
After a while the device will disappear from the pool.
If the top-level vdev you want to remove is a mirror, you have to use the name of that top-level vdev. Let’s assume a pool consisting of two mirrors.
To remove the top-level vdev you have to address it by its name, in this case mirror-0.
Behind the curtain
So how was this done by Oracle Solaris? Well, this is quite simple. It doesn’t really reorganize the data. The pool still has three devices after the change. You just don’t see the third one. When you check with zdb -L testpool you will see that the third device has changed to
The third device has been replaced by a virtual device. This virtual device resides on the disks remaining in the pool. You can see it quite nicely in the output of zdb
There is still a third device in it with 2 G worth of data, but more interestingly, the remaining devices have now taken over the data, as indicated by the increased used column for both devices. As long as the data isn’t changed, it will stay on this virtual device. Please note that the system isn’t simply reserving the full size of the removed vdev on disk; it only takes the space needed for the data.
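To make the accounting concrete, here is a minimal Python sketch of the numbers from the example. It is a toy calculation, not how ZFS allocates space: the function name `remove_top_level_vdev`, the device names, and the even spread of the moved data over the remaining devices are all assumptions made for illustration.

```python
def remove_top_level_vdev(used_gb, victim):
    """Toy accounting for removing one top-level vdev from a striped pool.

    used_gb maps a (hypothetical) device name to its allocated gigabytes.
    Returns the usage of the remaining devices and the gigabytes that are
    still addressed through the virtual device. Spreading the data evenly
    is an assumption of this sketch, not a claim about the real allocator.
    """
    if victim not in used_gb:
        raise KeyError(victim)
    remaining = {dev: gb for dev, gb in used_gb.items() if dev != victim}
    if not remaining:
        raise ValueError("cannot remove the last device of a pool")
    moved = used_gb[victim]            # only the data moves, not the device size
    share = moved / len(remaining)
    after = {dev: gb + share for dev, gb in remaining.items()}
    return after, moved


before = {"disk0": 2.0, "disk1": 2.0, "disk2": 2.0}   # 6 GB over three devices
after, virtual_gb = remove_top_level_vdev(before, "disk2")
print(after)        # usage of the two remaining devices
print(virtual_gb)   # data still reached via the virtual device
```

In this sketch the pool still holds 6 GB in total; the 2 GB of the removed device simply show up as additional usage on the two remaining devices.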
Let’s now delete everything in the pool by issuing a rm /testpool/* command:
The consumption has been significantly reduced. Let’s now recreate our datafiles.
After this you will see the following in the zdb -L output.
The virtual device isn’t used for new writes. However, all reads for the removed disk are now serviced by the virtual device, which means, by proxy, by the remaining disks. So over time, as you change the data in your pool, the virtual device is used less and less until it isn’t used at all. Of course, when the data is static and you never change it, it won’t be migrated off the virtual device.
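The behaviour described above can be sketched with a small toy model in Python. This is only an illustration under simplifying assumptions: the class `ToyPool`, its methods, and the "emptiest device first" placement are hypothetical and say nothing about the actual on-disk structures. What it shows is the idea: old block addresses keep pointing at the removed device, a remapping table redirects them to the remaining disks, and new writes never land on the virtual device.

```python
class ToyPool:
    """Toy model: a removed device becomes a virtual device that only
    holds a remapping table onto the remaining real devices."""

    def __init__(self, devices):
        self.real = {dev: {} for dev in devices}  # device -> {offset: data}
        self.virtual = {}   # removed device -> {old offset: (device, offset)}
        self.files = {}     # file name -> (device, offset) as first written
        self._next_offset = 0

    def _allocate(self, data):
        # New allocations only go to real devices (emptiest one first).
        dev = min(self.real, key=lambda d: len(self.real[d]))
        offset = self._next_offset
        self._next_offset += 1
        self.real[dev][offset] = data
        return dev, offset

    def write(self, name, data):
        self.delete(name)                  # overwriting frees the old block
        self.files[name] = self._allocate(data)

    def read(self, name):
        dev, offset = self.files[name]
        if dev in self.virtual:            # redirect via the virtual device
            dev, offset = self.virtual[dev][offset]
        return self.real[dev][offset]

    def delete(self, name):
        if name not in self.files:
            return
        dev, offset = self.files.pop(name)
        if dev in self.virtual:            # freeing shrinks the remap table
            dev, offset = self.virtual[dev].pop(offset)
        del self.real[dev][offset]

    def remove_device(self, victim):
        if len(self.real) < 2:
            raise ValueError("cannot remove the last device of a pool")
        blocks = self.real.pop(victim)
        # Copy the data to the remaining devices and remember where it went.
        self.virtual[victim] = {
            offset: self._allocate(data) for offset, data in blocks.items()
        }


pool = ToyPool(["disk0", "disk1", "disk2"])
for i in range(6):
    pool.write(f"file{i}", f"data{i}")

pool.remove_device("disk2")
mapped_after_removal = len(pool.virtual["disk2"])
old_read = pool.read("file2")              # served via the remapping table

for i in range(6):
    pool.write(f"file{i}", f"new{i}")      # rewrite everything
mapped_after_rewrite = len(pool.virtual["disk2"])
print(mapped_after_removal, old_read, mapped_after_rewrite)
```

After the rewrite the remapping table in this model is empty, which mirrors the observation that the virtual device stops being used once the old data has been changed or deleted.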
When you add a new device, it won’t substitute the virtual device acting as the third device:
You will see a pool with four devices instead.
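As a rough illustration of why the count goes up to four, the following Python sketch treats the top-level vdevs as a list of slots. The function names and the `virtual(...)` label are made up for this example; the only point is that the removed device keeps its slot as a virtual placeholder and a newly added device gets a slot of its own.

```python
def remove_vdev(slots, name):
    """Replace the removed top-level vdev with a virtual placeholder in place."""
    index = slots.index(name)
    return slots[:index] + [f"virtual({name})"] + slots[index + 1:]


def add_vdev(slots, name):
    """A new device is appended; it never takes over a virtual slot."""
    return slots + [name]


slots = ["disk0", "disk1", "disk2"]    # hypothetical device names
slots = remove_vdev(slots, "disk2")
slots = add_vdev(slots, "disk3")
print(slots)  # ['disk0', 'disk1', 'virtual(disk2)', 'disk3']
```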
Conclusion
After quite some time ZFS finally has the ability to remove top-level vdevs. I think that will reduce the number of questions in presentations from now on.