New features of Solaris: Alternate boot environments based on snapshots
One of the limitations of OpenSolaris 2008.05 is the missing Live Upgrade. But … well … you get something better: the whole concept of Live Upgrade was carried forward by building on the capabilities of ZFS.
Using snapshots for boot environments
One of the nice features of ZFS is that you get snapshots for free. The reason lies in the copy-on-write nature of ZFS: you can freeze the filesystem simply by not freeing the old blocks. As new data is written to new blocks, you don’t even have to copy the old ones (in this sense the COW of ZFS is more like a ROW … redirect on write). ZFS boot lets the system work with such snapshots, because you can boot from one of them. You can establish multiple boot environments just by snapshotting the boot filesystems, cloning them and promoting the clones to real filesystems. These are features inherent to ZFS.
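To make the snapshot, clone and promote steps a bit more tangible, here is a minimal sketch with the plain zfs command. The dataset names are hypothetical, and the assumption that boot environments live under rpool/ROOT is mine. On a machine without ZFS the commands are only printed.

```shell
# Fall back to printing the command on machines without zfs.
command -v zfs >/dev/null 2>&1 || zfs() { echo "zfs $*"; }

# Hypothetical dataset names; adjust them to your pool layout.
src=rpool/ROOT/opensolaris
new=rpool/ROOT/opensolaris-test

zfs snapshot "$src@baseline"       # freeze the current state of the boot filesystem
zfs clone "$src@baseline" "$new"   # writable clone that shares blocks with the snapshot
zfs promote "$new"                 # make the clone independent of its origin
```

The boot environment tooling used in the example below takes care of these steps for you, so you rarely have to type them by hand.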
A practical example
A warning first: Don’t try this example without a backup of your system. Or use a test system or test VM. We will fsck up the system during this example. Okay….
I’ve updated my system, so I already have two boot environments on it:
This mirrors the actual state in your ZFS pools. You will find filesystems with corresponding names.
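Assuming the beadm utility that manages boot environments on OpenSolaris 2008.05, listing them could look like this sketch. The names and columns you see depend on your own system; on a machine without beadm the command is only printed.

```shell
# Fall back to printing the command on machines without beadm.
command -v beadm >/dev/null 2>&1 || beadm() { echo "beadm $*"; }

beadm list   # show all boot environments known to the system
```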
After doing some configuration, you can create a boot environment called opensolaris-baseline:
It’s really easy. You just have to create a new boot environment:
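A minimal sketch of this step, assuming the beadm utility. That opensolaris-work is created the same way as opensolaris-baseline is my assumption, based on the names used in the following steps. Without beadm the commands are only printed.

```shell
# Fall back to printing the command on machines without beadm.
command -v beadm >/dev/null 2>&1 || beadm() { echo "beadm $*"; }

beadm create opensolaris-baseline   # known-good state to fall back to
beadm create opensolaris-work       # environment for the daily work (assumed step)
```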
Now let’s look at the list of our boot environments.
Okay, now we activate the opensolaris-work boot environment:
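With beadm (again an assumption about the tool in use), the activation could be sketched like this. Without beadm the command is only printed.

```shell
# Fall back to printing the command on machines without beadm.
command -v beadm >/dev/null 2>&1 || beadm() { echo "beadm $*"; }

beadm activate opensolaris-work   # becomes the default at the next reboot
```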
Okay, let’s look at the list of boot environments again.
You will see that the opensolaris-1 boot environment is still active, but that opensolaris-work will be active at the next reboot. Okay, now reboot:
Okay, you see … the boot environment opensolaris-work is now active, and it stays activated for the following reboots (until you activate another boot environment).
Now we can reboot the system. GRUB comes up and defaults to the opensolaris-work environment. Please remember at which position you find opensolaris-baseline in the boot menu. You will need this position in a few moments. After a few seconds, you can log into the system and work with it.
Okay … now let’s drop the atomic bomb of administrative mishaps on your system. Log into your system, assume the root role and do the following stuff:
You know what happens. Depending on how fast you are able to interrupt this run, you get anything from a slightly damaged system up to a system fscked up beyond any recognition. Normally the system would send you to the tapes now. But remember: you have some alternate boot environments.
Reboot the system and wait for GRUB. You may get garbled output, so the GRUB menu may be hard to read. Choose opensolaris-baseline. The system will boot up quite normally.
You need a terminal window now. How you get such a terminal window depends on the incurred damage. The boot environment snapshots don’t cover the home directories, so you may have no home directory any longer. I will assume this for this example: You can get a terminal window by clicking on “Options”, then “Change Session” and choosing “Failsafe Terminal” there.
Okay, log in via the graphical login manager and an xterm will appear. First we delete the defunct boot environment:
Okay, now we clone the opensolaris-baseline environment to form a new opensolaris-work environment.
We reactivate the opensolaris-work boot environment:
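The three recovery steps (delete, clone, reactivate) can be sketched as follows, assuming the beadm utility. The sketch relies on the system being booted from opensolaris-baseline at this point, because a plain beadm create clones the currently running environment. Without beadm the commands are only printed.

```shell
# Fall back to printing the command on machines without beadm.
command -v beadm >/dev/null 2>&1 || beadm() { echo "beadm $*"; }

beadm destroy opensolaris-work    # throw away the damaged environment
beadm create opensolaris-work     # clone the running environment (opensolaris-baseline)
beadm activate opensolaris-work   # boot into the fresh clone next time
```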
Now check whether you still have a home directory for your user:
If your home directory doesn’t exist any longer, create a new one:
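Both steps, the check and the re-creation, fit into one small helper. This is only a sketch: the function name ensure_home is made up for this example, and the default location /export/home is an assumption about where home directories live on your system.

```shell
# Create a home directory for a user if it is missing.
# Arguments: user name, base directory (assumed to be /export/home by default).
ensure_home() {
  user=$1
  base=${2:-/export/home}
  if [ -d "$base/$user" ]; then
    echo "$base/$user exists"
  else
    # Recreate the directory and hand it back to its owner.
    mkdir -p "$base/$user" && chown "$user" "$base/$user"
  fi
}

# Usage with a hypothetical account name, run as root:
# ensure_home jdoe
```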
Reboot and wait a few moments while the system starts up. GRUB defaults to opensolaris-work and the system comes up normally, without any problems, in the condition it was in when you created the opensolaris-baseline boot environment.
Obviously you may have to recover your home directory and its data. It’s a best practice to take snapshots of these directories on a regular schedule, so you can simply promote a snapshot to the current version of the directory.
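A regular snapshot of the home directories could be sketched like this. The dataset name rpool/export/home is an assumption, and the commented-out rollback is a simpler stand-in for promoting a clone, not the same operation. Without zfs the commands are only printed.

```shell
# Fall back to printing the command on machines without zfs.
command -v zfs >/dev/null 2>&1 || zfs() { echo "zfs $*"; }

# Assumption: the home directories live in this dataset.
fs=rpool/export/home
snap="$fs@$(date +%Y%m%d)"   # one snapshot per day, named after the date

zfs snapshot "$snap"         # take today's snapshot, for example from cron
zfs list -t snapshot         # show the snapshots that exist
# zfs rollback "$snap"       # revert the dataset to its most recent snapshot
```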
Conclusion
You see, this is a really neat feature: recovering from a disaster in a minute or two. Snapshotting opens a completely new way to recover from errors. Unlike with Live Upgrade you don’t need extra disks or extra partitions, and as ZFS snapshots are really fast, creating alternate boot environments on ZFS is extremely fast as well. At the moment this feature is available on OpenSolaris 2008.05 only. But with future updates it will find its way into Solaris as well.