Separated ZIL on ramdisk.
I used ramdisks today in a live presentation of ZFS, L2ARC and sZIL (you can't use files for sZIL and L2ARC testing, you need a device). One of the customers asked me if there is a real use case for this configuration. My answer was: Not really. But this question haunted me the whole day while sitting in the train.
I had some weird ideas, like using a UPS-backed server to act as a sZIL device by providing access to a ramdisk via Fibre Channel (by using the FC target in COMSTAR), but I had no idea if this would really be a clever idea. A ramdisk-based sZIL? Okay, I know there are now several admins out there with a cardiac arrest. The problem of a ramdisk-based sZIL is the volatile nature of the ramdisk. When you lose the power on your systems, you still have a consistent pool, but you lose the transactions stored in the sZIL. This is a less than desirable effect. Between you and losing transactions is just the power switch ;) There was a missing piece… The good thing about a large community: Sometimes someone already had similar ideas long ago and had some time to do some experiments. So I found a really neat idea on the zfs-discuss mailing list, with an important extra thought compared to my "dozing in the train" idea. The idea of Chris Greer has a nice additional twist that makes the concept of using a ramdisk sZIL more sensible. He writes in "An slog experiment (my NAS can beat up your NAS)":
So I tried this experiment this week...
On each host (OpenSolaris 2008.05), I created an 8GB ramdisk with ramdiskadm. I shared this ramdisk on each host via the iscsi target and initiator over a 1GB crossconnect cable (jumbo frames enabled). I added these as mirrored slog devices in a zpool.
This is really a neat trick. You can mirror a sZIL device, so you can distribute the ramdisks over several systems. When one of the systems fails, you still have the other device with the same data. This configuration had an impressive effect on the performance:
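To make the setup a bit more concrete, here is a rough sketch of what Chris describes, on an OpenSolaris 2008.05 box with the legacy iscsitadm target. All device names, target names and IP addresses are assumptions for illustration, not taken from his post:

```shell
# --- On each of the two hosts: create an 8 GB ramdisk and export it via iSCSI ---
# Create the ramdisk (hypothetical name rd0)
ramdiskadm -a rd0 8g

# Export the ramdisk via the legacy iSCSI target daemon
# (target name "slogtarget" is an assumption)
iscsitadm create target -b /dev/ramdisk/rd0 slogtarget

# --- On the host that owns the pool: import both ramdisks via the iSCSI initiator ---
# (192.168.1.1 / 192.168.1.2 are hypothetical addresses on the crossconnect)
iscsiadm add discovery-address 192.168.1.1
iscsiadm add discovery-address 192.168.1.2
iscsiadm modify discovery --sendtargets enable

# --- Add the two imported ramdisks as a mirrored slog ---
# (the cXtYd0 names below are placeholders; check "format" for the real ones)
zpool add tank log mirror c2t600144F0...d0 c3t600144F0...d0
```

The crucial part is the last line: because the slog is a mirror across the two hosts, the pool can survive the loss of either ramdisk.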
The big thing here is I ended up getting a MASSIVE boost in performance even with the overhead of the 1GB link, and iSCSI. The iorate test I was using went from 3073 IOPS on 90% sequential writes to 23953 IOPS with the RAM slog added. The service time was also significantly better than the physical disk.
I think you can drive this concept even further by substituting the iSCSI over TCP over Gigabit Ethernet with iSER (iSCSI over RDMA over InfiniBand) and using separate small x86 systems for this task, each backed by a small UPS keeping the system up until a dd from the ramdisk to a hard disk has completed.
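The UPS-triggered flush could be as simple as the following sketch, hooked into the shutdown sequence the UPS monitoring software triggers on power loss. The device paths are assumptions:

```shell
# Hypothetical power-fail hook: dump the ramdisk-backed slog to stable storage
# before the UPS runs out. /dev/ramdisk/rd0 and the target disk are placeholders.
dd if=/dev/ramdisk/rd0 of=/dev/rdsk/c1t1d0s0 bs=1024k

# On the next boot, the saved image could be copied back into a fresh ramdisk
# before the pool is imported, so the preserved transactions are replayed.
```

Whether replaying a restored slog image like this is actually safe would need careful testing; the point is only that a UPS gives you the window to get the data off the volatile device.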
Chris, sounds like a really clever idea … :)