Disclaimer: The individual owning this blog works for Oracle in Germany. The opinions expressed here are his own, are not necessarily reviewed in advance by anyone but the individual author, and neither Oracle nor any other party necessarily agrees with them.
ZFS Deduplication features in PSARC
Tuesday, October 20. 2009
Comments
Is this on a block or file level? Block level (or chunks) works better for VM images but of course has more overhead. Is this asynchronous deduplication, or is it done on each write? Is it performed only on new files or also on updates?
Block level. And there is not much overhead, as the checksums are already there. I think we will see the exact features soon, as such PSARC cases are a good sign that some changes to the code are imminent.
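To illustrate why existing per-block checksums make block-level deduplication cheap, here is a minimal sketch in Python. It is not the ZFS implementation: the `DedupStore` class, the tiny `BLOCK_SIZE`, and the choice of SHA-256 are all assumptions made for illustration. The idea shown is only that a checksum computed anyway can double as the lookup key for "have I stored this block before?".

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


class DedupStore:
    """Toy block store that keeps each unique block once (illustrative only)."""

    def __init__(self):
        self.blocks = {}    # checksum -> block data
        self.refcount = {}  # checksum -> number of references to that block

    def write(self, data: bytes) -> list:
        """Split data into fixed-size blocks and return the list of checksums."""
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            key = hashlib.sha256(block).hexdigest()
            if key not in self.blocks:
                # First time this content is seen: store it.
                self.blocks[key] = block
                self.refcount[key] = 0
            # Known or new, the block gains one more reference.
            self.refcount[key] += 1
            pointers.append(key)
        return pointers

    def read(self, pointers: list) -> bytes:
        """Reassemble the data from its block checksums."""
        return b"".join(self.blocks[key] for key in pointers)


store = DedupStore()
ptrs = store.write(b"AAAABBBBAAAA")  # three logical blocks, two unique
```

In this toy, the duplicate `AAAA` block is stored once and referenced twice; the only extra work compared with a plain write is a dictionary lookup on a checksum that was computed anyway.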
PSARC - Platform Software Architecture Review Committee ... to put it in slightly simplified terms: the architectural steering committee for the OS ...
So it (PSARC working on this issue) means that this feature is already designed/implemented and will be included in forthcoming updates for the Storage 7000 firmware / Opensolaris Nevada distribution? Cool!
The PSARC doesn't work on an issue. Developers work on it. Before you can integrate it into the code base, you have to go through the PSARC, discussing the feature, the architecture and the possible impact on other parts of Solaris.
And yes, when a PSARC case is approved, you should see the feature sooner or later in Opensolaris (unless, of course, there are unforeseen implementation difficulties that prevent an implementation without harming the rest of the system). From my observation, features introduced by the S7000 people tend to show up in Solaris sooner rather than later.
I always thought the extra block should come sooner rather than later, as in: don't activate dedup until you have the third user of a block; then you know you will always have two good copies, provided the data is common enough to appear more than once.
1. It should be possible to yield such a behavior by configuring a really low dittolimit.
2. On the other hand, you already have RAID1, for example, with two versions of the blocks and with checksums to find the correct one. I think the situation is a little bit different than with other filesystems.
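The threshold idea from point 1 can be sketched as follows. This is only an illustration under an assumption: that a ditto limit means "once a block's reference count reaches this value, keep a second physical copy". The function name `copies_to_keep` and that exact rule are hypothetical and not taken from the PSARC case.

```python
def copies_to_keep(refcount: int, ditto_limit: int) -> int:
    """Return how many physical copies a deduplicated block should have.

    Assumed rule (hypothetical): one copy while the block has fewer
    references than ditto_limit, two copies once it reaches the limit.
    """
    if refcount < 1:
        raise ValueError("refcount must be at least 1")
    if ditto_limit < 1:
        raise ValueError("ditto_limit must be at least 1")
    return 2 if refcount >= ditto_limit else 1


# With a really low limit of 2, the second user of a block already
# triggers the extra copy.
low_limit = [copies_to_keep(refs, 2) for refs in (1, 2, 3)]

# With a high limit, rarely shared blocks stay at a single copy.
high_limit = [copies_to_keep(refs, 100) for refs in (1, 2, 3)]
```

Under this assumed rule, a low limit gets close to the behavior asked for above: any block that appears more than once ends up with two physical copies.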
Very cool.
According to this video: http://www.youtube.com/watch?v=Fski5cOKGEs, Jeff Bonwick said that the deduplication table will be stored in flash (L2ARC). 1PB of storage would take 1TB of flash for the deduplication table. Now my question is this: if L2ARC isn't persistent, then upon reboot it would take a long time to read the deduplication table into L2ARC again for any new writes that are deduped. So technically, on a large deduped pool, deduplication depends on L2ARC persistence, which depends on ZFS crypto. How are ZFS crypto and L2ARC persistence going? Any new info?
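As a rough sanity check of the 1PB-to-1TB figure quoted above, the arithmetic can be sketched like this. The assumptions are mine, not from the talk: binary units, a 128 KiB block size, and every block being unique (one table entry per block). The resulting per-entry size is just what those numbers imply, not a documented figure.

```python
PIB = 2**50  # 1 PiB of pool storage (assumption: binary units)
TIB = 2**40  # 1 TiB of flash for the deduplication table
BLOCK_SIZE = 128 * 1024  # assumed block size of 128 KiB


def table_bytes_per_block(pool_bytes: int, table_bytes: int, block_size: int) -> float:
    """Table space available per block if every block has its own entry."""
    blocks = pool_bytes // block_size
    return table_bytes / blocks


blocks_in_pool = PIB // BLOCK_SIZE
bytes_per_entry = table_bytes_per_block(PIB, TIB, BLOCK_SIZE)
print(blocks_in_pool, bytes_per_entry)
```

Under these assumptions the pool holds 2^33 blocks, which leaves 128 bytes of table space per block. Smaller blocks would shrink that budget proportionally, which is one reason the table size matters so much for large pools.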
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Germany License
I reported on the PSARC cases a few days ago. Now the code for deduplication in ZFS has been integrated into Opensolaris, as stated by this mail on onnv-notify: "PSARC 2009/571 ZFS Deduplication Properties, 6677093 zfs should have dedup capability".
Tracked: Nov 02, 12:31