Disclaimer: The individual owning this blog works for Oracle in Germany. The opinions expressed here are his own, are not necessarily reviewed in advance by anyone but the individual author, and neither Oracle nor any other party necessarily agrees with them.
To dedup or not to dedup - that results in a lot of questions
Monday, February 8, 2010
Comments
Has anybody tried to compare the power consumption (if there is any difference) of non-deduped and deduped ZFS systems? Put simply: wouldn't it be more efficient (in terms of power and storage response time) to add an extra HDD to the zpool instead, forget about "Do I have enough spare IOPS?", and simply read the data directly from the HDD? And yes, it would depend on the stored content (dedup percentage vs. storage capacity, etc.). Thanks.
ZFS computes the checksums anyway. With hash-only dedup the only difference is a lookup in a table; with hash-and-compare you need an additional I/O, but that isn't much of a problem, as I explain later.
For reads it makes no difference whether the data is deduplicated or not. As deduplicated data is cached only once for potentially many blocks, the cache in storage arrays is used more efficiently, potentially resulting in fewer IOPS reaching the disks. The spare-IOPS problem isn't a problem for ZFS hash-only dedup. It is a negligible one for ZFS hash-and-compare: you would do the I/O without dedup anyway, you just have a read I/O instead of a write I/O, and only in the case of a false positive do you have to issue a write I/O as well. The probability of that is pretty low, so you can't lose, but you can win a lot. It is a big problem for weak-checksum dedup, as you have to check many dedup candidates.
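To make the I/O accounting above concrete, here is a minimal toy sketch of the two write paths. It is not the ZFS implementation: the `DedupStore` class, its counters, the in-memory "disk" and the choice of SHA-256 as the checksum are all assumptions made up for illustration.

```python
import hashlib


class DedupStore:
    """Toy block store (hypothetical, for illustration only)."""

    def __init__(self, verify=False):
        self.verify = verify   # False: hash-only, True: hash-and-compare
        self.table = {}        # checksum -> block id (the dedup table)
        self.blocks = []       # simulated disk blocks
        self.refcount = []     # references per stored block
        self.read_ios = 0      # simulated read I/Os to disk
        self.write_ios = 0     # simulated write I/Os to disk

    def _read(self, block_id):
        self.read_ios += 1
        return self.blocks[block_id]

    def _write_new(self, data):
        self.write_ios += 1
        self.blocks.append(data)
        self.refcount.append(1)
        return len(self.blocks) - 1

    def write(self, data):
        # The checksum is computed for every block anyway.
        digest = hashlib.sha256(data).digest()
        block_id = self.table.get(digest)
        if block_id is None:
            # Unknown checksum: a normal write plus a table insert.
            block_id = self._write_new(data)
            self.table[digest] = block_id
            return block_id
        if self.verify and self._read(block_id) != data:
            # False positive: same checksum, different data.
            # The toy stores the block separately and does not index it.
            return self._write_new(data)
        # Duplicate: no write I/O, just one more reference.
        self.refcount[block_id] += 1
        return block_id
```

In this sketch a duplicate block costs no disk I/O in hash-only mode and one read I/O in hash-and-compare mode, which replaces the write I/O that would have happened without dedup. Only a checksum collision costs a read and a write.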
Interesting read, and it inspired me to check the dedupe features in TSM 6.1. It seems that it uses SHA-1 without a compare step.
However, regarding the performance impact of synchronous dedupe: even if the hash table is in memory, shouldn't every alteration be flushed to persistent storage (disk or NVRAM)? If that is the case, my immediate thought is that synchronous dedupe may come with a significant performance penalty, or at least would require a lot of extra NVRAM and computing power.
Do you mean Andy or lparvirt? I guess lparvirt. So here is what I think: maybe lparvirt overlooked that ZFS is able to leverage SSDs for L2ARC, and therefore the hash table is already in a nonvolatile read cache. What speeds up dedupe is the performance of querying the hash table, not of writing it. So having a copy of the hash table in L2ARC should give you a viable performance speedup.
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Germany License