F5100, Peoplesoft, Benchmarks and all the rest

My article “App benchmarks, incorrect conclusions and the Sun Storage F5100” triggered a discussion. This is a mass answer to several comments made there by one person under two identities, who did it to “get my attention”. Yes, suuuure. I think it was meant to fake a discussion and to create the impression that other people were sharing his point of view. Don’t ask me about his incentive to do so. This answer isn’t for this reader, it’s for all the readers stumbling over his comments. It would be easier simply to delete them, but that would give his statements vastly more credibility than they deserve. This article is a slightly modified version of a comment I’ve added to the article mentioned before. For that reason I disabled commenting on this article, but feel free to comment at the original location.

The 33% at Peoplesoft Payroll NA 9.0

I would like to answer his comment #6.1.1.1 first. The 33% statement was computed against a benchmark result with a runtime of 105 minutes on the HP side and 79.35 minutes for Sun. That document isn’t available at the Oracle site any longer, but there is a new one. It dates from Fri, 09 Oct 2009 18:48:28 GMT, as stated by the HTTP header and by Acrobat Reader when you ask it for the document properties. The statement in the blog with the 33% was made on 13 October. I just assume that the page was updated shortly after the blog comment. I also assume that the document has to go through the same approval processes at HP and then at Oracle, so it sounds reasonable that it took a few days to get it onto the web page. At the moment I can just say: HP, that was really well timed. But I never thought that the people at HP are idiots, and there is nothing fishy about that. In the end, there is a reason why it’s sometimes important to add to a blog entry the date when you gathered the data. The second point he tries to make there is that HP is indeed faster, citing the 67 minutes number. But that is the wrong number. You have to add the 13 minutes for “Print Advice” and “Deposit”. Then you have numbers you can compare. I have no idea what they did to accelerate the same system with roughly the same storage. My educated guess is that the secret lies in the increase of threads to 16. The Sun solution used 8 threads. I would really appreciate it if someone could send me the document that was on the website before Friday the 9th. Furthermore, I will analyze the benchmark response as soon as more information is publicly available.
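The comparison above is plain arithmetic, so here is a minimal sketch of it in Python. The runtimes are the ones quoted in the text; the variable names are made up for illustration, and the formula for the relative gap is my assumption about how such a percentage is usually derived, as the text does not spell out the exact rounding behind the 33%.

```python
# Runtimes in minutes, as quoted in the text. Names are illustrative only.
hp_old_runtime = 105.0            # HP result the 33% statement was based on
sun_runtime = 79.35               # Sun result
hp_new_payroll = 67.0             # number cited in the comment
hp_new_print_and_deposit = 13.0   # "Print Advice" and "Deposit"

# Assumption: the gap is expressed relative to the Sun runtime.
old_gap = (hp_old_runtime - sun_runtime) / sun_runtime

# The comparable total for the new HP document includes the 13 minutes.
hp_new_total = hp_new_payroll + hp_new_print_and_deposit

print(f"old HP run took {old_gap:.1%} longer than the Sun run")
print(f"new HP total: {hp_new_total:.2f} min, Sun: {sun_runtime:.2f} min")
```

Under this assumption the old HP run comes out roughly a third slower, while the new HP total of 67 plus 13 minutes lands right next to the Sun result instead of clearly ahead of it.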

Yes, HDD at HP ... but you should look at the enclosure

Let’s go further in the thread: In several statements Steve A. tries to create the impression that the HDD solution is much cheaper. Well, when you read the documents he cites himself, you will see that the storage system used by HP isn’t a mere JBOD. It’s an EVA8100. We don’t talk about just 58 HDDs, we talk about 58 disks in a modular storage system at the higher end of HP’s storage product portfolio. The only thing bigger than an 8100 is their XP series, and that’s rebranded Hitachi high-end storage. It comes, for example, with 8 GB of cache in a pair of controllers, of course backed up by batteries, so they can ignore CACHE FLUSH commands. In addition, I’m still of the opinion that this configuration is severely short-stroked. But to make this short: This device is expensive. It’s worth its money. I don’t have a price at hand for the 8100, but the current model, the 8400, starts at $61,456 list price with no disks included, AFAIK. Then shop for 58 146 GB SAS disks plus 5 trays. I have no idea why Steve A. thinks that the SSD solution is 5 times more expensive than the HDD solution.

So … let’s get to comment #3.1.1. In this comment, Steve A. found it necessary to insult David. But it started with a misunderstanding. Steve A. talked about the Peoplesoft benchmark, whereas David seemingly talked about the TPC-C benchmark. But that’s not the point. The point made by Steve A. has an erroneous foundation. He talks about 11U needed by the HP solution and says that this solution used 2.5” drives. Not exactly. Well … I don’t know which HP EVA8100 he uses, but on my <a href="http://h18000.www1.hp.com/products/quickspecs/12745_div/12745_div.pdf">QuickSpec sheet</a> the 8100 enclosure is specified with 14 disks per tray and a tray height of 3 RU. I just assume he got confused by the 146 GB per disk number. There were such disks in 3.5” a while ago ;-) For 58 disks you need 5 trays. That’s 15 rack units without the controllers. Add 4 rack units for the controllers, and we are at 19 RU. So at the moment it’s 3 RU (1 RU F5100 + 2 RU J4200) versus 19 RU. In my view of the world that’s pretty significant.

Additionally: Given that a 2C6D EVA8100 consumes 2600 Watts and a 2C2D 1150 Watts, it would be a benevolent assumption to put a 2C5D with an almost unpopulated fifth shelf in the range of 2000 Watts. Now let’s turn to the Sun side: The F5100 is rated at 281 Watts at 100% load and 220 Watts at 50% load. Let’s just assume 250 Watts for the load of this benchmark (albeit I would assume the power consumption is near the idle load) and 352 Watts for the J4200 with 15k SAS disks and two SIMs. That’s roughly 600 Watts versus 2000 Watts, a difference of 1400 Watts. Rule-of-thumb calculation: 73584 kWh in three years (assuming the rule of 1 Watt of A/C for every Watt going into the storage). In my world this is pretty significant, too.
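For those who want to redo the rack space and power numbers above, here is a small Python sketch of the calculation. All inputs are the figures from the text; the variable names are mine, and the 365-day year and the factor of 2 for cooling (1 Watt of A/C per Watt of storage) are the assumptions behind the rule-of-thumb result.

```python
import math

# HP EVA8100 side: figures from the QuickSpec numbers quoted in the text.
disks = 58
disks_per_tray = 14
tray_height_ru = 3
controller_ru = 4

trays = math.ceil(disks / disks_per_tray)        # a partly filled tray still takes space
hp_ru = trays * tray_height_ru + controller_ru

# Sun side: 1 RU F5100 plus 2 RU J4200.
sun_ru = 1 + 2

# Power: estimated 2C5D EVA8100 versus F5100 plus J4200 (250 W + 352 W, rounded).
hp_watts = 2000
sun_watts = 600
delta_watts = hp_watts - sun_watts

# Assumption: every Watt into the storage costs another Watt of A/C.
cooling_factor = 2
hours_in_three_years = 24 * 365 * 3

saved_kwh = delta_watts * cooling_factor * hours_in_three_years / 1000

print(f"HP: {trays} trays, {hp_ru} RU; Sun: {sun_ru} RU")
print(f"difference: {delta_watts} W, {saved_kwh:.0f} kWh in three years")
```

Change `hp_watts` or `sun_watts` if you have measured values for your own configuration; the 2000 Watts for the 2C5D in particular is only an estimate between the two rated configurations.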

Capacity matters.

When we step to comment #3, Steve A. shows a clear sign of “I can read, but I’m somehow not able to understand it”. First I should say that SSD isn’t a silver bullet and that there are situations where an HDD is as fast as an SSD. One of these cases is the sequential write. Due to developments like perpendicular recording, a hard disk can take in a lot of data. There is only one thing that kills a hard disk: moving its head. The configuration used by Sun reflects this. The indices and the data files have a random access pattern, thus it’s best to use SSD. The redo log is a steady stream of sequentially written data in arbitrary sizes. Rotating rust is perfect for this task, as those disks don’t see any random I/O. To answer his comment: It’s about capacity, and even a blind man should see that. When the dataset is 200 GB large, you have to store it somewhere. The hard disks aren’t used for the datafiles, just for the redo log, thus you have to keep them out of the capacity calculation, and thus you need the 40 FMod instead of the 20 FMod configuration of the F5100. Simple math. Steve A. doesn’t seem to get this, albeit my colleague Vince states this clearly in his article about the usefulness of flash for Peoplesoft.

Real men don't put Redo Logs on SSD

The next example of “I can read, but I’m somehow not able to understand it” is comment #4: Of course a RamSAN and the F5100 are two pretty different implementations of an SSD. But that wasn’t the point of his article. The point was, as far as I understood it: It isn’t effective to throw SSD at logging. There are other problems limiting the speed of the LGWR. And that pretty much supports my point that there is no use in putting log writing on SSD and that HDDs are up to the requirements of this task. Sun uses this hybrid approach on many occasions. For example, the Hybrid Storage Pool (HSP) of ZFS works the same way: SSD where it fits (L2ARC, sZIL), HDD where it fits (pool).

HP, write caches and their TPC-C

The problem with the HP TPC-C benchmark (it’s an HP benchmark, not an Oracle one) as stated in #2.1.2.1 is not really understandable to me. As long as a system complies with the ACID rules of the TPC-C, it’s okay. The large caches on the controllers are battery-protected. By the way … there is a reason why many TPC-C benchmarks contain a lot of UPSes in their bill of materials. To a large part it has something to do with write caching ;) But I’m not here to defend an HP benchmark result ….

Okay, at one point he is right

Regarding the comment at #4.1.1.1: Hmm … how do I say this politely? It was a little bit unfortunate to cite Claudia’s presentation. It was more a presentation about ARC, L2ARC and sZIL, but spiced up with some marketing slides. I have been using them since last year (if my memory serves me right). It’s a little bit outdated technology-wise, but the core of it should still be true when you update it with current technology: Yes, there are bigger high-RPM disks, but there are bigger low-RPM disks as well, and so on. And the power calculation didn’t include the server, as this one is equal in both configurations. But that’s the point where I have to admit that Steve A. is right: This slide needs a recalculation.

Conclusion

So … it’s late now … I spent my evening and a large heap of tea on writing these comments. There are many more points I would like to comment on, but time isn’t infinite, and I think I have already spent more time on it than this person with such a disgusting style of discussion deserves.