Flash On The Floor: Pure, XtremIO and 3PAR

In September 2013, my organization and I started a journey into the realm of flash storage. The initial foray took us into two camps and lasted much longer than we expected. In fact, our 2013 storage decision carried lessons and tests that lasted until it was time for another upgrade: our 2015 replacement at a sister site.

History

In 2013, while smaller start-ups were plentiful, EMC’s pre-release XtremIO (GA in December 2013) and Pure Storage were the only mainstream contenders. Granted, Pure was still technically a start-up, but then again, XtremIO was an unreleased product purchased by EMC without broad field experience. Everyone was young.

Much of this has already been hashed out in my prior posts, but the short story is that we decided to forgo Pure Storage in 2013 based on a belief in promises by EMC that XtremIO would deliver everything that Pure did and more. The two metrics were data reduction and performance. We assumed in the land of enterprise storage that high availability was a given.

In the following months of early 2014, we learned not to assume anything and also how to help a young product mature through bug discovery. During a controller replacement for a failed Fibre Channel (FC) module, we encountered an InfiniBand bug that took down the XtremIO array for 3 hours. EMC fixed that bug in the next release. Then we discovered that virtual machines with EFI firmware could not (re)boot on XtremIO. That led to EMC’s “best practice” of configuring ESXi’s advanced setting, Disk.DiskMaxIOSize, to 4096 from the default of 32767. In June we had our largest issue when our hosts lost connectivity during a “non-disruptive” upgrade. The troubleshooting for that lasted more than 6 months and was never cleanly resolved. Another “best practice”, setting native multipathing (round robin) to switch paths after every I/O, turned out to be necessary to avoid this issue with our QLogic converged network adapters (CNAs).
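
For reference, here is a minimal sketch of how those two host-side settings could be pushed out in bulk. It assumes SSH is enabled on the ESXi hosts and uses the paramiko library; the host names, credentials, and device identifier are placeholders, not anything from our environment.

```python
# Hypothetical sketch: apply the two XtremIO-related "best practice" host settings
# (Disk.DiskMaxIOSize and round-robin path switching after every I/O) to a list
# of ESXi hosts over SSH. Hosts, credentials, and the device ID are placeholders.
import paramiko

HOSTS = ["esxi01.example.local", "esxi02.example.local"]  # placeholder hosts
USER, PASSWORD = "root", "********"                       # placeholder credentials
DEVICE = "naa.514f0c5xxxxxxxxx"                           # placeholder XtremIO LUN ID

COMMANDS = [
    # Lower the maximum I/O size ESXi will issue to the array (value in KB).
    "esxcli system settings advanced set -o /Disk/DiskMaxIOSize -i 4096",
    # Use round robin for the device and switch paths after every single I/O.
    f"esxcli storage nmp device set --device={DEVICE} --psp=VMW_PSP_RR",
    f"esxcli storage nmp psp roundrobin deviceconfig set --device={DEVICE} --type=iops --iops=1",
]

for host in HOSTS:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=USER, password=PASSWORD)
    for cmd in COMMANDS:
        stdin, stdout, stderr = client.exec_command(cmd)
        print(host, cmd, stdout.read().decode().strip(), stderr.read().decode().strip())
    client.close()
```

In practice the per-device commands would need to be repeated for every XtremIO LUN (or handled once with a SATP claim rule), but the sketch shows the shape of it.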

As for data reduction, XtremIO started out at 1.5 to 1, in contrast to Pure’s 4.2 to 1. The compression coming in XIOS 3.0 was supposed to bring parity between those numbers, but that never panned out. Instead, the 1.5x deduplication dropped to 1.3x (due to block sizes increasing to 8K), and compression added its own 1.3x, for a net of roughly 1.7x. EMC addressed this with complimentary hardware to make raw capacity match or exceed Pure’s logical capacity. They stand by their word, and for that, we are thankful.

Most recently, I discovered that our dataset did change somewhat during the spring of 2014, such that our data reduction comparison may not have been entirely apples-to-apples. Table compression was introduced into part of our database structure, leading to lower reduction numbers in our latest Pure implementation. Even so, we see nearly 2x better data reduction on Pure now compared to XtremIO.

XtremIO Today

Today, we have 8 months of uptime under our belt on XtremIO (2 x 20TB bricks), not counting the migration off of it in December 2014 for the XIOS 3.0 upgrade. VMware Storage vMotion gets the credit for no downtime on that. We haven’t yet had a chance to request the latest minor update for XIOS, so our first non-disruptive update is still in the future.

Performance-wise, XtremIO has never skipped a beat. Mass backups by our SQL servers don’t faze it (unless you consider a jump from 0.5ms to 1.5ms for less than 15 minutes an issue), and it will surely be the last bottleneck in our virtual infrastructure. CPU and RAM will need to scale exponentially first, and even then it would only be a capacity issue (which further extends the performance lead).

Strangely enough, it is that capacity factor that caused us to reconsider the competition when it came time to purchase a sister SAN for our XtremIO. For years, we faced the inverse problem of excess capacity that could not support the performance demands. Now we had performance that couldn’t stretch to meet capacity (due to the low data-reduction ratio).

The Candidates in 2015

To replace our HP 3PAR V400 (P10400) from circa 2012, we engaged with EMC, HP, Pure, and Dell for solutions and quotes. Other vendors were given a lighter investigation, but their products tended to be niche, focusing on just one characteristic rather than the bigger picture, or they just weren’t mature enough to lean on (especially considering our above history with “beta” XtremIO).

Dell Compellent was all about hardware. Their technical sales team openly stated that they viewed deduplication as too risky and relegated compression to nearline storage only. They also seemed stuck in the past with a focus on mixing SLC, MLC, and “mixed-use” SSDs on top of spinning disks. I hear that they are working on modernizing, but that’s still coming. If hardware is all that matters to you, though, Dell was the cheapest.

HP 3PAR was actually the lead contender all the way up until Pure stepped in. We like our history with 3PAR and our 3PAR arrays on the floor. They have never failed us. Well, *sigh*, once, but it was the 1.0 release of their deduplication, and true to form, I found the rarest of bugs for them (I’m lucky like that). Anyway, 3PAR has been a winner for us since 2009. The GUI and CLI are straightforward (once you learn the lingo of CPGs, VVs, VLUNs, etc.) and management has consistently taken a minimum of effort.

The thing that argued most strongly to me was/is the 3PAR product & support team. Ivan and his team are fully engaged and accessible, and they currently have a full tank of momentum in their innovation engine. HP was late to this party–hence, they weren’t in the picture in fall 2013–but they are here now and the 3PAR line is a winning pick.

3PAR’s solution for us involved half of the capacity in flash and half in 15K SAS disks. You might think that’s moving backwards in a flash world, but unless your data truly pushes the limits of the storage autobahn, this isn’t a right-and-wrong debate. It’s a better-and-best one. Furthermore, 3PAR Adaptive Flash Cache (AFC) makes that spinning stuff behave much closer to flash speed.

Physics, rack space, and power do matter, though, which is what moves the limelight to all-flash arrays.

EMC XtremIO was the incumbent, with a killer deal on the table for a matching partner to our existing 2 x 20TB pair of bricks. Initial estimates, however, put our data needs above what we could bet on that pair to deliver (we learned in Round 1 not to assume more than 1:1 reduction). In an XtremIO world, that means doubling capacity (4 x 20TB). There are no odd-numbered solutions, except a single brick.

That steep upgrade path carries a lot of cost and a significant chunk of rack space with it. Two bricks consume 13U. Four require 25U, I believe (one less than double because you only need one InfiniBand switch). And while EMC was able to extend the exceptional pricing to the four-brick solution, it was still above the rest of the candidates. If there were a way to guarantee data reduction ratios, that wouldn’t have been the case, because even 1.7x changes the equation. However, we had bet on much more than that originally, and no amount of goodwill can change (XIOS) software design.
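
To make that equation concrete, here is a rough back-of-the-envelope sketch of the capacity math; the usable-capacity-per-brick figure is an illustrative assumption for this example, not a vendor-published number.

```python
# Rough effective-capacity math for an XtremIO cluster at different data
# reduction ratios. The usable-TB-per-brick figure is an illustrative
# assumption, not a quoted specification.
USABLE_TB_PER_BRICK = 15.0   # assumed usable capacity per 20TB (raw) brick

def effective_tb(bricks, reduction_ratio):
    """Logical data the cluster could hold at a given data reduction ratio."""
    return bricks * USABLE_TB_PER_BRICK * reduction_ratio

for ratio in (1.0, 1.7, 3.0):
    print(f"2 bricks @ {ratio:.1f}:1 -> {effective_tb(2, ratio):6.1f} TB logical")
    print(f"4 bricks @ {ratio:.1f}:1 -> {effective_tb(4, ratio):6.1f} TB logical")
```

In this illustration, a guaranteed 1.7x on two bricks comes close to what a four-brick cluster delivers at 1:1; betting on only 1:1 is what forces the jump to four bricks.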

Pure Today

That brings us to Pure Storage. Coming out of 2013, we didn’t have anything bad to say about Pure; we simply believed a competitor’s word and went forward on faith. This time around we had the facts and put aside the unproven. Pure won our business on the following factors:

  • Innovation: the highest data reduction
  • Simplicity: deployment and management
  • Cost: lowest price for logical capacity (see “Innovation”)
  • Support: great support in 2013 and same today
  • Environment: lowest power and least rack space (8U for our setup)
  • Expansion: granular, customer-handled, and non-disruptive
  • Availability: architecture and track record in our environment

In our internal discussions, we put different filters and weights upon those factors as we measured all of the candidates, and regardless of the weighting, Pure came out on top. That’s important to do in any organization–don’t let the price run away with the discussion. It surely matters, but support, management, growth plans, and availability can make operational costs far exceed the upfront ones.
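
For anyone who wants to run a similar exercise, here is a simplified sketch of the kind of weighted scoring matrix I mean; the candidate names, weights, and scores are made-up placeholders for illustration, not our internal numbers.

```python
# Simplified decision-matrix sketch: score each candidate per factor (1-5),
# weight the factors, and compare totals. All weights and scores here are
# made-up placeholders, not our internal evaluation data.
FACTORS = {  # factor: weight
    "data reduction": 3, "simplicity": 2, "cost": 3, "support": 2,
    "power/rack space": 1, "expansion": 2, "availability": 3,
}

CANDIDATES = {  # candidate: {factor: score from 1 (worst) to 5 (best)}
    "Candidate A": {"data reduction": 5, "simplicity": 5, "cost": 4, "support": 5,
                    "power/rack space": 5, "expansion": 4, "availability": 4},
    "Candidate B": {"data reduction": 2, "simplicity": 3, "cost": 2, "support": 4,
                    "power/rack space": 2, "expansion": 3, "availability": 3},
    "Candidate C": {"data reduction": 3, "simplicity": 4, "cost": 4, "support": 4,
                    "power/rack space": 3, "expansion": 4, "availability": 5},
}

def weighted_total(scores):
    """Sum of factor scores multiplied by their weights."""
    return sum(FACTORS[factor] * scores[factor] for factor in FACTORS)

for name, scores in sorted(CANDIDATES.items(), key=lambda kv: -weighted_total(kv[1])):
    print(f"{name}: {weighted_total(scores)}")
```

The useful part of the exercise is re-running it with different weightings; if the winner keeps changing, the decision is really a tie-breaker on priorities, not on products.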

We are now fully migrated to our new Pure array and it is going well. Our data reduction is different from our 2013 POC–3x vs 4x–but we know why that is the case (our dataset changed), so it’s explainable. Actually, we have plans for improving it as we address our next backup & recovery project, but that’s for a different post :).

Conclusion

In this arena, I want to emphasize that choosing between the leaders is not a right-and-wrong choice. It is subjective and depends on your data and environment. 3PAR is good. XtremIO is good. Pure is good. What makes one or more of them great is you. Your data, your environment, your workload, your future. Your matrix is different from mine. Choose the solution that adds up.

20 Comments

  1. Nice post. I work for an EMC partner who was on board early with XtremIO, and it is always nice to understand a customer’s perspective on their journey. Thanks for sharing yours.

    April 21, 2015
  2. David Czech said:

    Excellent comparative information here Chris. I especially like and agree with this statement:
    “don’t let the price run away with the discussion. It surely matters, but support, management, growth plans, and availability can make operational costs far exceed the upfront ones.”

    Would you be able to post some additional information about how the various members of your team weighted and filtered based on the criteria/metrics you listed:
    • Innovation: the highest data reduction
    • Simplicity: deployment and management
    • Cost: lowest price for logical capacity (see “Innovation”)
    • Support: great support in 2013 and same today
    • Environment: lowest power and least rack space (8U for our setup)
    • Expansion: granular, customer-handled, and non-disruptive
    • Availability: architecture and track record in our environment

    Would be great to see how important each of these factors is to CIO/CTO/CEO/Director/Admins/anyone else involved in the evaluation/decision/purchase process.

    DC

    April 22, 2015
  3. noon29 said:

    In the end, maybe it’s Dell who is on the right path. I mean, if you look at what inline dedupe and compression need, it just means a lot of RAM and CPU. But a month ago Toshiba, Intel, and Micron announced their 32-layer and 48-layer NAND chips. It means that at some point we will see SSDs of x TB, plus with that they can go back to a larger process node like 40nm (so the endurance will increase from 3000/5000 to 10000/120000 cycles). With those kinds of SSDs, will deduplication and compression really be necessary anymore?

    Just a question about your Pure test: did you fill the array to more than 80%?

    April 22, 2015
    • Chris said:

      noon29,

      I’ve heard this same line of argument before from Hitachi and old HP (not present-day), and it strikes me every time as flawed. Why pose an unnecessary “either-or” situation? I think it is *great* that NAND is getting denser and resulting in higher capacity. Bring it on! THEN, let’s use innovation and intelligence to take those higher capacities and stretch them farther.

      Switching to a “both-and” perspective allows us to gain the advantages of bigger disks while saving resources like electricity, rack space, shipping overhead, and expansion costs. The beauty of dedupe is that we eventually get enough fingerprints gathered to start scaling exponentially.

      Avamar calls this “steady state” in the backup realm. After a point, it stops seeing “new” things, so capacity utilization plateaus while retention continues climbing. We’re not quite there in the storage space, because high performance is the priority, but eventually the compute space will reach a point where we gain post-processing dedupe efficiency AND enterprise flash speed.

      Data keeps growing (it feels like “Delete” is the least used button on the keyboard these days), so we’ll always need more space. Let’s do it smartly.

      –Chris

      April 22, 2015
      • noon29 said:

        Sure, at least for compression, but I’m still wondering about deduplication. When I started using deduplication on ZFS in 2008, we always ran an analysis at the time to determine the dedup ratio. If you look today, the FA-450 already has 1TB of RAM for a small raw capacity.

        But why not, maybe at some point we will have a new dedup algorithm which won’t need much CPU or RAM.

        April 22, 2015
    • (Disclosure – I work for HP Storage)

      Your response tells me you probably don’t understand how HP 3PAR has implemented deduplication. It literally uses 0 (yes, ZERO) CPU cycles, and I’m guessing less RAM than any other deduplication implementation in the industry. Deduplication with 3PAR is done in the ASIC. Each controller has its own ASIC. Deduplication is always inline and has almost no impact on performance.

      And because of the 3PAR architecture, deduplication takes very little RAM. You can read more about it on my blog: http://h30507.www3.hp.com/t5/Around-the-Storage-Block-Blog/Welcome-to-Flash-2-0-HP-3PAR-Thin-Deduplication-with-Express/ba-p/164858

      I don’t know anything about Dell All Flash Arrays; they haven’t come up in the all-flash opportunities that I see with HP 3PAR. So I’m curious: what specifically has you saying that it’s Dell who is on the right path? Deduplication without a performance impact, not requiring a lot of RAM, driving IOPS that are as good as or better than the all-flash start-ups, all flash below the price of 15K drives, and having the data services of a mature array platform... in my biased opinion, it’s HP 3PAR who is on the right path.

      April 23, 2015
  4. Matt said:

    Why no NetApp AFF in your architecture evaluation? Just curious.

    April 22, 2015
    • Chris said:

      We have a history with NetApp dating back to 2007. In fact, the term “gotcha” was coined during that failed implementation to describe the repeated surprises, limitations, and problems that arose across the span of deploying two arrays and retreating from them.

      While we don’t have hands-on experience in the years since then, our perception from networking at conferences and in the wider community is that NetApp isn’t the future. I want to be fair to NetApp and give the benefit of the doubt to years of potential progress, but I’ll have to leave that to another customer. Thanks.

      April 22, 2015
      • noon29 said:

        Can you list the other vendors?

        I don’t see any SDS, for example, like Nexenta.

        April 23, 2015
      • bitpushr said:

        Chris, I’m a NetApp employee in Boston, would you mind dropping me a line as to your experiences in ’07? Thanks!

        May 8, 2015
        • Chris said:

          Sure thing. I’m on leave currently (just had a son!), so it’ll be a bit, but I’ll give you the run-down when I can. Thanks!

          May 8, 2015
  5. Ron R said:

    I would love to have a conversation with you sometime. We recently did a “shootout” between Pure, EMC and HP with their all flash arrays and came up with a different conclusion than you did. I’d be happy to share our analysis and review with you to see where we differ.

    In short though:
    From a performance perspective it was Best = EMC, Better = HP, Good = Pure
    For compression/dedupe it was Best = Pure, Better = EMC, Good = HP
    From a failover testing perspective it was EMC, Pure, HP
    From a cost perspective it was HP, Pure, then EMC (list prices were compared only)

    I’d love to discuss with you sometime before we determine our standards going forward.

    Ron

    April 23, 2015
    • Chris said:

      Hey Ron,

      I’m shooting you an email after this for that discussion and shared analysis. I agree with your short results for the most part. Can’t argue with XtremIO performance–it’s a bullet train of speed and momentum. Same for Pure dedupe.

      On failover, well, I’ve written plenty on that. New customers will probably never see the things we did, which is a good thing. What persists, though, is the architecture resilience when we did need to test. XtremIO’s management server (XMS) was decently fragile, though EMC could always resurrect it from data on the array, and failover testing was encouraged to be more delicately handled, too. Pure just keeps saying to pull the plug and have at it. But maybe you’re talking about a different aspect of failover.

      And good ‘ole cost… That seems to be a motivational topic, as in, “How motivated are my partner and the vendor to get my business?” I think EMC has very high list prices but will then hit the hard deck to win business. HP and Pure play closer to out-the-door pricing.

      Looking forward to more about your process,
      Chris

      April 23, 2015
    • Ron – I’d love to have you talk to our team about why 3PAR failover wasn’t the best of the 3 platforms. Drop me an email: hpstorageguy at hp dot com if you’re willing.

      Not surprised about data reduction as HP 3PAR doesn’t have compression yet. Also curious about performance – that would be highly dependent on the configurations and tests run.

      April 23, 2015
  6. Colin G said:

    Hi Chris,

    A few questions on your analysis, particularly with respect to EMC’s XtremIO product and Pure’s FA series:

    What type of workload are you running to see the 1:1-1.3:1 data reduction ratio on the XtremIO array? This is far lower than any other published results I’ve seen in the field.

    Data reduction/efficiency, Pure vs XtremIO: Do your published results for Pure account for their inclusion of thin provision savings and snapshot savings in their total data reduction ratio? As I understand it, the XtremIO array discerns between data reduction (dedupe & compression) and overall efficiency, which compounds thin provisioning savings, and does not include savings from data copy management.

    Finally, I’m a bit surprised to see no mention of up-front raw-to-usable capacity in this discussion: Pure’s architecture imposes ~45% overhead on raw space vs. XtremIO’s overhead of ~25%. I also see no mention of each box’s ability to maintain its data services as the array reaches maximum capacity (read: architecture differences), but I imagine you weren’t able to test this scenario yourself?

    August 12, 2015
    • Chris said:

      Hi Colin,

      Our workload is fully virtualized (VMware) and consists of a heavy MSSQL component, along with some web, file, and app server data. Thus, we expect to get less stunning results than the front-line marketing presents about any product (XtremIO, Pure, 3PAR, etc). Even so, the 1.3 dedupe * 1.3 compression from XtremIO leaves quite a bit to be desired. Latest data reduction stats are: 1.8 for XtremIO and 2.9 for Pure.

      Regarding those stats, I ignore and honestly dislike “Overall Efficiency”, “Total Reduction”, or whatever anyone else likes to call data reduction + thin provisioning. Thin provisioning is a base requirement in modern arrays (IMO), plus it is terribly skewable by merely creating large, empty volumes. Thus, my stats are only “Data Reduction” for both (it’s nice that Pure and EMC both use those two exact words, so we don’t have to translate).
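
      To illustrate why, here is a quick sketch with made-up numbers (nothing below comes from our arrays):

      ```python
      # Made-up numbers showing how "overall efficiency" gets inflated by thin
      # provisioning while actual data reduction stays the same.
      written_tb = 10.0        # host-written data
      stored_tb = 5.0          # physical space after dedupe + compression
      provisioned_tb = 100.0   # large, mostly empty thin volumes

      data_reduction = written_tb / stored_tb          # 2.0:1 -> the number I compare
      overall_efficiency = provisioned_tb / stored_tb  # 20.0:1 -> skewed by empty volumes
      print(f"data reduction {data_reduction:.1f}:1, overall efficiency {overall_efficiency:.1f}:1")
      ```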

      Lastly, I can see the intellectual/philosophical merits of debating Pure’s overhead vs. XtremIO’s–consumer vs enterprise MLC, RAID types, etc–but for the end user, net usable capacity matters most. In that discussion, 3PAR actually wins for most efficient SSD usage (ask Calvin Zito & crew to unpack their elimination of overhead on those massive 3.84TB drives).

      As for data services staying up, I’ve had multiple experiences of XtremIO data services going down hard, while my only remotely similar encounter with Pure occurred once in 2013 due to my misunderstanding of their definition of “full” (it was 80% at the time and began throttling writes). I learned and they wisely redefined it to the intuitive 100% that it is today.

      In the next several months, I’ll have the opportunities to go through XtremIO’s 4.0 upgrade and Pure’s //m upgrade, so I’ll have fresh data on each product’s stability in 2015. I’m hoping for smooth, positive experiences with both.

      Thanks for contributing!

      August 12, 2015
      • Mad Dog said:

        Chris,
        You are spot on with Pure’s reduction reporting. To add a point, Pure does not count snapshots towards our data reduction reporting.

        Can’t wait for you to experience going from a 400 to //m 100% online!!!

        (Pure Employee)

        August 15, 2015
  7. Great review as always Chris. Have you reviewed Pure for IT Central Station? I think our users would really benefit from your opinion.

    January 3, 2016
  8. Joe said:

    Great post! I work for a vendor where we sell all the storage you outlined here. We are seeing Pure disrupt a lot of our sales cycles and it honestly has HP at a loss for words other than “Pure is the worst all flash array”.

    July 20, 2016
