Real-World Dedupe on HP 3PAR

HP 3PAR recently released version 3.2.1 of the InForm OS, which most notably brought in-line deduplication to the already rock-solid storage platform. Last week, I wrote briefly about it and included screenshots of estimates in the CLI. Today, I’d like to share real-world results.

I’d like to give particular thanks to Ivan Iannaccone of the HP 3PAR team for reaching out and for early access to the 4.6.1 IMC with dedupe in the GUI.

After I ran the estimate in the previous post, I learned from Ivan that estimates (and conversion jobs) run against multiple virtual volumes (VVs) in the same common provisioning group (CPG) return higher data reduction ratios (read: less used space), because duplicate blocks can be shared across the volumes. So when I received the new InForm Management Console (IMC) yesterday, I ran a fresh estimate against two VDI (Microsoft RemoteFX) VVs to see how the numbers panned out.
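To make the intuition concrete, here is a minimal sketch (my own illustration, not 3PAR internals) of why estimating volumes together helps: blocks that are unique *within* each volume can still be duplicates *across* volumes, so a joint estimate finds savings a per-volume estimate cannot. The block fingerprints below are hypothetical.

```python
# Illustrative only: dedupe ratio = logical blocks / unique blocks.
def dedupe_ratio(volumes):
    """Ratio of total logical blocks to unique blocks across the volumes."""
    logical = sum(len(v) for v in volumes)
    unique = len(set().union(*[set(v) for v in volumes]))
    return logical / unique

# Hypothetical fingerprints for two VDI volumes; the shared base-image
# blocks ("os1".."os3") appear in both.
vv_a = ["os1", "os2", "os3", "a1", "a2"]
vv_b = ["os1", "os2", "os3", "b1", "b2"]

print(round(dedupe_ratio([vv_a]), 2))        # each volume alone: 1.0
print(round(dedupe_ratio([vv_a, vv_b]), 2))  # estimated together: 1.43
```

Alone, each volume has no internal duplicates at all; together, the shared base-image blocks are counted once, and the ratio rises.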

[Screenshot: 3par_dedupe_preview_rfx]

As you can see, the dedupe ratio rose from 2.31 to 2.83. Every little bit helps, but what is the actual deduplication ratio?
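For context on what those ratios mean in raw capacity terms, a dedupe ratio R implies the stored footprint is logical size ÷ R, so the fraction of space saved is 1 − 1/R. A quick sketch:

```python
def space_saved_pct(ratio):
    """Percent of logical capacity NOT consumed at a given dedupe ratio."""
    return (1 - 1 / ratio) * 100

print(f"{space_saved_pct(2.31):.1f}%")  # ~56.7% saved at 2.31:1
print(f"{space_saved_pct(2.83):.1f}%")  # ~64.7% saved at 2.83:1
```

So the jump from 2.31 to 2.83 is worth roughly eight more points of capacity savings.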

In the new IMC, I followed these steps to start the jobs:

  1. Go to the “Provisioning” section
  2. Expand “Storage Systems” > “[array_name]” > “Virtual Volumes”
  3. Select “Exported”
  4. Go to the “Virtual Volumes” tab in the upper right pane
  5. Shift-select both RemoteFX virtual volumes (“rfx_ssd_a” and “rfx_ssd_b”)
  6. Right-click the selected VVs and click “Convert Virtual Volume…”
  7. In the top section, change the radio button to “To Thinly Deduped”
    [Screenshot: 3par_dedupe_convert_1]
  8. In the bottom section, move down the two VVs to be converted
    [Screenshot: 3par_dedupe_convert_2]
  9. Lastly, I chose to keep the “Target CPG” the same as the source (these are the only VVs in that CPG)

At this point, it is prudent to mention that you should review your available SSD capacity before starting the conversion job. When I ran these two jobs (each VV is its own job), my free SSD capacity on the entire array dropped to 116GB (3%). It’s possible that the IMC would have warned me before letting me run jobs that would have exhausted the free space, but I can’t say.

To be absolutely safe, the array would need each job to reserve 100% of its source space before starting (and before letting other jobs begin). Otherwise, it is conceivable that a job running against entirely unique data would achieve only a 1:1 ratio and thus save no space; that job alone could run out of room mid-conversion, let alone multiple concurrent jobs with varying outcomes.
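The conservative admission policy described above can be sketched as follows. This is my assumption about how a safe scheduler *could* behave, not documented 3PAR behavior, and the capacities are hypothetical:

```python
# Assumed policy: admit a conversion job only if its worst-case (1:1,
# no-savings) footprint fits alongside reservations held by running jobs.
def can_start_job(source_gb, free_gb, reserved_gb):
    """True if the job's full source size fits in remaining free space."""
    return source_gb + reserved_gb <= free_gb

free = 500      # hypothetical free SSD capacity, GB
reserved = 0
for vv in (200, 250, 150):   # hypothetical source VV sizes, GB
    if can_start_job(vv, free, reserved):
        reserved += vv       # hold worst-case space while the job runs
        print(f"start {vv} GB job (reserved now {reserved} GB)")
    else:
        print(f"defer {vv} GB job: worst case exceeds free space")
```

Under this policy the third job is deferred even though the first two will likely finish well under their reservations; that pessimism is the price of never exhausting the array mid-job.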

Back to dedupe, the jobs ran concurrently for 1 hour and 25 minutes and had this result:

[Screenshot: 3par_dedupe_rfx]

Even better than estimated!


10 Comments

  1. jacobddixon said:

    What is the current dedupe ratio you are getting on the 3PAR today? We are looking at getting an all SSD 3PAR SAN but also looking at PURE. PURE can guarantee the dedupe ratio…

    July 21, 2016
    • Chris said:

      Hey Jacob,

      I’m afraid I can’t give you updated numbers as we retired that 3PAR array in mid-2015. The best advice I can give you is to POC both, because the only true test is putting your data on top of each platform and seeing how they dedupe & compress.

      That said, I’m pretty confident that you’ll see better results from Pure Storage, and you’ll gain the benefits of the simplest management, upgrade, and maintenance platform in the industry. 3PAR came a long way with the web-based SSMC (StoreServ Management Console), but Pure is still a cut above.

      Thanks!
      Chris

      July 21, 2016
      • jacobddixon said:

        We really like both and we currently have a 3PAR so it has a leg up for us doing replication at the SAN level with our remote site we are setting up. 3PAR doesn’t have compression yet and PURE has both… but PURE also isn’t a profitable company from what I’ve heard. I wish I could find some 3PAR vs PURE dedupe data for VDI / consumer flat files because we are really on the fence with both. They are pretty much the same price but with 3PAR being a little cheaper for the same features (after negotiating).

        July 21, 2016
        • Chris said:

          I hear that. Personally, I don’t think you can go wrong–it’s a better/best decision rather than a good/bad one. If you can make sure 3PAR gives you raw capacity that matches Pure’s “effective” capacity after 5x data reduction, then you’ll win regardless of how much gain you get from compression (when it comes). Just don’t let them sell you on *expected* future performance/efficiency.

          July 21, 2016
          • Jacob said:

            Haha. The good thing about PURE is they will guarantee the dedupe RATIO or provide more SSD’s to meet the capacity planned.

            July 21, 2016
          • Chris said:

            True, true. I’m just a huge fan of simplicity (even though we really loved the 5 years we had on 3PAR). Hard (good) decision :). Probably comes down to the value of 3PAR-to-3PAR replication for your org. Otherwise, all things even, unless 3PAR will give you *more* capacity than Pure’s guarantee, I think Pure has the leg up. And I wouldn’t worry about the financials–the tech is solid so *worst case* it’d get acquired and continued, which I think is unlikely–I think they’ll simply succeed.

            July 21, 2016
          • Jacob said:

            Sorry to keep your forum going on and on lol…. If you are no longer using 3PAR what did you make a move to?

            July 21, 2016
    • Jacob said:

      Well then if you don’t mind me asking.. what is your workload and deduplication ratio you are getting on your PURE?

      July 21, 2016
      • Chris said:

        I transitioned companies in January, so I no longer have access to those figures. Last I saw, it was around 2.8-2.9 due to significant table compression in SQL databases (heavy virtualized SQL environment). Before the developers added the table compression, we were seeing around 4.2:1, which is pretty decent with so much SQL.

        July 24, 2016
