Doing It Again: How Would I POC XtremIO and Pure?

We began our hands-on exploration of all-flash arrays in September 2013, and for all intents and purposes, the testing has never really concluded. If I had known then what I know now, I would have conducted a number of tests early, during the official “Proof of Concept” (POC) phases.

All of the tests below are worth doing on the named products, as well as on other similar products that officially support the actions. Some tests target a particular product architecture; where applicable, I’ll note that. As with any storage array, the best and first test should be running real data (day-to-day workloads) atop it. The points below assume you are doing that.

1. Capacity: Fill It Up!

This test is most practically focused on Pure Storage, given its history and architecture. At the same time, the concept is worth thinking through with XtremIO.

In 2013 and before, Pure’s array dashboard showed a capacity bar graph that ran from 0% to 100%. At 80%, the array warned that space was low but didn’t indicate the significance of that threshold. The code releases up to that point put an immediate write throttle on processing once the array passed it; in short, everything but reads ground to a halt. That definition of what percentage truly means “full” was reassessed and redefined around the turn of the year to better protect the array and the user experience.

Pure’s architecture still needs a space buffer for its garbage collection (GC), which I believe is guarded by the redefinition of “full”. However, I have heard of at least one user experience where running near full caused performance issues due to GC running out of space (even with the protected buffer). If you’re testing Pure, definitely fill it up with a mix of data (especially non-dedupe-friendly data) to see how it behaves in the 80s and 90s (percent full).
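
If you want a hands-off way to generate that kind of fill, here is a minimal Python sketch, assuming a filesystem on an array-backed volume is mounted at a hypothetical /mnt/afa_test; the target percentage and chunk size are placeholders. The filesystem percentage is only a proxy, so keep an eye on the array’s own capacity gauge (especially with thin-provisioned volumes) and keep your normal workloads running while the array sits in that high-utilization range.

    import os
    import shutil

    MOUNT = "/mnt/afa_test"          # hypothetical filesystem on an array-backed volume
    TARGET_PCT = 90                  # stop once the filesystem is ~90% used
    CHUNK = 64 * 1024 * 1024         # 64 MB of fresh random data per write

    def used_pct(path):
        total, used, _free = shutil.disk_usage(path)
        return used / total * 100.0

    # os.urandom produces unique, incompressible data, so neither dedupe nor
    # compression can reclaim the space on the array side.
    with open(os.path.join(MOUNT, "filler.bin"), "ab") as filler:
        while used_pct(MOUNT) < TARGET_PCT:
            filler.write(os.urandom(CHUNK))
            filler.flush()
            os.fsync(filler.fileno())
            print("filesystem at {:.1f}% used".format(used_pct(MOUNT)))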

For XtremIO, it’s a conceptual consideration. I haven’t filled up our array, but it doesn’t do anything that requires unprotected buffer space, so the risk isn’t particularly notable (feel free to still try!). The thing here is to think about what comes next when it does get full. The product roadmap is supposed to support hot expansion, but today it requires swinging data between bricks (i.e., copying from a 1 X-Brick array to a 2 X-Brick array, from 2 X-Bricks to 4 X-Bricks, etc.).

2. Diversify & Observe: Block Sizes

Pure and XtremIO use different block sizes for deduplication and process those block sizes differently as well. Services and applications likewise use different block sizes when writing to arrays. Microsoft Exchange favors 32KB blocks, while SQL Server tends toward 64KB blocks. Meanwhile, backup applications and jobs often use blocks ranging from 256KB to 512KB. OS and miscellaneous writes stay on the smaller end, around 4KB (or less).

Since Pure takes in a bigger block and then looks for duplicate patterns of various lengths, larger blocks like those from backup jobs have the potential to raise latency. As I mentioned in the previous post, it’s simple physics: finding matching cards in 100 decks takes longer than finding them in 2 decks (take the analogy for what it’s worth). Your environment may not create any issues for a Pure array, and Pure’s arrays, code, and hardware may have moved beyond this by now, but test and verify.
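
One simple way to “test and verify” is to time writes at the block sizes your applications actually use. Below is a rough host-side probe, again just a sketch, assuming a filesystem on the array under test is mounted at a hypothetical /mnt/afa_test; it measures write-plus-fsync latency through the filesystem, which is cruder than IOmeter or the array’s own counters, but enough to see whether latency trends upward with block size. Run it during a quiet window and again during a backup window and compare.

    import os
    import statistics
    import time

    TEST_FILE = "/mnt/afa_test/latency_probe.bin"    # hypothetical mount on the array under test
    BLOCK_SIZES = [4 * 1024, 32 * 1024, 64 * 1024, 256 * 1024, 512 * 1024]
    ITERATIONS = 200

    def probe(block_size):
        payload = os.urandom(block_size)             # incompressible payload
        latencies_ms = []
        fd = os.open(TEST_FILE, os.O_WRONLY | os.O_CREAT)
        try:
            for _ in range(ITERATIONS):
                start = time.perf_counter()
                os.write(fd, payload)
                os.fsync(fd)                         # push the write down to the array
                latencies_ms.append((time.perf_counter() - start) * 1000.0)
        finally:
            os.close(fd)
        return statistics.mean(latencies_ms), max(latencies_ms)

    for bs in BLOCK_SIZES:
        avg_ms, max_ms = probe(bs)
        print("{:>4} KB blocks: avg {:.2f} ms, max {:.2f} ms".format(bs // 1024, avg_ms, max_ms))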

XtremIO uses a fixed block size, so bigger blocks don’t affect how its deduplication processes data. Everything is chopped down to 4KB (pre-3.0) or 8KB (3.0+) blocks. The thing to observe here is how deduplication and compression work. With the same data on both arrays (Pure & XtremIO), which provides the better data reduction? What are the trade-offs, if any, for that advantage?
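
To make that comparison apples-to-apples, it helps to write an identical, controlled mix of duplicate and unique data to a volume on each array and then compare the data reduction each GUI reports. Here is a minimal sketch; the mount point, data volume, and duplicate ratio are all illustrative assumptions.

    import os
    import random

    MOUNT = "/mnt/afa_test"           # hypothetical mount; use one volume per array under test
    BLOCK = 8 * 1024                  # matches the XtremIO 3.0+ block size noted above
    TOTAL_BLOCKS = 1000000            # roughly 8 GB of logical data
    DUPLICATE_RATIO = 0.5             # fraction of blocks that repeat a known pattern

    pattern = os.urandom(BLOCK)       # one reusable, dedupe-friendly block
    with open(os.path.join(MOUNT, "reduction_probe.bin"), "wb") as f:
        for _ in range(TOTAL_BLOCKS):
            if random.random() < DUPLICATE_RATIO:
                f.write(pattern)              # dedupable
            else:
                f.write(os.urandom(BLOCK))    # unique and incompressible

Rerun it with different duplicate ratios, or better yet with copies of real datasets, and note how each array’s reported dedupe and compression numbers move and what, if anything, the reduction costs in latency.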

3. Patch & Reboot: High Availability

My experiences with array software updates have almost always involved the words “non-disruptive”. In fact, since 2006 and our first EMC CLARiiON CX300, I can’t recall an update that required downtime. Sure, vendors recommended downtime, and things were slower during updates due to write-cache disabling, but one storage controller/processor was always online and serving data. Furthermore, in the storage array realm, “high availability” is pretty much a given. As the saying goes, though: trust but verify.

When you get your POC arrays, I’d recommend making sure that you can go through a software update during your evaluation. If the vendor doesn’t have one releasing during your POC, ask to have the POC unit loaded with the previous minor revision of the code/software. Then, with your data fully loaded on it, schedule a time to perform that Non-Disruptive Update (NDU). This also provides the benefit of testing out the technical support experience with Pure and EMC Support (or any other vendor).

Pure probably has an equivalent to this command, but you can also perform additional failover testing of XtremIO arrays by logging into the XMS CLI and running the following commands to see how an HA event is handled (a scripted sketch of the same sequence follows the list):

  • Open two SSH sessions to the XMS
  • In one session, run the following command. It repeats every 15 seconds. Open the XMS GUI to see more real-time data at the array level.
    show-targets-performance frequency=15
  • Observe/verify that traffic is flowing evenly across all initiators
  • In the second session, run the following command. Note that this will take a controller out of service (and may affect performance or availability).
    deactivate-storage-controller sc-id=2
  • Watch the first SSH session and the GUI for the effects of the failover (I recommend waiting at least five minutes before reactivating)
  • In the second session, run the following command to reactivate the controller:
    activate-storage-controller sc-id=2
  • Observe/verify that traffic returns to an even flow across all initiators
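
For repeated, hands-off runs, the same sequence can be scripted. The sketch below uses the paramiko SSH library to open the two sessions and issue the commands listed above; the XMS hostname, credentials, wait time, and sc-id are placeholders, and it assumes the XMS login accepts commands over an SSH exec channel. Only point this at a POC array, since deactivating a storage controller is disruptive by design.

    import time
    import paramiko

    XMS_HOST = "xms.example.local"       # hypothetical XMS address
    USERNAME, PASSWORD = "admin", "changeme"
    SC_ID = 2                            # match a storage controller in your cluster
    DEGRADED_MINUTES = 5                 # how long to run on one controller

    def xms_session():
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(XMS_HOST, username=USERNAME, password=PASSWORD)
        return client

    watcher = xms_session()              # session 1: streaming performance output
    actor = xms_session()                # session 2: drives the failover

    # Same command as above; a sample arrives roughly every 15 seconds.
    _, perf_out, _ = watcher.exec_command("show-targets-performance frequency=15")

    def run(cmd):
        print(">>> " + cmd)
        _, out, err = actor.exec_command(cmd)
        print(out.read().decode(), err.read().decode())

    run("deactivate-storage-controller sc-id={}".format(SC_ID))

    # Echo the streaming samples while the array runs degraded, then reactivate.
    deadline = time.time() + DEGRADED_MINUTES * 60
    while time.time() < deadline:
        print(perf_out.readline().rstrip())

    run("activate-storage-controller sc-id={}".format(SC_ID))
    watcher.close()
    actor.close()

Keep the XMS GUI open alongside it, as in the manual steps, to watch how evenly traffic rebalances across initiators before, during, and after the failover.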

If real-world data on your array doesn’t generate at least 10,000 to 20,000 IOPS, I recommend running IOmeter on a few array-connected servers to create additional load. Four VMs/servers running IOmeter with the following characteristics provided roughly 34,000 IOPS in my experiments.

  • Fully random I/O
  • Two disks checked per VM (in different datastores; mostly just to see how IOPS patterns affected different volumes)
  • Four outstanding I/Os
  • Access Specification on VM 1: All-In-One
  • Access Specification on VM 2: All-In-One
  • Access Specification on VM 3: 4K / 25% Read (OS simulation, heavy writes)
  • Access Specification on VM 4: 64K / 50% Read (SQL simulation)

4. Other Stuff: It Depends

This last part entirely depends on your environment and how you intend to use a new all-flash array. If you are fully virtualized like we are, look at the best practices, recommendations, and supported features. Compare your backup solution and architecture against what each array supports. Do you need things like transportable snapshots for Veeam Backup & Replication, for example? If you use snapshots, how do you create, export, and delete them? Make sure any APIs that you use (or want to use) are supported.

At the end of the day, every environment and every use case is different. Relationships also matter, so your account team and VAR may sway your feelings toward, or away from, a given product. If all of the above tests go smoothly, smaller things like the UI and the implementation process may make or break the decision. Or, if you find chinks in both products’ armor, support may be the deciding vote.

Either way, near the end of your evaluation, take some time to step back and write down the results and the pros and cons of both (or all) of the products tested. Chances are you’ll find what matters to your organization on the page when you do.

Appendix 1: Latency, Block Size & Bandwidth

I’m adding the screenshot below from our POC to provide an example of #2 above, and to give Pure a chance to help interpret it for customers who might see similar behavior. Thanks for the comment, Mike!

[Screenshot: Pure_queue_depth_issue]

7 Comments

  1. Mike from Pure here. Good write-up – we definitely like to see customers looking to come up with their own tests to compare products. Even better when customers run real workloads and not synthetic benchmarks on the arrays :)

    On the subject of larger blocksizes increasing latency, it’s not due to Purity’s need to analyze a larger block for dedupes, it’s more to do with when the “latency clock” starts ticking. Purity starts this clock ticking for writes as soon as it receives a write request – before it allocates an internal buffer and before the buffer is filled from the customer network. What this means is that data transfer time over the network is included in this latency. Simple physics dictates that larger blocksizes will have higher latencies. Besides, large block operations should be measured by bandwidth or elapsed time to complete the operation – latency is pretty meaningless when doing large block I/O.

    Regarding the HA testing, no, Pure does not have an equivalent controller deactivation command. Instead, we recommend you just yank the power cords on a controller – what better way to test an HA failover than pulling power? Even during code upgrades, we don’t issue an orderly “deactivate” command specific to Purity – we just reboot the controller at the OS level which includes a termination of all active processes. We don’t believe there should be an orderly shutdown command, nor do we believe that is a valid test for HA failover – it allows vendors to invoke processes under their control to ensure an orderly failover, which is very different than failure scenarios.

    You can read more about our approach here: http://www.purestorage.com/blog/why-theres-no-power-button-on-the-pure-storage-flasharray/

    October 17, 2014
  2. Chris said:

    Hey, Mike. Thanks for the thorough reply. I’d be interested to read how the “latency clock” and large block I/O principles apply to the screenshot in the appendix. Is Pure’s latency metric something that just needs redefinition or normalizing (if indeed it doesn’t have a practical impact on performance)? I ask because XtremIO doesn’t show a similar spike during the same backup window of large block I/O.

    Good point on the HA testing. I agree that the best real-world test is to pull the plug on a controller. A programmatic method just provides a hands-off option for repeated testing (versus walking back and forth for multiple tests or standing at a console in a cold, loud server room :). Just to make sure I’m suggesting something safe and supported, can you confirm that Purity won’t have an issue if customers test it several times in a row with power pulls (assuming they allow for full boot-up after each iteration)? I’ve seen great evidence of Pure’s resilience on the Twittersphere, so I expect it can handle this, too.

    October 17, 2014
    • Chris:

      I suspect what you saw is a confluence of a few things, coupled with our transparency in reporting the metrics. First, under a heavy burst of I/O activity, we do not tell hosts to throttle back the amount of I/O they send like some legacy arrays do, but instead queue the I/Os internally (you can see that in your screenshot – the queue depth), and we also include that queuing time in our latency. With large block I/O, smaller queue depths are all that’s needed to drive maximum performance (see Little’s Law: outstanding I/Os = IOPS × latency). If you want to manage your host queue depths to ensure low latency (no queuing) while delivering the same end-to-end performance (elapsed time for the task), it is possible to do so – if your goal is just to reduce the latency we report.

      Also, I didn’t necessarily state that latency doesn’t have an impact on performance all the time – it’s just that large block I/O is typically associated with jobs or tasks that are longer running than OLTP transactions, and in those cases latency isn’t the metric you should be concerned about – it is elapsed time to complete the job. What I can also say is that running these tests with Purity 3.2.1 will likely show very different behavior than with current levels of Purity, as a result of the continual optimizations we do in our code.

      Understanding how we calculate our metrics is important to interpreting the results (what this discussion is all about) to determine whether there is actually an issue that needs attention or not. At the end of the day, what matters is whether or not you saw benefit for your workload on Pure, and how much. I can’t speak to how our competitors calculate the metrics they display, to know when their latency clock starts ticking or whether they are transparent with any queuing on their array and the latency associated with it. What matters is the overall elapsed time the task takes, or perhaps bandwidth consumption if you want to report on an array-level metric.

      Regarding the HA testing, yes, please repeat the tests as many times as you like – I can’t see why a vendor would suggest it’s ok to do it once but not more than once. After your failed controller is back up and running (servicing front-end I/O), feel free to yank the power cords again. While you’re at it, pull two drives and maybe a SAS cable or two for good measure. We’ve had customers cut power to the entire array just to see how quickly we resume I/O (usually just a few minutes) and validate there is no data loss (there will be none).

      Hope this helps!

      Mike

      October 18, 2014
  3. Stefano said:

    Pure employee here as well.

    I’d add:

    don’t trust vendor measurements if possible, meaning specifically a few things:

    – Compression factor. Fill up the box and then measure “stored data / consumed space”; that is the compression factor. Having worked at DD in the early days, I know of more than one vendor that artificially adjusted what was displayed in the GUI as the compression factor.

    – Performance. Want to avoid latency, IOPS, and bandwidth being measured in different fancy ways? Measure from the host and don’t trust the GUI. As per the previous point, latency should be measured from the moment the request is issued until it is serviced. Host side, you get an end-to-end (round-trip) measurement. Unless you are in a direct-attach scenario, the storage has no easy way to measure the latency out to the host hop.

    – Synthetic benchmarks. You wouldn’t believe how many vendors include code in their firmware to cheat benchmark tools, and not just in the storage industry. Run real application workloads. If you have the chance to do a production test, that would be best.

    October 17, 2014
  4. Monardo said:

    I tested both (we bought both …) and XtremIO wins! It wins on performance (SQL, VDI, VMware), but, for example, Pure Storage is better at dedupe/compression, and it doesn’t need anything extra to replicate data; with XtremIO you need RecoverPoint. Bye

    Monardo G.A.

    May 4, 2015
    • Chris said:

      Glad to hear it, Monardo! I think you boiled down the technical strong points well. Great verbal dedupe ;).

      May 4, 2015
