
FAST and Furious

Whilst HDS and EMC throw rocks at each other over whether it is better to build custom parts or to take things off the shelf, going custom only when you require it (I expect the other Barry to sit on his hands, but there are good reasons why the SVC team decided to build out of commodity parts, and I suspect they are very similar reasons to EMC's), I think we should look beyond the hardware at what is coming down the line to us.

The most important thing on the roadmap is FAST, Fully Automated Storage Tiering. FAST changes things; it takes a whole bunch of ideas from a whole bunch of places and runs with them. If you are another vendor and you feel aggrieved that EMC have stolen your idea, take heart; it won't be the first time in history that this has happened and it won't be the last.

The foundation is Wide-Striping*, using a model which splits your data into chunk(let)s and spreads it across spindles. Once these chunks are distributed, you can monitor the characteristics of the I/O at an individual chunk level; this allows tiering at a sub-LUN level. A hot chunk of data can be moved to a higher tier and a cooler chunk of data down into a lower tier.
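
To make the mechanism concrete, here is a minimal sketch in Python (all names and numbers are invented for illustration; this is not EMC's actual implementation) of carving a LUN into chunklets, spreading them round-robin across spindles, and keeping a per-chunk I/O counter:

```python
from collections import defaultdict

CHUNK_MB = 768  # hypothetical chunklet size; real arrays vary

class WideStripedLun:
    """Carve a LUN into fixed-size chunks, spread them round-robin
    across the available spindles, and count I/Os per chunk."""

    def __init__(self, lun_gb, spindles):
        n_chunks = (lun_gb * 1024) // CHUNK_MB
        self.chunks = [
            {"id": i, "spindle": spindles[i % len(spindles)], "tier": "fc"}
            for i in range(n_chunks)
        ]
        self.io_count = defaultdict(int)  # chunk id -> I/Os this interval

    def record_io(self, offset_mb):
        """Account an I/O against the chunk that owns this LUN offset."""
        self.io_count[offset_mb // CHUNK_MB] += 1

# e.g. lun = WideStripedLun(lun_gb=100, spindles=["disk%d" % i for i in range(32)])
```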

In the past we have been limited to moving a whole LUN (with the exception of Compellent); this has always been a time-consuming job, identifying what needs to move and then moving it. Yes, technologies have come along to make this easier, but to sweat the asset, and especially to make best use of SSDs, we needed to move individual 'blocks': in a given file-system, it is possible that only some blocks are hot and frequently accessed. Traditionally, if you could, you would hold these in cache, but if SSDs are expensive, cache is yet more so. This approach will allow some cache to be replaced by SSDs, and for some cache-unfriendly workloads, to all intents and purposes, you have massively increased the amount of cache available. You might not want to hold a terabyte or so of real cache for that evil 100% random-read app, but with SSDs this becomes viable, and not at a huge utilisation hit.
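
And the promotion/demotion pass itself might look something like this sketch (thresholds and tier names are invented; any resemblance to EMC's actual algorithm is accidental). It walks the chunks, bubbles hot ones up a tier and sinks cold ones down one:

```python
TIERS = ["ssd", "fc", "sata"]      # fastest to slowest (illustrative)
HOT_IOS, COLD_IOS = 1000, 10       # per-interval thresholds (invented)

def retier(chunks, io_count):
    """One tiering pass over [{'id': .., 'tier': ..}, ...]:
    promote hot chunks one tier up, demote cold ones one tier down."""
    for chunk in chunks:
        ios = io_count.get(chunk["id"], 0)
        level = TIERS.index(chunk["tier"])
        if ios >= HOT_IOS and level > 0:
            chunk["tier"] = TIERS[level - 1]    # e.g. fc -> ssd
        elif ios <= COLD_IOS and level < len(TIERS) - 1:
            chunk["tier"] = TIERS[level + 1]    # e.g. fc -> sata
    io_count.clear()                            # start a fresh interval
```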

But there are going to be issues with the FAST approach; firstly, where do you put a new workload? If you simply assign it some disk and let the array decide, what the hell is it going to do with the workload? It could put it on the slowest tier possible and then migrate up; it could stick it on the fastest tier and migrate down. Both of these approaches have significant risk, so I suspect we are going to have to give the array some clues and we are going to have to understand more about the whole system we are putting in. The difference in performance between the top tier and the bottom tier is going to be large.
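
One plausible answer, sketched below with invented names: default new LUNs to a middle tier and let an admin-supplied hint override it, so the array starts with a clue rather than a guess in either extreme direction:

```python
def initial_tier(hint=None):
    """Pick a starting tier for a brand-new workload.

    With no hint, start in the middle: being wrong by one tier in
    either direction is cheaper than guessing an extreme."""
    valid = {"ssd", "fc", "sata"}
    if hint in valid:
        return hint    # the admin knows something the array cannot yet
    return "fc"        # middle tier; FAST migrates up or down from here
```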

No longer will the Storage Admin be a LUN Monkey; they are going to need to really understand their workloads and the applications. They are going to need to learn to talk to the application developers and understand workloads; they are also going to have to understand business cycles.

For example, applications which spend eleven months of the year pretty much idle may suddenly, at year end, need a lot of performance. What happens if all your applications demand stellar performance once a year? Perhaps you need a way of warning the array that it needs to prefetch a load of data. Think of a badly written end-of-year reporting extract which generates thousands of random read IOPs, or a badly written piece of user-generated SQL; in the past, these just crippled the application, but with FAST, they could cripple the whole array as it tries to react.
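
That "warning the array" might look like nothing more than a calendar rule: before a known busy period, pre-promote an application's chunks regardless of their recent temperature. Again, a purely illustrative sketch with made-up names and dates:

```python
from datetime import date

# Invented schedule: app name -> (first_day, last_day) of its hot season;
# the window opens early enough to give the array time to promote.
BUSY_SEASONS = {
    "year_end_reporting": (date(2009, 12, 20), date(2010, 1, 15)),
}

def prewarm_due(app, today=None):
    """True while this app's busy window is open, meaning its chunks
    should be promoted ahead of demand rather than reactively."""
    today = today or date.today()
    start, end = BUSY_SEASONS.get(app, (None, None))
    return start is not None and start <= today <= end
```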

The FAST approach is potentially the thin-provisioning of IOPs. This is going to need a lot of thinking about. Potentially you will have to domain your storage to protect applications from the impact of one another. We are going to need to know more about the whole system than we have before if we are to truly benefit from FAST.
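
Domaining might amount to little more than per-application quotas on the fast tier, so one greedy workload cannot thin-provision everyone else's IOPs away. A hypothetical sketch:

```python
# Invented quotas: each application's ceiling on SSD-tier chunks.
SSD_QUOTA = {"oltp_db": 4000, "reporting": 1000, "fileshare": 200}

def may_promote(app, ssd_chunks_in_use):
    """Allow a promotion to SSD only while the app is under its quota,
    protecting other applications from a runaway neighbour.
    Apps without a quota get no SSD at all (an assumption of this sketch)."""
    return ssd_chunks_in_use.get(app, 0) < SSD_QUOTA.get(app, 0)
```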

We will need to build rules which suit our applications; sure, V-MAX will come with its own canned rules for things like VMware and known applications. Indeed, EMC will probably be leveraging all the performance data that they have been gathering over the years to help us write the rules. Storage Templates as described by Steve Todd here are just the start.
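
Such canned rules might be little more than named policy templates along these lines (template names and numbers are entirely invented, not anything EMC has published):

```python
# Hypothetical canned policies a vendor might ship alongside FAST.
TEMPLATES = {
    "vmware_datastore": {"start_tier": "fc",   "ssd_ceiling_pct": 20, "demote_after_days": 7},
    "oltp_database":    {"start_tier": "ssd",  "ssd_ceiling_pct": 60, "demote_after_days": 2},
    "archive":          {"start_tier": "sata", "ssd_ceiling_pct": 0,  "demote_after_days": 0},
}

def policy_for(workload, overrides=None):
    """Start from a canned template and layer on site-specific tweaks;
    unknown workloads fall back to the generic datastore template."""
    policy = dict(TEMPLATES.get(workload, TEMPLATES["vmware_datastore"]))
    policy.update(overrides or {})
    return policy
```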

So although at one level the Storage Admin's job could get a lot easier, the Storage Manager's job has got a whole lot harder. Yes Barry, I asked for FAST and now you've given it to us; now we'll have to work out what this all means!

I have some really 'interesting' ideas as to where EMC could take V-MAX but they'll have to wait for another time as I'm still supposed to be on leave from Enterprise IT.

* It's Wide Striping not Wide StripPing as I keep seeing written; Wide Stripping is what happens on a Rugby Tour after a good night out!


9 Comments

  1. Barry Whyte says:

    I guess EMC will need to amend their “how to sell against SVC” pitch and remove the slanderous comments about “cheap Intel commodity hardware” – lol
    As you say, automated tiered sub-LUN movement is tricky. Compellent really is an 'archive' solution from what I understand (i.e. the SATA grows while your enterprise disks stay constant in capacity), but this assumes that data becomes archive over time and that you don't need to accelerate it back up once a month, or once a year as you suggest.
    It'll be interesting to see what starts appearing, and what IP gets used; I've been keeping our patent attorneys very busy for the last few years in this area. My first patent for the actual concept of 'chunked' automation in a virtual storage environment was filed in 2003.

  2. Martin,
    I think that you are spot on with your discussion about sub-LUN tiering. The trouble is that, if I understood what I heard in the announcement, the first iteration of FAST will be at the LUN level, not the sub-LUN level. Sub-LUN is down the road, and so is the Fully Automated part of FAST. Again, if I understood correctly, that part isn't due out until later this year, or did I miss something?
    –joerg

  3. Martin G says:

    Sub-LUN is down the road but it is one of the most exciting developments we've had for some time. It could have some interesting implications going forward; if the array can move the data around and make sure it ends up in the right place for its access profile, there should be little need for the creation of hundreds of LUNs. Just a few big ones, and let the array do its thing.
    And I can see why EMC are not virtualising third-party disk if they want to do sub-LUN storage tiering. To make sub-LUN tiering work effectively, you are probably going to need to know a lot more about the disks and the disk controllers than you would with traditional storage virtualisation. That, and EMC have decided that is not the road for them.
    What does make me laugh is one of their arguments against storage virtualisation: who would you blame when you get problems? Can they not remember that IBM used a very similar argument against them when they first moved into the storage arena?

  4. John F says:

    oh, I thought "chunked" was what happens when you quit smoking ;-). On a more serious side, I did notice that there's still a bin file in V-max. How far can autotiering go within the constraints of the legacy feature set? We are in for some interesting times as this plays out in the market.
    I see EMC is still latching on to the narrow definition of "virtualized" storage. I think every EMC competitor has had a poke at that, so I'll refrain for now. Chris Evans had a blog about wasted space and where it goes. David over at HDS had an out-of-scale graphic lining up the waste points (wow, defective typing or Freudian slip: almost let pint slip by instead of point) with technologies to combat them. The industry as a whole is slowly coming to terms with storage efficiency; that's a good thing. NetApp even has a blog dedicated to it (http://blogs.netapp.com/efficiency/). Any time the industry collectively starts seriously thinking about the customer's problems, and how to solve them, and putting resources behind that, and making guarantees, well, that can only be a good thing for the customer…
    John

  5. Drunken Pole says:

    Small correction, tiny-tiny.
    IOPS, not IOPs. Could be IOps, but definitely IOPs is ugly.
    (I see it used in lots of places and was just cranky enough to react)
    (just as you were 😉)

  6. Bernd says:

    Funny that NetApp has a blog regarding efficiency, taking into account how much capacity they need to waste because of the ONTAP overhead and limitations (NetApp calls them recommendations) within their systems. Looking at the block size HDS uses for thin provisioning, it's impressive to be able to find positive statements here, too. Overall, there's no such thing as a perfect world and I'm quite happy with what EMC has announced, because in the end it shows willingness to research and develop for innovation. In addition, we can all be quite sure that this will not be the end point, right? Even though adopting change is not an easy thing for a lot of people. …just my 5 cents…

  7. Michael Hay says:

    Martin, FAST isn't novel. We've been there and done that. Actually, when you talk to Storage Managers, they frankly like the idea of a fully automated environment; however, most say that in practice they will not deploy it. What is desirable are vast usability improvements that make migrations, provisioning, troubleshooting, etc. easier. The Hitachi Storage Command Suite has already been delivering improved usability for Storage Managers for a number of years. Some of my favorite yarns about our tools come from EMC personnel who state that our products must be broken because they are too easy to use. Thanks for the compliment, EMC!
    Compellent is the first vendor AFAIK that does it at the sub-LUN level. Hitachi has been pushing the sub-LUN agenda since our release of HDP on the USP-V.
    Finally as to the point of commodity versus custom built, you guys are totally missing the point. Since the concept is a little hard to grok I’m going to dedicate a couple of pieces on the topic soon. I’d appreciate a solid debate and discourse on the topic.

  8. Martin G says:

    As a storage manager myself, I know why storage managers will not deploy fully automated environments; we all suffer from the delusion that we can do better than a machine. We can’t! Storage provisioning is a task which is better suited to automation and as environments get ever larger, it will become even more so.
    As a storage manager in the field, I could share some stories that others have shared with me about any one of the various storage management tools which you all ship; it would be amusing but ultimately it would come down to the fact that you are all pretty damn poor. In fact, at times I wonder if there has been an unholy alliance between vendor and support techies trying to maintain complexity in the world of storage management!
    BTW Michael, Martin G is Storagebod and you will note that I did mention that Compellent did the sub-LUN technology for open-systems first. FAST is important because it does validate that technology and it is incredibly important if you want to efficiently utilise SSDs. There are various approaches to achieving sub-LUN optimisation but doing it in the array is certainly a valid approach.
    Custom vs commodity is always an interesting debate and one which has to acknowledge that there are many trade-offs on both sides.

  9. Michael Hay says:

    Martin, you are getting to the crux of the matter I was intending: there is an honest debate to be had on commodity versus custom. The decision on which approach to take often boils down to cost-benefit analysis. If a custom approach has best-in-class differentiation and can continue to garner strong returns, then keep it. If the competitive commodity technologies are better than the custom approach, then take them. Also, I'll out myself: I used to work for IBM, and a good friend of mine was one of the two founding members of the SP2 support group at Support Line near Grapevine, Texas; I also worked for an IBM reseller.
