
So No DMX5

I'm on leave this week, so I'm not going to post a whole deal on this, but I thought that this announcement was important enough to sit through the video presentation just to see what EMC were going to publicly announce.

So there it is revealed, there is no new DMX! There you go! I told you that there wouldn't be!

I've known a little about the V-MAX (that's a cool name, pity it sounds like a console) for nearly a year now; I responded to a post on Chuck's blog with a list of things I wanted from an array; Chuck didn't put the post up and just said that they were working on making my desires a reality.

I really wanted Automated Storage Tiering to simplify storage and it looks like EMC have delivered it. The devil will be in the details, but this is the start of something which will have industry-wide implications. Automation on the scale that I believe will eventually happen will change the role of the storage management team forever.

I've spoken before about the ability to federate arrays; I actually look at much of what has been done by both IBM and HDS as moving towards this. EMC have delivered the ability to federate their arrays; pity it can't do third party but hey…this is EMC we're talking about! Technology refreshes and data-migrations will become a breeze!

Building on top of industry-standard hardware is just plain common sense. Going down the Intel route makes sense as part of cost reduction and common components.

Management tools will need to change as part of all this; I'm expecting a refresh of the management suite at some point. Who knows, we might get the long-promised re-write of Control Center; I'm not holding my breath though! I might go as blue as the flashy lights on the front of the new V-MAX.

I don't know a whole lot about the details but I know that there have been arrays out in the field for a bit. I don't have one, before you ask! And of course, I can't tell you who else I have had very similar conversations with about what they might be doing. Life is going to get interesting.

I am frankly very impressed that EMC have taken such a leap. Congratulations one and all!

Oh yes, I forgot to ask….how much is this all going to cost us?

And yes, I don't for one moment believe that the V-MAX won't replace the existing DMX4; unless, of course, we are going to see a DMX5 in the future? EMC have surprised us once, but a DMX5…not a chance! Actually, I think questions have to be asked about the future of all of the other EMC disk lines; just think about it!

Damn, this is a longer post than I intended! Oh well, back to the DIY!

Questioning the Weatherman…

I seem to be doing a lot of thinking about clouds, dynamic data centres and what it all means. I do believe that the architectures of the future will become increasingly dynamic and virtualised. I was playing with EC2 and AWS at the weekend and I can see a time when I won't bother with the ridiculous amount of hardware that I have at home for playing with virtual appliances and 'stuff'* And I can see that it makes an increasing amount of sense for a lot of the things we do at work but…I have some questions/thoughts about storage in the public cloud and, to a certain extent, the private cloud.

1) All the pricing is per gig; this is a very simplistic model. I know that people will argue that you wouldn't put your highest-performing apps in the cloud, but you do need some kind of performance guarantees. Anyone want to benchmark Amazon's storage cloud; an SPC for the Cloud?

2) Replication between private-public clouds; public-public clouds, i.e. between cloud providers. Or is this simply done at an application level? Has anyone tried using database replication between applications running in different clouds?

3) Related to the above, redundancy in the cloud? We provision network links from diverse suppliers to try to protect ourselves from a catastrophic outage taking out an entire supplier; do you do the same in the cloud, or is it enough to have DR between different clouds from the same supplier?

4) Dedupe in the cloud? Can you dedupe cloud storage? Have people considered writing dedupe appliances to run in the cloud? For example, would Ocarina run as a virtual appliance in the cloud?

5) Backup in the cloud? How do we back our cloud storage up when running in a public cloud? Would you back-up to a different cloud?

6) A virtual array? Before you think I'm mad, it might be interesting to be able to pre-purchase a storage pool which can be allocated to virtual servers. This storage pool could be thin-provisioned, over-committed etc. as per traditional thin provisioning.
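The thin-provisioned, over-committed pool in item 6 can be sketched as a toy model; the class, method names and capacity figures below are purely illustrative assumptions, not any vendor's implementation:

```python
class ThinPool:
    """Toy model of a thin-provisioned, over-committed storage pool.

    Volumes are promised ("provisioned") capacity up front, but physical
    blocks are only consumed as data is actually written.
    """

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.provisioned_gb = 0   # sum of promised volume sizes
        self.written_gb = 0       # physically consumed capacity

    def provision_volume(self, size_gb):
        # Thin provisioning: promising capacity costs nothing physical yet.
        self.provisioned_gb += size_gb

    def write(self, gb):
        # Physical blocks are consumed only on first write.
        if self.written_gb + gb > self.physical_gb:
            raise RuntimeError("pool exhausted: time to buy more disk!")
        self.written_gb += gb

    @property
    def overcommit_ratio(self):
        return self.provisioned_gb / self.physical_gb


pool = ThinPool(physical_gb=1000)
for _ in range(5):
    pool.provision_volume(500)   # promise 2500 GB against 1000 GB physical
pool.write(300)

print(pool.overcommit_ratio)  # 2.5
print(pool.written_gb)        # 300
```

The interesting operational question, in a cloud as much as in the data centre, is who watches `written_gb` creep towards `physical_gb` and who pays when it gets there.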

Just my thoughts, any answers? Any questions of your own?

*This is a blatant lie, I have ridiculous amounts of hardware because I enjoy fiddling and hacking about with it. Pretending it is for research is just an excuse I give myself, my wife is aware of the real truth but she humours me!

Amazon – The World’s Bookshop and IT Supplier?

How did an online bookseller become potentially the most important IT supplier in the world?

Were their employees not simply selling books but also devouring them to solve their own internal problems? And without Amazon beginning to scare the beejesus out of the traditional IT suppliers, would we have cloud? 

People talk about UCS being Cisco's reaction to HP stepping over the line in the sand. And this may be true! But it is Amazon who people should be scared of; a company which understands global markets, logistics on a massive scale, flexibility and agility. These are all core to a company which has become synonymous with online retail. An IT infrastructure which has proved itself to scale very quickly and allows small start-ups to run Enterprise-class infrastructure without huge capital outlay.

With the exception of the EC2 cloud, Amazon are offering interesting and thought provoking services which could change the way that applications are developed. But I'm not especially interested in these as an infrastructure bod in the here and now.

The EC2 cloud is more interesting, not because it is especially exciting or innovative; it's not really, but it is more applicable to the here and now. I had initially dismissed it as a mere so-what hosting exercise, and to a limited extent it is, but it might well turn out to be as important as VMware and it is this which I suspect is beginning to frighten people.

Let's, for example, take an Enterprise which is running all of its core systems on Solaris; with the current travails of Sun, merger rumours and a general push to reduce costs, a decision might be made to replatform the core applications to run on Linux.

But do I really want to install a large Linux infrastructure initially? That'll take time and more importantly money; I might not have enough space in my data-centre, I might not yet have decided on my corporate Linux infrastructure standards. As the head of development, I probably don't want to wait for my Infrastructure teams to get their acts together; I want something *now* and I want it quickly.

Step up the Amazon EC2 cloud; quickly and easily, without jumping through the hoops that the Infrastructure teams want me to jump through, and probably without a huge amount of IT process such as change management, I can have a scalable and pretty robust Linux infrastructure.

Yes, I could have done it with other companies but Amazon is the company which appears to have the vision and strategy to take this forward. External clouds buy me the one thing that I cannot really buy, time.

So I start hosting my development environments in the Cloud and, because my internal IT infrastructure teams are too slow, I put them in the external Cloud. Sure, for the time being, I'll probably keep my production workloads in-house but, in the same way that production workloads started to move to VMware, how long before I'm experimenting with production workloads?

And that is without Amazon's more interesting services which could really change things. It makes you wonder about some of the other non-traditional IT companies; what have they got hidden away? There's some interesting stuff being done by some of the games companies, for instance, which may well have wider application.

Clouds are fuzzy things and boundaries will become blurred. May we live in interesting times.

Economic Realities

I found David Merrill's blog entry on Squeezing (Easily) into Tight Jeans amusing. David is talking about a couple of his customers who were using various capabilities to reduce the amount of storage they needed; I suspect using techniques such as thin provisioning and the ability of the USP-V to consolidate islands of storage into a usable pool of storage.

And then they were going to decommission a whole bunch of arrays and reduce the amount of storage on the floor. I think David was surprised that they were choosing to decommission the storage as opposed to simply use the reclaimed storage for growth.

But sitting on this side of the fence, the customer side, this is no big surprise at all. Depending on the age of the arrays, the software sitting on them and especially whether they are out of their warranty periods, the maintenance costs are generally so high that it simply does not make economic sense to keep them around.

Software maintenance on all of the Enterprise-class arrays is just plain expensive. If you are trying to sweat an asset for a couple of extra years, that is another couple of years of what are often power- and space-inefficient arrays, and you are going to be looking at another migration effort in fairly short order anyway; it does not make a huge amount of sense.

The situation is actually a lot less clear on mid-range arrays as the maintenance costs are often considerably lower but if you have got aging Enterprise arrays; get them out if you can.
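The arithmetic above can be made concrete with a toy comparison; every figure below is an invented, illustrative assumption, not a real quote from any vendor. The point is simply that sweating the old array means paying heavy maintenance and power for the extra years and still facing the migration at the end:

```python
def keep_vs_replace(maint_per_year, power_per_year, years,
                    new_array_cost, new_maint_per_year, new_power_per_year,
                    migration_cost):
    """Toy comparison: sweat an out-of-warranty array for a few more years
    versus replacing it now. All figures are illustrative assumptions."""
    # Keeping the old array: high maintenance and power for each extra year,
    # and the migration is only deferred, not avoided.
    keep = years * (maint_per_year + power_per_year) + migration_cost
    # Replacing now: capital outlay plus migration, but cheap in-warranty
    # maintenance and lower power from then on.
    replace = (new_array_cost + migration_cost
               + years * (new_maint_per_year + new_power_per_year))
    return keep, replace

keep, replace = keep_vs_replace(
    maint_per_year=250_000, power_per_year=80_000, years=2,
    new_array_cost=400_000, new_maint_per_year=40_000,
    new_power_per_year=30_000, migration_cost=100_000)
print(keep, replace)  # 760000 640000
```

With numbers anywhere near these, getting the aging Enterprise array out wins; plug in mid-range maintenance figures and, as noted above, the answer is a lot less clear.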

Just another feature…

Despite the inevitable EMC spin, I found myself nodding in agreement with this blog entry from Barry Burke. Wide-striping is now just another feature; it’s a very important feature but just another feature now.

3Par took wide striping and made it usable; EMC’s historic implementation using metas and hypers was painful, and with the large arrays of today it becomes a full-time job to performance-manage an array. 3Par made it easy and much kudos to them for doing so. I think 3Par’s legacy will be the ease of management that they have brought to the Enterprise array (and thin provisioning).

I think it is worth pointing out to Barry, that you can simply use Wide-striping without Thin-provisioning with a 3Par box as well. LUNs do not need to be thin-provisioned and can be entirely pre-allocated.

Automated wide-striping simply makes the storage admin’s job easier; it de-skills it somewhat and hopefully it will bring an end to the endless poring over of spreadsheets trying to balance workloads.
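As a rough illustration of what automated wide-striping buys you, here is a minimal sketch; the round-robin chunk placement is a deliberate simplification for illustration, not 3Par's actual chunklet algorithm:

```python
def wide_stripe(lun_chunks, spindles):
    """Distribute a LUN's chunks round-robin across *all* spindles,
    so every volume's I/O is spread over every disk in the array
    rather than hot-spotting a hand-picked RAID group."""
    placement = {disk: [] for disk in range(spindles)}
    for chunk in range(lun_chunks):
        placement[chunk % spindles].append(chunk)
    return placement

layout = wide_stripe(lun_chunks=1000, spindles=40)
per_disk = [len(chunks) for chunks in layout.values()]
print(max(per_disk) - min(per_disk))  # 0: perfectly balanced
```

Because the array does this placement itself, the balancing spreadsheet becomes the array's problem rather than the admin's.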

SSDs will become just another feature with time as well; Barry wants this to be the case. It validates the decision to put SSDs into the DMX. Any good idea will eventually just become another feature if it is good enough and, even if it is patented, people will find their way round eventually.

As Barry points out, SSDs deliver massively increased IOPS and massively decreased response times; we need this, we desperately need this for some applications. Even if the magnetic disk manufacturers could get their disks to spin faster, the increase in power and cooling required would boil the oceans and hasten our demise as a race.

But until SSDs achieve price parity per gigabyte with spinning disk, we need to find ways to efficiently use what is still a relatively expensive resource, and SSDs probably are not the best fit for your file-serving and bulk storage requirements. The venerable AnandTech actually demonstrates with their benchmarks that using SSDs for log-files may not gain you much. It's an interesting, if slightly flawed, look at SSDs, SATA and SAS; it would have been more interesting if they'd done more work and gone more granular into the database tables.

They need to be used for appropriate workloads and ideally we need something like this from Compellent. Unfortunately for Compellent, I have a horrid suspicion that this level of automated tiered storage will simply become another feature; you can’t keep a good idea down! Once we have block-level migration, both automatic and rules-based, we can work out quickly and easily how much SSD we need.

SSDs with automated tiering will save you money, probably both in terms of TCA and TCO. SSDs without automated tiering will save you money in terms of TCA for appropriate workloads but may end up costing you in terms of TCO because of the work needed to identify, balance and move data around.
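A minimal sketch of the rules-based side of such tiering, assuming a simple "hottest blocks to SSD" policy; the block names and IOPS figures are invented for illustration and real arrays work at much finer granularity:

```python
def place_blocks(block_iops, ssd_capacity):
    """Rules-based tiering sketch: the hottest blocks (by observed IOPS)
    go to SSD until it is full; everything else stays on spinning disk."""
    ranked = sorted(block_iops.items(), key=lambda kv: kv[1], reverse=True)
    ssd = {block for block, _ in ranked[:ssd_capacity]}
    return {block: ("SSD" if block in ssd else "SATA") for block in block_iops}

# Invented I/O heat map: two blocks carry almost all the IOPS.
heat = {"blk0": 5000, "blk1": 10, "blk2": 4200, "blk3": 3, "blk4": 900}
tiers = place_blocks(heat, ssd_capacity=2)
print(tiers["blk0"], tiers["blk3"])  # SSD SATA
```

Run against real heat maps, the same ranking tells you directly how much SSD captures, say, 90% of the IOPS; which is exactly the "how much SSD do we need" question.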

Of course, it all comes down to the cost of the software needed to manage and automate the process. If the software is too expensive and the vendors simply try to milk it as a new cash-cow; we’ll not realise the savings of this brave new world.

Storage Virtualisation and Commoditisation

HDS' Hu makes a point in his latest blog entry that Storage Virtualisation allows the end-user to turn commodity disk into enterprise disk by sticking it behind a virtualisation appliance; in Hu's case, he'd dearly love that to be a USP.

I think in many ways his idea gets to the core of what might have gone wrong with the storage industry; we are paying too much for the commodity bit of the storage, i.e. the spinning stuff!

But is what Hu is talking about truly virtualisation? Arguably not! What HDS do with the USP and what IBM do with SVC is more loosely couple the array controller with the disk at the back-end; the USP and SVC are "simply" array controllers. Storage virtualisation actually is not really that clever; they may have some fantastic code in them, but they are still array controllers. And yes, they have to deal with the vagaries of FC implementation across many different back-end disks, and EMC would probably argue 'more fool them'.

So perhaps if instead of selling Virtualisation; IBM and HDS simply sold array controllers and said, 'You can buy your disk from us, EMC or whoever is giving the best deal today' and didn't make it out to be some mystical and magical thing; they might get more acceptance.

It actually changes the nature of the conversation somewhat; end-users could then go back to EMC et al and ask the question, 'What makes your back-end disk so special? Why can't I just buy the directors and put whatever disk I like behind them?'

I think the discussion becomes a lot less philosophical and a lot more pragmatic if this approach is taken. I know EMC have answers as to why their back-end disk is special but perhaps you should ask the question also. We can get down to talking about each storage array's USP…no pun intended!

Bigger Blue?

IBM were always going to go acquisitive this year and it's no surprise to me that the first target appears to be Sun. They could have waited for Sun's stock price to tank even more but that was going to be a risk and who knows who else might have come in for the once celestial body. As other commentators have already pointed out, this is not a reaction to Cisco's announcement on Monday but more a strike against HP. This is about being Number 1 again!

IBM already make a huge amount of money out of selling software on Sun hardware, there's a lot of WebSphere out there running on Solaris servers. There's a lot of IBM software full-stop running on Sun kit. IBM Global Services probably run more Solaris servers than most; I know of a number of huge deals which involve IBM running Sun servers. By buying Sun, they can grab more of this dollar.

And there's the Open Source aspect; IBM embraced Open Source, and especially Linux, in a way which was most un-IBM-like. Whilst Sun were sending mixed messages out, IBM set about working out how to make money out of Open Source by fully leveraging their services organisation. Sun finally came to the party but, like many late-comers, they overpaid for their entrance. But Sun still have technologies which I am sure IBM would like to get their hands on, and I am pretty sure that IBM know how to monetise them.

Java, a technology that IBM probably know better than Sun themselves. At times you would even believe that Java was already an IBM product.

IBM also pickup VirtualBox which shows some promise as a desktop virtualisation tool.

Yes, IBM will have to kill products and at least one sacred cow will have to go: AIX or Solaris. That's a hard one to call. Solaris on pSeries would be an interesting proposition but I suspect AIX will survive; it'll pick up some technologies from Solaris, ZFS for example. And if Solaris goes, where does that leave SPARC?

And there's the storage aspect. Have HDS just lost a channel? I cannot see IBM reselling the USP; Barry W would be apoplectic! The NetApp relationship? I can see that being strained somewhat; with some further engineering, the 7000 series could be an excellent competitor at the low-to-mid range for NetApp FAS, but there is a question mark over the future of Solaris as I mentioned earlier.

Enterprise tape? Well, IBM will own that market with the addition of StorageTek. Hopefully someone will come and pick Quantum up! StorNext FTW!

IBM and Sun both resell LSI; so that shouldn't be a problem for them.

But at the end of the day, this isn't really about technology and it's not about storage. This is about market-share, this is about mind-share, this is about services and this is about coming out of the down-turn as Number 1. 

And if it isn't true; well, congratulations to the person who wanted to distract from Cisco's UCS…way to go!

There ain't nothing new anymore!

So let the hyperbole begin; Cisco’s Unified Computing System has finally been announced (an amusing aside: I first found out about Project California about six months ago from…Brocade! My Cisco Account Manager was not especially amused). It’s certainly a grand vision and a play for global domination not seen since the days of SMERSH!

As has been stated, this is not simply a ‘me too’ blade play from Cisco; this appears to be a full-on, all-out assault on completely dominating the data centre space. Not since the days of mainframe dominance have we seen such an attempt to own the whole data centre. Well, not quite the whole data centre; storage currently seems to be the missing piece, with Cisco relying on partners such as EMC and NetApp to provide it.

It is going to be interesting, to say the least, to see how HP, IBM and Sun et al react to this. I could see opportunities for Sun but it relies on them beginning to see the light and admitting to themselves that they are actually a software company. They could embrace the UCS platform and really start to shift. HP and IBM have a problem now in that they need to put together a competing vision; it’ll be interesting to see what that vision is.

What do I think? It’s too early to say but Cisco are going to be very aggressive about this and their marketing is going to be much further up the food-chain than mere ‘Bods. Over the past months, I have come to the realisation that things need to change in Infrastructure and especially Infrastructure teams.

We have too many specialists, too many people who can only do one thing or at least profess to do one thing. There are far too many vested interests in many support organisations, and there is a level of complicity which reinforces the status quo in the industry: I won’t step on your toes if you don’t step on mine.

Cisco’s vision challenges this world view and even if you don’t buy into the Cisco vision in its entirety; it is a vision which merits a second look.

And it’s not Cisco which powers this vision, it’s the virtualisation brought by products like VMware, Hyper-V and Xen. But it’s a vision which is really very old…what the hell happened, Big Blue? How on earth did you let Cisco re-invent your original vision and claim it for themselves?

Perfection….

Although I give the various players a hard time, the industry doesn't do everything badly and I try to see the positives as well as the negatives. So I was thinking about the perfect array and what features I would like to see! A random stream of consciousness produced the following:

  • Reliability – DMX-like reliability and robustness
  • Scalability – DMX-like scalability for block, IBM SOFS for NAS
  • Performance – DMX-like performance for block, BlueArc for NAS
  • Flexibility – Support for all protocols in a common consistent manner like OnTap
  • Thin Provisioning – 3Par's thin provisioning
  • Wide Striping – Genuine wide-striping across ALL spindles, not just a proportion or across groups of spindles – Think 3Par
  • Automated Storage Tiering – Think Compellent on steroids!
  • Automated Optimisation – Think 3Par
  • Dedupe – Dedupe at block or file level – not seen a truly great dedupe solution yet
  • Scalable Heterogeneous Support – IBM SVC or HDS
  • Minimally performance-impacting Snapshots – think NetApp or…Sun
  • Writeable Thin Clones – think LSI's DPM8400
  • Synchronous Replication – think SRDF
  • Asynchronous Replication – think of something which works without a huge amount of work
  • Provisioning interface – think XIV, think 3Par
  • Analytics – think Sun
  • Monitoring/reporting – think Onaro
  • Cost – think PC World (who are too expensive too, but you get the idea!)

So a merger between IBM/EMC/HDS/NetApp/Sun/Compellent/BlueArc/LSI/3Par would be a great start. Let's throw Cisco into the mix as well for a unified data-centre fabric and we're done!

What features do you want to see? I'd love to know!

And vendors, without too much marchitecture, what are your killer features? The things that you are most proud of? I don't want a big bragging list but what's the one problem you think you've solved which you are most proud of?


Can I Be House?

So hopefully we all agree that EFDs have a place in the storage infrastructure of the future, but we also have to ask ourselves what this infrastructure is going to look like. If we look at some of the press releases and comments with regards to Fusion-io, you would probably believe that the SAN was on the way out, or actually that shared storage in general would die.

Some of the figures are impressive; an un-named company believed that they were losing 15% of potential web business due to storage timeouts and the slow response of the array.

That's a huge amount of business to be losing due to the slowness of the array but I wonder how true that is; was it really due to the end-to-end slowness of the system? Was it due to non-optimised SQL? I've seen SQL queries tuned down from 300 accesses to half a dozen with a couple of hours' work. Did they blame the storage because the storage team were the one team who couldn't give a transparent view of their environment?

Often storage is a great diagnostic tool; just looking at the I/O profile can lead to interesting questions. If you see weird I/O ratios which step way outside the normal profile for an OLTP application, it can be an indicator of sub-optimal code. But to do so, you need tools which present the information in a quick and easily digestible manner.
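As a sketch of that kind of diagnostic, the following flags intervals whose read/write ratio steps well outside an expected OLTP profile; the baseline ratio, tolerance and sample figures are all illustrative assumptions, not real tool output:

```python
def flag_anomalies(samples, baseline_ratio, tolerance=0.5):
    """Flag intervals whose read/write ratio steps well outside the
    normal OLTP profile; often a hint of sub-optimal application code
    rather than a storage problem."""
    flagged = []
    for interval, (reads, writes) in samples.items():
        ratio = reads / max(writes, 1)
        # Anything more than `tolerance` (fractionally) away from the
        # baseline ratio is worth an interesting question.
        if abs(ratio - baseline_ratio) > tolerance * baseline_ratio:
            flagged.append(interval)
    return flagged

# Invented five-minute samples of (reads, writes) per interval.
samples = {
    "09:00": (3000, 1000),   # 3:1, the normal profile
    "09:05": (3100, 1050),
    "09:10": (45000, 1000),  # 45:1 read storm: full table scans?
}
print(flag_anomalies(samples, baseline_ratio=3.0))  # ['09:10']
```

The flagged interval doesn't tell you the answer, but it tells you exactly where to start asking the application team questions.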

At the moment, it is all too easy to blame the storage because the management tools are not great and the estate becomes very opaque to the outside viewer. If we had the right tools, we could often become a crack team of dysfunctional diagnosticians like House and his team, and people would come to us saying, 'We know it's not a storage problem, but perhaps you can help us identify what is going on in our infrastructure.'

That'd be a great step forward, don't you think?