

Flash in a pan?

The Tech Report have been running an ‘SSD Endurance Experiment’ utilising consumer SSDs to see how long they last and what their ‘real world’ endurance really is. It seems that pretty much all of the drives are very good and last longer than their manufacturers state; a fairly unusual state of affairs, that!! Something in IT that does better than it says on the can.

The winner is the Samsung 840 Pro, which manages more than 2.4PB of writes before it dies!

This is great news for consumers but there are some gotchas; it seems that when most drives finally fail, they fail hard and leave your data inaccessible; some of the drives’ software happily states that they are healthy right up until the day they fail.

A lot of people assume that when SSDs reach their end of life for writes, the data on them will still be readable; it seems that this might not be the case with the majority of drives. You are going to need decent backups.

What does this mean for the flash array market? Well, in general it appears to be pretty good news, and those vendors who are using consumer-grade SSDs are pretty much vindicated. But…it does show that managing and monitoring the SSDs in those arrays is going to be key. Software, as per usual, is going to be king!
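To make that concrete, something like the sketch below is the sort of check that needs to be happening constantly, whether inside the array software or via a humble script. It is a minimal illustration only: it assumes smartmontools is installed and that the drive reports one of the named wear attributes, which vary by vendor.

```python
# Minimal sketch: poll a drive's SMART wear attribute via smartctl.
# Assumes smartmontools is installed; the attribute names below vary by
# vendor (these two are common on Samsung and Intel drives respectively),
# so treat them as illustrative rather than definitive.
import subprocess

WEAR_ATTRIBUTES = ("Wear_Leveling_Count", "Media_Wearout_Indicator")

def wear_remaining(device):
    """Return the normalised wear value (100 = new, lower = more worn), if reported."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        fields = line.split()
        if len(fields) > 3 and fields[1] in WEAR_ATTRIBUTES:
            return int(fields[3])  # the normalised VALUE column
    return None  # no recognised wear attribute reported

if __name__ == "__main__":
    value = wear_remaining("/dev/sda")
    if value is None:
        print("no wear attribute found -- don't assume the drive is healthy")
    elif value < 10:
        print(f"warning: only {value}% wear headroom reported")
```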

A much larger scale test needs to be done before we can be 100% certain and it’d be good if some of the array vendors were to release their experiences around the life of consumer drives that they are using in their arrays.

Still, if I were running a large server estate and looking at putting SSDs in it, I would probably now think twice before forking out a huge amount of cash on eMLC and would be looking at the higher-end consumer drives.


Interesting Question?

Are AFAs ready for legacy Enterprise Workloads? The latest little spat between EMC and HP bloggers asked that question.

But it’s not really an interesting question; a more interesting question is why would I put traditional Enterprise workloads on an AFA? Why even bother?

More and more I’m coming across people who are asking precisely that question and struggling to come up with an answer. Yes, an AFA makes a workload run faster but what does that gain me? It really is very variable across application type and where the application bottle-necks are; if you have a workload that does not rely on massive scale and parallelism, you will probably find that a hybrid array will suit you better and you will gain pretty much all the benefits of flash at a fraction of the cost.

Ask what the impact would be of being able to run batch jobs, often the foundation of many legacy workloads, in half the time and the response is often a ‘So what?’ As long as the workload runs in the window, that is all anyone cares about.

If all your latency is the human in front of the screen; the differences in response times from your storage become pretty insignificant.

AFAs only really make sense as you move away from a legacy application infrastructure; where you are architecting applications differently, moving many of the traditional capabilities of an Enterprise infrastructure up the stack and into the application. Who cares if the AFA can handle replication, consistency groups and other such capabilities when that is taken care of by the application?

Yes, I can point to some traditional applications that will benefit from a massive amount of flash but these tend to be snowflake applications and they could almost certainly do with a re-write.

I’d like to see more vendors be honest about the use-cases for their arrays; more vendors working in a consultative manner and fewer trying to shift as much tin as possible. But that is much harder to achieve and requires a level of understanding beyond most tin-shifters.

Another Year In Bits…

So as another year draws to a close, it appears that everything in the storage industry is still pretty much as it was. There have been no really seismic shifts in the industry yet. Perhaps next year?

The Flash start-ups continue to make plenty of noise and fizz about their products and growth. Lots of promises about performance and consolidation opportunities; however, the focus on performance is throwing up some interesting stuff. It turns out that when you start to measure performance properly, you begin to find that in many cases the assumed IOPS requirements for many workloads aren’t actually there. I know of a few companies who have started down the flash route only to discover that they didn’t need anything like the IOPS that they’d thought and, with a little bit of planning and understanding, they could make a little flash go an awful long way. In fact, 15K disks would probably have done the job from a performance point of view. Performance isn’t a product and I wish some vendors would remember this.
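By way of illustration, measuring what a server is actually doing can be as simple as sampling /proc/diskstats for a while before anyone signs a purchase order. The sketch below is a rough, Linux-only example; the device name and interval are assumptions.

```python
# Rough sketch: sample /proc/diskstats to see the IOPS a device is actually
# doing, before assuming the workload needs an all-flash array. Linux-only,
# and the device name is an assumption -- substitute your own.
import time

def completed_ios(device):
    with open("/proc/diskstats") as stats:
        for line in stats:
            fields = line.split()
            if fields[2] == device:
                # field 3 = reads completed, field 7 = writes completed
                return int(fields[3]) + int(fields[7])
    raise ValueError(f"device {device!r} not found in /proc/diskstats")

def measure_iops(device="sda", interval=10):
    before = completed_ios(device)
    time.sleep(interval)
    after = completed_ios(device)
    return (after - before) / interval

if __name__ == "__main__":
    print(f"observed IOPS over 10s: {measure_iops():.0f}")
```

Run something like that over a representative day rather than ten seconds and the numbers are often far smaller than anyone expected.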

Object Storage still flounders with an understanding or use-case problem; the people who really need Object Storage today really do need it, but they tend to be really large players and there are not a lot of them. All of the Object Storage companies can point at some really big installs but you will rarely come across them; there is a market and it is growing, but not at a stellar rate at the moment.

Object Storage Gateways are becoming more common and there is certainly a growing requirement; I think as they become commonplace, and perhaps simply a feature of a NAS device, this will drive the use of Object Storage until it hits a critical mass and there will be more native application support for Object Storage. HSM and ILM may finally happen in a big way; probably not to tape but to an Object Store (although Spectra Logic are doing great work in bringing Object and Tape together).
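For what ‘native application support’ might look like in practice, here is a minimal sketch using boto3 against an S3-compatible endpoint; the endpoint, bucket and file names are placeholders and credentials are assumed to come from the environment.

```python
# Minimal sketch of native object storage access from an application:
# archive data straight to an S3-compatible object store rather than
# tiering it to tape. The endpoint, bucket and file names are placeholders
# and credentials are assumed to be configured in the environment.
import boto3

s3 = boto3.client("s3", endpoint_url="https://objectstore.example.com")

s3.upload_file("/var/archive/batch-2014-12.tar",   # local file to archive
               "archive-bucket",                   # placeholder bucket
               "2014/batch-2014-12.tar")           # object key
```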

The big arrays from the major vendors continue to attract premium costs; the addiction to high margins in this space continues. Usability and manageability have improved significantly but the premium you pay cannot really continue. I get the feeling that some vendors are simply using these to fund their transition to a different model; let’s hope that this transition doesn’t take so long that they get brushed aside.

The transition to a software dominated model is causing vendors some real internal and cultural issues; they are so addicted to the current costing models that they risk alienating their customers. If software+commodity hardware turns out to be more expensive than buying a premium hardware array; customers may purchase neither and find a different way of doing things.

The cost of storage in the Cloud, both for consumers and corporates, continues to fall; it trends ever closer to zero as the Cloud price war continues. You have to wonder when Amazon will give it up as Google and Microsoft fight over the space. Yet for the really large users of storage, trending to zero is still too expensive for us to put stuff in the Cloud; I’m not even sure free is cheap enough yet.
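A quick back-of-the-envelope sum shows why; the price and estate size below are assumptions for illustration, not any provider’s actual list price.

```python
# Back-of-the-envelope: even a low per-GB price adds up at large scale.
# The price is an assumed figure for illustration, not a provider quote.
PRICE_PER_GB_MONTH = 0.03      # assumed $/GB/month
ESTATE_PB = 10                 # a modest estate for a really large user

gigabytes = ESTATE_PB * 1024 * 1024
monthly_cost = gigabytes * PRICE_PER_GB_MONTH
print(f"{ESTATE_PB}PB at ${PRICE_PER_GB_MONTH}/GB/month is about ${monthly_cost:,.0f} per month")
# roughly $315,000 a month, before egress and request charges
```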

The virtualisation space continues to be dominated by the reality of VMware and the promise of OpenStack. If we look at industry noise, OpenStack is going to be the big player; any event that mentions OpenStack gets booked up and sells out, but the reality is that the great majority are still looking to VMware for their virtualisation solution. OpenStack is not a direct replacement for VMware and architectural work will be needed in your data-centre and with your installed applications, but we do see VMware architectures that could be easily and more effectively replaced with OpenStack. Quite simply though, OpenStack is still pretty hard work and hard-pushed infrastructure teams aren’t currently well positioned to take advantage of it.

And almost all virtualisation initiatives are driven by and focussed on the wrong people; the server side is easy…the storage and especially the changes to the network are much harder and require significantly more change. It’s time for the Storage and Network folks to gang up and get their teams fully involved in virtualisation initiatives. If you are running a virtualisation initiative and you haven’t got your storage and network teams engaged, you are missing a trick.

There’s a lot bubbling in the Storage Industry but it all still feels the same currently. Every year I expect something to throw everything up in the air; the industry is ripe for major disruption, but the dominant players remain dominant. Will the disruption be technology or perhaps a mega-merger?

Can I take this chance to wish all my readers a Merry Christmas and a Fantastic New Year…

Fujitsu Storage – With Tentacles…

So Fujitsu have announced the ETERNUS CD10000, their latest storage product designed to meet the demands of hyperscale and the explosion in data growth, and it’s based on…Ceph.

It seems that Ceph is quickly becoming the go-to scale-out and unified storage system for those companies who don’t already have an in-house file-system to work on. Red Hat’s acquisition of Inktank has steadied that ship with regards to commercial support.

And it is hard to see why anyone would go to Fujitsu for a Ceph cluster, especially considering some of the caveats that Fujitsu put on its deployment. The CD10000 will scale to 224 nodes; that’s a lot of server to put on the floor just to support storage workloads, and yet Fujitsu were very wary about allowing you to run workloads on the storage nodes despite the fact that the core operating system is CentOS.

CephFS is an option with the CD10000 but the Ceph website explicitly says that it is not ready for production workloads, even with the latest release, 0.87 ‘Giant’. Yes, you read that right; Ceph is not yet a v1.0 release, and that in itself will scare off a number of potential clients.
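If you are tempted anyway, the least you can do is script the obvious pre-flight checks. A small sketch below, assuming the ceph CLI is installed on a node that can reach the cluster with an admin keyring.

```python
# Small pre-flight sketch: report the installed Ceph release and overall
# cluster health before trusting it with anything important. Assumes the
# `ceph` CLI is installed on a node that can reach the cluster with an
# admin keyring.
import subprocess

def ceph(*args):
    return subprocess.run(["ceph", *args],
                          capture_output=True, text=True, check=True).stdout.strip()

if __name__ == "__main__":
    version = ceph("--version")   # e.g. "ceph version 0.87 (...)" -- the local package
    health = ceph("health")       # e.g. "HEALTH_OK" or "HEALTH_WARN ..."
    print(version)
    print(health)
    release = version.split()[2]
    if release.startswith("0."):
        print("note: pre-1.0 release -- read the caveats before production use")
    if not health.startswith("HEALTH_OK"):
        print("note: cluster is not reporting healthy")
```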

It’s a brave decision by Fujitsu to base a major new product on Ceph; it’s still very early days for Ceph in the production mainstream. But with large chunks of the IT industry betting on OpenStack, and Ceph’s close (but not core) relationship with OpenStack, it’s kind of understandable.

Personally, I think it’s a bit early and the caveats around the ETERNUS CD10000 deployment are currently limiting; I’d wait for the next release or so before deploying.

Scrapheap Challenge

On the way to ‘Powering the Cloud’ with Greg Ferro and Chris Evans, we got to discussing Greg’s book White Box Networking and whether there could be a whole series of books discussing White Box storage, virtualisation, servers etc and how to build a complete White Box environment.

This led me to thinking about how you would build an entire environment and how cheap it would be if you simply used eBay as your supplier/reseller. If you start looking around eBay, it is crazy how far you can make your money go: dual-processor HP G7s with 24GB for less than £1,000; a 40-port 10GbE switch for £1,500; 10GbE cards down to £60. Throw in a Supermicro 36-drive storage chassis and build a hefty storage device utilising that; you can put together a substantial environment for less than £10,000 without even trying.
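As a quick tally, using the rough eBay prices above plus a couple of assumed line items for the storage build:

```python
# Quick tally of a hypothetical junk-box build using the rough eBay prices
# above; the quantities and the last two line items are assumptions.
parts = {
    "HP G7 dual-processor server, 24GB (x4)": 4 * 1000,
    "40-port 10GbE switch":                   1500,
    "10GbE card (x8)":                        8 * 60,
    "Supermicro 36-drive storage chassis":    1000,   # assumed price
    "Drives and sundries":                    2000,   # assumed budget
}

for item, cost in parts.items():
    print(f"{item:42s} £{cost:>6,}")
print(f"{'Total':42s} £{sum(parts.values()):>6,}")    # £8,980 -- under £10,000
```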

I wonder how far you could go in building the necessary infrastructure for a start-up with very few compromises, and whether you could completely avoid going into the cloud at all? The thing that is still going to hurt is the external network connectivity to the rest of the world.

But instead of ‘White Box’…perhaps it’s time for junk-box infrastructure. I don’t think it’d be any worse than quite a few existing corporate infrastructures and would probably be more up-to-date than many.

What could you build?


A Ball of Destruction…

I’m not sure that EMC haven’t started an unwelcome trend; I had a road-map discussion with a vendor this week where they started to talk about upcoming changes to their architecture…my questioning ‘but surely that’s not just a disruptive upgrade but a destructive one?’ was met with an affirmative. Of course, like EMC, the upgrade would not be compulsory but probably advisable.

The interesting thing with this one is that it was not a storage hardware platform but a software-defined storage product. And we tend to be a lot more tolerant of such disruptive and potentially destructive upgrades. Architecturally, as we move to more storage as software, as opposed to software wrapped in hardware, this is going to be more common and we are going to have to design infrastructure platforms and applications to cope with it.

This almost inevitably means that we will need to purchase more hardware than previously, to allow us to build zones of availability so that upgrades to core systems can be carried out as non-disruptively as possible. And when we start to dig into the nitty-gritty, we may find that this starts to push costs and complexity up…whether these costs go up so much that the whole commodity storage argument starts to fall to pieces is still open to debate.

I think for some businesses it might well do; especially those who don’t really understand the cloud model and start to move traditional applications into the cloud without a great deal of thought and understanding.

Now this doesn’t let EMC off the hook at all but, to be honest, EMC have a really ropey track-record on non-disruptive upgrades…more so than most realise. Major Enginuity upgrades have always come with a certain amount of disruption and my experience has not always been good; the level of planning and certification required has kept many storage contractors gainfully employed. Clariion upgrades have also been scary in the past and, even today, Isilon upgrades are nowhere near as clean as they would have you believe.

EMC could of course have got away with the recent debacle if they’d simply released a new hardware platform; everyone would have accepted that this was going to involve data migration and moving data around.

Still, the scariest upgrade I ever had was an upgrade of an IBM Shark which failed half-way and left us with one node at one level of software and one at a different level. And IBM scratching their heads. But recently, the smoothest upgrades have been V7000..so even elephants can learn to dance.

As storage vendors struggle with a number of issues, including the sun setting on traditional data-protection schemes such as RAID, I would expect the number of destructive and disruptive upgrades to increase. And the marketing spin around them from everyone to reach dizzying heights. As vendors manipulate the data we are storing in more and more complex and clever ways, the potential for disruptive and destructive upgrades is going to increase.

Architectural mistakes are going to be made; wrong alleys will be followed…Great vendors will admit to them and support their customers through these changes. This will be easier for those who are shipping software products wrapped with hardware; it is going to be much harder for the software-only vendors. If a feature is so complex that it seems like magic, you might not want to use it…I’m looking for something that is simple to manage, operate and explain.

An argument for Public Cloud? Maybe, as this will take the onus away from you to arrange. Caveat Emptor though and this may just mean that disruption is imposed upon you and if you’ve not designed your applications to cope with this…Ho hum!


Heady Potential Eventually Means Catastrophe?

Amongst the storage cognoscenti today on Twitter, there’s been quite a discussion about EMC and HP possibly merging. Most people seem to be either negative or at best disbelieving that something like this would bring value or even happen.

But from a technology point of view, the whole thing might make a lot of sense. The storage folks like to point at overlap in the portfolios but I am not convinced that this really matters and the overlap might not be as great as people think. Or at least, the overlap might well finally kill off the weaker products; I’ll let the reader decide which products deserve to die.

EMC are on a massive push to commoditise and move their technology onto a standard platform; software variants of all their storage platforms exist and just need infrastructure to run on. I’ve mentioned before that HP’s SL4500 range is an ideal platform for many of EMC’s software defined products.

But storage aside, the EMC Federation has a lot of value for HP. It is early days for Pivotal but I suspect Meg can see a lot of potential in it; she’ll see a bit of eBay in it and she’ll get the value of some of the stuff that they are trying to do. They are still very much a start-up, a well-funded start-up tho’.

VMware, I would expect to continue as it is; it might throw up some questions about EVO:RAIL, and HP have pointedly not produced an EVO:RAIL certified stack despite being invited to. But to fold VMware into the main HP would be rash and would upset too many other vendors. But hey, with IBM pulling out of x86 servers and, honestly, who cares about Oracle’s x86 servers; HP might have a decent run at dominating the server marketplace before Lenovo takes a massive bite out of it.

And customers? I’m not sure that they’d be too uncomfortable with a HP/EMC merger; mergers are almost certainly on the agenda and there are less attractive ones on the table.

HP need software to help them build their software-defined data-centre; OpenStack will only take them so far today. EMC need a commodity partner to help them build a hardware platform that would be trusted. An HP/EMC stack would be solid and traditional but with the potential to grow into infrastructure supporting the 3rd platform as customers move that way.

And they both need a way of fending off Amazon and Google; this might be the way for them to do it.

I know I’ve been talking about this more like a HP take-over of EMC and it’d be closer to a true merger; this makes it harder…true mergers always are but culturally, the companies are less dissimilar than most realise. They both need more rapid cultural change…perhaps a merger might force that on them.

Will it happen? I don’t know…would it be a disaster if it did? I don’t think so. It’d also be good for the industry; lots of hacked-off smart people would leave the new behemoth and build new companies or join some of the pretenders.

A shake up is needed…this might do it. Will the market like it? I’m not especially bothered…I don’t hold shares in either company. I just think it might make more sense than people realise. 


Singing the lowest note…

The problem with many discussions in IT is that they rapidly descend into something that looks and feels like a religious debate, whereas reality is much more complex and the good IT specialist will develop their own syncretic religion, pinching bits that work from everywhere.

One of the realities for many of us working in Enterprise IT is that our houses have many rooms and must house many differing belief systems; the one true way is not a reality. And any organisation more than fifteen years old has probably built up a fair number of incompatible dogmas.

For all the pronouncements of the clouderatti, we are simply not in a position to move wholesale to the Cloud in any of its many forms. We have applications that are simply not designed for scale-out; they are certainly not infrastructure-aware and none of them are built for failure. But we also have a developer community who might be wanting to push ahead; to use the language du jour and utilise cloud-like infrastructure, dev-ops and software-defined everything.

So what do we in the infrastructure teams do? Well, we are going to have to implement multiple infrastructure patterns to cater for the demands of all our communities. But we really don’t want to bespoke everything and we certainly don’t want to lock ourselves into anything.

Many of the hyper-converged plays lock us into one technology or another; hence we are starting to look at building our own rack-converged blocks to give us lowest common denominator infrastructure that can be managed with standard tools.

Vendors with unique features are sent packing; we want to know why you are better at the 90%. Features will not sell; if I can’t source a feature or function from more than one vendor, I probably will not use it. Vendors who do not play nice with other vendors, vendors who insist on doing it all and make this their lock-in, are not where it’s at.

On top of this infrastructure; we will start to layer on the environment to support the applications. For some applications; this will be cloudy and fluffy. We will allow a lot more developer interaction with the infrastructure; it will feel a lot closer to dev-ops.

For others, where it looks like a more traditional approach is required (think those environments that need a robustly designed SAN or traditional fail-over clustering), we’ll be a lot more prescriptive about what can be done.

But all of these will sit on a common, reusable infrastructure that will allow us to meet the demands of the business.  This infrastructure will be able to be quickly deployed but also quickly removed and moved away from; it will not require us to train our infrastructure teams in depth to take advantage of some unique feature.

Remember to partner well with us but also with your competitors; yes, it sometimes makes for an amusing conversation about how rubbish the other guy is but we’ll also have exactly that same conversation about you.

Don’t just pay lip-service to openness; be prepared to show us evidence.

Hype Converges?

In a software-defined data-centre, why are some of the hottest properties hardware platforms? Nutanix and SimpliVity are two examples that spring to mind: highly converged, sometimes described as hyper-converged, servers.

I think it demonstrates what a mess our data-centres have got into that products such as these have any kind of attraction. Is it the case that we have built in processes that are so slow and inflexible that a hardware platform which resembles nothing more than a games-console for virtualisation has an attraction?

Surely the value has to be in the software; so have we got so bad at building out data-centres that it makes sense to pay a premium for a hardware platform? And there certainly is a large premium for some of them.

Now I don’t doubt that deployment times are quicker, but my real concern is how we got to this situation. It seems that the whole infrastructure deployment model has collapsed under its own weight. But is the answer expensive converged hardware platforms?

Perhaps it is time to fix the deployment model and deploy differently because I have a nasty feeling that many of those people who are struggling to deploy their current infrastructure will also struggle to deploy these new hyper-converged servers in a timely manner.

It really doesn’t matter how quickly you can rack, stack and deploy your hypervisor if it takes you weeks to cable it to talk to the outside world, or to give it an IP address or even a name!

And then the questions will be asked….you couldn’t deploy the old infrastructure in a timely manner; you can’t deploy the new infrastructure in a timely manner even if we pay a premium for it….so perhaps we will give public cloud a go.

Most of the problems in the data-centre at present are not technology; they are people and, mostly, process. And I don’t see any hardware platform fixing those quickly…

Announcement Ennui

Despite my post of approval about IBM’s V7000 announcements, there’s a part of me that wonders who the hell really cares now? The announcements from IBM, HDS and NetApp, and the inevitable EMC announcement later in the year, just leave me cold. The storage array is nearly done as a technology; are we going to see much more in the way of genuine innovation?

Bigger and faster is all that is left.

Apart from that, we’ll see incremental improvements in reliability and serviceability. It does seem that the real storage innovation is going to be elsewhere or down in the depths and guts; practically invisible to the end-user and consumer.

So things I expect to see in the traditional array market include a shift from a four-to-five-year refresh cycle for centralised storage arrays to a six-to-seven-year cycle; centralised arrays are getting so large that migration is a very big job and takes up an increasingly large part of an array’s useful life. We will see more arrays offering data-in-place upgrades: replacement of the storage controllers as opposed to the back-end disk.

An Intel-based infrastructure based on common commodity components means that the internal software should be able to run on more generations of any given hardware platform.

We’ll also see more work on alternatives to the traditional RAID constructs: declustered, distributed and micro-RAIDs.

There are going to be more niche devices and some of the traditional centralised storage array market is going to be taken by the AFAs but I am not sure that market is as large as some of the valuations suggest.

AFAs will take off once we have parity pricing with spinning rust but until then they’ll replace a certain amount of tier-1 and find some corner-cases but it is not where the majority of the world’s data is going to sit.

So where is the bulk of the storage market going?

Object storage is a huge market but it is mostly hidden; the margins are wafer-thin when compared to the AFAs and it does not directly replace the traditional centralised array. It is also extremely reliant at the moment on application support. (Yes, I know everyone…I’m a broken record on this!).

The bulk of the storage market is going to be the same as it is now….DAS but in the form of ServerSAN. If this is done right, it will solve a number of problems and remove complexity from the environment. DAS will become a flexible option and no longer just tiny silos of data.

DAS means that I can get my data as close to the compute as possible; I can leverage new technologies in the server such as Diablo Memory Channel Storage but with ServerSAN, I can also scale-out and distribute my data. That East/West connectivity starts to become very important though.

Storage refreshes should be as simple as putting in a new ServerSAN node and evacuating an older node. Anyone who has worked with some of the cluster file-systems will tell you that this has become easy; so much easier than the traditional SAN migration. There is no reason why a ServerSAN migration should not be as simple.
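As an illustration of how simple the mechanics can be, here is a hedged sketch of draining and retiring one node’s OSDs on a Ceph-style cluster (Ceph comes up earlier in this piece); the OSD IDs are assumptions for the node being retired and you would obviously check free capacity before doing anything like this.

```python
# Hedged sketch: evacuate and retire one node's OSDs on a Ceph cluster,
# as an illustration of a ServerSAN-style refresh. The OSD IDs are
# assumptions for the node being retired; check free capacity first.
import subprocess
import time

OLD_NODE_OSDS = [12, 13, 14]   # assumed OSD IDs on the node being retired

def ceph(*args):
    return subprocess.run(["ceph", *args],
                          capture_output=True, text=True, check=True).stdout.strip()

# 1. Mark the OSDs out so their data drains onto the rest of the cluster.
for osd in OLD_NODE_OSDS:
    ceph("osd", "out", str(osd))

# 2. Wait for the rebalance to finish and health to return to OK.
while not ceph("health").startswith("HEALTH_OK"):
    time.sleep(60)

# 3. Only then remove the drained OSDs from the CRUSH map and the cluster.
for osd in OLD_NODE_OSDS:
    ceph("osd", "crush", "remove", f"osd.{osd}")
    ceph("auth", "del", f"osd.{osd}")
    ceph("osd", "rm", str(osd))
```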

I would hope that we could move away from the artificial LUN construct and just allocate space. The LUN has become increasingly archaic and we should be no longer worrying about queue depths per LUN for example.

There are still challenges; synchronous replication over distance, a key technology for active/active stretched clusters is still a missing technology in most ServerSAN cases. Personally I think that this should move up the stack to the application but developers rarely think about the resilience and other non-functional requirements.

And at some point, someone will come up with a way of integrating ServerSAN and SAN; there’s already discussion on how VSAN might be used with the more traditional arrays.

The storage array is certainly not dead yet and much like the mainframe, it isn’t going away but we are going to see an increasing slowing in the growth of array sales; business appetite for the consumption of storage will continue to grow at a crazy rate but that storage is going to be different.

The challenge for mainstream vendors is how they address the challenge of slowing growth and address the new challenges faced by their customers. How do they address the ‘Internet of Things’? Not with traditional storage arrays…