

Singing the lowest note…

The problem with many discussions in IT is that they rapidly descend into something that looks and feels like a religious debate, whereas reality is much more complex; the good IT specialist develops their own syncretic religion and pinches bits that work from everywhere.

One of the realities for many of us working in Enterprise IT is that our houses have many rooms and must house many differing belief systems; the one true way is not a reality. And any organisation more than fifteen years old has probably built up a fair amount of incompatible dogma.

For all the pronouncements of the clouderatti, we are simply not in a position to move wholesale to the Cloud in any of its many forms. We have applications that are simply not designed for scale-out; they are certainly not infrastructure-aware and none of them are built for failure. But we also have a developer community who want to push ahead, use the language du jour and utilise cloud-like infrastructure, dev-ops and software-defined everything.

So what do we in the infrastructure teams do? Well, we are going to have to implement multiple infrastructure patterns to cater for the demands of all our communities. But we really don’t want to build everything bespoke and we certainly don’t want to lock ourselves into anything.

Many of the hyper-converged plays lock us into one technology or another; hence we are starting to look at building our own rack-converged blocks to give us lowest common denominator infrastructure that can be managed with standard tools.

Vendors with unique features are sent packing; we want to know why you are better at the 90%. Features will not sell; if I can’t source a feature or function from more than one vendor, I probably will not use it. Vendors who do not play nice with other vendors, vendors who insist on doing it all and make this their lock-in, are not where it’s at.

On top of this infrastructure, we will start to layer on the environments to support the applications. For some applications this will be cloudy and fluffy; we will allow a lot more developer interaction with the infrastructure and it will feel a lot closer to dev-ops.

For others, where it looks like a more traditional approach is required (think of those environments that need a robustly designed SAN or traditional fail-over clustering), we’ll be a lot more prescriptive about what can be done.

But all of these will sit on a common, reusable infrastructure that will allow us to meet the demands of the business. This infrastructure can be deployed quickly but also decommissioned and moved away from quickly; it will not require us to train our infrastructure teams in depth to take advantage of some unique feature.

Remember to partner well with us but also with your competitors; yes, it sometimes makes for an amusing conversation about how rubbish the other guy is but we’ll also have exactly that same conversation about you.

Don’t just pay lip-service to openness; be prepared to show us evidence.

ESXi Musings…

VMware need to open-source ESXi and move on; by open-sourcing ESXi, they could start to concentrate on becoming the dominant player in the future delivery of the 3rd platform.

If they continue with the current development model for ESXi, their interactions with the OpenStack community and others will always be treated with slight suspicion. And their defensive moves with regard to VIO, made to keep the faithful happy, will not stop larger players abandoning them for more open technologies.

A full open-sourcing of ESXi could bring a new burst of innovation to the product; it would allow the integration of new storage modules, for example. Some will suggest that they just need to provide a pluggable architecture, but that will inevitably leave people with the feeling that preferential access is given to core partners such as EMC.

The reality is that we are beginning to see more and more companies running multiple virtualisation technologies. If we throw containerisation into the mix, within the next five years we will see large companies running three or four virtualisation technologies to support a mix of use-cases, and the real headache of how we manage these will begin.

I know it is slightly insane to even be talking about having more virtualisation platforms than operating systems, but most large companies are running at least two virtualisation platforms and many are probably already at three (they just don’t realise it). This ignores those running local desktop virtualisation, by the way.

The battle for dominance is shifting up the stack as the lower layers become ‘good enough’…vendors will need to find new differentiators…

An Opening is needed

As infrastructure companies like EMC try to move to a more software-oriented world, they are having to try different things to win our business. A world where tin is not the differentiator, and where they are competing head-on with open-source, means that they are going to have to take a more open-source-like approach. Of course, they will argue that they have been moving this way with some of their products for some time, but those have tended to sit outside their key infrastructure market.

The only way I can see products like ViPR in all its forms gaining any kind of penetration will be for EMC to actually open-source it; there is quite a need for a ViPR-like product, especially in the arena of storage management, but it is far too easy for their competitors to ignore it and subtly block it. So for it to gain any kind of traction, it’ll need open-sourcing.

The same goes for ScaleIO which is competing against a number of open-source products.

But I really get the feeling that EMC are not quite ready for such a radical step; so perhaps the first step will be a commercial free-to-use licence. None of this mealy-mouthed ‘free-to-use for non-production workloads’, but a proper ‘you can use this and you can put it into production at your own risk’ type of licence. If it breaks and you need support, these are the places you can get support; but if it really breaks and you *really* need to pick up the phone and talk to someone, then you need to pay.

It might be that if you want the pretty interface you need to pay, but I’m not sure about that either.

Of course, I’m not just bashing EMC; I still want IBM to take this approach with GPFS. Stop messing about: the open-source products are beginning to be good enough for much of what we do, certainly outside of some core performance requirements. Ceph, for example, is really beginning to pick up some momentum, especially now that RedHat have bought Inktank.

More and more, we are living with infrastructure and infrastructure products that are good enough. The pressure on costs continues for many of us, and hence good enough will do; we are expected to deliver against tighter budgets and tight timescales. If you can make it easier for me, for example by allowing my teams to start implementing without a huge upfront price negotiation, the long-term sale will have less friction. If you allow customers, to all intents and purposes, to use your software like open-source (because, to be frank, most companies who utilise open-source are not changing the code and couldn’t care less whether the source is available), you will find that this plays well in the long term.

The infrastructure market is changing; it becomes more of a software play every week. And software is a very different play to infrastructure hardware…

Hype Converges?

In a software-defined data-centre, why are some of the hottest properties hardware platforms? Nutanix and Simplivity are two examples that spring to mind: highly converged, sometimes described as hyper-converged, servers.

I think it demonstrates what a mess our data-centres have got into that products such as these have any kind of attraction. Is it the case that we have built processes that are so slow and inflexible that a hardware platform resembling nothing so much as a games-console for virtualisation has an appeal?

Surely the value has to be in the software; so have we got so bad at building out data-centres that it makes sense to pay a premium for a hardware platform? And there is certainly a large premium for some of them.

Now I don’t doubt that deployment times are quicker, but my real concern is why we have got into this situation. It seems that the whole infrastructure deployment model has collapsed under its own weight. But is the answer expensive converged hardware platforms?

Perhaps it is time to fix the deployment model and deploy differently because I have a nasty feeling that many of those people who are struggling to deploy their current infrastructure will also struggle to deploy these new hyper-converged servers in a timely manner.

It really doesn’t matter how quickly you can rack, stack and deploy your hypervisor if it takes you weeks to cable it to talk to the outside world, or to give it an IP address or even a name!

And then the questions will be asked….you couldn’t deploy the old infrastructure in a timely manner; you can’t deploy the new infrastructure in a timely manner even if we pay a premium for it….so perhaps we will give public cloud a go.

Most of the problems in the data-centre at present are not technology; they are people and, mostly, process. And I don’t see any hardware platform fixing those quickly…

Announcement Ennui

Despite my post of approval about IBM’s V7000 announcements, there’s a part of me that wonders who the hell really cares now. The announcements from IBM, HDS, NetApp and the inevitable EMC announcement later in the year just leave me cold. The storage array is nearly done as a technology; are we going to see much more in the way of genuine innovation?

Bigger and faster is all that is left.

Apart from that, we’ll see incremental improvements in reliability and serviceability. It does seem that the real storage innovation is going to be elsewhere or down in the depths and guts; practically invisible to the end-user and consumer.

So, things I expect to see in the traditional array market include a shift from a four-to-five-year refresh cycle for centralised storage arrays to a six-to-seven-year cycle; centralised arrays are getting so large that migration is a major undertaking and consumes an increasingly large part of an array’s useful life. We will see more arrays offering data-in-place upgrades: replacement of the storage controllers as opposed to the back-end disk.

An Intel-based infrastructure based on common commodity components means that the internal software should be able to run on more generations of any given hardware platform.

We’ll also see more work on alternatives to the traditional RAID constructs: declustered, distributed and micro-RAIDs.

There are going to be more niche devices and some of the traditional centralised storage array market is going to be taken by the AFAs but I am not sure that market is as large as some of the valuations suggest.

AFAs will take off once we have price parity with spinning rust; until then they’ll replace a certain amount of tier-1 and find some corner-cases, but that is not where the majority of the world’s data is going to sit.

So where is the bulk of the storage market going?

Object storage is a huge market but it is mostly hidden; the margins are wafer-thin when compared to the AFAs and it does not directly replace the traditional centralised array. It is also extremely reliant at the moment on application support. (Yes, I know everyone…I’m a broken record on this!).

The bulk of the storage market is going to be the same as it is now…DAS, but in the form of ServerSAN. If this is done right, it will solve a number of problems and remove complexity from the environment. DAS will become a flexible option and no longer just tiny silos of data.

DAS means that I can get my data as close to the compute as possible; I can leverage new technologies in the server, such as Diablo Memory Channel Storage, but with ServerSAN I can also scale out and distribute my data. That East/West connectivity starts to become very important though.

Storage refreshes should be as simple as putting in a new ServerSAN node and evacuating an older node. Anyone who has worked with some of the cluster file-systems will tell you that this has become easy; so much easier than the traditional SAN migration. There is no reason why a ServerSAN migration should not be as simple.
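
Here’s a toy sketch of the idea in Python; the node names, chunk count and replica factor are made up, and real ServerSAN products obviously do this with proper placement rules, throttling and rebuild logic, but the shape of the operation is the same: mark the old node out, re-home its data, retire it.

```python
# Illustrative sketch only: a toy model of draining a ServerSAN node.
# Node names, chunk counts and the replica factor are invented; real
# products do this with proper placement rules, throttling and so on.
import random

REPLICAS = 2

def build_cluster(nodes, chunks):
    """Place each chunk's replicas on distinct nodes."""
    return {c: random.sample(nodes, REPLICAS) for c in range(chunks)}

def drain(placement, nodes, old_node):
    """Evacuate old_node: re-home each of its replicas on a surviving node."""
    survivors = [n for n in nodes if n != old_node]
    for homes in placement.values():
        if old_node in homes:
            candidates = [n for n in survivors if n not in homes]
            homes[homes.index(old_node)] = random.choice(candidates)
    return placement

nodes = ["node1", "node2", "node3", "node4"]
placement = build_cluster(nodes, chunks=1000)
placement = drain(placement, nodes, "node1")   # retire node1
assert all("node1" not in homes for homes in placement.values())
print("node1 drained; every chunk still has", REPLICAS, "replicas")
```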

I would hope that we could move away from the artificial LUN construct and just allocate space. The LUN has become increasingly archaic and we should no longer be worrying about queue depths per LUN, for example.

There are still challenges; synchronous replication over distance, a key technology for active/active stretched clusters, is still missing from most ServerSAN products. Personally I think that this should move up the stack to the application, but developers rarely think about resilience and the other non-functional requirements.

And at some point, someone will come up with a way of integrating ServerSAN and SAN; there’s already discussion on how VSAN might be used with the more traditional arrays.

The storage array is certainly not dead yet and, much like the mainframe, it isn’t going away; but we are going to see growth in array sales slow. Business appetite for the consumption of storage will continue to grow at a crazy rate, but that storage is going to be different.

The challenge for mainstream vendors is how they deal with slowing growth and address the new problems faced by their customers. How do they address the ‘Internet of Things’? Not with traditional storage arrays…

Stretching…

So EMC have finally productised Nile and given it the wonderful name of ‘Elastic Cloud Storage’; there is much to like about it and much I have been asking for…but before I talk about what I like about it, I’ll point out one thing…

Not Stretchy

It’s not very Elastic, at least not when compared to the Public Cloud offerings, unless there is a very complicated finance model behind it; and even then it might not be that Elastic. One of the things that people really like about Public Cloud Storage is that they pay for what they use, and if their consumption goes down…then their costs go down.

Now EMC can probably come up with a monthly charge based on how much you are using; they certainly can do capacity on demand. And they might be able to do something with leasing to allow downscaling at a financial level as well, but what they can’t easily do is take storage away on demand. So those 5 petabytes will be on premises and using space; they will also need maintaining, even if the disks spin down to save power.

Currently EMC are stating a 9%-28% lower TCO than Public Cloud…it needs to be. Also, that is today; Google and Amazon are fighting a price war. Can EMC play in that space and react quickly enough? They claim that they are cheaper after the last round of price cutting, but after the next?
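
To illustrate the worry, here is a back-of-the-envelope sketch; every number in it is my own assumption, not an EMC or cloud provider figure, but it shows how a fixed on-premise advantage can evaporate if cloud prices keep being cut over the life of the kit.

```python
# Back-of-the-envelope only: every number here is a hypothetical assumption.
# The point: a fixed on-premise cost advantage erodes if public cloud
# prices keep falling over the life of the kit.
onprem_per_gb_month = 0.020       # assumed flat for the term
cloud_per_gb_month = 0.026        # assumed public cloud price today
annual_cloud_price_cut = 0.15     # assume a 15% cut each year

for year in range(1, 5):
    advantage = 1 - onprem_per_gb_month / cloud_per_gb_month
    print(f"Year {year}: on-prem advantage {advantage:+.0%}")
    cloud_per_gb_month *= 1 - annual_cloud_price_cut
```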

So it’s not as Elastic as Public Cloud and this might matter…unless they are relying on the fact that storage demands never seem to go away.

Commodity

I can’t remember when I started writing about commodity storage and the convergence between storage and servers, be it roll-your-own or vendors starting to do something very similar; ZFS really sparked a movement that looked at storage and asked why we need the big vendors like EMC, NetApp, HDS and HP.

Yet there was always the thorny issue of support, and for many of us it was a bridge too far. In fact, it actually started to look more expensive than buying a supported product…and we quite liked sleeping at night.

But there were some interesting chassis out there that really started to catch our eyes and even our traditional server vendors were shipping interesting boxes. It was awfully tempting.

And so I kept nagging the traditional vendors…

Many didn’t want to play or were caught up in their traditional business. Some didn’t realise that this was something that they could do and some still don’t.

Acquisition

The one company that had the most to lose from a movement to commodity storage was EMC; really, this could be very bad news. There’s enough ‘hate’ in the market for a commodity movement to get some real traction. So they bought a company that could enable commoditisation of storage at scale; I think at least some of us thought that would be the end of that, or that it would disappear down a rabbit hole to resurface as an overpriced product.

And the initial indications were that it wasn’t going to disappear but it was going to be stupidly expensive.

Also, getting EMC to talk sensibly about Scale-IO was a real struggle, but the indication was that it was a good but expensive product.

Today

So what EMC have announced at EMC World is kind of surprising, in that it looks like they may well be willing to rip the guts out of their own market. We can argue about the pricing and the TCO model, but it looks a good start; street prices and list prices have a very loose relationship. The four-year TCO they are quoting needs to drop by a bit to be really interesting.

But the packaging, and the option to deploy on your own hardware (although this is going to be from a carefully controlled catalogue, I guess), is a real change from EMC. You will also notice that EMC have got into the server game; a shot across the bows of the converged players?

And don’t just expect this to be a content dump; Scale-IO can do serious I/O if you deploy SSDs.

Tomorrow

My biggest problem with Scale-IO is that it breaks EMC; breaks them in a good way, but it’s a completely different sales model. For large storage consumers, an Enterprise Licence Agreement with all you can eat, deployed onto your chosen commodity platform, is going to be very attractive. Now the ELA might be a big sum, but as a per-terabyte cost it might not be so big; and the more you use, the cheaper it gets.
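
A toy illustration of that last point, with an entirely hypothetical ELA figure:

```python
# Toy numbers only, to show why a flat ELA changes the economics:
# the effective per-terabyte cost falls as you deploy more under it.
ela_cost = 500_000                 # hypothetical all-you-can-eat licence fee
for deployed_tb in (250, 500, 1000, 2000):
    print(f"{deployed_tb:>5} TB deployed -> ${ela_cost / deployed_tb:,.0f} per TB")
```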

And Old EMC might struggle a bit with that. They’ll probably try to sell you a VMAX to sit behind your ViPR nodes.

Competitors?

RedHat have an opportunity now with Ceph; especially amongst those who hate EMC for being EMC. IBM could do something with GPFS. HP have a variety of products.

There are certainly smaller competitors as well.

And then there’s VMware with VSAN; which I still don’t understand!

There’s an opportunity here for a number of people…they need to grasp it and compete. This isn’t going to go away any more.

All The Gear

IBM are a great technology company; they truly are great at technology, and so many of the technologies we take for granted can be traced back to them. And many of today’s implementations are still poorer than the originals.

And yet IBM are not the dominant force that they once were; an organisational behemoth, riven with politics and fiefdoms, doesn’t always lend itself to agility in the market and often leads to products that are undercooked and have a bit of a ‘soggy bottom’.

I’ve been researching the GSS offering from IBM, the GPFS Storage Server; as regular readers of this blog will know, I’m a big fan of GPFS and have a fair amount installed. But don’t think that I’m blinkered to some of the complexities around GPFS; still, it deserves a fair crack of the whip.

There’s a lot to like about GSS; it builds on the solid foundations of GPFS and brings a couple of excellent new features into play.

GPFS Native RAID, also known as declustered RAID, is a software implementation of micro-RAID: RAID is done at a block level as opposed to a disk level. This generally means that the cost of rebuilds can be reduced and the time to get back to a protected level can be shortened. As disks continue to get larger, conventional RAID implementations struggle, and you can be looking at hours if not days to get back to a protected state.
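
A rough back-of-the-envelope illustration of why declustering matters; the drive size, rebuild rate and array width below are my assumptions, not GSS specifications.

```python
# Rough illustration only; the figures are assumptions, not GSS specifications.
# Conventional RAID rebuilds are throttled by the single spare drive being
# written; a declustered layout spreads the rebuild across every surviving
# drive in a much wider array.
disk_tb = 4
rebuild_mb_s = 50                 # assumed sustained rebuild rate per drive

# Traditional 8+2 RAID-6: one spare absorbs the entire rewrite.
traditional_hours = disk_tb * 1e6 / rebuild_mb_s / 3600

# Declustered array: the same data is rebuilt in parallel across, say,
# 57 surviving drives.
surviving = 57
declustered_hours = disk_tb * 1e6 / (rebuild_mb_s * surviving) / 3600

print(f"traditional rebuild: ~{traditional_hours:.0f} hours")    # ~22 hours
print(f"declustered rebuild: ~{declustered_hours:.1f} hours")    # ~0.4 hours
```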

Disk Hospital: by constantly monitoring the health of the individual disks and collecting metrics for them, the GSS can detect failing disks very early on. And there is a dirty secret in the storage world: most disk failures in a storage array are not really failures at all and can be recovered from simply; a power-cycle or a firmware reflash can be enough to prevent a failure and avoid going into a recovery scenario.
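
As a naive sketch of the idea only (this is not how the GSS Disk Hospital is implemented), a monitoring loop along these lines would watch a couple of SMART counters and flag a drive for intervention before it hard-fails; the device list and thresholds here are invented for the example.

```python
# A naive sketch of the idea only; this is not how the GSS Disk Hospital
# works internally. Poll SMART data and flag drives whose reallocated or
# pending sector counts are creeping up before they hard-fail. Device list
# and thresholds are invented for the example.
import re
import subprocess

DEVICES = ["/dev/sda", "/dev/sdb"]
WATCHED = {"Reallocated_Sector_Ct": 50, "Current_Pending_Sector": 10}

def smart_attributes(device):
    """Return {attribute_name: raw_value} parsed from 'smartctl -A'."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    attrs = {}
    for line in out.splitlines():
        m = re.match(r"\s*\d+\s+(\S+)\s+.*\s(\d+)\s*$", line)
        if m:
            attrs[m.group(1)] = int(m.group(2))
    return attrs

for dev in DEVICES:
    attrs = smart_attributes(dev)
    for name, limit in WATCHED.items():
        if attrs.get(name, 0) > limit:
            # A real system might try a power-cycle or firmware reflash here
            # before declaring the drive dead and triggering a rebuild.
            print(f"{dev}: {name}={attrs[name]} exceeds {limit}; admit to hospital")
```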

X-IO have been advocating this for a long time; this can reduce maintenance windows and prevent unnecessary rebuilds. It should reduce maintenance costs as well.

Both of these technologies are great and very important to a scalable storage environment.

So why aren’t IBM pushing GSS in general? It’s stuffed full of technology and useful stuff.

The problem is GPFS…GPFS is currently too complicated for many; it’s never going to be a general-purpose file system. The licensing model alone precludes that, so if you want to utilise it with a whole bunch of clients, you are going to be rolling your own NFS/SMB 3.0 gateway. Been there, done that…still doing that, but it’s not really a sensible option for many.

If IBM really want the GSS to be a success, they need a scalable and supported NAS gateway in front of it; it needs to be simple to manage, it needs integration with the various virtualisation platforms, and they need to simplify the GPFS licence model…when I say simplify, I mean get rid of the client licence cost.

I want to like the product and not just love the technology.

Until then…IBM have got all the gear and no idea….

VSANity?

So VSAN is finally here in a released form; on paper, it sure looks impressive but it’s not for me.

I spend an awful lot of time looking at Scale-Out Storage systems; looking at ways to do them faster, cheaper and better. And although I welcome VMware and VSAN to the party, I think that their product falls some way short of the mark. But then I don’t think that I’m really the target market; it’s not really ready or appropriate for Media and Entertainment or anyone interested in HyperScale.

But even so I’ve got thoughts that I’d like to share.

So VSAN is better because it runs in the VMware kernel? This seems logical, but it has tied VSAN to VMware in a way that some of the competing products are not; if I want to run a Gluster cluster which encompasses not just VMware but also Xen, bare metal and anything else, I could. And there might be some excellent reasons why I would want to do so; I might transcode on bare-metal machines, for example, but present out on virtualised application servers. Of course, it is not only Media and Entertainment who have such requirements; there are plenty of other places where heavy lifting is better done on bare metal.

I think that VMware need to be much more open about allowing third-party access to the kernel interfaces; they should allow more pluggable options, so that I could run GPFS, ScaleIO, Gluster or StorNext within the VMware kernel.

VSAN limits itself by tying itself so closely to the VMware stack; its scalability is limited by the current cluster size. Now there are plenty of good architectural reasons for doing so, but most of these are enforced by a VMware-only mindset.

But why limit it to only 35 disks per server? An HP ProLiant SL4540 takes 60 disks and there are SuperMicro chassis that take 72. Increasing the spindle count increases not only the maximum capacity but also the raw IOPS of the solution. Of course, there might be some saturation issues with regards to the inter-server communication.
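
Some purely illustrative arithmetic on the spindle-count point; the per-spindle figure is a nominal assumption for 10k drives, not a VSAN or vendor specification.

```python
# Purely illustrative arithmetic; the per-spindle figure is a nominal
# assumption for 10k drives, not a VSAN or vendor specification.
iops_per_spindle = 150

for label, spindles in [("35-disk VSAN node", 35),
                        ("HP SL4540 (60 disks)", 60),
                        ("72-bay SuperMicro", 72)]:
    print(f"{label}: ~{spindles * iops_per_spindle:,} raw IOPS per node")
```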

Yet I do think it is interesting how the converged IT stacks are progressing and how the approaches differ; VMware itself is pretty much a converged stack now, but it is a software-converged stack, whereas VCE and Nutanix converge onto hardware as well. And yes, VMware is currently the core of all of this.

I actually prefer the VMware-only approach in many ways, as I think I could scale compute and storage separately within some boundaries; though I’m not sure what the impact of having unbalanced clusters would be on VSAN, or whether it would make sense to have some Big Flipping Dense VSAN appliances rather than distributing the storage equally across the nodes.

But VSAN is certainly welcome in the market; it certainly validates the approaches being taken by a number of other companies…I just wish it were more flexible and open.

IT’s choking the life out of me.

I’ve been fairly used to the idea that my PC at home is substantially better than my work one; this has certainly been the case for me for more than a decade. I’m a geek and I spend more than most on my personal technology environment.

However, it is no longer just my home PC; I’ve got better software tools and back-end systems; my home workflow is so much better than my work workflow; it’s not even close. And the integration with my mobile devices, it’s a completely different league altogether. I can edit documents on my iPad, my MBA, my desktop, even my phone and they’ll all sync up and be in the same place for me. My email is a common experience across all devices. My media; it’s just there.

With the sole real exception of games, it doesn’t matter which device I’m using to do stuff.

And what is more; it’s not just me; my daughter has the same for her stuff as does my wife. We’ve not had to do anything clever, there’s no clever scripting involved, we just use consumer-level stuff.

Yet our working experience is so much poorer; if my wife wants to work on her stuff for her job, she’s either got to email it to herself or use ‘GoToMyPC’ provided by her employer.

Let’s be honest, for most of us now…our work environment is quite frankly rubbish. It has fallen so far behind consumer IT, it’s sad.

It’s no longer the technology enthusiast who generally has a better environment…it’s almost everyone who has access to IT. And not only that, we pay a lot less for it than the average business.

Our suppliers hide behind a cloak of complexity; I’m beginning to wonder if IT as it is traditionally understood by business is no longer an enabler, it’s just a choke-point.

And yes there are many excuses as to why this is the case; go ahead…make them! I’ve made them myself but I don’t really believe them any more…do you?

Disrupt?

So you’ve founded a new storage business; you’ve got a great idea and you want to disrupt the market? Good for you…but you want to maintain the same-old margins as the old crew?

So you build it around commodity hardware; the same commodity hardware that I can buy off the shelf, basically the same disks that I can buy from PC World or order from my preferred Enterprise tin-shifter.

You tell me that you are lean and mean? You don’t have huge sales overheads, no huge marketing budget and no legacy code to maintain?

You tell me that it’s all about the software but you still want to clothe it in hardware.

And then you tell me it’s cheaper than the stuff that I buy from my current vendor? How much cheaper? 20%, 30%, 40%, 50%??

Then I do the calculations; your cost base and your BoM are much lower, and you are actually making more money per terabyte than the big old company that you used to work for?

But hey, I’m still saving money, so that’s okay….

Of course, then I dig a bit more…I want support? Your support organisation is tiny; I do my due diligence; can you really hit your response times?

But you’ve got a really great feature? How great? I’ve not seen a single vendor come up with a feature so awesome and so unique that no-one manages to copy it…and few that aren’t already in a lab somewhere.

In a race to the bottom; you are still too greedy. You still believe that customers are stupid and will accept being ripped off.

If you were truly disruptive….you’d work out a way of articulating the value of your software without clothing it in hardware. You’d work with me on getting it onto commodity hardware and no I’m not talking about some no-name white-box; you’d work with me on getting it onto my preferred vendor’s kit; be it HP, Dell, Lenovo, Oracle or whoever else…

For hardware issues, I could utilise the economies of scale and the leverage I have with my tin-shifter; you wouldn’t have to set up a maintenance function or sub-contract it to some third party who will inevitably let us both down.

And for software support; well, you could concentrate on that…

You’d help me be truly disruptive…and ultimately we’d both be successful…