
100% Virtualised? Let’s try for 99%

A lot of posts and talks from people involved with VMware, especially when we start talking about the Private Cloud, refer to 100% virtualised data centres. And there are always nay-sayers like me who point out that there are niche applications which currently can't be virtualised. These include applications which run on specialist hardware and applications which have real-time requirements; in my world of Broadcast Media, these are often one and the same.

But there is a whole bunch of other applications, often niche and often from small vendors, which can't be virtualised for no other reason than that the vendor says they can't. And the reason? It's not been tested; often the applications have very restrictive hardware requirements which are basically dictated by the vendor's ability to test against multiple hardware variants, and VMware (and every other virtualisation technology) is really just another hardware variant. I have a whole bunch of these where people swear blind that they can't be virtualised. I don't believe them.

So I'm going to have a go; fortunately, as well as starting to build a new storage team, I have another job which involves running a test and integration department. Hence I already have the test cases etc. for a lot of these apps, so it should just be a case of opportunistically running those tests against a non-virtualised and a virtualised environment and seeing the differences. It's going to be a case of fitting it in when we can, but we've managed to scrounge some fairly meaty hardware to build our new virtual environment on.
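The comparison itself doesn't need to be sophisticated. A minimal sketch along these lines (the result files and the 'elapsed_secs' field are hypothetical, standing in for whatever the test harness actually produces) would be enough to flag where a virtualised run diverges from bare metal:

```python
import csv

# Hypothetical CSV exports from the test harness: one row per test case,
# with a test name and an elapsed time in seconds.
def load_results(path):
    with open(path) as f:
        return {row["test"]: float(row["elapsed_secs"]) for row in csv.DictReader(f)}

baremetal = load_results("results_baremetal.csv")
virtual = load_results("results_virtual.csv")

# Report any test that runs noticeably slower (>10%) on the virtualised kit.
for test, bare_time in sorted(baremetal.items()):
    if test in virtual:
        delta = (virtual[test] - bare_time) / bare_time
        if delta > 0.10:
            print(f"{test}: {delta:.0%} slower virtualised "
                  f"({bare_time:.1f}s -> {virtual[test]:.1f}s)")
```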

I still don't think you can virtualise everything, especially in an environment which has specialist requirements. In the same way that it would be very hard for some environments to get rid of their mainframes, it will be hard for some environments to get rid of all the non-virtualised stuff and replace all their non-x86 hardware with x86. But with some work, we might be able to get rid of more than we can today.

An Exercise in Utility

EMC and VMware's coming together with Cisco is an exercise in Utility. If we take Nick Carr's analogy of comparing utility computing with the power-generation industry, the VCE alliance could be seen as an attempt to define a de-facto standard for the 'compute unit'; an attempt, even, to define what voltage the Cloud should run at.

This is not necessarily a bad thing, and there will come a time when we do need a standard for the 'compute unit'; even a de-facto unit isn't necessarily a bad thing. De-facto standards happen all the time: the processor has almost become a de-facto standard in the form of the Intel chip, and the desktop operating system standard is pretty much Windows (and this from a Linux/MacOS fan).

Around these 'standards', an industry has been built and thrives. And where there are standards in computing, there are dissenting voices; and where there are dissenting voices, little industries spring up and thrive in their niches.

But considering where we are in the development of cloud computing, and especially the infrastructure-as-a-service play, this is arguably a bold and very risky move. Much of what is being offered is, at least behind the scenes, the proverbial swan: 'graceful and elegant on the top, with little legs paddling like mad'. Perhaps this is why this coming together is in the form of a services company? It's just too hard for a currently over-worked IT department to make the technology play nicely together?

Data Loss – nothing new here!

One of the forerunners of information storage and retrieval has, after a long-running saga of failures including security breaches, data theft, mis-indexing and general maladministration (not to mention corruption and incompetence), finally closed its doors today in dramatic fashion, resulting in the loss of a great proportion of the world's knowledge. Fortunately there were some distributed back-ups, but sadly some data has been lost forever.

When the Royal Library in Alexandria burnt down, a huge loss to the world's knowledge was felt, but nobody decided that the concept of centrally stored information in the form of libraries was fatally flawed. It was also fortunate that there were other libraries, even within the great city of Alexandria, which ensured that some books survived. It would have been better if there had been a policy of copying all scrolls and ensuring that these were distributed throughout the world, but the concept of libraries is still with us and no-one is suggesting that libraries are a bad idea.

The recent catastrophic failure of the Sidekick service, resulting in thousands of people losing their personal data, should not be seen as a failure of the Cloud; it's not! It's the failure of a centralised service which was apparently run by incompetents!

It is yet another lesson that if you only have a single copy of your data, you might as well have no copies of your data. So if you are archiving and deleting, you had better make sure that you have two copies of the archive, or at least the ability to recreate that data. Read your SLAs and ask questions about the data-protection policies of anyone who is looking after your data, both internal and external providers.

That reminds me: I'd better take a backup of my blog and back up my Gmail account as well.
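For what it's worth, pulling a copy of a Gmail account down over IMAP doesn't take much. A minimal sketch, assuming IMAP access is enabled on the account and using placeholder credentials, might look like this:

```python
import imaplib

# Placeholder credentials; IMAP access must be enabled in the Gmail settings.
USER = "me@gmail.com"
PASSWORD = "app-password"

conn = imaplib.IMAP4_SSL("imap.gmail.com")
conn.login(USER, PASSWORD)
conn.select('"[Gmail]/All Mail"', readonly=True)

# Fetch every message and write the raw RFC 822 source to disk.
_, data = conn.search(None, "ALL")
for num in data[0].split():
    _, msg_data = conn.fetch(num, "(RFC822)")
    with open(f"backup_{num.decode()}.eml", "wb") as f:
        f.write(msg_data[0][1])

conn.logout()
```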

What a Fraud!!

Sometimes when I'm writing this blog, I feel a complete fraud! Actually I suspect most of the time that I am posting complete nonsense. Many of the technologies that I post on, I will never use in my current role.

1) Virtualisation – neither storage nor server virtualisation impinges on my world, and it's currently hard to see that they might. Digital media production doesn't lend itself to virtualisation at present; even the flag-wavers admit that they are not ready for such environments.

2) Deduplication – digital media, enough said. Although Ocarina might have a play here, digital media is the one place where deduplication doesn't really help.

3) Cloud – if we are not ready for Virtualisation, then we are not ready for Cloud. Even Cloud Storage currently has a limited presence in the large, performance-oriented media space.

But even so, much of what is talked about still has relevance to me in what I do. I don't need to virtualise to utilise many of the processes and procedures which are being talked about and developed. At times we all get caught up in the technology and believe that it is the technology that is something special, but really it is only half the story.

Even if you don't see yourselves using the technology, you might find many of the processes and concepts relevant to you; for example, I may not use the general purpose Cloud Storage being provided by many people but I am still interested in how to build meta-data and what meta-data should be.

Arguably, a big chunk of what I am working on at the moment is building a specialised media-cloud which will serve content to a huge variety of applications. If all these applications could talk to the media cloud in a standard way, life would be a lot easier (they don't), so I'm interested in the standards which are being discussed.

Virtualisation? Actually, in the media cloud we will have very few servers, but there may be times when we have to scale very quickly. So I can learn a lot from the virtualisation guys about rapid provisioning.

So I'll keep posting on things that I know nothing about and hope that you guys don't catch on!

Economic Truth

Steve Duplessie posts on Cloud economics, and especially the economics of Cloud Storage: 20TB of storage from Amazon's S3 cloud will cost you around $36,000 a year, and that doesn't necessarily compare especially well with purchasing your own array. So do the economics of Cloud storage scale, especially when we start talking about the hundreds of terabytes of storage which many enterprises consume?
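For the record, that figure falls straight out of a back-of-the-envelope calculation, assuming the roughly $0.15 per GB per month S3 list rate of the time and ignoring request and data-transfer charges:

```python
# Back-of-the-envelope only: assumes roughly $0.15/GB/month S3 list pricing
# (circa 2009) and ignores request and data-transfer charges.
tb = 20
price_per_gb_month = 0.15

monthly = tb * 1024 * price_per_gb_month   # roughly $3,072 a month
annual = monthly * 12                      # roughly $36,864 a year

print(f"${monthly:,.0f} per month, ${annual:,.0f} per year")
```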

The problem is that we don't really know how much it costs us per terabyte in total! There are no good published TCO models for storage; this is why the initiative started by Nick Pearce and Ian is so important. Go and read their blogs here and here on building a TCO model for storage; let's get this thing crowd-sourced and perhaps we can make the TCO of storage a little less cloudy.
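To give a flavour of what such a model needs to capture, here is a deliberately crude sketch. Every number in it is an illustrative assumption, and a real crowd-sourced model would have far more inputs (power, cooling, floor space, migrations, licensing, people):

```python
# Illustrative numbers only; a real TCO model needs many more inputs.
usable_tb = 100
purchase_price = 250_000      # array, disks, licences (assumed)
years = 4                     # depreciation / refresh cycle (assumed)
annual_maintenance = 25_000   # vendor support contract (assumed)
annual_power_cooling = 8_000  # power, cooling, floor space (assumed)
annual_admin = 20_000         # fraction of a storage admin's time (assumed)

annual_cost = (purchase_price / years
               + annual_maintenance
               + annual_power_cooling
               + annual_admin)

print(f"~${annual_cost / usable_tb:,.0f} per usable TB per year")
```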

And then we can start on the model for Cloud TCO…public vs private etc!!

Live Forever

No, not my favourite Oasis track*; it's something which I've been thinking about and a subject I've touched on a few times: once you've deployed a technology, how do you get out of it? Do you get out of it? And what impact does virtualisation have on this?

How many of us are running, or are aware of, business-critical applications running on hardware which is no longer supported by the vendor? And the reasons? Often it is that the application only runs on a particular operating system which is no longer supported and will not run on current generations of hardware.

Virtualisation may well allow you to refresh the hardware; for example, if you insist on running applications under DOS, pretty much any of the x86 hypervisors will run DOS. This will allow you to run that business-critical application on DOS for the foreseeable future and to continually refresh the hardware underneath it.

Well, it will until the hypervisor vendor calls time and decides that they no longer want to certify the hypervisor with DOS x.xx; oh well, what do you do? Obviously you shrug your shoulders and now run a back-level hypervisor with an unsupported operating system which is running an application which, by now, no-one has a clue how it works!

Oh, you've migrated the application into a Public Cloud? Well, it didn't need much in the way of resources and suited the Cloud perfectly. And now your Cloud provider has said that they are no longer supporting, or even allowing, DOS instances to run; oh heck, and now you can't get the hardware or software to run your application locally.

So although virtualisation will allow you to get away with running legacy apps for a long time, don't assume that this means they can 'Live Forever'! Virtualisation is not an excuse for skipping essential maintenance and failing to keep your estate up to date.

*that's 'Bring It On Down', just in case you were interested!

The Cry of the Grump!!

A slightly plaintive and frustrated tweet from grumpystorage aka @ianhf inspired this blog, as did replies from various members of the storage twitterati.

Ian cries the lonely cry of the infrastructure architect:

'Application architects – please know & understand your infrastructure requirements. Infrastructure isn't magic or telepathic!'

And of course, IaaS and PaaS are then held up as potential solutions to the problem. I've got bad news; they're not! Cloud Architectures are neither magic nor telepathic; they still rely on application developers and architects understanding their infrastructure requirements.

Now we in the infrastructure world can help them by educating them in the questions that they both need to ask and need to answer. Questions like: what availability does your application require? What is its throughput? Have you designed an application which scales horizontally or vertically? Infrastructure Architects have always had to work in a consultative manner and drag the requirements out of the Application teams.
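Even getting those answers written down in a structured form would be a start. A minimal sketch, with purely illustrative field names and values rather than any particular standard, might look like this:

```python
# Purely illustrative non-functional requirements record; the field names
# and example values are assumptions, not any particular standard.
app_requirements = {
    "application": "example-billing-service",
    "availability_target": "99.9%",        # how much downtime is tolerable?
    "peak_throughput_mb_per_sec": 200,     # sustained I/O the app expects
    "iops_profile": "small random reads",  # or large sequential writes, etc.
    "scaling_model": "horizontal",         # horizontal, vertical or neither
    "rpo_hours": 4,                        # how much data loss is acceptable?
    "rto_hours": 8,                        # how long can recovery take?
}

for key, value in app_requirements.items():
    print(f"{key}: {value}")
```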

All providers of infrastructure need to understand the importance of this; nothing will destroy the credibility of Cloud Architectures quicker than the inappropriate and unplanned deployment of applications into the Cloud.

I think there is a temptation to represent the Cloud as Magic Smoke where things just happen, but look at the backlash when Gmail goes down. Fortunately for Google, people have become so reliant on Gmail that, beyond a bit of moaning, few will really change; but a Corporate taking its first steps into the Cloud, which loses a key application or finds a key application under-performing, may well be far less forgiving.

Push-button deployment without the consultative input of Infrastructure Architects to guide it could end up in a world of pain, and it won't be the application that gets the blame; it will be the Infrastructure.

Unified Storage Problems?

NetApp's unified storage platform is a compelling vision for a customer: one platform to support pretty much all your storage needs. It is a powerful sell and it is still pretty much a USP for them; everyone else has to fake it by glomming storage products together and pretending.

But if we dig a little deeper, is NetApp's unified storage platform going to become a millstone? I suspect that it might; arguably it already has. Just look at the length of time it took them to get OnTap 8 out of the door and the issues that has brought; the bringing together of GX and traditional OnTap took too long and probably drained their development resources.

It might have been better if they had decided to let them live on as two separate products rather than face the painful union. Concentrating on making OnTap 8 64-bit, and ensuring things such as seamless migration from 32-bit aggregates to 64-bit aggregates, might actually have been of more value to more customers.

Having to integrate any new idea or innovation into OnTap slows time to market, because time needs to be taken to work out how to integrate it technically, how to test it, and generally how to make sure that it does not break existing functionality or at least does not have a detrimental effect.

The competitors can 'simply' build a new product line without impacting their existing code-base. Yes, they can borrow from the existing base and utilise common components; for example, the underlying base operating system may be the same custom Linux environment across their products, but they do not have to worry about a detrimental impact on what is already shipping.

And now we have object storage; is integrating object storage just a step too far for the USP? Actually, technically it should not be a huge challenge, but commercially I can see issues. NetApp's object storage is going to be the most expensive object storage in the world! They are going to be competing with commodity disk prices, and the only way that they can do this is to trash their own margins.

I think that is going to be painful. It might well be better for NetApp commercially to sell OnTap ObjectStore (not its real name, I just made that up) as a completely different product, but that would mean accepting that the Unified Storage Platform is not the best answer to what may well be a purely commercial problem. Technically and aesthetically it is very elegant, but is it practical in the long term?

Of course, EMC still have far too many storage products and their sales-team live in a general state of confusion.

Driven by Past Policy

Every now and then, I like to annoy people by pointing out that much of what we are talking about as the future in Open Systems has been done before. And today is one of those days!

I was talking to someone about policy-driven storage management, and what they described sounded awfully familiar, so I thought I'd do some quick googling and point them in the right direction. I found this series of articles on the IBM website.

It's worth reading up on System-Managed Storage; there are so many concepts within it which should strike a chord with you, especially if you are thinking about what storage is going to look like in the future. And probably, like a lot of Bods, you've never had the chance to be indoctrinated into 'the Cult of the Mainframe'.

Data Management – Industrial Light and Magic

Zilla nails it here: too often we are backing up when we should be archiving. We generate so much content which is pretty much Write Once Read Never, but it sits there just in case, getting backed up time and time again, when it should go straight into the archive or certainly be moved into the archive after a number of days. Not only will this help with your back-ups, it will save you money.

For example, if you have all your data on expensive filers with expensive software licenses (and it is the latter that is the killer, especially when the license is based on the capacity of the array rather than on how much of the licensed feature is used), it makes sense to keep the usage of that array to a minimum; so get data off it as quickly as possible and onto a lower-cost medium, be that an archive array with minimal features or tape, if you so desire.
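Even identifying the candidates doesn't have to be clever. A minimal sketch along these lines, where the share path and the 90-day threshold are purely illustrative assumptions, will find files that haven't been touched for a while and could be swept off to a cheaper tier:

```python
import os
import time

# Illustrative values only: the share path and the 90-day threshold
# would come from whatever archiving policy you actually agree on.
SHARE = "/mnt/expensive-filer/projects"
DAYS = 90
cutoff = time.time() - DAYS * 86400

candidates = []
for root, _dirs, files in os.walk(SHARE):
    for name in files:
        path = os.path.join(root, name)
        try:
            st = os.stat(path)
        except OSError:
            continue  # file vanished or is unreadable; skip it
        # Use the last access time as a crude "Read Never" indicator.
        if st.st_atime < cutoff:
            candidates.append((path, st.st_size))

total_bytes = sum(size for _path, size in candidates)
print(f"{len(candidates)} files, {total_bytes / 1024**3:.1f} GiB "
      f"not accessed in {DAYS} days")
```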

But surely this is the promise of HSM or ILM? Surely this is the thing which has been talked about for years, but which everyone agrees is too hard, the ROI doesn't stack up and so on? As Zilla points out, though, data management doesn't have to be complicated; it can be as simple as deleting what you don't use any more and archiving more. We probably need to look at the tools and continue to simplify, but data management needs to become something we talk about a lot more.

Actually, I wonder if we are going to sleepwalk into another issue with VMs; it is going to be so easy to spin up a VM for a quick piece of testing, development or whatever, and people are just going to keep them hanging around, just in case. So it won't just be files we have a problem with; it's going to be whole environments.

Perhaps we should just stop making things so easy for people!?