The Information Business

It strikes me that any industry or profession in the business of providing information, or access to information, is currently in an incredible state of flux and change. My friend Phil Bradley, who blogs on where librarians and the internet meet, has written a number of books on Internet search, and has been involved in information provision since he studied librarianship many years ago, discusses this a lot on his blog and on his twitter feed. And although he claims to read my blog and not understand a word of it, I think that our two professions have much in common and could well become much closer over the coming years.

Librarians have historically been the gatekeepers to knowledge and information; not only that, they have curated and preserved much of the world's knowledge. But as we have moved away from physical representations of that knowledge towards storing it as bits and bytes, some of that responsibility has passed into the hands of the humble storagebod. Just look at SNIA's 100 Year Archive initiative; if that doesn't have aspects of the librarian as curator, I'm not sure what does.

And we have a common problem: our users want access to knowledge and information in a multitude of different ways; they want to consume information without the gatekeeper, and instead of being the gatekeeper, we need to be the enabler. Neither of us can any longer tell our users how they should consume information; we need to listen to them and provide it in the form that they want. Those forms are changing all the time, be it video, audio, XML data, the printed word or a web portal; almost any form that you can imagine.

The nature of information is also changing, with some dramatic impacts: information is real-time, and there is no longer any real lag between its creation and its publication. The data-sets behind the information change all the time; yet provenance becomes ever more important.

You can't simply replace a librarian with a search engine and let the world get on with it. Let's take Wikipedia: it's not a bad source of information as long as you accept that it can often be wrong and that, sometimes, the information put on it is designed to be deliberately wrong. Yes, it is peer-reviewed, but only after publication; if you were unfortunate enough to read an entry which had just been erroneously updated and cite it without checking, you could well have just cited a load of the proverbial.

Much academic publication is now done electronically on the web, but how does the humble student determine which is a good source and which is not? Citation (and not just of Wikipedia) is also becoming very dangerous; information moves and vanishes all the time. A perfectly good citation could simply change URL and throw a premise into the realms of conjecture; how do you cite a particular version of a website?
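
One pragmatic answer (my own sketch, not anything the librarians have standardised on) is to cite a dated snapshot rather than the live page. The Internet Archive's Wayback Machine exposes a public availability endpoint for exactly this; the little Python helper below, whose function name and usage are purely illustrative, asks it for the snapshot closest to a given date:

```python
import json
import urllib.parse
import urllib.request

def closest_snapshot(url, timestamp):
    """Return the Wayback Machine snapshot URL nearest to `timestamp`
    (YYYYMMDDhhmmss format), or None if the page was never archived."""
    query = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    with urllib.request.urlopen(
            "https://archive.org/wayback/available?" + query) as response:
        data = json.load(response)
    closest = data.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None

# Cite the page as it stood on a given date, not as a moving target.
print(closest_snapshot("example.com", "20090101000000"))
```

A snapshot URL at least gives the reader the page as the author saw it, rather than whatever it has since become.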

The data-sets that our businesses run on are updated in real-time, but too often business decisions are made using versions of the data-set which are disconnected from the source and could be months out of date. There are various reasons for this, but sometimes it is simply because we make it too hard to take a real-time feed of the information the business requires; so decisions get made on information which could be basically hogwash. Yet we also need to ensure that the information provided is correct and not corrupted in some way; it is not enough to ensure that only the right people have access to the information, we must also ensure that only the right people can change it.
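
On the corruption point at least, storage folk have a simple and well-worn tool: record a cryptographic digest when a copy of the data-set is taken, and verify it before anyone leans on that copy. A minimal sketch in Python follows; the file name and workflow here are hypothetical, while hashlib itself is the standard library:

```python
import hashlib

def sha256_of(path):
    """Stream the file through SHA-256 so large data-sets
    never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Recorded when the extract was taken from the source system...
recorded = sha256_of("dataset_extract.csv")

# ...and checked again months later, before the big decision is made.
if sha256_of("dataset_extract.csv") != recorded:
    raise RuntimeError("this copy no longer matches the original extract")
```

Of course, a matching digest only proves the copy hasn't been corrupted or quietly changed; it says nothing about whether the data is still current, which is exactly why the real-time feed matters.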

The provision of information in a timely manner, and in the way that our users want, is a challenge both for the librarian and the storagebod; perhaps, at the end of the day, we are all just Information Professionals. I'm sure that we have much to learn from each other; after all, they've been doing it for thousands of years.


One Comment

  1. InsaneGeek says:

    The place I work for does just this (content aggregation for the education & research markets). We have been selling content to libraries for decades and are now, on the fringes, in a way competing with Google & Wikipedia. At this point we are still in a good position because, as you mention, the data that is publicly available for free off these sites is not always accurate. So we buy, or pay royalties for, correct and unique source data and charge a subscription for it. There is an interesting question: is there a tipping point similar to big iron Unix vs Linux and its "not as good but good enough"? Some people say yes, free public data will get good enough that inaccuracies won't matter that much; others say no, reference data is either 100% accurate all the time or it isn't, and the incentive for free data to be that accurate is cost prohibitive.
    We are also grappling with the concept of permanent archive data. I had pretty much this exact conversation a few months ago about maintaining data for hundreds of years: for data that we might have received digitally, how do you maintain a pristine copy forever without it being horribly costly?
