Friday, March 09, 2007

Too much data?


This week, Emarketer and Technology Guardian, amongst others, have been getting rather excited about an IDC report which analyses and forecasts the world's digital data output.

The findings are impressive and scary in equal measure. Emarketer notes that 'the amount of information created and replicated in 2007 (255 exabytes) will be greater, for the first time, than available storage capacity (246 exabytes)'. So the economics of data storage (scarce supply, insatiable demand) are about to get interesting. Well, maybe not quite yet: lots of that data will get deleted, and hard drives will get more efficient, apparently. But how long before the rate of production (minus deletion) outstrips the rate at which new storage is created? Now that HD DVD and Blu-ray copy protection has been cracked, it won't be long before bloated BitTorrents flood the net. And then there's Joost. Uh oh.
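For a sense of scale, the gap between the two quoted figures can be worked out in a few lines. This is only a back-of-envelope sketch: the two numbers are the ones Emarketer quotes, while the function name `shortfall` and the choice to express the gap as a share of data created are my own assumptions, not anything taken from the IDC report.

```python
# Figures quoted by Emarketer from the IDC report, in exabytes.
CREATED_2007_EB = 255   # information created and replicated in 2007
STORAGE_2007_EB = 246   # available storage capacity in 2007


def shortfall(created_eb, storage_eb):
    """Return (gap in exabytes, gap as a fraction of data created).

    Hypothetical helper for illustration; it ignores deletion and
    any improvement in storage efficiency.
    """
    gap = created_eb - storage_eb
    return gap, gap / created_eb


gap_eb, gap_share = shortfall(CREATED_2007_EB, STORAGE_2007_EB)
print(f"Shortfall: {gap_eb} EB ({gap_share:.1%} of data created)")
```

On those figures the shortfall is 9 exabytes, or roughly 3.5% of everything created, which is why a modest amount of deletion is enough to postpone the crunch.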

Technology Guardian is more interested in looking backwards, and finds a great shock-stat of its own buried deep in the report (i.e. beyond the executive summary):

'The sheer amount of data that has been created by the digital age becomes clear when comparing it with the spoken word. Experts estimate that all human language since the dawn of time would take up about 5 exabytes if stored in digital form. In comparison, last year's email traffic accounted for 6 exabytes.'

As anyone with a passing interest in digital media will confirm, the data burden is becoming ever more unmanageable. Despite the best efforts of Google, Technorati, Digg and their legions of paid and unpaid contributors, the data mountain is growing too fast and too vast for humans to sort through. Step forward the Semantic Web: a brave new world wide web where computers get on with it and we get exactly what we want. What started out as a quixotic vision is now one step closer to becoming a reality thanks to pioneering projects like Freebase, which - with a little bit of help from the online community - hopes to map the inter-relationships between all online data in a language that computers can 'understand'. (Think of it as a structured, supersized Wikipedia that fills in the gaps for you.)
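To make 'a language that computers can understand' a little more concrete, here is a toy sketch of the general idea: facts stored as subject-predicate-object statements that a program can query. Everything in it is an assumption for illustration. The statements, the predicate names and the `subjects` helper are made up, and this is not Freebase's actual data model or API.

```python
# Hypothetical subject-predicate-object statements, for illustration only.
triples = [
    ("IDC", "published", "digital data report"),
    ("Emarketer", "reported_on", "digital data report"),
    ("Technology Guardian", "reported_on", "digital data report"),
]


def subjects(predicate, obj):
    """Return every subject linked to `obj` by `predicate`, in stored order."""
    return [s for s, p, o in triples if p == predicate and o == obj]


# Ask the machine a question instead of reading every page ourselves.
print(subjects("reported_on", "digital data report"))
```

The point is that once relationships are written down in a structured form, 'who reported on this?' becomes a lookup rather than a reading job.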

The Semantic Web is kind of like artificial intelligence lite, which has got the uber-nerds excited. But, as O'Reilly comments, projects like Freebase are still 'very much in Alpha'. Whether the Semantic Web will save Diginatives from a life of digislavery has yet to be seen.
