The Dangers Of Big Data Storage

by CXOtoday News Desk    Jan 31, 2014


The concept of Big Data can be exciting, yet extremely dangerous when businesses do not understand it well. The latest Technology Radar from global technology firm ThoughtWorks focuses on the contrast between the exciting new insights made possible by exhaustive data collection and the disturbing trend of businesses storing vast amounts of personal data unnecessarily.

“These days there’s a lot of hype around the idea of Big Data - and with it the notion that we should capture and store every bit of data we can get our hands on. The ‘capture-it-all’ approach raises serious questions of privacy,” states chief scientist Martin Fowler. “We advocate that businesses adopt an attitude of ‘datensparsamkeit’ and store only the absolute minimum personal information from their customers.”

In a blog post, Fowler describes how this capture-it-all notion plays out: “We might not have an immediate use for the contacts our users store in their address books, but we’ll ask for it anyway in case it comes in useful later. We’ll record every click on our website and squirrel it away in case we want to trawl it later. We set up our smartphone app to ask for location information so if we come up with some way to use that data later, we can. After all, storage is cheap – so why not?”

Fowler encourages businesses to anonymise data where possible. Tracking website visitors, for example, is essential for a business that relies on page views to generate revenue. Instead of storing the IP addresses of unique visitors, he asks, why not hash and encrypt those addresses, then throw away the key? “You still know how many people have looked at one of your pages, but you don’t necessarily know who they are,” he states.
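
The idea of hashing addresses and discarding the key can be illustrated with a minimal Python sketch. It is not taken from Fowler or ThoughtWorks: it assumes a keyed hash (HMAC-SHA256) as a stand-in for "hash and encrypt", and the function names and sample addresses are hypothetical. The key exists only in memory, so once the process ends the stored tokens can no longer be traced back to an address.

```python
import hashlib
import hmac
import secrets


def make_anonymizer():
    """Return a function that maps an IP address to an opaque token.

    Assumption: the random key is held only inside this closure and is
    never written to disk, which is how this sketch "throws away the key".
    """
    key = secrets.token_bytes(32)

    def anonymize(ip: str) -> str:
        # Keyed hash: the same IP yields the same token under this key,
        # but without the key the token cannot be recomputed from an IP.
        return hmac.new(key, ip.encode("utf-8"), hashlib.sha256).hexdigest()

    return anonymize


def count_unique_visitors(ips, anonymize):
    """Count distinct visitors while keeping only tokens, not addresses."""
    return len({anonymize(ip) for ip in ips})


anonymize = make_anonymizer()
hits = ["203.0.113.7", "198.51.100.23", "203.0.113.7"]  # sample addresses
print(count_unique_visitors(hits, anonymize))  # 2
```

A plain unkeyed hash would be weaker here, because the IPv4 address space is small enough to hash exhaustively and reverse the tokens. Creating a fresh anonymizer for each reporting period would also stop tokens from being linked across periods, at the cost of not recognising returning visitors.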

Other notable observations of the radar include:

Early warning and recovery in production — We are seeing a plethora of new tools and techniques for logging, monitoring, storing and querying operational data. When combined with the short recovery times afforded by virtualization and infrastructure automation, businesses can reduce the amount of testing required before deployment, perhaps even pushing that testing into the production environment itself.

Privacy vs. big data — While we are excited about the new business insights made possible by exhaustive data collection and the new tools and platforms for storing and analyzing that data, we are also concerned that many businesses are storing vast amounts of personal data unnecessarily. We advocate that businesses adopt an attitude of “datensparsamkeit” and store only the absolute minimum personal information from their customers.

The JavaScript juggernaut rolls on - The ecosystem around JavaScript as a serious application platform continues to evolve. Many interesting new tools for testing, building, and managing dependencies in both server- and client-side JavaScript applications have emerged recently.

Merging of physical and digital — Low-cost devices, open hardware platforms, and new communication protocols are pushing the computing experience away from the screen and into the world around us. A great example is the proliferation of wearable devices to track personal biometrics, and hardware support in mobile devices to interact with these devices.   

The Technology Radar, now in its fourth year, is created by the Technology Advisory Board, which consists of 20 senior technologists who serve as regular advisors to CTO Dr. Rebecca Parsons.