Demystifying the ‘Big Deal’ about Big Data


The web is awash with overused buzzwords like Big Data and Data Scientist, and dropping them seems to be the fashionable thing to do in the world of analytics. Unfortunately, the term ‘Analytics and BI’ has itself been used very loosely to cover a wide range of services, from reporting all the way to predictive analytics. So where does the new term fit in? Are we missing something? Who should use it, and how?

To begin with, Big Data is not a product or a technique, nor, when put in the right context, is it merely a newfound buzzword. Certain types of business problems clearly fall into the category of Big Data, and they are the reason the term came into existence. Social media, with its unstructured and semi-structured data sets such as tweets, chat messages, audio and video, is one such example. The sheer volume of data and the variety of sources and formats could not be managed with traditional database technologies at reasonable cost. Thus Big Data came into existence, along with NoSQL, Hadoop and a whole new suite of techniques and technologies that address the problem of how to store, process, analyze and derive meaning from these data sets.

New and evolving technologies in analytics have now made possible much that was earlier considered impossible. Data management systems handle a wide variety of data, including social media, sensor and web data. Analytical capabilities have improved in areas such as predictive analytics and text analytics, together often referred to as advanced analytics. Faster hardware with multi-core processors, along with distributed computing, has made it possible to manage large and complex data sets. Big Data involves bringing these technologies and techniques together to benefit the business.

Mapping Big Data for the Enterprise

Big Data has predominantly been described in terms of the nature of the data, namely data volume, data formats and data frequency. For an enterprise, what matters is answering the “so what” and “what now” from the business point of view. Businesses need to leverage data in all its forms and not limit themselves to structured data within the enterprise. Anywhere from 40 to 80 per cent of an enterprise’s stored data resides in unstructured formats: text documents, web pages, media (audio, video, photos) and office files (presentations, project plans). Enterprises that are able to manage and use their complete range of data assets stand to gain an edge over their competitors.

For example, the text (semi-structured) held in an insurance claim system, once parsed and combined with the customer information stored in a structured data set, may provide valuable insight into fraudulent claims.
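The insurance example above can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the claim text layout, the field names, the red-flag phrases and the scoring thresholds are hypothetical, and a real fraud model would be far richer. The point is only to show semi-structured text being parsed and then joined with structured customer data.

```python
import re

# Hypothetical structured customer records, keyed by customer id.
customers = {
    "C001": {"policy_age_days": 20, "prior_claims": 3},
    "C002": {"policy_age_days": 900, "prior_claims": 0},
}

# Hypothetical semi-structured claim entries: fixed fields, then free text.
claims = [
    "Claim 9001 | customer C001 | amount 12000 | "
    "Vehicle stolen, no witnesses, no police report filed.",
    "Claim 9002 | customer C002 | amount 800 | "
    "Minor windshield crack repaired at an authorised garage.",
]

CLAIM_PATTERN = re.compile(
    r"Claim (?P<claim_id>\d+) \| customer (?P<customer_id>\w+) \| "
    r"amount (?P<amount>\d+) \| (?P<notes>.*)"
)

# Illustrative phrases only; not taken from any real fraud rule set.
RED_FLAG_PHRASES = ("no witnesses", "no police report")


def parse_claim(text):
    """Turn one claim entry into a dict, or return None if it does not parse."""
    match = CLAIM_PATTERN.match(text)
    if match is None:
        return None
    record = match.groupdict()
    record["amount"] = int(record["amount"])
    notes = record["notes"].lower()
    record["red_flags"] = [p for p in RED_FLAG_PHRASES if p in notes]
    return record


def review_score(claim, customer):
    """Combine text-derived signals with structured customer attributes."""
    score = len(claim["red_flags"])
    if customer["policy_age_days"] < 90:  # assumed threshold: very new policy
        score += 1
    if customer["prior_claims"] >= 2:  # assumed threshold: frequent claimant
        score += 1
    return score


def claims_for_review(claim_texts, customer_table, threshold=2):
    """Return ids of claims whose combined score reaches the threshold."""
    flagged = []
    for text in claim_texts:
        claim = parse_claim(text)
        if claim is None:
            continue
        customer = customer_table.get(claim["customer_id"])
        if customer is None:
            continue
        if review_score(claim, customer) >= threshold:
            flagged.append(claim["claim_id"])
    return flagged
```

With the sample data, `claims_for_review(claims, customers)` flags only claim 9001: neither the claim text nor the customer record alone tells the whole story, but the two together push it over the assumed threshold.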

Another example is a retail solution that targets recommendations in real time at registered users while they are still actively browsing products in the online store. This would involve combining complex event processing (CEP) algorithms with the metrics available on customers and their purchase patterns.
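As a rough sketch of the retail idea, the snippet below detects a simple browsing pattern over a sliding time window and combines it with stored purchase history. It is a toy stand-in for a CEP engine, not a real one, and the window length, view threshold, catalogue and purchase data are all hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # assumed length of the sliding window
MIN_VIEWS = 3        # assumed number of views that signals active interest

# Hypothetical structured data: past purchases and a tiny product catalogue.
purchase_history = {"u1": {"tent"}}
catalogue = {"camping": ["tent", "sleeping bag", "stove"]}


class BrowsingPatternDetector:
    """Tracks product views per (user, category) inside a sliding window."""

    def __init__(self):
        self._views = defaultdict(deque)  # (user, category) -> timestamps

    def on_view(self, user_id, category, timestamp):
        """Process one view event; return a recommendation or None."""
        window = self._views[(user_id, category)]
        window.append(timestamp)
        # Drop events that have fallen out of the time window.
        while window and timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) < MIN_VIEWS:
            return None
        # Pattern matched: suggest the first product not already bought.
        owned = purchase_history.get(user_id, set())
        for product in catalogue.get(category, []):
            if product not in owned:
                return product
        return None
```

Feeding the detector three camping views from user `u1` at seconds 0, 10 and 20 returns no recommendation for the first two events and `"sleeping bag"` on the third, since the user already owns a tent. A view that arrives much later starts a fresh window and yields nothing.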

Marrying structured and unstructured data is no longer just a “nice to have”; it is becoming a significant part of business reality, driven by the digital world and by new and multiple channels such as social media accessed through mobiles, tablets and the web. The biggest challenges organizations face are integrating data assets across structured and unstructured data and overcoming poor performance at larger scale. Techniques of text analytics and distributed computing have matured and are being used today to address these problems. Commodity hardware, cloud computing, off-the-shelf software tools and open source software have all made it possible to take advantage of Big Data in cost-effective ways.

From Big Data to Big Decisions

The need of the hour is to expand on traditional Business Intelligence and bring in predictive and advanced analytics such as sentiment analysis: identifying patterns in order not only to predict events but also to prescribe actions when they occur, and providing optimized solutions by leveraging multiple streams of data.

Big Data is not a silver bullet; it requires knowledge of business lifecycles and processes coupled with the new technologies and techniques. This is best achieved by combining the in-house expertise of business and subject matter experts with an external partner who can bring the required competency in the technologies and techniques. It is important to understand the ROI through a controlled exercise and to validate it through a proof of concept before selecting an approach and implementing it for your business.

The way to join the Big Data bandwagon is to be agile and go for quick wins. Take a low-cost, low-risk approach and gain momentum from the insight into your data, which can then be used to build bigger, long-term, compelling Big Data solutions. The focus needs to be on the business problem you are trying to solve, not on the technology. Ask the right business questions and engage the right partner, one who can work with you in an agile mode to get you those quick wins.

(The author is Senior Vice President, Analytics Division at Symphony Teleca Corp.)