Harnessing Dark Data To Create New Value
Sanjay Agrawal, Technology Head, Hitachi Vantara, throws light on the concept of dark data and its challenges, and explains how companies can gain insight from dark data to drive new innovation.
Gartner defines dark data as “information assets which organizations collect, process, and store during regular business activities, but generally fail to use for other purposes.” Dark data can include anything from old files to content on devices and clouds that are outside IT’s immediate control and management. While this data explosion is putting pressure on IT to pump in more resources to store, protect and manage the data, companies across industries are yet to understand how this data can be leveraged to achieve key business insights and avoid business risks. In a recent interaction with CXOToday, Sanjay Agrawal, Technology Head, Hitachi Vantara, throws light on the concept of dark data, its challenges and explains how companies gain insight from dark data to drive new innovation.
CXOToday: Data is growing at a rapid rate, leading to monumental storage of data that remains untouched. Can you throw some light on the concept of dark data and some of the main challenges created by dark data?
Sanjay Agrawal: Dark data can appear in both structured and unstructured data, with the majority of data in the unstructured segment being dark and less than 0.5% being analysed. The majority of business data is structured, whereas unstructured data includes human and machine data. Unstructured data is not only significantly larger in volume than structured data but is also growing many times faster. Enterprises mostly retain this type of data by deploying huge storage, backup and management infrastructure, so a large share of the IT budget is spent without any business outcome. The bottom line is that IT is struggling to know what data it has and how that data can be leveraged for business decision making.
Enterprises dealing with their customers’ personal data face the additional challenge of ensuring data compliance. For example, GDPR (General Data Protection Regulation) expects enterprises to comply with requirements such as data protection, retention and the right to be forgotten. With some part of customers’ data likely to be present in dark data, the job of CIOs becomes even more challenging, because they have limited insight into dark data and limited control over applying data policies such as data retention.
At Hitachi Vantara, we believe that data is an important asset if one knows how to use it. The key to new revenue streams, better customer experiences, and lower business costs lies in your data, waiting to be discovered. What is interesting about these dark data sets is that the problems that surround them are almost always human (organizational culture or process), rather than being a specific technology challenge. Some of the key challenges enterprises face today with dark data include finding effective ways to extract value from data clutter, illuminating the opportunities hidden within these treasure troves, implementing effective data management mechanisms and establishing active risk mitigation practices. In a business climate where data is competitive currency, these challenges can threaten any organisation’s continued business health and well-being.
CXOToday: A lot of organizations are already in possession of untapped data that sits idle. How does enterprise dark data impact business decisions?
Sanjay Agrawal: Traditionally, enterprises analysed transactional business data to make business decisions. Today, differentiated customer experiences and new business models are possible by looking at unstructured human and machine data related to interactions, sentiments, online behaviour, preferences, frequently visited locations etc. For example, sentiment analysis alone has given enterprises direction for improved product and marketing strategy.
Much higher business benefits are available when enterprises start dynamically blending their human and machine data with business data, which gives a 360-degree view of customers. This helps them know customers even better, create better offers and eventually win more business with higher customer satisfaction. In the healthcare industry, an initiative called Patient360 enables doctors to get a complete, unified view of all the test images, medical reports, patient profiles, prescriptions etc., which helps them make accurate as well as quick diagnoses, resulting in significant patient satisfaction. With such initiatives, hospitals are launching various patient services to increase their business further.
A few enterprises have observed that analysing their huge volumes of unstructured data in Hadoop systems has not resulted in the desired business value. We have seen that true business value becomes visible when enterprises start integrating their unstructured data with their structured data.
The diverse mix of content from disparate sources such as audio, video, PDFs, social feeds, IVRs and emails needs to be curated in a secure repository that can be accessed by multiple users, applications and workloads, on premise or in the cloud. Curation improves data quality, which is essential for proper analysis. Lack of quality in unstructured data has been one of the reasons limiting analysis of such data for many enterprises.
It also appears that most decision makers do not have a clear view of all the types of storage available to them. In terms of enterprise dark data impacting business decisions, illuminating dark data can bring cost savings to an organization: the large amount of operational data that is left unanalysed can instead be treated as an economic opportunity. This data can be leveraged to drive new revenues, reduce internal costs and aid in business planning. Given the right appreciation for both potential value and possible risk, organizations can deal with dark data in a way that balances one against the other.
A recent IDC survey revealed that 77% of surveyed Indian enterprises are storing data with the hope that in the next two years they will be able to use analytics to gain business insights from this data. However, according to an analysis by Harvard Business Review, less than half of an organization’s structured data is used in making business decisions, and less than 1% of unstructured data is used in any way at all.
CXOToday: How are Hitachi Vantara’s solutions helping companies gain insight from dark data to drive new innovation?
Sanjay Agrawal: Hitachi Vantara helps data-driven leaders find and use value in their data to innovate intelligently and reach outcomes that matter for businesses. As a company, we truly believe in unlocking the value of data and creating new opportunities to empower businesses. We have closely been working on the concept of dark data by designing solutions to help our customers address the challenges in the segment and further enable them to effectively leverage dark data to derive business insights.
Hitachi’s solutions help enterprises with insights into their dark data for better decision making, IT optimization, cost reduction and adherence to compliance. Hitachi Content Intelligence (HCI) helps enterprises know what data they have by providing a data catalogue. Our solutions also help with data discovery and analysis by leveraging the associated metadata and indexes.
Our latest Pentaho 8.2 release leverages Hitachi Vantara’s expertise in storage and content intelligence to unlock the realms of dark data for complex on-premise, hybrid, and multi-cloud use cases. Hitachi’s Pentaho helps with enhanced analysis and also dynamic and interactive integration of all of business, human and machine data giving actionable insights to businesses.
Hitachi Content Platform (HCP) delivers near-infinite capacity and broad flexibility, with significant IT benefits in terms of both cost and management. HCP also brings in capabilities like data retention policies, legal hold, WORM (write once, read many), versioning, data shredding etc. that help enterprises with their compliance challenges. The platform also harnesses unstructured data growth and bridges traditional and emerging technologies to make your data securely available anywhere, anytime.
Enterprises are building data lake platforms for storing all of their data, most of it unstructured, and Hitachi provides an optimized solution for data lake platforms that helps enterprises reduce cost, improve performance and derive higher business value. Hitachi’s solutions use metadata and indexes to better organize data discovery and prepare data for analytics. We believe in shining a light on dark data assets, enriching them with valuable metadata to help simplify compliance and secure data, irrespective of where it resides.
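To make the compliance capabilities mentioned above more concrete, here is a minimal Python sketch of how a retention period and a legal hold might gate deletion of a stored record. It is an illustration under stated assumptions, not HCP's API or behaviour: the `Record` class, its fields and the `can_delete` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)  # frozen: fields cannot be changed after creation (WORM-like)
class Record:
    """Hypothetical stored record carrying its own compliance attributes."""
    key: str
    stored_on: date
    retention_days: int
    legal_hold: bool = False

    def retained_until(self) -> date:
        # Last day of the retention window, counted from the storage date.
        return self.stored_on + timedelta(days=self.retention_days)


def can_delete(record: Record, today: date) -> bool:
    """Allow deletion only if there is no legal hold and retention has expired."""
    if record.legal_hold:
        return False
    return today >= record.retained_until()


statement = Record("statements/2018-01.pdf", date(2018, 1, 1), retention_days=365)
held = Record("claims/0042.pdf", date(2018, 1, 1), retention_days=365, legal_hold=True)

print(can_delete(statement, date(2018, 6, 1)))  # still inside the retention window
print(can_delete(statement, date(2019, 1, 1)))  # retention window has ended
print(can_delete(held, date(2030, 1, 1)))       # legal hold overrides expiry
```

The design point is that the policy travels with the object itself, so the same rule can be applied to data that nobody is actively looking at.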
CXOToday: How does object storage enable an overall structure to be applied to data?
Sanjay Agrawal: As unstructured data continues to grow faster than IT budgets, organisations are eager to bridge traditional and emerging technologies. To deal with this drastic growth of data, object storage, also known as object-based storage, enables enterprises to improve ease of use, scale capacity and performance independently, overcome management issues and support a variety of workloads. It can also help with searchability of the data whilst reducing the overall storage required.
Treating an object storage solution as a big data reservoir or scalable and centralized data hub enables analytics-based applications to blend structured and unstructured data together for business intelligence and visualization workloads. The custom metadata that object storage solutions attach to files as a form of detailed enrichment gives unstructured data more context and makes it easier to discover and search.
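The role of custom metadata can be illustrated with a small Python sketch. It assumes a toy in-memory store rather than any real product: `TinyObjectStore`, `StoredObject` and the metadata fields (`source`, `customer_id`, `sentiment`) are hypothetical names chosen for the example. The point it shows is that opaque content becomes searchable once descriptive metadata is attached to each object.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: opaque content plus free-form custom metadata."""
    key: str
    content: bytes
    metadata: dict = field(default_factory=dict)


class TinyObjectStore:
    """In-memory stand-in for an object store (illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, content, **metadata):
        # Store the content together with whatever metadata the caller supplies.
        self._objects[key] = StoredObject(key, content, dict(metadata))

    def search(self, **criteria):
        """Return sorted keys of objects whose metadata matches every criterion."""
        return sorted(
            obj.key
            for obj in self._objects.values()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = TinyObjectStore()
store.put("calls/0001.wav", b"...", source="ivr", customer_id="C42", sentiment="negative")
store.put("mail/0002.eml", b"...", source="email", customer_id="C42", sentiment="positive")
store.put("scans/0003.pdf", b"...", source="branch", customer_id="C77")

# Everything known about one customer, regardless of file type.
print(store.search(customer_id="C42"))
# A narrower query that combines two metadata fields.
print(store.search(source="ivr", sentiment="negative"))
```

A shared field such as `customer_id` is also what would let an analytics tool join these objects with structured records about the same customer.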
When we partnered with IDC late last year to survey 1392 IT professionals and executives in India, we found that a staggering 39% of surveyed enterprises in India were unaware of the technology.
Enterprises should focus on building a distributed object storage system that can evolve and scale according to their future requirements, helping IT optimize its infrastructure. With multiple storage tiers and configurable attributes, the object storage system can create virtual content platforms that can be subdivided for better organization of content, policies and access. Additionally, the metadata associated with object storage helps bring the desired quality to unstructured data, making it fit for discovery and analysis. In a lot of ways, the object store is like a data warehouse for content.
Hitachi Vantara provides intelligent, object-based storage solutions and applications that support diverse use cases like file synchronization and sharing, cloud storage, big data, compliance and archiving, from a single cluster, simultaneously. We are committed to expanding the depth and breadth of our data mobility solution portfolio by embracing and extending support for commodity hardware, open source technologies, and public cloud.
CXOToday: Do you have any case examples to demonstrate how dark data is analyzed, stored and managed?
Sanjay Agrawal: In earlier days, banks used to create their customers’ profiles by looking at all the business transactions across their product lines and delivery channels. Today, banks are embarking on a journey wherein customer profiles are created not only from the business that customers do with the bank, but also from their daily interactions, sentiments, preferences, online behaviour etc. This new process of analysing and storing relevant data leads to competitive differentiation and increased customer loyalty. By bringing structure to data, it also yields valuable business insights that were previously hidden in the pools of dark data residing in the system, and eventually helps banks take more informed decisions in areas such as customer retention and offers. Banks are thus combating the challenges put forth by dark data and instead illuminating the data at the end of the tunnel.