“Data chaos,” the new buzzword for unregulated data accumulation, is completely preventable if policies are enacted early. Too often, founders have no idea where or how their enterprise’s data is being stored, copied, and spread. The modern cloud is a marvel: easy access, strong security, and seemingly endless compute and storage capacity. But that same ease of use lets it become a cluttered dump rife with useless, garbage data that drives up storage costs, so the notion that ‘storage is cheap’ is simply not true.
We humans are packrats by nature, and the continuous drumbeat of digital transformation only underscores the need for “intelligent” data management best practices, because every business needs to exploit its data. Data exploitation, learning the true value of data and being able to leverage it, depends entirely on intelligent data management. While the devil is usually in the details, enterprises learn the devil is really in the data itself.
Simply uploading data to the cloud and letting it accumulate creates a situation where you’ll need to hire a professional to sift through it, separating the useful from the useless and flagging anything confidential, sensitive, or regulated. If no data management practices are put in place, keep that data manager’s number, because they’ll be doing the same job again and again.
Here are a few of the most common data management mistakes fast-growing startups and data-intensive organizations and enterprises make. If these seem familiar, it’s time for an internal rethink to ‘get your data act together’:
- The IT team focuses on business continuity and data availability, in other words, keeping the applications, systems, storage, and infrastructure running, while ignoring the lifeblood of the business: the data itself.
- Numerous copies of data are made for data protection and backup, but no one knows what is inside the data or its value to the business.
- Replicated copies of data and systems are kept in the event of a system or site outage.
- Data is cloned for analytics before the trash is cleaned out; leaving redundant, obsolete, or trivial (ROT) data in place slows down analysis, inflates target storage, and clogs up the system.
- Old data is archived without knowing its value, whether it could be damaging, or whether it includes now-regulated data.
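To make the “redundant” part of ROT concrete, here is a minimal sketch, purely illustrative and not any vendor’s implementation, of how you might start surfacing duplicate copies of files by hashing their contents; the function name `find_duplicates` is an assumption for this example:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: str) -> dict:
    """Group files under `root` by the SHA-256 hash of their contents.

    Any hash that maps to more than one path marks a set of byte-identical
    copies, which are candidates for the 'redundant' part of ROT cleanup.
    """
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only hashes with two or more paths, i.e. actual duplicates.
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

In practice a tool would also consider file metadata, age, and access patterns before deleting anything, but even this simple pass shows how quickly redundant copies can be identified.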
Depending on the archived records, in many cases, the only way to get the archived data back is to use the application that put it there. In other words, “vendor lock-in.”
As companies become data-dependent, data becomes the lifeblood of the company and, like blood in the human body, it can become a risk for the entire body without good health practices. Like cholesterol, it’s important to manage redundant, obsolete, and trivial (ROT) data levels. Like the plaque cholesterol forms in blood vessels, ROT data can create buildup in data storage, costing a company critical dollars and preventing data from getting to the right applications and users at the right time and place.
The risks unhealthy data presents go beyond expensive storage; it can pose significant regulatory compliance and legal issues. If your company stores consumer information, are you aware of the tighter regulations under the California Consumer Privacy Act (CCPA)? Is there a workflow to automatically delete or modify a consumer’s information if they request it? Are you sure all of that consumer’s data has been changed per their request, including all backup, replicated, and archived versions? If not, do you know how to update everything at once?
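To illustrate what such a deletion workflow has to guarantee, here is a hypothetical sketch; the names (`DeletionRequest`, `process_deletion`) and the store model are assumptions for this example, not real CCPA tooling. The key point is that the request only counts as fulfilled when every copy, primary, backup, replica, and archive alike, confirms the deletion:

```python
from dataclasses import dataclass, field


@dataclass
class DeletionRequest:
    """A consumer's request to erase their data everywhere it lives."""
    consumer_id: str
    stores: list = field(default_factory=list)     # every store holding a copy
    completed: list = field(default_factory=list)  # stores that confirmed deletion


def process_deletion(request: DeletionRequest, delete_fn) -> bool:
    """Fan the deletion out to every registered store.

    `delete_fn(store, consumer_id)` should return True only once the
    consumer's data is confirmed gone from that store. The request is
    fulfilled only when *all* stores have confirmed.
    """
    for store in request.stores:
        if delete_fn(store, request.consumer_id):
            request.completed.append(store)
    return set(request.completed) == set(request.stores)
```

The design choice worth noticing is the all-or-nothing check at the end: a workflow that forgets even one archived or replicated copy leaves the company out of compliance, which is exactly the trap the questions above describe.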
While California has the strictest consumer data laws, there is no single, universal law to abide by. Without a national consumer data law, companies are left scrambling to figure out where their online consumers are located, whether those states’ specific laws apply to the data being stored, and whether they are compliant. Regulators are levying massive fines on companies that are negligent in data management, or worse, whose sensitive data is nefariously or accidentally breached.
Knowing what you have is essential. That is why data intelligence platforms aim to locate all information no matter where it lives, unlock its true value, automate data operations with that knowledge, and build deep understanding of distributed data in place, without requiring ingestion of yet another copy for analysis.
Intelligent data management is essential for any well-run, data-intensive organization, and let’s be clear, that is pretty much every organization, especially when preparing to scale for growth. Without clear data management policies and data lifecycle planning, companies end up paying cloud vendors more and more money to keep ROT and garbage data, increasing risk by forgoing automated, intelligent data analysis tools, and reducing the likelihood of ever knowing what value the data holds. We cannot always control the people generating and sharing data, but we can aim to take control and manage data chaos quickly, simply, and with smarts.
(The author is Gary Lyng, Chief Marketing Officer at Aparavi, a data intelligence and automation platform, and the views expressed in this article are his own)