Big data seduces us with potential. Industries from retail to health care collect untold terabytes of data in hopes of finding correlations and connections that drive sales and improve patient health.
But beneath the surface lies dark data: data whose value hasn’t yet been identified. We’re talking about documents, images, PDFs, videos, and other files that often aren’t work related, include sensitive information, or are duplicated many times over.
Many organizations do the right thing by regularly backing up their data (everything from email servers to file shares to individual user folders), but this process amasses tons of unnecessary unstructured data. It clogs up storage arrays and hard drives, creating big bills and unknown risks.
Is your organization hoarding this kind of unstructured data without realizing it? Let’s shine a light on dark data to uncover how you can move your organization toward better storage and improved efficiency.
Your obsolete data is costing you a fortune
The recently released Global Databerg Report found that only 15 percent of organizational data was business critical. The other 85 percent was either redundant, obsolete, trivial, or considered dark data.
Much of this low-value data is unstructured, meaning it doesn’t reside in a traditional database. Generally speaking, unstructured data grows at a much faster rate than traditional, structured data; IDG Research predicts that by 2022, 93 percent of digital data will be unstructured.
Herein lies the problem with unstructured data: Managing it becomes extremely costly. Gartner Research estimates that it costs $5 million annually to manage 1 petabyte of data. So you’re likely spending a lot of money to store files that are obsolete or duplicated.
SHI worked with one organization that found that only 21 percent of its file data had been accessed in the last year, and nearly 40 percent hadn’t been accessed in over four years. Most of this data could easily be moved off tier-one on-premises storage and archived to lower-tier on-premises storage or cost-effective cloud storage.
But if left untouched, this dark data can become a liability for an organization.
What sensitive data still sits in your storage?
The massive cost of storing this unstructured data is not the only strain it puts on organizations. A lack of insight into dark data can also create legal or financial liability: Data that is covered by a mandate or regulation but stored improperly can result in costly sanctions. For example, organizations in health care fields could fall out of compliance with HIPAA regulations due to improperly classified or stored data.
Because this data is unstructured, it’s often difficult to know both what it is and where it resides. One company found this out the hard way when it was subpoenaed as a third party in a legal case and had to turn over certain data. The company was neither the plaintiff nor defendant, but it spent $6 million just finding the poorly categorized data.
Lack of insight into dark data also leads to permissions challenges. Not knowing what data sets contain makes it impossible to determine who should have access to that data, and the wrong individuals may have access to proprietary and sensitive information. This weak permission oversight opens the door to data breaches and intelligence risks.
Tackling unstructured data woes
Organizations can reduce their storage footprint and risk by developing an information governance strategy. This isn’t an overnight process, but creating policies, procedures, and processes that govern how long data is retained and how it can be retrieved will shine a light on dark data and provide visibility.
These processes force organizations to be more proactive in data management. For example, if you need to turn over a file that was created years ago for litigation purposes, a clear process allows IT to immediately access it or know that it was defensibly deleted according to your records management schedule. An information governance strategy helps you respond, not react, to a crisis.
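One way to picture a records management schedule is as a small lookup table that turns a file's category and age into an action, along with a reason that can be logged to show the decision followed policy. The sketch below is purely illustrative: the categories, retention periods, `RetentionRule` and `decide` names, and the legal-hold flag are all assumptions made for the example, and real retention periods should come from your legal and compliance teams.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    archive_after_years: int  # move to cheaper storage at this age
    delete_after_years: int   # eligible for deletion at this age


# Hypothetical schedule; these categories and periods are placeholders.
SCHEDULE = {
    "contract": RetentionRule(archive_after_years=1, delete_after_years=7),
    "email": RetentionRule(archive_after_years=1, delete_after_years=3),
}


def age_in_years(created: date, today: date) -> int:
    """Whole years elapsed between `created` and `today`."""
    years = today.year - created.year
    if (today.month, today.day) < (created.month, created.day):
        years -= 1
    return years


def decide(category: str, created: date, today: date,
           legal_hold: bool = False) -> tuple[str, str]:
    """Return (action, reason) for one file under the schedule."""
    if legal_hold:
        # Assumption: a hold always overrides the normal schedule.
        return ("keep", "legal hold overrides the schedule")
    rule = SCHEDULE.get(category)
    if rule is None:
        # Unclassified data is routed to a person, never deleted by default.
        return ("review", f"no retention rule for category {category!r}")
    age = age_in_years(created, today)
    if age >= rule.delete_after_years:
        return ("delete", f"{category} is at least {rule.delete_after_years} years old")
    if age >= rule.archive_after_years:
        return ("archive", f"{category} is at least {rule.archive_after_years} years old")
    return ("keep", f"{category} is within its active period")
```

For instance, `decide("email", date(2015, 1, 1), date(2020, 1, 1))` returns a `"delete"` action under this placeholder schedule, while the same call with `legal_hold=True` returns `"keep"`. Recording the returned reason alongside each action is what lets IT later show that a file was removed according to the schedule rather than by accident.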
Digging yourself out from dark data
When IT, the C-suite, and storage architects work together to hash out better data retention policies, organizations can eliminate the confusion often tied to unstructured data. Enacting a workable information governance strategy reduces the costs and risks an organization faces from the big data explosion.
Remember, unstructured data can hold sensitive information, so organizations must be proactive in determining how those assets are classified, stored, retrieved, or purged. By following information governance processes and procedures, organizations can solve some of the problems unstructured data presents.
There are options available to help illuminate unstructured data in your IT environment. Contact your SHI account executive to learn about SHI’s free Dark Data Assessment.