What is Dark Data and how do I find it?
What is “dark data”?
The term “dark data” refers to “any information that organizations collect, process and store in the course of regular business activities, but generally fail to use for other purposes” [Gartner].
Often retained for compliance reasons, this data may include former employee records, financial information and transaction logs, confidential survey data, emails, internal presentations, uploaded attachments and even surveillance video footage. It covers all the forgotten data left behind by general processes that may be unused or unknown, invariably as a result of users’ daily digital interactions. This data can be anywhere, spread across all areas of an organization and a myriad of data repositories, from data lakes to applications.
By its nature, the precise volume of dark data in an organization is difficult to estimate. As organizations produce data at a volume that regularly exceeds what can be analyzed, it is common for more than half of an organization’s data to be unavailable for analysis [Splunk]. The volume of unstructured data (data not organized by a predefined data model) is growing at a rate of 55-65% per year [Forbes]. Every minute of every day, 1.7MB of data is created for each of the 7.3 billion people on our planet. By 2025, it is estimated that there will be 175 trillion gigabytes (175 zettabytes) of data in the world. Of this, 80% will be unstructured, and 90% of that unstructured data will never be analyzed or used in regular business activities, despite the regional data standards that govern it, its business value, and the cost of storing it [IDC].
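To put the 2025 estimate in concrete terms, the short calculation below multiplies out the percentages quoted above. It is only a back-of-the-envelope sketch: the three input figures are the ones cited in this article, and the function name is ours.

```python
# Figures quoted in the paragraph above (attributed there to IDC).
TOTAL_ZB = 175                # estimated global data volume by 2025, in zettabytes
UNSTRUCTURED_SHARE = 0.80     # share of that data expected to be unstructured
NEVER_ANALYZED_SHARE = 0.90   # share of unstructured data never analyzed

GB_PER_ZB = 10**12            # 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes


def never_analyzed_zb(total_zb, unstructured_share, never_analyzed_share):
    """Return the volume (in ZB) of unstructured data that is never analyzed."""
    return total_zb * unstructured_share * never_analyzed_share


unstructured_zb = TOTAL_ZB * UNSTRUCTURED_SHARE
dark_zb = never_analyzed_zb(TOTAL_ZB, UNSTRUCTURED_SHARE, NEVER_ANALYZED_SHARE)

print(f"Unstructured data: about {unstructured_zb:.0f} ZB")
print(f"Never analyzed:    about {dark_zb:.0f} ZB "
      f"({dark_zb * GB_PER_ZB:.3g} GB)")
```

On these figures, roughly 140 of the 175 zettabytes would be unstructured, and about 126 zettabytes of that would never be analyzed.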
Shedding light on dark data
To protect dark data from malicious actors and make it available to corporate auditors, an organization must find it and discover what is sensitive and what is exposed. Discovering and classifying dark data allows an organization to leverage this previously unknown information for decision making. To do this, security teams need to know where sensitive dark data is located, who is accessing it, and when abuses are occurring in order to take immediate action.
There are two main approaches to assessing and reviewing an organization’s dark data. Independent consultants can examine a data environment and perform in-depth reviews of unused and uncataloged data on behalf of an organization. Alternatively, with the right tools, organizations can automatically examine all of their own data repositories, wherever their data resides. This is often preferable, as it also allows organizations to identify regulatory violations, review internal permissions (who can see what), uncover other gaps in organizational data security, and identify potentially malicious or careless behavior that may endanger confidential and private data. If an organization chooses to use a data analytics solution instead of an external contractor, it will invariably gain a more complete, insightful, and accurate understanding of its data, with clearer actions on how to remediate any risk.
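As a rough illustration of what automated self-examination involves at its simplest, the Python sketch below walks a directory tree and counts matches for two sensitive-looking patterns. It is a minimal sketch and not how any particular product works: the patterns, labels, and function names are assumptions made for this example, and real discovery tools use far richer classifiers and cover databases, cloud stores, and applications as well as files.

```python
import re
from pathlib import Path

# Illustrative patterns only (assumptions for this sketch).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text):
    """Return {label: match_count} for every pattern that occurs in text."""
    counts = {}
    for label, pattern in PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[label] = hits
    return counts


def scan_tree(root):
    """Walk root recursively and report files containing sensitive-looking data."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are ignored so binary files do not stop the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort
        counts = scan_text(text)
        if counts:
            findings[str(path)] = counts
    return findings
```

Calling `scan_tree` on a shared folder returns a mapping from file path to the kinds of sensitive data found there, which is the raw material for the "where is it and what is it" questions above.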
Only when an organization has visibility into its dark data can it uncover its business value and protect that data accordingly. Building a basic framework to “tag” or catalog this hidden data is the first step to gaining this insight. Without it, an organization cannot adhere to data governance standards, meet regional regulatory requirements, deliver truly effective security, or ensure data privacy for its customers and employees.
Organizations need to know if their data is already visible and in use: is it managed data, redundant, obsolete or trivial (ROT) data, or dark data? It is essential to know where the data is, what it is, and what standards and policies must apply to it. Knowing who is accessing it and how organizational data is (and should be) governed is part of the basic framework of classification and discovery. After proper investigation, truly stale dark data can be scheduled for deletion, reducing the required data storage capacity and associated costs.
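A tagging framework can start very small. The Python sketch below shows one possible shape for a catalog entry and a rule that assigns tags from the words in a storage path. Everything in it is hypothetical: the `CatalogEntry` fields, the keyword rules, and the tag names are assumptions chosen to mirror the kinds of data mentioned earlier, not a standard or a vendor schema.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One cataloged data location and the tags assigned to it."""
    location: str
    owner: str = "unknown"
    tags: set = field(default_factory=set)


# Hypothetical keyword-to-tag rules; a real catalog would be far more detailed.
TAG_RULES = {
    "hr": "employee-records",
    "payroll": "financial",
    "invoice": "financial",
    "survey": "confidential-survey",
    "cctv": "surveillance-video",
}


def tag_location(location, owner="unknown"):
    """Create a catalog entry, tagging it by the words found in its path."""
    entry = CatalogEntry(location=location, owner=owner)
    # Split on non-alphanumerics so "hr" matches the folder "HR" but not "three".
    words = set(re.split(r"[^a-z0-9]+", location.lower()))
    for keyword, tag in TAG_RULES.items():
        if keyword in words:
            entry.tags.add(tag)
    if not entry.tags:
        entry.tags.add("unclassified")
    return entry
```

Path-based rules like these are only a first pass. Entries left as `unclassified` are the ones that most need a closer look at their content.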
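The triage described above can be sketched as a simple rule over two facts about each data asset: whether it appears in the catalog, and when it was last accessed. The Python example below is a minimal sketch under stated assumptions. The `DataAsset` fields, the category labels, and the one-year threshold are all hypothetical, and its output is a list for human review rather than an automatic delete.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataAsset:
    location: str
    in_catalog: bool          # is the asset known to the data catalog?
    last_accessed: datetime   # most recent recorded access


STALE_AFTER = timedelta(days=365)  # hypothetical threshold; set by policy


def classify(asset, now):
    """Label an asset as managed, outdated, dark, or stale_dark."""
    stale = (now - asset.last_accessed) > STALE_AFTER
    if asset.in_catalog:
        return "outdated" if stale else "managed"
    return "stale_dark" if stale else "dark"


def deletion_candidates(assets, now):
    """Return locations of stale dark data to investigate before deletion."""
    return [a.location for a in assets if classify(a, now) == "stale_dark"]
```

Keeping `now` as an explicit argument makes the rule easy to test and to re-run against historical snapshots of access logs.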
Dark data discovery and classification tools
Out of the box, Imperva Data Security Fabric Data Discover and Classify enables an organization to automatically search through its data, wherever it is stored, to find and classify dark and unstructured data. Enterprise-wide, it enables organizations to find hidden, exposed and sensitive material. It shows the location, volume and context of that data, and helps protect it accordingly with clearly defined recommendations for action.
If you would like to learn more about dark data governance, compliance and security, please contact us via our online chat. We’re always happy to talk, with no obligation, and maybe we can help your organization shed light on its unstructured and hidden data stores.
*** This is a syndicated blog from the Security Bloggers Blog Network written by Nik Hewitt. Read the original post at: https://www.imperva.com/blog/what-is-dark-data-and-how-can-we-find-it/