Data discovery is a process for identifying and providing visibility into the location, volume, and context of structured and unstructured data stored in a variety of data repositories.
The Need for Data Discovery
It’s not uncommon for an organization to store terabytes (or more) of data in a variety of data repositories:
- Heterogeneous databases located on premises, in legacy databases, and in the cloud
- Big data platforms
- Data-rich collaboration systems such as SharePoint and Office 365
- Cloud-based file-sharing services such as Box, Dropbox, and Google Docs
- Spreadsheets, source code, PDFs, emails, or other documents
The sheer volume of stored data and data repository types means that many organizations do not really know what data they store or where it’s located.
Add to that situation the exponential growth of a global information economy, driven by new technologies and disruptive business models, which requires that an ever-increasing amount of sensitive data be collected, used, exchanged, analyzed, and retained.
An unintended consequence of this information economy is that all the sensitive data collected is a prime target for accidental or intentional compromise, exfiltration, or destruction.
Now combine these factors (data volume, repository types, and sensitivity) with industry-specific and regulatory mandates such as SOX, HIPAA, PCI DSS, and the GDPR. Most of these mandates demand that organizations ensure:
- Data integrity to prevent fraud and unauthorized data changes.
- Data confidentiality to protect sensitive data from accidental or intentional compromise, exfiltration, or destruction.
Some, such as the GDPR, go even further and require organizations to allow EU residents to view, correct, or delete their collected data.
How Data Discovery Helps
Before you can protect data from compromise, exfiltration, or destruction threats, before you can ensure data accuracy, before you can comply with various privacy and security mandates, you need to know what data you hold, where it’s located, and its context.
Data discovery provides you with that information.
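To make the idea concrete, here is a minimal sketch of a discovery pass over a file share. It assumes plain-text files on a local directory tree, and the `PATTERNS` table and `discover` function are hypothetical names chosen for illustration; real discovery tools cover databases, cloud services, and far richer detection than two regular expressions.

```python
import os
import re

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_like_number": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
}


def discover(root):
    """Walk `root` and report location, volume, and context per readable file."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8") as handle:
                    text = handle.read()
            except (OSError, UnicodeDecodeError):
                continue  # this sketch skips unreadable or binary files
            # Count how many times each pattern appears in the file.
            counts = {label: len(rx.findall(text)) for label, rx in PATTERNS.items()}
            findings.append({
                "location": path,                       # where the data lives
                "volume_bytes": os.path.getsize(path),  # how much of it there is
                "context": {k: v for k, v in counts.items() if v},  # what it contains
            })
    return findings
```

Calling `discover("/path/to/share")` returns one record per file, which is roughly the location, volume, and context inventory the rest of this article builds on.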
With that information in hand, you can plan and then implement a data classification process to tag data according to its type, sensitivity/confidentiality, and cost/value to the organization if altered, stolen, or destroyed. And with classification information in hand, you can implement security controls to protect data from accidental or intentional compromise, as well as compliance controls to ensure data accuracy, visibility, and adherence to other compliance requirements.
But it all starts with data discovery—knowing what data you have and where it’s located.
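The classification step described above could be sketched as a simple rule table applied to whatever data types discovery has detected at a location. The sensitivity levels, the trigger sets in `RULES`, and the `classify` function are all assumptions made for this example, not a prescribed scheme; an organization would define its own levels and criteria.

```python
from dataclasses import dataclass

# Hypothetical rules, ordered from most to least sensitive.
RULES = [
    ("restricted", {"card_like_number", "health_record"}),
    ("confidential", {"email", "phone"}),
]


@dataclass
class Tag:
    location: str
    sensitivity: str
    data_types: tuple


def classify(location, detected_types):
    """Tag a location with the highest sensitivity level its data types trigger."""
    detected = set(detected_types)
    for level, triggers in RULES:
        if detected & triggers:
            return Tag(location, level, tuple(sorted(detected)))
    # Nothing sensitive detected: fall back to the lowest level in this sketch.
    return Tag(location, "internal", tuple(sorted(detected)))
```

For example, `classify("db/customers", ["email", "card_like_number"])` yields a `restricted` tag, which downstream security and compliance controls can then key on.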
Benefits of Data Discovery
Knowing your data’s location, volume, and context lets you:
- Implement data classification processes
- Apply context-sensitive security controls, including access controls
- Execute context-sensitive disaster recovery exercises
- Conduct context-sensitive monitoring and audits, including privileged user monitoring and sensitive data audits