WP Data Discovery & Classification | Imperva DSF

Data Discovery and Classification

Continuously discover and classify valuable data (structured, semi-structured, and unstructured), whether it resides on-premises, in the cloud, or in a multicloud environment

Data discovery and classification is the foundation of any data security project

Many organizations don't actually know where all of their sensitive data is or whether it is dangerously exposed. Such blind spots create security risks: they can lead to careless mistakes and give attackers opportunities to exploit, often through hidden vulnerabilities or misconfigured databases you may not even know exist.

One data view, enterprise-wide

Data auditing and risk management are often complicated by the mix of structured, semi-structured, and unstructured data repositories within an organization. Imperva DSF Data Discover and Classify can help uncover, identify, and classify sensitive information across a wide range of data sources, regardless of data type.

Imperva DSF Data Discover and Classify provides visibility into sensitive data's exact location, volume, and context. Driven by machine learning, it allows data owners to find and identify data regardless of data source and regardless of whether the environment is in the cloud or on-premises.

Automated, cross-directory searches let data professionals scan multiple data repositories simultaneously at speeds of up to 100,000 words per second. Results return in seconds, so teams can accurately find the information needed to answer an auditor's question, look up an individual's data, or fulfill a data deletion request.

Structured vs. Unstructured Data

In many organizations, unstructured data makes up the majority of all data: email, files from office productivity suites, PDF documents, and various other application files. Organizations have very little insight into what most of those files contain or what risk exposure they carry. With files spread across on-premises and cloud data repositories, threats from insider mishaps, malicious actors, cyber-attacks, ransomware, and other sources are always present in enterprise environments.


  • Deploy unstructured data discovery and classification services in hours, connect to sources including Amazon S3, SharePoint, Microsoft Teams, and Slack, and scan petabytes of data at scale.
  • Uncover the sensitive, personal, and confidential information stored in emails, images, PDFs, messaging platforms, and network drives in minutes.
  • Set the foundations to gain actionable insights to enforce privacy, governance, and security programs efficiently and automatically.

Automate the discovery, classification, and redaction of documents at scale


Throughput is measured in words per second, files processed per minute, and files processed per hour.

How data discovery and classification works

Imperva DSF Data Discovery and Classification uses regular expression rules, pattern matching, and predefined name-based or content-based data classification types as it scans data store contents to tag matches it finds. Predefined policy packs operationalize compliance with regulations such as GDPR and CCPA. You can add customized, customer-specific data classification types, and scans can be scheduled and repeated to ensure ongoing awareness of the types of data within an organization's data stores.
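To make the idea of name-based and content-based classification types concrete, here is a minimal Python sketch. It is not Imperva's implementation or API. The `ClassificationType` class, the `classify_column` function, the rule labels, and the regular expressions are all hypothetical and chosen only for illustration. A name-based rule tags a column because of what it is called, a content-based rule tags it because sampled values match a pattern, and a customer-specific type is added simply by appending another rule.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ClassificationType:
    """Hypothetical classification type with optional name and content rules."""
    label: str
    name_pattern: Optional[re.Pattern] = None      # matched against the column name
    content_pattern: Optional[re.Pattern] = None   # matched against sampled values


# Illustrative "predefined" types; the patterns are simplified assumptions.
PREDEFINED: List[ClassificationType] = [
    ClassificationType(
        "US_SSN",
        name_pattern=re.compile(r"ssn|social", re.IGNORECASE),
        content_pattern=re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    ),
    ClassificationType(
        "EMAIL",
        name_pattern=re.compile(r"e-?mail", re.IGNORECASE),
        content_pattern=re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    ),
]


def classify_column(
    column_name: str,
    sample_values: Iterable[object],
    rules: List[ClassificationType] = PREDEFINED,
) -> Set[str]:
    """Return the labels of every rule that matches by name or by content."""
    values = [str(v) for v in sample_values]
    tags: Set[str] = set()
    for rule in rules:
        if rule.name_pattern and rule.name_pattern.search(column_name):
            tags.add(rule.label)  # name-based match
        elif rule.content_pattern and any(
            rule.content_pattern.search(v) for v in values
        ):
            tags.add(rule.label)  # content-based match
    return tags


# A customer-specific type is just one more rule in the list.
EMPLOYEE_ID = ClassificationType(
    "EMPLOYEE_ID", content_pattern=re.compile(r"\bEMP-\d{6}\b")
)
CUSTOM_RULES = PREDEFINED + [EMPLOYEE_ID]
```

A scheduled scan could call `classify_column` for each column or file it samples and store the resulting tags. Real patterns need more care than these: for example, the name rule above would also tag a column called `social_media_handle`.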

For unstructured data, additional machine learning and natural language processing algorithms assist in locating files. Machine learning is further leveraged to label and assign attributes to the found data automatically.

Structured Data
  • Characteristics: predefined data models; easy to search; text-based
  • Resides in: relational databases; data warehouses; stored in rows and columns
  • Examples: dates; phone numbers; Social Security numbers; customer names; transactional information

Unstructured Data
  • Characteristics: no predefined data models; difficult to search text; PDF, images, video
  • Resides in: applications; data warehouses and lakes; stored in various forms
  • Examples: documents; emails and messages; conversation transcripts; image files; open-ended survey answers

Semi-structured Data
  • Characteristics: loosely organized; meta-level structure that can contain unstructured data
  • Resides in: relational databases; tagged-text format; stored in abstracts & figures
  • Examples: server logs; tweets organized by hashtags; email sorted by folders (inbox, sent, draft)

Data discovery and classification with no boundaries

With Imperva DSF, you can standardize data security controls across large and complex enterprise data environments. This gives you full visibility and centralized command of what is happening across all of your file stores and data assets, whether on-premises, in the hybrid cloud, or across multiple clouds.

Data Security Fabric coverage diagram

Support across multiple DBaaS and NoSQL platforms & services


Imperva Data Security Fabric protects all data types with a single system that delivers multiple business capabilities

Imperva Data Security Fabric is the first data-centric solution that enables your organization's security and compliance teams to quickly and easily secure sensitive data, no matter where it resides, with an integrated, proactive approach to visibility and predictive analytics.

Imperva Data Security Fabric is composed of cutting-edge orchestrated technical capabilities that work in unison to protect your data across your entire organization:

  • Data Discovery & Classification
  • Data Activity Monitoring
  • Data Retention & Archive
  • Data Risk Management
  • Ecosystem Integrations
  • Data Encryption & Tokenization
  • Static Data Masking
  • Automated Workflows & Playbooks
  • Sensitive Data Management
