WP Data Discovery & Classification | Imperva DSF

Data Discovery and Classification

Continuously discover and classify valuable data (structured, semi-structured, and unstructured), whether it resides on-premises, in the cloud, or in a multicloud environment

Data discovery and classification is the foundation of any data security project

Many organizations don't actually know where all of their sensitive data is or if it's dangerously exposed. Such blind spots create security risks that can lead to careless mistakes or create opportunities that attackers can exploit, often through hidden vulnerabilities or misconfigured databases you may not even know exist.

One data view, enterprise-wide

A common challenge in data auditing and risk management is the complex mix of structured, semi-structured, and unstructured data repositories within an organization. Imperva DSF Data Discover and Classify can help uncover, identify, and classify sensitive information from a wide range of data sources, regardless of data type.

Imperva DSF Data Discover and Classify provides visibility into the exact location, volume, and context of sensitive data. Driven by machine learning, it lets data owners find and identify data in any data source, whether in the cloud or on-premises.

Automated, cross-directory searches let data professionals scan multiple data repositories simultaneously in seconds. They can accurately find the information needed for an auditor's question, an individual's data lookup, or a data deletion request, at scan speeds of up to 100,000 words per second.

Structured vs. Unstructured Data

In many organizations, unstructured data (email, files from office productivity suites, PDF documents, and various other application files) makes up the majority of data. Organizations often have very little insight into what most of those files contain or what risk exposure they hold. Threats from insider mishaps, malicious actors, cyberattacks, ransomware, and other sources persist in enterprise environments where files are spread across on-premises and cloud data repositories.

Discover

Deploy unstructured data discovery and classification services in hours, connect to a wide range of sources, including Amazon S3, SharePoint, Microsoft Teams, and Slack, and scan petabytes of data at scale.

Understand

Uncover the sensitive, personal, and confidential information stored in emails, images, PDFs, messaging platforms, and network drives in minutes.

Action

Lay the foundation for actionable insights that help enforce privacy, governance, and security programs efficiently and automatically.

Automate the discovery, classification, and redaction of documents at scale

100,000 words per second

1,200 files processed per minute

72,000 files processed per hour
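
The per-minute and per-hour figures above are the same rate in different units (1,200 x 60 = 72,000). As a rough planning aid, the sketch below converts that rate into an estimated scan duration. It assumes the quoted peak rate is sustained for the whole scan, which real workloads may not achieve; the function names are hypothetical and are not part of any Imperva API.

```python
# Quoted peak rate from this page; treated here as a sustained rate (an assumption).
FILES_PER_MINUTE = 1_200


def files_per_hour(files_per_minute: int = FILES_PER_MINUTE) -> int:
    """Convert a per-minute file rate into a per-hour file rate."""
    return files_per_minute * 60


def estimated_scan_hours(total_files: int,
                         files_per_minute: int = FILES_PER_MINUTE) -> float:
    """Estimate hours needed to scan total_files at a constant rate."""
    return total_files / files_per_hour(files_per_minute)


if __name__ == "__main__":
    print(files_per_hour())                 # per-hour rate implied by 1,200 files/minute
    print(estimated_scan_hours(1_000_000))  # rough estimate for one million files
```

Treat the result as a lower bound on duration: file size, format, and network latency are not modeled here.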

How data discovery and classification works

Imperva DSF Data Discovery and Classification uses regular expression rules, pattern matching, and predefined name-based or content-based data classification types as it scans data store contents to tag matches it finds. Predefined policy packs operationalize compliance with regulations such as GDPR and CCPA. You can add customized, customer-specific data classification types, and scans can be scheduled and repeated to ensure ongoing awareness of the types of data within an organization's data stores.
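
To make the idea of name-based and content-based classification types concrete, here is a minimal Python sketch of regular-expression rules applied to a column name and a sample of its values. It is an illustration only, not Imperva's implementation: the `ClassificationType` class, the `classify_column` function, the rule labels, and the patterns (including the `CUST-` identifier format) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ClassificationType:
    """A hypothetical classification type with optional name and content rules."""
    label: str
    name_pattern: Optional[str] = None     # matched against the column/field name
    content_pattern: Optional[str] = None  # matched against sampled values


# Illustrative rules only; real policy packs would be far more thorough.
RULES: List[ClassificationType] = [
    ClassificationType(
        "EMAIL",
        name_pattern=r"e[-_]?mail",
        content_pattern=r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    ),
    ClassificationType(
        "US_SSN",
        name_pattern=r"ssn|social",
        content_pattern=r"\b\d{3}-\d{2}-\d{4}\b",
    ),
    # Example of a customized, customer-specific type (assumed ID format).
    ClassificationType("CUSTOMER_ID", content_pattern=r"\bCUST-\d{6}\b"),
]


def classify_column(column_name: str,
                    sample_values: Iterable[object],
                    rules: Iterable[ClassificationType] = RULES) -> Set[str]:
    """Return the labels whose name rule or content rule matches the column."""
    values = [str(v) for v in sample_values]
    tags: Set[str] = set()
    for rule in rules:
        if rule.name_pattern and re.search(rule.name_pattern, column_name, re.IGNORECASE):
            tags.add(rule.label)
        if rule.content_pattern and any(re.search(rule.content_pattern, v) for v in values):
            tags.add(rule.label)
    return tags


if __name__ == "__main__":
    print(classify_column("contact_email", ["n/a"]))              # tagged by name
    print(classify_column("notes", ["SSN 123-45-6789 on file"]))  # tagged by content
```

In this sketch a column is tagged when either its name or any sampled value matches, which mirrors the name-based versus content-based distinction; a scheduled scan would simply re-run the same rules against fresh samples.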

For unstructured data, additional machine learning and natural language processing algorithms assist in locating files. Machine learning is further leveraged to label and assign attributes to the found data automatically.

Structured Data

Characteristics
  • Predefined data models
  • Easy to search
  • Text-based
Resides In
  • Relational databases
  • Data warehouses
  • Stored in rows and columns
Examples
  • Dates
  • Phone numbers
  • Social security numbers
  • Customer names
  • Transactional information

Unstructured Data

Characteristics
  • No predefined data models
  • Difficult to search text
  • PDF, Images, Video
Resides In
  • Applications
  • Data warehouses and lakes
  • Stored in various forms
Examples
  • Documents
  • Emails and messages
  • Conversation transcripts
  • Image files
  • Open-ended survey answers

Semi-structured Data

Characteristics
  • Loosely organized
  • Meta-level structure that can contain unstructured data
  • HTML, XML, JSON
Resides In
  • Relational databases
  • Tagged-text format
  • Stored in abstracts & figures
Examples
  • Server logs
  • Tweets organized by hashtags
  • Email sorting by folders (inbox; sent; draft)

Data discovery and classification with no boundaries

With Imperva DSF you can standardize data security controls across large and complex enterprise data environments, so you have full visibility and centralized command of what is happening across all of your file stores and data assets, on-premises, in the hybrid cloud, and across multiple clouds.

Support across multiple DBaaS and NoSQL platforms & services

MongoDB
Couchbase
CockroachDB
Snowflake
Neo4j
Hadoop
Informix
Cassandra
Cloudera
Teradata
Aerospike
Sybase
DataStax
MarkLogic
Kinetica
Redis
Percona
InterSystems
YugabyteDB
SAP

Imperva Data Security Fabric protects all data types with a single system that delivers multiple business capabilities

Imperva Data Security Fabric is the first data-centric solution that enables your organization's security and compliance teams to quickly and easily secure sensitive data, no matter where it resides, with an integrated, proactive approach to visibility and predictive analytics.

Imperva Data Security Fabric is composed of cutting-edge orchestrated technical capabilities that work in unison to protect your data across your entire organization:

Data Discovery & Classification

Discover ungoverned data, classify all data, and assess vulnerabilities.

Data Activity Monitoring

Gain complete visibility and ensure compliance by continuously monitoring, auditing, and analyzing all data stores and data types.

Data Retention & Archive

Meet any data archiving requirement using the most cost-efficient storage technology available.

Data Risk Management

Detect and report non-compliant, risky, or malicious data access behavior across all of your data repositories enterprise-wide to accelerate remediation.

Ecosystem Integrations

Enhance the value of your existing technology investments, for both incident context and additional data capabilities.

Data Encryption & Tokenization

Protect critical data with encryption, key management, and tokenization wherever it resides.

Static Data Masking

Modify sensitive data so it is of no value to unauthorized users while still being usable by your systems.

Automated Workflows & Playbooks

Achieve higher scale and increase DevOps efficiency through the use of automated workflows delivered through trusted repositories.

Sensitive Data Management

Identify, locate, classify, and secure sensitive data across various data stores located either on-premises or in the cloud.
