What is Data Security?
Data security is the process of protecting corporate data and preventing data loss through unauthorized access. This includes protecting your data from attacks that can encrypt or destroy data, such as ransomware, as well as attacks that can modify or corrupt your data. Data security also ensures data is available to anyone in the organization who has access to it.
Some industries require a high level of data security to comply with data protection regulations. For example, organizations that process payment card information must use and store payment card data securely, and healthcare organizations in the USA must secure protected health information (PHI) in line with HIPAA.
But even if your organization is not subject to a regulation or compliance standard, the survival of a modern business depends on data security, which can impact both the organization’s key assets and private data belonging to its customers.
Why is Data Security Important?
The Ponemon Institute’s Cost of Data Breach Study found that, on average, a data breach in the USA cost $8 million. The average incident affected 25,575 user accounts, which means that beyond financial losses, most incidents lead to loss of customer trust and damage to reputation.
Lawsuits, settlements, and fines related to data breaches are also on the rise, with many governments introducing more stringent regulations around data privacy. Consumers have much more extensive rights, especially in the EU, California, and Australia, with the introduction of GDPR, CCPA, APP, and CPS 234.
Companies operating in regulated industries are affected by additional standards, such as HIPAA for healthcare organizations in the USA, and PCI DSS for organizations processing credit card data.
Over the past decade, social engineering, ransomware, and advanced persistent threats (APTs) have been on the rise. These threats are difficult to defend against and can cause catastrophic damage to an organization’s data.
There is no simple solution to data security—just adding another security solution won’t solve the problem. IT and information security teams must actively and creatively consider their data protection challenges and cooperate to improve their security posture. It is also critical to evaluate the cost of current security measures, their contribution to data security, and the expected return on investment from additional investments.
Data Security vs Data Privacy
Data privacy is the distinction between data in a computer system that can be shared with third parties (non-private data), and data that cannot be shared with third parties (private data). There are two main aspects to enforcing data privacy:
- Access control: ensuring that anyone who tries to access the data is authenticated to confirm their identity, and authorized to access only the data they are allowed to access.
- Data protection: ensuring that even if unauthorized parties manage to access the data, they cannot view it or cause damage to it. Data protection methods include encryption, which prevents anyone from viewing data if they do not have the decryption key, and data loss prevention mechanisms, which prevent users from transferring sensitive data outside the organization.
Data security has many overlaps with data privacy. The same mechanisms used to ensure data privacy are also part of an organization’s data security strategy.
The primary difference is that data privacy mainly focuses on keeping data confidential, while data security mainly focuses on protecting from malicious activity. For example, encryption could be a sufficient measure to protect privacy, but may not be sufficient as a data security measure. Attackers could still cause damage by erasing the data or double-encrypting it to prevent access by authorized parties.
Learn more in our detailed guide to data privacy
Data Security Risks
Below are several common issues faced by organizations of all sizes as they attempt to secure sensitive data.
A large percentage of data breaches are not the result of a malicious attack but are caused by negligent or accidental exposure of sensitive data. It is common for an organization’s employees to share, grant access to, lose, or mishandle valuable data, either by accident or because they are not aware of security policies.
Phishing and Other Social Engineering Attacks
Social engineering attacks are a primary vector used by attackers to access sensitive data. They involve manipulating or tricking individuals into providing private information or access to privileged accounts.
Phishing is a common form of social engineering. It involves messages that appear to be from a trusted source, but in fact are sent by an attacker. When victims comply, for example by providing private information or clicking a malicious link, attackers can compromise their device or gain access to a corporate network.
Insider threats are employees who inadvertently or intentionally threaten the security of an organization’s data. There are three types of insider threats:
- Non-malicious insider—these are users that can cause harm accidentally, via negligence, or because they are unaware of security procedures.
- Malicious insider—these are users who actively attempt to steal data or cause harm to the organization for personal gain.
- Compromised insider—these are users who are not aware that their accounts or credentials were compromised by an external attacker. The attacker can then perform malicious activity, pretending to be a legitimate user.
Ransomware is a major threat to data in companies of all sizes. Ransomware is malware that infects corporate devices and encrypts data, making it useless without the decryption key. Attackers display a ransom message asking for payment to release the key, but in many cases, even paying the ransom is ineffective and the data is lost.
Many types of ransomware can spread rapidly, and infect large parts of a corporate network. If an organization does not maintain regular backups, or if the ransomware manages to infect the backup servers, there may be no way to recover.
Learn more in the detailed guide to Ransomware protection
Data Loss in the Cloud
Many organizations are moving data to the cloud to facilitate easier sharing and collaboration. However, when data moves to the cloud, it is more difficult to control and prevent data loss. Users access data from personal devices and over unsecured networks. It is all too easy to share a file with unauthorized parties, either accidentally or maliciously.
SQL injection (SQLi) is a common technique used by attackers to gain illicit access to databases, steal data, and perform unwanted operations. It works by adding malicious code to a seemingly innocent database query.
SQL injection manipulates SQL code by adding special characters to a user input that change the context of the query. The database expects to process a user input, but instead starts processing malicious code that advances the attacker’s goals. SQL injection can expose customer data, intellectual property, or give attackers administrative access to a database, which can have severe consequences.
SQL injection vulnerabilities are typically the result of insecure coding practices. It is relatively easy to prevent SQL injection if coders use secure mechanisms for accepting user inputs, which are available in all modern database systems.
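As a minimal sketch of the difference between insecure and secure input handling, the example below uses Python's built-in `sqlite3` module. The `users` table, the function names, and the payload are illustrative assumptions, not part of any specific product.

```python
import sqlite3

def find_user_unsafe(conn, username):
    # VULNERABLE: user input is concatenated into the SQL text,
    # so special characters can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, username):
    # The ? placeholder passes the input as data, never as SQL code.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, payload)  # condition is always true
safe_rows = find_user_safe(conn, payload)      # payload is treated as a name
```

With the concatenated query, the payload turns the `WHERE` clause into a condition that is always true and every row is returned. With the parameterized query, the same payload is compared as a literal name and matches nothing.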
Learn more in the detailed guide to SQL injection
Common Data Security Solutions and Techniques
There are several technologies and practices that can improve data security. No one technique can solve the problem, but by combining several of the techniques below, organizations can significantly improve their security posture.
Data Discovery and Classification
Modern IT environments store data on servers, endpoints, and cloud systems. Visibility over data flows is an important first step in understanding what data is at risk of being stolen or misused. To properly protect your data, you need to know the type of data, where it is, and what it is used for. Data discovery and classification tools can help.
Data discovery is the basis for knowing what data you have. Data classification allows you to create scalable security solutions by identifying which data is sensitive and needs to be secured. Data discovery and classification solutions enable tagging files on endpoints, file servers, and cloud storage systems, letting you visualize data across the enterprise and apply the appropriate security policies.
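To illustrate how automated classification can tag content, here is a simplified sketch that scans text for payment card numbers and email addresses. The patterns, tag names, and the `classify` function are assumptions made for illustration. Commercial tools use far richer detection logic.

```python
import re

# Illustrative patterns only; real classifiers cover many more data types.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to filter out digit runs that are not card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def classify(text: str) -> set[str]:
    """Return the set of sensitivity tags that apply to the text."""
    tags = set()
    if any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text)):
        tags.add("payment-card")
    if EMAIL_PATTERN.search(text):
        tags.add("email-address")
    return tags or {"unclassified"}
```

For example, `classify("Card 4111 1111 1111 1111 on file")` returns `{"payment-card"}`, because the well-known test number passes the Luhn check, while a random 16-digit sequence that fails the check is left unclassified.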
Data masking lets you create a synthetic version of your organizational data, which you can use for software testing, training, and other purposes that don’t require the real data. The goal is to protect data while providing a functional alternative when needed.
Data masking retains the data type, but changes the values. Data can be modified in a number of ways, including encryption, character shuffling, and character or word substitution. Whichever method you choose, you must change the values in a way that cannot be reverse-engineered.
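The character substitution approach can be sketched as follows. This is a minimal illustration, assuming random replacement is acceptable for the use case; the `mask_value` function name and the sample phone number are hypothetical.

```python
import secrets
import string

def mask_value(value: str) -> str:
    """Replace each letter or digit with a random one of the same kind."""
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(secrets.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(secrets.choice(pool))
        else:
            masked.append(ch)  # keep separators so the format survives
    return "".join(masked)

masked_phone = mask_value("555-867-5309")
```

Because each replacement character is drawn at random and no mapping is stored, the original value cannot be recovered from the masked one, while the length and separators are preserved for testing purposes.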
Identity and Access Management
Identity and Access Management (IAM) is a business process, strategy, and technical framework that enables organizations to manage digital identities. IAM solutions allow IT administrators to control user access to sensitive information within an organization.
Systems used for IAM include single sign-on systems, two-factor authentication, multi-factor authentication, and privileged access management. These technologies enable the organization to securely store identity and profile data, and support governance, ensuring that the appropriate access policies are applied to each part of the infrastructure.
Data encryption is a method of converting data from a readable format (plaintext) to an unreadable encoded format (ciphertext). The data can be read or processed only after it has been decrypted with the decryption key.
In public-key (asymmetric) cryptography, there is no need to share the decryption key. Each party has a key pair: the sender encrypts data with the recipient’s public key, and only the recipient’s private key, which is never shared, can decrypt it. This removes the risk of the key being intercepted while it is exchanged.
Data encryption can prevent hackers from accessing sensitive information. It is essential for most security strategies and is explicitly required by many compliance standards.
Data Loss Prevention (DLP)
To prevent data loss, organizations can use a number of safeguards, including backing up data to another location. Physical redundancy can help protect data from natural disasters, outages, or attacks on local servers. Redundancy can be performed within a local data center, or by replicating data to a remote site or cloud environment.
Beyond basic measures like backup, DLP software solutions can help protect organizational data. DLP software automatically analyzes content to identify sensitive data, enabling central control and enforcement of data protection policies, and alerting in real-time when it detects anomalous use of sensitive data, for example, large quantities of data copied outside the corporate network.
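The alerting behavior described above can be sketched in a few lines. This is only an illustration of the idea: the `Transfer` record, the internal domain suffix, and the 50 MiB threshold are all hypothetical, and the `sensitive` flag is assumed to come from an upstream classification step.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Transfer:
    user: str
    destination: str   # hostname the data was sent to
    size_bytes: int
    sensitive: bool    # assumed to be set by a classification step

INTERNAL_SUFFIX = ".corp.example"    # hypothetical internal domain
THRESHOLD_BYTES = 50 * 1024 * 1024   # hypothetical 50 MiB limit per user

def find_violations(transfers):
    """Total sensitive bytes sent outside the network, per user over the limit."""
    totals = defaultdict(int)
    for t in transfers:
        external = not t.destination.endswith(INTERNAL_SUFFIX)
        if t.sensitive and external:
            totals[t.user] += t.size_bytes
    return {user: total for user, total in totals.items() if total > THRESHOLD_BYTES}
```

A real DLP product would evaluate events as they stream in and apply per-policy rules, but the core decision is the same: combine content sensitivity, destination, and volume to decide when to alert.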
Learn more in the detailed guide to DLP
Governance, Risk, and Compliance (GRC)
GRC is a methodology that can help improve data security and compliance:
- Governance creates controls and policies enforced throughout an organization to ensure compliance and data protection.
- Risk involves assessing potential cybersecurity threats and ensuring the organization is prepared for them.
- Compliance ensures organizational practices are in line with regulatory and industry standards when processing, accessing, and using data.
One of the simplest best practices for data security is ensuring users have unique, strong passwords. Without central management and enforcement, many users will use easily guessable passwords or use the same password for many different services. Password spraying and other brute force attacks can easily compromise accounts with weak passwords.
A simple measure is enforcing longer passwords and asking users to change passwords frequently. However, these measures are not enough, and organizations should consider multi-factor authentication (MFA) solutions that require users to identify themselves with a token or device they own, or via biometric means.
Another complementary solution is an enterprise password manager that stores employee passwords in encrypted form, reducing the burden of remembering passwords for multiple corporate systems, and making it easier to use stronger passwords. However, the password manager itself becomes a high-value target for attackers, so it must be strongly protected.
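Central enforcement of a password policy can be as simple as the sketch below. The minimum length, the tiny list of common passwords, and the `password_problems` function are illustrative assumptions; production systems typically check against much larger lists of known-weak passwords.

```python
# Tiny illustrative sample; real deny lists contain many thousands of entries.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}
MIN_LENGTH = 12  # hypothetical policy value

def password_problems(password: str, username: str) -> list[str]:
    """Return the policy rules the password violates (empty list if none)."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("commonly used")
    if username and username.lower() in password.lower():
        problems.append("contains username")
    return problems
```

Returning a list of violated rules, rather than a single yes or no, lets the application tell users exactly what to fix.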
Learn more in the detailed guide to passwordless authentication
Authentication and Authorization
Organizations must put in place strong authentication methods, such as OpenID Connect, which is built on OAuth 2.0, for web-based systems. It is highly recommended to enforce multi-factor authentication when any user, whether internal or external, requests sensitive or personal data.
In addition, organizations must have a clear authorization framework in place, which ensures that each user has exactly the access rights they need to perform a function or consume a service, and no more. Periodic reviews and automated tools should be used to clean up permissions and remove authorization for users who no longer need them.
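A minimal sketch of a role-based authorization check, together with a helper for periodic permission reviews, might look like this. The roles, users, and permission names are hypothetical.

```python
# Hypothetical roles and assignments for illustration.
ROLE_PERMISSIONS = {
    "analyst": {"reports:read"},
    "billing": {"invoices:read", "invoices:write"},
}
USER_ROLES = {"dana": {"analyst"}, "omar": {"analyst", "billing"}}

def granted_permissions(user: str) -> set[str]:
    """All permissions the user holds through their roles."""
    roles = USER_ROLES.get(user, set())
    return set().union(*(ROLE_PERMISSIONS.get(r, set()) for r in roles))

def is_authorized(user: str, permission: str) -> bool:
    # Unknown users and unknown permissions are denied.
    return permission in granted_permissions(user)

def unused_permissions(user: str, used: set[str]) -> set[str]:
    """Permissions granted but never exercised: candidates for removal."""
    return granted_permissions(user) - used
```

Comparing granted permissions against those actually used, for example from access logs, gives reviewers a concrete list of rights that can be revoked.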
Data Security Audits
The organization should perform security audits at least every few months. This identifies gaps and vulnerabilities across the organization’s security posture. It is a good idea to have the audit performed by a third-party expert, for example in a penetration testing model. However, it is also possible to perform a security audit in house. Most importantly, when the audit exposes security issues, the organization must devote time and resources to address and remediate them.
Anti-Malware, Antivirus, and Endpoint Protection
Malware is the most common vector of modern cyberattacks, so organizations must ensure that endpoints like employee workstations, mobile devices, servers, and cloud systems, have appropriate protection. The basic measure is antivirus software, but this is no longer enough to address new threats like file-less attacks and unknown zero-day malware.
Endpoint protection platforms (EPP) take a more comprehensive approach to endpoint security. They combine antivirus with a machine-learning-based analysis of anomalous behavior on the device, which can help detect unknown attacks. Most platforms also provide endpoint detection and response (EDR) capabilities, which help security teams identify breaches on endpoints as they happen, investigate them, and respond by locking down and reimaging affected endpoints.
Zero trust is a security model introduced by Forrester analyst John Kindervag, which has been adopted by the US government, several technical standards bodies, and many of the world’s largest technology companies. The basic principle of zero trust is that no entity on a network should be trusted, regardless of whether it is outside or inside the network perimeter.
Zero trust has a special focus on data security, because data is the primary asset attackers are interested in. A zero trust architecture aims to protect data against insider and outside threats by continuously verifying all access attempts, and denying access by default.
Zero trust security mechanisms build multiple security layers around sensitive data—for example, they use microsegmentation to ensure sensitive assets on the network are isolated from other assets. In a true zero trust network, attackers have very limited access to sensitive data, and there are controls that can help detect and respond to any anomalous access to data.
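The deny-by-default principle behind microsegmentation can be illustrated with a short sketch. The segment names, ports, and `Flow` type are assumptions for illustration; in practice these rules are enforced by network or host-level controls rather than application code.

```python
from typing import NamedTuple

class Flow(NamedTuple):
    source_segment: str
    dest_segment: str
    port: int

# Hypothetical allowlist: only these flows are permitted.
ALLOWED_FLOWS = {
    Flow("web", "app", 8443),
    Flow("app", "database", 5432),
}

def is_allowed(flow: Flow) -> bool:
    # Deny by default: anything not explicitly listed is rejected.
    return flow in ALLOWED_FLOWS
```

Here the web tier can never reach the database directly, even though both sit inside the network perimeter, because no rule explicitly allows that flow.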
Database security involves protecting database management systems such as Oracle, SQL Server, or MySQL, from unauthorized use and malicious cyberattacks. The main elements protected by database security are:
- The database management system (DBMS).
- Data stored in the database.
- Applications associated with the DBMS.
- The physical or virtual database server and any underlying hardware.
- Any computing and network infrastructure used to access the database.
A database security strategy involves tools, processes, and methodologies to securely configure and maintain security inside a database environment and protect databases from intrusion, misuse, and damage.
Big Data Security
Big data security involves practices and tools used to protect large datasets and data analysis processes. Big data commonly takes the form of financial logs, healthcare data, data lakes, archives, and business intelligence datasets. Within the big data perimeter there are three primary scenarios that require protection: inbound data transfers, outbound data transfers, and data at rest.
Big data security aims to prevent accidental and intentional breaches, leaks, losses, and exfiltration of large amounts of data. Let’s review popular big data services and see the main strategies for securing them.
AWS Big Data
AWS offers analytics solutions for big data implementations. There are various services AWS offers to automate data analysis, manipulate datasets, and derive insights, including Amazon Simple Storage Service (S3), Amazon Kinesis, Amazon EMR (formerly Elastic MapReduce), and AWS Glue.
AWS big data security best practices include:
- Access policy options—use access policy options to manage access to your S3 resources.
- Data encryption policy—use Amazon S3 and AWS KMS for encryption management.
- Manage data with object tagging—categorize and manage S3 data assets using tags, and apply tags indicating sensitive data that requires special security measures.
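As one example of an S3 access policy, the sketch below builds a bucket policy that denies any request not sent over TLS, using the `aws:SecureTransport` condition key. The bucket name and the helper function are hypothetical, and the resulting JSON would still need to be attached to the bucket through the console, CLI, or an SDK.

```python
import json

def deny_insecure_transport_policy(bucket: str) -> str:
    """Build a bucket policy that rejects requests made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)

# "example-analytics-bucket" is a placeholder name.
policy_json = deny_insecure_transport_policy("example-analytics-bucket")
```

Generating the policy in code makes it easier to apply the same rule consistently across many buckets.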
Learn more in the detailed guide to AWS Big Data
Azure Big Data
Microsoft Azure cloud offers big data and analytics services that can process a high volume of structured and unstructured data. The platform offers elastic storage using Azure storage services, real-time analytics, database services, as well as machine learning and data engineering solutions.
Azure big data security best practices include:
- Monitor as many processes as possible.
- Leverage Azure Monitor and Log Analytics to gain visibility over data flows.
- Leverage Azure services for backup, restore, and disaster recovery.
Learn more in the detailed guide to Azure Big Data
Google Cloud Big Data
The Google Cloud Platform offers multiple services that support big data storage and analysis. BigQuery is a high-performance SQL-compatible engine, which can perform analysis on large data volumes in seconds. Additional services include Dataflow, Dataproc, and Data Fusion.
Google Cloud big data security best practices include:
- Define BigQuery access controls according to the least privilege principle.
- Use policy tags or type-based classification to identify sensitive data.
- Leverage column-level security to check if a user has the right to view specific data at query time.
Snowflake is a cloud data warehouse for enterprises, built for high performance big data analytics. The architecture of Snowflake physically separates compute and storage, while integrating them logically. Snowflake offers full relational database support and can work with structured and semi-structured data.
Snowflake security best practices include:
- Define network and site access through IP allow/block lists.
- Use SCIM to manage user identities and groups.
- Leverage key pair authentication and rotation to improve client authentication security.
- Enable multi-factor authentication.
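The logic behind an IP allow/block list can be sketched with Python's standard `ipaddress` module. This is a generic illustration rather than Snowflake's own implementation; the network ranges are documentation addresses, and checking the block list first is a design choice made in this sketch.

```python
import ipaddress

# Hypothetical ranges taken from the documentation address blocks.
ALLOWED = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED = [ipaddress.ip_network("203.0.113.128/25")]

def is_permitted(address: str) -> bool:
    """Allow an address only if it is on the allow list and not blocked."""
    ip = ipaddress.ip_address(address)
    if any(ip in net for net in BLOCKED):
        return False  # the block list takes precedence in this sketch
    return any(ip in net for net in ALLOWED)
```

Addresses that appear on neither list are rejected, so access stays closed unless it has been explicitly opened.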
Elasticsearch is an open-source full-text search and analytics engine that is highly scalable, allowing search and analytics on big data in real time. It powers applications with complex search requirements. Elasticsearch provides a distributed system on top of Apache Lucene, uses the Lucene StandardAnalyzer for indexing, supports automatic type detection, and exposes Lucene features through a JSON-based REST API.
Elasticsearch security best practices include:
- Use strong passwords to protect access to search clusters
- Encrypt all communications using SSL/TLS
- Leverage role-based access control (RBAC)
- Use IP filtering for client access
- Turn on auditing and monitor logs on a regular basis
Learn more in the detailed guide to Elasticsearch
Splunk is a software platform that indexes machine data, makes it searchable and turns it into actionable intelligence. It pulls log files from applications, servers, mobile devices, and websites, aggregates them, and provides rich analysis features.
Splunk security best practices include:
- Preventing unauthorized access by defining RBAC, data encryption, and obfuscation of credentials.
- Using SSL/TLS encryption for data ingestion and internal Splunk communications.
- Hardening Splunk instances by ensuring they are physically secure and do not store secrets in plaintext.
- Using audit events to track any changes to Splunk system configuration.
Learn more in the detailed guide to Splunk Architecture
Securing Data in Enterprise Applications
Enterprise applications power mission critical operations in organizations of all sizes. Enterprise application security aims to protect enterprise applications from external attacks, abuse of authority, and data theft.
Email security is the process of ensuring the availability, integrity, and reliability of email communications by protecting them from cyber threats.
Technical standards bodies have recommended email security protocols including SSL/TLS, Sender Policy Framework (SPF), and DomainKeys Identified Mail (DKIM). These protocols are implemented by email clients and servers, including Microsoft Exchange and Google Workspace (formerly G Suite), to ensure secure delivery of emails. In addition to implementing these protocols, organizations and individuals can use a secure email gateway to protect their email from a variety of threats.
Enterprise Resource Planning (ERP) is software designed to manage and integrate the functions of core business processes such as finance, human resources, supply chain, and inventory management into one system. ERP systems store highly sensitive information and are, by definition, a mission critical system.
ERP security is a broad set of measures designed to protect an ERP system from unauthorized access and ensure the accessibility and integrity of system data. The Information Systems Audit and Control Association (ISACA) recommends regularly performing security assessments of ERP systems, including software vulnerabilities, misconfigurations, separation of duties (SoD) conflicts, and compliance with vendor security recommendations.
Digital Asset Management (DAM) is a technology platform and business process for organizing, storing, and acquiring rich media and managing digital rights and licenses. Rich media assets include photos, music, videos, animations, podcasts, and other multimedia content. Data stored in DAM systems is sensitive because it often represents company IP, and is used in critical processes like sales, marketing, and delivery of media to viewers and web visitors.
Security best practices for DAM include:
- Implement the principle of least privilege.
- Use an allowlist for file destinations.
- Use multi-factor authentication to control access by third parties.
- Regularly review automation scripts, limit privileges of commands used, and control the automation process through logging and alerting.
Learn more in the detailed guide to Digital Asset Management (DAM)
Customer Relationship Management (CRM) is a combination of practices, strategies, and technologies that businesses use to manage and analyze customer interactions and data throughout the customer lifecycle. CRM data is highly sensitive because it can expose an organization’s most valuable asset: its customer relationships. CRM data also typically includes personally identifiable information (PII), which is subject to data privacy regulations.
Security best practices for CRM include:
- Perform periodic IT risk assessment audits for CRM systems.
- Perform CRM activity monitoring to identify unusual or suspicious usage.
- Encourage CRM administrators to follow security best practices.
- Educate CRM users on security best practices.
- If you operate CRM as SaaS, perform due diligence of the SaaS provider’s security practices.
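Activity monitoring often starts with a simple statistical baseline. The sketch below flags a day on which a user's activity count deviates sharply from their own history. The three-standard-deviation threshold, the function name, and the sample counts are assumptions for illustration; real monitoring tools use more sophisticated models.

```python
from statistics import mean, stdev

def is_unusual(baseline: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's count if it is far from the user's historical average."""
    if len(baseline) < 2:
        return False  # not enough history to judge
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return today != mu  # any change from a perfectly flat baseline
    return abs(today - mu) / sigma > threshold

# Hypothetical daily counts of CRM records exported by one user.
history = [20, 22, 19, 21, 23, 20, 18]
```

With this history, a day with 400 exported records is flagged for review, while a day with 21 is not.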
Data Security with Imperva
Imperva’s data security solution protects your data wherever it lives—on-premises, in the cloud, and in hybrid environments. It also provides security and IT teams with full visibility into how the data is being accessed, used, and moved around the organization.
Our comprehensive approach relies on multiple layers of protection, including:
- Database firewall—blocks SQL injection and other threats, while evaluating for known vulnerabilities.
- User rights management—monitors data access and activities of privileged users to identify excessive, inappropriate, and unused privileges.
- Data masking and encryption—obfuscates sensitive data so it would be useless to the bad actor, even if somehow extracted.
- Data loss prevention (DLP)—inspects data in motion, at rest on servers, in cloud storage, or on endpoint devices.
- User behavior analytics—establishes baselines of data access behavior, uses machine learning to detect and alert on abnormal and potentially risky activity.
- Data discovery and classification—reveals the location, volume, and context of data on-premises and in the cloud.
- Database activity monitoring—monitors relational databases, data warehouses, big data, and mainframes to generate real-time alerts on policy violations.
- Alert prioritization—Imperva uses AI and machine learning technology to look across the stream of security events and prioritize the ones that matter most.
See Additional Guides on Key Data Security Topics
Authored by Imperva
Learn about tools and practices that can help you protect your organization against cyber threats.
Authored by Imperva
Learn about data privacy regulations and governance processes that can help achieve compliance.
Authored by Cato
Learn the principles of zero trust architecture and how it works.
Authored by Exabeam
Learn about data loss protection (DLP) solutions that can prevent sensitive data from loss, theft, and leakage.
Authored by Cynet
Learn about ransomware, the most severe threat vector threatening data security today.
Authored by Cloudian
Learn about advanced storage technology that can help prevent ransomware and recover data when attacks occur.
Authored by Bright Security
Learn about SQL injection attacks, in which attackers inject malicious code into SQL queries to steal data and gain unauthorized access.
Authored by NetApp
Learn about Amazon Web Services (AWS) big data solutions and how to manage and secure them.
Authored by NetApp
Learn about Microsoft Azure big data solutions and how to manage and secure them.
Authored by NetApp
Learn about Elasticsearch, a popular NoSQL database and enterprise search solution, and how to manage and secure it.
Authored by Cloudian
Learn about Splunk, a popular log management and analysis platform, and how to manage and secure it.
Authored by Cloudinary
Learn about digital asset management (DAM), an enterprise application that stores rich media, and how to manage and secure it.