The General Data Protection Regulation (GDPR) is the European Union’s new data protection regulation, designed to give individuals rights and protections over personal data that businesses or government entities collect or create. It unifies data protection rules across all member states of the European Union (EU) and replaces the Data Protection Directive. The GDPR applies to organizations of all sizes that collect or process personal data originating from the EU. Most importantly, it provides an enforcement mechanism, which takes effect on May 25, 2018. Organizations that fail to comply with the GDPR could face fines of up to €20M (~$22M) or 4% of global annual turnover (revenue) from the prior year, whichever is higher.
What is Pseudonymization?
While the GDPR doesn’t call out any specific technology (as technology evolves over time), it does encourage pseudonymization of personal data. Pseudonymization is a security technique that replaces directly identifying data with substitute values, such as realistic fictional data. The concept of personal data, often called personally identifiable information (PII), lies at the heart of the GDPR, and the idea of pseudonymization is to separate data from direct identifiers, so that the data cannot be linked back to an identity without additional information. In other words, the data subject can no longer be identified from the pseudonymized data alone.
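The separation described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed GDPR method: the field names, the DIRECT_IDENTIFIERS list, and the pseudonymize and reidentify helpers are all hypothetical, and the key table stands in for the "additional information" that would be stored and protected separately.

```python
import secrets

# Assumption: which fields count as direct identifiers is decided by the data controller.
DIRECT_IDENTIFIERS = ("name", "ssn")

def pseudonymize(record, key_table):
    """Return a copy of the record without direct identifiers, tagged with a random pseudonym."""
    token = secrets.token_hex(8)  # random value that reveals nothing about the person
    # The "additional information": stored apart from the working data set.
    key_table[token] = {field: record[field] for field in DIRECT_IDENTIFIERS}
    working = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    working["subject_id"] = token
    return working

def reidentify(working, key_table):
    """Link a pseudonymized record back to a person; requires access to the key table."""
    restored = dict(key_table[working["subject_id"]])
    restored.update({k: v for k, v in working.items() if k != "subject_id"})
    return restored

key_table = {}
record = {"name": "John Smith", "ssn": "123-45-6789", "age": 65}
working = pseudonymize(record, key_table)
```

Anyone who sees only the working record gets an age and an opaque subject_id. Linking it back to John Smith requires the key table, which is why that table needs to be held and secured separately.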
Use Case 1: Removes Sensitive Data
Pseudonymization enhances privacy by de-identifying sensitive information. It removes or obscures direct identifiers, such as name, social security number, credit card number, or contact information. As a result, pseudonymization helps reduce the risk of data breach, data loss, and data theft. Even if hackers obtained privileged users’ credentials or malicious insiders gained legitimate access, with pseudonymization they wouldn’t get ‘real’ data. Data controllers can utilize this technique to handle directly identifying data securely and separately from processed data to ensure non-attribution.
Use Case 2: Enables Data-driven Business
Pseudonymization not only helps protect the rights of individuals, but also enables data utility. Nowadays, for companies big and small, using data is an essential part of doing business. While the GDPR requires data controllers to collect data only for “specified, explicit and legitimate purposes”, it gives data controllers who pseudonymize personal data more flexibility to process the data for a purpose other than the one for which it was originally collected.
Take data masking as an example. It is considered a means of pseudonymization that replaces sensitive data with fictitious but realistic values. Let’s say a record shows that a man named John Smith, who is 65 years old, has a Social Security number (SSN) of 123-45-6789. After the data is masked, John Smith might become Tom Potter, who is 58 years old and has an SSN of 223-56-7890. The masked data maintains referential integrity and operational accuracy, so that personal data can be securely processed for scientific, historical and statistical purposes. This is why pseudonymization may facilitate processing of personal data beyond the original collection purposes.
Figure 1: Data masking replaces original data with fictitious, realistic data.
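One way to picture masking that preserves referential integrity is a deterministic mapping: the same original value always yields the same fictitious value, so records that referred to the same person before masking still line up afterward. The sketch below is an illustration under assumptions, not a description of any particular masking product. The key, the name lists, the age shift of up to five years, and all function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"  # hypothetical; a real key would be kept in a key store
FIRST_NAMES = ["Tom", "Ann", "Raj", "Eva"]
LAST_NAMES = ["Potter", "Lee", "Novak", "Diaz"]

def _digest(value, purpose):
    """Keyed hash of a value; the purpose label keeps the fields independent of each other."""
    message = f"{purpose}:{value}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()

def mask_name(name):
    d = _digest(name, "name")
    return f"{FIRST_NAMES[d[0] % len(FIRST_NAMES)]} {LAST_NAMES[d[1] % len(LAST_NAMES)]}"

def mask_ssn(ssn):
    d = _digest(ssn, "ssn")
    digits = "".join(str(b % 10) for b in d[:9])  # nine fictitious digits
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"  # keep the NNN-NN-NNNN layout

def mask_age(age, subject_key):
    d = _digest(subject_key, "age")
    return age + (d[0] % 11) - 5  # shift by -5..+5 years so the value stays plausible

def mask_record(record):
    return {
        "name": mask_name(record["name"]),
        "age": mask_age(record["age"], record["ssn"]),
        "ssn": mask_ssn(record["ssn"]),
    }

original = {"name": "John Smith", "age": 65, "ssn": "123-45-6789"}
masked = mask_record(original)
```

Because the mapping is keyed and deterministic, masking the same SSN in two different tables produces the same fictitious SSN in both, which keeps joins intact. The trade-off is that the key becomes sensitive, since it fixes how original values map to masked ones.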
Use Case 3: Practices Data Minimization
Last but not least, pseudonymization allows data controllers to practice “data minimization”, another principle of the GDPR, which limits the use of data to what is necessary for a specific purpose. For example, an insurance company collects personal information for the purpose of issuing a policy. Later on, the company wants to analyze this data to improve the pricing of its policies. Under the principle of data minimization, the company would not be able to do so, because personal data collected for one purpose (e.g., issuing a policy) cannot be used for a new purpose (e.g., creating a database for pricing analysis). If the data is pseudonymized, however, for instance via data masking, the company could use the masked database for pricing analysis, as pseudonymization helps meet the GDPR’s data security requirements for safeguarding personal data.
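To make the insurance example concrete, the sketch below runs a simple pricing analysis on masked records only. Everything in it is hypothetical: the record layout, the sample values, the ten-year age bands, and the choice of loss ratio (claims paid divided by premium) as the metric. The point is that the analysis needs only the non-identifying fields.

```python
from collections import defaultdict

# Hypothetical masked policy records: holder names are fictitious,
# while the fields needed for pricing are kept.
masked_policies = [
    {"holder": "Tom Potter", "age": 58, "annual_premium": 1200.0, "claims_paid": 300.0},
    {"holder": "Ann Lee", "age": 52, "annual_premium": 800.0, "claims_paid": 500.0},
    {"holder": "Raj Diaz", "age": 34, "annual_premium": 1000.0, "claims_paid": 250.0},
]

def age_band(age, width=10):
    """Group an age into a band such as '50-59'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def loss_ratio_by_band(policies):
    """Total claims paid divided by total premium, per age band."""
    totals = defaultdict(lambda: [0.0, 0.0])  # band -> [claims, premium]
    for policy in policies:
        band_totals = totals[age_band(policy["age"])]
        band_totals[0] += policy["claims_paid"]
        band_totals[1] += policy["annual_premium"]
    return {band: claims / premium for band, (claims, premium) in totals.items()}

ratios = loss_ratio_by_band(masked_policies)
print(ratios)  # {'50-59': 0.4, '30-39': 0.25}
```

The holder field is never read by the analysis, so the pricing team can work from the masked database without seeing who the policyholders are.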
Data Protection and Flexibility
The GDPR introduces pseudonymization as a means of protecting individuals’ rights while allowing data controllers to benefit from the data’s utility. This technique significantly reduces the risk of data exposure while maintaining referential integrity for scientific, historical and statistical purposes. Pseudonymized data still falls within the scope of the GDPR, but pseudonymization gives data controllers more flexibility. Those who adopt pseudonymization techniques will have an easier time using personal data for secondary purposes, as well as meeting the data security and data protection by design requirements of the GDPR.