GDPR: A Perspective
Data Surely is King
GDPR in a nutshell
What is the European Union (EU) General Data Protection Regulation (GDPR)? The GDPR is a new regulation that the European Union will enforce starting May 2018 for any company that deals with EU individuals, whether as customers, employees or users. GDPR is not an entirely new methodology but an evolution of the dispersed privacy regulations and frameworks previously defined by individual countries. The rationale for the new law is simple: as the technology revolution continues to accelerate, citizens and governments have realized that data protection has been left trailing in its wake, to the extent that organizations, customers and users have lost the ability to keep track of the data they control, own or process. The convenience and cost savings we have all gained from technological advances have been immense, but data subjects and organizations have had their fingers burnt by identity theft and data breaches, as reported in the news and highlighted in multiple State of Privacy reports.
The GDPR requires organizations to put in place an appropriate governance framework when they collect and process information and empowers individuals to take control of their data.
A positive side effect is a significantly simpler legal framework: organizations need to comply with only one pan-European law rather than interpret privacy regulations country by country. The hope is that more countries outside the EU will embrace the GDPR privacy rules, further simplifying and strengthening data privacy on a global scale.
While GDPR has already been approved by the EU, it goes into effect, with enforcement, on May 25, 2018. It applies to all organizations that have European residents buying a service or product and that process their personal data, regardless of where in the world that data is held.
Risks for non-compliance
A central part of the GDPR is a strong enforcement strategy with heavy penalties for organizations that do not take it seriously and embrace the requirements of “Privacy by Design”, “Privacy by Default” and “Accountability”. Organizations must empower users to control their own data, including how it is used and when it is removed, as outlined by the GDPR concepts of “Right Of Access” and “Right To Be Forgotten”. The risk for organizations large and small, anywhere in the world, that have European citizens in their data sets and are found to be non-compliant is a fine of up to €20 million or 4% of annual global revenue, whichever is higher. For a small or medium business with an average revenue of $64M, the 4% figure alone comes to roughly $2.5M.
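As a worked example of the fine arithmetic, here is a minimal sketch. It assumes the upper fine tier of Article 83(5), where the ceiling is the greater of a flat EUR 20 million and 4% of worldwide annual turnover. The function names are hypothetical, and the sketch treats all amounts as being in one currency, so it ignores the dollar/euro conversion in the $64M example.

```python
from decimal import Decimal

# Assumed limits for the upper fine tier (Article 83(5)).
FLAT_CEILING = Decimal("20000000")  # EUR 20 million
TURNOVER_RATE = Decimal("0.04")     # 4% of worldwide annual turnover


def turnover_based_amount(annual_turnover: Decimal) -> Decimal:
    """Return 4% of the given annual turnover."""
    return annual_turnover * TURNOVER_RATE


def fine_ceiling(annual_turnover: Decimal) -> Decimal:
    """Return the maximum possible fine: the greater of the two limits."""
    return max(FLAT_CEILING, turnover_based_amount(annual_turnover))


if __name__ == "__main__":
    # Illustrative turnovers only: the $64M average from the text and a larger company.
    for turnover in (Decimal("64000000"), Decimal("1000000000")):
        print(turnover, turnover_based_amount(turnover), fine_ceiling(turnover))
```

For the $64M business the 4% amount is about 2.56 million, which is below the flat figure, so under this rule the flat EUR 20 million is what sets the ceiling. The percentage only becomes the binding limit once turnover exceeds 500 million.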
One indicator of how mature organizations in the United States are, and of how much they have focused on privacy and data management over the last 10 to 20 years, is a 2016 study by Osterman Research, which found that 73% of US companies are not confident that they can meet the GDPR requirements by the deadline. The three pillars highlight the main areas that organizations struggle with, all of which are fundamental to becoming compliant.
From a technical standpoint, it is worth noting that GDPR encompasses all privacy data regardless of how it is captured, e.g. via traditional “fat” applications, web browsers, mobile devices, IoT devices such as thermostats, and emerging robotic capabilities first seen in the “Roomba” and Amazon “Alexa”. Any device, any application and any user interaction that captures privacy data is governed by the GDPR.
With organizations required to incorporate the concepts of “Right Of Access” and “Right To Be Forgotten”, it is critical to understand where privacy data is located and stored so that the appropriate action can be taken and enforced. The challenge is that this data can today be found in logs across the enterprise, in databases intermingled with non-privacy data, in CRM solutions such as Salesforce, in customer support tickets, in marketing analysis data, in data lakes and so on. Privacy data spans the organization's own data centers and the cloud (PaaS or SaaS), and it is indiscriminately exchanged with partners on a regular basis as a main input for advertisements (ads), rationalization, customer support etc.
What is Private Data?
GDPR treats all personal data as in scope and defines personal data as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person”.
The GDPR deliberately does not list “Private Data Entities”, in order to future-proof the regulation and encourage a “Privacy by Design” approach. We can, however, get an idea of some of the data entities organizations must consider in their classification work from the familiar United States PII regulations:
- Full Name
- Vehicle Registration Number
- Genetic Information
- Email Address
- Driver's License Number
- Date of Birth
- Face, Fingerprint etc.
- Passport Number
- Login Name
- Telephone Number
- Web Cookie
- National Id. Number
- Credit Card Numbers
- Home Address
- IP Address
These data entities are all covered by GDPR, and some are difficult to deal with because most US organizations have not previously considered them regulated, such as “IP Address”, “Mobile Device Identifiers” or “Web Cookies”. As a result they are stored in multiple data stores without consideration or awareness. Even more interesting, some of these data entities are the backbone of large advertising industries that use "collaborative filtering" to create a "personalized experience": personally identifiable information captured via web cookies or mobile applications is sold to “third-party” organizations and turned into in-app or web advertisements.
With the approval of GDPR, two new categories of data, genetic and biometric data, joined the existing list of “sensitive” or “special” personal data: data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and data concerning health or sex life and sexual orientation.
We should also mention that pseudonymised data is still considered personal data because it can, by definition, be re-associated with a specific person; the exposure is greatest when the key is poorly protected or shared.
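To make the point about pseudonymised data concrete, here is a minimal sketch that assumes pseudonymisation is done with a keyed hash (HMAC-SHA256). That is one common technique, not something GDPR prescribes, and the function names and sample values are hypothetical. It shows that anyone holding the key can link a token back to a person, which is why the key must be protected.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed-hash token (hex-encoded HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(token: str, candidates: Iterable[str], key: bytes) -> Optional[str]:
    """Return the candidate whose token matches, or None if no candidate matches.

    This only works for a party that holds the key, which is the
    re-association risk that keeps pseudonymised data in scope.
    """
    for candidate in candidates:
        if hmac.compare_digest(pseudonymise(candidate, key), token):
            return candidate
    return None


if __name__ == "__main__":
    key = b"example-key-kept-in-a-secrets-store"  # hypothetical key
    known_users = ["alice@example.com", "bob@example.com"]
    token = pseudonymise("alice@example.com", key)

    print(reidentify(token, known_users, key))           # key holder can re-identify
    print(reidentify(token, known_users, b"wrong-key"))  # without the key: no match
```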
Organizations that take GDPR seriously are approaching their data from four angles.
What data do I have?
- IP Address
- Web Analysis
- Customer data
- Sales and Support data
- Cookie data
Where is the data stored, and for how long?
- Logs - 7 years
- CRM, SAS - -
- Tickets - 2 years
- Database - Forever
- Cloud - months
Am I securing and controlling access to the data?
- Active Directory controlled
- Cloud IAM
Am I using the data for the purpose for which it was originally collected?
- Do I even know what data I collect?
- How do I control usage?
- Data Governance
- Partner exchanges
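The four questions above can be captured in a simple data inventory. The sketch below is one possible shape for such a record, not a prescribed GDPR artifact. The `DataAsset` class, its field names and the sample entries are hypothetical, and the retention periods simply reuse the illustrative values from the lists above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class DataAsset:
    category: str                  # what data do I have
    store: str                     # where is it stored
    retention_days: Optional[int]  # for how long (None = kept forever)
    access_control: str            # how is access secured
    purpose: str                   # why it was originally collected

    def is_past_retention(self, collected_on: date, today: date) -> bool:
        """True if a record collected on `collected_on` should already be gone."""
        if self.retention_days is None:
            return False  # no limit defined, so nothing ever expires
        return today > collected_on + timedelta(days=self.retention_days)


# Illustrative inventory; entries and purposes are assumptions for the example.
INVENTORY = [
    DataAsset("IP address", "logs", 7 * 365, "Active Directory", "troubleshooting"),
    DataAsset("Support data", "tickets", 2 * 365, "Cloud IAM", "customer support"),
    DataAsset("Customer data", "database", None, "Active Directory", "order handling"),
]

# Stores with no retention limit are the first candidates for review.
unbounded_stores = [asset.store for asset in INVENTORY if asset.retention_days is None]
```

Even a flat list like this makes gaps visible, for example stores that keep data forever or assets whose recorded purpose no longer matches how the data is used.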
Considering that the majority of companies (73%) are not confident that they have these pillars in place, or have a grasp of how to approach many of these topics, it is not surprising that GDPR is a hot topic here in 2017. One consequence of this complexity is a tendency to avoid applying controls to European data only, because doing so dramatically increases day-to-day work complexity, except for companies that are organizationally completely separated by geographic customer location; I do not know of any with such decentralized technology, security and processes. For all others, the decision is generally to protect all data, irrespective of its origin, with company-wide approaches. Some of the more common decisions I have seen are listed in the table below.
General Consequences of GDPR
| AAA standardization for all in-house applications (AD, LDAP, Okta) | AAA standardization for all third-party solutions (CRM, Cloud, SaaS) | Increased focus on ITSM processes |
| All IPs are considered private data entities (no logs, no analysis…) | Company-wide GDPR controls | Increased reliance on consents with centralized tracking |
| Executive-level privacy roles | Legal owning privacy versus tech. | Data Governance roles |
| Log standardization across SDLC with defined retention | Expand the SDLC to incl. review when data is changed | CMDB for data |
| Slip-Stream Anonymization | Encryption everywhere | End-point lock down |
New Business Opportunities?
As the government-controlled landscape changes, new businesses spring up to solve the problem. With GDPR enforcement on the horizon for May 2018, traditional security companies are emphasizing their products' ability to solve their piece of the puzzle, but no company has come forward with a more holistic approach. A few companies are experimenting with machine learning and AI to identify abnormalities in data streams, or with agents on endpoints that capture data and alert when privacy data is detected; none has reached a point of maturity yet.
Another interesting approach is instrumentation of applications. While promising, with a ton of additional benefits, the cost is high: it requires re-coding all applications, in addition to investments in a centralized data schema, API interfaces and data storage.
As I write this paper, I am still waiting for a solution that lets me tag data based on recognition (IP, SSN, address, name, user name etc.), embedded in the development effort (SDLC), in transit at specific control points, and at rest, similar to infrastructure scanning techniques and RFID solutions, and preferably integrated with a CMDB.
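To illustrate what recognition-based tagging could look like at its simplest, here is a rough sketch that scans a line of text with regular expressions. It is not the holistic solution described above. The patterns, labels and `tag_privacy_data` function are assumptions for illustration, they cover only email addresses, US-style SSNs and IPv4 addresses, and names or postal addresses would need far more than regular expressions.

```python
import ipaddress
import re
from typing import List, Tuple

# Deliberately simple patterns; real-world recognition needs much more care.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def _is_ipv4(text: str) -> bool:
    """Validate a dotted-quad candidate so values like 999.1.1.1 are rejected."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def tag_privacy_data(text: str) -> List[Tuple[str, str]]:
    """Return (label, matched value) pairs for every recognized identifier."""
    tags = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            if label == "IPV4" and not _is_ipv4(match.group()):
                continue
            tags.append((label, match.group()))
    return tags


if __name__ == "__main__":
    line = "2017-06-01 login user=jane@example.com ip=192.168.10.5 ssn=123-45-6789"
    for label, value in tag_privacy_data(line):
        print(label, value)
```

The same function could in principle be called from a build step, a log shipper or a storage scan, which is the "in development, in transit and at rest" coverage described above. Recording the resulting tags in a CMDB is left out of the sketch.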