Dec 21, 2021
7 min read

Personal data: what is it and why should you care?

Personal data has been in the spotlight since the advent of the GDPR. Learn what it is and why you should care.

Since ancient times there has always been a need to record facts, either for inventory purposes, to carve events in the annals of history, or for anything deemed to be important by our ancestors. 

This need to preserve information, when coupled with advances in technology, results in data becoming ubiquitous in our modern societies. Indeed, technology makes the creation, processing and sharing of data such a common activity that people have stopped paying much attention to it. 

Data explosion

Data is everywhere, and we interact with it in many diverse ways: credit cards, identification documents, medical records, CCTV footage of us walking down the street, digital photographs, emails, social media… The list is truly endless and is only expanding with the use of Internet of Things (IoT), smartphones, and wearable technologies, to name but a few. 

It is estimated that in 2021 the amount of data generated daily surpasses 2.5 quintillion bytes, which is 25 followed by a staggering 17 zeros. Most of it ends up stored in immense server farms or in the cloud.
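
As a quick sanity check on that figure, the short Python sketch below writes 2.5 quintillion out in full and counts the zeros. It assumes the short scale, where one quintillion is 10 to the power of 18; the variable names are purely illustrative.

```python
# Short scale: one quintillion = 10**18 (assumed scale for the estimate).
QUINTILLION = 10**18

# 2.5 quintillion as an exact integer; integer maths avoids float rounding.
daily_bytes = 25 * QUINTILLION // 10

digits = str(daily_bytes)
leading = digits.rstrip("0")              # the non-zero prefix
zero_count = len(digits) - len(leading)   # number of trailing zeros

print(digits)
print(leading, zero_count)
```

The prefix comes out as 25 and the zero count as 17, which matches the description above.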

Most people do not currently see a problem with their data being spread all over the place and are indifferent to the privacy concerns this raises. However, identity theft affecting individuals is notably on the increase, as are data exfiltration and Man-in-the-Middle (MitM) attacks against organisations. Both result from exploiting vulnerabilities to compromise personal data and breach its confidentiality, and both can be prevented or mitigated by judicious application of data protection principles.

Data: it’s personal

Personal data is a subset of data, relating to individuals, that has been in the spotlight since the advent of the GDPR in 2018. Since then, other pieces of legislation have been adopted across the globe, such as in China (PIPL), Brazil (LGPD), and California (CCPA). These legislative efforts share one common objective: to provide a framework to protect personal data.

The GDPR defines personal data as:

“Any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

This concept of personal data, understood in a broad sense, is shared across the aforementioned laws. Given that most of these have provisions for cross-border transfers, chances are your organisation falls within the scope of one jurisdiction or another when it comes to the processing and protection of personal data.

Examples of personal data include:

  • Name
  • Surname
  • IP address
  • Cookies
  • Pseudonym or ‘alias’
  • Date of birth
  • Photographs
  • Sound recordings
  • Email address
  • Postal address
  • Telephone number
  • Licence plate number
  • ID number
  • Social security number

Definition differences

Personally Identifiable Information (PII) and personal data are not the same

One word of caution: although often (erroneously) used interchangeably, Personally Identifiable Information (PII) and personal data are not the same. Personal data also encompasses information that can point indirectly to an individual, whereas PII, a term mostly used in the US, is narrower.

For instance, “the red-haired woman who sat by the window” could constitute personal data, if it helps identify a single person. Notice how there is no need to obtain a name or a number. Some ambiguity is at play, however, which is largely dependent on context. Context in this sense means combining different data items to form a puzzle. This can be achieved with items from different sources, aggregated in such a way as to make an inference.
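
The 'puzzle' idea can be sketched in a few lines of Python. The records, field names and the `matches` helper below are all made up for illustration; the point is only that descriptors which are ambiguous on their own can single out one person once combined.

```python
from typing import Dict, List

# Hypothetical observations about people in a room; no names or numbers.
people: List[Dict[str, str]] = [
    {"hair": "red", "gender": "woman", "seat": "window"},
    {"hair": "red", "gender": "woman", "seat": "bar"},
    {"hair": "brown", "gender": "woman", "seat": "window"},
    {"hair": "red", "gender": "man", "seat": "door"},
]

def matches(records: List[Dict[str, str]], **descriptors: str) -> List[Dict[str, str]]:
    """Return the records consistent with every given descriptor."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in descriptors.items())
    ]

# Each descriptor alone is ambiguous...
red_haired = matches(people, hair="red")
by_window = matches(people, seat="window")

# ...but combined they single out one individual.
combined = matches(people, hair="red", gender="woman", seat="window")

print(len(red_haired), len(by_window), len(combined))
```

In this toy data set, three people have red hair and two sit by the window, yet only one record fits the full description.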

Another example of the role context plays is a random sequence of digits: on its own it is not personal data, but it becomes personal data if it is an individual’s telephone number and the connection between individual and number can be established.

It follows that any PII can be considered personal data, but not vice versa.

Data vs personal data vs special categories of data

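We have seen how data differs from personal data, but there is one other type: special category data. These three types can be represented in a diagram as nested sets, with special category data sitting within personal data, and personal data within data as a whole.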

Unlike personal data, special category data is clearly defined in a prescriptive manner under Article 9 of the GDPR, consisting of:

  • Racial or ethnic origin
  • Political opinions
  • Religious or philosophical beliefs
  • Trade union membership
  • Health data
  • Sex life or sexual orientation
  • Genetic data
  • Biometric data

Data mapping and data classification

It is important to know the personal data your organisation handles, and to map how that data is processed, stored, transmitted, and ultimately deleted. This will help you adopt organisational and technical measures to protect the personal data according to its associated risk, and prioritise wisely.

This can be achieved in one of two ways:

  • Performing a data inventory, which consists of creating a record of the data that enters, resides in or exits your organisation
  • Performing a Record of Processing Activities (RoPA), which is similar to the above, but focuses on the activities (read ‘flows of data’), rather than specifying each data element; it entails a lesser effort, but by no means a negligible one

Data classification sorts the data into several categories. For commercial organisations, a typical classification scheme comprises: strictly confidential, confidential, personal, internal use only, and public.

These two activities can be considered two sides of the same coin: a data inventory/RoPA without data classification is not particularly useful, and data classification without a data inventory/RoPA simply cannot take place.
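
A minimal sketch of how the two fit together, assuming a simple in-house register: each inventory entry records where a data element lives and carries one label from the classification scheme above. The class names, fields and sample entries are hypothetical, not a prescribed format.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List

class Classification(Enum):
    """The typical commercial scheme described in the text."""
    STRICTLY_CONFIDENTIAL = "strictly confidential"
    CONFIDENTIAL = "confidential"
    PERSONAL = "personal"
    INTERNAL_USE_ONLY = "internal use only"
    PUBLIC = "public"

@dataclass(frozen=True)
class InventoryEntry:
    data_element: str               # what the data is
    location: str                   # where it resides
    classification: Classification  # how it must be handled

# Made-up inventory entries, for illustration only.
inventory = [
    InventoryEntry("customer email address", "CRM database", Classification.PERSONAL),
    InventoryEntry("employee payslips", "HR shared drive", Classification.PERSONAL),
    InventoryEntry("product brochure", "public website", Classification.PUBLIC),
    InventoryEntry("merger plans", "board folder", Classification.STRICTLY_CONFIDENTIAL),
]

def group_by_classification(
    entries: List[InventoryEntry],
) -> Dict[Classification, List[InventoryEntry]]:
    """Group inventory entries under their classification label."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.classification].append(entry)
    return dict(grouped)

grouped = group_by_classification(inventory)
personal_locations = sorted({e.location for e in grouped[Classification.PERSONAL]})
print(personal_locations)
```

With both pieces in place, a question such as "where does our personal data live?" becomes a simple lookup; without the inventory there is nothing to label, and without the labels the inventory cannot be prioritised.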

Where to find personal data?

To support your search for personal data, it is a good idea to have an asset inventory first, as that makes it easier to determine the data that may be circulating through your devices. Regardless, below are some places to look:

  • Personal data can be in digital form, but also in physical form, as part of a filing system
  • Personal data can be found in structured or unstructured databases
  • Data lakes and data warehouses often have copious amounts of personal data
  • Web forms used on your website, or in any other method of communicating with your customers
  • Cookies and related technologies deployed in your website, and in particular those by third parties if present
  • Shared drives used internally, and in particular those with access from external parties
  • Email inboxes
  • Bring Your Own Device (BYOD) phones, including the SIM card and internal memory
  • Removable media such as USB drives, CDs, DVDs
  • Dormant accounts, with usernames and other data
  • HR records
  • Accident books, containing health related data

As part of digitalisation, I have seen many organisations struggle with data previously (or concurrently) stored in physical form. Filing cabinets in an office environment, used to store all sorts of paper documents including passport scans, are still quite common. Their bigger brother is the ‘data vault’, an entire room dedicated to storing printouts. These pose a high risk to the organisation precisely because the risk is unknown: the data is normally just dumped there without any form of consideration. The effort and time required to categorise these documents can be daunting, and the exercise is prone to errors if attempted using a ‘speedy’ approach.

Tools exist to aid with the data discovery exercise, but a good policy needs to be in place to ensure newly acquired data is appropriately labelled and safely stored.
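
To give a feel for what such tools do under the hood, here is a deliberately simplified Python sketch that scans free text for two of the identifiers listed earlier: email addresses and IPv4 addresses. The patterns and the `find_personal_data` function are illustrative assumptions; real discovery tools use far more robust detection, and a pattern match is only a hint, not proof, that personal data is present.

```python
import re
from typing import Dict, List

# Simplified, illustrative patterns; they will miss edge cases and can
# produce false positives (e.g. version numbers that look like IPs).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def find_personal_data(text: str) -> Dict[str, List[str]]:
    """Return every match for each pattern, keyed by identifier type."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}

# Made-up sample text using reserved example values.
sample = "Ticket raised by jane.doe@example.com from 192.0.2.17; no phone given."
hits = find_personal_data(sample)
print(hits)
```

Anything flagged this way would still need a human to confirm the context, and then to label and store the data in line with the policy.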

This looks like a lot of effort, why bother?

The consequences of not protecting personal data, which as we have seen requires knowledge of where the data is and what it is, could be severe and can be framed as follows:

  • Operational risk
  • Legal/regulatory risk
  • Financial risk
  • Reputational risk
  • Psychological or physical damages

A security incident involving personal data constitutes a data breach and, depending on its extent, may mandate notification and reporting to supervisory authorities and data subjects.

Although personal data is pervasive and perceived as a commodity, it has been elevated by data protection laws worldwide to something that needs to be handled with care. Organisations must be mindful of applicable laws and act in accordance with them to avoid nasty surprises that may impact their business objectives.

Experienced Principal Consultant and Associate Lecturer with an extensive academic background in Law, Information Security, and Engineering, including globally recognised certifications such as Fellow of Information Privacy (FIP), CIPP/E, CIPM, CIPT, CDPSE, and CISSP.
