With the Information Commissioner’s Office stating that “big data is no fad” and the FCA recognising it as one of its priority areas for 2018/19, ‘big data’ is a pervasive agent for change in the legal sector. In this article, we look at what ‘big data’ is, analyse how lawyers and in-house legal teams can extract value from it, and then consider three pressing legal issues that have emerged from its growth – privacy, data discrimination and anti-trust.

What is ‘Big Data’?

‘Big data’ is a blanket term for collections of data sets so large and complex that processing them with traditional data management tools, such as relational database management systems, is impractical. ‘Big data’ is generally regarded as exhibiting the following characteristics (often called the “Four V’s”).

  • Volume: The sheer scale of the data sets involved.
  • Velocity: Data streams in at an unprecedented speed with much of it generated in real time.
  • Variety: Data comes in all different formats from structured, numeric data in traditional databases to unstructured text documents.
  • Veracity: The certainty and trustworthiness of the data.

Why is ‘Big Data’ valuable for lawyers and in-house legal teams?

Raw data, however, can be a worthless asset if it cannot be read and analysed to guide commercial decisions. Thanks to big data analytics, lawyers and in-house teams can improve the way they provide services, creating a fifth V – Value.

  • Predicting Outcomes: A key function of a legal practitioner is to predict, whether that be the likely outcome or impact of a certain regulation, case or contractual clause.
  • Efficient Discovery: Analytics can automatically organise, search and summarise large volumes of documents – inevitably saving time when collating a case (see technology companies like RAVN).
  • Internal Business Processes: Analytics can also improve internal processes from effective billing to time management.
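To make the discovery point above concrete, here is a minimal illustration of how analytics can surface the most relevant documents from a pile. This is a generic TF-IDF relevance-ranking sketch written for this article, not the method used by RAVN or any particular vendor; the documents and query are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case and split text into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_rank(documents, query):
    """Return document indices ranked by a simple TF-IDF score against the query."""
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for toks in docs:
        for term in set(toks):
            df[term] += 1

    def score(toks):
        counts = Counter(toks)
        total = 0.0
        for term in tokenize(query):
            tf = counts[term] / len(toks)                  # term frequency
            idf = math.log((n + 1) / (df[term] + 1)) + 1   # rarer terms weigh more
            total += tf * idf
        return total

    return sorted(range(n), key=lambda i: score(docs[i]), reverse=True)

# Hypothetical mini document set: two contracts and one irrelevant file.
docs = [
    "The supplier shall indemnify the buyer against all losses.",
    "Lunch menu for the office canteen next week.",
    "Indemnity and limitation of liability clauses in the master agreement.",
]
print(tfidf_rank(docs, "indemnity liability clause"))  # the third document ranks first
```

Production discovery tools add far more (synonym handling, machine-learned relevance, de-duplication), but the underlying idea – scoring and sorting documents rather than reading them all – is the same.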

Embracing Change?

As legal practitioners are realising the benefits of ‘big data’, so too are their clients, demanding a much more cost-efficient service. Firms and in-house teams that are resistant to change risk being undercut due to competitors embracing the benefits of ‘big data’.

Legal services can no longer be reactive; they must be pro-active. With the aid of ‘big data’ and big data analytics, the legal landscape is changing. Now is the time to embrace that change.

However, the collection and analysis of big data is not without legal considerations.

The three big data legal considerations identified in this article are:

  • Privacy
  • Data discrimination
  • Anti-trust outcomes

Privacy and ‘Big Data’

The General Data Protection Regulation (GDPR) may pose some specific challenges to ‘big data’. ‘Big data’ sets will often include personal data, and in many cases it is not possible to separate the personal data from the non-personal data. The aim of ‘big data’ is to uncover relationships within and amongst the information through analytics and processing. Because the accuracy and trustworthiness of such a data set may not be exact, but rather directionally representative, the starting premise of big data itself runs contrary to a fundamental aim of the GDPR: protecting the data subject.

Specifically, Article 22 of the GDPR prohibits decisions based solely on automated processing, including profiling, where such decisions have legal effects on a data subject or similarly significantly affect the data subject. In this regard, profiling is defined as: “any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person’s performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements”.

Some of the privacy risks particularly pronounced in the context of big data profiling are:

  • Processing of personal data outside of the purpose for which it was collected
  • Use of incorrect and / or outdated information
  • Discrimination / bias against certain individuals or groups through the profiling algorithms
  • Excessive processing of personal data (contrary to the principle of data minimisation)
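The last risk above, excessive processing, is the one most directly addressable in code. The sketch below illustrates the data minimisation principle: keep only the fields the stated purpose needs, and replace the direct identifier with a salted hash so records can still be linked. All field names and the stated purpose are hypothetical, and note that under the GDPR such hashing is pseudonymisation, not anonymisation – the output is still personal data.

```python
import hashlib

# Hypothetical raw record containing more fields than the purpose requires.
record = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "postcode": "EC1A 1BB",
    "purchase_total": 249.99,
    "browsing_history": ["..."],
}

# Fields actually needed for the (hypothetical) purpose: regional sales analysis.
REQUIRED = {"postcode", "purchase_total"}

def minimise(record, required, salt="org-secret-salt"):
    """Drop fields outside the stated purpose; pseudonymise the direct identifier."""
    out = {k: v for k, v in record.items() if k in required}
    # Salted SHA-256 of the email lets records be linked without storing the email.
    out["subject_id"] = hashlib.sha256((salt + record["email"]).encode()).hexdigest()[:16]
    return out

print(minimise(record, REQUIRED))  # name, email and browsing history are gone
```

The design point is that minimisation happens at ingestion, before the data enters the analytics pipeline – retrofitting it onto an existing ‘big data’ set is far harder, which foreshadows the legacy-data problem discussed below.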

Because automated processing carries such high privacy risks, it is prohibited in principle but permitted where:

  • Based on (explicit) consent; or
  • It is required for the entering into or performance of a contract,

provided the data subjects can contest an automatic decision and obtain human intervention.

Furthermore, the GDPR provides that sensitive personal data may only be automatically processed on the basis of explicit consent, and that data subjects must be informed of the use of automated processing and given meaningful information about the logic involved and its potential consequences.

One of the biggest points to note is that companies have already accumulated large amounts of data – and the GDPR applies not just to data sets created going forward, but also to those that exist today, if they will be used once the Regulation is in force. It may prove practically difficult to obtain the required explicit consent for specific uses of a data set that already exists (and is, in fact, already in use).

Data Discrimination and ‘Big Data’

Data discrimination, also known as discrimination by algorithm, is bias that occurs when predefined data types or data sources are intentionally or unintentionally treated differently than others.

To give a clear example of data discrimination in practice, in spring 2017 Palantir Technologies had to pay $1.7 million in back pay to Asian job applicants. Palantir allegedly used a hiring process that “routinely eliminated” qualified Asian applicants during the résumé screening and telephone interview phases, and instead hired predominantly through its discriminatory referral systems.

The potential of encoding discrimination in automated decisions therefore has implications on reinforcing discriminatory stereotypes to the detriment of both users and the effectiveness of the system itself.

It is critical that these tools are designed to promote fairness and opportunity, so that reliance on these expanding sources of data does not create new barriers to opportunity.
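One simple way to test such tools for fairness is to compare selection rates between groups. The sketch below implements the “four-fifths rule”, a rule of thumb used by US regulators under which a selection rate for one group below 80% of the rate for the most-favoured group is treated as evidence of adverse impact. The group labels and numbers here are invented for illustration, and this is only one of several fairness metrics in use.

```python
def selection_rates(outcomes):
    """outcomes: list of (group, selected) pairs. Returns selection rate per group."""
    totals, selected = {}, {}
    for group, ok in outcomes:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(ok)
    return {g: selected[g] / totals[g] for g in totals}

def disparate_impact_ratio(outcomes, protected, reference):
    """Ratio of the protected group's selection rate to the reference group's."""
    rates = selection_rates(outcomes)
    return rates[protected] / rates[reference]

# Hypothetical screening outcomes: 30 of 100 group-A applicants selected
# versus 60 of 100 group-B applicants.
data = [("A", i < 30) for i in range(100)] + [("B", i < 60) for i in range(100)]

ratio = disparate_impact_ratio(data, "A", "B")
print(ratio)        # 0.5
print(ratio < 0.8)  # True – flagged under the four-fifths rule of thumb
```

A ratio this far below 0.8 would not by itself prove unlawful discrimination, but it is exactly the kind of red flag that a well-designed analytics pipeline should surface for human review.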

Anti-Trust and Big Data

Whilst issues around privacy and big data have garnered significant interest, lawyers and in-house teams should also be alert to the anti-trust risks facing companies that utilise big data. Competition authorities across Europe are paying ever-increasing attention to the accumulation of large data sets.

This attention was evident in the merger review of Facebook Inc.’s acquisition of WhatsApp. The European Commission investigated whether Facebook could strengthen its position in online advertising by placing ads on the WhatsApp platform, or by leveraging WhatsApp’s user data to improve advertising on the Facebook platform.

Companies with a dominant market position that use big data may face enforcement risk in Europe. Germany’s Federal Cartel Office recently investigated whether Facebook abused its market power by requiring users to agree to its terms and conditions, which allow the company to collect valuable big data.


Big data is clearly on trend, and the legal profession must embrace it in order to improve its services to clients. However, legal considerations must not be forgotten in the midst of such rapid growth.
