On December 18 last year, the European Data Protection Board (EDPB), the committee of all EU national data protection authorities, published Opinion 28/2024. This opinion addresses the operation of the General Data Protection Regulation (GDPR) within the context of AI models.
In this entry you will read the main points of this opinion. Be sure to read on in the following:
You work for an organization that is considering deploying AI in the workplace
You are curious in developments in the field of AI and AI legislation
You want to know how your privacy will be protected in a world of AI applications
Opinions of the EDPB - as the name suggests - have no force of law. However, that does not mean they are unimportant! Indeed, they provide direction on how the AVG should be applied. In the case of new technologies such as AI, this is quite useful; this way it is quickly clear how we in Europe can innovate responsibly.
A final note: in this entry you will not read a legally complete translation of the opinion. For the original text, which contains much more information than I can put here, you can use this link use.
The first question addressed in the opinion is when an AI model can be considered anonymous. This issue is of great importance. This is because the AVG does not apply to anonymous data, and this term includes personal data that has been rendered anonymous to the point where it can no longer be traced back to a particular person.
First, the EDPB notes that some types of AI models can never be anonymous. As an example, consider a model that mimics the voice of a particular person (such as that of singer Grimes, who released an AI model of her own voice in 2023) or a model made to release certain personal data. However, if there is no such type of model, a case-by-case assessment of whether an AI model can be considered anonymous must be made based on the following question: is the probability of personal data being retrieved from the AI model negligibly small ("insignificant")?
This question is not easy to answer. Factors that may come into play include:
The design of the model: how were the datasets constructed? What sources were used to train the model? Was an attempt made to remove (redundant) personal data from the dataset?
The analysis of the model: have internal and/or external audits been performed?
The tests on the model: has the model been tested for resilience, for example by trying to get it to reveal personal data?
Documentation on the model: is all this (and more) on paper?
If the creator of the AI model cannot prove that the AI model can be considered anonymous, then the AVG applies.
One of the key principles of the AVG is that personal data may only be processed on a legitimate basis. Creators of AI models will often only be able to invoke the "legitimate interest" basis, but does such an invocation really have a chance of success?
Characteristic of the "legitimate interest" basis is the so-called "three-step test," in which the validity of a reliance on this basis is assessed in three steps.
The first step is to determine whether a legitimate interest exists. As long as this interest actually exists, is clearly defined and does not violate the law, this step will not pose any problems. The EDPB cites as an example, "developing a chatbot service to help users" - this is apparently specific enough!
The second step is to be sure that the processing is necessary to satisfy this interest. If the chatbot just mentioned can be developed just as well with less (or even no) personal data, then it should be chosen.
The third step includes a balancing of interests: don't the privacy interests of those whose personal data are processed outweigh the legitimate interest described in step 1 (developing the chatbot service) after all? The EDPB places emphasis on this step: where the first and second steps are described in less than three pages, the third step takes up no less than seven pages. Factors involved in the balancing of interests include:
The interests, fundamental rights and freedoms of data subjects: what interests, rights and freedoms come into play? The right to freedom of expression? The prohibition of discrimination? Financial interests?
The effects of the processing on the data subject: does the AI model create positive, or negative, effects? Does the model create privacy risks and/or risks to other rights and freedoms? How serious are these risks, and how likely are they to occur?
The reasonable expectations of the data subject: could the data subject expect their personal data to be processed in this way?
Any measures taken to minimize the privacy impact: what has the creator of the AI model done to protect the privacy of the data subject?
If privacy interests ultimately prove to be the most important, processing on the basis of 'legitimate interest' is not permitted. So then innovation must give way to the protection of the (privacy) interests, rights and freedoms of the data subject.
Now what if it turns out that the creator of the AI model developed its model in violation of the AVG (for example, if the balancing of interests just described shows that it wrongly invoked the legitimate interest basis)? To answer this question, the EDPB distinguishes between three types of cases:
Scenario 1: The personal data remain present in the model, and the creator of the model then reprocesses this personal data. According to the EDPB, these cases must be evaluated individually.
Scenario 2: The personal data remain present in the model, and another party then deploys this model - think of a web store with a chatbot "powered by ChatGPT. In this case, the EDPB believes that the other party - the online store - would do well to research the creator of the model in advance to verify that the model was not developed in violation of the AVG.
Scenario 3: The creator of the AI model ensures that the model is fully anonymized before further use. In that case, according to the EDPB, the illegality during the development of the AI model does not affect the processing that occurs after anonymization.
This opinion shows well how much we in Europe value responsible innovation. In particular, the principle of accountability takes center stage:
- Only if the creator of the AI model can demonstrate that the probability of personal data being retrieved is negligible will the AVG not apply to it;
- Only if the creator of the AI model can demonstrate that its interests outweigh the privacy interests of data subjects can it invoke the legitimate interest basis;
- Only if the creator of the AI model can demonstrate that its unlawfully developed AI model was fully anonymized before further use, will the unlawfulness of the earlier processing not affect the later processing anyway.
The common thread: innovation without responsibility has no place in Europe.