Room under the GDPR to train and use AI models with personal data without consent
AI models, and in particular general-purpose AI models that can perform a wide range of tasks (such as large language models), are often trained on extensive data sets that may include personal data.
The General Data Protection Regulation ("GDPR"; in Dutch: "AVG") requires a valid legal basis for any processing of personal data. One of these bases is necessity to serve a legitimate interest of the controller or a third party (Article 6(1)(f) GDPR). The controller can only invoke this ground if the interest being served is "legitimate", the processing of personal data is necessary for that interest, and the interests and fundamental rights of the data subjects (such as the protection of their personal data) do not outweigh the interests of the controller.
We note that this basis, rather than consent, is increasingly being relied on for the development, training and deployment of AI models. This is understandable: valid consent would require obtaining prior, specific, informed and unambiguous consent from each data subject whose personal data are processed for the AI model. That consent must also be freely given, without pressure or coercion. In practice, this is unworkable for most parties. The question of whether, and under what conditions, reliance on legitimate interest in this context is permissible under the GDPR has recently been clarified further by the European Data Protection Board ("EDPB") and various national data protection authorities.
EDPB guidelines and opinion: general review framework
In October 2024, the EDPB published Guidelines 1/2024 on the processing of personal data on the basis of legitimate interest.[1] These guidelines aim to provide controllers with practical guidance on how to assess and apply legitimate interest as a processing ground. For each of the three cumulative conditions to be tested in the balancing exercise, concrete guidance and examples are given. For instance, in line with case law of the Court of Justice of the European Union, it is established that a purely commercial interest in training an AI model can qualify as legitimate.[2] Furthermore, in the balancing of interests, the fact that the controller has taken steps to make the processing known to data subjects can weigh in the controller's favor, since they can then be assumed to be reasonably aware of it. Likewise, if the processing has a limited adverse impact on the lives of data subjects, this may tip the balance further toward the controller. The EDPB emphasizes that this assessment must be documented and carried out prior to the processing. In the development and/or training of an AI model, it will therefore usually be the provider, whether or not instructed by a customer (the deploying controller), who performs this balancing of interests.
In December 2024, at the request of the Irish regulator, the EDPB issued Opinion 28/2024 on the processing of personal data in the context of AI models.[3] In this opinion, the EDPB takes the position that the use of personal data for AI training is not a priori excluded from reliance on legitimate interest, but that lawfulness depends heavily on, among other things, the context, the nature of the data, the reasonable expectations of data subjects, and the measures deployed to protect data subjects' fundamental rights. With this opinion, the EDPB seeks to promote European harmonization by giving national regulators more clarity on the application of legitimate interest in the context of AI models. The Dutch Data Protection Authority (Autoriteit Persoonsgegevens, "AP"), in its recently updated guidance on web scraping by private individuals and private organizations[4] (a practice frequently used in the development and training of AI models), has aligned its criteria for balancing interests under a legitimate-interest claim with the viewpoints from the EDPB's opinion.
It is notable that the EDPB also considered the use of AI models by third parties who wish to use these models for their own AI systems. Various scenarios are examined, in which the EDPB appears to leave room for purchasers of AI models (deploying controllers) to still use them lawfully, under conditions, even if the provider processed personal data during their development and training without a valid basis under the GDPR.
Diverging approaches by regulators within the EU
Several big tech companies have now announced that they will use personal data from their users to train AI models. They typically invoke legitimate interest as the legal basis for this processing, citing interests such as improving products or developing new digital services. Supervisory authorities within the European Union assess the validity of this differently.
The Irish Data Protection Authority (Data Protection Commission, "DPC"), which acts as lead regulator for several of these big tech companies, has accepted reliance on legitimate interest as a processing ground for the use of public user content for AI training, subject to the implementation of comprehensive transparency measures, objection procedures and technical safeguards.
The AP, by contrast, has been more critical, although it does not categorically reject reliance on this basis. It emphasizes that the effectiveness of certain measures will only become apparent in practice, such as a filter that strips data of personal characteristics and sensitive information before it is used to train AI. The AP has on several occasions actively called on consumers to exercise their right to object in the context of AI training by big tech companies.
Views in Germany have also shifted: while the data protection authority in Hamburg (Hamburgische Beauftragte für Datenschutz und Informationsfreiheit, "HmbBfDI") initially objected to AI training on personal data on the basis of legitimate interest, it ultimately decided not to pursue enforcement proceedings against a big tech party, partly in anticipation of a unified European framework on this issue.
German judicial review and termination of national emergency procedure
Before deciding to terminate the aforementioned emergency enforcement procedure, the HmbBfDI considered a ruling by the Oberlandesgericht Köln. In a case brought by Verbraucherzentrale Nordrhein-Westfalen, a consumer interest group, the court ruled on May 23, 2025 that the use of public (personal) data for AI training in the case at hand was not unlawful. Decisive factors included that, according to the court, the controller had complied with its information obligations, had made objection procedures available, and had taken measures to limit the impact on data subjects. In a statement, the HmbBfDI indicated that an isolated national measure was undesirable, partly in view of the upcoming European review of these practices. For that reason, the HmbBfDI did not want to be the only EU regulator to issue a national provisional ban on AI training in that particular case. This U-turn does not mean, however, that AI training involving personal data in Germany is now entirely risk-free.
European regulators not yet aligned
At present, there is not (yet) a fully harmonized line within the European Union on invoking legitimate interest as a lawful basis for processing personal data in the context of AI. While the EDPB has provided a framework with its guidelines and opinion, their application remains dependent on the concrete circumstances of the processing and the views of national supervisors. Further coordination between regulators is expected in the coming period, partly on the basis of practical experience. For example, in June 2024 the privacy advocacy organization noyb filed complaints with, among others, the AP and the Belgian and German regulators against several big tech companies in connection with the processing of personal data for AI training, and the DPC discontinued proceedings against platform X at the end of 2024 after the latter promised to limit the use of its users' personal data for AI training.[5] The processing of personal data in the context of AI is thus keeping (supervisory) minds busy.
Until there is a single unified EU framework, and it cannot be ruled out that such a framework will prove elusive, it remains important for organizations that develop, train and/or deploy AI models to handle the processing of personal data involved with care. A thorough, documented assessment will have to be made, at least in line with the guidance of the EDPB and the organization's own (lead) supervisor, to determine whether legitimate interest can be invoked as an appropriate basis under the GDPR.