The European Data Protection Board is investigating how AI models can use personal data without violating privacy laws.
The European Data Protection Board (EDPB) published an opinion Wednesday that examines how AI developers can use personal data to develop and deploy AI systems, such as LLMs, without violating privacy laws.
According to the opinion, AI models must also comply with the GDPR, so strict evaluation is needed; failure to comply with that law carries severe penalties. There are three main things to consider: ensuring anonymity, determining legitimate interest, and deciding what happens in the event of unlawful data processing.
Anonymity
The opinion first addresses whether AI models can be anonymous, in which case privacy laws do not apply. AI models trained on personal data are not automatically anonymous; this must be assessed case by case. For a model to be considered anonymous, the board argues, it should be nearly impossible to extract personal data from the model or to obtain personal data, intentionally or unintentionally, through model queries.
The opinion states that sufficient evidence must be provided to regulators so that they can assess whether a model is anonymous. That assessment considers several factors, such as the techniques used to protect the data, the context in which the model is used, and whether modern technologies could facilitate identification.
That evidence consists of detailed documentation, provided by controllers, of the methods used and the risk-mitigation measures taken. A controller is the organization or person that decides why and how personal data is processed.
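As an illustration only, the two-pronged anonymity test described above can be sketched as a simple check. The function name and boolean inputs are invented for this sketch; the opinion defines no such formal procedure.

```python
# Hypothetical sketch of the two-pronged anonymity test described in the
# opinion. The names and the reduction to booleans are assumptions made
# for illustration; the real assessment is case-by-case and evidence-based.

def model_may_be_anonymous(extraction_nearly_impossible: bool,
                           query_leakage_nearly_impossible: bool) -> bool:
    """Both prongs must hold: personal data can neither be extracted
    from the model nor obtained through model queries."""
    return extraction_nearly_impossible and query_leakage_nearly_impossible


model_may_be_anonymous(True, True)   # both prongs satisfied
model_may_be_anonymous(True, False)  # queries can still leak personal data
```

Note that both prongs must hold: a model that resists direct extraction but leaks personal data through queries still fails the test.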
read also
European Data Protection Board considers use of personal data in GenAI
Legitimate interests
The board also explores whether “legitimate interest” can be used as a legal basis; in other words, whether it can justify AI development without the consent of the individuals involved. Determining whether a legitimate interest can be used as a legal basis involves three steps:
1. Identifying a legitimate interest
The interest must meet several criteria to be legitimate: it must not violate laws or regulations, it must be clearly and specifically worded, and it must not be speculative. For example, using an AI chatbot to improve customer service is a legitimate interest; using AI to process personal data without consent is not.
2. Is the processing effectively necessary?
The processing must actually contribute to the intended purpose, and there must be no less intrusive way of achieving that same purpose. Only the data that is truly necessary for training should be processed. For example, if an AI chatbot can be trained on anonymous data, using personal data from public sources is unnecessary.
3. Impact of sharing data on the individual
The controller's interests must not outweigh the rights of the data subjects. That balancing considers three pillars: the nature of the data, the reasonable expectations of the data subjects, and the impact of the processing. Sensitive data is therefore weighed more strictly. Data subjects must be aware of how and why their data is processed and what impact to expect, because that shapes their reasonable expectations.
If the balance nevertheless turns out negative, controllers can apply mitigating measures, such as encrypting the data or better informing data subjects about the processing.
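Taken together, the three steps form a sequential test: failing any step rules out legitimate interest as a legal basis, and mitigating measures can only repair a negative balance in the final step. A minimal sketch, with all names invented for illustration (the opinion prescribes no such formalization):

```python
from dataclasses import dataclass


@dataclass
class ProcessingAssessment:
    """Hypothetical checklist mirroring the opinion's three-step test."""
    interest_is_lawful: bool        # step 1: does not violate laws
    interest_is_specific: bool      # step 1: clearly worded, not speculative
    processing_is_necessary: bool   # step 2: no less intrusive alternative
    balance_is_negative: bool       # step 3: subjects' rights outweigh interest
    mitigations_applied: bool = False  # e.g. encryption, better information


def may_rely_on_legitimate_interest(a: ProcessingAssessment) -> bool:
    # Step 1: the interest itself must be legitimate.
    if not (a.interest_is_lawful and a.interest_is_specific):
        return False
    # Step 2: the processing must be effectively necessary.
    if not a.processing_is_necessary:
        return False
    # Step 3: a negative balance blocks the basis unless mitigated.
    if a.balance_is_negative and not a.mitigations_applied:
        return False
    return True
```

In this sketch, mitigating measures are modeled as automatically tipping a negative balance; in practice, regulators would judge whether the measures are actually sufficient.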
Unlawful processing
When is the processing of personal data considered unlawful? The opinion outlines three scenarios of unlawful processing. In each of these, an AI model is developed with unlawfully processed personal data. If the personal data remains in the model and is used by the same controller, the entire processing may be unlawful. It depends on the specific context, and each case must be investigated by regulators.
If the personal data remains in the model, but the model is used by another controller or organization, the responsibility lies with that second controller to verify that the model complies with the GDPR. If the personal data is anonymized after unlawful processing, the GDPR no longer applies, provided the anonymization meets strict conditions.
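The three scenarios can be summarized as a small decision function. This is a sketch of the distinctions described above, not legal advice; the function and its parameter names are invented for illustration.

```python
def unlawful_training_outcome(data_still_in_model: bool,
                              same_controller: bool,
                              anonymized_after: bool) -> str:
    """Hypothetical mapping of the opinion's three scenarios for a model
    developed with unlawfully processed personal data (illustrative only)."""
    if anonymized_after and not data_still_in_model:
        # Scenario 3: strict anonymization takes the model outside the GDPR.
        return "gdpr no longer applies"
    if same_controller:
        # Scenario 1: the entire processing chain may be unlawful;
        # regulators decide case by case.
        return "possibly unlawful; case-by-case review"
    # Scenario 2: the second controller must verify GDPR compliance itself.
    return "second controller must verify compliance"
```

The third branch reflects the shift of responsibility the opinion describes: a controller that reuses someone else's model cannot simply assume it was trained lawfully.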
For now, there is no one-size-fits-all solution to these legal issues; the board offers only guidance. It will be some time before Europe introduces fitting legislation for this rapidly growing technology. Europe is taking steps in the right direction, though: earlier this year an AI Office was established to enforce AI rules, and the AI Act itself is already bearing fruit.