All I want for Christmas is…an EDPB opinion on data protection and AI
In September of this year, the Irish Data Protection Commission (“DPC”) used its power under Article 64(2) of the GDPR to request an EDPB opinion in relation to the processing of personal data in connection with AI models. This was the first time that the DPC had used this power, and it potentially heralds a shift in the way in which the DPC will approach novel questions about the application of the GDPR going forward. It isn’t clear if the DPC deliberately timed its request to provide an early Christmas present to data protection enthusiasts, but we have one, nonetheless.
Scope of the request
The DPC asked the EDPB to consider four questions, which can be briefly summarised as:
- The extent to which a model that has been trained using personal data should be considered to continue to process personal data once developed/deployed.
- How a controller can demonstrate the appropriateness of relying on legitimate interest as its legal basis for processing personal data to create, update and/or develop an AI model.
- How a controller can demonstrate the appropriateness of relying on legitimate interest as its legal basis for the processing of personal data undertaken by an AI model post-training.
- Where an AI model is found to have been created, updated or developed using unlawfully processed personal data, how this impacts the continued or subsequent processing of personal data by that AI model.
What the Opinion doesn’t cover
The Opinion notes that it has been drafted to cover the specific questions raised by the DPC, and it therefore does not address certain other key principles under the GDPR that must be considered when developing or deploying AI models. In particular, it is noteworthy that the Opinion does not address:
- Compatibility - How to approach an assessment of whether processing existing personal data that was collected for one purpose is compatible with processing for training purposes under Article 6(4).
- Special Categories of Data – How to approach the processing of special categories of personal data given the prohibition on processing such data and the limited exceptions that are available. However, the Opinion does highlight two particular aspects of the CJEU decision in Meta v Bundeskartellamt. First, the finding that where special categories of personal data are collected “en bloc” with non-special category data, the prohibition on processing under Article 9(1) applies to all such personal data. Second, the limited scope of the exception under Article 9(2)(e) in relation to personal data made public by the data subject.
Some interesting points
Whilst the Opinion as a whole is worth reading, we set out below some of the more interesting points that are considered by the EDPB.
Anonymous AI Models
Unsurprisingly, the EDPB notes that whether an AI model should be considered to continue to process personal data once trained will need to be assessed on a case-by-case basis. In undertaking this assessment, the EDPB distinguishes between models that are designed to provide personal data regarding individuals whose personal data was used to train the model (e.g. an LLM designed to respond to a query such as who has scored the most tries for Ireland), and those which are not designed to provide access to the personal data used to train them. The first type of model clearly involves the processing of personal data; for the latter, the EDPB sets out some useful considerations for assessing whether personal data can be extracted from the model with “means reasonably likely to be used”.
Processing based on legitimate interests
The Opinion helpfully acknowledges that legitimate interests can potentially operate as the legal basis for processing personal data in all stages of the development and then the deployment of AI models. In order to rely on legitimate interests, the EDPB reiterates that three cumulative conditions must be met: (i) the pursuit of a legitimate interest by the controller or by a third party, (ii) the processing is necessary to pursue that legitimate interest, and (iii) the legitimate interest is not overridden by the interests or fundamental rights and freedoms of the data subjects. Taking these in turn:
- Legitimate interests of the controller or a third party – the Opinion gives certain examples of legitimate interests in the context of AI models, which include: (i) developing a conversational agent to assist users, (ii) developing an AI system to detect fraudulent content or behaviour, and (iii) improving threat detection in an information system. Notably, the Opinion does not expressly reference the training of LLMs more generally as a legitimate interest, although it is difficult to see how that would not be considered a legitimate interest.
- Is the processing necessary – the Opinion highlights that a key test will be whether there is no less intrusive way of pursuing the relevant purpose. This is likely to prove to be a point of contention in connection with the development of LLMs, where there is currently a drive to maximise the number of data points used in the training of sophisticated models. This contrasts with the development of lighter, more efficient LLMs that can prioritise smaller, more targeted data sets. The Opinion somewhat helpfully notes that technical safeguards might be deployed to contribute to the necessity test, such as taking steps to reduce the ease with which data subjects might be identified by a model post-training.
- Balancing test – the Opinion helpfully distinguishes between how the balancing test will operate for those data subjects whose personal data is used for training and those whose personal data will be used in connection with the deployment of an AI model. In addition, the Opinion sets out a number of mitigating measures that can be taken to address the imbalance between the controller and data subjects. These are likely to be considered in detail by those developing and deploying AI models that process personal data.
Using an AI model that has been unlawfully trained
Where an AI model has been unlawfully trained, the Opinion distinguishes between AI models that retain personal data and models which are effectively anonymised. It also distinguishes between AI models used by the controller that developed the model, and AI models deployed by separate controllers.
- AI models that retain personal data, used by the controller that developed the model – the Opinion notes that in these circumstances, if a supervisory authority were to impose corrective measures, those measures would, in principle, apply to subsequent processing. The unlawful processing in this case is likely to render the use of the AI model in future very problematic, if not impossible.
- AI models that retain personal data, used by a separate controller – In these circumstances, the Opinion focuses on whether the separate controller has undertaken an appropriate assessment of the data protection implications of its processing of the personal data. The EDPB appears to be of the view that the separate controller may not have breached its obligations under the GDPR in these circumstances. However, whilst it is not stated in the Opinion, it seems clear that once a controller becomes aware of the initial unlawful processing, it would be very difficult, if not impossible, to continue to use the AI model.
- AI model is anonymised, then used by that controller or a separate controller – The Opinion notes that whilst a supervisory authority might impose corrective measures in connection with the anonymisation process, if that process is effective, the GDPR will no longer apply to the use of the AI model (unless the AI model is subsequently used to process personal data). This helpfully provides a safe harbour for using AI models that are effectively anonymised, albeit that the question of whether an AI model is in fact anonymised is unlikely to be straightforward.
Conclusion
Given the timeframe to produce the Opinion, and the range and complexity of AI systems, there is still a great deal of uncertainty as to the exact impact of data protection obligations on the training and use of AI systems. It is likely to be at least a year, if not longer, before we start to see decisions from supervisory authorities in this area. In the meantime, there is much to digest in the EDPB’s Opinion for the developers of AI systems and organisations that deploy them.
For more information, please contact a member of the Technology and Innovation group below, or your usual McCann FitzGerald contact.
This document has been prepared by McCann FitzGerald LLP for general guidance only and should not be regarded as a substitute for professional advice. Such advice should always be taken before acting on any of the matters discussed.