The Information Commissioner’s Office (“ICO”) has launched a consultation series on how aspects of data protection law should apply to the development and use of generative AI models.
The ICO is seeking clarity on how data protection law applies to this new technology, and specifically on how generative AI, as distinct from the simpler AI models used for classification and prediction, should be assessed under data protection legislation.
The chapter under consultation covers the lawful basis for training generative AI models on web-scraped data.
Specifically, the ICO is seeking stakeholders’ views on the following questions:
- What is the appropriate lawful basis for training generative AI models?
- How does the purpose limitation principle play out in the context of generative AI development and deployment?
- What are the expectations around complying with the accuracy principle?
- What are the expectations in terms of complying with data subject rights?
The chapter under consultation currently states that responsibility rests with the developer to ensure that its collection of the personal data it processes complies with data protection regulations. The developer may obtain this data by scraping the web directly to train its generative AI model, by purchasing it from another organisation that has carried out the scraping, or by a combination of the two.
The chapter focuses on whether generative AI developers can rely on legitimate interests as their lawful basis for processing (Art 6(1)(f) of the UK GDPR). To do so, the data controller would need to conduct the three-part test and demonstrate that:
- the purpose of the processing is legitimate;
- the processing is necessary for that purpose; and
- the individual’s interests do not override the interest being pursued.
Each of these points should be addressed clearly in the assessment, showing that the data controller has considered its use of the scraped data, its responsibility for any future use of that data, and whether large-scale scraping is necessary.
However, the thorniest question appears to be the third part of the legitimate interests test, where the individual’s rights must be weighed against the interests of the generative AI developer.
Web scraping is an ‘invisible processing’ activity: the individual will not be aware that their personal data is being processed and may lose the ability to exercise their information rights in relation to that data once it has been processed.
AI processing and invisible processing are both seen as high-risk activities that require a data protection impact assessment (DPIA) under current ICO guidance. The legitimate interests assessment and the DPIA should clearly set out how risks to individuals will be mitigated, including how individuals can exercise their information rights, and how the AI developer’s interest will be realised.
Stakeholders, such as developers and users of generative AI, legal advisors and consultants working in this area, civil society groups and other public bodies, are invited to respond to the ICO’s consultation, which closes on 1 March 2024.
The material contained in this article is only for general review of the topics covered and does not constitute any legal advice. No legal or business decision should be based on its content.