The rapid progress in Artificial Intelligence (AI) has given rise to Generative AI (GenAI), which can produce, imitate, or modify many forms of content. As the industry advances, the intersection of GenAI with data protection law, including the GDPR, is becoming increasingly evident. GenAI models are trained by supplying the underlying algorithm with data, which may include personal data, referred to as “training data”. LLMs receive specialized training to perform certain tasks accurately, including creating and condensing texts, extracting information, making predictions, improving text comprehension, finding differences and similarities between texts, and writing papers in specific styles. The AI uses the user’s cues to analyze and assess patterns, allowing it to generate outputs from its data pool based on the statistical probability of phrase structure.
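The idea of generating output from “the statistical probability of phrase structure” can be illustrated with a deliberately tiny sketch: a bigram model that samples each next word in proportion to how often it followed the previous one in a toy corpus. This is only an illustrative analogy for how LLMs predict text, not an actual LLM.

```python
import random

# Toy corpus; a real model is trained on vastly larger data, which is where
# personal data can enter as "training data".
corpus = "the model predicts the next word the model generates text".split()

# Bigram frequency table: word -> {following word: count}
bigrams = {}
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams.setdefault(prev, {}).setdefault(nxt, 0)
    bigrams[prev][nxt] += 1

def next_word(prev):
    """Sample the next word in proportion to its observed frequency."""
    candidates = bigrams.get(prev)
    if not candidates:
        return None
    words = list(candidates)
    weights = [candidates[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

# Generate a short continuation from a one-word prompt.
word, output = "the", ["the"]
for _ in range(4):
    word = next_word(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))
```

The user’s prompt (“input data”) selects a starting point, and the model’s statistics over its data pool determine the generated continuation (“output data”), which is why both can carry traces of personal data present in training material.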
AI models may undergo further training on more specific and focused training data to enhance their performance, for example to tailor them to a particular use case. The AI uses human requests (“input data”) to generate content (“output data”). Some data sources may include personal data, notably training data acquired by scraping publicly available internet data. Whether personal data is included in input and output data depends on the intended purpose and use case for each kind of data.
Within the context of the General Data Protection Regulation (GDPR), it is essential to identify the precise aspects of personal data processing by GenAI models for which a company (referred to as the “AI user”) is responsible as a controller. The AI provider determines the methodology for analyzing the data gathered from end users to enhance the overall performance of the GenAI service. When the settings allow training data to be reused to improve the general AI, AI users provide access to their end-user data in the knowledge that the AI provider will use it to train and improve both its general AI services and the AI user’s specific services.
Organizations that use these AI services may be joint controllers with the AI provider, as outlined in Article 26 of the GDPR. This may significantly affect the degree of risk they face, given the joint financial benefits involved. It is therefore advisable either to disable configurations that allow the AI provider to reuse input data (where commercially viable) or to carefully assess and plan for the potential consequences of joint controllership.
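The opt-out recommendation above can be sketched as a minimal configuration check. All names here are invented for illustration; real AI providers expose such controls differently (account settings, data processing agreements, or API parameters), so this is only a sketch of the decision, not any vendor’s actual API.

```python
from dataclasses import dataclass

@dataclass
class AIServiceConfig:
    # Hypothetical flag: may the provider reuse input/end-user data
    # as training data for its general services?
    allow_provider_training_reuse: bool = False
    # Hypothetical flag: may the provider retain outputs beyond the session?
    retain_outputs: bool = False

def gdpr_risk_flags(cfg: AIServiceConfig) -> list:
    """Return compliance points to assess before enabling a setting."""
    flags = []
    if cfg.allow_provider_training_reuse:
        flags.append("possible joint controllership (GDPR Art. 26)")
    if cfg.retain_outputs:
        flags.append("review retention terms in the data processing agreement")
    return flags

# Conservative default: both flags off, so no joint-controllership exposure.
print(gdpr_risk_flags(AIServiceConfig()))
# Enabling reuse surfaces the Art. 26 assessment point.
print(gdpr_risk_flags(AIServiceConfig(allow_provider_training_reuse=True)))
```

The design point is simply that the reuse setting, not the use of the service as such, is what triggers the joint-controllership analysis.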
Corporate versions of GenAI systems increasingly address this concern by offering AI users alternative configurations, thereby mitigating these risks.
The GDPR governs the lawfulness of personal data processing by AI users. Article 6 permits the internal processing of non-sensitive personal data, including activities such as customer relationship management (CRM) and the generation of AI-based product recommendations. Nevertheless, using GenAI for internal or customer-facing processes that involve sensitive information poses further difficulties. For example, using GenAI in drug discovery and development, or to draft communications with patients, raises questions of compliance with Article 9 of the GDPR.
Transparent communication is another vital element of the GDPR, which requires controllers to provide data subjects with clear and precise information about essential particulars. Automated decision-making (ADM) may have legal effects on, or similarly significantly affect, the persons whose data is involved when it fulfils the conditions set out in Article 22. Users of AI systems are required to inform individuals that automated decision-making is being used, provide meaningful information about the underlying logic, and clearly explain the significance and envisaged consequences of the data processing.
At present, there are no established industry standards that clearly define these requirements. AI users may exercise their judgement in setting specific limits, ensuring that individuals receive meaningful information about the specific data elements, their sources, and their importance in the decision-making procedure. The expected outcomes vary with the particular context; in the insurance industry, for instance, it is crucial to clearly articulate the impact of various behaviours on insurance premiums.
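Since no industry standard defines these disclosures, one way to operationalize them is to keep a structured record of the transparency items per automated decision, using the insurance premium scenario above. The field names and contents below are illustrative assumptions, not a standardized schema.

```python
# Illustrative record of the transparency items an AI user could give data
# subjects for automated decision-making: the fact of ADM, the logic, the
# data elements and their sources, and the significance/consequences.
# All field names and values are hypothetical.
adm_disclosure = {
    "automated_decision_making_used": True,
    "logic_description": (
        "Premium is derived from driving-behaviour telemetry "
        "scored by a statistical model."
    ),
    "data_elements_and_sources": [
        {"element": "braking patterns", "source": "in-vehicle telematics"},
        {"element": "annual mileage", "source": "policyholder declaration"},
    ],
    "significance_and_consequences": (
        "Harsh-braking scores above a threshold can raise the premium "
        "at renewal; data subjects may request human review."
    ),
}

def is_complete(disclosure):
    """Check that every transparency item has been filled in."""
    required = {
        "automated_decision_making_used",
        "logic_description",
        "data_elements_and_sources",
        "significance_and_consequences",
    }
    return required <= disclosure.keys() and all(disclosure[k] for k in required)

print(is_complete(adm_disclosure))
```

A checklist of this kind makes it easy to verify, per use case, that none of the information items owed to data subjects has been omitted.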
If the processing does not satisfy the conditions for automated decision-making set out in Article 22, the degree of complexity is considerably reduced. The European Data Protection Board (EDPB) suggests that it is best practice to provide all the information specified in (i)–(iii); however, it is not obligatory to do so. In the absence of set guidelines, there are several approaches to describing the handling of personal data by artificial intelligence.
The rapid rise of GenAI raises further concerns about data security. Companies using these technologies must ensure that their data-processing methods comply with the GDPR. Data protection authorities play a crucial role in supervising and regulating the use of GenAI, and EU authorities are actively working to supervise GenAI in practice, thereby enhancing GDPR compliance. This article focuses on the primary GDPR concerns raised by the use of GenAI, and of Large Language Models (LLMs) as its fundamental technology, insofar as they fall within the scope of EU regulations.