Blog

Sensitive Information Disclosure: The Hidden Risk in Generative AI Systems

Author

Robinson Israel Uche

Posted: May 7, 2025 • 4 min Read

Cybersecurity

Sensitive Information Disclosure: The Hidden Risk in Generative AI Systems

New attack surfaces are emerging as enterprises accelerate the adoption of Large Language Models (LLMS). One of the most overlooked yet dangerous threats is LLM02: Sensitive Information Disclosure, identified by OWASP in its Top 10 for LLMS. This article explores how sensitive data can be unintentionally leaked through LLM behaviour, what this means for businesses, and how security teams can mitigate it.

Sensitive Information Disclosure?

LLM02 is the unintended exposure of confidential, proprietary, or regulated data through a large language model's interactions or internal mechanisms. This can include:

LLMFig 1.0

This vulnerability becomes especially critical when LLMs are integrated into enterprise workflows that handle sensitive customer, financial, or operational data.

How It Happens: Technical Breakdown

Sensitive information disclosure can occur through various mechanisms:

  1. Prompt Context Leakage: Maliciously crafted prompts can exploit how many LLMs retain conversational memory or context across sessions. Attackers can use this to 'peek' into prior prompts or inject queries that elicit unintended information disclosure.
  2. Overfitted Training Data: If training data contains proprietary documents, internal communications, or personal data, and the model is not curated or governed properly, users may retrieve fragments of this data via crafted prompts.
  3. Prompt Injection via 3rd-Party Plugins: Poorly secured integrations of LLMs with external tools or APIs can leak sensitive data when plugins process user inputs without adequate sanitization.
  4. Memory Injection Attacks: If the LLM stores session or user memory, adversaries can exploit it by feeding data during earlier interactions that is later exfiltrated via crafted inputs.

Real-World Example

In 2023, Samsung engineers unintentionally leaked proprietary source code and internal meeting notes while using ChatGPT to debug code. The LLM retained this context, risking exposure to other users and violating internal data policies. This incident illustrates how even benign usage of LLMS can result in high-impact data disclosures.

Risk to Enterprises

For regulated industries—finance, healthcare, law—LLM02 poses a compliance nightmare:

  • PII/PHI Disclosure- Violates HIPAA, GDPR, CCPA
  • Insider Data Leaks- Risk of internal documents being exposed
  • Client Confidentiality Legal exposure if privileged data leaks

Effective security governance is built on interconnected elements that cascade from strategy to execution:

The sheer volume and velocity of LLM use in business make traditional DLP insufficient. The model may 'learn' sensitive data, making removal nearly impossible.

Detection Techniques

LLMFig 2.0

Mitigation Strategies

  • Pre-Deployment Data Scrubbing Remove all sensitive and PII data before training or fine-tuning.
  • Memory Control Disable or sandbox conversational memory unless necessary.
  • Prompt SanitizationImplement strong input validation, especially in plugin interfaces.
  • CDifferential Privacy Techniques Use privacy-preserving training methods to reduce data memorisation.
  • LLM Firewalls Implement LLM-aware filtering layers (e.g., Lakera, PromptShield) to block sensitive outputs.
  • Awareness Implement comprehensive awareness training programs that equip employees to accurately classify and appropriately handle information based on its security level.
  • Cloud Security Alliance (CSA) and OWASP: For cloud and application security guidance.