AI Privacy Audit 2026: Which Apps Collect Your Data and How to Opt Out

A comprehensive breakdown of what ChatGPT, Claude, Gemini, Copilot, and other AI tools collect about you - and the steps to protect yourself

Every time you ask an AI chatbot a question, you’re handing over data. But how much? And what happens to it after you close the tab?

I audited the privacy policies of the most popular AI apps to find out exactly what they collect, how long they keep it, whether they use it for training - and how to opt out where possible.

The results aren’t pretty. Most AI companies now operate on an “opt-out” basis, meaning your conversations are used for training unless you specifically disable it. Some platforms offer no opt-out at all.

ChatGPT (OpenAI)

What they collect: Everything you type, including questions, conversations, and uploaded files. Plus geolocation, IP address, browser type, network activity, and cookies to track usage patterns.

Data storage: Your conversations are stored on OpenAI’s US servers indefinitely unless you delete them.

Training: By default, your conversations are used to train future models. Human reviewers may read portions of your chats.

How to opt out:

  1. Go to Settings > Data Controls
  2. Toggle off “Improve the model for everyone”

This stops new conversations from training models while keeping your chat history visible. Note: if you provide feedback on any response (thumbs up/down), that entire conversation may still be used for training regardless of this setting.

Important: OpenAI still retains your data even with training disabled. The only way to prevent all storage is to use their API with Zero Data Retention (ZDR) enabled - but that’s only available for business customers.
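
For illustration, here’s a minimal Python sketch of what an API call looks like under that setup. Two assumptions worth flagging: ZDR is an organization-level agreement you arrange with OpenAI rather than a per-request flag, and the store parameter shown below only controls whether the completion is saved to OpenAI’s stored-completions dashboard - it is not ZDR itself. The model name is just a placeholder.

    # Minimal sketch (openai Python SDK v1+). Assumes your organization
    # has already been granted Zero Data Retention by OpenAI; nothing in
    # the request itself turns ZDR on.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Summarize this clause."}],
        store=False,  # don't save this completion for evals/distillation
    )
    print(response.choices[0].message.content)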

OpenAI’s privacy policy was updated February 9, 2026 with a separate voice mode training opt-out.

Claude (Anthropic)

What they collect: Your conversations, uploaded files, and usage data.

Data storage: Standard 30-day retention. If you opt into training, retention extends to five years.

Training: As of August 2025, Anthropic asks Free, Pro, and Max users whether they want to share data for model improvement. Unlike some competitors, this is opt-in rather than opt-out.

How to opt out:

  1. Go to Settings > Privacy
  2. Disable “Help improve Claude”

If you don’t choose to share data, Anthropic keeps conversations for 30 days only, then deletes them.

What’s excluded: Data from business accounts (Claude for Work, Government, Education, and API use) is never used for training.

Anthropic states they don’t sell data to third parties and filter sensitive information before any training use.

Google Gemini

What they collect: All prompts, images, documents, files, voice recordings (including Gemini Live), screen shares, location data, device info, IP address, and usage patterns.

Data storage: Activity auto-deletes after 18 months by default. However, conversations selected for human review are retained for up to three years - even if you delete your activity.

Training: Your chats are used to improve Google services unless you disable Gemini Apps Activity.

How to opt out:

  1. Visit myactivity.google.com/product/gemini
  2. Under “Keep activity”, select “Turn off” or “Turn off and delete activity”

On mobile, open the Gemini app, tap your profile picture, select Gemini Apps activity, and turn it off.

The catch: Disabling activity means you lose conversation history. Google also keeps a 72-hour backend log for abuse monitoring, regardless of your settings.

Google explicitly warns users not to enter confidential information they wouldn’t want a reviewer to see.

Microsoft Copilot

What they collect: Prompts, responses, accessed documents, usage patterns, and data from connected Microsoft 365 apps.

Training: For consumer Copilot, Microsoft may use conversations for product improvement and model training. For the enterprise version (Microsoft 365 Copilot), Microsoft says prompts and responses aren’t used to train foundation LLMs.

How to opt out:

Go to your Microsoft Account privacy settings and look for AI data controls. Microsoft says you can opt out of model training at any time.

Recent issues: In late January 2026, Microsoft confirmed a bug that allowed Copilot to bypass data loss prevention policies and access confidential emails.

Note: As of January 7, 2026, Anthropic is now a subprocessor for Microsoft 365 Copilot.

Meta AI

What they collect: Your public posts, comments, photos, and AI chat interactions across Facebook, Instagram, Messenger, and WhatsApp. Photos of non-users may be collected if shared by someone else.

Training: Your data trains Meta’s AI models. As of December 2025, your AI chat interactions are also used for ad targeting.

How to opt out:

If you’re in the EU, UK, Switzerland, Brazil, Japan, or South Korea: You have formal opt-out rights under privacy laws like GDPR. Use the objection form.

If you’re in the US or most other countries: There is no opt-out. Meta’s current policy doesn’t provide a mechanism to prevent training on your data.

You can mute Meta AI to stop notifications, but this doesn’t protect your privacy - it just hides the feature.

Private posts: Meta claims it doesn’t use content from private posts, so keeping your posts non-public may limit exposure.

Perplexity AI

What they collect: Name, email, payment details, prompts, AI responses, device info, location, and browsing behavior. If you sync email accounts, they access contacts, messages, and calendar data.

Data storage: Data from standard accounts is retained indefinitely. Enterprise users get 7-day file retention, with custom options for larger organizations.

Training: Free, Pro, and Max users have AI Data Retention enabled by default but can control whether data is used for model training.

How to opt out:

Check your account settings for AI Data Retention controls. Enterprise users are excluded from model training entirely.

Email sync: Perplexity specifically states they don’t use synced email data for AI training.

Perplexity’s current privacy policy is effective February 5, 2026.

Midjourney

What they collect: All prompts (text and images), public chats, IP address, billing info, email, and username.

Data storage: Generated images are stored 30 days unless saved. But here’s the catch: images remain permanently in the public gallery with no true deletion option.

Training: Your prompts and images train Midjourney’s models indefinitely. There is no opt-out.

Public by default: Every image you create appears in the community gallery on midjourney.com, including images made in private Discord servers or direct messages.

Stealth Mode: Pro and Mega plans ($60+/month) can hide creations from other users, but images still train models and must follow community guidelines.

Voice Assistants (Alexa, Google Assistant, Siri)

All three major voice assistants collect recordings, though approaches differ:

Amazon Alexa: Records and stores every interaction unless manually deleted. In 2023, Amazon paid $25 million in FTC penalties for keeping children’s voice recordings despite deletion requests.

Google Assistant: Stores voice data with activity controls to manage and delete recordings.

Apple Siri: Positions itself as privacy-focused with more local processing, though some requests still go to Apple servers.

Accidental recordings: Research from Northeastern University found devices misactivate about once every two days on average.

The Big Picture

Here’s what stands out from this audit:

Opt-out is the new default. Most companies assume you consent to training unless you actively disable it. This represents a shift from even a year ago.

30-day retention is a floor, not a ceiling. Even with training disabled, most services keep your data for “abuse monitoring” - and that data may persist much longer than advertised.

Enterprise users get better privacy. Paid business tiers consistently exclude data from training. Privacy has become a premium feature.

Meta is the worst offender. No meaningful opt-out for US users, ad targeting from AI chats, and aggressive data collection across platforms.

Midjourney offers no training opt-out. If you use it, your prompts and images become permanent training data.

What You Can Do

  1. Check every AI tool you use. Look for data controls in settings immediately after reading this.

  2. Assume everything is training data. Don’t share sensitive information unless you’ve verified the privacy controls.

  3. Consider local alternatives. Self-hosted models like Ollama don’t send data anywhere (see the sketch after this list). We’ve covered how to set up local code completion and other local tools.

  4. Use enterprise tiers for sensitive work. If your employer offers them, business versions typically have stronger protections.

  5. Delete old conversations. Even if it doesn’t prevent past training, it reduces your exposure to future policy changes.
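
To make the local option concrete, here’s a minimal Python sketch that queries a locally running Ollama server over its HTTP API. It assumes you’ve installed Ollama, pulled a model (for example, ollama pull llama3), and left the server on its default port; the prompt never leaves your machine.

    # Minimal sketch: query a local Ollama server over its HTTP API.
    # Assumes Ollama is running on its default port (11434) and a model
    # has been pulled, e.g. `ollama pull llama3`. Nothing leaves localhost.
    import json
    import urllib.request

    payload = json.dumps({
        "model": "llama3",
        "prompt": "What data do cloud AI chatbots typically collect?",
        "stream": False,  # return one JSON object instead of a stream
    }).encode()

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])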

The AI industry has decided your data is too valuable to leave untouched. The least you can do is make them work for it.