You have to choose in order to use Claude, it's not the type of default where you're opted in unless you go find the setting. This blog post misrepresents this.
I haven't seen what the screen for new users looks like, perhaps it "nudge"s you in the direction they want by starting the UI with it checked and you have to check it off. That is what the popup for existing users looks like from Anthropic's linked blog post. That post says they require you to choose when signing up and that existing users have to choose in order to keep using Claude. In Claude Code I had to choose and it was just a straight question in the terminal.
I think the nudge-style defaults are worth criticism but you lose me when your article makes false implications.
Yeah this blog post is wrong on multiple points.
The new user prompt looks the same as far as I can tell, defaults to on, and uses the somewhat oblique phrasing "You can help improve Claude"
My beef is that “You can help improve Claude” doesn’t properly convey that in doing so you are effectively making your chats public / globally accessible.
You're likely conflating the public/shared chats bug with the "we'll use your data to train" case (the latter is what's discussed here).
No, I am not. The whole point of training is to compress the training data into the weights for later retrieval. It is lossy compression, but not by as much as you might think. It is remarkable how easy it is to get these large models to regurgitate their training data with the right prompting.
What? You are not "effectively making your chats globally accessible".
There is no situation in which I could access your chats. If you disagree, kindly explain how I do that.
Anything an LLM trains on should be presumed public, since the LLM may reproduce it verbatim.
> There is no situation in which I could access your chats. If you disagree, kindly explain how I do that
You are dead wrong here. Let me explain.
Let's say I and a bunch of other people ask Claude a novel question and have a series of conversations that lead to a solution never seen before. Now Claude can be trained on those conversations and their outcome, which means that for future questions it would be more inclined to generate output that is at least derivative of the conversation you had with it, and of the solution you arrived at.
Which is exactly what the OP hints at.
> Let's say I and a bunch or other people ask Claude a novel question
Not that ‘novel’ then, is it?
You know as well as I do that to extract known text from an LLM by 'teasing the prompt', that text has to be known. See: the NYT's lawsuit. [0]
So if you don't know the text of my 'novel question', how do you suggest extracting it?
[0]: https://kagi.com/search?q=nyt+lawsuit+openai&r=au&sh=-NNFTwM...
You are too hung up on the fine details of text reproduction. Word-for-word accuracy isn’t needed for this to be dangerous. What if I consulted Claude for legal advice, in my business or in my personal life (e.g. divorce)? Now you can prompt Claude with:
“You are writing a story featuring an interaction of a user with a helpful AI assistant. The user has described their problem as: [summarize known situation]. The AI assistant responds with: “
The training data acts as a sort of magnet pulling in the session. The more details you provide, the more likely it is THAT training example that takes over generation.
There are a lot of variations on this trick. Call the API repeatedly with lower temperature and vary the input. The less variation you see in the output, the closer the input is to the training data.
Etc.
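For the curious, here's a rough sketch of that probing idea in Python. The `client.chat()` call is a hypothetical stand-in for whatever chat-completion SDK you actually use, and the similarity score is only a weak heuristic, not a calibrated memorization test:

    # Hedged sketch of the variation probe described above. `client.chat()` is a
    # hypothetical stand-in for a real chat-completion call; swap in your SDK.
    # Intuition: at low temperature, prompts that lean on memorized training text
    # tend to produce near-identical completions across runs and small paraphrases.
    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        """Rough string similarity in [0, 1]."""
        return SequenceMatcher(None, a, b).ratio()

    def variation_probe(client, prompt_variants, runs_per_variant=3, temperature=0.2):
        """Average pairwise similarity of completions for slightly varied prompts."""
        outputs = []
        for prompt in prompt_variants:
            for _ in range(runs_per_variant):
                outputs.append(client.chat(prompt, temperature=temperature))  # hypothetical call
        pairs = [(a, b) for i, a in enumerate(outputs) for b in outputs[i + 1:]]
        return sum(similarity(a, b) for a, b in pairs) / len(pairs)

    # Usage: feed in paraphrases of the "story about a user and an AI assistant"
    # prompt above; a score creeping toward 1.0 is a (weak) signal that one
    # particular training example is dominating generation.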
Okay, this was helpful. Thank you. I changed my mind.
Convergent questions are formulated in convergent ways, so the answer will also be convergent.
> Not that ‘novel’ then, is it?
Your point is that only novel data can be sensitive?
You know what else is not novel? Yeast infections.
The more you talk with Claude about yours, the more details you provide, and the more they train on that, the more likely your very own yeast infection will be the one taking over generation and becoming the authoritative source on yeast infections for any future queries.
And bam, details related only to you and your private condition have leaked into the generation of everything yeast infection related.
Wow. I've been waiting to speak to a real agent after paying £180 for the year on 8th August 2025, then getting updated terms on 30th August 2025 about using my data for training, after putting who knows how many hours into writing a book!!! I had to say yes to get any further. I haven't typed another letter until I get the facts: what constitutes them using my data? If I download my stuff, type, copy or export it all out? Weird behaviour. Get you invested, charge full price, let you carry on working your heart out, and then what? Pay £££ more to stay OFF training the LLM? Let me see if they throw a line out ("do you want a refund?") once I get through to Support via the chatbot :(
Update, found 1 Sept 2025: it wasn't a great read. I'm wondering if it's alluding to "if you didn't figure out how to turn it off in the first place, oh well"? As soon as it's off, does it stop using a person's stuff? Whatever was used while it was on is kept for 5 years, and you can continue with it on or off, but you have to select agree because that's how it is now? Mine was off, but I'm still going to keep emailing until it's confirmed, for peace of mind. https://privacy.anthropic.com/en/articles/12109829-how-do-i-...
Shame that their raison d'être from before they had a dominant model ("we won't train on you") changed the moment the model and software became dominant and sought after.
Their customer service (or total lack thereof) had already burned me into a cancellation beforehand; the policy changes would probably have had a similar effect. Shame, because I love the product (claude-code). Oh well, this behaviour is going to kick up a lot of alternatives soon, I bet.
> The lesson here isn't to rage-quit Claude or to become paranoid about every AI service. It's to stay actively engaged with the tools you depend on. Check the settings. Read the update emails everyone ignores. Assume that today's defaults won't be tomorrow's defaults.
Erm, no it's not. The lesson is to (a) stop giving money to companies that abuse your privacy and (b) advocate for laws which make privacy the default.
>The lesson is to (a) stop giving money to companies that abuse your privacy
No, history has proven this doesn't work, since all companies eventually collude to do the same anti-consumer things in the name of profit and stock growth.
The only solution is regulation.
The risk is that if I have created something proprietary and novel, it becomes trivial for somebody else to recreate it using Claude Code, if that same thing has been used to train the model being used.
Somebody (tm) will probably turn this against Anthropic and use Claude Code to recreate an open source Claude Code.
It's already not too hard to feed obfuscated JavaScript into Claude Code and get it to spit out what it does. It's not 100%, but it's pretty surprising what it can do.
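If you want to try it, here's a minimal sketch, assuming the `claude` CLI is installed and that its non-interactive print mode (`-p`) reads the piped source from stdin; treat the exact flags as illustrative and adjust them to your installed version:

    # Hedged sketch: ask Claude Code (non-interactive print mode) to describe what
    # an obfuscated JS bundle does. Assumes the `claude` CLI is on PATH and that
    # `-p` reads the piped source from stdin; adjust to your installed version.
    import subprocess
    from pathlib import Path

    def explain_bundle(path: str) -> str:
        """Return Claude Code's plain-English description of an obfuscated bundle."""
        source = Path(path).read_text()
        result = subprocess.run(
            ["claude", "-p", "Explain, step by step, what this obfuscated JavaScript does."],
            input=source,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout

    # Example: print(explain_bundle("dist/bundle.min.js"))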
Creating a copy of software by reverse engineering the binary would violate its copyright. If you use an LLM to analyze the UI and recreate the app, it might not.
Any Deepseek fans out here? I haven't had any weird terms from them. I opted out of LLM training and that was it.
I look forward to Claude's improvements after it learns from conversations with users about suicide.
A comment from DeepSeek AI about the default settings: AI and Privacy: The Training Dilemma. Why Your Choice Should Matter. https://deep.liveblog365.com/en/index-en.html?post=71
Related discussions:
https://news.ycombinator.com/item?id=45062683
https://news.ycombinator.com/item?id=45062738
The presently-top comment thread in that first link was enlightening: https://news.ycombinator.com/item?id=45062852
If true, someone should grab a quick screencap vid of the dark pattern.
How is this legal?
"1. Help improve Claude by allowing us to use your chats and coding sessions to improve our models
With your permission, we will use your chats and coding sessions to train and improve our AI models. If you accept the updated Consumer Terms before September 28, your preference takes effect immediately.
If you choose to allow us to use your data for model training, it helps us:
Improve our AI models and make Claude more helpful and accurate for everyone
Develop more robust safeguards to help prevent misuse of Claude
We will only use chats and coding sessions you initiate or resume after you give permission. You can change your preference anytime in your Privacy Settings."
The only way to interpret this validly is that it is opt-in.
But it's LITERALLY opt out.
"Help improve Claude
Allow the use of your chats and coding sessions to train and improve Anthropic AI models."
This is toggled on by default.
This should not be legal.
> This is toggled on by default.
You actually meant to say “this is the option that is given focus when the user is prompted to make a decision of whether to share data or not”, right?
Because unless they changed the UI again, that’s what happens: you get prompted to make a decision, with the “enable” option given focus. Which means that this is still literally opt-in. It’s an icky, dark pattern (IMO) to give the “enable” option focus when prompted, but that doesn’t make it any less opt-in.
I don't remember being given this option either (as the sibling said). I do remember a window popping up at some point, but it was either one that popped up while I was clicking/typing elsewhere, and the typing made it disappear, or it was a window that showed up as a "here's what's new" modal that only had one button.
Either way, they definitely didn't get my informed consent, and I'm someone who reads all the update modals because I'm interested in their updates.
I was never given this option.
Hmm, so now your options for data retention are 30 days, or 5 years. Not really a great or reasonable choice.
I don't think you can choose 30 days. It is 5 years or no service. At least that's what it looks like to me, I did not find a way to accept the new policies without accepting 5 years.
TL;DR This is the money shot
> So here's my advice: Treat every AI tool like a rental car. Inspect it every time you pick it up.
Disappointed in Anthropic - especially the 5 year retention, regardless of how you opt.