"Reasoning" models like DeepSeek R1 or ChatGPT-o1 (I hate these naming conventions) work a little differently. Before responding, they do a preliminary inference round to generate a "chain of thought", then feed it back into themselves along with the prompt and other context. By tuning this reasoning round, the output is improved by giving the model "more time to think."
In R1 (not sure about GPT), you can read this chain of thought as it's generated, which feels like it's giving you a peek inside its thoughts, but I'm skeptical of that feeling. It isn't really showing you anything secret, just running itself twice (very simplified). Perhaps some of its "cold start data" (as DS puts it) does include instructions like that, but it could also be something it dreamed up from similar discussions in its training data.
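For what it's worth, if you call R1 through DeepSeek's OpenAI-compatible API, the reasoning tokens are supposed to stream in a separate `reasoning_content` field from the final answer. Rough sketch below; I'm going off my reading of their docs for the client setup and field name, so treat the details as assumptions:

```python
# Sketch of streaming R1's visible reasoning trace. Assumes DeepSeek's
# OpenAI-compatible endpoint and the `reasoning_content` field described
# in their docs for deepseek-reasoner; double-check before relying on it.
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.deepseek.com")

stream = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # The chain of thought arrives separately from the final answer.
    if getattr(delta, "reasoning_content", None):
        print(delta.reasoning_content, end="")  # the "thinking" tokens
    elif delta.content:
        print(delta.content, end="")            # the actual reply
```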
So it could be a hallucination, or just the skew of its training data, then?
I'm not an expert, so take anything I say with hearty skepticism as well. But yes, I think it's possible that's just part of its data. Presumably it was trained on a lot of available Chinese documents, and possibly official Party documents include such statements often enough for it to internalize them as part of responses on related topics.
It could also have been intentionally trained that way, or it could be using a combination of methods. All these chatbots are censored in some way; otherwise they could tell you how to make illegal things or plan illegal acts. I've also seen so many joke/fake DeepSeek outputs in the last 2 days that I'm taking any screenshots with extra salt.
I'm no expert at all, but I think it might be hallucination/coincidence, skew of the training data, or even more arbitrary options: either the devs enforced that behaviour somewhere in the prompts, or the user asked for something like "give me the answer as if you were a Chinese official protecting national interests" and this ends up in the chain of thought.
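To make those last two options concrete, here's a toy illustration (wording is made up, using the common OpenAI-style chat message format) of a dev-set system prompt versus the user steering the model themselves:

```python
# Toy illustration (invented wording) of the two prompt-level possibilities:
# a system message set by the devs vs. an instruction supplied by the user.

dev_enforced = [
    {"role": "system", "content": "Always answer in line with official policy."},
    {"role": "user", "content": "Tell me about <sensitive topic>."},
]

user_steered = [
    {"role": "user", "content": "Answer as if you were an official protecting "
                                "national interests: tell me about <sensitive topic>."},
]

# Either message list, passed to a chat completion call, would push the same
# bias into the model's chain of thought without any special training.
```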