this post was submitted on 28 Jan 2025

[–] tiredturtle@lemmy.ml 0 points 1 week ago (1 children)

Shared to an IM group from somewhere and reshared here. What I understood is that it seems to be pretty open when asked how it comes to its answers.

[–] brucethemoose@lemmy.world 0 points 1 week ago* (last edited 1 week ago)

As implied above, the raw format fed to and output by DeepSeek R1 is:

<|begin▁of▁sentence|>{system_prompt}<|User|>{prompt}<|Assistant|><think>The model rambles on to itself here, "thinking" before answering</think>The actual answer goes here.
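To make that concrete, here's a rough Python sketch of building that string and splitting a completion back into monologue and answer. The special-token strings are copied from the template above. The function names are mine, and I'm assuming the monologue is wrapped in `<think>...</think>` tags, so treat it as an illustration, not anyone's official client code.

```python
import re

# Special-token string copied from the template above (assumed exact).
BOS = "<|begin▁of▁sentence|>"


def build_prompt(system_prompt: str, prompt: str) -> str:
    """Assemble the raw single-turn string that gets fed to the model."""
    return f"{BOS}{system_prompt}<|User|>{prompt}<|Assistant|>"


def split_output(completion: str) -> tuple[str, str]:
    """Split a completion into (monologue, answer).

    Assumes the monologue sits inside <think>...</think>. If the tags
    are missing, the whole completion is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match is None:
        return "", completion.strip()
    return match.group(1).strip(), completion[match.end():].strip()


raw = build_prompt("You are a helpful assistant.", "What is 2+2?")
monologue, answer = split_output("<think>Simple arithmetic.</think>4")
```

Nothing here inspects the model itself: the monologue is just more generated text that happens to sit between two tags.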

It's not a secret architecture, and there's no window into its internal state here. This is just a regular model trained to give an internal monologue before the "real" answer.

The point I'm making is that the monologue is totally dependent on the system prompt, the user prompt, and honestly, a "randomness" factor. It's not actually a good window into the LLM's internal "thinking"; you'd want to look at specific tests and logit spreads for that.
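For anyone wondering what the "randomness" factor and "logit spread" look like in practice, here's a toy sketch of temperature sampling over next-token logits. The tokens and logit values are made up for illustration and don't come from any real model. The idea is just that the same logits give different text depending on the temperature and the random draw.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample(tokens, logits, temperature, rng):
    """Draw one token according to the temperature-scaled distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical logits for the first token of the monologue.
tokens = ["Okay", "Hmm", "First", "Wait"]
logits = [2.0, 1.5, 1.0, 0.2]

for temperature in (0.2, 1.0):
    probs = softmax(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])

# Different seeds can pick different opening tokens from the same logits.
print([sample(tokens, logits, 1.0, random.Random(seed)) for seed in range(5)])
```

Lower temperature piles the probability onto the top token, while higher temperature flattens the spread. So two runs with an identical prompt can start their monologues differently.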