<｜begin▁of▁sentence｜>{system_prompt}<｜User｜>{prompt}<｜Assistant｜>The model rambles on to itself here, "thinking" before answeringThe actual answer goes here.

It's not a secret architecture, theres no window into its internal state ehre. Thi is just a regular model trained to give internal monologues before the "real" answer.

The point I'm making is that the monologue is totally dependent on the system prompt, the user prompt, and honestly, a "randomness" factor. Its not actually a good window into the LLM's internal "thinking," you'd want to look at specific tests and logit spreads for that.