This post was submitted on 12 Sep 2024.
You should really look at the full CoT traces on the demos.
I think you think you know more than you actually know.
You mean like this chain of thought?
Actually, they are hiding the full CoT sequence outside of the demos.
What you see there is a summary; because the actual reasoning process is hidden, there's no way to tell what really transpired.
People are decidedly not happy about this.
It also means that model context (which research has shown to be much more influential than previously thought) is now partly hidden, with exclusive access and control resting with OpenAI.
There are a lot of things to focus on in that image, and "hur dur the stochastic model can't count letters in this cherry-picked example" is the least among them.
Got a link to that?
Yep:
https://openai.com/index/learning-to-reason-with-llms/
First interactive section. Make sure to click "show chain of thought."
The cipher one is particularly interesting, as it's intentionally difficult for the model.
The tokenizer is famously bad at working with individual letters and letter counts, which is why previous models can't count the number of r's in "strawberry."
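To make that concrete, here's a quick sketch (assuming Python and the tiktoken package, with cl100k_base as a stand-in encoding since o1's own tokenizer isn't something we can inspect here) showing that the word reaches the model as a few multi-letter chunks rather than as individual letters:

```python
# Sketch only: tiktoken with cl100k_base as a stand-in encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# The model operates on token ids, not characters, so "strawberry"
# arrives as a handful of multi-letter chunks, never as single letters.
for tok in enc.encode("strawberry"):
    print(tok, enc.decode_single_token_bytes(tok))
```

Counting r's requires reasoning across those chunk boundaries, which the model never sees spelled out.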
So the cipher depends on two-letter pairs, and you can see how it screws up the tokenization around the "xx" at the end of the last word and gradually corrects course.
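If memory serves, the worked example on that page maps each pair of ciphertext letters to the letter at the average of their alphabet positions. A rough sketch of that decode, treating both the rule and the example string as assumptions pulled from the demo:

```python
# Rough sketch of the pair-averaging decode the demo's cipher appears to use.
# The ciphertext below is the worked example from OpenAI's page (as I recall);
# it should come out as "think step by step".
def decode_word(word: str) -> str:
    letters = []
    for a, b in zip(word[0::2], word[1::2]):
        # Average the two letters' alphabet positions (a=1 ... z=26).
        avg = ((ord(a) - 96) + (ord(b) - 96)) // 2
        letters.append(chr(avg + 96))
    return "".join(letters)

ciphertext = "oyfjdnisdr rtqwainr acxz mynzbhhx"
print(" ".join(decode_word(w) for w in ciphertext.split()))
```

Working pairwise over characters is trivial in code, but the model has to do it through tokens that don't line up with letter boundaries, which is exactly where the trace shows it stumbling.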
Seeing the full chain of thought will help clarify how it goes about solving something like the example I posted earlier behind the scenes.