AI models collapse when trained on recursively generated data (www.nature.com)

submitted 4 months ago by floofloof@lemmy.ca to c/technology@lemmy.world


[–] BombOmOm@lemmy.world 0 points 4 months ago (5 children)

Yep. It leads to a positive feedback loop. They just continue to self-reinforce whatever came out before.

And with increasing amounts of the internet being polluted with AI text output....

[–] MagicShel@programming.dev 0 points 4 months ago

That seems so obviously predictable.

[–] kevincox@lemmy.ml 0 points 4 months ago (1 children)

To be fair this doesn't sound much different than your average human using the internet.

[–] sp3tr4l@lemmy.zip 0 points 4 months ago

2024, Reverse Turing Test Challenge:

Can an LLM AI differentiate between human input and LLM AI input?

[–] Tobberone@lemm.ee 0 points 4 months ago

Well... It's built on statistics, and statistical inference will return to the mean eventually. If all it ever gets to train on is closer and closer to the mean, there will be nothing left to work with. It will all be the average...
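
A toy illustration of that averaging argument, as a minimal sketch and not the method from the linked Nature paper: the "model" here is just a Gaussian, and each generation is refit only on samples drawn from the previous generation's fit. The function name `simulate_collapse` and its parameters are hypothetical, picked for this example.

```python
import random
import statistics


def simulate_collapse(generations=500, sample_size=20, seed=0):
    """Refit a Gaussian on its own samples, generation after generation.

    Returns the fitted standard deviation at each generation,
    starting with the original "real data" value of 1.0.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    mu, sigma = 0.0, 1.0       # generation 0: the original distribution
    history = [sigma]
    for _ in range(generations):
        # "Train" the next model only on output from the current one.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        # Maximum-likelihood spread estimate; it is biased slightly low,
        # and that small bias compounds across generations.
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for gen in (0, 10, 100, 500):
        print(f"generation {gen:3d}: fitted std = {history[gen]:.6f}")
```

With a small sample per generation, the fitted spread tends to drift toward zero: the rare values in the tails stop being sampled, so later generations never see them. A real language model is far more complicated than a single Gaussian, so treat this only as intuition for the feedback loop.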

[–] Ensign_Crab@lemmy.world 0 points 4 months ago (3 children)

... AI inbreeding.

[–] Boozilla@lemmy.world 0 points 4 months ago (2 children)

We call it the GRRM model.

[–] Sibbo@sopuli.xyz 0 points 4 months ago

In the USA, they call it the AlaLlama model.

[–] bionicjoey@lemmy.ca 0 points 4 months ago

GPTargaryen

[–] skillissuer@discuss.tchncs.de 0 points 4 months ago

hapsburgGPT

[–] Even_Adder@lemmy.dbzer0.com 0 points 4 months ago* (last edited 4 months ago)

You have to pretty much intentionally give it enough synthetic data to wreck it. OpenAI and Anthropic train their models on generated data to improve them. As long as there's supervision during training, which there always will be, this isn't really a problem.

https://openai.com/index/prover-verifier-games-improve-legibility/

https://www.anthropic.com/research/claude-character