this post was submitted on 31 Jul 2024

Technology

Reddit says Microsoft’s Bing, Anthropic, and Perplexity have scraped its data without permission. “It has been a real pain in the ass to block these companies.”

[–] conciselyverbose@sh.itjust.works 0 points 3 months ago (3 children)

It's not your data.

Fuck off.

[–] Gsus4@mander.xyz 0 points 3 months ago (4 children)

My only regret was not deleting all my comments before deleting my reddit account :P

[–] YtA4QCam2A9j7EfTgHrH@infosec.pub 0 points 3 months ago

I did delete it all and I’m very happy about it.

[–] FenrirIII@lemmy.world 0 points 3 months ago (2 children)

Same. 13+ years of insanity and grouchy comments are gonna mess up the AIs

[–] Transporter_Room_3@startrek.website 0 points 3 months ago (1 children)

Same, plus or minus a year.

It took me a week, but I scrambled every comment and post with lorem ipsum and bee movie scripts, deleted the comments, then after verifying I could no longer find any of my original content on any search engine outside archive sites, I deleted the account.

It took so long because r*ddit started limiting API access when they realized people were automating their profile scrubbing.

As I've said before about certain countries, if you're doing everything you can to prevent people from leaving [THING/PLACE] then you might just be shit.

[–] FenrirIII@lemmy.world 0 points 3 months ago (2 children)

I was straight IP banned permanently for reporting the Israeli genocide fans and racists arguing for the eradication of Palestinians. I just deleted the account because I never imagined they would turn into such shitheels.

[–] LustyArgonianMana@lemmy.world 0 points 3 months ago* (last edited 3 months ago) (1 children)

I got IP banned for asking if there was any "good news" about Mitch McConnell after his strokes. I intentionally worded it ambiguously, but the mod on r/politics looked at my political history, saw that I'm not a conservative, and decided I was celebrating violence, so I was IP banned. I guess only Mitch McConnell is allowed to salivate at violence openly and the rest of us are supposed to be worried for his health. It's not a problem if women are the ones he's directing violence towards, but God forbid a woman speak back to him. Pregnancy and birth cause strokes and clots, and that's preferable to him vs an abortion... but God forbid he get strokes at the end of his life from being a horrible person and I find that preferable to his stupid harmful policies.

[–] lightnsfw@reddthat.com 0 points 3 months ago

The way they enforce that no violence rule is so fucking stupid. Even if you had said "I hope that stroke implodes his brain" you aren't advocating violence. A medical issue isn't violence.

[–] vaultdweller013@sh.itjust.works 0 points 3 months ago (1 children)

And here I got IP banned for saying we should murk the crown prince of Saudi Arabia. This was around the time the journalist got butchered in that Saudi Embassy. As an aside, the Saudis have oil and are assholes; how long till we start drone striking them?

[–] Passerby6497@lemmy.world 0 points 3 months ago (1 children)

As an aside, the Saudis have oil and are assholes; how long till we start drone striking them?

They would either need to stop playing ball, or oil would need to stop being a staple for energy generation and transport needs.

Until then, the 9/11 architects will be able to hang out and do whatever they want.

Also, thanks for protecting your friends that day GW, here's hoping you get "touched" by a friend from a long way away...

Reach out and touch someone

[–] Gsus4@mander.xyz 0 points 3 months ago

silver lining 😁

[–] elephantium@lemmy.world 0 points 3 months ago (2 children)

Don't regret too much. I wouldn't be surprised if reddit's "delete" function was really just "move to the 'suckers-wanted-to-delete-this' file."

[–] CileTheSane@lemmy.ca 0 points 3 months ago (2 children)

I "deleted" all my posts, then randomly had someone reply to a 3 year old post that wasn't showing up in my profile but still showed on the page.

Don't delete your comments, edit them to be useless.

[–] Passerby6497@lemmy.world 0 points 3 months ago

And as a positive to editing rather than deleting, you may have your comment taken down by AutoMod anyway! I had AutoMod take down a ton of my comments because they were flagged as spam after I used a replacement text tool to mass-fix a decade's worth of comments on multiple accounts. So many messages from AutoMod....

[–] tibi@lemmy.world 0 points 3 months ago (1 children)

I'm pretty sure they keep edit history too.

[–] CileTheSane@lemmy.ca 0 points 3 months ago

Probably, but when someone is going through old posts they are going to see the edit, not the history. The main goal here is to make Reddit less useful so people go elsewhere. Let Google's AI be trained on Bot posts.

[–] Quill7513@slrpnk.net 0 points 3 months ago (1 children)

If you delete your content, do it in the form of a GDPR takedown request.

[–] elephantium@lemmy.world 0 points 3 months ago

Good idea for those who are covered by the GDPR. Doesn't help me, though.

[–] demizerone@lemmy.world 0 points 3 months ago

I deleted my top comments and left the trash, which was 15 years' worth. AI can hallucinate off that trash all it wants.

[–] lmaydev@lemmy.world 0 points 3 months ago (7 children)

I mean it literally is. People post it there voluntarily knowing that. It's what keeps the lights on.

[–] cygnus@lemmy.ca 0 points 3 months ago* (last edited 3 months ago) (4 children)

Sort of, but not really. From the Reddit ToS (emphasis mine):


By submitting Your Content to the Services, you represent and warrant that you have all rights, power, and authority necessary to grant the rights to Your Content contained within these Terms. Because you alone are responsible for Your Content, you may expose yourself to liability if you post or share Content without all necessary rights.

You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:

When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.

[–] bionicjoey@lemmy.ca 0 points 3 months ago (2 children)

Beyond that, if you are serving webpages with data on them, you don't get to decide what people do with those pages. They can't stop search engines from scraping

[–] commie@lemmy.dbzer0.com 0 points 3 months ago

you can decide who you serve pages to

[–] bassomitron@lemmy.world 0 points 3 months ago

Just to nitpick, they can stop scraping, anyone can. However, doing so would require implementing barriers that tend to also negatively affect sites that are dependent on being discovered and browsed.

[–] chakan2@lemmy.world 0 points 3 months ago (1 children)

Lol...really? So they can reuse, modify, and remove all association with your content, but somehow you think you still own it?

I've got a bridge to sell you.

[–] cygnus@lemmy.ca 0 points 3 months ago (1 children)

In essence, it means that you reserve the right to also use the content for your own purposes, without Reddit having any recourse to prevent you from doing that.

[–] chakan2@lemmy.world 0 points 3 months ago (1 children)

Except they published your work, all variants of said work, and completely eliminated you as the author of said work.

I don't know how else to explain to you that you don't own that work anymore. You have rights to it. But you don't own it.

[–] cygnus@lemmy.ca 0 points 3 months ago

It's the opposite; you own it, but Reddit also have rights to it.

[–] Bookmeat@lemmy.world 0 points 3 months ago

It's right there in the ToS: NON-EXCLUSIVE license. If they go to court, I would guess they lose.

[–] ReallyActuallyFrankenstein@lemmynsfw.com 0 points 3 months ago (2 children)

It's actually a fascinating bind Steve/Reddit has put themselves in. Because it is a non-exclusive license, you can affirmatively declare your content is free for anyone to scrape or use.

After that, if Reddit ever asserts rights over your content by, say, suing Microsoft for improperly using your content in training data, you now have a legal claim against Reddit for interference with either your ownership rights or with a contract via whatever license you have made your content available under.

Now, maybe Reddit has a claim release in their TOS, but it wouldn't prevent you from getting an injunction enjoining Reddit from restricting your data from being used by Microsoft.

It's kind of academic, because... it's not really a victory that Microsoft is also training its AI on your data. But, hey, they're probably doing it anyway and at least this way we get to screw over Huffman for being an ass.

[–] cygnus@lemmy.ca 0 points 3 months ago

MS couldn't access that content without scraping the page itself, though, which of course belongs to Reddit. From a legal standpoint, it's like a paywall.

[–] Bookmeat@lemmy.world 0 points 3 months ago* (last edited 3 months ago)

The only issue I see with this is that it can be argued that this license doesn't grant third parties access to data on Reddit's platform.

[–] conciselyverbose@sh.itjust.works 0 points 3 months ago* (last edited 3 months ago)

It literally isn't. Even their shitty EULA only claims a license to use it, not that it's their data.

And approximately 100% of the data on their servers was created while it was accessible to literally anyone who wanted it without restriction through a free API. Virtually none of the content was ever intended to be kept from fucking search engines so it could be sold for AI.

[–] helpImTrappedOnline@lemmy.world 0 points 3 months ago

They are not responsible for what people post, nor do they pay anyone to post, therefore I do not see how they can claim the data as "theirs".

They have their own self-regulated rules, but ultimately most anything is fair game for reddit to point at the user and say "we take no responsibility for what an individual may post on this public forum".

The only thing they will have a problem with is CSAM, but even then, as long as the volunteer mods remain effective at removing it, reddit will not be responsible for anything users post.

[–] Grimy@lemmy.world 0 points 3 months ago (1 children)

Yup keeps the lights on and makes sure Spez gets his yearly 200 million bonus. It's good that they are tightening the screw because 200 million is clearly not enough, he deserves double that at least.

[–] admin@lemmy.my-box.dev 0 points 3 months ago (1 children)
[–] nightwatch_admin@feddit.nl 0 points 3 months ago (1 children)

Yes, but that’s just the tip

[–] bobs_monkey@lemm.ee 0 points 3 months ago

I think I read somewhere that his base salary is somewhere around $550k, so that would make him all head and no shaft. Probably carries two raisins in his pocket.

[–] tabular@lemmy.world 0 points 3 months ago (1 children)

People post it there voluntarily knowing that

Press x to doubt.

[–] TrickDacy@lemmy.world 0 points 3 months ago

Gotta love corpo fellatio

[–] homesweethomeMrL@lemmy.world 0 points 3 months ago (1 children)
[–] deranger@sh.itjust.works 0 points 3 months ago

You see what happens when you find a stranger in the Alps?!

[–] cyberpunk007@lemmy.ca 0 points 3 months ago (5 children)

Part of the ToS. Whatever you put on there is effectively theirs. Same with Facebook and your photos etc.

[–] mriormro@lemmy.world 0 points 3 months ago

ToS aren't the law.

[–] Passerby6497@lemmy.world 0 points 3 months ago

Whatever you put on there is effectively theirs.

I would so love if companies that had decided they own/can sell the data users published lost section 230 protection. Oh, this is your data? I guess you don't need to be protected against the data users post if it's your data now.

[–] Mubelotix@jlai.lu 0 points 3 months ago

ToS don't prevent you from being a PoS

[–] Womble@lemmy.world 0 points 3 months ago

And whatever you put on a publicly accessible webpage is effectively anyone's who makes a GET request.

[–] drmoose@lemmy.world 0 points 3 months ago

No, you cannot transfer copyright with ToS agreements; you just give reddit a license to use your copyrighted content.