this post was submitted on 31 Jul 2024

Reddit says Microsoft’s Bing, Anthropic, and Perplexity have scraped its data without permission. “It has been a real pain in the ass to block these companies.”

[–] Vipsu@lemmy.world 0 points 3 months ago (3 children)

Well, Reddit should just sue these companies and see if they are actually breaking any laws. Holding a sizeable chunk of the internet hostage also sounds like something the EU and US might want to look into, as it very much sounds like anti-competitive conduct or market manipulation.

Also, if these companies want to have greater ownership over the content generated by their users, they should also be much more liable for the content posted to their sites. I mean, when something like Section 230 was written, they probably did not take this into account. If these companies want to start selling user-generated content, then they should simply lose the immunity from liability.

[–] mint_tamas@lemmy.world 0 points 3 months ago (1 children)

While I don’t disagree with the general idea, losing Section 230 protection would introduce an uncontrollable risk into running any website with user-generated content and would essentially shut such sites down.

[–] Passerby6497@lemmy.world 0 points 3 months ago (1 children)

If the site isn't selling data, they wouldn't lose 230 protection. So that would only be a risk for the companies selling their users' data, not your regular forum or something.

That gets really murky though. For example:

  • news sites w/ comment sections - they're profiting from ads and subscriptions, so how much of that has to do with the comments?
  • ecommerce - reviews on Amazon and eBay could be considered advertising for the product. Who's liable, the ecommerce site, the merchant, or the poster?
  • product websites - how much are posted "reviews" considered advertising for the product? There may not be direct sales on the website, but surely someone's review would impact sales elsewhere
  • for-profit services with a discussion forum - these would be on a separate site from the revenue-generating service, but still associated with the brand and thus likely contributing to advertisements for the product

It's a lot more obvious for social media sites like Facebook since user-generated content is the service, but there are a lot of for-profit entities where user-generated content is highly relevant, but not the core service. Would those sites be essentially forced to either moderate or eliminate user interaction?

There's a lot of complexity here.

[–] drmoose@lemmy.world 0 points 3 months ago

Reddit would lose badly, which is why they don't sue. The US 9th Circuit ruled (hiQ v. LinkedIn) that scraping public LinkedIn data is legal, and Bing is not even scraping but indexing the data. Easiest case ever.

It's almost impossible to block web scraping, especially by someone with Microsoft's or Perplexity's resources.

It's clearly an attempt to blackmail indexers into a license deal, as paying Reddit something could actually be cheaper than battling anti-bot measures.

[–] commie@lemmy.dbzer0.com 0 points 3 months ago

"they should also be much more liable for the content posted to their sites."

why do people insist on making me defend reddit.