this post was submitted on 03 Jul 2023

Google Says It'll Scrape Everything You Post Online for AI (gizmodo.com)

submitted 1 year ago by misk@lemm.ee to c/technology@beehaw.org

An update to Google's privacy policy suggests that the entire public internet is fair game for its AI projects.

[–] Powderhorn@beehaw.org 8 points 1 year ago (1 children)

People who are alive can have a company steal their entire corpus without recompense, while the descendants of people who died decades ago can still get paid for content created by their ancestors.

Right.

[–] Peanutbjelly@sopuli.xyz 1 points 1 year ago

But how else could Disney afford to own everyone else's rights and properties? Why not think about the little guy! (Mickey Mouse is little, right?)

That being said, I find it weird that people are going after training data for LLMs after completely ignoring the models built specifically to compete with and take advantage of people's unconscious habits and lifestyles.

AI in general will be very important to comfortably survive the near future as a species. Data is an important part of that.

We absolutely need to do something about the megacorps funneling every new gain as a society into increasing the already absurd wealth divide. The technology is good. The general web scraping isn't bad if the tool is not specifically evil in function. As a global community, we just need to demand that the technology be used to benefit everyone equally as it continues to be developed.

[–] alcasa@lemmy.sdf.org 2 points 1 year ago (1 children)

Glad that I can contribute to making the next Google Bard even dumber

[–] Zapp@beehaw.org 2 points 1 year ago* (last edited 1 year ago)

Yeah. Now the stupidity I post online has a purpose.

Someday a T-800 will be closing in on a freedom fighter, but will have an intrusive thought interrupt it at a key vulnerable moment. And that intrusive thought will be some random pun we posted to DadJokes. You're welcome, future freedom fighters.

[–] Rentlar@beehaw.org 2 points 1 year ago (1 children)

I, as the proprietor of my comments, condone Google AI scraping my publicly shared content for their own use, on the condition that they condone scraping of their publicly accessible content including YouTube videos. :P

[–] deCorp0@lemmy.dbzer0.com 1 points 1 year ago

Google is going to continue boiling the frog until everyone using Gmail, YT, Drive, etc. is paying subscriptions for access to these services. It’s going to be interesting to see how much people are willing to pay to hold on to a Gmail account they’ve been using for 20 years. I should buy Alphabet stock now.

[–] CreativeTensors@beehaw.org 1 points 1 year ago (1 children)

I just kind of assumed that they, as well as anyone else in the space, were doing that already.

Whether that means that we all collectively have ownership over the outputs of these models if they're trained on content that we produced over the years is another thing. As someone who uses AI tools a fair bit I would be totally fine with generated content being public domain unless a threshold for human intervention is met.

That threshold is where the messy legal work lies.

[–] YuzuDrink@beehaw.org 2 points 1 year ago

Would maybe be funny if a law were passed saying that you could only charge people for access to your AI content if you can prove that their own content wasn’t used to help train the AI…

[–] MJBrune@beehaw.org 0 points 1 year ago (1 children)

This is absolutely not the case and absolutely illegal. How their lawyers allowed this is insane and some government body needs to smack down Google with a real penalty. Even scraping AGPL'ed code would technically require them to AGPL their entire AI as it should be seen as a derived work in the courts. How could an AI scrape and utilize something, creating works based on the code taken, and not be seen as derived? It's insane.

[–] abhibeckert@beehaw.org 2 points 1 year ago* (last edited 1 year ago) (2 children)

How their lawyers allowed this is insane

I'm pretty sure Google's legal team knows a thing or two about copyright law. If they think this is fair use, then I'm inclined to believe it might be.

[–] AndrewZabar@beehaw.org 2 points 1 year ago

It’s that they know that it’s more profitable to seek forgiveness than to ask permission.

Get sued? Hah; okay see you in court in ten years. Meanwhile, profit. They can do it over and over and it will always be beneficial.

There need to be SEVERE sanctions for these violations. Like in the millions. That’s the only way they’d stop. They just don’t care at all.

[–] ilmagico@beehaw.org 1 points 1 year ago

They just think that having tons of money to throw at a potential lawsuit means nobody will dare to sue them.