this post was submitted on 05 Aug 2024

[–] Rhaedas@fedia.io 0 points 2 months ago (3 children)

Humans don't live that long. That's only about 1.5 million 30 min videos, which isn't a huge amount for a whole day's worth of scraping.
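A quick back-of-the-envelope check of that figure, as a rough sketch: the video count and length are taken from the comment above, while the 365.25-day year and nonstop, round-the-clock playback are assumptions made only for illustration.

```python
# Rough check: how much footage is 1.5 million 30-minute videos?
VIDEOS = 1_500_000        # figure quoted in the comment above
MINUTES_PER_VIDEO = 30    # figure quoted in the comment above

total_hours = VIDEOS * MINUTES_PER_VIDEO / 60

# Assumption: continuous playback, 24 hours a day, 365.25 days a year.
total_years = total_hours / (24 * 365.25)

print(f"{total_hours:,.0f} hours = {total_years:.1f} years of continuous video")
# prints "750,000 hours = 85.6 years of continuous video"
```

Under those assumptions the total lands at roughly one long human lifespan of nonstop viewing.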

[–] Irremarkable@fedia.io 0 points 2 months ago (1 children)

Yeah this is honestly an order of magnitude less than I would've thought

[–] Infynis@midwest.social 0 points 2 months ago

Maybe they're running out

[–] mrfriki@lemmy.world 0 points 2 months ago (1 children)

I would be lucky if I get to watch more than 10000 videos in my entire lifetime.

[–] GBU_28@lemm.ee 0 points 2 months ago

Bro you're doing it with your eyes, right now!

[–] twei@discuss.tchncs.de 0 points 1 month ago

That’s only about 1.5 million 30 min videos

aka 2 videos from Quinton Reviews

[–] SlopppyEngineer@lemmy.world 0 points 2 months ago

Something like that was a plot point in Black Mirror. In that case it was with consciousnesses.

[–] Imgonnatrythis@sh.itjust.works 0 points 2 months ago

Can relate, I watched The English Patient once.

[–] Kekzkrieger@feddit.org 0 points 2 months ago (2 children)

Instead of focusing on their products and improving them for everyone, some shitty CEO is pushing their shitty AI agenda down everyone's throat.

[–] Drewelite@lemmynsfw.com 0 points 2 months ago

Well it sounds like they're doing something to make their products better, you just disagree that it's going to be successful.

[–] Zetta@mander.xyz 0 points 1 month ago (1 children)

Nvidia's biggest product is absolutely AI by a massive landslide, I'm pretty sure I read that the point of them downloading these videos and doing the training is to build a pipeline for their AI users to do the same with their own shit. (Can't be bothered to double-check cuz I really don't care)

So they aren't downloading all this video to make a crazy AI model. They're downloading all this video to make a tool for their AI customers to use. You may not agree, but improving their product is exactly what they're doing.

[–] Agrivar@lemmy.world 0 points 1 month ago (1 children)

Can't be bothered to double-check cuz I really don't care

For FUCK SAKE, why do you even bother posting your garbage opinions then? and with such authority too!

[–] Zetta@mander.xyz 0 points 1 month ago* (last edited 1 month ago)

¯\_(ツ)_/¯ great question

[–] Grimy@lemmy.world 0 points 2 months ago* (last edited 2 months ago) (1 children)

There's only a handful of video datasets, and all of them are owned by Google through YouTube or by big Hollywood companies like Disney and Netflix.

These companies are foaming at the mouth with rage thinking about what generative AI will do to their industry and how much it will help the currently nonexistent indie one. They will do whatever it takes to fence in the playbox and make sure they get to be the toll man.

This was never about AI getting to live or not, but who gets to own it. 404media is essentially a mouthpiece for these corporations, willingly or not, and the strengthening of copyright laws will not help the consumers or the small-time creators. The only exceptions are laws that force copyleft licenses onto models, which is not what is being pushed right now, and AOC's Deepfake act, which is well thought out imo.

Anyone should be permitted to train on YouTube and Netflix data, and Nvidia might even open source it in any case.

[–] Sconrad122@lemmy.world 0 points 2 months ago (2 children)

Nvidia does not have a strong history of open sourcing things, to say the least. That last bit sounds like pure hopium

[–] trollbearpig@lemmy.world 0 points 1 month ago (12 children)

The guy you are replying to is in all AI posts defending AIs. He is probably heavily invested in this BS or being paid for it, don't waste your time with him.

[–] noobdoomguy8658@feddit.org 0 points 1 month ago

Obligatory fuck AI and the illiterate bros pushing it.

What kind of videos, though? A lot of such material is very far from being the proper educational material that we show other people to really teach them much, let alone educate them well enough to be anywhere near trustworthy. This is very processed material, with years of preparation behind it once you consider the prior education of the individuals involved in the creative process: think of the past experiences silently influencing them, their initial knowledge of the subject obtained from somewhat basic facts from school or otherwise, their misconceptions, iterations that nobody knows about, and many other things that we don't usually directly associate with the act of working on something like a video, but that eventually do dictate a lot of the decisions and opinions put into it.

It's one thing that the AI has no intelligence in it whatsoever, but the fact that it's being pumped with information and "knowledge" in basically the reverse order doesn't help it become any better.

On the other hand, the entire thing is not about making something that works well, but something that sells well. And then there's people putting too much faith into the thing and trusting it with way more stuff than they should (which is also the case with a lot of other tech, though, admittedly).

Some things of today are so damn unexciting.

[–] R00bot@lemmy.blahaj.zone 0 points 1 month ago (5 children)

I feel like the amount of training data required for these AIs serves as a pretty compelling argument as to why these AIs are clearly nowhere near human intelligence. It shouldn't take thousands of human lifetimes of data to train an AI if it's truly near human-level intelligence. In fact, I think it's an argument for them not being intelligent whatsoever. With that much training data, everything that could be asked of them should be in the training data. And yet they still fail at any task not in their data.

Put simply, a human needs less than 1 lifetime of training data to be more intelligent than AI. If it hasn't already solved it, I don't think throwing more training data/compute at the problem will solve this.

[–] rdri@lemmy.world 0 points 1 month ago (1 children)

There is no "intelligence", ai is a pr word. Just a language model that feeds on a lot of data.

[–] R00bot@lemmy.blahaj.zone 0 points 1 month ago

Oh yeah we're 100% agreed on that. I'm thinking of the AI evangelicals who will argue tooth and nail that LLMs have "emergent properties" of intelligence, and that it's simply an issue of training data/compute power before we'll get some digital god being. Unfortunately these people exist, and they're depressingly common. They've definitely reduced in numbers since AI hype has died down though.

[–] Hunter232@programming.dev 0 points 1 month ago (1 children)

Humans have the advantage of billions of years of evolution.

[–] Cyteseer@lemmy.world 0 points 1 month ago (2 children)

"ai" also has the advantage of billions of years of evolution.

[–] noobdoomguy8658@feddit.org 0 points 1 month ago

We're very proficient at walking, but somehow haven't produced a walking home or anything like that.

It's not very linear.

[–] wizardbeard@lemmy.dbzer0.com 0 points 1 month ago

Definitely not the same thing. Just because you can make use of the end result of major efforts does not somehow magically give you access to all the knowledge from those major efforts.

You can use a smart phone easily, but that doesn't mean you magically know how to make one.

[–] stupidcasey@lemmy.world 0 points 1 month ago (1 children)

You’ve had the entire history of evolution to get the instinct you have today.

Nature Vs Nurture is a huge ongoing debate.

Just because it takes longer to train doesn’t mean it’s not intelligent, kids develop slower than chimps.

Also, "intelligent" doesn’t really mean anything. I personally think intelligence is the ability to distill unusable amounts of raw data and intuit a result beneficial to one’s self. But very few people agree with me.

[–] riodoro1@lemmy.world 0 points 1 month ago (2 children)

Can we stop with this bullshit? Nobody will buy into it. WE DON’T WANT IT.

[–] boyi@lemmy.sdf.org 0 points 1 month ago (1 children)

Sorry, I disagree with this kind of generalisation. To be rational, just because you don't want it doesn't mean everyone else is in the same boat. I am very sure there are certain people who will benefit from this and want it.

[–] riodoro1@lemmy.world 0 points 1 month ago (1 children)

https://www.techradar.com/pro/ai-a-turn-off-tech-customers-are-apparently-already-getting-bored-of-smart-devices-and-it-could-be-hitting-sales

"Certain people" do not justify spending billions in money and tons of resources to create more and more of the same shit just because there is a hype for it.

[–] boyi@lemmy.sdf.org 0 points 1 month ago* (last edited 1 month ago)

Yes, I am one of those who are also getting bored of it. But this doesn't mean that I am part of the market they are targeting. They might be targeting certain segments or even service providers such as game developers or console makers etc. The technology is still in its early stage, so it is too early to say whether they are going to fail.

[–] sunbytes@lemmy.world 0 points 1 month ago

It's not for you as a consumer.

It's to reduce your usefulness as a worker.

Which would be lovely, if our value wasn't calculated by our usefulness to the market.

[–] MonkderVierte@lemmy.ml 0 points 1 month ago (2 children)

Properly following licensing, right?

[–] lemmyvore@feddit.nl 0 points 1 month ago (1 children)

No, see, because it's "learning like a human", and everybody knows that you're allowed to bypass any licensing for learning. /s

But seriously I don't know how they make the jump to these conclusions either.

[–] areyouevenreal@lemm.ee 0 points 1 month ago* (last edited 1 month ago) (1 children)

This is a massive strawman argument. No one is saying you shouldn't have a license to view the content in order to train an AI on it. Most of the information used to train these models is publicly available and licensed for public viewing.

[–] lemmyvore@feddit.nl 0 points 1 month ago (3 children)

Just because something is available for public viewing does not mean it's licensed for anything except personal use.

The strawman here is that since physical people benefit from personal use exceptions in the law, machine learning software should too. But why should it? Since when is a piece of software run by a corporation equivalent to an individual person?

[–] VoterFrog@lemmy.world 0 points 1 month ago (2 children)

Copyright licensing allows the owner to control how a work is distributed, not how it's consumed. "Personal use" just means that you can't turn around and redistribute a work that you've obtained. Not that you're not allowed to consume it in a corporate setting.

[–] FunnyUsername@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (2 children)

Consuming is not the same thing as training. A machine is not a consumer, it is a tool.

[–] areyouevenreal@lemm.ee 0 points 1 month ago (1 children)

A program or machine can be a consumer of something, although if you want to be technical you could say the person using the machine is the consumer. In actual computer science we talk about programs consuming things all the time.

[–] FunnyUsername@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (2 children)

In actual computer science you talk about AI all the time as well, but it's not actually intelligent, is it? It's just SmarterChild 2.0 and literally has no idea what word it said just before its current one. Not intelligent. Words are often used inappropriately. The only thing computers can consume is data and electricity by definition, and consuming data is not the same as implementing it in a language (or visual) model that you intend to profit from. This is data theft, unless properly licensed.

[–] areyouevenreal@lemm.ee 0 points 1 month ago* (last edited 1 month ago) (1 children)

How intelligent it is or isn't is irrelevant. We talk about much dumber programs than AI as being consumers of files and data, including things like compilers. Would it not be personal use for you to view a picture in a photo viewer or try to edit it in GIMP?

It's not data theft at all unless the courts and the law say it is. Ranting on Lemmy won't change that fact. Theft is a construct of law.

You can add clauses against use as AI training data to your licence if you wish.

[–] FunnyUsername@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (1 children)

You can try to equate humans to computers all day, and you can even pass laws that say they're the same thing. That does not make it true. A company using software to profit off data they have not licensed (whether it's public or not does not matter! That is not how copyright law works!) is theft.

Please try to sell DVDs of Markiplier's publicly available YouTube content and tell people how you're allowed to because it's publicly available.

[–] areyouevenreal@lemm.ee 0 points 1 month ago (1 children)

I am not equating humans with computers. These businesses are not selling people's data when doing AI training (unlike actual data brokers). You can't say something AI-generated is a clone of the original any more than you can say parody is.

[–] FunnyUsername@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (1 children)

I absolutely can. Parody is an art form, which is something that can exclusively only be created by human beings. AI is an art laundering service. Not an artist.

The law should reflect that these companies need to first be granted permission to use datasets by the rights holders, and Creative Commons licenses need to be given an opportunity to opt out of being crawled for these datasets. Anything else is wrong. Machines are not humans. Creative Commons copyright law was not written with the concept of machines being "consumers". These companies took advantage of the sudden emergence of these models and the delay of law in holding their hunger for data in check. They need to be held accountable for their theft.

[–] areyouevenreal@lemm.ee 0 points 1 month ago* (last edited 1 month ago)

There are already anti-AI licenses out there. If you didn't license your stuff with that in mind that's on you. Deep learning models have been around for a lot longer than GPT 3 or anything that's happened in the current news cycle. They have needed training data for that long too. It was predictable stuff like this would happen eventually, and if you didn't notice in time it's because you haven't been paying attention.

You don't get to dictate what's right and wrong. As far as I am concerned all copyright is wrong and dumb, but the law is what the law is. Obviously not everyone shares my opinion and not everyone shares yours.

Whether an artist is involved or not it's still a transformative use.

[–] areyouevenreal@lemm.ee 0 points 1 month ago (1 children)

Also the way you imply children can't be intelligent is disgusting.

[–] VoterFrog@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (1 children)

Training literally is consuming. A copyright license doesn't get to dictate what computer programs the work is allowed to be used with. There's a ton of entertainment megacorps that would love for that to be the case, though.

You're saying that you're not allowed to do a statistical analysis on a copyrighted work. It's nonsense. It's well-established that copyright does not prevent that kind of use.

[–] FunnyUsername@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (1 children)

What makes you think copyright law doesn't apply to companies using copyrighted data to sell and profit off of? That is not the case. Also, you're putting words in my mouth. Feel free to read my other replies in this thread, as I don't feel like repeating myself, but I think it's clear I'm not saying computers aren't allowed to process data. That's absurd.

[–] lemmyvore@feddit.nl 0 points 1 month ago (1 children)

Copyright licensing allows the owner to control how a work is distributed, not how it's consumed.

First of all, that's incorrect.

Secondly, by default you have zero rights to someone else's work. If something doesn't explicitly grant you rights, you have none. If there's a law or license, and if it's applicable to you, you get exactly what's specified in there.

The "personal use" or "fair use" exceptions in some places grant some basic rights but they are very narrow in scope and generally applicable only to individuals.

[–] wizardbeard@lemmy.dbzer0.com 0 points 1 month ago

A tangentially related but good example of this sort of thing is BluRays and community movie nights (like setting up a projector in a park).

Most of these movie nights are de facto illegal, as even though you own the BluRay, it is not licensed for public showings, just for personal use. Obviously no one gives enough of a shit to enforce this against small groups, especially if they aren't making money off it, but if a theater started offering showings of shit the owner just bought on BluRay or UHD disks, it wouldn't last too long.

Similar thing here. Just because you can access the content to view it yourself doesn't mean you have the rights to do more than that with it. As an individual, you're likely fine to break those rules. As a giant fucking corporation, it's time for you to pay up.

[–] rottingleaf@lemmy.world 0 points 1 month ago

I've just had a thought:

There's a little country whose leadership has avoided being voted out and put behind bars for life by constantly inventing new subjects for discussion. Some outrageous, some showing them in a good light, but the point is that everyone forgets the real bad things they've done (they are basically a collaborationist puppet government of a neighboring fascist country).

I wonder if it's today's world as a whole showing itself in that little country.

I've recently read an article seen on Lemmy, suggesting that the "AI" hype is the same. https://theluddite.org/#!post/ai-hype - found it. The conclusion is very important.

They are wasting enormous amounts of energy to make those "AI"s, collect training data and so on, to make oligopolized platforms and industries shittier and shittier.

But we are wasting our energy, which is much more limited, to track myriads of false targets. We are like an air defense system being saturated.

No one has ever won a war by sitting in defense. We must search for critical joints to attack.

Also no, voting for one of two candidates presented to you in some election is not that, neither is arguing for one of two sides in a discourse presented to you. There are better and worse choices there, but that's not what attack means.

[–] SomeGuy69@lemmy.world 0 points 1 month ago* (last edited 1 month ago)

So they use VMs to simulate user accounts. In the future this will be blocked, and whatever new AI startup comes along won't have the option to do so. Competition blocked. Forever.
