Lmao, cope harder. You're being replaced like the rest of us 🤣
turns out, copyright laws have literally never been used to protect artists!
I don't think it is that difficult to make "Ethical" AI.
Simply refer to the sources you used and make everything, from the data used to the models and the weights, public domain.
It baffles me why they don't. Wouldn't it be much simpler?
It would cost more in time and effort to do it right.
$ $ $
Simply refer to the sources you used
Source: The Internet.
Most things are duplicated thousands of times on the Internet. So stating sources would very quickly become a bigger text than almost any answer from an AI.
Stating on a general, publicly available site that you scraped Republican and Democrat home sites does not explain which one was used for answering a political question.
Your proposal sounds simple, but is probably extremely hard to implement in a useful way.
fundamentally, an llm doesn't "use" individual sources for any answer. it is just a function approximator, and as such every datapoint influences the result, just more if it closely aligns with the input.
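To make the "every datapoint influences the result" point concrete, here is a minimal sketch using a toy kernel smoother. This is only an analogy, not how an LLM is actually built: the function names, the Gaussian weighting and the sample data are all assumptions made up for illustration. What it shows is that every training point gets a nonzero weight for a given query, and the points closest to the query get the largest weights, so there is no single "source" to cite.

```python
import math

def influence_weights(train_x, query, bandwidth=1.0):
    """Return the normalized weight each training point gets for this query.

    Hypothetical helper: a Gaussian kernel is assumed purely for illustration.
    """
    raw = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in train_x]
    total = sum(raw)
    return [r / total for r in raw]

def predict(train_x, train_y, query, bandwidth=1.0):
    """Predict as a weighted average of ALL training targets."""
    weights = influence_weights(train_x, query, bandwidth)
    return sum(w * y for w, y in zip(weights, train_y))

# Made-up toy data: five points on y = x^2.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]

weights = influence_weights(train_x, query=2.2)
prediction = predict(train_x, train_y, query=2.2)

for x, w in zip(train_x, weights):
    print(f"training point x={x}: weight {w:.3f}")
print(f"prediction at 2.2: {prediction:.3f}")
```

Every weight printed is greater than zero, and the training point at x=2.0 dominates because it is nearest to the query. Asking "which source was used" has no clean answer here either; the honest answer is "all of them, in different proportions".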
They don't do it because they claim that there isn't enough public domain data.... But let's be honest, nobody has tried because nobody wants a machine that isn't able to reference anything in the last 100 years.
Because there's not enough PD content there to train AI on.
Copyright law generally (yes, I know this varies country by country) gives the creator immediate ownership without any further requirements, which means every doodle, shitpost and hot take online is the property of its owner UNLESS they choose to license it in a way that would allow use.
Nobody does, and thus the data the AI needs simply doesn't exist as PD content, which leaves someone training a model only two choices: steal everything, or don't do it.
You can see what choice has been universally made.
So, you want an AI LLM trained to respond like a person from ~180 years ago, with their highly religious and cultural bias from a time so far removed from ours that you would feel offended by its answers, with no knowledge of anything from the past 100+ years? Would you be able to use such a thing in daily life?
Consider that even school textbooks are copyrighted, and people writing open source projects are sometimes offended by their OPEN SOURCE CODE being used to train AI. If you took the full "no offense taken" approach, you would basically cut away the AI model's ability to learn basic human knowledge or even do the thing it's actually "good" at.
The other part of the problem is that, legally speaking, forbidding training on copyrighted data opens up a huge window for companies with aggressive copyright protections to effectively end all fan works of something, or even forbid people from making things with even a hint that their concept was conceived based on once vaguely hearing about or seeing a copyrighted work. How do you legally prove you've never been exposed to, even briefly, and thus have never been influenced by something that's memetically and culturally everywhere, for example?
As for AI art and music, there are open source PD/CC-only models out there, or as I call them, "vegan models". CommonCanvas, for instance. The problem with these models is the lack of subject material available (only 10 million images, while there are far more than 10 million things to look at in the world, before even considering ways to combine them), and the lack of interest in doing the proper legwork to make sure the AI learns properly through good image tagging, which can take years to complete. Training AI is very expensive and time consuming (especially the captioning part, due to it being a human task!) and if you don't have a literal supercomputer you can run for several months at tens of thousands of dollars per month, you aren't going to make even a small model work in any reasonable amount of time. What makes the big art models good at what they do is both the size of the dataset and the captioning. You need a dataset in the billions.
For example, if you have never seen any kind of cat before ever, and no one tells you what a cat looks like, and no one tells you how biology works, and you get a single image of a lion, which contains a side-on image, and you are told that is a cat, will you be able to draw it in every perspective angle? No, you won't. You can guess and infer, but it may not be right. You have the advantage of many, many more data points to draw from in your mind, the human advantage. These AI models don't have that. You want an AI to draw a lion from every perspective, you need to show it lion images from every perspective so it knows what it looks like.
As for AI "tracing", well, that's not accurate either. AI models do not normally contain training image data in reproducible form in any way. They contain probability matrices of shapes and curves, which mathematically describe the probability of a certain shape in correlation with other concepts alongside it. Take a single one of these "neuron" matrices and graph it, and you get a mess of shapes and curves that vaguely resembles psychedelic abstract art of different parts of that concept... and sometimes other concepts too, because the model can and often does use the same "neuron" for other concepts that are logically unrelated but make sense for something that is only interested in defining shapes.
Most importantly, AI models do not use binary logic like most people are used to with computer logic. It is not a definitive yes/no on anything. It is a floating point number, a varying scale of "maybe", which allows it to combine concepts and be nuanced with them without being rigid. This is what makes the AI able to do more than be a tracing machine.
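A minimal sketch of the yes/no versus "maybe" distinction, under stated assumptions: the sigmoid is just one common example of a continuous activation, and the function names and two-element "concept" vectors are invented for illustration. It is a toy, not a description of any particular image model.

```python
import math

def hard_gate(score, threshold=0.0):
    """Binary logic: the feature is either fully on (1) or fully off (0)."""
    return 1 if score > threshold else 0

def soft_gate(score):
    """Continuous logic: a sigmoid maps any score to a degree between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

def blend(concept_a, concept_b, weight_a):
    """Mix two feature vectors in proportion to a floating-point weight."""
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(concept_a, concept_b)]

scores = [-2.0, -0.1, 0.1, 2.0]
hard = [hard_gate(s) for s in scores]   # jumps straight from 0 to 1
soft = [soft_gate(s) for s in scores]   # rises gradually

# Hypothetical "concepts" as tiny vectors; the blend is partly one, partly the other.
mixed = blend([1.0, 0.0], [0.0, 1.0], weight_a=soft_gate(0.5))

print("hard:", hard)
print("soft:", [round(s, 3) for s in soft])
print("mixed:", [round(m, 3) for m in mixed])
```

The hard gate throws away the difference between a score of 0.1 and a score of 2.0, while the soft gate keeps it. That graded value is what lets two concepts be mixed in proportion instead of picking exactly one.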
What this really comes down to is the human factor, the primal fear of "the machine" or "something greater" being able to outcompete the human. Media has given us the concept of Rogue AI destroying civilization since the dawn of the machine age, and it is thoroughly ingrained in our culture that smart machines = evil, even though reality isn't that far along yet. People forget how much support is required to keep a machine going. They don't heal themselves or magically keep running forever.
Ok, dumb question time. I'm assuming no one has any significant issues, legal or otherwise, with a person studying all Van Gogh paintings, learning how to reproduce them, and using that knowledge to create new, derivative works and even selling them.
But when this is done with software, it seems wrong. I can't quite articulate why though. Is it because it takes much less effort? Anyone can press a button and do something that would presumably take the person from the example above years or decades to do? What if the person was somehow super talented and could do it in a week or a day?
I am guessing the closest opposite argument would be how close it is to outright copying the original work?
I'm more trying to figure out why it's generally acceptable when a human does it vs when a machine does it.
I don't know for sure, but I think they would be able to adjust settings so that it looks nothing like any original work, but still have the same style, as I've seen people do.
They are copying your intellectual property and digitizing its knowledge. It’s a bit different as it’s PERMANENT. With humans, knowledge can be lost, forgotten, or ignored. In these LLMs that’s not an option. Also the skill factor is a big issue imo. It’s very easy to set up an LLM to make AI imagery nowadays.
Your first sentence is false.
Proof? I am fairly certain I am correct but I will gladly admit fault. This whole LLM thing is indeed new to me also
They are copying. These LLMs are a product of their input, and solely a product of their input. It’s why they’ll often directly output their training data. Using more data to train reduces this effect; that’s why all these companies are stealing data and getting aggressive about stopping others from stealing theirs.
Dumb question: why do you feel you need to defend billion dollar companies getting even richer off somebody else's work?
Also Van Gogh's works are public domain now.
I'm not defending any companies, just thinking out loud, but I suppose I can see how it reads that way.
I was just asking myself why it feels wrong when a machine does it vs when a human does it. By your argument, would it be ok if some poor nobody invented and is using this technology vs a billion dollar company? Is that why it feels wrong?
The issue isn't the final, individual art pieces, it's the scale. An AI can produce sub-par art quickly enough to threaten the livelihood of artists, especially now that there is far too much art for anyone to consume and appreciate. AI art can win attention via spam, drowning out human artists.
- Because it’s not human. We distinguish ourselves in everything, that’s why we think we’re special. The same applies to inventions, e.g. why monkeys can’t have a patent.
- Time. New “products” whether that be art, engineering, science, all take time for humans. So value is created with time, because it creates scarcity and demand.
- Talent. Due to the time factor, talent and practice are desired traits of a human. You mention that a talented human can do something in just a few days that might take someone else years, but it might only take them a few days because they spent years learning.
- Perfection. Striving for perfection is a human experience. A robot doing something perfect isn’t impressive, a human doing something perfect is amazing. Even the most amateur creator can strive for perfection.
Think about paintings vs prints. Paintings are much more valuable because they aren’t created as quickly as the prints are. Even the most amateur artwork is more valuable as a physical creation rather than a copy, like a child’s crayon drawing.
This even applies to digital art because the first instance of something is the most difficult thing to create, everything after that is then just a copy, and yes this does apply to some current Gen AI tech, but very soon that will no longer be the case.
This change from humans asking for something and having other humans create it to humans asking for something and having computers create it is a loss of our humanity, what makes us human.
I actually had some thoughts about this and posted this in a similar thread:
First, that artist will only learn from a handful of artists instead of every artist's entire field of work all at the same time. They will also eventually develop their own unique style and voice--the art they make will reflect their own views in some fashion, instead of being a poor facsimile of someone else's work.
Second, mimicking the style of other artists is a generally poor way of learning how to draw. Just leaping straight into mimicry doesn't really teach you any of the fundamentals like perspective, color theory, shading, anatomy, etc. Mimicking an artist that draws lots of side profiles of animals in neutral lighting might teach you how to draw a side profile of a rabbit, but you'll be fucked the instant you try to draw that same rabbit from the front, or if you want to draw a rabbit at sunset. There's a reason why artists do so many drawings of random shit like cones casting a shadow, or a mannequin doll doing a ballet pose, and it ain't because they find the subject interesting.
Third, an artist spends anywhere from dozens to hundreds of hours practicing. Even if someone sets out expressly to mimic someone else's style, teaches themselves the fundamentals, it's still months and years of hard work and practice, and a constant cycle of self-improvement, critique, and study. This applies to every artist, regardless of how naturally talented or gifted they are.
Fourth, there's a sort of natural bottleneck in how much art that artist can produce. The quality of a given piece of art scales roughly linearly with the time the artist spends on it, and even artists that specialize in speed painting can only produce maybe a dozen pieces of art a day, and that kind of pace is simply not sustainable for any length of time. So even in the least charitable scenario, where a hypothetical person explicitly sets out to mimic a popular artist's style in order to leech off their success, it's extremely difficult for the mimic to produce enough output to truly threaten their victim's livelihood. In comparison, an AI can churn out dozens or hundreds of images in a day, easily drowning out the artist's output.
And one last, very important point: artists who trace other people's artwork and upload the traced art as their own are almost universally reviled in the art community. Getting caught tracing art is an almost guaranteed way to get yourself blacklisted from every art community and banned from every major art website I know of, especially if you're claiming it's your own original work. The only way it's even mildly acceptable is if the tracer explicitly says "this is traced artwork for practice, here's a link to the original piece, the artist gave full permission for me to post this." Every other creative community, such as writing and music, takes a similarly dim view of plagiarism, though it's much harder to prove outright than with art. Given this, why should the art community treat someone differently just because they laundered their plagiarism with some vector multiplication?
Easier than that:
Google has been doing this for years for their search engine and no one said a thing. Why do you care now that it's a different program scanning your media?
So try doing Disney-style animation with similar characters and a similar style of storyline, and start profiting from it. Let's see if "Disney" the "corporation" will remain silent or sue you to oblivion.
Damn you musta hated Don Bluth
I don't hate him. It's just that when a corporation steals an individual's idea or data, it's for research and stuff. If it's the other way around, we as individuals will have to face a lawsuit.
So I hope they sue Nvidia and other big corporations who are harvesting our data for AI.
That's the thing, nothing's being stolen. Beauty and the Beast didn't up and disappear because Bluth and Fox Studios made Anastasia. There are style similarities but it is undeniably its own work. Don't even think about the style sharing going on in the thousands of anime out there.
Artists who rip off other great works are still developing their talent and skills. They can then go on to use them to make original works. The machine will never produce anything original. It is only capable of mixing together things it has seen in its training set.
There is a very real danger of AI eviscerating the ability of artists to make a living, leaving very few people with the financial ability to practice their craft day in and day out, resulting in a dearth of good original art.
The machine will never produce anything original. It is only capable of mixing together things it has seen in its training set.
This is patently false and shows you don't know a single thing about how ai works.
If you're looking for a universally-applicable moral framework, join the thousands of years of philosophers striving for the same.
If you're just looking for an explanation that allows you to put one foot in front of the other...
Laws exist for us to spell out the kind of society we'd like to live in. Generally, we prefer that individuals be able to participate in cultural conversations and offer their own viewpoint. And generally, we prefer that groups of people don't accumulate massive amounts of power over other groups of people.
Dedicating your life to copying another artist's style is participating in a cultural conversation, and you won't be able to help yourself from infusing your own lived experience into your work of copying the artist. If only by the details that you focus on getting exactly right, the slight mistakes that repeat themselves or morph over the course of your career, the pieces you prioritize replicating over and over again. It says something about who you are, and that's worth appreciating.
Now, if you're trying to pass those off as originals and not your own tributes, then you're deceiving people and that's a problem because you're damaging the cultural conversation by lying about the elements you're putting into it. Even so, sometimes that's an interesting artistic enterprise in itself. Such as when artists pretend to be someone else. Warhol was a fan of this. His whole career revolved around messing with concepts of authenticity in art.
As for power, you don't gain that much leverage over another artist by simply copying their work. And if you riff on it to upstage them, you're just inviting them to do the same to you in turn.
But if you can do that mechanically, quickly, so that any creative twist they put out there to undermine your attempts to upstage them, you have an instant response at little cost to yourself, now you're in a position of great power. The more the original artist produces, the stronger your advantage over them becomes. The more they try, the harder it is for them to win.
We don't generally like when someone has accumulated tons of power, especially when they subsequently use that power to prevent others from being able to compete.
So, before the invention of the camera, the most valuable and most popular creative skill was replicating people on canvas as realistically as possible. Yes, we remember famous exceptions like Picasso, but by sheer number of paintings the most common were portraits of rich people.
After the cameras took that job away, prevailing art changed to become more abstract and "creative". But that still pissed off a lot of people that had spent a very long time honing a skill that was now no longer in demand.
What we're seeing is a similar shift. I think future generations of artists will value color theory, composition, etc. over specific brush stroke techniques. AI will make art much more accessible once enough time has passed for AI assisted art to be considered art. Make no mistake: it will always be people that actually create the art - AI will just reduce/remove the grunt work so they can focus more on creativity.
Now, whether billion dollar corporations deserve to exploit the labor of millions of people is a whole separate conversation, but tl;dr: they don't, but they're going to anyway because there is little to stop them in current economic/governance models.
Generative AI is incapable of contributing new material, because Generative AI does not sense the world through a unique perspective. So the comparison to creators who incorporate prior artists' work is a false comparison. Artists are allowed to incorporate other artists' work in the same way that scientists cite others' work without it being plagiarism.
In art, in science, we stand on the shoulders of giants. AI models do not stand on the shoulders of giants. AI models just replicate the giants. Society has been fooled to think otherwise.
AI ain't going away. It's already commonly run on local machines and used covertly.
Have you or a friend used YouTube or reddit in the past 10 years? Then you're entitled to compensation for the training of AI.
Copying is not theft. Letting only massive and notoriously untransparent corporations control an emerging technology is.
If letting AI train on other people's works is unjust enrichment, then what the record labels did to creatives through the entire 20th century, taking ownership of their work through coercive contracting, is extra-unjust enrichment.
Not saying it isn't, but it's not new, and bothersome that we're only complaining a lot now.
I'm still waiting for somebody to give me a symmetry breaker between AI training on existing media and humans creating media from what they've seen, such that one is theft and the other is not.
The asymmetry is legal, I'd say. If I tried to scrape that much data, especially if it included anything about a rich person, I'd be arrested and probably never see daylight again.