this post was submitted on 30 Jul 2023
Technology
It doesn't change anything you said about copyright law, but current-gen AI is absolutely not "a virtual brain" that creates "art in the same rough and inexact way that we humans do it." What you are describing is called Artificial General Intelligence, and it simply does not exist yet.
Today's large language models (like ChatGPT) and diffusion models (like Stable Diffusion) are statistics machines. They copy down a huge amount of example material, process it, and use it to calculate the most statistically probable next word (or pixel), with a little noise thrown in so they don't make the same thing twice. This is why ChatGPT is so bad at math and Stable Diffusion is so bad at counting fingers -- they are not making any rational decisions about what they spit out. They're not striving to make the correct answer. They're just producing the most statistically average output given the input.
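To make the "most probable next word, with a little noise" idea concrete, here is a toy sketch in Python. It is not how ChatGPT or any real model is implemented: the candidate words and their scores are made up, and `sample_next_word` is a hypothetical helper. It only shows how a temperature setting trades the single likeliest word against some randomness.

```python
import math
import random

def sample_next_word(scores, temperature=1.0, rng=random):
    """Pick one word from a dict mapping word -> raw score (hypothetical values)."""
    words = list(scores)
    # Dividing by the temperature sharpens (low T) or flattens (high T) the distribution.
    scaled = [scores[w] / temperature for w in words]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices draws proportionally to the weights (no need to normalize).
    return rng.choices(words, weights=weights, k=1)[0]

# Made-up scores for the word following "The cat sat on the"
toy_scores = {"mat": 4.0, "sofa": 2.5, "roof": 1.0, "moon": -1.0}

if __name__ == "__main__":
    rng = random.Random(0)
    print("low temperature: ", [sample_next_word(toy_scores, 0.2, rng) for _ in range(5)])
    print("high temperature:", [sample_next_word(toy_scores, 1.5, rng) for _ in range(5)])
```

At a low temperature the top-scoring word wins almost every time; at a higher one the less likely words start showing up, which is the "noise" that keeps the output from being identical on every run.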
Current-gen AI isn't just viewing art, it's storing a digital copy of it on a hard drive. It doesn't create, it interpolates. In order to imitate a person's style, it must make a copy of that person's work; describing the style in words is insufficient. If human artists (and by extension, art teachers) lose their jobs, AI training sets stagnate, and everything they produce becomes repetitive and derivative.
None of this matters to copyright law, but it matters to how we as a society respond. We do not want art itself to become a lost art.
This is factually untrue. For example, Stable Diffusion models are in the range of 2GB to 8GB, trained on a set of 5.85 billion images. If it were storing the images, that would leave roughly 1 byte per image (anywhere from about a third of a byte to under a byte and a half, depending on the model size), and a single byte can only take 256 possible values. Images are downloaded as part of training the model, but they're eventually "destroyed"; the model doesn't contain them at all, and it doesn't need to refer back to them to generate new images.
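For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch in Python. It uses the figures quoted above (2GB to 8GB, 5.85 billion images) and assumes decimal gigabytes (10^9 bytes); `bytes_per_image` is just a helper name for illustration.

```python
NUM_IMAGES = 5.85e9  # training-set size quoted above

def bytes_per_image(model_gb, num_images=NUM_IMAGES):
    """Bytes of model file available per training image, assuming 1 GB = 10**9 bytes."""
    return model_gb * 10**9 / num_images

if __name__ == "__main__":
    for size_gb in (2, 8):
        print(f"{size_gb} GB model: {bytes_per_image(size_gb):.2f} bytes per image")
```

Either way the budget is on the order of a single byte per image, which is nowhere near enough to hold even a tiny thumbnail.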
It's absolutely true that the training process requires downloading and storing images, but the product of training is a model that doesn't contain any of the original images.
None of that is to say that there is absolutely no valid copyright claim, but it seems like either option is pretty bad, long term. AI generated content is going to put a lot of people out of work and result in a lot of money for a few rich people, based off of the work of others who aren't getting a cut. That's bad.
But the converse, where we say that copyright is maintained even if a work is only stored as weights in a neural network, is also pretty bad; you're going to have a very hard time defining that in such a way that it doesn't cover the way humans store information and integrate it to create new art. That's also bad. I'm pretty sure that nobody who creates art wants to have to pay Disney a cut because they once looked at some images it owns.
The best you're likely to do in that situation is say it's ok if a human does it, but not a computer. But that still hits a lot of stumbling blocks around definitions, especially since computers are constantly used to create art. And if we ever hit the point where digital consciousness is possible, that adds a whole host of civil rights issues.
This is the process I was referring to when I said it makes copies. We're on the same page there.
I don't know what the solution to the problem is, and I doubt I'm the right person to propose one. I don't think copyright law applies here, but I'm certainly not arguing that copyright should be expanded to include the statistical matrices used in LLMs and DPMs. I suppose plagiarism law might apply for copying a specific style, but that's not the argument I'm trying to make, either.
The argument I'm trying to make is that while it might be true that artificial minds should have the same rights as human minds, the LLMs and DPMs of today absolutely aren't artificial minds. Allowing them to run amok as if they were is not just unfair to living artists... it could deal irreparable damage to our culture, because the LLMs and DPMs of today cannot take up the mantle of the artists they edge out or pass down their knowledge to the next generation.
Thanks for clarifying. There are a lot of misconceptions about how this technology works, and I think it's worth making sure that everyone in these thorny conversations has the right information.
I completely agree with your larger point about culture; to the best of my knowledge we haven't seen any real ability to innovate, because the current models are built to replicate the form and structure of what they've seen before. They're getting extremely good at combining those elements, but they can't really create anything new without a person involved. There's a risk of significant stagnation if we leave art to the machines, especially since we're already seeing issues with new models including the output of existing models in their training data. I don't know how likely that is; I think it's much more likely that we see these tools used to replace humans for more mundane, "boring" tasks, not really creative work.
And you're absolutely right that these are not artificial minds; the language models remind me of a quote from David Langford in his short story Answering Machine: "It's so very hard to realize something that talks is not intelligent." But we are getting to the point where the question of "how will we know" isn't purely theoretical anymore.
How do you know human brains don't work in roughly the same way chatbots and image generators work?
What is art? And what does it mean for it to become "lost"?
He literally just explained why.
No, he just said AI isn't like human brains because it's a "statistical machine". What I'm asking is how he knows that human brains aren't statistical machines?
Human brains aren't that good at direct math calculation either!
Also he definitely didn't explain what "lost art" is.