this post was submitted on 09 Aug 2023
Except when it produces exact copies of existing works, or when it includes a recognisable signature or watermark?
The point is that if the model doesn't contain any recognisable parts of the original material it was trained on, how can it reproduce recognisable parts of the original material it was trained on?
That's sorta the point of it.
I can recreate the phrase "apple pie" in any number of styles and fonts using my hands and a writing tool. Would you say that I "contain" the phrase "apple pie"? Where is the letter 'p' in my brain?
Specifically, the AI contains the relationship between sets of words, and sets of relationships between lines, contrasts and colors.
From there, it knows how to take a set of words and make an image that proportionally replicates those line-pattern and color relationships.
You can probably replicate the Getty images watermark close enough for it to be recognizable, but you don't contain a copy of it in the sense that people typically mean.
Likewise, because you can recognize the artist who produced a piece, you contain an awareness of that same relationship between color, contrast and line that the AI does. I could show you a Picasso you were unfamiliar with, and you'd likely know it was him based on the style.
You've been "trained" on his works, so you have internalized many of the key markers of his style. That doesn't mean you "contain" his works.
Just because you can't point to a specific part of your brain that contains the letter 'p' doesn't mean it isn't in there somewhere. If you didn't contain the letter 'p', or Getty watermark, or Picasso's work, you wouldn't be able to recognise them when you saw them or tried to replicate them. The act of recognising something that is familiar is basically the brain comparing what the eye sees with what is stored in the memory. The brain stores it differently to an exact copy on a hard drive, but it does, nevertheless, contain everything that it remembers.
I disagree that recognition implies you contain it. It's much closer to a description than the actual thing, and a description isn't the same as the thing. This is evidenced by you being able to look at a letter P in a font you've never seen before and recognize it without issue. If it was just comparison, you couldn't do that.
Ah, this old paper again. When it first came out it got raked over the coals pretty thoroughly. The authors used an older, poorly-trained version of Stable Diffusion that had been trained on only 160 million images and identified 350,000 images from the training set that had many duplicates and therefore could potentially be overfitted. They then generated 175 million images using tags commonly associated with those duplicate images.
After all that, they found 109 images in the output that looked like fuzzy versions of the input images. This is hardly a triumph of plagiarism.
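To put those numbers in proportion, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above, which are not re-checked here, and the variable names are made up for illustration. The last ratio also assumes each of the 109 hits corresponds to a different targeted training image, which the figures above don't actually say.

```python
# Figures as quoted in this comment (taken at face value, not re-verified).
duplicated_training_images = 350_000   # heavily duplicated images that were targeted
generated_images = 175_000_000         # images generated from their tags
near_copies_found = 109                # outputs judged to resemble a training image

# How many generations were spent per targeted training image.
generations_per_image = generated_images // duplicated_training_images

# Rough odds of any single generation being a near-copy.
hit_rate_per_generation = near_copies_found / generated_images
generations_per_hit = generated_images / near_copies_found

# Share of targeted images reproduced, ASSUMING each hit is a distinct image.
share_of_targeted_images = near_copies_found / duplicated_training_images

print(f"generations per targeted image: {generations_per_image}")
print(f"hit rate per generation: {hit_rate_per_generation:.2e}")
print(f"roughly 1 near-copy per {generations_per_hit:,.0f} generations")
print(f"share of targeted images reproduced: {share_of_targeted_images:.4%}")
```

That works out to 500 attempts per targeted image and roughly one near-copy per 1.6 million generations, even though the prompts were chosen specifically to provoke memorised outputs.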
As for the watermark, look closely at it. The AI clearly just replicated the idea of a Getty-like watermark; it's barely legible. What else would you expect when you train an AI on millions of images that contain a common feature, though? It's like any other common object - it thinks photographs often just naturally have a grey rectangle with those white squiggles in it, and so it tries putting them in there when it generates photographs.
These are extreme stretches and they get dredged up every time by AI opponents. Training techniques have been refined over time to reduce overfitting (since what's the point in spending enormous amounts of GPU power to produce a badly-artefacted copy of an image you already have?) so it's little wonder there aren't any newer, better papers showing problems like these.
Nevertheless, the Getty watermark is a recognisable element from the images the model was trained on, therefore you cannot state that the models don't spit out images with recognisable elements from the training data.
Take a close look at the "watermark" on the AI-generated image. It's so badly mangled that you wouldn't have a clue what it says if you didn't already know what it was "supposed" to say. If that's really something you'd consider "copyrightable" then the whole world's in violation.
The only reason this is coming up in a copyright lawsuit is because Getty is using it as evidence that Stability AI used Getty images in the training set, not that they're alleging the AI is producing copyrighted images.
I said "recognisable", and it is clearly recognisable as Getty's watermark, by virtue of the fact that many people, not only I, recognise it as such. You said that the models don't use any "recognizable part of the original material that it was trained on", and that is clearly false because people do recognise parts of the original material. You can't argue away other people's ability to recognise the parts of the original works that they recognise.
I said that models don't contain any recognizable part of the original material. They might be able to produce recognizable versions of parts of the original material, as we're seeing here. That's an important distinction. The model itself does not "contain" the images from the training set. It only contains concepts about those images, and concepts are not something that can be copyrighted.
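As a very loose analogy for "contains concepts, not copies", here is a minimal sketch of a toy model that is vastly simpler than an image generator. It fits a straight line to 1,000 noisy points and keeps only two numbers. Everything in it (the `fit_line` helper, the synthetic data, the noise level) is made up for illustration and has nothing to do with how Stable Diffusion is actually trained.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "training set": 1,000 noisy samples of y = 2x + 1.
rng = random.Random(0)
training_points = [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(1000)]

# The whole "model" is two floats; the individual points are not stored in it.
model = fit_line(training_points)
slope, intercept = model

# The model can still produce something close to a training point...
x0, y0 = training_points[10]
prediction = slope * x0 + intercept
print(f"model parameters: slope={slope:.4f}, intercept={intercept:.4f}")
print(f"training value at x={x0}: {y0:.3f}, model output: {prediction:.3f}")
```

The output at `x0` lands near the original value without the original value being stored anywhere in `model`. The analogy has limits: a model with far more parameters than it needs, or one trained on the same item many times over, can end up memorising individual examples, which is the overfitting case discussed earlier in the thread.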
If you want to claim copyright violations over specific output images, sure, that's valid. If I were to hit on exactly the right set of prompts and pseudorandom seed values to get a model to spit out an image that was a dead ringer for a copyrighted work, and I were to distribute copies of that resulting image, that would be a copyright violation. But the model itself is not a copyright violation, any more than an artist is inherently violating copyright because he could potentially pick up his paint brush and produce a copy of an existing work that he's previously seen.
In any event, as I said, Getty isn't suing over the copyright to their watermark.