this post was submitted on 17 Aug 2023

[–] BiNonBi@lemmy.blahaj.zone 28 points 1 year ago (9 children)

NPR reported that a "top concern" is that ChatGPT could use The Times' content to become a "competitor" by "creating text that answers questions based on the original reporting and writing of the paper's staff."

That's something that can currently be done by a human and is generally considered fair use. All a language model really does is drive the cost of doing that from tens or hundreds of dollars down to pennies.

To defend its AI training models, OpenAI would likely have to claim "fair use" of all the web content the company sucked up to train tools like ChatGPT. In the potential New York Times case, that would mean proving that copying the Times' content to craft ChatGPT responses would not compete with the Times.

A fair use defense does not have to include noncompetition. That's just one factor in a fair use defense, and the other factors may be enough on their own.

I think it'll come down to how the courts interpret "the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes" and "the amount and substantiality of the portion used in relation to the copyrighted work as a whole." Do we judge a language model by the model itself or by its output? Can a model itself be non-infringing and still be able to produce infringing content?

[–] ag_roberston_author@beehaw.org 12 points 1 year ago (2 children)

That’s something that can currently be done by a human and is generally considered fair use.

That's kind of the point though, isn't it? Fair use is only fair use because it's a human doing it, not an algorithm.

[–] BiNonBi@lemmy.blahaj.zone 5 points 1 year ago (1 children)

That is not actually one of the criteria for fair use in the US right now. Maybe that'll change, but it'll take a court case or legislation to do so.

I am aware of that, but those rules were written before technology like this was conceivable.
