It's interesting to ask people who are concerned about the training data what they think of Adobe Firefly, which is strictly trained on correctly licensed data.
I'm under the impression that DALL-E itself used licensed data as well.
I find some people are comfortable with that, but others will switch to different concerns - which indicates to me that they're actually more offended by the idea of AI-generated art than the specific implementation details of how it was trained.
When I did Photography at college, a lot of the work was studying other works of art. I spent a lot of time in Google Images, diving through books from the Art section and going to galleries. Lots of photocopying was involved!
I then did works in the style of what I’d researched. I trained myself on works I didn’t own, and then produced my own.
I kind of see the AI training as similar work, just done programmatically vs physically.
Certainly a very interesting topic.
I can’t get my head around how far we’ve come on this in the last 6-12 months. From pretty awful outputs to works winning photography awards. And prints of a dog called Queso that you’d once have paid an illustrator a lot of money for.
I think it's more analogous to if you had tweaked one of those famous works directly in photoshop then turned it in. The model training likely results in near replicas of some of the training data encoded in the model. You might have a near replica of a famous photograph encoded in your head, but to make a similar photograph you would recreate it with your own tools and it would probably come out pretty different. The AI can just output the same pixels.
That's not to say there aren't other ways you might use the direct image (e.g. collage or sampling in music) but you'll likely be careful with how it's used, how much you tweak it, and with attribution. I think the weird problem we're butting up against is that AFAIK you can't figure out post-facto what the "influence" is from the model output aside from looking at the input (which does commonly use names of artists).
I work on an AI image generator, so I really do think the tech is useful and cool, but I also think it's disingenuous (or more generously misinformed) to compare it to an artist studying great works or taking inspiration from others. These are computers inputting and outputting bits. Another human analog would be memorizing a politician's speech and using chunks of it in your own speech. We'd easily call that plagiarism, but if instead every 3 words were exactly the same? Hard to say... it's both more and less plagiarism.
Just how much do you need to process a sampled work before you need the original artist's permission? In music, it seems that if the copyright holder can prove you sampled them, even if the result is unrecognizable, you're going to be on the hook for some royalties.
"The model training likely results in near replicas of some of the training data encoded in the model."
I don't think that's true.
My understanding is that any image generated by Stable Diffusion has been influenced by every single parameter of the model - so literally EVERY image in the training data has an impact on the final image.
How much of an impact is the thing that's influenced by the prompt.
One way to think about it: the Stable Diffusion model can be as small as 1.9GB (Web Stable Diffusion). It's trained on 2.3 billion images. That works out to about 6.6 bits of model capacity per image in the training set.
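The arithmetic behind that figure is a quick back-of-envelope check (assuming the 1.9GB and 2.3 billion numbers above, and 10^9 bytes per GB):

```python
# Rough capacity-per-image estimate for a 1.9 GB model
# trained on 2.3 billion images (figures from the comment above).
model_bits = 1.9e9 * 8   # model size in bits
num_images = 2.3e9       # training set size
bits_per_image = model_bits / num_images
print(f"{bits_per_image:.1f} bits per training image")
```

Six-and-a-half bits is nowhere near enough to store even a thumbnail, which is the point: the model can't be memorizing most of its training data verbatim.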
Right. Apart from some (extremely famous) pieces of art that have been heavily repeated in the dataset you’re not going to be able to come close to recreating something directly.
Don't you think some images could be encoded perfectly, or near enough, within that 1.9GB though? A funny example is Malevich's Red Square. Highly compressible! [0] Line drawings can also often be compressed down to a polynomial.
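The compressibility point is easy to demonstrate with a toy version of the Red Square example, a hypothetical 512x512 image of a single flat color, using stdlib zlib (this is ordinary lossless compression, not how diffusion models actually store anything, so it only illustrates that some images need almost no bits):

```python
import zlib

# A hypothetical 512x512 solid-red image: raw RGB, every pixel identical.
raw = bytes([255, 0, 0]) * 512 * 512  # 786,432 bytes uncompressed
compressed = zlib.compress(raw, level=9)

# The repeating pattern collapses to a tiny fraction of the original size.
print(len(raw), len(compressed))
```

A near-uniform image like this compresses by orders of magnitude, so a handful of heavily repeated, simple works could plausibly be reproduced almost exactly even from a small model.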
> My understanding is that any image generated by Stable Diffusion has been influenced by every single parameter of the model - so literally EVERY image in the training data has an impact on the final image.
That's pretty interesting. Need to dig into the math more (lazy applications dev).
>It's interesting to ask people who are concerned about the training data what they think of Adobe Firefly, which is strictly trained on correctly licensed data.
If they truly got an appropriate license agreement for every image in the training set then I have no issues with that.
>I'm under the impression that DALL-E itself used licensed data as well.
DALL-E clearly used images they did not have a license for. Early on it was able to output convincing images of Pikachu and Homer Simpson. OpenAI certainly didn’t get licensing rights for those characters.
There's an argument to be made that drawing Pikachu should not be allowed, certainly. I think it's harder to argue that humans should be allowed to but AI should not.
What ongoing litigation I'm aware of seeks to close that loophole and make fanart illegal, which would be a first step towards also preventing AI art.
> I don't believe there's any issue with drawing Pikachu
In practice? No, not really. But Pikachu is a copyrighted character and only those with license to do so are actually legally allowed to reproduce Pikachu in media.
Trademarks can come into play like you said, but even just base copyright allows for the ownership of characters such as Pikachu or Batman or whatever.
I think the more correct argument is that Stable Diffusion effectively did a Napster to force artists into shit licensing deals with large players who can handle the rights management. It’s unlikely that artists would’ve ever agreed to them otherwise, but since the alternative now is to have your work duplicated by a pirate model or legally gray service, what are you going to do? This seems borne out by the fact that Stability AI themselves are now retreating behind Amazon for protection.