What level of abstraction is enough? Training doesn’t store or reference the work at all. It derives a set of weights from it automatically. But what if you had a legion of interns manually deriving the weights and entering them in instead? Besides the impracticality of it, if I look at a picture, write down a long list of small adjustments, -2.343, -.02, +5.327, etc etc etc, and adjust the parameters of the algorithm without ever scanning it in, is that legal? If that is, does that mean the automation of that process is the illegal part?
Right now our understanding of derivative works is mostly subjective. We look at the famous Obama “HOPE” image, and the connection to the original news photograph from which it was derived seems quite clear. We know it’s derivative because it looks derivative. And we know it’s a violation because the person who took the news photograph says that they never cleared the photo for re-use by the artist (and indeed, demanded and won compensation for that reason).
Should AI training be required to work from legally acquired data, and what level of abstraction from the source data constitutes freedom from derivative work? Is it purely a matter of the output being “different enough” from the input, or do we need to draw a line in the training data, or…?
What level of abstraction is enough? Training doesn’t store or reference the work at all. It derives a set of weights from it automatically. But what if you had a legion of interns manually deriving the weights and entering them in instead? Besides the impracticality of it, if I look at a picture, write down a long list of small adjustments, -2.343, -.02, +5.327, etc etc etc, and adjust the parameters of the algorithm without ever scanning it in, is that legal? If that is, does that mean the automation of that process is the illegal part?
Right now our understanding of derivative works is mostly subjective. We look at the famous Obama “HOPE” image, and the connection to the original news photograph from which it was derived seems quite clear. We know it’s derivative because it looks derivative. And we know it’s a violation because the person who took the news photograph says that they never cleared the photo for re-use by the artist (and indeed, demanded and won compensation for that reason).
Should AI training be required to work from legally acquired data, and what level of abstraction from the source data constitutes freedom from derivative work? Is it purely a matter of the output being “different enough” from the input, or do we need to draw a line in the training data, or…?
All good questions.