It is difficult to get a man to understand something, when his salary depends on his not understanding it!
- Upton Sinclair, I, Candidate for Governor: And How I Got Licked
I find it difficult to give too much weight to the “generating an LLM based on Stack Overflow content without attribution is wrong” argument when people are knowingly and intentionally violating the CC-BY-SA license in their own code.
Two wrongs don’t make a right.
It’s also totally fucking different when someone on SO asks for help with their homework or with an nginx server on their home network, versus when some tech firm decides to scrape 15 years’ worth of information created by countless people and then spit it back out, pretending it’s some novel solution.
As I said in my original comment, I’m no fan of SO. But neither the behavior of the site nor that of the people who lurk and copy justifies what LLMs are doing.
We should pursue license violations of permissively licensed material with equal effort, no matter the source. Ignoring them for some while preaching fire and brimstone at others weakens both the argument and the license on which it is founded.
When trying to enforce a license, if the other side can say “you are doing exactly what you accuse us of doing,” it becomes more difficult to prosecute.
While two wrongs don’t make a right, two wrongs will substantially complicate prosecuting just one of them.
I am not arguing about the morality of one or the other… or how insignificant one of them is in comparison to the other.
My issue with just pointing to the LLM is about the integrity and enforceability of open source licenses.