It is difficult to get a man to understand something, when his salary depends on his not understanding it!
Upton Sinclair, I, Candidate for Governor: And How I Got Licked
Because the established elite on SO are fucking dicks. You can’t ask a question anymore. It’s literally impossible to open a new thread. Yeah, a lot of troubleshooting has been answered, but do you really think the burden should be on the new user to know literally every single existing thread on the site? No, that’s absolutely absurd, and when some moderator or poweruser comes in and tells the newbie to go fuck himself for lack of research it’s pretty obvious why nobody wants to use the platform when the moderatorship is basically actively antagonistic to anybody seeking information, which is literally the point of the site.
I’m glad it’s been crawled by GPT. I’m glad because the bot gives me no sass at all when I ask it to audit my code. It does it without any malice or bullshit and it saves me time from doing the research because everything is in the LLM DB already.
Closing garbage is one of the best features of SO. When I began answering questions on Reddit I was literally answering the same shit every few days. It’s insane. People asking the same shit without doing any research first creates a ton of pointless work for the people who can answer. Reddit and Lemmy have no better way to resolve this problem. SO does it via strict moderation. I guess if ChatGPT can find you a good answer from the bajillion duplicates without having to waste an SME’s time, that’s a positive. But yeah, I thank SO’s moderators for keeping it clean. I’d lose my mind as an asker and especially as an answerer if I had to keep doing this over and over again. I also have feelings, and sifting through mountains of duplicates or having to answer the same questions over and over again hurts me.
Yeah, stringent moderation is great, but you can do it without being absolutely terrible about every interaction. You can run a clean shop without needing to be a dick about everything. When every close is filled with malice and vitriol it doesn’t benefit literally anybody. It’s not healthy for the poster being vindictive, it’s not healthy for the newbies getting into the business, and it’s not healthy for the community overall when the normal thing is berating and belittling.
Engineers are not good with customers…
I believe that you are ascribing a great deal of negativity to the process of curation on Stack Overflow that isn’t present. The claim that “every close is filled with malice and vitriol” is way over the top compared to what others have seen.
Entering the process with these preexisting misconceptions can make it more difficult to work within the process that Stack Overflow has set up.
Speculative
Not to mention the fact that even if a similar question were answered, if that thread is from 2012, the answer will, with 99% certainty, be totally irrelevant now.
SO has always been a bastion of power-hungry dickbags who get off on acting superior and putting others down. The way it’s structured reinforces this. It turned what should have been a great place to help each other into a fucking bloodsport arena.
Also, fuck LLMs.
I don’t like the way that LLMs have gathered their information with zero credit to anybody. It’s totally bullshit.
I am curious how many people attribute code they copy out of Stack Overflow back to SO with the appropriate license attribution back to the post as required by the license:
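https://creativecommons.org/licenses/by-sa/3.0/ and https://creativecommons.org/licenses/by-sa/4.0/ clearly state:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.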
Sorry this is all I know.
I find it difficult to give too much weight to the “generating an LLM based on Stack Overflow content without attribution is wrong” argument when people are knowingly and intentionally violating the CC-BY-SA license in their own code.
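For what it’s worth, attributing a copied snippet can be as small as a comment block. Here is a minimal sketch in Python, assuming a hypothetical helper named `attribution_comment`; the post title, URL, and author in the example call are placeholders, not a real post. It covers the three things the license asks for (credit, a link to the license, a note about changes), but whether a given comment counts as “reasonable” attribution is decided by the license text, not by this snippet.

```python
# Hypothetical helper: builds a comment block crediting a Stack Overflow post.
# Every post detail passed in below is a placeholder, not a real post.

def attribution_comment(title: str, url: str, author: str,
                        license_version: str = "4.0",
                        modified: bool = True) -> str:
    """Return a '#'-prefixed block with credit, license link, and change notice."""
    license_url = f"https://creativecommons.org/licenses/by-sa/{license_version}/"
    lines = [
        f'Adapted from "{title}" by {author}',
        f"Source: {url}",
        f"License: CC BY-SA {license_version} ({license_url})",
        "Changes: modified from the original answer" if modified
        else "Changes: none",
    ]
    return "\n".join(f"# {line}" for line in lines)


if __name__ == "__main__":
    print(attribution_comment(
        title="Example question title",             # placeholder
        url="https://stackoverflow.com/a/0000000",  # placeholder URL
        author="example_user",                      # placeholder
    ))
```

Pasting the printed block above the copied code keeps the credit next to the thing it credits, which is about the lowest-effort way to do it.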
Two wrongs don’t make a right.
It’s also totally fucking different when someone on SO asks for help with their homework or with an nginx server on their home network, and when some tech firm decides to scrape 15 years’ worth of information created by countless people, and then spit it back out pretending like it’s some novel solution.
As I said in my original comment, I’m no fan of SO. But neither the behavior of the site nor that of the people who lurk and copy justifies what LLMs are doing.
We should pursue license violations of openly licensed material with equal effort no matter what the source. Ignoring them for some while preaching fire and brimstone for others weakens both the argument and the license on which it is founded.
When trying to enforce a license, if it is possible to say “you are doing exactly what you accuse us of doing” it makes it more difficult to prosecute.
While two wrongs don’t make a right, two wrongs will substantially complicate prosecuting just one of them.
I am not arguing about the morality of one or the other… or how insignificant one of them is in comparison to the other.
My issue with just pointing to the LLM is about the integrity and enforceability of open source licenses.
I’m with you. It’s amazing how fast ChatGPT has replaced SO for me.
I’m not sure how this will work long term. How will the model get new training data?
But honestly? SO can eat a bag of dicks. It doesn’t matter if you’re asking a question that’s nowhere on the site (or the first 3 pages of Google results). It’s going to get closed and ignored.
I think most people moved to Reddit and Discord a while ago, which is also problematic. We need to get these conversations happening on the open Fediverse.
What will happen long term is more specialized models for specific applications. MS already has coder-facing resources through GitHub Copilot. They were also the key funders for most of OpenAI’s work with GPT, so they have deployed GPT-4 inline for Bing, which I find actually pretty useful even though it’s been neutered to all hell.
The problem with GPT is that it’s generalized. We’ve been building more specific models though. Copilot is already trained on the entire codebase and discussion boards on GitHub. Eventually that’s going to be the tool you want to use over GPT because it’s specifically designed for code above all else.