Assessing Political Bias in Language Models

Gaywallet (they/it)@beehaw.org · 2 years ago

Hexorg@beehaw.org · 2 years ago

I wonder if it’s possible to bring public opinion into the error function: find weights for ChatGPT such that the next token is predicted correctly, but also such that the overall output falls within the average public opinion… But then, is that a “good enough” metric?
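
A minimal sketch of what that kind of error function might look like, assuming the output's stance can be reduced to a single number. The names `opinion_score`, `public_mean`, and the weight `lam` are hypothetical, made up for illustration; nothing here is how ChatGPT is actually trained, and getting a trustworthy opinion score for a piece of text is itself an unsolved part of the problem.

```python
import math

def cross_entropy(probs, target_index):
    """Usual next-token loss: negative log-probability of the correct token."""
    return -math.log(probs[target_index])

def opinion_penalty(opinion_score, public_mean):
    """Squared distance between the output's stance and the public average.

    Both values are hypothetical scalars on the same scale (e.g. -1 to 1).
    """
    return (opinion_score - public_mean) ** 2

def combined_loss(probs, target_index, opinion_score, public_mean, lam=0.5):
    """Next-token loss plus a weighted penalty for drifting from public opinion.

    lam is an assumed hyperparameter trading prediction accuracy
    against closeness to the public average.
    """
    return cross_entropy(probs, target_index) + lam * opinion_penalty(
        opinion_score, public_mean
    )

# Same token prediction, two different stances relative to the public mean.
probs = [0.5, 0.5]
on_average = combined_loss(probs, 0, opinion_score=0.0, public_mean=0.0)
off_average = combined_loss(probs, 0, opinion_score=1.0, public_mean=0.0)
print(on_average < off_average)
```

The second output gets the higher loss even though the token prediction is identical, which is the whole idea. It also shows the catch: the penalty only pulls toward whatever `public_mean` is, so the metric is only as good as that average.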

Gaywallet (they/it)@beehaw.org · 2 years ago

Algorithmic bias is typically controlled through additional human-developed layers that counteract the bias introduced when you ingest large datasets for training, but that’s extremely work-intensive. I’ve seen some interesting hypotheticals where algorithms designed specifically to identify bias are used to tune layers with custom weighting, in an attempt to pull bias back down to acceptable levels. Even then, we’ll probably need to watch how this changes language about the groups that are subject to bias.

Hexorg@beehaw.org · 2 years ago

I think the trouble with human oversight is that it still carries whatever bias the overseer has.

Gaywallet (they/it)@beehaw.org · 2 years ago

AI is programmed by humans or trained on human data. Either we’re dealing in extremes, where it’s impossible not to have bias (which is an important framing for measuring bias), or we’re talking about how to minimize bias, not eliminate it entirely.