June 29, 2023: @anthropicai

#bias   #alignment  

I love love love this piece – even just the tweet thread! Folks spend a lot of time talking about "alignment," the idea that we need AI values to agree with the values of humankind. The thing is, though, people have a lot of different opinions.

For example, if we make AI choose between democracy and the economy, it's 150% on the side of democracy. People are a little more split, and that split changes rather drastically from country to country.

AI loves democracy

It's a really clear example of bias, but (importantly!) not in a way that's going to make anyone feel threatened by having it pointed out. Does each country have to build its own LLM to get the "correct" alignment? Every political party? Does my neighborhood get one?

While we all know in our hearts that there's no One Right Answer to values-based questions, this makes the issues a little more obvious (and potentially a little scarier, if we're relying on the LLM's black-box judgment).

You can visit "Towards Measuring the Representation of Subjective Global Opinions in Language Models" to see their global survey and how the language model's "thoughts and feelings" match up with those of survey participants from around the world.
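
If you want to make "match up" a little more concrete, here's a minimal sketch of one way to score it: treat the model's answers and each country's answers as probability distributions over the same options, and compare them with Jensen-Shannon distance. This is my own illustration, not necessarily how the authors score it. The function names, the "Country A" and "Country B" labels, and every number below are made up for the example.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (log base 2) between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for ones that never overlap.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same answer options")
    # Midpoint distribution between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def similarity(p, q):
    """Higher means the two answer distributions agree more (1.0 = identical)."""
    return 1.0 - js_distance(p, q)


# Hypothetical numbers, NOT from the survey: share picking each option,
# in the order [democracy, economy].
model = [0.95, 0.05]
surveys = {
    "Country A": [0.60, 0.40],
    "Country B": [0.30, 0.70],
}

for country, answers in surveys.items():
    print(f"{country}: similarity to model = {similarity(model, answers):.2f}")
```

With numbers like these, the model would score as closer to Country A than to Country B, which is the "whose values is it aligned with?" question in miniature.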