@suchenzang on September 02, 2023

#China   #behind the scenes   #tokenizing   #models   #tweets  

It's a little in the weeds of how these tools work, but generative AI tools break text into "tokens," individual units of text or information. Usually a tokenizer's vocabulary is learned automatically from the training data (by an algorithm like byte-pair encoding, separate from the model itself), but in this case, the tokens appear to have been curated by the team at Baichuan, and they're very revealing of the model's... biases? I don't think I'd call them a bias, just... a reflection... of... something.
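If it helps to see what "learned automatically" means: here's a toy sketch of byte-pair encoding, the general family of algorithms most modern tokenizers use to build their vocabularies. This isn't Baichuan's actual code, and the function names are mine; the point is just that each vocabulary entry normally emerges from corpus statistics (most frequent adjacent pair wins) rather than anyone typing it in by hand.

```python
from collections import Counter

def learn_bpe_vocab(words, num_merges):
    """Toy byte-pair-encoding sketch (illustrative, not any real
    tokenizer's implementation). Tokens are learned from corpus
    statistics: each merge fuses the most frequent adjacent symbol
    pair into a new vocabulary entry."""
    # Start with each word as a sequence of single-character symbols.
    corpus = Counter(tuple(w) for w in words)
    vocab = {ch for w in words for ch in w}
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for word, freq in corpus.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merged = a + b
        vocab.add(merged)  # the new token comes from the data, not a human
        # Re-segment the corpus with the merged symbol applied.
        new_corpus = Counter()
        for word, freq in corpus.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == (a, b):
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_corpus[tuple(out)] += freq
        corpus = new_corpus
    return vocab

vocab = learn_bpe_vocab(["lower", "lowest", "newer", "wider"] * 5, num_merges=3)
```

Run on that tiny corpus, the frequent pair "we" gets merged into a token early on, with no human ever deciding "we" should be in the vocabulary. Manually inserted tokens, like the ones flagged in the thread, stand out precisely because they bypass this process.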

Read the responses.