We need user interface alignment instead of just model alignment.
"How do you know about all this AI stuff?"
I just read tweets, buddy.
The most direct analogue is probably facial recognition: "the system said it's a match" will override "obviously this isn't the person from the video" because Computers Can't Be Wrong.
I've given up on all the "look at how an LLM scores on this test!!!" excitement because there's almost always something going on, whether it's explicitly cooking the books in favor of the LLM, testing on questions it's already seen, or (my favorite!) some sort of answer leakage.
We're going to see a lot of people doubling down on the (accidental? incidental?) falsehoods spread by ChatGPT.
"Write me a summary" seems like an easy task for a language model, but there are a hundred and one ways to do this, each with their own strengths and weaknesses. Even within langchain!
If you're excited about summarization, be sure to read this to see how things might go wrong. With hallucinations, token limits, and other technical challenges, LLM-based summarization has a lot more gotchas than you'd think.
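To make the gotchas concrete, here's a minimal map-reduce-style summarization sketch in plain Python (no LangChain). The `call_llm` function is a hypothetical stand-in for whatever model API you'd actually use; the point is to show where token limits and lossy summaries-of-summaries sneak in.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical LLM call; swap in your provider's client.
    # Truncated echo here just so the sketch runs end to end.
    return prompt[:200]

def chunk(text: str, max_chars: int = 2000) -> list[str]:
    # Naive fixed-size chunking as a stand-in for token-aware splitting.
    # Careless splits can cut sentences (or facts) in half -- gotcha #1.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize(text: str, max_chars: int = 2000) -> str:
    pieces = chunk(text, max_chars)
    partials = [call_llm(f"Summarize:\n\n{p}") for p in pieces]
    combined = "\n".join(partials)
    if len(combined) > max_chars:
        # Summaries of summaries lose detail with each pass -- gotcha #2.
        return summarize(combined, max_chars)
    return call_llm(f"Combine these partial summaries:\n\n{combined}")
```

Every branch here is a design decision (chunk size, how to merge, how many recursive passes), which is exactly why "write me a summary" has a hundred and one implementations.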