aifaq.wtf

"How do you know about all this AI stuff?"
I just read tweets, buddy.

#models


June 12, 2023: @goodside

#alignment   #explanations and guides and tutorials   #models  

I know I love all of these, but this is a great thread to illustrate how these models aren't just a magic box we have no control over or understanding of.

June 7, 2023: @goodside

#prompt injection   #models   #limitations  

June 4, 2023: @structstories

#models   #fine-tuning   #open models  

June 2, 2023: @melmitchell1

#papers   #evaluation   #models  

It's tough to make robust tests to evaluate machines if you're used to making assumptions that only hold for adult humans. The paper's title – Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models – is a reference to a horse that did not actually do math.

June 1, 2023: @8teapi

#models   #alignment  

May 31, 2023: @nabeelqu

#models  

May 25, 2023: @moreisdifferent

#models   #limitations  

May 22, 2023: @cwolferesearch

#alignment   #fine-tuning   #models  

May 19, 2023: @taliaringer

#lol   #tokens   #models   #failures  

I love these magic words. Read more here.

May 5, 2023: @jeffladish

#open source models   #competition   #models   #evaluation  

May 4, 2023: @simonw

#open models   #models   #competition   #evaluation  

A "moat" is what prevents your clients from switching to another product.

Right now, most workflows are "throw some text into a product, get some text back." As a result, the box you throw the text into doesn't really matter – GPT, LLaMA, Bard – the only difference is the quality of the results you get back.

Watch how this evolves, though: LLM providers are going to add little features and niceties that make it harder to jump to the competition. They might make your use case a little easier in the short term, but anything other than text-in, text-out builds those walls a little higher.
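To make the text-in, text-out point concrete, here's a minimal sketch – the backend classes and their `complete` method are hypothetical stand-ins, not any real SDK – of why switching boxes is cheap when plain text is all your app needs:

```python
# Minimal sketch: if your app only needs text-in, text-out, the model
# behind it is swappable. These backends are hypothetical placeholders,
# not real vendor SDK calls.

class GPTBackend:
    def complete(self, prompt: str) -> str:
        return "response from a hosted GPT-style API"  # placeholder

class LLaMABackend:
    def complete(self, prompt: str) -> str:
        return "response from a locally run LLaMA-style model"  # placeholder

def summarize(backend, text: str) -> str:
    # The app code never names a vendor; there's no moat in this layer.
    return backend.complete(f"Summarize this in one sentence:\n\n{text}")

print(summarize(GPTBackend(), "Some long document..."))
print(summarize(LLaMABackend(), "Some long document..."))  # same call, different box
```

The moment the app starts depending on vendor-specific extras beyond that one call, swapping backends stops being a one-line change – which is the whole point about walls getting higher.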

May 4, 2023: @structstories

#custom models   #training   #models   #fine-tuning  

Not that I know the details, but I have my doubts that BloombergGPT was even worth it. I think "maybe look at" is a little too gentle – if you think you need your own model, you don't.

Prompt engineering and even somewhat thoughtful engineering of a pipeline should take care of most of your use cases, with fine-tuning filling in any gaps. The only reason you'd train from scratch is if you're worried about the copyright/legal/ethical implications of the data LLMs were trained on – and if you're worried about that, I doubt you have enough data to build a model.