This is an $800 course on prompt engineering. The price alone makes me want to take it! But I'm pretty sure it's just this guide as a class.
The infinitely long, infinitely boring ChatGPT system prompt. Lots of little nuggets that would be great for presentations on the hows and whys of what happens behind the scenes:
Your choices should be grounded in reality. For example, all of a given occupation should not be the same gender or race. Additionally, focus on creating diverse, inclusive, and exploratory scenes via the properties you choose during rewrites. Make choices that may be insightful or unique sometimes.
I guess it's Prodigy but at some sort of scale. Or Label Studio, but every single plan requires you to contact them for pricing.
Except Hugging Face says it's "a free interface for validating and cleaning unstructured LLM outputs" so maybe it's just the hosted one that costs [a lot of] money. Could I explore it? Yes! Have I done it? No!
Source: https://argilla.io/
This is good in combination with Hugging Face's "Synthetic data: save money, time and carbon with open source" post.
This post does a fantastic job breaking down how you use an expert labeler (teacher LLM) to annotate your data, then use those annotations to fine-tune a student LLM. It's as good as or better than crowd workers!
In this case they use Mixtral to prep data for RoBERTa-base, then get equal performance in the end. So much faster! So much cheaper!
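To make the teacher/student idea concrete, here is a minimal sketch, not the post's actual code. Everything in it is an assumption for illustration: `teacher_label` is a hypothetical keyword rule standing in for a prompted Mixtral call, the student is a tiny naive Bayes model rather than RoBERTa-base, and the four example sentences are made up.

```python
import math
from collections import Counter, defaultdict


def teacher_label(text: str) -> str:
    """Hypothetical stand-in for the teacher LLM (in practice: a prompted Mixtral call)."""
    positive_words = {"great", "love", "excellent"}
    return "positive" if positive_words & set(text.lower().split()) else "negative"


def train_student(texts, labels):
    """Fit a tiny naive Bayes student on the teacher's labels (stand-in for fine-tuning)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def student_predict(model, text: str) -> str:
    """Return the label with the highest Laplace-smoothed log probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Step 1: the teacher annotates raw, unlabeled text (made-up examples).
unlabeled = [
    "I love this phone",
    "great battery and excellent screen",
    "terrible support and slow shipping",
    "the screen broke after a week",
]
silver_labels = [teacher_label(text) for text in unlabeled]

# Step 2: the small, cheap student trains on those labels only.
student = train_student(unlabeled, silver_labels)

# Step 3: the student handles new text without calling the teacher again.
prediction = student_predict(student, "I love the battery")
```

The shape is the point: the expensive model runs once to label the data, and the small model is what you actually deploy.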