The infinitely long, infinitely boring ChatGPT system prompt. Lots of little nuggets that would be great for presentations about the hows and whys of what goes on behind the scenes:
Your choices should be grounded in reality. For example, all of a given occupation should not be the same gender or race. Additionally, focus on creating diverse, inclusive, and exploratory scenes via the properties you choose during rewrites. Make choices that may be insightful or unique sometimes.
I guess it's Prodigy but at some sort of scale. Or Label Studio, but every single plan requires you to contact them for pricing.
Except Hugging Face says it's "a free interface for validating and cleaning unstructured LLM outputs" so maybe it's just the hosted one that costs [a lot of] money. Could I explore it? Yes! Have I done it? No!
Source: https://argilla.io/
This would be nice if the FCC would regulate spam calls at all. Can't we just do this from overseas, spoof everything, and there we go? If only there weren't a perverse financial incentive for carriers to let anything onto their networks...
This is good in combination with Hugging Face's "Synthetic data: save money, time and carbon with open source."