Open-source data curation platform for LLMs

#annotation   #fine-tuning   #link  

I guess it's Prodigy but at some sort of scale. Or Label Studio, but every single plan requires you to contact them for pricing.

Except Hugging Face says it's "a free interface for validating and cleaning unstructured LLM outputs", so maybe it's just the hosted version that costs [a lot of] money. Could I explore it? Yes! Have I done it? No!