A discussion of model sizes vs quantization on /r/LocalLLaMA, relevant for anyone interested in running models on their own machines. Generally:
I've read that a larger model, even at a lower quant, will most likely yield better results than a smaller model at a higher quant.
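Part of why this comparison comes up so often is memory budget: a bigger model at fewer bits per weight can fit in roughly the same VRAM as a smaller model at more bits. The back-of-the-envelope arithmetic is just parameters × bits ÷ 8. A minimal sketch (the model sizes and bit widths below are illustrative examples, not benchmarks, and this ignores KV cache and runtime overhead):

```python
# Rough VRAM needed for model weights alone (ignores KV cache, activations,
# and per-layer overhead), assuming all params are stored at the given bit width.
def weight_gb(params_billion: float, bits: float) -> float:
    bytes_total = params_billion * 1e9 * bits / 8
    return bytes_total / 1e9  # decimal GB

# A larger model at a low quant lands in roughly the same memory range
# as a smaller model at a high quant:
for name, params, bits in [("70B @ 4-bit", 70, 4),
                           ("34B @ 8-bit", 34, 8),
                           ("13B @ fp16", 13, 16)]:
    print(f"{name}: ~{weight_gb(params, bits):.0f} GB for weights")
```

So for the same hardware, the practical question is usually "70B at 4-bit or 34B at 8-bit?", which is where the rule of thumb above applies.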