Language Models as Statisticians, and as Adapted Organisms
Large language models exhibit complex behavior, possess diverse skills, and are deployed in a wide range of scenarios, so understanding them, and especially their failure modes, is important. Yet new models are released every few months, often with brand-new capabilities. How can we achieve an understanding that keeps pace with modern practice?