Meta released a new model that can do all sorts of speech- and text-based communication tasks. It's called Seamless, and you can find the model here.
Speech-to-speech translation is super fun, yes, but I love that they highlight that it (supposedly) understands code switching, aka bouncing back and forth between different languages. That's always been a tough one!
I tried the demo here and mostly learned my Russian is horrible:
Tried again with my pathetic grasp Japanese, and it thinks I'm speaking Hindi. I feel like these systems are usually pretty good at coping with my inability to speak other languages well (or at least Whisper is), so I don't know if I'm especially bad today or this model's flexibilty is also a potential downside.