Alibaba expands its AI live speech translation model from 18 to 60 languages, adding real-time voice cloning and reducing ...
The new model, called VSSFlow, leverages a creative architecture to generate sounds and speech with a single unified system, with state-of-the-art results. Watch (and hear) some demos below. Currently ...
Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.
New York City startup Hume AI emerged from stealth two years ago and has ...
Gautam Jha is the Co-Founder & CTO of Kalpa Labs, an SF-based, YC-backed startup building large-scale foundational speech models. Voice is quickly becoming a primary interface for enterprise software, ...