Tongyi Large Model Unveils New-Gen End-to-End Speech Interaction Model

【#Tech24H】On December 23, Tongyi Large Model released Fun-Audio-Chat, its new-generation end-to-end speech interaction model. The model goes beyond conversation: it is presented as an AI voice companion that understands what you say, perceives your emotions, and can help you complete tasks. Technically, the new model's end-to-end S2S (speech-to-speech) architecture generates speech output directly from speech input, removing the need to chain separate modules (ASR + LLM + TTS) and thereby improving efficiency and lowering latency. The Shared LLM layer runs at a 5 Hz frame rate, while SRH generates high-quality speech at a 25 Hz frame rate, reducing GPU computational overhead by nearly 50%. The training data covers real-world scenarios such as audio understanding, voice Q&A, emotion recognition, and tool calling, making the model better suited to practical use.
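To make the two frame rates concrete, the short Python sketch below counts how many frames each stage would handle for an utterance of a given length. It is an illustration only: the function name `frame_counts` and its parameters are hypothetical and not part of Fun-Audio-Chat, and the sketch does not attempt to reproduce the reported saving of nearly 50%, which depends on model details the article does not give.

```python
def frame_counts(duration_s: float, llm_hz: int = 5, speech_hz: int = 25) -> dict:
    """Count frames at each rate for an utterance of the given length.

    llm_hz and speech_hz default to the 5 Hz and 25 Hz rates quoted in the
    article; everything else here is an illustrative assumption.
    """
    llm_frames = round(duration_s * llm_hz)        # steps seen by the shared LLM layer
    speech_frames = round(duration_s * speech_hz)  # frames produced for speech output
    return {
        "llm_frames": llm_frames,
        "speech_frames": speech_frames,
        # how many fine-grained speech frames correspond to one LLM step
        "speech_frames_per_llm_frame": speech_hz // llm_hz,
    }


counts = frame_counts(10.0)
print(counts)
# {'llm_frames': 50, 'speech_frames': 250, 'speech_frames_per_llm_frame': 5}
```

For a 10-second utterance, the 5 Hz stage handles 50 frames while the 25 Hz stage handles 250, so the large shared model runs on one fifth as many steps as the speech output contains.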

Editor: Zhang Liyan Source: Youth.cn Time: 2025-12-24 16:11:00


Organized by CCYL and Network Film & TV center of CCYL Copyright@China Youth International. All rights reserved.