Amazon launches AI voice model Nova Sonic

Amazon Nova Sonic

Amazon Nova Sonic combines comprehension and generation capabilities in a single model, enabling it to conduct human-like conversations.

Amazon has unveiled Nova Sonic, a voice model that enables “human-like speech conversations” in AI applications. “The model takes into account the nuance and complexity of human conversations,” Amazon says in a press release. Nova Sonic combines speech comprehension and generation in one model and can adapt its voice response to the context. It is available through a new API in Amazon Bedrock.

Human-like speech conversations

Voice applications have traditionally relied on multiple models working together; Amazon unifies comprehension and generation in a single model. This should let the model adapt the generated voice response to the acoustic context (e.g., tone) and the spoken input, with the aim of creating a more natural dialogue. Furthermore, Nova Sonic is said to understand the nuances of human conversations, as well as the pauses and hesitations in the speaker’s voice.

The voice model can help automate customer service conversations and power AI agents in a wide range of sectors, including travel, education, healthcare, and entertainment. In its announcement, Amazon shares several example scenarios in which you can hear the voice model at work.

English accents

The model is available in various English accents, including American and British. According to Amazon, support for additional languages will follow soon. The speech model is available through a new API in Amazon Bedrock. To use the model, you first need to enable model access in the Amazon Bedrock console: navigate to Model Access and find Amazon Nova Sonic under the Amazon models.
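
For readers who prefer to check from code whether the model is offered in their region, the following is a minimal sketch using the boto3 `bedrock` client's `list_foundation_models` call. The helper names, the default region `us-east-1`, and the assumption that the model ID contains the substring `nova-sonic` are illustrative and not taken from Amazon's announcement. The listing only shows which models exist in a region; it does not confirm that access has been granted in the console.

```python
from typing import Iterable


def find_nova_sonic(model_summaries: Iterable[dict]) -> list[str]:
    """Return the IDs of models whose ID contains 'nova-sonic'.

    The substring match is an assumption about how the model is named.
    """
    return [
        summary["modelId"]
        for summary in model_summaries
        if "nova-sonic" in summary.get("modelId", "").lower()
    ]


def list_amazon_models(region: str = "us-east-1") -> list[dict]:
    """Fetch the foundation models that Amazon provides in one region.

    Requires boto3 and valid AWS credentials; the region is an assumption.
    """
    import boto3  # imported lazily so the filter above works without the AWS SDK

    client = boto3.client("bedrock", region_name=region)
    response = client.list_foundation_models(byProvider="Amazon")
    return response["modelSummaries"]
```

With credentials configured, `find_nova_sonic(list_amazon_models())` would return the matching model IDs, or an empty list if the model is not listed in that region.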