Mistral Launches Voxtral: Open-Source AI Speech Recognition

French AI startup targets affordable, open speech intelligence for businesses.

Mistral has introduced Voxtral, its first open audio model, which the company says makes “true speech intelligence” usable in production. With it, the French startup takes on closed systems from major players such as OpenAI.

Open, Affordable and Multilingual

Voxtral can transcribe up to 30 minutes of audio and, for understanding tasks, handle up to 40 minutes, thanks to its integration of Mistral Small 3.1, a compact language model. Users can ask questions about the content, generate summaries, or trigger actions in real time based on voice commands. The model works in multiple languages, including Dutch, English, French, Spanish, German, and Hindi.
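For recordings longer than the 30-minute transcription limit, one option is to split the audio into segments before submitting it. The sketch below only computes segment boundaries in minutes. It does not call any Mistral API, and the function name and the hard cut-off at exactly 30 minutes are assumptions made for illustration.

```python
# Illustrative sketch: plan segment boundaries for a long recording.
# The 30-minute limit comes from the article; everything else is hypothetical.
MAX_TRANSCRIBE_MINUTES = 30.0


def split_into_segments(total_minutes, max_minutes=MAX_TRANSCRIBE_MINUTES):
    """Return (start, end) pairs in minutes, each at most max_minutes long."""
    if max_minutes <= 0:
        raise ValueError("max_minutes must be positive")
    total = float(total_minutes)
    segments = []
    start = 0.0
    while start < total:
        end = min(start + max_minutes, total)
        segments.append((start, end))
        start = end
    return segments


if __name__ == "__main__":
    # A 75-minute meeting needs three segments.
    for start, end in split_into_segments(75):
        print(f"{start:.0f} to {end:.0f} min")
```

In practice, cutting at silences rather than at fixed offsets would avoid splitting words across segments.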

Voxtral comes in two variants: Voxtral Small (24 billion parameters) for production-scale deployments, and Voxtral Mini (3 billion parameters) for local applications. A separate transcription API focuses on speed and low cost; according to Mistral, it outperforms Whisper at less than half the price.

Alternative to Expensive Systems

According to Mistral, Voxtral is cheaper than comparable solutions. Companies can try it for free via Hugging Face or Le Chat. API pricing for integration into applications starts at $0.001 per minute.
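To put the per-minute price in perspective, here is a rough cost estimate. It assumes the $0.001 starting rate applies flatly to every minute of audio; the workload figure and function name are hypothetical, and actual billing may differ.

```python
# Rough cost estimate based on the starting price quoted in the article.
PRICE_PER_MINUTE_USD = 0.001


def estimate_cost_usd(audio_minutes, price_per_minute=PRICE_PER_MINUTE_USD):
    """Estimate the transcription cost in USD for a given amount of audio."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * price_per_minute


if __name__ == "__main__":
    # Hypothetical workload: 10,000 hours of call recordings per month.
    monthly_minutes = 10_000 * 60
    print(f"Estimated monthly cost: ${estimate_cost_usd(monthly_minutes):,.2f}")
```

At that rate, 10,000 hours of audio (600,000 minutes) would come to roughly $600.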

Voxtral follows the recent introduction of Magistral, Mistral’s reasoning model.

Itdaily - Mistral Launches Voxtral: Open-Source AI Speech Recognition
