OVHcloud introduces AI Endpoints: a serverless platform that gives developers access to more than 40 open-source AI models for a wide range of applications.
OVHcloud has launched AI Endpoints, a new serverless solution that lets developers easily integrate advanced AI functionality into their applications.
Simple Integration
AI Endpoints offers a library of more than 40 open-source LLMs and generative AI models. The solution focuses on various use cases such as chatbots, speech models, code assistants, and more. Developers of such applications can deploy models without worrying about the underlying infrastructure or specialized AI knowledge.
The service includes a sandbox environment in which users can test AI functionality before deploying at scale. Key applications highlighted by OVHcloud include LLMs for customer interactions, automated text extraction for processing unstructured data, speech-to-text and text-to-speech capabilities, and code assistance in development environments.
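Integration of this kind is typically done over an HTTP API, and many hosted model services follow the OpenAI chat-completions request shape. The sketch below shows what building such a request could look like; the base URL placeholder, the model identifier, and the assumption of OpenAI-style compatibility are illustrative only and should be verified against OVHcloud's own AI Endpoints documentation.

```python
# Minimal sketch of an OpenAI-style chat-completion request payload.
# BASE_URL and the model name are hypothetical placeholders; take the
# real values from the AI Endpoints console and documentation.
BASE_URL = "https://<your-endpoint>.example/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build a chat-completion payload in the common OpenAI-style shape."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 256,
    }

payload = build_chat_request(
    "Meta-Llama-3_3-70B-Instruct",  # assumed model identifier for illustration
    "Summarize this support ticket in one sentence.",
)

# To actually send it (requires an API key from the provider):
# import requests
# resp = requests.post(BASE_URL,
#                      headers={"Authorization": f"Bearer {API_KEY}"},
#                      json=payload)
```

Because the request shape stays the same across models, switching from one hosted model to another is usually just a change of the `model` field.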
Because the models are open weight, users can run them not only on OVHcloud but also on their own infrastructure or on other cloud platforms. This lets organizations retain control over their data and implementations.
AI Endpoints supports numerous LLMs, including Llama 3.3 70B, Mixtral 8x7B, and Qwen 2.5 VL 72B. Efficient Small Language Models (SLMs) such as Mistral Nemo are also supported.
Serverless in the EU
AI Endpoints is designed to be serverless, meaning OVHcloud handles all infrastructure management. Serverless offerings are convenient to use but create a form of lock-in, something hyperscalers such as AWS have traditionally benefited from. With this solution, OVHcloud offers a European alternative.
AI Endpoints runs on the cloud infrastructure of the French provider OVHcloud, with data management taking place entirely within Europe. The cloud provider emphasizes its European roots and points out that its data centers use water-cooled servers to limit energy consumption.
The service is available in Europe, Canada, and the Asia-Pacific region, hosted from the data center in Gravelines. Pricing is based on the number of tokens consumed per minute, per model.