Nvidia launches massive AI model to rival GPT-4o

02.10.'24 14:58

Joachim Cruysberghs

Hardware giant Nvidia has launched a new open-source AI model, NVLM 1.0. The LLM builds on an existing model with 72 billion parameters.

Nvidia’s new multimodal LLM (Large Language Model) performs strongly on complex vision-language tasks. In AI benchmarks, NVLM 1.0 does not take first place on text-only tasks, but it processes images very well. The researchers presented examples in which the model analyzes images and solves complex mathematical problems step by step.

AI Benchmarks table with NVLM 1.0 — *Source: arxiv.org*

After multimodal training, the model’s performance on text-only prompts improves, whereas in many other models it deteriorates.

New chapter in AI development with open model

According to VentureBeat, Nvidia wants to compete with market leaders that offer closed models, such as OpenAI’s GPT-4o and Anthropic’s Claude 3.5. By making the model weights public, with the training code to follow soon, Nvidia gives researchers and developers full access to advanced technology. Smaller organizations and companies can thus research AI more efficiently. The model may not, however, be used commercially.

Nvidia’s decision to open everything up could significantly accelerate AI research. The company also hopes it will push competitors toward full transparency, which would benefit innovation and collaboration.

On the other hand, this degree of openness can also be questioned. Such a powerful and, above all, accessible AI could quickly fall into the wrong hands, which raises ethical issues. The impact of Nvidia’s decision will become clear in the coming weeks and months.