AMD Aims to Break Cuda Dominance with ROCm 7

AMD releases a new version of the ROCm software platform. The new version supports distributed inference, new precision formats, and enterprise management tools.

AMD has announced the availability of ROCm 7. ROCm is the software platform for AMD’s AI chips and the company’s answer to Nvidia’s Cuda. With ROCm 7.0, AMD focuses on the growing complexity and scale of AI workloads.

The update is optimized for the new Instinct MI350 GPUs, based on the CDNA 4 architecture. These GPUs feature high memory bandwidth and improved compute cores for intensive AI training and inference.

AI at Scale

ROCm 7.0 introduces support for low-precision data types such as FP4, FP6, and FP8. These use fewer bits per value, which lowers memory requirements and improves performance at some cost in numerical precision. Quantized versions of models such as DeepSeek R1, Llama 3.3 70B, and gpt-oss-120b are available through AMD Quark, AMD’s quantization toolkit.
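To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a model with 70 billion parameters (the size of Llama 3.3 70B) and counts only the weights; real deployments also need room for scaling factors, the KV cache, and activations, so the figures are lower bounds rather than measured values. The function names are illustrative and not part of ROCm or AMD Quark.

```python
# Rough estimate of weight storage at different precisions.
# Assumption: weights only; ignores scaling factors, KV cache, and activations.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP6": 6, "FP4": 4}


def weight_memory_gb(num_params: int, bits: int) -> float:
    """Return the storage needed for num_params weights, in gigabytes (1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def memory_table(num_params: int) -> dict:
    """Map each precision name to the estimated weight memory in GB."""
    return {name: weight_memory_gb(num_params, bits)
            for name, bits in BITS_PER_WEIGHT.items()}


if __name__ == "__main__":
    # 70 billion parameters, as in a 70B model.
    for name, gb in memory_table(70_000_000_000).items():
        print(f"{name}: {gb:.1f} GB")
```

Under these assumptions, moving from FP16 to FP4 cuts the weight footprint to a quarter, from about 140 GB to about 35 GB.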

For quick startup, AMD offers pre-built Docker images for frameworks like vLLM and SGLang. These containers are tailored for Instinct MI355, MI350, MI325, and MI300 GPUs, enabling direct benchmarking of models.

Divide and Conquer

ROCm 7.0 expands capabilities for distributed inference. This means splitting a model across multiple GPUs, which improves scalability and response time. Frameworks like SGLang support this approach, including for Mixture of Experts (MoE) models and the new precision formats.
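One common way to distribute a model across GPUs is to shard each weight matrix so that every device computes part of the output. The sketch below simulates that idea in plain Python, with lists standing in for devices. It is a toy illustration of column sharding only, not a description of how SGLang or ROCm implement distributed inference, and all function names are hypothetical.

```python
def matvec(x, w):
    """Compute x @ w for a vector x (length k) and a k x n matrix w (list of rows)."""
    n = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n)]


def shard_columns(w, num_devices):
    """Split the columns of w into num_devices equal slices, one per simulated device."""
    n = len(w[0])
    if n % num_devices != 0:
        raise ValueError("column count must be divisible by num_devices")
    width = n // num_devices
    return [[row[d * width:(d + 1) * width] for row in w]
            for d in range(num_devices)]


def parallel_matvec(x, w, num_devices):
    """Each simulated device multiplies x by its shard; the results are concatenated."""
    shards = shard_columns(w, num_devices)
    # On real hardware these partial products would run concurrently on separate GPUs.
    partials = [matvec(x, shard) for shard in shards]
    return [value for part in partials for value in part]


if __name__ == "__main__":
    x = [1, 2]
    w = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
    print(parallel_matvec(x, w, num_devices=2))
```

Each device holds only its slice of the weights, which is why sharding lets a model that is too large for one GPU run across several.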

For enterprises, AMD introduces two new tools: a Resource Manager and the AI Workbench. The Resource Manager simplifies GPU resource management in Kubernetes and Slurm environments, while AI Workbench provides a development and deployment platform for model training and fine-tuning. Both tools are designed with scalability and integration into existing enterprise architectures in mind.

Beating Cuda (and Nvidia)

AMD has been trying to break Nvidia’s AI dominance for a long time. Releasing equivalent chips and accelerators is only part of the puzzle, and AMD seems to be gradually realizing this. Nvidia’s strength lies in fully locking down the ecosystem.

The Cuda software plays a key role in this. ROCm 7 is meant to give AMD a stronger answer. Has AMD finally cracked the Nvidia code?