To train Llama 4, Meta is using a cluster of more than 100,000 Nvidia H100 GPUs, presumably a record.
Meta CEO Mark Zuckerberg has revealed that the Llama 4 large language model is being trained on a cluster with more than 100,000 Nvidia H100 GPUs. That makes it the largest cluster in the world, at least among those discussed publicly. Training LLMs takes enormous computing power, and the largest models take shape over months on AI supercomputers. These clusters are equipped with powerful AI accelerators tailored to training workloads, and the Nvidia H100 is the most powerful chip of its kind available at scale.
Big, bigger, biggest
Elon Musk is also reportedly building (or has already built) a cluster for xAI with a similar number of Nvidia H100 chips. The ambition is to expand that cluster over time, but for now the record appears to stand at about 100,000 chips.
How much power a data center with that many accelerators consumes on an annual basis is difficult to estimate; expect something on the order of a small city. Llama 4 will be one of the most powerful AI models of its kind and should be capable of sensible reasoning. Whether those capabilities are worth the ever-growing impact on the global power supply, time will tell. How Meta's cluster will be powered remains unclear.
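A rough back-of-envelope calculation illustrates the scale. The figures in the sketch below are assumptions for illustration, not disclosed Meta numbers: an H100 SXM TDP of roughly 700 W, an assumed overhead factor for CPUs, memory, and networking, a typical data center PUE of about 1.3, and year-round operation.

```python
# Back-of-envelope estimate of annual energy use for a 100,000-GPU cluster.
# All figures are assumptions for illustration, not disclosed Meta numbers.

NUM_GPUS = 100_000
GPU_POWER_W = 700          # approximate TDP of an Nvidia H100 SXM module
HOST_OVERHEAD = 1.5        # assumed factor for CPUs, memory, and networking
PUE = 1.3                  # assumed power usage effectiveness (cooling etc.)
HOURS_PER_YEAR = 8_760

it_power_mw = NUM_GPUS * GPU_POWER_W * HOST_OVERHEAD / 1e6
facility_power_mw = it_power_mw * PUE
annual_energy_gwh = facility_power_mw * HOURS_PER_YEAR / 1e3

print(f"IT load:        {it_power_mw:.0f} MW")
print(f"Facility load:  {facility_power_mw:.0f} MW")
print(f"Annual energy:  {annual_energy_gwh:.0f} GWh")
# Under these assumptions: roughly 135 MW of facility load and about
# 1,200 GWh per year, comparable to the annual electricity consumption
# of a small city.
```

The result is sensitive to the assumed overhead and PUE values, but even optimistic inputs land well above 100 MW of continuous draw.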
Nuclear energy
Several cloud providers, including Oracle, AWS, and Microsoft, are looking at nuclear power to satisfy AI's hunger for energy. The technology giants are considering not only drawing power from existing reactors but also investing in their own small modular reactors to power tomorrow's data centers.