‘Nvidia Blackwell chips face ongoing problems’

Problems flagged last year with Nvidia’s Blackwell B200 chips have not gone away. A flaw in the manufacturing process is reportedly causing the AI accelerators to overheat.

Nvidia’s high-end Blackwell B200 accelerators for AI training are running into problems. That is according to The Information, and although Nvidia denies the claims, the outlet stands by its reporting. The malfunctions involve the same flaw that previously caused delays.

Earlier delay

Nvidia’s B200 chips were originally supposed to roll off the assembly line in volume as early as the last quarter of 2024. A design error then threw a spanner in the works. That mistake was made by Nvidia and had to do with how the B200 chips are integrated by TSMC. Nvidia claimed to have fixed the error in late October.

That now appears not to be entirely true. Blackwell servers reportedly exhibit bugs related “to the way the chips connect.” That description is vague, and no further details are available at this time. Earlier rumors suggested that the flaw had to do with the integration of the B200 chips via TSMC’s CoWoS packaging system. Such a problem could be described as connection-related.

Too hot

The result is measurable: the chips get too hot. That, in turn, causes instability. Meanwhile, The Information reports that major customers including Microsoft, Amazon, Google and Meta are getting impatient. They have reportedly scaled back their Blackwell orders.

Nvidia is counting on billions in revenue from the big companies involved in training LLMs. If the manufacturer fails to get Blackwell on track, that failure could cause a financial hangover. Moreover, a misstep by Nvidia opens the door for AMD to catch up with its Instinct accelerators. However, things have not reached that point yet.

Finally, we note that the Blackwell B200 has a unique architecture. There is little chance that this chip’s problems also affect the newly announced RTX 50 series.
