Arm Strongly Believes in AI at the Edge: ‘Inference is where the Real Value Lies’

AI can become the catalyst for a fundamental shift in the technology industry. And according to Arm, the key to that future lies in inference, not training – especially at the edge.

“This is no ordinary year,” Chris Bergey, SVP and GM of Client Line of Business at Arm, begins during his keynote at Computex in Taiwan. “We are on the eve of the most important moment in technology history.” In his view, AI is no longer a buzzword, but a reality that is reshaping all sectors. “Every product decision today is an AI decision. Every team has become an AI team.”

What fascinates him most is the pace at which AI is developing. More than 150 new foundation models have been launched in the past 18 months, each with accuracies above 90 percent. “We’ve evolved from simple prompts to multimodal interactions, and even to physical AI applications. The progress is dizzying,” says Bergey.

From Cloud to Edge: Inference as the Key to Scale

Yet he states that the real commercial value of AI doesn’t lie in training those models – as impressive as that is – but in inference, and more specifically: inference at the edge. “Training requires immense computing power, but inference determines whether you create value. It needs to be fast, energy-efficient, and available everywhere.”

Training requires immense computing power, but inference determines whether you create value.

Chris Bergey, SVP and GM of Client Line of Business at Arm

According to Bergey, latency is the decisive reason why inference is moving to the edge. “AI assistants that respond in real time can’t afford a round trip to the cloud every time.” Cost-efficiency also plays an increasingly important role. “Edge inference makes AI accessible for consumer-level applications, without relying on data center capacity.”

As an example, he points to the Ray-Ban AI glasses built in collaboration with Meta, which are selling very well in Europe. “What seemed like science fiction six months ago is now a retail product.”

AI Requires New Infrastructure

The shift towards the edge has major implications for infrastructure. In a separate Q&A session with the press, Bergey emphasizes that legacy architectures like x86 were not designed with AI in mind. “They were built for web servers, not for workloads requiring exaFLOPS.” According to Arm, the era of scaling up by simply adding more racks is over. “We need to do more with fewer watts per FLOP.”

Bergey points out that hyperscalers realize this. According to Arm’s figures, more than 50 percent of recently added CPU capacity at AWS is based on Arm architecture via their own Graviton chips. Microsoft, Google, Oracle, and Alibaba are also heavily investing in Arm chips for their own and third-party AI workloads.

A Platform, not a Chip Manufacturer

What stood out during the Q&A session with the press was Bergey’s clear positioning of Arm within the AI ecosystem. Although Arm provides IP to companies including Nvidia (DGX Spark, Grace Blackwell), he emphasizes that the company does not want to become a chip manufacturer. “Our role is to support the ecosystem, not to compete with it.”

Arm continues to adhere to its ‘partner-first’ model. The company provides IP and development tools that are used by partners like MediaTek, AWS, and Qualcomm to build their own solutions. “We ensure that developers can build on a consistent, scalable, and energy-efficient foundation, from wearables to AI supercomputers,” says Bergey.

Arm declined to comment during Computex on the rumor that it is developing its own chips for hyperscalers. The company has reportedly already convinced Meta to purchase such a chip. Given Arm’s expertise and Meta’s ambitions, the proprietary component is most likely a server CPU.

Kleidi: Accelerating at Scale

One of Arm’s trump cards here is Kleidi, an AI acceleration layer announced at Computex last year. According to Bergey, it has been deployed eight billion times in a single year. Kleidi is integrated into frameworks such as ONNX Runtime (Microsoft), LiteRT (Google), ExecuTorch (Meta), and Angel (Tencent), ensuring that AI workloads run optimally on Arm-based hardware out of the box.

“Kleidi demonstrates what’s possible when you smartly integrate AI acceleration into the software stack. And it proves that we are not just a hardware company, but a full-fledged platform,” says Bergey.

Now that hyperscalers are finding their way to Kleidi, there’s a good chance that everything will accelerate for Arm.

AI Infrastructure Needs to be Redesigned

Bergey also warns of a pitfall: “We can talk about the promise of AI, but without energy-efficient infrastructure, AI will never be deployed at scale.” He emphasizes that GPUs remain essential, but CPUs become a bottleneck if they can’t keep up with the data flow to GPUs.

The solution? New, balanced systems where CPU, GPU, and possibly NPU work more closely together. “What Nvidia did with Grace Blackwell shows that direction. But you also see co-design of custom silicon emerging at other hyperscalers.”

It goes without saying that Bergey sees Arm at the heart of all these designs. The less x86, the better for them.

In Closing: Momentum and Maturity

Chris Bergey concludes with a clear message: AI is everywhere – but Arm wants to ensure that AI can run everywhere: in edge devices, in the cloud, in wearables, in data centers. “We’re no longer chasing hype. We’re building the foundation on which this new wave of intelligence can run.”

AI has changed the playing field. And Arm has helped design the board.

Chris Bergey, SVP and GM of Client Line of Business at Arm

With AI as the driving force and inference as the basis for making money, Arm positions itself as the quiet engine of the AI revolution. Not with flashy demos, but with scalable, energy-efficient technology that allows others to innovate.

Or as Bergey says: “AI has changed the playing field. And Arm has helped design the board.”