IBM launches Granite 4.0, a series of open-source language models with a hybrid Mamba-transformer architecture.
IBM introduces Granite 4.0, a new generation of open-source language models with a hybrid architecture. The Granite 4.0 models combine transformer and Mamba technology to deliver strong performance with reduced memory consumption and costs. Granite 4.0 is also the first open model family with ISO 42001 certification.
Less Memory, Lower Costs
The Granite 4.0 series comes in several sizes: Granite-4.0-H-Small, Granite-4.0-H-Tiny, and Granite-4.0-H-Micro. The models are built on a hybrid architecture that combines Mamba layers with transformer layers, an approach that reduces memory requirements for inference by more than 70% compared to conventional transformer models.
Thanks to this efficiency, the models can run on less expensive hardware, including AMD Instinct MI300X GPUs and Qualcomm Hexagon NPUs, which makes them suitable for edge deployments and local infrastructure. The models support context lengths of up to 128,000 tokens and are optimized for agentic AI tasks such as instruction following, tool calling, and RAG workflows.
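To illustrate the tool-calling workflow mentioned above, the sketch below builds a tool-aware prompt with the generic Hugging Face chat-template API. It is a minimal example, not IBM's reference code: the model ID and the get_weather function are illustrative placeholders, and it assumes a transformers release recent enough to support the Granite 4.0 models and their chat template.

```python
# Minimal sketch: rendering a tool-calling prompt for a Granite 4.0 model.
# The model ID and get_weather() are illustrative assumptions, not part of IBM's release.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Return the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 22 °C"  # stub used only to define the tool schema

model_id = "ibm-granite/granite-4.0-h-tiny"  # assumed ID; verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "What is the weather in Amsterdam right now?"}]

# The chat template serializes the tool's JSON schema into the prompt so the model
# can respond with a structured tool call instead of free text.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```

Passing a Python function through the tools argument lets transformers derive the tool's JSON schema from its signature and docstring, which keeps the example short.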
Granite 4.0 is available as open source under the Apache 2.0 license. The models are offered through IBM watsonx.ai and via partners such as Dell Technologies, Docker Hub, Hugging Face, Nvidia NIM, Kaggle, and Replicate. Support for Amazon SageMaker and Microsoft Azure is in preparation.
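For readers who want to try the models from Hugging Face, the snippet below shows one plausible way to load a checkpoint with the transformers text-generation pipeline. The model ID is an assumption based on IBM's ibm-granite organization naming and should be verified before use; a sufficiently recent transformers version is required for the hybrid Mamba-transformer architecture.

```python
# Minimal sketch: loading a Granite 4.0 checkpoint from Hugging Face.
# The model ID is an assumed example; check the ibm-granite organization for exact names.
from transformers import pipeline

model_id = "ibm-granite/granite-4.0-h-micro"  # assumed ID

pipe = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",   # place layers on available GPUs, or fall back to CPU
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

messages = [
    {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
]
result = pipe(messages, max_new_tokens=200)

# The pipeline returns the chat history with the assistant reply appended.
print(result[0]["generated_text"][-1]["content"])
```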
Validation and Security
Granite 4.0 is the first open model family with ISO/IEC 42001:2023 certification, which confirms that IBM's development process meets international standards for safety, governance, and accountability in AI. IBM cryptographically signs all model checkpoints and has launched a bug bounty program with HackerOne to further secure the models.
For training, IBM used a dataset of 22 trillion tokens compiled from enterprise and open sources. All models are trained with a focus on language, reasoning, multilingual capabilities, and security. For the first time, IBM is splitting the models into separate variants for instruction following and for complex reasoning: the instruct variants are available now, and the reasoning models will follow later in 2025.
