Jamba (language model) explained

Jamba
Developer: AI21 Labs
Released: 28 March 2024
Genre: Large language model
License: Apache 2.0 License

Jamba is an open-weights large language model (LLM) developed by AI21 Labs.[1] [2] It is built on a novel hybrid architecture that combines the Mamba state space model (SSM) with transformer layers.[3] [4] The model has 52 billion parameters and is trained using a mixture-of-experts (MoE) technique, with about 12 billion parameters active per token. Jamba has a context window of up to 256K tokens, of which up to 140K tokens fit on a single 80 GB GPU, and is the largest Mamba-variant LLM created to date.
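To make the hybrid design more concrete, the following is a minimal, illustrative PyTorch sketch of the general idea: each layer mixes the sequence with either attention or an SSM-style recurrence and is followed by a mixture-of-experts MLP, so only a subset of the total parameters is active for any given token. The block structure, layer ratio, dimensions, and top-1 routing used here are toy assumptions chosen for readability, not AI21's actual implementation.

```python
# Toy sketch of a hybrid SSM/transformer stack with MoE MLPs (not AI21's code).
import torch
import torch.nn as nn


class ToySSMBlock(nn.Module):
    """Stand-in for a Mamba-style block: a simple gated, linear-time recurrence."""

    def __init__(self, dim: int):
        super().__init__()
        self.in_proj = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)
        self.decay = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        u = self.in_proj(x)
        a = torch.sigmoid(self.decay)              # per-channel decay in (0, 1)
        state = torch.zeros_like(u[:, 0])
        outputs = []
        for t in range(u.shape[1]):                # scan over the sequence
            state = a * state + (1 - a) * u[:, t]
            outputs.append(state)
        h = torch.stack(outputs, dim=1)
        return x + h * torch.sigmoid(self.gate(x))


class ToyMoEMLP(nn.Module):
    """MoE MLP: each token is routed to one expert, so only a fraction of the
    total MLP parameters is active per token (real MoE layers use top-k routing)."""

    def __init__(self, dim: int, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        flat = x.reshape(-1, x.shape[-1])
        choice = self.router(flat).argmax(dim=-1)  # top-1 expert per token
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                out[mask] = expert(flat[mask])
        return x + out.reshape_as(x)


class ToyHybridBlock(nn.Module):
    """One layer: either self-attention or an SSM block, followed by an MoE MLP."""

    def __init__(self, dim: int, use_attention: bool):
        super().__init__()
        self.use_attention = use_attention
        self.mixer = (
            nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            if use_attention else ToySSMBlock(dim)
        )
        self.mlp = ToyMoEMLP(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.use_attention:
            attn_out, _ = self.mixer(x, x, x, need_weights=False)
            x = x + attn_out
        else:
            x = self.mixer(x)
        return self.mlp(x)


# Interleave mostly SSM layers with an occasional attention layer (toy ratio).
dim, layers = 64, 8
model = nn.Sequential(*[ToyHybridBlock(dim, use_attention=(i % 4 == 3)) for i in range(layers)])
tokens = torch.randn(2, 16, dim)                   # (batch, seq_len, hidden_dim)
print(model(tokens).shape)                          # torch.Size([2, 16, 64])
```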

Jamba performs well on key measures such as throughput and efficiency, and it outperforms or matches other state-of-the-art models in its class on a wide range of benchmarks while offering a significantly larger context limit, enabling use cases that require longer context. The model is released with open weights under the Apache 2.0 license.[5]
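Because the weights are openly released, they can in principle be loaded with standard tooling such as the Hugging Face transformers library. The repository id "ai21labs/Jamba-v0.1" in the sketch below is an assumption based on AI21's public release and is not stated in this article; adjust it to the actual repository.

```python
# Hedged usage sketch: loading the open weights via Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"       # assumed Hugging Face repo id, not confirmed here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                  # spread the 52B parameters across available GPUs
    torch_dtype="auto",
)

inputs = tokenizer("State space models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```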

The company plans to release an instruction-tuned version of the model in beta on the AI21 Platform in the near future.

Characteristics

  * Hybrid architecture combining Mamba state space model (SSM) layers with transformer layers
  * Mixture-of-experts (MoE) design with 52 billion total parameters, of which about 12 billion are active per token
  * Context window of up to 256K tokens, with up to 140K tokens fitting on a single 80 GB GPU
  * Open weights released under the Apache 2.0 license

References

  1. "Introducing Jamba: AI21's Groundbreaking SSM-Transformer Model". www.ai21.com. Retrieved 2024-03-29.
  2. Kerner, Sean Michael (2024-03-28). "AI21 Labs juices up gen AI transformers with Jamba". VentureBeat. Retrieved 2024-03-29.
  3. "AI21 Labs' Jamba infuses Mamba to bring more context to transformer-based LLMs" (2024-03-28). SiliconANGLE. Retrieved 2024-03-29.
  4. "MLTimes - Time To Learn AI". mltimes.se. Retrieved 2024-03-29.
  5. "Unveiling Jamba: AI21's Groundbreaking Hybrid SSM-Transformer Open-Source Model". AI21 via www.prnewswire.com. Retrieved 2024-03-29.
  6. "AI21 Labs enhances the capabilities of gen AI transformers through Jamba integration" (2024-03-28). Global Village Space Technology. Retrieved 2024-03-29.