Recurrent Gemma


Model overview

The Recurrent Gemma architecture, developed by Google, is a hybrid model architecture.

Recurrent Gemma leverages both attention and a state space model.
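To make the hybrid idea concrete, here is a toy sketch of the two building blocks on scalar inputs. It is not the actual Recurrent Gemma layer: the function names, the fixed coefficients `a` and `b`, and the sliding-window form of attention are assumptions chosen for illustration. It only shows that a recurrence carries a fixed-size state, while attention looks back over stored inputs.

```python
import math


def linear_recurrence(xs, a=0.9, b=0.1):
    """Scan h_t = a * h_{t-1} + b * x_t, keeping one scalar state (illustrative)."""
    h = 0.0
    states = []
    for x in xs:
        h = a * h + b * x  # the state has constant size regardless of sequence length
        states.append(h)
    return states


def windowed_attention(xs, window=2):
    """Causal softmax attention over the last `window` inputs (illustrative).

    Each input acts as its own query, key, and value.
    """
    out = []
    for t, q in enumerate(xs):
        ctx = xs[max(0, t - window + 1): t + 1]  # inputs visible at step t
        scores = [q * k for k in ctx]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, ctx)) / z)
    return out
```

A hybrid model interleaves layers of both kinds, so most of the sequence history is summarized in the recurrent state and only a bounded window of past tokens has to be cached for attention.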

At the moment (16/08/2024), Recurrent Gemma has one major version: RecurrentGemma.

Model details

Hugging Face model

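Supported quantization

Of the quantization configurations listed below, RecurrentGemma models currently support only FP16 weights with an FP16 KV cache; the other techniques are not supported.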

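| Quantization technique | KV cache | Supported | Updated |
|---|---|---|---|
| FP16 | FP16 | Yes | Yes |
| FP8 | FP16 | No | No |
| FP8 | FP8 | No | No |
| int8_weight_only | FP16 | No | No |
| int4_weight_only | FP16 | No | No |
| w4a16_awq | FP16 | No | No |
| w4a8_awq | FP16 | No | No |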
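A deployment script can encode the support matrix above and fail early on an unsupported combination. The sketch below is only an illustration: `QUANT_SUPPORT` and `is_supported` are hypothetical names, not part of any library, and the values are copied from the table.

```python
# (quantization technique, KV cache dtype) -> supported, copied from the table above.
QUANT_SUPPORT = {
    ("FP16", "FP16"): True,
    ("FP8", "FP16"): False,
    ("FP8", "FP8"): False,
    ("int8_weight_only", "FP16"): False,
    ("int4_weight_only", "FP16"): False,
    ("w4a16_awq", "FP16"): False,
    ("w4a8_awq", "FP16"): False,
}


def is_supported(technique, kv_cache="FP16"):
    """Return True only for combinations the table marks as supported.

    Combinations that are not listed are treated as unsupported.
    """
    return QUANT_SUPPORT.get((technique, kv_cache), False)
```

For example, `is_supported("FP16")` is true, while `is_supported("FP8", "FP8")` and any unlisted technique return false.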

Model versions

| Model version | Model | Context size (tokens) |
|---|---|---|
| RecurrentGemma | recurrentgemma-2b | 4,096 |
| RecurrentGemma | recurrentgemma-9b | 4,096 |
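The context size bounds the prompt and the generated tokens together. A minimal sketch of a pre-flight check follows, assuming the limit counts prompt plus output tokens; `CONTEXT_SIZE` and `fits_context` are hypothetical names used only for this example.

```python
# Context sizes in tokens, copied from the table above.
CONTEXT_SIZE = {
    "recurrentgemma-2b": 4096,
    "recurrentgemma-9b": 4096,
}


def fits_context(model, prompt_tokens, max_new_tokens):
    """Check that prompt plus requested output fits the model's context size."""
    limit = CONTEXT_SIZE[model]  # raises KeyError for models not in the table
    return prompt_tokens + max_new_tokens <= limit
```

With a 4,000-token prompt, a request for 96 new tokens fits exactly, and a request for 97 does not.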