Google DeepMind RecurrentGemma Beats Transformer Models

April 26, 2024

The Griffin research paper proposed two models, one called Hawk and the other named Griffin. Another win over transformer models is a reduction in memory usage and faster processing times. A transformer's throughput drops as the sequence length grows (as the number of tokens or words increases), but RecurrentGemma maintains high throughput regardless of sequence length. This shows that a non-transformer model can sidestep one of the transformer's limitations: the key-value cache, whose memory usage grows with the number of tokens processed. The researchers do note one limitation, however: RecurrentGemma struggles with very long sequences, something transformer models are able to handle.
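To make the memory contrast concrete, here is a minimal back-of-the-envelope sketch (not DeepMind's code; all layer counts and dimensions below are hypothetical example values). A transformer's key-value cache stores vectors for every token seen so far, so its memory grows linearly with sequence length, while a recurrent block compresses history into a fixed-size state:

```python
def transformer_kv_cache_bytes(seq_len, n_layers=26, n_heads=8,
                               head_dim=256, bytes_per_val=2):
    # The KV cache keeps one key and one value vector per token, per layer,
    # per head: memory grows linearly with the number of tokens processed.
    return 2 * n_layers * n_heads * head_dim * bytes_per_val * seq_len

def recurrent_state_bytes(n_layers=26, state_dim=2560, bytes_per_val=2):
    # A recurrent layer summarizes all past tokens in a fixed-size state,
    # so its memory footprint is constant regardless of sequence length.
    return n_layers * state_dim * bytes_per_val

for seq_len in (1_000, 10_000, 100_000):
    kv_gb = transformer_kv_cache_bytes(seq_len) / 1e9
    rec_mb = recurrent_state_bytes() / 1e6
    print(f"{seq_len:>7} tokens: KV cache ~ {kv_gb:.2f} GB, "
          f"recurrent state ~ {rec_mb:.2f} MB")
```

Under these example sizes the KV cache climbs from tenths of a gigabyte into tens of gigabytes as the context grows, while the recurrent state stays fixed at a fraction of a megabyte, which is the property behind RecurrentGemma's steady throughput at long sequence lengths.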

Source: Search Engine Journal