Framework that converts linear-attention Transformers into equivalent bidirectional RNNs, enabling parallel training and memory-efficient recurrent inference. Provides three variants (LION-Lit, LION-D, LION-S) for image classification and masked language modeling, with configurable model size, attention masking, patch order, and output format. Achieves competitive accuracy with reduced training time.
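The core equivalence can be illustrated with a minimal sketch: non-causal linear attention over a full sequence equals the sum of a forward and a backward recurrent scan, each carrying a running matrix state and normalizer. This is an assumption-laden toy (vanilla unmasked linear attention with an ELU+1 feature map, closest in spirit to the LION-Lit variant), not LION's exact formulation, which also covers decayed and selective masking (LION-D, LION-S).

```python
import numpy as np

def phi(x):
    # positive feature map (ELU + 1), a common choice for linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention_quadratic(Q, K, V):
    # full (non-causal) linear attention: softmax replaced by kernel phi
    A = phi(Q) @ phi(K).T                      # (T, T) similarity scores
    return (A @ V) / A.sum(axis=1, keepdims=True)

def linear_attention_birnn(Q, K, V):
    # same output via two recurrent scans: the forward pass covers
    # tokens 0..t, the backward pass covers tokens t+1..T-1
    T, d = Q.shape
    Qf, Kf = phi(Q), phi(K)
    num = np.zeros_like(V)                     # unnormalized outputs
    den = np.zeros(T)                          # normalizers
    # forward scan: include token t before reading the state
    S, z = np.zeros((d, V.shape[1])), np.zeros(d)
    for t in range(T):
        S += np.outer(Kf[t], V[t]); z += Kf[t]
        num[t] += Qf[t] @ S
        den[t] += Qf[t] @ z
    # backward scan: read the state before including token t,
    # so each token is counted exactly once across both passes
    S, z = np.zeros((d, V.shape[1])), np.zeros(d)
    for t in range(T - 1, -1, -1):
        num[t] += Qf[t] @ S
        den[t] += Qf[t] @ z
        S += np.outer(Kf[t], V[t]); z += Kf[t]
    return num / den[:, None]
```

The recurrent form needs only O(d x d_v) state per direction at inference time, instead of materializing the (T, T) attention matrix, which is the efficiency argument behind the RNN view.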
This page was last edited on 2026-03-03.