YaRN: Extending LLM Context Windows Efficiently

Update: 2025-09-10

Description

This academic paper introduces YaRN (Yet another RoPE extensioN method), a novel and efficient technique for extending the context window of large language models (LLMs) that use Rotary Position Embeddings (RoPE). The authors demonstrate that YaRN significantly reduces the computational resources needed for this extension, requiring substantially fewer tokens and training steps than previous methods such as Position Interpolation (PI) and NTK-aware interpolation. Through experiments spanning long-sequence language modeling, passkey retrieval, and standardized benchmarks, the paper shows that YaRN-fine-tuned models, including those based on the LLaMA and Mistral architectures, can effectively extrapolate to context lengths much longer than their original training context while matching or surpassing existing context-extension techniques and preserving the original models' capabilities. The research highlights YaRN's efficiency, strong generalization, and potential for transfer learning in resource-constrained settings.
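At a high level, YaRN rescales the RoPE rotary frequencies so that low-frequency dimensions (long wavelengths) are interpolated toward the extended context while high-frequency dimensions are left largely untouched, and it applies a temperature correction to the attention logits. The sketch below illustrates that idea under stated assumptions; the function name, default values, and ramp boundaries are illustrative choices, not details quoted from the episode.

```python
import math
import torch

def yarn_scaled_rope_freqs(
    dim: int = 128,            # per-head rotary dimension (assumed)
    base: float = 10000.0,     # RoPE base theta (assumed)
    scale: float = 8.0,        # context extension factor, e.g. 4k -> 32k (assumed)
    orig_max_pos: int = 4096,  # original training context length (assumed)
    low_rot: float = 1.0,      # ramp boundaries in "rotations over the
    high_rot: float = 32.0,    #  original context" (illustrative values)
):
    """Minimal sketch of YaRN-style blended RoPE frequency interpolation.

    Dimensions that complete many rotations over the original context are
    kept as-is; dimensions that complete few rotations are fully
    interpolated (divided by `scale`); dimensions in between are blended
    with a linear ramp.
    """
    # Standard RoPE inverse frequencies theta_d
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))

    # Wavelength (in tokens) of each rotary dimension and the number of
    # full rotations it completes over the original context window
    wavelength = 2 * math.pi / inv_freq
    rotations = orig_max_pos / wavelength

    # Linear ramp: 1 where rotations <= low_rot (fully interpolate),
    # 0 where rotations >= high_rot (keep original frequency)
    interp_weight = torch.clamp(
        (high_rot - rotations) / (high_rot - low_rot), 0.0, 1.0
    )

    # Blend between original and position-interpolated inverse frequencies
    inv_freq_yarn = (1 - interp_weight) * inv_freq + interp_weight * (inv_freq / scale)

    # Attention "temperature" correction reported in the paper,
    # applied as a multiplier on the query-key logits
    logit_scale = 0.1 * math.log(scale) + 1.0
    return inv_freq_yarn, logit_scale
```

In practice, the blended inverse frequencies would replace the standard RoPE ones when building the cos/sin position cache, and the logit scale would multiply the query-key dot products before the softmax; the exact integration point depends on the model implementation.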
