The SEO Patent Podcast
BlockRank: Scalable In-context Ranking with Generative Models

Update: 2025-10-27
Description

BlockRank is an efficient method for In-Context Ranking (ICR) with large language models (LLMs) in Information Retrieval (IR). The core problem it addresses is the computational cost of using a standard LLM to rank many candidate documents: the cost of the attention mechanism scales quadratically with context length. BlockRank tackles this by first analyzing LLM attention patterns, which reveal inherent inter-document block sparsity and query-document block relevance signals within the middle layers.
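
To make the block-sparsity idea concrete, here is a minimal sketch that compares a dense causal attention mask with a block-sparse one. It is an illustration under stated assumptions, not BlockRank's actual implementation: it assumes the prompt is laid out as candidate document blocks followed by the query, that document tokens attend only within their own block, and that query tokens attend to everything before them. The function names `block_sparse_mask` and `dense_causal_mask` are hypothetical.

```python
import numpy as np


def dense_causal_mask(total):
    """Standard causal mask: token i may attend to every token j <= i."""
    return np.tril(np.ones((total, total), dtype=bool))


def block_sparse_mask(doc_lens, query_len):
    """Assumed layout: [doc_1][doc_2]...[doc_k][query].

    mask[i, j] is True when token i may attend to token j.
    """
    total = sum(doc_lens) + query_len
    mask = np.zeros((total, total), dtype=bool)
    start = 0
    for n in doc_lens:
        # Assumption: document tokens attend causally inside their own block only.
        mask[start:start + n, start:start + n] = np.tril(np.ones((n, n), dtype=bool))
        start += n
    # Assumption: query tokens attend causally to all documents and the query itself.
    mask[start:, :] = dense_causal_mask(total)[start:, :]
    return mask


doc_lens = [4, 4, 4]   # three toy documents of four tokens each
query_len = 2
sparse = block_sparse_mask(doc_lens, query_len)
dense = dense_causal_mask(sum(doc_lens) + query_len)

# Number of (query position, key position) pairs that must be computed.
print(int(sparse.sum()), int(dense.sum()))  # prints "57 105"
```

With equal-length documents, the per-document cost in this sketch stays fixed as more candidates are added, so the number of allowed attention pairs grows roughly linearly in the number of documents, whereas the dense mask grows quadratically in the total context length.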

Mark Williams-Cook