Arxiv Papers

Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format. Code behind this work: https://github.com/imelnyk/ArxivPapers

[QA] On the Theoretical Limitations of Embedding-Based Retrieval

https://arxiv.org/abs/2508.21038
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers

09-01

08:55

On the Theoretical Limitations of Embedding-Based Retrieval

https://arxiv.org/abs/2508.21038

09-01

23:17

[QA] Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing

Avengers-Pro is a test-time routing framework that optimizes performance and efficiency in LLMs, achieving state-of-the-art results by dynamically assigning queries to suitable models based on performance-efficiency scores.
https://arxiv.org/abs/2508.12631

08-22

07:03

Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing

Avengers-Pro is a test-time routing framework that optimizes performance and efficiency in LLMs, achieving state-of-the-art results by dynamically assigning queries to suitable models based on performance-efficiency scores.
https://arxiv.org/abs/2508.12631

08-22

09:39

[QA] Measuring the environmental impact of delivering AI at Google Scale

https://arxiv.org/abs/2508.15734

08-22

08:17

Measuring the environmental impact of delivering AI at Google Scale

https://arxiv.org/abs/2508.15734

08-22

22:09

[QA] Deep Think with Confidence

DeepConf enhances reasoning efficiency and performance in Large Language Models by filtering low-quality traces using internal confidence signals, achieving high accuracy and reduced token generation without extra training.
https://arxiv.org/abs/2508.15260

08-22

07:36

Deep Think with Confidence

DeepConf enhances reasoning efficiency and performance in Large Language Models by filtering low-quality traces using internal confidence signals, achieving high accuracy and reduced token generation without extra training.
https://arxiv.org/abs/2508.15260

08-22

18:34

[QA] Intern-S1: A Scientific Multimodal Foundation Model

Intern-S1 is a multimodal model that excels in scientific tasks, outperforming both open-source and closed-source models, and aims to bridge the gap in high-value scientific research.
https://arxiv.org/abs/2508.15763

08-22

08:33

Intern-S1: A Scientific Multimodal Foundation Model

Intern-S1 is a multimodal model that excels in scientific tasks, outperforming both open-source and closed-source models, and aims to bridge the gap in high-value scientific research.
https://arxiv.org/abs/2508.15763

08-22

49:42

[QA] Search-Time Data Contamination

The paper identifies search-time contamination (STC) in evaluating search-based LLM agents, revealing how data leaks compromise benchmark integrity and proposing best practices for trustworthy evaluations.
https://arxiv.org/abs/2508.13180

08-20

07:02

Search-Time Data Contamination

The paper identifies search-time contamination (STC) in evaluating search-based LLM agents, revealing how data leaks compromise benchmark integrity and proposing best practices for trustworthy evaluations.
https://arxiv.org/abs/2508.13180

08-20

19:34

[QA] Thyme: Think Beyond Images

This paper introduces Thyme, a multimodal model enhancing image manipulation and reasoning through executable code, achieving significant performance improvements in perception and reasoning tasks via innovative training strategies.
https://arxiv.org/abs/2508.11630

08-19

07:20

Thyme: Think Beyond Images

This paper introduces Thyme, a multimodal model enhancing image manipulation and reasoning through executable code, achieving significant performance improvements in perception and reasoning tasks via innovative training strategies.
https://arxiv.org/abs/2508.11630

08-19

25:37

[QA] SSRL: Self-Search Reinforcement Learning

The paper explores using large language models as efficient simulators for reinforcement learning tasks, introducing Self-Search RL to enhance internal knowledge utilization and reduce reliance on external search engines.
https://arxiv.org/abs/2508.10874

08-19

07:39

SSRL: Self-Search Reinforcement Learning

The paper explores using large language models as efficient simulators for reinforcement learning tasks, introducing Self-Search RL to enhance internal knowledge utilization and reduce reliance on external search engines.
https://arxiv.org/abs/2508.10874

08-19

32:32

[QA] Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs

This paper explores filtering dual-use topics from training data to enhance the tamper-resistance of open-weight AI systems, demonstrating significant improvements in adversarial fine-tuning resistance without degrading unrelated capabilities.
https://arxiv.org/abs/2508.06601

08-14

07:19

Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs

This paper explores filtering dual-use topics from training data to enhance the tamper-resistance of open-weight AI systems, demonstrating significant improvements in adversarial fine-tuning resistance without degrading unrelated capabilities.
https://arxiv.org/abs/2508.06601

08-14

31:24

[QA] Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

This paper presents ASearcher, an open-source project enhancing search agents' capabilities through scalable RL training, achieving significant performance improvements in complex query handling and long-horizon search tasks.
https://arxiv.org/abs/2508.07976

08-14

07:42

Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

This paper presents ASearcher, an open-source project enhancing search agents' capabilities through scalable RL training, achieving significant performance improvements in complex query handling and long-horizon search tasks.
https://arxiv.org/abs/2508.07976

08-14

28:28
