2025.09.25 | Video models as zero-shot all-rounders; implicit chain-of-thought saves tokens and boosts efficiency
Description
The 10 papers in this episode are:
[00:22] 🎥 Video models are zero-shot learners and reasoners
[01:09] 🧠 SIM-CoT: Supervised Implicit Chain-of-Thought (efficient reasoning via supervised implicit chain-of-thought)
[01:55] 🪶 EmbeddingGemma: Powerful and Lightweight Text Representations
[02:29] 🗣 Advancing Speech Understanding in Speech-Aware Language Models with GRPO (using GRPO to improve open-domain understanding in speech-aware large models)
[03:06] 🌍 LLMs4All: A Review on Large Language Models for Research and Applications in Academic Disciplines
[03:52] 🎬 EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning
[04:29] 🌀 Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation
[05:19] 🎬 PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
[05:58] 📄 Logics-Parsing Technical Report (end-to-end document parsing with large models, based on reinforcement learning)
[06:44] 🤖 On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub (an analysis of GitHub pull requests opened by AI agents)
[Follow us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递