AI可可AI生活
823 Episodes
Have you ever wondered whether the logic a cell uses to repair itself and the logic an AI uses to paint might actually be one and the same? And how should we design a "college entrance exam" to test whether an AI can really do the work, rather than just looking the part? In this episode, we explore how AI learns to "self-reflect" to correct its mistakes, how the wisdom of "divide and conquer" can set everything in motion, and why it keeps "crashing" when trying to understand the real physical world. Ready? Let's decode the latest evolution of intelligence.

00:30:04 From cells to AI: what is the underlying logic of intelligence?
00:06:25 How many exams does AI still need to pass before it becomes a "reliable employee"?
00:11:39 Give AI a one-on-one tutor: it teaches itself
00:17:04 How many steps does it take to set everything in motion?
00:21:45 AI-generated video: why does it "crash" whenever robots show up?

Papers covered in this episode:
[AI] Remapping and navigation of an embedding space via error minimization: a fundamental organizational principle of cognition in natural and artificial systems [Allen Discovery Center at Tufts University] https://arxiv.org/abs/2601.14096
[AI] Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces [Stanford University & Laude Institute & Anthropic] https://arxiv.org/abs/2601.11868
[LG] InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning [CMU & University of Illinois Urbana-Champaign] https://arxiv.org/abs/2601.14209
[CV] Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis [Westlake University] https://arxiv.org/abs/2601.14253
[CV] Rethinking Video Generation Model for the Embodied World [Peking University & ByteDance Seed] https://arxiv.org/abs/2601.15282

Have you ever wondered whether AI can not only learn knowledge, but also evolve on the spot in the "exam hall" while solving hard problems? When AI starts "copying its own homework", does it get smarter or dumber? Today we explore how AI learns to use a computer, turning from a "brain in a vat" into a real doer; how, like a traditional Chinese doctor, we can "look, listen, ask, and feel" to tell when an AI is secretly unsure of itself; and finally how a clever AI teacher trains a "good student" that is both capable and forgetful. In this episode we witness AI's fascinating evolution from "knowing" to "doing" to "knowing itself".

00:00:38 AI's on-the-spot evolution: learning during the exam
00:05:12 When AI starts copying its own homework
00:11:25 Give AI a computer, and what happens?
00:17:31 AI also gets "unsure of itself": how can we see through it at a glance?
00:22:50 How does a clever large model train a "good student" that is capable but forgetful?

Papers covered in this episode:
[LG] Learning to Discover at Test Time [Stanford University & UC San Diego] https://arxiv.org/abs/2601.16175
[LG] Learning from Synthetic Data: Limitations of ERM [Google Research] https://arxiv.org/abs/2601.15468
[CL] LLM-in-Sandbox Elicits General Agentic Intelligence [Renmin University of China & Microsoft Research] https://arxiv.org/abs/2601.16206
[CL] Agentic Confidence Calibration [Salesforce AI Research] https://arxiv.org/abs/2601.15778
[CL] Memorization Dynamics in Knowledge Distillation for Language Models [Meta Superintelligence Labs & FAIR at Meta] https://arxiv.org/abs/2601.15394

Have you ever wondered how AI actually "thinks"? In this episode we go inside AI's brain and see how several recent papers reveal its distinctive strategies for learning and creating. We'll find that AI can foresee outcomes with a "map of the future", knows how to avoid "slacking off" when innovating, sometimes solves hard problems not by reasoning but by "clarifying", and even tells us that the road to wisdom is sometimes exactly the narrowest door. Ready? Let's explore the art of AI thinking!

00:00:33 What kind of "map of the future" does it take to make AI obey?
00:05:02 When AI does research, is it a grinder or a slacker?
00:10:38 Experts solve problems not by reasoning but by "clarifying"
00:16:57 The narrow door to the right answer
00:22:14 AI's shortcut to growth: rote memorization loses to learning to "drop in on the neighbors"

Papers covered in this episode:
[LG] Meta Flow Maps enable scalable reward alignment [University of Oxford] https://arxiv.org/abs/2601.14430
[CL] Towards Execution-Grounded Automated AI Research [Stanford University] https://arxiv.org/abs/2601.14525
[LG] Diffusion Large Language Models for Black-Box Optimization [McGill & MILA - Quebec AI Institute] https://arxiv.org/abs/2601.14446
[CL] The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models [Tsinghua University] https://arxiv.org/abs/2601.15165
[LG] Knowledge Graphs are Implicit Reward Models: Path-Derived Signals Enable Compositional Reasoning [Princeton University] https://arxiv.org/abs/2601.15160

Have you ever wondered why two top-tier AIs working together can see their efficiency plummet? Or that the root cause of slow AI responses might be a "clever trick" we've been misreading? Starting from several recent papers, this episode looks at how AI evolves from a head-down "lone wolf" into a "teammate" that knows how to collaborate, and from a straight-A student who "recites the textbook" into a partner who truly "gets the ideas". Let's unpack the myths about teamwork, efficiency, and minds in the AI world.

00:00:32 Your research buddy is being redefined by AI
00:05:48 AI replies slowly? We may have been misled by a "clever trick"
00:11:54 One monk carries water, two monks carry none: the teamwork puzzle of the AI world
00:16:52 AI's "EQ" switch: from reciting the textbook to getting the ideas
00:21:40 Good coaches and star players on the AI training ground

Papers covered in this episode:
[AI] Rethinking the AI Scientist: Interactive Multi-Agent Workflows for Scientific Discovery [University of Maryland et al.] https://arxiv.org/abs/2601.12542
[CL] Speculative Decoding: Performance or Illusion? [UC Berkeley] https://arxiv.org/abs/2601.11580
[LG] CooperBench: Why Coding Agents Cannot be Your Teammates Yet [Stanford University & SAP Labs US] https://arxiv.org/abs/2601.13295
[CL] Beyond Tokens: Concept-Level Training Objectives for LLMs [Stanford University] https://arxiv.org/abs/2601.11791
[LG] Q-learning with Adjoint Matching [UC Berkeley] https://arxiv.org/abs/2601.14234

Today we talk about the "other side" of AI you may not know. Why does a smart AI sometimes suddenly "break character" and start rambling? Why can it crack complex problems yet fail at something as simple as rolling dice? And how do we design clever systems that give AI "persona guardrails", or even turn it into a "super intern" that costs us less than one yuan per hour? Starting from five recent papers, this episode uncovers AI's little-known inner workings.

00:00:31 Where is AI's "persona" switch hidden?
00:07:06 AI's "logical brittleness": why do smart large models suddenly turn dumb?
00:13:20 AI's "personal bodyguard": how can it be both cheap and effective?
00:20:04 You think AI is an expert, but it can't even roll dice
00:25:35 Your "math tutor" costs less than one yuan an hour

Papers covered in this episode:
[CL] The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models [MATS & Anthropic] https://arxiv.org/abs/2601.10387
[CL] Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning [Huazhong University of Science and Technology] https://arxiv.org/abs/2601.02902
[LG] Building Production-Ready Probes For Gemini [Google DeepMind] https://arxiv.org/abs/2601.11516
[CL] Large Language Models Are Bad Dice Players: LLMs Struggle to Generate Random Numbers from Statistical Distributions [Harvard University] https://arxiv.org/abs/2601.05414
[LG] 130k Lines of Formal Topology in Two Weeks: Simple and Cheap Autoformalization for Everyone? [AI4REASON] https://arxiv.org/abs/2601.03298

This episode, we let our imagination run wild. You'll hear that inside a top AI's brain, heated debates are going on every day, and that training an AI is like nurturing an "adolescence" that needs to make mistakes and thrash around. We'll also talk about how elegant mathematical tools give AI a smarter set of "modeling clay", how a large model can step backstage and help you "build" a more efficient AI, and how to judge whether an AI teacher's "blackboard notes" are actually trustworthy. Ready? Let's go.

00:00:33 AI modeling: a smarter set of "modeling clay" tools
00:07:19 Inside AI's brain, meetings are held every day
00:12:55 A smart person's "dumb diligence": how to get AI to build an AI for you
00:18:52 Why those who achieve great things should cherish the mistake-making "adolescence"
00:24:39 When AI teaches, are its "blackboard notes" reliable?

Papers covered in this episode:
[LG] Analytic Bijections for Smooth and Interpretable Normalizing Flows [University of Amsterdam] https://arxiv.org/abs/2601.10774
[CL] Reasoning Models Generate Societies of Thought [Google & University of Chicago] https://arxiv.org/abs/2601.10825
[LG] FORESTLLM: Large Language Models Make Random Forest Great on Few-shot Tabular Learning [National University of Singapore & Zhejiang University & University of British Columbia] https://arxiv.org/abs/2601.11311
[LG] Transient learning dynamics drive escape from sharp valleys in Stochastic Gradient Descent [Peking University & Zhejiang University] https://arxiv.org/abs/2601.10962
[CL] Do explanations generalize across large reasoning models? [Northeastern University & Microsoft Research] https://arxiv.org/abs/2601.11517

Today we glimpse the intricate other side of the AI world: from the almost "mechanical" slash pattern in the attention mechanism, to the elegant efficiency of parallel experts working together; from the metacognitive wisdom of learning "how to choose", to the mathematical beauty of anticipating trends to gain speed, and the "sense of finesse" robots acquire through clever design. These recent papers tell us the road to stronger AI takes not only massive compute but also astonishing ingenuity.

00:00:29 The "slash" inside large models: an overlooked attention pattern
00:08:32 The secret to smarter AI is not reading more but asking better questions
00:14:06 A key step toward an all-round AI: choosing the right method beats grinding away
00:20:15 The secret of faster AI image generation: teaching the machine to "foresee" the future
00:25:36 Robots at work: how do we give them that missing "sense of finesse"?

Papers covered in this episode:
[LG] Demystifying the Slash Pattern in Attention: The Role of RoPE [National University of Singapore] https://arxiv.org/abs/2601.08297
[CL] Parallel Context-of-Experts Decoding for Retrieval Augmented Generation [EURECOM] https://arxiv.org/abs/2601.08670
[LG] SimMerge: Learning to Select Merge Operators from Similarity Signals [Cohere & Google] https://arxiv.org/abs/2601.09473
[LG] High-accuracy and dimension-free sampling with diffusions [UC Berkeley & Harvard University] https://arxiv.org/abs/2601.10708
[RO] In-the-Wild Compliant Manipulation with UMI-FT [Stanford University] https://arxiv.org/abs/2601.09988

Have you ever imagined AI working like a detective, "reading the face" of a protein and "matchmaking" pills? Have you hit those "rebellious" moments when the more you tell an AI not to say something, the more it insists on saying it? In this episode we climb inside AI's "brain" to see how recent papers reveal its "semantic gravity wells", how a "private notebook" cures its amnesia, how robots learn dexterous, improvise-as-you-go parkour, and how a smart "memory manager" can ease AI's "memory anxiety". Ready? Let's go!

00:00:35 Reading a protein's "face" and matchmaking pills: how does AI kill two birds with one stone?
00:07:43 Why does AI say exactly what you told it not to?
00:13:18 AI's "amnesia": why you can't play a decent guessing game with it
00:19:06 How hard is it to make robots truly "dexterous"?
00:24:03 AI's "memory" is exploding: can we give it a "fast-forget" switch?

Papers covered in this episode:
[LG] Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design [Johannes Kepler University Linz & Merck Healthcare KGaA] https://arxiv.org/abs/2601.09693
[CL] Semantic Gravity Wells: Why Negative Constraints Backfire [Independent Researcher] https://arxiv.org/abs/2601.08070
[CL] LLMs Can't Play Hangman: On the Necessity of a Private Working Memory for Language Agents [Chandar Research Lab & LAMA-WeST Lab & Mila – Quebec AI Institute] https://arxiv.org/abs/2601.06973
[RO] Deep Whole-body Parkour [Tsinghua University] https://arxiv.org/abs/2601.07701
[LG] KVzap: Fast, Adaptive, and Faithful KV Cache Pruning [NVIDIA] https://arxiv.org/abs/2601.07891

This episode we dive straight into the deep waters of five recent papers and see what marvelous changes are happening in the AI world. We'll explore the mysterious "DNA" map of AI's growth laws hiding behind the slogan "brute force works miracles"; learn the most efficient kind of "lazy" wisdom, watching AI speed up dramatically by "copying its own homework"; fit AI's brain with a "personal dictionary" so its knowledge can not only be retrieved but also surgically edited; put on CT glasses to see whether a clever AI solving a hard problem is reasoning rigorously or playing a high-dimensional guessing game; and finally learn the art of resource management, watching AI act like a smart project manager who puts the best steel on the blade's edge. Ready? Let's go!

00:00:50 A guide to upgrading AI: is there a map behind "brute force works miracles"?
00:06:50 Why is the most efficient way to be lazy "copying your own homework"?
00:11:56 Fitting AI's brain with a "personal dictionary"
00:16:39 Is your AI thinking, or guessing?
00:22:20 How do you put the AI world's "best steel" on the blade's edge?

Papers covered in this episode:
[LG] On the origin of neural scaling laws: from random graphs to natural language [Meta Superintelligence Lab & Axiom Math] https://arxiv.org/abs/2601.10684
[LG] Single-Stage Huffman Encoder for ML Compression [Google LLC] https://arxiv.org/abs/2601.10673
[LG] STEM: Scaling Transformers with Embedding Modules [Meta AI & CMU] https://arxiv.org/abs/2601.10639
[LG] Are Your Reasoning Models Reasoning or Guessing? A Mechanistic Analysis of Hierarchical Reasoning Models [Shanghai Qi Zhi Institute] https://arxiv.org/abs/2601.10679
[CL] TRIM: Hybrid Inference via Targeted Stepwise Routing in Multi-Step Reasoning Tasks [Amazon & CMU] https://arxiv.org/abs/2601.10245

This episode we talk about how to turn an AI from a "generalist" into a "specialist", and even teach it not just to solve problems but to first build its own set of "custom tools". How does a team of AIs hold a "smart meeting", and how can it remember past experience instead of foolishly repeating work? Finally, we'll look at a bold idea: what happens if we simply swap in an engine borrowed from "fluid dynamics"? Let's head into today's frontier.

00:00:31 Given a promising seedling, how do you raise it into a master translator?
00:05:23 Don't let smart people hold "dumb meetings"
00:12:07 Your time is only enough for what matters most
00:17:22 Experts build their tools first
00:22:37 What happens if we give AI a "fluid dynamics" engine?

Papers covered in this episode:
[CL] TranslateGemma Technical Report [Google Translate Research Team] https://arxiv.org/abs/2601.09012
[LG] Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning [MIT & NYU & Microsoft] https://arxiv.org/abs/2601.09667
[LG] SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache [Cornell University & University of Illinois Urbana-Champaign & University of Washington] https://arxiv.org/abs/2601.09083
[LG] Programming over Thinking: Efficient and Robust Multi-Constraint Planning [Nanyang Technological University & Agency for Science, Technology and Research (A*STAR)] https://arxiv.org/abs/2601.09097
[LG] Spectral Generative Flow Models: A Physics-Inspired Replacement for Vectorized Large Language Models [UC Berkeley] https://arxiv.org/abs/2601.08893

Have you ever wondered how AI, like a master, could "test-drive" several lines of thought at once? Or how to fit a speeding AI with "cruise control" so it never "crashes" while learning? Today, starting from several recent AI papers, we discuss how AI learns to think with "clones", how it escapes the trap of "fixed mindsets", and how we may soon no longer need to painstakingly set KPIs for AIs: just "speak plainly" and they will coordinate perfectly. Ready? Let's explore the deep shift in how AI thinks.

00:00:35 How to think like a master? The answer may lie in "cloning yourself"
00:05:07 Fitting a speeding AI with cruise control
00:09:57 How do fixed mindsets form? AI gives us a new answer
00:15:23 How do we teach a large model to "spar with itself"?
00:21:37 The "KPI" revolution of the AI world: no more playing charades with machines

Papers covered in this episode:
[CL] Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge [Microsoft Research & University of Pennsylvania] https://arxiv.org/abs/2601.08808
[LG] Controlled LLM Training on Spectral Sphere [Microsoft Research Asia & Renmin University] https://arxiv.org/abs/2601.08393
[LG] Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs [MIT & NUS] https://arxiv.org/abs/2601.08763
[LG] Reverse Flow Matching: A Unified Framework for Online Reinforcement Learning with Diffusion and Flow Policies [MIT] https://arxiv.org/abs/2601.08136
[LG] The End of Reward Engineering: How LLMs Are Redefining Multi-Agent Coordination [New York University & Lerna AI] https://arxiv.org/abs/2601.08237

Have you ever wondered whether a smarter AI should be better at "thinking" or better at "being lazy"? Recent papers tell us that letting AI offload computation onto "memory" actually frees it to focus on hard problems. When facing a novel hundreds of thousands of words long, how does it "take notes" like we do and avoid a "seven-second memory"? Even more interesting: lock an AI in a dark room with no study material at all, and it can still evolve by "sparring with itself". Finally, we go inside AI's mind to see which two paths its brain takes when it "talks nonsense with a straight face", and why its astonishing knack for "generalizing from one example" may not be learning at all, but "checking against the answer key".

00:00:48 Why does a "lazier" AI actually think better?
00:06:15 Is there a cure for AI's "seven-second memory"?
00:11:46 AI in "closed-door training": how does it get stronger without being fed data?
00:16:53 Why does AI "talk nonsense with a straight face"? There are two paths in its brain
00:22:20 Stop saying AI is "learning"; it may just be "checking against the answer key"

Papers covered in this episode:
[LG] Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models [DeepSeek-AI] https://arxiv.org/abs/2505.11080
[LG] Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths [University of Southern California & Meta AI Research] https://arxiv.org/abs/2601.06463
[LG] Dr. Zero: Self-Evolving Search Agents without Training Data [Meta Superintelligence Labs] https://arxiv.org/abs/2601.07055
[CL] Two Pathways to Truthfulness: On the Intrinsic Encoding of LLM Hallucinations [Peking University & Microsoft Research Asia] https://arxiv.org/abs/2601.07422
[LG] Filtering Beats Fine Tuning: A Bayesian Kalman View of In Context Learning in LLMs [UC Berkeley] https://arxiv.org/abs/2601.06100

This episode is about the "counterintuitive" wisdom of the AI world: when AI stops scoring items and simply "writes out" the ranking, when voice assistants are no longer crudely graded against answer keys but taught step by step how to think, and when "imperfect" data actually helps us make better decisions, a quiet revolution in efficiency and cognition is underway. The latest papers tell us that the best way to crack a hard problem is sometimes to change the game entirely.

00:00:28 Who ranked the results you see?
00:07:42 Behind large AI models, an efficiency revolution about "moving house"
00:13:08 Why does your voice assistant get dumber the moment it speaks?
00:18:01 Is AI's "thinking switch" a beautiful misunderstanding?
00:23:25 Stop waiting: "imperfect" data can still drive good decisions

Papers covered in this episode:
[IR] Autoregressive Ranking: Bridging the Gap Between Dual and Cross Encoders [Google DeepMind & University of Massachusetts Amherst] https://arxiv.org/abs/2601.05588
[LG] MoEBlaze: Breaking the Memory Wall for Efficient MoE Training on Modern GPUs [Meta Platforms Inc] https://arxiv.org/abs/2601.05296
[CL] Closing the Modality Reasoning Gap for Speech Large Language Models [Microsoft Corporation & The Chinese University of Hong Kong] https://arxiv.org/abs/2601.05543
[LG] Do Sparse Autoencoders Identify Reasoning Features in Language Models? [UC Berkeley] https://arxiv.org/abs/2601.05679
[LG] Good Allocations from Bad Estimates [Stanford University & Max Planck Institute for Intelligent Systems, Tübingen] https://arxiv.org/abs/2601.05597

Have you ever wondered whether we could build an AI with both the flexibility of a humanities student and the rigor of a science student? When a group of "lopsided" AI specialists gets together, how do you assemble an efficient "dream team"? In this episode we walk through several recent papers and see how, through clever workflow design, scientists teach AI to split work between its "left and right brains", collaborate finely at the level of individual words, and even actively declutter its own memory. Finally, we reveal a top scientist's "recipe for innovation". Ready? Let's explore the wisdom behind AI's evolution.

00:00:38 AI's "left and right brains": making it both flexible and reliable
00:06:27 AI has "weak subjects" too? How to assemble a "dream team"
00:11:25 Why is AI still an "apprentice" as a scientist?
00:20:09 Only when AI learns to "declutter" can it truly evolve
00:25:12 Scientific innovation, it turns out, has a "recipe"

Papers covered in this episode:
[LG] Structured Decomposition for LLM Reasoning: Cross-Domain Validation and Semantic Web Integration [Warsaw University of Technology] https://arxiv.org/abs/2601.01609
[CL] Token-Level LLM Collaboration via FusionRoute [Meta AI] https://arxiv.org/abs/2601.05106
[LG] Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts [Lossfunk] https://arxiv.org/abs/2601.03315
[CL] Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [Alibaba Group] https://arxiv.org/abs/2601.01885
[LG] Sci-Reasoning: A Dataset Decoding AI Innovation Patterns [Orchestra Research] https://arxiv.org/abs/2601.04577

Have you ever wondered whether the AI you use every day might be quietly hiding an entire copy of Harry Potter in its "head"? Or that to make AI stronger, we make it "brawl" with all of its predecessors? In this episode we lift the lid on AI's lesser-known "secrets": a "filing cabinet" that gives AI perfect memory, an "immune system" that is both smart and cheap, and a brand-new exam designed to catch AI "cheaters". Ready? Let's peek inside AI's curious brain.

00:00:32 Your AI may be hiding a whole library
00:05:07 Why is "fixating on being number one" the biggest trap?
00:11:19 Giving AI a reliable "filing cabinet"
00:16:50 Fitting AI with an immune system that is both smart and cheap
00:23:26 Your AI scored high, but did it really understand the picture?

Papers covered in this episode:
[CL] Extracting books from production language models [Stanford University] https://arxiv.org/abs/2601.02671
[LG] Digital Red Queen: Adversarial Program Evolution in Core War with LLMs [MIT] https://arxiv.org/abs/2601.03335
[LG] Everything is Context: Agentic File System Abstraction for Context Engineering [University of New South Wales & ArcBlock, Inc & University of Tasmania] https://arxiv.org/abs/2512.05470
[LG] Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks [Anthropic] https://arxiv.org/abs/2601.04603
[LG] DatBench: Discriminative, Faithful, and Efficient VLM Evaluations [DatologyAI] https://arxiv.org/abs/2601.02316

We always hope AI will be more like a clever companion than a clumsy machine. But what counts as "clever"? Through several recent studies, this episode peeks at the deeper secrets of how AI learns. We'll talk about how AI, like an infant in a silent world, "figures out" the rules of things on its own; how, like a secret agent, it switches seamlessly between "chat mode" and "task mode"; how a finely calibrated ruler can measure whether what AI learned is real skill or just for show; and how to keep it from going "lopsided" under multiple objectives, or from degenerating into a "sycophant" that only flatters the rules.

00:00:39 If AI learns to teach itself, how will the world change?
00:06:21 Fitting AI with a "universal remote control"
00:12:57 AI also splits school into "epiphany" and "remedial lessons"? One ruler to measure how much real skill it learned
00:19:54 What if AI goes "lopsided"? On the art of multi-objective rewards
00:25:33 "Good student" vs. "sycophant": how AI learns to behave like a decent person

Papers covered in this episode:
[LG] Learning Latent Action World Models In The Wild [FAIR at Meta] https://arxiv.org/abs/2601.05230
[LG] XGrammar 2: Dynamic and Efficient Structured Generation Engine for Agentic LLMs [Shanghai Jiao Tong University & CMU] https://arxiv.org/abs/2601.04426
[LG] Excess Description Length of Learning Generalizable Predictors [UC Berkeley & Anthropic] https://arxiv.org/abs/2601.04728
[CL] GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization [NVIDIA] https://arxiv.org/abs/2601.05242
[CL] Learning to Simulate Human Dialogue [Stanford University] https://arxiv.org/abs/2601.04436

Today we explore how AI gains physical intuition by understanding the "flow of points", and why it falls into the "metaphor traps" of human language. We'll also go inside AI's brain to see how its knowledge grows and gets forgotten, and learn a tuning technique, a kind of "acupuncture for the brain", that cures its "stubbornness". Finally, we unpack a recent study on how AI is taught to use "morality" as a yardstick to read sarcasm and snark online.

00:00:34 How do robots acquire "physical intuition"?
00:05:05 You think AI is rational? It actually lives in metaphors
00:10:54 Inside AI's brain, how does knowledge "grow" and get "lost"?
00:16:36 AI tuning: giving the model a session of "brain acupuncture"
00:22:42 What does it take to teach AI to read online snark?

Papers covered in this episode:
[RO] PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation [Stanford University & NVIDIA] https://arxiv.org/abs/2601.03782
[CL] Metaphors are a Source of Cross-Domain Misalignment of Large Reasoning Models [The University of New South Wales & CSIRO Data61] https://arxiv.org/abs/2601.03388
[CL] How Do Large Language Models Learn Concepts During Continual Pre-Training? [UC Davis & Virginia Tech & UCLA] https://arxiv.org/abs/2601.03570
[CL] ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models [Adobe Research, India] https://arxiv.org/abs/2601.04131
[CL] Self-Explaining Hate Speech Detection with Moral Rationales [University of São Paulo & University of Southern California & Saarland University] https://arxiv.org/abs/2601.03481

Today's topic is an especially fun one: does AI training really have to rely on "brute force works miracles"? Through several recent papers, this episode explores how AI distills genuine "structural knowledge" from massive data, how it adapts to new environments by "learning on the job" like a seasoned craftsman, and how a single "golden sample" can unlock a breakthrough. We'll watch AI evolve from a head-down bruiser into a clever artisan who knows how to practice deliberately.

00:00:30 Can you "compute" something new out of data?
00:07:38 Letting machines learn on the job, the way people do
00:12:36 The money-saving trick of AI training: making one GPU do the work of two
00:19:27 The secret of AI's evolution: from "brute force works miracles" to "one good trick goes a long way"
00:24:11 How does AI hone its "craft"?

Papers covered in this episode:
[LG] From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence [CMU & New York University] https://arxiv.org/abs/2601.03220
[LG] In-Context Reinforcement Learning through Bayesian Fusion of Context and Value Prior [University of Cambridge & Mila - Quebec AI Institute] https://arxiv.org/abs/2601.03015
[LG] Chronicals: A High-Performance Framework for LLM Fine-Tuning with 3.51x Speedup over Unsloth [N/A] https://arxiv.org/abs/2601.02609
[LG] One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling [GAIR & Taobao & Tmall Group of Alibaba] https://arxiv.org/abs/2601.03111
[LG] From Memorization to Creativity: LLM as a Designer of Novel Neural-Architectures [University of Würzburg] https://arxiv.org/abs/2601.02997

Have you ever wondered what happens when AI is no longer a "parrot" that only imitates: how it builds itself a "living map" of the world, or even creates autonomous "digital twins" for everything? Recent papers show AI tackling its own "forgetfulness" and "data famine" with a series of ingenious methods, and even starting to question the vicious circle of "the more you pile on, the more confused it gets". Today we talk about how AI learns to keep a "living map", create "digital twins", perform "simulated reasoning", and ultimately slim itself down.

00:00:34 Is there a "living map" in your head too?
00:05:55 Everyone gets a "digital twin": when AI builds a "mirror" of everything
00:12:23 Your common sense may be upended: does imitated thinking count as thinking?
00:17:46 The forecasting problem: when AI meets a "data famine"
00:24:46 The arms race in large models: why does piling on more make the brain fuzzier?

Papers covered in this episode:
[LG] Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments [Harvard University & CMU] https://arxiv.org/abs/2601.01075
[AI] Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models [Lehigh University & University of Maryland & University of New South Wales] https://arxiv.org/abs/2601.01321
[CL] Simulated Reasoning is Reasoning [RWTH Aachen University & CMU] https://arxiv.org/abs/2601.02043
[LG] Zero-shot Forecasting by Simulation Alone [Amazon] https://arxiv.org/abs/2601.00970
[LG] Geometric and Dynamic Scaling in Deep Transformers [New York University & Stony Brook University] https://arxiv.org/abs/2601.01014

In this episode we go under AI's "hood" and see how invisible structure decides everything. You'll hear why a mere 2% of the data can make or break translation ability; how AI, like a detective, draws a "case map" for complex questions; and how a seemingly harmless model mash-up can hide an almost undetectable "trojan" backdoor. Ready? Let's explore the jaw-dropping ingenuity behind these recent papers.

00:00:31 The secret of AI translation: 2% of the data, 50% of the ability
00:05:35 The "search" you know is being reinvented
00:13:00 Why is your "clumsy method" exactly what makes AI "click"?
00:18:16 The AI chef keeps cooking the same dishes? Try a different "salt"
00:23:43 The AI world's game of "LEGO" hides a backdoor you didn't expect

Papers covered in this episode:
[CL] The Role of Mixed-Language Documents for Multilingual Large Language Model Pretraining [University College London & Nanyang Technological University & University of Waterloo] https://arxiv.org/abs/2601.00364
[CL] Retrieval--Reasoning Processes for Multi-hop Question Answering: A Four-Axis Design Framework and Empirical Trends [University of Pittsburgh & Google Cloud AI Research] https://arxiv.org/abs/2601.00536
[LG] Deep Networks Learn Deep Hierarchical Models [Hebrew University of Jerusalem] https://arxiv.org/abs/2601.00455
[CV] It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models [UC Berkeley & University of Tübingen] https://arxiv.org/abs/2601.00090
[LG] The Trojan in the Vocabulary: Stealthy Sabotage of LLM Composition [Purdue University & CMU] https://arxiv.org/abs/2601.00065



