[For Everyone] Learning Hands-On Skills, Taking Shortcuts, “Playing Dumb,” and Self-Grading
Description
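Today we dive into the inner world of AI to see how it learns “hands-on” skills by watching videos, and how it maps out a “top student” growth path for itself. We also ask: once AI has learned to carry out rigorous, professional reasoning like a master chef, might it also learn to “play dumb” and hide its real thoughts from us? Going a step further, AI has even begun to define for itself what a “good student” is, evolving a self-grading approach to learning. Get ready, we are setting off now to explore the secrets of the AI mind behind these latest papers.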
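00:00:33 Teaching robots to do real work: this is what was missing
00:05:55 Lessons from one AI's growth: how do you become an expert?
00:11:53 AI has learned to “play dumb”: can we still trust what goes on inside it?
00:16:30 AI as chef: from chemical equations to a Michelin-grade lab manual
00:24:13 AI's self-evolution: how can it find its own grading standard for a “good student”?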
Papers covered in this episode:
[RO] World Models Can Leverage Human Videos for Dexterous Manipulation
[FAIR at Meta]
https://arxiv.org/abs/2512.13644
---
[CL] Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models
[NVIDIA]
https://arxiv.org/abs/2512.13607
---
[LG] Neural Chameleons: Language Models Can Learn to Hide Their Thoughts from Unseen Activation Monitors
[MATS & Stanford University]
https://arxiv.org/abs/2512.11949
---
[LG] A Scientific Reasoning Model for Organic Synthesis Procedure Generation
[Microsoft Research AI for Science]
https://arxiv.org/abs/2512.13668
---
[AI] Differentiable Evolutionary Reinforcement Learning
[University of Waterloo & The University of Hong Kong & The Chinese University of Hong Kong, Shenzhen]
https://arxiv.org/abs/2512.13399



