[For Everyone] From Efficient Division of Labor and Embracing Uncertainty to Self-Review

Update: 2025-12-19
Description

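We tend to assume that bigger is always better for AI. But wouldn't it be cooler if an AI could be as knowledgeable as a large corporation while thinking at the cost of a small team? In this episode, we start from several recent papers to see how AI learns to act as a smart "dispatcher", how it learns faster by admitting "uncertainty" the way an apprentice would, and even how it uses "reviewing its own attempts" and "highlighting the key points" to truly learn from its mistakes. Get ready to explore how AI is evolving to become smarter and more efficient!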

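00:00:33 A little secret of large AI models: how can a "small team" outperform a "big company"?

00:05:55 Smart "grunt work": how can robots learn faster?

00:12:08 How many steps does it take for AI to learn from its mistakes?

00:17:27 AI's "seven-second memory" problem: how can "highlighting the key points" solve it?

00:23:06 The robot apprentice: how to go from "clumsy imitation" to "surpassing the master"?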

Papers covered in this episode:

[CL] Sigma-MoE-Tiny Technical Report

[Microsoft Research]

https://arxiv.org/abs/2512.16248

---

[LG] Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning

[UC Berkeley & Stanford]

https://arxiv.org/abs/2512.16911

---

[LG] Meta-RL Induces Exploration in Language Agents

[EPFL & Idiap Research Institute]

https://arxiv.org/abs/2512.16848

---

[LG] Kascade: A Practical Sparse Attention Method for Long-Context LLM Inference

[Microsoft Research India]

https://arxiv.org/abs/2512.16391

---

[RO] ReinforceGen: Hybrid Skill Policies with Automated Data Generation and Reinforcement Learning

[University of Toronto & Georgia Institute of Technology & NVIDIA Research]

https://arxiv.org/abs/2512.16861
