“Learnings from AI safety course so far” by boazbarak

Update: 2025-09-27
Description

I have been teaching CS 2881r: AI safety and alignment this semester. While I plan to do a longer recap post once the semester is over, I thought I'd share some of what I've learned so far, and use this opportunity to also get more feedback.

Lectures are recorded and uploaded to a YouTube playlist, and @habryka has kindly created a wikitag for this course, so you can view lecture notes here.

Let's start with the good parts

Aspects that are working:

Experiments are working well! I am trying something new this semester: in every lecture there is a short presentation by a group of students who are carrying out a small experiment related to that lecture. (For example, in lecture 1 there was an experiment on generalizations of emergent misalignment by @Valerio Pepe.) I was worried that the short time would not allow [...]

---

Outline:

(00:39) Aspects that are working:

(02:50) Aspects that perhaps could work better:

(04:20) Aspects I am unsure of

---


First published: September 27th, 2025



Source:

https://www.lesswrong.com/posts/2pZWhCndKtLAiWXYv/learnings-from-ai-safety-course-so-far


---


Narrated by TYPE III AUDIO.
