LessWrong posts by zvi
Gemini 3: Model Card and Safety Framework Report


Update: 2025-11-21
Description

Gemini 3 Pro is an excellent model, sir.


This is a frontier model release, so we start by analyzing the model card and safety framework report.


Then later I’ll look at capabilities.


I found the safety framework highly frustrating to read, as it repeatedly ‘hides the football’, either withholding key information or making it difficult to understand.


I do not believe there is a frontier safety problem with Gemini 3, but (to jump ahead, I’ll go into more detail next time) I do think that the model is seriously misaligned in many ways, optimizing too much towards achieving training objectives. The training objectives can override the actual conversation. This leaves it prone to hallucinating, crafting narratives, glazing, and giving the user what it thinks the user will approve of rather than what is true, what the user actually asked for, or what the user would benefit from.


It is very much a Gemini model, perhaps the most Gemini model so far.


Gemini 3 Pro is an excellent model despite these problems, but one must be aware.


Gemini 3 Self-Portrait



Gemini 3 Facts





  1. I already did my ‘Third Gemini’ jokes and I won’t [...]

---

Outline:

(01:26) Gemini 3 Facts

(02:35) On Your Marks

(03:27) Safety Third

(05:18) Frontier Safety Framework

(05:44) CBRN

(08:29) Cybersecurity

(09:47) Manipulation

(14:54) Machine Learning R&D

(16:55) Misalignment

(19:06) Chain of Thought Legibility

(19:25) Safety Mitigations

(21:56) They Close On This Not Troubling At All Note

(22:51) So, Is It Safe?

---


First published: November 21st, 2025

Source: https://www.lesswrong.com/posts/5s5NZ6txhHMmSRSNw/gemini-3-model-card-and-safety-framework-report


---


Narrated by TYPE III AUDIO.


---

Images from the article:

Bar graph titled
Holographic human head labeled
Bar graph showing fraction of cybersecurity challenges solved by model difficulty.
Bar graph showing Gemini model performance on biology and chemistry multiple-choice question benchmarks.
Bar graph showing normalized scores on stealth evaluations across four AI model categories.
Table showing Gemini 3 Pro evaluation results across five domains with CCL thresholds.
Graph comparing manipulative efficacy of AI models Gemini 2.5 Pro and Gemini 3 Pro versus non-AI baseline.
Bar graph comparing normalized scores across situational awareness evaluations for four Gemini model versions.
Bar graph comparing AI model performance on five ML research tasks against human benchmarks.
Table comparing Gemini 3 Pro versus Gemini 2.5 Pro across five safety evaluation metrics.
Benchmark performance comparison table across Gemini 3 Pro, Gemini 2.5 Pro, Claude Sonnet 4.5, and GPT-5.1 models.

