DeepSeek-V3: A 671B parameter language model

Updated: 2024-12-29

Description

Peter Dawell and Nora Kane discuss DeepSeek-V3, a large language model with 671 billion parameters built on a Mixture-of-Experts architecture and trained with novel methods. It achieves results comparable to leading closed-source systems while outperforming many open-source models. The weights are available on Hugging Face, and the model can be run locally on a range of hardware platforms (including AMD GPUs and Huawei Ascend NPUs) using several inference frameworks. The documentation provides detailed instructions for local execution and evaluates the model's performance across a variety of benchmarks. Commercial use is supported.
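
As a rough illustration of the local-execution workflow mentioned above, the sketch below shows how the weights might be fetched from Hugging Face with the huggingface_hub library. The repository id "deepseek-ai/DeepSeek-V3" and the target directory are assumptions for illustration, not details taken from the episode.

```python
# A minimal sketch of fetching published model weights from Hugging Face.
# Assumption: the repository id "deepseek-ai/DeepSeek-V3" and the local
# directory name are illustrative, not confirmed by the episode.
from huggingface_hub import snapshot_download

# Download the full model snapshot; for a 671B-parameter model this is
# hundreds of gigabytes, so check available disk space first.
local_path = snapshot_download(
    repo_id="deepseek-ai/DeepSeek-V3",
    local_dir="./deepseek-v3",
)
print(f"Weights saved to: {local_path}")
```

Downloading alone does not load the model into memory; serving it afterwards requires one of the inference frameworks referenced in the project documentation.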


Peter Dawell, Nora Kane