GitHub – ggerganov/whisper.cpp: Port of OpenAI’s Whisper model in C/C++

Update: 2024-11-20

Description

For future experimentation transcribing voice conversations: [Wayback/Archive] GitHub – ggerganov/whisper.cpp: Port of OpenAI’s Whisper model in C/C++

Whisper (a speech recognition system) usually runs in the cloud (someone else’s computers, often rented for a substantial monthly sum); whisper.cpp is a C/C++ port that can run it on your own machine instead.

Via

[Wayback/Archive] Jeroen Wiert Pluimers: “What is a good tool for transcribing Dutch text for hobby use?…” – Mastodon

[Wayback/Archive] bert hubert : “@wiert whisper.cpp if you are handy…” – Fosstodon

Now hopefully Whisper works well with the Dutch language…

I later realised Jeff Geerling mentioned Whisper a while ago as well:

[Wayback/Archive] Jeff Geerling on X: “Since people are asking, I’m using Whisper (github.com/openai/whisper) to transcribe individual video files (which I organized chronologically), then SBERT (github.com/dmmiller612/bert-extractive-summarizer…) to summarize each vlog”
- [Wayback/Archive] GitHub – openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision
- [Wayback/Archive] GitHub – dmmiller612/bert-extractive-summarizer: Easy to use extractive text summarization with BERT

[Wayback/Archive] Jeff Geerling on X: “@NetworkChuck Here’s a quick blog post I did about Whisper earlier this year: jeffgeerling.com/blog/2023/transcribing-recorded-audio-and-video-text-using-whisper-ai-on-mac… It’s freakishly good, even with technobabble.”

[Wayback/Archive] Transcribing recorded audio and video to text using Whisper AI on a Mac | Jeff Geerling

[Wayback/Archive] Every YouTube creator should do this (most don’t) – YouTube

and even earlier: [Wayback/Archive] Lior on X: “You can now transcribe 2.5 hours of audio in 98 seconds, locally…”

You can now transcribe 2.5 hours of audio in 98 seconds, locally.

A new implementation called insanely-fast-whisper is blowing up on Github.

It works on Mac or Nvidia GPUs and uses the Whisper + Pyannote library to speed up transcriptions and speaker segmentations.

Here’s how you can use it:

pip install insanely-fast-whisper

insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>
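The quoted command can also be assembled from a script, which helps when transcribing many files. The sketch below is a minimal, hypothetical helper: `build_command` is not part of insanely-fast-whisper, and it only builds the argument list without running anything. The flags are copied exactly as they appear in the tweet; newer releases may spell them differently, so check the project's README before relying on them.

```python
import shlex


def build_command(file_name, batch_size=2, device_id="mps", hf_token=None):
    """Build the argument list for the command quoted in the tweet.

    Hypothetical helper: flag names are taken verbatim from the tweet
    and are an assumption for any other version of the tool.
    """
    args = [
        "insanely-fast-whisper",
        "--file-name", file_name,         # local file name or URL
        "--batch-size", str(batch_size),  # the tweet uses 2
        "--device-id", device_id,         # the tweet uses "mps"
    ]
    # The token is only appended when one is given.
    if hf_token is not None:
        args += ["--hf_token", hf_token]
    return args


if __name__ == "__main__":
    # Print a shell-safe version of the command instead of executing it.
    print(shlex.join(build_command("interview.wav")))
```

The returned list can be passed to `subprocess.run` as-is once the tool is installed; keeping the arguments as a list avoids quoting problems with file names that contain spaces.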

[Wayback/Archive] GitHub – Vaibhavs10/insanely-fast-whisper
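To put the “2.5 hours of audio in 98 seconds” claim in perspective, it can be turned into a speed-up relative to real time. This is just arithmetic on the two numbers from the tweet; `realtime_factor` is a name made up for this sketch.

```python
def realtime_factor(audio_seconds, processing_seconds):
    """How many seconds of audio are handled per second of processing."""
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    audio_seconds = 2.5 * 60 * 60  # 2.5 hours of audio = 9000 seconds
    factor = realtime_factor(audio_seconds, 98)
    print(f"{factor:.1f}x real time")  # prints "91.8x real time"
```

So the claimed speed is roughly 92 times faster than playing the audio back.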

[Wayback/Archive] Tweet JSON

[Wayback/Archive] video.twimg.com/ext_tw_video/1730306137642426368/pu/vid/avc1/406x270/nCGTfwa7_IV7YJM-.mp4

[Wayback/Archive] video.twimg.com/ext_tw_video/1730306137642426368/pu/vid/avc1/542x360/vSQ548Z_wVfJ2ZxU.mp4

[Wayback/Archive] video.twimg.com/ext_tw_video/1730306137642426368/pu/vid/avc1/966x640/hgsBIk36RbALXN0D.mp4

--jeroen
