So, this project consists of a ~175 line README and a ~500 line Python program that glues yt-dlp and Kroko together. Neat.
I guess if it encourages you to install and figure out how to use ffmpeg, yt-dlp, Kroko, numpy, and onnx, that's a good thing. Sometimes just knowing a thing is possible is a huge benefit.
Thank you. You nailed the actual value. The real win is just knowing you can do this on a laptop CPU, offline, with no GPU or cloud bill. There are a few small done-for-you details, like rescaling token timestamps back to real time after the atempo speedup so --timestamps doesn't lie to you, but they are minor.
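For anyone curious what the timestamp rescaling amounts to, here is a rough sketch, not the actual yapsnap code. It assumes the audio was sped up by a single constant factor (ffmpeg's atempo filter) before transcription, so a token at t seconds in the sped-up audio sits at t * tempo seconds in the original. The Token class and rescale_timestamps function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record: text plus start/end times in seconds."""
    text: str
    start: float
    end: float


def rescale_timestamps(tokens, tempo):
    """Map timestamps measured on sped-up audio back to original time.

    Assumes the audio was played `tempo` times faster (e.g. atempo=1.5),
    so every timestamp is stretched by that same factor.
    """
    if tempo <= 0:
        raise ValueError("tempo must be positive")
    return [Token(t.text, t.start * tempo, t.end * tempo) for t in tokens]


# Tokens as they might come back from a recognizer fed 1.5x audio.
fast_tokens = [Token("hello", 0.0, 0.4), Token("world", 1.0, 2.0)]
real_tokens = rescale_timestamps(fast_tokens, tempo=1.5)
```

The second token, reported at 1.0 to 2.0 seconds in the sped-up audio, lands at 1.5 to 3.0 seconds in the original recording.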
Most of these platforms already have transcriptions built in.
YouTube has transcripts on most videos, but not all. The other platforms don't expose them. If you mean the "transcript APIs" for TikTok/IG/X, they all transcribe the audio, just like yapsnap does. If you have a way to pull native ones, let me know. I'm genuinely curious.