# clone the repo
git clone https://github.com/kyutai-labs/hibiki.git
# use the rust version
cd hibiki/hibiki-rs
# do questionable things to fetch the video, for science - don't try this at home
yt-dlp -t mp4 "https://www.youtube.com/watch?v=6ZWf4Jfd1sM" -o 6ZWf4Jfd1sM.mp4
# extract the audio (mp3 encoded, in an mp3 container); -vn drops the video stream
ffmpeg -i 6ZWf4Jfd1sM.mp4 -vn -c:a libmp3lame 6ZWf4Jfd1sM.mp3
# do the magic translation
# note: i used this on a mac, use --features cuda to run on an nvidia gpu instead
cargo run --features metal -r -- gen 6ZWf4Jfd1sM.mp3 out_en.wav
# remux the english audio in (as aac encoded)
ffmpeg -i 6ZWf4Jfd1sM.mp4 -i out_en.wav -c:v copy -c:a aac -map 0:v -map 1:a output.mp4
That looks really promising! Having trouble finding a funny French video though.
200 CCs bounty for someone getting me a funny vid - I'll publish the steps taken and the result here.
I just asked a friend. He recommended this one.
Thanks for the tutorial.
Didn't come out too well... humor can be very subtle, and human translators still have some edge.
Pretty cool though.
We can only learn what to improve through finding what doesn't work :-)
Note: the last line should read
ffmpeg -i 6ZWf4Jfd1sM.mp4 -i out_en.wav -c:v copy -c:a aac -map 0:v -map 1:a output.mp4
not sure how i messed that up, but i did.
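For anyone who wants to try this on other videos, here is a minimal sketch that wraps the three local steps (extract audio, translate, remux) in one function, using the corrected last line. The `run` and `translate_video` helpers and the `DRY_RUN` variable are made-up names for illustration and are not part of hibiki. It assumes `ID.mp4` is already in the current directory (`hibiki/hibiki-rs`) and uses ffmpeg's standard `-vn` flag to drop the video stream when extracting audio. By default it only prints the commands, so nothing runs until you ask it to.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# translate_video ID [FEATURE]
# Expects ID.mp4 in the current directory; FEATURE is the cargo feature
# ("metal" on a Mac, "cuda" on an NVIDIA GPU).
translate_video() {
  local id="$1"
  local feature="${2:-metal}"
  # extract the audio track as mp3 (-vn drops the video stream)
  run ffmpeg -i "${id}.mp4" -vn -c:a libmp3lame "${id}.mp3"
  # translate the audio to English
  run cargo run --features "${feature}" -r -- gen "${id}.mp3" out_en.wav
  # remux: copy the video stream, encode the English audio as aac
  run ffmpeg -i "${id}.mp4" -i out_en.wav -c:v copy -c:a aac -map 0:v -map 1:a output.mp4
}

translate_video 6ZWf4Jfd1sM
```

Run it as-is to review the commands, then rerun with `DRY_RUN=0` to execute them for real. The output names `out_en.wav` and `output.mp4` are fixed, as in the original steps, so each run overwrites the previous result.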