Show HN: WTFfmpeg – Natural Language to FFmpeg Translator (github.com)

53 points by ycombiredd 5 hours ago | 29 comments

Ameo 3 hours ago [-]

This has to be at least the fifth LLMpeg I've seen posted to hacker news in the past few months.

This whole repo is a single 300 LoC Python file over half of which is the system prompt and comments. It's not even a fine-tuned model or something, it's literally just a wrapper around llama-cpp with a very basic prompt tacked on.

I'm sure it's potentially useful and maybe even works, but I'm really sick of seeing these extremely low-effort projects posted and upvoted over and over.

N_Lens 59 minutes ago [-]

At this rate one could probably automate the low effort project -> HN post pipeline

larodi 1 hour ago [-]

I bet it is very often that people upvote based on the title and perhaps comments, not the actual content or its utility. Besides, this karma business can really get one hooked as a sucker for the high grade…

do_not_redeem 3 hours ago [-]

It was vibe coded too, the doc comments and pokemon try-catch are dead giveaways. It's a slop wrapper around a slop generator to farm github stars. Welcome to the future.

alfg 2 hours ago [-]

In case anyone is looking for a FFmpeg command builder that's not ai-generated:

https://github.com/alfg/ffmpeg-commander

Haven't updated in a while, but it's a simplified web UI with a few example presets.

yawnxyz 12 minutes ago [-]

fwiw I've been dealing with a lot of ffmpeg lately and it's like the most obtuse API I've ever used, and I'm now using Warp for it, and it works amazingly every time

kookamamie 40 minutes ago [-]

> --- Generated ffmpeg Command ---
> ffmpeg -i my_video.avi -an -c:v libx264 my_video.mp4

The example itself shows a naive conversion that ends up transcoding with default H.264 parameters. It should have used -c:v copy to copy the input video packets as-is.
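
A minimal sketch of the stream-copy variant described above, written as a Python helper that builds the argument list. The helper name `remux_command` is hypothetical and not part of wtffmpeg. It assumes the video codec inside the AVI is one the MP4 container can carry; if it is not, stream copy will fail and re-encoding is needed after all.

```python
import shlex


def remux_command(src: str, dst: str, drop_audio: bool = True) -> list[str]:
    """Build an ffmpeg argv that copies the video stream instead of re-encoding it."""
    cmd = ["ffmpeg", "-i", src]
    if drop_audio:
        cmd.append("-an")  # drop all audio streams ("no sound")
    cmd += ["-c:v", "copy", dst]  # copy video packets without transcoding
    return cmd


# Show the command as a shell-quoted string (hypothetical file names).
print(shlex.join(remux_command("my_video.avi", "my_video.mp4")))
```

Building a list rather than a single string keeps file names with spaces intact when the list is later handed to `subprocess.run`.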

kimi 1 hour ago [-]

Instead of LLM, Python and whatnot, it could have been a cheatsheet: https://github.com/scottvr/wtffmpeg/blob/12767e7843b9fd481ba...

savolai 3 hours ago [-]

It would be helpful if this printed out the relevant sections from the ’man’ page of the command line options it suggests using, or told the user this option is undocumented in the man page.

This way the user could directly review whether it is suggesting something they want to go ahead with.
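
One way this suggestion could be sketched: pull the option flags out of the generated command and look each one up in documentation text supplied by the caller. Everything here is an assumption for illustration: the function names are hypothetical, and `help_text` is expected to hold text the caller obtained separately (for example from the man page). An empty list for an option means no matching line was found in that text.

```python
import re
import shlex


def extract_options(command: str) -> list[str]:
    """Return the distinct option flags (tokens like -an or -c:v) in order of appearance."""
    options = []
    for token in shlex.split(command):
        if re.fullmatch(r"-[A-Za-z][\w:]*", token) and token not in options:
            options.append(token)
    return options


def option_report(command: str, help_text: str) -> dict[str, list[str]]:
    """Map each option in the command to the documentation lines that start with it."""
    report = {}
    for option in extract_options(command):
        base = option.split(":", 1)[0]  # drop the stream specifier: "-c:v" -> "-c"
        pattern = re.compile(rf"^\s*{re.escape(base)}(?![\w-])")
        report[option] = [
            line.strip() for line in help_text.splitlines() if pattern.search(line)
        ]
    return report
```

A caller could print the matched lines next to the suggested command and flag any option whose list is empty as not found in the documentation.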

ghostly_s 3 hours ago [-]

A rough estimate of the disk space required for the model + all other dependencies would be helpful in assessing this tool's utility. It looks like the recommended model alone is 2.4 GB?

adithyassekhar 3 hours ago [-]

Seems tiny for it to understand natural language, technical terms and their meanings (deinterlacing, pulldown..) and the ffmpeg commands related to it. Assuming it works.

MrFurious 2 hours ago [-]

If you don't want to learn ffmpeg syntax, it's better to use a visual GUI like HandBrake than a frontend for a fat LLM.

huimang 1 hour ago [-]

Why is it so hard to read the manual or even a cheatsheet? Many people use ffmpeg, it's not like there's a dearth of information out there...

zipping1549 3 hours ago [-]

LLMs are pretty good at ffmpeg already. No need to put in a ridiculous amount of examples.

klntsky 1 hour ago [-]

Actually no, they are not good at remembering CLI args precisely. The docs must be in the context.

ivolimmen 2 hours ago [-]

> wtff "convert my_video.avi to mp4 with no sound"

English is not my mother tongue but I think the model should correct the user that it should be: "convert my_video.avi to mp4 without sound"

ycombiredd 1 hour ago [-]

OP here.. I guess nobody got the joke. The last paragraph of the readme flat out says it was intended as amusing performance art.. Ludicrous is as ludicrous does.

cranberryturkey 2 hours ago [-]

https://github.com/profullstack/transcoder - we should integrate.

therein 3 hours ago [-]

I can't help but find it kinda humorous that if the 119 lines of the system prompt weren't there, it would just be a generic script that takes your input, sends it to ollama and then system("...") the response after some light processing.

https://github.com/scottvr/wtffmpeg/blob/main/wtffmpeg.py#L9...

>You are an expert at writing commands for the `ffmpeg` multimedia framework.

>Respond ONLY with the `ffmpeg` command. Do not add any explanations, introductory text, or markdown formatting.

Fragility of it aside, and the fact that more is written to try and force it to do less, this is basically the gist of the whole thing.
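
To make the "light processing" step concrete, here is a hedged sketch of what validating a model reply before running it might look like. It is not taken from wtffmpeg.py; the function names are hypothetical. It strips markdown fences, insists on exactly one line that starts with `ffmpeg`, and runs the result as an argument list (no shell) only after the user confirms.

```python
import shlex
import subprocess

FENCE = "`" * 3  # markdown code fence that models often add despite the prompt


def parse_ffmpeg_reply(reply: str) -> list[str]:
    """Turn a raw model reply into an ffmpeg argv, or raise ValueError."""
    lines = [line.strip() for line in reply.strip().splitlines()]
    lines = [line for line in lines if line and not line.startswith(FENCE)]
    if len(lines) != 1:
        raise ValueError("expected exactly one command line")
    argv = shlex.split(lines[0].strip("`"))
    if not argv or argv[0] != "ffmpeg":
        raise ValueError("reply is not an ffmpeg command")
    return argv


def run_confirmed(argv: list[str], confirm=input, run=subprocess.run):
    """Ask before executing; returns the exit code, or None if declined."""
    answer = confirm(f"Run: {shlex.join(argv)} ? [y/N] ")
    if answer.strip().lower() != "y":
        return None
    return run(argv, check=False).returncode
```

Passing a list to `subprocess.run` instead of a string to `system("...")` means shell metacharacters in the reply are never interpreted by a shell.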

dylan604 3 hours ago [-]

The first example from your link

- User: "convert input.mov to a web-friendly mp4"
- Assistant: ffmpeg -i input.mov -c:v libx264 -preset medium -crf 23 -c:a aac -b:a 128k output.mp4

Isn't exactly a web-friendly mp4. the fast start option is not used. this means the MooV header is at the end of the file instead of the head. that means the entire file must be read/scanned to get to the metadata when the browser requests it which means a long delay depending on the size of the file.

- User: "create a 10-second clip from my_movie.mkv starting at the 1 minute 30 second mark"
- Assistant: ffmpeg -i my_movie.mkv -ss 00:01:30 -t 10 -c copy clip.mkv

this is another poor example, as it is again the slowest option by having the -ss after the -i. placing the -ss before the -i will result in the command being faster.

not really sure who is training this system on how to use ffmpeg, but it doesn't fill me with confidence that simple things like this are being missed. after this example, i just stopped looking
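
The two corrections above can be sketched as follows, again as Python helpers that build argument lists. The helper names are hypothetical. The first adds `-movflags +faststart` so the moov atom is written at the head of the file; the second places `-ss` before `-i` so ffmpeg seeks in the input instead of decoding and discarding everything up to the start point. Note that with `-c copy` the cut can only begin on a keyframe, so the clip may not start at exactly the requested time.

```python
def web_mp4_command(src: str, dst: str) -> list[str]:
    """The prompt's web-friendly MP4 encode, with the moov atom moved to the front."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-preset", "medium", "-crf", "23",
        "-c:a", "aac", "-b:a", "128k",
        "-movflags", "+faststart",  # relocate the moov atom to the start of the file
        dst,
    ]


def clip_command(src: str, dst: str, start: str, duration: int) -> list[str]:
    """Stream-copy clip using input seeking (-ss placed before -i)."""
    return [
        "ffmpeg",
        "-ss", start,            # seek in the input before decoding
        "-i", src,
        "-t", str(duration),     # clip length in seconds
        "-c", "copy",
        dst,
    ]
```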

oefrha 3 hours ago [-]

> that means the entire file must be read/scanned to get to the metadata when the browser requests it which means a long delay depending on the size of the file.

Not really, unless your server doesn’t support range requests, browsers are smart enough to request the end of the file where a non-faststart moov atom typically lives. But yes, you should use faststart.
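
Whether a given file has the faststart layout can be checked by walking its top-level boxes and seeing whether moov comes before mdat. The sketch below assumes a well-formed MP4/ISO base media file and uses hypothetical function names; it reads only box headers, not payloads.

```python
import struct
from typing import BinaryIO, Iterator


def top_level_boxes(stream: BinaryIO) -> Iterator[tuple[str, int]]:
    """Yield (box_type, offset) for each top-level box in the file."""
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(8)
        if len(header) < 8:
            return  # end of file
        size, box_type = struct.unpack(">I4s", header)
        min_size = 8
        if size == 1:  # a 64-bit size follows the type field
            extended = stream.read(8)
            if len(extended) < 8:
                return
            (size,) = struct.unpack(">Q", extended)
            min_size = 16
        yield box_type.decode("latin-1"), offset
        if size == 0:  # box runs to the end of the file
            return
        if size < min_size:
            return  # malformed header; stop rather than loop forever
        offset += size


def has_faststart_layout(stream: BinaryIO) -> bool:
    """True when the moov box appears before the mdat box."""
    order = [box_type for box_type, _ in top_level_boxes(stream)]
    return (
        "moov" in order
        and "mdat" in order
        and order.index("moov") < order.index("mdat")
    )
```

A typical call would be `has_faststart_layout(open(path, "rb"))` on a local file, where `path` is whatever file is being inspected.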

You’re right that this appears to be the work of someone who’s not very adept at ffmpeg. Which shouldn’t be surprising; as a power user, maybe even expert at ffmpeg, unless I need to write a complex filter graph, consulting an LLM will just slow me down—people like me have no need for this.

vasco 3 hours ago [-]

All cli programs (or shell itself) should soon have:

- argument syntax autocorrect

- natural language arguments instead of the actual ones should be accepted

- whenever there's an error executing, instead of just erroring out, the error should go through an LLM and output a proper explanation plus suggested fix

Doing command by command seems the wrong way about it though.

cryptonym 2 hours ago [-]

Having a fuzzy interpretation of configuration, and fuzzy input/output, on something designed for repeatable tasks? This doesn't sound like a great idea.

If you really want to LLM everything, I'd rather have a dedicated flag that provides correction/explanation of args while doing a dry-run. And another to analyze error messages.

vasco 1 hour ago [-]

I'm not saying whether it's a good idea or not, I'm saying it's my prediction of what will happen. Nobody will use old style terminals in a few years where you need to type exactly, is my prediction.

reed1 3 hours ago [-]

This is what agents do; aichat [1] can do this. What you want is a wrapper for it that pipes the result back to the LLM and makes sure the command succeeded.

[1] https://github.com/sigoden/aichat
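
A rough sketch of such a wrapper, under stated assumptions: `ask_llm` is a hypothetical callable supplied by the caller that takes a prompt and returns an argument list, and it does not represent aichat's actual interface. On failure, the captured stderr is appended to the prompt and the model is asked again, up to a fixed number of attempts. A real tool would also want a confirmation step before each run, which is omitted here.

```python
import subprocess
from typing import Callable


def run_with_feedback(
    request: str,
    ask_llm: Callable[[str], list[str]],  # hypothetical: prompt -> argv
    run=subprocess.run,
    max_attempts: int = 3,
) -> list[str]:
    """Ask for a command, run it, and feed errors back until it succeeds."""
    prompt = request
    for _ in range(max_attempts):
        argv = ask_llm(prompt)
        result = run(argv, capture_output=True, text=True)
        if result.returncode == 0:
            return argv  # the command that finally worked
        # Give the model the failing command and its error output.
        prompt = (
            f"{request}\n\nThe command {argv} failed with:\n"
            f"{result.stderr}\nSuggest a corrected command."
        )
    raise RuntimeError(f"no working command after {max_attempts} attempts")
```

Injecting `run` as a parameter keeps the loop testable without executing anything.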

yreg 3 hours ago [-]

It would be insanity for every cli tool to wrap its own llama.cpp

There are plenty of terminal apps with this functionality, e.g. https://www.warp.dev/

sovietswag 3 hours ago [-]

Lollll ‘dd’ with autocorrect will be a hoot

vasco 3 hours ago [-]

It will be much more useful than now; obviously it doesn't run it for you.

darkwater 2 hours ago [-]

How so? A potentially highly destructive command like `dd` (it can literally destroy all your local data in seconds) should either be handled with a lot of care and an idea of what you are doing, or not touched at all. Like some heavy machinery or a scalpel.
